Help Using Data Products

Unpacking and setting up RAND HRS Data and Fat Files

Lining up interview years, raw variable names, Fat Files, RAND HRS waves

More fat file information, including documentation, SAS formats

Sample programs using RAND HRS Data and Fat Files:

SAS sample

SPSS sample

Stata sample

Notes on sample programs

FAQs:


RAND HRS data file:

RAND-enhanced fat file:


RAND HRS data file:

  • How do I read a Stata-SE file using Stata Intercooled?

    You can read Stata Special Edition (SE) files, such as the longitudinal RAND HRS data file or the RAND-enhanced Fat Files, with Stata Intercooled by selecting variables on the use command, as long as the total number of variables selected does not exceed 2047.

    For example:
    use rahhidpn r1iwstat r2iwstat using "rndhrs_h.dta"
    would select the respondent ID and 2 other variables from the RAND HRS Version H file.
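
    As a minimal sketch of the same idea, the Python helper below assembles a use command from a list of variable names and refuses to build one that goes over the 2047-variable limit quoted above. The function name build_use_command and the constant name are illustrative assumptions, not part of any RAND or Stata tool; the resulting text still has to be run in Stata.

```python
# Limit quoted in the FAQ for Stata Intercooled.
MAX_INTERCOOLED_VARS = 2047


def build_use_command(variables, dta_path):
    """Return a Stata `use <varlist> using "<file>"` command as a string.

    Hypothetical helper for illustration only; it does not read any data.
    """
    # Drop repeated names while keeping the original order.
    unique = list(dict.fromkeys(variables))
    if not unique:
        raise ValueError("select at least one variable")
    if len(unique) > MAX_INTERCOOLED_VARS:
        raise ValueError(
            "%d variables selected; the limit is %d"
            % (len(unique), MAX_INTERCOOLED_VARS)
        )
    return 'use %s using "%s"' % (" ".join(unique), dta_path)


command = build_use_command(["rahhidpn", "r1iwstat", "r2iwstat"], "rndhrs_h.dta")
print(command)
```

    The printed command matches the example above and can be pasted into Stata or a do-file.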

  • How are spouses included in the RAND HRS?

    The HRS samples at the household level. In a couple household, one or both partners are age-eligible for the study, but in either case BOTH individuals are interviewed and treated as respondents. So the number of respondents in the RAND HRS is the same as in the core HRS respondent-level files, and it includes all age-eligible HRS respondents AND any non-age-eligible spouses.

    For example, if Tom and Judy are a couple and both agree to be interviewed in 1998, you will see two records, one for Tom and one for Judy in both the RAND HRS and the 1998 HRS core data. On the RAND HRS data, you will see whether Judy has ever smoked as R4SMOKEV on Judy's record, and as S4SMOKEV on Tom's record. Moreover, you will see whether Tom has ever smoked as R4SMOKEV on Tom's record and as S4SMOKEV on Judy's. We add the spouse variables from the spouse report as a convenience.

    If you are only interested in individuals without regard to spouses, you can simply ignore the Sw___ variables, and just use the Rw___ ones.
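
    The Rw/Sw naming pattern can be sketched in a few lines of Python. The records and the 1/0 smoking codes below are made up for illustration, and wave_variable and attach_spouse_value are hypothetical helper names; the sketch only shows how the same answer appears under an R name on one record and an S name on the spouse's record.

```python
def wave_variable(prefix, wave, concept):
    """Build a name such as R4SMOKEV from a prefix, wave number and concept."""
    if prefix not in ("R", "S"):
        raise ValueError("prefix must be 'R' (respondent) or 'S' (spouse)")
    return "%s%d%s" % (prefix, wave, concept)


def attach_spouse_value(own, spouse, wave, concept):
    """Copy the spouse's R variable onto a copy of `own` as the S variable."""
    record = dict(own)
    record[wave_variable("S", wave, concept)] = spouse[wave_variable("R", wave, concept)]
    return record


# Hypothetical wave 4 (1998) records; 1 = ever smoked, 0 = never (assumed coding).
tom = {"name": "Tom", "R4SMOKEV": 1}
judy = {"name": "Judy", "R4SMOKEV": 0}

tom_record = attach_spouse_value(tom, judy, 4, "SMOKEV")
judy_record = attach_spouse_value(judy, tom, 4, "SMOKEV")

print(tom_record)
print(judy_record)
```

    Tom's record ends up with Judy's answer in S4SMOKEV, and Judy's record with Tom's answer in S4SMOKEV.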

  • How can I tell if a respondent has died?

    RwIWSTAT indicates the response and mortality status of the respondent at each wave. Respondents are identified by code 1, non-respondents by codes 4-7 and 9. Non-respondents who died between the last interview and the current one are assigned a 5 in RwIWSTAT, while those who died before the previous interview are assigned a 6.

    Non-response code 4 means that the respondent is alive so far as we know but did not respond. A code of 7 means that the respondent has asked to be dropped from the sample, but was alive the last time this was observed. A code of 9 means that we don't know if the individual is alive or not.

    Mortality status is taken from the Tracker file. Known alive and presumed alive are both treated as an indication that the individual is living.

    If the last available wave is based on Early Release data, the Tracker file may not yet indicate whether an individual is alive or not. If the Tracker does not include a mortality flag for the early release wave, and exit interview data for the interview year are available, RwIWSTAT will flag those with an exit interview as deceased.
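
    The code values described above can be collected into a small lookup. The following Python sketch assumes only the codes listed in this answer (1, 4, 5, 6, 7 and 9) and uses a made-up respondent history; first_wave_deceased is a hypothetical helper, not a RAND-supplied function.

```python
# RwIWSTAT codes as described in this FAQ answer.
IWSTAT_MEANING = {
    1: "respondent",
    4: "non-respondent, alive as far as is known",
    5: "non-respondent, died between the last interview and this one",
    6: "non-respondent, died before the previous interview",
    7: "non-respondent, asked to be dropped, alive when last observed",
    9: "non-respondent, not known whether alive",
}
DECEASED_CODES = {5, 6}


def first_wave_deceased(iwstat_by_wave):
    """Return the earliest wave whose code marks the person as deceased.

    `iwstat_by_wave` maps wave number -> RwIWSTAT code. Returns None when
    no wave carries code 5 or 6.
    """
    for wave in sorted(iwstat_by_wave):
        if iwstat_by_wave[wave] in DECEASED_CODES:
            return wave
    return None


# Hypothetical history: interviewed in waves 1-2, missed wave 3, then died.
history = {1: 1, 2: 1, 3: 4, 4: 5, 5: 6}
death_wave = first_wave_deceased(history)
print(death_wave, IWSTAT_MEANING[history[death_wave]])
```

    In this made-up history the first wave flagged as deceased is wave 4 (code 5), and wave 5 then carries code 6.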

  • Which interview dates and age variables should be used for each wave?

    The public use HRS data provide two interview dates for the early waves, a beginning interview date and an end interview date. In most cases the two dates are the same. On the RAND HRS data file there are three versions of interview dates and age variables.

    The RwIWBEG interview date and RwAGEY_B respondent age variables reflect the beginning interview date, and the RwIWEND and RwAGEY_E variables are based on the end interview date. The RwIWMID and RwAGEY_M variables are derived as the midpoint between the beginning and ending interview dates.

    For most purposes it is best to use the variables based on the end interview date, that is, RwIWEND and RwAGEY_E. For most interviews that have two dates, the interview was postponed just after starting, so most of the interview was administered on the end date.
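
    To make the three versions concrete, here is a rough Python sketch with made-up dates. It takes the midpoint as the begin date plus half the gap in whole days, and computes age as completed years on a given date. Both rules are assumptions for illustration; the exact rounding used to derive RwIWMID and the RwAGEY variables is described in the RAND HRS documentation, not here.

```python
from datetime import date, timedelta


def midpoint(begin, end):
    """Date halfway between two dates, rounded down to a whole day."""
    return begin + timedelta(days=(end - begin).days // 2)


def age_in_years(birth, on):
    """Completed years of age on the date `on` (assumed rule)."""
    had_birthday = (on.month, on.day) >= (birth.month, birth.day)
    return on.year - birth.year - (0 if had_birthday else 1)


# Hypothetical interview that was started, postponed, and finished later.
iw_begin = date(1998, 3, 10)
iw_end = date(1998, 3, 20)
iw_mid = midpoint(iw_begin, iw_end)
birth = date(1936, 3, 15)

print(iw_mid)
print(age_in_years(birth, iw_begin), age_in_years(birth, iw_mid), age_in_years(birth, iw_end))
```

    With these dates the respondent has a birthday between the beginning and the midpoint, so the beginning-date age differs from the other two. When the begin and end dates are the same, as in most cases, all three versions agree.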

  • How do I use weights in the RAND HRS Data?

    The weights included in the RAND HRS dataset are described in the RAND HRS Data Documentation under the following sections:
    • Sampling Weight
    • Household Analysis Weight
    • Person-Level Analysis Weight

    The weights you use, of course, are going to be driven by the types of analyses you are doing. Though statistical advice is beyond the scope of the help we can provide, we can verify that the sampling weights are only available for HRS (1992) on the RAND HRS file, and that the respondent and HH weights (RwWTRESP and RwWTHH) are taken directly from the HRS-provided weights on the Tracker file.
    The HRS weight document explains how the respondent and HH level weights are created and may help you decide which weights are appropriate for your analysis. More information on HRS weights can be found on the HRS website, under Documentation, Weights.

    There are also some resources available through the HRS website that describe how one would begin performing analyses in various statistical packages. An HRS User Guide, Getting Started with the Health and Retirement Study, Chapter 8 shows an example of using weights in Stata.

    There are just a couple of additional issues to be aware of:
    • If you plan to use the data longitudinally, you should also be aware that before 1998 non-age-eligible spouses in the HRS and AHEAD cohorts are given zero weights, but in 1998 and beyond these spouses are given weights if they were born in years corresponding to other sample cohorts.
    • Respondents living in nursing homes are given zero weight in the household and individual weights provided in the RAND HRS, RwWTHH and RwWTRESP. HRS has also recently developed weights that give non-zero weight to institutionalized individuals, that is, those living in nursing homes. These weights, provided for waves 5 and 6 in RwWTR_NH, are zero for those not living in a nursing home.
    We do not do any additional processing of the weight variables; we simply copy them onto the RAND HRS files for your convenience. The data use guides published by the HRS folks may also be useful.
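
    The practical effect of a zero weight can be seen in a small Python sketch. The three records, their weight values and the outcome variable are invented for illustration, and the weighted mean shown is only a point estimate; it is not a recommendation about which weight to use and does not address variance estimation.

```python
def weighted_mean(values, weights):
    """Weighted mean of `values`; records with zero weight contribute nothing."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(v * w for v, w in zip(values, weights)) / total


# Hypothetical wave 5 records: the third person lives in a nursing home,
# so the respondent weight is zero (made-up values throughout).
records = [
    {"outcome": 1, "r5wtresp": 2000},
    {"outcome": 0, "r5wtresp": 3000},
    {"outcome": 1, "r5wtresp": 0},
]

values = [r["outcome"] for r in records]
weights = [r["r5wtresp"] for r in records]

unweighted = sum(values) / len(values)
weighted = weighted_mean(values, weights)
zero_weight_count = sum(1 for w in weights if w == 0)

print(unweighted, weighted, zero_weight_count)
```

    The zero-weight record counts toward the unweighted mean but drops out of the weighted one, which is why estimates based on RwWTRESP do not represent nursing home residents.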

RAND-enhanced fat file:

  • How do I read a RAND-enhanced fat file using Stata Intercooled?

    You can read Stata Special Edition (SE) files, such as the longitudinal RAND HRS data file or the RAND-enhanced Fat Files, with Stata Intercooled by selecting variables on the use command, as long as the total number of variables selected does not exceed 2047.

    For example:
    use rahhidpn KC053 KC054 KC055 KC060 KC061 KC062 KC063 KC064 using "h06e2ah.dta"
    would select the respondent ID and 8 other variables from the 2006 Fat File.

  • Why don't the frequency counts on the fat files match those in the HRS codebooks for household level variables?

    In the RAND-enhanced Fat Files, we reorganized the data so that each observation represents one individual, and we merged the appropriate information from the various modules onto each observation. This means that household level information is present for each individual respondent in couple households. So for household level variables, the counts on the RAND-enhanced Fat Files will be higher than those listed in the HRS codebooks.
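
    A toy Python example shows where the extra counts come from. The records, the field names household and owns_home, and the code values are all made up; the point is only that counting over individuals repeats each couple household's answer, while counting one record per household does not.

```python
from collections import Counter

# Hypothetical respondent-level records: Tom and Judy share household H1,
# so the household-level answer appears on both of their records.
records = [
    {"person": "Tom", "household": "H1", "owns_home": 1},
    {"person": "Judy", "household": "H1", "owns_home": 1},
    {"person": "Ann", "household": "H2", "owns_home": 5},
]

# Frequencies over individuals, as on a respondent-level file.
person_counts = Counter(r["owns_home"] for r in records)

# Frequencies over households: keep one value per household.
value_by_household = {}
for r in records:
    value_by_household.setdefault(r["household"], r["owns_home"])
household_counts = Counter(value_by_household.values())

print(dict(person_counts))
print(dict(household_counts))
```

    Keeping one record per household, as in the second count, is one way to compare frequencies against a household-level codebook.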

Send questions or comments about this webpage to RANDHRSHelp@rand.org

Last modified March 2008