Demographic survey data from the 2015-2016 cycle of NHANES (National Health and Nutrition Examination Survey), covering 9,971 participants and including sampling weights.
Format
A data frame with 9971 rows and 13 variables:
- PSU: SDMVPSU - Masked variance pseudo-PSU
- weights: WTINT2YR - Full sample 2 year interview weight
- strata: SDMVSTRA - Masked variance pseudo-stratum
- gender: RIAGENDR - Gender
- age: RIDAGEYR - Age in years at screening
- birth_country: DMDBORN4 - Country of birth
- marital_status: DMDMARTL - Marital status
- interview_lang: SIALANG - Language of interview
- edu_level: DMDHREDU - Household reference person's education level
- household_size: DMDHHSIZ - Total number of people in the Household
- family_size: DMDFMSIZ - Total number of people in the Family
- annual_household_income: INDHHIN2 - Annual household income
- annual_family_income: INDFMIN2 - Annual family income
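The PSU, strata, and weights columns are meant to be used together when computing design-based estimates. The following is a minimal sketch of one way to do that, assuming the survey package is installed. The `toy` data frame is a hypothetical stand-in with made-up values so the snippet runs on its own; with this package loaded you would pass `nhanes` as `data` instead. `nest = TRUE` is used here because the PSU ids (1, 2) repeat across strata.

```r
library(survey)

# Hypothetical stand-in for `nhanes`: same design columns, made-up values
toy <- data.frame(
  PSU     = c(1, 1, 2, 2, 1, 1, 2, 2),
  strata  = c(125, 125, 125, 125, 126, 126, 126, 126),
  weights = c(100, 200, 150, 50, 300, 100, 250, 50),
  age     = c(62, 53, 78, 56, 42, 72, 11, 4)
)

# Declare the design: PSUs nested within strata, with sampling weights
design <- svydesign(
  ids     = ~PSU,
  strata  = ~strata,
  weights = ~weights,
  nest    = TRUE,
  data    = toy
)

# Weighted mean of age with a design-based standard error
est <- svymean(~age, design)
est
```

The point estimate equals the plain weighted mean of age; the design object mainly matters for the standard error, which accounts for strata and PSUs.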
Note
The data sets provided in this package are derived from the NHANES database and have been adapted for educational purposes. As such, they are NOT suitable for use as a research database. For research purposes, you should download original data files from the NHANES website and follow the analysis instructions given there.
Examples
library(dplyr)
#>
#> Attaching package: ‘dplyr’
#> The following objects are masked from ‘package:stats’:
#>
#> filter, lag
#> The following objects are masked from ‘package:base’:
#>
#> intersect, setdiff, setequal, union
glimpse(nhanes)
#> Rows: 9,971
#> Columns: 13
#> $ PSU                     <dbl> 1, 1, 1, 1, 2, 1, 1, 2, 1, 2, 1, 2, 2, 2, 2, 1…
#> $ weights                 <dbl> 134671.370, 24328.560, 12400.009, 102717.996, …
#> $ strata                  <dbl> 125, 125, 131, 131, 126, 128, 120, 124, 119, 1…
#> $ gender                  <fct> Male, Male, Male, Female, Female, Female, Fema…
#> $ age                     <dbl> 62, 53, 78, 56, 42, 72, 11, 4, 1, 22, 32, 18, …
#> $ birth_country           <fct> US, Other, US, US, US, Other, US, US, US, US, …
#> $ marital_status          <fct> Married, Divorced, Married, Living with partne…
#> $ interview_lang          <fct> English, English, English, English, English, S…
#> $ edu_level               <fct> College graduate or above, High School, High S…
#> $ household_size          <dbl> 2, 1, 2, 1, 5, 5, 5, 5, 7, 3, 4, 3, 1, 3, 4, 2…
#> $ family_size             <dbl> 2, 1, 2, 1, 5, 5, 5, 5, 7, 3, 4, 3, 1, 3, 4, 2…
#> $ annual_household_income <dbl> 10, 4, 5, 10, 7, 14, 6, 15, 77, 7, 6, 15, 3, 4…
#> $ annual_family_income    <dbl> 10, 4, 5, 10, 7, 14, 6, 15, 77, 7, 6, 15, 3, 4…
nhanes |> dplyr::count(edu_level)
#> # A tibble: 7 × 2
#>   edu_level                     n
#>   <fct>                     <int>
#> 1 College degree             2908
#> 2 College graduate or above  2331
#> 3 High School                2015
#> 4 9-11th Grade               1200
#> 5 Less Than 9th Grade        1087
#> 6 Missing                     396
#> 7 Don't Know                   34
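The counts above are unweighted, so they describe the sample rather than the population it represents. As a rough sketch, the `wt` argument of `dplyr::count()` can sum the sampling weights per category instead. The `toy` tibble and the `weighted_n` and `share` column names are hypothetical, with made-up values, so the snippet runs without the package data; substitute `nhanes` to apply it to the real columns.

```r
library(dplyr)

# Hypothetical stand-in for `nhanes` with made-up values
toy <- tibble(
  edu_level = factor(c("High School", "High School", "College degree", "Missing")),
  weights   = c(100, 300, 500, 100)
)

# Sum the weights within each education level, then convert to shares
weighted_counts <- toy |>
  count(edu_level, wt = weights, name = "weighted_n") |>
  mutate(share = weighted_n / sum(weighted_n))

weighted_counts
```

This gives weighted point estimates only. Standard errors that respect the strata and PSU structure need a design-based tool such as the survey package.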