Returns a subset of the EPA summary or individual data that fulfills the given parameters. Filtering can be done by term, data set, component (identity, behavior, modifier, setting), type of data (summary or individual), statistics (mean, standard deviation, covariance), institutions the term belongs to, and gender of raters.
Usage
epa_subset(
expr = ".*",
exactmatch = FALSE,
dataset = "everything",
component = "everything",
datatype = "summary",
group = "everything",
stat = "everything",
stat_na_exclude = TRUE,
instcodes = TRUE,
institutions = "everything",
drop.na.instcodes = FALSE
)
Arguments
- expr
A term, regular expression, or list of terms or regular expressions to search for. If a list is provided, its entries are combined with "or", so every term matching at least one entry is returned. The default matches all terms.
- exactmatch
Logical indicating whether the function should return only exact matches to the expression provided. If FALSE (default), all terms containing the expression are returned.
- dataset
The key of the data set (or list of multiple) to search in. Default is "everything". Call dict_info() to see available data sets.
- component
The component of the dictionary to use (identity, behavior, modifier, setting). Default is "everything".
- datatype
Whether to retrieve summary or individual data. Default is "summary".
- group
The subgroup of respondents to use. Data sets are usually subgrouped by gender; options are male, female, and all. Default is "everything". Ignored when datatype is individual.
- stat
The statistics to include in the subset that is returned. The default, "everything", includes all of them; options are mean, sd (standard deviation), cov (covariance), and n (number of raters). Terms that do not contain values for the required statistic will be excluded from the results. Ignored if datatype is individual.
- stat_na_exclude
Logical indicating whether to exclude entries with NA values for any of the required statistics. Default is TRUE. Ignored if stat is not specified or datatype is individual.
- instcodes
Logical. Whether to include the institution codes in the output. Default is TRUE.
- institutions
Character list. Institutions to include. Default is "everything".
- drop.na.instcodes
Logical. When filtering by institution, whether to drop terms that have no institution code. Default is FALSE.
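The matching rules described for expr and exactmatch can be illustrated with base R alone. The following is only a sketch of the documented semantics on a made-up vector of terms; it is not the package's implementation, and the names terms, patterns, contains_any, and exact_any are hypothetical.

```r
# Made-up terms and search entries, for illustration only.
terms <- c("teacher", "schoolteacher", "student_teacher", "student", "professor")
patterns <- c("teacher", "^stud")

# "Or" semantics with containing matches (exactmatch = FALSE):
# keep a term if at least one pattern matches somewhere in it.
contains_any <- Reduce(`|`, lapply(patterns, grepl, x = terms))
terms[contains_any]

# Exact matching (exactmatch = TRUE), assumed here to mean that
# a term is kept only if it equals one of the entries.
exact_any <- terms %in% c("teacher", "student")
terms[exact_any]
```

With containing matches, "schoolteacher" is kept because it contains "teacher"; with exact matching it is not.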
Value
A dataset containing the entries that match the given parameters, or FALSE if no matches are found.
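Because the return value is either a dataset or FALSE, calling code may want to check for FALSE before treating the result as a table. The sketch below shows one way to do that; as_result_frame is a hypothetical helper, not part of the package, and the search term is made up. The query only runs if the actdata package is installed.

```r
# Hypothetical helper (not part of actdata): turn the documented
# "FALSE when nothing matches" return value into an empty data frame.
as_result_frame <- function(result) {
  if (isFALSE(result)) data.frame() else result
}

# Only query the dictionaries when actdata is available.
if (requireNamespace("actdata", quietly = TRUE)) {
  hits <- as_result_frame(
    actdata::epa_subset(expr = "no_such_term_xyz", exactmatch = TRUE)
  )
  message("rows returned: ", nrow(hits))
}
```

Normalizing to an empty data frame lets later steps such as nrow() work on every result without a separate branch.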
Examples
epa_subset("teacher")
#> # A tibble: 201 × 25
#> term component dataset context year group instcodes E P A n_E
#> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl>
#> 1 scho… identity calcut… India 2017 male NA 2.02 1.47 0.83 20
#> 2 scho… identity calcut… India 2017 fema… NA 1.88 1.14 1.01 20
#> 3 scho… identity calcut… India 2017 all NA 1.95 1.3 0.92 40
#> 4 scho… identity calcut… India 2017 male NA 2.15 1.72 1.95 20
#> 5 scho… identity calcut… India 2017 fema… NA 2.27 1.83 1.47 20
#> 6 scho… identity calcut… India 2017 all NA 2.21 1.77 1.71 40
#> 7 scho… identity calcut… India 2017 male NA 2.09 1.47 1.62 20
#> 8 scho… identity calcut… India 2017 fema… NA 1.68 1.39 0.5 20
#> 9 scho… identity calcut… India 2017 all NA 1.89 1.43 1.06 40
#> 10 teac… identity calcut… India 2017 male 11 00001… 2.2 1.99 1.91 20
#> # ℹ 191 more rows
#> # ℹ 14 more variables: n_P <dbl>, n_A <dbl>, sd_E <dbl>, sd_P <dbl>,
#> # sd_A <dbl>, cov_EE <dbl>, cov_EP <dbl>, cov_EA <dbl>, cov_PE <dbl>,
#> # cov_PP <dbl>, cov_PA <dbl>, cov_AE <dbl>, cov_AP <dbl>, cov_AA <dbl>
epa_subset(dataset = "politics2003")
#> # A tibble: 216 × 25
#> term component dataset context year group instcodes E P A
#> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <dbl> <dbl> <dbl>
#> 1 alderman identity politi… US 2003 all NA 0.735 0.905 0.335
#> 2 alderman identity politi… US 2003 male NA 0.82 0.96 0.15
#> 3 alderman identity politi… US 2003 fema… NA 0.65 0.85 0.52
#> 4 analyze_s… behavior politi… US 2003 all NA 2.00 1.50 -0.325
#> 5 analyze_s… behavior politi… US 2003 male NA 2.32 1.83 -0.53
#> 6 analyze_s… behavior politi… US 2003 fema… NA 1.67 1.16 -0.12
#> 7 assembly identity politi… US 2003 all NA 1.38 1.24 1.09
#> 8 assembly identity politi… US 2003 male NA 1.51 1.05 1.06
#> 9 assembly identity politi… US 2003 fema… NA 1.25 1.43 1.12
#> 10 authorize… behavior politi… US 2003 all NA 1.10 2.04 0.51
#> # ℹ 206 more rows
#> # ℹ 15 more variables: n_E <dbl>, n_P <dbl>, n_A <dbl>, sd_E <dbl>, sd_P <dbl>,
#> # sd_A <dbl>, cov_EE <dbl>, cov_EP <dbl>, cov_EA <dbl>, cov_PE <dbl>,
#> # cov_PP <dbl>, cov_PA <dbl>, cov_AE <dbl>, cov_AP <dbl>, cov_AA <dbl>
epa_subset(expr = ".*woman", component = "identity", group = c("male", "female"),
institutions = c("lay", "business"))
#> # A tibble: 72 × 25
#> term component dataset context year group instcodes E P A n_E
#> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl>
#> 1 batt… identity calcut… India 2017 male 01 10100… -0.58 -0.22 0.33 12
#> 2 batt… identity calcut… India 2017 fema… 01 10100… 0 0.43 -0.61 16
#> 3 preg… identity calcut… India 2017 male NA 0.46 0.27 1.02 16
#> 4 preg… identity calcut… India 2017 fema… NA 1.97 1.23 0.56 19
#> 5 woman identity calcut… India 2017 male 01 10000… 1.24 0.58 -0.03 15
#> 6 woman identity calcut… India 2017 fema… 01 10000… 1.2 0.55 0.85 14
#> 7 woma… identity calcut… India 2017 male NA 0.55 0.34 0.11 20
#> 8 woma… identity calcut… India 2017 fema… NA 1.04 0.57 0 20
#> 9 batt… identity calcut… India 2017 male 01 10100… -0.58 -1.37 -2.02 12
#> 10 batt… identity calcut… India 2017 fema… 01 10100… 0 0.13 0.01 16
#> # ℹ 62 more rows
#> # ℹ 14 more variables: n_P <dbl>, n_A <dbl>, sd_E <dbl>, sd_P <dbl>,
#> # sd_A <dbl>, cov_EE <dbl>, cov_EP <dbl>, cov_EA <dbl>, cov_PE <dbl>,
#> # cov_PP <dbl>, cov_PA <dbl>, cov_AE <dbl>, cov_AP <dbl>, cov_AA <dbl>
epa_subset(dataset = "morocco2015", stat = "cov", stat_na_exclude = FALSE)
#> # A tibble: 1,448 × 16
#> term component dataset context year group instcodes cov_EE cov_EP cov_EA
#> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <dbl> <dbl> <dbl>
#> 1 abandon behavior morocc… Morocco 2015 all 01 11111… 4.43 1.76 -1.6
#> 2 abortio… identity morocc… Morocco 2015 all 11 00000… 5.5 1.66 -1.39
#> 3 abuse behavior morocc… Morocco 2015 all 10 10100… 4.12 0.37 -1.69
#> 4 abusive modifier morocc… Morocco 2015 all 10 01000… 5.98 3.34 -1.1
#> 5 accommo… modifier morocc… Morocco 2015 all 10 01000… 4.38 2.8 -0.87
#> 6 accuse behavior morocc… Morocco 2015 all 10 11111… 6 1.66 -1.15
#> 7 address behavior morocc… Morocco 2015 all 10 11111… 3.61 1.43 -1.68
#> 8 admonish behavior morocc… Morocco 2015 all 10 01111… 4.4 1.23 -1.88
#> 9 adolesc… identity morocc… Morocco 2015 all 11 10000… 3.03 1.23 -0.19
#> 10 adult identity morocc… Morocco 2015 all 11 10000… 4.21 2.43 0.3
#> # ℹ 1,438 more rows
#> # ℹ 6 more variables: cov_PE <dbl>, cov_PP <dbl>, cov_PA <dbl>, cov_AE <dbl>,
#> # cov_AP <dbl>, cov_AA <dbl>
epa_subset(dataset = "usmturk2015", datatype = "individual")
#> # A tibble: 264,844 × 16
#> dataset context year userid gender age raceeth race1 race2 hisp term
#> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 usmturk2015 US 2015 MTurk1 Female 51 NA Whit… No a… No tele…
#> 2 usmturk2015 US 2015 MTurk1 Female 51 NA Whit… No a… No bewi…
#> 3 usmturk2015 US 2015 MTurk1 Female 51 NA Whit… No a… No coal…
#> 4 usmturk2015 US 2015 MTurk1 Female 51 NA Whit… No a… No unde…
#> 5 usmturk2015 US 2015 MTurk1 Female 51 NA Whit… No a… No appl…
#> 6 usmturk2015 US 2015 MTurk1 Female 51 NA Whit… No a… No gran…
#> 7 usmturk2015 US 2015 MTurk1 Female 51 NA Whit… No a… No love
#> 8 usmturk2015 US 2015 MTurk1 Female 51 NA Whit… No a… No barr…
#> 9 usmturk2015 US 2015 MTurk1 Female 51 NA Whit… No a… No humb…
#> 10 usmturk2015 US 2015 MTurk1 Female 51 NA Whit… No a… No land…
#> # ℹ 264,834 more rows
#> # ℹ 5 more variables: component <chr>, instcodes <chr>, E <dbl>, P <dbl>,
#> # A <dbl>