Get a tidy data frame of classifications of all functions used in your analysis

get_classifications(lexicon = NULL, include_duplicates = TRUE)

Arguments

lexicon

Character. The classification lexicon to retrieve, either "crowdsource" or "leeklab". If NULL (the default), classifications from all lexicons are returned.

include_duplicates

Logical. If TRUE (the default), every classification of every function is returned along with its score, so a single function may appear on multiple rows, one per classification. If FALSE, only the most prevalent classification of each function is returned.

Value

A tbl_df with columns:

  • func: the function

  • classification: the classification

If include_duplicates = TRUE, the result also includes a column:

  • score: the score

If lexicon is NULL, the result also includes a column:

  • lexicon: the classification lexicon

Examples

# Get a data frame of all classifications
get_classifications()
#> # A tibble: 4,181 x 4
#>    lexicon     func  classification score
#>    <chr>       <chr> <chr>          <dbl>
#>  1 crowdsource -     data cleaning  0.386
#>  2 crowdsource -     visualization  0.174
#>  3 crowdsource -     modeling       0.159
#>  4 crowdsource -     evaluation     0.102
#>  5 crowdsource -     exploratory    0.0833
#>  6 crowdsource -     setup          0.0458
#>  7 crowdsource -     import         0.0367
#>  8 crowdsource -     communication  0.0133
#>  9 crowdsource :     data cleaning  0.458
#> 10 crowdsource :     exploratory    0.115
#> # … with 4,171 more rows
# Get a data frame of the most prevalent classifications
get_classifications(include_duplicates = FALSE)
#> # A tibble: 1,957 x 3
#>    lexicon     func      classification
#>    <chr>       <chr>     <chr>
#>  1 crowdsource -         data cleaning
#>  2 crowdsource :         data cleaning
#>  3 crowdsource ::        data cleaning
#>  4 crowdsource :=        data cleaning
#>  5 crowdsource !         data cleaning
#>  6 crowdsource !=        data cleaning
#>  7 crowdsource .         data cleaning
#>  8 crowdsource .libPaths setup
#>  9 crowdsource (         data cleaning
#> 10 crowdsource [         data cleaning
#> # … with 1,947 more rows
# Get a data frame of only `leeklab` classifications
get_classifications("leeklab")
#> # A tibble: 642 x 3
#>    func          classification score
#>    <chr>         <chr>          <dbl>
#>  1 %%            data cleaning  1
#>  2 %>%           data cleaning  0.864
#>  3 %>%           exploratory    0.0909
#>  4 %>%           setup          0.0455
#>  5 %in%          data cleaning  1
#>  6 %within%      data cleaning  1
#>  7 abline        evaluation     1
#>  8 abs           data cleaning  0.667
#>  9 abs           modeling       0.333
#> 10 ad.data.frame exploratory    1
#> # … with 632 more rows
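
The relationship between the two settings of include_duplicates can be sketched on a small hand-built table. This is only an illustration, not the function's actual implementation: it assumes that "most prevalent" means the classification with the highest score for each function, and it assumes the dplyr and tibble packages are installed. The rows are copied from the `leeklab` output above, and the names `classifications` and `top_classification` are hypothetical.

```r
library(dplyr)

# A few rows shaped like the result of get_classifications("leeklab"),
# copied by hand so the sketch runs without the package or its data.
classifications <- tibble::tribble(
  ~func, ~classification, ~score,
  "%>%", "data cleaning", 0.864,
  "%>%", "exploratory",   0.0909,
  "%>%", "setup",         0.0455,
  "abs", "data cleaning", 0.667,
  "abs", "modeling",      0.333
)

# Assumption: keep the single highest-scoring classification per function,
# then drop the score column to mirror the include_duplicates = FALSE layout.
top_classification <- classifications %>%
  group_by(func) %>%
  slice_max(score, n = 1, with_ties = FALSE) %>%
  ungroup() %>%
  select(func, classification)

top_classification
```

With `with_ties = FALSE`, a function whose top classifications have equal scores keeps only one of them. How the function itself resolves such ties is not stated here.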