--- title: "Introduction" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{1) Introduction} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) ``` ```{r setup} library(zctaCrosswalk) ``` This package is designed to help answer common analytical questions that arise when working with US ZIP Codes. Note: the entity which maintains US ZIP Codes (the US Postal Service) does not release a map or crosswalk of that dataset. As a result, most analysts instead use [ZIP Code Tabulation Areas (ZCTAs)](https://www.census.gov/programs-surveys/geography/guidance/geo-areas/zctas.html) which are maintained by the US Census Bureau. Census also provides [Relationship Files](https://www.census.gov/geographies/reference-files/time-series/geo/relationship-files.2020.html#zcta) that maps ZCTAs to other geographies. This package provides the Census Bureau's "2020 ZCTA to County Relationship File" as a tibble, combines it with useful publicly available metadata (such as State names) and provides convenience functions for querying it. The main functions in this package are: * `?get_zctas_by_state` * `?get_zctas_by_county` * `?get_zcta_metadata` ## ?get_zctas_by_state `?get_zctas_by_state` takes a vector of states and returns the vector of ZCTAs in those states. Here are some examples: ```{r} # Not case sensitive when using state names head( get_zctas_by_state("California") ) # USPS state abbreviations are also OK - but these *are* case sensitive head( get_zctas_by_state("CA") ) # Multiple states at the same time are also OK head( get_zctas_by_state(c("CA", "NY")) ) # Throws an error - you can't mix types in a single request # get_zctas_by_state(c("California", "NY")) ``` A common problem when doing analytics with states is ambiguity around names. For example, most people write "Washington, DC". But this dataset uses "District of Columbia". The most common solution to this problem is to use [FIPS Codes](https://en.wikipedia.org/wiki/Federal_Information_Processing_Standard_state_code) when doing analytics with states. And so `?get_zctas_by_state` also supports FIPS codes. Note that technically FIPS codes are characters and have a leading zero (e.g. California is "06"). But in practice people often use numbers (e.g. 6 for California) as well. As a result, `?get_zctas_by_state` supports both: ```{r} ca1 = get_zctas_by_state("CA") ca2 = get_zctas_by_state("06") ca3 = get_zctas_by_state(6) all(ca1 == ca2) all(ca2 == ca3) ``` ## ?get_zctas_by_county `?get_zctas_by_county` works analogously to `?get_zctas_by_state`. The primary difference is that it only accepts FIPS codes. This is because [FIPS county codes](https://en.wikipedia.org/wiki/FIPS_county_code) are unique, but their names are not. (For example, 30 counties in this dataset are named "Washington County"!) If you need to find the FIPS code for a particular county, I recommend simply googling it (e.g. "FIPS code for San Francisco County California") or consulting [this](https://en.wikipedia.org/wiki/List_of_United_States_FIPS_codes_by_county) page. Note that the FIPS codes can be either character or numeric. ```{r} # "06075" is San Francisco County, California head( get_zctas_by_county("06075") ) # 6075 (== as.numeric("06075")) works too head( get_zctas_by_county(6075) ) # Multiple counties at the same time are also OK head( get_zctas_by_county(c("06075", "36059")) ) ``` ## ?get_zcta_metadata `?get_zcta_metadata` takes a vector of ZCTAs and returns all available metadata on them. The ZCTAs can be either character or numeric. ```{r} get_zcta_metadata("90210") # Some ZCTAs span multiple counties get_zcta_metadata(39573) ```