Emsi offers employment data by industry and occupation at the ZIP code level. The starting point is Emsi’s final county-level industry data. ZIP-level industry data is created by disaggregating county-level industry data to the ZIP level with the help of several outside sources. ZIP-level occupation data is then created by applying staffing patterns to the ZIP-level industry data.
Modeling Industry Data from County to ZIP
The backbone of ZIP-level data is Emsi county-level data, which is built using the BLS’s Quarterly Census of Employment and Wages (QCEW) dataset, the most complete and trustworthy source of employment data available in the United States. We use these numbers as the foundation for ZIP-level data, ensuring that ZIP-level employment within a county sums exactly to that county’s employment.
To model county-level industry data down to the ZIP level, we use the Census Bureau’s ZIP Codes Business Patterns (ZBP) dataset to calculate each ZIP’s share of a county’s employment in each industry. For instance, if Emsi county data shows that a 3-ZIP county has employment of 200 in industry x, and ZBP shows employment shares of 57%, 43%, and 0% for that industry in the county’s three ZIPs, we assign 114 jobs, 86 jobs, and 0 jobs for that industry to those ZIPs, respectively.
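The proportional split described above can be sketched in a few lines of Python. This is a minimal illustration, not Emsi's production code: the function name `allocate_to_zips` and the ZIP labels are hypothetical, and the sketch does no rounding, since the source does not say how fractional jobs are handled.

```python
def allocate_to_zips(county_jobs, zip_shares):
    """Split one county-industry job count across ZIPs in proportion to zip_shares."""
    total = sum(zip_shares.values())
    if total <= 0:
        raise ValueError("no usable shares for this county-industry combination")
    # Dividing by the total keeps the ZIP figures summing to the county figure
    # even if the input shares do not add up to exactly 100.
    return {zip_code: county_jobs * share / total
            for zip_code, share in zip_shares.items()}


# The 3-ZIP example from the text: 200 jobs split 57% / 43% / 0%.
zip_jobs = allocate_to_zips(200, {"ZIP_1": 57, "ZIP_2": 43, "ZIP_3": 0})
print(zip_jobs)
```

With the figures from the example, the three ZIPs receive 114, 86, and 0 jobs, which add back up to the county total of 200.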
If Emsi’s county-level data contains employment for an industry but ZBP shows no employment for that industry, we move up to the parent 5-digit NAICS code and check ZBP again. We repeat this, moving up one level at a time as far as the 2-digit NAICS level, until we find data in ZBP.
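The walk up the NAICS hierarchy might look something like the sketch below. It assumes, purely for illustration, that a county's ZBP shares are held in a dictionary keyed by NAICS code string and that a parent code is obtained by dropping the last digit; the name `find_zbp_shares` and the sample shares are hypothetical.

```python
def find_zbp_shares(naics, county_zbp):
    """Return (code, shares) for the most detailed NAICS level with ZBP data.

    county_zbp maps NAICS code -> {zip_code: share} for a single county.
    Returns (None, None) if nothing is found down to the 2-digit level.
    """
    code = naics
    while len(code) >= 2:
        shares = county_zbp.get(code)
        if shares and sum(shares.values()) > 0:
            return code, shares
        code = code[:-1]  # move up to the parent code (assumed: drop last digit)
    return None, None


# Made-up data: ZBP has nothing for 722511 or 72251, but does for 7225.
county_zbp = {"7225": {"ZIP_1": 0.6, "ZIP_2": 0.4}}
used_code, shares = find_zbp_shares("722511", county_zbp)
print(used_code, shares)
```

When the function returns `(None, None)`, no ZBP data exists for that county-industry combination at any level, and a different source of proportions is needed.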
We use USPS’s DelStat dataset to create default fallback proportions for each county in case no ZBP data is available for a county-industry combination. DelStat provides business address counts by ZIP. For each county, we count the business addresses in each ZIP within the county and calculate each ZIP’s share of the county total. The result is a business address percentage map for every county, showing what percent of the county’s businesses are in each ZIP. If ZBP cannot be used to assign an industry’s employment to ZIPs, we fall back to the county’s default percentage map to distribute employment for that industry. The fallback method is necessary in only 0.5% of cases.
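As a rough sketch of the fallback, the default map can be built from business address counts and used whenever no ZBP shares are available. The function names, the input layout, and the address counts below are assumptions made for illustration only.

```python
def default_shares(business_addresses):
    """Turn business address counts per ZIP into each ZIP's share of the county."""
    total = sum(business_addresses.values())
    if total == 0:
        raise ValueError("county has no business addresses")
    return {zip_code: count / total
            for zip_code, count in business_addresses.items()}


def distribute(county_jobs, zbp_shares, fallback_shares):
    """Use ZBP shares when they exist, otherwise the county's default map."""
    has_zbp = bool(zbp_shares) and sum(zbp_shares.values()) > 0
    shares = zbp_shares if has_zbp else fallback_shares
    total = sum(shares.values())
    return {zip_code: county_jobs * share / total
            for zip_code, share in shares.items()}


# Made-up counts: 150 business addresses in one ZIP, 50 in the other.
county_default = default_shares({"ZIP_1": 150, "ZIP_2": 50})
# No ZBP data for this industry, so the default map is used.
zip_jobs = distribute(80, None, county_default)
print(zip_jobs)
```

Here the county's 80 jobs are split 60 and 20, matching the 75% and 25% shares of business addresses.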
Modeling ZIP Occupation Data from ZIP Industry Data
Emsi ZIP occupation data is created in the same way as Emsi county occupation data: we use regionalized staffing patterns created from the BLS’s Occupational Employment Statistics (OES) dataset. OES provides a national-level staffing pattern, which we regionalize using regional industry and occupation data for each OES substate region. These staffing patterns are then applied to Emsi county-level industry data, producing county-level occupation data, and are also applied to ZIP-level industry data, producing ZIP-level occupation data.
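Applying a staffing pattern amounts to multiplying each industry's jobs by the share of that industry's employment found in each occupation, then summing by occupation. The sketch below shows the idea for a single ZIP; the function name, the industry and occupation labels, and all of the shares are invented for illustration, and the regionalization step is not shown.

```python
def apply_staffing_pattern(industry_jobs, staffing_pattern):
    """Convert jobs by industry into jobs by occupation.

    industry_jobs: {industry: jobs} for one ZIP (or one county).
    staffing_pattern: {industry: {occupation: share of industry employment}}.
    """
    occupation_jobs = {}
    for industry, jobs in industry_jobs.items():
        for occupation, share in staffing_pattern.get(industry, {}).items():
            occupation_jobs[occupation] = (
                occupation_jobs.get(occupation, 0.0) + jobs * share
            )
    return occupation_jobs


# Made-up inputs: two industries in one ZIP and their staffing patterns.
zip_industry_jobs = {"industry_x": 100, "industry_y": 40}
pattern = {
    "industry_x": {"occupation_a": 0.5, "occupation_b": 0.5},
    "industry_y": {"occupation_b": 0.25, "occupation_c": 0.75},
}
zip_occupation_jobs = apply_staffing_pattern(zip_industry_jobs, pattern)
print(zip_occupation_jobs)
```

Because each industry's shares sum to 1 in this example, the occupation totals (50, 60, and 30) add up to the same 140 jobs as the industry totals.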