Washington DC street

Using the ACS to Calculate Daytime Population

I’m in the home stretch for getting the last chapter of the first draft of my census book completed. The next to last chapter of the book provides an overview of a number of derivatives that you can create from census data, and one of them is the daytime population.

There are countless examples of using census data for site selection analysis and for comparing and ranking places for locating new businesses, providing new public services, and generally measuring potential activity or population in a given area. People tend to forget that census data measures people where they live. If you were trying to measure service or business potential for residents, the census is a good source.

Counts of residents are less meaningful if you wanted to gauge how crowded or busy a place was during the day. The population of an area changes during the day as people leave their homes to go to work or school, or go shopping or participate in social activities. Given the sharp divisions in the US between residential, commercial, and industrial uses created by zoning, residential areas empty out during the weekdays as people travel into the other two zones, and then fill up again at night when people return. Some places function as job centers while others serve as bedroom communities, while other places are a mixture of the two.

The Census Bureau provides recommendations for calculating daytime population using a few tables from the American Community Survey (ACS). These tables capture where workers live and work, which is the largest component of the daytime population.

Using these tables from the ACS:

Total resident population
B01003: Total Population
Total workers living in area and Workers who lived and worked in same area
B08007: Sex of Workers by Place of Work–State and County Level (‘Total:’ line and ‘Worked in county of residence’ line)
B08008: Sex of Workers by Place of Work–Place Level (‘Total:’ line and ‘Worked in place of residence’ line)
B08009: Sex of Workers by Place of Work–Minor Civil Division Level (‘Total:’ line and ‘Worked in MCD of residence’ line)
Total workers working in area
B08604: Total Workers for Workplace Geography

They propose two different approaches that lead to the same outcome. The simplest approach: add the total resident population to the total number of workers who work in the area, and then subtract the total resident workforce (workers who live in the area but may work inside or outside the area):

Daytime Population = Total Residents + Total Workers in Area - Total Resident Workers

For example, according to the 2017 ACS Washington DC had an estimated 693,972 residents (from table B01003), 844,345 (+/- 11,107) people who worked in the city (table B08604), and 375,380 (+/- 6,102) workers who lived in the city. We add the total residents and total workers, and subtract the total workers who live in the city. The subtraction allows us to avoid double counting the residents who work in the city (as they are already included in the total resident population) while omitting the residents who work outside the city (who are included in the total resident workers). The result:

693,972 + 844,345 - 375,380 = 1,162,937

And to get the new margin of error:

SQRT(0^2 + 11,107^2 + 6,102^2) = 12,673

So the daytime population of DC is approx 468,965 people (68%) higher than its resident population. The district has a high number of jobs in the government, non-profit, and education sectors, but has a limited amount of expensive real estate where people can live. In contrast, I did the calculation for Philadelphia and its daytime population is only 7% higher than its resident population. Philadelphia has a much higher proportion of resident workers relative to total workers. Geographically the city is larger than DC and has more affordable real estate, and faces stiffer suburban competition for private sector jobs.

The variables in the tables mentioned above are also cross-tabulated in other tables by age, sex, race, Hispanic origin , citizenship status, language, poverty, and tenure, so it’s possible to estimate some characteristics of the daytime population. Margins of error will limit the usefulness of estimates for small population groups, and overall the 5-year period estimates are a better choice for all but the largest areas. Data for workers living in an area who lived and worked in the same area is reported for states, counties, places (incorporated cities and towns), and municipal civil divisions (MCDs) for the states that have them.

Data for the total resident workforce is available for other, smaller geographies but is reported for those larger places, i.e. we know how many people in a census tract live and work in their county or place of residence, but not how many live and work in their tract of residence. In contrast, data on the number of workers from B08604 is not available for smaller geographies, which limits the application of this method to larger areas.

Download or explore these ACS tables from your favorite source: the American Factfinder, the Census Reporter, or the Missouri Census Data Center.

Business and Labor Force Data: The Census and the BLS

I’m still cranking away on my book, which will be published by SAGE Publications and is tentatively titled Exploring the US Census: Your Guide to America’s Data. I’m putting the finishing touches on the chapter devoted to business datasets.

Most of the chapter is dedicated to the Census Bureau’s (CB) Business Patterns and the Economic Census. In a final section I provide an overview of labor force data produced by the Bureau of Labor Statistics (BLS). At first glance these datasets appears to cover a lot of the same ground, but they do vary in terms of methodology, geographic detail, number of variables, and currency / frequency of release. I’ll provide a summary of the options in this post.

The Basics

Most of these datasets provide data for business establishments, which are individual physical locations where business is conducted or where services or industrial operations are performed, and are summarized by industries, which are groups of businesses that produce similar products or provide similar services. The US federal government uses the North American Industrial Classification System (NAICS), a hierarchical series of codes used to classify businesses and the labor force into divisions and subdivisions at varying levels of detail.

Since most of these datasets are generated from counts, surveys, or administrative records for business establishments they summarize business activity and the labor force based on where people work, i.e. where the businesses are. The Current Population Survey (CPS) and American Community Survey (ACS) are exceptions, as they summarize the labor force based on residency, i.e. where people live. The Census Bureau datasets tend to be more geographically detailed and present data at one point in time, while the BLS datasets tend to be more timely and are focused on providing data in time series. The BLS gives you the option to look at employment data that is seasonally adjusted; this data has been statistically “smoothed” to remove fluctuations in employment due to normal cyclical patterns in the economy related to summer and winter holidays, the start and end of school years, and general weather patterns.

Many of the datasets are subject to data suppression or non-disclosure to protect the confidentiality of businesses; if a given geography or industrial category has few establishments, or if a small number of establishments constitutes an overwhelmingly majority of employees or wages, data is either generalized or withheld. Most of these datasets exclude agricultural workers, government employees, and individuals who are self-employed. Data for these industries and workers is available through the USDA’s Census of Agriculture and the CB’s Census of Governments and Nonemployer Statistics.

The CB datasets are published on the Census Bureau’s website via the American Factfinder, the new, the FTP site and API, and via individual pages dedicated to specific programs. The BLS datasets are accessible through a variety of  applications via the BLS Data Tools. For each of the datasets discussed below I link to their program page, so you can see fuller descriptions of how the data is collected and what’s included.

The Census Bureau’s Business Data

Business Patterns (BP)
Typically referred to as the County and ZIP Code Business Patterns, this Census Bureau dataset is also published for states, metropolitan areas, and Congressional Districts. Published on an annual basis from administrative records, the number of employees, establishments, and wages (annual and first quarter) is published by NAICS, along with a summary of business establishments by employee size categories.
Economic Census
Released every five years in years ending in 2 and 7, this dataset is less timely than the BP but includes more variables: in addition to employment, establishments, and wages data is published on production and sales for various industries, and is summarized both geographically and in subject series that cover the entire industry. The Economic Census employs a mix of enumerations (100% counts) and sample surveying. It’s available for the same geographies as the BP with two exceptions: data isn’t published for Congressional Districts but is available for cities and towns.

Bureau of Labor Statistics Data

Current Employment Statistics (CES)
This is a monthly sample survey of approximately 150k businesses and government agencies that represent over 650k physical locations. It measures the number of workers, hours worked, and average hourly wages. Data is published for broad industrial categories for states and metropolitan areas.
Quarterly Census of Employment and Wages (QCEW)
An actual count of business establishments that’s conducted four times a year, it captures the same data that’s in the CES but also includes the number of establishments, total wages, and average annual pay (wages and salaries). Data is tabulated for states, metropolitan areas, and counties at detailed NAICS levels.
Occupational Employment Statistics (OES)
A bi-annual survey of 200k business establishments that measures the number of employees by occupation as opposed to industry (the specific job people do rather than the overall focus of the business). Data on the number of workers and wages is published for over 800 occupations for states and metro areas using the Standard Occupational Classification (SOC) system.

Labor Force Data by Residency

Current Population Survey (CPS)
Conducted jointly by the CB and BLS, this monthly survey of 60k households captures a broad range of demographic and socio-economic information about the population, but was specifically designed for measuring employment, unemployment, and labor force participation. Since it’s a survey of households it measures the labor force based on where people live and is able to capture people who are not working (which is something a survey of business establishments can’t achieve). Monthly data is only published for the nation, but sample microdata is available for researchers who want to create their own tabulations.
Local Area Unemployment Statistics (LAUS)
This dataset is generated using a series of statistical models to provide the employment and unemployment data published in the CPS for states, metro areas, counties, cities and towns. Over 7,000 different areas are included.
American Community Survey (ACS)
A rolling sample survey of 3.5 million addresses, this dataset is published annually as 1-year and 5-year period estimates. This is the Census Bureau’s primary program for collecting detailed socio-economic characteristics of the population on an on-going basis and includes labor force status and occupation. Data is published for all large geographies and small ones including census tracts, ZCTAs, and PUMAs. Each estimate is published with a margin of error at a 90% confidence interval. Labor force data from the ACS is best used when you’re OK with generally characterizing an area rather than getting a precise and timely measurement, or when you’re working with an array of ACS variables and want labor force data generated from the same source using the same methodology.

Wrap Up

In the book I’ll spend a good deal of time navigating the NAICS codes, explaining the impact of data suppression and how to cope with it, and covering the basics of using this data from an economic geography approach. I’ve written some exercises where we calculate location quotients for advanced industries and aggregate ZIP-Code based Business Patterns data to the ZCTA-level. This is still a draft, so we’ll have to wait and see what stays and goes.

In the meantime, if you’re looking for summaries of additional data sources in any and every field I highly recommend Julia Bauder’s excellent Reference Guide to Data Sources. Even though it was published back in 2014 I find that the descriptions and links are still spot on – it primarily covers public and free US federal and international government sources.

BLS Data Portal

Bureau of Labor Statistics Data Tools