data sources

Info about any spatial or attribute data sources

Business and Labor Force Data: The Census and the BLS

I’m still cranking away on my book, which will be published by SAGE Publications and is tentatively titled Exploring the US Census: Your Guide to America’s Data. I’m putting the finishing touches on the chapter devoted to business datasets.

Most of the chapter is dedicated to the Census Bureau’s (CB) Business Patterns and the Economic Census. In a final section I provide an overview of labor force data produced by the Bureau of Labor Statistics (BLS). At first glance these datasets appears to cover a lot of the same ground, but they do vary in terms of methodology, geographic detail, number of variables, and currency / frequency of release. I’ll provide a summary of the options in this post.

The Basics

Most of these datasets provide data for business establishments, which are individual physical locations where business is conducted or where services or industrial operations are performed, and are summarized by industries, which are groups of businesses that produce similar products or provide similar services. The US federal government uses the North American Industrial Classification System (NAICS), a hierarchical series of codes used to classify businesses and the labor force into divisions and subdivisions at varying levels of detail.

Since most of these datasets are generated from counts, surveys, or administrative records for business establishments they summarize business activity and the labor force based on where people work, i.e. where the businesses are. The Current Population Survey (CPS) and American Community Survey (ACS) are exceptions, as they summarize the labor force based on residency, i.e. where people live. The Census Bureau datasets tend to be more geographically detailed and present data at one point in time, while the BLS datasets tend to be more timely and are focused on providing data in time series. The BLS gives you the option to look at employment data that is seasonally adjusted; this data has been statistically “smoothed” to remove fluctuations in employment due to normal cyclical patterns in the economy related to summer and winter holidays, the start and end of school years, and general weather patterns.

Many of the datasets are subject to data suppression or non-disclosure to protect the confidentiality of businesses; if a given geography or industrial category has few establishments, or if a small number of establishments constitutes an overwhelmingly majority of employees or wages, data is either generalized or withheld. Most of these datasets exclude agricultural workers, government employees, and individuals who are self-employed. Data for these industries and workers is available through the USDA’s Census of Agriculture and the CB’s Census of Governments and Nonemployer Statistics.

The CB datasets are published on the Census Bureau’s website via the American Factfinder, the new data.census.gov, the FTP site and API, and via individual pages dedicated to specific programs. The BLS datasets are accessible through a variety of  applications via the BLS Data Tools. For each of the datasets discussed below I link to their program page, so you can see fuller descriptions of how the data is collected and what’s included.

The Census Bureau’s Business Data

Business Patterns (BP)
Typically referred to as the County and ZIP Code Business Patterns, this Census Bureau dataset is also published for states, metropolitan areas, and Congressional Districts. Published on an annual basis from administrative records, the number of employees, establishments, and wages (annual and first quarter) is published by NAICS, along with a summary of business establishments by employee size categories.
Economic Census
Released every five years in years ending in 2 and 7, this dataset is less timely than the BP but includes more variables: in addition to employment, establishments, and wages data is published on production and sales for various industries, and is summarized both geographically and in subject series that cover the entire industry. The Economic Census employs a mix of enumerations (100% counts) and sample surveying. It’s available for the same geographies as the BP with two exceptions: data isn’t published for Congressional Districts but is available for cities and towns.

Bureau of Labor Statistics Data

Current Employment Statistics (CES)
This is a monthly sample survey of approximately 150k businesses and government agencies that represent over 650k physical locations. It measures the number of workers, hours worked, and average hourly wages. Data is published for broad industrial categories for states and metropolitan areas.
Quarterly Census of Employment and Wages (QCEW)
An actual count of business establishments that’s conducted four times a year, it captures the same data that’s in the CES but also includes the number of establishments, total wages, and average annual pay (wages and salaries). Data is tabulated for states, metropolitan areas, and counties at detailed NAICS levels.
Occupational Employment Statistics (OES)
A bi-annual survey of 200k business establishments that measures the number of employees by occupation as opposed to industry (the specific job people do rather than the overall focus of the business). Data on the number of workers and wages is published for over 800 occupations for states and metro areas using the Standard Occupational Classification (SOC) system.

Labor Force Data by Residency

Current Population Survey (CPS)
Conducted jointly by the CB and BLS, this monthly survey of 60k households captures a broad range of demographic and socio-economic information about the population, but was specifically designed for measuring employment, unemployment, and labor force participation. Since it’s a survey of households it measures the labor force based on where people live and is able to capture people who are not working (which is something a survey of business establishments can’t achieve). Monthly data is only published for the nation, but sample microdata is available for researchers who want to create their own tabulations.
Local Area Unemployment Statistics (LAUS)
This dataset is generated using a series of statistical models to provide the employment and unemployment data published in the CPS for states, metro areas, counties, cities and towns. Over 7,000 different areas are included.
American Community Survey (ACS)
A rolling sample survey of 3.5 million addresses, this dataset is published annually as 1-year and 5-year period estimates. This is the Census Bureau’s primary program for collecting detailed socio-economic characteristics of the population on an on-going basis and includes labor force status and occupation. Data is published for all large geographies and small ones including census tracts, ZCTAs, and PUMAs. Each estimate is published with a margin of error at a 90% confidence interval. Labor force data from the ACS is best used when you’re OK with generally characterizing an area rather than getting a precise and timely measurement, or when you’re working with an array of ACS variables and want labor force data generated from the same source using the same methodology.

Wrap Up

In the book I’ll spend a good deal of time navigating the NAICS codes, explaining the impact of data suppression and how to cope with it, and covering the basics of using this data from an economic geography approach. I’ve written some exercises where we calculate location quotients for advanced industries and aggregate ZIP-Code based Business Patterns data to the ZCTA-level. This is still a draft, so we’ll have to wait and see what stays and goes.

In the meantime, if you’re looking for summaries of additional data sources in any and every field I highly recommend Julia Bauder’s excellent Reference Guide to Data Sources. Even though it was published back in 2014 I find that the descriptions and links are still spot on – it primarily covers public and free US federal and international government sources.

BLS Data Portal

Bureau of Labor Statistics Data Tools

US Census Bureau

Exploring US Census Datasets – Which One do you Choose?

US Census data isn’t “big data” in the technical sense, as it’s not being captured and updated in real time and it isn’t fine-grained enough to pinpoint specific coordinates. But it’s big in the conventional sense: it consists of many different datasets that record a variety of aspects about the entire population at many scales, and it’s relational and flexible in nature (tables can be joined, new data can be added or modified). And, there’s a LOT of data!

So which census dataset do you choose for a particular application? In this post I provide a summary overview of what I consider to be the big five: what they are, how they’re constructed, and what’s available. I’ll be describing summary data here, which is data that’s aggregated and published by geographic area and population groups. I won’t be addressing sample microdata (individual responses to census questions) which are available for the decennial census, American Community Survey, and Current Population Survey.

ALL of these datasets are available through the American Factfinder, and via the Census Bureau’s APIs. The smaller datasets can also be downloaded directly from the individual program pages (I note this when it’s available). Outside of he Census Bureau, the Census Reporter is a nice tool for exploring data, while the Missouri Census Data Center and the NHGIS are good alternatives for generating summaries and downloading data in bulk.

The Decennial Census (DEC)

Census 2020

When people think of “the census” the decennial census (DEC)  is the dataset that typically comes to mind. It’s the 100% count of the population that’s conducted every ten years on April 1st in years ending with a zero. Required by the Constitution and taken since 1790, its primary purpose is to provide detailed population counts that are used to re-apportion seats in the US House of Representatives. It’s also used to study population distribution and change at the smallest geographic levels, and serves as baseline data for many of the other Census Bureau statistical programs.

The modern census (from the year 2010 forward) collects just basic demographic variables about the population and housing units: gender, age, race, household relationships, group quarters, occupied and vacant units, owner and renter units. This data is published in a series of collections; the primary one is summary file 1 (SF1), but there’s also a summary file 2 (SF2) that contains more detailed cross-tabs. The Redistricting Data file (PL 94-171) is always the first to be released, and contains just the basics.

From the year 2000 back, the DEC had additional summary files that included detailed socio-economic characteristics of the population that were captured on a longer sample form sent to one in six households. The on-going American Community Survey has since replaced it, so if you are looking for anything beyond the basics you need to look at the ACS. For older DEC data, you can find the 2000 census in the American Factfinder but if you want to go back further in time use the NHGIS.

For a sample of what’s included in the DEC, look at the demographic profile table (DP-1), which contains a good cross-section of variables.

Use the DEC when:

  • You need 100% counts of the population
  • You need to use the smallest geographies available (census blocks, block groups, tracts)
  • You don’t need anything more than basic demographic variables
  • You’re studying very small population groups in a given area
  • You’re making historical comparisons with earlier DEC data

The American Community Survey (ACS)

American Community Survey

The American Community Survey (ACS) was launched in 2005 to provide more timely data about the US population on an on-going basis. In addition to the basic demographic variables captured in the DEC, the ACS also captures all the detailed socio-economic statistics that the older census used to capture, such as: employment, marital status, educational enrollment and attainment, veteran status, income, poverty, place of origin, housing value and rent, housing characteristics, and much more.

The ACS is a rolling sample survey that’s conducted each month, and is sent to 3.5 million households annually. The data is published as 1-year averages for any geographical area (state, county, place, etc) that has more than 65k people. 5-year averages are published for all geographies down to the census tract level (some block group level data is available, but it’s highly unreliable). The 5-year average is updated each year by adding a new year of data and dropping the oldest year.

ACS estimates are published at a 90% confidence interval with a margin of error that indicates the possible range of the estimate. For example, if the population for an area is 20,000 people plus or minus 1,000, that means we’re 90% confident that the population is between 19,000 and 21,000 people, and there’s a 10% chance the true population falls outside this range. The timeliness, geographic depth, and variety of variables make the ACS an essential dataset. However, it’s more complicated to work with compared to the simple counts in the DEC, and as a researcher you must pay close attention to the margins of error; estimates for small areas and small population groups can be highly unreliable. To manage this, you can aggregate the data into larger geographies or into fewer population groups.

The 1-year averages are available for all states and metropolitan areas, and for statistical areas called PUMAS that are designed to have 100k people. 1-year averages are available for large counties or places (cities and towns), but since many of these areas have less than 65k people coverage will not be complete. Use the 5-year averages if you need complete coverage of all counties or places in an area, or if you need small areas like census tracts and ZCTAs. When making historical comparisons, it’s only appropriate to compare five year periods that do not overlap. For example, comparing 2007-2011 to 2012-2016 would be appropriate.

For a thorough sample of what’s included in the ACS, look at the demographic profile tables for social (DP02), economic (DP03), housing (DP04), and demographic (DP05) variables.

Use the ACS when:

  • You need detailed socio-economic indicators about the population
  • You need the most recent data for these indicators
  • You’re not working with data below the census tract level
  • You can live with the margins of error associated with the estimates
  • Use the 1-year averages when you are looking at just large places with more than 65k people and large population groups
  • Use 5-year averages to study all areas of a given type, small areas and population groups, and to reduce the size of the margin of error for larger areas and groups

Population Estimates Program (PEP)

Population Estimates Program

The Population Estimates Program (PEP) is used to create basic, annual estimates of the US population for large areas. Using the latest DEC as a starting point, the Bureau takes data on births, deaths, and domestic and international migration to estimate what the population is the following year, and then creates a new estimate each year on July 1st. The estimates are created at the county level, and are rolled up to states and metropolitan areas and disaggregated down to census places (cities and towns). Once a new DEC is taken, the Bureau will go back to the previous decade and issue a set of revised estimates to approximate what actually happened.

Besides the total population, estimates are created for age, gender, race, and housing units. The PEP is also a source for the components of population change for each place (births, deaths, migration), which the Census Bureau compiles from other sources. Since this is a much smaller dataset compared to the DEC or ACS, PEP data can be downloaded in pre-compiled spreadsheets directly from the PEP website, in addition to the American Factfinder and APIs.

Use the PEP when:

  • You just need basic demographic variables for large geographic areas
  • You’re interested in annual population change
  • You want a simple dataset to work with
  • You’re interested in the components of population change

Current Population Survey (CPS)

Bureau of Labor Statistics

The Current Population Survey (CPS) is a monthly survey of 60,000 households that’s sponsored by the Census Bureau and the Bureau of Labor Statistics (BLS). It’s designed to provide national estimates for a variety of demographic and labor force indicators on a regular basis. The same household is: interviewed for 4 consecutive months, not interviewed again for 8 months, interviewed again for 4 consecutive months, and then is removed from the survey.

Some questions like employment and unemployment are asked repeatedly each month, other questions are asked only during certain months at regular intervals (for example, every two years in November a question is asked about voter registration and participation), and other questions are special topics that are asked on a one-time or limited basis.

Some of the most important indicators are tabulated and published as annual estimates directly on the CPS website, while many of the labor force statistics are published monthly or annually on the BLS website. Given the small sample size of the CPS relative to the ACS, it’s more common that researchers will manipulate the raw CPS data (the individual responses) to create their own estimates and cross-tabulations. The CPS website has some tools for doing this, and IPUMS USA is a popular tool as well.

Given the size of the sample, CPS estimates tabulated by the Census or BLS are published at a national or regional level, and in limited cases at the state level. Aggregated, summarized data is published with margins of error at a 90% confidence interval.

Use the CPS when:

  • You need monthly or annual, detailed demographic or labor force data for the entire country or for large regions
  • You’re looking for special topic data that’s not published in any other dataset
  • You’re comfortable working with microdata if the data you’re looking for has not been aggregated or summarized

Business Patterns and Economic Census

Economic Census

The previous four sources provide data on population and housing units. If you’re looking for data on businesses, here are the two most common options.

The County and ZIP Code Business Patterns provides annual counts of the number of businesses for states, metropolitan areas, counties, and ZIP Codes that includes the number of employees, establishments, and wages. It also provides counts of establishments classified by the North American Industrial Classification System (NAICS). You can look at broad (i.e. manufacturing, retail, finance & insurance) or narrow (auto parts manufacturers, department stores, commercial banks) NAICS summaries. If a particular area has fewer than 3 businesses of a specific type the data isn’t disclosed for confidentiality reasons. The data is generated from the Business Register, which is a government master file of businesses that’s updated on an on-going basis from several sources.

The Economic Census is conducted every five years in years that end in two or seven. It is an actual count of businesses that captures all of the fields that are published in the Business Patterns, but it also: captures sales as well as wages, provides place-level data (cities and towns), and is published in a variety of topical as well as geographic summaries. Because there is quite a time-lag between the collection and publication of this data (several years), the Economic Census is better suited for studying the economy in retrospect. The USDA publishes a Census of Agriculture which covers farming in more detail.

For business statistics:

  • Use the Business Patterns for basic counts of the latest data
  • Use the Economic Census for more detailed information that’s a bit older
  • Use the USDA’s Census of Agriculture to study farming

 

Review of The Census Reporter

Picking up where I left off from my previous post (gee – welcome to 2016!) I thought I’d give a brief review of another census resource, The Census Reporter at http://censusreporter.org/.

The Census Reporter was created to make it easier for journalists to write stories using census data. To that end, they’ve created a really slick and easy to use web site that makes the data accessible and fun to explore. From the homepage you have three ways of diving into the data: you can pull up a profile by typing in the name of a place, you can enter an address and explore places that contain that address, or you can explore tables by topic.

Census Reporter Homepage

First, the place-based approach. You can type in a named place, like a state, county, or a census place (incorporated cities and towns, or census designated places) to get started. This will give you a selection of data from the most recent release of the American Community Survey. For larger areas where the data is available, it gives you 1-year ACS data by default; otherwise you get the latest 5-year data.

You’re presented with a map of the location at the top, and a series of attractive looking graphs and charts sorted by the demographic profile table source – social, economic, housing, and demographic. If you hover over a data point in a table it gives you some geographic context by comparing this place’s value with that of larger places where it’s contained. For example, if I search for Philadelphia I can hover over the chart to get the value for the Philly metro area and the State of Pennsylvania. I can click a link below each chart to open the full table, which includes both estimates and margins of error. There are small links for viewing the table by itself on a separate page (which also gives you the ability to download it) and for embedding the chart in a website.

Census Reporter Chart and Table

Viewing the table gives you additional options, like adding additional places for comparison, or subdividing the place into smaller areas for comparison. So if I’m looking at Philadelphia, I can break it down into tracts, block groups, or ZIP Codes. From there I can toggle away from the table view to view a map or a distribution bar to explore that variable by individual geographies.

Census Reporter Chart and Table

The place-based search is great at allowing you to drill down either by topic or by these smaller geographies. But if you wanted to access a fuller range of geographies like congressional districts or PUMAs, it seems easier to do an address-based search. Back on the homepage, selecting the address button and typing in an address brings you to a map with the address pin-pointed, and on the left you can choose any geography that encloses that address. Once you do that, you get a profile for that geography and can start doing the same sorts of operations for changing the topics or tables, or adding or subdividing geographies for comparison.

Address Search

The topic-based search lets you search just by topic and then figure out the geography piece later. Of the three types of searches this one is the toughest, given the sheer number of tables and cross-tabulations. You can click on a link for a general topic to narrow things down a bit before beginning a search.

In downloading the data you have a variety of useful options: CSV, Excel, GeoJSON, KML, and shapefiles. So in theory you can download data that’s readily mappable – in practice I wasn’t able to download a shapefile, but could grab a KML or GeoJson and was able to visualize it in QGIS. One challenge in downloading any of the files is that the column names use the identifier codes, and the actual names of the variables aren’t included in the download format you choose – they’re included in a json file. So you can use that for reference, but it can’t be readily incorporated into the table.

So – where would this resource fall within the pantheon of US Census data resources? I think it’s great for accessing and, especially, visualizing profiles (profile = lots of data for one place) from the most recent ACS releases. It’s easy to use and succeeds at making the data interesting; for that reason I certainly would incorporate it into undergraduate courses where I’m introducing data. The ability to embed the charts into websites is certainly a bonus, and they deserve a big thumbs up for incorporating the margin of error data, rather than hiding or discarding it like other resources do.

The ability to create and view comparison tables (comparison = one piece of data for many places) is good – select an area and then break it down – but not as strong as the profile options. If you want to get a profile for a non-named place like a tract, ZIP Code, or PUMA you can’t do that from the profile search. You can do an address search and back out (if you know an address for that you’re interested in) or you can drill down by topic, which lets you search by summary area in addition to named places.

For users who need to download a lot of data, or for folks who need datasets that aren’t the most recent ACS release, this resource isn’t the place to go. The focus here is on providing the data in an easy and compelling way, as is. In viewing the profiles, it’s not clear if you can choose 5-year data over 1-year data for places where both datasets are available – even for large geographic areas with high population, sometimes it’s preferable to use the 5-year data to take advantage of the smaller margins of error. I also didn’t see an option for choosing decennial census data.

In short, this resource is well-designed and definitely worth exploring. It seems clear why this would be a go-to source for journalists, but it can be for many others as well.

The New NYC Census Factfinder

As I’m updating my presentations and handouts for the new academic year, I’m taking two new census resources for a test drive. I’ll talk about the first resource in this post.

The NYC Department of City Planning has been collating census data and publishing it for the City for quite some time. They’ve created neighborhood tabulation areas (NTAs) by aggregating census tracts, so that they could publish more reliable ACS data for small areas (since the margins of error for census tracts can be quite large) and so that New Yorkers have data for neighborhood-like areas that they would recognize. The City also publishes PUMA-level data that’s associated with the City’s Community Districts, as well as borough and city-level data. All of this information is available in a large series of Excel spreadsheets or PDFs in the form of comparison tables for each dataset.

The Department of Planning also created the NYC Census Factfinder, a web-mapping interface that let’s users explore census tract and NTA level data profiles. You could plug in an address or click on the map and get a 2010 Census profile, or a demographic change profile that showed shifts between the 2000 and 2010 Census.

pic1_factfinder

It was a nice application, but they’ve just made a series of updates that make it infinitely better:

  1. They’ve added the American Community Survey data from 2009-2013, and you can view the four demographic profile tables (demographics, social, economic, and housing) for tracts and NTAs.
  2. Unlike many other sources, they do publish the margin of error for all of the ACS data, which is immensely important. Estimates that have a high margin of error (as defined by a coefficient of variation) appear in grey instead of solid black. While the actual margins are not shown by default, you can simply click the Show radio button to turn on the Reliability data.
  3. Tracts or neighborhoods can be compared to the City as a whole or to an individual borough by selecting the drop down for the column header.
  4. This is especially cool – if you’re viewing census tracts you can use the select pointer and hold down the Control key (Command key on a Mac) to select multiple tracts, and then the data tables will aggregate the tract-level data for you (so essentially you can build your own neighborhoods). What’s noteworthy here is that it also calculates the new margins of error for all of the derived estimates, AND it even calculates new medians and averages with margins of error! This is something that I’ve never seen in any other application.
  5. In addition to searching for locations by address, you can hit the search type drop down and you have a number of additional options like Intersection, Place of Interest, and even Subway Stations.

nyc_factfinder_table

There are a few quirks:

    1. I had trouble viewing the map in Firefox – this isn’t a consistent problem but something I noticed today when I went exploring. Hopefully something temporary that will be corrected. Had no problems in IE.
    2. If you want to click to select an area on the map, you have to hit the select button first (the arrow beside the zoom slider and print button) and then click on your area to select it. Just clicking on the map without hitting select first won’t do much – it will just highlight the area and tell you it’s name. Clicking the arrow button turns it blue and allows you to select features, clicking it again turns it white and lets you identify features and pan around the map.

factfinder_buttons

  1. The one bummer is that there isn’t a way to download any of the profiles – particularly the ones you custom design by selecting tracts. Hitting the Get Data button takes you out of the Factfinder and back to the page with all of the pre-compiled comparison tables. You can print the table out to a PDF for presentation purposes, but if you want a data-friendly format you’ll have to highlight and select the table on the page, copy, and paste into a spreadsheet.

These are just small quibbles that I’m sure will eventually be addressed. As is stands, with the addition of the ACS and the new features they’ve added, I’ll definitely be integrating the NYC Census Factfinder into my presentations and will be revising my NYC Neighborhood Census data handout to add it as a source. It’s unique among resources in that it provides NTA-level data in addition to tract data, has 2000 and 2010 historical change and the latest 5-year ACS (with margins of error) in one application, and allows you to build your own neighborhoods to aggregate tract data WITH new margins of error for all derived estimates. It’s well-suited for users who want basic Census demographic profiles for neighborhood-like areas in NYC.

Downloading Data for Small Census Geographies in Bulk

I needed to download block group level census data for a project I’m working on; there was one particular 2010 Census table that I needed for every block group in the US. I knew that the American Factfinder was out – you can only download block group data county by county (which would mean over 3,000 downloads if you want them all). I thought I’d share the alternatives I looked at; as I searched around the web I found many others who were looking for the same thing (i.e. data for the smallest census geographies covering a large area).

The Census FTP site at http://www2.census.gov/

This would be the first logical step, but in the end it wasn’t optimal based on my need. When you drill down through Census 2010, Summary File 1, you see a file for every state and a national file. Initially I thought – great! I’ll just grab the national file. But the national file does NOT contain the small census statistical areas – no tracts, block groups, or blocks. If you want those small areas you have to download the files for each of the states – 51 downloads. When you download the data you can also download an MS Access database, which is an empty shell with the geography and field headers, and you can import each of the text file data tables (there a lot of them for 2010 SF1) into the db and match them to the headers during import (the instructions that were included for doing this were pretty good). This is great if you need every variable in every table for every geography, but I was only interested in one table for one geography. I could just import the one text file with my table, but then I’d have to do this import process 51 times. The alternative is to use some Python to get that one text file for every state into one big file and then do the import once, but I opted for a different route.

The NHGIS at https://www.nhgis.org/

I always recommend this resource to anyone who’s looking for historical census data or boundary files, but it’s also good if you want current data for these small areas. I was able to use their query window to widdle down the selection by dataset (2010 SF1), geography (block groups), and topic (Hispanic origin and race in my case), then I was able to choose the table I needed. On the last screen before download I was able to check a box to include all 50 states plus DC and PR in one file. I had to wait a couple minutes for the request to process, then downloaded the file as a CSV and loaded it into my database. This was the best solution for my circumstances by far – one table for all block groups in the country. If you had to download a lot (or all) of the tables or variables for every block group or block it may take quite awhile, and plugging through all of those menus to select everything would be tedious – if that’s your situation it may be easier to grab everything using the Census FTP.

nhgis

UExplore / Dexter at http://mcdc.missouri.edu/applications/uexplore.shtml

The Missouri Census Data Center’s UExplore / Dexter tool lets you choose a dataset and takes you to a window that resembles a file system, with a ton of files in it. The MCDC takes their extracts directly from the Census, so they’re structured in a similar way to the FTP site as state-based files. They begin with the state prefix and have a name that indicates geography – there are files for block groups, blocks, and one for everything else. There are national files (which don’t contain small census areas) that begin with ‘us’. The difference here is – when you click on a file, it launches a query window that let’s you customize the extract. The interface may look daunting at first, but it’s worth exploring (and there’s a tutorial to help guide you). You can choose from several output formats, specific variables or tables (if you don’t want them all), and there are a bunch of handy options that you can specify like aggregation or percent totals. In addition to the complete datasets, they’ve also created ‘Standard Extracts’ that have the most common variables, if you want just a core subset. While the NHGIS was the best choice for my specific need, the customization abilities in Dexter may fit your needs – and the state-level block group and block data is conveniently broken out from the other files.

Lastly…

There are a few others tools – I’ll give an honorable mention to the Summary File Retrieval tool, which is an Excel plugin that lets you tap directly into the American Community Survey from a spreadsheet. So if you wanted tracts or block groups for a wide area for but a small number of variables (I think 20 is the limit) that could be a winner, provided you’re using Excel 2007 or later and are just looking at the ACS. No dice in my case, as I needed Decennial Census data and use OpenOffice at home.