(page updated 6/17/03)
 
Census 2000 Data Access and Use
    Public Use Microdata Sample (PUMS) Files
    -- an information resource developed and maintained by Warren Glimpse

This page is a focus section on the Public Use Microdata Sample (PUMS) data. The main Census 2000 Data Access and Use page is located at http://proximityone.com/cen2000.htm

PUMS files contain data for individual persons and housing units with any confidential identifier data removed. The individual (micro) data in the files are developed by taking a sample of individual respondent records from the master database (basic record file).

PUMS File Access

The Census Bureau is now in the process of completing release of all 5-percent PUMS files. The 1-percent PUMS files have been released for all states. All PUMS files are scheduled for completion by September 2003.

Census 2000 1% PUMS state files. The first Census 2000 one percent Public Use Microdata Sample (PUMS) became available as of 4/23/03.

The files may be accessed via http://www2.census.gov/census_2000/datasets/PUMS/OnePercent .

Census Bureau Census 2000 one percent PUMS Web site.

Census 2000 5% PUMS state files. The Census 2000 five percent Public Use Microdata Sample (PUMS) files were released by the Census Bureau starting 8/6/03.

The files may be accessed via http://www2.census.gov/census_2000/datasets/PUMS/FivePercent/ .

Census Bureau Census 2000 five percent PUMS Web site.

Census 2000 PUMS technical documentation (4.5 MB).

Overview. The Census 2000 Public Use Microdata Sample (PUMS) files contain data records consisting of a sample of individual respondent data, with appropriate safeguards to ensure that confidentiality is maintained. The primary advantage of the PUMS files is that they enable the tabulation of user-defined subject matter characteristics not available in the Census 2000 summary statistic files.

For decennial censuses, the term "microdata file" refers to a file that contains a sample of respondent data records. The preplanned decennial census microdata files are referred to as the Public Use Microdata Sample (PUMS) files.

Public Use Microdata Sample files contain records representing 5-percent or 1-percent samples of the occupied and vacant housing units in the U.S. and the people in the occupied units. Group quarters people also are included. The file contains individual weights for each person and housing unit, which when applied to the individual records, expand the sample to the relevant total.
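As a minimal sketch of how the weights expand the sample to the relevant total, the following Python example sums the weights of the records that match a condition. The record layout, the field names ("age", "weight") and the values are hypothetical and are not taken from the PUMS record layout.

```python
# Hypothetical person records; each weight is the number of people in the
# full population that the sampled record represents.
person_records = [
    {"age": 34, "weight": 20},
    {"age": 7, "weight": 25},
    {"age": 71, "weight": 18},
]


def weighted_total(records, predicate=lambda r: True, weight_key="weight"):
    """Estimate a population total by summing the weights of matching records."""
    return sum(r[weight_key] for r in records if predicate(r))


# Estimated total population represented by the sample records.
total_people = weighted_total(person_records)

# Estimated number of people aged 18 or older.
total_adults = weighted_total(person_records, lambda r: r["age"] >= 18)
```

The unweighted count of records (3 here) is only the sample size; the weighted sum is the estimate of the population total.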

Geography. PUMS files contain geographic units known as super-Public Use Microdata Areas (super-PUMAs) and Public Use Microdata Areas (PUMAs). To maintain the confidentiality of the PUMS data, minimum population thresholds are set for PUMAs and super-PUMAs. For the 1-percent state-level files, the super-PUMAs contain a minimum population of 400,000 and are composed of a PUMA or a group of contiguous PUMAs delineated on the 5-percent state-level PUMS files. Super-PUMAs are a new geographic entity for Census 2000. The 5-percent state-level files contain PUMAs, each having a minimum population of 100,000; the 5-percent files also show the corresponding super-PUMA codes. Each state is separately identified and may be composed of one or more super-PUMAs or PUMAs. Large metropolitan areas may be subdivided into super-PUMAs and PUMAs. PUMAs and super-PUMAs do not cross state lines. Super-PUMAs and PUMAs also are defined for place of residence on April 1, 1995 and for place of work.

Small Area Tabulations: Implications of PUMA Geography. The following graphic shows the maximum potential small area geography for which tabulations can be prepared using the 5% PUMS data. This map shows Harris County, TX (blue boundary) with the City of Houston in beige color. Census 2000 census tract boundaries are shown with green boundaries. 5% PUMA areas are shown with red boundaries.



Contrast the potential 5% PUMA small area tabulation geography (each area must meet the minimum population threshold criteria) with the corresponding 1% PUMA small area tabulation geography shown in the following map.



Separate 1-Percent and 5-Percent Files. Each microdata file is a stratified sample of the population which was created by subsampling the full census sample (approximately 15.8 percent of all housing units) that received census long form questionnaires. Initial sampling was done address-by-address in order to allow the study of family relationships and housing unit characteristics for occupied and vacant units. Sampling of people in institutions and other group quarters was done on a person-by-person basis.

There are two independently drawn samples, designated "5 percent" and "1 percent," each featuring a different geographic scheme. Nationwide, the Census 2000 5-percent sample provides the records for over 14 million people and over 5 million housing units. For the 1-percent sample, there are records for over 2.8 million people and over 1 million housing units. Since processing a smaller sample is less resource intensive, it may be easier to produce extracts using the subsample numbers provided in the housing record.
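A rough sketch of a subsample extract follows. It assumes that each housing record carries a two-digit subsample number from 0 to 99 and that the numbers are spread evenly across the file; the field name "subsample" and the function name are hypothetical. Keeping k of the 100 numbers yields roughly k/100 of the records, so the weights are scaled by 100/k to keep estimates on the same footing.

```python
def subsample_extract(housing_records, keep_numbers, key="subsample"):
    """Keep records whose subsample number is in keep_numbers.

    Returns the extract and the factor by which the original weights
    should be multiplied when estimating totals from the extract.
    """
    keep = set(keep_numbers)
    scale = 100 / len(keep)
    extract = [r for r in housing_records if r[key] in keep]
    return extract, scale


# Illustrative data: one record per subsample number, each with weight 100.
housing_records = [{"subsample": n, "weight": 100} for n in range(100)]

# A one-in-ten extract: subsample numbers 0 through 9.
extract, scale = subsample_extract(housing_records, range(10))
estimated_units = sum(r["weight"] for r in extract) * scale
```

With real data the scaled estimate will differ somewhat from the full-file estimate because of sampling variability.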

1-Percent Files. The 1-percent files give the maximum amount of social, economic, and housing data available. There is no national minimum threshold for the identification of subject matter categories, with the exception of a national minimum population of 8,000 for race and Hispanic origin. The goal of these files is to provide a similar level of detail as was available in the 1990 PUMS files (and, in some cases, more detail).

In order to provide the level of characteristic detail for the 1-percent files described above, the minimum geographic population threshold needed to be raised above 100,000 (the PUMA minimum). A new geographic entity was created — the super-PUMA. Super-PUMAs have a minimum population of 400,000 and are composed of a PUMA or PUMAs delineated on the 5-percent PUMS files. Each state is identified, and any state with a population of 800,000 or greater can be subdivided into two or more super-PUMAs.
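The arithmetic behind the 800,000 figure can be sketched as follows: with a 400,000 minimum per super-PUMA, a state can hold at most its population divided by 400,000 (rounded down) super-PUMAs, and every state is identified as at least one. The function name is hypothetical, and this is only an upper bound, not the Census Bureau's delineation procedure.

```python
SUPER_PUMA_MIN_POPULATION = 400_000


def max_super_pumas(state_population):
    """Upper bound on the number of super-PUMAs a state could contain."""
    return max(1, state_population // SUPER_PUMA_MIN_POPULATION)


# A state needs at least 800,000 people before two super-PUMAs are possible.
examples = {pop: max_super_pumas(pop) for pop in (500_000, 799_999, 800_000, 1_250_000)}
```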

5-Percent Files. To maintain confidentiality, while retaining as much characteristic detail as possible, a minimum threshold of 100,000 nationally is set for the identification of variable categories within categorical variables in the 5-percent PUMS files.
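The idea of a category threshold can be illustrated with a small sketch: categories whose weighted national total falls below a threshold are folded into a combined category. This is an illustration only. The function, the field names, and the single "other" label are assumptions; the actual collapsing rules are defined in the PUMS technical documentation and are applied by the Census Bureau before release.

```python
from collections import Counter


def collapse_small_categories(records, key, threshold,
                              weight_key="weight", other_label="other"):
    """Return copies of records with below-threshold categories relabeled."""
    totals = Counter()
    for r in records:
        totals[r[key]] += r[weight_key]
    small = {cat for cat, total in totals.items() if total < threshold}
    return [
        {**r, key: other_label} if r[key] in small else dict(r)
        for r in records
    ]


# Toy data with a toy threshold of 100 (not the threshold used for PUMS).
records = [
    {"category": "A", "weight": 60},
    {"category": "A", "weight": 50},
    {"category": "B", "weight": 30},
]
collapsed = collapse_small_categories(records, "category", threshold=100)
```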

Each PUMA in the 5-percent files must meet a minimum population threshold of 100,000. The minimum PUMA threshold was held at 100,000 by increasing the degree of subject matter collapsing as described above. The 100,000 minimum population threshold, the threshold set for both the 1980 and 1990 PUMS files, permits greater historical comparability.

Each record in the microdata file contains the population or housing data attributes for an individual respondent or sample case. The data record contains responses to the census questionnaire, typically recoded to follow specifications for the specific item (e.g., income), but with name, address, and any respondent identification removed and the geography sufficiently broad to protect confidentiality.

Perhaps the most important attribute of a microdata file is that microdata files may be used to estimate summary statistics that have not been provided as a part of preplanned summary statistic files. Stated another way, the microdata files enable you to prepare your own customized tabulations and cross tabulations of most population and housing subjects covered by the questionnaire.
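A customized cross tabulation of this kind can be sketched in a few lines of Python: each record adds its weight to the cell identified by the two characteristics being crossed. The field names ("sex", "tenure", "weight") and the values are hypothetical and are not taken from the PUMS record layout.

```python
from collections import Counter


def weighted_crosstab(records, row_key, col_key, weight_key="weight"):
    """Sum record weights into cells keyed by (row value, column value)."""
    table = Counter()
    for r in records:
        table[(r[row_key], r[col_key])] += r[weight_key]
    return table


records = [
    {"sex": "F", "tenure": "owner", "weight": 20},
    {"sex": "M", "tenure": "renter", "weight": 25},
    {"sex": "F", "tenure": "owner", "weight": 18},
    {"sex": "F", "tenure": "renter", "weight": 22},
]
table = weighted_crosstab(records, "sex", "tenure")
```

Each cell of the resulting table is an estimate of the number of people with that combination of characteristics. Any pair of fields on the record can be crossed this way.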

Using the PUMS Files

The PUMS technical documentation contains the detailed record layout of the data. However, this descriptive information (the "codebook") is embedded in a PDF, and neither the PDF nor an ASCII version of it is readily usable for processing the files.

Use this spreadsheet file (expand this zip file) to review the contents of the PUMS 1-percent person and housing unit records. Fields in the person and housing unit records are defined and values that the variables/fields take on are also summarized.
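Once the field definitions are available in tabular form, they can drive a simple fixed-width parser. The sketch below assumes each field is described by a name, a 1-based starting position, and a length; the layout entries and the sample line are invented for illustration and should be replaced with the positions given in the technical documentation.

```python
# Hypothetical layout: (field name, 1-based start position, length).
LAYOUT = [
    ("rectype", 1, 1),
    ("serialno", 2, 7),
    ("state", 10, 2),
]


def parse_record(line, layout):
    """Slice a fixed-width line into a dict of field name -> string value."""
    return {
        name: line[start - 1:start - 1 + length]
        for name, start, length in layout
    }


sample_line = "H0000123148"  # made-up record, not real PUMS data
record = parse_record(sample_line, LAYOUT)
```

Values are kept as strings so that leading zeros in codes are preserved; convert to numbers only for fields that are genuinely numeric.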

More detailed information on PUMS applications, making use of software including SAS and Access, is planned for this section. If you have questions on these applications now, contact Proximity.
