Friday, April 18, 2008

Mapping with Mr. ZIP

ZIP code areas and what they can (and can't) do for you

I am often asked to map, or help map, some data by ZIP code. The desired result is a thematic choropleth (or heat map) of sales or customer data for these ZIP code divisions. The request itself is not unreasonable, but it is filled with difficulties inherent in the nature and history of the US ZIP code system.

The problems with mapping data by ZIP code areas ultimately originate with the idea that we live in a ZIP code. The ZIP code area is ingrained in our minds as a static geographic unit with definite boundaries that divides people into manageable groups. The concept even creeps in from pop culture, à la Beverly Hills, 90210.



If we live in ZIP code areas, the reasoning goes, they are geographic locations on which data can be placed, shaped, analyzed and visualized.

Unfortunately, the USPS does not and will not generate a ZIP code area map, for one difficult-to-grasp reason: ZIP codes do not represent areas. ZIP codes are assigned to post offices or mail collection spots, hence the availability of coordinate-location GIS shapefiles and tables. These points service specific routes for mail carriers. Streets, sides of streets, or even a single building (and, rarely, different floors of a building) may be covered by different ZIP codes, each of which acts as a routing code. ZIP codes, therefore, may be thought of as linear and point features, but not areas.

One problem influencing the creation of ZIP code maps is that ZIP codes change every year. As new roads are built and companies grow or shrink, ZIP codes may be added or removed. On average, there are about 43,000 ZIP codes in the United States. If you take out the unique P.O. Box, corporate, military and government agency ZIP codes, you end up with a little over 40,000. About 1,000 to 3,000 ZIP codes may be added or removed, but the total usually fluctuates around 40,000.

These changes mean that any ZIP code map needs to be updated every few months to remain current. While the USPS will produce lists of post offices and ZIP codes, it stays away from delineating boundaries. ZIP code point maps may therefore be seen as more reliable, since they are based on post office or mail collection locations. Even so, these point maps still need to be updated often.

While the USPS does not distribute a ZIP code area map, there are a few third-party sources. The techniques used to create these vary, but one method involves buffering the U.S. street centerlines in a GIS, then assigning each resulting polygon the ZIP code on that street segment. Further work must be done to close up holes, divide polygons along street centers and deal with ZIP code "islands", where a route is discontinuous because a portion of it is cut off by another route. One method I have experimented with is the creation of Voronoi polygons. These Voronoi polygons make no claim to match any "boundaries", but they form a closed collection of polygons built from the post office point locations, suitable for colorful thematic maps. Ultimately, anyone purporting to offer ZIP code areas as GIS shapefiles has at best an extremely rough estimate.
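The Voronoi idea can be illustrated without building any polygons, because a location lies inside the Voronoi cell of whichever seed point is nearest to it. The following is a minimal sketch of that nearest-point assignment, not the workflow I used in a GIS. The `POST_OFFICES` table, its ZIP codes and its coordinates are sample values made up for illustration, and the function names are hypothetical.

```python
import math

# Sample post office points: ZIP code -> (latitude, longitude).
# These values are illustrative assumptions, not authoritative USPS data.
POST_OFFICES = {
    "27101": (36.10, -80.24),
    "27284": (36.12, -80.07),
    "27401": (36.07, -79.79),
}


def haversine_km(a, b):
    """Great-circle distance in kilometers between two (lat, lon) pairs."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def nearest_zip(location, post_offices=POST_OFFICES):
    """Return the ZIP code whose post office point is closest to location.

    This is the same as asking which Voronoi cell the location falls in
    when the post office points are used as the Voronoi seeds.
    """
    return min(post_offices,
               key=lambda z: haversine_km(location, post_offices[z]))


# A customer location between the first two sample points.
customer = (36.11, -80.10)
print(nearest_zip(customer))
```

As with the polygons themselves, the ZIP code returned here is only the nearest post office point. It says nothing about which ZIP code actually delivers mail to that location.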

This is not to say you should never map your data by ZIP code, but be aware of the factors that may eliminate data or alter the accuracy of your map. Much of this depends on the ultimate purpose of the ZIP code map. A general reference showing which businesses or homes fall into some ZIP code may not be accurate, since the ZIP code boundaries are not officially defined. In the end, you should realize that:

  1. ZIP codes are not areas. They do not represent boundaries on the ground in the way that state or county boundaries do.

  2. ZIP codes are fluid in that they change periodically.

  3. ZIP codes might represent a single building, P.O. box, or U.S. Naval vessel.

  4. ZIP codes may not fall into the city they are related to, so any census data tied to them may be erroneous.

Sort of, official ZIP code boundaries, but not really

The USPS, despite many calls to do otherwise, has stayed out of the spatial data business. ZIP codes, in the postal mind, were designed to do one job, and they do that job very well. So the Census Bureau eventually took up the task. Enter the ZIP Code Tabulation Area.

The US Census Bureau created the ZIP Code Tabulation Area, or ZCTA (pronounced zikta), out of frustration with requests to map decennial census data at the ZIP code level. Its solution was to aggregate census blocks and assign each group a ZIP code. This process leads to some problems, such as multiple true ZIP code points (post offices) falling inside a single ZCTA polygon (not in all, but in several), as illustrated in the map below.

Zip Code Points vs ZCTA Polygons

The purple lines represent ZCTA boundaries (ZCTA codes in purple font) between Winston-Salem and Greensboro, North Carolina. Post office locations are green triangles, labeled with the ZIP codes they service. Note that many of the ZCTA boundaries contain 2 to 5 post offices, meaning any attempt to treat the ZCTA boundary shapefile as a ZIP code boundary will result in missing data.

The Census Bureau, not to be outdone, also provides census statistics tabulated specifically for these ZCTA boundaries. This data allows researchers and businesses to get some rough approximation of the demographics for some ZIP codes... very rough though. Beware too that there are over 43,000 ZIP codes in the US, while there are just under 30,000 ZCTA numbers. There will always be missing information when you try to map something else, such as customer or business data, using a ZCTA map layer.
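One way to see how much information goes missing is to check your data against the ZCTA codes before mapping anything. Below is a minimal sketch of that check. The sales figures, the ZIP codes and the `zcta_codes` set are hypothetical sample values, and in practice the set would come from the attribute table of whatever ZCTA layer you are using.

```python
# Hypothetical sales totals keyed by the ZIP code on each customer record.
sales_by_zip = {
    "27101": 1200.0,
    "27102": 300.0,
    "27284": 450.0,
    "27285": 80.0,
}

# Hypothetical set of codes that have a polygon in the ZCTA layer.
zcta_codes = {"27101", "27284"}


def split_by_zcta_match(sales, zctas):
    """Split sales into ZIP codes that have a ZCTA polygon and those that do not."""
    matched = {z: v for z, v in sales.items() if z in zctas}
    unmatched = {z: v for z, v in sales.items() if z not in zctas}
    return matched, unmatched


matched, unmatched = split_by_zcta_match(sales_by_zip, zcta_codes)
lost = sum(unmatched.values())
total = sum(sales_by_zip.values())

print(f"{len(unmatched)} of {len(sales_by_zip)} ZIP codes have no ZCTA polygon")
print(f"{lost / total:.1%} of sales would not appear on the map")
```

Running a check like this first at least tells you how large the hole in the finished map will be, even though it does nothing to fill it.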

With ZCTA you should understand, at the very least:

  1. The ZCTA was designed for specific census datasets and does not reflect the USPS postal code system.

  2. The USPS does not support it.

  3. Real ZIP codes are point locations referring to post offices or mail collection spots.

Further reading on ZIP codes, ZCTAs and related data can be found here:

An article that shares pretty much the same feelings I do on the matter

About ZIP Code Tabulation Areas

The Census position on ZCTA and ZIP codes

ZCTA FAQ

US Census Cartographic Boundary Files

2 comments:

  1. Another good article is at
    http://www.maponics.com/GIS_Map_Data/ZIP_Code_and_Carrier_Route_GIS/ZCTAs_vs_ZIP_Code_Data/zctas_vs_zip_code_data.html

  2. I was more than happy to uncover this site. I need to thank you for your time
    for this fantastic read!! I definitely really liked every
    bit of it and i also have you bookmarked to see new things in your
    blog.
