Geodata Sources

Tip

For a list of all geodata sources currently in the registry, see geodata_sources.csv on GitHub.

Geodata sources define the base geospatial data that can be joined to CSV data. There are four geography levels (referred to as "summary levels" in the codebase): States, Counties, Tracts, and ZIP Code Tabulation Areas (ZCTAs). Because we have many different years of data in the CSVs, we also need to include different years of spatial data, since boundaries and geographic unit IDs change over time.

Characteristics of geodata source shapefiles:

  • Has a HEROP_ID field, which is used by all CSV files for joins.
  • Stored as a zip file in AWS S3, not locally in the repository.
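
As a rough illustration of how the shared HEROP_ID field enables joins, the sketch below matches CSV rows to geodata attribute records using only the Python standard library. The function name join_on_herop_id, the sample IDs, and the value column are hypothetical and only serve the example. Dropping unmatched records (an inner join) is an assumption, not necessarily how the project performs its joins.

```python
import csv
import io


def join_on_herop_id(csv_text, features):
    """Attach CSV rows to geodata records that share a HEROP_ID (inner join)."""
    # Index the CSV rows by HEROP_ID for constant-time lookups.
    rows_by_id = {
        row["HEROP_ID"]: row for row in csv.DictReader(io.StringIO(csv_text))
    }
    joined = []
    for feature in features:
        row = rows_by_id.get(feature["HEROP_ID"])
        if row is not None:
            # Merge the geodata attributes with the CSV columns.
            joined.append({**feature, **row})
    return joined


# Hypothetical sample data: these IDs and column names are made up.
features = [
    {"HEROP_ID": "id-001", "name": "Example County A"},
    {"HEROP_ID": "id-002", "name": "Example County B"},
]
csv_text = "HEROP_ID,value\nid-001,12.5\nid-003,7.0\n"

joined = join_on_herop_id(csv_text, features)
```

Only id-001 appears in both inputs, so it is the only record kept. Values read from the CSV remain strings unless converted explicitly.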

Each geodata source is defined by the following attributes:

Property | Description | Comment
name | ID of geodata source | Will always be in {geography (plural)}-{year} format.
title | Human-readable title |
description | Short description | Will always be "Shapefile of {geography} boundaries from the US Census Bureau, {year}".
path | Path to zipped shapefile | This is a full URL to the zipped file in AWS S3.
format | | Will always be shp
mediatype | | Will always be application/vnd.shp
summary_level | Type of geography | Will be one of: state, county, tract, or zcta.
bq_dataset_name | Target dataset during BigQuery upload | To be deprecated.
bq_table_name | Target table during BigQuery upload | To be deprecated.
schema | A nested JSON schema object | The schema defines primaryKey (always "HEROP_ID") and fields, which will include at least an entry for HEROP_ID.
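
The fixed patterns for name and description can be generated rather than typed by hand. The following is a minimal sketch under stated assumptions: the helper names build_name and build_description are hypothetical, and the plural forms other than "counties" are guesses. The description follows the template as written in the table above.

```python
# Plural forms are an assumption; check them against the registry before use.
PLURALS = {
    "state": "states",
    "county": "counties",
    "tract": "tracts",
    "zcta": "zctas",
}


def build_name(summary_level, year):
    """Build a name in {geography (plural)}-{year} format."""
    return f"{PLURALS[summary_level]}-{year}"


def build_description(summary_level, year):
    """Fill in the description template from the table above."""
    return (
        f"Shapefile of {summary_level} boundaries "
        f"from the US Census Bureau, {year}"
    )


name = build_name("county", 2010)
description = build_description("county", 2010)
```

An unknown summary level raises a KeyError here, which is one simple way to surface a typo early.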

Note

Geodata sources are very similar to table sources (based on Frictionless Data Resources), but instead of referencing CSV files they reference zipped ESRI Shapefiles, and they do include the schema property.

Example geodata_source

{
    "bq_dataset_name": "spatial",
    "bq_table_name": "counties2010",
    "name": "counties-2010",
    "title": "County Boundaries, 2010",
    "description": "Shapefile of county boundaries from the US Census Bureau, 2010.",
    "path": "https://herop-geodata.s3.us-east-2.amazonaws.com/census/county-2010-500k-shp.zip",
    "format": "shp",
    "mediatype": "application/vnd.shp",
    "summary_level": "county",
    "schema": {
        "primaryKey": "HEROP_ID",
        "fields": [
            {
                "name": "HEROP_ID",
                "title": "HEROP_ID",
                "type": "string"
            },
            {
                "name": "name",
                "title": "Name",
                "type": "string"
            }
        ]
    }
}
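
A definition like the one above could be checked against the conventions in the attribute table before it is added to the registry. The sketch below is one possible approach, not the project's actual validation code. The function name validate_geodata_source is hypothetical, and the name pattern (lowercase letters, a hyphen, a four-digit year) is an assumed reading of the {geography (plural)}-{year} format.

```python
import re

SUMMARY_LEVELS = {"state", "county", "tract", "zcta"}


def validate_geodata_source(source):
    """Return a list of problems; an empty list means all checks passed."""
    problems = []
    # Assumed pattern: lowercase plural geography, hyphen, four-digit year.
    if not re.fullmatch(r"[a-z]+-\d{4}", source.get("name", "")):
        problems.append("name is not in {geography (plural)}-{year} format")
    if source.get("format") != "shp":
        problems.append("format must be shp")
    if source.get("mediatype") != "application/vnd.shp":
        problems.append("mediatype must be application/vnd.shp")
    if source.get("summary_level") not in SUMMARY_LEVELS:
        problems.append("summary_level must be state, county, tract, or zcta")
    schema = source.get("schema", {})
    if schema.get("primaryKey") != "HEROP_ID":
        problems.append("schema primaryKey must be HEROP_ID")
    field_names = {field.get("name") for field in schema.get("fields", [])}
    if "HEROP_ID" not in field_names:
        problems.append("schema fields must include HEROP_ID")
    return problems


# A trimmed copy of the example definition above.
example = {
    "name": "counties-2010",
    "format": "shp",
    "mediatype": "application/vnd.shp",
    "summary_level": "county",
    "schema": {
        "primaryKey": "HEROP_ID",
        "fields": [{"name": "HEROP_ID", "title": "HEROP_ID", "type": "string"}],
    },
}

problems = validate_geodata_source(example)
```

Returning a list of messages instead of raising on the first failure lets a caller report every issue in a definition at once.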