ids

ids.py

Functions for computing geographic identifiers.

  • geo_id: links identical polygons reliably over time, matching within a small spatial tolerance to minimize corrections

  • openlocationcode: for point locations

  • Unique Building Identifier (UBID): for building footprints

Functions

get_geo_ids(gdf[, grid_degrees, hash_length, ...])

Generate stable, unique parcel IDs from polygon geometry.

add_geo_id_index(gdf[, name, handle_duplicates, verbose])

Return the GeoDataFrame using geo_id as the index

get_openlocationcodes(gdf[, name, codelength, ...])

Assign a location index based on centroid (Open Location Code).

add_openlocationcode_index(gdf[, name, codelength, ...])

Return the GeoDataFrame using the Open Location Code as the index

add_ubid_index(gdf[, name, duplicates])

Return the GeoDataFrame using the UBID as the index

decode_openlocationcodes(codes) → geopandas.GeoDataFrame

Decode level-11 Open Location Codes to a GeoDataFrame of points.

decode_ubids(ubids[, outer]) → geopandas.GeoDataFrame

Decode a Series of UBIDs to a GeoDataFrame of bounding boxes.

Module Contents

openplaces.geo.ids.get_geo_ids(gdf, grid_degrees=1e-06, hash_length=24, handle_duplicates=True, verbose=False)

Generate stable, unique parcel IDs from polygon geometry.

Uses a fixed-degree grid in EPSG:4326 for full Earth coverage.

Parameters:
  • gdf (GeoDataFrame) – GeoDataFrame with parcel geometries

  • grid_degrees (float) – Grid size in degrees (default 1e-06)

  • hash_length (int) – Number of hex characters in output (default 24 = 96 bits)

  • handle_duplicates (bool) – If True, adds numeric suffix to duplicate GIDs (default True)

  • verbose (bool) – If True, prints information on duplicates

Returns:

Series of geo_ids with same index as input GeoDataFrame

Return type:

pd.Series

Notes

Why degrees instead of a projected CRS:

  • No projection covers the entire Earth without distortion or singularities

  • The degree grid is globally consistent (same grid cell = same x/y)

  • Simple and fast (no reprojection needed)

  • Works everywhere, including the poles
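The idea of a degree grid plus a truncated hash can be illustrated with a minimal standard-library sketch. This is not the implementation of get_geo_ids, whose internals are not shown here. The function name sketch_geo_id, the choice of SHA-256, and the vertex normalization are all assumptions made for illustration.

```python
import hashlib


def sketch_geo_id(coords, grid_degrees=1e-06, hash_length=24):
    """Hypothetical sketch: hash a polygon ring after snapping to a degree grid.

    coords is a list of (lon, lat) tuples in EPSG:4326.
    """
    # Snap every vertex to integer grid indices; sub-grid jitter disappears
    # unless a vertex sits right on a rounding boundary.
    snapped = [(round(x / grid_degrees), round(y / grid_degrees)) for x, y in coords]
    # Drop a repeated closing vertex, then rotate the ring so it starts at its
    # smallest vertex; the starting point then no longer affects the ID.
    if len(snapped) > 1 and snapped[0] == snapped[-1]:
        snapped = snapped[:-1]
    start = snapped.index(min(snapped))
    ring = snapped[start:] + snapped[:start]
    payload = ";".join(f"{ix},{iy}" for ix, iy in ring).encode("ascii")
    # Each hex character carries 4 bits, so 24 characters keep 96 bits.
    return hashlib.sha256(payload).hexdigest()[:hash_length]


square = [(10.0, 50.0), (10.001, 50.0), (10.001, 50.001), (10.0, 50.001)]
jittered = [(x + 2e-07, y - 2e-07) for x, y in square]
shifted = [(x + 0.01, y) for x, y in square]

id_square = sketch_geo_id(square)
id_jittered = sketch_geo_id(jittered)
id_shifted = sketch_geo_id(shifted)
```

In this sketch the jittered copy maps to the same ID as the original, while the shifted copy does not. Ring direction is not normalized here, so a reversed ring would hash differently.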

Trade-off:

  • Grid “size” in meters varies by latitude (larger at the equator)

  • But parcels at the same location always use the same grid

  • This guarantees that non-overlapping parcels get different IDs
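How much the grid size in meters varies with latitude can be estimated with a short calculation. The sketch below assumes a spherical approximation of about 111,320 m per degree of latitude. The helper name grid_cell_size_m is hypothetical and not part of this module.

```python
import math

# Approximate length of one degree of latitude in meters (spherical assumption).
METERS_PER_DEGREE_LAT = 111_320.0


def grid_cell_size_m(lat_deg, grid_degrees=1e-06):
    """Return the approximate (width, height) in meters of one grid cell."""
    height = grid_degrees * METERS_PER_DEGREE_LAT
    # East-west extent shrinks with the cosine of the latitude.
    width = height * math.cos(math.radians(lat_deg))
    return width, height


width_equator, height_equator = grid_cell_size_m(0.0)
width_60, height_60 = grid_cell_size_m(60.0)
```

With grid_degrees=1e-06, a cell is roughly 0.11 m on each side at the equator, and about half as wide at 60 degrees latitude while keeping the same height.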

openplaces.geo.ids.add_geo_id_index(gdf, name='geo_id', handle_duplicates=True, verbose=False)

Return the GeoDataFrame using geo_id as the index

Parameters:
  • gdf (GeoDataFrame) – Polygon data

  • name (str) – Name of index column

  • handle_duplicates (bool) – If True, adds numeric suffix to duplicate GIDs (default True)

  • verbose (bool) – If True, prints information on duplicates
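The effect of handle_duplicates can be pictured with a small sketch. The exact suffix format used by the library is not documented here, so the separator, the numbering from 1, and the helper name suffix_duplicates are assumptions.

```python
from collections import defaultdict


def suffix_duplicates(ids, sep="-"):
    """Hypothetical sketch: keep the first occurrence, suffix later repeats."""
    seen = defaultdict(int)
    result = []
    for gid in ids:
        count = seen[gid]
        # The first occurrence stays unchanged; repeats get -1, -2, ...
        result.append(gid if count == 0 else f"{gid}{sep}{count}")
        seen[gid] += 1
    return result


unique_ids = suffix_duplicates(["a1f3", "9c0e", "a1f3", "a1f3"])
```

After suffixing, every value is distinct, which is what an index needs.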

openplaces.geo.ids.get_openlocationcodes(gdf: geopandas.GeoDataFrame, name='openlocationcode', codelength=11, handle_duplicates=True)

Assign a location index based on centroid (Open Location Code).

Parameters:
  • gdf (GeoDataFrame) – Vector data with polygon geometries in any CRS.

  • name (str) – Name of index

  • codelength (int) – Open Location Code length in digits

  • handle_duplicates (bool) – If True, adds numeric suffix to duplicate OLCs (default True)
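For orientation, an 11-digit Open Location Code can be derived from a centroid's latitude and longitude as in the standalone sketch below. It follows the published Open Location Code scheme (five base-20 digit pairs plus one digit on a 5 x 4 sub-grid). It is not necessarily how get_openlocationcodes computes its codes, and the name encode_olc11 is hypothetical.

```python
ALPHABET = "23456789CFGHJMPQRVWX"  # the 20 Open Location Code digits


def encode_olc11(lat, lng):
    """Sketch: encode a WGS84 point as an 11-digit Open Location Code."""
    # Clip latitude just below the pole and wrap longitude into [-180, 180).
    lat = min(max(lat, -90.0), 90.0 - 1e-9)
    lng = (lng + 180.0) % 360.0 - 180.0
    # Integer cell counts at 11-digit resolution: pair cells of 1/8000 degree,
    # each split into 5 rows (latitude) and 4 columns (longitude).
    lat_val = int((lat + 90.0) * 8000 * 5)
    lng_val = int((lng + 180.0) * 8000 * 4)
    # 11th digit: position inside the 5 x 4 sub-grid.
    code = ALPHABET[(lat_val % 5) * 4 + (lng_val % 4)]
    lat_val //= 5
    lng_val //= 4
    # Digits 10 down to 1: base-20 pairs, latitude digit first.
    for _ in range(5):
        code = ALPHABET[lat_val % 20] + ALPHABET[lng_val % 20] + code
        lat_val //= 20
        lng_val //= 20
    # The "+" separator goes after the eighth digit.
    return code[:8] + "+" + code[8:]


example_code = encode_olc11(20.3701125, 2.782234375)
```

Points that fall in the same 11-digit cell receive the same code, which is why a duplicate-handling option is useful for an index.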

openplaces.geo.ids.add_openlocationcode_index(gdf, name='openlocationcode', codelength=11, handle_duplicates=True)

Return the GeoDataFrame using the Open Location Code as the index

Parameters:
  • gdf (GeoDataFrame) – Polygon data

  • name (str) – Name of index column

  • codelength (int) – Open Location Code length in digits

  • handle_duplicates (bool) – If True, adds numeric suffix to duplicate OLCs (default True)

openplaces.geo.ids.add_ubid_index(gdf, name='ubid', duplicates='raise')

Return the GeoDataFrame using the UBID as the index

Parameters:
  • gdf (GeoDataFrame) – Polygon data

  • name (str) – Name of index column

  • duplicates (str) – ‘raise’ or ‘drop’. Duplicate UBID indices are not permitted.

openplaces.geo.ids.decode_openlocationcodes(codes: list[str] | pandas.Series) → geopandas.GeoDataFrame

Decode level-11 Open Location Codes to a GeoDataFrame of points.

Returns the center of each OLC cell as a Point geometry.

Parameters:
  • codes (list[str] or pd.Series) – Sequence of OLC strings with codelength=11 (e.g. ‘85G8Q23G+CFM’). If a Series, its index is preserved in the output.

  • name (str) – Name for the geometry column (default ‘geometry’).

Returns:

Points at OLC cell centers, CRS EPSG:4326.

Return type:

GeoDataFrame
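The cell-center calculation can be sketched without any dependencies. The helper below follows the published Open Location Code digit layout and returns a plain (lat, lng) tuple rather than a Point geometry. The name decode_olc11_center is hypothetical and the library's own code may differ.

```python
ALPHABET = "23456789CFGHJMPQRVWX"  # the 20 Open Location Code digits


def decode_olc11_center(code):
    """Sketch: return the (lat, lng) center of an 11-digit OLC cell."""
    digits = code.upper().replace("+", "")
    if len(digits) != 11:
        raise ValueError("expected an 11-digit Open Location Code")
    values = [ALPHABET.index(ch) for ch in digits]
    lat, lng = -90.0, -180.0
    size = 400.0
    # Five base-20 pairs with place values 20, 1, 1/20, 1/400, 1/8000 degrees.
    for i in range(0, 10, 2):
        size /= 20.0
        lat += values[i] * size
        lng += values[i + 1] * size
    # The 11th digit selects one of 5 rows x 4 columns inside the pair cell.
    height, width = size / 5.0, size / 4.0
    row, col = divmod(values[10], 4)
    lat += row * height
    lng += col * width
    # Shift from the south-west corner to the cell center.
    return lat + height / 2.0, lng + width / 2.0


center_lat, center_lng = decode_olc11_center("7FG49QCJ+2VX")
```

When building a point geometry from this result, note that x is the longitude and y is the latitude.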

openplaces.geo.ids.decode_ubids(ubids: pandas.Series, outer: bool = True) → geopandas.GeoDataFrame

Decode a Series of UBIDs to a GeoDataFrame of bounding boxes.

Parameters:
  • ubids (pd.Series) – Series of UBID strings in the format ‘OLC-N-E-S-W’.

  • outer (bool) – If True (default), return the outermost plausible bounding box. If False, return the innermost plausible bounding box (shrinks each side by 0.5 OLC cells to account for rounding).
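A rough sketch of how an ‘OLC-N-E-S-W’ string could map to a bounding box is given below. It assumes an 11-digit centroid code, assumes that N, E, S and W count OLC cells outward from the centroid cell, and reads the inner box as the outer box shrunk by half a cell per side. The helper names olc11_cell and ubid_bbox are hypothetical, and the library's decode_ubids may differ in detail.

```python
ALPHABET = "23456789CFGHJMPQRVWX"  # the 20 Open Location Code digits


def olc11_cell(code):
    """Sketch: return (south, west, height, width) of an 11-digit OLC cell."""
    values = [ALPHABET.index(ch) for ch in code.upper().replace("+", "")]
    if len(values) != 11:
        raise ValueError("expected an 11-digit Open Location Code")
    south, west, size = -90.0, -180.0, 400.0
    for i in range(0, 10, 2):
        size /= 20.0
        south += values[i] * size
        west += values[i + 1] * size
    height, width = size / 5.0, size / 4.0
    row, col = divmod(values[10], 4)
    return south + row * height, west + col * width, height, width


def ubid_bbox(ubid, outer=True):
    """Sketch: return (west, south, east, north) for an 'OLC-N-E-S-W' string."""
    code, *extents = ubid.split("-")
    cells_n, cells_e, cells_s, cells_w = (int(v) for v in extents)
    south, west, height, width = olc11_cell(code)
    # Assumption: the inner box trims half a cell from every side.
    pad = 0.0 if outer else 0.5
    return (
        west - (cells_w - pad) * width,
        south - (cells_s - pad) * height,
        west + width + (cells_e - pad) * width,
        south + height + (cells_n - pad) * height,
    )


outer_box = ubid_bbox("7FG49QCJ+2VX-1-2-3-4")
inner_box = ubid_bbox("7FG49QCJ+2VX-1-2-3-4", outer=False)
```

The tuple order matches the common (minx, miny, maxx, maxy) bounds convention, so it could be passed to a box constructor in a geometry library.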