cloud_geoparquet_ingester

Helper functions for the latlon_tile partition mode in Ingester.

Functions

tile_bounds(→ tuple[float, float, float, float])

Return (minx, miny, maxx, maxy) in EPSG:4326 for a tile_id string.

tile_ids_for_admin(→ list[str])

Return all tile IDs whose bbox overlaps the admin polygon's bounding box.

fetch_latlon_tile_to_cache(→ None)

Fetch a bbox-filtered slice of a cloud GeoParquet dataset and save to cache.

Module Contents

openplaces.io.cloud_geoparquet_ingester.tile_bounds(tile_id: str, tile_size_deg: float) → tuple[float, float, float, float]

Return (minx, miny, maxx, maxy) in EPSG:4326 for a tile_id string.
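The mapping from a tile ID to its bounds can be sketched as follows. This is only an illustration, not the module's implementation: it assumes the ID format shown in the parameter docs ('lat+035_lon-077') and further assumes that the encoded latitude and longitude name the tile's southwest corner in whole degrees. The name sketch_tile_bounds and the regular expression are hypothetical.

```python
import re

# Hypothetical pattern for IDs such as 'lat+035_lon-077':
# a signed, zero-padded whole-degree latitude and longitude.
_TILE_ID_RE = re.compile(r"^lat([+-]\d{3})_lon([+-]\d{3})$")


def sketch_tile_bounds(tile_id: str, tile_size_deg: float) -> tuple[float, float, float, float]:
    """Return (minx, miny, maxx, maxy) in degrees.

    Assumption: the ID encodes the tile's southwest (lower-left) corner.
    """
    match = _TILE_ID_RE.match(tile_id)
    if match is None:
        raise ValueError(f"unrecognized tile_id: {tile_id!r}")
    lat = float(match.group(1))  # miny
    lon = float(match.group(2))  # minx
    return (lon, lat, lon + tile_size_deg, lat + tile_size_deg)
```

Under these assumptions, sketch_tile_bounds('lat+035_lon-077', 1.0) yields (-77.0, 35.0, -76.0, 36.0). Note the x-before-y ordering of the result, even though the ID lists latitude first.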

openplaces.io.cloud_geoparquet_ingester.tile_ids_for_admin(admin_id: str, tile_size_deg: float = 1.0) → list[str]

Return all tile IDs whose bbox overlaps the admin polygon’s bounding box.
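A minimal sketch of the tile enumeration step, under stated assumptions: how admin_id is resolved to a polygon is not documented here, so the hypothetical helper below starts from an already-known bounding box. It also assumes a whole-degree tile size and the same southwest-corner ID convention as the 'lat+035_lon-077' example; the real function may differ on all of these points.

```python
import math


def sketch_tile_ids_for_bbox(
    bbox: tuple[float, float, float, float],
    tile_size_deg: int = 1,
) -> list[str]:
    """List IDs of grid tiles that overlap bbox = (minx, miny, maxx, maxy).

    Assumptions: whole-degree tiles aligned to multiples of tile_size_deg,
    with each ID naming the tile's southwest corner.
    """
    minx, miny, maxx, maxy = bbox
    # Snap each edge down to the index of the grid cell that contains it.
    col_lo = math.floor(minx / tile_size_deg)
    col_hi = math.floor(maxx / tile_size_deg)
    row_lo = math.floor(miny / tile_size_deg)
    row_hi = math.floor(maxy / tile_size_deg)

    tile_ids = []
    for row in range(row_lo, row_hi + 1):
        for col in range(col_lo, col_hi + 1):
            lat = row * tile_size_deg
            lon = col * tile_size_deg
            # '+04d' gives a sign plus three zero-padded digits, e.g. '+035'.
            tile_ids.append(f"lat{lat:+04d}_lon{lon:+04d}")
    return tile_ids
```

Because the match is against the polygon's bounding box rather than the polygon itself, some returned tiles may contain no part of the admin area. In this sketch, a bbox edge that lies exactly on a grid line also pulls in the neighboring tile.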

openplaces.io.cloud_geoparquet_ingester.fetch_latlon_tile_to_cache(download_url: str, tile_id: str, tile_size_deg: float, bbox_column: str, cache_path, s3_anonymous: bool = False, s3_region: str | None = None, redownload: bool = False, verbose: bool = False) → None

Fetch a bbox-filtered slice of a cloud GeoParquet dataset and save to cache.

Parameters:
  • download_url – S3 or HTTPS base path to the Parquet dataset (a directory or a single file).

  • tile_id – Tile identifier string (e.g. ‘lat+035_lon-077’).

  • tile_size_deg – Tile size in degrees.

  • bbox_column – Name of the parquet column containing bbox struct fields xmin, ymin, xmax, ymax.

  • cache_path – Local path to write the cached GeoParquet tile.

  • s3_anonymous – If True, use anonymous S3 credentials for public buckets.

  • s3_region – AWS region hint (e.g. ‘us-west-2’).

  • redownload – If True, overwrite existing cache.

  • verbose – If True, print progress messages.
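Two pieces of logic implied by these parameters can be sketched in plain Python: the overlap test between a row's bbox struct (fields xmin, ymin, xmax, ymax) and the tile bounds, and the cache check governed by redownload. Both helpers are hypothetical illustrations. In particular, the documentation does not say whether the real filter selects rows by overlap or by containment, nor how it is pushed down to the Parquet reader.

```python
from pathlib import Path


def bbox_overlaps(row_bbox: dict, tile: tuple[float, float, float, float]) -> bool:
    """True when a row's bbox struct intersects the tile rectangle.

    row_bbox mimics the struct stored in bbox_column; tile is
    (minx, miny, maxx, maxy). Touching edges count as overlap (assumption).
    """
    minx, miny, maxx, maxy = tile
    return (
        row_bbox["xmin"] <= maxx
        and row_bbox["xmax"] >= minx
        and row_bbox["ymin"] <= maxy
        and row_bbox["ymax"] >= miny
    )


def should_fetch(cache_path, redownload: bool = False) -> bool:
    """Fetch when forced, or when no cached tile exists at cache_path."""
    return redownload or not Path(cache_path).exists()
```

A filter of this shape only needs the four bbox fields, which is why a dedicated bbox column lets a reader skip rows without decoding their geometry.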