parcel

Data ingestion functions specific to parcel data

Functions

drop_problematic_parcels(parcels[, attribute_columns, ...])

Drop parcels that complicate processing without adding much value

Module Contents

openplaces.io.parcel.drop_problematic_parcels(parcels, attribute_columns=None, parcel_id_col='parcel_id_admin3', parcel_n_vertices_threshold=PARCEL_N_VERTICES_THRESHOLD, thinness_threshold=THINNESS_THRESHOLD, parcel_id_blacklisted_endings=PARCEL_ID_BLACKLISTED_ENDINGS, parcel_id_whitelisted_endings=PARCEL_ID_WHITELISTED_ENDINGS, empty_attributes_ignore_columns=EMPTY_ATTRIBUTES_IGNORE_COLUMNS, empty_attributes_fraction_max=EMPTY_ATTRIBUTES_FRACTION_MAX, thinness_test_buffer_degrees=THINNESS_TEST_BUFFER_DEGREES, thinness_test_area_ratio_min=THINNESS_TEST_AREA_RATIO_MIN)

Drop parcels that complicate processing without adding much value

This is an experimental function, designed mainly to remove fill-in parcels (e.g. roads, lakes) that carry little or no useful attribute data.

First, the algorithm identifies parcels that might be unwanted:

  • Parcels with IDs that point to gap fillers, roads, or water (detection rules derived from Wisconsin data)

  • Parcels with a large number of vertices (water features, fill-ins)

  • Parcels with a high perimeter^2-to-area ratio (slivers)
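The perimeter^2-to-area ratio above can be illustrated with a small, dependency-free sketch. This is not the library's implementation: the helper name `perimeter_sq_to_area` and the sample coordinates are hypothetical, and no threshold is applied because the value of `THINNESS_THRESHOLD` is not given here. The ratio is scale-independent, so a compact shape scores low and a sliver scores high regardless of its size.

```python
import math


def perimeter_sq_to_area(vertices):
    """Return perimeter**2 / area for a simple polygon.

    `vertices` is a list of (x, y) tuples describing the ring, without
    repeating the first vertex at the end. Hypothetical helper for
    illustration only.
    """
    n = len(vertices)
    perimeter = 0.0
    twice_area = 0.0
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]  # wrap around to close the ring
        perimeter += math.hypot(x2 - x1, y2 - y1)
        twice_area += x1 * y2 - x2 * y1  # shoelace formula term
    area = abs(twice_area) / 2.0
    if area == 0.0:
        return math.inf  # degenerate polygon: treat as infinitely thin
    return perimeter ** 2 / area


# A unit square (compact) versus a 100 x 0.1 strip (sliver).
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
sliver = [(0, 0), (100, 0), (100, 0.1), (0, 0.1)]

square_ratio = perimeter_sq_to_area(square)
sliver_ratio = perimeter_sq_to_area(sliver)
print(square_ratio, sliver_ratio)
```

A square has the lowest ratio of any rectangle (16), so the strip's much larger value is what would mark it as a sliver candidate.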

Then, the algorithm drops parcels that meet at least two of the following three criteria:

  • The parcel ID seems questionable (empty, blacklisted, or text-only duplicates)

  • The attributes are mostly empty

  • The parcel is only a sliver (buffering it inward produces an empty polygon)
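The drop rule described above can be sketched as a simple vote over three boolean flags. This is a minimal illustration under stated assumptions, not the library's code: the helpers `empty_fraction` and `should_drop`, the sample attribute dictionaries, and the cutoff `ILLUSTRATIVE_FRACTION_MAX = 0.5` are all hypothetical (the actual value of `EMPTY_ATTRIBUTES_FRACTION_MAX` is not stated here), and the ID and sliver flags are supplied by hand rather than computed.

```python
def empty_fraction(attributes, ignore_columns=()):
    """Fraction of attribute values that are empty (None or '').

    Hypothetical helper; columns listed in `ignore_columns` are skipped.
    """
    values = [v for k, v in attributes.items() if k not in ignore_columns]
    if not values:
        return 1.0  # nothing to inspect counts as fully empty
    n_empty = sum(1 for v in values if v is None or v == "")
    return n_empty / len(values)


def should_drop(questionable_id, mostly_empty, is_sliver, min_votes=2):
    """Return True when at least `min_votes` of the three criteria hold."""
    votes = sum([bool(questionable_id), bool(mostly_empty), bool(is_sliver)])
    return votes >= min_votes


# Assumed cutoff for this sketch only, not the library's default.
ILLUSTRATIVE_FRACTION_MAX = 0.5

road = {"owner": None, "land_use": "", "zoning": None}
house = {"owner": "A. Smith", "land_use": "residential", "zoning": None}

road_empty = empty_fraction(road) > ILLUSTRATIVE_FRACTION_MAX
house_empty = empty_fraction(house) > ILLUSTRATIVE_FRACTION_MAX

# Road parcel: questionable ID and mostly empty attributes, not a sliver.
drop_road = should_drop(questionable_id=True, mostly_empty=road_empty, is_sliver=False)
# House parcel: thin shape only, so a single criterion is met.
drop_house = should_drop(questionable_id=False, mostly_empty=house_empty, is_sliver=True)
print(drop_road, drop_house)
```

Requiring two of three criteria means a parcel that merely looks thin, or merely has an unusual ID, is kept as long as it still carries attribute data.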