links
Pipeline steps that create and use relationships between the spine and reference datasets:
- link_to_reference: load a reference and build a spine ↔ reference crosswalk
- infer_spine_additions: add new spine entries inferred from a reference
- resolve_overlaps: remove remaining geometry overlaps from the spine
Functions
| link_to_reference | Load a reference dataset and build a spine ↔ reference crosswalk. |
| infer_spine_additions | Add spine entries inferred from a reference crosswalk. |
| resolve_overlaps | Resolve remaining geometry overlaps in the spine. |
Module Contents
- openplaces.io.harmonizer.links.link_to_reference(state: openplaces.io.harmonizer.HarmonizeState, join: str = 'spatial_overlay', entity_type: str | None = None, recipe_id: str | None = None, thresholds: dict | None = None, remap_id: str | None = None, source_geometry_type: str | None = None, aggregation_function=None, sort_by: str | None = None, list_columns: list[str] | None = None) → openplaces.io.harmonizer.HarmonizeState
Load a reference dataset and build a spine ↔ reference crosswalk.
Populates state.references[recipe_id], state.crosswalks[recipe_id], state.overlays[recipe_id] (for spatial_overlay joins), state.reference_types[recipe_id], and state.source_geometry_types[recipe_id] (when source_geometry_type is provided).
Parameters:
- join (str) – How to join the reference to the spine:
  - 'spatial_overlay': Polygon-on-polygon identity overlay. Produces a crosswalk table with IoU and area-intersection columns. Populates state.overlays[recipe_id] with the full geometry-bearing overlay result for use by later steps.
  - 'spatial_point': Point-in-polygon sjoin. Joins reference points to the spine entities, and joins unlinked points to any polygon reference already in state.references that matches the reference's entity type.
- entity_type (str, optional) – Auto-discover the best ingest recipe of this entity type for the current admin_id. Ignored when recipe_id is given.
- recipe_id (str, optional) – Explicit reference recipe ID. Takes precedence over entity_type.
- thresholds (dict, optional) – Keys depend on the join type.
  For spatial_overlay:
  - min_fraction_of_largest (float, default 1/6): minimum fraction of the largest spine-reference intersection to keep a secondary link.
  - area_intersection_m2_min (float, default 10): minimum intersection area in m² to keep a link.
  For spatial_point:
  - proximity_m (float, default 10): radius for the inner proximity pass.
  - far_proximity_m (float, default 100): radius for the outer proximity pass (same-parcel constraint applied). Set to 0 to disable.
- remap_id (str, optional) – Recipe ID for a column-remap table applied to the reference after loading (see openplaces.io.transform.remap()).
- source_geometry_type (str, optional) – SourceGeometryType value describing what this source represents spatially (e.g. 'single_building_point'). Stored in state.source_geometry_types for use by downstream steps such as classify_footprint_role.
- aggregation_function (None, callable, or dict, optional) – Controls how duplicate geo_id rows in the reference are reduced to one row before joining. None (default) applies the aggregation function from the attribute registry. A dict maps column names to callables; columns absent from the dict fall back to the registry default. Only used for spatial_overlay joins.
- sort_by (str, optional) – Column by which reference rows are sorted in descending order before aggregation. Falls back to geometry area when the column is absent and the reference is a GeoDataFrame. Only used for spatial_overlay joins.
- list_columns (list of str, optional) – Column names for which an extra {col}_list column is added to the aggregated reference, collecting all values per geo_id into a list. Normal scalar aggregation for each column still applies alongside. Only used for spatial_overlay joins.
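The interaction of aggregation_function, sort_by, and list_columns may be easier to follow with a toy version. The sketch below is not the library's implementation: it works on plain dicts rather than a GeoDataFrame, registry_default is a hypothetical stand-in for the attribute-registry default (assumed here to keep the first value), a bare callable is assumed to apply to every column, and the geometry-area fallback for sort_by is not modeled.

```python
from collections import defaultdict


def registry_default(values):
    """Hypothetical stand-in for the attribute-registry default: keep the first value."""
    return values[0]


def aggregate_reference(rows, aggregation_function=None, sort_by=None, list_columns=None):
    """Reduce duplicate geo_id rows to one row each (illustrative only)."""
    list_columns = list_columns or []
    if sort_by is not None:
        # Sort descending so that "first value" picks the row with the largest sort_by.
        rows = sorted(rows, key=lambda row: row[sort_by], reverse=True)

    # Group rows by geo_id, preserving the (possibly sorted) order within each group.
    groups = defaultdict(list)
    for row in rows:
        groups[row["geo_id"]].append(row)

    def reducer_for(column):
        # A dict maps column names to callables; missing columns use the default.
        if isinstance(aggregation_function, dict):
            return aggregation_function.get(column, registry_default)
        # Assumption: a bare callable is applied to every column.
        if callable(aggregation_function):
            return aggregation_function
        return registry_default

    aggregated = []
    for geo_id, members in groups.items():
        merged = {"geo_id": geo_id}
        for column in members[0]:
            if column == "geo_id":
                continue
            values = [member[column] for member in members]
            merged[column] = reducer_for(column)(values)
            if column in list_columns:
                # The list column is added alongside the scalar aggregate.
                merged[f"{column}_list"] = values
        aggregated.append(merged)
    return aggregated


rows = [
    {"geo_id": "a", "height": 3, "use": "res"},
    {"geo_id": "a", "height": 7, "use": "com"},
    {"geo_id": "b", "height": 5, "use": "res"},
]
aggregated = aggregate_reference(
    rows,
    aggregation_function={"height": max},
    sort_by="height",
    list_columns=["use"],
)
for row in aggregated:
    print(row)
```

In this toy run, "use" is absent from the dict, so it falls back to the stand-in default and takes the value from the tallest row of each geo_id, while use_list keeps every value.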
- openplaces.io.harmonizer.links.infer_spine_additions(state: openplaces.io.harmonizer.HarmonizeState, entity_type: str | None = None, recipe_id: str | None = None, thresholds: dict | None = None) → openplaces.io.harmonizer.HarmonizeState
Add spine entries inferred from a reference crosswalk.
For each reference polygon that has no existing spine coverage and exceeds the improvement-value threshold, creates a new spine geometry equal to the reference polygon geometry.
The inferred GeoDataFrame is stored in state.metadata['inferred_from_<recipe_id>'] for use by reconcile_attributes.
Parameters:
- entity_type (str, optional) – Selects all crosswalks matching this type via state.reference_types.
- recipe_id (str, optional) – Explicit crosswalk key to use. Takes precedence over entity_type.
- thresholds (dict, optional) –
  - n_per_group_min (float, default 0.2): minimum mean spine count per purpose group to be eligible for inference.
  - value_per_ha_quantile (float, default 0.05): quantile of improvement_value_per_ha used as the lower inference bound.
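As a rough illustration of the value_per_ha_quantile threshold, the sketch below computes a lower bound from a quantile and keeps only uncovered reference polygons above it. It is a simplified reading and not the library's code. The population over which the quantile is taken is an assumption (all reference rows here), the field names has_spine_coverage and geo_id are hypothetical, linear interpolation is an arbitrary choice of quantile method, and the n_per_group_min check is left out.

```python
def linear_quantile(values, q):
    """Quantile with linear interpolation between the two nearest order statistics."""
    ordered = sorted(values)
    position = (len(ordered) - 1) * q
    lower = int(position)  # floor, since position is non-negative
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


def select_inference_candidates(references, value_per_ha_quantile=0.05):
    """Return (bound, geo_ids) for uncovered references whose value exceeds the bound."""
    # Assumption: the bound is taken over every reference row passed in.
    bound = linear_quantile(
        [ref["improvement_value_per_ha"] for ref in references],
        value_per_ha_quantile,
    )
    candidates = [
        ref["geo_id"]
        for ref in references
        if not ref["has_spine_coverage"] and ref["improvement_value_per_ha"] > bound
    ]
    return bound, candidates


# Toy data; the values and IDs are made up for the example.
references = [
    {"geo_id": "r1", "improvement_value_per_ha": 100.0, "has_spine_coverage": False},
    {"geo_id": "r2", "improvement_value_per_ha": 200.0, "has_spine_coverage": True},
    {"geo_id": "r3", "improvement_value_per_ha": 300.0, "has_spine_coverage": False},
    {"geo_id": "r4", "improvement_value_per_ha": 400.0, "has_spine_coverage": True},
    {"geo_id": "r5", "improvement_value_per_ha": 500.0, "has_spine_coverage": False},
]
bound, candidates = select_inference_candidates(references)
print(bound, candidates)
```

Here r1 has no spine coverage but falls below the bound, so it is not inferred; r2 and r4 are skipped because the spine already covers them.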
- openplaces.io.harmonizer.links.resolve_overlaps(state: openplaces.io.harmonizer.HarmonizeState, **_params) → openplaces.io.harmonizer.HarmonizeState
Resolve remaining geometry overlaps in the spine.
Calls resolve_overlapping_polygons() on state.spine (with keep=False).
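All three steps in this module accept a HarmonizeState plus keyword parameters and return a HarmonizeState, so they can be chained by a simple runner. The sketch below shows only that calling convention. StubState, make_stub, and run_steps are hypothetical names, the stub steps do no geometry work, the recipe ID 'example_parcels' is made up, and the step order simply follows the order in which this page lists them.

```python
from dataclasses import dataclass, field


@dataclass
class StubState:
    """Hypothetical stand-in for HarmonizeState; it only records which steps ran."""
    history: list = field(default_factory=list)


def make_stub(name):
    """Build a placeholder step that records its name and keyword parameters."""
    def step(state, **params):
        state.history.append((name, params))
        return state
    return step


# Placeholders named after the real steps; they are not the library functions.
link_to_reference = make_stub("link_to_reference")
infer_spine_additions = make_stub("infer_spine_additions")
resolve_overlaps = make_stub("resolve_overlaps")


def run_steps(state, steps):
    """Apply each (step, params) pair in order, threading the state through."""
    for step, params in steps:
        state = step(state, **params)
    return state


final_state = run_steps(
    StubState(),
    [
        (link_to_reference, {"join": "spatial_overlay", "recipe_id": "example_parcels"}),
        (infer_spine_additions, {"recipe_id": "example_parcels"}),
        (resolve_overlaps, {}),
    ],
)
print([name for name, _ in final_state.history])
```

With the real functions in place of the stubs, the same (step, params) pairs would carry the documented keyword arguments, such as thresholds or entity_type.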