spine

Pipeline steps for building and refining the primary entity spine:
  • resolve_spine: merge multiple source GeoDataFrames via IoU dedup

  • split_by_reference: split spine geometries at reference boundaries [stub]

Functions

get_oriented_dims(→ tuple[float, float, float])

Return (angle_deg % 180, length, width) of the minimum bounding rectangle.

project_onto_axis(→ tuple[float, float])

Return (min, max) projection of polygon coordinates onto (cos_a, sin_a).

drop_elongated_duplicates(→ geopandas.GeoDataFrame)

Remove candidates from to_add that are elongated duplicates of spine geometries.

resolve_spine(→ openplaces.io.harmonizer.HarmonizeState)

Build the primary entity spine from multiple prioritized sources.

split_by_reference(...)

Split spine geometries at reference polygon boundaries [stub].

Module Contents

openplaces.io.harmonizer.spine.get_oriented_dims(geom) → tuple[float, float, float]

Return (angle_deg % 180, length, width) of the minimum bounding rectangle.
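As a rough illustration of the returned triple, the sketch below derives it from the four corners of a bounding rectangle. The helper name oriented_dims_from_corners and the plain-tuple input are assumptions made for this example; how get_oriented_dims itself obtains the rectangle from geom is not documented here.

```python
import math


def oriented_dims_from_corners(corners):
    """Return (angle_deg % 180, length, width) for a rectangle.

    Hypothetical helper: `corners` is a list of four (x, y) tuples in ring
    order, i.e. the corners of an already computed bounding rectangle.
    """
    (x0, y0), (x1, y1), (x2, y2) = corners[0], corners[1], corners[2]
    edge_a = (x1 - x0, y1 - y0)  # first side
    edge_b = (x2 - x1, y2 - y1)  # adjacent side
    len_a = math.hypot(*edge_a)
    len_b = math.hypot(*edge_b)
    # The longer side defines the long axis; the shorter one is the width.
    if len_a >= len_b:
        long_edge, length, width = edge_a, len_a, len_b
    else:
        long_edge, length, width = edge_b, len_b, len_a
    # Folding the angle into [0, 180) makes the axis direction-agnostic.
    angle = math.degrees(math.atan2(long_edge[1], long_edge[0])) % 180.0
    return angle, length, width


flat = oriented_dims_from_corners([(0, 0), (4, 0), (4, 1), (0, 1)])
upright = oriented_dims_from_corners([(0, 0), (0, 4), (-1, 4), (-1, 0)])
```

Taking the angle modulo 180 means a rectangle traversed in the opposite direction reports the same orientation, which is what an axis comparison needs.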

openplaces.io.harmonizer.spine.project_onto_axis(geom, cos_a: float, sin_a: float) → tuple[float, float]

Return (min, max) projection of polygon coordinates onto (cos_a, sin_a).

Works for both the long-axis direction (to compute projection interval) and the perpendicular direction (to compute lateral centroid offset). No rotation is applied — the projection is a direct dot product.
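The dot-product projection described above can be sketched on a plain list of vertices. The name project_coords and the list-of-tuples input are assumptions for illustration; the documented function takes a geometry object instead.

```python
def project_coords(coords, cos_a, sin_a):
    """Project (x, y) vertices onto the unit vector (cos_a, sin_a).

    Hypothetical stand-in that works on coordinate tuples rather than on a
    geometry object. Returns the (min, max) of the dot products.
    """
    dots = [x * cos_a + y * sin_a for x, y in coords]
    return min(dots), max(dots)


rect = [(0, 0), (2, 0), (2, 1), (0, 1)]

# Long-axis direction (angle 0): the interval spans the rectangle's length.
along = project_coords(rect, 1.0, 0.0)   # (0.0, 2.0)

# Perpendicular direction: the same call with the axis rotated by 90 degrees.
across = project_coords(rect, 0.0, 1.0)  # (0.0, 1.0)
```

Because only the direction vector changes between the two calls, one function can serve both the overlap test and the lateral-offset test.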

openplaces.io.harmonizer.spine.drop_elongated_duplicates(spine: geopandas.GeoDataFrame, to_add: geopandas.GeoDataFrame, aspect_ratio_min: float = 2.5, angle_tol_deg: float = 15.0, long_overlap_min: float = 0.5, lateral_sep_ratio_max: float = 2.0) → geopandas.GeoDataFrame

Remove candidates from to_add that are elongated duplicates of spine geometries.

Catches displaced thin-rectangle buildings (mobile homes, trailers) that escape IoU-based deduplication because the positional offset is perpendicular to the long axis rather than along it. The two footprints do not need to overlap — a close lateral neighbour with a matching long axis is enough.

Two polygons are declared duplicates when all of the following hold:

  • Both have minimum-bounding-rectangle aspect ratio ≥ aspect_ratio_min

  • Long-axis orientations agree to within angle_tol_deg degrees (mod 180°)

  • 1-D projections onto the shared long axis overlap by ≥ long_overlap_min (fraction of the shorter polygon’s projected length)

  • Lateral (perpendicular) distance between centroid projections is < lateral_sep_ratio_max × average width
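The four conditions above can be sketched as a single pairwise predicate. Everything here is illustrative: the Oriented container, the helper names, the use of the first polygon's long axis as the "shared" axis, and the vertex-mean centroid are assumptions, not the module's actual implementation.

```python
import math
from dataclasses import dataclass


@dataclass
class Oriented:
    """Hypothetical container for one footprint's precomputed values."""
    coords: list       # exterior vertices as (x, y) tuples
    angle_deg: float   # long-axis angle, already folded into [0, 180)
    length: float      # long side of the bounding rectangle
    width: float       # short side of the bounding rectangle


def _project(coords, cos_a, sin_a):
    dots = [x * cos_a + y * sin_a for x, y in coords]
    return min(dots), max(dots)


def _vertex_mean(coords):
    # Assumption: the vertex mean stands in for the true centroid.
    n = len(coords)
    return sum(x for x, _ in coords) / n, sum(y for _, y in coords) / n


def is_elongated_duplicate(a, b, aspect_ratio_min=2.5, angle_tol_deg=15.0,
                           long_overlap_min=0.5, lateral_sep_ratio_max=2.0):
    if a.width <= 0 or b.width <= 0:
        return False
    # 1. Both footprints must be elongated.
    if min(a.length / a.width, b.length / b.width) < aspect_ratio_min:
        return False
    # 2. Long axes must agree, comparing angles modulo 180 degrees.
    diff = abs(a.angle_deg - b.angle_deg) % 180.0
    if min(diff, 180.0 - diff) > angle_tol_deg:
        return False
    # 3. Projections onto the long axis (assumed: a's axis) must overlap.
    theta = math.radians(a.angle_deg)
    cos_a, sin_a = math.cos(theta), math.sin(theta)
    a_lo, a_hi = _project(a.coords, cos_a, sin_a)
    b_lo, b_hi = _project(b.coords, cos_a, sin_a)
    shorter = min(a_hi - a_lo, b_hi - b_lo)
    overlap = min(a_hi, b_hi) - max(a_lo, b_lo)
    if shorter <= 0 or overlap / shorter < long_overlap_min:
        return False
    # 4. Centroids must be close along the perpendicular direction.
    (ax, ay), (bx, by) = _vertex_mean(a.coords), _vertex_mean(b.coords)
    lateral = abs((bx - ax) * -sin_a + (by - ay) * cos_a)
    return lateral < lateral_sep_ratio_max * (a.width + b.width) / 2.0


base = Oriented([(0, 0), (10, 0), (10, 3), (0, 3)], 0.0, 10.0, 3.0)
shifted_sideways = Oriented([(0, 4), (10, 4), (10, 7), (0, 7)], 0.0, 10.0, 3.0)
shifted_along = Oriented([(20, 0), (30, 0), (30, 3), (20, 3)], 0.0, 10.0, 3.0)
```

In this toy data, shifted_sideways does not intersect base at all, so its IoU is zero, yet it satisfies all four conditions. shifted_along fails the long-axis overlap test and is kept as a distinct footprint.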

Parameters:
  • aspect_ratio_min (float) – Minimum OBB aspect ratio (length / width) to classify as elongated.

  • angle_tol_deg (float) – Maximum long-axis angle difference (degrees, mod 180°) to be aligned.

  • long_overlap_min (float) – Minimum fraction of the shorter polygon’s length covered by the long-axis projection overlap.

  • lateral_sep_ratio_max (float) – Maximum lateral separation as a multiple of the average polygon width.

openplaces.io.harmonizer.spine.resolve_spine(state: openplaces.io.harmonizer.HarmonizeState, sources: list[dict] | None = None, thresholds: dict | None = None) → openplaces.io.harmonizer.HarmonizeState

Build the primary entity spine from multiple prioritized sources.

Merges GeoDataFrames from each entry in sources in order. A geometry from a lower-priority source is added to the spine only if its IoU with every existing spine geometry is below overlap_iou_max (default 0.02).
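The priority merge can be illustrated with a simplified stand-in that uses axis-aligned boxes (x0, y0, x1, y1) instead of polygon geometries. The function names are hypothetical, and comparing each candidate only against geometries from higher-priority sources is an assumption about the real step.

```python
def box_area(box):
    x0, y0, x1, y1 = box
    return max(0.0, x1 - x0) * max(0.0, y1 - y0)


def box_iou(a, b):
    """Intersection over union of two axis-aligned boxes."""
    inter = box_area((max(a[0], b[0]), max(a[1], b[1]),
                      min(a[2], b[2]), min(a[3], b[3])))
    union = box_area(a) + box_area(b) - inter
    return inter / union if union > 0 else 0.0


def merge_by_priority(sources, overlap_iou_max=0.02, min_area_m2=0.0):
    """Greedy merge: `sources` is a list of box lists, highest priority first."""
    spine = []
    for boxes in sources:
        # Area filter runs before IoU deduplication.
        candidates = [b for b in boxes if box_area(b) >= min_area_m2]
        # Keep a candidate only if it stays below the IoU limit against
        # every geometry already in the spine.
        accepted = [c for c in candidates
                    if all(box_iou(c, kept) < overlap_iou_max for kept in spine)]
        spine.extend(accepted)
    return spine


high = [(0, 0, 10, 10)]
low = [(1, 1, 11, 11), (20, 20, 30, 30)]
merged = merge_by_priority([high, low])
```

Here the first low-priority box overlaps the existing one heavily and is dropped, while the second is disjoint and is appended to the spine.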

The sources list may include a sentinel entry, auto_discover: true, which is replaced at runtime by all ingest recipes of the same entity type that are scoped to child admin_ids of the recipe’s admin_id.

Parameters:
  • sources (list of dict, optional) – Ordered source entries, highest priority first. Each entry is either {'recipe_id': str, 'label': str} or {'auto_discover': True}.

  • thresholds (dict, optional) – Recognized keys:

      • overlap_iou_max (float, default 0.02) – IoU limit for deduplication. A candidate whose IoU with any existing spine geometry reaches this value is treated as an overlapping duplicate and not added.

      • min_area_m2 (float, default 0.0) – Drop footprints smaller than this area (in m²) before IoU deduplication. 0.0 disables the filter.

      • elongated_aspect_min (float) – When set, enables the elongated-duplicate filter via drop_elongated_duplicates().

      • elongated_angle_tol, elongated_long_overlap_min, elongated_lateral_sep_ratio – Also accepted alongside elongated_aspect_min.
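A call might be configured along the following lines. The recipe ID and label are made-up placeholders, and the values chosen for the elongated_* keys simply mirror the defaults of drop_elongated_duplicates(), on the assumption that each key maps onto the similarly named parameter.

```python
# Ordered from highest to lowest priority.
sources = [
    {"recipe_id": "example_footprints", "label": "example"},  # hypothetical recipe
    {"auto_discover": True},  # sentinel, expanded to child-admin recipes at runtime
]

thresholds = {
    "overlap_iou_max": 0.02,            # documented default
    "min_area_m2": 10.0,                # illustrative value; 0.0 disables the filter
    "elongated_aspect_min": 2.5,        # setting this enables the elongated filter
    "elongated_angle_tol": 15.0,        # assumed to map to angle_tol_deg
    "elongated_long_overlap_min": 0.5,  # assumed to map to long_overlap_min
    "elongated_lateral_sep_ratio": 2.0, # assumed to map to lateral_sep_ratio_max
}

# With an existing HarmonizeState in `state`, the step would then be invoked as:
# state = resolve_spine(state, sources=sources, thresholds=thresholds)
```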

openplaces.io.harmonizer.spine.split_by_reference(state: openplaces.io.harmonizer.HarmonizeState, entity_type: str | None = None, recipe_id: str | None = None, thresholds: dict | None = None) → openplaces.io.harmonizer.HarmonizeState

Split spine geometries at reference polygon boundaries [stub].

Splits contiguous spine geometries (e.g., large building footprints that span multiple parcels) at reference polygon boundaries to reflect differences in age, ownership, or use across the merged footprint.

Intended use: urban rowhouses / townhouses where a single contiguous footprint covers multiple independently owned units that differ in renovation history, assessed value, etc.

Parameters:
  • entity_type (str, optional) – Entity type of the reference dataset (e.g. 'parcel').

  • recipe_id (str, optional) – Explicit reference recipe ID (takes precedence over entity_type).

  • thresholds (dict, optional) – Step-specific thresholds (e.g. min_area_m2).

Raises:

NotImplementedError – Always — this step is not yet implemented.