attributes
Pipeline steps that attach attributes from reference datasets to the spine:
- reconcile_attributes: aggregate columns from established crosswalks
- infer_attributes: compute derived columns (area, value ratios, etc.)
Functions
| reverse_occ_units | Re-classify a summed unit count to the nearest occupancy_type label. |
| reconcile_attributes | Aggregate reference attributes to the spine via established crosswalks. |
| classify_footprint_role | Classify spine footprints as 'primary', 'secondary', or 'unknown'. |
| infer_attributes | Compute derived columns on the spine. |
| infer_occupancy_type | Infer a coarse occupancy category from NSI, parcel, and footprint geometry. |
Module Contents
- openplaces.io.harmonizer.attributes.reverse_occ_units(total_units: float) → str
Re-classify a summed unit count to the nearest occupancy_type label.
Mirrors the map_to_units logic from Lochhead et al. (2026). Used when multiple NSI points link to the same footprint and their unit counts must be aggregated and re-classified.
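The re-classification step can be illustrated with a minimal sketch. The bin edges and labels below are an assumption borrowed from the HAZUS/NSI residential classes (RES1, RES3A to RES3F); the actual map_to_units thresholds used by the package are not shown in this page and may differ. reclassify_units is a hypothetical stand-in, not the library function.

```python
# Hypothetical bins: (inclusive upper bound on units, label).
# Assumption: labels follow the HAZUS/NSI residential classes.
UNIT_BINS = [
    (1, "RES1"),    # single-family
    (2, "RES3A"),   # duplex
    (4, "RES3B"),   # 3-4 units
    (9, "RES3C"),   # 5-9 units
    (19, "RES3D"),  # 10-19 units
    (49, "RES3E"),  # 20-49 units
]
TOP_LABEL = "RES3F"  # 50 or more units


def reclassify_units(total_units: float) -> str:
    """Return the label of the first bin whose upper bound covers total_units."""
    for upper, label in UNIT_BINS:
        if total_units <= upper:
            return label
    return TOP_LABEL


# Two NSI points (a duplex and a 3-unit building) linked to one footprint:
# their unit counts are summed first, then the sum is re-classified.
linked_units = [2, 3]
label = reclassify_units(sum(linked_units))
```

Summing before classifying matters: classifying each point separately would yield two small-building labels rather than one label for the combined footprint.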
- openplaces.io.harmonizer.attributes.reconcile_attributes(state: openplaces.io.harmonizer.HarmonizeState, sources: list[dict] | None = None, priority: dict[str, list[str]] | None = None) → openplaces.io.harmonizer.HarmonizeState
Aggregate reference attributes to the spine via established crosswalks.
For each source in sources, looks up the crosswalk in state.crosswalks (resolved via recipe_id or entity_type) and aggregates the requested columns to the spine.
- Parameters:
sources (list of dict) –
Each dict describes one reference source and may contain:
  - recipe_id (str, optional): Explicit crosswalk key in state.crosswalks.
  - entity_type (str, optional): Selects all matching crosswalks via state.reference_types; used when recipe_id is absent.
  - columns (list of str, optional): Columns to aggregate. Defaults to all available columns from the corresponding default column list.
priority (dict of {feature: [source_suffix, ...]}, optional) –
Between-source priority for specific features (Lochhead et al. 2026, Step C). Each key is a bare feature name (e.g. 'year_built'); the value is an ordered list of source suffixes (without the leading _) to try in order. The first non-null suffixed column wins.
Example:
  priority:
    purpose_subgroup: [nsi, parcel]
    year_built: [parcel, nsi]
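The "first non-null suffixed column wins" rule can be sketched on a single record. This is an illustration only: apply_priority is a hypothetical helper, the record is a plain dict rather than a spine row, and None stands in for whatever null representation the spine actually uses.

```python
def apply_priority(row: dict, priority: dict[str, list[str]]) -> dict:
    """Return a copy of row with each prioritized feature resolved.

    For a feature 'year_built' and suffixes ['parcel', 'nsi'], the columns
    'year_built_parcel' and 'year_built_nsi' are tried in that order and the
    first non-null value is written to the bare 'year_built' key.
    """
    out = dict(row)
    for feature, suffixes in priority.items():
        out[feature] = next(
            (
                row[f"{feature}_{suffix}"]
                for suffix in suffixes
                if row.get(f"{feature}_{suffix}") is not None
            ),
            None,  # every candidate column was missing or null
        )
    return out


priority = {
    "purpose_subgroup": ["nsi", "parcel"],
    "year_built": ["parcel", "nsi"],
}
row = {
    "purpose_subgroup_nsi": None,       # NSI has no value, so parcel is used
    "purpose_subgroup_parcel": "RES1",
    "year_built_parcel": 1987,          # parcel is listed first and wins
    "year_built_nsi": 1990,
}
resolved = apply_priority(row, priority)
```

The suffixed source columns are left in place in this sketch, so the provenance of each resolved value stays inspectable.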
- openplaces.io.harmonizer.attributes.classify_footprint_role(state: openplaces.io.harmonizer.HarmonizeState, entity_type: str | None = None, thresholds: dict | None = None, **_params) → openplaces.io.harmonizer.HarmonizeState
Classify spine footprints as 'primary', 'secondary', or 'unknown'.
Uses dwelling-point and building-point evidence to assign roles within each parcel (Lochhead et al. 2026, Table 4):
  - If any footprint on the parcel has dwelling-point evidence (SourceGeometryType.single_dwelling_point), those footprints are 'primary'; all others on the same parcel are 'secondary'.
  - Else if any footprint has single-building-point evidence (SourceGeometryType.single_building_point, e.g. NSI), those are 'primary'; all others are 'secondary'.
  - If no footprint on a multi-footprint parcel has evidence, all are 'secondary'.
  - Footprints that are the sole geometry on their parcel are always 'primary'.
  - Footprints not linked to any parcel are 'unknown', unless they carry dwelling-point evidence, in which case they are promoted to 'primary'.
- Parameters:
entity_type (str, optional) – Entity type used to locate the parcel crosswalk in state.crosswalks. Defaults to 'parcel'.
thresholds (dict, optional) – Not currently used; retained for recipe compatibility.
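The role rules listed for classify_footprint_role can be sketched in plain Python. This is not the library implementation: classify_roles and the record keys ('id', 'parcel_id', 'dwelling', 'building') are hypothetical, and the two boolean flags stand in for the SourceGeometryType evidence that the real function reads from the spine.

```python
from collections import defaultdict


def classify_roles(footprints: list[dict]) -> dict:
    """Map footprint id to 'primary', 'secondary', or 'unknown'."""
    roles = {}
    by_parcel = defaultdict(list)

    for fp in footprints:
        if fp["parcel_id"] is None:
            # Unlinked footprints: promoted only by dwelling-point evidence.
            roles[fp["id"]] = "primary" if fp["dwelling"] else "unknown"
        else:
            by_parcel[fp["parcel_id"]].append(fp)

    for group in by_parcel.values():
        if len(group) == 1:
            # Sole geometry on its parcel is always primary.
            roles[group[0]["id"]] = "primary"
            continue
        # Dwelling-point evidence takes precedence over building-point evidence.
        if any(fp["dwelling"] for fp in group):
            key = "dwelling"
        elif any(fp["building"] for fp in group):
            key = "building"
        else:
            key = None  # no evidence on a multi-footprint parcel
        for fp in group:
            is_primary = key is not None and fp[key]
            roles[fp["id"]] = "primary" if is_primary else "secondary"
    return roles


footprints = [
    {"id": "a", "parcel_id": "p1", "dwelling": True, "building": False},
    {"id": "b", "parcel_id": "p1", "dwelling": False, "building": True},
    {"id": "c", "parcel_id": "p2", "dwelling": False, "building": False},
    {"id": "d", "parcel_id": "p3", "dwelling": False, "building": False},
    {"id": "e", "parcel_id": "p3", "dwelling": False, "building": False},
    {"id": "f", "parcel_id": None, "dwelling": False, "building": True},
    {"id": "g", "parcel_id": None, "dwelling": True, "building": False},
]
roles = classify_roles(footprints)
```

On parcel p1 the footprint with only building-point evidence ends up 'secondary', because dwelling-point evidence elsewhere on the same parcel outranks it.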
- openplaces.io.harmonizer.attributes.infer_attributes(state: openplaces.io.harmonizer.HarmonizeState, derived: list[str] | None = None, **_params) → openplaces.io.harmonizer.HarmonizeState
Compute derived columns on the spine.
- Parameters:
derived (list of str, optional) –
Names of derived columns to compute. Supported values:
  - 'area' / 'm2': Footprint area in square metres (stored as 'm2').
  - 'value_per_sqft' / 'value_per_area': improvement_value{suffix} / m2 and structure_value{suffix} / m2.
  - 'openplaces_group_combined': Combined group label reconciling polygon and point reference sources.
  - 'n_dwelling_units': Fill null n_dwelling_units values from occupancy-class mapping when a purpose_subgroup column is present on the spine.
When derived is None or empty, all of the above are attempted.
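The value-ratio columns amount to dividing each suffixed value column by the footprint area in 'm2'. A minimal sketch follows, with several assumptions: add_value_ratios and value_per_area are hypothetical helpers, the output column names ('..._per_m2{suffix}') are invented for illustration, and returning None for a missing or non-positive area is one possible guard, not necessarily what the package does.

```python
from typing import Optional


def value_per_area(value: Optional[float], area_m2: Optional[float]) -> Optional[float]:
    """Divide value by area, returning None when either input is unusable."""
    if value is None or area_m2 is None or area_m2 <= 0:
        return None
    return value / area_m2


def add_value_ratios(row: dict, suffix: str = "_parcel") -> dict:
    """Return a copy of row with per-square-metre ratios for both value columns."""
    out = dict(row)
    for base in ("improvement_value", "structure_value"):
        # Hypothetical output name; the real derived column names are not documented here.
        out[f"{base}_per_m2{suffix}"] = value_per_area(
            row.get(f"{base}{suffix}"), row.get("m2")
        )
    return out


row = {
    "m2": 200.0,
    "improvement_value_parcel": 300000.0,
    "structure_value_parcel": None,  # missing value propagates as None
}
ratios = add_value_ratios(row)
```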
- openplaces.io.harmonizer.attributes.infer_occupancy_type(state: openplaces.io.harmonizer.HarmonizeState, thresholds: dict | None = None, **_params) → openplaces.io.harmonizer.HarmonizeState
Infer a coarse occupancy category from NSI, parcel, and footprint geometry.
Populates occupancy_type (categorical) on the spine using a three-step cascade:
  1. occupancy_type_building_nsi: NSI occupancy label mapped to a coarse class (Single-Family, Multi-Family, Mobile Home).
  2. Footprint geometry: elongated small footprints are flagged as Mobile Home when NSI and parcel evidence are absent.
  3. n_dwelling_units: fills remaining residential gaps (n == 1 → Single-Family, n ≥ 2 → Multi-Family).
- Parameters:
thresholds (dict, optional) –
  - mobile_home_aspect_min (float, default 2.5): minimum oriented-bounding-box aspect ratio (length/width) for a footprint to count as elongated.
  - mobile_home_area_max_m2 (float, default 185): maximum footprint area (m²) for the mobile-home geometry signal (about 2,000 sqft).
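The cascade and its two thresholds can be sketched per footprint. This is a sketch under stated assumptions, not the package code: infer_occupancy and its argument names are hypothetical, the threshold comparisons are assumed to be inclusive, and the NSI label is assumed to be already mapped to its coarse class.

```python
from typing import Optional

# Defaults taken from the documented thresholds.
DEFAULT_THRESHOLDS = {"mobile_home_aspect_min": 2.5, "mobile_home_area_max_m2": 185.0}


def infer_occupancy(
    nsi_class: Optional[str],
    has_parcel_evidence: bool,
    aspect_ratio: Optional[float],
    area_m2: Optional[float],
    n_dwelling_units: Optional[int],
    thresholds: Optional[dict] = None,
) -> Optional[str]:
    """Apply the three steps in order and return the first class that fires."""
    t = {**DEFAULT_THRESHOLDS, **(thresholds or {})}

    # Step 1: an NSI-derived coarse class wins outright.
    if nsi_class is not None:
        return nsi_class

    # Step 2: geometry signal, only when NSI and parcel evidence are absent.
    if (
        not has_parcel_evidence
        and aspect_ratio is not None
        and area_m2 is not None
        and aspect_ratio >= t["mobile_home_aspect_min"]
        and area_m2 <= t["mobile_home_area_max_m2"]
    ):
        return "Mobile Home"

    # Step 3: fall back to the dwelling-unit count.
    if n_dwelling_units is not None:
        if n_dwelling_units == 1:
            return "Single-Family"
        if n_dwelling_units >= 2:
            return "Multi-Family"
    return None  # gap remains unfilled


# A long, narrow 100 m² footprint with no NSI or parcel evidence.
result = infer_occupancy(None, False, 3.0, 100.0, None)
```

Passing a partial thresholds dict, for example {"mobile_home_aspect_min": 4.0}, overrides only that key in this sketch; the other default is kept.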