admin

Administration

Worldwide administrative referencing and mapping

  • Manage global admin files

  • Manage globally unique identifiers (admin_ids)

Functions

get_admin1_iso()

Get dataframe with country ISO alpha codes and names

admin1_id_index_from_admin1_id_a3(gdf)

Give dataframe gdf an admin1_id index from admin1_id_a3

get_admin2_iso()

Get dataframe with state/province ISO 3166-2 codes and names

admin2_id_index_from_admin2_gadm(admin2)

Give dataframe admin2 an admin2_id index based on GADM data

clean_geographic_name(name)

Comprehensive cleaning for admin4 geographic names.

generate_admin_ids(df[, new_admin_id_col, ...])

Generate unique two-letter admin unit codes within parent units.

update_admin_spine(level, admin_recipe_id, test[, silent])

Update the openplaces admin spine with admin recipe info

Module Contents

openplaces.io.admin.get_admin1_iso()

Get dataframe with country ISO alpha codes and names

openplaces.io.admin.admin1_id_index_from_admin1_id_a3(gdf)

Give dataframe gdf an admin1_id index from admin1_id_a3

Single-use function to create linkage between GADM and ISO

openplaces.io.admin.get_admin2_iso()

Get dataframe with state/province ISO 3166-2 codes and names

openplaces.io.admin.admin2_id_index_from_admin2_gadm(admin2)

Give dataframe admin2 an admin2_id index based on GADM data

openplaces.io.admin.clean_geographic_name(name)

Comprehensive cleaning for admin4 geographic names. Returns: (clean_text, digits, letter_suffix, generic_word)
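To illustrate the shape of the returned tuple, here is a minimal sketch of splitting a name into the four components. It is not the module's implementation: the function name `split_name`, the `GENERIC_WORDS` list, the "N.A." placeholder handling, and the regular expressions are all assumptions made for illustration.

```python
import re

# Hypothetical list of generic words; the module's actual list is not shown here.
GENERIC_WORDS = ("ward", "zone", "barangay", "district")


def split_name(name):
    """Split a name into (clean_text, digits, letter_suffix, generic_word).

    Illustrative sketch only, not the behavior of clean_geographic_name.
    """
    # Assumption: "N.A." is a placeholder that carries no name information.
    text = re.sub(r"\bn\.a\.", " ", name, flags=re.IGNORECASE)

    # Find a number optionally followed by a single letter, e.g. "12" or "3B".
    digits, letter_suffix = "", ""
    match = re.search(r"(\d+)([A-Za-z])?\b", text)
    if match:
        digits = match.group(1)
        letter_suffix = (match.group(2) or "").upper()
        text = text[:match.start()] + " " + text[match.end():]

    # Detect the first recognized generic word and keep the remaining words.
    generic_word = ""
    kept = []
    for word in re.findall(r"[A-Za-z]+", text):
        if not generic_word and word.lower() in GENERIC_WORDS:
            generic_word = word.lower()
        else:
            kept.append(word)

    return " ".join(kept), digits, letter_suffix, generic_word
```

Under these assumptions, `split_name("Ward 3B")` gives `("", "3", "B", "ward")` and `split_name("N.A. (12)")` gives `("", "12", "", "")`.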

openplaces.io.admin.generate_admin_ids(df, new_admin_id_col='admin4_id', parent_admin_id_col='admin3_id', name_col='name', id_separator='-', verbose=False)

Generate unique two-letter admin unit codes within parent units.

The design is level-agnostic: it works for any parent-child relationship, such as admin2 -> admin3 (state -> county) or admin3 -> admin4 (county -> town).

Strategy

Each name is first cleaned into structured components: a text portion, digit portion, letter suffix, and detected generic word. IDs are then assigned through a waterfall of prioritized strategies. Each row moves to the next strategy only if it remains unassigned:

  1. Pure numeric — If the name reduces to only digits with no text (e.g., “N.A. (12)”), use the number directly.

  2. Generic word + number — If a recognized generic word (ward, zone, barangay, district, etc.) is detected alongside a number, prefix the number with the generic word’s initial(s). A letter suffix is appended if present (e.g., “Ward 3B” → “W3B”).

  3. Name + number for duplicates — If the same base name appears multiple times under the same parent and a digit is present, disambiguate by combining the name’s initial(s) with the number (and any letter suffix).

  4. Initials from multi-word names — For names with two or more words, take the first letter of the first two words (e.g., “North East” → “NE”).

  5. First two letters — Take the first two characters of the cleaned name, assigned only where unique within the parent.

  6. Any two letters — Try all pairwise letter combinations from the cleaned name until a unique code is found.

  7. Letter + number combinations — Combine any letter from the name with any digit from the name; fall back to “X” + digit if no letters exist.

  8. Swapping — If a desired two-letter code is taken by another row, check whether that row can be reassigned to an alternative code, freeing up the preferred code for the current row.

  9. Three-letter codes — Try the first three letters, then all three-letter combinations from the name.

  10. Sequential fallback — Assign codes like X01, X02, … (with a letter disambiguator if needed) to any rows that all prior strategies failed to place.

After assignment, all IDs are verified to be non-null and globally unique; an exception is raised if either condition is violated.
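The waterfall idea can be sketched in a few lines. The example below is a simplified illustration for the children of a single parent, covering only strategies 4, 5, 6, and 10 plus the final check. It is not the module's code: the function names, the first-come tie-breaking, and the exact fallback numbering are assumptions.

```python
from itertools import combinations


def initials(name):
    """Strategy 4: first letters of the first two words."""
    words = name.split()
    return [words[0][0] + words[1][0]] if len(words) >= 2 else []


def first_two(name):
    """Strategy 5: first two letters of the name."""
    letters = [c for c in name if c.isalpha()]
    return ["".join(letters[:2])] if len(letters) >= 2 else []


def any_two(name):
    """Strategy 6: every ordered pair of letters taken from the name."""
    letters = [c for c in name if c.isalpha()]
    return [a + b for a, b in combinations(letters, 2)]


STRATEGIES = [initials, first_two, any_two]


def assign_codes(names):
    """Assign one unique code per name among the children of one parent."""
    codes = [None] * len(names)
    used = set()
    # Run each strategy over all rows; a row is revisited by the next
    # strategy only if it is still unassigned.
    for strategy in STRATEGIES:
        for i, name in enumerate(names):
            if codes[i] is not None:
                continue
            for candidate in strategy(name.upper()):
                if candidate not in used:
                    codes[i] = candidate
                    used.add(candidate)
                    break
    # Strategy 10: sequential fallback for rows nothing else could place.
    counter = 0
    for i in range(len(names)):
        if codes[i] is None:
            counter += 1
            while f"X{counter:02d}" in used:
                counter += 1
            codes[i] = f"X{counter:02d}"
            used.add(codes[i])
    # Final check: every code must be present and unique.
    if any(c is None for c in codes) or len(set(codes)) != len(codes):
        raise ValueError("could not generate unique codes for all rows")
    return codes
```

For example, `assign_codes(["North East", "Newark", "Nelson", "Ne"])` returns `["NE", "NW", "NL", "X01"]`: the multi-word name claims "NE" first, the next two names fall through to other letter pairs, and the last name has no free pair left, so it receives a sequential code.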

Parameters:
  • df (pd.DataFrame) – Input dataframe with administrative unit data

  • new_admin_id_col (str) – Name for the new administrative ID column (default ‘admin4_id’)

  • parent_admin_id_col (str) – Column name containing parent admin ID (e.g., ‘admin3_id’)

  • name_col (str) – Column name containing subdivision name

  • name_long_col (str, optional) – Column name containing long-form name. Example: in the US, this might include ‘city’ and ‘township’ suffixes that resolve ambiguities between entity names. If None or the column doesn’t exist, city/township detection is skipped.

  • id_separator (str) – Separator to use in IDs (default ‘-’)

  • verbose (bool) – If True, prints statistics and other outputs

Returns:
  DataFrame indexed by new_admin_id_col with diagnostics column

Return type:
  pd.DataFrame

Raises:
  ValueError – If unable to generate unique IDs for all rows

openplaces.io.admin.update_admin_spine(level, admin_recipe_id, test, silent=False)

Update the openplaces admin spine with admin recipe info

Parameters:
  • level (int) – Administrative level of the spine to update

  • admin_recipe_id (str) – ID of the admin recipe to update the spine with. This function assumes the recipe is already ingested.

  • test (bool) – If True, writes to ‘{file}_test.csv’ instead of the original

  • silent (bool) – If True, silences printouts when new admin IDs are added.
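As a small illustration of the `test` flag, the sketch below derives a ‘{file}_test.csv’ path from an original CSV path. The helper name `spine_output_path` and the example path are hypothetical. Treating `{file}` as the file name without its extension is also an assumption.

```python
from pathlib import Path


def spine_output_path(path, test):
    """Return the path to write to: a '_test.csv' sibling when test is True."""
    path = Path(path)
    # Assumption: '{file}' is the original file name without its extension.
    return path.with_name(f"{path.stem}_test.csv") if test else path
```

With this helper, `spine_output_path("spine/admin4.csv", True)` yields `spine/admin4_test.csv`, so a test run leaves the original file untouched.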