tabulation
Stacked horizontal bar charts for cross-tabulated data.
Functions
| tabulate | Cross-tabulate a numeric variable by two categorical columns. |
| plot_tabulation | Stacked horizontal bar chart of v ~ f(y_cat, x_cat). |
Module Contents
- openplaces.viz.tabulation.tabulate(df, y_cat, x_cat, v='n', y_max_n=None, x_max_n=None, y_cat_order=None, x_cat_order=None, show_empty_category=True)
Cross-tabulate a numeric variable by two categorical columns.
- Parameters:
df (pd.DataFrame or gpd.GeoDataFrame)
y_cat (str) – Column for y-axis categories.
x_cat (str) – Column for x-axis categories (the stacked dimension).
v (str) – Numeric column to aggregate; 'n' counts rows.
y_max_n (int, optional) – Keep only the top-N y categories by total weight; the remainder is collapsed into '(all others)'.
x_max_n (int, optional) – Same for x categories.
y_cat_order (list, optional) – Explicit ordering for y-axis values.
x_cat_order (list, optional) – Explicit ordering for x-axis values (stack order).
show_empty_category (bool) – If True, fill NaN labels with '(N/A)' instead of dropping them.
- Returns:
Normalized crosstab (values sum to 1), shape (y_cats, x_cats).
- Return type:
pd.DataFrame
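The behaviour described above can be approximated with plain pandas. The sketch below is not the library's implementation: the helper `collapse_top_n`, the sample columns `district` and `kind`, and the order of the steps (fill NaN first, then collapse) are assumptions made for illustration. It also assumes that "values sum to 1" refers to the whole table rather than to each row.

```python
import pandas as pd


def collapse_top_n(labels: pd.Series, weights: pd.Series, max_n: int) -> pd.Series:
    """Keep the max_n heaviest labels; relabel the rest '(all others)'."""
    totals = weights.groupby(labels).sum().sort_values(ascending=False)
    keep = set(totals.index[:max_n])
    return labels.where(labels.isin(keep), "(all others)")


# Hypothetical sample data: one row per place.
df = pd.DataFrame({
    "district": ["north", "north", "north", "south", "south", "east", None],
    "kind": ["cafe", "shop", "cafe", "cafe", "shop", "cafe", "shop"],
})

# Mimic show_empty_category=True: NaN labels become '(N/A)'.
y = df["district"].fillna("(N/A)")
x = df["kind"].fillna("(N/A)")

# Mimic v='n' (count rows) with y_max_n=2: every row has weight 1.
row_weight = pd.Series(1, index=df.index)
y = collapse_top_n(y, row_weight, max_n=2)

# Cross-tabulate counts and normalize so that all cells sum to 1.
table = pd.crosstab(y, x, normalize="all")
print(table)
```

With the real function, the corresponding call would presumably be `tabulate(df, 'district', 'kind', y_max_n=2)`; the exact row and column ordering of its result may differ from this sketch.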
- openplaces.viz.tabulation.plot_tabulation(df, y_cat, x_cat, v='n', title=None, y_max_n=None, x_max_n=None, y_cat_order=None, x_cat_order=None, show_empty_category=True, y_lab_maxlength=30, x_lab_maxlength=30, gap_perc=0.01, cmap='tab20b', alpha=0.8, savefig=None, figsize=(7, 5), legend_kwds=None, colors=None, fontsize=9, titlesize=12)
Stacked horizontal bar chart of v ~ f(y_cat, x_cat).
Bar heights are proportional to group totals; bar widths show the x_cat breakdown within each y_cat group.
- Parameters:
df (pd.DataFrame or gpd.GeoDataFrame)
y_cat (str) – Column for y-axis groups (one bar per value).
x_cat (str) – Column for the stacked dimension (legend entries).
v (str) – Numeric column to aggregate; 'n' counts rows.
title (str, optional) – Plot title. Defaults to '% of <v> by <y_cat>'.
y_max_n (int, optional) – Cap the number of categories shown per axis.
x_max_n (int, optional) – Cap the number of categories shown per axis.
y_cat_order (list, optional) – Explicit category orderings.
x_cat_order (list, optional) – Explicit category orderings.
show_empty_category (bool) – Show NaN values as '(N/A)'.
y_lab_maxlength (int) – Truncate labels longer than this.
x_lab_maxlength (int) – Truncate labels longer than this.
gap_perc (float) – Gap between bars as a fraction of total data weight.
cmap (str) – Matplotlib colormap name, used when colors is None.
alpha (float) – Bar opacity.
savefig (str or Path, optional) – Save to this path if provided.
figsize (tuple, optional) – Figure size (width, height).
legend_kwds (dict, optional) – Keyword arguments passed to ax.legend(). Defaults to {'loc': 'upper right', 'bbox_to_anchor': (0.985, 0.985)}. Any key overrides the default; omitted keys keep their default value.
colors (list, optional) – Explicit list of colors, one per x_cat value.
fontsize (int) – Font sizes for labels and title.
titlesize (int) – Font sizes for labels and title.
- Returns:
fig (matplotlib.figure.Figure)
ax (matplotlib.axes.Axes)
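The layout described above (bar thickness proportional to the group total, segment widths showing the breakdown within the group, and `legend_kwds` merged over the defaults) can be sketched directly in Matplotlib. This is an illustrative reconstruction, not the library's code: the sample `table`, the variable names, and the choice to normalize segment widths within each group are assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical normalized crosstab: rows = y_cat groups, columns = x_cat values.
table = pd.DataFrame(
    {"cafe": [0.45, 0.10], "shop": [0.15, 0.30]},
    index=["north", "south"],
)

gap = 0.01  # plays the role of gap_perc

# Caller-supplied keys override the documented defaults; other keys are kept.
default_legend = {"loc": "upper right", "bbox_to_anchor": (0.985, 0.985)}
legend_kwds = {"loc": "lower right"}  # example override
merged_legend = {**default_legend, **legend_kwds}

cmap = plt.get_cmap("tab20b")
colors = [cmap(i) for i in range(table.shape[1])]

fig, ax = plt.subplots(figsize=(7, 5))
bottom = 0.0
tick_positions = []
for row_number, (y_label, row) in enumerate(table.iterrows()):
    height = row.sum()  # bar thickness = this group's share of the total
    left = 0.0
    for color, (x_label, value) in zip(colors, row.items()):
        width = value / height  # share of this x category within the group
        ax.barh(
            bottom, width, height=height, left=left, align="edge",
            color=color, alpha=0.8,
            # Label only the first row so each x category appears once in the legend.
            label=x_label if row_number == 0 else "_nolegend_",
        )
        left += width
    tick_positions.append(bottom + height / 2)
    bottom += height + gap

ax.set_yticks(tick_positions)
ax.set_yticklabels(table.index)
ax.set_xlim(0, 1)
ax.legend(**merged_legend)
```

With the real function, a comparable figure would presumably come from `fig, ax = plot_tabulation(df, y_cat, x_cat, legend_kwds={'loc': 'lower right'})`, with `df`, `y_cat` and `x_cat` standing in for the caller's data.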