Summary

DataFrameComparison.summary([...])

Generate a summary of all aspects of the comparison.

class diffly.summary.Summary(
comparison: DataFrameComparison,
show_perfect_column_matches: bool,
top_k_column_changes: int,
sample_k_rows_only: int,
show_sample_primary_key_per_change: bool,
left_name: str,
right_name: str,
slim: bool,
hidden_columns: list[str] | None,
metrics: Mapping[str, Metric] | None,
)

Container object for generating a summary of the comparison of two data frames.

Note

Do not initialize this object directly. Instead, use DataFrameComparison.summary().

Summary.format([pretty])

Format this summary for printing.

Summary.to_json(**kwargs)

Serialize this summary as a JSON string.

Metrics

The metrics argument of DataFrameComparison.summary() accepts a mapping from display label to Metric callable. The diffly.metrics module ships the presets listed below.

diffly.metrics.Metric

A metric is a callable mapping (left_expr, right_expr) to a scalar aggregation expression.

The expressions refer to the left-side and right-side values of a single column across all joined rows.

alias of Callable[[Expr, Expr], Expr]

mean(left, right)

Mean of right - left.

median(left, right)

Median of right - left.

min(left, right)

Minimum of right - left.

max(left, right)

Maximum of right - left.

std(left, right)

Standard deviation of right - left.

mean_absolute_deviation(left, right)

Mean of |right - left|.

mean_relative_deviation(left, right)

Mean of |(right - left) / left|.
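
To make the difference between the signed and the absolute presets concrete, the following plain-Python sketch evaluates three of the formulas above on a small pair of columns. It works on lists rather than expressions and only illustrates the arithmetic; it is not diffly's implementation, and the sample values and variable names are made up for this example.

```python
from statistics import fmean

# Hypothetical left/right values of one column across four joined rows.
left = [10.0, 20.0, 40.0, 50.0]
right = [12.0, 18.0, 44.0, 46.0]

# Signed per-row differences right - left: [2.0, -2.0, 4.0, -4.0]
diffs = [r - l for l, r in zip(left, right)]

# mean: positive and negative changes cancel out.
mean_diff = fmean(diffs)  # 0.0

# mean_absolute_deviation: magnitudes only, so nothing cancels.
mad = fmean([abs(d) for d in diffs])  # 3.0

# mean_relative_deviation: each change is scaled by its left-side value.
mrd = fmean([abs(d / l) for d, l in zip(diffs, left)])  # approximately 0.12
```

A signed mean of zero can therefore hide real changes, which is why it may be useful to pair it with one of the absolute metrics.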

quantile(q)

Factory returning a metric that computes the q-quantile of right - left.
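
The factory pattern described here can be sketched in plain Python as a function that captures q and returns a two-argument callable. This is a list-based analogue under stated assumptions, not the library's code: `ListMetric` and `quantile_of_difference` are hypothetical names, the real Metric operates on Expr objects, and the interpolation rule used below (linear) is an assumption because this reference does not state one.

```python
from typing import Callable, Sequence

# Plain-list stand-in for the expression-based Metric alias (assumption
# made for this sketch only).
ListMetric = Callable[[Sequence[float], Sequence[float]], float]


def quantile_of_difference(q: float) -> ListMetric:
    """Return a metric computing the q-quantile of right - left."""
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must lie in [0, 1]")

    def metric(left: Sequence[float], right: Sequence[float]) -> float:
        diffs = sorted(r - l for l, r in zip(left, right))
        if not diffs:
            raise ValueError("cannot compute a quantile of no rows")
        # Linear interpolation between the two nearest ranks (assumed rule).
        pos = q * (len(diffs) - 1)
        lo = int(pos)
        hi = min(lo + 1, len(diffs) - 1)
        return diffs[lo] + (pos - lo) * (diffs[hi] - diffs[lo])

    return metric


# Each call to the factory yields an independent metric.
p50 = quantile_of_difference(0.5)
p90 = quantile_of_difference(0.9)

left = [1.0, 2.0, 3.0, 4.0, 5.0]
right = [1.0, 3.0, 5.0, 7.0, 9.0]  # differences: 0, 1, 2, 3, 4

p50_value = p50(left, right)  # 2.0
p90_value = p90(left, right)  # approximately 3.6
```

Because the factory returns an ordinary metric, its result can be placed in the metrics mapping under any display label, for example a "p90" key mapped to quantile(0.9).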