Metric Pivot Functionality

Transform long-format evaluation results to wide-format tables

The MetricEvaluator.pivot_by_group() and MetricEvaluator.pivot_by_model() methods transform long-format evaluation results into wide-format tables, making them suitable for reporting, analysis, and visualization. This guide demonstrates the pivot functionality across different scenarios.

Overview

The pivot methods support:

  • Multiple scope types: GLOBAL, MODEL, GROUP, DEFAULT
  • Two pivot orientations: group-based rows vs. model-based rows
  • Subgroup compatibility: works with single and multiple subgroup variables
  • Automatic column ordering: Index -> Global -> Group -> Default
  • Custom ordering controls: column_order_by toggles metric-first vs. estimate-first layouts
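To make the automatic column ordering concrete, here is a minimal sketch in plain Python. It is not the library's implementation: SCOPE_PRIORITY and order_columns are hypothetical names, and the scope tags attached to each column are assumptions made for illustration.

```python
# Hypothetical sketch of the Index -> Global -> Group -> Default ordering.
# SCOPE_PRIORITY and order_columns are illustrative names, not library API.
SCOPE_PRIORITY = {"index": 0, "global": 1, "group": 2, "default": 3}


def order_columns(columns):
    """Sort (name, scope) pairs by scope priority.

    sorted() is stable, so columns that share a scope keep their input order.
    """
    ranked = sorted(columns, key=lambda col: SCOPE_PRIORITY[col[1]])
    return [name for name, _scope in ranked]


# Column labels taken from the examples in this guide, deliberately shuffled.
columns = [
    ("MAE", "default"),
    ("Number of Subjects", "group"),
    ("treatment", "index"),
    ("Total Enrolled Subjects", "global"),
    ("RMSE", "default"),
]
ordered = order_columns(columns)
print(ordered)
```

The index column comes first, followed by the global, group, and default-scope metrics, which is the left-to-right layout seen in the wide tables in this guide.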

Basic Setup

import polars as pl
from polars_eval_metrics import MetricDefine, MetricEvaluator, pivot_to_gt
from data_generator import generate_sample_data

# Create sample dataset with comprehensive structure for pivot examples
df = generate_sample_data(
    n_subjects=12,
    n_visits=3,
    n_groups=2,
)
df = df.with_columns(
    pl.col("age_group").cast(pl.Enum(["Young", "Middle", "Senior"]))
)

Case 1: Pivot by Group (without subgroups)

When pivoting by group, each group combination becomes one row, and each estimate and metric pair becomes one column.

# Define mixed scope metrics for comprehensive pivot demonstration
mixed_metrics = [
    MetricDefine(name="n_subject", label="Total Enrolled Subjects", scope="global"),
    MetricDefine(name="n_subject", label="Number of Subjects", scope="group"),
    MetricDefine(name="mae", label="MAE"),
    MetricDefine(name="rmse", label="RMSE")
]

# Create evaluator with group_by
evaluator_group = MetricEvaluator(
    df=df,
    metrics=mixed_metrics,
    ground_truth="actual",
    estimates=["model1", "model2"],
    group_by=["treatment"]
)

# Get long format results
evaluator_group.evaluate()

shape: (11, 7)
estimate  metric       label                      value  metric_type      scope     treatment
enum      enum         enum                       str    str              str       str
null      "n_subject"  "Total Enrolled Subjects"  "12"   "across_sample"  "global"  null
null      "n_subject"  "Number of Subjects"       "6"    "across_sample"  "group"   "A"
"model1"  "mae"        "MAE"                      "1.2"  "across_sample"  null      "A"
"model2"  "mae"        "MAE"                      "1.7"  "across_sample"  null      "A"
"model1"  "rmse"       "RMSE"                     "1.6"  "across_sample"  null      "A"
…         …            …                          …      …                …         …
null      "n_subject"  "Number of Subjects"       "6"    "across_sample"  "group"   "B"
"model1"  "mae"        "MAE"                      "1.2"  "across_sample"  null      "B"
"model2"  "mae"        "MAE"                      "1.4"  "across_sample"  null      "B"
"model1"  "rmse"       "RMSE"                     "1.3"  "across_sample"  null      "B"
"model2"  "rmse"       "RMSE"                     "1.7"  "across_sample"  null      "B"

# Pivot to wide format with group combinations as rows
evaluator_group.pivot_by_group()

shape: (2, 7)
treatment  Total Enrolled Subjects  Number of Subjects  {"model1","MAE"}  {"model2","MAE"}  {"model1","RMSE"}  {"model2","RMSE"}
str        str                      str                 str               str               str                str
"A"        "12"                     "6"                 "1.2"             "1.7"             "1.6"              "2.0"
"B"        "12"                     "6"                 "1.2"             "1.4"             "1.3"              "1.7"

evaluator_group.pivot_by_group().pipe(pivot_to_gt)

                                                        MAE             RMSE
treatment  Total Enrolled Subjects  Number of Subjects  model1  model2  model1  model2
A          12                       6                   1.2     1.7     1.6     2.0
B          12                       6                   1.2     1.4     1.3     1.7
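The long-to-wide reshaping in Case 1 can be sketched in plain Python. This is only an illustration of the idea, not how MetricEvaluator implements it: pivot_long_to_wide is a hypothetical helper, and the rows are the MAE and RMSE values copied from the Case 1 tables.

```python
from collections import defaultdict

# (estimate, metric label, treatment, value) rows copied from the Case 1 tables.
long_rows = [
    ("model1", "MAE", "A", "1.2"),
    ("model2", "MAE", "A", "1.7"),
    ("model1", "RMSE", "A", "1.6"),
    ("model2", "RMSE", "A", "2.0"),
    ("model1", "MAE", "B", "1.2"),
    ("model2", "MAE", "B", "1.4"),
    ("model1", "RMSE", "B", "1.3"),
    ("model2", "RMSE", "B", "1.7"),
]


def pivot_long_to_wide(rows):
    """Hypothetical helper: one wide row per group, keyed by (estimate, metric)."""
    wide = defaultdict(dict)
    for estimate, metric, group, value in rows:
        # Each long row fills exactly one cell of the wide table.
        wide[group][(estimate, metric)] = value
    return dict(wide)


wide = pivot_long_to_wide(long_rows)
for group, cells in wide.items():
    print(group, cells)
```

Eight long rows collapse into two wide rows (one per treatment), each holding four estimate and metric cells.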

Controlling column order

Use column_order_by to choose whether metrics or estimates form the outer level of the resulting column hierarchy.

# Make estimates the outer column level, metrics nested inside
evaluator_group.pivot_by_group(column_order_by="estimates")

shape: (2, 7)
treatment  Total Enrolled Subjects  Number of Subjects  {"model1","MAE"}  {"model1","RMSE"}  {"model2","MAE"}  {"model2","RMSE"}
str        str                      str                 str               str                str               str
"A"        "12"                     "6"                 "1.2"             "1.6"              "1.7"             "2.0"
"B"        "12"                     "6"                 "1.2"             "1.3"              "1.4"             "1.7"

evaluator_group.pivot_by_group(column_order_by="estimates").pipe(pivot_to_gt)

                                                        MAE             RMSE
treatment  Total Enrolled Subjects  Number of Subjects  model1  model2  model1  model2
A          12                       6                   1.2     1.7     1.6     2.0
B          12                       6                   1.2     1.4     1.3     1.7
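The two column layouts differ only in which level varies slowest. The sketch below shows that in plain Python; column_keys is a hypothetical helper, and the option name "metrics" for the default layout is an assumption, since only "estimates" appears in the example above.

```python
estimates = ["model1", "model2"]
metrics = ["MAE", "RMSE"]


def column_keys(order_by):
    """Hypothetical helper: (estimate, metric) keys in the requested nesting order."""
    if order_by == "metrics":
        # Metric is the outer level: every estimate for MAE, then every estimate for RMSE.
        return [(e, m) for m in metrics for e in estimates]
    if order_by == "estimates":
        # Estimate is the outer level: every metric for model1, then every metric for model2.
        return [(e, m) for e in estimates for m in metrics]
    raise ValueError(f"unknown order_by: {order_by!r}")


metric_first = column_keys("metrics")      # "metrics" is an assumed option name
estimate_first = column_keys("estimates")
print(metric_first)
print(estimate_first)
```

The first list reproduces the column order of the default pivot_by_group() output in Case 1, and the second reproduces the order shown for column_order_by="estimates".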

Case 2: Pivot by Group (with subgroups)

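Adding subgroup_by runs a separate analysis for each subgroup variable. In the wide format, every level of every subgroup variable gets its own row per group, identified by the subgroup_name and subgroup_value columns.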

# Same mixed scope metrics with subgroups
evaluator_group_sub = MetricEvaluator(
    df=df,
    metrics=mixed_metrics,
    ground_truth="actual",
    estimates={"model1": "Model 1", "model2": "Model 2"},
    group_by={"treatment": "Treatment"},
    subgroup_by={"age_group": "Age Group", "region": "Region"}
)

evaluator_group_sub.pivot_by_group()

shape: (10, 9)
Treatment  subgroup_name  subgroup_value  Total Enrolled Subjects  Number of Subjects  {"Model 1","MAE"}  {"Model 2","MAE"}  {"Model 1","RMSE"}  {"Model 2","RMSE"}
str        str            enum            str                      str                 str                str                str                 str
"A"        "Age Group"    "Young"         "4"                      "2"                 "1.2"              "1.7"              "1.6"               "2.0"
"A"        "Age Group"    "Middle"        "4"                      "2"                 "1.3"              "1.7"              "1.6"               "2.0"
"A"        "Age Group"    "Senior"        "4"                      "2"                 "1.3"              "1.7"              "1.6"               "2.0"
"A"        "Region"       "East"          "3"                      "3"                 "2.3"              "1.7"              "2.3"               "2.0"
"A"        "Region"       "North"         "3"                      "3"                 "0.3"              "1.7"              "0.4"               "2.0"
"B"        "Age Group"    "Young"         "4"                      "2"                 "1.2"              "1.4"              "1.3"               "1.7"
"B"        "Age Group"    "Middle"        "4"                      "2"                 "1.2"              "1.3"              "1.3"               "1.6"
"B"        "Age Group"    "Senior"        "4"                      "2"                 "1.2"              "1.4"              "1.3"               "1.7"
"B"        "Region"       "South"         "3"                      "3"                 "1.0"              "1.3"              "1.1"               "1.7"
"B"        "Region"       "West"          "3"                      "3"                 "1.4"              "1.4"              "1.4"               "1.7"

Order by group

evaluator_group_sub.pivot_by_group(row_order_by="group").pipe(pivot_to_gt)

                                                                MAE               RMSE
        Treatment  Total Enrolled Subjects  Number of Subjects  Model 1  Model 2  Model 1  Model 2
Age Group
Young   A          4                        2                   1.2      1.7      1.6      2.0
Middle  A          4                        2                   1.3      1.7      1.6      2.0
Senior  A          4                        2                   1.3      1.7      1.6      2.0
Young   B          4                        2                   1.2      1.4      1.3      1.7
Middle  B          4                        2                   1.2      1.3      1.3      1.6
Senior  B          4                        2                   1.2      1.4      1.3      1.7
Region
East    A          3                        3                   2.3      1.7      2.3      2.0
North   A          3                        3                   0.3      1.7      0.4      2.0
South   B          3                        3                   1.0      1.3      1.1      1.7
West    B          3                        3                   1.4      1.4      1.4      1.7

Order by subgroup

evaluator_group_sub.pivot_by_group(row_order_by="subgroup").pipe(pivot_to_gt)

                                                                MAE               RMSE
        Treatment  Total Enrolled Subjects  Number of Subjects  Model 1  Model 2  Model 1  Model 2
Age Group
Young   A          4                        2                   1.2      1.7      1.6      2.0
Young   B          4                        2                   1.2      1.4      1.3      1.7
Middle  A          4                        2                   1.3      1.7      1.6      2.0
Middle  B          4                        2                   1.2      1.3      1.3      1.6
Senior  A          4                        2                   1.3      1.7      1.6      2.0
Senior  B          4                        2                   1.2      1.4      1.3      1.7
Region
East    A          3                        3                   2.3      1.7      2.3      2.0
North   A          3                        3                   0.3      1.7      0.4      2.0
South   B          3                        3                   1.0      1.3      1.1      1.7
West    B          3                        3                   1.4      1.4      1.4      1.7
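The difference between the two row orders comes down to which key is compared first. A minimal sketch in plain Python, using only the Age Group rows: sort_rows is a hypothetical helper, and the real method may order rows differently internally.

```python
# Same level order as the Enum cast in Basic Setup.
AGE_ORDER = ["Young", "Middle", "Senior"]

# (treatment, age_group) pairs in a deliberately scrambled order.
rows = [(t, a) for a in ("Senior", "Young", "Middle") for t in ("B", "A")]


def sort_rows(rows, row_order_by):
    """Hypothetical helper mirroring the two row_order_by options shown above."""
    rank = AGE_ORDER.index  # position of a level in the Enum order
    if row_order_by == "group":
        # Treatment first, then subgroup level within each treatment.
        return sorted(rows, key=lambda r: (r[0], rank(r[1])))
    if row_order_by == "subgroup":
        # Subgroup level first, then treatment within each level.
        return sorted(rows, key=lambda r: (rank(r[1]), r[0]))
    raise ValueError(f"unknown row_order_by: {row_order_by!r}")


by_group = sort_rows(rows, "group")
by_subgroup = sort_rows(rows, "subgroup")
print(by_group)
print(by_subgroup)
```

Ranking by position in AGE_ORDER rather than alphabetically is what keeps Young, Middle, Senior in their natural order, which is the reason for the Enum cast in Basic Setup.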

Case 3: Pivot by Model (without subgroups)

# Use same evaluator as Case 1 for model comparison
evaluator_model = evaluator_group

# Pivot to wide format for model comparison
evaluator_model.pivot_by_model()

shape: (2, 9)
estimate  Total Enrolled Subjects  {"A","Number of Subjects"}  {"B","Number of Subjects"}  {"A","MAE"}  {"A","RMSE"}  {"B","MAE"}  {"B","RMSE"}  estimate_label
str       str                      str                         str                         str          str           str          str           str
"model1"  "12"                     "6"                         "6"                         "1.2"        "1.6"         "1.2"        "1.3"         "model1"
"model2"  "12"                     "6"                         "6"                         "1.7"        "2.0"         "1.4"        "1.7"         "model2"

evaluator_model.pivot_by_model().pipe(pivot_to_gt)

                                   Number of Subjects  MAE       RMSE
estimate  Total Enrolled Subjects  A         B         A    B    A    B    estimate_label
model1    12                       6         6         1.2  1.2  1.6  1.3  model1
model2    12                       6         6         1.7  1.4  2.0  1.7  model2

Case 4: Pivot by Model (with subgroups)

# Use same evaluator as Case 2 for model comparison with subgroups
evaluator_model_sub = evaluator_group_sub

# Pivot to wide format for model comparison with subgroups
evaluator_model_sub.pivot_by_model()

shape: (14, 11)
estimate  subgroup_name  subgroup_value  Total Enrolled Subjects  {"A","Number of Subjects"}  {"B","Number of Subjects"}  {"A","MAE"}  {"A","RMSE"}  {"B","MAE"}  {"B","RMSE"}  estimate_label
str       str            enum            str                      str                         str                         str          str           str          str           str
"model1"  "Age Group"    "Young"         "4"                      "2"                         "2"                         "1.2"        "1.6"         "1.2"        "1.3"         "Model 1"
"model2"  "Age Group"    "Young"         "4"                      "2"                         "2"                         "1.7"        "2.0"         "1.4"        "1.7"         "Model 2"
"model1"  "Age Group"    "Middle"        "4"                      "2"                         "2"                         "1.3"        "1.6"         "1.2"        "1.3"         "Model 1"
"model2"  "Age Group"    "Middle"        "4"                      "2"                         "2"                         "1.7"        "2.0"         "1.3"        "1.6"         "Model 2"
"model1"  "Age Group"    "Senior"        "4"                      "2"                         "2"                         "1.3"        "1.6"         "1.2"        "1.3"         "Model 1"
…         …              …               …                        …                           …                           …            …             …            …             …
"model2"  "Region"       "North"         "3"                      "3"                         null                        "1.7"        "2.0"         null         null          "Model 2"
"model1"  "Region"       "South"         "3"                      null                        "3"                         null         null          "1.0"        "1.1"         "Model 1"
"model2"  "Region"       "South"         "3"                      null                        "3"                         null         null          "1.3"        "1.7"         "Model 2"
"model1"  "Region"       "West"          "3"                      null                        "3"                         null         null          "1.4"        "1.4"         "Model 1"
"model2"  "Region"       "West"          "3"                      null                        "3"                         null         null          "1.4"        "1.7"         "Model 2"

evaluator_model_sub.pivot_by_model().pipe(pivot_to_gt)

                                           Number of Subjects  MAE         RMSE
        estimate  Total Enrolled Subjects  A         B         A     B     A     B     estimate_label
Age Group
Young   model1    4                        2         2         1.2   1.2   1.6   1.3   Model 1
Young   model2    4                        2         2         1.7   1.4   2.0   1.7   Model 2
Middle  model1    4                        2         2         1.3   1.2   1.6   1.3   Model 1
Middle  model2    4                        2         2         1.7   1.3   2.0   1.6   Model 2
Senior  model1    4                        2         2         1.3   1.2   1.6   1.3   Model 1
Senior  model2    4                        2         2         1.7   1.4   2.0   1.7   Model 2
Region
East    model1    3                        3         None      2.3   None  2.3   None  Model 1
East    model2    3                        3         None      1.7   None  2.0   None  Model 2
North   model1    3                        3         None      0.3   None  0.4   None  Model 1
North   model2    3                        3         None      1.7   None  2.0   None  Model 2
South   model1    3                        None      3         None  1.0   None  1.1   Model 1
South   model2    3                        None      3         None  1.3   None  1.7   Model 2
West    model1    3                        None      3         None  1.4   None  1.4   Model 1
West    model2    3                        None      3         None  1.4   None  1.7   Model 2
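The None cells in the Region block appear because each region occurs in only one treatment arm in this sample data (East and North under A, South and West under B). The sketch below reproduces that effect in plain Python for the MAE column only. pivot_by_model_sketch is a hypothetical helper, not the library's code.

```python
# (estimate, region, treatment, MAE) rows copied from the Case 4 tables.
long_rows = [
    ("model1", "East", "A", "2.3"),
    ("model2", "East", "A", "1.7"),
    ("model1", "South", "B", "1.0"),
    ("model2", "South", "B", "1.3"),
]
treatments = ["A", "B"]


def pivot_by_model_sketch(rows):
    """Hypothetical helper: one wide row per (region, estimate), one column per treatment."""
    wide = {}
    for estimate, region, treatment, value in rows:
        # Every wide row starts with None in each treatment column;
        # only combinations present in the long data are overwritten.
        row = wide.setdefault((region, estimate), {t: None for t in treatments})
        row[treatment] = value
    return wide


wide = pivot_by_model_sketch(long_rows)
for key, cells in wide.items():
    print(key, cells)
```

A region and treatment combination that never occurs in the long data has nothing to place in its cell, so it stays None in the wide table.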

Key Points

The pivot functionality transforms long-format results to wide-format tables:

  • pivot_by_group(): Creates rows for each group combination, columns for metrics/models
  • pivot_by_model(): Creates rows for each model, columns for group x metric combinations
  • Subgroups: Add subgroup_by to stratify analysis by demographic variables
  • Automatic column ordering: Index -> Global -> Group -> Default scope columns
  • Configurable column nesting: Use column_order_by to switch between metric-first or estimate-first layouts

Summary

The pivot functionality provides two complementary views of evaluation results, along with supporting features:

  • pivot_by_group(): Groups as rows, model x metric combinations as columns - ideal for comparing metrics across groups
  • pivot_by_model(): Models as rows, group x metric combinations as columns - ideal for comparing models
  • Mixed scope support: Handles Global, Group, and Default scopes automatically
  • Subgroup stratification: Separate analysis for each demographic subgroup
  • Clean output: Column names follow Polars default conventions, for example {"model1","MAE"} for an estimate and metric pair

The resulting wide tables are well suited to summary tables, analysis reports, and data prepared for visualization or export.