Calibrate threshold from Parquet scores by streaming batches
Source: R/calibrate_threshold.R
Sweeps candidate thresholds over scores stored in a Parquet dataset without loading all rows into memory. Uses two passes: the first determines the score range on the labeled subset; the second accumulates confusion counts across a fixed grid of thresholds. Returns the best threshold according to the chosen metric.
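The two-pass idea can be sketched in base R. This is an illustration only, not the package's implementation: in-memory chunks stand in for Arrow scan batches, the simulated scores and labels are hypothetical, and the decision rule `score >= threshold` is an assumption, since the page does not say which side of the threshold counts as positive.

```r
# Illustrative sketch: in-memory chunks stand in for Arrow record batches.
set.seed(1)
n <- 1000
label <- rbinom(n, 1, 0.3)                                # hypothetical 0/1 labels
score <- ifelse(label == 1, rnorm(n, 1), rnorm(n, -1))    # hypothetical scores
batches <- split(data.frame(score, label), ceiling(seq_len(n) / 250))

# Pass 1: running min/max of the score over the labeled rows.
lo <- Inf
hi <- -Inf
for (b in batches) {
  lo <- min(lo, b$score)
  hi <- max(hi, b$score)
}

# Fixed grid of candidate thresholds spanning the observed range.
grid <- seq(lo, hi, length.out = 101)

# Pass 2: accumulate true/false positive counts per threshold, batch by batch.
tp <- numeric(length(grid))
fp <- numeric(length(grid))
n_pos <- 0
for (b in batches) {
  pred <- outer(b$score, grid, ">=")          # rows x thresholds (assumed rule)
  tp <- tp + colSums(pred & (b$label == 1))
  fp <- fp + colSums(pred & (b$label == 0))
  n_pos <- n_pos + sum(b$label == 1)
}
fn <- n_pos - tp

# F1 = 2 * TP / (2 * TP + FP + FN); keep the maximising threshold.
f1 <- 2 * tp / (2 * tp + fp + fn)
best <- grid[which.max(f1)]
```

Only the per-threshold counters persist between batches, so memory use depends on the grid size and the batch size rather than on the number of rows.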
Usage
calibrate_threshold(
  scores_parquet,
  score_col,
  labels_parquet,
  metric = c("f1", "precision_at_recall"),
  recall_min = 0.8,
  thresholds = NULL,
  n_thresholds = 1001,
  batch_size = 1e+05,
  verbose = TRUE
)
Arguments
- scores_parquet
Path to a Parquet dataset (file or directory) with at least the columns id and the score column.
- score_col
Name of the score column to calibrate (e.g., "ensemble", "relevance_score", or "margin").
- labels_parquet
Path to a Parquet dataset with columns id and label (0/1) used as calibration labels.
- metric
Optimisation target: "f1" (default) or "precision_at_recall".
- recall_min
Minimum recall required when metric = "precision_at_recall".
- thresholds
Optional numeric vector of thresholds to evaluate. If NULL, a regular grid between observed min/max is used (seen_thresholds).
- n_thresholds
Number of thresholds to generate when thresholds is NULL (default 1001).
- batch_size
Approximate Arrow scan batch size.
- verbose
Logical; print progress messages.
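How metric and recall_min might interact can be illustrated with a small helper that works on already accumulated confusion counts. `pick_threshold` is a hypothetical name introduced here, not a function of the package, and the handling of the case where no threshold reaches the recall floor is an assumption.

```r
# Hypothetical helper (not part of the package): choose a threshold from
# per-threshold confusion counts.
pick_threshold <- function(thresholds, tp, fp, fn,
                           metric = c("f1", "precision_at_recall"),
                           recall_min = 0.8) {
  metric <- match.arg(metric)
  precision <- ifelse(tp + fp > 0, tp / (tp + fp), 0)
  recall    <- ifelse(tp + fn > 0, tp / (tp + fn), 0)
  if (metric == "f1") {
    value <- ifelse(precision + recall > 0,
                    2 * precision * recall / (precision + recall), 0)
  } else {
    # Only thresholds that meet the recall floor are eligible.
    value <- ifelse(recall >= recall_min, precision, NA_real_)
    if (all(is.na(value))) stop("no threshold reaches recall_min")  # assumed behaviour
  }
  i <- which.max(value)   # which.max skips NA entries
  list(threshold = thresholds[i], value = value[i])
}

# Toy counts for four candidate thresholds (10 positives in total).
thr <- c(0.2, 0.4, 0.6, 0.8)
tp  <- c(10, 9, 8, 4)
fp  <- c(10, 5, 2, 0)
fn  <- 10 - tp

res_f1  <- pick_threshold(thr, tp, fp, fn, "f1")
res_par <- pick_threshold(thr, tp, fp, fn, "precision_at_recall",
                          recall_min = 0.9)
```

With these toy counts the two metrics select different thresholds: F1 balances precision and recall, while "precision_at_recall" first discards every threshold whose recall falls below recall_min and then maximises precision among the rest.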