bank-fraud-baf-lakehouse/man/tune_lgbm.Rd
Rob Wiederstein df978d042f Refactor bucket structure: baf-fraud/ prefix under lake bucket
All functions now default to bucket_name = "lake" with "baf-fraud/"
prepended to all layer prefixes, matching the contemporary lakehouse
naming convention (one bucket per environment, project as prefix).

Migration: copy baf-fraud/ data to lake/baf-fraud/ on analyticsvm,
update BAF_BUCKET env var from "baf-fraud" to "lake".

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-22 05:36:25 -05:00

% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/functions.R
\name{tune_lgbm}
\alias{tune_lgbm}
\title{Tune LightGBM Hyperparameters}
\usage{
tune_lgbm(
imbalance_windows,
bucket_name = "lake",
inputs_prefix = "baf-fraud/05_model_input",
grid_size = 30L,
seed = 42L
)
}
\arguments{
\item{imbalance_windows}{A tibble with columns \code{window_id},
\code{train_months}, and \code{test_month}, as produced by the
\code{imbalance_windows} target.}
\item{bucket_name}{Character. MinIO bucket name. Default \code{"lake"}.}
\item{inputs_prefix}{Character. Prefix for the model input layer.
Default \code{"baf-fraud/05_model_input"}.}
\item{grid_size}{Integer. Number of space-filling candidates. Default \code{30}.}
\item{seed}{Integer. Random seed for reproducibility. Default \code{42}.}
}
\value{
A named list with elements \code{trees}, \code{tree_depth},
\code{learn_rate}, and \code{min_n}.
}
\description{
Performs a grid search over LightGBM hyperparameters, evaluating
\code{grid_size} space-filling candidates on the same rolling time windows
as the imbalance tournament. Optimises PR-AUC on the pre-baked baseline data
stored in MinIO and returns the best parameters as a named list ready for
use in \code{evaluate_final_model()} and \code{train_production_model()}.
}
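\examples{
\dontrun{
# A minimal usage sketch, not taken from the package itself. Assumptions:
# the window values below are hypothetical, train_months is assumed to be a
# list column of month indices, and a MinIO bucket named "lake" holding the
# "baf-fraud/05_model_input" layer is assumed to be reachable.
windows <- tibble::tibble(
  window_id    = 1:2,
  train_months = list(0:3, 0:4),
  test_month   = c(4L, 5L)
)

# Use a smaller grid than the default 30 candidates for a quick check.
best <- tune_lgbm(windows, grid_size = 10L, seed = 42L)

# The documented return value is a named list with these elements.
best[c("trees", "tree_depth", "learn_rate", "min_n")]
}
}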