Refactor: consistent naming across functions, targets, and pkgdown

Functions: prepare_eda_recipe -> build_eda_recipe,
           create_efficiency_plot -> plot_efficiency,
           format_class_imbalance_tourney_gt -> format_tournament_gt

Targets: model_inputs_prefix -> baf_model_input_prefix,
         tbl_fraud_by_month_data -> fraud_by_month_summary,
         model_diag -> diag_fit, winning_params -> best_params,
         production_recipe_blueprint -> prod_recipe,
         final_eval_data -> test_predictions

pkgdown: restructured reference index into 6 logical sections,
         removed stale names and development comments.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-22 03:52:34 -05:00
parent f47b2e1be2
commit b38892f49e
7 changed files with 159 additions and 212 deletions

@@ -2,7 +2,7 @@ url: https://docs.robwiederstein.org/baflakehouse
 template:
   bootstrap: 5
-  bootswatch: flatly # Clean, professional look
+  bootswatch: flatly
 navbar:
   structure:
@@ -15,44 +15,51 @@ navbar:
 reference:
 - title: "Data Ingestion & Lakehouse Setup"
-  desc: "Functions for moving data from CSV to partitioned Parquet in MinIO."
+  desc: "Functions for moving raw CSV data into the MinIO Lakehouse as partitioned Parquet."
   contents:
   - baflakehouse-package
   - convert_to_parquet
   - connect_baf
   - clean_baf_base
 - title: "Feature Engineering & Preprocessing"
-  desc: "The 'Recipes' layer of the pipeline."
+  desc: "Recipes and transformations applied across the pipeline layers."
   contents:
   - engineer_features
-  - prepare_eda_recipe
-  - build_baf_recipe # NEW: Untrained blueprint for production
   - generate_model_inputs
+  - build_eda_recipe
+  - build_baf_recipe
-- title: "The Tournament (Model Selection)"
-  desc: "Cross-validation and imbalance strategy testing."
+- title: "Exploratory Data Analysis"
+  desc: "Diagnostic model and visualizations for understanding the fraud signal."
   contents:
+  - train_diag_model
+  - plot_var_imp
+  - plot_hexbin_interaction
+  - plot_missingness
+  - plot_num_cor
+- title: "Model Selection & Tuning"
+  desc: "Imbalance strategy tournament, hyperparameter tuning, and results formatting."
+  contents:
   - run_imbalance_tournament
   - tune_lgbm
-  - train_diag_model
-  - create_efficiency_plot # Moved here: Belongs with the tournament
+  - format_tournament_gt
+  - plot_efficiency
 - title: "Final Evaluation & Production Deployment"
-  desc: "Results on unseen data (Months 6-7) and MinIO artifact serialization."
+  desc: "Holdout evaluation on months 6-7 and MinIO model artifact serialization."
   contents:
   - evaluate_final_model
-  - train_production_model # NEW: The final deployment function
+  - train_production_model
-- title: "Reporting: Tables & Visualizations"
-  desc: "Generating ggplot2 figures and gt tables for Quarto."
+- title: "Reporting"
+  desc: "Figures, tables, and slide rendering for the Quarto presentation."
   contents:
-  - starts_with("plot_")
-  - starts_with("compute_")
-  - starts_with("format_") # Neatly catches all your gt table formatters
-- title: "Pipeline Utilities"
-  desc: "Internal helpers for the targets workflow and slide generation."
-  contents:
-  - starts_with("save_report_")
-  - render_slides # Consolidated here
+  - plot_fraud_by_month
+  - plot_conf_mat_heatmap
+  - compute_fraud_by_month
+  - format_fraud_by_month_gt
+  - save_report_figure
+  - save_report_table
+  - render_slides