All functions now default to bucket_name = "lake" with "baf-fraud/"
prepended to all layer prefixes, matching the contemporary lakehouse
naming convention (one bucket per environment, project as prefix).
Migration: copy baf-fraud/ data to lake/baf-fraud/ on analyticsvm,
then update the BAF_BUCKET env var from "baf-fraud" to "lake".
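A minimal sketch of how the new default bucket and project prefix could
combine into an object location. The helpers layer_prefix() and
layer_location() and the layer name "bronze" are hypothetical and used
only for illustration; reading the default from BAF_BUCKET is likewise
an assumption, not a description of the package's actual functions.

```r
# Hypothetical helpers for illustration; not part of the package API.

# Prepend the project name to a layer, e.g. "bronze" -> "baf-fraud/bronze/".
layer_prefix <- function(layer, project = "baf-fraud") {
  paste0(project, "/", layer, "/")
}

# Combine bucket and prefix. Assumes the default bucket comes from the
# BAF_BUCKET env var and falls back to "lake" when the var is unset.
layer_location <- function(layer,
                           bucket_name = Sys.getenv("BAF_BUCKET", unset = "lake"),
                           project = "baf-fraud") {
  paste0(bucket_name, "/", layer_prefix(layer, project))
}

# "bronze" is a placeholder layer name.
print(layer_location("bronze", bucket_name = "lake"))
```

Under the old layout the same helper would be called with
bucket_name = "baf-fraud" and an empty project prefix, which is why the
data has to be copied before the env var is switched.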
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- deploy/baflakehouse.caddy: handle_path snippet routes /baflakehouse*
to docs/ with prefix stripping so the flat pkgdown structure maps
correctly
- bin/sync-caddy.sh: one-time script that installs the snippet and
reloads Caddy with zero downtime; deploy.R handles everything after
that automatically
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- testthat infrastructure with 15 tests covering env-var guards,
return types for all format/save functions, and spelling
- inst/WORDLIST with 52 domain terms (LightGBM, MinIO, Parquet, etc.)
- Spelling test wired into devtools::test() via test-spelling.R
- styler::style_file() added as step 0 in deploy.R (auto-fixes before ship)
- .gitea/workflows/test.yaml: runs testthat suite on push
- .gitea/workflows/lint.yaml: lychee link check + styler dry-run on push
- Removed internal IP address from comment in train_production_model()
- Language: en-US added to DESCRIPTION
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
resources/images/confusion-matrix.png is a static Wikipedia screenshot
used in index.qmd slides -- not a generated artifact, so it belongs
in version control.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add reports/figures/, reports/slides/, reports/tables/ to .gitignore
and untrack previously committed PNGs. These are build artifacts
regenerated by tar_make() and deploy.R.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Converts scratch/tune_model.R into a pure tune_lgbm() function,
replacing hardcoded winning_params with a fully automated tar_target.
Best params (trees=844, depth=3, lr=0.0204, min_n=389) now flow
reproducibly into evaluate_final_model() and train_production_model().
PR-AUC improved from 0.165 to 0.198.
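A rough sketch of the idea behind replacing a hardcoded parameter list
with a computed one: a pure function takes a table of candidates and
returns the best row as a value that downstream steps can consume.
select_best_params() and the table layout are hypothetical and do not
reflect the internals of tune_lgbm(). The first row stands in for the
earlier hardcoded parameters, whose values are not given in this
message, so they stay NA.

```r
# Hypothetical helper for illustration; not the package's tune_lgbm().
# Returns the hyperparameters of the row with the highest metric value.
select_best_params <- function(results, metric = "pr_auc") {
  stopifnot(is.data.frame(results), metric %in% names(results))
  best <- results[which.max(results[[metric]]), , drop = FALSE]
  # Drop the metric column and return the parameters as a named list.
  as.list(best[setdiff(names(best), metric)])
}

# Two-row illustration using only the figures quoted in this message.
candidates <- data.frame(
  trees  = c(NA, 844),
  depth  = c(NA, 3),
  lr     = c(NA, 0.0204),
  min_n  = c(NA, 389),
  pr_auc = c(0.165, 0.198)
)

best <- select_best_params(candidates)
str(best)
```

Because the function has no side effects, a pipeline tool can cache its
result and rerun it only when its inputs change.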
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
End-to-end LightGBM fraud detection pipeline built as an R package,
orchestrated by targets with data stored in MinIO via Apache Arrow.
Includes 6-layer Lakehouse architecture, class imbalance tournament,
formally tuned hyperparameters (PR-AUC 0.198), and Quarto RevealJS slides.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>