Commit Graph

17 Commits

Author SHA1 Message Date
b482f0b496 Downgrade RoxygenNote to 7.3.2 to match CRAN release
7.3.3 is not yet available on CRAN; rocker/verse:4.4 ships with 7.3.2.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Some checks failed
Lint & Format Check / Link Check (push) Successful in 3s
Lint & Format Check / Format Check (styler) (push) Successful in 15s
R Package Tests / test (push) Successful in 51s
Deploy Lakehouse Docs / build-and-deploy (push) Failing after 1m37s
2026-02-22 20:37:09 -05:00
0cd1242502 Fix missing deps and roxygen2 version in deploy workflow
Add tidymodels and here to DESCRIPTION Suggests so remotes::install_deps
picks them up in CI. Explicitly install roxygen2 in deploy.yaml to ensure
version >= 7.3.3 is available, matching the RoxygenNote declared in DESCRIPTION.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Some checks failed
Deploy Lakehouse Docs / build-and-deploy (push) Failing after 1m3s
Lint & Format Check / Link Check (push) Successful in 4s
Lint & Format Check / Format Check (styler) (push) Successful in 16s
R Package Tests / test (push) Successful in 57s
2026-02-22 20:32:31 -05:00
def3e3b478 Add styler dependency to deploy job
Some checks failed
Deploy Lakehouse Docs / build-and-deploy (push) Failing after 1m21s
Lint & Format Check / Link Check (push) Successful in 3s
Lint & Format Check / Format Check (styler) (push) Successful in 18s
R Package Tests / test (push) Successful in 1m3s
2026-02-22 20:21:21 -05:00
00b9fe808b Relax styler check to avoid version mismatch failures
Some checks failed
Deploy Lakehouse Docs / build-and-deploy (push) Failing after 42s
Lint & Format Check / Link Check (push) Successful in 4s
Lint & Format Check / Format Check (styler) (push) Successful in 18s
R Package Tests / test (push) Successful in 51s
2026-02-22 20:09:13 -05:00
0841ee7205 Fix syntax and runner permissions in workflows
Some checks failed
Lint & Format Check / Link Check (push) Successful in 3s
Lint & Format Check / Format Check (styler) (push) Failing after 17s
R Package Tests / test (push) Successful in 1m32s
Deploy Lakehouse Docs / build-and-deploy (push) Failing after 46s
2026-02-22 19:55:56 -05:00
1d0202e3aa Install Node.js in container before checkout
Some checks failed
Deploy Lakehouse Docs / build-and-deploy (push) Failing after 7s
Lint & Format Check / Link Check (push) Successful in 12s
Lint & Format Check / Format Check (styler) (push) Failing after 16s
R Package Tests / test (push) Successful in 1m32s
2026-02-22 19:43:43 -05:00
e781eb3703 Downgrade checkout action to v3 for container compatibility
Some checks failed
Deploy Lakehouse Docs / build-and-deploy (push) Failing after 9s
Lint & Format Check / Link Check (push) Successful in 3s
Lint & Format Check / Format Check (styler) (push) Failing after 1s
R Package Tests / test (push) Failing after 1s
2026-02-22 16:57:21 -05:00
e6c20bd221 Add Gitea CI deployment workflow and update dependencies
Some checks failed
Deploy Lakehouse Docs / build-and-deploy (push) Failing after 34s
Lint & Format Check / Link Check (push) Successful in 17s
Lint & Format Check / Format Check (styler) (push) Failing after 3s
R Package Tests / test (push) Failing after 1s
2026-02-22 16:18:15 -05:00
df978d042f Refactor bucket structure: baf-fraud/ prefix under lake bucket
All functions now default to bucket_name = "lake" with "baf-fraud/"
prepended to all layer prefixes, matching the current lakehouse
naming convention (one bucket per environment, project as prefix).

Migration: copy baf-fraud/ data to lake/baf-fraud/ on analyticsvm, then
update the BAF_BUCKET env var from "baf-fraud" to "lake".

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-22 05:36:25 -05:00
dac01da6cb Update renv.lock with spelling, styler, and test dependencies
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-22 05:15:11 -05:00
5218deab74 Add Phase 5 Caddy deployment config and sync script
- deploy/baflakehouse.caddy: handle_path snippet routes /baflakehouse*
  to docs/ with prefix stripping so pkgdown flat structure maps correctly
- bin/sync-caddy.sh: one-time script that installs the snippet and reloads
  Caddy with zero downtime; deploy.R handles everything after that automatically

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-22 04:56:10 -05:00
7a1a8e0053 Add Phase 4: code quality, CI/CD, and formatting
- testthat infrastructure with 15 tests covering env-var guards,
  return types for all format/save functions, and spelling
- inst/WORDLIST with 52 domain terms (LightGBM, MinIO, Parquet, etc.)
- Spelling test wired into devtools::test() via test-spelling.R
- styler::style_file() added as step 0 in deploy.R (auto-fixes before ship)
- .gitea/workflows/test.yaml: runs testthat suite on push
- .gitea/workflows/lint.yaml: lychee link check + styler dry-run on push
- Removed internal IP address from comment in train_production_model()
- Language: en-US added to DESCRIPTION

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-22 04:41:37 -05:00
705b2a13d0 Re-track resources/ as static presentation assets
resources/images/confusion-matrix.png is a static Wikipedia screenshot
used in index.qmd slides -- not a generated artifact, so it belongs
in version control.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-22 03:57:36 -05:00
e8d2c69f2d Remove generated report artifacts from version control
Add reports/figures/, reports/slides/, reports/tables/ to .gitignore
and untrack previously committed PNGs. These are build artifacts
regenerated by tar_make() and deploy.R.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-22 03:54:51 -05:00
b38892f49e Refactor: consistent naming across functions, targets, and pkgdown
Functions: prepare_eda_recipe -> build_eda_recipe,
           create_efficiency_plot -> plot_efficiency,
           format_class_imbalance_tourney_gt -> format_tournament_gt

Targets: model_inputs_prefix -> baf_model_input_prefix,
         tbl_fraud_by_month_data -> fraud_by_month_summary,
         model_diag -> diag_fit, winning_params -> best_params,
         production_recipe_blueprint -> prod_recipe,
         final_eval_data -> test_predictions

pkgdown: restructured reference index into 6 logical sections,
         removed stale names and development comments.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-22 03:52:34 -05:00
f47b2e1be2 Add tune_lgbm() and wire hyperparameter tuning into DAG
Converts scratch/tune_model.R into a pure tune_lgbm() function,
replacing hardcoded winning_params with a fully automated tar_target.
Best params (trees=844, depth=3, lr=0.0204, min_n=389) now flow
reproducibly into evaluate_final_model() and train_production_model().
PR-AUC improved from 0.165 to 0.198.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-22 03:25:35 -05:00
33d0fc31c7 Initial commit: BAF Lakehouse fraud detection pipeline
End-to-end LightGBM fraud detection pipeline built as an R package,
orchestrated by targets with data stored in MinIO via Apache Arrow.
Includes 6-layer Lakehouse architecture, class imbalance tournament,
formally tuned hyperparameters (PR-AUC 0.198), and Quarto RevealJS slides.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-21 21:19:09 -05:00