Initial commit: BAF Lakehouse fraud detection pipeline
End-to-end LightGBM fraud detection pipeline built as an R package, orchestrated by targets, with data stored in MinIO via Apache Arrow. Includes a 6-layer Lakehouse architecture, a class imbalance tournament, formally tuned hyperparameters (PR-AUC 0.198), and Quarto RevealJS slides.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
34
DESCRIPTION
Normal file
@@ -0,0 +1,34 @@
Package: baflakehouse
Title: Lakehouse Workflow for the Bank Account Fraud Dataset
Version: 0.0.0.9000
Authors@R:
    person("Rob", "Wiederstein", role = c("aut", "cre"),
           email = "REPLACE_ME@example.com")
Description: Tools to ingest the Bank Account Fraud (BAF) Base dataset into a
    MinIO/S3-backed lakehouse, clean encoded missing values, and produce
    reproducible reporting artifacts (tables, figures, slides) orchestrated with
    targets.
License: MIT + file LICENSE
Encoding: UTF-8
Roxygen: list(markdown = TRUE)
RoxygenNote: 7.3.3
Imports:
    arrow,
    colorspace,
    cowplot,
    dplyr,
    tidyr,
    stringr,
    readr,
    gt,
    quarto,
    ggplot2,
    bonsai
Suggests:
    duckdb,
    targets,
    tarchetypes,
    knitr,
    scales
URL: https://docs.robwiederstein.org/baflakehouse
BugReports: https://git.robwiederstein.org/rkw/bank-fraud-baf-lakehouse/issues