Fraud Detection ML Suite
Gradient-boosted ensemble with explainability to cut false positives and raise the catch rate.
Highlights: -27% false positives · +14% catch rate · explainability (XAI) reports
Problem
False negatives drive direct fraud loss, while false positives inflate manual review volume and slow decisions. Compliance and model risk teams require transparent, defensible decisions with auditable reasoning.
Data and Sourcing
- Labeled transactions with timestamps (events, approvals, chargebacks).
- Device, network, merchant, payment, and customer features (PII excluded).
- Leak-free splits with temporal validation and out-of-time (OOT) backtests.
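The leak-free temporal split can be sketched in a few lines of pandas. The function name and column names here are illustrative, not the project's actual pipeline: the key point is that the out-of-time (OOT) window starts strictly after the training cutoff, so no future information leaks into training.

```python
import pandas as pd

def temporal_split(df, ts_col, train_end, oot_end):
    """Split transactions by timestamp so the model never sees future data.

    Rows up to `train_end` form the training set; rows in
    (`train_end`, `oot_end`] form the out-of-time (OOT) backtest window.
    """
    train = df[df[ts_col] <= train_end]
    oot = df[(df[ts_col] > train_end) & (df[ts_col] <= oot_end)]
    return train, oot

# Toy example with synthetic daily timestamps (illustrative only).
df = pd.DataFrame({
    "ts": pd.date_range("2024-01-01", periods=10, freq="D"),
    "label": [0, 0, 1, 0, 0, 1, 0, 0, 0, 1],
})
train, oot = temporal_split(
    df, "ts", pd.Timestamp("2024-01-06"), pd.Timestamp("2024-01-10")
)
```

The same cutoff logic extends to rolling-origin cross-validation: slide `train_end` forward and re-evaluate on each subsequent OOT window.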
Approach
- Feature store with rolling windows, risk signals, and interaction terms.
- Class imbalance handling (SMOTE / undersampling), calibrated probability thresholds, and a cost-sensitive objective.
- Gradient boosting (LightGBM) with Bayesian hyperparameter search and early stopping.
- Explainability with SHAP (global and local) for model risk management and reviewer guidance.
Experiments and Evaluation
- Time-based CV; tracked ROC-AUC and PR-AUC across folds and OOT windows.
- Threshold sweep to map precision–recall vs reviewer capacity (Ops intake curves).
- OOT backtest confirming stability and no target leakage.
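The threshold sweep against reviewer capacity can be sketched with scikit-learn's `precision_recall_curve`: for each candidate threshold, count how many cases would be flagged (reviewer intake), then pick the highest-recall threshold whose intake still fits the team's capacity. Scores, labels, and the capacity figure below are hypothetical:

```python
import numpy as np
from sklearn.metrics import precision_recall_curve

# Hypothetical validation labels and model scores (illustrative only).
rng = np.random.default_rng(0)
y_true = rng.binomial(1, 0.05, size=2000)
scores = np.clip(y_true * 0.4 + rng.normal(0.3, 0.15, size=2000), 0.0, 1.0)

precision, recall, thresholds = precision_recall_curve(y_true, scores)

def intake_at(threshold):
    """Reviewer intake: number of cases flagged at this threshold."""
    return int((scores >= threshold).sum())

# Pick the highest-recall threshold whose intake fits reviewer capacity.
capacity = 150
feasible = [(t, r) for t, r in zip(thresholds, recall[:-1])
            if intake_at(t) <= capacity]
best_t, best_recall = max(feasible, key=lambda tr: tr[1])
```

Plotting intake against recall across the sweep yields the Ops intake curves referenced above.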
Results and Impact
- 27% fewer false positives at matched recall versus the baseline rules.
- 14% higher catch rate on an out-of-time (OOT) window at constant review volume.
- Reduced reviewer workload while maintaining coverage of high-severity cases.
What I Did
- Designed the feature set, training pipeline, and evaluation framework.
- Shipped a FastAPI scoring endpoint with auth, rate limits, and logging.
- Produced SHAP reports and governance notes to support model sign-off.
Stack
Python
LightGBM
Pandas
Scikit-learn
Imbalanced-learn
SHAP
Plotly
FastAPI
Postgres
Docker
GitHub Actions
