Fraud Detection ML Suite — Case Study | Romeo Thomas

Fraud Detection ML Suite

Gradient-boosted ensemble with built-in explainability that cuts false positives and raises the catch rate.

Highlights: −27% false positives · +14% catch rate · XAI reports

Problem

False negatives drive direct fraud losses, while false positives inflate manual review volume and slow decisions. Compliance and model risk teams also require transparent, defensible decisions with auditable reasoning.

Data and Sourcing

  • Labeled transactions with timestamps (events, approvals, chargebacks).
  • Device, network, merchant, payment, and customer features (PII excluded).
  • Leak-free splits with temporal validation and out-of-time (OOT) backtests.
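The leak-free splitting above can be sketched as a strict cut by event time, so the model never trains on information from the future. Column and function names here are illustrative, not the project's actual schema:

```python
import pandas as pd

def temporal_split(df: pd.DataFrame, train_end: str, oot_start: str,
                   time_col: str = "event_time"):
    """Split strictly by time: train < train_end <= validation < oot_start <= OOT.

    Because every training row predates every validation and OOT row,
    no future information (e.g. later chargebacks) can leak into training.
    """
    df = df.sort_values(time_col)
    train = df[df[time_col] < train_end]
    valid = df[(df[time_col] >= train_end) & (df[time_col] < oot_start)]
    oot = df[df[time_col] >= oot_start]
    return train, valid, oot
```

The out-of-time (OOT) partition is held back entirely until final backtesting, which is what makes the stability claims in the evaluation section meaningful.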

Approach

  • Feature store with rolling windows, risk signals, and interaction terms.
  • Class imbalance handling (SMOTE/undersample), calibrated thresholds, cost-sensitive objective.
  • Gradient boosting (LightGBM) with Bayesian hyperparameter search & early stopping.
  • Explainability with SHAP (global & local) for model risk management and reviewer guidance.

Experiments and Evaluation

  • Time-based CV; tracked ROC-AUC and PR-AUC across folds and OOT windows.
  • Threshold sweep to map precision–recall vs reviewer capacity (Ops intake curves).
  • OOT backtest confirming stability and no target leakage.
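The threshold sweep against reviewer capacity can be sketched as follows: for each capacity level (fraction of traffic the Ops team can review), flag exactly the top-scored transactions that fill the queue and report precision and recall at that operating point. Function and variable names are illustrative:

```python
import numpy as np

def intake_curve(y_true, scores, capacities=(0.01, 0.02, 0.05)):
    """For each reviewer capacity (fraction of traffic flagged), choose the
    score cutoff that fills exactly that queue and report precision/recall."""
    order = np.argsort(-np.asarray(scores))   # highest-risk transactions first
    y_sorted = np.asarray(y_true)[order]
    total_pos = y_sorted.sum()
    rows = []
    for cap in capacities:
        k = max(1, int(round(cap * len(y_sorted))))  # queue size at this capacity
        flagged = y_sorted[:k]
        precision = flagged.sum() / k
        recall = flagged.sum() / total_pos
        rows.append((cap, precision, recall))
    return rows
```

Plotting these rows gives the Ops intake curve: it shows directly how much catch rate each additional unit of review capacity buys, which is how a threshold gets justified to both Ops and model risk.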

Results and Impact

  • -27% false positives at matched recall vs baseline rules.
  • +14% catch rate on an OOT window at constant review volume.
  • Reduced reviewer workload while maintaining coverage of high-severity cases.

What I Did

  • Designed the feature set, training pipeline, and evaluation framework.
  • Shipped a FastAPI scoring endpoint with auth, rate limits, and logging.
  • Produced SHAP reports and governance notes to support model sign-off.

Stack

Python · LightGBM · Pandas · Scikit-learn · Imbalanced-learn · SHAP · Plotly · FastAPI · Postgres · Docker · GitHub Actions