Catching the 0.17% of transactions that cost billions
12 models tested. One winner.
Understanding the patterns
Only 492 frauds in 284,807 transactions
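A quick check of the imbalance, using only the two counts quoted above (a minimal sketch; the variable names are illustrative, not taken from the project code):

```python
# Counts as stated in the text above.
fraud_count = 492
total_count = 284_807

# Share of transactions that are fraudulent.
fraud_rate = fraud_count / total_count

# How many legitimate transactions there are for each fraudulent one.
legit_per_fraud = (total_count - fraud_count) / fraud_count

print(f"Fraud rate: {fraud_rate:.2%}")
print(f"Legitimate transactions per fraud: {legit_per_fraud:.0f}")
```

The rate rounds to the 0.17% in the headline, which is why plain accuracy says little here: a model that never flags fraud would still be right more than 99.8% of the time.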
Fraudsters prefer smaller transactions
Fraud timing throughout the day
Features V14, V17, and V12 are most correlated with fraud
Gradient boosting with data augmentation crushed the 11 other models, including deep learning.
Understanding what drives the predictions
V14 + V17 account for 34% of decisions
Precision vs recall business impact
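The trade-off can be sketched as follows. Everything numeric here is a hypothetical placeholder (the confusion counts and the per-error costs are assumptions for illustration, not results from this project):

```python
def precision_recall(tp: int, fp: int, fn: int) -> tuple[float, float]:
    """Precision: share of flagged transactions that are truly fraud.
    Recall: share of actual frauds that were flagged."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def error_cost(fp: int, fn: int,
               cost_per_false_alarm: float,
               cost_per_missed_fraud: float) -> float:
    """Total cost of mistakes: false alarms annoy customers,
    missed frauds lose money."""
    return fp * cost_per_false_alarm + fn * cost_per_missed_fraud


# Hypothetical confusion counts at some decision threshold.
tp, fp, fn = 80, 20, 20
precision, recall = precision_recall(tp, fp, fn)

# Hypothetical costs: a false alarm is cheap, a missed fraud is expensive.
total_cost = error_cost(fp, fn, cost_per_false_alarm=5,
                        cost_per_missed_fraud=100)

print(f"precision={precision:.2f} recall={recall:.2f} cost={total_cost}")
```

Under cost assumptions like these, lowering the threshold to catch more fraud (higher recall) can pay off even when precision drops, because each missed fraud costs far more than a false alarm.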
Area under curve: 91.04%
Gradient boosting dominates
Understanding why the model makes decisions
XGBoost, CatBoost, LightGBM, TabNet, Transformer
The complete picture