Predicting Telecom Customer Churn with Explainable AI
Problem
Customer churn is one of the biggest challenges in the telecom industry. Acquiring new customers costs significantly more than retaining existing ones. The key business question was: “Can we predict which customers are likely to churn, and explain why, so retention teams can take targeted action?”
My Approach
1) Data Understanding & Feature Engineering
I worked with the Kaggle Telecom Churn dataset. To give the model more signal, I engineered business-relevant features: average call duration, total charges, international call ratio, and a grouped state variable.
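As an illustration, the engineered features could be derived along these lines. This is a minimal sketch rather than the project code: the column names, the two toy rows, and the STATE_GROUPS mapping are all assumptions made for the example, and the real grouping of states may have followed a different rule.

```python
import pandas as pd

PERIODS = ["day", "eve", "night", "intl"]

# Hypothetical grouping: unlisted states fall back to "Other".
STATE_GROUPS = {"NJ": "Northeast", "NY": "Northeast", "CA": "West"}


def add_features(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with the four engineered features added."""
    out = df.copy()
    total_minutes = out[[f"total_{p}_minutes" for p in PERIODS]].sum(axis=1)
    total_calls = out[[f"total_{p}_calls" for p in PERIODS]].sum(axis=1)
    # Mask zero call counts so the ratios below never divide by zero.
    safe_calls = total_calls.where(total_calls > 0)

    out["total_charges"] = out[[f"total_{p}_charge" for p in PERIODS]].sum(axis=1)
    out["avg_call_duration"] = (total_minutes / safe_calls).fillna(0.0)
    out["intl_call_ratio"] = (out["total_intl_calls"] / safe_calls).fillna(0.0)
    out["state_group"] = out["state"].map(STATE_GROUPS).fillna("Other")
    return out


# Toy input with assumed column names (not the dataset's actual headers).
raw = pd.DataFrame({
    "state": ["NJ", "TX"],
    "total_day_minutes": [200.0, 0.0], "total_day_calls": [100, 0], "total_day_charge": [34.0, 0.0],
    "total_eve_minutes": [150.0, 0.0], "total_eve_calls": [50, 0], "total_eve_charge": [12.75, 0.0],
    "total_night_minutes": [100.0, 0.0], "total_night_calls": [40, 0], "total_night_charge": [4.5, 0.0],
    "total_intl_minutes": [10.0, 0.0], "total_intl_calls": [10, 0], "total_intl_charge": [2.7, 0.0],
})

features = add_features(raw)
print(features[["total_charges", "avg_call_duration", "intl_call_ratio", "state_group"]])
```

The zero-call guard matters for customers with no recorded usage, who would otherwise produce NaN or infinite ratios.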
2) Model Development & Tuning
I compared Logistic Regression, Random Forest, and XGBoost using 5-fold stratified cross-validation (CV). After tuning, XGBoost delivered strong results:
- CV AUC ≈ 0.93
- Hold-out test: AUC = 0.9249, Recall = 0.87, Precision = 0.94, Accuracy = 97.45%
3) Explainability with SHAP
I applied SHAP to explain predictions at both the global level (which features matter across all customers) and the individual level. Waterfall plots illustrated the customer-specific drivers of churn.
4) ROI Simulation
A simple ROI model assumed a €4,000 customer lifetime value (CLV) and a €150 retention offer. Targeting the top 5% highest-risk customers produced higher net savings than random targeting.
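The arithmetic behind such a simulation can be sketched as follows. Only the €4,000 CLV, the €150 offer, and the 5% targeting share come from the ROI model described here. The offer acceptance rate (SAVE_RATE), the customer count, the churn rate, and the synthetic risk scores are assumptions made for illustration, so the printed amounts are not the project's results.

```python
import numpy as np

CLV = 4_000          # customer lifetime value in EUR (from the ROI model)
OFFER_COST = 150     # cost of one retention offer in EUR (from the ROI model)
TARGET_SHARE = 0.05  # share of customers contacted (from the ROI model)
SAVE_RATE = 0.30     # ASSUMPTION: share of contacted churners who stay

rng = np.random.default_rng(0)
n_customers = 3_000  # ASSUMPTION: illustrative customer base

# ASSUMPTION: synthetic churn labels and risk scores that favor true churners.
churned = rng.random(n_customers) < 0.15
risk_score = np.where(
    churned, rng.beta(5, 2, n_customers), rng.beta(2, 5, n_customers)
)


def net_savings(n_targeted: float, n_churners_reached: float) -> float:
    """Retained customer value minus the cost of all offers sent."""
    return n_churners_reached * SAVE_RATE * CLV - n_targeted * OFFER_COST


n_target = int(TARGET_SHARE * n_customers)

# Model-based targeting: contact the customers with the highest risk scores.
top_idx = np.argsort(risk_score)[::-1][:n_target]
model_net = net_savings(n_target, churned[top_idx].sum())

# Random targeting: expected churners reached equals the base churn rate.
random_net = net_savings(n_target, n_target * churned.mean())

uplift = model_net - random_net
print(f"Model targeting:  EUR {model_net:,.0f}")
print(f"Random targeting: EUR {random_net:,.0f}")
print(f"Uplift:           EUR {uplift:,.0f}")
```

The random baseline is computed as an expectation rather than a single random draw, which keeps the comparison stable from run to run.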
Impact
- Predictive Power: 97.45% hold-out accuracy, with recall of 0.87 and precision of 0.94.
- Business Alignment: ROI-focused targeting strategy.
- Transparency: SHAP explanations make the model actionable & trustworthy.
Tech Stack
Python (scikit-learn, XGBoost, imbalanced-learn, SHAP) • pandas • NumPy • matplotlib • Packaged demo + ROI workbook
Screenshots / Prototype Visuals
Visual explanations of the churn model using SHAP. These plots highlight both global feature importance and individual customer-level drivers of churn, making the model’s predictions transparent and actionable.



