Loan Default Risk Management System
A full-stack predictive analytics solution designed to assess loan default risk across diverse customer segments. This project analyzes 1,000,000+ loan applications to empower financial institutions with data-driven insights, risk modeling, and interactive dashboards for smarter credit decisioning and portfolio management.
GitHub Project Repository: Loan-Default-Risk-Management-System
Project Overview
Loan defaults pose a significant challenge to financial stability and lending efficiency. This project delivers an end-to-end analytics platform that enables:
- Risk profiling of loan applicants
- Portfolio-level default trend analysis
- Predictive modeling for credit decisions
- Dashboard-driven insights for compliance and strategy
Key Objectives
- Clean and preprocess large-scale loan application data
- Engineer features for default prediction and dashboarding
- Build classification models to assess loan risk
- Deploy interactive dashboards for stakeholder decision-making
Project Structure
| File Name | Description |
| --- | --- |
| loan Default risk managemnt cleaned.csv | Alternate cleaned dataset version |
| cleaned_loandefault.csv | Preprocessed dataset with feature engineering |
| loan_risk_model.pkl | Trained model for predicting loan default |
| model_features.pkl | Feature list used in model training |
| loandefault.sql | SQL queries for data extraction and filtering |
| sqlconnect.py | Python script for SQL database connection |
| app.py | Streamlit app for dashboard deployment |
| finance_dashboard.py | Additional dashboard for financial KPIs |
| test_request.py | Script for model testing and API simulation |
| Loan Default Risk Management System.ipynb | Jupyter notebook with EDA, modeling, and insights |
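The database engine and schema behind loandefault.sql and sqlconnect.py are not described here, so the following is only a rough sketch of an extract-and-filter step. It assumes an in-memory SQLite database as a stand-in, and the loan_applications table and Loan_ID column are hypothetical names chosen for illustration.

```python
import sqlite3

import pandas as pd

# In-memory SQLite stands in for the real database (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE loan_applications ("
    "Loan_ID INTEGER PRIMARY KEY, Region TEXT, Loan_Amount REAL, Defaulted INTEGER)"
)
conn.executemany(
    "INSERT INTO loan_applications VALUES (?, ?, ?, ?)",
    [
        (1, "North", 25000.0, 0),
        (2, "South", 40000.0, 1),
        (3, "North", 15000.0, 0),
        (4, "North", 30000.0, 1),
    ],
)
conn.commit()

# Parameterized query: filter by region and minimum loan amount.
query = (
    "SELECT Loan_ID, Region, Loan_Amount, Defaulted "
    "FROM loan_applications "
    "WHERE Region = ? AND Loan_Amount >= ? "
    "ORDER BY Loan_ID"
)
north_large = pd.read_sql_query(query, conn, params=("North", 20000))
conn.close()

print(north_large)
```

Passing filter values through `params` rather than formatting them into the SQL string keeps the query safe to reuse with user-supplied filters.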
Data Preprocessing
- Converted Application_Date to datetime format
- Calculated Debt_to_Income and Income_Loan_Ratio
- Imputed missing values in Property_Ownership
- One-hot encoded categorical features (Employment_Status, Region, Loan_Purpose)
- Removed outliers and normalized financial metrics
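The steps above could be sketched in pandas roughly as follows. This is not the project's actual code: the Annual_Income and Existing_Debt columns, both ratio formulas, mode imputation, the 1.5 × IQR outlier rule, and min-max scaling are all assumptions made for illustration, since the README does not specify them.

```python
import pandas as pd

CATEGORICAL = ["Employment_Status", "Region", "Loan_Purpose"]
SCALED = ["Annual_Income", "Loan_Amount"]  # assumed "financial metrics"


def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    out = df.copy()

    # Parse dates; values that cannot be parsed become NaT.
    out["Application_Date"] = pd.to_datetime(out["Application_Date"], errors="coerce")

    # Ratio features. Formulas and the Annual_Income / Existing_Debt
    # column names are assumptions for this sketch.
    out["Debt_to_Income"] = out["Existing_Debt"] / out["Annual_Income"]
    out["Income_Loan_Ratio"] = out["Annual_Income"] / out["Loan_Amount"]

    # Impute missing ownership with the most frequent category (assumed strategy).
    most_common = out["Property_Ownership"].mode().iloc[0]
    out["Property_Ownership"] = out["Property_Ownership"].fillna(most_common)

    # Drop rows whose loan amount lies outside 1.5 * IQR (assumed outlier rule).
    q1, q3 = out["Loan_Amount"].quantile([0.25, 0.75])
    iqr = q3 - q1
    out = out[out["Loan_Amount"].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)].copy()

    # Min-max scale to [0, 1] (assumed normalization).
    for col in SCALED:
        lo, hi = out[col].min(), out[col].max()
        out[col] = (out[col] - lo) / (hi - lo)

    # One-hot encode the categorical columns named in the list above.
    return pd.get_dummies(out, columns=CATEGORICAL, dtype=int)


sample = pd.DataFrame({
    "Application_Date": ["2023-01-15", "2023-02-20", "2023-03-05", "2023-04-10"],
    "Annual_Income": [50000.0, 80000.0, 60000.0, 40000.0],
    "Existing_Debt": [10000.0, 20000.0, 15000.0, 30000.0],
    "Loan_Amount": [20000.0, 40000.0, 30000.0, 10000.0],
    "Property_Ownership": ["Own", None, "Rent", "Own"],
    "Employment_Status": ["Salaried", "Self-Employed", "Salaried", "Unemployed"],
    "Region": ["North", "South", "North", "East"],
    "Loan_Purpose": ["Home", "Car", "Education", "Home"],
})

result = preprocess(sample)
print(result.head())
```

The ratios are computed before scaling so that they reflect the raw income and loan amounts rather than the normalized values.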
Exploratory Data Analysis
- Default trends by region, credit score, and loan purpose
- Income vs. loan amount correlation
- Approval channel impact on default rates
- Distribution of monthly installments and loan terms
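A segment-level default trend like those listed above can be computed with a single group-by. The sketch below assumes Defaulted is a 0/1 flag; the sample rows and the credit score band edges are hypothetical and exist only to make the example runnable.

```python
import pandas as pd


def default_rate_by(df: pd.DataFrame, column: str) -> pd.DataFrame:
    """Default rate (mean of the 0/1 Defaulted flag) and row count per segment."""
    return (
        df.groupby(column, observed=True)["Defaulted"]
        .agg(default_rate="mean", applications="count")
        .sort_values("default_rate", ascending=False)
    )


# Hypothetical sample rows.
loans = pd.DataFrame({
    "Region": ["North", "North", "South", "South", "East"],
    "Credit_Score": [710, 560, 600, 520, 780],
    "Defaulted": [0, 1, 1, 1, 0],
})

by_region = default_rate_by(loans, "Region")

# Band edges are illustrative, not taken from the project.
loans["Score_Band"] = pd.cut(
    loans["Credit_Score"],
    bins=[300, 580, 670, 850],
    labels=["Low", "Medium", "High"],
    include_lowest=True,
)
by_band = default_rate_by(loans, "Score_Band")

print(by_region)
print(by_band)
```

Reporting the application count next to each rate helps avoid over-reading segments that contain only a few loans.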
Modeling Approach
- Target Variable: Defaulted
- Algorithms Used: Logistic Regression, Random Forest, XGBoost
- Evaluation Metrics: Accuracy, Precision, Recall, F1 Score, ROC-AUC
- Top Features: Credit_Score, Debt_to_Income, Past_Defaults, Loan_Amount
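A minimal sketch of this train-and-evaluate loop with scikit-learn is shown below. It runs on synthetic data generated in the script, not on the project's dataset, so the printed metrics say nothing about the real models. Hyperparameters and the split ratio are assumptions. XGBoost is left out only to keep dependencies light; `xgboost.XGBClassifier` exposes the same `fit` / `predict_proba` interface and could be added to the `models` dict.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, f1_score, precision_score,
                             recall_score, roc_auc_score)
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

FEATURES = ["Credit_Score", "Debt_to_Income", "Past_Defaults", "Loan_Amount"]

# Synthetic stand-in data (assumption): one column per name in FEATURES.
rng = np.random.default_rng(42)
n = 2000
X = np.column_stack([
    rng.integers(300, 851, size=n),
    rng.uniform(0.0, 1.0, size=n),
    rng.integers(0, 4, size=n),
    rng.uniform(1_000, 50_000, size=n),
])
# Synthetic target: lower scores, higher DTI, and past defaults raise risk.
logit = -0.01 * (X[:, 0] - 600) + 3.0 * (X[:, 1] - 0.5) + 0.8 * X[:, 2] - 1.0
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-logit)))  # Defaulted: 1 = default

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)

models = {
    "Logistic Regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "Random Forest": RandomForestClassifier(n_estimators=200, random_state=42),
}


def evaluate(model):
    """Fit on the training split and score on the held-out split."""
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    proba = model.predict_proba(X_test)[:, 1]
    return {
        "accuracy": accuracy_score(y_test, pred),
        "precision": precision_score(y_test, pred, zero_division=0),
        "recall": recall_score(y_test, pred, zero_division=0),
        "f1": f1_score(y_test, pred, zero_division=0),
        "roc_auc": roc_auc_score(y_test, proba),
    }


results = {name: evaluate(model) for name, model in models.items()}
for name, metrics in results.items():
    print(name, {k: round(v, 3) for k, v in metrics.items()})

# Impurity-based importances from the fitted random forest.
importances = dict(zip(FEATURES, models["Random Forest"].feature_importances_))
print(importances)
```

In a lending setting, recall on the default class and ROC-AUC are usually more informative than accuracy alone, because defaults tend to be the minority class.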
Dashboard Overview
Power BI Dashboard
Visualizes credit risk and portfolio health:
- Regional default heatmaps
- Credit score and income distribution
- Application timeline analysis
- KPI cards for approval rates, risk levels, and loan volumes



Streamlit App
Interactive dashboard for real-time risk prediction:
- Customer-level risk summary
- Default prediction tool
- Feature importance visualization
- Filters by region, employment status, and loan purpose


Deployment
- Model serialized with joblib as loan_risk_model.pkl
- Dashboard deployed via Streamlit Cloud
- SQL integration for dynamic data updates
- Git LFS used for large file management
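The serialization step might look like the sketch below, which saves a model and its feature list with joblib, reloads both, and aligns a new applicant's columns to the saved list before scoring. The tiny training frame and the applicant values are made up, and writing to a temporary directory is a choice made here so the example leaves no files behind. Only the two .pkl file names come from the project.

```python
import tempfile
from pathlib import Path

import joblib
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Tiny made-up training frame standing in for the real data (assumption).
train = pd.DataFrame({
    "Credit_Score": [720, 580, 650, 500, 780, 540],
    "Debt_to_Income": [0.20, 0.55, 0.35, 0.70, 0.15, 0.60],
    "Region_North": [1, 0, 1, 0, 0, 1],
    "Region_South": [0, 1, 0, 1, 1, 0],
})
target = [0, 1, 0, 1, 0, 1]  # Defaulted

features = list(train.columns)
model = LogisticRegression(max_iter=1000).fit(train, target)

with tempfile.TemporaryDirectory() as tmp:
    model_path = Path(tmp) / "loan_risk_model.pkl"
    features_path = Path(tmp) / "model_features.pkl"

    # Persist the model and the exact column order it was trained on.
    joblib.dump(model, model_path)
    joblib.dump(features, features_path)

    loaded_model = joblib.load(model_path)
    loaded_features = joblib.load(features_path)

# A new applicant may lack some one-hot columns; reindex fills them with 0
# and enforces the training column order.
applicant = {"Credit_Score": 610, "Debt_to_Income": 0.45, "Region_North": 1}
row = pd.DataFrame([applicant]).reindex(columns=loaded_features, fill_value=0)

risk = float(loaded_model.predict_proba(row)[0, 1])
print(f"Predicted default probability: {risk:.3f}")
```

Storing the feature list next to the model guards against silent column-order mismatches when the dashboard builds inputs from user selections.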
Business Impact
- Flags high-risk applicants before loan approval
- Helps improve portfolio health and reduce non-performing asset (NPA) rates
- Enables real-time credit risk monitoring
- Supports data-driven lending strategy and compliance
Tech Stack
- Python: Pandas, NumPy, Scikit-learn, XGBoost, joblib, Streamlit
- SQL: Data extraction and filtering
- Visualization: Power BI, Matplotlib, Seaborn, Plotly
- Deployment: Streamlit Cloud, GitHub, Git LFS
Future Enhancements
- Integrate real-time credit bureau APIs
- Add explainability via SHAP or LIME
- Enable user-uploaded loan applications for prediction
- Expand dashboard to include repayment forecasting and risk scoring
Author
Anesh Raj
GitHub Profile