Hospital Readmission Analysis
An end-to-end analytics and modeling project that assesses patient readmission risk across specialties and treatment patterns. It analyzes 25,000 patient records to give hospitals predictive insights, operational dashboards, and data-driven care strategies.
GitHub Repository: Hospital-Readmission-Analysis
Project Overview
Hospital readmissions are a key metric for care quality and operational efficiency. This project delivers a full-stack solution that enables:
- Risk profiling of patients based on visit history and diagnosis
- Specialty-level readmission trend analysis
- Predictive modeling for care optimization
- Dashboard-driven insights for hospital decision-making
Key Objectives
- Clean and preprocess patient-level hospital data
- Engineer features for readmission prediction and dashboarding
- Build classification models to assess readmission risk
- Deploy interactive dashboards for real-time clinical insights
Project Structure
| File Name | Description |
|---|---|
| hospital_readmission.sql | SQL queries for data extraction and transformation |
| hospital readmission.ipynb | Jupyter notebook with full analysis workflow |
| sqlconnect.py | Python script for SQL database connection |
| app.py | Streamlit app for interactive model deployment |
| readmission_model.pkl | Trained classification model for readmission prediction |
| feature_names.pkl | Serialized feature list used in model training |
| cleaned_hospital_readmission.csv | Preprocessed dataset used for modeling |
| hospital_readmissions_cleaned.csv | Alternate cleaned dataset version |
| Hospital Readmission Analytics.docx | Project documentation and summary report |
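As a rough illustration of how sqlconnect.py and hospital_readmission.sql might work together, the sketch below runs an aggregation query from Python. The README does not say which database engine the project uses, so the in-memory SQLite database, the table name `readmissions`, the "yes"/"no" coding of `readmitted`, and the sample rows are all assumptions made for this example.

```python
import sqlite3

# Hypothetical sample rows: (medical_specialty, time_in_hospital, readmitted)
SAMPLE_ROWS = [
    ("Cardiology", 4, "yes"),
    ("Cardiology", 2, "no"),
    ("Surgery", 6, "no"),
    ("Surgery", 3, "no"),
    ("InternalMedicine", 5, "yes"),
]


def build_demo_connection():
    """Create an in-memory SQLite database holding the sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE readmissions ("
        "medical_specialty TEXT, time_in_hospital INTEGER, readmitted TEXT)"
    )
    conn.executemany("INSERT INTO readmissions VALUES (?, ?, ?)", SAMPLE_ROWS)
    return conn


def readmission_rate_by_specialty(conn):
    """Return (specialty, n_patients, readmission_rate) tuples, sorted by specialty."""
    query = """
        SELECT medical_specialty,
               COUNT(*) AS n_patients,
               AVG(CASE WHEN readmitted = 'yes' THEN 1.0 ELSE 0.0 END) AS readmission_rate
        FROM readmissions
        GROUP BY medical_specialty
        ORDER BY medical_specialty
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    conn = build_demo_connection()
    for row in readmission_rate_by_specialty(conn):
        print(row)
    conn.close()
```

Swapping `sqlite3.connect` for the driver of the actual database should leave the query logic largely unchanged.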
Dataset Summary
- Total Records: 25,000
- Key Columns: age, time_in_hospital, n_lab_procedures, n_procedures, n_medications, n_outpatient, n_inpatient, n_emergency, medical_specialty, diag_1, diag_2, diag_3, glucose_test, A1Ctest, change, diabetes_med, readmitted, total_visits
- Target Variable: readmitted
- Feature Set: Visit counts, diagnosis codes, medication history, specialty, and encoded categorical variables
Data Preprocessing
- Converted date fields and calculated total visits
- Normalized treatment metrics and encoded categorical features
- Removed outliers and handled missing values
- Engineered features for diagnosis grouping and specialty impact
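A few of the steps above can be sketched with Pandas. This is a minimal illustration, not the notebook's actual code: it assumes `total_visits` is the sum of the three visit-count columns, that `readmitted` and `change` are coded "yes"/"no", and that missing specialties are filled with a hypothetical "Missing" label. Outlier removal and diagnosis grouping are left out.

```python
import pandas as pd


def sample_visits() -> pd.DataFrame:
    """Tiny made-up frame using column names from the dataset summary."""
    return pd.DataFrame(
        {
            "n_outpatient": [0, 2, None],
            "n_inpatient": [1, 0, 3],
            "n_emergency": [0, 1, 0],
            "medical_specialty": ["Cardiology", None, "Surgery"],
            "change": ["yes", "no", "no"],
            "readmitted": ["yes", "no", "yes"],
        }
    )


def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    out = df.copy()

    # Treat missing visit counts as zero visits (an assumption).
    visit_cols = ["n_outpatient", "n_inpatient", "n_emergency"]
    out[visit_cols] = out[visit_cols].fillna(0)

    # total_visits: sum of outpatient, inpatient, and emergency visits.
    out["total_visits"] = out[visit_cols].sum(axis=1).astype(int)

    # Fill missing specialties with a placeholder category.
    out["medical_specialty"] = out["medical_specialty"].fillna("Missing")

    # Encode the target as 0/1.
    out["readmitted"] = (out["readmitted"] == "yes").astype(int)

    # One-hot encode the categorical features.
    return pd.get_dummies(out, columns=["medical_specialty", "change"], dtype=int)


if __name__ == "__main__":
    print(preprocess(sample_visits()))
```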
Exploratory Data Analysis
- Readmission trends by specialty and diagnosis
- Visit patterns across inpatient, outpatient, and emergency channels
- Medication change and diabetes treatment impact
- Hospital stay duration and seasonal readmission patterns
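One way to compute the kind of breakdown listed above is a grouped readmission rate. The helper below is a sketch under stated assumptions: `readmission_rate_by` is a hypothetical name, `readmitted` is assumed to be coded "yes"/"no", and the sample rows are invented.

```python
import pandas as pd


def sample_records() -> pd.DataFrame:
    """Made-up rows for illustration only."""
    return pd.DataFrame(
        {
            "medical_specialty": [
                "Cardiology", "Cardiology", "Surgery", "Surgery", "InternalMedicine",
            ],
            "change": ["yes", "no", "no", "no", "yes"],
            "readmitted": ["yes", "no", "no", "no", "yes"],
        }
    )


def readmission_rate_by(df: pd.DataFrame, column: str) -> pd.DataFrame:
    """Readmission rate and patient count per category, highest rate first."""
    return (
        df.assign(is_readmitted=df["readmitted"].eq("yes"))
        .groupby(column)["is_readmitted"]
        .agg(rate="mean", n_patients="size")
        .sort_values("rate", ascending=False)
    )


if __name__ == "__main__":
    records = sample_records()
    print(readmission_rate_by(records, "medical_specialty"))
    print(readmission_rate_by(records, "change"))
```

The same helper works for any categorical column, such as `diag_1` or `diabetes_med`.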
Modeling Approach
- Target Variable: readmitted
- Algorithms Used: Logistic Regression, Random Forest, XGBoost
- Evaluation Metrics: Accuracy, Precision, Recall, F1 Score, ROC-AUC
- Top Features: n_inpatient, change, medical_specialty, diag_1, n_medications
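The comparison described above might look roughly like the following scikit-learn sketch. It trains on synthetic data rather than the project dataset, uses default-style hyperparameters that are assumptions rather than the project's settings, and leaves out XGBoost to avoid the extra dependency. The function names are hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (
    accuracy_score,
    f1_score,
    precision_score,
    recall_score,
    roc_auc_score,
)
from sklearn.model_selection import train_test_split


def make_synthetic_data(n_rows=500, seed=0):
    """Synthetic stand-in: readmission odds rise with inpatient visits."""
    rng = np.random.default_rng(seed)
    n_inpatient = rng.poisson(1.0, n_rows)
    n_medications = rng.integers(1, 30, n_rows)
    logits = -1.0 + 0.8 * n_inpatient + 0.02 * n_medications
    prob = 1.0 / (1.0 + np.exp(-logits))
    y = (rng.random(n_rows) < prob).astype(int)
    X = np.column_stack([n_inpatient, n_medications])
    return X, y


def evaluate_models(X, y, seed=0):
    """Fit each model and report the five metrics on a held-out split."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=seed
    )
    models = {
        "logistic_regression": LogisticRegression(max_iter=1000),
        "random_forest": RandomForestClassifier(n_estimators=100, random_state=seed),
    }
    results = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        pred = model.predict(X_test)
        proba = model.predict_proba(X_test)[:, 1]  # probability of readmission
        results[name] = {
            "accuracy": accuracy_score(y_test, pred),
            "precision": precision_score(y_test, pred, zero_division=0),
            "recall": recall_score(y_test, pred, zero_division=0),
            "f1": f1_score(y_test, pred, zero_division=0),
            "roc_auc": roc_auc_score(y_test, proba),
        }
    return results


if __name__ == "__main__":
    X, y = make_synthetic_data()
    for name, metrics in evaluate_models(X, y).items():
        print(name, {k: round(v, 3) for k, v in metrics.items()})
```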
Dashboard Overview
Power BI Dashboard
Visualizes hospital-level readmission metrics:
- Specialty-wise readmission breakdown
- KPI cards for hospital stay, medication count, and readmission rate
- Diagnosis trends and treatment intensity

Streamlit App
Interactive dashboard for real-time patient risk prediction:
- Patient-level readmission summary
- Risk prediction tool based on visit and diagnosis history
- Feature importance visualization
- Filters by specialty, diagnosis, and treatment type
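Before a prediction tool like this can score a patient, the form inputs have to be arranged in the same column order the model was trained on. The sketch below shows one way to do that with a saved feature list. `build_feature_row` is a hypothetical helper, and the `column_value` naming of one-hot columns is an assumption about how the training data was encoded.

```python
def build_feature_row(patient, feature_names, categorical=("medical_specialty", "diag_1")):
    """Return the patient's values ordered to match feature_names.

    Categorical inputs become one-hot columns named "<column>_<value>";
    any training column the patient does not set is filled with 0.
    """
    encoded = {}
    for key, value in patient.items():
        if key in categorical:
            encoded[f"{key}_{value}"] = 1
        else:
            encoded[key] = value
    return [encoded.get(name, 0) for name in feature_names]


if __name__ == "__main__":
    # Hypothetical feature list, as it might be loaded from feature_names.pkl.
    feature_names = [
        "n_inpatient",
        "n_medications",
        "medical_specialty_Cardiology",
        "medical_specialty_Surgery",
    ]
    patient = {"n_inpatient": 2, "n_medications": 12, "medical_specialty": "Surgery"}
    print(build_feature_row(patient, feature_names))
```

The resulting row could then be passed, wrapped in a list, to the loaded model's `predict_proba` to obtain a readmission probability.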



Deployment
- Model serialized with joblib as readmission_model.pkl
- Dashboard deployed via Streamlit Cloud
- SQL integration for dynamic data updates
- Git LFS used for large file management
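The serialization step can be sketched as a joblib round trip. The file names match the project structure, but saving feature_names.pkl with joblib is an assumption (the README only states that the model is), and the toy model and the `save_artifacts` / `load_artifacts` helpers are hypothetical.

```python
import tempfile
from pathlib import Path

import joblib
from sklearn.linear_model import LogisticRegression


def save_artifacts(model, feature_names, directory):
    """Write the model and its feature list next to each other."""
    directory = Path(directory)
    joblib.dump(model, directory / "readmission_model.pkl")
    joblib.dump(feature_names, directory / "feature_names.pkl")


def load_artifacts(directory):
    """Load the model and feature list written by save_artifacts."""
    directory = Path(directory)
    model = joblib.load(directory / "readmission_model.pkl")
    feature_names = joblib.load(directory / "feature_names.pkl")
    return model, feature_names


if __name__ == "__main__":
    # Toy model on a single made-up feature, for illustration only.
    model = LogisticRegression().fit([[0], [1], [2], [3]], [0, 0, 1, 1])
    with tempfile.TemporaryDirectory() as tmp:
        save_artifacts(model, ["n_inpatient"], tmp)
        loaded, names = load_artifacts(tmp)
        print(names, loaded.predict([[0], [3]]))
```

Pickle-based files should only be loaded from trusted sources, and the scikit-learn version used for loading should match the one used for saving.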
Business Impact
- Flags high-risk patients for proactive care
- Improves hospital resource allocation and discharge planning
- Enables real-time readmission monitoring
- Supports data-driven clinical strategy and quality improvement
Tech Stack
- Python: Pandas, NumPy, Scikit-learn, Streamlit
- SQL: Data extraction and transformation
- Visualization: Power BI, Matplotlib, Seaborn
- Deployment: Streamlit Cloud, GitHub, Git LFS
Future Enhancements
- Integrate real-time EHR feeds via APIs
- Add explainability via SHAP or LIME
- Enable user-uploaded patient records for prediction
- Expand dashboard to include treatment outcome forecasting
Author
Anesh Raj
GitHub Profile