🌐 Live Demo: https://heart-attack-prediction-ilk1.onrender.com
ML Internship Project · IntrainTech, Bangalore · Aug–Nov 2023 · Role: Machine Learning Engineer Intern
End-to-end heart attack risk prediction system — from raw clinical data to a live Flask web application and Power BI dashboard. Patients fill a form, the model predicts risk probability, and the app returns personalised lifestyle change recommendations.
Patient fills web form (test.html)
               │
               ▼  POST /predict
┌──────────────────────────────┐
│          server.py           │
│                              │
│  1. Parse form inputs        │
│  2. Scale with StandardScaler│
│  3. model.predict_proba()    │
│  4. determine_lifestyle_     │
│     changes(prob, inputs)    │
│  5. Return JSON response     │
└──────────────┬───────────────┘
               │
               ▼
      result_template.html
      • Risk probability score
      • High / Low risk label
      • Personalised recommendations
        (smoking, BMI, exercise,
         diet, sleep, stress)
| Model | Accuracy |
|---|---|
| Random Forest ✅ | 69.17% |
| Light Gradient Boost | ~67% |
| SVM | ~65% |
| XGBoost | ~64% |
| KNN | ~63% |
| Logistic Regression | ~62% |
| Decision Tree | ~58% |
| Naive Bayes | ~57% |
Random Forest was selected because it achieved the best cross-validated accuracy across 10 folds. All models were evaluated using Accuracy, F1-Score, ROC-AUC, Precision, and Recall.
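Most of the metrics named above follow directly from the confusion-matrix counts. The sketch below is a minimal, dependency-free illustration; the function name `classification_metrics` and the sample labels are hypothetical and not taken from the project notebook, which may use a library such as scikit-learn instead. ROC-AUC is left out because it needs predicted probabilities rather than hard labels.

```python
def classification_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall and F1 for binary labels (1 = high risk)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Made-up labels for illustration only (not project results).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

With imbalanced classes, precision and recall on the high-risk class are usually more informative than accuracy alone, which is one reason to report all of them.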
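Why SMOTE before training? The heart attack risk classes are imbalanced. SMOTE generates synthetic minority-class samples by interpolating between existing minority samples and their nearest neighbours, which keeps the new points within the observed feature ranges and discourages the model from always predicting the majority class.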
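To make the idea concrete, here is a minimal sketch of the interpolation step at the core of SMOTE: a synthetic point is placed at a random position on the line segment between a minority sample and one of its minority-class neighbours. This is not the full algorithm (there is no nearest-neighbour search), and the helper `smote_like_sample` and the patient values are hypothetical. A complete implementation is available as `imblearn.over_sampling.SMOTE` in imbalanced-learn; whether this project uses that class is an assumption.

```python
import random


def smote_like_sample(sample, neighbour, rng):
    """Create one synthetic point on the segment between two minority samples."""
    lam = rng.random()  # interpolation weight in [0, 1)
    return [x + lam * (n - x) for x, n in zip(sample, neighbour)]


rng = random.Random(42)  # fixed seed so the sketch is reproducible

# Two made-up minority-class rows: [cholesterol, BMI, heart rate]
patient_a = [240.0, 31.0, 88.0]
patient_b = [260.0, 29.0, 92.0]

synthetic = smote_like_sample(patient_a, patient_b, rng)
print(synthetic)
```

Oversampling should be applied to the training split only, so that synthetic rows never end up in the data used for evaluation.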
Why StandardScaler? Features such as Cholesterol (100–300), BMI (15–45), and Heart Rate (60–100) span very different ranges. Scaling puts them on a comparable scale so that no single feature dominates distance-based models such as KNN and SVM.
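The transformation itself is simple: each value has the column mean subtracted and is divided by the column's population standard deviation. The sketch below reproduces that arithmetic in plain Python for illustration; the `standardize` helper and the sample readings are hypothetical, and the project itself uses scikit-learn's `StandardScaler`.

```python
from statistics import fmean, pstdev


def standardize(values):
    """Rescale values to zero mean and unit variance using the population std."""
    mean = fmean(values)
    std = pstdev(values)
    if std == 0:
        return [0.0 for _ in values]  # constant column: nothing to scale
    return [(v - mean) / std for v in values]


# Made-up readings within the ranges quoted above.
cholesterol = [150.0, 200.0, 250.0, 300.0]
bmi = [18.0, 24.0, 30.0, 36.0]

scaled_chol = standardize(cholesterol)
scaled_bmi = standardize(bmi)
print(scaled_chol)
print(scaled_bmi)
```

Although the raw columns differ by roughly an order of magnitude, both come out on the same scale after standardization. At prediction time, the mean and standard deviation learned from the training data should be reused rather than recomputed from the incoming form values.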
Why lifestyle recommendations? A risk score alone isn’t actionable. The recommendations engine maps specific input values (Smoking=1, BMI>25, Exercise<1.25h/week) to concrete changes — making the app clinically useful.
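A rule-based mapping of this kind might look like the sketch below. It uses only the three thresholds quoted above; the function name `lifestyle_recommendations`, the dictionary keys, and the message wording are all hypothetical. The real `determine_lifestyle_changes(prob, inputs)` in `server.py` also takes the predicted probability and covers diet, sleep, and stress, whose rules are not documented here and are therefore omitted.

```python
def lifestyle_recommendations(inputs):
    """Map raw input values to suggestions using three threshold rules."""
    recommendations = []
    if inputs["smoking"] == 1:
        recommendations.append("Consider a smoking cessation plan.")
    if inputs["bmi"] > 25:
        recommendations.append("Work towards a BMI of 25 or below.")
    if inputs["exercise_hours_per_week"] < 1.25:
        recommendations.append("Increase exercise to at least 1.25 hours per week.")
    return recommendations


# Made-up patient: smoker, BMI above 25, but exercising enough.
patient = {"smoking": 1, "bmi": 27.4, "exercise_hours_per_week": 3.0}
for tip in lifestyle_recommendations(patient):
    print("-", tip)
```

Keeping the rules as plain conditionals makes each recommendation traceable to the exact input that triggered it.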
Why split Blood Pressure? The raw dataset stores BP as a string such as “120/80”. Splitting it into systolic and diastolic values gives the model two meaningful numeric features instead of one string it cannot use directly.
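A minimal sketch of that split in plain Python is shown below. The helper `split_blood_pressure`, the sample rows, and the column names `Blood Pressure`, `Systolic`, and `Diastolic` are assumptions for illustration and may differ from those used in the notebook.

```python
def split_blood_pressure(reading):
    """Split a 'systolic/diastolic' string such as '120/80' into two integers."""
    systolic, diastolic = reading.strip().split("/")
    return int(systolic), int(diastolic)


# Made-up rows; "Blood Pressure" is an assumed column name.
rows = [{"Blood Pressure": "120/80"}, {"Blood Pressure": "158/88"}]
for row in rows:
    # Replace the string column with two numeric columns.
    row["Systolic"], row["Diastolic"] = split_blood_pressure(row.pop("Blood Pressure"))
print(rows)
```

On a pandas DataFrame, the same result can be obtained column-wise with `df["Blood Pressure"].str.split("/", expand=True)` followed by a conversion to integers.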
Heart Attack Risk Prediction Dataset — 8,763 patient records, 25 features:
| Category | Features |
|---|---|
| Demographics | Age, Sex, Country, Continent, Hemisphere |
| Vitals | BP (Systolic/Diastolic split), Heart Rate, Cholesterol, BMI, Triglycerides |
| Lifestyle | Smoking, Alcohol, Exercise Hrs/Week, Diet, Sedentary Hrs, Stress Level, Sleep Hrs |
| Medical History | Diabetes, Family History, Previous Heart Problems, Obesity, Medication Use |
| Target | Heart Attack Risk (0 = Low, 1 = High) |
heart-attack-prediction/
├── .github/
│   └── workflows/
│       └── ci.yml                          # GitHub Actions CI
├── server.py                               # Flask app: trains model + serves predictions
├── 1.ipynb                                 # Full EDA + 8-model benchmark notebook
├── heart_attack_prediction_dataset.csv     # Dataset (8,763 patient records)
├── Dashboard.pbix                          # Power BI dashboard
├── templates/
│   ├── test.html                           # Patient input form
│   └── result_template.html                # Risk result + lifestyle suggestions
├── requirements.txt
└── README.md
# Clone the repo
git clone https://github.com/samuel-mekala/heart-attack-prediction.git
cd heart-attack-prediction
# Install dependencies
pip install -r requirements.txt
# Run the Flask app
# (model trains automatically on startup — ~10-15 seconds)
python server.py
# Open browser → http://localhost:5000
# To explore EDA and all 8 models:
jupyter notebook 1.ipynb