Project Deep Dive

Flood Prediction ML Model

Python, Scikit-learn, Pandas, XGBoost

About the Project

This project predicts flood probability using supervised machine learning on an environmental dataset sourced from Kaggle. The workflow focuses on robust preprocessing, domain-aware feature engineering, and comparative model evaluation for reliable risk estimation.

Feature engineering added interaction terms such as Rainfall × Deforestation to capture compounding environmental effects that a purely additive linear model would miss.
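As a minimal sketch of what such an interaction term looks like in Pandas: the helper name add_interaction, the column names Rainfall and Deforestation, and the toy values are assumptions for illustration, not taken from the project's preprocess.py or the Kaggle dataset.

```python
import pandas as pd


def add_interaction(df, a, b, name=None):
    """Return a copy of df with an extra column equal to df[a] * df[b].

    Hypothetical helper; the column naming scheme is an assumption.
    """
    out = df.copy()
    out[name or f"{a}_x_{b}"] = out[a] * out[b]
    return out


# Toy rows standing in for the real dataset (values are made up).
sample = pd.DataFrame({"Rainfall": [2, 5, 8], "Deforestation": [3, 4, 6]})
enriched = add_interaction(sample, "Rainfall", "Deforestation")
print(enriched)
```

The product column lets a model give extra weight to rows where both factors are high at once, which a sum of the two separate columns cannot express.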

Evaluation results from the final analysis:

Random Forest R²: 0.763
XGBoost R²: 0.938 (Mean CV: 0.984)
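A comparison of this kind (held-out R² plus a mean cross-validated R²) could be computed roughly as follows. This is a sketch under stated assumptions: the data is synthetic, the split and fold count are arbitrary, and scikit-learn's GradientBoostingRegressor is used as a stand-in for XGBoost so the example depends on one library only. It will not reproduce the scores reported above.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import cross_val_score, train_test_split

# Synthetic stand-in data with one interaction effect (not the Kaggle dataset).
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(300, 4))
y = 0.05 * X[:, 0] * X[:, 1] + 0.1 * X[:, 2] + rng.normal(0, 0.1, size=300)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

models = {
    "Random Forest": RandomForestRegressor(n_estimators=100, random_state=0),
    # Stand-in for XGBoost; an XGBoost regressor could be swapped in here.
    "Gradient Boosting": GradientBoostingRegressor(random_state=0),
}

results = {}
for name, model in models.items():
    # 5-fold cross-validated R² on the training split only.
    cv_scores = cross_val_score(model, X_train, y_train, cv=5, scoring="r2")
    # Fit on the full training split, then score once on the held-out split.
    model.fit(X_train, y_train)
    test_r2 = r2_score(y_test, model.predict(X_test))
    results[name] = {"test_r2": test_r2, "mean_cv_r2": cv_scores.mean()}
    print(f"{name}: test R2 = {test_r2:.3f}, mean CV R2 = {cv_scores.mean():.3f}")
```

Reporting both numbers is useful because a single train/test split can be lucky or unlucky, while the cross-validated mean averages over several splits.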

Project Structure

Flood-Prediction-ML-model/
├── data/
│   └── flood_dataset.csv
├── notebooks/
│   └── model_experiments.ipynb
├── src/
│   ├── preprocess.py
│   └── train.py
└── requirements.txt

Installation & Usage

Installation

pip install -r requirements.txt

Usage

python src/train.py

Model Performance Visualization

Actual vs predicted comparisons for both models from the final analysis pipeline.

[Figure: Random Forest, actual vs. predicted]
[Figure: XGBoost, actual vs. predicted]
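An actual-vs-predicted plot of this kind can be drawn with Matplotlib along the following lines. The function name plot_actual_vs_predicted, the axis labels, the output file name, and the synthetic values are all assumptions for illustration; the project's own plotting code is not shown in the source.

```python
import os
import tempfile

import matplotlib

matplotlib.use("Agg")  # render to a file; no display needed
import matplotlib.pyplot as plt
import numpy as np


def plot_actual_vs_predicted(y_true, y_pred, title, path):
    """Scatter predictions against true values and save the figure to path."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    fig, ax = plt.subplots(figsize=(5, 5))
    ax.scatter(y_true, y_pred, s=12, alpha=0.6)
    # Diagonal reference line: points on it are perfect predictions.
    lo = min(y_true.min(), y_pred.min())
    hi = max(y_true.max(), y_pred.max())
    ax.plot([lo, hi], [lo, hi], linestyle="--", color="gray")
    ax.set_xlabel("Actual flood probability")
    ax.set_ylabel("Predicted flood probability")
    ax.set_title(title)
    fig.tight_layout()
    fig.savefig(path, dpi=150)
    plt.close(fig)
    return path


# Made-up values standing in for a model's test-set predictions.
rng = np.random.default_rng(0)
actual = rng.uniform(0.3, 0.7, size=100)
predicted = actual + rng.normal(0, 0.02, size=100)

out_path = os.path.join(tempfile.mkdtemp(), "actual_vs_predicted.png")
saved = plot_actual_vs_predicted(actual, predicted, "Actual vs Predicted", out_path)
print(saved)
```

The closer the points sit to the dashed diagonal, the better the fit; a systematic bend away from it suggests the model over- or under-predicts in part of the range.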