House Price Prediction — Jabodetabek
End-to-end ML pipeline predicting house prices in the Jabodetabek metro area. Random Forest model deployed behind a Flask API; I owned the data preprocessing stage.
- Data Preprocessing Contributor
- Feb – May 2025
- shipped
Python Scikit-Learn Pandas Next.js Flask
Context
Class project, team of 3. Real-estate pricing in Jabodetabek is extremely heterogeneous: homes in the same area with the same square footage can differ wildly in price. The goal was a data-driven price estimator that gives a fast, objective baseline, trained on a noisy scraped dataset.
What I built
- Owned the data preprocessing stage: missing-value handling, duplicate removal, outlier treatment.
- Cleaned the raw dataset from 3,500+ rows down to 2,397 high-quality rows without losing signal.
- Worked end-to-end with the team through model tuning, evaluation, and deployment into a Flask API behind a Next.js front-end.
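The three preprocessing steps listed above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the column names (`price`, `land_area`), the median imputation, and the 1.5 × IQR outlier rule are all assumptions chosen for the example.

```python
import pandas as pd


def clean_listings(df: pd.DataFrame, price_col: str = "price") -> pd.DataFrame:
    """Drop duplicates, handle missing values, and trim price outliers.

    Hypothetical helper: the strategies below are assumed, not taken
    from the original project.
    """
    # 1. Duplicate removal: scraped listings are often posted more than once.
    out = df.drop_duplicates().copy()

    # 2. Missing values: a row without a target price cannot be used for
    #    supervised training, so drop it.
    out = out.dropna(subset=[price_col])

    #    Fill remaining numeric gaps with the column median (assumed strategy).
    numeric_cols = out.select_dtypes(include="number").columns.drop(price_col)
    out[numeric_cols] = out[numeric_cols].fillna(out[numeric_cols].median())

    # 3. Outlier treatment: keep prices inside the 1.5 * IQR fences
    #    (assumed rule).
    q1, q3 = out[price_col].quantile([0.25, 0.75])
    iqr = q3 - q1
    in_range = out[price_col].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)
    return out[in_range].reset_index(drop=True)
```

Calling `clean_listings(raw_df)` on a scraped table returns a smaller frame with no duplicate rows, no missing numeric values, and no extreme prices. The order matters: deduplicating first keeps repeated listings from skewing the medians and quartiles computed in the later steps.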
Stack
Python · Scikit-Learn · Pandas · Next.js · Flask · Git
Outcome
Final Random Forest regressor: R² ≈ 0.85. Shipped as a full interactive web app. The cleaning pass was the difference between a useless model and a useful one.
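For readers unfamiliar with how an R² figure like this is obtained, here is a minimal sketch of fitting a Random Forest regressor and scoring it on a held-out split with scikit-learn. The data is synthetic and the hyperparameters (`n_estimators=100`, a 20% test split) are assumptions for illustration, so the score it produces says nothing about the project's real result.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split


def fit_and_score(X, y, seed: int = 0):
    """Fit a Random Forest on a train split and return (model, test R^2)."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=seed
    )
    model = RandomForestRegressor(n_estimators=100, random_state=seed)
    model.fit(X_train, y_train)
    # R^2 is computed on rows the model never saw during fitting.
    return model, r2_score(y_test, model.predict(X_test))


# Synthetic stand-in for the cleaned housing features (not real data).
rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(500, 3))
y = 3 * X[:, 0] + 2 * X[:, 1] ** 2 + rng.normal(0, 0.05, size=500)

model, r2 = fit_and_score(X, y)
print(f"held-out R^2: {r2:.3f}")
```

Scoring on a held-out split rather than on the training rows is what makes the number meaningful: a Random Forest can fit its training data almost perfectly, so a training-set R² would overstate how well the model generalizes.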