
Predicting House Price of Delhi region in a particular area/colony.


Shubhamag12/House-Price-Prediction

House-Price-Prediction

Overview

This is an ML-based application that predicts house prices in the Delhi region from the features provided (such as area and number of bedrooms).

For testing using sample inputs, click here

Model Used | Accuracy
Linear Regressor | 81.705317 (Cost)
Ordinary Least Squares | 92.9 (R-squared value)

File Description:

  • Delhi_load.ipynb : Application interface for users
  • CSV File/ Delhi.csv : Dataset
  • Codes/ Delhi.ipynb : Main notebook containing the data processing and model training
  • Codes/ ols_results_delhi.pickle : Pickled OLS model after training on the dataset

Problem Statement

Given a set of features, predict the price of any given house in the Delhi region.

Dataset and libraries used

Dataset Summary
  1. Reference link for dataset : click here
  2. df.info()

[Image: dataframe summary from df.info()]

Libraries Used
  1. Pandas
  2. Statsmodels
  3. scikit-learn (sklearn)
  4. Numpy
  5. Seaborn
  6. Matplotlib
  7. Google.colab

Feature analysis

Outliers
  1. Price

Raw Data: [Image: Price outliers] | Rectified Data: [Image: Corrected Price]

  2. Area

Raw Data: [Image: Area outliers] | Rectified Data: [Image: Corrected Area]

  3. Price per sq. foot

Raw Data: [Image: Price/sq. foot outliers] | Rectified Data: [Image: Corrected Price/sq. foot]
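
The README does not state which rule was used to rectify the outliers above. As a minimal sketch, assuming the common 1.5 × IQR fence rule (an assumption, not confirmed by the source), rows can be filtered on price, area, and price per sq. foot as follows. The function name `iqr_mask` and the toy numbers are hypothetical.

```python
import numpy as np

def iqr_mask(values, k=1.5):
    """Return a boolean mask that is True for values inside the IQR fences."""
    values = np.asarray(values, dtype=float)
    q1, q3 = np.percentile(values, [25, 75])
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return (values >= lower) & (values <= upper)

# Made-up toy data (not taken from Delhi.csv); units are arbitrary.
area = np.array([800, 900, 1000, 1100, 1200, 1300, 1400, 9000], dtype=float)
price = np.array([40, 47, 55, 58, 66, 70, 80, 85], dtype=float)
price_per_sqft = price / area

# Keep a row only if it is inside the fences for all three columns.
keep = iqr_mask(price) & iqr_mask(area) & iqr_mask(price_per_sqft)
clean_area = area[keep]
clean_price = price[keep]

print(f"kept {keep.sum()} of {len(keep)} rows")
```

With a pandas DataFrame, the same mask could be applied column by column before fitting the model.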

New Features

All the features provided can be reframed into the format {Area, AttributeScore, Resale, LogPremium, Bedrooms}:

  1. Area = Floor area of the property
  2. AttributeScore = An integer score based on features such as the number of bedrooms, gym facility, etc.
  3. Resale = A binary value denoting whether the property is first-hand (0) or a resale (1)
  4. LogPremium = An integer value derived from the price per sq. foot
  5. Bedrooms = Number of bedrooms in the property

Final Correlation Matrix: [Image: plot of correlation matrix]
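
For readers who want to reproduce a correlation matrix like the one plotted above, here is a minimal sketch using NumPy. The rows are made-up toy values, not taken from the dataset, and only three of the columns are shown. Plotting (for example as a Seaborn heatmap) is left out.

```python
import numpy as np

# Hypothetical toy rows; columns are Area, Bedrooms, Price.
names = ["Area", "Bedrooms", "Price"]
data = np.array([
    [800.0, 2.0, 40.0],
    [1000.0, 2.0, 55.0],
    [1200.0, 3.0, 66.0],
    [1500.0, 3.0, 80.0],
    [2000.0, 4.0, 120.0],
])

# rowvar=False treats each column as a variable and each row as an observation.
corr = np.corrcoef(data, rowvar=False)

for name, row in zip(names, corr):
    print(f"{name:>9}", np.round(row, 3))
```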

Model selection

Two models were considered the best candidates; their predictions are depicted below:

Ordinary Least Squares | Linear Regressor
[Image: OLS Plot] | [Image: Linear Regression Plot]
92.9 (R-squared value) | 81.705317 (Score)

Results

The trained OLS model achieves an R-squared value of 92.9%. The model summary is provided below.

[Image: OLS Model Summary]
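
To make the R-squared figure concrete, the sketch below fits an ordinary least squares model and computes R-squared by hand. It uses `numpy.linalg.lstsq` instead of the Statsmodels OLS class used in the repository, and the helper names (`fit_ols`, `predict`, `r_squared`) and toy data are hypothetical, so the number it prints is unrelated to the 92.9 reported above.

```python
import numpy as np

def fit_ols(X, y):
    """Fit y ~ intercept + X @ coefs by least squares; return [intercept, coefs...]."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    design = np.column_stack([np.ones(len(X)), X])  # prepend intercept column
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta

def predict(beta, X):
    """Apply fitted coefficients to a 2-D feature array."""
    X = np.asarray(X, dtype=float)
    return beta[0] + X @ beta[1:]

def r_squared(y_true, y_pred):
    """R-squared = 1 - (residual sum of squares / total sum of squares)."""
    y_true = np.asarray(y_true, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

# Made-up toy data: columns are Area and Bedrooms; y is the price.
X = np.array([[800, 2], [1000, 2], [1200, 3], [1500, 3], [2000, 4]], dtype=float)
y = np.array([40, 55, 66, 80, 120], dtype=float)

beta = fit_ols(X, y)
r2 = r_squared(y, predict(beta, X))
print(f"R-squared on the toy data: {r2:.3f}")
```

R-squared is a fraction between 0 and 1 for a model with an intercept evaluated on its training data, so a value reported as 92.9 corresponds to 0.929.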
