rishig47-dev/crime-rate-prediction


🚨 Crime Rate Prediction Using Machine Learning

An end-to-end Machine Learning project that predicts crime rates based on socio-economic and regional factors using Random Forest Regression. This project demonstrates how data-driven approaches can support smart cities, law enforcement, and urban planning.


📌 Table of Contents

  • Overview
  • Problem Statement
  • Objectives
  • Features
  • Project Structure
  • Dataset Description
  • Tech Stack
  • Machine Learning Model
  • Workflow
  • Installation & Setup
  • How to Run the Project
  • Results & Evaluation Metrics
  • Applications
  • Limitations
  • Future Enhancements
  • Author

🔍 Overview

Crime rate analysis is a critical component of public safety and urban development. Traditional crime analysis methods rely heavily on manual interpretation of historical data. This project uses Machine Learning (ML) techniques to predict crime rate trends by learning patterns from historical and demographic data.

The model helps in:

  • Identifying high-risk areas
  • Supporting law enforcement planning
  • Improving public safety decision-making

This project is suitable for:

  • MTech / BTech students
  • Machine Learning & Data Science courses
  • Academic mini / major projects
  • Research and demonstrations

❓ Problem Statement

Crime patterns depend on multiple dynamic factors such as population density, unemployment, and previous crime history. Manual prediction is inefficient and error-prone.

The goal of this project is to build a regression-based ML model that can accurately predict crime rate using historical and socio-economic data.


🎯 Objectives

  • To generate and analyze crime-related data
  • To preprocess categorical and numerical features
  • To build a Machine Learning regression model
  • To evaluate model performance using standard metrics
  • To predict crime rate for new unseen data
  • To demonstrate ML applications in social domains

✨ Features

  • Synthetic crime dataset generation
  • Data preprocessing and encoding
  • Random Forest Regression model
  • Performance evaluation (MAE, RMSE, R²)
  • Prediction on new inputs
  • Model saving for future use

📂 Project Structure

Crime-Rate-Prediction/
│
├── crime_rate.ipynb              # Jupyter Notebook (main implementation)
├── crime_dataset.csv             # Generated dataset
├── crime_rate_prediction_model.pkl  # Saved trained model
├── README.md                     # Project documentation
└── requirements.txt              # Required Python libraries

🗂 Dataset Description

The dataset used in this project is synthetically generated to simulate real-world crime data.

Input Features:

  • Area – Type of area (Downtown, Residential, Industrial, etc.)
  • Year – Year of record
  • Month – Month of record
  • Population_Density – Population per unit area
  • Unemployment_Rate – Percentage of unemployed people
  • Police_Stations – Number of police stations
  • Previous_Crime_Count – Historical crime count

Target Variable:

  • Crime_Rate – Continuous value representing crime intensity
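A minimal sketch of how a table with these columns could be generated. The README does not describe the actual generation logic, so the value ranges, the three area names, the helper name `make_crime_dataset`, and the formula behind `Crime_Rate` are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd


def make_crime_dataset(n_rows: int = 500, seed: int = 42) -> pd.DataFrame:
    """Build a synthetic table with the columns listed in this section.

    Hypothetical helper: ranges and the target formula are illustrative only.
    """
    rng = np.random.default_rng(seed)
    areas = ["Downtown", "Residential", "Industrial"]  # illustrative subset

    df = pd.DataFrame({
        "Area": rng.choice(areas, size=n_rows),
        "Year": rng.integers(2015, 2025, size=n_rows),        # 2015..2024
        "Month": rng.integers(1, 13, size=n_rows),            # 1..12
        "Population_Density": rng.uniform(500, 20000, size=n_rows),
        "Unemployment_Rate": rng.uniform(2.0, 15.0, size=n_rows),
        "Police_Stations": rng.integers(1, 11, size=n_rows),  # 1..10
        "Previous_Crime_Count": rng.integers(0, 500, size=n_rows),
    })

    # Assumed generating rule: an arbitrary linear mix of features plus noise,
    # clipped at zero so the target stays non-negative.
    noise = rng.normal(0.0, 2.0, size=n_rows)
    df["Crime_Rate"] = (
        0.001 * df["Population_Density"]
        + 1.5 * df["Unemployment_Rate"]
        - 0.8 * df["Police_Stations"]
        + 0.05 * df["Previous_Crime_Count"]
        + noise
    ).clip(lower=0.0)
    return df


data = make_crime_dataset()
print(data.head())
```

Fixing the seed makes the generated table reproducible, which helps when comparing model runs.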

🧰 Tech Stack

  • Programming Language: Python

  • Libraries:

    • NumPy
    • Pandas
    • Scikit-learn
    • Matplotlib
    • Joblib
  • Platform: Jupyter Notebook / Google Colab


🧠 Machine Learning Model

Random Forest Regressor

Random Forest is an ensemble learning algorithm that:

  • Combines multiple decision trees
  • Reduces overfitting relative to a single decision tree by averaging many trees trained on bootstrap samples
  • Handles non-linear relationships efficiently

Model Parameters:

  • Number of estimators: 100
  • Random state: 42
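As a rough illustration, the two parameters above map directly onto scikit-learn's `RandomForestRegressor`. The toy data below is invented for the example, and all other hyperparameters are assumed to stay at the library defaults. The last lines check that the forest's prediction is the average of its individual trees' predictions.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Toy non-linear data, invented purely for illustration.
rng = np.random.default_rng(0)
X = rng.uniform(-3.0, 3.0, size=(300, 2))
y = np.sin(X[:, 0]) + X[:, 1] ** 2

# The two parameters stated above; everything else uses scikit-learn defaults.
model = RandomForestRegressor(n_estimators=100, random_state=42)
model.fit(X, y)

# A regression forest predicts the mean of its trees' predictions.
sample = X[:5]
per_tree = np.stack([tree.predict(sample) for tree in model.estimators_])
forest_pred = model.predict(sample)
print(np.allclose(per_tree.mean(axis=0), forest_pred))
```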

🔄 Workflow

  1. Generate synthetic crime dataset
  2. Perform data preprocessing
  3. Encode categorical variables
  4. Split data into training and testing sets
  5. Train Random Forest Regression model
  6. Evaluate model performance
  7. Predict crime rate for new inputs
  8. Save trained model and dataset
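Steps 2 to 7 above could be sketched as a single scikit-learn pipeline. This is one possible arrangement, not necessarily what the notebook does: one-hot encoding for `Area`, the 80/20 split, and the stand-in data values are assumptions.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# Stand-in data with made-up values (step 1 would normally supply this).
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "Area": rng.choice(["Downtown", "Residential", "Industrial"], size=n),
    "Year": rng.integers(2015, 2025, size=n),
    "Month": rng.integers(1, 13, size=n),
    "Population_Density": rng.uniform(500, 20000, size=n),
    "Unemployment_Rate": rng.uniform(2.0, 15.0, size=n),
    "Police_Stations": rng.integers(1, 11, size=n),
    "Previous_Crime_Count": rng.integers(0, 500, size=n),
})
df["Crime_Rate"] = (
    1.5 * df["Unemployment_Rate"]
    + 0.05 * df["Previous_Crime_Count"]
    + rng.normal(0.0, 1.0, size=n)
)

X = df.drop(columns="Crime_Rate")
y = df["Crime_Rate"]

# Steps 2-3: one-hot encode the categorical column, pass numeric ones through.
preprocess = ColumnTransformer(
    transformers=[("area", OneHotEncoder(handle_unknown="ignore"), ["Area"])],
    remainder="passthrough",
)
pipeline = Pipeline([
    ("preprocess", preprocess),
    ("model", RandomForestRegressor(n_estimators=100, random_state=42)),
])

# Step 4: hold out 20% of rows for testing (assumed ratio).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Step 5: train.
pipeline.fit(X_train, y_train)

# Step 7: predict for one new, unseen row (values are made up).
new_row = pd.DataFrame([{
    "Area": "Downtown",
    "Year": 2025,
    "Month": 6,
    "Population_Density": 12000.0,
    "Unemployment_Rate": 7.5,
    "Police_Stations": 3,
    "Previous_Crime_Count": 240,
}])
new_prediction = pipeline.predict(new_row)
print(new_prediction)
```

Keeping the encoder inside the pipeline means new inputs are encoded the same way as the training data, without a separate manual encoding step.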

⚙️ Installation & Setup

Clone the repository:

git clone https://github.com/rishig47-dev/crime-rate-prediction.git
cd crime-rate-prediction

Install dependencies:

pip install -r requirements.txt

▶️ How to Run the Project

  1. Open crime_rate.ipynb in Jupyter Notebook or Google Colab
  2. Run all cells sequentially
  3. Observe dataset generation
  4. Train the ML model
  5. View evaluation metrics
  6. Test predictions on new data
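Outside the notebook, a saved model can be reloaded with Joblib and reused for predictions. The sketch below shows a save/load round trip on a toy model. It writes to a temporary directory so it does not touch the repository's `crime_rate_prediction_model.pkl`, whose exact contents and expected input columns are not documented here.

```python
import tempfile
from pathlib import Path

import joblib
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Toy data, invented for illustration only.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(100, 3))
y = X @ np.array([2.0, -1.0, 0.5])

model = RandomForestRegressor(n_estimators=100, random_state=42).fit(X, y)

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "crime_rate_prediction_model.pkl"
    joblib.dump(model, path)      # save the fitted model to disk
    restored = joblib.load(path)  # load it back as a new object

# The restored model should reproduce the original predictions.
same = np.array_equal(model.predict(X[:10]), restored.predict(X[:10]))
print(same)
```

Pickled scikit-learn models are generally only safe to load with the same library version that produced them, so pinning versions in requirements.txt is a sensible precaution.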

📊 Results & Evaluation Metrics

The model is evaluated using:

  • MAE (Mean Absolute Error)
  • RMSE (Root Mean Squared Error)
  • R² Score

Lower MAE and RMSE with higher R² indicate better prediction performance.
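A small worked example of the three metrics, using scikit-learn and four made-up values. The numbers are not results from this project. RMSE is taken as the square root of the mean squared error, and R² is recomputed by hand as 1 − SS_res / SS_tot to show what the score measures.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

# Made-up targets and predictions, for illustration only.
y_true = np.array([10.0, 12.0, 15.0, 20.0])
y_pred = np.array([11.0, 12.0, 13.0, 22.0])

mae = mean_absolute_error(y_true, y_pred)           # mean of |error|
rmse = np.sqrt(mean_squared_error(y_true, y_pred))  # sqrt of mean squared error
r2 = r2_score(y_true, y_pred)

# R² by hand: 1 - (residual sum of squares / total sum of squares).
ss_res = np.sum((y_true - y_pred) ** 2)
ss_tot = np.sum((y_true - y_true.mean()) ** 2)
r2_manual = 1.0 - ss_res / ss_tot

print(f"MAE={mae:.2f} RMSE={rmse:.2f} R2={r2:.4f}")
# prints "MAE=1.25 RMSE=1.50 R2=0.8414"
```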


🏙 Applications

  • Smart city crime monitoring
  • Police resource allocation
  • Urban planning and safety analysis
  • Crime trend forecasting
  • Academic research

⚠️ Limitations

  • Dataset is synthetic, not real-world data, so the reported metrics describe fit to the generated data rather than real crime patterns
  • Limited number of features
  • Does not include real-time crime data

🚀 Future Enhancements

  • Use real government crime datasets
  • Add time-series forecasting models
  • Integrate GIS-based crime mapping
  • Use deep learning models
  • Deploy as a web-based dashboard

👨‍💻 Author

Galla Rishi
MTech – Robotics / AI & Machine Learning


⭐ Acknowledgment

If you find this project useful, please ⭐ the repository.

