💼 Salary Predictor

A full-stack salary prediction pipeline that integrates Python, SQLite, Julia, Go, SQL, and Bash. It was built to demonstrate working knowledge of ML pipelines, API development, and multi-language system design, and to learn Julia and Go, two languages I picked up the same day I built this. The model is trained with a fixed random seed so that predictions are more consistent between runs, although small variations may still occur.

🚀 Features

🐍 Python – Ingests and populates salary data into an SQLite database.
🗃️ SQLite – Lightweight relational DB used for structured storage.
🧠 Julia (MLJ) – Trains a regression model to predict salary using job_title, experience_level, and location.
💻 Go – Serves a lightweight HTTP API with HTML form and JSON endpoint.
🌐 HTML – Clean UI served from a static folder via Go.
🖥️ Bash – End-to-end pipeline runner (run_pipeline.sh).
🎯 RMSE on test set: $43,664

🧠 Model Details

The model is a DecisionTreeRegressor from the MLJ.jl ecosystem. It was chosen for its simplicity, interpretability, and ability to handle categorical features without manual one-hot encoding.

Preprocessing:

Converted job_title, experience_level, and location columns to categorical types in Julia.
No feature scaling was required due to the nature of the tree-based model.

Hyperparameters:

Used default parameters (max_depth = -1, etc.) to keep the model configuration lightweight.
Random seed set for reproducibility, though slight variation in predictions may still occur.
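
As an illustration of what a fixed seed does and does not guarantee, here is a minimal Python sketch of a seeded train/test split. It is not the project's Julia code: the function name, the 20% test fraction, and the seed value 42 are assumptions made for the example. Anything that is not driven by the seeded generator stays outside its control, which is one way run-to-run variation can remain.

```python
import random


def train_test_split(rows, test_fraction=0.2, seed=42):
    """Shuffle a copy of `rows` with a private, seeded RNG and split it.

    Hypothetical helper: the real pipeline does its splitting in Julia.
    """
    rng = random.Random(seed)  # private generator; global random state is untouched
    shuffled = list(rows)      # copy so the caller's list is not reordered
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


rows = list(range(10))
train_a, test_a = train_test_split(rows, seed=42)
train_b, test_b = train_test_split(rows, seed=42)

# The same seed and the same input give the same split.
print(train_a == train_b and test_a == test_b)
```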

🧮 Baseline Comparison

To benchmark model performance, a simple DecisionTreeRegressor was trained in Python using scikit-learn on the same dataset and features (job_title, experience_level, company_location).

Python (scikit-learn) baseline results:

RMSE: $49,820
MAE: $37,564
R²: 0.37
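
For reference, the three metrics listed above can be computed as in the sketch below. This is a plain-Python illustration of the standard definitions, not the script that produced the reported numbers, and the sample salaries are made up.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return RMSE, MAE, and R² for two equal-length lists of numbers."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    mae = sum(abs(e) for e in errors) / n   # mean absolute error
    ss_res = sum(e * e for e in errors)     # residual sum of squares
    rmse = math.sqrt(ss_res / n)            # root mean squared error

    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    r2 = 1 - ss_res / ss_tot                # coefficient of determination

    return {"rmse": rmse, "mae": mae, "r2": r2}


# Made-up values, for illustration only.
y_true = [100_000, 150_000, 200_000]
y_pred = [110_000, 140_000, 230_000]
metrics = regression_metrics(y_true, y_pred)
print(metrics)
```

RMSE penalizes large misses more heavily than MAE, which is why the RMSE figure is the larger of the two.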

The Julia model (MLJ.jl) achieved an RMSE of $43,664, roughly 12% lower than the scikit-learn baseline's $49,820, with both models trained on the same dataset and features.

The baseline gives Python developers a familiar point of reference and suggests that the Julia model performs at least as well as a comparable scikit-learn decision tree.

📂 Project Structure

salary-predictor/
│
├── data/                  # Raw CSV from Kaggle
├── Julia/                 # Model training in Julia
├── go-api/                # Go web server and HTML UI
│   └── static/index.html
├── Screenshots/           # UI screenshots
├── init.sql               # DB schema
├── run_pipeline.sh        # End-to-end bash runner
├── salary.db              # SQLite database
└── utl.py                 # Python ETL script

🗃️ Database Schema

Table: predictions

Column	Type	Description
job_title	TEXT	Role title
experience_level	TEXT	Entry / Mid / Senior
location	TEXT	Country or region code
predicted_salary	INTEGER	Model-predicted salary
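
A minimal sketch of how a row could be written to and read from this table with Python's built-in sqlite3 module. The column names and types come from the table above; the helper name, the in-memory database, and the sample values are assumptions for illustration, and init.sql remains the authoritative schema (it may define constraints not shown here).

```python
import sqlite3

# Column names and types mirror the schema table above.
SCHEMA = """
CREATE TABLE IF NOT EXISTS predictions (
    job_title        TEXT,
    experience_level TEXT,
    location         TEXT,
    predicted_salary INTEGER
);
"""


def save_prediction(conn, job_title, experience_level, location, predicted_salary):
    """Insert one prediction (hypothetical helper); the salary is rounded to an integer."""
    conn.execute(
        "INSERT INTO predictions "
        "(job_title, experience_level, location, predicted_salary) "
        "VALUES (?, ?, ?, ?)",  # placeholders keep user input out of the SQL text
        (job_title, experience_level, location, int(round(predicted_salary))),
    )
    conn.commit()


# An in-memory database keeps the example from touching salary.db.
conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
save_prediction(conn, "Data Scientist", "Senior", "US", 142_500.4)  # made-up values

rows = conn.execute(
    "SELECT job_title, experience_level, location, predicted_salary FROM predictions"
).fetchall()
print(rows)
```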

🧪 Try It Locally

1. Clone and run:

git clone https://github.com/daniel-mehta/Salary-Predictor.git
cd Salary-Predictor
chmod +x run_pipeline.sh
./run_pipeline.sh

2. Open the browser:

Go to http://localhost:8080, fill in the dropdowns, and click Predict Salary.
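
The JSON endpoint can also be called from a script instead of the form. The sketch below is a guess at what such a call might look like: the /predict route, the POST method, and the JSON field names are all assumptions (the field names are borrowed from the database columns), so check the code in go-api for the real contract before relying on it.

```python
import json
from urllib import request

BASE_URL = "http://localhost:8080"  # address given in this README
PREDICT_PATH = "/predict"           # ASSUMPTION: hypothetical route, not confirmed by the README


def build_predict_request(job_title, experience_level, location):
    """Build (but do not send) a JSON POST request for one prediction."""
    payload = {
        "job_title": job_title,              # ASSUMPTION: field names mirror the DB columns
        "experience_level": experience_level,
        "location": location,
    }
    return request.Request(
        BASE_URL + PREDICT_PATH,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_prediction(req, timeout=5):
    """Send the request and decode the JSON reply; needs the Go server running."""
    with request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


req = build_predict_request("Data Scientist", "Senior", "US")
print(req.get_method(), req.full_url)
```

With the pipeline running, fetch_prediction(req) would send the request. Building and sending are kept separate so the request shape can be inspected without a live server.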

📊 Dataset

Kaggle – Data Science Salaries

🛠️ Built With

Language	Purpose
Python	Data ingestion & DB fill
SQL	DB schema/querying
Julia	ML model training
Go	API + web frontend
Bash	Full pipeline automation
HTML/CSS	User interface (form)

📄 License

MIT — free to use, modify, or extend.
