forked from yass-78/Osteoporosis-Risk-Prediction
-
Notifications
You must be signed in to change notification settings - Fork 0
Logistic regression model for predicting osteoporosis risk based on clinical and lifestyle variables. Includes data preprocessing, performance evaluation (ROC, Precision-Recall curves), and feature importance analysis. Developed in R for clarity and reproducibility.
Gos2two/random-M81
Folders and files
Name | Name | Last commit message | Last commit date | |
---|---|---|---|---|
Repository files navigation
# Osteoporosis Prediction Project This project aims to develop a prediction system for osteoporosis using Logistic Regression and Random Forest models. The system predicts osteoporosis risk based on a variety of clinical and lifestyle factors. Below is a detailed explanation of the key components and scripts in this project. ## Documents Overview **Treball_Final_Bioinfo.R** This script contains the entire process of training and evaluating a Logistic Regression model for osteoporosis prediction. The key steps include: 1. Package Installation 2. Dataset Preparation: - Loads the `osteoporosis.csv` dataset, removes irrelevant columns, handles missing data, and converts categorical variables into factors. 3. Model Training: - Trains a Logistic Regression model using 70% of the dataset (training set) and evaluates it using confusion matrix, ROC curve, and Precision-Recall curve. 4. Feature Importance Evaluation: - A Random Forest model is trained to identify the most influential features for predicting osteoporosis. 5. Model Saving: - Saves the trained Logistic Regression and Random Forest models for future use. **API.R** This script sets up a **REST API** using the plumber package to provide osteoporosis predictions based on the trained Logistic Regression model. The key steps are: 1. Library Loading 2. API Endpoint Definition: - Defines the predict_logistic endpoint that accepts user inputs such as gender, age, family history, physical activity, etc., and returns the probability and prediction (Yes/No) of osteoporosis. 3. Input Format: - Accepts both categorical (e.g., gender, family history) and numeric (e.g., age) input variables for making predictions. 4. Prediction: - Uses the trained Logistic Regression model to make predictions based on the provided inputs. **Run_API.R** This script runs the API defined in 'API.R' on a local server. --- ## How to Run the Project 1. **Train the model**: - Run 'Treball_Final_Bioinfo.R' to train the Logistic Regression. Make sure the dataset (osteoporosis.csv) is placed correctly at the specified path. 2. **Run the API**: - Run 'API.R' to load the trained Logistic Regression model and set up the API with the predict_logistic` endpoint. 3. **Start the server**: - Run 'Run_API.R' to start the server and make the API available on port 8000. --- ## Required Packages The following R packages are required to run the project: - `plumber`: For creating REST APIs in R. - `caret`: For building predictive models. - `dplyr`: For data manipulation. - `PRROC`: For Precision-Recall curve creation. - `pROC`: For ROC curve creation and evaluation. - `smotefamily`: For handling imbalanced data through SMOTE. - `randomForest`: For building Random Forest models.
About
Logistic regression model for predicting osteoporosis risk based on clinical and lifestyle variables. Includes data preprocessing, performance evaluation (ROC, Precision-Recall curves), and feature importance analysis. Developed in R for clarity and reproducibility.
Resources
Stars
Watchers
Forks
Releases
No releases published
Packages 0
No packages published
Languages
- R 100.0%