This project implements Decision Tree Classification to predict drug types from patient attributes. Using scikit-learn, it explores how decision trees handle categorical and numerical data.
- Data Preparation: Processes a healthcare dataset containing patient characteristics (age, sex, blood pressure, and cholesterol levels) and drug types.
- Decision Tree Implementation: Builds and visualizes decision tree models for classification tasks.
- Performance Evaluation: Measures the accuracy of the trained model to ensure reliable predictions.
- Tree Visualization: Demonstrates the structure of the decision tree for interpretability.
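
As a rough illustration of the data preparation step, the sketch below encodes categorical patient attributes as numbers so that a scikit-learn decision tree can consume them. The column names (`Age`, `Sex`, `BP`, `Cholesterol`), the category values, and the rows themselves are assumptions invented for this example; the notebook's actual dataset and encoding may differ.

```python
import pandas as pd
from sklearn.preprocessing import OrdinalEncoder

# Hypothetical rows, invented for illustration; not the project's dataset.
patients = pd.DataFrame({
    "Age": [23, 47, 61, 35],
    "Sex": ["F", "M", "M", "F"],
    "BP": ["HIGH", "LOW", "NORMAL", "HIGH"],
    "Cholesterol": ["HIGH", "NORMAL", "HIGH", "NORMAL"],
})

categorical = ["Sex", "BP", "Cholesterol"]

# Spell out the category order (assumed) so the integer codes are predictable.
encoder = OrdinalEncoder(categories=[
    ["F", "M"],                # Sex: F -> 0, M -> 1
    ["LOW", "NORMAL", "HIGH"], # BP: LOW -> 0, NORMAL -> 1, HIGH -> 2
    ["NORMAL", "HIGH"],        # Cholesterol: NORMAL -> 0, HIGH -> 1
])

# Replace the text columns with their numeric codes; Age stays untouched.
encoded = patients.copy()
encoded[categorical] = encoder.fit_transform(patients[categorical])

print(encoded)
```

Ordinal codes are one reasonable choice for a tree, since splits only compare values against thresholds; one-hot encoding is a common alternative.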
- Python 3.8+
- Libraries: numpy, pandas, matplotlib, seaborn, sklearn (installed from PyPI as scikit-learn)
Install dependencies using:
pip install -r requirements.txt
- Clone the repository:

      git clone https://github.com/AbdullahAlForman/Decision-Trees-Drug-Classification.git
      cd Decision-Trees-Drug-Classification

- Open the notebook:

      jupyter notebook Class-Decision-Trees-drug.ipynb

- Execute the cells step-by-step to build the model and analyze results.
- Exploratory Data Analysis (EDA): Analyzes the features and their relationships with drug types.
- Data Preprocessing: Encodes categorical variables and splits data into training and testing sets.
- Model Training: Trains a decision tree classifier using the sklearn library.
- Evaluation and Visualization: Evaluates model performance and visualizes the decision tree structure.
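
The workflow above might look roughly like the following sketch: split the data, fit a decision tree, and measure accuracy on the held-out set. Everything here is an assumption made for illustration: the data is synthetic, the labelling rule and the drug names (`drugA`, `drugB`, `drugC`) are made up, and the hyperparameters (`criterion="entropy"`, `max_depth=4`, a 70/30 split) are not taken from the notebook.

```python
import numpy as np
import pandas as pd
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic, already-encoded features (hypothetical stand-in for the dataset).
rng = np.random.default_rng(0)
n_samples = 200
X = pd.DataFrame({
    "Age": rng.integers(18, 75, size=n_samples),
    "Sex": rng.integers(0, 2, size=n_samples),
    "BP": rng.integers(0, 3, size=n_samples),
    "Cholesterol": rng.integers(0, 2, size=n_samples),
})

# Made-up labelling rule so the example has something to learn.
y = np.where(X["BP"] == 2, "drugA",
             np.where(X["Age"] > 50, "drugB", "drugC"))

# Hold out 30% of the rows for testing, keeping class proportions similar.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)

# Fit the classifier on the training portion only.
model = DecisionTreeClassifier(criterion="entropy", max_depth=4, random_state=0)
model.fit(X_train, y_train)

# Evaluate on rows the model has not seen.
predictions = model.predict(X_test)
accuracy = accuracy_score(y_test, predictions)
print(f"Test accuracy: {accuracy:.3f}")
```

Fixing `random_state` in both the split and the classifier keeps the run reproducible, which makes it easier to compare results after changing a preprocessing step.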
- Achieved a high classification accuracy on the test set.
- Generated an intuitive decision tree for identifying patterns in the data.
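
To give a sense of what an interpretable tree looks like without opening the notebook, the sketch below prints a fitted tree's rules as text using scikit-learn's `export_text`. This is a swapped-in, text-only alternative to a graphical plot, not necessarily the visualization the notebook uses. The four training rows, the feature names, and the drug labels are hypothetical.

```python
from sklearn.tree import DecisionTreeClassifier, export_text

# Tiny hypothetical training set: [Age, BP code], with BP 0 = LOW, 2 = HIGH.
X = [
    [25, 0],
    [60, 0],
    [30, 2],
    [55, 2],
]
y = ["drugC", "drugB", "drugA", "drugA"]

model = DecisionTreeClassifier(random_state=0)
model.fit(X, y)

# Render the learned splits as indented if/else style rules.
rules = export_text(model, feature_names=["Age", "BP"])
print(rules)
```

Each printed line is either a threshold test on a feature or a leaf with its predicted class, so the path from the root to a leaf reads as a chain of conditions.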
- Notebook: Class-Decision-Trees-drug.ipynb
- Dataset: Patient data for drug classification (details in the notebook).
Have ideas to improve the project? Feel free to fork the repo and submit pull requests!