Data Science Projects

This repository contains two projects developed as part of the Data Science Internship at CipherByte Technologies:

  1. Iris Flower Classification Model: A machine learning project focused on classifying Iris flowers based on their physical measurements.
  2. Time Series Forecasting: A project for analyzing and predicting trends in time series data.

Below is a detailed overview of the Iris Flower Classification project.


Iris Flower Classification Model

Data Science Internship - CipherByte Technologies

This project involves training a machine learning model to classify Iris flowers into one of three species: Setosa, Versicolor, or Virginica. The classification is based on the measurements of the flowers' sepals and petals.

Task Overview

The Iris Flower Classification task includes:

  • Data collection and preprocessing
  • Visualizing the data to understand the distribution of features
  • Training a Logistic Regression model on the Iris dataset
  • Evaluating model performance using accuracy, classification report, and cross-validation
  • Predicting the species of Iris flowers based on new data inputs
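The steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: it assumes scikit-learn's bundled copy of the Iris data (`load_iris`) in place of `Iris_Flower_Data.csv`, and the 80/20 split, `random_state`, `max_iter`, and the sample measurements in `new_flower` are illustrative choices.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, classification_report
from sklearn.model_selection import train_test_split

# Assumption: the bundled Iris data stands in for Iris_Flower_Data.csv.
iris = load_iris()
X, y = iris.data, iris.target

# Hold out 20% of the samples for testing; stratify keeps the species balanced.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Train a Logistic Regression classifier (max_iter raised so the solver converges).
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# Evaluate on the held-out samples.
y_pred = model.predict(X_test)
test_accuracy = accuracy_score(y_test, y_pred)
print(f"Test accuracy: {test_accuracy:.3f}")
print(classification_report(y_test, y_pred, target_names=iris.target_names))

# Predict the species for one new flower:
# [sepal length, sepal width, petal length, petal width] in cm.
new_flower = [[5.1, 3.5, 1.4, 0.2]]
predicted_species = str(iris.target_names[model.predict(new_flower)[0]])
print(f"Predicted species: {predicted_species}")
```

Stratifying the split matters less on a balanced dataset like Iris, but it keeps each species equally represented in the small test set.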

Dataset Description

The Iris dataset consists of the following features:

  • Sepal Length (cm)
  • Sepal Width (cm)
  • Petal Length (cm)
  • Petal Width (cm)
  • Species (Target variable: Setosa, Versicolor, Virginica)
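To get a feel for this layout, the sketch below loads the same five columns into a pandas DataFrame. It assumes scikit-learn's bundled Iris data rather than `Iris_Flower_Data.csv`, so the column names shown here may differ from those in the CSV file; the `species` column is added here for illustration.

```python
from sklearn.datasets import load_iris

# Assumption: the bundled dataset mirrors the CSV's four features and target.
iris = load_iris(as_frame=True)
df = iris.frame  # four measurement columns plus an integer 'target' column

# Map the integer target (0, 1, 2) to readable species names.
species_names = {i: str(name) for i, name in enumerate(iris.target_names)}
df["species"] = df["target"].map(species_names)

print(df.shape)
print(df.columns.tolist())
print(df["species"].value_counts())
```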

Libraries and Tools Used

  • Pandas: For data manipulation and analysis
  • Seaborn and Matplotlib: For data visualization
  • Plotly: For interactive scatter plot visualizations
  • Scikit-learn: For model training, evaluation, and metrics

Project Structure

├── IrisFlowerClassification.ipynb  # Jupyter Notebook with the complete code
├── Iris_Flower_Data.csv            # Dataset
├── TimeSeriesForecasting.ipynb     # Time Series Forecasting project
└── README.md                       # Project documentation

Results

The model is evaluated based on:

  • Accuracy: Measures the proportion of correct predictions.
  • Confusion Matrix: Analyzes how well the model classifies each species.
  • Cross-Validation: Checks how stable the model's performance is across different subsets (folds) of the dataset.
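As a rough sketch of the last two metrics, the snippet below computes 5-fold cross-validation scores and a confusion matrix built from the out-of-fold predictions. The fold count and the use of scikit-learn's bundled Iris data are assumptions made for illustration, not settings taken from the notebook.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import cross_val_predict, cross_val_score

# Assumption: the bundled Iris data stands in for Iris_Flower_Data.csv.
X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=1000)

# Accuracy on each of 5 folds; a small spread suggests stable performance.
scores = cross_val_score(model, X, y, cv=5)
print(f"CV accuracy: {scores.mean():.3f} (std {scores.std():.3f})")

# Confusion matrix from out-of-fold predictions:
# rows are true species, columns are predicted species.
y_cv_pred = cross_val_predict(model, X, y, cv=5)
cm = confusion_matrix(y, y_cv_pred)
print(cm)
```

Off-diagonal entries in the matrix show which species are confused with each other.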

Conclusion

This project demonstrates how to build, train, and evaluate a Logistic Regression classifier on the Iris dataset, from data preprocessing through prediction on new inputs.


For details on the Time Series Forecasting project, please refer to the TimeSeriesForecasting.ipynb file in the repository.

Acknowledgments

This project was developed as part of the Data Science Internship at CipherByte Technologies.