Skip to content

Projects written in Python with a focus on automation for data analyst in Aerospace. These are all simulating data, and uses very little to no real world data as it's not publicly available (or I don't know where to look!). This is all purely for learning & curiosity.

Notifications You must be signed in to change notification settings

omgitsbees/AeroSpace-Python-Scripts

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

27 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Resource Allocation and Utilization Tracking System

This Python project is a Resource Allocation and Utilization Tracking tool, designed for use in industries such as aerospace to track the assignment of resources, forecast their utilization, and simulate resource consumption over time. The tool features an interactive UI and includes functionalities for AI-based optimization, forecasting, simulation, and Jira integration.

This project is built to showcase complex problem-solving skills and can be used as a portfolio piece for job applications, such as for companies like Blue Origin. Features 🛠️ AI-Based Resource Allocation

Uses Linear Programming via PuLP to optimize the allocation of resources to projects.

📈 Forecasting Resource Utilization

Forecasts future resource utilization based on historical data using SARIMA models from statsmodels.

⏳ Simulation Mode

Simulates resource usage over time, with background processes managed by schedule.

🔐 User Authentication

Password hashing and authentication using bcrypt to ensure secure access to the application.

🛠️ Jira Integration

Connects to Jira's API to fetch and display issues, which helps in tracking tasks and managing resource allocation accordingly.

📊 Tkinter GUI

A user-friendly graphical interface that enables interaction with all functionalities, including login, resource allocation, simulation, Jira integration, and forecasting.

Tech Stack

Frontend/UI: Tkinter for the graphical interface.
Optimization: PuLP for Linear Programming.
Simulation: schedule for background simulation tasks.
Authentication: bcrypt for hashing and validating passwords.
Data Processing: pandas for handling utilization data.
Forecasting: statsmodels for time-series forecasting (SARIMA model).
Jira Integration: JIRA API for fetching tasks and managing them within the application.

Setup Instructions

  1. Clone the Repository

bash

git clone https://github.com/your-username/resource-allocation-utilization-tracking.git cd resource-allocation-utilization-tracking

  1. Install Dependencies

Install the required Python packages:

bash

pip install pulp schedule bcrypt pandas statsmodels jira

  1. Run the Application

Execute the Python script to launch the Tkinter GUI:

bash

python resource_allocation.py

  1. Jira Integration

For Jira integration, you'll need:

A Jira account
Your Jira domain and an API token from Jira

Update the fetch_jira_issues() function in the script with your Jira credentials:

python

jira_options = {'server': 'https://your-jira-domain.atlassian.net'} jira = JIRA(options=jira_options, basic_auth=('[email protected]', 'api_token'))

How to Use

Login: Start by entering your username and password in the UI. For demo purposes, the password is password123.

Optimize Resource Allocation: Click the "Optimize Allocation" button to view the best allocation of available resources to projects based on AI optimization.

Start Simulation: Run the resource utilization simulation in the background by clicking the "Start Simulation" button. This simulates resource consumption over time.

Forecast Utilization: Click the "Forecast Utilization" button to view predicted utilization rates for the next 12 time steps, based on historical data.

Fetch Jira Issues: To view current tasks from your Jira project, click the "Fetch Jira Issues" button. Ensure you have updated the script with your Jira credentials.

Sample Screenshots

Fig 1: AI-based optimized resource allocation interface

Fig 2: Forecasted utilization data output

Screenshot 2024-09-22 100208


✈️ Flight Data Analysis and Reporting Overview

This project simulates the collection of flight data, performs comprehensive analysis, and generates a detailed report with visualizations. The report is automatically created as a PDF, providing insights into key flight parameters such as altitude, velocity, and temperature over time. Table of Contents

Features
Installation
Usage
Data
Analysis
Visualization and Reporting
Future Enhancements
Contributing
License

Features

Simulated Flight Data: Generate synthetic flight data including time, altitude, velocity, and temperature.
Data Analysis: Compute key statistics such as mean, max, and min values for altitude, velocity, and temperature.
Visualizations: Create line plots for altitude, velocity, and temperature over time.
PDF Report Generation: Compile the analysis results and visualizations into a professional PDF report.

Installation Prerequisites

Ensure you have Python 3.6+ installed along with the following packages:

pandas
numpy
matplotlib
seaborn
fpdf

Usage

Run the script:

bash

python flight_data_analysis_and_reporting.py

The script will generate simulated flight data, analyze it, create visualizations, and compile a PDF report.

The output report is saved as flight_data_analysis_report.pdf.

Data

The simulated flight data includes the following features:

Time (s): The time in seconds.
Altitude (ft): The altitude in feet.
Velocity (mph): The velocity in miles per hour.
Temperature (C): The temperature in Celsius.

Analysis

The analysis computes key statistics for each of the flight parameters:

Mean Altitude (ft)
Max Altitude (ft)
Min Altitude (ft)
Mean Velocity (mph)
Max Velocity (mph)
Min Velocity (mph)
Mean Temperature (C)
Max Temperature (C)
Min Temperature (C)

Visualization and Reporting

The script generates the following visualizations:

Altitude Over Time
Velocity Over Time
Temperature Over Time

These visualizations are embedded in the automatically generated PDF report, which also includes the computed statistics. Future Enhancements

Real Flight Data Integration: Integrate with real-world flight data for analysis.
Interactive Visualizations: Add interactive elements to the visualizations for better data exploration.
Extended Data Analysis: Include additional parameters like pressure, wind speed, and fuel consumption.    

🚀 Propulsion System Performance Monitoring Overview

This project involves simulating, analyzing, and monitoring the performance of a propulsion system. The system's key parameters such as thrust, fuel consumption, temperature, and vibration are tracked over time. The project includes functionality for detecting anomalies, generating alerts, and producing a detailed performance report. Table of Contents

Features
Installation
Usage
Data Simulation
Performance Analysis
Anomaly Detection and Alerts
Reporting
Visualization
Future Enhancements
Contributing
License

Features

Data Simulation: Generate synthetic data for propulsion system parameters including thrust, fuel consumption, temperature, and vibration.
Performance Analysis: Compute key statistics like average thrust, total fuel consumption, maximum temperature, and maximum vibration.
Anomaly Detection: Identify anomalies in the system's performance using Z-scores.
Alerts: Generate alerts if anomalies are detected.
Reporting: Generate a comprehensive text report summarizing the system's performance and any detected anomalies.
Visualization: Optionally visualize the propulsion system data over time.

Installation Prerequisites

Ensure you have Python 3.6+ installed along with the following packages:

numpy
pandas
matplotlib
seaborn
scipy

Usage

Run the script:

bash

python propulsion_system_monitoring.py

View the report:

The script generates a performance report saved as propulsion_system_report.txt. The report includes a summary of the system's performance and details of any anomalies detected.

Visualize the data (optional):

The script also offers an optional visualization of the propulsion system data over time.

Data Simulation

The data simulation step generates synthetic time-series data for the following propulsion system parameters:

Time: Time steps (in seconds).
Thrust (N): Thrust generated by the system (in Newtons).
Fuel Consumption (kg): Fuel consumption rate (in kilograms).
Temperature (°C): System temperature (in degrees Celsius).
Vibration (g): Vibration level (in gravitational units).

Performance Analysis

The performance analysis computes key metrics:

Average Thrust (N)
Total Fuel Consumption (kg)
Max Temperature (°C)
Max Vibration (g)

Anomalies are detected using Z-scores, and any data points that deviate significantly from the norm are flagged. Anomaly Detection and Alerts

If anomalies are detected in the system's performance, an alert is generated, and the details of the anomalies are included in the performance report. Reporting

The generated performance report, propulsion_system_report.txt, contains:

A summary of key performance metrics.
Details of any anomalies detected during the analysis.
Conclusions based on the analysis.

Visualization

The script includes an optional step to visualize the propulsion system data over time, displaying line plots for:

Thrust
Fuel Consumption
Temperature
Vibration

Future Enhancements

Real-time Monitoring: Implement real-time data monitoring and anomaly detection.
Extended Parameters: Incorporate additional parameters such as pressure and speed.
Interactive Visualizations: Add interactive dashboards for deeper data exploration.

Launch Schedule Optimization = The idea behind this project is to optimize the scheduling of rocket launches, taking into consideration factors like weather conditions, rocket availability, payload readiness, crew availability, and air traffic.

step-by-step breakdown: Step 1: Define the Problem

We want to optimize the scheduling of rocket launches. The inputs to the problem might include:

Launch Windows: Available time slots for launching rockets.
Rocket Availability: Whether a rocket is ready for launch.
Payload Readiness: Whether the payload is ready for launch.
Weather Conditions: Whether the weather is favorable for launch.
Crew Availability: Whether the crew is available and ready.

Step 2: Define Constraints

Some constraints to consider:

Rockets cannot be launched outside of their designated launch windows.
Weather conditions must be favorable for a launch to occur.
Payload must be ready and loaded.
Crew must be available and rested.

Step 3: Implement the Optimization

simple optimization technique where we will try to assign the best possible launch windows based on the constraints.

Constraint Satisfaction Problem (CSP): treat this as a CSP where each launch needs to satisfy several constraints (e.g., weather, rocket readiness, payload, etc.).

Optimization Algorithm: use an optimization algorithm like Genetic Algorithms (GA) to find the optimal launch schedule.

Data Structures: use sophisticated data structures like classes to represent rockets, payloads, and schedules.

Concurrency: Introduce concurrency with asyncio to simulate real-time checks and decision-making.

Logging and Error Handling: Add logging and exception handling for robustness.

Visualization: Generate a Gantt chart to visualize the schedule.

Explanation of the Enhanced Code:

Classes and Data Structures: The Rocket, Payload, Weather, Crew, LaunchWindow, and Schedule are more clearly defined using namedtuples and classes for better data management.

Constraint Satisfaction and Genetic Algorithm: The code uses a Genetic Algorithm (GA) for optimization, which is a more sophisticated method than simple loops and checks. The GA simulates a natural selection process to find an optimal solution by generating, mutating, and crossing over schedules.

Concurrency: asyncio is used to simulate real-time decision-making and to allow the script to handle multiple tasks concurrently, which can be especially useful if the scheduling process needs to be responsive to real-time changes.

Logging and Error Handling: Logging is implemented to track the progress and decisions made during the optimization process, which adds professionalism to the script and helps in debugging and analysis.

Visualization: A Gantt chart is generated using matplotlib to visually represent the launch schedule, which is a critical feature for demonstrating the results to stakeholders.

Anomaly Detection in Telemetry Data = comprehensive Python script that performs anomaly detection in telemetry data using a combination of statistical methods, machine learning, and visualization. This script is designed for automation, allowing it to process telemetry data, detect anomalies, and generate detailed reports.

comprehensive anomaly detection system specifically designed for aerospace telemetry data. This system will incorporate the following features:

Synthetic Data Generation: Since real aerospace telemetry data might not be readily available, we'll simulate realistic telemetry data including parameters like altitude, velocity, acceleration, temperature, pressure, and fuel levels.

Data Preprocessing: Handling missing values, scaling, and smoothing the data to prepare it for analysis.

Anomaly Detection Techniques:
    Statistical Methods: Z-Score and Interquartile Range (IQR) methods.
    Machine Learning Methods:
        Isolation Forest
        One-Class SVM
        Autoencoders (Deep Learning): Specifically suited for capturing complex patterns in time-series data.

Visualization: Plotting the telemetry data with highlighted anomalies for easy interpretation.

Reporting: Generating detailed reports summarizing detected anomalies, including timestamps and parameter specifics.

Modularity and Scalability: The code will be organized into functions and classes for easy maintenance and scalability.

Real-time Processing Capability: Designing the system to handle streaming data for real-time anomaly detection.

Technologies and Libraries Used:

Python 3.x
NumPy and Pandas: Data manipulation and analysis.
Matplotlib and Seaborn: Data visualization.
Scikit-Learn: Machine learning algorithms.
TensorFlow/Keras: Building and training deep learning models (Autoencoders).
PyOD: Specialized library for outlier detection.
Streamlit (Optional): For creating an interactive dashboard/interface.

Detailed Explanation:
  1. Synthetic Data Generation:

Since real aerospace telemetry data may not be publicly available, we generate synthetic data that mimics real-world scenarios.

Features Generated:
    Altitude: Simulated around 10,000 meters with some variance.
    Velocity: Around 250 m/s.
    Acceleration: Centered around 0 m/s² to represent steady flight with occasional fluctuations.
    Temperature: Around 15°C with variations.
    Pressure: Around 1013 hPa (standard atmospheric pressure).
    Fuel Level: Starting with 5000 liters and varying slightly.

Anomaly Injection:
    We introduce anomalies by randomly altering the values of these features significantly at random timestamps.
    This simulates scenarios like sudden drops or spikes in readings due to malfunctions or external factors.
  1. Data Preprocessing:

    Handling Missing Values: Although our synthetic data doesn't have missing values, this step ensures robustness by forward and backward filling any gaps. Scaling: We use StandardScaler to normalize the data, which is crucial for many anomaly detection algorithms to perform effectively.

  2. Anomaly Detection Methods:

We implement multiple methods to detect anomalies, each with its strengths: a. Z-Score Method:

Concept: Identifies how many standard deviations an observation is from the mean.
Threshold: Typically set at 3, meaning any data point beyond 3 standard deviations is considered an anomaly.

b. Interquartile Range (IQR) Method:

Concept: Uses the middle 50% of data to identify outliers.
Threshold: Commonly set at 1.5 times the IQR above the 75th percentile or below the 25th percentile.

c. Isolation Forest:

Concept: An ensemble method that isolates anomalies by randomly selecting a feature and splitting value.
Advantages: Efficient on large datasets and can handle high-dimensional data.
Parameters:
    Contamination: Proportion of anomalies in the data.

d. One-Class SVM:

Concept: Learns the boundary of normal data and identifies points outside this boundary as anomalies.
Kernel: Radial Basis Function (RBF) kernel captures non-linear relationships.
Parameters:
    Nu: An upper bound on the fraction of training errors and a lower bound of the fraction of support vectors.

e. Autoencoder (Deep Learning):

Concept: Neural network trained to reconstruct its input. High reconstruction error indicates anomalies.
Architecture:
    Encoder: Compresses input data.
    Decoder: Reconstructs data from the compressed representation.
Training:
    Trained on normal data to minimize reconstruction error.
Detection:
    Reconstruction error is calculated, and a threshold is set to identify anomalies.
  1. Visualization:

    For each detection method, we plot the time series of each feature and highlight detected anomalies. Plots are saved in the specified directory for review.

  2. Reporting:

    Detailed reports are generated in CSV format listing all detected anomalies with their timestamps and feature values. These reports are essential for post-analysis and validation.

  3. Evaluation:

    We use classification metrics like precision, recall, F1-score, and confusion matrix to evaluate the performance of each detection method against the known anomalies. This helps in understanding which method performs best under different scenarios.

  4. Model Saving:

    Trained models are saved for future use or deployment. Models like Isolation Forest and One-Class SVM are saved using pickle, while the Autoencoder model is saved in HDF5 format.

  5. Modularity and Extensibility:

    The code is organized into functions, making it easy to maintain and extend. New detection methods or features can be added seamlessly.

  6. Logging:

    Comprehensive logging is implemented to track the execution flow and debug if necessary. Logs provide timestamps and detailed messages for each major step.

  7. Real-time Processing Capability:

    While the current implementation processes batch data, it can be extended to handle streaming data by adapting the functions to process incoming data points or windows


Automated Risk Assessment = a comprehensive tool designed to evaluate, predict, and manage risks in aerospace projects. The application integrates various advanced features such as real-time data collection, machine learning-based risk prediction, historical data analysis, and secure communication of risk alerts. The app also includes a user-friendly Graphical User Interface (GUI) to allow easy interaction with these complex functionalities.

Key Components of the Application

Graphical User Interface (GUI):
    The program uses tkinter, a standard Python library for creating GUIs. The GUI enables users to interact with the risk assessment features without needing to dive into the code or understand the underlying processes.
    The main window includes buttons for sending alerts, analyzing historical data, running machine learning predictions, and adding real-time data.

Risk Assessment Class (RiskAssessment):
    This class forms the core of the application. It manages all risk assessment functionalities, including data loading, analysis, machine learning predictions, real-time data integration, and communication (alerts).
    Initialization: The __init__ method initializes the risk assessment with the name of the aerospace project. It sets up paths for saving reports and handling sensitive data.
    Data Handling: The class loads and preprocesses historical risk data, which is essential for both analysis and machine learning models.
    Historical Data Analysis: This feature visualizes the historical risk data to understand risk trends and distributions, providing valuable insights for risk management.
    Machine Learning Predictions: The program utilizes a Gradient Boosting Classifier, a powerful machine learning algorithm, to predict the likelihood of high-risk scenarios based on historical data. This feature allows for proactive risk management by forecasting potential issues.
    Real-time Data Integration: Real-time data, particularly weather data relevant to aerospace operations, is fetched from the OpenWeatherMap API. This data is critical for making current and accurate risk assessments.
    PDF Reporting: The application can generate PDF reports summarizing risk assessments. This is useful for documentation and sharing findings with stakeholders.
    Secure Communication: The program can send email alerts to notify stakeholders of critical risks. Sensitive information is encrypted to ensure data security during transmission.

Security Features:
    The application incorporates encryption using the cryptography.fernet module to protect sensitive risk assessment data. This is crucial in an industry where data confidentiality is paramount.

Real-time Data Handling:
    By integrating with external APIs, the app fetches live weather data, which can be pivotal in aerospace risk assessment. The real-time aspect allows for more accurate and dynamic risk evaluations.

Email Alert System:
    The program can send automated email alerts when a high-risk scenario is identified. This ensures that relevant stakeholders are immediately informed, allowing for quick decision-making.

Error Handling and Data Integrity:
    The program includes robust error handling to manage potential issues, such as missing data or API failures. This ensures the application runs smoothly and provides reliable results.

What the Application Does

Risk Assessment:
    The application assesses the risk involved in aerospace projects by analyzing historical data and integrating real-time data (e.g., weather conditions). It predicts potential risks using machine learning, enabling proactive management.

Real-time Data Integration:
    Real-time data, such as current weather conditions, is fetched and analyzed to update the risk assessment dynamically. This feature is particularly valuable in aerospace, where conditions can change rapidly and unpredictably.

Historical Data Analysis:
    The app allows users to analyze historical risk data visually, helping to identify patterns and trends that could inform future risk management strategies.

Machine Learning Prediction:
    Using a Gradient Boosting Classifier, the application predicts the likelihood of various risks. This model is trained on historical data to identify high-risk scenarios with better accuracy.

PDF Report Generation:
    The application generates detailed PDF reports of the risk assessments, which can be shared with teams or stored for documentation purposes.

Email Alerts:
    The application sends out alerts via email when significant risks are detected, ensuring that critical information reaches the relevant stakeholders promptly.

Environmental Impact Analysis Overview

The Environmental Impact Analysis tool is designed to assess and analyze the environmental impact of aerospace missions, particularly focusing on CO2 emissions, noise pollution, and ecological footprint. This tool provides a comprehensive analysis by leveraging data processing, predictive modeling, and visualization techniques. It also features an easy-to-use graphical user interface (GUI) for interacting with the tool and generating reports. Features

Data Loading: Load environmental data from CSV or Excel files for analysis.
Data Preprocessing: Handle missing values and normalize the data to ensure accurate results.
CO2 Emissions Calculation: Estimate CO2 emissions based on fuel consumption data.
Noise Pollution Estimation: Calculate noise pollution levels and assess the impact radius.
Ecological Footprint Assessment: Analyze the impact on land use and biodiversity.
Predictive Modeling: Use linear regression to forecast future environmental impacts.
Data Visualization: Generate plots to visualize the data and the results of the analysis.
Interactive Maps: Visualize impact zones on maps using geographical data (requires geopandas).
PDF Report Generation: Automatically generate a detailed PDF report of the analysis.
Graphical User Interface (GUI): A simple and intuitive interface for users to interact with the tool.

Installation Prerequisites

Python 3.6 or higher
Required Python packages: pandas, numpy, matplotlib, seaborn, scikit-learn, fpdf, tkinter
Optional for map visualization: geopandas, shapely, fiona, pyproj, rtree

Install Required Packages

You can install the required packages using pip:

bash

pip install pandas numpy matplotlib seaborn scikit-learn fpdf geopandas shapely fiona pyproj rtree

Running the Application

Clone the repository:

bash

git clone https://github.com/yourusername/environmental-impact-analysis.git

Navigate to the project directory:

bash

cd environmental-impact-analysis

Run the script:

bash

python environmental_impact_analysis.py

Use the GUI:
    A window will pop up allowing you to load data, run the analysis, and generate reports with a few clicks.

Usage Loading Data

Use the "Load Data" button to import your environmental data file (CSV or Excel).
The data should contain columns such as Fuel_Consumed, Distance, Area_Used, Longitude, and Latitude.

Running the Analysis

Click on the "Run Analysis" button to preprocess the data, calculate emissions, noise, and ecological footprint, and generate a predictive model.
The analysis results will be displayed, and a PDF report will be automatically generated.

Material Usage and Stress Analysis Overview

The "Material Usage and Stress Analysis" project is a Python application designed for engineers and researchers in the aerospace industry, particularly those involved in material science and structural analysis. The application aims to provide comprehensive tools for analyzing material usage efficiency, performing stress analysis on aerospace components, optimizing material selection, and predicting material performance under various conditions.

This project is intended to be part of a Blue Origin portfolio, demonstrating advanced technical skills in material analysis, engineering simulations, and software development. Features

Material Data Loading: Load material properties from CSV or Excel files, including key attributes like tensile strength, yield strength, density, and modulus of elasticity.

Component Design and Material Allocation: Define aerospace components, assign materials, and calculate material usage efficiency based on the component's design and selected material properties.

Finite Element Analysis (FEA) Integration: Perform simplified FEA to simulate stress distribution across the component under different load conditions. Visualize the results on a 3D model.

Safety Factor Calculation: Calculate safety factors for each component based on the applied stresses and material properties. Identify critical areas where safety factors are below acceptable levels.

Material Optimization: Optimize material usage by suggesting alternative materials or geometries to achieve the desired strength-to-weight ratio and cost-effectiveness.

Predictive Modeling: Use machine learning models to predict material behavior under extreme conditions, such as high temperatures or cyclic loads. Train models on historical data and simulate future scenarios.

Interactive GUI: An intuitive graphical user interface (GUI) for inputting component designs, selecting materials, and viewing analysis results. The GUI includes 3D visualization of stress distribution.

Automated Reporting: Generate detailed PDF reports summarizing material usage, stress analysis, safety factors, and optimization recommendations.

Installation Prerequisites

Python 3.7 or higher
Required Python libraries: pandas, numpy, scipy, matplotlib, sklearn, tkinter

Installation Steps

Clone the Repository:

bash

git clone https://github.com/your-username/material-usage-stress-analysis.git cd material-usage-stress-analysis

Install the Required Libraries:

bash

pip install pandas numpy scipy matplotlib scikit-learn

Run the Application:

bash

python material_usage_stress_analysis.py

Usage

Load Material Data: Use the GUI to load a CSV or Excel file containing material properties.

Input Component Design: Define the volume and density of the component through the GUI.

Run Analysis: Click the "Run Analysis" button to perform material usage calculations, stress analysis, safety factor determination, and optimization.

View Results: The application will display the results in the GUI, including material usage, stress distribution, and safety factors.

Generate Report: A detailed PDF report will be automatically generated, summarizing all analysis and optimization results.

Example Usage

Here's a simple example of how to use the application:

Load a Material Data File: Click the "Load Material Data" button and select a file containing material properties.

Input Component Volume and Density: Enter the volume and density of the component you are analyzing.

Perform Analysis: Click "Run Analysis" to calculate material usage, simulate stress distribution, and evaluate safety factors.

Generate Report: The application will generate a detailed PDF report with all the analysis results.

Next Steps and Future Development

Dynamic Load Analysis: Expand the application to include dynamic load analysis, allowing users to analyze stress and strain over time.

Material Database Integration: Integrate a comprehensive database for material properties to facilitate easy updates and management of material data.

Advanced FEA Integration: Incorporate professional FEA tools like Abaqus or ANSYS for more detailed and accurate stress analysis.

Multi-Objective Optimization: Implement optimization algorithms that can handle multiple objectives, such as minimizing weight while maximizing strength and cost-efficiency.

API Development: Develop API endpoints for integration with CAD and other design tools, enabling seamless data transfer and analysis within a larger engineering workflow.

The tool automatically generates a PDF report summarizing the analysis, including CO2 emissions, noise pollution, ecological footprint, and predictive model performance.

Project Structure

environmental_impact_analysis.py: Main script containing the code for the analysis.
data/: Folder for storing input data files (if applicable).
reports/: Folder for storing generated PDF reports (if applicable).

🚀 Historical Launch Data Analysis Overview

This project performs a comprehensive analysis of historical aerospace launch data, leveraging various data science techniques to gain insights, predict outcomes, and identify trends. The analysis includes exploratory data analysis, machine learning classification, clustering, and time series analysis. Table of Contents

Features
Installation
Usage
Data
Analysis
Visualization
Future Enhancements
Contributing
License

Features

Synthetic Dataset Generation: Create a simulated dataset for historical launch data.
Exploratory Data Analysis (EDA): Visualize trends, correlations, and distributions in the dataset.
Machine Learning:
    Random Forest Classification: Predict the success of future launches based on historical data.
    Principal Component Analysis (PCA): Dimensionality reduction for better model performance.
    K-Means Clustering: Identify patterns in launch data.
Time Series Analysis:
    Trend Analysis: Analyze and visualize trends over time.
    Anomaly Detection: Identify anomalies in launch success rates.
Data Visualization: Comprehensive visualizations for better insights.

Installation Prerequisites

Ensure you have Python 3.6+ installed along with the following packages:

pandas
numpy
matplotlib
seaborn
scikit-learn

Usage

Run the script:

bash

python historical_launch_data_analysis.py

The script will generate a synthetic dataset and perform various analyses including data visualization, machine learning, and time series analysis.

The output includes visualizations, classification reports, and predictions based on the provided features.

Data

The script generates a synthetic dataset simulating historical launch data with the following features:

Launch_Date: Date of the launch.
Launch_Mass: Mass of the launch vehicle.
Payload_Mass: Mass of the payload.
Launch_Success: Binary outcome indicating the success (1) or failure (0) of the launch.
Year, Month, Day: Extracted from Launch_Date.
Payload_to_LaunchMass_Ratio: Ratio of payload mass to launch mass.

Analysis Exploratory Data Analysis (EDA)

The script provides visualizations for:

Yearly launch successes and failures.
Correlations between launch features.
Payload to launch mass ratio vs. launch success.

Machine Learning

Random Forest Classifier: A model to predict launch success.
PCA: Dimensionality reduction to improve model performance.
K-Means Clustering: Clustering of launch data for pattern recognition.

Time Series Analysis

Monthly Launch Success Rate: Analysis and visualization of the monthly success rate.
Rolling Average: Visualization of rolling averages to identify trends.
Anomaly Detection: Identification and visualization of anomalies in launch data.

Visualization

The script generates various plots:

Yearly launch success vs. failure.
Box plots for payload ratios.
Correlation matrix heatmap.
Scatter plot for K-Means clustering.
Time series plots for launch success rates and anomalies.

Future Enhancements

Integration with Real Data: Replace the synthetic dataset with real-world historical launch data.
Enhanced Predictive Models: Implement additional machine learning algorithms for better predictions.
Interactive Visualizations: Use tools like Plotly or Dash for interactive data exploration.

About

Projects written in Python with a focus on automation for data analyst in Aerospace. These are all simulating data, and uses very little to no real world data as it's not publicly available (or I don't know where to look!). This is all purely for learning & curiosity.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages