A comprehensive system that combines web crawling, sentiment analysis, and deep learning to predict short-term stock price movements. The system integrates news sentiment with technical indicators to provide more accurate stock price predictions.

scfengv/GDSC-ai-stock

Stock Market Analysis and Prediction System

Check out our portfolio

Contributors

scfengv

CX330Blake

3Frank3 & weip12

C0D1R

yichen7299

🌟 Features

  • Automated data collection from multiple sources:
    • Yahoo Finance news articles
    • CNBC news articles
    • Earnings call transcripts
    • Tweets
  • Sentiment analysis using fine-tuned BERT model
  • Stock price prediction using LSTM with sentiment features
  • Scalable architecture for multiple stocks

🏗 System Architecture

1. Data Collection (News_Crawler.py)

  • Implements web crawling using Selenium and BeautifulSoup4
  • Collects news titles from Yahoo Finance and CNBC
  • Stores data in CSV format with timestamps
  • Handles rate limiting and browser automation
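The storage and rate-limiting steps could look roughly like the sketch below. It uses only the standard library and leaves out the Selenium and BeautifulSoup4 scraping itself. RateLimiter, write_headlines, the column order, and the sample headlines are all hypothetical and are not taken from News_Crawler.py.

```python
import csv
import io
import time
from datetime import datetime, timezone


class RateLimiter:
    """Enforce a minimum delay between consecutive requests."""

    def __init__(self, min_interval_s, clock=time.monotonic, sleep=time.sleep):
        self.min_interval_s = min_interval_s
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous request, if any

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval_s - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
        self._last = self._clock()


def write_headlines(headlines, source, out, now=None):
    """Write one (timestamp, source, title) CSV row per non-empty headline."""
    stamp = (now or datetime.now(timezone.utc)).isoformat()
    writer = csv.writer(out)
    for title in headlines:
        title = title.strip()
        if title:  # skip empty strings left over from scraping
            writer.writerow([stamp, source, title])


# Made-up headlines, written to an in-memory buffer instead of a file.
buffer = io.StringIO()
write_headlines(
    ["Apple beats estimates ", "", "Fed holds rates"],
    "cnbc",
    buffer,
    now=datetime(2024, 1, 2, tzinfo=timezone.utc),
)
csv_text = buffer.getvalue()
```

In a real crawl, RateLimiter.wait() would be called before each page load so that requests to the same site are spaced out.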

2. Sentiment Analysis (FineTune.py)

  • Fine-tunes BERT model for financial sentiment analysis
  • Uses pseudo-labeling technique for efficient data labeling
  • Supports three sentiment classes: positive, neutral, negative
  • Includes custom dataset handling and metrics computation
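The README does not say how pseudo-labels are chosen. One common variant, shown here as a sketch, keeps only the unlabeled examples that the current model classifies with high confidence. The function name, the 0.9 threshold, the label order, and the sample probabilities are assumptions for illustration.

```python
LABELS = ("negative", "neutral", "positive")  # assumed class order


def pseudo_label(texts, probabilities, threshold=0.9):
    """Return (text, label) pairs whose top class probability meets the threshold."""
    selected = []
    for text, probs in zip(texts, probabilities):
        best = max(range(len(probs)), key=lambda i: probs[i])
        if probs[best] >= threshold:
            selected.append((text, LABELS[best]))
    return selected


# Made-up model outputs for three unlabeled headlines.
texts = ["Shares plunge after recall", "Company to present at forum", "Record quarterly profit"]
probabilities = [
    [0.95, 0.03, 0.02],
    [0.40, 0.35, 0.25],  # too uncertain, dropped
    [0.05, 0.05, 0.90],
]
selected = pseudo_label(texts, probabilities)
```

The selected pairs would then be added to the training set for another round of fine-tuning.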

3. Stock Price Prediction (LSTM.py)

  • Implements LSTM model for time series prediction
  • Features:
    • Technical indicators integration
    • VIX index integration
    • Sentiment score integration
    • Hyperparameter tuning using Keras Tuner
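An LSTM consumes fixed-length sequences, so the per-day feature rows have to be cut into overlapping windows, each paired with the value to predict. The sketch below shows that step in plain Python. make_windows and the toy rows are hypothetical, and LSTM.py may build its sequences differently.

```python
def make_windows(rows, target_index, window):
    """Split time-ordered feature rows into (window, next-step target) samples.

    rows: list of per-day feature lists, oldest first.
    target_index: column holding the value to predict (e.g. close price).
    window: number of past days in each input sequence.
    """
    inputs, targets = [], []
    for end in range(window, len(rows)):
        inputs.append(rows[end - window:end])   # days end-window .. end-1
        targets.append(rows[end][target_index])  # value on day `end`
    return inputs, targets


# Toy rows of [close, sentiment]; values are made up.
rows = [[1, 10], [2, 20], [3, 30], [4, 40]]
inputs, targets = make_windows(rows, target_index=0, window=2)
```

Each input has shape (window, number of features), which matches what a Keras LSTM layer expects once the lists are converted to arrays.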

🚀 Getting Started

Prerequisites

pip install -r requirements.txt

Required packages:

  • tensorflow
  • torch
  • transformers
  • pandas
  • numpy
  • selenium
  • beautifulsoup4
  • yfinance
  • keras-tuner
  • scikit-learn

Running the Pipeline

  1. Data Collection
python News_Crawler.py
  2. Model Fine-tuning
python FineTune.py
  3. Stock Prediction
python LSTM.py
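Each stage depends on the output of the one before it, so it can help to run the scripts from a small driver that stops at the first failure. This is a sketch only: run_pipeline is a hypothetical helper, not part of the repository, and it assumes the three scripts sit in the current working directory.

```python
import subprocess
import sys

PIPELINE = ("News_Crawler.py", "FineTune.py", "LSTM.py")


def run_pipeline(scripts=PIPELINE, runner=subprocess.run):
    """Run each script in order and stop at the first non-zero exit code.

    Returns the list of scripts that finished successfully.
    """
    completed = []
    for script in scripts:
        result = runner([sys.executable, script])
        if result.returncode != 0:
            break
        completed.append(script)
    return completed


# Usage from the repository root:
#     finished = run_pipeline()
```

The runner argument exists so the function can be exercised without launching the real scripts.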

📊 Model Architecture

BERT Fine-tuning

  • Base model: google-bert/bert-large-uncased
  • Custom classification head
  • Configurable hyperparameters:
    • Learning rate
    • Batch size
    • Number of epochs
    • Maximum sequence length

LSTM Model

  • Features:
    • Close price
    • Volume
    • VIX index
    • Technical indicators
    • Sentiment scores
  • Configurable architecture:
    • Number of LSTM layers
    • Units per layer
    • Dropout rates
    • Dense layer configuration
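The feature list above can be read as one row per trading day. The sketch below shows one way such rows might be assembled: per-headline sentiment labels are averaged into a daily score, a simple moving average stands in for the unspecified technical indicators, and days without news get a neutral score. All names, the label-to-score mapping, and the sample values are assumptions.

```python
from collections import defaultdict

SCORE = {"negative": -1.0, "neutral": 0.0, "positive": 1.0}  # assumed mapping


def daily_sentiment(labelled_headlines):
    """Average the numeric scores of all headlines published on the same day."""
    by_day = defaultdict(list)
    for day, label in labelled_headlines:
        by_day[day].append(SCORE[label])
    return {day: sum(scores) / len(scores) for day, scores in by_day.items()}


def simple_moving_average(values, window):
    """Return the trailing mean per position; None until a full window exists."""
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(sum(values[i + 1 - window:i + 1]) / window)
    return out


def build_rows(days, close, volume, vix, sentiment, sma_window=3):
    """Build [close, volume, vix, sma, sentiment] rows, skipping warm-up days."""
    sma = simple_moving_average(close, sma_window)
    rows = []
    for i, day in enumerate(days):
        if sma[i] is None:
            continue  # not enough history for the indicator yet
        rows.append([close[i], volume[i], vix[i], sma[i], sentiment.get(day, 0.0)])
    return rows


# Made-up values for four trading days.
days = ["2024-01-02", "2024-01-03", "2024-01-04", "2024-01-05"]
sentiment = daily_sentiment([("2024-01-04", "positive"), ("2024-01-04", "neutral")])
rows = build_rows(days, [10, 11, 12, 13], [100, 110, 120, 130], [14, 15, 16, 17], sentiment)
```

Filling news-free days with 0.0 is a design choice; carrying the previous day's score forward is another reasonable option.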

🔧 Configuration

BERT Fine-tuning Configuration

config = ModelConfig(
    model_name="google-bert/bert-large-uncased",
    num_labels=3,
    train_batch_size=32,
    eval_batch_size=32,
    learning_rate=2e-5,
    num_epochs=5,
)
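For readers without the source at hand, a configuration object with these fields could be sketched as a dataclass. This is an illustration, not the actual ModelConfig from FineTune.py: the validation rules are invented, and max_length with a default of 128 is an assumed field standing in for the "maximum sequence length" setting listed earlier.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelConfig:
    """Hypothetical container for the fine-tuning hyperparameters."""

    model_name: str = "google-bert/bert-large-uncased"
    num_labels: int = 3
    train_batch_size: int = 32
    eval_batch_size: int = 32
    learning_rate: float = 2e-5
    num_epochs: int = 5
    max_length: int = 128  # assumed name and default, not from the README

    def __post_init__(self):
        # Fail early on values that cannot produce a valid training run.
        if self.num_labels < 2:
            raise ValueError("num_labels must be at least 2")
        if self.learning_rate <= 0:
            raise ValueError("learning_rate must be positive")
        if min(self.train_batch_size, self.eval_batch_size, self.num_epochs) < 1:
            raise ValueError("batch sizes and num_epochs must be at least 1")


config = ModelConfig(learning_rate=3e-5, num_epochs=3)
```

Freezing the dataclass keeps a run's settings from being changed by accident after training starts.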

LSTM Configuration

  • Configurable through StockPredictor class initialization
  • Supports hyperparameter tuning via Keras Tuner

📈 Performance Metrics

Sentiment Analysis

  • F1 Score (weighted average)
  • Classification accuracy
  • Confusion matrix
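To make the three sentiment metrics concrete, the sketch below computes them in plain Python. The weighted F1 is the per-class F1 averaged with each class's share of the true labels as its weight. The function names and the sample labels are illustrative; a real run would more likely use a library implementation.

```python
def confusion_matrix(y_true, y_pred, labels):
    """Rows are true classes, columns are predicted classes."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for truth, guess in zip(y_true, y_pred):
        matrix[index[truth]][index[guess]] += 1
    return matrix


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def weighted_f1(y_true, y_pred, labels):
    """Per-class F1 averaged by class support (number of true examples)."""
    matrix = confusion_matrix(y_true, y_pred, labels)
    total = len(y_true)
    score = 0.0
    for i in range(len(labels)):
        true_positives = matrix[i][i]
        if true_positives == 0:
            continue  # F1 is 0 for this class, so it adds nothing
        support = sum(matrix[i])
        predicted = sum(row[i] for row in matrix)
        precision = true_positives / predicted
        recall = true_positives / support
        f1 = 2 * precision * recall / (precision + recall)
        score += f1 * support / total
    return score


# Made-up labels for four headlines.
labels = ["negative", "neutral", "positive"]
y_true = ["positive", "positive", "negative", "neutral"]
y_pred = ["positive", "negative", "negative", "neutral"]
matrix = confusion_matrix(y_true, y_pred, labels)
```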

Stock Prediction

  • Mean Squared Error (MSE)
  • Root Mean Squared Error (RMSE)
  • Direction accuracy
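The README does not define direction accuracy. The sketch below assumes a common reading: the share of days on which the predicted price lies on the same side of the previous actual price as the true price does. The function names and sample prices are illustrative.

```python
import math


def mse(actual, predicted):
    """Mean of squared differences between actual and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Square root of the MSE, in the same units as the prices."""
    return math.sqrt(mse(actual, predicted))


def direction_accuracy(actual, predicted):
    """Share of steps where the predicted and actual moves agree on 'up'.

    Both moves are measured from the previous day's actual price.
    A move of zero counts as 'not up'.
    """
    if len(actual) < 2:
        raise ValueError("need at least two time steps")
    hits = 0
    for t in range(1, len(actual)):
        actual_up = actual[t] - actual[t - 1] > 0
        predicted_up = predicted[t] - actual[t - 1] > 0
        if actual_up == predicted_up:
            hits += 1
    return hits / (len(actual) - 1)


# Made-up closing prices and predictions.
actual = [100, 102, 101, 103]
predicted = [100, 101, 103, 100]
```

A model can have a low RMSE and still miss most turning points, which is why the direction metric is reported alongside the error metrics.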

🔍 Future Improvements

  1. Add support for more news sources
  2. Implement real-time prediction pipeline
  3. Add more technical indicators
  4. Enhance model interpretability
  5. Add backtesting framework
  6. Implement portfolio optimization

⚠️ Disclaimer

This project is for educational purposes only. The predictions should not be used as financial advice. Always do your own research before making investment decisions.
