FinLove — Intelligent Portfolio Construction Platform

Authors: Nguyen Van Duy Anh · Pham Dinh Hieu · Cao Pham Minh Dang · Tran Anh Chuong · Ngo Dinh Khanh
GitHub: https://github.com/dinhieufam/FinLove


Overview

FinLove is a comprehensive portfolio construction and analysis platform that integrates advanced quantitative finance methodologies with modern machine learning techniques. The system provides intelligent investment planning, in-depth risk analysis, forward-looking predictions, and AI-powered explanations to support informed investment decisions.

The platform is built as a full-stack web application with a Next.js frontend and FastAPI backend, providing a modern, responsive user interface. It combines sophisticated risk models, multiple optimization strategies, realistic backtesting capabilities, and state-of-the-art forecasting methods to deliver actionable portfolio insights. FinLove is designed for both individual investors seeking professional-grade portfolio management tools and financial professionals requiring robust analytical capabilities.

Additionally, a Streamlit dashboard is available as an alternative interface for quick prototyping and analysis.


Key Features

💰 Investment Plan

The Investment Plan feature transforms optimized portfolio weights into actionable dollar-based allocation strategies. Users can specify their total investment capital, and the system automatically calculates precise dollar allocations for each asset based on the optimized weights generated by the portfolio construction engine.

Capabilities:

  • Convert optimized portfolio weights to dollar allocations
  • Support for customizable investment amounts
  • Real-time allocation updates based on portfolio optimization results
  • Visual representation of capital deployment across assets
  • Detailed allocation tables with percentage and dollar breakdowns

🔍 Analyze

The Analyze module provides comprehensive portfolio performance and risk assessment through advanced statistical analysis and visualization. It leverages multiple risk models and performance metrics to deliver deep insights into portfolio behavior.

Capabilities:

  • Performance Analysis: Cumulative returns, rolling Sharpe ratios, drawdown analysis, and benchmark comparisons
  • Risk Assessment: Value at Risk (VaR), Conditional VaR (CVaR), volatility analysis, and correlation matrices
  • Portfolio Composition: Current allocation visualization, weight evolution over time, and concentration analysis
  • Statistical Metrics: Annualized returns, volatility, Sharpe ratio, maximum drawdown, turnover, and weight stability
  • Risk Model Diagnostics: Comprehensive analysis using multiple risk estimation methods (Ledoit-Wolf, GLASSO, GARCH, DCC)
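
Most of these quantities can be computed directly from a daily returns series. A minimal illustrative sketch (the helper name and details here are ours, not the actual API of the platform's metrics.py):

import numpy as np
import pandas as pd

def basic_risk_metrics(returns: pd.Series, alpha: float = 0.05) -> dict:
    """Historical VaR/CVaR, annualized Sharpe ratio, and max drawdown."""
    var = returns.quantile(alpha)                  # historical Value at Risk
    cvar = returns[returns <= var].mean()          # average loss beyond VaR
    sharpe = np.sqrt(252) * returns.mean() / returns.std()
    wealth = (1 + returns).cumprod()
    max_dd = (wealth / wealth.cummax() - 1).min()
    return {"VaR": var, "CVaR": cvar, "sharpe": sharpe, "max_drawdown": max_dd}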

🔮 Prediction

The Prediction system employs ensemble forecasting methodologies to project future portfolio performance. The system evaluates multiple model combinations, selects top-performing strategies, and aggregates predictions to generate robust forward-looking estimates.

Capabilities:

  • Model Selection: Automatic evaluation of 20 model combinations (5 optimization methods × 4 risk models)
  • Ensemble Forecasting: Combines predictions from top-performing models using multiple time series methods
  • Forecasting Methods: Supports ARIMA, Prophet, LSTM, Exponential Smoothing, and Moving Average approaches
  • Future Returns Projection: Forecasts portfolio returns over user-specified horizons (e.g., 30 days, 90 days)
  • Confidence Intervals: Provides uncertainty estimates for predictions
  • Performance-Based Selection: Ranks and selects models based on historical performance metrics
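
Conceptually, the aggregation step averages the forecasts of the highest-ranked model combinations. A hedged sketch (ranking by a generic score is an assumption; the actual selection logic lives in src/model_collector.py and may differ):

import pandas as pd

def aggregate_top_forecasts(forecasts, scores, top_k=5):
    """Average forecast series (dict of pd.Series) from the top_k models by score."""
    top = sorted(scores, key=scores.get, reverse=True)[:top_k]
    return pd.concat([forecasts[m] for m in top], axis=1).mean(axis=1)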

🤖 LLM to Explain

The LLM Explanation feature leverages large language models to provide natural language interpretations of portfolio analysis results. The system uses Retrieval-Augmented Generation (RAG) to ground explanations in actual portfolio data, ensuring accurate and contextually relevant insights.

Capabilities:

  • AI-Powered Insights: Natural language explanations of portfolio performance, risk metrics, and allocation decisions
  • Context-Aware Responses: RAG system retrieves relevant portfolio data to provide accurate, data-grounded explanations
  • Interactive Q&A: Users can ask questions about their portfolio and receive detailed, contextual answers
  • Chart Explanations: Automatic generation of explanations for visualizations and performance charts
  • Educational Guidance: Provides investment education and interpretation guidance without making specific investment recommendations
  • Multi-Model Support: Supports multiple LLM providers (OpenAI GPT models, Google Gemini) with automatic fallback mechanisms
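
At its core, the retrieval step embeds portfolio facts and ranks them against the user's question. A minimal sketch using sentence-transformers with the all-MiniLM-L6-v2 model named in the environment configuration (the facts below are invented placeholders, not real output):

from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

facts = [  # placeholder portfolio facts; the real system embeds actual results
    "Annualized Sharpe ratio over the backtest period: 1.2",
    "Maximum drawdown: -18%",
    "Largest allocation: XLK at 27% of capital",
]
fact_embeddings = model.encode(facts, convert_to_tensor=True)

def retrieve_context(question, top_k=2):
    """Return the top_k facts most similar to the question by cosine similarity."""
    query = model.encode(question, convert_to_tensor=True)
    hits = util.semantic_search(query, fact_embeddings, top_k=top_k)[0]
    return [facts[hit["corpus_id"]] for hit in hits]

The retrieved facts are then prepended to the LLM prompt so answers stay grounded in the user's actual data.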

Installation

Prerequisites

  • Python 3.8 or higher
  • Node.js 18+ and npm
  • pip package manager

Step 1: Install Backend Dependencies

# Install Python dependencies
pip install -r requirements.txt

# Install backend-specific dependencies
cd web/backend
pip install -r requirements.txt
cd ../..

Step 2: Set Up API Keys and Environment Variables

FinLove requires API keys for LLM-powered features (such as Gemini) and uses environment variables for configuration.

  1. Set up environment variables for the backend API:

    • Copy the example environment file:

      cp web/backend/env.example.txt web/backend/.env
    • Edit web/backend/.env to add your own credentials and API keys.
      Example contents (see web/backend/env.example.txt for full list):

      # Multiple API keys (comma-separated, for rotation)
      GEMINI_API_KEYS=your-gemini-api-key-here
      
      # Multiple models (comma-separated)
      GEMINI_MODELS=gemini-2.5-pro
      
      # Embedding model for vector search
      HUGGINGFACE_EMBEDDING_MODEL=all-MiniLM-L6-v2
      
    • Note: You must provide at least one valid GEMINI API key for LLM explanations and chatbot features.
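
The comma-separated convention implies the backend splits these variables into lists for rotation. A minimal sketch of reading them with python-dotenv (illustrative only, not the backend's actual code):

import os
from dotenv import load_dotenv

load_dotenv("web/backend/.env")

# Comma-separated values become lists, enabling key/model rotation on quota errors.
api_keys = [k.strip() for k in os.getenv("GEMINI_API_KEYS", "").split(",") if k.strip()]
models = [m.strip() for m in os.getenv("GEMINI_MODELS", "").split(",") if m.strip()]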

Step 3: Install Frontend Dependencies

cd web/frontend
npm install
cd ../..

Step 4: (Optional) Pre-download Data

For improved performance, pre-download financial datasets:

python scripts/download_data.py

See DATA.md for detailed information about data download and caching mechanisms.

Step 5: Run the Web Application

The main FinLove platform is a full-stack web application with Next.js frontend and FastAPI backend:

Terminal 1 - Start Backend:

cd web/backend
python app.py

The backend API will run on http://localhost:8000

Terminal 2 - Start Frontend:

cd web/frontend
npm run dev

The web application will be available at http://localhost:3000

The frontend automatically proxies API requests to the backend, so you only need to access http://localhost:3000 in your browser.


Quick Start Guide

Using the Web Application

  1. Start the Application

    # Terminal 1: Start backend
    cd web/backend
    python app.py
    
    # Terminal 2: Start frontend
    cd web/frontend
    npm run dev
  2. Access the Application

    • Open http://localhost:3000 in your web browser
    • The frontend will automatically connect to the backend API
  3. Select Assets

    • Enter company tickers separated by commas (e.g., AAPL,MSFT,GOOGL)
    • Or use the default sector ETFs option
  4. Configure Analysis Parameters

    • Date Range: Select start and end dates for historical analysis
    • Portfolio Objective: Choose optimization method (Markowitz, Sharpe, CVaR, etc.)
    • Risk Engine: Select risk model (Ledoit-Wolf recommended for stability)
    • Investment Capital: Specify total capital amount for investment plan
    • Risk Appetite: Adjust risk aversion parameter
    • Testing Style: Choose between simple or walk-forward backtesting
  5. Run Analysis

    • Click "Run Analysis" to execute portfolio optimization
    • Explore results across the main features:
      • 💰 Investment Plan: Dollar-based allocation strategy
      • 🔍 Analyze: Comprehensive performance and risk analysis
      • 🔮 Prediction: Future portfolio performance forecasts
      • 🤖 LLM Explanations: AI-powered Q&A about your portfolio
  6. Use LLM Explanations

    • The web application includes an integrated chatbot for asking questions about your portfolio
    • The RAG system provides context-aware answers based on your portfolio data

Alternative: Using the Streamlit Dashboard

For a simpler interface, you can use the Streamlit dashboard:

  1. Launch the Dashboard

    streamlit run dashboard/app.py
  2. Follow similar steps as above, but using the Streamlit interface at http://localhost:8501

Using the Prediction System Programmatically

from src.predict import predict_future_performance

# Run complete prediction pipeline
results = predict_future_performance(
    tickers=['AAPL', 'MSFT', 'GOOGL'],
    start_date="2015-01-01",
    end_date="2024-01-01",
    forecast_horizon=30,  # Forecast next 30 days
    forecast_method='ensemble',  # Use ensemble forecasting
    use_top_models=5  # Use top 5 performing models
)

# Access aggregated prediction
future_returns = results['aggregated_prediction']
print(f"Expected daily return: {future_returns.mean()*100:.4f}%")

Project Structure

FinLove/
├── src/                    # Core business logic and shared modules
│   ├── data.py            # Data acquisition and preprocessing
│   ├── risk.py            # Risk models (Ledoit-Wolf, GLASSO, GARCH, DCC)
│   ├── optimize.py        # Optimization methods (Markowitz, BL, CVaR, etc.)
│   ├── backtest.py        # Backtesting engine
│   ├── metrics.py         # Performance metrics calculation
│   ├── forecast.py        # Time series forecasting methods
│   ├── predict.py         # Prediction pipeline orchestration
│   └── model_collector.py # Model evaluation and selection
├── web/                   # Full-stack web application
│   ├── backend/          # FastAPI backend
│   │   ├── app.py        # API server entry point
│   │   ├── routers/      # API route definitions
│   │   └── src/          # Backend-specific modules (RAG system)
│   └── frontend/         # Next.js frontend
│       ├── app/          # Application pages and routing
│       └── components/   # Reusable UI components
├── scripts/               # Utility scripts
│   ├── download_data.py  # Data pre-download script
│   ├── example_prediction.py  # Example prediction usage
│   └── train_all_models.py  # Model training script
├── report_generation/     # Report generation utilities
│   ├── convert.py        # Markdown to PDF converter
│   └── report.md         # Project report (markdown)
├── models/                # Saved machine learning models
├── data/                  # Raw and processed data storage
├── data_cache/            # Cached financial data
├── evaluation/            # Model evaluation notebooks
├── requirements.txt       # Python dependencies
└── README.md             # This file

Technical Architecture

Risk Models

  • Ledoit-Wolf Shrinkage: Reduces estimation error by shrinking sample covariance matrix toward a structured target, improving stability in high-dimensional settings
  • Graphical LASSO (GLASSO): Estimates sparse precision matrix using L1 regularization, useful for identifying conditional independence relationships
  • GARCH(1,1): Models time-varying volatility per asset, capturing volatility clustering and heteroskedasticity
  • DCC (Dynamic Conditional Correlation): Estimates time-varying correlation structure, allowing for dynamic relationships between assets
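
As an illustration, Ledoit-Wolf shrinkage is available off the shelf in scikit-learn (a sketch with synthetic data; src/risk.py may implement these estimators differently):

import numpy as np
from sklearn.covariance import LedoitWolf

rng = np.random.default_rng(0)
returns = rng.normal(0.0, 0.01, size=(500, 11))  # 500 days x 11 assets, synthetic

lw = LedoitWolf().fit(returns)
sigma = lw.covariance_                           # shrunk covariance estimate
print(f"shrinkage intensity: {lw.shrinkage_:.3f}")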

Optimization Methods

  • Markowitz Mean-Variance: Maximizes expected return minus risk penalty: μ'w - (λ/2) * w'Σw
  • Minimum Variance: Minimizes portfolio variance subject to constraints
  • Sharpe Maximization: Maximizes risk-adjusted returns: (μ'w - rf) / sqrt(w'Σw)
  • Black-Litterman: Combines market equilibrium returns with investor views for more stable optimization
  • CVaR Optimization: Minimizes Conditional Value at Risk, focusing on tail risk management
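
The Markowitz objective above translates directly into a convex program. A minimal long-only sketch with cvxpy, which is already in the dependency list (inputs are illustrative; the platform's optimize_portfolio wraps these methods with richer constraints):

import cvxpy as cp
import numpy as np

mu = np.array([0.08, 0.06, 0.07])          # expected returns (illustrative)
Sigma = np.array([[0.10, 0.02, 0.01],
                  [0.02, 0.08, 0.02],
                  [0.01, 0.02, 0.09]])     # covariance matrix (illustrative)
lam = 1.0                                  # risk aversion

w = cp.Variable(3)
objective = cp.Maximize(mu @ w - (lam / 2) * cp.quad_form(w, Sigma))
constraints = [cp.sum(w) == 1, w >= 0]     # fully invested, long-only
cp.Problem(objective, constraints).solve()
print(np.round(w.value, 3))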

Backtesting Framework

  • Simple Backtest: One-time optimization using all available historical data
  • Walk-Forward Backtest: Rolling window approach with training and testing periods for realistic performance evaluation
  • Transaction Costs: Incorporates proportional transaction costs per rebalancing event
  • Rebalance Bands: Implements drift-based rebalancing to reduce unnecessary turnover
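
The drift-band idea can be sketched in a few lines: trade back to target only when some weight has drifted beyond the band, charging proportional costs on the traded amount (illustrative logic, not src/backtest.py itself):

import numpy as np

def apply_rebalance_band(current_w, target_w, band=0.05, cost_rate=0.001):
    """Rebalance only if any weight drifted beyond `band`; return (weights, cost)."""
    if np.abs(current_w - target_w).max() <= band:
        return current_w, 0.0                       # within band: no trade
    turnover = np.abs(target_w - current_w).sum()   # total traded fraction
    return target_w, cost_rate * turnover           # proportional transaction cost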

Forecasting System

  • ARIMA/SARIMA: Statistical time series models for trend and seasonality
  • Prophet: Additive trend/seasonality model with changepoint handling
  • LSTM: Recurrent neural network for non-linear temporal patterns
  • Exponential Smoothing: Forecasts from exponentially decaying weighted averages
  • Moving Average: Simple baseline forecast from recent observations
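
For instance, a 30-day ARIMA forecast on a returns series takes only a few lines with statsmodels (synthetic data shown; the platform's src/forecast.py orchestrates these models):

import numpy as np
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(0)
returns = pd.Series(rng.normal(0.0004, 0.01, 500))  # synthetic daily returns

fit = ARIMA(returns, order=(1, 0, 1)).fit()
forecast = fit.forecast(steps=30)                   # 30-day point forecast
print(forecast.head())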

LLM Integration

  • RAG Architecture: Retrieval-Augmented Generation system for context-aware responses
  • Multi-Provider Support: Compatible with OpenAI GPT models and Google Gemini
  • Automatic Fallback: Handles API quota limits and errors with graceful degradation
  • Portfolio Context: Embeds portfolio-specific data for accurate, grounded explanations
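
The fallback behavior can be pictured as trying each configured provider in turn until one succeeds (a generic sketch; the backend's actual error handling is more involved):

def ask_with_fallback(prompt, providers):
    """Try each (name, call) pair in order; `call` wraps a provider SDK,
    takes a prompt string, and returns generated text."""
    errors = []
    for name, call in providers:
        try:
            return call(prompt)
        except Exception as exc:   # quota, auth, or network failure
            errors.append(f"{name}: {exc}")
    raise RuntimeError("All LLM providers failed: " + "; ".join(errors))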

Usage Examples

Example 1: Investment Plan Generation

from src.data import prepare_portfolio_data
from src.risk import get_covariance_matrix
from src.optimize import optimize_portfolio

# Prepare data
tickers = ['AAPL', 'MSFT', 'GOOGL']
returns, prices = prepare_portfolio_data(tickers, start_date="2020-01-01")

# Estimate covariance
covariance = get_covariance_matrix(returns, method='ledoit_wolf')

# Optimize portfolio
weights = optimize_portfolio(
    returns,
    covariance,
    method='markowitz',
    constraints={'long_only': True},
    risk_aversion=1.0
)

# Convert to dollar allocation
investment_amount = 100000  # $100,000
dollar_allocation = weights * investment_amount
print(dollar_allocation)

Example 2: Comprehensive Analysis

from src.backtest import walk_forward_backtest
from src.metrics import calculate_all_metrics

# Run walk-forward backtest
portfolio_returns, weights_history, metrics = walk_forward_backtest(
    returns,
    train_window=36,  # 36 months training
    test_window=1,    # 1 month testing
    optimization_method='markowitz',
    risk_model='ledoit_wolf',
    transaction_cost=0.001,  # 0.1% transaction cost
    rebalance_band=0.05  # 5% rebalance band
)

# Calculate comprehensive metrics
all_metrics = calculate_all_metrics(portfolio_returns, weights_history)
print(f"Sharpe Ratio: {all_metrics['sharpe_ratio']:.3f}")
print(f"Annualized Return: {all_metrics['annualized_return']*100:.2f}%")
print(f"Max Drawdown: {all_metrics['max_drawdown']*100:.2f}%")

Example 3: Future Performance Prediction

from src.predict import predict_future_performance

# Generate predictions
results = predict_future_performance(
    tickers=['AAPL', 'MSFT', 'GOOGL'],
    start_date="2015-01-01",
    end_date="2024-01-01",
    forecast_horizon=30,
    forecast_method='ensemble',
    use_top_models=5
)

# Access results
future_returns = results['aggregated_prediction']
top_models = results['top_models']

print(f"Expected 30-day return: {(1 + future_returns).prod() - 1:.2%}")
print(f"Top models used: {list(top_models['model_id'])}")

Dependencies

Core Dependencies

  • Data Processing: numpy>=1.24.0, pandas>=2.0.0, scipy>=1.10.0
  • Financial Data: yfinance>=0.2.28
  • Optimization: cvxpy>=1.3.0, scikit-learn>=1.3.0
  • Risk Models: arch>=6.2.0 (for GARCH/DCC)

Visualization

  • matplotlib>=3.7.0
  • seaborn>=0.12.0
  • plotly>=5.14.0

Web Application

  • Backend: FastAPI (included in requirements)
  • Frontend: Next.js, React, TypeScript (see web/frontend/package.json)

Alternative Dashboard

  • streamlit>=1.28.0 (for Streamlit dashboard alternative)

AI/ML (Optional)

  • openai>=1.12.0 (for LLM explanations)
  • statsmodels>=0.14.0 (for ARIMA forecasting)
  • prophet>=1.1.4 (for Prophet forecasting)
  • tensorflow>=2.13.0 (for LSTM models)
  • xgboost>=2.0.0 (for XGBoost forecasting)

Report Generation (Optional)

  • markdown>=3.4.0 (for markdown to HTML conversion)
  • weasyprint>=60.0 (for HTML to PDF conversion)
  • pypandoc>=1.12 (optional, for better LaTeX support in PDFs - requires pandoc binary)

See requirements.txt for the complete list of dependencies.


Data Sources

  • Primary Source: Yahoo Finance via yfinance library
  • Default Universe: 11 liquid Sector ETFs (XLK, XLF, XLV, XLY, XLP, XLE, XLI, XLB, XLU, XLRE, XLC)
  • Data Types: Historical prices (OHLCV), company fundamentals, market data
  • Frequency: Daily data
  • Caching: Automatic 24-hour cache for improved performance

For detailed information about data management, see DATA.md.
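
A 24-hour file cache around yfinance downloads can be sketched as follows (illustrative only; the project's own caching mechanism is described in DATA.md and may differ):

import time
from pathlib import Path

import pandas as pd
import yfinance as yf

CACHE_DIR = Path("data_cache")
MAX_AGE = 24 * 3600  # seconds; matches the 24-hour policy above

def cached_prices(tickers, start, end):
    """Download daily close prices, reusing an on-disk copy younger than 24 hours."""
    CACHE_DIR.mkdir(exist_ok=True)
    path = CACHE_DIR / ("_".join(sorted(tickers)) + f"_{start}_{end}.parquet")
    if path.exists() and time.time() - path.stat().st_mtime < MAX_AGE:
        return pd.read_parquet(path)
    prices = yf.download(tickers, start=start, end=end)["Close"]
    prices.to_parquet(path)
    return prices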


Contributors

This project is the result of collaborative effort by the following team members:

  • Cao Pham Minh Dang: Prediction models and forecasting system
  • Tran Anh Chuong: Data cleaning, exploratory data analysis (EDA), and dashboard development
  • Pham Dinh Hieu: LLM integration, RAG system, and dashboard features
  • Nguyen Van Duy Anh: Risk models and optimization algorithms
  • Ngo Dinh Khanh: Dashboard development and landing page design

Project Advisors

  • Nguyen Huy Hung: Project Advisor
  • Dr. Mo El-Haj: Project Advisor

Best Practices

  1. Data Quality: Ensure at least 2-3 years of historical data for reliable analysis
  2. Risk Models: Ledoit-Wolf is recommended for most use cases due to its stability
  3. Transaction Costs: Include realistic transaction costs (0.1-0.5% for stocks) in backtesting
  4. Walk-Forward Testing: Use walk-forward backtesting for more realistic performance estimates
  5. Model Selection: The prediction system automatically selects top-performing models, but review the selection criteria
  6. LLM API Keys: Store API keys securely and be aware of usage quotas when using LLM explanations

Troubleshooting

Common Issues

"Insufficient data" Error

  • Verify that tickers are valid and have sufficient historical data for the selected date range
  • Try adjusting the date range or selecting different tickers

"No valid data after cleaning" Error

  • Some tickers may have excessive missing values
  • Remove problematic tickers or use a shorter date range

Slow Performance

  • Reduce the number of tickers in the portfolio
  • Use shorter date ranges for analysis
  • Prefer simpler risk models (sample or ledoit_wolf) for faster computation
  • Pre-download data using scripts/download_data.py

Web Application Setup Issues

  • Ensure both backend and frontend are running in separate terminals
  • Verify backend is accessible at http://localhost:8000 (check /health endpoint)
  • Check that frontend can connect to backend (check browser console for API errors)
  • Ensure Node.js 18+ is installed for the frontend
  • Run npm install in web/frontend if dependencies are missing

LLM API Errors

  • Verify API key is valid and has sufficient quota
  • Check network connectivity
  • The system includes automatic fallback mechanisms for quota errors

Project Status

Production Ready - All core features implemented and tested

Completed Features:

  • ✅ Multiple risk models (Ledoit-Wolf, GLASSO, GARCH, DCC)
  • ✅ Various optimization methods (Markowitz, Black-Litterman, CVaR, Minimum Variance, Sharpe)
  • ✅ Realistic backtesting with transaction costs and rebalance bands
  • ✅ Investment plan generation with dollar allocations
  • ✅ Comprehensive analysis and visualization
  • ✅ Prediction system with ensemble forecasting
  • ✅ LLM-powered explanations with RAG architecture
  • ✅ Full-stack web application (Next.js frontend + FastAPI backend)
  • ✅ Alternative Streamlit dashboard for quick prototyping
  • ✅ Data caching system for performance optimization
  • ✅ Comprehensive documentation

License

See LICENSE file for details.


Support and Contributions

For issues, questions, or contributions, please open an issue or submit a pull request at https://github.com/dinhieufam/FinLove.


Disclaimer

This software is provided for educational and research purposes. The predictions, analyses, and recommendations generated by this system should not be considered as financial advice. Users should conduct their own due diligence and consult with qualified financial professionals before making investment decisions. Past performance does not guarantee future results.
