Authors: Nguyen Van Duy Anh · Pham Dinh Hieu · Cao Pham Minh Dang · Tran Anh Chuong · Ngo Dinh Khanh
GitHub: https://github.com/dinhieufam/FinLove
FinLove is a comprehensive portfolio construction and analysis platform that integrates advanced quantitative finance methodologies with modern machine learning techniques. The system provides intelligent investment planning, in-depth risk analysis, forward-looking predictions, and AI-powered explanations to support informed investment decisions.
The platform is built as a full-stack web application with a Next.js frontend and FastAPI backend, providing a modern, responsive user interface. It combines sophisticated risk models, multiple optimization strategies, realistic backtesting capabilities, and state-of-the-art forecasting methods to deliver actionable portfolio insights. FinLove is designed for both individual investors seeking professional-grade portfolio management tools and financial professionals requiring robust analytical capabilities.
Additionally, a Streamlit dashboard is available as an alternative interface for quick prototyping and analysis.
The Investment Plan feature transforms optimized portfolio weights into actionable dollar-based allocation strategies. Users can specify their total investment capital, and the system automatically calculates precise dollar allocations for each asset based on the optimized weights generated by the portfolio construction engine.
Capabilities:
- Convert optimized portfolio weights to dollar allocations (see the sketch after this list)
- Support for customizable investment amounts
- Real-time allocation updates based on portfolio optimization results
- Visual representation of capital deployment across assets
- Detailed allocation tables with percentage and dollar breakdowns
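For illustration, here is a minimal sketch of the weight-to-dollar conversion using hypothetical weights and prices; it is not FinLove's internal API:

```python
import pandas as pd

# Hypothetical optimized weights and last prices (illustrative values only)
weights = pd.Series({"AAPL": 0.40, "MSFT": 0.35, "GOOGL": 0.25})
prices = pd.Series({"AAPL": 190.0, "MSFT": 410.0, "GOOGL": 140.0})
capital = 100_000  # total investment capital in dollars

allocation = weights * capital   # dollar allocation per asset
shares = allocation / prices     # fractional share counts; use floor() for whole-share brokers
print(allocation)
print(shares.round(2))
```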
The Analyze module provides comprehensive portfolio performance and risk assessment through advanced statistical analysis and visualization. It leverages multiple risk models and performance metrics to deliver deep insights into portfolio behavior.
Capabilities:
- Performance Analysis: Cumulative returns, rolling Sharpe ratios, drawdown analysis, and benchmark comparisons
- Risk Assessment: Value at Risk (VaR), Conditional VaR (CVaR), volatility analysis, and correlation matrices (a VaR/CVaR sketch follows this list)
- Portfolio Composition: Current allocation visualization, weight evolution over time, and concentration analysis
- Statistical Metrics: Annualized returns, volatility, Sharpe ratio, maximum drawdown, turnover, and weight stability
- Risk Model Diagnostics: Comprehensive analysis using multiple risk estimation methods (Ledoit-Wolf, GLASSO, GARCH, DCC)
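As a reference for the risk metrics above, here is a minimal historical VaR/CVaR sketch on synthetic returns; FinLove's own estimators may differ:

```python
import numpy as np

def var_cvar(returns: np.ndarray, alpha: float = 0.95):
    # Historical VaR: the loss threshold exceeded with probability 1 - alpha
    var = -np.quantile(returns, 1 - alpha)
    # CVaR (expected shortfall): average loss given the VaR threshold is breached
    cvar = -returns[returns <= -var].mean()
    return var, cvar

daily = np.random.default_rng(42).normal(0.0005, 0.01, 2500)  # synthetic daily returns
var95, cvar95 = var_cvar(daily)
print(f"95% VaR: {var95:.4f}, 95% CVaR: {cvar95:.4f}")
```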
The Prediction system employs ensemble forecasting to project future portfolio performance. It evaluates multiple model combinations, selects top-performing strategies, and aggregates their predictions into robust forward-looking estimates.
Capabilities:
- Model Selection: Automatic evaluation of 20 model combinations (5 optimization methods × 4 risk models)
- Ensemble Forecasting: Combines predictions from top-performing models using multiple time series methods (an aggregation sketch follows this list)
- Forecasting Methods: Supports ARIMA, Prophet, LSTM, Exponential Smoothing, and Moving Average approaches
- Future Returns Projection: Forecasts portfolio returns over user-specified horizons (e.g., 30 days, 90 days)
- Confidence Intervals: Provides uncertainty estimates for predictions
- Performance-Based Selection: Ranks and selects models based on historical performance metrics
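The aggregation step can be pictured as averaging per-model forecasts. A minimal sketch assuming equal weights; FinLove ranks models by historical performance, which this simplification omits:

```python
import numpy as np

# Hypothetical 30-day return forecasts from three models (illustrative constants)
forecasts = {
    "arima":   np.full(30, 0.0004),
    "prophet": np.full(30, 0.0006),
    "ets":     np.full(30, 0.0005),
}

# Equal-weight ensemble: element-wise mean across the model forecasts
ensemble = np.vstack(list(forecasts.values())).mean(axis=0)
print(ensemble[:5])
```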
The LLM Explanation feature leverages large language models to provide natural language interpretations of portfolio analysis results. The system uses Retrieval-Augmented Generation (RAG) to ground explanations in actual portfolio data, ensuring accurate and contextually relevant insights.
Capabilities:
- AI-Powered Insights: Natural language explanations of portfolio performance, risk metrics, and allocation decisions
- Context-Aware Responses: RAG system retrieves relevant portfolio data to provide accurate, data-grounded explanations
- Interactive Q&A: Users can ask questions about their portfolio and receive detailed, contextual answers
- Chart Explanations: Automatic generation of explanations for visualizations and performance charts
- Educational Guidance: Provides investment education and interpretation guidance without making specific investment recommendations
- Multi-Model Support: Supports multiple LLM providers (OpenAI GPT models, Google Gemini) with automatic fallback mechanisms
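The fallback behavior can be sketched as trying providers in order. A minimal illustration with stand-in provider functions; these are hypothetical helpers, not the project's actual client code:

```python
def call_gemini(prompt: str) -> str:
    raise RuntimeError("quota exceeded")  # stand-in for a real Gemini client call

def call_openai(prompt: str) -> str:
    return f"(OpenAI) explanation for: {prompt}"  # stand-in for a real OpenAI call

def explain(prompt: str) -> str:
    # Try each provider in order; fall through to the next on quota or API errors
    for provider in (call_gemini, call_openai):
        try:
            return provider(prompt)
        except Exception:
            continue
    raise RuntimeError("All LLM providers failed")

print(explain("Summarize my portfolio risk"))
```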
Prerequisites:

- Python 3.8 or higher
- Node.js 18+ and npm
- pip package manager
```bash
# Install Python dependencies
pip install -r requirements.txt

# Install backend-specific dependencies
cd web/backend
pip install -r requirements.txt
cd ../..
```

FinLove requires API keys for LLM-powered features (such as Gemini) and uses environment variables for configuration.
Set up environment variables for the backend API:

1. Copy the example environment file:

   ```bash
   cp web/backend/env.example.txt web/backend/.env
   ```

2. Edit `web/backend/.env` to add your own credentials and API keys. Example contents (see `web/backend/env.example.txt` for the full list):

   ```bash
   # Multiple API keys (comma-separated, for rotation)
   GEMINI_API_KEYS=your-gemini-api-key-here

   # Multiple models (comma-separated)
   GEMINI_MODELS=gemini-2.5-pro

   # Embedding model for vector search
   HUGGINGFACE_EMBEDDING_MODEL=all-MiniLM-L6-v2
   ```

Note: You must provide at least one valid Gemini API key for the LLM explanation and chatbot features.

Install the frontend dependencies:

```bash
cd web/frontend
npm install
cd ../..
```

For improved performance, pre-download financial datasets:

```bash
python scripts/download_data.py
```

See DATA.md for detailed information about data download and caching mechanisms.
The main FinLove platform is a full-stack web application with Next.js frontend and FastAPI backend:
Terminal 1 - Start Backend:

```bash
cd web/backend
python app.py
```

The backend API will run on http://localhost:8000

Terminal 2 - Start Frontend:

```bash
cd web/frontend
npm run dev
```

The web application will be available at http://localhost:3000
The frontend automatically proxies API requests to the backend, so you only need to access http://localhost:3000 in your browser.
To use the platform:

1. Start the Application

   ```bash
   # Terminal 1: Start backend
   cd web/backend
   python app.py

   # Terminal 2: Start frontend
   cd web/frontend
   npm run dev
   ```

2. Access the Application
   - Open http://localhost:3000 in your web browser
   - The frontend will automatically connect to the backend API

3. Select Assets
   - Enter company tickers separated by commas (e.g., `AAPL,MSFT,GOOGL`)
   - Or use the default sector ETFs option

4. Configure Analysis Parameters
   - Date Range: Select start and end dates for historical analysis
   - Portfolio Objective: Choose an optimization method (Markowitz, Sharpe, CVaR, etc.)
   - Risk Engine: Select a risk model (Ledoit-Wolf recommended for stability)
   - Investment Capital: Specify the total capital amount for the investment plan
   - Risk Appetite: Adjust the risk aversion parameter
   - Testing Style: Choose between simple or walk-forward backtesting

5. Run Analysis
   - Click "Run Analysis" to execute portfolio optimization
   - Explore results across the main features:
     - 💰 Investment Plan: Dollar-based allocation strategy
     - 🔍 Analyze: Comprehensive performance and risk analysis
     - 🔮 Prediction: Future portfolio performance forecasts
     - 🤖 LLM Explanations: AI-powered Q&A about your portfolio

6. Use LLM Explanations
   - The web application includes an integrated chatbot for asking questions about your portfolio
   - The RAG system provides context-aware answers based on your portfolio data
For a simpler interface, you can use the Streamlit dashboard:
1. Launch the Dashboard

   ```bash
   streamlit run dashboard/app.py
   ```

2. Follow similar steps as above, using the Streamlit interface at http://localhost:8501
```python
from src.predict import predict_future_performance

# Run complete prediction pipeline
results = predict_future_performance(
    tickers=['AAPL', 'MSFT', 'GOOGL'],
    start_date="2015-01-01",
    end_date="2024-01-01",
    forecast_horizon=30,         # Forecast next 30 days
    forecast_method='ensemble',  # Use ensemble forecasting
    use_top_models=5             # Use top 5 performing models
)

# Access aggregated prediction
future_returns = results['aggregated_prediction']
print(f"Expected daily return: {future_returns.mean()*100:.4f}%")
```

Project structure:

```
FinLove/
├── src/ # Core business logic and shared modules
│ ├── data.py # Data acquisition and preprocessing
│ ├── risk.py # Risk models (Ledoit-Wolf, GLASSO, GARCH, DCC)
│ ├── optimize.py # Optimization methods (Markowitz, BL, CVaR, etc.)
│ ├── backtest.py # Backtesting engine
│ ├── metrics.py # Performance metrics calculation
│ ├── forecast.py # Time series forecasting methods
│ ├── predict.py # Prediction pipeline orchestration
│ └── model_collector.py # Model evaluation and selection
├── web/ # Full-stack web application
│ ├── backend/ # FastAPI backend
│ │ ├── app.py # API server entry point
│ │ ├── routers/ # API route definitions
│ │ └── src/ # Backend-specific modules (RAG system)
│ └── frontend/ # Next.js frontend
│ ├── app/ # Application pages and routing
│ └── components/ # Reusable UI components
├── scripts/ # Utility scripts
│ ├── download_data.py # Data pre-download script
│ ├── example_prediction.py # Example prediction usage
│ └── train_all_models.py # Model training script
├── report_generation/ # Report generation utilities
│ ├── convert.py # Markdown to PDF converter
│ └── report.md # Project report (markdown)
├── models/ # Saved machine learning models
├── data/ # Raw and processed data storage
├── data_cache/ # Cached financial data
├── evaluation/ # Model evaluation notebooks
├── requirements.txt # Python dependencies
├── README.md # This file
```
Risk Models:

- Ledoit-Wolf Shrinkage: Reduces estimation error by shrinking the sample covariance matrix toward a structured target, improving stability in high-dimensional settings (see the sketch after this list)
- Graphical LASSO (GLASSO): Estimates sparse precision matrix using L1 regularization, useful for identifying conditional independence relationships
- GARCH(1,1): Models time-varying volatility per asset, capturing volatility clustering and heteroskedasticity
- DCC (Dynamic Conditional Correlation): Estimates time-varying correlation structure, allowing for dynamic relationships between assets
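For the Ledoit-Wolf estimator, scikit-learn (already a dependency) provides a standard implementation. A minimal sketch on synthetic returns; not necessarily how `src/risk.py` wires it up:

```python
import numpy as np
from sklearn.covariance import LedoitWolf

# Synthetic daily returns: 500 observations of 10 assets
X = np.random.default_rng(0).normal(0, 0.01, size=(500, 10))

lw = LedoitWolf().fit(X)
print(lw.covariance_.shape)  # (10, 10) shrunk covariance matrix
print(lw.shrinkage_)         # shrinkage intensity chosen analytically
```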
Optimization Methods:

- Markowitz Mean-Variance: Maximizes expected return minus a risk penalty: μ'w - (λ/2) * w'Σw (a cvxpy sketch follows this list)
- Minimum Variance: Minimizes portfolio variance subject to constraints
- Sharpe Maximization: Maximizes risk-adjusted returns: (μ'w - rf) / sqrt(w'Σw)
- Black-Litterman: Combines market equilibrium returns with investor views for more stable optimization
- CVaR Optimization: Minimizes Conditional Value at Risk, focusing on tail risk management
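The Markowitz objective above maps directly onto a convex program. A minimal long-only sketch with cvxpy (the project's solver dependency); FinLove's `optimize_portfolio` likely adds further constraints:

```python
import cvxpy as cp
import numpy as np

def markowitz_weights(mu: np.ndarray, sigma: np.ndarray, risk_aversion: float = 1.0):
    # Maximize mu'w - (lambda/2) * w'Σw subject to full investment and long-only weights
    w = cp.Variable(len(mu))
    objective = cp.Maximize(mu @ w - (risk_aversion / 2) * cp.quad_form(w, sigma))
    cp.Problem(objective, [cp.sum(w) == 1, w >= 0]).solve()
    return w.value

mu = np.array([0.08, 0.10, 0.12])    # hypothetical expected returns
sigma = np.diag([0.04, 0.05, 0.09])  # hypothetical (diagonal) covariance matrix
print(markowitz_weights(mu, sigma))
```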
Backtesting:

- Simple Backtest: One-time optimization using all available historical data
- Walk-Forward Backtest: Rolling window approach with training and testing periods for realistic performance evaluation
- Transaction Costs: Incorporates proportional transaction costs per rebalancing event
- Rebalance Bands: Implements drift-based rebalancing to reduce unnecessary turnover
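The rebalance-band rule can be expressed in a few lines. A sketch of the drift check with a hypothetical `needs_rebalance` helper, not the engine's actual interface:

```python
import numpy as np

def needs_rebalance(current: np.ndarray, target: np.ndarray, band: float = 0.05) -> bool:
    # Trade only when some asset has drifted more than `band` from its target weight
    return bool(np.any(np.abs(current - target) > band))

target = np.array([0.5, 0.3, 0.2])
drifted = np.array([0.57, 0.27, 0.16])   # weights after market moves
print(needs_rebalance(drifted, target))  # True: 7% drift on the first asset
```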
Forecasting Methods:

- ARIMA/SARIMA: Statistical time series models for trend and seasonality
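As a reference point for these statistical models, here is a minimal ARIMA forecast with statsmodels (a listed dependency), fitted on synthetic returns rather than real portfolio data:

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

# Synthetic daily-return series standing in for portfolio returns
returns = pd.Series(np.random.default_rng(0).normal(0.0005, 0.01, 500))

fitted = ARIMA(returns, order=(1, 0, 1)).fit()  # ARIMA(p=1, d=0, q=1)
print(fitted.forecast(steps=30).head())         # 30-step-ahead mean forecast
```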
LLM Integration:

- RAG Architecture: Retrieval-Augmented Generation system for context-aware responses (a retrieval sketch follows this list)
- Multi-Provider Support: Compatible with OpenAI GPT models and Google Gemini
- Automatic Fallback: Handles API quota limits and errors with graceful degradation
- Portfolio Context: Embeds portfolio-specific data for accurate, grounded explanations
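The retrieval step can be sketched with the embedding model named in `env.example.txt` (`all-MiniLM-L6-v2`). This illustrates the idea only; it is not the backend's actual RAG code, and the portfolio facts below are hypothetical:

```python
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

# Hypothetical portfolio facts that would be embedded and indexed
chunks = [
    "Portfolio Sharpe ratio over the backtest: 1.12",
    "Maximum drawdown: -18.4%, reached in March 2020",
    "Largest holding: XLK at 22% of capital",
]
emb = model.encode(chunks, normalize_embeddings=True)
query = model.encode(["Why is my drawdown so large?"], normalize_embeddings=True)

# Cosine similarity (dot product of normalized vectors) selects the grounding context
best = chunks[int(np.argmax(emb @ query.T))]
prompt = f"Context: {best}\nQuestion: Why is my drawdown so large?"
print(prompt)
```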
Usage Examples:

Portfolio optimization and investment plan:

```python
from src.data import prepare_portfolio_data
from src.risk import get_covariance_matrix
from src.optimize import optimize_portfolio

# Prepare data
tickers = ['AAPL', 'MSFT', 'GOOGL']
returns, prices = prepare_portfolio_data(tickers, start_date="2020-01-01")

# Estimate covariance
covariance = get_covariance_matrix(returns, method='ledoit_wolf')

# Optimize portfolio
weights = optimize_portfolio(
    returns,
    covariance,
    method='markowitz',
    constraints={'long_only': True},
    risk_aversion=1.0
)

# Convert to dollar allocation
investment_amount = 100000  # $100,000
dollar_allocation = weights * investment_amount
print(dollar_allocation)
```

Walk-forward backtesting:

```python
from src.backtest import walk_forward_backtest
from src.metrics import calculate_all_metrics
# Run walk-forward backtest
portfolio_returns, weights_history, metrics = walk_forward_backtest(
returns,
train_window=36, # 36 months training
test_window=1, # 1 month testing
optimization_method='markowitz',
risk_model='ledoit_wolf',
transaction_cost=0.001, # 0.1% transaction cost
rebalance_band=0.05 # 5% rebalance band
)
# Calculate comprehensive metrics
all_metrics = calculate_all_metrics(portfolio_returns, weights_history)
print(f"Sharpe Ratio: {all_metrics['sharpe_ratio']:.3f}")
print(f"Annualized Return: {all_metrics['annualized_return']*100:.2f}%")
print(f"Max Drawdown: {all_metrics['max_drawdown']*100:.2f}%")from src.predict import predict_future_performance
# Generate predictions
results = predict_future_performance(
tickers=['AAPL', 'MSFT', 'GOOGL'],
start_date="2015-01-01",
end_date="2024-01-01",
forecast_horizon=30,
forecast_method='ensemble',
use_top_models=5
)
# Access results
future_returns = results['aggregated_prediction']
top_models = results['top_models']
print(f"Expected 30-day return: {(1 + future_returns).prod() - 1:.2%}")
print(f"Top models used: {list(top_models['model_id'])}")- Data Processing:
numpy>=1.24.0,pandas>=2.0.0,scipy>=1.10.0 - Financial Data:
yfinance>=0.2.28 - Optimization:
cvxpy>=1.3.0,scikit-learn>=1.3.0 - Risk Models:
arch>=6.2.0(for GARCH/DCC)
matplotlib>=3.7.0seaborn>=0.12.0plotly>=5.14.0
- Backend: FastAPI (included in requirements)
- Frontend: Next.js, React, TypeScript (see
web/frontend/package.json)
streamlit>=1.28.0(for Streamlit dashboard alternative)
openai>=1.12.0(for LLM explanations)statsmodels>=0.14.0(for ARIMA forecasting)prophet>=1.1.4(for Prophet forecasting)tensorflow>=2.13.0(for LSTM models)xgboost>=2.0.0(for XGBoost forecasting)
markdown>=3.4.0(for markdown to HTML conversion)weasyprint>=60.0(for HTML to PDF conversion)pypandoc>=1.12(optional, for better LaTeX support in PDFs - requires pandoc binary)
See requirements.txt for the complete list of dependencies.
Data:

- Primary Source: Yahoo Finance via the `yfinance` library
- Default Universe: 11 liquid sector ETFs (XLK, XLF, XLV, XLY, XLP, XLE, XLI, XLB, XLU, XLRE, XLC)
- Data Types: Historical prices (OHLCV), company fundamentals, market data
- Frequency: Daily data
- Caching: Automatic 24-hour cache for improved performance (see the sketch below)
For detailed information about data management, see DATA.md.
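As an illustration of the 24-hour cache policy mentioned above, a sketch with a hypothetical `is_fresh` helper and cache path; this is not the project's actual cache code:

```python
import time
from pathlib import Path

CACHE_TTL_SECONDS = 24 * 3600  # 24-hour window

def is_fresh(path: Path) -> bool:
    # Treat a cached file as valid if it was modified within the TTL
    return path.exists() and (time.time() - path.stat().st_mtime) < CACHE_TTL_SECONDS

print(is_fresh(Path("data_cache/XLK.parquet")))  # hypothetical cache file path
```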
This project is the result of collaborative effort by the following team members:
- Cao Pham Minh Dang: Prediction models and forecasting system
- Tran Anh Chuong: Data cleaning, exploratory data analysis (EDA), and dashboard development
- Pham Dinh Hieu: LLM integration, RAG system, and dashboard features
- Nguyen Van Duy Anh: Risk models and optimization algorithms
- Ngo Dinh Khanh: Dashboard development and landing page design
- Nguyen Huy Hung: Project Advisor
- Dr. Mo El-Haj: Project Advisor
Tips:

- Data Quality: Ensure at least 2-3 years of historical data for reliable analysis
- Risk Models: Ledoit-Wolf is recommended for most use cases due to its stability
- Transaction Costs: Include realistic transaction costs (0.1-0.5% for stocks) in backtesting
- Walk-Forward Testing: Use walk-forward backtesting for more realistic performance estimates
- Model Selection: The prediction system automatically selects top-performing models, but review the selection criteria
- LLM API Keys: Store API keys securely and be aware of usage quotas when using LLM explanations
"Insufficient data" Error
- Verify that tickers are valid and have sufficient historical data for the selected date range
- Try adjusting the date range or selecting different tickers
"No valid data after cleaning" Error
- Some tickers may have excessive missing values
- Remove problematic tickers or use a shorter date range
Slow Performance
- Reduce the number of tickers in the portfolio
- Use shorter date ranges for analysis
- Prefer simpler risk models (sample or ledoit_wolf) for faster computation
- Pre-download data using `scripts/download_data.py`
Web Application Setup Issues
- Ensure both backend and frontend are running in separate terminals
- Verify backend is accessible at http://localhost:8000 (check the /health endpoint)
- Check that frontend can connect to backend (check the browser console for API errors)
- Ensure Node.js 18+ is installed for the frontend
- Ensure Node.js 18+ is installed for the frontend
- Run `npm install` in `web/frontend` if dependencies are missing
LLM API Errors
- Verify API key is valid and has sufficient quota
- Check network connectivity
- The system includes automatic fallback mechanisms for quota errors
✅ Production Ready - All core features implemented and tested
Completed Features:
- ✅ Multiple risk models (Ledoit-Wolf, GLASSO, GARCH, DCC)
- ✅ Various optimization methods (Markowitz, Black-Litterman, CVaR, Minimum Variance, Sharpe)
- ✅ Realistic backtesting with transaction costs and rebalance bands
- ✅ Investment plan generation with dollar allocations
- ✅ Comprehensive analysis and visualization
- ✅ Prediction system with ensemble forecasting
- ✅ LLM-powered explanations with RAG architecture
- ✅ Full-stack web application (Next.js frontend + FastAPI backend)
- ✅ Alternative Streamlit dashboard for quick prototyping
- ✅ Data caching system for performance optimization
- ✅ Comprehensive documentation
See the LICENSE file for details.
For issues, questions, or contributions:
- GitHub Issues: Report bugs or request features via GitHub issues
- Documentation: Refer to the documentation files for detailed guides
- Repository: https://github.com/dinhieufam/FinLove
This software is provided for educational and research purposes. The predictions, analyses, and recommendations generated by this system should not be considered as financial advice. Users should conduct their own due diligence and consult with qualified financial professionals before making investment decisions. Past performance does not guarantee future results.