This project studies the continuous-time Merton portfolio optimization problem through the lens of Hamilton–Jacobi–Bellman (HJB) equations and performs a comparative analysis of three solution methods, using error metrics, policy comparisons, and spatial–temporal error localization:
- Closed-form analytical solution
- Finite-difference numerical solver (C++, implicit scheme)
- Physics-Informed Neural Network (PINN)
The goal is not to showcase neural networks as a superior replacement but to analyze when classical numerical methods outperform neural solvers and when PINNs can still recover economically meaningful controls, particularly in higher dimensions and realistic market settings.
The Merton portfolio problem is a continuous-time stochastic control problem where an investor allocates wealth between:
- A risk-free asset with constant return ( r )
- A risky asset following geometric Brownian motion: see Eq. 1
The investor chooses a trading strategy ( \pi_t ) (dollar amount invested in the risky asset) to maximize expected utility of terminal wealth: see Eq. 3
where wealth evolves as: see Eq. 2
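For convenience, the standard forms of these equations are reproduced here (they should correspond to Eqs. 1–3 in docs/mathjax.html):

[ dS_t = \mu S_t \, dt + \sigma S_t \, dW_t \quad \text{(Eq. 1)} ]

[ dX_t = \left[ r X_t + \pi_t (\mu - r) \right] dt + \sigma \pi_t \, dW_t \quad \text{(Eq. 2)} ]

[ \sup_{\pi} \, \mathbb{E}\left[ U(X_T) \right] \quad \text{(Eq. 3)} ]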
The Merton problem is well suited to this comparison:
- It is the canonical benchmark in continuous-time portfolio theory.
- It admits a closed-form solution under CRRA preferences.
- It serves as an ideal testbed for HJB solvers.
- It allows precise analytical validation of numerical and neural methods.
The HJB equation is a partial differential equation that characterizes the optimal value function in optimal control and reinforcement learning problems. It encodes the dynamic programming principle: an optimal policy remains optimal over any remaining time horizon, or, in plain terms, the best decision now combined with the best decisions later should minimize (or maximize) total cost.
Let V(t,x) denote the value function. Applying dynamic programming yields the HJB equation along with the terminal condition: see Eq. 4
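In standard form (this should correspond to Eq. 4), the HJB equation for this problem reads:

[ V_t + \max_{\pi} \left\{ \left( r x + \pi (\mu - r) \right) V_x + \tfrac{1}{2} \sigma^2 \pi^2 V_{xx} \right\} = 0, \qquad V(T, x) = U(x) ]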
The value function satisfies the following properties:
- ( V_x > 0 ): the value function is increasing in wealth (monotonicity)
- ( V_{xx} < 0 ): the value function is concave (risk aversion)
- The PDE is fully nonlinear, reflecting endogenous optimization over the control
This nonlinear HJB equation is the central object analyzed and solved in this project.
Under CRRA utility: see Eq. 5
the Merton problem admits a closed-form solution: see Eq. 6
and optimal policy: see Eq. 7
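In standard form (these should correspond to Eqs. 5–7):

[ U(x) = \frac{x^{1-\gamma}}{1-\gamma}, \qquad \gamma > 0, \ \gamma \neq 1 ]

[ V(t, x) = \exp\left( (1-\gamma)\left( r + \frac{(\mu - r)^2}{2\gamma\sigma^2} \right)(T - t) \right) \frac{x^{1-\gamma}}{1-\gamma} ]

[ \pi^*(t, x) = \frac{\mu - r}{\gamma \sigma^2}\, x ]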
The availability of an exact solution:
- Serves as ground truth
- Enables exact error quantification
- Prevents solver-to-solver ambiguity
Full derivation is provided in docs/derivations.pdf.
This project implements a fully implicit finite-difference solver in C++ to numerically solve the Merton HJB equation. The solver is designed as a high-accuracy, low-dimensional benchmark against which the neural solution is evaluated.
The implementation follows classical numerical PDE methodology, emphasizing:
- Stability
- Consistency
- Analytical interpretability
- Computational efficiency
The value function ( V(t, x) ) is discretized on a rectangular grid over the time domain ( [0, T] ) and the wealth domain ( [x_{\min}, x_{\max}] ). To avoid singular behavior at zero wealth (where CRRA utility is undefined), the wealth grid is shifted slightly away from zero, and uniform spacing is used in both dimensions: see Eq. 8
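One concrete instantiation (the exact constants live in main.cpp; ( \varepsilon ), ( N_t ), and ( N_x ) are assumed names for the wealth shift and grid sizes):

[ t_n = n \, \Delta t, \quad \Delta t = \frac{T}{N_t}, \qquad x_i = \varepsilon + i \, \Delta x, \quad \Delta x = \frac{x_{\max} - \varepsilon}{N_x} ]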
At maturity ( t = T ), the value function is initialized using the CRRA utility: see Eq. 9
This provides the starting point for backward time integration.
Boundary conditions are imposed using the closed-form Merton solution: see Eq. 10
These conditions:
- Preserve the correct asymptotic growth rate
- Prevent artificial reflection at the domain boundaries
- Strongly stabilize the numerical scheme
The control variable ( \pi ) is eliminated analytically before discretization.
From the HJB first-order condition: [ \pi^*(x) = -\frac{\mu - r}{\sigma^2} \frac{V_x}{V_{xx}} ]
Under CRRA preferences, this simplifies to the linear policy: [ \pi^*(x) = \frac{\mu - r}{\gamma \sigma^2} x ]
This substitution transforms the fully nonlinear HJB equation into a linear parabolic PDE with state-dependent coefficients, significantly improving numerical stability.
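Concretely, substituting ( \pi^* ) into the HJB equation above yields the linear PDE:

[ V_t + \left( r + \frac{(\mu - r)^2}{\gamma \sigma^2} \right) x V_x + \frac{(\mu - r)^2}{2 \gamma^2 \sigma^2} x^2 V_{xx} = 0 ]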
Spatial derivatives are approximated using second-order central differences: see Eq. 11
where ( \mathcal{L} ) is the controlled diffusion operator.
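The standard second-order stencils (these should correspond to Eq. 11) are:

[ V_x \approx \frac{v_{i+1} - v_{i-1}}{2 \Delta x}, \qquad V_{xx} \approx \frac{v_{i+1} - 2 v_i + v_{i-1}}{\Delta x^2} ]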
Key advantages of the implicit scheme:
- Unconditional stability
- Robustness to large time steps
- Reliable convergence for stiff PDEs
At each backward time step, the discretization produces a tridiagonal linear system: see Eq. 12
where:
- ( a_i ) encodes diffusion and drift from the left
- ( b_i ) is the diagonal term, which ensures diagonal dominance
- ( c_i ) encodes diffusion and drift from the right
The tridiagonal structure arises naturally from second-order spatial discretization.
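One standard assembly, assuming backward Euler in time applied to the linearized PDE above (the repo's exact sign conventions may differ); writing ( A_i = \left( r + \frac{(\mu - r)^2}{\gamma \sigma^2} \right) x_i ) for the drift and ( B_i = \frac{(\mu - r)^2}{2 \gamma^2 \sigma^2} x_i^2 ) for the diffusion:

[ a_i = \Delta t \left( \frac{A_i}{2 \Delta x} - \frac{B_i}{\Delta x^2} \right), \qquad b_i = 1 + \frac{2 \Delta t \, B_i}{\Delta x^2}, \qquad c_i = -\Delta t \left( \frac{A_i}{2 \Delta x} + \frac{B_i}{\Delta x^2} \right), \qquad d_i = v_i^{n+1} ]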
To solve the tridiagonal system efficiently, the solver uses the Thomas algorithm, a specialized form of Gaussian elimination.
Advantages:
- Linear time complexity: ( O(N_x) )
- Minimal memory footprint
- Numerically stable for diagonally dominant matrices
- Ideal for implicit PDE solvers
Given a system: [ a_i v_{i-1} + b_i v_i + c_i v_{i+1} = d_i ]
The Thomas algorithm proceeds in two phases:
1. Forward elimination
   - Eliminates the sub-diagonal
   - Produces modified coefficients ( \tilde{c}_i, \tilde{d}_i )
2. Backward substitution
   - Recovers the solution starting from the final grid point
The implementation avoids matrix construction entirely and operates directly on coefficient vectors.
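A minimal sketch of a solver in this style (illustrative only; names and layout are assumptions, not the repo's hjb_solver.cpp):

```cpp
#include <vector>

// Solve a[i]*v[i-1] + b[i]*v[i] + c[i]*v[i+1] = d[i] for i = 0..n-1
// (a[0] and c[n-1] are unused). Coefficients are passed by value and
// modified in place, so no matrix is ever constructed.
std::vector<double> thomas_solve(std::vector<double> a,
                                 std::vector<double> b,
                                 std::vector<double> c,
                                 std::vector<double> d) {
    const std::size_t n = d.size();
    // Forward elimination: remove the sub-diagonal.
    for (std::size_t i = 1; i < n; ++i) {
        const double m = a[i] / b[i - 1];
        b[i] -= m * c[i - 1];
        d[i] -= m * d[i - 1];
    }
    // Backward substitution: recover the solution from the last grid point.
    std::vector<double> v(n);
    v[n - 1] = d[n - 1] / b[n - 1];
    for (std::size_t i = n - 1; i-- > 0; ) {
        v[i] = (d[i] - c[i] * v[i + 1]) / b[i];
    }
    return v;
}
```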
Because the matrix is strictly diagonally dominant under the implicit scheme, the Thomas algorithm:
- Converges reliably
- Introduces minimal numerical diffusion
- Preserves monotonicity and concavity of the value function
This makes the finite-difference solver a gold-standard benchmark for evaluating neural HJB solvers.
cpp_solver/
├── hjb_solver.h # Solver class definition
├── hjb_solver.cpp # Implicit FD + Thomas algorithm
└── main.cpp # Driver and parameter setup
The solver is fully self-contained, requires no external libraries, and produces deterministic, reproducible results.
The solver outputs the value function at ( t = 0 ) as a CSV file:
wealth,value
Validation is performed by:
- Direct comparison with the analytical solution
- Grid refinement studies
- Policy function reconstruction
These results serve as the numerical baseline in the comparative analysis.
PINNs approximate the value function ( V(t,x) ) using a neural network and enforce the PDE weakly by minimizing the HJB residual via automatic differentiation.
Advantages:
- Mesh-free
- Easily extensible to higher dimensions
- Avoids the curse of dimensionality, in principle
- Fully connected feedforward network
- Inputs: ( (t, x) )
- Output: ( V(t,x) )
- Loss function:
  - HJB residual
  - Terminal condition penalty
  - Boundary condition penalty
Automatic differentiation is used to compute: see Eq. 13
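A composite loss of this kind typically takes the form shown below (the weights ( \lambda ) and the boundary target ( V_{\text{bdry}} ) are assumed notation, not values taken from this repo):

[ \mathcal{L} = \lambda_{\text{pde}} \left\| V_t + \max_{\pi} \left\{ (r x + \pi (\mu - r)) V_x + \tfrac{1}{2} \sigma^2 \pi^2 V_{xx} \right\} \right\|^2 + \lambda_T \left\| V(T, x) - U(x) \right\|^2 + \lambda_b \left\| V - V_{\text{bdry}} \right\|^2 ]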
This section presents a systematic comparison between the analytical solution, the C++ finite-difference (FD) solver, and the Physics-Informed Neural Network (PINN). All comparisons are performed at time ( t = 0 ) on the same wealth grid.
Figure 1 compares the value function ( V(0,x) ) obtained from all three methods.
Observations:
- The finite-difference solution closely matches the analytical solution across the entire domain.
- The PINN solution captures the overall shape of the value function but shows visible deviations, particularly near the boundaries.
This visual comparison already suggests that classical grid-based methods are highly effective for this low-dimensional HJB problem.
To quantify the visual differences, we report relative ( L^2 ) and sup-norm (( L^\infty )) errors with respect to the analytical solution.
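These are the standard discrete norms, which the reported values are presumed to follow; writing ( V_{\text{num}} ) for a solver output and ( V_{\text{exact}} ) for the analytical solution on the common grid ( \{x_i\} ):

[ \text{Rel. } L^2 = \frac{\left\| V_{\text{num}}(0, \cdot) - V_{\text{exact}}(0, \cdot) \right\|_2}{\left\| V_{\text{exact}}(0, \cdot) \right\|_2}, \qquad L^\infty = \max_i \left| V_{\text{num}}(0, x_i) - V_{\text{exact}}(0, x_i) \right| ]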
| Method | Relative ( L^2 ) Error | ( L^\infty ) Error |
|---|---|---|
| C++ FDM | ( 3.694 \times 10^{-3} ) | ( 5.053 \times 10^{-2} ) |
| PINN | ( 1.188 \times 10^{-1} ) | ( 1.735 ) |
Interpretation:
- The finite-difference solver achieves roughly 30× (about one and a half orders of magnitude) lower error than the PINN in both norms.
- The large ( L^\infty ) error for the PINN indicates localized regions of failure rather than uniform inaccuracy.
Figure 2 shows the pointwise absolute error as a function of wealth on a logarithmic scale.
Observations:
- The finite-difference error decays smoothly as wealth increases, consistent with uniform convergence of the scheme.
- PINN errors are:
  - largest near the low-wealth boundary
  - non-monotonic
  - persistent even at higher wealth levels
Key insight: The PINN struggles in regions where the HJB solution exhibits strong curvature and where second-order derivatives dominate the dynamics. Because the second-order derivatives are computed via automatic differentiation, they are prone to noise amplification in regions of high curvature and near boundaries.
Figure 3 compares the optimal investment policy implied by each method.
Observations:
- The finite-difference solver matches the analytical policy almost exactly across the domain.
- Despite significant value-function errors, the PINN:
  - correctly captures the linear structure of the optimal policy
  - tracks the correct slope over most of the wealth range
- Instabilities appear near the domain boundaries due to the limitations of numerical differentiation.
Important takeaway: While PINNs perform poorly in value-function accuracy, they still recover economically meaningful trading policies over most of the state space.
All quantitative analyses and figures are generated in:
comparative_analysis/
- Classical finite-difference methods (FDM) significantly outperform Physics-Informed Neural Networks (PINNs) for low-dimensional nonlinear Hamilton–Jacobi–Bellman equations, achieving roughly 30× lower errors in both the relative ( L^2 ) and sup-norm metrics.
- The superior performance of the FDM on this problem can be attributed to its numerical stability, its accurate treatment of second-order derivatives, and its strong enforcement of boundary conditions.
- While PINNs provide a flexible framework for approximating solutions that may be difficult to obtain with FDM, they exhibit notable value-function inaccuracies due to their sensitivity to second derivatives.
- Despite relatively poor value-function accuracy, the PINN recovers economically meaningful optimal trading policies across most of the wealth domain, correctly capturing the qualitative structure and slope of the analytical policy.
- This distinction between value-function accuracy and policy recovery highlights an important insight: accurate control policies may still be obtained even when pointwise value errors are non-negligible.
- These results suggest that PINNs should not be treated as replacements for classical numerical methods but rather as complementary tools, whose main advantage lies in approximating higher-dimensional HJB problems where grid-based finite-difference methods become computationally prohibitive.
- Overall, this study demonstrates the importance of method-appropriate evaluation and rigorous benchmarking, showing that the choice of numerical technique should depend on the dimensionality of the problem, the structural properties of the PDE, and the ultimate objective of the analysis.
- Full derivations (PDF): docs/derivations.pdf
- Rendered equations (MathJax): docs/mathjax.html


