This project studies the continuous-time Merton portfolio optimization problem through the lens of Hamilton–Jacobi–Bellman (HJB) equations and performs a comparative analysis of three solution methods, using error metrics, policy comparisons, and spatial–temporal error localization:
- Closed-form analytical solution
- Finite-difference numerical solver (C++, implicit scheme)
- Physics-Informed Neural Network (PINN)
The goal is not to showcase neural networks as a superior replacement but to analyze when classical numerical methods outperform neural solvers and when PINNs can still recover economically meaningful controls, particularly in higher dimensions and realistic market settings.
The Merton portfolio problem is a continuous-time stochastic control problem where an investor allocates wealth between:
- A risk-free asset with constant return ( r )
- A risky asset following geometric Brownian motion: see Eq. 1
The investor chooses a trading strategy ( \pi_t ) (dollar amount invested in the risky asset) to maximize expected utility of terminal wealth: see Eq. 3
where wealth evolves as: see Eq. 2
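For convenience, the standard forms of these equations are reproduced here (they should correspond to Eqs. 1–3 in docs/mathjax.html):

[ dS_t = \mu S_t \, dt + \sigma S_t \, dW_t \quad \text{(Eq. 1)} ]

[ dX_t = \left[ r X_t + \pi_t (\mu - r) \right] dt + \sigma \pi_t \, dW_t \quad \text{(Eq. 2)} ]

[ \sup_{\pi} \, \mathbb{E}\left[ U(X_T) \right] \quad \text{(Eq. 3)} ]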
The Merton problem is well suited to this comparison:
- It is the canonical benchmark in continuous-time portfolio theory.
- It admits a closed-form solution under CRRA preferences.
- It serves as an ideal testbed for HJB solvers.
- It allows precise analytical validation of numerical and neural methods.
The HJB equation is a partial differential equation that characterizes the optimal value function in optimal control and reinforcement learning problems. It encodes the dynamic programming principle: an optimal policy remains optimal over any remaining time horizon, or, in plain terms, the best decision now combined with the best decisions later should minimize (or maximize) total cost.
Let V(t,x) denote the value function. Applying dynamic programming yields the HJB equation along with the terminal condition: see Eq. 4
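In standard form (this should correspond to Eq. 4), the HJB equation for this problem reads:

[ V_t + \max_{\pi} \left\{ \left( r x + \pi (\mu - r) \right) V_x + \tfrac{1}{2} \sigma^2 \pi^2 V_{xx} \right\} = 0, \qquad V(T, x) = U(x) ]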
The value function satisfies the following properties:
- ( V_x > 0 ): the value function is increasing in wealth (monotonicity)
- ( V_{xx} < 0 ): the value function is concave (risk aversion)
- The PDE is fully nonlinear, reflecting endogenous optimization over the control
This nonlinear HJB equation is the central object analyzed and solved in this project.
Under CRRA utility: see Eq. 5
the Merton problem admits a closed-form solution: see Eq. 6
and optimal policy: see Eq. 7
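In standard form (these should correspond to Eqs. 5–7):

[ U(x) = \frac{x^{1-\gamma}}{1-\gamma}, \qquad \gamma > 0, \ \gamma \neq 1 ]

[ V(t, x) = \exp\left( (1-\gamma)\left( r + \frac{(\mu - r)^2}{2\gamma\sigma^2} \right)(T - t) \right) \frac{x^{1-\gamma}}{1-\gamma} ]

[ \pi^*(t, x) = \frac{\mu - r}{\gamma \sigma^2}\, x ]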
The availability of an exact solution:
- Serves as ground truth
- Enables exact error quantification
- Prevents solver-to-solver ambiguity
Full derivation is provided in docs/derivations.pdf.
This project implements a fully implicit finite-difference solver in C++ to numerically solve the Merton HJB equation. The solver is designed as a high-accuracy, low-dimensional benchmark against which the neural solution is evaluated.
The implementation follows classical numerical PDE methodology, emphasizing:
- Stability
- Consistency
- Analytical interpretability
- Computational efficiency
The value function ( V(t, x) ) is discretized on a rectangular grid over the time domain ( [0, T] ) and the wealth domain ( [x_{\min}, x_{\max}] ). To avoid singular behavior at zero wealth (where CRRA utility is undefined), the wealth grid is shifted slightly away from zero, and uniform spacing is used in both dimensions: see Eq. 8
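One concrete instantiation (the exact constants live in main.cpp; ( \varepsilon ), ( N_t ), and ( N_x ) are assumed names for the wealth shift and grid sizes):

[ t_n = n \, \Delta t, \quad \Delta t = \frac{T}{N_t}, \qquad x_i = \varepsilon + i \, \Delta x, \quad \Delta x = \frac{x_{\max} - \varepsilon}{N_x} ]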
At maturity ( t = T ), the value function is initialized using the CRRA utility: see Eq. 9
This provides the starting point for backward time integration.
Boundary conditions are imposed using the closed-form Merton solution: see Eq. 10
These conditions:
- Preserve the correct asymptotic growth rate
- Prevent artificial reflection at the domain boundaries
- Strongly stabilize the numerical scheme
The control variable ( \pi ) is eliminated analytically before discretization.
From the HJB first-order condition: [ \pi^*(x) = -\frac{\mu - r}{\sigma^2} \frac{V_x}{V_{xx}} ]
Under CRRA preferences, this simplifies to the linear policy: [ \pi^*(x) = \frac{\mu - r}{\gamma \sigma^2} x ]
This substitution transforms the fully nonlinear HJB equation into a linear parabolic PDE with state-dependent coefficients, significantly improving numerical stability.
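Concretely, substituting ( \pi^* ) into the HJB equation above yields the linear PDE:

[ V_t + \left( r + \frac{(\mu - r)^2}{\gamma \sigma^2} \right) x V_x + \frac{(\mu - r)^2}{2 \gamma^2 \sigma^2} x^2 V_{xx} = 0 ]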
Spatial derivatives are approximated using second-order central differences: see Eq. 11
where ( \mathcal{L} ) is the controlled diffusion operator.
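The standard second-order stencils (these should correspond to Eq. 11) are:

[ V_x \approx \frac{v_{i+1} - v_{i-1}}{2 \Delta x}, \qquad V_{xx} \approx \frac{v_{i+1} - 2 v_i + v_{i-1}}{\Delta x^2} ]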
Key advantages of the implicit scheme:
- Unconditional stability
- Robustness to large time steps
- Reliable convergence for stiff PDEs
At each backward time step, the discretization produces a tridiagonal linear system: see Eq. 12
where:
- ( a_i ) encodes diffusion and drift from the left
- ( b_i ) is the diagonal term, which ensures diagonal dominance
- ( c_i ) encodes diffusion and drift from the right
The tridiagonal structure arises naturally from second-order spatial discretization.
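One standard assembly, assuming backward Euler in time applied to the linearized PDE above (the repo's exact sign conventions may differ); writing ( A_i = \left( r + \frac{(\mu - r)^2}{\gamma \sigma^2} \right) x_i ) for the drift and ( B_i = \frac{(\mu - r)^2}{2 \gamma^2 \sigma^2} x_i^2 ) for the diffusion:

[ a_i = \Delta t \left( \frac{A_i}{2 \Delta x} - \frac{B_i}{\Delta x^2} \right), \qquad b_i = 1 + \frac{2 \Delta t \, B_i}{\Delta x^2}, \qquad c_i = -\Delta t \left( \frac{A_i}{2 \Delta x} + \frac{B_i}{\Delta x^2} \right), \qquad d_i = v_i^{n+1} ]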
To solve the tridiagonal system efficiently, the solver uses the Thomas algorithm, a specialized form of Gaussian elimination.
Advantages:
- Linear time complexity: ( O(N_x) )
- Minimal memory footprint
- Numerically stable for diagonally dominant matrices
- Ideal for implicit PDE solvers
Given a system: [ a_i v_{i-1} + b_i v_i + c_i v_{i+1} = d_i ]
The Thomas algorithm proceeds in two phases:
1. Forward elimination
   - Eliminates the sub-diagonal
   - Produces modified coefficients ( \tilde{c}_i, \tilde{d}_i )
2. Backward substitution
   - Recovers the solution starting from the final grid point
The implementation avoids matrix construction entirely and operates directly on coefficient vectors.
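A minimal sketch of a solver in this style (illustrative only; names and layout are assumptions, not the repo's hjb_solver.cpp):

```cpp
#include <vector>

// Solve a[i]*v[i-1] + b[i]*v[i] + c[i]*v[i+1] = d[i] for i = 0..n-1
// (a[0] and c[n-1] are unused). Coefficients are passed by value and
// modified in place, so no matrix is ever constructed.
std::vector<double> thomas_solve(std::vector<double> a,
                                 std::vector<double> b,
                                 std::vector<double> c,
                                 std::vector<double> d) {
    const std::size_t n = d.size();
    // Forward elimination: remove the sub-diagonal.
    for (std::size_t i = 1; i < n; ++i) {
        const double m = a[i] / b[i - 1];
        b[i] -= m * c[i - 1];
        d[i] -= m * d[i - 1];
    }
    // Backward substitution: recover the solution from the last grid point.
    std::vector<double> v(n);
    v[n - 1] = d[n - 1] / b[n - 1];
    for (std::size_t i = n - 1; i-- > 0; ) {
        v[i] = (d[i] - c[i] * v[i + 1]) / b[i];
    }
    return v;
}
```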
Because the matrix is strictly diagonally dominant under the implicit scheme, the Thomas algorithm:
- Converges reliably
- Introduces minimal numerical diffusion
- Preserves monotonicity and concavity of the value function
This makes the finite-difference solver a gold-standard benchmark for evaluating neural HJB solvers.
cpp_solver/
├── hjb_solver.h # Solver class definition
├── hjb_solver.cpp # Implicit FD + Thomas algorithm
└── main.cpp # Driver and parameter setup
The solver is fully self-contained, requires no external libraries, and produces deterministic, reproducible results.
The solver outputs the value function at ( t = 0 ) as a CSV file:
wealth,value
Validation is performed by:
- Direct comparison with the analytical solution
- Grid refinement studies
- Policy function reconstruction
These results serve as the numerical baseline in the comparative analysis.
PINNs approximate the value function ( V(t,x) ) using a neural network and enforce the PDE weakly by minimizing the HJB residual via automatic differentiation.
Advantages:
- Mesh-free
- Easily extensible to higher dimensions
- Avoids the curse of dimensionality, in principle
- Fully connected feedforward network
- Inputs: ( (t, x) )
- Output: ( V(t,x) )
- Loss function:
  - HJB residual
  - Terminal condition penalty
  - Boundary condition penalty
Automatic differentiation is used to compute: see Eq. 13
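A composite loss of this kind typically takes the form shown below (the weights ( \lambda ) and the boundary target ( V_{\text{bdry}} ) are assumed notation, not values taken from this repo):

[ \mathcal{L} = \lambda_{\text{pde}} \left\| V_t + \max_{\pi} \left\{ (r x + \pi (\mu - r)) V_x + \tfrac{1}{2} \sigma^2 \pi^2 V_{xx} \right\} \right\|^2 + \lambda_T \left\| V(T, x) - U(x) \right\|^2 + \lambda_b \left\| V - V_{\text{bdry}} \right\|^2 ]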
This section presents a systematic comparison between the analytical solution, the C++ finite-difference (FD) solver, and the Physics-Informed Neural Network (PINN). All comparisons are performed at time ( t = 0 ) on the same wealth grid.
Figure 1 compares the value function ( V(0,x) ) obtained from all three methods.
Observations:
- The finite-difference solution closely matches the analytical solution across the entire domain.
- The PINN solution captures the overall shape of the value function but shows visible deviations, particularly near the boundaries.
This visual comparison already suggests that classical grid-based methods are highly effective for this low-dimensional HJB problem.
To quantify the visual differences, we report relative ( L^2 ) and sup-norm (( L^\infty )) errors with respect to the analytical solution.
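These are the standard discrete norms, which the reported values are presumed to follow; writing ( V_{\text{num}} ) for a solver output and ( V_{\text{exact}} ) for the analytical solution on the common grid ( \{x_i\} ):

[ \text{Rel. } L^2 = \frac{\left\| V_{\text{num}}(0, \cdot) - V_{\text{exact}}(0, \cdot) \right\|_2}{\left\| V_{\text{exact}}(0, \cdot) \right\|_2}, \qquad L^\infty = \max_i \left| V_{\text{num}}(0, x_i) - V_{\text{exact}}(0, x_i) \right| ]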
| Method | Relative ( L^2 ) Error | ( L^\infty ) Error |
|---|---|---|
| C++ FDM | ( 3.694 \times 10^{-3} ) | ( 5.053 \times 10^{-2} ) |
| PINN | ( 1.188 \times 10^{-1} ) | ( 1.735 ) |
Interpretation:
- The finite-difference solver achieves roughly 30× (about one and a half orders of magnitude) lower error than the PINN in both norms.
- The large ( L^\infty ) error for the PINN indicates localized regions of failure rather than uniform inaccuracy.
Figure 2 shows the pointwise absolute error as a function of wealth on a logarithmic scale.
Observations:
- The finite-difference error decays smoothly as wealth increases, consistent with uniform convergence of the scheme.
- PINN errors are:
  - largest near the low-wealth boundary
  - non-monotonic
  - persistent even at higher wealth levels
Key insight: The PINN struggles in regions where the HJB solution exhibits strong curvature and where second-order derivatives dominate the dynamics. Because the second-order derivatives are computed via automatic differentiation, they are prone to noise amplification in regions of high curvature and near boundaries.
Figure 3 compares the optimal investment policy implied by each method.
Observations:
- The finite-difference solver matches the analytical policy almost exactly across the domain.
- Despite significant value-function errors, the PINN:
  - correctly captures the linear structure of the optimal policy
  - tracks the correct slope over most of the wealth range
- Instabilities appear near the domain boundaries due to the limitations of numerical differentiation.
Important takeaway: While PINNs perform poorly in value-function accuracy, they still recover economically meaningful trading policies over most of the state space.
All quantitative analyses and figures are generated in:
comparative_analysis/
- Classical finite-difference methods (FDM) significantly outperform Physics-Informed Neural Networks (PINNs) for low-dimensional nonlinear Hamilton–Jacobi–Bellman equations, achieving roughly 30× lower errors in both the relative ( L^2 ) and sup-norm metrics.
- The superior performance of the FDM on this problem can be attributed to its numerical stability, its accurate treatment of second-order derivatives, and its strong enforcement of boundary conditions.
- While PINNs provide a flexible framework for approximating solutions that may be difficult to obtain with FDM, they exhibit notable value-function inaccuracies due to their sensitivity to second derivatives.
- Despite relatively poor value-function accuracy, the PINN recovers economically meaningful optimal trading policies across most of the wealth domain, correctly capturing the qualitative structure and slope of the analytical policy.
- This distinction between value-function accuracy and policy recovery highlights an important insight: accurate control policies may still be obtained even when pointwise value errors are non-negligible.
- These results suggest that PINNs should not be treated as replacements for classical numerical methods but rather as complementary tools, whose main advantage lies in approximating higher-dimensional HJB problems where grid-based finite-difference methods become computationally prohibitive.
- Overall, this study demonstrates the importance of method-appropriate evaluation and rigorous benchmarking, showing that the choice of numerical technique should depend on the dimensionality of the problem, the structural properties of the PDE, and the ultimate objective of the analysis.
- Full derivations (PDF): docs/derivations.pdf
- Rendered equations (MathJax): docs/mathjax.html


