Linear Algebra (ECE) | Spring 2023 | Dr. Sarafraz | University of Tehran
A Project Designed by: Erfan Asgari, Fardin Abbasi
This project explores the application of Markov Chains to model the dynamics of business cycles, a fundamental topic in macroeconomics. By utilizing matrix operations and eigenvalue decomposition, students will analyze how economic states transition over time and simulate long-term outcomes.
- To understand the role of Markov matrices in modeling real-world phenomena like the business cycle.
- To explore key linear algebra concepts, including eigenvalues, eigenvectors, and matrix multiplication.
- To apply the Monte Carlo method and steady-state analysis in the context of Markov Chains.
- To gain familiarity with numerical methods such as power iteration and inverse iteration for computing eigenvalues and eigenvectors.
In a hypothetical country, every decade,
Writing the algebraic relation for the future population shares of cities (
Next, let's define the population distribution vector as
Rewriting the relation for
Assuming the vector
Note
Square matrices whose entries represent probabilities, like the coefficient matrix in this problem, are called Markov matrices (also known as stochastic matrices).
A Markov matrix must satisfy two key properties according to probability theory:
- Non-negative Entries: All entries in a Markov matrix represent probabilities, meaning they must be between 0 and 1, inclusive. Negative values or values greater than 1 would violate the probability constraints.
- Column Sum Equals 1: The sum of the probabilities in each column must equal 1. This ensures that, for each state, the total probability of transitioning to some state (possibly the same one) is 1, reflecting the fact that some transition must occur.
These two properties ensure that a matrix can model a valid Markov process where each state evolves probabilistically to the next.
Given the two properties of Markov matrices, it can be shown that:
a) Every Markov matrix has an eigenvalue equal to 1.
b) No eigenvalue of a Markov matrix is greater than 1 in magnitude.
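As a quick numerical illustration of these two claims (using a small, made-up Markov matrix rather than one from the project):

```python
import numpy as np

# A hypothetical 3x3 Markov matrix: entries are non-negative and each column sums to 1
P = np.array([
    [0.7, 0.2, 0.1],
    [0.2, 0.6, 0.3],
    [0.1, 0.2, 0.6],
])

eigenvalues, _ = np.linalg.eig(P)

# Property (a): one eigenvalue equals 1 (up to floating-point error)
print(np.isclose(eigenvalues, 1).any())          # True

# Property (b): no eigenvalue has magnitude greater than 1
print(np.all(np.abs(eigenvalues) <= 1 + 1e-12))  # True
```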
A Stochastic Vector represents a discrete probability distribution with non-negative entries that sum to 1. Each entry of this vector represents the probability of one of the possible outcomes in a conventional ordering.
The figure below shows a Markov Chain with two states. The initial probability of being in each state is denoted by the probability vector
Here is a more formal definition of the probability vector and of the transition matrix (which is a Markov matrix):
If the initial state distribution is given as
this means that initially, there is a
Tip
An optional study of this interactive booklet may help in understanding the concept of Markov chains.
For a given matrix $A$ and an initial vector $v$, let's plot the successive vectors $v, Av, A^2v, \dots$ in separate plots and show the directions of the eigenvectors of $A$ as lines on these plots.
As shown in the diagrams, this vector converges toward the direction of the eigenvector corresponding to the "dominant" eigenvalue, which is the eigenvalue with the largest magnitude.
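A minimal plotting sketch along these lines is shown below; the 2×2 matrix and the starting vector are hypothetical placeholders rather than the project's actual data:

```python
import numpy as np
import matplotlib.pyplot as plt

A = np.array([[0.8, 0.3],
              [0.2, 0.7]])      # hypothetical 2x2 Markov matrix
v = np.array([1.0, 0.0])        # hypothetical starting vector

eigenvalues, eigenvectors = np.linalg.eig(A)

fig, axes = plt.subplots(1, 4, figsize=(14, 3.5))
for k, ax in enumerate(axes):
    # k-th power of A applied to v, normalized so the plots share a scale
    w = np.linalg.matrix_power(A, k) @ v
    w = w / np.linalg.norm(w)
    ax.quiver(0, 0, w[0], w[1], angles='xy', scale_units='xy', scale=1, color='tab:blue')
    # draw each eigenvector direction as a dashed line through the origin
    for i in range(2):
        e = eigenvectors[:, i] / np.linalg.norm(eigenvectors[:, i])
        ax.plot([-e[0], e[0]], [-e[1], e[1]], '--', color='gray')
    ax.set_xlim(-1.1, 1.1)
    ax.set_ylim(-1.1, 1.1)
    ax.set_title(f'$A^{k} v$')
plt.tight_layout()
plt.show()
```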
By decomposing the initial vector $v$ along the unit eigenvectors, we have $v = c_1 v_1 + c_2 v_2$.
Multiplying the matrix $A$ by this vector gives $Av = c_1 \lambda_1 v_1 + c_2 \lambda_2 v_2$.
After $k$ multiplications, $A^k v = c_1 \lambda_1^k v_1 + c_2 \lambda_2^k v_2 = \lambda_1^k \left( c_1 v_1 + c_2 (\lambda_2 / \lambda_1)^k v_2 \right)$.
Without loss of generality, we assume that the first eigenvalue is the dominant one, i.e. $|\lambda_1| > |\lambda_2|$.
In this formulation, the coefficient of the second eigenvector tends to zero as $k$ grows, since $|\lambda_2 / \lambda_1| < 1$.
This implies that as $k$ increases, the direction of $A^k v$ converges to the direction of the dominant eigenvector $v_1$.
Based on the result from the previous part, we can suggest a method to find an eigenvector and eigenvalue of a matrix using iterative matrix multiplication:
```python
import numpy as np

def power_iteration(A, num_iterations=100):
    n = A.shape[0]
    # Start from a random vector
    v = np.random.rand(n)
    for _ in range(num_iterations):
        # Calculate Av
        Av = np.dot(A, v)
        # Normalize Av
        v = Av / np.linalg.norm(Av)
    # Calculate the eigenvalue (Rayleigh quotient)
    eigenvalue = np.dot(np.dot(v, A), v) / np.dot(v, v)
    return eigenvalue, v
```
This method is known as the "power iteration" method; it can only find the eigenvalue of largest magnitude (the dominant eigenvalue) and its corresponding eigenvector.
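For example, applying the function above to a small, made-up symmetric matrix and comparing the result against NumPy's built-in eigenvalue routine:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])          # hypothetical symmetric test matrix

eigenvalue, v = power_iteration(A, num_iterations=200)
print(eigenvalue)                    # ~3.618, the dominant eigenvalue
print(np.linalg.eigvals(A).max())    # reference value from NumPy
```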
Given the set of eigenvalues $\{\lambda_1, \dots, \lambda_n\}$ and eigenvectors $\{v_1, \dots, v_n\}$ of a matrix $A$:
Each eigenvector $v_i$ satisfies $A v_i = \lambda_i v_i$.
Subtracting $\sigma v_i$ from both sides gives $A v_i - \sigma v_i = \lambda_i v_i - \sigma v_i$.
Factoring gives: $(A - \sigma I) v_i = (\lambda_i - \sigma) v_i$.
Thus, the matrix $A - \sigma I$ has the same eigenvectors as $A$, with eigenvalues $\lambda_i - \sigma$.
Multiply both sides of the eigenvalue equation for $A - \sigma I$ by $(A - \sigma I)^{-1}$ (assuming $\sigma$ is not itself an eigenvalue, so the inverse exists).
Since $(A - \sigma I)^{-1} (A - \sigma I) = I$, this gives $v_i = (\lambda_i - \sigma) (A - \sigma I)^{-1} v_i$.
Solving for $(A - \sigma I)^{-1} v_i$ yields $(A - \sigma I)^{-1} v_i = \frac{1}{\lambda_i - \sigma} v_i$.
Therefore, the eigenvalue of $(A - \sigma I)^{-1}$ associated with $v_i$ is $\frac{1}{\lambda_i - \sigma}$, while the eigenvectors are unchanged.
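A brief numerical check of this relationship, using an arbitrary small matrix and shift (both made up for illustration):

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])     # hypothetical matrix with eigenvalues 5 and 2
sigma = 1.5                    # hypothetical shift

lam_A = np.linalg.eigvals(A)
lam_shift_inv = np.linalg.eigvals(np.linalg.inv(A - sigma * np.eye(2)))

# Eigenvalues of (A - sigma*I)^{-1} should equal 1 / (lambda_i - sigma)
print(np.sort(lam_shift_inv))
print(np.sort(1 / (lam_A - sigma)))
```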
The inverse iteration method is a technique for finding eigenvalues and eigenvectors of a matrix, particularly when used in conjunction with a shift to focus on a particular eigenvalue. It builds on the properties of eigenpairs that we discussed earlier. Here’s how the inverse iteration method leverages these properties and can be adapted to find all eigenpairs, not just one.
Inverse iteration, also known as the inverse power method, focuses on finding the eigenvalue of a matrix that is closest to a given shift value $\sigma$. It works by repeatedly solving a linear system involving $(A - \sigma I)$.
- Eigenpairs of $A - \sigma I$: If $A v_i = \lambda_i v_i$, then $(A - \sigma I) v_i = (\lambda_i - \sigma) v_i$. For $\lambda_i$ near $\sigma$, $(\lambda_i - \sigma)$ is small, so $(A - \sigma I)$ has a small eigenvalue, making $(A - \sigma I)^{-1}$ amplify the corresponding eigenvector $v_i$.
- Eigenpairs of $(A - \sigma I)^{-1}$: The eigenvectors of $(A - \sigma I)^{-1}$ are the same as those of $A$, but the eigenvalues are $\frac{1}{\lambda_i - \sigma}$. This means that inverse iteration effectively amplifies the eigenvector corresponding to the eigenvalue of $A$ closest to $\sigma$.
To find all eigenpairs, not just the one closest to a shift value, you can use a combination of techniques:
- Shift Strategies: By choosing different shift values $\sigma$, you can use inverse iteration to find eigenvectors corresponding to eigenvalues near each shift. You may need to use several different shifts to target all the eigenvalues.
- Deflation: After finding an eigenpair, deflate the matrix $A$ to remove the effect of the found eigenpair and repeat the process on the reduced matrix. This technique involves subtracting the outer product of the found eigenvector (scaled by its eigenvalue) from $A$ and solving the reduced system.
```python
import numpy as np

def inverse_iteration(A, num_iterations=100, shift=0):
    n = A.shape[0]
    # Start from a random vector
    v = np.random.rand(n)
    for _ in range(num_iterations):
        # Solve (A - shift*I)x = v
        x = np.linalg.solve(A - shift * np.eye(n), v)
        # Normalize x
        v = x / np.linalg.norm(x)
    # Calculate the eigenvalue (Rayleigh quotient)
    eigenvalue = np.dot(np.dot(v, A), v) / np.dot(v, v)
    return eigenvalue, v
```
The Monte Carlo Estimation method is a powerful tool for approximating values through random sampling and simulation. One classic example of its application is estimating the value of $\pi$:
- Setup: Imagine a circle inscribed within a square. The area of the circle is $\pi r^2$, and the area of the square is $(2r)^2$, where $r$ is the radius of the circle.
- Random Sampling: Randomly generate points within the square. For each point, determine whether it falls inside the circle.
- Estimate $\pi$: The ratio of the number of points inside the circle to the total number of points approximates the ratio of the area of the circle to the area of the square. Since the area of the circle is $\pi r^2$ and the area of the square is $(2r)^2 = 4r^2$, the ratio is $\frac{\pi r^2}{4r^2} = \frac{\pi}{4}$. Multiply this ratio by 4 to estimate $\pi$.
By repeating this sampling process many times, the estimate of $\pi$ becomes increasingly accurate.
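A minimal sketch of this estimator in code (the sample count is an arbitrary choice):

```python
import numpy as np

num_samples = 100_000
# Sample points uniformly in the square [-1, 1] x [-1, 1]
points = np.random.uniform(-1, 1, size=(num_samples, 2))

# A point lies inside the unit circle if x^2 + y^2 <= 1
inside = (points ** 2).sum(axis=1) <= 1

# The fraction of points inside approximates pi/4
pi_estimate = 4 * inside.mean()
print(pi_estimate)   # close to 3.14 for large sample counts
```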
Tip
Take a look at this Interactive Notebook to know more about classic examples of Monte Carlo.
To estimate the steady state distribution of a Markov chain using the Monte Carlo method, you can use the following simulation-based approach:
- Initialize: Start with an initial state $x_0$ randomly chosen from the state space of the Markov chain.
- Simulation: Perform a large number of transitions according to the Markov chain's transition probabilities. For each step, sample the next state based on the current state using the transition matrix.
- Record Frequencies: Track the frequency of each state visited over a large number of transitions. The more transitions you simulate, the closer these frequencies will approximate the steady-state probabilities.
- Estimate Steady State Distribution: After a sufficient number of transitions, the proportion of time spent in each state will estimate the steady-state distribution. Normalize these frequencies to ensure they sum to 1.
This method relies solely on sampling and simulation, avoiding direct matrix computations. It provides a practical way to estimate the steady state distribution for complex Markov chains where analytical solutions may be infeasible.
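A minimal sketch of this frequency-counting simulation is given below; the 3-state transition matrix is a hypothetical placeholder, and its columns are assumed to sum to 1, consistent with the column convention used in this project.

```python
import numpy as np

def estimate_steady_state(P, num_steps=100_000):
    """Estimate the steady-state distribution by counting visit frequencies
    along one long trajectory of the chain."""
    num_states = P.shape[0]
    counts = np.zeros(num_states)
    state = np.random.choice(num_states)          # random initial state x0
    for _ in range(num_steps):
        # Sample the next state from the column of the current state
        state = np.random.choice(num_states, p=P[:, state])
        counts[state] += 1
    return counts / counts.sum()                  # normalize to sum to 1

# Hypothetical 3-state transition matrix (columns sum to 1)
P = np.array([
    [0.7, 0.2, 0.1],
    [0.2, 0.6, 0.3],
    [0.1, 0.2, 0.6],
])
print(estimate_steady_state(P))
```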
The Business Cycle is one of the key topics in macroeconomics. It refers to a series of stages in the economy that continuously repeat, characterized by expansion and contraction. Governments can make better policy decisions by understanding the different mechanisms of this cycle.
Each business cycle passes through four distinct phases:
Expansion or recovery is the most favorable economic state, characterized by business booms.
At the peak, economic parameters have increased or decreased excessively, and the economy begins to grow uncontrollably. This imbalance signals the start of a recession or contraction phase.
The contraction phase is associated with reduced economic activity. During this period, unemployment usually rises, stock markets fall, and GDP growth drops below 2%.
This phase is the exact opposite of the peak. The cycle reaches the trough when the recession ends, and the economy starts returning to an expansion phase.
In this project, we refer to the state between peak and trough as the average state.
Tip
Similar cycles can be observed in all economic areas! For example, the Product Life Cycle follows a similar pattern in microeconomics when introducing a product to the market. Similar behavior can also be found in stock markets. Try to maximize your profits in this stock market simulation!
Below is a Markov chain graph of a business cycle for a specific country. Each arrow represents the probability of transitioning from one state to another in a given month. It is also possible to remain in the current state.
Using the graph, we can extract the transition matrix:
Finding the eigenvalues and eigenvectors of this Markov matrix, and then normalizing the first eigenvector (the one associated with eigenvalue 1) so that it becomes a probability vector:
If, based on current government policies, the initial state distribution vector is $x_0 = [0.2,\ 0.5,\ 0.3]^T$:
Using 1000 samples from this distribution, we can create a sample set for the current month. Assuming current government policies remain, we predict the state distribution for 6, 12, 72, and 120 months into the future using the Monte Carlo Estimation method:
```python
import numpy as np

initial_distribution = np.array([0.2, 0.5, 0.3])
num_simulations = 1000

def monte_carlo_markov_chain(transition_matrix, initial_distribution,
                             num_steps, num_simulations):
    num_states = len(initial_distribution)
    state_counts = np.zeros(num_states)
    for _ in range(num_simulations):
        # Sample the starting state of this simulation
        current_state = np.random.choice(range(num_states), p=initial_distribution)
        # Walk the chain forward num_steps months
        for _ in range(num_steps):
            current_state = np.random.choice(range(num_states),
                                             p=transition_matrix[:, current_state])
        # Record the state reached at the end of this simulation
        state_counts[current_state] += 1
    state_probabilities = state_counts / num_simulations
    return state_probabilities
```
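The function is then called once per forecast horizon. The transition matrix shown here is only a placeholder (the project's actual matrix comes from the business-cycle graph above); the results listed next were produced with the real matrix:

```python
# Hypothetical 3-state transition matrix; columns sum to 1
transition_matrix = np.array([
    [0.80, 0.10, 0.10],
    [0.15, 0.85, 0.40],
    [0.05, 0.05, 0.50],
])

for num_steps in (6, 12, 72, 120):
    probs = monte_carlo_markov_chain(
        transition_matrix, initial_distribution, num_steps, num_simulations)
    print(num_steps, probs)
```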
Estimated state distributions after:
- 6 months: [0.126 0.752 0.122]
- 12 months: [0.113 0.771 0.116]
- 72 months: [0.109 0.795 0.096]
- 120 months: [0.125 0.761 0.114]
Without the need for statistical simulations, Markov chains allow predicting future state distributions using only the transition matrix. The government may have three possible state distributions this month:
Using matrix multiplication, we calculate the long-term results (over 120 months) for each of these policies, plotting a 3D scatter plot showing the probability distribution for each policy over time.
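A sketch of the matrix-power part of this computation, with hypothetical policy vectors and a placeholder transition matrix (the 3D scatter-plotting step is omitted):

```python
import numpy as np

# Hypothetical column-stochastic transition matrix (stands in for the one
# extracted from the graph above)
P = np.array([
    [0.80, 0.10, 0.10],
    [0.15, 0.85, 0.40],
    [0.05, 0.05, 0.50],
])

# Hypothetical policy distributions for the current month (each sums to 1)
policies = [
    np.array([0.2, 0.5, 0.3]),
    np.array([0.5, 0.3, 0.2]),
    np.array([0.1, 0.3, 0.6]),
]

P_120 = np.linalg.matrix_power(P, 120)       # 120-step transition matrix

for i, x0 in enumerate(policies):
    print(f"Policy {i + 1}: {P_120 @ x0}")   # distribution after 120 months
```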
The steady-state distribution of a Markov chain is a key concept representing the long-term behavior of the system. It describes the probability distribution over states when the chain has reached equilibrium, meaning that the probabilities of being in each state no longer change as the system evolves.
The steady-state distribution is independent of the initial state distribution. This is a fundamental property of Markov chains. Regardless of where the Markov chain starts, after a sufficiently long time, the probability distribution over states will converge to the steady-state distribution.
In a Markov chain, as time progresses, the influence of the initial state distribution diminishes. The chain will converge to the steady-state distribution, which is a stationary distribution $x^*$ satisfying $x^* = P x^*$, where $P$ is the transition matrix.
This equation implies that once the chain reaches the steady-state, the distribution over states does not change with further transitions.
The steady-state distribution represents the long-term equilibrium of the Markov chain. For any initial state, as the number of transitions increases, the probability of being in each state converges to the steady-state probabilities. This convergence is independent of the initial state because the steady-state distribution is defined by the transition matrix alone.
The initial state distribution does not affect the steady-state distribution. The steady-state distribution is a property of the transition matrix and represents the equilibrium distribution that the Markov chain converges to regardless of its starting point. This ensures that in the long run, the system behaves consistently and independently of where it started.
Another definition of the steady-state distribution is that it does not change over time, i.e., $x_{t+1} = x_t = x^*$.
It can be observed that $x^*$ must then satisfy $P x^* = x^*$, so $x^*$ is an eigenvector of the transition matrix $P$ associated with the eigenvalue 1.
What assumption about the sum of the elements in $x^*$ is needed for it to represent a valid probability distribution?
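As a closing sketch, the steady-state vector can be extracted directly from the transition matrix as the eigenvector associated with eigenvalue 1, normalized so its entries sum to 1; the matrix below is again a hypothetical placeholder:

```python
import numpy as np

# Hypothetical column-stochastic transition matrix
P = np.array([
    [0.80, 0.10, 0.10],
    [0.15, 0.85, 0.40],
    [0.05, 0.05, 0.50],
])

eigenvalues, eigenvectors = np.linalg.eig(P)

# Pick the eigenvector whose eigenvalue is (numerically) 1
idx = np.argmin(np.abs(eigenvalues - 1))
steady_state = np.real(eigenvectors[:, idx])

# Normalize so the entries sum to 1, making it a valid probability vector
steady_state = steady_state / steady_state.sum()
print(steady_state)
```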