Using statistical entropy as a measure of risk in Portfolio Optimization

The objective of this project was to use the signal entropy in stock data as a measure of risk when picking the optimal portfolio.

What is Entropy?

A random variable is one whose value is not known in advance and can only be found by observation. This introduces the notion of a probability for each possible value this variable can take. A discrete variable is one that can take discrete values (duh), meaning values from a countable set. An example is the sum shown when rolling two dice, or the number of heads seen when flipping a coin a few times. The other type of variable is called continuous. An example of this is the height of a randomly chosen male in North Korea.

Now let's assume that there is a discrete random variable X, described by a probability distribution p. A specific outcome Y of X then occurs with probability p(Y), and its contribution to the entropy is -p(Y)*log(p(Y)). The total entropy of the variable is the sum of these contributions over all the outcomes that can possibly occur.
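Written out as a single formula, the definition above reads as follows. The symbols x_i for the possible outcomes of X and H(X) for the total entropy are notation introduced here, not taken from the report, and a term with p(x_i) = 0 is taken to contribute 0.

```latex
H(X) = -\sum_{i} p(x_i) \log p(x_i)
```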

Entropy is a measure of uncertainty in a signal. In thermodynamics, entropy is used to describe the number of microscopic states available to a system. In coding theory, entropy calculated using the base-2 logarithm gives the minimum average number of bits per symbol required to code a random signal.
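To make the base-2 case concrete, here is a minimal sketch that computes the entropy, in bits, of a few simple distributions. The function name `entropy_bits` and the example distributions are made up for illustration and are not part of this repository.

```python
import math


def entropy_bits(probabilities):
    """Shannon entropy, in bits, of a discrete distribution given as a list of probabilities."""
    if abs(sum(probabilities) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    # Outcomes with zero probability contribute nothing (0 * log 0 is taken as 0).
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


# Illustrative distributions (assumptions, not project data).
fair_coin = entropy_bits([0.5, 0.5])
biased_coin = entropy_bits([0.9, 0.1])
fair_die = entropy_bits([1 / 6] * 6)
```

The fair coin is the most uncertain two-outcome variable, so it has the highest entropy of any coin. The biased coin is more predictable and needs fewer bits on average.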

Why entropy?

In finance, the entropy in the asset pricing data can be used as a measure of risk. The elegance of this is that we can use the entropies of all the assets as a linear constraint when posing the "optimal portfolio" problem as a mathematical optimization problem, specifically as a linear program. The usual risk management strategy in portfolio optimization instead imposes a quadratic (and therefore non-linear) constraint, typically on the portfolio variance, which can be difficult to solve without Lagrange duality.
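One way such a linear program could be written is sketched below. This is an assumed formulation for illustration, not necessarily the exact one used in the report: w_i is the weight of asset i, r_i its expected return, H_i its entropy, and H_max a chosen entropy budget. All four symbols are introduced here.

```latex
\begin{aligned}
\max_{w} \quad & \sum_{i} r_i w_i \\
\text{subject to} \quad & \sum_{i} H_i w_i \le H_{\max}, \\
& \sum_{i} w_i = 1, \\
& w_i \ge 0 \quad \text{for all } i.
\end{aligned}
```

Every term is linear in the weights. Dropping the last constraint would allow negative weights, which corresponds to permitting short selling.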

What are we doing here?

For the engineering description, check out the report here

The programmatic process is as follows:

Pulled data manually from Google Finance's historical records.
Cleaned and preprocessed the data.
Built the probability distribution tables.
Ran the linear program in MATLAB.
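The middle steps can be sketched roughly as follows: turn a price series into returns, bin the returns into a probability table, and compute the entropy of that table. The function names, the bin width, and the prices are all assumptions made for illustration. The binning actually used in this repository may differ.

```python
import math
from collections import Counter


def simple_returns(prices):
    """Period-over-period returns: (p[t] - p[t-1]) / p[t-1]."""
    return [(b - a) / a for a, b in zip(prices, prices[1:])]


def probability_table(values, bin_width):
    """Map each bin index to the fraction of values that fall in that bin."""
    counts = Counter(math.floor(v / bin_width) for v in values)
    total = len(values)
    return {bin_index: n / total for bin_index, n in sorted(counts.items())}


def entropy_bits(table):
    """Shannon entropy, in bits, of a probability table."""
    return -sum(p * math.log2(p) for p in table.values() if p > 0)


# Made-up closing prices, for illustration only.
prices = [100.0, 101.5, 100.0, 101.5, 103.0, 101.5]
returns = simple_returns(prices)
table = probability_table(returns, bin_width=0.01)  # 1% wide bins (assumed)
risk = entropy_bits(table)
```

The choice of bin width matters: very wide bins put every return in one bin and give zero entropy, while very narrow bins put every return in its own bin.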

How are we doing this?

From here, run main.py. Note that the cleaning process will create a text file with the tabulated stock data. These tables will be used in the next step.
Compile this code, after specifying the paths to your training and testing directories, and the path to the list of tables in your training directory.
Run "entropylp.m", after loading all the files from here to your MATLAB environment.
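For readers without MATLAB, the kind of problem solved in the last step can be illustrated in plain Python. The sketch below is not a translation of "entropylp.m". It assumes a simple formulation: maximize the weighted return, subject to the weighted entropy staying under a budget, the weights summing to 1, and no short selling. With only those constraints, an optimum always lies at a corner of the feasible region that holds either one asset or a mix of two assets that exactly uses up the entropy budget, so it is enough to check those candidates. All names and numbers are hypothetical, and a real problem should go to a proper LP solver.

```python
from itertools import combinations


def best_portfolio(returns, entropies, max_entropy):
    """Maximize sum(r*w) subject to sum(h*w) <= max_entropy, sum(w) == 1, w >= 0.

    Returns the best weight vector, or None if no portfolio is feasible.
    """
    n = len(returns)
    candidates = []

    # Corners that hold a single asset.
    for i in range(n):
        if entropies[i] <= max_entropy:
            weights = [0.0] * n
            weights[i] = 1.0
            candidates.append(weights)

    # Corners that mix two assets so the entropy constraint is exactly tight.
    for i, j in combinations(range(n), 2):
        hi, hj = entropies[i], entropies[j]
        if hi == hj:
            continue  # no unique mix makes the constraint tight
        wi = (max_entropy - hj) / (hi - hj)
        if 0.0 < wi < 1.0:
            weights = [0.0] * n
            weights[i] = wi
            weights[j] = 1.0 - wi
            candidates.append(weights)

    if not candidates:
        return None
    return max(candidates, key=lambda w: sum(r * x for r, x in zip(returns, w)))


# Hypothetical per-asset expected returns and entropies.
asset_returns = [0.01, 0.03, 0.02]
asset_entropies = [1.0, 3.0, 2.5]
weights = best_portfolio(asset_returns, asset_entropies, max_entropy=2.0)
```

Tightening the entropy budget pushes the result toward the low-entropy asset, and a budget below every asset's entropy leaves no feasible portfolio.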

What are the pros with this technique?

1. 	Entropy is an information-theoretic measure that describes all the useful information in a statistical
	variable. I find it a better measure of risk than variance (personal opinion).
2. 	This whole process is extremely easy to do.
3.	The methods of linear programming are pretty well understood, and there is a practically endless supply
	of resources to learn from.

What are the cons with this technique?

0. 	There are none. (jk)
1.	The data layer of the Python project assumes that the data is in CSV format. Probably not
	the most efficient way to store data.
2.	There are a lot of references to absolute paths in the code base.
3.	The data grabbing process is not automated (as of yet).
4.	Lastly, and most importantly, the dependence on MATLAB. A very expensive con. However,
	GNU Octave is a spiritual twin to MATLAB and is free.

OK, let's see it in action.

Run "entropylp.m" to get the following plots:

The performance of two portfolios with two different entropy constraints.

The average return rate for different entropy constraints.

Performance over 50 months at a good entropy constraint.

Performance over 50 months with a constraint value of 0 and short selling permitted.

tommathewXC/EntropyOptimizedPortfolio

Folders and files

Latest commit

History

Repository files navigation

Using statistical entropy as a measure of risk in Portfolio Optimization

What is Entropy?

Why entropy?

What are we doing here?

How are we doing this?

What are the pros with this technique?

What are the cons with this technique?

OK, let's see it in action.

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages