Fix footnotes in probabilistic PCA tutorial
penelopeysm committed Nov 17, 2024
1 parent 81a8412 commit 3fb4617
Showing 1 changed file with 3 additions and 3 deletions.
tutorials/11-probabilistic-pca/index.qmd (6 changes: 3 additions & 3 deletions)
@@ -278,12 +278,12 @@ Another way to put it: 2 dimensions is enough to capture the main structure of t
A question that arises directly from the above practice is: how many principal components do we want to keep in order to sufficiently represent the latent structure in the data?
This is a central question for all latent factor models, i.e. how many dimensions are needed to represent the data in the latent space.
In the case of PCA, there are many heuristics for making that choice.
- For example, we can tune the number of principal components using empirical methods such as cross-validation, based on criteria such as the MSE between the posterior-predicted data matrix (e.g. mean predictions) and the original data matrix, or the percentage of variation explained [3].
+ For example, we can tune the number of principal components using empirical methods such as cross-validation, based on criteria such as the MSE between the posterior-predicted data matrix (e.g. mean predictions) and the original data matrix, or the percentage of variation explained [^3].
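
As a rough illustration of how such a criterion can be computed, the sketch below compares candidate numbers of components using a classical SVD-based PCA reconstruction rather than posterior predictions; it is not part of the tutorial, and it assumes `X` is a D × N data matrix whose rows have already been centred.

```julia
# Illustrative sketch only, not code from the tutorial.
# Assumes X is a D × N data matrix whose rows have already been centred.
using LinearAlgebra
using Statistics

# Mean squared error of the rank-k PCA reconstruction of X.
function reconstruction_mse(X, k)
    U, S, V = svd(X)
    X̂ = U[:, 1:k] * Diagonal(S[1:k]) * V[:, 1:k]'
    return mean(abs2, X - X̂)
end

# Proportion of total variation captured by the first k components.
function variance_explained(X, k)
    S = svdvals(X)
    return sum(abs2, S[1:k]) / sum(abs2, S)
end

# Inspect both criteria for a range of candidate dimensionalities:
# for k in 1:5
#     @show k reconstruction_mse(X, k) variance_explained(X, k)
# end
```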

For p-PCA, this can be done in an elegant and principled way, using a technique called *Automatic Relevance Determination* (ARD).
- ARD can help pick the correct number of principal directions by regularizing the solution space using a parameterized, data-dependent prior distribution that effectively prunes away redundant or superfluous features [4].
+ ARD can help pick the correct number of principal directions by regularizing the solution space using a parameterized, data-dependent prior distribution that effectively prunes away redundant or superfluous features [^4].
Essentially, we are using a specific prior over the factor loadings $\mathbf{W}$ that allows us to prune away dimensions in the latent space. The prior is determined by a precision hyperparameter $\alpha$. Here, smaller values of $\alpha$ correspond to more important components.
- You can find more details about this in e.g. [5].
+ You can find more details about this in, for example, Bishop (2006) [^5].
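
To make the shape of such a prior concrete, here is a minimal, illustrative sketch of an ARD-style prior in Turing. It is not the tutorial's `pPCA_ARD` model shown below; it assumes `X` is a D × N data matrix, `k` candidate latent dimensions, and Gamma priors on the precisions and the noise scale.

```julia
# Illustrative sketch of an ARD-style prior over the loadings W; not the
# tutorial's pPCA_ARD model. Assumes X is D × N with k latent dimensions.
using Turing
using LinearAlgebra

@model function ard_sketch(X, k)
    D, N = size(X)

    # One precision per latent dimension; a large α[i] shrinks column i of W
    # towards zero, effectively pruning that dimension.
    α ~ filldist(Gamma(1.0, 1.0), k)

    # Loadings: column i of W has covariance (1 / α[i]) * I.
    W ~ arraydist([MvNormal(zeros(D), I / α[i]) for i in 1:k])

    # Latent coordinates and observation noise scale.
    Z ~ filldist(Normal(), k, N)
    σ ~ Gamma(1.0, 1.0)

    # Likelihood: each observed column is a noisy linear image of its latent coordinates.
    X ~ arraydist([MvNormal(W * Z[:, n], σ^2 * I) for n in 1:N])
end
```

After sampling, dimensions whose posterior precision α is large contribute essentially nothing to $\mathbf{W}$ and can be discarded, which is what makes the prior "automatically" determine relevance.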

```{julia}
@model function pPCA_ARD(X)
