import dash_core_components as dcc
import dash_bootstrap_components as dbc
import dash_html_components as html
from dash.dependencies import Input, Output
from app import app, server
from apps import flowfield, user_data
from navbar import navbar
from utils import convert_latex
###################################################################
# Homepage text
###################################################################
# Below is a pretty horrendous hack (see apps/flowfield.py for a much cleaner approach). The markdown sections are split here to get the equations nicely centered (can't use divs etc due to convert_latex function). Hopefully one day dash will sort out their latex support.
home_text = r'''
## Data-Driven Dimension Reduction
The apps contained in these pages utilise the [*equadratures*](https://equadratures.org/) data-driven dimension reduction capability for a number of tasks. In **Flowfield Estimation**, dimension reducing ridges are embedded within an airfoil flow for rapid flowfield estimation and design exploration, whilst in **My Data**, you can apply the techniques to your own datasets. But first, the underlying ideas are briefly introduced.
Many physical systems are high dimensional, which can make it challenging to obtain approximations of them, and it is often even more challenging to visualise these approximations. However, all is not lost! Many seemingly high-dimensional systems have intrinsically low-dimensional structure. Although the quantity of interest $f(\mathbf{x})$ might be defined as a function of a large set of parameters $\mathbf{x} \in \mathbb{R}^d$, its variation can often be approximately captured with a small number of linear projections of the original parameters $\mathbf{W}^T \mathbf{x} \in \mathbb{R}^n$. Here, $n \ll d$ and $\mathbf{W} \in \mathbb{R}^{d\times n}$ is a tall matrix whose column span is called the *dimension reducing subspace*, or *active subspace*.
In other words, we assume that our quantities of interest are well approximated by *ridge functions*,
$$f(\mathbf{x}) \approx g(\mathbf{W}^T \mathbf{x}),$$
where $g:\mathbb{R}^n \rightarrow \mathbb{R}$ is a low-dimensional non-linear function called the *ridge profile*. The intuitive computational advantage behind *ridge approximations* is that instead of estimating a function in $\mathbb{R}^d$, we approximate it in $\mathbb{R}^n$, which also facilitates easy visualisation. In figure b) below, a ridge approximation is obtained for a turbo-machinery dataset \[1], demonstrating how a low dimensional approximation can be obtained with a suitable $\mathbf{W}$ matrix.
<figure style="width:80%">
<img alt="Dimension reducing ridge" src="ridge_figure.png" />
</figure>
The *equadratures* code uses orthogonal polynomials to represent $g$, and so identifying ridge functions consists of computing coefficients for $g$ as well as identifying a suitable subspace matrix $\mathbf{W}$. Two techniques are available for this in the code, *active subspaces* and *variable projection*.
#### Active Subspaces
The active subspaces approach, introduced in \[2], involves estimating a covariance matrix using the gradient of a polynomial approximation
$$\mathbf{C} = \int_{\mathcal{X}} \nabla_{\mathbf{x}} g(\mathbf{x}) \nabla_{\mathbf{x}} g(\mathbf{x})^T\,\tau~d\mathbf{x},$$
where $\tau = 2^{-d}$ defines a uniform distribution over the $d$-dimensional hypercube $\mathcal{X}$. As $\mathbf{C}$ is a symmetric matrix, one can write its eigendecomposition as
$$\mathbf{C} = [\mathbf{W} \; \mathbf{V}] \begin{bmatrix} \mathbf{\Lambda}_1 & \mathbf{0}\\ \mathbf{0} &\mathbf{\Lambda}_2 \end{bmatrix} \begin{bmatrix} \mathbf{W}^T\\ \mathbf{V}^T \end{bmatrix},$$
where $\mathbf{\Lambda}_1$ is a diagonal matrix containing the largest eigenvalues, and $\mathbf{\Lambda}_2$ the smallest eigenvalues, both sorted in descending order. This partition should be chosen such that there is a large gap between the last eigenvalue of $\mathbf{\Lambda}_1$ and the first eigenvalue of $\mathbf{\Lambda}_2$ \[2]. Thus, this partitioning of the eigenvector matrix $[\mathbf{W} \; \mathbf{V}]$ yields the active subspace matrix $\mathbf{W}$ and the inactive subspace matrix $\mathbf{V}$.
An example of this method in action is given below, where an $n=1$ dimensional approximation is obtained for the $d=7$ temperature probe dataset available from the [equadratures dataset repository](https://github.com/Effective-Quadratures/data-sets).
```python
import equadratures as eq
# Load the probe data, scale the inputs to [-1, 1], and split into train/test
data = eq.datasets.load_eq_dataset('probes')
X = data['X']; y = data['y1']
X = eq.scaler_minmax().transform(X)
X_train, X_test, y_train, y_test = eq.datasets.train_test_split(X, y,train=0.8, random_seed = 42)
N,d = X_train.shape
# Obtain subspace
subdim = 1
subspace = eq.Subspaces(method='active-subspace',sample_points=X_train,
    sample_outputs=y_train,polynomial_degree=2, subspace_dimension=subdim)
W = subspace.get_subspace()[:,0:subdim]
u_test = (X_test@W).reshape(-1,1)
subpoly = subspace.get_subspace_polynomial()
print('Ridge Poly. R2 score = %.3f' %eq.datasets.score(y_test,subpoly.get_polyfit(u_test),metric='r2'))
```
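For intuition, the covariance and eigenvector steps can also be sketched without *equadratures*. The snippet below is an illustrative sketch only, not the library's implementation: it assumes a hypothetical test function $f(\mathbf{x}) = \sin(\mathbf{w}^T \mathbf{x})$ with a known analytical gradient (instead of the gradient of a fitted polynomial), estimates $\mathbf{C}$ by Monte Carlo sampling over the hypercube, and uses power iteration in place of a full eigendecomposition to recover the leading eigenvector. The names `w_true`, `grad_f` and `alignment` are introduced here for illustration.

```python
import math
import random

random.seed(0)
d = 5

# Hypothetical ridge direction (normalised to unit length); an assumption for this sketch
w_true = [1.0, 2.0, 0.0, -1.0, 0.5]
norm = math.sqrt(sum(wi * wi for wi in w_true))
w_true = [wi / norm for wi in w_true]

def grad_f(x):
    # Gradient of the assumed test function f(x) = sin(w.x), i.e. cos(w.x) * w
    u = sum(wi * xi for wi, xi in zip(w_true, x))
    return [math.cos(u) * wi for wi in w_true]

# Monte Carlo estimate of C = E[grad f grad f^T], with x uniform on [-1, 1]^d
n_samples = 2000
C = [[0.0] * d for _ in range(d)]
for _ in range(n_samples):
    x = [random.uniform(-1.0, 1.0) for _ in range(d)]
    g = grad_f(x)
    for i in range(d):
        for j in range(d):
            C[i][j] += g[i] * g[j] / n_samples

# Power iteration: repeatedly apply C and renormalise to find the leading eigenvector
v = [1.0] * d
for _ in range(100):
    Cv = [sum(C[i][j] * v[j] for j in range(d)) for i in range(d)]
    nrm = math.sqrt(sum(c * c for c in Cv))
    v = [c / nrm for c in Cv]

# Rayleigh quotient gives the leading eigenvalue; alignment compares v with w_true
eigenvalue = sum(v[i] * sum(C[i][j] * v[j] for j in range(d)) for i in range(d))
alignment = abs(sum(vi * wi for vi, wi in zip(v, w_true)))
print('Leading eigenvalue = %.3f' % eigenvalue)
print('Alignment with true direction = %.3f' % alignment)
```

Because the assumed function is an exact one-dimensional ridge, $\mathbf{C}$ has rank one and the leading eigenvector coincides with the ridge direction up to sign. For real data the remaining eigenvalues are non-zero, which is why the gap between $\mathbf{\Lambda}_1$ and $\mathbf{\Lambda}_2$ matters.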
#### Variable Projection
In order to construct the $\mathbf{C}$ matrix, we must first obtain $g(\mathbf{x})$, a polynomial fitted in the full $d$-dimensional space. This becomes problematic as the number of dimensions is increased. For such circumstances, *equadratures* offers *variable projection* \[3]. The non-linear least squares problem
$$\underset{\mathbf{W}, \boldsymbol{\alpha}}{\text{minimize}} \; \; \left\Vert f\left(\mathbf{x}\right)-g_{\boldsymbol{\alpha}}\left(\mathbf{W}^{T} \mathbf{x}\right)\right\Vert _{2}^{2}$$
is solved by recasting it as a separable non-linear least squares problem. Here, Gauss-Newton optimization is used to solve for the polynomial coefficients $\boldsymbol{\alpha}$ and subspace matrix $\mathbf{W}$ together.
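The separable structure can be made explicit with a short sketch. The notation here is introduced for illustration and is not taken from the *equadratures* code: assume $N$ training samples $\mathbf{x}_i$ with outputs stacked in a vector $\mathbf{f} \in \mathbb{R}^N$, and let $\mathbf{V}(\mathbf{W})$ be the matrix of polynomial basis functions evaluated at the projected points $\mathbf{W}^T \mathbf{x}_i$. For a fixed $\mathbf{W}$, the coefficients enter linearly, so the optimal choice is the linear least squares solution $\boldsymbol{\alpha} = \mathbf{V}(\mathbf{W})^{+} \mathbf{f}$, where $^{+}$ denotes the Moore-Penrose pseudoinverse. Substituting this back leaves a problem in $\mathbf{W}$ alone,

$$\underset{\mathbf{W}}{\text{minimize}} \; \; \left\Vert \mathbf{f} - \mathbf{V}(\mathbf{W}) \mathbf{V}(\mathbf{W})^{+} \mathbf{f} \right\Vert _{2}^{2},$$

so that the optimisation iterates only over the subspace, with the coefficients recovered from the final $\mathbf{W}$.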
#### References
\[1]: P. Seshadri, S. Shahpar, P. Constantine, G. Parks, M. Adams. "Turbomachinery active subspace performance maps". *Journal of Turbomachinery* (2018). [Paper](https://doi.org/10.1115/1.4038839).
\[2]: P. Constantine. "Active Subspaces: Emerging Ideas for Dimension Reduction in Parameter Studies". *SIAM Spotlights* (2015). [Book](https://doi.org/10.1137/1.9781611973860).
\[3]: J. Hokanson and P. Constantine. "Data-Driven Polynomial Ridge Approximation Using Variable Projection". *SIAM Journal on Scientific Computing* (2017). [Paper](https://doi.org/10.1137/17M1117690).
'''
home_text = dcc.Markdown(convert_latex(home_text), dangerously_allow_html=True, style={'text-align':'justify'})
# disclaimer message
final_details = r'''
This app is currently hosted *on the cloud* via [Heroku](https://www.heroku.com). Resources are limited and the app may be slow when there are multiple users. If it is too slow please come back later!
Please report any bugs to [[email protected]](mailto:[email protected]).
'''
final_details = dbc.Alert(dcc.Markdown(final_details),
    dismissable=True,is_open=True,color='info',style={'padding-top':'0.4rem','padding-bottom':'0.0rem'})
homepage = dbc.Container([home_text,final_details])
###################################################################
# 404 page
###################################################################
msg_404 = r'''
**Oooops**
Looks like you might have taken a wrong turn!
'''
container_404 = dbc.Container([
    dbc.Row(
        [
            dcc.Markdown(msg_404,style={'text-align':'center'})
        ], justify="center", align="center", className="h-100"
    )
],style={"height": "90vh"}
)
###################################################################
# Footer
###################################################################
footer = html.Div(
    [
        html.P('App built by Ashley Scillitoe'),
        # html.A(html.P('ascillitoe.com'),href='https://ascillitoe.com'),
        html.P(html.A('ascillitoe.com',href='https://ascillitoe.com')),
        html.P('Copyright © 2021')
    ]
    ,className='footer', id='footer'
)
###################################################################
# App layout (adopted for all sub-apps/pages)
###################################################################
app.layout = html.Div(
    [
        dcc.Location(id='url', refresh=True),
        navbar,
        html.Div(homepage,id="page-content"),
        footer,
    ],
    style={'padding-top': '70px'}
)
###################################################################
# Callback to return page requested in navbar
###################################################################
@app.callback(Output('page-content', 'children'),
    Output('footer','style'),
    Input('url', 'pathname'))
def display_page(pathname):
    # Return the requested page layout and the footer style (hidden on the 404 page)
    if pathname == '/':
        return homepage, {'display':'block'}
    elif pathname == '/flowfield':
        return flowfield.layout, {'display':'block'}
    elif pathname == '/datadriven':
        return user_data.layout, {'display':'block'}
    else:
        return container_404, {'display':'none'}
###################################################################
# Run server
###################################################################
if __name__ == '__main__':
    app.run_server(debug=True)