Initial GitHub Release
maben-mw committed Sep 19, 2024
0 parents commit bca0bf6
Showing 34 changed files with 2,539 additions and 0 deletions.
26 changes: 26 additions & 0 deletions .github/workflows/pages.yaml
@@ -0,0 +1,26 @@
name: Generate Documentation and Publish to Pages
on:
  push:
    branches:
      - main
jobs:
  pages:
    name: Generate Documentation and Publish to Pages
    runs-on: ubuntu-latest
    steps:
      - name: Check out repository
        uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: '3.10'
      - name: Install Sphinx Python Dependencies
        run: pip install -r requirements.txt
        working-directory: Documentation
      - name: Build Documentation
        run: make html
        working-directory: Documentation
      - name: Deploy Documentation to Pages
        uses: peaceiris/actions-gh-pages@v4
        with:
          github_token: ${{ secrets.GITHUB_TOKEN }}
          publish_dir: ./Documentation/_build/html
1 change: 1 addition & 0 deletions Documentation/.gitignore
@@ -0,0 +1 @@
_build/
13 changes: 13 additions & 0 deletions Documentation/Customization.md
@@ -0,0 +1,13 @@
# Customization/Extending

To customize the generator, simply edit `Software/MATLAB/+prodserver/openapi.m` in place. While it is technically also possible to derive from the class and then overload or extend methods, the class was not designed with *such* extensibility in mind. The class is designed to be relatively simple and straightforward: it essentially fills out the OpenAPI document as a template, and the idea is that this "template" can be edited in place. Most of the MATLAB code simply writes out lines of OpenAPI YAML while filling in the correct names and values. The remainder of the MATLAB code mostly iterates through all archives, functions and types.

## YamlWriter

In order to be able to edit the generator, it is important to understand the `YamlWriter` class it uses internally. Type the following in the MATLAB Command Window to learn more:

```matlabsession
>> help prodserver.internal.YamlWriter
```

[//]: # (Copyright 2023-2024 The MathWorks, Inc.)
109 changes: 109 additions & 0 deletions Documentation/DeployToMPS.md
@@ -0,0 +1,109 @@
# Deploy as endpoint to MATLAB Production Server

It is possible to add a live endpoint to MATLAB Production Server which returns the OpenAPI spec for the functionality deployed to the instance. For example, the OpenAPI spec generator can be wrapped in a [Custom Routes and Payloads function](https://www.mathworks.com/help/releases/R2024a/mps/server/use-web-handler-for-custom-routes-and-custom-payloads.html) which first queries the server's discovery endpoint, then uses the generator to translate the discovery document to an OpenAPI spec, and finally returns the OpenAPI YAML document.

An *example* wrapper is included in the package:

```{literalinclude} ../Software/MATLAB/examples/mps/openapiEndpoint.m
:language: matlab
:caption: Software/MATLAB/examples/mps/openapiEndpoint.m
```

This is "just an example" as it is meant to be customized. For example, it is possible to configure whether or not to include the asynchronous interface, to add authentication information, etc. Most importantly, however, the code which queries the discovery endpoint may have to be customized; see the next section.

## Querying the discovery endpoint

There is no dedicated internal or loopback endpoint or function inside MATLAB Production Server which allows retrieving the discovery document directly from MATLAB code running on the instance. And so, the wrapper MATLAB code will have to query the "external" `/api/discovery` endpoint in the same way as any other MATLAB Production Server client would.

```{hint}
The discovery endpoint needs to be enabled for it to be available, see [`--enable-discovery`](https://www.mathworks.com/help/mps/server/mps.configuration-properties.html#propname_enable-discovery).
```

From the MATLAB wrapper point of view, the easiest option is to query the discovery endpoint over `http` (and not `https`), and ideally the server can simply refer to itself by `localhost` (or `127.0.0.1`). In that case the code does not have to be customized *per MATLAB Production Server instance* and the same CTF-archive can be used across different MATLAB Production Server instances.

* If `http` is **enabled** on the instance anyway, this should indeed simply work.

* If `http` has been **disabled** on the instance and it is configured to listen on `https` *only*:

    * It may be worth considering simply enabling `http` again. Note that it is possible to configure the instance to only listen on a specific network interface (see the [--http **host**:port](https://www.mathworks.com/help/mps/server/mps.configuration-properties.html#buf0ewd) option), i.e. it is possible to enable `http` for `localhost` or `127.0.0.1` *only*, such that only *local* applications on the same machine can access the instance over `http` but all *external* traffic still has to go over `https`. And/or it is possible to configure the `http` endpoint to listen on a port which is explicitly blocked for all external traffic. Some firewalls may even be able to limit the internal traffic and only allow specific processes to access the port, or only allow local communication between processes *running under the same user account*.

    * It is also possible to query the `/api/discovery` endpoint over `https` but:

        * It is unlikely then that the instance can be referenced by `localhost`, as the https certificate is unlikely to be valid for hostname `localhost`. Make sure to update the code to refer to the correct hostname.

        * If the https certificate is self-signed (or at least not signed by a certificate authority which MATLAB trusts *by default*), the request will have to be configured to accept the server certificate explicitly.

            1. Download the certificate in PEM-format and for example save it as `YourServerCertificate.pem`.

            2. Update the MATLAB code to accept this certificate when querying the endpoint:

                ```matlab
                opts = weboptions('CertificateFilename','YourServerCertificate.pem');
                discovery = webread('https://example.com/api/discovery',opts);
                ```

````{note}
Technically it is also possible to disable certificate trust validation altogether by setting `CertificateFilename` to an empty string `''`. And it is even possible to disable host name verification when switching to working with the [HTTP interface in MATLAB](https://www.mathworks.com/help/matlab/http-interface.html):
```matlab
% DANGER: Disable both hostname as well as certificate trust validation
opts = matlab.net.http.HTTPOptions('CertificateFilename','','VerifyServerName',false);
% Start a new request
req = matlab.net.http.RequestMessage();
% Perform the request with the specified options
result = req.send('https://localhost:9920/api/discovery',opts);
% Obtain the response body data
discovery = result.Body.Data;
```
These options are not recommended however and should only be considered for temporary testing.
````
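
For the plain `http` case described earlier in this section, a minimal sketch of the query could look as follows. The port number `9910`, the timeout and the variable names are assumptions for illustration only; they are not taken from the example wrapper and should be adjusted to the configuration of the instance.

```matlab
% Sketch only: query the discovery endpoint over plain http on localhost.
% ASSUMPTION: the instance listens for http on port 9910 and the
% discovery endpoint has been enabled; neither comes from the package.
port = 9910;
url  = sprintf('http://localhost:%d/api/discovery', port);

% Ask webread to decode the JSON response into a MATLAB struct
opts = weboptions('ContentType','json','Timeout',10);

% Defer the actual request in a function handle, so that nothing is sent
% until the handle is invoked on the instance itself
queryDiscovery = @() webread(url, opts);
```

Calling `discovery = queryDiscovery();` then performs the request. No certificate options are needed in this variant, which is what allows the same CTF-archive to be reused across instances.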

### Summary

The following points can be taken into consideration when choosing how to access the discovery endpoint. It is important however that in the end *you* make *your own* security assessment and *you* choose the option(s) which meet(s) *your* requirements.

1. `http` is easier from the MATLAB wrapper point of view: there are no certificates in play and the instance should be able to simply refer to itself as `localhost`. This also requires just a single CTF-archive which can be used across different MATLAB Production Server instances.

2. In general `https` communication is more secure but involves certificates, which will likely require customization of the wrapper. The customization will likely differ *per MATLAB Production Server instance*. Each instance may require a custom CTF-archive specifically built for that specific instance.

3. Some security concerns related to using `http` may be mitigated by simply binding the http listener to `localhost` only. This is done through a simple configuration option on the MATLAB Production Server instance.

4. Concerns with `http` which are not addressed by 3. may require further (more complex) firewall configurations. In that case, due to this added complexity, the advantages of 1. may no longer outweigh the disadvantages of 2. and it might be easier to stick/switch to working over `https` only.


## Customizing other aspects of the wrapper

As shown in the example, other aspects of the wrapper can be customized as well. It is possible to configure the various [options for the OpenAPI spec generator](./Usage.md#prodserveropenapi). It is possible to customize the "interface" of the endpoint, e.g. add additional query parameters to allow configuring the behavior. See [Write MATLAB Functions for Web Request Handler](https://www.mathworks.com/help/mps/server/use-web-handler-for-custom-routes-and-custom-payloads.html) in the MATLAB Production Server documentation to learn more about implementing these kinds of functions.

## URL Routes

A custom route must be configured to make the endpoint work correctly. As [documented](https://www.mathworks.com/help/mps/server/use-web-handler-for-custom-routes-and-custom-payloads.html#mw_1c57566a-9876-44ca-9f07-f67709bfb3f1), routes can be configured on an instance level or archive level (in releases R2023b and newer). The easiest option is to make use of archive level routes as this requires no further configuration and restart of the MATLAB Production Server instance(s). An example archive level routes file is included as `routes.json`:

```{literalinclude} ../Software/MATLAB/examples/mps/routes.json
:language: json
:caption: Software/MATLAB/examples/mps/routes.json
```

If an instance level configuration is desired/required, update the routes configuration file (typically `config/routes.json`) on the instance(s).

## Compile

To build the CTF-archive with archive level routes, use:

```matlab
compiler.build.productionServerArchive(...
'openapiEndpoint.m',...
ArchiveName='openapi',...
RoutesFile='routes.json');
```

Note that with archive level routes, the endpoint will become `http(s)://example.com/`*`ArchiveName`*`/`*`RouteAsDefinedInRoutesFile`*.

Choose `ArchiveName` and configure the routes file appropriately to make the functionality available at the desired endpoint. E.g. with the routes file as included in the example and the archive name used above, the final endpoint becomes `http://example.com/openapi/spec`.
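
The way the final endpoint is composed can be sketched in MATLAB as shown below. The route `spec` is an assumption inferred from the final endpoint named above (the actual value is whatever the routes file defines), and the variable names are purely illustrative.

```matlab
% Sketch only: compose the archive level endpoint as
%   <base>/<ArchiveName>/<RouteAsDefinedInRoutesFile>
% ASSUMPTION: the route 'spec' is inferred from the final endpoint named
% in the text; the actual value is defined in routes.json.
baseUrl     = 'http://example.com';
archiveName = 'openapi';   % the ArchiveName used when building the archive
route       = 'spec';
endpoint    = strjoin({baseUrl, archiveName, route}, '/');

% Request the YAML document as plain text. The request is deferred in a
% function handle so that nothing is sent before the archive is deployed.
opts      = weboptions('ContentType','text');
fetchSpec = @() webread(endpoint, opts);
```

Once the archive has been deployed, `yamlText = fetchSpec();` should return the generated OpenAPI document as a character vector.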

## Deploy

As with any other component, copy the resulting CTF-archive to the `auto_deploy` folder of the MATLAB Production Server instance(s). If working with instance level routes, restart the MATLAB Production Server instance after having updated the routes configuration file.

[//]: # (Copyright 2023-2024 The MathWorks, Inc.)
7 changes: 7 additions & 0 deletions Documentation/Installation.md
@@ -0,0 +1,7 @@
# Installation

The package is a pure MATLAB implementation without any further dependencies, so it can be "installed" by simply cloning the repository (or downloading as ZIP-file and extracting) and then running `startup.m` from the `Software/MATLAB` directory to add the relevant directories to the [MATLAB path](https://www.mathworks.com/help/matlab/matlab_env/what-is-the-matlab-search-path.html).

It is then possible to permanently add these directories to the MATLAB path by using `savepath`, after which `startup.m` does not have to be run again. Alternatively, simply rerun `startup.m` in each new MATLAB session in which the package is needed.

[//]: # (Copyright 2023-2024 The MathWorks, Inc.)
90 changes: 90 additions & 0 deletions Documentation/MPSOpenAPI.md
@@ -0,0 +1,90 @@
# OpenAPI representations of MATLAB Production Server Types and Methods

OpenAPI Specification Version 3.0.3 appears to have mainly been designed with modern request and response bodies in mind where these bodies are JSON objects with multiple different *fields* to represent multiple different in- or outputs. For example, for a MATLAB function like:

```matlab
function [a,b] = myFun(x,y,z)
a = "Hello " + x;
b = y + z;
```

A typical modern REST interface for the input would look something like:

```json
{
"x": "value1",
"y": [2,20,200],
"z": [3,30,300]
}
```

And the response would look something like:

```json
{
"a": "Hello value1",
"b": [5,50,500],
}
```

However, this is not what the actual MATLAB Production Server REST interface looks like. It is somewhat older and was designed before interfaces like the one above became commonplace. MATLAB Production Server's actual interface for the function above, in [large notation](https://www.mathworks.com/help/mps/restfuljson/json-representation-of-matlab-data-types.html), is:

```json
{
"nargout": 2,
"rhs": [
{"mwtype": "string", "mwsize": [1,6], "mwdata": "value1"},
{"mwtype": "double", "mwsize": [1,3], "mwdata": [2,20,200]},
{"mwtype": "double", "mwsize": [1,3], "mwdata": [3,30,300]}
]
}
```

or in small notation:

```json
{
"nargout": 2,
"rhs": [
"value1",
[2,20,200],
[3,30,300]
]
}
```
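
As a hedged illustration, the small notation request body above can be assembled in MATLAB with `jsonencode`. This is only a sketch of the payload structure; the variable names are illustrative and the snippet is not part of the package.

```matlab
% Sketch only: build the small notation request body for myFun.
% The double braces keep rhs a single cell array field instead of
% turning the result into a 1x3 struct array.
request = struct('nargout', 2, ...
                 'rhs', {{"value1", [2 20 200], [3 30 300]}});

% Serialize to JSON; the inputs end up as one ordered array under "rhs"
body = jsonencode(request);
disp(body)
```

Note how the inputs carry no names in the payload: only their position in the `rhs` array identifies them as `x`, `y` and `z`.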

It *is* a JSON object as well, with fields like `nargout` (to specify the number of requested outputs) and `rhs`. But the actual function inputs are represented in one single ordered array in the `rhs` field. Unfortunately, OpenAPI 3.0.3 is limited in how accurately arrays can be described: it is not possible to explicitly describe the order of the array elements. You can describe how many elements there are and which data types the array may contain, *and* each type description has to be unique.

So, for the small notation example above, in OpenAPI 3.0.3 you can*not* say: `rhs` is an array with 3 elements and the elements are of primitive types "string", "array of doubles" and "array of doubles". You may not say "array of doubles" *twice*; the type descriptions of array elements must be unique. So the best you *can* say here is that it is an array with 3 elements and the elements are a mix of primitive types "string" and "array of doubles". You cannot say that it is "string" once and "array of doubles" twice, let alone that it is specifically the first element which is the "string" and the second and third which are "array of doubles". Therefore small notation is *not* supported by the package: while it is possible to describe the API in a way which is *not wrong*, the description is also *not accurate enough* to actually be useful.

For the large notation, luckily it is possible to describe each array element with a unique type description, even if there are duplicate underlying MATLAB types. In large notation, each array element is a JSON object rather than a JSON primitive, and JSON objects can be described much more elaborately (with additional metadata) than primitives, which allows us to make sure each type description is unique. Strictly speaking we still cannot prescribe the exact order of the elements, but most OpenAPI tools/client generators will just keep the order in which the types were described in the first place.

In OpenAPI Specification Version 3.1.0, `prefixItems` was added to array descriptions, which does allow describing exact array elements in their exact order, including duplicate type definitions if necessary. In that sense the MATLAB Production Server REST APIs can be described more accurately in version 3.1.0, and 3.1.0 should thus theoretically also allow describing the small notation accurately enough to become useful. Support for small notation has not been added to the package (yet) though. Also note that many OpenAPI tools/client generators do not (fully) support version 3.1.0 yet.

Similarly, in order to be able to accurately describe the in- and outputs, the generated specs may include `oneOf` and/or `anyOf` schemas where the level of support for `oneOf` and `anyOf` varies in different OpenAPI tooling/client generators.

## Heterogeneous array item definition

According to the OpenAPI 3.0.3 spec <https://spec.openapis.org/oas/v3.0.3#properties>:

> items - Value _MUST_ be an object and not an array. Inline or referenced schema _MUST_ be of a Schema Object and not a standard JSON Schema. `items` MUST be present if the type is `array`.

Meaning that in order to be able to describe a heterogeneous array (which `lhs` and `rhs` typically are), an `anyOf` or `oneOf` object needs to be used. The spec is not entirely clear, however, on which of the two options should be used (and when). For example, for a definition as follows:

```yaml
type: array
items:
oneOf:
- type: string
- type: integer
```
* Some people and tooling (see for example <https://www.baeldung.com/openapi-array-of-varying-types>) say this definition means that *for the entire array* you may choose **one of** the two options `string` or `integer`, essentially giving homogeneous arrays. So, for example, `["a","b","c"]` or `[1,2,3]` would be valid array instances, but `["a",2,"c"]` would not be. And these tools then say you should use `anyOf` instead, if you *do* want to allow a heterogeneous array with a mix of element types.

* Other people and tooling (see for example [Mixed-Type Arrays section in the Swagger documentation](https://swagger.io/docs/specification/data-models/data-types/#mixed-array)) say that you are *not* describing the *entire* array here at all; you are describing the _items_ instead. So, the definition above then actually means that for *each and every* separate *item* in the array you may *independently* choose **one of** the options `string` or `integer`. So then, a heterogeneous array with a mix of types (like `[1,"b",3]`) is simply already allowed with `oneOf`. And these tools then basically also say that `oneOf` is the only option you ever need, and you never use `anyOf` at all here.

By default the OpenAPI Interface for MATLAB Production Server generates specs including `anyOf`, i.e. following the first reasoning; but this *is* configurable.
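
To make the difference between the two readings concrete, the following sketch expresses each reading of the `oneOf` definition above as a small MATLAB check on a cell array. It only illustrates the two interpretations; the function handle names are hypothetical and this is neither how the package nor how any particular OpenAPI tool validates arrays.

```matlab
% Sketch only: the two readings of "items: oneOf: [string, integer]",
% applied to a MATLAB cell array standing in for the JSON array.
isStr = @(v) ischar(v) || isstring(v);
isInt = @(v) isnumeric(v) && isscalar(v) && v == fix(v);

% Reading 1: one option is chosen for the entire array (homogeneous)
validWholeArray = @(arr) all(cellfun(isStr, arr)) || all(cellfun(isInt, arr));

% Reading 2: one option is chosen independently for each item
validPerItem = @(arr) all(cellfun(@(v) isStr(v) || isInt(v), arr));

mixed = {1, 'b', 3};
fprintf('whole-array reading accepts mixed: %d\n', validWholeArray(mixed));
fprintf('per-item reading accepts mixed:    %d\n', validPerItem(mixed));
```

Under the first reading the mixed array is rejected, while under the second it is accepted; homogeneous arrays such as `{'a','b','c'}` pass both checks.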

In OpenAPI 3.1.0 this problem does not exist, because `prefixItems` is used instead of `items`. `prefixItems` is more accurate in the sense that it also really prescribes the order of the elements in the array, and it does not require the usage of `oneOf` or `anyOf` in the first place.
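
As a rough sketch of what such a description could look like, the snippet below builds a `prefixItems` schema for the small notation `rhs` array of the `myFun` example as a MATLAB struct and prints it as JSON. This is an illustration under assumptions only: it is not output of the package, the variable names are hypothetical, and `minItems`/`maxItems` are added here merely to fix the array length at three.

```matlab
% Sketch only: an OpenAPI 3.1.0 style schema for the small notation rhs
% array of myFun (a string followed by two arrays of doubles).
% ASSUMPTION: not generated by the package; for illustration only.
doubleArray = struct('type', 'array', 'items', struct('type', 'number'));

rhsSchema = struct( ...
    'type', 'array', ...
    'prefixItems', {{struct('type', 'string'), doubleArray, doubleArray}}, ...
    'minItems', 3, ...
    'maxItems', 3);

% Element order is preserved and the duplicate description is allowed
disp(jsonencode(rhsSchema, 'PrettyPrint', true))
```

Unlike the `items` based description, the same "array of doubles" description appears twice here and each description is tied to a specific position in the array.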

[//]: # (Copyright 2023-2024 The MathWorks, Inc.)
19 changes: 19 additions & 0 deletions Documentation/Makefile
@@ -0,0 +1,19 @@
# Minimal makefile for Sphinx documentation
#

# You can set these variables from the command line.
SPHINXOPTS =
SPHINXBUILD = sphinx-build
SOURCEDIR = .
BUILDDIR = _build

# Put it first so that "make" without argument is like "make help".
help:
	@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)

.PHONY: help Makefile

# Catch-all target: route all unknown targets to Sphinx using the new
# "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS).
%: Makefile
	@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
5 changes: 5 additions & 0 deletions Documentation/README.md
@@ -0,0 +1,5 @@
# *OpenAPI Interface for* MATLAB Production Server - Documentation

This is the Markdown version of the package documentation. The HTML published version of the documentation is available at:

<https://mathworks-ref-arch.github.io/openapi-productionserver>