---
output: github_document
---
```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%"
)
```
# openmpp
<!-- badges: start -->
[![Lifecycle: stable](https://img.shields.io/badge/lifecycle-stable-brightgreen.svg)](https://lifecycle.r-lib.org/articles/stages.html#stable)
[![R-CMD-check](https://github.com/mattwarkentin/openmpp/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/mattwarkentin/openmpp/actions/workflows/R-CMD-check.yaml)
<!-- badges: end -->
The goal of `openmpp` is to provide a programmatic interface to the OpenM++ API directly from R to simplify creating scenarios, running models, and gathering results for further processing.
## Installation (R Package)
You can install the development version of `openmpp` from [GitHub](https://github.com/mattwarkentin/openmpp) with:
``` r
# install.packages("remotes")
remotes::install_github("mattwarkentin/openmpp")
```
## Installation (OpenM++)
If you do not have access to an existing OpenM++ server, you can download and install OpenM++ locally and run a local server on your workstation. The `openmpp` package can then connect to this local server instance.
### Download
For most users, the best way to install OpenM++ locally is to download the pre-compiled binaries. To install OpenM++, download and unzip the "Source code and binaries" appropriate for your operating system. The latest release of OpenM++ can be found here: <https://github.com/openmpp/main/releases/latest>. Pre-compiled binaries are available for Mac (Intel and Arm), Windows, and several common Linux distributions (Debian, Ubuntu, RedHat).
NOTE: Windows may allow you to view the contents of the zip directory without extracting it. However, the files must be extracted for the installation to function properly.
### Running OpenM++ on Windows
Enter the OpenM++ directory using the File Explorer. Right-click anywhere inside the folder and select "Open in Terminal".
In the Windows Terminal, enter the following command:
``` cmd
.\bin\oms.exe
```
This will start the process responsible for running the OpenM++ web service (OMS). The local host address (e.g., http://localhost:4040) will be printed in the console; this is the address the `openmpp` R package uses to communicate with the API. You will set this address as the `OPENMPP_LOCAL_URL` environment variable. See the Usage section for more details.
### Running OpenM++ on macOS
Open a new macOS Terminal window (either by using Spotlight Search or by navigating to "Applications" and then "Utilities" in Finder).
After unzipping the downloaded directory in Finder, drag the folder into the terminal and press Enter. This will change your active directory to the OpenM++ folder.
Enter the following command into the terminal:
``` bash
bin/oms
```
As on Windows, this starts the OpenM++ web service (OMS) and prints the local host address to the console. This is the address the R package uses to communicate with the API.
## Usage
The `openmpp` package contains many functions that provide access to nearly every OpenM++ API endpoint. However, users of this package will typically only use a smaller set of functions for most common tasks.
### User Authentication
Each user is required to set their local or remote host address (i.e., URL) for the OpenM++ API in their global or project-specific `.Renviron` file in order for the `openmpp` package to authenticate and communicate with the API on behalf of the user.
If you are working in an IDE (e.g., Positron, RStudio), consider using `usethis::edit_r_environ()` to open your `.Renviron` file for editing. Note that you will need to restart your R session after editing the file for the changes to take effect.
For an API running locally, set the following environment variable in your `.Renviron` file:
```
OPENMPP_LOCAL_URL=http://localhost:XXXX
```
Where `XXXX` is the port number of your specific local host address (typically 4040 is used). The local host address is printed to the console when starting the OpenM++ web service in the terminal.
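After restarting R, a quick check can confirm that the variable was picked up. The following is a minimal sketch in base R; `looks_like_local_url()` is a hypothetical helper written for this illustration and is not part of `openmpp`.

```r
# Hypothetical helper (not part of openmpp): checks that a value has the
# shape "http://host:port", like the address printed by the OMS console.
looks_like_local_url <- function(url) {
  grepl("^https?://[^/:]+:[0-9]+/?$", url)
}

# Sys.getenv() returns "" when the variable is unset
local_url <- Sys.getenv("OPENMPP_LOCAL_URL")

if (!nzchar(local_url)) {
  message("OPENMPP_LOCAL_URL is not set; edit .Renviron and restart R.")
} else if (!looks_like_local_url(local_url)) {
  message("OPENMPP_LOCAL_URL is set but does not look like http://localhost:XXXX.")
} else {
  message("OPENMPP_LOCAL_URL looks OK: ", local_url)
}
```

This only checks the format of the value; it does not confirm that the web service is actually running at that address.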
This package can also connect to a remote OpenM++ server using JSON Web Tokens (JWT). For an API running remotely, set the following environment variables in your `.Renviron` file:
```
OPENMPP_REMOTE_URL=...
OPENMPP_REMOTE_USER=...
OPENMPP_REMOTE_PWD=...
```
If you aren't sure of your remote URL or your username/password, contact your OpenM++ administrator to retrieve this information. Note that the URL, username, and password should be kept confidential and not committed to version control (e.g., git).
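Because these values are confidential, it can be useful to confirm that they are set without printing them. The sketch below uses only base R; `missing_remote_vars()` is a hypothetical helper for this illustration and is not part of `openmpp`.

```r
# Hypothetical helper (not part of openmpp): returns the names of the
# required variables that are unset or empty, without printing their values.
missing_remote_vars <- function() {
  required <- c(
    "OPENMPP_REMOTE_URL",
    "OPENMPP_REMOTE_USER",
    "OPENMPP_REMOTE_PWD"
  )
  values <- Sys.getenv(required, unset = "")
  required[!nzchar(values)]
}

not_set <- missing_remote_vars()
if (length(not_set) > 0) {
  message("Not set: ", paste(not_set, collapse = ", "))
} else {
  message("All remote connection variables are set.")
}
```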
Once the environment variables are set, users may register a local or remote API connection in their R scripts.
```r
library(openmpp)
use_OpenMpp_local()
```
Or,
```r
library(openmpp)
use_OpenMpp_remote()
```
See `?use_OpenMpp_local` or `?use_OpenMpp_remote` for more information.
### Main Functions
- Functions for accessing tables of models, worksets, or model runs
  + `get_models()`
  + `get_worksets()` / `get_scenarios()`
  + `get_model_runs()` / `get_runs()`
- Functions for creating new worksets or scenarios
  + `create_scenario()` / `create_workset()`
- Functions for loading models, worksets, or model runs
  + `load_model()`
  + `load_workset()` / `load_scenario()`
  + `load_model_run()` / `load_run()`
  + `load_model_runs()` / `load_runs()`
- Functions for deleting worksets or model runs
  + `delete_workset()` / `delete_scenario()`
  + `delete_model_run()` / `delete_run()`
### Models, Scenarios, Runs, and RunSets
There are 4 main classes you will work with when using the `openmpp` package: `OpenMppModel`, `OpenMppWorkset`, `OpenMppModelRun`, and `OpenMppModelRunSet`. Each of these is an `R6` class. `R6` is an encapsulated object-oriented programming (OOP) system for R. Use the `load_*()` set of functions to load a model, workset/scenario, model run, or set of model runs into memory.
Instances of each of these 4 classes have methods (i.e., functions) and fields (i.e., data) associated with them. You can access these functions and data using the standard `$` subset operator (e.g., `obj$function()` or `obj$data`).
__Why use R6?__ We chose R6 because we believe it simplifies communication between the R package and OpenM++ and helps ensure that all changes made to microsimulation objects in the R session are propagated to, and synchronized with, the OpenM++ database. Encapsulated OOP allows the internal state of the object (i.e., the connection to the actual object in the OpenM++ database) to be accessed and modified through well-defined, high-level methods, rather than by directly manipulating the data with low-level function calls. This approach enforces data integrity, improves code readability, and simplifies maintenance by abstracting away the implementation details of an object and preventing unintended modifications to its state. More information about `R6` can be found [here](https://r6.r-lib.org/articles/Introduction.html).
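For readers new to R6, the toy class below sketches the general pattern of fields, methods, and validated assignment through `$`. `ToyScenario` is a hypothetical class made up for illustration; it is not part of `openmpp` and does not reflect how the package's classes are implemented.

```r
library(R6)

# Hypothetical toy class for illustration only; not part of openmpp.
ToyScenario <- R6Class(
  "ToyScenario",
  public = list(
    name = NULL, # a field (data)
    initialize = function(name, rate) {
      self$name <- name
      private$.rate <- rate
    },
    # a method (function) that reads the internal state
    describe = function() {
      sprintf("%s (rate = %.2f)", self$name, private$.rate)
    }
  ),
  active = list(
    # an active binding: looks like a field, but assignments are validated
    rate = function(value) {
      if (missing(value)) {
        return(private$.rate)
      }
      stopifnot(is.numeric(value), length(value) == 1, value >= 0)
      private$.rate <- value
    }
  ),
  private = list(.rate = NULL)
)

s <- ToyScenario$new("demo", 0.5)
s$rate <- s$rate * 0.9 # goes through the validating setter
s$describe()
```

Here the internal state (`private$.rate`) can only be changed through the `rate` binding, so an invalid assignment such as `s$rate <- -1` fails instead of silently corrupting the object.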
### Developing New Models
Developing new microsimulation or agent-based models in OpenM++ is beyond the scope of this package. In-depth information on model development can be found here: <https://github.com/openmpp/openmpp.github.io/wiki/Model-Development-Topics>.
### Example
Next, we will work through a very simple example of creating a new scenario, extracting parameters to change, changing parameters, running the model, and extracting results. This example will use the __RiskPaths__ model that comes with the OpenM++ software. RiskPaths is a simple, competing risk, case-based continuous time microsimulation model ([More Information](https://www.statcan.gc.ca/en/microsimulation/modgen/new/chap3/chap3-2)).
To run this example, you must have installed OpenM++, initiated the OpenM++ web service (OMS) in the shell, and configured the R package using the instructions above.
```{r echo=FALSE, message=FALSE}
oms_path <- source(here::here('tests/testthat/setup.R'))
local_initiate_oms <- source(here::here('tests/testthat/helper-oms.R'))
pid <- local_initiate_oms$value(oms_path$value)
```
```{r message=FALSE}
library(openmpp)
use_OpenMpp_local()
```
Let's see what models are available:
```{r}
get_models()
```
We can now see what worksets and model runs exist for a given model.
```{r}
get_worksets('RiskPaths')
```
```{r}
get_runs('RiskPaths')
```
Now we can load the `RiskPaths` model to inspect.
```{r}
rp <- load_model('RiskPaths')
rp
```
We will now load the `Default` set of input parameters for the RiskPaths model.
```{r}
rp_default <- load_scenario('RiskPaths', 'Default')
rp_default
```
Finally, we will load the base run for the RiskPaths model.
```{r warning=FALSE}
baserun_digest <- rp$ModelRuns$RunDigest[[1]]
rp_baserun <- load_run('RiskPaths', baserun_digest)
rp_baserun
```
We will create a new scenario based on the parameters from the `RiskPaths_Default` model run.
```{r}
create_scenario('RiskPaths', 'MyNewScenario', baserun_digest)
```
We will load the new scenario.
```{r warning=FALSE}
my_scenario <- load_scenario('RiskPaths', 'MyNewScenario')
```
Next, we copy over the `AgeBaselinePreg1` parameter from the base run and then reduce the fertility rate by 10% across all age groups.
```{r}
my_scenario$copy_params('AgeBaselinePreg1')
```
```{r message=FALSE}
library(dplyr)
current_rates <- my_scenario$Parameters$AgeBaselinePreg1
reduced_rates <-
  current_rates |>
  mutate(across(-sub_id, \(x) x * 0.9))
my_scenario$Parameters$AgeBaselinePreg1 <- reduced_rates
```
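To see what the `mutate(across(...))` step does in isolation, here is a minimal sketch on a made-up table. The `sub_id` column name comes from the code above; the other column names and all values in `toy_rates` are hypothetical and do not reflect the actual layout of `AgeBaselinePreg1`.

```r
library(dplyr)

# Hypothetical stand-in for a parameter table: one identifier column
# (`sub_id`) plus one numeric column per age group. Values are made up.
toy_rates <- tibble(
  sub_id = 0L,
  age_15 = 0.10,
  age_20 = 0.20,
  age_25 = 0.30
)

# Scale every column except `sub_id` by 0.9 (a 10% reduction)
toy_reduced <- toy_rates |>
  mutate(across(-sub_id, \(x) x * 0.9))

toy_reduced
```

Excluding `sub_id` from `across()` keeps the identifier column unchanged while every other column is scaled.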
We will now run the model and give it the name `'ExampleRun'`. We use the `wait = TRUE` flag to make sure we wait for the model run to finish before returning to our R session. We use `progress = FALSE` to avoid printing progress bars in this document. Note that model runs may take a long time when the number of simulation cases is large.
```{r}
my_scenario$ReadOnly <- TRUE
my_scenario$run('ExampleRun', wait = TRUE, progress = FALSE)
```
Note that we can use the `opts` argument and the `opts_run()` function to configure our run. By default, models are run with 5,000 simulation cases and 12 SubValues. This allows for quick model runs and faster iteration, but users will want to increase the number of simulation cases when performing a full model run.
Now that our model run is complete, let's load it into memory.
```{r warning=FALSE}
example_run <- load_run('RiskPaths', 'ExampleRun')
example_run
```
We can now extract an output table from the `Tables` field in the model run object (`example_run$Tables`).
```{r}
example_run$Tables$T06_BirthsByUnion
```
Great, we have created a new scenario, modified some parameters, run the model, and extracted output tables. In this last step, we will load multiple model runs into memory to compare them.
```{r warning=FALSE}
rp_runs <- load_runs('RiskPaths', rp$ModelRuns$RunDigest)
rp_runs
```
We will extract the same output table from both model runs. Note that an extra column, `RunName`, is added to indicate which model run each row of the output table corresponds to.
```{r}
births <- rp_runs$Tables$T06_BirthsByUnion
births
```
We can even plot this using `ggplot2`! Note that the number of simulation cases for `ExampleRun` is **low** so the results are not to be trusted! This is only for demonstration purposes.
```{r warning=FALSE}
library(ggplot2)
births |>
  ggplot(aes(Dim0, expr_value, fill = RunName)) +
  geom_col(position = position_dodge()) +
  labs(x = NULL, y = 'Number of births by union') +
  coord_flip() +
  theme_minimal() +
  theme(legend.position = 'bottom')
```
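Beyond plotting, the `RunName` column also makes it straightforward to compare runs numerically. The sketch below computes the relative change from a base run within each category. It uses a made-up table with the same column names as above (`RunName`, `Dim0`, `expr_value`); the run names, categories, and values are hypothetical and are not RiskPaths results.

```r
library(dplyr)

# Hypothetical stand-in for a combined output table; values are made up.
toy_births <- tibble(
  RunName = rep(c("BaseRun", "ExampleRun"), each = 2),
  Dim0 = rep(c("GroupA", "GroupB"), times = 2),
  expr_value = c(100, 200, 90, 190)
)

# Within each category, express every run relative to the base run
toy_change <- toy_births |>
  group_by(Dim0) |>
  mutate(rel_change = expr_value / expr_value[RunName == "BaseRun"] - 1) |>
  ungroup() |>
  filter(RunName != "BaseRun")

toy_change
```

This assumes exactly one base-run row per category; a table with additional dimensions would need those columns added to `group_by()`.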
When we are sure we no longer need a scenario or model run, we can use `delete_scenario()` or `delete_run()` to clean things up!
```{r echo=FALSE}
my_scenario$ReadOnly <- FALSE
delete_scenario('RiskPaths', 'MyNewScenario')
delete_model_run('RiskPaths', 'ExampleRun')
unlink(oms_path$value, recursive = TRUE)
```
## Contributor Guidelines
Contributions to this package are welcome. The preferred method of contribution is through a GitHub pull request. Before contributing, please file an issue to discuss the idea with the project team. More details on contributing can be found in the [CONTRIBUTING](https://github.com/mattwarkentin/openmpp/blob/main/CONTRIBUTING.md) document.
## Code of Conduct
Please note that the `openmpp` project is released with a [Contributor Code of Conduct](https://contributor-covenant.org/version/2/1/CODE_OF_CONDUCT.html). By contributing to this project, you agree to abide by its terms.