This repository contains the code to reproduce the experiments of the associated manuscript 'RMS: A ML-based system for ICU Respiratory Monitoring and Resource Planning'.
Note: while the code provided here works with the HiRID-II dataset, we are still working on the public release of this dataset. Once released, we will update this repository accordingly to make sure the findings are fully reproducible.
As part of this work, we will release a significantly revised version of the HiRID dataset, denoted as HiRID-II, on Physionet. It will be a freely accessible critical care dataset containing data from more than 55,000 patient admissions to the Department of Intensive Care Medicine, Bern University Hospital, Switzerland, from 2008 to 2019. The initial version of the dataset, HiRID-I, was released as part of the journal paper 'Early prediction of circulatory failure in the intensive care unit using Machine Learning' (known as circEWS).
The machine learning models for the patient-level prediction tasks and for resource planning were trained using LightGBM.
For the tasks RMS-RF, RMS-VENT and RMS-REXT, we propose an event-based evaluation metric similar to the one proposed in circEWS: recall is the proportion of events that are caught, and precision is the proportion of generated alarms that are correct. The extubation failure task (RMS-EF) is evaluated using the conventional AUPRC (area under the precision-recall curve), computed at the observed time points of extubation. The resource planning task is evaluated using the mean absolute error (MAE) of the predicted future ventilator usage.
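The event-based recall and precision can be illustrated with a minimal sketch. It is not the repository's evaluation code: the function name, the single `horizon` window and the handling of empty inputs are assumptions made for illustration, and the actual alarm and event matching rules are those defined in the manuscript.

```python
from typing import List, Tuple


def event_based_precision_recall(
    alarm_times: List[float],
    event_times: List[float],
    horizon: float,
) -> Tuple[float, float]:
    """Return (precision, recall) for one patient stay.

    Assumed matching rule (hypothetical): an alarm is correct if an event
    starts within `horizon` time units after it, and an event is caught if
    at least one alarm was raised within `horizon` time units before it.
    """

    def matches(alarm: float, event: float) -> bool:
        return 0 < event - alarm <= horizon

    correct_alarms = sum(
        any(matches(a, e) for e in event_times) for a in alarm_times
    )
    caught_events = sum(
        any(matches(a, e) for a in alarm_times) for e in event_times
    )

    # Assumed convention: undefined ratios are reported as NaN.
    precision = correct_alarms / len(alarm_times) if alarm_times else float("nan")
    recall = caught_events / len(event_times) if event_times else float("nan")
    return precision, recall


# Alarms at hours 1, 5 and 20; events starting at hours 8 and 40.
precision, recall = event_based_precision_recall([1, 5, 20], [8, 40], horizon=8)
```

In this example two of the three alarms precede the event at hour 8 within the window, so precision is 2/3, and only one of the two events is caught, so recall is 1/2.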
We assume a Linux installation, typically an HPC system, with a 'Slurm' cluster scheduler for dispatching jobs, such as training or data preprocessing, across the data batches.
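As a rough illustration of dispatching work across data batches, the sketch below splits patient IDs into batches and builds one `sbatch` command per batch. The script name `run_stage.py`, its `--batch-idx` argument, and the time and memory limits are hypothetical placeholders, not names from this repository; only the standard `sbatch` options `--job-name`, `--time`, `--mem` and `--wrap` are used.

```python
import shlex
from typing import List, Sequence


def make_batches(pids: Sequence[int], n_batches: int) -> List[List[int]]:
    """Distribute patient IDs round-robin over `n_batches` batches."""
    return [list(pids[i::n_batches]) for i in range(n_batches)]


def sbatch_command(
    batch_idx: int,
    script: str = "run_stage.py",  # hypothetical per-batch entry point
    time_limit: str = "04:00:00",  # assumed wall-clock limit
    mem: str = "8G",               # assumed memory request
) -> List[str]:
    """Build the argument list for submitting one batch job to Slurm."""
    inner = f"python {shlex.quote(script)} --batch-idx {batch_idx}"
    return [
        "sbatch",
        f"--job-name=rms_batch_{batch_idx}",
        f"--time={time_limit}",
        f"--mem={mem}",
        f"--wrap={inner}",
    ]


batches = make_batches(list(range(10)), n_batches=3)
commands = [sbatch_command(idx) for idx in range(len(batches))]
# On a cluster, each command could be submitted with
# subprocess.run(cmd, check=True); this sketch only builds the commands.
```

Building the commands as argument lists rather than shell strings avoids quoting problems when they are later passed to `subprocess.run`.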
- Install a conda distribution like Miniconda or Anaconda
- Clone this repository
- Update dependencies using Conda as required.
- Get access to the HiRID-II dataset on Physionet after its release. This includes
  - Getting a credentialed Physionet account
  - Submitting a usage request to the data owner of the HiRID-II dataset.
- Once access is granted, download the merged stage of the data, from which all derived resources in this project can be built.
The code is organized in several sub-directories of the Python module RMS, with the following contents:
- endpoints: Annotation of time series with respiratory system related annotations.
- evaluation: Evaluation of RMS task performance and evaluation of resource planning.
- exp_design: Code concerned with splitting PIDs for cluster processing and generating data splits for the experimental design.
- imputation: Code concerned with transforming HiRID-II data to a fixed time grid, making it suitable for feature generation and fitting of machine learning models. Data is partially imputed and sometimes left as missing.
- introspection: Code concerned with SHAP value analysis and analysis of variable importance for predicting various RMS tasks.
- labels: Code for creating machine learning labels of the various RMS tasks.
- learning: Supervised learning scripts for learning risk scores for predicting respiratory failure as well as other RMS tasks.
- ml_dset: Saves features/labels in a compact HDF5 format for the training/validation sets.
- ml_input: Code for generating features on partially imputed data. The machine learning labels are also appended to this dataset.
- statistics: Various scripts collecting statistics about different stages of the pipeline.
- utils: Various utility functions used in other modules.
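To make the idea behind the imputation step more concrete, the following sketch maps irregularly sampled measurements onto a fixed time grid, forward-filling a value only while it is recent enough and leaving the grid point missing otherwise. The function name, the grid step and the `max_fill` limit are assumptions for illustration; the repository's imputation code may use different rules per variable.

```python
from typing import List, Optional, Tuple


def to_fixed_grid(
    measurements: List[Tuple[float, float]],
    grid_step: float,
    n_steps: int,
    max_fill: float,
) -> List[Optional[float]]:
    """Resample sorted (time, value) pairs to grid points t = k * grid_step.

    Each grid point takes the last measurement at or before it, provided
    that measurement is at most `max_fill` time units old; otherwise the
    grid point is left missing (None).
    """
    grid: List[Optional[float]] = []
    idx = 0
    last: Optional[Tuple[float, float]] = None
    for k in range(n_steps):
        t = k * grid_step
        # Advance to the most recent measurement not later than t.
        while idx < len(measurements) and measurements[idx][0] <= t:
            last = measurements[idx]
            idx += 1
        if last is not None and t - last[0] <= max_fill:
            grid.append(last[1])
        else:
            grid.append(None)
    return grid


# Three measurements at minutes 0, 7 and 30, resampled to a 5-minute grid.
values = to_fixed_grid(
    [(0, 95.0), (7, 93.0), (30, 90.0)], grid_step=5, n_steps=8, max_fill=10
)
```

Here the grid points at minutes 20 and 25 stay missing because the last measurement (minute 7) is older than the assumed 10-minute fill limit.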
The order in which each component is run is as follows:
- exp_design
- imputation
- endpoints
- labels
- ml_input
- ml_dset
- learning
- evaluation/introspection/statistics
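The run order above can be encoded as a small helper that lists which components need to be rerun after one of them changes. This is a sketch under the assumption that each component depends on all components listed before it and that evaluation, introspection and statistics are independent final steps; the function name is hypothetical and not part of the repository.

```python
from typing import List

# Sequential components, in the order given above.
PIPELINE_STAGES = [
    "exp_design",
    "imputation",
    "endpoints",
    "labels",
    "ml_input",
    "ml_dset",
    "learning",
]
# Final components, assumed here to be independent of each other.
FINAL_STAGES = ["evaluation", "introspection", "statistics"]


def stages_to_rerun(changed_stage: str) -> List[str]:
    """Return the changed component and everything assumed to depend on it."""
    if changed_stage in FINAL_STAGES:
        return [changed_stage]
    # Raises ValueError for names that are not pipeline components.
    start = PIPELINE_STAGES.index(changed_stage)
    return PIPELINE_STAGES[start:] + FINAL_STAGES


rerun = stages_to_rerun("labels")
```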
The research code associated with the manuscript is licensed under an MIT license. The HiRID-II data is licensed as specified on Physionet.
When using code from this repository, please consider citing
Hüser, Lyu, Faltys, Pace et al. "A comprehensive ML-based respiratory monitoring system for physiological monitoring & resource planning in the ICU", medRxiv 2024.01.23.24301516