PyMSM in a nutshell
Hagai Rossman, Ayya Keshet, Malka Gorfine 2022
PyMSM is a Python package for fitting competing risks and multistate models, with a simple API which allows user-defined models, predictions at a single or population sample level, and statistical summaries and figures.
Features include:
- Fit a competing-risks multistate model based on survival analysis (time-to-event) models.
- Deals with right censoring, competing events, recurrent events, left truncation, and time-dependent covariates.
- Run Monte Carlo simulations for paths emitted by the trained model and extract various summary statistics and plots.
- Load or configure a pre-defined model and run path simulations.
- Modularity and compatibility for different time-to-event models such as Survival Forests and other custom models.
Installation
Requires Python >=3.8
Quick example
stateDiagram-v2
s1 : (1) Primary surgery
s2 : (2) Disease recurrence
s3 : (3) Death
s1 --> s2: 1518
s1 --> s3: 195
s2 --> s3: 1077
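The diagram above can be read as a small transition table. The sketch below is plain Python, not the PyMSM API, and all variable and function names are illustrative. It encodes the three states and the edge labels, assuming those labels are observed transition counts, and derives the terminal state and the share of each observed transition out of a state. These shares ignore censoring, so they are descriptive only and are not estimates of transition probabilities.

```python
# Illustrative only: plain-Python encoding of the state diagram, not PyMSM's API.
states = {1: "Primary surgery", 2: "Disease recurrence", 3: "Death"}

# Edge labels from the diagram, assumed here to be observed transition counts.
transition_counts = {(1, 2): 1518, (1, 3): 195, (2, 3): 1077}

# A terminal (absorbing) state has no outgoing transitions.
origins = {origin for origin, _ in transition_counts}
terminal_states = sorted(set(states) - origins)


def observed_shares(origin):
    """Share of each observed transition out of `origin` (censoring ignored)."""
    outgoing = {dest: n for (o, dest), n in transition_counts.items() if o == origin}
    total = sum(outgoing.values())
    return {dest: n / total for dest, n in outgoing.items()}


print("terminal states:", terminal_states)
for dest, share in observed_shares(1).items():
    print(f"1 -> {dest} ({states[dest]}): {share:.3f}")
```

State 1 has two competing outgoing transitions, which is what makes this a competing-risks setting, while state 3 has none and is therefore terminal.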
Background and Motivation
Multi-state data are common and can describe trajectories in diverse health applications, such as a patient's progression through disease stages or a patient's path through different hospitalization states. When faced with such data, a researcher or clinician might seek to characterize the possible transitions between states and their occurrence probabilities, or to predict the trajectory of future patients, all conditioned on various baseline and time-varying individual covariates. By fitting a multi-state model, we can learn the hazard for each specific transition, which can later be used to predict future paths. Path prediction is useful at the single-patient level, for example to predict how long a cancer patient will remain relapse-free given their current health status, or with what probability a patient will end a trajectory in each of the possible states. It is also useful at the population level, for example to predict how many patients arriving at the emergency room will need to be admitted, given their covariates.
Capabilities
PyMSM is a Python package for fitting multi-state models, with a simple API which allows user-defined models, predictions at a single or population sample level, and statistical summaries and figures.
Features of this software include:
- Fitting a competing-risks multistate model based on various types of survival analysis (time-to-event) models, such as Cox proportional hazards models or machine learning models, while taking into account right censoring, competing events, recurrent events, left truncation, and time-dependent covariates.
- Running Monte Carlo simulations (computed in parallel) for paths emitted by the trained model and extracting various summary statistics and plots.
- Loading or configuring a pre-defined model and generating simulated data in terms of random paths using model parameters, which could be highly useful as a research tool.
- Modularity and compatibility for different time-to-event models such as Survival Forests and other custom ML models provided by the user.
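To make the Monte Carlo path simulation idea concrete, here is a minimal sketch in plain Python. It does not use the PyMSM API. The hazards are hypothetical constants chosen for illustration, whereas a fitted model would supply covariate- and time-dependent hazards, and the names `hazards` and `simulate_path` are made up for this example. At each non-terminal state, one exponential waiting time is drawn per competing transition and the earliest one determines the next state.

```python
import random
from collections import Counter

# Hypothetical constant transition hazards (per unit time), NOT fitted values.
# A fitted multistate model would supply covariate-dependent hazards instead.
hazards = {
    1: {2: 0.10, 3: 0.02},  # from state 1: competing transitions to 2 and 3
    2: {3: 0.15},           # from state 2: the only transition is to 3
}
terminal_states = {3}


def simulate_path(origin, rng, max_transitions=10):
    """Simulate one path; return (visited states, waiting time per transition)."""
    state, path_states, waits = origin, [origin], []
    while state not in terminal_states and len(waits) < max_transitions:
        # Competing risks: draw one latent exponential time per transition;
        # the earliest draw determines the next state and the waiting time.
        draws = {dest: rng.expovariate(rate) for dest, rate in hazards[state].items()}
        next_state = min(draws, key=draws.get)
        waits.append(draws[next_state])
        path_states.append(next_state)
        state = next_state
    return path_states, waits


rng = random.Random(0)  # fixed seed so the run is reproducible
paths = [simulate_path(1, rng) for _ in range(10_000)]

# Summary statistics over the simulated paths.
path_counts = Counter(tuple(visited) for visited, _ in paths)
share_via_state_2 = path_counts[(1, 2, 3)] / len(paths)
mean_time_to_terminal = sum(sum(waits) for _, waits in paths) / len(paths)
print(f"share of paths passing through state 2: {share_via_state_2:.3f}")
print(f"mean time to reach state 3: {mean_time_to_terminal:.2f}")
```

With constant hazards the simulated share of paths through state 2 should be close to 0.10 / (0.10 + 0.02), which gives a quick sanity check on the sampler. Each path is simulated independently, so the loop over paths is the natural place to parallelize.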
The package is designed to allow modular usage by both experienced researchers and non-expert users. In addition to fitting a multi-state model to given data, PyMSM allows the user to simulate trajectories from a predefined model, thus creating a multi-state dataset. This could be a valuable research tool, both for sharing simulated individual-level data in place of sensitive data and for any downstream task that needs individual trajectories.
Citation
If you found this library useful in academic research, please cite:
@article{Rossman2022,
  doi = {10.21105/joss.04566},
  url = {https://doi.org/10.21105/joss.04566},
  year = {2022},
  author = {Hagai Rossman and Ayya Keshet and Malka Gorfine},
  title = {PyMSM: Python package for Competing Risks and Multi-State models for Survival Data},
  journal = {Journal of Open Source Software}
}
Also consider starring the project on GitHub
This project is based on methods first introduced by the authors of Roimi et al. 2021.
Original R code by Jonathan Somer, Asaf Ben Arie, Rom Gutman, Uri Shalit & Malka Gorfine available here.
Also see Rossman & Meir et al. 2021 for an application of this model to COVID-19 hospitalization data.