Home Applications iris-fhirsqlbuilder-dbt-integratedml

iris-fhirsqlbuilder-dbt-integratedml

This application is not supported by InterSystems Corporation. Please be notified that you use it at your own risk.
0
0 reviews
0
Awards
47
Views
0
IPM installs
0
0
Details
Releases
Reviews
Issues
Pull requests
Demonstration of building predictive models trained on FHIR data

What's new in this version

added tags to description

iris-fhirsqlbuilder-dbt-integratedml

“ML on FHIR”

This repo demonstrates an end-to-end workflow to build an ML model for disease risk prediction starting from synthetic patient data in FHIR format, roughly based upon this paper: Chen, A., Chen, D.O. Simulation of a machine learning enabled learning health system for risk prediction using synthetic patient data. Sci Rep 12, 17917 (2022). https://doi.org/10.1038/s41598-022-23011-4. That paper started from an export of Synthea synthetic data in CSV format, and used an undisclosed data processing pipeline to flatten and collate the SNOMED and LOINC codes from multiple tables into a single “ML-ready” table with one row per patient. The ML-ready table is ideal for using AutoML tabular machine learning techniques to predict risk of a disease or complication, and the authors developed machine learning models using Python and scikit-learn and other libraries.

In contrast, this repository demonstrates a similar workflow with the following modifications:

  • Start from synthetic patient data in FHIR format, loaded into InterSystems IRIS for Health FHIR repository
  • Use FHIR SQL Builder feature to project SQL tables containing select information extracted from the FHIR repository
  • Use dbt to transform multiple SQL tables for Patient, Observation, etc. into a single ML-ready table
  • Build Machine Learning model within InterSystems IRIS IntegratedML, which enables data analysts to build and manage in-database ML models with elegant SQL syntax

Therefore, we have provided an end-to-end demonstration of using InterSystems’ unique capabilities in data management and Machine Learning, combined with open source projects Synthea and dbt, to process, analyze and develop machine learning models that are then accesses easily from SQL for instant viewing in dashboards like Superset!

This work was presented at InterSystems Global Summit 2023, see presentation recording!

Prepare FHIR data for ML

  • Start environment with Docker Compose
docker-compose up -d --build

The start may take a while, until the FHIR server will be installed and activated

  • Generate FHIR data with Synthea – 100 patients, ages 30-100 years old, seed = 1234, clinician seed = 1234
./run_synthea.sh -p 100 -a 30-100 -s 1234 -cs 1234
  • Load FHIR Data to FHIRServer
./load-data.py

FHIR SQL Builder configured to expose FHIR data through SQL schema fhir

  • Install dbt and dependencies
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
cd dbt
dbt deps
  • Check connection to IRIS
dbt debug
  • Load seeds and run data transformation with tests
dbt build

After all this steps IRIS will contain table mlonfhir.summary ready for ML

Read more
Made with
Version
1.0.129 Mar, 2024
ObjectScript quality test
Category
Analytics
Works with
InterSystems IRIS for HealthInterSystems FHIR
First published
29 Mar, 2024