An example of how to use the InterSystems IRIS for Health FHIR database to build ML models through InterSystems IRIS IntegratedML.
IntegratedML is a great feature for training, testing and deploying ML models. FHIR is a powerful standard for health information interoperability. This project aims to show how to use IRIS/IRIS for Health tools, such as DTL transformations, to prepare FHIR data for ML models in IntegratedML.
Some potential applications for ideas presented in this project:
Clone/git pull the repo into any local directory
$ git clone https://github.com/jrpereirajr/fhir-integratedml-example.git
Open the terminal in this directory and run:
$ cd fhir-integratedml-example
$ docker-compose up -d
If you’d like to keep a log of what happened during installation, use this command instead:
$ docker-compose up -d > build-log.txt 2>&1
To initialize an IRIS terminal, follow these steps:
In a PowerShell/cmd terminal, run:
docker exec -it fhir-integratedml-example_iris_1 bash
In the container’s Linux shell, create an IRIS session:
irissession iris
To demonstrate the project concept, two models were set up, one to predict no-shows and one to predict heart failure.
First, training datasets were used to generate synthetic FHIR resources. These datasets had information about patients, conditions, observations, encounters, appointments and reminders sent to patients - represented by different FHIR resources. This step emulates a true FHIR database, in which no-show and heart failure predictions could be applied.
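As a rough illustration of this step, the sketch below turns one tabular record into two FHIR resources. It is not the project's actual generator. The column names (`patient_id`, `sex`, `serum_sodium`), the 1 = male encoding and the unit are assumptions made for the example.

```python
import json


def row_to_fhir_resources(row):
    """Turn one tabular record into a Patient and an Observation resource.

    The column names and encodings used here are hypothetical and may not
    match the project's training datasets.
    """
    patient = {
        "resourceType": "Patient",
        "id": str(row["patient_id"]),
        # Assumption: the dataset encodes sex as 1 = male, 0 = female
        "gender": "male" if row["sex"] == 1 else "female",
    }
    observation = {
        "resourceType": "Observation",
        "status": "final",
        # Free-text code only; a real resource would normally carry a coded concept
        "code": {"text": "serum sodium"},
        # Link the observation back to the patient it belongs to
        "subject": {"reference": f"Patient/{patient['id']}"},
        # Assumption: the value is expressed in mEq/L
        "valueQuantity": {"value": row["serum_sodium"], "unit": "mEq/L"},
    }
    return [patient, observation]


row = {"patient_id": 1, "sex": 1, "serum_sodium": 137}
resources = row_to_fhir_resources(row)
print(json.dumps(resources, indent=2))
```

Each dataset row becomes several linked resources, which is what makes the later recombination into single rows necessary.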
With the FHIR database ready to use, the data needs to be transformed by combining the FHIR resources relevant to each problem into a single table. This combination is done by the DTL transformations NoShowDTL and HeartFailureDTL:
Because DTL transformations can be exported and imported, it’s possible to share ML models applied to FHIR data. These transformations can also be extended by other teams if necessary.
After applying the DTL transformations, FHIR resources are mapped to single rows, creating tables which can be used to train ML models for no-show and heart failure predictions.
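The idea of mapping several FHIR resources to a single row can be sketched in a few lines of Python. This is only a conceptual stand-in for what the DTL transformations do inside IRIS. The resource fields picked here and the resulting column names are assumptions, not the project's actual mapping.

```python
def flatten_to_row(resources):
    """Combine one patient's FHIR resources into a flat dict (one table row)."""
    row = {}
    for res in resources:
        kind = res.get("resourceType")
        if kind == "Patient":
            row["patient_id"] = res["id"]
            row["gender"] = res.get("gender")
        elif kind == "Observation":
            # Use the observation's free-text code as the column name
            column = res["code"]["text"].replace(" ", "_")
            row[column] = res["valueQuantity"]["value"]
        elif kind == "Appointment":
            # "noshow" is one of the status codes defined for FHIR Appointment
            row["noshow"] = res.get("status") == "noshow"
    return row


resources = [
    {"resourceType": "Patient", "id": "1", "gender": "female"},
    {
        "resourceType": "Observation",
        "code": {"text": "serum sodium"},
        "valueQuantity": {"value": 137},
    },
    {"resourceType": "Appointment", "status": "noshow"},
]
row = flatten_to_row(resources)
print(row)
```

The resulting dict has one key per feature column plus the target column, which is the shape a table needs before a model can be trained on it.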
To train and test models using IntegratedML, use the following SQL statements.
They are executed during installation, but you are welcome to re-run them and try IntegratedML yourself.
-- create the training dataset
CREATE OR REPLACE VIEW PackageSample.NoShowMLRowTraining AS SELECT * FROM PackageSample.NoShowMLRow WHERE ID < 1800
-- create the testing dataset
CREATE OR REPLACE VIEW PackageSample.NoShowMLRowTest AS SELECT * FROM PackageSample.NoShowMLRow WHERE ID >= 1800
-- avoid errors in CREATE MODEL command; ignore any error here
DROP MODEL NoShowModel
-- creates an IntegratedML model that predicts the Noshow column from the other columns, using the PackageSample.NoShowMLRowTraining dataset for the training step; the seed parameter ensures reproducible results
CREATE MODEL NoShowModel PREDICTING (Noshow) FROM PackageSample.NoShowMLRowTraining USING {"seed": 6}
-- trains the model, as set up in CREATE MODEL command
TRAIN MODEL NoShowModel
-- display information about the trained model, such as which ML model was selected by IntegratedML
SELECT * FROM INFORMATION_SCHEMA.ML_TRAINED_MODELS
-- use the PREDICT function to see how to use the model in SQL statements
SELECT top 10 PREDICT(NoShowModel) AS PredictedNoshow, Noshow AS ActualNoshow FROM PackageSample.NoShowMLRowTest
-- run a validation on testing dataset and calculate the model performance metrics
VALIDATE MODEL NoShowModel FROM PackageSample.NoShowMLRowTest
-- display performance metrics
SELECT * FROM INFORMATION_SCHEMA.ML_VALIDATION_METRICS
-- create the training dataset
CREATE OR REPLACE VIEW PackageSample.HeartFailureMLRowTraining AS SELECT DEATHEVENT,age,anaemia,creatininephosphokinase,diabetes,ejectionfraction,highbloodpressure,platelets,serumcreatinine,serumsodium,sex,smoking,followuptime FROM PackageSample.HeartFailureMLRow WHERE ID < 200
-- create the testing dataset
CREATE OR REPLACE VIEW PackageSample.HeartFailureMLRowTest AS SELECT DEATHEVENT,age,anaemia,creatininephosphokinase,diabetes,ejectionfraction,highbloodpressure,platelets,serumcreatinine,serumsodium,sex,smoking,followuptime FROM PackageSample.HeartFailureMLRow WHERE ID >= 200
-- avoid errors in CREATE MODEL command; ignore any error here
DROP MODEL HeartFailureModel
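-- creates an IntegratedML model that predicts the DEATHEVENT column from the other columns, using the PackageSample.HeartFailureMLRowTraining dataset for the training step; the seed parameter ensures reproducible results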
CREATE MODEL HeartFailureModel PREDICTING (DEATHEVENT) FROM PackageSample.HeartFailureMLRowTraining USING {"seed": 6}
-- trains the model, as set up in CREATE MODEL command
TRAIN MODEL HeartFailureModel
-- display information about the trained model, such as which ML model was selected by IntegratedML
SELECT * FROM INFORMATION_SCHEMA.ML_TRAINED_MODELS
-- use the PREDICT function to see how to use the model in SQL statements
SELECT top 10 PREDICT(HeartFailureModel) AS PredictedHeartFailure, DEATHEVENT AS ActualHeartFailure FROM PackageSample.HeartFailureMLRowTest
-- run a validation on testing dataset and calculate the model performance metrics
VALIDATE MODEL HeartFailureModel FROM PackageSample.HeartFailureMLRowTest
-- display performance metrics
SELECT * FROM INFORMATION_SCHEMA.ML_VALIDATION_METRICS
The last SQL statement displays the model’s classification performance metrics.
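As a rough guide to reading classification metrics, the sketch below computes the textbook binary definitions of precision, recall, F-measure and accuracy from pairs of actual and predicted labels. It is an illustration only. How IntegratedML computes or aggregates its own figures is not shown here, and the sample labels are made up.

```python
def classification_metrics(actual, predicted, positive=1):
    """Compute standard binary classification metrics from two label lists."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    fp = sum(1 for a, p in pairs if a != positive and p == positive)
    fn = sum(1 for a, p in pairs if a == positive and p != positive)
    correct = sum(1 for a, p in pairs if a == p)

    # Guard every ratio against an empty denominator
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f_measure = 2 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0
    accuracy = correct / len(pairs) if pairs else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
        "accuracy": accuracy,
    }


# Made-up labels, e.g. the two columns returned by a PREDICT query
actual = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = classification_metrics(actual, predicted)
print(metrics)
```

Feeding it the PredictedNoshow/ActualNoshow pairs from the PREDICT queries above is one way to sanity-check the numbers by hand.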
The same transformations can be applied to FHIR resources coming from external systems, for instance through a REST API (check out the code):
If API requests fail with an error saying that the model doesn’t exist, something probably went wrong while training the models during container creation. Try re-running the training methods. Open an IRIS terminal and run:
ZN "FHIRSERVER"
Do ##class(PackageSample.Utils).TrainNoShowModel()
Do ##class(PackageSample.Utils).TrainHeartFailureModel()
FHIR resources used as templates: http://hl7.org/fhir/
Dataset for no-show model training: IntegratedML template
Dataset for heart failure model training: Kaggle