
iris-airflow-provider

InterSystems does not provide technical support for this project. Please contact its developer for technical assistance.
Apache Airflow provider for InterSystems IRIS.

What's new in this version

Initial Release

InterSystems IRIS provider for Apache Airflow


The InterSystems IRIS Provider for Apache Airflow enables seamless integration between Airflow workflows and the InterSystems IRIS data platform. It provides native connection support and operators for executing IRIS SQL and automating IRIS-driven tasks within modern ETL/ELT pipelines.

Designed for reliability and ease of use, this provider helps data engineers and developers build scalable, production-ready workflows for healthcare, interoperability, analytics, and enterprise data processing—powered by InterSystems IRIS.

About Apache Airflow

Apache Airflow is the leading open-source platform to programmatically author, schedule, and monitor data pipelines and workflows using Python. Workflows are defined as code (DAGs), making them version-controlled, testable, and reusable. With a rich UI, 100+ built-in operators, dynamic task generation, and native support for cloud providers, Airflow powers ETL/ELT, ML pipelines, and batch jobs at companies like Airbnb, Netflix, and Spotify.
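
For a flavor of the workflows-as-code model, here is a minimal, generic DAG; the file name, dag_id, and task are illustrative only and are not part of this provider:

# dags/hello_airflow.py – a minimal, illustrative DAG
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def say_hello():
    # Plain Python callable executed by the task
    print("Hello from Airflow")

with DAG(
    dag_id="hello_airflow",           # illustrative name
    start_date=datetime(2025, 12, 1),
    schedule=None,                    # run only when triggered manually
    catchup=False,
) as dag:
    hello = PythonOperator(task_id="say_hello", python_callable=say_hello)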

Application Layout


Features

  • IrisSQLOperator – Execute raw SQL/ObjectScript with Jinja templating
  • IrisHook – SQLAlchemy-compatible hook for pandas, ORM, and custom logic (see the sketch after this list)
  • Connection management with Airflow Connections
  • Reliable bulk data loading via SQLAlchemy + pandas
  • Zero external dependencies beyond standard Airflow & IRIS Python drivers
  • Comprehensive examples for real-world ETL patterns
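
As a quick illustration of the IrisHook feature above, the sketch below reads IRIS data into a pandas DataFrame. It assumes the hook's get_engine() method and the Test.AirflowDemo table, both of which appear in the sample DAGs later on this page:

import pandas as pd

from airflow_provider_iris.hooks.iris_hook import IrisHook

# The hook exposes a SQLAlchemy engine (see the ORM example below)
hook = IrisHook()
engine = hook.get_engine()

# Read an IRIS table straight into a DataFrame
df = pd.read_sql("SELECT TOP 10 ID, Message, RunDate FROM Test.AirflowDemo", engine)
print(df)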

Installation

Docker (e.g., for development purposes)

Clone or pull the repository into a local directory:

$ git clone https://github.com/mwaseem75/iris-fhir-lab.git

Open a terminal in this directory and run the commands below.

Initialize the Airflow metadata database:

$ docker compose up airflow-init

Start IRIS and the entire Airflow platform:

$ docker compose up -d

Run the Application

Navigate to http://localhost:8080/ to open the Airflow UI.

View/Run Sample DAGs

The application ships with three sample DAGs. Click Dags in the Airflow UI.
Use the toggle button to enable or disable a DAG.
Click the trigger arrow to run a DAG manually.
Double-click 01-IRIS-demo to view the DAG.
This DAG has three tasks: Create Table, Insert Data, and Retrieve Data.
Select a task and click its box to view the task details. Click Code to view the task's code, and Logs to view its log output.

About the airflow-provider-iris package


Add IRIS connection

Go to Admin → Connections → Add Connection
Click the Save button to add the connection.

Use your InterSystems IRIS connection by setting the iris_conn_id parameter in any of the provided operators.

In the example below, the IrisSQLOperator uses the iris_conn_id parameter to connect to the IRIS instance when the task runs:

from datetime import datetime

from airflow import DAG
from airflow_provider_iris.operators.iris_operator import IrisSQLOperator

with DAG(
    dag_id="01_IRIS_Raw_SQL_Demo_Local",
    start_date=datetime(2025, 12, 1),
    schedule=None,
    catchup=False,
    tags=["iris-contest"],
) as dag:

    create_table = IrisSQLOperator(
        task_id="create_table",
        iris_conn_id="ContainerInstance",
        sql="""CREATE TABLE IF NOT EXISTS Test.AirflowDemo (
               ID INTEGER IDENTITY PRIMARY KEY,
               Message VARCHAR(200),
               RunDate TIMESTAMP DEFAULT CURRENT_TIMESTAMP
            )""",
    )
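
Since the operator supports Jinja templating (see Features), the sql argument can also reference Airflow template variables. A minimal sketch using the built-in {{ ds }} (logical date) variable; the task name and message are illustrative:

    # Inside the same `with DAG(...)` block as above
    log_run = IrisSQLOperator(
        task_id="log_run",
        iris_conn_id="ContainerInstance",
        sql="INSERT INTO Test.AirflowDemo (Message) VALUES ('Run for {{ ds }}')",
    )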

Add a new DAG

A DAG (Directed Acyclic Graph) is a Python script that defines a workflow as a collection of tasks with dependencies and a schedule in Apache Airflow.
Airflow automatically picks up DAGs from the dags folder, so to add a new DAG, add a new Python file to that folder, as sketched below.
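
For example, a new file such as dags/my_first_iris_dag.py (the file name is arbitrary) could hold a single-task DAG following the same pattern as the bundled examples; this is a sketch assuming the ContainerInstance connection defined earlier, not bundled code:

# dags/my_first_iris_dag.py – illustrative file name
from datetime import datetime

from airflow import DAG
from airflow_provider_iris.operators.iris_operator import IrisSQLOperator

with DAG(
    dag_id="my_first_iris_dag",
    start_date=datetime(2025, 12, 1),
    schedule=None,
    catchup=False,
) as dag:
    # A trivial connectivity check against IRIS
    ping = IrisSQLOperator(
        task_id="ping_iris",
        iris_conn_id="ContainerInstance",
        sql="SELECT 1",
    )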

Example DAGs (Included in examples/)

  1. Raw SQL Operator – Simple & Powerful
# dags/01_IRIS_Raw_SQL_Demo.py
from datetime import datetime
from airflow import DAG
from airflow_provider_iris.operators.iris_operator import IrisSQLOperator

with DAG(
    dag_id="01_IRIS_Raw_SQL_Demo",
    start_date=datetime(2025, 12, 1),
    schedule=None,
    catchup=False,
    tags=["iris-contest"],
) as dag:

    create_table = IrisSQLOperator(
        task_id="create_table",
        sql="""CREATE TABLE IF NOT EXISTS Test.AirflowDemo (
               ID INTEGER IDENTITY PRIMARY KEY,
               Message VARCHAR(200),
               RunDate TIMESTAMP DEFAULT CURRENT_TIMESTAMP
            )""",
    )

    insert = IrisSQLOperator(
        task_id="insert_row",
        sql="INSERT INTO Test.AirflowDemo (Message) VALUES ('Hello from raw SQL operator')",
    )

    select = IrisSQLOperator(
        task_id="select_rows",
        sql="SELECT ID, Message, RunDate FROM Test.AirflowDemo ORDER BY ID DESC",
    )

    create_table >> insert >> select
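
The final line uses Airflow's >> operator to chain the tasks, so create_table runs before insert_row, which in turn runs before select_rows.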

  2. ORM + Pandas Integration (Real-World ETL)
    Uses SQLAlchemy + pandas with the only known reliable method for bulk inserts into IRIS.
# dags/example_sqlalchemy_dag.py

from datetime import datetime
from airflow import DAG
from airflow.operators.python import PythonOperator
import pandas as pd

# Import your hook and model
from airflow_provider_iris.hooks.iris_hook import IrisHook
from sqlalchemy import Column, Integer, String, DateTime, Float
from sqlalchemy.orm import declarative_base

Base = declarative_base()

class SalesRecord(Base):
    __tablename__ = "SalesRecord"
    __table_args__ = {"schema": "Test"}

    id        = Column(Integer, primary_key=True)
    region    = Column(String(50))
    amount    = Column(Float)
    sale_date = Column(DateTime)

def create_and_insert_orm(**context):
    hook = IrisHook()
    engine = hook.get_engine()

    # Create table if not exists
    Base.metadata.create_all(engine)

    # THIS IS THE ONLY METHOD THAT WORKS RELIABLY WITH IRIS RIGHT NOW
    data = [
        {"region": "Europe",        "amount": 12500.50, "sale_date": "2025-12-01"},
        {"region": "Asia",          "amount": 8900.00,  "sale_date": "2025-12-02"},
        {"region": "North America", "amount": 56700.00, "sale_date": "2025-12-03"},
        {"region": "Africa",        "amount": 34200.00, "sale_date": "2025-12-03"},
    ]
    df = pd.DataFrame(data)
    df["sale_date"] = pd.to_datetime(df["sale_date"])

    # pandas.to_sql with single-row inserts → IRIS accepts this perfectly
    df.to_sql(
        name="SalesRecord",
        con=engine,
        schema="Test",
        if_exists="append",
        index=False,
        method="multi",           # still fast
        chunksize=1               # ← THIS IS THE MAGIC LINE
    )
    print(f"Successfully inserted {len(df)} rows using pandas.to_sql() (chunksize=1)")

def query_orm(**context):
    hook = IrisHook()
    engine = hook.get_engine()
    df = pd.read_sql("SELECT * FROM Test.SalesRecord ORDER BY id", engine)
    for _, r in df.iterrows():
        print(f"ORM → {int(r.id):>3} | {r.region:<15} | ${r.amount:>10,.2f} | {r.sale_date.date()}")

with DAG(
    dag_id="02_IRIS_ORM_Demo",
    start_date=datetime(2025, 12, 1),
    schedule=None,
    catchup=False,
    tags=["iris-contest", "orm"],
) as dag:

    orm_create = PythonOperator(task_id="orm_create_and_insert", python_callable=create_and_insert_orm)
    orm_read   = PythonOperator(task_id="orm_read",              python_callable=query_orm)

    orm_create >> orm_read
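
Note: with chunksize=1, pandas sends one row per INSERT statement, so method="multi" effectively emits single-row statements; as the comments above indicate, this trades some speed for inserts that IRIS accepts reliably.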

  3. Synthetic Data Generator → Bulk Load
    Generate realistic sales data and load it efficiently.
from datetime import datetime, timedelta
from airflow import DAG
from airflow.operators.python import PythonOperator
import pandas as pd
import numpy as np
from airflow_provider_iris.hooks.iris_hook import IrisHook
from sqlalchemy import Column, Integer, String, DateTime, Float
from sqlalchemy.orm import declarative_base

Base = declarative_base()

class SalesRecord(Base):
    __tablename__ = "SalesRecord"
    __table_args__ = {"schema": "Test"}

    id        = Column(Integer, primary_key=True)
    region    = Column(String(50))
    amount    = Column(Float)
    sale_date = Column(DateTime)

# ----------- SYNTHETIC DATA GENERATION -----------

def generate_synthetic_sales(num_rows=500):
    """Create synthetic sales data for testing."""

    regions = [
        "North America", "South America", "Europe",
        "Asia-Pacific", "Middle East", "Africa"
    ]

    # Randomly pick regions
    region_data = np.random.choice(regions, size=num_rows)

    # Generate synthetic amounts between 10k and 120k
    amounts = np.random.uniform(10000, 120000, size=num_rows).round(2)

    # Generate random dates within the last 30 days
    start_date = datetime(2025, 11, 1)
    sale_dates = [start_date + timedelta(days=int(x)) for x in np.random.randint(0, 30, size=num_rows)]

    df = pd.DataFrame({
        "region": region_data,
        "amount": amounts,
        "sale_date": sale_dates
    })

    return df

# ----------- AIRFLOW TASK FUNCTION -----------

def bulk_load_from_csv(**context):

    df = generate_synthetic_sales(num_rows=200)   # Change number as needed

    hook = IrisHook()
    engine = hook.get_engine()

    Base.metadata.create_all(engine)

    df.to_sql("SalesRecord", con=engine, schema="Test", if_exists="append", index=False)
    print(f"Bulk loaded {len(df)} synthetic rows via pandas.to_sql()")

# ----------- DAG DEFINITION -----------

with DAG(
    dag_id="03_IRIS_Load_CSV_Synthetic_Demo",
    start_date=datetime(2025, 12, 1),
    schedule=None,
    catchup=False,
    tags=["iris-contest", "etl", "synthetic"],
) as dag:

    bulk_task = PythonOperator(
        task_id="bulk_load_synthetic_to_iris",
        python_callable=bulk_load_from_csv
    )
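
This DAG contains a single task, so no dependency chaining is needed; enable it and trigger it from the Airflow UI as described above.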

Version: 1.0.0 (07 Dec, 2025)
Python package: airflow-provider-iris
Ideas portal: https://ideas.intersystems.com/ideas/DPI-I-386
Category: Solutions
Works with: InterSystems IRIS for Health, InterSystems IRIS
First published: 07 Dec, 2025
Last edited: 07 Dec, 2025