Custom integration API schema

This schema details the format that API request payloads take.

When you create the integration in the Databand UI:

  1. Write the code to connect to your system and run it to pull the data from your system.
  2. Convert the received data to a request payload that conforms to the Databand schema (see Request structure).
  3. Send it to the API endpoint that was provided in the last step of the integration.
  4. Run the code periodically to send the data.

For a Python script that performs the actions from steps 1-3, go to this example. In this example, the code runs every 120 seconds.
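The four steps above can be sketched as a small polling loop. This is only a minimal sketch, not the referenced example script: `fetch_records`, `to_events`, and `send` are hypothetical callables that you supply, and `max_cycles` and `sleep` are parameters added here so that the loop can be stopped and tested.

```python
import time
from typing import Callable, Iterable, Optional


def run_integration(
    fetch_records: Callable[[], Iterable[dict]],   # step 1: pull data from your system (hypothetical)
    to_events: Callable[[Iterable[dict]], list],   # step 2: convert the data to Databand-schema events (hypothetical)
    send: Callable[[list], None],                  # step 3: send the payload to the endpoint (hypothetical)
    interval_seconds: float = 120,                 # step 4: same interval as the referenced example
    max_cycles: Optional[int] = None,              # None means: run until interrupted
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Repeat the pull/convert/send cycle and return the number of completed cycles."""
    cycles = 0
    while max_cycles is None or cycles < max_cycles:
        events = to_events(fetch_records())
        if events:  # skip the request when there is nothing to report
            send(events)
        cycles += 1
        if max_cycles is None or cycles < max_cycles:
            sleep(interval_seconds)
    return cycles
```

Passing the three steps in as callables keeps the scheduling logic separate from the code that talks to your system and to Databand.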

Databand schema

The Databand schema is based on the open source OpenLineage (OL) schema, with the following differences:

  • Databand accepts a single payload that contains all the events, whereas OL accepts only one event per API call.
  • Databand extensions to the OL schema include the log and startTime fields.
  • Databand tasks and runs are both referred to as "runs" in OL and are reported as RunEvent objects.
  • Data is reported only for a payload that includes a root run (a run without a parent).
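The root-run rule in the last bullet can be checked before a payload is sent. The following is a minimal sketch, assuming that the parent facet sits under run.facets as shown in the Request structure section; the function names are illustrative and are not part of any Databand API, and the events are abbreviated.

```python
def is_root_run(event: dict) -> bool:
    """Return True when the event's run has no parent facet (a root run)."""
    facets = (event.get("run") or {}).get("facets") or {}
    return "parent" not in facets


def has_root_run(payload: list) -> bool:
    """Return True when at least one event in the payload describes a root run."""
    return any(is_root_run(event) for event in payload)


# One payload carries all the events: the root run and its task.
payload = [
    {"job": {"name": "my_dag", "namespace": "airflow-prod"}, "run": {"runId": "run-1"}},
    {
        "job": {"name": "my_dag.task", "namespace": "airflow-prod"},
        "run": {"runId": "task-1", "facets": {"parent": {"run": {"runId": "run-1"}}}},
    },
]
```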

Relationships between the objects in Databand and OpenLineage

To understand the different types of relationships in Databand, look at the following diagram and its description:

[Figure: a diagram with a parent and upstream relationship]

In this case:

  • Parent Task is the parent of both Sub Task and Upstream Sub Task.
  • Upstream Sub Task is upstream for Sub Task (Upstream Sub Task follows Sub Task).

How would it translate to the OL schema?

  • In both Sub Task and Upstream Sub Task, a parent facet references Parent Task as the parent.
  • In Upstream Sub Task only, an input mentions Sub Task and states that Upstream Sub Task is its upstream task.

Look at the following snippet from the OL schema to see how the relationships are represented:
[
    {
        "eventTime": "<current time>",
        "eventType": "COMPLETE",
        "job": {
            "facets": {},
            "name": "Parent Task",
            "namespace": "project_name"
        },
        "runId": "<UUID(Parent Task)>"
    },
    {
        "eventTime": "<current time>",
        "eventType": "COMPLETE",
        "job": {
            "facets": {},
            "name": "Sub Task",
            "namespace": "project_name"
        },
        "facets": {
            "parent": { # reference to Parent Task as a parent task
                "_producer": "https://github.com/OpenLineage/OpenLineage/tree/1.11.3/client/python",
                "_schemaURL": "https://raw.githubusercontent.com/OpenLineage/OpenLineage/main/spec/OpenLineage.json#/definitions/ParentRunFacet",
                "run": {
                    "runId": "<UUID(Parent Task)>"
                }
            }
        },
        "runId": "<UUID(Sub Task)>"
    },
    {
        "eventTime": "<current time>",
        "eventType": "COMPLETE",
        "job": {
            "facets": {},
            "name": "Upstream Sub Task",
            "namespace": "project_name"
        },
        "inputs": [
            { # this references Sub Task to indicate Upstream Sub Task as its upstream
                "facets": {},
                "name": "Sub Task",
                "namespace": "project_name"
            }
        ],
        "facets": {
            "parent": { # reference to Parent Task as a parent task
                "_producer": "https://github.com/OpenLineage/OpenLineage/tree/1.11.3/client/python",
                "_schemaURL": "https://raw.githubusercontent.com/OpenLineage/OpenLineage/main/spec/OpenLineage.json#/definitions/ParentRunFacet",
                "run": {
                    "runId": "<UUID(Parent Task)>"
                }
            }
        },
        "runId": "<UUID(Upstream Sub Task)>"
    }
]
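The same three events can also be built programmatically. The following is a minimal sketch, assuming the nesting that the Request structure section uses (runId and facets under run); `make_event` is a hypothetical helper, and the producer URL is the placeholder value that appears in the examples of this topic.

```python
import uuid

# Placeholder values, matching the ones used in the examples of this topic.
PRODUCER = "https://some.producer.com/version/1.0"
PARENT_FACET_SCHEMA = (
    "https://raw.githubusercontent.com/OpenLineage/OpenLineage/main/spec/"
    "OpenLineage.json#/definitions/ParentRunFacet"
)


def make_event(name, namespace, event_time, parent=None, upstream=()):
    """Build one COMPLETE event; `parent` is another event, `upstream` a list of events."""
    event = {
        "eventTime": event_time,
        "eventType": "COMPLETE",
        "job": {"facets": {}, "name": name, "namespace": namespace},
        "run": {"runId": str(uuid.uuid4()), "facets": {}},
    }
    if parent is not None:
        # Parent relationship: reference the parent's job and runId.
        event["run"]["facets"]["parent"] = {
            "_producer": PRODUCER,
            "_schemaURL": PARENT_FACET_SCHEMA,
            "job": {"name": parent["job"]["name"], "namespace": parent["job"]["namespace"]},
            "run": {"runId": parent["run"]["runId"]},
        }
    if upstream:
        # Upstream relationship: list the related tasks by name and namespace.
        event["inputs"] = [
            {"facets": {}, "name": u["job"]["name"], "namespace": u["job"]["namespace"]}
            for u in upstream
        ]
    return event


now = "2024-04-09T06:34:06.600323Z"
parent_task = make_event("Parent Task", "project_name", now)
sub_task = make_event("Sub Task", "project_name", now, parent=parent_task)
upstream_sub_task = make_event(
    "Upstream Sub Task", "project_name", now, parent=parent_task, upstream=[sub_task]
)
payload = [parent_task, sub_task, upstream_sub_task]
```

Only Upstream Sub Task carries an inputs entry, and both tasks point to the runId of Parent Task, which has no parent facet and is therefore the root run.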

Endpoint

Send the request payload to the endpoint URL that was generated in the last step of the integration in the Databand UI.

The endpoint has the following format: https://<your-Databand-hostname>/api/v1/tracking/open-lineage/<tracking_source_uid>/events/bulk

Method

To send the payload, use the POST method.
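Building the endpoint URL and the POST request might look like the following sketch, which uses only the Python standard library. The function names and the example hostname are hypothetical. The Content-Type header is an assumption, and any authentication headers that your deployment requires are not described in this section, so pass them through `extra_headers`.

```python
import json
import urllib.request


def bulk_events_url(hostname: str, tracking_source_uid: str) -> str:
    """Fill in the endpoint format with your hostname and tracking source UID."""
    return f"https://{hostname}/api/v1/tracking/open-lineage/{tracking_source_uid}/events/bulk"


def build_request(url: str, payload: list, extra_headers=None) -> urllib.request.Request:
    """Serialize the list of events and wrap it in a POST request."""
    headers = {"Content-Type": "application/json"}  # assumption: the body is sent as JSON
    headers.update(extra_headers or {})
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def send_payload(url: str, payload: list, extra_headers=None, timeout: float = 30) -> int:
    """Send the payload and return the HTTP status code."""
    request = build_request(url, payload, extra_headers)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

In practice, use the URL that the Databand UI generated rather than assembling it by hand; `bulk_events_url` only illustrates the format.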

Request structure

The custom integration payload consists of a series of run events, which contain information about the execution of runs and tasks. Each event consists of basic event information and facets, which extend the event with other important information, such as tags, log, and errorMessage.

For a better understanding of how the payload is created, look at the following schema. It shows the structure of the message that is used in the POST request.

The schema includes two payloads: a run payload and a task payload.

What are runs and tasks?

  • Run: A run is an individual execution of your pipeline. Pipelines in Databand consist of a collection of runs over time. You can view all historical runs that have been logged for a particular pipeline by going to its Runs page in the Databand UI.
  • Task: A task is a logical step in your pipeline. Most pipelines consist of a collection of tasks that are executed in a specific sequence. In Databand, tasks provide a way to review metrics from and measure the performance of your pipeline at a finer grain. As a result, you can debug issues more precisely and identify bottlenecks in your pipeline’s performance.

For more information about runs and tasks, go to Tasks and runs uid in Databand and OpenLineage.

Because several fields of the payload consist of identical child objects (for example, the errorMessage facet), they are described only once.

Because the Databand schema is based on the OL schema, some of the fields are standard OL fields and are optional. It is still advisable to include them in the payload.

Note: If you decide to use an optional field, its nested fields that are marked as required must be provided. For more details, look at the API tables that are included in the topic.
[
  {
    "eventType": ...,
    "eventTime": ...,
    "inputs": [],
    "job": {
      // Job object
      "facets": {},
      "namespace": ...,
      "name": ...
    },
    "outputs": [],
    "run": {
      // Run object
      "runId": ...,
      "facets": {
        // Run Facets object
        "nominalTime": {
          // Run Facet - nominalTime object
          "_producer": ...,
          "_schemaURL": ...,
          "nominalStartTime": ...,
          "nominalEndTime": ...
        },
        "log": {
          // Run Facet - log object
          "_producer": ...,
          "_schemaURL": ...,
          "logBody": ...,
          "logUrl": ...
        },
        "startTime": {
          // Run Facet - startTime object
          "_producer": ...,
          "_schemaURL": ...,
          "startTime": ...
        },
        "errorMessage": {
          // Run Facet - errorMessage object
          "_producer": ...,
          "_schemaURL": ...,
          "message": ...,
          "stackTrace": ...,
          "programmingLanguage": ...
        },
        "tags": {
          // Run Facet - tags object
          "projectName": ...,
          "runName": ...,
          "_schemaURL": ...,
          "_producer": ...
        }
      }
    },
    "producer": ...,
    "schemaURL": ...
  },
  {
    "eventTime": ...,
    "eventType": ...,
    "job": {
      // Job object
      "facets": {},
      "namespace": ...,
      "name": ...
    },
    "inputs": [
      // Inputs object
      {
        "facets": {},
        "name": ...,
        "namespace": ...
      }
    ],
    "run": {
      // Run object
      "runId": ...,
      "facets": {
        // Task Facets object
        "nominalTime": {
          // Task Facet - nominalTime object
          "_producer": ...,
          "_schemaURL": ...,
          "nominalStartTime": ...,
          "nominalEndTime": ...
        },
        "log": {
          // Run Facet - log object
          "_producer": ...,
          "_schemaURL": ...,
          "logBody": ...,
          "logUrl": ...
        },
        "startTime": {
          // Run Facet - startTime object
          "_producer": ...,
          "_schemaURL": ...,
          "startTime": ...
        },
        "errorMessage": {
          // Run Facet - errorMessage object
          "_producer": ...,
          "_schemaURL": ...,
          "message": ...,
          "stackTrace": ...,
          "programmingLanguage": ...
        },
        "parent": {
          // Task Facet - parent object
          "_producer": ...,
          "_schemaURL": ...,
          "job": {
            // Task Facet - Parent - job object
            "name": ...,
            "namespace": ...
          },
          "run": {
            // Task Facet - Parent - run object
            "runId": ...
          }
        },
        "metrics": [
          // Task Facet - metrics
          {
            "metricName": ...,
            "timestamp": ...,
            "metricValue": ...,
            "source": ...,
            "_producer": ...,
            "_schemaURL": ...
          },
          ...
        ]
      }
    },
    "producer": ...,
    "schemaURL": ...
  }
]

For the sake of clarity, the fields are described for the run and task payloads separately.
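Before sending a payload, the fields that the tables in this topic mark as Required can be checked with a short function. The following is a minimal sketch under that reading of the tables; `missing_required_fields` is a hypothetical helper, and it checks only the top-level fields, job, runId, and the schema metadata of each facet object.

```python
def missing_required_fields(event: dict) -> list:
    """Return dotted paths of required fields that are absent from one event."""
    missing = [
        name
        for name in ("eventTime", "job", "run", "producer", "schemaURL")
        if name not in event
    ]
    job = event.get("job")
    if isinstance(job, dict):
        missing += [f"job.{name}" for name in ("name", "namespace") if name not in job]
    run = event.get("run")
    if isinstance(run, dict):
        if "runId" not in run:
            missing.append("run.runId")
        for facet_name, facet in (run.get("facets") or {}).items():
            if isinstance(facet, dict):  # the metrics facet is a list and is skipped here
                missing += [
                    f"run.facets.{facet_name}.{key}"
                    for key in ("_producer", "_schemaURL")
                    if key not in facet
                ]
    return missing
```

An empty list means that none of the checked fields is missing; it does not guarantee that the payload is accepted.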

Run payload

Note: The run payload is mapped as an OL RunEvent without a ParentFacet. For a single API call that contains the current_state of the run, a single event per run is expected.

RunEvent

Table 1. RunEvent object fields and their description
Element | Type | Description | Required
eventType | string | Values: START, COMPLETE, FAIL, ABORT, RUNNING, OTHER. Providing both a start and a finish state event is not mandatory, but if you provide only a finish event, you need to provide the startTime facet. If you don't provide the eventType, it gets the OTHER value by default. | Not required
eventTime | string | The ISO date and time when the event occurred. Used as start_time or end_time, depending on the eventType. An example of a value: 2024-04-09T06:34:06.600323Z. | Required
job | object | See: Job object for nested elements definition. | Required
run | object | See: Run object for nested elements definition. | Required
inputs | array | Not used at the moment. Kept in the schema for compatibility with OL. | Not required
outputs | array | Not used at the moment. Kept in the schema for compatibility with OL. | Not required
producer | string | URI identifying the producer of this metadata (for example, a Git URL with a given tag or SHA). Not used at the moment. Kept in the schema for compatibility with OL. | Required
schemaURL | string | The JSON pointer URL to the corresponding version of the schema definition for this RunEvent. Not used at the moment. Kept in the schema for compatibility with OL. Default value: https://openlineage.io/spec/1-0-5/OpenLineage.json#/definitions/RunEvent | Required

Job

Jobs are identified by a unique name within a namespace. The job namespace should be derived from the scheduler, for example, airflow-prod. The combined namespace and name of a job are usually enough to uniquely identify it within your environment. All tasks of the same run usually have the same namespace as the run because they come from the same orchestrator.

Table 2. Job object fields and their description
Element | Type | Description | Required
name | string | Used as the pipeline name. | Required
namespace | string | Jobs have a name that is unique within their namespace. The namespace is the root of the naming hierarchy and prevents collisions when a job with the same name is sent from different systems (for example, from different Airflow instances). | Required
facets | object | The job facets. Standard extensions: Documentation, JobType, Ownership, SourceCode, or Location. Not used at the moment. Kept in the schema for compatibility with OL. | Not required

Run

Table 3. Run object fields and their description
Element | Type | Description | Required
runId | string | The globally unique ID of the run associated with the job. Used as uid. See: OL - Run naming. An example of a value: e5648915-6c19-4b1e-8817-5335dabf146d. | Required
facets | object | See: Run facet objects for nested elements definition. | Not required
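A run event that contains only the fields marked as Required in the RunEvent, Job, and Run tables, plus an eventType, could be assembled as in the following sketch. `minimal_run_event` is a hypothetical helper. Note that, according to the eventType description, an event that reports only a finish state also needs the startTime facet, which this sketch does not add.

```python
import uuid
from datetime import datetime, timezone

RUN_EVENT_SCHEMA = "https://openlineage.io/spec/1-0-5/OpenLineage.json#/definitions/RunEvent"


def minimal_run_event(job_name, namespace, producer, event_type, run_id=None, event_time=None):
    """Build a run event with the required fields only."""
    when = event_time or datetime.now(timezone.utc)
    return {
        "eventType": event_type,                         # optional; OTHER is used when omitted
        "eventTime": when.isoformat(),                   # ISO date and time as a string
        "job": {"name": job_name, "namespace": namespace},
        "run": {"runId": run_id or str(uuid.uuid4())},   # globally unique ID of the run
        "producer": producer,
        "schemaURL": RUN_EVENT_SCHEMA,
    }


event = minimal_run_event("my_dag", "airflow-prod", "https://custom.api", "START")
```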

Run Facets

Table 4. RunFacet object fields and their description
Element | Type | Description | Required
nominalTime | object | The nominalTime facet describes the nominal start of the run. It is used for execution_date. See: OL - nominalTime facet. See: Run Facet - nominalTime for nested elements definition. | Not required
log | object | Log reporting. At least one of logBody or logUrl is required. See: Run Facet - log for nested elements definition. | Not required
startTime | object | See: Run Facet - startTime for nested elements definition. | Not required
errorMessage | object | Used for error reporting. See: OL - errorMessage facet. See: Run Facet - errorMessage for nested elements definition. | Not required
tags | object | See: Run Facet - tags for nested elements definition. | Not required

Run Facet - nominalTime

Table 5. RunFacet - nominalTime fields and their description
Element | Type | Description | Required
nominalStartTime | string | Used as execution_date, the reported time in the UI. An example of a value: 2024-04-09T06:34:06.600210Z. | Required
nominalEndTime | string | The nominal end time of the run. An example of a value: 2024-04-09T06:34:06.600210Z. Not used at the moment. Kept in the schema for compatibility with OL. | Not required
_producer | string, URI | Schema metadata. Not used at the moment. Kept in the schema for compatibility with OL. | Required
_schemaURL | string, URI | Schema metadata. Not used at the moment. Kept in the schema for compatibility with OL. Default value: https://github.com/OpenLineage/OpenLineage/blob/main/spec/facets/NominalTimeRunFacet.json | Required

Run Facet - log

Table 6. RunFacet - log fields and their description
Element | Type | Description | Required
logBody | string | Log body text. Displayed in the UI as Log. | Not required
logUrl | string, URI | Remote log URL. Displayed in the UI as Log URL. | Not required
_producer | string, URI | Schema metadata. Not used at the moment. Kept in the schema for compatibility with OL. | Required
_schemaURL | string, URI | Schema metadata. Not used at the moment. Kept in the schema for compatibility with OL. Default value: https://raw.githubusercontent.com/OpenLineage/OpenLineage/main/spec/OpenLineage.json#/definitions/BaseFacet | Required
Note: logBody and logUrl fields are optional, but if you include the Run Facet - log, you need to provide at least one of them.
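The rule in the note can be enforced when the facet is built. The following is a minimal sketch; `log_facet` is a hypothetical helper, and the schema URL is the default value from the table.

```python
BASE_FACET_SCHEMA = (
    "https://raw.githubusercontent.com/OpenLineage/OpenLineage/main/spec/"
    "OpenLineage.json#/definitions/BaseFacet"
)


def log_facet(producer, log_body=None, log_url=None):
    """Build the log facet; at least one of logBody or logUrl must be given."""
    if log_body is None and log_url is None:
        raise ValueError("Provide at least one of logBody or logUrl")
    facet = {"_producer": producer, "_schemaURL": BASE_FACET_SCHEMA}
    if log_body is not None:
        facet["logBody"] = log_body
    if log_url is not None:
        facet["logUrl"] = log_url
    return facet
```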

Run Facet - startTime

Table 7. RunFacet - startTime fields and their description
Element | Type | Description | Required
startTime | string | The ISO date and time when the run started. An example of a value: 2024-04-07T06:34:06.600261Z. | Required
_producer | string, URI | Schema metadata. Not used at the moment. Kept in the schema for compatibility with OL. | Required
_schemaURL | string, URI | Schema metadata. Not used at the moment. Kept in the schema for compatibility with OL. Default value: https://raw.githubusercontent.com/OpenLineage/OpenLineage/main/spec/OpenLineage.json#/definitions/BaseFacet | Required
Note: Provide startTime for all types of events.
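As an illustration, the startTime facet for a run that started five minutes before its finish event can be built as follows. This is a minimal sketch; `start_time_facet` is a hypothetical helper, and the timestamps and run ID are example values.

```python
from datetime import datetime, timedelta, timezone

BASE_FACET_SCHEMA = (
    "https://raw.githubusercontent.com/OpenLineage/OpenLineage/main/spec/"
    "OpenLineage.json#/definitions/BaseFacet"
)


def start_time_facet(started_at: datetime, producer: str) -> dict:
    """Build the startTime facet from a timezone-aware datetime."""
    return {
        "_producer": producer,
        "_schemaURL": BASE_FACET_SCHEMA,
        "startTime": started_at.isoformat(),
    }


finished_at = datetime(2024, 4, 9, 6, 34, 6, 600323, tzinfo=timezone.utc)
started_at = finished_at - timedelta(minutes=5)
run = {
    "runId": "e5648915-6c19-4b1e-8817-5335dabf146d",
    "facets": {"startTime": start_time_facet(started_at, "https://some.producer.com/version/1.0")},
}
```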

Run Facet - errorMessage

Table 8. RunFacet - errorMessage fields and their description
Element | Type | Description | Required
message | string | Used as the error message. Title and exact error type are grouped in the Top Errors widget. | Required
stackTrace | string | Used as the error details. | Not required
programmingLanguage | string | The programming language of the code that returned the error. | Required
_producer | string, URI | Schema metadata. Not used at the moment. Kept in the schema for compatibility with OL. | Required
_schemaURL | string, URI | Schema metadata. Not used at the moment. Kept in the schema for compatibility with OL. Default value: https://github.com/OpenLineage/OpenLineage/blob/main/spec/facets/ErrorMessageRunFacet.json | Required
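When the monitored code is itself Python, the errorMessage facet can be filled from a caught exception. The following is a minimal sketch; `error_message_facet` is a hypothetical helper, and the value "PYTHON" for programmingLanguage is an assumption modeled on the "JAVA" value that the examples in this topic use.

```python
import traceback

ERROR_FACET_SCHEMA = (
    "https://github.com/OpenLineage/OpenLineage/blob/main/spec/facets/ErrorMessageRunFacet.json"
)


def error_message_facet(exc: BaseException, producer: str) -> dict:
    """Build the errorMessage facet from a caught exception."""
    return {
        "_producer": producer,
        "_schemaURL": ERROR_FACET_SCHEMA,
        "message": f"{type(exc).__name__}: {exc}",
        # The formatted traceback becomes the optional error details.
        "stackTrace": "".join(traceback.format_exception(type(exc), exc, exc.__traceback__)),
        "programmingLanguage": "PYTHON",  # assumed value, see the lead-in
    }


try:
    raise ValueError("bad input")
except ValueError as err:
    facet = error_message_facet(err, "https://some.producer.com/version/1.0")
```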

Run Facet - tags

Table 9. RunFacet - tags fields and their description
Element | Type | Description | Required
projectName | string | The project name of the run. If you don't provide it, the value is "default". | Not required
runName | string | The name of the run. If you don't provide it, runName is {job.name}_{nominaltime}. | Not required
_schemaURL | string, URI | Schema metadata. Not used at the moment. Kept in the schema for compatibility with OL. Default value: https://raw.githubusercontent.com/OpenLineage/OpenLineage/main/spec/OpenLineage.json#/definitions/BaseFacet | Required
_producer | string, URI | Schema metadata. Not used at the moment. Kept in the schema for compatibility with OL. | Required

Task payload

OpenLineage doesn't differentiate between runs and tasks, so the element definitions in the OpenLineage documentation are the same for both of them. Important differences are provided in the description of the relevant fields.

Note: The task payload is mapped as an OL RunEvent, with ParentFacet pointing to the parent run or another task. For a single API call that contains the current_state of the task, a single event per task (in a run) is expected.

TaskEvent

Table 10. TaskEvent fields and their description
Schema field | Type | Description | Required
eventType | string | Values: START, COMPLETE, FAIL, ABORT, RUNNING, OTHER. Providing both a start and a finish state event is not mandatory, but if you provide only a finish event, you need to provide the startTime facet. If you don't provide the eventType, it gets the OTHER value by default. | Not required
eventTime | string | The ISO date and time when the event occurred. Used as start_time or end_time, depending on the eventType. An example of a value: 2024-04-09T06:34:06.600316Z. | Required
job | object | See: Job for nested elements definition. | Required
inputs | array | Used to define the upstream relationship between the tasks. See: Inputs. | Not required
outputs | array | Not used at the moment. Kept in the schema for compatibility with OL. | Not required
run | object | See: Run object for nested elements definition. | Required
producer | string | URI identifying the producer of this metadata. Not used at the moment. Kept in the schema for compatibility with OL. | Required
schemaURL | string | Not used at the moment. Kept in the schema for compatibility with OL. Default value: https://openlineage.io/spec/1-0-5/OpenLineage.json#/definitions/RunEvent | Required

Inputs

Table 11. Inputs object fields and their description
Element | Type | Description | Required
name | string | The name of the task that will be the upstream to the task that declares the input. | Required
namespace | string | The namespace of the task that will be the upstream to the task that declares the input. | Required
facets | dict [str, BaseDatasetFacet] | Not used at the moment. | Not required

Run - task payload

Table 12. Run object fields and their description
Element | Type | Description | Required
runId | string | The globally unique ID of the run associated with the job. Used as uid. See: OL - Run naming. An example of a value: 2c6a61ad-b23f-484f-978e-0eac96a983d6. | Required
facets | object | See: Task Facet object for nested elements definition. | Not required

Task Facet

Table 13. TaskFacet fields and their description
Schema field | Type | Description | Required
nominalTime | object | The nominalTime facet describes the nominal start of the task. Used as execution_date. See: OL - nominalTime facet. See: Run Facet - nominalTime for nested elements definition. | Not required
log | object | Log reporting. At least one of logBody or logUrl is required. See: Run Facet - log for nested elements definition. | Not required
startTime | object | See: Run Facet - startTime for nested elements definition. | Not required
errorMessage | object | Used for error reporting. See: OL - errorMessage facet. See: Run Facet - errorMessage for nested elements definition. | Not required
tags | object | See: Run Facet - tags for nested elements definition. | Not required
parent | object | See: Task Facet - parent for nested elements definition. | Not required
metrics | list | See: Task Facet - metric for nested elements definition. | Not required

Task Facet - nominalTime

Table 14. TaskFacet - nominalTime fields and their description
Schema field | Type | Description | Required
nominalStartTime | string | Used as execution_date, the reported time in the UI. An example of a value: 2024-04-09T06:34:06.600299Z. | Required
nominalEndTime | string | The nominal end time of the task. An example of a value: 2024-04-09T06:34:06.600299Z. | Not required
_producer | string, URI | Schema metadata. Not used at the moment. Kept in the schema for compatibility with OL. | Required
_schemaURL | string, URI | Schema metadata. Not used at the moment. Kept in the schema for compatibility with OL. Default value: https://github.com/OpenLineage/OpenLineage/blob/main/spec/facets/SQLJobFacet.json | Required

Task Facet - parent

Table 15. TaskFacet - parent fields and their description
Schema field | Type | Description | Required
job | object | See: Task Facet - parent - job. | Required
run | object | See: Task Facet - parent - run for nested elements definition. | Required
_producer | string, URI | The URI identifying the producer of this metadata. Not used at the moment. Kept in the schema for compatibility with OL. | Required
_schemaURL | string, URI | The JSON Pointer URL to the corresponding version of the schema definition for this facet. Not used at the moment. Kept in the schema for compatibility with OL. Default value: https://raw.githubusercontent.com/OpenLineage/OpenLineage/main/spec/OpenLineage.json#/definitions/ParentRunFacet | Required

Task Facet - metric

Table 16. TaskFacet - metric elements and their description
Element | Type | Description | Required
metricName | string | The name of the metric. | Required
timestamp | datetime | The date and time when tracking for this metric begins. An example of a value: 2024-04-09T06:34:06.600299Z. | Required
metricValue | any | The characteristic that is tracked. | Required
source | string | The creator of the metric. Either user or system; user is the default. | Required
_producer | string, URI | The URI identifying the producer of this metadata. Not used at the moment. Kept in the schema for compatibility with OL. | Required
_schemaURL | string, URI | The JSON Pointer URL to the corresponding version of the schema definition for this facet. Not used at the moment. Kept in the schema for compatibility with OL. Default value: https://raw.githubusercontent.com/OpenLineage/OpenLineage/main/spec/OpenLineage.json#/definitions/BaseFacet | Required
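A single entry of the metrics list can be built with a small helper that also rejects an unknown source. The following is a minimal sketch; `metric` is a hypothetical helper, and the timestamp is sent as an ISO string on the assumption that the payload is serialized to JSON.

```python
from datetime import datetime, timezone

BASE_FACET_SCHEMA = (
    "https://raw.githubusercontent.com/OpenLineage/OpenLineage/main/spec/"
    "OpenLineage.json#/definitions/BaseFacet"
)
ALLOWED_SOURCES = ("user", "system")


def metric(name, value, producer, source="user", timestamp=None):
    """Build one entry of the metrics list; the value can be of any JSON type."""
    if source not in ALLOWED_SOURCES:
        raise ValueError(f"source must be one of {ALLOWED_SOURCES}, got {source!r}")
    when = timestamp or datetime.now(timezone.utc)
    return {
        "metricName": name,
        "timestamp": when.isoformat(),
        "metricValue": value,
        "source": source,
        "_producer": producer,
        "_schemaURL": BASE_FACET_SCHEMA,
    }


metrics = [
    metric("metric1", 100, "https://some.producer.com/version/1.0"),
    metric("metric2", 99.99, "https://some.producer.com/version/1.0", source="system"),
]
```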

Task Facet - parent - job

For more information, go to Job.

Task Facet - parent - run

Table 17. TaskFacet - parent - run fields and their description
Schema field | Type | Description | Required
runId | string | Actual Parent Run UID. An example of a value: e5648915-6c19-4b1e-8817-5335dabf146d. | Required

Examples

Look at the following examples of a payload:

A payload - one run and one task
The following code block presents an example of a payload with one run and one task.
Note: The following is Python code. Run it to evaluate the timestamps and IDs before you send the resulting payload in the request.
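import uuid
from datetime import datetime, timedelta

import pytz

run_uid = str(uuid.uuid4())  # ID of the run; the task's parent facet refers to it

payload = [
  {
    "eventType": "FAIL",
    "eventTime": datetime.utcnow().replace(tzinfo=pytz.utc).isoformat(),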
    "inputs": [],
    "job": {
      "facets": {},
      "namespace": "airflow-prod",
      "name": "my_dag"
    },
    "outputs": [],
    "run": {
      "facets": {
        "nominalTime": {
          "_producer": "https://some.producer.com/version/1.0",
          "_schemaURL": "https://github.com/OpenLineage/OpenLineage/blob/main/spec/facets/NominalTimeRunFacet.json",
          "nominalStartTime": datetime.utcnow().replace(tzinfo=pytz.utc).isoformat()
        },
        "log": {
          "_producer": "https://some.producer.com/version/1.0",
          "_schemaURL": "https://raw.githubusercontent.com/OpenLineage/OpenLineage/main/spec/OpenLineage.json#/definitions/BaseFacet",
          "logBody": "very helpful log.. and very long",
          "logUrl": "https://bucket.s3.somewhere.com/.../file.log"
        },
        "startTime": {
          "_producer": "https://some.producer.com/version/1.0",
          "_schemaURL": "https://raw.githubusercontent.com/OpenLineage/OpenLineage/main/spec/OpenLineage.json#/definitions/BaseFacet",
          "startTime": (datetime.utcnow().replace(tzinfo=pytz.utc) - timedelta(minutes=5)).isoformat()
        },
        "errorMessage": {
          "_producer": "https://some.producer.com/version/1.0",
          "_schemaURL": "https://github.com/OpenLineage/OpenLineage/blob/main/spec/facets/ErrorMessageRunFacet.json",
          "message": "org.apache.spark.sql.AnalysisException: Table or view not found: wrong_table_name; line 1 pos 14",
          "programmingLanguage": "JAVA",
          "stackTrace": 'Exception in thread "main" java.lang.RuntimeException: A test exception\nat io.openlineage.SomeClass.method(SomeClass.java:13)\nat io.openlineage.SomeClass.anotherMethod(SomeClass.java:9)'
        },
        "tags": {
          "projectName": "test_project",
          "runName": "test_run_name",
          "_schemaURL": "https://raw.githubusercontent.com/OpenLineage/OpenLineage/main/spec/OpenLineage.json#/definitions/BaseFacet",
          "_producer": "https://some.producer.com/version/1.0"
        }
      },
      "runId": run_uid
    },
    "producer": "https://custom.api",
    "schemaURL": "https://openlineage.io/spec/1-0-5/OpenLineage.json#/definitions/RunEvent"
  },
  {
    "eventTime": datetime.utcnow().replace(tzinfo=pytz.utc).isoformat(),
    "eventType": "FAIL",
    "job": {
      "facets": {},
      "namespace": "airflow-prod",
      "name": "my_dag.failing_task_with_log"
    },
    "run": {
      "facets": {
        "nominalTime": {
          "_producer": "https://some.producer.com/version/1.0",
          "_schemaURL": "https://github.com/OpenLineage/OpenLineage/blob/main/spec/facets/SQLJobFacet.json",
          "nominalStartTime": datetime.utcnow().replace(tzinfo=pytz.utc).isoformat()
        },
        "log": {
          "_producer": "https://some.producer.com/version/1.0",
          "_schemaURL": "https://raw.githubusercontent.com/OpenLineage/OpenLineage/main/spec/OpenLineage.json#/definitions/BaseFacet",
          "logBody": "very helpful log.. and very long",
          "logUrl": "https://bucket.s3.somewhere.com/.../file.log"
        },
        "startTime": {
          "_producer": "https://some.producer.com/version/1.0",
          "_schemaURL": "https://raw.githubusercontent.com/OpenLineage/OpenLineage/main/spec/OpenLineage.json#/definitions/BaseFacet",
          "startTime": (datetime.utcnow().replace(tzinfo=pytz.utc) - timedelta(minutes=5)).isoformat()
        },
        "errorMessage": {
          "_producer": "https://some.producer.com/version/1.0",
          "_schemaURL": "https://github.com/OpenLineage/OpenLineage/blob/main/spec/facets/ErrorMessageRunFacet.json",
          "message": "org.apache.spark.sql.AnalysisException: Table or view not found: wrong_table_name; line 1 pos 14",
          "programmingLanguage": "JAVA",
          "stackTrace": 'Exception in thread "main" java.lang.RuntimeException: A test exception\nat io.openlineage.SomeClass.method(SomeClass.java:13)\nat io.openlineage.SomeClass.anotherMethod(SomeClass.java:9)'
        },
        "parent": {
          "_producer": "https://some.producer.com/version/1.0",
          "_schemaURL": "https://raw.githubusercontent.com/OpenLineage/OpenLineage/main/spec/OpenLineage.json#/definitions/ParentRunFacet",
          "job": {
            "name": "my_dag",
            "namespace": "airflow-prod"
          },
          "run": {
            "runId": run_uid
          }
        },
        "metrics": [
          {
            "metricName": "metric1",
            "timestamp": datetime.utcnow().replace(tzinfo=pytz.utc).isoformat(),
            "metricValue": 100,
            "source": "user",
            "_producer": "https://some.producer.com/version/1.0",
            "_schemaURL": "https://raw.githubusercontent.com/OpenLineage/OpenLineage/main/spec/OpenLineage.json#/definitions/BaseFacet"
          },
          {
            "metricName": "metric2",
            "timestamp": datetime.utcnow().replace(tzinfo=pytz.utc).isoformat(),
            "metricValue": 99.99,
            "source": "system",
            "_producer": "https://some.producer.com/version/1.0",
            "_schemaURL": "https://raw.githubusercontent.com/OpenLineage/OpenLineage/main/spec/OpenLineage.json#/definitions/BaseFacet"
          },
          {
            "metricName": "metric3",
            "timestamp": datetime.utcnow().replace(tzinfo=pytz.utc).isoformat(),
            "metricValue": "string_value",
            "source": "user",
            "_producer": "https://some.producer.com/version/1.0",
            "_schemaURL": "https://raw.githubusercontent.com/OpenLineage/OpenLineage/main/spec/OpenLineage.json#/definitions/BaseFacet"
          },
          {
            "metricName": "metric4",
            "timestamp": datetime.utcnow().replace(tzinfo=pytz.utc).isoformat(),
            "metricValue": {
              "key": "value"
            },
            "source": "user",
            "_producer": "https://some.producer.com/version/1.0",
            "_schemaURL": "https://raw.githubusercontent.com/OpenLineage/OpenLineage/main/spec/OpenLineage.json#/definitions/BaseFacet"
          }
        ]
      },
      "runId": str(uuid.uuid4())
    },
    "producer": "https://custom.api",
    "schemaURL": "https://openlineage.io/spec/1-0-5/OpenLineage.json#/definitions/RunEvent"
  }
]
A payload - child and upstream relationship

The following code block presents an example of a payload with one run and two tasks, where task my_dag.passing_task_child is upstream of task my_dag.failing_task_with_log and my_dag.passing_task_child is a child of my_dag.passing_task.

Note: The following is Python code. Run it to evaluate the timestamps and IDs before you send the resulting payload in the request.
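import uuid
from datetime import datetime, timedelta, timezone

NOW = datetime.now(timezone.utc)  # assumed definition; NOW is not defined in the original snippet
parent_run_uid = str(uuid.uuid4())  # str() keeps the value JSON-serializable
run_uid = str(uuid.uuid4())
simple_payload = [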
    {
        "eventTime": NOW.isoformat(),
        "eventType": "COMPLETE",
        "job": {
            "facets": {},
            "name": "my_dag.passing_task",
            "namespace": "airflow-prod"
        },
        "producer": "custom_api",
        "run": {
            "facets": {
                "nominalTime": {
                    "_producer": "https://some.producer.com/version/1.0",
                    "_schemaURL": "https://github.com/OpenLineage/OpenLineage/blob/main/spec/facets/SQLJobFacet.json",
                    "nominalStartTime": (NOW - timedelta(minutes=5)).isoformat(),
                },
                "parent": {
                    "_producer": "https://github.com/OpenLineage/OpenLineage/tree/1.11.3/client/python",
                    "_schemaURL": "https://raw.githubusercontent.com/OpenLineage/OpenLineage/main/spec/OpenLineage.json#/definitions/ParentRunFacet",
                    "job": {
                        "name": "my_dag",
                        "namespace": "airflow-prod"
                    },
                    "run": {
                        "runId": parent_run_uid
                    }
                },
                "errorMessage": {
                    "_producer": "https://some.producer.com/version/1.0",
                    "_schemaURL": "https://github.com/OpenLineage/OpenLineage/blob/main/spec/facets/ErrorMessageRunFacet.json",
                    "message": "org.apache.spark.sql.AnalysisException: Table or view not found: wrong_table_name; line 1 pos 14",
                    "programmingLanguage": "JAVA",
                    "stackTrace": "Exception in thread \"main\" java.lang.RuntimeException: A test exception\nat io.openlineage.SomeClass.method(SomeClass.java:13)\nat io.openlineage.SomeClass.anotherMethod(SomeClass.java:9)"
                },
                "startTime": {
                    "startTime": (NOW - timedelta(minutes=5)).isoformat(),
                    "_producer": "https://some.producer.com/version/1.0",
                    "_schemaURL": "https://github.com/OpenLineage/OpenLineage/blob/main/spec/facets/SQLJobFacet.json"
                }
            },
            "runId": run_uid
        },
        "schemaURL": "https://openlineage.io/spec/1-0-5/OpenLineage.json#/definitions/RunEvent"
    },
    {
        "eventTime": NOW.isoformat(),
        "eventType": "COMPLETE",
        "job": {
            "facets": {},
            "name": "my_dag.passing_task_child",
            "namespace": "airflow-prod"
        },
        "producer": "custom_api",
        "run": {
            "facets": {
                "nominalTime": {
                    "_producer": "https://some.producer.com/version/1.0",
                    "_schemaURL": "https://github.com/OpenLineage/OpenLineage/blob/main/spec/facets/SQLJobFacet.json",
                    "nominalStartTime": (NOW - timedelta(minutes=5)).isoformat(),
                },
                "parent": {
                    "_producer": "https://github.com/OpenLineage/OpenLineage/tree/1.11.3/client/python",
                    "_schemaURL": "https://raw.githubusercontent.com/OpenLineage/OpenLineage/main/spec/OpenLineage.json#/definitions/ParentRunFacet",
                    "job": {
                        "name": "my_dag.passing_task",
                        "namespace": "airflow-prod"
                    },
                    "run": {
                        "runId": run_uid
                    }
                },
                "startTime": {
                    "startTime": (NOW - timedelta(minutes=5)).isoformat(),
                    "_producer": "https://some.producer.com/version/1.0",
                    "_schemaURL": "https://github.com/OpenLineage/OpenLineage/blob/main/spec/facets/SQLJobFacet.json"
                }
            },
            "runId": str(uuid.uuid4())
        },
        "schemaURL": "https://openlineage.io/spec/1-0-5/OpenLineage.json#/definitions/RunEvent"
    },
    {
        "eventTime": NOW.isoformat(),
        "eventType": "FAIL",
        "job": {
            "facets": {},
            "name": "my_dag.failing_task_with_log",
            "namespace": "airflow-prod"
        },
        "producer": "custom_api",
        "inputs": [{
                "facets": {},
                "name": "my_dag.passing_task",
                "namespace": "airflow-prod"
            }
        ],
        "run": {
            "facets": {
                "nominalTime": {
                    "_producer": "https://some.producer.com/version/1.0",
                    "_schemaURL": "https://github.com/OpenLineage/OpenLineage/blob/main/spec/facets/SQLJobFacet.json",
                    "nominalStartTime": (NOW - timedelta(minutes=5)).isoformat(),
                },
                "log": {
                    "_producer": "https://github.com/OpenLineage/OpenLineage/tree/1.11.3/client/python",
                    "_schemaURL": "https://raw.githubusercontent.com/OpenLineage/OpenLineage/main/spec/OpenLineage.json#/definitions/BaseFacet",
                    "logBody": "very helpful log.. and very long",
                    "logUrl": "https://bucket.s3.somewhere.com/.../file.log"
                },
                "parent": {
                    "_producer": "https://github.com/OpenLineage/OpenLineage/tree/1.11.3/client/python",
                    "_schemaURL": "https://raw.githubusercontent.com/OpenLineage/OpenLineage/main/spec/OpenLineage.json#/definitions/ParentRunFacet",
                    "job": {
                        "name": "my_dag",
                        "namespace": "airflow-prod"
                    },
                    "run": {
                        "runId": parent_run_uid
                    }
                },
                "startTime": {
                    "startTime": (NOW - timedelta(minutes=5)).isoformat(),
                    "_producer": "https://some.producer.com/version/1.0",
                    "_schemaURL": "https://github.com/OpenLineage/OpenLineage/blob/main/spec/facets/SQLJobFacet.json"
                }
            },
            "runId": str(uuid.uuid4())
        },
        "schemaURL": "https://openlineage.io/spec/1-0-5/OpenLineage.json#/definitions/RunEvent"
    },
    {
        "eventTime": NOW.isoformat(),
        "eventType": "FAIL",
        "inputs": [],
        "job": {
            "facets": {},
            "name": "my_dag",
            "namespace": "airflow-prod"
        },
        "outputs": [],
        "producer": "custom_api",
        "run": {
            "facets": {
                "nominalTime": {
                    "_producer": "https://some.producer.com/version/1.0",
                    "_schemaURL": "https://github.com/OpenLineage/OpenLineage/blob/main/spec/facets/SQLJobFacet.json",
                    "nominalStartTime": (NOW - timedelta(minutes=5)).isoformat(),
                },
                "log": {
                    "_producer": "https://github.com/OpenLineage/OpenLineage/tree/1.11.3/client/python",
                    "_schemaURL": "https://raw.githubusercontent.com/OpenLineage/OpenLineage/main/spec/OpenLineage.json#/definitions/BaseFacet",
                    "logBody": "some less helpful log"
                },
                "startTime": {
                    "startTime": (NOW - timedelta(minutes=5)).isoformat(),
                    "_producer": "https://some.producer.com/version/1.0",
                    "_schemaURL": "https://github.com/OpenLineage/OpenLineage/blob/main/spec/facets/SQLJobFacet.json"
                },
                "tags": {
                    "projectName": "test_project"
                }
            },
            "runId": parent_run_uid
        },
        "schemaURL": "https://openlineage.io/spec/1-0-5/OpenLineage.json#/definitions/RunEvent"
    }
]
A payload with one run and two tasks that depend on one another
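
The payload above contains a root run (my_dag, whose run has no parent facet) and task runs that each reference a parent. Because Databand reports data only for a payload that includes a root run, a local check before sending can catch a missing root early. The following is a minimal sketch of such a check. The function name find_root_run_ids and the trimmed-down sample events are illustrative assumptions and are not part of the Databand API.

```python
def find_root_run_ids(events):
    """Return the runId of every event whose run has no "parent" facet."""
    roots = []
    for event in events:
        run = event.get("run", {})
        if "parent" not in run.get("facets", {}):
            roots.append(run.get("runId"))
    return roots


# Trimmed-down, hypothetical events that mirror the structure of the payload above.
sample_events = [
    {
        "job": {"name": "my_dag.passing_task", "namespace": "airflow-prod"},
        "run": {
            "facets": {
                "parent": {
                    "job": {"name": "my_dag", "namespace": "airflow-prod"},
                    "run": {"runId": "parent-run-id"},
                }
            },
            "runId": "task-run-id",
        },
    },
    {
        "job": {"name": "my_dag", "namespace": "airflow-prod"},
        "run": {"facets": {}, "runId": "parent-run-id"},
    },
]

root_ids = find_root_run_ids(sample_events)
if not root_ids:
    # Without a root run, the payload would not be reported.
    raise ValueError("Payload has no root run")
```

In this sketch, only the my_dag event counts as a root because its facets contain no parent entry. The task event is skipped because it points to my_dag as its parent.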

Response

After you send the payload to the endpoint with a POST request, the endpoint returns either a success response or an error response.

  • Success response
    • If the request completes successfully, the endpoint returns status code 200 and the following response body:
    • {'success': True}
  • Error response
    • If the request fails, the endpoint returns an error status code and a response body in the following format:
    • {
          'error': '<short error message>',
          'message': '<detailed error description>',
          'traceback': '<traceback>'
      }
    • Examples:
      {'error': 'Bad Request', 'message': "Invalid data. details: <>", 'traceback': '<traceback>'}
      {'error': 'Internal Server Error', 'message': "<depends on the error>", 'traceback': '<traceback>'}
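
A calling script usually needs to tell the two response shapes apart so that it can log failures. The following is a minimal sketch of how that might look. It assumes that the response body was already decoded into a Python dict. The function name summarize_response is hypothetical, and the 400 status code in the usage lines is an assumption based on the standard HTTP meaning of "Bad Request", not a value confirmed by this page.

```python
def summarize_response(status_code, body):
    """Return (ok, text) for a decoded response body.

    ok is True only for status code 200 with {'success': True}.
    """
    if status_code == 200 and body.get("success") is True:
        return True, "payload accepted"
    # Error bodies carry 'error', 'message', and 'traceback' keys.
    error = body.get("error", "Unknown error")
    message = body.get("message", "")
    return False, f"{status_code} {error}: {message}"


# Usage with the documented response shapes (the status code 400 is assumed).
ok_result = summarize_response(200, {"success": True})
bad_result = summarize_response(
    400,
    {
        "error": "Bad Request",
        "message": "Invalid data. details: <>",
        "traceback": "<traceback>",
    },
)
```

The traceback is left out of the summary text on purpose because it can be long. A real script might write it to a separate log instead.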