Custom integration API schema
This schema details the format that API request payloads take.
When you create the integration in the Databand UI:
- Write the code to connect to your system and run it to pull the data from your system.
- Convert the received data to a request payload that conforms to the Request structure and the Databand schema.
- Send it to the API endpoint that was provided in the last step of the integration.
- Run the code periodically to send the data. Go to this example to see a Python script that performs the actions from steps 1-3. In this example, the code runs every 120 seconds.
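The steps above can be sketched as a small polling loop. This is a minimal illustration, not the script referenced in the list: `pull`, `convert`, and `send` are hypothetical callables that you would supply, and `max_cycles` and `sleep` exist only so that the loop can be stopped and tested.

```python
import time
from typing import Callable, List, Optional


def run_integration(
    pull: Callable[[], list],
    convert: Callable[[list], List[dict]],
    send: Callable[[List[dict]], None],
    interval_seconds: float = 120,
    max_cycles: Optional[int] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Pull, convert, and send data every `interval_seconds`.

    All three callables are assumptions made for this sketch:
    pull() reads records from your system, convert() turns them into
    a list of event dictionaries, and send() posts that list to the endpoint.
    Returns the number of completed cycles.
    """
    cycles = 0
    while max_cycles is None or cycles < max_cycles:
        records = pull()            # step 1: read from your system
        payload = convert(records)  # step 2: build the request payload
        if payload:
            send(payload)           # step 3: send it to the endpoint
        cycles += 1
        # Wait before the next cycle, but not after the final one.
        if max_cycles is None or cycles < max_cycles:
            sleep(interval_seconds)
    return cycles
```

Calling `run_integration(pull, convert, send)` with the default arguments repeats the cycle every 120 seconds until the process is stopped.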
Databand schema
The Databand schema is based on the open source OpenLineage (OL) schema, with the following differences:
- Databand accepts a single payload with all the events, whereas OL accepts only a single event per API call.
- Databand extensions to the OL schema include the log and startTime fields.
- Databand tasks and runs are referred to as "runs" in OL, and are reported by runEvent in OL.
- The data is reported only for the payload with a root run (a run without a parent).
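The root-run rule can be checked before a payload is sent. The following sketch assumes that the parent facet sits under `run.facets.parent`, as in the request structure on this page; `is_root_run` and `bundle_events` are hypothetical helper names, not part of any Databand library.

```python
from typing import List


def is_root_run(event: dict) -> bool:
    """Return True when the event has no parent facet (assumed at run.facets.parent)."""
    facets = event.get("run", {}).get("facets", {})
    return "parent" not in facets


def bundle_events(events: List[dict]) -> List[dict]:
    """Collect all events into one payload and verify that it contains a root run."""
    if not any(is_root_run(event) for event in events):
        raise ValueError("payload has no root run (an event without a parent facet)")
    return list(events)
```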
Relationships between the objects in Databand and OpenLineage
To understand different types of relationships in Databand, have a look at the following diagram
and its description:
In this case:
- Parent Task - is parent for both Sub Task and Upstream Sub Task.
- Upstream Sub Task - is upstream for Sub Task (Upstream Sub Task follows Sub Task).
How would it translate to the OL schema?
- In both Sub Task and Upstream Sub Task, there will be a parent facet that references Parent Task as their parent.
- In Upstream Sub Task only, there will be an input that mentions Sub Task and states that Upstream Sub Task is its upstream task.
[
{
"eventTime": "<current time>",
"eventType": "COMPLETE",
"job": {
"facets": {},
"name": "Parent Task",
"namespace": "project_name"
},
"runId": "<UUID(Parent Task)>"
},
{
"eventTime": "<current time>",
"eventType": "COMPLETE",
"job": {
"facets": {},
"name": "Sub Task",
"namespace": "project_name"
},
"facets": {
"parent": { # reference to Parent Task as a parent task
"_producer": "https://github.com/OpenLineage/OpenLineage/tree/1.11.3/client/python",
"_schemaURL": "https://raw.githubusercontent.com/OpenLineage/OpenLineage/main/spec/OpenLineage.json#/definitions/ParentRunFacet",
"run": {
"runId": "<UUID(Parent Task)>"
}
}
},
"runId": "<UUID(Sub Task)>"
},
{
"eventTime": "<current time>",
"eventType": "COMPLETE",
"job": {
"facets": {},
"name": "Upstream Sub Task",
"namespace": "project_name"
},
"inputs": [
{ # this references Sub Task to indicate Upstream Sub Task as its upstream
"facets": {},
"name": "Sub Task",
"namespace": "project_name"
}
],
"facets": {
"parent": { # reference to Parent Task as a parent task
"_producer": "https://github.com/OpenLineage/OpenLineage/tree/1.11.3/client/python",
"_schemaURL": "https://raw.githubusercontent.com/OpenLineage/OpenLineage/main/spec/OpenLineage.json#/definitions/ParentRunFacet",
"run": {
"runId": "<UUID(Parent Task)>"
}
}
},
"runId": "<UUID(Upstream Sub Task)>"
}
]
Endpoint
Send the request payload to the endpoint URL that was generated in the last step of the integration in the Databand UI.
The endpoint is in the following format: HTTPS://your-Databand-hostname/api/v1/tracking/open-lineage/<tracking_source_uid>/events/bulk
Method
To send the payload, use the POST method.
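As an illustration, the request can be assembled with the Python standard library. This is a sketch under assumptions: the `Content-Type: application/json` header and the helper name `build_bulk_request` are not taken from this page, and any authentication headers that your deployment requires are not shown.

```python
import json
import urllib.request
from typing import List


def build_bulk_request(
    hostname: str, tracking_source_uid: str, events: List[dict]
) -> urllib.request.Request:
    """Build a POST request for the bulk events endpoint without sending it."""
    url = (
        f"https://{hostname}/api/v1/tracking/open-lineage/"
        f"{tracking_source_uid}/events/bulk"
    )
    body = json.dumps(events).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        # Assumption: the endpoint expects a JSON body.
        headers={"Content-Type": "application/json"},
    )
```

The request can then be sent with `urllib.request.urlopen(request)`. Copying the endpoint URL from the Databand UI is safer than assembling it by hand.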
Request structure
The custom integration payload consists of a series of Run Events, which contain information
about the execution of runs and tasks. Each event consists of basic Event information and
Facets, which extend the Event with other important information, such as
tags, log, and errorMessage.
For a better understanding of how the payload is created, look at the following schema. It shows
the structure of the message that is used in the POST request.
The schema includes two payloads: a run payload and a task payload.
What are runs and tasks?
- Run: A run is an individual execution of your pipeline. Pipelines in Databand consist of a collection of runs over time. You can view all historical runs that have been logged for a particular pipeline by going to its Runs page in the Databand UI.
- Task: A task is a logical step in your pipeline. Most pipelines consist of a collection of tasks that are executed in a specific sequence. In Databand, tasks provide a way to review metrics from and measure the performance of your pipeline at a finer grain. As a result, you can debug issues more precisely and identify bottlenecks in your pipeline’s performance.
For more information about runs and tasks, go to Tasks and runs
uid in Databand and OpenLineage
Because several fields of the payload consist of identical child objects (for example, the
errorMessage facet), they are described only once.
Because the Databand schema is based on the OL schema, some of the fields are OL standard and optional. Including them in the payload is still advisable.
[
{
"eventType": ...,
"eventTime": ...,
"inputs": [],
"job": {
// Job object
"facets": {},
"namespace": ...,
"name": ...
},
"outputs": [],
"run": {
// Run object
"runId": ...,
"facets": {
// Run Facets object
"nominalTime": {
// Run Facet - nominalTime object
"_producer": ...,
"_schemaURL": ...,
"nominalStartTime": ...,
"nominalEndTime": ...
},
"log": {
// Run Facet - log object
"_producer": ...,
"_schemaURL": ...,
"logBody": ...,
"logUrl": ...
},
"startTime": {
// Run Facet - startTime object
"_producer": ...,
"_schemaURL": ...,
"startTime": ...
},
"errorMessage": {
// Run Facet - errorMessage object
"_producer": ...,
"_schemaURL": ...,
"message": ...,
"stackTrace": ...,
"programmingLanguage": ...
},
"tags": {
// Run Facet - tags object
"projectName": ...,
"runName": ...,
"_schemaURL": ...,
"_producer": ...
}
}
},
"producer": ...,
"schemaURL": ...
},
{
"eventTime": ...,
"eventType": ...,
"job": {
// Job object
"facets": {},
"namespace": ...,
"name": ...
},
"inputs": [
// Inputs object
{
"facets": {},
"name": ...,
"namespace": ...
}
],
"run": {
// Run object
"runId": ...,
"facets": {
// Task Facets object
"nominalTime": {
// Task Facet - nominalTime object
"_producer": ...,
"_schemaURL": ...,
"nominalStartTime": ...,
"nominalEndTime": ...
},
"log": {
// Run Facet - log object
"_producer": ...,
"_schemaURL": ...,
"logBody": ...,
"logUrl": ...
},
"startTime": {
// Run Facet - startTime object
"_producer": ...,
"_schemaURL": ...,
"startTime": ...
},
"errorMessage": {
// Run Facet - errorMessage object
"_producer": ...,
"_schemaURL": ...,
"message": ...,
"stackTrace": ...,
"programmingLanguage": ...
},
"parent": {
// Task Facet - parent object
"_producer": ...,
"_schemaURL": ...,
"job": {
// Task Facet - Parent - job object
"name": ...,
"namespace": ...
},
"run": {
// Task Facet - Parent - run object
"runId": ...
}
},
"metrics": [
// Task Facet - metrics
{
"metricName": ...,
"timestamp": ...,
"metricValue": ...,
"source": ...,
"_producer": ...,
"_schemaURL": ...
},
...
],
}
},
"producer": ...,
"schemaURL": ...
}
]
For the sake of clarity, the fields are described for the run and task payloads separately.
Run payload
current_state of the run, a single event per run is expected.
RunEvent
| Element | Type | Description | Required |
|---|---|---|---|
| eventType | string | Values: startTime facet. If you don't provide the eventType, it gets the OTHER value by default. | |
| eventTime | string | The ISO date and time when the event occurred. Used as start_time or end_time depending on the eventType. An example of a value: 2024-04-09T06:34:06.600323Z. | |
| job | object | See: Job object for nested elements definition. | |
| run | object | See: Run object for nested elements definition. | |
| inputs | array | Not used at the moment; kept in the schema for compatibility with OL. | |
| outputs | array | Not used at the moment; kept in the schema for compatibility with OL. | |
| producer | string | URI identifying the producer of this metadata (for example, a Git URL with a given tag or SHA). Not used at the moment; kept in the schema for compatibility with OL. | |
| schemaURL | string | Default value: https://openlineage.io/spec/1-0-5/OpenLineage.json#/definitions/RunEvent. The JSON pointer URL to the corresponding version of the schema definition for this RunEvent. Not used at the moment; kept in the schema for compatibility with OL. | |
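A run event with the elements from this table could be assembled as follows. This is a sketch only: `build_run_event` is a hypothetical helper, the `producer` value is borrowed from the examples later on this page, and the timestamp uses Python's `isoformat()`, which writes the UTC offset as `+00:00` rather than `Z`.

```python
import uuid
from datetime import datetime, timezone
from typing import Optional


def build_run_event(
    job_name: str,
    namespace: str,
    event_type: str,
    run_id: Optional[str] = None,
    event_time: Optional[datetime] = None,
    facets: Optional[dict] = None,
) -> dict:
    """Assemble one run event; a new runId is generated when none is passed."""
    event_time = event_time or datetime.now(timezone.utc)
    return {
        "eventType": event_type,
        "eventTime": event_time.isoformat(),
        "inputs": [],   # not used for runs, kept for OL compatibility
        "job": {"facets": {}, "namespace": namespace, "name": job_name},
        "outputs": [],  # not used, kept for OL compatibility
        "run": {"runId": run_id or str(uuid.uuid4()), "facets": facets or {}},
        "producer": "https://custom.api",
        "schemaURL": "https://openlineage.io/spec/1-0-5/OpenLineage.json#/definitions/RunEvent",
    }
```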
Job
Jobs are identified by a unique name within a namespace. The Job
namespace should be derived from the scheduler, for example
airflow-prod. The combined namespace and name of a Job are
usually enough to uniquely identify it within your environment. All tasks and runs for the same run
usually have the same namespace because they come from the same orchestrator.
| Element | Type | Description | Required |
|---|---|---|---|
| name | string | Used as the pipeline name. | |
| namespace | string | Jobs have a name that is unique within their namespace. The namespace is the root of the naming hierarchy and prevents collisions when a job with the same name is sent from different systems (for example, from different Airflow instances). | |
| facets | object | The job facets. Standard extensions: Documentation, JobType, Ownership, SourceCode, or Location. Not used at the moment; kept in the schema for compatibility with OL. | |
Run
| Element | Type | Description | Required |
|---|---|---|---|
| runId | string | See: OL - Run naming. The globally unique ID of the run associated with the job. Used as uid. An example of a value: e5648915-6c19-4b1e-8817-5335dabf146d. | |
| facets | object | See: Run facet objects for nested elements definition. | |
Run Facets
| Element | Type | Description | Required |
|---|---|---|---|
| nominalTime | object | See: OL - nominalTime facet. The nominalTime facet describes the nominal start of the run. It is used for execution_date. See: Run Facet - nominalTime for nested elements definition. | |
| log | object | Log reporting. At least one of logBody or logUrl is required. See: Run Facet - log for nested elements definition. | |
| startTime | object | See: Run Facet - startTime for nested elements definition. | |
| errorMessage | object | See: OL - errorMessage facet. Used for error reporting. See: Run Facet - errorMessage for nested elements definition. | |
| tags | object | See: Run Facet - tags for nested elements definition. | |
Run Facet - nominalTime
| Element | Type | Description | Required |
|---|---|---|---|
| nominalStartTime | string | Used as execution_date - the reported time in the UI. An example of a value: 2024-04-09T06:34:06.600210Z. | |
| nominalEndTime | string | The nominal end time of the run. An example of a value: 2024-04-09T06:34:06.600210Z. Not used at the moment; kept in the schema for compatibility with OL. | |
| _producer | string, URI | Schema metadata. Not used at the moment; kept in the schema for compatibility with OL. | |
| _schemaURL | string, URI | Schema metadata. Not used at the moment; kept in the schema for compatibility with OL. Default value: https://github.com/OpenLineage/OpenLineage/blob/main/spec/facets/NominalTimeRunFacet.json | |
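The example values in this table use ISO 8601 timestamps in UTC with microseconds and a trailing `Z`. One way to produce that format in Python is sketched here; `to_iso_utc` and `nominal_time_facet` are hypothetical helpers, and the `_producer` and `_schemaURL` metadata fields are left out for brevity.

```python
from datetime import datetime, timezone
from typing import Optional


def to_iso_utc(moment: datetime) -> str:
    """Format a timezone-aware datetime like 2024-04-09T06:34:06.600210Z."""
    if moment.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.%fZ")


def nominal_time_facet(start: datetime, end: Optional[datetime] = None) -> dict:
    """Build the nominalTime facet; nominalEndTime is added only when given."""
    facet = {"nominalStartTime": to_iso_utc(start)}
    if end is not None:
        facet["nominalEndTime"] = to_iso_utc(end)
    return facet
```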
Run Facet - log
| Element | Type | Description | Required |
|---|---|---|---|
| logBody | string | Log body text - displayed in the UI as Log. | |
| logUrl | string, URI | Remote log URL - displayed in the UI as Log URL. | |
| _producer | string, URI | Schema metadata. Not used at the moment; kept in the schema for compatibility with OL. | |
| _schemaURL | string, URI | Schema metadata. Not used at the moment; kept in the schema for compatibility with OL. Default value: https://raw.githubusercontent.com/OpenLineage/OpenLineage/main/spec/OpenLineage.json#/definitions/BaseFacet | |
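The requirement that a log facet carries at least one of logBody or logUrl can be enforced on the client before sending. A minimal sketch, with `log_facet` as a hypothetical helper and the metadata fields omitted:

```python
from typing import Optional


def log_facet(log_body: Optional[str] = None, log_url: Optional[str] = None) -> dict:
    """Build the log facet, rejecting the case where both fields are missing."""
    if not log_body and not log_url:
        raise ValueError("the log facet needs at least one of logBody or logUrl")
    facet = {}
    if log_body:
        facet["logBody"] = log_body
    if log_url:
        facet["logUrl"] = log_url
    return facet
```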
The logBody and logUrl fields are optional, but if you include the Run
Facet - log, you need to provide at least one of them.
Run Facet - startTime
| Element | Type | Description | Required |
|---|---|---|---|
| startTime | string | Schema metadata. An example of a value: 2024-04-07T06:34:06.600261Z. | |
| _producer | string, URI | Schema metadata. Not used at the moment; kept in the schema for compatibility with OL. | |
| _schemaURL | string, URI | Schema metadata. Not used at the moment; kept in the schema for compatibility with OL. Default value: https://raw.githubusercontent.com/OpenLineage/OpenLineage/main/spec/OpenLineage.json#/definitions/BaseFacet | |
startTime for all types of events.
Run Facet - errorMessage
| Element | Type | Description | Required |
|---|---|---|---|
| message | string | Used as the error message. Title and exact error type are grouped in the Top Errors widget. | |
| stackTrace | string | Used as error details. | |
| programmingLanguage | string | The programming language of the code that returned the error. | |
| _producer | string, URI | Schema metadata. Not used at the moment; kept in the schema for compatibility with OL. | |
| _schemaURL | string, URI | Schema metadata. Not used at the moment; kept in the schema for compatibility with OL. Default value: https://github.com/OpenLineage/OpenLineage/blob/main/spec/facets/ErrorMessageRunFacet.json | |
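When the monitored code is itself Python, the errorMessage facet can be filled from a caught exception. This sketch makes two assumptions: the helper name `error_message_facet` is hypothetical, and the value "PYTHON" for programmingLanguage is a guess modeled on the "JAVA" value used in the examples on this page.

```python
import traceback


def error_message_facet(exc: BaseException, programming_language: str = "PYTHON") -> dict:
    """Turn a caught exception into the message and stackTrace fields."""
    stack = traceback.format_exception(type(exc), exc, exc.__traceback__)
    return {
        "message": f"{type(exc).__name__}: {exc}",
        "stackTrace": "".join(stack),
        "programmingLanguage": programming_language,
    }
```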
Run Facet - tags
| Element | Type | Description | Required |
|---|---|---|---|
| projectName | string | Enables you to provide the project name of the run. If you don't provide it, it is "default". | |
| runName | string | Enables you to provide the name of the run. If you don't provide it, then runName is {job.name}_{nominaltime}. | |
| _schemaURL | string, URI | Schema metadata. Not used at the moment; kept in the schema for compatibility with OL. Default value: https://raw.githubusercontent.com/OpenLineage/OpenLineage/main/spec/OpenLineage.json#/definitions/BaseFacet | |
| _producer | string, URI | Schema metadata. Not used at the moment; kept in the schema for compatibility with OL. | |
Task payload
OpenLineage doesn't differentiate between runs and tasks - element definitions from OpenLineage documentation are the same for both of them. Important differences are provided in the description of the relevant fields.
current_state of the task, a
single event per task (in a run) is expected.
TaskEvent
| Schema field | Type | Description | Required |
|---|---|---|---|
| eventType | string | Values: startTime facet. If you don't provide the eventType, it gets the OTHER value by default. | |
| eventTime | string | The ISO date and time when the event occurred. Used as start_time or end_time depending on the eventType. An example of a value: 2024-04-09T06:34:06.600316Z. | |
| job | object | See: Job for nested elements definition. | |
| inputs | array | Used to define the upstream relationship between the tasks. See: Inputs | |
| outputs | array | Not used at the moment; kept in the schema for compatibility with OL. | |
| run | object | See: Run object for nested elements definition. | |
| producer | string | URI identifying the producer of this metadata. Not used at the moment; kept in the schema for compatibility with OL. | |
| schemaURL | string | Not used at the moment; kept in the schema for compatibility with OL. Default value: https://openlineage.io/spec/1-0-5/OpenLineage.json#/definitions/RunEvent | |
Inputs
| Element | Type | Description | Required |
|---|---|---|---|
| name | string | The name of the task that will be the upstream to the task that declares the input. | |
| namespace | string | The namespace of the task that will be the upstream to the task that declares the input. | |
| facets | dict [str, BaseDatasetFacet] | Not used at the moment. | |
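Declaring an upstream task therefore means adding one entry to the inputs array of the task event that comes after it. A minimal sketch, in which `declare_upstream` is a hypothetical helper and the task names are taken from the examples on this page:

```python
def declare_upstream(task_event: dict, upstream_name: str, upstream_namespace: str) -> dict:
    """Append an inputs entry naming the task that is upstream of task_event."""
    entry = {"facets": {}, "name": upstream_name, "namespace": upstream_namespace}
    task_event.setdefault("inputs", []).append(entry)
    return task_event
```

For example, `declare_upstream(event, "my_dag.passing_task", "airflow-prod")` marks my_dag.passing_task as upstream of the task that `event` describes.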
Run - task payload
| Element | Type | Description | Required |
|---|---|---|---|
| runId | string | See: OL - Run naming. The globally unique ID of the run associated with the job. Used as uid. An example of a value: 2c6a61ad-b23f-484f-978e-0eac96a983d6. | |
| facets | object | See: Task Facet object for nested elements definition. | |
Task Facet
| Schema field | Type | Description | Required |
|---|---|---|---|
| nominalTime | object | See: OL - nominalTime facet. The nominalTime facet describes the nominal start of the task. Used as execution_date. See: Run Facet - nominalTime for nested elements definition. | |
| log | object | Log reporting. At least one of logBody or logUrl is required. See: Run Facet - log for nested elements definition. | |
| startTime | object | See: Run Facet - startTime for nested elements definition. | |
| errorMessage | object | See: OL - errorMessage facet. Used for error reporting. See: Run Facet - errorMessage for nested elements definition. | |
| tags | object | See: Run Facet - tags for nested elements definition. | |
| parent | object | See: Task Facet - parent for nested elements definition. | |
| metrics | list | See: Task Facet - metric for nested elements definition. | |
Task Facet - nominalTime
| Schema field | Type | Description | Required |
|---|---|---|---|
| nominalStartTime | string | Used as execution_date - the reported time in the UI. An example of a value: 2024-04-09T06:34:06.600299Z. | |
| nominalEndTime | string | The nominal end time of the task. An example of a value: 2024-04-09T06:34:06.600299Z. | |
| _producer | string, URI | Schema metadata. Not used at the moment; kept in the schema for compatibility with OL. | |
| _schemaURL | string, URI | Schema metadata. Not used at the moment; kept in the schema for compatibility with OL. Default value: https://github.com/OpenLineage/OpenLineage/blob/main/spec/facets/SQLJobFacet.json | |
Task Facet - parent
| Schema field | Type | Description | Required |
|---|---|---|---|
| job | object | See: Task Facet - parent - job. | |
| run | object | See: Task Facet - parent - run for nested elements definition. | |
| _producer | string, URI | The URI identifying the producer of this metadata. Not used at the moment; kept in the schema for compatibility with OL. | |
| _schemaURL | string, URI | The JSON Pointer URL to the corresponding version of the schema definition for this facet. Not used at the moment; kept in the schema for compatibility with OL. Default value: https://raw.githubusercontent.com/OpenLineage/OpenLineage/main/spec/OpenLineage.json#/definitions/ParentRunFacet | |
Task Facet - metric
| Element | Type | Description | Required |
|---|---|---|---|
| metricName | string | Enables you to provide a name for the metric. | |
| timestamp | datetime | The date and time when tracking for this metric begins. An example of a value: 2024-04-09T06:34:06.600299Z | |
| metricValue | any | The characteristic that is tracked. | |
| source | string | The creator of the metric. Either user or system; user is the default. | |
| _producer | string, URI | The URI identifying the producer of this metadata. Not used at the moment; kept in the schema for compatibility with OL. | |
| _schemaURL | string, URI | The JSON Pointer URL to the corresponding version of the schema definition for this facet. Not used at the moment; kept in the schema for compatibility with OL. Default value: https://raw.githubusercontent.com/OpenLineage/OpenLineage/main/spec/OpenLineage.json#/definitions/BaseFacet | |
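A metrics list can be built one entry at a time. In this sketch `metric_entry` is a hypothetical helper. It rejects any source other than user or system, uses the current UTC time when no timestamp is passed, and leaves out the metadata fields.

```python
from datetime import datetime, timezone
from typing import Any, Optional

ALLOWED_SOURCES = {"user", "system"}


def metric_entry(
    name: str, value: Any, source: str = "user", timestamp: Optional[datetime] = None
) -> dict:
    """Build one item of the metrics list for a task facet."""
    if source not in ALLOWED_SOURCES:
        raise ValueError(f"source must be one of {sorted(ALLOWED_SOURCES)}")
    timestamp = timestamp or datetime.now(timezone.utc)
    return {
        "metricName": name,
        "timestamp": timestamp.isoformat(),
        "metricValue": value,  # any JSON-serializable value
        "source": source,
    }
```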
Task Facet - parent - job
For more information, go to Job.
Task Facet - parent - run
| Schema field | Type | Description | Required |
|---|---|---|---|
| runId | string | Actual Parent Run UID. An example of a value: e5648915-6c19-4b1e-8817-5335dabf146d. | |
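Putting the parent facet together, a task can be linked to its parent as sketched here. `parent_facet` and `attach_parent` are hypothetical helpers; the `_producer` value is the placeholder used in the examples on this page, and the facet is assumed to live under `run.facets.parent`.

```python
def parent_facet(parent_job_name: str, parent_namespace: str, parent_run_id) -> dict:
    """Build the parent facet that points at the parent job and its run."""
    return {
        "_producer": "https://some.producer.com/version/1.0",
        "_schemaURL": "https://raw.githubusercontent.com/OpenLineage/OpenLineage/main/spec/OpenLineage.json#/definitions/ParentRunFacet",
        "job": {"name": parent_job_name, "namespace": parent_namespace},
        "run": {"runId": str(parent_run_id)},  # str() keeps UUID objects JSON-serializable
    }


def attach_parent(task_event: dict, facet: dict) -> dict:
    """Store the facet under run.facets.parent of the task event."""
    task_event.setdefault("run", {}).setdefault("facets", {})["parent"] = facet
    return task_event
```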
Examples
Look at the following examples of a payload:
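import uuid
from datetime import datetime, timedelta

import pytz

# uid of the run; the task below references it in its parent facet
run_uid = uuid.uuid4()

[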
{
"eventType": "FAIL",
"eventTime": datetime.utcnow().replace(tzinfo=pytz.utc),
"inputs": [],
"job": {
"facets": {},
"namespace": "airflow-prod",
"name": "my_dag"
},
"outputs": [],
"run": {
"facets": {
"nominalTime": {
"_producer": "https://some.producer.com/version/1.0",
"_schemaURL": "https://github.com/OpenLineage/OpenLineage/blob/main/spec/facets/NominalTimeRunFacet.json",
"nominalStartTime": datetime.utcnow().replace(tzinfo=pytz.utc)
},
"log": {
"_producer": "https://some.producer.com/version/1.0",
"_schemaURL": "https://raw.githubusercontent.com/OpenLineage/OpenLineage/main/spec/OpenLineage.json#/definitions/BaseFacet",
"logBody": "very helpful log.. and very long",
"logUrl": "https://bucket.s3.somewhere.com/.../file.log"
},
"startTime": {
"_producer": "https://some.producer.com/version/1.0",
"_schemaURL": "https://raw.githubusercontent.com/OpenLineage/OpenLineage/main/spec/OpenLineage.json#/definitions/BaseFacet",
"startTime": datetime.utcnow().replace(tzinfo=pytz.utc) - timedelta(minutes=5)
},
"errorMessage": {
"_producer": "https://some.producer.com/version/1.0",
"_schemaURL": "https://github.com/OpenLineage/OpenLineage/blob/main/spec/facets/ErrorMessageRunFacet.json",
"message": "org.apache.spark.sql.AnalysisException: Table or view not found: wrong_table_name; line 1 pos 14",
"programmingLanguage": "JAVA",
"stackTrace": 'Exception in thread "main" java.lang.RuntimeException: A test exception\nat io.openlineage.SomeClass.method(SomeClass.java:13)\nat io.openlineage.SomeClass.anotherMethod(SomeClass.java:9)'
},
"tags": {
"projectName": "test_project",
"runName": "test_run_name",
"_schemaURL": "https://raw.githubusercontent.com/OpenLineage/OpenLineage/main/spec/OpenLineage.json#/definitions/BaseFacet",
"_producer": "https://some.producer.com/version/1.0"
}
},
"runId": run_uid
},
"producer": "https://custom.api",
"schemaURL": "https://openlineage.io/spec/1-0-5/OpenLineage.json#/definitions/RunEvent"
},
{
"eventTime": datetime.utcnow().replace(tzinfo=pytz.utc),
"eventType": "FAIL",
"job": {
"facets": {},
"namespace": "airflow-prod",
"name": "my_dag.failing_task_with_log"
},
"run": {
"facets": {
"nominalTime": {
"_producer": "https://some.producer.com/version/1.0",
"_schemaURL": "https://github.com/OpenLineage/OpenLineage/blob/main/spec/facets/SQLJobFacet.json",
"nominalStartTime": datetime.utcnow().replace(tzinfo=pytz.utc)
},
"log": {
"_producer": "https://some.producer.com/version/1.0",
"_schemaURL": "https://raw.githubusercontent.com/OpenLineage/OpenLineage/main/spec/OpenLineage.json#/definitions/BaseFacet",
"logBody": "very helpful log.. and very long",
"logUrl": "https://bucket.s3.somewhere.com/.../file.log"
},
"startTime": {
"_producer": "https://some.producer.com/version/1.0",
"_schemaURL": "https://raw.githubusercontent.com/OpenLineage/OpenLineage/main/spec/OpenLineage.json#/definitions/BaseFacet",
"startTime": datetime.utcnow().replace(tzinfo=pytz.utc) - timedelta(minutes=5)
},
"errorMessage": {
"_producer": "https://some.producer.com/version/1.0",
"_schemaURL": "https://github.com/OpenLineage/OpenLineage/blob/main/spec/facets/ErrorMessageRunFacet.json",
"message": "org.apache.spark.sql.AnalysisException: Table or view not found: wrong_table_name; line 1 pos 14",
"programmingLanguage": "JAVA",
"stackTrace": 'Exception in thread "main" java.lang.RuntimeException: A test exception\nat io.openlineage.SomeClass.method(SomeClass.java:13)\nat io.openlineage.SomeClass.anotherMethod(SomeClass.java:9)'
},
"parent": {
"_producer": "https://some.producer.com/version/1.0",
"_schemaURL": "https://raw.githubusercontent.com/OpenLineage/OpenLineage/main/spec/OpenLineage.json#/definitions/ParentRunFacet",
"job": {
"name": "my_dag",
"namespace": "airflow-prod"
},
"run": {
"runId": run_uid
}
},
"metrics": [
{
"metricName": "metric1",
"timestamp": datetime.utcnow().replace(tzinfo=pytz.utc),
"metricValue": 100,
"source": "user",
"_producer": "https://some.producer.com/version/1.0",
"_schemaURL": "https://raw.githubusercontent.com/OpenLineage/OpenLineage/main/spec/OpenLineage.json#/definitions/BaseFacet"
},
{
"metricName": "metric2",
"timestamp": datetime.utcnow().replace(tzinfo=pytz.utc),
"metricValue": 99.99,
"source": "system",
"_producer": "https://some.producer.com/version/1.0",
"_schemaURL": "https://raw.githubusercontent.com/OpenLineage/OpenLineage/main/spec/OpenLineage.json#/definitions/BaseFacet"
},
{
"metricName": "metric3",
"timestamp": datetime.utcnow().replace(tzinfo=pytz.utc),
"metricValue": "string_value",
"source": "user",
"_producer": "https://some.producer.com/version/1.0",
"_schemaURL": "https://raw.githubusercontent.com/OpenLineage/OpenLineage/main/spec/OpenLineage.json#/definitions/BaseFacet"
},
{
"metricName": "metric4",
"timestamp": datetime.utcnow().replace(tzinfo=pytz.utc),
"metricValue": {
"key": "value"
},
"source": "user",
"_producer": "https://some.producer.com/version/1.0",
"_schemaURL": "https://raw.githubusercontent.com/OpenLineage/OpenLineage/main/spec/OpenLineage.json#/definitions/BaseFacet"
}
]
},
"runId": uuid.uuid4()
},
"producer": "https://custom.api",
"schemaURL": "https://openlineage.io/spec/1-0-5/OpenLineage.json#/definitions/RunEvent"
}
]
The following code block presents an example of a payload with one run and three tasks, where task my_dag.passing_task is upstream of task my_dag.failing_task_with_log and my_dag.passing_task_child is a child of my_dag.passing_task.
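import uuid
from datetime import datetime, timedelta, timezone

# reference time shared by all events in this payload
NOW = datetime.now(timezone.utc)
# str() keeps the identifiers JSON-serializable
parent_run_uid = str(uuid.uuid4())
run_uid = str(uuid.uuid4())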
simple_payload = [
{
"eventTime": NOW.isoformat(),
"eventType": "COMPLETE",
"job": {
"facets": {},
"name": "my_dag.passing_task",
"namespace": "airflow-prod"
},
"producer": "custom_api",
"run": {
"facets": {
"nominalTime": {
"_producer": "https://some.producer.com/version/1.0",
"_schemaURL": "https://github.com/OpenLineage/OpenLineage/blob/main/spec/facets/SQLJobFacet.json",
"nominalStartTime": (NOW - timedelta(minutes=5)).isoformat(),
},
"parent": {
"_producer": "https://github.com/OpenLineage/OpenLineage/tree/1.11.3/client/python",
"_schemaURL": "https://raw.githubusercontent.com/OpenLineage/OpenLineage/main/spec/OpenLineage.json#/definitions/ParentRunFacet",
"job": {
"name": "my_dag",
"namespace": "airflow-prod"
},
"run": {
"runId": parent_run_uid
}
},
"errorMessage": {
"_producer": "https://some.producer.com/version/1.0",
"_schemaURL": "https://github.com/OpenLineage/OpenLineage/blob/main/spec/facets/ErrorMessageRunFacet.json",
"message": "org.apache.spark.sql.AnalysisException: Table or view not found: wrong_table_name; line 1 pos 14",
"programmingLanguage": "JAVA",
"stackTrace": "Exception in thread \"main\" java.lang.RuntimeException: A test exception\nat io.openlineage.SomeClass.method(SomeClass.java:13)\nat io.openlineage.SomeClass.anotherMethod(SomeClass.java:9)"
},
"startTime": {
"startTime": (NOW - timedelta(minutes=5)).isoformat(),
"_producer": "https://some.producer.com/version/1.0",
"_schemaURL": "https://github.com/OpenLineage/OpenLineage/blob/main/spec/facets/SQLJobFacet.json"
}
},
"runId": run_uid
},
"schemaURL": "https://openlineage.io/spec/1-0-5/OpenLineage.json#/definitions/RunEvent"
},
{
"eventTime": NOW.isoformat(),
"eventType": "COMPLETE",
"job": {
"facets": {},
"name": "my_dag.passing_task_child",
"namespace": "airflow-prod"
},
"producer": "custom_api",
"run": {
"facets": {
"nominalTime": {
"_producer": "https://some.producer.com/version/1.0",
"_schemaURL": "https://github.com/OpenLineage/OpenLineage/blob/main/spec/facets/SQLJobFacet.json",
"nominalStartTime": (NOW - timedelta(minutes=5)).isoformat(),
},
"parent": {
"_producer": "https://github.com/OpenLineage/OpenLineage/tree/1.11.3/client/python",
"_schemaURL": "https://raw.githubusercontent.com/OpenLineage/OpenLineage/main/spec/OpenLineage.json#/definitions/ParentRunFacet",
"job": {
"name": "my_dag.passing_task",
"namespace": "airflow-prod"
},
"run": {
"runId": run_uid
}
},
"startTime": {
"startTime": (NOW - timedelta(minutes=5)).isoformat(),
"_producer": "https://some.producer.com/version/1.0",
"_schemaURL": "https://github.com/OpenLineage/OpenLineage/blob/main/spec/facets/SQLJobFacet.json"
}
},
"runId": str(uuid.uuid4())
},
"schemaURL": "https://openlineage.io/spec/1-0-5/OpenLineage.json#/definitions/RunEvent"
},
{
"eventTime": NOW.isoformat(),
"eventType": "FAIL",
"job": {
"facets": {},
"name": "my_dag.failing_task_with_log",
"namespace": "airflow-prod"
},
"producer": "custom_api",
"inputs": [{
"facets": {},
"name": "my_dag.passing_task",
"namespace": "airflow-prod"
}
],
"run": {
"facets": {
"nominalTime": {
"_producer": "https://some.producer.com/version/1.0",
"_schemaURL": "https://github.com/OpenLineage/OpenLineage/blob/main/spec/facets/SQLJobFacet.json",
"nominalStartTime": (NOW - timedelta(minutes=5)).isoformat(),
},
"log": {
"_producer": "https://github.com/OpenLineage/OpenLineage/tree/1.11.3/client/python",
"_schemaURL": "https://raw.githubusercontent.com/OpenLineage/OpenLineage/main/spec/OpenLineage.json#/definitions/BaseFacet",
"logBody": "very helpful log.. and very long",
"logUrl": "https://bucket.s3.somewhere.com/.../file.log"
},
"parent": {
"_producer": "https://github.com/OpenLineage/OpenLineage/tree/1.11.3/client/python",
"_schemaURL": "https://raw.githubusercontent.com/OpenLineage/OpenLineage/main/spec/OpenLineage.json#/definitions/ParentRunFacet",
"job": {
"name": "my_dag",
"namespace": "airflow-prod"
},
"run": {
"runId": parent_run_uid
}
},
"startTime": {
"startTime": (NOW - timedelta(minutes=5)).isoformat(),
"_producer": "https://some.producer.com/version/1.0",
"_schemaURL": "https://github.com/OpenLineage/OpenLineage/blob/main/spec/facets/SQLJobFacet.json"
}
},
"runId": str(uuid.uuid4())
},
"schemaURL": "https://openlineage.io/spec/1-0-5/OpenLineage.json#/definitions/RunEvent"
},
{
"eventTime": NOW.isoformat(),
"eventType": "FAIL",
"inputs": [],
"job": {
"facets": {},
"name": "my_dag",
"namespace": "airflow-prod"
},
"outputs": [],
"producer": "custom_api",
"run": {
"facets": {
"nominalTime": {
"_producer": "https://some.producer.com/version/1.0",
"_schemaURL": "https://github.com/OpenLineage/OpenLineage/blob/main/spec/facets/SQLJobFacet.json",
"nominalStartTime": (NOW - timedelta(minutes=5)).isoformat(),
},
"log": {
"_producer": "https://github.com/OpenLineage/OpenLineage/tree/1.11.3/client/python",
"_schemaURL": "https://raw.githubusercontent.com/OpenLineage/OpenLineage/main/spec/OpenLineage.json#/definitions/BaseFacet",
"logBody": "some less helpful log"
},
"startTime": {
"startTime": (NOW - timedelta(minutes=5)).isoformat(),
"_producer": "https://some.producer.com/version/1.0",
"_schemaURL": "https://github.com/OpenLineage/OpenLineage/blob/main/spec/facets/SQLJobFacet.json"
},
"tags": {
"projectName": "test_project"
}
},
"runId": parent_run_uid
},
"schemaURL": "https://openlineage.io/spec/1-0-5/OpenLineage.json#/definitions/RunEvent"
}
]
Response
After you send the payload to the endpoint by using the POST method, you can get
either a success or error response.
- Success response: If the request completes successfully, the endpoint returns status code 200 and the following response body:
  {'success': True}
- Error response: If the request fails, the endpoint returns an error status code and a response body in the following format:
  { 'error': '<short error message>', 'message': <detailed error description>, 'traceback': <traceback> }
  Examples:
  {'error': 'Bad Request', 'message': "Invalid data. details: <>", 'traceback': '<traceback>'}
  {'error': 'Internal Server Error', 'message': "<depend on the error>", 'traceback': '<traceback>'}
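A client can turn these two response shapes into either a success flag or an exception. The sketch below assumes that the body arrives as JSON text (the bodies above are shown in Python notation); `BulkSendError` and `interpret_response` are hypothetical names.

```python
import json


class BulkSendError(Exception):
    """Raised when the endpoint does not report success."""


def interpret_response(status_code: int, body_text: str) -> bool:
    """Return True on a success response, otherwise raise BulkSendError."""
    try:
        body = json.loads(body_text)
    except ValueError:
        body = None
    if not isinstance(body, dict):
        body = {}  # fall back to the raw text in the error message
    if status_code == 200 and body.get("success") is True:
        return True
    error = body.get("error", "unknown error")
    message = body.get("message", body_text)
    raise BulkSendError(f"{status_code} {error}: {message}")
```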