Debugging a flow
When you run a flow, you can monitor execution at both the flow level and the node level to identify and resolve issues.
Monitoring flow execution
The flow canvas provides a visual representation of the pipeline and its execution status.
Use the top bar to review:
- Runtime - Indicates which execution engine is running your flow, Python or Spark.
- Elapsed time - Displays the duration the flow has been running.
- Nodes - Gives a summary of the number of nodes that are:
  - Completed
  - Failed
  - Completed with warnings
- Documents - Tracks processing of individual documents, listing how many are:
  - Read
  - Skipped
  - Failed
Inspecting node details
Each node in the canvas displays a status icon that indicates its current state. Use these indicators to quickly identify where issues occur. Click a node in the flow to open its details panel.
Log details
Use the Log details tab to review execution logs for the node. The logs include:
- OrchestratorType: PYTHON or SPARK
- Step ID: A unique identifier (Node ID)
- Starting execution message with the operator name.
- Completion message with time taken (e.g., "time= 3.87 seconds").
- Schema: Lists all column names and data types.
- Operator Metadata: A dictionary containing node metadata.
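If you copy node logs out of the Log details tab, the completion times can be pulled out with a short script. The following is a minimal sketch, not a supported tool: only the "time= 3.87 seconds" fragment comes from the description above, while the surrounding log lines, the operator name `example_operator`, and the function name `extract_seconds` are assumptions made for illustration.

```python
import re

# Assumption: completion messages contain a fragment shaped like
# "time= 3.87 seconds". The rest of each log line is not specified here.
TIME_PATTERN = re.compile(r"time=\s*([0-9]+(?:\.[0-9]+)?)\s*seconds")


def extract_seconds(log_text):
    """Return every duration found in the log text, in order, as floats."""
    return [float(value) for value in TIME_PATTERN.findall(log_text)]


# Hypothetical log excerpt used only to exercise the pattern.
sample_log = """\
Starting execution of operator example_operator
Completed execution of operator example_operator, time= 3.87 seconds
"""

print(extract_seconds(sample_log))
```

Matching on the timing fragment alone, rather than on the whole line, keeps the sketch usable even if the actual message wording differs from the sample.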
Node summary
Use the Node summary tab to review node-level metrics and results, including:
- Node status:
  - Completed - Finished successfully
  - Failed - Encountered an error
  - Skipped - Was not executed
  - Running - Currently processing
  - Completed With Warnings - Finished with warnings
  - Completed With Errors - Finished, but some documents failed
  - Pending - Operator is waiting to start
- Documents in scope - Total documents available for this operator to process
- Completed docs count - Number of documents successfully processed by this operator
- Processed docs - Total documents processed (successful and failed)
- Failed docs - Number of documents that failed in this operator
- Skipped docs - Number of documents skipped
- Page type stats - Number of pages processed, by format
- Total pages converted - Total number of pages processed
- Total conversion time in seconds
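The document counts above are related: Processed docs is described as the total of successful and failed documents. The sketch below shows how that relationship could be checked if you record the counts by hand or export them. The class `NodeSummary`, its field names, and `check_summary` are hypothetical and only mirror the labels on the Node summary tab; they are not part of any product API.

```python
from dataclasses import dataclass


@dataclass
class NodeSummary:
    # Hypothetical field names that mirror the Node summary labels.
    docs_in_scope: int
    completed_docs: int
    processed_docs: int
    failed_docs: int
    skipped_docs: int


def check_summary(summary):
    """Return a list of findings worth a closer look for one node."""
    findings = []
    # Processed docs is documented as successful plus failed documents.
    if summary.processed_docs != summary.completed_docs + summary.failed_docs:
        findings.append("processed docs != completed + failed")
    if summary.failed_docs > 0:
        findings.append(
            f"{summary.failed_docs} of {summary.processed_docs} processed docs failed"
        )
    return findings


# Example values typed in by hand from a node's summary tab.
print(check_summary(NodeSummary(100, 90, 95, 5, 5)))
```

An empty list means the counts are consistent and no documents failed, so the node probably does not need further inspection.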
To inspect intermediate output, click View table in the Node output section.
The table preview feature might impact performance. To disable it:
- In the flow canvas, click the Flow properties icon on the toolbar.
- Clear the Enable node output preview for the flow option.
- Click Save.