Debugging and running data flow EPGs

You can debug the code unit nodes in an execution plan graph (EPG) one at a time, or you can debug the entire graph at once. Likewise, you can run the nodes one at a time or run the entire graph.
About this task
The execution plan of a data flow is heavily optimized for performance, so the structure of the EPG that a data flow produces does not correspond one-to-one with the structure of the original data flow. For this reason, you cannot debug a data flow directly from the data flow editor. Instead, you view the execution plan of the data flow in the EPG editor and debug the data flow there.
Note: Cancelling the debugging operation might leave intermediate objects and rows in the tables. If this occurs, you must remove these objects and rows manually. Otherwise, subsequent runs might fail.
You can use one of the following debugging options:
Debug
This option debugs the entire EPG at once.
Step-wise
This option debugs each code unit node or activity in sequence and lets you decide when to move from one node or activity to the next.
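
The difference between the two options, and the way breakpoints interact with them later in the procedure, can be pictured with a small model. The following Python sketch is illustrative only and is not part of Design Studio: the `DebugSession` class, its methods, and every node name except START_GRAPH are hypothetical. It also assumes that a run pauses before the breakpoint node is executed, which the product documentation does not state.

```python
from dataclasses import dataclass, field


@dataclass
class DebugSession:
    """Toy model of a debug run over an ordered sequence of EPG node names."""

    nodes: list[str]
    breakpoints: set[str] = field(default_factory=set)
    position: int = 0  # index of the next node to execute
    executed: list[str] = field(default_factory=list)

    def finished(self) -> bool:
        return self.position >= len(self.nodes)

    def step(self) -> None:
        """Step-wise: execute the current node and move to the next one."""
        if not self.finished():
            self.executed.append(self.nodes[self.position])
            self.position += 1

    def resume(self, ignore_breakpoints: bool = False) -> None:
        """Run to the next breakpoint, or to the end if breakpoints are ignored."""
        # Execute the current node first so that a session paused on a
        # breakpoint always makes progress.
        self.step()
        while not self.finished():
            next_node = self.nodes[self.position]
            if not ignore_breakpoints and next_node in self.breakpoints:
                break  # assumption: pause before the breakpoint node runs
            self.step()


# Hypothetical node names; only START_GRAPH appears in the procedure.
session = DebugSession(
    nodes=["START_GRAPH", "TXN_BEGIN", "CODE_UNIT_1", "CODE_UNIT_2", "END_GRAPH"],
    breakpoints={"CODE_UNIT_1"},
)
session.resume()                          # like "continue to the next breakpoint"
paused_at = session.nodes[session.position]
session.step()                            # like "execute current node and move to the next"
session.resume(ignore_breakpoints=True)   # like "continue to end ignoring all breakpoints"
```

In this model, the Debug option corresponds to calling `resume` repeatedly until the session finishes, and the Step-wise option corresponds to calling `step` once per node.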

The following procedure describes debugging and running the EPG for a data flow.

Procedure

To debug and run an EPG:

  1. Optional: Add breakpoints in the EPG so that you can debug the EPG up to a certain point. To add a breakpoint at a node in the EPG, right-click the node and click Toggle Breakpoint.
  2. Click EPG > Debug EPG. The Debug Flow window opens.
  3. On the General page, select a run profile to specify the debugging requirements, and select the database to use for the debugging process.
  4. On the Diagnostics page, review the settings and change them as required.
  5. On the Resources page, review the resource profiles. By default, the page shows only the resource profiles that are referenced by the process and that are not variables. To select from the resource profiles that are variables, click Show All. You can also edit or remove a selected resource profile.
    Tip: You can browse the Variables page to review the resource type variables.
  6. On the Variables page, review the variables that are used. Edit the current values for these variables as needed.
  7. Click Debug in the window. The first node in the EPG (START_GRAPH) is highlighted to indicate that execution has started.
  8. View the updated node properties by right-clicking the node and selecting Show Properties View. For the TXN nodes that are generated in the EPG, you can use the attached list of database connections to change the database connection that is used. For the code units, you can edit the attached code. For example, before you run an EPG, you can edit the SQL statements that are generated for the JDBC code.
  9. Debug the remaining nodes in the EPG:
    1. On the Design Studio toolbar, click the down arrow button to move the execution point to the next node in the graph. This kind of EPG debugging is referred to as the step-wise execution of nodes.
    2. Update the node properties.
      Tip: To debug the rest of the EPG to completion without stepping, click the Resume button, or right-click the node and select Debug > Continue.
    3. Repeat substeps 1 and 2 for each of the remaining nodes.
  10. If you added a breakpoint in the EPG, the debugging process stops at the breakpoint and waits for user action. Click one of the following buttons to continue debugging from the breakpoint:
    • The "continue to end ignoring all breakpoints" button: Ignores all breakpoints and debugs the EPG until the end of the graph.
    • The "execute current node and move to the next" button: Moves to the node that follows the breakpoint.
    • The "continue to the next breakpoint" button: Continues debugging until the next breakpoint.
  11. Run the EPG:
    1. Select EPG > Execute. The Flow Execution window opens.
    2. Review the settings on all of the pages and change them as required. The pages and their settings are identical to those in the Debug Flow window.
    3. Click Execute in the Flow Execution window to start the process. The Executing flow window opens.
      • To work in the Design Studio while the EPG is running, click Run in Background. This minimizes the progress bar to the lower right corner of the Design Studio and makes the canvas available. Clicking the icon next to the minimized window opens a default progress view, which closes when the flow stops running.
      • To check the state of the running EPG, open the Execution Status view and click the refresh icon.
  12. When the EPG is finished running, review the following information:
    • Review the Execution Result window to see a log of the run. Use this information to debug problems if the run fails. You can save the results to a text file.
    • Review the Execution Status view. This view shows a table of information for all of the processes that you ran, and the information is available until you close the Design Studio. Each row in the table represents one run process. Right-click a row to delete the process from the table or to view the entire log file for the process. You can delete a process only when it is in the Failed to start, Cancelled, or Completed state. You can also review a log of the process activities. To check the process activity while an EPG is running, review the Progress column. If the Progress column is not visible in the Execution Status view, go to Window > Preferences > Data Warehousing > Execution Status and select the Progress check box.
    • Review the Tail Log page for a selected run to see the last few hundred lines of the log file.
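
If you save the execution results to a text file, you can get a similar "last few hundred lines" view outside of the Design Studio. The following Python sketch is one way to do that and is not how the Tail Log page is implemented. The function name `tail_lines`, the default of 300 lines, and the UTF-8 encoding are assumptions.

```python
from collections import deque


def tail_lines(log_path, max_lines=300):
    """Return the last max_lines lines of a saved log file.

    A bounded deque keeps only the most recent lines in memory, so large
    log files can be read without loading them completely.
    """
    # Assumption: the saved log is text; undecodable bytes are replaced.
    with open(log_path, "r", encoding="utf-8", errors="replace") as handle:
        return [line.rstrip("\n") for line in deque(handle, maxlen=max_lines)]
```

For example, `tail_lines("run.log", 100)` returns at most the last 100 lines of a hypothetical file named run.log.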

