TensorFlow Evaluator

The TensorFlow Evaluator processor uses a TensorFlow machine learning model to generate predictions or classifications of data. For information about supported versions, see Supported systems and versions.

Using the TensorFlow Evaluator processor, you can design flows that read data and then generate predictions or classifications of the data during flow processing, producing data-driven insights in real time. For example, you can design flows that detect fraudulent transactions or that perform natural language processing as data passes through the flow.

To use the TensorFlow Evaluator processor, you first build and train the model in TensorFlow. You then save the trained model to file and store the saved model directory on the Data Collector machine that runs the flow.

When you configure the TensorFlow Evaluator processor, you define the path to the saved model stored on the Data Collector machine. You also define the input and output tensor information as configured during the building and training of the model.

You configure whether the processor evaluates each record or evaluates the entire batch at once. When evaluating the entire batch, the processor writes the prediction or classification results to events.

Prerequisites

Before you configure a TensorFlow Evaluator processor, complete the following prerequisites:

Build and train the model in TensorFlow
The processor uses version 1.15 of the TensorFlow client library and supports TensorFlow version 1.x.
For a tutorial on building and training a TensorFlow model, see the TensorFlow tutorials.
Save and store the trained model on the Data Collector machine
Save the trained model to file in SavedModel format. When you save a model in SavedModel format, TensorFlow creates a directory consisting of the following subdirectories and files:
assets/
assets.extra/
variables/
saved_model.pb
Store the complete SavedModel directory on the Data Collector machine where you plan to run the flow. For Data Collector, store the model directory in the Data Collector resources directory, $SDC_RESOURCES.
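
Before copying a model to the Data Collector machine, it can help to confirm that the directory has the layout listed above. The following is a minimal sketch that uses only the Python standard library. The function names are hypothetical, and the check only reports which of the listed entries exist on disk. It does not load or validate the model.

```python
from pathlib import Path

# Entries that this section lists for a SavedModel directory.
EXPECTED_ENTRIES = ("assets", "assets.extra", "variables", "saved_model.pb")


def inspect_saved_model_dir(model_dir):
    """Return a dict mapping each listed entry name to True if it exists."""
    root = Path(model_dir)
    return {name: (root / name).exists() for name in EXPECTED_ENTRIES}


def missing_entries(model_dir):
    """Return the listed entries that are absent from model_dir, in order."""
    report = inspect_saved_model_dir(model_dir)
    return [name for name, present in report.items() if not present]
```

Whether a missing entry matters depends on the model. This sketch only reports what is present so that you can compare the copy on the Data Collector machine with the directory that TensorFlow created.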

Evaluation method

The TensorFlow Evaluator processor can evaluate each record or evaluate the entire batch at once.

Configure the processor to use one of the following evaluation methods, based on the input that the tensor expects:

Evaluate each record
If the tensor requires one input to produce one output, configure the TensorFlow Evaluator processor to evaluate each record. By default, the processor evaluates each record, producing one output per record.

The processor receives each record as one input, performs the tensor computations to predict or classify the data, and then produces one output. The output includes all original fields in the record plus an additional output field that includes the prediction or classification result. The output field is a map or list field containing a field for each output that you configure for the processor.

To evaluate each record, ensure that the following processor properties are cleared:
  • On the General tab, clear the Produce Events property.
  • On the TensorFlow tab, clear the Entire Batch property.
Evaluate the entire batch
If the tensor requires multiple inputs to produce one output, configure the TensorFlow Evaluator processor to evaluate the entire batch.
When evaluating a batch, the processor waits until it receives all records in the batch, performs the tensor computations to predict or classify the data, and then produces one output as an event for the entire batch. The processor output includes the original fields in each record. The event output includes the prediction or classification result.
To evaluate the entire batch, ensure that the following processor properties are selected:
  • On the General tab, select the Produce Events property.
  • On the TensorFlow tab, select the Entire Batch property.
Then, connect the event stream from the TensorFlow Evaluator processor to a target to store the prediction or classification result, as described in Event generation.
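
The difference between the two evaluation methods can be sketched in plain Python. This is an illustration of the behavior described above, not the processor's implementation. The function names, the "output" field name, the dictionary layout of the event, and the mean_amount stand-in for a model are all assumptions made for the example.

```python
def evaluate_each_record(records, model_fn, output_field="output"):
    """Per-record mode: one model call and one result per record."""
    results = []
    for record in records:
        enriched = dict(record)                      # keep all original fields
        enriched[output_field] = model_fn([record])  # add the result field
        results.append(enriched)
    return results, []                               # no events in this mode


def evaluate_entire_batch(records, model_fn):
    """Batch mode: one model call for the whole batch, result goes to an event."""
    passthrough = [dict(record) for record in records]  # records are unchanged
    event = {"type": "tensorflow-event", "value": model_fn(records)}
    return passthrough, [event]


def mean_amount(records):
    """Hypothetical stand-in for a model: averages the 'amount' field."""
    return {"mean": sum(r["amount"] for r in records) / len(records)}
```

In per-record mode each returned record carries its own result field. In batch mode the records pass through with their original fields only, and the single result travels in the event.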

Event generation

When configured to evaluate the entire batch, the TensorFlow Evaluator processor can generate events. The events contain the results of the prediction or classification made for the batch.

Important: Configure the TensorFlow Evaluator processor to generate events only when the processor is configured to evaluate the entire batch.
TensorFlow Evaluator events can be used in any logical way. For example, you can connect the event stream to a target that stores the prediction or classification result.

For more information about dataflow triggers and the event framework, see Dataflow triggers overview.

Event records

Event records generated by the TensorFlow Evaluator processor have the following event-related record header attributes. Record header attributes are stored as String values:
Record Header Attribute - Description
sdc.event.type - Event type. Uses the following event type:
  • tensorflow-event - Contains the results of the prediction or classification of the batch.
sdc.event.version - Integer that indicates the version of the event record type.
sdc.event.creation_timestamp - Epoch timestamp when the stage created the event.

The TensorFlow Evaluator processor generates a tensorflow-event record when the processor completes processing all records in the batch. The event record is a Map field containing a field for each output configuration that you define for the processor.
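
As a rough illustration of the shape of such an event record, the sketch below builds a Python dictionary with the three header attributes stored as strings and a map of outputs as the body. The "header" and "value" keys, the version value of 1, the millisecond timestamp unit, and the function name are assumptions made for the example and are not taken from the product.

```python
import time


def build_tensorflow_event(outputs, version=1, now_ms=None):
    """Build a dict that mimics a tensorflow-event record.

    outputs: mapping with one entry per configured output.
    version and now_ms are hypothetical parameters for illustration.
    """
    if now_ms is None:
        now_ms = int(time.time() * 1000)  # assumed unit: epoch milliseconds
    header = {
        # Header attributes are stored as String values.
        "sdc.event.type": "tensorflow-event",
        "sdc.event.version": str(version),
        "sdc.event.creation_timestamp": str(now_ms),
    }
    return {"header": header, "value": dict(outputs)}
```

A downstream stage that reads sdc.event.version or sdc.event.creation_timestamp would need to parse the string back into a number before comparing it.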

Serving a TensorFlow model

If you include a TensorFlow Evaluator processor in a microservice flow, you can serve the TensorFlow model in the running flow.

When you serve a TensorFlow model, external clients can use the model to perform computations. In a microservice flow, a client makes a REST API call to the source. The microservice flow performs all processing, which can include the predictions or classifications made by the TensorFlow Evaluator processor. The records with the TensorFlow prediction or classification result are sent back to the microservice flow source. The source then transmits JSON-formatted responses back to the originating REST API client.
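
From the client side, the exchange might look like the following sketch, written with the Python standard library. The URL, the record layout, and the function names are hypothetical. The actual endpoint, port, and any required headers or identifiers depend on how the microservice flow source is configured.

```python
import json
import urllib.request


def build_prediction_request(url, record):
    """Build a POST request that carries one record as a JSON body."""
    body = json.dumps(record).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def request_prediction(url, record, timeout=10):
    """Send the record and return the parsed JSON response."""
    request = build_prediction_request(url, record)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

For example, request_prediction("http://localhost:8000/", {"amount": 12.5}) would send one record to a source listening at that hypothetical address and return whatever JSON the flow sends back.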

Configuring a TensorFlow Evaluator processor

About this task

Configure a TensorFlow Evaluator processor to use a TensorFlow machine learning model to generate predictions or classifications of data.

Procedure

  1. In the Properties panel, on the General tab, configure the following properties:
    General Property - Description
    Name - Stage name.
    Description - Optional description.
    Produce Events - Generates event records when events occur. Use for event handling.
    Required Fields - Fields that must include data for the record to be passed into the stage.
    Tip: You might include fields that the stage uses.

    Records that do not include all required fields are processed based on the error handling configured for the flow.

    Preconditions - Conditions that must evaluate to TRUE to allow a record to enter the stage for processing. Click Add to create additional preconditions.

    Records that do not meet all preconditions are processed based on the error handling configured for the stage.

    On Record Error - Error record handling for the stage:
    • Discard - Discards the record.
    • Send to Error - Sends the record to the flow for error handling.
    • Stop Flow - Stops the flow.
  2. On the TensorFlow tab, configure the following properties:
    TensorFlow Property - Description
    Saved Model Path - Path to the saved TensorFlow model on the Data Collector machine. Specify either an absolute path or the path relative to the Data Collector resources directory.
    For example, if you saved a model named my_saved_model to the Data Collector resources directory /var/lib/sdc-resources, then enter either of the following paths:
    • /var/lib/sdc-resources/my_saved_model
    • my_saved_model
    Model Tags - Tags applied to the TensorFlow model when the model was built and trained.
    Input Configurations - Tensor input information configured during the building and training of the model.
    Define one or more input configurations, configuring the following properties for each:
    • Operation - Operation to perform on the inputs.
    • Index - Position of this input in the matrix of inputs.
    • Fields to Convert - Fields in the record to convert to tensor fields as required by the input operation.

      Specify one or more fields, or configure a field path expression to define a set of fields.

    • Shape - Number of elements in each dimension.
    • Tensor Type - Data type of the tensor.

    Using simple or bulk edit mode, click the Add icon to define another input configuration.

    Output Configurations - Tensor output information configured during the building and training of the model.
    Define one or more output configurations, configuring the following properties for each:
    • Operation - Operation to perform on the outputs.
    • Index - Position of this output in the matrix of outputs.
    • Tensor Type - Data type of the tensor.

    Using simple or bulk edit mode, click the Add icon to define another output configuration.

    Entire Batch - Evaluates the entire batch at once. Select when the TensorFlow model requires many inputs to generate one output. Clear when the TensorFlow model requires one input to generate one output.

    Default is cleared.

    If selected, you must also configure the processor to generate events so that the processor produces one output as an event for the entire batch. The event output includes the prediction or classification result.

    Output Field - If evaluating each record, the output field for the prediction or classification result. The processor adds the output field to each record.
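
Two of the properties above lend themselves to a short illustration: how a Saved Model Path value resolves, and how a TensorFlow tensor name splits into the values for Operation and Index. The sketch below uses hypothetical helper names. It assumes POSIX-style paths, and it assumes that Operation and Index correspond to the two parts of a TensorFlow tensor name, which TensorFlow writes as operation name, colon, output index.

```python
import posixpath  # POSIX-style paths, matching the /var/lib/... example


def resolve_model_path(saved_model_path, resources_dir):
    """Use absolute paths as-is; join relative paths to resources_dir."""
    if posixpath.isabs(saved_model_path):
        return saved_model_path
    return posixpath.join(resources_dir, saved_model_path)


def split_tensor_name(tensor_name):
    """Split a tensor name such as 'input:0' into (operation, index)."""
    operation, sep, index = tensor_name.rpartition(":")
    if not sep or not operation or not index.isdigit():
        raise ValueError("expected '<operation>:<index>', got %r" % tensor_name)
    return operation, int(index)
```

With the resources directory /var/lib/sdc-resources, both my_saved_model and /var/lib/sdc-resources/my_saved_model resolve to the same location, which matches the two equivalent entries shown for Saved Model Path.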