Writing deployable Python functions (Watson Machine Learning)

Learn how to write a Python function and then store it as an asset that you can use to deploy models.

For a list of general requirements for deployable functions, refer to General requirements for deployable functions. For information about what happens during a function deployment, refer to Function deployment process.

General requirements for deployable functions

To be deployed successfully, a function must meet these requirements:

  • On import, the Python function file must have the score function object as part of its scope. Refer to Score function requirements.
  • The scoring input payload must meet the requirements that are listed in Scoring input requirements.
  • The output payload that score returns must follow the schema of the score_response variable for status code 200. Note that the predictions parameter, with an array of JSON objects as its value, is mandatory in the score output.
  • When you use the Python client to save a Python function that contains a reference to an outer function, only the code in the scope of the outer function (including its nested functions) is saved. Code outside the outer function's scope is not saved and is therefore not available when you deploy the function.
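The last requirement is worth illustrating with a sketch (the names here are hypothetical). Anything that score needs must be defined inside the outer function, because module-level code is not persisted with the function object:

```python
GREETING = "Hello from module scope"  # NOT saved with the function object

def my_deployable_function():
    # Everything that score needs must live in this outer scope.
    prefix = "Received message - "

    def score(payload):
        message = payload["input_data"][0]["values"][0][0]
        # Using `prefix` is safe: it is captured by the closure and saved.
        # Referencing GREETING here would fail after deployment, because
        # module-level code outside the outer function is not persisted.
        return {"predictions": [{"fields": ["Response_message_field"],
                                 "values": [[prefix + message]]}]}

    return score
```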

Score function requirements

  • You can add the score function object in two ways:
    • explicitly, by the user
    • implicitly, by the method that is used to save the Python function as an asset in the Watson Machine Learning repository
  • The score function can accept a single JSON input parameter or two parameters: payload and bearer token.
  • The score function must return a JSON-serializable object (for example: dictionaries or lists).

Scoring input requirements

  • The scoring input payload must include an array with the name values, as shown in this example schema. The input_data parameter is mandatory in the payload. The input_data parameter can also include additional name-value pairs.

    {"input_data": [{
        "values": [["Hello world!"]]
    }]}
    
  • The scoring input payload must be passed as the input parameter value for score. This way, you ensure that the value of the score input parameter is handled correctly inside score.

  • The scoring input payload must match the input requirements of the Python function in question.

  • The scoring input payload must include an array that matches the Example input data schema.

Example input data schema

 {"input_data": [{
     "values": [["Hello, world!"]]
 }]}
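Before calling score, you can sanity-check a payload against this schema with a few lines of plain Python. The validate_scoring_payload helper below is a hypothetical illustration, not part of any Watson Machine Learning API:

```python
def validate_scoring_payload(payload):
    """Raise ValueError if the payload does not match the input data schema."""
    input_data = payload.get("input_data")
    if not isinstance(input_data, list) or not input_data:
        raise ValueError("payload must contain a non-empty 'input_data' array")
    for entry in input_data:
        if not isinstance(entry.get("values"), list):
            raise ValueError("each 'input_data' entry must contain a 'values' array")
    return payload
```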

Example Python code (payload and token)

#wml_python_function
def my_deployable_function():

    def score(payload, token):

        message_from_input_payload = payload.get("input_data")[0].get("values")[0][0]
        response_message = "Received message - {0}".format(message_from_input_payload)

        # Score using the pre-defined model
        score_response = {
            'predictions': [{'fields': ['Response_message_field'],
                             'values': [[response_message]]
                            }]
        }
        return score_response

    return score

score = my_deployable_function()

Testing your Python function

Here's how you can test your Python function:

input_data = { "input_data": [{ "fields": [ "message" ],
                                "values": [[ "Hello, world!" ]]
                               }
                              ]
             }
function_result = score( input_data, "<bearer-token>" )  # this score expects a payload and a token
print( function_result )

It prints the score response, which echoes the input: "Received message - Hello, world!".

Function deployment process

The Watson Machine Learning engine loads the Python code of your Function asset as a Python module by using an import statement. This means that the code is executed once at deployment time, and again whenever the corresponding pod is restarted. The score function that is defined by the Function asset is then called for every prediction request.
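Locally, the same life cycle can be sketched as follows: module-level statements run once, at import (that is, deployment) time, while the body of score runs for every prediction request:

```python
import time

# Runs once, when the module is imported (that is, at deployment time
# or after a pod restart).
DEPLOYED_AT = time.time()

def score(input_data):
    # Runs on every prediction request; DEPLOYED_AT stays constant.
    return {"predictions": [{"fields": ["deployed_at"],
                             "values": [[DEPLOYED_AT]]}]}
```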

Handling deployable functions

Use one of these methods to create a deployable Python function:

Creating deployable functions through REST API

For REST APIs, because the Python function is uploaded directly as a file, the file must already contain the score function. Any one-time imports that are needed later within the score function can be done in the global scope of the file. When this file is deployed as a Python function, the one-time imports in the global scope are executed during deployment and are then reused with every prediction request.

Important:

The function archive must be a .gz file.

Sample score function file:

Score function.py
---------------------
def score(input_data):
    return {'predictions': [{'values': [['Just a test']]}]}

Sample score function with one-time imports:

import subprocess
subprocess.check_output('pip install gensim --user', shell=True)
import gensim

def score(input_data):
    return {'predictions': [{'fields': ['gensim_version'], 'values': [[gensim.__version__]]}]}

Creating deployable functions through the Python client

To persist a Python function as an asset, the Python client uses the wml_client.repository.store_function method. You can persist a Python function in two ways:

Persisting a function through a file that contains the Python function

This method is the same as persisting the Python function file through REST APIs (score must be defined in the scope of the Python source file). For details, refer to Creating deployable functions through REST API.

Important:

When you are calling the wml_client.repository.store_function method, pass the file name as the first argument.

Persisting a function through the function object

You can persist Python function objects by creating Python Closures with a nested function named score. The score function is returned by the outer function that is being stored as a function object, when called. This score function must meet the requirements that are listed in General requirements for deployable functions. In this case, any one time imports and initial setup logic must be added in the outer nested function so that they get executed during deployment and get used within the score function. Any recurring logic that is needed during the prediction request must be added within the nested score function.

Sample Python function save by using the Python client:

def my_deployable_function():

    import subprocess
    subprocess.check_output('pip install gensim', shell=True)
    import gensim

    def score(payload):
        message_from_input_payload = payload.get("input_data")[0].get("values")[0][0]
        response_message = "Received message - {0}".format(message_from_input_payload)

        # Score using the pre-defined model
        score_response = {
            'predictions': [{'fields': ['Response_message_field', 'installed_lib_version'],
                             'values': [[response_message, gensim.__version__]]
                            }]
        }
        return score_response

    return score

function_meta = {
    client.repository.FunctionMetaNames.NAME:"test_function",
    client.repository.FunctionMetaNames.SOFTWARE_SPEC_ID: sw_spec_id
}
func_details = client.repository.store_function(my_deployable_function, function_meta)

In this scenario, the Python client creates a Python file that contains the score function and persists that file as an asset in the Watson Machine Learning repository. The generated file exposes the score object by calling the outer function:

score = my_deployable_function()

Creating deployable functions in JupyterLab

If you create your Python function in JupyterLab, your code must contain a #wml_python_function comment. If the comment is missing, the function will be imported into Watson Studio as a script asset rather than a Python Function asset.

Refer to this example:

import json

from tornado.escape import json_encode, json_decode, url_escape

# Define scoring function
def callModel(payload_scoring):

    print(json.dumps(payload_scoring))
    predictions = []
    for input_entry in payload_scoring["input_data"]:
        sums = []
        for value in input_entry["values"]:
            first = value[0]
            second = value[1]
            sums.append([first, second, first + second])
        predictions.append({"fields": ["FIRST", "SECOND", "SUM"], "values": sums})

    return {"predictions": predictions}


#wml_python_function
def score(input):

    """AI function example.

    Example:
      {"input_data": [{"fields": ["FIRST", "SECOND"], "values": [[1, 2]]}]}
    """

    # Score using the pre-defined model
    prediction = callModel(input)

    return prediction

Accessing assets that are located in a deployment space

To access assets that are located in a deployment space, you must initialize ibm-watson-studio-lib. The way to do that depends on the scope.

Initialization in deployment scope

In deployment scope, you initialize ibm-watson-studio-lib without passing any additional parameters. See example code:

def my_deployable_function():
    from ibm_watson_studio_lib import access_project_or_space
    wslib = access_project_or_space()
    token = wslib.auth.get_current_token()
    def score( payload ):
        message_from_input_payload = payload.get("input_data")[0].get("values")[0][0]
        response_message = "Received message - {0}".format(message_from_input_payload)

        score_response = {
            'predictions': [{'fields': ['Response_message_field'],
                             'values': [[response_message]]
                            }]
        }
        return score_response

    return score

Initialization within the score function

In the score function, you must pass your bearer token as a parameter. See example code:

def my_deployable_function():
    from ibm_watson_studio_lib import access_project_or_space
    def score(payload , token):
        wslib = access_project_or_space({"token": token})

        message_from_input_payload = payload.get("input_data")[0].get("values")[0][0]
        response_message = "Received message - {0}".format(message_from_input_payload)

        score_response = {
            'predictions': [{'fields': ['Response_message_field'],
                             'values': [[response_message]]
                            }]
        }
        return score_response
    return score

Accessing data from within the score function

You might want to access data assets from within the score function. For example:

  • Remote data must be accessed by using the credentials of the user who is calling the score function.
  • In your specific use case, you can't use JDBC or the Watson Machine Learning Python client.

In these situations, if you generate a data ingestion code snippet from within your notebook, the code will fail.

See these code examples for reference on accessing data from within the score function:

def my_deployable_function():
    import itc_utils.flight_service as itcfs
    from ibm_watson_studio_lib import access_project_or_space

    def score(payload, token):
        # The token comes from the predictions request header.
        # Use it to initialize the Flight client inside score;
        # this ensures multi-tenancy of the predictions endpoint.
        user_wslib = access_project_or_space({"token": token})

        # Read the table named IRIS by using the provided connection.
        # The connection must be promoted to the space before deployment;
        # a connection named `db2_conn1` is expected to exist in the space.
        db_query = {
            'connection_name': 'db2_conn1',
            'interaction_properties': { 'table_name': 'IRIS'}
        }

        # Fetch data with Flight client
        flight_client = itcfs.get_flight_client(wslib=user_wslib)
        flight_info = itcfs.get_flight_info(flight_client, nb_data_request=db_query, wslib=user_wslib)
        df = itcfs.read_pandas_and_concat(flight_client, flight_info, timeout=240)

        # return the first 2 rows with the column names
        score_response = {
            "predictions": [{
                "fields": list(df.columns),
                "values": df.iloc[:2].values.tolist()
            }]
        }
        return score_response

    return score

The next example is equivalent, but retrieves the Flight descriptor and the flight information in separate steps:

def my_deployable_function():
    import itc_utils.flight_service as itcfs
    from ibm_watson_studio_lib import access_project_or_space

    def score(payload, token):
        # The token comes from the predictions request header.
        # Use it to initialize the Flight client inside score;
        # this ensures multi-tenancy of the predictions endpoint.
        user_wslib = access_project_or_space({"token": token})

        # Read the table named IRIS by using the provided connection.
        # The connection must be promoted to the space before deployment;
        # a connection named `db2_conn1` is expected to exist in the space.
        db_query = {
            'connection_name': 'db2_conn1',
            'interaction_properties': { 'table_name': 'IRIS'}
        }

        # Fetch data with Flight client
        flight_client = itcfs.get_flight_client(wslib=user_wslib)
        flight_descriptor = itcfs.get_flight_descriptor(nb_data_request=db_query, wslib=user_wslib)
        flight_info = flight_client.get_flight_info(flight_descriptor)
        df = itcfs.read_pandas_and_concat(flight_client, flight_info, timeout=240)

        # return the first 2 rows with the column names
        score_response = {
            "predictions": [{
                "fields": list(df.columns),
                "values": df.iloc[:2].values.tolist()
            }]
        }
        return score_response

    return score

Parent topic: Deploying Python functions