Running inference on a deployed model

Submit an inference job to a deployed model.

Procedure

  1. Obtain the REST URI for the deployed model:
    dlim model view model_name
    where model_name is the name of the model.
  2. Use the curl command to submit the inference job:
    curl -k -X POST -d 'JSON-input-data' -H "Authorization: Bearer `dlim config -s`" model-REST-URI
    where JSON-input-data is the input data for the inference request and model-REST-URI is the REST URI that you obtained in step 1.
    For example, for an image file, the command can look like this:
    (echo -n '{"id":1, "inputs":[ { "name":"gpu_0/data", "datatype":"BYTES","data":"';base64 -w 0 img_file;echo '" } ]}') | curl -k -X POST -d @- -H "Authorization: Bearer `dlim config -s`" https://wmla-inference.ibm.com/dlim/v1/inference/resnet18-pytorch
    {"id": 1, "outputs": [{"name": "output0", "datatype": "FP32", "shape": [1, 10], "data": [0.07383891940116882, -1.1525914669036865, 0.8941959142684937, 0.7748421430587769, 0.5024958848953247, 0.25152453780174255, -0.10158365964889526, 0.24579831957817078, -0.12934023141860962, -0.6191686391830444]}]}
    where img_file is the image file.
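The same request can also be scripted, which avoids the quoting pitfalls of the one-liner. The following is a minimal sketch that assumes GNU base64 and reuses the example model URI from above; TOKEN, URI, and DATA are illustrative shell variables, not dlim options. Because command substitution strips the trailing newline that base64 prints, the encoded data stays on a single line inside the JSON payload.
    # Illustrative variables (not dlim options): token, model URI, encoded image.
    TOKEN=$(dlim config -s)
    URI=https://wmla-inference.ibm.com/dlim/v1/inference/resnet18-pytorch
    DATA=$(base64 -w 0 img_file)
    # Post the JSON payload to the model REST URI (-k skips TLS verification).
    curl -k -X POST \
         -H "Authorization: Bearer ${TOKEN}" \
         -d '{"id":1, "inputs":[{"name":"gpu_0/data", "datatype":"BYTES", "data":"'"${DATA}"'"}]}' \
         "${URI}"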