Running inference on a deployed model
Submit an inference job on a deployed model.
Procedure
- To submit an inference operation on a deployed model, obtain the REST URI for that model:

  dlim model view model_name

  where model_name is the name of the model.
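  For example, if the deployed model is named resnet18-pytorch, as in the sample inference call in the next step, the command is:

  dlim model view resnet18-pytorch

  The command output includes the REST URI for the model, which is used in the next step.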
- Use the curl command to submit the inference job:
  curl -k -X POST -d 'JSON-input-data' -H "Authorization: Bearer `dlim config -s`" model-REST-URI

  where JSON-input-data is the input data for the service and model-REST-URI is the service URI.

  For example, for an image file the command can look like this:
  (echo -n '{"id":1, "inputs":[ { "name":"gpu_0/data", "datatype":"BYTES","data":"';base64 -w 0 img_file;echo '" } ]}') | curl -k -X POST -d @- -H "Authorization: Bearer `dlim config -s`" https://wmla-inference.ibm.com/dlim/v1/inference/resnet18-pytorch

  where img_file is the image file. The command returns output like the following:

  {"id": 1, "outputs": [{"name": "output0", "datatype": "FP32", "shape": [1, 10], "data": [0.07383891940116882, -1.1525914669036865, 0.8941959142684937, 0.7748421430587769, 0.5024958848953247, 0.25152453780174255, -0.10158365964889526, 0.24579831957817078, -0.12934023141860962, -0.6191686391830444]}]}
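  The response can also be post-processed on the command line. The following sketch is one way to do this, assuming the jq JSON processor is available; it reuses the model URI and image file from the example, captures the response in a shell variable, and prints only the score array:

  # Submit the same request as above; -s suppresses curl's progress output.
  RESPONSE=$( (echo -n '{"id":1, "inputs":[ { "name":"gpu_0/data", "datatype":"BYTES","data":"';base64 -w 0 img_file;echo '" } ]}') | curl -k -s -X POST -d @- -H "Authorization: Bearer `dlim config -s`" https://wmla-inference.ibm.com/dlim/v1/inference/resnet18-pytorch )
  # Extract the prediction scores from the first output tensor.
  echo "$RESPONSE" | jq '.outputs[0].data'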