gRPC for elastic distributed inference
What is included
For an elastic distributed inference service, there are two ways to authenticate a stream connection between client and server: with a user name and password, or with a token.

The auth() method authenticates the user by user name and password and returns an authorized token. The stream_infer() method transmits data to the service and performs inferencing using the authorization token.
syntax = "proto3";
package redhare;

// The inference service definition.
service Inference {
  // Authenticate a user.
  rpc auth(AuthRequest) returns (AuthResponse) {}
  // Create an inference stream.
  rpc stream_infer (stream Request) returns (stream Response) {}
}

message AuthRequest {
  // Key strings: "model_name", "userpasswd", "token"
  map<string, string> attributes = 1;
}

message AuthResponse {
  string url = 1;
  string token = 2;
}

message Request {
  oneof request_oneof {
    bytes data = 1;
    string token = 2;
  }
}

message Response {
  bytes data = 1;
}
Client authentication
rpc auth(AuthRequest) returns (AuthResponse) {}

The auth() method takes the following input:
- The name of the model to be used
- The user name and password of a user that has login access to the inference service

If authentication succeeds, the response contains:
- The URL of the inference service for the specified model
- A token for further authentication
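The attributes map described above can be sketched in plain Python. The helper name build_auth_attributes is hypothetical, and a plain dict stands in for the generated stream_pb2.AuthRequest attributes field so the sketch runs without the compiled protocol buffers; the "<user>:<password>" pairing and base64 encoding follow the authentication example later in this document.

```python
import base64

# Hypothetical helper: build the attribute map that auth() expects.
# A plain dict stands in for stream_pb2.AuthRequest().attributes here.
def build_auth_attributes(model_name, user, password):
    """Join user and password as "<user>:<password>" and base64-encode it."""
    userpasswd = base64.b64encode(f"{user}:{password}".encode()).decode()
    return {"model_name": model_name, "userpasswd": userpasswd}

attrs = build_auth_attributes("darkflow", "Admin", "Admin")
print(sorted(attrs))  # → ['model_name', 'userpasswd']
```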
Creating an inference stream for a client
rpc stream_infer (stream Request) returns (stream Response) {}

To create an inference stream, each Request must carry either a token or data: a token that was received from a previous authentication, or data to be transmitted to the inference service.

First, transmit a token through this API. If the token passes authentication, a stream connection is established. Then, transmit data continuously over this connection; the server processes the data and returns the inference results.
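The token-first ordering described above maps naturally onto a request generator, since a gRPC Python stream-stream call consumes an iterator of Request messages. This is a minimal sketch of that ordering; the Request stand-in class below replaces the generated stream_pb2.Request so the sketch is self-contained, and the names request_stream and frames are illustrative.

```python
# Sketch of the token-then-data ordering. In a real client, Request
# would be the generated stream_pb2.Request message type.
class Request:  # stand-in for stream_pb2.Request
    def __init__(self):
        self.token = ""
        self.data = b""

def request_stream(token, frames):
    """Yield a token Request first, then one Request per data frame."""
    auth = Request()
    auth.token = token
    yield auth                    # server validates the token first
    for frame in frames:
        req = Request()
        req.data = frame
        yield req                 # then inference data flows continuously

# The resulting iterator would be passed to stub.stream_infer(...).
msgs = list(request_stream("abc123", [b"frame1", b"frame2"]))
```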
Using gRPC on client side
Before starting to use gRPC, make sure of the following:
- An inference model kernel has been published using elastic distributed inference in the cluster management console. In this example, the model used is named darkflow. To learn more about creating a kernel file, see Create a kernel file for an inference service.
- The LBD service has been started and the host IP and port information is found in ${REDHARE_TOP}/conf/dlim.conf. For example:
  dlim.lbd.ip = 9.3.89.250
  dlim.lbd.stream.port = 9010
  If you do not have access, contact the cluster administrator for this information.
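A client can assemble the channel endpoint from those two properties. This is a minimal sketch, assuming dlim.conf uses simple "key = value" lines as in the example above; the function name parse_dlim_conf is illustrative, not part of the product.

```python
# Hypothetical helper: read "key = value" properties from dlim.conf text
# and build the "<ip>:<port>" endpoint for grpc.insecure_channel().
def parse_dlim_conf(text):
    props = {}
    for line in text.splitlines():
        if "=" in line:
            key, _, value = line.partition("=")
            props[key.strip()] = value.strip()
    return props

conf = parse_dlim_conf("dlim.lbd.ip = 9.3.89.250\ndlim.lbd.stream.port = 9010")
endpoint = f'{conf["dlim.lbd.ip"]}:{conf["dlim.lbd.stream.port"]}'
print(endpoint)  # → 9.3.89.250:9010
```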
- Compile the stream protocol buffers and import the generated classes.
  - Copy the content of stream.proto and compile it:

    python -m grpc_tools.protoc --python_out=. --grpc_python_out=. stream.proto

  - The compilation generates two Python files, stream_pb2.py and stream_pb2_grpc.py. Import them to use the gRPC APIs:

    import stream_pb2
    import stream_pb2_grpc
- Authenticate the client using a user name and password. For example:

    channel = grpc.insecure_channel('9.3.89.250:9010')
    stub = stream_pb2_grpc.InferenceStub(channel)
    request = stream_pb2.AuthRequest()
    request.attributes["model_name"] = "darkflow"
    request.attributes["userpasswd"] = base64.b64encode(b"<name>:<password>").decode()
    try:
        response = stub.auth(request)
    except grpc.RpcError as e:
        print("authentication failed, due to: " + e.details())
        return
    else:
        sendstream(response.token, response.url, video_path, quality, fps)

  - Connect to the lbd service and get the stub.
  - Input the model name, user name, and password into an AuthRequest. The user name and password are joined as "<name>:<password>" and base64-encoded.
  - Call the auth() method to pass the authentication information to the server and save the result in the response.
  - If authentication passes, the response returns the URL of the model inference service and an authorized token.
- Create an inference stream for the client.
  - Connect to the lbd model and get the stub:

    channel = grpc.insecure_channel(url)
    stub = stream_pb2_grpc.InferenceStub(channel)

  - Call the stream_infer() method to pass the token and then the data to the server, and save the result in the responses.
    - Pass the token to establish a connection:

      auth_request = stream_pb2.Request()
      auth_request.token = self.token
      ...
      responses = self.stub.stream_infer(auth_request)

    - Once the connection is ready, pass inference data to the server for inferencing:

      tinput['key'] = self.count
      tinput['data'] = base64.b64encode(imgData.tobytes()).decode("utf-8")
      request.data = json.dumps(tinput)
      ...
      responses = self.stub.stream_infer(request)

- An inference stream is now running using gRPC.
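The per-frame JSON payload used in the data step above pairs a frame index ("key") with base64-encoded bytes ("data"). This sketch shows that framing as a round trip; the helper names encode_frame and decode_frame are illustrative, not part of the product API.

```python
import base64
import json

# Hypothetical helpers mirroring the payload framing in the example:
# {"key": <frame index>, "data": <base64 of raw bytes>} as a JSON string.
def encode_frame(index, raw_bytes):
    payload = {"key": index,
               "data": base64.b64encode(raw_bytes).decode("utf-8")}
    return json.dumps(payload)

def decode_frame(payload_json):
    payload = json.loads(payload_json)
    return payload["key"], base64.b64decode(payload["data"])

msg = encode_frame(0, b"\x00\x01image-bytes")
print(decode_frame(msg))  # → (0, b'\x00\x01image-bytes')
```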