Getting started with watsonx.ai Lightweight Engine
5.0.1 or later

After you install the watsonx.ai lightweight engine and add foundation models to your cluster, you can work with the models by using the API.
Accessing the web client
You can perform administrative tasks such as monitoring platform resource use from the web client.
- Use the following command to get details for the service:

  ```
  cpd-cli manage get-cpd-instance-details \
  --cpd_instance_ns=${PROJECT_CPD_INST_OPERANDS} \
  --get_admin_initial_credentials=true
  ```

  The output includes the following information:
  - Base URL: example.clustername.cluster.domain
  - User name: cpadmin
  - Password: <password>
- Save the service details as environment variables so you can reference the details later.

  ```
  export CPDUSER=cpadmin
  export CPDUSER_PASSWORD=<password>
  export CPD_URL=example.clustername.cluster.domain
  ```
- To access the web client, open a web page from a URL with the following syntax:

  ```
  https://${CPD_URL}/zen
  ```
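As a quick check, the web-client URL can be assembled from the `CPD_URL` value in code. A minimal sketch; the helper name is illustrative:

```python
def web_client_url(cpd_url: str) -> str:
    """Build the web client URL (the /zen path) from the base cluster URL."""
    # Tolerate a scheme or trailing slash in the stored value
    host = cpd_url.removeprefix("https://").removeprefix("http://").rstrip("/")
    return f"https://{host}/zen"

print(web_client_url("example.clustername.cluster.domain"))
# → https://example.clustername.cluster.domain/zen
```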
To perform generative AI tasks, such as inferencing foundation models or using embedding models to vectorize text, you use the IBM watsonx.ai API.
Generating API credentials
- To get an API key, you can generate the key in one of the following ways:
  - From the web client user interface. For more information, see Generating API keys for authentication.
  - Programmatically, by completing the following steps:
    1. Use the service details that you retrieved earlier to create a bearer token. For details, see the Get authorization token API in the IBM Cloud Pak for Data platform API documentation.

       ```
       curl --request POST \
         --url "https://${CPD_URL}/icp4d-api/v1/authorize" \
         --header "Content-Type: application/json" \
         --data "{ \"username\": \"${CPDUSER}\", \"password\": \"${CPDUSER_PASSWORD}\" }"
       ```

       Copy the bearer token from the JSON that is returned.
    2. Replace TOKEN with the copied token to use as the bearer token in the following request. For details, see the Get API key API in the IBM Cloud Pak for Data platform API documentation.

       ```
       curl -s -X GET \
         --url "https://${CPD_URL}/usermgmt/v1/user/apiKey" \
         --header "Authorization: Bearer <TOKEN>"
       ```

       Note: When you use the usermgmt/v1/user/apiKey endpoint, a new API key is created, and any existing key expires.

       Copy the API key that is generated.
- For the IBM watsonx.ai REST API, use the API key to request new bearer tokens as needed.

  If you generated the API key programmatically, notice that you use the same ${CPD_URL}/icp4d-api/v1/authorize endpoint that was used earlier, but this time you submit the API key instead of a password to generate the bearer token. For details, see the Get authorization token API in the IBM Cloud Pak for Data platform API documentation.

  ```
  curl --request POST \
    --url "https://${CPD_URL}/icp4d-api/v1/authorize" \
    --header "Content-Type: application/json" \
    --data "{ \"username\": \"${CPDUSER}\", \"api_key\": \"<APIKEY>\" }"
  ```

  Copy the bearer token from the access_token section of the JSON response.
- When you submit requests, include the bearer token that was returned in the previous step.

  ```
  curl --request POST \
    --url "https://${CPD_URL}/ml/v1/text/generation?version=2023-05-29" \
    --header "Accept: application/json" \
    --header "Authorization: Bearer <TOKEN>" \
    --header "Content-Type: application/json" \
    ...
  ```
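For scripts that refresh tokens from the API key, the request body and the common headers can live in small helpers. A sketch; the helper names are illustrative:

```python
def api_key_auth_payload(username: str, api_key: str) -> dict:
    """Body for the authorize endpoint when exchanging an API key for a bearer token."""
    return {"username": username, "api_key": api_key}

def bearer_headers(token: str) -> dict:
    """Headers to attach to every watsonx.ai REST request."""
    return {
        "Accept": "application/json",
        "Content-Type": "application/json",
        "Authorization": f"Bearer {token}",
    }
```

Each request then passes `bearer_headers(token)`; when a request fails with an authentication error, exchange the API key for a fresh token and retry.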
Inferencing a foundation model programmatically
Submit a REST API request by using the IBM watsonx.ai API to inference a foundation model programmatically.
You must add the curated or custom foundation models that you want to use for text generation before you can use the API to inference them. See Adding foundation models or Adding custom foundation models.
- Get a list of the available foundation models.

  Any custom foundation models that you added are available from the same endpoint as any curated foundation models that you added to the service. For more information, see Foundation model specs.

  ```
  curl -k -X GET \
    --url "https://${CPD_URL}/ml/v1/foundation_model_specs?version=2024-07-23&limit=50"
  ```
- Inference a foundation model.

  ```
  curl -k -X POST \
    --url "https://${CPD_URL}/ml/v1/text/generation?version=2024-07-23" \
    --header "Authorization: Bearer <TOKEN>" \
    --header "Content-Type: application/json" \
    --data '{ "model_id": "ibm/granite-13b-chat-v2", "input": "Tell me about mortgage insurance." }'
  ```
- Inference a custom foundation model.

  ```
  curl -k -X POST \
    --url "https://${CPD_URL}/ml/v1/text/generation?version=2024-07-23" \
    --header "Authorization: Bearer <TOKEN>" \
    --header "Content-Type: application/json" \
    --data '{ "model_id": "tiiuae/falcon-7b", "input": "Tell me about mortgage insurance." }'
  ```
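The generation requests above share the same shape, so the URL and body can be built in code. A sketch; note that no project_id is included, because projects are not used in the lightweight engine:

```python
def generation_url(cpd_url: str, version: str = "2024-07-23") -> str:
    """Endpoint for the text generation method of the watsonx.ai API."""
    return f"https://{cpd_url}/ml/v1/text/generation?version={version}"

def generation_payload(model_id: str, prompt: str) -> dict:
    """Request body; the same shape works for curated and custom models."""
    # No project_id: projects are not used in the watsonx.ai lightweight engine
    return {"model_id": model_id, "input": prompt}

payload = generation_payload("ibm/granite-13b-chat-v2",
                             "Tell me about mortgage insurance.")
```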
Attention: Omit the project_id that is shown in the API reference examples. Projects are not used in the watsonx.ai lightweight engine.

Supported REST API methods

- Text embeddings

  You must add embedding models before you can vectorize text by using the API. See Adding foundation models.

  Attention: Omit the project_id that is shown in the API reference examples. Projects are not used in the watsonx.ai lightweight engine.
Using the Python library
You can use the IBM watsonx.ai Python library to work with foundation models that you deploy from the watsonx.ai lightweight engine.
For more information, see Python library.
Evaluating a prompt template
You can inference a foundation model that is hosted in the watsonx.ai lightweight engine with various inputs and store the model output in a CSV file. You can then import the CSV file to a cluster where watsonx.governance is installed and evaluate the model output as a detached prompt template.
- Create a CSV file named prompt_data.csv to use as the starting point.

  Define values for the city_name prompt variable to cycle through from the code.

  | city_name | generated_text | input_token_count | generated_token_count |
  | --- | --- | --- | --- |
  | New York City | | | |
  | London | | | |
  | Tokyo | | | |
  | Stockholm | | | |
- Generate a bearer token that you can specify with REST requests.
- Define a method for submitting a POST request to the text generation method of the IBM watsonx.ai API. In the example in Step 4, this method is referred to as `<your-text-generation-method>`.
- Use the pandas library to work with structured data in a CSV file. Add the model output to the generated_text column, and the token counts to the input_token_count and generated_token_count columns. The following code example adds information about the output that is generated by the foundation model for each input that is submitted to the CSV file.

  ```python
  import pandas as pd

  test_df = pd.read_csv("prompt_data.csv")
  token = <specify-your-token>

  generated_text = []
  input_token_count = []
  generated_token_count = []

  for city in test_df["city_name"]:
      payload = {
          "model_id": "mistralai/mixtral-8x7b-instruct-v01",
          "input": f"Describe the must-see attractions for visiting {city} as a tourist."
      }
      scored_response = <your-text-generation-method>(CPD_URL, payload, token)
      generated_text.append(scored_response["results"][0]["generated_text"])
      input_token_count.append(scored_response["results"][0]["input_token_count"])
      generated_token_count.append(scored_response["results"][0]["generated_token_count"])

  test_df["generated_text"] = generated_text
  test_df["input_token_count"] = input_token_count
  test_df["generated_token_count"] = generated_token_count

  test_df.to_csv("custom_detached_test_prompt_data.csv", index=False)
  test_df.head()
  ```

  You can import the generated CSV file into a full installation of the watsonx.governance service and evaluate the model by following the instructions in the Evaluating detached prompt templates in projects procedure.
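The `<your-text-generation-method>` placeholder above is left for you to define. One possible shape, using only the standard library; the response parsing matches the results fields that the script records:

```python
import json
import ssl
import urllib.request

def text_generation(cpd_url, payload, token, version="2024-07-23"):
    """POST a generation payload to the watsonx.ai API and return the decoded JSON."""
    req = urllib.request.Request(
        f"https://{cpd_url}/ml/v1/text/generation?version={version}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Accept": "application/json",
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )
    ctx = ssl._create_unverified_context()  # self-signed cluster certificates
    with urllib.request.urlopen(req, context=ctx) as resp:
        return json.load(resp)

def first_result(response_json):
    """Pull the three fields that the CSV script records from a generation response."""
    r = response_json["results"][0]
    return r["generated_text"], r["input_token_count"], r["generated_token_count"]
```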