Working with the watsonx.ai lightweight engine
You can work with foundation models in the lightweight installation mode by using the API. The watsonx.ai lightweight engine does not have a dedicated user interface. However, you can perform administrative tasks, such as monitoring platform resource usage, from the IBM Software Hub platform web client.
Generating API credentials
To perform generative AI tasks, such as inferencing foundation models or using embedding models to vectorize text, you use the watsonx API. Get API credentials that you can specify when you make requests to the watsonx.ai lightweight engine service to show that you are authorized to use the available methods.
To generate credentials, complete the following steps:
- To get an API key, generate the key in one of the following ways:
  - From the web client user interface. For more information, see Generating API keys for authentication.
  - Programmatically, by completing the following steps:
    1. Use the service details that you retrieved earlier to create a bearer token. For details, see Get authorization token.

       ```sh
       curl --request POST \
         --url "https://${CPD_URL}/icp4d-api/v1/authorize" \
         --header "Content-Type: application/json" \
         --data "{ \"username\": \"${CPDUSER}\", \"password\": \"${CPDUSER_PASSWORD}\" }"
       ```

       Copy the bearer token from the JSON that is returned.
    2. Replace `<TOKEN>` with the copied token to use as the bearer token in the following request. For details, see Get API key.

       ```sh
       curl -s -X GET \
         --url "${CPD_URL}/usermgmt/v1/user/apiKey" \
         --header "Authorization: Bearer <TOKEN>"
       ```

       Note: When you use the `usermgmt/v1/user/apiKey` endpoint, a new API key is created. If a key already exists, the existing key expires.

       Copy the API key that is generated.
- For the watsonx.ai REST API, use the API key to request new bearer tokens as needed. If you generated the API key programmatically, note that you use the same `${CPD_URL}/icp4d-api/v1/authorize` endpoint as before, but this time you submit the API key instead of a password to generate the bearer token. For details, see Get authorization token.

  ```sh
  curl --request POST \
    --url "https://${CPD_URL}/icp4d-api/v1/authorize" \
    --header "Content-Type: application/json" \
    --data "{ \"username\": \"${CPDUSER}\", \"api_key\": \"<APIKEY>\" }"
  ```

  Copy the bearer token from the `access_token` section of the JSON response.
- When you submit requests, include the bearer token that was returned in the previous step.

  ```sh
  curl --request POST \
    --url "https://${CPD_URL}/ml/v1/text/generation?version=2023-05-29" \
    --header "Accept: application/json" \
    --header "Authorization: Bearer <TOKEN>" \
    --header "Content-Type: application/json" \
    ...
  ```
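The token exchange described in these steps can be sketched in Python by using only the standard library. The function names and the example host in this sketch are illustrative, not part of the watsonx.ai API; the endpoint and request body match the curl examples.

```python
import json
import urllib.request

def build_token_request(cpd_url, username, api_key):
    """Build the HTTP request that exchanges an API key for a bearer token."""
    body = json.dumps({"username": username, "api_key": api_key}).encode("utf-8")
    return urllib.request.Request(
        url=f"https://{cpd_url}/icp4d-api/v1/authorize",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def get_bearer_token(cpd_url, username, api_key):
    """POST the request and return the access_token field of the JSON reply."""
    with urllib.request.urlopen(build_token_request(cpd_url, username, api_key)) as resp:
        return json.load(resp)["access_token"]
```

Because bearer tokens expire, a script typically calls a helper like `get_bearer_token` once per session and refreshes the token when requests start returning authorization errors.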
Inferencing a foundation model
Submit a REST API request by using the watsonx.ai API to inference a foundation model programmatically. You must add the curated or custom foundation models that you want to use for text generation before you can use the API to inference them.
Get a list of the available foundation models
Any custom foundation models that you added are accessible from the same endpoint as any curated foundation models that you added to the service. For more information, see Foundation model specs.
```sh
curl -k -X GET \
  --url "https://${CPD_URL}/ml/v1/foundation_model_specs?version=2024-07-23&limit=50" \
  --header "Authorization: Bearer <TOKEN>"
```
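If you script this call, you can pull the model IDs out of the response. This sketch assumes that the response JSON lists models in a `resources` array whose entries carry a `model_id` field; check the Foundation model specs reference and adjust if your response differs.

```python
def list_model_ids(specs_json):
    """Extract model IDs from a foundation_model_specs response.

    Assumes the shape {"resources": [{"model_id": ...}, ...]}.
    """
    return [entry["model_id"] for entry in specs_json.get("resources", [])]

# Example response fragment with two models added to the service
sample = {"resources": [{"model_id": "ibm/granite-13b-chat-v2"},
                        {"model_id": "tiiuae/falcon-7b"}]}
print(list_model_ids(sample))  # ['ibm/granite-13b-chat-v2', 'tiiuae/falcon-7b']
```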
Inference a foundation model
```sh
curl -k -X POST \
  --url "https://${CPD_URL}/ml/v1/text/generation?version=2024-07-23" \
  --header "Authorization: Bearer <TOKEN>" \
  --header "Content-Type: application/json" \
  --data "{
    \"model_id\": \"ibm/granite-13b-chat-v2\",
    \"input\": \"Tell me about mortgage insurance.\"
  }"
```
Inference a custom foundation model
```sh
curl -k -X POST \
  --url "https://${CPD_URL}/ml/v1/text/generation?version=2024-07-23" \
  --header "Authorization: Bearer <TOKEN>" \
  --header "Content-Type: application/json" \
  --data "{
    \"model_id\": \"tiiuae/falcon-7b\",
    \"input\": \"Tell me about mortgage insurance.\"
  }"
```
For more information about the text generation method, see Text generation.
Supported REST API methods
In addition to text generation, you can use the following watsonx.ai REST API method from the watsonx.ai lightweight engine:
- Text embeddings. You must add embedding models before you can vectorize text by using the API.
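As a sketch, a text embeddings request body pairs a `model_id` with a list of `inputs`, which you POST to the `/ml/v1/text/embeddings` endpoint with the same bearer-token headers as the text generation examples. The helper name and the model ID below are illustrative; substitute an embedding model that you added to the service.

```python
import json

def build_embeddings_payload(model_id, texts):
    """Build the JSON body for a text embeddings request."""
    return {"model_id": model_id, "inputs": list(texts)}

payload = build_embeddings_payload(
    "ibm/slate-30m-english-rtrvr",  # example only: use an embedding model you added
    ["What is mortgage insurance?"],
)
print(json.dumps(payload))
```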
Using the Python library
You can use the IBM watsonx.ai Python library to work with foundation models that you deploy from the watsonx.ai lightweight engine.
For more information, see Using the Python library from the watsonx.ai lightweight engine.
Evaluating a prompt template
You can inference a foundation model that is hosted in the watsonx.ai lightweight engine with various inputs and store the model output in a CSV file. You can then import the CSV file to a cluster where watsonx.governance is installed and evaluate the model output as a detached prompt template.
The high-level steps to evaluate a prompt template are as follows:
- Create a CSV file named `prompt_data.csv` to use as the starting point. Define values for the `city_name` prompt variable to cycle through from the code.

  |   | city_name     | generated_text | input_token_count | generated_token_count |
  |---|---------------|----------------|-------------------|-----------------------|
  | 0 | New York City |                |                   |                       |
  | 1 | London        |                |                   |                       |
  | 2 | Tokyo         |                |                   |                       |
  | 3 | Stockholm     |                |                   |                       |
- Generate a bearer token that you can specify with REST requests. See Credentials for programmatic access.
- Define a method for submitting a POST request to the text generation method of the watsonx.ai API. In the example in Step 4, this method is referred to as `<your-text-generation-method>`. See Inferencing a foundation model programmatically.
- Use the `pandas` library to work with structured data in a CSV file. Add the model output to the `generated_text` column, and the token counts to the `input_token_count` and `generated_token_count` columns. For each input that is submitted, the following code example adds the generated output and token counts to the CSV file.
```python
import pandas as pd

test_df = pd.read_csv("prompt_data.csv")

token = "<specify-your-token>"  # bearer token from Step 2

generated_text = []
input_token_count = []
generated_token_count = []

# Inference the foundation model once for each city_name value
for city in test_df["city_name"]:
    payload = {
        "model_id": "mistralai/mixtral-8x7b-instruct-v01",
        "input": f"Describe the must-see attractions for visiting {city} as a tourist."
    }
    scored_response = <your-text-generation-method>(CPD_URL, payload, token)
    generated_text.append(scored_response["results"][0]["generated_text"])
    input_token_count.append(scored_response["results"][0]["input_token_count"])
    generated_token_count.append(scored_response["results"][0]["generated_token_count"])

# Record the model output and token counts, then write the evaluation file
test_df["generated_text"] = generated_text
test_df["input_token_count"] = input_token_count
test_df["generated_token_count"] = generated_token_count

test_df.to_csv("custom_detached_test_prompt_data.csv", index=False)
test_df.head()
```
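A helper that stands in for `<your-text-generation-method>` might look like the following sketch, which uses only the Python standard library. The function names are illustrative; the endpoint, headers, and payload shape match the curl examples in Inferencing a foundation model.

```python
import json
import urllib.request

def build_generation_request(cpd_url, payload, token, version="2024-07-23"):
    """Build the HTTP request for the text generation endpoint."""
    return urllib.request.Request(
        url=f"https://{cpd_url}/ml/v1/text/generation?version={version}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Accept": "application/json",
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

def generate_text(cpd_url, payload, token):
    """POST the payload and return the parsed JSON response."""
    with urllib.request.urlopen(build_generation_request(cpd_url, payload, token)) as resp:
        return json.load(resp)
```

In the evaluation code above, `generate_text` would be called in place of `<your-text-generation-method>(CPD_URL, payload, token)`.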
You can import the generated CSV file into a full installation of the watsonx.governance service and evaluate the model by following the instructions in the Evaluating detached prompt templates in projects procedure.
Parent topic: Developing generative AI solutions