In this tutorial, you will build a generative AI-powered personal trainer. This AI trainer uses the open source Meta Llama 4 Scout large language model (LLM) to process image input and generate personalized workout plans that help you reach your fitness goals. We will access the model through the IBM watsonx.ai™ API.
Are you looking to be more active? Has your fitness journey reached a plateau, and do you want to take your workout routines to the next level? In this tutorial, you will use AI to provide a personalized training experience.
Our workout app is composed of the following stages:
The user uploads images of their current workout equipment, one item at a time. This set can include both home and gym equipment.
The user selects the following criteria: workout type, workout length and fitness level.
Upon submission of the input, the multimodal Llama 4 Scout model iterates over the list of images and returns a description, category and workout type for each item.
The same Llama 4 model then serves as a fitness coach. The LLM uses the previous output to provide a training plan that is suitable for the user’s selections.
The training program and the images of the recommended equipment are then returned to the user.
You need an IBM Cloud® account to create a watsonx.ai™ project.
To use the watsonx application programming interface (API), you need to complete the following steps. Note that you can also access this tutorial on GitHub.
While you can choose from several tools, this tutorial walks you through how to set up an IBM account to use a Jupyter Notebook.
Log in to watsonx.ai using your IBM Cloud account.
Create a watsonx.ai project.
You can get your project ID from within your project. Click the Manage tab. Then, copy the project ID from the Details section of the General page. You need this ID for this tutorial.
Create a Jupyter Notebook.
This step will open a Notebook environment where you can copy the code from this tutorial. Alternatively, you can download this notebook to your local system and upload it to your watsonx.ai project as an asset. This Jupyter Notebook along with the images used can be found on GitHub.
Create a watsonx.ai Runtime service instance (choose the Lite plan, which is a free instance).
Generate an API Key.
Associate the watsonx.ai Runtime service to the project that you created in watsonx.ai.
We need a few libraries and modules for this tutorial. Make sure to import the following ones; if they're not installed, you can resolve this issue with a quick pip installation.
# Install required packages (Pillow provides the PIL module imported below)
!pip install -q pillow ibm-watsonx-ai

# Required imports
import base64
import getpass
import json
import os

from ibm_watsonx_ai import Credentials
from ibm_watsonx_ai.foundation_models import ModelInference
from PIL import Image
To set our credentials, we need the WATSONX_APIKEY and WATSONX_PROJECT_ID values from the earlier setup steps. We also set the URL that serves as the API endpoint.
WATSONX_APIKEY = getpass.getpass("Please enter your watsonx.ai Runtime API key (hit enter): ")
WATSONX_PROJECT_ID = getpass.getpass("Please enter your project ID (hit enter): ")
URL = "https://us-south.ml.cloud.ibm.com"
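Prompting with getpass works well in an interactive notebook, but it can be inconvenient when the notebook is rerun often or executed unattended. As a minimal sketch, assuming the same values are exported as environment variables, the hypothetical read_secret helper below (not part of the watsonx.ai SDK) checks the environment first and only prompts when nothing is set.

```python
import getpass
import os


def read_secret(name, prompt, env=None, ask=getpass.getpass):
    """Return environment variable `name` if set, otherwise prompt for it.

    `read_secret` is a hypothetical helper for this tutorial, not an SDK call.
    """
    env = os.environ if env is None else env
    value = env.get(name, "").strip()
    if value:
        return value
    # Fall back to an interactive prompt that does not echo the input
    return ask(prompt).strip()


# Possible usage in the notebook (commented out so nothing prompts on import):
# WATSONX_APIKEY = read_secret("WATSONX_APIKEY", "Please enter your watsonx.ai Runtime API key (hit enter): ")
# WATSONX_PROJECT_ID = read_secret("WATSONX_PROJECT_ID", "Please enter your project ID (hit enter): ")
```

The env and ask parameters exist only to make the helper easy to exercise without real credentials.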
We can use the Credentials class to encapsulate the endpoint URL and the API key.
credentials = Credentials(
    url=URL,
    api_key=WATSONX_APIKEY
)
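The augment_api_request_body function takes the user query and a Base64-encoded image as parameters and builds the message body of the API request. We will call this function once per image.
def augment_api_request_body(user_query, image):
    messages = [
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": user_query
                },
                {
                    "type": "image_url",
                    "image_url": {
                        "url": f"data:image/jpeg;base64,{image}"
                    }
                }
            ]
        }
    ]
    return messages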
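The function above always labels the data URL as image/jpeg, while the image loader later in this tutorial also accepts .png files. A minimal sketch of one way to keep the label in step with the file type follows. It uses the standard library's mimetypes module, and both the build_image_message name and the extra filename parameter are assumptions made for illustration.

```python
import mimetypes


def build_image_message(user_query, image_b64, filename):
    """Build a chat message whose data URL matches the image's file type."""
    mime_type, _ = mimetypes.guess_type(filename)
    if mime_type is None or not mime_type.startswith("image/"):
        # Assumption: fall back to JPEG, which is what the original function uses
        mime_type = "image/jpeg"
    return [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": user_query},
                {
                    "type": "image_url",
                    "image_url": {"url": f"data:{mime_type};base64,{image_b64}"},
                },
            ],
        }
    ]
```

The returned structure is the same as in augment_api_request_body, so it could be passed to the chat call in the same way.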
We can also instantiate the model interface by using the ModelInference class.
model = ModelInference(
    model_id="meta-llama/llama-4-scout-17b-16e-instruct",
    credentials=credentials,
    project_id=WATSONX_PROJECT_ID,
    params={
        "max_tokens": 128000,
        "temperature": 0
    }
)
To pass our images to the LLM, we encode their bytes as Base64 and then decode the result to a UTF-8 string. In this case, our images are located in the local images directory.
directory = "images"  # directory name
images = []
filenames = []
for filename in os.listdir(directory):
    if filename.endswith((".jpeg", ".png")):
        filepath = os.path.join(directory, filename)
        with open(filepath, "rb") as f:
            images.append(base64.b64encode(f.read()).decode("utf-8"))
        filenames.append(filename)
        print(filename)
Output:
image0.jpeg
image1.jpeg
image6.jpeg
image7.jpeg
image10.jpeg
image8.jpeg
image4.jpeg
image5.jpeg
image9.jpeg
image2.jpeg
image3.jpeg
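The listing above is not in numeric order because os.listdir returns entries in arbitrary order. If a stable, human-friendly order is preferred, one option is a natural sort. The natural_key helper below is a hypothetical sketch, not part of the original notebook.

```python
import re


def natural_key(filename):
    """Split a name into text and integer chunks so 'image2' sorts before 'image10'."""
    return [
        int(part) if part.isdigit() else part.lower()
        for part in re.split(r"(\d+)", filename)
    ]


names = ["image10.jpeg", "image2.jpeg", "image0.jpeg", "image8.jpeg"]
print(sorted(names, key=natural_key))
# ['image0.jpeg', 'image2.jpeg', 'image8.jpeg', 'image10.jpeg']
```

In the loading loop, this would amount to iterating over sorted(os.listdir(directory), key=natural_key).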
Now that we have loaded and encoded our images, we can query the vision model. Our prompt spells out the desired output to limit the model's creativity, because we want valid JSON. We will store the description, category and workout type of each image in a list called image_descriptions.
user_query = """Provide a description, category, and workout type for the kinds of exercise equipment in each image, e.g. "barbell", "dumbbell", "machine", "bodyweight", etc.
Classify the description as "equipment" or "other".
Classify the category as "barbell", "dumbbell", "machine", "bodyweight", etc.
Classify the workout type as "strength", "endurance", "flexibility", "balance", "cardio", etc.
Ensure the output is valid JSON. Do not create new categories or workout types. Only use the allowed classifications.
Your response should be in this schema:
{
    "description": "<description>",
    "category": "<category>",
    "workout_type": "<workout_type>"
}
"""
image_descriptions = []
for image in images:
    message = augment_api_request_body(user_query, image)
    response = model.chat(messages=message)
    result = response['choices'][0]['message']['content']
    print(result)
    image_descriptions.append(result)
Output:
{
"description": "elliptical trainer",
"category": "machine",
"workout_type": "cardio"
}
```json
{
"description": "treadmill",
"category": "machine",
"workout_type": "cardio"
}
```
```
{
"description": "exercise bike",
"category": "machine",
"workout_type": "cardio"
}
```
```json
{
"description": "A ballet barre",
"category": "barre",
"workout_type": "strength"
}
```
```json
{
"description": "Stairmaster",
"category": "machine",
"workout_type": "cardio"
}
```
```json
{
"description": "Pilates reformer",
"category": "machine",
"workout_type": "strength"
}
```
```json
{
"description": "barbell",
"category": "barbell",
"workout_type": "strength"
}
```
```json
{
"description": "A weightlifting bench with a barbell rack and weights",
"category": "barbell",
"workout_type": "strength"
}
```
```json
{
"description": "A dumbbell with multiple weight plates",
"category": "dumbbell",
"workout_type": "strength"
}
```
```json
{
"description": "rowing machine",
"category": "machine",
"workout_type": "cardio"
}
```
```json
{
"description": "yoga mat",
"category": "other",
"workout_type": "flexibility"
}
```
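The prompt asks the model to stay within the allowed classifications, yet the responses above include a "barre" category and an "other" category. Once each response has been parsed into a dictionary, a small check can flag such labels. The sketch below is hypothetical: the two vocabularies are assumptions taken from the examples named in the prompt, which itself leaves the lists open with "etc.".

```python
# Assumed vocabularies, copied from the examples listed in the prompt
ALLOWED_CATEGORIES = {"barbell", "dumbbell", "machine", "bodyweight"}
ALLOWED_WORKOUT_TYPES = {"strength", "endurance", "flexibility", "balance", "cardio"}


def find_unexpected_labels(item):
    """Return {field: value} for labels that fall outside the assumed vocabularies."""
    unexpected = {}
    if item.get("category") not in ALLOWED_CATEGORIES:
        unexpected["category"] = item.get("category")
    if item.get("workout_type") not in ALLOWED_WORKOUT_TYPES:
        unexpected["workout_type"] = item.get("workout_type")
    return unexpected


barre = {"description": "A ballet barre", "category": "barre", "workout_type": "strength"}
print(find_unexpected_labels(barre))
# {'category': 'barre'}
```

Whether to reject, remap or simply log such items is a design choice that depends on how strict the app needs to be.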
To align the filenames with the image descriptions, we can enumerate the list of image descriptions and create a list of dictionaries. These dictionaries store the description, category, workout type and filename of each item in the respective fields.
# Add filenames to the image descriptions
gym_equipment = []
for i, desc in enumerate(image_descriptions):
    # Clean up the string by removing markdown code block markers and 'json' identifier
    cleaned_desc = desc.strip()
    if cleaned_desc.startswith('```'):
        cleaned_desc = cleaned_desc.split('```')[1]  # Remove opening ```
    if cleaned_desc.startswith('json'):
        cleaned_desc = cleaned_desc[4:]  # Remove 'json' identifier
    cleaned_desc = cleaned_desc.split('```')[0]  # Remove closing ```
    cleaned_desc = cleaned_desc.strip()

    desc_dict = json.loads(cleaned_desc)
    desc_dict['filename'] = filenames[i]
    image_descriptions[i] = json.dumps(desc_dict)

gym_equipment = [json.loads(js) for js in image_descriptions]
print(gym_equipment)
Output:
[{'description': 'elliptical trainer', 'category': 'machine', 'workout_type': 'cardio', 'filename': 'image0.jpeg'}, {'description': 'treadmill', 'category': 'machine', 'workout_type': 'cardio', 'filename': 'image1.jpeg'}, {'description': 'exercise bike', 'category': 'machine', 'workout_type': 'cardio', 'filename': 'image6.jpeg'}, {'description': 'A ballet barre', 'category': 'barre', 'workout_type': 'strength', 'filename': 'image7.jpeg'}, {'description': 'Stairmaster', 'category': 'machine', 'workout_type': 'cardio', 'filename': 'image10.jpeg'}, {'description': 'Pilates reformer', 'category': 'machine', 'workout_type': 'strength', 'filename': 'image8.jpeg'}, {'description': 'barbell', 'category': 'barbell', 'workout_type': 'strength', 'filename': 'image4.jpeg'}, {'description': 'A weightlifting bench with a barbell rack and weights', 'category': 'barbell', 'workout_type': 'strength', 'filename': 'image5.jpeg'}, {'description': 'A dumbbell with multiple weight plates', 'category': 'dumbbell', 'workout_type': 'strength', 'filename': 'image9.jpeg'}, {'description': 'rowing machine', 'category': 'machine', 'workout_type': 'cardio', 'filename': 'image2.jpeg'}, {'description': 'yoga mat', 'category': 'other', 'workout_type': 'flexibility', 'filename': 'image3.jpeg'}]
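The string clean-up above assumes the model wraps its JSON in at most one fenced block and adds no surrounding prose. A slightly more defensive sketch is shown below. It assumes each response contains a single JSON object matching the schema in the prompt, and the extract_json_object helper is hypothetical. It locates the first opening brace and lets the standard library's JSONDecoder.raw_decode parse one object from that position, ignoring whatever follows.

```python
import json

REQUIRED_KEYS = {"description", "category", "workout_type"}


def extract_json_object(text):
    """Parse the first JSON object found in `text`, ignoring fences or prose around it."""
    start = text.find("{")
    if start == -1:
        raise ValueError("no JSON object found in model response")
    # raw_decode parses one JSON value starting at `start` and tolerates trailing text
    obj, _ = json.JSONDecoder().raw_decode(text, start)
    missing = REQUIRED_KEYS - set(obj)
    if missing:
        raise ValueError(f"response is missing keys: {sorted(missing)}")
    return obj
```

Raising on missing keys surfaces a malformed response at parsing time instead of later, when the dictionaries are used to build the workout prompt.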
Now, let's query the Llama 4 model to produce a workout plan for our specified criteria by using the gym_equipment list.
# Example values; the interactive prompts below overwrite them
workout_type = "cardio"
length = "1 hour"
fitness_level = "beginner"
workout_type = input("Enter the workout type: ")  # strength, endurance, flexibility, balance, cardio, etc.
length = input("Enter the length of the workout: ")  # 30 minutes, 1 hour, 1.5 hours, etc.
fitness_level = input("Enter your fitness level: ")  # beginner, intermediate or advanced
prompt = f"""Use the description, category, and workout type of the exercise equipment in my gym to put together a workout for a {fitness_level} {workout_type} workout. The workout must be no longer than {length}.
You must include the filename of each image in your output along with the file extension. Here is the equipment in my gym: {gym_equipment}"""
messages = [
    {
        "role": "user",
        "content": [
            {
                "type": "text",
                "text": prompt
            }
        ]
    }
]
workout = model.chat(messages=messages)['choices'][0]['message']['content']
print(workout)
Output:
Based on the equipment available in your gym, I've put together a beginner-friendly cardio workout that can be completed within 1 hour. Since you're looking for a cardio workout, I'll focus on the equipment that falls under the 'cardio' or 'cardio' category. Here's a suggested workout:
**Warm-up (5 minutes)**
* Start with the 'rowing machine' (image2.jpeg) for 5 minutes to get your heart rate up and loosen your muscles.
**Cardio Circuit (30 minutes)**
* Move to the 'treadmill' (image1.jpeg) and set it to a walk or jog at a moderate pace. Spend 10 minutes on the treadmill to get your heart rate up and get some cardio benefits.
* Next, head to the 'elliptical trainer' (image0.jpeg) and spend 10 minutes on it, taking your heart rate to a moderate level. You can adjust the resistance to make it more challenging.
* Finally, hop on the 'exercise bike' (image6.jpeg) for 10 minutes to get some more cardio action.
**High-Intensity Interval Training (HIIT) (20 minutes)**
* Move to the 'Stairmaster' (image10.jpeg) and spend 5 minutes warming up at a moderate pace.
* Then, increase the resistance and sprint for 2 minutes at maximum intensity.
* Reduce the intensity and recover for 2 minutes. Repeat for a total of 15-20 minutes.
**Cool-down (5 minutes)**
* Finish your workout with some light stretching on the 'yoga mat' (image3.jpeg) to help prevent muscle soreness.
Here's your workout schedule:
1. Warm-up on the 'rowing machine' (image2.jpeg) (5 minutes)
2. Cardio circuit:
* Treadmill (10 minutes)
* Elliptical trainer (image0.jpeg) (10 minutes)
* Exercise bike (image6.jpeg) (10 minutes)
3. HIIT on the Stairmaster (image10.jpeg) (20 minutes)
4. Cool-down with stretching (5 minutes)
This workout should get your heart rate up and provide a great cardio session for beginners. Remember to listen to your body and adjust the intensity and duration according to your needs.
Example Output:
```
**Beginner Cardio Workout**
Warm-up (5 minutes):
- Rowing machine (image2.jpeg)
Cardio Circuit (30 minutes):
- Treadmill (image1.jpeg) - 10 minutes
- Elliptical trainer (image0.jpeg) - 10 minutes
- Exercise bike (image6.jpeg) - 10 minutes
**Cool-down (5 minutes)**
- Static stretches on the yoga mat (image3.jpeg)
```
Great! The model returned a well-structured cardio workout plan with a warm-up, a cardio circuit, a high-intensity interval block and a cool-down, each with a time allotment and the filenames of the equipment involved. The plan also reminds the user to adjust intensity and duration to their needs.
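The workout prompt interpolates the Python list directly, so the model sees Python's repr of the dictionaries, with single quotes. One alternative, sketched below, is to serialize the equipment as JSON and wrap prompt construction in a function. The build_workout_prompt name is hypothetical, and the wording simply mirrors the prompt used in this tutorial.

```python
import json


def build_workout_prompt(gym_equipment, workout_type, length, fitness_level):
    """Return a prompt that embeds the equipment list as JSON rather than a Python repr."""
    equipment_json = json.dumps(gym_equipment, indent=2)
    return (
        "Use the description, category, and workout type of the exercise equipment "
        f"in my gym to put together a workout for a {fitness_level} {workout_type} workout. "
        f"The workout must be no longer than {length}.\n"
        "You must include the filename of each image in your output along with the "
        "file extension. Here is the equipment in my gym:\n"
        f"{equipment_json}"
    )
```

Keeping the prompt in a function also makes it easier to rerun the same request with different criteria.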
With this generated workout description, we can also display the gym equipment that the model recommends. To do so, we can simply extract the filenames. Because the model might mention the same filename more than once, we check that an image has not already been selected as we iterate over the list of images. We can do so by storing the selected images in the selected_items list.
selected_items = []
# Extract the images of the equipment that the model recommends
for item, uploaded_file in zip(gym_equipment, images):
    if item['filename'].lower() in workout.lower() and not any(key['filename'] == item['filename'] for key in selected_items):
        selected_items.append({
            'image': uploaded_file,
            'category': item['category'],
            'filename': item['filename']
        })
# Display the selected equipment
if len(selected_items) > 0:
    for item in selected_items:
        display(Image.open(directory + '/' + item['filename']))
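The substring check above selects images in the order they were uploaded. If it is preferable to show them in the order the plan mentions them, a regular expression can pull the filenames out of the generated text instead. The mentioned_filenames helper below is a hypothetical sketch and assumes the same .jpeg and .png extensions used when loading the images, plus .jpg.

```python
import re


def mentioned_filenames(text, known_filenames):
    """Return known filenames in order of first mention in `text`, without duplicates."""
    known = {name.lower(): name for name in known_filenames}
    ordered = []
    # Match runs of filename-like characters ending in a supported image extension
    for match in re.finditer(r"[\w.-]+\.(?:jpeg|jpg|png)", text, flags=re.IGNORECASE):
        name = known.get(match.group(0).lower())
        if name is not None and name not in ordered:
            ordered.append(name)
    return ordered


plan = "Warm up on the rowing machine (image2.jpeg), then use the treadmill (image1.jpeg)."
print(mentioned_filenames(plan, ["image0.jpeg", "image1.jpeg", "image2.jpeg"]))
# ['image2.jpeg', 'image1.jpeg']
```

Filenames that the model invents are ignored, because only names present in known_filenames are returned.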
In this tutorial, you built a fitness app that uses an AI coach to customize and automate training sessions for new clients. This coaching platform and similar advancements have the potential to reshape the fitness industry by giving tailored guidance to people looking for online training. Using photos or screenshots of the user's equipment, the AI tool customizes workout plans to meet the specified criteria. Workouts for weight loss, muscle gain and strength training are all possible outputs. The Llama 4 model was critical for labeling and categorizing each item as well as generating the workout plan.
Some next steps for building off this application can include: