Learn how easy it is to control the speaking rate of synthesized text using the SSML <prosody> element in a cURL command.

A developer recently had a query that I was trying to answer: How do I adjust the speaking rate in IBM Watson Text to Speech using cURL POST? 

While investigating, I came across the terms speaking rate and SSML. Before posting the answer to the query, let’s get familiar with the terms first.

What are speaking rate and SSML?

Speaking rate is often expressed in words per minute (wpm). To calculate this value, you’ll need to record yourself talking for a few minutes and then add up the number of words in your speech. Divide the total number of words by the number of minutes your speech took.

Speaking rate (wpm) = total words/number of minutes
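
As a quick worked example of the formula above, here is a minimal shell sketch. The function name speaking_rate_wpm and the sample numbers are assumptions made for illustration; it takes the duration in seconds and uses integer arithmetic, so the result is rounded down.

```shell
# Hypothetical helper: speaking rate in words per minute (wpm).
# $1 = total number of words, $2 = duration of the speech in seconds.
speaking_rate_wpm() {
  words=$1
  seconds=$2
  # words / minutes, rewritten as words * 60 / seconds (integer division)
  echo $(( words * 60 / seconds ))
}

# 450 words spoken in 3 minutes (180 seconds)
speaking_rate_wpm 450 180   # prints 150
```

So a speaker who gets through 450 words in three minutes has a speaking rate of 150 wpm.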

The Speech Synthesis Markup Language (SSML) is an XML-based markup language that provides annotations of text for speech-synthesis applications. It is a recommendation of the W3C Voice-Browser Working Group that has been adopted as the standard markup language for speech synthesis by the VoiceXML 2.0 specification. SSML provides developers of speech applications with a standard way to control aspects of the synthesis process by enabling them to specify pronunciation, volume, pitch, speed, and other attributes via markup. For a complete introduction to SSML, refer to the IBM Cloud documentation.

Before you begin

  • Create an instance of the service: 
    • Go to the Text to Speech page in the IBM Cloud Catalog
    • Sign up for a free IBM Cloud account or log in
    • Click Create
  • Copy the credentials to authenticate to your service instance:
    • From the IBM Cloud Resource list, click on your Text to Speech service instance to go to the Text to Speech service dashboard page
    • On the Manage page, click Show Credentials to view your credentials
    • Copy the API Key and URL values, then replace the placeholders {API_KEY} and {URL} in the commands in the Code snippets section with the respective values

Code snippets

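Here’s a working example of the POST call on Linux or macOS. The <prosody> element with rate="+50%" wraps only the first sentence, so that sentence is spoken 50 percent faster than the default rate while the other two sentences are left at the default: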

curl -X POST -u "apikey:{API_KEY}" \
--header "Accept: audio/wav" \
--header "Content-Type: application/json" \
--data '{"text": "<p><s><prosody rate=\"+50%\">This is the first sentence of the paragraph.</prosody></s><s>Here is another sentence.</s><s>Finally, this is the last sentence.</s></p>"}' \
--output result.wav \
"{URL}/v1/synthesize" -v
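
If you want to try several rates without hand-editing the escaped JSON each time, the request body can be generated by a small shell function. This is a minimal sketch and not part of the original answer: make_payload is a hypothetical helper name, and it assumes that the rate and the text contain no double quotes, backslashes, or XML special characters, because it does no escaping.

```shell
# Hypothetical helper: wrap the text in a <prosody> element and print the JSON body.
# $1 = value for the rate attribute (for example "+50%"), $2 = text to speak.
# Assumption: neither argument needs JSON or XML escaping.
make_payload() {
  rate=$1
  text=$2
  # \\" in the printf format prints \" so the quotes stay escaped inside the JSON string
  printf '{"text": "<prosody rate=\\"%s\\">%s</prosody>"}\n' "$rate" "$text"
}

make_payload "+50%" "This is the first sentence of the paragraph."
# prints {"text": "<prosody rate=\"+50%\">This is the first sentence of the paragraph.</prosody>"}

# Possible use with the cURL call (needs real {API_KEY} and {URL} values):
#   --data "$(make_payload "+50%" "This is the first sentence of the paragraph.")"
```

The printed body has the same shape as the --data argument in the command above, reduced to a single <prosody> element.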

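At a Windows command prompt, create a JSON file named input.json with the following command. The rate attribute uses single quotes here, so nothing inside the JSON string needs escaping: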

echo { "text": "<p><s><prosody rate='+50%'>This is the first sentence of the paragraph.</prosody></s><s>Here is another sentence.</s><s>Finally, this is the last sentence.</s></p>" } > input.json 

Then, run the following cURL command to generate the result.wav file:

curl -X POST -u "apikey:{API_KEY}" ^
--header "Accept: audio/wav" ^
--header "Content-Type: application/json" ^
--data @input.json ^
--output result.wav ^
"{URL}/v1/synthesize" -v

Learn more

Learn how easy it is to quickly create a voice-enabled Android-native chatbot with the watsonx Assistant, Watson Text to Speech, and Watson Speech to Text services on IBM Cloud.

Build a Slackbot to create and search Db2 database entries for events and conferences.

Here are some useful links I followed to create the above code sample that will help you in understanding the SSML attributes. Also, check out the limitations of <prosody> in the links below.
