Learn how easy it is to control the speaking rate of the text using the <prosody> element in a cURL command.

A developer recently had a query that I was trying to answer: How do I adjust the speaking rate in IBM Watson Text to Speech using cURL POST? 

While investigating, I came across the terms speaking rate and SSML. Before answering the query, let’s get familiar with both terms.

What are speaking rate and SSML?

Speaking rate is often expressed in words per minute (wpm). To calculate this value, you’ll need to record yourself talking for a few minutes and then add up the number of words in your speech. Divide the total number of words by the number of minutes your speech took.

Speaking rate (wpm) = total words/number of minutes
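As a quick illustration of the formula above, here is a minimal shell sketch. The function name speaking_rate is hypothetical (it is not part of any IBM tooling), and it uses integer division, so fractional results are truncated.

```shell
# Hypothetical helper: words per minute = total words / number of minutes.
# Uses shell integer arithmetic, so the result is truncated to a whole number.
speaking_rate() {
  total_words=$1
  minutes=$2
  echo $(( total_words / minutes ))
}

# A 3-minute recording containing 450 words:
speaking_rate 450 3   # prints 150
```

A speech of 450 words delivered in 3 minutes therefore has a speaking rate of 150 wpm.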

The Speech Synthesis Markup Language (SSML) is an XML-based markup language that provides annotations of text for speech-synthesis applications. It is a recommendation of the W3C Voice-Browser Working Group that has been adopted as the standard markup language for speech synthesis by the VoiceXML 2.0 specification. SSML provides developers of speech applications with a standard way to control aspects of the synthesis process by enabling them to specify pronunciation, volume, pitch, speed, and other attributes via markup. For a complete introduction to SSML, refer to the IBM Cloud documentation.

Before you begin

  • Create an instance of the service: 
    • Go to the Text to Speech page in the IBM Cloud Catalog
    • Sign up for a free IBM Cloud account or log in
    • Click Create
  • Copy the credentials to authenticate to your service instance:
    • From the IBM Cloud Resource list, click on your Text to Speech service instance to go to the Text to Speech service dashboard page
    • On the Manage page, click Show Credentials to view your credentials
    • Copy the API Key and URL values and replace the placeholders {API_KEY} and {URL} with the respective values in the next section

Code snippets

Here’s a working example with the POST call on Linux or macOS:

curl -X POST -u "apikey:{API_KEY}" \
--header "Accept: audio/wav" \
--header "Content-Type: application/json" \
--data '{"text": "<p><s><prosody rate=\"+50%\">This is the first sentence of the paragraph.</prosody></s><s>Here is another sentence.</s><s>Finally, this is the last sentence.</s></p>"}' \
--output result.wav \
"{URL}/v1/synthesize" -v
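To experiment with other speaking rates without hand-editing the escaped JSON, the request body can be generated by a small shell function. The following is a minimal sketch, not an official IBM utility: build_payload is a hypothetical name, and it assumes the text contains no double quotes, backslashes, or XML special characters. The rate value +25% is only an illustration, on the assumption that the service treats it the same way as the +50% used above.

```shell
# Hypothetical helper (not part of the original example): wraps plain text in a
# <prosody> element and prints the JSON body for the /v1/synthesize request.
# Assumption: the text has no double quotes, backslashes, or XML special characters.
build_payload() {
  rate=$1
  text=$2
  printf '{"text": "<prosody rate=\\"%s\\">%s</prosody>"}\n' "$rate" "$text"
}

# Print the body for a 25% faster speaking rate.
build_payload "+25%" "This sentence is spoken a little faster."

# The output could then be passed to the same cURL call, for example:
#   --data "$(build_payload "+25%" "This sentence is spoken a little faster.")"
```

Keeping the escaping in one place makes it less likely that a stray quote breaks the JSON when you change the rate or the text.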

On a Windows command prompt, create a JSON file input.json with the following command:

echo { "text": "<p><s><prosody rate='+50%'>This is the first sentence of the paragraph.</prosody></s><s>Here is another sentence.</s><s>Finally, this is the last sentence.</s></p>" } > input.json 

Then, run the following cURL command to produce the result.wav file:

curl -X POST -u "apikey:{API_KEY}" ^
--header "Accept: audio/wav" ^
--header "Content-Type: application/json" ^
--data @input.json ^
--output result.wav ^
"{URL}/v1/synthesize" -v

Learn more

Learn how easy it is to quickly create a voice-enabled Android-native chatbot with the Watson Assistant, Watson Text to Speech, and Watson Speech to Text services on IBM Cloud.

Build a Slackbot to create and search Db2 database entries for events and conferences.

Here are some useful links I followed to create the above code sample that will help you in understanding the SSML attributes. Also, check out the limitations of <prosody> in the links below.
