IBM Support

Is there a maximum size limit for audio streaming in Speech to Text?

Question & Answer


Question

Is there a maximum size limit for audio streaming with the Speech to Text service?

Answer

The Speech to Text service accepts a maximum of 100 MB of audio data per request through the synchronous HTTP and WebSocket interfaces. Through the asynchronous HTTP interface, you can submit up to 1 GB of audio data per request.
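
As a rough illustration of these limits, the following Python sketch checks the size of an audio payload before it is sent and reports which interface could accept it. The function names and return labels are hypothetical and are not part of any IBM SDK. Treating 1 MB as 1,000,000 bytes is also an assumption, chosen because it is the more conservative reading.

```python
import os

# Limits taken from the answer above. Interpreting MB and GB as decimal
# units (10**6 and 10**9 bytes) is an assumption, not a documented fact.
SYNC_LIMIT_BYTES = 100 * 10**6   # synchronous HTTP and WebSocket interfaces
ASYNC_LIMIT_BYTES = 10**9        # asynchronous HTTP interface


def choose_interface(size_bytes: int) -> str:
    """Return a hypothetical label for the interface that fits the payload."""
    if size_bytes < 0:
        raise ValueError("size_bytes must not be negative")
    if size_bytes <= SYNC_LIMIT_BYTES:
        return "sync-http-or-websocket"
    if size_bytes <= ASYNC_LIMIT_BYTES:
        return "async-http"
    return "too-large"


def choose_interface_for_file(path: str) -> str:
    """Apply the same check to a file on disk, using its size in bytes."""
    return choose_interface(os.path.getsize(path))
```

A result of "too-large" suggests that the audio would need to be compressed or split into several requests before submission.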

One way to maximize the amount of audio data that you can pass with a speech recognition request is to use a format that offers compression. There are two basic types of compression: lossy and lossless. The audio format and compression algorithm that you choose can have a direct impact on the accuracy of speech recognition.
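
To see why the choice of format matters for the size limit, a back-of-the-envelope calculation helps. The Python sketch below estimates how many minutes of audio fit in a given number of bytes at a constant bitrate. The two bitrates are illustrative assumptions: 256,000 bits per second corresponds to uncompressed 16-bit mono PCM sampled at 16 kHz, and 24,000 bits per second is an assumed setting for a lossy codec. Container overhead is ignored, and the limit is again assumed to be 100,000,000 bytes.

```python
def max_duration_seconds(limit_bytes: int, bitrate_bps: int) -> float:
    """Seconds of audio that fit in limit_bytes at a constant bitrate."""
    if bitrate_bps <= 0:
        raise ValueError("bitrate_bps must be positive")
    return limit_bytes * 8 / bitrate_bps


if __name__ == "__main__":
    limit = 100 * 10**6  # assumed decimal reading of the 100 MB limit

    # Illustrative bitrates only; real files vary with codec and settings.
    examples = {
        "16 kHz 16-bit mono PCM (uncompressed)": 16_000 * 16,
        "lossy codec at an assumed 24 kbps": 24_000,
    }
    for label, bitrate in examples.items():
        minutes = max_duration_seconds(limit, bitrate) / 60
        print(f"{label}: about {minutes:.0f} minutes per request")
```

The estimate covers only how much audio fits in a request. It says nothing about accuracy, which, as noted above, can also depend on the format and compression algorithm.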

For more information about data limits and strategies to improve transcription accuracy, see the "Data limits and compression" section of the Speech to Text documentation.

Document Information

Modified date:
12 January 2023

UID

ibm1KB0010954