Is there a way to make speech service transcription faster (diarization with speakers differentiated)?

kk 0

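Currently, transcription takes about half the audio duration for WAV files, and about the full audio duration (a 1:1 ratio) for MP4 files decoded through GStreamer.

According to this post, about half the audio duration seems to be the fastest a WAV file can be transcribed.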

https://stackoverflow.com/questions/69845073/how-to-do-voice-recognition-in-azure-and-complete-immediately

If this is true, how can I make transcription of the other formats that go through GStreamer (such as MP4) at least as fast as transcription of a WAV file?

I am following the Python sample from the documentation:
https://video2.skills-academy.com/en-us/azure/ai-services/speech-service/how-to-use-codec-compressed-audio-input-streams?tabs=linux%2Cdebian%2Cjava-android%2Cterminal&pivots=programming-language-python

Thank you for your help!

santoshkc 6,310 Reputation points Microsoft Vendor

2024-07-02T06:58:49.07+00:00

Hi @kk,

Thank you for reaching out to Microsoft Q&A forum!

There is no baseline figure for how quickly the Speech service processes an audio file and returns the text, because many factors affect the response time, including the audio quality, the network bandwidth, whether the SDK or the REST API is used, and the pricing tier of the resource.

However, there are a few guidelines mentioned in the FAQ that help with performance. In most cases, including transcription scenarios, the response is fast. For batch transcription, jobs are scheduled on a best-effort basis. You cannot estimate when a job will change into the running state, but it should happen within minutes under normal system load. Once in the running state, the transcription occurs faster than the audio runtime playback speed.
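To compare formats on equal footing, it may help to measure the ratio of processing time to audio duration (often called the real-time factor) for each input. The following is a minimal sketch using only the Python standard library. The helper names `wav_duration_seconds`, `real_time_factor`, and `timed` are hypothetical and not part of the Speech SDK, and the `transcribe` argument stands in for whatever transcription function you already have.

```python
import time
import wave


def wav_duration_seconds(source):
    """Return the playback duration of a WAV file (path or binary file object)."""
    with wave.open(source, "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


def real_time_factor(processing_seconds, audio_seconds):
    """Processing time divided by audio duration.

    0.5 means transcription took half the playback time; 1.0 means it took
    as long as the audio itself.
    """
    if audio_seconds <= 0:
        raise ValueError("audio duration must be positive")
    return processing_seconds / audio_seconds


def timed(transcribe, *args):
    """Call a transcription function (hypothetical placeholder) and time it.

    Returns a tuple of (result, elapsed_seconds).
    """
    start = time.perf_counter()
    result = transcribe(*args)
    return result, time.perf_counter() - start
```

With these helpers, the figures in the question correspond to a real-time factor of about 0.5 for WAV and about 1.0 for MP4 through GStreamer. For a compressed file, the audio duration would have to come from another source, because the `wave` module reads only WAV.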

I hope you understand! Thank you.
santoshkc 6,310 Reputation points Microsoft Vendor

2024-07-03T07:14:34.3666667+00:00

Hi @kk,

Following up to see if the given response was helpful. Thank you.
santoshkc 6,310 Reputation points Microsoft Vendor

2024-07-04T06:45:28.3466667+00:00

Hi @kk,

We haven’t heard back from you on the last response and are checking whether you have a resolution yet. If you do, please share it with the community, as it can be helpful to others. Thank you.