Real time diarization, for true!

Question

Hi,

I decided to join the Azure AI program because of this demo:

https://speech.microsoft.com/portal/speechtotexttool

In this demo, I can activate the microphone, set Diarization to True, and that's it!

Then I discovered from the documentation that I couldn't replicate that exact feature:

https://video2.skills-academy.com/en-us/azure/ai-services/speech-service/get-started-stt-diarization?tabs=windows&pivots=programming-language-python

I felt VERY disappointed.

Now I'm struggling to find a good silence detector (like the one used by the Azure AI STT quickstart) that could at least save the speech up to each silence into a WAV file, which I could then pass to the diarizer.

I'm using Python for AI, but if needed I could switch to Java or C#.

I'm really, really sad right now.

Answer

@Marco Cocco Could you please add more detail about the failure you saw when using the Python quickstart? If you just want to use the microphone instead of the file used in the sample, pass use_default_microphone=True instead of the filename parameter to the AudioConfig class. See the reference of the class here for the available options.

audio_config = speechsdk.audio.AudioConfig(use_default_microphone=True)

Also, even if you do not provide an AudioConfig, the Python SDK defaults to the microphone. See the confirmation in this issue from the SDK team.
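
To make the microphone suggestion concrete, here is a minimal sketch of the diarization quickstart adapted to live microphone input. It assumes a recent azure-cognitiveservices-speech package that exposes ConversationTranscriber, and that the key and region are stored in the SPEECH_KEY and SPEECH_REGION environment variables (as in the quickstart). The format_turn helper and the en-US language setting are my own illustrative choices, not part of the SDK or of your setup, so please treat this as a starting point rather than a definitive implementation.

```python
import os
import time


def format_turn(speaker_id, text):
    """Render one transcribed phrase as 'speaker: text' (hypothetical helper)."""
    return f"{speaker_id or 'Unknown'}: {text}"


def transcribe_from_microphone():
    # Imported here so format_turn stays usable without the SDK installed.
    import azure.cognitiveservices.speech as speechsdk

    # Assumption: credentials come from these environment variables.
    speech_config = speechsdk.SpeechConfig(
        subscription=os.environ["SPEECH_KEY"],
        region=os.environ["SPEECH_REGION"],
    )
    speech_config.speech_recognition_language = "en-US"  # adjust as needed

    # Use the default microphone instead of a WAV file.
    audio_config = speechsdk.audio.AudioConfig(use_default_microphone=True)
    transcriber = speechsdk.transcription.ConversationTranscriber(
        speech_config=speech_config, audio_config=audio_config
    )

    done = False

    def on_transcribed(evt):
        # Final result for a phrase, including the speaker label.
        if evt.result.reason == speechsdk.ResultReason.RecognizedSpeech:
            print(format_turn(evt.result.speaker_id, evt.result.text))

    def on_stopped(evt):
        # Fired on session stop or cancellation; ends the wait loop below.
        nonlocal done
        done = True

    transcriber.transcribed.connect(on_transcribed)
    transcriber.session_stopped.connect(on_stopped)
    transcriber.canceled.connect(on_stopped)

    transcriber.start_transcribing_async().get()
    try:
        while not done:
            time.sleep(0.5)
    except KeyboardInterrupt:
        pass  # Ctrl+C ends the session
    finally:
        transcriber.stop_transcribing_async().get()


if __name__ == "__main__":
    transcribe_from_microphone()
```

With this approach the service segments the audio into phrases itself, so a separate silence detector and intermediate WAV files should not be needed. Press Ctrl+C to stop the session.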

