Is there any way to access the audio that is sent to the Speech SDK server?

Faris Lemes 50 Reputation points
2024-06-06T06:16:28.6233333+00:00

Hi, I'm trying to build an AI assistant using the Speech SDK. The device is Linux-based, and I've configured ALSA loopback and PulseAudio to use the echo cancellation feature, which should be supported by the SDK. I've noticed that echo cancellation sometimes works flawlessly and sometimes does not, and I'm trying to figure out whether something is wrong on my side.

While experimenting with sound, I recorded all channels with arecord (ALSA), and the speaker sound is indeed recorded on the last channel, as the Speech SDK requires.

Which filter is used on the Speech SDK side? Is it an adaptive filter, or a filter that buffers the loopback channel? Is there any way to debug this behavior? I've analyzed the logs that the Speech SDK generates, but I could not find anything useful there.
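One way to sanity-check the arecord capture offline is sketched below. It assumes the recording is a 16-bit PCM WAV file whose last channel is the loopback (speaker reference), as described above. The function names `read_channels` and `estimate_echo_delay` and the file name in the usage comment are hypothetical, chosen only for illustration. The sketch estimates how many samples the speaker signal lags in a microphone channel and how strongly the two correlate; it says nothing about which filter the Speech SDK uses internally.

```python
import array
import math
import sys
import wave


def read_channels(path):
    """Return (sample_rate, channels) from a 16-bit PCM WAV file.

    `channels` is a list with one array of samples per channel.
    """
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        n_channels = wav.getnchannels()
        rate = wav.getframerate()
        samples = array.array("h")
        samples.frombytes(wav.readframes(wav.getnframes()))
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian
    # Samples are interleaved frame by frame, so stride by the channel count.
    return rate, [samples[i::n_channels] for i in range(n_channels)]


def estimate_echo_delay(mic, ref, max_lag):
    """Return (lag, score) for the lag in samples where `ref` best matches `mic`.

    `score` is the absolute normalized cross-correlation (0.0 to 1.0).
    Only non-negative lags are searched: the mic is assumed to hear the
    speaker after the loopback channel captures it.
    """
    n = min(len(mic), len(ref)) - max_lag
    if n <= 0:
        raise ValueError("recording is shorter than max_lag")
    ref_seg = ref[:n]
    ref_energy = sum(r * r for r in ref_seg)
    best_lag, best_score = 0, 0.0
    for lag in range(max_lag + 1):
        mic_seg = mic[lag:lag + n]
        dot = sum(m * r for m, r in zip(mic_seg, ref_seg))
        denom = math.sqrt(ref_energy * sum(m * m for m in mic_seg))
        score = abs(dot) / denom if denom else 0.0
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag, best_score


# Usage (hypothetical file name; keep the clip short, this is pure Python):
#   rate, channels = read_channels("capture.wav")
#   lag, score = estimate_echo_delay(channels[0], channels[-1], max_lag=rate // 10)
#   print(f"delay = {1000 * lag / rate:.1f} ms, correlation = {score:.2f}")
```

Comparing the estimated delay between a run where echo cancellation works and a run where it fails may show whether the loopback channel drifts relative to the microphones, which is one possible reason an echo canceller would fail intermittently.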

Thanks!

Azure AI Speech
An Azure service that integrates speech processing into apps and services.