Azure Text To Speech docker container throws an exception with viseme
I'm using the Azure Text to Speech docker image (mcr.microsoft.com/azure-cognitive-services/speechservices/neural-text-to-speech:3.3.0-amd64-en-us-jennyneural). I'm passing it SSML through the dotnet SDK. When asking for viseme (via <mstts:viseme…
Is there a way to make speech service transcription faster (diarization with speakers differentiated)?
Currently the speed seems to be half the time for wav and 1:1 ratio for mp4 with gstreamer. From this post, it seems half the time for wav file is the…
How to assign operation permissions to a resource
Hello, I am new to Azure and I want to use it to convert text to speech. When I create the resource, enter Speech Studio, and try to start the service, the system raises an error saying "You don't have operation permissions to [New],…
Random Words Detected by Azure Speech Recognizer in Silence
Hello Azure Support Team, I am currently using the Azure Speech Service to recognize speech inputs in my application. The setup of my speech recognizer is as follows: export const createSpeechRecognizer = () => { const speechRecognitionConfig =…
Speech-to-Text batch transcribe API in germanycentralwest doesn't work
Last Friday (May 31 2024) we started getting the following errors on all transcripts sent to the batch transcription API on our speech resource in…
Azure pronunciation assessment time limit
I am using Azure pronunciation assessment to assess an audio file, but the assessment covers only the first minute of the speech and does not assess the rest of the audio. This is my code: const sdk =…
Azure pronunciation assessment input video
Can I give Azure pronunciation assessment a video input?
Azure pronunciation assessment async assessment
I am using the Azure speech recognizer SDK to run a pronunciation assessment on an audio file. The problem is that when the speech is in French, the results are always low and not expressive. const language = await detectSingleSpeechLanguage(text) …
Error while trying to train a 202240228 Whisper Large v2 baseline model
When trying to train a custom speech model using a dataset containing an audio file and its transcript, the model failed to train due to an internal error. Can anyone provide any insights on how to troubleshoot this issue?
How to create a dataset for Azure custom speech using spx (speechCLI)
I am using the following command for creating a custom speech dataset in my Azure Speech service: spx csr dataset create --api-version v3.1 --kind "Acoustic" --name "My Custom Speech" --description "My Acoustic Dataset…
Azure Cognitive Services Speech: Unable to get Custom Translator model results from speech translation code
In test C# code that I created based on the speech translation code in the following sample (“Using custom translation in speech translation”), I’m having trouble getting Custom Translator model translation results. The code just returns a cancellation…
How to synchronize real world events happening while speech recognition is happening with individual spoken words
I am trying to synchronize real-world events that are occurring during live streaming of speech to Azure speech recognition services (e.g., eye gaze shifts, hardware device interactions, etc.). I note the time when I start speech recognition and record…
Can my web app use a GPU for AI capabilities or will I need to use an Azure VM?
I am running a web app which I deployed through Docker. The web app works perfectly except for one important detail: the whisperx AI model I have takes forever to run a transcription (think hours). I run the same AI function on a "T4 GPU" using…
Is each voice in the voice gallery based on a clone of one specific natural person or is it synthetic?
I would like to understand whether: Each voice in the voice gallery is based on a clone of one specific natural person? Voices are synthetic (similar to those from 11Labs Voice Design) that cannot be traced back to an individual person? Thank you!
Microsoft: fix captioning by Speech Studio
The captioning functionality in the Speech Studio is an utter failure. This is typical output: I encourage Microsoft to implement the functionality that allows the user to specify the number of lines of text (typically one or two), and the maximum…
SpeakSsmlAsync is cancelled, but SpeakTextAsync is successful
I am trying out the Azure AI service to convert text to speech from a C# WPF application. My calls through SpeakTextAsync are successful, but my calls through SpeakSsmlAsync are returned with the Reason = Cancelled. I am on the free tier for South…
What are the HW or sound limitations for the echo cancellation algorithm in SpeechSDK
Hi, I'm having some issues with echo cancellation on my device, and I'm trying to use the Speech SDK. When I analyzed the sound that I record with the microphone, it seems that higher harmonics are present which are 24 dB less than primary…
Create a basic voice-interactive dashboard
Hello Team, I need to create a basic voice-interactive dashboard using Azure Cognitive Services such as the Speech service, CLU (Conversational Language Understanding), and Power BI. Please also suggest any other way to achieve this. It would be really helpful.
As a student, how can I use an Azure Speech resource?
I have a student subscription and want to create an Azure Speech resource, but there's a problem. Is this because of a student subscription limitation, and what can I do to use the Azure Speech service?
How can I set permissions on the resource?
Hello, I want to upload a text file to Speech Studio, but the system raised an error. Can anyone help me fix this and assign a proper role for myself? I already set my role to Cognitive Services User.