In Azure AI Speech the training of a custom Speech-To-Text model with audio and transcript keeps throwing "Internal Error"

Benedikt Schmitt 20 Reputation points

I am trying to fine-tune a baseline-20231107 model for my specific use case. I have recorded two audio files in .wav format that meet all the requirements listed on this page:

https://video2.skills-academy.com/en-us/azure/ai-services/speech-service/how-to-custom-speech-test-and-train

I have also provided transcripts that fit the requirements from the documentation:

https://video2.skills-academy.com/en-us/azure/ai-services/speech-service/how-to-custom-speech-human-labeled-transcriptions

Every time I try to train the model it throws an "Internal Error".

- I previously had my audio as .m4a and converted it to .wav. The audio is now recorded directly as .wav.

- I waited several days before retrying the training in case it was a temporary error, but it still doesn't work.

romungi-MSFT 44,771 Reputation points Microsoft Employee

2024-08-26T11:16:09.41+00:00

@Benedikt Schmitt If training continues to fail with an internal error, I would recommend trying the sample data with your project before using your own data. You can use the sample transcript and audio files provided for custom speech.

This should confirm whether the issue is with your resource or with your data. I tried the same sample data and it built a model successfully. Thanks!
Benedikt Schmitt 20 Reputation points

2024-08-26T12:14:10.3933333+00:00

Thanks. I will try a training with the sample data and get back to you.
Benedikt Schmitt 20 Reputation points

2024-08-26T12:49:10.8833333+00:00

Edit: The sample data worked. I have gone through my transcript again and changed everything that could possibly cause an error.
Benedikt Schmitt 20 Reputation points

2024-08-26T13:38:05.8633333+00:00

The training is still failing with "Internal error"
romungi-MSFT 44,771 Reputation points Microsoft Employee

2024-08-27T08:56:57.4666667+00:00

@Benedikt Schmitt At this point, I think something in your data might be causing the training to fail. Do you have a valid support subscription so that you can create a support case with the Speech team to check further?
Ihtisham Ali 0 Reputation points

2024-08-30T12:43:05.94+00:00

Hi all, I am facing a similar issue, but my audio files are under 1 minute.

1 answer

Benedikt Schmitt 20 Reputation points

2024-08-27T09:35:43.03+00:00

I checked my input data again and it turns out my audio was too long. The individual files were each slightly longer than 40 seconds.

It was user error in the end but it would still be a good idea to implement more descriptive error messages. It could have saved me a few days of waiting.
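
For anyone hitting the same error, here is a minimal sketch for checking clip lengths before uploading. It uses only Python's standard `wave` module. The 40-second threshold comes from the resolution described above and is an assumption here, so check the current Custom Speech documentation for the actual limit. The function names are illustrative and not part of any Azure SDK.

```python
import wave
from pathlib import Path

# Assumption: threshold taken from this thread, not from an official API.
MAX_SECONDS = 40.0


def wav_duration_seconds(path):
    """Return the duration of a PCM .wav file in seconds."""
    with wave.open(str(path), "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def find_too_long(folder, limit=MAX_SECONDS):
    """Return (file name, duration) pairs for .wav files longer than limit."""
    too_long = []
    for path in sorted(Path(folder).glob("*.wav")):
        duration = wav_duration_seconds(path)
        if duration > limit:
            too_long.append((path.name, duration))
    return too_long
```

Calling `find_too_long("path/to/audio")` on the dataset folder lists the clips that would need to be trimmed or split before training.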
romungi-MSFT 44,771 Reputation points Microsoft Employee

2024-08-27T10:57:59.08+00:00

@Benedikt Schmitt Good to hear that you are unblocked. I certainly agree that the error is misleading, and I will pass this on internally as a feedback item to the product team. Thanks for posting your resolution.