How can we customize TTS/STT for a rare language?

Question

Many languages and dialects have no standardized pronunciation or writing system. If we want to customize TTS/STT for such a non-standardized language, what steps do we need to take? Is there anything people who want to customize a rare language should be aware of? For instance:

10,000 basic vocabulary items, each with text and recorded speech;
Millions of sentences, each with text and recorded speech;
Could the training be done by a team, such as a team of announcers?
How can text-to-speech and speech-to-text be customized for a non-official language such as Southern Hokkien (Taiwanese)?

Answer

@Gates Customization for STT or TTS is limited to the languages that each feature already supports. In this case, it looks like Southern Hokkien is not supported.
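One way to check this yourself for TTS is to look at the locales in the voice list of your Speech resource. The following is only a minimal sketch: it assumes the voices list REST endpoint returns a JSON array whose entries carry `Locale` and `ShortName` fields, `fetch_voices` and `voices_for_locale` are hypothetical helper names, and `nan-TW` is an assumed language tag for Southern Hokkien, not a locale the service is known to use. It covers TTS voices only, not STT.

```python
import json
import urllib.request


def fetch_voices(region, key):
    """Download the voice list for a Speech resource (needs network and a valid key)."""
    url = f"https://{region}.tts.speech.microsoft.com/cognitiveservices/voices/list"
    request = urllib.request.Request(url, headers={"Ocp-Apim-Subscription-Key": key})
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def voices_for_locale(voices, locale):
    """Return the short names of voices whose Locale matches, ignoring case."""
    wanted = locale.lower()
    return [v["ShortName"] for v in voices if v.get("Locale", "").lower() == wanted]


# Hand-written sample in the shape the endpoint is assumed to return.
sample_voices = [
    {"ShortName": "en-US-JennyNeural", "Locale": "en-US"},
    {"ShortName": "zh-TW-HsiaoChenNeural", "Locale": "zh-TW"},
]

# "nan-TW" is an assumed tag for Southern Hokkien; an empty result means no match.
print(voices_for_locale(sample_voices, "zh-TW"))
print(voices_for_locale(sample_voices, "nan-TW"))
```

With a real resource you would pass the result of `fetch_voices(region, key)` instead of `sample_voices`. An empty list for your locale suggests there is no base language to customize on.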

The data limits for STT or TTS customization are listed in the documentation, but they are mostly guidance for getting a good model. You can always add more data and retrain your model to improve the customization.

The TTS custom voice feature is the most popular one. Announcers and professional voice artists use it to create content with voices for a specific language. These custom voices are in addition to the standard and neural voices already available in Azure, and customers use them for various scenarios.
