Sample Data for different styles of Custom Neural Voices (happy, excited, sad).

Question

I could find individual utterances for neutral speech, questions, and exclamations here: https://github.com/Azure-Samples/Cognitive-Speech-TTS/blob/master/CustomVoice/Sample%20Data/Individual%20utterances%20%2B%20matching%20script/SampleScript.txt

To create styles like Happy, Excited, or Sad for my existing custom neural voice, do I need to add and train another dataset where individual utterances are in these styles?
Where can I find such sample data? Any links would be great!

Kindly help. Thank you!

Accepted Answer

Hi @PAVAGEAU Perrine

Thank you for your question.

Yes. To add styles such as Happy, Excited, or Sad to a custom neural voice, you need to add and train another dataset in which the individual utterances are recorded in those styles. You can record the utterances yourself or hire voice actors to record them in each style.

The Azure documentation suggests working with your voice talent to develop a persona that defines the overall sound and emotional tone of the custom neural voice. You can define the speaking styles of your persona and ask your voice talent to read the script in a way that resonates with the styles you want.

You can use the individual utterances for neutral speech, questions, and exclamations that you have found as a starting point and modify them to create the different styles you need.
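As a rough illustration of organizing such style data, here is a minimal Python sketch that builds one recording script per style. It assumes the script layout is one utterance per line, with a zero-padded numeric ID, a tab, and the text, as in the linked SampleScript.txt. The function name `build_style_scripts` and the style labels are hypothetical, and the example sentences are placeholders. In practice the text for each style should suit the emotion, and the voice talent still has to record every line in that style.

```python
from typing import Dict, List


def build_style_scripts(
    utterances: List[str], styles: List[str], start_id: int = 1
) -> Dict[str, str]:
    """Return one script per style, each line as '<10-digit id><TAB><text>'.

    Assumption: the ID<TAB>text layout mirrors the sample script format.
    IDs keep increasing across styles so every recording gets a unique ID.
    """
    scripts: Dict[str, str] = {}
    next_id = start_id
    for style in styles:
        lines = []
        for text in utterances:
            lines.append(f"{next_id:010d}\t{text}")
            next_id += 1
        scripts[style] = "\n".join(lines) + "\n"
    return scripts


if __name__ == "__main__":
    # Placeholder sentences; replace them with text that fits each style.
    demo = build_style_scripts(
        ["What a wonderful surprise!", "I can't wait to see you."],
        ["happy", "sad"],
    )
    for style, script in demo.items():
        print(f"--- {style} ---")
        print(script, end="")
```

Keeping one script per style makes it easier to upload each style as its own training set and to match every audio file to its transcript by ID.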

The Azure documentation page that covers this, as part of the guidance on creating a custom neural voice with the Azure Cognitive Services Text-to-Speech service, is titled "Choose your voice talent".

The documentation provides a step-by-step guide to creating a custom neural voice, including how to record and label the data, how to train the model, and how to test and deploy the custom voice.
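For the testing step, a trained style is typically requested through SSML with the `mstts:express-as` element. The sketch below only builds the SSML string. The voice name `MyCustomVoiceNeural` and the style value `happy` are hypothetical placeholders, so substitute the names your own deployed model exposes. Sending the SSML to the service is left out here.

```python
from xml.sax.saxutils import escape, quoteattr


def build_styled_ssml(text: str, voice_name: str, style: str, lang: str = "en-US") -> str:
    """Build an SSML document that asks a voice to speak `text` in `style`.

    Assumption: `voice_name` and `style` match what the deployed custom
    voice actually supports; this function does not validate them.
    """
    return (
        '<speak version="1.0" '
        'xmlns="http://www.w3.org/2001/10/synthesis" '
        'xmlns:mstts="https://www.w3.org/2001/mstts" '
        f"xml:lang={quoteattr(lang)}>"
        f"<voice name={quoteattr(voice_name)}>"
        f"<mstts:express-as style={quoteattr(style)}>"
        f"{escape(text)}"  # escape &, <, > so the SSML stays well-formed
        "</mstts:express-as>"
        "</voice>"
        "</speak>"
    )


if __name__ == "__main__":
    # Hypothetical voice and style names, for illustration only.
    print(build_styled_ssml("We won the game!", "MyCustomVoiceNeural", "happy"))
```

Synthesizing the same sentence once per style and listening to the results side by side is a quick way to check that each style sounds distinct.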

I hope this helps. Thank you.

Please don't forget to click "Accept Answer" and "Yes" for "Was this answer helpful?".
