Can I use phonetic language to create perfect speech?

Geoff Surtees 0 Reputation points
2023-12-06T06:55:10.1666667+00:00

Can I use International Phonetic Alphabet (IPA) transcription in Azure text to speech to produce near-perfect speech? If so, how?

Azure AI Speech
An Azure service that integrates speech processing into apps and services.

1 answer

  1. navba-MSFT 24,270 Reputation points Microsoft Employee
    2023-12-06T07:11:14.0733333+00:00

    @Geoff Surtees Welcome to the Microsoft Q&A forum, and thank you for posting your query here!
    I understand that you are asking whether International Phonetic Alphabet (IPA) transcription can be used in Azure text to speech to produce near-perfect speech and, if so, how.

    Yes, you can use the International Phonetic Alphabet (IPA) for phonetic pronunciation in Azure Text to Speech. Azure AI services allow you to specify the phonetic pronunciation of words using the Universal Phone Set (UPS) in a structured text data file. The UPS is a machine-readable phone set that is based on the IPA. See here for more details.

    Here’s how you can do it:

    • Prepare a structured text data file where you specify the phonetic pronunciation of words using the UPS.
    • UPS pronunciations consist of a string of UPS phonemes, each separated by whitespace.
    • UPS phoneme labels are all defined using ASCII character strings.
    • You can either use a pronunciation data file on its own, or you can add pronunciation within a structured text data file.
    • The Speech service doesn’t support training a model with both of those datasets as input.
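
    The two format rules above (ASCII labels, whitespace separators) can be checked before a pronunciation goes into a data file. The following is a minimal sketch under stated assumptions: `parse_ups_pronunciation` is a hypothetical helper, not part of any Azure SDK, and the labels in the usage line are illustrative placeholders that have not been checked against the UPS table for any locale.

```python
from typing import List


def parse_ups_pronunciation(pronunciation: str) -> List[str]:
    """Split a UPS-style pronunciation string into phoneme labels.

    Hypothetical helper. It checks only the two properties stated in the
    answer: labels are ASCII strings, and labels are separated by
    whitespace. It does NOT validate labels against the real UPS table.
    """
    if not pronunciation.isascii():
        # Raw IPA symbols (for example "ə") are not ASCII, so they are rejected.
        raise ValueError("UPS phoneme labels must be ASCII")
    phonemes = pronunciation.split()  # splits on any run of whitespace
    if not phonemes:
        raise ValueError("pronunciation is empty")
    return phonemes


# Placeholder labels, for illustration only.
print(parse_ups_pronunciation("T AX M EY T OW"))
```

    A raw IPA string such as "təˈmeɪtoʊ" raises `ValueError` here, which is a quick way to catch IPA symbols pasted where ASCII UPS labels are expected.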

    Please note that structured text phonetic pronunciation data is separate from pronunciation data, and the two cannot be used together. Pronunciation data is “sounds-like” or spoken-form data: it is supplied as a separate file and teaches the model what the spoken form sounds like. For more detailed steps on implementing UPS, refer to the Structured text phonetic pronunciation guide provided by Microsoft. Note that structured-text data for training is in public preview.

    This should help you achieve near-perfect speech output with Azure Text to Speech. Remember that the quality of the speech output also depends on the accuracy of your phonetic transcriptions.
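
    The structured-text data described above is training input for a model, not something sent with each synthesis request. For a single text-to-speech request, the W3C SSML `phoneme` element, which to my knowledge Azure text to speech accepts with `alphabet="ipa"`, is another way to supply an IPA transcription. The sketch below only builds the SSML string; `ipa_phoneme_ssml` is a hypothetical helper, and the voice name and the IPA transcription of "tomato" are assumptions chosen for illustration. Sending the SSML to the service is not shown.

```python
from xml.sax.saxutils import escape, quoteattr

SSML_NS = "http://www.w3.org/2001/10/synthesis"  # W3C SSML namespace


def ipa_phoneme_ssml(text: str, ipa: str,
                     voice: str = "en-US-JennyNeural",  # assumed voice name
                     lang: str = "en-US") -> str:
    """Wrap `text` in an SSML <phoneme> element carrying an IPA transcription.

    Hypothetical helper: it produces an SSML document as a string and does
    not call any Azure API.
    """
    return (
        f'<speak version="1.0" xmlns="{SSML_NS}" xml:lang={quoteattr(lang)}>'
        f"<voice name={quoteattr(voice)}>"
        # quoteattr adds the surrounding quotes and escapes the attribute value.
        f'<phoneme alphabet="ipa" ph={quoteattr(ipa)}>{escape(text)}</phoneme>'
        "</voice></speak>"
    )


# Illustrative transcription; verify it against a dictionary before use.
ssml = ipa_phoneme_ssml("tomato", "təˈmeɪtoʊ")
print(ssml)
```

    Building the attribute values with `quoteattr` and the text with `escape` keeps the document well-formed even if the word or transcription contains characters such as `&` or quotes.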
    Hope this helps. If you have any follow-up questions, please let me know. I would be happy to help.
    Please do not forget to “Accept the answer” and “up-vote” wherever the information provided helps you; this can be beneficial to other community members.

