Why is the Isabella Multilingual voice available only in Clipchamp?

i'm MariOhn 61 Reputation points
2024-05-27T04:40:01.55+00:00

Hello, I noticed that the Isabella Multilingual voice for Thai Text to Speech is available in Clipchamp but not in Audio Content Creation. I'm interested in using this voice for my projects. I was wondering if there are any specific reasons why this voice is not available in Audio Content Creation and if there are plans to add it in the future. Thank you in advance for your response.

and

Difference in Remy Multilingual Thai voice quality between single words and full sentences.

I've been experimenting with the Remy Multilingual voice for Thai Text to Speech and noticed an interesting difference in voice quality. When I ask Remy to read a single word, the voice quality is excellent. However, when I request Remy to read a full sentence, the voice quality noticeably deteriorates. I'm curious about the reasons behind this difference in voice quality and would like to know if there are any ways to improve the voice quality when reading full sentences. Thank you in advance for your explanation.

Thank you.

Azure AI Speech
An Azure service that integrates speech processing into apps and services.

Accepted answer
  1. santoshkc 6,310 Reputation points Microsoft Vendor
    2024-05-27T08:01:07.1333333+00:00

    Hi @i'm MariOhn,

    Thank you for reaching out to the Microsoft Q&A forum!

    Regarding your first question about the Isabella Multilingual voice for Thai Text to Speech, it's possible that the voice is not available in Audio Content Creation due to technical limitations or licensing agreements. However, I don't have access to specific information about this. I recommend reaching out to Microsoft Azure support for more information on this topic.

    Regarding your second question about the Remy Multilingual voice, it's possible that the difference in voice quality between single words and full sentences is due to the way the voice is synthesized. When synthesizing a single word, the system has more control over the pronunciation and intonation of the word. However, when synthesizing a full sentence, the system has to take into account the context and flow of the sentence, which can be more challenging.

    To improve the voice quality when reading full sentences, you can try adjusting the punctuation and phrasing of the text to better match the natural flow of speech. Additionally, you can experiment with the speaking styles below (especially for Thai text) and with prosody settings to find the best fit for your project.

    User's image
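
As a rough illustration of the punctuation and prosody suggestion above, here is a minimal Python sketch that builds an SSML document with one `<prosody>` element per sentence and a `<break>` between sentences. It only constructs and validates the XML; it does not call the Speech service. The helper name `build_ssml`, the default rate, pitch, and pause values, and the sample Thai text are all hypothetical choices for illustration, and the voice name `fr-FR-RemyMultilingualNeural` is an assumption about how the Remy Multilingual voice is identified, so check it against the voice list for your Speech resource. Speaking styles are left out, since which styles a given voice supports needs to be checked separately.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr


def build_ssml(sentences, voice="fr-FR-RemyMultilingualNeural", lang="th-TH",
               rate="-5%", pitch="+0%", pause_ms=300):
    """Return an SSML string with one <prosody> per sentence.

    The voice name and the default rate/pitch/pause values are
    illustrative assumptions, not recommended settings.
    """
    # Explicit pause inserted between consecutive sentences.
    pause = f'<break time="{int(pause_ms)}ms"/>'
    body = pause.join(
        f"<prosody rate={quoteattr(rate)} pitch={quoteattr(pitch)}>"
        f"{escape(sentence)}</prosody>"
        for sentence in sentences
    )
    ssml = (
        '<speak version="1.0" '
        'xmlns="http://www.w3.org/2001/10/synthesis" '
        f"xml:lang={quoteattr(lang)}>"
        f"<voice name={quoteattr(voice)}>"
        # <lang> tells a multilingual voice which language to speak.
        f"<lang xml:lang={quoteattr(lang)}>{body}</lang>"
        "</voice></speak>"
    )
    ET.fromstring(ssml)  # raises ParseError if the XML is malformed
    return ssml


if __name__ == "__main__":
    # Hypothetical sample text: two short Thai sentences.
    print(build_ssml(["สวัสดีครับ", "วันนี้อากาศดีมาก"]))
```

The resulting string can be pasted into the SSML view of Audio Content Creation or passed to the Speech SDK. Trying a few different `rate`, `pitch`, and `pause_ms` values on the same sentence is a quick way to hear whether sentence-level quality improves.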

    I hope this information is helpful! Let me know if you have any further questions.


    If this answers your query, please click Accept Answer and select Yes for "Was this answer helpful".

