Note
Please see Azure Cognitive Services for Speech documentation for the latest supported speech solutions.
Microsoft Speech Platform
ISpVoice
The ISpVoice interface enables an application to perform speech synthesis operations. Applications can speak text strings and text files, or play audio files through this interface. All of these can be done synchronously or asynchronously.
A voice is an instance of a speech synthesis (text-to-speech, or TTS) engine that specifies a voice token to use for synthesizing speech from text. Applications can choose a specific TTS voice token using ISpVoice::SetVoice. If no voice token is selected, the TTS engine will use the default voice token, which is specified at the following registry key: HKEY_CURRENT_USER\Software\Microsoft\Speech Server\v11.0\Voices\DefaultTokenId.
Your applications can modify the characteristics of a voice (for example, rate, pitch, and volume) by embedding Speech Synthesis Markup Language (SSML) XML tags in the text to be spoken. See Use SSML to Create Prompts and Control TTS. Some attributes, such as rate and volume, can be changed in real time using ISpVoice::SetRate and ISpVoice::SetVolume. Applications can set the priority of a voice using ISpVoice::SetPriority.
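As an illustration of embedding SSML in the text to be spoken, the following minimal sketch builds an SSML document that wraps a string in a prosody element. The helper names EscapeXml and BuildSsml are hypothetical and not part of the Speech Platform API. The rate and volume labels are assumed to be SSML 1.0 values such as "slow" or "loud".

```cpp
#include <cassert>
#include <string>

// Escapes the five XML special characters so arbitrary text can be
// placed safely inside an SSML element or attribute value.
std::wstring EscapeXml(const std::wstring& text) {
    std::wstring out;
    out.reserve(text.size());
    for (wchar_t ch : text) {
        switch (ch) {
            case L'&':  out += L"&amp;";  break;
            case L'<':  out += L"&lt;";   break;
            case L'>':  out += L"&gt;";   break;
            case L'"':  out += L"&quot;"; break;
            case L'\'': out += L"&apos;"; break;
            default:    out += ch;        break;
        }
    }
    return out;
}

// Hypothetical helper: wraps `text` in a <speak> document containing a
// single <prosody> element. `rate` and `volume` are SSML 1.0 labels
// (for example L"slow", L"loud"); they are not validated here.
std::wstring BuildSsml(const std::wstring& text,
                       const std::wstring& rate,
                       const std::wstring& volume,
                       const std::wstring& lang = L"en-US") {
    // An empty language tag would produce an invalid xml:lang attribute.
    assert(!lang.empty());
    return L"<speak version=\"1.0\" "
           L"xmlns=\"http://www.w3.org/2001/10/synthesis\" "
           L"xml:lang=\"" + EscapeXml(lang) + L"\">"
           L"<prosody rate=\"" + EscapeXml(rate) +
           L"\" volume=\"" + EscapeXml(volume) + L"\">" +
           EscapeXml(text) +
           L"</prosody></speak>";
}
```

The resulting string would typically be passed to ISpVoice::Speak; whether a flag such as SPF_IS_XML is required depends on the SDK version, so check the Speak reference before relying on it.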
ISpVoice inherits from the ISpEventSource interface. An ISpVoice object forwards events back to the application when the corresponding audio data has been rendered to the output device.
Associated Class IDs
The following class IDs (CLSIDs) may be used with this interface.
- CLSID_SpVoice
See Application Object Classes for a complete CLSID listing for all interfaces.
Methods in Vtable Order
ISpVoice Methods | Description |
---|---|
ISpEventSource inherited methods | All methods of ISpEventSource are accessible from this interface. |
SetOutput | Sets the current output object. A value of NULL may be used to select the default audio device. |
GetOutputObjectToken | Retrieves the object token for the current audio output object. |
GetOutputStream | Retrieves a pointer to the current output stream. |
Pause | Pauses the voice at the nearest alert boundary and closes the output device. |
Resume | Sets the output device to the RUN state and resumes rendering. |
SetVoice | Sets the identity of the voice used for text synthesis. |
GetVoice | Retrieves the object token that identifies the voice used in text synthesis. |
Speak | Speaks the contents of a text string or file. |
SpeakStream | Speaks the contents of a stream. |
GetStatus | Retrieves the current rendering and event status associated with this ISpVoice instance. |
Skip | Causes the voice to skip forward or backward the specified number of items within the text of the current speak call. |
SetPriority | Sets the priority for the voice: normal, alert, or over. |
GetPriority | Retrieves the current voice priority level. |
SetAlertBoundary | Specifies which event should be used as the insertion point for alerts. |
GetAlertBoundary | Retrieves the event that is currently being used as the insertion point for alerts. |
SetRate | Sets the text rendering rate adjustment in real time. |
GetRate | Retrieves the current text rendering rate adjustment. |
SetVolume | Sets the synthesizer output volume level in real time. |
GetVolume | Retrieves the current output volume level of the synthesizer. |
WaitUntilDone | Blocks the caller until either the voice has completed speaking or the specified time interval has elapsed. |
SetSyncSpeakTimeout | Sets the timeout interval, in milliseconds, after which synchronous Speak and SpeakStream calls to this instance of the voice will time out. |
GetSyncSpeakTimeout | Retrieves the timeout interval for synchronous speech operations for this ISpVoice instance. |
SpeakCompleteEvent | Returns an event handle that will be signaled when the voice has completed speaking all pending requests. |
IsUISupported | Determines if the specified type of UI is supported. |
DisplayUI | Displays the requested UI. |
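
To show how several of the methods listed above fit together, here is a minimal sketch that creates a voice from CLSID_SpVoice, adjusts rate and volume, speaks asynchronously, and waits for completion. It makes several assumptions: the SDK's sapi.h header is on the include path, the required COM and Speech libraries are linked, and the caller has already initialized COM on the current thread. The function name SpeakAndWait and the rate and volume values are illustrative only. The non-Windows branch is a stub that exists solely so the sketch compiles elsewhere.

```cpp
#include <cassert>
#ifdef _WIN32
#include <sapi.h>  // assumed to come from the Speech SDK include directory
#endif

// Hypothetical helper: speaks `text` asynchronously on the default voice,
// then blocks until rendering finishes or `timeoutMs` elapses.
// Returns true only if speech completed within the timeout.
// Assumes COM is already initialized on the calling thread.
bool SpeakAndWait(const wchar_t* text, unsigned long timeoutMs) {
    assert(text != nullptr);
#ifdef _WIN32
    ISpVoice* voice = nullptr;
    HRESULT hr = CoCreateInstance(CLSID_SpVoice, nullptr, CLSCTX_ALL,
                                  IID_ISpVoice,
                                  reinterpret_cast<void**>(&voice));
    if (FAILED(hr)) {
        return false;  // for example, COM not initialized or no voice installed
    }

    // Illustrative values: a slightly slower rate and 80% volume.
    hr = voice->SetRate(-2);
    if (SUCCEEDED(hr)) hr = voice->SetVolume(80);

    // SPF_ASYNC queues the request and returns immediately.
    if (SUCCEEDED(hr)) hr = voice->Speak(text, SPF_ASYNC, nullptr);

    // Block until the queued speech is done or the timeout elapses.
    if (SUCCEEDED(hr)) hr = voice->WaitUntilDone(timeoutMs);

    voice->Release();
    return hr == S_OK;  // S_FALSE from WaitUntilDone indicates a timeout
#else
    (void)text;
    (void)timeoutMs;
    return false;  // stub: the Speech API is available only on Windows
#endif
}
```

A caller would typically invoke CoInitializeEx before SpeakAndWait and CoUninitialize afterward. Replacing the SPF_ASYNC and WaitUntilDone pair with a single synchronous Speak call is an equally valid design when the calling thread can afford to block.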