Note
Please see Azure Cognitive Services for Speech documentation for the latest supported speech solutions.
Microsoft Speech Platform
Audio Interfaces
Use the audio interfaces in the Speech Platform to manage audio input for speech recognition, audio output for text-to-speech (TTS) synthesis, and playback of audio files.
With the exception of ISpTranscript, the audio interfaces in the Speech Platform inherit from the standard COM IStream interface. However, because audio devices represent hardware, the inherited Clone method is not supported on them and returns E_NOTIMPL.
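The following is a minimal, portable sketch of how calling code might treat the E_NOTIMPL result from Clone. It does not use the real Speech Platform or Windows headers: `StreamLike`, `AudioDeviceStream`, and `CloneOrShare` are hypothetical names introduced only for illustration, and the constants are stand-ins that carry the same numeric values as S_OK and E_NOTIMPL in the Windows SDK.

```cpp
#include <cstdint>

// Portable stand-ins for the Windows HRESULT definitions, so that the
// sketch builds without <windows.h>. The values match the Windows SDK.
using HResult = std::int32_t;
constexpr HResult kOk      = 0;                                  // S_OK
constexpr HResult kNotImpl = static_cast<HResult>(0x80004001u);  // E_NOTIMPL

// Hypothetical minimal stream interface; only Clone is modeled here.
struct StreamLike {
    virtual ~StreamLike() = default;
    virtual HResult Clone(StreamLike** out) = 0;
};

// Hypothetical audio-device stream. A hardware device cannot be
// duplicated, so Clone reports "not implemented".
struct AudioDeviceStream : StreamLike {
    HResult Clone(StreamLike** out) override {
        if (out) *out = nullptr;
        return kNotImpl;
    }
};

// Tries to clone the stream. If cloning fails (for example with
// E_NOTIMPL), falls back to returning the original stream and sets
// `cloned` to false so the caller knows both handles are the same object.
StreamLike* CloneOrShare(StreamLike& stream, bool& cloned) {
    StreamLike* copy = nullptr;
    const HResult hr = stream.Clone(&copy);
    cloned = (hr == kOk && copy != nullptr);
    return cloned ? copy : &stream;
}
```

Under these assumptions, calling `CloneOrShare` on an `AudioDeviceStream` returns the original object with `cloned` set to false. Real code would check the HRESULT from the actual interface in the same way, rather than assuming a clone is available.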
Development Helpers
The following table lists enumerations, functions, and classes that are useful when working with audio in the Speech Platform:
Enumeration, Function, or Class | Description |
---|---|
SPSTREAMFORMAT | Stream formats supported by the Speech Platform. |
CSpEvent | Class for decoding event structures. |
CSpDynamicString | Class for managing dynamically sized WCHAR strings. |
SpBindToFile | Function that binds an audio stream to the specified file. |
CSpStreamFormat | Class for managing stream formats and WAVEFORMATEX structures supported by the Speech Platform. |
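
As an illustration of what a WAVEFORMATEX structure carries for uncompressed PCM audio, the sketch below derives the dependent fields from the sample rate, bit depth, and channel count. It is a portable approximation, not the Speech Platform's own helper code: `WaveFormat` is a hypothetical struct that mirrors the WAVEFORMATEX field names, and `MakePcmFormat` is a hypothetical function introduced for this example.

```cpp
#include <cstdint>

// Hypothetical portable mirror of the WAVEFORMATEX field layout.
struct WaveFormat {
    std::uint16_t wFormatTag;       // 1 corresponds to WAVE_FORMAT_PCM
    std::uint16_t nChannels;        // 1 = mono, 2 = stereo
    std::uint32_t nSamplesPerSec;   // sample rate in Hz
    std::uint32_t nAvgBytesPerSec;  // nSamplesPerSec * nBlockAlign for PCM
    std::uint16_t nBlockAlign;      // bytes per sample frame
    std::uint16_t wBitsPerSample;   // 8 or 16 for common PCM formats
    std::uint16_t cbSize;           // extra bytes after the struct; 0 for PCM
};

// Fills in a PCM format description. For PCM, the block alignment is
// channels * bitsPerSample / 8, and the average byte rate is the sample
// rate multiplied by the block alignment.
WaveFormat MakePcmFormat(std::uint32_t samplesPerSec,
                         std::uint16_t bitsPerSample,
                         std::uint16_t channels) {
    WaveFormat fmt{};
    fmt.wFormatTag      = 1;  // WAVE_FORMAT_PCM
    fmt.nChannels       = channels;
    fmt.nSamplesPerSec  = samplesPerSec;
    fmt.wBitsPerSample  = bitsPerSample;
    fmt.nBlockAlign     = static_cast<std::uint16_t>(channels * bitsPerSample / 8);
    fmt.nAvgBytesPerSec = samplesPerSec * fmt.nBlockAlign;
    fmt.cbSize          = 0;
    return fmt;
}
```

For example, `MakePcmFormat(22050, 16, 1)` describes 22.05 kHz, 16-bit mono audio with a block alignment of 2 bytes and an average rate of 44,100 bytes per second. In real code, the platform's own helpers and the actual WAVEFORMATEX type from the Windows headers would normally be used instead.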