Understanding Unified Messaging Audio Codecs
Applies to: Exchange Server 2010 SP3, Exchange Server 2010 SP2
In Microsoft Exchange Server 2010 Unified Messaging (UM), a codec is used to store voice mail messages. Another codec is used between an IP gateway or IP Private Branch eXchange (PBX) and the Unified Messaging server. Exchange 2010 Unified Messaging can use any of the following four audio codecs to create and store voice messages:
MP3 (default)
Windows Media Audio (WMA)
Global System for Mobile Communications (GSM) 06.10
G.711 Pulse Code Modulation (PCM) Linear
In contrast, the G.711 (PCMA and PCMU) and G.723.1 codecs are VoIP codecs that are used between an IP gateway or IP PBX and the Unified Messaging server.
Part of planning your Unified Messaging system involves selecting the correct audio codec based on the needs and requirements of your organization. This topic discusses the audio codecs that Unified Messaging can use and will help you plan your UM deployment.
Codecs
Two types of codecs are used in Unified Messaging. The first is the VoIP codec that's used between IP gateways and the Unified Messaging server or, depending on the type of PBX, between a PBX and an IP gateway. The second is the codec that's used to encode and store voice messages for users.
The term codec is a combination of the words "coding" and "decoding" and is used with digital audio data. A codec is a software program that transforms digital data into an audio file format or audio streaming format. Codecs are used to convert an analog voice signal to a digital version of the voice signal. Codecs can vary in their sound quality, the bandwidth required to use them, and the system requirements needed to do the encoding.
When you use an ordinary telephone over the Public Switched Telephone Network (PSTN) your voice is transported in an analog format over the telephone line. But with Voice over IP (VoIP), your voice must be converted into digital signals. This conversion process is known as encoding. Encoding is performed by a codec. After the digitized voice has reached its destination, it must then be decoded back to its original analog format so the person on the other end of the call can hear and understand the caller.
VoIP Codec
In Unified Messaging, three types of codecs can be used between IP gateways or IP PBXs and the Unified Messaging server. Unified Messaging servers can accept the following VoIP codecs from an IP gateway or IP PBX:
G.711 µ-law
G.711 A-law
G.723.1
G.711 is an ITU-T standard for encoding telephone-quality audio. The standard defines two main algorithms: the µ-law algorithm, which is used in North America and Japan, and the A-law algorithm, which is used in Europe and other countries. The G.723.1 audio codec is mostly used in VoIP applications and requires a license to be used. G.723.1 is a high quality, high compression type of codec.
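To give a feel for what the µ-law algorithm does, the following is a minimal sketch of continuous µ-law companding with µ = 255. It is an illustration only: G.711 itself specifies a piecewise-linear, 8-bit approximation of this curve, and the function names `mu_law_compress` and `mu_law_expand` are made up for this example.

```python
import math

MU = 255  # companding parameter associated with µ-law telephony

def mu_law_compress(x: float) -> float:
    """Compress a sample in [-1.0, 1.0] using the continuous µ-law curve."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)

def mu_law_expand(y: float) -> float:
    """Invert mu_law_compress for a value in [-1.0, 1.0]."""
    return math.copysign(((1 + MU) ** abs(y) - 1) / MU, y)
```

Quiet samples are boosted before quantization (for example, `mu_law_compress(0.01)` is much larger than 0.01), which is why companded 8-bit audio can sound better than plain 8-bit linear audio at the same bit rate.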
A Unified Messaging server and a supported IP gateway or IP PBX can each offer both the G.711 and G.723.1 codecs. By default, the first codec that's offered is G.723.1. If you want to use a codec other than G.723.1 between the Unified Messaging server and the IP gateway or IP PBX, we recommend that you change the configuration on the IP gateway or IP PBX. The following table summarizes some common VoIP codecs.
VoIP codecs
VoIP codec | Bandwidth (Kbps) | Description |
---|---|---|
G.711 | 64 | This codec requires very low processing. It needs a minimum of 128 kilobits per second (Kbps) for two-way communication. |
G.723.1 | 5.3/6.3 | This codec offers high compression with good audio quality for its bit rate. It requires more processing than the G.711 codec. The G.723.1 codec uses reduced bandwidth but offers poorer quality audio than the G.711 codec. |
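The two-way figure for G.711 is simply twice the one-way codec rate. The sketch below applies the same arithmetic to the rates listed in the table. The dictionary and function names are hypothetical, and the result covers codec payload only; packet header overhead on a real network is not included.

```python
# One-way codec bit rates in Kbps, copied from the VoIP codecs table.
VOIP_CODEC_KBPS = {
    "G.711": (64.0,),
    "G.723.1": (5.3, 6.3),
}

def two_way_kbps(codec: str) -> float:
    """Return the payload bandwidth for a two-way call at the codec's highest rate."""
    return 2 * max(VOIP_CODEC_KBPS[codec])
```

For example, `two_way_kbps("G.711")` returns 128.0, which matches the minimum stated in the table.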
UM Voice Message Storage Codec
Unified Messaging dial plans are integral to the operation of Unified Messaging. By default, when you create a UM dial plan, the UM dial plan uses the MP3 audio codec. However, after you create the UM dial plan, you can configure it to use the WMA, GSM 06.10, or G.711 PCM Linear audio codec.
Each audio codec has advantages and disadvantages. The MP3 audio codec was selected as the default audio codec because of its sound quality and compression properties. The WMA, GSM 06.10, and G.711 PCM Linear audio codecs were included as available options because of their ability to support other types of messaging systems.
When you plan for Unified Messaging, you must balance the size and the relative quality of the audio file that will be created for voice messages. Generally, the higher the bit rate for an audio file, the higher the quality. You must also consider whether the audio file is compressed. The sample size (in bits) and compression properties for each audio codec used in Unified Messaging are as follows:
Default UM voice message storage codecs
Voice message storage codec | Bits | Compressed file? |
---|---|---|
MP3 | 16 bit | Yes |
WMA | 16 bit | Yes |
G.711 PCM | 16 bit | No |
GSM 06.10 | 8 bit | Yes |
In Unified Messaging, the MP3, WMA, G.711 PCM Linear, and GSM 06.10 audio codecs are used to create .mp3, .wma, and .wav audio files for voice messages. The file type that's created depends on the audio codec that's used: the MP3 audio codec creates .mp3 audio files, the WMA audio codec creates .wma audio files, and the GSM 06.10 and G.711 PCM Linear audio codecs produce .wav audio files. In each case, the audio file is sent together with the e-mail message to the recipient of the voice message.
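The codec-to-file-type relationship described above can be written as a simple lookup. This is a sketch for planning scripts only; the names `STORAGE_CODEC_EXTENSION` and `attachment_name` are hypothetical and are not part of any Exchange API.

```python
# File name extension produced by each voice message storage codec.
STORAGE_CODEC_EXTENSION = {
    "MP3": ".mp3",
    "WMA": ".wma",
    "GSM 06.10": ".wav",
    "G.711 PCM Linear": ".wav",
}

def attachment_name(codec: str, base_name: str = "voicemail") -> str:
    """Build an illustrative attachment file name for the given storage codec."""
    return base_name + STORAGE_CODEC_EXTENSION[codec]
```

Note that two different codecs map to .wav, so the extension alone doesn't tell you whether a .wav attachment is compressed.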
Frequently, but not always, coding and decoding the digital data also involves compression or decompression. Audio compression is a form of data compression that reduces the size of audio data files. The audio compression algorithm used by the audio codec compresses the .mp3, .wma, or .wav audio files. In Unified Messaging, the type of audio compression algorithm that is used is based on the type of audio codec selected in the UM dial plan properties. After the audio file is created and compressed, it's attached to the voice message.
Sometimes information from the digital data is lost during compression and decompression. The higher the compression that is used to compress the audio file, the greater the loss of information during the conversion. However, less disk space is used because the size of the audio file is reduced. Conversely, the lower the compression, the lower the loss of the information. However, more disk space must be used because of the increased size of each audio file.
RTAudio wideband or high fidelity audio for recording voice messages is also available as an audio codec. However, high fidelity audio using RTAudio is available only after you have successfully integrated Exchange 2010 Unified Messaging with Office Communications Server 2007 R2 or Microsoft Lync Server 2010 (the next generation of Office Communications Server). To enable RTAudio, the UM dial plan must be configured as a Session Initiation Protocol (SIP) URI-type dial plan and you must set the call answering codec on the dial plan to WMA.
Important
RTAudio is not available in environments where neither Office Communications Server 2007 R2 nor Lync Server 2010 is deployed. This is because, in these environments, the dial plan is set to Telephone Extension and not SIP URI.
There are two media streams for each incoming call: inbound to a Unified Messaging server and outbound from a Unified Messaging server. When the dial plan type is set to SIP URI and the call-answering codec on the dial plan is set to WMA, a Unified Messaging server tries to select the RTAudio VoIP codec for the inbound media stream. If negotiation is successful, the RTAudio codec for the inbound stream will be used for call answering calls or calls that originate from Office Communicator 2007.
Note
Calls placed by using the Play on Phone feature will not use the RTAudio codec. The inbound stream for calls placed by using Play on Phone will use the G.711 or G.723.1 codec.
When the RTAudio codec is used, the voice message that is recorded will be recorded in high fidelity and will be stored as an audio file that has a .wma extension. When the voice message is played back to the user in Office Outlook 2007 or Outlook Web Access, they will hear the voice message in high fidelity audio. If negotiation is unsuccessful, either the G.711 or G.723.1 codec will be used. Both the G.711 and the G.723.1 codecs are narrowband codecs. When they're used as the VoIP codec, the voice message is recorded and stored as a narrowband audio file that has a .wma extension.
The outbound media stream will always be negotiated by using either the G.711 or G.723.1 codec. This means that callers will always hear narrowband audio over the telephone. This also applies to situations when a call is placed by using Office Communicator.
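The selection rules described in this section can be summarized in a short sketch. It is a simplified model of the behavior described here, not the server's actual negotiation logic: the function names, parameters, and string labels are hypothetical, and the narrowband codec is passed in because the choice between G.711 and G.723.1 is itself negotiated.

```python
NARROWBAND_CODECS = {"G.711", "G.723.1"}

def select_inbound_codec(dial_plan_type: str,
                         call_answering_codec: str,
                         peer_offers_rtaudio: bool,
                         play_on_phone: bool = False,
                         narrowband_codec: str = "G.723.1") -> str:
    """Return the VoIP codec for the stream inbound to the UM server."""
    if narrowband_codec not in NARROWBAND_CODECS:
        raise ValueError("narrowband codec must be G.711 or G.723.1")
    # Play on Phone calls never use RTAudio.
    if play_on_phone:
        return narrowband_codec
    # RTAudio is attempted only for SIP URI dial plans set to WMA.
    if (dial_plan_type == "SIP URI"
            and call_answering_codec == "WMA"
            and peer_offers_rtaudio):
        return "RTAudio"
    # Negotiation failed or was never attempted: fall back to narrowband.
    return narrowband_codec

def select_outbound_codec(narrowband_codec: str = "G.723.1") -> str:
    """Return the VoIP codec for the stream outbound from the UM server."""
    if narrowband_codec not in NARROWBAND_CODECS:
        raise ValueError("narrowband codec must be G.711 or G.723.1")
    return narrowband_codec
```

For example, a Telephone Extension dial plan always falls through to the narrowband codec, which is consistent with the Important note about RTAudio availability.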
The audio format and codec that Unified Messaging servers use to store the audio in voice messages depend not only on the audio codec that's configured on the dial plan but also on the bit rate of the audio that UM negotiates with a SIP peer. If your environment includes Office Communications Server 2007 R2, Lync Server 2010, or other SIP endpoints, a Unified Messaging server will also negotiate the audio codec to use with a SIP peer. For example, when wideband RTAudio is negotiated as the wire codec, a Unified Messaging server will then use either the 32 Kbps MP3 or WMA 9.2 format when creating voice messages, depending on the dial plan setting. The following table shows the relationship between the voice message storage audio codec and the VoIP or wire audio codec that's used.
Relationship between the storage audio codec and the VoIP or wire audio codec
Audio codec configured on a UM dial plan | VoIP or wire codec (narrowband): G.723.1, G.711, or RTAudio (8 kHz) | VoIP or wire codec (wideband): RTAudio (16 kHz) |
---|---|---|
G.711 | G.711 | Not applicable. A UM server doesn't negotiate wideband audio if the dial plan is set to G.711. |
WMA | WMA 9 Voice | WMA 9.2 |
GSM | GSM 06.10 | Not applicable. A UM server doesn't negotiate wideband audio if the dial plan is set to GSM. |
MP3 | MP3 (16 Kbps) | MP3 (32 Kbps) |
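The table can be read as a two-key lookup: the dial plan codec and the bandwidth class of the negotiated wire codec. The following sketch encodes it that way. The names `STORED_FORMAT` and `stored_format` are hypothetical, and `None` stands for the "Not applicable" cells.

```python
from typing import Optional

# Storage format by dial plan codec and wire codec bandwidth class.
STORED_FORMAT = {
    "G.711": {"narrowband": "G.711", "wideband": None},
    "WMA": {"narrowband": "WMA 9 Voice", "wideband": "WMA 9.2"},
    "GSM": {"narrowband": "GSM 06.10", "wideband": None},
    "MP3": {"narrowband": "MP3 (16 Kbps)", "wideband": "MP3 (32 Kbps)"},
}

def stored_format(dial_plan_codec: str, wire_band: str) -> Optional[str]:
    """Return the stored format, or None when wideband isn't negotiated."""
    return STORED_FORMAT[dial_plan_codec][wire_band]
```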
UM Message Sizing
You can configure Unified Messaging to use one of the four following audio codecs for creating voice messages: MP3, WMA, GSM 06.10, and G.711 PCM Linear. By default, the MP3 format is selected. The MP3 format is a common audio file format that's used to greatly reduce the size of the audio file and is most commonly used by personal audio devices or MP3 players. MP3 is a cross-platform type of audio codec and is used for compatibility with many mobile phones and devices and different computer operating systems.
Audio files encoded using the WMA audio codec are always stored in the Windows Media format, and the attachment is a file that has a .wma file name extension. Audio files encoded using the GSM or G.711 PCM Linear audio codecs are always stored in RIFF/WAV format, and the attachment is a file that has a .wav file name extension.
The size of Unified Messaging voice messages depends on the size of the attachment that holds the voice data. In turn, the size of the attachment depends on the following factors:
The duration of the voice mail recording
The audio codec that is used
The audio file storage format
The following figure shows how the size of the audio file depends on the duration of the voice mail recording for the three audio codecs that you can use in UM.
Note
In this figure, the average length of a call-answered voice message is approximately 30 seconds.
Audio file size
MP3
MP3 is the default audio file format for voice mail messages. It's a common audio file format that greatly reduces the size of the audio file and is most commonly used by personal audio devices or MP3 players. Because MP3 is a cross-platform audio codec, it's compatible with many mobile phones and devices and with different computer operating systems.
WMA
WMA is the most highly compressed of the three audio codecs. A WMA recording uses approximately 11,000 bytes for each 10 seconds of audio. However, the .wma file format has a much larger header section than the .wav file format. The .wma file header section is approximately 7 kilobytes (KB), whereas the header section for the .wav file is less than 100 bytes. As a result, WMA audio recordings become smaller than GSM audio recordings only when they're longer than approximately 15 seconds. For most voice messages, therefore, the WMA audio codec gives the smallest files at the highest compressed quality.
G.711 PCM Linear
The G.711 PCM Linear audio codec creates .wav audio files that are not compressed. Therefore, G.711 PCM Linear .wav audio files occupy the most space for any given duration when they're compared to files created by the GSM and WMA audio codecs. G.711 PCM Linear .wav audio files occupy just over 160,000 bytes for each 10 seconds of audio. G.711 PCM Linear .wav audio files have the highest audio quality of the three audio codecs. However, the quality of comparable audio files created using the WMA and GSM audio codecs is acceptable to most users who listen to voice messages.
GSM
The GSM audio codec creates .wav audio files that are compressed. GSM .wav audio files are just over 16,000 bytes for each 10 seconds of audio. However, GSM creates an audio file larger than the audio file created by the WMA audio codec. Therefore, when you are balancing the quality of the voice message and the size, this may not be the best choice.
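The approximate figures quoted for WMA, G.711 PCM Linear, and GSM can be combined into a rough size estimate. The sketch below is a linear model built only from those rounded numbers: roughly 7,000 bytes is assumed for the .wma header, 100 bytes is used as an upper bound for the .wav header, and the per-second rates are the quoted 10-second figures divided by 10. The names are hypothetical, and real files will differ somewhat.

```python
# (header bytes, bytes per second of audio), from the approximate figures above.
CODEC_SIZE_MODEL = {
    "WMA": (7_000, 1_100),
    "GSM": (100, 1_600),
    "G.711 PCM Linear": (100, 16_000),
}

def estimated_bytes(codec: str, seconds: float) -> float:
    """Estimate the attachment size for a recording of the given duration."""
    header, per_second = CODEC_SIZE_MODEL[codec]
    return header + per_second * seconds

def crossover_seconds(codec_a: str, codec_b: str) -> float:
    """Return the duration at which both codecs produce equal-sized files."""
    header_a, rate_a = CODEC_SIZE_MODEL[codec_a]
    header_b, rate_b = CODEC_SIZE_MODEL[codec_b]
    return (header_a - header_b) / (rate_b - rate_a)
```

With these assumptions, `crossover_seconds("WMA", "GSM")` lands a little under 15 seconds, and for a 30-second message the WMA estimate is smaller than the GSM estimate and far smaller than the G.711 PCM Linear estimate.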
© 2010 Microsoft Corporation. All rights reserved.