WAVEFORMATEXTENSIBLE

The WAVEFORMATEXTENSIBLE structure specifies the format of an audio wave stream.

typedef struct
{
  WAVEFORMATEX  Format;
  union
  {
    WORD  wValidBitsPerSample;
    WORD  wSamplesPerBlock;
    WORD  wReserved;
  } Samples;
  DWORD  dwChannelMask;
  GUID  SubFormat;
} WAVEFORMATEXTENSIBLE, *PWAVEFORMATEXTENSIBLE;

Members

Format
Specifies the stream's wave-data format. This member is a structure of type WAVEFORMATEX. The wFormatTag member of WAVEFORMATEX should be set to WAVE_FORMAT_EXTENSIBLE. The wBitsPerSample member of WAVEFORMATEX is defined unambiguously as the size of the container for each sample. Sample containers are always byte-aligned, and wBitsPerSample must be a multiple of eight.
Samples
A union containing one of the following three members: wValidBitsPerSample, wSamplesPerBlock, or wReserved. These three members are described in the following text.
wValidBitsPerSample
Specifies the precision of the sample in bits. The value of this member should be less than or equal to the container size specified in the Format.wBitsPerSample member. See the following Comments section.
wSamplesPerBlock
Specifies the number of samples contained in one compressed block. This value is useful for estimating buffer requirements for compressed formats that have a fixed number of samples within each block. Set this member to zero if each block of compressed audio data contains a variable number of samples. In this case, buffer-estimation and buffer-position information must be obtained in other ways.
wReserved
Reserved for internal use by the operating system. Initialize to zero.
dwChannelMask
Specifies the assignment of channels in the multichannel stream to speaker positions. The encoding is the same as that used for the ActiveSpeakerPositions member of the KSAUDIO_CHANNEL_CONFIG structure. See the following Comments section.
SubFormat
Specifies the subformat. See the following Comments section.

Headers

Declared in ksmedia.h and mmreg.h. Include ksmedia.h or mmreg.h.

Comments

WAVEFORMATEXTENSIBLE is an extended form of the WAVEFORMATEX structure, which it embeds as its Format member. WAVEFORMATEXTENSIBLE overcomes the two major limitations of WAVEFORMATEX, which is unable to unambiguously specify formats with more than 16 bits per sample or with more than two channels. For more information, see the discussion of WAVEFORMATEXTENSIBLE and WAVEFORMATEX in Data Formats for Audio Wave Streams.

Frequently, the wValidBitsPerSample member, which specifies the sample precision, contains the same value as the Format.wBitsPerSample member, which specifies the sample container size. However, these values can be different. For example, if the wave data originated from a 20-bit A/D converter, then wValidBitsPerSample should be 20 but Format.wBitsPerSample might be 24 or 32. If wValidBitsPerSample is less than Format.wBitsPerSample, the valid bits (the actual PCM data) are left-aligned within the container. The unused bits in the least-significant portion of the container should be set to zero.
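
The left-alignment rule above can be illustrated with a minimal sketch in portable C. The function name left_align_sample and its parameters are hypothetical and are not part of any Windows header; the sketch assumes a container of at most 32 bits whose size is a multiple of eight.

```c
#include <stdint.h>

/* Left-align a signed sample of valid_bits precision inside a container of
   container_bits (a multiple of 8, at most 32, and >= valid_bits).
   The unused least-significant bits of the container are left as zero.
   The result occupies the low container_bits bits of the return value. */
static uint32_t left_align_sample(int32_t sample, unsigned valid_bits,
                                  unsigned container_bits)
{
    unsigned shift = container_bits - valid_bits;
    uint32_t container_mask = (container_bits == 32)
                                  ? 0xFFFFFFFFu
                                  : ((1u << container_bits) - 1u);

    /* Convert to unsigned before shifting so that negative samples are
       shifted with well-defined behavior. */
    return ((uint32_t)sample << shift) & container_mask;
}
```

For a 20-bit sample stored in a 24-bit container, the shift is four bits, so the four least-significant bits of every container are zero.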

Sample containers begin and end on byte boundaries, and the value of Format.wBitsPerSample should always be a multiple of eight. Also, the value of wValidBitsPerSample should never exceed that of Format.wBitsPerSample. Drivers should reject wave formats that violate these rules.
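
A driver-side check of these two rules might look like the following sketch. The function name sample_sizes_are_valid is hypothetical, and the parameters stand in for Format.wBitsPerSample and Samples.wValidBitsPerSample; the sketch covers only the two rules stated above, not a complete format validation.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true if the container size and sample precision obey the rules:
   - the container size is a multiple of eight bits, and
   - the valid bits do not exceed the container size. */
static bool sample_sizes_are_valid(uint16_t bits_per_sample,
                                   uint16_t valid_bits_per_sample)
{
    if (bits_per_sample % 8 != 0) {
        return false;   /* container is not byte-aligned */
    }
    if (valid_bits_per_sample > bits_per_sample) {
        return false;   /* precision exceeds the container */
    }
    return true;
}
```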

The WAVEFORMATEXTENSIBLE structure's dwChannelMask member contains a mask indicating which channels are present in the multichannel stream. The least-significant bit represents the front-left speaker, the next bit corresponds to the front-right speaker, and so on. The following flag bits are defined in the header file ksmedia.h.

Speaker Position                 Flag Bit
SPEAKER_FRONT_LEFT               0x1
SPEAKER_FRONT_RIGHT              0x2
SPEAKER_FRONT_CENTER             0x4
SPEAKER_LOW_FREQUENCY            0x8
SPEAKER_BACK_LEFT                0x10
SPEAKER_BACK_RIGHT               0x20
SPEAKER_FRONT_LEFT_OF_CENTER     0x40
SPEAKER_FRONT_RIGHT_OF_CENTER    0x80
SPEAKER_BACK_CENTER              0x100
SPEAKER_SIDE_LEFT                0x200
SPEAKER_SIDE_RIGHT               0x400
SPEAKER_TOP_CENTER               0x800
SPEAKER_TOP_FRONT_LEFT           0x1000
SPEAKER_TOP_FRONT_CENTER         0x2000
SPEAKER_TOP_FRONT_RIGHT          0x4000
SPEAKER_TOP_BACK_LEFT            0x8000
SPEAKER_TOP_BACK_CENTER          0x10000
SPEAKER_TOP_BACK_RIGHT           0x20000

The channels that are specified in dwChannelMask should be present in the order shown in the preceding table, beginning at the top.

For example, if only front-left and front-center are specified, then front-left and front-center should be in channels 0 and 1, respectively, of the interleaved stream.

As a second example, if nChannels (in the Format member; see WAVEFORMATEX) is set to 4 and dwChannelMask is set to 0x00000033, the audio channels are intended for playback to the front-left, front-right, back-left, and back-right speakers. The channel data should be interleaved in that order within each block.
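
The ordering rule in the two examples above can be sketched as a small helper that finds where a given speaker's data sits in the interleaved stream. The function name channel_index_for_speaker is hypothetical; the sketch takes raw flag values from the preceding table instead of including ksmedia.h, and it assumes nChannels is at least the number of bits set in the mask.

```c
#include <stdint.h>

/* Returns the zero-based channel index of `speaker` within the interleaved
   stream described by `channel_mask`, or -1 if `speaker` is not exactly one
   flag bit or is not present in the mask. Channels appear in order of
   increasing flag-bit value, so the index equals the number of mask bits
   that lie below the speaker's bit. */
static int channel_index_for_speaker(uint32_t channel_mask, uint32_t speaker)
{
    int index = 0;
    uint32_t bit;

    /* Reject zero and multi-bit arguments. */
    if (speaker == 0 || (speaker & (speaker - 1u)) != 0) {
        return -1;
    }
    if ((channel_mask & speaker) == 0) {
        return -1;
    }
    for (bit = 1u; bit != speaker; bit <<= 1) {
        if (channel_mask & bit) {
            index++;
        }
    }
    return index;
}
```

With a mask of 0x00000033, the back-left flag (0x10) maps to channel 2 and the back-right flag (0x20) maps to channel 3, matching the interleaving order described above.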

Channel locations beyond the predefined ones are considered reserved.

Alternatively, the channel mask can be specified as one of the following constants, which are defined in ksmedia.h and are bitwise ORed combinations of the preceding flags that represent standard speaker configurations:

KSAUDIO_SPEAKER_MONO

KSAUDIO_SPEAKER_STEREO

KSAUDIO_SPEAKER_QUAD

KSAUDIO_SPEAKER_SURROUND

KSAUDIO_SPEAKER_5POINT1

KSAUDIO_SPEAKER_7POINT1

KSAUDIO_SPEAKER_DIRECTOUT

A hardware device can be set to one of these speaker configurations by a KSPROPERTY_AUDIO_CHANNEL_CONFIG set-property request. For more information on setting speaker configurations, see KSAUDIO_CHANNEL_CONFIG.

Typically, the count in nChannels equals the number of bits set in dwChannelMask, but this is not necessarily so. If nChannels is less than the number of bits set in dwChannelMask, the extra (most significant) bits in dwChannelMask are ignored. If nChannels exceeds the number of bits set in dwChannelMask, the channels that have no corresponding mask bits are not assigned to any particular speaker location. In any speaker configuration other than KSAUDIO_SPEAKER_DIRECTOUT, an audio sink like KMixer (see KMixer System Driver) simply ignores these excess channels and mixes only the channels that have corresponding mask bits.
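
The two mismatch cases described above can be sketched as follows. All function names are hypothetical, and the parameters stand in for dwChannelMask and Format.nChannels. The sketch only models the bookkeeping; it does not reproduce what KMixer or any other component actually does internally.

```c
#include <stdint.h>

/* Counts the bits set in a channel mask. */
static unsigned count_mask_bits(uint32_t mask)
{
    unsigned count = 0;
    while (mask != 0) {
        mask &= mask - 1u;   /* clear the lowest set bit */
        count++;
    }
    return count;
}

/* Case 1: fewer channels than mask bits. Keeps only the n_channels
   least-significant set bits; the extra most-significant bits are dropped. */
static uint32_t effective_channel_mask(uint32_t channel_mask,
                                       unsigned n_channels)
{
    uint32_t kept = 0;
    while (n_channels > 0 && channel_mask != 0) {
        uint32_t lowest = channel_mask & (~channel_mask + 1u);
        kept |= lowest;
        channel_mask &= ~lowest;
        n_channels--;
    }
    return kept;
}

/* Case 2: more channels than mask bits. Returns how many trailing channels
   have no speaker position assigned. */
static unsigned unassigned_channel_count(uint32_t channel_mask,
                                         unsigned n_channels)
{
    unsigned bits = count_mask_bits(channel_mask);
    return (n_channels > bits) ? (n_channels - bits) : 0u;
}
```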

KSAUDIO_SPEAKER_DIRECTOUT represents a configuration with no speakers and is defined in ksmedia.h as zero. In this configuration, the audio device renders the first channel to the first port on the device, the second channel to the second port on the device, and so on. This allows an audio authoring application to output multichannel data directly and without modification to a device such as a digital mixer or a digital audio storage device (hard disk or ADAT). For example, channels 0 through 30 might contain, respectively, drums, guitar, bass, voice, and so on. For this kind of raw audio data, speaker positions are meaningless, and assigning speaker positions to the input or output streams could cause a component such as KMixer to intervene inappropriately by performing an unwanted format conversion. If a device is unable to process the raw audio streams, it should reject a request to change its speaker configuration to KSAUDIO_SPEAKER_DIRECTOUT.

For more information on multichannel configurations, see the white paper titled Multiple Channel Audio Data and WAVE Files at the audio technology Web site.

The SubFormat member contains a GUID that specifies the subformat. The subformat information is similar to that provided by the wave-format tag in the WAVEFORMATEX structure's wFormatTag member. The following table shows some typical SubFormat GUIDs and their corresponding wave-format tags.

SubFormat GUID                     Wave-Format Tag
KSDATAFORMAT_SUBTYPE_PCM           WAVE_FORMAT_PCM
KSDATAFORMAT_SUBTYPE_IEEE_FLOAT    WAVE_FORMAT_IEEE_FLOAT
KSDATAFORMAT_SUBTYPE_DRM           WAVE_FORMAT_DRM
KSDATAFORMAT_SUBTYPE_ALAW          WAVE_FORMAT_ALAW
KSDATAFORMAT_SUBTYPE_MULAW         WAVE_FORMAT_MULAW
KSDATAFORMAT_SUBTYPE_ADPCM         WAVE_FORMAT_ADPCM

Every wave-format tag has a corresponding SubFormat GUID, as explained in Converting Between Format Tags and Subformat GUIDs. This means that every format that can be described by WAVEFORMATEX (by itself) can also be described by WAVEFORMATEXTENSIBLE. Because WAVEFORMATEXTENSIBLE is an extended version of WAVEFORMATEX, it can describe additional formats that cannot be described by WAVEFORMATEX alone. Vendors are free to define their own SubFormat GUIDs to identify proprietary formats for which no wave-format tags exist.
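
As a rough illustration of the tag-to-GUID correspondence, the following sketch builds a SubFormat value from a 16-bit wave-format tag. It assumes the convention used by the conversion macros in ksmedia.h, in which the tag is placed in the first 32-bit field of the base GUID 00000000-0000-0010-8000-00AA00389B71. The ExampleGuid type and the function names are hypothetical stand-ins so that the sketch compiles without the Windows headers; production code should use the GUID type and the macros that ksmedia.h provides.

```c
#include <stdint.h>
#include <string.h>

/* Stand-in for the Windows GUID layout (hypothetical name). */
typedef struct {
    uint32_t data1;
    uint16_t data2;
    uint16_t data3;
    uint8_t  data4[8];
} ExampleGuid;

/* Trailing bytes of the assumed base GUID. */
static const uint8_t kBaseData4[8] = {
    0x80, 0x00, 0x00, 0xAA, 0x00, 0x38, 0x9B, 0x71
};

/* Builds the SubFormat GUID that corresponds to a wave-format tag. */
static ExampleGuid subformat_from_format_tag(uint16_t format_tag)
{
    ExampleGuid g;
    g.data1 = format_tag;   /* the tag occupies the low 16 bits */
    g.data2 = 0x0000;
    g.data3 = 0x0010;
    memcpy(g.data4, kBaseData4, sizeof kBaseData4);
    return g;
}

/* Returns nonzero if the GUID follows the tag-derived pattern, as opposed
   to a vendor-defined GUID with no corresponding format tag. */
static int is_format_tag_subformat(const ExampleGuid *g)
{
    return g->data1 <= 0xFFFFu &&
           g->data2 == 0x0000 &&
           g->data3 == 0x0010 &&
           memcmp(g->data4, kBaseData4, sizeof kBaseData4) == 0;
}

/* Recovers the wave-format tag; call only after the check above succeeds. */
static uint16_t format_tag_from_subformat(const ExampleGuid *g)
{
    return (uint16_t)g->data1;
}
```

Under this convention, a tag value of 0x0001 (WAVE_FORMAT_PCM) yields 00000001-0000-0010-8000-00AA00389B71, which is the value of KSDATAFORMAT_SUBTYPE_PCM.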

See Also

WAVEFORMATEX, KSAUDIO_CHANNEL_CONFIG