
Data Formats for Audio Wave Streams

The data format for the wave stream that an audio device renders or captures is specified by a variable-length format descriptor, as shown in the following figure.

Wave-Format Descriptor

The format descriptor begins with a KSDATAFORMAT structure. The amount of additional format information following the KSDATAFORMAT structure varies depending on the data format.

Audio systems use this type of format descriptor in several ways:

The format information that follows the KSDATAFORMAT structure should preferably be in the form of a WAVEFORMATEXTENSIBLE structure, although, for the sake of backwards compatibility, the WDM audio subsystem continues to support the obsolete WAVEFORMATEX structure for some PCM formats. WAVEFORMATEX is an extended version of the pre-WDM WAVEFORMAT structure. WAVEFORMAT is obsolete and is not supported by the WDM audio subsystem in any version of Windows.

Similarly, the PCMWAVEFORMAT structure is an obsolete extended version of WAVEFORMAT, but the WDM audio subsystem still provides limited support for it.

For information about WAVEFORMAT and PCMWAVEFORMAT, see the Platform SDK documentation.

The four wave-format structures (WAVEFORMAT, PCMWAVEFORMAT, WAVEFORMATEX, and WAVEFORMATEXTENSIBLE) all begin with the same five members, starting with wFormatTag. The preceding figure shows these four structures superimposed on each other to highlight the portions of the structures that are identical. PCMWAVEFORMAT and WAVEFORMATEX extend WAVEFORMAT by adding a wBitsPerSample member, but WAVEFORMATEX also adds a cbSize member. WAVEFORMATEXTENSIBLE extends WAVEFORMATEX by adding three members, beginning with Samples.wValidBitsPerSample. (Samples is a union whose other member, wValidSamplesPerBlock, is used instead of wValidBitsPerSample for some compressed formats.)

The wFormatTag member, which immediately follows the end of the KSDATAFORMAT structure in the buffer, specifies what kind of format information follows KSDATAFORMAT. The KMixer system driver supports only PCM formats that use one of the three format tags shown in the following table.

wFormatTag Value          Meaning
WAVE_FORMAT_PCM           PCM data format specified by WAVEFORMATEX or PCMWAVEFORMAT.
WAVE_FORMAT_IEEE_FLOAT    Floating-point data format specified by WAVEFORMATEX.
WAVE_FORMAT_EXTENSIBLE    Extended data format specified by WAVEFORMATEXTENSIBLE.

In fact, KMixer supports only a subset of the formats that can be described by these tag values. USB audio devices (see USBAudio Class System Driver) are restricted to this subset because all USB audio streams pass through KMixer. DirectSound applications, however, can overcome KMixer's restrictions by connecting directly to hardware pins on WaveCyclic and WavePci devices that support formats not supported by KMixer.

Note the ambiguity in the meaning of the WAVE_FORMAT_PCM tag value in the preceding table: it can specify either a WAVEFORMATEX or a PCMWAVEFORMAT structure. However, these two structures are nearly identical. The only difference is that WAVEFORMATEX contains a cbSize member and PCMWAVEFORMAT does not. According to the WAVEFORMATEX specification, cbSize is ignored if wFormatTag = WAVE_FORMAT_PCM; cbSize is used for all other formats. Thus, in the case of a PCM format, PCMWAVEFORMAT and WAVEFORMATEX contain the same information and can be treated identically.
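The shared layout can be illustrated with a short sketch. The structures below are stand-in mirrors of WAVEFORMAT, PCMWAVEFORMAT, and WAVEFORMATEX rather than the declarations from the Windows headers; the `Sketch` names and the helper `pcm_bits_per_sample` are hypothetical, and one-byte packing is an assumption made so that the stand-in WAVEFORMATEX occupies 18 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_WAVE_FORMAT_PCM 0x0001u   /* numeric value of WAVE_FORMAT_PCM */

#pragma pack(push, 1)                    /* assumption: byte-packed layout */
typedef struct {                         /* stand-in for WAVEFORMAT */
    uint16_t wFormatTag;
    uint16_t nChannels;
    uint32_t nSamplesPerSec;
    uint32_t nAvgBytesPerSec;
    uint16_t nBlockAlign;
} SketchWaveFormat;

typedef struct {                         /* stand-in for PCMWAVEFORMAT */
    SketchWaveFormat wf;
    uint16_t wBitsPerSample;
} SketchPcmWaveFormat;

typedef struct {                         /* stand-in for WAVEFORMATEX */
    uint16_t wFormatTag;
    uint16_t nChannels;
    uint32_t nSamplesPerSec;
    uint32_t nAvgBytesPerSec;
    uint16_t nBlockAlign;
    uint16_t wBitsPerSample;
    uint16_t cbSize;                     /* ignored when the tag is PCM */
} SketchWaveFormatEx;
#pragma pack(pop)

/* Both layouts place wBitsPerSample directly after the five shared members. */
_Static_assert(offsetof(SketchWaveFormatEx, wBitsPerSample) ==
               offsetof(SketchPcmWaveFormat, wBitsPerSample),
               "wBitsPerSample must sit at the same offset in both layouts");

/* Reads wBitsPerSample from a buffer that holds either layout.
   Only the common prefix is copied, so cbSize is never touched. */
uint16_t pcm_bits_per_sample(const void *format)
{
    SketchPcmWaveFormat prefix;
    memcpy(&prefix, format, sizeof prefix);
    assert(prefix.wf.wFormatTag == SKETCH_WAVE_FORMAT_PCM);
    return prefix.wBitsPerSample;
}
```

Because the helper reads only the bytes that the two structures have in common, a caller does not need to know which of the two was supplied for a PCM format.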

WAVEFORMATEXTENSIBLE is able to specify a wider range of formats than WAVEFORMATEX can:

  1. WAVEFORMATEXTENSIBLE specifies the number of bits per sample separately from the size of the sample container. For example, a 20-bit sample can be stored left-justified within a three-byte container. WAVEFORMATEX, which fails to distinguish the number of data bits per sample from the sample container size, is unable to describe such a format unambiguously.
  2. WAVEFORMATEXTENSIBLE can assign specific speaker locations to audio channels in multichannel streams. WAVEFORMATEX lacks this capability and can adequately support only mono and (two-channel) stereo streams.

Any format that is described by WAVEFORMATEX can also be described by WAVEFORMATEXTENSIBLE. For information about converting a WAVEFORMATEX structure to WAVEFORMATEXTENSIBLE, see Converting Between Format Tags and Subformat GUIDs.
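As a rough illustration of the tag-to-GUID relationship, the sketch below builds a subformat GUID by placing the 16-bit format tag in the first field of a fixed GUID template (00000000-0000-0010-8000-00aa00389b71). The `SketchGuid` type and the three function names are hypothetical stand-ins, not the macros or types from the Windows headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct {                 /* stand-in for GUID */
    uint32_t Data1;
    uint16_t Data2;
    uint16_t Data3;
    uint8_t  Data4[8];
} SketchGuid;

/* Builds the subformat GUID that corresponds to a wFormatTag value. */
SketchGuid subformat_from_format_tag(uint16_t format_tag)
{
    SketchGuid guid = {
        format_tag, 0x0000, 0x0010,
        {0x80, 0x00, 0x00, 0xAA, 0x00, 0x38, 0x9B, 0x71}
    };
    return guid;
}

/* Returns true if the GUID follows the template, so a tag can be recovered. */
bool is_tag_based_subformat(const SketchGuid *guid)
{
    static const uint8_t tail[8] = {0x80, 0x00, 0x00, 0xAA, 0x00, 0x38, 0x9B, 0x71};
    if (guid->Data1 > 0xFFFFu || guid->Data2 != 0x0000 || guid->Data3 != 0x0010)
        return false;
    for (int i = 0; i < 8; ++i)
        if (guid->Data4[i] != tail[i])
            return false;
    return true;
}

/* Recovers the wFormatTag value from a tag-based subformat GUID. */
uint16_t format_tag_from_subformat(const SketchGuid *guid)
{
    assert(is_tag_based_subformat(guid));
    return (uint16_t)guid->Data1;
}
```

With this template, a tag of 0x0001 (WAVE_FORMAT_PCM) yields the same value as KSDATAFORMAT_SUBTYPE_PCM, and 0x0003 (WAVE_FORMAT_IEEE_FLOAT) yields KSDATAFORMAT_SUBTYPE_IEEE_FLOAT.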

WAVEFORMATEX is sufficient for describing formats with sample sizes of 8 or 16 bits, but WAVEFORMATEXTENSIBLE is necessary to adequately describe formats with a sample precision of greater than 16 bits. Here are two examples:

In both of these examples, preserving signal quality while making the right tradeoff between processing and storage efficiency is possible only if both the sample precision and container size are known.

In all Windows releases except for Windows 98 "Gold", KMixer supports a range of WAVEFORMATEXTENSIBLE PCM formats with multiple channels and up to 32 bits per sample.

The subset of WAVEFORMATEX PCM formats that KMixer supports differs between Windows releases, as shown in the following table.

Windows Release           Packed Sample Sizes       Number of Channels
Windows 98 "Gold"         8, 16, 24, and 32 bits    Multichannel
Windows 98 SE             8 and 16 bits only        Mono and stereo only
Windows 98 SE + QFE       8, 16, 24, and 32 bits    Mono and stereo only
Windows 2000              8 and 16 bits only        Mono and stereo only
Windows Me                8, 16, 24, and 32 bits    Mono and stereo only
Windows XP (and later)    8 and 16 bits only        Mono and stereo only

KMixer limits WAVEFORMATEX formats to only one or two channels in all versions of Windows except Windows 98 "Gold". The same limitations in sample size and number of channels apply to PCMWAVEFORMAT because it is equivalent to WAVEFORMATEX for PCM formats. For more information about Windows 98 SE + QFE, see Additional Requirements for Windows 98.

In WAVEFORMATEXTENSIBLE, wBitsPerSample is the container size, and wValidBitsPerSample is the number of valid data bits per sample. Containers are always byte-aligned in memory, and the container size must be specified as a multiple of eight bits.
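The distinction between container size and valid bits can be made concrete with a minimal sketch that unpacks one sample. It assumes signed two's-complement samples stored little-endian and left-justified in the container (valid bits in the most significant positions, padding in the low bits); the function name `read_left_justified_sample` is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Extracts one sample from its container.
   container_bits: container size in bits (a multiple of 8, at most 32).
   valid_bits:     number of valid data bits (at most container_bits). */
int32_t read_left_justified_sample(const uint8_t *container,
                                   unsigned container_bits,
                                   unsigned valid_bits)
{
    assert(container_bits % 8 == 0);
    assert(container_bits >= 8 && container_bits <= 32);
    assert(valid_bits >= 1 && valid_bits <= container_bits);

    /* Assemble the little-endian container bytes into one word. */
    uint32_t raw = 0;
    for (unsigned i = 0; i < container_bits / 8; ++i)
        raw |= (uint32_t)container[i] << (8 * i);

    /* Discard the padding bits below the valid data. */
    raw >>= (container_bits - valid_bits);

    /* Sign-extend from valid_bits using 64-bit arithmetic. */
    int64_t value = (int64_t)raw;
    if ((raw >> (valid_bits - 1)) & 1u)
        value -= (int64_t)1 << valid_bits;
    return (int32_t)value;
}
```

For a 20-bit sample in a three-byte container, the call would pass 24 and 20; without both numbers the four padding bits could not be told apart from data.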

When using WAVEFORMATEXTENSIBLE, set wFormatTag to WAVE_FORMAT_EXTENSIBLE and SubFormat to the appropriate format GUID. For PCM formats, set SubFormat to KSDATAFORMAT_SUBTYPE_PCM. For formats that encode sample values as floating-point numbers, set SubFormat to KSDATAFORMAT_SUBTYPE_IEEE_FLOAT. For either of these formats, set cbSize to sizeof(WAVEFORMATEXTENSIBLE)-sizeof(WAVEFORMATEX). For information about using WAVEFORMATEXTENSIBLE to describe non-PCM data formats, see Supporting Non-PCM Wave Formats.
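The settings above can be sketched as a small initializer. The structures are stand-in mirrors of WAVEFORMATEX and WAVEFORMATEXTENSIBLE (byte-packed as an assumption, and with the Samples union reduced to the two members discussed here), and `init_pcm_extensible` is a hypothetical helper, not an API from the Windows headers. The nBlockAlign and nAvgBytesPerSec computations assume uncompressed PCM.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#pragma pack(push, 1)                    /* assumption: byte-packed layout */
typedef struct {                         /* stand-in for GUID */
    uint32_t Data1;
    uint16_t Data2;
    uint16_t Data3;
    uint8_t  Data4[8];
} SketchGuid;

typedef struct {                         /* stand-in for WAVEFORMATEX */
    uint16_t wFormatTag;
    uint16_t nChannels;
    uint32_t nSamplesPerSec;
    uint32_t nAvgBytesPerSec;
    uint16_t nBlockAlign;
    uint16_t wBitsPerSample;
    uint16_t cbSize;
} SketchWaveFormatEx;

typedef struct {                         /* stand-in for WAVEFORMATEXTENSIBLE */
    SketchWaveFormatEx Format;
    union {
        uint16_t wValidBitsPerSample;
        uint16_t wValidSamplesPerBlock;
    } Samples;
    uint32_t dwChannelMask;
    SketchGuid SubFormat;
} SketchWaveFormatExtensible;
#pragma pack(pop)

#define SKETCH_WAVE_FORMAT_EXTENSIBLE 0xFFFEu

/* Value of KSDATAFORMAT_SUBTYPE_PCM. */
static const SketchGuid SKETCH_SUBTYPE_PCM = {
    0x00000001, 0x0000, 0x0010,
    {0x80, 0x00, 0x00, 0xAA, 0x00, 0x38, 0x9B, 0x71}
};

void init_pcm_extensible(SketchWaveFormatExtensible *fmt,
                         uint16_t channels, uint32_t sample_rate,
                         uint16_t container_bits, uint16_t valid_bits,
                         uint32_t channel_mask)
{
    assert(container_bits % 8 == 0);     /* containers are whole bytes */
    assert(valid_bits <= container_bits);

    memset(fmt, 0, sizeof *fmt);
    fmt->Format.wFormatTag      = SKETCH_WAVE_FORMAT_EXTENSIBLE;
    fmt->Format.nChannels       = channels;
    fmt->Format.nSamplesPerSec  = sample_rate;
    fmt->Format.wBitsPerSample  = container_bits;
    fmt->Format.nBlockAlign     = (uint16_t)(channels * (container_bits / 8));
    fmt->Format.nAvgBytesPerSec = sample_rate * fmt->Format.nBlockAlign;
    fmt->Format.cbSize          = (uint16_t)(sizeof(SketchWaveFormatExtensible)
                                             - sizeof(SketchWaveFormatEx));
    fmt->Samples.wValidBitsPerSample = valid_bits;
    fmt->dwChannelMask          = channel_mask;
    fmt->SubFormat              = SKETCH_SUBTYPE_PCM;
}
```

For example, a six-channel, 48-kHz stream of 20-bit samples in three-byte containers could be described by passing 6, 48000, 24, 20, and a channel mask of 0x3F (the six speaker positions of a 5.1 layout).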

Each pin on a KS filter declares which data formats it supports. The pin factory exposes this information as an array of data ranges. Unlike the format descriptor shown in the preceding figure, a data range describes a range of data formats. For example, the data range for an audio pin's PCM data format specifies the range of sample frequencies, range of sample sizes, and maximum number of channels that the pin supports.

When the miniport driver instantiates a pin, it configures the pin to handle a stream with a particular data format that it selects from the pin's data ranges. This work is done by the miniport driver's data-intersection handler, which selects an audio data format that is common to two pins so that they can be connected. For more information, see Data-Intersection Handlers.

As explained in KS Data Formats and Data Ranges, KS pins use KSDATAFORMAT and KSDATARANGE structures to specify their data formats and data ranges.

Audio pins use extended versions of these structures. WDM audio drivers use the KSDATAFORMAT_DSOUND and KSDATAFORMAT_WAVEFORMATEX structures, which are extensions of the KSDATAFORMAT structure, to provide additional information about audio data formats. Similarly, the KSDATARANGE_AUDIO and KSDATARANGE_MUSIC structures extend the KSDATARANGE structure to provide additional information about audio data ranges.

For information about using property requests to query audio pins for data formats and ranges, see Pin Data-Range and Intersection Properties.

The following examples show how to use the KSDATAFORMAT and KSDATARANGE structures to describe some of the more common formats for audio streams:

Analog Audio Stream Data Range

DirectMusic Stream Data Format

DirectMusic Stream Data Range

DirectSound Stream Data Format

DirectSound Stream Data Range

MIDI Stream Data Format

MIDI Stream Data Range

PCM Stream Data Format

PCM Stream Data Range

PCM Multichannel Stream Data Format

PCM Multichannel Stream Data Range

PCM High Bitdepth Stream Data Format

PCM High Bitdepth Stream Data Range