Windows audio

This page describes how to set up audio for XBMC on Windows. It covers how to configure XBMC for either DirectSound or WASAPI, and when to use one rather than the other.

Because WASAPI performs no mixing or resampling, it is the preferred mode for best-quality audio.

From Frodo onwards, XBMC uses WASAPI only in Exclusive mode, so that XBMC has exclusive rights to the audio buffers while playing audio streams, to the exclusion of all other sounds or players. This is a change from previous versions of XBMC, where Shared mode was also allowed. When using WASAPI, take care that Windows is configured to allow XBMC to run in Exclusive mode; refer to the Configure Windows Sound Settings section below.

In addition, from Frodo onwards XBMC uses WASAPI in the more modern event-driven mode; previously XBMC used push mode. Both the audio hardware and the audio driver therefore need to support event-driven mode for audio to work with WASAPI selected.

= Hardware Vendor Specifics =

AMD GPU
If using WASAPI, do not use the Realtek HD Audio drivers, as they do not work with the event-driven mode XBMC uses for WASAPI. To use WASAPI you must use the AMD High Definition Audio drivers.

The Realtek HD Audio drivers will, however, work in DirectSound mode.

Intel GPU
To support HD Audio on Windows, the Intel Management Engine Interface driver must be installed, because this driver provides the HDCP DRM necessary for the HD Audio formats to work. If this driver is not installed, the HD formats will be missing from the Supported Formats tab.

To verify that you have the Intel Management Engine Interface driver installed, follow the relevant step in the "Blu-Ray* Disc Playback with Intel® HD Graphics FAQ".

= Check drivers =

= Windows Audio APIs - Background =

Since Windows Vista SP1 there have been two primary audio interfaces, DirectSound and WASAPI (Windows Audio Session Application Programming Interface), with WASAPI being a replacement for Windows XP's Kernel Streaming mode.

DirectSound
DirectSound acts as a program-friendly middle layer between the program and the audio driver, which in turn speaks to the audio hardware. With DirectSound, Windows controls the sample rate, channel layout and other details of the audio stream via an Audio Mixer. Every program using sound passes its data to DirectSound and the Audio Mixer, which then resamples as required so it can mix audio streams from any program together with system sounds.
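The resample-then-mix step can be pictured with a small sketch. This is only a toy model of the idea, not how the Windows Audio Mixer is implemented: the function names `resample_linear` and `mix` are made up for illustration, streams are mono lists of floats in the range -1.0 to 1.0, and linear interpolation stands in for whatever resampler the system really uses.

```python
def resample_linear(samples, src_rate, dst_rate):
    """Resample a mono float stream using linear interpolation (toy model)."""
    if src_rate == dst_rate or not samples:
        return list(samples)
    out_len = int(len(samples) * dst_rate / src_rate)
    out = []
    for i in range(out_len):
        pos = i * src_rate / dst_rate          # position in source samples
        left = int(pos)
        right = min(left + 1, len(samples) - 1)
        frac = pos - left
        out.append(samples[left] * (1.0 - frac) + samples[right] * frac)
    return out


def mix(streams, mixer_rate):
    """Mix (samples, sample_rate) pairs into one stream at mixer_rate.

    Every stream is first converted to the mixer's rate, then the streams
    are summed sample by sample and clipped to the range [-1.0, 1.0].
    """
    resampled = [resample_linear(s, rate, mixer_rate) for s, rate in streams]
    length = max((len(s) for s in resampled), default=0)
    mixed = []
    for i in range(length):
        total = sum(s[i] for s in resampled if i < len(s))
        mixed.append(max(-1.0, min(1.0, total)))
    return mixed
```

For example, `mix([(music, 44100), (system_beep, 48000)], 48000)` would return a single 48000 Hz stream in which the music has been resampled, so the samples reaching the hardware are no longer the ones the player produced.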

The advantages are that programs don't need resampling code or other complexities, and any program can play sounds at the same time as others, or at the same time as system sounds, because they are all mixed to one format.

The disadvantages are that other programs can play at the same time, and that a program's output gets mixed to whatever the system's settings are. This means the program cannot control the sampling rate, channel count, format, etc. More importantly for this page, you cannot pass through encoded formats: DirectSound will not decode them, and the mixer would otherwise bit-mangle them. There is also a loss of sonic quality involved in the mixing and resampling.
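To see why a mixer mangles an encoded stream, consider what happens when bytes that are not PCM are treated as 16-bit samples and scaled. The sketch below is illustrative only: `apply_mixer_gain` is a hypothetical stand-in for any mixer stage, and the payload bytes are arbitrary values standing in for an encoded frame, not real DTS data.

```python
import struct


def apply_mixer_gain(pcm_bytes, gain):
    """Treat the bytes as little-endian 16-bit samples and scale them.

    This is what a mixer stage effectively does to everything it is given,
    whether or not the bytes really are PCM audio.
    """
    count = len(pcm_bytes) // 2
    samples = struct.unpack("<%dh" % count, pcm_bytes[:count * 2])
    scaled = [max(-32768, min(32767, int(round(s * gain)))) for s in samples]
    return struct.pack("<%dh" % count, *scaled)


# Arbitrary bytes standing in for an encoded frame (an assumption for the demo).
encoded_frame = bytes([0x7F, 0xFE, 0x80, 0x01, 0x12, 0x34])

untouched = apply_mixer_gain(encoded_frame, 1.0)   # bit-exact path
mangled = apply_mixer_gain(encoded_frame, 0.5)     # any volume change alters the bits
```

With PCM, halving the gain just makes the sound quieter. With an encoded frame, the same operation rewrites the bitstream, so the receiver can no longer recognise or decode it.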

WASAPI
Partly to allow for cleaner, uncompromised or encoded audio, and partly to meet low-latency requirements such as mixing and recording, Microsoft revamped their Kernel Streaming mode after XP and came up with WASAPI for Vista.

WASAPI itself has two modes, Shared and Exclusive.

Shared mode is in many ways similar to DirectSound, as it allows other sounds to be mixed into the currently playing stream. However, this mode is not supported by XBMC, so it won't be covered any further here.

WASAPI Exclusive mode bypasses the Audio Mixer, and thus the mixing/resampling layers of DirectSound, so audio is passed through as-is. This is why WASAPI should be used for encoded formats like DTS, so that they reach the receiver unchanged for decoding there.

WASAPI Exclusive mode allows the application to interrogate the capabilities of the audio driver. Because the application presents audio directly to the audio driver, the audio must be sent in a format that is compatible with the driver's capabilities, as there is no DirectSound layer in between to convert it. This interrogation is a two-way process that often involves some back-and-forth, depending on the format specified and the device's capabilities. Once a set of compatible formats is agreed upon by the application and the audio driver, the application decides how it will present the audio stream to the audio driver.
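The back-and-forth can be sketched as the application offering formats in order of preference until the driver accepts one. In the real API the query is made with `IAudioClient::IsFormatSupported`; the Python below is only a model of the negotiation, and the names `AudioFormat` and `negotiate_format`, as well as the example formats, are assumptions made for illustration.

```python
from typing import Iterable, NamedTuple, Optional, Set


class AudioFormat(NamedTuple):
    sample_rate: int   # Hz
    bit_depth: int     # bits per sample
    channels: int


def negotiate_format(preferred: Iterable[AudioFormat],
                     driver_supports: Set[AudioFormat]) -> Optional[AudioFormat]:
    """Return the first preferred format the driver accepts, or None.

    Each loop iteration models one question to the driver:
    "can you play this format exactly as-is?"
    """
    for candidate in preferred:
        if candidate in driver_supports:
            return candidate
    return None


# Hypothetical capabilities reported by a driver.
driver = {AudioFormat(48000, 16, 2), AudioFormat(44100, 16, 2)}

# The application asks for its best format first, then falls back.
wanted = [AudioFormat(96000, 24, 2),
          AudioFormat(48000, 24, 2),
          AudioFormat(48000, 16, 2)]

chosen = negotiate_format(wanted, driver)
```

Here `chosen` ends up as the 48000 Hz, 16-bit stereo format. If nothing matches, the function returns `None`, which corresponds to the case where the application has to convert the audio itself or give up on that device.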

In addition to Shared and Exclusive modes, there are two modes for how data is passed from the application to the audio driver.

The normal manner is in push mode - a buffer is created which the audio device draws from, and the application pushes as much data in as it can to keep that buffer full. To do this it must constantly monitor the levels in the buffer, with short "sleeps" in between to allow other threads to run.
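A tick-based simulation gives a feel for the push loop. It is a simplified sketch rather than real audio code: `run_push_mode` and its parameters are hypothetical, there are no threads or real sleeps, and each tick stands for one "poll, top up, sleep" cycle during which the device drains a fixed number of frames.

```python
from collections import deque


def run_push_mode(source, capacity, drain_per_tick, ticks):
    """Simulate push mode: the application polls the buffer and tops it up.

    Returns (frames played by the device, number of polls the app made).
    """
    source = iter(source)
    buffer = deque()
    played = []
    polls = 0
    for _ in range(ticks):
        # Application wakes up, checks the buffer level and pushes
        # as much data as will fit.
        polls += 1
        while len(buffer) < capacity:
            try:
                buffer.append(next(source))
            except StopIteration:
                break
        # Application "sleeps"; meanwhile the device draws from the buffer.
        for _ in range(min(drain_per_tick, len(buffer))):
            played.append(buffer.popleft())
    return played, polls
```

Calling `run_push_mode(range(10), capacity=4, drain_per_tick=2, ticks=5)` plays all ten frames in order, but the application had to wake up and check the buffer on every tick, whether or not there was much to do.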

WASAPI, and most modern sound devices, also support a "pull" or "event-driven" mode. In this mode two buffers are used. The application gives the audio driver a call-back address or function, fills one buffer and starts playback, then goes off to do other processing. It can forget about the data stream for a while. Whenever one of the two buffers is empty, the audio driver "calls you back", and gives you the address of the empty buffer. You fill this and go your way again. Between the two buffers there is a ping-pong action: one is in use and draining, the other is full and ready. As soon as the first is emptied the buffers are switched, and you are called upon to fill the empty one. So audio data is being "pulled" from the application by the audio driver, as opposed to "pushed" by the application.
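The ping-pong action can be modelled in a few lines. This is a sketch under simplifying assumptions: `run_event_driven` and `on_buffer_empty` are hypothetical names, everything runs in one thread, and a plain function call stands in for the driver's notification (real WASAPI signals an event handle that the application waits on, rather than calling a function directly).

```python
from itertools import islice


def run_event_driven(source, frames_per_buffer):
    """Simulate event-driven playback with two ping-pong buffers.

    Returns (frames played, list of buffer indexes the driver asked for).
    """
    source = iter(source)
    callbacks = []

    def read_chunk():
        return list(islice(source, frames_per_buffer))

    def on_buffer_empty(index):
        # The application's callback: refill the buffer the driver names.
        callbacks.append(index)
        return read_chunk()

    # The application fills the first buffer itself and starts playback.
    buffers = [read_chunk(), []]
    active = 0
    played = []
    # The second buffer is empty at start, so the driver asks for it at once.
    buffers[1] = on_buffer_empty(1)

    while buffers[active]:
        played.extend(buffers[active])          # device drains the active buffer
        buffers[active] = []
        drained, active = active, 1 - active    # switch to the full buffer
        buffers[drained] = on_buffer_empty(drained)
    return played, callbacks
```

Running `run_event_driven(range(6), 2)` plays frames 0 to 5 in order, and the callback record alternates between buffer 1 and buffer 0. The application only does work when the driver asks for data, which is the difference from the polling loop of push mode.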