DirectSound is a deprecated software component of the Microsoft DirectX library for the Windows operating system, superseded by XAudio2. It provides a low-latency interface to sound card drivers written for Windows 95 through Windows XP and can handle the mixing and recording of multiple audio streams. DirectSound was originally written for Microsoft by John Miles.[1]
Besides passing audio data to the sound card, DirectSound provides other capabilities such as recording and mixing sound, adding effects to sound (e.g., reverb, echo, or flange), using hardware-accelerated buffers (if the sound card supports hardware audio acceleration) in Windows 95 through XP, and positioning sounds in 3D space. DirectSound also provides a means to capture sound from a microphone or other input and to control capture effects during audio capture.[2]
After many years of development, DirectSound is a mature API that supplies many other useful capabilities, such as the ability to play multichannel sounds at high resolution. Although DirectSound was designed for games, it is also used to play audio in many other audio applications. DirectShow uses DirectSound's hardware audio acceleration when the sound card supports it and the audio driver exposes it.[3]
DirectSound is a user mode API that provides an interface between applications and the sound card driver, enabling applications to produce sounds and play back music.
DirectSound was considered revolutionary when it was introduced in 1995, as it featured multiple simultaneous audio streams and allowed several applications to access the sound card simultaneously. Before that, game developers had to implement their own audio rendering engine in software.
DirectSound provides sample rate conversion and sound mixing (volume and pan) for an unlimited number of audio sources; however, the practical limits are the number of hardware audio sources and the performance of software mixers.
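The mixing step described above can be illustrated with a minimal sketch in Python. It is not DirectSound code: the function names `pan_gains` and `mix` are hypothetical, samples are assumed to be floats in the range -1.0 to 1.0, and the constant-power pan law is an assumption chosen for simplicity rather than the attenuation scheme DirectSound itself uses.

```python
import math


def pan_gains(pan):
    """Return (left, right) gains for a pan value in [-1.0, 1.0].

    Assumption: a constant-power pan law, picked only for illustration.
    """
    angle = (pan + 1.0) * math.pi / 4.0  # maps -1..1 onto 0..pi/2
    return math.cos(angle), math.sin(angle)


def mix(sources, num_frames):
    """Mix mono sources into a list of stereo (left, right) frames.

    sources: list of (samples, volume, pan) tuples.
    """
    out = [[0.0, 0.0] for _ in range(num_frames)]
    for samples, volume, pan in sources:
        left, right = pan_gains(pan)
        for i in range(min(num_frames, len(samples))):
            out[i][0] += samples[i] * volume * left
            out[i][1] += samples[i] * volume * right
    # Clip the summed signal so it stays in the valid sample range.
    return [(max(-1.0, min(1.0, l)), max(-1.0, min(1.0, r))) for l, r in out]


# Two hypothetical sources: one panned hard left, one centered.
frames = mix([([0.5, 0.5], 1.0, -1.0), ([0.2, 0.2], 0.5, 0.0)], 2)
```

The cost of this loop grows with the number of sources, which is why a software mixer's performance sets a practical limit even though the API imposes none.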
The DirectSound architecture features the concept of a "ring buffer", which is played continuously in a cycle. The application programmer creates the sound buffer, then continuously queries its state through the "read cursor" and updates it with the "write cursor". There are two types of buffers: a "streaming" buffer, which holds continuous sounds such as background music, and a "static" buffer, which holds short sounds.
On supported sound cards, DirectSound tries to use "hardware accelerated" buffers, that is, buffers that either can be placed in local sound card memory or can be accessed by the sound card from system memory. If hardware acceleration is not available, DirectSound creates audio buffers in system memory and uses purely software mixing.
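The cursor mechanics can be sketched with a small simulation in Python. This is a simplified model, not the DirectSound API: the class `RingBuffer` and its methods are hypothetical. It also assumes that playback stops when no unplayed data remains, whereas a real looping buffer keeps cycling whether or not the application has refilled it.

```python
class RingBuffer:
    """Toy circular audio buffer with a read cursor and a write cursor."""

    def __init__(self, size):
        self.data = [0] * size
        self.size = size
        self.read_cursor = 0   # next sample the simulated device will play
        self.write_cursor = 0  # next position the application may fill
        self.filled = 0        # samples written but not yet played

    def free_space(self):
        return self.size - self.filled

    def write(self, samples):
        """Copy as many samples as fit, wrapping at the end of the buffer."""
        n = min(len(samples), self.free_space())
        for i in range(n):
            self.data[(self.write_cursor + i) % self.size] = samples[i]
        self.write_cursor = (self.write_cursor + n) % self.size
        self.filled += n
        return n

    def play(self, count):
        """Simulate the device consuming up to `count` samples."""
        n = min(count, self.filled)
        out = [self.data[(self.read_cursor + i) % self.size] for i in range(n)]
        self.read_cursor = (self.read_cursor + n) % self.size
        self.filled -= n
        return out


buf = RingBuffer(4)
buf.write([1, 2, 3])     # fills positions 0..2
first = buf.play(2)      # device plays [1, 2]
buf.write([4, 5, 6])     # wraps around: positions 3, 0, 1
rest = buf.play(4)       # device plays [3, 4, 5, 6]
```

A streaming buffer corresponds to calling `write` repeatedly while playback advances. A static buffer would be filled once and then played without further writes.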
Some late DOS-era "wavetable" sound cards such as the Sound Blaster AWE32 and Gravis Ultrasound featured dedicated DSPs borrowed from digital musical instruments. These cards featured local memory that could be used for buffering multiple audio streams and mixing them on board, thus offloading the CPU and greatly improving the sound quality. However, this was only possible in DOS by directly programming the hardware, and full-featured "hardware acceleration" from the local memory was never implemented on these cards, due to the complexities of double buffering. Later cards such as the Sound Blaster Live!, Audigy and X-Fi are capable of accessing the system memory buffers directly.
DirectSound3D (DS3D) is an extension to DirectSound introduced with DirectX 3 in 1996 with the intention of standardizing 3D audio in Windows. DirectSound3D allows software developers to write 3D audio code once against a single API instead of rewriting it for each audio card vendor.
In DirectX 5, DirectSound3D gained support for sound cards that use third-party 3D audio algorithms to accelerate DirectSound3D, through methods approved by Microsoft.
In DirectX 8, DirectSound and DirectSound3D (DS3D) were officially merged and given the name DirectX Audio; however, the API is still commonly referred to as DirectSound.
See main article: Environmental Audio Extensions. EAX is an extension to DirectSound and DirectSound3D which provides sound effects processing to the hardware-accelerated buffers.
In Windows 95, 98 and Me, the DirectSound mixer component and the sound card drivers were both implemented as a kernel-mode VxD driver (Dsound.vxd). This allowed direct access to the primary buffer used by the audio hardware and thus provided the lowest possible latency between the user-mode API and the underlying hardware, but in some cases it caused instability and blue screen errors.
Windows 98 introduced WDM Audio and the Kernel Audio Mixer driver (KMixer), which enabled digital mixing, routing and processing of simultaneous audio streams with higher-quality sample rate conversion as well as kernel streaming. Under WDM, DirectSound sends data to the software-based KMixer. Windows 98 Second Edition improved WDM audio support by adding DirectSound hardware buffering, DirectSound3D hardware abstraction, KMixer sample-rate conversion (SRC) for capture streams and multichannel audio support, and by introducing DirectMusic. If the audio hardware supports hardware mixing (also known as hardware buffering or DirectSound hardware acceleration), DirectSound buffers directly to the rendering device.[4] If DirectSound streams use hardware mixing, KMixer and its latency delay are bypassed.[5] On Windows 98 and Windows Me, WDM audio drivers were preferred, but compatibility with the VxD driver model was preserved.
Although the Windows Driver Model (WDM) was available starting with Windows 98, few audio card manufacturers used it. Due to internal buffering, KMixer introduced significant processing latency (30 ms on then-current systems). Windows 98 also includes a WDM streaming class driver (Stream.sys) to address the processing requirements of real-time multimedia data streams. When the sound card uses a custom driver with the system-supplied port class driver PortCls.sys, or implements a mini-driver for use with the streaming class driver, applications can bypass KMixer completely and use the kernel streaming interfaces instead to reduce latency.
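To put the 30 ms figure in perspective, the relationship between buffered audio and latency can be worked through with a short Python sketch. The helper names are hypothetical, and the 44.1 kHz, 16-bit stereo format is an assumption used only for the arithmetic; the source does not state which format the figure refers to.

```python
def buffer_latency_ms(frames, sample_rate_hz):
    """Latency contributed by `frames` of buffered audio, in milliseconds."""
    return 1000.0 * frames / sample_rate_hz


def frames_for_latency(latency_ms, sample_rate_hz):
    """Number of frames that corresponds to a given latency."""
    return round(latency_ms * sample_rate_hz / 1000.0)


# Assumed format: 44.1 kHz, 16-bit (2 bytes per sample), 2 channels.
SAMPLE_RATE_HZ = 44100
BYTES_PER_FRAME = 2 * 2

frames = frames_for_latency(30, SAMPLE_RATE_HZ)  # 1323 frames
size_bytes = frames * BYTES_PER_FRAME            # 5292 bytes
```

Under these assumptions, 30 ms of latency corresponds to 1323 frames, or about 5 KB of audio held in intermediate buffers before it reaches the hardware.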
In Windows 2000, Microsoft also implemented the same WDM-based audio stack on Windows NT by introducing the WDM audio drivers and the kernel mixer component (KMixer).[6] In Windows XP, Microsoft introduced another improved kernel streaming class driver, AVStream. Beginning with Windows XP, hardware acceleration was also added for DirectSound capture effects processing[7] such as Acoustic Echo Cancellation for USB microphones, noise suppression and array microphone support.
Windows Vista features a completely rewritten audio stack based on the Universal Audio Architecture. Because of the architectural changes in the redesigned audio stack, a direct path from DirectSound to the audio drivers does not exist.[8] DirectSound, DirectMusic and other APIs such as MME are emulated as WASAPI Session instances. DirectSound runs in emulation mode on the Microsoft software mixer. The emulator does not have hardware abstraction, so there is no hardware DirectSound acceleration, meaning hardware and software relying on DirectSound acceleration may have degraded performance. Depending on the application and the system hardware, the performance hit may not be noticeable. Hardware 3D audio effects played using DirectSound3D are not playable; this also breaks compatibility with EAX extensions.[9]
Third-party APIs such as ASIO and OpenAL are not affected by these architectural changes in Windows Vista, as they use IOCtl to interface directly with the audio driver. A solution for applications that wish to take advantage of hardware accelerated high-quality 3D positional audio is to use OpenAL. However, this only works if the manufacturer provides an OpenAL driver for their hardware.[10]
The WASAPI audio stack in Windows 8 introduces support for "hardware offloading" of multiple audio streams to the audio card for mixing and effect processing, in addition to the software processing introduced in Vista;[11][12] however, the functionality is only exposed for Windows Runtime apps.[13] DirectSound's and DirectMusic's hardware interfaces to sound card drivers are not implemented.
Although DirectSound support was available in Windows CE versions up to 4.2, it was removed starting with version 5.0.[14] Windows CE 6.0 also does not support DirectSound; applications are instead expected to be rewritten to use the Waveform Audio API.
After hardware-accelerated DirectSound was removed in Windows Vista, a few replacement implementations appeared.
Sound Blaster's Creative ALchemy (2007) provides hardware acceleration of DirectSound3D and Audio Effects, such as EAX.[15] Creative ALchemy intercepts calls to DirectSound3D and translates them into OpenAL calls to be processed by supported hardware such as Sound Blaster X-Fi and Sound Blaster Audigy. For software-based Creative audio solutions, ALchemy utilizes its built-in 3D audio engine without using OpenAL at all.
Realtek, a manufacturer of integrated HD audio codecs, has a product similar to ALchemy called 3D SoundBack. C-Media, a manufacturer of PC sound card chipsets, also has a solution called Xear3D EX, although it works instead by intercepting DirectSound3D calls transparently in the background without any user intervention.
IndirectSound is a freeware library that emulates DirectSound 3D using XAudio2, without using hardware acceleration.[16]
DSOAL is an open source library that emulates DirectSound 3D and EAX using OpenAL. Either a hardware-accelerated OpenAL implementation or OpenAL Soft (which provides HRTF) can be used.[17]