Spatial Audio

Written by

Naison Ware

Published at

Feb 15, 2024

What is Spatial Audio?

Spatial audio is a set of techniques that enrich listening by creating a sense of space and directionality. Unlike traditional stereo sound, which is delivered through fixed left and right channels, spatial audio simulates sound sources coming from various directions. It relies on cues such as distance, direction, and reverberation to create a more immersive and realistic auditory environment.
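
As a rough illustration of the distance cue mentioned above, here is a minimal sketch of inverse-distance attenuation. The function names and the 1 m reference distance are assumptions made for this example, and real engines usually offer several rolloff curves.

```python
import math

def distance_gain(distance_m: float, reference_m: float = 1.0) -> float:
    """Inverse-distance gain: 1.0 at or inside the reference distance."""
    # Clamp so that sources closer than the reference are not boosted.
    return reference_m / max(distance_m, reference_m)

def gain_to_db(gain: float) -> float:
    """Convert a linear amplitude gain to decibels."""
    return 20.0 * math.log10(gain)

for d in (1.0, 2.0, 4.0):
    g = distance_gain(d)
    print(f"{d:.0f} m -> gain {g:.2f} ({gain_to_db(g):.1f} dB)")
```

Under this model, each doubling of distance lowers the level by about 6 dB.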

How does it work?

Spatial audio involves several techniques:

a. Head-Related Transfer Functions (HRTFs)

  • HRTFs capture how sound waves interact with the listener’s head, ears, and torso.

  • By filtering a sound with the HRTF for a given direction (ideally one measured for the individual listener), we can simulate how that sound arrives at each ear from different angles.

  • This enables precise localization of sound sources.
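
One of the cues an HRTF encodes is the interaural time difference (ITD), the small delay between the sound reaching the near ear and the far ear. The sketch below uses Woodworth's spherical-head approximation to estimate it. The head radius and speed of sound are assumed typical values, not measurements, and a real HRTF also captures level and spectral differences that this ignores.

```python
import math

HEAD_RADIUS_M = 0.0875       # assumed average head radius
SPEED_OF_SOUND_M_S = 343.0   # assumed speed of sound in air at about 20 °C

def interaural_time_difference(azimuth_deg: float) -> float:
    """Woodworth approximation of the ITD in seconds.

    azimuth_deg is the source angle from straight ahead, in [-90, 90].
    The sign of the result indicates which side the source is on.
    """
    theta = math.radians(azimuth_deg)
    return (HEAD_RADIUS_M / SPEED_OF_SOUND_M_S) * (theta + math.sin(theta))

for az in (0, 30, 90):
    itd_us = interaural_time_difference(az) * 1e6
    print(f"azimuth {az:>2} deg -> ITD {itd_us:.0f} microseconds")
```

With these assumed values, a source directly to one side gives an ITD of roughly 0.66 ms, and a source straight ahead gives zero.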

b. Binaural Rendering

  • Binaural rendering filters each sound source with HRTFs to produce a two-channel signal for headphones.

  • Combined with head tracking, the rendering adjusts when you turn your head, so sources stay fixed in the virtual space and the illusion of 3D is maintained.
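
The head-movement compensation described above comes down to re-expressing each source direction relative to the listener's current head orientation before choosing which HRTF to apply. A minimal sketch for the horizontal plane only follows. The function name is hypothetical, and both angles are assumed to be measured in the same rotational direction from the same reference.

```python
def relative_azimuth(source_azimuth_deg: float, head_yaw_deg: float) -> float:
    """Source direction relative to the listener's nose, wrapped to [-180, 180)."""
    # Subtract the head rotation, then wrap so the result stays in one turn.
    return (source_azimuth_deg - head_yaw_deg + 180.0) % 360.0 - 180.0

# A source fixed at 30 degrees in the world, heard while the head turns.
for yaw in (0.0, 30.0, 90.0):
    print(f"head yaw {yaw:5.1f} -> source at {relative_azimuth(30.0, yaw):6.1f}")
```

When the head turns toward the source, the relative angle drops to zero, which is what keeps the source anchored in place.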

c. Ambisonics

  • Ambisonics represents the sound field around a listening point as a set of spherical-harmonic components.

  • It allows for flexible placement of sound sources in 3D space.

  • Ambisonic recordings capture sound from all directions, making them ideal for spatial audio.
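
To make the spherical-harmonic idea concrete, here is a minimal sketch that encodes one mono sample into first-order Ambisonics using the traditional B-format convention (W, X, Y, Z, with W scaled by 1/√2). The function name is hypothetical. Azimuth is assumed to be measured counterclockwise from straight ahead and elevation upward from the horizontal plane; other channel orderings and normalizations exist and would change the scaling.

```python
import math
from typing import Tuple

def encode_first_order(sample: float, azimuth_deg: float,
                       elevation_deg: float) -> Tuple[float, float, float, float]:
    """Encode a mono sample into traditional first-order B-format (W, X, Y, Z)."""
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    w = sample / math.sqrt(2.0)              # omnidirectional component
    x = sample * math.cos(az) * math.cos(el)  # front-back component
    y = sample * math.sin(az) * math.cos(el)  # left-right component
    z = sample * math.sin(el)                 # up-down component
    return w, x, y, z

# A source straight ahead contributes only to W and X.
print(encode_first_order(1.0, 0.0, 0.0))
```

Because the direction lives in the encoding gains rather than in a speaker layout, the same four channels can later be rotated or decoded to headphones or loudspeakers.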

Applications of Spatial Audio in HCI and XR

Spatial audio is a common technique in HCI for improving the user experience. It can be used in several cases, for example:

a. Virtual Reality (VR)

  • In VR environments, spatial audio enhances immersion.

  • Imagine standing in a virtual forest—birds chirping from above, leaves rustling around you. Spatial audio makes it feel real.

b. Augmented Reality (AR)

  • AR overlays virtual content onto the real world.

  • Spatial audio helps anchor virtual objects to physical locations.

  • For example, an AR navigation app could guide you by whispering directions in your ear.

c. Mixed Reality (MR)

  • MR combines elements of VR and AR.

  • Spatial audio in MR lets virtual objects sound as though they interact with real-world surfaces.

  • You might see a virtual ball bounce off your coffee table and hear each bounce come from where the ball lands.

d. Online Meetings and Collaboration

  • Spatial audio improves teleconferencing.

  • Participants’ voices appear to come from their avatars’ positions, enhancing communication.

e. Architectural Design and Simulation

  • Architects use spatial audio to experience virtual buildings.

  • You can “walk” through a digital space, hearing echoes and room acoustics.

The differences between Spatial Audio and Flat Audio

Spatial audio lets developers create realistic soundscapes that reflect the position of objects in 3D space. The main differences between traditional flat audio and spatial audio are:

  1. Channel Configuration:

    • Traditional Flat Audio:
      • Uses stereo channels (left and right) for sound reproduction.

      • Sound appears to come from fixed positions in front of the listener.

    • Spatial Audio:
      • Uses multichannel speaker layouts, or binaural processing over two headphone channels, to create an illusion of three-dimensional space.

      • Sound sources can come from various directions, including above, below, and behind.

  2. Realism and Immersion:

    • Traditional Flat Audio:
      • Provides basic left-right separation.

      • Lacks depth and immersion.

    • Spatial Audio:
      • Creates a realistic sound experience that immerses the listener.

      • Objects in the scene appear to emit sound from specific positions, enhancing immersion.

  3. Applications:

    • Traditional Flat Audio:
      • Common in music, movies, and basic gaming.

      • Limited for conveying spatial information.

    • Spatial Audio:
      • Essential for virtual reality (VR) and augmented reality (AR) experiences.

      • Enhances gaming, architectural simulations, and teleconferencing.

  4. Techniques Used:

    • Traditional Flat Audio:
      • Simple panning (left-right balance).

      • Stereo mixing.

    • Spatial Audio:
      • Head-Related Transfer Functions (HRTFs): Capture how sound interacts with the listener’s head and ears.

      • Binaural rendering: Filters sources with HRTFs for headphone playback and, with head tracking, adjusts the audio to head movements.

      • Ambisonics: Represents sound as spherical harmonics for flexible placement.

  5. Listener Experience:

    • Traditional Flat Audio:
      • Limited sense of space.

      • Suitable for casual listening.

    • Spatial Audio:
      • Immersive, directional, and engaging.

      • Makes users feel part of the environment.
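
For contrast with the spatial techniques, the simple left-right panning listed under traditional flat audio can be sketched as a constant-power pan law. This is one common pan law among several; the function name and the -1 to +1 pan range are assumptions for this example.

```python
import math
from typing import Tuple

def constant_power_pan(sample: float, pan: float) -> Tuple[float, float]:
    """Split a mono sample into (left, right) with constant total power.

    pan ranges from -1.0 (hard left) through 0.0 (center) to +1.0 (hard right).
    """
    # Map pan from [-1, 1] onto an angle in [0, pi/2].
    angle = (pan + 1.0) * math.pi / 4.0
    return sample * math.cos(angle), sample * math.sin(angle)

for pan in (-1.0, 0.0, 1.0):
    left, right = constant_power_pan(1.0, pan)
    print(f"pan {pan:+.1f} -> left {left:.3f}, right {right:.3f}")
```

A pan value only places the sound somewhere along the line between two speakers, which is why flat audio cannot express height, depth, or sources behind the listener.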

In summary, spatial audio goes beyond the confines of stereo, allowing us to perceive sound as if it surrounds us. Whether exploring virtual worlds or enjoying music, spatial audio adds depth and realism!