Hear here: How loudness and acoustic cues help us judge where a speaker is facing
Researchers at Sophia University find that both loudness and frequency-based spectral cues help listeners identify a speaker’s facing direction, a result with direct applications to spatial audio in virtual and augmented reality. Credit: Dr. Shinya Tsuji, Sophia University, Japan

As technology integrates ever more complex soundscapes into virtual spaces, understanding how humans perceive directional audio becomes vital. This need is bolstered by the rise of immersive media, such as augmented reality (AR) and virtual reality (VR), where users are virtually transported into other worlds. In a recent study, researchers explored how listeners identify the direction a speaker is facing while speaking.

The research was led by Dr. Shinya Tsuji, a postdoctoral fellow, Ms. Haruna Kashima, and Professor Takayuki Arai from the Department of Information and Communication Sciences, Sophia University, Japan. The team also included Dr. Takehiro Sugimoto, Mr. Kotaro Kinoshita, and Mr. Yasushige Nakayama from the NHK Science and Technology Research Laboratories, Japan. Their study was published in the journal Acoustical Science and Technology.

In the study, the researchers asked participants to identify the direction a speaker was facing from sound recordings alone, across two experiments. The first experiment used recordings that retained natural variations in loudness; the second used recordings adjusted to a constant loudness.
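The article does not detail how the second experiment held loudness constant, but equalizing root-mean-square (RMS) level is one standard way to remove loudness differences between recordings. Here is a minimal NumPy sketch of that idea, with the function name and target level as illustrative assumptions:

```python
import numpy as np

def rms_normalize(signal: np.ndarray, target_rms: float = 0.1) -> np.ndarray:
    """Scale a mono signal so its RMS level matches target_rms.

    Equalizing RMS across recordings removes the overall level
    difference, leaving mainly spectral differences for listeners.
    """
    rms = np.sqrt(np.mean(signal ** 2))
    if rms == 0:
        return signal.copy()  # silent input: nothing to scale
    return signal * (target_rms / rms)

# Hypothetical recordings of the same utterance with the talker
# facing toward (0 deg) and away from (180 deg) the microphone:
# front_eq = rms_normalize(front)
# back_eq = rms_normalize(back)
```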

The researchers found that loudness was consistently a strong cue for judging the speaker’s facing direction, but that when loudness cues were minimized, listeners could still make correct judgments based on the spectral cues of the sound. These spectral cues involve the distribution and quality of sound frequencies, which change subtly depending on the speaker’s orientation.
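One common way to quantify such a cue is the spectral centroid, the amplitude-weighted mean frequency of a signal: speech recorded behind a talker loses high-frequency energy, so its centroid drops. The sketch below illustrates the concept; it is not the analysis method used in the study:

```python
import numpy as np

def spectral_centroid(signal: np.ndarray, sample_rate: int) -> float:
    """Return the amplitude-weighted mean frequency (Hz) of a mono signal."""
    spectrum = np.abs(np.fft.rfft(signal))          # magnitude spectrum
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    return float(np.sum(freqs * spectrum) / np.sum(spectrum))

# A rear-facing recording radiates less high-frequency energy toward
# the microphone, so its centroid should come out lower, e.g.:
# spectral_centroid(back_eq, 48000) < spectral_centroid(front_eq, 48000)
```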

“Our study suggests that humans mainly rely on loudness to identify a speaker’s facing direction,” said Dr. Tsuji. “However, it can also be judged from some acoustic cues, such as the spectral component of the sound, not just loudness alone.”

These findings are particularly useful for virtual sound fields with six degrees of freedom (6DoF): immersive environments, like those found in AR and VR applications, where users can move freely and experience audio from different positions and orientations.

“In content with virtual 6DoF sound fields, such as AR and VR, where listeners can freely appreciate sounds from various positions, the experience of human voices can be significantly enhanced using the findings from our research,” said Dr. Tsuji.
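To make the idea concrete, here is a rough sketch of how a 6DoF renderer could encode both cues for a virtual talker: an angle-dependent gain supplies the loudness cue, and a one-pole low-pass filter supplies the spectral cue. The function name, the cardioid-style gain, and the cutoff mapping are all illustrative assumptions, not the study’s directivity model:

```python
import numpy as np

def render_facing(signal, fs, talker_pos, facing_deg, listener_pos):
    """Crude 2D sketch: attenuate and low-pass a talker's speech as the
    talker turns away from the listener."""
    sig = np.asarray(signal, dtype=float)
    to_listener = np.asarray(listener_pos, float) - np.asarray(talker_pos, float)
    to_listener /= np.linalg.norm(to_listener)
    theta = np.deg2rad(facing_deg)
    facing_vec = np.array([np.cos(theta), np.sin(theta)])
    cos_angle = float(np.dot(facing_vec, to_listener))  # 1.0 = facing the listener
    gain = 0.5 * (1.0 + cos_angle)                      # loudness cue (cardioid-like)
    cutoff = 2000.0 + 6000.0 * max(cos_angle, 0.0)      # spectral cue (Hz)
    alpha = np.exp(-2.0 * np.pi * cutoff / fs)          # one-pole low-pass coefficient
    out = np.empty_like(sig)
    prev = 0.0
    for i, x in enumerate(sig):
        prev = (1.0 - alpha) * x + alpha * prev         # y[n] = (1-a)x[n] + a*y[n-1]
        out[i] = gain * prev
    return out
```

As the talker turns away (cos_angle falls toward -1), the output gets both quieter and duller, mirroring the two cues the study identifies.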

The research emerges at a time when immersive audio is a major design frontier for consumer tech companies. Devices such as Meta Quest 3 and Apple Vision Pro are already shifting how people interact with digital spaces. Accurate rendering of human voices in these environments can significantly elevate user experience—whether in entertainment, education, or communication.

“AR and VR have become common with advances in technology,” Dr. Tsuji added. “As more content is developed for these devices in the future, the findings of our study may contribute to such fields.”

Beyond these immediate applications, the research has broader implications for how we build more intuitive and responsive soundscapes in the digital world. By improving realism through audio, companies can create more convincing immersive media, an important factor not only for entertainment but also for accessibility solutions, virtual meetings, and therapeutic interventions.

By uncovering the role of both loudness and spectral cues in voice-based directionality, this study deepens our understanding of auditory perception and lays a foundation for the next generation of spatial audio systems. The findings pave the way for designing more realistic virtual interactions, particularly those involving human speech, which is probably the most familiar and meaningful sound we process every day.

More information:
Shinya Tsuji et al., "Perception of speech uttered as speaker faces different directions in horizontal plane: Identification of speaker's facing directions from the listener," Acoustical Science and Technology (2024). DOI: 10.1250/ast.e24.99

Provided by Sophia University


