Diegetic and object-based spatial audio in cinematic VR: a PRISMA-Guided systematic review with a functional taxonomy and validation framework

Citation

Perumal, Vimala and Shah, Zeeshan Jawed (2026) Diegetic and object-based spatial audio in cinematic VR: a PRISMA-Guided systematic review with a functional taxonomy and validation framework. Frontiers in Virtual Reality, 7. ISSN 2673-4192

frvir-7-1696677.pdf - Published Version
Restricted to Repository staff only

Abstract

Introduction: Cinematic VR (CVR) removes the director’s frame, creating the challenge of guiding audience attention without breaking immersion. This systematic review synthesizes empirical evidence on two audio modalities with strong potential to function as narrative agents: diegetic audio (sounds from within the story world) and object-based spatial audio (discrete sound “objects” rendered with positional metadata). It aims to clarify how they guide attention, how they shape affect and presence, and how these effects are validated.

Methods: Following Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) 2020, searches in IEEE Xplore, ACM Digital Library, Scopus, and Web of Science (last searched June 2025) identified studies using diegetic and/or object-based spatial audio as narrative devices in CVR with empirical user data; non-diegetic-only or purely technical papers without user measures were excluded (except where used as baselines). We conducted a qualitative synthesis across behavioral (head/eye tracking), subjective (presence/engagement), and physiological (HR/EMG/EDA/PPG) measures; no protocol registration was performed.

Results: Eighteen studies met inclusion criteria. Across studies, world-locked, offscreen diegetic cues were repeatedly reported to redirect gaze and shorten time-to-region-of-interest after cuts, while object-based rendering enabled precise, dynamic cue placement and was commonly associated with higher presence/immersion and affective arousal relative to non-spatial or head-locked baselines.

Discussion: Evidence remains constrained by methodological heterogeneity, small-to-moderate samples, inconsistent reporting, and limited direct measures of narrative comprehension. We contribute (i) a functional taxonomy of CVR narrative audio techniques aligned to diegetic/object-based practice, (ii) a Validation Triangulation Framework integrating behavioral, subjective, and physiological evidence, and (iii) a Minimum Reporting & Sharing Standard for CVR Narrative Audio specifying what to report and how to share data/metadata, aligned with PRISMA guidance and Findable, Accessible, Interoperable, Reusable (FAIR) principles.

Item Type: Article
Uncontrolled Keywords: Cinematic virtual reality
Subjects: N Fine Arts > N Visual arts
Q Science > QA Mathematics > QA71-90 Instruments and machines > QA75.5-76.95 Electronic computers. Computer science
Divisions: Faculty of Creative Multimedia (FCM)
Depositing User: Ms Rosnani Abd Wahab
Date Deposited: 03 Apr 2026 02:37
Last Modified: 03 Apr 2026 02:37
URI: http://shdl.mmu.edu.my/id/eprint/15686