
Editorial
Optimized Temporal Scaffolding: Rhythmic Micro-Variations Selectively Enhance Visual Working Memory Precision
@ARTICLE{10.4108/eetpht.11.11068,
  author={Xingbo Chen and Melinda Xiaoxiao Ying and Kaiwen Xiao and Qiurui Wang},
  title={Optimized Temporal Scaffolding: Rhythmic Micro-Variations Selectively Enhance Visual Working Memory Precision},
  journal={EAI Endorsed Transactions of Pervasive Health and Technology},
  volume={11},
  number={1},
  publisher={EAI},
  journal_a={PHAT},
  year={2026},
  month={1},
  keywords={Visual Working Memory, Rhythmic Micro-Variations, Temporal Scaffolding, Cognitive Precision, Musical Expressivity, Cross-Modal Interaction},
  doi={10.4108/eetpht.11.11068}
}
Xingbo Chen
Melinda Xiaoxiao Ying
Kaiwen Xiao
Qiurui Wang
Year: 2026
Journal: PHAT
Publisher: EAI
DOI: 10.4108/eetpht.11.11068
Abstract
Visual working memory (VWM) precision, the fidelity of stored visual representations, is a critical determinant of cognitive performance, yet the influence of subtle, non-semantic auditory features remains largely unexplored. While the effects of gross musical features on cognition are relatively well documented, the role of expressive micro-variations (minute, human-like deviations in rhythm) in shaping VWM has received little empirical attention. We employed a controlled, within-subjects color change-detection paradigm (N = 100) to measure VWM capacity (K) and precision (σ). Participants performed the task under three auditory conditions: Silence, a Mechanically Isochronous Rhythm (ME), and a Micro-Variation Rhythm (MV) consisting of quasi-isochronous drum patterns at 240 BPM with ±20 ms timing variation. VWM metrics were estimated using a standard mixture-model analysis and evaluated with Bayesian multilevel regression. Relative to both Silence and the Mechanical Rhythm, the MV condition produced a robust enhancement of VWM precision (i.e., smaller σ; e.g., βMV−SI = −7.0, Ppost(β < 0) > 0.99), while VWM capacity (K) remained statistically stable across conditions. An exploratory analysis further suggested that the magnitude of the precision benefit was positively associated with participants’ level of musical training. These findings are consistent with the idea that the natural, non-linear temporal structure embedded in expressive rhythm can serve as an Optimized Temporal Scaffold for visual cognition, providing a more effective acoustic context than perfect isochrony for supporting high-fidelity VWM representations. The work bridges research on musical expressivity and fundamental cognitive resource allocation and points to novel theoretical insights and potential applications for designing acoustic environments that support, rather than constrain, cognitive performance.
Copyright © 2026 X. Chen et al., licensed to EAI. This is an open access article distributed under the terms of the CC BY-NC-SA 4.0, which permits copying, redistributing, remixing, transformation, and building upon the material in any medium so long as the original work is properly cited.


