EMID: An Emotional Aligned Dataset in Audio-Visual Modality

Published in ACM MM Workshop, 2023

We construct EMID, a high-quality music-image cross-modal matching dataset containing more than 30K data pairs. We use emotional consistency between music and images as the primary basis for cross-modal alignment. The dataset supports generation and retrieval tasks in domains such as art therapy.

Recommended citation: Jialing Zou*, Jiahao Mei*, Guangze Ye, et al. "EMID: An Emotional Aligned Dataset in Audio-Visual Modality." ACM MM Workshop, 2023.