Posts by Collection

portfolio

publications

EMID: An Emotional Aligned Dataset in Audio-Visual Modality

Published in ACM MM Workshop, 2023

A high-quality music-image cross-modal matching dataset (30K+ pairs) using emotional consistency as the primary basis for cross-modal alignment.

Recommended citation: Jialing Zou*, Jiahao Mei*, Guangze Ye, et al. "EMID: An Emotional Aligned Dataset in Audio-Visual Modality." ACM MM Workshop, 2023.
Download Paper

MM-StoryAgent: Immersive Narrated Storybook Video Generation with a Multi-Agent Paradigm across Text, Image and Audio

Published in NAACL, 2025

An open-source multi-modal, multi-agent framework for automated generation of immersive narrated storybook videos. 85K+ visits on ModelScope.

Recommended citation: Xuenan Xu, Jiahao Mei, Chenliang Li, et al. "MM-StoryAgent: Immersive Narrated Storybook Video Generation with a Multi-Agent Paradigm across Text, Image and Audio." NAACL, 2025.
Download Paper

UniFlow-Audio: Unified Flow Matching for Audio Generation from Omni-Modalities

Published as an arXiv preprint, 2025

The first fully open-source unified audio generation framework based on Flow Matching, with a novel Dual-Fusion mechanism supporting text, audio, and video inputs across 7 tasks.

Recommended citation: Xuenan Xu*, Jiahao Mei*, Zihao Zheng, et al. "UniFlow-Audio: Unified Flow Matching for Audio Generation from Omni-Modalities." arXiv, 2025.
Download Paper

LARA-Gen: Enabling Continuous Emotion Control for Music Generation Models via Latent Affective Representation Alignment

Published as an arXiv preprint, 2025

A latent affective representation alignment mechanism for continuous fine-grained emotion control in music generation using valence-arousal values.

Recommended citation: Jiahao Mei, Xuenan Xu, Zeyu Xie, et al. "LARA-Gen: Enabling Continuous Emotion Control for Music Generation Models via Latent Affective Representation Alignment." arXiv, 2025.
Download Paper

WritingBench: A Comprehensive Benchmark for Generative Writing

Published in NeurIPS, 2025

A comprehensive benchmark covering 6 domains, 100 sub-domains, and 1239 queries for long-form creative writing evaluation, with a dynamic evaluation framework achieving 83% human agreement.

Recommended citation: Yuning Wu, Jiahao Mei, Ming Yan, et al. "WritingBench: A Comprehensive Benchmark for Generative Writing." NeurIPS, 2025.
Download Paper

talks

teaching
