UniFlow-Audio: Unified Flow Matching for Audio Generation from Omni-Modalities
Published as an arXiv preprint, 2025
We propose UniFlow-Audio, the first fully open-source unified audio generation framework based on flow matching. Its novel Dual-Fusion mechanism unifies time-aligned and non-time-aligned audio generation tasks. UniFlow-Audio accepts text, audio, and video inputs and demonstrates excellent performance across seven tasks, including text-to-speech (TTS) and text-to-audio (TTA).
Recommended citation: Xuenan Xu*, Jiahao Mei*, Zihao Zheng, et al. "UniFlow-Audio: Unified Flow Matching for Audio Generation from Omni-Modalities." arXiv, 2025.
