Complex-Cycle-Consistent Diffusion Model for Monaural Speech Enhancement

Research output: Contribution in Book/Report/Proceedings - With ISBN/ISSN; Conference contribution/Paper; peer-review

Forthcoming

Standard

Complex-Cycle-Consistent Diffusion Model for Monaural Speech Enhancement. / Li, Yi; Sun, Yang; Angelov, Plamen.
Proceedings of the Thirty-Ninth AAAI Conference on Artificial Intelligence. Philadelphia: AAAI, 2024.

Harvard

Li, Y, Sun, Y & Angelov, P 2024, Complex-Cycle-Consistent Diffusion Model for Monaural Speech Enhancement. in Proceedings of the Thirty-Ninth AAAI Conference on Artificial Intelligence. AAAI, Philadelphia, The 39th Annual AAAI Conference on Artificial Intelligence, Philadelphia, Pennsylvania, United States, 25/02/25.

APA

Li, Y., Sun, Y., & Angelov, P. (in press). Complex-Cycle-Consistent Diffusion Model for Monaural Speech Enhancement. In Proceedings of the Thirty-Ninth AAAI Conference on Artificial Intelligence. AAAI.

Vancouver

Li Y, Sun Y, Angelov P. Complex-Cycle-Consistent Diffusion Model for Monaural Speech Enhancement. In Proceedings of the Thirty-Ninth AAAI Conference on Artificial Intelligence. Philadelphia: AAAI. 2024

Author

Li, Yi ; Sun, Yang ; Angelov, Plamen. / Complex-Cycle-Consistent Diffusion Model for Monaural Speech Enhancement. Proceedings of the Thirty-Ninth AAAI Conference on Artificial Intelligence. Philadelphia : AAAI, 2024.

Bibtex

@inproceedings{aa6b372059524416b850e7c550dd2c87,
title = "Complex-Cycle-Consistent Diffusion Model for Monaural Speech Enhancement",
abstract = "In this paper, we present a novel diffusion model-based monaural speech enhancement method. Our approach incorporates the separate estimation of speech spectra{\textquoteright}s magnitude and phase in two diffusion networks. Throughout the diffusion process, noise clips from real-world noise interferences are added gradually to the clean speech spectra and a noise-aware reverse process is proposed to learn how to generate both clean speech spectra and noise spectra. Furthermore, to fully leverage the intrinsic relationship between magnitude and phase, we introduce a complex-cycle-consistent (CCC) mechanism that uses the estimated magnitude to map the phase, and vice versa. We implement this algorithm within a phase-aware speech enhancement diffusion model (SEDM). We conduct extensive experiments on public datasets to demonstrate the effectiveness of our method, highlighting the significant benefits of exploiting the intrinsic relationship between phase and magnitude information to enhance speech. The comparison to conventional diffusion models demonstrates the superiority of SEDM.",
author = "Yi Li and Yang Sun and Plamen Angelov",
year = "2024",
month = dec,
day = "9",
language = "English",
booktitle = "Proceedings of the Thirty-Ninth AAAI Conference on Artificial Intelligence",
publisher = "AAAI",
note = "The 39th Annual AAAI Conference on Artificial Intelligence ; Conference date: 25-02-2025 Through 04-03-2025",
url = "https://aaai.org/conference/aaai/aaai-25/",

}

RIS

TY - GEN

T1 - Complex-Cycle-Consistent Diffusion Model for Monaural Speech Enhancement

AU - Li, Yi

AU - Sun, Yang

AU - Angelov, Plamen

PY - 2024/12/9

Y1 - 2024/12/9

N2 - In this paper, we present a novel diffusion model-based monaural speech enhancement method. Our approach incorporates the separate estimation of speech spectra’s magnitude and phase in two diffusion networks. Throughout the diffusion process, noise clips from real-world noise interferences are added gradually to the clean speech spectra and a noise-aware reverse process is proposed to learn how to generate both clean speech spectra and noise spectra. Furthermore, to fully leverage the intrinsic relationship between magnitude and phase, we introduce a complex-cycle-consistent (CCC) mechanism that uses the estimated magnitude to map the phase, and vice versa. We implement this algorithm within a phase-aware speech enhancement diffusion model (SEDM). We conduct extensive experiments on public datasets to demonstrate the effectiveness of our method, highlighting the significant benefits of exploiting the intrinsic relationship between phase and magnitude information to enhance speech. The comparison to conventional diffusion models demonstrates the superiority of SEDM.

AB - In this paper, we present a novel diffusion model-based monaural speech enhancement method. Our approach incorporates the separate estimation of speech spectra’s magnitude and phase in two diffusion networks. Throughout the diffusion process, noise clips from real-world noise interferences are added gradually to the clean speech spectra and a noise-aware reverse process is proposed to learn how to generate both clean speech spectra and noise spectra. Furthermore, to fully leverage the intrinsic relationship between magnitude and phase, we introduce a complex-cycle-consistent (CCC) mechanism that uses the estimated magnitude to map the phase, and vice versa. We implement this algorithm within a phase-aware speech enhancement diffusion model (SEDM). We conduct extensive experiments on public datasets to demonstrate the effectiveness of our method, highlighting the significant benefits of exploiting the intrinsic relationship between phase and magnitude information to enhance speech. The comparison to conventional diffusion models demonstrates the superiority of SEDM.

M3 - Conference contribution/Paper

BT - Proceedings of the Thirty-Ninth AAAI Conference on Artificial Intelligence

PB - AAAI

CY - Philadelphia

T2 - The 39th Annual AAAI Conference on Artificial Intelligence

Y2 - 25 February 2025 through 4 March 2025

ER -