Publications

Selected publications in reverse chronological order. See Google Scholar for the complete list of papers and their citation counts. My name appears in bold.

Xiang Hao, Jibin Wu, Jianwei Yu, Chenglin Xu, and Kay Chen Tan, “Typing to Listen at the Cocktail Party: Text-Guided Target Speaker Extraction,” IEEE Transactions on Cognitive and Developmental Systems, vol. 18, no. 2, pp. 361-372, 2026. DOI: 10.1109/TCDS.2025.3598687. arXiv: 2310.07284. Code: LLM-TSE.
Xiang Hao, Chenxiang Ma, Qu Yang, Jibin Wu, and Kay Chen Tan, “Toward Ultralow-Power Neuromorphic Speech Enhancement With Spiking-FullSubNet,” IEEE Transactions on Neural Networks and Learning Systems, vol. 36, no. 9, pp. 17350-17364, 2025. DOI: 10.1109/TNNLS.2025.3566021. Code: spiking-fullsubnet.
Hanglei Zhang, Yiwei Guo, Zhihan Li, Xiang Hao, Xie Chen, and Kai Yu, “Unlocking Temporal Flexibility: Neural Speech Codec with Variable Frame Rate,” Proc. Interspeech 2025, pp. 5003-5007, 2025. DOI: 10.21437/Interspeech.2025-1289. arXiv: 2505.16845.
Xiang Hao, Chenxiang Ma, Qu Yang, Kay Chen Tan, and Jibin Wu, “When Audio Denoising Meets Spiking Neural Network,” 2024 IEEE Conference on Artificial Intelligence (CAI), pp. 1524-1527, 2024. DOI: 10.1109/CAI59869.2024.00275. Code: spiking-fullsubnet.
Xiang Hao, Di Xu, Yang Zhao, Xin Meng, and Jibin Wu, “Pink-Eggs Dataset: A Step Toward Invasive Species Management Using Deep Learning Solutions,” 2024 IEEE International Conference on Cybernetics and Intelligent Systems (CIS) and IEEE International Conference on Robotics, Automation and Mechatronics (RAM), pp. 45-50, 2024. DOI: 10.1109/CIS-RAM61939.2024.10672749. Dataset: Pink-Eggs.
Honglin Qu, Xiangdong Su, Yonghe Wang, Xiang Hao, and Guanglai Gao, “Noise-Separated Adaptive Feature Distillation for Robust Speech Recognition,” IEEE Signal Processing Letters, vol. 30, pp. 763-767, 2023. DOI: 10.1109/LSP.2023.3289110.
Xiang Hao and Xiaofei Li, “Fast FullSubNet: Accelerate Full-Band and Sub-Band Fusion Model for Single-Channel Speech Enhancement,” arXiv preprint arXiv:2212.09019, 2022. DOI: 10.48550/arXiv.2212.09019. Code: FullSubNet.
Zhenhao Jin, Xiang Hao, and Xiangdong Su, “Coarse-to-Fine Recursive Speech Separation for Unknown Number of Speakers,” arXiv preprint arXiv:2203.16054, 2022. DOI: 10.48550/arXiv.2203.16054.
Xiang Hao, Xiangdong Su, Radu Horaud, and Xiaofei Li, “FullSubNet: A Full-Band and Sub-Band Fusion Model for Real-Time Single-Channel Speech Enhancement,” ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 6633-6637, 2021. DOI: 10.1109/ICASSP39728.2021.9414177. arXiv: 2010.15508. Code: FullSubNet.
Xiang Hao, Shixue Wen, Xiangdong Su, Yun Liu, Guanglai Gao, and Xiaofei Li, “Sub-Band Knowledge Distillation Framework for Speech Enhancement,” Proc. Interspeech 2020, pp. 2687-2691, 2020. DOI: 10.21437/Interspeech.2020-1539. arXiv: 2005.14435.
Xiang Hao, Xiangdong Su, Zhiyu Wang, Qiang Zhang, Huali Xu, and Guanglai Gao, “SNR-Based Teachers-Student Technique for Speech Enhancement,” 2020 IEEE International Conference on Multimedia and Expo (ICME), pp. 1-6, 2020. DOI: 10.1109/ICME46284.2020.9102846. arXiv: 2005.14441.
Xiang Hao, Xiangdong Su, Shixue Wen, Zhiyu Wang, Yiqian Pan, Feilong Bao, and Wei Chen, “Masking and Inpainting: A Two-Stage Speech Enhancement Approach for Low SNR and Non-Stationary Noise,” ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 6959-6963, 2020. DOI: 10.1109/ICASSP40776.2020.9053188.
Xiang Hao, Xiangdong Su, Zhiyu Wang, and Hui Zhang, “UNetGAN: A Robust Speech Enhancement Approach in Time Domain for Extremely Low Signal-to-Noise Ratio Condition,” Proc. Interspeech 2019, pp. 1786-1790, 2019. DOI: 10.21437/Interspeech.2019-1567. arXiv: 2010.15521. Demo: Extremely-Low-SNR-Demo.