Xuenan Xu
2020 – today
- 2024
- [j3] Xuenan Xu, Zeyu Xie, Mengyue Wu, Kai Yu:
  Beyond the Status Quo: A Contemporary Survey of Advances and Challenges in Audio Captioning. IEEE ACM Trans. Audio Speech Lang. Process. 32: 95-112 (2024)
- [c18] Zeyu Xie, Baihan Li, Xuenan Xu, Mengyue Wu, Kai Yu:
  Enhancing Audio Generation Diversity with Visual Information. ICASSP 2024: 866-870
- [c17] Xuenan Xu, Xiaohang Xu, Zeyu Xie, Pingyue Zhang, Mengyue Wu, Kai Yu:
  A Detailed Audio-Text Data Simulation Pipeline Using Single-Event Sounds. ICASSP 2024: 1091-1095
- [c16] Luoyi Sun, Xuenan Xu, Mengyue Wu, Weidi Xie:
  Auto-ACD: A Large-scale Dataset for Audio-Language Representation Learning. ACM Multimedia 2024: 5025-5034
- [i24] Xuenan Xu, Ziyang Ma, Mengyue Wu, Kai Yu:
  Towards Weakly Supervised Text-to-Audio Grounding. CoRR abs/2401.02584 (2024)
- [i23] Zeyu Xie, Baihan Li, Xuenan Xu, Mengyue Wu, Kai Yu:
  Enhancing Audio Generation Diversity with Visual Information. CoRR abs/2403.01278 (2024)
- [i22] Xuenan Xu, Xiaohang Xu, Zeyu Xie, Pingyue Zhang, Mengyue Wu, Kai Yu:
  A Detailed Audio-Text Data Simulation Pipeline using Single-Event Sounds. CoRR abs/2403.04594 (2024)
- [i21] Yi Yuan, Zhuo Chen, Xubo Liu, Haohe Liu, Xuenan Xu, Dongya Jia, Yuanzhe Chen, Mark D. Plumbley, Wenwu Wang:
  T-CLAP: Temporal-Enhanced Contrastive Language-Audio Pretraining. CoRR abs/2404.17806 (2024)
- [i20] Haohe Liu, Xuenan Xu, Yi Yuan, Mengyue Wu, Wenwu Wang, Mark D. Plumbley:
  SemantiCodec: An Ultra Low Bitrate Semantic Audio Codec for General Sound. CoRR abs/2405.00233 (2024)
- [i19] Yiming Zhang, Xuenan Xu, Ruoyi Du, Haohe Liu, Yuan Dong, Zheng-Hua Tan, Wenwu Wang, Zhanyu Ma:
  Zero-Shot Audio Captioning Using Soft and Hard Prompts. CoRR abs/2406.06295 (2024)
- [i18] Zeyu Xie, Baihan Li, Xuenan Xu, Zheng Liang, Kai Yu, Mengyue Wu:
  FakeSound: Deepfake General Audio Detection. CoRR abs/2406.08052 (2024)
- [i17] Zeyu Xie, Xuenan Xu, Zhizheng Wu, Mengyue Wu:
  AudioTime: A Temporally-aligned Audio-text Benchmark Dataset. CoRR abs/2407.02857 (2024)
- [i16] Zeyu Xie, Xuenan Xu, Zhizheng Wu, Mengyue Wu:
  PicoAudio: Enabling Precise Timestamp and Frequency Controllability of Audio Events in Text-to-audio Generation. CoRR abs/2407.02869 (2024)
- [i15] Baihan Li, Zeyu Xie, Xuenan Xu, Yiwei Guo, Ming Yan, Ji Zhang, Kai Yu, Mengyue Wu:
  DiveSound: LLM-Assisted Automatic Taxonomy Construction for Diverse Audio Generation. CoRR abs/2407.13198 (2024)
- [i14] Xuenan Xu, Haohe Liu, Mengyue Wu, Wenwu Wang, Mark D. Plumbley:
  Efficient Audio Captioning with Encoder-Level Knowledge Distillation. CoRR abs/2407.14329 (2024)
- [i13] Xuenan Xu, Pingyue Zhang, Ming Yan, Ji Zhang, Mengyue Wu:
  Enhancing Zero-shot Audio Classification using Sound Attribute Knowledge from Large Language Models. CoRR abs/2407.14355 (2024)
- 2023
- [c15] Guangwei Li, Xuenan Xu, Lingfeng Dai, Mengyue Wu, Kai Yu:
  Diverse and Vivid Sound Generation from Text Descriptions. ICASSP 2023: 1-5
- [c14] Xuenan Xu, Mengyue Wu, Kai Yu:
  Investigating Pooling Strategies and Loss Functions for Weakly-Supervised Text-to-Audio Grounding via Contrastive Learning. ICASSP Workshops 2023: 1-5
- [c13] Zeyu Xie, Xuenan Xu, Mengyue Wu, Kai Yu:
  Enhance Temporal Relations in Audio Captioning with Sound Event Detection. INTERSPEECH 2023: 4179-4183
- [c12] Xuenan Xu, Zhiling Zhang, Zelin Zhou, Pingyue Zhang, Zeyu Xie, Mengyue Wu, Kenny Q. Zhu:
  BLAT: Bootstrapping Language-Audio Pre-training based on AudioSet Tag-guided Synthetic Data. ACM Multimedia 2023: 2756-2764
- [i12] Xuenan Xu, Zhiling Zhang, Zelin Zhou, Pingyue Zhang, Zeyu Xie, Mengyue Wu, Kenny Q. Zhu:
  BLAT: Bootstrapping Language-Audio Pre-training based on AudioSet Tag-guided Synthetic Data. CoRR abs/2303.07902 (2023)
- [i11] Guangwei Li, Xuenan Xu, Lingfeng Dai, Mengyue Wu, Kai Yu:
  Diverse and Vivid Sound Generation from Text Descriptions. CoRR abs/2305.01980 (2023)
- [i10] Zeyu Xie, Xuenan Xu, Mengyue Wu, Kai Yu:
  Enhance Temporal Relations in Audio Captioning with Sound Event Detection. CoRR abs/2306.01533 (2023)
- [i9] Hanxue Zhang, Zeyu Xie, Xuenan Xu, Mengyue Wu, Kai Yu:
  Improving Audio Caption Fluency with Automatic Error Correction. CoRR abs/2306.10090 (2023)
- [i8] Luoyi Sun, Xuenan Xu, Mengyue Wu, Weidi Xie:
  A Large-scale Dataset for Audio-Language Representation Learning. CoRR abs/2309.11500 (2023)
- 2022
- [j2] Ning Yang, De-Feng Liu, Tao Liu, Tianyuan Han, Pingyue Zhang, Xuenan Xu, Siyu Lou, Huan-Guang Liu, Anchao Yang, Cheng Dong, Mang I Vai, Sio-Hang Pun, Jian-Guo Zhang:
  Automatic Detection Pipeline for Accessing the Motor Severity of Parkinson's Disease in Finger Tapping and Postural Stability. IEEE Access 10: 66961-66973 (2022)
- [c11] Guangwei Li, Xuenan Xu, Heinrich Dinkel, Mengyue Wu, Kai Yu:
  Category-Adapted Sound Event Enhancement with Weakly Labeled Data. ICASSP 2022: 851-855
- [c10] Xuenan Xu, Mengyue Wu, Kai Yu:
  Diversity-Controllable and Accurate Audio Captioning Based on Neural Condition. ICASSP 2022: 971-975
- [c9] Zelin Zhou, Zhiling Zhang, Xuenan Xu, Zeyu Xie, Mengyue Wu, Kenny Q. Zhu:
  Can Audio Captions Be Evaluated With Image Caption Metrics? ICASSP 2022: 981-985
- [c8] Guangwei Li, Xuenan Xu, Mengyue Wu, Kai Yu:
  Navigating Audio-Visual Event Detection Across Mismatched Modalities. ICASSP 2022: 1975-1979
- [c7] Siyu Lou, Xuenan Xu, Mengyue Wu, Kai Yu:
  Audio-Text Retrieval in Context. ICASSP 2022: 4793-4797
- [i7] Siyu Lou, Xuenan Xu, Mengyue Wu, Kai Yu:
  Audio-text Retrieval in Context. CoRR abs/2203.13645 (2022)
- [i6] Xuenan Xu, Mengyue Wu, Kai Yu:
  A Comprehensive Survey of Automated Audio Captioning. CoRR abs/2205.05357 (2022)
- 2021
- [j1] Heinrich Dinkel, Shuai Wang, Xuenan Xu, Mengyue Wu, Kai Yu:
  Voice Activity Detection in the Wild: A Data-Driven Approach Using Teacher-Student Training. IEEE ACM Trans. Audio Speech Lang. Process. 29: 1542-1555 (2021)
- [c6] Xuenan Xu, Heinrich Dinkel, Mengyue Wu, Kai Yu:
  Text-to-Audio Grounding: Building Correspondence Between Captions and Sound Events. ICASSP 2021: 606-610
- [c5] Xuenan Xu, Heinrich Dinkel, Mengyue Wu, Zeyu Xie, Kai Yu:
  Investigating Local and Global Information for Automated Audio Captioning with Transfer Learning. ICASSP 2021: 905-909
- [c4] Xuenan Xu, Heinrich Dinkel, Mengyue Wu, Kai Yu:
  A Lightweight Framework for Online Voice Activity Detection in the Wild. Interspeech 2021: 371-375
- [c3] Xuenan Xu, Heinrich Dinkel, Mengyue Wu, Kai Yu:
  Audio Caption in a Car Setting with a Sentence-Level Loss. ISCSLP 2021: 1-5
- [i5] Xuenan Xu, Heinrich Dinkel, Mengyue Wu, Zeyu Xie, Kai Yu:
  Investigating Local and Global Information for Automated Audio Captioning with Transfer Learning. CoRR abs/2102.11457 (2021)
- [i4] Xuenan Xu, Heinrich Dinkel, Mengyue Wu, Kai Yu:
  Text-to-Audio Grounding: Building Correspondence Between Captions and Sound Events. CoRR abs/2102.11474 (2021)
- [i3] Heinrich Dinkel, Shuai Wang, Xuenan Xu, Mengyue Wu, Kai Yu:
  Voice activity detection in the wild: A data-driven approach using teacher-student training. CoRR abs/2105.04065 (2021)
- [i2] Zelin Zhou, Zhiling Zhang, Xuenan Xu, Zeyu Xie, Mengyue Wu, Kenny Q. Zhu:
  Can Audio Captions Be Evaluated with Image Caption Metrics? CoRR abs/2110.04684 (2021)
- 2020
- [c2] Nathan Magyar, Xuenan Xu, Molly Maher:
  Creating and Evaluating a Goal Setting Prototype for MOOCs. CHI Extended Abstracts 2020: 1-8
- [c1] Xuenan Xu, Heinrich Dinkel, Mengyue Wu, Kai Yu:
  A CRNN-GRU Based Reinforcement Learning Approach to Audio Captioning. DCASE 2020: 225-229
2010 – 2019
- 2019
- [i1] Xuenan Xu, Heinrich Dinkel, Mengyue Wu, Kai Yu:
  What does a Car-ssette tape tell? CoRR abs/1905.13448 (2019)
last updated on 2024-11-13 23:44 CET by the dblp team
all metadata released as open data under CC0 1.0 license