Multi-Task Learning with Shared Encoder for Non-Autoregressive Machine Translation

Yongchang Hao, Shilin He, Wenxiang Jiao, Zhaopeng Tu, Michael Lyu, Xing Wang


Abstract
Non-Autoregressive machine Translation (NAT) models have demonstrated significant inference speedup but suffer from inferior translation accuracy. A common practice for tackling this problem is to transfer knowledge from Autoregressive machine Translation (AT) models to NAT models, e.g., via knowledge distillation. In this work, we hypothesize and empirically verify that AT and NAT encoders capture different linguistic properties of source sentences. Therefore, we propose to adopt multi-task learning to transfer the AT knowledge to NAT models through encoder sharing. Specifically, we treat AT training as an auxiliary task to enhance NAT model performance. Experimental results on the WMT14 En-De and WMT16 En-Ro datasets show that the proposed Multi-Task NAT achieves significant improvements over the baseline NAT models. Furthermore, results on the large-scale WMT19 and WMT20 En-De datasets confirm the consistency of our proposed method. In addition, experimental results demonstrate that our Multi-Task NAT is complementary to knowledge distillation, the standard knowledge transfer method for NAT.
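The abstract describes a single encoder shared between a primary NAT decoder and an auxiliary AT decoder, trained jointly so that the AT signal shapes the shared source representations. Below is a minimal PyTorch sketch of that multi-task objective; the class and function names, the at_weight interpolation coefficient, and the injected encoder/decoder submodules are illustrative assumptions, not the authors' implementation (see the repository linked under Code).

import torch.nn as nn

class MultiTaskNAT(nn.Module):
    """Shared encoder feeding a primary NAT decoder and an auxiliary AT decoder.

    All names here are hypothetical; this sketches the training setup
    described in the abstract, not the paper's exact architecture.
    """

    def __init__(self, encoder, nat_decoder, at_decoder, at_weight=0.5):
        super().__init__()
        self.encoder = encoder          # shared between both tasks
        self.nat_decoder = nat_decoder  # primary task: parallel (non-autoregressive) decoding
        self.at_decoder = at_decoder    # auxiliary task: left-to-right (autoregressive) decoding
        self.at_weight = at_weight      # assumed interpolation weight for the AT loss

    def forward(self, src_tokens, prev_tgt_tokens):
        # Encode the source once; both decoders consume the same encoder
        # states, so gradients from both losses update the shared encoder.
        enc_out = self.encoder(src_tokens)
        nat_logits = self.nat_decoder(enc_out)                 # predicts all target tokens in parallel
        at_logits = self.at_decoder(enc_out, prev_tgt_tokens)  # teacher-forced autoregressive prediction
        return nat_logits, at_logits

def multi_task_loss(nat_logits, at_logits, tgt_tokens, at_weight=0.5, pad_idx=1):
    # Joint objective: L = L_NAT + at_weight * L_AT.
    # Logits are (batch, tgt_len, vocab); CrossEntropyLoss expects (batch, vocab, tgt_len).
    ce = nn.CrossEntropyLoss(ignore_index=pad_idx)
    nat_loss = ce(nat_logits.transpose(1, 2), tgt_tokens)
    at_loss = ce(at_logits.transpose(1, 2), tgt_tokens)
    return nat_loss + at_weight * at_loss

Since only the encoder is shared, the auxiliary AT decoder can be dropped at inference time and the NAT decoder still decodes all positions in parallel, preserving the speedup the abstract highlights.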
Anthology ID:
2021.naacl-main.313
Volume:
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
Month:
June
Year:
2021
Address:
Online
Editors:
Kristina Toutanova, Anna Rumshisky, Luke Zettlemoyer, Dilek Hakkani-Tur, Iz Beltagy, Steven Bethard, Ryan Cotterell, Tanmoy Chakraborty, Yichao Zhou
Venue:
NAACL
Publisher:
Association for Computational Linguistics
Pages:
3989–3996
URL:
https://aclanthology.org/2021.naacl-main.313
DOI:
10.18653/v1/2021.naacl-main.313
Cite (ACL):
Yongchang Hao, Shilin He, Wenxiang Jiao, Zhaopeng Tu, Michael Lyu, and Xing Wang. 2021. Multi-Task Learning with Shared Encoder for Non-Autoregressive Machine Translation. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 3989–3996, Online. Association for Computational Linguistics.
Cite (Informal):
Multi-Task Learning with Shared Encoder for Non-Autoregressive Machine Translation (Hao et al., NAACL 2021)
PDF:
https://aclanthology.org/2021.naacl-main.313.pdf
Video:
https://aclanthology.org/2021.naacl-main.313.mp4
Code:
yongchanghao/multi-task-nat