Direct Preference Optimization for Neural Machine Translation with Minimum Bayes Risk Decoding

Yang, Guangyu; Chen, Jinghong; Lin, Weizhe; Byrne, Bill

arXiv:2311.08380 (cs)

[Submitted on 14 Nov 2023 (v1), last revised 12 Apr 2024 (this version, v2)]

Abstract: Minimum Bayes Risk (MBR) decoding can significantly improve translation performance of Multilingual Large Language Models (MLLMs). However, MBR decoding is computationally expensive. We show how the recently developed Reinforcement Learning technique, Direct Preference Optimization (DPO), can fine-tune MLLMs to get the gains of MBR without any additional computation in inference. Our method uses only a small monolingual fine-tuning set and yields significantly improved performance on multiple NMT test sets compared to MLLMs without DPO.
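
The idea in the abstract can be illustrated with a minimal sketch, under several assumptions that the abstract does not confirm. MBR decoding is shown in its common sampling-based form, where each candidate is scored by its average utility against the other candidates treated as pseudo-references. The utility here is a toy unigram F1, a stand-in for whatever translation metric is actually used. Building a preference pair from the highest- and lowest-scoring candidates is likewise an assumption, and all function names (`unigram_f1`, `mbr_scores`, `preference_pair`, `dpo_loss`) are hypothetical. The loss is the standard DPO objective for a single pair, computed from sequence log-probabilities under the policy and a frozen reference model.

```python
import math
from collections import Counter


def unigram_f1(hyp: str, ref: str) -> float:
    """Toy utility (assumption): unigram F1 overlap between two strings."""
    h, r = Counter(hyp.split()), Counter(ref.split())
    overlap = sum((h & r).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(h.values())
    recall = overlap / sum(r.values())
    return 2 * precision * recall / (precision + recall)


def mbr_scores(candidates, utility=unigram_f1):
    """Average utility of each candidate against all candidates,
    which act as uniformly weighted pseudo-references."""
    n = len(candidates)
    return [sum(utility(c, r) for r in candidates) / n for c in candidates]


def mbr_decode(candidates, utility=unigram_f1):
    """Return the candidate with the highest expected utility."""
    scores = mbr_scores(candidates, utility)
    best = max(range(len(candidates)), key=scores.__getitem__)
    return candidates[best]


def preference_pair(candidates, utility=unigram_f1):
    """Assumed pairing scheme: (chosen, rejected) are the best and worst
    candidates by MBR score."""
    scores = mbr_scores(candidates, utility)
    order = sorted(range(len(candidates)), key=scores.__getitem__)
    return candidates[order[-1]], candidates[order[0]]


def dpo_loss(policy_logp_w, policy_logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    """Standard DPO loss for one (chosen w, rejected l) pair:
    -log sigmoid(beta * ((pi_w - ref_w) - (pi_l - ref_l)))."""
    margin = beta * ((policy_logp_w - ref_logp_w) - (policy_logp_l - ref_logp_l))
    return math.log1p(math.exp(-margin))  # equals -log(sigmoid(margin))


# Hypothetical sampled translations for one source sentence.
samples = [
    "the cat sat on the mat",
    "the cat sat on a mat",
    "a dog ran away",
]
chosen, rejected = preference_pair(samples)
```

The quadratic number of utility calls in `mbr_scores` is the inference cost the abstract refers to. Fine-tuning on such pairs with the DPO loss moves that cost to training time, so that ordinary single-pass decoding can be used afterwards.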

Comments:	To appear at NAACL 2024
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2311.08380 [cs.CL]
	(or arXiv:2311.08380v2 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2311.08380

Submission history

From: Guangyu Yang
[v1] Tue, 14 Nov 2023 18:43:51 UTC (8,601 KB)
[v2] Fri, 12 Apr 2024 14:07:38 UTC (8,645 KB)
