Itzhak, Itay; Sinha, Koustuv; Lake, Brenden; Williams, Adina; Hupkes, Dieuwke

Evaluating locality in NMT models

2022

Creative Commons 'BY' version 4.0 license

Abstract

With a series of theoretically informed tests, Dankers, Bruni, and Hupkes (2021) investigated how compositional the behavior of neural networks is when they are trained on fully natural data. Focusing on neural machine translation (NMT), they found, among other things, that models appear to modulate poorly between local and global behavior: local changes in the input often affect the output in an unwanted manner. While their study is based exclusively on the behavior of the models, we go one step further and investigate how this non-locality manifests itself within the model. We develop metrics to quantify internal locality on the encoder side of the model, focusing on the attention mechanism. We find strikingly different patterns in models trained on different amounts of data, patterns that go beyond what could be observed behaviorally, and we present a range of experiments showing how local and global behavior is modulated in different setups.

Proceedings of the Annual Meeting of the Cognitive Science Society
