Modeling Task-Aware MIMO Cardinality for Efficient Multilingual Neural Machine Translation

Hongfei Xu, Qiuhui Liu, Josef van Genabith, Deyi Xiong

August 2021

Abstract

Neural machine translation has achieved great success in both bilingual and multilingual settings. As the number of languages increases, however, multilingual systems tend to underperform their bilingual counterparts. Model capacity has been found crucial for massively multilingual NMT to support language pairs with varying typological characteristics. Previous work increases modeling capacity by deepening or widening the Transformer. However, modeling cardinality, which aggregates a set of transformations with the same topology, has been shown to be more effective than going deeper or wider when increasing capacity. In this paper, we propose to efficiently increase the capacity of multilingual NMT by increasing the cardinality. Unlike previous work, which feeds the same input to several transformations and merges their outputs into one, we present a Multi-Input-Multi-Output (MIMO) architecture that allows each transformation of the block to have its own input. We also present a task-aware attention mechanism that learns to selectively utilize individual transformations from the set for different translation directions. Our model surpasses previous work and establishes a new state of the art on the large-scale OPUS-100 corpus while being 1.31 times as fast.
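The contrast drawn in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the function names, the scaling "transformations", and the fixed per-direction logits in `TASK_LOGITS` are all hypothetical stand-ins, assumed here only to show the difference between feeding one shared input to every transformation and giving each transformation its own input, followed by a direction-dependent weighted merge.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def shared_input_block(x, transforms):
    """Shared-input cardinality: every transformation sees the same input
    vector and the outputs are summed into a single vector."""
    outputs = [f(x) for f in transforms]
    return [sum(values) for values in zip(*outputs)]


def mimo_block(inputs, transforms):
    """MIMO-style block (assumed form): the i-th transformation receives the
    i-th input, and the outputs are kept separate for the next block."""
    return [f(x) for f, x in zip(transforms, inputs)]


def task_aware_merge(branch_outputs, task_logits):
    """Weight the branch outputs with a softmax over per-direction logits.
    In a trained model these logits would be learned; here they are fixed."""
    weights = softmax(task_logits)
    dim = len(branch_outputs[0])
    return [
        sum(w * out[d] for w, out in zip(weights, branch_outputs))
        for d in range(dim)
    ]


# Hypothetical transformations with identical topology: scale by a constant.
transforms = [lambda v, k=k: [k * a for a in v] for k in (1.0, 2.0, 3.0)]

# Hypothetical per-direction logits over the three transformations.
TASK_LOGITS = {
    "uniform": [0.0, 0.0, 0.0],
    "prefers-first": [50.0, 0.0, 0.0],
}

x = [1.0, 2.0]
summed = shared_input_block(x, transforms)
branches = mimo_block([x, x, x], transforms)
merged_uniform = task_aware_merge(branches, TASK_LOGITS["uniform"])
merged_first = task_aware_merge(branches, TASK_LOGITS["prefers-first"])
```

With uniform logits the merge is a plain average of the branches, while strongly skewed logits make one transformation dominate for that direction. Stacking several `mimo_block` calls would keep the streams separate between blocks, which is the property the abstract contrasts with merging after every block.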

Type

Conference paper

Publication

Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing

Hongfei Xu

Researcher

Josef van Genabith

Professor at German Research Center for Artificial Intelligence (DFKI)