We compare our team's systems to others submitted for the CODI-CRAC 2021 Shared-Task on anaphora resolution in dialogue. We analyse the architectures and performance, report some problematic cases in gold annotations, and suggest possible …
We describe the system developed by the DFKI-TalkingRobots Team for the CODI-CRAC 2021 Shared-Task on anaphora resolution in dialogue. Our system consists of three subsystems: (1) the Workspace Coreference System (WCS) incrementally clusters mentions …
For many use cases, it is required that MT does not just translate raw text, but complex formatted documents (e.g. websites, slides, spreadsheets) and the result of the translation should reflect the formatting. This is challenging, as markup can be …
We present the EveEnti (Event and Entity) annotation framework for events and entities in dialogue that we use to annotate several dialogues in German from the emergency response domain.
We present MobIE, a German-language dataset, which is human-annotated with 20 coarse- and fine-grained entity types and entity linking information for geographically linkable entities. The dataset consists of 3,232 social media texts and traffic …
For most language combinations, parallel data is either scarce or simply unavailable. To address this, unsupervised machine translation (UMT) exploits large amounts of monolingual data by using synthetic data generation techniques such as …
Word alignment and machine translation are two closely related tasks. Neural translation models, such as RNN-based and Transformer models, employ a target-to-source attention mechanism which can provide rough word alignments, but with a rather low …
Neural machine translation has achieved great success in bilingual settings, as well as in multilingual settings. With the increase of the number of languages, multilingual systems tend to underperform their bilingual counterparts. Model capacity has …
One of the reasons Transformer translation models are popular is that self-attention networks for context modelling can be easily parallelized at sequence level. However, the computational complexity of a self-attention network is $O(n^2)$, …
There has been a significant progress in thefield of extractive question answering (EQA)in the recent years. However, most of themrely on annotations of answer-spans in the cor-responding passages. In this work, we ad-dress …