Contextual Reasoning and Adaptation for Natural Language Processing

Language is implicit — it omits information. Filling this information gap requires contextual inference, background- and commonsense knowledge, and reasoning over situational context. Language also evolves, i.e., it specializes and changes over time. For example, many different languages and domains exist, new domains arise, and both evolve constantly. Thus, language understanding also requires continuous and efficient adaptation to new languages and domains — and transfer to, and between, both. Current language understanding methods, however, focus on high resource languages and domains, use little to no context, and assume static data, task, and target distributions.

The research in Cora4NLP aims to address these challenges. It builds on the expertise and results of the predecessor project DEEPLEE and is carried out jointly between DFKI’s language technology research departments in Berlin and Saarbrücken. Specifically, our goal is to develop natural language understanding methods that enable:

  • reasoning over broader co- and contexts
  • efficient adaptation to novel and/or low resource contexts
  • continual adaptation to, and generalization over, evolving contexts

Cora4NLP is funded by the German Federal Ministry of Education and Research (BMBF) under funding code 01IW20010.

Latest News

Recent Publications

InterroLang: Exploring NLP Models and Datasets through Dialogue-based Explanations

While recently developed NLP explainability methods let us open the black box in various ways (Madsen et al., 2022), a missing ingredient in this endeavor is an interactive tool offering a conversational interface. Such a dialogue system can help users explore datasets and models with explanations in a contextualized manner, e.g. via clarification or follow-up questions, and through a natural language interface. We adapt the conversational explanation framework TalkToModel (Slack et al., 2022) to the NLP domain, add new NLP-specific operations such as free-text rationalization, and illustrate its generalizability on three NLP tasks (dialogue act classification, question answering, hate speech detection). To recognize user queries for explanations, we evaluate fine-tuned and few-shot prompting models and implement a novel Adapter-based approach. We then conduct two user studies on (1) the perceived correctness and helpfulness of the dialogues, and (2) the simulatability, i.e. how objectively helpful dialogical explanations are for humans in figuring out the model’s predicted label when it’s not shown. We found rationalization and feature attribution were helpful in explaining the model behavior. Moreover, users could more reliably predict the model outcome based on an explanation dialogue rather than one-off explanations.

JoinER-BART: Joint Entity and Relation Extraction with Constrained Decoding, Representation Reuse and Fusion

Joint Entity and Relation Extraction (JERE) is an important research direction in Information Extraction (IE). Given the surprising performance with fine-tuning of pre-trained BERT in a wide range of NLP tasks, nowadays most studies for JERE are based on the BERT model. Rather than predicting a simple tag for each word, these approaches are usually forced to design complex tagging schemes, as they may have to extract entity-relation pairs which may overlap with others from the same sequence of word representations in a sentence. Recently, sequence-to-sequence (seq2seq) pre-trained BART models show better performance than BERT models in many NLP tasks. Importantly, a seq2seq BART model can simply generate sequences of (many) entity-relation triplets with its decoder, rather than just tag input words. In this paper, we present a new generative JERE framework based on pre-trained BART. Different from the basic seq2seq BART architecture: 1) our framework employs a constrained classifier which only predicts either a token of the input sentence or a relation in each decoding step, and 2) we reuse representations from the pre-trained BART encoder in the classifier instead of a newly trained weight matrix, as this better utilizes the knowledge of the pre-trained model and context-aware representations for classification, and empirically leads to better performance. In our experiments on the widely studied NYT and WebNLG datasets, we show that our approach outperforms previous studies and establishes a new state-of-the-art (92.91 and 91.37 F1 respectively in exact match evaluation).