Retrieval Augmented Generative Task-Oriented Dialogue Systems

Abstract

In this thesis, we study task-oriented dialogue systems, which rely on efficient knowledge inference from background knowledge sources and the dialogue history to satisfy a user goal. Conventional modular architectures and more recent end-to-end architectures are the established design choices for such systems; we instead propose a relatively simple retrieve-and-generate strategy that combines the strengths of both architectures while addressing their respective limitations. We experiment with different retrieval techniques based on sparse representations and dense embeddings. To cover the diversity of task-oriented dialogue datasets, we evaluate on SMD, Camrest, and MultiWOZ-2.1. Furthermore, given the entity-rich nature of task-oriented dialogue, we question the common practice of introducing auxiliary objectives to better capture entity awareness and offer a simple alternative: adding a syntax embedding layer on top of the standard token embedding and position embedding layers, thereby explicitly adding syntactic knowledge to the model parameters. We propose a syntax-infused transformer, a model that explicitly leverages syntactic information by augmenting its input with readily available entity-level metadata, e.g. part-of-speech tags. Despite its simplicity, the syntax-infused transformer is effective. On standard evaluation benchmarks for task-oriented dialogue systems, it exceeds our base model by an average of 13 Entity-F1 points and 2.8 BLEU points across the three datasets, and it outperforms existing state-of-the-art models on the Entity-F1 metric.
Overall, our work proposes a model that is more interpretable, easily reproducible, and lightweight in terms of trainable parameters, while achieving performance comparable to state-of-the-art models. Additionally, we conduct a robust error analysis of the generated responses alongside the evaluation metrics and propose a handful of future research directions.
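
The syntax embedding layer described in the abstract can be illustrated with a minimal sketch. It assumes the three embeddings are combined by element-wise summation and uses small random lookup tables in place of learned weights; the class name `SyntaxInfusedEmbedding`, the helper `make_table`, and all sizes are hypothetical and are not taken from the thesis.

```python
import random


def make_table(num_rows, dim, rng):
    """Build a lookup table of small random vectors (stand-in for learned weights)."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(num_rows)]


class SyntaxInfusedEmbedding:
    """Sum token, position, and syntax (e.g. part-of-speech tag) embeddings.

    Assumption: the three embeddings are combined by element-wise addition.
    """

    def __init__(self, vocab_size, max_len, num_tags, dim, seed=0):
        rng = random.Random(seed)
        self.dim = dim
        self.token = make_table(vocab_size, dim, rng)    # one row per token id
        self.position = make_table(max_len, dim, rng)    # one row per position
        self.syntax = make_table(num_tags, dim, rng)     # one row per syntax tag id

    def __call__(self, token_ids, tag_ids):
        if len(token_ids) != len(tag_ids):
            raise ValueError("token_ids and tag_ids must have the same length")
        if len(token_ids) > len(self.position):
            raise ValueError("sequence is longer than max_len")
        output = []
        for pos, (tok, tag) in enumerate(zip(token_ids, tag_ids)):
            rows = zip(self.token[tok], self.position[pos], self.syntax[tag])
            output.append([t + p + s for t, p, s in rows])
        return output


# Hypothetical ids: three tokens, each paired with a syntax tag id.
embed = SyntaxInfusedEmbedding(vocab_size=100, max_len=16, num_tags=5, dim=8)
vectors = embed([12, 7, 42], [0, 1, 2])
```

Each input position yields one vector of size `dim`, so the syntax signal reaches the transformer layers through the same channel as the token and position information, without an auxiliary training objective.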

Type
Publication
Master Thesis
Soumya Ranjan Sahoo
AI Researcher (Conversational AI)

My interests are in building robust machine learning models for large-scale information systems, spanning the areas of natural language processing, information extraction and retrieval, knowledge bases, and knowledge graphs. I am also interested in problems in geometric deep learning, particularly graph machine learning, and theory. My current focus area includes representation learning of natural languages and graphs.