REthinking PAssage Retrieval for Question-Answering (REPARQA)

Workshop Website

Workshop Organisers

  • Rafika Boutalbi
    University of Stuttgart, Germany
  • Mohamed Nadif
    Université de Paris, CNRS Centre Borelli, France
  • Rim Hantach
    Engie, France

Workshop Abstract

Research on Question-Answering (QA) systems has recently achieved considerable success in simplified closed-domain settings such as the SQuAD dataset, which provides a preselected passage for each question. Researchers have since turned to open-domain QA, which presents a key challenge in natural language processing (NLP): instead of a preselected passage, open-domain QA considers a large text corpus, such as Wikipedia pages, for answering a given question. In this context, the Natural Questions (NQ) dataset poses a more challenging problem: rather than providing one short passage per question, NQ gives an entire Wikipedia page, which is significantly longer than the passages provided in other datasets.

An effective open-domain QA system must, on the one hand, successfully retrieve the relevant document and passage and, on the other, comprehend the question context in order to answer it. Current state-of-the-art deep learning research on open-domain QA is often complicated and consists mainly of two components: (1) a passage retriever that selects a small subset of passages from documents (e.g., Wikipedia pages), and (2) a machine comprehension (reader) component that examines the retrieved passages to identify the final answer. Several studies have shown that improving passage retrieval can significantly improve the question-answering task.
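The two-stage pipeline above can be illustrated with a deliberately minimal sketch: a bag-of-words cosine-similarity retriever in place of a learned retriever, and a word-overlap heuristic standing in for a neural reader. All function names and the toy corpus here are illustrative, not part of any particular system:

```python
import math
import re
from collections import Counter

def tokenize(text):
    # Toy tokenizer: lowercase word characters only.
    return re.findall(r"\w+", text.lower())

def cosine(a, b):
    # Cosine similarity between two sparse term-frequency vectors.
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(question, passages, k=2):
    # Stage 1: rank passages by bag-of-words similarity to the question, keep top-k.
    q = Counter(tokenize(question))
    return sorted(passages, key=lambda p: cosine(q, Counter(tokenize(p))), reverse=True)[:k]

def read(question, passages):
    # Stage 2 (toy stand-in for a neural reader): pick the retrieved
    # sentence with the greatest word overlap with the question.
    q = set(tokenize(question))
    sentences = [s for p in passages for s in re.split(r"(?<=\.)\s+", p) if s]
    return max(sentences, key=lambda s: len(q & set(tokenize(s))))

passages = [
    "Paris is the capital of France. It lies on the Seine.",
    "Berlin is the capital of Germany.",
    "The Alps stretch across several European countries.",
]
question = "What is the capital of France?"
print(read(question, retrieve(question, passages)))
# prints "Paris is the capital of France."
```

In a real system each stage is replaced by a learned model (e.g., a dense retriever and a transformer reader), but the division of labour is the same: the retriever narrows the corpus, the reader extracts the answer.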

Several elements are important for the passage retriever, such as question and passage representations, similarity and attention mechanisms between the question and passages, and passage ranking techniques.
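As one concrete example of a classical passage ranking technique, the sketch below implements standard Okapi BM25 scoring (term-frequency saturation controlled by k1, passage-length normalisation by b). The function name and toy corpus are illustrative only:

```python
import math
import re
from collections import Counter

def tokenize(text):
    # Toy tokenizer: lowercase word characters only.
    return re.findall(r"\w+", text.lower())

def bm25_rank(question, corpus, k1=1.5, b=0.75):
    # Rank passages by Okapi BM25 score against the question terms.
    docs = [tokenize(p) for p in corpus]
    n = len(docs)
    avg_len = sum(len(d) for d in docs) / n
    df = Counter(t for d in docs for t in set(d))  # document frequency per term

    def score(q_terms, d):
        s = 0.0
        for t in q_terms:
            tf = d.count(t)
            if tf == 0:
                continue
            idf = math.log((n - df[t] + 0.5) / (df[t] + 0.5) + 1)
            s += idf * tf * (k1 + 1) / (tf + k1 * (1 - b + b * len(d) / avg_len))
        return s

    q = tokenize(question)
    return sorted(corpus, key=lambda p: score(q, tokenize(p)), reverse=True)

corpus = [
    "Paris is the capital of France. It lies on the Seine.",
    "Berlin is the capital of Germany.",
    "The Alps stretch across several European countries.",
]
print(bm25_rank("What is the capital of France?", corpus)[0])
# the Paris passage ranks first
```

Neural retrievers replace these sparse term statistics with learned question and passage representations, but the ranking role in the pipeline is identical.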

The REPARQA workshop is the first to tackle the issue of passage retrieval for open-domain QA. It aims to bring together experts from industry, science, and academia to exchange ideas and discuss ongoing research in open-domain QA and, more precisely, its passage retrieval component. We encourage novel problem definitions of passage retrieval for open-domain QA and new datasets in this context. We also encourage contributions developing new techniques for document retrieval in open-domain QA.