Publications

We are making strong progress on challenging problems, driving new techniques in reinforcement learning, machine comprehension, and conversational interfaces. Our research team publishes peer-reviewed papers that provide insight into the work we’re doing to advance knowledge in this field.

 

May 2017

Machine Comprehension by Text-to-Text Neural Question Generation

We propose a recurrent neural model that generates natural-language questions from documents, conditioned on answers. We show how to train the model using a combination of supervised and reinforcement learning. After teacher forcing for standard maximum likelihood training, we fine-tune the model using policy gradient techniques to...

View publication
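
The two-stage training recipe in this abstract (maximum-likelihood pretraining with teacher forcing, then policy-gradient fine-tuning against a sequence-level reward) can be illustrated with a toy sketch. The bigram "policy" and the reward function below are hypothetical stand-ins for the paper's sequence-to-sequence generator and its question-quality rewards; this sketches the training schedule only, not the paper's model.

```python
# Minimal sketch (not the paper's model): pretrain a toy token-level policy with
# maximum likelihood (teacher forcing), then fine-tune it with REINFORCE using a
# sequence-level reward. The bigram policy and the reward are hypothetical stand-ins.
import numpy as np

rng = np.random.default_rng(0)
VOCAB, SEQ_LEN = 10, 5
W = rng.normal(scale=0.1, size=(VOCAB, VOCAB))  # logits for p(next token | current token)

def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

def mle_step(reference, lr=0.1):
    """Teacher forcing: increase log p(reference[t+1] | reference[t])."""
    for prev, nxt in zip(reference[:-1], reference[1:]):
        p = softmax(W[prev])
        grad = -p
        grad[nxt] += 1.0                 # d log p(nxt) / d logits = one_hot - p
        W[prev] += lr * grad

def sample_sequence(start=0):
    seq, steps = [start], []
    for _ in range(SEQ_LEN):
        p = softmax(W[seq[-1]])
        tok = rng.choice(VOCAB, p=p)
        steps.append((seq[-1], tok, p))
        seq.append(int(tok))
    return seq, steps

def pg_step(reward_fn, baseline=0.0, lr=0.05):
    """REINFORCE: weight each log-prob gradient by the sequence-level advantage."""
    seq, steps = sample_sequence()
    advantage = reward_fn(seq) - baseline
    for prev, tok, p in steps:
        grad = -p
        grad[tok] += 1.0
        W[prev] += lr * advantage * grad

# Hypothetical reward: fraction of distinct tokens (a stand-in for question quality).
reward = lambda seq: len(set(seq)) / len(seq)
reference = [0, 1, 2, 3, 4, 5]

for _ in range(200):   # stage 1: supervised pretraining with teacher forcing
    mle_step(reference)
for _ in range(200):   # stage 2: policy-gradient fine-tuning
    pg_step(reward, baseline=0.5)
```

The value of the second stage is that the update is weighted by a reward computed on whole sampled sequences, so non-differentiable measures of question quality can be optimised directly.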


April 2017

Multi-Advisor Reinforcement Learning

This article deals with a novel branch of Separation of Concerns, called Multi-Advisor Reinforcement Learning (MAd-RL), where a single-agent RL problem is distributed to n learners, called advisors. Each advisor tries to solve the problem with a different focus. Their advice is then communicated to an aggregator, which is in control of...

View publication
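
To make the advisor/aggregator structure concrete, here is a minimal sketch under simple assumptions: each advisor exposes a local value estimate over a shared action set, and a sum-and-argmax aggregator (one of several possible aggregation rules) picks the action. The advisors below are hypothetical; the paper analyses aggregation and its planning semantics in far more depth.

```python
# Minimal sketch of the advisor/aggregator control flow. Each advisor is a hypothetical
# local value estimate over a shared action set; the aggregator sums their scores and
# acts greedily.
from typing import Callable, Sequence

Advisor = Callable[[object, int], float]   # (state, action) -> local value estimate

def aggregate_and_act(state, actions: Sequence[int], advisors: Sequence[Advisor]) -> int:
    """Aggregator: sum every advisor's value for each action, pick the argmax."""
    scores = {a: sum(advisor(state, a) for advisor in advisors) for a in actions}
    return max(scores, key=scores.get)

# Hypothetical advisors, each focused on a different concern of the same task.
reach_goal   = lambda s, a: 1.0 if a == 2 else 0.0    # cares about task progress
avoid_hazard = lambda s, a: -5.0 if a == 0 else 0.0   # cares about safety
save_energy  = lambda s, a: -0.1 * a                  # cares about action cost

print(aggregate_and_act(state=None, actions=[0, 1, 2],
                        advisors=[reach_goal, avoid_hazard, save_energy]))  # -> 2
```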


March 2017

Learning Algorithms for Active Learning

ICLR Workshop

We present a model that learns active learning algorithms via metalearning. For each metatask, our model jointly learns: a data representation, an item selection heuristic, and a one-shot classifier. Our model uses the item selection heuristic to construct a labeled support set for the one-shot classifier. Using metatasks...

View publication
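
The loop sketched in the abstract (a selection heuristic builds a labelled support set, which a one-shot classifier then uses) can be illustrated with fixed, hand-coded components. In the paper the representation, the selection heuristic, and the classifier are all learned jointly by the meta-learner; everything below is a hypothetical stand-in.

```python
# Minimal sketch of the active-learning loop with hand-coded stand-ins: a farthest-point
# selection heuristic builds a labelled support set, and a nearest-centroid classifier
# plays the role of the one-shot classifier.
import numpy as np

rng = np.random.default_rng(0)

def nearest_centroid_predict(x, support_x, support_y):
    """One-shot classifier stand-in: assign x to the closest class centroid."""
    labels = sorted(set(support_y))
    centroids = np.stack([support_x[np.array(support_y) == c].mean(axis=0) for c in labels])
    return labels[int(np.argmin(np.linalg.norm(centroids - x, axis=1)))]

def select_item(pool_x, support_x):
    """Selection heuristic stand-in: pick the pool item farthest from anything labelled."""
    if len(support_x) == 0:
        return 0
    d = np.linalg.norm(pool_x[:, None, :] - np.stack(support_x)[None, :, :], axis=-1)
    return int(np.argmax(d.min(axis=1)))

# Hypothetical pool: two Gaussian blobs, one per class.
pool_x = np.concatenate([rng.normal(0, 0.3, (20, 2)), rng.normal(3, 0.3, (20, 2))])
pool_y = [0] * 20 + [1] * 20

support_x, support_y = [], []
for _ in range(4):                     # label budget of 4 queries
    i = select_item(pool_x, support_x)
    support_x.append(pool_x[i])
    support_y.append(pool_y[i])

query = rng.normal(3, 0.3, 2)
print(nearest_centroid_predict(query, np.stack(support_x), support_y))   # -> 1
```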


February 2017

Transfer Reinforcement Learning with Shared Dynamics

AAAI

This article addresses a particular Transfer Reinforcement Learning (RL) problem: when dynamics do not change from one task to another, and only the reward function does. Our method relies on two ideas: the first is that transition samples obtained from one task can be reused to learn on…

View publication
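
The first of those ideas, reusing transition samples across tasks that share dynamics, can be pictured as follows in a toy setting: stored (s, a, s') samples from a source task are relabelled with the target task's reward and fed to tabular batch Q-learning. This is an illustration of the idea only, not the paper's algorithm.

```python
# Minimal sketch: when dynamics are shared and only the reward changes, source-task
# transitions can be relabelled with the target reward and reused for learning.
import random
from collections import defaultdict

def relabel(transitions, reward_fn):
    """Attach the target task's reward to source-task (s, a, s') samples."""
    return [(s, a, reward_fn(s, a, s2), s2) for (s, a, s2) in transitions]

def batch_q_learning(samples, actions, gamma=0.95, alpha=0.1, sweeps=50):
    Q = defaultdict(float)
    for _ in range(sweeps):
        random.shuffle(samples)
        for s, a, r, s2 in samples:
            target = r + gamma * max(Q[(s2, b)] for b in actions)
            Q[(s, a)] += alpha * (target - Q[(s, a)])
    return Q

# Hypothetical 1-D chain: transitions collected while solving a source task...
transitions = [(s, a, min(max(s + a, 0), 4))
               for s in range(5) for a in (-1, 1) for _ in range(3)]
# ...reused for a target task whose reward differs (the goal is now state 4).
target_reward = lambda s, a, s2: 1.0 if s2 == 4 else 0.0

Q = batch_q_learning(relabel(transitions, target_reward), actions=(-1, 1))
print(max((-1, 1), key=lambda a: Q[(0, a)]))   # greedy action at state 0 -> 1
```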


January 2017

Algorithm selection of off-policy reinforcement learning algorithm

Dialogue systems rely on careful reinforcement learning design: the learning algorithm and its state-space representation. In the absence of more rigorous knowledge, the designer resorts to their practical experience to choose the best option. In order to automate and improve the performance of the aforementioned...

View publication
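
One simple way to picture algorithm selection is as a meta-level bandit over a portfolio of learners, as in the sketch below: a UCB-style selector picks which configuration controls the next episode, based on the returns observed so far. The portfolio and the selection rule here are hypothetical stand-ins for the paper's setup.

```python
# Minimal sketch of meta-level algorithm selection: a UCB-style bandit chooses which
# learner from a portfolio runs each episode, using observed episode returns.
import math
import random

class UCBSelector:
    def __init__(self, n_algos, c=2.0):
        self.counts = [0] * n_algos
        self.means = [0.0] * n_algos
        self.c = c

    def select(self):
        for i, n in enumerate(self.counts):
            if n == 0:
                return i                      # try every algorithm once first
        total = sum(self.counts)
        ucb = [m + self.c * math.sqrt(math.log(total) / n)
               for m, n in zip(self.means, self.counts)]
        return max(range(len(ucb)), key=ucb.__getitem__)

    def update(self, i, episode_return):
        self.counts[i] += 1
        self.means[i] += (episode_return - self.means[i]) / self.counts[i]

# Hypothetical portfolio: each "algorithm" is just a function returning an episode return.
portfolio = [lambda: random.gauss(0.2, 0.1),   # e.g. a learner with a coarse state space
             lambda: random.gauss(0.5, 0.1)]   # e.g. a learner with a richer state space

selector = UCBSelector(len(portfolio))
for _ in range(200):
    i = selector.select()
    selector.update(i, portfolio[i]())
print(selector.counts)   # the better configuration should dominate
```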


December 2016

Towards Information-Seeking Agents

We develop a general problem setting for training and testing the ability of agents to gather information efficiently. Specifically, we present a collection of tasks in which success requires searching through a partially observed environment for fragments of information that can be pieced together to accomplish…

View publication


December 2016

Frames: A Corpus For Adding Memory To Goal-Oriented Dialogue Systems

This paper presents the Frames dataset, a corpus of 1369 human-human dialogues with an average of 15 turns per dialogue. We developed this dataset to study the role of memory in goal-oriented dialogue systems. Based on Frames, we introduce a task called frame tracking, which generalizes…

View publication


December 2016

Separation of Concerns in Reinforcement Learning

In this paper, we propose a framework for solving a single-agent task by using multiple agents, each focusing on different aspects of the task. This approach has two main advantages: 1) it allows for specialized agents for different parts of the task, and 2) it provides a new way to transfer…

View publication


November 2016

Calibrating Energy-based Generative Adversarial Networks

ICLR

In this paper, we propose to equip Generative Adversarial Networks with the ability to produce direct energy estimates for samples. Specifically, we propose a flexible adversarial training framework, and prove this framework not only ensures the generator converges to...

View publication


November 2016

NewsQA: A Machine Comprehension Dataset

We present NewsQA, a challenging machine comprehension dataset of over 100,000 question-answer pairs. Crowd-workers supply questions and answers based on a set of over 10,000 news articles from CNN, with answers consisting of spans of text from the corresponding articles. We collect this dataset through a four-stage process designed to solicit exploratory questions that require reasoning. A thorough analysis confirms...

View publication

Access dataset


September 2016

An Architecture for Deep, Hierarchical Generative Models

NIPS 2016

We present an architecture which makes it easy to train deep, directed generative models with many layers of latent variables…

View publication


August 2016

Effective Multi-step Temporal-Difference Learning for Non-Linear Function Approximation

Multi-step temporal-difference (TD) learning, where the update targets contain information from multiple time steps ahead, is one of the most popular forms of TD learning for linear function approximation. The reason is that multi-step methods often yield substantially better performance than their single-step counterparts, due to a lower bias of...

View publication
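
For context, the n-step return that multi-step TD methods bootstrap from, and the corresponding value update, are the standard ones:

```latex
% Standard n-step return and the corresponding TD update (for context).
G_t^{(n)} = R_{t+1} + \gamma R_{t+2} + \dots + \gamma^{n-1} R_{t+n} + \gamma^{n} V(S_{t+n}),
\qquad
V(S_t) \leftarrow V(S_t) + \alpha \bigl[ G_t^{(n)} - V(S_t) \bigr].
```

Larger n moves the target toward the Monte Carlo return, trading bias for variance; the paper studies how to make such multi-step targets effective when the value function is a non-linear approximator.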


June 2016

Natural Language Comprehension with the EpiReader

EMNLP 2016

We present the EpiReader, a novel model for machine comprehension of text. Machine comprehension of unstructured, real-world text is a major research goal for natural language processing. Current tests of machine comprehension pose questions whose answers can be inferred from some supporting text, and evaluate a model's response to the questions. The EpiReader is an end-to-end neural model…

View publication


June 2016

Iterative Alternating Neural Attention for Machine Reading

We propose a novel neural attention architecture to tackle machine comprehension tasks, such as answering Cloze-style queries with respect to a document. Unlike previous models, we do not collapse the query into a single vector; instead, we deploy an iterative alternating attention mechanism that allows a fine-grained exploration of both the query and…

View publication


June 2016

Policy Networks with Two-Stage Training for Dialogue Systems

SIGDIAL 2016

In this paper, we propose to use deep policy networks which are trained with an advantage actor-critic method for statistically optimised dialogue systems. First, we show that, on summary state and action spaces, deep Reinforcement Learning (RL) outperforms Gaussian Process methods. Summary state and action spaces lead to good performance but…

View publication
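
The advantage actor-critic update mentioned here can be illustrated on a toy tabular problem, assuming the TD error is used as the advantage estimate. The two-state environment below is hypothetical and nothing of the paper's summary-space dialogue setup is reproduced; this only shows the shape of the actor and critic updates.

```python
# Minimal sketch of an advantage actor-critic update on a toy two-state problem, with a
# tabular softmax actor and a tabular critic. The TD error serves as the advantage.
import numpy as np

rng = np.random.default_rng(1)
N_STATES, N_ACTIONS, GAMMA = 2, 2, 0.9
theta = np.zeros((N_STATES, N_ACTIONS))   # actor: softmax policy parameters
V = np.zeros(N_STATES)                    # critic: state-value estimates

def policy(s):
    z = np.exp(theta[s] - theta[s].max())
    return z / z.sum()

def step(s, a):
    # Hypothetical dynamics: action 1 is rewarding in state 0, action 0 in state 1.
    r = 1.0 if a == (1 - s) else 0.0
    return r, rng.integers(N_STATES)

s = 0
for _ in range(5000):
    p = policy(s)
    a = rng.choice(N_ACTIONS, p=p)
    r, s2 = step(s, a)
    advantage = r + GAMMA * V[s2] - V[s]        # TD error as the advantage estimate
    V[s] += 0.05 * advantage                    # critic update
    grad_logp = -p
    grad_logp[a] += 1.0                         # d log pi(a|s) / d theta[s]
    theta[s] += 0.05 * advantage * grad_logp    # actor update
    s = s2

print(policy(0).round(2), policy(1).round(2))   # should favour a=1 in s=0 and a=0 in s=1
```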


June 2016

A Sequence-to-Sequence Model for User Simulation in Spoken Dialogue Systems

InterSpeech 2016

User simulation is essential for generating enough data to train a statistical spoken dialogue system. Previous models for user simulation suffer from several drawbacks, such as the inability to take dialogue history into account, the need for a rigid structure to ensure coherent user behaviour, heavy dependence on a specific…

View publication


June 2016

Natural Language Generation in Dialogue using Lexicalized and Delexicalized Data

ICLR

Natural language generation plays a critical role in any spoken dialogue system. We present a new approach to natural language generation using recurrent neural networks in an encoder-decoder framework. In contrast with previous work, our model uses both lexicalized and delexicalized versions of slot-value pairs for each dialogue…

View publication
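
Delexicalization itself is easy to picture: slot values in an utterance are replaced with placeholder tokens so the generator works on templates, and real values are substituted back afterwards. The slot names and utterance below are hypothetical examples, not taken from the paper.

```python
# Minimal sketch of delexicalization and relexicalization around a slot-value NLG model.
def delexicalize(utterance: str, slots: dict) -> str:
    """Replace each slot value with a placeholder token such as <food>."""
    for name, value in slots.items():
        utterance = utterance.replace(value, f"<{name}>")
    return utterance

def relexicalize(template: str, slots: dict) -> str:
    """Substitute real values back into the generated template."""
    for name, value in slots.items():
        template = template.replace(f"<{name}>", value)
    return template

slots = {"food": "italian", "area": "downtown"}
utt = "there is a nice italian restaurant in downtown"
template = delexicalize(utt, slots)    # "there is a nice <food> restaurant in <area>"
print(relexicalize(template, {"food": "thai", "area": "midtown"}))
```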


May 2016

Score-based Inverse Reinforcement Learning

AAMAS

This paper reports theoretical and empirical results obtained for the score-based Inverse Reinforcement Learning (IRL) algorithm. It relies on a non-standard setting for IRL consisting of learning a reward from a set of globally...

View publication


March 2016

A Parallel-Hierarchical Model for Machine Comprehension on Sparse Data

ACL 2016

Understanding unstructured text is a major goal within natural language processing. Comprehension tests pose questions based on short text passages to evaluate such understanding. In this work, we investigate machine comprehension on the challenging MCTest benchmark. Partly because of its limited size, prior work on MCTest has focused mainly on engineering…

View publication