Hybrid Reward Architecture (HRA)

Solving complex tasks with reinforcement learning


A major achievement in reinforcement learning research

Video games have proven a popular tool for testing reinforcement learning algorithms. Ms. Pac-Man is regarded as one of the hardest games in the Atari 2600 set for AI to learn, because of the large number of distinct situations that can be encountered and the limited number of lives. Maluuba's algorithm achieved the maximum possible score of 999,990 points.

 

 
 
 

Solving Ms. Pac-Man

Games are a popular test-bed for new machine learning techniques: they can be very challenging, yet they allow new techniques to be analyzed easily in a controlled environment. For reinforcement learning (RL), where the goal is to learn good behavior in a data-driven way, the Arcade Learning Environment (ALE), which provides access to a large number of Atari 2600 games, has been a common choice.

In 2015, Mnih et al. achieved a breakthrough in RL research: by combining standard RL techniques with deep neural networks, in a method known as the Deep Q-Network (DQN), they outperformed humans on a large number of ALE games. Since then, many new methods have been developed based on the same principles, improving performance even further. Nonetheless, on some ALE games, DQN and its successors are unsuccessful, achieving only a fraction of the score that a human gets. One of these hard games is the classic Ms. Pac-Man.

In our blog post we look deeper into why Ms. Pac-Man is hard and propose a new technique, called Hybrid Reward Architecture, to deal with the underlying challenge.

 
 
 
 

Related links