Investigating the Properties of Neural Network Representations in Reinforcement Learning

Traditional representation learning methods typically hand-design a fixed basis-function architecture to obtain desired properties such as orthogonality and sparsity. Deep reinforcement learning takes the opposite stance: the designer should not encode the properties of the representation but should let the data and the training process determine them, so that good representations emerge on their own under an appropriate training regime.

(Figure: neural network architecture used in this study.)

This research explores the properties of representations learned by a deep reinforcement learning system. Combining the two perspectives above, the study empirically investigates which representation properties facilitate transfer in reinforcement learning. The authors propose and measure six representation properties, studying them across more than 25,000 agent-task settings.
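To make the idea of measuring representation properties concrete, the sketch below computes simple orthogonality and sparsity scores on a batch of representation vectors. The exact metric definitions used by the authors are not reproduced here; the formulas below (one minus mean absolute pairwise cosine similarity, and fraction of near-zero activations) are illustrative assumptions only.

```python
import numpy as np

def orthogonality(phi, eps=1e-8):
    """Assumed proxy: 1 minus the mean absolute pairwise cosine similarity
    between state representations (1.0 = perfectly orthogonal)."""
    # phi: (num_states, num_features) matrix of representations.
    unit = phi / (np.linalg.norm(phi, axis=1, keepdims=True) + eps)
    cos = unit @ unit.T                                   # pairwise cosines
    off_diag = cos[~np.eye(len(phi), dtype=bool)]         # drop self-similarity
    return 1.0 - np.abs(off_diag).mean()

def sparsity(phi, threshold=1e-3):
    """Assumed proxy: fraction of activations that are (near) zero."""
    return float((np.abs(phi) < threshold).mean())

# Example: representations of 100 states with 64 features each.
phi = np.maximum(np.random.randn(100, 64), 0.0)           # ReLU-like features
print(orthogonality(phi), sparsity(phi))
```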

They use deep Q-learning agents with different auxiliary losses, experimenting in pixel-based navigation environments where the source task and transfer task correspond to different goal locations.
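As a rough illustration of how an auxiliary loss is combined with the Q-learning objective, the sketch below uses a shared encoder whose representation feeds both a Q-head (trained with the TD error) and an auxiliary head. Next-observation prediction is used here purely for illustration; the study compares several auxiliary tasks. For readability this assumes flat observation vectors and small fully connected networks rather than the pixel-based convolutional setup, and all names, layer sizes, and the weighting coefficient are hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AuxDQN(nn.Module):
    """Shared encoder + Q-head + auxiliary head (sketch, not the paper's exact networks)."""
    def __init__(self, obs_dim, num_actions, feat_dim=64):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(obs_dim, 128), nn.ReLU(),
                                     nn.Linear(128, feat_dim), nn.ReLU())
        self.q_head = nn.Linear(feat_dim, num_actions)
        self.aux_head = nn.Linear(feat_dim, obs_dim)   # predicts the next observation

    def forward(self, obs):
        phi = self.encoder(obs)
        return self.q_head(phi), self.aux_head(phi)

def loss_fn(model, target_model, batch, gamma=0.99, aux_weight=1.0):
    obs, action, reward, next_obs, done = batch
    q, aux_pred = model(obs)
    q_sa = q.gather(1, action.unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        next_q, _ = target_model(next_obs)
        target = reward + gamma * (1 - done) * next_q.max(dim=1).values
    td_loss = F.mse_loss(q_sa, target)                 # standard Q-learning objective
    aux_loss = F.mse_loss(aux_pred, next_obs)          # auxiliary: next-observation prediction
    return td_loss + aux_weight * aux_loss
```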

The researchers developed a method for better understanding why some representations transfer better than others: they systematically varied task similarity and measured which representation properties correlate with transfer performance. They also demonstrated the generality of the method by studying the representations learned by Rainbow agents that successfully transfer between Atari 2600 game modes.
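The correlation step can be as simple as a rank correlation between each measured property and the downstream transfer performance across the many agent-task settings. The sketch below assumes exactly that (Spearman correlation over per-setting measurements with placeholder data); the authors' actual statistical treatment may be more involved.

```python
import numpy as np
from scipy.stats import spearmanr

# One entry per agent-task setting: measured property values and the
# transfer performance achieved when reusing that representation.
num_settings = 200
properties = {
    "complexity_reduction": np.random.rand(num_settings),   # placeholder data
    "dynamics_awareness":   np.random.rand(num_settings),
    "orthogonality":        np.random.rand(num_settings),
    "sparsity":             np.random.rand(num_settings),
}
transfer_perf = np.random.rand(num_settings)                # placeholder data

for name, values in properties.items():
    rho, p = spearmanr(values, transfer_perf)
    print(f"{name:>22s}: rho={rho:+.2f} (p={p:.3f})")
```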

Key findings include:

  1. Auxiliary tasks can promote representations beneficial for transfer, but many auxiliary tasks fail to outperform learning from scratch in ReLU networks.

  2. Using the sparse activation function FTA (Fuzzy Tiling Activation) is an important factor in improving transferability: FTA representations consistently transfer well with or without auxiliary tasks (see the FTA sketch after this list).

  3. ReLU representations transfer well to highly similar tasks but perform far worse than FTA representations when transferring to more dissimilar tasks.

  4. Transfer largely fails under linear function approximation; performance is significantly better when the representation is fed into a nonlinear value function (see the value-head sketch after this list).

  5. The best-transferring representations exhibit high complexity reduction, medium-to-high dynamics awareness and diversity, and medium orthogonality and sparsity.
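As a reference for point 2, FTA expands each scalar activation into a vector of soft bin memberships, which is what makes the resulting representation sparse. The sketch below follows the published definition of the activation; the tiling range, bin width delta, and fuzziness parameter eta are assumed defaults rather than the values used in this study.

```python
import torch

def fta(z, lower=-20.0, upper=20.0, delta=2.0, eta=2.0):
    """Fuzzy Tiling Activation (sketch): expand each scalar activation into a
    sparse vector of soft bin memberships over [lower, upper).

    z: tensor of shape (..., d). Returns shape (..., d * num_bins).
    """
    # Tiling vector: left edges of the bins (assumed default range and width).
    c = torch.arange(lower, upper, delta, device=z.device, dtype=z.dtype)
    z = z.unsqueeze(-1)                                   # (..., d, 1)
    # Distance of z from the bin [c, c + delta); zero when z falls inside the bin.
    dist = torch.clamp(c - z, min=0) + torch.clamp(z - delta - c, min=0)
    # Fuzzy indicator: 0 inside the bin, linear ramp of width eta near the
    # boundary, 1 far away -- so the activation is 1 inside and decays to 0 outside.
    fuzzy = torch.where(dist > eta, torch.ones_like(dist), dist / eta)
    out = 1.0 - fuzzy
    return out.flatten(start_dim=-2)                      # concatenate bins across units
```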
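And for point 4, what matters at transfer time is the head trained on top of the (frozen) representation: a single linear layer versus a small nonlinear network. A minimal sketch of the two variants, with hypothetical layer sizes:

```python
import torch.nn as nn

def make_value_head(feat_dim, num_actions, nonlinear=True, hidden=64):
    """Value function trained on top of a fixed representation during transfer.
    The hidden size is a hypothetical choice, not taken from the study."""
    if nonlinear:
        return nn.Sequential(nn.Linear(feat_dim, hidden), nn.ReLU(),
                             nn.Linear(hidden, num_actions))
    return nn.Linear(feat_dim, num_actions)   # linear function approximation
```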

The researchers proposed a systematic approach to investigating representations and their properties. Through iterative experiment design, a suite of property metrics, hyperparameter tuning, and visualization of large amounts of data, they arrived at the findings above.

They applied this method to understanding representation transfer between Atari 2600 game modes, finding that the representations learned by Rainbow agents have similar properties to the best-performing FTA representations in the maze environments: high complexity reduction, high orthogonality and sparsity, and medium diversity. This suggests the proposed properties and method are meaningful.

This work complements our understanding of learned representations in reinforcement learning by providing quantitative analysis methods. The findings offer guidance for designing better representation learning algorithms.