Recurrent Orthogonal Networks and Long-Memory Tasks

International Conference on Machine Learning


Although RNNs have been shown to be powerful tools for processing sequential data, finding architectures or optimization strategies that allow them to model very long term dependencies is still an active area of research. In this work, we carefully analyze two synthetic datasets originally outlined in (Hochreiter and Schmidhuber, 1997) which are used to evaluate the ability of RNNs to store information over many time steps. We explicitly construct RNN solutions to these problems, and using these constructions, illuminate both the problems themselves and the way in which RNNs store different types of information in their hidden states. These constructions furthermore explain the success of recent methods that specify unitary initializations or constraints on the transition matrices.

Related Publications

All Publications

Emerging Cross-lingual Structure in Pretrained Language Models

Shijie Wu, Alexis Conneau, Haoran Li, Luke Zettlemoyer, Veselin Stoyanov

ACL - July 9, 2020

Open Source Evolutionary Structured Optimization

Jeremy Rapin, Pauline Bennet, Emmanuel Centeno, Daniel Haziza, Antoine Moreau, Olivier Teytaud

Evolutionary Computation Software Systems Workshop at ​GECCO - July 9, 2020

Learning Generalizable Locomotion Skills with Hierarchical Reinforcement Learning

Tianyu Li, Nathan Lambert, Roberto Calandra, Franziska Meier, Akshara Rai

ICRA - June 1, 2020

Large Scale Audiovisual Learning of Sounds with Weakly Labeled Data

Haytham M. Fayek, Anurag Kumar

IJCAI - July 11, 2020

To help personalize content, tailor and measure ads, and provide a safer experience, we use cookies. By clicking or navigating the site, you agree to allow our collection of information on and off Facebook through cookies. Learn more, including about available controls: Cookies Policy