What is RL
Deep learning helps us handle unstructured environments.
Reinforcement learning provides a formalism for behavior.
Robotic control pipeline:
Observations -> State Estimation (e.g. vision) -> Modeling & Prediction -> Planning -> Low-level Control -> Controls
Deep models are what allow reinforcement learning algorithms to solve complex problems end to end!
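The "formalism for behavior" mentioned above is usually pictured as a loop in which an agent receives an observation, chooses an action, and gets a reward back. Below is a minimal sketch of that loop, not a definitive implementation. The `CorridorEnv` environment, the `random_policy` function, and the `run_episode` helper are all hypothetical names made up for illustration; no real library's API is used.

```python
import random


class CorridorEnv:
    """Hypothetical toy environment: reach the right end of a short corridor."""

    def __init__(self, length=5):
        self.length = length
        self.position = 0

    def reset(self):
        # Start every episode at the left end; the observation is the position.
        self.position = 0
        return self.position

    def step(self, action):
        # Assumed action encoding: 1 moves right, anything else moves left.
        move = 1 if action == 1 else -1
        self.position = max(0, min(self.length, self.position + move))
        done = self.position == self.length
        reward = 1.0 if done else 0.0  # reward only when the goal is reached
        return self.position, reward, done


def random_policy(observation, rng):
    # A policy maps an observation to an action; this one ignores the observation.
    return rng.choice([0, 1])


def run_episode(env, policy, rng, max_steps=100):
    # The core loop: observe, act, receive a reward, repeat.
    observation = env.reset()
    total_reward = 0.0
    for _ in range(max_steps):
        action = policy(observation, rng)
        observation, reward, done = env.step(action)
        total_reward += reward
        if done:
            break
    return total_reward


rng = random.Random(0)
episode_return = run_episode(CorridorEnv(), random_policy, rng)
```

In deep RL, the hand-written `random_policy` would be replaced by a learned deep model that maps raw observations directly to actions, which is what "end to end" refers to here.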
Other Forms of Supervision
- Learning from demonstrations
  - Directly copying observed behavior
  - Inferring rewards from observed behavior (inverse reinforcement learning)
- Learning from observing the world
  - Learning to predict
  - Unsupervised learning
- Learning from other tasks
  - Transfer learning
  - Meta-learning: learning to learn
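As a small illustration of "directly copying observed behavior" from the list above, the sketch below treats imitation as ordinary supervised regression from observations to expert actions. Everything in it is an assumption made for illustration: the `expert_policy` function, the one-dimensional observations, and the linear model, which stands in for the deep network a real system would use.

```python
def expert_policy(observation):
    # Hypothetical expert: the action is simply twice the observation.
    return 2.0 * observation


# Demonstrations: observations paired with the expert's actions.
observations = [0.0, 0.5, 1.0, 1.5, 2.0]
actions = [expert_policy(o) for o in observations]


def fit_linear_policy(observations, actions):
    # Least-squares fit of action = weight * observation (no bias term):
    # weight = sum(o * a) / sum(o * o)
    numerator = sum(o * a for o, a in zip(observations, actions))
    denominator = sum(o * o for o in observations)
    return numerator / denominator


weight = fit_linear_policy(observations, actions)


def cloned_policy(observation):
    # The learned policy imitates the expert using the fitted weight.
    return weight * observation
```

The cloned policy never sees a reward; it only learns to reproduce the actions in the demonstrations, which is the difference from inferring a reward function as in inverse reinforcement learning.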
Learning as the basis of intelligence
- Some things we can all do (e.g. walking)
- Some things we can only learn (e.g. driving a car)
- We can learn a huge variety of things, including very difficult things
- Therefore our learning mechanism(s) are likely powerful enough to do everything we associate with intelligence
- But it may still be very convenient to “hard-code” a few really important bits
Why Deep RL
Deep = can process complex sensory input, and also compute really complex functions.
Reinforcement learning = can choose complex actions.
What can DL & RL do Well Now
- Acquire a high degree of proficiency in domains governed by simple, known rules
- Learn simple skills from raw sensory inputs, given enough experience
- Learn by imitating enough human-provided expert behavior
What has Proven Challenging So Far
- Humans can learn incredibly quickly
  - Deep RL methods are usually slow
- Humans can reuse past knowledge
  - Transfer learning in deep RL is an open problem
- Not clear what the reward function should be
- Not clear what the role of prediction should be
"Instead of trying to produce a program to simulate the adult mind, why not rather try to produce one which simulates the child’s? If this were then subjected to an appropriate course of education one would obtain the adult brain." (Alan Turing)