CptS 580, Spring 2017

CptS 580: Reinforcement Learning
Spring 2017

Matthew E. Taylor (Matt)
taylorm@eecs.wsu.edu
EME 137

Syllabus: Spring 2017

Textbook:
Reinforcement Learning: An Introduction (Sutton and Barto).

Resources:

Piazza for reading responses
Blackboard can be used for submissions
Udacity: Machine Learning 3 - Reinforcement Learning

Schedule

Date	Topic	Homework
1/10	First day of class: Introduction (with Audio)
1/12	Second day of class	Read: Chapter 1. Udacity: Sign up for Machine Learning 3 - Reinforcement Learning. Watch: "Introduction" through "Markov Decision Process Four" (8 videos total). Sign up for Piazza.
1/17	Bandits!	Read: Chapter 2. Udacity: More About Rewards (1 & 2) + Rewards Quiz. Please write a response to both on Piazza by 5pm on Monday.
1/19	MDPs	Read up through section 3.5 in the book. Please write a response on Piazza by 5pm on Wednesday.
1/24	Value functions	None
1/26	V and Q: What's it to you?	Watch Sequences of Rewards -- Finding Policies 4. Finish reading chapter 3. Please write a response on Piazza by 11:59pm on Wednesday.
1/31	Value & Policy Iteration	Watch up through "What have we learned", finishing Part 1 of the course. Read chapter 4 of the book. Please post to Piazza by 11:59pm on Monday.
2/2	MC Hammer? No! MC Methods	Read up through 5.3. Please write a response on Piazza by 11:59pm on Wednesday.
2/7	No Class	Read up through chapter 5. Watch "2. Reinforcement Learning Basics": 7 videos. Please write a response on Piazza by 5pm on Monday.
2/9	No Class
2/14	Revenge of the Monte Carlo Methods
2/16	Temporal Difference Methods	Watch "3. TD and Friends", Temporal Difference Learning -> Quiz: Selecting Learning Rates (6 videos in total). No response required. Due by 11:59pm, Saturday 2/18: Homework 1
2/21	On vs. Off policy: Q-Learning and Sarsa	Watch the remainder of "3. TD and Friends". Read all of chapter 6 in Sutton and Barto. Please write a response by 11:59pm, Monday 2/20.
2/23	TD(λ) and friends	Read sections 7.1 and 7.2 in the book. No response needed.
2/28	Eligibility traces	Due by 11:59pm, Monday 2/27: Homework 2
3/2	Ending Eligibility and Starting State Space Approximation	Please read chapter 7 in the book. By midnight Wednesday, please post to Piazza with questions/comments regarding this chapter and Tuesday's recorded class.
3/7	Function Approximation	Please read chapter 8 in the book and post a response on Piazza by midnight on Monday the 6th.
3/9	Function Approximation in Practice. We also looked at some code
3/14	No Class
3/16	No Class
3/21	Planning and Final Project Discussion	Due by 11:59pm, Monday 3/20: Homework 3
3/23		Read: Integrating Reinforcement Learning with Human Demonstrations of Varying Ability. Enter your preference on Doodle for when you'd like to present a paper to the class. Ideally it would be something related to your final project. You can present individually, or with one other person.
3/28	John Jenkins	By 11:59pm on 3/27, please submit a proposed final project via Blackboard. One submission per team, of at most 1 page. What will you be doing? How will you evaluate it? What will the success conditions be? Read: Dynamic Algorithm Selection Using Reinforcement Learning.
3/30	Yunshu Du, Jessie Waite/Lorin Vandegrift	Read: Policy Gradient Methods for Reinforcement Learning with Function Approximation and A3C. You may also want to look through this: background information on deepRL
4/4	No Class: Please work on your final projects
4/6	Coby Soss, Yang Zhang	Read: Model-Based Multi-Objective Reinforcement Learning
4/11	Alex Joens, Mark Keen	Please read Human Interaction for Effective Reinforcement Learning and Integrated Modeling and Control Based on Reinforcement Learning and Dynamic Programming
4/13	Lei Cai/Hongyang Gao, Zhengyang Wang/Hao Yuan	Read: Adversarial Learning for Neural Dialogue Generation and SeqGAN
4/18	Niloofar Hezarjaribi/Brandon Yang, Tao Zeng/Yongjun Chen	Please submit your rough draft of the final project to learn.wsu.edu by 11:59pm on 4/18. The more detail you can give me, the better advice I can give. You can find the rubric for the final project here, as discussed in class. Please read: Reinforcement Learning in Continuous State and Action Spaces (please focus on section 2 and section 3.1). Please read: Active Object Localization with Deep Reinforcement Learning.
4/20	Yao Zhang, Yan Zhang	Please read: Bridging the Gap Between Value and Policy Based Reinforcement Learning and Hierarchical Object Detection with Deep Reinforcement Learning
4/25	Kayl/Shivam; Advice for Final Project + Shaping, TAMER	Please read: Power to the People: The Role of Humans in Interactive Machine Learning
4/27	Intrinsic Motivation & Options
5/2	8am (Time of Exam): Final Presentations	Please submit your report for the final project to learn.wsu.edu by 11:59pm on 5/2.

Possible further topics

Current Function Approximation Choices
Efficient Model-Learning methods
Hierarchical Methods
Game Playing
Learning in Robotics
Transfer Learning
Shaping Rewards
Learning from Human Rewards
Learning from Demonstration
Multi-agent RL
Partially observable environments and/or POMDPs
MetaRL and empirical evaluation of algorithms
Least Squares methods (e.g., LSPI)
Adaptive Representations / Representation Learning
Case Studies: Robot soccer, Helicopter Control, etc.
Inverse Reinforcement Learning (IRL)
Intrinsically Motivated Reinforcement Learning
Actor-Critic Methods
Policy Gradient methods
Crowd Sourcing (?)
DeepRL
