Agent Corrections to Pac-Man from the Crowd

We developed a user study in which participants watch a video of a Pac-Man agent playing the game, mark the moments at which the agent makes a mistake, and suggest a better action. We will publish the task on Mechanical Turk to collect crowdsourced data and test whether humans are good at giving correct suggestions. If they are, crowdsourced corrections could be used to guide robots in the future.

Training an Agent to Ground Commands with Reward and Punishment

As the need grows for humans without technical expertise to convey complex tasks to robots, natural language offers an intuitive interface. Using it, however, requires the agent to learn a grounding of natural language commands. In this work, we developed a simple simulated home environment in which the robot learns to complete tasks from positive or negative human feedback.

Agent Learning Behaviors from Discrete Human Feedback

In this project, we consider the problem of a human trainer teaching an agent by providing positive or negative feedback. Most existing work has treated human feedback as a numerical value that the agent seeks to maximize, and has assumed that all trainers give feedback in the same way when teaching the same behavior. In contrast, we treat feedback as a discrete communication from trainer to learner, and we allow different trainers to choose different training strategies. We propose a probabilistic model to classify these training strategies. We also present the SABL and I-SABL algorithms, which consider multiple interpretations of trainer feedback in order to learn behaviors more efficiently. Our online user studies show that human trainers follow various training strategies when teaching virtual agents, and that explicitly considering trainer strategy allows a learner to make inferences from cases where no feedback is given.
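The idea that silence can be informative can be illustrated with a small Bayesian update. The sketch below is not the published SABL or I-SABL implementation. It assumes a simple, hypothetical trainer model with three illustrative parameters (eps, mu_plus, mu_minus) and a single state with a handful of candidate target actions. All function and parameter names are made up for this example.

```python
from typing import Dict, Optional

# Feedback symbols: +1 = reward, -1 = punishment, None = no feedback given.
Feedback = Optional[int]


def feedback_likelihood(feedback: Feedback, correct: bool, eps: float = 0.1,
                        mu_plus: float = 0.1, mu_minus: float = 0.6) -> float:
    """P(feedback | whether the action matched the target behavior).

    Assumed (hypothetical) trainer model:
      eps      - probability the trainer misjudges the action
      mu_plus  - probability of withholding a reward the trainer would give
      mu_minus - probability of withholding a punishment the trainer would give
    """
    # Probability the trainer judges the action as right or wrong.
    p_judged_right = 1 - eps if correct else eps
    p_judged_wrong = 1 - p_judged_right
    if feedback == 1:
        return p_judged_right * (1 - mu_plus)
    if feedback == -1:
        return p_judged_wrong * (1 - mu_minus)
    # No feedback: either a withheld reward or a withheld punishment.
    return p_judged_right * mu_plus + p_judged_wrong * mu_minus


def update_posterior(posterior: Dict[str, float], action: str,
                     feedback: Feedback, **params: float) -> Dict[str, float]:
    """One Bayes update over candidate target actions for a single state."""
    unnorm = {
        target: prior * feedback_likelihood(feedback, action == target, **params)
        for target, prior in posterior.items()
    }
    total = sum(unnorm.values())
    return {target: p / total for target, p in unnorm.items()}


# The agent tries "left" and the trainer stays silent.
prior = {"left": 0.5, "right": 0.5}
posterior = update_posterior(prior, action="left", feedback=None)
print(posterior)
```

With the default parameters, the assumed trainer rarely withholds rewards but often withholds punishments, so silence after "left" shifts belief toward "right". Swapping the two mu values models a trainer with the opposite strategy, and the same silence then counts as evidence for "left".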

Publications:


  • Robert Loftin, Bei Peng, James MacGlashan, Michael Littman, Matthew E. Taylor, David Roberts, and Jeff Huang. Learning Something from Nothing: Leveraging Implicit Human Feedback Strategies. In Proceedings of the 23rd IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN), August 2014.

  • Robert Loftin, Bei Peng, James MacGlashan, Michael L. Littman, Matthew E. Taylor, Jeff Huang, and David L. Roberts. A Strategy-Aware Technique for Learning Behaviors from Discrete Human Feedback. In Proceedings of the 28th AAAI Conference on Artificial Intelligence (AAAI), July 2014. (28% acceptance rate)

    NSF IIS-1319412. RI: Small: Collaborative Research: Speeding Up Learning through Modeling the Pragmatics of Training. PI: Roberts, Co-PIs: Littman and Taylor