Summer 2016 Project 5

From Quantitative Analysis Software Courses
Revision as of 19:33, 19 July 2016 by Dave

Overview

In this project you will implement the Q-Learning and Dyna-Q solutions to the reinforcement learning problem and apply them to two problems: 1) Navigation, and 2) Trading. You will work with the navigation problem first because it is easy to work with and understand. In the last part of the assignment you will apply Q-Learning to stock trading.

Note that your Q-Learning code should not care which problem it is solving. What differs between problems is the code you wrap around the learner, which frames each problem in terms the learner can work with.
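To illustrate that separation, here is a minimal sketch of a tabular Q-learner that sees only integer states, integer actions, and rewards, together with a tiny wrapper that frames a toy problem for it. Everything in it is an assumption made for illustration: the class name `TabularQLearner`, its method names, the parameter values, and the one-dimensional corridor are hypothetical, and none of them come from the project template.

```python
import random


class TabularQLearner:
    """Hypothetical minimal Q-learner: it knows nothing about the problem being solved."""

    def __init__(self, num_states, num_actions, alpha=0.2, gamma=0.9, epsilon=0.1, seed=0):
        self.num_actions = num_actions
        self.alpha = alpha      # learning rate
        self.gamma = gamma      # discount factor
        self.epsilon = epsilon  # probability of taking a random action
        self.rng = random.Random(seed)
        self.q = [[0.0] * num_actions for _ in range(num_states)]

    def best_action(self, state):
        row = self.q[state]
        return row.index(max(row))

    def choose_action(self, state):
        # Epsilon-greedy: explore occasionally, otherwise act greedily.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(self.num_actions)
        return self.best_action(state)

    def update(self, state, action, reward, next_state):
        # Standard Q-learning backup toward reward + discounted best next value.
        target = reward + self.gamma * max(self.q[next_state])
        self.q[state][action] += self.alpha * (target - self.q[state][action])


def corridor_step(state, action, goal):
    """Hypothetical wrapper for a 1-D corridor: action 0 moves left, action 1 moves right."""
    next_state = max(0, min(goal, state + (1 if action == 1 else -1)))
    reward = 1.0 if next_state == goal else -1.0
    return next_state, reward


def train_corridor(length=6, episodes=200):
    """Train the learner on the corridor; only this function knows about the problem."""
    goal = length - 1
    learner = TabularQLearner(num_states=length, num_actions=2)
    for _ in range(episodes):
        state = 0
        for _ in range(200):  # cap the episode length
            action = learner.choose_action(state)
            next_state, reward = corridor_step(state, action, goal)
            learner.update(state, action, reward, next_state)
            state = next_state
            if state == goal:
                break
    return learner
```

Swapping `corridor_step` for a function that turns market data into states and rewards would reuse the same learner unchanged, which is the point of keeping the two pieces apart.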

For the navigation problem we provide testqlearner.py, which automates testing of your Q-Learner. We also provide teststrategylearner.py to test your strategy learner. To apply Q-Learning to trading you will have to implement an API that calls Q-Learning internally.

Overall, your tasks for this project include:

  • Code a Q-Learner
  • Code the Dyna-Q feature of Q-Learning
  • Test/debug the Q-Learner in navigation problems
  • Build a strategy learner based on your Q-Learner
  • Test/debug the strategy learner on specific symbol/time period problems
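As a rough picture of what the Dyna-Q task adds on top of plain Q-Learning, the sketch below follows the textbook idea: after each real experience, remember what happened and replay a few remembered experiences as extra updates. It assumes a deterministic model that stores only the last outcome seen for each state/action pair. The function names and the `planning_steps` parameter are hypothetical and are not taken from the project template.

```python
import random


def q_update(q, state, action, reward, next_state, alpha=0.2, gamma=0.9):
    """One Q-learning backup on a table stored as a list of lists."""
    target = reward + gamma * max(q[next_state])
    q[state][action] += alpha * (target - q[state][action])


def dyna_q_step(q, model, rng, state, action, reward, next_state, planning_steps):
    """Learn from one real transition, then replay remembered transitions."""
    # 1. Direct update from the real experience.
    q_update(q, state, action, reward, next_state)

    # 2. Remember the outcome (assumption: the world is deterministic,
    #    so the last outcome seen is a good enough model).
    model[(state, action)] = (reward, next_state)

    # 3. Planning: replay randomly chosen remembered transitions.
    seen = list(model)
    for _ in range(planning_steps):
        s, a = rng.choice(seen)
        r, s_next = model[(s, a)]
        q_update(q, s, a, r, s_next)


# Usage: with planning_steps=0 this reduces to plain Q-learning.
q_table = [[0.0, 0.0] for _ in range(3)]
memory = {}
dyna_q_step(q_table, memory, random.Random(0), 1, 1, 1.0, 2, planning_steps=5)
```

With planning turned on, each real step is followed by several simulated updates, so the table typically improves with fewer real interactions at the cost of more computation per step.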

Scoring for the project will be allocated as follows:

  • Navigation test cases: 80% (note that we will check those with dyna = 0)
  • Dyna implemented: 5% (we will check this with one navigation test case by comparing performance with and without dyna turned on)
  • Trading strategy test cases: 20%

For this assignment we will test only your code (there is no report component). Note that the scoring is structured so that you can earn a B (80%) if you implement only Q-Learning, but if you implement everything, the total possible score is 105%. That means you can earn up to 5% extra credit on this project (equivalent to 1% extra credit on the final course grade).
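The arithmetic behind those figures can be checked with a few lines. The component weights are the ones listed above; the project's share of the course grade is not stated on this page and is only inferred here from the "5% on the project equals 1% on the course" remark, so treat it as an assumption.

```python
# Component weights as listed in the scoring breakdown (percent of project score).
weights = {"navigation": 80, "dyna": 5, "trading": 20}

total_possible = sum(weights.values())  # all three components
extra_credit = total_possible - 100     # points above a full score
q_learning_only = weights["navigation"] # navigation cases are checked with dyna = 0

# Inferred, not stated: if 5% on the project is worth 1% of the course grade,
# the project would count for 1/5 of the course grade.
implied_project_share = 1 / extra_credit

print(total_possible, extra_credit, q_learning_only, implied_project_share)
```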

Template and Data

Part 1: Implement QLearner

Part 2: Navigation Problem Test Cases

Part 3: Implement Dyna

Part 4: Implement Strategy Learner

Contents of Report

Hints & Resources

What to turn in

Rubric

Required, Allowed, & Prohibited