Difference between revisions of "MC3-Project-4"

From Quantitative Analysis Software Courses

Revision as of 20:47, 13 July 2017

DRAFT

This assignment is under revision. This notice will be removed once it is final.

Updates / FAQs

  • 2017-7-11
    • Project revision in progress.

Overview

In this project you will design a learning trading agent. You can draw on the learners you have created so far in the course. Your choices are:

  • Create a regression or classification-based strategy using your Random Forest learner.
  • Create a Q-learning-based strategy using your Q-Learner.
  • Create a scan-based strategy using an optimizer.

Your learner should work in the following way:

  • In the training phase your learner will be provided with a stock symbol and a time period. It should use this data to learn its strategy. For instance, for a regression-based learner, it will use this data to make predictions about future price changes.

  • In the testing phase your learner will be provided with a symbol and a date range. All learning should be turned OFF during this phase. If the date range is the same as the one used for training, it is an in-sample test. Your learner should return a set of dated trades.

Overall, your tasks for this project include:

  • Devise some numerical/technical indicators to evaluate the state of a stock on each day.
  • Build a strategy learner based on one of the learners described above that uses the indicators.
  • Test/debug the strategy learner on specific symbol/time period problems.

Scoring for the project will be based on trading strategy test cases. For this assignment we will test only your code (there is no report component).

Template and Data

  • Update your local mc3_p4 directory using GitHub.
  • Place your existing Q-Learner in the file mc3_p4/QLearner.py.
  • Implement the StrategyLearner class in mc3_p4/StrategyLearner.py.
  • To test your strategy learner, run python teststrategylearner.py from the mc3_p4/ directory.

Use the following parameters for trading and evaluation:

  • Use only the data provided for this course. You are not allowed to import external data.
  • Allowable positions are: 200 shares long, 200 shares short, 0 shares.
  • Benchmark:
    • The performance of a portfolio starting with $100,000 cash, then investing in 200 shares of the relevant symbol and holding that position.
  • There is no limit on leverage.
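As a rough illustration of the benchmark above, the sketch below tracks the value of a portfolio that starts with $100,000, buys 200 shares on the first day, and holds. The function names and the price list are hypothetical, and zero transaction costs are an assumption that the assignment text does not state.

```python
def benchmark_values(prices, start_value=100000.0, shares=200):
    """Daily portfolio value for buy-and-hold of `shares` from day one."""
    # Cash left over after buying the shares at the first day's price.
    cash = start_value - shares * prices[0]
    return [cash + shares * p for p in prices]


def cumulative_return(values):
    """Total return over the period, e.g. 0.01 means +1%."""
    return values[-1] / float(values[0]) - 1.0


# Made-up prices, for illustration only.
values = benchmark_values([100.0, 110.0, 105.0])
print(values)                     # [100000.0, 102000.0, 101000.0]
print(cumulative_return(values))
```

The same cumulative_return helper can be applied to the learned strategy's daily values so the two can be compared on equal terms.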

Implement Strategy Learner

For this part of the project you should develop a learner that can learn a trading policy using your Q-Learner. You should be able to use your Q-Learner from the earlier project directly, with no changes. You will need to write code in StrategyLearner.py to "wrap" your Q-Learner appropriately to frame the trading problem for it. Use the template provided in StrategyLearner.py. Overall, the structure of your strategy learner should be arranged as below. Note that this is a suggestion, not a requirement:

For the policy learning part:

  • Select several technical features, and compute their values for the training data
  • Discretize the values of the features
  • Instantiate a Q-learner
  • For each day in the training data:
    • Compute the current state (including holding)
    • Compute the reward for the last action
    • Query the learner with the current state and reward to get an action
    • Implement the action the learner returned (BUY, SELL, NOTHING), and update portfolio value
  • Repeat the above loop multiple times until cumulative return stops improving.
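One way to approach the "discretize" and "compute the current state" steps is sketched below. It is only an illustration under stated assumptions: equal-frequency binning, the helper names, and the way the holding is folded into a single integer are all choices made for this example, not requirements of the assignment.

```python
import bisect


def make_thresholds(training_values, n_bins):
    """Learn equal-frequency bin edges from one indicator's training values."""
    ordered = sorted(training_values)
    step = len(ordered) / float(n_bins)
    # Edge i is the largest training value that still falls in bin i - 1.
    return [ordered[max(int(step * i) - 1, 0)] for i in range(1, n_bins)]


def discretize(value, thresholds):
    """Map a raw indicator value to a bin index in [0, len(thresholds)]."""
    return bisect.bisect_left(thresholds, value)


# Hypothetical mapping from share holdings to a small integer.
HOLDING_INDEX = {-200: 0, 0: 1, 200: 2}


def encode_state(bins, holding, n_bins):
    """Fold the holding and all indicator bins into one integer state."""
    state = HOLDING_INDEX[holding]
    for b in bins:
        state = state * n_bins + b
    return state


thresholds = make_thresholds(range(1, 11), 5)   # edges learned in training
bins = [discretize(3, thresholds), discretize(9, thresholds)]
print(encode_state(bins, 200, 5))
```

With k indicators and n_bins bins each, this encoding yields 3 * n_bins ** k distinct states, which is the number of states the Q-learner would need to be sized for. Thresholds are learned from training data only, so values outside the training range simply land in the first or last bin at test time.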

A rule to keep in mind: as in past projects, you can only be long or short 200 shares. If your learner returns two BUYs in a row, don't double down; the same applies to SELLs.
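The rule above can be enforced by clamping each trade so that net holdings never leave the range of -200 to 200 shares. The following is a minimal sketch, assuming each BUY or SELL moves the position by 200 shares; the function name, the string action labels, and the cash bookkeeping are hypothetical.

```python
MAX_SHARES = 200
# Hypothetical mapping from action label to the desired change in shares.
ACTION_DELTA = {"BUY": 200, "SELL": -200, "NOTHING": 0}


def apply_action(holding, cash, action, price):
    """Return (trade, new_holding, new_cash) after applying one action."""
    delta = ACTION_DELTA[action]
    # Clamp the trade so the new holding stays within [-200, 200].
    trade = max(-MAX_SHARES - holding, min(MAX_SHARES - holding, delta))
    holding += trade
    cash -= trade * price
    return trade, holding, cash


trade, holding, cash = apply_action(0, 100000.0, "BUY", 50.0)
print(trade, holding, cash)       # 200 200 90000.0
# A second BUY while already long is clamped to a zero-share trade.
trade, holding, cash = apply_action(holding, cash, "BUY", 55.0)
print(trade, holding, cash)       # 0 200 90000.0
```

Portfolio value on any day is then cash + holding * price. One possible reward for the last action is the day-over-day change in that value; this is an assumption of the sketch, not something the assignment prescribes. An alternative design lets BUY jump straight from short to long with a +400 trade, which the output format also permits.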

For the policy testing part:

  • For each day in the testing data:
    • Compute the current state
    • Query the learner with the current state to get an action
    • Implement the action the learner returned (BUY, SELL, NOTHING), and update portfolio value
  • Return the resulting trades in a data frame (details below).
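The testing steps above can be sketched as a loop that only looks up actions and never updates the learner. In this example choose_action is a hypothetical callable standing in for whatever non-learning query your Q-Learner offers, and the daily "features" are made-up bin indices; neither comes from the assignment.

```python
ACTION_DELTA = {"BUY": 200, "SELL": -200, "NOTHING": 0}


def run_policy(daily_features, choose_action):
    """Apply a frozen policy day by day and collect the resulting trades."""
    holding = 0
    trades = []
    for features in daily_features:
        # Lookup only: no reward is passed in and nothing is learned here.
        action = choose_action(features, holding)
        delta = ACTION_DELTA[action]
        # Keep net holdings within [-200, 200] shares.
        trade = max(-200 - holding, min(200 - holding, delta))
        holding += trade
        trades.append(trade)
    return trades


def toy_policy(features, holding):
    """Hypothetical fixed policy: buy in the lowest bin, sell in the highest."""
    if features == 0:
        return "BUY"
    if features == 2:
        return "SELL"
    return "NOTHING"


print(run_policy([0, 0, 1, 2, 2, 0], toy_policy))
# [200, 0, 0, -200, -200, 200]
```

In a real learner the list of trades would be paired with the trading dates and returned as the data frame described below.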

Your StrategyLearner should implement the following API:

import datetime as dt
import StrategyLearner as sl

learner = sl.StrategyLearner(verbose = False) # constructor
learner.addEvidence(symbol = "AAPL", sd=dt.datetime(2008,1,1), ed=dt.datetime(2009,12,31), sv = 100000) # training phase
df_trades = learner.testPolicy(symbol = "AAPL", sd=dt.datetime(2010,1,1), ed=dt.datetime(2011,12,31), sv = 100000) # testing phase

The input parameters are:

  • verbose: if False do not generate any output
  • symbol: the stock symbol to train on
  • sd: A datetime object that represents the start date
  • ed: A datetime object that represents the end date
  • sv: Start value of the portfolio

The output result is:

  • df_trades: A data frame whose values represent trades for each day. Legal values are +200.0 indicating a BUY of 200 shares, -200.0 indicating a SELL of 200 shares, and 0.0 indicating NOTHING. Values of +400 and -400 for trades are also legal so long as net holdings are constrained to -200, 0, and 200.
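Before submitting, it may help to sanity-check the trade values against the constraints just described. The sketch below works on a plain sequence of trade values, however they are extracted from df_trades. The function name is hypothetical, and the starting holding of zero shares is an assumption.

```python
LEGAL_TRADES = (-400.0, -200.0, 0.0, 200.0, 400.0)
LEGAL_HOLDINGS = (-200.0, 0.0, 200.0)


def check_trades(trades):
    """Return True if every trade and every resulting net holding is legal."""
    holding = 0.0  # assumed: the portfolio starts with no position
    for trade in trades:
        if trade not in LEGAL_TRADES:
            return False
        holding += trade
        if holding not in LEGAL_HOLDINGS:
            return False
    return True


print(check_trades([200, 0, -400, 400, -200]))  # True
print(check_trades([200, 200]))                 # False: would hold 400 shares
```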

Contents of Report

There is no report component of this assignment. However, if you would like to impress us with your Machine Learning prowess, you are invited to submit a succinct report.

What to turn in

Turn your project in via t-square. All of your code must be contained within QLearner.py and StrategyLearner.py.

  • Your QLearner as QLearner.py
  • Your StrategyLearner as StrategyLearner.py
  • Your report (if any) as report.pdf
  • Do not submit any other files.

Rubric

We will test StrategyLearner in the following situations:

  • Training / in sample: January 1, 2008 to December 31, 2009.
  • Testing / out of sample: January 1, 2010 to December 31, 2011.
  • Symbols: ML4T-220, AAPL, UNH, SINE_FAST_NOISE
  • Starting value: $100,000
  • Benchmark: Buy 200 shares on the first trading day, sell 200 shares on the last day.

We expect the following outcomes in evaluating your system:

  • For ML4T-220, the trained policy should provide a cumulative return greater than 100% in sample (20 points)
  • For ML4T-220, the trained policy should provide a cumulative return greater than 100% out of sample (20 points)
  • For AAPL, the trained policy should significantly outperform the benchmark in sample (20 points)
  • For SINE_FAST_NOISE, the trained policy should provide a cumulative return greater than 200% in sample (20 points)
  • For UNH, the trained policy should significantly outperform the benchmark in sample (20 points)
  • Additional test in which we train your learner with one data set, then test it out of sample with another data set. The out of sample performance should be worse than in sample. Not counted for now.

Training and testing for each situation should run in less than 30 seconds. We reserve the right to use different time periods if necessary to reduce auto grading time.
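To get a feel for whether a run fits within the time limit, a small timing wrapper such as the one below can be used during development. It is only a sketch: the helper name is hypothetical, and wall-clock time on your own machine will not necessarily match the grading machines.

```python
import time


def timed(fn, *args, **kwargs):
    """Call fn and return (result, elapsed_seconds) using wall-clock time."""
    start = time.time()
    result = fn(*args, **kwargs)
    return result, time.time() - start


# Stand-in workload; in practice fn would be a training or testing call.
result, elapsed = timed(sum, range(1000))
print(result)
if elapsed >= 30:
    print("too slow for the 30 second limit")
```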

Required, Allowed & Prohibited

Required:

  • Your project must be coded in Python 2.7.x.
  • Your code must run on one of the university-provided computers (e.g. buffet02.cc.gatech.edu).

Allowed:

  • You can develop your code on your personal machine, but it must also run successfully on one of the university-provided machines or virtual images.
  • Your code may use standard Python libraries.
  • You may use the NumPy, SciPy, matplotlib and Pandas libraries. Be sure you are using the correct versions.
  • You may reuse sections of code (up to 5 lines) that you collected from other students or the internet.
  • Code provided by the instructor, or allowed by the instructor to be shared.
  • Use util.py (only) for reading data.

Prohibited:

  • Any libraries not listed in the "allowed" section above.
  • Any code you did not write yourself (except for the 5 line rule in the "allowed" section).
  • Any Classes (other than Random) that create their own instance variables for later use (e.g., learners like kdtree).
  • Print statements outside "verbose" checks (they significantly slow down auto grading).
  • Any method for reading data besides util.py.

Legacy