MC3-Project-2

From Quantitative Analysis Software Courses

Updates / FAQs

2015-11-21

  • Updated list of charts to generate.

Overview

In this project you will transform your regression learner into a stock trading strategy. Overall, you should follow these steps:

  • Train a regression learner (KNN, LinReg, or another of your choice, with or without bagging) on data from 2008 to 2009. This is your in-sample training data.
    • For your X values: Identify and implement at least 3 technical features that you believe may be predictive of future return. Implement them so that their output values typically range from -1.0 to 1.0. This helps avoid the situation where one feature overwhelms the others. See the suggested formulae below.
    • For your Y values: Don't use price; use the future 5-day return. Remember, you're trying to PREDICT the future.
  • Create a plot that illustrates your training Y values in one color, price in another color, and your model's PREDICTED Y in a third color. With this chart we should be able to see how well your learner performs and that your Y values are shifted back 5 days. You may find it convenient to zoom in on a particular time period so this is evident.
  • Create a trading policy based on what your learner predicts for future return. For example, you might choose to buy when the forecaster predicts the price will go up more than 1%, then hold for 5 days.
  • Create a plot that illustrates entries and exits as vertical lines on a price chart for the in-sample period 2008-2009. Show long entries as green lines, short entries as red lines, and exits as black lines. You may find it convenient to zoom in on a particular time period so this is evident.
  • Now use your code to generate orders and run those orders through your market simulator. Create a chart of this backtest. It should do VERY well for the in-sample period 2008-2009.
  • Freeze your model based on the 2008-2009 data. Now test it on the year 2010: create a plot that illustrates entries and exits, generate trades, run them through the simulator, and chart the backtest.

Perform the above steps first using the data in ML4T-399.csv. Once you've validated success (it should work well), repeat using IBM data over the same dates. Remember, 2008-2009 is training and 2010 is testing. You should have one set of charts for each symbol.
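
The example trading policy above (enter when the forecast exceeds 1%, hold for 5 days) can be sketched roughly as follows. This is only an illustration: the function name `generate_orders`, the symmetric short rule, the share count of 100, and the Date/Symbol/Order/Shares column layout are all assumptions, so adapt them to whatever your own market simulator expects.

```python
import pandas as pd


def generate_orders(predicted, symbol, threshold=0.01, hold_days=5, shares=100):
    """Turn predicted 5-day returns into entry/exit orders (hypothetical layout)."""
    orders = []
    dates = predicted.index
    i = 0
    while i < len(dates):
        exit_i = i + hold_days
        if exit_i >= len(dates):
            break  # not enough trading days left to complete a holding period
        forecast = predicted.iloc[i]
        if forecast > threshold:
            # Long entry now, exit after the holding period.
            orders.append((dates[i], symbol, "BUY", shares))
            orders.append((dates[exit_i], symbol, "SELL", shares))
            i = exit_i
        elif forecast < -threshold:
            # Short entry now (assumed mirror rule), cover after the holding period.
            orders.append((dates[i], symbol, "SELL", shares))
            orders.append((dates[exit_i], symbol, "BUY", shares))
            i = exit_i
        else:
            i += 1  # no signal (or NaN forecast): move to the next day
    return pd.DataFrame(orders, columns=["Date", "Symbol", "Order", "Shares"])
```

Because the loop jumps to the exit date after each entry, this sketch never holds more than one position at a time. That is a design choice for simplicity, not a requirement of the project.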

Summary of Plots To Create

  1. Training Y/Price/Predicted Y: Create a plot that illustrates your training Y values in one color, price in another color, and PREDICTED Y in a third color. With this chart we should be able to see how well your learner performs and that your Y values are shifted back 5 days. You may find it convenient to zoom in on a particular time period so this is evident.
  2. Sine Data In Sample Entries/Exits: Create a plot that illustrates entries and exits as vertical lines on a price chart for the in-sample period 2008-2009. Show long entries as green lines, short entries as red lines, and exits as black lines. You may find it convenient to zoom in on a particular time period so this is evident.
  3. Sine Data In Sample Backtest
  4. Sine Data Out of Sample Entries/Exits: Freeze your model based on the 2008-2009 data, then test it on the year 2010. Create a plot that illustrates entries and exits for this out-of-sample period.
  5. Sine Data Out of Sample Backtest
  6. IBM Data In Sample Entries/Exits: Create a plot that illustrates entries and exits as vertical lines on a price chart for the in-sample period 2008-2009. Show long entries as green lines, short entries as red lines, and exits as black lines. You may find it convenient to zoom in on a particular time period so this is evident.
  7. IBM Data In Sample Backtest
  8. IBM Data Out of Sample Entries/Exits: Freeze your model based on the 2008-2009 data, then test it on the year 2010. Create a plot that illustrates entries and exits for this out-of-sample period.
  9. IBM Data Out of Sample Backtest
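
One possible way to draw the entries/exits charts listed above is with matplotlib's `axvline`. The sketch below is a minimal illustration, not a required implementation: the helper name `plot_entries_exits`, the synthetic sine prices, and the example entry and exit dates are all made up for demonstration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def plot_entries_exits(price, long_entries, short_entries, exits, title):
    """Price chart with green (long), red (short), and black (exit) vertical lines."""
    fig, ax = plt.subplots(figsize=(10, 4))
    ax.plot(price.index, price.values, color="blue", label="Price")
    for day in long_entries:
        ax.axvline(day, color="green")
    for day in short_entries:
        ax.axvline(day, color="red")
    for day in exits:
        ax.axvline(day, color="black")
    ax.set_title(title)
    ax.set_xlabel("Date")
    ax.set_ylabel("Price")
    return fig


# Illustrative data only: a synthetic sine-wave price series and invented trade dates.
dates = pd.bdate_range("2008-01-01", periods=60)
price = pd.Series(100.0 + 10.0 * np.sin(np.arange(60) / 5.0), index=dates)
fig = plot_entries_exits(
    price,
    long_entries=[dates[5]],
    short_entries=[dates[25]],
    exits=[dates[10], dates[30]],
    title="In-sample entries/exits (illustrative)",
)
# fig.savefig("entries_exits.png")  # uncomment to write the chart to disk
```

Passing a shorter slice of `price` is a simple way to zoom in on a particular time period.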

Template and Data

You will use data in the ML4T/Data directory, in particular the files named ML4T-XXX.csv, where XXX are digits.

Choosing Technical Features -- Your X Values

You should have already successfully coded the Bollinger Band feature. Here's a suggestion of how to normalize that feature so that it will typically provide values between -1.0 and 1.0:

bb_value[t] = (price[t] - SMA[t])/(2 * stdev[t])
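
As a rough sketch, the formula above could be computed with pandas rolling windows. The 20-day lookback is an assumption (the formula does not fix a window length), and `bb_value` here is just an illustrative function name.

```python
import pandas as pd


def bb_value(price, window=20):
    """Normalized Bollinger Band value: (price - SMA) / (2 * stdev)."""
    sma = price.rolling(window).mean()    # simple moving average
    stdev = price.rolling(window).std()   # rolling sample standard deviation
    return (price - sma) / (2 * stdev)
```

The first `window - 1` values are NaN because the rolling window is not yet full, so drop or mask those rows before training.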

Two other good features worth considering are momentum and volatility.
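
The page does not define momentum or volatility, so the sketch below uses common definitions as an assumption: momentum as the N-day trailing return, and volatility as the rolling standard deviation of daily returns. The 10-day window is likewise only a placeholder.

```python
import pandas as pd


def momentum(price, window=10):
    """Assumed definition: N-day trailing return, price[t] / price[t-N] - 1."""
    return price / price.shift(window) - 1.0


def volatility(price, window=10):
    """Assumed definition: rolling standard deviation of daily returns."""
    daily_returns = price / price.shift(1) - 1.0
    return daily_returns.rolling(window).std()
```

For typical daily stock data both features stay well inside -1.0 to 1.0, but their scales differ from the Bollinger Band value, so you may still want to rescale them.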

Choosing Y

Your code should predict the 5-day change in price. You need to build a new Y that reflects the 5-day change and aligns with the current date. Here's pseudocode for the calculation of Y:

Y[t] = (price[t+5]/price[t]) - 1.0
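
In pandas, one way to express this pseudocode is with a negative shift, which lines up the price 5 rows ahead with the current date. The function name `future_return` is illustrative, and the sketch assumes one row per trading day.

```python
import pandas as pd


def future_return(price, horizon=5):
    """Y[t] = price[t + horizon] / price[t] - 1.0, aligned with date t."""
    return price.shift(-horizon) / price - 1.0
```

The last `horizon` rows come out as NaN because their future prices are not available; exclude them from training.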

If you select Y in this manner and use it for training, your learner will predict 5-day returns.
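
To make the train-then-freeze workflow concrete, here is a minimal sketch that fits a plain least-squares linear regression on 2008-2009 and applies the frozen coefficients to 2010. Everything specific in it is an assumption for illustration: the synthetic sine prices stand in for real data, the two trailing-return features are placeholders for your own indicators, and NumPy's `lstsq` stands in for your own learner.

```python
import numpy as np
import pandas as pd

# Synthetic sine-wave prices standing in for real data (illustrative only).
dates = pd.bdate_range("2008-01-01", "2010-12-31")
price = pd.Series(100.0 + 10.0 * np.sin(np.arange(len(dates)) / 10.0), index=dates)

# Placeholder features (assumption): 1-day and 5-day trailing returns.
features = pd.DataFrame({
    "ret_1": price / price.shift(1) - 1.0,
    "ret_5": price / price.shift(5) - 1.0,
})
target = price.shift(-5) / price - 1.0  # Y[t] = price[t+5] / price[t] - 1.0

data = features.assign(Y=target).dropna()
train = data.loc["2008-01-01":"2009-12-31"]  # in sample
test = data.loc["2010-01-01":"2010-12-31"]   # out of sample
# Note: Y for the last 5 training days uses early-2010 prices;
# drop those rows if you want a strict split.


def design_matrix(frame):
    """Feature columns plus a constant column for the intercept."""
    x = frame[["ret_1", "ret_5"]].to_numpy()
    return np.column_stack([x, np.ones(len(x))])


def rmse(pred, actual):
    return float(np.sqrt(np.mean((pred - actual) ** 2)))


# Fit once on the training period, then reuse the same coefficients on 2010.
coef, *_ = np.linalg.lstsq(design_matrix(train), train["Y"].to_numpy(), rcond=None)
pred_train = design_matrix(train) @ coef
pred_test = design_matrix(test) @ coef

print("in-sample RMSE:", rmse(pred_train, train["Y"].to_numpy()))
print("out-of-sample RMSE:", rmse(pred_test, test["Y"].to_numpy()))
```

The key point is that `coef` is computed only from the training rows and is never refit on the test year.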

Contents of Report

  • Your report should be no more than 6 pages long. Use 1" margins and no smaller than 10 point font. Your report should contain no more than 10 charts. Penalties will apply if you violate these constraints.
  • Include the charts listed in the overview section above.
  • Describe each of the indicators you have selected in enough detail that someone else could reproduce them in code.
  • Describe your trading policy clearly.
  • Discuss your results. Did the strategy work well? Why or why not? What would you do differently?

Hints & resources

What to turn in

Turn your project in via t-square.

  • Your report as report.pdf
  • Your code as code.py

Extra credit up to 3%

Extend your code to simultaneously forecast all the members of the S&P 500. Generate trades accordingly, and backtest the result.

Rubric