MC3-Project-2
Draft
"Draft" will be removed when the assignment is final.
Updates / FAQs
Overview
In this project you will transform your regression learner into a stock trading strategy. Overall, you should follow these steps:
- Train a regression learner (KNN, LinReg, or another learner of your choice, with or without bagging) on data from 2008 to 2009. This is your in sample training data.
- For your X values: Identify and implement at least 3 technical features that you believe may be predictive of future return. You should implement them so they output values typically ranging from -1.0 to 1.0. This will help avoid the situation where one feature overwhelms the results. See a few formulae below.
- For your Y values: Don't use price; use 5 day return.
- Create a plot that illustrates your Y values in one color and price in another color. We should be able to see that your Y values are shifted back 5 days. You may find it convenient to zoom in on a particular time period so this is evident.
- Create a trading policy based on what your learner predicts for future return. As an example you might choose to buy when the forecaster predicts the price will go up more than 1%, then hold for 5 days.
- Create a plot that illustrates entries and exits as vertical lines on a price chart for the in sample period 2008-2009. Show long entries as green lines, short entries as red lines, and exits as black lines. You may find it convenient to zoom in on a particular time period so this is evident.
- Now use your code to generate orders and run those orders through your market simulator. Create a chart of this backtest. It should do VERY well for the in sample period 2008-2009.
- Freeze your model based on the 2008-2009 data. Now test it on the year 2010: create a plot that illustrates entries and exits, generate trades, run them through your simulator, and chart the backtest.
Perform the above steps first using the data in ML4T-399.csv. Once you've validated success (it should work well), repeat using IBM data over the same dates. Remember: 2008-2009 is training, 2010 is testing. You should have one set of charts for each symbol.
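The date split and the "freeze" step described above can be sketched as follows. This is only an illustration under stated assumptions: the feature matrix and target are synthetic random data (not ML4T-399 or IBM), the column names are hypothetical, and a plain least-squares linear regression stands in for whichever learner you choose.

```python
import numpy as np
import pandas as pd

# Synthetic features and target, for illustration only (not real market data).
rng = np.random.default_rng(1)
dates = pd.bdate_range("2008-01-01", "2010-12-31")
X = pd.DataFrame(rng.uniform(-1, 1, (len(dates), 3)), index=dates,
                 columns=["bb", "momentum", "volatility"])  # hypothetical names
y = 0.02 * X["bb"] - 0.01 * X["momentum"] + rng.normal(0, 0.001, len(dates))

# Split by date: 2008-2009 is in sample, 2010 is out of sample.
X_train, y_train = X.loc["2008":"2009"], y.loc["2008":"2009"]
X_test, y_test = X.loc["2010"], y.loc["2010"]

def fit_linreg(X, y):
    # Least-squares fit with an intercept column appended.
    A = np.column_stack([X.to_numpy(), np.ones(len(X))])
    coef, *_ = np.linalg.lstsq(A, y.to_numpy(), rcond=None)
    return coef

def predict(coef, X):
    return X.to_numpy() @ coef[:-1] + coef[-1]

coef = fit_linreg(X_train, y_train)  # fitted once on 2008-2009, then frozen
pred_test = predict(coef, X_test)    # 2010 is only predicted, never fitted
```

The point of the sketch is that nothing from 2010 reaches `fit_linreg`; the same frozen `coef` is reused for the out of sample predictions.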
Template and Data
You will use data in the ML4T/Data directory, in particular the files named ML4T-XXX.csv, where XXX are digits.
Choosing Technical Features -- Your X Values
You should have already successfully coded the Bollinger Band feature. Here's a suggestion of how to normalize that feature so that it will typically provide values between -1.0 and 1.0:
bb_value[t] = (price[t] - SMA[t])/(2 * stdev[t])
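As a minimal sketch, the formula above might be computed with pandas rolling windows. The 20 day window and the function name `bollinger_value` are assumptions for illustration; the assignment does not specify them, and the prices here are synthetic.

```python
import numpy as np
import pandas as pd

def bollinger_value(price: pd.Series, window: int = 20) -> pd.Series:
    """bb_value[t] = (price[t] - SMA[t]) / (2 * stdev[t]) over a rolling window."""
    sma = price.rolling(window).mean()
    stdev = price.rolling(window).std()
    return (price - sma) / (2 * stdev)

# Synthetic prices for illustration only (not real market data).
rng = np.random.default_rng(0)
dates = pd.bdate_range("2008-01-01", periods=60)
price = pd.Series(100 + rng.normal(0, 1, 60).cumsum(), index=dates)

bb = bollinger_value(price)
# The first window - 1 values are NaN because the rolling window is not yet full.
```

Values outside -1.0 to 1.0 correspond to prices outside the bands, so the feature is typically, but not always, within that range.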
Two other good features worth considering are momentum and volatility.
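The assignment does not give formulae for these two features, so the following is only one possible sketch. It assumes momentum is the fractional price change over a lookback window and volatility is the rolling standard deviation of daily returns; the window lengths and function names are hypothetical.

```python
import pandas as pd

def momentum(price: pd.Series, window: int = 5) -> pd.Series:
    # Assumed definition: fractional price change over the last `window` days.
    return price / price.shift(window) - 1.0

def volatility(price: pd.Series, window: int = 20) -> pd.Series:
    # Assumed definition: rolling standard deviation of daily returns.
    daily_ret = price / price.shift(1) - 1.0
    return daily_ret.rolling(window).std()

# Tiny hand-made example: the price rises 10% on each of two days.
price = pd.Series([100.0, 110.0, 121.0])
mom = momentum(price, window=1)    # NaN on the first day, then about 0.1 each day
vol = volatility(price, window=2)  # about 0 on the last day (identical returns)
```

Under these definitions both features are usually small in magnitude, so you may still want to rescale them (for example, using statistics from the in sample period only) so that they span a range comparable to the Bollinger Band feature.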
Choosing Y
Your code should predict the 5 day change in price. You need to build a new Y that reflects the 5 day change and aligns with the current date. Here's pseudocode for the calculation of Y:
Y[t] = (price[t+5]/price[t]) - 1.0
If you select Y in this manner and use it for training, your learner will predict 5 day returns.
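In pandas, the pseudocode above might be sketched with a negative shift, which pulls the price from 5 rows ahead back onto the current row. The function name `future_return` and the toy prices are assumptions for illustration.

```python
import pandas as pd

def future_return(price: pd.Series, horizon: int = 5) -> pd.Series:
    # Y[t] = price[t + horizon] / price[t] - 1.0
    # shift(-horizon) aligns the future price with today's row.
    return price.shift(-horizon) / price - 1.0

# Toy prices for illustration only.
price = pd.Series([100.0, 101.0, 102.0, 103.0, 104.0, 105.0, 106.0])
y = future_return(price)
# y[0] = 105/100 - 1 and y[1] = 106/101 - 1.
# The last 5 rows have no price 5 days ahead, so they are NaN.
```

The trailing NaN rows, together with the matching rows of X, would need to be dropped before training.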
Part 5: Create a Trading Strategy
Use your forecaster to generate a trading strategy. Create a chart that shows entry and exit points.
Generate trades and show the backtest. Show results for in sample (2008-2009) and out of sample (2010).
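One way to turn predictions into trades is sketched below, following the example policy from the overview (enter when the predicted 5 day return exceeds 1%, then hold for 5 days). Several details are assumptions rather than requirements: the symmetric short rule below -1%, the fixed 100 share size, holding only one position at a time, and the `Date`, `Symbol`, `Order`, `Shares` column layout, which should be adapted to whatever your market simulator expects.

```python
import pandas as pd

def generate_orders(pred, symbol="IBM", threshold=0.01, hold_days=5, shares=100):
    """Turn predicted 5 day returns into entry and exit orders.

    Go long when pred > threshold, short when pred < -threshold, and exit
    hold_days trading days later. Only one position is open at a time.
    """
    orders = []
    dates = pred.index
    i = 0
    while i + hold_days < len(dates):
        if pred.iloc[i] > threshold:
            entry, exit_ = "BUY", "SELL"   # long entry, later closed by a sell
        elif pred.iloc[i] < -threshold:
            entry, exit_ = "SELL", "BUY"   # short entry, later closed by a buy
        else:
            i += 1
            continue
        orders.append((dates[i], symbol, entry, shares))
        orders.append((dates[i + hold_days], symbol, exit_, shares))
        i += hold_days  # the next entry is considered from the exit day onward
    return pd.DataFrame(orders, columns=["Date", "Symbol", "Order", "Shares"])

# Hand-made predictions for illustration only.
dates = pd.bdate_range("2010-01-04", periods=12)
pred = pd.Series([0.02, 0, 0, 0, 0, 0, -0.03, 0, 0, 0, 0, 0], index=dates)
orders = generate_orders(pred)
```

The entry and exit dates in `orders` are the same dates you would draw as vertical lines on the price chart.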
Now apply this to real data (IBM) over the same periods.