Difference between revisions of "Manual strategy"

From Quantitative Analysis Software Courses
 
(55 intermediate revisions by one other user not shown)
Line 1: Line 1:
==DRAFT==
+
==Finalized==
 
 
This assignment is under revision.  This notice will be removed once it is final.
 
  
 
==Updates / FAQs==
 
==Updates / FAQs==
Line 16: Line 14:
  
 
In this project you will develop a trading strategy using your intuition and Technical Analysis, and test it against a stock using your market simulator. In a later project, you will use your same indicators but with Machine Learning (instead of your intuition) to create a trading strategy. We hope Machine Learning will do better than your intuition, but who knows?
 
In this project you will develop a trading strategy using your intuition and Technical Analysis, and test it against a stock using your market simulator. In a later project, you will use your same indicators but with Machine Learning (instead of your intuition) to create a trading strategy. We hope Machine Learning will do better than your intuition, but who knows?
 +
 +
==Template==
 +
 +
There is no distributed template for this project.  You should create a directory for your code in ml4t/manual_strategy and make a copy of util.py there.  You will have access to the data in the ML4T/Data directory but you should use ONLY the code in util.py to read it.
 +
 +
You should create the following code files for submission.  They should comprise ALL code from you that is necessary to run your evaluations.
 +
 +
* <tt>indicators.py</tt> Your code that implements your indicators as functions that operate on dataframes.  The "main" code in indicators.py should generate the charts that illustrate your indicators in the report.
 +
* <tt>marketsimcode.py</tt> An improved version of your marketsim code that accepts a "trades" data frame (instead of a file).  More info on the trades data frame below.  It is OK not to submit this file if you have subsumed its functionality into one of your other code files.
 +
* <tt>ManualStrategy.py</tt> Code implementing a ManualStrategy object (your manual strategy).  It should implement testPolicy() which returns a trades data frame (see below). The main part of this code should call marketsimcode as necessary to generate the plots used in the report.
 +
* <tt>TheoreticallyOptimalStrategy.py</tt> Code implementing a TheoreticallyOptimalStrategy object (details below).  It should implement testPolicy() which returns a trades data frame (see below). The main part of this code should call marketsimcode as necessary to generate the plots used in the report.
 +
 +
Note that we may not test your code, so we may not know if you didn't organize your code as recommended, but this arrangement will be required for later projects, so it is worthwhile getting it set up this way.  The key requirement is that, if necessary, a TA should be able to run your code on a buffet machine and get the same results (e.g., statistics and charts) that we see in your report.
  
 
==Data Details, Dates and Rules==
 
==Data Details, Dates and Rules==
Line 28: Line 39:
 
* Benchmark: The performance of a portfolio starting with $100,000 cash, investing in 1000 shares of JPM and holding that position.
 
* Benchmark: The performance of a portfolio starting with $100,000 cash, investing in 1000 shares of JPM and holding that position.
 
* There is no limit on leverage.
 
* There is no limit on leverage.
* Transaction costs: Commission: $9.95, Impact: 0.005.
+
* Transaction costs for ManualStrategy: Commission: $9.95, Impact: 0.005.
 +
* Transaction costs for TheoreticallyOptimalStrategy: Commission: $0.00, Impact: 0.00.
  
 
==Part 1: Technical Indicators (20 points)==
 
==Part 1: Technical Indicators (20 points)==
Line 40: Line 52:
 
At least one of the indicators you use should be completely different from the ones presented in our lectures (i.e., something other than SMA, Bollinger Bands, or RSI).
 
At least one of the indicators you use should be completely different from the ones presented in our lectures (i.e., something other than SMA, Bollinger Bands, or RSI).
  
==Part 2: Best Possible Strategy (5 points)==
+
==Part 2: Theoretically Optimal Strategy (20 points)==
  
 
Assume that you can see the future, but that you are constrained by the portfolio size and order limits as specified above.  Create a set of trades that represents the best a strategy could possibly do during the in sample period. The reason we're having you do this is so that you will have an idea of an upper bound on performance.   
 
Assume that you can see the future, but that you are constrained by the portfolio size and order limits as specified above.  Create a set of trades that represents the best a strategy could possibly do during the in sample period. The reason we're having you do this is so that you will have an idea of an upper bound on performance.   
Line 49: Line 61:
  
 
* Benchmark (see definition above) normalized to 1.0 at the start: Blue line
 
* Benchmark (see definition above) normalized to 1.0 at the start: Blue line
* Value of the best possible portfolio (normalized to 1.0 at the start): Black line
+
* Value of the theoretically optimal portfolio (normalized to 1.0 at the start): Black line
  
 
You should also report in text:
 
You should also report in text:
Line 57: Line 69:
 
* Mean of daily returns of benchmark and portfolio
 
* Mean of daily returns of benchmark and portfolio
  
==Part 3: Manual Rule-Based Trader (20 points)==
+
Your code should implement testPolicy() as follows:
 +
 
 +
df_trades = tos.testPolicy(symbol = "AAPL", sd=dt.datetime(2010,1,1), ed=dt.datetime(2011,12,31), sv = 100000)
 +
 
 +
The input parameters are:
 +
 
 +
* symbol: the stock symbol to act on
 +
* sd: A datetime object that represents the start date
 +
* ed: A datetime object that represents the end date
 +
* sv: Start value of the portfolio
 +
 
 +
The output result is:
  
Devise a set of rules using the indicators you created in Part 1 above.  Devise some simple logic using your indicators to enter and exit positions in the stock. A recommended approach is to create a single logical expression that yields a -1, 0, or 1, corresponding to a "short," "out" or "long" position.  Example usage of this signal: If you are out of the stock, then a 1 would signal a BUY 1000 order. If you are long, a -1 would signal a SELL 2000 order.  You don't have to follow this advice though, so long as you follow the trading rules outlined above.
+
* df_trades: A data frame whose values represent trades for each day.  Legal values are +1000.0 indicating a BUY of 1000 shares, -1000.0 indicating a SELL of 1000 shares, and 0.0 indicating NOTHING.  Values of +2000 and -2000 for trades are also legal so long as net holdings are constrained to -1000, 0, and 1000.
  
We want a written description here in the report, not code; however, it is OK to augment your written description with a pseudocode figure.
+
==Part 3: Manual Rule-Based Trader (50 points)==
  
You should tweak your rules as best you can to get the best performance possible during the in sample period (do not peek at out of sample performance).  Use your rule-based strategy to generate an orders file over the in sample period, then run that file through your market simulator to create a chart that includes the following components over the in sample period:
+
In <tt>ManualStrategy.py</tt> implement a set of rules using the indicators you created in Part 1 above.  Devise some simple logic using your indicators to enter and exit positions in the stock. 
 +
 
 +
A recommended approach is to create a single logical expression that yields a -1, 0, or 1, corresponding to a "short," "out" or "long" position.  Example usage of this signal: If you are out of the stock, then a 1 would signal a BUY 1000 order. If you are long, a -1 would signal a SELL 2000 order.  You don't have to follow this advice though, so long as you follow the trading rules outlined above.
 +
 
 +
For the report we want a written description, not code; however, it is OK to augment your written description with a pseudocode figure.
 +
 
 +
You should tweak your rules as best you can to get the best performance possible during the in sample period (do not peek at out of sample performance).  Use your rule-based strategy to generate an orders dataframe over the in sample period, then run that dataframe through your market simulator to create a chart that includes the following components over the in sample period:
  
 
* Benchmark (see definition above) normalized to 1.0 at the start: Blue line
 
* Benchmark (see definition above) normalized to 1.0 at the start: Blue line
Line 70: Line 99:
 
* Vertical red lines indicating SHORT entry points.
 
* Vertical red lines indicating SHORT entry points.
  
We expect that your rule-based strategy should outperform the benchmark over the in sample period.
+
We expect that your rule-based strategy should outperform the benchmark over the in sample period.
  
Deliverables:
+
Your code should implement the same API as above for theoretically optimal:
* Descriptive text (1 or 2 pages with chart) that provides a compelling justification for the rule-based system developed.
 
* Text must describe rule based system in sufficient detail that another person could implement it.
 
* 1 chart.
 
* Code: rule_based.py (generates an orders file)
 
  
==Part 6: Comparative Analysis (10%)==
+
df_trades = ms.testPolicy(symbol = "AAPL", sd=dt.datetime(2010,1,1), ed=dt.datetime(2011,12,31), sv = 100000)
  
Evaluate the performance of both of your strategies in the out of sample period.  Note that you '''should not''' train or tweak your learner on this data.  You should use the classification learned using the training data only.  Create a chart that shows, out of sample:
+
==Part 4: Comparative Analysis (10 points)==
  
* Benchmark (see definition above) normalized to 1.0 at the start: Black line
+
Evaluate the performance of your strategy in the out of sample period.  Note that you '''should not''' train or tweak your approach on this data. You should use only the rules you developed using the in sample data. Create a chart that shows, out of sample:
* Performance of manual strategy: Blue line
 
* Performance of the ML strategy: Green line
 
* All three should be normalized to 1.0 at the start.
 
  
Create a table that summarizes the performance of the stock, the manual strategy and the ML strategy for both in sample and out of sample periods.  Utilize your experience in this class to determine which factors are best to use for comparing these strategies.  If performance out of sample is worse than in sample, do your best to explain why. Also if the manual and ML strategies perform substantially differently, explain why. Is one method or the other more or less susceptible to the same underlying flaw?  Why or why not?
+
* Benchmark (see definition above) normalized to 1.0 at the start: Blue line
 +
* Performance of manual strategy: Black line
 +
* Both should be normalized to 1.0 at the start.
  
Deliverables:
+
Create a table that summarizes the performance of the stock, and the manual strategy for both in sample and out of sample periods.  Explain WHY these differences occur.
* Descriptive text (1 or 2 pages including figures)
 
* 1 chart
 
  
 
==Hints==
 
==Hints==
Line 105: Line 127:
 
** If neither LONG nor SHORT is triggered, the result should be DO NOTHING.
 
** If neither LONG nor SHORT is triggered, the result should be DO NOTHING.
 
** For debugging purposes, you may find it helpful to plot the value of the rule-based output (-1, 0, 1) versus the stock price.
 
** For debugging purposes, you may find it helpful to plot the value of the rule-based output (-1, 0, 1) versus the stock price.
* Train a classification learner on in sample training data:
 
** For your Y values: Use future 21 day return (not future price).  Then classify that return as LONG, SHORT or DO NOTHING.  You're trying to predict a relative change that you can use to invest with.
 
** For debugging purposes, you may find it helpful to plot the value of the training classification data (-1, 0, 1) versus the stock price in one color.
 
** For debugging purposes, you may find it helpful to plot the value of the training classification output (-1, 0, 1) versus the stock price in another color.  Ideally, these two lines should be very similar.
 
  
 
'''Choosing Technical Features -- Your X Values'''
 
'''Choosing Technical Features -- Your X Values'''
Line 115: Line 133:
  
 
<PRE>
 
<PRE>
bb_value[t] = (price[t] - SMA[t])/(stdev[t])
+
bb_value[t] = (price[t] - SMA[t])/(2 * stdev[t])
 
</PRE>
 
</PRE>
  
Line 126: Line 144:
 
Volatility is just the stdev of daily returns.
 
Volatility is just the stdev of daily returns.
  
You still need to standardize the resulting values.
+
It is usually worthwhile to standardize the resulting values (see https://en.wikipedia.org/wiki/Standard_score).
  
'''Choosing Y'''
+
==Contents of Report==
 
 
Your code should classify based on 21 day change in price.  You need to build a new Y that reflects the 21 day change and aligns with the current date.  Here's pseudo code for the calculation of Y
 
 
 
ret = (price[t+21]/price[t]) - 1.0
 
if ret > YBUY:
 
    Y[t] = +1 # LONG
 
else if ret < YSELL:
 
    Y[t] = -1 # SHORT
 
else:
 
    Y[t] = 0
 
  
If you select Y in this manner and use it for training, your learner will classify 21 day returns.
+
Describe each indicator you use in sufficient detail that someone else could reproduce it.  You should also provide a compelling description regarding why that indicator might work and how it could be used.  You should also provide one or more charts that convey how each indicator works in a compelling way (up to 8 charts).
  
==Template and Data==
+
For the best possible strategy, describe how you created it and any assumptions you had to make to make it work.  Provide a chart that illustrates its performance versus the benchmark.
  
There is no github template for this project.  You should create a directory for your code in ml4t/mc3-p3 and make a copy of util.py there. You should also copy into that directory your learner code and your market simulator code. You will have access to the data in the ML4T/Data directory but you should use ONLY the code in util.py to read it.
+
For your manual strategy, describe how you combined your indicators to create an overall signal.  How do you decide to enter and exit your positions and why? Why do you believe (or not) that this is an effective strategy? Provide a chart.
  
==Contents of Report==
+
Compare the performance of your manual strategy versus the benchmark for the in sample and out of sample time periods. Provide a chart.
  
* Your report should be no more than 3000 words.  Your report should contain no more than 14 charts.  Penalties will apply if you violate these constraints.
+
Your report should be no more than 3000 words.  Your report should contain no more than 14 charts.  Penalties will apply if you violate these constraints.
* Include charts and text as identified in the sections above.
 
  
 
==Expectations==
 
==Expectations==
  
* In-sample AAPL backtests should perform very well -- The ML version should do better than the manual version.
+
* In-sample backtests should perform very well.
* Out-of-sample AAPL backtests should... (you should be able to complete this sentence).
+
* Out-of-sample backtests should... (you should be able to complete this sentence).
  
 
==What to turn in==
 
==What to turn in==
  
Turn your project in via t-square.   
+
Turn your project in via Canvas.   
  
 
* Your report as <tt>report.pdf</tt>
 
* Your report as <tt>report.pdf</tt>
Line 167: Line 174:
 
==Rubric==
 
==Rubric==
  
Start with 100%, deductions as follows:
+
Start with 100 points, deductions as follows:
  
Indicators (up to 20% potential deductions):
+
Neatness (up to 5 points deduction if not).
* Is each indicator described in sufficient detail that someone else could reproduce it? (-5% for each if not)
 
* Is there a chart for each indicator that properly illustrates its operation? (-5% for each if not)
 
* Is at least one indicator different from those provided by the instructor's code (i.e., another indicator that is not SMA, Bollinger Bands or RSI) (-10% if not)
 
* Does the submitted code <tt>indicators.py</tt> properly reflect the indicators provided in the report (-20% if not)
 
  
Best possible (up to 5% potential deductions):
+
Bonus for exceptionally well written reports (up to 2 points)
* Is the chart correct (dates and equity curve) (-5% if not)
 
* Is the reported performance correct within 5% (-1% for each item if not)
 
  
Manual rule-based trader (up to 20% deductions):
+
Indicators (up to 20 points potential deductions):
* Is the trading strategy described with clarity and in sufficient detail that someone else could reproduce it? (-10%)
+
* Is there a compelling description why each indicator might work (-2 for each, up to a total of 6 off)
* Does the provided chart include:
+
* Is each indicator described in sufficient detail that someone else could reproduce it? (-5 points for each if not)
** Historic value of benchmark normalized to 1.0 with black line (-5% if not)
+
* Is there a chart for each indicator that properly illustrates its operation? (-5 points for each if not)
** Historic value of portfolio normalized to 1.0 with blue line (-10% if not)
+
* Is at least one indicator different from those provided by the instructor's code (i.e., another indicator that is not SMA, Bollinger Bands or RSI) (-10 points if not)
** Are the appropriate date ranges covered? (-5% if not)
+
* Does the submitted code <tt>indicators.py</tt> properly reflect the indicators provided in the report (-20 points if not)
** Are vertical lines included to indicate entries (-10% if not)
 
* Does the submitted code <tt>rule_based.py</tt> properly reflect the strategy provided in the report? (-20% if not)
 
* Does the manual trading system provide higher cumulative return than the benchmark over the in-sample time period? (-5% if not)
 
  
ML-based trader (up to 30% deductions):
+
Theoretically optimal (up to 20 points potential deductions):
* Is the ML strategy described with clarity and in sufficient detail that someone else could reproduce it? (-10%)
+
* Is the methodology described correct and convincing? (-10 points if not)
* Are modifications/tweaks to the basic decision tree learner fully described (-10%)
+
* Is the chart correct (dates and equity curve) (-10 points if not)
* Does the methodology utilize a classification-based learner? (-30%)
+
* Is the chart correct (dates and equity curve) (-10 points if not)
* Does the provided chart include:
+
* Historic value of benchmark normalized to 1.0 with blue line (-5 if not)
** Historic value of benchmark normalized to 1.0 with black line (-5% if not)
+
* Historic value of portfolio normalized to 1.0 with black line (-5 if not)
** Historic value of rule-based portfolio normalized to 1.0 with blue line (-5% if not)
+
* Are the reported performance criteria correct? (-2 points for each item if not)
** Historic value of ML-based portfolio normalized to 1.0 with green line (-10% if not)
 
** Are the appropriate date ranges covered? (-5% if not)
 
** Are vertical lines included to indicate entry (-10% if not)
 
* Does the submitted code <tt>ML_based.py</tt> properly reflect the strategy provided in the report? (-30% if not)
 
* Does the ML trading system provide 1.5x higher cumulative return than the benchmark over the in-sample time period? (-5% if not)
 
  
Data visualization (up to 15% deductions):
+
Manual rule-based trader (up to 50 points deductions):
* Is the X data reported in all three charts the same? (-5% if not)
+
* Is the trading strategy described with clarity and in sufficient detail that someone else could reproduce it? (-20)
* Is the X data standardized? (-5% if not)
+
* Does the provided chart(s) include:
* Is the Y data in the train and query plots similar (-5% if not)
+
** Historic value of benchmark normalized to 1.0 with blue line (-10 if not)
 +
** Historic value of portfolio normalized to 1.0 with black line (-10 if not)
 +
** Are the appropriate date ranges covered? (-10 if not)
 +
** Are vertical lines included to indicate entries (-10 if not)
 +
* Does the submitted code <tt>ManualStrategy.py</tt> properly reflect the strategy provided in the report? (-20 if not)
 +
* Do the submitted code and report reflect an understanding of the subject matter (up to -30 if not)
 +
* Does the manual trading system provide higher cumulative return than the benchmark over the in-sample time period? (-10 if not)
 +
* Did the student use the correct symbol? (-10 if not)
 +
* Did the student use the correct date periods? (-10 if not)
 +
* Does the strategy obey holding constraints (-5 if not)
  
Comparative analysis (up to 10% deductions):
+
Comparative analysis (up to 10 points deductions):
* Is the appropriate chart provided (-5% for each missing element, up to a maximum of -10%)
+
* Is the appropriate chart provided (-5 for each missing element, up to a maximum of -10)
* Is there a table that reports in-sample and out-of-sample data for the baseline (just the stock), rule-based, and ML-based strategies? (-5% for each missing element)
+
* Are differences between the in-sample and out-of-sample performances appropriately explained (-5)
* Are differences between the in-sample and out-of-sample performances appropriately explained (-5%)
+
* Do the submitted code and report reflect an understanding of the subject matter (up to -5 if not)
 +
* Is the required table present and correct (up to -5 if not)
  
 
==Required, Allowed & Prohibited==
 
==Required, Allowed & Prohibited==
Line 217: Line 220:
 
* Your project must be coded in Python 2.7.x.
 
* Your project must be coded in Python 2.7.x.
 
* Your code must run on one of the university-provided computers (e.g. buffet02.cc.gatech.edu).
 
* Your code must run on one of the university-provided computers (e.g. buffet02.cc.gatech.edu).
* Use only util.py to read data. If you want to read items other than adjusted close, modify util.py to do it, and submit your new version with your code.
+
* Use only util.py to read data.  
 +
* All charts must be generated in Python, and you must provide the code you used.
  
 
Allowed:
 
Allowed:
Line 223: Line 227:
 
* Your code may use standard Python libraries.
 
* Your code may use standard Python libraries.
 
* You may use the NumPy, SciPy, matplotlib and Pandas libraries.  Be sure you are using the correct versions.
 
* You may use the NumPy, SciPy, matplotlib and Pandas libraries.  Be sure you are using the correct versions.
* You may reuse sections of code (up to 5 lines) that you collected from other students or the internet.
 
 
* Code provided by the instructor, or allowed by the instructor to be shared.
 
* Code provided by the instructor, or allowed by the instructor to be shared.
 
* A herring.
 
* A herring.
  
 
Prohibited:
 
Prohibited:
 +
* Generating charts using a method other than Python.
 
* Any other method of reading data besides util.py
 
* Any other method of reading data besides util.py
 
* Any libraries not listed in the "allowed" section above.
 
* Any libraries not listed in the "allowed" section above.
* Any code you did not write yourself (except for the 5 line rule in the "allowed" section).
+
* Any code you did not write yourself.
  
 
==Legacy==
 
==Legacy==

Latest revision as of 23:21, 30 October 2018

Finalized

Updates / FAQs

  • Q: I want to read some other values from the data besides just adjusted close, how can I do that? A: Look carefully at util.py and you will see that you can query for other values.
  • Q: Are we only allowed one position at a time? A: You can be in one of three states: -1000 shares, +1000 shares, 0 shares.
  • Q: Are we required to trade in only 1000 share blocks (and have no more than 1000 shares long or short at a time)? A: You can trade up to 2000 shares at a time as long as you maintain the requirement of holding 1000, 0 or -1000 shares.
  • Q: Are we limited to leverage of 2.0 on the portfolio? A: There is no limit on leverage.

Overview

In this project you will develop a trading strategy using your intuition and Technical Analysis, and test it against a stock using your market simulator. In a later project, you will use your same indicators but with Machine Learning (instead of your intuition) to create a trading strategy. We hope Machine Learning will do better than your intuition, but who knows?

Template

There is no distributed template for this project. You should create a directory for your code in ml4t/manual_strategy and make a copy of util.py there. You will have access to the data in the ML4T/Data directory but you should use ONLY the code in util.py to read it.

You should create the following code files for submission. They should comprise ALL code from you that is necessary to run your evaluations.

  • indicators.py Your code that implements your indicators as functions that operate on dataframes. The "main" code in indicators.py should generate the charts that illustrate your indicators in the report.
  • marketsimcode.py An improved version of your marketsim code that accepts a "trades" data frame (instead of a file). More info on the trades data frame below. It is OK not to submit this file if you have subsumed its functionality into one of your other code files.
  • ManualStrategy.py Code implementing a ManualStrategy object (your manual strategy). It should implement testPolicy() which returns a trades data frame (see below). The main part of this code should call marketsimcode as necessary to generate the plots used in the report.
  • TheoreticallyOptimalStrategy.py Code implementing a TheoreticallyOptimalStrategy object (details below). It should implement testPolicy() which returns a trades data frame (see below). The main part of this code should call marketsimcode as necessary to generate the plots used in the report.

Note that we may not test your code, so we may not know if you didn't organize your code as recommended, but this arrangement will be required for later projects, so it is worthwhile getting it set up this way. The key requirement is that, if necessary, a TA should be able to run your code on a buffet machine and get the same results (e.g., statistics and charts) that we see in your report.

Data Details, Dates and Rules

  • Use only the data provided for this course. You are not allowed to import external data.
  • For your report, trade only the symbol JPM. This will enable us to more easily compare results.
  • You may use data from other symbols (such as SPY) to inform your strategy.
  • The in sample/development period is January 1, 2008 to December 31, 2009.
  • The out of sample/testing period is January 1, 2010 to December 31, 2011.
  • Starting cash is $100,000.
  • Allowable positions are: 1000 shares long, 1000 shares short, 0 shares.
  • Benchmark: The performance of a portfolio starting with $100,000 cash, investing in 1000 shares of JPM and holding that position.
  • There is no limit on leverage.
  • Transaction costs for ManualStrategy: Commission: $9.95, Impact: 0.005.
  • Transaction costs for TheoreticallyOptimalStrategy: Commission: $0.00, Impact: 0.00.
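
As a rough illustration of the benchmark defined above, the sketch below computes the normalized value of a buy-and-hold portfolio from a plain list of prices. The function name benchmark_values and the decision to charge no commission or impact on the single purchase are assumptions made for illustration; in your project the prices would come from util.py and the values from your market simulator.

```python
def benchmark_values(prices, start_cash=100000.0, shares=1000):
    """Normalized value of buying `shares` on day 0 and holding them.

    Assumption: no commission or impact is charged on the single purchase.
    """
    cash = start_cash - shares * prices[0]          # cash left after the buy
    values = [cash + shares * p for p in prices]    # portfolio value each day
    return [v / values[0] for v in values]          # normalize to 1.0 at start
```

Because the position never changes, the normalized curve moves only with the stock price, scaled by how large the position is relative to the starting cash.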

Part 1: Technical Indicators (20 points)

Develop and describe at least 3 and at most 5 technical indicators. You may find our lecture on time series processing to be helpful. For each indicator you should create a single, compelling chart that illustrates the indicator.

As an example, you might create a chart that shows the price history of the stock, along with "helper data" (such as upper and lower Bollinger Bands) and the value of the indicator itself. Another example: If you were using price/SMA as an indicator you would want to create a chart with 3 lines: Price, SMA, Price/SMA. In order to facilitate visualization of the indicator you might normalize the data to 1.0 at the start of the date range (i.e. divide price[t] by price[0]).

Your report description of each indicator should enable someone to reproduce it just by reading the description. We want a written description here, not code; however, it is OK to augment your written description with a pseudocode figure.

At least one of the indicators you use should be completely different from the ones presented in our lectures (i.e., something other than SMA, Bollinger Bands, or RSI).

Part 2: Theoretically Optimal Strategy (20 points)

Assume that you can see the future, but that you are constrained by the portfolio size and order limits as specified above. Create a set of trades that represents the best a strategy could possibly do during the in sample period. The reason we're having you do this is so that you will have an idea of an upper bound on performance.

The intent is for you to use adjusted close prices with the market simulator that you wrote earlier in the course. For this activity, use $0.00 and 0.0 for commission and impact, respectively.
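
One way to think about "seeing the future" is sketched below: on each day, look at tomorrow's price and hold +1000 shares if it rises or -1000 shares if it falls. This is only a sketch under the zero-cost assumption stated above; the function name optimal_trades and the use of a plain list of prices are hypothetical, and your own solution should return a trades data frame.

```python
def optimal_trades(prices, max_shares=1000):
    """Return one trade per day (positive = BUY, negative = SELL)."""
    trades = []
    holding = 0
    last = len(prices) - 1
    for today in range(len(prices)):
        if today < last and prices[today + 1] > prices[today]:
            target = max_shares       # price rises tomorrow: be long
        elif today < last and prices[today + 1] < prices[today]:
            target = -max_shares      # price falls tomorrow: be short
        else:
            target = holding          # flat price or final day: stay put
        trades.append(target - holding)
        holding = target
    return trades
```

With no commission or impact there is no penalty for flipping the position every day, which is why a one-day lookahead is enough in this sketch.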

Provide a chart that reports:

  • Benchmark (see definition above) normalized to 1.0 at the start: Blue line
  • Value of the theoretically optimal portfolio (normalized to 1.0 at the start): Black line

You should also report in text:

  • Cumulative return of the benchmark and portfolio
  • Stdev of daily returns of benchmark and portfolio
  • Mean of daily returns of benchmark and portfolio
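
The three statistics listed above can be computed from a series of daily portfolio values, as in the sketch below. The function name portfolio_stats is hypothetical, and the use of the sample standard deviation (dividing by n - 1) is an assumption; check which convention your earlier projects used.

```python
import math


def portfolio_stats(values):
    """Return (cumulative return, mean daily return, stdev of daily returns)."""
    daily = [values[i] / float(values[i - 1]) - 1.0
             for i in range(1, len(values))]
    cum_ret = values[-1] / float(values[0]) - 1.0
    mean = sum(daily) / len(daily)
    # Sample variance (n - 1 in the denominator); an assumed convention.
    var = sum((r - mean) ** 2 for r in daily) / (len(daily) - 1)
    return cum_ret, mean, math.sqrt(var)
```

The function needs at least three values so that there are two daily returns to compute a sample standard deviation from.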

Your code should implement testPolicy() as follows:

df_trades = tos.testPolicy(symbol = "AAPL", sd=dt.datetime(2010,1,1), ed=dt.datetime(2011,12,31), sv = 100000) 

The input parameters are:

  • symbol: the stock symbol to act on
  • sd: A datetime object that represents the start date
  • ed: A datetime object that represents the end date
  • sv: Start value of the portfolio

The output result is:

  • df_trades: A data frame whose values represent trades for each day. Legal values are +1000.0 indicating a BUY of 1000 shares, -1000.0 indicating a SELL of 1000 shares, and 0.0 indicating NOTHING. Values of +2000 and -2000 for trades are also legal so long as net holdings are constrained to -1000, 0, and 1000.
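
Before handing trades to your market simulator, it can help to confirm that they respect the holding limits described above. The check below is a minimal sketch that works on a plain list of daily trade values; the name holdings_are_legal is hypothetical, and with a data frame you would apply the same logic to the trade column.

```python
def holdings_are_legal(trades, limit=1000):
    """True if every trade is a legal size and holdings stay in {-limit, 0, +limit}."""
    legal_trades = (-2 * limit, -limit, 0, limit, 2 * limit)
    legal_holdings = (-limit, 0, limit)
    holding = 0
    for trade in trades:
        if trade not in legal_trades:
            return False
        holding += trade              # running net position after this trade
        if holding not in legal_holdings:
            return False
    return True
```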

Part 3: Manual Rule-Based Trader (50 points)

In ManualStrategy.py implement a set of rules using the indicators you created in Part 1 above. Devise some simple logic using your indicators to enter and exit positions in the stock.

A recommended approach is to create a single logical expression that yields a -1, 0, or 1, corresponding to a "short," "out" or "long" position. Example usage of this signal: If you are out of the stock, then a 1 would signal a BUY 1000 order. If you are long, a -1 would signal a SELL 2000 order. You don't have to follow this advice though, so long as you follow the trading rules outlined above.
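
The conversion from a daily -1/0/1 signal to trades described above can be sketched as follows. The function name signal_to_trades is hypothetical; the idea is simply that each trade is the difference between the target position and the current holding.

```python
def signal_to_trades(signals, block=1000):
    """Turn a daily -1/0/1 position signal into daily share trades."""
    trades = []
    holding = 0
    for signal in signals:
        target = signal * block           # desired position: -1000, 0 or +1000
        trades.append(target - holding)   # e.g. long -> short gives -2000
        holding = target
    return trades
```

For example, the signal sequence 0, 1, 1, -1, 0 yields the trades 0, +1000, 0, -2000, +1000, which matches the BUY 1000 and SELL 2000 cases in the text.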

For the report we want a written description, not code; however, it is OK to augment your written description with a pseudocode figure.

You should tweak your rules as best you can to get the best performance possible during the in sample period (do not peek at out of sample performance). Use your rule-based strategy to generate an orders dataframe over the in sample period, then run that dataframe through your market simulator to create a chart that includes the following components over the in sample period:

  • Benchmark (see definition above) normalized to 1.0 at the start: Blue line
  • Value of the rule-based portfolio (normalized to 1.0 at the start): Black line
  • Vertical green lines indicating LONG entry points.
  • Vertical red lines indicating SHORT entry points.

We expect that your rule-based strategy should outperform the benchmark over the in sample period.

Your code should implement the same API as above for theoretically optimal:

df_trades = ms.testPolicy(symbol = "AAPL", sd=dt.datetime(2010,1,1), ed=dt.datetime(2011,12,31), sv = 100000)

Part 4: Comparative Analysis (10 points)

Evaluate the performance of your strategy in the out of sample period. Note that you should not train or tweak your approach on this data. You should use the classification learned using the in sample data only. Create a chart that shows, out of sample:

  • Benchmark (see definition above) normalized to 1.0 at the start: Blue line
  • Performance of manual strategy: Black line
  • Both should be normalized to 1.0 at the start.

Create a table that summarizes the performance of the stock and the manual strategy for both the in sample and out of sample periods. Explain WHY any differences between the two periods occur.
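
A table like this could be assembled as sketched below. The choice of statistics (cumulative return plus the mean and standard deviation of daily returns) is an assumption made for illustration; use whichever performance criteria the assignment specifies. The function names summarize and performance_table are hypothetical.

```python
import pandas as pd

def summarize(portvals):
    """Summary statistics for a Series of daily portfolio values."""
    daily = portvals.pct_change().dropna()   # day-over-day returns
    return pd.Series({
        "cumulative_return": portvals.iloc[-1] / portvals.iloc[0] - 1.0,
        "mean_daily_return": daily.mean(),
        "std_daily_return": daily.std(),
    })

def performance_table(series_by_label):
    """One row per labelled value series, one column per statistic."""
    stats = {label: summarize(vals) for label, vals in series_by_label.items()}
    return pd.DataFrame(stats).T

# Made-up values purely to show the shape of the result.
vals = pd.Series([100.0, 110.0, 121.0])
table = performance_table({"Benchmark (in sample)": vals,
                           "Manual (in sample)": vals * 2})
print(table)
```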

Hints

Overall, I recommend the following steps in the creation of your strategies:

  • Indicator design hints:
    • For your X values: Identify and implement at least 3 technical features that you believe may be predictive of future return.
  • Rule based design:
    • Use a cascade of if statements conditioned on the indicators to identify whether a LONG condition is met.
    • Use a cascade of if statements conditioned on the indicators to identify whether a SHORT condition is met.
    • The conditions for LONG and SHORT should be mutually exclusive.
    • If neither LONG nor SHORT is triggered, the result should be DO NOTHING.
    • For debugging purposes, you may find it helpful to plot the value of the rule-based output (-1, 0, 1) versus the stock price.
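
The rule-based design hints above can be sketched as a small function. The indicators used here and the threshold values are hypothetical placeholders, not a recommended strategy; the point is only that the LONG and SHORT conditions cannot both be true, and that anything else falls through to DO NOTHING.

```python
def rule_signal(bb_value, momentum, bb_threshold=1.0, mom_threshold=0.0):
    """Return 1 (LONG), -1 (SHORT), or 0 (DO NOTHING) for a single day."""
    # LONG: price well below its moving average while momentum is positive.
    if bb_value < -bb_threshold and momentum > mom_threshold:
        return 1
    # SHORT: price well above its moving average while momentum is negative.
    elif bb_value > bb_threshold and momentum < -mom_threshold:
        return -1
    # Neither condition met (this also covers NaN inputs early in the window).
    return 0

print(rule_signal(-1.5, 0.02), rule_signal(1.5, -0.02), rule_signal(0.0, 0.1))
```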

Choosing Technical Features -- Your X Values

You should have already successfully coded the Bollinger Band feature:

bb_value[t] = (price[t] - SMA[t])/(2 * stdev[t])

Two other good features worth considering are momentum and volatility.

momentum[t] = (price[t]/price[t-N]) - 1

Volatility is just the stdev of daily returns.

It is usually worthwhile to standardize the resulting values (see https://en.wikipedia.org/wiki/Standard_score).
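
The three features and the standardization step could be written with pandas roughly as follows. This is a sketch under assumptions: prices is a Series of adjusted close prices, the window lengths are arbitrary defaults, volatility is computed over a rolling window rather than the whole period, and Series.rolling is available in your pandas version.

```python
import pandas as pd

def bb_value(prices, window=20):
    """(price - SMA) / (2 * stdev), using a rolling window."""
    sma = prices.rolling(window).mean()
    stdev = prices.rolling(window).std()
    return (prices - sma) / (2 * stdev)

def momentum(prices, n=10):
    """price[t] / price[t-n] - 1."""
    return prices / prices.shift(n) - 1

def volatility(prices, window=20):
    """Rolling standard deviation of daily returns."""
    daily_returns = prices / prices.shift(1) - 1
    return daily_returns.rolling(window).std()

def standardize(values):
    """Standard score: subtract the mean, divide by the standard deviation."""
    return (values - values.mean()) / values.std()

# Made-up prices purely to exercise the functions.
prices = pd.Series([1.0, 2.0, 4.0, 8.0])
print(momentum(prices, n=1).tolist())
```

The first few values of each indicator are NaN until the window fills, so rules built on them should treat NaN as "no signal."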

Contents of Report

Describe each indicator you use in sufficient detail that someone else could reproduce it. You should also provide a compelling description of why that indicator might work and how it could be used. You should also provide one or more charts that convey how each indicator works in a compelling way (up to 8 charts).

For the best possible strategy, describe how you created it and any assumptions you had to make to make it work. Provide a chart that illustrates its performance versus the benchmark.

For your manual strategy, describe how you combined your indicators to create an overall signal. How do you decide to enter and exit your positions and why? Why do you believe (or not) that this is an effective strategy? Provide a chart.

Compare the performance of your manual strategy versus the benchmark for the in sample and out of sample time periods. Provide a chart.

Your report should be no more than 3000 words. Your report should contain no more than 14 charts. Penalties will apply if you violate these constraints.

Expectations

  • In-sample backtests should perform very well.
  • Out-of-sample backtests should... (you should be able to complete this sentence).

What to turn in

Turn your project in via Canvas.

  • Your report as report.pdf
  • All of your code, as necessary to run as .py files.
  • Document how to run your code in readme.txt.
  • No zip files please.

Rubric

Start with 100 points, deductions as follows:

Neatness (up to 5 points deduction if not).

Bonus for exceptionally well-written reports (up to 2 points).

Indicators (up to 20 points potential deductions):

  • Is there a compelling description of why each indicator might work? (-2 for each, up to a total of 6 off)
  • Is each indicator described in sufficient detail that someone else could reproduce it? (-5 points for each if not)
  • Is there a chart for each indicator that properly illustrates its operation? (-5 points for each if not)
  • Is at least one indicator different from those provided by the instructor's code (i.e., another indicator that is not SMA, Bollinger Bands or RSI)? (-10 points if not)
  • Does the submitted code indicators.py properly reflect the indicators provided in the report? (-20 points if not)

Theoretically optimal (up to 20 points potential deductions):

  • Is the methodology described correct and convincing? (-10 points if not)
  • Is the chart correct (dates and equity curve) (-10 points if not)
  • Historic value of benchmark normalized to 1.0 with blue line (-5 if not)
  • Historic value of portfolio normalized to 1.0 with black line (-5 if not)
  • Are the reported performance criteria correct? (-2 points for each item if not)

Manual rule-based trader (up to 50 points deductions):

  • Is the trading strategy described with clarity and in sufficient detail that someone else could reproduce it? (-20)
  • Do the provided chart(s) include:
    • Historic value of benchmark normalized to 1.0 with blue line (-10 if not)
    • Historic value of portfolio normalized to 1.0 with black line (-10 if not)
    • Are the appropriate date ranges covered? (-10 if not)
    • Are vertical lines included to indicate entries? (-10 if not)
  • Does the submitted code ManualStrategy.py properly reflect the strategy provided in the report? (-20 if not)
  • Do the submitted code and report reflect an understanding of the subject matter? (up to -30 if not)
  • Does the manual trading system provide higher cumulative return than the benchmark over the in-sample time period? (-10 if not)
  • Did the student use the correct symbol? (-10 if not)
  • Did the student use the correct date periods? (-10 if not)
  • Does the strategy obey holding constraints? (-5 if not)

Comparative analysis (up to 10 points deductions):

  • Is the appropriate chart provided? (-5 for each missing element, up to a maximum of -10)
  • Are differences between the in-sample and out-of-sample performances appropriately explained? (-5 if not)
  • Do the submitted code and report reflect an understanding of the subject matter? (up to -5 if not)
  • Is the required table present and correct? (up to -5 if not)

Required, Allowed & Prohibited

Required:

  • Your project must be coded in Python 2.7.x.
  • Your code must run on one of the university-provided computers (e.g. buffet02.cc.gatech.edu).
  • Use only util.py to read data.
  • All charts must be generated in Python, and you must provide the code you used.

Allowed:

  • You can develop your code on your personal machine, but it must also run successfully on one of the university-provided machines or virtual images.
  • Your code may use standard Python libraries.
  • You may use the NumPy, SciPy, matplotlib and Pandas libraries. Be sure you are using the correct versions.
  • Code provided by the instructor, or allowed by the instructor to be shared.
  • A herring.

Prohibited:

  • Generating charts using a method other than Python.
  • Any other method of reading data besides util.py.
  • Any libraries not listed in the "allowed" section above.
  • Any code you did not write yourself.

Legacy