Difference between revisions of "Manual strategy"

From Quantitative Analysis Software Courses

Revision as of 16:51, 27 October 2017

DRAFT

This assignment is under revision. This notice will be removed once it is final.

Updates / FAQs

  • Q: I want to read some other values from the data besides just adjusted close, how can I do that? A: Look carefully at util.py and you will see that you can query for other values.
  • Q: Are we only allowed one position at a time? A: You can be in one of three states: -1000 shares, +1000 shares, 0 shares.
  • Q: Are we required to trade only in 1000-share blocks (and hold no more than 1000 shares long or short at a time)? A: You can trade up to 2000 shares at a time as long as you maintain the requirement of holding 1000, 0 or -1000 shares.
  • Q: Are we limited to leverage of 2.0 on the portfolio? A: There is no limit on leverage.

Overview

In this project you will develop a trading strategy using your intuition and Technical Analysis, and test it against a stock using your market simulator. In a later project, you will use the same indicators but with Machine Learning (instead of your intuition) to create a trading strategy. We hope Machine Learning will do better than your intuition, but who knows?

Template

You should create the following code files for submission. They should comprise ALL code from you that is necessary to run your evaluations.

  • indicators.py Your code that implements your indicators as functions that operate on dataframes. The "main" code in indicators.py should generate the charts that illustrate your indicators in the report.
  • marketsimcode.py An improved version of your marketsim code that accepts a "trades" data frame (instead of a file). More info on the trades data frame below.
  • ManualStrategy.py Code implementing a ManualStrategy object (your manual strategy). It should implement testPolicy() which returns a trades data frame (see below). The main part of this code should call marketsimcode as necessary to generate the plots used in the report.
  • BestPossibleStrategy.py Code implementing a BestPossibleStrategy object (details below). It should implement testPolicy() which returns a trades data frame (see below). The main part of this code should call marketsimcode as necessary to generate the plots used in the report.

Data Details, Dates and Rules

  • Use only the data provided for this course. You are not allowed to import external data.
  • For your report, trade only the symbol JPM. This will enable us to more easily compare results.
  • You may use data from other symbols (such as SPY) to inform your strategy.
  • The in sample/development period is January 1, 2008 to December 31, 2009.
  • The out of sample/testing period is January 1, 2010 to December 31, 2011.
  • Starting cash is $100,000.
  • Allowable positions are: 1000 shares long, 1000 shares short, 0 shares.
  • Benchmark: The performance of a portfolio starting with $100,000 cash, investing in 1000 shares of JPM and holding that position.
  • There is no limit on leverage.
  • Transaction costs for ManualStrategy: Commission: $9.95, Impact: 0.005.
  • Transaction costs for BestPossibleStrategy: Commission: $0.00, Impact: 0.00.
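
The cost rules above could be applied inside marketsimcode.py roughly as follows. This is a minimal sketch, not the course's reference implementation: the function name compute_portvals, its arguments, and the convention that impact moves the fill price against the trader are all assumptions. For brevity it takes a single column of the trades data frame as a Series.

```python
import pandas as pd


def compute_portvals(trades, prices, start_val=100000.0,
                     commission=9.95, impact=0.005):
    """Return the daily portfolio value for one symbol.

    trades: pd.Series of shares traded per day (+ buy, - sell, 0 nothing)
    prices: pd.Series of adjusted close prices on the same index
    Assumption: impact shifts the fill price against the trader.
    """
    cash = float(start_val)
    shares = 0.0
    values = []
    for date, price in prices.items():
        qty = float(trades.get(date, 0.0))
        if qty != 0.0:
            # Buys fill above the quoted price, sells fill below it.
            if qty > 0:
                fill = price * (1.0 + impact)
            else:
                fill = price * (1.0 - impact)
            cash -= qty * fill + commission
            shares += qty
        # Holdings are marked at the unadjusted closing price.
        values.append(cash + shares * price)
    return pd.Series(values, index=prices.index)
```

Under these assumptions, the benchmark corresponds to a trades series that is +1000 on the first trading day and 0 on every other day.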

Part 1: Technical Indicators (20 points)

Develop and describe at least 3 and at most 5 technical indicators. You may find our lecture on time series processing to be helpful. For each indicator you should create a single, compelling chart that illustrates the indicator.

As an example, you might create a chart that shows the price history of the stock, along with "helper data" (such as upper and lower Bollinger Bands) and the value of the indicator itself. Another example: if you were using price/SMA as an indicator, you would want to create a chart with 3 lines: Price, SMA, Price/SMA. To make the indicator easier to visualize, you might normalize the data to 1.0 at the start of the date range (i.e., divide price[t] by price[0]).
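
The three lines of the price/SMA example could be produced along these lines. This is a sketch only: the function name price_sma_ratio and the 20-day window are illustrative assumptions, not values required by the assignment.

```python
import pandas as pd


def price_sma_ratio(prices, window=20):
    """Return normalized price, its simple moving average, and price/SMA."""
    normed = prices / prices.iloc[0]              # price[t] / price[0]
    sma = normed.rolling(window=window).mean()    # NaN until `window` days exist
    return pd.DataFrame({"Price": normed,
                         "SMA": sma,
                         "Price/SMA": normed / sma})
```

The first window - 1 rows of SMA and Price/SMA are NaN, so you may want to load some price history before the start date if the chart should cover the full period.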

Your report description of each indicator should enable someone to reproduce it just by reading the description. We want a written description here, not code; however, it is OK to augment your written description with a pseudocode figure.

At least one of the indicators you use should be completely different from the ones presented in our lectures (i.e., something other than SMA, Bollinger Bands, or RSI).

Part 2: Best Possible Strategy (5 points)

Assume that you can see the future, but that you are constrained by the portfolio size and order limits as specified above. Create a set of trades that represents the best a strategy could possibly do during the in sample period. The reason we're having you do this is so that you will have an idea of an upper bound on performance.

The intent is for you to use adjusted close prices with the market simulator that you wrote earlier in the course. For this activity, use $0.00 for commission and 0.0 for impact.
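
One simple way to build such a set of trades, assuming zero transaction costs, is to hold +1000 shares whenever tomorrow's price is higher than today's and -1000 shares whenever it is lower. The sketch below illustrates that idea; the function name best_possible_trades and the choice to go flat on the final day are assumptions.

```python
import numpy as np
import pandas as pd


def best_possible_trades(prices, max_shares=1000):
    """Build look-ahead trades from a pd.Series of adjusted close prices."""
    # Tomorrow's price minus today's; the last day has no future price.
    next_change = prices.shift(-1) - prices
    # Desired holdings: long before a rise, short before a fall, flat otherwise.
    target = np.sign(next_change.fillna(0.0)) * max_shares
    # Trades are the day-to-day change in holdings, starting from 0 shares.
    return target - target.shift(1).fillna(0.0)
```

Because each day's profit is holdings times the next-day price change, choosing the sign day by day maximizes the total when costs are zero. Individual trades can reach +2000 or -2000 while holdings stay within -1000, 0 and +1000.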

Provide a chart that reports:

  • Benchmark (see definition above) normalized to 1.0 at the start: Blue line
  • Value of the best possible portfolio (normalized to 1.0 at the start): Black line

You should also report in text:

  • Cumulative return of the benchmark and portfolio
  • Stdev of daily returns of benchmark and portfolio
  • Mean of daily returns of benchmark and portfolio
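
These three statistics can be computed from a series of daily portfolio values. The following is a sketch with an assumed function name, portfolio_stats; note that pandas reports the sample standard deviation by default, which may or may not be the convention your report should use.

```python
import pandas as pd


def portfolio_stats(portvals):
    """Return (cumulative return, stdev of daily returns, mean of daily returns)."""
    daily = portvals / portvals.shift(1) - 1.0
    daily = daily.iloc[1:]                        # the first day has no return
    cum_ret = portvals.iloc[-1] / portvals.iloc[0] - 1.0
    # Series.std() uses ddof=1 (sample standard deviation) by default.
    return cum_ret, daily.std(), daily.mean()
```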

Your code should implement testPolicy() as follows:

df_trades = bps.testPolicy(symbol = "AAPL", sd=dt.datetime(2010,1,1), ed=dt.datetime(2011,12,31), sv = 100000) 

The input parameters are:

  • symbol: the stock symbol to act on
  • sd: A datetime object that represents the start date
  • ed: A datetime object that represents the end date
  • sv: Start value of the portfolio

The output result is:

  • df_trades: A data frame whose values represent trades for each day. Legal values are +1000.0 indicating a BUY of 1000 shares, -1000.0 indicating a SELL of 1000 shares, and 0.0 indicating NOTHING. Values of +2000 and -2000 for trades are also legal so long as net holdings are constrained to -1000, 0, and 1000.
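
A quick sanity check on a trades data frame is to accumulate the trades and confirm that holdings never leave the legal set. The helper below is a hypothetical sketch (the name holdings_are_legal is not part of the assignment) and assumes the frame holds float share counts.

```python
import pandas as pd


def holdings_are_legal(df_trades, allowed=(-1000.0, 0.0, 1000.0)):
    """Return True if cumulative holdings always stay within `allowed`."""
    holdings = df_trades.cumsum()                 # running position per column
    return bool(holdings.isin(list(allowed)).values.all())
```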

Part 3: Manual Rule-Based Trader (20 points)

Devise a set of rules using the indicators you created in Part 1 above. Devise some simple logic using your indicators to enter and exit positions in the stock.

A recommended approach is to create a single logical expression that yields a -1, 0, or 1, corresponding to a "short," "out" or "long" position. Example usage of this signal: If you are out of the stock, then a 1 would signal a BUY 1000 order. If you are long, a -1 would signal a SELL 2000 order. You don't have to follow this advice though, so long as you follow the trading rules outlined above.
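
If the signal is read as the desired position, each trade is simply the difference between the desired and current holdings. A minimal sketch, with the hypothetical name signal_to_trades:

```python
def signal_to_trades(signals, max_shares=1000):
    """Convert -1/0/1 position signals into a list of share trades."""
    holdings = 0
    trades = []
    for s in signals:
        target = s * max_shares        # desired position: short, out or long
        trades.append(target - holdings)
        holdings = target
    return trades
```

For example, moving from long (+1000) to a -1 signal produces a trade of -2000, which matches the SELL 2000 order described above.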

We want a written description here in the report, not code; however, it is OK to augment your written description with a pseudocode figure.

You should tweak your rules as best you can to get the best performance possible during the in sample period (do not peek at out of sample performance). Use your rule-based strategy to generate an orders file over the in sample period, then run that file through your market simulator to create a chart that includes the following components over the in sample period:

  • Benchmark (see definition above) normalized to 1.0 at the start: Blue line
  • Value of the rule-based portfolio (normalized to 1.0 at the start): Black line
  • Vertical green lines indicating LONG entry points.
  • Vertical red lines indicating SHORT entry points.

We expect that your rule-based strategy should outperform the benchmark over the in sample period.

Part 6: Comparative Analysis (10 points)

Evaluate the performance of your strategy in the out of sample period. Note that you should not train or tweak your approach on this data. You should use the classification learned using the in sample data only. Create a chart that shows, out of sample:

  • Benchmark (see definition above) normalized to 1.0 at the start: Blue line
  • Performance of manual strategy: Black line
  • Both should be normalized to 1.0 at the start.

Create a table that summarizes the performance of the stock and the manual strategy for both in sample and out of sample periods. Utilize your experience in this class to determine which factors are best to use for comparing these strategies. If performance out of sample is worse than in sample, do your best to explain why. Also if the manual and ML strategies perform substantially differently, explain why. Is one method or the other more or less susceptible to the same underlying flaw? Why or why not?

Deliverables:

  • Descriptive text (1 or 2 pages including figures)
  • 1 chart

Hints

Overall, I recommend the following steps in the creation of your strategies:

  • Indicator design hints:
    • For your X values: Identify and implement at least 3 technical features that you believe may be predictive of future return.
  • Rule based design:
    • Use a cascade of if statements conditioned on the indicators to identify whether a LONG condition is met.
    • Use a cascade of if statements conditioned on the indicators to identify whether a SHORT condition is met.
    • The conditions for LONG and SHORT should be mutually exclusive.
    • If neither LONG nor SHORT is triggered, the result should be DO NOTHING.
    • For debugging purposes, you may find it helpful to plot the value of the rule-based output (-1, 0, 1) versus the stock price.
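
The rule-based design hints might translate into a cascade like the one below. It is a sketch only: the two indicators used and every threshold are hypothetical and untuned, chosen so that the LONG and SHORT conditions cannot both be true.

```python
def rule_signal(bb_value, price_sma):
    """Return 1 (LONG), -1 (SHORT) or 0 (DO NOTHING) for one day."""
    # Hypothetical thresholds; tune your own on the in sample period only.
    if bb_value < -1.0:
        if price_sma < 0.95:
            return 1           # price looks stretched below its recent range
    if bb_value > 1.0:
        if price_sma > 1.05:
            return -1          # price looks stretched above its recent range
    return 0
```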

Choosing Technical Features -- Your X Values

You should have already successfully coded the Bollinger Band feature:

bb_value[t] = (price[t] - SMA[t])/(2 * stdev[t])
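
For reference, the formula above can be written with pandas rolling windows. The function name bollinger_value and the 20-day window in this sketch are assumptions.

```python
import pandas as pd


def bollinger_value(prices, window=20):
    """Compute (price - SMA) / (2 * stdev) over a rolling window."""
    sma = prices.rolling(window=window).mean()
    stdev = prices.rolling(window=window).std()   # sample stdev (ddof=1)
    return (prices - sma) / (2.0 * stdev)
```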

Two other good features worth considering are momentum and volatility.

momentum[t] = (price[t]/price[t-N]) - 1

Volatility is just the stdev of daily returns.

You still need to standardize the resulting values.
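
A sketch of these two features and of one common way to standardize (subtract the mean, divide by the standard deviation) follows. The function names and default window lengths are assumptions.

```python
import pandas as pd


def momentum(prices, n=10):
    """momentum[t] = price[t] / price[t-n] - 1"""
    return prices / prices.shift(n) - 1.0


def volatility(prices, window=20):
    """Rolling standard deviation of daily returns."""
    daily_returns = prices / prices.shift(1) - 1.0
    return daily_returns.rolling(window=window).std()


def standardize(values):
    """Rescale a series to zero mean and unit standard deviation."""
    return (values - values.mean()) / values.std()
```

If you standardize out of sample data, consider reusing the mean and standard deviation computed on the in sample period so that no future information leaks into the features.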

Choosing Y

Your code should classify based on the 21 day change in price. You need to build a new Y that reflects the 21 day change and aligns with the current date. Here is pseudocode for the calculation of Y:

ret = (price[t+21]/price[t]) - 1.0
if ret > YBUY:
    Y[t] = +1 # LONG
elif ret < YSELL:
    Y[t] = -1 # SHORT
else:
    Y[t] = 0  # DO NOTHING

If you select Y in this manner and use it for training, your learner will classify 21 day returns.
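
A vectorized version of the Y calculation might look like this. It is a sketch under assumptions: the function name make_labels is made up, and the YBUY and YSELL defaults are placeholders rather than recommended values.

```python
import pandas as pd


def make_labels(prices, horizon=21, ybuy=0.02, ysell=-0.02):
    """Label each day +1, -1 or 0 from the return over the next `horizon` days."""
    ret = prices.shift(-horizon) / prices - 1.0   # price[t+horizon]/price[t] - 1
    y = pd.Series(0, index=prices.index)
    y[ret > ybuy] = 1                             # LONG
    y[ret < ysell] = -1                           # SHORT
    # The last `horizon` days have no future price (NaN return) and stay 0.
    return y
```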

Template and Data

There is no github template for this project. You should create a directory for your code in ml4t/mc3-p3 and make a copy of util.py there. You should also copy into that directory your learner code and your market simulator code. You will have access to the data in the ML4T/Data directory but you should use ONLY the code in util.py to read it.

Contents of Report

  • Your report should be no more than 3000 words. Your report should contain no more than 14 charts. Penalties will apply if you violate these constraints.
  • Include charts and text as identified in the sections above.

Expectations

  • In-sample AAPL backtests should perform very well -- The ML version should do better than the manual version.
  • Out-of-sample AAPL backtests should... (you should be able to complete this sentence).

What to turn in

Turn your project in via t-square.

  • Your report as report.pdf
  • All of your code, as necessary to run as .py files.
  • Document how to run your code in readme.txt.
  • No zip files please.

Rubric

Start with 100%, deductions as follows:

Indicators (up to 20% potential deductions):

  • Is each indicator described in sufficient detail that someone else could reproduce it? (-5% for each if not)
  • Is there a chart for each indicator that properly illustrates its operation? (-5% for each if not)
  • Is at least one indicator different from those provided by the instructor's code (i.e., another indicator that is not SMA, Bollinger Bands or RSI)? (-10% if not)
  • Does the submitted code indicators.py properly reflect the indicators provided in the report? (-20% if not)

Best possible (up to 5% potential deductions):

  • Is the chart correct (dates and equity curve)? (-5% if not)
  • Is the reported performance correct within 5%? (-1% for each item if not)

Manual rule-based trader (up to 20% deductions):

  • Is the trading strategy described with clarity and in sufficient detail that someone else could reproduce it? (-10%)
  • Does the provided chart include:
    • Historic value of benchmark normalized to 1.0 with black line (-5% if not)
    • Historic value of portfolio normalized to 1.0 with blue line (-10% if not)
    • Are the appropriate date ranges covered? (-5% if not)
    • Are vertical lines included to indicate entries (-10% if not)
  • Does the submitted code rule_based.py properly reflect the strategy provided in the report? (-20% if not)
  • Does the manual trading system provide higher cumulative return than the benchmark over the in-sample time period? (-5% if not)

ML-based trader (up to 30% deductions):

  • Is the ML strategy described with clarity and in sufficient detail that someone else could reproduce it? (-10%)
  • Are modifications/tweaks to the basic decision tree learner fully described (-10%)
  • Does the methodology utilize a classification-based learner? (-30%)
  • Does the provided chart include:
    • Historic value of benchmark normalized to 1.0 with black line (-5% if not)
    • Historic value of rule-based portfolio normalized to 1.0 with blue line (-5% if not)
    • Historic value of ML-based portfolio normalized to 1.0 with green line (-10% if not)
    • Are the appropriate date ranges covered? (-5% if not)
    • Are vertical lines included to indicate entry (-10% if not)
  • Does the submitted code ML_based.py properly reflect the strategy provided in the report? (-30% if not)
  • Does the ML trading system provide a cumulative return 1.5x higher than the benchmark over the in-sample time period? (-5% if not)

Data visualization (up to 15% deductions):

  • Is the X data reported in all three charts the same? (-5% if not)
  • Is the X data standardized? (-5% if not)
  • Is the Y data in the train and query plots similar (-5% if not)

Comparative analysis (up to 10% deductions):

  • Is the appropriate chart provided (-5% for each missing element, up to a maximum of -10%)
  • Is there a table that reports in-sample and out-of-sample data for the baseline (just the stock), rule-based, and ML-based strategies? (-5% for each missing element)
  • Are differences between the in-sample and out-of-sample performances appropriately explained (-5%)

Required, Allowed & Prohibited

Required:

  • Your project must be coded in Python 2.7.x.
  • Your code must run on one of the university-provided computers (e.g. buffet02.cc.gatech.edu).
  • Use only util.py to read data. If you want to read items other than adjusted close, modify util.py to do it, and submit your new version with your code.
  • All charts must be generated in Python, and you must provide the code you used.

Allowed:

  • You can develop your code on your personal machine, but it must also run successfully on one of the university provided machines or virtual images.
  • Your code may use standard Python libraries.
  • You may use the NumPy, SciPy, matplotlib and Pandas libraries. Be sure you are using the correct versions.
  • You may reuse sections of code (up to 5 lines) that you collected from other students or the internet.
  • Code provided by the instructor, or allowed by the instructor to be shared.
  • A herring.

Prohibited:

  • Any other method of reading data besides util.py
  • Any libraries not listed in the "allowed" section above.
  • Any code you did not write yourself (except for the 5 line rule in the "allowed" section).

Legacy