Difference between revisions of "Martingale"

From Quantitative Analysis Software Courses

Revision as of 13:17, 15 August 2018

Revisions

2018-8-11

  • Project is in DRAFT.

Overview

The purpose of this assignment is to get you programming in Python right away and to give you an initial feel for risk, probability and "betting." Purchasing a stock is, after all, a bet that the stock will increase in value.

In this project you will evaluate the actual betting strategy that Professor Balch uses at roulette when he goes to Las Vegas. Here it is:

  • winnings = $0
  • while winnings < $80:
    • won = False
    • bet_amount = $1
    • while not won:
      • wager bet_amount on black
      • won = result of roulette wheel spin
      • if won == True:
        • winnings = winnings + bet_amount
      • else:
        • winnings = winnings - bet_amount
        • bet_amount = bet_amount * 2
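
A minimal sketch of one round of this scheme (the inner loop), assuming the bet doubles after every loss as in the classic Martingale strategy. The function name play_round and the list of pre-decided outcomes are hypothetical; they are used here only to make the arithmetic reproducible and are not part of the template:

```python
def play_round(outcomes, start_bet=1):
    """Play until the first win in `outcomes` (True = win, False = loss).

    Returns (net_change, bets), where bets lists each wager placed.
    Assumes the bet doubles after every loss.
    """
    net_change = 0
    bet_amount = start_bet
    bets = []
    for won in outcomes:
        bets.append(bet_amount)
        if won:
            net_change += bet_amount
            break
        net_change -= bet_amount
        bet_amount *= 2
    return net_change, bets


# Three losses followed by a win: wagers of 1, 2, 4 and 8.
net, bets = play_round([False, False, False, True])
```

Under that assumption, every round that ends in a win nets exactly the starting bet, so reaching $80 takes 80 winning rounds. The cost is that the wager grows exponentially during a losing streak.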

Here are some details regarding how roulette betting works: Betting on black (or red) is considered an "even money" bet. That means that if you bet N chips and win, you keep your N chips and you win another N chips. If you bet N chips and you lose then those N chips are lost. The odds of winning or losing depend on whether you're betting at an American wheel or a European wheel. For this project we will be assuming an American wheel. You can learn more about roulette and betting here: https://en.wikipedia.org/wiki/Roulette
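
As a rough illustration of the even-money payoff and the odds, here is a small sketch. It assumes the standard American wheel layout (38 pockets, 18 of them black); the names BLACK_POCKETS, TOTAL_POCKETS and even_money_payoff are hypothetical and do not come from the template:

```python
from fractions import Fraction

# Assumed American wheel layout: numbers 1-36 plus 0 and 00.
BLACK_POCKETS = 18
TOTAL_POCKETS = 38

win_prob = Fraction(BLACK_POCKETS, TOTAL_POCKETS)


def even_money_payoff(bet, won):
    """Change in chips for an even-money bet: +bet on a win, -bet on a loss."""
    return bet if won else -bet


# Expected change in chips for a 1-chip bet on black.
expected_change = (win_prob * even_money_payoff(1, True)
                   + (1 - win_prob) * even_money_payoff(1, False))
```

With these numbers the chance of winning a bet on black is 18/38 (about 0.474), and the expected change per chip wagered is slightly negative.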

Tasks

Set up your development environment

First, if you haven't yet set up your software environment, follow the instructions here: ML4T_Software_Setup. The base directory structure for all projects in the class, including supporting data and software, will be set up correctly when you follow those instructions.

Get the template code for this project

This project is available here: File:Fall18 martingale.zip. Download and extract its contents into the base directory (ML4T_2018Fall). Once you've done this, you should see the following directory structure:

  • ML4T_2018Fall/: Root directory for course
    • data/: Location of data
    • grading/: Grading libraries used by the individual grading scripts for each assignment.
    • util.py: Common utility library. This is the only allowed way to read in stock data.
    • martingale/: Root directory for this project
      • martingale.py: Main project file to use as a template for your code.

You should change ONLY martingale.py. ALL of your code should be in that one file. Do not create additional files. It should always remain in and run from the directory ML4T_2018Fall/martingale/. Leave the copyright information at the top intact.

Insert your GT User ID and GT ID number

Revise the code functions author() and gtid() to correctly include your GT User ID and 9-digit GT ID respectively. Your GT User ID should be something like tbalch78 and your GTID is a 9-digit number. You should also update this information in the comments section at the top.
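
The two functions might look roughly like the sketch below. The exact bodies in the template are not shown on this page, so the return types are an assumption, and the number is only a placeholder:

```python
def author():
    # Replace with your own GT User ID (a login name, not the 9-digit number).
    return 'tbalch78'


def gtid():
    # Placeholder value only; replace with your own 9-digit GT ID.
    return 900000000
```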

Build a simple gambling simulator

Revise the code in martingale.py to simulate 1000 successive bets on spins of the roulette wheel using the betting scheme outlined above. You should test for the results of the betting events by making successive calls to the get_spin_result(win_prob) function. Note that you'll have to update the win_prob parameter according to the correct probability of winning. You can figure that out by thinking about how roulette works (see the Wikipedia link above).

Track your winnings by storing them in a numpy array. You might call that array winnings where winnings[0] should be set to 0 (just before the first spin). winnings[1] should reflect the total winnings after the first spin and so on.
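
One possible shape for the simulator is sketched below. The get_spin_result shown here is a hypothetical stand-in (the template's own implementation is not reproduced on this page), simulate_bets is an illustrative name, and two behaviors are assumptions: the bet doubles after each loss, and once the $80 target is reached the balance is simply carried forward for the remaining spins:

```python
import numpy as np


def get_spin_result(win_prob):
    """Hypothetical stand-in for the template function: True means the bet won."""
    return np.random.random() < win_prob


def simulate_bets(win_prob, n_spins=1000, target=80):
    # winnings[0] is the balance before the first spin;
    # winnings[i] is the balance after spin i.
    winnings = np.zeros(n_spins + 1)
    total = 0
    bet_amount = 1
    for spin in range(1, n_spins + 1):
        if total < target:  # keep betting only until the target is reached
            if get_spin_result(win_prob):
                total += bet_amount
                bet_amount = 1
            else:
                total -= bet_amount
                bet_amount *= 2  # assumed Martingale doubling after a loss
        winnings[spin] = total
    return winnings
```

The array has n_spins + 1 entries so that index 0 can hold the starting balance of 0, matching the indexing described above.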

API specification

For grading purposes, we will test ONLY assess_portfolio(), the function that computes statistics. You should implement the following API EXACTLY; if you do not, your submission will be penalized at least 20 points.

import datetime as dt
cr, adr, sddr, sr, ev = \
    assess_portfolio(sd=dt.datetime(2008,1,1), ed=dt.datetime(2009,1,1), \
    syms=['GOOG','AAPL','GLD','XOM'], \
    allocs=[0.1,0.2,0.3,0.4], \
    sv=1000000, rfr=0.0, sf=252.0, \
    gen_plot=False)

Where the returned outputs are:

  • cr: Cumulative return
  • adr: Average period return (if sf == 252 this is daily return)
  • sddr: Standard deviation of daily return
  • sr: Sharpe ratio
  • ev: End value of portfolio

The input parameters are:

  • sd: A datetime object that represents the start date
  • ed: A datetime object that represents the end date
  • syms: A list of 2 or more symbols that make up the portfolio (note that your code should support any symbol in the data directory)
  • allocs: A list of 2 or more allocations to the stocks, must sum to 1.0
  • sv: Start value of the portfolio
  • rfr: The risk free return per sample period that does not change for the entire date range (a single number, not an array).
  • sf: Sampling frequency per year
  • gen_plot: If False, do not create any output. If True it is OK to output a plot such as plot.png

Template

A template is provided for you to get started with the project. The base directory structure, util.py, data, and grading modules are provided by this zip file: File:ML4T 2018Spring.zip (also linked to from the ML4T Software Setup page). Once you have extracted that zip file, the template for this project is available here: File:Spr18 assess portfolio.zip. Download and extract its contents into the base directory. Once you've done this, you should see the following directory structure:

  • ML4T_2018Spring/: Root directory for course
    • data/: Location of data
    • grading/: Grading libraries used by the individual grading scripts for each assignment.
    • util.py: Common utility library. This is the only allowed way to read in stock data.
    • assess_portfolio/: Root directory for this project
      • analysis.py: Main project script with functions you need to implement, as well as test code
      • grade_analysis.py: Grading script for this assignment.

You should change ONLY analysis.py. ALL of your code should be in that one file. Do not create additional files. It should always remain in and run from the directory ML4T_2018Spring/assess_portfolio/. If you move it somewhere else and develop your code there, it may not run properly when auto graded.

Notes:

  • Ignore any file named __init__.py; they are used to mark directories as Python packages.
  • The util.py shipped with the template code expects the data directory to be one level up, but this will change during autograding. If you hard-code paths to data files, your code will break. Use the functions provided by util.py to read data.
  • To execute the main script, make sure your current working directory is assess_portfolio/, then run:
python analysis.py

Suggestions

Here is a suggested high-level outline for what your code needs to do:

  • Read in adjusted closing prices for the equities.
  • Normalize the prices according to the first day. The first row for each stock should have a value of 1.0 at this point.
  • Multiply each column by the allocation to the corresponding equity.
  • Multiply these normalized allocations by starting value of overall portfolio, to get position values.
  • Sum each row (i.e. all position values for each day). That is your daily portfolio value.
  • Compute statistics from the total portfolio value.
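
The outline above can be sketched with pandas as follows. The function name compute_daily_portfolio_value and the tiny price table are hypothetical and for illustration only; in the project the prices would come from util.py:

```python
import numpy as np
import pandas as pd


def compute_daily_portfolio_value(prices, allocs, sv):
    """Return the daily portfolio value for a price table (one column per symbol)."""
    normed = prices / prices.iloc[0]        # first row becomes 1.0 for every symbol
    alloced = normed * np.asarray(allocs)   # weight each column by its allocation
    pos_vals = alloced * sv                 # dollar value of each position
    return pos_vals.sum(axis=1)             # total portfolio value per day


# Made-up prices for two symbols over three days.
prices = pd.DataFrame({'A': [10.0, 11.0, 12.0], 'B': [20.0, 20.0, 10.0]})
port_val = compute_daily_portfolio_value(prices, [0.5, 0.5], 1000)
```

For this made-up table the daily portfolio values work out to 1000, 1050 and 850.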

You will need to reuse some of your code from this project for the next project. To make that task easier, we suggest that you create a helper function within your code that has the following prototype. Note that we will not be testing this part of your code directly, so it isn't essential that you follow the API exactly.

cr, adr, sddr, sr = \
    compute_portfolio_stats(prices = df_prices, \
    allocs=[0.1,0.2,0.3,0.4],\
    rfr = 0.0, sf = 252.0)

Where the returned outputs are:

  • cr: Cumulative return
  • adr: Average daily return
  • sddr: Standard deviation of daily return
  • sr: Sharpe Ratio

The input parameters are:

  • prices is a data frame or an ndarray of historical prices.
  • allocs: A list of allocations to the stocks, must sum to 1.0
  • rfr: The risk free return per sample period for the entire date range. We assume that it does not change.
  • sf: Sampling frequency per year

Here are some notes and assumptions:

  • When we compute statistics on the portfolio value, we do not include the first day.
  • We assume you are using the data provided. If you use other data your results may turn out different from ours. Yahoo's online data changes every day. We cannot build a consistent "correct" answer based on "live" Yahoo data.
  • Assume 252 trading days/year.
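
A minimal sketch of the suggested helper, following the prototype and notes above. Two details are assumptions rather than something this page specifies: the Sharpe ratio is annualized as sqrt(sf) * (adr - rfr) / sddr, and the standard deviation uses the pandas default (sample standard deviation):

```python
import numpy as np
import pandas as pd


def compute_portfolio_stats(prices, allocs, rfr=0.0, sf=252.0):
    """Return (cr, adr, sddr, sr) for a price table and a list of allocations."""
    prices = pd.DataFrame(prices)  # accepts a DataFrame or an ndarray
    normed = prices / prices.iloc[0]
    port_val = (normed * np.asarray(allocs)).sum(axis=1)

    # Daily returns; the first day has no previous value, so it is dropped.
    daily_rets = (port_val / port_val.shift(1) - 1).iloc[1:]

    cr = port_val.iloc[-1] / port_val.iloc[0] - 1
    adr = daily_rets.mean()
    sddr = daily_rets.std()  # pandas default: sample standard deviation
    sr = np.sqrt(sf) * (adr - rfr) / sddr
    return cr, adr, sddr, sr
```

The starting value is not needed here because all four statistics depend only on relative changes in portfolio value.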

Make sure your assess_portfolio() function gives correct output. Check it against the examples below.

Example output

These are actual correct examples that you can use to check your work.

Example 1

Start Date: 2010-01-01
End Date: 2010-12-31
Symbols: ['GOOG', 'AAPL', 'GLD', 'XOM']
Allocations: [0.2, 0.3, 0.4, 0.1]
Sharpe Ratio: 1.51819243641
Volatility (stdev of daily returns): 0.0100104028
Average Daily Return: 0.000957366234238
Cumulative Return: 0.255646784534

Example1.png

Example 2

Start Date: 2010-01-01
End Date: 2010-12-31
Symbols: ['AXP', 'HPQ', 'IBM', 'HNZ']
Allocations: [0.0, 0.0, 0.0, 1.0]
Sharpe Ratio: 1.30798398744
Volatility (stdev of daily returns): 0.00926153128768
Average Daily Return: 0.000763106152672
Cumulative Return: 0.198105963655

Example 2.png

Example 3

Start Date: 2010-06-01
End Date: 2010-12-31
Symbols: ['GOOG', 'AAPL', 'GLD', 'XOM']
Allocations: [0.2, 0.3, 0.4, 0.1]
Sharpe Ratio: 2.21259766672
Volatility (stdev of daily returns): 0.00929734619707
Average Daily Return: 0.00129586924366
Cumulative Return: 0.205113938792
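
As a consistency check, the reported Sharpe ratios can be recomputed from the reported average daily return and volatility. This sketch assumes the examples were generated with rfr = 0.0 and sf = 252.0 (the examples do not state either value) and that the Sharpe ratio is sqrt(sf) * (adr - rfr) / sddr; the function name sharpe_ratio is illustrative:

```python
import math

# (average daily return, volatility, reported Sharpe ratio) for Examples 1-3.
examples = [
    (0.000957366234238, 0.0100104028, 1.51819243641),
    (0.000763106152672, 0.00926153128768, 1.30798398744),
    (0.00129586924366, 0.00929734619707, 2.21259766672),
]


def sharpe_ratio(adr, sddr, rfr=0.0, sf=252.0):
    """Annualized Sharpe ratio from per-period return statistics."""
    return math.sqrt(sf) * (adr - rfr) / sddr


recomputed = [sharpe_ratio(adr, sddr) for adr, sddr, _ in examples]
```

Under those assumptions the recomputed values agree with the reported Sharpe ratios to well within the grading tolerance.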

What to turn in

Be sure to follow these instructions diligently!

Submit via Canvas:

  • Your code as analysis.py (please use this EXACT filename)

Important: We may test against OTHER symbols and other allocations, so don't hardcode the list of symbols.

Unlimited resubmissions are allowed up to the deadline for the project.

Rubric

10 test cases: We will test your code against 10 cases (10 points per case). Points for each case are awarded as follows:

  • 5 points: Sharpe ratio = reference answer +- 0.001
  • 2.5 points: Average daily return = reference answer +- 0.00001
  • 2.5 points: Cumulative return = reference answer +- 0.001
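
To self-check a result against these tolerances, a sketch like the following may help. It is not the actual grading script; the function name score_case, the dictionary keys, and the choice to treat the bounds as inclusive are all assumptions:

```python
def score_case(result, reference):
    """Return the points earned for one test case under the tolerances above."""
    points = 0.0
    if abs(result['sr'] - reference['sr']) <= 0.001:
        points += 5.0
    if abs(result['adr'] - reference['adr']) <= 0.00001:
        points += 2.5
    if abs(result['cr'] - reference['cr']) <= 0.001:
        points += 2.5
    return points


# Reference values taken from Example 1.
reference = {'sr': 1.51819243641, 'adr': 0.000957366234238, 'cr': 0.255646784534}
```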

Required, Allowed & Prohibited

Required:

  • Your project must be coded in Python 2.7.x.
  • Your code must run on one of the university-provided computers (e.g. buffet02.cc.gatech.edu).
  • Use the code for reading in historical data provided in util.py
  • Your code must run in less than 5 seconds on one of the university-provided computers.

Allowed:

  • You can develop your code on your personal machine, but it must also run successfully on one of the university-provided machines or virtual images.
  • Your code may use standard Python libraries (except os).
  • You may use the NumPy, SciPy, matplotlib and Pandas libraries. Be sure you are using the correct versions.
  • Code provided by the instructor, or allowed by the instructor to be shared.

Prohibited:

  • Any use of global variables.
  • Any libraries not listed in the "allowed" section above.
  • Use of any code other than util.py to read in data.
  • Use of Python's os module.
  • Any code you did not write yourself (except for the 5 line rule in the "allowed" section).
  • Knights who say "neeee."

Legacy versions