Difference between revisions of "MC1-Project-1"

From Quantitative Analysis Software Courses

Revision as of 20:29, 20 May 2015

Overview

A portfolio is a collection of stocks (or other investment options) and corresponding allocations of money to each of them. In order to evaluate and compare different portfolios, we first need to compute certain metrics, based on available historical data.

The primary goal of this assignment is to introduce you to this form of portfolio analysis. You will use pandas for reading in data, calculating various statistics and plotting a comparison graph.

Task

You are given the following inputs for analyzing a portfolio:

  • A date range to select the historical data to use (specified by a start and end date)
  • Symbols for equities (e.g., GOOG, AAPL, GLD, XOM)
  • Allocations to the equities at the beginning of the simulation (e.g., 0.2, 0.3, 0.4, 0.1)
  • Total starting value of the portfolio (e.g. $1,000,000)

Your goal is to compute the daily portfolio value over the given date range, and then the following statistics for the overall portfolio:

  • Cumulative return
  • Average daily return
  • Standard deviation of daily returns
  • Sharpe ratio of the overall portfolio, given the daily risk-free rate (usually 0) and the yearly sampling frequency (usually 252, the number of trading days in a year)
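
For reference, these statistics are commonly defined as shown below. This is a sketch under assumptions rather than the official specification: the symbols are introduced here for illustration (V_t is the portfolio value on day t, r_f the daily risk-free rate, k the number of samples per year, and sigma_r the standard deviation of daily returns). The function documentation in the template remains the authority on what to return.

```latex
\begin{aligned}
r_t &= \frac{V_t}{V_{t-1}} - 1, \qquad t = 1, \dots, T \\
\text{cumulative return} &= \frac{V_T}{V_0} - 1 \\
\text{average daily return} &= \bar{r} = \frac{1}{T} \sum_{t=1}^{T} r_t \\
\text{Sharpe ratio} &= \sqrt{k} \cdot \frac{\overline{r - r_f}}{\sigma_r}
\end{aligned}
```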

Template

A template is provided for you to complete the project: 01-P1.zip

Download and unzip it. It should consist of:

  • 01-P1/: Root directory for the template
    • portfolio/: Python package with all project-specific code
      • analysis.py: Main project script with functions you need to implement, as well as test code
    • data/: Directory containing all stock data, in CSV files (e.g. GOOG.csv, AAPL.csv, etc.)
    • output/: Directory to store all program outputs, including plots
    • util.py: Utility functions (do not modify these, unless instructed)

(Note: Ignore any files named __init__.py; they are used to mark directories as Python packages.)

To execute the main script, make sure your current working directory is 01-P1, then run:

python -m portfolio.analysis

This directory structure may seem a little complicated at first, but it will help you organize your code better.

Instructions

  • Open: portfolio/analysis.py
    Function documentation and code comments should help you understand what you need to do. If it is still not clear, read the detailed instructions below.
  • Look at the function: test_run()
    Here we have set up some sample inputs, which are then passed to the assess_portfolio() function:
start_date = '2010-01-01'
end_date = '2010-12-31'
symbols = ['GOOG', 'AAPL', 'GLD', 'XOM']
allocs = [0.2, 0.3, 0.4, 0.1]
start_val = 1000000
assess_portfolio(start_date, end_date, symbols, allocs, start_val)
  • Now look at: assess_portfolio()
    It first reads historical data for the given date range and symbols, and then uses three helper functions to simulate and assess the performance of the stock portfolio.
  • Your job is to implement these functions:
    • get_portfolio_value(prices, allocs, start_val): Compute daily portfolio value given stock prices, allocations and starting value.
      Ensure that it returns a pandas Series or DataFrame (with a single column).
    • get_portfolio_stats(port_val, daily_rf, samples_per_year): Calculate statistics on daily portfolio value, given daily risk-free rate and data sampling frequency.
      This function should return a tuple consisting of the following statistics (in order): cumulative return, average daily return, standard deviation of daily return, Sharpe ratio.
      Note: The return statement provided ensures this order.
    • plot_normalized_data(df, title, xlabel, ylabel): Normalize given stock prices and plot for comparison.
      This is used to create a chart that illustrates the value of your portfolio over the year and compares it to SPY.
      Note: Before plotting, portfolio and SPY values should be normalized to 1.0 at the beginning of the period. Also, use the plot_data() utility function to generate and show your plot.
  • Refer to each function's documentation (in triple quotes after the def line) for details about the parameters and expected return values.
  • Implement each function; feel free to modify test_run() and assess_portfolio() to write additional tests (e.g. to call and inspect the functions individually).
  • Save the comparison plot as comparison.png (you should be able to do this directly from the plot window).
  • Submit your final analysis.py along with comparison.png once you are confident that your functions are working as expected.

Note: In order to avoid issues with grading, make sure your functions return exactly the value(s) that are defined in the respective function documentation. Also, turn off all printing and plotting from within these functions, unless instructed (e.g. plot_normalized_data() should generate a plot).
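
As an illustration of the normalization that plot_normalized_data() is asked to perform, here is a minimal sketch. The values and the column names 'Portfolio' and 'SPY' are made up for the example, and the call to plot_data() is left out because its exact signature is defined in util.py, not here.

```python
import pandas as pd

# Made-up values for illustration only (not taken from the provided data)
dates = pd.date_range('2010-01-04', periods=3)
port_val = pd.Series([1000000.0, 1010000.0, 990000.0], index=dates)
spy = pd.Series([100.0, 102.0, 101.0], index=dates)

# Put both series side by side so they can be compared on one chart
df = pd.concat([port_val, spy], keys=['Portfolio', 'SPY'], axis=1)

# Divide every row by the first row: each column now starts at 1.0
normed = df / df.iloc[0]
print(normed)
```

The normalized frame is what you would then hand to the plotting utility, so that the portfolio and SPY share a common starting point despite their very different scales.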

Suggestions

Here is a suggested high-level outline for what your script needs to do:

  • Read in adjusted closing prices for the 4 equities.
  • Normalize the prices according to the first day. The first row for each stock should have a value of 1.0 at this point.
  • Multiply each column by the allocation to the corresponding equity.
  • Multiply these normalized allocations by the starting value of the overall portfolio to get position values.
  • Sum each row (i.e. all position values for each day). That is your daily portfolio value.
  • Compute statistics from the total portfolio value.
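
The outline above can be sketched roughly as follows. This is one possible shape for get_portfolio_value(), not the reference solution; the symbols 'AAA' and 'BBB' and their prices are hypothetical and stand in for the adjusted closing prices you would read from the provided CSV files.

```python
import pandas as pd

def get_portfolio_value(prices, allocs, start_val):
    """Sketch: daily portfolio value from prices, allocations and starting value."""
    normed = prices / prices.iloc[0]   # first row becomes 1.0 for every symbol
    alloced = normed * allocs          # weight each column by its allocation
    pos_vals = alloced * start_val     # value of each position on each day
    return pos_vals.sum(axis=1)        # Series: total portfolio value per day

# Hypothetical prices for two made-up symbols
dates = pd.date_range('2010-01-04', periods=3)
prices = pd.DataFrame({'AAA': [10.0, 11.0, 12.0],
                       'BBB': [20.0, 20.0, 10.0]}, index=dates)
port_val = get_portfolio_value(prices, [0.5, 0.5], 1000.0)
print(port_val)
```

On the first day the result equals the starting value, which is a quick sanity check for your own implementation.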

Here are some notes and assumptions:

  • When we compute statistics on the portfolio value, we do not include the first day, since there is no previous day from which to compute its daily return.
  • We assume you are using the data provided. If you use other data, your results may differ from ours. Yahoo's online data changes every day, so we could not build a consistent "correct" answer based on "live" Yahoo data.
  • Assume 252 trading days/year.
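
A minimal sketch of how these notes might translate into get_portfolio_stats() is shown below. It is an illustration under assumptions, not the graded implementation: in particular, it uses the pandas default for standard deviation (sample standard deviation, ddof=1), and the portfolio values are made up.

```python
import numpy as np
import pandas as pd

def get_portfolio_stats(port_val, daily_rf, samples_per_year):
    """Sketch: cumulative return, avg daily return, std of daily return, Sharpe ratio."""
    daily_ret = port_val / port_val.shift(1) - 1
    daily_ret = daily_ret.iloc[1:]            # exclude the first day (no prior day)
    cum_ret = port_val.iloc[-1] / port_val.iloc[0] - 1
    avg_daily_ret = daily_ret.mean()
    std_daily_ret = daily_ret.std()           # assumption: sample std (ddof=1)
    sharpe_ratio = (np.sqrt(samples_per_year)
                    * (daily_ret - daily_rf).mean() / std_daily_ret)
    return cum_ret, avg_daily_ret, std_daily_ret, sharpe_ratio

# Made-up portfolio values for illustration only
dates = pd.date_range('2010-01-04', periods=3)
port_val = pd.Series([1000.0, 1100.0, 1045.0], index=dates)
cr, adr, sddr, sr = get_portfolio_stats(port_val, 0.0, 252)
print(cr, adr, sddr, sr)
```

If your numbers differ slightly from the examples below, the choice between sample and population standard deviation is one place worth checking.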

Make sure your assess_portfolio() function gives correct output. Check it against the examples below.

Example output

These are actual correct examples that you can use to check your work.

Example 1

Start Date: 2010-01-01
End Date: 2010-12-31
Symbols: ['GOOG', 'AAPL', 'GLD', 'XOM']
Allocations: [0.2, 0.3, 0.4, 0.1]
Sharpe Ratio: 1.51819243641
Volatility (stdev of daily returns): 0.0100104028
Average Daily Return: 0.000957366234238
Cumulative Return: 0.255646784534

TODO: Include example chart here

Example 2

Start Date: 2010-01-01
End Date: 2010-12-31
Symbols: ['AXP', 'HPQ', 'IBM', 'HNZ']
Allocations: [0.0, 0.0, 0.0, 1.0]
Sharpe Ratio: 1.29889334008
Volatility (stdev of daily returns): 0.00924299255937
Average Daily Return: 0.000756285585593
Cumulative Return: 0.1960583568

TODO: Include example chart here

Minor differences in float values may arise due to different implementations.
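
Because of such differences, it is safer to compare your results against the examples with a tolerance than with exact equality. The sketch below uses the Sharpe ratio from Example 1; the helper name close_enough and the tolerance of 1e-6 are assumptions chosen for illustration, not the grading criteria.

```python
import math

EXPECTED_SHARPE = 1.51819243641  # Sharpe ratio listed in Example 1

def close_enough(actual, expected, rel_tol=1e-6):
    """Return True if actual matches expected within a relative tolerance."""
    return math.isclose(actual, expected, rel_tol=rel_tol)

# Hypothetical outputs from two different implementations
print(close_enough(1.5181924, EXPECTED_SHARPE))  # tiny rounding difference
print(close_enough(1.52, EXPECTED_SHARPE))       # genuinely different result
```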

What to turn in

Via T-Square, turn in attachments only:

  • Your code as analysis.py
  • Your plot of daily portfolio value and SPY as comparison.png

TODO: See if chart subm