Difference between revisions of "CS7646 Summer 2016"

From Quantitative Analysis Software Courses
Changes in this revision (2016 Summer Schedule section, starting at line 25):

  • Week headings changed from "==Week N==" to "===Week N===" (Weeks 1 through 12)
  • The "<P>" line before each "Project N Due" entry replaced with a blank line (Projects 1 through 6)

Revision as of 16:31, 11 May 2016

Overview

This course introduces students to the real-world challenges of implementing machine-learning-based trading strategies, including the algorithmic steps from information gathering to market orders. The focus is on how to apply probabilistic machine learning approaches to trading decisions. We consider statistical approaches such as linear regression, Q-Learning, KNN, and regression trees, and how to apply them to actual stock trading situations.

This summer, the course will follow this broad outline:

  1. Brief introduction to Manipulating Financial Data in Python
  2. Introduction to Machine Learning
  3. Computational Investing
  4. Machine Learning Algorithms for Trading


Instructor information

David Byrd
Research Scientist, Interactive Media Technology Center at Georgia Tech

Course Designer

Tucker Balch, Ph.D.
Professor, Interactive Computing at Georgia Tech

----------- Current edit level -----------

2016 Summer Schedule

Class meets TTH 2-3:45 in College of Computing 102

Week 1

2016-05-17 (Tuesday)
Course Overview/Admin, Machine Learning Overview, Finance Overview

2016-05-19 (Thursday)
Project 1 assigned (Analyze a Portfolio)
Market Basics (OHLC, Candlesticks), Python Basics (data processing, files)

Week 2

Working with Numpy, Pandas
Time Series
Incomplete Data

Project 1 Due

Week 3

Plots
Portfolio Statistics
Regression vs Classification
Supervised vs Unsupervised ML

Week 4

Assessing Learners
Cross-Validation
Batch vs Online Learning
RMS, Pearson's r
Overfitting
KNN
Decision Trees

Project 2 Due

Week 5

Bagging
Boosting
Hedge Funds
Markets, Orders, Crashes, Valuation
Time Value of Money

Week 6

Capital Asset Pricing Model (CAPM)
Efficient Market Hypothesis
Fundamental Law
Efficient Frontier

Project 3 Due

Week 7

MIDTERM
Technical Analysis
Data Problems

Week 8

Holiday
Hidden Markov Models
Random Forests
Artificial Neural Networks

Project 4 Due

Week 9

Reinforcement Learning
Q-Learning
Dyna

Week 10

Nonparametric Learners
TBD

Project 5 Due

Week 11

Final Instruction Days

Week 12

Finals (no final exam in this class)

Project 6 Due

Assignments

Textbooks & Other Resources

We will use the following textbooks:

  • For Mini-course 1: Python for Finance by Yves Hilpisch amazon.com (optional)
  • For Mini-course 2: What Hedge Funds Really Do by Romero and Balch amazon.com (required)
  • For Mini-course 3: Machine Learning by Tom Mitchell (optional)
    • Buy it for $218.00 at: amazon.com
    • Buy a paperback version for $61.78: less expensive version at McGraw Hill. IMPORTANT WARNINGS: 1) They only ship to the US. 2) It takes them 3 weeks to print the book. If you order from outside the US, they will quietly accept your money but never ship the book.
    • Buy a paperback international version for $19.10: international. I am not certain about the reliability of this company.

Other resources:

Prerequisites/Co-requisites

All types of students are welcome! The Machine Learning topics might be "review" for CS students, while finance parts will be review for finance students. However, even if you have experience in these topics, you will find that we consider them in a different way than you might have seen before, in particular with an eye towards implementation for trading.

If you answer "no" to the following questions, it may be beneficial to refresh your knowledge of the prerequisite material prior to taking CS 7646:

  • Do you have a working knowledge of basic statistics, including probability distributions (such as normal and uniform) and the calculation of, and differences between, mean, median, and mode?
  • Do you understand the difference between geometric mean and arithmetic mean?
  • Do you have strong programming skills? Take this quiz compinvesti-prog-quiz if you would like help determining the strength of your programming skills.
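
As a quick self-check on the geometric versus arithmetic mean question above, here is a minimal sketch in Python. The function names and the sample returns are illustrative assumptions, not course material. It shows why the two means differ for investment returns: a gain of 50% followed by a loss of 50% averages to zero arithmetically, yet the portfolio has lost value.

```python
def arithmetic_mean(returns):
    """Plain average of per-period returns."""
    return sum(returns) / len(returns)


def geometric_mean_return(returns):
    """Constant per-period return that yields the same compounded growth."""
    growth = 1.0
    for r in returns:
        growth *= 1.0 + r  # compound each period's return
    return growth ** (1.0 / len(returns)) - 1.0


# Hypothetical two-period example: +50% then -50%.
sample_returns = [0.5, -0.5]
am = arithmetic_mean(sample_returns)
gm = geometric_mean_return(sample_returns)
print("arithmetic mean:", am)
print("geometric mean: ", gm)
```

The geometric mean comes out negative here because 1.5 × 0.5 = 0.75, so the portfolio ends below where it started even though the arithmetic mean of the returns is zero.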

Who this course is for: The course is intended for people with strong software programming experience and introductory-level knowledge of investment practice. A primary prerequisite is interest in and excitement about the stock market.

Software we'll use: In order to complete the programming assignments you will need a development environment that you're comfortable with. We use Unix, but you can also work with Windows and Mac OS environments. You must download and install a set of Python modules to your computer (including NumPy, SciPy, and Pandas).

How to install the software: ML4T Software Installation

Logistics

Grading

  • Mini-course 1: Two homework assignments and two programming projects.
  • Mini-course 2: Two programming projects, and a midterm.
  • Mini-course 3: Three programming projects (no final).

Weightings:

  • MC1-Homework-1: 2.5%
  • MC1-Homework-2: 2.5%
  • MC1-Homework-3: 2.5%
  • MC1-Project-1: 5%
  • MC1-Project-2: 5%
  • MC2-Project-1: 15%
  • MC2-Project-2: 10%
  • MC3-Homework-1: 2.5%
  • Midterm: 20%
  • MC3-Project-1: 10%
  • MC3-Project-2: 10%
  • MC3-Project-3: 15%

Thresholds:

  • A: 90% and above
  • B: 80% and above
  • C: 70% and above
  • D: 60% and above
  • F: below 60%
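
To see how the weightings and thresholds above combine, here is a minimal sketch in Python. It is an illustration only, not the official grade calculation: the function names are hypothetical, each item is assumed to be scored from 0 to 100, and missing items are assumed to count as zero.

```python
# Weights (in percent) copied from the Weightings list above.
WEIGHTS = {
    "MC1-Homework-1": 2.5,
    "MC1-Homework-2": 2.5,
    "MC1-Homework-3": 2.5,
    "MC1-Project-1": 5.0,
    "MC1-Project-2": 5.0,
    "MC2-Project-1": 15.0,
    "MC2-Project-2": 10.0,
    "MC3-Homework-1": 2.5,
    "Midterm": 20.0,
    "MC3-Project-1": 10.0,
    "MC3-Project-2": 10.0,
    "MC3-Project-3": 15.0,
}

# Cutoffs from the Thresholds list above, highest first.
THRESHOLDS = [(90.0, "A"), (80.0, "B"), (70.0, "C"), (60.0, "D")]


def course_percentage(scores):
    """Weighted course percentage; scores maps item name to a 0-100 score.

    Assumption: an item missing from scores counts as 0.
    """
    return sum(w * scores.get(item, 0.0) for item, w in WEIGHTS.items()) / 100.0


def letter_grade(percentage):
    """Map a course percentage to a letter using the cutoffs above."""
    for cutoff, letter in THRESHOLDS:
        if percentage >= cutoff:
            return letter
    return "F"


# Hypothetical student who scores 85 on every item.
example_scores = {item: 85.0 for item in WEIGHTS}
example_pct = course_percentage(example_scores)
print(example_pct, letter_grade(example_pct))
```

The listed weights sum to 100%, so a uniform score on every item maps directly to the same course percentage.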

Minimum technical requirements

  • Browser and connection speed: An up-to-date version of Chrome or Firefox is strongly recommended. We also support Internet Explorer 9 and the desktop versions of Internet Explorer 10 and above (not the metro versions). 2+ Mbps recommended; at minimum 0.768 Mbps download speed.
  • Hardware: A computer with at least 4GB of RAM and CPU speed of at least 2.5GHz.
  • OS:
    • PC: Windows XP or higher with latest updates installed
    • Mac: OS X 10.6 or higher with latest updates installed
    • Linux: Any recent distribution that has the supported browsers installed

Office hours

To be determined.

Plagiarism

In most cases I expect that all submitted code will be written by you. I will present some libraries in class that you are allowed to use (such as pandas and numpy). Otherwise, all source code, images and write-ups you provide should have been created by you alone.

Late Policy & Absences

Assignments are due at 11:55PM Eastern Time on the assignment due date. Assignments turned in after 11:55PM are considered late. Assignments may be turned in up to one day late with a 10% penalty.

Exams will be held on specific days at specific times. If there is an emergency or other issue that requires changing the date of an exam for you, you will need to have it approved by the Dean of Students. You can apply for that here:

Legacy