Difference between revisions of "Machine Learning Algorithms for Trading"

From Quantitative Analysis Software Courses
Line 38:
 ==Lesson 3: Assessing a learning algorithm==
 *Now that we have two, (linreg & KNN), let's compare them
+**Pros and cons of LinReg versus KNN
+*Cross validation,
+*roll forward cross validation
+**Use all data versus most recent data
+**Online learning
+*How long to take to learn versus query
+*Batch versus online
 *RMS error
 *Scatterplot predict vs actual
 *Corrcoef
 *Overfitting
-*Cross validation
 
 ==Lesson 4: Ensemble learners, bagging and boosting==

Revision as of 12:22, 22 July 2015

Lesson 1: How Machine Learning is used at a hedge fund

  • Introduce the problem early
  • Overview of use and backtesting
    • Out of sample
    • Roll forward cross validation
  • Methods
    • Linear regression
    • KNN regression
    • Decision trees / Random Forest regression (considering dropping)
  • Quiz: which algorithm makes most sense here?
  • Supervised ML (intent is that the treatment here is light)
    • Use: Regression
    • Use: Classification
    • Model type: Parametric
    • Model type: Instance-based
  • Quiz: What's the next point?
  • Problems with regression for finance
    • Hint at reinforcement learning
  • Introduce the problem we will focus on in the rest of the class, namely:
    • Example data: the learner will train on one particular year (2012)
    • It will then be tested on the following two years (2013 and 2014)
    • It will be "easy" data that has obvious patterns
    • You will create trades.txt and run them through your backtester
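
The out-of-sample setup described above (learn on 2012, test on 2013 and 2014) could be sketched as follows. This is only an illustration: it uses calendar days as a stand-in for the real date index, which is an assumption, since actual trading data would skip weekends and holidays. The variable names are hypothetical.

```python
import numpy as np

# Assumption: calendar days stand in for the real trading-day index.
dates = np.arange("2012-01-01", "2015-01-01", dtype="datetime64[D]")

# Convert each date to its calendar year (datetime64[Y] counts years since 1970).
years = dates.astype("datetime64[Y]").astype(int) + 1970

# In-sample period: 2012. Out-of-sample period: 2013 and 2014.
train_mask = years == 2012
test_mask = (years == 2013) | (years == 2014)

train_dates = dates[train_mask]
test_dates = dates[test_mask]
```

The key property is that every training date precedes every test date, so the learner never sees the period it is evaluated on.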

Lesson 2: Regression

[note: need to create fake stock data that has embedded patterns]

  • Overview of how it fits into overall trading process
  • Definition of the problem 1
    • Black box diagram
    • Training: Xtrain, Ytrain
    • Using: query with X
  • Definition of the problem 2: APIs
    • constructor
    • addEvidence(X,Y)
    • query(X)
  • How to implement linear regression
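
One possible shape for a learner with the constructor / addEvidence(X,Y) / query(X) API listed above is sketched here. The class name `LinRegLearner` and the use of NumPy's least-squares solver are assumptions made for illustration; the course may implement it differently. The data is synthetic, with an exact linear pattern embedded.

```python
import numpy as np


class LinRegLearner:
    """Hypothetical parametric learner following the outline's API."""

    def __init__(self):
        # Coefficients are unknown until addEvidence is called.
        self.coefs = None

    def addEvidence(self, X, Y):
        """Fit by ordinary least squares. X: (n, d) features, Y: (n,) targets."""
        X = np.asarray(X, dtype=float)
        Y = np.asarray(Y, dtype=float)
        # Append a column of ones so the model also learns an intercept.
        A = np.hstack([X, np.ones((X.shape[0], 1))])
        self.coefs, *_ = np.linalg.lstsq(A, Y, rcond=None)

    def query(self, X):
        """Predict one Y value for each row of X."""
        X = np.asarray(X, dtype=float)
        A = np.hstack([X, np.ones((X.shape[0], 1))])
        return A @ self.coefs


# Synthetic training data with an embedded pattern: y = 2*x0 - 3*x1 + 1
rng = np.random.default_rng(0)
Xtrain = rng.normal(size=(50, 2))
Ytrain = 2.0 * Xtrain[:, 0] - 3.0 * Xtrain[:, 1] + 1.0

learner = LinRegLearner()
learner.addEvidence(Xtrain, Ytrain)
pred = learner.query(np.array([[1.0, 1.0]]))
```

Because the model is parametric, the training data can be discarded after addEvidence; only the coefficients are kept for querying.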

Lesson 3: Assessing a learning algorithm

  • Now that we have two learners (LinReg and KNN), let's compare them
    • Pros and cons of LinReg versus KNN
  • Cross validation
  • Roll forward cross validation
    • Use all data versus most recent data
    • Online learning
  • How long it takes to learn versus query
  • Batch versus online
  • RMS error
  • Scatterplot predict vs actual
  • Correlation coefficient (corrcoef)
  • Overfitting
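
The assessment tools listed above can be sketched in a few lines. The function names (`rms_error`, `correlation`, `roll_forward_splits`) and the fold layout are assumptions for illustration, not the course's own code. The `use_all_data` flag is a hypothetical way to express the "use all data versus most recent data" choice.

```python
import numpy as np


def rms_error(y_true, y_pred):
    """Root mean squared error between actual and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def correlation(y_true, y_pred):
    """Pearson correlation between actual and predicted values."""
    return float(np.corrcoef(y_true, y_pred)[0, 1])


def roll_forward_splits(n_samples, n_folds, use_all_data=True):
    """Yield (train_idx, test_idx) pairs in which training precedes testing in time.

    use_all_data=True trains on everything before the test block;
    use_all_data=False trains only on the block immediately before it.
    """
    fold = n_samples // (n_folds + 1)
    for k in range(1, n_folds + 1):
        train_start = 0 if use_all_data else (k - 1) * fold
        train_idx = np.arange(train_start, k * fold)
        test_idx = np.arange(k * fold, (k + 1) * fold)
        yield train_idx, test_idx


# Example: 12 time-ordered samples, 3 roll-forward folds.
expanding = [(tr.tolist(), te.tolist()) for tr, te in roll_forward_splits(12, 3)]
recent = [(tr.tolist(), te.tolist())
          for tr, te in roll_forward_splits(12, 3, use_all_data=False)]

err = rms_error([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
corr = correlation([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
```

Unlike ordinary cross validation, no fold here ever trains on data that comes after its test block, which avoids peeking into the future.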

Lesson 4: Ensemble learners, bagging and boosting

[note: perhaps include decision trees]
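
A minimal sketch of bagging, assuming base learners that follow the addEvidence/query API from Lesson 2. Both class names (`KNNLearner`, `BagLearner`) and all parameters are hypothetical, and the KNN base learner is only one possible choice. Boosting is not shown.

```python
import numpy as np


class KNNLearner:
    """Hypothetical instance-based learner: stores the data, averages the k nearest Ys."""

    def __init__(self, k=3):
        self.k = k

    def addEvidence(self, X, Y):
        # Instance-based: "training" just stores the data.
        self.X = np.asarray(X, dtype=float)
        self.Y = np.asarray(Y, dtype=float)

    def query(self, X):
        X = np.asarray(X, dtype=float)
        # Pairwise Euclidean distances, shape (n_query, n_train).
        dists = np.linalg.norm(X[:, None, :] - self.X[None, :, :], axis=2)
        nearest = np.argsort(dists, axis=1)[:, : self.k]
        return self.Y[nearest].mean(axis=1)


class BagLearner:
    """Hypothetical bagging wrapper: trains each learner on a bootstrap sample."""

    def __init__(self, make_learner, bags=10, seed=0):
        self.learners = [make_learner() for _ in range(bags)]
        self.rng = np.random.default_rng(seed)

    def addEvidence(self, X, Y):
        X = np.asarray(X, dtype=float)
        Y = np.asarray(Y, dtype=float)
        n = X.shape[0]
        for learner in self.learners:
            # Draw n row indices with replacement (a bootstrap sample).
            idx = self.rng.integers(0, n, size=n)
            learner.addEvidence(X[idx], Y[idx])

    def query(self, X):
        # The ensemble prediction is the mean of the individual predictions.
        return np.mean([learner.query(X) for learner in self.learners], axis=0)


# Synthetic one-dimensional data: one period of a sine wave.
X = np.linspace(0.0, 1.0, 40).reshape(-1, 1)
Y = np.sin(2 * np.pi * X[:, 0])

bag = BagLearner(lambda: KNNLearner(k=3), bags=10)
bag.addEvidence(X, Y)
pred = bag.query(np.array([[0.25]]))
```

Because the wrapper exposes the same API as its base learners, it can be assessed with the same tools as any single learner.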

Lesson 6: Reinforcement Learning

  • Classic view of the problem (from Kaelbling, Littman, Moore)
  • Model-based
  • Model-free

Lesson 7: Q-Learning
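
For orientation, the standard tabular Q-learning update can be sketched as below. The function name `q_update`, the table size, and the values of `alpha` (learning rate) and `gamma` (discount rate) are assumptions chosen for illustration; the lesson itself may present it differently.

```python
import numpy as np


def q_update(Q, s, a, r, s_next, alpha=0.2, gamma=0.9):
    """Apply one tabular Q-learning update in place and return the new Q[s, a]."""
    # Estimated return: immediate reward plus discounted best value of the next state.
    target = r + gamma * np.max(Q[s_next])
    # Blend the old estimate with the new target.
    Q[s, a] = (1 - alpha) * Q[s, a] + alpha * target
    return Q[s, a]


# Hypothetical problem with 3 states and 2 actions, Q table initialized to zero.
Q = np.zeros((3, 2))
first = q_update(Q, s=0, a=1, r=1.0, s_next=2)
second = q_update(Q, s=2, a=0, r=0.0, s_next=0)
```

The update needs only the experience tuple (s, a, r, s_next), not a model of transitions or rewards, which is what makes the method model-free.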

Lesson 8: Dyna