Deep Q-Learning

From Quantitative Analysis Software Courses
Revision as of 17:59, 10 February 2018 by Petosa

Intro

To understand deep Q-learning, you first need to understand standard, table-based Q-learning.

  • Define a table Q with a row for each state and a column for each action, indexed as Q(s,a). Each entry holds the expected utility of being in state s and taking action a.
  • Define exploration probability 0 < p <= 1
  • Begin in state s
  • With probability p, explore: take a random action from the action space. Otherwise, exploit: take the greedy action a = argmax_a[Q(s,a)].
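
The table and the action-selection step can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: it assumes states and actions are small integer indices, that p is the probability of taking a random (exploratory) action, and that every entry of Q starts at zero. The names make_q_table and choose_action are hypothetical and chosen only for this example.

```python
import random


def make_q_table(n_states, n_actions):
    """Build Q as a list of rows: one row per state, one column per action.

    Assumption: every entry starts at 0.0.
    """
    return [[0.0] * n_actions for _ in range(n_states)]


def choose_action(Q, s, p, rng=random):
    """Pick an action in state s.

    With probability p (assumed here to be the exploration probability),
    return a uniformly random action; otherwise return the greedy action
    argmax_a Q(s, a).
    """
    row = Q[s]
    if rng.random() < p:
        # Explore: any action in the action space.
        return rng.randrange(len(row))
    # Exploit: index of the largest entry in this state's row.
    return max(range(len(row)), key=row.__getitem__)


# Usage: 3 states, 2 actions, beginning in state s = 1.
Q = make_q_table(3, 2)
Q[1][1] = 5.0  # pretend action 1 already looks better in state 1
s = 1
a = choose_action(Q, s, p=0.1)
```

Ties between equally valued actions resolve to the lowest action index here; breaking ties at random is an equally reasonable choice.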
