Deep Q-Learning

Intro

To understand deep Q-learning, you first need to understand ordinary, table-based Q-learning.

Define a table Q with a row for each state and a column for each action. We can index into the table with Q(s,a). Each entry holds the expected utility of being in state s and taking action a.
Define an exploration probability p, with 0 < p <= 1.
Begin in state s.
With probability p, take a random action from the action space. Otherwise, take the greedy action a = argmax_a[Q(s,a)].
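
The setup and action-selection steps above can be sketched in Python. This is a minimal illustration, not a full Q-learning implementation. It treats p as the probability of exploring (taking a random action), and the names make_q_table and choose_action, the zero initialization, and the table sizes are all assumptions made for the example.

```python
import random

def make_q_table(n_states, n_actions):
    # One row per state, one column per action.
    # Starting every entry at 0.0 is an assumed initial value.
    return [[0.0] * n_actions for _ in range(n_states)]

def choose_action(Q, s, p, rng):
    # Hypothetical helper: explore with probability p, otherwise act greedily.
    n_actions = len(Q[s])
    if rng.random() < p:
        # Explore: pick any action uniformly at random.
        return rng.randrange(n_actions)
    # Exploit: pick the action with the largest Q(s, a).
    # Ties go to the lowest action index.
    return max(range(n_actions), key=lambda a: Q[s][a])

rng = random.Random(0)                 # seeded for repeatability
Q = make_q_table(n_states=4, n_actions=3)
Q[0][2] = 1.0                          # pretend action 2 looks best in state 0
s = 0                                  # begin in state s
a = choose_action(Q, s, p=0.1, rng=rng)
```

Passing the random number generator in explicitly keeps the choice reproducible, which helps when checking that the exploration rate behaves as expected.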
