MDPs, Bellman equations, batch value iteration, tabular Q-learning with epsilon-greedy exploration, and approximate Q-learning with feature weights.
artificial-intelligence
reinforcement-learning
coursework
python