Reinforcement Learning in Python, Part 1: Welcome to RL

Reinforcement Learning is a framework for tackling sequential decision problems: deciding what to do next in order to maximize a reward (which might be delayed) in a changing universe (which might react to our actions). Examples include: Game playing: which actions are critical to win a game? Learning in a “small world”: what actions …

Multi-armed bandits, part 2

Implementation

There are two important parts to the implementation. On one hand, we have to implement an environment that simulates the reward of the arms. The skeleton of this class is given below:

    class Arm(object):
        def __init__(self, params):
            ## passes the required parameters
            ## this could be the reward probability, or other parameter (mean, sigma)
            …
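To make the idea of an arm that simulates rewards more concrete, here is a minimal sketch of two possible arms. The excerpt only shows the start of the skeleton, so everything below is an assumption for illustration rather than the post's actual code: the class names `BernoulliArm` and `GaussianArm`, the `pull` method, and the `seed` parameter are all hypothetical. The first arm takes a reward probability, the second a mean and sigma, matching the two kinds of parameters mentioned in the skeleton's comments.

```python
import random


class BernoulliArm:
    """Hypothetical arm: pays reward 1 with probability p, otherwise 0."""

    def __init__(self, p, seed=None):
        if not 0.0 <= p <= 1.0:
            raise ValueError("p must be in [0, 1]")
        self.p = p
        # Each arm keeps its own generator so runs can be reproduced.
        self._rng = random.Random(seed)

    def pull(self):
        # random() is uniform on [0, 1), so this succeeds with probability p.
        return 1 if self._rng.random() < self.p else 0


class GaussianArm:
    """Hypothetical arm: reward drawn from a normal distribution."""

    def __init__(self, mean, sigma, seed=None):
        if sigma < 0:
            raise ValueError("sigma must be non-negative")
        self.mean = mean
        self.sigma = sigma
        self._rng = random.Random(seed)

    def pull(self):
        return self._rng.gauss(self.mean, self.sigma)


if __name__ == "__main__":
    # Estimate each arm's average reward from repeated pulls.
    arms = [BernoulliArm(0.3, seed=0), GaussianArm(1.0, 0.5, seed=0)]
    for arm in arms:
        rewards = [arm.pull() for _ in range(10_000)]
        print(type(arm).__name__, sum(rewards) / len(rewards))
```

Giving both arms the same `pull` interface means an agent could treat them interchangeably, which is one reasonable design choice among several.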