Reinforce my Learning

A Historical Introduction: Self-Play in Reinforcement Learning

Introduction

in Rl, Self-play 26 Apr 2019

A Review: Policy Gradient Algorithms

Policy gradient theorem

in Rl, Theory-dive 09 Oct 2018

Environment models (Beyond Markov Decision Processes)

Even though Markov Decision Processes are the most famous mathematical structure used to model an environment in reinforcement learning, other models of RL environments exist that extend vanilla MDPs. This section defines these extensions and draws links between them.

in Rl, Rl-environments 04 Sep 2018

A note on: Q learning

The Q-learning algorithm was first introduced by Watkins (1989), and is arguably one of the most famous, most studied and most widely implemented methods in the entire field. Given an MDP, Q-learning aims to calculate the corresponding optimal action-value function, following the principle of optimality and the proof of existence...

04 Sep 2018

Classification of RL algorithms

Every RL algorithm attempts to learn an optimal policy for a given environment. So far, no single algorithm finds an optimal policy in every environment. The choice of algorithm depends on many factors, such as the nature of the environment, the...

in Rl 21 Jul 2017

Reinforcement Learning: A technical introduction

What is Reinforcement Learning?

in Rl, Rl-environments 21 Jul 2017