Maxim Lapan | Deep Reinforcement Learning: theory, intuition, code


PyData


PyData Amsterdam 2017
In this talk I'd like to give a practical introduction to deep reinforcement learning methods, which are used to solve complex control problems in robotics, play Atari games, control self-driving cars, and much more.
Deep reinforcement learning is a very hot topic, successfully applied in many areas that require planning actions in complex, noisy and partially observed environments. Concrete examples range from playing arcade games and navigating websites to helicopter, quadcopter and car control, protein folding and many others.
Surprisingly, while delving into this wide topic myself, I discovered that (with rare exceptions) there is a lack of concrete, understandable explanations of the most successful and useful algorithms and methods, such as Deep Q-Networks (DQN), Policy Gradients (PG) and Asynchronous Advantage Actor-Critic (A3C). The situation is even worse when it comes to simple code examples of these methods.
On the one hand, there are lots of scientific papers on arxiv.org where researchers refine ideas and methods. On the other hand, there are a couple of full-sized open-source projects that implement those methods plus dozens of "tricks" to improve their stability and performance.
In this talk, I'll try to fill the gap between the two by showing the intuition behind the math and demonstrating how those three approaches (DQN, PG and A3C) can be implemented in fewer than 200 lines of Python code using Keras.
