Maxim Lapan | Deep Reinforcement Learning: theory, intuition, code

8,246 views

PyData

PyData Amsterdam 2017
In this talk I'd like to give a practical introduction to deep reinforcement learning methods, which are used to solve complex control problems in robotics, play Atari games, control self-driving cars and much more.
Deep Reinforcement Learning is a very hot topic, successfully applied in many areas that require planning actions in complex, noisy and partially observed environments. Concrete examples range from playing arcade games and navigating websites to helicopter, quadcopter and car control, protein folding and many others.
Surprisingly, while delving into this wide topic myself, I discovered that (with rare exceptions) there is a lack of concrete, understandable explanations of the most successful and useful algorithms and methods, such as Deep Q-Networks (DQN), Policy Gradients (PG) and Asynchronous Advantage Actor-Critic (A3C). The situation is even worse when it comes to simple code examples of these methods.
On the one hand, there are lots of scientific papers on arxiv.org where researchers tune ideas and methods. On the other hand, there are a couple of full-sized open-source projects implementing those methods plus dozens of "tricks" to improve their stability and performance.
In this talk, I'll try to fill the gap between them by showing the intuition behind the math and demonstrating how those three approaches (DQN, PG and A3C) can be implemented in fewer than 200 lines of Python code using Keras.
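As a rough illustration of the math behind two of the methods named above, here is a minimal sketch in plain Python. It is not code from the talk and does not use Keras. The function names `q_learning_target` and `discounted_returns` and the default discount factor `gamma=0.99` are assumptions made for this example. The first function computes the one-step Bellman target that a DQN is typically trained to regress toward. The second computes the discounted return-to-go that a basic policy gradient method uses to weight each action.

```python
def q_learning_target(reward, next_q_values, done, gamma=0.99):
    """One-step Q-learning target: r + gamma * max_a' Q(s', a').

    On a terminal transition there is no next state, so the target is just r.
    `next_q_values` is the list of Q-value estimates for the next state.
    """
    if done:
        return reward
    return reward + gamma * max(next_q_values)


def discounted_returns(rewards, gamma=0.99):
    """Discounted return-to-go G_t = r_t + gamma * G_{t+1} for each step.

    In a basic policy gradient method, G_t scales the gradient of the
    log-probability of the action taken at step t.
    """
    returns = []
    running = 0.0
    # Walk the episode backwards so each step reuses the return after it.
    for r in reversed(rewards):
        running = r + gamma * running
        returns.append(running)
    returns.reverse()
    return returns
```

In a full DQN the values in `next_q_values` would come from a neural network rather than a list, and practical implementations add the stabilizing "tricks" the abstract mentions.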
