In Lecture 14 we move from supervised learning to reinforcement learning (RL), in which an agent must learn to interact with an environment in order to maximize its reward. We formalize reinforcement learning using the language of Markov Decision Processes (MDPs), policies, value functions, and Q-value functions. We discuss several algorithms for reinforcement learning, including Q-learning, policy gradients, and actor-critic methods. We show how deep reinforcement learning has been used to play Atari games and, in AlphaGo, to achieve superhuman performance at Go.
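
To make the Q-learning idea concrete, here is a minimal tabular sketch in Python. It is not taken from the lecture: the five-state chain environment, the reward of 1 for reaching the rightmost state, the uniformly random behavior policy, and the values of `GAMMA` and `ALPHA` are all assumptions chosen for illustration. The lecture's Atari examples replace the table with a deep network, which this sketch does not attempt.

```python
import random

# Hypothetical toy MDP (an assumption, not from the lecture):
# states 0..4 on a line, state 4 is terminal and pays reward 1.
N_STATES = 5
ACTIONS = (0, 1)   # 0 = move left, 1 = move right
GAMMA = 0.9        # discount factor (assumed value)
ALPHA = 0.5        # learning rate (assumed value)


def step(state, action):
    """Deterministic transition: returns (next_state, reward, done)."""
    next_state = max(state - 1, 0) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def q_learning(episodes=500, max_steps=200, seed=0):
    """Tabular Q-learning with a uniformly random behavior policy.

    Q-learning is off-policy, so it can learn the greedy policy's
    values while the agent acts at random.
    """
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            action = rng.choice(ACTIONS)
            next_state, reward, done = step(state, action)
            # Bellman target: r + gamma * max_a' Q(s', a'), with no
            # bootstrapping from a terminal state.
            target = reward if done else reward + GAMMA * max(q[next_state])
            q[state][action] += ALPHA * (target - q[state][action])
            if done:
                break
            state = next_state
    return q


q_table = q_learning()
# Greedy policy for the non-terminal states: pick the action with the larger Q-value.
greedy_policy = [max(ACTIONS, key=lambda a: q_table[s][a])
                 for s in range(N_STATES - 1)]
print(greedy_policy)
```

In this toy setting the learned greedy policy should prefer moving right in every non-terminal state, since that is the only way to reach the reward. A practical agent would more commonly act epsilon-greedily with respect to its current Q-values rather than purely at random.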