535 Commits

Author SHA1 Message Date
Thomas Simonini
fb12b509ef Update snowball-target.mdx 2023-01-06 18:01:33 +01:00
Thomas Simonini
583462ff23 Update introduction.mdx 2023-01-06 17:58:22 +01:00
simoninithomas
0d352e4f7a Update environments explanation 2023-01-06 14:27:57 +01:00
simoninithomas
a86695b50e Update snowball target explanation 2023-01-06 14:19:02 +01:00
simoninithomas
8baa4e45b6 Update MLAgents introduction 2023-01-06 14:06:34 +01:00
Thomas Simonini
816901d50d Merge branch 'main' into ThomasSimonini/MLAgents 2023-01-04 16:21:08 +01:00
Thomas Simonini
d4b6b46257 Merge pull request #172 from huggingface/ThomasSimonini/PG: Add Unit Policy Gradient 2023-01-04 15:29:52 +01:00
Thomas Simonini
26e335736e Update hands-on.mdx 2023-01-04 14:27:23 +01:00
Thomas Simonini
9b5db4e879 Update colab 2023-01-04 14:24:22 +01:00
Thomas Simonini
8a35f1bf67 Update hands-on.mdx 2023-01-04 14:18:09 +01:00
Thomas Simonini
89e97f0196 Update hands-on.mdx 2023-01-04 14:10:57 +01:00
Thomas Simonini
49692e07b7 Apply suggestions from code review (Co-authored-by: Omar Sanseviero <osanseviero@gmail.com>) 2023-01-04 14:02:15 +01:00
Thomas Simonini
5272fb8941 Update policy-gradient.mdx 2023-01-04 14:00:05 +01:00
Thomas Simonini
fabf98b74f Update what-are-policy-based-methods.mdx 2023-01-04 13:58:06 +01:00
Thomas Simonini
2e1e4046a2 Update quiz.mdx 2023-01-04 11:30:55 +01:00
Thomas Simonini
2e49a1fb6f Update quiz.mdx 2023-01-04 11:14:36 +01:00
simoninithomas
c32d96dbc8 Add hands on mdx 2023-01-04 10:01:54 +01:00
Thomas Simonini
c5aceb0877 Final update unit 4 2023-01-04 09:58:33 +01:00
Thomas Simonini
08259b5370 Update pseudocode 2023-01-04 09:53:22 +01:00
Thomas Simonini
0fdaa23948 Update notebook 2023-01-04 09:46:31 +01:00
simoninithomas
851b083fcf Add the Quiz 2023-01-04 09:07:09 +01:00
simoninithomas
5dbb460d90 Modifications based on Omar feedback + cleanup 2023-01-04 08:48:30 +01:00
Thomas Simonini
1c93606aec Apply suggestions from code review (Co-authored-by: Omar Sanseviero <osanseviero@gmail.com>) 2023-01-04 08:22:31 +01:00
Thomas Simonini
4cf68f25b3 Create requirements-unit4.txt 2023-01-03 21:59:57 +01:00
Thomas Simonini
6ee1dde45a Update Colab 2023-01-03 21:57:57 +01:00
Thomas Simonini
43dcfac6e2 Merge pull request #95 from Chris1nexus/main: Fix and improve unit5 REINFORCE computation of the returns 2023-01-03 19:06:43 +01:00
simoninithomas
b94cc104e1 Typo 2023-01-03 10:07:58 +01:00
simoninithomas
8e0bbdb82e Update maths 2023-01-03 09:58:54 +01:00
simoninithomas
53ad3d9a09 Add derivation optional 2023-01-03 09:44:20 +01:00
simoninithomas
fc00de7e69 Add mathematics 2023-01-03 09:06:28 +01:00
Thomas Simonini
902e203b82 Add unit4 notebook (wip) 2023-01-02 22:58:59 +01:00
simoninithomas
c458fb33c7 Update PG and add hands-on 2023-01-02 22:37:01 +01:00
simoninithomas
e1cf375c36 Update advantages-disadvantages and policy gradient 2023-01-02 22:23:27 +01:00
simoninithomas
88fded6cf3 Update intro and what are policy based mtd 2023-01-02 22:05:36 +01:00
simoninithomas
c0c4f9b565 Add conclusion 2023-01-02 21:52:41 +01:00
simoninithomas
7bb90190c7 Update Policy Gradient 2023-01-02 21:47:33 +01:00
simoninithomas
bebb6fed17 Adding conclusion 2023-01-02 21:22:44 +01:00
simoninithomas
2d2dffd4f7 Add illustrations PG 2023-01-02 20:11:50 +01:00
simoninithomas
1197198d2b Added policy gradient section 2023-01-02 14:43:34 +01:00
simoninithomas
5e4f2e0024 Update MLAgents draft 2023-01-01 15:42:22 +01:00
Thomas Simonini
14580c6b78 Apply suggestions from code review (Co-authored-by: Omar Sanseviero <osanseviero@gmail.com>) 2023-01-01 13:27:21 +01:00
simoninithomas
c71422e59c First draft unfinished 2023-01-01 13:23:38 +01:00
Thomas Simonini
ab8598b772 Merge pull request #168 from dario248/main: Unit 3 Glossary 2022-12-31 21:51:20 +01:00
Thomas Simonini
bc9bb6c52f Merge pull request #170 from HasarinduPerera/main: Update glossary.mdx [Unit 2] 2022-12-31 21:46:20 +01:00
Thomas Simonini
b9856e2f54 Merge pull request #171 from huggingface/ThomasSimonini/BigUpdate: Big Update (small typos, feedback form etc) 2022-12-31 21:40:55 +01:00
Thomas Simonini
e60f817254 Update conclusion 2022-12-31 21:35:06 +01:00
Thomas Simonini
f9de15477c Replace Huggy image 2022-12-31 21:30:31 +01:00
Thomas Simonini
31105e358f Update 2022-12-31 21:22:17 +01:00
Thomas Simonini
633639d3e7 Update 2022-12-31 21:11:47 +01:00
Thomas Simonini
45d6455cd2 Update schedule 2022-12-31 21:09:32 +01:00