Commit Graph

176 Commits

Author SHA1 Message Date
simoninithomas
c458fb33c7 Update PG and add hands-on 2023-01-02 22:37:01 +01:00
simoninithomas
e1cf375c36 Update advantages-disadvantages and policy gradient 2023-01-02 22:23:27 +01:00
simoninithomas
88fded6cf3 Update intro and what are policy based mtd 2023-01-02 22:05:36 +01:00
simoninithomas
c0c4f9b565 Add conclusion 2023-01-02 21:52:41 +01:00
simoninithomas
7bb90190c7 Update Policy Gradient 2023-01-02 21:47:33 +01:00
simoninithomas
bebb6fed17 Adding conclusion 2023-01-02 21:22:44 +01:00
simoninithomas
2d2dffd4f7 Add illustrations PG 2023-01-02 20:11:50 +01:00
simoninithomas
1197198d2b Added policy gradient section 2023-01-02 14:43:34 +01:00
simoninithomas
c71422e59c First draft unfinished 2023-01-01 13:23:38 +01:00
Thomas Simonini
ab8598b772 Merge pull request #168 from dario248/main
Unit 3 Glossary
2022-12-31 21:51:20 +01:00
Thomas Simonini
bc9bb6c52f Merge pull request #170 from HasarinduPerera/main
Update glossary.mdx [Unit 2]
2022-12-31 21:46:20 +01:00
Thomas Simonini
e60f817254 Update conclusion 2022-12-31 21:35:06 +01:00
Thomas Simonini
f9de15477c Replace Huggy image 2022-12-31 21:30:31 +01:00
Thomas Simonini
31105e358f Update 2022-12-31 21:22:17 +01:00
Thomas Simonini
633639d3e7 Update 2022-12-31 21:11:47 +01:00
Thomas Simonini
45d6455cd2 Update schedule 2022-12-31 21:09:32 +01:00
Thomas Simonini
90ce3173b8 Update Unit 1 notebook 2022-12-31 21:06:20 +01:00
Thomas Simonini
6d2b2b6ae7 Update _toctree.yml 2022-12-31 20:58:56 +01:00
Thomas Simonini
9b531c3be0 Some small updates 2022-12-31 20:52:40 +01:00
Thomas Simonini
10d539f24f Update discord101.mdx 2022-12-31 20:31:21 +01:00
Thomas Simonini
8a68a5e8dc Update setup.mdx 2022-12-31 20:25:03 +01:00
Thomas Simonini
d48dc1ad6c Update introduction.mdx 2022-12-31 20:23:46 +01:00
Hasarindu Perera
815ae5ba13 Update glossary.mdx
Add Epsilon-greedy strategy and Greedy strategy.
2022-12-31 13:30:42 +05:30
Dario Paez
5a4117630b Update _toctree.yml 2022-12-28 10:24:21 -03:00
Dario Paez
779b7e5b07 Create glossary.mdx 2022-12-28 10:22:32 -03:00
Thomas Simonini
7b61d9f813 Update bellman-equation.mdx 2022-12-20 14:20:40 +01:00
Thomas Simonini
5f66e67419 Update mc-vs-td.mdx 2022-12-20 14:06:10 +01:00
Thomas Simonini
3bdc44cd35 Update bellman-equation.mdx 2022-12-20 14:05:29 +01:00
Thomas Simonini
beaef9b0a4 Update two-types-value-based-methods.mdx 2022-12-20 14:02:46 +01:00
Thomas Simonini
31dc00a52b Update additional-readings.mdx
Add make your own gym custom env
2022-12-20 13:59:12 +01:00
Thomas Simonini
630b80a00f Update hands-on.mdx 2022-12-20 13:54:08 +01:00
Thomas Simonini
093bdb1ed8 Merge pull request #137 from ramon-rd/patch-1
Create glossary.mdx
2022-12-20 13:07:07 +01:00
Thomas Simonini
a37804cebf Update glossary.mdx 2022-12-20 13:06:31 +01:00
Thomas Simonini
c275b13ddf Update _toctree.yml 2022-12-20 13:04:35 +01:00
Thomas Simonini
f354d80e8c Merge pull request #108 from huggingface/ThomasSimonini/Unit3
Adding Unit 3: Deep Q-Learning and Optuna Bonus
2022-12-19 16:03:11 +01:00
Thomas Simonini
fd710896cf Update hands-on.mdx
- Add cleanrl link
- Some cleanups
2022-12-19 15:05:38 +01:00
Thomas Simonini
96b49481da Update hands-on.mdx 2022-12-19 13:53:38 +01:00
Thomas Simonini
33c02be800 Update hands-on.mdx 2022-12-19 12:44:13 +01:00
Thomas Simonini
d500baac63 Apply suggestions from code review
Co-authored-by: Omar Sanseviero <osanseviero@gmail.com>
2022-12-19 12:18:26 +01:00
Andrey Voroshilov
a4cd53cd37 Fixing the reference, a) to the right Berkeley course (CS 188 and not CS 285) and b) to have a proper URL format 2022-12-18 12:24:23 -08:00
Artagon
fc66ea7e4a Rephrasing for initial epsilon value 2022-12-17 22:33:02 +01:00
Artagon
96714cdb10 Cases consistency 2022-12-17 22:23:08 +01:00
Artagon
a7d74befb0 Fix midsentence uppercase 'Policy' 2022-12-17 14:47:18 +01:00
Artagon
753ef67eae epsilon-greedy instead of epsilon greedy 2022-12-17 14:45:08 +01:00
Artagon
f913af7300 epsilon smaller or equal to 1.0 2022-12-17 14:39:40 +01:00
Artagon
0a4c6c6f2c fix redundant 'pair' and inconsistent Case. 2022-12-17 14:30:19 +01:00
Artagon
0c3616c03f Replace ** by <b> tags in figcaption 2022-12-16 20:34:24 +01:00
Artagon
0744d542ad Properly display π* 2022-12-16 20:31:49 +01:00
Thomas Simonini
38d0b2c73a Merge branch 'main' into ThomasSimonini/Unit3 2022-12-16 10:04:21 +01:00
simoninithomas
ed065ac128 Update Unit 3 and 4 2022-12-16 09:44:59 +01:00