Mirror of https://github.com/huggingface/deep-rl-class.git (synced 2026-02-08 04:45:27 +08:00)
Update curiosity.mdx
@@ -41,7 +41,7 @@ There are different ways to calculate this intrinsic reward. The classical appro
 
 Because the idea of Curiosity is to **encourage our agent to perform actions that reduce the uncertainty in the agent’s ability to predict the consequences of its actions** (uncertainty will be higher in areas where the agent has spent less time or in areas with complex dynamics).
 
-If the agent spends a lot of time on these states, it will be good to predict the next state (low Curiosity). On the other hand, if it’s a new state unexplored, it will be harmful to predict the following state (high Curiosity).
+If the agent spends a lot of time on these states, it will be good to predict the next state (low Curiosity). On the other hand, if it’s a new state unexplored, it will be hard to predict the following state (high Curiosity).
 
 <img src="https://huggingface.co/datasets/huggingface-deep-rl-course/course-images/resolve/main/en/unit5/curiosity4.png" alt="Curiosity"/>
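For context on the passage this commit edits: the "Curiosity" it describes is an intrinsic reward computed from next-state prediction error. Below is a minimal sketch of that idea, assuming a PyTorch setup; the ForwardModel class, intrinsic_reward helper, and all dimensions are illustrative assumptions, not code from the course.

import torch
import torch.nn as nn

# Hypothetical forward-dynamics model: predicts the next state from the
# current state and action (names and sizes are illustrative).
class ForwardModel(nn.Module):
    def __init__(self, state_dim: int, action_dim: int, hidden_dim: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, state_dim),
        )

    def forward(self, state: torch.Tensor, action: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([state, action], dim=-1))

def intrinsic_reward(model: ForwardModel,
                     state: torch.Tensor,
                     action: torch.Tensor,
                     next_state: torch.Tensor) -> torch.Tensor:
    # Curiosity bonus = error when predicting the next state:
    # low in well-visited states (easy to predict), high in new,
    # unexplored states (hard to predict).
    predicted_next = model(state, action)
    return ((predicted_next - next_state) ** 2).mean(dim=-1)

# Usage on one batch of transitions (made-up dimensions):
model = ForwardModel(state_dim=4, action_dim=2)
s = torch.randn(8, 4)       # current states
a = torch.randn(8, 2)       # actions (e.g. one-hot or continuous)
s_next = torch.randn(8, 4)  # observed next states
bonus = intrinsic_reward(model, s, a, s_next)  # shape (8,), added to the extrinsic reward

In this sketch, the bonus is simply added to the environment reward, so states whose dynamics the agent cannot yet predict are rewarded more and exploration is encouraged.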