Files
deep-rl-class/units/en/unit1/tasks.mdx
Dylan Wilson 70aadc9565 Typos Unit1
2023-04-18 14:01:03 -05:00

28 lines
1.4 KiB
Plaintext
Raw Blame History

This file contains invisible Unicode characters
This file contains invisible Unicode characters that are indistinguishable to humans but may be processed differently by a computer. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# Type of tasks [[tasks]]
A task is an **instance** of a Reinforcement Learning problem. We can have two types of tasks: **episodic** and **continuing**.
## Episodic task [[episodic-task]]
In this case, we have a starting point and an ending point **(a terminal state). This creates an episode**: a list of States, Actions, Rewards, and new States.
For instance, think about Super Mario Bros: an episode begin at the launch of a new Mario Level and ends **when youre killed or you reached the end of the level.**
<figure>
<img src="https://huggingface.co/datasets/huggingface-deep-rl-course/course-images/resolve/main/en/unit1/mario.jpg" alt="Mario">
<figcaption>Beginning of a new episode.
</figcaption>
</figure>
## Continuing tasks [[continuing-tasks]]
These are tasks that continue forever (**no terminal state**). In this case, the agent must **learn how to choose the best actions and simultaneously interact with the environment.**
For instance, an agent that does automated stock trading. For this task, there is no starting point and terminal state. **The agent keeps running until we decide to stop it.**
<img src="https://huggingface.co/datasets/huggingface-deep-rl-course/course-images/resolve/main/en/unit1/stock.jpg" alt="Stock Market" width="100%">
To recap:
<img src="https://huggingface.co/datasets/huggingface-deep-rl-course/course-images/resolve/main/en/unit1/tasks.jpg" alt="Tasks recap" width="100%">