Merge pull request #90 from ryanrussell/unit1-readability

docs: unit1 readability fixups
Thomas Simonini
2022-10-03 07:49:00 +02:00
committed by GitHub
3 changed files with 4 additions and 4 deletions

@@ -345,7 +345,7 @@
"3⃣ Get an action using our model (in our example we take a random action)\n",
"\n",
"4⃣ Using `env.step(action)`, we perform this action in the environment and get\n",
"- `obsevation`: The new state (st+1)\n",
"- `observation`: The new state (st+1)\n",
"- `reward`: The reward we get after executing the action\n",
"- `done`: Indicates if the episode terminated\n",
"- `info`: A dictionary that provides additional information (depends on the environment).\n",

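The `env.step(action)` contract described in the hunk above can be illustrated with a minimal sketch. `ToyEnv`, its `sample_action` helper, and `run_episode` are hypothetical stand-ins, not part of the notebook or of Gym; they only mimic the four-value return (`observation`, `reward`, `done`, `info`) that the notebook text lists, so the loop runs without any dependencies. Newer Gymnasium releases return five values from `step` (splitting `done` into `terminated` and `truncated`), so treat this as matching the older API the notebook describes.

```python
import random


class ToyEnv:
    """Hypothetical stand-in for a Gym-style environment (illustration only)."""

    def __init__(self, max_steps=5, seed=0):
        self.max_steps = max_steps
        self.rng = random.Random(seed)
        self.t = 0

    def reset(self):
        # Start a new episode and return the initial observation.
        self.t = 0
        return 0

    def sample_action(self):
        # Random policy, as in the notebook's "we take a random action".
        return self.rng.choice([0, 1])

    def step(self, action):
        self.t += 1
        observation = self.t                   # the new state (s_{t+1})
        reward = 1.0 if action == 1 else 0.0   # reward for this action
        done = self.t >= self.max_steps        # did the episode terminate?
        info = {"step": self.t}                # extra, environment-specific data
        return observation, reward, done, info


def run_episode(env):
    """Run one episode with random actions; return (steps, total reward, last observation)."""
    observation = env.reset()
    total_reward, steps, done = 0.0, 0, False
    while not done:
        action = env.sample_action()
        observation, reward, done, info = env.step(action)
        total_reward += reward
        steps += 1
    return steps, total_reward, observation


steps, total_reward, last_obs = run_episode(ToyEnv(max_steps=5))
```

The loop stops as soon as `done` is true, which is why the notebook calls out `done` as the episode-termination flag.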

@@ -131,7 +131,7 @@
}
],
"source": [
"# Initilaize wandb\n",
"# Initialize wandb\n",
"# https://docs.wandb.ai/guides/integrations/other/stable-baselines-3\n",
"\n",
"config = {\n",
@@ -182,7 +182,7 @@
],
"source": [
"env = make_vec_env('LunarLander-v2', n_envs=16)\n",
"# Use the folling line with caution. The video recorder will try to render the agent on the screen, so that ffmpeg can caputre it. Here, we have 16 envs set. Trying to render 16 envs on screen will\n",
"# Use the following line with caution. The video recorder will try to render the agent on the screen, so that ffmpeg can capture it. Here, we have 16 envs set. Trying to render 16 envs on screen will\n",
"# be pretty resource intensive. \n",
"# env = VecVideoRecorder(env, f\"videos/{run.id}\", record_video_trigger=lambda x: x % 2000 == 0, video_length=200) # Set the video recorder, to record our agent during training\n",
"\n",

@@ -287,7 +287,7 @@
"\n",
"**Note:** If you have more time available, then you can tune other hyperparameters too. Moreover, you can explore wider ranges for each hyperparameter.\n",
"\n",
"The `trial.suggest_int()` and `trial.suggest_uniform()` methods are used by Optuna to suggest hyperparamter values in the ranges specified. The suggested combination of values are then used to train a model and return the score."
"The `trial.suggest_int()` and `trial.suggest_uniform()` methods are used by Optuna to suggest hyperparameter values in the ranges specified. The suggested combination of values are then used to train a model and return the score."
],
"metadata": {
"id": "dORGHcVYdSKp"