ModelΒΆ

A model is a reresentation that predicts what will happen inside an environment. This knowledge about the environment and/or task is explicitly encoded, such that instead of directly learning a policy, we learn a model function which helps improve our policy.

Most commonly, they can take the following forms:

  1. State Transition model:

  2. A Reward model: