ModelΒΆ
A model is a reresentation that predicts what will happen inside an environment. This knowledge about the environment and/or task is explicitly encoded, such that instead of directly learning a policy, we learn a model function which helps improve our policy.
Most commonly, they can take the following forms:
State Transition model:
A Reward model: