Components to be Learned
We have described several agents so far this semester. The components of these agents that might be improved by learning include:
- The mapping from conditions on the current state to actions
- The means to infer relevant properties of the world from the percept sequence
- Information about the way the world evolves and about the results of possible actions the agent can take.
- Utility information indicating the desirability of world states
- Action-value information indicating the desirability of actions
- Goals that describe classes of states whose achievement maximizes the agent's utility.
The book gives examples of the above in terms of a taxi driver agent.
Feedback to learn from
There are three main types of feedback that correspond to the three main types of learning:
- In unsupervised learning the agent learns patterns in the input even though no explicit feedback is supplied. A common task in this area is clustering: detecting potentially useful clusters of input examples.
- In reinforcement learning the agent learns from a series of reinforcements -- rewards or punishments. For example, the lack of a tip at the end of the journey gives the taxi agent an indication that it did something wrong. Winning a chess game gives an agent an indication that it did something right.
- In supervised learning the agent observes some example input-output pairs and learns a function that maps from input to output.
In addition to the above, there are also approaches such as semi-supervised learning, in which we are given a few labeled examples and must make what we can of a large collection of unlabeled examples.
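As a minimal sketch of supervised learning, consider fitting a one-parameter linear hypothesis h(x) = w*x from a set of input-output pairs by least squares. (This toy hypothesis class and the function names below are illustrative assumptions, not taken from the book.)

```python
def fit_linear(examples):
    """Learn the weight w for the hypothesis h(x) = w * x from
    (input, output) pairs by minimizing squared error.
    The least-squares solution is w = sum(x*y) / sum(x*x)."""
    sxy = sum(x * y for x, y in examples)
    sxx = sum(x * x for x, _ in examples)
    return sxy / sxx

# Example pairs generated by an unknown target function f(x) = 2x.
examples = [(1, 2), (2, 4), (3, 6)]
w = fit_linear(examples)

def h(x):
    """The learned hypothesis: maps a new input to a predicted output."""
    return w * x
```

Here the "feedback" is the correct output supplied with each input; the agent generalizes from the three observed pairs to predict outputs for inputs it has never seen.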