The Boltzmann machine

Activations are in {0,1}.

Each state (set of activation values) has a goodness (or harmony): the sum, over all connections, of the connection weight times the product of the sending and receiving activations.

For a given unit, the net input is the difference in goodness between the unit's being on and its being off (assuming symmetric weights and no self-connections). So to increase goodness, turn a unit on if its net input is positive and off if its net input is negative.

If units are selected randomly for update, this gives an activation algorithm, but one that may not maximize goodness: it can get stuck at a local maximum.

One stochastic algorithm turns on a unit with probability

    1/[1 + exp(-net/T)]

Here T is a parameter called temperature. If T starts high and is lowered gradually enough, the system will reach optimal goodness with high probability. By analogy with the physical process of annealing, this procedure is called simulated annealing.

The intuition is that at high temperatures, the probability above is about 1/2 whatever the net input, so it is easy to escape from a local maximum. At low temperatures, the probability above is near 0 or 1, so updates are nearly deterministic and the system settles into the maximum it has reached, ideally an optimal one.

Note that the hidden units can be thought of as recognizing features that are not directly represented by input units. This is a general characteristic of neural nets (and of the brain).

Applications:
    constraint satisfaction, e.g., Necker cube
    classification (to nearest prototype), e.g., face recognition, character recognition
    combinatorial optimization, e.g., the traveling salesman problem (TSP)

Boltzmann machines with arbitrary connectivity can be too slow to be practical, but versions with restricted connectivity can be practical.
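The update rule and annealing procedure described in these notes can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not a reference implementation: the function names (goodness, net_input, p_on, anneal), the geometric cooling schedule with its default values, and the 3-unit weight matrix are all hypothetical choices made for the example. Weights are assumed symmetric with no self-connections, so that the net input equals the difference in goodness between a unit's being on and its being off.

```python
import math
import random


def goodness(weights, state):
    """Sum of weight * a_i * a_j over each pair of units i < j."""
    n = len(state)
    return sum(weights[i][j] * state[i] * state[j]
               for i in range(n) for j in range(i + 1, n))


def net_input(weights, state, i):
    """Goodness with unit i on minus goodness with unit i off."""
    return sum(weights[i][j] * state[j] for j in range(len(state)) if j != i)


def p_on(net, temperature):
    """Probability of turning a unit on: 1 / (1 + exp(-net / T))."""
    x = -net / temperature
    if x > 700.0:  # exp would overflow; the probability is effectively 0
        return 0.0
    return 1.0 / (1.0 + math.exp(x))


def anneal(weights, state, t_start=10.0, t_end=0.05, cooling=0.9,
           sweeps=20, seed=0):
    """Stochastic updates of randomly chosen units while T is lowered.

    The geometric cooling schedule and its default values are assumptions.
    """
    rng = random.Random(seed)
    state = list(state)
    t = t_start
    while t > t_end:
        for _ in range(sweeps * len(state)):
            i = rng.randrange(len(state))
            on = rng.random() < p_on(net_input(weights, state, i), t)
            state[i] = 1 if on else 0
        t *= cooling
    return state


if __name__ == "__main__":
    # Hypothetical 3-unit network: units 0 and 1 support each other,
    # and both conflict with unit 2.
    w = [[0, 2, -1],
         [2, 0, -1],
         [-1, -1, 0]]
    final = anneal(w, [0, 0, 1])
    print(final, goodness(w, final))
```

In this toy network the state with units 0 and 1 on and unit 2 off has the highest goodness (2), and a run that cools slowly enough typically settles there. A faster schedule or a larger network gives no such assurance, which is the trade-off the notes point to when they say arbitrary connectivity can be too slow to be practical.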