(My notes from Richard M. Golden's excellent podcast "Learning Machines 101")
What makes for human intelligence? Decision making: both deciding based on past experience and deciding in novel situations, those with no precedent at all.
"The ability to make decisions in new situations and the ability to learn from experience are important characteristics of human intelligence," Golden said.
And this is how machines can "learn" as well, so the reasoning goes. With machine learning, a computer maps out the possible scenarios that would ensue from a particular decision, ranking them by how favorable the outcome would be, in order to choose the best one. It can also digest this information, using it to revise the ratings of past scenarios, and so build up a sense of "experience."
Characterizing people's motivation for doing things, one can say they always act in pursuit of positive outcomes.
Take games for instance. Every move a player makes has the same objective, to win. Winning is a positive outcome.
One of the earliest experimental AI programs, from the 1950s, played checkers. With checkers, as in life itself, you have to make moves without knowing the final outcome, good or bad.
Of course, life is more difficult than a game. But a game like checkers offers one advantage: It is interpretable.
Computationally, it would appear easy to calculate multiple moves ahead in checkers: look at all the possible move combinations in order to find those that lead to a positive outcome, i.e., winning the game. This is called a look-ahead strategy. Of course, the calculation would also have to account for the opponent's own calculations toward a positive outcome.
Such calculations would rapidly outrun the capabilities of any computer. A game of 50 moves yields an astronomically large number of possible outcomes. So in addition to generating moves, an ML program must also evaluate possible move combinations, ideally by producing a rating that estimates the success of a given approach, based on a pre-chosen number of moves ahead that it can feasibly calculate.
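The look-ahead strategy described above, including the opponent's counter-calculations, can be sketched as a depth-limited minimax search. The game tree here is a toy stand-in of my own: internal nodes are lists of child positions, and leaves are numeric ratings supplied by an evaluation rule; a real checkers program would generate children with a move generator instead.

```python
def minimax(node, maximizing):
    """Rate a position by searching the tree below it, alternating
    between our best move and the opponent's best counter-move."""
    # A leaf: return its evaluation-rule rating directly.
    if isinstance(node, (int, float)):
        return node
    # Our turn: pick the child with the highest rating.
    if maximizing:
        return max(minimax(child, False) for child in node)
    # Opponent's turn: assume they pick the rating worst for us.
    return min(minimax(child, True) for child in node)

# Two of our candidate moves; the opponent replies to each.
tree = [[3, 5], [2, 9]]
best = minimax(tree, True)  # opponent forces 3 or 2; we choose 3
```

The depth cutoff in a real program corresponds to how deep this toy tree is built: beyond the pre-chosen horizon, positions are simply rated rather than searched further.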
The rating could incorporate a number of different evaluations, called features. In checkers, a feature could be the number of pieces still on the board. Another feature could be the number of kings on the checkerboard. The right features have to be chosen, and weighed against one another, to produce an accurate estimate of the outcome. This is the work of the evaluation rule.
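A simple way to realize such an evaluation rule is a weighted sum of feature values. The feature names and weights below are invented for illustration; a real program would extract them from an actual board position and tune the weights.

```python
def evaluate(features, weights):
    """Rate a position as a weighted sum of its feature values."""
    return sum(weights[name] * value for name, value in features.items())

# Hypothetical features: our advantage in pieces and in kings.
features = {"piece_advantage": 2, "king_advantage": 1}
weights = {"piece_advantage": 1.0, "king_advantage": 2.5}
score = evaluate(features, weights)  # 2*1.0 + 1*2.5 = 4.5
```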
In this way, an ML machine can map out possible positive outcomes for a novel situation. It also needs to take into account its "experience" with the game. This is where reinforcement learning comes into play.
This learning rule reconciles the current prediction from the evaluation rule with the ratings of the states that can follow. For instance, if all the possible future outcomes rate a "4" but the present state of the system rates a "3," then the present rating would be upped to a "4" to reflect those future outcomes. (Episode 2)
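The learning rule above can be sketched as nudging the current rating toward the rating implied by the look-ahead. With a full-strength update (alpha=1.0) this reproduces the "3 becomes 4" example; a smaller alpha, as in many reinforcement-learning schemes, would only blend the old rating with the new evidence. The function name and the alpha parameter are my own framing, not terms from the podcast.

```python
def update_rating(current, lookahead, alpha=1.0):
    """Move the current rating toward the look-ahead rating.
    alpha controls how much of the discrepancy is absorbed."""
    return current + alpha * (lookahead - current)

new_rating = update_rating(3, 4)  # the present "3" is upped to "4"
```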
Knowledge is represented by a set of if-then rules: if condition X is true, then the machine executes Y. The rules fall into two categories: legal moves (e.g., pieces can only move forward on a checkerboard) and strategic moves (e.g., try to convert a checker piece into a king).
Collections of rules are assembled into a production system. In addition to its set of rules, such a system also has a working memory, which captures the current state of knowledge within the system, and a conflict-resolution mechanism, to reconcile potentially conflicting rules.
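The pieces described above can be sketched as a minimal production system: if-then rules, a working memory of facts, and a conflict-resolution step (here, simply "first matching rule wins"). The rule contents are invented for illustration, not taken from the podcast.

```python
rules = [
    # (condition on working memory, action it adds to working memory)
    (lambda wm: "piece_on_back_row" in wm, "crown_king"),    # strategic rule
    (lambda wm: "opponent_piece_adjacent" in wm, "jump"),    # legal-move rule
]

def run_cycle(working_memory):
    """Fire one rule whose condition matches the working memory."""
    matches = [action for cond, action in rules if cond(working_memory)]
    if matches:
        # Conflict resolution: if several rules match, take the first.
        working_memory.add(matches[0])
    return working_memory

wm = run_cycle({"opponent_piece_adjacent"})  # adds "jump"
```

Real production systems use far more elaborate conflict-resolution strategies (rule priorities, recency of facts, specificity of conditions); the point here is only that all three components, rules, working memory, and conflict resolution, appear in even the smallest sketch.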
One of the biggest ongoing challenges in ML is picking the rules, or the features, that adequately describe the problem at hand. Rule sets may contain contradictory or incomplete knowledge. And, for ML, one of the most intractable problems of all is exceptions to the rules. A rule such as "all birds can fly," for instance, does not cover the penguin or the ostrich, birds that, in fact, cannot fly. (Episode 3)