The authors consider the problem of a learning mechanism that must learn the optimal action offered by a random environment. The mechanism presented can be defined as an action probability updating rule and is therefore a variable-structure stochastic automaton (VSSA). The machine is essentially stubborn: once it has chosen a particular action, it increases the probability of choosing that action again, irrespective of whether the response from the environment was favorable or unfavorable. However, this increase is made in a systematic way, so that the machine learns, in an ε-optimal fashion, the best action the environment offers. The proposed mechanism thus forms an excellent model for an ε-optimal stubbornly learning system. Beyond showing that the machine is ε-optimal, a major contribution of the work is that the mathematical tools used in the proof (namely the theory of distributions, kernels, and topological spaces) are quite distinct from those currently used in the field of learning. Simulation results are also presented; they demonstrate the properties of the mechanism and compare it with the traditional linear reward-inaction (L_RI) scheme.
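
For readers unfamiliar with the baseline used in that comparison, the following is a minimal sketch of the standard linear reward-inaction update. It is not the authors' stubborn mechanism, whose update rule is not given in this abstract. The function names, the learning rate `rate`, the step count, and the reward probabilities are illustrative assumptions.

```python
import random


def lri_step(p, action, rewarded, rate):
    """Return the action probabilities after one linear reward-inaction update.

    On a reward, the chosen action's probability moves toward 1 and all
    others shrink proportionally. On a penalty, nothing changes (inaction).
    """
    if not rewarded:
        return list(p)
    q = [(1.0 - rate) * pj for pj in p]
    q[action] = p[action] + rate * (1.0 - p[action])
    return q


def simulate_lri(reward_probs, rate=0.05, steps=5000, seed=0):
    """Run the update against a stationary random environment.

    reward_probs[i] is the (assumed) probability that action i is rewarded.
    """
    rng = random.Random(seed)
    n = len(reward_probs)
    p = [1.0 / n] * n  # start from a uniform action probability vector
    for _ in range(steps):
        action = rng.choices(range(n), weights=p)[0]
        rewarded = rng.random() < reward_probs[action]
        p = lri_step(p, action, rewarded, rate)
    return p


if __name__ == "__main__":
    # Hypothetical two-action environment; action 0 is the better one.
    print(simulate_lri([0.8, 0.4]))
```

A smaller `rate` generally makes convergence slower but makes it more likely that the scheme settles on the best action, which is the trade-off that ε-optimality formalizes.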

Additional Metadata
Conference: 1989 IEEE International Conference on Systems, Man, and Cybernetics, Part 1 (of 3)
Christensen, J.P.R., & Oommen, J. (1989). On using distribution theory to prove the epsilon-optimality of stubborn learning mechanisms. In Proceedings of the IEEE International Conference on Systems, Man and Cybernetics (pp. 286–291).