This paper presents a learning algorithm that has received only marginal attention in the field of learning machines. The machine is an automaton whose structure changes with time, and it is assumed to interact with a random environment. It is essentially a stubborn machine: once it has chosen a particular action, it increases the probability of choosing that action irrespective of whether the response from the environment was favorable or unfavorable. However, this increase in the action probability is carried out in a systematic and methodical way, so that the machine ultimately learns the best action that the environment offers. It is shown that the learning mechanism is ε-optimal and that the probability of choosing the optimal action converges uniformly to unity. Apart from the proof of ε-optimality, a major contribution of this paper is that the mathematical tools used in the proof are quite novel to the field of learning. In addition to these theoretical results, the paper contains simulation results that demonstrate the properties of the stubborn learning mechanism. The mechanism is also shown to be inferior to the learning machine that merely ignores the penalty responses of the environment. Some open problems are also presented.
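
The stubborn behaviour described above can be illustrated with a small simulation. The sketch below is not the update rule from the paper, which the abstract does not specify. It assumes a hypothetical linear scheme in which the chosen action's probability is always increased, by a larger step after a favorable response and a smaller step after an unfavorable one. The names `update`, `simulate`, `step_reward`, and `step_penalty` are illustrative assumptions, and no convergence property is claimed for this rule. Setting `step_penalty` to zero gives a machine that simply ignores penalty responses, the kind of scheme the abstract uses as a comparison.

```python
import random


def update(p, chosen, step):
    """Shift probability mass toward `chosen` by a linear step.

    The chosen action gains step * (1 - p[chosen]); every other action is
    scaled by (1 - step), so the vector still sums to 1.
    """
    return [
        pj + step * (1.0 - pj) if j == chosen else pj * (1.0 - step)
        for j, pj in enumerate(p)
    ]


def simulate(reward_probs, step_reward, step_penalty, n_steps, seed=0):
    """Run a variable-structure automaton in a random environment.

    reward_probs[i] is the (assumed) probability that action i receives a
    favorable response. With step_penalty > 0 the machine is "stubborn" in
    the sense of the abstract: the chosen action's probability increases
    even after an unfavorable response. With step_penalty == 0 penalties
    are ignored.
    """
    rng = random.Random(seed)
    n_actions = len(reward_probs)
    p = [1.0 / n_actions] * n_actions  # start from the uniform distribution
    for _ in range(n_steps):
        chosen = rng.choices(range(n_actions), weights=p)[0]
        rewarded = rng.random() < reward_probs[chosen]
        p = update(p, chosen, step_reward if rewarded else step_penalty)
    return p


if __name__ == "__main__":
    env = [0.8, 0.4]  # hypothetical environment: action 0 is the better one
    print("stubborn (assumed rule):", simulate(env, 0.02, 0.005, 2000))
    print("penalties ignored:      ", simulate(env, 0.02, 0.0, 2000))
```

Running the script prints the final action probability vectors for both variants, which allows an informal side-by-side comparison under the assumed environment. The step sizes are arbitrary choices for illustration.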

Additional Metadata
Persistent URL dx.doi.org/10.1109/21.59983
Journal IEEE Transactions on Systems, Man and Cybernetics
Christensen, J.P.R., & Oommen, J. (1990). Epsilon-Optimal Stubborn Learning Mechanisms. IEEE Transactions on Systems, Man and Cybernetics, 20(5), 1209–1216. doi:10.1109/21.59983