A learning automaton is a finite state machine which learns the optimal action from a set of actions offered to it by an environment. The automata considered have a variable structure and hence are completely described by their action probability updating functions. The action probabilities can take only a finite number of prespecified values. These values increase linearly and divide the interval [0, 1] into a number of equal-length subintervals. The automata update the action probabilities only if the environment responds with a reward, and hence they are called Discretized Linear Reward-Inaction (DL_RI) automata. The asymptotic optimality of this family of automata is proved for all environments.
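
The scheme described above can be illustrated with a minimal sketch of a two-action automaton. The abstract does not give the exact updating functions, so the rule used here is an assumption: on a reward, the probability of the chosen action moves up by one grid step of size 1/N, and on a penalty nothing changes. The class name `TwoActionDLRI`, the `resolution` parameter (N), the helper `simulate`, and the penalty-probability model of the environment are all hypothetical names introduced for illustration.

```python
import random


class TwoActionDLRI:
    """Sketch of a two-action discretized linear reward-inaction automaton.

    Assumed rule (not taken from the paper): the probability of action 0
    lives on the grid {0, 1/N, ..., 1} and moves one grid step toward the
    rewarded action; penalties leave it unchanged.
    """

    def __init__(self, resolution, rng=None):
        if resolution < 2 or resolution % 2 != 0:
            raise ValueError("resolution must be an even integer >= 2")
        self.n = resolution
        self.k = resolution // 2  # integer grid index; start at p = 1/2
        self.rng = rng or random.Random()

    @property
    def p1(self):
        """Current probability of choosing action 0."""
        return self.k / self.n

    def choose(self):
        """Sample an action (0 or 1) from the current probabilities."""
        return 0 if self.rng.random() < self.p1 else 1

    def update(self, action, rewarded):
        """Apply the reward-inaction rule for the action just taken."""
        if not rewarded:
            return  # inaction on penalty
        if action == 0:
            self.k = min(self.k + 1, self.n)
        else:
            self.k = max(self.k - 1, 0)


def simulate(penalty_probs, resolution=100, steps=10_000, seed=0):
    """Run the automaton against a stationary random environment.

    penalty_probs[i] is the (assumed) probability that action i is
    penalized. Returns the final probability of choosing action 0.
    """
    rng = random.Random(seed)
    automaton = TwoActionDLRI(resolution, rng)
    for _ in range(steps):
        action = automaton.choose()
        rewarded = rng.random() >= penalty_probs[action]
        automaton.update(action, rewarded)
    return automaton.p1


if __name__ == "__main__":
    # Action 0 is penalized less often, so it is the better action here.
    print(simulate((0.2, 0.6)))
```

Storing the state as an integer grid index rather than a floating-point probability keeps the probability exactly on the prespecified values. Under the assumed rule, the end points 0 and 1 are absorbing, since an action with probability 0 is never chosen again.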

Additional Metadata
Conference: Proceedings of the 1984 Conference on Information Sciences and Systems.
Citation
Oommen, J., & Hansen, Eldon. (1984). OPTIMAL PROPERTIES OF TWO ACTION DISCRETIZED LINEAR REWARD-INACTION LEARNING AUTOMATA. In Proceedings of the 1984 Conference on Information Sciences and Systems (pp. 658–662).