In this paper we present a foundational basis for optimal and information theoretic, syntactic Pattern Recognition (PR) for syntactic patterns which can be “linearly” represented as strings. In an earlier paper Oommen and Kashyap [25] we had presented a formal basis for designing such systems when the errors involved were arbitrarily distributed Substitution, Insertion and Deletion (SID) syntactic errors. In this paper we generalize the framework and permit these traditional errors and Generalized Transposition (GT) errors. We do this by developing a rigorous model, MG*, for channels which permit all these errors in an arbitrarily distributed manner. The scheme is Functionally Complete and stochastically consistent. Besides the synthesis aspects, we also deal with the analysis of such a model and derive a technique by which Pr[Y|U], the probability of receiving Y given that U was transmitted, can be computed in quartic time using dynamic programming. Experimental results which involve dictionaries with strings of lengths between 7 and 14 with an overall average noise of 70.5% demonstrate the superiority of our system over existing methods.

Additional Metadata
Series Lecture Notes in Computer Science
Citation
Oommen, J, & Loke, R.K.S. (R. K.S.). (1996). Optimal and information theoretic syntactic pattern recognition involving traditional and transposition errors. In Lecture Notes in Computer Science.