We have solved the following problem using pattern classification techniques (PCT): given two histogram methods, M1 and M2, used in query optimization, if the estimation accuracy of M1 is greater than that of M2, then M1 has a higher probability of leading to the optimal query evaluation plan (QEP) than M2. To the best of our knowledge, this problem has been open for at least two decades, the difficulty of the problem partially being due to the hurdles involved in the formulation itself. By formulating the problem from a pattern recognition perspective, we use PCT to present a rigorous mathematical proof of this fact, and show some uniqueness results. We also report empirical results demonstrating the power of these theoretical results on well-known histogram estimation methods.

Additional Metadata
Persistent URL dx.doi.org/10.1109/DASFAA.2001.916393
Conference 7th International Conference on Database Systems for Advanced Applications, DASFAA 2001
Citation
Oommen, J, & Rueda, L.G. (L. G.). (2001). Histogram methods in query optimization: The relation between accuracy and optimality. In Proceedings - 7th International Conference on Database Systems for Advanced Applications, DASFAA 2001 (pp. 320–326). doi:10.1109/DASFAA.2001.916393