As demand for cloud computing grows, more and more organizations outsource their data and query services to the cloud for cost savings and flexibility. Consider an organization whose many users issue queries through multiple cloud-deployed proxy servers to achieve cost efficiency and load balancing. Given n queries, each expressed as several keywords, and k proxy servers, the problem is how to classify the n queries into k groups so as to minimize both the difference between groups and the number of distinct keywords in all groups. Since this problem is NP-hard, it is solved in both mathematical and heuristic ways. Mathematical grouping uses a local optimization method, and heuristic grouping is based on k-means. In addition, two extensions are provided: the first focuses on robustness, i.e., each user obtains search results even if some proxy servers fail; the second focuses on benefit, i.e., each user can retrieve as many files of potential interest as possible without increasing the sum. Extensive evaluations have been conducted on both a synthetic dataset and real query traces to verify the effectiveness of our strategies.
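
To make the grouping objective concrete, the following is a minimal illustrative sketch. It is not the paper's local optimization method or its k-means-based heuristic. It assumes each query is a set of keyword strings, measures the difference between groups as the gap between the largest and smallest group sizes, and uses a simple greedy rule. The names `group_cost`, `greedy_group`, and `balance_weight` are hypothetical and do not come from the paper.

```python
from typing import List, Set, Tuple

Query = Set[str]


def group_cost(groups: List[List[Query]]) -> Tuple[int, int]:
    """Return (sum of distinct keywords per group, largest minus smallest group size)."""
    keyword_sum = sum(len(set().union(*g)) for g in groups)
    sizes = [len(g) for g in groups]
    return keyword_sum, max(sizes) - min(sizes)


def greedy_group(queries: List[Query], k: int, balance_weight: float = 1.0) -> List[List[Query]]:
    """Greedy heuristic (an assumption for illustration, not the authors' algorithm).

    Each query goes to the group where it adds the fewest new keywords,
    with a penalty proportional to the group's current size so that
    groups stay roughly balanced.
    """
    groups: List[List[Query]] = [[] for _ in range(k)]
    keywords: List[Set[str]] = [set() for _ in range(k)]
    # Place queries with more keywords first so they seed the groups.
    for q in sorted(queries, key=len, reverse=True):
        best = min(
            range(k),
            key=lambda i: len(q - keywords[i]) + balance_weight * len(groups[i]),
        )
        groups[best].append(q)
        keywords[best] |= q
    return groups


if __name__ == "__main__":
    sample = [{"a", "b"}, {"a", "b", "c"}, {"x", "y"}, {"x", "z"}]
    result = greedy_group(sample, k=2)
    print(group_cost(result))
```

Raising `balance_weight` favors evenly sized groups, while lowering it favors keyword overlap within a group. This is the trade-off between load balancing and cost efficiency that the problem statement describes.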

Additional Metadata
Keywords benefit, cloud computing, cost efficiency, load balancing, robustness
Journal Journal of Computer Science and Technology
Liu, Q. (Qin), Guo, Y., Wu, J. (Jie), & Wang, G. (Guojun). (2017). Effective Query Grouping Strategy in Clouds. Journal of Computer Science and Technology, 32(6), 1231–1249. doi:10.1007/s11390-017-1797-9