We consider the Micro-Aggregation Problem (MAP) in secure statistical databases, which involves partitioning a set of individual records in a micro-data file into a number of mutually exclusive and exhaustive groups. This problem, which seeks the best partition of the micro-data file, is known to be NP-hard and has been tackled using many heuristic solutions. In this paper, we demonstrate that, when developing Micro-Aggregation Techniques (MATs), it is expedient to incorporate information about the dependence between the random variables in the micro-data file. This can be achieved by pre-processing the micro-data before invoking any MAT, so as to extract the useful dependence information from the joint probability distribution of the variables, and then performing the micro-aggregation on the "maximally independent" variables. Our results on real-life data sets show that including such information improves the process of determining both how many variables should be used in the micro-aggregation process and which of them should be used.
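
The two-stage idea described above (first select weakly dependent variables, then micro-aggregate on them) can be illustrated with a minimal sketch. It is not the method of the paper: absolute Pearson correlation is assumed here as a simple stand-in for the dependence information, the exhaustive subset search and the sort-and-cut grouping are illustrative choices, and the names `pearson`, `least_dependent_subset` and `micro_aggregate` are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy) if vx > 0 and vy > 0 else 0.0


def least_dependent_subset(records, m):
    """Return the m column indices with the smallest summed pairwise |correlation|.

    Assumption: correlation approximates dependence. The search is exhaustive,
    so it is only practical for a small number of variables.
    """
    cols = list(zip(*records))

    def score(subset):
        return sum(abs(pearson(cols[i], cols[j])) for i, j in combinations(subset, 2))

    return min(combinations(range(len(cols)), m), key=score)


def micro_aggregate(records, selected, k):
    """Group records using only the selected variables, then publish centroids.

    Records are sorted lexicographically on the selected columns and cut into
    consecutive groups of k; a short final group is merged into the previous
    one so that every group holds at least k records.
    """
    order = sorted(range(len(records)), key=lambda r: [records[r][c] for c in selected])
    groups = [order[i:i + k] for i in range(0, len(order), k)]
    if len(groups) > 1 and len(groups[-1]) < k:
        tail = groups.pop()
        groups[-1].extend(tail)
    width = len(records[0])
    masked = [None] * len(records)
    for group in groups:
        centroid = [sum(records[r][c] for r in group) / len(group) for c in range(width)]
        for r in group:
            masked[r] = centroid
    return groups, masked


# Toy micro-data: column 1 is exactly twice column 0, column 2 is only weakly related.
records = [(1, 2, 5), (2, 4, 1), (3, 6, 4), (4, 8, 2), (5, 10, 6), (6, 12, 3)]
selected = least_dependent_subset(records, m=2)
groups, masked = micro_aggregate(records, selected, k=3)
```

Because each record is replaced by its group centroid, the column means of the masked file equal those of the original, while every published value is shared by at least `k` records.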

Additional Metadata
Persistent URL dx.doi.org/10.1007/978-3-540-70500-0-30
Series Lecture Notes in Computer Science
Oommen, J., & Fayyoumi, E. (Ebaa). (2008). Enhancing micro-aggregation technique by utilizing dependence-based information in secure statistical databases. In Lecture Notes in Computer Science. doi:10.1007/978-3-540-70500-0-30