Declarative entity resolution via matching dependencies and answer set programs
Entity resolution (ER) is an important and common problem in data cleaning. It is about identifying and merging records in a database that represent the same real-world entity. Recently, matching dependencies (MDs) have been introduced and investigated as declarative rules that specify ER. In general, an ER process induced by MDs over a dirty instance leads to multiple clean instances. In this work, we present disjunctive answer set programs (with stable model semantics) that capture through their models the class of alternative clean instances obtained after an ER process based on MDs. With these programs, we can obtain clean answers to queries, i.e., those that are invariant under the clean instances, by skeptically reasoning from the program. We investigate the ER programs in terms of expressive power for the ER task at hand. As an important special and practical case of ER, we provide a declarative reconstruction of the so-called union-case ER methodology, as presented through a generic approach to ER (the so-called Swoosh approach).
Conference: 13th International Conference on the Principles of Knowledge Representation and Reasoning, KR 2012
Bahmani, Z. (Zeinab), Bertossi, L., Kolahi, S. (Solmaz), & Lakshmanan, L.V.S. (Laks V.S.). (2012). Declarative entity resolution via matching dependencies and answer set programs. Presented at the 13th International Conference on the Principles of Knowledge Representation and Reasoning, KR 2012.