Entity Resolution with Markov Logic
Parag Singla and Pedro Domingos
Abstract:
Entity resolution is the problem of determining which records in a
database refer to the same entities, and is a crucial and expensive
step in the data mining process. Interest in it has grown rapidly in
recent years, and many approaches have been proposed. However, they
tend to address only isolated aspects of the problem, and are often
ad hoc. This paper proposes a well-founded, integrated solution to
the entity resolution problem based on Markov logic. Markov logic combines
first-order logic and probabilistic graphical models by attaching
weights to first-order formulas, and viewing them as templates for
features of Markov networks. We show how a number of previous
approaches can be formulated and seamlessly combined in Markov logic,
and how the resulting learning and inference problems can be solved
efficiently. Experiments on two citation databases show the utility of
this approach, and evaluate the contribution of the different
components.
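
To make the idea of weighted formulas concrete, here is a minimal sketch in Python of a toy Markov logic network for entity resolution. It is not the paper's system. The three records, the evidence, the three formulas, and all weights are assumptions chosen for illustration. The sketch uses the standard Markov logic definition, in which the probability of a possible world is proportional to the exponential of the weighted count of satisfied formula groundings, and it computes marginals by brute-force enumeration, which is feasible only for a handful of records.

```python
import itertools
import math

# Hypothetical records and evidence: which pairs have similar name strings.
records = ["r1", "r2", "r3"]
similar = {("r1", "r2"), ("r2", "r3")}

# Query atoms: Same(x, y) for each unordered pair (Same is treated as symmetric).
pairs = list(itertools.combinations(records, 2))

# Illustrative weights (assumed, not learned from data).
W_PRIOR = 1.0    # !Same(x, y): most pairs are non-matches
W_SIMILAR = 2.0  # Similar(x, y) => Same(x, y)
W_TRANS = 3.0    # Same(x, y) ^ Same(y, z) => Same(x, z)


def same(world, x, y):
    """Truth value of Same(x, y) in a world keyed by sorted pairs."""
    if x == y:
        return True
    return world[tuple(sorted((x, y)))]


def log_score(world):
    """Sum of weights over all satisfied groundings of the three formulas."""
    score = 0.0
    for pair in pairs:
        if not world[pair]:
            score += W_PRIOR
        # An implication holds when its premise is false or its conclusion is true.
        if pair not in similar or world[pair]:
            score += W_SIMILAR
    for x, y, z in itertools.permutations(records, 3):
        if not (same(world, x, y) and same(world, y, z)) or same(world, x, z):
            score += W_TRANS
    return score


def marginals():
    """Exact P(Same(x, y)) for every pair, by enumerating all possible worlds."""
    worlds = [
        dict(zip(pairs, values))
        for values in itertools.product([False, True], repeat=len(pairs))
    ]
    weights = [math.exp(log_score(w)) for w in worlds]
    z = sum(weights)  # partition function
    return {
        p: sum(wt for w, wt in zip(worlds, weights) if w[p]) / z
        for p in pairs
    }


if __name__ == "__main__":
    for pair, prob in marginals().items():
        print(pair, round(prob, 3))
```

In this toy setting the pair (r1, r3) has no direct evidence, yet the transitivity formula still ties its match probability to the other two pairs. Combining separate pieces of knowledge in one model in this way is the kind of integration the abstract describes. Real datasets need approximate inference, because the number of worlds grows exponentially with the number of record pairs.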
Datasets used:
BibServ
Cora