Extracting Semantic Networks from Text via Relational Clustering
Stanley Kok and Pedro Domingos
Abstract:
Extracting knowledge from text has long been a goal of AI. Initial
approaches were purely logical and brittle. More recently, the availability
of large quantities of text on the Web has led to the development of
machine learning approaches. However, to date these have mainly extracted
ground facts, as opposed to general knowledge. Other learning approaches
can extract logical forms, but require supervision and do not scale. In
this paper we present an unsupervised approach to extracting semantic
networks from large volumes of text. We use the TextRunner system to
extract tuples from text, and then induce general concepts and relations
from them by jointly clustering the objects and relational strings in the
tuples. Our approach is defined in Markov logic using four simple rules.
Experiments on a dataset of two million tuples show that it outperforms
three other relational clustering approaches, and extracts meaningful
semantic networks.
Download:
Paper (PDF)
Code
Slides
Video
Derivation of Log-Posterior
Fragments of a Semantic Network Learned by SNE