Scaling Textual Inference to the Web
Daniel S. Weld
Most Web-based Q/A systems work by finding
pages that contain an explicit answer to
a question. These systems are helpless if the
answer has to be inferred from multiple sentences,
possibly on different pages. To solve
this problem, we introduce the HOLMES system,
which utilizes textual inference (TI) over
tuples extracted from text.
Whereas previous work on TI (e.g., the literature
on textual entailment) has been applied
to paragraph-sized texts, HOLMES utilizes
knowledge-based model construction to
scale TI to a corpus of 117 million Web pages.
Given only a few minutes, HOLMES doubles
recall for example queries in three disparate
domains (geography, business, and nutrition).
Importantly, HOLMES's runtime is linear in
the size of its input corpus due to a surprising
property of many textual relations in the Web
corpus: they are "approximately" functional
in a well-defined sense.
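
As a rough illustration of inferring an answer from tuples that come from different sentences, the sketch below joins two kinds of extracted tuples with a single hand-written rule. The tuples, the relation names (contains, prevents, helps_prevent), and the function infer_helps_prevent are all hypothetical and chosen only for illustration; this is not the HOLMES implementation, and it does not reproduce knowledge-based model construction.

```python
from collections import defaultdict

# Hypothetical extracted tuples (relation, arg1, arg2). Each one is imagined
# to come from a different sentence, possibly on a different page.
TUPLES = [
    ("contains", "kale", "calcium"),
    ("contains", "spinach", "iron"),
    ("prevents", "calcium", "osteoporosis"),
]


def infer_helps_prevent(tuples):
    """Apply one illustrative rule:
    contains(x, y) and prevents(y, z) => helps_prevent(x, z).
    """
    # Index the "prevents" tuples by their first argument.
    prevents = defaultdict(set)
    for rel, a, b in tuples:
        if rel == "prevents":
            prevents[a].add(b)

    # Join every "contains" tuple against that index.
    derived = set()
    for rel, a, b in tuples:
        if rel == "contains":
            for z in prevents.get(b, ()):
                derived.add(("helps_prevent", a, z))
    return derived


print(sorted(infer_helps_prevent(TUPLES)))
```

In this toy join, the work per "contains" tuple is bounded by the number of values its second argument maps to in the index, so when each key maps to only a few values the total cost grows roughly linearly with the number of tuples. That is offered only as intuition for why near-functional relations help; the precise definition is the one given in the paper.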