Next: 8 Information Extraction Up: The Alchemy Tutorial Previous: 6 Entity Resolution

7 Hidden Markov Models

A very effective and intuitive approach to many sequential pattern recognition tasks, such as speech recognition, protein sequence analysis, machine translation, and many others, is to use a hidden Markov model (HMM). We demonstrate the modeling of an HMM on two examples. In order to convey the translation of HMMs to MLNs, we use a small traffic example. We then show a real-world example of applying an HMM in Markov logic to perform segmentation of Cora citations.

Suppose, on a given day, we observe a car taking three actions: it is either stopped, driving, or slowing. We assume this is only dependent on the state of the stoplight in front of it: red, green or yellow. In a Markov process we need to model states and observations at certain points in time. In Alchemy, we model a state and observation with a first-order predicate and time is a variable in each of these predicates, i.e.:

state = {Stop, Drive, Slow}
obs = {Red, Green, Yellow}
time = {0,...,10}

State(state, time)
Obs(obs, time)

In Markov logic we need to explicitly express some constraints which are inherent to HMMs; in particular, we need to state that at each time step there is exactly one state and observation. This is achieved with the ! operator:

State(s!, t)
Obs(o!, t)

Now, we need for each state and its successor, the probability of this transition and for each observation-state pair, we need the probability of this emission. In addition, we need the prior probability of each state at the starting time point. These probabilities can be represented in Alchemy in three formulas by using the + operator:

State(+s, 0)
State(+s1, t) => State(+s2, t+1)
Obs(+o, t) => State(+s, t)

By using the + operator, we generate a clause for each state/state and observation/state pair, thus enabling us to learn weights for each of these combinations. These are the observation and transition matrices of our HMM.

We can learn weights from the data with:

learnwts -d -i traffic.mln -o traffic-out.mln -t traffic-train.db -ne State

Then, we can find the most likely sequence of states by querying all State atoms:

infer -m -i traffic-out.mln -r traffic.result -e traffic-test.db -q State

Although this is a toy example, it showcases the intuitiveness and many of the capabilities of Alchemy: block variables, weight learning with EM and compactness of representation. We now move on to a real-world application of HMMs, segmentation of citations, and how to achieve this with Alchemy.

Next: 8 Information Extraction Up: The Alchemy Tutorial Previous: 6 Entity Resolution

Marc Sumner 2010-01-22