7 Hidden Markov Models

A very effective and intuitive approach to many sequential pattern recognition tasks, such as speech recognition, protein sequence analysis, and machine translation, is the hidden Markov model (HMM). We demonstrate how to model an HMM in Markov logic with two examples. To convey the translation of HMMs into MLNs, we use a small traffic example; we then show a real-world example of applying an HMM in Markov logic to segment Cora citations.

Suppose, on a given day, we observe a car taking one of three actions at
each time step: it is either stopped, driving, or slowing down. We assume
the car's action depends only on the state of the stoplight in front of
it: red, green, or yellow.
In a Markov process we need to model *states* and *observations* at
certain points in *time*. In Alchemy, we model a state and observation
with a first-order predicate and time is a variable in each of these
predicates, i.e.:

state = {Stop, Drive, Slow}
obs = {Red, Green, Yellow}
time = {0,...,10}
State(state, time)
Obs(obs, time)

In Markov logic we need to explicitly express some constraints which are
inherent to HMMs; in particular, we need to state that at each time step
there is exactly one state and exactly one observation. This is achieved
by appending the `!` operator to the corresponding argument in the
predicate declarations:

State(state!, time)
Obs(obs!, time)

Next, for each state and its successor we need the probability of that
*transition*, and for each observation-state pair we need the probability
of that *emission*. In addition, we need the prior probability of each
state at the starting time point. These probabilities can be represented
in Alchemy by three formulas using the `+` operator:

State(+s, 0)
State(+s1, t) => State(+s2, t+1)
Obs(+o, t) => State(+s, t)

By using the `+` operator, we generate a separate clause for each
state/state and observation/state pair, thus enabling us to learn a
weight for each of these combinations. These weights play the role of the
transition and observation matrices of our HMM.
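
For illustration, the `+`-expanded transition formula yields one weighted clause per state pair (nine in total here). After learning, the output file might contain entries of the following form, where the weights shown are purely hypothetical:

1.8   State(Stop, t) => State(Stop, t+1)
0.4   State(Stop, t) => State(Drive, t+1)
-0.7  State(Stop, t) => State(Slow, t+1)

Roughly speaking, a higher weight makes the corresponding transition more likely, all else being equal.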

We can learn weights from the data with:

learnwts -d -i traffic.mln -o traffic-out.mln -t traffic-train.db -ne State
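
For learning, the training database lists the observation and state atoms as ground facts, one per line. A fragment of a hypothetical `traffic-train.db` might look like this:

Obs(Red, 0)
State(Stop, 0)
Obs(Green, 1)
State(Drive, 1)
Obs(Yellow, 2)
State(Slow, 2)

At test time, the evidence database would contain only the `Obs` atoms, leaving the `State` atoms to be inferred.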

Then, we can find the most likely sequence of states by querying all `State`
atoms:

infer -m -i traffic-out.mln -r traffic.result -e traffic-test.db -q State
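
Conceptually, this MAP query plays the same role as Viterbi decoding in a classical HMM. The following standalone Python sketch decodes the toy traffic model; all probabilities are illustrative assumptions, not values learned by Alchemy (learned weights would instead come from `traffic-out.mln`):

```python
# Viterbi decoding for the toy traffic HMM.
# All probabilities below are hand-picked for illustration only.

states = ["Stop", "Drive", "Slow"]

# P(state at time 0)
prior = {"Stop": 0.4, "Drive": 0.4, "Slow": 0.2}
# P(next state | current state)
trans = {
    "Stop":  {"Stop": 0.6, "Drive": 0.3, "Slow": 0.1},
    "Drive": {"Stop": 0.1, "Drive": 0.7, "Slow": 0.2},
    "Slow":  {"Stop": 0.5, "Drive": 0.2, "Slow": 0.3},
}
# P(observation | state)
emit = {
    "Stop":  {"Red": 0.8, "Green": 0.1, "Yellow": 0.1},
    "Drive": {"Red": 0.1, "Green": 0.8, "Yellow": 0.1},
    "Slow":  {"Red": 0.1, "Green": 0.1, "Yellow": 0.8},
}

def viterbi(observations):
    """Return the most likely state sequence for the observations."""
    # delta[s] = probability of the best path ending in state s
    delta = {s: prior[s] * emit[s][observations[0]] for s in states}
    backptr = []
    for o in observations[1:]:
        prev = delta
        delta, ptr = {}, {}
        for s in states:
            # Best predecessor of s, given the best paths so far.
            best = max(states, key=lambda r: prev[r] * trans[r][s])
            ptr[s] = best
            delta[s] = prev[best] * trans[best][s] * emit[s][o]
        backptr.append(ptr)
    # Trace back from the best final state.
    last = max(states, key=lambda s: delta[s])
    path = [last]
    for ptr in reversed(backptr):
        path.append(ptr[path[-1]])
    return list(reversed(path))

print(viterbi(["Red", "Red", "Green", "Yellow"]))
```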

Although this is a toy example, it showcases the intuitiveness and many of the capabilities of Alchemy: block variables, weight learning for each parameter of the model, and compactness of representation. We now move on to a real-world application of HMMs, the segmentation of citations, and show how to achieve it with Alchemy.