10 Bayesian Networks

Bayesian networks are among the most popular and widespread graphical models, and many people outside AI and machine learning are familiar with them. This framework can be easily implemented in Markov logic; we show how this is done on the classic ALARM Bayesian network, which is used to monitor patients in intensive care units and contains 37 nodes and 46 arcs.

In a Bayesian network, nodes represent discrete variables and arcs the
dependencies between them. A conditional probability table (CPT) is associated
with each node indicating the probability distribution for the variable
conditioned on its parents. We want to represent every variable in a Bayesian
network as a predicate; thus, we declare for each variable `Var` a unary
predicate `Var(varValue!)`, indicating that each variable can only take on
one of its values (i.e. if `CVP` is `LOW`, it cannot be `NORMAL` or
`HIGH`). This results in the following predicate declarations:

HISTORY(historyValue!)
CVP(cvpValue!)
PCWP(pcwpValue!)
HYPOVOLEMIA(hypovolemiaValue!)
LVEDVOLUME(lvedvolumeValue!)
LVFAILURE(lvfailureValue!)
STROKEVOLUME(strokevolumeValue!)
ERRLOWOUTPUT(errlowoutputValue!)
HRBP(hrbpValue!)
HREKG(hrekgValue!)
ERRCAUTER(errcauterValue!)
HRSAT(hrsatValue!)
INSUFFANESTH(insuffanesthValue!)
ANAPHYLAXIS(anaphylaxisValue!)
TPR(tprValue!)
EXPCO2(expco2Value!)
KINKEDTUBE(kinkedtubeValue!)
MINVOL(minvolValue!)
FIO2(fio2Value!)
PVSAT(pvsatValue!)
SAO2(sao2Value!)
PAP(papValue!)
PULMEMBOLUS(pulmembolusValue!)
SHUNT(shuntValue!)
INTUBATION(intubationValue!)
PRESS(pressValue!)
DISCONNECT(disconnectValue!)
MINVOLSET(minvolsetValue!)
VENTMACH(ventmachValue!)
VENTTUBE(venttubeValue!)
VENTLUNG(ventlungValue!)
VENTALV(ventalvValue!)
ARTCO2(artco2Value!)
CATHECOL(cathecolValue!)
HR(hrValue!)
CO(coValue!)
BP(bpValue!)
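The `!` declarations above make each predicate functional: in every possible world, exactly one ground atom per variable is true. The effect can be sketched by enumerating worlds for a single three-valued variable (the value names follow the `CVP` example from the text):

```python
import itertools

# The three possible values of CVP; `CVP(cvpValue!)` means exactly one
# of CVP(LOW), CVP(NORMAL), CVP(HIGH) is true in any world.
values = ("LOW", "NORMAL", "HIGH")

# All 2^3 truth assignments to the three ground atoms.
worlds = itertools.product([False, True], repeat=len(values))

# The `!` constraint keeps only worlds with exactly one true atom.
valid = [w for w in worlds if sum(w) == 1]
print(len(valid))  # 3 valid worlds out of 8
```

This is why no explicit mutual-exclusivity or exhaustiveness clauses are needed later in the conversion.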

As shown in [], we can convert a Bayesian network to a weighted CNF
expression (an MLN) by producing a clause for each line of each CPT and each
value of the variable, and by enforcing mutually exclusive and exhaustive
constraints on the variables (we have already achieved this with the `!`
operator). Each clause contains the negation of each variable in the row of
the CPT, and its weight is `log(p / (1 - p))`, where `p` is the corresponding
probability in the CPT. Entries with zero probability cannot be translated in
this manner; however, this problem can be solved by making these lines of the
CPT hard clauses and not negating the variables. For our example we arrive at
the MLN found in `alarm.mln`.
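A minimal sketch of the weight computation just described, assuming CPT probabilities are given as floats; the function name `cpt_row_weight` and the convention of returning `None` to signal "translate as a hard clause" are our own:

```python
import math

def cpt_row_weight(p):
    """Weight for one CPT row with probability p: log(p / (1 - p)).

    Rows with p == 0 have no finite weight and must become hard clauses
    with un-negated variables, as described in the text; p == 1 is the
    symmetric degenerate case. Both are signalled by returning None.
    """
    if p == 0.0 or p == 1.0:
        return None  # translate this row as a hard clause instead
    return math.log(p / (1 - p))

print(round(cpt_row_weight(0.8), 4))  # 1.3863, i.e. log(4)
print(cpt_row_weight(0.0))            # None: hard clause
```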

Note that these weights could easily be learned from training data, either discriminatively or generatively. In the special case of Bayesian networks, the partition function is known (it is 1), so the weights can be computed exactly.
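The partition-function claim can be checked numerically on a toy network. The sketch below uses the equivalent feature-per-CPT-entry formulation (one feature per CPT entry, with weight `log p`), and the two-node network `A -> B` with its probabilities is entirely made up for illustration:

```python
import math

# Hypothetical two-node Bayesian network A -> B (binary variables).
P_A = {0: 0.3, 1: 0.7}
P_B_GIVEN_A = {(0, 0): 0.9, (0, 1): 0.1, (1, 0): 0.4, (1, 1): 0.6}

def feature_weight_sum(a, b):
    # One feature per CPT entry with weight log p; exactly one feature
    # per CPT fires in any world, so the sum telescopes to log P(a, b).
    return math.log(P_A[a]) + math.log(P_B_GIVEN_A[(a, b)])

# The unnormalized MLN probabilities already sum to 1, so Z = 1.
Z = sum(math.exp(feature_weight_sum(a, b)) for a in (0, 1) for b in (0, 1))
print(round(Z, 10))  # 1.0
```

Because Z is known in advance, the maximum-likelihood weights are just the logarithms of the (smoothed) empirical CPT probabilities, with no iterative optimization needed.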