Bayesian networks are among the most widely used graphical models, and many people from fields other than AI and machine learning are familiar with them. This framework can be easily implemented in Markov logic; we show how this is done on the classic ALARM Bayesian network, which is used to monitor patients in intensive care units. It contains 37 nodes and 46 arcs.
In a Bayesian network, nodes represent discrete variables and arcs represent the dependencies between them. A conditional probability table (CPT) is associated with each node, giving the probability distribution of the variable conditioned on its parents. We want to represent every variable in a Bayesian network as a predicate; thus, for each variable Var we declare a unary predicate Var(varValue!), where the ! operator indicates that the variable takes exactly one of its values (e.g., if CVP is LOW, it cannot be NORMAL or HIGH). This results in the following predicate declarations:
HISTORY(historyValue!)
CVP(cvpValue!)
PCWP(pcwpValue!)
HYPOVOLEMIA(hypovolemiaValue!)
LVEDVOLUME(lvedvolumeValue!)
LVFAILURE(lvfailureValue!)
STROKEVOLUME(strokevolumeValue!)
ERRLOWOUTPUT(errlowoutputValue!)
HRBP(hrbpValue!)
HREKG(hrekgValue!)
ERRCAUTER(errcauterValue!)
HRSAT(hrsatValue!)
INSUFFANESTH(insuffanesthValue!)
ANAPHYLAXIS(anaphylaxisValue!)
TPR(tprValue!)
EXPCO2(expco2Value!)
KINKEDTUBE(kinkedtubeValue!)
MINVOL(minvolValue!)
FIO2(fio2Value!)
PVSAT(pvsatValue!)
SAO2(sao2Value!)
PAP(papValue!)
PULMEMBOLUS(pulmembolusValue!)
SHUNT(shuntValue!)
INTUBATION(intubationValue!)
PRESS(pressValue!)
DISCONNECT(disconnectValue!)
MINVOLSET(minvolsetValue!)
VENTMACH(ventmachValue!)
VENTTUBE(venttubeValue!)
VENTLUNG(ventlungValue!)
VENTALV(ventalvValue!)
ARTCO2(arco2Value!)
CATHECOL(cathecolValue!)
HR(hrValue!)
CO(coValue!)
BP(bpValue!)
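The naming scheme described above (predicate Var with argument type varValue!) is regular enough to generate mechanically. The following Python sketch shows one way this might be done; the helper name declare is hypothetical and not part of any tool, and only three of the variables are listed to keep the example short.

```python
def declare(variable: str) -> str:
    """Build a declaration of the form VAR(varValue!) for one network variable.

    Hypothetical helper: the lowercase-name-plus-"Value" type name simply
    mirrors the convention described in the text.
    """
    return f"{variable}({variable.lower()}Value!)"


# A few of the ALARM variable names from the list above.
variables = ["HISTORY", "CVP", "PCWP"]
declarations = [declare(v) for v in variables]

for line in declarations:
    print(line)
```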
As shown in , we can convert a Bayesian network to a weighted CNF expression (an MLN) by producing a clause for each row of each CPT and each value of the variable, and by enforcing mutually exclusive and exhaustive constraints on the variables (we have already achieved this with the ! operator). Each clause contains the negation of each variable in the row of the CPT, and its weight is $-\ln(p)$, where $p$ is the corresponding probability in the CPT. Entries with zero probability cannot be translated in this manner, since $-\ln(0)$ is not finite; however, this problem can be solved by making these lines of the CPT hard clauses and not negating the variables. For our example we arrive at the MLN found in alarm.mln.
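To make the translation concrete, here is a minimal Python sketch that turns one CPT into weighted clauses, assuming each clause is the disjunction of the negated literals of its row and carries weight $-\ln(p)$. The variables A and B, their values, and the probabilities are made up for illustration and are not taken from ALARM; the function name cpt_to_clauses is likewise hypothetical. Rows with zero probability are only collected, not translated, because they have to be handled as hard clauses.

```python
import math


def cpt_to_clauses(child, parents, cpt):
    """Translate one CPT into (weight, clause) pairs.

    cpt maps a tuple (parent values..., child value) to a probability.
    Returns the weighted clauses plus the clauses of zero-probability rows,
    which have no finite weight and must be treated as hard clauses.
    """
    weighted, zero_rows = [], []
    for row, p in cpt.items():
        *parent_values, child_value = row
        # Negate every variable assignment that appears in this CPT row.
        literals = [f"!{name}({value})" for name, value in zip(parents, parent_values)]
        literals.append(f"!{child}({child_value})")
        clause = " v ".join(literals)
        if p == 0.0:
            zero_rows.append(clause)
        else:
            weighted.append((-math.log(p), clause))
    return weighted, zero_rows


# Illustrative CPT for a hypothetical variable B with a single parent A.
cpt_b = {
    ("TRUE", "LOW"): 0.1,
    ("TRUE", "NORMAL"): 0.6,
    ("TRUE", "HIGH"): 0.3,
    ("FALSE", "LOW"): 0.0,
    ("FALSE", "NORMAL"): 0.5,
    ("FALSE", "HIGH"): 0.5,
}

weighted, zero_rows = cpt_to_clauses("B", ["A"], cpt_b)
for weight, clause in weighted:
    print(f"{weight:.4f}  {clause}")
print("rows needing hard clauses:", zero_rows)
```

Rows with probability close to 1 receive weights close to 0, while unlikely rows receive large weights, so a world that matches an unlikely row violates a heavily weighted clause.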
Note that these weights could also be learned from training data, either discriminatively or generatively. In the case of Bayesian networks, however, the partition function is known (it is 1), so the weights can be computed exactly from the CPTs.
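The claim about the partition function can be checked on a toy network. The sketch below assumes clause weights of $-\ln(p)$ and scores each world by $e^{-w}$ for every clause it violates; this differs from the usual log-linear form, which sums the weights of satisfied clauses, only by a constant factor that cancels on normalization. The two-node network A -> B and its probabilities are invented for illustration.

```python
import itertools
import math

# Made-up CPTs for a two-node network A -> B (not from ALARM).
p_a = {"TRUE": 0.3, "FALSE": 0.7}
p_b_given_a = {
    ("TRUE", "TRUE"): 0.9,
    ("TRUE", "FALSE"): 0.1,
    ("FALSE", "TRUE"): 0.2,
    ("FALSE", "FALSE"): 0.8,
}

# One clause per CPT row, each with weight -ln(p).
w_a = {a: -math.log(p) for a, p in p_a.items()}
w_b = {row: -math.log(p) for row, p in p_b_given_a.items()}


def world_weight(a, b):
    """Unnormalized weight of the world (A=a, B=b).

    A world violates exactly the clauses of the CPT rows it matches,
    and each violated clause contributes a factor exp(-weight).
    """
    return math.exp(-(w_a[a] + w_b[(a, b)]))


values = ["TRUE", "FALSE"]
z = sum(world_weight(a, b) for a, b in itertools.product(values, values))
print(z)
```

Each world's weight equals the product of the CPT entries it selects, which is the joint probability of that world, so the sum over all worlds is 1 up to floating-point error.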