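The Bayesian Dirichlet (BD) scoring function assigns a score to a Bayesian network structure given a data set. The score is computed entirely from counts: how often each variable takes each of its values under each configuration of its parents.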
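For reference, here is the form of the score given in [Cooper92]. It corresponds to a BD score in which every Dirichlet hyperparameter is set to 1. Whether the APIs discussed in this post use exactly this prior is an assumption on my part, so treat this as a sketch rather than a statement of what the library computes.

```latex
P(B_S, D) = P(B_S) \prod_{i=1}^{n} \prod_{j=1}^{q_i}
            \frac{(r_i - 1)!}{(N_{ij} + r_i - 1)!} \prod_{k=1}^{r_i} N_{ijk}!
```

Here n is the number of variables, r_i is the number of values variable X_i can take, q_i is the number of configurations of the parents of X_i, N_{ijk} is the number of cases in which X_i takes its k-th value while its parents are in their j-th configuration, N_{ij} is the sum of N_{ijk} over k, and P(B_S) is the prior probability of the structure.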
As a weekend project, I created Java and JavaScript APIs for computing the BD scoring function. The project is open source under the Apache License v2.0, and you may download it from GitHub at https://github.com/vangj/multdir-core.
Let’s see how we may quickly use these APIs to compute the score of a Bayesian Belief Network (BBN). In [Cooper92], a set of data with three variables (X1, X2, X3) was given as follows.
| X1 | X2 | X3 |
|----|----|----|
| p | a | a |
| p | p | p |
| a | a | p |
| p | p | p |
| a | a | a |
| a | p | p |
| p | p | p |
| a | a | a |
| p | p | p |
| a | a | a |
Three Bayesian network structures (BS) were also given to represent the relationships among the variables:
- BS1: X1 → X2 → X3
- BS2: X2 ← X1 → X3
- BS3: X1 ← X2 ← X3
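Scoring a structure only needs the counts of each variable's values under each configuration of its parents. As a minimal sketch of where such counts come from, the following tallies them for BS1 from the ten cases above. The names `CountTally`, `DATA`, and `tally` are made up for this illustration and are not part of the project's API; counts are reported in (p, a) order.

```java
import java.util.Arrays;

final class CountTally {
    // The ten cases from the table above; each string is X1, X2, X3.
    static final String[] DATA = {
        "paa", "ppp", "aap", "ppp", "aaa",
        "app", "ppp", "aaa", "ppp", "aaa"
    };

    /**
     * Counts of the child column as {count of 'p', count of 'a'}, using only
     * the rows where the parent column equals parentValue.
     * Pass parent = -1 for a variable that has no parent.
     */
    static int[] tally(int child, int parent, char parentValue) {
        int[] counts = new int[2];
        for (String row : DATA) {
            if (parent >= 0 && row.charAt(parent) != parentValue) {
                continue; // row does not match this parent configuration
            }
            counts[row.charAt(child) == 'p' ? 0 : 1]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        // BS1: X1 -> X2 -> X3
        System.out.println("X1          " + Arrays.toString(tally(0, -1, ' ')));  // [5, 5]
        System.out.println("X2 | X1 = a " + Arrays.toString(tally(1, 0, 'a')));   // [1, 4]
        System.out.println("X2 | X1 = p " + Arrays.toString(tally(1, 0, 'p')));   // [4, 1]
        System.out.println("X3 | X2 = a " + Arrays.toString(tally(2, 1, 'a')));   // [1, 4]
        System.out.println("X3 | X2 = p " + Arrays.toString(tally(2, 1, 'p')));   // [5, 0]
    }
}
```

In the [Cooper92] form of the score, the counts within one parent configuration enter as a product of factorials, so the order in which the two counts are listed does not change the result.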
In Java, we can use the API to quickly estimate the scores of BS1, BS2, and BS3 as follows.

    double bs1 = (new BayesianDirchletBuilder())
        .addKutato(5, 5) // X1
        .addKutato(1, 4) // X2
        .addKutato(4, 1)
        .addKutato(0, 5) // X3
        .addKutato(4, 1)
        .build()
        .get();

    double bs2 = (new BayesianDirchletBuilder())
        .addKutato(5, 5) // X1
        .addKutato(1, 4) // X2
        .addKutato(4, 1)
        .addKutato(2, 3) // X3
        .addKutato(4, 1)
        .build()
        .get();

    double bs3 = (new BayesianDirchletBuilder())
        .addKutato(1, 4) // X1
        .addKutato(4, 1)
        .addKutato(0, 4) // X2
        .addKutato(5, 1)
        .addKutato(6, 4) // X3
        .build()
        .get();

Likewise, in JavaScript, we can also quickly estimate the scores of these BBNs as follows.

    var bs1 = (new BayesianDirichletBuilder())
        .addKutato([5,5])
        .addKutato([1,4])
        .addKutato([4,1])
        .addKutato([0,5])
        .addKutato([4,1])
        .build()
        .get();

    var bs2 = (new BayesianDirichletBuilder())
        .addKutato([5,5])
        .addKutato([1,4])
        .addKutato([4,1])
        .addKutato([2,3])
        .addKutato([4,1])
        .build()
        .get();

    var bs3 = (new BayesianDirichletBuilder())
        .addKutato([1,4])
        .addKutato([4,1])
        .addKutato([0,4])
        .addKutato([5,1])
        .addKutato([6,4])
        .build()
        .get();

Notice how in both APIs, you only add the counts? Easy.
A working demo that uses the JavaScript API to compute the BBN scores is also in the repository for this project. Note that the scores are in log-space. Computing the score directly with factorials is not practical, because the factorials overflow quickly and the resulting probabilities become vanishingly small. Since the logarithm is monotonic, the higher the log-score (that is, the closer to zero), the better the BBN.
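To make the log-space point concrete, here is a minimal sketch that computes the log of the [Cooper92] form of the score from the same count vectors used above. It does not use the project's API and is not its implementation: the class `K2LogScore` and its methods are hypothetical names for this illustration, it assumes all Dirichlet hyperparameters are 1, and it leaves out the structure prior.

```java
final class K2LogScore {
    /** ln(n!) as a sum of logs, so large n never overflows. */
    static double logFactorial(int n) {
        double sum = 0.0;
        for (int i = 2; i <= n; i++) {
            sum += Math.log(i);
        }
        return sum;
    }

    /**
     * Log of one factor of the score for a single (variable, parent
     * configuration) pair: ln[(r-1)! / (N + r - 1)! * prod_k N_k!],
     * where r is the number of values and N is the total count.
     */
    static double logTerm(int... counts) {
        int r = counts.length;
        int total = 0;
        double logProduct = 0.0;
        for (int c : counts) {
            total += c;
            logProduct += logFactorial(c);
        }
        return logFactorial(r - 1) - logFactorial(total + r - 1) + logProduct;
    }

    /** Sum of the log-terms, one per count vector. */
    static double logScore(int[][] countVectors) {
        double score = 0.0;
        for (int[] counts : countVectors) {
            score += logTerm(counts);
        }
        return score;
    }

    public static void main(String[] args) {
        double bs1 = logScore(new int[][] {{5, 5}, {1, 4}, {4, 1}, {0, 5}, {4, 1}});
        double bs2 = logScore(new int[][] {{5, 5}, {1, 4}, {4, 1}, {2, 3}, {4, 1}});
        double bs3 = logScore(new int[][] {{1, 4}, {4, 1}, {0, 4}, {5, 1}, {6, 4}});
        System.out.printf("BS1: %.4f%n", bs1);
        System.out.printf("BS2: %.4f%n", bs2);
        System.out.printf("BS3: %.4f%n", bs3);
    }
}
```

Under these assumptions, BS1 works out to ln(1/449,064,000), roughly -19.92, and the difference between BS1 and BS2 is exactly ln 10: the two structures differ only in the X3 terms, and BS1 comes out ten times more probable than BS2.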
As always, enjoy and cheers! Sib ntsib dua nawb mog!
References
- [Cooper92] G.F. Cooper and E. Herskovits. A Bayesian method for the induction of probabilistic networks from data. Machine Learning, 9, 309–347 (1992).