Bayesian scoring functions for structure learning of Bayesian belief networks (BBNs)

This post points to a paper I wrote on Bayesian scoring functions for learning the structure of Bayesian belief networks (BBNs). The paper may be downloaded by clicking here, and an accompanying slide deck may be downloaded by clicking here. Both documents are licensed under a Creative Commons Attribution 3.0 Unported License.

Define the following.

  • \textbf{X}=\{X_1,\ldots,X_n\} is a set of discrete random variables
  • \pi_i is the set of parents of X_i and \pi=\{\pi_1,\ldots,\pi_n\}
  • q_i is the number of unique instantiations (configurations) of \pi_i
  • r_i is the number of values for X_i
  • N_{ijk} is the number of records in the data where X_i=k and \pi_i=j, and N_{ij} = \sum_k N_{ijk}
  • N_{ijk}^{'} is the Dirichlet hyperparameter (prior pseudo-count) for X_i=k and \pi_i=j, and N_{ij}^{'} = \sum_k N_{ijk}^{'}
  • x!=\prod_{k=1}^{x} k is the factorial function
  • \Gamma(x) = \int_0^{\infty} t^{x-1} e^{-t} dt is the gamma function, which reduces to \Gamma(x) = (x-1)! when x is a positive integer
  • B_S is the BBN structure
  • D is the data
  • P(B_S,D) is the joint probability of the BBN structure and data
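
To make the counts N_{ijk} and N_{ij} concrete, here is a minimal Python sketch that tallies them for one variable. The function names (count_nijk, count_nij), the list-of-dicts data layout, and the toy records are assumptions made for illustration; they do not come from the paper.

```python
from collections import Counter

def count_nijk(data, child, parents):
    """Tally N_ijk for one variable X_i.

    Keys are (j, k) pairs: j is the tuple of parent values (one parent
    configuration) and k is the value of the child variable.
    """
    counts = Counter()
    for row in data:
        j = tuple(row[p] for p in parents)
        k = row[child]
        counts[(j, k)] += 1
    return counts

def count_nij(nijk):
    """Sum N_ijk over k to obtain N_ij for each parent configuration j."""
    nij = Counter()
    for (j, _k), n in nijk.items():
        nij[j] += n
    return nij

# Hypothetical toy data: two binary variables, with A as the parent of B.
data = [
    {"A": 0, "B": 0},
    {"A": 0, "B": 1},
    {"A": 1, "B": 1},
    {"A": 1, "B": 1},
]
nijk = count_nijk(data, child="B", parents=["A"])
nij = count_nij(nijk)
```

A variable with no parents can be handled by passing an empty parent list, in which case the only parent configuration is the empty tuple and q_i = 1.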

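Then, the Bayesian Dirichlet (BD) scoring function is defined as follows, where P(B_S) is the prior probability of the structure (a constant factor when all structures are considered equally likely).
P(B_S,D) = P(B_S) \prod_{i=1}^{n} \prod_{j=1}^{q_i} \left( \frac{\Gamma(N_{ij}^{'})}{\Gamma(N_{ij}^{'} + N_{ij})} \prod_{k=1}^{r_i} \frac{\Gamma(N_{ijk}^{'} + N_{ijk})}{\Gamma(N_{ijk}^{'})} \right)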
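
In practice the gamma terms overflow quickly, so the score is usually computed in log space. The sketch below evaluates the logarithm of the product over i, j, and k in the BD formula above using math.lgamma from the Python standard library. The function names and the nested-list layout (one row per parent configuration j, one column per value k) are assumptions made for illustration; a prior over structures, if one is used, would contribute one additional additive log term.

```python
from math import lgamma, exp

def log_bd_family(counts, priors):
    """Log of the BD term for one variable X_i.

    counts[j][k] holds N_ijk and priors[j][k] holds N'_ijk, so each
    argument has q_i rows and r_i columns.
    """
    total = 0.0
    for n_j, a_j in zip(counts, priors):
        # log Gamma(N'_ij) - log Gamma(N'_ij + N_ij)
        total += lgamma(sum(a_j)) - lgamma(sum(a_j) + sum(n_j))
        for n, a in zip(n_j, a_j):
            # log Gamma(N'_ijk + N_ijk) - log Gamma(N'_ijk)
            total += lgamma(a + n) - lgamma(a)
    return total

def log_bd(families):
    """Sum the per-variable log terms; families is a list of (counts, priors)."""
    return sum(log_bd_family(c, p) for c, p in families)

# One binary variable with no parents (q_i = 1), counts (2, 1), and all
# hyperparameters set to 1: Gamma(2)/Gamma(5) * Gamma(3)/Gamma(1) * Gamma(2)/Gamma(1) = 1/12.
score = log_bd_family([[2, 1]], [[1.0, 1.0]])
probability = exp(score)
```

Because the score decomposes into one term per variable, a search procedure only needs to recompute the terms of the variables whose parent sets changed.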

Several Bayesian scoring functions are special cases of the BD scoring function. The Kutato (K2), Bayesian Dirichlet equivalence (BDe), and Bayesian Dirichlet equivalence uniform (BDeu) scoring functions all share the BD form and differ only in how they set the values of the hyperparameters N_{ijk}^{'}. If you have ever wondered how these scoring functions (BD, K2, BDe, BDeu) were derived, the documents I wrote might give you some insight. They show how the scoring functions build, step by step, on some basic mathematical functions (factorial, gamma, Beta), probability distributions (multinomial, Dirichlet, Dirichlet-multinomial), Bayes’ Theorem, and a few assumptions.
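
As an illustration of how the hyperparameter choice distinguishes these variants, the sketch below builds the hyperparameter table for one variable under two of them. It assumes the commonly cited settings: K2 uses N_{ijk}^{'} = 1 for every entry, and BDeu uses N_{ijk}^{'} = N^{'} / (q_i r_i) for a user-chosen equivalent sample size N^{'}. The function names and the ess parameter name are hypothetical. BDe is omitted because its hyperparameters come from a prior network rather than from a closed-form rule.

```python
def k2_hyperparameters(q_i, r_i):
    """K2 setting: every N'_ijk equals 1 (q_i rows, r_i columns)."""
    return [[1.0] * r_i for _ in range(q_i)]

def bdeu_hyperparameters(q_i, r_i, ess=1.0):
    """BDeu setting: the equivalent sample size ess is spread uniformly,
    so every N'_ijk equals ess / (q_i * r_i)."""
    return [[ess / (q_i * r_i)] * r_i for _ in range(q_i)]

# A variable with 3 values whose parents have 2 joint configurations.
k2_table = k2_hyperparameters(q_i=2, r_i=3)
bdeu_table = bdeu_hyperparameters(q_i=2, r_i=3, ess=6.0)
```

Under the BDeu setting the entries of each table sum to the equivalent sample size, which is what makes the prior uniform over the joint configurations of a variable and its parents.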

At any rate, happy reading! Until we meet again, and I wish you happiness in this new year!

