# Bayesian scoring functions for structure learning of Bayesian belief networks (BBNs)

In this blog post, I refer to a paper I wrote on Bayesian scoring functions for learning the structure of Bayesian belief networks (BBNs). The paper and an accompanying slide deck are both available for download. These documents are licensed under a Creative Commons Attribution 3.0 Unported License.

Define the following.

• $\textbf{X}=\{X_1,\ldots,X_n\}$ is a set of discrete random variables
• $\pi_i$ is the set of parents of $X_i$ and $\pi=\{\pi_1,\ldots,\pi_n\}$
• $q_i$ is the number of unique instantiations (configurations) of $\pi_i$
• $r_i$ is the number of values for $X_i$
• $N_{ijk}$ is the number of records in the data where $X_i=k$ and $\pi_i=j$, and $N_{ij} = \sum_k N_{ijk}$
• $N_{ijk}^{'}$ is the hyperparameter for the case $X_i=k$ and $\pi_i=j$, and $N_{ij}^{'} = \sum_k N_{ijk}^{'}$
• $x!=\prod_{k=1}^{x} k$ is the factorial function
• $\Gamma(x) = (x-1)!$ is the gamma function for a positive integer $x$
• $B_S$ is the BBN structure
• $D$ is the data
• $P(B_S,D)$ is the joint probability of the BBN structure and data
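
To make the counts concrete, here is a minimal sketch of how $N_{ijk}$ and $N_{ij}$ might be tallied for a single variable. The function name `count_nijk`, the toy records, and the choice to key each parent configuration $j$ by a tuple of parent values are all assumptions made for illustration; they are not taken from the paper.

```python
from collections import Counter

def count_nijk(rows, child, parents):
    """Return {(j, k): N_ijk}, where j is the tuple of parent values
    and k is the value of the child variable in a record."""
    counts = Counter()
    for row in rows:
        j = tuple(row[p] for p in parents)
        k = row[child]
        counts[(j, k)] += 1
    return counts

# Hypothetical toy data: two binary variables, with A as the parent of B.
rows = [
    {"A": 0, "B": 0},
    {"A": 0, "B": 1},
    {"A": 0, "B": 0},
    {"A": 1, "B": 1},
    {"A": 1, "B": 1},
]

n_ijk = count_nijk(rows, child="B", parents=["A"])

# N_ij is the sum of N_ijk over the child values k.
n_ij = Counter()
for (j, k), n in n_ijk.items():
    n_ij[j] += n
```

A variable with no parents gets the single empty configuration `()`, so $q_i = 1$ in that case.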

Then, the Bayesian Dirichlet (BD) scoring function is defined as follows.
$P(B_S,D) = \prod_{i=1}^{n} \prod_{j=1}^{q_i} \left( \frac{\Gamma(N_{ij}^{'})}{\Gamma(N_{ij}^{'} + N_{ij})} \prod_{k=1}^{r_i} \frac{\Gamma(N_{ijk}^{'} + N_{ijk})}{\Gamma(N_{ijk}^{'})} \right)$
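
As a rough sketch, the product above can be evaluated in log space, which avoids overflow because the gamma terms grow very quickly. The code below uses Python's `math.lgamma` and computes the formula exactly as written (no separate structure prior). The function names and the nested-list layout (`counts[j][k]` for $N_{ijk}$, `priors[j][k]` for $N_{ijk}^{'}$) are assumptions for this example only.

```python
from math import lgamma, exp

def log_bd_family(counts, priors):
    """Log of the BD factor for one variable X_i.

    counts[j][k] holds N_ijk and priors[j][k] holds N'_ijk, where j
    indexes parent configurations and k indexes the values of X_i.
    """
    total = 0.0
    for n_jk, a_jk in zip(counts, priors):
        n_j = sum(n_jk)  # N_ij
        a_j = sum(a_jk)  # N'_ij
        total += lgamma(a_j) - lgamma(a_j + n_j)
        for n, a in zip(n_jk, a_jk):
            total += lgamma(a + n) - lgamma(a)
    return total

def log_bd(families):
    """Sum the log factors over all variables; families is a list of
    (counts, priors) pairs, one pair per variable."""
    return sum(log_bd_family(c, p) for c, p in families)

# Hypothetical example: X_1 has no parents, X_2 has X_1 as its parent,
# both are binary, and every hyperparameter is set to 1.
x1 = ([[3, 2]], [[1, 1]])
x2 = ([[2, 1], [0, 2]], [[1, 1], [1, 1]])
score = log_bd([x1, x2])
```

For these toy counts the factors work out by hand to $1/60$ for $X_1$ and $1/36$ for $X_2$, so `exp(score)` should be close to $1/2160$.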

Several Bayesian scoring functions are special cases of the BD scoring function. In particular, the Kutato (K2), Bayesian Dirichlet equivalence (BDe), and Bayesian Dirichlet equivalence uniform (BDeu) scoring functions differ only in how they set the hyperparameters $N_{ijk}^{'}$. If you have ever wondered how these scoring functions (BD, K2, BDe, BDeu) are derived, the documents I wrote might give you some insight. I show how they build, step by step, on some basic mathematical functions (factorial, gamma, Beta), probability distributions (multinomial, Dirichlet, Dirichlet-multinomial), Bayes’ Theorem, and a few assumptions.
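
To illustrate how the variants differ, here is a small sketch of the hyperparameter tables for two of them, using their usual textbook settings: K2 sets every $N_{ijk}^{'}$ to 1, and BDeu spreads an equivalent sample size evenly, so that $N_{ijk}^{'}$ equals that sample size divided by $r_i q_i$. BDe is left out because it needs a prior distribution over the variables. The function names and the `ess` parameter name are assumptions for this example.

```python
def k2_priors(q_i, r_i):
    """K2: every hyperparameter N'_ijk is 1."""
    return [[1.0] * r_i for _ in range(q_i)]

def bdeu_priors(q_i, r_i, ess=1.0):
    """BDeu: N'_ijk = ess / (r_i * q_i), where ess is the
    equivalent sample size chosen by the user."""
    a = ess / (q_i * r_i)
    return [[a] * r_i for _ in range(q_i)]

# Hypothetical variable with 2 parent configurations and 3 values.
k2_table = k2_priors(q_i=2, r_i=3)
bdeu_table = bdeu_priors(q_i=2, r_i=3, ess=6.0)
```

Each table has one row per parent configuration $j$ and one column per value $k$. The BDeu entries always sum to the equivalent sample size, whereas the K2 entries sum to $r_i q_i$.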

At any rate, happy reading! See you again, and may this new year bring you happiness!