Models as Graphs

Every node is a random variable and every edge represents a dependency; taken together, the dependencies coming into a node are captured by a conditional probability distribution for that node.

For example, in a graph like this:

Since in the graph $x_3$ depends only on $x_1, x_2$, we can determine the distribution $p(x_3 \mid x_1, x_2)$ without needing the values or probability distributions of any other variables. If we know the probability distributions of the nodes that have no arrows pointing in, then we essentially know the actual (not conditional) probability distribution of every variable. These “grandparent” (root) nodes’ distributions are usually either assumed or approximated empirically.
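To make this concrete, here is a minimal sketch in Python for a hypothetical fragment of such a graph: binary roots $x_1$ and $x_2$ with assumed distributions, and a child $x_3$ with an assumed conditional table $p(x_3 \mid x_1, x_2)$. Knowing just the root distributions and the conditional table is enough to recover the unconditional distribution of $x_3$.

```python
import itertools

# Hypothetical root ("grandparent") distributions -- simply assumed here.
p_x1 = {0: 0.7, 1: 0.3}  # p(x1)
p_x2 = {0: 0.4, 1: 0.6}  # p(x2)

# Assumed conditional distribution p(x3 | x1, x2), one row per parent configuration.
p_x3_given = {
    (0, 0): {0: 0.9, 1: 0.1},
    (0, 1): {0: 0.6, 1: 0.4},
    (1, 0): {0: 0.3, 1: 0.7},
    (1, 1): {0: 0.1, 1: 0.9},
}

# The root distributions let us recover the unconditional distribution of x3
# by summing the factorised joint over the parents:
#   p(x3) = sum_{x1, x2} p(x3 | x1, x2) p(x1) p(x2)
p_x3 = {0: 0.0, 1: 0.0}
for x1, x2 in itertools.product(p_x1, p_x2):
    for x3, p in p_x3_given[(x1, x2)].items():
        p_x3[x3] += p * p_x1[x1] * p_x2[x2]

print(p_x3)  # marginal (unconditional) distribution of x3
```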

Bayesian update

Notice that when some of the random variables are known, we can directly update the probability distributions of their children, those children’s children, and so on. We can also update the parent nodes via Bayesian inference, and thus the sibling nodes as well. What we can’t (or don’t need to) update are the “co-parent” nodes, that is, nodes that are also parents of the children of our known nodes. For example, in the graph above, if $x_2$ is known, then we know the pdf of $x_4$ and we can update the pdf of $x_5$. We now also know the pdf of $x_3$, but we used the old pdf of $x_1$ to figure it out. That old pdf of $x_1$ has not changed: although the pdf of $x_3$ was altered, we altered it using our prior belief about $x_1$, so applying Bayesian inference again to update the pdf of $x_1$ would do nothing but echo that belief.
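As a sketch of this upward update, the snippet below reuses the assumed tables from the previous example and computes the posterior over the parent $x_1$ after observing the child $x_3$ via Bayes’ rule.

```python
# Same hypothetical tables as in the previous sketch.
p_x1 = {0: 0.7, 1: 0.3}
p_x2 = {0: 0.4, 1: 0.6}
p_x3_given = {
    (0, 0): {0: 0.9, 1: 0.1}, (0, 1): {0: 0.6, 1: 0.4},
    (1, 0): {0: 0.3, 1: 0.7}, (1, 1): {0: 0.1, 1: 0.9},
}

# Observing the child x3 updates the parent x1 via Bayes' rule:
#   p(x1 | x3) is proportional to p(x1) * sum_{x2} p(x3 | x1, x2) p(x2)
observed_x3 = 1
unnormalised = {
    x1: prior * sum(p_x3_given[(x1, x2)][observed_x3] * p_x2[x2] for x2 in p_x2)
    for x1, prior in p_x1.items()
}
z = sum(unnormalised.values())
posterior_x1 = {x1: v / z for x1, v in unnormalised.items()}
print(posterior_x1)  # updated belief about the parent x1 after observing the child x3

# By contrast, observing only the co-parent x2 leaves x1 at its prior:
# in this graph p(x1 | x2) = p(x1), since x1 and x2 are marginally independent roots.
```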

Conditionally independent nodes

Nodes $a, b$ that are independent given the values of some other nodes $c_1, c_2, \dots$ are said to be conditionally independent under the set of nodes $C = \{c_1, c_2, \dots\}$. We write this as $(a \perp b \mid c_1, c_2, \dots)$. One way to determine this is to forget that the $c_i$ are random variables and treat them as arbitrary constants in whatever distribution functions they appear in, which lets us delete these nodes from the graph entirely. Now, in this scenario, for $a$ and $b$ to be independent there must be no effect of node $a$ on $b$, direct or indirect. That happens when every path from $a$ to $b$ in the original graph hits some roadblock after the deletion of nodes. Either the path contained a node from $C$ and is now incomplete (note that such a node must not have both of its path arrows pointing towards it, because then this node, while being deleted, would form a connection between its parents, and information could still be transmitted through this “broken” path), or the path contains a node not in $C$ whose two path arrows both point in, making $a$ and $b$ the grandparent nodes; and we know that knowing one grandparent node doesn’t cause us to update our beliefs about the other. A sketch of this path-blocking check is given below.
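The snippet below is a rough sketch of that check on a small hypothetical DAG (the edges are assumed for illustration and only loosely follow the example above). It enumerates the undirected paths between $a$ and $b$ and tests each interior node against the two roadblock conditions just described.

```python
# Hypothetical DAG, given as child -> list of parents (edges are assumed).
parents = {
    "x1": [], "x2": [],
    "x3": ["x1", "x2"],
    "x4": ["x2"],
    "x5": ["x4"],
}

def undirected_paths(a, b):
    """All simple paths between a and b, ignoring edge direction."""
    adj = {n: set() for n in parents}
    for child, ps in parents.items():
        for p in ps:
            adj[child].add(p)
            adj[p].add(child)
    paths, stack = [], [[a]]
    while stack:
        path = stack.pop()
        for nxt in adj[path[-1]]:
            if nxt in path:
                continue
            if nxt == b:
                paths.append(path + [b])
            else:
                stack.append(path + [nxt])
    return paths

def is_collider(prev, node, nxt):
    """Both path arrows point into `node`, i.e. prev -> node <- nxt."""
    return prev in parents[node] and nxt in parents[node]

def blocked(path, C):
    """A path hits a roadblock if some interior node is a non-collider in C,
    or a collider outside C.  (The standard d-separation rule additionally
    unblocks a collider when one of its descendants is in C; omitted here.)"""
    for prev, node, nxt in zip(path, path[1:], path[2:]):
        collider = is_collider(prev, node, nxt)
        if (node in C and not collider) or (collider and node not in C):
            return True
    return False

def cond_independent(a, b, C):
    return all(blocked(p, C) for p in undirected_paths(a, b))

print(cond_independent("x1", "x2", set()))    # True: x1 and x2 only meet at the collider x3
print(cond_independent("x1", "x2", {"x3"}))   # False: conditioning on the collider connects them
print(cond_independent("x3", "x5", {"x2"}))   # True: the path x3 - x2 - x4 - x5 is cut at x2
```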