Tuesday, November 12, 2013

Bayesian Inference vs. Maximum Likelihood

Last Friday (November 8, 2013), my mentor and I talked a bit about Bayesian inference of trees and the program BEAST.

Like Maximum Likelihood (ML), Bayesian Inference (BI) is a character-based tree method; both generate several trees and use some criterion to decide which tree is the best. However, BI differs from ML in that BI "seeks the tree that is most likely given the data and the chosen substitution model" whereas ML "seeks the tree that makes the data the most likely" (Hall, 140). Sounds the same, I know. However, after a short period of intense research and a lot of alien equations, I interpreted the major distinction like this (please correct me if I am wrong, clarification needed!!):

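  • ML - make different trees, calculate the likelihood P(data | tree) of each one, and keep the tree with the highest likelihood.
  • BI - find the tree with the highest probability given the observed data X, combining the likelihood with prior knowledge (e.g. assumptions about the tree or the substitution model).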
Mathematically, according to Bayes' Theorem:

P(A|B) = \frac{P(B | A)\, P(A)}{P(B)}

In Bayesian inference, the event B is fixed (the observed data), and we wish to consider the impact of its having been observed on our belief in various possible events A (the hypotheses, here how the tree splits). In such a situation the denominator of the last expression, the probability of the given evidence B, is fixed; what we want to vary is A. Thus, the posterior probabilities are proportional to the numerator:


P(A|B) \propto P(A) \cdot P(B|A)

  • P(B|A) is the likelihood. It can be interpreted as P(data | hypothesis): how probable the observed data are if the hypothesis is true.
  • P(A) is the prior probability. It indicates our state of knowledge about the truth of a hypothesis before we have observed the data.
  • P(A|B) is the posterior probability. We can rewrite it as P(hypothesis | data). It shows how much we believe the hypothesis after seeing the observed data --> Bayesian
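
To connect this to trees, here is the same theorem in my own notation (an assumption, not Hall's): T is one candidate tree, D is the aligned sequence data, and I am ignoring branch lengths and substitution-model parameters to keep it short.

P(T | D) = \frac{P(D | T)\, P(T)}{\sum_{T'} P(D | T')\, P(T')}

Here P(D | T) is the likelihood that ML maximizes, P(T) is the prior over trees, and the sum in the denominator runs over all candidate trees T', so it is the same number no matter which T we look at.
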
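I drew a picture and hopefully it helps with visualization. Suppose we know there are species A, B, C, D, E, and F (these are species labels, not the events A and B in the formula above). For ML, we are finding the arrangement of the species on the tree for which the likelihood P(data | tree) is the largest. For BI, suppose we know beforehand that A, D, E, and B are in the order they are. Treating this as prior knowledge, we are finding where C and F go so that the posterior P(tree | data), which is proportional to P(tree) x P(data | tree), is the greatest.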
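
To check my own understanding, here is a tiny Python sketch of the difference. Everything in it is made up for illustration: the three tree names, the likelihood values, and the prior values are hypothetical numbers, not output from BEAST or any real data set. It only shows that the ML pick and the BI pick can differ once the prior is not flat.

```python
# Toy comparison of ML and Bayesian tree choice.
# All tree names and numbers are hypothetical, chosen for illustration only.

# P(data | tree): how probable the data are under each candidate tree
likelihood = {"tree1": 0.020, "tree2": 0.015, "tree3": 0.005}

# P(tree): our belief in each tree before seeing the data
prior = {"tree1": 0.2, "tree2": 0.7, "tree3": 0.1}


def ml_tree(likelihood):
    """Return the tree with the highest likelihood (the ML criterion)."""
    return max(likelihood, key=likelihood.get)


def posterior(likelihood, prior):
    """Return P(tree | data) for every tree using Bayes' theorem."""
    # Numerator of Bayes' theorem for each tree
    unnormalized = {t: likelihood[t] * prior[t] for t in likelihood}
    # Denominator P(data): the same for every tree
    evidence = sum(unnormalized.values())
    return {t: value / evidence for t, value in unnormalized.items()}


post = posterior(likelihood, prior)
map_tree = max(post, key=post.get)  # tree with the highest posterior

print("ML picks:", ml_tree(likelihood))
print("BI picks:", map_tree)
for tree, p in post.items():
    print(f"  P({tree} | data) = {p:.3f}")
```

With these toy numbers, ML picks tree1 because it has the largest likelihood, while BI picks tree2 because the prior favors it strongly. With a flat prior (all trees equally probable beforehand) the two would agree. Real programs such as BEAST do not list every tree like this; they sample trees with MCMC.
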

This is at least what I've got so far. Nonetheless, BI can be burdensome if no obvious prior distribution for a parameter exists. The researcher has to ensure that the prior selected is not inadvertently influencing the posterior distribution of the parameters of interest. I find that my thoughts get clearer every time I write them out. Before we get into the bird data, we are trying to ask ourselves some fundamental questions about phylogenetic trees. Next Friday, I will be joining a talk with the data holders to discuss our participation in the Avian mycobacteria project.

1 comment:

  1. I'll leave it to the systematists to judge whether you got this right. Actually, could you share your blog with your mentor, if you have not already?

    You are clearly making great progress with your work. I am sure that your theoretical understanding will aid you in the near future. I anxiously await your first trees!
