Project Description

A fundamental problem in evolutionary biology is the following: given genomic data (such as DNA sequences) for a collection of species, determine the evolutionary history of those species (for example, are humans more closely related to gorillas or to orangutans?). This evolutionary history corresponds to a tree known as a phylogenetic tree. One widely used tool for inferring the phylogenetic tree is MrBayes, a program that implements a Markov Chain Monte Carlo (MCMC) algorithm. An MCMC algorithm performs a random walk, in this case on the collection of all phylogenetic trees (a huge set), where the steps of the walk are weighted by the likelihood that the particular tree produced the given data. It is easy to define a simple MCMC algorithm that converges to the desired probability distribution in the limit of infinitely many steps. In practice, MCMC algorithms rely on heuristics to decide how long to run (or practitioners simply run them for as long as their time allows). The goal of this project is to study how well MCMC algorithms such as MrBayes perform: do they find the right tree, or do they get trapped in local optima?
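
To make the idea of a likelihood-weighted random walk concrete, here is a minimal Python sketch of a Metropolis-style walk. It is an illustration under simplifying assumptions, not the algorithm MrBayes uses: the states are a handful of integers arranged in a ring (standing in for trees), each state's neighbors are the two adjacent integers (standing in for small tree rearrangements), and the `weights` list is a made-up set of likelihood-like scores. The function name `metropolis_walk` and all of its parameters are hypothetical.

```python
import random


def metropolis_walk(weights, steps, start=0, seed=0):
    """Random walk on states 0..n-1 arranged in a ring.

    weights[i] is a positive, unnormalized score for state i (a toy
    stand-in for the likelihood of a tree). Returns the fraction of
    steps spent in each state.
    """
    rng = random.Random(seed)
    n = len(weights)
    visits = [0] * n
    state = start
    for _ in range(steps):
        # Propose one of the two ring neighbors with equal probability,
        # so the proposal is symmetric.
        proposal = (state + rng.choice((-1, 1))) % n
        # Always accept moves to a higher-weight state; accept moves to
        # a lower-weight state with probability equal to the weight ratio.
        if rng.random() < min(1.0, weights[proposal] / weights[state]):
            state = proposal
        visits[state] += 1
    return [v / steps for v in visits]


# Toy scores, assumed purely for illustration.
weights = [1.0, 2.0, 3.0, 2.0, 1.0]
target = [w / sum(weights) for w in weights]
estimate = metropolis_walk(weights, steps=200_000)

print("target  :", [round(t, 3) for t in target])
print("estimate:", [round(e, 3) for e in estimate])
```

On this tiny example the visit frequencies should approach the normalized weights as the number of steps grows. If the weights instead had two peaks separated by states with very small weight, a short run could spend nearly all of its time near whichever peak it reached first, which is the kind of trapping this project asks about.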

Team Members

  • Katy Tsao
  • Yashasvi Vangala
  • David Wang
  • Kyle Wong

Professor and Mentors

  • Prof. Eric Vigoda
  • Grad mentor: Chinmay Sonar

Meeting Time

  • Meetings with the Professor
    • Fridays 2-3p, HFH 1152
  • Meetings with Grad mentor
    • Fridays 3-3:30p
  • ERSP meetings with central mentors
    • Chinmay: TBD
    • Diba: TBD
  • Team meetings
    • Mondays 11a-12p and Thursdays 3:30-4p

Links to Proposals and Presentation

  • Proposal (first draft): link
  • Proposal (after peer review): link
  • Final Proposal (after instructor's feedback): link
  • Final presentation: link

Individual Logs

Peer Review

Project Documentation and Resources