Phylogenetic Inference in the Post-genomic Era: Statistical Inference for Tree- and DAG-based Generative Models

October 16th, 12:00 pm - 1:00 pm in DCH 3092
Speaker: Luay Nakhleh (COMP)

Please indicate interest, especially if you want lunch, here.
Abstract:

Phylogenetic inference is mostly about estimating evolutionary histories from molecular sequence data. An evolutionary history is a generative model that consists of two components. The discrete component is often a rooted tree or directed acyclic graph (DAG) topology whose leaves are labeled by a set of taxa (species, genes, cells, …). The continuous component is the set of evolutionary parameters (positive real numbers) that label the nodes and edges of the tree or DAG. The probability distribution that this model defines on sequence data is very complex and not amenable to the standard manipulations that one encounters in machine learning textbooks. For example, integrating the probability density and differentiating the likelihood function cannot be done analytically.
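
To make the two components concrete, here is a minimal sketch of how the likelihood of a single alignment column could be computed for a fixed tree and fixed branch lengths. It assumes the textbook Jukes-Cantor substitution model and Felsenstein's pruning recursion, which are not necessarily the models discussed in the talk. The taxon names, branch lengths, and function names are made up for illustration.

```python
import math

BASES = "ACGT"

def jc69_prob(i, j, t):
    """Jukes-Cantor probability that base i becomes base j along a branch
    of length t (expected substitutions per site)."""
    e = math.exp(-4.0 * t / 3.0)
    return 0.25 + 0.75 * e if i == j else 0.25 - 0.25 * e

def partial_likelihoods(node, site):
    """For each possible base at `node`, return the probability of the
    observed leaf bases below it (Felsenstein's pruning recursion).

    A leaf is a taxon name (str); an internal node is a list of
    (child, branch_length) pairs. `site` maps taxon name -> observed base.
    """
    if isinstance(node, str):
        return [1.0 if b == site[node] else 0.0 for b in BASES]
    result = [1.0] * len(BASES)
    for child, t in node:
        child_partial = partial_likelihoods(child, site)
        for xi, x in enumerate(BASES):
            # Sum over the unobserved base y at the child end of the branch.
            result[xi] *= sum(
                jc69_prob(x, y, t) * child_partial[yi]
                for yi, y in enumerate(BASES)
            )
    return result

def site_likelihood(tree, site):
    """Likelihood of one alignment column, with a uniform base
    distribution at the root."""
    return sum(0.25 * p for p in partial_likelihoods(tree, site))

# Hypothetical three-taxon rooted tree: discrete topology plus
# continuous branch lengths.
tree = [("human", 0.1), ([("mouse", 0.2), ("rat", 0.2)], 0.05)]
print(site_likelihood(tree, {"human": "A", "mouse": "A", "rat": "G"}))
```

This only evaluates the likelihood at one point in parameter space. The difficulty described above arises when one has to integrate over, or optimize across, all topologies and parameter values.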

In this talk, I will describe two different areas of phylogenetics that my group is working on. The first concerns inferring DAGs from the genomes of different species. The second concerns the inference of trees from single cancer cells in a patient. Both cases involve evolutionary mixture models and Markov models of sequence evolution. In the case of DAGs from species genomes, the probability distribution is given by a model adopted from population genetics. In the case of cancer cell trees, the model consists of clusters defined by a tree structure. Inference in both cases involves walking in a space of varying dimensions; therefore, to conduct Bayesian inference in both cases, we employ reversible-jump Markov chain Monte Carlo samplers.
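
For readers unfamiliar with reversible-jump MCMC, the following toy sketch shows the basic idea of a sampler that moves between parameter spaces of different dimensions. It is not the sampler used in the work described here: it assumes a deliberately simple, hypothetical setting with two models for data with known unit variance, one with the mean fixed at 0 (no free parameter) and one with a free mean `mu` under a normal prior. All names and tuning values are illustrative assumptions.

```python
import math
import random

def log_normal_pdf(x, mean, sd):
    return -0.5 * math.log(2 * math.pi * sd * sd) - (x - mean) ** 2 / (2 * sd * sd)

def log_lik(data, mu):
    """Log-likelihood of the data under Normal(mu, 1)."""
    return sum(log_normal_pdf(y, mu, 1.0) for y in data)

def accept(rng, log_ratio):
    """Metropolis-Hastings acceptance test on the log scale."""
    return rng.random() < math.exp(min(0.0, log_ratio))

def rjmcmc(data, n_iter=5000, prior_sd=10.0, jump_sd=1.0, walk_sd=0.5, seed=0):
    """Toy reversible-jump sampler over two models with equal prior weight.

    Model 0: mean fixed at 0 (dimension 0).
    Model 1: free mean mu with prior Normal(0, prior_sd) (dimension 1).
    Returns the fraction of iterations spent in model 1.
    """
    rng = random.Random(seed)
    ybar = sum(data) / len(data)
    model, mu = 0, None
    visits = [0, 0]
    for _ in range(n_iter):
        if rng.random() < 0.5:
            # Trans-dimensional move; the jump is attempted with the same
            # probability in both directions, so that factor cancels.
            if model == 0:
                # Birth: draw the new parameter from the jump proposal.
                new_mu = rng.gauss(ybar, jump_sd)
                log_ratio = (log_lik(data, new_mu)
                             + log_normal_pdf(new_mu, 0.0, prior_sd)
                             - log_lik(data, 0.0)
                             - log_normal_pdf(new_mu, ybar, jump_sd))
                if accept(rng, log_ratio):
                    model, mu = 1, new_mu
            else:
                # Death: drop mu; the ratio is the inverse of the birth ratio.
                log_ratio = (log_lik(data, 0.0)
                             + log_normal_pdf(mu, ybar, jump_sd)
                             - log_lik(data, mu)
                             - log_normal_pdf(mu, 0.0, prior_sd))
                if accept(rng, log_ratio):
                    model, mu = 0, None
        elif model == 1:
            # Within-model random-walk update of mu.
            new_mu = rng.gauss(mu, walk_sd)
            log_ratio = (log_lik(data, new_mu)
                         + log_normal_pdf(new_mu, 0.0, prior_sd)
                         - log_lik(data, mu)
                         - log_normal_pdf(mu, 0.0, prior_sd))
            if accept(rng, log_ratio):
                mu = new_mu
        visits[model] += 1
    return visits[1] / n_iter

# Made-up data centered near 3, so the free-mean model should dominate.
data = [3.0 + 0.1 * ((i % 5) - 2) for i in range(20)]
print("estimated posterior probability of model 1:", rjmcmc(data))
```

Because the new parameter is drawn directly from the jump proposal, the Jacobian of the dimension-matching map is 1 in this sketch. Samplers over trees and DAGs need far more elaborate proposals, but the accept/reject structure is the same.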
