Difference between revisions of "Phylogenetics: RevBayes Lab"

From EEBedia

Revision as of 14:15, 19 April 2020

EEB 5349: Phylogenetics
The goal of this lab exercise is to introduce you to Bayesian divergence time estimation using RevBayes. There are other programs that are currently more popular than RevBayes for doing this (notably BEAST2), but I prefer RevBayes for this lab because it is less of a black box: every aspect of the model is very explicitly defined in RevBayes.

Getting started

Log in to Xanadu

Log in to Xanadu and request a machine as usual:

srun --pty -p mcbstudent --qos=mcbstudent bash

Once you are transferred to a free node, type

module load RevBayes/xxx
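
The xxx above appears to be a placeholder for a version string. If you are unsure which versions are installed, the following sketch may help. It assumes the cluster provides the standard Environment Modules (or Lmod) module command; the function name list_revbayes_modules is hypothetical and used only for illustration.

```shell
# Sketch: list installed RevBayes modules so you can pick a version to load.
# Assumes the Environment Modules / Lmod `module` command is available.
# list_revbayes_modules is a hypothetical helper name.
list_revbayes_modules() {
    if command -v module >/dev/null 2>&1; then
        # `module avail` writes to stderr on many systems, so merge the streams.
        module avail RevBayes 2>&1
    else
        echo "module command not found in this shell"
    fi
}

list_revbayes_modules
```

Whatever version name is listed can then be substituted for xxx in the module load command.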

Create a directory

Use the Unix mkdir command to create a directory to play in today:

cd ~            # you can omit this if you are already in your home directory
mkdir rblab

Downloading and compiling indelible

We will use the program INDELible to simulate trees and data. Start by filling out the web form and downloading the software from the INDELible web site (http://abacus.gene.ucl.ac.uk/software/indelible/).

Transfer the INDELibleV1.03.tar.gz file to the Xanadu cluster and unpack it using tar:

tar zxvf INDELibleV1.03.tar.gz

Navigate into the folder INDELibleV1.03/src and enter the following command to compile the program:

g++ -o indelible -O4 indelible.cpp

Once this command has finished, you will find a file named 'indelible' in that same src directory. Move that file to your rblab folder as follows:

cd ~
mv INDELibleV1.03/src/indelible rblab
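
Before moving on, it can be worth confirming that the move succeeded. The following is a minimal sketch: the helper name check_executable is hypothetical (it is not part of indelible), and it only checks that the file exists and has execute permission, not that the program runs correctly.

```shell
# check_executable is a hypothetical helper: it reports whether a path
# is a regular file that the current user is allowed to execute.
check_executable() {
    if [ -f "$1" ] && [ -x "$1" ]; then
        echo "ok: $1"
    else
        echo "missing or not executable: $1"
        return 1
    fi
}

# Check the copy that should now be in the rblab folder.
check_executable "$HOME/rblab/indelible" || true
```

If the check reports a problem, repeat the compile step and make sure you ran the mv command from your home directory.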

Simulating and analyzing under the strict clock model

Divergence time analyses are the trickiest type of analysis we will do in this course. That's because the sequences do not contain information about substitution rates or divergence times per se; they contain information about the number of substitutions that have occurred, and the number of substitutions is the product of rate and time. Thus, maximum likelihood methods cannot separate rates from times; doing so requires a Bayesian approach and a considered use of priors, which constrain the range of rate and time scenarios considered plausible.
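
The rate-time confounding can be written out in symbols. This is only a sketch: the symbols r (substitution rate), t (elapsed time), ν (expected number of substitutions per site along a branch), and L (the likelihood) are introduced here for illustration and are not notation used elsewhere in this lab.

```latex
\nu = r\,t
\qquad\Longrightarrow\qquad
L(c\,r,\ t/c) = L(r,\ t) \quad \text{for any constant } c > 0 ,
```

because the likelihood depends on r and t only through their product ν. For example, r = 0.01 substitutions per site per million years with t = 10 million years gives ν = 0.1, and so does r = 0.1 with t = 1. The data alone cannot distinguish these two scenarios; only the priors on rate and time can.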