Phylogenetics: BayesTraits Lab

From EEBedia
Revision as of 21:28, 15 April 2007 by PaulLewis (Do the tutorial)

This article is still under construction.
Expect it to change frequently until this notice is removed.
EEB 349: Phylogenetics
In this lab you will learn how to use the program BayesTraits, written by Andrew Meade and Mark Pagel. BayesTraits can perform several analyses related to evaluating evolutionary correlation in discrete morphological traits. This program is meant to replace the older programs Discrete and Multistate. You will learn not only how to use the program on the Windows-based PCs in the computer lab, but also how to download and use it on the cluster (the cluster is better for long runs).

In Part 1, we will use BayesTraits interactively for a while on the PCs in the computer room. In Part 2, we will set up a non-interactive run on the cluster so that you know how to do this.

Part 1: Running BayesTraits under Windows

Download BayesTraits

BayesTraits has not been installed on the machines in this room, so you will need to download it yourself. Go to Mark Pagel's web site, click on the "Software" link, then click on the "Description and Downloads" link under "BayesTraits". Finally, click on the "BayesTraits - Windows" link to download a zip file containing the program itself and some sample tree and data files.

Right-click the BayesTraits-PC-V1.0.zip file and choose "Extract to BayesTraits-PC-V1.0" to unpack it on your local hard drive. Navigate to the BayesTraits-PC-V1.0\BayesTraits folder and verify that it contains the BayesTraits.exe file, as well as the PPI.txt, PPI.trees, Primates.txt and Primates.trees example files. I will hereafter refer to this folder as simply the BayesTraits folder.

Go back to Mark Pagel's web site and download the manual for BayesTraits. This is a PDF file and should open in your browser window.

Download the modified example files

You will be going through the tutorial presented in the manual for the program during this lab, but there are a couple of modifications we need to make to the example data files first:

Use Primates.first.tree instead of Primates.trees

The Primates.trees file that comes with BayesTraits contains 500 trees, which makes any analysis take a very long time. We'll avoid the long waits by using a version of this file that contains only the first tree. Download Primates.first.tree and save it in your BayesTraits folder. Whenever the tutorial refers to the file Primates.trees, use Primates.first.tree instead.
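You do not need to do this yourself (the download above is all you need), but if you are curious how a one-tree file could be produced, here is a minimal Python sketch. It assumes the tree file is in NEXUS format with each tree statement on a single line beginning with the word "tree"; the function name keep_first_tree is made up for this illustration and is not part of BayesTraits.

```python
def keep_first_tree(nexus_text):
    """Return the NEXUS text with every tree statement after the first removed.

    Assumption: each tree statement sits on one line whose first word is
    'tree' (in any case). All other lines (header, translate block, end;)
    are passed through unchanged.
    """
    kept = []
    seen_tree = False
    for line in nexus_text.splitlines():
        words = line.split()
        is_tree = bool(words) and words[0].lower() == "tree"
        if is_tree and seen_tree:
            continue  # skip the second and later trees
        seen_tree = seen_tree or is_tree
        kept.append(line)
    return "\n".join(kept) + "\n"
```

To use it, read Primates.trees into a string, pass the string to keep_first_tree, and write the result to a new file.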

Obtain the missing MatingSystem.txt file

The first part of the tutorial in the manual will not work out of the box because it assumes you have the file MatingSystem.txt, which is not included in the distribution. It turns out that the missing MatingSystem.txt is just Primates.txt with the first of the two characters deleted. I've done the modification for you, so download the MatingSystem.txt file now and save it in your BayesTraits folder.
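For the curious, deleting the first character from a data file can be sketched in a few lines of Python. This sketch assumes each line of the data file holds a taxon name followed by the character states, separated by tabs or spaces; the function name drop_first_trait is hypothetical and chosen only for this example.

```python
def drop_first_trait(data_text):
    """Remove the first character column from a taxon-by-trait data file.

    Assumption: each data line is 'name state1 state2 ...' separated by
    whitespace. Lines with fewer than three fields (for example, blank
    lines) are passed through unchanged.
    """
    out = []
    for line in data_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            out.append(line)
            continue
        # keep the taxon name, skip the first trait, keep the rest
        out.append("\t".join([fields[0]] + fields[2:]))
    return "\n".join(out) + "\n"
```

Applied to the contents of a two-character file, this leaves each taxon name paired with only its second character.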

Do the tutorial

Work through the tutorial starting on p. 10 of the BayesTraits draft manual PDF file (but only after reading the Tutorial Notes section below). The heading of the section is "Using MultiState to estimate the model of evolution and ancestral states for a binary trait". Stop when you get to the "Functional Gene Links" section (p. 18 of the manual).

Tutorial Notes

Remember throughout the tutorial to use Primates.first.tree instead of Primates.trees! Note that your output will only correspond to that of tree number 1 in the sample output from the BayesTraits manual.

BayesTraits must be run from the command line, which means you must open a command window to run the program. Simply double-clicking BayesTraits.exe will cause it to run, but not for long! When you double-click the program, you have no way to tell it which tree and data files to use, so it quits immediately.

There are two ways to get a command window (or shell) in which you can run BayesTraits. The first (not preferred) method is to click on the Start button, and choose Run..., then type cmd in the dialog box that appears. Pressing the Enter key will get you a command window, but you will need to navigate (using the unfriendly cd command) to your BayesTraits directory before starting the tutorial.

The second, and preferred, approach is to create a Windows batch file. In your BayesTraits folder, create a file named run_matingsystem.bat that contains the following text:

BayesTraits Primates.first.tree MatingSystem.txt
pause

Double-clicking run_matingsystem.bat will open a command window and start the BayesTraits program, saving the trouble of having to type the name of the tree file and data file each time you start the program. The pause command means that the window will stay open after BayesTraits finishes.

This batch file will work for the example involving MatingSystem.txt. Later, however, the tutorial switches to using the data file Primates.txt. At this point, you might want to create a second batch file named run_primates.bat containing the following text:

BayesTraits Primates.first.tree Primates.txt
pause

One final note: the default number of MCMC iterations is 5,050,000, which will take some time to run. For our purposes, it is fine to reduce this number. For example, to tell BayesTraits to run for only 550,000 iterations, type the following command before you type run:

it 550000

There is a listing of all commands recognized by BayesTraits at the end of the manual.
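If you get tired of retyping the same commands, one option is to collect them in a text file and feed that file to the program. The Python sketch below shows the idea. It assumes that BayesTraits reads its commands from standard input, so that redirected input behaves like typing; check this against the manual before relying on it. The function names are hypothetical, and the menu answers you pass in must match whatever prompts BayesTraits shows you when run interactively.

```python
import subprocess


def build_commands(menu_choices, iterations):
    """Build the text you would otherwise type at the BayesTraits prompts.

    menu_choices: your answers to the start-up prompts, in order
                  (take these from an interactive session; not guessed here).
    iterations:   number of MCMC iterations for the 'it' command.
    """
    lines = [str(choice) for choice in menu_choices]
    lines.append(f"it {iterations}")  # set the number of iterations
    lines.append("run")               # 'run' must come last
    return "\n".join(lines) + "\n"


def run_bayestraits(tree_file, data_file, commands, exe="BayesTraits"):
    """Start BayesTraits and send it the commands on standard input.

    Assumption: the executable is in the current folder or on the PATH
    and accepts commands from standard input.
    """
    return subprocess.run([exe, tree_file, data_file],
                          input=commands, text=True)
```

For example, build_commands(["1", "2"], 550000) produces four lines of text: the two menu answers (placeholders here), then it 550000, then run. Passing that text to run_bayestraits along with Primates.first.tree and MatingSystem.txt would start the analysis without any typing.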

Part 2: Running BayesTraits on the cluster