Phylogenetics: NEXUS Format
EEB 349: Phylogenetics
The goal of this lab exercise is to show you how to easily create a NEXUS-formatted data file from a set of sequences. The NEXUS format is widely used in phylogenetics, and its basic features are described in the second part of this tutorial.
Using PAUP to create a NEXUS data file
First, download the file angio35.txt to your hard drive and then upload it to the cluster (instructions in Phylogenetics: Bioinformatics Cluster).
Now log in to the cluster (bbcxsrv1.biotech.uconn.edu) and type paup to start the PAUP* program.
Now type in the following (PAUP) command:
tonexus from=angio35.txt to=angio35.nex datatype=nucleotide format=text;
After the conversion, the file angio35.nex should be present. Type quit to quit PAUP*, then open this NEXUS file in the pico editor to see what PAUP* did to convert the original file to NEXUS format. (The most important thing PAUP* did was to count the number of nucleotides and set nchar for you.)
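If you would rather check ntax and nchar without reading through the file by eye, a short Python sketch like the one below can pull them out of the dimensions statement. This is not part of PAUP*; the function name read_dimensions and the toy data are made up for illustration, and the sketch assumes the file contains a statement of the form "dimensions ntax=... nchar=...;".

```python
import re


def read_dimensions(nexus_text):
    """Return (ntax, nchar) from the first dimensions statement, or None if absent."""
    # Grab everything between the keyword "dimensions" and the closing semicolon.
    match = re.search(r"dimensions\s+([^;]*);", nexus_text, flags=re.IGNORECASE)
    if match is None:
        return None
    # Collect the ntax= and nchar= settings, whatever their order or letter case.
    pairs = re.findall(r"(ntax|nchar)\s*=\s*(\d+)", match.group(1), flags=re.IGNORECASE)
    fields = {key.lower(): int(value) for key, value in pairs}
    return fields.get("ntax"), fields.get("nchar")


# Toy input invented for this example (NOT the contents of angio35.nex).
toy = """#NEXUS
begin data;
  dimensions ntax=2 nchar=8;
  format datatype=nucleotide missing=? gap=-;
  matrix
    taxonA ACGTACGT
    taxonB ACGTAC-T
  ;
end;
"""

print(read_dimensions(toy))
```

To try it on your own file, read the file into a string first, e.g. read_dimensions(open("angio35.nex").read()).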
Create an assumptions block containing a default exclusion set, so that the sites listed below are excluded automatically whenever the data file is executed. Add this block to the bottom of the newly created NEXUS file (i.e., after the data). You can use the pico editor for this.
begin assumptions;
  exset * unused = 1-41 234-241 246 506-511 555 681-689 1393-1399 1797-1855 1856-1884 4754-4811;
end;
These numbers represent nucleotide sites that are either missing a lot of data or are difficult to align. The name I gave to this exclusion set is unused, but you could name it anything you like. The asterisk tells PAUP* that you want this exset to be applied automatically every time the file is executed.
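To see how many sites the exclusion set removes, you can expand the ranges yourself. The Python sketch below is only an illustration (the helper expand_site_ranges is a made-up name, not a PAUP* command), and it assumes the site list contains nothing but single numbers and start-end ranges separated by spaces, as in the exset above.

```python
def expand_site_ranges(spec):
    """Expand a site list such as '1-41 246' into a sorted list of site numbers."""
    sites = set()
    for token in spec.split():
        if "-" in token:
            # A range like 234-241 includes both endpoints.
            start, end = token.split("-")
            sites.update(range(int(start), int(end) + 1))
        else:
            # A single site like 246.
            sites.add(int(token))
    return sorted(sites)


# The site list copied from the exset named "unused".
UNUSED = "1-41 234-241 246 506-511 555 681-689 1393-1399 1797-1855 1856-1884 4754-4811"

excluded = expand_site_ranges(UNUSED)
print(len(excluded))  # 219 sites excluded
```

Subtracting this count from nchar gives the number of sites that remain included after the file is executed.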