Phylogenetics: NEXUS Format
Revision as of 18:09, 21 January 2009
EEB 349: Phylogenetics
The goal of this lab exercise is to show you how to easily create a NEXUS-formatted data file from a set of sequences. The NEXUS format is widely used in phylogenetics, and its basic features are described in the second part of this tutorial.
Using PAUP to create a NEXUS data file
First, download the file angio35.txt to your hard drive and then upload it to the cluster (instructions in Phylogenetics: Bioinformatics Cluster).
Now log in to the cluster (bbcxsrv1.biotech.uconn.edu) and type paup to start the PAUP* program.
Now type in the following PAUP* command:
tonexus from=angio35.txt to=angio35.nex datatype=nucleotide format=text;
After the conversion, the file angio35.nex should be present. Type quit to quit PAUP*, then open this Nexus file in the pico editor to see what PAUP* did to convert the original file to Nexus format. (The most important thing PAUP* did was to count the number of nucleotides and set nchar for you.)
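To make the conversion step more concrete, here is a minimal Python sketch of the kind of bookkeeping involved. It is not PAUP*'s implementation. It assumes each input line holds a taxon name followed by whitespace and then the aligned sequence, and the function name text_to_nexus and the toy sequences are made up for illustration. The file PAUP* actually writes may be laid out differently.

```python
def text_to_nexus(lines):
    """Build a minimal NEXUS data block from 'name sequence' lines (illustrative only)."""
    rows = []
    for line in lines:
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        name, seq = parts[0], "".join(parts[1:])
        rows.append((name, seq))
    if not rows:
        raise ValueError("no sequences found")

    # nchar is the length of every (aligned) sequence; ntax is the row count.
    nchar = len(rows[0][1])
    if any(len(seq) != nchar for _, seq in rows):
        raise ValueError("sequences differ in length; data must be aligned")

    out = [
        "#NEXUS",
        "begin data;",
        f"  dimensions ntax={len(rows)} nchar={nchar};",
        "  format datatype=nucleotide missing=? gap=-;",
        "  matrix",
    ]
    width = max(len(name) for name, _ in rows)
    for name, seq in rows:
        out.append(f"    {name.ljust(width)}  {seq}")
    out += ["  ;", "end;"]
    return "\n".join(out)


# Toy input (hypothetical data, not taken from angio35.txt).
toy = ["taxonA ACGT-ACGTA", "taxonB ACGTTACGTA", "taxonC ACG?TACGTA"]
nexus_text = text_to_nexus(toy)
print(nexus_text)
```

The point of the sketch is the dimensions line: counting sites by hand in a long alignment is error-prone, which is why letting the program set nchar is convenient.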
Create an assumptions block containing a default exclusion set that excludes the following sites automatically whenever the data file is executed. This should be added to the bottom of the newly-created Nexus file (i.e., after the data). You can use the pico editor for this.
begin assumptions;
  exset * unused = 1-41 234-241 246 506-511 555 681-689 1393-1399 1797-1855 1856-1884 4754-4811;
end;
These numbers represent nucleotide sites that are either missing a lot of data or are difficult to align. The name I gave to this exclusion set is unused, but you could name it anything you like. The asterisk tells PAUP* that you want this exset to be applied automatically every time the file is executed.
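If you want to check how many sites an exclusion set removes, the site list can be expanded with a few lines of Python. This is only a sketch for checking your own arithmetic; the helper name expand_site_list is made up here, and it handles only the plain "start-end" and single-site tokens used in the exset above, not the full range syntax that PAUP* accepts.

```python
def expand_site_list(spec):
    """Expand a string like '1-41 246 506-511' into a set of site numbers."""
    sites = set()
    for token in spec.split():
        if "-" in token:
            start, end = (int(x) for x in token.split("-"))
            sites.update(range(start, end + 1))  # ranges are inclusive
        else:
            sites.add(int(token))
    return sites


# The site list from the exset named 'unused' above.
spec = "1-41 234-241 246 506-511 555 681-689 1393-1399 1797-1855 1856-1884 4754-4811"
excluded = expand_site_list(spec)
print(len(excluded), "sites excluded")
```

Note that 1797-1855 and 1856-1884 are adjacent, so together they exclude one contiguous stretch from site 1797 through site 1884.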