Difference between revisions of "Phylogenetics: Searching Lab"

From EEBedia
Line 6: Line 6:
 
# '''Create the data file.''' Start PAUP*, press the Cancel button to dismiss the "Open" dialog box that appears, and choose File > New from the main menu. This will open a new, empty text edit window. Click [http://hydrodictyon.eeb.uconn.edu/people/plewis/courses/phylogenetics/labs/angio35.nex here] to see the data, use Ctrl-a to select all of it, copy it, then paste it into the editor window in PAUP*. Save the file as angio35.nex.
 
# '''Create a command file.''' Once again choose File &gt; New from the main menu to create a second blank file, then type in the following text, saving this as run.nex:<pre>#nexus&#10;&#10;begin paup;&#10;  log file=output.txt start replace;&#10;  execute angio35.nex;&#10;end;</pre>
 
# Now execute run.nex (either using File &gt; Open..., or if it is still visible in the editor window in front of you, you can execute it by pressing Ctrl-r). There are at least a couple of advantages to creating little NEXUS files like run.nex. For now, the only advantage is that executing run.nex automatically starts a log file so that you will have a record of what you did. Later, when you get in the habit of putting commands in paup blocks, you will appreciate the separation of the data from the commands that initiate analyses (I have many times opened a data file, forgetting about the embedded paup block that then starts a long search, overwrites my previous log file, and otherwise creates havoc).<br/><br/>Note that because we used the replace keyword in the log command, the file output.txt will be overwritten without warning if it exists. This is called ''living dangerously'', so you may want to refrain from using the replace keyword so that PAUP* asks before overwriting files.
+
# '''Execute run.nex, which will in turn execute angio35.nex.''' You can execute run.nex either using File &gt; Open..., or (if it is still visible in the editor window in front of you) by pressing Ctrl-r. There are at least a couple of advantages to creating little NEXUS files like run.nex. For now, the only advantage is that executing run.nex automatically starts a log file so that you will have a record of what you did. Later, when you get in the habit of putting commands in paup blocks, you will appreciate the separation of the data from the commands that initiate analyses (I have many times opened a data file, forgetting about the embedded paup block that then starts a long search, overwrites my previous log file, and otherwise creates havoc).<br/><br/>Note that because we used the replace keyword in the log command, the file output.txt will be overwritten without warning if it exists. This is called ''living dangerously'', so you may want to refrain from using the replace keyword so that PAUP* asks before overwriting files.
# Delete (using the <tt>delete 6-.</tt> command) all taxa except the first five (Ephedrasinica, Gnetum_gnemJS, WelwitschiaJS, Ginkgo_biloba, and Pinus_ellCH).
+
# '''Delete all taxa except the first five.''' Using this command<pre>delete 6-.</pre>will cause PAUP* to ignore all taxa except Ephedrasinica, Gnetum_gnemJS, WelwitschiaJS, Ginkgo_biloba, and Pinus_ellCH.
# Perform an exhaustive search using parsimony (use the <tt>alltrees</tt> command). This should go fast because you now have only 5 taxa; if it seems to be taking a long time, it probably actually finished some time ago and is waiting for you to press the Close button on the dialog box.
+
# '''Perform an exhaustive search using parsimony.''' Use the <tt>alltrees</tt> command for this. This should go fast because you now have only 5 taxa; if it seems to be taking a long time, it probably actually finished some time ago and is waiting for you to press the Close button on the dialog box.
#* How many separate tree topologies did PAUP* examine?
+
#* ''How many separate tree topologies did PAUP* examine?''
#* What is the parsimony treelength of the best tree? The worst tree?
+
#* ''What is the parsimony treelength of the best tree? The worst tree?''
#* How many steps separate the best tree from the next best?
+
#* ''How many steps separate the best tree from the next best?''
# You will next perform an heuristic search using NNI branch swapping. Before you start, use the <tt>describe</tt> command to show you the tree obtained from the exhaustive enumeration.
+
# '''Perform an heuristic search using NNI branch swapping.''' Before you start, use the <tt>describe</tt> command to show you the tree obtained from the exhaustive enumeration.
#* Draw this tree on a piece of paper and then draw the 4 possible NNI rearrangements
+
#* ''Draw this tree on a piece of paper and then draw the 4 possible NNI rearrangements.''
 
+
# '''Find all NNI rearrangements of the best tree.''' Note that because we performed an exhaustive enumeration, we now know which tree is the globally most parsimonious tree. We are thus guaranteed to never find a better tree were we to start an heuristic search with this tree. Let's do an experiment: perform an NNI heuristic search, starting with the best tree, and have PAUP* save all the trees it encounters in this search. In the end, PAUP* will have in memory 5 trees: the starting tree and the 4 trees corresponding to all possible NNI rearrangements of that starting tree.<pre>hsearch start=1 swap=nni nbest=15;</pre>
 
+
#* <tt>start=1</tt> starts the search from the tree currently in memory (i.e., the best tree resulting from your exhaustive search using the parsimony criterion)
 
+
#* <tt>swap=nni</tt> causes the Nearest-Neighbor Interchange (NNI) method to be used for branch swapping
 +
#* <tt>nbest=15</tt> saves the 15 best trees found during the search. Because only 15 distinct unrooted topologies exist for 5 taxa, PAUP* would end up saving all of them in memory were it to examine every possible tree. This option is needed because PAUP* ordinarily does not save trees that are worse than the best one it has seen thus far. Here, we are interested in seeing the trees that are examined during the course of the search, even if they are not as good as the starting tree.
  
 
[[Parsimony Lab|Go back to part A: Using PAUP* to check your answers for homework #2]]
 

Revision as of 03:26, 29 January 2007

This article is part of an EEB course.
Please do not edit the content of this page without the approval of the course instructor.

EEB 349: Phylogenetics

Part B: Searching under the parsimony criterion

  1. Create the data file. Start PAUP*, press the Cancel button to dismiss the "Open" dialog box that appears, and choose File > New from the main menu. This will open a new, empty text edit window. Click here to see the data, use Ctrl-a to select all of it, copy it, then paste it into the editor window in PAUP*. Save the file as angio35.nex.
  2. Create a command file. Once again choose File > New from the main menu to create a second blank file, then type in the following text, saving this as run.nex:
    #nexus
    
    begin paup;
      log file=output.txt start replace;
      execute angio35.nex;
    end;
  3. Execute run.nex, which will in turn execute angio35.nex. You can execute run.nex either using File > Open..., or (if it is still visible in the editor window in front of you) by pressing Ctrl-r. There are at least a couple of advantages to creating little NEXUS files like run.nex. For now, the only advantage is that executing run.nex automatically starts a log file so that you will have a record of what you did. Later, when you get in the habit of putting commands in paup blocks, you will appreciate the separation of the data from the commands that initiate analyses (I have many times opened a data file, forgetting about the embedded paup block that then starts a long search, overwrites my previous log file, and otherwise creates havoc).

    Note that because we used the replace keyword in the log command, the file output.txt will be overwritten without warning if it exists. This is called living dangerously, so you may want to refrain from using the replace keyword so that PAUP* asks before overwriting files.
  4. Delete all taxa except the first five. Using this command
    delete 6-.
    will cause PAUP* to ignore all taxa except Ephedrasinica, Gnetum_gnemJS, WelwitschiaJS, Ginkgo_biloba, and Pinus_ellCH.
  5. Perform an exhaustive search using parsimony. Use the alltrees command for this. This should go fast because you now have only 5 taxa; if it seems to be taking a long time, it probably actually finished some time ago and is waiting for you to press the Close button on the dialog box.
    • How many separate tree topologies did PAUP* examine?
    • What is the parsimony treelength of the best tree? The worst tree?
    • How many steps separate the best tree from the next best?
  6. Perform an heuristic search using NNI branch swapping. Before you start, use the describe command to show you the tree obtained from the exhaustive enumeration.
    • Draw this tree on a piece of paper and then draw the 4 possible NNI rearrangements.
  7. Find all NNI rearrangements of the best tree. Note that because we performed an exhaustive enumeration, we now know which tree is the globally most parsimonious tree. We are thus guaranteed to never find a better tree were we to start an heuristic search with this tree. Let's do an experiment: perform an NNI heuristic search, starting with the best tree, and have PAUP* save all the trees it encounters in this search. In the end, PAUP* will have in memory 5 trees: the starting tree and the 4 trees corresponding to all possible NNI rearrangements of that starting tree.
    hsearch start=1 swap=nni nbest=15;
    • start=1 starts the search from the tree currently in memory (i.e., the best tree resulting from your exhaustive search using the parsimony criterion)
    • swap=nni causes the Nearest-Neighbor Interchange (NNI) method to be used for branch swapping
    • nbest=15 saves the 15 best trees found during the search. Because only 15 distinct unrooted topologies exist for 5 taxa, PAUP* would end up saving all of them in memory were it to examine every possible tree. This option is needed because PAUP* ordinarily does not save trees that are worse than the best one it has seen thus far. Here, we are interested in seeing the trees that are examined during the course of the search, even if they are not as good as the starting tree.
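
The counts used in steps 5 through 7 can be checked independently of PAUP*. The following Python sketch is only an illustration, not part of the lab and not how PAUP* works internally. It computes the number of unrooted binary topologies with the standard (2n-5)!! formula, then enumerates the NNI neighbors of one 5-taxon tree. The starting topology, the internal node names (n1, n2, n3), and all function names are assumptions made up for this example; the best tree you obtain from alltrees may well have a different shape, but any fully resolved 5-taxon tree has the same number of NNI neighbors.

```python
def num_unrooted_trees(n_taxa):
    """Number of unrooted binary topologies for n_taxa >= 3: (2n-5)!!."""
    total = 1
    for k in range(3, 2 * n_taxa - 4, 2):
        total *= k
    return total


def build_tree(edges):
    """Turn a list of (node, node) pairs into an adjacency dict of sets."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def internal_edges(adj):
    """Edges whose two endpoints are both internal (degree 3) nodes."""
    return [(u, v) for u in sorted(adj) for v in sorted(adj[u])
            if u < v and len(adj[u]) == 3 and len(adj[v]) == 3]


def leaves_on_side(adj, start, blocked):
    """Leaves reachable from start without passing through blocked."""
    seen, stack, found = {start, blocked}, [start], set()
    while stack:
        node = stack.pop()
        if len(adj[node]) == 1:
            found.add(node)
        for nxt in adj[node]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return frozenset(found)


def topology(adj):
    """Label-independent summary of a tree: its set of internal splits."""
    all_leaves = frozenset(n for n in adj if len(adj[n]) == 1)
    ref = min(all_leaves)
    splits = set()
    for u, v in internal_edges(adj):
        side = leaves_on_side(adj, u, v)
        # Always record the side that does not contain the reference leaf.
        splits.add(all_leaves - side if ref in side else side)
    return frozenset(splits)


def swap_subtrees(adj, u, a, v, c):
    """Copy the tree, moving subtree a from u to v and subtree c from v to u."""
    new = {node: set(nbrs) for node, nbrs in adj.items()}
    new[u].remove(a); new[a].remove(u)
    new[v].remove(c); new[c].remove(v)
    new[u].add(c); new[c].add(u)
    new[v].add(a); new[a].add(v)
    return new


def nni_neighbors(adj):
    """All distinct topologies one NNI away, keyed by their split sets."""
    original = topology(adj)
    found = {}
    for u, v in internal_edges(adj):
        a = sorted(adj[u] - {v})[0]        # one subtree on the u side
        for c in sorted(adj[v] - {u}):     # swap it with each subtree on the v side
            candidate = swap_subtrees(adj, u, a, v, c)
            key = topology(candidate)
            if key != original:
                found[key] = candidate
    return found


# Hypothetical starting tree (an assumption, not the lab's actual result):
# ((Ephedrasinica, Gnetum_gnemJS), WelwitschiaJS, (Ginkgo_biloba, Pinus_ellCH))
START = build_tree([
    ("n1", "Ephedrasinica"), ("n1", "Gnetum_gnemJS"), ("n1", "n2"),
    ("n2", "WelwitschiaJS"), ("n2", "n3"),
    ("n3", "Ginkgo_biloba"), ("n3", "Pinus_ellCH"),
])

print(num_unrooted_trees(5))       # 15
print(len(nni_neighbors(START)))   # 4
```

Each internal branch allows two distinct swaps, and a 5-taxon tree has two internal branches, which is where the 4 rearrangements come from. Together with the starting tree that makes the 5 trees expected in memory after the hsearch command, well under the limit of 15 set by nbest.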

Go back to part A: Using PAUP* to check your answers for homework #2