Phylogenetics: Using IQ-TREE for multilocus likelihood analyses

From EEBedia
Latest revision as of 13:12, 10 February 2020

EEB 5349: Phylogenetics

Goals

Today you will continue learning about maximum likelihood inference and will learn to use IQ-TREE, a program known for its speed and its ability to handle large-scale phylogenetic analyses. IQ-TREE is not the only software that does this: you may later wish to visit the Phylogenetics: Large Scale Maximum Likelihood Analyses lab. Note that there is also an alternative IQ-TREE tutorial written by Kevin Keegan for the 2018 version of EEB 5349.

Getting started

Log in to your account on the Health Center (Xanadu) cluster (ssh username@xanadu-submit-ext.cam.uchc.edu). Type the following:

srun --qos=general --pty bash

to start a session on a node that is not currently running jobs. Once you see the prompt, type

module load iqtree/1.6.6

to load the iqtree module. (How did I know to type "iqtree/1.6.6"? Use the command "module avail" to see a list of all modules.)
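As a small sketch of how to look up module names, the snippet below asks the module system for anything matching "iqtree" and then lists what is currently loaded. It assumes the cluster provides the usual "module" command; the guard and the variable name have_module are illustrative additions so that the snippet does nothing harmful on a machine without modules.

```shell
# Only call "module" if the shell actually provides it (it does on most clusters)
if command -v module >/dev/null 2>&1; then
    have_module=yes
    # Show only modules whose names match "iqtree"
    # (some module systems write their listing to stderr, hence 2>&1)
    module avail iqtree 2>&1
    # Show the modules loaded in this session
    module list 2>&1
else
    have_module=no
    echo "The module command is not available in this shell"
fi
```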

Download the beta version too

The tutorial we will be using today requires a later version of IQ-TREE than any version currently installed on the Xanadu cluster. The proper thing to do under most circumstances is to use the software request form in the "Contact Us" section of the Computational Biology Core site (https://bioinformatics.uconn.edu/) to request that it be installed for everyone to use (it would become a new module that anyone could load). However, sometimes you just need to get an analysis done and would prefer to have access to the software immediately (perhaps just to see whether it is useful). For this reason, it is good to know how to install and run software in your own home directory, accessible only by you.

Download version 2.0-rc1 of IQ-TREE as follows from your home directory on Xanadu:

curl -LO https://github.com/Cibiv/IQ-TREE/releases/download/v2.0-rc1/iqtree-2.0-rc1-Linux.tar.gz

The curl command (a handy mnemonic is "copy URL") is a convenient way to download files from a web site into your current directory on the cluster. The "L" switch tells curl to follow redirects; it may not always be necessary, but it is needed if the URL is an indirect reference to the file. The "O" (capital letter o, not zero) switch tells curl to save the file using the original file name (i.e. the name at the end of the URL, in this case "iqtree-2.0-rc1-Linux.tar.gz").
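As a minimal sketch of what the "O" switch does, the snippet below derives the file name curl would use (everything after the last slash in the URL) and then checks whether that file is present in the current directory. The variable names url and fname are illustrative only.

```shell
# URL used in the download command above
url="https://github.com/Cibiv/IQ-TREE/releases/download/v2.0-rc1/iqtree-2.0-rc1-Linux.tar.gz"

# -O saves under the last path component of the URL:
# strip everything up to and including the final "/"
fname="${url##*/}"
echo "curl -LO will save to: $fname"

# Check whether the download is present in the current directory
if [ -f "$fname" ]; then
    ls -lh "$fname"
else
    echo "$fname not found here; run: curl -LO $url"
fi
```

If you would rather choose the name yourself, a lowercase "o" followed by a file name of your choice (for example, "curl -L -o myfile.tar.gz" followed by the URL, where myfile.tar.gz is a made-up name) saves the download under that name instead.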

Unpack the downloaded "tape" archive file using the tar command:

tar zxvf iqtree-2.0-rc1-Linux.tar.gz

TAR files are single files containing the contents of an entire directory (or hierarchy of directories). Files in the original directory are simply concatenated into one big file (along with information about which directory or subdirectory each file was in). The "z" tells tar to uncompress the file first (necessary because it has been compressed, as indicated by the "gz" file ending). The "x" tells tar to extract the files ("c" means create, but we are not creating a tar file, we are instead "x"tracting one). The "v" tells tar to be verbose and tell us the name of each file as it is extracted. Finally, "f" says that the name of the tar file is the next thing to expect on the command line.
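To see these switches at work without touching the IQ-TREE download, here is a small self-contained sketch that builds a throwaway directory, packs it with "c", lists the archive with "t" (list contents without extracting), and unpacks it with "x". The directory and file names (demo, note.txt, out) are invented for the illustration.

```shell
# Work in a temporary directory so nothing in your home directory is touched
workdir=$(mktemp -d)
mkdir -p "$workdir/demo/bin"
echo "hello" > "$workdir/demo/bin/note.txt"

# c = create, z = compress with gzip, f = archive name comes next;
# -C changes into $workdir first so paths in the archive start at "demo/"
tar -czf "$workdir/demo.tar.gz" -C "$workdir" demo

# t = list the contents without extracting anything
listing=$(tar -tzf "$workdir/demo.tar.gz")
echo "$listing"

# x = extract, here into a separate directory named "out"
mkdir "$workdir/out"
tar -xzf "$workdir/demo.tar.gz" -C "$workdir/out"
extracted=$(cat "$workdir/out/demo/bin/note.txt")
echo "extracted file contains: $extracted"

# Clean up the temporary directory
rm -rf "$workdir"
```

Listing with "t" before extracting is a useful habit: it shows whether the archive will unpack into its own directory or scatter files into the current one.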

The executable file is now in the "~/iqtree-2.0-rc1-Linux/bin" directory, where "~" means "home directory" and is a synonym of "$HOME", which is itself a synonym of the actual path of your home directory (something like "/home/CAM/plewis"). To make the new version of iqtree easy to access, and to give it a name consistent with the name used in the tutorial, execute this command:

alias iqtree-beta="$HOME/iqtree-2.0-rc1-Linux/bin/iqtree"

This will create an alias named "iqtree-beta" so that when you type "iqtree-beta" on the command line it will be replaced by "~/iqtree-2.0-rc1-Linux/bin/iqtree". This alias will only be available to you while you are logged in; it will be lost when you log out. If you want it to be permanent, edit (using nano) the file "~/.bash_profile" and place the alias command anywhere in the file (but on a line by itself). Now the alias will be automatically recreated every time you log in.
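The following sketch checks two things mentioned above: that "~" and "$HOME" refer to the same directory, and that the alias has been stored. It also illustrates one possible reason to write the alias with $HOME rather than "~": a tilde inside double quotes is not expanded by the shell. The variable names unquoted and quoted are illustrative, and the last line is left commented out so that nothing is written to your profile unless you choose to.

```shell
# An unquoted tilde expands to your home directory...
unquoted=~
# ...but a tilde inside double quotes stays a literal "~"
quoted="~"
echo "unquoted: $unquoted   quoted: $quoted   HOME: $HOME"

# Define the alias exactly as in the text
alias iqtree-beta="$HOME/iqtree-2.0-rc1-Linux/bin/iqtree"

# Given a name and no "=", alias prints the stored definition
alias iqtree-beta

# To make the alias permanent without opening an editor, uncomment the next line.
# Single quotes keep $HOME unexpanded until the profile is read at login.
# echo 'alias iqtree-beta="$HOME/iqtree-2.0-rc1-Linux/bin/iqtree"' >> ~/.bash_profile
```

Note the ">>" in the commented line: it appends to the file, whereas a single ">" would overwrite the whole profile.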

Start the tutorial

We will be using the IQ-TREE tutorial written by Bui Minh (one of the main developers of IQ-TREE) for the 2019 Woods Hole Workshop in Molecular Evolution. The tutorial begins by having you download a couple of files (turtle.fa and turtle.nex). Can you figure out how to do this using curl?

Here is the link to the tutorial:

http://www.iqtree.org/workshop/molevol2019