EEB 5349: Phylogenetics
by Kevin Keegan
Goal: to introduce you to the R package ggtree for plotting phylogenetic trees.
Download the tree file moths.txt and save it in a convenient place on your hard drive.
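If you are working on a cluster that uses environment modules, see what versions of R are available (typically with module avail R):
Load R version 3.3.1:
module load R/3.3.1
When you try to install a package, R will ask whether you want to install it in a local (personal) library directory instead of the system-wide one.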
Open a terminal, start R, and install the packages we will be using:
BiocInstaller ape Biostrings ggplot2 ggtree phytools ggrepel stringr stringi abind treeio
You can install a package like so:
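As a minimal sketch (not necessarily the exact commands used in the original lab), the CRAN packages can be installed with install.packages and the Bioconductor packages with the Bioconductor installer. The split into cran_pkgs and bioc_pkgs, the CRAN mirror URL, and the version check are assumptions made for this example. On R 3.5 or later the installer is BiocManager; on older versions such as R 3.3.1 the legacy biocLite script (from BiocInstaller) was used instead.

```r
# Packages from the list above, grouped by where they are hosted
cran_pkgs <- c("ape", "ggplot2", "phytools", "ggrepel",
               "stringr", "stringi", "abind")
bioc_pkgs <- c("Biostrings", "ggtree", "treeio")

# Assumption: any CRAN mirror will do; this one redirects to a nearby server
cran_repo <- "https://cloud.r-project.org"

# Install only the CRAN packages that are not already present
missing_cran <- setdiff(cran_pkgs, rownames(installed.packages()))
if (length(missing_cran) > 0) {
  install.packages(missing_cran, repos = cran_repo)
}

# Install only the Bioconductor packages that are not already present
missing_bioc <- setdiff(bioc_pkgs, rownames(installed.packages()))
if (length(missing_bioc) > 0) {
  if (getRversion() >= "3.5.0") {
    # Current route: BiocManager
    if (!requireNamespace("BiocManager", quietly = TRUE)) {
      install.packages("BiocManager", repos = cran_repo)
    }
    BiocManager::install(missing_bioc)
  } else {
    # Legacy route for older R versions (e.g. R 3.3.1)
    source("https://bioconductor.org/biocLite.R")
    biocLite(missing_bioc)
  }
}
```

If R asks whether to use a personal library, answering yes is usually what you want when you cannot write to the system library.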
Several of the above packages (like ggtree and treeio) are part of the Bioconductor project. The Bioconductor website has extensive documentation for the project's packages.
Read in the Tree File
We're dealing with a tree in the Newick file format, which the function read.newick from the package treeio can handle:
tree <- read.newick("moths.txt")
R can handle more than just Newick-formatted tree files. To see which output formats from the various phylogenetic software packages R can read, check out the treeio documentation. Note: the functionality within treeio used to be part of the ggtree package itself, but the authors recently split ggtree in two, with one part (ggtree) handling mostly plotting and the other part (treeio) handling mostly file input/output operations.
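One quick way to see which file readers your installed version of treeio provides is to list its exported functions whose names start with "read.". This is only a sketch; the exact list depends on the treeio version you have installed.

```r
library(treeio)

# List the functions treeio makes available whose names begin with "read."
readers <- ls("package:treeio", pattern = "^read\\.")
print(readers)
```

Each of these has its own help page, for example ?read.newick.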
Let's quickly plot the tree to see what it looks like using the regular old plot function from the graphics package:
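A minimal sketch of this step is shown here. It assumes the object tree from the read.newick step already exists; the small four-tip tree is a hypothetical stand-in so the snippet also runs on its own.

```r
library(ape)  # provides the plot method used for tree objects

# Assumes `tree` was read in above; otherwise build a toy stand-in tree
if (!exists("tree")) {
  tree <- ape::read.tree(text = "((A:1,B:1):1,(C:1,D:1):1);")
}

# plot() recognizes the tree object and draws it with tip labels
plot(tree)
```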
Notice the tree has all of its tips labeled. It's also a little cramped. For now, expand the plot window to get the tree to display more legibly. We'll eventually use the function ggsave (from the package ggplot2) to control the dimensions of the plot when we finally export it to a PDF file.
Now plot the tree using the ggtree package:
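The call might look like the following sketch. As before, it assumes tree exists from the read.newick step, and the toy tree is a hypothetical stand-in for running the snippet on its own.

```r
library(ggtree)

# Assumes `tree` was read in above; otherwise build a toy stand-in tree
if (!exists("tree")) {
  tree <- ape::read.tree(text = "((A:1,B:1):1,(C:1,D:1):1);")
}

# With no geoms added, ggtree draws only the branches
p <- ggtree(tree)
print(p)
```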
What happened to our tree!? The plot function simply, but stubbornly, plots your tree without much ability to alter aesthetics. ggtree by default plots only the bare tree, assuming you will add what you want to your tree plot. You can add elements to the plot using geoms, in the same way that you would add elements to plots using the package ggplot2. The use of geoms makes plotting easily extensible, but it is by no means normal R syntax. To see the geoms available to ggtree, check out its reference manual on Bioconductor.
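To illustrate the syntax, here is a small sketch that layers two geoms onto the bare tree with the + operator. The choice of geom_tippoint and geom_treescale is only an example, and the toy tree is a hypothetical stand-in for the tree read in earlier.

```r
library(ggtree)

# Assumes `tree` was read in above; otherwise build a toy stand-in tree
if (!exists("tree")) {
  tree <- ape::read.tree(text = "((A:1,B:1):1,(C:1,D:1):1);")
}

# Each geom adds one layer to the plot
p_layers <- ggtree(tree) +
  geom_tippoint() +   # a point at every tip
  geom_treescale()    # a scale bar for branch lengths
print(p_layers)
```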
Adding/Altering Tree Elements with Geoms
OK, this tree would be more useful with tip labels. Let's add them using geom_tiplab:
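A sketch of this step, again assuming tree exists from the read.newick step (the toy tree is a hypothetical stand-in):

```r
library(ggtree)

# Assumes `tree` was read in above; otherwise build a toy stand-in tree
if (!exists("tree")) {
  tree <- ape::read.tree(text = "((A:1,B:1):1,(C:1,D:1):1);")
}

# Add tip labels at the default text size
p_labels <- ggtree(tree) + geom_tiplab()
print(p_labels)
```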
Those tip labels are nice but a little big. geom_tiplab has a bunch of arguments that you can play around with, including one for the text size. You can read more about the available arguments in the ggtree manual. Plot the tree again, but with smaller labels:
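For example, the size argument of geom_tiplab controls the label text size. The value 2 below is an arbitrary choice for illustration; try a few values to see what suits your tree. The toy tree is a hypothetical stand-in for the tree read in earlier.

```r
library(ggtree)

# Assumes `tree` was read in above; otherwise build a toy stand-in tree
if (!exists("tree")) {
  tree <- ape::read.tree(text = "((A:1,B:1):1,(C:1,D:1):1);")
}

# Same plot as before, but with smaller tip label text
p_small <- ggtree(tree) + geom_tiplab(size = 2)
print(p_small)
```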
Export Plot to PDF
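One way to export the plot is with ggsave from ggplot2, which lets you set the page dimensions explicitly. The following is a sketch only: the file name moths_tree.pdf and the 8.5 x 11 inch dimensions are assumptions chosen for illustration, and the toy tree is a hypothetical stand-in for the tree read in earlier.

```r
library(ggtree)
library(ggplot2)

# Assumes `tree` was read in above; otherwise build a toy stand-in tree
if (!exists("tree")) {
  tree <- ape::read.tree(text = "((A:1,B:1):1,(C:1,D:1):1);")
}

# Build the plot, then write it to a PDF with explicit dimensions
p_final <- ggtree(tree) + geom_tiplab(size = 2)
ggsave("moths_tree.pdf", plot = p_final,
       width = 8.5, height = 11, units = "in")
```

A taller page (a larger height value) gives each tip more vertical room, which helps when the tree has many tips.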
Yu G, Smith D, Zhu H, Guan Y and Lam TT (2017). “ggtree: an R package for visualization and annotation of phylogenetic trees with their covariates and other associated data.” Methods in Ecology and Evolution, 8, pp. 28-36. doi: 10.1111/2041-210X.12628, http://onlinelibrary.wiley.com/doi/10.1111/2041-210X.12628/abstract.