Ggtree

From EEBedia
Revision as of 22:02, 7 March 2018 by Paul Lewis

EEB 5349: Phylogenetics

by Kevin Keegan

Goals

To introduce you to the R package ggtree for plotting phylogenetic trees.

Introduction

Getting Started

Download the tree file moths.txt and save it in a convenient place on your hard drive.

Installing Packages

Open a terminal, start R, and install the following packages:

BiocInstaller
ape
Biostrings
ggplot2
ggtree
phytools
ggrepel
stringr
stringi
abind
treeio

Packages hosted on CRAN (ape, for example) can be installed like so:

install.packages("ape")

Several of the above packages (such as ggtree and treeio) are part of the Bioconductor project and are installed through Bioconductor's own installer rather than directly from CRAN. The Bioconductor website has extensive documentation for the project's packages.
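
BiocInstaller has since been superseded by the BiocManager package, so on a current R installation the setup might look like the sketch below. The split into CRAN and Bioconductor packages reflects where each package is hosted at the time of writing; treat it as an assumption and adjust it if a package has moved.

```r
# Packages from the list above that are hosted on CRAN
cran_pkgs <- c("ape", "ggplot2", "phytools", "ggrepel", "stringr", "stringi", "abind")

# Packages from the list above that are hosted on Bioconductor
bioc_pkgs <- c("Biostrings", "ggtree", "treeio")

# Install only the CRAN packages that are not already available
missing_cran <- cran_pkgs[!vapply(cran_pkgs, requireNamespace, logical(1), quietly = TRUE)]
if (length(missing_cran) > 0) install.packages(missing_cran)

# BiocManager itself comes from CRAN and then installs the Bioconductor packages
if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager")
missing_bioc <- bioc_pkgs[!vapply(bioc_pkgs, requireNamespace, logical(1), quietly = TRUE)]
if (length(missing_bioc) > 0) BiocManager::install(missing_bioc)
```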

Read in the Tree File

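We're dealing with a tree in the Newick file format, which the function read.newick from the package treeio can read. Load treeio, then read the file (this assumes moths.txt is in R's working directory; otherwise supply the full path):

library(treeio)
tree <- read.newick("moths.txt")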

R can handle more than just Newick-formatted tree files. To see which output formats from other phylogenetic software can be read, check out the treeio documentation. Note: the functionality within treeio used to be part of the ggtree package itself, but the authors recently split ggtree in two, with one part (ggtree) handling mostly plotting and the other part (treeio) handling mostly file input/output operations.
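
To check that a Newick file was read as expected, it can help to try the same steps on a tiny tree first. The sketch below uses a made-up four-taxon tree written to a temporary file (a stand-in for moths.txt, not the real data) and inspects the result with functions from ape.

```r
library(treeio)

# Hypothetical four-taxon tree in Newick format (not the moth tree)
newick <- "((A:1,B:1):1,(C:1,D:1):1);"
tree_file <- tempfile(fileext = ".txt")
writeLines(newick, tree_file)

# Read the file the same way as the moth tree
toy_tree <- read.newick(tree_file)

class(toy_tree)      # class of the object that was read
toy_tree$tip.label   # tip names stored in the tree
ape::Ntip(toy_tree)  # number of tips
```

The same three inspection calls can be run on the moth tree once it has been read.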

Let's quickly plot the tree to see what it looks like using the regular old plot function (for a phylo tree object, this calls plot.phylo from the ape package):

plot(tree)

Notice the tree has all of its tips labeled. It's also a little cramped. We'll eventually use the function ggsave from the ggplot2 package to control the dimensions of the plot when we finally export it to a PDF file. Until then, expand the plot window to get the tree to display reasonably well.

Now load the ggtree package and plot the tree with it:

library(ggtree)
ggtree(tree)

What happened to our tree!? The plot function simply, but stubbornly, plots your tree with tip labels and offers limited control over aesthetics. ggtree by default plots only the bare branches, assuming you will add what you want to your tree plot. You add elements to the plot using geoms, the same way you would add elements to plots made with the package ggplot2: each geom is attached to the plot with the + operator. The use of geoms makes plotting easily extensible, but the syntax looks quite different from base R plotting. To see the geoms available in ggtree, check out its reference manual on Bioconductor.
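
The layering idea can be sketched on a small example. The code below assumes a hypothetical four-taxon toy tree (not moths.txt) and adds two point geoms from ggtree, geom_tippoint and geom_nodepoint, one layer at a time; the color is arbitrary.

```r
library(ggtree)

# Hypothetical toy tree built from a Newick string with ape
toy_tree <- ape::read.tree(text = "((A:1,B:1):1,(C:1,D:1):1);")

p_bare   <- ggtree(toy_tree)                          # branches only
p_points <- p_bare + geom_tippoint()                  # add a point at each tip
p_both   <- p_points + geom_nodepoint(color = "red")  # add points at internal nodes

p_both  # printing the object draws the plot
```

Because each step returns a new plot object, intermediate versions such as p_bare can be kept and extended in different directions.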

Adding/Altering Tree Elements with Geoms

Tip Labels

OK, this tree would be more useful with tip labels. Let's add them using geom_tiplab:

ggtree(tree)+geom_tiplab()

Those tip labels are nice but a little big. geom_tiplab has a bunch of arguments that you can play around with, including one for the text size. You can read more about the available arguments in the ggtree manual. Plot the tree again, but with smaller labels:

ggtree(tree)+geom_tiplab(size=3.5)
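
Other geom_tiplab arguments can be combined with size in the same call. The sketch below assumes a hypothetical four-taxon toy tree (not moths.txt); the color, offset, and x-axis limits are illustrative values, and xlim (from ggplot2) is used here only to leave room so the labels are not clipped at the right edge.

```r
library(ggplot2)
library(ggtree)

# Hypothetical toy tree built from a Newick string with ape
toy_tree <- ape::read.tree(text = "((A:1,B:1):1,(C:1,D:1):1);")

p_labels <- ggtree(toy_tree) +
  geom_tiplab(size = 3.5, color = "steelblue", align = TRUE, offset = 0.05) +
  xlim(0, 3)  # widen the x-axis so the labels fit

p_labels
```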

Clade Labels
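
A minimal sketch of labeling a clade with geom_cladelabel, assuming a hypothetical four-taxon toy tree rather than moths.txt. The clade is identified by the node number of its most recent common ancestor, looked up here with getMRCA from ape; the label text and offset are illustrative.

```r
library(ggplot2)
library(ggtree)

# Hypothetical toy tree built from a Newick string with ape
toy_tree <- ape::read.tree(text = "((A:1,B:1):1,(C:1,D:1):1);")

# Node number of the most recent common ancestor of tips A and B
clade_node <- ape::getMRCA(toy_tree, c("A", "B"))

p_clade <- ggtree(toy_tree) +
  geom_tiplab() +
  geom_cladelabel(node = clade_node, label = "Clade AB", offset = 0.3) +
  xlim(0, 3)  # leave room for the clade label

p_clade
```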

Node Labels
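
A minimal sketch of printing internal node numbers on the plot, which is handy for finding the node to pass to other geoms. It assumes a hypothetical four-taxon toy tree rather than moths.txt and uses geom_text2 from ggtree, whose subset aesthetic restricts the labels to internal nodes; the hjust value is illustrative.

```r
library(ggplot2)
library(ggtree)

# Hypothetical toy tree built from a Newick string with ape
toy_tree <- ape::read.tree(text = "((A:1,B:1):1,(C:1,D:1):1);")

p_nodes <- ggtree(toy_tree) +
  geom_tiplab() +
  # label only internal nodes (isTip is FALSE) with their node numbers
  geom_text2(aes(subset = !isTip, label = node), hjust = -0.3)

p_nodes
```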

Clade Color
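
One way to color a clade, sketched under the assumption of a hypothetical four-taxon toy tree rather than moths.txt, is to mark the clade with groupClade and then map branch color to the resulting group variable. The choice of clade (tips A and B) is illustrative.

```r
library(ggplot2)
library(ggtree)

# Hypothetical toy tree built from a Newick string with ape
toy_tree <- ape::read.tree(text = "((A:1,B:1):1,(C:1,D:1):1);")

# Node number of the clade to color
clade_node <- ape::getMRCA(toy_tree, c("A", "B"))

# Attach grouping information to the tree, then color branches by group
grouped_tree <- groupClade(toy_tree, clade_node)
p_color <- ggtree(grouped_tree, aes(color = group)) + geom_tiplab()

p_color
```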

Scale Bar
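
A minimal sketch of adding a branch-length scale bar with geom_treescale, assuming a hypothetical four-taxon toy tree rather than moths.txt. The defaults are used for position and width.

```r
library(ggplot2)
library(ggtree)

# Hypothetical toy tree built from a Newick string with ape
toy_tree <- ape::read.tree(text = "((A:1,B:1):1,(C:1,D:1):1);")

p_scale <- ggtree(toy_tree) +
  geom_tiplab() +
  geom_treescale()  # scale bar in branch-length units

p_scale
```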

Export Plot to PDF
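
A minimal sketch of saving a tree plot to PDF with ggsave from ggplot2, assuming a hypothetical four-taxon toy tree rather than moths.txt. The output path and the page dimensions are illustrative; a large tree such as the moth tree will likely need a taller page.

```r
library(ggplot2)
library(ggtree)

# Hypothetical toy tree built from a Newick string with ape
toy_tree <- ape::read.tree(text = "((A:1,B:1):1,(C:1,D:1):1);")

p <- ggtree(toy_tree) + geom_tiplab(size = 3.5)

# Hypothetical output location; replace with a path of your choice
out_file <- file.path(tempdir(), "toy_tree.pdf")

# width and height set the page size, here in inches
ggsave(out_file, plot = p, width = 6, height = 4, units = "in")
```

Passing the plot explicitly through the plot argument avoids relying on whichever plot was displayed last.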

Cite ggtree

citation("ggtree")

References

Yu G, Smith D, Zhu H, Guan Y and Lam TT (2017). “ggtree: an R package for visualization and annotation of phylogenetic trees with their covariates and other associated data.” Methods in Ecology and Evolution, 8, pp. 28-36. doi: 10.1111/2041-210X.12628, http://onlinelibrary.wiley.com/doi/10.1111/2041-210X.12628/abstract.