Paul Lewis


Revision as of 10:22, 19 January 2011 by PaulLewis
P gracilis inflor.jpg
Welcome to the home page of Paul O. Lewis. I am an Associate Professor in the Department of Ecology and Evolutionary Biology at the University of Connecticut. Although much of my professional training was in the area of flowering plant systematics, specifically the evolution and biogeography of the North American genus Polygonella (P. gracilis is shown at left), my research now focuses primarily on phylogenetic theory and methodology. Please visit the research section below to learn more about the ongoing research in my lab.

Prospective students: I welcome inquiries by prospective students, especially those with interests in both mathematics and biology.



Paul's office and lab are located on the first floor of the Torrey Life Science Building on the Storrs campus of the University of Connecticut (map). Our mailing address is:
Department of Ecology & Evolutionary Biology, University of Connecticut, 75 North Eagleville Road, Unit 3043, Storrs, CT 06269-3043 U.S.A.

Current Lab Members

Paul O. Lewis
Paul O. Lewis is an Associate Professor in the Ecology and Evolutionary Biology Department. He obtained his B.S. in Biology and Mathematics from Georgetown College in 1982, M.S. in Biology from Memphis State University (now The University of Memphis) in 1984, and Ph.D. in Plant Biology from Ohio State University in 1991. He was a postdoc at North Carolina State University with Bruce Weir (now at the University of Washington), and at the Laboratory of Molecular Systematics, Smithsonian Institution, with David Swofford (now at Duke University). Paul came to UConn after 3 years as an Assistant Professor in the Biology Department at the University of New Mexico. Office: TLS 166A Voice: (860) 486-2069 Fax: (860) 486-6364 E-mail:

Yu (Daniel) Fan
Yu (Daniel) Fan is a doctoral student. He obtained his bachelor's degree from the Ocean University of China in Qingdao, and graduate education in Molecular Systematics from the Institute of Oceanology, Chinese Academy of Sciences (IOCAS). Office: TLS 162 Voice: (860) 486-2232 Fax: (860) 486-6364 E-mail:

Past Lab Members

Holder family
Mark T. Holder was a postdoctoral research associate in the lab for 2 years. Mark received his Ph.D. from the University of Texas, Austin, in 2001, moved on to a postdoctoral position with David Swofford at Florida State University in 2003, and is currently an Assistant Professor in Ecology and Evolutionary Biology at the University of Kansas. Kris (also Ph.D. 2001 from Univ. of Texas, Austin) and Julia (b. 2001) are also pictured. Not pictured is the latest arrival, Susan (b. Nov. 13, 2004). For Mark's current contact information, see


Below is a brief description of several recent projects to give you a sense of the kinds of research questions I address. The biggest emphasis is currently Bayesian phylogenetic inference. This field is perhaps not in its infancy anymore, but there are many important problems still needing work, especially in areas such as model comparison, mixing, and convergence diagnostics. A list of published and in-press papers follows the project descriptions.

Bayesian Model Selection

Most recently, in collaboration with Ming-Hui Chen and Lynn Kuo in the UConn Department of Statistics, we've been working on improving methods for estimating the marginal likelihood of a model. The marginal likelihood is used in Bayesian inference to compare model fit: when two models are compared, the one with the higher marginal likelihood can be viewed as fitting the data better overall. The commonly used harmonic mean (HM) method is biased, tending to overestimate the fit of a model. This can lead to selection of models that are overparameterized, the consequences of which include longer run times for MCMC analyses and, potentially, poor parameter estimates for some parts of the model. Our new method for estimating marginal likelihoods is called steppingstone sampling (or SS for short). SS is much more reliable than the HM method, and is as accurate as thermodynamic integration, an alternative estimation method developed by Nicolas Lartillot and Hervé Philippe (see Lartillot and Philippe. 2006. Computing Bayes factors using thermodynamic integration. Systematic Biology 55(2):195-207). We anticipate that SS will have the most impact on partitioned analyses, where HM often suggests that the most-partitioned model is best. SS is currently (as of Feb. 2010) being incorporated into the software Phycas so that others can try it.
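To make the contrast between HM and SS concrete, here is a minimal Python sketch on a toy model rather than a phylogenetic one. It is not the Phycas implementation. The assumptions are: the data are normal with unknown mean and unit variance, the prior on the mean is N(0, tau^2), and each power posterior (likelihood raised to a power beta, times the prior) can be sampled exactly, whereas a real analysis would need an MCMC run for each. The beta values are placed at evenly spaced quantiles of a Beta(0.3, 1) distribution, which is one possible choice, and all function names are hypothetical. Because this toy model has a closed-form marginal likelihood, both estimates can be compared with the exact value.

```python
import math
import random


def log_likelihood(mu, data):
    """Log-likelihood of data assumed to be N(mu, 1)."""
    return sum(-0.5 * math.log(2 * math.pi) - 0.5 * (y - mu) ** 2 for y in data)


def exact_log_marginal(data, tau):
    """Closed-form log marginal likelihood for the N(mu, 1) model with mu ~ N(0, tau^2)."""
    n = len(data)
    s1 = sum(data)
    s2 = sum(y * y for y in data)
    return (-0.5 * n * math.log(2 * math.pi)
            - 0.5 * math.log(1 + n * tau ** 2)
            - 0.5 * (s2 - tau ** 2 * s1 ** 2 / (1 + n * tau ** 2)))


def sample_power_posterior(beta, data, tau, size, rng):
    """Exact draws from the density proportional to likelihood^beta * prior.

    For this toy model the power posterior is itself normal, so no MCMC is needed.
    """
    precision = beta * len(data) + 1 / tau ** 2
    mean = beta * sum(data) / precision
    sd = precision ** -0.5
    return [rng.gauss(mean, sd) for _ in range(size)]


def log_mean_exp(values):
    """Numerically stable log of the mean of exp(values)."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values) / len(values))


def steppingstone(data, tau, n_steps, size, rng, alpha=0.3):
    """Steppingstone estimate of the log marginal likelihood.

    The marginal likelihood is written as a product of ratios of normalizing
    constants of adjacent power posteriors; each ratio is estimated by
    averaging likelihood^(beta_k - beta_{k-1}) over draws taken at beta_{k-1}.
    """
    betas = [(k / n_steps) ** (1 / alpha) for k in range(n_steps + 1)]
    total = 0.0
    for b_prev, b_next in zip(betas, betas[1:]):
        draws = sample_power_posterior(b_prev, data, tau, size, rng)
        total += log_mean_exp([(b_next - b_prev) * log_likelihood(mu, data)
                               for mu in draws])
    return total


def harmonic_mean(data, tau, size, rng):
    """Harmonic mean estimate of the log marginal likelihood from posterior draws."""
    draws = sample_power_posterior(1.0, data, tau, size, rng)
    return -log_mean_exp([-log_likelihood(mu, data) for mu in draws])


if __name__ == "__main__":
    rng = random.Random(1)
    data = [rng.gauss(1.0, 1.0) for _ in range(20)]  # simulated toy data
    tau = 10.0  # a deliberately vague prior
    print("exact:", exact_log_marginal(data, tau))
    print("SS:   ", steppingstone(data, tau, 20, 2000, rng))
    print("HM:   ", harmonic_mean(data, tau, 2000, rng))
```

With a vague prior such as tau = 10, the HM estimate in this sketch tends to come out noticeably higher than the exact value, while the SS estimate tends to land close to it. This mirrors, in a very simplified setting, the overestimation of model fit described above.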

Bayesian Star Tree Paradox

If sequence data are simulated using a 4-taxon star tree (such as the one shown on the right) and evaluated with standard software tools for Bayesian phylogenetic inference, one of the 3 possible fully-resolved trees is often supported very strongly. This is paradoxical in that most people expect the three possible resolutions to be equally supported in this case, but such an outcome is only seen when the sequence length is tiny (e.g. 1 site). It appears that uncertainty in this case is manifested in the inability to predict, from dataset to dataset, which of the 3 possible fully-resolved tree topologies will be favored. This behavior is troubling, and possible examples of this behavior have been pointed out by several researchers. Many more potential examples can be found in the literature by looking for high posterior probabilities but low bootstrap support, combined with tiny internal edges.

We argue that the central problem here is the non-identifiability of the tree topology, and propose a solution using reversible-jump MCMC. Our rjMCMC sampler visits not only fully-resolved tree topologies, but can visit topologies containing hard polytomies as well. This effectively places a point mass prior probability on polytomies, providing an alternative in situations in which a fully-resolved topology is not a reasonable option. The analysis can be made as conservative as desired by modifying the prior distribution assumed for topologies, but in our (albeit limited) experience it does not appear easy to destroy support for real edges by using a prior that strongly supports polytomous topologies.
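The role of the topology prior can be illustrated with a small Bayes' rule calculation for the 4-taxon case. This is only a sketch, not the rjMCMC sampler itself: it assumes that a log marginal likelihood is already available for the star tree and for each of the three resolved topologies, and it assumes the prior mass not given to the star tree is split equally among the resolved trees. The function name, the topology labels, and the numbers in the usage example are hypothetical.

```python
import math


def topology_posterior(log_marginals, prior_star):
    """Posterior over the star tree and the resolved topologies.

    log_marginals maps a topology label to its log marginal likelihood and
    must include the key "star". prior_star is the point-mass prior
    probability placed on the star tree; the remainder is divided equally
    among the other topologies (an assumption made for this illustration).
    """
    resolved = [t for t in log_marginals if t != "star"]
    prior = {t: (1.0 - prior_star) / len(resolved) for t in resolved}
    prior["star"] = prior_star

    # Work in log space and skip topologies whose prior probability is zero.
    log_joint = {t: math.log(prior[t]) + log_marginals[t]
                 for t in log_marginals if prior[t] > 0.0}
    m = max(log_joint.values())
    norm = sum(math.exp(v - m) for v in log_joint.values())
    return {t: (math.exp(log_joint[t] - m) / norm if t in log_joint else 0.0)
            for t in log_marginals}


if __name__ == "__main__":
    # Hypothetical log marginal likelihoods, chosen only for illustration.
    log_marginals = {"star": -1000.0, "AB|CD": -998.0,
                     "AC|BD": -1001.0, "AD|BC": -1001.0}
    for prior_star in (0.0, 0.1, 0.5, 0.9):
        print(prior_star, topology_posterior(log_marginals, prior_star))
```

Setting prior_star to 0 reproduces an analysis that considers only fully-resolved trees, and raising it makes the analysis more conservative, since a resolved topology then needs a larger advantage in marginal likelihood to retain high posterior probability.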

Reference: Lewis, P. O., Holder, M. T., and Holsinger, K. E. 2005. Polytomies and Bayesian phylogenetic inference. Systematic Biology 54(2): 241-253.

Phylodiversity in Desert Green Algae (the other land plants)

A major thrust in the laboratory of Louise Lewis is diversity and systematics of green algae (Phylum Chlorophyta) living in the soils of North American deserts. These unicellular green algae are capable of tolerating the harsh conditions posed by desert soil environments, and represent an important (yet not well understood) component of desert microbiotic crust communities. The 18S rDNA sequences of a number of green algal isolates have been determined, and these data suggest that several lineages of green algae have diversified within deserts. One might be tempted to think that the green algal cells isolated from desert soils are simply the result of spores dispersed into deserts from distant aquatic sources. This study shows that the 18S sequences of these desert isolates are more divergent from their nearest aquatic relatives than would be predicted if they were merely incidental visitors. We characterize the molecular phylodiversity of desert green algae and demonstrate with a Bayesian analysis of 150 green algal 18S sequences that all freshwater classes of green algae have yielded desert lineages. The numerous transitions from aquatic to desert existence apparent from the phylogeny argue that it is no longer accurate to portray land plants as resulting from a single origin. The highly celebrated origin leading to the embryophytes is but one of many transitions to terrestriality.

Reference: Lewis, L. A., and Lewis, P. O. 2005. Unearthing the molecular phylodiversity of desert soil green algae (Chlorophyta). Systematic Biology 54(6): 936-947.


(Current or past lab members are indicated in bold.)

Xie, W., P. O. Lewis, Y. Fan, L. Kuo, and M.-H. Chen. Improving marginal likelihood estimation for Bayesian phylogenetic model selection. Systematic Biology (in press). doi:10.1093/sysbio/syq085

Fan, Y., R. Wu, M.-H. Chen, L. Kuo and P. O. Lewis. 2011. Choosing among partition models in Bayesian Phylogenetics. Molecular Biology and Evolution 28(1):523-532. doi:10.1093/molbev/msq224 Open Access

Holder, M. T., P. O. Lewis, and D. L. Swofford. 2010. The Akaike Information Criterion will not choose the no common mechanism model. Systematic Biology 59(4):477–485. doi:10.1093/sysbio/syq028

Holder, M. T., J. Sukumaran, and P. O. Lewis. 2008. A justification for reporting the majority-rule consensus tree in Bayesian phylogenetics. Systematic Biology 57(5):814–821. doi:10.1080/10635150802422308

Wickett, N. J., Y. Fan, P. O. Lewis, and B. Goffinet. 2008. Distribution and evolution of pseudogenes, gene losses, and a gene rearrangement in the plastid genome of the nonphotosynthetic liverwort, Aneura mirabilis (Metzgeriales, Jungermanniopsida). Journal of Molecular Evolution 67(1): 111-122.

Lapp, H., S. Bala, J. P. Balhoff, A. Bouck, N. Goto, M. Holder, R. Holland, A. Holloway, T. Katayama, P. O. Lewis, A. J. Mackey, B. I. Osborne, W. H. Piel, S. L. Kosakovsky Pond, A. F. Y. Poon, W.-G. Qiu, J. E. Stajich, A. Stoltzfus, T. Thierer, A. J. Vilella, R. A. Vos, C. M. Zmasek, D. J. Zwickl, and T. J. Vision. 2007. The 2006 NESCent Phyloinformatics Hackathon: A Field Report. Evolutionary Bioinformatics 3: 357-366.

Gelfand, A. E., J. A. Silander Jr., S. Wu, A. Latimer, P. O. Lewis, A. G. Rebelo and M. Holder. 2006. Explaining species distribution patterns through hierarchical modeling. Bayesian Analysis 1(1): 41-92.

Holder, M. T., Lewis, P. O., Swofford, D. L., and Larget, B. 2005. Hastings ratio of the LOCAL proposal used in Bayesian phylogenetics. Systematic Biology 54(6): 961-965.

Lewis, L. A., and Lewis, P. O. 2005. Unearthing the molecular phylodiversity of desert soil green algae (Chlorophyta). Systematic Biology 54(6): 936-947.

Lewis, P. O., Holder, M. T. and Holsinger, K. E. 2005. Polytomies and Bayesian phylogenetic inference. Systematic Biology 54(2): 241-253.

Holder, M. T. and Lewis, P. O. 2003. Phylogeny estimation: traditional and Bayesian approaches. Nature Reviews Genetics 4: 275-284.

Duran, K. L., Lowrey, T. K., Parmenter, R. R., and Lewis, P. O. 2005. Genetic diversity in Chihuahuan desert populations of creosotebush (Zygophyllaceae: Larrea tridentata). American Journal of Botany 92(4): 722-729.

Lewis, P. O. 2003. NCL: a C++ class library for interpreting data files in NEXUS format. Bioinformatics 19(17): 2330-2331.

Brauer, M. J., Holder, M. T., Dries, L. A., Zwickl, D. J., Lewis, P. O., and Hillis, D. M. 2002. Genetic algorithms and parallel processing in maximum-likelihood phylogeny inference. Molecular Biology and Evolution 19: 1717-1726.

Holsinger, K. E., Lewis, P. O., and Dey, D. K. 2002. A Bayesian approach to inferring population structure from dominant markers. Molecular Ecology 11: 1157-1164.

Swofford, D. L., Waddell, P. J., Huelsenbeck, J. P., Foster, P. G., Lewis, P. O., and Rogers, J. S. 2001. Bias in phylogenetic estimation and its relevance to the choice between parsimony and likelihood methods. Systematic Biology 50: 525-539.

Lewis, P. O. 2001. A likelihood approach to estimating phylogeny from discrete morphological character data. Systematic Biology 50:913-925.

Lewis, P. O. 2001. Phylogenetic systematics turns over a new leaf. Trends in Ecology and Evolution 16:30-37. (Reprinted from TRENDS in Ecology & Evolution, volume 16, copyright 2001, with permission from Elsevier Science)

Conant, G. C., and Lewis, P. O. 2001. Effects of nucleotide composition bias on the success of the parsimony criterion in phylogenetic inference. Molecular Biology and Evolution 18: 1024-1033.

Lewis, P. O., and Swofford, D. L. 2001. Back to the future: Bayesian inference arrives in phylogenetics. Trends in Ecology and Evolution 16: 600-601. (conference report)

Lewis, P. O. 1998. Maximum likelihood as an alternative to parsimony for inferring phylogeny using nucleotide sequence data. Pages 132-163 in: Soltis, D. E., Soltis, P. S., and Doyle, J. J., Molecular Systematics of Plants II. Kluwer, Boston.

Lewis, P. O. 1998. A genetic algorithm for maximum likelihood phylogeny inference using nucleotide sequence data. Molecular Biology and Evolution 15:277-283.

Gaut, B. S., and Lewis, P. O. 1995. Success of maximum likelihood phylogeny inference in the four-taxon case. Molecular Biology and Evolution 12:152-162.

Lewis, P. O., and Crawford, D. J. 1995. Pleistocene refugium endemics exhibit greater allozymic diversity than widespread congeners in the genus Polygonella (Polygonaceae). American Journal of Botany 82:141-149.

Williams, C. G., Hamrick, J. L., and Lewis, P. O. 1995. Multiple-population versus hierarchical conifer breeding programs: a comparison of genetic diversity levels. Theoretical and Applied Genetics 90: 584-594.

Lewis, P. O., and Lewis, L. A. 1995. MEGA: Molecular evolutionary genetics analysis, version 1.02. Systematic Biology 44: 576-577. (software review)

Kaplan, N. L., Lewis, P. O. and Weir, B. S. 1994. Age of cystic fibrosis mutation. Nature Genetics 8: 216.

Snow, A. A., and Lewis, P. O. 1993. Reproductive traits and male fertility in plants: empirical approaches. Annual Review of Ecology and Systematics 24: 331-351.

Govindaraju, D., Lewis, P. O., and Cullis, C. 1992. Phylogenetic analysis of pines using ribosomal DNA restriction fragment length polymorphisms. Plant Systematics and Evolution 179: 141-153.

Lewis, P. O., and Snow, A. A. 1992. Deterministic paternity exclusion using RAPD markers. Molecular Ecology 1:155-160.

Lewis, P. O. 1991. Allozyme variation in the rare Gulf Coast endemic Polygonella macrophylla Small (Polygonaceae). Plant Species Biology 6: 1-10.




