Development and application of open-source research tools for computational biology


McMurdie and Holmes. Waste Not, Want Not: Why Rarefying Microbiome Data Is Inadmissible.

(2014) PLoS Computational Biology. 10(4):e1003531

(2013) Pre-print hosted on arXiv prior to publication:

PDF version 2 (12 Dec 2013), PDF version 1 (1 Oct 2013)


Citing phyloseq in your article:

McMurdie and Holmes (2013) phyloseq: An R Package for Reproducible Interactive Analysis and Graphics of Microbiome Census Data. PLoS ONE. 8(4):e61217


phyloseq: Reproducible Analysis for Microbiome Census Data

I am interested in data management, multiple testing, exploratory analysis, and other statistical interpretations of microbiome census data in studies such as the human gut microbiome, microbial mats, and biofuel feedstock degradation communities. To meet the need for easier, more reproducible statistical analysis of these highly multivariate, multicomponent data, I created an open-source R package, phyloseq, that provides a set of tools for importing, organizing, filtering, analyzing, and graphically summarizing phylogenetic sequencing data. The phyloseq package leverages many of the tools available in R for ecological/phylogenetic analysis, graphics, statistics, and parallel/cloud computing, with an emphasis on flexible, publication-quality graphics built with ggplot2, a powerful implementation of the Grammar of Graphics.
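As a rough illustration of the import, filter, analyze, and plot workflow described above, the following minimal sketch uses the GlobalPatterns example dataset bundled with phyloseq. It assumes phyloseq and ggplot2 are installed. The cutoff of 500 taxa, the relative-abundance transformation, and the choice of PCoA on Bray-Curtis dissimilarities are arbitrary choices made for a quick demonstration, not recommendations.

```r
# Illustrative sketch only: the dataset, taxa cutoff, and methods are assumptions.
library(phyloseq)
library(ggplot2)

# Load an example dataset that ships with phyloseq.
data(GlobalPatterns)

# Filter: keep the 500 most abundant taxa (arbitrary cutoff for a quick demo).
top_taxa <- names(sort(taxa_sums(GlobalPatterns), decreasing = TRUE))[1:500]
gp_top <- prune_taxa(top_taxa, GlobalPatterns)

# Transform: convert counts to within-sample relative abundances.
gp_rel <- transform_sample_counts(gp_top, function(x) x / sum(x))

# Analyze: principal coordinates analysis on Bray-Curtis dissimilarities.
ord <- ordinate(gp_rel, method = "PCoA", distance = "bray")

# Summarize graphically: plot_ordination() returns a ggplot2 object,
# so it can be extended with ordinary ggplot2 layers.
p <- plot_ordination(gp_rel, ord, color = "SampleType") +
  geom_point(size = 3) +
  ggtitle("PCoA of GlobalPatterns (Bray-Curtis)")
print(p)
```

Because the plot is an ordinary ggplot2 object, themes, facets, and scales can be added in the usual ggplot2 way before saving it with ggsave().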

Latest Peer-Reviewed Article about phyloseq:


The phyloseq package is a completely open-source software tool, licensed under AGPL-3.
The development version of phyloseq is available for use and collaborative development through GitHub, where we also host tutorials and demo materials, as well as the phyloseq feature-request and issue tracker.

Previous Research

My doctoral research focused on a group of bacteria, called Dehalococcoides, that can destroy chlorinated organic pollutants, including chloroethenes, the most commonly detected pollutants in contaminated groundwater. The prevalence of chloroethenes in our nation's aquifers is primarily due to the use of tetrachloroethene (PCE) in dry-cleaning, as well as the extensive use of trichloroethene (TCE) in industrial degreasing. In particular, I am interested in the evolution of the enzymes responsible for the ability of Dehalococcoides to respire (think: "breathe") chlorinated compounds, and the surprising extent to which the Dehalococcoides genome and ecological niche are specialized toward respiration of organochlorine compounds.