Communicate Knowledge - New Article in Briefings in Bioinformatics

Posted on April 07, 2021

Communicate Knowledge - Comprehensive benchmarking of software for mapping whole genome bisulfite data: from read alignment to DNA methylation analysis

Whole genome bisulfite sequencing is currently at the forefront of epigenetic analysis, facilitating the nucleotide-level resolution of 5-methylcytosine (5mC) on a genome-wide scale. Specialized software has been developed to accommodate the unique difficulties in aligning such sequencing reads to a given reference, building on the knowledge acquired from model organisms such as human or Arabidopsis thaliana. As the field of epigenetics expands its purview to non-model plant species, new challenges arise which bring into question the suitability of previously established tools. Herein, nine short-read aligners are evaluated: Bismark, BS-Seeker2, BSMAP, BWA-meth, ERNE-BS5, GEM3, GSNAP, Last and segemehl. Precision-recall of simulated alignments, in comparison to real sequencing data obtained from three natural accessions, reveals that, on balance, BWA-meth and BSMAP are able to make the best use of the data during mapping. The influence of difficult-to-map regions, characterized by deviations in sequencing depth over repeat annotations, is evaluated in terms of the mean absolute deviation of the resulting methylation calls in comparison to a realistic methylome. Downstream methylation analysis is responsive to the handling of multi-mapping reads relative to mapping quality (MAPQ), and potentially susceptible to bias arising from the increased sequence complexity of densely methylated reads.

Nunn A, Otto C, Stadler PF, Langenberger D: 'Comprehensive benchmarking of software for mapping whole genome bisulfite data: from read alignment to DNA methylation analysis', Briefings in Bioinformatics bbab021, (2021)
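
For readers unfamiliar with the two metrics named in the abstract, the following is a minimal sketch of how precision-recall for simulated alignments and the mean absolute deviation of methylation calls could be computed. It is not the code used in the paper. The function names, the input layout, and the definitions are assumptions made for illustration: a simulated read counts as correct when it is mapped to its known origin, and methylation levels are given per site as fractions between 0 and 1.

```python
from statistics import mean


def precision_recall(correct, incorrect, unmapped):
    """Precision and recall from counts of simulated reads.

    Assumed definitions (not taken from the paper):
      precision = correctly mapped reads / all mapped reads
      recall    = correctly mapped reads / all simulated reads
    """
    mapped = correct + incorrect
    total = mapped + unmapped
    precision = correct / mapped if mapped else 0.0
    recall = correct / total if total else 0.0
    return precision, recall


def mean_absolute_deviation(called, expected):
    """Mean absolute difference between called and expected methylation levels.

    Both arguments are dicts keyed by a hypothetical (chromosome, position)
    tuple. Only sites present in both dicts are compared.
    """
    shared = called.keys() & expected.keys()
    if not shared:
        raise ValueError("no shared sites to compare")
    return mean(abs(called[site] - expected[site]) for site in shared)


# Illustrative values only, not results from the study.
example_precision, example_recall = precision_recall(80, 10, 10)
example_mad = mean_absolute_deviation(
    {("chr1", 10): 0.5, ("chr1", 20): 1.0},
    {("chr1", 10): 0.75, ("chr1", 20): 0.5, ("chr1", 30): 0.0},
)
```

Restricting the comparison to shared sites is one possible design choice. Sites missed entirely by an aligner could instead be penalized, which would change the resulting value.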

Dr. Gero Doose
Adam Nunn holds a BSc in Biological Sciences from the University of Portsmouth and a double MSc in Bioinformatics and Biology from Lund University, Sweden. Between his bachelor's and master's degrees, he worked for four years as a lead research scientist for a UK-based company, where he focused on developing biological alternatives to chemical pesticides. In 2017 he joined ecSeq Bioinformatics GmbH for his PhD studies. His work is fully funded by the Marie Skłodowska-Curie Innovative Training Network EpiDiverse and is carried out in cooperation with Prof. Peter F. Stadler, holder of the bioinformatics chair at Leipzig University.
