Bioinformatics Pipeline Development with Nextflow
How to manage your own data analysis pipelines using workflow management systems

Streamline your research through the development of reproducible analysis pipelines

In a nutshell

  • Learn the fundamental best practices of bioinformatic pipeline development
  • Understand how workflow management systems can accelerate your research
  • Use state-of-the-art, open source software to make complex analyses routine
  • Build and run your own custom analysis pipelines using Nextflow!

When?
October 4-7, 2022
9 am - 5 pm

Where?
Leipzig, Germany

The purpose of the workshop is to introduce the concepts of bioinformatic pipeline development in the context of the open source Workflow Management System (WMS) Nextflow. Participants will be trained in the scripting, configuration and execution of example analysis pipelines based on current industry best practices, and will learn how to share them with other users. Finally, participants will apply everything they have learned by implementing their own analysis pipelines from the ground up.

By the end of the workshop, all attendees will be able to build their own scalable, reproducible bioinformatic pipelines that can be run locally, on high-performance computing clusters or even in the cloud. The course layout has been adapted to the needs of beginners in computational biology and gives scientists with little or no background in software development a first hands-on experience in this new and fast-evolving area of expertise.

Get trained by experts

Our trainers have a proven record of academic and/or industrial experience in NGS data analysis, because up-to-date expert knowledge is needed to answer your questions and to know what is important in the field.

Open source NGS tools

We only use open source tools that are free to use for academia and industry.

Learn effectively with well-curated materials

For an optimal learning experience we carefully prepare our learning materials and example data.

This workshop has been adapted to the needs of beginners in the field of (biological) data analysis and comprises these three course modules:

  1. Introduction to pipeline development and workflow management systems:
    An overview of bioinformatic pipeline development in the context of workflow management systems such as Nextflow and Snakemake. Particular attention is given to understanding and addressing the needs of other pipeline users with regard to different types of computational infrastructure.
  2. Nextflow for biological data analysis:
    Get hands-on with Nextflow. Understand processes and channels, the scripting language and syntax, execution abstraction and relevant configuration options. This module covers essential knowledge for the practical implementation of any new project in bioinformatic pipeline development.
  3. Build your own analysis pipeline:
    This module will be entirely hands-on, beginning with the planning and outlining of a custom bioinformatics pipeline and ending with the opportunity to start building and implementing the pipeline from the ground up, with guidance from our in-house experts. Participants can choose from a selection of relevant examples from the field of NGS data analysis.

Detailed Course Program


Introduction to pipeline development and workflow management systems

  • Introduction and overview: Why build bioinformatic analysis pipelines at all?
  • Workflow Management Systems: What’s out there and how should I decide what to use? How do I think like an end-user?
  • Where to find example pipelines, how to run them, and what output to expect. Get familiar with the Linux command line.
  • Considerations for different types of underlying computational infrastructure.
  • Should my pipeline run locally, on an HPC cluster or in the cloud? How do I make my work scalable?
  • Setting up environmental dependencies and software containers. How do I make my work reproducible?
  • Industry best practices and optimising your work environment for software development.

Nextflow for biological data analysis

  • Understanding the concepts of dataflow: processes and channels, input and output
  • Running a pipeline with Nextflow: work directory layout and process execution
  • Language basics: Nextflow scripting and syntax
  • Configuration options: parameters, scopes and profiles
  • Execution abstraction: integrating with resource management software
  • Workflow introspection: runtime metadata and handling errors
  • Sharing your pipeline with online code repositories
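
As a rough illustration of the dataflow idea named in the first bullet above, the following plain-Python sketch chains two "processes" through "channels". It is an analogy only and not Nextflow syntax: the generator-based channel, the function names (`channel_of`, `to_upper`, `gc_content`, `run_pipeline`) and the sample reads are all hypothetical, and reads are assumed to be non-empty.

```python
from typing import Iterable, Iterator, List, Tuple


def channel_of(*items: str) -> Iterator[str]:
    """Hypothetical 'channel': a stream that emits each item once, in order."""
    yield from items


def to_upper(reads: Iterable[str]) -> Iterator[str]:
    """Hypothetical 'process': consumes an input channel, emits an output channel."""
    for read in reads:
        yield read.upper()


def gc_content(reads: Iterable[str]) -> Iterator[Tuple[str, float]]:
    """Hypothetical downstream 'process': pairs each read with its GC fraction."""
    for read in reads:
        gc = sum(base in "GC" for base in read)
        # Assumes non-empty reads; an empty string would divide by zero.
        yield read, gc / len(read)


def run_pipeline(*raw_reads: str) -> List[Tuple[str, float]]:
    # Wire the steps together: each process only sees the channel feeding it.
    return list(gc_content(to_upper(channel_of(*raw_reads))))


if __name__ == "__main__":
    for read, fraction in run_pipeline("acgt", "ggcc", "atat"):
        print(read, fraction)
```

The point of the sketch is the wiring: each step declares only what it consumes and what it emits, so steps can be rearranged or replaced without touching the others. A workflow management system adds scheduling, parallel execution and software environments on top of this basic pattern.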

Build your own analysis pipeline

  • How to outline and approach a new project in pipeline development
  • Getting started: building your pipeline from the ground up
  • Write processes, define the workflow, add dependencies, run and test your pipeline!

Speakers

Adam Nunn (ecSeq Bioinformatics GmbH)
Adam is a PhD student in the Marie Skłodowska-Curie Innovative Training Network 'Epidiverse'. He has developed several bioinformatics pipelines using Nextflow for this European network.

Dr. Mario Fasold (ecSeq Bioinformatics GmbH)
Mario has worked on the analysis of microarray data since 2007 and has developed several bioinformatics tools, such as the Bioconductor package AffyRNADegradation and the Larpack program package. Since 2011 he has specialized in NGS data analysis and has helped analyse sequencing data from several large consortium projects.

Requirements

The target audience is biologists and data analysts with little or no experience in developing computational pipelines for data analysis. A basic understanding of molecular biology (DNA, RNA, gene expression, PCR, ...) is assumed, as examples will be given in the context of this field.

Some familiarity with a command line interface (e.g. Linux, Mac OS X) and a minimal understanding of object-oriented programming (e.g. with Python or Java) are recommended but not required.

Included in the workshop

  • Printed course materials
  • Catering during the workshop
  • Conference dinner
  • High-performance computer (no laptop needed)
  • Downloadable environment for seamless continuation / repetition after the course
  • Certificate

Attendance

Location: Z&P Schulung GmbH, Rabensteinpl. 1, 04103 Leipzig, Germany
Language: English
Available Seats: 25 (first-come, first-served)

Registration Fee: 1359 EUR (excluding VAT)

Travel Information - Leipzig

Key dates

Opening Date of Registration: March 1, 2022
Closing Date of Registration: September 1, 2022
Workshop: October 4-7, 2022 from 9 am to 5 pm

"This is a fantastic course for beginners planning to self analyse NGS data. The course is well organised and explained in an easy common man language." Winny Varikatt, ICPMR, Westmead Hospital, Australia

"I really enjoyed the nature of the workshop, I felt that everything was very approachable for the attendees to learn even if the experience was lacking. I witnessed some of my classmates who have no experience ask very detailed questions through the course of the workshop, so I think it is a testimonial to how informational the content of the workshop was." Noah Legall, University of Georgia, USA

"It was an interesting course where I could learn a lot. It was not only interesting to see how nextflow itself works, but also very helpful to understand how such pipeline-tools work. Now I will be able to use such pipelines." Cecilia Mittelberger, Laimburg Research Centre, Italy



When you register for this workshop you are agreeing with our Workshop Terms and Conditions. Please read them before you register.


Any Questions? Please feel free to contact our events team.

ecSeq Bioinformatics GmbH
Sternwartenstr. 29
D-04103 Leipzig
Germany
Email: events@ecSeq.com