Bioinformatics Pipeline Development with Nextflow
How to manage your own data analysis pipelines using workflow management systems

Streamline your research through the development of reproducible analysis pipelines

In a nutshell

  • Learn the fundamental best practices of bioinformatic pipeline development
  • Understand how workflow management systems can accelerate your research
  • Use state-of-the-art, open source software to make complex analyses routine
  • Build and run your own custom analysis pipelines using Nextflow!

When?
November 15-17, 2023
9 am - 5 pm (CET UTC+1)

Where?
Online

The purpose of the workshop is to introduce the concepts of bioinformatic pipeline development in the context of the open source Workflow Management System (WMS) Nextflow. Participants will be trained in the scripting, configuration and execution of example analysis pipelines based on current industry best practices, and will learn how to share them with other users. Finally, participants will apply everything they have learned by implementing their own analysis pipelines from the ground up.

By the end of the workshop all attendees will be able to build their own scalable, reproducible bioinformatic pipelines which can be run locally, on high-performance computing clusters or even in the cloud. The course layout has been adapted to the needs of beginners in the field of computational biology and allows scientists with little or no background in software development to get a first hands-on experience in this new and fast-evolving area of expertise. This instructor-led live online workshop has been newly designed for an engaging, interactive online learning experience.

Get trained by experts

Our trainers have a proven record of academic and/or industrial experience in NGS data analysis, because up-to-date expert knowledge is needed to answer your questions and to know what is important in the field.

Open source NGS tools

We only use open source tools that are free to use for academia and industry.

Learn effectively with well-curated materials

For an optimal learning experience we carefully prepare our learning materials and example data.

This workshop has been adapted to the needs of beginners in the field of (biological) data analysis and comprises these three course modules:

  1. Introduction to pipeline development and workflow management systems:
    An overview of bioinformatic pipeline development in the context of workflow management systems such as Nextflow and Snakemake. Particular attention is given to understanding and addressing the needs of other pipeline users with regard to various types of computational infrastructure.
  2. Nextflow concepts and language:
    Learn how Nextflow can be used to create pipelines. Understand processes and channels, the scripting language and syntax. This module covers essential knowledge for the practical implementation of any new project in bioinformatic pipeline development.
  3. Nextflow for biological data analysis:
    Get hands-on with Nextflow. Learn how to modify file names with channel operators. Understand execution abstraction and relevant configuration options. Learn how to debug problems with your pipelines and how to share them with the community.

Detailed Course Program


Introduction to pipeline development and workflow management systems

  • Introduction and overview: Why build bioinformatic analysis pipelines at all?
  • Workflow Management Systems: What’s out there and how should I decide what to use? How do I think like an end-user?
  • Where to find example pipelines, how to run them, and get a feel for what output to expect. Get familiar with the Linux command line.
  • Considerations for different types of underlying computational infrastructure.
  • Should my pipeline run locally, on an HPC cluster or in the cloud? How do I make my work scalable?
  • Setting up environmental dependencies and software containers. How do I make my work reproducible?
  • Basic concepts of Nextflow

Nextflow concepts and language

  • Understanding the concepts of dataflow: processes and channels, input and output
  • Running a pipeline with Nextflow: work directory layout and process execution
  • Language basics: Nextflow scripting and syntax
  • Reading input data: the channel factory
  • Creating data analysis pipelines: Nextflow workflows and processes

Nextflow for biological data analysis

  • Connecting processes with channel operators
  • Configuration options: parameters, scopes and profiles
  • Execution abstraction: integrating with resource management software
  • Workflow introspection: runtime metadata and handling errors
  • Sharing your pipeline with online code repositories

Speakers

Adam Nunn (ecSeq Bioinformatics GmbH)
Adam is a PhD student in the Marie Skłodowska-Curie Innovative Training Network 'Epidiverse'. He has developed several bioinformatics pipelines using Nextflow for this European network.

Dr. Mario Fasold (ecSeq Bioinformatics GmbH)
Mario has worked on the analysis of microarray data since 2007 and has developed several bioinformatics tools such as the Bioconductor package AffyRNADegradation and the Larpack program package. Since 2011 he has specialized in the field of NGS data analysis and has helped analyse sequencing data for several large consortium projects.

Requirements

The target audience is biologists and data analysts with little or no experience in developing computational pipelines for data analysis. A basic understanding of molecular biology (DNA, RNA, gene expression, PCR, ...) is assumed, as examples will be given in the context of this field.

Some familiarity with a command line interface (e.g. Linux, Mac OS X) and a minimal understanding of object-oriented programming (with e.g. Python or Java) is recommended but not required.

A current desktop computer / laptop with an up-to-date browser (Firefox or Chrome) is required.

  •   Printed course materials
  •   High-performance cloud computer (accessed via browser)
  •   Downloadable environment for seamless continuation / repetition after the course

  •   Hands-on use of workflow management tools to see where the stumbling blocks are
  •   Our assistants can help you and give you feedback on the spot
  •   No previous installation of software necessary
  •   Continue practicing on your own using our Live-Linux system and the printed manuscript


Attendance

Location: Online
Language: English
Available Seats: 20 (first-come, first-served)

Registration Fee: 989 EUR (excluding VAT)

Key dates

Opening Date of Registration: May 1, 2023
Closing Date of Registration: November 1, 2023
Workshop: November 15-17, 2023 from 9 am to 5 pm (CET UTC+1)

"This is a fantastic course for beginners planning to self analyse NGS data. The course is well organised and explained in an easy common man language." Winny Varikatt, ICPMR, Westmead Hospital, Australia

"I really enjoyed the nature of the workshop, I felt that everything was very approachable for the attendees to learn even if the experience was lacking. I witnessed some of my classmates who have no experience ask very detailed questions through the course of the workshop, so I think it is a testimonial to how informational the content of the workshop was." Noah Legall, University of Georgia, USA

"It was an interesting course where I could learn a lot. It was not only interesting to see how nextflow itself works, but also very helpful to understand how such pipeline-tools work. Now I will be able to use such pipelines." Cecilia Mittelberger, Laimburg Research Centre, Italy



When you register for this workshop you are agreeing with our Workshop Terms and Conditions. Please read them before you register.


What you need:

  • A computer with one of the following operating systems: Windows 7 or later (incl. Windows 10), Mac OS X 10.13 or later.
  • One of the following web browsers: Edge 42 or later, Chrome 65 or later, Firefox 48 or later.
  • A microphone and loudspeakers/headphones.

  The course cannot be run on phones, tablets and similar handheld devices.

What is the daily schedule?

We will start every morning at 9 am sharp and work together until 5 pm. There will be regular short breaks and a longer break at lunchtime.

As this is a live broadcast, you cannot pause the course and continue later. The individual exercises build on each other, so you should not skip any part of a session.

Do I have to install any software?

No, you do not have to install any software to follow the course. You will get access to a high-performance computer in the cloud, which you can easily log into using an in-browser console. All necessary programs are already installed on this computer, so we can start right away.

What if I run into a problem during the course?

Of course we'll help you. If such a case should occur, we have assistants in the virtual room whom you can contact via chat. They can discuss your issue in a separate room/chat. They can also connect to your in-browser console and see exactly what you see. This way they can help you directly and without delay.

Note: The assistants can see only your in-browser terminal window. They cannot see anything else and have no access to your computer.

Will I receive course materials?

A few days before the course we will send you the manuscript by mail. After the course, you will receive the downloadable environment so that you can continue your analyses and/or repeat all the tasks of the course.

Any Questions? Please feel free to contact our events team.

ecSeq Bioinformatics GmbH
Sternwartenstr. 29
D-04103 Leipzig
Germany
Email: events@ecSeq.com