Convert nucleotide sequences with IUPAC codes to a regular expression

This online tool generates a regular expression from nucleotide sequences, which may include IUPAC ambiguity codes. The resulting expression can be used with any string or pattern search program (e.g. the Linux command-line tool grep) to extract a given consensus sequence from a large file, for example a FASTA/FASTQ file obtained from a next-generation sequencing experiment.
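The conversion can be sketched in a few lines of Python. This is only an illustration of the idea, not the tool's actual implementation: the function name `iupac_to_regex` is hypothetical, and the output format (one character class per ambiguous position) is an assumption. The mapping itself follows the standard IUPAC nucleotide codes.

```python
# Standard IUPAC nucleotide codes mapped to the bases they stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def iupac_to_regex(sequence):
    """Return a regular expression matching every sequence the IUPAC string allows.

    Hypothetical helper: unambiguous bases are kept as they are,
    ambiguous codes become a character class such as [AG].
    """
    parts = []
    for code in sequence.upper():
        bases = IUPAC[code]  # raises KeyError for characters outside the table
        parts.append(bases if len(bases) == 1 else "[" + bases + "]")
    return "".join(parts)


print(iupac_to_regex("ACRYN"))
```

For the input `ACRYN`, each ambiguous position is expanded into a character class, so the pattern matches any of the concrete sequences the consensus describes.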

Example Usage

Consensus nucleotide sequence with IUPAC codes, as extracted from the genome browser:


Regular expression with ambiguous IUPAC characters resolved:


Finding the sequence in a FASTQ file on the command line:
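The same kind of search can also be sketched in Python instead of grep. The example below is a minimal sketch under stated assumptions: the function name `matching_reads`, the two reads, and the pattern `AC[AG][CT]` are all made up for illustration, and each FASTQ record is assumed to span exactly four lines. Only the forward strand is searched.

```python
import re


def matching_reads(fastq_lines, pattern):
    """Yield the sequence lines of a FASTQ file that match the pattern.

    Hypothetical helper; assumes four-line records, so the sequence
    is always the second line of each record.
    """
    regex = re.compile(pattern)
    for index, line in enumerate(fastq_lines):
        if index % 4 == 1 and regex.search(line):
            yield line.strip()


# Made-up FASTQ content, used here instead of a real file.
example_fastq = """@read1
TTACGTGG
+
IIIIIIII
@read2
GGGGCCCC
+
IIIIIIII
""".splitlines()

hits = list(matching_reads(example_fastq, "AC[AG][CT]"))
print(hits)
```

Restricting the search to every second line of a record avoids accidental matches in header or quality lines, which a plain line-by-line search would also scan.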



Last updated on August 07, 2016

ecSeq is a bioinformatics solution provider with solid expertise in the analysis of high-throughput sequencing data. We organize public workshops and conduct on-site trainings on NGS data analysis.