Convert nucleotide sequences with IUPAC codes to a regular expression

This online tool generates a regular expression from a nucleotide sequence that may include IUPAC ambiguity codes. The resulting expression can be used with any string or pattern search program (e.g. the Linux command-line tool grep) to find every occurrence of a given consensus sequence in a large file, for example a FASTA/FASTQ file obtained from a next-generation sequencing experiment.
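
The conversion described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the tool's actual implementation: the function name iupac_to_regex and the table name IUPAC_TO_BASES are hypothetical, and the mapping covers only the standard IUPAC DNA codes (no U, gaps, or lowercase-specific handling).

```python
# Standard IUPAC nucleotide codes and the bases each one stands for.
# Assumption: DNA alphabet only; this table is illustrative, not the tool's own.
IUPAC_TO_BASES = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}


def iupac_to_regex(sequence: str) -> str:
    """Return a regular expression matching every sequence the input can represent."""
    parts = []
    for code in sequence.upper():
        if code not in IUPAC_TO_BASES:
            raise ValueError(f"Not an IUPAC nucleotide code: {code!r}")
        bases = IUPAC_TO_BASES[code]
        # Unambiguous bases stay literal; ambiguous codes become a character class.
        parts.append(bases if len(bases) == 1 else f"[{bases}]")
    return "".join(parts)


print(iupac_to_regex("GATNACR"))  # prints GAT[ACGT]AC[AG]
```

Because each ambiguity code becomes a simple character class, the output should work unchanged in grep, Python's re module, and most other regular expression engines.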

Example Usage

Consensus nucleotide sequence with IUPAC codes, as extracted from the genome browser:


Regular expression with ambiguous IUPAC characters resolved:


Finding the sequence in a FASTQ file on the command line:
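
As a hedged alternative to grep, the same search can be sketched in Python. The function name find_in_fastq and the example pattern are hypothetical, and the sketch assumes the common four-line FASTQ layout (header, sequence, separator, qualities); wrapped multi-line records are not handled.

```python
import re
from typing import Iterable, Iterator, Tuple


def find_in_fastq(lines: Iterable[str], pattern: str) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) for every read whose sequence matches the pattern.

    Assumption: each record spans exactly four lines.
    """
    regex = re.compile(pattern)
    header = ""
    for index, line in enumerate(lines):
        line = line.rstrip("\n")
        position = index % 4
        if position == 0:
            header = line  # line 1 of a record: read identifier
        elif position == 1 and regex.search(line):
            yield header, line  # line 2 of a record: the nucleotide sequence


# Small in-memory example; with a real file, pass the open file handle instead.
records = [
    "@read1", "GATTACA", "+", "IIIIIII",
    "@read2", "CCCCCCC", "+", "IIIIIII",
]
for header, sequence in find_in_fastq(records, "GAT[ACGT]AC[AG]"):
    print(header, sequence)
```

Restricting the search to the second line of each record keeps header and quality lines from producing matches, which a plain line-based search does not distinguish.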


Last updated on August 07, 2016