Bioinformatics Toolbox

Next Generation Sequencing Analysis

Bioinformatics Toolbox provides algorithms and visualization techniques for Next Generation Sequencing (NGS) analysis. The toolbox enables you to analyze whole genomes while performing calculations at base-pair resolution. You can use the NGS browser to visualize and investigate alignments of single-end or paired-end short reads. You can also build custom analysis routines, as shown in the following examples.

Exploring Protein-DNA Binding Sites from Paired-End ChIP-Seq Data
Perform a genome-wide analysis of a transcription factor in the Arabidopsis thaliana (thale cress) model organism.

Identifying Differentially Expressed Genes from RNA-Seq Data
Load RNA-seq data and test for differential expression using a statistical model.

Visualizing and Investigating Short-Read Alignment

Using the NGS browser, you can verify and investigate the alignment of short-read sequences in support of analyses that measure genetic variation and gene expression. The NGS browser lets you:

  • Visualize short-read data aligned to a nucleotide reference sequence
  • Compare multiple data sets aligned against a common reference sequence
  • View coverage of different bases and regions of the reference sequence
  • Investigate quality and other details of aligned reads
  • Identify mismatches due to base-calling errors or polymorphisms
  • Visualize insertions and deletions
  • Retrieve feature annotations relative to a specific region of the reference sequence

NGS browser, showing single nucleotide polymorphisms (SNPs) in bold. You can display multiple tracks of data, examine peaks, identify insertions and deletions, and inspect read quality.
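
The coverage and mismatch views described above can be illustrated with a small language-neutral sketch. The Python below is not the NGS browser or any toolbox function; it assumes ungapped alignments given as hypothetical (0-based start, read sequence) pairs and simply counts how many reads cover each reference position and which non-reference bases appear there.

```python
from collections import Counter


def base_coverage(reads, ref_length):
    """Count how many reads cover each reference position (0-based)."""
    coverage = [0] * ref_length
    for start, seq in reads:
        # Clip reads that run past the end of the reference.
        for pos in range(start, min(start + len(seq), ref_length)):
            coverage[pos] += 1
    return coverage


def mismatches(reads, reference):
    """Tally non-reference bases observed at each reference position."""
    found = {}
    for start, seq in reads:
        for offset, base in enumerate(seq):
            pos = start + offset
            if pos < len(reference) and base != reference[pos]:
                found.setdefault(pos, Counter())[base] += 1
    return found


# Made-up toy data: a 10-base reference and three ungapped reads.
reference = "ACGTACGTAC"
reads = [(0, "ACGTA"), (2, "GTTCG"), (4, "TCGTAC")]

coverage = base_coverage(reads, len(reference))
variants = mismatches(reads, reference)

print(coverage)
print({pos: dict(counts) for pos, counts in variants.items()})
```

A position where several independent reads agree on the same non-reference base is a candidate polymorphism, while an isolated mismatch is more likely a base-calling error. Real alignments also carry insertions, deletions, and clipping, which this sketch deliberately ignores.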

Custom plot mapping E-box motifs to peaks in a wavelet denoised signal.

Storing and Managing Short-Read Sequence Data

The data sets used in Next Generation Sequencing analysis are often too large to fit into physical memory. Bioinformatics Toolbox provides specialized data containers that enable you to analyze entire genomes.

The BioIndexedFile object lets you access the contents of text files containing nonuniform-sized entries such as sequences, annotations, and cross references to the data set. You can generate these objects from tables, flat files, or application-specific formats such as SAM, FASTA, and FASTQ.
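
The general idea behind indexed access to a file of nonuniform-sized entries can be sketched as follows. This Python example is an illustration of the concept only and makes no claim about how BioIndexedFile is implemented; the function names `build_index` and `read_entry` are hypothetical. It scans a FASTA file once, records the byte offset of each entry, and then reads a single entry by seeking directly to it instead of loading the whole file.

```python
import os
import tempfile


def build_index(path):
    """Map each FASTA entry name to the byte offset where the entry begins."""
    index = {}
    with open(path, "rb") as handle:
        while True:
            offset = handle.tell()
            line = handle.readline()
            if not line:
                break
            fields = line[1:].split()
            if line.startswith(b">") and fields:
                # The first word of the header serves as the lookup key.
                index[fields[0].decode()] = offset
    return index


def read_entry(path, index, key):
    """Seek to one entry and read only the lines that belong to it."""
    lines = []
    with open(path, "rb") as handle:
        handle.seek(index[key])
        handle.readline()  # skip the header line
        for line in handle:
            if line.startswith(b">"):
                break  # reached the next entry
            lines.append(line.strip().decode())
    return "".join(lines)


# Made-up toy file with entries of different lengths.
with tempfile.TemporaryDirectory() as tmp:
    fasta_path = os.path.join(tmp, "reads.fasta")
    with open(fasta_path, "wb") as out:
        out.write(b">read1 sample\nACGT\nAC\n>read2\nGGGTTTAAA\n>read3\nT\n")
    index = build_index(fasta_path)
    second = read_entry(fasta_path, index, "read2")

print(index)
print(second)
```

Only the index, one offset per entry, has to stay in memory, which is what makes this pattern suitable for files larger than physical memory.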

The BioMap class stores information from short-read sequences, including sequence headers, read sequences, quality scores, and data about alignment and mapping to a single reference sequence. You can use object properties and methods to explore, access, filter, and manipulate the data contained in a BioMap object.
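
As a rough picture of what such a container holds and how it can be queried, consider the minimal Python sketch below. It does not reproduce the BioMap interface: the class names `AlignedRead` and `ReadCollection`, their fields, and their methods are all hypothetical, and quality strings are assumed to use Phred+33 encoding.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AlignedRead:
    header: str
    sequence: str
    quality: str          # assumed Phred+33 encoded, one character per base
    start: int            # 1-based position on the reference
    mapping_quality: int

    @property
    def stop(self):
        """Last reference position covered, assuming an ungapped alignment."""
        return self.start + len(self.sequence) - 1

    def mean_base_quality(self):
        """Average Phred score over the bases of the read."""
        scores = [ord(char) - 33 for char in self.quality]
        return sum(scores) / len(scores)


class ReadCollection:
    """Reads mapped to a single reference, kept sorted by start position."""

    def __init__(self, reference, reads):
        self.reference = reference
        self.reads = sorted(reads, key=lambda read: read.start)

    def in_range(self, first, last):
        """Return reads overlapping the closed interval [first, last]."""
        return [r for r in self.reads if r.start <= last and r.stop >= first]

    def filter(self, min_mapping_quality):
        """Return a new collection without poorly mapped reads."""
        kept = [r for r in self.reads if r.mapping_quality >= min_mapping_quality]
        return ReadCollection(self.reference, kept)


# Made-up toy data.
collection = ReadCollection("chr1", [
    AlignedRead("r3", "GGGCC", "I#I#I", 20, 40),
    AlignedRead("r1", "ACGTA", "IIIII", 1, 60),
    AlignedRead("r2", "TACGT", "#####", 4, 10),
])

overlapping = [r.header for r in collection.in_range(5, 8)]
well_mapped = [r.header for r in collection.filter(30).reads]
print(overlapping, well_mapped, collection.reads[0].mean_base_quality())
```

Returning a new collection from `filter` rather than modifying the original keeps the unfiltered data available for comparison, a design choice made for this sketch only.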
