BAM Input Engine


The BAM Input Engine imports a set of compressed Binary Sequence Alignment Map (.bam) files and combines them into three SAS data sets containing the chromosome, location, and sequence identity of each sample with a reference sequence. The total number of reads is reduced through binning, and the read count is calculated.
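The binning step can be pictured with a minimal Python sketch. It is an illustration only, not the method JMP Genomics uses: the fixed bin width, the function name bin_read_counts, and the sample read positions are all assumptions made for this example.

```python
from collections import Counter


def bin_read_counts(reads, bin_width=100):
    """Count reads per (chromosome, bin start) using fixed-width bins.

    `reads` is an iterable of (chromosome, position) pairs.
    The fixed bin width is an assumption made for this illustration.
    """
    counts = Counter()
    for chrom, pos in reads:
        # Map each position to the start coordinate of its bin.
        bin_start = (pos // bin_width) * bin_width
        counts[(chrom, bin_start)] += 1
    return dict(counts)


# Hypothetical aligned read positions on one chromosome.
reads = [("chrY", 105), ("chrY", 150), ("chrY", 230), ("chrY", 99)]
binned = bin_read_counts(reads, bin_width=100)
```

Here four reads collapse into three bins, so the output has fewer rows than the input while still recording how many reads fell into each bin.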

This process supports both paired-end and single-end reads.

Important: Before running this process, you must download SAMtools version 0.1.12 (for Windows) from http://samtools.sourceforge.net/ and save the executable files in the folder C:\Program Files\SASHome\JMP\10\Genomics\ThirdPartyAnnotation\NextGen.

Note: This process is considered experimental.

What do I need?

Before you can successfully import the raw data into SAS data sets that can be used for analysis in JMP Genomics, you must locate and gather several sources of information:

• An Experimental Design File (EDF) that indexes the individual raw data files for the experiment. The EDF is typically a text file or Excel spreadsheet and must be created before the data can be imported.

Important: The EDF used for importing BAM data must contain a variable called SampleName. The values listed in this column identify each sample in the experiment. Providing this name allows the BAM data for each sample, which are typically spread across multiple .bam files, to be merged into one SAS data set.

• All of the .bam files containing the raw data, which must be located and copied to a single folder. Each .bam file contains read alignment information for either an individual chromosome or all chromosomes.
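To illustrate how the SampleName variable ties several .bam files to one sample, the following Python sketch groups the files listed in a tab-delimited EDF by sample. It is a sketch under stated assumptions, not the import logic of the process itself: the column name File, the file names, and the sample names are hypothetical; only SampleName comes from the requirement above.

```python
import csv
import io
from collections import defaultdict

# Hypothetical tab-delimited EDF contents. Only the SampleName column is
# required by the process; the "File" column name is an assumption.
edf_text = (
    "File\tSampleName\n"
    "sampleA_chr1.bam\tsampleA\n"
    "sampleA_chr2.bam\tsampleA\n"
    "sampleB_chr1.bam\tsampleB\n"
)


def group_files_by_sample(text, file_column="File", sample_column="SampleName"):
    """Return a dict mapping each sample name to its list of .bam files."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    if sample_column not in (reader.fieldnames or []):
        raise ValueError(f"EDF must contain a column named {sample_column}")
    groups = defaultdict(list)
    for row in reader:
        groups[row[sample_column]].append(row[file_column])
    return dict(groups)


files_by_sample = group_files_by_sample(edf_text)
```

Each key in files_by_sample corresponds to one sample whose files would be merged into a single data set. An EDF without a SampleName column raises an error in this sketch.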

An example EDF (shown below) specifies the import of the GSM468501_MUS1_chrY.bam and GSM468501_MUS2_chrY.bam files.

For detailed information about the files and data sets used or created by JMP Life Sciences software, see Files and Data Sets.

Output/Results

The output data sets generated by this process are listed in a Results window. Refer to the BAM Input Engine output documentation for detailed descriptions.