Process Description

Gene Model Summary

Gene Model Summary takes position-level intensity data (for example, output from the SAM Input Engine) and summarizes it into exon and intron bins as defined by an isoform definition file in UCSC format. The mean of all intensities within a bin is computed as the output value for that bin.
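The summarization step described above can be illustrated with a minimal Python sketch. This is not the JMP Genomics implementation. The function name `summarize_bins`, the tuple layouts, and the half-open bin boundaries are assumptions made for illustration only.

```python
from collections import defaultdict


def summarize_bins(rows, bins):
    """Return the mean intensity per bin.

    rows: iterable of (chromosome, position, intensity) tuples (tall format).
    bins: list of (bin_id, chromosome, start, end) tuples, where start is
          inclusive and end is exclusive (an assumption of this sketch).
    Bins that contain no positions are omitted from the result.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for chrom, pos, value in rows:
        for bin_id, bin_chrom, start, end in bins:
            # A position contributes to every bin on the same chromosome
            # whose interval contains it.
            if chrom == bin_chrom and start <= pos < end:
                sums[bin_id] += value
                counts[bin_id] += 1
    return {bin_id: sums[bin_id] / counts[bin_id] for bin_id in counts}


# Hypothetical example data, not taken from the sample files.
rows = [("chrY", 100, 2.0), ("chrY", 150, 4.0), ("chrY", 250, 10.0)]
bins = [("g1_exon1", "chrY", 100, 200), ("g1_intron1", "chrY", 200, 300)]
result = summarize_bins(rows, bins)
```

The nested loop is kept for readability. For genome-scale data, an interval index or a sorted merge would be a more practical choice.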

Note: This process is considered experimental.

What do I need?

To run this process, the following input is required:

• An Input SAS Data Set in tall format, which contains chromosome, position, and summarization variables.

• A Gene Model Text File. This is a text annotation file that defines a gene model that can be used for read summarization by exon and intron bins. It must be in the same format as the gene model files available from the Table Browser of the UCSC Genome Browser web page.
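To show how exon and intron bins might be derived from a gene model file, the following Python sketch parses a single tab-delimited record. It assumes the UCSC knownGene table column order (name, chrom, strand, txStart, txEnd, cdsStart, cdsEnd, exonCount, exonStarts, exonEnds, and so on) with 0-based, half-open coordinates. The function name `bins_from_known_gene_line` and the bin naming scheme are hypothetical, and the way JMP Genomics itself labels or orders bins may differ.

```python
def bins_from_known_gene_line(line):
    """Split one knownGene-style record into exon and intron bins.

    Returns a list of (bin_id, chromosome, start, end) tuples in genomic
    order. Coordinates are kept as given in the file (0-based, half-open).
    """
    fields = line.rstrip("\n").split("\t")
    name, chrom = fields[0], fields[1]
    # exonStarts and exonEnds are comma-separated lists that usually end
    # with a trailing comma, so empty pieces are skipped.
    exon_starts = [int(x) for x in fields[8].split(",") if x]
    exon_ends = [int(x) for x in fields[9].split(",") if x]

    bins = []
    for i, (start, end) in enumerate(zip(exon_starts, exon_ends), start=1):
        bins.append((f"{name}_exon{i}", chrom, start, end))
        if i < len(exon_starts):
            # The intron spans the gap between this exon's end and the
            # next exon's start.
            bins.append((f"{name}_intron{i}", chrom, end, exon_starts[i]))
    return bins


# Hypothetical record, not copied from mm9_knownGene_chrY.txt.
example = "uc009abc.1\tchrY\t+\t100\t500\t100\t500\t2\t100,300,\t200,500,\tP1\tuc009abc.1"
example_bins = bins_from_known_gene_line(example)
```

In this sketch, exons and introns are numbered in genomic order regardless of strand. Numbering in transcript order for minus-strand genes would require reversing the lists first.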

An Annotation SAS Data Set, although not required, can be supplied. This data set provides annotation information, such as gene names, function, physical location, and association, for each of the markers used in the analysis. This data set must have a variable named Probe_Set_ID that is used to correctly order the SNPs in the output data set. For more information, see Annotation Data Set.

The edf_gse18905_2mus_chry_data.sas7bdat file, located in the \Sample Data\Next-Gen\SAM\GSE18905 directory, serves as an example Input SAS Data Set.

The mm9_knownGene_chrY.txt file, located in the \Sample Data\UCSC\Gene directory, serves as an example Gene Model Text File.

For detailed information about the files and data sets used or created by JMP Genomics software, see Files and Data Sets.

Output/Results

The output data sets generated by this process are listed in a Results window. Refer to the Gene Model Summary output documentation for detailed descriptions.