Process Description
Affymetrix Exon and Whole Transcript Expression CEL Input Engine
Probe-level intensity data from Affymetrix microarrays are typically stored in raw data files in .cel format. Before the information in these files can be manipulated and analyzed in JMP Genomics, it must be extracted and organized into two SAS data sets:
• a tall data set containing all of the raw data, and
• an Experimental Design Data Set (EDDS) that contains information about the experimental design.
The Affymetrix Exon and Whole Transcript Expression CEL Input Engine (IE) enables you to import expression data from a set of Affymetrix whole transcript .cel files into these SAS data sets.
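As a rough illustration of the two outputs, the Python sketch below assembles a tall table and a small design table from per-array intensities that are already in memory. It does not read .cel files and is not the input engine's implementation. The layout (one row per probe, one column per array), the column names probe_id, File, and ColumnName, and the function name build_tall_and_design are all assumptions made for this example.

```python
from typing import Dict, List, Optional, Tuple

Row = Dict[str, Optional[object]]


def build_tall_and_design(
    arrays: Dict[str, Dict[str, float]]
) -> Tuple[List[Row], List[Dict[str, str]]]:
    """Build a tall table and a design table from in-memory intensities.

    `arrays` maps a raw data file name to {probe identifier: intensity}.
    Assumed layout (hypothetical): the tall table has one row per probe and
    one column per array; the design table has one row per array.
    """
    # Assign each file a column name in a stable (sorted) order.
    column_names = {
        name: f"array_{i + 1}" for i, name in enumerate(sorted(arrays))
    }

    # Collect every probe identifier seen on any array.
    probe_ids = sorted({p for values in arrays.values() for p in values})

    tall: List[Row] = []
    for probe in probe_ids:
        row: Row = {"probe_id": probe}
        for name, column in column_names.items():
            # A probe absent from an array is recorded as missing (None).
            row[column] = arrays[name].get(probe)
        tall.append(row)

    # The design table links each tall-table column back to its source file.
    design = [
        {"File": name, "ColumnName": column}
        for name, column in column_names.items()
    ]
    return tall, design


if __name__ == "__main__":
    # Made-up intensities, for illustration only.
    demo = {"a.cel": {"p1": 1.0}, "b.cel": {"p1": 2.0, "p2": 3.0}}
    tall_rows, design_rows = build_tall_and_design(demo)
    for r in tall_rows:
        print(r)
    for r in design_rows:
        print(r)
```

Keeping the design information in its own table means that sample-level details are stored once per array rather than repeated on every probe row.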
What do I need?
Before you can import the raw data into SAS data sets for analysis in JMP Genomics, you must locate and gather the following sources of information:
• A folder containing the raw data files. These .cel files, each corresponding to an individual microarray, contain the hybridization intensities.
• The Experimental Design File (EDF) for the experiment. The EDF lists specific information about the design of the experiment. It is typically a text file or Excel spreadsheet and must be created before the data can be imported.
• The Probe Filter Data Set, which contains identifiers of probes to be excluded from the output data set. This data set is required when you are importing regular expression .cel files (not from whole transcript or exon arrays) and want to import only a subset of the probes; otherwise, it is optional. It must include a column containing a probeset identifier.
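Before running the import, it can help to confirm that these inputs agree with one another. The Python sketch below is one possible pre-import check, not part of JMP Genomics. It assumes a tab-delimited EDF with a column named File that lists the .cel file names; that column name, the case-insensitive file-name matching, and the function names missing_cel_files and exclude_probes are assumptions made for this example.

```python
import csv
from pathlib import Path
from typing import Iterable, List, Set, Union

PathLike = Union[str, Path]


def missing_cel_files(
    edf_path: PathLike, cel_dir: PathLike, file_column: str = "File"
) -> List[str]:
    """Return the file names listed in the EDF that are absent from cel_dir.

    Assumes a tab-delimited EDF whose `file_column` (hypothetical name)
    holds the raw data file names. Matching ignores letter case.
    """
    with open(edf_path, newline="") as handle:
        rows = list(csv.DictReader(handle, delimiter="\t"))

    # Names of the files actually present in the raw data folder.
    present = {p.name.lower() for p in Path(cel_dir).iterdir() if p.is_file()}

    return [
        row[file_column]
        for row in rows
        if row[file_column].lower() not in present
    ]


def exclude_probes(probe_ids: Iterable[str], excluded: Set[str]) -> List[str]:
    """Drop identifiers that appear in a probe filter, preserving order."""
    return [p for p in probe_ids if p not in excluded]
```

An empty list from missing_cel_files means that every file named in the EDF was found in the folder. The exclude_probes helper mirrors the role of the Probe Filter Data Set on a plain list of identifiers.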
A two-array subset of the Affymetrix Human Exon 1.0 ST colon cancer data set, downloaded from the Affymetrix website, serves as an example. It is shipped with JMP Genomics.
The edf_2arrays.txt file has been selected as the EDF.
For detailed information about the files and data sets used or created by JMP Genomics software, see Files and Data Sets.
Output/Results
The output data sets generated by this process are listed in a Results window. Refer to the Affymetrix Exon and Whole Transcript Expression CEL Input Engine output documentation for detailed descriptions.