An EDDS is a SAS data set that provides information about the columns of a tall data set. It describes relevant experimental variables such as treatment conditions and covariates as well as a variable named ColumnName. Entries in the ColumnName column must exactly match the column names in the input tall data set. EDDSs have certain constraints that must be followed for the processes to run successfully.
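The matching requirement can be illustrated outside of SAS. The following is a minimal Python sketch, not part of JMP Genomics: it assumes the EDDS has been read into a list of dictionaries and that the tall data set's column names are available as a list. The function name, the sample names, and the Treatment values are hypothetical, and exact string comparison is assumed as the meaning of "exactly match".

```python
def find_unmatched_column_names(edds_rows, tall_columns):
    """Return the ColumnName entries that do not appear among the tall columns.

    Assumption: "exactly match" is treated as exact string equality, so
    differences in case or surrounding whitespace count as mismatches.
    """
    tall_set = set(tall_columns)
    return [row["ColumnName"] for row in edds_rows
            if row["ColumnName"] not in tall_set]


# Hypothetical EDDS: one row per sample column of the tall data set.
edds = [
    {"ColumnName": "Array1", "Treatment": "control"},
    {"ColumnName": "Array2", "Treatment": "drug"},
    {"ColumnName": "array3", "Treatment": "drug"},  # case differs from "Array3"
]

# Hypothetical tall data set columns: an identifier column plus the samples.
tall_columns = ["ProbeID", "Array1", "Array2", "Array3"]

print(find_unmatched_column_names(edds, tall_columns))  # ['array3']
```

An empty result means that every EDDS entry has a corresponding column. Columns of the tall data set that are not samples, such as the identifier column here, are deliberately not reported.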
An EDDS is frequently constructed using information from a corresponding Experimental Design File (EDF). An EDDS contains many of the same columns as an EDF, but, unlike the EDF, must be saved as a SAS data set (.sas7bdat). Many of the input engines that generate a tall data set from raw data files also automatically generate the needed EDDS.
In addition to an experimental design data set, many JMP Genomics processes also optionally accept an annotation data set. This is a SAS data set containing arbitrary biological or chemical properties corresponding to the molecular entities in the experiment.
In addition to the common data sets listed above, supplementary SAS data sets might be required by specific processes. These include Coordinate data sets, which list the x- and y-coordinates of spots on microarrays, and Haplotype Frequency data sets, which are used in haplotype analysis, among others. These supplementary data sets are used only in specific cases, are frequently optional, and are described, as needed, in the chapters detailing the processes that use them.
Most of the processes in JMP Genomics assume that the input SAS data set has a particular data structure. JMP Genomics distinguishes between tall and wide SAS data sets. A tall SAS data set has samples as columns and molecular entities (for example, markers, genes, clones, proteins, or metabolites) as rows, whereas a wide SAS data set is the transpose of the tall form, with samples as rows and molecular entities as columns.
When specifying the input SAS data set for a process, it is important to know the required form. Most of the Genetics processes require a wide structure, whereas most of the others use a tall structure. The Transpose Tall to Wide and Transpose Wide to Tall processes under the Genomics > SAS Data Set Utilities > Tables menu enable you to transform your SAS data sets between tall and wide forms.
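To make the relationship between the two forms concrete, the following Python sketch transposes a small table in both directions. It illustrates the data structures only and is not how the JMP Genomics transpose processes are implemented. The function names, the Gene and Sample column names, and the values are all hypothetical.

```python
def tall_to_wide(tall_rows, id_column, sample_column="Sample"):
    """Turn rows of molecular entities into rows of samples."""
    if not tall_rows:
        return []
    # Every column except the identifier is assumed to be a sample.
    sample_names = [c for c in tall_rows[0] if c != id_column]
    wide_rows = []
    for sample in sample_names:
        wide_row = {sample_column: sample}
        for row in tall_rows:
            wide_row[row[id_column]] = row[sample]
        wide_rows.append(wide_row)
    return wide_rows


def wide_to_tall(wide_rows, id_column, sample_column="Sample"):
    """Turn rows of samples back into rows of molecular entities."""
    if not wide_rows:
        return []
    entity_names = [c for c in wide_rows[0] if c != sample_column]
    tall_rows = []
    for entity in entity_names:
        tall_row = {id_column: entity}
        for row in wide_rows:
            tall_row[row[sample_column]] = row[entity]
        tall_rows.append(tall_row)
    return tall_rows


# Tall form: one row per molecular entity, one column per sample.
tall = [
    {"Gene": "geneA", "Array1": 1.2, "Array2": 3.4},
    {"Gene": "geneB", "Array1": 5.6, "Array2": 7.8},
]

# Wide form: one row per sample, one column per molecular entity.
wide = tall_to_wide(tall, id_column="Gene")
for row in wide:
    print(row)
```

Applying `wide_to_tall(wide, id_column="Gene")` to the result recovers the original tall rows, which reflects the description of each form as the transpose of the other.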