The Multiple SNP-Trait Association process tests for association between various types of traits and numerically coded genotypes from multiple SNPs at a time, using logistic, linear, or survival regression models, as well as generalized linear mixed models, on the SNP genotypes themselves or on principal components used to represent the SNP genotypes. These methods allow adjustments to be made for quantitative covariates and random effects or, for some trait types, strata variables. For binary traits, Hotelling's T-squared test can be performed in addition to or instead of these models. For binary or continuous traits, the Sequence Kernel Association Test (SKAT) is also an option.
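As a rough illustration of the Hotelling's T-squared option for binary traits, the sketch below computes the textbook two-sample statistic on numerically coded genotypes (0, 1, 2) for cases and controls. It is not the implementation used by the process; the function names and the toy genotypes are hypothetical, and a pooled covariance matrix is assumed.

```python
from typing import List, Sequence, Tuple

Row = Sequence[float]  # one individual's coded genotypes across the SNPs in a group


def _mean_vector(rows: Sequence[Row]) -> List[float]:
    """Per-SNP mean of the coded genotypes."""
    n, p = len(rows), len(rows[0])
    return [sum(r[j] for r in rows) / n for j in range(p)]


def _scatter(rows: Sequence[Row], mean: Sequence[float]) -> List[List[float]]:
    """Sum of outer products of deviations from the mean (p x p)."""
    p = len(mean)
    s = [[0.0] * p for _ in range(p)]
    for r in rows:
        dev = [r[j] - mean[j] for j in range(p)]
        for a in range(p):
            for b in range(p):
                s[a][b] += dev[a] * dev[b]
    return s


def _solve(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("pooled covariance matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def hotelling_t2(cases: Sequence[Row],
                 controls: Sequence[Row]) -> Tuple[float, float, Tuple[int, int]]:
    """Return (T-squared, equivalent F statistic, (df1, df2))."""
    n1, n2, p = len(cases), len(controls), len(cases[0])
    m1, m2 = _mean_vector(cases), _mean_vector(controls)
    s1, s2 = _scatter(cases, m1), _scatter(controls, m2)
    pooled = [[(s1[a][b] + s2[a][b]) / (n1 + n2 - 2) for b in range(p)]
              for a in range(p)]
    diff = [m1[j] - m2[j] for j in range(p)]
    weighted = _solve(pooled, diff)  # pooled^{-1} * diff
    t2 = (n1 * n2 / (n1 + n2)) * sum(d * w for d, w in zip(diff, weighted))
    f_stat = (n1 + n2 - p - 1) / (p * (n1 + n2 - 2)) * t2
    return t2, f_stat, (p, n1 + n2 - p - 1)


# Hypothetical toy data: two SNPs, four cases and four controls.
cases = [(2, 1), (1, 1), (2, 2), (1, 2)]
controls = [(0, 0), (1, 0), (0, 1), (1, 1)]
t2, f_stat, df = hotelling_t2(cases, controls)
print(t2, f_stat, df)
```

Under the usual multivariate normality assumption, the F statistic is compared against an F distribution with the returned degrees of freedom to obtain a p-value.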
The Annotation Analysis Group Variable in the Annotation Data Set identifies the SNPs to be included together in the same model. Models are fit and statistics are reported for each unique value of this variable (typically a gene) within an annotation group (typically a chromosome).
P-values from these tests, with adjustments applied if requested, are plotted along the marker map, using the location of the first SNP in each annotation analysis group.
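The grouping described above can be sketched in a few lines. This is only an illustration of the bookkeeping, not of the process itself: the column names (marker, chromosome, gene, position) and the rows are hypothetical, and "first SNP" is assumed to mean the first row encountered for each group in the annotation data set.

```python
from typing import Dict, List, Tuple

# Hypothetical annotation rows: one row per marker (a "tall" layout).
annotation = [
    {"marker": "snp1", "chromosome": "1", "gene": "GENE_A", "position": 1500},
    {"marker": "snp2", "chromosome": "1", "gene": "GENE_A", "position": 1720},
    {"marker": "snp3", "chromosome": "1", "gene": "GENE_B", "position": 9100},
    {"marker": "snp4", "chromosome": "2", "gene": "GENE_C", "position": 300},
    {"marker": "snp5", "chromosome": "2", "gene": "GENE_C", "position": 450},
]

GroupKey = Tuple[str, str]  # (annotation group, annotation analysis group)


def group_snps(rows: List[dict]) -> Dict[GroupKey, List[dict]]:
    """Collect the SNPs that would be modeled together.

    Each unique gene within a chromosome forms one group; insertion
    order of the rows is preserved within and across groups.
    """
    groups: Dict[GroupKey, List[dict]] = {}
    for row in rows:
        key = (row["chromosome"], row["gene"])
        groups.setdefault(key, []).append(row)
    return groups


def first_snp_locations(groups: Dict[GroupKey, List[dict]]) -> Dict[GroupKey, int]:
    """Location used to place each group's p-value along the marker map."""
    return {key: members[0]["position"] for key, members in groups.items()}


groups = group_snps(annotation)
for key, members in groups.items():
    markers = [m["marker"] for m in members]
    print(key, markers, first_snp_locations(groups)[key])
```

One model would then be fit per key, and the resulting p-value would be plotted at the location returned for that key.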
See the MIXED, GLIMMIX, LOGISTIC, and PHREG procedures in the SAS/STAT User's Guide for more information.
The second required data set is the Annotation Data Set. This data set contains information, such as gene identity or chromosomal location, for each of the markers. The
morocco_anno_rg.sas7bdat annotation data set is used in this example. A portion of this data set is illustrated below. This data set is a tall data set; each row corresponds to a different marker.
Both data sets are included in the Sample Data folder that comes with JMP Genomics.