The Multiple SNP-Trait Association process tests for association between various types of traits and numerically coded genotypes from multiple SNPs at a time. It uses logistic, linear, or survival regression models, as well as generalized linear mixed models, fit either to the SNP genotypes themselves or to principal components used to represent the SNP genotypes. These methods allow adjustments to be made for quantitative covariates and random effects or, for some trait types, strata variables. For binary traits, Hotelling's T-squared test can be performed in addition to or instead of these models. For binary or continuous traits, the Sequence Kernel Association Test (SKAT) is also an option.
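As an illustration of the Hotelling's T-squared option for binary traits, the following is a minimal Python sketch of the standard two-sample statistic computed on numerically coded genotypes. It is not the JMP Genomics implementation. The function names `solve` and `hotelling_t2`, the split of individuals into `cases` and `controls`, and the example genotype values are all assumptions made for illustration.

```python
from statistics import mean


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination."""
    n = len(b)
    # Work on an augmented copy so the inputs are left untouched.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Partial pivoting: use the row with the largest entry in this column.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("pooled covariance matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def hotelling_t2(cases, controls):
    """Return the two-sample Hotelling's T-squared statistic and its F transform.

    Each argument is a list of rows; a row holds one individual's
    numerically coded genotypes for the SNPs tested together.
    """
    n1, n2, p = len(cases), len(controls), len(cases[0])
    mean1 = [mean(row[j] for row in cases) for j in range(p)]
    mean2 = [mean(row[j] for row in controls) for j in range(p)]
    diff = [m1 - m2 for m1, m2 in zip(mean1, mean2)]

    # Pooled covariance matrix of the genotype codes across both groups.
    pooled = [[0.0] * p for _ in range(p)]
    for rows, mu in ((cases, mean1), (controls, mean2)):
        for row in rows:
            for j in range(p):
                for k in range(p):
                    pooled[j][k] += (row[j] - mu[j]) * (row[k] - mu[k])
    dof = n1 + n2 - 2
    pooled = [[value / dof for value in row] for row in pooled]

    # T^2 = (n1 * n2 / (n1 + n2)) * diff' * pooled^-1 * diff
    weights = solve(pooled, diff)
    t2 = (n1 * n2) / (n1 + n2) * sum(d * w for d, w in zip(diff, weights))
    # Under the null hypothesis this follows F(p, n1 + n2 - p - 1).
    f_stat = t2 * (n1 + n2 - p - 1) / (p * dof)
    return t2, f_stat


# Hypothetical genotype codes for two SNPs (illustrative values only).
example_cases = [[2, 0], [0, 2], [2, 2], [0, 0]]
example_controls = [[1, 0], [0, 1], [1, 1], [0, 0]]
print(hotelling_t2(example_cases, example_controls))
```

The F transform is included because it is the usual route from the T-squared statistic to a p-value. Computing that p-value requires an F distribution function, which is left out here to keep the sketch dependency-free.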
The Annotation Analysis Group Variable in the Annotation Data Set identifies the SNPs to be included together in the same model. Models are fit and statistics are reported for each unique value of this variable (typically a gene) within an annotation group (typically a chromosome).
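The grouping described above can be sketched in Python as follows. This is only an illustration of how SNPs might be collected into one model per analysis group within each annotation group. The function name `group_snps` and the column names `Chromosome`, `Gene`, and `SNP` are hypothetical, not the variable names used in any particular annotation data set.

```python
from collections import defaultdict


def group_snps(annotation_rows, group_key="Chromosome",
               analysis_key="Gene", marker_key="SNP"):
    """Collect marker names by (annotation group, analysis group).

    annotation_rows is a list of dicts, one per marker, mirroring a
    tall annotation data set. The key names are assumptions.
    """
    groups = defaultdict(list)
    for row in annotation_rows:
        # Each (chromosome, gene) pair corresponds to one fitted model.
        groups[(row[group_key], row[analysis_key])].append(row[marker_key])
    return dict(groups)


# Hypothetical annotation rows (illustrative values only).
rows = [
    {"SNP": "snp1", "Chromosome": "1", "Gene": "GENE_A"},
    {"SNP": "snp2", "Chromosome": "1", "Gene": "GENE_A"},
    {"SNP": "snp3", "Chromosome": "1", "Gene": "GENE_B"},
    {"SNP": "snp4", "Chromosome": "2", "Gene": "GENE_C"},
]
for key, markers in group_snps(rows).items():
    print(key, markers)
```

Keying on the pair rather than on the gene alone keeps identically named analysis groups on different chromosomes separate, matching the "within an annotation group" wording.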
P-values from these tests, with adjustments applied if requested, are plotted along the marker map, using the location of the first SNP in each annotation analysis group.
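To make the plotting rule concrete, here is a small Python sketch that pairs each analysis group's p-value with the location of its first SNP. It rests on several assumptions: "first" is taken to mean first in annotation row order, the keys `Chromosome`, `Gene`, and `Position` are hypothetical, and the -log10 transform is a common convention for such plots rather than something stated here.

```python
import math


def first_snp_points(annotation_rows, p_values):
    """Return (chromosome, position, -log10 p) for each analysis group.

    annotation_rows: list of dicts in annotation row order, with the
    hypothetical keys "Chromosome", "Gene", and "Position".
    p_values: dict mapping (chromosome, gene) to a p-value.
    """
    first_position = {}
    for row in annotation_rows:
        key = (row["Chromosome"], row["Gene"])
        # setdefault keeps the position of the first row seen per group.
        first_position.setdefault(key, row["Position"])

    points = []
    for key, position in first_position.items():
        if key not in p_values:
            continue  # no test result reported for this group
        p = p_values[key]
        # Guard against log10(0) for p-values reported as exactly zero.
        score = math.inf if p <= 0 else -math.log10(p)
        points.append((key[0], position, score))
    return points


# Hypothetical annotation rows and p-values (illustrative values only).
rows = [
    {"Chromosome": "1", "Gene": "GENE_A", "Position": 100},
    {"Chromosome": "1", "Gene": "GENE_A", "Position": 250},
    {"Chromosome": "1", "Gene": "GENE_B", "Position": 900},
]
p_values = {("1", "GENE_A"): 0.01, ("1", "GENE_B"): 0.1}
print(first_snp_points(rows, p_values))
```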
See the MIXED, GLIMMIX, LOGISTIC, and PHREG procedures in the SAS/STAT User's Guide for more information.
The second required data set is the Annotation Data Set. This data set contains information, such as gene identity or chromosomal location, for each of the markers. The morocco_anno_rg.sas7bdat annotation data set is used in this example. A portion of this data set is illustrated below. This data set is a tall data set; each row corresponds to a different marker.
Both data sets are included in the Sample Data folder that comes with JMP Genomics.