Large-scale genetic mapping studies seek to associate genetic markers of known location, such as SNPs, with various quantitative and qualitative phenotypic traits. The Marker-Trait Association and SNP-Trait Association processes were both developed to address specific needs of these investigations. Marker-Trait Association is especially useful for studies involving multi-allelic markers and some of the more complex modeling techniques. However, it is not particularly efficient at handling very large data sets.
SNP-Trait Association was specifically designed for very large genetic data sets, but it lacks some of the more complex options available in Marker-Trait Association. The two procedures complement each other well. However, neither Marker-Trait Association nor SNP-Trait Association can accommodate complex survey designs.
Survey SNP-Trait Association addresses this deficiency by testing for association between various types of traits and SNP genotypes or alleles, one SNP at a time, while taking complex survey designs into account. Two types of analyses can be performed: an ANOVA based on SNP genotypes or a regression testing for a linear trend of SNP alleles. Adjustments can be made for quantitative covariates. Rao-Scott chi-square and F statistics can also be computed for non-continuous traits. P-values from these tests, with adjustments applied if requested, are plotted along the marker map.
See the SURVEYFREQ, SURVEYLOGISTIC, and SURVEYREG procedures in the SAS/STAT User's Guide for more information.
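The difference between the genotype-based and allele-trend analyses can be illustrated with a small Python sketch. This is not how JMP Genomics or the SAS survey procedures are implemented; it only shows the two ways of coding a SNP. The data, the `A/G` genotype format, and the function names are hypothetical, and the weights stand in for survey sampling weights only. Strata, clusters, and design-based variance estimation, which the survey procedures handle, are deliberately left out.

```python
# Hypothetical records: (genotype, trait value, sampling weight).
records = [
    ("A/A", 10.0, 1.5),
    ("A/G", 12.0, 2.0),
    ("G/G", 15.0, 1.0),
    ("A/G", 11.0, 0.5),
    ("A/A", 9.0, 2.0),
]


def allele_count(genotype, allele="G", sep="/"):
    """Code a genotype as the number of copies (0, 1, or 2) of one allele."""
    return genotype.split(sep).count(allele)


def weighted_group_means(rows):
    """Genotype-based view: one weighted trait mean per genotype class."""
    totals = {}
    for genotype, trait, weight in rows:
        sum_w, sum_wy = totals.get(genotype, (0.0, 0.0))
        totals[genotype] = (sum_w + weight, sum_wy + weight * trait)
    return {g: sum_wy / sum_w for g, (sum_w, sum_wy) in totals.items()}


def weighted_trend_slope(rows, allele="G"):
    """Allele-trend view: weighted least-squares slope of trait on allele count."""
    xs = [allele_count(g, allele) for g, _, _ in rows]
    ys = [y for _, y, _ in rows]
    ws = [w for _, _, w in rows]
    total_w = sum(ws)
    x_bar = sum(w * x for w, x in zip(ws, xs)) / total_w
    y_bar = sum(w * y for w, y in zip(ws, ys)) / total_w
    s_xy = sum(w * (x - x_bar) * (y - y_bar) for w, x, y in zip(ws, xs, ys))
    s_xx = sum(w * (x - x_bar) ** 2 for w, x in zip(ws, xs))
    return s_xy / s_xx


if __name__ == "__main__":
    print(weighted_group_means(records))
    print(weighted_trend_slope(records))
```

The genotype-based coding places no ordering on the three genotype classes, whereas the trend coding assumes that each additional copy of the allele shifts the trait by the same amount.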
A second optional data set is the Annotation Data Set. This data set contains information, such as gene identity or chromosomal location, for each of the markers. This data set must be a tall data set; each row corresponds to a different marker.
The survey_genotype.sas7bdat data set is included in the Sample Data folder that comes with JMP Genomics.