The Q-K Mixed Model process tests for association between various types of traits and SNP genotypes or alleles, one SNP at a time, while adjusting simultaneously for population structure and family relatedness (Yu et al. 2006). You must already have computed the Q (population structure) and K (relatedness) matrices before running this process. Two types of analyses can be performed: an ANOVA based on SNP genotypes or a regression testing for a linear trend of SNP alleles. P-values from these tests, with adjustments applied if requested, are plotted along the marker map.
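For orientation, the model behind this kind of test is commonly written as follows, using the notation of Yu et al. (2006). This is a sketch of the general form only; the residual covariance is assumed here to be homogeneous, and the exact parameterization used by the process may differ.

```latex
\begin{aligned}
\mathbf{y} &= \mathbf{X}\boldsymbol{\beta} + \mathbf{S}\boldsymbol{\alpha} + \mathbf{Q}\mathbf{v} + \mathbf{Z}\mathbf{u} + \mathbf{e}, \\
\operatorname{Var}(\mathbf{u}) &= 2\mathbf{K}\sigma_g^2, \qquad
\operatorname{Var}(\mathbf{e}) = \mathbf{I}\sigma_e^2 \quad \text{(assumed homogeneous residuals)}
\end{aligned}
```

Here y is the vector of trait values, β holds fixed effects other than the SNP and population structure, α is the effect of the SNP being tested, v holds the population effects associated with the Q matrix, u is the random polygenic effect whose covariance is proportional to the K matrix, and e is the residual. In the ANOVA analysis the columns of S would indicate genotype classes, whereas in the trend analysis S would hold a single allele-count column.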
This process requires one data set, the Input Data Set, which contains all of the marker data. The sample data set used in the following example, samplegmdata_numgeno_rm_pcm.sas7bdat, was generated from the samplegmdata.sas7bdat data set described in Sample Genetic Marker Data. It contains the following, all merged with the original data: a root identity-by-descent (IBD) matrix computed for 60 computer-generated SNP genotypes by singular value decomposition (SVD) from the Relationship Matrix process, a compressed IBD matrix from the K Matrix Compression process, a principal components matrix from the PCA for Population Stratification process, a coordinates matrix from the Multidimensional Scaling process, and a population membership probability. This data set is partially shown below. Note that this is a wide data set; markers are listed in columns, whereas individuals are listed in rows.
A second, optional data set is the Annotation Data Set. This data set contains information, such as gene identity or chromosomal location, for each of the markers. The annotation data set used in this example, samplemap.sas7bdat, was computer generated and identifies markers, locations, and gene identities. A portion of this data set is illustrated below. This is a tall data set; each row corresponds to a different marker.
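The relationship between the wide input layout and the tall annotation layout can be sketched in a few lines of Python. This is an illustration only: the column names (id, trait, snp1, marker, chromosome, position, gene) and all values are invented for the sketch and are not taken from the sample data sets.

```python
# Hypothetical miniature of the two layouts; every name and value is invented.

# Wide layout: one row per individual, one column per marker.
wide_rows = [
    {"id": "ind1", "trait": 4.2, "snp1": 0, "snp2": 1},
    {"id": "ind2", "trait": 5.1, "snp1": 2, "snp2": 1},
]

# Tall layout: one row per marker, with its annotation.
annotation_rows = [
    {"marker": "snp2", "chromosome": 1, "position": 250, "gene": "geneB"},
    {"marker": "snp1", "chromosome": 1, "position": 100, "gene": "geneA"},
]


def marker_columns(wide_rows, annotation_rows):
    """Return annotated marker names that exist as wide columns, in map order."""
    columns = set(wide_rows[0])
    ordered = sorted(
        annotation_rows, key=lambda r: (r["chromosome"], r["position"])
    )
    return [r["marker"] for r in ordered if r["marker"] in columns]


def genotypes_by_marker(wide_rows, annotation_rows):
    """Collect each marker's genotype column as a list, one value per individual."""
    return {
        m: [row[m] for row in wide_rows]
        for m in marker_columns(wide_rows, annotation_rows)
    }
```

Each row of the annotation table names a column of the input table, which is how results for a marker can be placed at its map position.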
Both data sets are included in the Sample Data folder that comes with JMP Genomics.