The K Matrix Compression process clusters a symmetric relationship matrix or estimated kinship matrix to reduce the size of the random effect that accounts for relatedness in the Q-K Mixed Model analytical process. The compressed matrix can be used to define the covariance structure in a downstream mixed model for testing SNP-trait association. Compression can be done interactively through the JMP Clustering platform, where you decide cluster membership by choosing a cutoff level in hierarchical clustering based on visual inspection. Alternatively, clustering via PROC CLUSTER can be automated to produce a compressed K matrix based on a specified number of clusters. The optimized compression method scans through varying levels of compression to find the level that optimizes the fit of the mixed model to a specified trait (with the SNP effect excluded). This algorithm is described in Zhang et al. (Nature Genetics, 2010).
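The compression step can be illustrated with a minimal Python sketch. It is not the code this process runs. It assumes that cluster membership has already been decided (the `labels` list is hypothetical) and that kinship between two groups is summarized as the mean of all member-to-member kinships, including each individual's self-kinship on the diagonal. The function name `compress_kinship` and the example values are invented for illustration.

```python
import numpy as np

def compress_kinship(K, labels):
    """Average an n x n kinship matrix into a g x g group-level matrix.

    Assumption: group kinship is the mean of all pairwise kinships between
    the members of two groups (diagonal entries included within a group).
    """
    K = np.asarray(K, dtype=float)
    labels = np.asarray(labels)
    groups = np.unique(labels)  # sorted distinct cluster labels
    # Indicator matrix Z (n x g): Z[i, j] = 1 if individual i is in group j.
    Z = (labels[:, None] == groups[None, :]).astype(float)
    sizes = Z.sum(axis=0)  # number of individuals in each group
    # Z.T @ K @ Z sums kinship over every pair of members; divide by pair counts.
    K_compressed = (Z.T @ K @ Z) / np.outer(sizes, sizes)
    return groups, K_compressed

# Hypothetical 4-individual kinship matrix and a 2-cluster assignment.
K = np.array([
    [1.0, 0.8, 0.1, 0.2],
    [0.8, 1.0, 0.3, 0.0],
    [0.1, 0.3, 1.0, 0.6],
    [0.2, 0.0, 0.6, 1.0],
])
labels = [1, 1, 2, 2]

groups, Kc = compress_kinship(K, labels)
print(groups)  # cluster labels, in the row/column order of Kc
print(Kc)      # approximately [[0.9, 0.15], [0.15, 0.8]]
```

In this sketch four individuals are reduced to two groups, so the random effect shrinks from four levels to two. Fewer clusters means stronger compression.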
Warning: In contrast to the Q-K Mixed Model process, which requires the square root of the K matrix to be used in the model, the K Matrix Compression process requires the K matrix before the square root is taken. This process computes the square root of the compressed K matrix (via singular value decomposition) so that the columns are appropriately formatted for input as random effect variables in the Q-K Mixed Model process.
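The matrix square root mentioned above can be sketched in Python with NumPy. This is an illustration under stated assumptions, not the process's own implementation: it assumes the compressed K matrix is symmetric and positive semidefinite, so its singular value decomposition K = U S V' has U equal to V and the matrix R = U sqrt(S) V' satisfies R R' = K. The function name `sqrt_via_svd` and the example matrix are hypothetical.

```python
import numpy as np

def sqrt_via_svd(K):
    """Return R with R @ R.T == K for a symmetric positive semidefinite K."""
    K = np.asarray(K, dtype=float)
    U, s, Vt = np.linalg.svd(K)  # K = U @ diag(s) @ Vt, with s >= 0
    # Taking the square root of the singular values gives a square root of K.
    return U @ np.diag(np.sqrt(s)) @ Vt

# Hypothetical 2 x 2 compressed kinship matrix.
K = np.array([
    [0.90, 0.15],
    [0.15, 0.80],
])

R = sqrt_via_svd(K)
# Each column of R would serve as one random effect variable.
print(np.allclose(R @ R.T, K))
```

Supplying the columns of R as random effect variables with independent, equal-variance coefficients induces a covariance proportional to R R' = K, which is why the downstream model takes the square root rather than K itself.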
Note: To run optimized compression, a trait variable must be specified from the SNP Input Data Set tab. If the SNP data, phenotype data, and K matrix are all in the Input K matrix data set, a separate SNP input data set is not required. However, the K matrix input data set should be specified again as the SNP Input Data Set.