The KDMM Normalization process (Kernel Density Mean of M component) is a scaling normalization method for RNA-seq data similar to TMM Normalization (Robinson and Oshlack 2010). The data set is preprocessed and summarized into bins, exons, or genes in tall format, with each row containing data from a unique bin, exon, or gene and each column corresponding to a sample. The M and A components between the target sample (the sample being normalized) and the reference sample are calculated and used to estimate a 2-dimensional kernel density. That density supplies the weights for a weighted mean of the M component, which serves as the scaling factor for the target sample.
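The calculation described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used by the process. The M and A definitions follow the TMM convention (log2 ratio and mean log2 of library-size-scaled counts). The Gaussian product kernel, the Scott's-rule bandwidth, the exclusion of zero counts, and the conversion of the weighted mean back from the log2 scale are all assumptions, and the function names are hypothetical.

```python
import math

def ma_components(target, reference):
    """Return (M, A) lists for features with positive counts in both samples.

    Assumption: M and A are defined as in TMM, on library-size-scaled counts.
    """
    n_t, n_r = sum(target), sum(reference)
    m_vals, a_vals = [], []
    for t, r in zip(target, reference):
        if t > 0 and r > 0:  # logs are undefined for zero counts
            log_t, log_r = math.log2(t / n_t), math.log2(r / n_r)
            m_vals.append(log_t - log_r)
            a_vals.append(0.5 * (log_t + log_r))
    return m_vals, a_vals

def _bandwidth(values):
    """Scott's rule for 2-D data: sd * n^(-1/6) (an assumed bandwidth choice)."""
    n = len(values)
    if n < 2:
        return 1e-8
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return max(sd * n ** (-1.0 / 6.0), 1e-8)  # floor avoids division by zero

def kde_weights(m_vals, a_vals):
    """2-D Gaussian product-kernel density evaluated at each (M, A) point."""
    h_m, h_a = _bandwidth(m_vals), _bandwidth(a_vals)
    norm = len(m_vals) * 2.0 * math.pi * h_m * h_a
    dens = []
    for mi, ai in zip(m_vals, a_vals):
        total = 0.0
        for mj, aj in zip(m_vals, a_vals):
            z = ((mi - mj) / h_m) ** 2 + ((ai - aj) / h_a) ** 2
            total += math.exp(-0.5 * z)
        dens.append(total / norm)
    return dens

def kdmm_scaling_factor(target, reference):
    """Density-weighted mean of M, returned on the original scale (assumed 2**mean)."""
    m_vals, a_vals = ma_components(target, reference)
    weights = kde_weights(m_vals, a_vals)
    mean_m = sum(w * m for w, m in zip(weights, m_vals)) / sum(weights)
    return 2.0 ** mean_m

# Made-up counts for illustration only
reference = [10, 20, 30, 40]
target = [5, 100, 40, 8]
factor = kdmm_scaling_factor(target, reference)
```

Because features in dense regions of the M-A plane receive larger weights, outlying features contribute little to the factor, which is the same goal TMM pursues by trimming. The double loop is quadratic in the number of features, so a real data set would call for a gridded or binned density estimate instead.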
The trimmed sam_mus_gse18905_ch1_6s.sas7bdat data set shown below lists SAM data from genes located on chromosome 1 from three different mouse lines.
The second data set is the Experimental Design Data Set (EDDS). This required data set describes how the experiment was performed and provides information about the columns in the input data set. Note that one column in the EDDS must be named ColumnName, and the values in this column must exactly match the column names in the input data set.
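The ColumnName requirement can be checked before running the process. The following is a minimal sketch, assuming the EDDS has been read into a list of dictionaries (one per row) and the input data set's column names into a list. The function name, the sample names, and the Line column are hypothetical and are not taken from the example data sets.

```python
def unmatched_edds_names(edds_rows, input_columns):
    """Return EDDS ColumnName values that have no exact match in the input data set."""
    if any("ColumnName" not in row for row in edds_rows):
        raise ValueError("EDDS must contain a column named ColumnName")
    available = set(input_columns)
    # Exact, case-sensitive comparison, mirroring the "must exactly match" rule
    return [row["ColumnName"] for row in edds_rows
            if row["ColumnName"] not in available]

# Hypothetical EDDS rows and input column names, for illustration only
edds = [
    {"ColumnName": "sample_1", "Line": "A"},
    {"ColumnName": "sample_2", "Line": "B"},
    {"ColumnName": "Sample_3", "Line": "C"},  # differs in case from the input column
]
columns = ["gene_id", "sample_1", "sample_2", "sample_3"]

problems = unmatched_edds_names(edds, columns)
```

An empty result means every EDDS row refers to an existing column. Here the third row would be reported, because the match is case-sensitive in this sketch.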
The edf_mus_gse18905_chr1-6s_sas7bdat EDDS, shown below, corresponds to the sam_mus_gse18905_ch1_6s.sas7bdat input data set.
The sam_mus_gse18905_ch1_6s.sas7bdat and edf_mus_gse18905_chr1-6s_sas7bdat data sets were downloaded from GEO.
Refer to the KDMM Normalization output documentation for detailed descriptions of the output of this process.