Singular value decomposition (SVD) complements association analysis by providing another method to identify items that have an affinity for each other. Singular value decomposition of the transaction item matrix reduces the matrix to a manageable number of dimensions, thereby enabling you to group similar transactions and similar items. The partial singular value decomposition in the Association Analysis platform is equivalent to performing principal components analysis (PCA).
The transaction item matrix is a matrix in which each row corresponds to a transaction and each column corresponds to an item. The entries of the matrix are zeros and ones. If an item occurs in a transaction, the entry in the corresponding row and column is one. Otherwise, the entry is zero. Because the transaction item matrix usually contains more zeros than ones, it is called a sparse matrix.
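As an illustration of this layout, the following minimal Python sketch builds a transaction item matrix from a list of transactions. The transactions and item names are hypothetical and are not taken from any sample data. NumPy is assumed to be available.

```python
import numpy as np

# Hypothetical transactions; the item names are illustrative only.
transactions = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["beer"],
    ["butter", "milk"],
    ["eggs"],
]

# One column per distinct item, in alphabetical order.
items = sorted({item for t in transactions for item in t})
col = {item: j for j, item in enumerate(items)}

# One row per transaction; an entry is 1 if the item occurs in the transaction.
matrix = np.zeros((len(transactions), len(items)), dtype=int)
for i, t in enumerate(transactions):
    for item in t:
        matrix[i, col[item]] = 1

# Fraction of entries that are zero.
sparsity = 1.0 - matrix.mean()
```

In this toy example, 16 of the 25 entries are zero. Real transaction data is typically far sparser, so a sparse storage format would usually be preferable to the dense array used here.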
The partial singular value decomposition approximates the transaction item matrix using three matrices: U, S, and V′. The relationship between these matrices is defined as follows:
Transaction Item Matrix ≈ U * S * V′
Define nTransactions as the number of transactions (rows) in the transaction item matrix, nItems as the number of items (columns) in the transaction item matrix, and nVec as the specified number of singular vectors. Note that nVec must be less than or equal to min(nTransactions, nItems). It follows that U is an nTransactions by nVec matrix. S is a diagonal matrix of dimension nVec. The diagonal entries in S are the singular values of the transaction item matrix. V′ is an nVec by nItems matrix, so that the product of the three matrices has the same dimensions as the transaction item matrix. The rows in V′ are the singular vectors.
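The dimensions described above can be checked with a short sketch. It assumes NumPy and uses a randomly generated 0/1 matrix as stand-in data. It also obtains the partial decomposition by truncating a full SVD, which is a simplification chosen for illustration and not necessarily how the platform computes it. The variable names mirror nTransactions, nItems, and nVec.

```python
import numpy as np

rng = np.random.default_rng(0)
n_transactions, n_items, n_vec = 12, 7, 3

# Stand-in transaction item matrix of zeros and ones (illustrative data only).
matrix = (rng.random((n_transactions, n_items)) < 0.3).astype(float)

# Full thin SVD, then keep the n_vec largest singular values and vectors.
U_full, s_full, Vt_full = np.linalg.svd(matrix, full_matrices=False)
U = U_full[:, :n_vec]        # nTransactions by nVec
S = np.diag(s_full[:n_vec])  # nVec by nVec, singular values on the diagonal
Vt = Vt_full[:n_vec, :]      # nVec by nItems, rows are singular vectors

# Rank-nVec approximation; same shape as the original matrix.
approx = U @ S @ Vt

# Frobenius-norm error of the approximation.
error = np.linalg.norm(matrix - approx)
```

The approximation error equals the square root of the sum of the squared singular values that were discarded, so it shrinks as nVec increases.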
The singular vectors capture connections among different items with similar functions or topic areas. If three items tend to appear in the same transactions, the SVD is likely to produce a singular vector in V′ with large values for those three items. The U singular vectors represent the transactions projected into this new item space.
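The following sketch illustrates this behavior on an idealized toy matrix in which items A, B, and C always occur together, while items D and E vary independently of them. The data, item names, and the use of NumPy are assumptions made for illustration. The columns are centered and scaled before the decomposition, using the sample standard deviation as an assumed scaling.

```python
import numpy as np

# Toy data (illustrative assumption): A, B, C always co-occur;
# D and E vary independently of them and of each other.
items = ["A", "B", "C", "D", "E"]
matrix = np.array([
    [1, 1, 1, 1, 1],
    [1, 1, 1, 0, 1],
    [1, 1, 1, 1, 0],
    [1, 1, 1, 0, 0],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 0, 1],
    [0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0],
], dtype=float)

# Center each column and scale it by its sample standard deviation.
z = (matrix - matrix.mean(axis=0)) / matrix.std(axis=0, ddof=1)

U, s, Vt = np.linalg.svd(z, full_matrices=False)

# First singular vector over the items (sign is arbitrary).
top = Vt[0]

# Transactions projected onto that vector.
scores = U[:, 0] * s[0]
```

In this idealized case, the first singular vector has loadings of equal magnitude on A, B, and C and loadings of essentially zero on D and E. The scores separate the four transactions that contain A, B, and C from the four that do not. With real data, co-occurrence is rarely this clean, so the loadings are larger for related items but not exactly zero elsewhere.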
The transaction item matrix is centered, scaled, and divided by nTransactions minus 1 before the singular value decomposition is carried out. This analysis is equivalent to a PCA of the correlation matrix of the transaction item matrix.
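The equivalence with PCA can be checked numerically. The sketch below assumes NumPy, randomly generated stand-in data, and scaling by the sample standard deviation. It centers and scales the matrix, divides it by nTransactions minus 1, and compares the result of the SVD with an eigendecomposition of the correlation matrix. Dividing by a constant changes only the scale of the singular values, not the singular vectors, so the squared singular values are multiplied by nTransactions minus 1 before they are compared with the eigenvalues.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in transaction item matrix (illustrative data only).
matrix = (rng.random((50, 6)) < 0.4).astype(float)
n = matrix.shape[0]

# Center, scale by the sample standard deviation, divide by n - 1.
z = (matrix - matrix.mean(axis=0)) / matrix.std(axis=0, ddof=1)
x = z / (n - 1)

_, s, Vt = np.linalg.svd(x, full_matrices=False)

# PCA side: eigenvalues of the correlation matrix, largest first.
R = np.corrcoef(matrix, rowvar=False)
eigvals = np.linalg.eigvalsh(R)[::-1]

# Squared singular values, rescaled to the eigenvalue scale.
rescaled = s**2 * (n - 1)

# The rows of Vt diagonalize the correlation matrix.
diagonalized = Vt @ R @ Vt.T
```

Under these assumptions, rescaled matches eigvals, and diagonalized is a diagonal matrix with the same values on its diagonal. The rows of Vt therefore play the role of the principal component loadings.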