Publication date: 07/08/2024

Statistical Details for the t-SNE Method

The t-SNE method maps points from a high-dimensional space, {x1, x2, ..., xn}, to points in a low-dimensional space, {y1, y2, ..., yn}, by minimizing the difference between the high-dimensional similarities of each pair {xi, xj} and the low-dimensional similarities of the corresponding pair {yi, yj}. The pairwise similarities are represented as probability distributions. In the high-dimensional space, conditional probabilities, pj|i, are calculated using the Gaussian distribution. The Multivariate Embedding platform provides two methods to calculate the conditional probabilities.

Sparse Approximation Calculation for Conditional Probabilities

If the Sparse option is selected in the launch window, pj|i are calculated using a sparse approximation. For each of the n inputs, a set of nearest neighbors is found using a vantage-point (VP) tree. Then, the conditional probabilities are calculated only for those subsets of nearest neighbors:

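$$p_{j|i} = \begin{cases} \dfrac{\exp\left(-\lVert x_i - x_j \rVert^2 / 2\sigma_i^2\right)}{\sum_{k \in N_i} \exp\left(-\lVert x_i - x_k \rVert^2 / 2\sigma_i^2\right)} & \text{if } j \in N_i \\ 0 & \text{otherwise} \end{cases}$$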

In this equation, Ni is the set of the floor(3p) nearest neighbors of xi, where p is the perplexity parameter defined in the launch window. The variance of the Gaussian distribution, σi², is also based on the perplexity parameter. See van der Maaten and Hinton (2008) and van der Maaten (2014).
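The sparse calculation can be sketched in Python with NumPy. This is only an illustration under stated assumptions, not the platform's implementation: the function name is hypothetical, the per-point values of σi are assumed to be supplied by the caller, and a brute-force neighbor search stands in for the vantage-point tree.

```python
import math
import numpy as np

def sparse_conditional_probabilities(X, perplexity, sigma):
    """Return an (n, n) array P where P[i, j] approximates p_{j|i}.

    Only the floor(3 * perplexity) nearest neighbors of each point receive
    nonzero probability; all other entries stay 0.
    `sigma` is a length-n array of per-point sigma_i values (assumed given).
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    k = min(math.floor(3 * perplexity), n - 1)  # neighbor count, capped at n - 1

    # Squared Euclidean distances for all pairs (brute force, not a VP tree).
    sq_dist = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    np.fill_diagonal(sq_dist, np.inf)  # a point is never its own neighbor

    P = np.zeros((n, n))
    for i in range(n):
        neighbors = np.argsort(sq_dist[i])[:k]  # indices in N_i
        d = sq_dist[i, neighbors]
        # Subtracting d.min() leaves the ratio unchanged but avoids underflow.
        w = np.exp(-(d - d.min()) / (2.0 * sigma[i] ** 2))
        P[i, neighbors] = w / w.sum()
    return P
```

Each row of the result sums to 1 over the neighbor set, so storage and later computations can be restricted to roughly 3p entries per point instead of n − 1.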

Non-Sparse Calculation for Conditional Probabilities

If the Sparse option is not selected in the launch window, pj|i are calculated for all points:

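$$p_{j|i} = \frac{\exp\left(-\lVert x_i - x_j \rVert^2 / 2\sigma_i^2\right)}{\sum_{k \ne i} \exp\left(-\lVert x_i - x_k \rVert^2 / 2\sigma_i^2\right)}$$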

In this calculation, the variance of the Gaussian distribution, σi², is also based on the perplexity parameter.
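One common way to tie σi to the perplexity, described by van der Maaten and Hinton (2008), is to search for the value at which the perplexity of the conditional distribution for point i equals p. The sketch below assumes that approach; whether the platform uses this exact search is not stated here. It works with beta = 1 / (2σi²), and the function names, tolerance, and iteration limit are illustrative choices.

```python
import numpy as np

def conditional_row(sq_dist_i, beta):
    """Gaussian conditional probabilities for one point, beta = 1 / (2 * sigma_i**2)."""
    w = np.exp(-(sq_dist_i - sq_dist_i.min()) * beta)  # shifted for stability
    return w / w.sum()

def perplexity_of(p):
    """Perplexity 2**H(p), where H is the Shannon entropy in bits."""
    p = p[p > 0]
    return 2.0 ** (-np.sum(p * np.log2(p)))

def dense_conditional_probabilities(X, perplexity, tol=1e-5, max_iter=100):
    """Return an (n, n) array P with P[i, j] = p_{j|i} over all points j != i."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    sq_dist = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    P = np.zeros((n, n))
    for i in range(n):
        others = np.arange(n) != i
        d = sq_dist[i, others]
        lo, hi, beta = 0.0, np.inf, 1.0
        for _ in range(max_iter):
            p = conditional_row(d, beta)
            diff = perplexity_of(p) - perplexity
            if abs(diff) < tol:
                break
            if diff > 0:  # distribution too flat: raise beta (smaller sigma_i)
                lo = beta
                beta = beta * 2.0 if np.isinf(hi) else (lo + hi) / 2.0
            else:         # distribution too peaked: lower beta (larger sigma_i)
                hi = beta
                beta = (lo + hi) / 2.0
        P[i, others] = p
    return P
```

The search relies on the perplexity decreasing monotonically as beta grows, so the target perplexity should lie strictly between 1 and n − 1.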

Calculation of Joint Probability Distributions

In the t-SNE method, the conditional probabilities are made symmetric. Because pj|i and pi|j are generally not equal, the joint probabilities, pij, in the high-dimensional space are defined by symmetrizing the conditional probabilities:

$$p_{ij} = \frac{p_{j|i} + p_{i|j}}{2n}$$

where pij = pji for all i and j. Since it is the pairwise similarities that are of interest, it is also assumed that pii = 0.
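As a small illustration, the symmetrization can be written in a few lines of NumPy. The sketch assumes the form used by van der Maaten and Hinton (2008), pij = (pj|i + pi|j) / (2n); the function name and the 3 × 3 input matrix are made up for the example.

```python
import numpy as np

def joint_probabilities(P_cond):
    """Symmetrize conditional probabilities: p_ij = (p_{j|i} + p_{i|j}) / (2n).

    P_cond[i, j] holds p_{j|i}; each row is assumed to sum to 1.
    """
    P_cond = np.asarray(P_cond, dtype=float)
    n = P_cond.shape[0]
    P = (P_cond + P_cond.T) / (2.0 * n)
    np.fill_diagonal(P, 0.0)  # p_ii = 0
    return P

# Hypothetical conditional probabilities for three points.
P_cond = np.array([[0.0, 0.7, 0.3],
                   [0.4, 0.0, 0.6],
                   [0.5, 0.5, 0.0]])
P = joint_probabilities(P_cond)
```

Dividing by 2n makes the entries of P sum to 1 over all pairs, so P is a proper joint distribution even though the input rows are separate conditional distributions.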

The joint probabilities in the low-dimensional mapping, qij, are calculated using the Student’s t distribution with one degree of freedom:

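$$q_{ij} = \frac{\left(1 + \lVert y_i - y_j \rVert^2\right)^{-1}}{\sum_{k \ne l} \left(1 + \lVert y_k - y_l \rVert^2\right)^{-1}}$$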

These probabilities have the same properties as the pij’s, meaning that qij = qji for all i and j and qii = 0.
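The low-dimensional similarities can be sketched the same way. The example below assumes the standard t-SNE kernel, (1 + ||yi − yj||²)⁻¹, normalized over all pairs of distinct points; the function name is illustrative.

```python
import numpy as np

def low_dim_joint_probabilities(Y):
    """Return an (n, n) array Q of joint probabilities q_ij for embedding Y."""
    Y = np.asarray(Y, dtype=float)
    sq_dist = np.sum((Y[:, None, :] - Y[None, :, :]) ** 2, axis=-1)
    w = 1.0 / (1.0 + sq_dist)  # Student's t kernel, one degree of freedom
    np.fill_diagonal(w, 0.0)   # q_ii = 0
    return w / w.sum()         # normalize over all pairs k != l
```

Because the kernel depends only on the distance between two points, the resulting matrix is symmetric without any extra symmetrization step.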

The t-SNE method minimizes the difference between the pairwise similarities in the high-dimensional space and the pairwise similarities in the low-dimensional space by minimizing a single Kullback-Leibler divergence between the joint probability distribution P and the joint probability distribution Q. The Kullback-Leibler divergence between P and Q is calculated as follows:

$$KL(P \,\|\, Q) = \sum_{i} \sum_{j \ne i} p_{ij} \log \frac{p_{ij}}{q_{ij}}$$
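For reference, the divergence can be evaluated directly from two joint probability matrices. This is a minimal sketch: the function name and the small floor `eps`, which guards against division by zero, are assumptions added for the example.

```python
import numpy as np

def kl_divergence(P, Q, eps=1e-12):
    """Kullback-Leibler divergence: sum over pairs of p_ij * log(p_ij / q_ij)."""
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    mask = P > 0  # pairs with p_ij = 0 contribute nothing to the sum
    return float(np.sum(P[mask] * np.log(P[mask] / np.maximum(Q[mask], eps))))
```

The divergence is 0 when P and Q are identical and positive otherwise. It is not symmetric in its arguments, so P and Q cannot be swapped.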
