Launch the Principal Components platform by selecting Analyze > Multivariate Methods > Principal Components. Principal component analysis is also available through the Multivariate and Scatterplot 3D platforms.
The example described in Example of Principal Component Analysis uses all of the continuous variables from the Solubility.jmp sample data table.
Figure 3.3 Principal Components Launch Window
The Default option uses the Row-wise, Pairwise, or REML method, depending on the data table. A JMP Alert also recommends switching to the Wide method when appropriate.
– Row-wise estimation is used for data tables with no missing values.
– Pairwise estimation is used for data tables with missing values and either more than 10 columns, more than 5,000 rows, or more columns than rows.
– REML estimation is used otherwise.
– Wide estimation is recommended by a JMP Alert window for data tables with more than 500 columns. This is because computation time can be considerable when you use the other methods with a large number of columns. Click Wide to switch to the Wide method or click Continue to use the method you originally selected.
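The selection rules above can be summarized as a small decision function. The following Python sketch is only an illustration of the rules as stated in this section; the names `MethodChoice` and `default_method` are hypothetical and are not part of JMP or JSL.

```python
from dataclasses import dataclass


@dataclass
class MethodChoice:
    method: str           # "Row-wise", "Pairwise", or "REML"
    recommend_wide: bool  # True when the Wide alert described above would apply


def default_method(n_rows: int, n_cols: int, has_missing: bool) -> MethodChoice:
    """Restate the Default rules from the text (hypothetical helper, not a JMP API)."""
    if not has_missing:
        method = "Row-wise"
    elif n_cols > 10 or n_rows > 5000 or n_cols > n_rows:
        method = "Pairwise"
    else:
        method = "REML"
    # The alert recommending Wide depends only on the column count.
    return MethodChoice(method=method, recommend_wide=n_cols > 500)
```

For example, under these rules a table with 100 rows, 6 columns, and some missing cells would use REML, while the same table with 12 columns would use Pairwise.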
Restricted maximum likelihood (REML) estimation uses all of the data, even if missing values are present. Due to a bias-correction factor, this method is slow if the dataset is large and there are many missing values. Therefore, REML is most useful for smaller datasets. If there are no missing cells in the data, then the REML and ML estimates are equivalent and equal to the sample covariance matrix. If there are missing cells, REML’s variance and covariance estimates are less biased than the estimates from ML estimation. For more information, see REML.
Robust estimation uses all of the data, even if missing values are present. This method down-weights extreme values and is therefore useful for data tables that might have outliers. For statistical details, see Robust in Correlations and Multivariate Techniques.
Wide estimation does not use observations with missing values, so rows that contain missing cells are deleted before the method is applied. This estimation method uses an algorithm based on the full singular value decomposition. The algorithm avoids calculating the covariance matrix and is therefore computationally efficient. It is useful when you have a very large number of columns in your data. For additional information, see Wide.
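As a rough illustration of why an SVD-based approach avoids the covariance matrix, the NumPy sketch below drops rows with missing cells, centers the columns, and reads the principal component variances directly from the singular values. It is not JMP's implementation; the function name `wide_pca` is hypothetical, and the columns are only centered (not standardized), so the result corresponds to PCA on covariances.

```python
import numpy as np


def wide_pca(data):
    """PCA from the SVD of the centered data (illustrative sketch, not JMP's code).

    Returns (variances, loadings, scores) computed from complete rows only.
    """
    X = np.asarray(data, dtype=float)
    X = X[~np.isnan(X).any(axis=1)]       # delete rows that contain missing cells
    X = X - X.mean(axis=0)                # center each column
    # Thin SVD of the n-by-p data; the p-by-p covariance matrix is never formed.
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    variances = s**2 / (X.shape[0] - 1)   # equal to the covariance eigenvalues
    loadings = Vt.T                       # columns are principal directions
    scores = U * s                        # principal component scores
    return variances, loadings, scores
```

With n rows and p columns where p is much larger than n, the thin SVD works on an n-by-p array and returns at most n components, whereas a covariance-based approach would first build a p-by-p matrix.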
Sparse estimation uses all of the data, even if missing values are present. This estimation method uses an algorithm based on the partial singular value decomposition, which computes only a specified number of the leading singular values and singular vectors. The algorithm avoids calculating the covariance matrix, as well as unnecessary principal components, and is therefore computationally efficient. It is useful when your data are sparse, meaning that they contain many zeros, or when there is a large number of columns in the data. For additional information, see Sparse.
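To make the idea of a partial singular value decomposition concrete, the sketch below computes only the first k singular triplets by power iteration with deflation. This is a swapped-in textbook technique chosen for clarity, not the algorithm JMP uses, and it does not address missing values. The function names are hypothetical, and the code uses only the Python standard library so that the arithmetic stays visible.

```python
import math
import random


def matvec(A, v):
    """Compute A v for a matrix stored as a list of rows."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def rmatvec(A, u):
    """Compute A^T u without building the transpose."""
    return [sum(A[i][j] * u[i] for i in range(len(A))) for j in range(len(A[0]))]


def norm(v):
    return math.sqrt(sum(x * x for x in v))


def top_singular_triplets(A, k, iters=500, seed=0):
    """Return the k leading (sigma, u, v) triplets of A (illustrative sketch)."""
    rng = random.Random(seed)
    found = []
    for _ in range(k):
        v = [rng.gauss(0.0, 1.0) for _ in range(len(A[0]))]
        for _ in range(iters):
            w = rmatvec(A, matvec(A, v))  # apply A^T A without forming it
            # Deflation: remove directions of the right singular vectors found so far.
            for _, _, pv in found:
                dot = sum(a * b for a, b in zip(w, pv))
                w = [a - dot * b for a, b in zip(w, pv)]
            length = norm(w)
            if length == 0.0:
                break
            v = [x / length for x in w]
        u = matvec(A, v)
        sigma = norm(u)
        if sigma > 0.0:
            u = [x / sigma for x in u]
        found.append((sigma, u, v))
    return found
```

Only matrix-vector products with the data are needed, which is why this style of algorithm suits matrices that contain many zeros and why components beyond the first k are never computed.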
• Use the Impute Missing Data option found under Multivariate Methods > Multivariate. See Impute Missing Data in Correlations and Multivariate Techniques.
• Use the Multivariate Normal Imputation or Multivariate SVD Imputation utilities found in Analyze > Screening > Explore Missing Values. See Explore Missing Values Utility in the Predictive and Specialized Modeling book for details.