The Robust PCA Outliers report in the Explore Outliers platform begins with a summary table of information about the algorithm, followed by several tables of results. The summary table lists the rank of the low-rank matrix, the number of SVD iterations, the convergence criterion, the value of Lambda, and the number of imputed missing values. If the Randomized SVD option is enabled, the summary table also lists the number of dimensions used in the randomized SVD. The report also contains the following tables and matrices.
Cell Large Residuals
A table that shows the cells with the largest residuals. The number of cells shown is determined by the Outlier Threshold. For each cell, the table lists the column name, the row number, the residual value, and the scaled residual value.
Tip: To color specific outlier cells in the data table, select rows in the Cell Large Residuals table and click Colorize.
Row Root Mean Square
A table that shows the root mean square value for each row in the data table. The root mean square is calculated using the scaled residuals.
Tip: If you select a row in the Row Root Mean Square table, the corresponding row is selected in the data table.
Column Root Mean Square
A table that shows the root mean square value for each column specified in the launch window. The root mean square is calculated using the scaled residuals.
Tip: If you select a row in the Column Root Mean Square table and click Select Columns, the corresponding column is selected in the data table.
Snapshot
A graphical representation of the outlier cells in the data table. The outlier cells are colored in red.
Residuals
The matrix of residuals from the matrix decomposition. A cell is colored if the absolute value of the scaled residual is greater than the following:
min[0.99 × max{abs(residuals)}, Outlier Threshold]
Low Rank Approximation
The low-rank approximation matrix from the matrix decomposition.
Singular Values
The vector of singular values from the SVD.
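The Row Root Mean Square and Column Root Mean Square tables, and the coloring rule for the Residuals matrix, can be illustrated with a minimal Python sketch. This is not the platform's implementation. The function names are hypothetical, the scaled residuals are assumed to be available as a list of rows with no missing entries, and the maximum in the coloring rule is assumed to be taken over the same scaled residuals.

```python
import math

def root_mean_square(values):
    """Root mean square of a sequence of numbers."""
    values = list(values)
    return math.sqrt(sum(v * v for v in values) / len(values))

def row_and_column_rms(scaled):
    """Return (row RMS list, column RMS list) for a matrix of scaled residuals."""
    row_rms = [root_mean_square(row) for row in scaled]
    col_rms = [root_mean_square(col) for col in zip(*scaled)]  # zip(*m) iterates columns
    return row_rms, col_rms

def color_cutoff(scaled, outlier_threshold):
    """min[0.99 * max{abs(residuals)}, Outlier Threshold].

    Assumption: the maximum is taken over the scaled residuals.
    """
    largest = max(abs(v) for row in scaled for v in row)
    return min(0.99 * largest, outlier_threshold)

def colored_cells(scaled, outlier_threshold):
    """(row, column) index pairs whose absolute scaled residual exceeds the cutoff."""
    cutoff = color_cutoff(scaled, outlier_threshold)
    return [
        (i, j)
        for i, row in enumerate(scaled)
        for j, v in enumerate(row)
        if abs(v) > cutoff
    ]
```

Because the cutoff is capped at 0.99 times the largest absolute residual, the cell with the largest residual is always colored whenever that residual is nonzero, even if the Outlier Threshold is set very high.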
Buttons at the bottom of the Robust PCA Outliers report enable you to close the report or save different parts of it.
Close
Closes the Robust PCA Outliers report.
Save Large Outliers
Saves the information in the Cell Large Residuals table to a new data table.
Save Cleaned
Opens a window that provides several techniques to clean the outliers based on thresholds and save new columns to the data table.
Trim
Trims outlier cells if the corresponding absolute scaled residual is greater than the specified threshold. By default, the threshold is 10. The trimmed cells are set to the value of the unscaled threshold. Select Color to color the trimmed cells red.
Impute
Sets outlier cells to the value of the low-rank approximation if the corresponding absolute scaled residual is greater than the specified threshold. By default, the threshold is 100. Select Color to color these cells green.
Make Missing
Sets outlier cells to missing if the corresponding absolute scaled residual is greater than the specified threshold. By default, the threshold is 1000. Select Color to color these cells blue.
Color imputed from missing
If selected, colors cells that originally had missing values and were imputed.
Save Residuals
Saves the residuals to new columns in the original data table.
Save Scaled Residuals
Saves the scaled residuals to new columns in the original data table.
Save Low Rank Approx
Saves the low-rank approximation to new columns in the original data table.
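The three Save Cleaned techniques can be pictured as a cascade over the absolute scaled residual of each cell. The Python sketch below is illustrative only and rests on several assumptions that the report description does not confirm: that all three techniques are active at their default thresholds, that the most severe rule wins when more than one applies, that a scaled residual equals (value minus low-rank value) divided by a per-column scale, and that "the value of the unscaled threshold" means the low-rank value plus or minus the threshold multiplied by that scale. The function and parameter names are hypothetical.

```python
def clean_cell(value, low_rank, scaled_residual, scale,
               trim_at=10.0, impute_at=100.0, missing_at=1000.0):
    """Return the cleaned value for one cell, or None for a missing value.

    Assumed precedence: Make Missing, then Impute, then Trim.
    """
    size = abs(scaled_residual)
    if size > missing_at:
        return None                      # Make Missing
    if size > impute_at:
        return low_rank                  # Impute from the low-rank approximation
    if size > trim_at:
        # Trim: pull the cell back to the threshold, expressed in data units.
        # Assumption: unscaled threshold = low_rank +/- trim_at * scale.
        sign = 1.0 if scaled_residual > 0 else -1.0
        return low_rank + sign * trim_at * scale
    return value                         # Not an outlier: leave unchanged
```

For example, with a low-rank value of 4 and a scale of 2, a cell whose scaled residual is 23 would be trimmed to 4 + 10 × 2 = 24 under these assumptions, while a cell whose scaled residual is 200 would be replaced by the low-rank value 4.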