• lignin sulfonate (ls), which is pulp industry pollution
• humic acid (ha), which is a natural forest product
1. Open the Baltic.jmp data table.
Note: The data in the Baltic.jmp data table are reported in Umetrics (1995). The original source is Lindberg, Persson, and Wold (1983).
2. Select Analyze > Multivariate Methods > Partial Least Squares.
3.
4. Assign Intensities, which contains the 27 intensity variables v1 through v27, to the X, Factor role.
5. Click OK.
6. Select Leave-One-Out as the Validation Method.
7. Click Go.
A portion of the report appears in Figure 5.2. Because the van der Voet test is a randomization test, your Prob > van der Voet T² values can differ slightly from those in Figure 5.2.
Figure 5.2 Partial Least Squares Report
The Root Mean PRESS (predicted residual sum of squares) Plot shows that Root Mean PRESS is minimized when the number of factors is 7, as stated in the note beneath the plot. A report called NIPALS Fit with 7 Factors is produced. A portion of that report is shown in Figure 5.3.
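To make the leave-one-out idea concrete, the following is a minimal sketch in Python, not JMP's implementation. It fits a single-response PLS model with NIPALS, holds out one row at a time, and accumulates the squared prediction errors. The function names (`fit_pls1`, `predict_pls1`, `root_mean_press`) and the toy data are hypothetical, and the scaling used here, the square root of PRESS divided by the number of rows on unscaled data, is an assumption; JMP may scale the statistic differently.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def fit_pls1(X, y, n_factors):
    """Fit a single-response PLS model by NIPALS on centered data."""
    n, m = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(m)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(m)] for row in X]  # centered X
    f = [v - y_mean for v in y]                                # centered y
    factors = []
    for _ in range(n_factors):
        # Weight vector: direction in X space with maximal covariance with y.
        w = [dot([E[i][j] for i in range(n)], f) for j in range(m)]
        norm = math.sqrt(dot(w, w))
        if norm < 1e-12:  # nothing left to explain
            break
        w = [v / norm for v in w]
        t = [dot(row, w) for row in E]                         # scores
        tt = dot(t, t)
        p = [dot([E[i][j] for i in range(n)], t) / tt for j in range(m)]
        q = dot(f, t) / tt
        # Deflate X and y before extracting the next factor.
        E = [[E[i][j] - t[i] * p[j] for j in range(m)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        factors.append((w, p, q))
    return x_mean, y_mean, factors

def predict_pls1(model, x):
    """Predict one row by repeating the deflation steps on that row."""
    x_mean, y_mean, factors = model
    e = [a - b for a, b in zip(x, x_mean)]
    y_hat = y_mean
    for w, p, q in factors:
        t = dot(e, w)
        y_hat += q * t
        e = [ej - t * pj for ej, pj in zip(e, p)]
    return y_hat

def root_mean_press(X, y, n_factors):
    """Leave-one-out PRESS, reported as sqrt(PRESS / n) (assumed scaling)."""
    press = 0.0
    for i in range(len(X)):
        model = fit_pls1(X[:i] + X[i + 1:], y[:i] + y[i + 1:], n_factors)
        press += (y[i] - predict_pls1(model, X[i])) ** 2
    return math.sqrt(press / len(X))

# Made-up data for illustration only: y = 2*x1 + 3*x2 + 1.
X_demo = [[1.0, 2.0], [2.0, 1.0], [3.0, 5.0], [4.0, 3.0], [5.0, 6.0], [6.0, 4.0]]
y_demo = [9.0, 8.0, 22.0, 18.0, 29.0, 25.0]
for k in range(3):
    print(k, root_mean_press(X_demo, y_demo, k))
```

Plotting the printed values against the number of factors gives the analogue of the Root Mean PRESS Plot; the factor count with the smallest value corresponds to the model that the report fits by default.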
The van der Voet T² statistic tests whether a model with a different number of factors differs significantly from the model with the minimum PRESS value. A common practice is to extract the smallest number of factors for which the van der Voet significance level exceeds 0.10 (SAS Institute Inc. 2017d; Tobias 1995). Applying that rule here, you would fit a new model by entering 6 as the Number of Factors in the Model Launch panel.
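The selection rule described above can be sketched in a few lines of Python. This is an illustration only: the function name `choose_n_factors` is hypothetical, and the significance levels below are made-up values, not the ones reported in Figure 5.2.

```python
def choose_n_factors(p_values, alpha=0.10):
    """Return the smallest factor count whose significance level exceeds alpha.

    p_values maps a number of factors to its Prob > van der Voet T2 value.
    Returns None if no candidate exceeds alpha.
    """
    for n_factors in sorted(p_values):
        if p_values[n_factors] > alpha:
            return n_factors
    return None

# Hypothetical significance levels (NOT taken from the report).
# The minimum-PRESS model is compared with itself, so its level is 1.
illustrative_p = {0: 0.001, 1: 0.001, 2: 0.004, 3: 0.010,
                  4: 0.030, 5: 0.060, 6: 0.250, 7: 1.000}
selected = choose_n_factors(illustrative_p)
print(selected)
```

Because the test is a randomization test, significance levels near the 0.10 cutoff can fall on either side of it from run to run, so the selected factor count can change between runs.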
Figure 5.3 Seven Extracted Factors
8. Select Diagnostics Plots from the NIPALS Fit with 7 Factors red triangle menu.
This produces a report showing actual by predicted plots and three reports showing various residual plots. The Actual by Predicted Plot in Figure 5.4 shows how closely the predicted compound amounts agree with the actual amounts.
Figure 5.4 Diagnostics Plots
9. Select VIP vs Coefficients Plot from the NIPALS Fit with 7 Factors red triangle menu.
Figure 5.5 VIP vs Coefficients Plot
The VIP vs Coefficients plot helps identify variables that are influential relative to the fit for the various responses. For example, v23, v2, and v26 each have a VIP value that exceeds 0.8 and a relatively large coefficient.
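As a rough illustration of how such a screen could be reproduced outside JMP, the Python sketch below computes VIP (variable importance for projection) scores using the commonly cited definition, in which each factor's squared normalized weights are weighted by the response variation that the factor explains. JMP's exact computation may differ. The function names, the coefficient cutoff `coef_cut`, and all numbers are hypothetical.

```python
import math

def vip_scores(weights, ss_explained):
    """VIP score per predictor.

    weights[a][j]   : weight of predictor j on factor a
    ss_explained[a] : response sum of squares explained by factor a
    """
    n_pred = len(weights[0])
    total_ss = sum(ss_explained)
    scores = []
    for j in range(n_pred):
        acc = 0.0
        for w, ss in zip(weights, ss_explained):
            norm_sq = sum(v * v for v in w)
            acc += ss * w[j] ** 2 / norm_sq
        scores.append(math.sqrt(n_pred * acc / total_ss))
    return scores

def flag_influential(names, vips, coefs, vip_cut=0.8, coef_cut=0.1):
    """Names with VIP above vip_cut and |coefficient| above coef_cut.

    coef_cut is an arbitrary illustrative threshold; the text only says
    'relatively large'.
    """
    return [name for name, v, c in zip(names, vips, coefs)
            if v > vip_cut and abs(c) > coef_cut]

# Made-up two-factor, three-predictor example.
weights = [[0.6, 0.8, 0.0], [0.0, 0.6, 0.8]]
ss_explained = [3.0, 1.0]
vips = vip_scores(weights, ss_explained)
flagged = flag_influential(["x1", "x2", "x3"], vips, [0.5, -0.02, 0.4])
print(vips, flagged)
```

Under this definition the squared VIP scores sum to the number of predictors, so a score of 1 marks average importance and the 0.8 cutoff sits a little below average.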