Use a sphere-packing design to study a pre-defined model. Worley (1987) presented a model of the flow of water through a borehole that is drilled from the ground surface through two aquifers. The response variable y is the flow rate through the borehole in m³/year and is determined by the following equation:
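The borehole function is widely used as a test function for computer experiments. In its commonly cited form, reproduced here from that literature and using the factor symbols defined in the list that follows, it is:

```latex
y = \frac{2\pi\, T_u\,(H_u - H_l)}
         {\ln(r/r_w)\left[\,1 + \dfrac{2\,L\,T_u}{\ln(r/r_w)\; r_w^{2}\, K_w} + \dfrac{T_u}{T_l}\right]}
```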
There are eight factors in this model:
rw = radius of borehole, 0.05 to 0.15 m
r = radius of influence, 100 to 50,000 m
Tu = transmissivity of upper aquifer, 63,070 to 115,600 m²/year
Hu = potentiometric head of upper aquifer, 990 to 1100 m
Tl = transmissivity of lower aquifer, 63.1 to 116 m²/year
Hl = potentiometric head of lower aquifer, 700 to 820 m
L = length of borehole, 1120 to 1680 m
Kw = hydraulic conductivity of borehole, 9855 to 12,045 m/year
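As a minimal sketch of how the response can be calculated outside JMP, the following Python code evaluates the borehole function in its commonly cited form. The factor ranges are copied from the list above. The names RANGES, borehole, and center are illustrative and are not part of JMP or of the sample data tables.

```python
import math

# Factor ranges copied from the list above: (low, high).
RANGES = {
    "rw": (0.05, 0.15),
    "r": (100.0, 50000.0),
    "Tu": (63070.0, 115600.0),
    "Hu": (990.0, 1100.0),
    "Tl": (63.1, 116.0),
    "Hl": (700.0, 820.0),
    "L": (1120.0, 1680.0),
    "Kw": (9855.0, 12045.0),
}


def borehole(rw, r, Tu, Hu, Tl, Hl, L, Kw):
    """Flow rate through the borehole in m^3/year (commonly cited form)."""
    log_ratio = math.log(r / rw)  # natural logarithm of r / rw
    bracket = 1.0 + 2.0 * L * Tu / (log_ratio * rw**2 * Kw) + Tu / Tl
    return 2.0 * math.pi * Tu * (Hu - Hl) / (log_ratio * bracket)


# Evaluate the response at the midpoint of every factor range.
center = {name: (lo + hi) / 2.0 for name, (lo, hi) in RANGES.items()}
y_center = borehole(**center)
print(f"flow at range midpoints: {y_center:.2f} m^3/year")
```

Because the function is deterministic, calling borehole twice with the same inputs returns the same value, which is the property that the later discussion of p-values and residuals relies on.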
You can use a sphere-packing design to obtain a set of conditions for which to calculate the response y. Then you can build a model to estimate the true model over the range of inputs used in your design. Evaluation of your estimated model can help you understand the impact of each of the eight factors on the response.
To create a sphere-packing design for the borehole model, you can use a data table of saved factor settings.
1. Select Help > Sample Data Folder and open Design Experiment/Borehole Factors.jmp.
2. Select DOE > Special Purpose > Space Filling Design.
3. Click the Space Filling Design red triangle and select Load Factors.
Figure 22.21 Factors Panel with Factor Values Loaded for Borehole Example
Note: The logarithms of r and rw are used as factors.
4. Set the Number of Runs to 32.
5. Click Sphere Packing to produce the design.
6. Click Make Table to make a table showing the design settings for the experiment.
Note: The design table includes a Model table script. This script runs a Gaussian Process model for the response y.
7. (Optional) To see a completed data table for this example, select Help > Sample Data Folder and open Design Experiment/Borehole Sphere Packing.jmp.
Because the designs are generated from a random seed, the settings that you obtain can differ from those shown in the completed table.
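The idea behind a sphere-packing design is to spread the runs so that the smallest distance between any two design points is as large as possible. The following Python sketch illustrates that criterion with a crude random search: it draws many candidate designs in the unit cube and keeps the one with the largest minimum pairwise distance. This is a stand-in for illustration only and is not the algorithm that JMP uses. The function names, the number of candidate designs, and the use of natural logarithms for r and rw are all assumptions.

```python
import itertools
import math
import random

# (low, high) ranges; r and rw are placed on a log scale
# (natural log is an assumption made for this sketch).
RANGES = {
    "logRw": (math.log(0.05), math.log(0.15)),
    "logR": (math.log(100.0), math.log(50000.0)),
    "Tu": (63070.0, 115600.0),
    "Hu": (990.0, 1100.0),
    "Tl": (63.1, 116.0),
    "Hl": (700.0, 820.0),
    "L": (1120.0, 1680.0),
    "Kw": (9855.0, 12045.0),
}


def min_pairwise_distance(points):
    """Smallest Euclidean distance between any two points."""
    return min(math.dist(p, q) for p, q in itertools.combinations(points, 2))


def maximin_design(n_runs, n_factors, n_tries=200, seed=1):
    """Random search: keep the candidate with the largest minimum distance."""
    rng = random.Random(seed)
    best, best_dist = None, -1.0
    for _ in range(n_tries):
        cand = [[rng.random() for _ in range(n_factors)] for _ in range(n_runs)]
        dist = min_pairwise_distance(cand)
        if dist > best_dist:
            best, best_dist = cand, dist
    return best, best_dist


def to_factor_scale(unit_rows, ranges):
    """Map rows from the unit cube onto the factor ranges."""
    names = list(ranges)
    return [
        {n: ranges[n][0] + u * (ranges[n][1] - ranges[n][0]) for n, u in zip(names, row)}
        for row in unit_rows
    ]


unit_design, best_dist = maximin_design(n_runs=32, n_factors=len(RANGES))
design = to_factor_scale(unit_design, RANGES)
print(f"runs: {len(design)}, minimum scaled distance: {best_dist:.3f}")
```

Changing the seed argument changes the design, which mirrors the way the settings that you obtain in JMP can differ from those in the completed table.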
It is important to remember that deterministic data have no random component. The same input values generate the same output. As a result, p-values from fitted statistical models do not have their usual meanings. A large F statistic (low p-value) is an indication of an effect due to a model term. However, you cannot construct valid confidence intervals for effects or model predictions.
Residuals from a model fit to deterministic data are not a measure of noise. Instead, residuals are a measure of the model bias. Bias is the difference between the true value and the predicted value. Distinct patterns in the residuals indicate that additional terms should be considered for the model in order to reduce bias.
Often, the true model is not available in a simple analytical form. As a result, the prediction bias is known only at observed data points. However, in this example, the functional form of the true model is known. In the Borehole Sphere Packing.jmp data table, the true model column contains the formula of the known function. This formula enables you to profile the prediction bias over the factor input region.
1. Select Help > Sample Data Folder and open Design Experiment/Borehole Sphere Packing.jmp.
2. Click the green triangle next to the Model (GP from DOE) script.
Use the Gaussian Process Model report to explore the relationships between the factors and the outcome Y.
3. Click the red triangle next to Gaussian Process Model of Y and select Save Prediction Formula.
4. Go back to the Borehole Sphere Packing.jmp data table.
5. In the data grid, select the column headings for true model and Y Prediction Formula.
6. Right-click and select New Formula Column > Combine > Difference.
This creates a new column containing the bias.
7. From the Borehole Sphere Packing.jmp data table, select Graph > Profiler.
8. Select true model-Y Prediction Formula and click Y, Prediction Formula.
9. Select Expand Intermediate Formulas.
This option shows the bias as a function of the eight design factors.
Figure 22.22 Profiler Launch for Borehole Sphere-Packing Data
10. Click OK.
The profiler defaults to the center of the design region. If there were no bias, every profile trace would be constant across the range of its factor. In this example, the variables logRw, Kw, and L show the largest effects on the bias.
Figure 22.23 Profiler for Bias of the Borehole GP Model with Y Axis Set at -40 to 20
You can use the profiler to explore the range of the prediction bias over the entire domain. To find points of minimum and maximum bias, select Optimization and Desirability > Desirability Functions from the Prediction Profiler red triangle menu. See Desirability Profiling and Optimization in Profilers. To evaluate the prediction bias over the design points, select Analyze > Distribution to see a distribution analysis.
Figure 22.24 Distribution of the Prediction Bias
Keep in mind that, in this example, the true model is known. In many applications, the true response is unknown at factor settings that have not been run. As a result, the prediction bias computed over the experimental data can underestimate the bias throughout the design domain.
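The following Python sketch illustrates why bias measured at the design points can understate bias elsewhere. It computes bias as the true model minus the prediction, matching the Difference column created earlier. Several pieces are assumptions made for illustration: the design is 32 uniformly random points rather than a sphere-packing design, and the surrogate is an inverse-distance-weighted interpolator rather than the Gaussian Process model that JMP fits. Like a Gaussian Process fit to deterministic data without a nugget, this stand-in reproduces the observed responses exactly at the design points.

```python
import math
import random

# Factor order: rw, r, Tu, Hu, Tl, Hl, L, Kw (ranges from the factor list).
LOW = [0.05, 100.0, 63070.0, 990.0, 63.1, 700.0, 1120.0, 9855.0]
HIGH = [0.15, 50000.0, 115600.0, 1100.0, 116.0, 820.0, 1680.0, 12045.0]


def true_model(x):
    """Borehole function in its commonly cited form."""
    rw, r, Tu, Hu, Tl, Hl, L, Kw = x
    log_ratio = math.log(r / rw)
    bracket = 1.0 + 2.0 * L * Tu / (log_ratio * rw**2 * Kw) + Tu / Tl
    return 2.0 * math.pi * Tu * (Hu - Hl) / (log_ratio * bracket)


def scale(x):
    """Rescale each factor to [0, 1] so that distances are comparable."""
    return [(v - lo) / (hi - lo) for v, lo, hi in zip(x, LOW, HIGH)]


def idw_predict(x, scaled_design, responses, power=2):
    """Inverse-distance-weighted prediction (hypothetical stand-in for the GP)."""
    u = scale(x)
    num = den = 0.0
    for point, y in zip(scaled_design, responses):
        d = math.dist(u, point)
        if d == 0.0:
            return y  # exact interpolation at a design point
        w = d ** -power
        num += w * y
        den += w
    return num / den


rng = random.Random(0)


def random_point():
    return [rng.uniform(lo, hi) for lo, hi in zip(LOW, HIGH)]


design = [random_point() for _ in range(32)]
scaled_design = [scale(x) for x in design]
responses = [true_model(x) for x in design]


def bias(x):
    """Bias = true value minus predicted value."""
    return true_model(x) - idw_predict(x, scaled_design, responses)


bias_at_design = [bias(x) for x in design]
bias_in_domain = [bias(random_point()) for _ in range(500)]

print("max |bias| at design points:", max(abs(b) for b in bias_at_design))
print("max |bias| over random points:", max(abs(b) for b in bias_in_domain))
```

In this sketch the bias is zero at every design point by construction, while the bias at new points in the domain is not. That gap is the reason for profiling the bias over the whole factor region when the true model is available.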