This example shows how to fit a zero-inflated Poisson regression model using the Generalized Regression personality of the Fit Model platform. A zero-inflated Poisson distribution is appropriate for count data where responses of zero can come from multiple sources.
In this example, you are analyzing data on the number of fish caught by groups of park visitors. The data table records, for each of 250 groups visiting a park, five factors that might affect the number of fish caught.
During data collection, it was never determined whether anyone in the group had actually fished. However, the hidden Fished column is included in the table to emphasize the point that catching zero fish can happen in one of two ways: Either no one in the group fished, or everyone who fished in the group was unlucky. Therefore, zero responses can come from two sources. To address this issue, you can fit a zero-inflated distribution. Because a Poisson distribution is appropriate for the count data resulting from people who fished, you fit a zero-inflated Poisson distribution.
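The two sources of zeros can be illustrated with a small simulation. The following Python sketch is not part of the JMP workflow. It assumes the usual zero-inflated Poisson form, in which a group fails to fish with probability pi and otherwise catches a Poisson-distributed number of fish with mean lam. The function names and the values lam = 2.0 and pi = 0.6 are hypothetical and are not estimates from Fishing.jmp.

```python
import math
import random


def zip_pmf(k, lam, pi):
    """P(Y = k) for a zero-inflated Poisson distribution.

    lam is the Poisson mean for groups that fished; pi is the
    probability that a group did not fish at all (a structural zero).
    """
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    if k == 0:
        # A zero arises either from not fishing or from fishing unsuccessfully.
        return pi + (1.0 - pi) * poisson
    return (1.0 - pi) * poisson


def simulate_group(lam, pi, rng):
    """Return (fished, fish_caught) for one hypothetical group."""
    fished = rng.random() >= pi
    if not fished:
        return False, 0
    # Draw a Poisson count by inverting the cumulative distribution.
    u = rng.random()
    k = 0
    p = math.exp(-lam)
    cdf = p
    while u > cdf and p > 0.0:
        k += 1
        p *= lam / k
        cdf += p
    return True, k


# Illustrative values only (assumptions, not fitted estimates).
LAM, PI = 2.0, 0.6
rng = random.Random(1)
groups = [simulate_group(LAM, PI, rng) for _ in range(10000)]

zeros_not_fished = sum(1 for fished, n in groups if not fished)
zeros_unlucky = sum(1 for fished, n in groups if fished and n == 0)
print("zeros from groups that did not fish:", zeros_not_fished)
print("zeros from groups that fished:", zeros_unlucky)
print("theoretical P(Y = 0):", zip_pmf(0, LAM, PI))
```

In the simulated data, as in the Fishing.jmp scenario, an analyst who sees only the counts cannot tell the two kinds of zeros apart. The zero-inflated distribution accounts for both.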
1. Select Help > Sample Data Folder and open Fishing.jmp.
2. Select Analyze > Fit Model.
3. Select Fish Caught from the Select Columns list and click Y.
4. Select Live Bait through Children and click Macros > Factorial to Degree.
Terms up to degree 2 (the default in the Degree box) are added to the model.
5. Select Validation from the Select Columns list and click Validation.
6. From the Personality list, select Generalized Regression.
7. From the Distribution list, select ZI Poisson.
8. Click Run.
The Generalized Regression report that appears contains a Model Comparison report, a Model Launch control panel, and a ZI Poisson Maximum Likelihood with Validation Column report. Note that the default estimation method is the Lasso.
9. From the Estimation Method list, select Elastic Net.
10. Click Go.
A ZI Poisson Elastic Net with Validation Column report appears. The Solution Path, the Parameter Estimates for Original Predictors report, and the Effect Tests report indicate that one term is zeroed. The Zero Inflation parameter, whose estimate is shown on the last line of both Parameter Estimates reports, is highly significant. This indicates that some of the variation in the response, Fish Caught, might be due to the fact that some groups did not fish.
Figure 7.6 Parameter Estimates for Original Predictors Report
The Effect Tests report indicates that four terms are significant at the 0.05 level: Live Bait, Fishing Poles, Fishing Poles*Camper, and Fishing Poles*Children.
11. Click the red triangle next to ZI Poisson Elastic Net with Validation Column and select Profilers > Profiler.
12. Click the Prediction Profiler red triangle and select Optimization and Desirability > Desirability Functions.
A function is imposed on the response, which indicates that maximizing the number of Fish Caught is desirable. For more information about desirability functions, see “Desirability Profiling and Optimization” in Profilers.
13. Click the Prediction Profiler red triangle and select Optimization and Desirability > Maximize Desirability.
Figure 7.7 Prediction Profiler with Fish Caught Maximized
You can vary the settings of the predictors to see the impact of the significant effects: Live Bait, Fishing Poles, Fishing Poles*Camper, and Fishing Poles*Children. For example, Live Bait is associated with more fish; a Camper tends to bring more fishing poles than someone who is not camping and therefore catches more fish.
14. Click the red triangle next to ZI Poisson Elastic Net with Validation Column and select Save Columns > Save Prediction Formula and Save Columns > Save Variance Formula.
Two columns are added to the data table: Fish Caught Prediction Formula and Fish Caught Variance.
15. In the data table, right-click either column heading and select Formula to view the formula. Alternatively, click the plus sign to the right of the column name in the Columns panel. Note the appearance of the estimated zero-inflation parameter, 0.781522155, in both of these formulas.
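To see why the zero-inflation parameter appears in both saved formulas, consider the standard mean and variance of a zero-inflated Poisson distribution: the mean is (1 - pi) * lam and the variance is (1 - pi) * lam * (1 + pi * lam). The following Python sketch evaluates these expressions using the estimate quoted above. It assumes that the saved columns follow this standard form and that the Poisson mean comes from a log link. The function names and the linear predictor value of 1.0 are hypothetical and do not come from the fitted model.

```python
import math

# Zero-inflation estimate quoted in the text.
ZERO_INFLATION = 0.781522155


def poisson_mean(linear_predictor):
    """Poisson mean for groups that fished, assuming a log link."""
    return math.exp(linear_predictor)


def zip_mean(lam, pi=ZERO_INFLATION):
    """Mean of a zero-inflated Poisson: (1 - pi) * lam."""
    return (1.0 - pi) * lam


def zip_variance(lam, pi=ZERO_INFLATION):
    """Variance of a zero-inflated Poisson: (1 - pi) * lam * (1 + pi * lam)."""
    return (1.0 - pi) * lam * (1.0 + pi * lam)


# Hypothetical linear predictor value, for illustration only.
lam = poisson_mean(1.0)
print("predicted mean:", zip_mean(lam))
print("predicted variance:", zip_variance(lam))
```

When pi is 0, both expressions reduce to lam, which is the ordinary Poisson case. For any pi between 0 and 1, the variance exceeds the mean, which is how zero inflation accommodates more spread than a Poisson distribution allows.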