Response Screening automates the process of conducting tests across a large number of responses. It tests each response that you specify against each factor that you specify. Response Screening addresses two main issues connected to large-scale data: the need to conduct many tests and the need to deal effectively with outliers and missing values.
Response Screening is available as a platform and as a Fit Model personality. In both cases, it performs tests analogous to those found in the Fit Y by X platform, as shown in Table 24.1. As a personality, it performs tests of the responses against the individual model effects.
To facilitate and support the multiple inferences that are required, Response Screening provides these features:
Data Tables
Results are available in data tables, as well as in a report, to enable you to explore, sort, search, and plot your results. Statistics that facilitate plot interpretation are provided, such as the logworth of p-values (-log10(p-value)).
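As an illustration of the logworth statistic mentioned above, the following minimal Python sketch applies the -log10(p-value) transformation. The function name `logworth` and the sample p-values are hypothetical and are not part of the platform.

```python
import math

def logworth(p_value: float) -> float:
    """Return the logworth, -log10(p), of a p-value in (0, 1]."""
    if not 0.0 < p_value <= 1.0:
        raise ValueError("p-value must be in the interval (0, 1]")
    return -math.log10(p_value)

# Hypothetical p-values: smaller p-values map to larger logworths,
# which spreads highly significant results apart on a plot.
p_values = [0.5, 0.01, 0.0001]
logworths = [logworth(p) for p in p_values]
```

On this scale, a p-value of 0.01 corresponds to a logworth of 2.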
False Discovery Rates
Because you are conducting a large number of tests, you need to control the overall rate of declaring tests significant. Response Screening controls the false discovery rate. The False Discovery Rate (FDR) is the expected proportion of significant tests that are incorrectly declared significant (Benjamini and Hochberg 1995; Westfall et al. 2011).
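To make the idea of FDR control concrete, here is a minimal Python sketch of the Benjamini-Hochberg step-up adjustment from the cited 1995 paper. It is an illustration only: the function name `bh_adjust` and the sample p-values are hypothetical, and it is an assumption that the platform's FDR p-values are computed in exactly this way.

```python
def bh_adjust(p_values):
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, scaling by m / rank and
    # taking a running minimum so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical p-values from four tests.
raw = [0.01, 0.04, 0.03, 0.005]
fdr = bh_adjust(raw)
```

Declaring a test significant when its adjusted p-value is at most 0.05 then controls the FDR at 5% under the assumptions of the Benjamini-Hochberg procedure.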
Tests of Practical Significance
When your data table consists of a large number of rows (large n), the standard error used in testing can be very small. As a result, tests might be statistically significant when, in fact, the observed difference is too small to be of practical consequence. To address this issue, you can define an effect size that you consider to be of practical significance. You then conduct tests of practical significance, thereby detecting only effects large enough to be of pragmatic interest.
Equivalence Tests
When you are studying many factors, you are often interested in those that have essentially equivalent effects on the response. In this case, you can specify an effect size that defines practical equivalence and then conduct equivalence tests.
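The difference between a test of practical significance and an equivalence test can be sketched in Python as follows. This is a simplified illustration under a normal approximation, not the platform's computation: the function names, the `delta` parameter (the effect size of practical interest), and the sample numbers are all hypothetical. The equivalence test uses the two one-sided tests (TOST) approach, which is an assumption about method.

```python
from statistics import NormalDist

_Z = NormalDist()  # standard normal distribution

def practical_significance_p(diff, se, delta):
    """Approximate p-value for H0: |true difference| <= delta."""
    # One-sided test against the nearer boundary of the null region.
    return 1.0 - _Z.cdf((abs(diff) - delta) / se)

def equivalence_p(diff, se, delta):
    """TOST p-value for H0: |true difference| >= delta."""
    p_lower = 1.0 - _Z.cdf((diff + delta) / se)  # H0: difference <= -delta
    p_upper = _Z.cdf((diff - delta) / se)        # H0: difference >= +delta
    return max(p_lower, p_upper)

# Hypothetical result: a small difference estimated very precisely.
diff, se, delta = 0.2, 0.05, 1.0
p_practical = practical_significance_p(diff, se, delta)  # near 1
p_equivalent = equivalence_p(diff, se, delta)            # near 0
```

In this example the difference is four standard errors from zero, so an ordinary test would flag it, yet it is far smaller than `delta`: the practical significance test does not reject, while the equivalence test does.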
To address issues that arise when dealing with messy data, Response Screening provides features to deal with outliers and missing data. These features enable you to analyze your data directly, without expending effort to address data quality issues:
Robust Estimation
Outliers in your data increase estimates of standard error, causing tests to be insensitive to real effects. Select the Robust option to conduct Huber M-estimation. Outliers remain in the data, but the sensitivity of tests to these outliers is reduced.
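The effect of Huber M-estimation can be illustrated with a minimal Python sketch that estimates a location (a robust mean) by iteratively reweighted averaging. This is not the platform's implementation: the function name `huber_location`, the tuning constant `k=1.345` (a commonly used default), the fixed scale based on the median absolute deviation, and the sample data are all assumptions made for illustration.

```python
from statistics import mean, median

def huber_location(values, k=1.345, tol=1e-8, max_iter=100):
    """Estimate a robust center using Huber weights and a fixed MAD scale."""
    mu = median(values)
    mad = median(abs(v - mu) for v in values)
    # Rescale the MAD to be comparable to a standard deviation;
    # fall back to 1.0 if more than half the values are identical.
    scale = mad / 0.6745 if mad > 0 else 1.0
    cutoff = k * scale
    for _ in range(max_iter):
        # Points within the cutoff keep full weight; points beyond it
        # are downweighted in proportion to their distance.
        weights = [1.0 if abs(v - mu) <= cutoff else cutoff / abs(v - mu)
                   for v in values]
        new_mu = sum(w * v for w, v in zip(weights, values)) / sum(weights)
        if abs(new_mu - mu) < tol:
            return new_mu
        mu = new_mu
    return mu

# Hypothetical data with one gross outlier.
data = [9.8, 10.1, 10.0, 9.9, 10.2, 50.0]
robust_center = huber_location(data)
ordinary_mean = mean(data)
```

The outlier stays in the data but receives a small weight, so the robust estimate remains close to the bulk of the values while the ordinary mean is pulled toward the outlier.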
Missing Value Options
The platform contains an option to treat missing values in categorical predictors in an informative fashion.
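One common way to treat missing categorical values informatively is to code them as a level of their own, so that rows with missing predictors still contribute to the test. The Python sketch below illustrates that idea only. It is an assumption that the platform's option works this way, and the function name `informative_levels` and the label `"Missing"` are hypothetical.

```python
def informative_levels(values, missing_label="Missing"):
    """Recode missing entries (None or empty string) as their own level."""
    return [missing_label if v is None or v == "" else v for v in values]

# Hypothetical categorical predictor with two missing entries.
machine = ["A", None, "B", "", "A"]
recoded = informative_levels(machine)
levels = sorted(set(recoded))
```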
| Response | Factor | Fit Y by X Analysis | Description |
|---|---|---|---|
| Continuous | Categorical | Oneway | Analysis of Variance |
| Continuous | Continuous | Bivariate | Simple Linear Regression |
| Categorical | Categorical | Contingency | Chi-Square |
| Categorical | Continuous | Logistic | Simple Logistic Regression |
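The correspondence in the table can be expressed as a simple lookup keyed on the modeling types of the response and the factor. The Python sketch below only restates the table. The names `ANALYSES` and `analysis_for` are hypothetical and do not correspond to any platform function.

```python
# (response type, factor type) -> (Fit Y by X analysis, description)
ANALYSES = {
    ("continuous", "categorical"): ("Oneway", "Analysis of Variance"),
    ("continuous", "continuous"): ("Bivariate", "Simple Linear Regression"),
    ("categorical", "categorical"): ("Contingency", "Chi-Square"),
    ("categorical", "continuous"): ("Logistic", "Simple Logistic Regression"),
}

def analysis_for(response_type, factor_type):
    """Look up the analysis for a response/factor modeling-type pair."""
    return ANALYSES[(response_type.lower(), factor_type.lower())]

analysis, description = analysis_for("Continuous", "Categorical")
```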
The Response Screening platform generates a section of plots and a section of reports. The FDR PValue Plot and the Result Table are shown by default. The Response Screening personality generates a report that contains an Effect Tests table, the FDR PValue Plot for Effects, and the FDR Logworth by Effect Size plot.
The JSL command Summarize YByX performs the same analysis as the Response Screening platform but without creating a platform window. See Summarize YByX(X(<x columns>), Y(<y columns>), Group(<grouping columns>), Freq(<freq column>), Weight(<weight column>)) in the JSL Syntax Reference.