Response Screening automates the process of conducting tests across a large number of responses. It tests each response that you specify against each factor that you specify. Response Screening addresses two main issues connected with large-scale data: the need to conduct many tests, and the need to deal effectively with outliers and missing values.
Response Screening is available as a platform and as a Fit Model personality. As a platform, it performs tests analogous to those found in the Fit Y by X platform, as shown in Table 22.1. As a personality, it performs the analogous tests of the response against the individual model effects.
To facilitate and support the multiple inferences that are required, Response Screening provides these features:
Data Tables
Results are shown in data tables, as well as in a report, to enable you to explore, sort, search, and plot your results. Statistics that facilitate plot interpretation are provided, such as the LogWorth of p-values.
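As a point of reference, the LogWorth of a p-value is its negative base-10 logarithm, so small p-values become large positive numbers that are easier to read on a plot. The following Python sketch illustrates the transformation only; the function name `logworth` is hypothetical and is not part of JMP or JSL.

```python
import math


def logworth(p_value: float) -> float:
    """Return -log10(p), the LogWorth of a p-value.

    Hypothetical helper for illustration; not a JMP function.
    """
    if not 0.0 < p_value <= 1.0:
        raise ValueError("p-value must be in the interval (0, 1]")
    return -math.log10(p_value)


# Smaller p-values give larger LogWorth values.
p_values = [0.05, 0.01, 0.0001]
logworths = [logworth(p) for p in p_values]
```

On this scale, a p-value of 0.01 corresponds to a LogWorth of 2, and each further factor of ten in the p-value adds one unit.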
False Discovery Rates
Because you are conducting a large number of tests, you need to control the overall rate at which tests are incorrectly declared significant. Response Screening controls the false discovery rate. The False Discovery Rate (FDR) is the expected proportion of significant tests that are incorrectly declared significant (Benjamini and Hochberg 1995; Westfall et al. 2011).
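To make the idea concrete, here is a minimal Python sketch of FDR-adjusted p-values, assuming the step-up procedure from the cited Benjamini and Hochberg (1995) paper. The function name `bh_adjust` is hypothetical, and the sketch is not a description of the platform's internal code.

```python
def bh_adjust(p_values):
    """Return Benjamini-Hochberg adjusted p-values, in the input order.

    Hypothetical helper for illustration; not a JMP function.
    """
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that the
    # adjusted values never increase as the raw p-values decrease.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


raw = [0.01, 0.04, 0.03, 0.005]
fdr = bh_adjust(raw)
```

Declaring significant every test whose adjusted p-value is at most 0.05 is one common way to hold the FDR at 5% under the assumptions of that procedure.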
Tests of Practical Significance
When you have many observations, even small effects that are of no practical consequence can result in statistical significance. To address this issue, you can define an effect size that you consider to be of practical significance. You then conduct tests of practical significance, thereby only detecting effects large enough to be of pragmatic interest.
Equivalence Tests
When you are studying many factors, you are often interested in those that have essentially equivalent effects on the response. In this case, you can specify an effect size that defines practical equivalence and then conduct equivalence tests.
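One common way to test equivalence is the two one-sided tests (TOST) approach. The Python sketch below assumes a large-sample normal approximation and takes an estimated difference, its standard error, and the effect size that defines practical equivalence. The function name `tost_p_value` and its parameters are hypothetical, and the platform's own calculation may differ.

```python
from statistics import NormalDist


def tost_p_value(diff: float, std_err: float, practical_diff: float) -> float:
    """Return the TOST p-value for H0: |true difference| >= practical_diff.

    Hypothetical helper for illustration; uses a normal approximation.
    """
    if std_err <= 0 or practical_diff <= 0:
        raise ValueError("std_err and practical_diff must be positive")
    normal = NormalDist()
    # One-sided test that the true difference exceeds -practical_diff.
    p_lower = 1.0 - normal.cdf((diff + practical_diff) / std_err)
    # One-sided test that the true difference is below +practical_diff.
    p_upper = normal.cdf((diff - practical_diff) / std_err)
    # Both one-sided tests must reject, so report the larger p-value.
    return max(p_lower, p_upper)


p_near_zero = tost_p_value(diff=0.1, std_err=0.2, practical_diff=1.0)
p_near_bound = tost_p_value(diff=0.9, std_err=0.2, practical_diff=1.0)
```

A small p-value supports practical equivalence. In this example the difference of 0.1 lies well inside the equivalence bounds, whereas the difference of 0.9 is too close to the bound to support that conclusion.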
To address issues that arise when dealing with messy data, Response Screening provides features to deal with outliers and missing data. These features enable you to analyze your data directly, without expending effort to address data quality issues:
Robust Estimation
Outliers in your data increase estimates of standard error, causing tests to be insensitive to real effects. Select the Robust option to conduct Huber M-estimation. Outliers remain in the data, but the sensitivity of tests to these outliers is reduced.
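The following Python sketch shows the idea behind Huber M-estimation in its simplest case, estimating a single location (a robust mean). It assumes a scale fixed at the median absolute deviation and the conventional tuning constant 1.345. The function name `huber_location` is hypothetical, and the models that the platform fits are more general than this.

```python
import statistics


def huber_location(values, k=1.345, tol=1e-9, max_iter=100):
    """Huber M-estimate of location by iteratively reweighted averaging.

    Hypothetical helper for illustration; not a JMP function.
    """
    values = list(values)
    mu = statistics.median(values)
    mad = statistics.median([abs(v - mu) for v in values])
    # Rescale the MAD so that it is comparable to a standard
    # deviation when the data are normally distributed.
    scale = mad / 0.6745
    if scale == 0:
        return mu
    for _ in range(max_iter):
        weights = []
        for v in values:
            r = abs(v - mu) / scale
            # Full weight for small residuals, reduced weight for large ones.
            weights.append(1.0 if r <= k else k / r)
        new_mu = sum(w * v for w, v in zip(weights, values)) / sum(weights)
        if abs(new_mu - mu) < tol:
            return new_mu
        mu = new_mu
    return mu


data = [9.8, 10.1, 10.0, 9.9, 10.2, 50.0]
robust_center = huber_location(data)
ordinary_mean = statistics.mean(data)
```

The outlying value 50.0 stays in the data but receives a small weight, so the robust estimate remains close to the bulk of the observations while the ordinary mean is pulled toward the outlier.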
Missing Value Options
The platform contains an option to treat missing values on categorical predictors in an informative fashion.
| Response | Factor | Fit Y by X Analysis | Description |
|---|---|---|---|
| Continuous | Categorical | Oneway | Analysis of Variance |
| Continuous | Continuous | Bivariate | Simple Linear Regression |
| Categorical | Categorical | Contingency | Chi-Square |
| Categorical | Continuous | Logistic | Simple Logistic Regression |
The Response Screening platform generates a report and a data table: the Response Screening report and the PValues table. The Response Screening personality generates a report and two data tables: the Fit Response Screening report, the PValues table, and the Y Fits table.
The JSL command Summarize YByX performs the same function as the Response Screening platform but without creating a platform window. See Summarize YByX(X(<x columns>), Y(<y columns>), Group(<grouping columns>), Freq(<freq column>), Weight(<weight column>)) in the JSL Syntax Reference.