Digit Preference
This analysis assesses how study sites use terminal digits (either the first or the last digit of numeric findings) when reporting their clinical findings. It can be used to identify sites that might exhibit rounding biases or other problems in how they report data, compared with all other sites in the study.
Running Digit Preference for Nicardipine using default settings generates the report shown below.
Digit Preference Volcano Plot
In this Volcano Plot, each point represents the comparison of one site to all other sites for one findings test. The comparison determines whether the distribution of the last digit differs between that site and the others. It is performed for all sites across all tests with numeric data in all findings domains.
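As a rough illustration of the comparison behind each point, the sketch below tabulates last-digit counts for one suspect site and for all other sites combined. The record layout (pairs of site ID and result string) and the function name `last_digit_counts` are hypothetical and chosen only for this example; they are not part of JMP Clinical.

```python
def last_digit_counts(records, suspect_site):
    """Tabulate last-digit counts (0-9) for one site and for all other sites.

    records: iterable of (site_id, result_string) pairs (hypothetical layout).
    Returns two lists of length 10: counts for the suspect site and
    counts for the reference group (every other site combined).
    """
    suspect = [0] * 10
    reference = [0] * 10
    for site, result in records:
        digits = [ch for ch in result if ch in "0123456789"]
        if not digits:
            continue  # skip results that contain no digits at all
        target = suspect if site == suspect_site else reference
        target[int(digits[-1])] += 1
    return suspect, reference


# Made-up diastolic pressure readings for three sites.
records = [
    ("16", "80"), ("16", "90"), ("16", "70"),
    ("01", "82"), ("01", "77"), ("02", "80"),
]
suspect, reference = last_digit_counts(records, "16")
print(suspect)
print(reference)
```

The two count vectors are the inputs that a statistical comparison of the suspect site against the reference would work from.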
The Y axis is the -log10(Raw Row Mean Score p-value). The row mean score test takes advantage of the ordinality of the final digit value. This test uses standardized midrank scores in case there are gaps due to certain digit values not being present. Midranks are a way of scoring the columns when the distance between levels does not necessarily have a practical interpretation. Large numbers on the Y axis indicate statistically significant results.
The X axis is the maximum percent difference (see footnote 1) across all digits between a site and all other sites.
Values far from 0 indicate important differences between a site and the reference distribution of all other sites. The FDR threshold (alpha=0.05) is indicated by the dotted red line. Values above this line can be considered significant after adjusting for multiple comparisons. Such values could identify rounding issues or other problems with how a site reports a particular test compared to other sites.
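The X-axis quantity can be sketched directly from its definition in footnote 1: convert each group's digit counts to percentages, take the digit-by-digit difference (suspect minus reference), and keep the largest. The function names below are hypothetical, and the sketch follows the footnote literally; it is not taken from the JMP Clinical implementation.

```python
def digit_percents(counts):
    """Convert digit counts (index = digit 0-9) to percentages of the total."""
    total = sum(counts)
    if total == 0:
        raise ValueError("no records to summarize")
    return [100.0 * c / total for c in counts]


def max_percent_difference(suspect_counts, reference_counts):
    """Largest value of p(digit in suspect) - p(digit in reference) over all digits."""
    suspect_pct = digit_percents(suspect_counts)
    reference_pct = digit_percents(reference_counts)
    return max(s - r for s, r in zip(suspect_pct, reference_pct))


# Made-up counts: the suspect site reports a last digit of 0 far more often.
suspect_counts = [40, 10, 10, 10, 10, 10, 10, 0, 0, 0]
reference_counts = [10] * 10
print(max_percent_difference(suspect_counts, reference_counts))
```

In this made-up example the largest gap occurs at digit 0, where the suspect site reports 40% of its values against 10% in the reference group.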
Test Results
The Laboratory Test Results section contains the following elements:
Shows the number of tests performed at each study site, the numbers of each type of test, and the subjects delineated by the specified digit (first or last) of the subjects’ USUBJID. Selecting any group (for example, all tests for subjects whose USUBJID ends with “1”), as shown above, highlights those tests across all distributions. There is a separate set of distributions for each Findings domain.
One Data Filter.
Enables you to subset subjects based on study site, test, and digit. Refer to Data Filter for more information.
Action Buttons
Action buttons provide you with an easy way to drill down into your data. The following action buttons are generated by this report:
Show Sites: Shows the rows of the data table for the selected points from the volcano plot. Use your mouse to select one or more sites of interest before clicking this button, as shown below:
Clicking Show Sites opens the following table:
Digit Bar Charts: Clicking this button displays a bar chart comparing the last digit distribution of the selected sites with that of all other sites, for the points selected in the table. This lets you see just how different each site is for a particular test. The underlying data table is available by going to Script > Data Table. The following chart shows the sites/tests selected above:
In this example, site 16 shows a marked preference for reporting diastolic pressure with a terminal digit of “0”.
General
Click to generate a standardized PDF- or RTF-formatted report containing the plots and charts of selected sections.
Click the Options arrow to reopen the completed report dialog used to generate this output.
Click the gray border to the left of the Options tab to open a dynamic report navigator that lists all of the reports in the review. Refer to Report Navigator for more information.
Methodology
For each test, the observed distribution of the last or first digit at each site (the suspect site, indexed by s) is compared with that of all other sites taken together as a reference (indexed by o).
 
The distributions are compared using a row mean score chi-square test (Stokes et al., 2012; see footnote 2) to take advantage of the ordinality of the column variable.
Scores are based on standardized midranks, which are often used when column values cannot necessarily be considered equally spaced (which tends to happen if not all digits are present).
FDR p-values are calculated and the reference line is determined as described in How does JMP Clinical calculate the False Discovery Rate (FDR)?.
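The row mean score comparison described above can be sketched for a single test as a 2 x 10 table (suspect site versus reference, by digit). This is a minimal illustration under stated assumptions, not JMP Clinical's implementation: the function names are hypothetical, standardized midranks are taken to be midranks divided by (N + 1), and the statistic is the usual single-stratum row mean score statistic with 1 degree of freedom, Q = (N - 1) n_s (f_s - mu)^2 / (n_o v), where f_s is the suspect site's mean score and mu and v are the mean and variance of the scores over all N records.

```python
import math


def standardized_midranks(column_totals):
    """Midrank of each column divided by (N + 1); empty columns get zero weight later."""
    n = sum(column_totals)
    scores, below = [], 0
    for c in column_totals:
        scores.append((below + (c + 1) / 2) / (n + 1))
        below += c
    return scores


def row_mean_score_test(suspect, reference):
    """Return (Q, p) comparing mean digit scores of a suspect site and the reference."""
    cols = [s + r for s, r in zip(suspect, reference)]
    n, n_s, n_o = sum(cols), sum(suspect), sum(reference)
    if n_s == 0 or n_o == 0:
        raise ValueError("both groups need at least one record")
    a = standardized_midranks(cols)
    mu = sum(aj * c for aj, c in zip(a, cols)) / n
    var = sum((aj - mu) ** 2 * c for aj, c in zip(a, cols)) / n
    if var == 0:
        raise ValueError("all records share one digit; the test is undefined")
    f_s = sum(aj * x for aj, x in zip(a, suspect)) / n_s
    q = (n - 1) * n_s * (f_s - mu) ** 2 / (n_o * var)
    # Upper tail of a chi-square with 1 df: P(X > q) = erfc(sqrt(q / 2)).
    p = math.erfc(math.sqrt(q / 2))
    return q, p


# Made-up counts for digits 0-9: the suspect site favors low digits.
suspect = [8, 2, 0, 0, 0, 0, 0, 0, 0, 0]
reference = [2, 8, 0, 0, 0, 0, 0, 0, 0, 0]
q, p = row_mean_score_test(suspect, reference)
print(q, p, -math.log10(p))
```

The last printed value corresponds to the kind of quantity plotted on the volcano plot's Y axis. Because the scores enter only through differences from their mean, any linear rescaling of the midranks gives the same Q.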
Report Options
Findings Analysis
By default, the report is set to Analyze all tests from all findings domains. You can opt, however, to restrict the search to specific Findings Tests.
Use the Analyze: option to specify whether to search all results in original units or to restrict the search to either character results (in standard format) or numeric results (in standard units).
You might or might not want to include unscheduled visits when you are analyzing findings by visit. Check the Remove unscheduled visits option to exclude unscheduled visits.
You can opt to Consider BY variables in the analysis. This option, which assumes that BY variables (left vs. right arm for collecting blood pressure data, for example) are included in the experimental design, is selected by default. You can uncheck this option to ignore BY variables.
Use the Only include BY variables if they are domain keys option to subset the available variables to only those that are domain keys. If the option is unchecked, the report uses the cross-classification of xxCAT, xxSCAT, xxLOC, xxMETHOD, xxPOS, xxSPEC, and xxTPT to create BY groups for all variables that are available (as it did in the past).
The Preserve trailing zero in decimals for result in original units option enables you to preserve the trailing zero in decimals for results in original units. When this option is left unchecked, the trailing 0 is deleted and the value with one fewer decimal place is considered.
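The effect of the trailing-zero option on the analyzed digit can be illustrated with a small sketch. The function `last_digit` and its `preserve_trailing_zero` parameter are hypothetical names, and the sketch assumes that exactly one trailing zero is dropped from a decimal value when the option is unchecked, which is a literal reading of the description above rather than confirmed behavior.

```python
def last_digit(result, preserve_trailing_zero=True):
    """Return the last digit of a result string, or None if it has no digits.

    When preserve_trailing_zero is False and the value is a decimal ending
    in 0, one trailing zero is removed first (assumed behavior).
    """
    text = result.strip()
    if not preserve_trailing_zero and "." in text and text.endswith("0"):
        text = text[:-1]  # "7.40" becomes "7.4"
    digits = [ch for ch in text if ch in "0123456789"]
    return int(digits[-1]) if digits else None


print(last_digit("7.40", preserve_trailing_zero=True))
print(last_digit("7.40", preserve_trailing_zero=False))
print(last_digit("120", preserve_trailing_zero=False))
```

Under these assumptions, a result of 7.40 contributes a last digit of 0 when the option is checked and 4 when it is not, while an integer such as 120 is unaffected.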
The Summarize sites with at least this many subjects: option enables you to set a minimum threshold for the sites to be analyzed. Only sites with at least the specified number of subjects are included. This feature is useful because it enables you to exclude smaller sites, where small differences due to random events are more likely to appear more significant than they truly are. In larger sites, random events are less likely to produce large deviations, so observed differences from the reference distribution are more likely to be meaningful.
You can also specify whether to analyze the first or last digits of any of the numeric findings. See Analyze these digits: for more information.
The Alpha option is used to specify the significance level by which to judge the validity of the results generated by this report. By definition, alpha represents the probability that you will reject the null hypothesis when the null is, in fact, true. Alpha can be set to any number between 0 and 1, but is most typically set at 0.01, 0.05, or 0.10. The higher the alpha, the lower your confidence that the results you observe are correct.
Filtering the Data:
Filters enable you to restrict the analysis to a specific subset of subjects and/or adverse events, based on values within variables. You can also filter based on population flags (Safety is selected by default) within the study data.
See Select the analysis population, Select saved subject Filter (see footnote 3), and Additional Filter to Include Subjects.
The Subset of Visits to Analyze option enables you to restrict the analysis to a specific subset of visits.

1
This is the maximum of (p(0 in suspect) - p(0 in reference), p(1 in suspect) - p(1 in reference), … p(9 in suspect) - p(9 in reference)), where p(x in group) is the percent of that group's records with digit x, suspect is the site in question, and reference is all other sites combined.

2
Stokes ME, Davis CS, Koch GG. (2012). Categorical Data Analysis Using SAS, Third Edition. Cary, NC: SAS Institute, Inc.

3
Subject-specific filters must be created using the Create Subject Filter report prior to your analysis.