Digit Preference

This analysis assesses how study sites use leading or terminal digits (the first or last digit of a numeric finding) when reporting their clinical findings. It can be used to identify sites whose reporting suggests rounding biases or other problems relative to all other sites in the study.

Running Digit Preference for Nicardipine using default settings generates the report shown below.

The Report contains the following elements:

• Digit Preference Volcano Plot: Each point represents the comparison of a site to all other sites. The comparison determines whether the distribution of the last digit differs for a findings test with numeric data available. It is performed for all sites, across all tests in all findings domains.

The y-axis is the -log10(Raw Row Mean Score p-value). The row mean score test takes advantage of the ordinality of the final digit value. It uses standardized midrank scores in case there are gaps caused by certain digit values not being present. Midranks are a way of scoring the columns when the distance between levels does not necessarily have a practical interpretation. Large values on the y-axis indicate statistically significant results.

The x-axis is the maximum percent difference¹ across all digits between a site and all other sites.

Values far from 0 indicate important differences between a site and the reference distribution of all other sites. The dotted red line marks the FDR (alpha=0.05) threshold. Points above this line can be considered significant after adjusting for multiple comparisons. Such points could indicate rounding issues or other problems with how a site reports a particular test compared to other sites.

• Test Results (Digit Preference): One or more sections displaying distributions for subjects’ tests across study sites. There is a separate set of distributions for each Findings domain.
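As an illustration of how the two coordinates of a single volcano plot point could be derived, the following Python sketch computes the maximum percent difference from per-digit counts and applies the -log10 transform to a p-value that is taken as given. The function names and the example counts are hypothetical and are not part of JMP Clinical.

```python
import math


def percent_distribution(counts):
    """Convert ten per-digit record counts (digits 0-9) to percentages."""
    total = sum(counts)
    if total == 0:
        raise ValueError("no records to summarize")
    return [100.0 * c / total for c in counts]


def max_percent_difference(suspect_counts, reference_counts):
    """x-axis value: largest (suspect - reference) percent difference over digits.

    suspect_counts   -- counts for the site in question
    reference_counts -- counts for all other sites combined
    """
    p_suspect = percent_distribution(suspect_counts)
    p_reference = percent_distribution(reference_counts)
    return max(s - r for s, r in zip(p_suspect, p_reference))


def volcano_y(p_value):
    """y-axis value: -log10 of the raw p-value (p_value must be > 0)."""
    return -math.log10(p_value)
```

For example, a hypothetical site with counts [40, 5, 10, 5, 5, 15, 5, 5, 5, 5] compared with a reference of 100 records per digit reports 40% of its values ending in 0 against 10% elsewhere, so max_percent_difference returns 30 percentage points.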

Drill Down Buttons

Drill down buttons provide you with an easy way to drill down into your data. The following drill down buttons are generated by this report:

• Show Sites: Shows the rows of the data table for the selected points from the volcano plot. Use your mouse to select one or more sites of interest before clicking this button, as shown below:

Clicking opens the following table:

• Digit Bar Charts: Clicking displays a bar chart comparing the last digit distribution of the selected sites with that of all other sites, for the points selected in the table. This lets you see how different each site is for a particular test. The underlying data table is available by selecting Script > Data Table Window. The following chart shows the sites/tests selected above:

In this example, site 16 shows a marked preference for reporting diastolic pressure with a terminal digit of “0”.

General

• Click to view the associated data tables. Refer to View Data for more information.

• Click to generate a standardized PDF- or RTF-formatted report containing the plots and charts of selected sections.

• Click to take notes, and store them in a central location. Refer to Add Notes for more information.

• Click to read user-generated notes. Refer to View Notes for more information.

• Click the Options arrow to reopen the completed report dialog used to generate this output.

• Click the gray border to the left of the Options tab to open a dynamic report navigator that lists all of the reports in the review. Refer to Report Navigator for more information.

Methodology

For each test, the observed distribution of the last (or first) digit at each site (the suspect site, indexed with s) is compared with that of all other sites taken together as a reference (indexed with o).

The comparison is laid out as a 2 x 10 table, with one row per group and one column per digit:

          0   1   2   3   4   5   6   7   8   9
Suspect
Others
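The suspect-versus-others layout above can be sketched in Python as follows. This is a minimal illustration, not the JMP Clinical implementation: the function names are hypothetical, results are assumed to arrive as (site, reported value text) pairs, and the last digit is assumed to be the final digit character of the text as reported. How the product treats trailing zeros or decimals is not stated here.

```python
from collections import Counter

ASCII_DIGITS = "0123456789"


def last_digit(value_text):
    """Return the final digit character of a reported value as an int.

    Returns None when the text contains no digit at all.
    Assumption: the value is inspected as reported, e.g. "76.5" -> 5.
    """
    digits = [ch for ch in value_text if ch in ASCII_DIGITS]
    return int(digits[-1]) if digits else None


def digit_table(records, suspect_site):
    """Build the two rows of the 2 x 10 table for one test.

    records      -- iterable of (site, value_text) pairs (hypothetical layout)
    suspect_site -- identifier of the site being compared to all others
    Returns (suspect_counts, others_counts), each a list indexed by digit 0-9.
    """
    suspect, others = Counter(), Counter()
    for site, value_text in records:
        digit = last_digit(value_text)
        if digit is None:
            continue  # skip results with no numeric content
        if site == suspect_site:
            suspect[digit] += 1
        else:
            others[digit] += 1
    return ([suspect[d] for d in range(10)],
            [others[d] for d in range(10)])
```

Repeating digit_table once per site, with that site as the suspect, yields one table per site and test.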

The two distributions are compared using a row mean score chi-square test (Stokes et al., 2012)² to take advantage of the ordinality of the column variable.

Scores are based on standardized midranks, which are often used when column values cannot necessarily be considered equally spaced (which tends to happen if not all digits are present).
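For readers who want to see the arithmetic, the following Python sketch implements a mean score chi-square statistic for a two-row table with standardized midrank scores, following the general form given in Stokes et al. It is an illustration under stated assumptions rather than the product's code: the function names are hypothetical, the standardized midrank is taken to be the column midrank divided by (n + 1), and the p-value uses the chi-square distribution with 1 degree of freedom.

```python
import math


def standardized_midranks(col_totals):
    """Score each column by its midrank divided by (n + 1).

    The midrank of a column is the average rank its records would receive
    if all n records were ranked by column; empty columns carry no weight.
    """
    n = sum(col_totals)
    scores, below = [], 0
    for count in col_totals:
        scores.append((below + (count + 1) / 2) / (n + 1))
        below += count
    return scores


def row_mean_score_test(suspect, others):
    """Return (Q, p) comparing the suspect row's mean score to expectation."""
    col_totals = [s + o for s, o in zip(suspect, others)]
    n, n_s = sum(col_totals), sum(suspect)
    if n_s == 0 or n_s == n:
        raise ValueError("both rows need at least one record")
    a = standardized_midranks(col_totals)
    mean_s = sum(aj * c for aj, c in zip(a, suspect)) / n_s       # observed
    mu = sum(aj * c for aj, c in zip(a, col_totals)) / n          # expected
    v = sum((aj - mu) ** 2 * c for aj, c in zip(a, col_totals)) / n
    if v == 0:
        return 0.0, 1.0  # every record shares one digit: nothing to compare
    q = (n - 1) * n_s * (mean_s - mu) ** 2 / ((n - n_s) * v)
    p = math.erfc(math.sqrt(q / 2))  # upper tail of chi-square with 1 df
    return q, p
```

Because the statistic depends only on the mean score of the suspect row, it is sensitive to shifts toward lower or higher digits and less so to patterns that leave the mean unchanged.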

FDR p-values are calculated and the reference line is determined as described in How does JMP Clinical calculate the False Discovery Rate (FDR)?.
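The exact FDR procedure is defined in the referenced topic and is not restated here. Purely as an illustration, the sketch below assumes the common Benjamini-Hochberg step-up adjustment and places the reference line at the -log10 of the largest raw p-value whose adjusted value is at most alpha. Both the choice of method and the function names are assumptions.

```python
import math


def bh_adjust(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone adjusted values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def reference_line(p_values, alpha=0.05):
    """-log10 of the largest raw p-value still significant after adjustment.

    Returns None when no comparison is significant at the given alpha.
    """
    adjusted = bh_adjust(p_values)
    significant = [p for p, adj in zip(p_values, adjusted) if adj <= alpha]
    if not significant:
        return None
    return -math.log10(max(significant))
```

Under this assumption, points plotted at or above the returned y value correspond to comparisons that remain significant after the adjustment.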

¹ This is the maximum of (p(0 in suspect) - p(0 in reference), p(1 in suspect) - p(1 in reference), …, p(9 in suspect) - p(9 in reference)), where p(x) is the percent of records, suspect is the site in question, and reference is all other sites that are not the site in question.

² Stokes ME, Davis CS, Koch GG. (2012). Categorical Data Analysis Using SAS, Third Edition. Cary, NC: SAS Institute, Inc.