Duplicate Records
This report identifies sets of records that have identical values on more than one occasion within a subject or between subjects within a study site. This report identifies records based on USUBJID and the following covariates (if available): visit number (VISITNUM), location (xxLOC), method (xxMETHOD), position (xxPOS), specimen (xxSPEC), and planned time point (xxTPT).
Report Results Description
Running Duplicate Records for Nicardipine using default settings generates the Report shown below.
The Report contains a section for each findings domain that contains the variable specified by the Analyze: option on the dialog. Data that are summarized are those test codes that are constant throughout the course of the study. Each results section contains a Data Filter.
The Vital Signs section is shown below:
Vital Signs
The Vital Signs section contains the following elements:
One or more Histograms.
The data table underlying these histograms represents records that have at least one duplicate within or between subjects in the trial (depending on options). Therefore, Unique Subject Identifier shows the number of subject records that occur in duplicate bins. Non-Missing Components helps identify duplicate records with a majority of nonmissing data. Country, Study Site, and Treatment Arm are provided to show where these duplicates occur most frequently.
See Distribution for more information.
Using the Show Subjects button shows the records for selected subjects. This details the records that are duplicated within the subject or across other subjects.
To see duplicates, select one group (shown above) and click to view the duplicates in the associated data table. In these cases, each subject’s Diastolic, Systolic, and Heart Rate (highlighted columns) values are identical. It is unlikely that these sets of values would be the same across subjects (or would even repeat identically within a subject).
One Data Filter.
Enables you to subset subjects based on study site, test, and digit. Refer to Data Filter for more information.
 
Action Buttons
Action buttons provide you with an easy way to drill down into your data. The following action buttons are generated by this report:
Show Subjects: Select subjects and click to open and subset the underlying data table to the selected subjects.
Show Duplicates: Select one group and click to view the duplicate records for that group in the associated data table.
General
Click to generate a standardized PDF- or RTF-formatted report containing the plots and charts of selected sections.
Click to read user-generated notes. Refer to View Notes for more information.
Click the Options arrow to reopen the completed report dialog used to generate this output.
Click the gray border to the left of the Options tab to open a dynamic report navigator that lists all of the reports in the review. Refer to Report Navigator for more information.
Methodology
No statistical tests are performed. To determine sets of records, this report uses Unique Subject Identifier (USUBJID); visit number (VISITNUM); BY values based on xxCAT, xxSCAT, xxLOC, xxMETHOD, xxPOS, xxSPEC, and xxTPT (if selected); and date-time of collection (xxDTC). It then identifies the sets of tests that have identical values.
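As a rough illustration of this grouping (not the report's actual implementation), the sketch below builds one set of test results per subject, visit, BY-value combination, and collection date-time, and then returns the sets whose values occur more than once. The record layout, the field names `by`, `DTC`, `TESTCD`, and `value`, and the helper `find_duplicate_sets` are hypothetical.

```python
from collections import defaultdict

def find_duplicate_sets(records):
    """Return {signature: [set keys]} for value signatures shared by more than one set."""
    # Step 1: collect each set's results, keyed by subject, visit, BY values, date-time.
    sets = defaultdict(dict)
    for r in records:
        key = (r["USUBJID"], r["VISITNUM"], r.get("by", ()), r["DTC"])
        sets[key][r["TESTCD"]] = r["value"]

    # Step 2: bin the sets by their (test, value) signature.
    bins = defaultdict(list)
    for key, results in sets.items():
        signature = tuple(sorted(results.items()))
        bins[signature].append(key)

    # Step 3: keep only signatures that occur in more than one set.
    return {sig: keys for sig, keys in bins.items() if len(keys) > 1}

# Hypothetical vital signs records: S1 and S2 share identical values, S3 does not.
records = [
    {"USUBJID": "S1", "VISITNUM": 1, "DTC": "2020-01-01T08:00", "TESTCD": "SYSBP", "value": 120},
    {"USUBJID": "S1", "VISITNUM": 1, "DTC": "2020-01-01T08:00", "TESTCD": "DIABP", "value": 80},
    {"USUBJID": "S2", "VISITNUM": 1, "DTC": "2020-01-02T09:00", "TESTCD": "SYSBP", "value": 120},
    {"USUBJID": "S2", "VISITNUM": 1, "DTC": "2020-01-02T09:00", "TESTCD": "DIABP", "value": 80},
    {"USUBJID": "S3", "VISITNUM": 1, "DTC": "2020-01-03T10:00", "TESTCD": "SYSBP", "value": 135},
    {"USUBJID": "S3", "VISITNUM": 1, "DTC": "2020-01-03T10:00", "TESTCD": "DIABP", "value": 85},
]
duplicates = find_duplicate_sets(records)
```

Here `duplicates` holds a single bin containing the sets for subjects S1 and S2, which is the kind of group you would select in the histograms and drill into.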
Report Options
Findings Analysis
By default, the report is set to Analyze all tests from all findings domains. You can opt, however, to restrict the search to specific Findings Tests.
Use the Analyze: option to specify whether to search results in original units, in standard format (for character results), or in standard units (for numeric results).
You might or might not want to include unscheduled visits when you are analyzing findings by visit. Check the Remove unscheduled visits option to exclude unscheduled visits.
You can opt to Consider BY variables in the analysis. This option, which assumes that BY variables (left vs. right arm for collecting blood pressure data, for example) are included in the experimental design, is selected by default. You can uncheck this option to ignore BY variables.
Use the Only include BY variables if they are domain keys option to restrict the BY variables to those that are domain keys. If the option is unchecked, the report uses the cross-classification of xxCAT, xxSCAT, xxLOC, xxMETHOD, xxPOS, xxSPEC, and xxTPT to create BY groups from all of these variables that are available (as it did in the past).
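To make the idea of cross-classification concrete, the following sketch forms a BY-group key from whichever of the listed variables are present in a record. VS is used as an example domain prefix. The function name `by_group_key` and the record layout are hypothetical, and the domain-key restriction is not modeled here.

```python
# Suffixes of the BY variables named in this section (xxCAT, xxSCAT, ...).
BY_SUFFIXES = ("CAT", "SCAT", "LOC", "METHOD", "POS", "SPEC", "TPT")

def by_group_key(record, domain):
    """Return the BY-group key: (variable, value) pairs for the BY variables present."""
    names = [domain + suffix for suffix in BY_SUFFIXES]
    return tuple((name, record[name]) for name in names if name in record)

# Hypothetical blood pressure records taken on different arms.
left = {"USUBJID": "S1", "VSTESTCD": "SYSBP", "VSLOC": "LEFT ARM", "VSPOS": "SITTING"}
right = {"USUBJID": "S1", "VSTESTCD": "SYSBP", "VSLOC": "RIGHT ARM", "VSPOS": "SITTING"}

left_key = by_group_key(left, "VS")
right_key = by_group_key(right, "VS")
```

Because the two keys differ in VSLOC, the left-arm and right-arm measurements fall into different BY groups and are not compared with each other.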
Under certain circumstances, you may want to ignore select duplicate records. For example, if duplicate entries from a visit are made for a subject, you only want to consider one set of entries. You can check the Ignore duplicate records within subject option to delete multiple occurrences of the same subject within the same set of records.
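One way to picture this option is as a filter that keeps a single occurrence of each subject within a group of duplicate records. The sketch below assumes each record key starts with the subject identifier; the helper `one_per_subject` and the key layout are hypothetical, and the report may choose which occurrence to keep differently.

```python
def one_per_subject(set_keys):
    """Keep the first occurrence of each subject within one group of duplicate sets."""
    seen = set()
    kept = []
    for key in set_keys:
        subject = key[0]  # assumed layout: (USUBJID, VISITNUM, ...)
        if subject not in seen:
            seen.add(subject)
            kept.append(key)
    return kept

# Subject S1 appears twice in the same group of duplicates; S2 appears once.
group = [("S1", 1), ("S1", 2), ("S2", 1)]
kept = one_per_subject(group)
```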
You may also want to Ignore duplicate records when covariates don’t match.
The Summarize sites with at least this many subjects: option enables you to set a minimum threshold for the sites to be analyzed. Only those sites that have at least the specified number of subjects are included. This feature is useful because it enables you to exclude smaller sites, where small differences due to random events are more likely to appear more significant than they truly are. In larger sites, random events are less likely to produce large deviations, so the differences that are observed are more likely to be meaningful.
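The threshold can be sketched as a simple count of distinct subjects per site. This is a minimal illustration that assumes "at least" means greater than or equal to the specified number; the function `sites_meeting_threshold` and the input layout are hypothetical.

```python
from collections import defaultdict

def sites_meeting_threshold(subject_sites, min_subjects):
    """Return the sites that have at least min_subjects distinct subjects."""
    subjects_by_site = defaultdict(set)
    for usubjid, site in subject_sites:
        subjects_by_site[site].add(usubjid)
    return sorted(
        site for site, subjects in subjects_by_site.items()
        if len(subjects) >= min_subjects
    )

# Hypothetical (subject, site) pairs: site 01 has three subjects, site 02 has one.
pairs = [("S1", "01"), ("S2", "01"), ("S3", "01"), ("S4", "02")]
included = sites_meeting_threshold(pairs, 2)
```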
Filtering the Data:
Filters enable you to restrict the analysis to a specific subset of subjects and/or adverse events, based on values within variables. You can also filter based on population flags (Safety is selected by default) within the study data.
See Select the analysis population, Select saved subject Filter (see note 1), and Additional Filter to Include Subjects.
The Subset of Visits to Analyze option enables you to restrict your search for tests with similar and questionable results to a specific subset of visits.

Note 1: Subject-specific filters must be created using the Create Subject Filter report prior to your analysis.