Cluster Subjects Across Study Sites

This process clusters subjects across study sites to identify similar subjects. It constructs a cross-domain data set using as much data as possible (subject to user options), computes a distance matrix, and performs hierarchical clustering of subjects across all of the study centers. The goal is to identify pairs of subjects separated by a very small distance, which could indicate that these subjects are in fact the same individual enrolled at multiple sites.
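The steps described above can be illustrated with a minimal Python sketch. It is not the process's actual implementation: the feature vectors, the identifiers, the Euclidean metric, the single-linkage rule, and the function names (`distance_matrix`, `single_linkage`, `close_cross_site_pairs`) are all assumptions chosen for illustration, and no scaling of variables is applied.

```python
from itertools import combinations
from math import sqrt

# Hypothetical cross-domain feature vectors keyed by (SITEID, USUBJID).
# All identifiers and values are invented for illustration.
subjects = {
    ("01", "S-001"): [120.0, 80.0, 5.1],
    ("01", "S-002"): [135.0, 90.0, 6.3],
    ("02", "S-101"): [120.5, 80.0, 5.1],
    ("02", "S-102"): [110.0, 70.0, 4.2],
}


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def distance_matrix(data):
    """Map each unordered pair of subject keys to its distance."""
    return {
        frozenset((p, q)): euclidean(data[p], data[q])
        for p, q in combinations(data, 2)
    }


def single_linkage(data, n_clusters):
    """Agglomerative clustering: repeatedly merge the two closest clusters
    (closest = smallest distance between any two members)."""
    dist = distance_matrix(data)
    clusters = [[key] for key in data]
    while len(clusters) > n_clusters:
        best = None
        for i, j in combinations(range(len(clusters)), 2):
            d = min(
                dist[frozenset((p, q))]
                for p in clusters[i]
                for q in clusters[j]
            )
            if best is None or d < best[0]:
                best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return clusters


def close_cross_site_pairs(data, threshold):
    """Pairs from different sites whose distance is at most `threshold`
    (the cutoff is an assumed, user-chosen value)."""
    pairs = []
    for pair, d in distance_matrix(data).items():
        p, q = sorted(pair)
        if p[0] != q[0] and d <= threshold:
            pairs.append((p, q, d))
    return sorted(pairs, key=lambda item: item[2])


if __name__ == "__main__":
    print(single_linkage(subjects, 3))
    print(close_cross_site_pairs(subjects, 1.0))
```

In this toy data, the subjects S-001 (site 01) and S-101 (site 02) differ in only one measurement, so they are merged first and are the only cross-site pair flagged at a threshold of 1.0. In practice, variables on different scales would likely need standardizing before distances are meaningful.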

What do I need?

This process requires the following variables:

• DM (ARM, SITEID, USUBJID).

• Findings domains require VISITNUM and xxSTRESN. (xxTPTNUM is used if available.)

• Events or Interventions domains require xxDECOD.

Domains that fail to meet the aforementioned criteria are not used.
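The eligibility rules above can be expressed as a short check. The following Python sketch is only an illustration of those rules, not the process's own code: the function name `domain_is_usable`, the `domain_class` argument, and the reading of `xx` as the two-letter domain code are assumptions.

```python
# Variables required from DM, as listed above.
REQUIRED_DM = {"ARM", "SITEID", "USUBJID"}


def domain_is_usable(domain, columns, domain_class):
    """Return True if a domain meets the variable requirements.

    domain       -- two-letter domain code, e.g. "LB" (assumed to be the
                    "xx" prefix in xxSTRESN / xxDECOD)
    columns      -- variable names present in the domain data set
    domain_class -- hypothetical label: "findings", "events",
                    "interventions", or anything else
    """
    cols = {c.upper() for c in columns}
    prefix = domain.upper()
    if prefix == "DM":
        return REQUIRED_DM <= cols
    if domain_class == "findings":
        # xxTPTNUM is optional, so it is not checked here.
        return {"VISITNUM", prefix + "STRESN"} <= cols
    if domain_class in ("events", "interventions"):
        return prefix + "DECOD" in cols
    return False


if __name__ == "__main__":
    print(domain_is_usable("LB", ["USUBJID", "VISITNUM", "LBSTRESN"], "findings"))
    print(domain_is_usable("AE", ["USUBJID", "AETERM"], "events"))
```

Here the LB example qualifies because it carries both VISITNUM and LBSTRESN, while the AE example is skipped because AEDECOD is missing.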

Refer to Localization-Specific Value Specification for more information.

Output/Results

The output generated by this process is summarized in a tabbed report. Refer to the Cluster Subjects Across Study Sites output documentation for detailed descriptions and guides to interpreting your results.