This process clusters subjects across study sites to identify similar subjects. It constructs a cross-domain data set using as much data as possible (subject to user options). Next, it computes a distance matrix and performs hierarchical clustering of subjects across all of the study centers. The goal is to identify pairs of subjects separated by a very small distance, which could indicate that these subjects are in fact the same individual who has enrolled at multiple sites.
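The general idea can be illustrated with a minimal Python sketch. It is not the implementation used by this process: the subject records, the Euclidean metric, the single-linkage rule, and the 0.1 threshold are all assumptions chosen for illustration, and every name below is hypothetical. The sketch builds a pairwise distance matrix, merges the closest clusters step by step, and lists pairs of subjects from different sites whose distance falls below the threshold.

```python
from itertools import combinations
from math import sqrt

# Hypothetical subjects: (subject ID, site, standardized covariate values).
# These values are made up for illustration only.
subjects = [
    ("S01", "Site A", [0.10, 1.20, -0.50]),
    ("S02", "Site A", [1.50, -0.30, 0.80]),
    ("S03", "Site B", [0.12, 1.18, -0.52]),  # nearly identical to S01
    ("S04", "Site B", [-1.10, 0.40, 1.90]),
]


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors (assumed metric)."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def distance_matrix(rows):
    """Symmetric matrix of pairwise distances between all subjects."""
    n = len(rows)
    dist = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        dist[i][j] = dist[j][i] = euclidean(rows[i][2], rows[j][2])
    return dist


def single_linkage(dist):
    """Agglomerative clustering with single linkage (assumed linkage rule).

    Returns the merge history as (members_a, members_b, distance) tuples,
    where members are row indices into the distance matrix.
    """
    def linkage(a, b):
        # Single linkage: distance between the two closest members.
        return min(dist[i][j] for i in a for j in b)

    clusters = [frozenset([i]) for i in range(len(dist))]
    merges = []
    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2), key=lambda pair: linkage(*pair))
        merges.append((sorted(a), sorted(b), linkage(a, b)))
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
    return merges


def close_cross_site_pairs(rows, dist, threshold):
    """Pairs of subjects from different sites closer than the threshold."""
    return [
        (rows[i][0], rows[j][0], dist[i][j])
        for i, j in combinations(range(len(rows)), 2)
        if rows[i][1] != rows[j][1] and dist[i][j] < threshold
    ]


dist = distance_matrix(subjects)
merges = single_linkage(dist)
pairs = close_cross_site_pairs(subjects, dist, threshold=0.1)

for first, second, d in pairs:
    print(f"{first} and {second} are unusually close (distance {d:.4f})")
```

In this toy data, S01 and S03 are enrolled at different sites but have almost the same covariate values, so they are the first pair merged and the only pair flagged. A small distance is only a prompt for further review, not proof that two records belong to the same person.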
The output generated by this process is summarized in a tabbed report. Refer to the Cluster Subjects Across Study Sites output documentation for detailed descriptions and guides to interpreting your results.