Cluster Subjects Across Study Sites


The Cluster Subjects Across Study Sites report is used to identify similar subjects. It does so by constructing a cross-domain data set that uses as much data as possible (subject to user options). It then computes a distance matrix and performs hierarchical clustering of subjects across all of the study centers. The goal is to identify pairs of subjects separated by a very small distance, which could indicate that the two subjects are in fact the same individual enrolled at multiple sites.
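The general idea can be illustrated with a minimal Python sketch. It is not the report's implementation: the subject records are made up, and the choices of standardized Euclidean distance and single-linkage clustering are assumptions made only for illustration, since this description does not state which distance metric or linkage method the report uses.

```python
import math
from itertools import combinations

# Hypothetical pre-dosing records: (subject ID, site, age, height in cm, weight in kg).
subjects = [
    ("S01", "Site A", 54, 170.0, 72.0),
    ("S02", "Site A", 61, 182.0, 90.5),
    ("S03", "Site B", 54, 170.5, 72.4),
    ("S04", "Site B", 47, 158.0, 55.0),
]

def standardize(rows):
    """Scale each numeric column to mean 0 and sample standard deviation 1."""
    scaled_cols = []
    for col in zip(*rows):
        mean = sum(col) / len(col)
        sd = math.sqrt(sum((v - mean) ** 2 for v in col) / (len(col) - 1))
        scaled_cols.append([(v - mean) / sd if sd else 0.0 for v in col])
    return list(zip(*scaled_cols))

ids = [s[0] for s in subjects]
scaled = standardize([s[2:] for s in subjects])

# Pairwise Euclidean distances (an assumed metric), keyed by subject-ID pair.
dist = {
    (ids[i], ids[j]): math.dist(scaled[i], scaled[j])
    for i, j in combinations(range(len(ids)), 2)
}

def single_linkage(ids, dist):
    """Agglomerative clustering; returns the merges as (cluster_a, cluster_b, distance)."""
    def between(a, b):
        # Single linkage: distance between the closest members of two clusters.
        return min(dist.get((x, y), dist.get((y, x))) for x in a for y in b)

    clusters = [frozenset([i]) for i in ids]
    merges = []
    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2), key=lambda pair: between(*pair))
        merges.append((sorted(a), sorted(b), between(a, b)))
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
    return merges

merges = single_linkage(ids, dist)
closest_pair = min(dist, key=dist.get)  # the pair most worth reviewing
```

In this made-up data, the closest pair consists of two subjects enrolled at different sites, which is the kind of pair the report is meant to bring to a reviewer's attention.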

Running this report using the Nicardipine sample setting and default options generates the output shown below. Refer to the Cluster Subjects Across Study Sites requirements description for more information. This report uses pre-dosing information with the goal of identifying subjects that have enrolled at two or more clinical sites.

The Cluster Subjects Across Study Sites report shows the results of clustering of the subjects on the basis of different combinations of covariates (demographic groups in this example). The results for each grouping are presented on a separate “tab”.

Results Section

This pane enables you to access and view the output plots and associated data sets on each tab. Use the drop-down menu to view the section in the Results pane or remove the section and its contents from the Results pane.

The following sections are generated by this process:

• Between-Subject Distance Summary (Cluster Subjects Across Study Sites): Box plots are presented for all pairwise distances between subjects in the selected population. Pairs are limited based on selections from the Cluster subjects matching these criteria panel of the dialog.

In this example, box plots are presented by gender and race. The more similar a pair of subjects, the smaller the distance value (a zero indicates a perfect match). The minimum distance from each covariate subgroup is presented in the box plot to the right. The subgroup with the most similar pair of subjects is presented on the next tab.

This “tab” is shaded gray in the figure above.

• One or more Subgroup Clustering sections: Only one tab is opened initially. The name of this tab depends on the covariate values used (as specified in the Cluster subjects matching these criteria panel) and on the subgroup identified as having the minimum pairwise distance.

In this example, the Sex = F, Race = WHITE subgroup shows the minimum pairwise distance. Results for the other subgroups can be opened from the Results Section drop-down menu.

This “tab” is shaded yellow in the figure above.
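The subgroup logic described above (pairwise distances within each covariate subgroup, the minimum distance per subgroup, and the subgroup containing the most similar pair) can be sketched as follows. This is an illustration only: the subjects, the single numeric score standing in for the cross-domain data, and the absolute-difference distance are all assumptions, not the report's actual computation.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical subjects: ID -> (sex, race, score). The score is a stand-in
# for the multi-variable data the report would actually compare.
subjects = {
    "S01": ("F", "WHITE", 1.0),
    "S02": ("F", "WHITE", 1.5),
    "S03": ("F", "WHITE", 4.0),
    "S04": ("M", "WHITE", 2.0),
    "S05": ("M", "WHITE", 5.0),
    "S06": ("F", "ASIAN", 3.0),
}

# Group subject IDs by covariate subgroup (sex, race).
groups = defaultdict(list)
for sid, (sex, race, _score) in subjects.items():
    groups[(sex, race)].append(sid)

# All pairwise distances, computed only within a subgroup.
all_dist = {}
for key, members in groups.items():
    for a, b in combinations(sorted(members), 2):
        all_dist[(key, a, b)] = abs(subjects[a][2] - subjects[b][2])

# Minimum pairwise distance (and the pair that attains it) per subgroup.
min_dist = {}
for (key, a, b), d in all_dist.items():
    if key not in min_dist or d < min_dist[key][2]:
        min_dist[key] = (a, b, d)

# The subgroup holding the most similar pair overall.
best_subgroup = min(min_dist, key=lambda k: min_dist[k][2])
```

A subgroup with a single subject (F, ASIAN here) yields no pairs and therefore has no minimum distance in this sketch.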

Drill Down Buttons

Drill down buttons provide an easy way to drill down into your data. The following drill down buttons are generated by this process:

• Profile Subjects: Select subjects and click to generate the patient profiles. See Profile Subjects for additional information.

• Show Subjects: Select subjects and click to open the ADSL (or DM, if ADSL is unavailable) records of the selected subjects.

• Show Rows in Heat Map: Select points that represent pairs of subjects in the Box Plots and click to highlight the subjects within the Heat Map and Dendrogram to see how they cluster together.

• Subset Clustering: On a subgroup clustering page, click to restrict the clustering to the subjects in the pairs selected from the corresponding box plot.

• Revert Clustering: Click to return a subset clustering to the original state where all subjects are clustered.
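Conceptually, subsetting and reverting a clustering amounts to restricting the distance matrix to the subjects in the selected pairs and later returning to the full matrix. The sketch below illustrates that idea with a made-up distance table and a hypothetical helper, subset_distances; how the software performs the subset internally is not described here.

```python
# Hypothetical distance table for one subgroup, keyed by subject-ID pair.
dist = {
    ("S01", "S02"): 0.4,
    ("S01", "S03"): 2.1,
    ("S01", "S04"): 3.0,
    ("S02", "S03"): 1.9,
    ("S02", "S04"): 2.8,
    ("S03", "S04"): 0.6,
}

def subset_distances(dist, selected_pairs):
    """Keep only distances between subjects that appear in a selected pair."""
    keep = {sid for pair in selected_pairs for sid in pair}
    return {
        pair: d
        for pair, d in dist.items()
        if pair[0] in keep and pair[1] in keep
    }

# Pairs picked from the box plot (made-up selection).
selected = [("S01", "S02"), ("S02", "S03")]

# "Subset Clustering": cluster using only the retained subjects.
subset = subset_distances(dist, selected)

# "Revert Clustering": the original table is untouched, so reverting
# simply means clustering on `dist` again.
```

Because the helper returns a new dictionary rather than modifying the original, the full set of distances remains available for reverting.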

General

• Click to view the associated data tables. Refer to View Data for more information.

Output includes the following data sets (default names shown): one summary data set (csass_sum_XXX¹) containing one record per subject with pre-dosing data; one data set of all pairwise distances within the covariate subgroups (csass_alldist_XXX); one data set containing the minimum pairwise distance for each covariate subgroup (csass_mindist_XXX); one data set per covariate subgroup containing pairwise distances (csass_p_Y_XXX, where Y is indexed from 1 to the number of covariate subgroups); and one data set per covariate subgroup containing the distance matrix of subjects within that subgroup (csass_Y_XXX, where Y is indexed the same way).

• Click to generate a standardized PDF- or RTF-formatted report containing the plots and charts of selected sections.

• Click to take notes and store them in a central location. Refer to Add Notes for more information.

• Click to read user-generated notes. Refer to View Notes for more information.

• Click the Options arrow to reopen the completed process dialog used to generate this output.

Add Holiday or Event

Refer to Add Holiday or Event for a description of the output data and files generated by this process.

¹ The _XXX suffix denotes a one- to three-digit number that is added sequentially to prevent overwriting existing data sets.