Process Description

Hierarchical Clustering

The Hierarchical Clustering process creates a tree of relationships of the observations (rows) of a SAS data set using one of several methods. The variables themselves can also be clustered to create a two-way organization. All methods are based on an agglomerative hierarchical clustering procedure. Each observation begins in a cluster by itself. The two closest clusters are merged to form a new cluster that replaces the two old clusters. Merging of the two closest clusters is repeated until only one cluster is left. The various clustering methods differ in how the distance between two clusters is computed.

What do I need?

One data set is required to run the Hierarchical Clustering process. This data set must be in a rectangular format and contain the variables (columns) whose observations (rows) are to be clustered.

The adsl_diit.sas7bdat data set, shown below, was generated from the included Nicardipine data set. Patients are listed in columns, domain data are listed in rows. There are 911 columns for 906 patients and 318 rows listing events.

For detailed information about the files and data sets used or created by JMP Genomics software, see Files and Data Sets.

Output/Results

The output generated by this process is summarized in a Tabbed report. Refer to the Hierarchical Clustering output documentation for detailed descriptions and guides to interpreting your results.