Hierarchical Clustering
The Hierarchical Clustering tab contains the following elements:
• Heat Map (also known as a Color Map)
This element is a color-coded display of the data matrix that is clustered. The columns of this matrix correspond to the variables that you specified to cluster, and the rows correspond to the rows of data in the Input SAS Data Set. Each value of the data matrix is mapped to a color according to the color theme that you specify. A legend for the mapped colors is shown to the right of the display.
Examine the heat map for patterns of large, small, and intermediate values of data. The two-way clustering process rearranges both rows and columns so that those that are similar to one another are near each other in the display. A block of similar colors corresponds to data values that are similar, and sharp changes in color correspond to places where data values are changing most dramatically.
See Heat Map and Dendrogram for more information.
• Dendrogram
A dendrogram is a tree-like diagram that portrays the structure estimated by hierarchical clustering. A split of the data occurs at each stage of the hierarchical clustering, and the location of this split is shown where one line becomes two. The branches of the tree correspond to groups that are similar to one another, and deeper splits between branches indicate a higher degree of dissimilarity.
The two-way clustering analysis contains two distinct dendrograms, one for rows and one for columns, which are shown to the right of and below the heat map, respectively. Click on branches of the dendrograms to select rows or columns.
The Row Dendrogram also contains two Crosshair Tools at its top and bottom. Click and slide one of these to change the cutoff used to color the clusters.
Use the Magnifier Tool to zoom in on row clusters of interest. Click the magnifier on the Tools Toolbar (show this toolbar using View > Toolbars > Tools, or access the magnifier directly from the Tools > Magnifier menu), and click and drag a region in the row dendrogram. -click to reverse the zoom.
Although subjective by nature, the branching pattern of the dendrogram can provide you with an idea of how many clusters adequately describe the rows and columns. You can also find rows or columns of interest in the display and see what other rows and columns are similar.
See Heat Map and Dendrogram for more information.
• Distance Graph
This graph, in the lower right corner of the display, shows how the distance criterion changes with each branch of the hierarchy. Sharp changes in this graph indicate regions with more distinct breaks, whereas flat regions represent cases where the splits are not as severe.
See Distance Graph for more information.
• Data Filter
You can use the data filter to locate rows of interest. Scroll through the filter to find the row label that you want and click it. The corresponding row becomes highlighted in the heat map and dendrogram display. You might need to use the zoom tool to more clearly see the region around the selected row.
Explore elements together to discover patterns in your data. The branches of the dendrogram should correspond to blocks of color in the heat map display.
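To make the relationship between the dendrogram, the cutoff, and the distance graph more concrete, the following is a minimal Python sketch and not the procedure this software uses. It assumes single-linkage agglomerative clustering with Euclidean distance (the product's actual linkage method and distance criterion are not stated here), and it builds the tree bottom-up by merging rows, which yields the same kind of hierarchy that the dendrogram displays as splits. The function names (linkage_distance, cluster_rows, cut_at) and the sample rows are hypothetical and chosen only for illustration.

```python
from itertools import combinations
from math import dist


def linkage_distance(rows, a, b):
    """Single linkage (an assumption): distance between the closest members."""
    return min(dist(rows[i], rows[j]) for i in a for j in b)


def cluster_rows(rows):
    """Merge the two closest clusters repeatedly; return (a, b, distance) per merge."""
    clusters = [frozenset([i]) for i in range(len(rows))]
    merges = []
    while len(clusters) > 1:
        a, b = min(
            combinations(clusters, 2),
            key=lambda pair: linkage_distance(rows, *pair),
        )
        merges.append((a, b, linkage_distance(rows, a, b)))
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
    return merges


def cut_at(n_rows, merges, cutoff):
    """Apply only the merges at or below the cutoff, like sliding the crosshair."""
    clusters = {frozenset([i]) for i in range(n_rows)}
    for a, b, d in merges:
        if d <= cutoff:
            clusters -= {a, b}
            clusters.add(a | b)
    return sorted(sorted(c) for c in clusters)


# Hypothetical data: five rows measured on two variables.
rows = [(1.0, 1.0), (1.0, 2.0), (5.0, 5.0), (5.0, 6.0), (9.0, 1.0)]
merges = cluster_rows(rows)

# The sequence of merge distances is what a distance graph plots.
heights = [d for _, _, d in merges]

# The sharpest change in that sequence suggests a natural place to cut.
jumps = [later - earlier for earlier, later in zip(heights, heights[1:])]
k = jumps.index(max(jumps))
cutoff = (heights[k] + heights[k + 1]) / 2

clusters = cut_at(len(rows), merges, cutoff)
print(heights)
print(cutoff, clusters)
```

In this sketch the largest jump in the merge distances falls between the second and third merges, so the cutoff lands between them and the rows separate into three groups. This mirrors how a sharp change in the distance graph points to a distinct break in the dendrogram, while a flat region indicates splits that are less severe.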