The Hierarchical Clustering red triangle menu contains the following options:
Color Clusters
Colors the dendrogram labels and their associated join bars according to cluster membership. Also assigns the corresponding colors to the rows of the data table. The colors update if you change the number of clusters. If you deselect this option, the colors are no longer updated based on the number of clusters.
Mark Clusters
Assigns markers to the rows of the data table corresponding to the cluster to which the row belongs. The markers update if you change the number of clusters. If you deselect this option, the markers are no longer updated based on the number of clusters.
Number of Clusters
Prompts you to enter a number of clusters and positions the dendrogram slider to that number.
Cluster Criterion
Gives the Cubic Clustering Criterion (CCC) over the entire range of possible numbers of clusters. The CCC is used to estimate the number of clusters and can be used with any distance-based clustering algorithm. Larger values of the CCC indicate a better fit for that number of clusters. See SAS Institute Inc. (1983). (Not available when Data is distance matrix is selected.)
Show Dendrogram
Shows or hides the Dendrogram report.
Dendrogram Scale
Contains the following options for scaling the dendrogram:
Distance Scale
Shows the horizontal distances between any two join points as the distances between the two clusters joined at that point, based on the distance method specified on the launch window. The distance scale is the same scale as used in the Distance Graph and is the default scale for the dendrogram.
Even Spacing
Shows the horizontal distances between any two join points as equal.
Geometric Spacing
Increases the horizontal distances between join points as the number of clusters increases. This option is useful when there are many objects and you want the smaller clusters to be more visible than the larger clusters.
Distance Graph
Shows or hides the distance plot beneath the dendrogram.
Show NCluster Handle
Shows or hides the handles on the dendrogram used to manually change the number of clusters.
Zoom to Selected Rows
Selects and enlarges a particular cluster after you select the cluster in the dendrogram. Alternatively, you can double-click the cluster to zoom in on it. Use Release Zoom to return to the original view.
Release Zoom
Returns the dendrogram to the original view after zooming.
Pivot on Selected Cluster
Reverses the order of the two sub-clusters of the currently selected cluster.
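The effect of a pivot on the display order can be illustrated with a minimal Python sketch. The nested-tuple tree and the function names `leaves` and `pivot` are hypothetical and are not part of JMP; the sketch only shows that swapping the two sub-clusters of one join changes the leaf order without changing cluster membership.

```python
def leaves(node):
    """Return the leaf labels of a binary cluster tree in display order."""
    if isinstance(node, tuple):
        left, right = node
        return leaves(left) + leaves(right)
    return [node]

def pivot(node):
    """Swap the two sub-clusters of a join (assumed to be a 2-tuple)."""
    left, right = node
    return (right, left)

# Hypothetical tree: each tuple is a join of two sub-clusters.
tree = (("A", "B"), ("C", ("D", "E")))

# Pivot on the right-hand cluster {C, D, E} only.
top_left, top_right = tree
pivoted = (top_left, pivot(top_right))

order_before = leaves(tree)
order_after = leaves(pivoted)
```

Only the order of the rows within the pivoted cluster changes; every cluster contains the same members as before.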
Color Map
Gives the option to add a color map, or heat map, showing each Y, Columns variable colored by value. Several color theme choices are available in a submenu.
Two Way Clustering
Clusters the variables specified in Y, Columns as well as the rows. A color map is added with a dendrogram for the Y, Columns variables at its base. Typically, for two-way clustering, your variables are measured on the same scale and you do not select Standardize Data. (Not available when Data is stacked is selected.)
Positioning
Provides options for changing the positions of labels and other parts of the dendrogram.
Legend
Shows or hides a legend for the colors used in color maps. This option is available only if a color map is enabled.
More Color Map Columns
Adds a color map for specified columns. (Not available when Data as summarized, Data is distance matrix, or Data is stacked is selected.)
Constellation Plot
Shows or hides an alternative way to present the information in the hierarchical clustering dendrogram. Each observation (row) is represented by an endpoint and each cluster join is represented by a new point. The lines that are drawn represent cluster membership. The lengths of the lines represent the distance between clusters. Longer lines represent greater distances between clusters.
You can position your pointer over the lines in the constellation plot to see their length. However, the length values are meaningful only with respect to each other. The axis scaling, orientation of points, and angles of the lines are arbitrary. They are determined such that the ends of the nodes are spaced out and the plot does not appear cluttered, which is important with larger data sets.
To turn off the labels on the endpoints, right-click inside the Constellation Plot and deselect Show Labels.
Save Constellation Coordinates
Saves the coordinates of the constellation plot to the data table. (Not available when Data as summarized, Data is distance matrix, or Data is stacked is selected.)
Save Clusters
Creates a data table column that contains the cluster number. If Add Spatial Measures is selected on the launch window, the cluster numbers are also saved to the Hough Data Table.
Save Formula for Closest Cluster
Creates a data table column that contains a formula for the closest cluster. The formula calculates the squared Euclidean distance from the row to each cluster’s centroid and selects the cluster whose centroid is closest. Note that this formula does not always reproduce the cluster assignment given by Hierarchical Clustering, because hierarchical clustering determines the clusters differently. However, the two assignments are usually very similar. (Not available when Data as summarized, Data is distance matrix, or Data is stacked is selected.)
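The nearest-centroid rule described above can be sketched in Python as follows. This is an illustration only, not the formula that JMP saves: the function names, the centroids, and the rows are made up, and the sketch assumes that rows and centroids are expressed on the same scale.

```python
def squared_distance(point, centroid):
    """Squared Euclidean distance between a row and a cluster centroid."""
    return sum((p - c) ** 2 for p, c in zip(point, centroid))

def closest_cluster(point, centroids):
    """Return the 1-based number of the cluster whose centroid is nearest."""
    distances = [squared_distance(point, c) for c in centroids]
    return distances.index(min(distances)) + 1

# Hypothetical centroids for three clusters and three rows to classify.
centroids = [(0.0, 0.0), (5.0, 5.0), (10.0, 0.0)]
rows = [(1.0, 0.5), (4.0, 6.0), (9.0, 1.0)]

assignments = [closest_cluster(row, centroids) for row in rows]
```

Because the rule looks only at centroids, a row near the boundary between two clusters can be assigned differently from the way the hierarchical joins assigned it.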
Save Display Order
Creates a data table column that contains the order in which the row appears in the dendrogram.
Save Cluster Hierarchy
Creates a data table that contains the information needed to write a script for a custom dendrogram. For each cluster join, there are three rows: the first for the joiner, the second for the leader, and the third for the result, giving the cluster centers, size, and other information.
Save Cluster Tree
Creates a new data table that contains information needed to compare cluster trees between JMP and SAS. For each cluster join, there is one row for each new cluster, with the cluster’s size and other information.
Save Distance Matrix
Creates a new data table that contains the distances between the observations.
Save Cluster Means
Creates a new data table that contains the number of rows and the means of each column in each cluster.
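As a rough illustration of what such a table contains, the following Python sketch computes the row count and the column means for each cluster. The function name `cluster_means` and the sample data are hypothetical; the layout of the table that JMP actually creates may differ.

```python
from collections import defaultdict

def cluster_means(rows, clusters):
    """Return {cluster: (row count, list of column means)}."""
    groups = defaultdict(list)
    for row, cluster in zip(rows, clusters):
        groups[cluster].append(row)
    summary = {}
    for cluster, members in sorted(groups.items()):
        n = len(members)
        # zip(*members) regroups the member rows column by column.
        means = [sum(column) / n for column in zip(*members)]
        summary[cluster] = (n, means)
    return summary

# Hypothetical data: three rows, two columns, two clusters.
rows = [(1.0, 2.0), (3.0, 4.0), (10.0, 20.0)]
clusters = [1, 1, 2]
summary = cluster_means(rows, clusters)
```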
Cluster Summary
(Not available when Data is distance matrix is selected.) Displays the following information:
Cluster Means
A table that gives, for each cluster, the number of observations (or Object IDs, if the data are stacked) and means for each variable.
Cluster Standard Deviations
A table that gives, for each cluster, the number of observations (or Object IDs, if the data are stacked) and standard deviations for each variable.
Cluster Means Plot
Either a parallel plot or a two-dimensional heat map of the cluster means.
The plot is a parallel plot unless Data is stacked is selected and there are two Attribute ID variables. For the parallel plot, the axis for each variable is scaled as follows:
– If Standardize Data is selected, the axis ranges from two standard deviations below the mean to two standard deviations above the mean, where the standard deviation and mean are computed for the raw data. If a cluster mean falls beyond this range, the axis is extended to include it.
– If Standardize Data is not selected, there is a common vertical axis whose scaling is displayed. (The scaling is equivalent to the Scale Uniformly option in Graph Builder.)
When Data is stacked is selected and there are two Attribute ID variables, two-dimensional plots of the mean of the Y variable at each location are shown for each cluster. These plots are colored using a Blue to Gray to Red color gradient.
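The parallel plot axis rule for standardized data (mean plus or minus two standard deviations, extended to include any cluster mean outside that range) can be sketched in Python. The function name `parallel_axis_range` and the data are hypothetical, and the sketch assumes the sample standard deviation; the text does not say which form JMP uses.

```python
from statistics import mean, stdev

def parallel_axis_range(raw_values, cluster_means):
    """Return (low, high) for one variable's axis in the parallel plot."""
    center = mean(raw_values)
    spread = stdev(raw_values)  # assumption: sample standard deviation
    low = center - 2 * spread
    high = center + 2 * spread
    # Extend the axis if any cluster mean falls outside the default range.
    low = min(low, min(cluster_means))
    high = max(high, max(cluster_means))
    return low, high

# Hypothetical raw data for one variable and two sets of cluster means.
raw = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
default_range = parallel_axis_range(raw, [4.0, 6.0])
extended_range = parallel_axis_range(raw, [4.0, 20.0])
```

In the second call the upper limit is pushed out to the cluster mean of 20.0, while the lower limit stays at two standard deviations below the mean.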
Column Summary
For each variable, gives the RSquare value that represents the proportion of variation explained by the clusters. This number is the RSquare value for a regression of the variable on the clusters. The option also gives a bar graph of RSquare values.
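The RSquare for one variable can be illustrated with a short Python sketch. Regressing a variable on cluster membership is a one-way analysis of variance, so RSquare equals one minus the ratio of the within-cluster sum of squares to the total sum of squares. The function name `rsquare_by_cluster` and the data are hypothetical and are not taken from JMP.

```python
def rsquare_by_cluster(values, clusters):
    """Proportion of a variable's variation explained by cluster membership."""
    grand_mean = sum(values) / len(values)
    ss_total = sum((v - grand_mean) ** 2 for v in values)
    if ss_total == 0:
        raise ValueError("variable is constant; RSquare is undefined")
    groups = {}
    for value, cluster in zip(values, clusters):
        groups.setdefault(cluster, []).append(value)
    ss_within = 0.0
    for members in groups.values():
        cluster_mean = sum(members) / len(members)
        ss_within += sum((v - cluster_mean) ** 2 for v in members)
    return 1.0 - ss_within / ss_total

# Hypothetical variable that the two clusters separate well.
separated = rsquare_by_cluster([1.0, 2.0, 3.0, 11.0, 12.0, 13.0],
                               [1, 1, 1, 2, 2, 2])
# Hypothetical variable whose cluster means are identical.
unseparated = rsquare_by_cluster([1.0, 2.0, 1.0, 2.0], [1, 1, 2, 2])
```

A value near 1 indicates that the clusters differ strongly on that variable; a value near 0 indicates that the variable contributes little to distinguishing the clusters.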
Scatterplot Matrix
Creates a scatterplot matrix using all the variables. (Not available when Data as summarized, Data is distance matrix, or Data is stacked is selected.)
Parallel Coord Plots
Creates a parallel coordinate plot for each cluster. (Not available when Data as summarized, Data is distance matrix, or Data is stacked is selected.) The axes are scaled as described for the Cluster Means Plot. See Cluster Means Plot.
Cluster Treatment Comparisons
(Available only if you hold Shift and click the Hierarchical Clustering red triangle.) Select a response column and a two-level treatment column. Creates a Hierarchically Clustered Differences report.
See Local Data Filter, Redo Menus, and Save Script Menus in Using JMP for more information about the following options:
Local Data Filter
Shows or hides the local data filter that enables you to filter the data used in a specific report.
Redo
Contains options that enable you to repeat or relaunch the analysis. In platforms that support the feature, the Automatic Recalc option immediately reflects the changes that you make to the data table in the corresponding report window.
Save Script
Contains options that enable you to save a script that reproduces the report to several destinations.
Save By-Group Script
Contains options that enable you to save a script that reproduces the platform report for all levels of a By variable to several destinations. Available only when a By variable is specified in the launch window.