The Hierarchical Clustering red triangle menu contains the following options:
Color Clusters
Colors the dendrogram labels and their associated join bars according to cluster membership. Also assigns the corresponding colors to the rows of the data table. The colors update if you change the number of clusters. If you deselect this option, the colors are no longer updated based on the number of clusters.
Mark Clusters
Assigns markers to the rows of the data table corresponding to the cluster to which the row belongs. The markers update if you change the number of clusters. If you deselect this option, the markers are no longer updated based on the number of clusters.
Number of Clusters
Specifies the number of row clusters and positions the dendrogram slider to that number.
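As a rough illustration of how a chosen number of clusters relates to the join history, the following Python sketch replays the first n - k joins of a history and labels each row. The function name cut_into_clusters, the zero-based row indices, the (leader, joiner) pair layout, and the cluster numbering are assumptions made for this example. It is not JMP's implementation.

```python
def cut_into_clusters(n_obs, history, k):
    """Label rows 0..n_obs-1 with cluster numbers 1..k.

    history is a list of (leader, joiner) row-index pairs in join order
    (a hypothetical layout). Starting from n_obs singleton clusters,
    applying the first n_obs - k joins leaves exactly k clusters.
    """
    parent = list(range(n_obs))

    def find(i):
        # Follow parent links to the cluster representative,
        # shortening the path along the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for leader, joiner in history[: n_obs - k]:
        parent[find(joiner)] = find(leader)

    # Number clusters in order of first appearance by row.
    numbers = {}
    labels = []
    for i in range(n_obs):
        root = find(i)
        labels.append(numbers.setdefault(root, len(numbers) + 1))
    return labels


history = [(0, 1), (2, 3), (0, 2), (0, 4)]
print(cut_into_clusters(5, history, 3))
```

Changing k and rerunning the function mirrors moving the dendrogram slider: fewer joins are replayed for a larger k, so more clusters remain.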
Cluster Criterion
(Not available when Data is distance matrix is selected as the Data Format.) Shows or hides the Cubic Clustering Criterion (CCC) table. The CCC is shown for 1 cluster through approximately one-tenth of the number of observations in the data table. The CCC can be used to estimate the number of clusters in certain scenarios. In general, larger values of the CCC indicate a better fit. However, a smaller number of clusters can be more interpretable. For more specific guidelines on how to interpret the CCC, see SAS Institute Inc. (1983). This criterion can be used with any distance-based clustering algorithm.
Caution: If clusters are elongated or irregularly shaped, the Cubic Clustering Criterion should not be used as a clustering criterion.
Show Dendrogram
Shows or hides the Dendrogram report.
Dendrogram Scale
Contains the following options for scaling the dendrogram:
Distance Scale
Shows the horizontal distances between any two join points as the distances between the two clusters joined at that point, based on the distance method specified on the launch window. The distance scale is the same scale as used in the Distance Graph and is the default scale for the dendrogram.
Even Spacing
Shows the horizontal distances between any two join points as equal.
Geometric Spacing
Increases the horizontal distances between join points as the number of clusters increases. This option is useful when there are many objects and you want the smaller clusters to be more visible than the larger clusters.
Distance Graph
Shows or hides the distance plot beneath the dendrogram.
Show NCluster Handle
Shows or hides the handles on the dendrogram used to manually change the number of clusters.
Zoom to Selected Rows
Selects and enlarges a particular cluster after you select the cluster in the dendrogram. Alternatively, you can double-click the cluster to zoom in on it. Use Release Zoom to return to the original view.
Release Zoom
Returns the dendrogram to the original view after zooming.
Pivot on Selected Cluster
Reverses the order of the two sub-clusters of the currently selected cluster.
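The effect of a pivot on the display order can be sketched in Python. In this minimal example, a dendrogram is modeled as nested two-element tuples with string labels at the leaves. The tree representation and the function names leaf_order and pivot are assumptions made for illustration, not part of JMP.

```python
def leaf_order(node):
    """Return the leaf labels of a nested-tuple tree in display order."""
    if not isinstance(node, tuple):
        return [node]
    first, second = node
    return leaf_order(first) + leaf_order(second)


def pivot(node, target):
    """Return a copy of the tree with the two sub-clusters of target swapped.

    target is matched by equality, so this sketch assumes unique leaf labels.
    """
    if not isinstance(node, tuple):
        return node
    if node == target:
        return (node[1], node[0])
    return (pivot(node[0], target), pivot(node[1], target))


tree = (("A", "B"), ("C", ("D", "E")))
print(leaf_order(tree))
pivoted = pivot(tree, ("C", ("D", "E")))
print(leaf_order(pivoted))
```

The pivot changes only the order in which the leaves are drawn. The cluster memberships and join distances are the same before and after.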
Positioning
Provides options for changing the positions of labels and other parts of the dendrogram.
Color Map
Enables you to add a color map, or heat map, showing each Y, Column variable colored by value. Several color theme choices are available in a submenu. To remove a color map, select Color Map > None.
More Color Map Columns
(Available only when Data as usual is selected as the Data Format.) Adds a color map for specified columns.
Legend
Shows or hides a legend for the colors used in color maps. There is a separate legend for each of the specified columns. This option is available only if a color map is enabled.
Note: If there are more than 400 columns, a single legend is shown with a standardized score for the colors used in the color maps.
Two Way Clustering
(Available only when Data as usual or Data as summarized is selected as the Data Format.) Clusters by both the specified columns and the rows. A color map is added to the dendrogram with a dendrogram for the Y variables at its base. Typically, for two-way clustering, your variables are measured on the same scale and you do not standardize the data.
Column Clustering
(Available only when Two Way Clustering is used.) Provides options for clustering the columns in two-way clustering.
Number of Column Clusters
Specifies the number of column clusters.
Column Cluster Criterion
Shows or hides the Cubic Clustering Criterion (CCC) table for the entire range of possible numbers of column clusters. The CCC is used to estimate the number of clusters. It can be used with any distance-based clustering algorithm. Larger values of the CCC indicate a better fit in terms of the number of clusters. See SAS Institute Inc. (1983).
Save Column Clusters
Saves a new data table that contains cluster membership information for the columns.
Save Clusters
Saves a new data table that contains cluster membership information. If Add Spatial Measures is selected on the launch window, the cluster numbers are also saved to the Hough Data Table.
Save Cluster Means
Creates a new data table that contains the number of rows and the means of each column in each cluster.
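The contents of such a table can be approximated outside JMP. The following Python sketch computes, for each cluster, the number of rows and the mean of each column. The function name cluster_means and the list-of-rows input layout are assumptions made for this example.

```python
from collections import defaultdict


def cluster_means(rows, labels):
    """Map each cluster label to (row count, list of column means)."""
    groups = defaultdict(list)
    for row, label in zip(rows, labels):
        groups[label].append(row)

    summary = {}
    for label, members in sorted(groups.items()):
        n = len(members)
        # zip(*members) iterates over columns instead of rows.
        means = [sum(column) / n for column in zip(*members)]
        summary[label] = (n, means)
    return summary


rows = [[1.0, 2.0], [3.0, 4.0], [10.0, 20.0]]
labels = [1, 1, 2]
print(cluster_means(rows, labels))
```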
Save Other
Shows a submenu of additional save options.
Save Formula for Closest Cluster
(Not available when Data as summarized, Data is distance matrix, or Data is stacked is selected.) Creates a data table column that contains a formula for the closest cluster. The formula calculates the squared Euclidean distance from the row to each cluster’s centroid and selects the cluster whose centroid is closest. Note that this formula does not always reproduce the cluster assignment given by Hierarchical Clustering, because the hierarchical algorithm builds clusters through successive joins rather than by distance to centroids. However, the two assignments are usually very similar.
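The closest-centroid rule can be sketched in a few lines of Python. The function name closest_cluster and the dictionary of centroids are assumptions made for this example, and the sketch ignores any standardization that the saved JMP formula might apply.

```python
def closest_cluster(point, centroids):
    """Return the label of the centroid nearest to point.

    centroids maps a cluster label to a list of coordinates
    (a hypothetical layout). Distances are squared Euclidean.
    """
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return min(centroids, key=lambda label: squared_distance(point, centroids[label]))


centroids = {1: [0.0, 0.0], 2: [10.0, 10.0]}
print(closest_cluster([1.0, 2.0], centroids))
print(closest_cluster([9.0, 8.0], centroids))
```

Because only the ordering of the distances matters, the squared distance selects the same cluster as the unsquared distance and avoids the square root.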
Save Cluster History
Creates a new data table that contains the information in the Clustering History report.
Save Display Order
Creates a data table column that contains the order in which the row appears in the dendrogram.
Save Distance Matrix
Creates a new data table that contains the distances between the observations.
Save Constellation Coordinates
Saves the coordinates of the constellation plot to the data table. (Not available when Data as summarized, Data is distance matrix, or Data is stacked is selected.)
Save Cluster Hierarchy
Creates a data table that contains the information needed to write a script for a custom dendrogram. For each cluster join, there are three rows: the first for the joiner, the second for the leader, and the third for the result, giving the cluster centers, size, and other information.
Save Cluster Tree
Creates a new data table that contains information needed to compare cluster trees between JMP and SAS. For each cluster join, there is one row for each new cluster, with the cluster’s size and other information.
Clustering History
Shows or hides the Clustering History report. See Clustering History.
Cluster Summary
(Not available when Data is distance matrix is selected.) Shows or hides a report that contains the following information:
Cluster Means
A table that gives, for each cluster, the number of observations (or Object IDs, if the data are stacked) and means for each variable.
Cluster Standard Deviations
A table that gives, for each cluster, the number of observations (or Object IDs, if the data are stacked) and standard deviations for each variable.
Cluster Means Plot
Either a parallel plot or a two-dimensional heat map of the cluster means.
The plot is a parallel plot unless Data is stacked is selected and there are two Attribute ID variables. For the parallel plot, the axis for each variable is scaled.
• If Columns is selected for the Standardize By option, the axis ranges from two standard deviations below the mean to two standard deviations above the mean, where the standard deviation and mean are computed for the raw data. If a cluster mean falls beyond this range, the axis is extended to include it.
• If anything other than Columns is selected for the Standardize By option, there is a common vertical axis whose scaling is displayed. (The scaling is equivalent to the Scale Uniformly option in Graph Builder.)
When Data is stacked is selected and there are two Attribute ID variables, two-dimensional plots of the mean of the Y variable at each location are shown for each cluster. These plots are colored using a Blue to Gray to Red color gradient.
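For the parallel plot case where Columns is selected for the Standardize By option, the axis range rule can be sketched as follows. The function name axis_range is hypothetical, and the use of the sample standard deviation (statistics.stdev) is an assumption, since the text does not say which estimator is used.

```python
from statistics import mean, stdev


def axis_range(raw_values, cluster_means):
    """Return (low, high) limits for one variable's axis.

    The base range is the raw-data mean plus or minus two standard
    deviations. It is widened if any cluster mean falls outside it.
    """
    center = mean(raw_values)
    spread = stdev(raw_values)  # sample standard deviation (assumed)
    low = min(center - 2 * spread, min(cluster_means))
    high = max(center + 2 * spread, max(cluster_means))
    return low, high


raw = [0.0, 2.0, 4.0, 6.0, 8.0]
print(axis_range(raw, [1.0, 7.0]))
print(axis_range(raw, [1.0, 12.0]))
```

In the second call, one cluster mean lies above the base range, so the upper limit is extended to that mean.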
Column Summary
For each variable, gives the RSquare value that represents the proportion of variation explained by the clusters. This number is the RSquare value for a regression of the variable on the clusters. The option also gives a bar chart of RSquare values.
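For a regression of one variable on cluster membership, the RSquare value equals one minus the ratio of the within-cluster sum of squares to the total sum of squares. The following Python sketch computes it for a single column. The function name column_rsquare is an assumption made for this example, and the sketch assumes that the column is not constant.

```python
def column_rsquare(values, labels):
    """Proportion of a column's variation explained by cluster membership."""
    n = len(values)
    grand_mean = sum(values) / n
    ss_total = sum((v - grand_mean) ** 2 for v in values)

    # Group the values by cluster label.
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)

    # Sum of squared deviations from each cluster's own mean.
    ss_within = 0.0
    for members in groups.values():
        group_mean = sum(members) / len(members)
        ss_within += sum((v - group_mean) ** 2 for v in members)

    return 1.0 - ss_within / ss_total


values = [1.0, 2.0, 3.0, 11.0, 12.0, 13.0]
labels = [1, 1, 1, 2, 2, 2]
print(column_rsquare(values, labels))
```

A value near 1 indicates that the clusters separate the rows well on that variable. A value near 0 indicates that the variable differs little between clusters.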
Late Join Outliers
Shows or hides a table that contains observations that were clustered very late in the algorithm. The observations in this table were still clusters of one when the algorithm was 80% complete. Because each of these observations remained as its own cluster until late in the algorithm, these observations are potential outliers in the data set.
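The idea can be sketched in Python by scanning a join history. This example treats "80% complete" as the first 80% of the joins, rounded down, which is an assumption. The function name late_join_outliers, the zero-based row indices, and the (leader, joiner) pair layout are also hypothetical. The sketch relies on a singleton cluster being identified by its own row index.

```python
def late_join_outliers(n_obs, history, fraction=0.8):
    """Return rows that are still singleton clusters after the
    first fraction of the joins in history."""
    n_done = int(fraction * len(history))

    # A row stops being a singleton the first time it takes part
    # in a join, either as the leader or as the joiner.
    joined = set()
    for leader, joiner in history[:n_done]:
        joined.add(leader)
        joined.add(joiner)

    return [row for row in range(n_obs) if row not in joined]


history = [(0, 1), (2, 3), (0, 2), (0, 4), (0, 5)]
print(late_join_outliers(6, history))
```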
Constellation Plot
Shows or hides an alternative way to present the information in the hierarchical clustering dendrogram. Each observation (row) is represented by an endpoint and each cluster join is represented by a new point. The lines that are drawn represent cluster membership. The lengths of the lines represent the distance between clusters. Longer lines represent greater distances between clusters.
You can hover over the lines in the constellation plot to see their length. However, the length values are meaningful only with respect to each other. The axis scaling, orientation of points, and angles of the lines are arbitrary. They are determined such that the ends of the nodes are spaced out and the plot does not appear cluttered, which is important with larger data sets.
To turn off the labels on the endpoints, right-click inside the Constellation Plot and deselect Show Labels.
Scatterplot Matrix
(Available only when Data as usual is selected as the Data Format.) Creates a scatterplot matrix using all the variables.
Parallel Coord Plots
(Available only when Data as usual is selected as the Data Format.) Creates a parallel coordinate plot for each cluster. The axes are scaled as described for the Cluster Means Plot. See Cluster Means Plot.
Cluster Treatment Comparisons
(Available only if you hold Shift and click the Hierarchical Clustering red triangle.) Creates a Hierarchically Clustered Differences report for a response column and a two-level treatment column that you select.
See “Local Data Filters in JMP Reports”, “Redo Menus in JMP Reports”, “Group Platform”, and “Save Script Menus in JMP Reports” in Using JMP for more information about the following options:
Local Data Filter
Shows or hides the local data filter that enables you to filter the data used in a specific report.
Redo
Contains options that enable you to repeat or relaunch the analysis. In platforms that support the feature, the Automatic Recalc option immediately reflects the changes that you make to the data table in the corresponding report window.
Platform Preferences
Contains options that enable you to view the current platform preferences or update the platform preferences to match the settings in the current JMP report.
Save Script
Contains options that enable you to save a script that reproduces the report to several destinations.
Save By-Group Script
Contains options that enable you to save a script that reproduces the platform report for all levels of a By variable to several destinations. Available only when a By variable is specified in the launch window.
Note: Additional options for this platform are available through scripting. Open the Scripting Index under the Help menu. In the Scripting Index, you can also find examples for scripting the options that are described in this section.