The hierarchical clustering method starts with each observation forming its own cluster. At each step, the clustering process calculates the distance between all pairs of clusters and combines the two clusters that are closest together. This process continues until all the points are contained in one cluster. Hierarchical clustering is also called agglomerative clustering because of the combining approach that it uses.
The agglomerative process is portrayed as a tree, called a dendrogram. To help you decide on a number of clusters, JMP provides a distance graph. You can select a number of clusters by determining when the distances between clusters no longer appear to be of practical importance.
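The merge sequence described above can be sketched in a few lines of plain Python (illustrative only; this is not JMP code, and JMP's implementation differs). Here clusters are merged by single linkage, and the function returns the list of merge distances, which is the information a dendrogram and the distance graph portray.

```python
# A minimal pure-Python sketch of agglomerative clustering (assumption:
# single linkage, i.e., the distance between two clusters is the smallest
# distance between any pair of their members).
from itertools import combinations

def euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def agglomerate(points):
    """Merge the two closest clusters until one remains.

    Returns the sequence of merge distances, which a dendrogram plots.
    """
    clusters = [[p] for p in points]  # each observation starts as its own cluster
    merge_distances = []
    while len(clusters) > 1:
        # Find the closest pair of clusters (single linkage).
        (i, j), d = min(
            (((i, j),
              min(euclidean(a, b) for a in clusters[i] for b in clusters[j]))
             for i, j in combinations(range(len(clusters)), 2)),
            key=lambda pair: pair[1],
        )
        merge_distances.append(d)
        clusters[i] += clusters[j]  # combine the two closest clusters
        del clusters[j]
    return merge_distances

points = [(0.0, 0.0), (0.1, 0.2), (5.0, 5.0), (5.1, 5.2)]
print(agglomerate(points))
```

The final merge distance is much larger than the earlier ones, which is exactly the jump in the distance graph that suggests stopping at two clusters.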
Hierarchical clustering also supports character columns. Distances for character columns are defined in one of two ways, depending on the column's modeling type.
• If a column is ordinal, then the value used for clustering is the index of the ordered category. These indices are then standardized as if they were continuous data.
• If a column is nominal, then the distance between two observations where the categories match is zero. If the categories differ, the distance is one.
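The two rules above can be sketched as follows (an illustrative Python sketch, not JMP code; the helper names are my own):

```python
# Sketch of the two character-column distance rules: ordinal levels
# become their ordered index and are standardized; nominal levels
# contribute 0 when categories match and 1 when they differ.
from statistics import mean, stdev

def ordinal_values(column, level_order):
    """Replace each ordinal level with its index, then standardize."""
    idx = [level_order.index(v) for v in column]
    m, s = mean(idx), stdev(idx)
    return [(i - m) / s for i in idx]

def nominal_distance(a, b):
    """0 if the categories match, 1 otherwise."""
    return 0 if a == b else 1

sizes = ["small", "medium", "large", "medium"]
print(ordinal_values(sizes, ["small", "medium", "large"]))
print(nominal_distance("red", "red"), nominal_distance("red", "blue"))  # → 0 1
```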
Hierarchical clustering enables you to choose among five rules for defining distances between clusters: Average, Centroid, Ward, Single, and Complete. Each rule can generate a different sequence of clusters. There are also two additional methods that are based on Ward’s method for defining distances between clusters: Fast Ward and Hybrid Ward.
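To see why each rule can produce a different sequence of clusters, compare three of them on the same pair of clusters (an illustrative Python sketch of the standard definitions, not JMP's implementation):

```python
# Three between-cluster distance rules applied to the same two clusters.
def euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def single(c1, c2):    # smallest pairwise distance between members
    return min(euclidean(a, b) for a in c1 for b in c2)

def complete(c1, c2):  # largest pairwise distance between members
    return max(euclidean(a, b) for a in c1 for b in c2)

def average(c1, c2):   # mean of all pairwise distances
    ds = [euclidean(a, b) for a in c1 for b in c2]
    return sum(ds) / len(ds)

c1 = [(0.0, 0.0), (1.0, 0.0)]
c2 = [(3.0, 0.0), (5.0, 0.0)]
print(single(c1, c2), complete(c1, c2), average(c1, c2))  # → 2.0 5.0 3.5
```

Because the three rules report different distances for the same pair of clusters, the pair judged "closest" at a given step can differ, and so can the resulting merge sequence.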
Tip: The hierarchical clustering process starts with n(n + 1)/2 distances for n observations, except when the Fast Ward method is used. Because this number grows quadratically, hierarchical clustering can take a long time to run when n is large. For large numbers of numeric observations, consider K Means Cluster or Normal Mixtures.
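A quick check of the count in the tip above shows the quadratic growth: multiplying the number of observations by 10 multiplies the number of initial distances by roughly 100.

```python
# Number of initial distances, n(n + 1)/2, for increasing n.
def distance_count(n):
    return n * (n + 1) // 2

for n in (1_000, 10_000, 100_000):
    print(n, distance_count(n))
# → 1000 500500
# → 10000 50005000
# → 100000 5000050000
```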
Hierarchical Clustering is one of four platforms that JMP provides for clustering observations. For a comparison of all four methods, see Overview of Platforms for Clustering Observations.