Tuesday, February 20, 2024

HIERARCHICAL CLUSTERING IN MACHINE LEARNING/PYTHON/ARTIFICIAL INTELLIGENCE

Hierarchical Clustering

  • Why Hierarchical Clustering?
  • Types of Hierarchical Clustering
  • Agglomerative Hierarchical Clustering Algorithm
  • Working of Hierarchical Clustering
  • Advantages of Hierarchical Clustering
  • Disadvantages of Hierarchical Clustering

A kind of unsupervised machine learning, hierarchical clustering arranges data into a tree-like hierarchical structure that is often shown as a dendrogram. At first, every data point is treated as its own cluster. Then the following steps are repeated:
  • Identify the two clusters that are closest to each other.
  • Merge these two most similar clusters. The merging continues until all clusters are amalgamated into one (a minimal code sketch of this loop follows the list).
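To make the merge loop concrete, here is a minimal sketch using SciPy's linkage function on a handful of made-up points; the data and the choice of single linkage are assumptions made only for illustration.

import numpy as np
from scipy.cluster.hierarchy import linkage

# Five toy 2-D points; every point starts out as its own cluster.
X = np.array([[1.0, 1.0], [1.2, 0.9], [5.0, 5.0], [5.1, 4.8], [9.0, 1.0]])

# linkage() repeatedly merges the two closest clusters until one remains;
# each row of Z records one merge (cluster i, cluster j, distance, new size).
Z = linkage(X, method="single")
for step, (i, j, dist, size) in enumerate(Z, start=1):
    print(f"step {step}: merge clusters {int(i)} and {int(j)} "
          f"at distance {dist:.2f} -> new cluster of size {int(size)}")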

While the results of K-means clustering and hierarchical clustering in Python may look alike at times, the two methods work quite differently. A major difference is that hierarchical clustering does not need a starting set of clusters to be specified. Hierarchical clustering, also known as hierarchical cluster analysis, yields a structured diagram of the clusters within a dataset. The procedure begins by treating each data point as a separate cluster, finds each cluster's closest neighbor, and keeps combining the closest pairs until a chosen stopping threshold is reached.

Why do we need Hierarchical Clustering?

Hierarchical clustering is markedly different from K-means in that it does not need an initial set of clusters to be calculated. Also called hierarchical cluster analysis, the method gives a ranked diagram, or dendrogram, for the dataset. The algorithm first treats every data point as its own cluster, then finds each cluster's nearest neighbor and merges the closest pairs into progressively larger clusters until all the data ends up in a single cluster (or a chosen stopping criterion is reached).
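The difference shows up in scikit-learn's API: K-means must be told how many clusters to find, whereas agglomerative clustering can grow the full hierarchy and simply be cut at a distance threshold. The data and the threshold below are illustrative assumptions.

import numpy as np
from sklearn.cluster import AgglomerativeClustering, KMeans

X = np.array([[1.0, 1.0], [1.1, 0.9], [5.0, 5.0], [5.2, 4.9], [9.0, 1.0], [9.1, 1.2]])

# K-means needs the number of clusters (and hence initial centroids) up front.
km_labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

# Agglomerative clustering needs no initial clusters: here the full hierarchy
# is built and then cut wherever a merge would exceed a distance of 2.0.
hc = AgglomerativeClustering(n_clusters=None, distance_threshold=2.0)
hc_labels = hc.fit_predict(X)

print("k-means labels:     ", km_labels)
print("hierarchical labels:", hc_labels, "| clusters found:", hc.n_clusters_)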

Real-World Example for Hierarchical Clustering

Let's suppose there is a village. Many farmers live in this village and grow a variety of grains and vegetables, and nature lovers live there as well. Even though it is an ideal place to live, the villagers face a challenge in organizing their annual Harvest Festival.

To overcome this challenge, the villagers hire Emma, an event planner. Knowing she needed a more strategic approach, Emma turned to hierarchical clustering to organize the festival's activities and attractions.

Emma began by gathering data on the festival’s past attendance, popular attractions, and demographic preferences. After gaining all this information, she applied hierarchical clustering to group similar festival activities and identify clusters of interest.

As the clustering algorithm worked through the data, Emma discovered several distinct clusters of festival attractions. One cluster included traditional harvest-themed activities such as pumpkin carving and apple picking, while another comprised live music performances and artisanal craft stalls. She also found a cluster focused on children's entertainment, featuring interactive games and storytelling sessions.

Using these insights, Emma devised a tiered approach to organizing the Harvest Festival. She created a hierarchical structure with main clusters representing broad categories of attractions and subclusters highlighting specific activities within each category.

This method allowed attendees to quickly browse the many attractions and select the ones that most closely matched their preferences: young families could explore the children's entertainment cluster, while food lovers could sample dishes in the food and drink area.
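A toy version of Emma's grouping can be sketched in a few lines; the activity list and the numeric "appeal" scores below are entirely invented for illustration.

import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Hypothetical scores per activity: [family appeal, harvest theme, performance focus]
activities = ["pumpkin carving", "apple picking", "live music",
              "craft stalls", "storytelling", "interactive games"]
features = np.array([
    [0.6, 0.9, 0.1],
    [0.5, 0.8, 0.0],
    [0.3, 0.1, 0.9],
    [0.2, 0.2, 0.8],
    [0.9, 0.1, 0.2],
    [0.95, 0.0, 0.3],
])

# Build the hierarchy and cut it into three broad groups of attractions.
Z = linkage(features, method="average")
groups = fcluster(Z, t=3, criterion="maxclust")
for name, g in zip(activities, groups):
    print(f"{name:18s} -> group {g}")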

Types of Hierarchical Clustering

There are two types of hierarchical clustering:
  1. Agglomerative clustering
  2. Divisive clustering

Agglomerative Hierarchical Clustering, a popular method in hierarchical cluster analysis (HCA), follows a bottom-up approach. It starts by treating each data point as a separate cluster. Then, in each iteration, it merges the closest pair of clusters until all clusters are merged into a single cluster encompassing the entire dataset.
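What "closest pair of clusters" means depends on a linkage criterion (single, complete, average, Ward, and so on), a detail not spelled out above. The short sketch below, on made-up points, only shows that different criteria can join the final two clusters at different distances.

import numpy as np
from scipy.cluster.hierarchy import linkage

X = np.array([[0.0, 0.0], [0.0, 1.0], [4.0, 0.0], [4.0, 1.0], [10.0, 0.0]])

for method in ("single", "complete", "average", "ward"):
    Z = linkage(X, method=method)
    # The last row of Z is the final merge; its third entry is the distance
    # at which the last two clusters are joined into one.
    print(f"{method:8s} -> final merge distance: {Z[-1, 2]:.2f}")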

Agglomerative clustering algorithm

The algorithm for Agglomerative Hierarchical Clustering is:
  1. Initially, treat each data point as its own cluster.
  2. Compute the similarity between each cluster and all other clusters, resulting in a proximity matrix.
  3. Merge the pair of clusters that exhibit the highest similarity (closest proximity).
  4. Reassess the proximity matrix to account for the newly formed cluster.
  5. Repeat steps 3 and 4 until only one cluster remains (a from-scratch sketch of these steps follows the list).
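The listed steps can be sketched directly in Python. The naive version below uses Euclidean distance with single linkage (both are assumptions, since the steps above do not fix them) and simply recomputes cluster distances at every iteration instead of maintaining an explicit proximity matrix.

import numpy as np

def agglomerative_single_linkage(X):
    """Start with one cluster per point, then repeatedly merge the two
    closest clusters until only one remains. Returns the merge sequence."""
    # Steps 1-2: each point is its own cluster; distances come from the data.
    clusters = {i: [i] for i in range(len(X))}
    merges = []

    def cluster_distance(a, b):
        # Single linkage: distance between the closest pair of members.
        return min(np.linalg.norm(X[i] - X[j]) for i in a for j in b)

    # Steps 3-5: merge the closest pair, refresh distances, repeat.
    while len(clusters) > 1:
        (ka, kb), d = min(
            (((a, b), cluster_distance(clusters[a], clusters[b]))
             for a in clusters for b in clusters if a < b),
            key=lambda pair_dist: pair_dist[1],
        )
        clusters[ka] = clusters[ka] + clusters[kb]
        del clusters[kb]
        merges.append((ka, kb, d))
    return merges

X = np.array([[1.0, 1.0], [1.5, 1.0], [5.0, 5.0], [5.5, 5.0], [9.0, 1.0]])
for a, b, d in agglomerative_single_linkage(X):
    print(f"merged cluster {a} and cluster {b} at distance {d:.2f}")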

(Image source: original)

Let's walk through how the algorithm works in a little more detail, step by step, with the help of images.

Agglomerative clustering example

In the first step, we treat each point as its own cluster. If there are N data points, there will be N clusters, as shown in the image below.


Every dot is its own cluster (image source: original)

In the next step, we take the two closest points (or clusters) and combine them into a single cluster. This leaves N-1 clusters.

(Image source: original)

Next, the process again takes the two closest clusters and combines them into one, leaving N-2 clusters.

(Image source: original)

This merging process keeps repeating steps 3 and 4 until only one cluster is left, as shown in the image below.

(Image source: original)

Once every point has been combined into one big cluster, we can draw a dendrogram (as shown in the image above), which can then be cut to divide the data into the number of clusters the problem requires.
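To reproduce this sequence of N-1 merges and draw the resulting dendrogram, one option is SciPy's linkage and dendrogram functions; the six points below are made up, and matplotlib is assumed to be available.

import numpy as np
import matplotlib.pyplot as plt
from scipy.cluster.hierarchy import linkage, dendrogram

# N = 6 toy points, so the algorithm performs N - 1 = 5 merges.
X = np.array([[1.0, 1.0], [1.4, 1.1], [5.0, 5.0], [5.3, 5.2], [9.0, 1.0], [9.2, 1.3]])

Z = linkage(X, method="single")   # one row per merge, N - 1 rows in total
print("number of merges:", Z.shape[0])

dendrogram(Z, labels=[f"P{i + 1}" for i in range(len(X))])
plt.ylabel("merge distance")
plt.title("Dendrogram of the merge sequence")
plt.show()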

Divisive Hierarchical clustering

In the divisive (top-down) approach, all data points are first placed in a single big cluster, which is then split into smaller clusters according to how dissimilar their members are. The splitting is repeated until every data point ends up in its own cluster (N clusters) or a chosen stopping point is reached.
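Divisive clustering is less commonly provided by libraries; a frequently used approximation is "bisecting" k-means, which starts with one big cluster and repeatedly splits the largest remaining cluster in two. The sketch below uses scikit-learn's KMeans for the splits; the random data and the target of three clusters are illustrative assumptions.

import numpy as np
from sklearn.cluster import KMeans

def divisive_bisecting(X, n_clusters):
    clusters = [np.arange(len(X))]                      # start: one big cluster
    while len(clusters) < n_clusters:
        # Pick the largest remaining cluster and split it in two with 2-means.
        idx = max(range(len(clusters)), key=lambda i: len(clusters[i]))
        members = clusters.pop(idx)
        labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X[members])
        clusters.append(members[labels == 0])
        clusters.append(members[labels == 1])
    return clusters

X = np.random.RandomState(0).rand(12, 2)                # 12 random toy points
for k, members in enumerate(divisive_bisecting(X, n_clusters=3)):
    print(f"cluster {k}: points {members.tolist()}")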

(Image source: original)

Working of dendrogram in Hierarchical clustering

A dendrogram is used by the Hierarchical Clustering (HC) technique to show each clustering stage graphically. The dendrogram resembles a tree, with all of the dataset's data points shown on the x-axis and the Euclidean distances between merged clusters shown on the y-axis. It provides a comprehensive overview of the clustering process, showing which clusters are merged and the distances at which they are joined.
(Image source: original)


In the diagram, the clusters formed by the agglomerative clustering process are depicted on the left side, while the corresponding dendrogram is illustrated on the right side.
  • The initial step shows the combination of data points P2 and P3, forming a cluster, which is reflected in the dendrogram by the connection between P2 and P3. The height of this connection signifies the Euclidean distance between the two data points.
  • Subsequently, another cluster is formed by P5 and P6, and its corresponding linkage appears in the dendrogram. The height of this linkage is greater than the previous one, indicating a slightly larger distance between P5 and P6 than between P2 and P3.
  • Two additional linkages are then created, combining P1 with the P2-P3 cluster and P4 with the P5-P6 cluster.
  • Finally, the complete dendrogram is constructed, joining all the data points together (the short sketch below reproduces this merge sequence numerically).
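Each row of the linkage matrix records one merge and the height (distance) at which it happens, which is exactly what the dendrogram draws. The coordinates for P1-P6 below are invented so that P2/P3 and P5/P6 are the closest pairs, matching the walkthrough above.

import numpy as np
from scipy.cluster.hierarchy import linkage

# Points P1..P6 (indices 0..5); clusters created by merges get indices 6, 7, ...
points = {"P1": [0.0, 0.0], "P2": [1.0, 0.2], "P3": [1.1, 0.3],
          "P4": [5.0, 5.0], "P5": [6.0, 5.1], "P6": [6.1, 5.2]}
X = np.array(list(points.values()))

Z = linkage(X, method="single")
for a, b, height, size in Z:
    print(f"merge {int(a)} + {int(b)} at height {height:.2f} "
          f"(new cluster size {int(size)})")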

Advantages of Hierarchical Clustering

Hierarchical clustering holds several strengths:
  • Its capability to accommodate non-convex clusters, as well as clusters of various sizes and densities, makes it versatile for diverse datasets.
  • Effective handling of missing and noisy data, contributing to robustness in the clustering process.
  • The hierarchical structure revealed by the dendrogram provides valuable insight into the relationships among clusters, helping to comprehend intricate inter-cluster connections within the dataset.

Disadvantages of Hierarchical Clustering

Hierarchical clustering also faces several challenges:
  • Determining a stopping criterion to ascertain the final number of clusters, which can be subjective and challenging (see the sketch after this list).
  • Higher computational and memory requirements, especially with larger datasets.
  • Sensitivity to noise, outliers, and the choice of distance metric and linkage criterion, all of which can affect the final clusters identified.
  • Despite its ability to handle diverse data and unveil relationships among clusters, its high computational cost and this sensitivity remain notable drawbacks.
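In practice, the stopping-criterion problem is usually handled after the fact: the full hierarchy is built once and then cut, either at a chosen distance or at a chosen number of clusters. A short sketch with synthetic data (all values illustrative):

import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Three synthetic groups of five 2-D points each.
rng = np.random.RandomState(1)
X = np.vstack([rng.normal(loc=c, scale=0.2, size=(5, 2))
               for c in ([0, 0], [4, 4], [8, 0])])

Z = linkage(X, method="ward")

labels_by_distance = fcluster(Z, t=3.0, criterion="distance")  # cut at height 3.0
labels_by_count = fcluster(Z, t=3, criterion="maxclust")       # ask for 3 clusters
print("cut by distance:", labels_by_distance)
print("cut by count:   ", labels_by_count)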

Summary

Hierarchical clustering is an unsupervised method for building a hierarchy of clusters. It arranges data into a tree-like structure, with each node representing a cluster. It can be agglomerative (bottom-up), merging the most similar clusters, or divisive (top-down), splitting clusters apart. Hierarchical clustering lets the number of clusters be chosen afterwards by cutting the tree, and it also provides information about the relationships between the data points. However, it can be computationally demanding for huge datasets, and once a merge or split has been made, it cannot be undone.


Python Code
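
A compact end-to-end sketch of agglomerative clustering in Python, using synthetic data from scikit-learn; all parameter choices here (number of blobs, Ward linkage, three clusters) are illustrative rather than prescriptive.

import matplotlib.pyplot as plt
from scipy.cluster.hierarchy import linkage, dendrogram
from sklearn.cluster import AgglomerativeClustering
from sklearn.datasets import make_blobs

# 1. Create a small synthetic dataset with three natural groups.
X, _ = make_blobs(n_samples=60, centers=3, cluster_std=0.8, random_state=42)

# 2. Agglomerative (bottom-up) clustering with Ward linkage.
model = AgglomerativeClustering(n_clusters=3, linkage="ward")
labels = model.fit_predict(X)

# 3. Plot the clustered points next to the corresponding dendrogram.
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(11, 4))
ax1.scatter(X[:, 0], X[:, 1], c=labels, cmap="viridis", s=30)
ax1.set_title("Agglomerative clustering (3 clusters)")

dendrogram(linkage(X, method="ward"), ax=ax2, no_labels=True)
ax2.set_title("Dendrogram")
ax2.set_ylabel("merge distance")

plt.tight_layout()
plt.show()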

