Topic: Agglomerative clustering

From: Akansha Singh <akansha(dot)singh(at)oracle(dot)com>
To: Pgsql-Students <pgsql-students(at)postgresql(dot)org>
Subject: Topic: Agglomerative clustering
Date: 2013-04-29 12:11:42
Message-ID: 2701ba0a-ce70-45db-9cc2-bf098c019780@default
Lists: pgsql-students

Hi,
I have another clustering algorithm. Please give your suggestions on whether this one can be implemented. I will be obliged.

Agglomerative Clustering
In agglomerative clustering, each data point is initially placed in a cluster by itself. At each step, the two clusters with the highest similarity score are merged. The output of agglomerative clustering is a tree whose leaves are the data items. At any intermediate step, the clusters formed so far are separate trees. When the algorithm merges two clusters, the respective trees are merged by making their roots the left and right children of a new node (so the total number of trees decreases by 1). In the end, the process yields a single tree. See here for an example.
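The merge loop described above can be sketched in a few lines of Python. This is only a naive illustration, not a proposed implementation: the function names (`agglomerate`, `closest_pair_sim`), the nested-tuple tree representation, and the sample data are all assumptions made for the example, and the cluster similarity is passed in as a function so that any measure can be plugged in.

```python
from itertools import combinations

def agglomerate(items, sim, cluster_sim):
    """Merge the two most similar clusters until a single tree remains.

    items       -- list of data items (the leaves of the tree)
    sim         -- sim(u, v): similarity between two data items
    cluster_sim -- cluster_sim(A, B, sim): similarity between two clusters,
                   where A and B are lists of items
    Returns the tree as nested 2-tuples: (left_subtree, right_subtree).
    """
    # Each entry is (tree, members); initially every item is its own tree.
    forest = [(x, [x]) for x in items]
    while len(forest) > 1:
        # Find the pair of clusters with the highest similarity score.
        i, j = max(
            combinations(range(len(forest)), 2),
            key=lambda p: cluster_sim(forest[p[0]][1], forest[p[1]][1], sim),
        )
        (left, a), (right, b) = forest[i], forest[j]
        # The two roots become the children of a new node.
        merged = ((left, right), a + b)
        forest = [c for k, c in enumerate(forest) if k not in (i, j)]
        forest.append(merged)  # the number of trees drops by 1
    return forest[0][0]

def closest_pair_sim(A, B, sim):
    # Assumed measure for the demo: similarity of the most similar pair.
    return max(sim(u, v) for u in A for v in B)

# Hypothetical one-dimensional data; similarity falls in (0, 1].
points = [1.0, 2.0, 10.0, 11.0]
sim = lambda u, v: 1.0 / (1.0 + abs(u - v))

tree = agglomerate(points, sim, closest_pair_sim)
print(tree)  # ((1.0, 2.0), (10.0, 11.0))
```

The sketch recomputes every pairwise cluster similarity at each step, which is simple to follow but slow on large inputs. Nested tuples also become ambiguous if the data items are themselves tuples; a small node class would avoid that.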

This algorithm closely resembles Kruskal's algorithm for minimum spanning tree construction. In Kruskal's algorithm, at each step the shortest edge connecting two different components is added, until there is only one component left. Each new edge added merges two components. This can be seen as clustering where the similarity score between two components is the negative of the length of the shortest edge between a point in one component and a point in the other.
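To make the analogy concrete, here is a minimal sketch of Kruskal's algorithm that records the order in which components are merged. It assumes a complete graph over the points with a user-supplied distance function; the name `kruskal_merges` and the sample points are made up for the illustration.

```python
from itertools import combinations

def kruskal_merges(points, dist):
    """Return the spanning-tree edges in the order Kruskal's algorithm adds them.

    Each added edge merges two components, i.e. two clusters.
    """
    parent = list(range(len(points)))  # union-find forest over point indices

    def find(x):
        # Follow parent links to the component root (with path halving).
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # All edges of the complete graph, shortest first.
    edges = sorted(
        combinations(range(len(points)), 2),
        key=lambda e: dist(points[e[0]], points[e[1]]),
    )
    merges = []
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:          # edge joins two different components
            parent[ru] = rv   # merge them
            merges.append((points[u], points[v]))
    return merges

# Hypothetical one-dimensional data with absolute difference as distance.
points = [0.0, 1.0, 3.0, 7.0]
merges = kruskal_merges(points, lambda a, b: abs(a - b))
print(merges)  # [(0.0, 1.0), (1.0, 3.0), (3.0, 7.0)]
```

Each recorded edge corresponds to one merge step of agglomerative clustering under the negative-shortest-edge similarity: first {0} and {1}, then {0, 1} and {3}, then {0, 1, 3} and {7}.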

Many different clustering algorithms can be obtained by varying the similarity measures between clusters. Some examples of similarity measures are the following. Here, A and B are two clusters, and sim(A,B) is the similarity measure between them. The similarity measure sim(u,v) between input data items u and v is specified as input. Typically, the similarity is a number between 0 and 1, with larger values implying greater similarity.
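As an illustration of how sim(A,B) can be built from sim(u,v), the sketch below shows three common textbook linkage rules. These are offered as assumptions for the sake of an example and are not necessarily the measures this message had in mind; the function names and the sample clusters are hypothetical.

```python
def single_link(A, B, sim):
    # Similarity of the most similar pair (u in A, v in B).
    return max(sim(u, v) for u in A for v in B)

def complete_link(A, B, sim):
    # Similarity of the least similar pair (u in A, v in B).
    return min(sim(u, v) for u in A for v in B)

def average_link(A, B, sim):
    # Mean similarity over all pairs (u in A, v in B).
    return sum(sim(u, v) for u in A for v in B) / (len(A) * len(B))

# Hypothetical item similarity in (0, 1]; larger means more similar.
sim = lambda u, v: 1.0 / (1.0 + abs(u - v))

A = [0.0, 1.0]
B = [3.0, 4.0]
for measure in (single_link, complete_link, average_link):
    print(measure.__name__, round(measure(A, B, sim), 4))
```

With a similarity of this form, the single-link value is never smaller than the average-link value, which in turn is never smaller than the complete-link value, so the choice of measure can change which pair of clusters is merged first.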

Regards
Akansha Singh
