# Deliverable 3 - Implementation of Clustering using the minimum spanning tree

## Goal

The main objective of this deliverable is to implement clustering using Kruskal’s algorithm.

## Introduction

Clustering using Kruskal’s algorithm is a hierarchical type of clustering. Hierarchical algorithms find the clusters using the previously established clusters. The implementation is done in two parts. Kruskal’s algorithm is implemented to find the minimum spanning tree of the graph. It finds the subset of edges which includes all the vertices and the total weight of the edges are minimized. Clustering is done by deleting the most expensive edge from the minimum spanning tree.

## Implementation Details

Implementation was done in PHP. The edges of the graph with weights and the number of clusters to be formed were given as the input.

The minimum spanning tree using Kruskal’s algorithm was created initially. Each vertex was considered as a separate tree. Each of the minimum edges was added to the minimum spanning tree if it did not create a cycle. The algorithm was implemented until all the vertices were covered.

The minimum spanning tree obtained by applying Kruskal’s algorithm was used to create the required number of clusters. Clusters were formed by removing the most expensive edge from the minimum spanning tree. The number of edges to be removed was one less than the required number of clusters.

The output displayed the input tree, the minimum spanning tree edges and the clusters of vertices. Figure 1 below shows a sample output for creating 3 clusters.

Figure 1: Sample output for creating 3 clusters using Kruskal’s algorithm
## Source code