|
Cluster analysis
|
|
There are many forms of cluster analysis. The form most commonly applied to grid data is known as hierarchical cluster analysis. It is hierarchical in the sense that, at the lowest level, every element¹ is deemed to be in its own cluster. In the first step, the similarity (which may be any kind of measure, such as distances or correlations) between each pair of clusters is considered and the most similar pair of clusters is merged (to form a cluster containing two elements). The second step is to compute² the similarity between this new cluster and the other (single-element) clusters, and again the most similar pair is chosen to form the next cluster. Here the first cluster (of two elements) is considered along with the other clusters (of one element). The third step is to form a new cluster from the next most similar pair of clusters. This continues until the clusters have gradually been merged into one cluster. There is one fewer step or level in the hierarchical clustering than there are elements. The process is often represented by connecting lines in what is known as a dendrogram. Cluster analysis is the basis of the FOCUS procedure.
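The merging steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the FOCUS implementation: the function name `agglomerate`, its `linkage` parameter, and the four-element distance table are all hypothetical, and the sketch works with distances, so "most similar" means "smallest distance".

```python
from itertools import combinations


def agglomerate(dist, linkage=min):
    """Merge the closest pair of clusters repeatedly; return the merge history.

    dist    -- symmetric table of distances between elements (an assumption:
               any similarity measure could be used with the comparison reversed)
    linkage -- min for single-linkage, max for complete-linkage
    """
    # At the lowest level every element is in its own cluster.
    clusters = [(i,) for i in range(len(dist))]
    merges = []

    def between(a, b):
        # Distance between two clusters, taken over all cross-cluster pairs.
        return linkage(dist[i][j] for i in a for j in b)

    while len(clusters) > 1:
        # Choose the most similar (closest) pair of clusters and merge it.
        a, b = min(combinations(clusters, 2), key=lambda pair: between(*pair))
        merges.append((a, b, between(a, b)))
        clusters = [c for c in clusters if c not in (a, b)]
        clusters.append(tuple(sorted(a + b)))
    return merges


# Hypothetical distances between four grid elements (illustrative values only).
dist = [
    [0, 2, 6, 10],
    [2, 0, 5, 9],
    [6, 5, 0, 4],
    [10, 9, 4, 0],
]

merges = agglomerate(dist)
for a, b, d in merges:
    print(a, b, d)
```

With four elements the sketch records three merges, one fewer than the number of elements. The recorded merge levels are the heights at which the connecting lines of a dendrogram would join.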
¹ Wherever "elements" appears, "constructs" could be substituted.
² There are many methods of hierarchical clustering. The methods differ chiefly with respect to how the similarity between a recently merged pair and the other clusters is computed. Two simple ways are either to take the similarity of the more similar member of the merged pair (often called ‘single-linkage’ or nearest-neighbour, and used by FOCUS) or to take the similarity of the less similar member of the merged pair (often called ‘complete-linkage’ or furthest-neighbour). There are also various ways of taking averages. More detail about these can be found in Everitt, Landau, and Leese (2001).
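The linkage rules in the note above can be compared directly. The following is a minimal sketch, assuming distances rather than similarities (so the "more similar" member is the one with the smaller distance); the function name `merged_distance` and its parameters are hypothetical, and the size-weighted mean shown is only one of the various ways of taking averages.

```python
def merged_distance(d_ac, d_bc, size_a=1, size_b=1, method="single"):
    """Distance from a newly merged pair (A, B) to another cluster C.

    d_ac, d_bc     -- distances from A and from B to C before the merge
    size_a, size_b -- number of elements in A and in B (used for averaging)
    """
    if method == "single":
        # Nearest-neighbour: take the more similar (closer) member.
        return min(d_ac, d_bc)
    if method == "complete":
        # Furthest-neighbour: take the less similar (more distant) member.
        return max(d_ac, d_bc)
    if method == "average":
        # One averaging variant: mean weighted by cluster size.
        return (size_a * d_ac + size_b * d_bc) / (size_a + size_b)
    raise ValueError(f"unknown method: {method}")


# Illustrative values: A is 6 units from C, and B is 5 units from C.
single = merged_distance(6, 5, method="single")
complete = merged_distance(6, 5, method="complete")
average = merged_distance(6, 5, method="average")
print(single, complete, average)
```

For the same merged pair, the three rules give different distances to the remaining cluster, which is why the choice of method can change the order of later merges.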
|
References
|
|
- Everitt, B. S., Landau, S., and Leese, M. (2001). Cluster analysis (4th edition). London: Arnold.
|
|
Richard C. Bell
|
|
|