Classification is a major step in creating maps. Behind every classification mechanism (which may be performed automatically by most GIS software) lies a complex mathematical algorithm.
Depending on which classification algorithm you choose, the product will look totally different, because the data breaks between the classes change. To decide which classification is best, you should understand what happens behind the scenes.
In general, there are two major classification and clustering methodologies: 1. Supervised Classification 2. Unsupervised Classification
+ PRO: Entire feature space can be classified
+ CONTRA: The description of the clusters is determined by a radial shape; starting points must be selected as cluster centers
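The pros and cons above read like those of a centroid-based clustering method such as k-means (the method is not named here, so this is an assumption). A minimal sketch of that idea with scikit-learn, using made-up 2-D feature data:

```python
import numpy as np
from sklearn.cluster import KMeans

# Made-up 2-D feature space (e.g. two spectral bands) with two point clouds.
rng = np.random.default_rng(0)
features = np.vstack([
    rng.normal(loc=(2, 2), scale=0.5, size=(50, 2)),
    rng.normal(loc=(7, 7), scale=0.5, size=(50, 2)),
])

# The number of clusters and the starting centers must be fixed up front
# (the CONTRA above); in return every point of the feature space is assigned
# to a cluster (the PRO above), and clusters come out roughly radial.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = kmeans.fit_predict(features)

print("Cluster centers:\n", kmeans.cluster_centers_)
print("First labels:", labels[:10])
```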
Hierarchical Clustering Hierarchical clustering creates a dendrogram, which is a tree diagram. It shows the Euclidean distance between data points. The dendrogram can be used to decide how many clusters you want to create. You can choose between different possibilities to define the distance between clusters (a code sketch follows the list below):
+ Single link = Minimal Euclidean distance
+ Average link = Average Euclidean distance
+ Complete link = Maximum Euclidean distance
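A minimal sketch of hierarchical clustering with SciPy, using made-up 2-D points; the `method` argument corresponds to the linkage options above ('single', 'average', 'complete'):

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, dendrogram, fcluster

# Made-up 2-D data points; in practice these would be feature vectors of map objects.
points = np.array([[1.0, 1.0], [1.2, 0.9], [5.0, 5.1], [5.2, 4.8], [9.0, 1.0]])

# Build the linkage matrix: 'single', 'average' or 'complete' use the minimal,
# average or maximum Euclidean distance between clusters.
Z = linkage(points, method='single', metric='euclidean')

# Plotting the dendrogram (tree diagram) helps to decide how many clusters to keep:
# dendrogram(Z)  # needs matplotlib to actually draw the plot

# Cut the tree into a chosen number of clusters, e.g. 3.
labels = fcluster(Z, t=3, criterion='maxclust')
print(labels)
```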
Naive Bayes PROBLEM: Classes often overlap. IDEA of the Naive Bayes theorem: maximise the probability of a correct classification for each data point. GIVEN PARAMETERS (a small numerical sketch follows this list):
+ A-priori probability that each data point is related to a specific class
+ Feature probability that one data point within a specific class has a specific feature vector
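A minimal sketch of this idea for a single feature, assuming two hypothetical classes ("water", "forest") with made-up a-priori probabilities and a Gaussian feature model; the point is assigned to the class that maximises prior times feature likelihood:

```python
from scipy.stats import norm

# Hypothetical a-priori probabilities of the classes (made up for illustration).
priors = {"water": 0.3, "forest": 0.7}

# Hypothetical feature model: mean and standard deviation of the feature per class.
feature_model = {"water": (0.2, 0.1), "forest": (0.6, 0.15)}

def classify(x):
    """Return the class that maximises prior * feature likelihood for feature value x."""
    scores = {}
    for cls, prior in priors.items():
        mean, std = feature_model[cls]
        likelihood = norm.pdf(x, loc=mean, scale=std)  # feature probability within the class
        scores[cls] = prior * likelihood               # proportional to the posterior
    return max(scores, key=scores.get), scores

label, scores = classify(0.35)
print(label, scores)
```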
A Voronoi diagram describes the areas that are nearest to a certain data point (within a defined distance). 1. For the k-NN classification you choose an odd number of neighbors k. 2. The algorithm starts at a random point. 3. This starting point gets the value that appears most often in its "neighborhood" (for that reason k should always be odd). A code sketch follows the PRO/CONTRA list below.
+ PRO: Easy concept
+ CONTRA: The distance parameter is often difficult to choose
+ PRO: Non-linear classification possible, fewer parameters required, no a-priori knowledge required, applicable to a large feature space
+ CONTRA: Slow classification
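A minimal sketch of k-NN classification with scikit-learn, using made-up training points and labels; k is chosen odd (k=3) so that the majority vote in the neighborhood cannot tie:

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Made-up training data: 2-D feature vectors with known class labels.
X_train = np.array([[1, 1], [1, 2], [2, 1], [8, 8], [8, 9], [9, 8]])
y_train = np.array(["A", "A", "A", "B", "B", "B"])

# k should be odd so the majority vote among the neighbors is unambiguous.
knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X_train, y_train)

# A new point gets the label that appears most often among its 3 nearest neighbors.
print(knn.predict([[2, 2], [7, 9]]))  # -> ['A' 'B']
```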
AdaBoost combines multiple weak classifiers to build one strong classifier.
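A minimal sketch of AdaBoost with scikit-learn, using made-up sample data; the default weak classifier is a depth-1 decision tree (a decision stump), and boosting combines 50 of them into one strong classifier:

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier

# Made-up training data: 2-D feature vectors and binary class labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# Each weak classifier alone is only slightly better than guessing; AdaBoost
# reweights misclassified points after every round and combines the weak
# classifiers into one strong classifier.
model = AdaBoostClassifier(n_estimators=50, random_state=0)
model.fit(X, y)
print("Training accuracy:", model.score(X, y))
```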