Determining the Number of Clusters in a Data Set Without Graphical Interpretation
Cluster analysis is a data mining technique that is meant to simplify the process of classifying data points. The basic clustering process requires an input of data points and the number of clusters wanted. The clustering algorithm then picks starting C points for the clusters, which can be either random spatial points or random data points. It then assigns each data point to the nearest C point, where "nearest" usually means Euclidean distance, although some algorithms use another criterion. The next step is determining whether the clustering arrangement thus found is within a certain tolerance. If it falls within this tolerance, the process ends. Otherwise, the C points are adjusted based on how many data points are in each cluster, and the steps repeat until the algorithm converges.
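
The basic clustering loop described in the abstract can be sketched in Python as follows. This is a minimal illustration and not the authors' implementation. It assumes a k-means style update, in which each C point moves to the mean of the data points assigned to it, and it assumes the tolerance test compares how far the C points moved between iterations. The function name `kmeans` and the parameters `tol`, `max_iter`, and `seed` are hypothetical.

```python
import math
import random


def kmeans(points, k, tol=1e-6, max_iter=100, seed=0):
    """Basic clustering loop (k-means style sketch): returns (centers, labels)."""
    rng = random.Random(seed)
    # Pick k distinct data points at random as the starting C points.
    centers = [tuple(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(max_iter):
        # Assign each data point to the nearest C point (Euclidean distance).
        labels = [
            min(range(k), key=lambda j: math.dist(p, centers[j]))
            for p in points
        ]
        # Assumption: move each C point to the mean of its assigned points.
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                mean = tuple(sum(c) / len(members) for c in zip(*members))
                new_centers.append(mean)
            else:
                # An empty cluster keeps its previous C point.
                new_centers.append(centers[j])
        # Assumption: the tolerance test is the largest C point movement.
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift <= tol:
            break
    return centers, labels
```

Called as `kmeans(points, k)` with a list of coordinate tuples, the function returns the final C points and one cluster index per data point. Note that the number of clusters `k` must still be supplied by the caller, which is the input the title of this work is concerned with determining.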
Document ID
20110016534
Acquisition Source
Ames Research Center
Document Type
Presentation
Authors
Aguirre, Nathan S. (Hispanic Coll. Fund, Inc. Washington, DC, United States)
Davies, Misty D. (NASA Ames Research Center Moffett Field, CA, United States)