NTRS - NASA Technical Reports Server

Measuring Constraint-Set Utility for Partitional Clustering Algorithms
Clustering with constraints is an active area of machine learning and data mining research. Previous empirical work has convincingly shown that adding constraints to clustering improves the performance of a variety of algorithms. However, in most of these experiments, results are averaged over different randomly chosen constraint sets from a given set of labels, thereby masking interesting properties of individual sets. We demonstrate that constraint sets vary significantly in how useful they are for constrained clustering; some constraint sets can actually decrease algorithm performance. We create two quantitative measures, informativeness and coherence, that can be used to identify useful constraint sets. We show that these measures can also help explain differences in performance for four particular constrained clustering algorithms.
Document ID
20070017958
Acquisition Source
Jet Propulsion Laboratory
Document Type
Preprint (Draft being sent to journal)
Authors
Davidson, Ian
(State Univ. of New York Albany, NY, United States)
Wagstaff, Kiri L.
(Jet Propulsion Lab., California Inst. of Tech. Pasadena, CA, United States)
Basu, Sugato
(SRI International Corp. Menlo Park, CA, United States)
Date Acquired
August 23, 2013
Publication Date
September 18, 2006
Subject Category
Numerical Analysis
Meeting Information
Meeting: 10th European Conference on Principles and Practice of Knowledge Discovery in Databases
Location: Berlin
Country: Germany
Start Date: September 18, 2006
End Date: September 22, 2006
Funding Number(s)
CONTRACT_GRANT: NSF ITR-03-25329
CONTRACT_GRANT: NBCHD030010
Distribution Limits
Public
Copyright
Other
Keywords
constraints
utility
clustering
machine learning