NTRS - NASA Technical Reports Server

Feature Acquisition with Imbalanced Training Data

This work considers cost-sensitive feature acquisition, in which an agent attempts to classify a candidate datapoint from incomplete information. The agent acquires features of the datapoint using one or more costly diagnostic tests and eventually assigns a classification label. A cost function describes the penalties for both feature acquisition and misclassification errors. A common solution is a Cost-Sensitive Decision Tree (CSDT), a branching sequence of tests with features acquired at interior decision points and class assignment at the leaves. CSDTs can incorporate a wide range of diagnostic tests and can reflect arbitrary cost structures. They are particularly useful for online applications due to their low computational overhead.

In this innovation, CSDTs are applied to cost-sensitive feature acquisition where the goal is to recognize very rare or unique phenomena in real time. Example applications from this domain fall into four areas. In stream processing, one seeks unique events in a real-time data stream that is too large to store. In fault protection, a system must adapt quickly to react to anticipated errors by triggering repair activities or follow-up diagnostics. With real-time sensor networks, one seeks to classify unique, new events as they occur. In the observational sciences, a new generation of instrumentation seeks unique events through online analysis of large observational datasets.

This work presents a solution based on transfer learning principles that permits principled CSDT learning while exploiting any prior knowledge of the designer to correct both between-class and within-class imbalance. Training examples are adaptively reweighted based on a decomposition of the data attributes. The result is a new, nonparametric representation that matches the anticipated attribute distribution for the target events.
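
The reweighting idea described above can be illustrated with a minimal sketch. This is not the authors' algorithm, which the abstract does not specify in detail; it is a simple histogram-based importance weighting used as a stand-in, assuming a single discretized attribute. The function names (`reweight`, `weighted_mass`), the class labels, the attribute bins, and all numeric values are hypothetical. Each example receives a weight proportional to the ratio of the designer's anticipated joint probability to the empirical joint frequency, which corrects the class proportions (between-class imbalance) and the attribute distribution inside each class (within-class imbalance) at the same time.

```python
from collections import Counter


def reweight(examples, target_class_prior, target_attr_dist):
    """Return one normalized weight per training example.

    examples: list of (label, attr_bin) pairs (hypothetical toy format).
    target_class_prior: dict label -> anticipated class probability.
    target_attr_dist: dict label -> dict attr_bin -> anticipated
        probability of that bin within the class.

    Assumption: every (label, attr_bin) cell with nonzero anticipated
    probability appears at least once in the training data.
    """
    n = len(examples)
    joint_counts = Counter(examples)
    raw = []
    for label, attr_bin in examples:
        empirical = joint_counts[(label, attr_bin)] / n
        target = target_class_prior[label] * target_attr_dist[label][attr_bin]
        raw.append(target / empirical)  # importance ratio for this example
    total = sum(raw)
    return [w / total for w in raw]


def weighted_mass(examples, weights, predicate):
    """Total weight of the examples that satisfy predicate."""
    return sum(w for e, w in zip(examples, weights) if predicate(e))


# Toy training set: the "rare" class is only 20% of the data.
examples = (
    [("common", "low")] * 6
    + [("common", "high")] * 2
    + [("rare", "low")] * 1
    + [("rare", "high")] * 1
)

# Hypothetical designer knowledge about the target events.
target_class_prior = {"common": 0.5, "rare": 0.5}
target_attr_dist = {
    "common": {"low": 0.75, "high": 0.25},
    "rare": {"low": 0.2, "high": 0.8},
}

weights = reweight(examples, target_class_prior, target_attr_dist)
rare_mass = weighted_mass(examples, weights, lambda e: e[0] == "rare")
rare_high_mass = weighted_mass(examples, weights, lambda e: e == ("rare", "high"))
print(round(rare_mass, 3), round(rare_high_mass, 3))
```

Weights of this kind could then be supplied as per-example weights to any tree learner that accepts them, so that split selection and leaf labeling reflect the anticipated distribution of target events instead of the imbalanced training sample.
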
Document ID: 20110012250
Acquisition Source: Jet Propulsion Laboratory
Document Type: Other - NASA Tech Brief
Authors:
Thompson, David R. (California Inst. of Tech. Pasadena, CA, United States)
Wagstaff, Kiri L. (California Inst. of Tech. Pasadena, CA, United States)
Majid, Walid A. (California Inst. of Tech. Pasadena, CA, United States)
Jones, Dayton L. (California Inst. of Tech. Pasadena, CA, United States)
Date Acquired: August 25, 2013
Publication Date: March 1, 2011
Publication: NASA Tech Briefs, March 2011
Subject Category: Man/System Technology And Life Support
Report/Patent Number: NPO-47562
Distribution Limits: Public
Copyright: Public Use Permitted.