NTRS - NASA Technical Reports Server

Fast and Flexible Multivariate Time Series Subsequence Search
Multivariate Time-Series (MTS) are ubiquitous, and are generated in areas as disparate as sensor recordings in aerospace systems, music and video streams, medical monitoring, and financial systems. Domain experts are often interested in searching for interesting multivariate patterns in these MTS databases, which often contain several gigabytes of data. Surprisingly, research on MTS search is very limited. Most of the existing work supports only queries with the same length as the data, or queries on a fixed set of variables. In this paper, we propose an efficient and flexible subsequence search framework for massive MTS databases that, for the first time, enables querying on any subset of variables with arbitrary time delays between them. We propose two algorithms to solve this problem: (1) a List Based Search (LBS) algorithm, which uses sorted lists for indexing, and (2) an R*-tree Based Search (RBS) algorithm, which uses Minimum Bounding Rectangles (MBR) to organize the subsequences. Both algorithms guarantee that all matching patterns within the specified thresholds will be returned (no false dismissals). The very few false alarms can be removed by a post-processing step. Since our framework is also capable of Univariate Time-Series (UTS) subsequence search, we first demonstrate the efficiency of our algorithms on several UTS datasets previously used in the literature. We follow this up with experiments using two large MTS databases from the aviation domain, each containing several million observations. Both sets of tests show that our algorithms have very high prune rates (>99%), thus needing actual disk access for less than 1% of the observations. To the best of our knowledge, MTS subsequence search has never been attempted on datasets of the size we have used in this paper.
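The filter-and-refine idea in the abstract (index with sorted lists, return every true match, then discard false alarms in a post-processing step) can be illustrated with a small sketch. This is not the paper's LBS algorithm: the abstract does not give its details, so the sketch assumes a univariate series, Euclidean distance, a single fixed reference window, and pruning by the triangle inequality. The names `build_index`, `search`, `reference`, and `eps` are hypothetical and chosen only for illustration.

```python
import bisect
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def build_index(series, m, reference):
    """Return a sorted list of (distance to reference, start offset),
    one entry for every length-m window of the series."""
    entries = [
        (euclidean(series[i:i + m], reference), i)
        for i in range(len(series) - m + 1)
    ]
    entries.sort()
    return entries


def search(series, index, query, reference, eps, slack=1e-9):
    """Find all windows within eps of the query.

    Filter: by the triangle inequality, a true match s satisfies
    |d(s, ref) - d(query, ref)| <= eps, so only that slice of the
    sorted list can contain matches (no false dismissals). The small
    slack guards the slice boundaries against floating-point rounding.
    Refine: compute the true distance for each candidate and drop the
    false alarms.
    """
    m = len(query)
    dq = euclidean(query, reference)
    lo = bisect.bisect_left(index, (dq - eps - slack, -1))
    hi = bisect.bisect_right(index, (dq + eps + slack, len(series)))
    candidates = [start for _, start in index[lo:hi]]
    matches = sorted(
        s for s in candidates
        if euclidean(series[s:s + m], query) <= eps
    )
    return matches, len(candidates)


if __name__ == "__main__":
    series = [0, 1, 2, 3, 2, 1, 0, 1, 2, 3, 2, 1, 0]
    reference = [0, 0, 0]
    index = build_index(series, 3, reference)
    matches, n_candidates = search(series, index, [1, 2, 3], reference, 0.1)
    print(matches, n_candidates)
```

In this toy series, 4 of the 11 windows survive the filter and 2 of them (offsets 1 and 7) are true matches; the other two are the reversed pattern `[3, 2, 1]`, which lies at the same distance from the reference and is removed only in the refinement step. On real data the fraction of surviving candidates depends on the choice of reference and threshold.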
Document ID
20100033688
Document Type
Conference Paper
Authors
Bhaduri, Kanishka (Mission Critical Technologies, Inc. Moffett Field, CA, United States)
Oza, Nikunj C. (NASA Ames Research Center Moffett Field, CA, United States)
Zhu, Qiang (California Univ. Riverside, CA, United States)
Srivastava, Ashok N. (NASA Ames Research Center Moffett Field, CA, United States)
Date Acquired
August 25, 2013
Publication Date
July 25, 2010
Subject Category
Mathematical and Computer Sciences (General)
Report/Patent Number
ARC-E-DAA-TN1230
Meeting Information
KDD-2010: 16th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (Washington, DC)
Funding Number(s)
CONTRACT_GRANT: NNA08CG83C
Distribution Limits
Public
Copyright
Public Use Permitted.

Available Downloads

Name
20100033688.pdf
Type
STI