NTRS - NASA Technical Reports Server

Correlation approach to identify coding regions in DNA sequences

Recently, it was observed that noncoding regions of DNA sequences possess long-range power-law correlations, whereas coding regions typically display only short-range correlations. We develop an algorithm based on this finding that enables investigators to perform a statistical analysis on long DNA sequences to locate possible coding regions. The algorithm is particularly successful in predicting the location of lengthy coding regions. For example, for the complete genome of yeast chromosome III (315,344 nucleotides), at least 82% of the predictions correspond to putative coding regions; the algorithm correctly identified all coding regions larger than 3000 nucleotides, 92% of coding regions between 2000 and 3000 nucleotides long, and 79% of coding regions between 1000 and 2000 nucleotides. The predictive ability of this new algorithm supports the claim that there is a fundamental difference in the correlation property between coding and noncoding sequences. This algorithm, which is not species-dependent, can be implemented with other techniques for rapidly and accurately locating relatively long coding regions in genomic sequences.
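
The abstract does not spell out the algorithm, so the following is only a minimal sketch of the kind of correlation measurement it builds on, not the authors' coding-region finder. It assumes a purine/pyrimidine "DNA walk" (+1 for C or T, -1 for A or G) and estimates a scaling exponent alpha from the growth of the root-mean-square fluctuation F(l) with window length l. The step rule, the window lengths, and all function names here are illustrative assumptions. An exponent near 0.5 would indicate no long-range correlation, while a clearly larger value would point to long-range power-law correlation.

```python
import math
import random


def dna_walk(seq):
    """Cumulative walk: +1 for a pyrimidine (C, T), -1 for a purine (A, G).

    The step rule is an assumption made for illustration.
    """
    y = [0]
    for base in seq.upper():
        if base in "CT":
            y.append(y[-1] + 1)
        elif base in "AG":
            y.append(y[-1] - 1)
        else:
            y.append(y[-1])  # ambiguous base: no step
    return y


def fluctuation(y, l):
    """Root-mean-square fluctuation of walk displacements over distance l."""
    deltas = [y[i + l] - y[i] for i in range(len(y) - l)]
    mean = sum(deltas) / len(deltas)
    var = sum((d - mean) ** 2 for d in deltas) / len(deltas)
    return math.sqrt(var)


def scaling_exponent(seq, lengths=(4, 8, 16, 32, 64)):
    """Least-squares slope of log F(l) against log l (window lengths assumed)."""
    y = dna_walk(seq)
    xs, ys = [], []
    for l in lengths:
        if l >= len(y):
            continue  # window longer than the walk
        f = fluctuation(y, l)
        if f > 0:  # log is undefined for a zero fluctuation
            xs.append(math.log(l))
            ys.append(math.log(f))
    if len(xs) < 2:
        raise ValueError("need at least two usable window lengths")
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (v - my) for x, v in zip(xs, ys))
    return sxy / sxx


def random_sequence(n, seed=0):
    """Uncorrelated test sequence with equal base frequencies."""
    rng = random.Random(seed)
    return "".join(rng.choice("ACGT") for _ in range(n))


if __name__ == "__main__":
    alpha = scaling_exponent(random_sequence(20000))
    print(f"alpha for an uncorrelated sequence: {alpha:.2f}")
```

A region scanner along the lines described in the abstract would presumably apply a measurement like this within sliding windows of a long sequence and flag windows whose correlations look short-range; the window size and decision threshold for that step are not given in this record.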
Document ID
20050000336
Acquisition Source
Legacy CDMS
Document Type
Reprint (Version printed in journal)
Authors
Ossadnik, S. M.
(Boston University Massachusetts 02215)
Buldyrev, S. V.
Goldberger, A. L.
Havlin, S.
Mantegna, R. N.
Peng, C. K.
Simons, M.
Stanley, H. E.
Date Acquired
August 22, 2013
Publication Date
July 1, 1994
Publication Information
Publication: Biophysical Journal
Volume: 67
Issue: 1
ISSN: 0006-3495
Subject Category
Life Sciences (General)
Distribution Limits
Public
Copyright
Other
Keywords
Non-NASA Center
NASA Discipline Cardiopulmonary