NTRS - NASA Technical Reports Server

Decoding the effects of synonymous variants
Synonymous single nucleotide variants (sSNVs) are common in the human genome but are often overlooked. However, sSNVs can have significant biological impact and may lead to disease. Existing computational methods for evaluating the effect of sSNVs suffer from the lack of gold-standard training/evaluation data and exhibit over-reliance on sequence conservation signals. We developed synVep (synonymous Variant effect predictor), a machine learning-based method that overcomes both of these limitations. Our training data was a combination of variants reported by gnomAD (observed) and those unreported, but possible in the human genome (generated). We used positive-unlabeled learning to purify the generated variant set of any likely unobservable variants. We then trained two sequential extreme gradient boosting models to identify subsets of the remaining variants putatively enriched and depleted in effect. Our method attained 90% precision/recall on a previously unseen set of variants. Furthermore, although synVep does not explicitly use conservation, its scores correlated with evolutionary distances between orthologs in cross-species variation analysis. synVep was also able to differentiate pathogenic vs. benign variants, as well as splice-site disrupting variants (SDV) vs. non-SDVs. Thus, synVep provides an important improvement in annotation of sSNVs, allowing users to focus on variants that most likely harbor effects.
Document ID
20230003000
Acquisition Source
2230 Support
Document Type
Accepted Manuscript (Version with final changes)
Authors
Zishuo Zeng
(Rutgers, The State University of New Jersey New Brunswick, New Jersey, United States)
Ariel A. Aptekmann
(Rutgers, The State University of New Jersey New Brunswick, New Jersey, United States)
Yana Bromberg
(Rutgers, The State University of New Jersey New Brunswick, New Jersey, United States)
Date Acquired
March 6, 2023
Publication Date
November 30, 2021
Publication Information
Publication: Nucleic Acids Research
Publisher: Oxford University Press
Volume: 49
Issue: 22
Issue Publication Date: December 16, 2021
ISSN: 0305-1048
e-ISSN: 1362-4962
Subject Category
Life Sciences (General)
Funding Number(s)
CONTRACT_GRANT: 80NSSC18M0093
Distribution Limits
Public
Copyright
Portions of document may include copyright protected material.
Technical Review