NTRS - NASA Technical Reports Server

Modeling the Activity of Single Genes
The central dogma of molecular biology states that information is stored in DNA, transcribed to messenger RNA (mRNA) and then translated into proteins. This picture is significantly augmented when we consider the action of certain proteins in regulating transcription. These transcription factors provide a feedback pathway by which genes can regulate one another's expression as mRNA and then as protein.
To review: DNA, RNA and proteins have different functions. DNA is the molecular storehouse of genetic information. When cells divide, the DNA is replicated, so that each daughter cell maintains the same genetic information as the mother cell. RNA acts as a go-between from DNA to proteins. Only a single copy of DNA is present, but multiple copies of the same piece of RNA may be present, allowing cells to make huge amounts of protein. In eukaryotes (organisms with a nucleus), DNA is found in the nucleus only. RNA is transcribed from DNA in the nucleus and then translocates (moves) out of the nucleus. Along the way, the RNA may be spliced, i.e., may have pieces cut out. The RNA then attaches to ribosomes and is translated into proteins. Proteins are the machinery of the cell: other than DNA and RNA, all the complex molecules of the cell are proteins. Proteins are specialized machines, each of which fulfills its own task, such as transporting oxygen, catalyzing reactions, or responding to extracellular signals, to name just a few. One of the more interesting functions a protein may have is binding directly or indirectly to DNA to perform transcriptional regulation, thus forming a closed feedback loop of gene regulation.
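As a rough illustration of the closed feedback loop described above, the following sketch simulates one gene whose protein product represses its own transcription. It is not taken from the report: the two-equation rate model, the Hill-type repression function, the forward-Euler integrator, and every name and parameter value (`k_m`, `g_m`, `k_p`, `g_p`, `K`, `n`) are assumptions chosen only to make the idea concrete.

```python
# Minimal sketch of a single self-repressing gene.
# All names and parameter values are illustrative assumptions, not from the source.


def hill_repression(p, K=50.0, n=2.0):
    """Fraction of maximal transcription when the repressor protein level is p.

    K (level giving half-maximal repression) and n (steepness) are hypothetical.
    """
    return 1.0 / (1.0 + (p / K) ** n)


def simulate(t_end=200.0, dt=0.01, k_m=2.0, g_m=0.2, k_p=1.0, g_p=0.05,
             feedback=True):
    """Forward-Euler integration of mRNA level m and protein level p.

    dm/dt = k_m * f(p) - g_m * m   (transcription minus mRNA decay)
    dp/dt = k_p * m    - g_p * p   (translation minus protein decay)

    With feedback=False the gene is transcribed at its full rate, f = 1.
    Returns the final (m, p) pair.
    """
    m, p = 0.0, 0.0
    steps = int(round(t_end / dt))
    for _ in range(steps):
        f = hill_repression(p) if feedback else 1.0
        dm = k_m * f - g_m * m
        dp = k_p * m - g_p * p
        m, p = m + dt * dm, p + dt * dp
    return m, p


if __name__ == "__main__":
    print("with feedback   (m, p):", simulate(feedback=True))
    print("without feedback (m, p):", simulate(feedback=False))
```

Comparing `simulate(feedback=True)` with `simulate(feedback=False)` shows the effect of the loop: under these assumed values the unregulated gene settles near the protein level `k_m * k_p / (g_m * g_p)`, while the self-repressed gene settles at a lower level.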
The structure of DNA and the central dogma were understood in the 1950s; in the early 1980s it became possible to make arbitrary modifications to DNA and use cellular machinery to transcribe and translate the resulting genes; more recently, the genomes (i.e., the complete DNA sequences) of many organisms have been sequenced. This large-scale sequencing began with simple organisms (viruses and bacteria), progressed to eukaryotes such as yeast, and more recently (1998) progressed to a multicellular animal, the nematode Caenorhabditis elegans. Sequencers have now moved on to the fruit fly Drosophila melanogaster, whose sequence is slated for completion by the end of 1999. The human genome project is expected to determine the complete sequence of all 3 billion bases of human DNA within the next five years. In the wake of genome-scale sequencing, further instrumentation is being developed to assay gene expression and function on a comparably large scale.
Much of the work in computational biology focuses on computational tools used in sequencing, finding genes that are related to a particular gene, finding which parts of the DNA code for proteins and which do not, understanding what proteins will be formed from a given length of DNA, predicting how the proteins will fold from a one-dimensional structure into a three-dimensional structure, and so on. Much less computational work has been done regarding the function of proteins. One reason for this is that different proteins function very differently, and so work on protein function is very specific to certain classes of proteins. There are, for example, enzymes that catalyze various intracellular reactions, receptors that respond to extracellular signals, and ion channels that regulate the flow of charged particles into and out of the cell.
In this chapter, we will consider a particular class of proteins called transcription factors (TFs), which are responsible for regulating when a certain gene is expressed, which cells it is expressed in, and how much is expressed. Understanding how they do so requires a deeper understanding of transcription, translation, and the cellular processes that control them. All of these elements fall under the aegis of gene regulation or, more narrowly, transcriptional regulation. Some of the key questions in gene regulation are: What genes are expressed in a certain cell at a certain time? How does gene expression differ from cell to cell in a multicellular organism? Which proteins act as transcription factors, i.e., are important in regulating gene expression? From questions like these, we hope to understand which genes are important for various macroscopic processes.
Nearly all of the cells of a multicellular organism contain the same DNA. Yet this same genetic information yields a large number of different cell types. The fundamental difference between a neuron and a liver cell, for example, is which genes are expressed. Thus understanding gene regulation is an important step in understanding development. Furthermore, understanding which genes are normally expressed in cells may give important clues about various diseases. Some diseases, such as sickle cell anemia and cystic fibrosis, are caused by defects in single, non-regulatory genes; others, such as certain cancers, are caused when the cellular control circuitry malfunctions. An understanding of the latter diseases will involve pathways of multiple interacting gene products.
There are numerous challenges in the area of understanding and modeling gene regulation. First and foremost, biologists would like to develop a deeper understanding of the processes involved, including which genes and families of genes are important, how they interact, etc.
From a computational point of view, there has been embarrassingly little work done. In this chapter there are many areas in which we can pose meaningful, non-trivial computational questions that have not yet been addressed. Some of these are purely computational (what is a good algorithm for dealing with a model of type X?) and others are more mathematical (given a system with certain characteristics, what sort of model can one use? How does one find biochemical parameters from system-level behavior using as few experiments as possible?). In addition to biological and algorithmic problems, there is also the ever-present issue of theoretical biology: what general principles can be derived from these systems, what can one do with models other than just simulate time-courses, and what can be deduced about a class of systems without knowing all the details? The fundamental challenge to computationalists and theorists is to add value to the biology, that is, to use models, modeling techniques and algorithms to understand the biology in new ways.
Document ID
20000058173
Acquisition Source
Headquarters
Document Type
Other
Authors
Mjolsness, Eric
(Jet Propulsion Lab., California Inst. of Tech. Pasadena, CA United States)
Gibson, Michael
(California Inst. of Tech. Pasadena, CA United States)
Date Acquired
September 7, 2013
Publication Date
April 19, 1999
Subject Category
Life Sciences (General)
Funding Number(s)
CONTRACT_GRANT: N00014-97-1-0422
CONTRACT_GRANT: N00014-97-1-0293
Distribution Limits
Public
Copyright
Work of the US Gov. Public Use Permitted.