<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Starclouds &#187; Genomics</title>
	<atom:link href="http://mmwaldrop.com/Starclouds/category/genomics/feed/" rel="self" type="application/rss+xml" />
	<link>http://mmwaldrop.com/Starclouds</link>
	<description></description>
	<lastBuildDate>Sat, 19 Jan 2008 15:06:14 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.4</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>An Overview of Systems Biology</title>
		<link>http://mmwaldrop.com/Starclouds/2007/11/19/an-overview-of-systems-biology/</link>
		<comments>http://mmwaldrop.com/Starclouds/2007/11/19/an-overview-of-systems-biology/#comments</comments>
		<pubDate>Mon, 19 Nov 2007 19:57:26 +0000</pubDate>
		<dc:creator>Mitch</dc:creator>
				<category><![CDATA[Genomics]]></category>
		<category><![CDATA[Health care]]></category>
		<category><![CDATA[Systems Biology]]></category>
		<category><![CDATA[Technology]]></category>
		<category><![CDATA[bioinformatics]]></category>
		<category><![CDATA[Collins]]></category>
		<category><![CDATA[DNA]]></category>
		<category><![CDATA[Hood]]></category>
		<category><![CDATA[IT]]></category>
		<category><![CDATA[metabolic networks]]></category>
		<category><![CDATA[NRC]]></category>
		<category><![CDATA[proteins]]></category>
		<category><![CDATA[proteomics]]></category>
		<category><![CDATA[RNA]]></category>
		<category><![CDATA[signal transduction networks]]></category>

		<guid isPermaLink="false">http://mmwaldrop.com/Starclouds/2007/11/19/an-overview-of-systems-biology/</guid>
		<description><![CDATA[Back in 2003, the National Research Council commissioned me to write a chapter about &#8220;systems biology&#8221; for a report they were doing on the relation between biology and information technology. Since the report, which eventually appeared as Catalyzing Inquiry at the Interface of Computing and Biology (2005), was radically reorganized after my assignment was done, [...]]]></description>
			<content:encoded><![CDATA[<p>Back in 2003, the National Research Council commissioned me to write a chapter about &#8220;systems biology&#8221; for a report they were doing on the relation between biology and information technology. Since the report, which eventually appeared as <a href="http://www.nap.edu/catalog.php?record_id=11480">Catalyzing Inquiry at the Interface of Computing and Biology (2005)</a>, was radically reorganized after my assignment was done, and since my text wound up being scattered, I thought I would post the original version here. It&#8217;s a little dated, and none of the references are recent, but it still gives a pretty good overview of the issues. Enjoy&#8230;<span id="more-47"></span></p>
<h1>Systems Biology</h1>
<p>On 14 April  2003, not quite 50 years to the day after James Watson and Francis Crick first published the structure of the DNA double helix,<a href="#_ftn1" title="_ftnref1" name="_ftnref1">[1]</a> officials announced that the Human Genome Project was finished. <a href="#_ftn2" title="_ftnref2" name="_ftnref2">[2]</a> After thirteen years and $2.7 billion, the international effort had finally given us a virtually complete listing of the human genetic code: a sequence some 3 billion base-pairs long.<a href="#_ftn3" title="_ftnref3" name="_ftnref3">[3]</a> Along the way, moreover, scientists had begun to compile similar genetic sequences for a rapidly expanding list of other organisms, from bacteria to fruit flies to mice, thereby laying the foundations for a new science of comparative genomics.<a href="#_ftn4" title="_ftnref4" name="_ftnref4">[4]</a> They had begun to map out individual variations in the genetic code, thereby laying the foundations for a new practice of genomic medicine, in which physicians would be able to calibrate each individual patient&#8217;s disorder, and devise treatments for it, with molecular precision. And they had begun to open up a Pandora&#8217;s box of potentially explosive social issues, ranging from the possibility of genetic discrimination, to the role of genetics as a determinant of race, ethnicity, and human behavior. In short, as genome project director Francis Collins and his colleagues declared a few weeks later in the journal <em>Nature</em>, the completion was &#8220;a landmark event.&#8221;</p>
<p>But now, they added, as they outlined a research agenda for turning the visions into reality,<a href="#_ftn5" title="_ftnref5" name="_ftnref5">[5]</a> the <em>real</em> work begins.</p>
<p>After all, knowing the complete sequence of base pairs in the human genome is a bit like knowing the complete sequence of <em>1</em>s and <em>0</em>s that make up a computer program: by itself, that information doesn&#8217;t tell you anything whatsoever about what the program does, or how it&#8217;s organized into functional units such as subroutines. In the case of DNA, it&#8217;s true, biologists have devised algorithms that can go through the sequence and (sometimes) identify regions that comprise individual genes. And it&#8217;s also true that (some of) those individual genes encode the instructions for making protein molecules: the &#8220;nanomachines&#8221; that serve as enzymes, transporters, gateways, and structural building blocks, and that fill myriad other roles in the cell. But the fact remains that few, if any, biological functions can be assigned to a single gene or a single protein. A cell&#8217;s metabolism, its response to chemical signals from the outside, its cycle of growth and cell division-all these functions and more are carried out and controlled by elaborate webs of interacting molecules. Indeed, in what has to be the most astonishing display of self-reference in nature, the protein products of the genome actually react back on the DNA (and with each other) to regulate their own creation. Understanding these networks-&#8220;systems [that] are far more complex than any problem that molecular biology, genetics or genomics has yet approached,&#8221; as Collins and his coauthors put it-is critical to realizing genomics&#8217; promise.<a href="#_ftn6" title="_ftnref6" name="_ftnref6">[6]</a></p>
<p>Taking on the challenge of &#8220;systems biology,&#8221; as it&#8217;s come to be called, promises to be a major opportunity for cross-fertilization between IT and biology. There are many reasons for that, as discussed below. But perhaps the most fundamental is the simple fact that biology and computer science are the only two disciplines that have at their core the concept of information.</p>
<h2>Biology as an Information Science</h2>
<p>If anything, having a full listing of all 3 billion base pairs in the human genetic code has only made the genome more mysterious. Roughly half of it consists of highly repetitive sequences that don&#8217;t seem to be encoding much of anything. And most of the rest consists of <em>non</em>-repetitive sequences that seem equally useless. Why these sequences are there remains an open question. But in the meantime, most or all of the biological action appears to be confined to the tiny fraction that&#8217;s left over. It&#8217;s in this fraction that we find the two fundamental types of information in the genome: coding sequences and regulatory sequences.</p>
<h3>Coding Sequences: Molecular Blueprints</h3>
<p>The first type features DNA in its classic role as a blueprint for nature&#8217;s nanomachines, the proteins. The sequences here are organized into functional units-genes-that follow the fundamental dogma described in every basic biology textbook: one gene, one protein. The encoding relies on a three-letter code, in which each triplet of bases picks out one of 20 molecular building blocks known as &#8220;amino acids.&#8221; The full sequence of triplets within a single gene thus determines a corresponding sequence of amino acids, which will eventually be linked together to make the protein like so many beads on a string.</p>
<p>Among the early surprises of the genome project was that these blueprint sequences comprise no more than 1-2% of human DNA.<a href="#_ftn7" title="_ftnref7" name="_ftnref7">[7]</a> Another surprise was that the human sequences encode no more than 30,000 to 40,000 different proteins in toto-an almost humiliatingly small number, considering that the simple nematode worm <em>C. elegans</em> has roughly 20,000. In any case, only about a third to a half of the proteins encoded in the genome are actually manufactured in any given cell type (a muscle cell, say, or a liver cell); the genes that specify the rest are suppressed by the regulatory apparatus discussed below.</p>
<p>Also included under the DNA-as-blueprint category are a few thousand genes that encode various types of RNA molecules. RNA is a kind of single-strand version of DNA, right down to the information-encoding bases. It plays a variety of roles in the cell, the most famous being in the protein synthesis process that is also described in every biology textbook. In outline, that process begins with <em>transcription</em>, in which the content of a given gene is copied from the DNA to a strand of  RNA.  This &#8220;messenger&#8221; RNA, or mRNA, then moves out into the cell cytoplasm, where it encounters an RNA-based structure known as the ribosome.<a href="#_ftn8" title="_ftnref8" name="_ftnref8">[8]</a> The ribosome grabs onto one end of the mRNA and begins to move along its length like the read-write head of a videotape player scanning the tape. As it goes, it carries out the <em>translation</em> portion of protein synthesis, reading each triplet code in turn and creating a corresponding chain of amino acids: the growing protein molecule. (Each amino acid is brought to the ribosome by a special &#8220;transfer&#8221; RNA.) When the ribosome reaches the end of the mRNA, the protein is complete.</p>
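<p>The triplet code and the translation step described above can be sketched in a few lines of Python. This is only an illustration, not something taken from the original report: the function name <em>translate</em>, the one-letter amino acid abbreviations, and the example sequence are assumptions introduced here, and the lookup table is the standard genetic code written out in compact form. For simplicity, the sketch reads the DNA coding strand directly rather than an mRNA copy, and it stops at the first stop triplet.</p>
<pre>
```python
from itertools import product

# Standard genetic code, listed with bases in the order T, C, A, G.
# "*" marks the three stop triplets (TAA, TAG, TGA).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(triplet): amino_acid
    for triplet, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna, frame=0):
    """Read `dna` three bases at a time, starting at offset `frame`,
    and return the amino acid chain as one-letter codes."""
    protein = []
    for i in range(frame, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":  # stop triplet: the chain is complete
            break
        protein.append(amino_acid)
    return "".join(protein)


print(translate("ATGGCCTGGTAA"))
```
</pre>
<p>Here the call reads four triplets (ATG, GCC, TGG, TAA) and returns the chain MAW, because TAA is a stop signal. The sketch assumes the input contains only the letters A, C, G, and T; any other character would raise a lookup error.</p>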
<h3>Regulatory Sequences: Biological Processes and the Cellular Operating System</h3>
<p>In addition to the 1-2% of the genome that contains coding sequences, there is a roughly equal percentage that appears to be under considerable selection pressure.<a href="#_ftn9" title="_ftnref9" name="_ftnref9">[9]</a> That is, similar sequences are found in the genomes of mice and other organisms, suggesting that these particular stretches of DNA are too critical to our survival to change very rapidly over the course of evolution. Relatively little is known about these sequences for sure, although some of them are known to be involved in the basic mechanics of the chromosomes themselves. (Examples include the highly repetitive DNA in the &#8220;telomeres,&#8221; which are structures that cap off the ends of the chromosomes, and the special DNA sites that tether the chromosomes to the membrane of the cell&#8217;s nucleus.) Most likely, however, these sequences contain the bulk of the cell&#8217;s regulatory information.</p>
<p>Although there is much still to learn about DNA regulation,<a href="#_ftn10" title="_ftnref10" name="_ftnref10">[10]</a> the research done to date suggests that a protein- or RNA-encoding gene will typically be controlled by several short stretches of DNA.<a href="#_ftn11" title="_ftnref11" name="_ftnref11">[11]</a> Often these sites are located just outside the coding region of the gene, near the beginning, but in principle they could be anywhere. Under the right circumstances, a specialized protein will come in and bind to each regulator sequence, latching right onto the DNA. The presence of that protein will either encourage the cell to start &#8220;expressing&#8221; the gene-transcribing it into mRNA-or block the cell from doing so. The proteins are accordingly known as &#8220;transcription factors,&#8221; while the binding sites are known as &#8220;promoters&#8221; or &#8220;suppressers,&#8221; respectively. The resulting push-pull system allows the cell to shift the balance between expression and non-expression of the gene with exquisite precision, depending on which transcription factors bind, and when.<a href="#_ftn12" title="_ftnref12" name="_ftnref12">[12]</a></p>
<p>The transcription factors, in turn, are part of (and are controlled by) the vast web of molecular interactions that comprise a kind of operating system for the cell. In purely biochemical terms, it&#8217;s true, the details of the reactions can be exceedingly intricate. A protein might interact with an enzyme that adorns it with a phosphate group, for example, or a sugar molecule, or any of a variety of other appendages, which will then change its shape and activity level. Or several proteins might come together to form a multiprotein complex, and so on. Nonetheless, at a more abstract level, many of these molecular participants can be thought of as carrying information from one reaction to the next,<a href="#_ftn13" title="_ftnref13" name="_ftnref13">[13]</a> in somewhat the same way that data structures carry information from one computational process to the next. Just as various software modules are specialized for different tasks, moreover, the web of cellular interactions can be seen as a multitude of specialized sub-networks. Signal transduction pathways, for example, are the cascades of reactions that get triggered when the cell encounters some stimulus from the outside.<a href="#_ftn14" title="_ftnref14" name="_ftnref14">[14]</a> A metabolic network is the web of reactions by which a cell processes food molecules. The cell cycle is the network of reactions and genetic regulation that controls when and how a cell divides. Specialized or not, however, these sub-networks and many others overlap strongly, funneling information back and forth to one another, and to the genetic regulatory apparatus, thus allowing the cell to maintain itself, sense the outside world, and respond like the living thing that it is.</p>
<h2>Challenges and Opportunities</h2>
<p>The broad outlines of this picture were first sketched more than 40 years ago, as scientists in the late 1950s and early 1960s worked out the fundamentals of the genetic code, protein synthesis, and genomic regulation.<a href="#_ftn15" title="_ftnref15" name="_ftnref15">[15]</a> Indeed, their findings even inspired a brief vogue for &#8220;systems biology,&#8221; which in those days meant mathematical models and computer simulations based on such then-fashionable ideas as cybernetics and General Systems Theory.<a href="#_ftn16" title="_ftnref16" name="_ftnref16">[16]</a> That initial burst of enthusiasm waned fairly quickly, as it became clear that there wasn&#8217;t enough data to keep the mathematical abstractions tethered to experiment. But by the turn of the millennium, enthusiasm for a new and more modern form of systems biology was running strong again. Not only were we now in an age of abundant data, thanks in large part to the Human Genome Project, but more and more biologists were embracing the systems approach as one of the inevitable next steps for the post-genome era.<a href="#_ftn17" title="_ftnref17" name="_ftnref17">[17]</a></p>
<p>The challenge, in a nutshell, is to understand the cellular information processing system-<em>all</em> of it-from the genome on up. Some of the key issues:</p>
<ul type="disc">
<li>What      is the complete inventory of proteins in any given cell (a subfield often      known as &#8220;proteomics&#8221;<a href="#_ftn18" title="_ftnref18" name="_ftnref18">[18]</a>)?      How do these individual protein molecules organize themselves into      functional sub-networks-and how do these sub-networks then organize      themselves into higher- and higher-level networks?<a href="#_ftn19" title="_ftnref19" name="_ftnref19">[19]</a>      What are the functional design principles of these systems? And how,      precisely, do the products of the genome react <em>back</em> on the genome to control their own creation?</li>
<li>How do      these dynamically self-organizing networks vary over the course of the      cell cycle, and as the cell responds to its surroundings? How do they      encode and process information? And what accounts for life&#8217;s <em>robustness</em>-the ability of these      networks to adapt, maintain themselves, and recover from a wide variety of      environmental insults?<a href="#_ftn20" title="_ftnref20" name="_ftnref20">[20]</a></li>
<li>How      do the networks organize and reorganize themselves over the course of embryonic      development, as each cell decides whether its progeny are going to become      skin, muscle, brain, or whatever?<a href="#_ftn21" title="_ftnref21" name="_ftnref21">[21]</a>      Then, once the cells are done differentiating, how do the networks actually      vary from one cell type to the next? What constitutes the difference? And      what happens to the networks as cells age, or are damaged? How do flaws in      the networks manifest themselves as maladies such as cancer?</li>
<li>How      do the networks vary between individuals? How do those variations account      for differences in morphology and behavior? And-especially in humans-how      do those variations account for individual differences in the response to      drugs and other therapies?</li>
<li>How      do the networks vary between species? Or to put it another way, how have      they changed over the course of evolution? Since the &#8220;blueprint&#8221; genes for      proteins and RNA seem to be quite highly conserved from one species to the      next, is it possible that most of evolution is the result of rearrangements      in the genetic regulatory system?<a href="#_ftn22" title="_ftnref22" name="_ftnref22">[22]</a></li>
</ul>
<p>To call this challenge &#8220;immense&#8221; would be an understatement; a full accounting of the cellular regulatory networks in every cell type, in multiple species, and over all time-scales, would dwarf the Human Genome Project by many orders of magnitude. Nonetheless, scientists are already organizing themselves to tackle the problem-or at least, significant pieces of it. Among the major initiatives are the Alliance for Cellular Signaling,<a href="#_ftn23" title="_ftnref23" name="_ftnref23">[23]</a> a university-industry consortium organized by Nobel laureate Alfred Gilman of the University of Texas, Southwestern; the Institute for Systems Biology,<a href="#_ftn24" title="_ftnref24" name="_ftnref24">[24]</a> a not-for-profit research foundation created in Seattle by Leroy Hood, a pioneer of rapid genome sequencing technology; the Caltech-ERATO-Kitano Systems Biology Workbench Project, a U.S.-Japanese collaboration devoted to computer modeling of biological systems; the U.S. Department of Energy&#8217;s Genomes to Life Program,<a href="#_ftn25" title="_ftnref25" name="_ftnref25">[25]</a> which focuses on identifying the proteins and characterizing the gene regulatory networks in microbial communities, with an eye towards energy production, global change mitigation, and environmental cleanup; and the National Cancer Institute Director&#8217;s Challenge: Toward a Molecular Classification of Cancer.<a href="#_ftn26" title="_ftnref26" name="_ftnref26">[26]</a></p>
<p>This listing could be extended almost indefinitely; indeed, there seem to be very few university departments, biotech firms, or pharmaceutical companies that <em>haven&#8217;t</em> made at least some sort of investment in systems biology. But one common thread in all these initiatives is the critical importance of information technology. Indeed, biologists and information technologists have already created a flourishing cross-discipline known as bioinformatics, which encompasses a wide variety of techniques for archiving biological information and then &#8220;mining&#8221; it to reveal hidden patterns. Today, for example, it&#8217;s routine for biologists to run genomic data through gene-finding software to identify coding sequences-and from there, pipe the results into search programs such as BLAST and HMMer, which go through archives of previously annotated data to find proteins or protein families with similar sequences. With that information, in turn, they can often predict the function of  their newly identified proteins.<a href="#_ftn27" title="_ftnref27" name="_ftnref27">[27]</a></p>
<p>And yet, as powerful as such capabilities are, our current generation of bioinformatics tools is only the beginning. &#8220;The heterogeneity, complexity, and dynamic nature of [the data in systems biology] present computer science demands unlike those of any scientific domain before,&#8221; noted the organizers of a recent Department of Energy workshop<a href="#_ftn28" title="_ftnref28" name="_ftnref28">[28]</a> on the systems biology-IT interface. Some of the key challenges:</p>
<h3>Decoding the Genome</h3>
<p>In the National Human Genome Research Institute&#8217;s recently published agenda for research in the post-genome era, Francis Collins and his co-authors repeatedly emphasized how little biologists understand about the data they&#8217;ve already got. They are a very long way from knowing everything there is to know about how genes are structured and regulated, for example-and they are virtually without a clue as to what&#8217;s going on in the other, non-coding 95% of the genome. That&#8217;s why the agenda&#8217;s very first Grand Challenge was to systematically endow that data with meaning-that is, to &#8220;comprehensively identify the structural and functional components encoded in the human genome.&#8221;<a href="#_ftn29" title="_ftnref29" name="_ftnref29">[29]</a></p>
<p>The effort to meet this grand challenge may very well produce some scientific surprises, along the lines of, say, &#8220;non-coding&#8221; sequences that confer biological function in some new and unexpected way. But that effort will definitely produce a demand for major improvements in data-handling technology. Ideally, for example, the data technology for systems biology should be-</p>
<ul class="unIndentedList">
<li> <strong>Scalable.</strong> The first complete sequencing of the human genome took thirteen years and $2.7 billion. But along the way, that effort gave an enormous boost to the technology of automated gene sequencing. Today, advanced sequencers can analyze DNA at the rate of some 1.5 million base pairs per day; soon, if development goes the way researchers hope, it may be possible to sequence any given individual&#8217;s entire 3 billion-base-pair genome within 24 hours, for a cost of a few thousand dollars.<a href="#_ftn30" title="_ftnref30" name="_ftnref30">[30]</a> The same technologies will be applicable to other species, as well. The result promises to be archives of genomic data that will grow even more explosively than they are growing already. And that, in turn, implies that the architecture, algorithms, and hardware of the genome archives will have to be scalable-meaning that the system will still be able to store and retrieve information efficiently no matter how large those archives get. In particular, it would be very helpful to have algorithms that could do a better and more accurate job of identifying genes and regulatory regions, with less need for humans to proofread the results.<a href="#_ftn31" title="_ftnref31" name="_ftnref31">[31]</a> Such algorithms might be based on advanced pattern recognition techniques, for example, or sophisticated heuristic reasoning.</li>
<li> <strong>Extensible.</strong> Today, when biologists archive a newly discovered gene sequence in, say, GenBank, they have various types of annotation software at their disposal to link it with explanatory data, such as how and by whom the sequence was identified, as well as the function of the protein or RNA it encodes. But next-generation annotation systems will have to do this for many other genome features, such as transcription factor binding sites and single nucleotide polymorphisms (SNPs), that most of today&#8217;s systems don&#8217;t cover at all. Indeed, these systems will have to be able to create, annotate, and archive models of entire metabolic, signaling, and genetic pathways. At the same time, moreover, they will have to deal with entirely new kinds of data. In recent years, for example, there has been a widespread deployment of DNA microarrays, which can assay any given type of cell and measure the activity level of hundreds or thousands of genes at once: are they or are they not being expressed, and by how much?<a href="#_ftn32" title="_ftnref32" name="_ftnref32">[32]</a> Even more recently, there has been a parallel deployment of protein microarrays, which can identify protein-protein (and protein-drug) interactions among some 10,000 proteins at once.<a href="#_ftn33" title="_ftnref33" name="_ftnref33">[33]</a> And of course, there are many promising technologies still in the laboratories.<a href="#_ftn34" title="_ftnref34" name="_ftnref34">[34]</a> The upshot is that next-generation annotation systems will have to be built in a highly modular and open fashion, so that they can accommodate new capabilities and new data types without anyone&#8217;s having to rewrite the basic code.</li>
<li> <strong>Distributed.</strong> Given the scope of systems biology, the number of researchers in the field, and the variety of experimental tools being deployed, it seems highly unlikely that all the relevant information will ever be gathered together in one giant data warehouse. Systems biologists will almost inevitably be confronted with archives that are stored in many different locations, in many different formats, and with many different owners. Already, for example, there is a certain inconsistency among genome annotations, simply because biologists have no standardized vocabulary for expressing the relationships. A group known as the Gene Ontology Consortium<a href="#_ftn35" title="_ftnref35" name="_ftnref35">[35]</a> is developing such a vocabulary. But even if the consortium&#8217;s work is universally accepted, there will still be vast swaths of legacy annotations that don&#8217;t conform. In any case, the trick is to devise database technology that can access these archives anyway, while hiding the complexities and inconsistencies, and making it seem to the user as if all of the archives <em>were</em> in a single warehouse.</li>
<li> <strong>Visualizable.</strong> Biological processes can take place over a vast array of spatial scales, from the nano-scape inhabited by individual molecules, to our everyday, meter-sized human world. They can take place over an even vaster range of time scales, from the nanosecond gyrations of a folding protein molecule to the seven (or so)-decade span of a human life-and far beyond, if we include evolutionary time. And they can be considered at many levels of organization, from the straightforward realm of chemical interaction to the abstract realm of, say, signal transduction and information processing. Yet systems biology has to deal with these processes at every level and at every scale. Thus the need for cutting-edge information visualization systems. Such a system would offer vivid and easily understood visual metaphors to display the information at each level, showing just the right amount of detail. (Such a display would be analogous to, say, a circuit diagram, with its widely recognized icons for diodes, transistors, and other such components.) The system would likewise offer easy and intuitive ways to navigate between levels, so that the user could drill down to get more detail, or pop up to higher abstractions as needed. And it would offer good ways to visualize the dynamical behavior of the system over time-whatever the appropriate time scale might be. Current-generation visualization systems such as BioSPICE<a href="#_ftn36" title="_ftnref36" name="_ftnref36">[36]</a> and Cytoscape<a href="#_ftn37" title="_ftnref37" name="_ftnref37">[37]</a> are a good beginning. But, as their developers themselves are the first to admit, only a beginning.</li>
</ul>
<p>Of course, none of these issues are unique to biology. Scalable, extensible, distributed database architectures are critically important in the corporate sector, as well-as are information visualization capabilities-and the computer industry has put a lot of effort into providing them. To cope with the multiple-archives problem, for example, IBM has developed a &#8220;federated&#8221; architecture, which does indeed allow users to submit queries and receive answers without worrying about exactly where (and in which format) the data resides.<a href="#_ftn38" title="_ftnref38" name="_ftnref38">[38]</a> To provide for modular, open software, meanwhile, the industry has recently begun to coalesce around the idea of web services, or the closely related grid protocols.<a href="#_ftn39" title="_ftnref39" name="_ftnref39">[39]</a> And, of course, developers can draw on at least two decades of research on information visualization,<a href="#_ftn40" title="_ftnref40" name="_ftnref40">[40]</a> not to mention their own extensive experience with visual programming environments, which allow them to view code at multiple levels of abstraction.<a href="#_ftn41" title="_ftnref41" name="_ftnref41">[41]</a> Nonetheless, adapting these technologies to the needs of biologists-and implementing them on the scale required by systems biology-will be a continuing and non-trivial challenge.</p>
<h3>Understanding the Proteome</h3>
<p>The central dogma of molecular genetics-the classic progression of gene to mRNA to protein-would seem to suggest that a roughly one-to-one correspondence exists between the genome of a cell and its &#8220;proteome&#8221;: the overall collection of proteins it contains. In fact, the proteome is vastly more complex than the genome. For one thing, a single gene can sometimes produce <em>many</em> proteins. In eukaryotes, for example, mRNA can&#8217;t be used as a blueprint until special enzymes first cut out the introns, or non-coding regions, and splice together the exons, the fragments that contain useful code. But in some cases, the cell can splice the exons in different ways, producing a series of proteins with various pieces added or subtracted. Or the cell&#8217;s translation machinery might introduce an even more radical change by shifting its &#8220;reading frame,&#8221; meaning that it starts to read the three-base-pair genetic code at a point displaced by one or two base pairs from the original. The result will be an utterly different sequence of amino acids, and thus, an utterly different protein. Furthermore, even after the proteins are manufactured at the ribosome, they undergo quite a lot of post-processing as they enter into the various regulatory networks. Some might have their shapes and activity levels altered by the attachment of a phosphate group, for example, or a sugar molecule, or any of a variety of other appendages, while others might come together to form a multi-protein structure.</p>
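<p>Both mechanisms described above, alternative splicing and a shifted reading frame, can be made concrete with a short Python sketch. Everything named here is hypothetical and introduced only for illustration: the helper functions <em>split_codons</em> and <em>exon_skipping_isoforms</em>, the toy exon sequences, and the simplifying assumption that the first and last exons are always kept while any subset of the internal exons may be skipped. Real splicing is considerably more varied than this.</p>
<pre>
```python
from itertools import combinations


def split_codons(seq, frame=0):
    """Split `seq` into complete triplets, starting at offset `frame`."""
    return [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]


def exon_skipping_isoforms(exons):
    """Return every transcript formed by keeping the first and last exon
    plus any subset of the internal exons, in their original order.
    Assumes `exons` holds at least two sequences."""
    first, internal, last = exons[0], exons[1:-1], exons[-1]
    isoforms = []
    for keep_count in range(len(internal) + 1):
        for kept in combinations(internal, keep_count):
            isoforms.append(first + "".join(kept) + last)
    return isoforms


# Toy sequence: the same bases give different triplets in each frame.
message = "ATGGCCTGG"
for frame in range(3):
    print(frame, split_codons(message, frame))

# Toy gene with one skippable internal exon: two possible transcripts.
print(exon_skipping_isoforms(["ATGGCC", "TGG", "AAATAA"]))
```
</pre>
<p>In frame 0 the toy sequence splits into ATG, GCC, TGG; shifted by one base it splits into TGG, CCT, which share nothing with the original triplets beyond coincidence. With n internal exons, the exon-skipping assumption alone already allows 2 to the power n transcripts from a single gene.</p>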
<p>If nothing else, this complexity implies a massive escalation of the database challenge discussed in the previous section: a typical human cell has 30,000 to 40,000 genes, but at least 300,000 different proteins, all of which have to be tracked and accounted for. However, the proteome also poses an entirely new computational challenge, which is to determine the structure of the regulatory networks using the available data.</p>
<p>Those data can be obtained in a variety of ways. In the two-hybrid technique, for example, a cell is genetically manipulated so that the protein-protein interaction of interest will cause a marker gene to be expressed; if the gene product is detected, then the interaction has presumably taken place, and vice versa.<a href="#_ftn42" title="_ftnref42" name="_ftnref42">[42]</a> First developed in 1989 to test for interactions between two proteins at a time, the two-hybrid approach has recently been employed to search for interactions en masse.<a href="#_ftn43" title="_ftnref43" name="_ftnref43">[43]</a> Another technique is co-immunoprecipitation,<a href="#_ftn44" title="_ftnref44" name="_ftnref44">[44]</a> in which a molecular tag is attached to the protein of interest, and the cell is then treated with an antibody to that tag. The antibody binds to the tag, precipitates out, and pulls down the protein along with anything bound to it; the bound species can then be identified via standard techniques such as mass spectrometry. Yet another approach is to perturb the cell in some fashion (say, by deleting a specific gene) and then use DNA microarrays to determine how the genome responds.<a href="#_ftn45" title="_ftnref45" name="_ftnref45">[45]</a> The genes that show a significant increase or decrease in expression rates will presumably have protein products that belong to the same regulatory pathway as that of the deleted gene.</p>
<p>The result, in every case, is a list of protein-protein and/or protein-DNA interactions. Unfortunately, having such a list is not the same as having a good model of the regulatory pathway itself. Data from both the two-hybrid technique and co-immunoprecipitation tend to be noisy, for example, meaning that they contain a substantial fraction of false positives and false negatives; one big reason is that they are looking at proteins that have been chemically modified with tags and such, which can potentially change the proteins&#8217; behavior. Meanwhile, data from the microarray techniques give the &#8220;nodes&#8221; of a pathway (that is, the proteins being expressed), but they have little or nothing to say about the &#8220;wires&#8221;: the interactions among those proteins. Indeed, the same microarray data can usually be accounted for by any number of networks.<a href="#_ftn46" title="_ftnref46" name="_ftnref46">[46]</a></p>
<p>For all of the difficulties, researchers have been able to work out quite a few networks anyway.<a href="#_ftn47" title="_ftnref47" name="_ftnref47">[47]</a> Still, there&#8217;s a vast amount left to learn, and plenty of room for better computational tools. Ideally, for example, the systems biologist&#8217;s suite of analytical software would offer easy-to-use algorithms that integrated all the different forms of data and produced a most-likely guess as to the structure of the networks, with each reconstructed link assigned a confidence level. It would also offer algorithms that helped researchers plan new experiments, say by suggesting which perturbations might do the most to resolve the ambiguities in the microarray data.<a href="#_ftn48" title="_ftnref48" name="_ftnref48">[48]</a> And it would offer algorithms that allowed them to compare networks across different cell types and different species, in much the same way that BLAST now allows them to find homologous sequences in different genomes.</p>
<p>That last capability could be an extremely powerful one, if experience with comparative genomics is any guide. Of course, such &#8220;comparative proteomics&#8221; won&#8217;t reach its full potential until researchers have accumulated a lot more network data than they have now. Nor will it go anywhere until they&#8217;ve solved some key theoretical and computational challenges. After all, two sub-networks may contain homologous proteins, but do different things. Or they may have utterly different proteins, yet carry out virtually the same function. Or they may have similar proteins and similar overall functions, but a different interaction structure. So precisely what does it mean to say that one (piece of a) regulatory network is homologous to another?</p>
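<p>One way to see why the question is hard is to try writing down a candidate answer. The sketch below scores two small networks by the fraction of interactions they share once proteins are matched through an ortholog table. Everything in it is an assumption made for illustration: the protein names, the ortholog mapping, and the choice of a Jaccard-style score, which is only one of many possible definitions and not an established measure of network homology.</p>

```python
def map_edges(edges, orthologs):
    """Translate species-X interactions into species-Y protein names.

    Interactions involving a protein with no known ortholog are dropped.
    """
    mapped = set()
    for a, b in edges:
        if a in orthologs and b in orthologs:
            mapped.add(frozenset((orthologs[a], orthologs[b])))
    return mapped


def edge_conservation(edges_x, edges_y, orthologs):
    """Shared interactions divided by all interactions (Jaccard index)."""
    mapped = map_edges(edges_x, orthologs)
    native = {frozenset(edge) for edge in edges_y}
    union = mapped.union(native)
    if not union:
        return 1.0
    return len(mapped.intersection(native)) / len(union)


# Hypothetical data: three proteins with clear orthologs, wired differently.
yeast_edges = [("A", "B"), ("B", "C"), ("C", "A")]
human_edges = [("a", "b"), ("b", "c"), ("c", "d")]
orthologs = {"A": "a", "B": "b", "C": "c"}

print(edge_conservation(yeast_edges, human_edges, orthologs))
```

<p>Here the two networks contain matching proteins but only partly matching wiring, which is exactly the third case described above; a score based on function rather than on shared links could rank the same pair quite differently.</p>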
<p>Still, it should eventually be possible to look across species and understand how regulatory networks have changed through evolution. It may likewise be possible to identify modular sub-circuits that nature has used again and again or, conversely, to find pathways in bacteria, say, that are <em>not</em> shared by humans, and that are therefore attractive targets for new pharmaceuticals.</p>
<h3>Modeling the Networks</h3>
<p>Ultimately, of course, data yields insight only when it&#8217;s been codified into theory, which, in the case of cellular regulatory networks, means computer simulation. Indeed, since paper-and-pencil calculations are pretty much hopeless in systems of this complexity, a good computer model is the only feasible way to see if a tentative reconstruction behaves like the real network. And that, in the classic, experiment-theory-experiment cycle of the scientific method, is a critical step toward better reconstructions. As the simulations improve, moreover, they could provide a foundation for what some have called &#8220;cellular engineering&#8221;: a discipline in which practitioners could predict, control and design cellular networks as confidently as traditional engineers create, say, a new aircraft.<a href="#_ftn49" title="_ftnref49" name="_ftnref49">[49]</a> Certainly the simulations will be exceptionally useful for assessing and predicting drug actions,<a href="#_ftn50" title="_ftnref50" name="_ftnref50">[50]</a> and especially, drug <em>inter</em>actions. (It&#8217;s a rare pharmaceutical that binds to just one cell-surface receptor and triggers just one signaling network; thus the ubiquity of side effects.<a href="#_ftn51" title="_ftnref51" name="_ftnref51">[51]</a>)</p>
<p>Not surprisingly, cell-network researchers have developed any number of simulation development packages already; examples include BioSpice, DBSolve, E-Cell, VCell, Gepasi, StochSim, and Caltech ERATO.<a href="#_ftn52" title="_ftnref52" name="_ftnref52">[52]</a> Nonetheless, this is still very much a field in flux, even (or especially) when it comes to such fundamental questions as <em>ontology</em>: how do we go about understanding the cellular networks? What kind of conceptual framework will best help us make sense of how they work, and what they are doing? And precisely what is the most effective way to represent the networks in a computer?<a href="#_ftn53" title="_ftnref53" name="_ftnref53">[53]</a></p>
<p>There are almost as many answers to those questions as there are researchers in the field, not least because the &#8220;right&#8221; answer so often depends on the phenomenon they are looking at, and on critical factors such as time scale. Take metabolic networks and signal transduction pathways, for example, which can respond to environmental changes considerably faster than the genome itself can. (They operate on a physiological time scale of milliseconds to a minute or so, whereas transcription and translation take a minute or longer.<a href="#_ftn54" title="_ftnref54" name="_ftnref54">[54]</a>) In 1995, writing in the journal <em>Nature</em>, Dennis Bray of Cambridge University forcefully made the case for an information-processing view of these pathways: &#8220;Many proteins in living cells appear to have as their primary function the transfer and processing of information, rather than the chemical transformation of metabolic intermediates or the building of cellular structures,&#8221; he declared.<a href="#_ftn55" title="_ftnref55" name="_ftnref55">[55]</a> In particular, Bray argued, a simple enzyme protein could be viewed as a computational element that takes an input (the concentration of its &#8220;substrate,&#8221; the molecule it interacts with) and produces an output: the catalyzed reaction product. Likewise, an enzyme that becomes active only when it binds with two separate regulator molecules will function something like a Boolean AND gate,<a href="#_ftn56" title="_ftnref56" name="_ftnref56">[56]</a> and so on. Just as in an electrical engineering lab, moreover, circuits formed from these elements can be as simple as a switch or an oscillator, or as complex as a bacterium&#8217;s chemotaxis<a href="#_ftn57" title="_ftnref57" name="_ftnref57">[57]</a> response.
Indeed, the cell even possesses a kind of short-term, &#8220;random-access&#8221; memory, in the sense that events in its environment have profoundly shaped the concentration and activity of many thousands of molecules in the cell. In short, Bray concluded, these protein-based circuits comprise a kind of nervous system for the cell, providing it with much of what it needs to control its behavior.<a href="#_ftn58" title="_ftnref58" name="_ftnref58">[58]</a></p>
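<p>The AND-gate idea can be sketched numerically. The snippet below assumes, purely for illustration, that each regulator binds the enzyme with a steep Hill-type saturation curve and that the enzyme is active only when both are bound, so that activity is the product of the two bound fractions. The Hill form, the half-saturation constant, and the exponent are conventional modeling choices picked here for the example; they are not taken from Bray&#8217;s paper.</p>

```python
def hill(conc, k_half=1.0, n=4):
    """Fraction of enzyme with a regulator bound, as a Hill function.

    k_half is the concentration giving half-saturation; n sets the steepness.
    Both values are arbitrary illustrative choices.
    """
    return conc ** n / (k_half ** n + conc ** n)


def and_gate_activity(reg_a, reg_b):
    """Enzyme activity when both regulators must be bound at once."""
    return hill(reg_a) * hill(reg_b)


LOW, HIGH = 0.1, 10.0  # regulator concentrations in units of k_half

for a in (LOW, HIGH):
    for b in (LOW, HIGH):
        print(f"A={a:5.1f}  B={b:5.1f}  activity={and_gate_activity(a, b):.4f}")
```

<p>Only the row with both regulators high gives an activity near 1; the other three rows stay near 0, which is the truth table of an AND gate. Unlike a digital gate, though, the response is graded for intermediate concentrations.</p>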
<p>Of course, that left the other, slower half of the cellular control system: the genetic regulatory networks that govern responses on a time-scale of minutes or longer. As it happened, however, in a paper<a href="#_ftn59" title="_ftnref59" name="_ftnref59">[59]</a> that was published at almost exactly the same time as Bray&#8217;s, Stanford University&#8217;s Harley McAdams and Lucy Shapiro showed that genetic networks could also be modeled via the electrical circuit analogy.<a href="#_ftn60" title="_ftnref60" name="_ftnref60">[60]</a> Indeed, McAdams and Shapiro not only tackled the complexities of an actual regulatory network (the decision circuit that governs the course of a &#955;-phage infection in <em>E. coli</em>), but also gave careful consideration to such real-world factors as time delays, which are critical in biological networks (gene transcription and translation are not instantaneous, for example) and, indeed, in electrical networks as well.<a href="#_ftn61" title="_ftnref61" name="_ftnref61">[61]</a> Along the way, and in later work with colleagues such as Adam Arkin, now at the University of California, Berkeley, they clarified some of the ways in which regulatory networks are <em>not</em> like electrical circuits. Because critical molecules are often present in the cell in extremely small quantities, to take the most notable example, certain critical reactions are subject to large statistical fluctuations, meaning that they proceed in fits and starts, much more erratically than their electrical counterparts.<a href="#_ftn62" title="_ftnref62" name="_ftnref62">[62]</a> Nonetheless, as McAdams and Arkin emphasized in a review paper a few years later, so long as the differences are kept clearly in mind, the cell circuit-electrical circuit analogy can be a deep and powerful one.
Indeed, they wrote, nature&#8217;s designs for the cellular circuitry seem to draw on any number of techniques that are very familiar from engineering: &#8220;the biochemical logic in genetic regulatory circuits provides real-time regulatory control [via positive and negative feedback loops], implements a branching decision logic, and executes stored programs [in the DNA] that guide cellular differentiation extending over many cell generations.&#8221;<a href="#_ftn63" title="_ftnref63" name="_ftnref63">[63]</a></p>
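<p>The effect of small molecule counts is easy to reproduce in a toy stochastic simulation. The sketch below uses Gillespie&#8217;s stochastic simulation algorithm on the simplest possible system, a molecule that is made at a constant rate and degraded at a rate proportional to its copy number. The model, the rate constants, and the function names are illustrative assumptions, not a reconstruction of any circuit discussed above.</p>

```python
import math
import random


def gillespie_birth_death(k_make, k_decay, t_end, seed=0):
    """Simulate synthesis (rate k_make) and decay (rate k_decay * n)."""
    rng = random.Random(seed)
    t, n = 0.0, 0
    times, counts = [t], [n]
    while True:
        rate_decay = k_decay * n
        total = k_make + rate_decay
        t += rng.expovariate(total)  # waiting time to the next reaction
        if t > t_end:
            return times, counts
        if rng.random() * total >= rate_decay:
            n += 1  # synthesis event
        else:
            n -= 1  # degradation event
        times.append(t)
        counts.append(n)


def relative_noise(times, counts, t_end, t_skip):
    """Time-weighted mean and std/mean, ignoring the start-up before t_skip."""
    stops = times[1:] + [t_end]
    weights = [max(0.0, stop - max(start, t_skip))
               for start, stop in zip(times, stops)]
    total = sum(weights)
    mean = sum(w * n for w, n in zip(weights, counts)) / total
    var = sum(w * (n - mean) ** 2 for w, n in zip(weights, counts)) / total
    return mean, math.sqrt(var) / mean


for k_make in (10.0, 1000.0):
    times, counts = gillespie_birth_death(k_make, k_decay=1.0, t_end=50.0)
    mean, noise = relative_noise(times, counts, t_end=50.0, t_skip=10.0)
    print(f"mean copies ~ {mean:7.1f}   std/mean ~ {noise:.3f}")
```

<p>For this particular model the steady-state copy number follows a Poisson distribution, so the relative fluctuation scales as one over the square root of the mean: a species present at about ten copies is far noisier, in relative terms, than one present at about a thousand.</p>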
<p>At still longer time-scales of, say, hours, one finds comparatively slow-moving processes such as the cell cycle. In this regime, modelers can safely describe the cell&#8217;s dynamics with a straightforward series of chemical rate equations: that is, equations in which nothing matters but the concentration of each chemical species at any given moment. All the complications due to time delays in gene expression, statistical fluctuations, membrane transport, and the like have simply gone away; they happen so rapidly on this time scale that they are effectively instantaneous.<a href="#_ftn64" title="_ftnref64" name="_ftnref64">[64]</a> Of course, the equations for any real regulatory network are still dauntingly complex, and can only be solved by computer. But even so, much of their behavior can be understood via the tools of non-linear systems dynamics, often referred to as &#8220;chaos theory.&#8221; For example, a stable &#8220;point attractor&#8221; in the equations (that is, a solution in which the variables don&#8217;t change with time) might correspond to a cell that was at a stable &#8220;checkpoint&#8221; of its cycle: a kind of waiting state brought on by factors such as DNA damage, or a lack of nutrients. Likewise, a &#8220;bifurcation&#8221; in the equations, in which the system suddenly changes from, say, a point attractor to a periodic oscillation, might correspond to an egg cell that&#8217;s been fertilized, and must now start to go through cycle after cycle of growth and division. Indeed, such dynamical systems have now been implemented in dozens of biological simulations.<a href="#_ftn65" title="_ftnref65" name="_ftnref65">[65]</a></p>
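<p>The switch from a point attractor to a periodic oscillation can be seen in a two-variable toy system. The sketch below integrates the textbook normal form of a Hopf bifurcation with a plain Euler scheme. It is a generic mathematical example, not a model of the cell cycle; the control parameter <code>mu</code>, the step size, and the starting point are all arbitrary illustrative choices.</p>

```python
import math


def final_radius(mu, x=0.5, y=0.0, dt=0.001, steps=50_000):
    """Euler-integrate the Hopf normal form; return the final distance from 0.

    dx/dt = mu*x - y - x*(x^2 + y^2)
    dy/dt = x + mu*y - y*(x^2 + y^2)
    """
    for _ in range(steps):
        r2 = x * x + y * y
        dx = mu * x - y - x * r2
        dy = x + mu * y - y * r2
        x, y = x + dt * dx, y + dt * dy
    return math.hypot(x, y)


for mu in (-0.5, 0.5):
    print(f"mu = {mu:+.1f}  ->  distance from rest state = {final_radius(mu):.4f}")
```

<p>For negative <code>mu</code> the trajectory spirals into the rest state at the origin, a point attractor. For positive <code>mu</code> the rest state becomes unstable and the trajectory settles onto a closed orbit whose radius is close to the square root of <code>mu</code>, so a small change in one parameter turns a waiting state into a sustained oscillation.</p>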
<p>And so it goes: ontology and modeling for systems biology have made encouraging progress. But there are still a great many challenges.<a href="#_ftn66" title="_ftnref66" name="_ftnref66">[66]</a> To mention just a few:</p>
<ul type="disc">
<li>Building in spatial structure. The cytoplasm isn&#8217;t just a uniform mixture of all the biomolecules that exist in a cell; proteins and other macromolecules are often bound to membranes, or are isolated inside of various cellular compartments (especially in eukaryotes). A full account of the regulatory networks has to take this compartmentalization into account, along with such spatial factors as diffusion, and the transport of various species through the cytoplasm and across membranes.</li>
<li>Modeling multicellular biology. Although it&#8217;s certainly possible to model bacteria and single-celled eukaryotes as more or less isolated entities, a full account of multicellular creatures such as humans will have to include an account of intercellular signaling, cellular differentiation, cell motility, tissue architecture, and many other &#8220;community&#8221; issues.</li>
<li>Interoperability. Despite the developers&#8217; best efforts, none of the simulation packages today offers everything a systems biologist might need. Nor is any of them likely to do so in the foreseeable future; covering the entire range of size- and time-scales requires so many simulation techniques that no one package can hope to offer the best-of-breed tools in everything. And in any case, the field itself is evolving far too rapidly for any single package to keep up. So a better solution is to get the various packages working together.<a href="#_ftn67" title="_ftnref67" name="_ftnref67">[67]</a></li>
</ul>
<p>This is trickier than it sounds. Ideally, for example, it would mean packages that took advantage of a cluster of emerging technologies known variously as web services,<a href="#_ftn68" title="_ftnref68" name="_ftnref68">[68]</a> grid protocols,<a href="#_ftn69" title="_ftnref69" name="_ftnref69">[69]</a> and peer-to-peer computing.<a href="#_ftn70" title="_ftnref70" name="_ftnref70">[70]</a> Among other things, these technologies would allow for simulations to be run across the Internet on dozens, hundreds, or even thousands of machines in parallel, thus allowing researchers to bring vast computational power to bear. And they would likewise allow for networked simulations to be assembled on the fly from self-contained software modules, which could be mixed and matched by other systems as needed.<a href="#_ftn71" title="_ftnref71" name="_ftnref71">[71]</a></p>
<p>In the meantime, however, an equally important goal is to develop an easy way for the models (and the modelers) to share and communicate their results. And indeed, a consortium of leading developers led by the Caltech-ERATO Kitano group has taken a significant step in that direction by developing the Systems Biology Markup Language: an open, extensible representation scheme, based on XML, that gives developers a common format for describing their models. SBML, in turn, provides a foundation for the Systems Biology Workbench: a software framework that allows interaction among models created by different groups, even if the models are written in different programming languages and running on different machines.<a href="#_ftn72" title="_ftnref72" name="_ftnref72">[72]</a> A parallel effort is the Physiome Project,<a href="#_ftn73" title="_ftnref73" name="_ftnref73">[73]</a> which was launched in 2001 by the International Union of Physiological Sciences, and which is headquartered at the University of Auckland, New Zealand. In addition to offering its own modeling tools, the Physiome Project has developed a number of representation languages for higher-level biological systems, including CellML, Cell Modeling Language, and AnatML, Anatomy Markup Language.</p>
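<p>To give a feel for what an XML-based model description looks like in practice, the sketch below builds a tiny, SBML-style document with Python&#8217;s standard library and then reads it back the way a second tool might. The element and attribute names follow SBML Level 2 as best understood here, but the fragment is deliberately simplified (no XML namespace, no kinetic laws), has not been checked against the SBML schema, and describes a made-up one-reaction model.</p>

```python
import xml.etree.ElementTree as ET


def build_model():
    """Assemble a simplified, SBML-style description of one reaction."""
    sbml = ET.Element("sbml", {"level": "2", "version": "1"})
    model = ET.SubElement(sbml, "model", {"id": "toy_phosphorylation"})

    compartments = ET.SubElement(model, "listOfCompartments")
    ET.SubElement(compartments, "compartment", {"id": "cytoplasm"})

    species = ET.SubElement(model, "listOfSpecies")
    for sid, conc in (("substrate", "1.0"), ("phospho_substrate", "0.0")):
        ET.SubElement(species, "species", {
            "id": sid,
            "compartment": "cytoplasm",
            "initialConcentration": conc,
        })

    reactions = ET.SubElement(model, "listOfReactions")
    rxn = ET.SubElement(reactions, "reaction", {"id": "phosphorylation"})
    reactants = ET.SubElement(rxn, "listOfReactants")
    ET.SubElement(reactants, "speciesReference", {"species": "substrate"})
    products = ET.SubElement(rxn, "listOfProducts")
    ET.SubElement(products, "speciesReference", {"species": "phospho_substrate"})
    return sbml


def summarize(xml_text):
    """Recover species and reactions from the text, as another tool might."""
    root = ET.fromstring(xml_text)
    species = [s.get("id") for s in root.iter("species")]
    reactions = []
    for rxn in root.iter("reaction"):
        ins = [ref.get("species") for ref in rxn.find("listOfReactants")]
        outs = [ref.get("species") for ref in rxn.find("listOfProducts")]
        reactions.append((rxn.get("id"), ins, outs))
    return species, reactions


xml_text = ET.tostring(build_model(), encoding="unicode")
print(xml_text)
print(summarize(xml_text))
```

<p>The point of the exercise is that the second function needs no knowledge of the program that wrote the file; agreement on the vocabulary of elements is what lets independently developed packages exchange models.</p>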
<h3 align="center">Reference List</h3>
<p align="center">&nbsp;</p>
<div style="align:left;">
<p>&#8220;Alliance for Cellular Signaling.&#8221; Web page available at <a href="http://www.cellularsignaling.org/">http://www.cellularsignaling.org/</a>.</p>
<p>&#8220;Caltech ERATO Kitano Systems Biology Workbench Development Group.&#8221; Web page available at <a href="http://www.sbw-sbml.org/index.html">http://www.sbw-sbml.org/index.html</a>.</p>
<p>&#8220;Gene Ontology Consortium.&#8221; Web page available at <a href="http://www.geneontology.org/">http://www.geneontology.org/</a>.</p>
<p>&#8220;The Institute for Systems Biology.&#8221; Web page available at <a href="http://www.systemsbiology.org/">http://www.systemsbiology.org/</a>.</p>
<p>Bray, Dennis. 1995. &#8220;Protein molecules as computational elements in living cells,&#8221; <em>Nature</em> 376:307-12.</p>
<p>Caltech-ERATO-Kitano Systems Biology Workbench Development Group. &#8220;Repository.&#8221; Web page available at <a href="http://www.sbw-sbml.org/repository.html">http://www.sbw-sbml.org/repository.html</a>.</p>
<p>Cohen, Jon. &#8220;The Proteomics Payoff,&#8221; <em>Technology Review, </em>Oct 2001.  p. 55-60.</p>
<p>Collins, Francis S.<em>, et al.</em> 2003. &#8220;A vision for the future of genomic research,&#8221; <em>Nature</em> 422:835-47.</p>
<p>Couzin, Jennifer. 2002. &#8220;BREAKTHROUGH OF THE YEAR: Small RNAs Make Big Splash,&#8221; <em>Science</em> 298(5602):2296. Available online at <a href="http://www.sciencemag.org/">http://www.sciencemag.org</a>.</p>
<p>Csete, Marie E., and John C. Doyle. 2002. &#8220;Reverse Engineering of Biological Complexity,&#8221; <em>Science</em> 295(5560):1664. Available online at <a href="http://www.sciencemag.org/cgi/content/abstract/295/5560/1664">http://www.sciencemag.org/cgi/content/abstract/295/5560/1664</a>.</p>
<p>Davidson, Eric H.<em>, et al.</em> 2002. &#8220;A Genomic Regulatory Network for Development,&#8221; <em>Science</em> 295(5560):1669. Available online at <a href="http://www.sciencemag.org/cgi/content/abstract/295/5560/1669">http://www.sciencemag.org/cgi/content/abstract/295/5560/1669</a>.</p>
<p>DeFrancesco, Laura. 2002. &#8220;Probing Protein Interactions,&#8221; <em>The Scientist</em> 16(8):20.</p>
<p>DeFrancesco, Laura, and  Deborah Wilkinson. 1999. &#8220;The Two-Body Problem,&#8221; <em>The Scientist</em> 13(8):21. Available online at <a href="http://www.the-scientist.com/yr1999/apr/profile2_990412.html">http://www.the-scientist.com/yr1999/apr/profile2_990412.html</a>.</p>
<p>Frazier, Marvin E.<em>, et al.</em> 2003. &#8220;Realizing the Potential of the Genome Revolution: The Genomes to Life Program,&#8221; <em>Science</em> 300(5617):290. Available online at <a href="http://www.sciencemag.org/cgi/content/abstract/300/5617/290">http://www.sciencemag.org/cgi/content/abstract/300/5617/290</a>.</p>
<p>Gannis, Mike. &#8220;Alliance for Cellular Signaling Decodes Complex Messages of Cells,&#8221; <em>NPACI &amp; SDSC Online, </em>28 Nov  2001. Available online at <a href="http://www.npaci.edu/online/v5.24/cell.sig.html">http://www.npaci.edu/online/v5.24/cell.sig.html</a>.</p>
<p>Gwynne, Peter, and Guy Page. 1999. &#8220;Microarray Analysis: The Next Revolution in Molecular Biology,&#8221; <em>Science</em> 285. Available online at <a href="http://www.sciencemag.org/feature/e-market/benchtop/micro.shl">http://www.sciencemag.org/feature/e-market/benchtop/micro.shl</a>.</p>
<p>Hartwell, Leland H.<em>, et al.</em> 1999. &#8220;From Molecular to Modular Cell Biology,&#8221; <em>Nature</em> 402:C47-52.</p>
<p>Hood, Leroy, and David Galas. 2003. &#8220;The digital code of DNA,&#8221; <em>Nature</em> 421:444-8.</p>
<p>Hunter, Philip. &#8220;Putting Humpty Dumpty Back Together Again,&#8221; <em>The Scientist, </em>24 Feb 2003.</p>
<p>Ideker, Trey<em>, et al.</em> 2001. &#8220;A New Approach to Decoding Life: Systems Biology,&#8221; <em>Annu. Rev. Genomics Hum. Genet.</em> 2:343-72.</p>
<p>Ideker, Trey<em>, et al.</em> 2001. &#8220;Integrated Genomic and Proteomic Analyses of a Systematically Perturbed Metabolic Network,&#8221; <em>Science</em> 292(5518):929.</p>
<p>The Institute for Systems Biology, The Whitehead Institute, and The Memorial Sloan-Kettering Cancer  Center. &#8220;Cytoscape.&#8221; Web page available at <a href="http://www.cytoscape.org/">http://www.cytoscape.org/</a>.</p>
<p>The International Human Genome Sequencing Consortium. 2001. &#8220;Initial sequencing and analysis of the human genome,&#8221; <em>Nature</em> 409:860-921.</p>
<p>Jacob, François, and Jacques Monod. 1962. &#8220;On the regulation of gene activity,&#8221; pp. 193-209 in <em>Symposium on cellular regulatory mechanisms</em>. Cold Spring Harbor, NY: Cold Spring Harbor Laboratory.</p>
<p>Kitano, Hiroaki. 2002. &#8220;Systems Biology: A Brief Overview,&#8221; <em>Science</em> 295(5560):1662. Available online at <a href="http://www.sciencemag.org/cgi/content/abstract/295/5560/1662">http://www.sciencemag.org/cgi/content/abstract/295/5560/1662</a>.</p>
<p>Lawrence Berkeley Laboratory. &#8220;BioSpice: Open-Source Biology.&#8221; Web page available at <a href="http://biospice.lbl.gov/home.html">http://biospice.lbl.gov/home.html</a>.</p>
<p>MacBeath, Gavin, and Stuart L. Schreiber. 2000. &#8220;Printing Proteins as Microarrays for High-Throughput Function Determination,&#8221; <em>Science</em> 289(5485):1760.</p>
<p>Maher, Brendan A. &#8220;The People&#8217;s Biology: Cellular signaling alliance puts a socialist spin on systems biology,&#8221; <em>The Scientist, </em>24 Feb 2003.  p. 22. Available online at <a href="http://www.the-scientist.com/yr2003/feb/feature1_030224.html">http://www.the-scientist.com/yr2003/feb/feature1_030224.html</a>.</p>
<p>McAdams, Harley H., and  Adam Arkin. 1998. &#8220;Simulation of Prokaryotic Genetic Circuits,&#8221; <em>Annu. Rev. Biophys. Biomol. Struct.</em> 27:199-224.</p>
<p>McAdams, Harley H., and  Lucy Shapiro. 1995. &#8220;Circuit Simulation of Genetic Networks,&#8221; <em>Science</em> 269:650-656.</p>
<p>Miller, Karl. &#8220;Metabolic Pathways of Biochemistry.&#8221; Web page available at <a href="http://www.hfni.gsehd.gwu.edu/%7Empb/">http://www.hfni.gsehd.gwu.edu/~mpb/</a>.</p>
<p>The Mouse Genome Sequencing Consortium. 2002. &#8220;Initial sequencing and comparative analysis of the mouse genome,&#8221; <em>Nature</em> 420: 520-562.</p>
<p>National Cancer Institute. &#8220;NCI Director&#8217;s Challenge: Toward a Molecular Classification of Cancer.&#8221; Web page available at <a href="http://dc.nci.nih.gov/">http://dc.nci.nih.gov/</a>.</p>
<p>National Human Genome Research Institute. &#8220;Comparative Genomics.&#8221; Web page available at <a href="http://www.genome.gov/11006946">http://www.genome.gov/11006946</a>.</p>
<p>National Human Genome Research Institute. &#8220;The ENCODE Project: ENCyclopedia Of DNA Elements.&#8221; Web page available at <a href="http://www.genome.gov/10005107">http://www.genome.gov/10005107</a>.</p>
<p>National Human Genome Research Institute. 2002. &#8220;DNA Microarray Technology.&#8221; Web page available at <a href="http://www.genome.gov/10000533">http://www.genome.gov/10000533</a>.</p>
<p>National Human Genome Research Institute. 2003. &#8220;International Consortium Completes Human Genome Project.&#8221; Web page available at <a href="http://www.genome.gov/11006929">http://www.genome.gov/11006929</a>.</p>
<p>Noble, Denis. 2002. &#8220;Modeling the Heart&#8211;from Genes to Cells to the Whole Organ,&#8221; <em> Science</em> 295(5560):1678. Available online at <a href="http://www.sciencemag.org/cgi/content/abstract/295/5560/1678">http://www.sciencemag.org/cgi/content/abstract/295/5560/1678</a>.</p>
<p>Shi, Leming. 2002. &#8220;DNA Microarray (Genome Chip)&#8211;Monitoring the Genome on a Chip.&#8221; Web page available at <a href="http://www.gene-chips.com/">http://www.gene-chips.com/</a>.</p>
<p>Smith, Lloyd M.<em>, et al.</em> 1986. &#8220;Fluorescence detection in automated DNA sequence analysis,&#8221; <em>Nature</em> 321:674-79.</p>
<p>Taubes, Gary. &#8220;The Virtual Cell,&#8221; <em>Technology Review, </em>Apr 2002.  p. 63-70.</p>
<p>Travis, John. &#8220;Biological Dark Matter,&#8221; <em> Science News, </em>12 Jan 2002. Available online at <a href="http://www.sciencenews.org/20020112/bob9.asp">http://www.sciencenews.org/20020112/bob9.asp</a>.</p>
<p>Tyson, John J.<em>, et al.</em>  2001. &#8220;Network Dynamics and Cell Physiology,&#8221; <em>Nat Rev Mol Cell Biol</em> 2(12):908-16.</p>
<p>U.S. Department of Energy. &#8220;Genomes to Life: Biological Solutions for Energy Challenges.&#8221; Web page available at <a href="http://doegenomestolife.org/">http://doegenomestolife.org/</a>.</p>
<p>U.S. Department of Energy. 2003. &#8220;Report on the Computer Science Workshop for the Genomes to Life Program (March 6-7, 2002).&#8221; Web page available at <a href="http://www.doegenomestolife.org/pubs/ComputerScience-10.pdf">http://www.doegenomestolife.org/pubs/ComputerScience-10.pdf</a>.</p>
<p>University of Auckland Bioengineering Institute. &#8220;The IUPS Physiome Project.&#8221; Web page available at <a href="http://www.bioeng.auckland.ac.nz/physiome/physiome.php">http://www.bioeng.auckland.ac.nz/physiome/physiome.php</a>.</p>
<p>von Bertalanffy, Ludwig. 1969. <em>General Systems Theory: Foundations, Development, Applications</em>. New York: George Braziller.</p>
<p>Waldrop, M. Mitchell. 2002. &#8220;Grid Computing,&#8221; <em>Technology Review</em> 105(4):31-37.</p>
<p>Watson, James D., and Francis H. C. Crick. 1953. &#8220;Molecular structure of nucleic acids: A structure for deoxyribose nucleic acid,&#8221; <em>Nature</em> 171:737.</p>
<p>Webopedia. &#8220;Grid Computing.&#8221; Web page available at <a href="http://www.webopedia.com/TERM/g/grid_computing.html">http://www.webopedia.com/TERM/g/grid_computing.html</a>.</p>
<p>Webopedia. &#8220;Peer-to-Peer Architecture.&#8221; Web page available at <a href="http://www.webopedia.com/TERM/p/peer_to_peer_architecture.html">http://www.webopedia.com/TERM/p/peer_to_peer_architecture.html</a>.</p>
<p>Webopedia. &#8220;Web Services.&#8221; Web page available at <a href="http://www.webopedia.com/TERM/W/Web_services.html">http://www.webopedia.com/TERM/W/Web_services.html</a>.</p>
<p>Whitehead Institute. 2001. &#8220;Scientists Find New Class of Genes Implicated in Protein Regulation.&#8221; Web page available at <a href="http://www.wi.mit.edu/nap/2001/nap_press_01_dbmrna.html">http://www.wi.mit.edu/nap/2001/nap_press_01_dbmrna.html</a>.</p>
<p>Wiener, Norbert. 1961. <em>Cybernetics, or Control and Communication in the Animal and the Machine</em>. 2nd ed. Cambridge, MA: MIT Press.</p>
<p>Wolkenhauer, Olaf. 2001. &#8220;Systems biology: The reincarnation of systems theory applied in biology?,&#8221; <em>Briefings in Bioinformatics</em> 2(3):258-70.</p>
<p><br clear="all" /></p>
<hr align="left" size="1" width="33%" />
<p><a href="#_ftnref1" title="_ftn1" name="_ftn1">[1]</a> Watson and Crick (1953).</p>
<p><a href="#_ftnref2" title="_ftn2" name="_ftn2">[2]</a> The &#8220;completion&#8221; of the project had actually been announced once before, on June 26, 2000, when U.S. President Bill Clinton and British Prime Minister Tony Blair jointly hailed the release of a preliminary, draft version of the sequence with loud media fanfare. However, while that draft sequence was undoubtedly useful, it contained multiple gaps and had an error rate of one mistaken base pair in every 10,000. The much-revised sequence released in 2003 has an error rate of only 1 in 100,000, and gaps in only those very rare segments of the genome that can&#8217;t be reliably sequenced with current technology. http://www.genome.gov/11006929.</p>
<p><a href="#_ftnref3" title="_ftn3" name="_ftn3">[3]</a> DNA in its natural state takes the shape of a twisted ladder: two parallel strands winding around and around one another in the famous double helix. Each strand consists of a backbone of endlessly repeating sugar-phosphate molecules, which form one side of the ladder, plus a sequence of &#8220;base&#8221; molecules attached to each sugar. There are four types of bases, usually abbreviated as A, T, C, and G. (The full names are adenine, thymine, cytosine, and guanine, respectively.) In the complete double helix, each base links with its counterpart on the opposite strand to form one step of the ladder; thus the term &#8220;base-pairs.&#8221; The pairing always links A with T and C with G, which makes each strand the exact complement of the other. But in any case, the precise sequence of bases along either backbone is what encodes the genetic information.</p>
<p><a href="#_ftnref4" title="_ftn4" name="_ftn4">[4]</a> http://www.genome.gov/11006946</p>
<p><a href="#_ftnref5" title="_ftn5" name="_ftn5">[5]</a> Collins et al. (2003). Formulated over the course of two years, through more than a dozen workshops that involved hundreds of scientists and members of the public (see http://www.genome.gov/About/Planning), the agenda is organized into three themes: &#8220;genomics to biology,&#8221; which focuses on the kind of systems biology issues discussed in this chapter; &#8220;genomics to health,&#8221; which focuses on the role of genomics in health, disease, diagnosis, and treatment; and &#8220;genomics to society,&#8221; which focuses on hot-button issues such as genetic discrimination, and the genetic basis of race, ethnicity, and kinship.</p>
<p><a href="#_ftnref6" title="_ftn6" name="_ftn6">[6]</a> Collins et al. (2003).</p>
<p><a href="#_ftnref7" title="_ftn7" name="_ftn7">[7]</a>  <!--[if supportFields]> QUOTE &quot;The International Human Genome Sequencing Consortium (2001).&quot; <![endif]-->The International Human Genome Sequencing Consortium (2001).<!--[if supportFields]><![endif]--></p>
<p><a href="#_ftnref8" title="_ftn8" name="_ftn8">[8]</a> In the bacterium <em>E. coli</em> and other such prokaryotes-one-celled organisms that lack a nucleus-this is exactly what happens. But in amoebas, yeast, plants, humans, and all other eukaryotes-organisms whose cells <em>do</em> have a nucleus, as well as mitochondria and many other organelles-there is an intermediate step. For reasons that are still not clearly understood, the coding region of eukaryotic genes are typically broken up by long stretches of ­<em>non</em>-coding DNA. So after each gene is transcribed, the resulting mRNA is set upon by a whole series of specialized enzymes that edit out the non-coding regions, or &#8220;introns,&#8221; and splice together the useful parts, known as &#8220;exons.&#8221; Only then does the edited mRNA move out into the cytoplasm for its encounter with the ribosomes.</p>
<p><a href="#_ftnref9" title="_ftn9" name="_ftn9">[9]</a> <!--[if supportFields]> QUOTE &quot;Collins et al. (2003).&quot; <![endif]-->Collins et al. (2003).<!--[if supportFields]><![endif]-->; <!--[if supportFields]> QUOTE &quot;The Mouse Genome Sequencing Consortium (2002).&quot; <![endif]-->The Mouse Genome Sequencing Consortium (2002).<!--[if supportFields]><![endif]--></p>
<p><a href="#_ftnref10" title="_ftn10" name="_ftn10">[10]</a> Witness the recent discovery that a hitherto obscure class of &#8220;small&#8221; RNAs seems to be playing a major regulatory role in a wide variety of organisms, including humans. In December 2002, <em>Science</em> magazine declared this to be the &#8220;breakthrough of the year&#8221; (Couzin (2002), and references therein). The small RNAs are typically only about 25 bases long, but the genes that encode them comprise an estimated 1% of the entire genome, making them roughly as numerous as the protein-encoding genes. http://www.wi.mit.edu/nap/2001/nap_press_01_dbmrna.html; Travis (2002).</p>
<p><a href="#_ftnref11" title="_ftn11" name="_ftn11">[11]</a> &#8220;Several&#8221; and &#8220;short&#8221; are relative terms. In prokaryotes such as <em>E. coli</em>, a typical gene has only four or five regulatory sites, the sites themselves are only about 15 base pairs long, and the transcription factors that bind to them are comparatively simple. (Or at least, that&#8217;s the case in the handful of genes whose regulation has been studied in detail.) In eukaryotes, however, the binding sites are large, numerous, and widely scattered, and the transcription factors are correspondingly complex.</p>
<p><a href="#_ftnref12" title="_ftn12" name="_ftn12">[12]</a> Actually, protein production seems to be regulated not just at the start, but at every step along the way. For example, certain of the small RNAs mentioned in a previous footnote can regulate protein production by attacking the mRNA and destroying the data tape, so to speak, before it&#8217;s even read. Other types can shut off the translation process at the ribosome, as can certain regulatory proteins.</p>
<p><a href="#_ftnref13" title="_ftn13" name="_ftn13">[13]</a>  <!--[if supportFields]> QUOTE &quot;Bray (1995).&quot; <![endif]-->Bray (1995).<!--[if supportFields]><![endif]--></p>
<p><a href="#_ftnref14" title="_ftn14" name="_ftn14">[14]</a>  Think of a photon of light hitting a chloroplast in a green plant cell, or a hormone molecule locking onto a protein receptor molecule embedded in the cell membrane. Depending on the cell type, signaling pathways can be triggered by chemicals, light, heat, temperature, ion gradients, or even mechanical contact with another cell.</p>
<p><a href="#_ftnref15" title="_ftn15" name="_ftn15">[15]</a> As early as 1961, François Jacob and Jacques Monod (who had recently discovered the regulatory regions in DNA, for which they would share the 1965 Nobel Prize) wrote a report that emphasized the importance of regulatory feedback, talked about regulatory &#8220;circuits,&#8221; and suggested that cancer was triggered by the breakdown of regulatory control. Jacob and Monod (1962). These are all key ideas in systems biology today. McAdams and Arkin (1998).</p>
<p><a href="#_ftnref16" title="_ftn16" name="_ftn16">[16]</a> <!--[if supportFields]> QUOTE &quot;Wiener (1961).&quot; <![endif]-->Wiener (1961).<!--[if supportFields]><![endif]-->; <!--[if supportFields]> QUOTE &quot;von Bertalanffy (1969).&quot; <![endif]-->von Bertalanffy (1969).<!--[if supportFields]><![endif]-->. This history was recently summarized in <!--[if supportFields]> QUOTE &quot;Wolkenhauer (2001).&quot; <![endif]-->Wolkenhauer (2001).<!--[if supportFields]><![endif]-->. See also <!--[if supportFields]> QUOTE &quot;Hunter (2003).&quot; <![endif]-->Hunter (2003).<!--[if supportFields]><![endif]-->.</p>
<p><a href="#_ftnref17" title="_ftn17" name="_ftn17">[17]</a>  <!--[if supportFields]> QUOTE &quot;Ideker, Galitski, and Hood (2001).&quot; <![endif]-->Ideker, Galitski, and Hood (2001).<!--[if supportFields]><![endif]-->; <!--[if supportFields]> QUOTE &quot;Hood and Galas (2003).&quot; <![endif]-->Hood and Galas (2003).<!--[if supportFields]><![endif]-->; <!--[if supportFields]> QUOTE &quot;Collins et al. (2003).&quot; <![endif]-->Collins et al. (2003).<!--[if supportFields]><![endif]--></p>
<p><a href="#_ftnref18" title="_ftn18" name="_ftn18">[18]</a> <!--[if supportFields]> QUOTE &quot;Cohen (2001).&quot; <![endif]-->Cohen (2001).<!--[if supportFields]><![endif]--></p>
<p><a href="#_ftnref19" title="_ftn19" name="_ftn19">[19]</a> The hierarchy of levels obviously doesn&#8217;t stop at the cell membrane. Although deciphering the various cellular regulatory networks is a huge challenge in itself, systems biology ultimately has to deal as well with how cells organize themselves into tissues, organs, and the whole organism. One group that is trying to lay the groundwork for such an effort is the Physiome Project at the University of Auckland in New Zealand. http://www.webopedia.com/TERM/W/Web_services.html</p>
<p><a href="#_ftnref20" title="_ftn20" name="_ftn20">[20]</a> Among the known mechanisms for biological robustness are negative feedback, which maintains stability; redundancy, which allows for multiple backups; and modularity, which tends to isolate failures, rather than allowing them to spread. (Kitano (2002).) These mechanisms are also widely used by human engineers, a fact that some researchers regard as no accident, arguing that the parallels between biology and large-scale engineering may actually be quite deep. (Csete and Doyle (2002).)</p>
<p><a href="#_ftnref21" title="_ftn21" name="_ftn21">[21]</a> Physiological processes such as metabolism, signal transduction, and the cell cycle take place on a time scale that ranges from milliseconds to days, and are reversible in the sense that an activity flickers on, gene expression is adjusted as needed, and then everything returns to some kind of equilibrium. But the commitments that the cell makes during development are effectively <em>ir</em>reversible. Becoming a particular cell line means that the genetic regulatory networks in each successive generation of cells have to go through a cascade of decisions that end up turning genes on and off by the thousands. And unless there is some drastic intervention, as in the cloning experiments that created Dolly the Sheep, those genes are locked in place for the lifespan of the organism. (Davidson et al. (2002).) Of course, the developmental program does not proceed in an isolated, &#8220;open-loop&#8221; fashion, as a computer scientist might say. Quite the opposite. Very early in the process, for example, the growing embryo lays out its basic body plan (front versus back, top versus bottom, and so on) by establishing embryo-wide chemical gradients, so that the concentration of the appropriate compound tells each cell what to do. Similar tricks are used at every stage thereafter: each cell is always receiving copious feedback from its neighbors, with chemical signals providing a constant stream of instructions and course corrections.</p>
<p><a href="#_ftnref22" title="_ftn22" name="_ftn22">[22]</a>  After all, even very small changes in the timing of events during development, and in the rates at which various tissues grow, can have a profound impact on the final outcome.</p>
<p><a href="#_ftnref23" title="_ftn23" name="_ftn23">[23]</a>   <!--[if supportFields]> QUOTE &quot;&amp;lt;http://www.cellularsignaling.org/&amp;gt;&quot; <![endif]-->http://www.cellularsignaling.org/<!--[if supportFields]><![endif]-->; <!--[if supportFields]> QUOTE &quot;Gannis (2001).&quot; <![endif]-->Gannis (2001).<!--[if supportFields]><![endif]-->; <!--[if supportFields]> QUOTE &quot;Maher (2003).&quot; <![endif]-->Maher (2003).<!--[if supportFields]><![endif]--></p>
<p><a href="#_ftnref24" title="_ftn24" name="_ftn24">[24]</a> <!--[if supportFields]> QUOTE &quot;&amp;lt;http://www.systemsbiology.org/&amp;gt;&quot; <![endif]-->http://www.systemsbiology.org/<!--[if supportFields]><![endif]--></p>
<p><a href="#_ftnref25" title="_ftn25" name="_ftn25">[25]</a> <!--[if supportFields]> QUOTE &quot;&amp;lt;http://doegenomestolife.org/&amp;gt;&quot;<![endif]-->http://doegenomestolife.org/<!--[if supportFields]><![endif]-->; <!--[if supportFields]> QUOTE &quot;Frazier et al. (2003).&quot; <![endif]-->Frazier et al. (2003).<!--[if supportFields]><![endif]-->.</p>
<p><a href="#_ftnref26" title="_ftn26" name="_ftn26">[26]</a> <!--[if supportFields]> QUOTE &quot;&amp;lt;http://dc.nci.nih.gov/&amp;gt;&quot; <![endif]-->http://dc.nci.nih.gov/<!--[if supportFields]><![endif]--></p>
<p><a href="#_ftnref27" title="_ftn27" name="_ftn27">[27]</a> <!--[if supportFields]> QUOTE &quot;&amp;lt;http://www.doegenomestolife.org/pubs/ComputerScience-10.pdf&amp;gt;&quot; <![endif]-->http://www.doegenomestolife.org/pubs/ComputerScience-10.pdf<!--[if supportFields]><![endif]--></p>
<p><a href="#_ftnref28" title="_ftn28" name="_ftn28">[28]</a> <!--[if supportFields]> QUOTE &quot;&amp;lt;http://www.doegenomestolife.org/pubs/ComputerScience-10.pdf&amp;gt;&quot; <![endif]-->http://www.doegenomestolife.org/pubs/ComputerScience-10.pdf<!--[if supportFields]><![endif]--></p>
<p><a href="#_ftnref29" title="_ftn29" name="_ftn29">[29]</a> <!--[if supportFields]> QUOTE &quot;Collins et al. (2003).&quot; <![endif]-->Collins et al. (2003).<!--[if supportFields]><![endif]-->. To help achieve this Grand Challenge, the institute has launched the ENCODE project, a public research consortium dedicated to building an annotated encyclopedia of all known functional DNA elements. <!--[if supportFields]> QUOTE &quot;&amp;lt;http://www.genome.gov/10005107&amp;gt;&quot; <![endif]-->http://www.genome.gov/10005107<!--[if supportFields]><![endif]-->.</p>
<p><a href="#_ftnref30" title="_ftn30" name="_ftn30">[30]</a> <!--[if supportFields]> QUOTE &quot;Smith, Hunkapiller, and Hood (1986).&quot; <![endif]-->Smith, Hunkapiller, and Hood (1986).<!--[if supportFields]><![endif]-->; <!--[if supportFields]> QUOTE &quot;Hood and Galas (2003).&quot; <![endif]-->Hood and Galas (2003).<!--[if supportFields]><![endif]--></p>
<p><a href="#_ftnref31" title="_ftn31" name="_ftn31">[31]</a> <!--[if supportFields]> QUOTE &quot;&amp;lt;http://www.doegenomestolife.org/pubs/ComputerScience-10.pdf&amp;gt;&quot; <![endif]-->http://www.doegenomestolife.org/pubs/ComputerScience-10.pdf<!--[if supportFields]><![endif]--></p>
<p><a href="#_ftnref32" title="_ftn32" name="_ftn32">[32]</a> Although there are many variations, the basic idea starts with the &#8220;chip&#8221;: a glass slide containing an array of artificial DNA molecules that correspond to the genes of interest. To assay a particular cell type, the researchers extract all the mRNA molecules, label each of them with a fluorescent dye, and wash the resulting concoction across the chip so that each type of mRNA can bind to its complementary DNA sequence. Then the researchers just have to measure how brightly each spot fluoresces to gauge how much of the corresponding mRNA was present, which in turn gives an estimate of how actively the gene was being expressed. A superb overview of microarray technology is available on a private web site created by Chinese researcher Leming Shi: <!--[if supportFields]> QUOTE &quot;&amp;lt;http://www.gene-chips.com/&amp;gt;&quot; <![endif]-->http://www.gene-chips.com/<!--[if supportFields]><![endif]-->. See also <!--[if supportFields]> QUOTE &quot;&amp;lt;http://www.genome.gov/10000533&amp;gt;&quot; <![endif]-->http://www.genome.gov/10000533<!--[if supportFields]><![endif]--> and <!--[if supportFields]> QUOTE &quot;Gwynne and Page (1999).&quot; <![endif]-->Gwynne and Page (1999).<!--[if supportFields]><![endif]-->.</p>
<p><a href="#_ftnref33" title="_ftn33" name="_ftn33">[33]</a> <!--[if supportFields]> QUOTE &quot;MacBeath and Schreiber (2000).&quot; <![endif]-->MacBeath and Schreiber (2000).<!--[if supportFields]><![endif]--></p>
<p><a href="#_ftnref34" title="_ftn34" name="_ftn34">[34]</a>  <!--[if supportFields]> QUOTE &quot;Collins et al. (2003).&quot; <![endif]-->Collins et al. (2003).<!--[if supportFields]><![endif]--></p>
<p><a href="#_ftnref35" title="_ftn35" name="_ftn35">[35]</a>  The consortium already has standard vocabulary lists, or &#8220;ontologies,&#8221; in three areas: Molecular function (e.g., TKTKTK); biological process (e.g., TKTKTK); and subcellular structures (e.g., TKTKTK). <!--[if supportFields]> QUOTE &quot;&amp;lt;http://www.geneontology.org/&amp;gt;&quot; <![endif]-->http://www.geneontology.org/<!--[if supportFields]><![endif]-->.</p>
<p><a href="#_ftnref36" title="_ftn36" name="_ftn36">[36]</a> <!--[if supportFields]> QUOTE &quot;&amp;lt;http://biospice.lbl.gov/home.html&amp;gt;&quot; <![endif]-->http://biospice.lbl.gov/home.html<!--[if supportFields]><![endif]--></p>
<p><a href="#_ftnref37" title="_ftn37" name="_ftn37">[37]</a> <!--[if supportFields]> QUOTE &quot;&amp;lt;http://www.cytoscape.org/&amp;gt;&quot; <![endif]-->http://www.cytoscape.org/<!--[if supportFields]><![endif]--></p>
<p><a href="#_ftnref38" title="_ftn38" name="_ftn38">[38]</a>  TK</p>
<p><a href="#_ftnref39" title="_ftn39" name="_ftn39">[39]</a> TK</p>
<p><a href="#_ftnref40" title="_ftn40" name="_ftn40">[40]</a> TK</p>
<p><a href="#_ftnref41" title="_ftn41" name="_ftn41">[41]</a> TK</p>
<p><a href="#_ftnref42" title="_ftn42" name="_ftn42">[42]</a> <!--[if supportFields]> QUOTE &quot;DeFrancesco and Wilkinson (1999).&quot; <![endif]-->DeFrancesco and Wilkinson (1999).<!--[if supportFields]><![endif]-->.</p>
<p><a href="#_ftnref43" title="_ftn43" name="_ftn43">[43]</a>  <!--[if supportFields]> QUOTE &quot;DeFrancesco (2002).&quot; <![endif]-->DeFrancesco (2002).<!--[if supportFields]><![endif]-->.</p>
<p><a href="#_ftnref44" title="_ftn44" name="_ftn44">[44]</a> TK</p>
<p><a href="#_ftnref45" title="_ftn45" name="_ftn45">[45]</a> <!--[if supportFields]> QUOTE &quot;Ideker et al. (2001).&quot; <![endif]-->Ideker et al. (2001).<!--[if supportFields]><![endif]--></p>
<p><a href="#_ftnref46" title="_ftn46" name="_ftn46">[46]</a> <!--[if supportFields]> QUOTE &quot;Kitano (2002).&quot; <![endif]-->Kitano (2002).<!--[if supportFields]><![endif]--></p>
<p><a href="#_ftnref47" title="_ftn47" name="_ftn47">[47]</a> See, for example, a compendium of major metabolic pathways posted by Karl Miller of TK:</p>
<p><a href="#_ftnref48" title="_ftn48" name="_ftn48">[48]</a> <!--[if supportFields]> QUOTE &quot;Kitano (2002).&quot; <![endif]-->Kitano (2002).<!--[if supportFields]><![endif]--></p>
<p><a href="#_ftnref49" title="_ftn49" name="_ftn49">[49]</a> One harbinger of this hypothetical new discipline is the recent, successful effort to design and implement (through genetic engineering) artificial networks in cells. TKTKTTK.</p>
<p><a href="#_ftnref50" title="_ftn50" name="_ftn50">[50]</a> The U.S. Food and Drug Administration has already used computer models to help assess drugs for factors such as cardiac safety. (Noble (2002).)</p>
<p><a href="#_ftnref51" title="_ftn51" name="_ftn51">[51]</a> <!--[if supportFields]> QUOTE &quot;Noble (2002).&quot; <![endif]-->Noble (2002).<!--[if supportFields]><![endif]--></p>
<p><a href="#_ftnref52" title="_ftn52" name="_ftn52">[52]</a> Links to each of these sites have been collected at the Caltech ERATO site, which also offers a repository of the various software packages. (<!--[if supportFields]> QUOTE &quot;&amp;lt;http://www.sbw-sbml.org/repository.html&amp;gt;&quot; <![endif]-->http://www.sbw-sbml.org/repository.html<!--[if supportFields]><![endif]-->)</p>
<p><a href="#_ftnref53" title="_ftn53" name="_ftn53">[53]</a> Among representations in current use are Boolean models, Bayesian networks, generalized logical networks, Petri nets, rule-based systems, fuzzy logic, and both stochastic and deterministic ordinary differential equations. See <!--[if supportFields]> QUOTE &quot;&amp;lt;http://www.doegenomestolife.org/pubs/ComputerScience-10.pdf&amp;gt;&quot; <![endif]-->http://www.doegenomestolife.org/pubs/ComputerScience-10.pdf<!--[if supportFields]><![endif]--></p>
<p><a href="#_ftnref54" title="_ftn54" name="_ftn54">[54]</a> <!--[if supportFields]> QUOTE &quot;McAdams and Arkin (1998).&quot; <![endif]-->McAdams and Arkin (1998).<!--[if supportFields]><![endif]--></p>
<p><a href="#_ftnref55" title="_ftn55" name="_ftn55">[55]</a> <!--[if supportFields]> QUOTE &quot;Bray (1995).&quot; <![endif]-->Bray (1995).<!--[if supportFields]><![endif]--></p>
<p><a href="#_ftnref56" title="_ftn56" name="_ftn56">[56]</a> The analogy isn&#8217;t perfect, since real proteins rarely respond in a completely binary, yes-no fashion. But the analogy can be useful nonetheless.</p>
<p><a href="#_ftnref57" title="_ftn57" name="_ftn57">[57]</a> That is, the propensity of certain bacteria, such as <em>E. coli</em>, to swim towards higher concentrations of nutrients.</p>
<p><a href="#_ftnref58" title="_ftn58" name="_ftn58">[58]</a> Bray was quite explicitly <em>not</em> claiming that the cell processes information the way a modern digital computer does. The organizations are radically different, starting with the fact that there&#8217;s no clean separation between the data store and the central processing unit: the cell&#8217;s memory is the same protein reaction network that does its processing. In that sense, the cell&#8217;s information processing architecture is organized more like that of a neural network. And indeed, Bray&#8217;s 1995 paper made much of that analogy.</p>
<p><a href="#_ftnref59" title="_ftn59" name="_ftn59">[59]</a> <!--[if supportFields]> QUOTE &quot;McAdams and Shapiro (1995).&quot; <![endif]-->McAdams and Shapiro (1995).<!--[if supportFields]><![endif]--></p>
<p><a href="#_ftnref60" title="_ftn60" name="_ftn60">[60]</a> Indeed, the electrical circuit analogy is almost irresistible, as can be seen from a glance at any of the known regulatory pathways: the tangle of links and nodes could easily pass for a circuit diagram of Intel&#8217;s latest Pentium chip. But the fact that the most fruitful analogies tend to come from engineering, as opposed to, say, physics, chemistry, or pure mathematics, may have deeper reasons as well. Although molecular biology is obviously rooted in physics and chemistry, for example, the very notion of &#8220;function&#8221; takes it a long, long way from those roots. (Hartwell et al. (1999).) Organisms exist to survive and reproduce (a purpose endowed by natural selection), whereas atoms and molecules just are; they have no purpose whatsoever (except, possibly, in a religious context). So for that reason alone, the concepts needed to understand network function are more likely to resemble the concepts already developed for &#8220;synthetic&#8221; disciplines, of which engineering and computer science are prime examples.</p>
<p>On a more pragmatic level, meanwhile, the engineering disciplines have already had a long history of systems-level thinking, and indeed, have already produced artifacts that are approaching biological levels of complexity. A Boeing 777 jetliner contains about 150,000 subsystem modules, including 1000 computers, a number that&#8217;s impressively close to the estimated 300,000 different proteins in a typical human cell. Just as in the cell, moreover, these subsystems are linked into an immensely complex &#8220;network of networks&#8221;: a control system that just happens to fly. And, just as in the cell, those networks exhibit an intricate interplay between complexity, feedback regulation, robustness, fragility, and cascading failures, all of which indicate that engineering and biology may have much more in common than their superficial differences might suggest. (Csete and Doyle (2002).)</p>
<p><a href="#_ftnref61" title="_ftn61" name="_ftn61">[61]</a> McAdams is an electrical engineer by training. <!--[if supportFields]> QUOTE &quot;Taubes (2002).&quot; <![endif]-->Taubes (2002).<!--[if supportFields]><![endif]--></p>
<p><a href="#_ftnref62" title="_ftn62" name="_ftn62">[62]</a> Actually, statistical fluctuations in the current flow can make electrical circuits noisy, too, but usually at a much lower level.</p>
<p><a href="#_ftnref63" title="_ftn63" name="_ftn63">[63]</a> <!--[if supportFields]> QUOTE &quot;McAdams and Arkin (1998).&quot; <![endif]-->McAdams and Arkin (1998).<!--[if supportFields]><![endif]--></p>
<p><a href="#_ftnref64" title="_ftn64" name="_ftn64">[64]</a> <!--[if supportFields]> QUOTE &quot;Tyson, Chen, and Novak (2001).&quot; <![endif]-->Tyson, Chen, and Novak (2001).<!--[if supportFields]><![endif]--></p>
<p><a href="#_ftnref65" title="_ftn65" name="_ftn65">[65]</a> <!--[if supportFields]> QUOTE &quot;Tyson, Chen, and Novak (2001).&quot; <![endif]-->Tyson, Chen, and Novak (2001).<!--[if supportFields]><![endif]--></p>
<p><a href="#_ftnref66" title="_ftn66" name="_ftn66">[66]</a> <!--[if supportFields]> QUOTE &quot;&amp;lt;http://www.doegenomestolife.org/pubs/ComputerScience-10.pdf&amp;gt;&quot; <![endif]-->http://www.doegenomestolife.org/pubs/ComputerScience-10.pdf<!--[if supportFields]><![endif]--></p>
<p><a href="#_ftnref67" title="_ftn67" name="_ftn67">[67]</a> Kitano (2002), pp. 1663-4.</p>
<p><a href="#_ftnref68" title="_ftn68" name="_ftn68">[68]</a> <!--[if supportFields]> QUOTE &quot;&amp;lt;http://www.webopedia.com/TERM/W/Web_services.html&amp;gt;&quot; <![endif]-->http://www.webopedia.com/TERM/W/Web_services.html<!--[if supportFields]><![endif]--></p>
<p><a href="#_ftnref69" title="_ftn69" name="_ftn69">[69]</a> <!--[if supportFields]> QUOTE &quot;&amp;lt;http://www.webopedia.com/TERM/g/grid_computing.html&amp;gt;&quot; <![endif]-->http://www.webopedia.com/TERM/g/grid_computing.html<!--[if supportFields]><![endif]--></p>
<p><a href="#_ftnref70" title="_ftn70" name="_ftn70">[70]</a> <!--[if supportFields]> QUOTE &quot;&amp;lt;http://www.webopedia.com/TERM/p/peer_to_peer_architecture.html&amp;gt;&quot; <![endif]-->http://www.webopedia.com/TERM/p/peer_to_peer_architecture.html<!--[if supportFields]><![endif]--></p>
<p><a href="#_ftnref71" title="_ftn71" name="_ftn71">[71]</a> <!--[if supportFields]> QUOTE &quot;Waldrop (2002).&quot; <![endif]-->Waldrop (2002).<!--[if supportFields]><![endif]--></p>
<p><a href="#_ftnref72" title="_ftn72" name="_ftn72">[72]</a> <!--[if supportFields]> QUOTE &quot;&amp;lt;http://www.sbw-sbml.org/index.html&amp;gt;&quot; <![endif]-->http://www.sbw-sbml.org/index.html<!--[if supportFields]><![endif]--></p>
<p><a href="#_ftnref73" title="_ftn73" name="_ftn73">[73]</a> http://www.bioeng.auckland.ac.nz/physiome/physiome.php</p>
</div>
]]></content:encoded>
			<wfw:commentRss>http://mmwaldrop.com/Starclouds/2007/11/19/an-overview-of-systems-biology/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>The New Genomic Medicine</title>
		<link>http://mmwaldrop.com/Starclouds/2007/11/07/the-new-genomic-medicine/</link>
		<comments>http://mmwaldrop.com/Starclouds/2007/11/07/the-new-genomic-medicine/#comments</comments>
		<pubDate>Thu, 08 Nov 2007 01:36:21 +0000</pubDate>
		<dc:creator>Mitch</dc:creator>
				<category><![CDATA[Genomics]]></category>
		<category><![CDATA[Health care]]></category>
		<category><![CDATA[Innovation]]></category>
		<category><![CDATA[Technology]]></category>
		<category><![CDATA[drug industry]]></category>
		<category><![CDATA[genome]]></category>
		<category><![CDATA[health insurance]]></category>

		<guid isPermaLink="false">http://mmwaldrop.com/Starclouds/2007/11/07/the-new-genomic-medicine/</guid>
		<description><![CDATA[One of the most frustrating things about our relentlessly partisan debate over health care is that the proposals on every side are so-linear. Are drugs too expensive, and do too many people lack insurance? Subsidize them. Are malpractice awards spiraling out of control? Cap them. Is the total cost of health care growing faster than [...]]]></description>
			<content:encoded><![CDATA[<blockquote><p><em>One of the most frustrating things about our relentlessly partisan debate over health care is that the proposals on every side are so-</em>linear. <em>Are drugs too expensive, and do too many people lack insurance? Subsidize them. Are malpractice awards spiraling out of control? Cap them. Is the total cost of health care growing faster than any conceivable economy could support? Manage it. Force individual patients to defer, cut back, pay more out of pocket, go without. Do whatever meets the needs of the moment, in short, so long as you don&#8217;t actually change anything. </em></p></blockquote>
<blockquote><p><em>Even as the arguments have raged in Washington, however, a radically different vision of medicine&#8217;s future has begun to emerge from the laboratories, courtesy of the Human Genome Project and its many spin-offs. Think of it as the ultimate in treating causes, not symptoms. Or think of it as the medical version of working smarter, not harder. Either way, the new &#8220;genomic&#8221; medicine is shaping up to be the most potent catalyst for health care transformation since the introduction of antibiotics in the mid-20<sup>th</sup> century, perhaps even since the germ theory of disease in the 1860s. Nor will the results be limited to new forms of treatment. Along with therapy, genomic medicine will change the nature of drug development, health insurance, and even the relationship between doctor and patient, all in ways the health-care industry is just waking up to.</em></p></blockquote>
<p>That&#8217;s from one draft of a feature story I wrote about the new genomic medicine back in 2003, for the prototype issue of a new general-interest science magazine that was going to be published by a certain famous university. The magazine never materialized, unfortunately (although I <em>was</em> paid for my work!) But I&#8217;ve always liked the story, which holds up pretty well four years later. So I thought I would publish the final version here. (Along with a <a href="http://mmwaldrop.com/Starclouds/wp-content/uploads/2007/11/the-new-genomic-medicine.pdf" title="The New Genomic Medicine">printable version</a> for downloading.) Enjoy. <span id="more-40"></span></p>
<h1>The New Genomic Medicine</h1>
<h3>M. Mitchell Waldrop</h3>
<p>Earlier this year, almost 50 years to the day after James Watson and Francis Crick first described the DNA double helix, scientists celebrated an equally epochal event: the official completion of the Human Genome Project. They had ample reason to be jubilant. Watson and Crick had given us the <em>structure</em> of DNA; now, after 13 years of international effort and an investment of $2.7 billion, we had its <em>content</em>: a 3-billion-character sequence comprising the entire genetic blueprint of a human being.</p>
<p>But then, as the scientists themselves pointed out, this is hardly the time to rest on our laurels. Already, shimmering in the distance, we can begin to see the next milestone: a new &#8220;genomic medicine&#8221; in which physicians will be able to diagnose and treat each patient&#8217;s disorder with molecular precision. Granted, it could still take us a decade or more to get there. &#8220;Genomic medicine is an enormous opportunity, and an enormous challenge,&#8221; says Timothy Clark, head of bioinformatics at the Cambridge, Massachusetts, biotech firm Millennium Pharmaceuticals. But the first steps are already being taken. Even now, hardly a day goes by that a university or biotech company doesn&#8217;t announce the discovery of a gene linked to <em>this</em> condition or <em>that</em> condition. And the pharmaceutical companies&#8217; development pipeline is full of brand new drugs crafted at least in part with genetic knowledge.</p>
<p>Moreover, when we do get there, the effects on medicine and on the health care system as a whole will be far more profound than most people realize. Indeed, the new genomic medicine is shaping up to be the most potent catalyst for health care transformation since the introduction of antibiotics in the mid-20<sup>th</sup> century, perhaps even since the germ theory of disease in the 1860s. Nor will the results be limited to new forms of treatment. Along with therapy, genomic medicine will change the nature of drug development, health insurance, and even the relationship between doctor and patient, all in ways the health-care industry is just waking up to.</p>
<p>&#8220;It&#8217;s only recently that people in the industry have even begun to look at these systemic effects,&#8221; says Philip Reilly, CEO of Interleukin Genetics in Waltham, Massachusetts. Just as in the fable of the blind men and the elephant, moreover, the discussion has focused on disjointed bits and pieces of the transformation: the Wall Street Journal covers biotech and the pharmaceutical industry; Science and Nature document the latest science; and only industry insiders have even begun to delve into the effects on insurance and the doctor-patient relationship. Our goal in this article is to bring those pieces together to provide an overview of the revolution, with all its attendant potential and danger.</p>
<p><strong>1.      </strong><strong>Diagnosis and Treatment</strong></p>
<p>One of the most intriguing possibilities is that genomics, will one day reverse the ever-rising spiral of health-care costs, and actually start to drive them lower. Certainly these techniques could eliminate an enormous amount of costly trial and error from medicine, making therapy far safer and more effective than it is today. For example, genomics will allow physicians to identify the exact genetic defects involved in every malignant tumor they see, as well as the exact genetic underpinnings of every case of diabetes, asthma, and so on. Genomics will also reveal how well any given therapy is likely to work for each individual patient-and ultimately lead to pharmaceuticals that precisely target the broken cellular machinery in that patient.</p>
<p>The key is being able to understand and measure human genetic variation among individuals, according to Chris Austin, advanced-research director of the National Human Genome Research Institute in Bethesda, Maryland. &#8220;The original genome sequence told us what we have in common,&#8221; he says. (Literally: the sequence represented a mix of DNA from many individuals, deliberately chosen to include both sexes and all racial groups.) But the story of health and disease is told mostly in our differences. Why are some of us horribly susceptible to conditions like arteriosclerosis or high blood pressure, no matter how many hours we spend on the treadmill or how much brown rice we consume? Why are others, who appear to have won the genetic lottery, able to smoke two packs a day for decades, and still come out with lungs as clear as a baby&#8217;s? Knowing the answer would take a lot of the error out of medicine&#8217;s trial and error approach. Such questions are surprisingly tricky to answer, says Austin, in part because the actual genetic differences involved are surprisingly small. &#8220;Between you and me there are only about 3 million genetic differences,&#8221; he says-the equivalent of just one spelling variation in every 1000 letters of the genetic code. Or to put it another way, any two human beings are 99.9 percent identical; <em>all</em> their differences, whether they be in height, skin color, blood type, athletic ability, disease susceptibility, or anything else, arise from that remaining 0.1 percent. (Actually, that&#8217;s true only if you compare women to women or men to men; if you include the X and Y sex chromosomes, the between-gender match falls to only 98.5%-which means that, as jokesters are fond of pointing out, a woman is no more closely related to a man than she is to a female chimpanzee.)</p>
<p>But a more important reason, says Austin, is that it&#8217;s actually quite rare to find a gene that is the sole culprit in causing a disease. There are a few; in Huntington&#8217;s disease a defective gene produces a protein that forms insoluble clots in certain brain cells, causing dementia, along with a progressive loss of motor control. But far more common are complex disorders that involve multiple genes, each of which only increases our susceptibility. &#8220;And that&#8217;s a fundamentally different situation,&#8221; says Austin, &#8220;because there are lots of people running round with bad gene variants who don&#8217;t have the disease.&#8221; Examples include arteriosclerosis, high blood pressure, schizophrenia, and that poster child for genetic complexity, diabetes, which afflicts 17 million Americans. At least 15 genetic variants have been identified as upping the risk of adult-onset diabetes, and yet none yields the inevitability of Huntington&#8217;s. As doctors have been telling us for years, there is also a powerful influence of the environment: diet, exercise and any number of other factors can affect whether the risky genes become active.</p>
<p>Fully sorting out this interplay of multiple genes and the environment, and the contributions of each to health and disease, will take years&#8211;if not decades. In the meantime, however, the health-care industry is hardly waiting around. Commercial work in genomics has been rushing ahead more or less independently of the Human Genome Project for years (although the practitioners have eagerly made use of the project&#8217;s data as quickly as they could get it). Much of that effort has focused on one group of genetic markers: those that identify precisely which subclass of disease a patient may have and determine precisely how he or she will respond to a particular drug. &#8220;Pharmacogenomics,&#8221; as the field is known, has gotten the attention of the big pharmaceutical firms, the insurance companies, the regulators at the Food and Drug Administration, and just about everyone else in the health care industry, says Interleukin&#8217;s Reilly. The idea has been around for quite a while, he says, &#8220;but the science and technology of it are getting better and better, and people are finally saying that we&#8217;re really going to be able to do this.&#8221;</p>
<p>Certainly there are any number of biotech firms eager to market the necessary genetic tests. &#8220;We can test any set of genes you want, as accurately as you want,&#8221; declares Charles Cantor, chief science officer of the San Diego DNA analysis firm Sequenom, Inc. &#8220;Biology has never had this kind of data before.&#8221; A prime example is the recently developed technology of DNA microarrays, in which a sliver of specially prepared silicon-a &#8220;gene chip&#8221;-can look for changes in the activity of many genes at once. Wash one of these chips with a puréed tumor sample, say, and it will respond with a pattern of fluorescent spots that maps the activity of hundreds or thousands of genes at a time. By comparing this pattern with that of a normal cell, a computer can then generate a vividly colored chart that makes it instantly obvious how profoundly disturbed the tumor cells really are-and that can, in principle, identify precisely which components of the tumor cells&#8217; regulatory networks are broken. In 2001, for example, Stanford University biologist Patrick O. Brown and his colleagues showed that such a chip could clearly distinguish two types of breast cancer that appear identical under the microscope and had previously been classified as the same cancer. They&#8217;re not. Patients with one type respond well to conventional chemotherapy and have a high recovery rate. But patients with the other type, which shows a very different pattern of gene activity, respond not at all. The payoff for such insights, in this and other diseases, is clear: physicians could immediately move non-responsive patients to other kinds of therapy without wasting money, effort, or most important, time.</p>
<p>Still-does all this activity mean that genomic medicine is coming soon to a clinic near you?</p>
<p>Not quite. Yes, genomic medicine has begun to clear some of its first scientific hurdles, particularly in the realm of pharmacogenomics. But, like any other technology, it also faces any number of ethical, legal, economic, and psychological hurdles. &#8220;And in my experience,&#8221; says Interleukin&#8217;s Reilly, &#8220;these ‘social&#8217; hurdles are the big ones.&#8221;</p>
<p>As patients, for example, you or I (and our doctors) would have an obvious incentive to take advantage of genetic tests if they were available. If nothing else, research tells us that genetics-not environment-is the primary reason our responses are all over the map for certain drugs, including the &#8220;statins&#8221; used to lower cholesterol, the beta-blockers used to treat congestive heart failure, the bronchodilators used to control asthma, and many others. Genetic variations may also affect our ability to metabolize alcohol and the compounds in tobacco smoke-which means that genetics could also be a major factor in susceptibility to addiction. In some of these cases, the genetic difference might lie in a &#8220;receptor&#8221; protein, the gateway molecule that will supposedly allow a particular drug to cross the cell membrane and find its target deep in the interior; if the shape of the receptor isn&#8217;t quite what the drug was expecting, then the drug can&#8217;t get in, and we might as well have taken a sugar pill. In other cases, the genetic differences might lie in the liver, which harbors a certain set of digestive enzymes that metabolize nutrients and anything else that enters the bloodstream-including drugs. If your enzymes metabolize a given drug faster than expected, it might never have a chance to take effect. If my enzymes metabolize the drug too slowly, it might build up to toxic levels. That&#8217;s exactly what seems to be happening with Cipro and related antibiotics, which can sometimes trigger tingling, numbness, or even severe pain in the arms and legs. Such individual differences in metabolic rates are also what make it difficult to get the right dosage with antidepressants like Prozac-which is why Reilly, for one, predicts psychotherapy will be among the first fields to use pharmacogenetic testing routinely.</p>
<p>Those same genetic tests will also help doctors identify which of us is most likely to suffer side effects from a particular drug-&#8220;side effects,&#8221; in this case, meaning much more than an upset stomach. According to the <em>Journal of the American Medical Association</em>, adverse reactions to FDA-approved drugs that were correctly administered by licensed physicians occur at the rate of 2.2 million cases per year in the United States, with more than 100,000 of those cases ending in death. That makes the innocent-sounding &#8220;side effects&#8221; the fifth leading cause of death, right after heart disease, cancer, stroke, and pulmonary disease, and just ahead of accidents. Obviously, anything that could lower those figures would save a great deal of human suffering, not to mention cost.</p>
<p>On the other hand, patients (and their doctors) will have to balance any possible benefits from the genetic tests against their cost and their reliability, which can be low. Even though a test might detect the presence of a given genetic variant very accurately, admits Sequenom&#8217;s Cantor, &#8220;How accurately can you predict the outcome? That&#8217;s tougher.&#8221; Just as in disease susceptibility, the instances where drug response is determined by a single gene are in the minority. Indeed, drug response can be just as much a complex, multi-gene process as diabetes, and just as much influenced by an individual&#8217;s environment and life experience. That&#8217;s one big reason why the insurance industry has been leery of paying for genetic testing. And until the tests&#8217; reliability improves-which it undoubtedly will, in time, as researchers learn more about the basic genetics-that attitude is unlikely to change.</p>
<p><strong>2.      </strong><strong>Drug development.</strong></p>
<p>Meanwhile, the big pharmaceutical companies, or &#8220;pharmas,&#8221; are finding pharmacogenetics to be an exceptionally tricky balancing act. &#8220;My experience is that the scientists in the big pharmas love it, and the marketing people hate it,&#8221; says Reilly.</p>
<p>On the one hand, the pharmas have often, and with some justice, been accused of cultivating a Hollywood-like addiction to &#8220;blockbusters&#8221;: drugs like Prozac or Viagra that can be sold at a premium to millions of people. Occasionally, as in a recent article in the <em>Wall Street Journal</em>, they&#8217;ve even been accused of undermining the development of pharmacogenomic tests for those blockbusters, on the grounds that screening out even a few percent of potential customers would cost them millions. And whatever the truth of that allegation, it is true that the pharmas have little incentive to offer genetic testing for drugs already on the market-not when they&#8217;re desperately trying to recoup an average investment of $800 million for every drug that actually makes it that far.</p>
<p>On the other hand, it&#8217;s not clear how long the FDA will sit still for that attitude once reliable tests are actually available, since it means deliberately selling a drug to at least some patients who won&#8217;t benefit from it. What the pharmas really need is to get that $800 million figure way down-which is why there&#8217;s a very different attitude in the laboratories, where company researchers see that genomics could speed the process of developing new compounds. For one thing, says Millennium Pharmaceuticals&#8217; Clark, genomics has already opened up a whole new world of possibilities. &#8220;In past decades, there were only about 500 cellular proteins used as drug targets in the entire industry.&#8221; But now, he says, thanks to the Human Genome Project, &#8220;we&#8217;ve essentially done a survey of all the 30,000-plus genes in the body. So even if only a small fraction of those are potential drug targets for drug development, we&#8217;re talking about thousands of new targets.&#8221;</p>
<p>Furthermore, genetic testing could allow researchers to sidestep problems with a new drug early in the development cycle. To take a hypothetical example, let&#8217;s say that the researchers at MegaPharmaCo come up with a new drug that drastically slows the progress of dementia in Alzheimer&#8217;s patients. Unfortunately, clinical trials show that, in a small number of individuals, the drug also seems to trigger life-threatening heart arrhythmias. Today, since the Food and Drug Administration would never approve such a drug, MegaPharmaCo would have no choice but to write off its sunk costs and start looking elsewhere-which is a classic example of where that $800-million-per-marketable-drug figure comes from: roughly $720 million, or 90%, is the cost of other drugs that washed out along the way. With the right genetic tests, however, MegaPharmaCo could identify the vulnerable individuals ahead of time and eliminate them from the clinical trials. Result: the large majority of Alzheimer&#8217;s sufferers get a better life, MegaPharmaCo gets a new moneymaker instead of an expensive failure-albeit a moneymaker that will have to bear a clear warning on the label about who should not take it-and the upward pressure on drug prices eases a notch.</p>
<p>Actually, that example is not so hypothetical. This kind of genetic testing would have saved GlaxoSmithKline a lot of grief a few years ago when it was forced to withdraw alosetron, the first drug approved for irritable bowel syndrome, after a small number of users developed life-threatening complications. (Following intense lobbying by desperate patients, the FDA has allowed the drug back on the market, but only when used at much lower initial dosages, and with intensive monitoring-precautions that might also be made unnecessary by genetic screening.) Stories like this may even be enough to get the attention of the marketing departments, says Reilly: &#8220;I see the Pharmas fighting pharmacogenomics until they realize that it can help them rescue a drug that might have been rejected.&#8221;</p>
<p>How this will all shake out over time is far from clear. It may be that improvements in our understanding of cellular networks will make for drugs that are more and more precisely targeted, so that each one meets the needs of a smaller and smaller population of patients. Of course, it&#8217;s an interesting question whether the development costs of these niche drugs will ever fall far enough that a small population could afford them. But if that did happen, it would certainly spell the end of the blockbuster drug. Or would it? &#8220;The blockbuster drug model won&#8217;t go away,&#8221; argues Jeffery Augen, director of IBM&#8217;s bioinformatics unit-&#8220;but it will change. When you target diseases at the molecular level, addressing the underlying mechanism, you may find differences and commonalities that weren&#8217;t obvious before. So you may end up with one compound that treats multiple diseases. In fact, there are already drugs on the market that have different targets. An example are the Cox-2 inhibitors, which are very potent against inflammation-but only because they interfere with the growth of capillaries, which means that they may be also inhibitors for cancer growth.&#8221;</p>
<p>The upshot: we can expect Big Pharma to embrace genomics slowly, after they&#8217;ve had a chance to calibrate the intrinsic trade-offs. And over time, we can hope genomics can begin to reverse the rise in drug costs. &#8220;I have certainly made the argument that it will,&#8221; says Reilly. &#8220;But I don&#8217;t know the time frame. In the next three years? Absolutely not. In the next 10 years? Maybe.&#8221;</p>
<p><strong>3.      </strong><strong>Insurance.</strong></p>
<p>Beyond diagnostic genetic testing is the largely unexplored realm of predictive testing: genetic assessments that will one day tell us not just what we have now, but what may lie in our medical future. Eventually, such tests could finally force the health care system to get serious about delaying, or even preventing, things like heart disease or diabetes, instead of always waiting until we get sick. And as a side effect, genomics will likely transform the insurance industry-which, after all, lives and dies by evaluating risks-beyond recognition.</p>
<p>For the time being, the &#8220;payers&#8221;-insurance companies, managed care companies, Medicare, Medicaid, and the like-find the whole notion of predictive genetic testing to be something of a nightmare. &#8220;I think the issue is too new, and the insurance industry is barely grappling with it,&#8221; says Murali Prahalad, Sequenom&#8217;s business development manager. In fact, adds Prahalad, who has worked closely with the industry, &#8220;a lot of insurers seem to be banking on the idea that testing will be made illegal, or will be so ethically abhorrent that it will just go away.&#8221;</p>
<p>Of course, it won&#8217;t go away. The science is too powerful and the human desire to know the future too great. But the industry&#8217;s wish is understandable. On the one hand, genetic testing is already a political lightning rod: the same genetic data that could help us stay healthier, through better treatment plans and preventative measures, could also make us a higher insurance risk, not to mention a greater employment risk. Thus the widespread anxiety over genetic discrimination, and the laws that have been passed in many states to prohibit it. But on the other hand, insurance can&#8217;t work as a business unless higher risks are covered by higher premiums. So how can the payers <em>not</em> take genetic data into account? And conversely, what are they supposed to do in the not-too-distant future, when individuals can get themselves tested in private, and then sign up for a policy knowing a few nasty little facts about their genomes that the insurance company doesn&#8217;t know?</p>
<p>&#8220;We&#8217;re talking about the fundamental principles of how we assign risk,&#8221; says Prahalad. &#8220;How industry deals with that is a huge open question.&#8221;</p>
<p>The end result won&#8217;t be business as usual. But what will the business model look like? Prahalad offers one scenario. &#8220;I would start with two presumptions: first, that no one should be denied coverage on a genetic basis; and second, that no one should be forced to have a genetic test in order to qualify. Then I, as an applicant, have two choices. A) I don&#8217;t take the test, in which case I go into a risk pool just as today, on the basis of age and so forth. But I would also have to certify that I don&#8217;t know of any genetic conditions that affect my health. What I know, the insurance company knows; I won&#8217;t try to game the system. Or B) I submit to a test, and get assigned to a genetic risk class. If I&#8217;m low risk, that&#8217;s not a problem; maybe I even get a lower premium.&#8221;</p>
<p>The question is how to deal with high-risk individuals. Since they will be covered under this scenario (it will be illegal to deny them coverage, remember), it will be in the payers&#8217; interest to move them into an aggressive preventative regimen. Let&#8217;s say a new genetic test reveals that a 40-year-old employee of a Fortune 500 company is at high risk of developing a deadly form of prostate cancer by the time he&#8217;s in his 50s. The company&#8217;s health plan immediately puts him on a schedule of frequent screening, plus regular treatments with genetically targeted drugs that will delay the onset of his particular kind of tumor for decades-until long after he has died from other causes. The regimen is pricey. But it costs a fraction of what the plan would have to pay to treat a life-threatening cancer.</p>
<p>In addition, it might also be in the payers&#8217; interest to undertake a little wholesale industry restructuring, so that the risks could be pooled in novel ways. For example, a good life insurance purchaser is a bad retirement insurance purchaser, and vice versa; if both were handled through the same company-as they rarely are today-there might be creative new ways to bundle the risks. By extension, the age of genomic medicine might ultimately force the industry to bundle <em>all </em>forms of insurance: health, disability, retirement, unemployment, life-the works.</p>
<p><strong>4.      </strong><strong>The doctor-patient relationship.</strong></p>
<p>What will genomic medicine feel like on the receiving end? When we walk through that clinic door in 2013, or in 2023, what kind of experience will we have?</p>
<p>Well, it&#8217;s clear enough that the experience could be profoundly different from what we&#8217;re used to-although it&#8217;s considerably less clear what those differences will be.</p>
<p>On the one hand, for example, it&#8217;s conceivable that &#8220;personalized&#8221; genomic medicine could have the paradoxical effect of turning the clinic into a depersonalized assembly line. In this scenario, we&#8217;d walk in and get hooked up to a machine. The machine would give us an automated DNA scan. Computers would generate an automated diagnosis. We&#8217;d walk out clutching a computer-generated treatment plan-all without once discussing our problems with a human being. This transformation, which would bring medicine into line with, say, banking, could take a while; when it comes to information technology, the health-care system is light years behind almost any other sector of the economy. But genomics may force the issue. And so will cognitive overload. After all, human doctors are having enough trouble keeping up with medical progress as it is. As the impending explosion of genetic data combines with a multiplication of treatment options, they will find it impossible: the sheer quantity of things the doctor needs to keep in her brain will exceed the capacity of that organ. The current expedient-&#8220;subspecialty&#8221; care, in which each doctor focuses on a smaller and smaller portion of the patient-won&#8217;t be practical. So the ancient goal of &#8220;finding a good doctor&#8221; will no longer be viable; the new goal will be finding a good <em>health care system</em>-one that most definitely includes those state-of-the-art facilities for testing, diagnosing, and data analysis.</p>
<p>On the other hand, one could argue that many managed-care clinics are pretty factory-like already; genomics could hardly make them worse. And in any case, it&#8217;s just as conceivable that genomic medicine will be <em>more </em>personalized than it is now, in the sense of offering much more room for individual patient involvement. Thanks to the Internet, for example, it&#8217;s no longer unusual to see patients sitting in a doctor&#8217;s waiting room holding a fistful of printouts: stacks of downloaded research articles about their own conditions, and the gamut of treatment options, from conventional to bizarre. It&#8217;s a practice some physicians encourage, and others despise. (It may be true, as former House Speaker Newt Gingrich once said, that we&#8217;ll soon see the day when many patients know more about their specific condition than their doctors-but there will also be quite a few patients who only think they know more.)</p>
<p>Nonetheless, patient activism is here to stay, and the growing emphasis on predictive genomics is only going to reinforce it. In this changing environment, we can expect to see a substantial shift in the roles of the various health-care professions. Least affected will be surgeons, radiologists, nurses, and the like-specialists licensed to perform specialized procedures. But for general practitioners and all the other clinicians who interact with the person rather than with the organs, says Sequenom&#8217;s Cantor, &#8220;The physician as wise counselor pretty much disappears.&#8221; These folks may find themselves acting more and more like-well, tech support: specialists who advise patients on choices and consequences as they struggle to cope with a vastly complex biological system and an exploding array of treatment and prevention options. That, in turn, will force physicians to put a lot more emphasis on communication and teaching skills-talents the medical profession largely ignored in its late 20<sup>th</sup> century incarnation.</p>
<p>Genomics is both an opportunity and a challenge. The changes it brings will make us all uncomfortable in different ways-intellectually, financially, emotionally, ideologically. But the one thing none of us can do is ignore it. We need to understand how genomics will make the world a different place. And for all of us, the greatest challenge is to feel the shape of the beast: the whole elephant, not just head, trunk, legs or tail.</p>
]]></content:encoded>
			<wfw:commentRss>http://mmwaldrop.com/Starclouds/2007/11/07/the-new-genomic-medicine/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
