Lecture 27

Bioinformatics and Systems Biology


Bioinformatics:

    A major goal in interpreting genome sequences is understanding how the genome encodes the information that controls spatiotemporal gene expression profiles in a complex organism.  A primary objective, therefore, is to identify the regions of the genome that contain regulatory information.  As we've seen in previous lectures, this regulatory information is organized into complex, modular, cis-regulatory sequences containing enhancers/silencers, promoters, initiator elements, etc.  A common property of these cis-regulatory modules is that they are docking stations for multiple transcription factors and hence contain multiple protein binding sites.  Presumably, the bound transcription factors act combinatorially to confer specific transcriptional activity.

    Cis-regulatory modules seem to share several architectural features:  They are typically only hundreds of base pairs in length and contain multiple binding sites for as many as 4–5 different transcription factors.  Two of these features provide the basis for computational strategies: the frequent occurrence of multiple copies of the same binding site, and the enrichment of certain combinations of binding sites in a module relative to the genome at large.  Such strategies can be used to predict genes that are part of the same regulatory network, or to identify novel regulatory elements on a genome-wide scale.
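
    The idea of searching for clustered binding sites can be sketched in a few lines of Python.  This is only an illustration, not the method of any particular paper: the function names, the IUPAC motif passed in, and the default thresholds (at least 3 sites within a 400 bp window) are all assumptions chosen for the example.

```python
import re

# IUPAC nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "N": "[ACGT]",
}
_COMPLEMENT = str.maketrans("ACGTRYSWKMN", "TGCAYRSWMKN")


def reverse_complement(motif):
    """Reverse complement of a sequence or IUPAC motif."""
    return motif.translate(_COMPLEMENT)[::-1]


def find_sites(sequence, motif):
    """Sorted start positions of motif matches on either strand."""
    sequence = sequence.upper()
    starts = set()
    for m in (motif, reverse_complement(motif)):
        # A lookahead lets re.finditer report overlapping matches too.
        pattern = "(?=(" + "".join(IUPAC[c] for c in m) + "))"
        starts.update(hit.start() for hit in re.finditer(pattern, sequence))
    return sorted(starts)


def find_clusters(starts, min_sites=3, window=400):
    """Report (first_start, last_start, n_sites) for groups of sites whose
    start positions span less than `window` bp. Groups fully contained in
    an already reported group are skipped; reported groups may overlap."""
    clusters = []
    j = 0          # index of the last site inside the current window
    last_end = -1  # index of the last site of the most recent cluster
    for i in range(len(starts)):
        j = max(j, i)
        while j + 1 < len(starts) and starts[j + 1] - starts[i] < window:
            j += 1
        n_sites = j - i + 1
        if n_sites >= min_sites and j > last_end:
            clusters.append((starts[i], starts[j], n_sites))
            last_end = j
    return clusters


# Hypothetical motif and toy sequence, for illustration only.
motif = "GGGWWWWCCM"
site = "GGGAAAACCA"
toy = "T" * 50 + site + "T" * 100 + site + "T" * 100 + site + "T" * 1000 + site
clusters = find_clusters(find_sites(toy, motif))
```

    In the toy sequence the first three sites lie within one window and form a cluster, while the fourth site is too far away to join it.  A real genome-wide scan would also need a measure of how often such clusters arise by chance, which this sketch does not attempt.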

    The term bioinformatics is sometimes used interchangeably with the term computational biology, but the two can be distinguished.  Computational biology is defined as the systematic development and application of computing systems and computational solution techniques to models of biological phenomena.  Bioinformatics is defined as the systematic development and application of computing systems and computational solution techniques for analyzing data obtained by experiments, modeling, database search, and instrumentation, in order to make novel observations about biological processes.

    Today we will discuss the following paper, which uses a computational approach, backed by solid molecular biology, to identify genes (known and unknown) that are regulated by the well-characterized transcription factor Dorsal:

Genome-wide analysis of clustered Dorsal binding sites identifies putative target genes in the Drosophila embryo.  PNAS 99(2), 763-768, 2002.

Fig 1, Fig 2, Fig 3, Figs 4 & 5


Systems Biology:

Taken from:  http://www.ki.se/icsb2002/

"Remarkable progress in molecular biology has led to a complete map of the building blocks of life. The next major challenge is to understand the interactions between the myriad of sub-cellular components. The Human Genome Project and recent advances in Proteomics and DNA microarray technology highlight the need for systems level integration of experiments and theory in order to decode the logic of life. This is the ambitious goal for Systems Biology, the quantitative study of biological processes as integrated systems rather than as isolated parts. In Systems Biology, traditionally separated scientific disciplines, including physical chemistry, biochemistry, molecular biology, cell physiology and the behaviour of multicellular organisms, are unified by quantitative models."

The Emergent Integrated Circuit of the Cell (Hanahan and Weinberg, Cell 100, 2000)

Progress in dissecting signaling pathways has begun to lay out a circuitry that will likely mimic electronic integrated circuits in complexity and finesse, where transistors are replaced by proteins (e.g., kinases and phosphatases) and the electrons by phosphates and lipids, among others. In addition to the prototypical growth signaling circuit centered around Ras and coupled to a spectrum of extracellular cues, other component circuits transmit antigrowth and differentiation signals or mediate commands to live or die by apoptosis. As for the genetic reprogramming of this integrated circuit in cancer cells, some of the genes known to be functionally altered are highlighted in red.

 

Systems Biology with respect to enhancer function:

    A recurring theme in the analysis of complex cis-regulatory elements is the idea of functional interactions between the different modules that make up the element.  One of the best-characterized cis-regulatory elements belongs to the endo16 gene of the sea urchin Strongylocentrotus purpuratus.  Much of this work was performed in the laboratory of Eric Davidson.  Early in development the endo16 gene participates in the specification events that define the endomesoderm; later in development it functions as a gut-specific differentiation gene.  Extensive research into this gene’s cis-regulatory element has led to complex computational logic models that were used to explain this switch.  Logic considerations predicted that developmentally controlled functional interactions between two modules, named Modules A and B, mediate this switch in endo16 function.  This prediction was confirmed experimentally, and a distinct set of functional interactions between the modules that mediate the switch was demonstrated.  The endo16 computational model now provides a detailed explanation of the information processing functions executed by the cis-regulatory system of this gene throughout embryogenesis.  One of the greatest undertakings in developmental biology will be to reach the stage where we can devise computational models that explain how all enhancers decipher their regulatory inputs and generate novel regulatory outputs.  Useful models of enhancer function must be capable of explaining the temporal and spatial dynamics of the cis-regulatory response as input and output data become available.
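
    To give a feel for what a "logic model" of a cis-regulatory element looks like, here is a deliberately simplified Python sketch.  It is a toy, not the published endo16 model: the function name, the input names, the gain value, and the time course are all hypothetical, chosen only to show how a conditional rule linking two modules can produce a developmental switch in what drives the output.

```python
def toy_two_module_output(module_a_input, module_b_input, gain=2.0):
    """Toy logic rule for a two-module cis-regulatory element.

    Assumption for illustration: while Module B receives no input, the
    element's output follows Module A's own input. Once Module B becomes
    active, Module A stops using its own input and instead relays an
    amplified copy of Module B's signal.
    """
    if module_b_input > 0:
        return gain * module_b_input
    return module_a_input


# Hypothetical time course: (stage, Module A input, Module B input).
time_course = [
    ("early", 1.0, 0.0),
    ("mid", 0.8, 0.5),
    ("late", 0.2, 1.5),
]

for stage, a_in, b_in in time_course:
    out = toy_two_module_output(a_in, b_in)
    driver = "B (via A)" if b_in > 0 else "A"
    print(f"{stage}: output={out:.1f}, driven by {driver}")
```

    The point of the sketch is the form of the rule: the output is not a simple sum of the two modules' inputs, but depends on a conditional interaction between them, which is the kind of statement a logic model makes and an experiment can then test.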

From Bolouri H, Davidson EH.  Dev Biol 2002 Jun 1;246(1):2-13

Big Picture for transcriptional regulatory networks:

From Davidson et al.  Science. 2002 Mar 1;295(5560):1669-78. Review.