• Computational Biology

    Applying Quantitative Sciences and Information Technologies to Answer Biological Questions

    Featured here are two examples where computational biology is helping to make sense of an increasingly information-rich environment.

    Insights into Stem Cell Differentiation

    The laboratory of stem cell biologist Stuart Orkin, MD, chair of the Department of Pediatric Oncology, identified a novel enzyme that safeguards the identity and pluripotency of embryonic stem (ES) cells. Together with computational biologist Guo-Cheng ("GC") Yuan, PhD, of the Department of Biostatistics and Computational Biology, Orkin is defining the novel mechanisms that regulate cellular identity and cell-fate switching during ES cell development. These rely on an understanding of the molecules that play essential roles in development and in the proliferation of tumor cells.

    Histones, the spool-like proteins around which DNA winds to form chromatin, are critical to embryonic development because they undergo methylation and other epigenetic modifications that affect gene expression. Enzymes called methyltransferases transfer a methyl group onto the tails of histones at fixed locations on chromatin. This marks the gene at that site for repression or activation, depending on where methylation occurs in the histone. One of the protein complexes that deposit these methylation marks is Polycomb repressive complex 2 (PRC2), which acts as a master epigenetic regulator of ES cells. To maintain pluripotency of the cell, PRC2 binds to developmental genes and mediates methylation via the EZH2 methyltransferase within the complex, thereby repressing differentiation genes. When the cell is destined to develop into different lineages, however, PRC2 dissociates from its target genes, allowing them to be fully expressed and differentiation to occur.

    Until recently, scientists believed that EZH2, which is up-regulated in some cancers, was the only enzyme directly responsible for methylation on histone H3 lysine 27 (H3K27). Then Xiaohua Shen, PhD, a research fellow in the Orkin laboratory, identified a new methyltransferase, EZH1, which is homologous to EZH2 and able to transfer methylation marks to H3K27. After interrogating a 45-million-probe microarray to locate the marks and target genes of both EZH1 and EZH2, Orkin turned to Yuan to analyze the enormous data set. "Traditional biochemical and molecular analyses are rudimentary compared to what a true computational biologist can do," says Orkin.

    Yuan and his postdoctoral fellow, Yingchun Liu, PhD, sifted through the dizzying array of numbers to find the loci where the EZH proteins bind, to map these loci to their chromosomal locations, and to search complex databases to uncover the genes at those sites. They also assessed whether the binding sites of EZH1 correspond to the same genomic regions as the H3K27 marks; indeed, both EZH1 and EZH2 co-localize with H3K27 methylation marks on chromatin. Genome-wide study and computational analysis thus confirmed biochemical and genetic evidence that EZH1 compensates for, and complements, EZH2 by targeting the same genes. Interestingly, in cells lacking EZH2, only one-third of target genes retained H3K27 marks due to the presence of EZH1. These genes were more often associated with lineage differentiation, while genes losing H3K27 marks were associated with non-developmental functions.
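
    The core of this kind of analysis, mapping binding sites to genes and asking whether two sets of sites fall on the same genes, can be illustrated with a minimal Python sketch. It is not the method Yuan and Liu used; the coordinates, gene names, and helper functions below are hypothetical and exist only to show the idea of interval overlap.

```python
from typing import NamedTuple

class Interval(NamedTuple):
    chrom: str
    start: int  # inclusive
    end: int    # exclusive
    name: str

def overlaps(a, b):
    """True when two intervals share at least one base on the same chromosome."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end

def genes_hit(sites, genes):
    """Names of genes overlapped by at least one site."""
    return {g.name for g in genes if any(overlaps(s, g) for s in sites)}

# Toy, made-up coordinates (not real annotation or array data).
genes = [
    Interval("chr1", 100, 500, "geneA"),
    Interval("chr1", 900, 1400, "geneB"),
    Interval("chr2", 50, 300, "geneC"),
]
ezh1_sites = [
    Interval("chr1", 450, 520, "ezh1_peak1"),
    Interval("chr2", 10, 60, "ezh1_peak2"),
]
h3k27_marks = [
    Interval("chr1", 400, 600, "mark1"),
    Interval("chr1", 1000, 1100, "mark2"),
]

ezh1_genes = genes_hit(ezh1_sites, genes)      # genes bound by EZH1
marked_genes = genes_hit(h3k27_marks, genes)   # genes carrying the mark
shared = ezh1_genes & marked_genes             # genes with both
fraction_shared = len(shared) / len(ezh1_genes)
print(sorted(shared), fraction_shared)
```

    A genome-scale analysis would replace the nested scan with an indexed interval structure and a statistical test for enrichment, but the question being asked is the same.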

    "Other scientists believed that EZH1 has no role whatsoever," says Orkin, who is renowned for defining the transcription factors governing differentiation. "But with GC's help, we made a solid case to the contrary. This collaboration gave us real confidence in our data and insight into the function of these genes."

    More exciting work will happen in the next step of this research, says Yuan. "The real question is: how does EZH1 know which set of genes to target when EZH2 is depleted?" It's a mystery he hopes to help solve using a new computational method that he developed for other purposes but has since adapted to study Polycomb binding.

    "The identification of EZH1 as a novel methyltransferase acting on H3K27 demonstrates the diversity in mammalian Polycomb repressive complexes," explains Orkin. "This discovery should set the stage for new developments in the role of chromatin in stem cell pluripotency and cancer biology."

    Profiling Ovarian Cancer

    One day, after consoling yet another patient whose ovarian cancer had stubbornly resisted platinum agents, clinical investigator Ursula Matulonis, MD, of Medical Oncology, sought the expertise of computational biologist John Quackenbush, PhD, of Biostatistics and Computational Biology. She wanted to apply modern molecular techniques, such as DNA microarrays, to understand platinum resistance in ovarian cancer, the most deadly gynecologic malignancy. The cross-disciplinary partnership that the two investigators forged that day has reached beyond the laboratories of Dana-Farber to produce the largest set of ovarian cancer genomic profiles to date.

    Figure: Ovarian cancer samples were evaluated by genome-wide gene expression profiling, and the data were subjected to principal component analysis. The ovarian cancer samples grouped into three distinct classes, shown here in red, blue, and green.
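
    As a rough illustration of principal component analysis, the sketch below finds the first principal component of a tiny expression matrix and projects each sample onto it. Everything here is an assumption made for illustration: the data are made up, only two groups and one component are shown (the study reported three classes), and the component is found by power iteration on the covariance matrix rather than by whatever software the investigators used.

```python
def center(rows):
    """Subtract each gene's (column's) mean so the data are centered."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]

def covariance(centered):
    """Gene-by-gene sample covariance matrix of centered data."""
    n = len(centered)
    cols = list(zip(*centered))
    return [[sum(a * b for a, b in zip(ci, cj)) / (n - 1) for cj in cols]
            for ci in cols]

def first_component(cov, iterations=200):
    """Leading eigenvector of the covariance matrix via power iteration."""
    v = [1.0] + [0.0] * (len(cov) - 1)
    for _ in range(iterations):
        w = [sum(c * x for c, x in zip(row, v)) for row in cov]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return v

# Rows are samples, columns are genes (hypothetical expression values).
samples = [
    [9.0, 1.0, 5.0],
    [8.5, 1.5, 5.2],
    [9.2, 0.8, 4.9],
    [1.0, 9.0, 5.1],
    [1.4, 8.6, 5.0],
    [0.9, 9.3, 4.8],
]

centered = center(samples)
pc1 = first_component(covariance(centered))
# Score of each sample along the first principal component.
scores = [sum(x * p for x, p in zip(row, pc1)) for row in centered]
print([round(s, 2) for s in scores])
```

    In this toy data the first three samples and the last three fall on opposite sides of zero along the first component, which is the kind of separation a PCA plot makes visible.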

    Although their pilot project was limited by the small number of fresh-frozen tumor samples available, serendipitous events enabled the two investigators to dramatically scale up their joint effort. Quackenbush happened to meet a former colleague, now at Illumina, who had developed a new gene expression assay for paraffin-embedded tissues. Meanwhile, Matulonis was working with pathologists Ronny Drapkin, MD, PhD, of Medical Oncology, and Michelle Hirsch, MD, PhD, of Brigham and Women's Hospital, who had recently used paraffin-embedded tissues from a tumor bank to build a tissue microarray – a paraffin block of 100 or more microtumor cores on a single slide. Quackenbush and Matulonis quickly recognized the power of combining these two new resources. "We thought that if the results of our gene expression assays could be confirmed using a simple antibody test on the tissue microarray," says Quackenbush, "we might have a test with the potential for immediate clinical impact."

    With the pathologists joining the partnership and Illumina on board, the study began in earnest. Quackenbush's team extracted DNA and RNA from the same tumor samples used to create the tissue microarray and sent the purified nucleic acids to Illumina. The company generated data on mRNA, microRNA, copy number variation, and DNA methylation, which it returned to Dana-Farber for analysis.

    Discoveries deriving from these data – such as markers indicating that a patient is likely to become platinum-resistant or platinum-sensitive – may lead to more effective diagnostics and treatments. One early discovery showed that tumor samples separate into distinct molecular subgroups, a finding that may guide future treatment of patients with ovarian cancer. Investigators are now examining whether these subgroups are associated with outcomes or other clinical measures. The team is also analyzing RNA and microRNA (which can bind to mRNA, targeting it for degradation) to look for anti-correlations, instances where an increase in microRNA decreases mRNA. "Integrating these two types of data may provide more complete information than analyzing either type alone," Quackenbush explains. "Most importantly, Dana-Farber now has the ability to address basic questions about ovarian cancer."
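
    A search for anti-correlations can be sketched as computing a correlation coefficient for each microRNA/mRNA pair across samples and keeping the strongly negative ones. The example below is only an illustration under stated assumptions: the expression values, the pair names, and the -0.8 cutoff are all hypothetical, and Pearson correlation is one possible choice of measure, not necessarily the one the team used.

```python
def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    cov = sum(a * b for a, b in zip(dx, dy))
    var_x = sum(a * a for a in dx)
    var_y = sum(b * b for b in dy)
    return cov / (var_x * var_y) ** 0.5

# Hypothetical expression levels measured in the same five tumor samples.
mirna = {"miR-x": [1.0, 2.0, 3.0, 4.0, 5.0]}
mrna = {
    "geneY": [10.0, 8.0, 7.0, 4.0, 2.0],  # falls as miR-x rises
    "geneZ": [2.0, 3.0, 5.0, 6.0, 9.0],   # rises with miR-x
}

CUTOFF = -0.8  # assumed threshold for calling a pair anti-correlated

correlations = {
    (m, g): pearson(m_vals, g_vals)
    for m, m_vals in mirna.items()
    for g, g_vals in mrna.items()
}
anti_correlated = [pair for pair, r in correlations.items() if r < CUTOFF]
print(anti_correlated)
```

    A strongly negative coefficient is consistent with a microRNA suppressing a transcript, but by itself it does not establish that relationship; it flags candidate pairs for follow-up.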
