Department of Data Sciences

  • Mission

    Today, quantitative ideas are an essential part of cancer research, as many of the opportunities and challenges for the cancer research community involve using complex data to further our understanding of cancer biology and optimize prevention and therapy. The mission of our department is twofold: to generate data science knowledge and innovative tools that can enable progress in cancer research, and to provide investigators across Dana-Farber and beyond with access to highly qualified and committed quantitative scientists.

    The Department of Data Sciences is home to a web of programs, centers, cores, labs, and working groups that pursue complementary missions and together form one of the most vibrant computational sciences environments in biomedicine today. For example, the department is the home base of the DF/HCC Cancer Data Science program (the only one of its kind among comprehensive cancer centers) and the DF/HCC Biostatistics Core; it is also home to four Dana-Farber strategic research centers, the Knowledge Systems software development group, and two statistical coordinating centers for clinical trial cooperative groups. Faculty members play a key role in open-source software initiatives such as Bioconductor, the CISTROME database, and cBioPortal, and in online data science education.

    Members of the department include about 40 faculty, with academic appointments in 7 different clinical, basic, and population science departments in the Harvard T.H. Chan School of Public Health and Harvard Medical School.

    Departmental Role in the Dana-Farber/Harvard Cancer Center

    Department members participate in the leadership of the Dana-Farber/Harvard Cancer Center (DF/HCC), the largest NCI-designated Comprehensive Cancer Center in the country. Giovanni Parmigiani, PhD, is the DF/HCC Associate Director for Population Sciences, and Franziska Michor, PhD, is the leader of the Cancer Data Sciences program at DF/HCC. The DF/HCC Biostatistics Core is located at Dana-Farber and is directed by faculty member Paul Catalano, ScD. The Core provides consultation and assistance to Cancer Center members in all DF/HCC research programs. Department members are also involved in a variety of DF/HCC activities, and play a role in the development of all clinical research protocols by serving as members of DF/HCC's Scientific Review Committee, as well as the Institutional Review Board.

    Research Units and Centers

    Innovative Clinical Trials

    The Department of Data Sciences has historically been at the forefront of the development of innovative approaches for carrying out clinical trials of cancer treatments. Its founder, Marvin Zelen, PhD, was one of the most influential innovators in this area. Today, because cancer therapy is increasingly personalized, and because many important therapies target features of cancers that are not shared by all patients, traditional clinical trial paradigms are insufficient to guide us to the best drugs in a timely fashion. Several faculty (including Barry, Gelber, Gray, Parmigiani, and Trippa) are active in developing innovations in clinical trial design and analysis to meet this challenge.

    More than 90 percent of potential new cancer drugs tested in clinical trials never find their way to pharmacy shelves. The dearth of effective new cancer drugs impedes the delivery of personalized cancer medicine: the right drug for the right patient at the right time. Traditional clinical trials accrue patients based on the anatomical locations of their tumors. However, cancer is a disease driven by changes (mutations) in certain genes. Each type of cancer has several subtypes, and the genetic mutations may vary among subtypes, which means that two patients with lung cancer, breast cancer, or any other type of cancer may not benefit from the same treatment. It follows that a genetically based clinical trial, in which patients are divided into subgroups based on the specific genetic mutations driving their tumors, has the potential to have a greater impact than a traditional clinical trial. The application of sophisticated bioinformatics and computational algorithms to voluminous genomics data makes it possible to design and conduct genetically based trials, recruit the right patients for the right clinical trials, and interpret the results more efficiently. This type of trial design is expected to increase success rates in bringing new FDA-approved cancer therapies to the clinic.
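
    As a rough illustration of the subgroup idea described above, the following Python sketch assigns patients to trial arms by a mutation found in their tumor rather than by tumor site. The gene-to-arm mapping, arm labels, and patient records are hypothetical and chosen only for illustration; real trials apply far richer eligibility criteria.

```python
from collections import defaultdict

# Hypothetical mapping from a target mutation to a trial arm (illustrative only).
ARM_BY_MUTATION = {
    "EGFR": "arm_A",
    "BRAF": "arm_B",
    "HER2": "arm_C",
}


def assign_arms(patients):
    """Group patient IDs by trial arm according to their tumor mutations.

    Each patient is a dict with an "id", a "site" (anatomical location),
    and a set of "mutations". A patient joins the arm of the first target
    mutation (in ARM_BY_MUTATION order) found in the tumor; patients with
    no target mutation are collected under "unmatched".
    """
    arms = defaultdict(list)
    for patient in patients:
        arm = "unmatched"
        for gene, candidate_arm in ARM_BY_MUTATION.items():
            if gene in patient["mutations"]:
                arm = candidate_arm
                break
        arms[arm].append(patient["id"])
    return dict(arms)


# Made-up records: the tumor site is recorded but never used for assignment.
patients = [
    {"id": "P1", "site": "lung", "mutations": {"EGFR"}},
    {"id": "P2", "site": "lung", "mutations": {"BRAF"}},
    {"id": "P3", "site": "breast", "mutations": {"HER2", "TP53"}},
    {"id": "P4", "site": "colon", "mutations": {"BRAF"}},
    {"id": "P5", "site": "breast", "mutations": {"TP53"}},
]

arms = assign_arms(patients)
```

    In this toy data set the two lung cancer patients land in different arms, while a lung cancer patient and a colon cancer patient who share a BRAF mutation land in the same one.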

    Program for Innovations in Therapeutic and Biomarker Development (PITBD)

    The mission of the Program for Innovations in Therapeutic and Biomarker Development (PITBD) is to generate novel approaches for the development of therapeutics and biomarkers through preclinical and clinical trial designs, pipeline modeling and analysis, and advancement in regulatory science and policy.

    The development and translation of effective precision medicines require information about both efficacious therapeutics and the putative biomarkers that predict their efficacy. The exponential increase in potential biomarker information available for any given patient or disease, including genomic biomarkers, has made it challenging to identify the best methods for generating meaningful information to support both the scientific development and the regulatory decision-making required to bring new medicines to the clinic. The PITBD has two components: an internal program that develops new approaches for information generation to support precision-medicine development, and an external-facing component that serves as a nidus for bringing together expertise from multiple fields to support the development of precision medicines and acts as an intellectual collaborator in the early concept development of clinical trials.

    The PITBD focuses on three major areas of clinical trial design and collaboration: 1) design and implementation of biomarker-based platform trials in oncology; 2) development and implementation of novel algorithmic or model-based approaches to phase I combination studies, including combinations with radiation; and 3) development of novel approaches to the design and analysis of trials with non-constant hazards, such as immunotherapy trials. Many of these approaches involve Bayesian methods.

    PITBD is also fostering collaborations with HMS-affiliated institutions and with clinical areas outside of oncology. Beyond its basic, clinical/translational, and computational collaborators, PITBD will leverage its relationships with the Harvard/MIT Center for Regulatory Science and the Harvard Kennedy School to address issues related to precision medicine product development within the specific regulatory and business environment. PITBD will also foster active collaborations with industry and the FDA.

    ECOG-ACRIN Cancer Research Group

    The Department of Data Sciences hosts the statistical center for the ECOG-ACRIN Cancer Research Group, a multidisciplinary, membership-based scientific organization that designs and conducts biomarker-driven cancer research involving adults who have or are at risk of developing cancer. The Group is dedicated to achieving research advances in all aspects of cancer care and thereby reducing the burden of cancer and improving the quality of life and survival in patients with cancer.

    ECOG-ACRIN has integrated therapeutic and diagnostic imaging-based research disciplines with the latest bioinformatics technologies into a single scientific organization. With its capacity to explore integral biomarkers, including imaging markers of response and prognosis, the Group is poised to achieve patient-centered research breakthroughs across the cancer care continuum, from prevention and screening through treatment of metastatic disease.

    International Breast Cancer Study Group (IBCSG)

    Dana-Farber Cancer Institute has been the home of the International Breast Cancer Study Group Statistical Center since 1977. IBCSG is a non-profit research organization dedicated to innovative clinical research to improve the prognosis of women with breast cancer. As one of the world’s leading breast cancer research groups, IBCSG pioneers research in combined hormonal therapy and chemotherapy, timing and duration of adjuvant therapies, and quality of life of breast cancer patients. The latest generation of clinical trials in the adjuvant setting addresses tailored treatment for subgroups of patients, as IBCSG’s research expands into neoadjuvant treatment and chemotherapy for advanced disease. In addition to clinical trials, IBCSG conducts extensive programs in translational research, database studies, quality of life, and statistical methodology.

    The Statistical Center is responsible for study design, forms and protocol development, collection of data for both newly enrolled patients and those being followed long term, data processing, statistical analysis, reporting of results, and manuscript preparation. The Statistical Center staff has also developed methodologies for interpreting subpopulations (Subpopulation Treatment Effect Pattern Plots—STEPP), and incorporating quality of life into treatment comparisons (Quality-adjusted Time Without Symptoms or Toxicity—Q-TWiST). Under the leadership of Richard Gelber, PhD, and Meredith Regan, ScD, the IBCSG Statistical Center continues to innovate and guide critically important clinical trials to improve outcomes for women with early breast cancer.
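
    To make the Q-TWiST idea above concrete, here is a minimal Python sketch of the quantity as it is commonly defined in the literature: time with toxicity (TOX) and time after relapse (REL) are each down-weighted by a utility between 0 and 1, and added to the time spent without symptoms or toxicity (TWiST). The function name, the default utilities of 0.5, and the durations are assumptions made for illustration, not values from any IBCSG trial.

```python
def q_twist(tox, twist, rel, u_tox=0.5, u_rel=0.5):
    """Quality-adjusted time: u_tox * TOX + TWiST + u_rel * REL.

    tox, twist, and rel are mean durations (for example, in months) spent
    with treatment toxicity, without symptoms or toxicity, and after
    relapse. u_tox and u_rel are utility weights between 0 and 1.
    """
    for name, weight in (("u_tox", u_tox), ("u_rel", u_rel)):
        if not 0.0 <= weight <= 1.0:
            raise ValueError(f"{name} must lie between 0 and 1")
    return u_tox * tox + twist + u_rel * rel


# Made-up mean durations, in months, for two hypothetical treatments.
treatment_a = q_twist(tox=6.0, twist=30.0, rel=8.0)
treatment_b = q_twist(tox=2.0, twist=26.0, rel=14.0)
difference = treatment_a - treatment_b
```

    Because the utilities are subjective, analyses of this kind are usually repeated across a range of u_tox and u_rel values to see whether the comparison between treatments changes.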

    cBio Center

    The cBio Center in our department has a three-fold mission: 1) To provide oncologists with tools to mine genomic patient data for research and for guiding treatment decisions; 2) To devise strategies to overcome resistance to targeted cancer drugs; and 3) To create new connections between scientists at Dana-Farber and Harvard Medical School, including collaborative structures for scientists using quantitative sciences to solve biological problems. The cBio Center team develops and uses powerful computational algorithms, tools, and systems that are accelerating research on the molecular basis of various cancers and opening pathways to new drugs and clinical trials.

    Center for Functional Cancer Epigenetics (CFCE)

    The Center for Functional Cancer Epigenetics, co-led by Myles Brown, MD, and Xiaole (Shirley) Liu, PhD, explores the key role that epigenetic alterations and abnormal transcriptional regulation play in the development and progression of cancer. A better understanding of these alterations should improve cancer diagnosis and contribute to the knowledge required to develop new therapeutics involving epigenetic mechanisms.

    The Center for Functional Cancer Epigenetics serves as a central resource for Dana-Farber in supporting both unbiased and hypothesis-based cancer research involving epigenetics. CFCE collaborates with multiple investigators across basic and clinical research to develop and execute innovative research involving epigenetics experiments and analyses.

    CFCE employs technologies such as chromatin immunoprecipitation followed by next-generation sequencing (ChIP-seq), DNase hypersensitivity mapping, gene expression profiling (RNA-seq), and DNA methylation mapping. It combines these technologies with strong computational biology expertise to explore the role of epigenetic changes and transcriptional regulation in disease pathogenesis and treatment. CFCE includes faculty with expertise in a variety of disciplines, including cell biology, physiology, cancer biology, human genetics, and computational biology.

    Center for Cancer Evolution (CCE)

    Cancer cells arise as the result of the accumulation of genetic changes that make these cells function differently from their intended purpose. Genetically altered cells survive and sometimes thrive in human tissues through a process that is very similar to evolution and natural selection. Mathematical models of evolution are very powerful tools for understanding how this process works and are pursued actively by Department faculty (Michor, Parmigiani).

    The department is home to the Physical Science – Oncology Center (PS-OC), directed by Franziska Michor, PhD. The principal mission of the PS-OC is to advance our understanding of the physical principles that govern cancer initiation, progression, response to treatment, and the emergence of resistance. The members of the PS-OC include theoretical biologists from Dana-Farber Cancer Institute, Memorial Sloan-Kettering Cancer Center, and City College of New York, as well as experimental scientists from Vanderbilt University and Memorial Sloan-Kettering Cancer Center. Collaborations between theoretical biologists and experimental scientists in the PS-OC will bridge the divide between the physical sciences and oncology.

    The Center for Cancer Evolution (CCE) was founded in 2016 by Franziska Michor, PhD, David Pellman, MD, and Kornelia Polyak, MD, PhD, at Dana-Farber Cancer Institute. The CCE focuses on understanding cancer evolution through a multi-disciplinary approach. Our goal is to understand the mechanisms behind tumor evolution, metastasis formation, and the emergence of drug resistance – to ultimately provide more specialized and effective patient care for a variety of different cancer types.

    The principal mission of the Center for Cancer Evolution (CCE) is to develop and validate mathematical modeling strategies of tumor evolution and treatment response, use these strategies to identify the best therapeutic interventions, and make this expertise available to the Dana-Farber community. We will investigate the treatment response of cancer cells and their microenvironment, develop novel mathematical frameworks describing the evolutionary dynamics of tumor progression and treatment response that are parameterized using the experimental systems, and predict and validate optimum intervention strategies that will ultimately be implemented as prospective clinical trials. This approach will incorporate many different parameters inherent to tumor cell populations to better define mechanisms of progression and resistance and to develop alternative therapeutic strategies.
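
    As a toy example of what a mathematical model of treatment response can look like, the Python sketch below tracks a drug-sensitive population that shrinks under therapy and a small resistant population that keeps growing. The deterministic two-population exponential model is an assumption chosen for brevity and is not the Center's actual framework; the function names and all parameter values are invented.

```python
import math


def tumor_burden(t, s0, r0, rate_s, rate_r):
    """Total cell count at time t: sensitive plus resistant cells.

    s0 and r0 are the initial sensitive and resistant cell counts;
    rate_s and rate_r are their net growth rates per day under treatment.
    """
    return s0 * math.exp(rate_s * t) + r0 * math.exp(rate_r * t)


def time_of_nadir(s0, r0, rate_s, rate_r):
    """Time at which total burden is smallest, assuming rate_s < 0 < rate_r.

    Setting the derivative of tumor_burden to zero gives
    t* = ln(-rate_s * s0 / (rate_r * r0)) / (rate_r - rate_s).
    If t* is negative, the burden grows from the start and the nadir is t = 0.
    """
    if not rate_s < 0 < rate_r:
        raise ValueError("expected a shrinking sensitive and a growing resistant population")
    ratio = -rate_s * s0 / (rate_r * r0)
    return max(0.0, math.log(ratio) / (rate_r - rate_s))


# Invented parameters: 1e9 sensitive cells shrinking at 10% per day and
# 1e3 pre-existing resistant cells growing at 5% per day.
params = dict(s0=1e9, r0=1e3, rate_s=-0.10, rate_r=0.05)
nadir_day = time_of_nadir(**params)
nadir_burden = tumor_burden(nadir_day, **params)
```

    In this sketch the tumor initially responds and then regrows once the resistant cells dominate; the nadir marks the turning point. Models used in practice are typically stochastic and are parameterized from experimental data.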

    Teaching the next generation of biostatisticians and computational biologists

    The Department has a close partnership with the Department of Biostatistics at Harvard T.H. Chan School of Public Health, where the majority of our faculty have primary academic appointments. Department faculty lead training grants and curriculum development initiatives, direct research, teach in the doctoral program, and also teach in the undergraduate degree program in statistics at Harvard College.

    Faculty in our department lead two training grants through the Harvard T.H. Chan School of Public Health. John Quackenbush, PhD, leads the Statistical & Quantitative Training in Big Data Health Science grant, which provides students with the training and experience required to develop new quantitative methods for handling the complexities associated with Big Data. The program recognizes that one critical area of training is in statistical and computing methods for the analysis of large-scale data in health and biomedical research, and that this area requires more than training in computer science.

    Giovanni Parmigiani, PhD, leads the Training Grant in Quantitative Sciences for Cancer Research. The mission of this training grant is to provide trainees with a deep interdisciplinary experience that will enable them to contribute, through quantitative science skills, to progress in cancer research. The program draws upon a highly distinguished faculty, consisting of quantitative scientists, as well as experts in cancer research. It enables students to seamlessly participate in training and research activities across the Harvard T.H. Chan School of Public Health, Dana-Farber Cancer Institute, and other Harvard institutions.

    Awards of note

    Several of our faculty members have received awards in recent years. The White House honored John Quackenbush, PhD, as an Open Science Champion of Change for his contributions to making large and complex sets of biological data widely accessible to the larger research community. Xiaole (Shirley) Liu, PhD, received the Richard E. Weitzman Outstanding Early Career Investigator Award from the Endocrine Society in 2016. Franziska Michor, PhD, received both the 2015 New York Stem Cell Foundation – Robertson Stem Cell Prize and the 36th Annual AACR Award for Outstanding Achievement in Cancer Research in 2016. Rafael Irizarry, PhD, received the Benjamin Franklin Award in Life Sciences in 2017. In addition, MatchMiner, an open-source computational platform for matching patient-specific genomic profiles to precision cancer medicine clinical trials that was developed by our Knowledge Systems Group, won Harvard Business School’s Precision Trials Challenge, as well as the Best in Show Award for Data Visualization and Exploration at BioIT 2017.