International collection of open reading frames now totals 80 percent of human protein-coding genes, collaboration reports

Posted date

February 25, 2016

An international collaboration of organizations, including Dana-Farber Cancer Institute, has reached a milestone in creating a library of complete genetic blueprints for the thousands of different proteins in human cells. The collection – consisting of open reading frames (ORFs), the portions of genes that code for full-length proteins – is an essential resource for scientists studying the basic mechanics of human cells and how those processes go awry in disease.

In a paper published today by Nature Methods, the ORFeome Collaboration (OC), a group of 13 academic, commercial, and governmental organizations, announced that its collection of ORF clones now comprises about 80 percent of all protein-coding genes in human cells – 17,154 in all, and counting. It is the largest human-gene DNA collection openly available to the worldwide research community.

“The OC ORF collection can be of enormous utility in a broad range of research applications,” said the paper’s senior author, David E. Hill, PhD, associate director of the Center for Cancer Systems Biology (CCSB) at Dana-Farber, one of the founding institutions of the OC. “To explore cell physiology in a comprehensive way, scientists need a resource that allows them to express virtually any cell protein of interest. The OC is a unique and valuable tool for that type of work.”

Thousands of scientists have used OC-supplied ORF clones in their research since the collaboration began in 2005. Applications include large-scale mapping of protein-protein interactions; production of recombinant human proteins; functional screening of specific proteins; development of disease-specific protein interaction networks; and studies on the effect of knocking down or knocking out key proteins in cells, among other uses.

The clones are available from multiple OC distributors around the world at minimal cost, with no restrictions by the OC on their use. Information on the collaboration and on ordering clones is available at the OC website: www.orfeomecollaboration.org.

“This website also has a searchable database where we provide rich annotation of clones and encoded proteins to enhance utilization in the community,” said Stefan Wiemann, PhD, of the German Cancer Research Center (DKFZ), Heidelberg, Germany, the first author of the study.

Each ORF contains the protein-coding regions of a specific gene. The ORF clones are carried in plasmids, which are introduced into bacteria and stored in freezers at the OC’s multiple distribution sites. The clones are provided in the Gateway™ vector format, which allows for easy transfer to a large variety of vectors for expressing the corresponding proteins using, for example, Escherichia coli, yeast, and mammalian cells, or even cell-free expression systems.

The OC grew out of informal discussions among researchers at human ORFeome conferences sponsored by the CCSB at Dana-Farber in the early 2000s. “Attendees from various institutions began discussing what they were doing in the area of generating and validating ORFs,” Hill explained. “We began to think about how we could work together to produce the largest possible collection.”

“Different members of the OC have performed different roles in its operation,” Hill continued. “Some groups have worked on adding new clones to the collection, some do DNA sequencing, or concentrate on quality control of the ORFs and archiving them for members. Some do informatics work, while others are mainly involved in distribution. Reaching the current milestone has required a concerted effort from a very diverse group of people and organizations. Everyone involved has made an important contribution – which has made this a very enjoyable and productive collaboration.”

This phase represents the “end of the beginning,” as OC members are continuing to work together to expand the human ORFeome and to add ORFeomes for other model organisms.

The co-authors of the study are: Christa Pennacchio of Lawrence Livermore National Laboratory; Yanhui Hu of Harvard Medical School; Preston Hunter, Jin Park, Catherine Seiler, Jason Steel, and Joshua LaBaer of Arizona State University; Matthias Harbers of DNAFORM Inc. and RIKEN Yokohama Institute, both of Kanagawa, Japan; Alexandra Amiet and Anja van Brabant Smith of Dharmacon, part of GE Healthcare, Lafayette, Colo.; Graeme Bethel of Wellcome Trust Sanger Institute, Cambridge, United Kingdom; Melanie Busse and Tom Weaver of Source BioScience, Nottingham, United Kingdom; Piero Carninci, Yoshihide Hayashizaki, and Jun Kawai of RIKEN Yokohama Institute; Mark Diekhans of University of California, Santa Cruz; Ian Dunham of Wellcome Trust Sanger Institute; Tong Hao, Kourosh Salehi‐Ashtiani, and Marc Vidal, PhD, of Dana-Farber; J. Wade Harper of Dana-Farber/Harvard Cancer Center; Oliver Heil, Agnes Hotz-Wagenblatt, Anika Jöcker, and Ruth Wellenreuther of DKFZ; Steffen Hennig, Christoph Koenig, and Johannes Maurer of imaGenes GmbH, Berlin, Germany; Wonhee Jang and Lukas Wagner of the National Library of Medicine, National Institutes of Health (NIH), Bethesda, Md.; Bernhard Korn of Ressourcenzentrum für Genomforschung gGmbH, Berlin, Germany; Cristen Lambert and Gary Temple of the National Human Genome Research Institute, NIH; Anita LeBeau of HudsonAlpha Institute for Biotechnology, Huntsville, Ala.; Sun Lu of GeneCopoeia, Inc., Rockville, Md., and Guangzhou FulenGen, LTD, Guangdong, China; Troy Moore of Open Biosystems, Inc., Huntsville, Ala.; Osamu Ohara of Kazusa DNA Research Institute, Kisarazu, Chiba, Japan; Andreas Rolfs of Harvard Medical School; Blake Simmons of HudsonAlpha Institute for Biotechnology and Open Biosystems, Inc.; Shuwei Yang of GeneCopoeia, Inc., Rockville, Md.; and Daniela S. Gerhard of the National Cancer Institute, NIH.

The work was supported by a grant from the Ellison Foundation, Boston; by Dana-Farber Cancer Institute Sponsored Research funds; and by research grants from the Ministry of Education, Culture, Sports, Science and Technology, Japan, to the RIKEN Omics Science Center and the RIKEN Center for Life Science Technologies. The German cDNA Consortium was funded by the Federal Ministry of Education and Research in the frame of the German Genome Project and the National Genome Research Network programs.

News Category

Research
