Approaching a complete repository of sequence-verified protein-encoding clones for Saccharomyces cerevisiae.
Hu Y., Rolfs A., Bhullar B., Murthy T.V.S., Zhu C., Berger M.F., Camargo A.A., Kelley F., McCarron S., Jepson D., Richardson A., Raphael J., Moreira D., Taycher E., Zuo D., Mohr S., Kane M.F., Williamson J., Simpson A.J.G., Bulyk M.L., Harlow E., Marsischky G., Kolodner R.D., LaBaer J.
The availability of an annotated genome sequence for the yeast Saccharomyces cerevisiae has made possible the proteome-scale study of protein function and protein-protein interactions. These studies rely on the availability of cloned open reading frame (ORF) collections that can be used for cell-free or cell-based protein expression. Several yeast ORF collections are available, but their use and data interpretation can be hindered by reliance on now out-of-date annotations, the inflexible presence of N- or C-terminal tags, and/or the unknown presence of mutations introduced during the cloning process. High-throughput biochemical and genetic analyses would benefit from a "gold standard" (fully sequence-verified, high-quality) ORF collection, which allows for high confidence in, and reproducibility of, experimental results. Here, we describe Yeast FLEXGene, a S. cerevisiae protein-coding clone collection that covers over 5000 predicted protein-coding sequences. The clone set covers 87% of the current S. cerevisiae genome annotation and includes full sequencing of each ORF insert. Availability of this collection makes possible a wide variety of studies, from purified proteins to mutation suppression analysis, which should contribute to a global understanding of yeast protein function.