Molecular Biology of the Cell

Vol. 9, Issue 12, 3273-3297, December 1998

Comprehensive Identification of Cell Cycle-regulated Genes of the Yeast Saccharomyces cerevisiae by Microarray Hybridization

Paul T. Spellman,*† Gavin Sherlock,*†‡ Michael Q. Zhang,‡ Vishwanath R. Iyer,§ Kirk Anders,* Michael B. Eisen,* Patrick O. Brown,§‖ David Botstein,* and Bruce Futcher‡

*Department of Genetics, Stanford University Medical Center, Stanford, California 94306-5120; ‡Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 11724-2209; §Department of Biochemistry, Stanford University Medical Center, Stanford, California 94306-5428; and ‖Howard Hughes Medical Institute, Stanford, California 94305-5428

Submitted September 4, 1998; Accepted October 15, 1998
Monitoring Editor: Gerald R. Fink

    ABSTRACT

We sought to create a comprehensive catalog of yeast genes whose transcript levels vary periodically within the cell cycle. To this end, we used DNA microarrays and samples from yeast cultures synchronized by three independent methods: α factor arrest, elutriation, and arrest of a cdc15 temperature-sensitive mutant. Using periodicity and correlation algorithms, we identified 800 genes that meet an objective minimum criterion for cell cycle regulation. In separate experiments, designed to examine the effects of inducing either the G1 cyclin Cln3p or the B-type cyclin Clb2p, we found that the mRNA levels of more than half of these 800 genes respond to one or both of these cyclins. Furthermore, we analyzed our set of cell cycle-regulated genes for known and new promoter elements and show that several known elements (or variations thereof) contain information predictive of cell cycle regulation. A full description and complete data sets are available at http://cellcycle-www.stanford.edu.

    INTRODUCTION

In 1981 Hereford and coworkers discovered that yeast histone mRNAs oscillate in abundance during the cell division cycle (Hereford et al., 1981). To date 104 messages that are cell cycle regulated have been identified using traditional methods, and it was estimated that some 250 cell cycle-regulated genes might exist (Price et al., 1991). There are several reasons why genes might be regulated in a periodic manner coincident with the cell cycle. Such regulation might be required for the proper functioning of mechanisms that maintain order during cell division. Alternatively, regulation of these genes could simply allow conservation of resources. Much of the literature has focused on the posttranscriptional mechanisms that control the basic timing of the cell cycle. However, there is also clear evidence that trans-acting factors play a critical role in the regulation of the abundance of many cell cycle-regulated transcripts.

Most identified cell cycle controls that exert influence over mRNA levels do so at the level of transcription. Three major types of cell cycle transcription factors are known in yeast, the MBF and SBF factors, Mcm1p-containing factors, and Swi5p/Ace2p (Table 1). Many genes expressed at about the G1/S transition contain MCB or SCB elements in their promoters to which MBF and SBF bind respectively (for review, see Koch and Nasmyth, 1994). It is now apparent that SBF is not as specific for SCBs as was originally thought but, rather, can bind, at least in some cases, to motifs more closely matching the MCB consensus (Partridge et al., 1997). MBF and SBF are activated posttranslationally by Cln3p-Cdc28p, and SBF, at least, is inactivated by Clb2p-Cdc28p (Amon et al., 1993). It is this cyclin-dependent activation and inactivation that causes MBF- and SBF-mediated transcription to be cell cycle regulated. Mcm1p can bind with other DNA binding proteins to mediate a specific biological effect. In cooperation with Ste12p, Mcm1p directs the cell cycle expression of some genes in early G1 phase (Oehlen et al., 1996). In cooperation with an uncloned factor called "Swi five factor" (SFF), it induces the expression of CLB1, CLB2, BUD4, and SWI5 in M (Lydall et al., 1991; Sanders and Herskowitz, 1996). Finally, possibly acting without a partner, it induces transcription of CLN3, SWI4, and CDC6 at the M/G1 boundary (McInerny et al., 1997). The Mcm1p + SFF combination is interesting, because it is somehow activated by Clb2p-Cdc28p, and Mcm1p + SFF then induces further transcription of CLB2. Thus, Mcm1p is part of a positive feedback loop for CLB2 transcription. Finally, Swi5p and Ace2p, which are transcriptionally controlled by Mcm1p and SFF, are responsible for the expression of many genes in M and M/G1 (Kovacech et al., 1996). Some of these genes are responsible for inactivating Clb2p and promoting cytokinesis, thus allowing exit from mitosis, and allowing the cycle to begin anew.

                              
Table 1.  Transcription factors that regulate the cell cycle

Many cell cycle-regulated genes are involved in processes that occur only once per cell cycle. Such processes include DNA synthesis, budding, and cytokinesis. Additionally many of these genes are involved in controlling the cell cycle itself, although in most cases it is unclear whether their regulated transcription is absolutely required. The cell division cycle is thus a complex self-regulating program, such that many genes involved in aspects of the cell cycle are also controlled by it.

We present the results of a comprehensive series of experiments aimed at objectively identifying all protein-encoding transcripts in the genome of Saccharomyces cerevisiae that are cell cycle regulated. We used DNA microarrays to analyze mRNA levels in cell cultures that had been synchronized by three independent methods. These data were analyzed by deriving a numerical score based on a Fourier algorithm (testing periodicity) and by a correlation function that identified genes whose RNA levels were similar to the RNA levels of genes already known to be regulated by the cell cycle. This protocol allowed us to include data from a previously published study (Cho et al., 1998). We find that ~800 genes are cell cycle regulated, which constitutes >10% of all protein-coding genes in the genome. We also find that about one-half of these genes can be controlled by the G1 cyclin CLN3 and/or the mitotic cyclin CLB2. The primary data presented in this article, tools for examining them, and supporting analyses can be found at http://cellcycle-www.stanford.edu.

    MATERIALS AND METHODS

Strains

Strains used in this study are shown in Table 2.

                              
Table 2.  Strains used in this study

Media and Growth Conditions

YEP medium (Sherman, 1991) was used in all experiments, supplemented with an appropriate carbon source. Carbon sources are indicated in the descriptions of each experiment and were used at a final concentration of 2% (wt/vol), unless otherwise noted. The pH of the YEP for the α factor experiment was adjusted to 5.5 before use. The medium used for the elutriation was first passed through Whatman #1 filter paper. Cultures of yeast were shaken at 250-300 rpm, in a volume no more than 25% of the vessel maximum at the temperature specified in the description of each experiment.

Microarray Manufacture

Yeast ORFs were amplified using gene PAIRS primers (Research Genetics, Huntsville, AL). One hundred-microliter PCR reactions were performed in 96-well PCR plates using each primer pair with the following reagents: 1 µM each primer, 200 µM each dATP, dCTP, dTTP, and dGTP, 1× PCR buffer (Perkin Elmer-Cetus, Norwalk, CT), 2 mM MgCl2, and 2 U of Taq DNA polymerase (Perkin Elmer-Cetus). Thermalcycling was performed in Perkin Elmer-Cetus 9600 thermalcyclers with a 5-min denaturation step at 94°C, followed by 30 cycles with melting, annealing, and extension temperatures and times of 94°C, 30 s; 54°C, 45 s; and 72°C, 3 min 30 s, respectively. Production of the correct PCR product was verified by gel electrophoresis. Products deemed to have failed were reamplified either by repeating the PCR reaction with the gene PAIRS primers, ordering custom primers, or using the yeast ORF DNA (Research Genetics) as a template. Reamplification of failed PCRs used the same protocol as initial amplification.

DNAs were prepared and printed onto microarrays as described previously (Shalon et al., 1996; DeRisi et al., 1997 [http://cmgm.stanford.edu/pbrown/]; Eisen and Brown, 1999) with 190-µm spacing between the centers of each element. Each microarray was visually inspected, and all microarrays used in this study were estimated to be missing <1% of all elements except for arrays used in the cdc15 experiments, which were missing ~3% of all elements.

Cell Density and Size Measurements

All cell size measurements were made using a Coulter Counter Multisizer (Coulter Electronics, Hialeah, FL) or a Beckman FACScan workstation (Beckman Instruments, Fullerton, CA). The Coulter Counter was also used to measure the cell density for elutriation. Cell densities in the α factor experiment were measured at OD600 using a Pharmacia Ultrospec III spectrophotometer (Pharmacia, Piscataway, NJ).

Budding Index Calculations

Each sample was sonicated for 30 s with a Virsonic 300 (Virtis, Gardiner, NY) microprobe equipped sonicator at 50% power to separate divided cells. At least 200 cells were counted and scored for the presence of a bud.

DNA Content Determination

Samples were prepared as described previously (Futcher, 1993), and DNA content was measured using a Beckman FACScan workstation.

Nuclear Staining

Cells were washed in water and resuspended in water containing DAPI at 1 µg/ml. Cells were then placed on a glass slide and visualized by fluorescence microscopy, using a Zeiss Axioplan microscope (Carl Zeiss, Thornwood, NY).

α Factor-based Synchronization

Yeast strain DBY8724 was grown to an OD600 of 0.2 in YEP glucose, an asynchronous sample was taken, and α factor (PAN facility, Beckman Center, Stanford University) was added to a concentration of 12 ng/ml. After 120 min the α factor was removed by pelleting the cells for 5 min in a Sorvall (Newtown, CT) S34 rotor at 3000 rpm and decanting the supernatant. The arrested cells were resuspended in fresh YEP glucose to an OD600 of 0.18. Every 7 min, for the next 140 min, 25-ml samples were taken for RNA, and 5-ml samples were taken for FACS analysis. At 91 min after release the OD600 of the culture was reduced to ~0.2 from ~0.4 by addition of fresh medium.

Size-based Synchronization

Nine liters of yeast strain DBY7286 were grown in YEP ethanol (2%, vol/vol) at 25°C to a cell density of 1.5 × 10⁷ cells/ml. Cells were pelleted in a Beckman JA-10 rotor for 10 min. The supernatant was saved and is referred to as clarified medium. Cells were resuspended in 300 ml of the clarified medium and sonicated for 2 min with a Virsonic 300 equipped with a microprobe at 50% power. This volume was loaded into a dual-chamber elutriation chamber (Beckman Instruments, Fullerton, CA; catalog numbers 356940 and 356941) in a Beckman J-6 M/E centrifuge equipped with a JE-5.0 elutriation rotor. The elutriator was run with clarified medium at 25°C. Unbudded daughter cells (400 ml at 2.3 × 10⁷ cells/ml) were collected at a modal cell volume of 17.7 fl and grown at 25°C. Samples were taken every 30 min for the next 6.5 h with independent samples for DAPI staining (1 ml), FACS analysis (2 ml), budding index (1 ml), and RNA preparation (25 ml). After harvesting, the samples for DAPI, FACS, and budding index were immediately chilled on ice.

Cdc15-based Synchronization

The cdc15-2 (DBY8728) strain was grown to 2.5 × 10⁶ cells/ml in YEP glucose medium at 23°C. The culture was then shifted to an air incubator at 37°C and held at that temperature for 3.5 h. By this time, cell density had reached 6.6 × 10⁶ cells/ml, and 96% of the cells were large dumbbells characteristic of a cdc15 arrest. The cells were then released from the cdc15 arrest by shifting the culture to a 23°C water bath. Samples were taken every 10 min for 300 min, starting at the time of the shift to the 23°C water bath. By 300 min after shift, cell density had reached 4 × 10⁷ cells/ml. Part of the same original culture was grown at 23°C to 1 × 10⁷ cells/ml, and cells were harvested for extraction of the control mRNA. Progress through the cell cycle was monitored by the appearance of new buds.

Because cdc15-2 cells do not quantitatively complete cell separation after a release from a 37°C arrest, FACS analysis is difficult to interpret. We therefore followed the progress of the cdc15-2 cells through the cycle by monitoring the appearance of new buds. The first new buds appeared 50 min after the release to 23°C, when 12% of the dumbbells had small buds (usually, two small buds, one on each half of the dumbbell). The percentage of dumbbells with small buds was 52% at 60 min, 76% at 70 min, 96% at 80 min, and virtually 100% at 90 min, at which time almost all the dumbbells had not just one bud, but two, one on each half of the dumbbell. The second round of small buds appeared at 150 min, when 3% of the cells had small buds. The percentage was 9.7% at 160 min, 32% at 170 min, 68% at 180 min, and 81% at 190 min. The third round of small buds appeared at 270 min, although by this time synchrony was decaying.

Cln3 and Clb2 Experiments

For the Cln3 experiment strain 31 (DBY8725) was grown in YEP raffinose/galactose (1% each) at 23°C to a density of 1 × 10⁷ cells/ml. The cells were then filtered and washed with 2 vol of YEP and resuspended in YEP raffinose at 23°C. These cells arrested because of lack of Cln activity after incubating for 3 h. Cdc34p was then inactivated by shifting the culture to 37°C for 2.5 h. The culture was then split, and galactose was added to one-half at a final concentration of 2% (wt/vol). Cells from this culture were harvested every 10 min for 40 min for RNA. The entire control culture was harvested at time 0. The experiment was performed twice (once for each hybridization in our data set). Data from the 40-min (first experiment) and 30-min (second experiment) postgalactose samples are shown.

For the Clb2 experiment strain 245 (DBY8726) was grown to a density of 5 × 10⁶ cells/ml at 30°C in YEP raffinose/galactose (1% each) and then centrifuged, and the cells were washed with 2 vol of YEP and then resuspended in YEP raffinose. After 6 h DMSO was added to a final concentration of 1%, and nocodazole was added to a final concentration of 15 µg/ml. The culture was then split, and galactose was added to one-half at a final concentration of 2% (wt/vol). Cells from this culture were harvested every 10 min for 40 min for RNA. The entire control culture was harvested at time 0. The experiment was performed twice (once for each hybridization in our data set). Data from the two 40-min postgalactose samples are shown.

To control the Cln3 and Clb2 experiments for the effects of galactose addition, strain W303a (DBY8727) was grown in 250 ml of YEP raffinose at 30°C to a cell density of 1 × 10⁷ cells/ml. The culture was split in two, and to one-half (the experimental culture) was added galactose to a final concentration of 2% (wt/vol). Forty minutes later both cultures were harvested for preparation of mRNA. The data for this experiment are available on our web site.

RNA Preparation

Samples for RNA isolation were taken by pipetting culture directly into 50-ml Falcon (Lincoln Park, NJ) tubes containing ~20 g of ice to quickly chill the cells. Cells were collected by spinning for 3 min in a tabletop centrifuge and then frozen by immersion in liquid nitrogen and stored at -80°C until RNA was prepared. RNA was prepared by adding 10 ml of water-saturated phenol, 10 ml of sodium acetate buffer (50 mM sodium acetate, 10 mM EDTA, pH 5.0), and 1 ml of 10% SDS (all prewarmed to 65°C) to each frozen cell pellet. Each mixture was incubated at 65°C for 10 min, vortexing vigorously every 1 min for 10 s. After spinning at 1500 × g for 10 min, the aqueous phase was transferred to another 50-ml conical tube containing 10 ml of water-saturated phenol. Samples were vortexed for 30 s, and the spin was repeated. Aqueous phases were again transferred to a new 50-ml conical tube and 10 ml of phenol:chloroform (1:1) were added, followed by a 15-min spin. RNA was precipitated by adding the aqueous phase to an equal volume of isopropanol and 0.1 vol of 3 M sodium acetate. Samples were spun for 30 min at 1500 × g to pellet the RNA. Pellets were washed with 70% ethanol and dried at room temperature. RNA pellets were dissolved in TE (10 mM Tris, 1 mM EDTA, pH 8.0) to ~2.5 mg/ml.

Probe Preparation

Total RNA (15 µg) and 6 µg of oligo-dT were combined in a total volume of 15 µl. RNA oligo-dT mixtures were heated to 70°C for 1 min and then cooled on ice. Three microliters of 25 mM Cy3- or Cy5-conjugated dUTP (Amersham, Arlington Heights, IL), 3 µl of 1 M DTT, 6 µl of first-strand buffer (Stratagene, La Jolla, CA), 0.6 µl of dNTPs (25 mM each dATP, dCTP, and dGTP and 15 mM dTTP), and 2 µl of Superscript II (Stratagene) were added. Each sample was then incubated at 42°C for 2 h to generate Cy-labeled cDNA. Starting RNA was degraded by addition of 1.5 µl of stop solution (1 N NaOH, 0.1 M EDTA) and incubation at 70°C for 10 min. Samples were neutralized by addition of 15 µl of 0.1 N HCl and 400 µl of TE (10 mM Tris, 1 mM EDTA, pH 7.4). Labeled cDNA was concentrated and separated from unbound fluor by separation in a Centricon-30 (Amicon, Danvers, MA) until no further fluor was visible in the flow through, and the probe was concentrated to <4 µl.

Microarray Hybridizations

A probe mixture (12 µl) consisting of Cy3- and Cy5-labeled cDNAs, 3× SSC, 0.3% SDS, and 1.8 µg/µl yeast tRNA was applied to each microarray. The microarray was covered by a 22-mm-square coverslip (Fisher Scientific, Pittsburgh, PA) and placed in a custom-manufactured hybridization chamber (see http://cmgm.stanford.edu/pbrown/). Ten microliters of water were placed inside the hybridization chamber before sealing, and the chamber was placed in a 65°C water bath. The microarrays were allowed to hybridize 4-6 h. Microarrays were removed from the chambers and placed in standard histochemistry slide holders where they were washed by plunging 30 times in each of the following solutions, respectively: 2× SSC, 0.2% SDS; 0.4× SSC; and 0.2× SSC.

Data Acquisition and Processing

Microarrays were scanned using a custom-built scanning laser microscope. Separate 2 × 2-cm images were acquired for each fluor at a resolution of 20 µm/pixel. Data were extracted by manually superimposing a grid of boxes over the combined Cy3-Cy5 image so that each box contained a single array spot. The average fluorescence intensity for each fluor within each box was recorded. The local background was estimated by averaging the intensities of the weakest 12% of the pixels in each box. Fluorescence ratios were computed based on the background corrected values. Spots of poor quality (as assayed by visual inspection) were removed from further consideration. As a measure of the internal consistency of the data for each spot, the pixel-by-pixel correlation coefficient between the Cy3 and Cy5 intensities was computed; spots with low correlation values (i.e., <0.4) were excluded from further analysis.
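
The per-spot processing described above can be illustrated with a minimal Python sketch. It is not the authors' software: the function names, the rounding used to pick the weakest 12% of pixels, and the rejection of spots with a non-positive control signal are assumptions made for illustration. The ratio is taken as Cy5 (experimental) over Cy3 (control).

```python
from statistics import mean

def local_background(pixels, fraction=0.12):
    """Mean of the weakest `fraction` of the pixel intensities in one box."""
    ordered = sorted(pixels)
    n = max(1, int(len(ordered) * fraction))  # assumed rounding rule
    return mean(ordered[:n])

def pearson(xs, ys):
    """Plain Pearson correlation; returns 0.0 if either series is constant."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5

def spot_ratio(cy5_pixels, cy3_pixels, min_corr=0.4):
    """Background-corrected Cy5/Cy3 ratio, or None if the spot is rejected."""
    if pearson(cy5_pixels, cy3_pixels) < min_corr:
        return None  # pixel-by-pixel correlation too low
    red = mean(cy5_pixels) - local_background(cy5_pixels)
    green = mean(cy3_pixels) - local_background(cy3_pixels)
    if green <= 0:
        return None  # assumed guard: no usable control signal
    return red / green
```

Spots removed by visual inspection have no counterpart in this sketch; only the numerical filters are shown.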

Identification of mRNAs Regulated in a Cell Cycle-dependent Manner

Data for each gene in the α factor time series were extracted from the database and were normalized so that the average log2(ratio) over the course of the experiments was equal to 0. A Fourier transform (Eq. 1-3) was applied to the data series for each gene, and the resulting vector (C) was stored for each gene, where ω is the period of the cell cycle, t is the time, Φ is the phase offset, and ratio(t) is the ratio measurement at time t. We found that the magnitude of the Fourier transform (D, Eq. 4) was unstable for small variations of ω, so we averaged the vectors of the transform over a range of 40 values, which were evenly spaced around the estimated division time for the experiment (66 ± 11). We initially set the value of Φ to 0.
$$A = \sum \sin(\omega t + \Phi)\,\log_2(\mathrm{ratio}(t)) \tag{1}$$
$$B = \sum \cos(\omega t + \Phi)\,\log_2(\mathrm{ratio}(t)) \tag{2}$$
$$C = \langle A, B \rangle \tag{3}$$
$$D = \sqrt{A^2 + B^2} \tag{4}$$
The expression profile of each gene across the experiments was then correlated to five different profiles representing genes known to be expressed in G1, S, G2, M, and M/G1 using a standard Pearson correlation function. The profiles for known gene classes were identified by averaging the log2(ratio) data for each of the genes known to peak in each of the five time periods. The peak correlation score was defined as the highest correlation value between the data series for each gene and each of the profiles. The vector calculated by the Fourier transform was scaled by the peak correlation value.
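
As an illustration of Eq. 1-4, the following Python sketch computes the Fourier vector for one gene and averages it over 40 evenly spaced trial periods. It rests on stated assumptions: the text calls ω the period, and the sketch interprets the sine and cosine argument as 2πt/T for a period T in minutes; the 40 values are assumed to span 66 ± 11 with both endpoints included; all function names are hypothetical.

```python
import math

def center_log_ratios(ratios):
    """log2-transform ratios and shift them so their mean is 0."""
    logs = [math.log2(r) for r in ratios]
    m = sum(logs) / len(logs)
    return [x - m for x in logs]

def fourier_vector(times, log_ratios, period, phase=0.0):
    """Return (A, B) of Eq. 1-2 for one trial period (assumed: omega = 2*pi/period)."""
    omega = 2 * math.pi / period
    a = sum(math.sin(omega * t + phase) * y for t, y in zip(times, log_ratios))
    b = sum(math.cos(omega * t + phase) * y for t, y in zip(times, log_ratios))
    return a, b

def averaged_vector(times, log_ratios, center=66.0, half_width=11.0, steps=40):
    """Average the (A, B) vectors over `steps` periods spanning center +/- half_width."""
    periods = [center - half_width + 2 * half_width * i / (steps - 1)
               for i in range(steps)]
    vecs = [fourier_vector(times, log_ratios, p) for p in periods]
    return (sum(v[0] for v in vecs) / steps, sum(v[1] for v in vecs) / steps)

def magnitude(vec):
    """D of Eq. 4."""
    return math.hypot(vec[0], vec[1])
```

Scaling by the peak correlation value would then amount to multiplying both components of the averaged vector by that correlation before the vectors from different experiments are combined.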

The above process was repeated for the cdc15 experiment (ω varying between 60 and 80) and for the cdc28 data (ω varying between 80 and 100) from Cho et al. (1998). The cdc28 data set was first converted to ratio-style measurements by dividing each measurement by the average value of the measurements for that gene. Before this step it was necessary to exclude some data points that appeared to be aberrant. Any data value for which the two values on either side were threefold different in the same direction was excluded. Each gene thus had three vector scores (one for each of the three analyzed data series).
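
A minimal Python sketch of this preprocessing follows. The exclusion rule is read here as: an interior point is dropped when it is at least threefold higher than both neighbors, or at least threefold lower than both. That reading, the treatment of end points (always kept), and the function names are assumptions, not the authors' stated procedure.

```python
def drop_spikes(values, fold=3.0):
    """Replace with None any interior point that differs from both neighbours
    by at least `fold` in the same direction (assumed reading of the rule)."""
    cleaned = list(values)
    for i in range(1, len(values) - 1):
        prev, cur, nxt = values[i - 1], values[i], values[i + 1]
        if min(prev, cur, nxt) <= 0:
            continue  # fold changes are undefined for non-positive values
        spike_up = cur >= fold * prev and cur >= fold * nxt
        spike_down = cur * fold <= prev and cur * fold <= nxt
        if spike_up or spike_down:
            cleaned[i] = None
    return cleaned

def to_ratios(values):
    """Divide each retained measurement by the gene's average retained value."""
    kept = [v for v in values if v is not None]
    avg = sum(kept) / len(kept)
    return [None if v is None else v / avg for v in values]
```

Applying `to_ratios(drop_spikes(series))` reproduces the order given in the text: aberrant points are removed first, then the remaining values are converted to ratios.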

To generate a single vector for each gene, we added the vectors for each experiment together. However, the value of Φ for the three experiments should not be the same, because the experiments start at different points in the cell cycle. Therefore, before combining the vectors from the three experiments, constants, Φ_cdc15 and Φ_cdc28 (relative to the α factor experiment), were calculated for the cdc15 and cdc28 experiments, respectively, that maximized, for the known genes, the average magnitude of the summed vectors. The elutriation data were not included, because it was not possible to calculate a Φ that maximized the values of more than a handful of the known genes. The α factor and cdc15 vectors were multiplied by 0.7, so that they would not unduly contribute to the final "aggregate CDC score," which was calculated by taking the magnitude of this final vector.
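
The combination step can be sketched in Python as follows. Expanding sin(ωt + Φ) and cos(ωt + Φ) in Eq. 1-2 shows that changing Φ rotates the vector: A(Φ) = A cos Φ + B sin Φ and B(Φ) = B cos Φ − A sin Φ. The grid search over 360 phase values, the fitting of one phase at a time against the reference experiment (rather than both jointly), and all function names are simplifying assumptions for illustration.

```python
import math

def rotate(vec, phi):
    """Apply a phase offset phi to a Fourier vector (A, B)."""
    a, b = vec
    return (a * math.cos(phi) + b * math.sin(phi),
            b * math.cos(phi) - a * math.sin(phi))

def best_phase(reference_vecs, other_vecs, steps=360):
    """Phase that maximizes the mean magnitude of reference + rotated other,
    taken over the known genes (assumed simple grid search)."""
    def mean_magnitude(phi):
        total = 0.0
        for ref, other in zip(reference_vecs, other_vecs):
            r = rotate(other, phi)
            total += math.hypot(ref[0] + r[0], ref[1] + r[1])
        return total / len(reference_vecs)
    grid = [2 * math.pi * k / steps for k in range(steps)]
    return max(grid, key=mean_magnitude)

def aggregate_cdc_score(alpha_vec, cdc15_vec, cdc28_vec, phi_cdc15, phi_cdc28):
    """Magnitude of the weighted, phase-aligned sum of the three vectors."""
    v15 = rotate(cdc15_vec, phi_cdc15)
    v28 = rotate(cdc28_vec, phi_cdc28)
    a = 0.7 * alpha_vec[0] + 0.7 * v15[0] + v28[0]
    b = 0.7 * alpha_vec[1] + 0.7 * v15[1] + v28[1]
    return math.hypot(a, b)
```

The 0.7 weights on the α factor and cdc15 vectors are taken from the text; the cdc28 vector is left unweighted.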

Genes were ranked by their aggregate CDC scores, and the list was examined to identify the positions of known cell cycle genes within it. We selected a threshold CDC score that was exceeded by 91% of known cell cycle-regulated genes. Altogether 800 genes met or exceeded this CDC score.
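
The threshold selection can be illustrated with a short Python sketch. It assumes the threshold is simply the score of the lowest-ranked known gene that must be retained to reach the target fraction; the function names and the use of `math.ceil` are illustrative choices rather than the documented procedure.

```python
import math

def threshold_for_recall(known_scores, recall=0.91):
    """Smallest score that still retains `recall` of the known genes."""
    ordered = sorted(known_scores, reverse=True)
    n_keep = math.ceil(recall * len(ordered))
    return ordered[n_keep - 1]

def select_genes(scores, threshold):
    """Names of all genes whose aggregate CDC score meets the threshold."""
    return sorted(gene for gene, s in scores.items() if s >= threshold)
```

With 104 known genes and a target of 91%, this rule retains 95 of them, which agrees with the 95 of 104 reported in RESULTS.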

Promoter Analysis

Motifs were identified in the 700 bp upstream of the start codon using a Gibbs sampling strategy. Such a strategy was originally developed by Lawrence et al. (1993) to find patterns in protein sequences and later modified by Neuwald et al. (1995) to take into account the possibility of a repeated motif. We have modified these Gibbs sampling algorithms to allow pattern searches of DNA (Zhang, unpublished data), for which functions such as double-strand search, palindrome symmetry, and submotif inclusion and exclusion were included. Once motifs were established for a group or cluster, we tested their predictive value by searching for the motif consensus (with specified mismatches) in the 700 bp upstream of the ATG for all groups, as well as for a control set of non-cell cycle-regulated genes, and compared the distribution of these sites in different groups.
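
The final step, searching upstream regions for a motif consensus with a specified number of mismatches, can be sketched in Python as below. This covers only the consensus scan, not the Gibbs sampler. The function names are hypothetical, and scanning both strands by also matching the reverse complement is an assumption based on the double-strand search mentioned above.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def count_sites(upstream, consensus, max_mismatches=0):
    """Count windows of `upstream` matching the consensus on either strand
    with at most `max_mismatches` mismatching positions."""
    upstream = upstream.upper()
    consensus = consensus.upper()
    targets = {consensus, reverse_complement(consensus)}
    k = len(consensus)
    hits = 0
    for i in range(len(upstream) - k + 1):
        window = upstream[i:i + k]
        if any(sum(a != b for a, b in zip(window, t)) <= max_mismatches
               for t in targets):
            hits += 1
    return hits
```

In use, `upstream` would be the 700 bp preceding the ATG of each gene, and the counts would be compared between gene groups and the control set of non-cell cycle-regulated genes.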

TAQman Assay

The TAQman assay was performed on the same α factor samples that were used in the microarray hybridization experiments. For each sample, 500 ng of total RNA were incubated for 15 min with 0.1 U/µl DNase I (amplification grade; Life Technologies, Grand Island, NY) in 2 mM MgCl2, 50 mM KCl, 20 mM Tris-HCl (pH 8.4). The reaction was stopped by adding EDTA to 2.5 mM and incubating at 65°C for 10 min. The RNA was reverse transcribed using TAQman reverse transcription reagents (PE Applied Biosystems, Foster City, CA) consisting of 2.5 µM oligo-dT 16 mer, 1.25 U/µl MultiScribe reverse transcriptase, 0.5 mM dGTP, dATP, dTTP, and dCTP, 0.4 U/µl RNase inhibitor, 50 mM KCl, 10 mM Tris-HCl (pH 8.3). The reaction was incubated at 25°C for 10 min, 48°C for 30 min, and then 95°C for 5 min. The resulting cDNA served as a template for real-time quantitative PCR as follows, in which a fluorescent reporter dye (6-carboxy-fluorescein [FAM]) was released and quantitated during each specific replication of the template (Heid et al., 1996). The cDNA was mixed with 2× TAQman universal PCR master mix (PE Applied Biosystems) and then split into separate reaction tubes containing gene-specific forward and reverse primers (900 nM each) and dye-labeled oligonucleotide probes (200 nM). Each resultant PCR (25 µl) contained cDNA generated from 5 ng of RNA.
The sequences of the primers and probes were the following: TUB1 primers: forward, 5'-AAAGCCGAAGGGAGGAGAAG-3'; reverse, 5'-CCCTTGGAACGAACTTACCGT-3'; TUB1 probe: 5'(FAM)-CTCCACGTTTTTCCATGAAACCGGC-(6-carboxy-tetramethylrhodamine [TAMRA])p3'; TUB2 primers: forward, 5'-TTGTCCCATTCCCACGTTTAC-3'; reverse, 5'-GATTGAGAGCCAATTGCCGT-3'; TUB2 probe: 5'(FAM)-TTCTTCATGGTCGGCTACGCTCCATT-(TAMRA)p3'; TUB3 primers: forward, 5'-CCTGCGCCTCAATTGTCTACT-3'; reverse, 5'-TTCCAGGGTGGTATGCGTG-3'; TUB3 probe: 5'(FAM)-CGTCGTGGAACCTTACAACACGGTTTTAA-(TAMRA)p3'; PPA1 primers: forward, 5'-TGTCGGTGCTTCCAATTTGAT-3'; reverse, 5'-CATCGGAAATGGCAGCAGT-3'; and PPA1 probe: 5'(FAM)-CCGGTGATACCGACAGCGATACCA-(TAMRA)p3'. Each gene-specific PCR was done in triplicate or quadruplicate. The tubes were placed in a PE Applied Biosystems Prism 7700 sequence detection system and were incubated with the following parameters: 50°C for 2 min, 95°C for 10 min, followed by 40 cycles of 95°C for 15 sec and 60°C for 1 min. The computer program Sequence Detector version 1.6.3 (PE Applied Biosystems) provided output, which allowed the average quantities of TUB mRNA relative to PPA1 mRNA to be determined for each RNA sample. The TUB:PPA1 ratio in the asynchronous sample A1 was arbitrarily set at 1, and the results from the other samples were adjusted accordingly.
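
The final normalization can be illustrated with a small Python sketch. It assumes that replicate quantities for each gene and sample have already been exported from the instrument software as plain numbers; the function name, the dictionary layout, and any sample label other than A1 are hypothetical.

```python
from statistics import mean

def relative_expression(target_qty, reference_qty, calibrator="A1"):
    """Average replicate quantities per sample, form the target:reference
    ratio, and rescale so the calibrator sample has a ratio of 1."""
    ratios = {sample: mean(target_qty[sample]) / mean(reference_qty[sample])
              for sample in target_qty}
    base = ratios[calibrator]
    return {sample: r / base for sample, r in ratios.items()}
```

For example, passing TUB1 replicate quantities as `target_qty` and PPA1 replicate quantities as `reference_qty` would give each sample's TUB1:PPA1 ratio relative to the asynchronous sample A1.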

    RESULTS

Experimental Overview

We wished to identify the genes whose RNA levels varied periodically during the cell cycle. We initially obtained microarray data from synchronized cells and suitable controls and analyzed the >400,000 measurements to obtain objective scores based on a Fourier algorithm (which assesses periodicity) and a correlation measurement (which compared our data with those of previously identified cell cycle-regulated genes). We compared scores among the previously known and total gene sets to find a threshold value for deciding the significance of the apparent cell cycle regulation. For completeness, we also reanalyzed the published data of Cho et al. (1998). Using all the data, we arrived at a threshold CDC value above which 91% (95 of 104) of the genes previously shown to be cell cycle regulated are included. This procedure identified a total of 800 yeast genes as being periodically regulated.

Synchronized Cultures

We measured the relative levels of mRNA as a function of time in cell cultures that had been synchronized in three independent ways. First we used α pheromone to arrest MATa cells in G1. Second, we used centrifugal elutriation to obtain small G1 cells. Finally, we used a temperature-sensitive mutation, cdc15-2, which, at the restrictive temperature, arrests cells late in mitosis. We used three methods because each introduces characteristic artifacts. For instance, use of pheromone has regulatory consequences characteristic of mating, whereas use of temperature-sensitive mutants can cause heat shock.

The synchronization experiments differed in major ways. First, they were performed using different carbon sources and at different temperatures, with the consequence that the cells grew at different rates. Second, two different yeast strain backgrounds were used (S288C and W303), and finally, cells were synchronized at different points during the cell cycle. Each method produced significant cell cycle synchrony through one cell cycle (elutriation), two cycles (α pheromone), or three cycles (cdc15), as established by at least one of the following methods for each experiment: bud count, DNA content analysis (FACS) and nuclear staining (DAPI), as described in MATERIALS AND METHODS.

RNA was extracted from each of the samples collected, as well as from a control sample (asynchronous cultures of the same cells growing exponentially at the same temperature in the same medium). Fluorescently labeled cDNA was synthesized using Cy3 ("green") for all controls and Cy5 ("red") for all experimental samples. Mixtures of labeled control and experimental cDNAs were competitively hybridized to individual microarrays containing essentially all yeast genes (DeRisi et al., 1997). The ratio of experimental (red) to control (green) cDNA was measured by scanning laser microscopy (Shalon et al., 1996).

Transcription in Response to the Cyclins Cln3p and Clb2p

To gain mechanistic insight into the control of the observed cell cycle regulation, we identified genes whose mRNA levels responded to the induction of two well-characterized cell cycle regulators, Cln3p and Clb2p (see Nasmyth, 1993). Late in G1 phase, the Cln3p-Cdc28p protein kinase complex activates two transcription factors, MBF and SBF, and these in turn promote the transcription of a number of genes important for budding and DNA synthesis (Cross, 1995). Later in the cell cycle, the Clb2p-Cdc28p complex represses the activity of SBF, returning the expression of SBF-regulated genes to low levels (Amon et al., 1993). Furthermore, Clb2p-Cdc28p is known to activate expression of at least four genes, CLB1, CLB2, SWI5, and BUD4 (Althoefer et al., 1995; Sanders and Herskowitz, 1996).

To identify other genes controlled by Cln3p and Clb2p, we arrested cln- or clb- cells in late G1 with cdc34-2 for the CLN3 experiment and in M with nocodazole for the CLB2 experiment. We then induced expression of CLN3 or CLB2 without inducing cell cycle progression. RNA from the G1-phase cells expressing Cln3p (labeled red) was compared with control RNA (labeled green) from the G1-phase cells arrested in the absence of Cln3p. Similarly, for the CLB2 experiment, RNA from M-phase cells expressing Clb2p (labeled red) was compared with control RNA (labeled green) from M-phase cells arrested in the absence of Clb2p. In each case, mRNA levels were quantitatively measured by microarray hybridization. In addition, we performed an experiment to test the effects of adding galactose to an asynchronous culture with no inducible cyclin (see MATERIALS AND METHODS). Genes identified as strongly affected by galactose addition were not considered further in the Gal cyclin experiments.

Data Analysis and Availability

The total data we collected comprise ~400,000 individual ratio measurements. The quality and reliability of the data can only be assessed by unrestricted access to all data in forms suitable for further query or computer analysis. Therefore, in addition to the summary printed here, we provide primary data from two locations on the Internet. The numerical data are provided in a table of the actual ratios measured for each gene, on each array. They can be downloaded as a tab-delimited text file from the journal web site (http://www.molbiolcell.org) or from a server at Stanford (http://cellcycle-www.stanford.edu). The Stanford web site also provides images of the arrays, accessory data, and the capability to browse and search the complete data set. Raw data are also available from the authors upon request.
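As an illustration of how a tab-delimited ratio table of this kind might be read for further analysis, here is a minimal Python sketch. The column layout (gene identifier in the first column, one ratio per array in the remaining columns, empty cells for missing measurements) and the function name read_ratio_table are assumptions made for the example; they are not taken from the distributed files.

```python
import csv
import io

def read_ratio_table(handle):
    """Parse a tab-delimited table of expression ratios.

    Assumed (hypothetical) layout: a header row, then one row per gene,
    with the gene identifier first and one ratio per array after it.
    Empty cells are treated as missing data and returned as None.
    """
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    table = {}
    for row in reader:
        if not row:
            continue  # skip blank lines
        table[row[0]] = [float(cell) if cell.strip() else None for cell in row[1:]]
    return header[1:], table

# Tiny in-memory example with made-up values.
sample = io.StringIO("ORF\tt0\tt10\nYAL001C\t1.5\t\nYAL002W\t0.8\t1.2\n")
columns, ratios = read_ratio_table(sample)
```

Keeping missing measurements as None rather than zero avoids treating an absent spot as a measured ratio.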

The comprehensive nature of this work has another consequence: in what follows we refer by name to as many as 400 genes. It is impractical to provide detailed literature documentation for each gene every time it appears. Instead, we have provided references selectively, and we encourage readers to use the hyperlinks to the Saccharomyces Genome Database (http://genome-www.stanford.edu/Saccharomyces) and the Yeast Protein Database (http://quest7.proteome.com/YPDhome.html) that will be provided at both the Molecular Biology of the Cell and Stanford web sites.

Identification of Cell Cycle-regulated Transcripts

Combining the data from the synchronization experiments, we were able to identify 800 genes whose expression is cell cycle regulated. We did this by the combination of a Fourier algorithm and a correlation algorithm as described in MATERIALS AND METHODS. This resulted in a score for each gene that we refer to as the aggregate CDC score. To illustrate this, Table 3 provides some summary statistics and examples of the kinds of scores obtained for several genes (including specific examples that are and are not cell cycle regulated).

                              
Table 3.  Example scores and statistics for a collection of genes

In setting the threshold for the aggregate CDC score by our empirical method, we intended to minimize false-positive assessments while including the vast majority of previously characterized genes that are known to have periodic mRNA levels. Many additional genes showed indications of cell cycle regulation (by visual inspection of the data and by quantitation using our algorithm), although we could not objectively distinguish this behavior from noise.
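The Fourier part of such a score can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation described in MATERIALS AND METHODS: the function name fourier_score, the use of log-transformed ratios, and the use of a single, externally supplied cell cycle period are all hypothetical choices. The correlation component and the way the two are combined into the aggregate CDC score are not shown.

```python
import math

def fourier_score(times, log_ratios, period):
    """Return (magnitude, peak_time) of the component at one period.

    times      -- sampling times in minutes
    log_ratios -- log-transformed expression ratios, one per time point
    period     -- assumed cell cycle length in minutes (hypothetical input)
    """
    omega = 2.0 * math.pi / period
    a = sum(x * math.sin(omega * t) for t, x in zip(times, log_ratios))
    b = sum(x * math.cos(omega * t) for t, x in zip(times, log_ratios))
    magnitude = math.hypot(a, b)
    # The fitted component is proportional to cos(omega*t - phi),
    # so it peaks at t = phi / omega within the first cycle.
    phi = math.atan2(a, b) % (2.0 * math.pi)
    return magnitude, phi / omega
```

A gene whose ratios oscillate with the assumed period yields a large magnitude, whereas a flat profile yields a magnitude near zero; the returned peak time is the kind of quantity that can be used to order genes by time of peak expression.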

We estimated the false-positive rate in two ways. First, we randomized the data from each experiment (both by gene and by time point) and performed all of the analyses described above. The randomized data produced 24 "genes" (of nearly 6200) with CDC scores that exceed the threshold we used to classify genes as cell cycle regulated. We assume that this represents a reasonable estimate of the false-positive rate (i.e., ~3% of all genes identified would be false positives). In a second, more conservative test, we randomized the data set only within genes. The number of genes that had scores above our threshold was about three times higher (75 genes) when we randomly shuffled the data in this way. Thus, the number of false positives (of the 800 genes identified as cell cycle regulated) is likely <10% and perhaps as low as 3%.
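The second, within-gene randomization can be sketched as follows. The helper names and the score_fn argument are hypothetical; any scoring function that takes a list of ratios could be plugged in, and the one used for the paper is not reproduced here. The last function simply restates the arithmetic in the text: 24 of 800 is 3%, and 75 of 800 is about 9.4%.

```python
import random

def shuffle_within_genes(profiles, rng):
    """Return a copy of profiles with each gene's time points permuted."""
    shuffled = {}
    for gene, values in profiles.items():
        values = list(values)  # copy, so the input is left untouched
        rng.shuffle(values)
        shuffled[gene] = values
    return shuffled

def count_passing(profiles, score_fn, threshold):
    """Count genes whose score exceeds the threshold."""
    return sum(1 for values in profiles.values() if score_fn(values) > threshold)

def false_positive_fraction(n_random_hits, n_identified):
    """Fraction of identified genes expected to be false positives."""
    return n_random_hits / n_identified
```

Passing a seeded generator such as random.Random(0) makes the shuffle reproducible.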

Classifying the Cell Cycle-regulated Genes by Pattern of Expression

We used two distinct methods to classify genes by their pattern of expression, which we refer to as "phasing" (by time of peak expression) and "clustering" (by similarity of expression across the experiments, which is described below). There is no simple relationship between these two methods, although there are common features in the results. "Phase groups" were created by determining the time of peak expression for each gene (calculated from the Fourier algorithm) and ordering all genes by this time. We divided this ordered set into five (somewhat arbitrary) groups termed G1, S, G2, M, and M/G1 that approximate those commonly used in the literature. To this end we used the published timing of gene expression for the known genes in determining which genes belonged in which phase group. Figure 1A displays the 800 genes that we identified, sorted according to the phase of expression. Each column represents a time point in an experiment, and each row represents a gene that we identified as cell cycle regulated. The ratio of expression that we measured for each gene in each time point is color coded, reflecting the magnitude of the ratio of expression relative to the average of that gene, with shades of red indicating an increase (on) and shades of green indicating a decrease (off). This display is based on the paradigm of Eisen et al. (1998). Genes expressed during each part of the cell cycle are indicated by the color bar (and phase) on the side, and temporal progress through the cell cycle is indicated on the top.


Figure 1.   Gene expression during the yeast cell cycle. Genes correspond to the rows, and the time points of each experiment are the columns. The ratio of induction/repression is shown for each gene such that the magnitude is indicated by the intensity of the colors displayed. If the color is black, then the ratio of control to experimental cDNA is equal to 1, whereas the brightest colors (red and green) represent a ratio of 2.8:1. Ratios >2.8 are displayed as the brightest color. In all cases red indicates an increase in mRNA abundance, whereas green indicates a decrease in abundance. Gray areas (when visible) indicate absent data or data of low quality. Color bars on the right indicate the phase group to which a gene belongs (M/G1, yellow; G1, green; S, purple; G2, red; M, orange). These same colors indicate cell cycle phase along the top. (A) Gene expression patterns for cell cycle-regulated genes. The 800 genes are ordered by the times at which they reach peak expression. (B) Genes that share similar expression profiles are grouped by a clustering algorithm as described in the text. The dendrogram on the left shows the structure of the cluster.
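The color scheme in this legend can be expressed as a small mapping from ratio to color. The sketch below is an assumption-laden illustration: the legend fixes only the end points (black at a ratio of 1, brightest color at 2.8 and beyond, gray for missing data), so the logarithmic interpolation between them, the particular gray value, and the function name ratio_to_rgb are hypothetical choices.

```python
import math

SATURATION = 2.8  # ratio at which the color is brightest (from the legend)

def ratio_to_rgb(ratio):
    """Map an expression ratio to an (R, G, B) tuple with 0-255 channels."""
    if ratio is None or ratio <= 0:
        return (128, 128, 128)  # gray: absent or unusable data
    # Assumed: color intensity scales with the log of the ratio.
    level = math.log(ratio) / math.log(SATURATION)
    level = max(-1.0, min(1.0, level))  # ratios beyond 2.8-fold saturate
    intensity = int(round(255 * abs(level)))
    if level > 0:
        return (intensity, 0, 0)  # red: increase in mRNA abundance
    return (0, intensity, 0)      # green: decrease (black when level is 0)
```

Working on a log scale makes a 2.8-fold increase and a 2.8-fold decrease equally bright, which matches the symmetric treatment of red and green in the legend.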

By phasing there were 300 G1 genes (e.g., CLN2, RNR1, CDC9, RAD27, SMC3, and MNN1), 71 S genes (e.g., the histones), 121 G2 genes (e.g., CLB4, WHI3, and CIS3), 195 M genes (e.g., DBF2, CLB2, CDC5, CDC20, and SWI5), and 113 M/G1 genes (e.g., ASH1, SIC1, CDC6, and EGT2). This is a crude classification with many disadvantages (e.g., the last gene in the G2 group and the first gene in the M group are expressed at virtually the same time yet are in different groups), but nevertheless it is useful for discussing the results.
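The phasing step, and the boundary problem just mentioned, can be illustrated with a short Python sketch. The cut points below are invented for the example and are not the boundaries used in the paper, which were chosen using the published timing of known genes; peak positions are expressed as a fraction of one cell cycle.

```python
from bisect import bisect_right

# Hypothetical cut points (fractions of one cell cycle); not the paper's values.
BOUNDARIES = [0.10, 0.45, 0.55, 0.70, 0.90]
LABELS = ["M/G1", "G1", "S", "G2", "M", "M/G1"]  # the cycle wraps around

def phase_group(peak_fraction):
    """Return the phase group for a peak position given as a fraction."""
    return LABELS[bisect_right(BOUNDARIES, peak_fraction % 1.0)]

def order_and_group(peaks):
    """Sort genes by peak position and attach a phase group to each."""
    ordered = sorted(peaks.items(), key=lambda item: item[1] % 1.0)
    return [(gene, phase_group(frac)) for gene, frac in ordered]

# The five group sizes reported in the text add up to the full set.
GROUP_SIZES = {"G1": 300, "S": 71, "G2": 121, "M": 195, "M/G1": 113}
TOTAL_GENES = sum(GROUP_SIZES.values())
```

With these illustrative boundaries, two genes peaking at 0.699 and 0.701 of the cycle are assigned to G2 and M, respectively, even though they are expressed at virtually the same time.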

Identification of DNA Binding Sites

We searched through the 700 bp immediately upstream of the start codon of each of the 800 genes in our list to identify potential binding sites for known or novel factors that might control expression during the cell cycle. We found that the majority of the genes have good matches to known cell cycle transcription factor binding sites relevant to the time of peak expression. Furthermore, we examined the distribution of these elements within the upstream sequences, and found that both the site and its position relative to the ATG contain information that is predictive of the phase group of the gene. Figure 2 shows the frequency of six sites in promoters of the G1, S, G2, M, and M/G1 phase groups and a control set of non-cell cycle-regulated genes. These sites are the previously published SCB and MCB as well as four extensions and modifications of published sites (MCM1 + SFF, extended SWI5, SCB variant, and degenerate MCB). Full results of all promoter searches are available on our web site.


Figure 2.   Binding site frequencies. The distribution of various promoter elements in the upstream regions of each of the five cell cycle-regulated groups and a control group of 279 non-cell cycle-regulated genes that do not respond to either Cln3p or Clb2p induction is graphically displayed. In each case the numbers on the x axes represent distance from the start codon, and the bars represent the frequency of a particular site per residue per gene at that position in the upstream promoter regions. N is any base; Y is C or T; W is A or T; R is A or G; and M is A or C. (A) SCB. (B) A variant on the SCB, which is also similar to an MCB. (C) MCB. (D) MCB_d, a degenerate MCB sequence. (E) MCM1/SFF site. (F) Swi5e, an extended Swi5 site.
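A search for degenerate sites of this kind can be sketched with regular expressions. The code below is a minimal illustration, not the search program used for the paper: the function names are hypothetical, only the given strand is scanned, and the codes S (C or G) and K (G or T) are added from the standard nucleotide nomenclature because they appear in motifs quoted elsewhere in the text.

```python
import re

# Degenerate base codes; N, Y, W, R, and M are defined in the legend above.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "N": "[ACGT]", "Y": "[CT]", "W": "[AT]", "R": "[AG]", "M": "[AC]",
         "S": "[CG]", "K": "[GT]"}

def motif_to_regex(motif):
    """Compile a degenerate motif; the lookahead also finds overlapping sites."""
    return re.compile("(?=(" + "".join(IUPAC[b] for b in motif.upper()) + "))")

def site_distances(upstream, motif):
    """Distances from the start codon of each match on the given strand.

    upstream is the sequence immediately 5' of the ATG, written 5'->3',
    so its last base is at distance 1 and its first base at len(upstream).
    The distance reported is that of the first base of the match.
    """
    pattern = motif_to_regex(motif)
    n = len(upstream)
    return [n - m.start() for m in pattern.finditer(upstream.upper())]
```

Tallying these distances over all promoters in a phase group, and dividing by the number of genes, gives a positional frequency of the kind plotted in each panel.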

Clusters and Their Regulation

Clusters were established using the clustering algorithm of Eisen et al. (1998). This algorithm sorts through all the data to find the pairs of genes that behave most similarly in each experiment and then progressively adds other genes to the initial pairs to form clusters of apparently coregulated genes. As will be discussed below, the clustering algorithm successfully identifies coregulated genes, because analysis of the 5' regions of the genes in a cluster shows that such genes share common promoter elements, many of which are identifiable based on the published literature. Thus, these clusters provide a foundation for understanding the transcriptional mechanisms of cell cycle regulation. Figure 1B shows the entire clustergram of our cell cycle-regulated genes; a larger version with gene names attached is available at our web site. The same color-coded presentation is used, with the addition, on the extreme left, of the similarity tree (dendrogram) calculated by the clustering algorithm. Many portions of the clustergram (subclusters) are described below, and those that we discuss are summarized in Table 4. The locations of these subclusters in the main cluster are indicated on Figure 1B.
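The general idea of pairing the most similar genes and then progressively joining further genes can be sketched as a simple agglomerative procedure. This is an illustration only and should not be read as the published algorithm: the use of the Pearson correlation as the similarity measure, the replacement of each joined pair by its size-weighted average profile, and the function names are assumptions made for the example.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def cluster(profiles):
    """Join the most similar pair repeatedly; return a nested tuple (a tree).

    profiles -- dict mapping gene name to a list of log ratios
    """
    # Each node is (tree, profile, number of genes it contains).
    nodes = [(name, list(values), 1) for name, values in profiles.items()]
    while len(nodes) > 1:
        best = None
        for i in range(len(nodes)):
            for j in range(i + 1, len(nodes)):
                r = pearson(nodes[i][1], nodes[j][1])
                if best is None or r > best[0]:
                    best = (r, i, j)
        _, i, j = best
        (ta, va, na), (tb, vb, nb) = nodes[i], nodes[j]
        # Assumed: a joined node is represented by the weighted mean profile.
        merged = [(a * na + b * nb) / (na + nb) for a, b in zip(va, vb)]
        nodes = [node for k, node in enumerate(nodes) if k not in (i, j)]
        nodes.append(((ta, tb), merged, na + nb))
    return nodes[0][0]
```

The nested tuple plays the role of the dendrogram: genes joined early sit deep in the tree, and loosely related genes are attached near the root.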

                              
Table 4.  Cluster summary

The G1 Clusters

The "CLN2" cluster is the largest subcluster and contains 76 genes. Genes in this cluster include CLN1, CLN2, CLB6, RNR1, CDC9, CDC21, CDC45, POL12, POL30, SWE1, and many other genes involved in DNA replication. A portion of this cluster is shown in Figure 3A. The key features of these genes are that expression is strongly cell cycle regulated (i.e., large peak-to-trough ratios); peak expression occurs in mid-G1 phase (~10 min before budding in the cdc15 experiment); and they are strongly induced by GAL-CLN3 but are strongly repressed by GAL-CLB2. Fifty-eight percent of the 5' regions of these genes had at least one copy of the motif ACGCGT (vs. 6% of control genes), which is a perfect MCB element. Fifty-two percent had at least one copy of CRCGAAA (vs. 13% of control genes), a degenerate SCB element. In addition, 16 had the motif AGAAGAAA, which is similar to a functionally important sequence found upstream of CLN3 (AAGAAAAA) (Parviz et al., 1998). Finally, 17 had the motif CCACAK, which we do not recognize. Outside the core of this cluster are at least 43 additional genes that are less tightly clustered but nevertheless appear to be coregulated with the CLN2 cluster (119 total genes).


Figure 3.   The G1 clusters. The transcription profiles are displayed as described in the legend to Figure 1. (A) CLN2 cluster. A fraction of the genes regulated similarly to the G1 cyclin CLN2, which reaches peak expression in the G1 phase of the cell cycle. To view the full cluster of all the cell cycle-regulated genes, visit http://cellcycle-www.stanford.edu. (B) Y' cluster. Thirty-one genes that are located within the Y' elements show cell cycle regulation of mRNA levels that peak in G1.

The "Y'" cluster (Figure 3B) contains 31 ORFs that all share DNA sequence similarity. There are 38 ORFs that share this similarity in the genome and we identify 36 of them as cell cycle regulated. All of these 38 ORFs are found in Y' elements, located at chromosome ends. It should be noted that these results may not represent 36 independent observations, because the cDNAs corresponding to these ORFs are almost certain to cross-hybridize on the microarrays. We do not know how these ORFs are regulated or what the functional significance of this regulation is.

There is a set of 92 genes, containing ALG7, FKS1, GAS1, GOG5, PMT1, and PMI40, as well as other genes involved in cell wall synthesis (Klis, 1994), that do not form a cluster on the clustergram but are substantially coregulated. These genes can be seen on our web site as Figure 3C. Expression is strongly cell cycle regulated, and peak expression is nearly coincident with budding (~10 min later than the CLN2 cluster in the cdc15 experiment). These genes are induced by GAL-CLN3 and repressed by GAL-CLB2. The majority of these genes had the motif ACRMSAAA (where R is A or G, M is A or C, and S is C or G), which may be an extension and variation of the SCB motif (CACGAAA). Comparison of the CLN2 cluster with this set suggests that expression from MCB motifs may be activated somewhat before expression from SCB motifs, but both kinds of expression are induced by CLN3 (consistent with previous studies) and repressed by CLB2. Earlier studies demonstrated that repression of SCB-driven expression required CLB2, whereas repression of MCB-driven expression did not (Amon et al., 1993). Our results extend this by showing that CLB2 can repress MCB-driven expression, even though there may be additional repressive mechanisms. Many of the genes in this set also had the motif AARAARAAG, which is similar to a motif found in the CLN2 cluster (see above). However, because promoters generally are rich in such sequences, the significance of this motif is unclear.

The S and M Clusters

The histone cluster in Figure 4A forms the tightest cluster of any of the cell cycle genes. These nine genes have very high peak-to-trough ratios and give aggregate scores of ~10. The histones have three known modes of regulation: first, there are negative elements repressing transcription; second, there is an element in the 3' region of the mRNAs that destabilizes the message except during S phase; and third, there is a repeated positive element, which activates transcription (Freeman et al., 1992). Part of the core motif of the positive element is ATGCGAAR, which is similar to our degenerate SCB motif (ACRMSAAA). Consistent with this, histone expression is induced by GAL-CLN3. However it has been shown that the level and periodicity of HTA2/HTB2 mRNA accumulation are not noticeably affected by single mutation of SWI4, SWI6, or MBP1 (Lowndes et al., 1992; Cross et al., 1994). Additionally, histone levels are unaffected by GAL-CLB2. The sharpness of the peak in histone regulation is worth noting, both because it gives a good impression of the degree of synchronization and because the histones were the first genes for which periodic regulation was discovered (Hereford et al., 1981).


Figure 4.   The S and M clusters. The transcription profiles are displayed as described in the legend to Figure 1. (A) Histone cluster. The eight genes encoding histones and the yeast histone H1 homologue cluster very tightly and are expressed during S phase of the yeast cell cycle. (B) MET cluster. The expression of many of the members of the methionine pathway peaks just after the histones. (C) CLB2 cluster. A subcluster of genes that are expressed similarly to CLB2 highlights genes that peak during M phase.

The "MET" cluster (20 genes, Figure 4B) was completely unexpected. It contains 10 genes involved in the biosynthesis of methionine. Furthermore, two of the unnamed genes in this cluster show sequence similarity to human methionine synthetase, two are likely to be amino acid transporters (with unidentified specificities), one is similar to MET17, and one is on the opposite strand of MET2. Finally, ECM17, the only previously characterized gene in the cluster that is not known to be part of the methionine biosynthetic pathway, is similar to a sulfite redoxin from human. Thus, nearly all of the genes in this cluster are likely to be involved in methionine metabolism. Expression of the genes in this cluster peaks just after the histones, and at least some are inducible by CLN3. We searched the upstream region of the genes in the MET cluster and found that 15 of the genes had the consensus AAACTGTGG, which is identical to the consensus found for Met31/Met32 binding (Blaiseau et al., 1997).

The "CLB2" cluster (Figure 4C) contains 35 genes and includes many genes involved in mitosis such as CLB2, CDC5, CDC20, and SWI5. There are also many other less tightly clustered genes that appear to be regulated in a similar manner, including WSC4, PMP1, and the major plasma membrane proton pumps PMA1 and PMA2. The CLB2 cluster is highly regulated with a peak in M, and the genes are very strongly induced by GAL-CLB2, whereas GAL-CLN3 appears somewhat repressive. It was previously known that four of the genes found in this cluster, CLB1, CLB2, SWI5, and BUD4, are regulated by a combination of two transcription factors, Mcm1p and SFF (Althoefer et al., 1995; Sanders and Herskowitz, 1996). Mcm1p binds to the consensus TTACCNAATTNGGTAA (Acton et al., 1997), whereas, on the basis of three of these genes, SFF was thought to bind to the consensus sequence GTMAACAA. Furthermore, transcription of CLB1, CLB2, and SWI5 was known to be induced by Clb2p activity, possibly because of posttranslational activation of SFF (Amon et al., 1993). We compared the upstream regions of genes in the CLB2 cluster and certain other coregulated genes (e.g., ASE1, also thought to be a possible target of SFF [Pellman et al., 1995]) and found that most of them contain an easily recognizable MCM1 + SFF motif. Of the 35 genes in the cluster, only 9 genes (KIP2, MOB1, NUM1, YCL012W, BUD3, CHA1, YCL063W, YLR057W, and YML033W) did not have an easily recognizable near match to the MCM1 + SFF consensus. An alignment of the genes that contained this site can be viewed on our web site, and on the basis of this alignment, we deduce a new consensus for MCM1 + SFF binding, shown in Figure 5.


Figure 5.   The MCM1 + SFF consensus. By aligning promoter elements of several coregulated genes found in the CLB2 cluster (see our web site for the alignment), we developed a matrix for a new MCM1 + SFF consensus. The number of times each base was found at each position in the site was tallied and is displayed. The consensus was determined by examining the nucleotide frequencies at each position.
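The tallying step described in this legend can be sketched as follows. The rule used here to turn counts into a consensus letter (take the most frequent bases until they cover a chosen fraction of the aligned sites) and the 0.75 default are assumptions made for illustration; the legend states only that the consensus was determined by examining the nucleotide frequencies. The example sites are made up, chosen so that they reproduce the previously proposed SFF consensus GTMAACAA.

```python
from collections import Counter

# Standard codes for sets of bases, keyed by the sorted set of bases.
IUPAC_CODES = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "AG": "R", "CT": "Y", "CG": "S", "AT": "W", "GT": "K", "AC": "M",
    "CGT": "B", "AGT": "D", "ACT": "H", "ACG": "V", "ACGT": "N",
}

def count_matrix(sites):
    """Tally how often each base occurs at each position of aligned sites."""
    sites = [s.upper() for s in sites]
    length = len(sites[0])
    if any(len(s) != length for s in sites):
        raise ValueError("aligned sites must have equal length")
    return [Counter(s[i] for s in sites) for i in range(length)]

def consensus(matrix, min_fraction=0.75):
    """Derive a degenerate consensus from a count matrix.

    At each position the most frequent bases are accumulated until they
    account for at least min_fraction of the sites (a hypothetical rule).
    """
    letters = []
    for counts in matrix:
        total = sum(counts.values())
        chosen, covered = [], 0
        for base, n in counts.most_common():
            chosen.append(base)
            covered += n
            if covered >= min_fraction * total:
                break
        letters.append(IUPAC_CODES["".join(sorted(chosen))])
    return "".join(letters)

# Made-up aligned sites for illustration only.
example_sites = ["GTAAACAA", "GTCAACAA", "GTAAACAA", "GTCAACAA"]
example_consensus = consensus(count_matrix(example_sites))
```

Lowering min_fraction makes the consensus more specific, whereas raising it admits rarer bases and produces more degenerate letters.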

The M/G1 Clusters

The "MCM" cluster (Figure 6A) contains 34 genes, including all six MCM genes that are directly involved in DNA replication (MCM2, MCM3, CDC54, CDC46, MCM6, and CDC47; reviewed by Chevalier and Blow, 1996) as well as FAR1, DBF2, SPO12, and KIN3. These genes peak late in the cycle, at about the M/G1 boundary, and are induced by CLB2 and somewhat repressed by CLN3. This cluster has similarities to the CLB2 cluster, except that peak expression is slightly later. Searches of the upstream regions reveal that the majority of these genes contain binding sites for Mcm1p, as was previously shown for some members of the cluster (McInerny et al., 1997). Some, but not all, of these MCM1 sites have nearby sites for SFF (e.g., in FAR1, SPO12, KIN3, and CDC47), although these presumptive SFF sites are of varying quality. It has been suggested that some of the genes in this cluster are regulated through the "ECB," a variant of the Mcm1p binding site (McInerny et al., 1997).


Figure 6.   The M/G1 clusters. The transcription profiles are displayed as described in the legend to Figure 1. (A) MCM cluster. The MCM genes are involved in initiation of DNA replication and are coregulated during the M/G1 transition of the cell cycle. (B) SIC1 cluster. Twenty-seven genes that peak at the M/G1 boundary form a subcluster. (C) MAT cluster. This is a cluster of 13 coregulated genes expressed at the M/G1 boundary, many of which are involved in mating.

The "SIC1" cluster comprises 27 genes, including EGT2, PCL9, TEC1, ASH1, SIC1, and CTS1. These genes are strongly cell cycle regulated (Figure 6B) and peak in late M or at the M/G1 boundary. GAL-CLN3 may repress some of these genes, whereas GAL-CLB2 has no consistent effect on the expression of these genes. Several of these genes are known to be regulated by the transcription factor Swi5p, which itself is a member of the CLB2 cluster (Dohrmann et al., 1992; Bobola et al., 1996; Knapp et al., 1996). Swi5p is thought to bind to a site with the consensus ACCAGC (Knapp et al., 1996), and indeed, when we searched for common motifs in the 5' regions of the SIC1 cluster, we found the consensus RRCCAGCR in many of the 27 genes. When all cell cycle-regulated genes were examined for the presence of either the original Swi5p consensus, or this new extended consensus, the extended consensus was found to be much more specific for late M-phase genes. This comparison is shown on our web site. The motif GCSCRGC was also found in ~40% of the genes in this cluster.

The "MAT" cluster contains 13 genes and is shown in Figure 6C. Some of these genes (MFα1, MFα2, and STE3) are specific for MATα cells (Jarvis et al., 1988) and so are significantly expressed only in the cdc15 experiment, which was done with a MATα strain. Other genes in the cluster (KAR4, AGA1, SST2, and FUS1) are induced by α factor and so are very strongly expressed at the beginning of the α factor experiment. However, these four genes oscillate in the other experiments when no α factor is present. We found MCM1 binding sites in the upstream regions of several of these genes, including MFα1 and MFα2. Furthermore, as discussed below, we found MATα1, the transcription factor that cooperates with Mcm1p to induce α-specific genes, is itself cell cycle regulated, and this may largely explain the oscillation of the α-specific genes in this cluster.

Other Genes and Regulators

The nine clusters or near clusters summarized in Table 4 account for about half of the cell cycle-regulated genes. The remaining genes tend to be less strongly cell cycle regulated and cluster less tightly. We have attempted to find novel elements in the promoters of the remaining genes without great success. The best of these elements was the consensus GCAGNRNCCW, which we found in the upstream regions of CLB4, BUD3, CPR8, PRO2, YCL012W, YCL063W, YGL217C, YNL043C, YDR130C, and YOL030W; these genes appear to be moderately well coregulated (peak expression occurs in G2). There may be additional, novel, upstream elements that we are unable to find.

It is likely that many of the remaining genes are actually coregulated with members of the clusters we have described, and their transcription may be controlled by the same types of elements. Indeed, we know that some of the remaining genes have recognizable elements (e.g., MCBs and SCBs), whereas in other cases, the elements may be highly degenerate versions of the known elements. This may explain why the cell cycle regulation we observe is relatively weak and why the genes do not cluster tightly. Finally, mRNA levels could oscillate, not because of transcriptional control, but because of cell cycle control of mRNA stability; the histone mRNAs are controlled partly in this way (Wang et al., 1996).

For the clusters we have identified, some of the genes in the cluster do not contain an obvious element; for instance, nine of the genes in the CLB2 cluster do not contain an obvious MCM1 + SFF site. We do not know whether these genes contain cryptic, degenerate sites that our algorithms fail to recognize, or whether these genes are regulated by an unknown factor.

The Functions of the Cell Cycle-regulated Genes

The major functions of the cell cycle-regulated genes we identified are cell cycle control, DNA replication, DNA repair, budding, glycosylation, nuclear division and mitosis, structure of the cytoskeleton, and mating. In Figure 7 we arrange 294 named genes in our set, according to both a functional class and the phase group to which they belong.


Figure 7.   Cell cycle-regulated genes with characterized functions. Two hundred ninety-seven of the cell cycle-regulated genes are grouped by both function and phase of peak expression. Several functional groups are split into subgroups, which reflect the nature of the function. Those highlighted in red were previously known to be cell cycle regulated. Many functional categories display strong biases toward gene expression during particular intervals of the cell cycle. Obvious examples include genes involved in DNA synthesis and DNA repair (G1), mating (M, M/G1, and G1), chromatin structure (G1 and S), and methionine biosynthesis (S and G2).

DNA Replication, Repair, and Chromosome Assembly

It is instructive to look at the pattern of expression of genes involved in a particular process. For instance, we can trace the expression of many genes somehow involved in DNA replication (as shown in Figure 7). Of the genes that peak in G1 there are 23 genes with known functions in DNA replication. These genes include subunits of the DNA polymerases and their accessory factors (e.g. CDC2, POL1, and POL2), genes involved in nucleotide synthesis (e.g. CDC21), and genes involved in initiation of DNA synthesis (e.g. CDC45). Many genes involved in DNA repair such as PMS1 and MSH2 reach peak expression in G1 phase, suggesting that repair of DNA lesions may be a normal part of S phase.

Later, when S phase is actually occurring, the histone genes reach peak expression. In late M phase or M/G1 all six MCM genes important for prereplicative complex formation (MCM2, MCM3, CDC46, CDC47, MCM6, and CDC54) and CDC6 reach their peaks, presumably to help set up origins for the next cell cycle. Thus, many genes needed for replication and repair reach peak expression just before they are needed, the histones peak exactly at the time they are needed, and a few genes important for regulation of DNA synthesis peak well in advance of the next round of S phase. Only two known initiator genes, CDC45 and DBF4 (which we did not identify in our analysis; see below), peak just before S phase, suggesting these may be particularly important to trigger replication.

Bud Initiation and Bud Growth

Budding is a major metabolic activity for the cell and involves several subprocesses. The cell must choose a site for the new bud (initiation) and make components for an ever-increasing surface area consisting of a new cell membrane (which requires lipids and integral membrane proteins) and a new cell wall (composed largely of glucan, chitin, and mannoproteins). All of these processes require delivery of components, via the secretory apparatus, to the sites of new membrane and cell wall synthesis, which, in normal conditions, occurs exclusively in the bud (Kaiser et al., 1997; for reviews, see Lew et al., 1997; Orlean, 1997).

We found 17 genes that are involved in bud site selection and cell polarization (e.g., BUD3, BUD4, BUD8, BUD9, BEM1, GIC1, MSB1, and MSB2). As indicated in Figure 7, none of these genes had been reported to be cell cycle regulated. Some of these (BUD9, CDC10, and RSR1) show peak expression in G1, consistent with roles in bud initiation. Others (BUD4, BUD8, and BEM1) peak in M phase, suggesting roles in the following cell cycle, i.e., earlier in the budding pathway than the G1 group. We also identified many genes needed for secretion, glycosylation (needed for making mannoproteins), synthesis of lipids, and cell wall synthesis.

Cell Division and Mitosis

Another fundamental process of cell division, in which a large number of the genes involved have their messages regulated by the cell cycle, is the process of mitosis (for review of microtubule-related topics, see Botstein et al., 1997). During the cell cycle many events occur that allow mitosis to progress in a timely manner. This process begins in G1 when the spindle pole body (SPB) replicates. To facilitate this process six known components of the SPB reach peak expression in G1 (CNM67, NUF1, SPC42, SPC97, SPC98, and TUB4), one (SPC34) peaks during S, and one (NUF2) peaks during M phase. Some of these genes were already known to be cell cycle regulated (NUF1 and SPC42) (Kilmartin et al., 1993; Donaldson and Kilmartin, 1996).

Once the mitotic program is entered the cell must create a spindle, which is responsible for moving the nucleus to the bud neck so that nuclear division can occur. This process requires microtubules and many accessory proteins (to form the spindle) as well as kinesins (for movements of the nucleus and the SPB). These genes reach peak expression largely during the first half of the cell cycle. In G1 BIM1, BUB1, IPL1, KAR3, and SLK19 reach peak expression, and during S, five genes (CIN8, KAR9, KIP1, STU2, and VIK1) peak. Five genes peak during G2 (BUB2, CIK1, KIP2, KIP3, and NUM1), as well as the major β tubulin TUB2. Finally, one gene (ASE1) reaches peak expression during M.

It was somewhat unexpected that tubulin messages would be regulated by the cell cycle; unfortunately the microarrays that we used for the α factor and elutriation experiments did not contain DNA complementary to either the major (TUB1) or minor (TUB3) α tubulins. Our data set suggested that TUB1 might be cell cycle regulated because it had a score just below our cutoff. We wished to verify that the major tubulins were regulated in the cell cycle by an independent method (quantitative real-time PCR [Heid et al., 1996]). This method allows determinations of relative mRNA levels with excellent reproducibility. We performed the analysis as detailed in MATERIALS AND METHODS with the result that, as we suspected, TUB1 and TUB2 are moderately cell cycle regulated, but TUB3 appears less so (Figure 8). This suggests that the low score for TUB1 may have been caused by its absence from some of the arrays. It should be noted that TUB2 with a score of 2.33 is clearly above the threshold we set for cell cycle regulation, but that TUB1 with a score of 1.25 is just below, and TUB3 (score 0.53) is considerably below the threshold. Comparison with Figure 8 illustrates the point that a score near the threshold can be the result either of inadequate data or weak regulation.


Figure 8.   Tubulin message levels. The mRNA levels for TUB1, TUB2, and TUB3, relative to those of PPA1, were determined during synchronous division after release from an α factor arrest, using the TAQman assay as described in MATERIALS AND METHODS.
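Expressing one message relative to a reference message from real-time PCR data is commonly done from threshold cycle (Ct) values. The sketch below assumes ideal amplification, in which product doubles every cycle, so that each cycle of difference corresponds to a twofold difference in starting template. The function name, the efficiency parameter, and the Ct values in the comments are hypothetical, and the calculation actually used for this figure is the one given in MATERIALS AND METHODS, not this one.

```python
def relative_level(ct_target, ct_reference, efficiency=2.0):
    """Abundance of a target message relative to a reference message.

    ct_target, ct_reference -- threshold cycles for the two messages
    efficiency              -- fold amplification per cycle (assumed 2.0)
    A target that crosses the threshold two cycles later than the
    reference is taken to be 1/4 as abundant.
    """
    return efficiency ** (ct_reference - ct_target)

def normalize_to_mean(levels):
    """Scale a time course so that its mean is 1, for comparing genes."""
    mean = sum(levels) / len(levels)
    return [value / mean for value in levels]
```

Normalizing each time course to its own mean puts messages of very different abundance on a common scale, so that the size of the oscillation can be compared across genes.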

One relatively small class of genes that displays tight temporal regulation is a group of genes involved in chromatid cohesion. Five of these genes (SMC1, SMC3, MCD1, PDS1, and PDS5 [Strunnikov et al., 1993; Yamamoto et al., 1996; Guacci et al., 1997; Michaelis et al., 1997]) have peak expression during G1 just before the next round of DNA synthesis.

At the end of the cell cycle the cell must exit mitosis so that the next round of division can occur. To do this, a system of proteins acts to inhibit the activity of Clb-Cdc28p. One of these proteins is Sic1p, whose expression is known to peak at this time (Donovan et al., 1994). Many of the proteins that inhibit Clb-Cdc28p or prepare the cell to exit from mitosis are known to be cell cycle regulated and peak in M phase. We also find that DBF20 (which is functionally related to DBF2) is cell cycle regulated and peaks in G2.

Mating

At least 19 genes directly involved in mating are cell cycle regulated. These include both mating pheromones (a-factor and α-factor) and, perhaps most interestingly, include the central mating-type transcription factor MATα1 itself. MATα1 binds to DNA in cooperation with Mcm1 (Sengupta and Cochran, 1991) and induces expression of α-specific genes. It was previously shown that some genes involved in mating were cell cycle regulated, and this regulation was shown to be due to cooperative binding between Mcm1 and Ste12. The fact that the MATα1 transcription factor itself oscillates provides yet another mechanism by which genes involved in mating might be cell cycle regulated. We found Mcm1 sites in the upstream regions of several of these genes, including MATα1. The regulation of genes involved in mating is clearly complex, and several transcription factors are involved. However, it seems that most of these transcription factors cooperate in one way or another with Mcm1. The fact that so many mating functions are cell cycle regulated, including an α-specific transcription factor, helps explain the deep connection between mating, start, and the cell cycle. For instance, if genes involved in mating are turned off at start by multiple mechanisms, it helps explain how passage through start precludes mating.

Cell Cycle Control Genes

Of the 19 genes involved in cell cycle control that we identified, 17 were already known to be cell cycle regulated. This set mainly includes cyclins and transcription factors, whose activities and times of action are well documented (see Koch and Nasmyth, 1994; Andrews and Measday, 1998). The only two cell cycle control genes that we newly identified as regulated were WHI3 and HSL7.

Methionine Biosynthesis

It was unexpected that many genes involved in methionine biosynthesis are cell cycle regulated. A number of possible explanations suggest themselves. First, the pool of available cellular methionine is smaller than that of virtually any other amino acid; thus, methionine is likely to be limiting (Jones and Fink, 1982). Indeed, Unger and Hartwell (1976) noted that starvation for sulfur or for methionine effectively causes G1 arrest, suggesting that cell cycle progression is particularly sensitive to the availability of methionine. They also found that a temperature-sensitive allele of methionine tRNA synthetase causes G1 arrest, even in the presence of methionine. These observations suggest that the cell cycle regulation of methionine genes ensures that this biosynthetic pathway has sufficient capacity to support protein synthesis in the next cell cycle; if there are insufficient resources, G1 arrest ensues.

It is known that the more than 20 genes that constitute the sulfur amino acid biosynthesis pathway are coordinately regulated at the level of transcription. This transcription is repressed in response to an increase in the intracellular concentration of S-adenosylmethionine, an end product of the pathway (methionyl tRNA is another end product) (Thomas et al., 1989). A second possibility, therefore, is that S-adenosylmethionine is depleted as cells enter S phase, causing derepression of these genes, which results in cell cycle regulation.

A third possibility is that the protein that actually represses these genes, Met30p,