REVIEW
Gene Center, University of Munich (LMU), Department of Chemistry and Biochemistry, 81377 Munich, Germany
| Abstract |
|---|
[Keywords: RNA polymerase II; gene transcription; nuclear coupling; C-terminal repeat domain; kinase; phosphatase]
| Free CTD structure |
|---|
α-helix that binds the polymerase subunit Rpb7, which together with Rpb4 constitutes a subcomplex that protrudes from the enzyme (Armache et al. 2005).
Current evidence suggests that the free CTD is largely flexible, although it shows some residual structure and a tendency to form β-turns. The CTD contains two SPXX motifs (S2-P3-T4-S5, S5-P6-S7-Y1), which were proposed to form β-turn structures stabilized by two hydrogen bonds (Suzuki 1989). NMR studies of a single CTD consensus repeat peptide revealed such β-turns for the S2-P3-T4-S5 motif, but also unfolded forms (Harding 1992). NMR and circular dichroism of a CTD peptide with eight repeats showed a small population of β-turns at both SPXX motifs in water, which increased in trifluoroethanol, a hydrogen-bond-promoting solvent (Cagas and Corden 1995). Another study also detected a low population of turn structures, which strongly increased with single amino acid mutations (Dobbins et al. 1996). A further study of an eight-repeat CTD peptide detected a content of 15% polyproline helix, and below 10% β-turn structure in water, which increases to 75% in 90% trifluoroethanol (Bienkiewicz et al. 2000). Cyclic model peptides contain a larger content of β-turns than linear CTD peptides, with the turn at S2-P3-T4-S5 being more stable than at S5-P6-S7-Y1 (Kumaki et al. 2001). Recent studies of a two-repeat CTD peptide with a central phosphorylated S2 (pS2) residue, however, revealed a dynamic disordered ensemble (Noble et al. 2005). The largely disordered nature of the free CTD apparently allows for many different interactions with target proteins via an induced fit mechanism.
In an extended β-strand conformation, the length of the yeast CTD and linker would be ~650 Å and 250 Å, respectively (Fig. 1; Cramer et al. 2001). Thus, the CTD could in principle reach anywhere on the surface of Pol II, which is ~150 Å in diameter. The CTD is, however, most likely compact, at least in its unphosphorylated state. Electron micrographs of Pol II revealed a weak density that was attributed to the CTD and measured only ~100 Å (Meredith et al. 1996). Pol II crystals contain a limited space adjacent to the linker that could harbor a compact CTD (Cramer et al. 2001). Electrophoretic analysis, gel filtration, and sucrose gradients revealed that phosphorylation of the CTD results in a far more extended and more protease-sensitive structure (Laybourn and Dahmus 1989; Zhang and Corden 1991). One possible form of a compact CTD is a random coil (Cramer et al. 2001). Alternatively, unusual "β-spiral" models were proposed that account for the equivalence of CTD repeats in NMR studies (Matsushima et al. 1990; Suzuki 1990; Cagas and Corden 1995). One type of β-spiral consists of a series of staggered overlapping β-turns, two per CTD repeat, and an early model of this type had a length of 280 Å for the yeast CTD ("loose spiral") (Fig. 1; Cagas and Corden 1995). A more compact β-spiral, only 100 Å long, and assuming only one β-turn per repeat, was suggested based on a crystal structure of a CTD peptide bound to a CTD-binding domain ("compact spiral") (Fig. 1; Meinhart and Cramer 2004). Although it is unlikely that the CTD adopts a uniform repetitive conformation, portions of the CTD may adopt a spiral form, leading to an overall compaction. Upon extensive phosphorylation, compact forms of the CTD would become more extended due to charge repulsions.
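The extended-chain estimate above can be reproduced with simple arithmetic. The following is a minimal sketch, not taken from the cited work: it assumes 26 heptapeptide repeats for the yeast CTD and a rise of about 3.5 Å per residue for a fully extended chain. Both values, and the function name `extended_length`, are illustrative assumptions.

```python
RESIDUES_PER_REPEAT = 7      # the CTD heptapeptide repeat
YEAST_REPEATS = 26           # assumption: repeat count of the yeast CTD
RISE_PER_RESIDUE_A = 3.5     # assumption: rise per residue (Å) in an extended strand


def extended_length(n_repeats, rise=RISE_PER_RESIDUE_A):
    """Return the approximate end-to-end length (Å) of a fully extended CTD."""
    return n_repeats * RESIDUES_PER_REPEAT * rise


length = extended_length(YEAST_REPEATS)
print(f"Extended yeast CTD: about {length:.0f} Å")
```

Under these assumptions the estimate comes to about 640 Å, close to the ~650 Å quoted above and roughly four times the ~150 Å diameter of Pol II.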
| CTD modification |
|---|
A CTD code could consist of 16 different states of the CTD repeat, which result from the combination of four different phosphorylation states with four possible proline configurations (residues P3 and P6 can each be in cis or trans conformation) (Buratowski 2003
). The situation, however, becomes much more complicated when one considers how a CTD code may be read, in other words, how the different states of the CTD are recognized by proteins. First, proteins can bind more or less than one CTD repeat, and thus a single repeat is generally not the functional unit of the CTD. Indeed, genetics indicates that heptapeptide pairs are the functional units of the CTD (Stiller and Cook 2004
). Second, it is unclear how the action of prolyl isomerases on the CTD can change specificity for CTD-binding proteins, since these enzymes only accelerate cis/trans isomerization of proline residues, in contrast to kinases, which set a phosphate mark in the CTD, or phosphatases, which remove the phosphate mark. However, the prolyl isomerase Pin1 can influence the phosphorylation pattern of the CTD in vitro, and the hypophosphorylated Pol II accumulates in pin1-/- cells (Xu et al. 2003
). Third, the CTD code may be extended by phosphorylation at residue Y1 (Baskaran et al. 1993
, 1997
), and by possible CTD glycosylation (Kelly et al. 1993
). Fourth, the 52nd repeat in the human CTD contains a phosphorylated casein kinase II site (Chapman et al. 2004
). Finally, a short motif at the C terminus of the CTD, outside of the repeats, is also important for the stability and the function of the CTD (Fong et al. 2003
; Chapman et al. 2004
).
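The count of 16 repeat states can be checked by enumeration. The sketch below assumes that the four phosphorylation states are the combinations of phosphorylated or unphosphorylated S2 and S5. The names used (`repeat_states`, `PHOSPHO_SITES`, `PROLINES`) are illustrative only.

```python
from itertools import product

# Assumption: the four phosphorylation states are the on/off combinations of S2 and S5.
PHOSPHO_SITES = ("S2", "S5")
# Prolines P3 and P6 can each be in cis or trans conformation.
PROLINES = ("P3", "P6")


def repeat_states():
    """Enumerate hypothetical CTD repeat states as (phosphorylation, isomer) pairs."""
    phospho = list(product((False, True), repeat=len(PHOSPHO_SITES)))
    isomers = list(product(("cis", "trans"), repeat=len(PROLINES)))
    return [
        (dict(zip(PHOSPHO_SITES, p)), dict(zip(PROLINES, i)))
        for p, i in product(phospho, isomers)
    ]


states = repeat_states()
print(len(states))  # 4 phosphorylation states x 4 proline configurations
```

Each additional binary mark on the repeat, such as the Y1 phosphorylation mentioned above, would double the number of states in this simple counting scheme.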
| Architecture of CTD-binding domains |
|---|
α/β-fold (Fig. 2; Fabrega et al. 2003
β-sheet (Ranganathan et al. 1997
α-helices in a right-handed superhelical arrangement (Fig. 2; Meinhart and Cramer 2004
β-propeller repeats (Dichtl et al. 2002b
α-helical.
| CTD recognition |
|---|
The structure of the S5-phosphorylated CTD peptide bound to Cgt1 revealed a largely extended CTD conformation (Fabrega et al. 2003). Cgt1 can bind almost three CTD repeats. The central CTD repeat is partially looped out, forms a turn-like structure, and makes few interactions with the protein. The two flanking repeats are in an extended β-strand-like conformation (Fig. 2). Hydrophobic interactions involve the proline residues of the CTD, and the tyrosine residue Y1, which also forms a hydrogen bond with a Cgt1 residue via its phenolic hydroxyl group. This interaction is incompatible with Y1 phosphorylation. Cgt1 binds the phosphate groups of phosphoserine-5 (pS5) residues in the two extended flanking repeats, but not the one in the central repeat. An important implication of this structure is that these domains may bind two remote stretches of the CTD ("bivalent" or "bipartite" recognition), thereby looping out the intervening sequence. Such looping could facilitate formation of turned structures within the CTD, as observed for free cyclic CTD peptides (Kumaki et al. 2001). Looped structures induced by CTD binding of one factor could be recognized by another CTD-binding factor, possibly contributing to sequential factor binding.
The structure of a doubly S2/S5-phosphorylated CTD peptide bound to the Pin1 WW domain also revealed an extended CTD conformation (Verdecia et al. 2000). The P3-T4-pS5-P6 motif of the CTD peptide interacts with the protein. The two prolines are involved in van der Waals interactions with hydrophobic groups of Pin1, and two hydrogen bonds are formed with the peptide backbone. The WW domain of Pin1 binds the pS5 phosphate group with several hydrogen bonds, and one hydrogen bond is formed with the pS2 phosphate (Verdecia et al. 2000). In binding assays, the protein is not specific for S2 or S5 phosphorylation (Myers et al. 2001).
In the structure of an S2-phosphorylated CTD peptide bound to the CID domain of Pcf11, the CTD motif S2-P3-T4-S5 forms a β-turn, whereas the flanking residues are in an extended conformation (Fig. 2; Meinhart and Cramer 2004). Hydrogen bonds are formed between the CTD and the CID domain, and CTD residues Y1 and P3 bind to hydrophobic patches. The phenolic hydroxyl of Y1 forms a hydrogen bond with a conserved aspartate that is required for normal yeast growth (Sadowski et al. 2003). The side chain of CTD residue S5 is exposed, consistent with the observation that S5 phosphorylation does not influence binding of the CID domain in SCAF8 (Patturajan et al. 1998b). The pS2 phosphate group also does not contact the CID domain, consistent with the ability of Pcf11 to bind the unphosphorylated CTD. However, S2 phosphorylation strongly enhances the affinity of the CTD for the CID domain (Licatalosi et al. 2002), suggesting that the pS2 phosphate group is recognized indirectly (Meinhart and Cramer 2004). The pS2 phosphate forms an intramolecular hydrogen bond with the T4 side chain, which was proposed to stabilize the β-turn conformation (Meinhart and Cramer 2004).
A recent NMR study of the CTD-Pcf11 interaction (Noble et al. 2005
) generally confirmed the crystallographic results (Meinhart and Cramer 2004
), and found additionally that the pS2-T4 hydrogen bond in the CTD-CID complex is not present in the free CTD, suggesting that the CTD turn conformation observed in the complex results from induced fit. Although the apparent indirect recognition of the CTD pS2 phosphate is not fully understood, an advantage of such indirect recognition was suggested (Meinhart and Cramer 2004
). Exposure of a CTD phosphate group could be important for its accessibility to a phosphatase, which could remove the phosphate, lower the binding affinity, and trigger CTD dissociation from CTD-binding domains (Meinhart and Cramer 2004
). Although sequence conservation suggests that all CIDs share the same fold, there are apparently differences in the details of CTD recognition by CID domains. The CID domain in the recently characterized factor Rtt103, which is involved in transcription termination (Kim et al. 2004a
), contains a CTD-binding pocket that differs in several amino acid positions from that of Pcf11.
Taken together, three different CTD conformations are observed in the three known CTD peptide-protein complex structures, reflecting the structurally versatile nature of the CTD and indicating that an induced fit to the target surface plays an important role in CTD recognition. Comparison of the structures and CTD conformations suggests that there is no simple structural basis for a possible CTD code. However, a few underlying principles of CTD recognition can be extracted. CTD peptides adopt extended β-strand-like, or β-turn conformations (Fig. 2B). In all three structures, the P3 side chain docks into a hydrophobic pocket of the target protein. In two structures (CID, Cgt1), the Y1 side chain is bound through hydrophobic interactions and via a hydrogen bond to its phenolic hydroxyl, incompatible with Y1 phosphorylation. Phosphorylated serines can be recognized directly, by interactions with the phosphate group, or indirectly, by a mechanism that is not yet clear. In all complex structures, prolines P3 and P6 are in trans conformation, and their isomerization to cis would likely impair binding. Since a CTD peptide exists as a mixture of cis and trans populations in solution (Noble et al. 2005), the target domains apparently select the trans isomer.
| CTD-Mediator interaction |
|---|
The yeast Mediator comprises up to 25 subunits; 11 are essential and 22 are at least partially conserved in sequence among eukaryotes (Boube et al. 2002
; Bourbon et al. 2004
). Nine of the Mediator subunits were identified in a genetic screen for suppressors of CTD truncation mutants (Thompson et al. 1993
; Hengartner et al. 1995
). According to biochemical and genetic studies, Mediator consists of three distinct submodules (Kang et al. 2001
), which may correspond to density lobes observed by electron microscopy, termed head, middle, and tail, respectively (Dotson et al. 2000
). The CTD may bind between the head and middle modules, since recombinant head and middle modules independently bind to the CTD (Kang et al. 2001
). The middle module is the most conserved module and includes the MED7/MED21 heterodimer. The recent structure of the MED7/MED21 heterodimer revealed a novel, very extended helical fold, and a flexible hinge (Baumli et al. 2005
) that may partially account for changes in the overall Mediator structure upon binding to Pol II (Asturias et al. 1999
; Davis et al. 2002
; Naar et al. 2002
) or to activators (Taatjes et al. 2002
).
Larger isoforms of Mediator include a module that contains a CTD kinase, and can act independently of the CTD. For example, the Mediator-like complex SMCC from human cells does not require the CTD for activation (Gu et al. 1999
). The repressive function of the Mediator-like human complex NAT is also independent of the CTD (Sun et al. 1998
). The large inactive human Mediator ARC-L does not interact with the CTD (Naar et al. 2002
). Consistent with these observations, electron microscopic images of the yeast Pol II-Mediator complex suggest that the CTD is not the sole point of contact between Mediator and the polymerase, but that there are multiple interaction sites (Asturias et al. 1999
). In addition, Mediator binds to several general transcription factors involved in initiation (Kang et al. 2001
; Park et al. 2001
).
| CTD kinases |
|---|
The CDK7/cyclin H pair associates with the RING finger protein MAT1 to form the CDK-activating kinase (CAK), which phosphorylates and activates other CDKs involved in cell cycle regulation (Harper and Elledge 1998
; Kaldis 1999
). CAK also forms a subcomplex of the 10-subunit general transcription factor TFIIH, which phosphorylates the CTD at S5 during transcription initiation (Coin and Egly 1998
). Electron microscopy showed that the CAK complex protrudes from the ring-like structure of TFIIH (Schultz et al. 2000
). The yeast CDK7 homolog Kin28 is essential for viability, required for normal transcript levels in vivo, and is the primary kinase responsible for CTD phosphorylation during transcription initiation (Valay et al. 1995
; Holstege et al. 1998
; Komarnitsky et al. 2000
; Schroeder et al. 2000
; Liu et al. 2004
). TFIIH kinase activity is enhanced by Mediator during initiation, driving the transition to elongation (Guidi et al. 2004
), and facilitating recruitment of RNA processing factors (Rodriguez et al. 2000
).
The CDK8/cyclin C pair (Srb10/Srb11 in yeast) associates with MED12 (Srb8) and MED13 (Srb9), to form a fourth module of the Mediator that is present in a sub-population of Mediator complexes. This Mediator module phosphorylates the CTD, is conserved among eukaryotes, and is a target of signal transduction pathways (Liu et al. 2001
; Borggrefe et al. 2002
; Boube et al. 2002
; Samuelsen et al. 2003
). The CDK8/cyclin C pair is thought to be mainly implicated in transcriptional repression (Hengartner et al. 1998
). One model for repression is that CDK8 phosphorylates the CTD prematurely, thereby preventing formation of a transcription initiation complex (Hengartner et al. 1998
). Human CDK8/cyclin C can also repress CDK7 activity by phosphorylating cyclin H (Akoulitchev et al. 2000
). CDK8 further phosphorylates some gene-specific transcription factors, thereby decreasing their stability (Chi et al. 2001
; Nelson et al. 2003
). On the other hand, CDK8 can also have a positive effect on transcription. CDK8-dependent phosphorylation of the transcription factor Sip4 can stimulate transcription (Vincent et al. 2001
). A positive effect also results from ATP-dependent dissociation of preinitiation complexes, triggered by CDK8 (Liu et al. 2004
). CDK8-dependent phosphorylation of the Mediator subunit MED2 also has a positive effect on transcription (Hallberg et al. 2004
).
The CDK9/cyclin T pair forms the core of the positive transcription elongation factor P-TEFb (Price 2000
). The originally identified P-TEFb consists of CDK9 and one of the cyclin T isoforms T1, T2, or K (Peng et al. 1998
). A larger P-TEFb complex with reduced activity contains additionally the small nuclear RNA 7SK and the HEXIM protein (Nguyen et al. 2001
; Yang et al. 2001
; Michels et al. 2003
; Yik et al. 2003
). P-TEFb was isolated by its ability to overcome arrest of Pol II complexes during early elongation, a function that requires the CTD (Marshall and Price 1995
; Marshall et al. 1996
). P-TEFb phosphorylates the elongation factor DSIF on its Spt5 subunit and counteracts the negative effect of DSIF and its cofactor NELF during early elongation (Wada et al. 1998
; Yamaguchi et al. 1998
). There are two putative homologs of CDK9 in Saccharomyces cerevisiae, Ctk1 and Bur1 (Prelich and Winston 1993
; Murray et al. 2001
; Prelich 2002
; Guo and Stiller 2004
). Ctk1 associates with its cyclin partner Ctk2 and a third subunit, Ctk3, to form the CTDK1 complex. Bur1 associates with the cyclin Bur2. Chromatin immunoprecipitation and genetic experiments suggest that Ctk1 and Bur1 play nonoverlapping roles in transcription elongation (Yao et al. 2000
; E.J. Cho et al. 2001
; Yao and Prelich 2002
; Keogh et al. 2003
). It is possible that in yeast Ctk1 and Bur1 phosphorylate the CTD and Spt5, respectively, which are both substrates of P-TEFb (Keogh et al. 2003
). During stress response the CTD can also be phosphorylated at S5 by ERK kinases (Bonnet et al. 1999
).
| Kinase structure and specificity |
|---|
Despite the recent structural studies, the basis for specificity of a kinase for the CTD and for recognition of a particular CTD residue remains enigmatic, because no structures of CDK/CTD complexes are known. Compared with other CDK structures, the activation segment of CDK7 is in a different conformation, which may help in determining substrate specificity (Russo et al. 1996
; Lolli et al. 2004
). Compared with CDK7, CDK8 has three additional residues in the activation segment, and a nine-residue insertion near the activation segment, which could play a role in defining substrate specificity (S. Hoeppner, S. Baumli, and P. Cramer, unpubl.).
Kinase specificity for the CTD may not only be achieved by CTD recognition at the kinase active site, but also by CTD binding to kinase-associated factors. CDK7 specificity for the CTD is influenced by its binding to MAT1 (Yankulov and Bentley 1997
; Larochelle et al. 1998
), but it is unclear how MAT1 accomplishes this function, although the NMR structure of its RING finger domain is known (Gervais et al. 2001
). The CDK7-containing CAK complex targets other CDKs, but TFIIH, which includes the CAK complex, has a strong preference for the CTD as a substrate (Rossignol et al. 1997
; Yankulov and Bentley 1997
). CDK7 CTD specificity is highest in the context of a transcription initiation complex (Lu et al. 1992
; Watanabe et al. 2000
), and the preference of TFIIH for S5 phosphorylation is enforced by TFIIE (Yamamoto et al. 2001
). Cyclin C has a highly conserved surface depression that may bind substrates near the active site of CDK8 (S. Hoeppner, S. Baumli, and P. Cramer, unpubl.). A similar mechanism is established for cyclin A, which has a conserved surface patch that binds kinase substrates (Schulman et al. 1998
; Kontopidis et al. 2003
). Cyclin T binds the CTD via a histidine-rich stretch in its C-terminal domain (Taube et al. 2002
; Kurosu et al. 2004
). A recent study suggests that the cyclins generally act as adaptors to render a CDK specific for a substrate (Loog and Morgan 2005
). The HIV Tat protein shifts CDK9 phosphorylation preference from S2 to both S2 and S5 (Zhou et al. 2000
). Noncanonical phosphorylation of the CTD at Y1 by the Abl kinase involves CTD binding to an Abl domain distinct from the kinase domain (Baskaran et al. 1997
).
An open question is the activation mechanism of the CTD-targeting CDKs. CDKs involved in cell cycle regulation are generally activated in two steps: cyclin binding, and phosphorylation of a conserved threonine in the CDK activation segment (T160 in human CDK2) (Pavletich 1999
). Interaction of the phosphothreonine side chain with three conserved arginines triggers a conformational change that results in full kinase activation (Russo et al. 1996
). CDK7 and CDK9 carry a threonine or a serine at the phosphorylated position. In the free CDK7 structure, the phosphorylated threonine, however, is found at a different location than in CDK2 (Lolli et al. 2004
), and does not contact the three conserved arginines, pointing to a different mechanism of CDK activation. Also, CDK8 does not have a threonine or serine residue at the position phosphorylated in other CDKs (Tassan et al. 1995
). A conserved aspartate in CDK8 or a glutamate in cyclin C could, however, mimic a phosphothreonine (S. Hoeppner, S. Baumli, and P. Cramer, unpubl.).
| CTD phosphatases of the Fcp1 family |
|---|
The high-resolution structure of the catalytic FCPH domain of Scp1 (Kamenski et al. 2004
) revealed a core fold with a central parallel β-sheet (Fig. 4; Kamenski et al. 2004
). The fold is similar to that of other enzymes of the DXDX(T/V) superfamily (Wang et al. 2001
; Lahiri et al. 2003
), although they share no sequence similarity outside the signature motif. The signature motif is part of a central depression that forms the active site and binds a metal ion. Catalysis involves the metal-assisted phosphorylation of the first aspartate in the DXDX(T/V) motif. Magnesium ions are essential for Fcp1 and Scp1 activity, and the trifluoroberyllate anion inhibits activity by forming a stable tetrahedral adduct with the catalytic aspartate side chain, mimicking a labile phosphoaspartyl intermediate (Fig. 4; Kamenski et al. 2004
). This mechanism is consistent with biochemical data (Hausmann and Shuman 2003
), and corresponds to that of other DXDX(T/V) superfamily enzymes, which also use the N-terminal aspartate in the signature motif as a phosphoryl acceptor (H. Cho et al. 2001
; Lahiri et al. 2003
). Consistently, mutation of this aspartate in Scp1 or Fcp1 to alanine abolished activity (Kobor et al. 1999
; Hausmann and Shuman 2002
; Kamenski et al. 2004
). The second aspartate in the DXDX(T/V) motif contributes to metal ion binding, and may act as a general acid/base.
Whereas the catalytic mechanism of Fcp1/Scp1 phosphatases is well understood, the basis for their CTD specificity remains to be fully established. Specificity for the CTD may to some extent be explained by recruitment of the enzymes to the CTD. Fcp1 binds to a docking site on Pol II outside the CTD (Chambers et al. 1995
) that includes the Pol II subcomplex Rpb4/7 (Kimura et al. 2002
; Kamenski et al. 2004
). Rpb4/7 is located directly adjacent to the polymerase linker to the CTD (Armache et al. 2003
; Bushnell and Kornberg 2003
; Armache et al. 2005
). In addition, Fcp1 binds the phosphorylated CTD via its BRCT domain (Yu et al. 2003
), and binds the polymerase-associated general transcription factor TFIIF, which stimulates Fcp1 activity (Chambers et al. 1995
; Archambault et al. 1998
; Kamada et al. 2003
; Nguyen et al. 2003
).
Specificity of the CTD phosphatases toward CTD phosphorylation sites requires recognition of CTD residues around the phosphorylated target side chain. Indeed, Fcp1 activity requires several CTD residues flanking the phosphoserine, and single alanine mutations of the flanking Y1 and P3 decrease activity (Hausmann et al. 2004
). Fcp1 and Scp1 were reported to dephosphorylate S5 and S2 (Hausmann and Shuman 2002
; Lin et al. 2002a
; Yeo et al. 2003
). Highly purified Fcp1 was recently shown to dephosphorylate S5, but not S2 (Kong et al. 2005
). Since both serines are flanked on the C-terminal side by a proline residue, CTD phosphatases may bind the adjacent prolines P3 or P6, and preferential phosphoserine dephosphorylation may be achieved by binding to other nearby residues. Indeed, the P3 side chain binds to a hydrophobic pocket in the known CTD peptide complex structures (Verdecia et al. 2000
; Fabrega et al. 2003
; Meinhart and Cramer 2004
), and specific recognition of a flanking proline is consistent with Fcp1 inhibition by the prolyl isomerase Pin1 (Xu et al. 2003
). Plant CPLs were also shown to specifically dephosphorylate S5 (Koiwa et al. 2004
), and therefore most likely also recognize flanking residues.
| CTD phosphatase Ssu72 |
|---|
Two groups independently reported that Ssu72 has phosphatase activity, and speculated that it may target the CTD (Ganem et al. 2003
; Meinhart et al. 2003
). Ssu72 cleaves a nonspecific phosphatase substrate, and its sequence contains the CX5R signature motif of protein tyrosine phosphatases (PTPases) (Ganem et al. 2003
; Meinhart et al. 2003
). Mutation of the cysteine in this signature motif abolishes Ssu72 activity in vitro (Meinhart et al. 2003
), and confers lethality in vivo (Sun and Hampsey 1996
). In PTPases, the conserved cysteine and arginine residues of the signature motif form part of the active site (Ramponi and Stefani 1997
). Whereas the cysteine attacks the substrate phosphorus atom, leading to formation of a phosphocysteinyl intermediate, the arginine stabilizes the transition state (Burke and Zhang 1998
). Although there is no apparent sequence homology between Ssu72 and PTPases outside the signature motif, secondary structure prediction suggested that Ssu72 adopts the fold of