REVIEW
1 Cancer Biology and Genetics Program, Howard Hughes Medical Institute, Memorial Sloan-Kettering Cancer Center, New York, New York 10021, USA; 2 ICREA, Medical Oncology Research Program, Vall d'Hebron University Hospital Research Institute, 08035 Barcelona, Spain; 3 Department of Biochemistry and Molecular Genetics, and Center for Cell Signaling, University of Virginia, Charlottesville, Virginia 22908, USA
| Abstract |
|---|
(TGFβ) pathway. Recent progress has shed light on the processes of Smad activation and deactivation, nucleocytoplasmic dynamics, and assembly of transcriptional complexes. A rich repertoire of regulatory devices exerts control over each step of the Smad pathway. This knowledge is enabling work on more complex questions about the organization, integration, and modulation of Smad-dependent transcriptional programs. We are beginning to uncover self-enabled gene response cascades, graded Smad response mechanisms, and Smad-dependent synexpression groups. Our growing understanding of TGFβ signaling through the Smad pathway provides general principles for how animal cells translate complex inputs into concrete behavior.
[Keywords: Smad; TGF-β; signaling; transcription]
(TGFβ) family is one of the most prominent representatives of this class of molecules. TGFβ and its family members (the nodals, activins, bone morphogenetic proteins (BMPs), myostatins, anti-Muellerian hormone (AMH), and others) exert profound effects on cell division, differentiation, migration, adhesion, organization, and death. These factors may be produced by many cell types, as in the case of TGFβ, or very few, as in the case of myostatin, and they may be active from the earliest stages of embryo development through adulthood, as in the case of the BMPs, or for very limited periods during development, as in the case of AMH. On the whole, the TGFβ family provides a paradigm of functional versatility among hormonally active polypeptides.
The quest to understand how cells read these signals started as soon as these factors were isolated in the early 1980s and has continued unabated ever since. By now, we have obtained a fairly robust understanding of the biochemical backbone of the TGFβ signaling pathway, and a growing sense of how cells translate these signals into responses. At the core of this pathway lie the Smad transcription factors. TGFβ induces its membrane receptors to directly activate Smad proteins that then form transcriptional complexes to control target genes (Fig. 1). What is not so simple is how these complexes activate or repress hundreds of target genes at the same time, in the same cell, and under tightly controlled conditions. This complexity has been the subject of much research in recent years and is the focus of the present review. For more information on the biology of the TGFβ family, we recommend several classical reviews (Letterio and Roberts 1998; Whitman 1998; Massagué and Chen 2000; Massagué et al. 2000; Derynck et al. 2001; Attisano and Wrana 2002; Schier 2003). Other recent reviews (Derynck and Zhang 2003; Siegel and Massagué 2003) and research articles (Foletta et al. 2003; Ozdamar et al. 2005) cover the emerging topic of Smad-independent pathways in TGFβ signaling.
| Basic features of the Smad proteins |
|---|
family of receptors; these are commonly referred to as receptor-regulated Smads, or RSmads (Fig. 2A). Smads 1, 5, and 8 serve principally as substrates for the BMP and anti-Muellerian receptors, and Smads 2 and 3 for the TGFβ, activin, and Nodal receptors. Smad4, also referred to as Co-Smad, serves as a common partner for all RSmads (Fig. 2B). Smad6 and Smad7 are inhibitory Smads that serve as decoys interfering with Smad-receptor or Smad-Smad interactions. The Mad protein in Drosophila, which was the first identified member of this family (Raftery and Sutherland 1999
500 amino acids in length and consist of two globular domains coupled by a linker region (Fig. 2A; Shi and Massagué 2003
β-hairpin structure, which is conserved in all the RSmads and Smad4. Interestingly, the most abundant splice form of Smad2 contains an insert (encoded by exon 3) that blocks DNA binding. The MH1 domain is followed by the linker region, a flexible segment with binding sites for Smurf (Smad ubiquitination-related factor) ubiquitin ligases, phosphorylation sites for several classes of protein kinases, and, in Smad4, a nuclear export signal (NES). The Smad MH2 domain is highly conserved and is one of the most versatile protein-interacting modules in signal transduction. RSmads have a conserved C-terminal motif, Ser-X-Ser, that is phosphorylated by the activated receptor. A pocket lined with basic residues interacts with the phosphorylated region of the activated receptor in the case of RSmads and with the phosphorylated tail of RSmads in the case of Smad4. A set of contiguous hydrophobic patches, referred to as the "hydrophobic corridor", on the surface of the MH2 domain mediates interactions with cytoplasmic retention proteins, with components of the nuclear pore complex (nucleoporins), and with DNA-binding cofactors. A region overlapping the linker and MH2 regions ("Smad4 activation domain", SAD) mediates interactions with transcriptional activators and repressors (Fig. 2A).
| Smad activation and deactivation |
|---|
or BMP was one of the early key observations placing Smads downstream of the TGFβ receptors (Hoodless et al. 1996
The mechanisms of Smad nuclear import and export have been extensively studied over the last few years, particularly in Smads 2, 3, and 4. Interestingly, the nuclear import of these proteins can occur without the intervention of nuclear transport factors (Xu et al. 2000, 2002). Many proteins undergoing nuclear translocation in response to regulatory signals rely on importins, a set of factors that mediate nuclear import by recognizing a motif called the nuclear localization signal (NLS) in the cargo proteins (Jans et al. 2000). The classic NLS motif consists of a cluster of basic residues and is recognized by importin-α. Importin-α binds to importin-β, which directly interacts with the nuclear pore components (nucleoporins) to negotiate the passage of the importin-α/importin-β/cargo complex into the nucleus (Mattaj and Englmeier 1998; Gorlich and Kutay 1999). However, the nuclear translocation of Smad proteins can occur independently of importins because Smad proteins can directly interact with nucleoporins (Xu et al. 2000, 2002). This interaction maps to the hydrophobic corridor in the Smad MH2 domain and to the FG repeat region on the nucleoporins Nup153 and Nup214 (Fig. 3; Xu et al. 2002). The FG repeat region is also the region for nucleoporin interaction with importins (Bayliss et al. 2000). Importin-independent transport has also been described for other signaling factors, including β-catenin in the Wnt signaling pathway (Yokoya et al. 1999) and ERK2 MAP kinase in response to Ras-mediated mitogenic signals (Whitehurst et al. 2002).
Some Smad proteins may additionally undergo nuclear import via importins. A conserved lysine-rich sequence in the MH1 domain of Smads that resembles classical NLS motifs was hypothesized, and shown, to interact with importins (Xiao et al. 2000; Kurisaki et al. 2001). This conserved lysine-rich sequence forms an α-helix right next to the DNA-binding motif in the MH1 domain, not an extended loop as would be structurally required for recognition by importin-α (Shi et al. 1998; Chai et al. 2003). Indeed, Smad3 interacts with importin-β, not importin-α (Xiao et al. 2000; Kurisaki et al. 2001). Importin binding has been detected with overexpressed Smad3 and Smad4 but not Smad2. This inability of Smad2 has been attributed to the presence of the exon 3-encoded insert in the MH1 domain. One study comparing the efficiency of Smad3 translocation by the importin-dependent and -independent mechanisms indicated that the importin-dependent process is considerably weaker (Xu et al. 2003). On balance, it appears that Smads 2, 3, and 4 undergo nuclear import by means of direct interactions with nucleoporins, and this process may be aided by importin-β in the case of Smads 3 and 4.
The intrinsic asymmetry of the nuclear pore complex, and the distribution of different nucleoporins along the span of the pore, are thought to allow proteins docking on one side of the pore to move unidirectionally to the other side and vice versa (Rout et al. 2000). Indeed, the direct interaction of Smads 2 and 3 with nucleoporins has been shown to enable nuclear export as well as import (Xu et al. 2002). The presence of an NES and the recognition of this sequence by the general nuclear export factor CRM1, which are necessary for nuclear export of other proteins (Fornerod et al. 1997), are not required for the export of Smads 2 and 3. Interestingly, these properties are also shared by β-catenin (Yokoya et al. 1999), ERK2 (Whitehurst et al. 2002), and importin-β (Fahrenkrog and Aebi 2003).
Unlike the nuclear export of Smads 2 and 3, the export of Smad4 is dependent on, or at least enhanced by, CRM1. Smad4 contains a leucine-rich NES in the linker region that is recognized by CRM1 (Figs. 2A, 3; Pierreux et al. 2000). The CRM1 inhibitor leptomycin B blocks the nuclear export of Smad4 without interfering with nuclear export of Smads 2 or 3 (Inman et al. 2002; Xu et al. 2002). The interaction with receptor-phosphorylated Smads 2 and 3 is thought to mask the NES in Smad4, protecting the RSmad-Smad4 complexes from recognition by CRM1 and nuclear export (Watanabe et al. 2000; Inman et al. 2002). When RSmads are dephosphorylated and the complex dissociates, the Smad4 NES becomes exposed again and nuclear export of Smad4 by CRM1 can proceed. Little is known about other inputs that may regulate the exposure of the Smad4 NES. However, an alternatively spliced mammalian Smad4 isoform (Pierreux et al. 2000) and a separate gene product, Smad4β, in Xenopus (Watanabe et al. 2000) lack the NES and are constitutively located in the nucleus.
Smad subcellular retention mechanisms
Despite the intrinsic ability of RSmads to shuttle in and out of the nucleus, many immunocytochemical studies have shown that in the basal steady state, RSmads are predominantly concentrated in the cytoplasm (Pierreux et al. 2000
; Watanabe et al. 2000
; Xu et al. 2002
). This is thought to result from the action of cytoplasmic Smad-binding factors. Indeed, endogenous Smad proteins in the basal state are found in high-molecular mass oligomeric complexes, as determined by molecular-sizing chromatography (Jayaraman and Massagué 2000
).
Different cytosolic proteins may function as Smad anchors and adaptors. The best characterized of the cytosolic retention factors for Smad2 and Smad3 is the protein SARA (Smad anchor for receptor activation) (Tsukazaki et al. 1998). SARA is a multidomain protein that contains an 80-amino-acid Smad-binding domain (SBD) and a FYVE phospholipid-binding domain that avidly binds to phosphatidylinositol 3-phosphate on endosomal membranes. The FYVE domain targets SARA preferentially to early endosomes (Di Guglielmo et al. 2003). As revealed by the X-ray crystal structure of the Smad2-SBD complex, the SBD of SARA makes contact with the three consecutive hydrophobic patches on the MH2 domain surface that constitute the hydrophobic corridor (Figs. 2, 3; Wu et al. 2000). This region is also involved in the interaction with the FG-repeat-containing domain of nucleoporins and the binding of DNA-binding cofactors (Wu et al. 2000; Randall et al. 2002). Therefore, the interaction with SARA is incompatible with translocation of Smad2/3 into the nucleus and formation of transcriptional complexes.
Receptor-mediated phosphorylation causes a decrease in the affinity of Smad2 for SARA (Wu et al. 2001b
). This occurs without a concomitant increase in Smad affinity for nucleoporins. In vivo, however, this drop in affinity for SARA is sufficient for Smad2/3 dissociation and movement into the nucleus, where Smad2/3 are retained by interactions with Smad4 and additional proteins, as well as with the DNA. Dephosphorylation and dissociation of Smad transcriptional complexes are thought to end this retention, allowing the export of RSmad out of the nucleus (Inman et al. 2002
).
The hydrophobic corridor and adjacent regions are involved in binding contacts not only with SARA and nucleoporins, but also with DNA-binding partners in the nucleus (Fig. 3; Randall et al. 2002
). Two such partners of Smad2/3, the forkhead family member FoxH1 and the homeodomain transcription factor Mixer, share a conserved Smad-interacting motif (SIM) that has sequence similarity with the SBD of SARA (Germain et al. 2000
; Wu et al. 2000
). Interestingly, the SIM is shorter than the SBD and, contrary to the SBD, it binds to Smad without occluding the Smad-Smad interface (Wu et al. 2001b). Thus SIM-containing factors can bind to Smad2/3-Smad4 oligomers, whereas SARA SBD interacts with monomeric Smad2/3.
As the same Smad protein surface interacts with cytoplasmic retention factors, nuclear pore components, and nuclear-interacting factors, this surface creates mutually exclusive interactions, and hence competition, between Smad-binding components (Fig. 3). A soluble SBD peptide has been used as an effective inhibitor of Smad2/3 nuclear translocation (Xu et al. 2000, 2003). Nup214 and Nup153 compete with SARA for Smad2 in binding assays (Xu et al. 2002). Overexpression of FoxH1 causes the concentration of Smad2/3 in the nucleus even in the absence of TGFβ, providing evidence for the constant nucleocytoplasmic traffic of Smads (Xu et al. 2002). This FoxH1-Smad2/3 interaction is transcriptionally inert because, without receptor-mediated phosphorylation, Smad2/3 in this complex cannot recruit Smad4.
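The competition for a single Smad surface can be illustrated with a textbook equilibrium model. The sketch below is only an illustration: it assumes one shared binding site, equilibrium conditions, and known free concentrations, and the function name `occupancy` and every number passed to it are hypothetical placeholders, not measured affinities for SARA, nucleoporins, or FoxH1.

```python
def occupancy(partners: dict[str, tuple[float, float]]) -> dict[str, float]:
    """Fractional occupancy of one shared site by mutually exclusive partners.

    partners maps a partner name to (free concentration, Kd), both in the
    same arbitrary units. Standard competitive-binding result:
        f_i = (c_i / Kd_i) / (1 + sum_j c_j / Kd_j)
    The key "unbound" in the result holds the fraction of free sites.
    """
    weights = {name: conc / kd for name, (conc, kd) in partners.items()}
    total = 1.0 + sum(weights.values())
    result = {name: w / total for name, w in weights.items()}
    result["unbound"] = 1.0 / total
    return result


# Hypothetical numbers: equal concentrations and equal affinities.
basal = occupancy({"SARA": (1.0, 1.0), "Nup214": (1.0, 1.0)})

# Hypothetical tenfold weaker SARA binding, standing in for the drop in
# affinity for SARA that the text attributes to receptor phosphorylation.
activated = occupancy({"SARA": (1.0, 10.0), "Nup214": (1.0, 1.0)})
```

In this toy setting, weakening one partner's affinity shifts occupancy toward the others even though their own affinities are unchanged, which is the qualitative behavior expected of mutually exclusive interactions.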
Smad adaptors for receptor interaction
Several adaptor proteins that facilitate the Smad-receptor interaction have been described, of which SARA is the most extensively characterized. By regionally restricting Smad2/3 proteins to the plasma membrane and early endosomes, SARA facilitates the interaction of Smads 2 and 3 with the activated TGFβ receptor (Tsukazaki et al. 1998). The activated receptor complex is internalized via clathrin-coated pits from the plasma membrane to the early endosomes, where SARA-bound Smads are thought to be most readily accessible (Di Guglielmo et al. 2003). However, this interaction also appears to occur at the plasma membrane prior to receptor internalization (Di Guglielmo et al. 2003). Another FYVE domain protein, Hgs, has been found to cooperate with SARA on Smad phosphorylation (Miura et al. 2000). Activated TGFβ receptor complexes are also internalized via lipid rafts and caveolae, but this route is thought to lead to receptor interaction with the E3 ubiquitin ligase Smurf2 that targets the receptor for inactivation (Di Guglielmo et al. 2003). However, genetic evidence for the necessity for SARA in Smad2/3 signaling is lacking, and so it remains to be established whether SARA is in fact a major player in this pathway.
Other proteins have been proposed to participate in the interaction of Smad2/3 with the TGFβ receptor, including Disabled-2 (Hocevar et al. 2001), Dok-1 (Yamakawa et al. 2002), Axin (Furuhashi et al. 2001), the ELF β-spectrin (Tang et al. 2003), and a cytoplasmic isoform (cPML) of the promyelocytic leukemia protein (Lin et al. 2004). cPML has been proposed to interact with SARA, Smad2/3, and the TGFβ receptor and to be critical for TGFβ-induced phosphorylation of Smad2/3 and TGFβ signaling (Lin et al. 2004). However, PML-deficient mice develop fairly normally (Wang et al. 1998), whereas embryos deficient in TGFβ receptors or Smad2 do not (Oshima et al. 1996; Waldrip et al. 1998; Heyer et al. 1999). The proteins TRAP-1 (TGFβ-receptor-associated protein) (Wurthner et al. 2001) and TLP (TRAP-1-like protein) (Felici et al. 2003) have been described as adaptor proteins that interact with the receptor complex and facilitate the formation of Smad2/3-Smad4 complexes. Many of these interactions require a more thorough examination before their significance as Smad adaptors can be established. Furthermore, little is known about factors that facilitate RSmad-receptor interactions in the BMP pathway. Smad1 does not interact with SARA, but given the overall similarity between these pathways, the existence of a SARA-like factor for Smads 1, 5, and 8 remains possible.
Smad phosphorylation by TGFβ receptor kinases
Smad phosphorylation by the activated TGFβ receptor complex is a pivotal event in the initiation of TGFβ signal transduction. The mechanism of TGFβ receptor activation was established by combining biochemical and genetic approaches (Wrana et al. 1994). This process and its structural basis have been reviewed in detail elsewhere (Shi and Massagué 2003). Briefly, TGFβ binds to pairs of receptor serine/threonine kinases, known as the TGFβ type I (TβR-I) and type II (TβR-II) receptors, forming a hetero-tetrameric receptor complex. In this complex, TβR-II phosphorylates a serine/threonine-rich region, called the GS region, that is located N-terminal to the canonical kinase domain of TβR-I (Fig. 1). In the absence of ligand, the small proteins FKBP12 and FKBP12.6 bind to the GS region and occlude the phosphorylation sites in this region (Y.G. Chen et al. 1997; Datta et al. 1998). The X-ray crystal structure of the FKBP12-TβR-I cytoplasmic domain complex revealed that the bound FKBP12 additionally enforces a catalytically inactive conformation of the TβR-I kinase domain by pressing against the active center and causing a misalignment of the critical catalytic amino acid residues (Huse et al. 1999). Thus, in the basal state, the GS region acts as a docking site for FKBP12 and an auto-inhibitory element of the receptor kinase (Fig. 1).
TβR-II appears to be a constitutively active kinase (Wrana et al. 1994). Ligand access to the receptors is a tightly regulated process, with numerous proteins acting as ligand traps that bind various members of the TGFβ family and prevent their contact with cell surface receptors (Fig. 1; for reviews, see Massagué and Chen 2000; Shi and Massagué 2003). In the ligand-induced receptor complex, TβR-II gains access to the GS region of TβR-I, catalyzing the phosphorylation of alternating serine (or threonine) residues in the sequence Thr-Thr-Ser-Gly-Ser-Gly-Ser (Wrana et al. 1994; Massagué 1998). This interaction is negatively regulated by the pseudoreceptor BAMBI, which intercalates in the receptor complex (Onichtchouk et al. 1999). Phosphorylation of TβR-I turns the GS region from an FKBP12-binding site into a binding site for Smad2/3 (Huse et al. 2001), providing a case of remarkable economy of function within a kinase structural element. Smad2/3 are then phosphorylated by TβR-I and released to propagate the signal.
This general mechanism applies to all TGFβ and BMP family receptors characterized to date, from human through Drosophila (Shi and Massagué 2003). The RSmad substrate specificity of a given receptor complex is determined by a particular region, the L45 loop, in the kinase domain of the type I receptor and a complementary region, the L3 loop, on the MH2 domain of RSmads (Fig. 3). TβR-I (also known as ALK5) and the nodal/activin type I receptors ALK4 and ALK7 recognize Smads 2 and 3, whereas ALK1, ALK2, ALK3, and ALK6 recognize Smads 1, 5, and 8. Thus, the phosphorylated GS region drives receptor interaction with RSmads, whereas the L45 loop determines the specificity of this interaction.
Receptor-mediated phosphorylation occurs at two serine residues in the extreme C-terminal sequence Ser-Val-Ser (Ser-Met-Ser in Smad2) of RSmads. pSer-X-pSer is the paradigmatic activation motif of the TGF-β/Smad pathway (Fig. 1). This motif occurs both in the GS region of type I receptors upon phosphorylation by type II receptors and in the C-terminal tail of RSmads upon phosphorylation by type I receptors. Together with the C-terminal carboxyl group, this di-phosphoserine moiety constitutes an acidic tail that binds to a basic pocket in the Smad4 MH2 domain (Fig. 1; Wu et al. 2001b; Chacko et al. 2004). As a result, RSmad-Smad4 oligomers are formed that nucleate a large number of transcriptional regulatory complexes.
In solution, the unphosphorylated Smad2 MH2 domain is a monomer but the phosphorylated form is a homotrimer (Wu et al. 2001b), and Smad4 also forms homotrimers (Shi et al. 1997). Smad2/3 mixed oligomers (presumably trimers) have been observed in cells on TGF-β stimulation (Wu et al. 2001b; Chacko et al. 2004), but their function is unknown. More significant from the standpoint of signal transduction are the RSmad-Smad4 oligomers, either heterodimers (RSmad-Smad4) or heterotrimers (two RSmad molecules plus one Smad4 molecule) (Figs. 1, 2B). As revealed by X-ray crystallographic studies, these oligomers are stabilized by interactions within an extensive protein-protein interface between MH2 domains plus the binding of the pSer-X-pSer motif of one MH2 domain into the di-phosphoserine-binding pocket on the adjacent MH2 domain (Wu et al. 2001b). Mutation of these serine residues into aspartic acid mimics this interaction. Several Smad-inactivating mutations found in tumors map to the MH2 domain interface and inhibit Smad oligomer formation (Wu et al. 2001b).
Smad dephosphorylation
A steady level of Smad2/3 phosphorylation is achieved in cells within 15-30 min of TGFβ addition and can last for several hours. Eventually, a drop in TGFβ levels in the extracellular space, receptor inactivation by internalization and degradation, or the action of negative feedback mechanisms leads to a loss of Smad phosphorylation.
Recent evidence suggests that sustained receptor activity is required for the maintenance of the steady-state phospho-Smad levels over the duration of the TGFβ stimulation. The existence of a rapid cycle of dephosphorylation and return to the cytoplasm has been inferred from results using a TGFβ receptor kinase inhibitor (Inman et al. 2002). Interrupting signaling by addition of this inhibitor causes a precipitous loss of phospho-Smad2, with a half-life of <30 min (Inman et al. 2002). This loss is primarily caused by dephosphorylation (Fig. 1; Inman et al. 2002), although ubiquitination and proteasome-mediated degradation of Smad2 in the nucleus may also eventually occur (Lo and Massagué 1999). RSmad dephosphorylation seems to occur in the nucleus, as suggested by the underphosphorylated state of Smad2 protein exported from the nuclei of permeabilized TGFβ-treated cells (Xu et al. 2002). Dephosphorylation is accompanied by dissociation of the RSmad-Smad4 complex and export of its components to the cytoplasm. Thus, RSmads undergo repeated cycles of receptor-mediated phosphorylation and re-entry into the nucleus, as long as TGFβ receptors remain active (Fig. 1). Smad signaling activity thereby becomes tied to receptor activation. The identity of the RSmad phosphatase(s) has not been revealed, but their identification will be of great interest because the action of this enzyme(s) causes the termination of Smad signal transduction.
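To give a feel for what a half-life of <30 min implies, the following minimal sketch computes the fraction of phospho-Smad2 remaining after receptor kinase inhibition. It assumes simple first-order (single-exponential) decay, which is an assumption made here for illustration and not a kinetic model reported in the text; the function name `fraction_remaining` is hypothetical, and 30 min is used only as the upper bound quoted above.

```python
def fraction_remaining(t_min: float, half_life_min: float) -> float:
    """Fraction of the initial phospho-Smad2 pool left after t_min minutes.

    Assumes first-order decay: N(t) / N(0) = 0.5 ** (t / half_life).
    """
    if half_life_min <= 0:
        raise ValueError("half_life_min must be positive")
    if t_min < 0:
        raise ValueError("t_min must be non-negative")
    return 0.5 ** (t_min / half_life_min)


# With the 30 min upper bound on the half-life, at most a quarter of the
# initial pool would remain one hour after the inhibitor is added.
upper_bound_after_1h = fraction_remaining(60.0, 30.0)
```

Because the reported half-life is an upper bound, the values computed this way are upper bounds on the remaining phospho-Smad2 under the stated assumption.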
| Smad transcriptional complexes |
|---|
DNA-binding determinants in the MH1 domain: β-hairpin and exon 3 insert
Smad4 and all RSmads (except the long form of Smad2, see below) can directly bind to DNA. The X-ray crystal structure of the Smad3 MH1 domain bound to the Smad-binding element (SBE) shows that a β-hairpin (i.e., two anti-parallel short β-strands separated by a linker loop) in the MH1 domain mediates this binding interaction. The β-hairpin is inserted in the major groove of the DNA and establishes hydrogen bonds with nucleotides in three base pairs of the SBE (Shi et al. 1998). The β-hairpin sequence is conserved in all RSmads and Smad4 (Shi et al. 1998), arguing that this Smad-DNA contact cannot provide much selectivity in the interaction of different Smads with their corresponding target genes.
Interestingly, the most abundant splice variant of Smad2 in vertebrates contains an insert in the MH1 domain, in the vicinity of the β-hairpin, that prevents Smad2 binding to DNA (Shi et al. 1998). This insert is encoded by exon 3 of Smad2. The predicted sequence of the Smad2 ortholog in Drosophila, dSmad2, also contains an insert in this position (Brummel et al. 1999). A splice variant of Smad2 lacking the exon 3 insert is able to bind DNA (Yagi et al. 1999). Because the splice form containing the insert is the most abundant and was cloned first, this variant is called Smad2, whereas the version lacking the insert, and thus most closely resembling the other RSmads, is stuck with the name Smad2ΔE3.
The endogenous Smad2 and Smad2ΔE3 proteins are coexpressed in various cell types at a ratio ranging from 3:1 to 10:1 (Dunn et al. 2005). Smad2-null mice show an early embryonic lethal phenotype because Smad2 plays an essential role in patterning the embryonic axis and specifying the endoderm (Waldrip et al. 1998; Heyer et al. 1999). Mice engineered to exclusively express Smad2ΔE3 from the Smad2 locus are viable and fertile, indicating that full-length Smad2 with the insert is not required for viability (Dunn et al. 2005).
The role of the insert and its inhibitory effect on DNA binding remain a mystery. Smad2 might act as a competitive inhibitor of Smad2ΔE3 and Smad3 in transcriptional complexes. However, this scenario has not been borne out by the evidence. Smad2 was originally identified for its ability to act as a mediator and not an inhibitor of activin/TGFβ-like transcriptional responses. Smad2 can associate with Smad4 and an additional DNA-binding cofactor to form transcriptional complexes. This was shown by the ability of the complex FoxH1-Smad2-Smad4 to mediate activation of the Mix2 promoter (X. Chen et al. 1997; Liu et al. 1997). Still, it is possible that Smad2 may act as an effector of some gene responses and as an inhibitor of others. Furthermore, the exon 3-encoded insert may have functions not related to transcriptional regulation. Of interest in this regard, Smad2ΔE3, like Smad3, forms high-molecular mass oligomeric complexes in the basal state, whereas Smad2 is essentially monomeric under these conditions (Jayaraman and Massagué 2000).
Smad-binding elements
Using a bound-oligonucleotide selection strategy, the binding specificity of recombinant Smad proteins was originally defined as 5'-GTCTAGAC-3' (Zawel et al. 1998) and later shown to be 5'-GTCT-3', or its reverse complement 5'-AGAC-3', called the Smad-binding element (SBE). Many Smad-responsive promoter regions contain one or more SBEs, which in many instances contain an extra base, as 5'-CAGAC-3'. The crystal structure of the MH1-SBE complex shows that Smads recognize the 5'-GTCT-3' sequence through the β-hairpin in the MH1 domain (Shi et al. 1998).
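Locating candidate SBEs in a promoter sequence amounts to searching for 5'-GTCT-3' on either strand. The sketch below is a minimal illustration under stated assumptions: the helper names `reverse_complement` and `find_sbe_sites` are hypothetical, input is restricted to unambiguous A/C/G/T bases, and a sequence match says nothing about whether a Smad complex actually binds the site in vivo.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(pairs[base] for base in reversed(seq.upper()))


def find_sbe_sites(seq: str, motif: str = "GTCT") -> list[tuple[int, str]]:
    """List (0-based start, strand) for each motif occurrence on either strand.

    A "+" hit means the given strand reads the motif itself at that position.
    A "-" hit means the given strand reads the motif's reverse complement
    (AGAC for GTCT), so the motif lies on the opposite strand.
    """
    seq = seq.upper()
    motif = motif.upper()
    motif_rc = reverse_complement(motif)
    hits = []
    for start in range(len(seq) - len(motif) + 1):
        window = seq[start:start + len(motif)]
        if window == motif:
            hits.append((start, "+"))
        if window == motif_rc:
            hits.append((start, "-"))
    return hits


# The palindromic site from the selection experiment contains one SBE
# on each strand.
palindrome_hits = find_sbe_sites("GTCTAGAC")
```

Overlapping occurrences are reported because every start position is examined independently.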
The affinity of Smad proteins for the SBE is too low to support binding of a Smad complex to a single SBE in vivo (Shi et al. 1998
). Sufficient binding affinity for transcriptional activation can be artificially achieved with concatemers of multiple SBEs (Zawel et al. 1998
). As activated Smad complexes consist of Smad oligomers, the presence of multiple SBEs likely enables tight binding through cooperative interactions between multiple MH1 domain-SBE contacts by the same Smad complex. However, natural Smad target promoters seldom contain SBE concatemers, and those that contain up to four SBEs still require cooperating factors for effective DNA binding (Seoane et al. 2004
).
High-affinity binding of the Smad complex is thought to occur through the incorporation of a different DNA-binding cofactor into the RSmad-Smad4 complex (Fig. 1). This allows the recognition of promoter regions that present one SBE in the vicinity of the cognate sequence for that particular cofactor. This mode of interaction provides a basis for high affinity and selectivity in the recognition of target genes and a venue for the differential action of TGFβ in different cell types (Massagué and Wotton 2000). As discussed below, the interaction of Smads with other transcription factors to generate target gene-specific transcriptional complexes is crucial for the pleiotropic nature of TGFβ signaling. However, no crystal structure of a natural Smad complex bound to its cognate DNA region has been described to date, leaving us to wonder how such interactions may actually take place.
SBE variants and GC-rich elements
Some TGFβ-responsive regions lack a canonical SBE but contain sequences that may function as such because of a certain degree of tolerance in the MH1-SBE interaction. Based on the crystal structure of this complex, the second nucleotide in the 5'-GTCT-3' sequence does not support interactions with the MH1 domain, thus allowing substitutions that do not impair Smad binding (Shi et al. 1998). Such "degenerate" SBEs have been proposed in the TGFβ inhibitory element (TIE) in the c-Myc promoter (Chen et al. 2002). The TIE does not contain a canonical SBE but binds a Smad3-Smad4 complex, as verified by chromatin immunoprecipitation assays (Chen et al. 2002). Scrutiny of this region in vitro and in cells has demonstrated that the sequence 5'-GGCTT-3' is contacted by Smad3 or Smad4 (Chen et al. 2002; Frederick et al. 2004). Either way, the Smad-DNA interaction in the c-Myc TIE is buttressed by E2F4/5 and DP1 as DNA-binding cofactors in the c-Myc inhibitory complex (Chen et al. 2002; Frederick et al. 2004).
In certain promoters, Smad complexes recognize GC-rich regions. Mad (Drosophila Smad1) and Medea (Drosophila Smad4) were shown to interact with a GCCGnCGC sequence in the promoters of Vestigial (Kim et al. 1997) and Tinman (Xu et al. 1998). Interestingly, the murine counterpart of Tinman, Nkx2.5, contains a canonical SBE that interacts with the Smad complex (Lien et al. 2002; Brown et al. 2004). The BMP-responsive element (BRE) in the promoter of Smad6 contains four overlapping copies of a GC-rich motif. Direct binding of Smads to GC-rich sequences has been demonstrated in oligonucleotide binding assays using extracts from cells that overexpress Smad5 and Smad4 (Ishida et al. 2000). Based on these observations, GC-rich motifs are sometimes referred to as "Smad1-binding elements". However, it would be erroneous to imply that Smad1/5/8 bind GC-rich motifs, whereas Smad2/3 bind to SBEs. The first Smad1-binding element defined in vertebrates, in the BMP-responsive region of the Xenopus Vent-2 promoter, is a canonical SBE (Hata et al. 2000). Furthermore, binding of Smad1-Smad4 complexes and Smad3-Smad4 complexes to the Id1 promoter requires both SBE and GC-rich elements. Both elements are required for BMP-mediated induction of Id1 (Korchynskyi and ten Dijke 2002; Lopez-Rovira et al. 2002; Kang et al. 2003) and TGFβ-mediated repression of Id1, the difference being determined by the recruitment of an additional factor, ATF3 (Kang et al. 2003). No crystal structure of a Smad molecule bound to a GC region has been reported, but the dual interaction with SBE and GC-rich motifs invites the speculation that the MH1 domain might actually contain separate DNA-binding sites for these two different sequences.
Core Smad complexes: dimers and trimers
Transcriptionally active Smad complexes are built around RSmad-Smad4 oligomers. The stoichiometry of such oligomers has been the subject of debate. The analysis of complexes formed with Smads containing acidic residue substitutions for the C-terminal serine residues suggested that they form heterotrimers of two phospho-RSmad molecules and one Smad4 molecule (Chacko et al. 2001; Qin et al. 2001). Moreover, phospho-Smad2 and Smad4 each in isolation form homotrimers in solution (Shi et al. 1997; Wu et al. 2001b). These results indicated that functional RSmad-Smad4 complexes are probably heterotrimers (Chacko et al. 2001). However, the crystal structure of isolated MH2 domains of phospho-Smad2 also supported a heterodimer model between Smad4 and the RSmad (Wu et al. 2001a,b). Recently, X-ray crystal structure analysis of the complexes formed by the MH2 domains of phospho-Smad2-Smad4 and phospho-Smad3-Smad4, digested with chymotrypsin, showed that both complexes are heterotrimers comprising two phosphorylated RSmad subunits and one Smad4 (Chacko et al. 2004).
Studies on endogenous Smad transcriptional complexes in vivo have suggested a more complex scenario. Both heterodimers and heterotrimers may be formed depending on the target gene and other factors present in the complex (Inman and Hill 2002). For example, the Mix2 promoter may be targeted by a Smad2–Smad2–Smad4 heterotrimer bound to FoxH1, whereas a Smad3–Smad4 heterodimer and an as yet unknown cofactor may be involved in targeting the JunB promoter (Inman and Hill 2002). The physical interaction of the Smad cofactors with the RSmad–Smad4 complexes might favor the formation of either a heterodimer or a heterotrimer.
The regulation of c-Myc, Id1, p21Cip1, and other genes by TGF-β involves Smad3 preferentially over Smad2. Smad3 has a higher affinity than Smad2 for the cofactors (E2F4/5, ATF3, and FoxO, respectively) that target Smads to these gene promoters (Chen et al. 2002; Kang et al. 2003; Seoane et al. 2004). The role of Smad2ΔE3 in these complexes has not been characterized. Therefore, the choice between Smad heterodimers and heterotrimers might depend on several variables, including the balance between Smad2 and Smad2ΔE3, the other DNA-binding partners present in the complex, and the number of SBEs present in the target promoter.
Smad4 requirement
Whether Smad4 is always required in Smad transcriptional complexes remains an important yet unresolved question. All endogenous Smad complexes described to date have been shown to contain Smad4. All Smad target genes characterized by chromatin immunoprecipitation showed Smad4 binding along with RSmads. Nevertheless, Smad4-deficient tumor cells and fibroblasts from Smad4-deficient mice still display some gene responses to TGF-β (Sirard et al. 1998; Wisotzkey et al. 1998; Subramanian et al. 2004). Certain pancreatic carcinoma cells that lack Smad4 contain high levels of phosphorylated RSmads and respond to TGF-β receptor signaling with increased motility (Subramanian et al. 2004). TGF-β receptors could signal some of these responses via MAP kinases, phosphatidylinositol 3'-kinase, PP2A protein phosphatases, or Rho family members (for review, see Derynck and Zhang 2003; Siegel and Massagué 2003), or as a result of effects on tight junction membrane components (Ozdamar et al. 2005). However, TGF-β signaling via RSmads independently of Smad4 remains a possibility, as suggested by the more severe phenotype of mad mutants compared with medea mutants in Drosophila (Wisotzkey et al. 1998) and of Smad2-/- mice compared with Smad4-/- mice (Sirard et al. 1998; Chu et al. 2004). Phospho-RSmads might yet be shown to interact with factors distinct from, and in competition with, Smad4.
| Target gene selection by association with DNA-binding partners |
|---|
target genes in a given cell, only a few may be targeted by a particular RSmad–Smad4–partner combination. A given cell type that does not express a particular Smad partner will be unable to mount the TGF-β gene response(s) that depend on that particular RSmad complex combination. This would make a gene response to TGF-β cell type dependent. Some DNA-binding partners can pair with Smad2/3 only and others with Smad1/5/8 only, thus establishing pathway specificity for TGF-β and BMP gene responses. Furthermore, in some cases Smad partners also determine the sign of the effect (activation or repression) that is exerted on a target gene. Smad partners therefore provide four levels of specificity: (1) target gene specificity, (2) pathway specificity, (3) cell type specificity, and (4) specific transcriptional effect.
transcriptional responses. It should be noted that only a few of these interactions have been validated to date by analysis of endogenous (as opposed to overexpressed) proteins, chromatin immunoprecipitation analysis of protein binding to DNA in vivo, and genetic depletion of Smad partners. Whenever possible, we will focus here on examples that meet these stringent criteria.
The first identified Smad-interacting transcription factor was the forkhead family member FoxH1 (previously FAST1) (Chen et al. 1996). The FoxH1–Smad2–Smad4 complex binds an activin-responsive element in the promoter region of Mix2 in response to activin/nodal-like signals during mesoderm specification in Xenopus. FoxH1 is essential for the binding of this complex to the Mix2 promoter (X. Chen et al. 1997; Liu et al. 1997). FoxH1 specifically interacts with the MH2 domain of Smads 2 and 3. This interaction is mediated by two separate sequences on FoxH1: a proline-rich sequence called the Smad interaction motif (SIM) and a FoxH1-specific motif (Germain et al. 2000). Three other forkhead family members, FoxO1, FoxO3, and FoxO4, serve as Smad3 transcriptional partners in the activation of the cyclin-dependent kinase inhibitor p21Cip1 (CDKN1A) (Seoane et al. 2004). In this case, however, the interaction is mediated by the DNA-binding domain ("forkhead" or "Fox-box" domain) of the FoxO proteins and the MH1 domain of Smad3.
A region structurally and functionally similar to the FoxH1 SIM is also present in Mixer and Milk, although these proteins belong to a different family, the Mix family of homeodomain transcription factors. Mixer and Milk serve as Smad partners in the regulation of Xenopus goosecoid (Germain et al. 2000). Interestingly, it has been suggested that contact with the goosecoid promoter is established by Mixer or Milk, with the Smads tethered to these proteins without directly contacting the DNA (Germain et al. 2000).
Activation of Vent2 during ventral mesoderm differentiation in Xenopus is mediated by an OAZ–Smad1–Smad4 complex in response to BMP (Hata et al. 2000). OAZ belongs to yet another family of DNA-binding proteins, the zinc-finger protein family. OAZ contains 30 zinc-finger motifs, some of which mediate binding to Smad1 and others binding to a Vent2 promoter sequence (Hata et al. 2000). Another BMP target gene activated by the Smad–OAZ complex is Xretpos, a novel family of long terminal repeat (LTR)-retrotransposons in Xenopus, whose transcript is restricted to ventroposterior-specific regions (Shim et al. 2002). However, the activation of the homeobox gene Tlx2 by BMP in the mouse embryo during gastrulation and in the embryonal carcinoma P19 cell line (Macias-Silva et al. 1998; Tang et al. 1998) is not mediated by Smad–OAZ (Hata et al. 2000). In this case, the TRAF6-interacting protein Ecsit, which is also implicated in the Toll signaling pathway, helps the Smad1–Smad4 complex recognize Tlx2 for activation (Xiao et al. 2003). This provides yet another example of the level of selectivity that has evolved to ensure proper control of target gene selection in Smad transcriptional programs.
The list of DNA-binding partners is rapidly growing. The RUNX family of DNA-binding factors contributes three members to this list: Runx1 (also known as AML1/CBFA2/PEBP2αB), Runx2 (AML3/CBFA1/PEBP2αA), and Runx3 (AML2/CBFA3/PEBP2αC) (Hanai et al. 1999; E. Pardali et al. 2000; Zhang and Derynck 2000; Alliston et al. 2001). TGF-β-activated Runx–Smad2/3–Smad4 complexes target the IgCα promoter in B cells (Hanai et al. 1999; E. Pardali et al. 2000; Zhang and Derynck 2000) and the osteocalcin promoter in osteoblasts (Alliston et al. 2001). AP1 transcription factors interact with Smads in the regulation of c-jun (Zhang et al. 1998), collagenase-I/MMP1 (Wong et al. 1999), and interleukin-11 (Y. Kang et al. 2005). While AP1 and Smad complexes have been shown to bind to adjacent sites on the collagenase-I/MMP1 and interleukin-11 promoters, they can act on the c-jun promoter by a long-range synergy from distant promoter sites (Wong et al. 1999). Thus, in the c-jun response, AP1 and Smads may not establish the kind of physical partnership that is characteristic of other Smad–partner interactions. The E2F family members E2F4 and E2F5 act as partners of Smad3–Smad4 in the repression of c-Myc by TGF-β in epithelial cells (Chen et al. 2002), and the ATF/CREB family member ATF3 does so in the repression of Id1 (Kang et al. 2003).
p53 has been proposed to interact with Smads and to be required for many Smad gene responses, although no direct physical interaction has been demonstrated (Cordenonsi et al. 2003). Among general transcription factors, Sp1 has been implicated in the induction of p15Ink4b (Feng et al. 2000) and p21Cip1 (K. Pardali et al. 2000), but no evidence has been provided yet that TGF-β controls the interaction of endogenous Sp1 with the p15Ink4b and p21Cip1 promoters in chromatin immunoprecipitation assays or that conditional depletion of Sp1 specifically interferes with the induction of these genes by TGF-β.
It remains possible that genes containing enough clustered copies of the SBE in the promoter region might be activated by Smad-only complexes. One candidate in this class is Smad7, whose promoter contains two palindromic SBEs (Denissova et al. 2000). Smad7 is activated by both the TGF-β and BMP pathways in many cell types as a negative feedback mechanism, so it would make sense if its induction could be achieved by activated Smads regardless of the activating input. Nevertheless, definitive proof for this notion is still lacking. Indeed, several reports show that full activation of the Smad7 promoter requires cooperation of the Smad complex with AP1, Sp1, or TFE3 transcription factors (Brodin et al. 2000; Hua et al. 2000). Of note,