Genes and Development

GENES & DEVELOPMENT 21:11-42, 2007
©2007 by Cold Spring Harbor Laboratory Press; ISSN 0890-9369/ $5.00

REVIEW

Eukaryotic regulatory RNAs: an answer to the ‘genome complexity’ conundrum

Kannanganattu V. Prasanth and David L. Spector1

Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 11724, USA


    Abstract
 Top
 Abstract
 Roles of RNA in...
 Roles of ncRNAs in...
 Intergenic transcripts: sense in...
 Natural antisense transcripts...
 RNAs as modulators of...
 RNAs: location, location,...
 Pseudogene transcripts: no more...
 Nuclear retained regulatory...
 New roles for RNAs?
 Regulatory RNAs implicated in...
 Summary
 Acknowledgments
 References
 
A large portion of the eukaryotic genome is transcribed as noncoding RNAs (ncRNAs). While once thought of primarily as "junk," recent studies indicate that a large number of these RNAs play central roles in regulating gene expression at multiple levels. The increasing diversity of ncRNAs identified in the eukaryotic genome suggests a critical nexus between the regulatory potential of ncRNAs and the complexity of genome organization. We provide an overview of recent advances in the identification and function of eukaryotic ncRNAs and the roles played by these RNAs in chromatin organization, gene expression, and disease etiology.

[Keywords: RNA; disease; intergenic transcripts; noncoding RNA; nuclear regulatory RNAs]


Although Jacob and Monod (1961) suggested early on that structural genes encode proteins and regulatory genes produce noncoding RNAs (ncRNAs), the prevailing view has been that proteins not only constitute the primary structural and functional components of cells, but also constitute most of the regulatory control system in both simple and complex organisms. However, recent advances in the fields of RNA biology and genome research have prompted a reassessment of this "age-old" assumption and provided significant evidence of the importance of RNAs as "riboregulators" outside of their more conventional role as accessory molecules.

Recent large-scale studies of the human and mouse genomes have revealed that although there are ~21,561 protein-coding genes in human and 21,839 in mouse, significantly larger portions of both genomes are transcribed (69,185 gene predictions in human and 71,259 in mouse) (http://www.ensembl.org). Based on such analyses, eukaryotic genomes appear to harbor fewer protein-coding genes than initially expected, and gene number does not scale with complexity as steeply as originally anticipated (Mattick 2004a; Mattick and Makunin 2006). For example, the Drosophila melanogaster genome contains only twice as many genes as some bacterial species, although the former is far more complex in its genome organization than the latter. Similarly, the number of protein-coding genes in human and in the nematode Caenorhabditis elegans is extremely close (http://www.ensembl.org). Such analyses suggest that protein-coding genes alone are not sufficient to account for the complexity of higher eukaryotic organisms. Interestingly, from genomic analysis it is evident that as an organism’s complexity increases, the protein-coding contribution of its genome decreases (Mattick 2004a,b; Szymanski and Barciszewski 2006). A portion of this paradox can be resolved through alternative pre-mRNA splicing, whereby diverse mRNA species, encoding different protein isoforms, can be derived from a single gene (Lareau et al. 2004). In addition, a range of post-translational modifications contributes to the increased complexity and diversity of protein species (Yang 2005).
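As a back-of-the-envelope check on the figures quoted above, the following minimal sketch computes what share of all gene predictions is protein-coding in each species. It assumes that the protein-coding genes are counted within the total gene predictions, which the text does not state explicitly; the function and variable names are illustrative only.

```python
# Gene counts as quoted in the text (attributed there to Ensembl).
# Assumption: protein-coding genes are a subset of the gene predictions.
counts = {
    "human": {"protein_coding": 21_561, "gene_predictions": 69_185},
    "mouse": {"protein_coding": 21_839, "gene_predictions": 71_259},
}


def protein_coding_fraction(protein_coding: int, gene_predictions: int) -> float:
    """Return the fraction of gene predictions that are protein-coding."""
    return protein_coding / gene_predictions


fractions = {
    species: protein_coding_fraction(**numbers)
    for species, numbers in counts.items()
}

for species, fraction in fractions.items():
    print(f"{species}: {fraction:.1%} of gene predictions are protein-coding")
```

Under that assumption, a little under one-third of the predicted genes in either species are protein-coding, in line with the ~70,000 transcription units and ~21,500 protein-coding genes cited later in this section.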

It is estimated that ~98% of the transcriptional output of the human genome represents RNA that does not encode protein (Mattick 2005). This suggests that these genomes are either replete with largely useless transcription or that these ncRNAs are fulfilling a wide range of unexpected functions in eukaryotic biology (Huttenhofer et al. 2005; Mattick 2005). Recent observations strongly suggest that ncRNAs contribute to the complex networks needed to regulate cell function and could be the ultimate answer to the genome paradox (Mattick 2001, 2003, 2004a,c; Mattick and Gagen 2001). Initially the term ncRNA was used primarily to describe eukaryotic RNAs that are transcribed by RNA polymerase II (RNA pol II) and have a 7-methylguanosine cap structure at their 5' end and a poly(A) tail at their 3' end, but lack a single long ORF. However, more recently this classification has been extended to all RNA transcripts that do not have a protein-coding capacity. NcRNAs include introns and independently transcribed RNAs, with the latter accounting for 50%–75% of all transcription in higher eukaryotes (Mattick and Gagen 2001; Shabalina and Spiridonov 2004). Introns account for at least 30% of the human genome, but they have been largely overlooked due to the general assumption that they are rapidly degraded upon pre-mRNA splicing (Mattick 1994, 2005). In mammalian genomes, introns comprise ~95% of the sequence within protein-coding genes. Introns have been suggested to play important roles in nucleosome formation and chromatin organization, alternative pre-mRNA splicing, and as scaffold/matrix-attachment regions (Shabalina and Spiridonov 2004). Intronic sequences have also been shown to harbor independent transcription units, such as microRNAs, small nucleolar RNAs (snoRNAs), and repetitive elements (Mattick and Makunin 2005).

It is not clear how many ncRNA genes are present in the mammalian genome. The existing catalog of mammalian genes is strongly biased toward protein-coding genes. Novel ncRNA genes are difficult to identify based on sequence analysis due to their sequence divergence across phyla (Pang et al. 2006). The nature of ncRNA genes, including their variation in length (20 nucleotides [nt] to >100 kb), lack of ORFs, and relative immunity to point mutations makes them difficult targets for genetic screens. Analysis of mouse full-length cDNAs revealed that ncRNAs constitute more than one-third of all identified transcripts (Okazaki et al. 2002; Numata et al. 2003; Carninci et al. 2005). Whole human chromosome analysis using oligonucleotide tiling arrays has demonstrated a significantly large number of genes encoding ncRNAs on most of the analyzed chromosomes (Kapranov et al. 2002; Cawley et al. 2004; Kampa et al. 2004; Cheng et al. 2005), many of which show extraordinarily complex patterns of interlaced and overlapping transcription (Carninci et al. 2005; Kapranov et al. 2005). Current estimates of the number of independent transcription units (~70,000) and protein-coding genes (~21,500) in the mammalian transcriptome suggest that ncRNA genes are highly abundant in the genome (Mattick 2004b,c, 2005; Mattick and Makunin 2006; Willingham and Gingeras 2006).

Based on functional relevance, ncRNAs can be subdivided into two classes: (1) housekeeping ncRNAs and (2) regulatory ncRNAs. Housekeeping ncRNAs are generally constitutively expressed and are required for the normal function and viability of the cell. Some examples include transfer RNAs (tRNAs), ribosomal RNAs (rRNAs), small nuclear RNAs (snRNAs), snoRNAs, RNase P RNAs, telomerase RNA, etc. These RNAs have been the focus of many reviews (Eddy 2001; Gesteland et al. 2006) and will not be considered further here. In contrast, regulatory ncRNAs or riboregulators include those ncRNAs that are expressed at certain stages of development, during cell differentiation, or as a response to external stimuli, which can affect the expression of other genes at the level of transcription or translation. Several recent excellent reviews have focused on small regulatory RNAs, including small interfering RNAs (siRNAs) and microRNAs (Hannon 2002; He and Hannon 2004; Mattick and Makunin 2005; Zamore and Haley 2005; Petersen et al. 2006), and therefore will not be extensively discussed here, except for their involvement in various diseases. In the present review, we discuss our current understanding of the roles of other noncoding regulatory RNAs in eukaryotic cells and their involvement in gene organization, regulation, and disease etiology.


    Roles of RNA in dosage compensation and sex determination: everything needs to be equal
 
In most animals, the males and females differ in the number of X chromosomes. The expression levels of X-chromosome genes must therefore be equalized in the two sexes, a process referred to as dosage compensation. This can be achieved either by X-chromosome inactivation (XCI) in XX cells or by hyperactivation of the single X chromosome in XY cells. Both of these mechanisms are used by different species and both depend on the expression of regulatory ncRNAs that are key elements of the pathways leading to chromatin remodeling (Lucchesi et al. 2005).


XCI

In mammals, dosage compensation of X-linked gene products between the sexes is achieved by transcriptional silencing of a single X chromosome during early female embryogenesis (Lyon 1961; Plath et al. 2002; Heard and Disteche 2006; Spencer and Lee 2006). Initiation of XCI requires the counting of X chromosomes. XCI follows the "n – 1" rule that leads to transcriptional silencing of all but one X chromosome. In female soma, XCI occurs in early development shortly after uterine implantation of the embryo. This form of XCI is called "random" because silencing can take place on either X chromosome (Spencer and Lee 2006). However, in the extraembryonic tissues of some placental mammals, such as rodents, XCI takes place in an "imprinted" manner such that the paternal X (Xp) is always silenced (Takagi and Sasaki 1975). Earlier classical cytogenetics studies suggested that the paternal X only becomes inactivated at the blastocyst stage, accompanying cellular differentiation in the trophectoderm and primitive endoderm (Takagi et al. 1982). However, recent studies have revealed that the paternal X has already begun to inactivate by the eight-cell stage (Huynh and Lee 2003; Mak et al. 2004; Okamoto et al. 2004; Okamoto and Heard 2006) and this inactivation of Xp initiates following Xist RNA coating at the four-cell stage (Okamoto et al. 2004, 2005). Imprinted XCI is also observed in marsupials and is believed to be the earliest form of XCI (Graves 1996). This inactive state is stably maintained through subsequent cell divisions. The X inactivation center (Xic) is a critical region of 80–450 kb on the X chromosome that controls XCI initiation and spreading (Heard and Disteche 2006; Spencer and Lee 2006). Only the chromosomes carrying the Xic sequence are able to induce XCI, even though the "random" and "imprinted" forms of XCI may differ with respect to the requirement of the Xic sequences (Okamoto et al. 2005).
Interestingly, when Xic sequences are inserted into an autosome, the autosome becomes subject to counting, choice, and inactivation (Spencer and Lee 2006).
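The "n – 1" counting rule stated above can be written as a one-line calculation. The sketch below is only an illustration of that rule as worded in the text (all but one X chromosome silenced); the function name is hypothetical, and the model deliberately ignores autosomal ploidy and the choice of which X is silenced.

```python
from typing import Tuple


def apply_n_minus_one_rule(x_count: int) -> Tuple[int, int]:
    """Return (active, inactive) X-chromosome counts under the n - 1 rule.

    Simplifying assumption: exactly one X stays active regardless of how
    many X chromosomes the cell carries.
    """
    if x_count < 1:
        raise ValueError("a cell is assumed to carry at least one X chromosome")
    return 1, x_count - 1


for karyotype, x_count in [("XY", 1), ("XX", 2), ("XXX", 3)]:
    active, inactive = apply_n_minus_one_rule(x_count)
    print(f"{karyotype}: {active} active X, {inactive} inactive X")
```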

Of the several long ncRNA genes present in Xic, Xist (X-inactive-specific transcript) has been the most extensively studied ncRNA gene. The Xist gene encodes a ncRNA that is associated exclusively with the inactive X chromosome (Fig. 1; Brockdorff et al. 1992; Brown et al. 1992). Although potential ORFs exist in Xist RNA, they are short and not conserved between species (Brockdorff et al. 1992; Brown et al. 1992). The gene is conserved between species at the level of its genomic organization but shows only weak sequence homology, possibly implying a role for its secondary structure. Xist ncRNAs are 15–17 kb long in mice, ~19 kb in human, are spliced, polyadenylated, and are restricted to the nuclear compartment. In the female embryo, Xist up-regulation on the putative inactive X chromosome (Xi) and RNA coating of this chromosome constitute the first detectable signs of XCI (Morey and Avner 2004). Using inducible Xist cDNA transgenes, it was shown that Xist-RNA-induced X-chromosome silencing occurs only during early embryonic stem (ES) cell differentiation (Wutz and Jaenisch 2000). However, during initial phases of ES cell differentiation, XCI can be reversed by switching off the Xist gene, but subsequently the repressed state becomes locked in and is no longer dependent on Xist. This irreversibility of silencing of Xi can be attributed to changes in chromatin modifications observed on the Xi following Xist RNA coating (Heard and Disteche 2006). The earliest chromatin modifications observed are the loss of histone modifications associated with active chromatin, such as H3K9 acetylation and H3K4 methylation. Subsequently, the X chromosome becomes H4 hypoacetylated and enriched in H3 Lys 27 (H3K27) trimethylation (Plath et al. 2003; J. Silva et al. 2003).
H3K27 hypermethylation is accompanied by other chromatin changes, including H3K9 hypermethylation and H4K20 monomethylation as well as H2A K119 monoubiquitylation, and all of these modifications appear concomitantly with the transcriptional silencing of the X-linked genes (Morey and Avner 2004; Heard and Disteche 2006). The inactive X chromosome is also enriched in the histone variant macroH2A, and Xist RNA is necessary for its localization to the inactive X (Costanzi and Pehrson 1998). These successive layers of modifications lead to the establishment of silent chromatin and, in turn, lock the inactive X into a stable heterochromatic state throughout the cell cycle. Deletion and transgene analyses have shown that Xist is essential for both imprinted and random XCI and affects only the chromosome that transcribes Xist RNA (Penny et al. 1996; Marahrens et al. 1997; Wutz and Jaenisch 2000). However, Xist alone cannot account for the multiple functions attributed to the Xic, such as "counting," as deletion of one Xist allele still allows the cell to register the presence of more than one Xic, which triggers XCI via the wild-type Xist allele (Penny et al. 1996). Interestingly, multiple DNA elements 3' to Xist appear to be involved in counting and choice functions (Heard and Disteche 2006).


Figure 1. Interphase mouse nucleus showing the localization of Xist RNA (green) in the inactive X chromosome. DNA is counterstained with DAPI (blue). (Image provided by Edith Heard, Curie Institute, Paris, France.)

 
Although Xist is associated with X-chromosome silencing, its mechanism of action remains unclear. With its noncoding properties, Xist could conceivably function through its RNA, either by modulating transcription at the locus, or through organizing chromatin. Several lines of evidence strongly favor the view that Xist functions as an RNA (Spencer and Lee 2006). These include (1) the physical association of Xist RNA with the inactive X chromosome and the nuclear matrix around the X chromosome. In support of this, recent studies have shown that Xist RNA defines a silent nuclear compartment around the future Xi early in the XCI process (Chaumeil et al. 2006; Clemson et al. 2006; Heard and Disteche 2006). (2) Mutations that decrease Xist RNA localization generally correlate with reduced silencing (Newall et al. 2001). (3) Expression of Xist RNA from an autosome during ES cell differentiation initiates inactivation of the autosome carrying the Xist transgene (Lee and Jaenisch 1997). (4) The repeat-A region that contains A-repeats located within exon 1 of Xist RNA, which is required for silencing, functions only when placed in the forward (native) orientation (Wutz et al. 2002). The current model suggests that Xist RNA initiates silencing by binding to specific silencing factors, recruiting those silencing proteins to the Xic, and subsequently propagating those factors along the X chromosome as the RNA itself spreads throughout the chromosome (Spencer and Lee 2006).

Other than Xist RNA, the Xic region in mouse also harbors many other ncRNA genes including Tsix, Xite, and Jpx/Enox, several of which are integral to the regulation of XCI. Tsix negatively regulates the expression of Xist RNA and is transcribed in an antisense orientation relative to Xist. Like Xist, Tsix lacks a conserved ORF and is found only in the nucleus (Lee et al. 1999). In undifferentiated female ES cells, Xist and Tsix are coexpressed on both X chromosomes, although the Tsix levels are in 10- to 100-fold molar excess over Xist RNA (Shibata and Lee 2003). However, a recent study suggests that Xist is expressed at an extremely low level prior to XCI and that Tsix is the major RNA component detected at the Xist/Tsix locus in undifferentiated ES cells (Sun et al. 2006). At the onset of cell differentiation, Tsix becomes asymmetrically expressed: Whereas Tsix expression persists transiently on the future active X (Xa), expression ceases on the future inactive X (Xi). The loss of Tsix expression on the future Xi enables the up-regulation and spread of Xist RNA along the chromosome. The persistence of Tsix on the future Xa enables that X to remain active. Once the window for XCI has passed, Tsix is also turned off on the Xa. These results suggest that by controlling the fate of Xist and therefore the X chromosome, Tsix acts as a binary switch for XCI. The reason for this sudden reciprocal expression profile of Xist and Tsix remains unknown. Interestingly, two recent studies have revealed that the Xics transiently colocalize, via the Tsix region, during the onset of XCI, at the time when counting and choice are thought to occur (Bacher et al. 2006; Xu et al. 2006). This "cross-talk" between the Xics is thought to be required for the exchange of information between Xist/Tsix that ultimately results in the monoallelic down-regulation of Tsix and up-regulation of Xist on the inactive X chromosome (Heard and Disteche 2006).
Several mechanisms have been proposed to explain how Tsix regulates Xist (Spencer and Lee 2006). These include (1) a DNA-based mechanism in which DNA sequences at Tsix bind transcription factors that then repress the Xist promoter at long range, or Tsix could also compete with Xist for an enhancer or any other regulatory sequence; (2) a transcription-based mechanism, where antisense transcription across the Xist promoter could interfere with the ability of the Xist promoter to fire by affecting chromatin modification or transcription factor binding; (3) Tsix RNA itself could recruit repressive factors or could form duplex RNA with Xist that would either facilitate the degradation of Xist RNA or prevent binding of necessary silencing factors to Xist RNA. Recent studies have provided clues that suggest either Tsix transcription or Tsix RNA itself has a role in Xist RNA regulation (Spencer and Lee 2006). It has been observed that overexpression of Tsix always results in an active X in cis (Luikenhuis et al. 2001; Stavropoulos et al. 2001). Furthermore, when Tsix RNA is prematurely truncated before it crosses into the Xist gene, Tsix no longer functions as a repressor of Xist, and XCI invariably occurs on the mutated X (Shibata and Lee 2004). It was also proposed earlier that the modulation of Xist chromatin structure might play a role in how Tsix regulates Xist (Navarro et al. 2005; Sado et al. 2005). Interestingly, a recent study has suggested that up-regulation of Xist RNA observed on the future inactive X is not due to the increased stability of the Xist transcript as suggested earlier but is regulated by Tsix (Panning et al. 1997; Sheardown et al. 1997; Sun et al. 2006). Lee and colleagues (Sun et al. 2006) reported that Tsix down-regulation on the future inactive X induces a transient heterochromatic state to Xist, followed by high levels of Xist expression.
This heterochromatic state adopted by the Tsix-deficient chromosome in pre-XCI cells persisted through XCI establishment and reverted to a euchromatic state during XCI maintenance (Sun et al. 2006).

The mouse Xic harbors yet another functional ncRNA gene, called Xite (X-inactivation intergenic transcription elements). Xite is transcribed at low levels, on the order of 10- to 60-fold less than Tsix levels in mouse ES cells. Although there is some bidirectional transcription, the majority of the transcripts are oriented in the same direction as Tsix. Deleting Xite results in preferential silencing of the X in cis, thereby skewing the normally random probability that any one X would be chosen as the silent one (Ogawa and Lee 2003). Xite action does not appear to depend on the RNA per se, because truncation of the RNA does not produce any obvious phenotype, suggesting that transcription from the region could be more important. The monoallelic expression of Xist, at least in mice, is controlled by complex regulation of Tsix and Xite as well as cis-regulatory sequences located in the 3' region of Xist (Heard and Disteche 2006). In the current model, Xite works together with Tsix to designate the Xa where transcription from Xite acts as an enhancer for Tsix by promoting the persistence of Tsix expression during cell differentiation; this in turn prevents the up-regulation and spread of Xist RNA along the chosen Xa (Spencer and Lee 2006). Ftx is another ncRNA gene located ~150 kb upstream of mouse Xist. In mouse and humans, the 5' regions of Ftx are well conserved and contain CpG islands at positions corresponding to the cDNA start sites and are transcribed in opposite orientation relative to Xist/XIST genes (Chureau et al. 2002). Future investigation of a less-characterized ncRNA gene Jpx (Enox) found around the Xic may also show that this gene participates in the regulatory events of XCI (Spencer and Lee 2006).


X-chromosome hyperactivation in Drosophila

Unlike the situation in mammals, dosage compensation in Drosophila is achieved by a twofold up-regulation of transcription of genes on the single X chromosome present in males (Kelley 2004). Intriguingly, the fly dosage compensation system also involves two ncRNAs: roX1 and roX2 (RNA on the X). These RNAs are members of the dosage compensation complex (DCC), a huge RNA–protein complex that binds to hundreds of sites along the male X chromosome in a highly reproducible, banded pattern (Meller 2000; Meller et al. 2000). In addition to roX1 and roX2 RNAs, the DCC also contains a specific set of proteins that include MLE (maleless); MSL1, MSL2, and MSL3 (male-specific lethal 1, 2, and 3, respectively); and MOF (males absent on the first). Mutations in these genes result in male-specific lethality of larvae, and their products are collectively termed MSL proteins. A characteristic feature of the up-regulated X chromosome is the specific acetylation of histone H4 at Lys 16 (H4Ac16) (Akhtar 2003).

The two roX genes are transcribed from the X chromosome, produce polyadenylated nuclear retained transcripts, and are expressed only in male adult flies (Fig. 2). The roX RNAs are functionally redundant even though they have very little sequence homology and are distinct in size (3.7 kb for roX1 RNA vs. 0.5–1.2 kb for roX2 RNA) (Meller and Rattner 2002). Deletion of either roX gene has no effect on males. However, deletion of both results in male lethality. The MSL-binding pattern on the X chromosome is drastically disrupted in the roX1 roX2 double-mutant males, suggesting that roX RNAs are important for correctly targeting the MSL complex to the X (Meller and Rattner 2002). The roX genes could be performing two distinct and separable functions in dosage compensation. First, roX RNAs constitute indispensable elements of the DCC responsible for chromatin modifications. Second, the genes themselves provide strong chromatin entry sites for the MSL complex, possibly to ensure rapid recruitment of the MSL proteins for roX RNA binding. The current model suggests that there are different DNA recognition elements on the X chromosome that have different affinities for the MSL complex: high, intermediate, or weak. High-affinity cis-elements, such as those within the roX genes, would not require additional cis-elements for recruiting MSL complexes, and this interaction is strengthened by roX RNA. Intermediate and weak-affinity cis-elements might require several cis-elements for robust binding, resulting in the ability to attract partial MSL complexes (Oh et al. 2004).
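The functional redundancy of roX1 and roX2 described above amounts to a logical OR: males are affected only when both genes are deleted. The following sketch tabulates that reading. It is a deliberately coarse Boolean simplification with hypothetical names, and it does not model the MSL-binding defects or any partial phenotypes.

```python
from itertools import product


def male_viable(rox1_intact: bool, rox2_intact: bool) -> bool:
    """Simplified reading of the text: one intact roX gene is sufficient."""
    return rox1_intact or rox2_intact


# Enumerate all four genotypes (intact/deleted for each roX gene).
truth_table = {
    (rox1, rox2): male_viable(rox1, rox2)
    for rox1, rox2 in product([True, False], repeat=2)
}

for (rox1, rox2), viable in truth_table.items():
    genotype = f"roX1 {'+' if rox1 else '-'} / roX2 {'+' if rox2 else '-'}"
    print(f"{genotype}: {'viable' if viable else 'male lethal'}")
```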


Figure 2. RoX2 ncRNA (red), visualized by RNA fluorescence in situ hybridization, is localized to the active X chromosome in Drosophila male SL2 tissue culture cells. DNA (blue) is counterstained with DAPI. (Image provided by Polina Gordadze and Mitzi I. Kuroda, Brigham and Women’s Hospital, Boston, MA, USA.)

 
The targeting mechanisms of DCC to X chromosomes between mammals and Drosophila show some superficial similarities. In both cases, ncRNAs are required for targeting the correct chromosome for regulation. Furthermore, in each case there is evidence for spreading of the DCC process long distances along the chromosome from the sites of synthesis of those ncRNAs. However, the major difference between the mammalian and Drosophila systems is that in mammals, the DCC is involved in the inactivation of one of the X chromosomes, whereas in Drosophila, it results in the hyperactivation of the single X. Drosophila and mammals also differ in that Xist is limited to its action in cis, while roX RNAs and the MSL complex can also clearly act in trans (Oh et al. 2004; Heard and Disteche 2006).

Although there is significant evidence to show that ncRNAs are the major effectors of dosage compensation, the molecular basis of how they regulate these processes is still not clearly understood, and the future is likely to reveal many exciting solutions.


Male hypermethylated (MHM) region in birds

In birds, sex determination and differentiation depend on the sex chromosomes Z and W. Males have two Z chromosomes, whereas females are determined by the ZW karyotype. One of the genes proposed to play a role in sex determination in birds is a homolog of human DMRT1 (doublesex and mab-3-related transcription factor) implicated in testis differentiation. DMRT1 has been mapped to the Z chromosome, and its elevated expression in males has been found to correlate with testis development (Smith et al. 2003). An MHM region was identified in the Z chromosome in the vicinity of the DMRT1 gene, and the CpG islands in this region are hypermethylated only in males. However, in females, the MHM region is hypomethylated, and transcription from this region produces ncRNAs (the longest transcripts are ~9.5 kb), most of which are nonpolyadenylated and accumulate at or very close to the sites of transcription and close to the DMRT1 locus. The female-specific MHM ncRNAs are suggested to play a role as transcriptional repressors of the DMRT1 locus similar to the role played by Xist RNA in XCI (Teranishi et al. 2001; Szymanski and Barciszewski 2003).


    Roles of ncRNAs in genomic imprinting: one is enough
 
Diploid organisms usually express both alleles of an active gene. However, in marsupial and placental mammals, some genes express only one of the alleles, a phenomenon termed "genomic imprinting." Genomic imprinting is a process whereby the expression of an allele depends on whether it is derived from the mother or father (Bartolomei and Tilghman 1997; Verona et al. 2003). Genomic imprinting was first discovered on the X chromosome, where Sharman and colleagues (Richardson et al. 1971; Sharman 1971) described a form of XCI in marsupials in which the paternal X chromosome is preferentially silenced. Other than in mammals, genomic imprinting has also been identified in angiosperm plants and in a few insects (Braidotti et al. 2004). Recent studies have shown that >100 genes are imprinted in mammals, either "paternally imprinted" (the gene is silent on the paternal allele) or "maternally imprinted" (the gene is silent on the maternal allele). The imprinted genes generally exist in clusters on various chromosomes, suggesting that the mechanism to control imprinted expression acts on the chromosomal domains rather than on individual genes. Interestingly, these imprinted clusters often are associated with imprinted ncRNA genes. Expression of the ncRNA from one of the alleles often correlates with the repression of the linked protein-coding gene on the same allele (O’Neill 2005). This reciprocal parental-specific expression of imprinted mRNAs and ncRNAs has long been suggested to indicate that ncRNAs play a role in silencing the mRNA genes in an imprinted cluster (Pauler and Barlow 2006). Some of the imprinted loci in mammalian cells where the presence of ncRNAs is well documented are described below.


IGF2/H19 locus

The Igf2/H19 domain is perhaps the best characterized of any autosomally imprinted locus (human 11p15.5 and mouse distal 7b). The first imprinted ncRNA locus to be discovered, the H19 gene produces a spliced and polyadenylated ncRNA transcript of ~2.3 kb that is expressed only from the maternal allele (Brannan et al. 1990). H19 is the reciprocally imprinted partner of Igf2 (insulin-like growth factor 2), and Igf2 is expressed only from the paternal allele. Mutations disrupting the imprinted expression of Igf2 underlie a substantial proportion of cases of congenital growth disorder. Interestingly, in the Igf2/H19 domain, imprinting is achieved through "enhancer competition" mediated by a set of chromatin insulators. Igf2 and H19 share a set of enhancers, but only one gene can engage the enhancers at any time, a choice regulated by an insulator sequence that lies just upstream of the H19 promoter (Webber et al. 1998; Kanduri et al. 2000; Kaffer et al. 2001). On the maternal chromosome, the insulator sequence is not methylated and, therefore, binds CCCTC-binding factor (CTCF), a vertebrate insulator protein. Binding of CTCF prevents the enhancers from engaging the Igf2 gene and together with the enhancers also trans-activates H19. However, on the paternal chromosome, the insulator sequence is methylated and, therefore, cannot bind the methylation-sensitive CTCF, allowing the enhancer to engage Igf2 (Engel and Bartolomei 2003). In this way, the insulator sequence upstream of H19 comprises an "imprinting center" that regulates the reciprocal expression of H19 and Igf2. Although H19 is conserved among mammals and highly expressed in embryos, studies carried out over the last 15 yr indicate that the H19 transcript itself has no apparent role in the imprinted expression of its neighboring genes (Jones et al. 1998) and is also not necessary for normal development in mice (Ripoche et al. 1997).
The chromosomal region containing H19 has also been associated with tumor suppressor activity, and the expression pattern of H19 RNA in several cancer cell types differs from neighboring nonmalignant cells (see "Regulatory RNAs Implicated in Complex Diseases: Dark Side of RNA" below). In addition to H19, other ncRNAs emanating from the Igf2/H19 region have been identified, some of which show imprinting while others are expressed biallelically; however, their functional significance has yet to be determined (Moore et al. 1997Go; Drewell et al. 2002bGo).


KCNQ1 locus

As with the closely linked Igf2/H19 cluster, the KCNQ1 locus is closely associated with human Beckwith-Wiedemann syndrome (BWS), a syndrome characterized by parental asymmetric overgrowth, enlarged tongue, and cancers such as Wilms’ tumor (Szymanski and Barciszewski 2003; O’Neill 2005). The inheritance of BWS is exceptionally complex because the etiology of the disease involves multiple genes in both the KCNQ1 and the Igf2/H19 domains. Interestingly, almost all of the imprinted genes in the KCNQ1 domain are maternally expressed except the paternally expressed ncRNA gene Kcnq1ot1 (Lit1), the antisense counterpart of Kcnq1 (Mitsuya et al. 1999; Umlauf et al. 2004). The antisense Kcnq1ot1 gene appears to be critical for establishing the imprinted profile of the nearby genes (Mancini-Dinardo et al. 2006). Recent studies suggest that the Kcnq1ot1 RNA does so by the recruitment of chromatin changes to the imprinted domain, including H3K9 methylation and H3K27 methylation (Lewis et al. 2004; Umlauf et al. 2004). The Kcnq1ot1 promoter lies within a differentially methylated region of the Kcnq1 gene body and is now known to make up the imprinting center for the BWS domain (Spencer and Lee 2006). Deleting the Kcnq1ot1 CpG island (5' end) results in loss of imprinting in mice, and either the Kcnq1ot1 RNA or transcription through its entire length is required in cis for imprinting of neighboring genes (Cleary et al. 2001; Thakur et al. 2004; Mancini-Dinardo et al. 2006). A transgenic mouse producing a truncated Kcnq1ot1 transcript exhibited correct imprinting but did not silence any of the flanking mRNA genes in the imprinted cluster (Mancini-Dinardo et al. 2006; Pauler and Barlow 2006). Interestingly, the most common abnormalities in BWS are epigenetic, involving abnormal methylation of H19 or Kcnq1ot1. Recently, microdeletions in either the H19 or the Kcnq1ot1 gene have been shown to be associated with BWS, providing genetic confirmation of the importance of this chromosomal region for the disease (Costa 2005).


Igf2r (insulin-like growth factor type-2 receptor)/Air (Antisense Igf2r RNA)

The Igf2r/Air locus (proximal chromosome 17) in mice provides yet another example of ncRNA regulation within imprinted loci. A differentially methylated region-2 (DMR2) within the second intron of Igf2r constitutes a critical, bidirectional element controlling silencing of the paternal allele of three protein-coding imprinted genes, Igf2r, Slc22a2, and Slc22a3 (Zwart et al. 2001). DMR2 resides in a promoter that drives the transcription of a nonprotein-coding antisense transcript, Air, which partially overlaps with Igf2r. Air is an ~108-kb, capped, polyadenylated ncRNA and is transcribed exclusively by RNA pol II from the paternal allele (Wutz et al. 1997; Braidotti et al. 2004; Seidl et al. 2006). The majority of Air transcripts evade cotranscriptional splicing, resulting in mature unspliced, highly unstable nuclear transcripts (Seidl et al. 2006). Like Kcnq1ot1, the Air gene is responsible for the bidirectional silencing of neighboring genes in cis, as deleting the Air CpG island results in loss of parental silencing across the entire domain (Wutz et al. 1997; Zwart et al. 2001). The silencing of these three genes depends on the unmethylated CpG islands and transcription of Air RNA. Because Air RNA does not overlap with two of the three imprinted genes in the domain (Slc22a2 and Slc22a3), Air RNA cannot work through double-stranded RNA (dsRNA) mechanisms, but because truncating Air RNA leads to a disruption of imprinting, its transcription and/or the RNA itself may be required for imprinting (Sleutels et al. 2002; Spencer and Lee 2006). A suggested mechanism of Air action involves two steps. First, Air expression results in the silencing of the overlapping Igf2r by promoter occlusion or cis-acting RNA interference (RNAi). Second, this could induce a silent chromatin state that would spread and shut off flanking genes. However, studies by Barlow and colleagues (Sleutels et al. 2003) showed proper imprinting of Slc22a2, Slc22a3, and also Air in mice that lack Igf2r, suggesting that the antisense mechanism followed by spreading of silencing may not be the only mechanism responsible for Igf2r/Air locus imprinting. Alternatively, Air RNA could recruit chromatin modifier proteins to specific regions of the imprinted locus in a manner similar to the role suggested for Xist RNA (Sleutels et al. 2003). Consistent with this, Igf2r exhibits allele-specific histone modifications (Fournier et al. 2002). However, RNA FISH analysis using specific probes against Air RNA did not show coating by Air RNA of the imprinted chromosomal region (Braidotti et al. 2004).


Prader-Willi/Angelman syndrome (PWS/AS) locus

PWS and AS are the result of disrupted expression of imprinted genes covering a >4-Mb region of human 15q11–13 (mouse proximal 7). The PWS/AS locus in humans provided the first example of an imprinted disorder when it was discovered that uniparental disomies (the inheritance of both chromosome copies from a single parent) of chromosome 15 result in an assemblage of congenital problems (O’Neill 2005). Maternal disomies result in PWS, whereas paternal disomies result in AS. PWS is exemplified in newborns by hypotonia, hypogonadism, and various degrees of mental retardation and feeding difficulties, followed later in childhood by hyperphagia (Cassidy et al. 2000). PWS is a contiguous gene disorder manifested by loss of expression of a group of paternally transcribed protein-coding genes including SNURF/SNRPN, MKRN3, MAGEL2, and ZNF127 (O’Neill 2005). IPW (Imprinted in Prader-Willi) was isolated as a novel imprinted ncRNA gene from the PWCR (Prader-Willi chromosome region) that produces a spliced and polyadenylated ncRNA (Wevrick et al. 1994). The same locus also codes for another ncRNA gene, ZNF127AS, an antisense gene to ZNF127 expressed in brain and lungs (Jong et al. 1999). AS is characterized by ataxic gait, jerky arm movements, inappropriate laughter, and severe mental retardation (Williams et al. 1995). Loss-of-function mutations in a maternally transcribed gene at this locus, UBE3A, can cause AS (Albrecht et al. 1997; Kishino et al. 1997). The paternal silencing of UBE3A is confined to specific brain subregions; elsewhere it is biallelically expressed (Rougeulle et al. 1997; Vu and Hoffman 1997). Additionally, there is paternal-specific expression of a large, alternatively spliced antisense transcript (UBE3A-ATS), spanning ~450 kb in human and ~1 Mb in mice. Deleting the 5' end of this long antisense transcript results in increased expression of UBE3A on the paternal chromosome (Chamberlain and Brannan 2001). Although no role has been ascribed to the large UBE3A antisense transcripts, it has been proposed that these RNAs may be directly linked to the etiology of the diseases (Rougeulle et al. 1998; Runte et al. 2004). A second maternal-specific transcript from this region, ATP10C, has also been implicated in the AS phenotype (Meguro et al. 2001). The PWS/AS locus also contains several clusters of snoRNAs (C/D-box snoRNAs) expressed exclusively from the paternal chromosome. Interestingly, many of these snoRNA genes that overlap UBE3A on the opposite strand were shown to be overexpressed in AS patients (Runte et al. 2001).


GNAS locus

Transcription of genes at the GNAS imprinted locus (human 20q13 and mouse distal 2) is exceptionally complex. The core gene of this locus is GNAS, which is expressed ubiquitously and biallelically in all but a few tissues. It encodes Gsα, the α-subunit of the heterotrimeric G-protein complex. Constitutive activating mutations in Gsα give rise to McCune-Albright syndrome, characterized variably by café-au-lait spots, gonadotropin-independent sexual precocity, and fibrous dysplasia of bone (Schwindinger et al. 1992). In certain hormone-targeted tissues (renal proximal tubules, gonads, and thyroid in humans), GNAS is transcribed predominantly from the maternal allele. NESP55, encoding a chromogranin-like neurosecretory protein, is also maternally expressed. Unusually, NESP55 incorporates exons 2–13 of GNAS into its 3' untranslated region (UTR). The ncRNAs transcribed from this locus include NESPAS, a spliced antisense transcript, and a truncated ncRNA transcript expressed from the GNAS locus by alternative promoter usage. A recent report implicates a possible role for the NESPAS transcript in the transcriptional control of GNAS (Bastepe et al. 2005). NESPAS RNA expression could repress NESP55 by promoter occlusion, localized heterochromatinization, or competition for shared transcription factors (Wroe et al. 2000).

Study of the molecular elements that combine to initiate and maintain the imprint and translate it into monoallelic expression has suggested a critical role of ncRNAs in governing gene silencing. Better insight into the mechanism of ncRNA action on the imprinted loci will provide an important paradigm for understanding genomic imprinting.


    Intergenic transcripts: sense in reading between the genes
 
A large proportion of transcripts from eukaryotic genomes correspond to intergenic transcripts and antisense transcripts. The intergenic transcription units produce ncRNAs of variable sizes that are not well conserved across the phyla (Babak et al. 2005). Although the exact functions of these RNAs have not been validated, their functions are likely linked to transcription-dependent mechanisms rather than being RNA-dependent per se. There are already several examples of intergenic transcription associated with developmentally regulated genes, which play important roles in the coordination of gene expression. The better-documented sites of intergenic transcription include the following.


Mammalian beta-globin locus

In humans, the 70-kb beta-globin locus consists of five erythroid-specific genes: embryonic (ε), fetal (Gγ and Aγ), and adult (δ and beta), whose expression is under the control of the beta-LCR (locus control region). Analysis of nascent transcripts from the beta-globin gene cluster revealed that both intergenic regions and LCR constitutively produce specific ncRNAs (Ashe et al. 1997). Both LCR and intergenic transcripts originate from the same strand as other globin genes and are retained in the nucleus (Ashe et al. 1997). Expression of ncRNA transcripts from the LCR and intergenic regions is restricted primarily to erythroid cells. Interestingly, transient expression of globin genes in nonerythroid cells can induce transcription from the intergenic region without activating the protein-coding domains (Ashe et al. 1997). An explanation for the production of intergenic transcripts from the LCR has been suggested by a "tracking model." According to this model, erythroid-specific and ubiquitous transcription factors and cofactors form complexes with the LCR and track along the locus. When this transcription complex encounters the basal transcription machinery, located at the promoter, transcription of the gene is initiated (Q. Li et al. 2002). During this process, there is a high probability that intergenic transcripts would arise from the cryptic start sites along the locus. It has been proposed that these intergenic transcripts might facilitate the recruitment of trans-acting factors and RNA pol II to the promoters of globin genes via this tracking mechanism (Tuan et al. 1992). Alternatively, intergenic transcription may be required for the establishment and maintenance of an open chromatin conformation within the globin locus (Gribnau et al. 2000; Plant et al. 2001). However, the persistence of DNase I hypersensitivity following deletion of the LCRs in cell lines argues against this role (Epner et al. 1998; Reik et al. 1998). Similarly, Haussecker and Proudfoot (2005) did not observe a positive correlation between intergenic transcript abundance and chromatin activation and/or globin gene expression. Instead, this study suggested that intergenic transcription at the beta-globin locus mediates the formation of silent chromatin in the absence of erythrocyte-specific transcription factors (Haussecker and Proudfoot 2005).


IL-4/IL-13 gene cluster

During differentiation of naive CD4+ precursors to T helper 1 (Th1) or Th2 effector cells, several epigenetic changes occur in a lineage-specific manner at the IFNγ or IL4/IL13 loci. Upon activation, a subset of Th2 cells involved in humoral immune responses express IL-4 and IL-13 genes located in tandem on human chromosome 5q (chromosome 11 in mouse) (Frazer et al. 1997). This cluster is flanked by two constitutively expressed genes: Rad50 and Kif3a. Transcription analysis from this intergenic region in CD4+ T cells has revealed the presence of a 130- to 260-nt polyadenylated nuclear retained ncRNA. Studies in a mouse transgenic model have revealed that the intergenic transcription is restricted to tissues and lineages in which IL-4 and IL-13 are expressed and is up-regulated upon Th2 differentiation (Rogan et al. 2004). However, these intergenic transcripts are constitutively expressed even in the absence of active IL genes, implying that they are derived from independent transcription units. Although the role of these intergenic transcripts is not clear, one possible explanation is that they result from the chromatin remodeling activity at this locus (Takemoto et al. 2000). Consistent with this idea, the differentiation of Th2 cells was found to be associated with hyperacetylation of histone H3 and hypomethylation of the CpG islands (Yamashita et al. 2002). Another example of intergenic transcription in a lineage-specific gene cluster has been described at the MHC class II locus (Masternak et al. 2003).


Intergenic transcripts from the Dlx-5/6 region

Vertebrate Dlx genes are members of the homeodomain protein family that play critical roles in differentiation and migration of neurons as well as craniofacial and limb patterning during development (Feng et al. 2006 and references therein). The Dlx genes are expressed in bi-gene clusters, and conserved intergenic enhancers have been identified for the Dlx-5/6 and Dlx-1/2 loci (Zerucha et al. 2000; Ghanem et al. 2003). One of the two conserved intergenic regions from mouse, the Dlx-5/6 region transcribes two ncRNAs, Evf-1 and Evf-2, the latter being the alternatively spliced form of Evf-1 (Kohtz and Fishell 2004; Feng et al. 2006). Evf-1 is a 2.7-kb polyadenylated RNA, and its expression is developmentally regulated (Kohtz and Fishell 2004). The Evf-2 ncRNA (3.8 kb) specifically cooperates with the homeodomain protein Dlx-2 to increase the transcriptional activity of the Dlx-5/6 enhancer region in a target- and homeodomain-specific manner. Interestingly, a stable complex containing the Evf-2 ncRNA/Dlx-2 homeodomain protein forms in vivo in the nucleus (Feng et al. 2006). Together, these data suggest that the Evf-2/Dlx-2 complex stabilizes the interaction between Dlx-2 and target Dlx-5/6 enhancer sequences to increase transcriptional activity. The role of Evf-2 as a transcriptional activator suggests the possibility that a subset of such vertebrate ultraconserved regions may function at the RNA level as key developmental regulators.


Bithorax complex (BX-C) in Drosophila

In Drosophila, the homeotic genes encoded by the BX-C are involved in specifying the segmentation of the embryo and determining the body plan (Lewis 1978). The correct spatial and temporal expression of the three protein-coding genes Ultrabithorax (Ubx), Abdominal-A (Abd-A), and Abdominal-B (Abd-B) is crucial for the development of thoracic and abdominal segments. The expression pattern of Abd-A and Abd-B depends on an array of regulatory elements located in the intergenic regions between these genes, including seven genetically defined infra-abdominal (iab-2–8) domains, and mutations in this region are associated with developmental defects affecting abdominal segments (Sanchez-Herrero and Akam 1989). The iabs are transcribed exclusively in the embryos. A systematic examination of the distribution of these intergenic transcripts from the iab regions revealed that they show highly specific localization along the anterior–posterior axis of the blastoderm embryo and that the transcripts are restricted to the nucleus (Bae et al. 2002). The intergenic transcripts originating from iab-4 comprise 1.7-kb and 2.0-kb polyadenylated ncRNAs that are transcribed in the opposite direction to Abd-A (Cumberledge et al. 1990). Alteration of transcription in one iab subdomain induces a homeotic transformation of the more posterior segment under its control, suggesting that intergenic transcription plays a crucial role in iab activity (Drewell et al. 2002a). Intergenic transcription from the iab regions has also been proposed to play a role in the activation of cis-regulatory elements by interfering with the Polycomb-repressing complex, responsible for silencing the homeotic genes (Bender and Fitzgerald 2002; Hogga and Karch 2002). The iab-4 region contains a single ~100-nt pre-miRNA hairpin structure that encodes two stable miRNAs: mir-iab-4-5p and mir-iab-4-3p (Aravin et al. 2003). Recent studies revealed that these miRNAs regulate Ubx activity in vivo (Stark et al. 2003; Grun et al. 2005; Ronshaugen et al. 2005).

Intergenic transcription within the BX-C is not limited to the iab regions but also has been reported for the bithoraxoid (bxd) region (Lipshitz et al. 1987). This region exhibits active transcription twice: once early in embryogenesis and once in later larval and adult stages. The early transcripts (1.1–1.3 kb, processed from a 26-kb precursor) appear to be ncRNAs, whereas the late transcripts (0.8 kb) can be translated to produce a protein (Lipshitz et al. 1987). Recently, an elegant study by Sauer and colleagues (Sanchez-Elsner et al. 2006) provided direct evidence of the role of intergenic transcripts from the Ubx region in epigenetic activation of gene expression. The Ubx locus contains multiple cis-regulatory elements known as trithorax response elements (TRE) that recruit transcriptional activators such as the trithorax group (trxG) of epigenetic regulators. Interestingly, the same DNA elements can also act as repressor-binding sites, Polycomb response elements (PRE), and facilitate the recruitment of members of the Polycomb (PcG) complex. It has previously been shown that intergenic transcription of ncRNAs from TRE/PRE elements switches a silent PRE to a TRE, which indicates that TRE/PRE transcription plays an important role in epigenetic activation (Lipshitz et al. 1987; Rank et al. 2002; Schmitt et al. 2005). Sanchez-Elsner et al. (2006) further showed that these intergenic transcripts from the TRE at the Ubx locus mediate transcriptional activation of Ubx by recruiting the epigenetic regulator Ash1 to the TRE elements. Ash1 is a histone methyltransferase (HMT) that promotes transcriptional activation by trimethylating H3K4, H3K9, and H4K20 (Beisel et al. 2002) and is essential for the tissue-specific expression of Ubx (Beisel et al. 2002 and references therein). In this way, the ncRNA transcripts derived from the TRE serve as an intermediary between the TRE DNA elements of Ubx and the Ash1 protein (Sanchez-Elsner et al. 2006). These data further support a model in which an intergenic ncRNA transcribed from the TRE of Ubx is retained at the TRE through DNA–RNA interactions and plays an important role in providing an RNA scaffold that is recognized by Ash1.


SRG1 in Saccharomyces cerevisiae

Unlike the above examples in which intergenic transcription is involved in the transcriptional activation of the corresponding region, studies in the budding yeast S. cerevisiae have revealed the role of intergenic transcription in transcriptional repression (Martens et al. 2004, 2005). Transcription of the intergenic ncRNA gene SRG1 (SER3 regulatory gene 1) across the promoter of the adjacent SER3, a serine biosynthetic gene, represses the transcription of SER3 by transcriptional interference (Martens et al. 2004). SRG1 transcription is regulated by serine such that in the presence of serine, the serine-dependent activator Cha4 binds to the SRG1 promoter and activates its transcription, thereby negatively regulating the expression of SER3 (Martens et al. 2005). These studies demonstrate an example where intergenic transcription provides a mechanism for a single protein, Cha4, to simultaneously activate and repress opposing pathways.

The ever-growing list of intergenic transcripts located mostly in the nonprotein-coding regions of the genome has highlighted the importance of intergenic transcription in regulating gene activity. It also suggests that the high proportion of nonprotein-coding regions in the eukaryotic genome is probably not due to the accumulation of nonsense DNA but rather represents the evolution of more complicated gene regulatory mechanisms (Schmitt and Paro 2004).


    Natural antisense transcripts (NATs): new players in the gene regulatory network
 
Computational analysis of data from large-scale sequencing projects has revealed a surprising abundance of NATs in several eukaryotic genomes (Lehner et al. 2002; Lavorgna et al. 2004). More than 2500 NATs have been identified in humans, of which >1600 are predicted to be true NATs (Yelin et al. 2003). Recent genome-wide analyses suggest that as many as 15%–25% of human genes might be involved in antisense transcription (http://www.narna.ncl.ac.uk). Similar analyses in other organisms including mouse have revealed a large number of NATs (Kiyosawa et al. 2003, 2005; Lavorgna et al. 2004; Katayama et al. 2005). NATs are RNAs containing sequences that are complementary to other endogenous RNAs. They can be transcribed in cis from opposing DNA strands at the same genomic locus (cis-NATs) or in trans from separate loci (trans-NATs). In human tissues, the sense–antisense pairs tend to be coexpressed and/or inversely expressed more frequently than expected by chance, and this expression pattern tends to be evolutionarily conserved (Chen et al. 2005). NATs have been implicated in many levels of eukaryotic gene regulation including translational regulation, genomic imprinting, RNAi, alternative splicing, XCI, RNA editing, and gene silencing (Kumar and Carmichael 1997; Lavorgna et al. 2004). Even though the eukaryotic genome contains a large number of NATs, our understanding of how antisense transcription regulates gene expression remains largely incomplete. The regulation of gene expression by NATs can occur through multiple mechanisms, as shown below.


Transcriptional interference

Transcription by RNA pol II involves both large protein complexes and the unwinding of the duplex DNA. It is unlikely that two overlapping transcriptional units could be transcribed concomitantly by the RNA pol II machinery. Such effects have been well studied with respect to the GAL10 and GAL7 genes in S. cerevisiae (Prescott and Proudfoot 2002). When arranged convergently, but not overlapping, both genes are transcribed at normal levels. However, when the two transcription units overlap, steady-state mRNA levels are severely reduced due to an inhibition of transcription elongation, suggesting that the expression of cis-NAT partners could be tightly regulated through a process of competitive transcriptional interference. Under such circumstances, cis-NATs might be expected to exhibit reciprocal expression, which holds true for many of the antisense partners in the eukaryotic genome (http://www.narna.ncl.ac.uk).

An antisense transcript that may function as a negative regulator of gene expression by transcriptional interference has been identified in plants (Kapranov et al. 2001). In the legume Lotus japonicus, the expression of the late nodulin LjNOD16 gene is controlled by a bidirectional promoter located within an intron of the gene LjPLP-IV (LjPLP-IV encodes a phosphatidylinositol transfer-like protein). Transcription from the opposite strand gives rise to an antisense transcript responsible for the control of LjPLP-IV expression in root nodules, where its level is significantly lower than in flowers (Kapranov et al. 2001). Similarly, during XCI, it was suggested that the Tsix transcripts regulate the asymmetric expression of Xist by an antisense mechanism (Lee et al. 1999; Sun et al. 2006). However, this mechanism of transcriptional interference cannot fully explain the repressive effect of the Air/Igf2r and KCNQ1 loci, as genes outside of the region of overlapping antisense Air and Kcnq1ot1 are also transcriptionally repressed.


RNA masking

Formation of RNA duplexes between sense and antisense transcripts might mask key regulatory features within either transcript, thereby inhibiting the interaction of important trans-acting factors. This form of steric inhibition could affect any step in gene expression involving protein–RNA interactions, including pre-mRNA processing, transport, translation, and degradation. An example of this method of antisense regulation is the inhibition of alternative splicing induced by the Rev-ErbAα transcript in different B-cell lines, which overlaps one of two functionally antagonistic splice forms of the thyroid hormone receptor ErbAα2 mRNA (Hastings et al. 1997, 2000). An antisense RNA-based mechanism has also been shown to be responsible for the regulation of the human HFE gene, which is implicated in iron metabolism and involved in the human inherited disorder hereditary hemochromatosis (Thenie et al. 2001). Although there is no direct evidence for the role of the HFE antisense transcript in vivo, in vitro studies </