Polymerase pausing induced by sequence-specific RNA-binding protein drives heterochromatin assembly

In this study, Parsa et al. investigated the mechanisms underlying RNAi-independent heterochromatin assembly by the CTD–RRM protein Seb1 in S. pombe. They show that Seb1 promotes long-lived RNAPII pauses at pericentromeric repeat regions and that their presence correlates with the heterochromatin-triggering activities of the corresponding dg and dh DNA fragments, providing new insight into Seb1-mediated polymerase stalling as a signal necessary for heterochromatin nucleation.


To understand how Seb1 interfaces with the transcription of dg and dh repeats to promote heterochromatin, we employed a previously identified viable, heterochromatin-defective allele, seb1-1 15 . When combined with mutants in the RNAi machinery, seb1-1 eliminates pericentromeric heterochromatin, while the corresponding single mutants decrease H3K9me, indicative of partially redundant pathways 15 . We examined transcription of heterochromatin at single-nucleotide resolution and tested the impact of the seb1-1 allele using Nascent Elongating Transcript Sequencing (NET-Seq) 20 . To analyze the intrinsic transcriptional properties of heterochromatic sequences prior to the establishment of heterochromatin assembly, we used the clr4∆ mutant, which lacks H3K9me and displays full derepression of most silenced chromatin regions. We compared this strain to a clr4∆ seb1-1 double mutant to assess the impact of seb1-1.
We first examined the effect of seb1-1 on transcription of non-heterochromatic regions (Figure 1). Initial inspection revealed numerous genes with a decreased peak density at 5' regions in the double mutant and an increased peak density upstream of annotated cleavage-polyadenylation sites (often called transcription end sites or TESs) (see for p values). These data indicate that the seb1-1 allele leads to decreased RNAPII pausing at gene 5' ends with an associated increase in 3' signal; the latter may be due to polymerase release from upstream pauses.
Our prior RIP-qPCR analysis indicates that Seb1 functions directly in heterochromatin assembly by binding pericentromeric dg and dh repeat transcripts 15

To compare the binding of Seb1 across transcript classes, we computed the fraction of RNA covered by Seb1 PAR-CLIP read clusters. We observed a ~12-fold higher PAR-CLIP cluster coverage for pericentromeric repeat intervals than for coding gene intervals (Figure 2d and Extended Data Figure 4a). Non-coding RNAs display the highest coverage at a mean level ~100-fold higher than that of coding genes (Extended Data
These data reveal detectable Seb1-dependent RNAPII pauses in pericentromeric sequences.
Previous studies identified two segments of pericentromeric DNA that can trigger heterochromatin (L5 24,25 and Frag1 15 , respectively). Frag1 defined a segment that requires both RNAi and Seb1 for its activity 15 . To compare the activity of pericentromeric fragments to their transcription properties described above, we extended this analysis using a system we employed previously 15 . The cen1R region was divided into nine overlapping fragments (Figure 2f and Extended Data Table 3). Each fragment was placed downstream of an adh1+ promoter (padh1 + ) in either forward or reverse orientation, and upstream of a transcription terminator. This insert was then placed downstream of ura4 + (Figure 2g). Silencing of ura4 + was determined using YS-FOA plates, which select for ura4 + repression. The insert of Fragment 1 (Frag1) displays silencing activity; this construct was used previously to isolate the seb1-1 mutant and was shown to require the padh1 + promoter for silencing activity 15 . Three additional fragments exhibit strong silencing activity, and each is functional in only one orientation (Figure 2h; Frag2A, 8S and 9S). Thus, these pericentromeric regions harbor a transcription-dependent, orientation-specific signal capable of triggering silencing.
To examine the relationship of these regions to those that display detectable Seb1-dependent pauses, we identified clusters of NET-seq peaks (see Methods) and computed the total read density of these clusters within each fragment.
A comparison of clr4∆ to clr4∆ seb1-1 strains revealed significant correlation (χ² = 12.6, p < 0.001)
Frag8S displays silencing activity but no detectable Seb1-dependent NET-seq peak clusters (although it does display Seb1-dependent NET-seq signal; Figure 2h and Extended Data Figure 5d). The heterochromatin assembly activity of this fragment may be pause-independent, or the relevant RNAPII pauses may be below the sensitivity of NET-seq. These data indicate that Seb1 directly recognizes dg and dh RNAs and induces detectable pausing in centromere fragments that display silencing activity.
Remarkably, of 13 isolates derived from epe1∆ tfs1 DN parents analyzed by H3K9me ChIP-seq, five separate isolates harbor a distinct ectopic region of heterochromatin, which we termed a Pause-Induced Ectopic heterochromatic Region (PIER) (Figure 3c). PIERs range in size from ~3 to ~15 kb, and each PIER was unique. Three PIERs are bounded by an essential gene on at least one side of the locus, suggesting that selection likely prevents observing PIERs that assemble over essential genes (Figure 3d).
To determine if H3K9me at PIERs leads to repression, we conducted RNA-seq analysis on epe1∆ tfs1 DN isolate 2 (containing PIER 2) (Extended Data Figure 9a and Extended Data Table 4). We observed a significant decrease (p<0.01) in two of the three genes present in PIER2, cta3 + and its8 + (Extended Data Figure 9a and Extended Data Table 4).
TFIIS DN RNAi.
Remarkably, upon integration of tfs1 DN into these two epe1∆ ago1∆ strains, H3K9me2 ChIP-seq revealed establishment of heterochromatin at constitutively silenced loci (e.g. centromeres) and its loss at the clr4 + locus (Extended Data Figure 11b and c; isolates 1, 3, 4, and 5). This observation indicates that pericentromeric repeats harbor the ability to respond to RNAPII pausing and assemble heterochromatin independently of RNAi.
Accompanying these changes, six of the 12 epe1∆ ago1∆ tfs1 DN isolates subjected to whole-genome analysis acquired PIERs (Figure 4; Extended Data Figure 11). Five of these six PIERs are bounded by an essential gene on at least one side of the region, similar to what was seen in the epe1∆ tfs1 DN isolates (Figure 4; PIERs 6-10), consistent with the notion that essential genes limit our ability to observe ectopic heterochromatin. Again, each PIER was unique. Thus, PIERs can be triggered by RNAPII pausing even in cells lacking a functional RITS complex.
Our results indicate that Seb1, a conserved RNAPII-associated RNA binding protein that mediates RNAi-independent heterochromatin assembly in S. pombe 15 , is enriched on pericentromeric ncRNA transcripts relative to coding sequences and promotes long-lived RNAPII pauses. Remarkably, pausing is sufficient to trigger ectopic heterochromatin assembly in an RNAi-independent fashion, indicating that this is a relevant activity of Seb1 in promoting heterochromatin assembly. Binding of Seb1 to euchromatic ncRNAs (e.g. snRNAs) is not associated with detectable heterochromatin assembly; this may be due to high levels of transcription, which induce anti-silencing histone marks and histone turnover, both of which antagonize silencing 1 . Alternatively, the repetitiveness of pericentromeric sequences may also contribute to specificity by producing a threshold density of paused polymerases within a discrete genomic interval.
Testing these and other possibilities will require the development of synthetic biology tools that enable the programming of pauses of defined length at defined sites and at defined levels of transcription, a task beyond current technology. Our data are germane to the observation that mutations in the Paf1 complex (Paf1-C), a multifunctional elongation complex that binds cooperatively to RNAPII with TFIIS 39 , enable synthetic hairpin RNAs to trigger heterochromatin in trans and increase heterochromatin spreading in S. pombe 40-42 . While the elongation-promoting activity of Paf1-C has been suggested to limit heterochromatin by limiting targeting of RITS to the nascent transcript 43 , it may also act via RNAi-independent mechanisms, as we find that increased RNAPII pausing can trigger H3K9me independently of RNAi. Seb1-triggered RNAPII pausing may drive heterochromatin assembly by promoting the heterochromatic stalling of replisomes associated with CLR-C through RNAPII-replisome collisions, as proposed 44,45 (Extended Data Figure 12). Analogous concepts have been put forth in S. cerevisiae, where tight protein-DNA interactions are sufficient to trigger recruitment of the SIR complex 46 . Consistent with this hypothesis, such transcription-replication conflicts are limited by Paf1-C 47 , which inhibits heterochromatin assembly, while slowing of replisome progression enhances heterochromatin spread 48,49 . It has also been proposed that the 5'→3' RNA exonuclease Dhp1 (related to S. cerevisiae Rat1/Xrn2), which is required for RNAi-independent heterochromatin assembly, recruits the silencing machinery via a physical interaction with CLR-C 50,51 . Because RNAPII pausing enhances recruitment of Xrn2 52,53 , and Seb1 copurifies with Dhp1 14,19 , Seb1-induced pausing may promote heterochromatin assembly via this mechanism as well (Extended Data Figure 12). Our model also readily accommodates genetic observations that null mutants in S. pombe RNAPII elongation factors suppress the H3K9me defect of RNAi mutants 16,41 , as well as analogous observations for mutants in RNA biogenesis factors 16 , as these factors also promote transcriptional elongation 54 . Weak cleavage-polyadenylation signals, which are predicted to result in accumulation of paused RNAPII at S. pombe Downstream Pause Elements 56 , promote heterochromatin assembly 55 .
Another key factor recruited to pericentromeric regions by Seb1 is remodeling/HDAC complex SHREC 57

Yeast strains, plasmids, and media
A list of all S. pombe strains and plasmids used in this study is provided in Extended Data Table 5. Cells were grown at 30°C in synthetic complete medium (SC) with adenine and amino acid supplements with reduced levels of uracil (150mg/L) for PAR-CLIP, or in Edinburgh minimal medium (EMM) supplemented with adenine, uracil, and the appropriate amino acids with or without thiamine (15µM) for NET-seq and ChIP-seq.

NET-seq
NET-seq experiments were conducted as previously described 63 with minor alterations for S. pombe. S. pombe cultures were grown in 1L EMM without thiamine to an OD 600 of 0.7, harvested via filtration, and flash frozen in liquid nitrogen. Lysis and immunoprecipitation were conducted as previously described 64 . Adaptor ligation was performed using random hexamer-barcoded adaptors. All strains were analyzed in duplicate and sequencing was conducted on a HiSeq 4000 platform.

RNA-seq
Strains were grown in YS media + 3% Glucose overnight to OD 600 = 0.7. Cells were harvested by centrifugation, washed twice with ice-cold water and flash-frozen. Pellets were resuspended in 1ml Trizol (Thermo Fisher Scientific, #15596026

Spotting Assay
Strains were grown overnight to saturation and diluted to an OD 600 of 1. Serial dilutions were performed with a dilution factor of 5. For ura4 + silencing assays, cells were grown on non-selective and 5-fluoroorotic acid (FOA) (2 g/L) YS plates at 30°C for 3 days. For tfs1 DN induction assays, cells were grown on YS, EMM -leu -thiamine, or EMM -leu +thiamine, for 3 days.

NET-seq analysis: Genome alignment
For NET-seq analysis, adapter sequences were removed and reads were flattened to remove sequence duplicates. Barcoded reads were then mapped to the S. pombe genome 68 using BOWTIE 69 with the following flags: -M1 --best --strata. This step identifies and omits reads that were misprimed during the reverse transcription step of NET-seq and thus lack a barcode. Unaligned reads were collected for further analysis.
Barcodes were removed, and the new unique, debarcoded reads were realigned to the genome using the following flags in BOWTIE: -M1 --best --strata.

NET-seq analysis: Cluster finding
High-density regions of NET-seq signal were defined across centromeres and coding regions to compare NET-seq density between genotypes. First, NET-seq peaks were discovered by calculating robust Z-scores (based on median and median absolute deviation) from the log2 transform of the number of reads starting at each position in the defined region (centromere fragment or transcript). Positions with a robust Z-score of at least 2 and at least 10 unique reads were considered peaks. Next, peaks were clustered together using a sliding window (width=50, increment=10). The density of the cluster is calculated as the number of reads in the cluster divided by the size of the cluster in kilobases (kb).
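The peak-calling and density steps described above can be illustrated with a minimal sketch. Several details are assumptions not stated in the text: a pseudocount of 1 before the log2 transform, the 1.4826 scaling of the median absolute deviation, and a simplified clustering rule that merges peaks lying less than 50 nt apart in place of the sliding-window procedure (width=50, increment=10). All function names are hypothetical.

```python
import math
from statistics import median


def robust_z_scores(values):
    """Robust Z-score: deviation from the median, scaled by the MAD."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return [0.0] * len(values)
    # ASSUMPTION: 1.4826 rescales the MAD to be comparable to a standard deviation.
    return [(v - med) / (1.4826 * mad) for v in values]


def call_peaks(counts, min_z=2.0, min_reads=10, pseudocount=1.0):
    """Return positions with robust Z >= min_z and at least min_reads reads."""
    # ASSUMPTION: a pseudocount keeps log2 defined at positions with zero reads.
    logs = [math.log2(c + pseudocount) for c in counts]
    z = robust_z_scores(logs)
    return [i for i, c in enumerate(counts) if z[i] >= min_z and c >= min_reads]


def cluster_peaks(peaks, width=50):
    """SIMPLIFICATION: merge sorted peaks that lie less than `width` nt apart."""
    clusters = []
    for p in peaks:
        if clusters and p - clusters[-1][1] < width:
            clusters[-1][1] = p
        else:
            clusters.append([p, p])
    return [(start, end) for start, end in clusters]


def cluster_density(counts, start, end):
    """Reads in the cluster divided by the cluster size in kilobases."""
    size_kb = (end - start + 1) / 1000.0
    return sum(counts[start:end + 1]) / size_kb
```

For a region held as a list of per-position read-start counts, `call_peaks` returns the peak positions, `cluster_peaks` groups them, and `cluster_density` gives reads per kb for each resulting interval.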
To determine cluster densities for each fragment derived from the right arm of centromere 1, the sum of cluster densities was normalized to the sum of all densities in each sample. Error bars represent the range of two replicates.

NET-seq analysis: Traveling ratio
Traveling ratios were calculated for every non-overlapping annotated transcript at least

NET-seq analysis: Dwell time
Dwell time was determined by normalizing peak height to the average NET-seq signal density of the surrounding 100 nt. NET-seq peaks with at least a two-fold decrease from clr4∆ to clr4∆ seb1-1 in both replicates were considered Seb1-dependent. P-values were determined by KS test.
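As an illustration of the normalization described above, the following sketch computes a dwell-time value for one peak and applies the two-fold criterion across replicates. It assumes that "the surrounding 100 nt" means 50 nt on each side of the peak with the peak position itself excluded, and that the fold-change criterion is applied to a per-peak value paired by replicate; both choices, and the function names, are hypothetical. The KS test is not shown.

```python
def dwell_time(counts, peak_pos, flank=50):
    """Peak height divided by the mean signal of the flanking positions."""
    # ASSUMPTION: window is +/- `flank` nt around the peak, excluding the peak.
    lo = max(0, peak_pos - flank)
    hi = min(len(counts), peak_pos + flank + 1)
    neighbors = [counts[i] for i in range(lo, hi) if i != peak_pos]
    if not neighbors:
        raise ValueError("no flanking positions available")
    background = sum(neighbors) / len(neighbors)
    if background == 0:
        return float("inf")
    return counts[peak_pos] / background


def is_seb1_dependent(control_reps, mutant_reps, min_fold=2.0):
    """True if the value drops by at least min_fold in every replicate pair."""
    return all(m * min_fold <= c for c, m in zip(control_reps, mutant_reps))
```

Here `control_reps` would hold the per-replicate values from clr4∆ and `mutant_reps` those from clr4∆ seb1-1, in matching order.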

RNA-seq analysis
Analysis was performed using TopHat 70 and DESeq2 71 . A transcript was considered changed to a functionally meaningful degree only if its expression level differed by >2-fold in the mutant compared to wild type. Data analysis was performed on 2 replicates per condition.
To determine the fraction of reads derived from the expression of tfs1 DN we divided the total number of reads that specifically aligned to the mutated region of the tfs1 DN allele by the total number of reads (both WT and mutant alleles) that aligned to this same region.
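The allele-fraction calculation above amounts to a simple ratio, shown here as a small worked example. The function name and the read counts in the usage note are hypothetical and are not taken from the data.

```python
def mutant_allele_fraction(mutant_reads, wildtype_reads):
    """Fraction of reads over the mutated region that carry the tfs1 DN allele."""
    total = mutant_reads + wildtype_reads
    if total == 0:
        raise ValueError("no reads cover the mutated region")
    return mutant_reads / total
```

For instance, 30 reads matching the mutant sequence and 10 matching wild type over the same region would give a fraction of 0.75.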

Seb1 PAR-CLIP data analysis by PARalyzer
For PAR-CLIP analysis, adapter sequences were removed and reads were mapped to the S. pombe genome 68 using BOWTIE 69 , allowing for three mismatches with the following flags: -M1 -v3 --best --strata. Seb1 binding site read clusters were identified with PARalyzer 21 . Reads of < 20nt were omitted, and read clusters required at least 10 reads and at least two T→C conversions per cluster to be called as a Seb1 binding site.
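The filtering thresholds above, together with the coverage measure used to compare transcript classes (the fraction of RNA covered by read clusters), can be sketched as follows. This is an illustration only and does not reproduce PARalyzer's interface or internals; the `ReadCluster` type, the half-open (start, end) interval convention, and all function names are assumptions.

```python
from dataclasses import dataclass


@dataclass
class ReadCluster:
    """Hypothetical summary of one PAR-CLIP read cluster."""
    read_count: int
    t_to_c_conversions: int


def keep_reads(reads, min_length=20):
    """Omit reads shorter than min_length nucleotides."""
    return [r for r in reads if len(r) >= min_length]


def is_binding_site(cluster, min_reads=10, min_conversions=2):
    """Apply the read-count and T-to-C conversion thresholds to a cluster."""
    return (cluster.read_count >= min_reads
            and cluster.t_to_c_conversions >= min_conversions)


def covered_fraction(transcript_length, clusters):
    """Fraction of a transcript covered by the union of cluster intervals.

    ASSUMPTION: clusters are half-open (start, end) pairs in transcript
    coordinates; overlapping clusters are counted once.
    """
    covered = 0
    current_end = 0
    for start, end in sorted(clusters):
        start = max(start, current_end)
        end = min(end, transcript_length)
        if end > start:
            covered += end - start
            current_end = end
    return covered / transcript_length
```

Dividing the mean `covered_fraction` of one transcript class by that of another gives the kind of fold difference in cluster coverage reported in the Results.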

ChIP-seq analysis
ChIP-seq analysis was conducted as previously described 67 . Briefly, adaptor sequences from ChIP-seq sequencing libraries were removed and reads <20nt were omitted.
Reads were aligned to the S. pombe genome 68 using BOWTIE 69 with the following flags: -M1 --best --strata. Aligned reads were smoothed over a 1kb window.

ChIP-seq analysis: PIER discovery
H3K9me ChIP-seq peaks were considered as novel ectopic sites of H3K9me if two criteria were met: 1) H3K9me peaks were ≥3-fold higher than the genome background signal in the isolate, and 2) when normalized to the WCE, the H3K9me signal at the peak was ≥3-fold higher than the parental H3K9me ChIP-seq signal. A curated list of genomic regions previously observed to have a propensity to form heterochromatin in various S. pombe backgrounds 35-37 was generated (Extended Data isolates to the whole cell extracts and plotted as a heatmap (Extended Data Figure 8)
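The two criteria for calling an ectopic H3K9me site can be written as a short predicate. This is a sketch under stated assumptions: the genome background is supplied as a single number for the isolate (how it was estimated is not specified here), the parental signal is normalized to its own WCE in the same way as the isolate signal, and all parameter names are hypothetical.

```python
def is_ectopic_h3k9me_peak(isolate_chip, isolate_wce,
                           parent_chip, parent_wce,
                           isolate_background, min_fold=3.0):
    """Return True if a peak satisfies both PIER discovery criteria."""
    # Criterion 1: peak is >= min_fold above the isolate's genome background.
    over_background = isolate_chip >= min_fold * isolate_background
    # Criterion 2: WCE-normalized signal is >= min_fold above the parent.
    # ASSUMPTION: the parental signal is normalized to its own WCE.
    isolate_norm = isolate_chip / isolate_wce
    parent_norm = parent_chip / parent_wce
    over_parent = isolate_norm >= min_fold * parent_norm
    return over_background and over_parent
```

A peak must pass both tests; a region that is already enriched in the parent fails the second criterion even when it stands well above background in the isolate.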

Data sets
All available sequencing data sets are listed in Extended Data Table 7.

Extended Data Figure 11. PIER formation is RNAi-independent and expression of