Elevated FOXG1 and SOX2 in glioblastoma enforces neural stem cell identity through transcriptional control of cell cycle and epigenetic regulators

Bulstrode et al. show that increased expression of FOXG1 and SOX2 restricts astrocyte differentiation and can trigger dedifferentiation to a proliferative neural stem cell state. FOXG1 and SOX2 operate in distinct but complementary roles to fuel unconstrained self-renewal in glioblastoma stem cells via transcriptional control of core cell cycle regulators and epigenetic targets.

astrocyte marker Gfap, respectively. (D) Schematic representations of domains in the FOXG1 and SOX2 proteins. Overall protein sequence homology of mouse Foxg1 to human FOXG1 is 97%. The annotated forkhead binding domain, groucho binding domain (GBD) and jarid binding domain (JBD) demonstrate 100% protein homology between mouse and human. Overall protein sequence homology of mouse Sox2 to human SOX2 is also 97%. The HMG binding domain protein sequence is 100% homologous between mouse and human, and the transactivation domains are identical save for serine-295 which is substituted for alanine in the mouse. (E) and (F) Schematics of Tet-on V5 epitope-tagged FOXG1 (E) or SOX2 (F) transgene cassettes used to establish clonal ANS4 derivatives F6 and S15 by piggyBac recombination. (TRE, TET-responsive element; V5, V5 epitope tag; PB, piggyBac; BSD, blasticidin resistance; IRES, internal ribosome entry site).
Time-lapse imaging of FS3 cells plated at clonal density treated with BMP then exposed to growth factors + Dox (1 second = 12 hours). A single cell is seen to reenter cycle at the 48 h time point, then begin dividing with a cycle time of ~24 h.

Guide RNA design and cloning
Guide RNA sequences were selected using the ZiFiT tool (zifit.partners.org/ZiFiT) to bind adjacent to instances of the PAM sequence occurring within gene open reading frames, and to have low predicted off-target binding.
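The first step of this selection, locating candidate target sites next to a PAM, can be sketched in a few lines of Python. This is only an illustration and not the ZiFiT algorithm: it assumes the 5′-NGG-3′ PAM of S. pyogenes Cas9 and a 20-nt protospacer (neither is stated above), the function names are hypothetical, and no off-target scoring is attempted.

```python
import re

def revcomp(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def find_protospacers(orf, guide_len=20):
    """List (strand, start, protospacer, pam) for every NGG PAM with room
    for a guide of guide_len bases immediately 5' of it.

    ASSUMPTION: NGG PAM and 20-nt guides. For the "-" strand, start is the
    offset within the reverse-complemented sequence, not the input.
    """
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % guide_len)
    hits = []
    for strand, seq in (("+", orf.upper()), ("-", revcomp(orf.upper()))):
        # The zero-width lookahead lets overlapping sites all be reported.
        for m in pattern.finditer(seq):
            hits.append((strand, m.start(), m.group(1), m.group(2)))
    return hits
```

Each hit would still need to be ranked for predicted off-target binding, which is the part the selection tool provides.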
Oligonucleotides were annealed to produce double-stranded guide inserts with 4-base overhangs for ligation into a U6 expression plasmid backbone (gift from S. Gerety, Sanger Institute). The guide inserts were phosphorylated, and the U6 vector (SP 117) was digested with BsaI to generate matching overhangs. Backbone and insert were ligated with T4 DNA ligase (Fermentas) according to the manufacturer's guidelines.
The resulting plasmids were checked by restriction digest (NheI/EcoRI) to confirm incorporation of an insert and then verified by DNA sequencing.
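As an illustration of how an oligonucleotide pair yields a double-stranded insert with 4-base overhangs, the sketch below builds the two oligos for a given protospacer. The overhang sequences used here (CACC and AAAC) are placeholders borrowed from commonly used guide-cloning vectors; the overhangs actually produced by BsaI digestion of the U6 vector above are not given in the text, and the function name is hypothetical.

```python
def revcomp(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def guide_oligos(protospacer, top_overhang="CACC", bottom_overhang="AAAC"):
    """Return (top, bottom) oligos, both written 5'->3'.

    ASSUMPTION: the default overhangs are placeholders, not the ones
    matching the vector described in the text. When the two oligos anneal,
    the protospacer region is double-stranded and each 4-base prefix is
    left as a single-stranded 5' overhang.
    """
    protospacer = protospacer.upper()
    top = top_overhang + protospacer
    bottom = bottom_overhang + revcomp(protospacer)
    return top, bottom

# Example with an arbitrary 20-nt sequence (not a guide from this study).
top, bottom = guide_oligos("GATTACAGATTACAGATTAC")
```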

Gene targeting and inducible vector construction
TetON 3G inducible expression cassettes were delivered via piggyBac transposition (Guo et al., 2009).
For each targeted locus, two sets of genotyping primers spanning the junction of genomic sequences and targeting vector were used. Gene-specific primers outside each end of the 5′ and 3′ homology arms were used in combination with the appropriate universal cassette primers (either CAG-Blasticidin for targeting Rosa26 and AAVS1 loci, or Ef1α-Puro for knockout experiments). To identify NHEJ-based damage on the second, non-targeted alleles, regions flanking the sgRNA target sites (500-600 bp) were amplified using gene-specific primers and assessed by DNA sequencing (Source Bioscience).

RNA-seq data analysis
Reads were aligned to the mouse reference genome (mm10/GRCm38) augmented with the Foxg1 expression plasmid sequence with STAR 2.5.2a (Dobin et al., 2013) using the two-pass method for novel splice detection (Engström et al., 2013).

Reduced Representation Bisulfite Sequencing data analysis
Adaptors were removed using Trim Galore (v0.4.1, adaptors: AGATCGGAAGAGC and AAATCAAAAAAAC). Diversity bases introduced by the NuGEN Ovation kit to facilitate sequencing were then trimmed using a Python script provided by NuGEN as detailed in the Ovation kit manual. Paired-end alignment to the mouse reference genome (mm10/GRCm38) was then performed using Bismark (v0.16.3, using Bowtie v2.2.6 and parameters: -N 0 -L 20). Read alignments were parsed to quantify methylation levels at CpGs using Bismark (parameters: -p --no_overlap) and bedGraph conversion modules. Custom AWK and Python scripts were then used to combine data from both strands for each CpG and across samples. This generated a report for each observed CpG with read coverage and frequency of observed methylation status. In all cases conversion efficiency as assessed from reads mapping to λ phage was > 98%.
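The custom scripts themselves are not reproduced here. As a minimal sketch of what combining both strands for each CpG can look like, the Python below pools counts from the cytosine on the forward strand and its partner on the reverse strand, which sits one base downstream. The input record layout (chromosome, 1-based position, strand, methylated count, unmethylated count) and the function name are assumptions made for this illustration.

```python
from collections import defaultdict

def merge_strands(records):
    """Pool per-strand cytosine counts into one entry per CpG.

    ASSUMPTION: each record is (chrom, pos, strand, methylated, unmethylated)
    with 1-based positions. A "-" strand call at pos belongs to the CpG whose
    forward-strand cytosine is at pos - 1.
    """
    pooled = defaultdict(lambda: [0, 0])
    for chrom, pos, strand, meth, unmeth in records:
        cpg_start = pos if strand == "+" else pos - 1
        pooled[(chrom, cpg_start)][0] += meth
        pooled[(chrom, cpg_start)][1] += unmeth
    # Report methylated reads and total coverage for each CpG.
    return {
        key: {"methylated": m, "coverage": m + u}
        for key, (m, u) in sorted(pooled.items())
    }

example = merge_strands([
    ("chr1", 100, "+", 8, 2),
    ("chr1", 101, "-", 5, 5),   # same CpG as the record above
    ("chr1", 200, "+", 1, 9),
])
```

Pooling the two strands roughly doubles per-CpG coverage, which matters for coverage filters applied downstream.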
CpGs were tested for differential methylation between NS cells cultured in EGF/FGF versus those treated with BMP for 24 h or 10 days. CpGs with coverage < 10 in any sample analysed in each comparison were excluded from the analysis. CpGs were tested for differential methylation using Fisher's exact tests of total and methylated coverage based on running sums of 5 CpGs. Only those CpGs with p < 0.05 and changing in the same direction in all 3 replicates were considered significant. DMRs were then defined from the analysis of 5-CpG windows containing at least 3 significant CpGs where the inter-CpG distance did not exceed 200 bp.
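The testing and DMR-calling logic described above can be sketched as follows. This is a simplified illustration and not the scripts used in the study: the function names are hypothetical, the running sum is taken over each run of 5 consecutive CpGs (the exact alignment of the window to a CpG is not specified above), and qualifying windows are returned individually without merging.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]],
    e.g. methylated/unmethylated counts in two conditions."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum every table at least as unlikely as the observed one.
    total = sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)

def running_sums(values, width=5):
    """Output i is the sum of values[i:i + width]."""
    return [sum(values[i:i + width]) for i in range(len(values) - width + 1)]

def consistent_direction(deltas):
    """True if the methylation change has the same sign in every replicate."""
    return all(d > 0 for d in deltas) or all(d < 0 for d in deltas)

def call_dmr_windows(positions, significant, width=5, min_sig=3, max_gap=200):
    """Return (start, end) for each run of `width` consecutive CpGs whose
    neighbouring CpGs are at most max_gap bp apart and which contains at
    least min_sig significant CpGs."""
    windows = []
    for i in range(len(positions) - width + 1):
        pos = positions[i:i + width]
        gaps_ok = all(q - p <= max_gap for p, q in zip(pos, pos[1:]))
        if gaps_ok and sum(significant[i:i + width]) >= min_sig:
            windows.append((pos[0], pos[-1]))
    return windows
```

In use, methylated and total coverage would first be passed through `running_sums` for each condition, the summed counts tested with `fisher_exact_two_sided`, and the resulting per-CpG significance flags (after the replicate direction check) handed to `call_dmr_windows`.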
CpGs located in significant differentially methylated regions (DMRs) were associated with the most proximal TSS using ChIPpeakAnno (Zhu et al., 2010) and biomaRt (Durinck et al., 2005) against annotation from Ensembl 87. These were compared to polycomb-marked genes, as profiled for H3K27me3 and H3K4me3+H3K27me3 in mouse ES cells, NS cells and brain (Meissner et al., 2008). Enriched loci were compared to a background of all CpGs located in clusters of 5 CpGs where the inter-CpG distance did not exceed 200 bp (i.e., those that could potentially be included in DMRs). Significance was assessed using Fisher's exact tests. GO terms were checked for enrichment at DMR-associated genes using the Bioconductor packages biomaRt and GO.db to annotate genes to DMRs, and Benjamini-Hochberg corrected Fisher's exact tests for enrichment versus the background set of CpGs as above.
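For readers unfamiliar with the Benjamini-Hochberg correction applied to the GO enrichment p-values, a minimal Python sketch of the standard step-up adjustment is given below. It is a generic illustration with a hypothetical function name, not the Bioconductor code used for the analysis.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, so each adjusted value is the
    # minimum of p * m / rank over this rank and all larger ranks.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Example with made-up p-values (not results from this study).
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.5])
```

A GO term would then be reported as enriched when its adjusted p-value falls below the chosen false discovery rate threshold.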