Activin/Nodal signaling and NANOG orchestrate human embryonic stem cell fate decisions by controlling the H3K4me3 chromatin mark

Bertero et al. used human embryonic stem cells (hESCs) to show that the Activin–SMAD2/3 signaling pathway cooperates with the core pluripotency factor NANOG to recruit the DPY30-COMPASS histone modifiers onto key developmental genes. Functional studies demonstrate the importance of these interactions for correct histone 3 Lys4 trimethylation and also for self-renewal and differentiation. In mice, Dpy30 is also necessary to maintain pluripotency in the pregastrulation embryo.


SUPPLEMENTAL FIGURES
(C) ChIP-seq results for H3K4me3 and H3K27me3 on selected representative intergenic H3K4me3 peaks before and after inhibition of Activin with SB for 2h. Lines represent read enrichments normalized by million mapped reads and size of the library. H3K4me1 and H3K27ac ChIP-seq data and peak calls (below) for the H1 hESC line from the ENCODE project are shown. SMAD2/3 binding sites in hESCs (from Brown et al., 2011) are reported. (D) ChIP-qPCR for H3, H3K4me1, H3K4me2 and H3K4me3 before and after 2h of SB. qPCR was performed against the genomic regions showing decreased H3K4me3 in Fig. 1E and S1C. Significant differences vs Activin for each gene as calculated by t-tests are reported.

Figure - Schematic representation of the location of ChIP-qPCR primers used for histone mark ChIP ("H3K4me3"; also used for H3K4me2, H3K4me1 and H3 ChIP) or transcription factor and histone remodeller ChIP ("S/N/D BS": SMAD2/3, NANOG and DPY30 binding site; also used for WDR5, MLL2/KMT2B, MLL1/KMT2A, SETD1A and BPTF ChIP) relative to the genes to which these regions are associated. Unless otherwise specifically indicated in the figure legends (see Fig. S4E), these ChIP-qPCR primers were used as just described for all figures reported in the study. All ChIP-qPCR primer sequences, the genomic coordinates of the amplicons they generate, and their localization relative to the genes of interest are reported in Table S6.

Figure S6, related to Figure 6 and Table S5

The H9 hESC line (WiCell, Madison, WI) was cultured in feeder-free chemically defined conditions on gelatine-coated plates as described in (Vallier 2011) with minor modifications described below. The formulation of all cell culture media is detailed in Table S6.
Pluripotent cells were maintained in Chemically Defined Media with BSA (CDM-BSA) supplemented with 10ng/ml recombinant human Activin A and 12ng/ml recombinant human FGF2 (both from Dr. Marko Hyvonen, Dept. of Biochemistry, University of Cambridge). Cells were passaged every 4-6 days with collagenase IV as clumps of 50-100 cells, and dispensed at a density of 100-150 clumps/cm². The culture media was replaced 48 hours after the split and then every 24 hours. All assays on undifferentiated hESCs were performed on the third day after the split unless otherwise indicated.
Pancreatic and hepatic specification was initiated after endoderm differentiation as described in (Cho et al. 2012; Hannan et al. 2013), and details on the differentiation protocols can be found in Table S6. All assays on differentiated hESCs were performed at the end of the differentiation protocol unless otherwise described.
Animal procedures were performed in accordance with the local committee on Animal Experimentation at Centro de Investigación Príncipe Felipe. One million hESCs were injected in the lumen of the right testicle of 6- to 8-week-old SCID mice, and three animals were injected in each group. After 12 weeks, mice were sacrificed, and the testicles and tumours were dissected and fixed for 48h in Bouin's solution (Sigma-Aldrich).
The fixed tissues were then paraffin-embedded and processed according to standard procedures. Sections (5µm) were stained with hematoxylin/eosin and subsequently examined under bright-field microscope for the presence of tissues deriving from the three germ layers.

Genetic manipulation of hESCs.
Knock-down of DPY30 in hESCs was performed as described in (Vallier et al. 2004) by using pLKO.1 vectors expressing specific and validated shRNA (Sigma-Aldrich; clone numbers TRCN0000129317 and TRCN0000131112 for DPY30-sh1 and DPY30-sh2 respectively). A pLKO.1 vector containing a scramble shRNA that does not target any known human gene was used as a control (Sigma-Aldrich; catalogue number SHC016). hESCs were seeded in 6-well plates and transfected 48 hours after the split with 4µg of pLKO.1 using Lipofectamine 2000 (Invitrogen) for 24 hours following manufacturer's instructions.
For stable transfections, 1mg/ml Puromycin (Gibco) was added to the media from 5 days after the transfection onwards to allow for selection of stable integrants. After 10 days of selection, colonies derived from a single event of stable integration were counted and morphologically assessed for pluripotency or differentiation. Single colonies were then picked and clonally expanded. The level of knock-down was determined by qPCR and confirmed by Western blot as described below. Stable NANOG knock-down hESCs and matched controls were previously obtained using this same protocol as described in .
In the case of transient transfections, selection with Puromycin was initiated immediately after transfection and maintained for one, two or three days.

Alkaline phosphatase assay.
hESCs were fixed with 4% PFA in PBS (Alfa Aesar) for 20 minutes at 4°C, washed three times with PBS, and rinsed with Alkaline Phosphatase (AP) solution (100mM Tris-HCl pH 9.5, 100mM NaCl). The AP reaction was performed by incubating cells for 20 minutes at room temperature with AP solution supplemented with Nitro Blue Tetrazolium (NBT) and 5-bromo-4-chloro-3-indolyl-phosphate (BCIP) at a ratio of 33µl of NBT and 16µl BCIP per 5ml AP solution (NBT/BCIP colour development substrate, Promega). After staining, cells were washed three times with PBS and imaged using an Axiovert 200M inverted microscope (Zeiss).

Proliferation curve.
hESCs were plated in four biological replicates for each time point and fixed in PBS 4% PFA for 20 minutes at 4°C at intervals of 24 hours for four days starting from the day after the split (D0). After being washed three times with PBS, cells were stained for 10 minutes at room temperature with a solution of 0.1% Crystal Violet (Sigma) in deionized water. Plates were then washed three times with deionized water for 10 minutes at room temperature, air-dried and stored protected from light. The dye was dissolved in 0.5ml of 10% Acetic Acid (Sigma), and the absorbance at 590nm was measured using an Envision Multilabel Reader (Perkin Elmer). The data, which is proportional to the number of cells, was normalized to the average absorbance at D0.
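The normalization to D0 can be sketched as follows. This is a minimal illustration, assuming the A590 readings are stored per time point as lists of four replicate values; the function name and the example numbers are hypothetical and are not taken from the study.

```python
from statistics import mean

def normalize_to_d0(absorbance_by_day):
    """Divide every A590 reading by the mean A590 of the D0 replicates."""
    d0_mean = mean(absorbance_by_day["D0"])
    return {
        day: [value / d0_mean for value in values]
        for day, values in absorbance_by_day.items()
    }

# Hypothetical readings (four replicates per time point), for illustration only.
readings = {
    "D0": [0.10, 0.12, 0.11, 0.11],
    "D1": [0.20, 0.22, 0.24, 0.22],
}
relative_growth = normalize_to_d0(readings)
```

With these made-up readings the D0 replicates average to 1 by construction, and later time points are read as fold increase over D0.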

Apoptosis assay.
Apoptosis was measured 12 hours after the last media change by using the ApoTox-Glo Triplex Assay from Promega according to manufacturer's instructions. Briefly, cells were first incubated for 30 minutes at 37°C with the Viability/Cytotoxicity reagent containing the GF-AFC and bis-AAF-R110 substrates, and fluorescence was measured at 400ex/505em (Viability) and 485ex/520em (Cytotoxicity) using an Envision Multilabel Reader.
Then, cells were further incubated for 30 minutes at room temperature with the Caspase-Glo 3/7 Reagent, and Caspase 3/7 activation was measured as luminescence using a Glomax 96-Microplate Luminometer.
Measurements were performed on four biological replicates and expressed as average Caspase activation/Viability for each condition.

Gene expression analysis by quantitative real-time PCR (qPCR).
Cellular RNA was extracted using the GenElute Mammalian Total RNA Miniprep Kit (Sigma-Aldrich) and contaminating genomic DNA was removed using the On-Column DNase I Digestion Set (Sigma-Aldrich) following manufacturer's instructions. 500ng of RNA was used for cDNA synthesis in a reaction containing 250ng random primers, 0.5mM dNTPs, 20U RNaseOUT and 25U of SuperScript II (all from Invitrogen) in a total of 20µl according to manufacturer's instructions.
cDNA was diluted 30 fold and 5µl were used for qPCR using SensiMix SYBR low-ROX (Bioline) and 150nM forward and reverse primers (Sigma-Aldrich; see Table S6 for all primer sequences). Samples from three biological replicates for each condition were run in technical duplicates on 96-well plates on a Stratagene Mx-3005P (Agilent) and results were analysed using the ΔΔCt approach using HMBS/PBGD as housekeeping gene (Livak and Schmittgen 2001). The reference sample used as control to calculate gene expression is indicated in each figure. In cases where multiple control samples were used for one experiment, the average ΔCt from all controls was used during this calculation. Data was expressed as average relative gene expression ± SEM. All primers were designed using PrimerBlast (http://www.ncbi.nlm.nih.gov/tools/primer-blast/) and validated to have a PCR efficiency >98% as well as to produce a single PCR product.
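The ΔΔCt calculation with an averaged control ΔCt can be sketched as follows. This is a minimal illustration, assuming a doubling of product per cycle (the primers were validated for an efficiency >98%); the function names and Ct values are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

def relative_expression(ct_target, ct_ref, control_delta_cts):
    """Return 2^-ddCt for one sample.

    dCt = Ct(target) - Ct(housekeeping); ddCt = dCt - mean control dCt.
    """
    delta_ct = ct_target - ct_ref
    delta_delta_ct = delta_ct - mean(control_delta_cts)
    return 2 ** (-delta_delta_ct)

def mean_and_sem(values):
    """Average and standard error of the mean across biological replicates."""
    return mean(values), stdev(values) / sqrt(len(values))

# Hypothetical (target, housekeeping) Ct pairs for three biological replicates.
control = [(25.0, 20.0), (25.2, 20.1), (24.8, 19.9)]
treated = [(27.0, 20.0), (27.1, 20.1), (26.9, 19.9)]

control_dcts = [target - ref for target, ref in control]
treated_expr = [relative_expression(t, r, control_dcts) for t, r in treated]
avg, sem = mean_and_sem(treated_expr)
```

In this made-up example the treated samples sit two cycles above the averaged control ΔCt, which corresponds to a relative expression of about 0.25.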

Protein expression analysis by Western blot.
In order to prepare total protein lysates, cells were harvested by scraping in PBS, centrifuged at 250g for 5 minutes at 4°C, and re-suspended in ice-cold CellLytic M buffer (Sigma) supplemented with cOmplete Protease Inhibitor (Roche). After an incubation for 30 minutes at 4°C on a rotating wheel, lysates were clarified by centrifugation at 16000g for 10 minutes at 4°C. The supernatant was transferred to a new tube, and protein concentration was assessed using Protein Quantification Kit-Rapid (Sigma).
Samples were prepared for Western Blot analysis by adding Laemmli Buffer (final concentration: 30mM Tris-HCl pH 6.8, 6% Glycerol, 2% SDS, 0.02% Bromophenol Blue, 0.25% β-mercaptoethanol), and denatured at 95°C for 5 minutes. 20µg of protein were loaded and run on 4-12% NuPAGE Bis-Tris Precast Gels (Invitrogen), then transferred to PVDF membranes by liquid transfer using NuPAGE Transfer Buffer (Invitrogen). Membranes were blocked for 1 hour at room temperature in PBS 0.05% Tween (PBST) supplemented with 4% non-fat dried milk, and incubated overnight at 4°C with the primary antibody diluted in the same blocking buffer (Table S6 contains details on all primary and secondary antibodies used for Western Blot and other applications). After three washes in PBST, membranes were incubated for 1 hour at room temperature with HRP-conjugated secondary antibodies diluted in blocking buffer, then further washed three times with PBST before being incubated with Pierce ECL2 Western Blotting Substrate (Thermo) and exposed to X-Ray Super RX Films (Fujifilm).

Immunostaining.
Cells were fixed for 20 minutes at 4°C in PBS 4% PFA, rinsed three times with PBS, and blocked and permeabilized at the same time for 30 minutes at room temperature using PBS with 10% Donkey Serum (Biorad) and 0.1% Triton X-100 (Sigma). Overnight incubation at 4°C with the primary antibodies (Table S6) diluted in PBS 1% Donkey Serum 0.1% Triton X-100 was followed by three washes with PBS, and by further incubation with AlexaFluor secondary antibodies (Table S6) for 1 hour at room temperature protected from light. Cells were finally washed three times with PBS, and DAPI (Sigma) was added to the first wash to stain nuclei. Images were acquired using an LSM 700 confocal microscope (Zeiss).

Flow-cytometry.
Single cell suspensions were prepared by incubation in Cell Dissociation Buffer (Gibco) for 10 minutes at 37°C followed by extensive pipetting. Cells were washed twice with PBS and fixed for 20 minutes at 4°C with PBS 4% PFA. After three washes with PBS, cells were first permeabilized for 20 minutes at room temperature with PBS 0.1% Triton X-100, then blocked for 30 minutes at room temperature with PBS 10% Donkey Serum.
Primary and secondary antibody incubations (Table S6) were performed for 1 hour each at room temperature in PBS 1% Donkey Serum 0.1% Triton X-100, and cells were washed three times with this same buffer after each incubation. Flow-cytometry was performed using a Cyan ADP flow-cytometer and at least 10,000 events were recorded.

Immunoprecipitation (IP).
All steps were performed on ice or at 4°C and ice-cold buffers (compositions detailed in Table S6) supplemented with cOmplete Protease Inhibitors (Roche), PhosSTOP Phosphatase Inhibitor Cocktail (Roche), 1mg/ml Leupeptin, 0.2mM DTT, 0.2mM PMSF, and 10mM NaButyrate (all from Sigma) were used unless otherwise stated. Cells were fed with fresh media for 2 hours before being washed with PBS, scraped in Cell Dissociation Buffer and pelleted at 250g for 10 minutes. The cell pellet was incubated in five volumes of ILB for 10 minutes to allow the cells to swell, then Triton X-100 was added to a final concentration of 0.3% and cells were incubated for 6 further minutes to lyse the plasma membrane. Nuclei were pelleted at 600g for 5 minutes, washed once with ten volumes of ILB, and finally re-suspended in two volumes of NLB. The nuclear suspension was transferred to a Dounce Homogenizer (Jencons Scientific) and homogenized by performing 70 strokes with a "tight" pestle. The nuclear lysate was first incubated in rotation for 30 minutes, then 125U/ml of Benzonase Nuclease (Sigma) were added to digest nucleic acids for 45 minutes at room temperature. The lysate was clarified by ultracentrifugation at 180,000g for 30 minutes and pre-cleared by incubation with non-immune IgG (R&D) for one hour in rotation followed by two extra hours with Protein G-Agarose (Roche). The protein concentration was assessed, and 1mg of protein was used for overnight IP with the primary antibody in rotation (Table S6). IPs were further incubated for one hour with 15µl of Protein G-Agarose, washed five times with NLB, and resuspended in Laemmli Buffer. Samples were analysed by Western Blot as described above.

Chromatin immunoprecipitation (ChIP).
All steps were performed on ice or at 4°C and ice-cold buffers (compositions detailed in Table S6) supplemented with 1mg/ml Leupeptin, 0.2mM PMSF, and 10mM NaButyrate were used unless otherwise stated. Cells were fed with fresh media for 2 hours before the beginning of the experiment. 2x10⁷ cells were used for transcription factor ChIP (ChIP-XL), and 1.5x10⁶ cells were used for histone mark ChIP. Cells for ChIP-XL were cross-linked on plates first with protein crosslinkers (10mM dimethyl 3,3'-dithiopropionimidate dihydrochloride and 2.5mM 3,3'-dithiodipropionic acid di-N-hydroxysuccinimide ester; Sigma) for 15 minutes at room temperature, then with 1% formaldehyde for 15 more minutes, as described in .
Cells for histone ChIP were cross-linked only with 1% formaldehyde for 15 minutes. Cross-linking was stopped by adding glycine to a final concentration of 0.125M followed by incubation for 10 minutes at room temperature, and the cells were washed with PBS before being scraped off the plates in PBS. The cell suspension was centrifuged at 250g for 5 minutes, and the pellet was re-suspended and incubated for 10 minutes in 2ml of CLB to lyse the plasma membranes. Nuclei were pelleted at 600g for 5 minutes and lysed in 1.25ml of SNLB for 10 minutes, after which 0.75ml of IDB were added. Chromatin was sonicated using a S-400 Ultrasonic Liquid Processor (Misonix) equipped with a 1/16" dia Ultrasonic Cell Disruptor Microtip Probe (Cole-Parmer) by doing cycles of 15 seconds ON, 45 seconds OFF at 60% amplitude; 20 and 12 cycles were performed for ChIP-XL and standard ChIP respectively. This protocol resulted in the homogeneous generation of fragments of 100-400bp. Samples were clarified by centrifugation at 16000g for 10 minutes, and diluted with 3.5ml of IDB. After pre-clearing with 25µg of non-immune IgG for 1h and 50µl of Protein G-Agarose for 2h, ChIP was performed overnight in rotation using specific antibodies (Table S6) or non-immune IgG as a control.
After incubation for 1 hour with 30µl of Protein G-Agarose, beads were washed twice with IWB1, once with IWB2 and twice with TE. Precipitated DNA was eluted with 150µl of EB twice for 15 minutes at room temperature in rotation, and processed as follows in parallel with 300µl of sonicated chromatin not used for ChIP (input). Protein-protein cross-linking in ChIP-XL samples was first reversed by adding DTT to a concentration of 100µM and incubating for 30 minutes at 37°C, then samples were incubated at 65°C for 5 hours after adding NaCl to a final concentration of 300mM for protein-DNA de-crosslinking and 1µg RNase A (Sigma) to digest contaminating RNA. Finally, 60µg of Proteinase K (Sigma) were added overnight at 45°C.
DNA was extracted by sequential phenol-chloroform and chloroform extractions using Phase Lock Gel Heavy tubes (5Prime), and precipitated overnight at -80°C in 100mM NaAcetate and 66% ethanol; 50µg of glycogen (Ambion) were added as carrier. After centrifugation at 16,000g for 1 hour at 4°C, DNA pellets were washed once with ice-cold 70% ethanol, and finally air-dried. ChIP and inputs were resuspended in 300µl and 700µl respectively and 5µl were used for qPCR as described above. Results of transcription factor ChIP were calculated using the ΔΔCt approach, using a region in the last exon of SMAD7 to normalize for background binding (Table S6) and normalizing the enrichment to that of the non-immune IgG ChIP. Histone ChIP data were expressed as % of input DNA by calculating 2^(Ct input - Ct ChIP) and correcting for the dilutions performed.
On top of this, for some graphs histone ChIP enrichments were normalized to a control condition to facilitate visualization and interpretation of the results.
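The percent-of-input calculation for histone ChIP is conventionally computed as a power of two of the Ct difference between input and ChIP, scaled by a dilution correction. The sketch below is a minimal illustration under that assumption; the function name, the `input_to_chip_ratio` parameter and the Ct values are hypothetical, and the actual correction factor depends on the volumes used for ChIP and input.

```python
def percent_input(ct_chip, ct_input, input_to_chip_ratio=1.0):
    """Percent of input recovered in a ChIP sample.

    input_to_chip_ratio (hypothetical parameter) is the amount of chromatin
    represented per qPCR well for the input relative to the ChIP sample,
    after accounting for all dilutions; 1.0 means no correction.
    """
    return 100.0 * input_to_chip_ratio * 2 ** (ct_input - ct_chip)

# Hypothetical Ct values: the ChIP sample amplifies 5 cycles later than input.
enrichment = percent_input(ct_chip=30.0, ct_input=25.0)
```

A ChIP sample that crosses threshold five cycles after an equally diluted input corresponds to 100/32, about 3.1% of input.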

Sequential ChIP
Sequential ChIP was performed as described in (Truax and Greer 2012) with minor modifications. Briefly, a first round of ChIP-XL was performed as described above, but elution of the immunoprecipitated material was done for 30 minutes at 37°C using 75µl of TE buffer supplemented with 1% SDS, fresh 15mM DTT and cOmplete Protease Inhibitors (Roche). The beads were then washed once with 75µl of IDB and this second eluate was pooled with the first. The total eluate (150µl) was diluted 10 times in IDB, incubated at 4°C for five hours, and then subjected to a second round of ChIP as described above. Purified sequential ChIP material was resuspended in 100µl and 5µl were used for qPCR as described above, with the exception that KAPA SYBR FAST qPCR Master Mix (KAPA Biosystems) was used, as this provided higher sensitivity and reliability in detecting the lower amounts of DNA obtained from sequential ChIP. Data was analyzed as described above, and the enrichments were normalized to those obtained for a control sequential ChIP where the first ChIP using the antibody of interest (anti-DPY30) was followed by a second round of ChIP using non-immune IgG.

Microarrays.
Cellular RNA was extracted as described above, and 500ng were amplified and purified using the Illumina TotalPrep-96 RNA Amplification kit (Life Technologies) according to the manufacturer's instructions. Three biological replicates for each condition were analysed. Biotin-Labelled cRNA was then normalized to a concentration of 150ng/µl and 750ng were hybridised to Illumina Human-12 v4 BeadChips for 16 hours (overnight) at 58 °C. Following hybridisation, BeadChips were washed and stained with streptavidin-Cy3 (GE Healthcare). BeadChips were then scanned using the BeadArray reader, and image data was then processed using Genome Studio software (Illumina). The raw and processed microarray data are available on ArrayExpress (Accession number: E-MTAB-2749).

Microarrays analysis.
Probe summaries for all arrays were obtained from the raw data using the method "Making Probe Summary". These values were transformed (variance stabilized) and quantile normalised using the R/Bioconductor package lumi (Du et al. 2008). Standard lumi QC procedure was applied and no outliers were identified.
The analysis of differentially expressed probes during the time-course of Activin inhibition was performed using the R/Bioconductor timecourse package (Smyth 2004). The Hotelling T² score for each probe that fluoresced above background in at least one condition was calculated using the mb.long function with all parameters set to their default value. We used the T² score to rank probes according to differential expression across the time-course. The top 10% of probes were selected for complete Euclidean hierarchical clustering (kmeans preprocessing; max of 300 clusters) using Perseus software (MaxQuant); Z-scores of the log2 normalized expression values across the time-course were calculated and used for this analysis. 19 probe clusters (having a Euclidean distance >3.68) were defined, and gene enrichment analysis of genes in selected clusters was performed using Enrichr (Chen et al. 2013). The significance of the overlap between genes associated to SMAD2/3 binding sites and genes in selected clusters was measured with hypergeometric tests considering the sampling population as the number of genes whose expression was measured by the microarray.
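A hypergeometric test for the overlap between two gene sets drawn from a common population can be sketched as follows. This is a minimal illustration using only the standard library; the function name and the example numbers are hypothetical and do not correspond to the gene counts in the study.

```python
from math import comb

def hypergeom_overlap_pvalue(population, set_a, set_b, overlap):
    """P(X >= overlap) when set_b genes are drawn without replacement
    from a population that contains set_a genes of interest."""
    max_overlap = min(set_a, set_b)
    total = comb(population, set_b)
    tail = sum(
        comb(set_a, k) * comb(population - set_a, set_b - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total

# Hypothetical numbers for illustration only.
p_value = hypergeom_overlap_pvalue(population=20, set_a=5, set_b=5, overlap=3)
```

Here `population` plays the role of the number of genes measured by the microarray, `set_a` the genes associated with binding sites, and `set_b` the genes in a cluster.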
Differential expression between pairs of conditions was evaluated using the R/Bioconductor package limma (Smyth 2004). A linear model fit was applied, and the top differentially expressed genes were tabulated for each contrast using the method of Benjamini and Hochberg to correct the p-values (Benjamini and Hochberg 1995). Probes that failed to fluoresce above background in both conditions were removed. Differentially expressed probes were selected using a cutoff of adjusted p-value <0.05 and absolute fold-change > 1.35 (the recommended threshold for confident measurement of differential expression according to the microarray chip manufacturer).
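The multiple-testing correction and thresholding step can be sketched as follows. This is a minimal illustration of the Benjamini-Hochberg adjustment and of the two cutoffs only; it does not reproduce the moderated statistics computed by limma, and the function names and probe values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

def select_probes(probes, fc_cutoff=1.35, alpha=0.05):
    """Keep probes with adjusted p < alpha and absolute fold-change > cutoff.

    Each probe is (name, fold_change, p_value); fold changes below 1 are
    treated as down-regulation and inverted before thresholding.
    """
    adj = benjamini_hochberg([p for _, _, p in probes])
    selected = []
    for (name, fc, _), q in zip(probes, adj):
        abs_fc = fc if fc >= 1 else 1 / fc
        if q < alpha and abs_fc > fc_cutoff:
            selected.append(name)
    return selected

# Hypothetical probes for illustration only.
probes = [("A", 2.0, 0.001), ("B", 1.1, 0.002), ("C", 0.5, 0.03), ("D", 1.5, 0.5)]
hits = select_probes(probes)
```

In this made-up example, probe B passes the significance cutoff but not the fold-change cutoff, and probe D passes the fold-change cutoff but not the significance cutoff.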
Rank-rank hypergeometric overlap analysis (RRHO, Plaisier et al., 2010) was used to measure the significance of the overlap between genes up- or downregulated in pairs of treatment-control experiments. All probes that significantly fluoresced above background in at least one condition were ranked by their log2 fold-change of normalized expression in each experiment. This ranking was used as input for the online version of RRHO to calculate Benjamini-corrected hypergeometric probabilities (step size = 150). Heatmaps reported in Fig. 3G and Fig. S5B were obtained by plotting Z-scores of the log2 normalized expression values calculated within each experiment separately (an experiment including triplicate samples for a treatment and a control condition). The data was subjected to Euclidean hierarchical clustering as described above, but only for rows.

ChIP sequencing.
For histone mark ChIP-sequencing, ChIP was performed as described above on three biological replicates per condition analyzed; 7.5µg of fragmented chromatin (corresponding to roughly 1.5x10⁶ hESCs) were used per ChIP with anti-H3K4me3 or anti-H3K27me3 antibodies (Table S6). For DPY30 ChIP-sequencing, ChIP-XL was performed as described above on one biological replicate per condition analyzed; 2x10⁷ hESCs were used with 5µg of anti-DPY30 antibody (Table S6). At the end of the ChIP protocol, fragments between 100bp [...]
bedGraph format files were produced for each sample using BEDTools 2.17.0 (Quinlan and Hall 2010). Reads mapped to either DNA strand were extended in the 5' to 3' direction to a length of 250 bp, and the read count at each genomic position was normalized to the library size and per million reads (multiplying every value by '1,000,000 / number_of_mapped_reads'). bedGraph files were converted to bigWig using the UCSC tool bedGraphToBigWig, and visualized on the Biodalliance genome viewer (Down et al. 2011). [...] Those located equal to or closer than 500 bp (H3K4me3) or 5 Kb (H3K27me3) were merged.
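The read extension and per-million scaling described for the bedGraph tracks can be sketched as follows. This is a minimal illustration, assuming 0-based, half-open (BED-style) coordinates; the function names and the example numbers are hypothetical, and the actual tracks were produced with BEDTools.

```python
def extend_read(start, end, strand, fragment_length=250):
    """Extend a mapped read in its 5'->3' direction to fragment_length bp.

    Coordinates are 0-based, half-open (BED-style).
    """
    if strand == "+":
        return start, start + fragment_length
    # Reverse-strand reads are extended towards lower coordinates.
    return max(0, end - fragment_length), end

def per_million_scale(number_of_mapped_reads):
    """Scaling factor applied to every coverage value."""
    return 1_000_000 / number_of_mapped_reads

# Hypothetical example: raw coverage of 30 reads in a 20-million-read library.
normalized = 30 * per_million_scale(20_000_000)
```

Scaling by library size in this way makes coverage values comparable between samples sequenced to different depths.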
ChIP-seq peaks were annotated using the R/Bioconductor package ChIPpeakAnno (Zhu et al. 2010) using the dataset "hsapiens_gene_ensembl" from Ensembl. Proximal promoter and immediate downstream regions were defined respectively as 1Kb upstream or downstream of the TSS. GREAT analysis was performed as previously described (McLean et al. 2010) using all parameters as standard.
Negative Binomial tests implemented in diffReps (Shen et al. 2013) were used to detect differential histone modification regions using a sliding window of 600bp (H3K4me3) or 5000bp (H3K27me3), p-value cutoff 1e-6, sharp peaks mode for H3K4me3 (--nsb 20) and broad peak mode for H3K27me3 (--nsb 2), hg19 as reference genome, and an average fragment size of 250bp (rest of parameters default). Differential histone modification regions not overlapping (by at least 1bp) significant chromatin marks previously detected during peak calling in at least one of the conditions under comparison were removed. Regions were ranked by their fold-change (FC), and reported as differentially enriched only if the absolute FC≥1.5 and the Benjamini-Hochberg corrected p-value≤1e-3. As described in the figures, figure legends and results, for Fig. S6C and S6D only, the thresholds for statistical significance were relaxed as follows: binomial test p-value cutoff 1e-4 (diffReps standard); absolute FC≥1.25, and Benjamini-Hochberg corrected p-value≤1e-2 (FDR 1%).

Integration of ChIP-seq datasets.
The significance of the overlap between H3K4me3 peaks decreased after 2h of SB treatment and SMAD2/3 binding sites was calculated using GAT (Heger et al. 2013), with the annotations being SMAD2/3 binding sites , the segments being H3K4me3 regions, and the workspace being the hg19 genome after subtraction of the hg19/GRCh37 ENCODE Excludable Mappability Regions (Bernstein et al., 2012; number of samples to compute = 10k). To enable the comparison, SMAD2/3 binding sites coordinates were translated from hg18 to hg19 using UCSC liftOver under default parameters.
The overlap of different histone marks was evaluated using the function intersect (with parameter -u) from the BEDTools suite (Quinlan and Hall 2010). Bed format files for H3K4me3 and H3K27me3 peaks were obtained from the reproducible peak calls described above, while data for H3K4me1 and H3K27ac in H1 hESCs from the ENCODE project was downloaded from the UCSC browser.
The significance of the multiple overlaps between H3K4me3 peaks decreased after 2h of SB treatment, DPY30 KD or NANOG KD was calculated using MULTOVL (Aszódi 2012). We pooled the total set of H3K4me3 regions for all conditions considered as reported by the diffReps analysis described above, and considered them as the free space in the tool multovlprob included in MULTOVL. We randomly reshuffled 10,000 times down-regulated regions in H3K4me3 to estimate a null distribution of overlap length. P-values were estimated from the Z-scores as described in (Aszódi 2012).
Quantitative changes of H3K4me3 and H3K27me3 levels on selected genomic regions were obtained by computing for each region a score from the normalized coverage as follows: for each region i = 1, …, L, where L is the total number of regions considered, we obtained the score S_i from the normalized coverage (bigWig data, as described above) y_{i,j} ∈ [x_{i,a}, x_{i,b}] for a replicate j = 1, 2, 3 as [...] where x_{i,a} is the start location of the peak region, and x_{i,b} the end. Scores were calculated for ChIP-seq in the presence of Activin or after 2h of SB treatment, and significant differences in their mean values were assessed by Welch's two-sample t-test. This analysis was performed on different sets of genomic regions described in the text and figure legends.
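The comparison of scores between conditions with Welch's t-test can be sketched as follows. This is a minimal illustration that computes only the test statistic and the Welch-Satterthwaite degrees of freedom with the standard library; the p-value would then be read from a Student t distribution (for example via `scipy.stats.ttest_ind` with `equal_var=False`). The function name and the scores are hypothetical.

```python
from statistics import mean, variance

def welch_t(sample_a, sample_b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    na, nb = len(sample_a), len(sample_b)
    # Squared standard errors of each sample mean (unequal variances allowed).
    va, vb = variance(sample_a) / na, variance(sample_b) / nb
    t_stat = (mean(sample_a) - mean(sample_b)) / (va + vb) ** 0.5
    dof = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t_stat, dof

# Hypothetical per-region scores for illustration only.
activin_scores = [10.0, 12.0, 14.0]
sb_scores = [5.0, 6.0, 7.0]
t_stat, dof = welch_t(activin_scores, sb_scores)
```

Unlike Student's t-test, this form does not assume equal variances in the two conditions, which is why the degrees of freedom are generally not an integer.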
Average meta-region plots and coverage heatmaps for NANOG, SMAD2/3 and DPY30 ChIP-seq were produced using the genomation toolkit (Akalin et al. 2014). 1000 bins were used for each window, with a winsorize configuration of (0,99) to cap the values of each matrix at the 99th percentile (everything above the 99th percentile was set to the value of the 99th percentile). The signal track for NANOG in H1 hESCs was obtained from the ENCODE project (Bernstein et al. 2012). Genome-wide coverage for SMAD2/3 was calculated by combining the alignments in BED format in GEO Series GSE19461, using the function 'genomecov' in the BEDTools suite (Quinlan and Hall 2010), and then converting to bigWig format. 'bwtool lift' (Pohl and Beato 2014) was used to lift hg18 to hg19 coordinates to allow inter-dataset comparison, using the liftOver chain file hg18ToHg19.over.chain.gz downloaded from UCSC. The subset of SMAD2/3 binding sites in Fig. 6H and 6I was obtained by overlapping SMAD2/3 peaks with regions obtained by extending by +/-100 Kb the H3K4me3 peaks downregulated after SB treatment.
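The effect of a (0,99) winsorization on a coverage matrix can be sketched as follows. This is a minimal illustration, assuming a linearly interpolated percentile; the percentile definition used internally by genomation may differ, and the function names and the example matrix are hypothetical.

```python
def percentile(sorted_values, q):
    """Linearly interpolated percentile (q in 0..100) of a sorted list."""
    position = (len(sorted_values) - 1) * q / 100
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    weight = position - lower
    return sorted_values[lower] * (1 - weight) + sorted_values[upper] * weight

def winsorize(matrix, low=0, high=99):
    """Clamp all matrix values to the [low, high] percentile range."""
    flat = sorted(v for row in matrix for v in row)
    lo_val, hi_val = percentile(flat, low), percentile(flat, high)
    return [[min(max(v, lo_val), hi_val) for v in row] for row in matrix]

# Hypothetical coverage matrix: 101 values, one extreme outlier.
matrix = [[float(i) for i in range(100)] + [1000.0]]
clipped = winsorize(matrix)
```

Capping the top percentile prevents a few extreme bins from dominating the colour scale of a heatmap.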

Integration of ChIP-seq and microarray data.
Each H3K4me3 peak was mapped to its closest gene using the annotatePeaks.pl function from the HOMER suite (Heinz et al. 2010) with standard parameters, and using the UCSC hg19 annotation. The 491 H3K4me3 peaks downregulated after 2h of SB mapped to 415 unique genes. The difference in expression of microarray probes mapping to genes associated with differentially enriched H3K4me3 was statistically evaluated by performing Welch's paired two-samples t-tests. In the case of the SB treatment time-course, Friedman's non-parametric paired test was performed to evaluate the significance of overall changes, and Dunn's corrected multiple comparisons were performed for selected pairs of conditions. GSEA (Gene Set Enrichment Analysis; Subramanian et al., 2005) was performed by generating a custom gene set file containing genes significantly downregulated after 48h of SB treatment (adj.p<0.05 and fold-change<-1.35, calculated with the limma package as described above), and testing the enrichment of this gene set in the list of genes closest to H3K4me3 peaks ranked by the log2 fold-change of the enrichment in the treatment vs control (in cases where multiple H3K4me3 peaks mapped to the same gene, the average log2 fold-change was considered).
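Building a ranked gene list in which genes with several H3K4me3 peaks receive the average log2 fold-change can be sketched as follows. This is a minimal illustration; the function name, gene names and values are hypothetical, and the sort direction (highest first) is an assumption.

```python
from collections import defaultdict
from statistics import mean

def rank_genes_by_peak_change(peaks):
    """Average log2 fold-change per gene, sorted from highest to lowest.

    peaks is an iterable of (closest_gene, log2_fold_change) pairs.
    """
    per_gene = defaultdict(list)
    for gene, log2_fc in peaks:
        per_gene[gene].append(log2_fc)
    averaged = {gene: mean(values) for gene, values in per_gene.items()}
    return sorted(averaged.items(), key=lambda item: item[1], reverse=True)

# Hypothetical peak-to-gene assignments for illustration only.
peaks = [("GENE_A", -1.0), ("GENE_A", -2.0), ("GENE_B", 0.5), ("GENE_C", -0.25)]
ranked = rank_genes_by_peak_change(peaks)
```

A list of this form, one value per gene, is the kind of pre-ranked input against which a custom gene set can be tested.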

Generation of knockout mice.
All animal procedures were performed in agreement with the UK Home Office regulations, UK Animals (Scientific Procedures) Act of 1986. Mice carrying the Dpy30 "knockout first" tm1a(KOMP)Wtsi allele were generated by the Sanger Institute Mouse Genetics Project as described previously (Skarnes et al. 2011; White et al. 2013). Briefly, mouse C57BL/6N ES cells were electroporated with a linearized "knockout-first" plasmid vector carrying an IRES:LacZ trapping cassette and a floxed promoter-driven neo cassette inserted between two homology arms mapping to exons 3 and 4 of Dpy30. Cells were selected for integration using Geneticin (Invitrogen), expanded and screened with a panel of tests to confirm site-specific homologous recombination (loxP site confirmation qPCR; lacZ confirmation qPCR; loss of wild-type allele copy number qPCR; Valenzuela et al., 2003).
After screening, targeted mouse ESCs were microinjected into C57BL/6 blastocysts to obtain chimeras.
Germline transmission was detected by neo copy number qPCR and additional quality control tests performed to confirm the structure and targeting of the allele . Heterozygotes were crossed with C57BL/6NTac mice and the colony was expanded before intercrossing.

Phenotyping of knockout mice.
The viability of homozygous Dpy30 knockout mice was tested by crossing heterozygous animals and genotyping the offspring at postnatal day 14 and embryonic days encompassing 6.5 to 14.5. Genotype was confirmed both by neo copy number qPCR and loss of wild-type allele copy number qPCR (Table S6) on DNA extracted from ear clips using the TaqMan Sample-to-SNP kit (Life Technologies). Embryos were dissected from the decidua of pregnant females and all extraembryonic tissues were removed. Images were acquired using a Leica MZ16A microscope.
In the case of E6.5 and E7.5 embryos, DNA and RNA were simultaneously extracted from whole embryos using the AllPrep DNA/RNA Mini Kit (Qiagen) according to manufacturer's instructions. DNA was eluted in 25µl and used for both qPCR as described above and standard PCR to confirm the presence of wild-type and mutant alleles. RNA was eluted in 15µl and used for cDNA synthesis and then qPCR to measure gene expression as described above. Gene expression was measured with the ΔΔCt approach using the geometric mean of HMBS/PBGD and GAPDH Ct values as housekeeping reference.
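Using the geometric mean of two housekeeping Ct values as the reference can be sketched as follows. This is a minimal illustration of the ΔCt step only; the function names and Ct values are hypothetical.

```python
from math import sqrt

def reference_ct(ct_hmbs, ct_gapdh):
    """Geometric mean of the two housekeeping Ct values."""
    return sqrt(ct_hmbs * ct_gapdh)

def delta_ct(ct_target, ct_hmbs, ct_gapdh):
    """Ct of the target gene relative to the combined housekeeping reference."""
    return ct_target - reference_ct(ct_hmbs, ct_gapdh)

# Hypothetical Ct values for illustration only.
example = delta_ct(ct_target=28.0, ct_hmbs=25.0, ct_gapdh=16.0)
```

The resulting ΔCt is then compared with the ΔCt of the control sample in the same way as for a single housekeeping gene.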