This track was produced as part of the ENCODE project. The track reports the
percentage of DNA molecules that exhibit cytosine methylation at specific CpG
dinucleotides. In general, DNA methylation within a gene's promoter is associated
with gene silencing, and DNA methylation within the exons and introns of a gene is
associated with gene expression. Proper regulation of DNA methylation is essential
during development, and aberrant DNA methylation is a hallmark of cancer. DNA
methylation status was assayed at more than 500,000 CpG dinucleotides in the genome
using Reduced Representation Bisulfite Sequencing (RRBS). Genomic DNA was digested
with the methyl-insensitive restriction enzyme MspI and then small genomic DNA fragments
were purified by gel electrophoresis and used to construct an Illumina sequencing
library. The library fragments were treated with sodium bisulfite and amplified by
PCR to convert every unmethylated cytosine to a thymine while leaving methylated
cytosines intact. The sequenced fragments were aligned to a customized reference
genome sequence. For each assayed CpG, the number of sequencing reads covering that
CpG and the percentage of those reads that were methylated were reported.
Display Conventions and Configuration
Methylation status is represented with an 11-color gradient using the following convention:
red = 100% of molecules sequenced are methylated
yellow = 50% of molecules sequenced are methylated
green = 0% of molecules sequenced are methylated
The score in this track reports the number of sequencing reads obtained for each CpG,
which is often called 'coverage'. The score is capped at 1000, so any CpGs that were covered
by more than 1000 sequencing reads have a score of 1000. The BED files available for
download contain two extra columns: one with the uncapped coverage (number of reads at that
site) and one with the percentage of those reads that show methylation. Biological replicates
were highly reproducible, with correlation coefficients greater than 0.9, when only CpGs
represented by at least 10 sequencing reads (10X coverage, score=10) were considered.
Therefore, the default view for this track is set to 10X coverage, or a score of 10.
Metadata for a particular subtrack can be found by clicking the down arrow in the list of subtracks.
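As an illustration of the score and color conventions described in this section, the following Python sketch shows how a capped score, the default 10X filter, and an 11-step color bin could be derived from a CpG's coverage and percent methylation. The record fields, the function names, and the even spacing of the 11 gradient steps are assumptions made for this example; they are not taken from the track's implementation or from the column layout of the BED files.

```python
from typing import NamedTuple

SCORE_CAP = 1000        # the track caps the displayed score at 1000
DEFAULT_MIN_SCORE = 10  # default view: at least 10 reads (10X coverage)


class CpgRecord(NamedTuple):
    """Hypothetical in-memory record for one assayed CpG."""
    chrom: str
    start: int
    end: int
    coverage: int         # uncapped number of reads covering the CpG
    percent_meth: float   # percentage of those reads that were methylated


def track_score(coverage: int) -> int:
    """Return the displayed score: read coverage, capped at SCORE_CAP."""
    return min(coverage, SCORE_CAP)


def passes_default_view(rec: CpgRecord, min_score: int = DEFAULT_MIN_SCORE) -> bool:
    """True if the CpG would be shown under the default 10X filter."""
    return track_score(rec.coverage) >= min_score


def gradient_bin(percent_meth: float) -> int:
    """Map 0..100% methylation to one of 11 bins.

    Assumption: the 11 colors are evenly spaced, so bin 0 is green (0%),
    bin 5 is yellow (50%), and bin 10 is red (100%).
    """
    if not 0.0 <= percent_meth <= 100.0:
        raise ValueError("percent_meth must be between 0 and 100")
    return int(percent_meth / 10.0 + 0.5)  # round half up to the nearest bin


if __name__ == "__main__":
    rec = CpgRecord("chr1", 1000, 1001, coverage=2500, percent_meth=50.0)
    print(track_score(rec.coverage), passes_default_view(rec), gradient_bin(rec.percent_meth))
```

Keeping the uncapped coverage in the record and applying the cap only when a score is requested mirrors the distinction the text draws between the displayed score and the extra coverage column in the downloadable files.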
Methods
DNA methylation at CpG sites was assayed with a modified version of Reduced Representation
Bisulfite Sequencing (Meissner et al., 2008). RRBS was performed on cell lines
grown by many ENCODE production groups. The production group that grew the cells and isolated
genomic DNA is indicated in the "obtainedBy" field of the metadata. When a cell type was
provided by more than one lab, the data from only one lab are available for immediate display.
However, the data for every cell type from every lab are available from the
Downloads page.
RRBS was also performed on genomic DNA from tissue samples provided by BioChain. The replicates
for the BioChain tissues are technical replicates (rather than biological replicates) beginning
at the bisulfite treatment step. RRBS was carried out by the Myers production group at the
HudsonAlpha Institute for Biotechnology.
Isolation of Genomic DNA
Genomic DNA was isolated from biological replicates of each cell line using the QIAGEN DNeasy
Blood & Tissue Kit according to the instructions provided by the manufacturer. DNA
concentrations for each genomic DNA preparation were determined using fluorescent DNA binding
dye and a fluorometer (Invitrogen Quant-iT dsDNA High Sensitivity Kit and Qubit Fluorometer).
Typically, 1 µg of DNA is used to make an RRBS library; however, libraries have also been
made successfully with 200 ng of genomic DNA from rare or precious samples.
RRBS Library Construction and Sequencing
RRBS library construction started with MspI digestion of genomic DNA, which cut at every CCGG
regardless of methylation status. Klenow exo- DNA Polymerase was then used to fill in the recessed
end of the genomic DNA and add an adenosine as a 3' overhang. Next, a methylated version of
the Illumina paired-end adapters was ligated onto the DNA. Adapter-ligated genomic DNA fragments
between 105 and 185 base pairs were selected using agarose gel electrophoresis and a Qiagen Qiaquick
Gel Extraction Kit. The selected adapter-ligated fragments were treated with sodium bisulfite using
the Zymo Research EZ DNA Methylation Gold Kit, which converts unmethylated cytosines to uracils and
leaves methylated cytosines unchanged. Bisulfite-treated DNA was amplified in a final PCR reaction
that was optimized to uniformly amplify diverse fragment sizes and sequence contexts in the same
reaction. During this final PCR reaction, uracils were copied as thymines, resulting in a thymine in the
PCR products wherever an unmethylated cytosine existed in the genomic DNA. The sample was then ready for
sequencing on the Illumina sequencing platform. These libraries were sequenced with an Illumina Genome
Analyzer IIx according to the manufacturer's recommendations. The full RRBS protocol can be found
here.
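The two sequence-level effects of the library protocol, MspI cutting at CCGG and bisulfite conversion followed by PCR, can be mimicked in silico. The Python sketch below is a simplified illustration only: it models a single strand, ignores end repair, adapters, and size selection, and assumes complete conversion. The function names and the representation of methylation as a set of 0-based positions are assumptions made for the example.

```python
from typing import List, Set


def mspi_digest(seq: str) -> List[str]:
    """Cut a sequence at every CCGG, regardless of methylation status.

    MspI cleaves after the first C of its site (C^CGG), so each fragment
    downstream of a cut begins with CGG.
    """
    seq = seq.upper()
    fragments = []
    start = 0
    pos = seq.find("CCGG")
    while pos != -1:
        cut = pos + 1                    # position just after the first C
        fragments.append(seq[start:cut])
        start = cut
        pos = seq.find("CCGG", pos + 1)  # also catches overlapping sites
    fragments.append(seq[start:])
    return fragments


def bisulfite_pcr(seq: str, methylated: Set[int]) -> str:
    """Return the sequence as read after bisulfite treatment and PCR.

    Unmethylated C becomes U during bisulfite treatment and is copied as T
    during PCR; methylated C (positions listed in `methylated`) stays C.
    """
    out = []
    for i, base in enumerate(seq.upper()):
        if base == "C" and i not in methylated:
            out.append("T")
        else:
            out.append(base)
    return "".join(out)


if __name__ == "__main__":
    for fragment in mspi_digest("AACCGGTTTTCCGGAA"):
        print(fragment)
    # The C at index 1 is methylated and survives; the C at index 4 does not.
    print(bisulfite_pcr("ACGTCG", methylated={1}))
```

Because every fragment is bounded by MspI sites, each sequenced fragment begins at a CpG, which is why this digestion enriches for the CpG-dense regions that the assay targets.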
Data Analysis
To analyze the sequence data, a reference genome was created that contained only the 36 base pairs
adjacent to every MspI site and in which every C was changed to a T. A converted sequence read file
was then created by changing each C in the original sequence reads to a T. The converted sequence
reads were aligned to the converted reference genome and only reads that mapped uniquely to the
reference genome were kept. Once the reads were aligned, the percent methylation was calculated for
each CpG using the original sequence reads. The percent methylation and number of reads were reported
for each CpG.
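The analysis strategy above (align fully C-to-T converted reads to a C-to-T converted reference, keep unique hits, then call methylation from the original reads) can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the pipeline that was used: exact substring matching stands in for a real short-read aligner, only the forward strand is handled, and the names `references`, `c_to_t`, and `call_methylation` are hypothetical. The `references` dictionary plays the role of the reduced reference built from the sequences adjacent to MspI sites.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def c_to_t(seq: str) -> str:
    """Fully convert a sequence in silico by changing every C to a T."""
    return seq.upper().replace("C", "T")


def find_all(haystack: str, needle: str) -> List[int]:
    """Return the start offsets of all (possibly overlapping) matches."""
    hits = []
    pos = haystack.find(needle)
    while pos != -1:
        hits.append(pos)
        pos = haystack.find(needle, pos + 1)
    return hits


def call_methylation(
    references: Dict[str, str], reads: List[str]
) -> Dict[Tuple[str, int], Tuple[int, float]]:
    """Return {(reference name, CpG position): (read count, percent methylated)}."""
    converted_refs = {name: c_to_t(seq) for name, seq in references.items()}
    counts = defaultdict(lambda: [0, 0])  # site -> [methylated reads, total reads]

    for read in reads:
        read = read.upper()
        converted_read = c_to_t(read)
        hits = [
            (name, offset)
            for name, conv in converted_refs.items()
            for offset in find_all(conv, converted_read)
        ]
        if len(hits) != 1:
            continue  # discard unmapped reads and reads that map more than once

        name, offset = hits[0]
        ref = references[name].upper()
        # Use the ORIGINAL read: C at a reference CpG means methylated, T means not.
        for i, base in enumerate(read):
            pos = offset + i
            if ref[pos:pos + 2] == "CG" and base in "CT":
                counts[(name, pos)][1] += 1
                if base == "C":
                    counts[(name, pos)][0] += 1

    return {
        site: (total, 100.0 * meth / total)
        for site, (meth, total) in counts.items()
    }


if __name__ == "__main__":
    refs = {"frag1": "CGGATCGTTA"}
    reads = ["CGGATCG", "TGGATCG", "TGGATTG", "AAAA"]
    for site, (n_reads, pct) in sorted(call_methylation(refs, reads).items()):
        print(site, n_reads, round(pct, 1))
```

Converting both the reads and the reference removes the C/T mismatches that bisulfite treatment introduces, so methylated and unmethylated reads align equally well; the methylation call is then made afterwards from the unconverted read sequence.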
Release Notes
This is Release 3 (July 2012) of this track, which adds the MCF-7 cell line with
shRNA knockdowns obtained from the Crawford Lab at Duke University.
Data users may freely use ENCODE data, but may not, without prior consent,
submit publications that use an unpublished ENCODE dataset until nine months
following the release of the dataset. This date is listed in the
Restricted Until column, above. The full data release policy for
ENCODE is available here.