Antibody cross-reactivity accounts for widespread appearance of m1A in 5′UTRs
Abstract
N1-methyladenosine (m1A) was proposed to be a highly prevalent modification in mRNA 5′UTRs based on mapping studies using an m1A-binding antibody. We developed a bioinformatic approach to discover m1A and other modifications in mRNA throughout the transcriptome by analyzing preexisting ultra-deep RNA-Seq data for modification-induced misincorporations. Using this approach, we detected appreciable levels of m1A only in one mRNA: the mitochondrial MT-ND5 transcript. As an alternative approach, we also developed an antibody-based m1A-mapping approach to detect m1A at single-nucleotide resolution, and confirmed that the commonly used m1A antibody maps sites to the transcription-start site in mRNA 5′UTRs. However, further analysis revealed that these were false-positives caused by binding of the antibody to the m7G-cap. A different m1A antibody that lacks cap-binding cross-reactivity does not show enriched binding in 5′UTRs. These results demonstrate that high-stoichiometry m1A sites are exceedingly rare in mRNAs and that previous mappings of m1A to 5′UTRs were the result of antibody cross-reactivity to the 5′ cap.
Introduction
The initial concept of the epitranscriptome was born with the transcriptome-wide mapping of thousands of internally located modified nucleotide N6-methyladenosine (m6A) residues in the transcriptome1,2. Two studies later identified N1-methyladenosine (m1A) as another abundant epitranscriptomic modification3,4. Both studies mapped m1A in thousands of mRNAs by sequencing mRNA fragments immunoprecipitated with a monoclonal antibody (clone AMA-2) commercially distributed by MBL Bioscience, which was originally raised against KLH-conjugated 1-methyladenosine5. This antibody was previously shown to recognize m1A-containing RNAs6. One study estimated the average stoichiometry of mapped m1A sites at 20%3. Notably, most m1A sites were located near start codons and proposed to provide a novel form of translational regulation3.
Subsequent work reported different distributions for m1A, one arguing that m1A was exceptionally rare in mRNA7. In that study, the antibody-bound RNA was reverse transcribed with an enzyme that efficiently introduces misincorporations at m1A. Using this approach, m1A was rarely observed in the RNA immunoprecipitated with m1A antibodies7. Although mRNA fragments from 5′UTRs and start codon-proximal regions were immunoprecipitated, these fragments did not generate misincorporations. Thus it was concluded that mRNA fragments from the 5′UTR may be nonspecifically enriched during immunoprecipitation7. Ultimately it was concluded that only two mRNAs contained high-confidence m1A sites: C9orf100 and MT-ND5, a cytosolic and a mitochondrial mRNA, respectively7. Twelve other sites were detected at very low stoichiometry.
The second study mapped m1A to 740 sites, 473 of which were in mRNA and lncRNA8. In mRNAs, the majority of sites were found in the 5′UTR, 22 of which the authors localized to the first nucleotide of the transcript. Based on this location, it was proposed that m1A forms a novel cap structure in which m1A immediately follows the 7-methylguanosine (m7G) cap of mRNA (m7G-ppp-m1A). A re-analysis of these data showed that many of the sites that were mapped internally within the 5′UTR were actually transcription-start sites9.
It remained unclear why those studies produced divergent m1A maps, whether m1A exists at transcription-start sites, at start codons, or at neither, and why these particular sites are so prominent in m1A-mapping studies. Whether m1A sites are present at high stoichiometry, as initially reported3, or are rare and of low stoichiometry7 also remained to be resolved.
Here, to address the question of the prevalence and location of m1A in the transcriptome, we used both a high-resolution m1A-mapping method and a bioinformatic approach, termed “misincorporation mapping”. Misincorporation mapping takes advantage of the ability of m1A and numerous other modified nucleotides to induce misincorporations during the reverse transcription step common to most RNA-Seq protocols. By probing several ultra-deep RNA-Seq datasets for such misincorporations, we discovered that very few mRNAs contain misincorporations. Only the MT-ND5 mitochondrial transcript and the MALAT1 noncoding RNA generated statistically significant misincorporations, demonstrating the rarity of high-stoichiometry m1A sites. To understand why misincorporation mapping identified only a few m1A sites while m1A antibody-based mapping detects many, we mapped m1A at high resolution using the same m1A-directed antibody used in all previous studies. This mapping recapitulated the selective binding of the AMA-2 m1A antibody to transcription-start nucleotides in mRNA. However, we also found that this m1A antibody recognizes the m7G cap structure, and that m1A-independent binding explains why previous maps showed m1A in mRNA 5′UTR regions. To further confirm this observation, we demonstrate that a different m1A antibody, which we show does not bind the m7G cap, produces an m1A map that no longer enriches for the 5′ end of mRNAs. Overall, our data demonstrate that (1) m1A and other hard-stop nucleotides are rare in mRNA; (2) with the exception of MT-ND5, m1A sites have very low stoichiometry; and (3) cross-reactivity of the AMA-2 m1A antibody with 5′ caps leads to false-positive localization of m1A to transcription-start nucleotides and start codons.
Results
Misincorporation mapping using ultra-deep RNA-seq
Given the inconsistency in the different antibody-dependent m1A-mapping methods (Supplementary Fig. 1a–c), we sought to use an antibody-independent approach to detect m1A at single-nucleotide resolution in mRNA. For this, we took advantage of existing ultra-deep RNA-seq datasets and the fact that m1A is a “hard-stop” modification, meaning it typically arrests cDNA synthesized by standard reverse transcriptases10,11 (Supplementary Fig. 1d). However, SuperScript III will read through m1A and other hard-stop nucleotides at low frequency, resulting in misincorporations that are variable and sequence dependent10,11. Most m1A-induced misincorporations are A→T transversions that can be detected by sequencing the cDNA10,11. This approach can detect other hard-stop modifications, such as 3-methylcytidine (m3C), 3-methyluridine, N2,N2-dimethylguanosine, and N6,N6-dimethyladenosine since these also produce misincorporations10. Therefore, misincorporations can directly localize m1A and other hard-stop nucleotides in sequencing data10,11.
Misincorporations are difficult to distinguish from sequencing errors using standard next-generation sequencing for two major reasons. First, substantial read depth is required to detect m1A since m1A typically induces a misincorporation in only approximately 20–30% of the cDNAs generated by SuperScript III (refs. 10,11). The misincorporations would therefore be particularly difficult to detect for low stoichiometry m1A residues. Second, misincorporations cannot be readily distinguished from stochastic errors originating during PCR amplification or during sequencing. Thus, m1A cannot be definitively identified in standard RNA-seq experiments.
To overcome these problems, we developed a bioinformatic approach similar to high-throughput annotation of modified ribonucleotides (HAMR)10 to distinguish modified nucleotides from sequencing errors (Fig. 1a). We used an ultra-deep RNA-seq dataset from blood mononucleocytes comprising approximately three billion reads derived from 20 independent sequencing experiments (“replicates”)12. These replicates were derived from a single human donor whose genome was sequenced, allowing any differences between the cDNA and genome to be readily detected. As with most RNA-seq datasets, the exact reverse transcriptase termination site is not detectable12 (see Supplementary Fig. 1a). Instead, we localized hard-stop modifications by identifying all nucleotide positions in the transcriptome that showed misincorporations across multiple replicates (see Methods). Notably, for m1A, we searched for A→T transversions alongside other, less common misincorporation types induced by m1A10,11 (see Methods). Importantly, this method only reveals misincorporations, not the identity of the modification; the identity would have to be determined by biochemical methods.
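The replicate-consistency idea can be sketched in a few lines of Python. This is a minimal illustration and not the pipeline described in Methods: the function name `call_misincorporation_sites`, the layout of the count tables, and all three thresholds (`min_depth`, `min_mismatch_frac`, `min_replicates`) are assumptions chosen for the example.

```python
from collections import defaultdict


def call_misincorporation_sites(replicates, reference, min_depth=500,
                                min_mismatch_frac=0.01, min_replicates=2):
    """Return positions whose mismatch fraction passes a threshold in
    at least `min_replicates` replicates.

    replicates: list of dicts mapping position -> {base: read count}
    reference:  dict mapping position -> reference base (donor genome)
    All thresholds are hypothetical placeholders, not values from Methods.
    """
    support = defaultdict(int)
    for counts_by_pos in replicates:
        for pos, counts in counts_by_pos.items():
            depth = sum(counts.values())
            if depth < min_depth:
                continue  # too shallow to separate signal from noise
            mismatches = depth - counts.get(reference[pos], 0)
            if mismatches / depth >= min_mismatch_frac:
                support[pos] += 1
    return sorted(pos for pos, n in support.items() if n >= min_replicates)


# Toy data: position 10 shows recurrent A->T mismatches, position 11 only
# sporadic errors that never pass the threshold.
reference = {10: "A", 11: "A"}
rep1 = {10: {"A": 850, "T": 150}, 11: {"A": 999, "T": 1}}
rep2 = {10: {"A": 870, "T": 130}, 11: {"A": 1000}}
rep3 = {10: {"A": 900, "T": 100}, 11: {"A": 995, "C": 5}}
sites = call_misincorporation_sites([rep1, rep2, rep3], reference)
print(sites)
```

Requiring support from several independent replicates is what separates a site-specific signal from stochastic PCR or sequencing errors, which are unlikely to recur at the same position.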
We first confirmed that we could detect known m1A sites. After aligning reads to rRNA, we readily detected the known 28S rRNA m1A at position 1322 (Fig. 1b, Supplementary Fig. 2a). As expected, the misincorporations were predominantly A→T transversions, which are characteristic of m1A10,11. These site-specific misincorporations were detected in all 20 replicates, confirming that the A→T transversions were not stochastic sequencing errors.
Misincorporation mapping can detect other hard-stop modifications in rRNA, including 1-methyl-3-(3-amino-3-carboxypropyl)pseudouridine and m3U (Supplementary Fig. 2b). However, modifications that do not significantly affect reverse transcription, such as m6A, pseudouridine, N4-acetylcytidine, 2′-O-methylated nucleotides, and m7G, did not induce misincorporations (Supplementary Fig. 2c).
We considered the possibility that m1A detection could be impaired because m1A can convert to m6A through the Dimroth rearrangement, a heat and base-catalyzed reaction13 (Supplementary Fig. 2d). To estimate m1A loss during the preparation of the ultra-deep sequencing libraries, we examined the m1A at position 1322 in the 28S rRNA, which is methylated at near complete stoichiometry14. Since reverse transcription of m1A results in read-through approximately 20–30% of the time10,11, the fraction of read-through events can suggest the overall m1A stoichiometry. Notably, we found that m1A at this position was associated with a ~15% read-through rate in this dataset (see Supplementary Fig. 2a). This suggests that the library preparation protocol did not cause substantial degradation of m1A, and m1A residues should be detectable throughout the transcriptome using this dataset.
m1A is not readily detected in mRNA
In order to detect m1A, the modified residue must be reverse transcribed a sufficient number of times during library preparation to generate misincorporations. Since m1A sites were reported to have on average a 20% stoichiometry3, we set a threshold of 500 unique reads on any given nucleotide to detect m1A sites. At this stoichiometry, 100 reverse transcription events would encounter m1A. Of these 100 reverse transcription events, approximately 20% would read through, and most of these would be associated with a misincorporation10,11. At this read depth, misincorporations should therefore be readily detected in multiple replicates. Thus, to detect m1A in mRNA, we restricted our search to approximately eight million adenosine residues in the transcriptome that showed a read depth of >500 reads (Supplementary Fig. 3a).
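The read-depth reasoning above is simple arithmetic, sketched here in Python. The function name `expected_readthrough_reads` is hypothetical, and the calculation follows the simplified model in the text, in which every read at a position counts as one reverse transcription event.

```python
def expected_readthrough_reads(depth, stoichiometry, readthrough_rate):
    """Expected number of cDNAs that read through a modified site.

    depth:            reverse transcription events reaching the nucleotide
    stoichiometry:    fraction of transcripts carrying the modification
    readthrough_rate: fraction of encounters that read through the site
    """
    modified_events = depth * stoichiometry
    return modified_events * readthrough_rate


# Values quoted in the text: 500 reads, 20% stoichiometry, ~20% read-through.
modified = 500 * 0.20
informative = expected_readthrough_reads(500, 0.20, 0.20)
print(f"{modified:.0f} events meet m1A; about {informative:.0f} read through")

# The same depth at a hypothetical 5% stoichiometry leaves far fewer.
low = expected_readthrough_reads(500, 0.05, 0.20)
print(f"about {low:.0f} read-through cDNAs at 5% stoichiometry")
```

Under these assumptions a 500-read threshold leaves roughly 20 read-through cDNAs per site at the reported average stoichiometry, while a site at 5% stoichiometry would leave only about 5.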
Analysis of the three billion reads showed 14 high-confidence nucleotide positions across the transcriptome with misincorporations in more than one replicate (see Methods). Of these, 12 occurred at adenosine residues (Supplementary Data 1 and 2). Most of these modified adenosines were found in mitochondrial tRNAs and occurred at known m1A positions in mitochondrial tRNAs15 (Supplementary Data 2). We also detected a modified adenosine in mascRNA (MALAT1-associated small cytoplasmic RNA), a short tRNA-like ncRNA that is derived from endonucleolytic processing of MALAT1 (ref. 16) (Fig. 1c). Notably, this modified adenosine corresponds to position 58 within the T-loop of tRNAs (Fig. 1d), a conserved m1A site in tRNAs17. This m1A site in MALAT1 may be similarly formed by T-loop-specific m1A-synthesizing enzymes17.
Besides these noncoding RNAs, the previously reported7 m1A-containing MT-ND5 mitochondrial mRNA also contained a modified adenosine (Supplementary Data 2). This adenosine exhibited a misincorporation rate of 13.5%. Thus, misincorporation mapping resulted in the same major conclusion as previously reported7 that MT-ND5 is the major m1A-modified mRNA in the cell.
We next examined misincorporations at other mitochondrial mRNAs with previously annotated m1A sites. Safra et al.7 and Li et al.8 identified 11 and 5 putative mitochondrial m1A-containing protein-coding genes, respectively. Four mitochondrial mRNAs were common to both studies (MT-ND5, MT-CO1, MT-CO2, and MT-CO3). The misincorporation rates in poly(A) RNA-Seq for these mRNAs were very low (less than 0.7% and 2.1% in the Safra and Li studies7,8, respectively). However, misincorporations could be detected when m1A-containing mRNAs were enriched using the m1A antibody, suggesting that m1A-containing transcripts are indeed present in cells, but are so rare that they require enrichment to be detected. We therefore leveraged the exceptional mitochondrial read depth in our ultra-deep RNA-seq samples (average ~1 million reads/nucleotide). Here we found that, with the exception of MT-ND5, which had a misincorporation rate of 13.5%, the misincorporation rates for all other putative m1A sites were less than 0.4%, which is close to the background rate (Supplementary Fig. 3b, Supplementary Data 3). This demonstrates that the baseline stoichiometry of m1A is very low in all mitochondrial mRNAs except for MT-ND5, and m1A detection requires a pre-enrichment step due to its exceptionally low stoichiometry.
We also detected three additional sites of modifications in cytosolic mRNAs, with only one site being a modified adenosine (Supplementary Data 2). Therefore, it is unlikely that there are many high-stoichiometry hard-stop nucleotides in mRNA.
Notably, when we examined cytosolic mRNAs reported3 to have the highest stoichiometry of m1A (i.e. >50%) such as CCDC71, DLST, and STK16, each lacked A→T transversions based on misincorporation mapping (Supplementary Fig. 3c, Supplementary Data 1).
m3C is also unlikely to be present at high stoichiometry in mRNA since we could not detect any high-confidence misincorporations at cytidine residues (Supplementary Data 2). The m3C previously detected in mRNA by mass spectrometry18 may therefore have originated from contaminating tRNA.
Although misincorporation mapping demonstrated that few mRNAs have m1A or other hard-stop modifications, an important caveat is that the 5′ end of mRNAs cannot be assessed. This is because RNA-seq typically provides less coverage at 5′ ends of RNAs (see Supplementary Fig. 1a)19. Overall, the paucity of m1A sites in mRNA using misincorporation mapping demonstrates that m1A is not a prevalent high-stoichiometry modification in mRNA.
m1A-miCLIP detects known m1A sites at nucleotide resolution
To understand why m1A antibody-based mapping approaches produce a prominent 5′UTR signal, we developed an approach to detect m1A at single-nucleotide resolution: m1A-miCLIP (m1A-modification individual nucleotide resolution crosslinking and immunoprecipitation) (Fig. 2a). We initially used the AMA-2 m1A antibody previously used to map m1A sites in mRNAs3,4. In m1A-miCLIP, the m1A antibody is crosslinked to sheared RNA (Fig. 2a). UV crosslinking with stringent washing reduces nonspecific RNA binding and increases peak resolution in mapping studies20. RNA fragments crosslinked to the antibody are then purified and cloned as a cDNA library. Terminations introduced during reverse transcription are then analyzed to localize precise sites where the m1A antibody binds throughout the transcriptome. Prior to performing transcriptome-wide m1A mapping, we confirmed the AMA-2 m1A antibody binds m1A, and not other nucleotides (Supplementary Fig. 4a).
m1A-miCLIP differs from earlier methods3,4 by preserving the cDNA 3′ ends. In those earlier methods, any m1A antibody-binding site at the transcription-start site would produce a peak that is displaced in the 3′ direction (see Supplementary Fig. 1a). This can make an m1A antibody-binding site at the transcription-start site appear to be located at the start codon. To avoid this problem, we chose to generate our libraries in a way that preserves the cDNA ends. Therefore, m1A-miCLIP reveals exact m1A antibody-binding sites within transcripts.
We performed m1A-miCLIP with the AMA-2 antibody using poly(A) RNA from HEK293T cells and examined termination signatures at known m1A sites in rRNA and tRNA, which typically co-purify to some extent with poly(A) RNA21–23. The majority of reads truncated at the +1 position relative to the m1A at position 1322 of the 28S rRNA (Fig. 2b, Supplementary Fig. 4b) as well as known m1A sites in tRNA (Supplementary Fig. 4c). As expected, some read-through was also observed, reflecting the low read-through rate of SuperScript III when it encounters m1A. These data demonstrate specific detection of m1A by m1A-miCLIP using the AMA-2 m1A antibody. Notably, m1A-miCLIP showed markedly improved peak resolution compared to the initial peak-based m1A mapping studies3,4 (Supplementary Fig. 4b).
We next asked if m1A detection in m1A-miCLIP is adversely affected by m1A conversion to m6A via the Dimroth reaction. m1A-miCLIP does not use the high temperatures and basic pH13 that are needed for this conversion (see Supplementary Fig. 2d). No m6A-miCLIP reads24 could be detected at the 28S rRNA m1A site (Supplementary Fig. 4d), demonstrating that m1A does not appreciably convert to m6A during the m1A-miCLIP protocol. Taken together, these data demonstrate that m1A-miCLIP maps m1A with high specificity and resolution.
The AMA-2 m1A antibody binds near the first mRNA nucleotide
We next used m1A-miCLIP to map m1A in mRNA (Supplementary Fig. 5a). To do this, we aligned m1A-miCLIP unique reads to the genome, generated m1A-miCLIP clusters (see Methods, Supplementary Data 4), and analyzed their distribution. A metagene analysis of m1A-miCLIP clusters obtained from HEK293T cells showed a marked enrichment in the 5′UTR (Fig. 2c). More precisely, the clusters were located at mRNA transcription-start sites (Fig. 2d). A similar enrichment was seen using mouse mRNA (Supplementary Data 5, Supplementary Fig. 5b).
Crosslinking of the m1A antibody is expected to cause reverse transcription terminations within several nucleotides of the site of the antibody-RNA adduct20. As expected, we found that in miCLIP, terminations were enriched not only at the transcription-start site, but also prominently at the +1 position relative to the transcription-start site (Fig. 2e), with additional terminations sometimes seen between position +2 and +3 (Fig. 2e). Thus, the AMA-2 m1A antibody binds at or near mRNA transcription-start nucleotides.
We considered the possibility that terminations near the transcription-start nucleotide could simply reflect general behavior of the reverse transcriptase as it approaches the mRNA 5′ end. To test this, we examined the input RNA fragments in the RNA-seq dataset prepared using the same library cloning strategy as m1A-miCLIP. In general, RNA-seq reads terminated almost exclusively at the transcription-start nucleotide (Fig. 2e, Supplementary Fig. 5c). Therefore, read terminations seen in m1A-miCLIP near the transcription-start nucleotide are likely induced by selective binding and crosslinking of the AMA-2 m1A antibody rather than an artifact of reverse transcription near mRNA 5′ ends.
m1A-miCLIP motif analysis revealed a consensus sequence in the upstream genomic region that was pyrimidine-rich (Fig. 2f) and highly similar to Initiator, a transcription-initiating sequence which produces transcripts that initiate with adenosine25,26. Indeed, 85% of transcripts containing an m1A-miCLIP cluster at their transcription-start nucleotide initiated with adenosine (Fig. 2g, Supplementary Data 6). Thus, we reasoned that adenosine at the transcription-start nucleotide was important for binding of the AMA-2 antibody at or near the transcription-start site.
m1A and m1Am are not detected in extended cap structures
Because the AMA-2 m1A antibody binds at transcription-start sites, it has been proposed that mRNAs contain a novel mRNA cap structure comprising m7G followed by N1-methylated adenine at the transcription-start nucleotide8. The methylated nucleotide would be m1A or N1,2′-O-dimethyladenosine (m1Am) since the first encoded nucleotide of mRNAs is typically subjected to 2′-O-methylation27,28. To biochemically validate this, we used mass spectrometry to detect m1A or m1Am in “cap dinucleotides,” i.e., m7G-ppp-m1Am. We treated cellular RNA with P1 nuclease, which digests internal nucleotides to mononucleotides, but leaves the cap dinucleotide intact (see Methods).
We readily detected diverse cap dinucleotides using high-resolution liquid chromatography and mass spectrometry using positive ion mode detection. We developed a multiple reaction monitoring protocol based on the fragment ion transitions from distinct dinucleotide precursor species (see Methods). To confirm that the N1-methylated adenosine in a cap dinucleotide can be detected, we used synthetic RNA standards (see Methods). Using these standards, we readily detected m7G-ppp-m1A as well as other cap dinucleotides, such as m7G-ppp-Am, and m7G-ppp-m6Am (Fig. 3a).
We next examined endogenous cap dinucleotides prepared by digesting HEK293T poly(A) RNA. We readily detected m7G-ppp-m6Am (m/z=815.1) and also m7G-ppp-Am (m/z=801.1), though to a lower degree. m7G-ppp-Cm (m/z=777.1), m7G-ppp-Gm (m/z=817.1), and m7G-ppp-Um (m/z=778.1) were also detected (Fig. 3b). The identity of each species was confirmed by detection of fragment masses corresponding to 7-methylguanine (m/z=166.1) and the base comprising the first nucleotide (m/z=m6Am 150.1, Am 136.1, Cm 112.1, Gm 152.1, Um 112.1) within the extended cap.
Next, we asked whether either m7G-ppp-m1A or m7G-ppp-m1Am is present in mRNA. Their masses (m/z=801.1 and 815.1, respectively) are identical to cap dinucleotides containing Am or m6Am. Moreover, the mass of the fragment produced by the N1-methylated adenine base (m/z=150.1) would be identical to that produced by the m6Am cap. However, m7G-ppp-m1A and m7G-ppp-m1Am exhibit very distinct retention times from the Am and m6Am cap dinucleotides based on our synthetic RNA standards (see Fig. 3a). This allows us to differentiate N1-methyl- and N6-methyl-containing adenine. Nevertheless, no N1-methylated adenine-containing cap dinucleotide was detected in mRNA (Fig. 3b). Thus, m7G-ppp-Am or m7G-ppp-m6Am cap structures were readily detected, while m7G-ppp-m1A(m) was undetectable.
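The need for chromatographic separation can be made explicit by tabulating the transitions quoted above. The short Python sketch below is only an illustration: the dictionary `TRANSITIONS` and the helper `indistinguishable_pairs` are hypothetical names, and the values are the rounded precursor and base-fragment m/z figures given in the text.

```python
from itertools import combinations

# (precursor m/z, base-fragment m/z), rounded values quoted in the text.
TRANSITIONS = {
    "m7G-ppp-Am":   (801.1, 136.1),
    "m7G-ppp-m1A":  (801.1, 150.1),
    "m7G-ppp-m6Am": (815.1, 150.1),
    "m7G-ppp-m1Am": (815.1, 150.1),
}


def indistinguishable_pairs(transitions):
    """Pairs of species that share both precursor and fragment m/z."""
    return sorted(
        tuple(sorted((a, b)))
        for a, b in combinations(transitions, 2)
        if transitions[a] == transitions[b]
    )


def same_precursor(transitions, a, b):
    """True when two species cannot be separated by precursor mass alone."""
    return transitions[a][0] == transitions[b][0]


pairs = indistinguishable_pairs(TRANSITIONS)
print(pairs)
print(same_precursor(TRANSITIONS, "m7G-ppp-m1A", "m7G-ppp-Am"))
```

With these values the m1Am and m6Am caps share both masses, so retention time against synthetic standards is the only discriminator for that pair.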
Since mass spectrometry analysis could not validate m1A at the transcription-start nucleotide, we wanted to directly and sensitively determine the transcription-start nucleotide that is enriched by the m1A antibody. We used two-dimensional thin layer chromatography (2D-TLC), which can identify and quantify the first encoded nucleotide29. In this approach, mRNA is decapped, and the exposed 5′ RNA end is radiolabeled, permitting sensitive detection of the first transcribed nucleotide. The radiolabeled nucleotide species are then resolved using 2D-TLC29. Based on the mobility of each species, the transcription-start nucleotide can be determined.
We analyzed the transcription-start nucleotide of poly(A) RNA and poly(A) RNA enriched with the m1A antibody by 2D-TLC. We also used synthetic RNA containing m7G-ppp-m1A and m7G-ppp-m1Am extended cap structures as standards (see Methods). We first optimized the solvent conditions so that m1A and m1Am migrated to distinct positions by 2D-TLC (Fig. 3c). In poly(A) RNA, no m1A or m1Am was detected at transcription-start nucleotides (Fig. 3d). Since these might be rare nucleotides at transcription-start nucleotides, we enriched for transcription start m1A or m1Am-containing mRNAs by immunoprecipitating poly(A) mRNA with the AMA-2 m1A antibody before 2D-TLC. Here, we again did not see m1A or m1Am as the transcription-start nucleotide (Fig. 3d).
Taken together, the mass spectrometry and the TLC data suggest that m1A and m1Am are not readily detectable at the transcription-start nucleotide, and m7G-ppp-m1Am does not constitute a novel and prevalent mRNA cap structure as proposed8.
The AMA-2 m1A antibody recognizes m7G-ppp-A cap structures
At this juncture, we had contradictory results: m1A-miCLIP suggested that m1A is at the transcription-start nucleotide, but we did not observe m1A or m1Am at this site by either mass spectrometry or TLC. Therefore, we wondered if the AMA-2 antibody binds the transcription-start region in an m1A-independent manner. When we originally characterized the specificity of the antibody, we performed classic competition studies using nucleosides or nucleotides. However, based on the binding properties of the antibody revealed by mapping studies, we considered the possibility that the AMA-2 m1A antibody could recognize an epitope comprising the mRNA extended cap.
To test this, we used a dot blot assay to measure binding of the AMA-2 m1A antibody to an m1A-containing oligonucleotide in the presence of various competitors. We considered performing the dot blot assay using m7G-ppp-A immobilized on the membrane. However, this approach could be misleading since we do not know if m7G-ppp-A interacts with the membrane in a way that would prevent antibody binding. We therefore used the classic competition approach. In this approach, an m1A-containing RNA is immobilized on the membrane and different competitors are added in solution. Competitors that bind the m1A antibody will prevent the antibody from binding to m1A on the membrane.
As expected, competition with m1A inhibited antibody binding, while related nucleotides, including adenosine, m6A, ethenoadenosine, and N1-substituted nucleotides, like N1,6-dimethyladenosine (m1,6A), did not compete with binding (Fig. 4a). Surprisingly, a commercially available cap analog, m7G-ppp-A, was a relatively effective competitor, with an IC50 of 480 nM compared to 100 nM for m1A (Fig. 4b). m7G-ppp-G showed weaker inhibition (IC50 ~4 μM) (Fig. 4c). The higher binding to the m7G-ppp-A cap analog compared to the m7G-ppp-G cap analog may explain the preferential binding of the AMA-2 m1A antibody to mRNAs that initiate with adenosine.
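To put these competition values on a common scale, the following Python sketch expresses each IC50 relative to m1A. The names `IC50_NM` and `fold_weaker` are hypothetical, and the m7G-ppp-G entry uses the approximate value of 4 μM quoted above, so its ratio is only a rough estimate.

```python
# IC50 values from the competition assay, in nM (m7G-ppp-G is approximate).
IC50_NM = {"m1A": 100.0, "m7G-ppp-A": 480.0, "m7G-ppp-G": 4000.0}


def fold_weaker(competitor, reference="m1A", ic50=IC50_NM):
    """How many times more competitor than reference is needed for
    half-maximal inhibition (a higher value means a weaker competitor)."""
    return ic50[competitor] / ic50[reference]


for name in ("m7G-ppp-A", "m7G-ppp-G"):
    print(f"{name}: about {fold_weaker(name):.1f}-fold weaker than m1A")
```

On these numbers the m7G-ppp-A cap analog competes within about fivefold of m1A itself, whereas m7G-ppp-G is roughly 40-fold weaker.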
Notably, the antibody also showed binding to m7G-ppp, but not m7G or ATP (Fig. 4c, Supplementary Fig. 4a), suggesting that the antibody’s binding specificity includes recognition of features all along the entire m7G-ppp-A extended cap structure.
Since the AMA-2 m1A antibody binds the cap structure in an m1A-independent manner, the m1A peaks seen in the 5′UTR likely reflect binding to the mRNA cap. This would explain why our m1A-miCLIP shows read enrichment at the transcription-start nucleotide, as has also been seen using other m1A mapping approaches8.
To test this hypothesis further, we used a second commercially available m1A antibody from Abcam (catalog number ab208196). Unlike the AMA-2 m1A monoclonal antibody available from MBL, the Abcam antibody did not bind the m7G-ppp-A cap analog (Fig. 4d), indicating that it does not exhibit cross-reactivity with the mRNA cap.
We first confirmed by m1A-miCLIP using HEK293T poly(A) RNA that both ab208196 and AMA-2 antibodies can detect authentic m1A sites. In each case, we observed a robust peak at the m1A sites in MT-ND5 and MT-RNR2, the mitochondrially encoded 16S RNA (Fig. 4e), confirming the ability of both antibodies to detect validated m1A sites in mRNA.
We next asked if m1A-miCLIP performed using the m1A-specific ab208196 would produce the same transcriptome-wide 5′UTR enrichment of m1A-containing fragments observed with the AMA-2 m1A antibody3,4. As expected, the metagene of the miCLIP fragments using the AMA-2 antibody showed a prominent 5′UTR enrichment (Fig. 4f). However, a metagene analysis of all the immunoprecipitated reads using ab208196 lacked the 5′UTR enrichment (Fig. 4f). Together, these data demonstrate that only the AMA-2 antibody, which cross-reacts with the mRNA cap, results in a 5′UTR enrichment in read coverage. Overall, these data demonstrate that binding to the 5′UTR regions is not linked to the presence of m1A at these sites, but rather attributable to cross-reactivity.
Comparison of m1A-miCLIP with earlier m1A maps
Although cross-reactivity with mRNA cap structures explains why m1A was mapped to transcription-start nucleotides using the AMA-2 antibody, it does not explain the localization of m1A to internal sites within the 5′UTR, such as the start codon-proximal region, which was proposed to mediate a novel form of translation initiation3. We therefore wanted to understand the exact location of these putative 5′UTR- and start codon-associated m1A sites.
When we compared mRNAs that show AMA-2 m1A antibody miCLIP coverage with the Dominissini et al.3 m1A map that localized m1A to start codons, we found considerable overlap (Supplementary Fig. 6a). However, the location of reads was different (Supplementary Fig. 6b). In particular, the 5′ ends of the miCLIP reads approached the transcription-start site, while reads from Dominissini et al.3 were located downstream of the transcription-start site (Supplementary Fig. 6b, insets). This lateral displacement of peaks towards the start codon is consistent with the library cloning method used in this earlier method (see Supplementary Fig. 1a).
In earlier m1A mapping studies, accumulations of reads, or in some cases, “troughs” of reduced read coverage due to a putative m1A site, were used to predict m1A residues to start codons in mRNA3. miCLIP provides more precise positioning by detecting exact sites of antibody-induced crosslinks, rather than using peaks and troughs, which are a common nonspecific feature in RNA-seq data (see Fig. 2b, e and Supplementary Figs. 5c and 6b).
We additionally re-examined the Li et al.8 high-resolution m1A mapping dataset in HEK293T cells. This study identified 474 m1A sites in nuclear-encoded genes based on m1A-induced reverse transcriptase misincorporations8. However, we eliminated 122 sites for the following reasons: three sites had gene identifiers missing or removed from Refseq, 37 sites did not map to adenosines, and 82 sites were duplicates resulting from mapping to transcript isoforms of the same gene. This left 352 unique putative m1A sites in nuclear-encoded genes (see Methods and Supplementary Fig. 6c).
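The filtering steps and the resulting count can be sketched in Python. This is an illustration under assumed inputs: the record keys (`gene_id`, `chrom`, `pos`, `strand`, `ref_base`) and the function name `filter_sites` are hypothetical, and collapsing isoform duplicates by genomic coordinate is one plausible way to implement that step, not necessarily the one used in Methods.

```python
def filter_sites(sites):
    """Keep unique adenosine sites that have a valid gene identifier.

    sites: list of dicts with hypothetical keys
           gene_id, chrom, pos, strand, ref_base
    """
    seen = set()
    kept = []
    for site in sites:
        if site["gene_id"] is None:
            continue  # identifier missing or removed from the annotation
        if site["ref_base"] != "A":
            continue  # the site does not map to an adenosine
        key = (site["chrom"], site["pos"], site["strand"])
        if key in seen:
            continue  # same genomic site reported for another isoform
        seen.add(key)
        kept.append(site)
    return kept


# Counts quoted in the text.
reported = 474
removed = {"missing_id": 3, "not_adenosine": 37, "isoform_duplicate": 82}
remaining = reported - sum(removed.values())
print(remaining)

# Toy records exercising each filter.
toy = [
    {"gene_id": "G1", "chrom": "chr1", "pos": 100, "strand": "+", "ref_base": "A"},
    {"gene_id": "G1", "chrom": "chr1", "pos": 100, "strand": "+", "ref_base": "A"},
    {"gene_id": None, "chrom": "chr2", "pos": 5, "strand": "-", "ref_base": "A"},
    {"gene_id": "G3", "chrom": "chr3", "pos": 7, "strand": "+", "ref_base": "C"},
]
kept = filter_sites(toy)
print(len(kept))
```

The three exclusion counts sum to 122, which is the number of sites removed from the original 474.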
The Li et al.8 study located m1A sites mostly within the 5′UTR. Only 19 sites were at annotated transcriptional-start sites. However, mRNAs can have alternative transcription-start sites, which differ from the RefSeq-annotated transcription-start site30. To determine if the Li et al.8 m1A sites mapped to alternative transcription-start sites, we compared the reported m1A sites to a list of experimentally validated transcription-start sites in HEK293T cells. This list was derived from CAGE-seq and m6Am mapping data31,32. Of the putative 352 m1A sites, 134 overlapped with CAGE and/or m6Am-inferred transcription-start sites (Supplementary Fig. 6c). Hence, 140 m1A sites occurred at transcription-start sites. The false-positive rate of m1A mapping is not known, so it is possible that other 5′UTR m1A sites are either false positives or map to currently unannotated transcription-start sites. Thus, most, if not all, of the putative start codon/5′UTR m1A sites mapped by that study are localized to alternative transcription-start sites, consistent with our mapping results and consistent with the cap-binding properties of the m1A antibody.
Discussion
Considerable attention has revolved around m1A based on its description as a high-stoichiometry, translation-promoting modification in thousands of mRNAs located near start codons3. Subsequent studies concluded that m1A is less prevalent (~700 sites) with a fraction in mitochondrial mRNA8, while other studies suggest even fewer sites, with only two mRNAs having an m1A at a stoichiometry above 5%7. To address these discrepancies, we developed two m1A mapping approaches: (1) misincorporation mapping, a computational approach to discover m1A-induced misincorporations in ultra-deep RNA-Seq datasets; and (2) m1A-miCLIP, a high-resolution method for mapping m1A antibody-binding sites in the transcriptome using two different m1A-binding antibodies. Misincorporation mapping shows that m1A is present at detectable stoichiometries only in the MT-ND5 transcript, with no m1A in other mitochondrial mRNAs or 5′UTRs of mRNAs as reported previously. Using m1A-miCLIP, we find that the previously observed binding of the AMA-2 m1A antibody to transcription-start nucleotides and the vicinity of start codons is due to a previously unrecognized cross-reactivity of the AMA-2 m1A antibody to cap structures. We confirm this using a separate m1A antibody that lacks this cap-binding cross-reactivity. We further show that m1A is not detectable at transcription-start nucleotides, as previously proposed8, based on mass spectrometry and TLC. Overall, these data show that the divergent m1A mapping data and large number of 5′UTR-mapped m1A sites largely reflect cross-reactivity of the m1A antibody with mRNA caps.
Both m1A and m7G are positively charged purines. This common structural feature may be recognized by the AMA-2 antibody. Because the Abcam antibody (ab208196) binds m1A but not cap structures, it does not generate 5′UTR false-positive signals. m1A-miCLIP carried out using the Abcam antibody shows an m1A signature in MT-ND5 but no transcriptome-wide enrichment in 5′UTRs. These data, along with mass spectrometry, TLC, and misincorporation mapping data, support the idea that the m1A localization to 5′UTR sites is specific to the AMA-2 antibody rather than a reflection of bona fide m1A nucleotides at these sites.
Our results thus support the idea that m1A is a rare and low stoichiometry modification except in the case of MT-ND5 mRNA, as seen in another study7. We were not able to detect m1A in any other cytosolic mRNAs or the mitochondrial mRNAs that were reported to contain m1A8. This could reflect the inability of our method to detect very low stoichiometry m1A modifications.
The initial m1A mapping studies found that m1A is highly prevalent based on mass spectrometry analysis of mRNA3,4. However, more recent experiments showed that poly(A) preparations used for mass spectrometry are usually contaminated with tRNA and rRNA, and when these contaminants are meticulously removed, the m1A signal in poly(A) preparations is absent33. Thus, the newer mass spectrometry data also support the idea that m1A is rare.
A pressing question is whether there are more yet-to-be-discovered modified nucleotides in mRNA. Misincorporation mapping suggests that this is not the case, at least for hard-stop nucleotides. The relative paucity of hard-stop nucleotides in mRNA may reflect the incompatibility of these nucleotides with translation. The ribosome mRNA surveillance pathway induces degradation of mRNAs when tRNAs cannot basepair with codons34. It is therefore notable that the original m1A mapping studies localized many m1A sites to coding sequences, which contributed to skepticism about these maps. If hard-stop nucleotides occur in mRNA, they would likely be transient and act to induce mRNA degradation through this surveillance pathway.
Methods
Cell lines and animals
For misincorporation mapping, an ultra-deep RNA-seq dataset that profiled RNA expression in blood mononucleocytes was used12. For m1A-miCLIP, HEK293T cells (passage 5–10, ATCC CRL-3216) or whole mouse brain (16 weeks of age, pooled male and female brains, C57BL/6) were used. HEK293T cells were purchased directly from ATCC but were not further validated for identity or tested for mycoplasma contamination. Experiments involving the use of animals were approved by the Institutional Animal Care and Use Committee at Weill Cornell Medicine.
Antibodies
The AMA-2 m1A antibody is a mouse monoclonal antibody (MBL catalog number D345-3). It reacts with both m1A within RNA and the N1-methylated adenine base, as documented in MBL’s product specifications and in prior studies3,4. Its specificity was validated previously3,4 and in the current study. The Abcam m1A antibody (ab208196) is a rabbit monoclonal antibody generated against m1A. Its specificity for m1A was validated in this study.
Alignment of reads for misincorporation mapping
Raw reads from the ultra-deep RNA-seq dataset used for this study12 were downloaded from GEO (accession code: GSE33029). This RNA-seq dataset was prepared using standard reverse transcription with SuperScript III, an enzyme expected to produce misincorporations at m1A positions11. Variants identified in the genomic DNA corresponding to this dataset were acquired from http://snyderome.stanford.edu. Coordinates of other SNPs that may be present in the DNA sequence were downloaded from the SNP database dbSNP (February 2017 build; https://www.ncbi.nlm.nih.gov/projects/SNP/). Read alignment of forward and reverse read mates was performed using STAR (version 2.5.3a) and the hg19 genome build. Alignment incorporated removal of PCR duplicates, and clipping of 10 bases on either end of each read, since the ends of Illumina reads are prone to sequencing error35. Only reads that mapped to a single location in the genome were used for downstream analysis. A maximum of one mismatch per read was permitted for alignment.
Misincorporation mapping
To identify misincorporations, aligned reads were analyzed using Rsamtools Pileup (version 1.27.16). This program was used to determine the frequency of each of the four nucleotides present in mapped reads at every genomic position with read coverage. We limited our analysis to nucleotide positions with a minimum combined read depth of 500 unique reads across the 20 biological replicates to maximize sensitivity of detecting modified nucleotides. To prevent calling genomic variants and SNPs as modification-induced misincorporations, we did not analyze nucleotide positions containing variants discovered in the genomic DNA corresponding to the RNA-seq dataset, or SNPs annotated in dbSNP. Importantly, our analysis could only be performed on transcripts longer than the library insert size of ~250 bases12. For this reason, analysis of cytosolic tRNAs, which are ~75 nt-long RNAs that contain known conserved m1A residues, could not be performed. However, short RNAs generated from polycistronic transcripts, like mitochondrial tRNAs36, were represented in the analyzed library. To identify sites of modification throughout the transcriptome, we initially filtered for all nucleotide positions that were covered by at least 500 mapped reads, showed a misincorporation rate of at least 1%, and were detected in at least half of the biological replicates (Supplementary Data 1). To obtain a high-confidence list of modification positions, we further required that, within the misincorporation profile at each initially identified position, a minimum of 5% of misincorporations were heterogeneous (i.e. substitutions of the reference nucleotide with all three possible alternative nucleotides). This requirement minimizes detection of adenosine-to-inosine editing and of heterozygous alleles not reported as variants or SNPs. We chose this filter because hard-stop nucleotides have been shown to cause heterogeneous misincorporations, even when one type of misincorporation is predominant11.
This resulted in a high-confidence list of sites that were detected at known and novel modification positions (Supplementary Data 2).
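The two filtering stages described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the pipeline used in this study (which relied on Rsamtools): the function names and the dictionary input format are hypothetical, the replicate-support criterion is omitted, and the 5% heterogeneity criterion is interpreted here as requiring that all three alternative bases are observed and that the non-dominant alternatives make up at least 5% of misincorporations.

```python
BASES = "ACGT"
MIN_DEPTH = 500            # minimum combined read depth (from the text)
MIN_RATE = 0.01            # minimum misincorporation rate (from the text)
MIN_MINOR_FRACTION = 0.05  # heterogeneity threshold; interpretation is an assumption


def misincorporation_stats(ref_base, counts):
    """Summarize pooled base counts (dict base -> reads) at one position."""
    depth = sum(counts.get(b, 0) for b in BASES)
    alt = {b: counts.get(b, 0) for b in BASES if b != ref_base}
    n_alt = sum(alt.values())
    rate = n_alt / depth if depth else 0.0
    # Misincorporations that are NOT the predominant alternative base.
    minor = n_alt - max(alt.values())
    minor_fraction = minor / n_alt if n_alt else 0.0
    all_three = all(n > 0 for n in alt.values())
    return depth, rate, minor_fraction, all_three


def is_candidate(ref_base, counts):
    """Initial filter: sufficient depth and misincorporation rate."""
    depth, rate, _, _ = misincorporation_stats(ref_base, counts)
    return depth >= MIN_DEPTH and rate >= MIN_RATE


def is_high_confidence(ref_base, counts):
    """Second filter: candidate site with a heterogeneous misincorporation profile."""
    depth, rate, minor_fraction, all_three = misincorporation_stats(ref_base, counts)
    return (depth >= MIN_DEPTH and rate >= MIN_RATE
            and all_three and minor_fraction >= MIN_MINOR_FRACTION)
```

Under this interpretation, a position where the reference A is read almost exclusively as G (as expected for adenosine-to-inosine editing or a heterozygous allele) passes the initial filter but fails the second one, whereas a position with a mixed substitution profile passes both.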
m1A-miCLIP
m1A-miCLIP was performed as previously described20,37. The procedure, including any modifications, is briefly described below. C57BL/6 mice (8 weeks) were sacrificed by CO2 inhalation and cervical dislocation as approved by the Weill Cornell Medicine Institutional Animal Care and Use Committee (IACUC). Total RNA from HEK293T cells (n=2 biological replicates) or whole mouse brain (n=6 biological replicates) was extracted using TRIzol (ThermoFisher) and treated with RNase-free DNase I (Promega). Poly(A) RNA was isolated using one round of selection with oligo(dT)25 magnetic beads (New England Biolabs). This resulted in approximately 10μg of poly(A) RNA for each replicate used in this study. Poly(A) RNA was subjected to fragmentation using RNA Fragmentation Reagents (ThermoFisher) for exactly 12min at 75°C. This fragmentation protocol is identical to the one in m1A-seq, and has been reported not to facilitate substantial m1A-to-m6A rearrangement3. Fragmented RNA was then incubated with 10–15μg of m1A antibody per replicate and the antibody-RNA complexes were processed for crosslinking, immunoprecipitation, RNA 3′ linker ligation, purification, and reverse transcription20,37. Following reverse transcription of purified peptide-RNA complexes, first-strand cDNA was circularized using CircLigase II ssDNA Ligase (EpiBio) to preserve the 3′ end of the cDNA, and thus, sites of m1A-induced terminations of reverse transcription. To generate priming sites for library amplification, the cDNA was cut in the middle of the cDNA primer sequence using a single-stranded DNA oligo complementary to this sequence and FastDigest BamHI (ThermoFisher)20,37. This generated priming sites for the Illumina P5 and P3 primers on either side of the first-strand cDNA, eliminating the need for a second-strand synthesis step. For library amplification, Accuprime Supermix I (ThermoFisher) and Illumina P5 and P3 primers were used (see Supplementary Data 7).
Amplified libraries were purified using AMPure XP magnetic beads (Beckman Coulter). Libraries were subjected to next-generation sequencing at the Epigenomics Core of Weill Cornell Medicine. Libraries were sequenced on Illumina HiSeq 2500 and MiSeq instruments in single-end mode to generate 50-base reads.
RNA-seq
HEK293T cell total RNA was extracted with TRIzol (ThermoFisher), treated with RNase-free DNase I (Promega), and poly(A) RNA was isolated using one round of selection with oligo(dT)25 magnetic beads (New England Biolabs). RNA was then subjected to fragmentation using RNA Fragmentation Reagents (ThermoFisher) for exactly 12min at 75°C. Fragmented RNA was then subjected to RNA 3′ linker ligation using T4 RNA Ligase I (New England Biolabs) and reverse transcription using a primer complementary to the linker sequence and SuperScript III (ThermoFisher) (see Supplementary Data 7). First-strand cDNA was gel-purified using denaturing PAGE, and then circularized using Circligase II ssDNA Ligase (EpiBio). Circularized cDNA was then cut and amplified exactly as described above for m1A-miCLIP. Resulting RNA-seq libraries were subjected to next-generation sequencing at the Epigenomics Core of Weill Cornell Medicine. Libraries were sequenced on an Illumina HiSeq 2500 instrument in single-end mode to generate 50-base reads.
Read processing and alignment
After sequencing, reads from m1A-miCLIP or RNA-seq libraries were trimmed of the 3′ linker sequence and barcoded reverse transcription primer sequences using Flexbar (version 2.5) (see Supplementary Data 7). To demultiplex reads belonging to individual biological replicates, the pyBarcodeFilter.py script of the pyCRAC suite (version 1.2.2) was used. The random portion of the reverse transcription barcode was then moved into the sequence header using a custom awk script (available upon request). PCR duplicates were collapsed using pyFastqDuplicateRemover.py of the pyCRAC suite. Finally, reads were aligned to hg19 for HEK293T cells or mm10 for mouse brain using Bowtie (version 1.1.2).
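The barcode-handling and duplicate-collapsing steps can be illustrated with a short Python sketch. It is not the awk script or the pyCRAC code used here: the function names, the `##` header separator, and the assumption that the random barcode occupies the first `n_random` bases of each read are all hypothetical choices made for the example.

```python
def move_barcode_to_header(header, seq, qual, n_random=4):
    """Move the random barcode (assumed to be the first n_random bases)
    from the read sequence into the FASTQ header."""
    barcode = seq[:n_random]
    return f"{header}##{barcode}", seq[n_random:], qual[n_random:]


def collapse_duplicates(records):
    """Keep one read per (random barcode, sequence) pair.

    Reads sharing both the barcode and the insert sequence are treated
    as PCR duplicates; the first occurrence is retained."""
    seen = set()
    unique = []
    for header, seq, qual in records:
        barcode = header.rsplit("##", 1)[-1]
        key = (barcode, seq)
        if key not in seen:
            seen.add(key)
            unique.append((header, seq, qual))
    return unique
```

Keeping the random barcode in the header lets reads with identical inserts but different barcodes be retained as independent ligation events, which is the purpose of collapsing duplicates only after this step.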
Generation of m1A-miCLIP clusters
m1A-miCLIP clusters of unique reads were generated using the CIMS software package for analysis of HITS-CLIP data38,39. To generate clusters and determine the cluster score (maximum of stacked reads), the tag2profile.pl, tag2cluster.pl and extractPeak.pl scripts of the CIMS software package were used. A custom awk script was then used to filter for clusters of a minimum score (at least 20 stacked reads; script is available upon request).
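As an illustration of the cluster score (the maximum number of stacked reads) and the minimum-score filter, the following Python sketch computes the peak read depth within a cluster using a sweep over read boundaries. It is a simplified stand-in for the CIMS scripts and the awk filter, not a reimplementation of them; the function names and the half-open `(start, end)` interval representation are assumptions.

```python
def max_stacked_reads(reads):
    """Return the maximum number of overlapping reads.

    `reads` is a list of half-open (start, end) intervals on one strand."""
    events = []
    for start, end in reads:
        events.append((start, 1))   # read begins
        events.append((end, -1))    # read ends
    # At equal coordinates, -1 sorts before +1, so reads that merely
    # touch end-to-start are not counted as stacked.
    events.sort()
    depth = best = 0
    for _, delta in events:
        depth += delta
        best = max(best, depth)
    return best


def filter_clusters(clusters, min_score=20):
    """Keep clusters (dict name -> list of reads) with score >= min_score."""
    kept = {}
    for name, reads in clusters.items():
        score = max_stacked_reads(reads)
        if score >= min_score:
            kept[name] = score
    return kept
```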
Motif analyses
To search for a possible common sequence motif present in our HEK293T cell m1A-miCLIP dataset, we focused on potential motifs present in m1A-miCLIP clusters in the 5′UTR, the region of predominant m1A-miCLIP cluster enrichment. The genomic sequences of these clusters were retrieved using bedtools and subjected to motif discovery using the MEME suite (version 4.11.4).
Metagene distribution analyses
To analyze the metagene distribution of m1A-miCLIP clusters on mRNAs, MetaPlotR was used40, with in-house modifications. The density of m1A-miCLIP coverage was normalized to that of RNA-seq coverage to reveal any enrichments using a custom R script (available upon request). For the HEK293T cell metagene, the in-house HEK293T cell RNA-seq dataset described above was used for normalization. For the mouse brain metagene, a published whole-brain RNA-seq dataset was used41 (accession code: GSE52564). To plot the coverage of transcription-start sites by m1A-miCLIP at higher resolution, the plotProfile tool of the Deeptools suite was used.
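The normalization of m1A-miCLIP density to RNA-seq coverage can be sketched as a per-bin ratio of densities. The Python below is a minimal sketch only: the custom R script is not reproduced here, and the function name, the binned-count inputs, and the choice to report bins without RNA-seq coverage as `None` are assumptions.

```python
def normalized_enrichment(miclip_bins, rnaseq_bins):
    """Divide the m1A-miCLIP density in each metagene bin by the RNA-seq
    density in the same bin. Densities are counts divided by the total
    across all bins; bins without RNA-seq coverage yield None."""
    miclip_total = sum(miclip_bins)
    rnaseq_total = sum(rnaseq_bins)
    if miclip_total == 0 or rnaseq_total == 0:
        raise ValueError("both datasets need non-zero total coverage")
    enrichment = []
    for miclip, rnaseq in zip(miclip_bins, rnaseq_bins):
        if rnaseq == 0:
            enrichment.append(None)
            continue
        enrichment.append((miclip / miclip_total) / (rnaseq / rnaseq_total))
    return enrichment
```

Values above 1 indicate bins where antibody-bound fragments are over-represented relative to transcript abundance, which is how a 5′UTR enrichment would appear in such a profile.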
Examination of antibody crosslink sites around the transcription-start site
For analysis of antibody crosslinks at the transcription-start sites of mRNAs, we analyzed terminations of reverse transcription (i.e. 5′ ends of reads) around these sites. To do so, the number of terminations was measured around RefSeq-annotated transcription-start sites that had coverage in both m1A-miCLIP and RNA-seq. Terminations were counted at positions ranging from the transcription-start site to position +4 relative to the transcription-start site. Then, transcription-start sites were filtered for those that contained a minimum coverage of five unique reads at the transcription-start site position in both m1A-miCLIP and RNA-seq. This filtered set of transcription-start sites was then used to compare the distributions of read terminations in m1A-miCLIP and RNA-seq. We focused on terminations rather than misincorporations in m1A-miCLIP because the misincorporation profile of m1A is sequence dependent, with both upstream and downstream nucleotides contributing to misincorporation variability11. Thus, we used the presence of terminations as a signature of antibody crosslinking events in our dataset. Additionally, while rare types of reverse transcriptases that read through m1A have been described42, standard reverse transcriptases, like the SuperScript III used in m1A-miCLIP, produce frequent terminations at m1A residues11,43.
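The termination counting described above can be sketched as follows. This is an illustrative simplification, not the analysis code used in the study: it assumes a single chromosome and plus-strand transcripts, and the function names and input formats (lists of positions, dictionaries of per-position coverage) are hypothetical.

```python
from collections import Counter


def filter_tss(tss_positions, miclip_cov, rnaseq_cov, min_cov=5):
    """Keep transcription-start sites covered by at least min_cov unique
    reads at the TSS position in both datasets (dict position -> coverage)."""
    return [
        tss for tss in tss_positions
        if miclip_cov.get(tss, 0) >= min_cov and rnaseq_cov.get(tss, 0) >= min_cov
    ]


def termination_profile(tss_positions, read_starts, max_offset=4):
    """Count read 5' ends (terminations) at offsets 0..max_offset
    downstream of each TSS, summed over all TSSs."""
    starts = Counter(read_starts)
    profile = [0] * (max_offset + 1)
    for tss in tss_positions:
        for offset in range(max_offset + 1):
            profile[offset] += starts[tss + offset]
    return profile
```

Running `termination_profile` once with m1A-miCLIP read starts and once with RNA-seq read starts over the same filtered TSS set gives the two distributions that are compared.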
Measurement of transcription-start sites enriched by m1A-miCLIP
To determine the types of transcription-start sites overlapping m1A-miCLIP clusters, we used a collection of transcription-start sites that included RefSeq transcription-start sites as well as recently-mapped transcription-start site regions containing the m6Am mRNA extended cap24. The frequencies of all transcription-start site types or those overlapping m1A-miCLIP clusters were thus determined using this collective set.
Synthetic oligonucleotides used in this study
For biochemical analysis of various modifications present within the extended caps of mRNAs, synthetic oligonucleotides were generated as standards for mass spectrometry and/or thin layer chromatography (see below; see Supplementary Data 7). Oligonucleotides containing m7G-ppp-Am, m7G-ppp-m6A, or m7G-ppp-m6Am were synthesized chemically44. Oligonucleotides containing m7G-ppp-m1A or m7G-ppp-m1Am were synthesized enzymatically using an oligonucleotide initiating with ppp-m1A (Trilink). This oligonucleotide was capped using ScriptCap Cap 1 Capping System (CellScript) to generate the m7G cap and, in the case of m7G-ppp-m1Am, 2′-O-methylation of m1A.
Liquid chromatography and mass spectrometry (LC-MS)
Poly(A) RNA was prepared for mass spectrometry as follows. Total RNA from HEK293T cells was treated with TURBO DNase (ThermoFisher) according to the manufacturer’s instructions, followed by two rounds of poly(A) selection using oligo(dT) magnetic beads (NEB). Small RNAs shorter than 200 nt were then removed from the poly(A) RNA using the RNeasy kit (Qiagen). This size selection was performed to prevent detection of extended cap structures that are known to be present in certain small RNAs, like small nuclear RNAs (snRNAs). Approximately 5μg of DNase-treated, poly(A)-selected, and size-selected RNA was thus generated for each sample for mass spectrometry analysis. To release extended cap structures from the nucleotides comprising the internal portion of the RNA, RNA was digested with 2–4 units of Nuclease P1 (Sigma Aldrich) in a final buffer concentration of 30mM sodium acetate (pH 5.5) for 3h at 37°C. Following digestion, the nuclease was removed from the samples using molecular weight cutoff centrifugal filters (VWR). The digested and purified RNA was finally dried using an Eppendorf Vacufuge and reconstituted with 70% acetonitrile (LC-MS grade; Sigma Aldrich) to a final concentration of 0.5µg/µl. Two microliters of the resulting solution were subjected to MS analysis.
Samples were injected into an LC-MS/MS system comprising an Agilent 1260 HPLC and an Agilent 6460 triple quadrupole mass spectrometer equipped with a JetStream electrospray ionization source. Positive ion monitoring and multiple reaction monitoring were used for detection of extended caps. The caps were resolved on an aqueous normal phase column (ANP, Cogent Diamond Hydride, 4µm particle size, 150mm×2.1mm; Microsolv). To achieve chromatographic separation of the cap structures from mononucleotides, the following gradient was used. The aqueous mobile phase (Buffer A) was 50% isopropanol with 0.025% acetic acid, and the organic mobile phase (Buffer B) was 90% acetonitrile containing 5mM ammonium acetate. EDTA was added to the mobile phase at a final concentration of 6µM. The final gradient applied was 0–1.0min 99% B, 1.0–7.0min to 80% B, 7.0–18.0min to 50% B, 18.0–19.0min to 0% B, and 19.1–29.0min 99% B. The flow rate was 0.4mL/min during data acquisition and 0.6mL/min during column re-equilibration. Data were saved in centroid mode using MassHunter workstation acquisition software (Agilent). Data files were processed with MassHunter Qualitative Analysis Software (Agilent).
Exact operating source parameters for the LC-MS analysis are available upon request.
Biochemical examination of modifications at transcription-start sites
To identify the initiating nucleotide structure in mRNAs bound by the m1A antibody, we utilized 2D-TLC analysis of mRNA extended caps24. For analysis of cellular RNA, we used HEK293T cell poly(A) RNA or poly(A) RNA enriched using the m1A antibody3,4. Oligonucleotide standards, input (antibody-unbound) poly(A) RNA, and m1A antibody-enriched poly(A) RNA were then subjected to 2D-TLC24. To enhance resolution of m1A and m1Am from other nucleotide species, the first dimension of 2D-TLC was resolved using 66% isobutyric acid and 1% NH4OH in water, and the second dimension was resolved using 60% ammonium sulfate in 100mM sodium phosphate buffer of pH 6.8 (w/v) with a final concentration of 2% n-propanol. Both dimensions were resolved overnight.
Synthesis of N1,6-methyladenosine
To synthesize N1,6-methyladenosine, N6-methyladenosine (Selleckchem) was dissolved in dry DMF, followed by addition of iodomethane (Acros Organics; 10:1 molar ratio iodomethane:N6-methyladenosine). The mixture was stirred overnight at room temperature. The product was purified by flash chromatography on silica gel (EMD), eluting with methanol and dichloromethane (1:10 to 1:5; ACS or HPLC grade solvents). This resulted in a product yield of 46.3% N1,6-methyladenosine. Product identity was confirmed by nuclear magnetic resonance (NMR) and high-resolution mass spectrometry (HR-MS).
NMR spectra were recorded using a 500-MHz Bruker DMX-500 instrument at room temperature, and chemical shifts were referenced to the residual solvent peak. Shifts were as follows: 1H NMR (500MHz, DMSO-d6) δ 8.11 (s, 1H), 8.04 (s, 1H), 5.76 (d, J=5.8Hz, 1H), 5.15 (s, 1H), 4.43 (t, J=5.4Hz, 1H), 4.15 (m, 2H), 3.92 (d, J=3.7Hz, 1H), 3.63 (dd, J=12.0, 3.9Hz, 1H), 3.56 (m, 2H), 3.50 (s, 3H), 3.45 (d, J=6.5Hz, 1H), 1.23 (s, 3H).
HR-MS data were recorded with Waters LCT-Premier XE at room temperature. For a predicted mass for N1,6-methyladenosine, or C12H18N5O4+, of 296.1353, the mass found was 296.1361.
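As a worked check of the predicted mass, the monoisotopic mass of the C12H18N5O4+ cation can be computed from standard monoisotopic atomic masses. The Python below is a small sketch; the function names are hypothetical, and subtracting one electron mass for the positive charge is an assumption about how the predicted value was derived.

```python
# Standard monoisotopic atomic masses (u) and the electron mass (u).
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}
ELECTRON = 0.00054858


def cation_mass(formula, charge=1):
    """Monoisotopic mass of a cation given its elemental composition
    (dict element -> count), corrected for the missing electron(s)."""
    neutral = sum(MONOISOTOPIC[element] * n for element, n in formula.items())
    return neutral - charge * ELECTRON


def ppm_error(observed, predicted):
    """Mass error in parts per million."""
    return (observed - predicted) / predicted * 1e6


predicted = cation_mass({"C": 12, "H": 18, "N": 5, "O": 4})
error = ppm_error(296.1361, 296.1353)
```

With these inputs, `predicted` agrees with the stated value of 296.1353 to four decimal places, and the difference between the found and predicted masses corresponds to an error of roughly 2.7 ppm.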
Analysis of Li et al. m1A sites
The Li et al.8 m1A sites for nuclear-encoded genes were obtained from Supplementary Table 2 of the published manuscript. We annotated each of the 474 transcriptomic sites with their corresponding genomic coordinates and nucleotide sequences using an annotation file generated from RefSeq with MetaPlotR. With a custom R script, we then filtered likely erroneous sites as specified in the Results section. Briefly, sites corresponding to gene IDs missing in RefSeq, sites that mapped to non-adenosine nucleotides, and sites with duplicate genomic coordinates were removed.
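The three site filters can be sketched in Python as below. This is not the custom R script: the record fields (`gene_id`, `chrom`, `pos`, `strand`, `base`) and the function name are hypothetical, and the handling of duplicate genomic coordinates (keeping the first occurrence) is an assumption, since the text does not state whether one copy or all copies were discarded.

```python
def filter_sites(sites, refseq_gene_ids):
    """Drop sites whose gene ID is absent from RefSeq, that do not map to
    an adenosine, or whose genomic coordinate was already seen."""
    seen = set()
    kept = []
    for site in sites:
        if site["gene_id"] not in refseq_gene_ids:
            continue  # gene ID missing in the annotation
        if site["base"].upper() != "A":
            continue  # annotated nucleotide is not an adenosine
        coord = (site["chrom"], site["pos"], site["strand"])
        if coord in seen:
            continue  # duplicate genomic coordinate (first copy kept)
        seen.add(coord)
        kept.append(site)
    return kept
```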
Characterization of antibody affinity for various substrates
The specificity or affinity of the m1A antibody for various nucleosides, nucleotides, or cap structures was determined as follows. To determine the specificity of the m1A antibody for various nucleosides, two approaches were used. For testing the specificity of the antibody for various nucleosides in the context of m1A-miCLIP, the antibody was crosslinked to total cellular RNA in the presence of various competitor nucleotides. Antibody binding, crosslinking, and detection of crosslinked antibody-RNA complexes were performed exactly as in miCLIP, except with the inclusion of the competitor nucleotide during the antibody-binding reaction.
For testing the specificity of the antibody for modified adenines, especially those resembling N1-methylated adenine, a dot blot assay was performed wherein the competing molecule is added during antibody binding1. Competition assays are usually used to measure binding, rather than spotting the nucleotides onto the membrane, because the manner in which each nucleotide interacts with the membrane is not known and can affect antibody binding. For measuring the affinity, the IC50 of the various nucleotides and cap dinucleotides was measured. In these experiments, a series of dot blot assays was performed, where serial dilutions of each competitor molecule (ranging from 10µM to 1nM) were used in parallel during antibody binding reactions. The dot blots were performed as follows: 250ng of m1A-containing synthetic oligonucleotide (see Supplementary Data 7) were spotted in triplicate on a BrightStar membrane (ThermoFisher), allowed to briefly air-dry, and auto-crosslinked twice in a Stratalinker 2400 (Stratagene). Each membrane was rinsed briefly in PBST, then blocked for 1h at room temperature in 5% milk in PBST. Each membrane was then placed into a pouch containing a 1:1000 dilution of the m1A antibody in 0.5% milk in PBST, and an appropriate concentration of competitor molecule. The antibody binding proceeded for 2h at room temperature. Then, each membrane was washed three times in PBST (5min per wash), and then incubated in a 1:2500 dilution of secondary antibody (anti-mouse, GE NA931; anti-rabbit, GE NA934) in 0.5% milk in PBST for 1h at room temperature. Finally, the membrane was washed three times in PBST (5min per wash), and developed using ECL Prime (GE). Membranes corresponding to a dilution series of a specific competitor molecule were imaged together using a ChemiDoc Imager (Bio-Rad).
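One simple way to estimate an IC50 from such a dilution series is to interpolate, on a log-concentration scale, between the two competitor concentrations that bracket 50% of the uncompeted signal. The Python sketch below illustrates this idea only; the text does not specify how IC50 values were calculated, so the interpolation method, the function name, and the assumption that dot intensities have already been normalized to a no-competitor control are all hypothetical.

```python
import math


def estimate_ic50(concentrations, fractions_bound):
    """Estimate the IC50 by log-linear interpolation.

    `concentrations` are competitor concentrations (any consistent unit);
    `fractions_bound` are signals normalized to the no-competitor control.
    Returns None if the series never crosses 50%."""
    pairs = sorted(zip(concentrations, fractions_bound))
    for (c_lo, f_lo), (c_hi, f_hi) in zip(pairs, pairs[1:]):
        if f_lo >= 0.5 >= f_hi and f_lo != f_hi:
            # Fractional position of the 50% point between the two doses.
            t = (f_lo - 0.5) / (f_lo - f_hi)
            log_ic50 = math.log10(c_lo) + t * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_ic50
    return None
```

A competitor that never reduces the signal below 50% across the 1nM to 10µM range returns `None`, which corresponds to an IC50 above the highest concentration tested.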
Reporting summary
Further information on research design is available in the Nature Research Reporting Summary linked to this article.
Acknowledgements
We thank S. Schwartz for helpful discussions, and members of the Jaffrey laboratory for helpful comments and suggestions. We thank J. Mauer and S. Zaccara for experimental contributions that were not included in this manuscript. This work was supported by NIH grants R01DA037755 (to S.R.J.), T32 HD060600 and UL1 TR000457 (to A.V.G.), T32 CA062948, KL2-TR-002385, NRSA 1F32GM120987 (to A.O.O.-G.), U01 HL121828 and P01 HD067244 (to M.S. and S.S.G.), and by a Postdoctoral Enrichment Program Award from the Burroughs Wellcome Fund (to A.O.O.-G.).
Author contributions
S.R.J., A.V.G., and A.O.O.-G. conceived the project and designed and carried out experiments. A.O.O.-G. designed and implemented the misincorporation mapping method. A.V.G., and partly A.O.O.-G, performed m1A-miCLIP, characterized the specificity of the antibodies, and analyzed the miCLIP data. M.S. and S.S.G. performed mass spectrometry analysis of mRNA extended caps. X.L. performed chemical synthesis. S.R.J., A.V.G., and A.O.O.-G wrote the manuscript with input from all the co-authors.
Data availability
A reporting summary for this Article is available as a Supplementary Information file. Sequencing data have been deposited in GEO (GSE97909). The source data for Figs. 2e, 4b and 4c are provided in the Source Data File. All data are available from the corresponding author upon reasonable request.
Code availability
Custom scripts used in this study are available here: https://github.com/olarerin/misincorporation_mapping.
Competing interests
S.R.J. declares a competing interest; he is scientific founder, advisor to, and owns equity in Gotham Therapeutics. All other authors declare no competing interests.
Footnotes
Peer review information Nature Communications thanks Yuri Motorin, and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
These authors contributed equally: Anya V. Grozhik, Anthony O. Olarerin-George.
Supplementary information
Supplementary information is available for this paper at 10.1038/s41467-019-13146-w.
Articles from Nature Communications are provided here courtesy of Nature Publishing Group
Funding
Funders who supported this work.
Burroughs Wellcome Fund (1)
Grant ID: PDEA
NCATS NIH HHS (3)
Grant ID: UL1 TR000457
Grant ID: UL1 TR002384
Grant ID: KL2 TR002385
NCI NIH HHS (1)
Grant ID: T32 CA062948
NHLBI NIH HHS (1)
Grant ID: U01 HL121828
NICHD NIH HHS (2)
Grant ID: T32 HD060600
Grant ID: P01 HD067244
NIDA NIH HHS (1)
Grant ID: R01 DA037755
NIGMS NIH HHS (1)
Grant ID: F32 GM120987
NINDS NIH HHS (1)
Grant ID: R35 NS111631