Abstract
More than 90% of common variants associated with complex traits do not affect proteins directly, but instead the circuits that control gene expression. This has increased the urgency of understanding the regulatory genome as a key component for translating genetic results into mechanistic insights and ultimately therapeutics. To address this challenge, we developed HaploReg (http://compbio.mit.edu/HaploReg) to aid the functional dissection of genome-wide association study (GWAS) results, the prediction of putative causal variants in haplotype blocks, the prediction of likely cell types of action, and the prediction of candidate target genes by systematic mining of comparative, epigenomic and regulatory annotations. Since first launching the website in 2011, we have greatly expanded HaploReg, increasing the number of chromatin state maps to 127 reference epigenomes from ENCODE 2012 and Roadmap Epigenomics, incorporating regulator binding data, expanding regulatory motif disruption annotations, and integrating expression quantitative trait locus (eQTL) variants and their tissue-specific target genes from GTEx, Geuvadis, and other recent studies. We present these updates as HaploReg v4, and illustrate a use case of HaploReg for attention deficit hyperactivity disorder (ADHD)-associated SNPs with putative brain regulatory mechanisms.
INTRODUCTION
Phenotype-associated loci from genome-wide association studies (GWAS) are usually non-coding, and functionally interpreting them is a challenge due to linkage disequilibrium (LD) and our almost complete inability to predict regulatory function directly from non-coding sequence. Therefore, regulatory genomic data such as maps of enhancers and transcription factor binding sites are essential to interpreting GWAS, developing mechanistic hypotheses, and ultimately understanding the genetic architecture of complex traits and disease (1–3). For human geneticists, these regulatory data can be unwieldy to translate from a genome browser to insights about a set of genomically-dispersed disease variants. HaploReg (4) integrates regulatory genomic maps together in the context of haplotype blocks, allowing researchers to intersect regulatory elements with genetic variants to quickly formulate functional hypotheses, both through dissection of multiple variants within a haplotype block and through global enrichment analysis of a set of associated loci. HaploReg annotation of GWAS has successfully been applied for haplotype fine-mapping (5–9) and enrichment analysis (7,10,11).
DATA AND INTERFACE UPDATES
HaploReg has been expanded substantially since it first launched in 2011. Here we describe the updates that have been incorporated in HaploReg v4 in response to new research in regulatory genomics and feedback from users.
Catalog of variants
HaploReg v4 defines a core set of 52 054 804 variants, consisting primarily of single-nucleotide polymorphisms (SNPs), using all refSNP IDs, hg19 positions and alleles from dbSNP release b137 (12). Corresponding hg38 coordinates for these variants were obtained from dbSNP release b141. This core set of dbSNP variants was integrated with other data sets either by rsID (for GWAS, eQTL and 1000 Genomes data) or by intersecting intervals by coordinate using the BEDTools software package (13) (for all other functional tracks).
Linkage disequilibrium was calculated using phased low-coverage whole-genome autosomal sequences for four ancestral super-populations (AFR, AMR, ASN and EUR) from the 1000 Genomes Project Phase 1 release (14), using a search space of all variants within 250 kilobases of each other. Allele frequencies were also obtained for each population.
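As an illustration of the two LD statistics reported by HaploReg (r² and D′), the following minimal sketch computes both for a pair of biallelic variants from phased haplotypes. It is not the pipeline used to build the database: the function name `ld_stats` and the 0/1 allele encoding are assumptions made for this example, and D′ is returned as an absolute value.

```python
def ld_stats(haplotypes):
    """Return (r2, |D'|) for two biallelic variants, or None if either is monomorphic.

    `haplotypes` is a list of (a, b) pairs, one per phased chromosome, where
    0 is the reference allele and 1 is the alternate allele at each variant.
    (This encoding is an assumption of the sketch.)
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n   # alternate-allele frequency, variant A
    p_b = sum(b for _, b in haplotypes) / n   # alternate-allele frequency, variant B
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n

    d = p_ab - p_a * p_b                      # raw disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return None                           # LD is undefined for a monomorphic site

    r2 = d * d / denom
    # D' rescales D by its maximum attainable magnitude given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    return r2, abs(d) / d_max


# Toy example: ten phased chromosomes.
example = [(1, 1)] * 4 + [(0, 0)] * 4 + [(1, 0), (0, 1)]
print(ld_stats(example))
```

In a genome-wide setting this calculation would be restricted to variant pairs within the chosen search space (250 kilobases in the text above) and repeated separately for each population.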
Location of variants relative to genes was calculated using BEDTools and both GENCODE (15) and RefSeq (16).
Genome-wide association studies
GWAS results were obtained from the EBI-NHGRI GWAS Catalog (17) (downloaded 30 October 2015). When multiple studies reported results for the same trait, trait-wide pruning was performed: whenever two results from different studies overlapped or lay within one megabase of each other, only the strongest (lowest P-value) result was retained.
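One way to picture the trait-wide pruning is as a greedy pass over results sorted by P-value. The sketch below is an assumption about how such pruning could be implemented, not the authors' code: the record fields (`trait`, `study`, `chrom`, `pos`, `pvalue`) and the function name `prune_trait_hits` are hypothetical, and the greedy ordering is one reasonable reading of the description.

```python
from collections import defaultdict


def prune_trait_hits(hits, window=1_000_000):
    """Greedy per-trait pruning of GWAS results (illustrative sketch).

    `hits` is a list of dicts with hypothetical keys: trait, study, chrom,
    pos, pvalue. A result is dropped when a stronger result for the same
    trait, from a different study, lies within `window` bases of it.
    """
    kept = []
    kept_by_locus = defaultdict(list)  # (trait, chrom) -> results kept so far
    # Visit the strongest associations first so they take precedence.
    for hit in sorted(hits, key=lambda h: h["pvalue"]):
        neighbours = kept_by_locus[(hit["trait"], hit["chrom"])]
        clash = any(
            k["study"] != hit["study"] and abs(k["pos"] - hit["pos"]) <= window
            for k in neighbours
        )
        if not clash:
            kept.append(hit)
            neighbours.append(hit)
    return kept


hits = [
    {"trait": "T", "study": "A", "chrom": "1", "pos": 1_000_000, "pvalue": 1e-9},
    {"trait": "T", "study": "B", "chrom": "1", "pos": 1_500_000, "pvalue": 1e-8},
    {"trait": "T", "study": "B", "chrom": "1", "pos": 5_000_000, "pvalue": 1e-7},
]
for h in prune_trait_hits(hits):
    print(h["study"], h["pos"], h["pvalue"])
```

Results from the same study are never pruned against each other in this sketch, following the text's restriction to results from different studies.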
Sequence conservation
Mammalian evolutionarily constrained elements are defined as originally reported, using both SiPhy elements (18) and GERP elements (19). Both of these comparative genomics studies report base-level conservation scores as well as discretized elements; we chose to report the discretized elements resulting from the authors’ algorithms for the sake of simplicity and interpretability. A colored cell indicates that the element is conserved according to the algorithm.
Regulatory protein binding
Protein-binding sites from a variety of cell types and experimental conditions were obtained from the ENCODE Project ChIP-Seq data (20), processed by the narrowPeak algorithm.
Reference epigenomes
Epigenomic data from the Roadmap Epigenomics project (11) for the following data sets were included: ChromHMM states corresponding to enhancer or promoter elements, from the 15-state core model and 25-state model incorporating imputed data (21); histone modification ChIP-seq peaks using the gappedPeak algorithm for H3K27ac, H3K9ac, H3K4me1 and H3K4me3; and DNase hypersensitivity data peaks using the narrowPeak algorithm.
Expression quantitative trait loci
Expression QTL (eQTL) results were obtained from the GTEx pilot analysis v6 (22), the GEUVADIS project (23) and 12 other studies (10,24–34) in order to annotate variants with their putative regulatory target genes and the tissue(s) in which genotype has been associated with gene expression level. A wide range of QTLs, including eQTLs and other molecular QTLs such as metabolite QTLs, were also extracted from the GRASP database, build 2.0.0.0 (35,36).
Regulatory motifs
A library of position weight matrices from commercial, literature and motif-finding analysis of the ENCODE project (37) was used to score the effect of variants on regulatory motifs using the position weight matrix (PWM)-scanning process described previously (4).
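The exact PWM-scanning procedure is the one described in (4) and is not reproduced here. As a generic illustration of how a variant's effect on a motif match can be scored, the sketch below uses a log-odds score against a uniform background and compares the best-scoring window covering the variant for the reference and alternate alleles. The toy PWM, the uniform background, the forward-strand-only scan and all function names are assumptions of this example.

```python
from math import log2

BACKGROUND = 0.25  # assumed uniform base frequency


def pwm_score(pwm, seq):
    """Log-odds score of `seq` against `pwm` (one probability dict per position)."""
    return sum(log2(col[base] / BACKGROUND) for col, base in zip(pwm, seq))


def best_score(pwm, seq, var_idx):
    """Best score over all forward-strand windows that cover position `var_idx`.

    Assumes len(seq) >= len(pwm) and that every PWM entry is non-zero.
    """
    w = len(pwm)
    starts = range(max(0, var_idx - w + 1), min(var_idx, len(seq) - w) + 1)
    return max(pwm_score(pwm, seq[s:s + w]) for s in starts)


def motif_change(pwm, ref_seq, var_idx, alt):
    """Alternate-minus-reference change in the best motif score at a variant."""
    alt_seq = ref_seq[:var_idx] + alt + ref_seq[var_idx + 1:]
    return best_score(pwm, alt_seq, var_idx) - best_score(pwm, ref_seq, var_idx)


# Toy 3-bp motif preferring "ACG"; probabilities are invented for illustration.
toy_pwm = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
]
# A>T substitution at index 2 weakens the match, so the change is negative.
print(motif_change(toy_pwm, "TTACGTT", 2, "T"))
```

A fuller implementation would also scan the reverse complement and apply a match threshold before reporting a change.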
Enrichment analysis
For a given set of lead SNPs from a GWAS or user-input SNPs, the overlap of SNPs with predicted enhancers in each reference epigenome is assessed. Users have four different options for defining enhancers, available in the option panel: using the 15-state core model, using the 25-state model incorporating imputed epigenomes, using H3K4me1 peaks and using H3K27ac peaks. The overlap with enhancers in each cell type is compared to two background models to assess enrichment: all 1000 Genomes variants with a frequency above 5% in any population and all independent GWAS catalog SNPs. Enrichment relative to these background frequencies is assessed using a binomial test, and uncorrected P-values are reported in an enrichment table underneath the haplotype views.
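The binomial test can be sketched as follows. The text states only that a binomial test is used, so the one-sided (upper-tail) form shown here is an assumption, as are the function name and the example numbers, which are invented and do not correspond to any HaploReg result.

```python
from math import comb


def binomial_enrichment_p(k, n, p_bg):
    """Upper-tail binomial P-value, P(X >= k) for X ~ Binomial(n, p_bg).

    k    : number of query SNPs overlapping enhancers in one epigenome
    n    : total number of query SNPs
    p_bg : fraction of background SNPs overlapping enhancers in that epigenome
    """
    return sum(
        comb(n, i) * p_bg**i * (1 - p_bg) ** (n - i)
        for i in range(k, n + 1)
    )


# Invented example: 8 of 26 query SNPs overlap enhancers in an epigenome
# where 5% of background SNPs do.
print(binomial_enrichment_p(8, 26, 0.05))
```

The same function would be evaluated once per reference epigenome and per background model, giving one uncorrected P-value for each.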
USE EXAMPLE
To become acquainted with HaploReg, use the GWAS drop-down menu to select ‘Attention deficit hyperactivity disorder (Lesch KP, 2008, 26 SNPs)’ and select ‘Submit’. Notice that the first two haplotype blocks from this study (38) are driven by lead SNPs with the same P-value = 1 × 10⁻⁸. Go to the second haplotype result, for lead SNP rs864643 (Figure 1). Note that the top row in the haplotype block shows the SNP rs561543, and that it has LD of r² = 0.81 and D′ = 0.95 with the lead variant rs864643. It overlaps with an HMM-predicted enhancer in four major tissue types; hover over ‘4 tissues’ in that row to see a variety of enhancer tissues, including brain. Note that there is also an experiment with HNF4 protein bound by ChIP-Seq, 9 QTL results and an HNF4 motif disruption.
Notice the enrichment results at the bottom of the page below the haplotype results. Note that the strongest enrichment for enhancers (as defined by the 15-state core ChromHMM model) is in the angular gyrus sample from brain, with binomial P = 2.0 × 10⁻⁶ relative to all common SNPs.
Then go to the entry on the block for the lead SNP itself, rs864643. Click on the rsID, which is colored red because it is the lead SNP. Note that in the full table of epigenomic information from Roadmap Epigenomics (11), there is a cluster of enhancer activity in brain, and that it is classified as a genic enhancer by the 15-state core model and transcribed 3′ enhancer by the 25-state model (Figure 2). Note that H3K4me1, H3K27ac and H3K9ac all contribute to the chromatin state assignment at this locus. Black cells on the right-hand side of this part of the table indicate that DNase was not assayed by Roadmap in these tissues.
Go to the bottom of the detail page for rs864643. Note that the SNP has been correlated with MOBP expression in two brain tissues (29), MRPL15 expression in blood (39) and serum ratio of allantoin to quinate (40); all three of these studies were curated by GRASP and found by cross-referencing this SNP to its database (35,36) (Figure 3). Looking at studies individually curated by HaploReg, notice that the SNP has been associated with differential expression of a single exon of RPSA in lymphoblastoid cells by the GEUVADIS study (23). In the motif table, note that the SNP changes the match to the p300 PWM, ATTAYRWCA, with the alternate allele changing the fourth position from a matching A to a G. Hover over the ‘p300_disc’ ID to see that the motif was discovered using the Trawler algorithm on a p300 ChIP-Seq experiment in HeLa cells from the ENCODE dataset (37).
These lines of evidence suggest regulatory mechanisms by which the SNPs from this GWAS may affect the complex phenotype of ADHD. While each piece of evidence is relatively weak on its own, together they offer ways in which molecular biologists could proceed with further experiments that would more definitively establish mechanisms. For example, the GWAS-wide enrichment suggests global differential gene regulation in angular gyrus, which has been associated with hyperactivation in ADHD by fMRI (41), and suggests a tissue in which to study gene expression directly in animal models. ChIP-seq and motif data suggest specifically testing whether HNF4 binds differentially to the alleles of rs561543, and the strong motif coupled with eQTL data suggests looking at whether p300 binds differentially to rs864643 in a brain tissue model. Finally, MOBP eQTL evidence suggests experiments to dissect the mechanism of MOBP differential expression, perhaps modulated by p300 at rs864643, and suggests that it may be useful to perform ADHD-relevant behavioral assays of MOBP-deficient mice, which do not show an overt behavioral phenotype (42).
Acknowledgments
We thank HaploReg users and the reviewers of this manuscript for helpful suggestions and feedback.
Footnotes
Present address: Lucas D. Ward, Amgen, Inc., Cambridge, MA 02142, USA.
FUNDING
National Institutes of Health (NIH) [R01-HG004037, RC1-HG005334, R01-HG008155]. Funding for open access charge: NIH [R01 HG004037].
Conflict of interest statement. None declared.
REFERENCES
- 1. Ward L.D., Kellis M. Interpreting noncoding genetic variation in complex traits and human disease. Nat. Biotechnol. 2012;30:1095–1106. doi: 10.1038/nbt.2422.
- 2. Paul D.S., Soranzo N., Beck S. Functional interpretation of non-coding sequence variation: concepts and challenges. Bioessays. 2014;36:191–199. doi: 10.1002/bies.201300126.
- 3. Civelek M., Lusis A.J. Systems genetics approaches to understand complex traits. Nat. Rev. Genet. 2014;15:34–48. doi: 10.1038/nrg3575.
- 4. Ward L.D., Kellis M. HaploReg: a resource for exploring chromatin states, conservation, and regulatory motif alterations within sets of genetically linked variants. Nucleic Acids Res. 2012;40:D930–D934. doi: 10.1093/nar/gkr917.
- 5. Verhoeven V.J., Hysi P.G., Wojciechowski R., Fan Q., Guggenheim J.A., Hohn R., MacGregor S., Hewitt A.W., Nag A., Cheng C.Y., et al. Genome-wide meta-analyses of multiancestry cohorts identify multiple new susceptibility loci for refractive error and myopia. Nat. Genet. 2013;45:314–318. doi: 10.1038/ng.2554.
- 6. Chung C.C., Kanetsky P.A., Wang Z., Hildebrandt M.A., Koster R., Skotheim R.I., Kratz C.P., Turnbull C., Cortessis V.K., Bakken A.C., et al. Meta-analysis identifies four new loci associated with testicular germ cell tumor. Nat. Genet. 2013;45:680–685. doi: 10.1038/ng.2634.
- 7. Lee M.N., Ye C., Villani A.C., Raj T., Li W., Eisenhaure T.M., Imboywa S.H., Chipendo P.I., Ran F.A., Slowikowski K., et al. Common genetic variants modulate pathogen-sensing responses in human dendritic cells. Science. 2014;343:1246980. doi: 10.1126/science.1246980.
- 8. Ruark E., Seal S., McDonald H., Zhang F., Elliot A., Lau K., Perdeaux E., Rapley E., Eeles R., Peto J., et al. Identification of nine new susceptibility loci for testicular cancer, including variants near DAZL and PRDM14. Nat. Genet. 2013;45:686–689. doi: 10.1038/ng.2635.
- 9. Franceschini N., Fox E., Zhang Z., Edwards T.L., Nalls M.A., Sung Y.J., Tayo B.O., Sun Y.V., Gottesman O., Adeyemo A., et al. Genome-wide association analysis of blood-pressure traits in African-ancestry individuals reveals common associated genes in African and non-African populations. Am. J. Hum. Genet. 2013;93:545–554. doi: 10.1016/j.ajhg.2013.07.010.
- 10. Westra H.J., Peters M.J., Esko T., Yaghootkar H., Schurmann C., Kettunen J., Christiansen M.W., Fairfax B.P., Schramm K., Powell J.E., et al. Systematic identification of trans eQTLs as putative drivers of known disease associations. Nat. Genet. 2013;45:1238–1243. doi: 10.1038/ng.2756.
- 11. Roadmap Epigenomics Consortium, Kundaje A., Meuleman W., Ernst J., Bilenky M., Yen A., Heravi-Moussavi A., Kheradpour P., Zhang Z., Wang J., et al. Integrative analysis of 111 reference human epigenomes. Nature. 2015;518:317–330. doi: 10.1038/nature14248.
- 12. Sherry S.T., Ward M.-H., Kholodov M., Baker J., Phan L., Smigielski E.M., Sirotkin K. dbSNP: the NCBI database of genetic variation. Nucleic Acids Res. 2001;29:308–311. doi: 10.1093/nar/29.1.308.
- 13. Quinlan A.R., Hall I.M. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics. 2010;26:841–842. doi: 10.1093/bioinformatics/btq033.
- 14. 1000 Genomes Project Consortium, Abecasis G.R., Auton A., Brooks L.D., DePristo M.A., Durbin R.M., Handsaker R.E., Kang H.M., Marth G.T., McVean G.A. An integrated map of genetic variation from 1,092 human genomes. Nature. 2012;491:56–65. doi: 10.1038/nature11632.
- 15. Harrow J., Denoeud F., Frankish A., Reymond A., Chen C.K., Chrast J., Lagarde J., Gilbert J.G., Storey R., Swarbreck D., et al. GENCODE: producing a reference annotation for ENCODE. Genome Biol. 2006;7(Suppl. 1):S4.1–S4.9. doi: 10.1186/gb-2006-7-s1-s4.
- 16. Pruitt K.D., Tatusova T., Maglott D.R. NCBI reference sequences (RefSeq): a curated non-redundant sequence database of genomes, transcripts and proteins. Nucleic Acids Res. 2007;35:D61–D65. doi: 10.1093/nar/gkl842.
- 17. Welter D., MacArthur J., Morales J., Burdett T., Hall P., Junkins H., Klemm A., Flicek P., Manolio T., Hindorff L., et al. The NHGRI GWAS Catalog, a curated resource of SNP-trait associations. Nucleic Acids Res. 2014;42:D1001–D1006. doi: 10.1093/nar/gkt1229.
- 18. Lindblad-Toh K., Garber M., Zuk O., Lin M.F., Parker B.J., Washietl S., Kheradpour P., Ernst J., Jordan G., Mauceli E., et al. A high-resolution map of human evolutionary constraint using 29 mammals. Nature. 2011;478:476–482. doi: 10.1038/nature10530.
- 19. Davydov E.V., Goode D.L., Sirota M., Cooper G.M., Sidow A., Batzoglou S. Identifying a high fraction of the human genome to be under selective constraint using GERP++. PLoS Comput. Biol. 2010;6:e1001025. doi: 10.1371/journal.pcbi.1001025.
- 20. ENCODE Project Consortium. An integrated encyclopedia of DNA elements in the human genome. Nature. 2012;489:57–74. doi: 10.1038/nature11247.
- 21. Ernst J., Kellis M. Large-scale imputation of epigenomic datasets for systematic annotation of diverse human tissues. Nat. Biotechnol. 2015;33:364–376. doi: 10.1038/nbt.3157.
- 22. GTEx Consortium. Human genomics. The Genotype-Tissue Expression (GTEx) pilot analysis: multitissue gene regulation in humans. Science. 2015;348:648–660. doi: 10.1126/science.1262110.
- 23. Lappalainen T., Sammeth M., Friedlander M.R., 't Hoen P.A., Monlong J., Rivas M.A., Gonzalez-Porta M., Kurbatova N., Griebel T., Ferreira P.G., et al. Transcriptome and genome sequencing uncovers functional variation in humans. Nature. 2013;501:506–511. doi: 10.1038/nature12531.
- 24. Montgomery S.B., Sammeth M., Gutierrez-Arcelus M., Lach R.P., Ingle C., Nisbett J., Guigo R., Dermitzakis E.T. Transcriptome genetics using second generation sequencing in a Caucasian population. Nature. 2010;464:773–777. doi: 10.1038/nature08903.
- 25. Schadt E.E., Molony C., Chudin E., Hao K., Yang X., Lum P.Y., Kasarskis A., Zhang B., Wang S., Suver C., et al. Mapping the genetic architecture of gene expression in human liver. PLoS Biol. 2008;6:e107. doi: 10.1371/journal.pbio.0060107.
- 26. Gibbs J.R., van der Brug M.P., Hernandez D.G., Traynor B.J., Nalls M.A., Lai S.L., Arepalli S., Dillman A., Rafferty I.P., Troncoso J., et al. Abundant quantitative trait loci exist for DNA methylation and gene expression in human brain. PLoS Genet. 2010;6:e1000952. doi: 10.1371/journal.pgen.1000952.
- 27. Stranger B.E., Nica A.C., Forrest M.S., Dimas A., Bird C.P., Beazley C., Ingle C.E., Dunning M., Flicek P., Koller D., et al. Population genomics of human gene expression. Nat. Genet. 2007;39:1217–1224. doi: 10.1038/ng2142.
- 28. Li Q., Stram A., Chen C., Kar S., Gayther S., Pharoah P., Haiman C., Stranger B., Kraft P., Freedman M.L. Expression QTL-based analyses reveal candidate causal genes and loci across five tumor types. Hum. Mol. Genet. 2014;23:5294–5302. doi: 10.1093/hmg/ddu228.
- 29. Zou F., Chai H.S., Younkin C.S., Allen M., Crook J., Pankratz V.S., Carrasquillo M.M., Rowley C.N., Nair A.A., Middha S., et al. Brain expression genome-wide association study (eGWAS) identifies human disease-associated variants. PLoS Genet. 2012;8:e1002707. doi: 10.1371/journal.pgen.1002707.
- 30. Koopmann T.T., Adriaens M.E., Moerland P.D., Marsman R.F., Westerveld M.L., Lal S., Zhang T., Simmons C.Q., Baczko I., dos Remedios C., et al. Genome-wide identification of expression quantitative trait loci (eQTLs) in human heart. PLoS One. 2014;9:e97380. doi: 10.1371/journal.pone.0097380.
- 31. Ramasamy A., Trabzuni D., Guelfi S., Varghese V., Smith C., Walker R., De T., U.K. Brain Expression Consortium, North American Brain Expression Consortium, Coin L., et al. Genetic variability in the regulation of gene expression in ten regions of the human brain. Nat. Neurosci. 2014;17:1418–1428. doi: 10.1038/nn.3801.
- 32. Fairfax B.P., Humburg P., Makino S., Naranbhai V., Wong D., Lau E., Jostins L., Plant K., Andrews R., McGee C., et al. Innate immune activity conditions the effect of regulatory variants upon monocyte gene expression. Science. 2014;343:1246949. doi: 10.1126/science.1246949.
- 33. Grundberg E., Adoue V., Kwan T., Ge B., Duan Q.L., Lam K.C., Koka V., Kindmark A., Weiss S.T., Tantisira K., et al. Global analysis of the impact of environmental perturbation on cis-regulation of gene expression. PLoS Genet. 2011;7:e1001279. doi: 10.1371/journal.pgen.1001279.
- 34. Hao K., Bosse Y., Nickle D.C., Pare P.D., Postma D.S., Laviolette M., Sandford A., Hackett T.L., Daley D., Hogg J.C., et al. Lung eQTLs to help reveal the molecular underpinnings of asthma. PLoS Genet. 2012;8:e1003029. doi: 10.1371/journal.pgen.1003029.
- 35. Eicher J.D., Landowski C., Stackhouse B., Sloan A., Chen W., Jensen N., Lien J.P., Leslie R., Johnson A.D. GRASP v2.0: an update on the Genome-Wide Repository of Associations between SNPs and phenotypes. Nucleic Acids Res. 2015;43:D799–D804. doi: 10.1093/nar/gku1202.
- 36. Leslie R., O'Donnell C.J., Johnson A.D. GRASP: analysis of genotype-phenotype results from 1390 genome-wide association studies and corresponding open access database. Bioinformatics. 2014;30:i185–i194. doi: 10.1093/bioinformatics/btu273.
- 37. Kheradpour P., Kellis M. Systematic discovery and characterization of regulatory motifs in ENCODE TF binding experiments. Nucleic Acids Res. 2014;42:2976–2987. doi: 10.1093/nar/gkt1249.
- 38. Lesch K.P., Timmesfeld N., Renner T.J., Halperin R., Roser C., Nguyen T.T., Craig D.W., Romanos J., Heine M., Meyer J., et al. Molecular genetics of adult ADHD: converging evidence from genome-wide association and extended pedigree linkage studies. J. Neural Transm. 2008;115:1573–1585. doi: 10.1007/s00702-008-0119-3.
- 39. Fehrmann R.S., Jansen R.C., Veldink J.H., Westra H.J., Arends D., Bonder M.J., Fu J., Deelen P., Groen H.J., Smolonska A., et al. Trans-eQTLs reveal that independent genetic variants associated with a complex phenotype converge on intermediate genes, with a major role for the HLA. PLoS Genet. 2011;7:e1002197. doi: 10.1371/journal.pgen.1002197.
- 40. Suhre K., Shin S.Y., Petersen A.K., Mohney R.P., Meredith D., Wagele B., Altmaier E., CardioGram, Deloukas P., Erdmann J., et al. Human metabolic individuality in biomedical and pharmaceutical research. Nature. 2011;477:54–60. doi: 10.1038/nature10354.
- 41. Cortese S., Kelly C., Chabernaud C., Proal E., Martino A., Milham M.P., Castellanos F.X. Toward systems neuroscience of ADHD: a meta-analysis of 55 fMRI studies. Am. J. Psychiatry. 2012;169:1038–1055. doi: 10.1176/appi.ajp.2012.11101521.
- 42. Eppig J.T., Blake J.A., Bult C.J., Kadin J.A., Richardson J.E., Mouse Genome Database Group. The Mouse Genome Database (MGD): facilitating mouse as a model for human biology and disease. Nucleic Acids Res. 2015;43:D726–D736. doi: 10.1093/nar/gku967.