The reactome pathway knowledgebase
Abstract
The Reactome Knowledgebase (https://reactome.org) provides molecular details of signal transduction, transport, DNA replication, metabolism and other cellular processes as an ordered network of molecular transformations in a single consistent data model, an extended version of a classic metabolic map. Reactome functions both as an archive of biological processes and as a tool for discovering functional relationships in data such as gene expression profiles or somatic mutation catalogs from tumor cells. To extend our ability to annotate human disease processes, we have implemented a new drug class and have used it initially to annotate drugs relevant to cardiovascular disease. Our annotation model depends on external domain experts to identify new areas for annotation and to review new content. New web pages facilitate recruitment of community experts and allow those who have contributed to Reactome to identify their contributions and link them to their ORCID records. To improve visualization of our content, we have implemented a new tool to automatically lay out the components of individual reactions with multiple options for downloading the reaction diagrams and associated data, and a new display of our event hierarchy that will facilitate visual interpretation of pathway analysis results.
INTRODUCTION
At the cellular level, life is a network of molecular reactions that enable signal transduction, transport, DNA replication, protein synthesis and intermediary metabolism. A variety of online resources capture aspects of this information at the level of individual reactions such as Rhea (1) or at the level of reaction sequences spanning various domains of biology such as KEGG (2), MetaCyc (3) or PANTHER (4). The Reactome Knowledgebase is distinctive in focusing its manual annotation effort on a single species, Homo sapiens, and applying a single consistent data model across all domains of biology. Processes are systematically described in molecular detail to generate an ordered network of molecular transformations, resulting in an extended version of a classic metabolic map (5,6). The Reactome Knowledgebase systematically links human proteins to their molecular functions, providing a resource that functions both as an archive of biological processes and as a tool for discovering novel functional relationships in data such as gene expression studies or catalogs of somatic mutations in tumor cells.
Reactome (version 70, September 2019) has entries for 10 867 human protein-coding genes, 53% of the 20 454 predicted human protein-coding genes (Ensembl release 97, July 2019; http://www.ensembl.org/Homo_sapiens/Info/Annotation), supporting the annotation of 25 849 specific forms of proteins distinguished by co- and post-translational modifications and subcellular localizations. These function with 1856 naturally occurring small molecules as substrates, catalysts and regulators in 11 638 reactions annotated on the basis of data from 30 398 literature references. These reactions are grouped into 1803 pathways (e.g. interleukin-15 signaling, phosphatidylinositol phosphate metabolism and receptor-mediated mitophagy), which are in turn grouped into 26 superpathways (e.g. immune system, metabolism and autophagy) that describe normal cellular functions. Notable recent additions include extended annotations of SUMOylation and NEDDylation reactions and their regulatory consequences, annotations of NOTCH and RUNX signaling processes, systematic annotation of the processes of autophagy, and annotation of the metabolism of arachidonate-derived pro-resolving mediators.
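The coverage figure quoted above can be checked directly from the counts in the text. The following is a minimal sketch of that arithmetic; the variable names are ours, and the ratio of protein forms to genes is only a crude summary, not a statistic reported by Reactome.

```python
# Counts quoted in the text for Reactome version 70 and Ensembl release 97.
annotated_genes = 10_867
predicted_genes = 20_454
protein_forms = 25_849

# Fraction of predicted human protein-coding genes with a Reactome entry.
coverage = annotated_genes / predicted_genes

# Crude average of annotated protein forms per annotated gene.
forms_per_gene = protein_forms / annotated_genes

print(f"gene coverage: {coverage:.1%}")
print(f"protein forms per annotated gene: {forms_per_gene:.2f}")
```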
An additional ‘disease’ superpathway groups 484 annotations of disease counterparts of these normal cellular processes. These disease annotations include 1599 variant proteins and their post-translationally modified forms derived from 308 gene products, used to annotate 970 disease-specific reactions, tagged with 387 Disease Ontology terms (7).
Notable recent changes in Reactome include expanding the scope of the project to support annotation of the molecular function of drugs, developing new tools to facilitate community participation in annotation and to explicitly acknowledge it, and developing new web features to improve the layout of individual reactions and the visualization of our event hierarchy.
ANNOTATING MOLECULAR MECHANISMS OF DRUG ACTION
A ‘drug’ is not a molecularly distinct kind of physical entity but rather a role that the entity can assume under specific circumstances. For Reactome, a drug is a physical entity not normally present in a human system and not a normal dietary constituent that when introduced into the system interacts with the naturally occurring components of the system to modulate their molecular functions. A new ‘drug’ class of physical entities in our data model distinguishes chemical drugs (e.g. β-blockers) from protein drugs (e.g. therapeutic antibodies) and RNA drugs (e.g. synthetic small RNAs).
As shown for the antithrombotic chemical apixaban in Figure 1A, each chemical drug instance is mapped to its counterpart in IUPHAR (8) and, if one is available, in ChEBI (9) for additional pharmacological data. The drug instance is also associated with a disease target using terms from the Disease Ontology (7) and with a subcellular location using terms from the GO cellular component ontology (10). When several such drugs form a chemically related family with a single target and mechanism of action, we group them into a set (Figure 1B); that set is then used to create reactions that annotate the shared action (either negative or positive) of the set members on the target. In the case of apixaban and closely related chemical drugs that bind and inhibit Factor Xa, both alone and as a complex with Factor Va, a reaction shows drug binding to the complex to form a drug:protein complex that negatively regulates cleavage of Factor II (Figure 1C).
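The relationships described above (a drug instance with cross-references, a set of related drugs with one target, and a regulatory effect of the drug:protein complex) can be sketched as a small data structure. This is an illustration only, not the Reactome data model: all class, field and function names are hypothetical, the external identifiers are deliberately left unset, the disease and compartment strings are placeholders, and rivaroxaban is included as a second set member by our own assumption.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Tuple


class DrugKind(Enum):
    CHEMICAL = "chemical"
    PROTEIN = "protein"
    RNA = "RNA"


@dataclass(frozen=True)
class Drug:
    name: str
    kind: DrugKind
    disease_term: str                 # Disease Ontology term name (placeholder)
    compartment: str                  # GO cellular component term name (placeholder)
    iuphar_id: Optional[str] = None   # cross-references left unset on purpose
    chebi_id: Optional[str] = None


@dataclass(frozen=True)
class DrugSet:
    name: str
    target: str
    members: Tuple[Drug, ...]

    def __post_init__(self):
        # Members of a set are expected to be drugs of the same kind.
        if len({d.kind for d in self.members}) > 1:
            raise ValueError("a drug set should hold drugs of one kind")


@dataclass(frozen=True)
class Regulation:
    regulator: str            # name of the drug:protein complex
    regulated_reaction: str
    effect: str               # "negative" or "positive"


def annotate_shared_action(drug_set, regulated_reaction, effect="negative"):
    """Describe the effect of the drug:target complex on one reaction."""
    if effect not in ("negative", "positive"):
        raise ValueError("effect must be 'negative' or 'positive'")
    complex_name = f"{drug_set.name}:{drug_set.target}"
    return Regulation(complex_name, regulated_reaction, effect)


apixaban = Drug("apixaban", DrugKind.CHEMICAL, "thrombosis", "extracellular region")
rivaroxaban = Drug("rivaroxaban", DrugKind.CHEMICAL, "thrombosis", "extracellular region")
xa_inhibitors = DrugSet("Factor Xa inhibitors", "Factor Xa:Factor Va",
                        (apixaban, rivaroxaban))
regulation = annotate_shared_action(xa_inhibitors, "cleavage of Factor II")
print(regulation)
```

Annotating the action once on the set, rather than once per drug, mirrors the grouping step described in the text and avoids duplicating near-identical reactions.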
Our September 2019 release includes annotations of the effects of 222 drugs, mostly chemical drugs in widespread use to treat thrombosis and other cardiovascular diseases. Work is underway to extend annotations to drugs involved in other disease processes, and to other stages of the complete drug life cycle, linking the molecular function of a drug's active form to reactions for its uptake, its activation if it enters the body in prodrug form, and its inactivation and elimination.
Of all the exogenous molecules that can perturb human biological processes, which ones qualify as drugs? A molecule in widespread clinical use as indicated by its IUPHAR annotations is clearly qualified. In addition, we include molecules whose potential therapeutic mechanisms of action have been defined at a molecular level even if those molecules are not yet approved for clinical use or are restricted to a pre-clinical setting because of toxic side effects, if this annotation is useful to illustrate the mechanism of action and possible limitations in the application of a drug class in human systems. These broad boundaries are compatible with Reactome’s role as a biomedical research tool, rather than as a resource to support clinical decision making.
Drug annotation to date centers on cases in which a drug binds a protein (normal or genetic variant) and alters the protein’s default function, thereby regulating it. The Reactome data model can accommodate more diverse and complicated drug–target interactions, and can likewise accommodate the other stages of a drug’s life cycle in the body in which it is taken up, transported to a target site, activated, then inactivated and excreted. These extensions are the focus of work now getting underway.
FACILITATING COMMUNITY INVOLVEMENT IN ANNOTATION
Domain experts participate in creating, validating and updating Reactome annotations at all levels of granularity from the details of an individual reaction to the organization of a superpathway. We have always solicited such participation, and in several cases have organized formal collaborations to annotate a subject area. To broaden the process to ensure that it is open to all interested biologists, we have implemented two new features on our web site, one to enable new users to participate in the review process, and one to enable individuals who have already contributed to Reactome to associate the Reactome events on which they have worked with their ORCID records (https://orcid.org).
Community participation is enabled by a web page (Figure 2) that lists all newly created events that are ready for final review, with an on-line form that allows a person to identify an event of interest and volunteer to work with us on it. The page can also be used to propose new topics for annotation.
To associate a person’s ORCID identifier with events on which the person has worked, the Reactome search engine has been modified so that a search on the person’s name now returns a web page that lists all events (reactions and pathways) to which the named person has contributed as an author or reviewer (Figure 3). Features on that page enable the person to examine individual events, to download citations to them in BibTeX format, to validate his or her ORCID identifier, and then to claim those events to make them part of his or her ORCID record. When a person makes contributions over multiple Reactome release cycles, repeating this claiming process will identify the new work and add it to the ORCID record.
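The two operations described above, exporting a citation for an event and finding work that has not yet been claimed, can be sketched as follows. This is a hedged illustration rather than the Reactome implementation: the record fields, the `@misc` entry layout, the contributor name and the event identifiers (`EVENT-0001`, `EVENT-0002`) are hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Contribution:
    event_id: str       # placeholder identifier, not a real Reactome stable ID
    title: str
    role: str           # "author" or "reviewer"
    release_year: int


def to_bibtex(contribution, contributor):
    """Format one contribution as a BibTeX @misc entry (assumed layout)."""
    key = contribution.event_id.replace("-", "_")
    return (
        f"@misc{{{key},\n"
        f"  author = {{{contributor}}},\n"
        f"  title = {{{contribution.title}}},\n"
        f"  year = {{{contribution.release_year}}},\n"
        f"  note = {{Reactome event, role: {contribution.role}}}\n"
        f"}}"
    )


def unclaimed(contributions, claimed_ids):
    """Return contributions whose identifiers are not yet on the record."""
    return [c for c in contributions if c.event_id not in claimed_ids]


work = [
    Contribution("EVENT-0001", "Example reaction", "author", 2018),
    Contribution("EVENT-0002", "Example pathway", "reviewer", 2019),
]
already_claimed = {"EVENT-0001"}

new_work = unclaimed(work, already_claimed)
entry = to_bibtex(new_work[0], "Doe, Jane")
print(entry)
```

Because claiming is modeled as a set difference, repeating it after each release adds only the events that are new since the previous claim.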
IMPROVED REACTION AND PATHWAY VISUALIZATION
Automatic layout of individual reactions
Web searches that return one or more individual reactions, and downloadable PDF files with text descriptions of all reactions and subpathways in a pathway, are easier to use when each reaction is accompanied by a diagram that shows its participants (inputs, outputs, regulators and catalyst) laid out in a conventional left-to-right mode and localized within the appropriate cellular compartments. When a reaction involves more than one compartment, e.g., signal transduction or transport, the compartments are correctly located with respect to one another and the participating entities. A pilot version of a script to generate such diagrams has been deployed on our web site (Figure 4). The resulting images and their associated data can be exported in a variety of formats.
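A left-to-right placement of reaction participants can be sketched as below. This is a minimal illustration under our own assumptions, not the deployed layout script: the function name, the canvas size, and the choice to put catalysts above and regulators below the center are hypothetical, and compartment placement is not handled.

```python
def layout_reaction(inputs, outputs, catalysts=(), regulators=(),
                    width=400.0, height=200.0):
    """Return {(role, name): (x, y)} with y increasing downward."""

    def column(role, names, x):
        # Spread names evenly down a vertical line at the given x.
        step = height / (len(names) + 1)
        return {(role, n): (x, step * (i + 1)) for i, n in enumerate(names)}

    def row(role, names, y):
        # Spread names evenly across the middle half of the canvas.
        step = (width / 2) / (len(names) + 1)
        return {(role, n): (width / 4 + step * (i + 1), y)
                for i, n in enumerate(names)}

    positions = {}
    positions.update(column("input", inputs, 0.0))        # left edge
    positions.update(column("output", outputs, width))    # right edge
    positions.update(row("catalyst", catalysts, 0.0))     # top, centered
    positions.update(row("regulator", regulators, height))  # bottom, centered
    return positions


positions = layout_reaction(
    inputs=["substrate"],
    outputs=["product"],
    catalysts=["enzyme"],
    regulators=["inhibitor"],
)
for key, xy in positions.items():
    print(key, xy)
```

Keying positions by role and name lets the same entity appear as both input and output without one placement overwriting the other.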
Alternative pathway visualization scheme
The ‘fireworks’ view of our event hierarchy represents each superpathway as a node surrounded by concentric rings of nodes representing the superpathway’s child pathways, their child pathways and so on. Nodes are scaled in proportion to the numbers of events they contain. Arc edges represent the part_of relationships between pathways and subpathways; multiple part_of relationships for a single pathway are readily represented as multiple arcs (Figure 5A). This layout provides a legible view of a large event hierarchy with complex parent-child relationships and is easily edited to accommodate both new domains of knowledge (superpathways) and new material added to existing pathways. To maintain this legibility, however, nodes must be small and most of the space in the diagram must be left empty. As a result, coloring the ‘fireworks’ view to summarize an analysis, e.g., the overlay of a gene expression dataset on the Reactome event hierarchy, often yields a display that is hard for a human user to read. To solve this display problem, we have developed an alternative display option based on Voronoi diagrams (Wikipedia. Voronoi diagram https://en.wikipedia.org/w/index.php?title=Voronoi_diagram&oldid=908392936 (accessed 26 August 2019)). The resulting pathway display (Figure 5B) is partitioned into contiguous regions, each corresponding to a pathway and grouped according to the relationships among pathways specified in the event hierarchy. This arrangement devotes the maximum possible space in the diagram to the pathway nodes. Figure 5 shows our metabolism node, comparing its fireworks and Voronoi diagram layouts.
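The space-filling property of a Voronoi partition can be illustrated with a discrete nearest-seed assignment on a small grid. This is a sketch of a plain Voronoi diagram only; the text does not specify the algorithm used for the Reactome display, and this example neither scales regions by event count nor groups them by hierarchy. The function names, seed positions and pathway labels are hypothetical.

```python
import math
from collections import Counter


def voronoi_cells(seeds, width, height):
    """Assign each grid cell to its nearest seed.

    seeds maps a label to an (x, y) point; the result is grid[y][x] = label.
    """
    grid = []
    for y in range(height):
        row = []
        for x in range(width):
            center = (x + 0.5, y + 0.5)
            nearest = min(seeds, key=lambda name: math.dist(center, seeds[name]))
            row.append(nearest)
        grid.append(row)
    return grid


def cell_areas(grid):
    """Count how many grid cells each label owns."""
    return Counter(label for row in grid for label in row)


seeds = {"pathway A": (2.0, 2.0), "pathway B": (8.0, 2.0)}
grid = voronoi_cells(seeds, width=10, height=4)
areas = cell_areas(grid)
print(areas)
```

Every cell belongs to some region, so the whole canvas is available for coloring, in contrast to a node-and-arc layout in which most of the area is background.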
ACCESS TO DATA AND SOFTWARE
Reactome is open-source. All original Reactome data are available in various formats from the ‘downloads’ page of our web site (https://reactome.org/download-data) and all software is available from our GitHub repository (https://github.com/reactome).
CONCLUSIONS
The Reactome Knowledgebase of the molecular details of human biological processes continues to grow in size and scope, notably with the development of tools to annotate the roles of drugs in these processes to yield an integrated description of default human processes and their modulation by drugs. We have implemented a new web feature to recruit experts to participate in our review process and other aspects of curation. New visualization features provide high-quality, downloadable diagrams of individual reactions and an overview of all of our content in a format that should facilitate exploration of gene expression and similar data sets.
ACKNOWLEDGEMENTS
We are grateful to the more than 800 expert scientists who have collaborated with us as external authors and reviewers of Reactome content since 2002. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
FUNDING
National Institutes of Health [U41HG003751, U54GM114833]; European Bioinformatics Institute (EMBL-EBI); Open Targets (The Target Validation Platform); Medicine by Design (University of Toronto). Funding for open access charge: National Institutes of Health [U41HG003751].
Conflict of interest statement. None declared.
REFERENCES
Articles from Nucleic Acids Research are provided here courtesy of Oxford University Press
Read article at publisher's site: https://doi.org/10.1093/nar/gkz1031