Synthetic Virology: Building Viruses to Better Understand Them
- Correspondence: Benjamin.tenOever{at}mssm.edu
Abstract
Generally composed of fewer than a dozen components, RNA viruses can be viewed as well-designed genetic circuits optimized to replicate and spread within a given host. Understanding the molecular design that enables this activity not only allows one to disrupt these circuits to study their biology, but also provides a reprogramming framework to achieve novel outputs. Recent advances have enabled a “learning by building” approach to better understand virus biology and create valuable tools. Below is a summary of how modifying the preexisting genetic framework of influenza A virus has been used to track viral movement, understand virus replication, and identify host factors that engage this viral circuitry.
Influenza virus biology demands cellular entry and the successful launch of an RNA-based circuit to generate the necessary components for nuclear import, transcription, replication, nuclear export, and ultimate egress. Influenza virus circuitry is maintained as a ribonucleoprotein (RNP) complex, composed of the nucleoprotein (NP) and viral polymerases bound to eight independent RNA segments of negative polarity and packaged into a host-derived lipid membrane decorated by viral glycoproteins (Bouvier and Palese 2008). Cellular entry of the eight RNPs is mediated by a trimeric glycoprotein (hemagglutinin or HA), which triggers endocytosis and fusion within the endosome (Bouvier and Palese 2008). During this process, a small ion channel (formed by the Matrix-2 or M2 protein) enables acidification of the virion and the dissociation of the RNPs from Matrix-1 (M1), a viral protein required for RNP encapsidation and nuclear export. Dissociation of M1 exposes NP and enables nuclear import, after which each RNP is “launched” by a bound viral RNA-dependent RNA polymerase (RdRp). The RdRp, composed of three viral gene products (PA, PB1, and PB2), is responsible for generating 10 major viral products shared by all influenza A virus (IAV) subtypes. Apart from HA, M1, M2, NP, PA, PB1, and PB2, the RdRp also generates transcripts that encode proteins responsible for RNP export (nuclear export protein [NEP]) and egress (neuraminidase [NA]). The 10th transcript generated by IAV encodes the nonstructural 1 protein (NS1), which is not thought to contribute to the virus circuit, but rather is a product dedicated to protecting it from unwanted host interference (Ayllon and García-Sastre 2015). In addition to these 10 major products shared among all influenza A viruses, additional strain-specific gene products have also been described that, like NS1, generally function to shut down antiviral host defenses (Vasin et al. 2014).
Collectively these proteins are responsible for multiple outputs that function in concert to achieve a biological circuit designed for amplification and spread (Fig. 1). This understanding of IAV biology has been made possible largely through modification of this circuit and its individual modules through a process of genetic engineering.
The ability to reprogram viruses began with the discovery that one could launch a replication-competent virus from a DNA-based plasmid—a method now commonly referred to as reverse genetics. The first reverse genetic systems established were for positive-strand RNA viruses, as these generally launch independently of any other viral component (Taniguchi et al. 1978; Racaniello and Baltimore 1981). In contrast, because negative-strand RNA viruses require an RNA with precise termini bound properly to NP in the presence of the cognate RdRp, reverse genetic systems for this order of viruses were significantly more complicated. As a result, early attempts at recapitulating aspects of negative-strand RNA virus replication began as minireplicon systems, DNA-based platforms that use small genomic mimics with NP to study RdRp biology (Neumann et al. 2002). The generation of these early systems to build virus-like circuits laid the groundwork for all of our understanding regarding negative-strand RNA virus biology. For IAV, it was the minireplicon systems that enabled one to deduce the necessary RNA sequences required for RdRp recognition, amplification, and packaging (Hsu et al. 1987; Luytjes et al. 1989; Enami et al. 1990). Informed by this work, reverse genetic systems soon followed ushering in a new era of virus engineering (Neumann et al. 2002). The capacity to generate IAV entirely from DNA has enabled us to probe the virus at single-nucleotide resolution to deduce the circuitry involved at both the RNA and protein level. Moreover, our understanding of IAV biology has also allowed us to reprogram the virus to run additional modules from the central IAV program concurrent with the infection. This latter technology has enabled significant advancements toward our understanding of virus replication, transmission, and pathogenesis. These findings and the designs responsible for their discovery are summarized below.
DEFINING IAV CIRCUITRY
The successful generation of IAV-based rescue systems was essential in elucidating the relative importance of each base or residue with very high resolution. In fact, the laboratories responsible for pioneering the minireplicon systems, and then later the reverse genetic systems, were also the ones who initially applied this new tool to better understand the biology of IAV RNA (Fodor et al. 1999; Neumann et al. 1999). Some of the first studies that applied this new methodology focused on the capacity to generate foreign RNAs that would be packaged by the virus (Luytjes et al. 1989; Fujii et al. 2003). These studies were the first to show that the packaging signals for each segment extended beyond the promoter on the termini of each segment. The capacity to introduce modified segments using reverse genetics was quickly used to define the minimal packaging material required for each of the eight segments (Duhaut and Dimmock 2000; Fujii et al. 2005; Liang et al. 2005; Marsh et al. 2007, 2008; Hutchinson et al. 2008; Essere et al. 2013; Gavazzi et al. 2013). This work revealed that each viral RNA consists of one or more open reading frames that are flanked by segment-specific untranslated regions and conserved, partially complementary 5′ and 3′ promoter elements. Much of the noncoding material folds to generate a promoter element necessary for RdRp recognition or generation of a poly(A) tail, but it is insufficient for packaging. Instead, packaging sequences have been found to extend into the open reading frame and the length of material required appears to be unique for each segment (Fig. 2). This work has led to the concept that IAV packaging is mediated by RNA:RNA interactions, a hypothesis that has been corroborated through more current methodologies (Dadonaite et al. 2019; Majarian et al. 2019). 
This knowledge of packaging requirements enabled one to build viral recombinants expressing a foreign segment and complement the missing viral protein using stably expressing cell lines (Shinya et al. 2004; Martínez-Sobrido et al. 2010; Ozawa et al. 2011). Furthermore, the fact that each segment can be defined by its packaging sequence allows one to swap this material between open reading frames and build a virus that is incapable of reassortment with any IAV found in nature (Gao and Palese 2009).
Use of reverse genetics to probe for RNA packaging signals was quickly followed by studies aimed at defining the tolerance for modification in IAV proteins. Through some form of mutagenesis on a segment of interest, one can perform DNA-based rescue en masse using the heterogeneous population of DNA to ascertain which mutants result in replication-competent virus. The first study to perform genome-wide insertional mutagenesis of IAV used the bacteriophage Mu transposase to determine the relative tolerance of each of the IAV proteins (Heaton et al. 2013b). In brief, a mouse-adapted laboratory strain of IAV was subjected to a transposable element which, when removed, left a 15-nt insertion at the site of integration. Virus rescue was then performed by transfecting plasmids encoding seven wild-type segments alongside the mutant pool of the remaining segment, and the capacity to recover virus was used as a proxy for the flexibility of each viral protein toward insertions. These data showed that most amino acid insertions were not tolerated, with the exception of those in HA and NS1, two proteins that demand more flexibility to enable virus adaptation to different host immune defenses (Heaton et al. 2013b).
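The arithmetic behind this design can be sketched in a few lines of Python. Because 15 is a multiple of 3, a 15-nt insertion adds five codons and leaves the downstream reading frame intact wherever it lands, so a failure to rescue virus reflects intolerance of the local insertion rather than a frameshift. The function name, the 0-based coordinate convention, and the ORF length used in the example are illustrative assumptions and are not taken from Heaton et al. (2013b).

```python
def describe_insertion(orf_length_nt: int, insert_pos_nt: int,
                       insert_len_nt: int = 15) -> dict:
    """Summarize how an insertion affects an open reading frame (ORF).

    insert_pos_nt is the 0-based index of the first ORF nucleotide that
    ends up downstream of the insertion (a convention assumed here).
    """
    if orf_length_nt % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    if not 0 <= insert_pos_nt <= orf_length_nt:
        raise ValueError("insertion position lies outside the ORF")

    frame_preserved = insert_len_nt % 3 == 0
    return {
        # True when the downstream codons are still read in the same frame
        "frame_preserved": frame_preserved,
        # Number of whole codons added; undefined for a frameshift
        "codons_added": insert_len_nt // 3 if frame_preserved else None,
        # 0-based index of the codon at which the insertion lands
        "codon_index": insert_pos_nt // 3,
        # True when the insertion interrupts a codon rather than sitting
        # between two codons
        "splits_codon": insert_pos_nt % 3 != 0,
    }


# Hypothetical 1500-nt ORF with an insertion after nucleotide 301
print(describe_insertion(1500, 301))
```

An insertion that splits a codon can also alter the residue at the junction, which is one reason tolerance is best read at the level of protein subdomains rather than single residues.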
Despite the value provided by the insertional mutagenesis approach, one of its limitations is that it selects at the level of protein subdomain—generally showing the highest tolerance at surface loops. To better resolve the biological circuitry of IAV, two independent groups adopted different approaches that also relied on the reverse genetics system. In the first approach, error-prone polymerase chain reaction (PCR) was performed on either segment 4 or 8 to ascertain what single-nucleotide changes and/or residues were tolerated as defined by virus rescue. These studies successfully mapped key residues involved in receptor binding, receptor structure, and NS1 antagonism (Wu et al. 2014a,b). In the second approach, a more directed PCR methodology was used to generate a codon-based library to define the sequence plasticity of either NP or HA based on the capacity for virus rescue (Bloom 2014; Thyagarajan and Bloom 2014; Lee et al. 2018). Collectively, these data highlight critical residues that show no tolerance to mutation as well as areas that are generally agnostic to change. Although additional mutagenesis studies are required to map the relative importance of every residue in every gene, these pioneering studies have provided us the knowledge that IAV represents a highly optimized circuit with little tolerance for random insertion. Moreover, although the virus shows more flexibility at the level of amino acid identity, here too it appears the consensus strain generally represents the optimal composition for the given host from which it is derived. This knowledge has helped define the rules and strategies applied to the further engineering of IAV including the length of 3′ and 5′ RNA that is required for packaging that must be duplicated should a given design disrupt it.
ISOLATING CIRCUIT MODULES
The capacity to manipulate individual IAV segments enabled one to insert motifs in both RNA and protein to aid in their isolation and characterization (York et al. 2013; Heaton et al. 2016). For example, in an effort to better understand polymerase dynamics on viral RNA (vRNA) or the complementary RNA (cRNA), IAV was engineered to harbor the Pseudomonas aeruginosa bacteriophage PP7 hairpin, which binds with high affinity to the PP7 coat protein (PP7CP) (York et al. 2013). This 25-nt PP7 hairpin was inserted into the region of the NA segment that normally encodes the stalk of NA, where changes to the corresponding residues are more tolerable (Luo et al. 1993). By incorporating the PP7 hairpin in the cRNA orientation, one could then use PP7CP to immunoprecipitate and purify virus-derived cRNPs, the intermediate complex required for de novo vRNP production. This engineered virus enabled structural and functional characterization of this replicative intermediate and showed it to be organized in a filamentous double-helical structure.
Similar to the insertion of the RNA affinity tag, the data generated from the transposon mutagenesis enabled the building of eight recombinant viruses, each defined by a major open reading frame encoding a Flag epitope (Heaton et al. 2016). With the exception of M1 and NP, recombinant viruses expressing Flag epitope-tagged PB1, PB2, PA, HA, NA, NS1, NEP, or M2 were generated. These individual viruses could then be used to infect cells and perform Flag-affinity purifications that were submitted for protein identification via mass spectrometry. These virus designs enabled the mapping of a proteome interaction network and a greater understanding of how the virus’ circuitry interfaced with host biology.
A MODULE FOR TRACKING
With the establishment of some general understanding regarding packaging and tolerance to protein modifications, the virology community next focused on generating stable, replication-competent reporter viruses using reverse genetics. The first engineered reporter IAVs that stably expressed a foreign gene focused on replacing IAV modules to use the newly generated genetic space to incorporate green fluorescent protein (GFP) (Watanabe et al. 2003; Kittel et al. 2004). In one such study, NS1, the antagonist for the cellular antiviral response, was truncated and fused to GFP, generating a reporter virus but one that would be attenuated in any interferon-competent model (Kittel et al. 2004). In an unrelated study, the authors replaced the HA/NA two-component entry system with the single-glycoprotein (G) fusion product used by rhabdoviruses (Watanabe et al. 2003). As this latter design remained functional for entry and egress, albeit somewhat attenuated, use of G in place of HA and NA provided an entire segment to be dedicated to GFP expression. This work successfully generated a replication-competent GFP-expressing virus and showed that IAV circuitry was partly modular and therefore could be engineered for other purposes.
In an effort to build a stable IAV recombinant that maintained its endogenous genes and harbored an additional reporter, groups applied 2A peptide sequences to generate polycistronic mRNAs. The 2A peptide family derives from an unrelated family of viruses and comprises an ∼20-aa sequence that induces a ribosomal skipping event to generate two proteins from a single mRNA (Sharma et al. 2012). In one of the early attempts to generate a reporter virus, segment 8 was deconstructed and engineered with the assumption that the smallest segment may provide more flexibility with regard to the addition of exogenous material (Manicassamy et al. 2010). For this design, the authors chose to generate an NS1-GFP fusion protein, which had already been shown to be functional, and used the 2A peptide to generate NEP, the alternative splice product generated from segment 8. The resulting virus was found to be replication-competent and retain pathogenicity in vivo albeit both attenuated and unstable. In another independent design, the 2A site was used to generate a recombinant segment six capable of both NA and GFP production (Li et al. 2010a). This NA-GFP virus was found to be stable and functional but it too showed reduced replicative capacity as compared with its wild-type counterpart (Li et al. 2010a). Alternative approaches used to track virus through virus engineering involved the incorporation of a tetracysteine tag or biotin acceptor peptides into a desired viral protein (Li et al. 2010b; Qin et al. 2019). However, these engineered designs also resulted in viral attenuation and are limited to cell culture as both require additional steps for imaging.
Although the aforementioned viruses set the stage for the engineering of IAV, the common attenuation phenotype that resulted with each modification suggested that the virus was operating in an optimal fitness space where any major change would be detrimental in some way. This idea was supported by the fact that when the NS1 fusion reporter virus design was revisited and passaged six times in mice, a variant was identified that increased the stability of the design, although the virus remained less pathogenic than wild type (Fukuyama et al. 2015). One significant advance in the engineering of IAV recombinants harboring a foreign gene without inducing attenuation came with the surprising finding that one could add a significant amount of foreign material to PB2, the largest of the viral segments (Dos Santos Afonso et al. 2005). Although this group found that the addition of GFP to PB2 did not retain wild-type fitness levels, the study inspired additional modifications to this design, ultimately generating a suite of reporter constructs based on segmented GFP or smaller luciferase-based reporters that were both stable and showed little to no attenuation (Avilov et al. 2012; Heaton et al. 2013a; Tran et al. 2013; Karlsson et al. 2015).
A BROWSER HISTORY MODULE
The finding that IAV could tolerate insertions on the polymerase segment or be passaged to enable expression on segment 8 also allowed for the addition of other modules independent of GFP or luciferase. One such application included the addition of the bacteriophage P1 Cre recombinase (Heaton et al. 2014; Reuther et al. 2015). Although the design expressing Cre from NS1 using a 2A site showed significant attenuation, a comparable design built onto PB2 was well tolerated, with the lethal dose of the virus diminished by only one log as compared with unmodified virus (Heaton et al. 2014; Reuther et al. 2015). Cre is a well-characterized recombinase that catalyzes the cleavage and religation of DNA at specific DNA “LoxP” sites (Nagy 2000). This tool has been applied to countless genetic studies and, as a result, there are many LoxP transgenic animal models, including a LoxP reporter in which the introduction of Cre results in the expression of tdTomato (Madisen et al. 2010). Bringing these two model systems together, an engineered IAV expressing Cre and a LoxP reporter mouse, the authors could track not only where the virus was, but also where it had been. This model system enabled one to answer the question as to whether cells ever successfully cleared a productive IAV infection. By administering the virus and monitoring tdTomato-expressing cells at various time points postinfection, one could discern whether fluorescent cells were present and if they belonged to a particular cell lineage. Monitoring nonimmune tdTomato-expressing cells over the time course of the infection clearly showed that although alveolar cells succumbed to IAV, a subset of club cells were able to clear the virus (Heaton et al. 2014).
Subsequent characterization of this cellular lineage found that club cells were inherently more resistant to virus infection and showed genetic changes that altered their subsequent response to both virus and interferon for weeks post–viral clearance (Hamilton et al. 2016). Interestingly, the molecular basis for this resistance appears to relate to an unusually strong induction of antiviral genes in addition to evasion of CD8-mediated clearance (Chambers et al. 2019; Fiege et al. 2019).
A GENETIC OVERRIDE MODULE
Another common module in circuit design is the genetic override or kill switch. This module enables rapid shutdown of the running program, an attribute that would be desirable to build into a virus. The first such design described for IAV came in the form of microRNA-mediated targeting (Perez et al. 2009). This biological tool exploits the host's preexisting microRNA (miRNA) population to enable RNA interference (RNAi) and silence a desired viral transcript (tenOever 2013). For example, in this study the authors built target sites for miR-93 into the open reading frame of NP. This miRNA was chosen because it was not expressed in the chicken egg but was ubiquitously expressed in mammalian cells (tenOever 2013). Through the incorporation of just two target sites, the resulting virus was unable to achieve a productive infection in mammalian cells where the miRNA was present, whereas it grew to wild-type levels in eggs where the miRNA was absent (Perez et al. 2009). Similar kill switches for IAV were later built to create IAV strains that could only replicate in ferrets or were restricted in all hematopoietic cells as a sort of molecular biocontainment system or immunological tool, respectively (Langlois et al. 2012, 2013; Waring et al. 2018). In each of these designs, no coding material was modified, enabling the generation of a stably engineered virus with unchanged fitness (Benitez et al. 2015b; Aguado et al. 2018). As an alternative to host-mediated targeting, IAV can be engineered to generate miRNAs, which can then be used to self-target the virus in a similar manner (Benitez et al. 2015a). In this system, IAV can be engineered to express an artificial miRNA that is processed to generate a perfect small interfering RNA against a structurally constrained and conserved part of the virus, thereby generating a self-inactivating virus.
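The sequence logic of a miRNA target site can be illustrated with a short Python sketch. A site with perfect complementarity to a miRNA is simply the reverse complement of the mature miRNA, and counting such sites in a transcript is a string search. This is a simplification offered under stated assumptions: the example miRNA below is a placeholder and not the real miR-93 sequence, the flanking sequences are arbitrary, and the function names are hypothetical. The exact site designs used in the cited studies are not reproduced here.

```python
_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement_rna(seq: str) -> str:
    """Return the reverse complement of an RNA or DNA sequence, as RNA."""
    rna = seq.upper().replace("T", "U")
    unexpected = set(rna) - set("ACGU")
    if unexpected:
        raise ValueError(f"unexpected characters: {sorted(unexpected)}")
    return rna.translate(_COMPLEMENT)[::-1]


def perfect_target_site(mirna: str) -> str:
    """Sequence in a transcript that pairs perfectly with the miRNA."""
    return reverse_complement_rna(mirna)


def count_target_sites(transcript: str, mirna: str) -> int:
    """Count non-overlapping perfect target sites in a transcript."""
    site = perfect_target_site(mirna)
    return transcript.upper().replace("T", "U").count(site)


# Placeholder 22-nt sequence for illustration only (NOT miR-93)
EXAMPLE_MIRNA = "ACGGAUCUUAGCCAUGGAUCAG"

# Toy transcript carrying two target sites between arbitrary flanks
site = perfect_target_site(EXAMPLE_MIRNA)
toy_transcript = "GGGAAA" + site + "CCCUUU" + site + "AAAGGG"

print(site)
print(count_target_sites(toy_transcript, EXAMPLE_MIRNA))
```

The same reverse-complement relationship underlies the self-targeting design, in which the virus-encoded artificial miRNA is chosen so that its processed product pairs perfectly with a conserved viral sequence.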
As an alternative to a miRNA-based kill switch, others have engineered IAV to be susceptible to a particular drug. An ingenious example of this approach was the application of the small molecule–assisted shutoff (SMASh) tag to the PA polymerase subunit (Fay et al. 2019). The SMASh tag comprises the hepatitis C virus NS3 protease followed by a portion of the NS4A protein (Chung et al. 2015). The basis of this biological kill switch is that NS3, by default, will cleave the NS4A domain from the resulting protein upon being translated. As the NS4A peptide triggers polyubiquitination and proteasomal degradation, this cleavage event spares the protein to which the tag is fused. However, in the presence of asunaprevir, a potent inhibitor of NS3, cleavage is blocked and any protein that retains the NS4A domain is degraded. Given this dynamic, the authors built the SMASh tag into the PA transcript, thereby creating a virus containing a drug-inducible kill switch (Fay et al. 2019). They found that this approach could successfully reduce titers by 1–2 logs and prevent disease in animals, although selective mutations of the target site were noted (Fay et al. 2019).
A GPS MODULE
The unique splicing activity of IAV exists to increase its coding capacity without extending the length of a given segment through use of overlapping reading frames. Ironically, this biology has also enabled the addition of genetic material with little to no loss of fitness, albeit using a heavily laboratory-adapted viral strain. This is achieved through the disruption of the natural 3′ splice acceptor site required for NEP production and the duplication of this area downstream of the NS1 stop codon. This arrangement retains the splicing dynamics of segment 8 but generates an intergenic region between NS1 and NEP without any loss of fitness (Fig. 2; Chua et al. 2013). This newly created genetic real estate is ideal for the addition of modules that function only at the level of RNA (Varble et al. 2010, 2014).
One such noncoding RNA module added to this site was a set of 22-nt barcodes that served no function beyond providing a unique identity to each virion (Varble et al. 2014). Using this strategy, the authors built a library composed of more than 100 RNA-tagged viruses, all at relatively equal proportion. This virus population was applied to a variety of infection conditions that could be monitored by deep sequencing to identify population dynamics at single-virion resolution. This concept was arguably best applied to virus transmission events. Using the barcoded library of IAV, this study found that whereas transmission events enabled by contact could transfer as much as one-half of the viral population, aerosol-based transmission resulted in founder populations as low as just two to three virions (Varble et al. 2014). These findings were subsequently validated through a variety of reassortment assays that found that there is a high likelihood that a given infection fails to deliver all eight segments, meaning co-infection is often required for the virus to be successful (Fonville et al. 2015; Jacobs et al. 2019). This concept has also been observed using other methodologies (Brooke et al. 2013; Russell et al. 2018).
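The kind of bookkeeping that a barcoded library allows can be sketched as follows. Given barcode calls from deep sequencing of a donor and a recipient, one can tally barcode frequencies and ask what fraction of donor barcodes reappears in the recipient. The number of distinct barcodes detected in a recipient is a lower bound on the founder population, since two founders carrying the same barcode cannot be told apart. The function names, the 1% detection threshold, and the toy read lists are assumptions made for illustration; this is not the analysis pipeline of Varble et al. (2014).

```python
from collections import Counter
from typing import Dict, Iterable, List


def barcode_frequencies(barcodes: Iterable[str]) -> Dict[str, float]:
    """Convert a stream of barcode calls into relative frequencies."""
    counts = Counter(barcodes)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no barcode reads supplied")
    return {bc: n / total for bc, n in counts.items()}


def detected_barcodes(freqs: Dict[str, float],
                      min_freq: float = 0.01) -> List[str]:
    """Barcodes at or above a frequency threshold (assumed noise cutoff)."""
    return sorted(bc for bc, f in freqs.items() if f >= min_freq)


def transmitted_fraction(donor_reads: Iterable[str],
                         recipient_reads: Iterable[str],
                         min_freq: float = 0.01) -> float:
    """Fraction of donor barcodes that are also detected in the recipient."""
    donor = set(detected_barcodes(barcode_frequencies(donor_reads), min_freq))
    recipient = set(
        detected_barcodes(barcode_frequencies(recipient_reads), min_freq))
    if not donor:
        raise ValueError("no donor barcodes pass the threshold")
    return len(donor & recipient) / len(donor)


# Toy data: four equally represented barcodes in the donor,
# two of which are recovered from the recipient
donor_reads = ["BC1"] * 25 + ["BC2"] * 25 + ["BC3"] * 25 + ["BC4"] * 25
recipient_reads = ["BC1"] * 60 + ["BC3"] * 40

recipient_freqs = barcode_frequencies(recipient_reads)
print(detected_barcodes(recipient_freqs))
print(transmitted_fraction(donor_reads, recipient_reads))
```

Starting from a library in which all barcodes are near-equally represented matters for this readout, because a barcode that is rare in the donor is unlikely to be transmitted regardless of the route.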
RNAi MODULE
One additional module that the reconstructed segment 8 provided was the capacity to enable miRNA production from a replication-competent IAV (Varble et al. 2010). Although this stand-alone add-on was of limited value given the contrasting kinetics of miRNA-mediated repression and the IAV life cycle, the capacity to generate small RNAs could be further exploited. As miRNAs are processed based on their folded structure as opposed to their sequence, one can change the sequence while maintaining shape to create any desired small RNA (tenOever 2013). This, therefore, allows one to enable IAV to effectively generate an siRNA capable of silencing a host factor (Benitez et al. 2015a). This engineered design works well with regard to siRNA generation but is restricted by the kinetics of the virus. That is, the engineered IAV can effectively target the host RNA during the process of infection but would be unable to impact any protein generated before this event. For this reason, this module is really only well-suited to determine which virus-induced genes are most impactful in reducing virus replication levels. However, in an effort to gauge the utility of this tool, a virus library was engineered in which each virus was enabled with the capacity to silence a specific host factor. By monitoring the virus population over time in the context of an in vivo infection, one would expect enrichment of viruses carrying a particular artificial miRNA to reflect a replicative advantage, as the population was homogeneous in every other way. These efforts showed the greatest enrichment in viruses targeting known pattern recognition receptors (PRRs) (Ddx58, Tlr7, and Ifih1), virus-induced transcription factors (Irf1, Irf7, and Stat1), and antiviral effectors (Adar, Rnasel), consistent with this expectation (Benitez et al. 2015a).
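The enrichment readout described above can be sketched as a comparison of library composition before and after infection. For each artificial miRNA, the relative frequency in the output population is divided by its frequency in the input library and expressed as a log2 ratio, so positive values indicate viruses that gained ground. The function names, the pseudocount, and the toy counts are assumptions for illustration only; the target labels are placeholders and the numbers are not data from Benitez et al. (2015a).

```python
import math
from typing import Dict, List, Tuple


def log2_enrichment(input_counts: Dict[str, int],
                    output_counts: Dict[str, int],
                    pseudocount: float = 1.0) -> Dict[str, float]:
    """Log2 ratio of output to input frequency for each library member."""
    targets = set(input_counts) | set(output_counts)
    in_total = sum(input_counts.values()) + pseudocount * len(targets)
    out_total = sum(output_counts.values()) + pseudocount * len(targets)
    if in_total <= 0 or out_total <= 0:
        raise ValueError("no reads supplied")

    enrichment = {}
    for target in targets:
        f_in = (input_counts.get(target, 0) + pseudocount) / in_total
        f_out = (output_counts.get(target, 0) + pseudocount) / out_total
        if f_in <= 0 or f_out <= 0:
            raise ValueError("zero count; use a positive pseudocount")
        enrichment[target] = math.log2(f_out / f_in)
    return enrichment


def rank_by_enrichment(enrichment: Dict[str, float]) -> List[Tuple[str, float]]:
    """Library members sorted from most enriched to most depleted."""
    return sorted(enrichment.items(), key=lambda kv: kv[1], reverse=True)


# Toy read counts for three hypothetical library members
input_library = {"targetA": 1000, "targetB": 1000, "control": 1000}
after_infection = {"targetA": 4000, "targetB": 900, "control": 1100}

for name, score in rank_by_enrichment(
        log2_enrichment(input_library, after_infection)):
    print(f"{name}\t{score:+.2f}")
```

Because the values are ratios of frequencies, enrichment of one library member necessarily depresses the scores of the others, so including non-targeting controls in the library helps anchor the scale.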
CONCLUDING REMARKS
The power of model systems stems from the critical mass of people who collectively contribute to the understanding of the biology surrounding it. This not only advances our knowledge about that system and those comparable with it, but it can often lead to entirely new applications. For IAV, reverse genetics has not only enabled a much greater understanding of virus biology, but we are now using the platform to generate recombinant viruses that can be used to treat cancer, generate biologics, or even perform RNA editing (Schmid et al. 2014; Pizzuto et al. 2016; Hamilton et al. 2018). This new area of synthetic virology not only provides greater knowledge through virus building, but it may also hold significant potential as a platform for future therapeutics.
ACKNOWLEDGMENTS
The author thanks the members of the Microbiology Department at Icahn School of Medicine at Mount Sinai for countless discussions about the biology of influenza A virus and Drs. R.A. Langlois (University of Minnesota), B. Manicassamy (University of Iowa), and A.J.W. te Velthuis (University of Cambridge) for critical reading of this review.
This article has been made freely available online courtesy of TAUNS Laboratories.
Footnotes
- Editors: Gabriele Neumann and Yoshihiro Kawaoka
- Additional Perspectives on Influenza: The Cutting Edge available at www.perspectivesinmedicine.org