The Ecology and Evolution of Influenza Viruses

Edward C. Holmes2

1WHO Collaborating Centre for Reference and Research on Influenza, at The Peter Doherty Institute for Infection and Immunity, Melbourne 3000, Australia
2Marie Bashir Institute for Infectious Diseases and Biosecurity, Charles Perkins Centre, School of Life and Environmental Sciences and Sydney Medical School, The University of Sydney, Sydney 2006, Australia

Correspondence: edward.holmes{at}sydney.edu.au

Abstract

The patterns and processes of influenza virus evolution are of fundamental importance, underpinning such traits as the propensity to emerge in new host species and the ability to rapidly generate antigenic variation. Herein, we review key aspects of the ecology and evolution of influenza viruses. We begin with an exploration of the origins of influenza viruses within the orthomyxoviruses, showing how our perception of the evolutionary history of these viruses has been transformed with metagenomic sequencing. We then outline the diversity of virus subtypes in different species and the processes by which these viruses have emerged in new hosts, with a particular focus on the role played by segment reassortment. We then turn our attention to documenting the spread and phylodynamics of seasonal influenza A and B viruses in human populations, including the drivers of antigenic evolution, and finish with a discussion of virus diversity and evolution at the scale of individual hosts.

From an evolutionary perspective, more is known and more sequence data have been generated about influenza viruses than arguably any other group of pathogens. These data have provided a general understanding of the extent and structure of virus genetic diversity, the evolutionary processes that gave rise to it, from where influenza viruses originate, and the mutations that underpin host adaptation, antigenic drift, and antiviral resistance. We also know much about how human influenza viruses spread and evolve on a seasonal basis. Human influenza A virus was the focus of one of the first large-scale pathogen genome-sequencing projects (Ghedin et al. 2005), and influenza virus genomes are regularly used to test methods of evolutionary analysis. These evolutionary analyses are, in turn, increasingly employed to understand antigenic drift and help choose vaccine strains (Smith et al. 2004; Koel et al. 2013; Bedford et al. 2014; Neher et al. 2016).

Herein, we review key aspects of the ecology and evolution of influenza virus, taking a strong genomic perspective. We will consider the evolutionary behavior of the virus in different host species and the commonalities among them and highlight areas where uncertainty remains.

PHYLOGENETIC DIVERSITY AND ORIGINS OF INFLUENZA VIRUSES

The Origins of Influenza Viruses

The reservoirs of influenza A viruses are traditionally considered to be waterbirds of the orders Anseriformes (ducks) and Charadriiformes (shorebirds, gulls); these animals are commonly infected, reaching prevalence levels of >20% in the autumn migration season (Latorre-Margalef et al. 2014), harbor 16 hemagglutinin (HA) and nine neuraminidase (NA) subtypes, and usually experience clinically asymptomatic infections. Occasionally, influenza A virus leaves these aquatic bird reservoirs, overcomes a variety of host barriers (Kuiken et al. 2006), and jumps to either poultry or various mammalian species, including humans, resulting in sporadic infections, disease epidemics, or pandemics. Although the interplay between birds and mammals is critical for understanding the emergence of influenza A virus, our knowledge of influenza ecology and evolution in general has changed dramatically with the recent advent of metagenomic sequencing.

The first indication that the phylogenetic diversity of influenza A viruses was greater than in the bird–mammal 16HA-9NA model was the discovery of highly divergent and diverse viruses in fruit bats (Artibeus spp.) from Central and South America (Tong et al. 2012, 2013). These bat viruses represented unique subtypes, H17N10 and H18N11, and possessed some internal genes with more genetic diversity among them than between all other known avian and mammalian influenza viruses combined (Tong et al. 2013). These features are indicative of a long evolutionary association between these viruses and their bat hosts, as well as an ancient separation from the viral lineages found in birds and other mammals. Although it has been suggested that bat influenza A viruses possess the biological features necessary to infect humans, including the use of major histocompatibility complex (MHC)-II as an entry mediator (Karakus et al. 2019), these viruses may only replicate poorly in some mammalian cell lines (Ciminski et al. 2019).

The remaining puzzle is why so few bat influenza virus genome sequences have been identified, with the only other known bat virus, an H9N2-like variant, recently identified in Egyptian bats and not as divergent as those from South America (Kandeil et al. 2019). The most likely explanations are that we have not sampled the right populations and that these viruses are present at low prevalence and/or only establish short-lived infections such that the chances of detecting actively replicating viruses are slim. Indeed, the large gap in the phylogenies that link bat influenza viruses to those in other species is indicative of a very limited sampling of virus diversity. It is, therefore, inevitable that more mammalian influenza viruses, including from bats, will be identified in the future.

Further surprises regarding the host range of influenza viruses arose from the large-scale metagenomic (particularly RNA) sequencing of diverse vertebrate species. One such metatranscriptomic study revealed that sequences clearly related to known influenza viruses were present in amphibians, fish, and even the hagfish (Eptatretus burgeri), a jawless and basal vertebrate species (Shi et al. 2018b). That these viruses were sampled from seemingly healthy animals tentatively suggests that they represent low-pathogenicity variants. Notably, the (PB2) phylogeny of these viruses generally matches that of the host species from which they were sampled, albeit with obvious cases of host jumping (Fig. 1). This pattern strongly suggests that influenza viruses and their hosts have likely coevolved for the entire evolutionary history of vertebrates. If so, then many more influenza viruses will ultimately be identified from diverse vertebrates. Hence, although wild waterbirds are rightly considered the ultimate source of those influenza A viruses that eventually emerge in humans, they are only milestones in a far older evolutionary history. Host jumping, which underpins disease emergence, occurs on this backbone of ancient virus–host codivergence. Of particular note here is that the closest relative of human influenza B virus identified to date is from a fish (Mastacembelus aculeatus), again suggesting that a diverse array of influenza B–like viruses will eventually be found in other vertebrate species (Shi et al. 2018b).

Figure 1.

Evolutionary relationships among the orthomyxo-like viruses revealing the position of the vertebrate-associated influenza viruses (shaded green). The phylogeny was inferred using a maximum likelihood approach based on an amino acid alignment of the PB2 protein, a subunit of the viral RNA polymerase (original tree kindly provided by Dr. Mang Shi, University of Sydney). Influenza viruses A–D are shown in bold italic. Note the low levels of divergence among these viruses compared to the orthomyxo-like viruses as a whole. Branches shaded red denote those viruses associated with vertebrates, whereas all other (black) branches reflect viruses associated with invertebrates. Documented orthomyxovirus genera, other than those associated with influenza viruses, are indicated. The tree is unrooted with branches scaled to the number of amino acid changes per site.

Metagenomics and the Orthomyxoviruses

Metagenomic studies have been central to revealing the relationships between the influenza viruses and related orthomyxoviruses (family Orthomyxoviridae). Orthomyxoviruses appear to be relatively common in fish in which they harbor enormous phylogenetic diversity, again indicative of ancient origins (Fig. 1; Shi et al. 2018b). A good example is infectious salmon anemia virus (now Salmon isavirus) that has been known to cause disease in farmed Atlantic salmon for >30 years (Rimstad and Mjaaland 2002). More recently, an even more divergent orthomyxovirus (Tilapia lake virus) responsible for large-scale die-offs of tilapia (Oreochromis sp.) was identified (Bacharach et al. 2016).

Perhaps more surprising was the discovery that orthomyxoviruses (and RNA viruses in general) are abundant in invertebrates, including mosquitoes, earwigs, earthworms, cockroaches, and a variety of fly species (Fig. 1; Li et al. 2015; Shi et al. 2016). Although there have been multiple and complex cross-species transmission events throughout orthomyxovirus evolution, visible as vertebrate-associated viruses falling in different phylogenetic locations within the diversity of invertebrate viruses, the data suggest that the orthomyxoviruses originated in invertebrates and have been associated with animals for perhaps the entire history of the Metazoa. This marks a major transition in our understanding of the orthomyxoviruses, from predominantly a family of vertebrate viruses to a largely invertebrate family with multiple jumps to vertebrates, including those that gave rise to the influenza viruses and in which respiratory transmission evolved in some mammalian species. Despite this antiquity, all orthomyxoviruses, irrespective of host, share a segmented genome structure, although segment numbers vary from 6 to 10 (Shi et al. 2016).

The Timescale of Influenza Virus Evolution

One of the most complex topics in influenza virus evolution has been revealing the timescale over which its genetic diversity has been created, particularly those influenza A viruses that infect humans. As discussed above, metagenomic studies strongly suggest that influenza viruses have existed for hundreds of millions of years (Shi et al. 2018b), and it seems reasonable to assume that Anseriformes and Charadriiformes have similarly been infected for millennia. Such antiquity is also supported by the observation that the HA and NA comprise highly divergent sequences that are difficult to align. The same pattern of high divergence, and hence deep antiquity, is also true of the A and B alleles (the latter only found in birds) of the virus nonstructural (NS) segment.

A very different timescale emerges with the molecular clock dating of influenza A viruses sampled from birds and mammals (excluding bats). As expected, given its high background mutation rate (Sanjuán and Domingo-Calap 2016), evolutionary rates are universally high in influenza viruses at ∼10⁻³ nucleotide substitutions per site per year (Chen and Holmes 2006; Rambaut et al. 2008; Worobey et al. 2014a; Vijaykrishna et al. 2015), although there is variation among viruses, host species, genome segments, and subtypes of influenza A virus. In the case of influenza A virus, these rates suggest that circulating genetic diversity has a very recent origin, only dating to the nineteenth century. Indeed, it has been proposed that there was a large-scale “selective sweep” (i.e., the selective fixation of an advantageous mutation that purged existing genetic diversity) of influenza genomes that occurred in the late 1800s, such that all currently circulating avian and mammalian influenza A virus lineages originated at this point (Worobey et al. 2014a). A complicating factor is that viruses evolve at different rates in different hosts (Worobey et al. 2014a), with poultry viruses evolving more rapidly than those in wild birds (Fourment and Holmes 2015). However, even accounting for such variation, all estimates for the divergence times of avian and mammal viruses are recent, generally falling in the nineteenth century. Either these dates are real and virus evolution has been characterized by a complex series of cross-species transmission events and selective sweeps (Chen and Holmes 2010; Worobey et al. 2014a) or they are wildly erroneous and the timescale of virus evolution is uncertain. Likely the best way to resolve this key question is through the analysis of “ancient” influenza viruses sampled from earlier time points. At the time of writing, however, the only available samples of this kind are those of the H1N1 pandemic of 1918/1919 (Taubenberger et al. 1997).
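The reasoning behind molecular clock dating can be illustrated with a back-of-the-envelope calculation. The sketch below assumes a strict clock (one constant rate on every branch), under which two lineages separated by a pairwise genetic distance d, each accumulating substitutions at rate r, shared a common ancestor roughly d / (2r) years ago. The function name and the distance value are hypothetical and chosen purely for illustration; published analyses rely on tip-dated sequences and relaxed-clock models rather than this simplification.

```python
def years_to_common_ancestor(pairwise_distance, rate_per_site_per_year):
    """Strict-clock estimate of the time since two lineages diverged.

    Both lineages accumulate substitutions independently, so the
    distance between them grows at twice the per-lineage rate.
    """
    if rate_per_site_per_year <= 0:
        raise ValueError("rate must be positive")
    return pairwise_distance / (2.0 * rate_per_site_per_year)


# Rate of the order cited in the text (~1e-3 substitutions/site/year).
rate = 1e-3
# Hypothetical pairwise distance (substitutions/site), illustration only.
distance = 0.25

print(years_to_common_ancestor(distance, rate))
```

With these illustrative inputs the estimate is on the order of a century, which shows why high substitution rates push inferred divergence times toward the recent past. It also shows why rate variation among hosts matters: halving the assumed rate doubles the inferred age.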

There has similarly been debate over the timing and nature of the H1N1 virus associated with the global pandemic of 1918/1919. The key issue here is how long the virus was present in humans before the 1918 pandemic began (Reid and Taubenberger 2003). Despite the availability of samples from the pandemic (Taubenberger et al. 1997, 2005), it is difficult to determine which of the various scenarios for the origin of the 1918 virus is correct: that the entire virus, or at least some of its segments, originated in birds shortly before the epidemic began, that it emerged earlier in pigs before crossing to humans (Smith et al. 2009a), or that it emerged earlier in humans and evolved increased virulence through time (Worobey et al. 2014b). Again, more ancient samples will be critical to answering this question.

CROSS-SPECIES TRANSMISSION AND EVOLUTION OF INFLUENZA VIRUSES IN MULTIPLE HOSTS

Like many RNA viruses, influenza viruses regularly jump host species. Although many of these cross-species transmission events only result in transient spillover infections, occasionally they result in sustained epidemic transmission. The phylogenetic distance between hosts appears to be an important general predictor of the evolutionary success of cross-species transmission events (Longdon et al. 2014). Hence, the most common cross-species transmission events occur within a specific host class, particularly within the Aves, and more so within different waterfowl of the genus Anas (Ren et al. 2016). There is also strong evidence for the cross-species transmission of influenza A viruses between humans and swine, although the role played by pigs in human pandemics prior to 2009 is unclear (Nelson and Worobey 2018), and reverse zoonoses from humans to pigs likely occur at a higher rate than virus transmission from pigs to humans (Nelson et al. 2012; Nelson and Vincent 2015). Interestingly, however, swine (and bovines) are key hosts for influenza D virus, first detected in 2011 (Hause et al. 2013).

Cross-species transmission is best documented in wild birds. These animals can be envisaged as being infected by a single and common pool of influenza viruses, where geographical subdivision may play a more important role than host species in shaping virus population structure. The most striking pattern here is that influenza A viruses from wild birds fall into independently evolving Eurasian and North American lineages, regardless of subtype (Fig. 2; Donis et al. 1989). It is possible that these geographically segregated lineages are sometimes antigenically different, offering a competitive advantage to lineages/viruses introduced into the mismatched host population (Bahl et al. 2009; Vijaykrishna et al. 2013). More controversial is how avian influenza viruses are structured at the continental scale, and the precise role played by “flyways” or the specific migratory routes used by birds. Despite some counter suggestions (Bahl et al. 2013), there is relatively strong evidence that influenza A viruses in North American birds tend to spread in patterns that loosely respect the geographical boundaries imposed by avian flyways (Lam et al. 2012; Fourment et al. 2017).

Figure 2.

The phylogenetic diversity of avian influenza A viruses. There are 16 different hemagglutinin (HA) subtypes of influenza A virus circulating in wild birds, falling into two main HA groups and further subdivided into a series of clades (Latorre-Margalef et al. 2013). Within each subtype, there is clear genetic structure based upon avian geography, such that sequences sampled from birds of the Americas or Eurasia fall into two distinct and independently evolving clades regardless of subtype. This phylogeographic division is demonstrated in a phylogeny containing all currently available H4 sequences (right panel), and is also observed in all nine neuraminidases (NAs) and the internal gene segments. Occasionally, there is spillover across these main lineages, as demonstrated by the N8 tree (right panel) in which virus sequences originating in Asia cluster in a clade dominated by North American sequences.

It is also apparent that subtypes of influenza A virus differ in the specificity or generality of the hosts they infect. For example, H13 and H16 have, to date, almost exclusively been detected in gulls, whereas more generalist subtypes like H3 can be detected in numerous wild bird species, poultry, humans, pigs, horses, and dogs. Host specificity appears to be dictated by a number of factors, including cellular receptors, pH, and temperature. The successful emergence of viruses in new host species likely involves adaptive evolution to overcome these intrinsic barriers, although in some cases this process may be biologically subtle (Feng et al. 2015). Experimental studies, in which avian influenza H5N1 virus was serially passaged in ferrets, greatly illuminated the mutation types needed for cross-species transmission from birds to mammals (Herfst et al. 2012; Imai et al. 2012; Linster et al. 2014). With respect to sustained (and airborne) transmission in mammals, key mutations were associated with changing receptor specificity from avian-type (α2,3) to mammalian-type (α2,6) receptors, increased thermostability and pH stability, increased replication of viruses at temperatures in the mammalian respiratory tract, and modified polymerase activity to increase transcription.

Importantly, not all host-jumping events are successful, and the evolution of stable transmission cycles in new hosts likely depends as much on host ecology as the acquisition of genetic changes that enable productive replication. Canine influenza A virus provides a useful illustration. This virus has emerged twice in dogs: once with horses as the reservoir species (H3N8; Crawford et al. 2005) and again when the virus was derived from avian hosts (H3N2; Parrish et al. 2015). Although in both cases the virus was able to establish local transmission chains, particularly in dog shelters in the United States, these chains eventually died out, and the H3N8 virus went extinct in dogs. Although the avian-derived H3N2 outbreak is ongoing, there has been continual local fade-out in the United States, and this epizootic may also ultimately suffer extinction (Voorhees et al. 2018). Because canine influenza A virus seems well-adapted to replicate in dogs (Feng et al. 2015), the most likely explanation for the high extinction rate is that there is an insufficient density of susceptible hosts in many localities to sustain host-to-host transmission (Dalziel et al. 2014).

REASSORTMENT AND INFLUENZA VIRUS EVOLUTION

Genomic reassortment is central to the biology of influenza viruses and occurs frequently in all hosts studied (Steel and Lowen 2014; Lowen 2017, 2018). The consequence of reassortment is that, following coinfection, viral progeny contains various gene segment combinations from the different parental viruses. Reassortment is of evolutionary importance because it creates new genomic constellations: Although the majority of these will be deleterious (as most single mutations are deleterious), some may facilitate adaptation to new hosts, help evade host immune responses, and assist in the generation of antiviral resistance.
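The combinatorial scope of reassortment can be made concrete by enumerating the genomic constellations that two coinfecting viruses could, in principle, produce. The sketch below assumes an eight-segment influenza A genome and two parental viruses labeled "A" and "B" (hypothetical labels), and it treats every segment as freely exchangeable, ignoring the incompatibilities that render many combinations nonviable in practice.

```python
from itertools import product

# The eight genome segments of influenza A virus.
SEGMENTS = ["PB2", "PB1", "PA", "HA", "NP", "NA", "M", "NS"]


def constellations(parents=("A", "B")):
    """Return every assignment of a parental origin to each segment."""
    return [
        dict(zip(SEGMENTS, combo))
        for combo in product(parents, repeat=len(SEGMENTS))
    ]


all_genotypes = constellations()
# A genotype is a reassortant if its segments come from more than one parent.
reassortants = [g for g in all_genotypes if len(set(g.values())) > 1]

print(len(all_genotypes), len(reassortants))  # 256 254
```

Of the 2^8 = 256 possible genotypes, two are the unchanged parents and 254 are reassortants. This is an upper bound under free mixing, not an estimate of how many reassortants actually arise or persist.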

Reassortment rate is shaped by a number of factors, including the extent of viral diversity in the host population. For example, reassortment may be most frequent in the Anseriformes as these harbor the largest number of subtypes (Lu et al. 2014). In addition, reassortment is not random, because segmental mismatch ensures that only some genomic constellations are viable. Indeed, segment mismatch, which encompasses both RNA- and protein-based incompatibilities between coinfecting viruses, is an important determining factor of the outcomes of mixed influenza A virus infections (White and Lowen 2018). For example, packaging signals may restrict or bias reassortment, decreasing the efficiency with which reassortant genotypes form (Essere et al. 2013). Similarly, there is a complex network of RNA–RNA interactions among segments that impacts the fitness of reassortant progeny (Dadonaite et al. 2019). Examples of protein incompatibility include the HA avidity/NA activity imbalance that occurs because the HA mediates attachment, whereas the NA facilitates release of virions, making these proteins functionally interdependent (Kaverin et al. 1998; Mitnaul et al. 2000). This may, in part, explain why only 117/144 possible HA/NA subtype combinations have been observed in wild birds (Olsen et al. 2006).
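The figure of 144 possible combinations follows directly from pairing the 16 HA subtypes with the nine NA subtypes found in wild birds. The short sketch below reproduces that arithmetic; the observed count of 117 is taken from the text (Olsen et al. 2006), and the variable names are illustrative only.

```python
from itertools import product

ha_subtypes = [f"H{i}" for i in range(1, 17)]  # H1 to H16
na_subtypes = [f"N{i}" for i in range(1, 10)]  # N1 to N9

# Every HA/NA pairing that is possible in principle.
possible = [ha + na for ha, na in product(ha_subtypes, na_subtypes)]

observed_count = 117  # combinations reported in wild birds (from the text)
fraction_observed = observed_count / len(possible)

print(len(possible), fraction_observed)  # 144 0.8125
```

Roughly four in five possible combinations have therefore been detected, leaving 27 unobserved. Whether those gaps reflect functional incompatibility or incomplete sampling cannot be determined from the counts alone.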

Reassortment in Avian Influenza A Virus

As wild birds support a large diversity of avian influenza subtypes and lineages, with both infection and reinfection commonplace, it is no surprise that influenza viruses from wild birds experience frequent reassortment. For example, Dugan et al. (2008) showed that 26% of 167 wild bird samples carried more than one HA/NA subtype, and that four different genotypes were present in five H4N6 isolates collected from mallards (Anas platyrhynchos) at the same location on the same day. Similarly, of 96 virus genomes from mallards isolated in 2011, 56% were reassortants (Wille et al. 2013), and using a natural experimental system it was demonstrated that 10 individual sentinel mallards were infected with at least three different subtypes within an autumn season, with 15 HA/NA subtype combinations (Tolf et al. 2013; Wille et al. 2013, 2017). Reassortment in the wild bird reservoir is most easily observed when the viruses comprise a mosaic of Eurasian and North American origin gene segments; such reassortants are most often detected in gulls and seabirds that move across continental margins (Fig. 3; Ramey et al. 2010; Wille et al. 2011; Lang et al. 2016).

Reassortment is central to the emergence of highly pathogenic avian influenza viruses in poultry. For example, the emergence of H5Nx and H7N9 viruses is closely tied to reassortment, such that the internal genes of H9N2 viruses reassorted with different HA/NA subtypes to generate both the gs/GD lineage H5N1 viruses and contemporary H7N9 viruses (Fig. 3). The H7N9 virus that emerged in China in 2013 resulted from numerous reassortment events, with the six internal protein genes derived from at least two separate H9N2 virus lineages and the H7 and N9 gene segments of wild bird origin (Gao et al. 2013; Lam et al. 2013; Wu et al. 2013; Pu et al. 2015; Wang et al. 2016). Since emergence, and across all seasonal waves, the H7N9 virus has experienced high rates of reassortment, generating multiple genome constellations. For example, Cui et al. (2014) documented 27 H7N9 genotypes within 3 months of emergence. This high reassortment rate is likely a result of the very high diversity of avian influenza A virus circulating in poultry markets and farms (Shi et al. 2018a), and it is not surprising that there has been reassortment among H5N6, H6N6, and H7N9 viruses (Wu et al. 2013; Jin et al. 2017).

Figure 3.

Reassortment and emergence of influenza viruses. Within the wild bird reservoir, reassortment is most recognizable when viruses contain gene segments with a mosaic of geographic lineages (see Fig. 2). These intercontinental reassortants are most often of the H13/H16 subtype and found in gull species (Wille et al. 2011). Reassortment in birds can cross host species barriers, such as the H7N9 viruses in China that involve viruses isolated in wild birds, domestic ducks, and poultry. According to the model proposed by Wang et al. (2014), an H7Nx virus from a domestic duck reassorted with an HxN9 from a wild bird. This H7N9 virus, once introduced to poultry, reassorted with an H9N2. More recent literature suggests this virus has segments from more than one H9N2 lineage, suggesting additional reassortment events. Following the emergence of this virus in poultry, it continued to reassort, continually producing new genotypes. Cross-species reassortment may also involve mammals. North American triple reassortant viruses in swine populations emerged following a number of reassortment events; segments have been traced back to classical swine lineages and human seasonal viruses, as well as North American avian viruses.

Reassortment of Swine Influenza Viruses

The evolutionary genetics of swine influenza viruses are complex; this is the result of numerous cross-species transmissions, introductions, and reassortment events occurring independently in different continents (Vincent et al. 2008; Brockwell-Staats et al. 2009; Steel and Lowen 2014). For example, in North America, one major virus lineage of influenza A virus, denoted the classical swine lineage, was introduced into pigs from humans during the 1918 pandemic where it was stably transmitted for ∼70 years (Brockwell-Staats et al. 2009). There was evidence for the cocirculation of H3N2 human viruses in pigs, and a double reassortant swine/human genotype virus and a triple reassortant swine/human/avian virus emerged shortly after (Fig. 3; Zhou et al. 1999; Karasin et al. 2000b). The triple reassortant was rapidly established in North American pigs (Webby et al. 2000) and was subject to further reassortment such that numerous lineages now exist, each carrying the internal genes of the established triple reassortant virus (Karasin et al. 2000a, 2006; Webby et al. 2000, 2004; Vincent et al. 2008). Because of the highly reassorted nature of influenza A viruses in swine populations, there is great concern that some of these viruses may have pandemic potential (Ito et al. 1998). However, with the exception of the 2009 H1N1 human pandemic, there is no concrete evidence that swine played a role in the 1918, 1957, and 1968 human pandemics (Nelson and Worobey 2018).

Reassortment of Human Influenza Viruses

Reassortment has undoubtedly played a starring role in the emergence of human influenza A viruses, although the lack of contemporary viruses from other animal species complicates analysis. For example, the H2N2 pandemic virus that emerged in 1957 was a reassortant between previously circulating human and avian viruses with the novel H2, N2, and PB1 genes acquired from the Eurasian avian reservoir (Smith et al. 2009a). Similarly, the H3N2 virus that emerged in 1968 was a reassortant between H2N2/1957 virus and an influenza virus with novel HA and PB1 acquired from the avian reservoir. Finally, the H1N1pdm09 virus was the result of complex and numerous reassortment events: the PB1, PB2, PA, HA, NP, and NS segments were acquired from the North American swine triple-reassortant viruses described above, whereas the NA and M segments had their origin in the Eurasian avian-like swine H1N1 lineage (Garten et al. 2009; Smith et al. 2009b).
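The mixed ancestry of the H1N1pdm09 genome lends itself to a simple lookup structure that records the source lineage of each segment. The sketch below encodes only the segment origins stated in the text (Garten et al. 2009; Smith et al. 2009b); the dictionary layout and helper function are illustrative choices, not a standard representation.

```python
from collections import defaultdict

TRIPLE = "North American swine triple reassortant"
EURASIAN = "Eurasian avian-like swine H1N1"

# Source lineage of each H1N1pdm09 segment, as described in the text.
H1N1PDM09_ORIGINS = {
    "PB2": TRIPLE,
    "PB1": TRIPLE,
    "PA": TRIPLE,
    "HA": TRIPLE,
    "NP": TRIPLE,
    "NA": EURASIAN,
    "M": EURASIAN,
    "NS": TRIPLE,
}


def segments_by_lineage(origins):
    """Group segment names under the lineage they were acquired from."""
    grouped = defaultdict(list)
    for segment, lineage in origins.items():
        grouped[lineage].append(segment)
    return dict(grouped)


for lineage, segments in segments_by_lineage(H1N1PDM09_ORIGINS).items():
    print(f"{lineage}: {', '.join(segments)}")
```

Grouping by lineage makes the six-plus-two split explicit. The same structure could hold the 1957 and 1968 pandemic viruses, although the text specifies only which of their segments were newly acquired from the avian reservoir.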

Reassortment also occurs in seasonal influenza viruses, both within and among subtypes. Despite the cocirculation of H1N1 and H3N2 viruses since 1977, inter-subtype reassortment occurs infrequently, which is likely a result of the lower fitness of reassortants compared to the parental wild-type viruses (Phipps et al. 2017). However, important examples include the appearance of H1N2 viruses detected sporadically in humans (Guo et al. 1992; Xu et al. 2002; Ellis et al. 2003; Al Faress et al. 2008). In contrast, intra-subtype reassortment occurs frequently (Holmes et al. 2005; Nelson et al. 2008; Rambaut et al. 2008; Westgeest et al. 2014; Berry et al. 2016) and likely far more so than detected through phylogenetic analysis as it may involve parental lineages that are difficult to distinguish on evolutionary trees and/or the inclusion of internal gene segments often not included in analyses.

THE MOLECULAR EPIDEMIOLOGY OF HUMAN INFLUENZA VIRUS

Influenza virus is a hugely successful global pathogen that causes epidemics of varying magnitude every winter season in the temperate parts of both the northern and southern hemispheres, with rather more continual circulation in the tropics (Viboud et al. 2006). Influenza A and B viruses therefore spread globally each year, although in a complex and unpredictable manner, with multiple introductions (and consequent cocirculation) into individual geographic regions, often undergoing antigenic drift during the process (Nelson et al. 2007, 2008; Bedford et al. 2015). Despite the apparent regularity of influenza, considerable uncertainty remains over the patterns and drivers of virus spread and of the possible existence and location of a global “source” population.

A variety of models have been put forward to explain the global spread of influenza virus, largely based on the phylogenetic analysis of genome sequence data. Most attention has been directed toward H3N2 influenza A viruses, with influenza A/H1N1 and B viruses exhibiting less frequent global movement (Bedford et al. 2015). A common idea is that both seasonal and antigenically distinct variants of influenza A virus regularly appear in East and Southeast Asia (including mainland China) from where they spread globally (Smith et al. 2004). Although both genomic and prevalence data should be interpreted with caution because of major sampling biases, this theory is compatible with the number, density, and movement of people that live in this region that together facilitate virus transmission and evolution. Indeed, expansive HA phylogenies have shown that viruses from East and Southeast Asia tend to fall toward the central trunk of the phylogeny as expected from a source population (Lemey et al. 2014). However, other phylogeographic studies have suggested that regions like southern China may not be global sources (Cheng et al. 2013), such that the phylogenetic data may better fit a “shifting metapopulation” in which viruses can emerge in any geographic region (e.g., Asia, Europe, North America), with the location of the source population regularly changing (Bahl et al. 2011).

The distinctive seasonality of influenza viruses complicates analyses of global scale molecular epidemiology. Although influenza is largely a winter disease in the temperate regions of both the northern and southern hemispheres, tropical regions are characterized by more continual virus transmission, sometimes with multiple peaks during the year (Viboud et al. 2006). For example, whereas there is a single seasonal peak of influenza in northern China, there is often both a major and a minor peak in southern China (Cheng et al. 2013). Similarly, the tropical regions of northern Australia typically have longer influenza seasons than the more temperate regions in the south of the country, despite a far lower population density in the north (Geoghegan et al. 2018). Hence, influenza seasonality likely reflects a complex interplay between temperature, humidity, and mode of transmission. These factors, combined with increasingly frequent global transport (Brockmann and Helbing 2013), ensure that influenza viruses can appear in any locality at any time, and whether they lead to outbreaks depends on the local combination of climatic and epidemiological factors.

The Phylodynamics of Influenza Virus

The array of influenza viruses that circulate in human populations differ in their evolutionary behavior (Bedford et al. 2014). These differences are reflected in so-called “phylodynamic” patterns: the structure of viral phylogenetic trees produced by a combination of evolutionary and epidemiological processes (Grenfell et al. 2004). Most notably, the H1N1 and H3N2 subtypes of influenza A virus that dominate seasonal influenza have markedly different dynamics (Fig. 4; Bedford et al. 2014, 2015; Vijaykrishna et al. 2015). H3N2 viruses experience a strongly selectively driven evolution, with frequent selective sweeps, strong antigenic drift, and major seasonal crashes in genetic diversity. In contrast, epidemics and selective sweeps are both less common in H1N1 viruses, with multiple lineages persisting across seasons. Parallel patterns are observed in influenza B virus. Whereas the Victoria lineage viruses undergo punctuated fluctuations in genetic diversity in the same manner as H3N2 viruses, the Yamagata lineage viruses experience fewer seasonal peaks, lower rates of amino acid change, and slower epidemics (Vijaykrishna et al. 2015), although multiple lineages persist across influenza seasons in both influenza B virus lineages (Fig. 4). Possible explanations for these distinctive patterns include differences in receptor-binding preferences shaped by HA structure or the age structure of those infected, which also differs among subtypes and likely impacts patterns of cross-protective immunity (Bedford et al. 2015; Vijaykrishna et al. 2015).

Figure 4.

The contrasting phylodynamics of human influenza viruses. (Top) Left-to-right: Phylogenetic trees of globally sampled hemagglutinin (HA) gene segments (∼1200 sequences) of influenza A H3N2 virus, 2002–2013; H1N1 virus, 1998–2009; H1N1pdm09 virus, 2009–2013; and the Yamagata (red) and Victoria (black) lineages of influenza B viruses, 2002–2013. Note the different tree shapes that reflect the impact of differing evolutionary and epidemiological pressures. (Bottom) Left-to-right: Relative genetic diversities through time, a marker of changing population sizes, in the influenza B virus Victoria lineage; influenza B virus Yamagata lineage; H3N2 influenza A virus; H1N1 influenza A virus 2003–2008; and H1N1pdm09 influenza A virus (orange) 2009–2013. Again, note the differences among subtypes. The analysis only utilized viruses sampled in Australia and New Zealand. (From Vijaykrishna et al. 2015; adapted, courtesy of Creative Commons Attribution Licensing.)

There is now an effort to use these phylogenetic patterns to predict which virus variants will dominate in the future and hence should be incorporated into vaccines, with online tools able to depict viral evolution in near real time (Hadfield et al. 2018). Specifically, lineages with higher rates of branching, which are therefore producing more descendants, are in theory more likely to succeed (Luksza and Lässig 2014; Neher et al. 2014).

The Evolution of Antigenic Drift

Arguably, the defining process of influenza virus evolution, at least in humans, is antigenic drift, which is the fixation, by natural selection, of mutations in the HA and NA that enable the virus to evade the human immune response. Indeed, because the human immune response is not completely cross-protective, natural selection will predictably favor antigenic variants that allow the virus to evade immunity (Fitch et al. 1997). Hence, antigenic drift enables escape from antibody-mediated neutralization acquired following infection or vaccination. Despite a name easily confused with “genetic drift,” which in marked contrast describes the random sampling of mutations through time particularly in small populations, the defining feature of antigenic drift is strong positive natural selection (adaptive evolution).
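The distinction between genetic drift and the positive selection that underlies antigenic drift can be illustrated with a toy simulation. The sketch below assumes a haploid Wright-Fisher population of constant size, which is a textbook idealization rather than a model taken from the studies cited here; the function names, population size, selection coefficient, and number of replicates are all illustrative choices. It compares how often a rare variant reaches fixation when it is neutral (random sampling only) and when it carries a fitness advantage, as an immune-escape mutation might.

```python
import random


def wright_fisher_fixation(pop_size, start_copies, selection, rng):
    """Follow one variant until it is fixed or lost; return True if fixed."""
    copies = start_copies
    while 0 < copies < pop_size:
        freq = copies / pop_size
        # Selection reweights the chance that the variant is chosen as a parent.
        weighted = freq * (1.0 + selection)
        p = weighted / (weighted + (1.0 - freq))
        # Binomial resampling of each generation is the source of genetic drift.
        copies = sum(1 for _ in range(pop_size) if rng.random() < p)
    return copies == pop_size


def fixation_fraction(selection, replicates=200, pop_size=100,
                      start_copies=5, seed=1):
    """Fraction of replicate populations in which the variant fixes."""
    rng = random.Random(seed)
    fixed = sum(
        wright_fisher_fixation(pop_size, start_copies, selection, rng)
        for _ in range(replicates)
    )
    return fixed / replicates


# Illustrative parameter values only (assumptions, not estimates for influenza).
neutral = fixation_fraction(selection=0.0)
selected = fixation_fraction(selection=0.2)
print(f"fixation fraction, neutral variant:  {neutral:.2f}")
print(f"fixation fraction, selected variant: {selected:.2f}")
```

Under these assumptions the neutral variant fixes in roughly the proportion of populations expected from its starting frequency, whereas the advantageous variant fixes far more often, which is the behavior that distinguishes antigenic drift from genetic drift.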

A complete understanding of antigenic drift is central to the design of effective vaccines, particularly if it can be inferred through genomic analyses alone. Although there has been considerable attention given to revealing which HA residues are the most important in antigenic escape (Koel et al. 2013; Li et al. 2016), there are still major uncertainties in our understanding of antigenic evolution and whether it exhibits any predictability, particularly in the face of complex epistatic interactions (Koel et al. 2019). Mutations in the HA1 region of the HA protein, particularly in or around the receptor-binding site, are thought to drive antigenic evolution (Wiley et al. 1981; Fitch et al. 1997; Bush et al. 1999; Koel et al. 2013). It is interesting that there is a correlation between antigenic drift and the incidence of influenza viruses, highlighting its epidemiological importance (Bedford et al. 2014; Neher et al. 2016).

A central issue is whether antigenic drift is continuous, involving the gradual generation and accumulation of antigenically distinct mutations through time, or if it is punctuated, such that the virus makes “jumps” in antigenic space because some mutations have greater phenotypic effect than others, followed by periods of antigenic stasis (Smith et al. 2004; Koelle et al. 2006; Wolf et al. 2006; Koel et al. 2013). In particular, the use of antigenic cartography, which explores the pattern of pairwise hemagglutination inhibition (HI) distances, has suggested that antigenic drift is punctuated with major jumps in antigenic space roughly every 1–5 years, corresponding to instances when vaccine efficacy is especially poor (Smith et al. 2004; Koel et al. 2013). Less clear is whether these jumps in antigenic space have any predictability, as this would greatly enhance vaccine design, and it is notable that no antigenic jumps have been observed in recent years.
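To make the notion of a pairwise HI distance concrete, the following minimal sketch converts HI titers into antigenic distances. It assumes the common convention that one antigenic unit corresponds to a twofold drop in titer relative to the homologous titer (the titer of an antiserum against the strain it was raised to); whether this matches the exact procedure of any particular cartography study is not established here. The strain names and titer values are invented for illustration and are not real measurements.

```python
import math


def antigenic_distance(homologous_titer, heterologous_titer):
    """Distance in antigenic units: number of twofold titer reductions."""
    return math.log2(homologous_titer / heterologous_titer)


# Hypothetical HI titers for one antiserum raised against "strain_A".
# All names and values are made up for this example.
titers_vs_serum_A = {
    "strain_A": 1280,  # homologous titer
    "strain_B": 640,
    "strain_C": 160,
}

homologous = titers_vs_serum_A["strain_A"]
distances = {
    strain: antigenic_distance(homologous, titer)
    for strain, titer in titers_vs_serum_A.items()
}

for strain, dist in distances.items():
    print(f"{strain}: {dist:.1f} antigenic units from strain_A")
```

In this toy example a strain that sits several units away from the reference would represent a larger step in antigenic space than one that differs by a single twofold dilution.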

There is also evidence for antigenic drift in H5N1 viruses in poultry (Chen and Holmes 2006; Dugan et al. 2008), which may contribute to increased evolutionary rates in this virus compared to the subtypes sampled from wild birds (Fourment and Holmes 2015). In addition, unlike other avian influenza viruses, H5N1 viruses are under vaccination pressure because of large veterinary vaccine campaigns in many countries, and it is possible that poor vaccination implementation and efficacy may contribute to its continued antigenic drift (Lee et al. 2004; Chen et al. 2006; Swayne and Kapczynski 2008; Eggert et al. 2010; Cattoli et al. 2011). Similarly, after initial suppression of H7N9 viruses following mass poultry vaccination in China, new strains have begun to emerge that are able to escape the vaccine (Shi et al. 2018a).

INTRAHOST EVOLUTION OF INFLUENZA VIRUS

The rapidity of the evolutionary process that seemingly characterizes all RNA viruses ensures that genetic and phenotypic variation is generated and accumulates within individual hosts infected with influenza virus, although human influenza B virus may exhibit less intrahost diversity than influenza A virus (Valesano et al. 2019). There is, however, considerable uncertainty as to how influenza virus evolves over such short timescales, and what this means for the long-term evolution of the virus (Xue et al. 2018b).

A key issue of contention is the number of viruses that are transmitted between hosts and hence that establish new infections (McCrone and Lauring 2018). In other words, how large is the population bottleneck associated with host transmission? Although there is clearly a substantial reduction in population size as the virus moves between hosts, there is debate as to whether single or multiple virus particles initiate new infections. Critically, the narrower the transmission bottleneck, the lower the chance of coinfection and reassortment. However, “mixed” influenza virus infections, involving different influenza virus types (A or B), subtypes (H1N1 and H3N2), or lineages/antigenic variants of the same subtype, are relatively commonplace within single hosts (Ghedin et al. 2009) and frequently observed in avian populations (Dugan et al. 2008; Wille et al. 2013, 2017), and these provide the raw material for reassortment. Although it is possible that these mixed infections result from relatively “loose” population bottlenecks in some cases, such that multiple viruses are transmitted between hosts (Murcia et al. 2010), they may also result from the “superinfection” of individual hosts in the face of relatively weak protective immunity.

The most convincing data suggest that most, although not all, influenza A virus infections are initiated by a single virion (McCrone et al. 2018). Importantly, such a severe population bottleneck also reduces the occurrence of “cooperative” interactions between virus particles, in which different variants contribute different functions, explaining why these cooperative effects have only been observed at a relatively high multiplicity of infection in cell culture (Xue et al. 2016) and not in natural influenza A virus infections (Xue et al. 2018a). A substantial transmission bottleneck may also mean that antigenic drift is heavily dependent on the chance transmission of particular viruses between individuals, rather than the selectively mediated transfer of advantageous variants.
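The effect of bottleneck width on what is transmitted can be illustrated with a simple calculation. The sketch below assumes that the founding virions are drawn independently and at random from the donor's virus population (binomial sampling with no selection during transmission); this is an idealization adopted for illustration, not a model taken from the studies cited, and the function names and variant frequency are hypothetical.

```python
def prob_variant_transmitted(freq, bottleneck):
    """Chance that at least one founding virion carries a variant at `freq`."""
    return 1.0 - (1.0 - freq) ** bottleneck


def prob_mixed_founders(freq, bottleneck):
    """Chance that the founders include both the variant and the majority type.

    This is one minus the chance that all founders are the variant and the
    chance that none are.
    """
    return 1.0 - freq ** bottleneck - (1.0 - freq) ** bottleneck


# Hypothetical within-host frequency of a minority variant in the donor.
variant_freq = 0.1

for size in (1, 2, 10, 100):
    carried = prob_variant_transmitted(variant_freq, size)
    mixed = prob_mixed_founders(variant_freq, size)
    print(f"bottleneck={size:>3}: variant transmitted={carried:.3f}, "
          f"mixed founding population={mixed:.3f}")
```

With a bottleneck of a single virion the founding population can never be mixed, and a minority variant is passed on only with probability equal to its frequency in the donor. This is why a narrow bottleneck makes transmission of particular variants largely a matter of chance, and why it limits the opportunity for coinfection to arise from a single transmission event.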

Although it is tempting to ascribe great importance to intrahost virus evolution, the infection period associated with influenza is generally so short that natural selection is usually unable to radically alter mutation frequencies (Han et al. 2019). Indeed, recent studies suggest that the intrahost evolution of influenza virus is dominated by stochastic processes, including those following frequent population bottlenecks (McCrone et al. 2018). The exception may be cases in which the hosts are immunocompromised and shed virus for extended time periods. Studies of these patients have revealed evolutionary behaviors that differ from those seen in hosts infected for far shorter time periods, with the limited immune response resulting in variant patterns of mutational accumulation (Ghedin et al. 2011; Rogers et al. 2015). Although it is possible that the prolonged shedding of viruses in immunocompromised hosts may sometimes generate mutations of phenotypic importance, and some of the mutations generated can spread globally (Xue et al. 2017), modeling suggests that immunocompromised hosts only play a limited role in virus evolution because of their low frequency in the population as a whole (Eden et al. 2017).

ACKNOWLEDGMENTS

E.C.H. is funded by an Australian Research Council Australian Laureate Fellowship (FL170100022). The WHO Collaborating Centre for Reference and Research on Influenza is funded by the Australian Commonwealth Government.

This article has been made freely available online courtesy of TAUNS Laboratories.
