 Research
 Open Access
 Published:
Quantifying the annual incidence and underestimation of seasonal influenza: A modelling approach
Theoretical Biology and Medical Modelling volume 17, Article number: 11 (2020)
Abstract
Background
Seasonal influenza poses a significant public health and economic burden, associated with the outcome of infection and resulting complications. The true burden of the disease is difficult to capture due to the wide range of presentation, from asymptomatic cases to nonrespiratory complications such as cardiovascular events, and its seasonal variability. An understanding of the magnitude of the true annual incidence of influenza is important to support prevention and control policy development and to evaluate the impact of preventative measures such as vaccination.
Methods
We use a dynamic disease transmission model, laboratoryconfirmed influenza surveillance data, and randomizedcontrolled trial (RCT) data to quantify the underestimation factor, expansion factor, and symptomatic influenza illnesses in the US and Canada during the 20112012 and 20122013 influenza seasons.
Results
Based on 2 case definitions, we estimate between 0.42−3.2% and 0.33−1.2% of symptomatic influenza illnesses were laboratoryconfirmed in Canada during the 20112012 and 20122013 seasons, respectively. In the US, we estimate between 0.08−0.61% and 0.07−0.33% of symptomatic influenza illnesses were laboratoryconfirmed in the 20112012 and 20122013 seasons, respectively. We estimated the symptomatic influenza illnesses in Canada to be 0.32−2.4 million in 20112012 and 1.8−8.2 million in 20122013. In the US, we estimate the number of symptomatic influenza illnesses to be 4.4−34 million in 20112012 and 23−102 million in 20122013.
Conclusions
We illustrate that monitoring a representative group within a population may aid in effectively modelling the transmission of infectious diseases such as influenza. In particular, the utilization of RCTs in models may enhance the accuracy of epidemiological parameter estimation.
Introduction
The exact number of cases of a disease is complex to capture. Different methods can be used, from epidemiological studies to disease surveillance systems. While data collected routinely for surveillance purposes have the advantage of being readily accessible over a long period of time, they are subject to underestimation. Underestimation is a combination of underreporting (failure to capture cases that seek care due to underdiagnoses or undernotifications) and underascertainment (failure to seek health care) [1]. Symptomatic individuals who seek medical care but are misdiagnosed due to an atypical presentation which does not fit the case definition or to the lack of sensitivity of the laboratory test (underdiagnosis) and/or for which administrative steps may not be taken at the physician’s office to report the case contribute to underreporting [1]. Also, infected individuals may be asymptomatic or with a mild form of the disease and may not seek healthcare, leading to underascertainment. Mathematical modelling may play a role in quantifying the effects of factors contributing to underestimation to assess the true number of influenza illnesses, ultimately to assist in policy development and to evaluate the impact of influenza vaccination.
Passive influenza surveillance systems are not designed to capture all illnesses; however, surveillance data has been utilized to assess underreporting, underestimation and incompleteness. Statistical modelling has been utilized to quantify underreporting and underestimation of influenzaassociated hospitalizations, morbidity and mortality in the United States (US) and Canada [2–12]. However, assessing the underestimation associated with the true number of influenza illnesses has received less attention. Influenza surveillance data has been utilized to assess underestimation and estimate symptomatic influenza illnesses using mulitiplier methods [13, 14], Bayesian evidence synthesis [15], and dynamic models using ordinary differential equations [16]. In one study, assessments of underestimation were shown to be unreliable when utilizing only virologic surveillance data [16]. Similarly, a noted contributor of error of symptomatic illness estimates was the quality of sentinel influenzalike illness (ILI) data, which is subject to both overestimation and underestimation [15]. A method to assess underestimation of influenza illnesses utilizing a reliable data source, such as active surveillance, may enhance the ability to quantify the true underestimation and the number of influenza illnesses. In the present study, we develop a framework to utilize data from a closely monitored group whose epidemiological status is subject to minimal uncertainty. In particular, we utilize a randomizedcontrolled trial (RCT) designed to minimize the effects of underestimation and completely capture participant’s illnesses through active surveillance and systematic testing [17].
A recent vaccine RCT, which assessed the efficacy of a highdose (HD) influenza vaccine compared to a standarddose (SD) influenza vaccine, presents an opportunity to explore a new method using mathematical modelling to correct for underestimation in national virological influenza surveillance, ultimately to infer the number of symptomatic influenza illnesses. In particular, we establish a parameter estimation technique based on a mechanistic disease transmission model to assess the underestimation and symptomatic illnesses in the US and Canada during the 20112012 and 20122013 influenza seasons.
Methods
Data sources
The present modelling study integrates multiple sources of data to generate estimates for the underestimation factor, disease transmission rate, expansion factor, and symptomatic influenza illnesses (Fig. 1). The laboratoryconfirmed influenza cases during the 20112012 and 20122013 seasons are included in the final seasonal surveillance reports from FluWatch and FluView in Canada and the US, respectively [18, 19]. Additionally, we utilize the US and Canadian census profiles [20–23].
A recent RCT assessed the efficacy of a HD influenza vaccine compared to a SD influenza vaccine among 14,500 and 17,489 participants during the 20112012 and 20122013 seasons, respectively [17]. We utilize the laboratoryconfirmed influenza counts among participants receiving SD and HD influenza vaccines in this RCT, which have been provided in the Supplementary Appendix of a prior publication [17]. The number of laboratoryconfirmed influenza cases within the RCT are provided for three different case definitions [17]. In the present study, we utilize the laboratoryconfirmed influenza cases associated with the least and most restrictive case definitions in the RCT. We utilize data associated with the least restrictive case definition, respiratory illness (RI), to provide estimates representing the true underestimation of symptomatic influenza illnesses. We utilize a more restrictive case definition, modified CDCdefined ILI, which is closer to the case definition used by surveillance systems in the US and Canada, to provide estimates representing underestimation of symptomatic influenza illnesses which were captured by virologic surveillance. RI was defined as the occurrence of one or more of the following: sneezing, nasal congestion or rhinorrhea, sore throat, cough, sputum production, wheezing, or difficulty breathing and modified CDCdefined ILI was defined as RI with cough or sore throat, concurrent with a temperature above 37.2 ^{∘} C [17].
The SD influenza vaccine effectiveness and coverage over the 20112012 and 20122013 influenza seasons has been established in the US and Canada [24–27]. Also, the HD influenza vaccine coverage estimates have been made available in the US [28]. In Canada, the HD influenza vaccine was not yet licensed for use during 20112012 and 20122013 seasons.
Mathematical modelling
We develop two compartmental models consisting of vaccinated and infected individuals for the 1) seniors aged 65+ who participated in the RCT and 2) the general population. Individuals are assumed to be vaccinated by the beginning of the influenza season, so there is no inflow to the vaccinated compartment during the season. Since we study the dynamics of influenza within one season, we ignore the demographic dynamics (e.g. birth and death). As the size of the population for participants of the RCT is small relative to the total population, we ignore the influenza transmission from these participants to the general community; however, we assume the same transmission rate among the RCT participants. Also, we assume that the age groups 064 and 65+ are homogeneously mixed. In addition to homogeneous mixing by age group, we assume spatial homogeneity in transmission. We assign units of symptomatic influenza illness to the infected compartments of the following models for dimensional consistency with data.
Modelling influenza infections among RCT participants
We develop a compartmental model consisting of vaccinated and infected individuals for the seniors aged 65+ who participated in the RCT [17]. We make use of the following notation for model variables and parameters; the subscript ^{′′}+^{′′} refers to compartments of individuals aged 65+ and subscript ^{′′}−^{′′} refers to individuals aged 064. The vaccination status of the compartment will also appear in the subscript as SD (standarddose) or HD (highdose), if applicable. Lastly, let the superscript “O" denote individuals outside of the RCT and the superscript “C" denote participants within the RCT. With these assumptions and notations set, we formulate the following system of ordinary differential equations, where \(V^{C}_{+,SD}\) represents RCT participants vaccinated given the standarddose vaccine, \(V^{C}_{+,HD}\) represents RCT participants vaccinated given the highdose vaccine, \(I^{C}_{+,HD}\) represents infected RCT participants with highdose vaccine, and \(I^{C}_{+,SD}\) represents the infected RCT participants given standarddose vaccine. These model variables are described in Table 1. Submodel (1) is equipped with the following initial conditions to replicate the RCT conditions preceding each season: \(V^{C}_{+,SD}(0) = V_{SD0}\phantom {\dot {i}\!}\), \(V^{C}_{+,HD}(0) = V_{HD0}\phantom {\dot {i}\!}\), \(I^{C}_{+,SD}(0) = 0\), \(I^{C}_{+,HD}(0)=0\) [17].
Parameters ε_{+,SD} and ε_{+,HD} denote the vaccinemodified susceptibilities corresponding to the standarddose and highdose influenza vaccine efficacy among seniors, respectively, β is the transmission rate and γ is the influenza recovery rate. Specifically, β=b/N where b is the daily effective contact rate and N is the total population size. The vaccinemodified susceptibilities ε represent a multiplier for the reduced infection rate of vaccinated individuals who are susceptible to influenza infection. Lastly, J represents the total infected individuals in the population from model (2), \(J = I^{O}_{} +I^{O}_{+}\).
Countrywide influenza transmission model
We subdivide the general community (i.e., the entire US or Canada) into two age groups: seniors aged 65+ and nonseniors aged 064, as well as vaccination status (SD or HD). The model variables and notations for model (2) are displayed in Table 2 and described in more detail below.
As identical with model (1), β=b/N where b is the daily effective contact rate and N is the total population size. Model (2) is equipped with the following initial conditions: the influenza season begins with the initial vaccinated population \(V^{O}_{,SD}(0) = \rho ^{O}_{,SD} N_{}\), \(V^{O}_{+,SD}(0) = \rho ^{O}_{+,SD} N_{+}\), \(V^{O}_{+,HD}(0) = \rho ^{O}_{+,HD} N_{+}\), where \(\rho ^{O}_{,SD}\), \(\rho ^{O}_{+,SD}\), \(\rho ^{O}_{+,HD}\) are the vaccine coverages for the population aged 064 and population aged 65 or over (either vaccinated with standarddose or highdose) in each year and country. Similarly, the initial susceptible populations are: \(S^{O}_{}(0)=\left (1\rho ^{O}_{,SD}\right)N_{}\), \(S^{O}_{+}(0)=\left (1\rho ^{O}_{+,HD}\rho ^{O}_{+,SD}\right)N_{+}\); there are no infections preceding the epidemic, so \(I^{O}_{}(0) = I^{O}_{+}(0)=0\). The values for these parameters embedded in the initial conditions are shown in Table 3 and estimated in “Parameter estimation” section.
Disease transmission and burden estimates
The approach for quantifying the underestimation factor and transmission rate utilizes the modelling framework developed in “Mathematical modelling” section. The mathematical details are presented in Appendix A. We assume that, since the RCT participants were actively monitored in the community, i.e. instructed to contact their study site if they had any respiratory symptoms and were, in addition, contacted weekly or biweekly by the site, that there was no underestimation of influenza infection within the RCT. The key idea is captured in the relationship pJ=R; a percent p (which denotes the underestimation factor) of symptomatic influenza cases J yields the laboratoryconfirmed influenza cases R. We use this relationship pJ=R, the final epidemic size relationships for submodel (1) and model (2), and structure of the SVIR model to derive a tractable system of nonlinear equations. The representative equation derived from submodel (1) captures information from the RCT, while the representative equation derived from model (2) captures information from the general population. Specifically, we derive a nonlinear system of the form
where
and \(\hat {R}\) is the laboratoryconfirmed cases in an influenza season captured in national virological surveillance. To solve this system of nonlinear equations (4) for p and β, we use Matlab’s fsolve function in Matlab R2016a.
Expansion factor and symptomatic illnesses
The expansion factor, E, is defined in this study as the number of symptomatic influenza illnesses per laboratoryconfirmed infection. In terms of the underestimation factor, the expansion factor is its multiplicative inverse. Hence, we compute the expansion factor by finding the multiplicative inverse of p, that is E=p^{−1}.
The number of estimated symptomatic influenza illnesses is \(E \hat {R}\) where \(\hat {R}\) is the total laboratoryconfirmed influenza cases from national surveillance during an influenza season.
The basic reproduction number is the average number of secondary cases produced by one infected individual introduced into a population of susceptible individuals. We determine the basic reproduction number of model (2) using the next generation method [35].
Parameter estimation
We determine estimates for parameters associated with submodel (1) and model (2) utilizing data in “Data sources” section. We provide an outline of the methods in the main text and more complete calculations and explanations are included in Appendix B.
Estimating vaccinemodified susceptibility ε_{−,SD}, ε_{+,SD} and ε_{+,HD}: Recall the vaccinemodified susceptibility ε captures the protection added from vaccination among vaccinated individuals against influenza infection. Here we outline the method for estimating parameter values ε_{+,SD},ε_{+,HD} and ε_{−,SD} for influenza seasons 20112012 and 20122013. This process integrates model analysis and prior vaccine effectiveness (VE) studies. We make use of a relationship between vaccinemodified susceptibility and VE [36].
To infer ε_{−,SD} and ε_{+,SD} we use prior estimates of vaccine effectiveness (VE) against influenza. Specifically, we relate these VE estimates to vaccinemodified susceptibility ε with the relationship ε=1−VE [36]. Remaining is to find ε_{+,HD}, which we use ε_{+,SD} and the ratio ε_{+,HD}/ε_{+,SD}. We use submodel (1) to determine the ratio ε_{+,HD}/ε_{+,SD}. Specifically, we find ε_{+,HD}/ε_{+,SD} by dividing the first two equations in submodel (1), which yields \(\frac {\dot {V}^{C}_{+,HD}}{\dot {V}^{C}_{+,SD}} = \frac {\epsilon _{+,HD} V^{C}_{+,HD}}{\epsilon _{+,SD} V^{C}_{+,SD}}\). Finally, we use separation of variables to find the ratio in terms of known RCT outcomes [17]:
The quantities \(V^{C}_{+,SD}(\infty)\) and \(V^{C}_{+,HD}(\infty)\) are the limit values of the state variables \(V^{C}_{+,SD}\) and \(V^{C}_{+,HD}\), respectively. Finally, we use estimates of now known quantities; ε_{+,SD} from VE studies and the ratio \(\frac {\epsilon _{+,HD}}{\epsilon _{+,SD}}\) from the RCT to estimate ε_{+,HD} [17].
Initial susceptible and vaccinated populations: To inform the initial conditions for model (2) in the years 20112012 and 20122013, we use population sizes given by the US and Canadian census programs [20–23]. The population size and the estimated vaccine coverage in each country then gives us the susceptible and vaccinated population initial conditions. The values and descriptions of these parameters embedded in the initial conditions; \(N_{}, N_{+}, \rho ^{O}_{,SD}, \rho ^{O}_{+,SD}, \text {and} \rho ^{O}_{+,HD}\), are listed in Table 3. Similarly, the initial conditions for submodel (1) are given in Table 3.
Influenza recovery rate γ: We inform the recovery rate γ using the infectious period of influenza, which has been estimated to be 3.8 days with a 95% confidence interval (CI) of 3.1  4.6 days [34]. The recovery rate γ is then the inverse of the mean sojourn time in the infectious compartment; hence, we consider \(\gamma = \frac {1}{3.8}\) day^{−1} as a baseline. We utilize the bounds on the CI of the estimated infectious period for sensitivity analysis and vary γ from \(\frac {1}{4.6}\) to \(\frac {1}{3.1}\) day^{−1}.
We utilize the laboratoryconfirmed influenza counts reported in the RCT according to two case definitions to inform \(V^{C}_{+,HD}(\infty)\) and \(V^{C}_{+,SD}(\infty)\). We use 1) laboratoryconfirmed cases associated with modified CDCdefined ILI and 2) laboratoryconfirmed cases associated with respiratory illness (RI) [17] (Table 3). Results generated for each case definition has an interpretation and is left for the Discussion.
Wellposedness
To ensure that the parameter estimation process for obtaining p and β will yield biologically relevant results, we study the wellposedness of the inverse problem outlined in “Disease transmission and burden estimates” section. The solution p and β to system (4) is unique and positive, i.e. the problem is wellposed. From f(β,p)=0, we note that β is related to p by the following equation
Substituting this expression for β into g(β,p) yields the following equation of p
where
Let Φ(p) denote the left hand side of equation (6). Note that Φ is a monotonedecreasing function of p with Φ(0)>0 and \({\lim }_{p\to \infty } \Phi (p)=0\). On the other hand, the right hand side of the equation is a monotonedecreasing function of p with \({\lim }_{p\to 0} \gamma \hat {R}p^{1}=\infty \) and \({\lim }_{p\to \infty } \gamma \hat {R}p^{1}=0\). Therefore, there exists a unique solution p∈(0,1] if \(\Phi (1)>\gamma \hat {R}\).
From the parameter sets estimated in “Parameter estimation” section, \(\Phi (1) > \gamma \hat {R}\), hence we have a unique solution of System (3) corresponding to each contextspecific parameter set. In other words, according to this specific US/Canadian demographic information, vaccinespecific parameters, influenza surveillance reports, and RCT study results, there is a single underestimation factors p (which we ensure is logically between 0 and 1) and transmission rate β which satisfy System (3) [17–19].
Sensitivity analysis
To quantify uncertainty in the underestimation factor, expansion factor, number of symptomatic influenza illnesses, and basic reproduction number, we utilize variability in parameter estimates in Table 3. Note that model parameter estimates appear as point values with the exception of the recovery rate γ (Table 3). We solve system (4) according to each case definition (ModifiedCDC and RI), country (US and Canada) and study year (20112012 and 20122013) with the mean value of γ, lower 95% confidence bound and upper 95% confidence bound on γ. The estimated value of p and β corresponding to the mean value of γ represent baseline results. The ranges of p and β obtained using the 95% confidence bounds of γ represent their sensitivity to an estimated 95% CI of the infectious period of influenza. We obtain baseline estimates of the expansion factor, number of symptomatic influenza illnesses and the basic reproduction number using baseline p and β estimates and methods in “Expansion factor and symptomatic illnesses” section. To retrieve an interval based on variation of recovery rate γ, we propagate the variability in p and β to each of the epidemiological parameters using methods in “Expansion factor and symptomatic illnesses” section.
Sensitivity to underestimation in the RCT
We have assumed perfect reporting and ascertainment of influenza virus infection within the RCT. Participants were instructed to contact their study site if they had any respiratory symptoms [17]. In addition, participants were contacted by a call center twice weekly (between the beginning of January and the end of February) or weekly until the end of illness surveillance (April 30 each year) [17]. In light of these frequent participant followups in the RCT, we expect this assumption to hold. However, we consider the possibility of underestimation occurring within the RCT to be exhaustive in our analysis [17]. For details regarding the sensitivity of p and β to underestimation in the RCT, see Appendix C.
Results
Disease transmission and burden estimates
The estimates for Canada are displayed in Table 4. Using the laboratoryconfirmed influenza counts associated with modified CDCdefined ILI within the RCT, we quantified the underestimation factor p=2.6% (2.1−3.2%) and p=1.2% (0.98−1.2%) in the 20112012 and 20122013 influenza seasons, respectively. These underestimation factors correspond to expansion factors of E = 37.9 (31−46.8) and E=84.8 (69.1−102.6) in 20112012 and 20122013 influenza seasons, respectively (Fig. 2a). Equipped with expansion factors in each year, we estimated symptomatic influenza illnesses to be 390,000 (320,000−480,000) and 2.3 million (1.8−2.7 million) in 20112012 and 20122013 influenza seasons, respectively. Also in Canada, using the laboratoryconfirmed influenza counts associated with RI within the RCT, we quantified the underestimation factor p = 0.51% (0.42−0.63%) and p=0.4% (0.33−0.49%) in the 20112012 and 20122013 influenza seasons, respectively. These underestimation factors correspond to expansion factors of E = 200.1 (163.4−243.9) and E = 376.1 (306.7−455.1) in 20112012 and 20122013 influenza seasons, respectively. Equipped with expansion factors in each year, we estimated symptomatic influenza illnesses to be 2 million (1.6−2.4 million) and 6.7 million (5.5−8.2 million) in 20112012 and 20122013 influenza seasons, respectively.
The estimates for the US are displayed in Table 5. Using the laboratoryconfirmed influenza associated with modified CDCdefined ILI within the RCT, we estimated p=0.5% (0.41−0.61%) and p=0.27% (0.2−0.33%) in the 20112012 and 20122013 seasons, respectively. These underestimation factors correspond to expansion factors of E = 200.1 (163.4−243.9) and E = 376.1 (306.7−455.1) in 20112012 and 20122013 influenza seasons, respectively (Fig. 2b). Equipped with expansion factors in each year, we estimated symptomatic influenza illnesses to be 5.4 million (4.4−6.9 million) and 29 million (23−34 million) in 20112012 and 20122013 influenza seasons, respectively. Also in the US, using the laboratoryconfirmed influenza counts associated with RI within the RCT, we estimated p=0.1% (0.08−0.12%) and p=0.09% (0.07−0.11%) in the 20112012 and 20122013 seasons, respectively. These underestimation factors correspond to expansion factors of E = 1030.1 (840.5−1247) and E = 1111.4 (906.6−1345.3) in 20112012 and 20122013 influenza seasons, respectively. Equipped with expansion factors in each year, we estimated symptomatic influenza illnesses to be 28 million (23−34 million) and 84 million (69−102 million) in 20112012 and 20122013 influenza seasons, respectively.
We estimated the disease transmission rate β range to between 0.25 and 0.39 in each season and country, corresponding to basic reproduction numbers R_{0} ranging between 1.19 and 1.22 (Tables 4 and 5, RI case definition).
Sensitivity to underestimation in RCT
Recall we have assumed perfect reporting and ascertainment of influenza virus infection among the RCT participants and proposed to revisit this assumption with a sensitivity analysis. We have conducted a sensitivity analysis and present the details in Appendix C. Overall, the sensitivity analysis suggests that the results in “Results” section are robust to underestimation within the RCT [17]. We find that our estimates for β are weakly dependent on underestimation within the RCT [17]. Further, the captured fraction of influenza infections by virologic surveillance p is also robust to underestimation within the RCT [17], maintaining an order of magnitude while varying RCT underestimation over its full range. See Appendix C for a full presentation of this analysis.
Discussion
This study develops and illustrates a method utilizing a parameter estimation technique based on a mechanistic model and data synthesis to quantify the underestimation factor associated with season influenza and the number of symptomatic illnesses. These estimates take into account mechanistic detail of influenza transmission, vaccine effectiveness, relative vaccine efficacy of SD to HD from the RCT, and vaccine coverage. While this method does utilize RCT outcomes, the remaining data required to generate the estimates become publicly available by the end of the influenza season. This method can be utilized for epidemiological parameter estimation for infectious diseases, provided there is an appropriate form of active surveillance (e.g., clinical trial) data available. A coupled system of differential equations for the actively monitored population and general community can be developed to integrate surveillance epidemiological data to quantify key populationlevel parameters, including the underestimation factor.
Our estimates generated from the laboratoryconfirmed influenza associated with modified CDCdefined ILI case definition are representative of the underestimation of cases which could be captured by the surveillance system (Tables 4 and 5, modified CDCdefined ILI case definition). In this light, our analysis indicates that the surveillance system in Canada captured 2.6% (2.1−3.2%) and 1.2% (0.98−1.2%) of symptomatic cases closely associated with modified CDCdefined ILI in 20112012 and 20122013 influenza seasons, respectively. In the US, the virologic surveillance system was estimated to capture 0.1% (0.08−0.12%) and 0.09% (0.07−0.11%) of symptomatic influenza cases closely associated with modified CDCdefined ILI in 20112012 and 20122013 influenza seasons, respectively. In each country, the percentage of captured cases decreases with an increase in laboratoryconfirmed cases from passive surveillance, which may be due to laboratory testing capacity or changing testing practices (Tables 4 and 5, modified CDCdefined ILI).
The underestimation factors generated from laboratoryconfirmed influenza associated with RI, i.e., a broader range of symptoms, provide an underestimation factor more closely representing the true symptomatic influenza underestimation factor and are apt to assess the true number of influenza illnesses (Tables 4 and 5, respiratory illness case definition). When considering a range of symptomatic influenza illness from the estimates using data specified by these two case definitions, we provide symptomatic influenza illness estimates which are consistent with US CDC’s in 20112012 and 20122013 (Fig. 3b). The US CDC estimate in 20122013 is near the low end of the range we estimated, which may be due to difference in methodology. In particular, this difference may be due to the nonlinear (exponential) relationship between the underestimation factor and laboratoryconfirmed illnesses in surveillance data. An explanation to support this nonlinear relationship and our results could be that the increased number of influenza illnesses results in limited availability of laboratory tests, therefore a fewer proportion of cases were captured in virological surveillance.
We offer two sources of comparison from studies of the 2009 influenza pandemic. In an expert opinion from the 2009 influenza pandemic, the number of expected influenza infections per laboratoryconfirmed infection ranged from 10 to 500 among experts [37]. For Canada, our estimates of the expansion factor lie in this range of 10 to 500 (Table 4). For the US, we estimate expansion factors greater than 500 using respiratory illness case definition (Table 5). Also, estimates of underestimation factors were between 0.1% and 0.7% during the 2009 influenza pandemic in Mexico [16]. Our estimates for the true underestimation in Canada and the US lie in this range (Table 4 and 5, respiratory illness case definition). In both cases, investigations of the 2009 pandemic influenza in Canada and Mexico align well with Canada; however, according to our analysis the underestimation in laboratoryconfirmed surveillance in the US is more substantial. This is likely due to differences in surveillance systems, laboratorytesting protocols, or healthseeking behavior. We also note that epidemiological characteristics differ from pandemics to seasonal influenza and the level of awareness may have impacted health seeking behavior and diagnosis practices.
The basic reproduction number ranged from 1.19  1.22 for estimates associated with RI case definition, which more closely represent the true R_{0}. While these basic reproduction numbers align tightly in the US and Canada, our estimates for R_{0} are higher in Canada in all influenza seasons (Tables 4 and 5). A recent systematic review found an interquartile range of reproduction numbers for seasonal influenza of 1.19–1.37, with median value 1.27 [38]. Our estimates for R_{0} are in line with typical findings representative of seasonal influenza [38].
We estimate the symptomatic influenza illnesses in the US and Canada; however, the number of asymptomatic influenza illnesses has not been assessed in this work. A recent review of estimates for the fraction of asymptomatic infections vary widely; however, values from 20%50% are typical [39]. These estimates indicate that the number of asymptomatic influenza illnesses may be substantial. Even so, based on our scan of the literature, there is no clear consensus on the contribution of asymptomatic individuals to influenza transmission. Future studies may attempt to estimate the total number influenza illnesses, symptomatic and asymptomatic, to provide a more accurate assessment of influenza illnesses; influenza underestimation; and the force of infection. As a result of this simplification, we may underestimate R_{0} and hence the force of infection.
There are several limitations in the current study. There are differences in symptomatic attack rates, disease presentation, and healthseeking behaviour between age groups. As a result, the symptomatic reporting fraction likely varies from age group to age group. Future study may be extended to address these age group disparities by quantifying the underestimation and expansion factors for each age group. Another limitation to this study is the assumed homogeneous mixing between age groups; however, this may also be addressed in a future study. In fact, the RCT data may allow the contact preference between age groups (i.e. the contact rate between age groups 064 and 65+, and viceversa) to be estimated. In this light, the methods presented herein may be used as a method to validate empirical social contact data obtained from surveys or existing contact mixing patterns. In addition to agespecific heterogeneity, there may be spatial heterogeneity in contact mixing, surveillance systems, and social behavior which impacts case ascertainment and reporting rates. In this light, it may be advantageous to conduct analysis at more granular scales (e.g. city, state, or provincial levels) to more accurately capture transmission and underestimation at these subnational levels. The RCT was conducted in the US and Canada, and we use aggregate case counts and RCT participant counts from US and Canada. In other words, the RCT is assumed to take place in the US or Canada. The underestimation factor could also be separated into several multipliers, as in prior works [13]. An improvement could be made to account for laboratory test accuracy; laboratoryconfirmed cases from the RCT and surveillance reports can be preprocessed for the method to generate more representative estimates. Overall, this work lays a functional framework to expand as the research questions at hand require.
Conclusions
We develop a method for quantifying the underestimation factor and disease transmission rate by integrating several data sources including RCT outcomes. We use our method to assess surveillance system capacity, number of symptomatic influenza illnesses, and R_{0} in the US and Canada during 20112012 and 20122013 seasons. The utilization of outcomes from RCTs (in this case a comparative vaccine RCT) may allow for the extraction of additional information characterizing an epidemic which may not be possible by limiting data usage to national influenza surveillance reports. This work illustrates the point that monitoring a representative group within a population may aid in effectively modelling the transmission of infectious diseases such as influenza. In general, the utilization of available surveillance data may increase the capabilities of disease models and broaden their power to draw inferences. A formal structural and practical identifiability analysis should be carried out to rigorously address these details of parameter identification.
It is key to provide more accurate methods to estimate the annual incidence of influenza to guide evidencebased immunization policymaking. In this light, it is vital to develop more accurate mathematical models of influenza transmission, and to accurately evaluate the impact of influenza vaccination. These methods may contribute to and enable the design of optimal vaccination programs that best reduce the annual incidence of seasonal influenza as well as associated hospitalizations, medical visits and deaths.
Appendix A: Mathematical details for estimating p and β.
Here we present the details for estimating p and β. We also introduce the following notation for convenience and readability; let a hat (\(\hat {}\)) over a variable denote integration over all nonnegative time. For example, \(\hat {I} = \int _{0}^{\infty }I(s)ds\).
Integrating the first equation of submodel (1) we have
Similarly, taking the second equation from submodel (1), we have:
For finding \(\hat {I}^{C}_{+,SD}\), \(\hat {I}^{C}_{+,HD}\), adding the first and third equations, & second and fourth of submodel (1) followed by integrating both sides give us:
provided that \(I_{+,SD}^{C}(0) = I_{+,SD}^{C}(\infty) = I_{+,HD}^{C}(0) = I_{+,HD}^{C}(\infty) =0.\) This, along with pJ=R (where \(J = I^{O}_{}+I^{O}_{+}\)), from Eqs. 7 and (8) we have:
and
Thus, we define the following functions:
Now, from model (2) we have the following five equations:
Recall pJ=R, where p is the underestimation factor. Integrating the equations above yields the respective final size relations:
Further, by summing all the equations of model (2), we obtain the following equality:
Integrating both sides of the above equations with the assumption that there are no infections at t_{0} and t_{∞} gives:
Substituting values of final sizes from above relations and \(p\hat {J} = \hat {R}\) into (11):
Thus we define:
With this definition, we obtain the following system of equations, which β and p must satisfy:
Suppose that β and p satisfy f(β,p)=0, then we have:
and that β and p also satisfy h(β,p)=0, that is
Note that for both of these equalities to hold, it is required that
Equation (10) does hold since this is precisely the expression used to estimate ε_{+,HD} from vaccine RCT information and ε_{+,SD}.
Since all known variables in f and g have been estimated except for β and p, we can obtain estimates for β and p by solving (15) (e.g., numerically).
Solving System (15) yields estimates for p and β, which we provide with numerical methods in Section “Disease transmission and burden estimates”. Lastly, we note that Equations (9) and (12) remain intact if a latent compartment were to be added to models (1) and (2) to account for the incubation period of influenza.
Appendix B: Detailed parameter estimation
Here we present the details for the calculations of parameter values ε_{+,SD},ε_{+,HD} and ε_{−,SD} for influenza seasons 20112012 and 20122013. For this purpose, we utilize an approximation which relates vaccinemodified susceptibility to vaccine effectiveness (VE) [36]. We also leverage influenza vaccine effectiveness (VE) from several studies in the US and Canada.
20112012 Influenza season
Canada
Informing ε_{−,SD}: First, we inform ε_{+,SD} using VE estimates. In Canada, the VE against RealTime Polymerase Chain Reaction (Realtime PCR) confirmed influenza in a testnegative casecontrol study was estimated to be 55% among all participants of all ages [25]. We use the approximation ε=1−VE to estimate ε_{−,SD}; that is ε_{−,SD}=1−0.55=0.45 [36].
Informing ε_{+,SD}: In the same testnegative casecontrol study, the VE against Realtime PCR confirmed influenza among participants aged 50+ was estimated to be 58% [25]. We use this figure to estimate ε_{+,SD} using the approximation ε=1−VE to find ε_{+,SD}=1−0.58=0.42 [36].
US
Informing ε_{−,SD}: In the US, the VE against medicallyattended influenza among all ages was estimated to be 47% [26]. We now use the approximation ε=1−VE to estimate ε_{−,SD} as follows: ε_{−,SD}=1−0.47=0.53.
Informing ε_{+,SD}: The VE against medicallyattended influenza among those aged 65+ was estimated to be 43% in the US in 20112012 [26]. With the relationship ε_{+,SD}=1−VE we calculate ε_{−,SD}=0.57 [36].
Informing ε_{+,HD}: Lastly, to estimate the highdose vaccinemodified susceptibility we use Equation (3) in the main text informed with RCT results to calculate ε_{+,HD}/ε_{+,SD} [17].
Note that with the ratio of two vaccinemodified susceptibilities and ε_{+,SD} known, we can calculate ε_{+,HD}. For the US, we then have ε_{+,HD}=0.55×0.53=0.29.
20122013
Canada
Informing ε_{−,SD}: We inform ε_{−,SD} using the VE against PCRconfirmed influenza from a testnegative case control study in Canada [24]. In this study, the VE against PCRconfirmed influenza was estimated to be 51% [24]. Now, we use the approximation ε=1−VE=1−0.51=0.49 [24].
Informing ε_{+,SD}: In the same testnegative case control study, VE was estimated to be 47% among participants aged 50+. As a result, we estimate ε_{+,SD} using the approximation ε = 1 − VE to find ε_{+,SD} = 1 − 0.47 = 0.53 [36].
US
Informing ε_{−,SD}: First, we inform ε_{−,SD} using the estimates for the VE against medicallyattended influenza over all ages in the US of 49% [27]. We then calculate ε_{−,SD}=1− 0.49=0.51 using the approximation ε_{+,SD}=1−VE [36].
Informing ε_{+,SD}: In the same US study, VE against medicallyattended influenza was estimated to be 29% among those aged 65+ [27]. We then calculate ε_{+,SD} = 1 − 0.29 = 0.71 using the approximation ε_{+,SD} = 1−VE [36].
Informing ε_{+,HD}: Now, to estimate the highdose vaccinemodified susceptibility we first use Equation (3) in the main text informed by RCT results to calculate ε_{+,SD}/ε_{+,HD} [17].
We use the known ratio: ε_{+,SD}/ε_{+,HD} to calculate ε_{+,HD}=0.79∗0.71=0.56. In the US we have ε_{+,HD}=0.56.
Appendix C: Accounting for underestimation in the RCT
In this section, we present the details of the sensitivity analysis exploring the impact of underestimation in the RCT on study results. Specifically, we investigate the sensitivity of p and β on underestimation in the RCT [17]. To consider underestimation in the RCT, we introduce a underestimation factor p_{1} in the RCT. As before, p remains the underestimation factor in the general community. We also assume that RCT participants provided with standarddose and highdose vaccines are equally underestimated. Lastly, we define δ to be the fraction of true remaining vaccinated participants at the end of the RCT.
Method
The total number of observed infections in the RCT can be written as:
The above equation can also be formulated in terms of p_{1} and δ, that is:
The relationship between p_{1} and δ can be determined by equating (16) and (17) which yields:
Under these conditions, by conducting the same analysis of model (1) as shown in Appendix A, we derive the condition:
Continuing to follow the analysis from Appendix A, we arrive at the following system of equations:
We analyze the robustness of the estimates of β and p by varying δ in the above system of equations. Exploring sensitivity with respect to δ gives insight into how underestimation in the RCT affects our results in Section “Results”. Specifically, for a given p_{1}, we calculate a corresponding δ then solve the above system numerically.
Results
The relationship between the RCT underestimation factor, p_{1}, with β and p estimates is shown in Figs. 4, 5, 6, and 7. We explore the sensitivity of p_{1} on p and β for each context of interest; both regionally in the US and Canada during 201112 and 201213 influenza seasons. We present the sensitivity analysis using the laboratoryconfirmed influenza associated with RI case definition outcomes in the RCT.
Overall, p and β are robust to underestimation in the RCT. We see that the effect of underestimation in the RCT is negligible on the transmission rate β for reasonable values of p_{1}. On the other hand, p is also weakly dependent on p_{1}. By varying p_{1} over its entire range, p remains to be on the same order of magnitude.
Availability of data and materials
Data sharing is not applicable to this article as no datasets were generated or analysed during the current study.
Abbreviations
 SVIR:

Susceptiblevaccinatedinfectedrecovered
 RCT:

Randomizedcontrolled trial
 PHAC:

Public Health Agency of Canada
 CDC:

Centers for Disease Control and Prevention
 SD:

Standarddose vaccine
 HD:

Highdose vaccine
 Realtime PCR:

RealTime Polymerase Chain Reaction
 RI:

Respiratory illness
References
 1
Gibbons CL, Mangen MJJ, Plass D, Havelaar AH, Brooke RJ, Kramarz P, Peterson KL, Stuurman AL, Cassini A, Fèvre EM, et al. Measuring underreporting and underascertainment in infectious disease datasets: a comparison of methods. BMC Public Health. 2014; 14(1):147.
 2
Poehling KA, Edwards KM, Weinberg GA, Szilagyi P, Staat MA, Iwane MK, Bridges CB, Grijalva CG, Zhu Y, Bernstein DI, et al.The underrecognized burden of influenza in young children. N Eng J Med. 2006; 355:31–40.
 3
Grijalva C, Weinberg G, Bennett N, Staat M, Craig A, Dupont W, Iwane M, Postema A, Schaffner W, Edwards K, et al. Estimating the undetected burden of influenza hospitalizations in children. Epidemiol Infect. 2007; 135(6):951–8.
 4
Barker WH, Mullooly JP. Underestimation of the role of pneumonia and influenza in causing excess mortality. Am J of Public Health. 1981; 71(6).
 5
Ortiz JR, Neuzil KM, Shay DK, Rue TC, Neradilek MB, Zhou H, Seymour CW, Hooper LG, Cheng PY, Goss CH, et al.The burden of influenzaassociated critical illness hospitalizations. Criti Care Med. 2014; 42(11):2325–32.
 6
Ortiz JR, Neuzil KM, Cooke CR, Neradilek MB, Goss CH, Shay DK. Influenza pneumonia surveillance among hospitalized adults may underestimate the burden of severe influenza disease. PloS one. 2014; 9(11).
 7
Goldstein E, et al.Improving the estimation of influenzarelated mortality over a seasonal baseline. Epidemiology. 2012; 23(6):829–38.
 8
Matias G, Taylor R, Haguinet F, SchuckPaim C, Lustig R, Shinde V. Estimates of mortality attributable to influenza and rsv in the united states during 1997–2009 by influenza type or subtype, age, cause of death, and risk status. Influenza Other Respir Viruses. 2014; 8(5):507–15.
 9
Millman AJ, Reed C, Kirley PD, Aragon D, Meek J, Farley MM, Ryan P, Collins J, Lynfield R, Baumbach J, et al.Improving accuracy of influenzaassociated hospitalization rate estimates. BMC Infect Dis. 2015; 21(9).
 10
Schanzer DL, Langley JM, Tam TW. Hospitalization attributable to influenza and other viral respiratory illnesses in canadian children. Pediatr Infect Dis J. 2006; 25(9):795–800.
 11
Schanzer DL, Langley JM, Tam TW. Role of influenza and other respiratory viruses in admissions of adults to canadian hospitals. Influenza Other Respir Viruses. 2008; 2(1):1–8.
 12
Schanzer DL, McGeer A, Morris K. Statistical estimates of respiratory admissions attributable to seasonal and pandemic influenza for canada. Influenza Other Respir Viruses. 2013; 7(5):799–808.
 13
Reed C, Angulo FJ, Swerdlow DL, Lipsitch M, Meltzer MI, Jernigan D, Finelli L. Estimates of the prevalence of pandemic (h1n1) 2009, united states, april–july 2009. Emerg Infect Dis. 2009; 15(12):2004.
 14
US Centers for Disease Control and Prevention. Disease Burden of Influenza. https://www.cdc.gov/flu/about/burden/index.html. Accessed 10 Sept 2019.
 15
McDonald SA, Presanis AM, De Angelis D, van der Hoek W, Hooiveld M, Donker G, Kretzschmar ME. An evidence synthesis approach to estimating the incidence of seasonal influenza in the netherlands. Influenza Other Respir Viruses. 2014; 8(1):33–41.
 16
Chong K, Fong H, Zee C. Estimating the incidence reporting rates of new influenza pandemics at an early stage using travel data from the source country. Epidemiol Infect. 2014; 142(5):955–63.
 17
DiazGranados CA, Dunning AJ, Kimmel M, Kirby D, Treanor J, Collins A, Pollak R, Christoff J, Earl J, Landolfi V, et al.Efficacy of highdose versus standarddose influenza vaccine in older adults. N Engl J Med. 2014; 371(7):635–45.
 18
Centers for Disease Control and Prevention. Weekly U.S. Influenza Surveillance Report. https://www.cdc.gov/flu/weekly/index.htm. Accessed 8 May 2019.
 19
Public Health Agency of Canada. Weekly Influenza Reports. https://www.canada.ca/en/publichealth/services/diseases/fluinfluenza/influenzasurveillance/weeklyinfluenzareports.html. Accessed 9 May 2019.
 20
Statistics Canada. Census Profile. https://www12.statcan.gc.ca/censusrecensement/2011/dppd/prof/index.cfm?Lang=E. Accessed 8 May 2019.
 21
Statistics Canada. Census Profile. https://www12.statcan.gc.ca/censusrecensement/2012/dppd/prof/index.cfm?Lang=E. Accessed 8 May 2019.
 22
US Census Bureau. Age and Sex Composition in the United States: 2011. https://www.census.gov/data/tables/2011/demo/ageandsex/2011agesexcomposition.html. Accessed 7 May 2019.
 23
US Census Bureau. Age and Sex Composition in the United States: 2012. https://www.census.gov/data/tables/2012/demo/ageandsex/2012agesexcomposition.html. Accessed 7 May 2019.
 24
Skowronski DM, Janjua NZ, De Serres G, Sabaiduc S, Eshaghi A, Dickinson JA, Fonseca K, Winter AL, Gubbay JB, Krajden M, et al. Low 2012–13 influenza vaccine effectiveness associated with mutation in the eggadapted h3n2 vaccine strain not antigenic drift in circulating viruses. PloS one. 2014; 9(3):92153.
 25
Skowronski DM, Janjua NZ, Sabaiduc S, De Serres G, Winter AL, Gubbay JB, Dickinson JA, Fonseca K, Charest H, Bastien N, et al.Influenza a/subtype and b/lineage effectiveness estimates for the 2011–2012 trivalent vaccine: crossseason and crosslineage protection with unchanged vaccine. J Infect Dis. 2014; 210(1):126–37.
 26
Ohmit SE, Thompson MG, Petrie JG, Thaker SN, Jackson ML, Belongia EA, Zimmerman RK, Gaglani M, Lamerato L, Spencer SM, et al.Influenza vaccine effectiveness in the 2011–2012 season: protection against each circulating virus and the effect of prior vaccination on estimates. Clin Infect Dis. 2013; 58(3):319–27.
 27
McLean HQ, Thompson MG, Sundaram ME, Kieke BA, Gaglani M, Murthy K, Piedra PA, Zimmerman RK, Nowalk MP, Raviotta JM, et al.Influenza vaccine effectiveness in the united states during 2012–2013: variable protection by age and virus type. J Infect Dis. 2014; 211(10):1529–40.
 28
Izurieta HS, Thadani N, Shay DK, Lu Y, Maurer A, Foppa IM, Franks R, Pratt D, Forshee RA, MaCurdy T, et al.Comparative effectiveness of highdose versus standard dose influenza vaccines in us residents aged 65 years and older from 2012 to 2013 using medicare data: a retrospective cohort analysis. Lancet Infect Dis. 2015; 15:293–300.
 29
National Advisory Committee on Immunization (NACI). Vaccine Coverage Amongst Adult Canadians: Results from the 2012 Adult National Immunization Coverage (aNIC) Survey. https://www.canada.ca/en/publichealth/services/immunization/vaccinecoverageamongstadultcanadiansresults2012adultnationalimmunizationcoverageanicsurvey.html. Accessed 7 May 2019.
 30
National Advisory Committee on Immunization (NACI). A Review of the Literature of High Dose Seasonal Influenza Vaccine for Adults 65 Years and Older. http://www.phacaspc.gc.ca/naciccni/assets/pdf/influenzavaccine65plusvaccincontrelagrippe65pluseng.pdf. Accessed 7 May 2019.
 31
US Centers for Disease Control and Prevention. Flu Vaccination Coverage, United States, 201213 Influenza Season. https://www.cdc.gov/flu/fluvaxview/coverage1213estimates.htm. Accessed 7 May 2019.
 32
US Centers for Disease Control and Prevention. Flu Vaccination Coverage, United States, 201112 Influenza Season. https://www.cdc.gov/flu/fluvaxview/coverage1112estimates.htm. Accessed 1 Oct 2019.
 33
US Centers for Disease Control and Prevention. Seasonal Influenza Vaccine Effectiveness, 20122013. https://www.cdc.gov/flu/vaccineswork/20122013.html. Accessed 7 May 2019.
 34
Cauchemez S, Carrat F, Viboud C, Valleron A, Boelle P. A bayesian mcmc approach to study transmission of influenza: application to household longitudinal data. Stat Med. 2004; 23(22):3469–87.
 35
van den Driessche P. Reproduction numbers of infectious disease models. Infect Dis Model. 2017; 2(3):288–303.
 36
Halloran ME, Struchiner CJ, Longini Jr IM. Study designs for evaluating different efficacy and effectiveness aspects of vaccines. Am J Epidemiol. 1997; 146(10):789–803.
 37
Arino J, Bauch C, Brauer F, Driedger SM, Greer AL, Moghadas SM, Pizzi NJ, Sander B, Tuite A, Van Den Driessche P, et al.Pandemic influenza: modelling and public health perspectives. Math Bio Eng. 2011; 8(1):1–20.
 38
Biggerstaff M, Cauchemez S, Reed C, Gambhir M, Finelli L. Estimates of the reproduction number for seasonal, pandemic, and zoonotic influenza: a systematic review of the literature. BMC infect dis. 2014; 14(1):480.
 39
Leung NH, Xu C, Ip DK, Cowling BJ. The fraction of influenza virus infections that are asymptomatic: a systematic review and metaanalysis. Epidemiology. 2015; 26(6):862.
Acknowledgements
Special thanks to the Fields Institute and the Centre for Quantitative Analysis (CQAM) for hosting the Industrial Problem Solving Workshop for which this research originated.
Funding
ZM, MA, SA, JW, CC, JDK have received support from Sanofi Pasteur and the Natural Science and Engineering Council of Canada. JW is supported by the Natural Sciences and Engineering Research Council of Canada (Grant number:1055882011) and the Canada Research Chairs Program (Grant number:230720). JH is supported by the Natural Sciences and Engineering Research Council of Canada Discovery Grants Program and Discovery Accelerator Supplements (DAS) Program. JH is also supported by the York Research Chairs Program. This project is supportedby the NSERC/Sanofi Industrial Research Chair Program in Vaccine Mathematics, Modelling and Manufacturing.
Author information
Affiliations
Contributions
CC, MA, SA, AFT, and ZM prepared the manuscript. Research was conceptualized by SA, JL, JH, ZM, TS, and JW. Numerical simulations conducted by ZM, SA, and CC. All authors contributed significantly to this work. All authors read and approved the final manuscript.
Corresponding author
Ethics declarations
Ethics approval and consent to participate
Not applicable.
Consent for publication
Not applicable.
Competing interests
AC, TS, JL, AFT, ET, LC, and BS are employees of Sanofi Pasteur which is a manufacturer of in uenza vaccines.
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
About this article
Cite this article
McCarthy, Z., Athar, S., Alavinejad, M. et al. Quantifying the annual incidence and underestimation of seasonal influenza: A modelling approach. Theor Biol Med Model 17, 11 (2020). https://doi.org/10.1186/s12976020001294
Received:
Accepted:
Published:
Keywords
 Mathematical modelling
 influenza
 vaccine
 evidence synthesis