Single-Cell and Bulk RNA-Sequencing Reveal Differences in Monocyte Susceptibility to Influenza A Virus Infection Between Africans and Europeans

There is considerable inter-individual and inter-population variability in response to viruses. The potential of monocytes to elicit type-I interferon responses has attracted attention to their role in viral infections. Here, we use single-cell RNA-sequencing to characterize the role of cellular heterogeneity in human variation of monocyte responses to influenza A virus (IAV) exposure. We show widespread inter-individual variability in the percentage of IAV-infected monocytes. Notably, individuals with high cellular susceptibility to IAV are characterized by a lower activation at basal state of an IRF/STAT-induced transcriptional network, which includes antiviral genes such as IFITM3, MX1 and OAS3. Upon IAV challenge, we find that cells escaping viral infection display increased mRNA expression of type-I interferon stimulated genes and decreased expression of ribosomal genes, relative to both infected cells and those never exposed to IAV. We also uncover a stronger resistance of CD16+ monocytes to IAV infection, together with CD16+ -specific mRNA expression of IL6 and TNF in response to IAV. Finally, using flow cytometry and bulk RNA-sequencing across 200 individuals of African and European ancestry, we observe a higher number of CD16 + monocytes and lower susceptibility to IAV infection among monocytes from individuals of African-descent. Based on these data, we hypothesize that higher basal monocyte activation, driven by environmental factors and/or weak-effect genetic variants, underlies the lower cellular susceptibility to IAV infection of individuals of African ancestry relative to those of European ancestry. Further studies are now required to investigate how such cellular differences in IAV susceptibility translate into population differences in clinical outcomes and susceptibility to severe influenza.

There is considerable inter-individual and inter-population variability in response to viruses. The potential of monocytes to elicit type-I interferon responses has attracted attention to their role in viral infections. Here, we use single-cell RNA-sequencing to characterize the role of cellular heterogeneity in human variation of monocyte responses to influenza A virus (IAV) exposure. We show widespread inter-individual variability in the percentage of IAVinfected monocytes. Notably, individuals with high cellular susceptibility to IAV are characterized by a lower activation at basal state of an IRF/STAT-induced transcriptional network, which includes antiviral genes such as IFITM3, MX1 and OAS3. Upon IAV challenge, we find that cells escaping viral infection display increased mRNA expression of type-I interferon stimulated genes and decreased expression of ribosomal genes, relative to both infected cells and those never exposed to IAV. We also uncover a stronger resistance of CD16 + monocytes to IAV infection, together with CD16 + -specific mRNA expression of IL6 and TNF in response to IAV. Finally, using flow cytometry and bulk RNA-sequencing across 200 individuals of African and European ancestry, we observe a higher number of CD16 + monocytes and lower susceptibility to IAV infection among monocytes from individuals of African-descent. Based on these data, we hypothesize that higher basal monocyte activation, driven by environmental factors and/or weak-effect genetic variants, underlies the lower cellular susceptibility to IAV

INTRODUCTION
Respiratory viruses with pandemic potential pose enormous health and economic impacts on the human population. In the last century, we have witnessed outbreaks of several coronaviruses, including SARS-CoV-2, SARS-CoV-1 and MERS, and a number of avian and swine influenza A viruses (IAV). A particularly harrowing and shared feature of these pandemics are the sudden deaths of otherwise healthy individuals (1). A hyperinflammatory state characterized by high levels of inflammatory cytokines, often referred to as a 'cytokine storm' (2,3), has emerged as a hallmark of these severe viral infections. While still controversial, there is increasing evidence to suggest that the mononuclear phagocyte system is an important immunological determinant of this phenotype (4)(5)(6). Upon viral infection, sentinel cells such as lung-resident macrophages trigger complex signaling cascades that recruit leukocytes to the site of infection, among them monocytes. These infiltrating monocytes differentiate into monocyte-derived dendritic cells or macrophages, enabling viral clearance through the induction of the adaptive response, and help replenish the pool of tissue-resident alveolar macrophages (4,7).
In humans, circulating monocytes are divided into classical (~80%), intermediate (~15%), and nonclassical (~5%) subsets, based on surface receptor expression of the cluster-determinant antigens CD14 and CD16 (8). While nonclassical monocytes (CD14 + CD16 ++ ) are long-lived and 'patrol' healthy tissues through long-range crawling on the endothelium, classical (CD14 ++ CD16 -) and intermediate (CD14 ++ CD16 + ) monocytes are recruited to the lung in response to viral infection, where they secrete inflammatory cytokines and chemokines, as well as type I interferons (IFNs) (7,(9)(10)(11). In most individuals, recruited cells help clear infection despite being susceptible to infection themselves (12,13); yet, in some individuals, a dysfunctional immune response occurs resulting in widespread lung inflammation. Whether monocyte subsets behave differently upon viral exposure, and how direct viral sensing and exposure to secreted cytokines shape monocyte activation and differentiation are not well understood.
Variation in blood composition and cellular proportions have been shown to be one of the main factors underlying transcriptional variation in immune genes across individuals (14), with these proportions being influenced by both genetic and non-heritable factors (15)(16)(17). Recently, we characterized the genetic architecture of transcriptional responses of primary monocytes from 200 individuals of African and European ancestry to ex vivo challenge with viral stimuli (18). In this model, where we were able to control for viral determinants of disease (i.e. dose and strain), we reported marked inter-and intra-population differences in transcriptional responses to IAV. While our analyses revealed numerous cis-expression quantitative trait loci (18), genetic variants could only account for a small fraction of expression variation, in line with other studies (14,19).
Here, we implemented single-cell RNA-sequencing (scRNAseq) on human primary monocytes exposed to IAV to investigate (i) the effects of direct viral infection versus activation by exposure to secreted cytokines, (ii) the subset-specific responses of monocytes to viral challenge, and (iii) the extent of inter-individual and between-population variation in the proportions of monocyte subsets and the degree of monocyte susceptibility to IAV infection. Our study reveals a profound reprogramming of monocyte transcriptomes upon viral infection and shows a proinflammatory role of CD16 + monocytes following IAV challenge. Furthermore, it highlights that African-ancestry individuals are characterized by both a higher frequency of CD16 + monocytes and a generally lower susceptibility of their monocytes to IAV infection. Based on these results, we propose that population differences in the composition of circulating monocytes and their susceptibility to infection may contribute to the higher severity of IAV infections reported among African-ancestry individuals.

Using scRNA-Seq to Investigate Cellular Heterogeneity
To investigate the role of cellular heterogeneity in driving immune variability across individuals, we performed a timecourse experiment where we monitored the CD14 + fraction of peripheral blood mononuclear cells (PBMCs) from eight donors, both in the presence and absence of viral challenge. To maximize inter-individual variability, we chose individuals from two distinct ancestries whose cells demonstrated extreme responses to viral stimuli in a previous bulk RNA-seq experiment (18). Droplet-based scRNA-seq was performed on monocytes from all eight donors immediately before infection initiation (T 0 ), as well as at 2 (T 2 ), 4 (T 4 ), 6 (T 6 ), and 8 (T 8 ) hours post challenge with A/USSR/90/1977(H1N1) at a multiplicity of infection (MOI) equal to 1 (IAV-challenged) and mock infection (non-infected). To mitigate batch effects, we pooled IAV-challenged and noninfected cells from distinct donors in each library, assigning cells to their condition in silico via genetic barcoding (20). After stringent quality control where we removed low-quality, dying, and contaminants of the CD14 + monocyte isolation, our final dataset contained 88,559 high-quality cells, among which we predicted >99% monocyte purity at T 0 ( Figure 1A; Supplementary Figures 1 and 2). At later time points, a substantial fraction of non-infected cells (up to 70% at T 8 ) were predicted to be macrophage-like, indicating monocyte differentiation over the course of the experiment. For clarity, we refer to cells as monocytes at T 0 and as monocyte-derived cells from T 2 -T 8 .

Stable FCGR3A Expression Distinguishes Monocyte Subsets Over Time
We next sought to characterize each cell by its mRNA expression of the canonical monocyte markers, CD14 and CD16, given that much of the structure in our data was associated with FCGR3A (aka CD16) mRNA expression. In droplet-based scRNA-seq, encapsulation of ambient mRNAs emanating from dying cells can occur during library preparation leading to spurious mRNA detection (21). We thus used a statistical framework to test whether CD14 and CD16 were expressed at a level significantly higher than expected when accounting for potential contamination from the ambient pool (Methods). Despite having been positively selected for the CD14 antigen, only 32.4% of monocytes significantly expressed CD14 at T 0 ; this percentage further decreased at later time points and remained <15% across all time points and conditions (average 6.4% s.d.: 5.0%, Supplementary Figures 3A-C). On the other hand, 12.1% of monocytes significantly expressed FCGR3A (CD16) (referred to as CD16 + ) at T 0 , this marker proving much more stable across conditions and time points (9.3% of CD16 + cells on average, s.d.: 1.8%, Figure 1B and Supplementary Figures 3D-F). While we deciphered classical, intermediate, and nonclassical monocytes subsets at T 0 (Supplementary Note 1; Supplementary Figure 4 and Supplementary Data 1), we focus on the simpler distinction of CD16and CD16 + subsets given that positive-selection for monocytes does not capture the entire nonclassical population and that we were unable to distinguish the intermediate and nonclassical subsets after T 0 .

Functional Features of Monocyte Subsets Are Conserved Upon Manipulation
To assess how transcriptional profiles of CD16and CD16 + monocytes and their derived-cells differ, we focused on the 5,681 genes expressed with a normalized log 2 count > 0.1 in at least one condition, time point, and subset (Supplementary Data 2A). We found that the log 2 fold change (log 2 FC) in gene expression between CD16 +/subsets remained relatively stable over the course of the experiment (Pearson r between time points >0.42  Figure 5B). We thus searched for genes that were consistently differentially expressed between CD16 + and CD16cells across all time points (including T 0 ), conditions, and donors. We identified 266 genes over-expressed (log 2 FC>0.2, FDR<1%) in CD16 + cells relative to CD16cells, and 389 genes that showed the opposite pattern, and performed a GO-term enrichment analysis on these genes (Supplementary Data 2B). Consistent with previous reports (22)(23)(24), CD16subsets were characterized by high expression of several proinflammatory S100 Calcium Binding Proteins (S100A12, S100A9, and S100A8), contributing to a sizable GO-term enrichment in the defense response to fungus pathway (GO:0050832: OR=41.3, FDR=4.9x10 -4 ), while CD16 + subsets were characterized by high expression of Fc-gamma receptor signaling pathway genes (GO:0038096: OR=8.7, FDR=6.2x10 -6 ).

scRNA-Seq Highlights Heterogeneity in Monocyte Susceptibility and Viral Transcription
Using the presence of IAV transcripts as a proxy for infection ( Figure 1B), we next sought to distinguish cells that were successfully infected from those that were not. Among monocyte-derived cells that were exposed to IAV, we found that 50.3% expressed IAV transcripts above ambient levels when allowing up to 10% of mRNAs to come from the ambient pool. In contrast, less than 1% of non-infected cells showed evidence of viral transcription, supporting the validity of the threshold used to detect IAV expressing cells ( Figure 1C). We deemed cells with statistical evidence for expression of IAV transcripts from the IAV-challenged condition as 'infected', while the remaining cells from this condition were considered as 'bystanders', as these either did not come into contact with the virus or were able to fully repress viral mRNA transcription. When comparing the percentage of infected cells between subsets, we noticed that CD16 + cells were slightly less likely to be infected than CD16cells ( Figure 6).
We observed that the proportions of viral mRNAs among infected cells were bimodally distributed and largely varied between the clusters identified in our unsupervised analysis ( Figure 1D). We used a Gaussian mixture model to locate the two modes of the distribution and further sub-classify infected cells into those with lower IAV mRNA levels (<1-6%) and those with higher IAV mRNA levels (6-83%); while viral mRNA levels are dictated by both the rate of transcription and degradation, for simplicity we refer to these infected cell states as 'low IAVtranscribers' and 'high IAV-transcribers', respectively. The proportions of infected cells among individuals remained largely unchanged over the course of the experiment; however, high IAV-transcribers were virtually absent at 2h (<2% of infected cells), peaked to~36% of IAV-infected cells at 4h, and decreased to 8.5% by 8h, suggesting that high-IAV transcribers represent a transient state of IAV-infection preceding IAVinduced apoptosis ( Figure 1E). These results reveal profound heterogeneity in monocyte susceptibility and subsequent viral transcription upon IAV challenge.

Interplay of Cytokine and Ribosome Networks Drive Cell States Upon Infection
To characterize host transcriptional responses over time, we next subsampled each subset (CD16 -/CD16 + ), cell state (unexposed, bystander, infected), and time point in our scRNA-seq data to a uniform number of cells to avoid biases emanating from differences in sample sizes. Limited by the number of CD16 + high IAV-transcribing cells, we randomly sampled 100 cells from each subgroup, while ensuring representation of all donors. We then focused on the 6,669 host genes with average log 2 normalized count >0.1 in at least one subgroup (Supplementary Data 3A). Overall, CD16and CD16 + subsets behaved similarly upon stimulation with changes in gene expression between cell states being strongly correlated among subsets (Pearson r=0.83-0.95, p-values<2.2x10 -16 ; Supplementary Figure 7). GO term enrichment analyses of shared responses (FDR<1% & log 2 FC>0.2 in same direction in both subsets) uncovered several functional categories interacting to shape the activation state of cells ( Figure 2A; Supplementary Data 3B). Both bystander and infected cells showed increased mRNA expression of genes involved in antigen processing and presentation via class I MHC (GO:0019885, OR=53.7, FDR=2.0x10 -6 ) and ISGs (GO:0034340, OR=14.8, FDR=3.3x10 -20 ). Yet, bystander cells showed increased mRNA expression of ISGs and defense response to virus pathways relative to infected cells (GO:0034340, OR=13.4, FDR=4.4x10 -7 ; GO:0051607, OR=9.0, FDR=2.1x10 -7 ), while infected cells displayed higher mRNA expression of mitochondrial (GO:0005743, OR=4.7, FDR=3.3x10 -3 ) and ribosomal genes (GO:0005840, OR=117, FDR=1.0x10 -78 ). Notably, type-I IFN genes themselves tended to be preferentially expressed by infected cells (e.g. log 2 normalized count at 6h for IFNB1~0.12/ 0.29 in CD16and CD16 + subsets, respectively, vs <0.01 for bystander cells of both subsets), although this difference was only barely significant in our setting (FDR=0.03), likely due to the highly transient nature of IFN expression. Among infected cells, ribosomal genes showed higher activity among high IAV-transcribing cells relative to low IAVtranscribing cells ( Figure 2B, comparison only made at T 4 due to sample size constraints, e.g. GO:0019083, OR=137, FDR=6.1x10 -65 ). This observation is consistent with the notion that the expression of viral proteins is dependent on cellular ribosomes, with recent data suggesting that IAVs do not induce a global shut-off of cellular translation but rather a reshaping of the translation landscape (25)(26)(27). Likewise, among bystander cells, numerous ribosomal genes were downregulated at later time points relative to unexposed cells ( Collectively, these results suggest that expression of ISGs and ribosomal genes interact to shape cell states upon IAV challenge.

Increased IRF and STAT Activity Drives Stronger Antiviral Response
Despite qualitatively similar responses to infection between CD16 -/CD16 + subsets (Supplementary Figure 7), we hypothesized that subtle differences in the intensity of such responses might contribute to the increased resistance of CD16 + cells to infection. We thus performed an interaction test on the subsampled scRNA-seq data, and searched for genes for which transcriptional response upon IAV challenge differed between CD16and CD16 + subsets in either infected and/or bystander At FDR≤1%, we identified a total of 335 such genes, of which 98 differed between subsets only in bystander cells, 144 only in infected cells, and 93 in both. Hierarchical clustering highlighted eight major patterns of transcriptional responses (modules) among the 335 genes, several of which were associated with specific biological functions ( Figure 3A; Supplementary Data 3C, D).
Notably, module 1 (green) was enriched for genes in the antiviral response pathway (GO:0051607, OR=23.2, FDR=5.43×10 -7 ) and displayed a stronger response in infected CD16 + cells relative to CD16infected cells. Of additional interest was the transient CD16 + -specific transcription of the inflammatory cytokine genes IL6 and TNF, following viral challenge ( Figure 3B). We also found that several genes involved in the regulation and production of IL-6 and TNFa were over-expressed in CD16 + subsets at all time points and conditions (Supplementary Data 2B), but only see active transcription of the cytokines upon viral exposure. These results reveal the strong antiviral and inflammatory potential of CD16 + relative to CD16monocytes in response to viral infection (28). We next sought to characterize the regulatory architecture underlying the 335 genes whose transcriptional response to IAV challenge differed between monocyte subsets. Using SCENIC (29), we identified 113 high-confidence gene regulatory networks, or 'regulons', which were active in non-infected and/ or IAV-challenged cells, each composed of a transcription factor (TF) and a set of predicted targets (genes). We used these 113 regulons to search for an enrichment/depletion of TF targets among the eight modules of genes displaying subset-specific response to infection (Supplementary Data 3E). Among modules associated with an increased expression in cells exposed to IAV (modules 1-5), we observed a widespread over-representation of targets of IFN regulatory factors (IRFs) and signal transducing and activators of transcription (STATs) ( Figure 3C), reinforcing the central role of the IFN response upon IAV challenge. Interestingly, several of these factors displayed subset-specific activity themselves in response to IAV (IRF1/2/7 and STAT1/2/3, FDR<1%), mirroring the expression patterns of module 1 (Pearson r>0.92). These results collectively highlight a CD16 + -specific inflammatory response upon IAV challenge and suggest stronger activation of IRF and STAT transcription factors as a driver of the increased antiviral response observed in CD16 + cells upon IAV infection.

Basal Activation Differences Correlate With Monocyte Susceptibility
To explore the degree of inter-individual variation upon viral challenge, we next quantified IAV transcripts in the monocytederived cells of each individual, and created pseudo-bulk estimates by averaging the percent of viral mRNAs per-cell across all cells from each donor at each time point ( Figure 4A). While viral mRNAs peaked at the same time for all individuals, we observed extensive variation in the levels of viral mRNAs and percentages of infected cells across individuals ( Figure 4B). To identify specific genes that might underlie infection potential, we focused on the 4,589 genes that were expressed at >0.1 log 2 normalized counts in at least one canonical monocyte subset at T 0 . We identified a total of 3,131 genes that differed among our eight donors in either classical, intermediate, and/or nonclassical monocyte subsets (Kruskal-Wallis Rank Test, FDR=1%; Supplementary Data 4A). Within each subset, focusing on genes that significantly differed between donors, we searched for those for which mean expression at basal state was correlated with the percentage of infected cells at T 4 among our eight donors. Despite our limited sample size, we found that cellular susceptibility was strongly correlated with basal expression of the well-known host viral restriction factor IFITM3. Although it reached significance only in nonclassical monocytes (FDR~1%), the association remained strong in other subsets (p-value< 4.1×10 -4 ; Figure 4C). We next relaxed our search to all genes for which basal expression showed nominal correlation (p-value<0.01) with the percentage of infected cells at T 4 . Depending on the monocyte subset, between 3.6 to 8.3% of genes matched these criteria, resulting in a set of 118 genes displaying correlation with monocyte susceptibility in at least one subset. These 118 genes were collectively enriched for several related biological processes such as defense response to virus (GO:0051607, OR=15.3, FDR=9.2×10 -19 ) and ISGs (GO:0034340, OR=19.6 FDR=8.4×10 -15 ) (Supplementary Data 4B). Among genes contributing to this enrichment, we found additional antiviral genes such as OAS3, and MX1, as well as the critical TF, IRF7, involved in the severity of IAV-infection both in mice and humans (30)(31)(32). Finally, overlap with the TF targets identified by SCENIC revealed strong enrichments of several IRFs and STATs among the 118 genes, including IRF7, as well as STAT1, STAT2 and IRF9 that form the tripartite IFN-stimulated gene factor 3 (ISGF3) ( Figure 4D; Supplementary Data 4C). Together, our results provide evidence that the basal mRNA expression of genes related to IFN-induced and antiviral responses are indicative of the proportion of cells that will become infected in the first cycle of IAV infection.

African-Ancestry Monocytes Are More Resistant to Infection
Lastly, we wondered how our findings of inter-individual variation might extrapolate to the population level. In a previous study (18), we challenged the primary monocytes from 200 Belgian individuals of African (AFB) and European (EUB) ancestry with the same IAV strain and MOI used in the present study, and performed bulk RNA-seq at 6 hours post infection (hpi). While basal (T 0 ) expression profiles were not collected, flow cytometry labelling of CD14 and CD16 was performed on the CD14 + -selected monocytes for the majority of donors. Interestingly, AFB individuals had higher proportions of CD16 + cells than EUB individuals ( Figure 5A; Supplementary Figure 8). In light of our findings that CD16 + cells are more resistant to IAV infection, we hypothesized that this might translate to lower infection rates among AFB monocytes relative to EUB monocytes.
To test this hypothesis, we mapped the bulk RNA-seq profiles collected 6hpi challenge with IAV for the 200 individuals to a combined human-IAV reference. Excluding 1 sample with low quality RNAs, we found that 0.02-13.5% of RNA-seq reads from each sample were of viral origin ( Figure 5B). Reassuringly, these percentages correlated with IAV mRNA levels estimated from the single-cell experiment across all time points for the eight donors used in the present study (Pearson r>0.84, p-values<8.9×10 -3 ), with the strongest correlation being observed at the peak of viral transcription (T 4 ) (Pearson r=0.97, p-value=5.1×10 -5 ). These observations indicate that ex vivo cellular susceptibility is highly reproducible among individuals, even across different experimental protocols and technologies. Among the 199 bulk profiles, AFB and EUB samples presented overlapping but significantly shifted distributions of total IAVmapping reads ( Figure 5B, 4.9% vs. 6.8% of reads, respectively, Wilcoxon p-value=5.3×10 -8 ), and of each of the 10 primary viral transcripts ( Figure 5C, Wilcoxon p-values<5.5×10 -4 ).
Using the transcriptional profiles obtained from the scRNAseq data at T 6 , we estimated the proportion of reads coming from each inferred cell state in these bulk RNA-seq profiles (Figures 5D, E; Supplementary Note 2 and Supplementary Figure 9A). We found that, on average, AFB monocytes were more resistant to IAV infection than EUB monocytes (39.2% vs. 48.9% infected, respectively, Wilcoxon p-value=5.3×10 -10 ). Differences in the estimated percentage of infected cells alone explained 63% of the inter-individual variability in viral mRNA levels ( Figure 5F), and was sufficient to account for the observed difference in viral mRNA levels between AFB and EUB individuals (p-value=0.16 after adjusting on infected cells, compared to p-value=5.3×10 -8 without adjustment). Nonetheless, variation in the percentage of high/low IAVtranscribers among infected cells accounted for an additional 19% of variance in viral mRNA expression (Supplementary Note 2 and Supplementary Figure 9B). Finally, the ratio of CD16 + /CD16cells negatively correlated with the percentage of infected cells, albeit weakly (-0.27, p-value=0.0165 adjusted on population). Altogether, these results show that population differences in viral mRNA levels are primarily driven by the overall proportion of cells that will ultimately become infected, with only a fraction of the differences being attributable to the different proportions of CD16 +/subsets observed in individuals of African and European ancestry.

DISCUSSION AND HYPOTHESIS
We performed scRNA-seq on primary monocytes, before and after ex vivo IAV challenge, to assess transcriptional differences between monocytes infected by IAV (i.e. infected) versus those activated only by exposure to secreted cytokines (i.e. bystanders), and to identify subset-specific responses of monocytes to viral challenge. We found that bystander cells display increased mRNA expression of ISGs relative to infected cells; yet, we additionally observed both an induction of ribosomal gene mRNA expression in IAV-transcribing cells and a down regulation of these genes in bystander cells at later time points. While the former is likely induced by the virus to enhance mRNA translation (33), the repression of ribosomal expression observed in bystander cells may reflect a host mechanism to contain infection by shutting down the translational machinery of neighboring cells, and we speculate that this may hold true across other cells types and constitute a general cellular defense mechanism against viral infections. Interestingly, the interplay of ribosomal and ISG expression also distinguished infected cells into two distinct states (high and low IAV-transcribers), providing an explanation for the high cell-to-cell variation in IAV replication observed among circulating monocytes, which has also been documented in other cell types and during natural infection (34)(35)(36)(37)(38)(39)(40)(41). Notably, type-I IFN genes themselves tended to be preferentially expressed by infected cells in a highly transient manner, suggesting a potential role of monocyte infection in the triggering of the type I IFN response among bystander cells. While these patterns were generally shared across CD16and CD16 + subsets, we found CD16 + cells to be slightly more resistant to infection. This is likely attributable to their higher absolute expression of some ISGs relative to CD16cells (independent of viral exposure), as well as their more robust upregulation of antiviral genes upon IAV challenge, which we found to be driven by stronger activity of IRF transcription factors. Interestingly, CD16 + cells displayed transient mRNA expression of IL6 and TNF upon viral exposure (both infected and bystander cells), two cytokines that have been widely implicated in cytokine storms (5). Collectively, these findings highlight the opposing roles of ISG and ribosomal gene mRNA expression on viral transcription, and reveal the stronger antiviral and pro-inflammatory potential of CD16 + monocyte subsets.
At the population level, we found that the ratio of CD16 + / CD16at basal state was predictive of the percentage of monocytes that were susceptible to IAV infection, and observed that African-ancestry individuals, from our sample, harbored more CD16 + monocytes on average than Europeanancestry individuals residing in the same city (Ghent, Belgium), consistent with previous observations (42). Independently of monocyte subset proportions, we identified that individuals presenting lower monocyte susceptibility to IAV had a higher basal activation of an IRF/STAT-driven antiviral program. These findings suggest that the fate of a monocyte hinges upon its basal activation state, and that the infection potential differs both within an individuals' monocyte population, in part based on the differentiation status of the cell (i.e. CD16-positivity), but also between individuals, where a CD16cell from one individual may have a higher antiviral state than a CD16 + cell from another individual.
Our finding of a decreased ability of IAV to infect and replicate in monocytes from individuals of African-ancestry was recently replicated in an independent cohort of American individuals with varying levels of African and European ancestry whose PBMCs were challenged with the 2009 pandemic H1N1 strain (43). While the cause of these population differences remains to be determined, we did not find evidence that strong-effect genetic factors, nor evidence of past exposure to H1N1, could explain such an association with the viral replication phenotype. Nevertheless, the observed inter-and intra-population differences are noteworthy in and of themselves, and may reflect the influence of both weak-effect genetic loci, and non-heritable factors, such as stress, nutrition or lifestyle, on transcriptional variation of immune genes (14,15). Future studies are needed to determine if such population differences hold true across other cell types, such as lung epithelial cells.
Given our finding that CD16 + subsets are the main drivers of inflammatory cytokine gene expression such as IL6 and TNF, and that African-ancestry individuals harbor a larger fraction of these subsets, it is tangible to conceive that monocyte subset composition prior to infection may influence disease outcome. A lower percentage of infected monocytes could also contribute to a faster disease progression, as we find that infected monocytes continue to express antigen-presenting genes. Thus, a higher number of infected cells could lead to a stronger activation of the adaptive immune system. In support of these hypotheses, patients with severe influenza and COVID-19 harbor higher proportions of intermediate monocytes in peripheral blood than patients with mild disease (44,45), and African Americans are more often hospitalized than other self-defined ethnic groups by both influenza (46,47) and COVID-19 (48,49), even when adjusting for age and various social factors such as poverty and vaccination status. In light of these observations, we hypothesize that the higher percentage of CD16 + monocyte observed among African-ancestry individuals may, in conjunction with a stronger basal activation of their monocytes, contribute to poor infectious outcomes. Further studies are now needed to formally establish the clinical relevance of monocyte heterogeneity in the context of viral infections, IAV in particular, and determine its potential use as a biomarker.

LIMITATIONS OF STUDY
In this Hypothesis and Theory article, we analyze single-cell transcriptional heterogeneity of circulating monocytes before and after ex vivo IAV challenge, and propose that differences in basal monocyte activation underlie population disparities in cellular susceptibility to IAV. We acknowledge that our study is based on positively-selected monocytes isolated from PBMCs, and that infection of this cell population in vivo would be expected to take place in the lung, after an initial infection has been established. Future studies should investigate how these findings translate to other cell populations -including lung epithelial cells and resident monocyte and macrophage populations -and whether they influence clinical outcomes. This could be achieved, for instance, by using single cell techniques to measure how nasal epithelial cells from healthy patients of various ancestries differ in their expression of viral RNAs and proteins upon IAV challenge. Additionally, sputum extract could be collected from mild and severe influenza patients of both ancestries, to compare the single cell transcriptome of lung epithelial cells and resident macrophage populations, both across ancestries and in relation with disease severity. Another caveat of the study is the lack of detailed lifestyle observations in the cohort used, precluding us from examining in further detail the influence of non-genetic factors. Further studies are now needed to evaluate how non-genetic factors, such as social status, chronic stress levels (and the induced physiological response), previous exposures to pathogens or even the microbiome, could contribute to shape basal monocyte activation and prime the innate immune response to viral infections.

Experimental Model and Subjects
All individuals from this study were part of the EVOIMMUNOPOP cohort, which has been previously described (18). Human blood was obtained from healthy volunteers who gave informed consent, and the PBMC fraction was isolated and frozen. In brief, 200 healthy male donors living in Belgium of self-reported African descent (AFB) or European descent (EUB) were recruited. Inclusion was restricted to nominally healthy individuals between 19 and 50 years of age at the time of sample collection. The majority of our African-descent individuals originated from West Central Africa, with >90% of our sample being born in either Cameroon or Congo. Serological testing was performed for all donors to exclude those with serological signs of past or ongoing infection with human immunodeficiency virus (HIV), hepatitis B virus (HBV) or hepatitis C virus (HCV).

Single-Cell Analyses and RNA-Sequencing
For eight selected donors [4 individuals from each ancestry, selected from extremes of the first principal component of gene expression in our previous study of monocyte response to IAV challenge (18)], 100×10 6 PBMCs were thawed, washed twice and resuspended in complete medium: pre-warmed RPMI-1640 Glutamax medium, supplemented with 10% FCS and 1% penicillin/streptomycin (Cat#15140-122, Life Technologies). Monocytes were then positively selected with magnetic CD14 microbeads, according to the manufacturer's instructions (Cat#130-050-201, Miltenyi Biotec). The number of monocytes was determined with the Countless2 automated cell counter system (Cat#AMQAX1000, ThermoFisher Scientific) in the presence of trypan blue. For each donor, monocytes were seeded at 0.5×10 6 monocytes per well on 24-well NUNC plates in 500 µL of complete media and allowed to rest for one hour at 37°C under 5% CO 2 . Five-hundred microliters of complete media (non-infected) or A/USSR/90/1977(H1N1) at a concentration of 1×10 6 pfu/mL in complete media (IAVchallenged, MOI=1) were added to each sample. Following one hour of staging at 4°C, plates were centrifuged at 1300 rpm for 10 minutes at 4°C, media was removed by pipette, and each well was washed with 1mL complete media. The spin was repeated, media removed by pipette, and samples were resuspended in 1mL prewarmed complete media before being transferred to an incubator at 37°C under 5% CO 2 to initiate infection (T 0 ).
At each time point (T 0 , T 2 , T 4 , T 6 , and T 8 ), samples were mixed by pipetting and transferred to Eppendorf tubes. Wells were washed with 300uL of PBS + 0.04% BSA and transferred to the same tubes. Collection tubes were centrifuged at 1300 rpm for 10 minutes, media was removed and replaced with 1mL PBS + 0.04% BSA and an aliquot of 10µL was taken to count each sample on a Countless2 automated cell counter system, before repeating the centrifugations. Individual samples were adjusted to 2×10 6 live cells/mL. Samples were multiplexed for running on the 10X Chromium (Cat#120223 & 1000074, 10X Genomics) by mixing equal proportions from 6-8 samples in a manner that balanced conditions and allowed us to assess for batch effects across lanes (Supplementary Table 1

Processing of scRNA-Seq Data
Basic pre-processing of the sequencing data was performed with CellRanger v3.0.2 (52), including the mkfastq, count, and aggr commands. Default parameters and our combined human-IAV reference were used, and batch correction was disabled in the aggr command. Cell-containing droplets (n=132,130) were traced back to individual donors using two independent methods, Demuxlet and SoupOrCell, which capitalize on genetic variation in the sequencing reads (20,53). Barcodes with ambiguous and/or non-concordant calls between the two programs were used to establish suitable QC metrics. We found that barcodes deemed as doublets (i.e. the droplet contained two or more cells originating from different donors) were more likely to be nearest-neighbors in a knn-graph with other doublets than assigned singlets. We used this feature to identify droplets presumed to contain two or more cells originating from the same donor; barcodes with > 5 doublets as nearest-neighbors were excluded from further analysis (Supplementary Figures 1A, B). Additionally, droplets containing low-quality cells (i.e. damaged, dying) were excluded using the following thresholds: <1500 total counts, <500 genes, or >50% mitochondrial gene content (Supplementary Figure 1C). This QC resulted in 96,386 single cells.
Transcriptomes (i.e. counts) were adjusted for the presence of ambient RNA with SoupX, (https://github.com/constantAmateur/ SoupX, accessed November 28, 2019) (21), using estimated contamination fractions (per 10X library) from SoupOrCell (53). SoupX-adjusted counts were normalized using pool-based size factors followed by deconvolution as implemented in the scran R package (54). Feature selection was performed by (i) constructing a mean-variance trend in the log-counts and retaining genes found to exhibit more variation than expected assuming Poisson-distributed technical noise, as implemented in the makeTechTrend and TrendVar functions from package scran (54), and (ii) selecting genes expressed in at least 25 cells (n=22,603). The first 10 PCs of the data were retained for data visualization and clustering analyses. Graph-based clustering was performed by building the shared nearest-neighbor graph with the buildSNNGraph function from scran (54) using a series of k values, and cell clusters were defined with the igraph Walktrap algorithm (55). Similar clustering results were obtained based on the knn-graphs generated using k=25, 50, 75, and 100, and k=25 was used for all downstream analyses (Supplementary Figure 2A). Cell types were predicted using SingleR and the built-in BlueprintEncodeData reference (56). Based on the clustering and cell-type predictions, we removed cells belonging to clusters associated with lymphoid cell types or low QC metrics from downstream analyses (Supplementary Figures 2B, C).

Accounting for Ambient RNA Contamination in scRNA-Seq Data and Assigning Cell States
Droplet-based scRNA-seq methods capture ambient mRNAs present in the cell suspension in addition to cell specific mRNAs. To estimate which cells in our experiment were genuinely expressing mRNAs for CD14, FCGR3A (CD16), and those originating from the virus, we implemented a two-step strategy utilizing the estimateNonExpressionCells function of the SoupX package (21). This function estimates whether each cell contains significantly more counts of a provided gene-set than would be expected under a Poisson model, given the estimated ambient RNA from its library of origin and the maximum contamination fraction. First, we used the viral genes to estimate the true maximum contamination fraction, based on the assumption that cells from the non-infected state should only contain viral reads from ambient mRNA captured in their droplets. To do so, we modified the estimateNonExpressionCells function to return p-values, and performed the test on each of our 13 libraries with a range of maximum contamination values from 1-50% (step of 1%) using the viral genes. We then computed FDR adjusted p-values for each maximum contamination value on the 88,559 high-quality, single monocytes. The number of nonsimulated cells deemed to significantly express IAV transcripts (FDR<0.01) was used as a proxy for false positives. In examining the relationship between this number and the number of IAVchallenged cells found to significantly express viral transcripts at FDR<0.01 ( Figure 1C), we found that a maximum contamination fraction of 10% resulted in a 1% false positive rate (defined as the percentage of non-infected cells from T 2 -T 8 that were deemed to significantly express IAV transcripts). This parameter value was then used to correct for contamination from ambient for all genes considered (CD14, FCGR3A and IAV transcripts).

Assigning Cell States and Investigating Sources of Variability in IAV Levels
We used a maximum contamination fraction of 10% to test for significant expression of IAV transcripts in each cell ( Figures 1C-E). IAV-challenged cells that contained a significant amount of IAV transcripts were considered as infected, while the others were deemed bystanders. To distinguish low from high IAV-transcribing cells, a Gaussian mixture model was fitted to the total percentage of viral mRNAs per cell across all infected cells, using the normalmixEM function from mixtools R package with k=2 (57). Each cell was assigned to the cluster with the highest posterior probability, and the cluster of cells with higher IAV content was annotated as high IAV-transcribing.

Characterizing Monocyte Subsets and Transcriptional Profiles From scRNA-Seq Data
Principal components analysis of 6,601 cells at T 0 was used to order monocytes along a differentiation axis separating CD14 + cells from CD16 + cells. We then computed the average percentage of classical and nonclassical monocytes obtained by flow cytometry across the eight donors, weighting each individual by the number of high-quality cells in the scRNAseq data at T 0 . Based on these percentages (87.1% for classical and 7.6% for nonclassical), we annotated the monocytes on each side of the differentiation axis as classical and nonclassical, respectively, with the remaining 5.3% of monocytes being annotated as intermediates. Validity of our approach was confirmed by correlating the proportion of monocytes assigned to each subset across the eight donors, with the percentage of classical, intermediate and nonclassical monocytes estimated by flow cytometry.
Differential expression between subsets was assessed for the 4,859 genes expressed at a normalized log 2 count > 0.1 in any of the 3 subsets. Specifically, Wilcoxon rank tests were implemented in the scran package (54), using the findMarkers function and blocking on donor. We considered genes to be differentially expressed (DE) between monocyte subsets when gene expression was significant at an FDR≤1% and log 2 FC>0. At later time points, comparisons between CD16 + and CD16monocytes subsets were done based on 5,681 genes expressed with a normalized log 2 count > 0.1 on average in either subset, in at least one condition and time point. For each subset, log 2 fold change in gene expression relative to T 0 were correlated across times points. Differential expression between CD16 + and CD16cells was assessed with findMarkers (54), based on Wilcoxon rank tests and blocking on donors, time points and condition. Again, an FDR≤1% and log 2 FC>0.2 were required to define differentially expressed genes. To assess how CD16 +/status alters the infection of monocytes by IAV, we used logistic regression to model bystander/infected status as a function of CD16 +/status, while adjusting on donor, and time point (as factors).

Characterizing Subset-Specific Responses to IAV Challenge
To allow comparison between responses of CD16 + and CD16monocytes, 100 cells were subsampled from each subset and cell state, and at each time point. When subsampling, we ensured balanced representation of all donors across each monocyte subset and cell state, by using sampling weights that were inversely proportional to each donor representation in the original dataset. After sampling, a total of 6,669 genes with normalized log 2 counts >0.1 on average in at least one group (cell-state x subset x time point) was selected for further analyses. For each monocyte subset, differences in expression between cell states (unexposed, bystander, infected) as well as between highand low-IAV transcribing infected cells were performed using the findMarkers function from the scran package (54) and blocked on time point. For each comparison, genes were considered to be differentially expressed between cell states when gene expression was significant at an FDR=1% (Wilcoxon rank tests) and the log 2 fold change was > 0.2. In addition, for each comparison between cell states, we tested for differences in response between subsets using a linear model of the form: where Expr i is the expression of the gene being tested in cell i, State i is an indicator variable that distinguishes the two cell states being compared (e.g. unexposed and bystander), and subset i is an indicator variable that reflects the CD16 +/status of cell i. The 335 genes with significant interactions at a 1% FDR (for unexposedbystander and unexposed-infected comparisons) were clustered using the hclust R function with method 'Ward.D2'. DynamicTreeCut algorithm (58) was used to identify eight major patterns of response to IAV.

Transcription Factor Enrichment Analyses
To estimate Transcription Factor (TF) activity and define TFtargets relationships, we ran the R SCENIC pipeline (29) on the expression matrix (pre-normalization) on a random subsample of 4800 cells (100 cells from each donor at each time point and each condition, pre-exclusion of dying and contaminant cells) with default parameters. For each gene, motif-enrichment was considered for either cis-regulatory regions located <10kb from the TSS (distal regulatory elements), or between 500 bp upstream and 100 bp downstream of the promoter (proximal regulatory elements). To do so, motif-enrichment scores for all human genes (hg38 build, refseq_r80), were retrieved from https:// resources.aertslab.org/cistarget and used as input for the Rcistarget package (29). Sets of high-confidence targets for the 113 TFs whose activity could be quantified by SCENIC were then extracted and used for enrichment analysis. For each gene module, TF enrichment was assessed using a Fisher's exact test with the 6,669 expressed genes as background (Supplementary Data 3). Resulting p-values were adjusted using a global Benjamini-Hochberg correction for all eight modules and 113 TFs.
For each TF, with its targets enriched among one of the eight modules, TF activity inferred by SCENIC was used to test for subset-specific activity using a linear model of the form: where TF i is the activity of the TF being tested in cell i, State i is an indicator variable that distinguishes the two cell states being compared (e.g. unexposed and bystander), and subset i is an indicator variable that reflects the CD16 +/status of cell i. Average TF activity was then computed for each cell state, subset and time point, and correlated with gene expression of the associated module, to assess the link between TF activation and the TF-target enriched modules.

Association of the Outcome of IAV Infection With Basal Gene Expression
For each of the three monocyte subsets detected at basal state, a Kruskall-wallis test was used to search for genes whose expression levels significantly differ across donors. Within each monocyte subset, we then computed the average expression of each gene for all eight donors and correlated it with the percentage of infected cells at 4hpi. Genes that differed in expression between donors (FDR≤1%), and passed a nominal p-value threshold of 0.01 for association with IAV levels in any of the 3 subsets, were selected for downstream enrichment analyses.
For genes nominally correlated with viral mRNA levels, TF enrichment was assessed as previously using a Fisher's exact test with all 4,859 genes expressed at T 0 as background (Supplementary Data 4), and Benjamini-Hochberg correction for all 113 TFs was applied.

Gene Ontology Enrichment Analyses
All Gene Ontology (GO) enrichment analyses were performed with the GOSeq package using default settings (59). Background gene sets consisted of all genes that had average log-normalized expression values > 0.1 in at least one of the groupings being examined, and are described in the text. Only enrichments significant at FDR≤5% are reported.

Pseudo-Bulk Estimates From scRNA-Seq Data
Pseudo bulk estimates of IAV mRNA levels were computed by measuring, for each donor and time point, the mean percentage of reads of viral origin across all cells from the sample. At each time point, we then used a Pearson's correlation test to compare pseudo-bulk estimates for the 8 donors with IAV mRNA levels obtained in bulk data at 6hpi.

Quantification of Canonical Monocyte Subsets in EVOIMMUNOPOP Samples
FlowJo v10.6.1 software (60) was used with the gating strategy depicted in Supplementary Figure 8 to quantify monocyte subsets for 174 EVOIMMUNOPOP donors. Population-level differences in proportion of canonical monocyte subsets were assessed using Wilcoxon Rank tests. Correlation of the ratio of CD16 + to CD16cells with IAV mRNA levels was assessed using a linear model of the form where 'IAV' are IAV mRNA levels, 'ratio' is the percentage of CD16 + monocytes (nonclassical+intermediates) divided by the percentage of CD16monocytes (classical), and 'Pop' is and indicator variable separating AFB from EUB individuals.

Analysis of Bulk RNA-Seq Profiles From the EVOIMMUNOPOP Cohort
A combined human-IAV reference was generated by concatenation of the primary human genome assembly

Cell States Deconvolution From Bulk RNA Sequencing
To assess the percentage of total transcripts that originate from each cell state across the 199 IAV-challenged samples, we pooled cells from T 6 into 3 groups, based on their assigned cell-state (bystander, infected: high and low IAV-transcribing) and to which we added a 4 th group containing all singlets that were either (i) assigned to cluster numbers 3, 8, 10, and 11 (dying cells) or (ii) discarded based on their high mitochondrial content or low read counts (dead cells). We then estimated pseudo-bulk profiles for each group by summing UMIs across all cells and computing the number of UMIs associated to each gene per million of sequenced UMIs. TPM profiles obtained from bulk data were then normalized to improve comparison with pseudobulk. Specifically, we first computed a global pseudo-bulk profile of the entire single cell dataset as the average of the pseudo bulk profiles from the 4 cell states (bystander, infected: high and low IAV-transcribing, or dying/dead), weighted by the percentage of UMIs they contribute to the overall pool of cells. To account for the difference in how gene expression is quantified between the two methods (3' end counts for scRNA-seq and full-length gene coverage for bulk RNA-seq), we computed for each gene i a normalization factor s i given by where PB i is the number of UMI per million for gene i in the global pseudo-bulk profile, and TPM i is the average expression of the gene i in the 199 IAV-stimulated samples from the bulk RNA-seq data. For each gene, s i was then subtracted from the log transformed TPM to yield normalized TPM profiles. We next applied DeconRNAseq (67) to the normalized log TPM profiles from all individuals, using the log-transformed pseudo bulk profiles from the 4 cell states as a basis for deconvolution. Quality of the deconvolution was assessed using leave-one-out cross validation, based on the eight individuals for whom we had scRNA-seq data. Specifically, for each of these eight individuals, bulk mRNAs were decomposed using pseudo-bulk profiles recomputed based on the seven other individuals. The resulting proportions were then compared with the percentage of UMIs that originate in each cell-state in the scRNA-seq to assess the quality of the deconvolution. Excluding IAV genes from bulk transcriptomic profiles prior to preforming the deconvolution had virtually no impact on the estimated proportions (Pearson r > 0.98 with proportions estimated without excluding IAV genes), confirming that our estimates were not driven by IAV expression alone. Comparisons between populations were performed using Wilcoxon rank tests. The effect of the percentage of infected cells and percentage of high IAV-transcribing cells among infected cells on the total IAV mRNA levels were assessed by modelling And IAV ∼ HI + POP where IAV are the IAV mRNA levels across the 199 bulk mRNA samples, INF and HI are respectively the percentage of infected cells and the percentage of high IAV-transcribing cells among infected cells that we estimated from the deconvolution, and POP is a factor variable reflecting the population (EUB or AFB). The fraction h of population differences attributable to difference in rate of infection was estimating by comparing model (5) with model (7) below and computing h = 100 Â (1 − b 5 b 7 ), where b i is the effect of population on IAV levels in model (i). To assess how the contribution of the percentage of high IAV transcribing cells to total IAV mRNA levels differed between populations, we used a linear model of the form IAV ∼ HI + POP + HI : POP (8) and tested for significant effect of the interaction term HI : POP on IAV mRNA levels.

Flow Cytometry Analysis of Monocyte Susceptibility to IAV Infection
Frozen PBMCs from 8 individuals included in the EVOIMMUNOPOP cohort were thawed and allowed to rest overnight at 37°C, 5% CO 2 in 25cm 2 flasks. Cells were then seeded at 2×10 6

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by the Ethics Board of Institut Pasteur (EVOIMMUNO POP-281297) and the relevant French authorities (CPP, CCITRS, and CNIL). All experimental methods were conducted in accordance with the Declaration of Helsinki principles. The patients/participants provided their written informed consent to participate in this study.