Analysis of the Transmission of SARS-CoV-2 Delta VOC in Yantai, China, August 2021

Objective: Starting on 31 July 2021, a SARS-CoV-2 outbreak occurred in Yantai, Shandong Province. The investigation showed that this outbreak was closely related to the epidemic at Nanjing Lukou Airport. Because many people were involved and their movements were complex, the transmission route could not be resolved by epidemiological investigation alone. Here we combined SARS-CoV-2 whole-genome sequencing with epidemiology to determine the transmission routes of the Yantai epidemic. Methods: Samples from the 13 cases of this outbreak, collected from 31 July to 4 August 2021, were identified by fluorescence quantitative PCR; whole-genome deep sequencing based on NGS was then performed, and the data were analyzed with bioinformatics software. Results: All sequences were over 29,000 bases in length and all belonged to B.1.617.2, the Delta lineage. Compared with the reference sequence NC_045512.2 (Wuhan strain), all sequences shared two amino acid deletions and nine amino acid mutations in the Spike protein. Homology with the sequence of the Lukou Airport Delta strain was 99.99%. To confirm the transmission relationships between patients, we performed a phylogenetic tree analysis. Patients 1, 2, and 9 formed an independent branch, and the other patients were closely related to one another. Combined with the epidemiological investigation, we inferred that the epidemic in Yantai was transmitted by two routes at the same time. Based on this information, prevention and control work was carried out along both routes and effectively prevented further spread of the epidemic.

INTRODUCTION
SARS-CoV-2 is a novel coronavirus, first reported in Wuhan, China, in December 2019, that caused an epidemic of acute respiratory syndrome (1,2). Since then, coronavirus disease 2019 (COVID-19) has spread quickly all over the world, causing great casualties and property losses (3,4). By mid-March 2022, nearly 460 million cases of COVID-19 had been diagnosed, with over 6 million deaths around the world (https://coronavirus.jhu.edu/map.html). All viruses, including SARS-CoV-2, change over time.
Most changes have little to no effect on virus properties, but some changes, especially accumulated mutations, may affect transmission, pathogenicity, and the performance of vaccines, diagnostic tools, and so on (5). In order to prioritize global monitoring and research, and ultimately to inform the ongoing response to the COVID-19 pandemic, the World Health Organization (WHO) classified important variants into two categories: variants of concern (VOC) and variants of interest (VOI) (https://www.who.int/en/activities/tracking-SARS-CoV-2-variants). A VOC is a virus strain with a wide range of influence for which data support enhanced transmissibility, a detrimental change, or reduced vaccine effectiveness and therapeutic effect. A VOI is a virus strain that is predicted or known to have changed characteristics and that has been found in many countries with an increasing number of cases over time. Given the continuous evolution of the virus and the constant developments in our understanding of the impacts of variants, these definitions may be periodically adjusted. Currently, there are five designated VOCs (Alpha from the UK, Beta from South Africa, Gamma from Brazil, Delta from India, and Omicron from multiple countries) and two VOIs (Lambda from Peru and Mu from Colombia). Each variant has its own characteristic mutation spectrum, and some mutation sites are shared among variants. Alpha, Beta, and Gamma share the N501Y mutation within the receptor-binding domain (RBD) of the spike protein, which can increase affinity for human angiotensin-converting enzyme 2 (hACE2) (6,7). This may play an essential role in the higher transmissibility of these variants. Beta and Gamma share another mutation, E484K, in their spike protein; this mutation not only enhances receptor binding affinity but also allows escape from neutralization by vaccine-induced humoral immunity or some therapeutic monoclonal antibodies (8-10).
Turning to the Delta variant, it carries the L452R and T478K mutations in the RBD and the P681R mutation near the furin cleavage site of the spike protein; together these can greatly improve transmissibility and immune evasion (11,12). From April 2021, Delta expanded rapidly around the world until the emergence of Omicron in late 2021. Omicron contains more than 15 mutations in the RBD; these mutations greatly changed the structure of the Spike protein, enhanced its binding to ACE2, and invalidated many antibody binding sites (13,14). In addition, Omicron lost its dependence on the cellular protease TMPRSS2, which allows it to replicate rapidly and massively in airway cells above the lungs that do not express TMPRSS2; this not only increased the viral load but also accelerated transmission of the virus (15,16). At present, Omicron has almost completely replaced Delta all over the world. China was also affected by the SARS-CoV-2 Delta variant. Since June 2021, it has been found in new outbreaks in Yunnan, Guangdong, and Jiangsu. The Jiangsu outbreak started at Lukou Airport in Nanjing, with related epidemics in many provinces and cities. Because of omissions in the cleaning and disinfection of inbound flight CA910, which arrived at Lukou Airport in Nanjing from Moscow on July 10, cleaning staff were infected with SARS-CoV-2 and the infection then spread. The investigation showed that the SARS-CoV-2 outbreak in Yantai was also closely related to this source. The first Lukou-related case in Yantai was diagnosed on 31 July 2021, and a total of 13 patients were diagnosed within 5 days. Notably, the epidemiological investigation showed that the transmission relationships among the 13 people were complex.
Therefore, to determine the virus lineage and the transmission relationships between cases, we sequenced the whole viral genomes of these 13 cases using next-generation high-throughput sequencing (NGS), analyzed the genetic characteristics and variation of the virus at the molecular level, and traced the source of the virus.

Sample Collection
Starting on 31 July 2021, a SARS-CoV-2 outbreak occurred in Yantai, Shandong Province. As of 4 August, a total of 13 novel coronavirus-positive cases had been detected. Their nasopharyngeal swabs were collected and sent to our laboratory for testing before the patients were transferred to an infectious disease hospital for treatment.

SARS-CoV-2 Nucleic Acid Diagnosis
Viral RNA was extracted from 140 µL of each clinical specimen using a QIAamp viral RNA mini kit (Qiagen, Hilden, Germany) following the manufacturer's protocol. The purified RNA was eluted in 50 µL of elution buffer. Fluorescent qPCR was performed using an In Vitro Diagnostic (IVD) reagent (Bioperfectus Technologies, Jiangsu, China) prior to sequencing. The open reading frame 1ab region (ORF1a/b) and the nucleocapsid region (N) of SARS-CoV-2, together with a positive reference gene, were used to evaluate the presence and quantity of SARS-CoV-2. We followed the kit instructions with the following thermocycler protocol: 1 cycle of 50 °C for 10 min; 1 cycle of 97 °C for 1 min; and 45 cycles of 97 °C for 5 s and 58 °C for 30 s with fluorescence reading. The cycle threshold (Ct) detection limit was 40 (350 copies/mL). A Ct value <37 was considered positive. All samples had Ct values <30, meaning that the subsequent sequencing steps could be carried out. All tests were conducted under strict biosafety conditions and standard operating procedures.
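The Ct interpretation described above can be written as a small decision rule. The following is a minimal sketch, not the kit's software: the function name `interpret_ct` and the returned dictionary are hypothetical, and treating Ct <30 as the criterion for proceeding to sequencing is our reading of the sentence above rather than a stated kit rule.

```python
# Thresholds taken from the text: Ct < 37 is positive; all sequenced
# samples had Ct < 30. Using 30 as a sequencing cutoff is an assumption.
POSITIVE_CT = 37.0
SEQUENCING_CT = 30.0


def interpret_ct(ct):
    """Interpret one Ct value (hypothetical helper for illustration)."""
    return {
        "positive": ct < POSITIVE_CT,
        "proceed_to_sequencing": ct < SEQUENCING_CT,
    }


if __name__ == "__main__":
    # Illustrative Ct values only, not measurements from this study.
    for ct in (22.5, 33.0, 38.5):
        print(ct, interpret_ct(ct))
```

A sample at Ct 33, for example, would count as positive under this rule but would fall outside the range observed for the samples that were sequenced.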

Sequencing Strategies
To obtain SARS-CoV-2 sequences specifically, an amplicon-based enrichment method was used for sequencing library preparation. Reverse transcription and amplification steps were performed using the ULSEN® 2019-nCoV Whole Genome Kit (Micro-Future, Beijing, China). A volume of 16 µL of viral RNA was reverse-transcribed into first-strand cDNA, and the viral genome was amplified with primer pools A and B. The PCR product was purified with AMPure XP beads (Beckman Coulter, Brea, CA) and diluted to 0.2 ng/µL. Paired-end libraries were generated with the Nextera XT DNA Library Preparation Kit (Illumina, San Diego, CA) following the reference guide. Samples were multiplexed using the Nextera XT index kit (Illumina, San Diego, CA). For quantification and validation of the library, the Qubit 4.0 Fluorometer system (Life Technologies, Carlsbad, CA) and the 2100 Bioanalyzer (Agilent Technologies, Santa Clara, CA) were used. Library sequencing was performed on a MiSeq using the MiSeq Reagent Kit v2 (300 cycles; Illumina, San Diego, CA).

Data Analysis
For raw data, we first assessed the quality of the sequencing reads with FastQC (Babraham Institute, Cambridge, UK), and clean data were generated by removing sequencing adapters, reads containing poly-N, and low-quality reads with Trimmomatic (17). All downstream analysis was based on high-quality clean data. The reference genome  (18). Mapped reads were sorted by name using sambamba v0.6.8 (19). PCR duplicates were processed with GATK (Genome Analysis Toolkit) v4.2.0.0 (20). Full-length virus sequences were obtained with ivar v1.3.1 (21); positions with sequencing depth <3 and uncovered regions were replaced with "N". For clade assignment and mutation calling, we imported all sequences into Nextclade (https://clades.nextstrain.org/tree) and the web application Phylogenetic Assignment of Named Global Outbreak Lineages (pangolin: https://pangolin.cog-uk.io/). The full-length SARS-CoV-2 genome sequences were aligned using ClustalW as integrated in MEGA X. The neighbor-joining (NJ) phylogenetic tree was constructed in MEGA X using the Kimura two-parameter model and 1,000 bootstrap replicates.
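For readers unfamiliar with the Kimura two-parameter (K2P) model, the pairwise distance that feeds the neighbor-joining tree can be sketched as follows. This is an illustration of the standard formula d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of transitions and transversions, and not MEGA X's implementation. The function name `k2p_distance` is hypothetical, and skipping sites that contain "N" or gaps (pairwise deletion) is an assumption, since the gap-handling option used in MEGA X is not stated.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
BASES = PURINES | PYRIMIDINES


def k2p_distance(seq_a, seq_b):
    """Kimura two-parameter distance between two aligned sequences.

    Sites where either sequence is not A/C/G/T (for example "N" or "-")
    are skipped; this pairwise-deletion choice is an assumption.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in BASES or b not in BASES:
            continue
        compared += 1
        if a == b:
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


if __name__ == "__main__":
    # Toy 10-base alignment with one transition (A->G); not study data.
    print(k2p_distance("ACGTACGTAC", "GCGTACGTAC"))
```

For genomes that differ at only a handful of sites out of roughly 29,900, the K2P distance is very close to the raw proportion of differing sites, so the tree topology is driven almost entirely by which mutations are shared.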

Nucleic Acid Test Results
Nasopharyngeal swabs were collected from all 13 patients, and viral RNA was extracted from a 140 µL sample using a QIAamp viral RNA mini kit (Qiagen, Hilden, Germany) following the manufacturer's protocol. The purified RNA was eluted in 50 µL of elution buffer. Before sequencing, samples were tested with an In Vitro Diagnostic (IVD) reagent (Bioperfectus Technologies, Jiangsu, China) based on fluorescent PCR. Internal quality control was evaluated using positive (confirmed case RNA) and negative (DEPC H2O) controls. The two specific targets of SARS-CoV-2 (ORF1ab and the N gene) were positive in all 13 cases, in the kit standard, and in the positive internal quality control, and an ideal logarithmic amplification curve was obtained.

Next-Generation Sequencing Results
NGS was used to complete whole-genome sequencing of the 13 cases, and a total of 13 novel coronavirus genome sequences were obtained. The whole-genome FASTA sequences were assembled successfully, and all were over 29,000 bases in length. Using the pangolin web application, we determined that the genome sequences of all of these cases belong to lineage B.1.617.2 (Delta).

Homology Analysis and Gene Traceability Analysis
To confirm the close relationship between the Yantai epidemic and the Nanjing Lukou Airport epidemic, we aligned our sequences with one SARS-CoV-2 sequence from a confirmed case (EPI_ISL_7876604) on flight CA910, and the homology was 99.99%. Because CA910 took off from Moscow, we also aligned our sequences with all SARS-CoV-2 sequences from Russia collected from 20 June to 20 July in the GISAID database. One sequence (EPI_ISL_3007759), collected in Moscow on 28 June 2021, was highly homologous with our sequences.
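To put the 99.99% figure in perspective, a homology value of this kind can be computed from a pairwise alignment as the percentage of identical sites. The sketch below is illustrative only: the tool the comparison was actually run with is not stated here, the function name `percent_identity` is hypothetical, and ignoring sites that contain "N" or gaps is an assumption.

```python
def percent_identity(seq_a, seq_b):
    """Percent identity over sites where both aligned sequences are A/C/G/T.

    Sites with "N" or "-" in either sequence are ignored (an assumption).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "ACGT" and b in "ACGT":
            compared += 1
            if a == b:
                matches += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy alignment with one mismatch in ten sites; not study data.
    print(percent_identity("ACGTACGTAC", "ACGTACGTAT"))
```

Because the reference genome NC_045512.2 is 29,903 bases long, 99.99% identity corresponds to roughly three differing sites across the whole genome.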
These alignments support the conclusion that the Yantai epidemic belongs to the transmission chain of the Lukou epidemic. To infer the transmission relationships between all patients, we built a neighbor-joining phylogenetic tree based on the whole SARS-CoV-2 genomes of the 13 Yantai sequences, 12 genomes available on GISAID (including the sequence from Russia), and the reference sequence downloaded from NCBI (https://www.ncbi.nlm.nih.gov/sars-cov-2/; Figure 1). Details of all sequences are shown in Table 2. The tree showed that patients 1, 2, and 9 formed an independent branch and that the other patients were closely related to one another. Combined with the epidemiological investigation, we inferred that the epidemic in Yantai was transmitted by two routes at the same time. To confirm this inference, we analyzed the mutation information (Figure 2). A total of 41 mutations were found in the 13 sequences compared with the reference sequence (NC_045512.2). The mutation spectra of patients 1, 2, and 9 were identical and included two specific mutations (G23311T in the Spike protein and C28748T in the N protein); we therefore concluded that they belong to the same route and that patient 1 was the source of transmission. The mutations of the other patients differed somewhat. Patient 3 had the fewest mutations and had been to Lukou Airport, so patient 3 was the source of transmission of the other route, and further mutations arose during passage. Compared with patient 3, patient 8 carried a unique mutation, G27990T in the ORF8 protein, that no other patient had, which showed that the virus carried by patient 8 had not spread further. Compared with patients 3 and 8, patients 4, 5, 6, 7, 10, 11, 12, and 13 shared the mutation C27527T in the ORF7a protein, indicating that the virus had spread among these patients, although the order of transmission could not be determined.
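The grouping logic described above, in which patients are assigned to the same cluster when they share the same distinguishing mutations, can be sketched in a few lines. This is a minimal illustration and not the analysis pipeline itself: it uses only the distinguishing mutations named in the text (the full 41-mutation profiles are in Figure 2), and the names `DISTINGUISHING` and `group_by_profile` are hypothetical.

```python
from collections import defaultdict

# Distinguishing mutations per patient, as named in the text only.
# The shared Delta backbone and any other per-patient mutations are omitted.
DISTINGUISHING = {p: {"G23311T", "C28748T"} for p in (1, 2, 9)}
DISTINGUISHING[3] = set()          # none of the distinguishing mutations
DISTINGUISHING[8] = {"G27990T"}    # unique to patient 8
DISTINGUISHING.update({p: {"C27527T"} for p in (4, 5, 6, 7, 10, 11, 12, 13)})


def group_by_profile(profiles):
    """Group patient IDs that carry exactly the same set of mutations."""
    groups = defaultdict(list)
    for patient in sorted(profiles):
        groups[frozenset(profiles[patient])].append(patient)
    return dict(groups)


if __name__ == "__main__":
    for mutations, patients in group_by_profile(DISTINGUISHING).items():
        label = ", ".join(sorted(mutations)) or "(none)"
        print(f"{label}: patients {patients}")
```

Grouping by exact profile separates patients 1, 2, and 9 from the rest, which matches the independent branch in the tree; ordering transmission within a group is not possible from identical profiles alone.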

DISCUSSION
The Delta VOC was first identified in October 2020 and became a major variant globally from April 2021. According to WHO research, the transmission rate of the Delta variant is nearly 100% higher than that of strains not listed as "of concern," and a recent study of the transmission dynamics of the Delta variant that caused the COVID-19 outbreak in Guangdong, China, also suggests that it is twice as infectious as previous pandemic strains (22). The Delta variant also spreads faster than other strains. Previously, the incubation period of the novel coronavirus was 5-6 days, whereas that of the Delta variant is 4 days. The passage interval (the interval between successive generations of cases) used to be 4 or 5 days, but is now about 3 days (23,24). All 13 cases in this Yantai outbreak, caused by the Delta variant, were locally transmitted. During the study period, the local government implemented epidemiological follow-up, and we sequenced samples from all confirmed patients. This provided an opportunity for our study to understand the transmission characteristics of the variant.
In this outbreak, we found that patient 1 had been to YEDA and infected one close contact, and that patient 1's movements may have overlapped with those of other cases in YEDA. Tracing movements alone therefore could not determine the transmission relationships in this epidemic. In such a situation, whole-genome sequence information can provide evidence for genotyping and phylogenetic analysis that helps resolve the ambiguity (25,26). This approach has a prerequisite, however: the sequences must differ enough from one another. Fortunately, the viruses transmitted in this outbreak met this prerequisite. Through sequence analysis, we determined that this epidemic had two transmission routes (Figure 1) and obtained the mutation spectrum of each virus (Figure 2). Based on this information, prevention and control work was carried out along both routes immediately and simultaneously. By the end of this epidemic, there were only 13 cases, which was a great achievement for a city with 3 million people. Like other similar studies, this work illustrates the importance of rapid virus genome analysis in epidemic prevention and control (27-30).
As SARS-CoV-2 continues to spread around the world, the dynamics of virus evolution and mutation are still changing, and the virus is constantly acquiring new mutations in its genome. Some mutations give the virus an advantage in resisting the human immune response, and these mutations may also lead to changes in pathogenicity and virulence (31). Therefore, future prevention and control work should strengthen the screening of close contacts, the investigation of infection sources, the investigation of outbreak clusters, and the active testing of people in high-risk areas.

DATA AVAILABILITY STATEMENT
The original contributions presented in the study are publicly available. These data can be found in GISAID; the accession numbers can be found in the Supplementary Material.

ETHICS STATEMENT
Written informed consent was not obtained from the individual(s) for the publication of any potentially identifiable images or data included in this article.

AUTHOR CONTRIBUTIONS
YS: data curation, investigation, and writing original draft preparation. YZ: data curation, formal analysis, investigation, and resources. ZL: formal analysis, investigation, and methodology. XL and JL: data curation, methodology, and resources. XT, QG, PN, and ZH: data curation and investigation. ZS: methodology, data curation, writing original draft preparation, writing review and editing, and project administration. YT and JW: methodology, data curation, writing review and editing, and project administration. All authors have read and agreed to the published version of the manuscript.