The SARS-CoV-2 differential genomic adaptation in response to varying UVindex reveals potential genomic resources for better COVID-19 diagnosis and prevention

Coronavirus disease 2019 (COVID-19) is a pandemic disease that has been reported in almost every country and causes life-threatening, severe respiratory symptoms. Recent studies showed that various environmental selection pressures challenge the infectivity of severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2) and that, in response, the virus acquires new mutations, leading to the emergence of more virulent strains of WHO concern. Advance prediction of forthcoming virulent SARS-CoV-2 strains in response to principal environmental selection pressures, such as temperature and solar UV radiation, is indispensable to overcome COVID-19. To discover the solar UV radiation-driven genomic adaptation of SARS-CoV-2, a curated dataset of 2,500 full-grade genomes from five different UVindex regions (25 countries) was subjected to in-depth downstream genome-wide analysis. The recurrent variants that best respond to solar UV radiation were extracted and extensively annotated to determine their possible effects and impacts on gene functions. This study revealed 515 recurrent single nucleotide variants (rcntSNVs) as SARS-CoV-2 genomic responses to solar UV radiation, of which 380 were found to be distinct. For all discovered rcntSNVs, 596 functional effects (rcntEffs) were detected, comprising 290 missense, 194 synonymous, 81 regulatory, and 31 intergenic effects. The highest counts of missense rcntSNVs in the spike (27) and nucleocapsid (26) genes are consistent with SARS-CoV-2 genomic adjustment to escape immunity and to prevent UV-induced nucleic acid damage, respectively. The most commonly observed rcntEffs were four missense effects (RdRp-Pro327Leu, N-Arg203Lys, N-Gly204Arg, and Spike-Asp614Gly) and one synonymous effect (ORF1ab-Phe924Phe). Most rcntSNVs were distinct and uniquely attributed to specific UVindex regions, suggesting that solar UV radiation is one of the driving forces for SARS-CoV-2 differential genomic adaptation. The phylogenetic relationship indicated the SARS-CoV-2 population of the High UVindex region as the recent progenitor of all included samples. Altogether, these results provide baseline genomic data that may need to be included when preparing UVindex region-specific diagnostic and vaccine formulations.



Introduction
In December 2019, clusters of pneumonia cases were reported from Wuhan city, Hubei province, China. Some of the early cases were reported among people working in the live animal market. On 11 March 2020, the WHO announced the disease outbreak, now named coronavirus disease 2019 (COVID-19), as a public health emergency of international concern and declared it a pandemic (Koyama et al., 2020). As of June 2022, more than 528.82 million positive cases had been reported to the WHO worldwide (Dashboard, 2022), with more than 6.29 million deaths. COVID-19 symptoms range from mild fever, cough, and fatigue to severe shortness of breath and loss of taste and smell (Guan, 2020; Wang D. et al., 2020), with an average fatality rate of 5% of all confirmed positive cases, which is lower than that of SARS-CoV (9.6%) and MERS (34.3%) (World Health Organization, 2003; Wang C. et al., 2020).
Since December 2019, whole-genome sequence analysis has revealed hundreds of viable genetic variants of SARS-CoV-2 from different parts of the globe. Within SARS-CoV-2, the predominant drivers of genetic variation are single-nucleotide variants (SNVs) caused by error-prone viral polymerases (Smertina et al., 2019; Lu et al., 2022) and endogenous mutagenesis by host RNA-editing enzymes (the nucleotide deaminases APOBEC: C>U and ADAR: A>G) (Placido et al., 2007; Moris et al., 2014; Mourier et al., 2021; Tong et al., 2022). Genome-wide studies of large sets of SARS-CoV-2 genomes revealed an SNV-based nucleotide substitution rate of ∼1 × 10⁻³ per year (Duchene et al., 2022), close to the Ebola virus substitution rate of 1.42 × 10⁻³ reported from West Africa during 2013-2016. However, SNVs are not the only genetic variations discovered in coronaviruses; small insertions/deletions of viral or non-viral sequences have also been reported in various coronavirus genomes, possibly caused by the discontinuous nature of the viral transcriptase during sub-genomic mRNA synthesis (Licitra et al., 2013; V'kovski et al., 2021). Overall, a large proportion of the mutations represent neutral "genetic drift" or have died out quickly, and only a small subset affects viral traits such as host range, transmissibility, antigenicity, pathogenicity, and adaptability of the virus to various selection pressures.
Predicting the genomic-level adaptation of SARS-CoV-2 in response to various selection pressures is indispensable for understanding viral spread, mutation, pathogenicity, control, and future treatment options to effectively tackle COVID-19 (O'Reilly et al., 2020). Solar ultraviolet radiation is thought to have a great impact on the formation of viral populations by selecting variants that can withstand solar UV radiation (Ratnesar-Shumate et al., 2020). In this study, to investigate the SARS-CoV-2 genomic adaptation in response to solar UV radiation, we analyzed 2,500 high-quality, full-length genomes from five different WHO-defined UVindex regions. The comparative genome-wide analysis of SARS-CoV-2 populations revealed differential genomic adjustments in response to different levels of solar ultraviolet radiation. The differential genomic signatures identified across the UVindex ranges provide baseline data for more effective molecular COVID-19 diagnosis and region-specific vaccine production in the future.
UVindex mean data for 12 months (from 7 December 2020 to 8 December 2021) for all included countries were obtained from the monthly weather forecast and climate data of WeatherAtlas (retrieved on 08 December 2021, at 15:30 GMT/UTC + 5h; https://www.weather-atlas.com/). The UVindex value for each country was expressed as a single value rounded to the nearest whole number. For each category, irrespective of geographical location, the five most relevant countries (top of the category's list) were selected, provided that the country's UVindex fell within the specified category range and that at least 100 full-length, high-quality genome sequences from that country were available in publicly accessible databases (Supplementary Table 1). Initially, for all UVindex categories, all available full-length SARS-CoV-2 genomes (a total of 8,631) were downloaded from GISAID on 11 December 2021, GenBank on 15 December 2021, the Chinese National Genomics Data Center Genome Warehouse on 23 December 2021, and the Chinese National Microbiology Data Center on 23 December 2021 (Benson et al., 2012; Shu and McCauley, 2017; CNCB-NGDC Members and Partners, 2021). To retain only high-quality, full-length genomes in each UVindex category, downloaded sequences shorter than 29,700 bp and sequences containing seven consecutive ambiguous nucleotides (Ns) were excluded from the downstream analysis. The China National Center for Bioinformation annotations were used to remove redundancy (Gong et al., 2020). Downloaded sequences containing 50 ambiguous bases were removed from the downstream analysis using Trimmomatic version 0.39 (Bolger et al., 2014) to reduce the number of false-positive variants. Finally, using a custom Perl script, 100 high-quality genome sequences were randomly selected from each of the five countries in a UVindex category; in total, 2,500 full-length SARS-CoV-2 genomes across all five UVindex categories were retained for analysis.
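The quality filter and per-country random selection described above can be sketched as follows. This is a minimal Python illustration, not the authors' Perl script: the function names, the input layout (a dictionary of country to accession/sequence pairs), and the fixed random seed are assumptions made for the example, and the length and ambiguity criteria are treated as independent exclusion rules.

```python
import random
import re

MIN_LENGTH = 29_700   # minimum genome length stated in the text (bp)
MAX_N_RUN = 7         # a run of seven consecutive Ns excludes a genome
PER_COUNTRY = 100     # genomes retained per country

def passes_qc(seq: str) -> bool:
    """Return True if a genome meets the length and ambiguity criteria."""
    if len(seq) < MIN_LENGTH:
        return False
    # Reject any run of MAX_N_RUN or more consecutive ambiguous bases.
    return re.search("N{%d,}" % MAX_N_RUN, seq.upper()) is None

def sample_per_country(genomes, per_country=PER_COUNTRY, seed=0):
    """Randomly pick `per_country` QC-passing accessions for each country.

    genomes: dict mapping country -> list of (accession, sequence) pairs
             (hypothetical layout chosen for this sketch).
    Countries with too few QC-passing genomes are skipped.
    """
    rng = random.Random(seed)  # fixed seed only to make the sketch reproducible
    selected = {}
    for country, records in genomes.items():
        good = [acc for acc, seq in records if passes_qc(seq)]
        if len(good) >= per_country:
            selected[country] = rng.sample(good, per_country)
    return selected
```

With five countries per category and five categories, a call of this kind would yield the 2,500 genomes used in the analysis.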

Reference genome
The SARS-CoV-2 sequence NC_045512.2 was used as the reference genome in this study. NC_045512.2 was sequenced in December 2019 from a sample recovered from Wuhan, China. Following the standard procedure for variant detection, to retrieve high-quality variants, each sample was first converted to short FastQ reads using emboss-splitter (Rice et al., 2000) and a custom fasta-to-fastq.pl script available on GitHub (Dabbish et al., 2012).
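The conversion of an assembled genome into short FastQ reads can be illustrated with the sketch below. It is not the emboss-splitter or fasta-to-fastq.pl code; the read length, window step, and constant placeholder quality character are all assumptions chosen for the example, since the text does not report them.

```python
def genome_to_fastq(name, seq, read_len=150, step=50, qual_char="I"):
    """Split one genome into overlapping pseudo-reads in FASTQ format.

    read_len, step, and qual_char are illustrative assumptions; an assembled
    genome carries no per-base qualities, so a constant placeholder is used.
    Returns a list of four-line FASTQ records (as strings).
    """
    last_start = max(len(seq) - read_len, 0)
    starts = list(range(0, last_start + 1, step))
    if starts[-1] != last_start:
        starts.append(last_start)  # make sure the 3' end is covered
    records = []
    for i, start in enumerate(starts):
        read = seq[start:start + read_len]
        records.append(f"@{name}_{i}\n{read}\n+\n{qual_char * len(read)}\n")
    return records
```

Overlapping windows keep every reference position covered by several pseudo-reads, which is what allows a read-based variant caller to be applied to assembled genomes.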

Read mapping
High-quality reads from each sample were mapped to the latest available SARS-CoV-2 reference genome, NC_045512.2, using the BWA-MEM algorithm with the default minimum seed length of 20, gap open penalty of 6, gap extension penalty of 1, and matching score of 1 (Li, 2013). Open-source software packages were used for variant identification and downstream processing. The "RealignerTargetCreator" and "IndelRealigner" command-line tools of the Genome Analysis Toolkit (GATK version 3.3.0) were used to correct mapping issues by locally realigning improperly mapped reads that carry variant artifacts at their ends (McKenna et al., 2010). Before calling variants, Picard, Samtools, and BWA were used to generate the reference and BAM file indexes (Li and Durbin, 2009; McKenna et al., 2010; DePristo et al., 2011).
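As a rough illustration of how the stated alignment parameters map onto a BWA-MEM invocation, the sketch below only builds the command line and does not run it. The flags `-k`, `-O`, `-E`, and `-A` are BWA-MEM's options for minimum seed length, gap open penalty, gap extension penalty, and matching score; the function name and file names are hypothetical.

```python
import shlex

def bwa_mem_command(reference, fastq, seed_len=20, gap_open=6, gap_ext=1, match=1):
    """Build (but do not execute) a `bwa mem` command with the stated parameters."""
    return [
        "bwa", "mem",
        "-k", str(seed_len),   # minimum seed length
        "-O", str(gap_open),   # gap open penalty
        "-E", str(gap_ext),    # gap extension penalty
        "-A", str(match),      # matching score
        reference,
        fastq,
    ]

def as_shell(command):
    """Render the argument list as a single shell-quoted string."""
    return shlex.join(command)
```

For example, `as_shell(bwa_mem_command("NC_045512.2.fasta", "sample.fastq"))` gives a string that could be pasted into a shell, with the SAM output redirected to a file of the user's choosing.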

Variant calling and quality filtration
Any deviation of a properly mapped read sequence from the reference genome NC_045512.2 was called as a variant. For variant discovery, the "mpileup" utility of bcftools was first used with default parameters to call genotypes for each sample included in this study. From the derived genotypes, high-quality variants were identified as any deviation of the mapped read sequences from the reference genome using the bcftools "call" command (Li, 2011). To distinguish real hereditary variants from false-positive data-processing artifacts (caused by ambiguous bases), a calibrated statistical likelihood was generated for each identified variant locus using the GATK "VariantRecalibrator" and "ApplyRecalibration" tools (McKenna et al., 2010). Finally, false-positive data-processing artifacts were removed using the following options of bcftools filter and GATK VariantFiltration: (a) variants with a Phred quality score ≤20 were removed; and (b) variants with FS values >60 were removed, because the Fisher's exact test-based Phred-scaled P-value (FS) measures strand bias between the reference and alternative alleles, which is a sign of a false-positive variant (Kim et al., 2017; Iqbal et al., 2019).
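The two hard filters above (quality score ≤20 removed, FS >60 removed) can be expressed as a small stand-alone check on VCF records. This is a sketch of the filtering logic only, not a replacement for bcftools filter or GATK VariantFiltration; the function names are hypothetical, and treating records without an FS annotation as passing the strand-bias filter is an assumption.

```python
def keep_variant(vcf_line, min_qual=20.0, max_fs=60.0):
    """Return True if a VCF data line passes the QUAL and FS thresholds."""
    fields = vcf_line.rstrip("\n").split("\t")
    qual = fields[5]                      # QUAL is the sixth VCF column
    if qual == "." or float(qual) <= min_qual:
        return False
    # INFO is the eighth column: semicolon-separated KEY=VALUE pairs and flags.
    info = dict(item.split("=", 1) for item in fields[7].split(";") if "=" in item)
    fs = info.get("FS")
    # Assumption: a record with no FS annotation is not rejected for strand bias.
    return fs is None or float(fs) <= max_fs

def filter_vcf(lines):
    """Keep header lines unchanged and drop data lines that fail the filters."""
    return [ln for ln in lines if ln.startswith("#") or keep_variant(ln)]
```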

Variant functional annotation and prioritization
After filtration, high-quality variants were retained for each of the UVindex categories. These variants were then comprehensively investigated to predict their possible functional effects, their impact, and their distribution across the reference NC_045512.2 genome. SnpEff_4.3 was used to assign a functional class to each variant, and its various annotation levels were used to identify potential coding variants. For functional annotation, the SnpEff database was built according to the SnpEff database building protocol (Cingolani et al., 2012) using the NCBI SARS-CoV-2 sequence annotation resources (NC_045512.2; Bio-Project, PRJNA485481; https://www.ncbi.nlm.nih.gov/sars-cov-2/). For all potential coding variants, the assigned SnpEff functional class vocabularies were UTR 3 prime, UTR 5 prime, splice site donor, splice site acceptor, splice site region, downstream, upstream, disruptive in-frame deletion and insertion, and conserved in-frame insertion and deletion. The results are provided in the list of functionally annotated variants (Supplementary Material: rcntSNV_UVindex.snpEff.vcf). A custom Python script was developed to extract all identified variants for each gene in all UVindex categories (Supplementary Material: rcntSNVs_genes_functional_effects_UV.Case.genes). Following variant functional annotation, all coding region variants were compared to find UVindex category-specific and overlapping variants using vcftools (Danecek et al., 2011) and the Venn diagram tool of the Bioinformatics and Evolutionary Genomics resources (http://bioinformatics.psb.ugent.be/webtools/Venn/).
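The comparison of variants across UVindex categories amounts to set operations: variants present in every category are shared, and variants present in exactly one category are category-specific. The sketch below shows that logic in plain Python rather than with vcftools or the Venn web tool; the function name and the representation of a variant as a (position, reference, alternative) tuple are assumptions.

```python
def partition_variants(by_category):
    """Split variants into those shared by all categories and category-specific ones.

    by_category: dict mapping category name -> set of variant keys,
                 e.g. (position, ref, alt) tuples (hypothetical representation).
    Returns (shared, specific), where `specific` maps each category to the
    variants found in that category and in no other.
    """
    all_sets = list(by_category.values())
    shared = set.intersection(*all_sets) if all_sets else set()
    specific = {}
    for category, variants in by_category.items():
        # Union of the variants seen in every other category.
        others = set().union(*(v for c, v in by_category.items() if c != category))
        specific[category] = variants - others
    return shared, specific
```

Variants found in two to four categories fall into neither group, which corresponds to the intermediate regions of a five-set Venn diagram.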

Phylogeny
For phylogeny, sequences with <30 variations were chosen, and their lengths were adjusted by 5′ UTR and 3′ UTR truncation, without losing the key sequence sites. From this sequence pool, for an optimal phylogenetic relationship, a subset of 125 high-quality SARS-CoV-2 whole-genome samples (25 from each UVindex category) was randomly selected in Perl using a random number generator. All selected genomes were first aligned using the progressive multiple sequence alignment method of ClustalW (Thompson et al., 1994). MEGA X (version 11.0.10) was used to produce and visualize the phylogenetic tree. The maximum likelihood approach with the Tamura-Nei substitution model, uniform rates among sites, use of all sites, 1,000 bootstrap replicates, and the nearest neighbor interchange (NNI) heuristic method was used to infer the best tree.

FIGURE 1
Total and recurrent SNV (rcntSNV) counts in all 2,500 examined SARS-CoV-2 genomes, grouped into five distinct UVindex-based categories. For each WHO-defined UVindex category, the outer bar represents the total identified SNVs, whereas the inner short bar represents the predicted rcntSNVs.

FIGURE 2
The SARS-CoV-2 genome-wide distribution of all observed high-quality rcntSNVs. The structural protein-encoding genes category is shown in orange (left-most), the non-structural protein-encoding genes category is shown in blue (middle), and the accessory genes category is shown in gray blocks (right-most). In each category, the smaller blocks represent the genes in that category, and their sizes represent the respective rcntSNV load.

Results and discussions
To determine the differential genomic adaptation of SARS-CoV-2 in response to different UVindex ranges, 2,500 full-length, high-quality reported genomes were investigated from 25 countries, classified into five distinct categories based on each country's UVindex exposure. The UVindex-based categories are described in the "Methods" section. A Perl script was used to randomly select 100 high-quality SARS-CoV-2 genomes from each of the included countries.

Variant discovery (Total/RcntSNVs)
For the 2,500 SARS-CoV-2 complete genome samples, we discovered a total of 10,228 single nucleotide variants (SNVs), with an average variation load of one SNV every 15.49 nucleotides per UVindex category (averaging ∼3.92 SNVs/sample). In each UVindex category, countries were included based on their shared UVindex range, irrespective of relative humidity, temperature, altitude, geographical location, and many other selection pressures. Considering our sampling strategy, all identified SNVs in each UVindex category are the probable genomic adjustments to all experienced biotic and abiotic selection pressures, whereas only the most common SNVs in a UVindex category are the potential genomic adaptations of SARS-CoV-2 in response to UVindex. Therefore, based on a 25% recurrence rate in a UVindex category, a total of 515 recurring SNVs (5.03% of the 10,228) were prioritized to discover the SARS-CoV-2 genomic responses to a commonly experienced environmental selection pressure, solar UV radiation. These SNVs with at least 25% recurrence in a UVindex category are termed recurrent SNVs (rcntSNVs). For all UVindex categories, lists of all discovered rcntSNVs are given in Supplementary_info_file.docx, Supplementary Tables 2-6.

FIGURE 5
Different functional effects (rcntEffs) predicted for all rcntSNVs in all five WHO-defined UVindex categories, shown using a combo bar-line chart. The most prevalent rcntEffs, missense, are displayed using a red-pointed gray line, whereas the synonymous, regulatory, and non-coding rcntEffs are represented in blue-, gray-, and yellow-stacked columns, respectively.

FIGURE 6
A Venn diagram depicting the overlap of recurrent single nucleotide variants (rcntSNVs) found across different SARS-CoV-2 populations from the five WHO-defined UVindex country categories. The comparison based on the total identified rcntSNVs across all 2,500 SARS-CoV-2 genomes from the UVindex categories Extreme (blue), Very_High (red), High (green), Moderate (yellow), and Low (brown) revealed a total of seven commonly shared variants (A). The complete description of all UVindex categories is presented in the Methods section. Upon detailed functional annotation, the seven commonly shared rcntSNVs were found to have five shared missense functional effects (rcntMissense-effects) on gene functions (B), of which three shared rcntMissense-effects are in structural protein-encoding genes (C), comprising two in nucleocapsid (D) and one in spike (E). In all Venn diagrams, the UVindex-specific rcntSNVs/Missense-effects (CaSp-rcntSNVs/Effs) counts are given near the outer edges, whereas the shared rcntSNVs/effects are represented in the dark brown core in the middle of each diagram.
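The 25% recurrence criterion used to define rcntSNVs can be sketched as a simple frequency filter within one UVindex category. The function name and the data layout (one set of variant keys per genome) are assumptions made for illustration; with 500 genomes per category, the threshold corresponds to a variant being present in at least 125 genomes.

```python
from collections import Counter

def recurrent_variants(sample_variants, min_fraction=0.25):
    """Return the variants present in at least `min_fraction` of the genomes.

    sample_variants: list with one set of variant keys per genome in a
                     single UVindex category (hypothetical layout).
    """
    n_samples = len(sample_variants)
    # Count each variant at most once per genome.
    counts = Counter(v for variants in sample_variants for v in set(variants))
    return {v for v, c in counts.items() if c / n_samples >= min_fraction}
```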
Of the total, the lowest number of rcntSNVs (75) was observed in SARS-CoV-2 genomes from countries exposed to Extreme UVindex solar radiation, suggesting that Extreme UVindex solar radiation exerts negative selection pressure by damaging the viral nucleic acid and thus limits the diversity of SARS-CoV-2 strains. Our finding is consistent with the hypothesis that Extreme UVindex radiation damages the viral nucleic acid and thereby inactivates SARS-CoV-2 without altering its morphology (Lo et al., 2021). Furthermore, solar UV radiation of extreme intensity inactivates SARS-CoV-2 and other related strains of corona and influenza viruses on surfaces (Pi et al., 2003; Darnell et al., 2004; Ianevski et al., 2019; Ratnesar-Shumate et al., 2020). On the contrary, the highest number of rcntSNVs (141) was observed in the High UVindex category (Ianevski et al., 2019). Based on these findings, we propose that COVID-19-causing viruses have had sufficient evolutionary time to acquire genomic-level adaptation in High UVindex regions, probably in their primary natural reservoir (bats). Our findings are in line with the work of Li et al., who found that bat families, the zoonotic origin of several SARS-like coronaviruses, are greatly enriched in tropical regions experiencing High UVindex solar radiation (e.g., Guangdong, Guangxi, Hubei, and Tianjin) (Hamre and Procknow, 1966; McIntosh et al., 1967; Li et al., 2005; Wu et al., 2020). Figure 1 shows the total number of identified SNVs and rcntSNVs in each UVindex category.

rcntSNVs genomic distribution
The SARS-CoV-2 genome contains two non-structural multi-domain protein-encoding genes (ORF1a and ORF1b), four structural protein-encoding genes (SPeGs; S, E, M, and N), and up to six genes that encode the accessory proteins 3a, 6, 7a, 7b, 8, and 10a (Brant et al., 2021). Our in-depth analysis of the gene-set-based distribution of all potentially UVindex-responding variants revealed the large majority of the total rcntSNVs (302: 53.45%) in the non-structural protein-encoding genes (ORF1ab), followed by 168 (29.73%) in the four SPeGs (N = 75, S = 64, M = 20, and E = 9), whereas only 95 (16.81%) were found in the six accessory genes (Figure 2). These inferences agree with the genomic architecture of SARS-CoV-2 and illustrate that SARS-CoV-2 has made most (approximately 53%) of its genomic-level adaptation to various UVindex regions in the non-structural multi-domain protein-encoding genes (ORF1ab), whereas the accessory protein-encoding genes were the most conserved gene set of SARS-CoV-2.
Of all the virion proteins, the structural gene products are directly exposed to environmental selection pressures such as solar UV radiation. Therefore, the downstream analysis focused on identifying rcntSNVs in the E, M, S, and N SPeGs for each UVindex category (Figure 3). Of the 168 structural rcntSNVs identified, we discovered 75, 64, 20, and 9 in the nucleocapsid, spike, membrane, and envelope SPeGs, respectively. Of the four SPeGs, the nucleocapsid gene has undergone the most genomic change, possibly to shield the genome from the nucleic acid-damaging effects of UV radiation through adaptation to differential UVindex exposures. These findings support recent studies on SARS-CoV-2 revealing the adverse effects of UV radiation (UVC) on nucleic acid without affecting viral proteins (Chang et al., 2014), and the key role of the nucleocapsid protein in packaging and protecting the viral genome in a viable virion (Tahara et al., 1994, 1998; Lai and Cavanagh, 1997).

rcntSNVs functional effects
Because the rcntSNVs in each of the five UVindex categories best represent differentially adapted SARS-CoV-2 populations, all rcntSNVs were functionally annotated to predict their direct effects and impacts on gene functions. One SNV may have more than one effect, possibly due to gene overlap (Cingolani et al., 2012; Iqbal et al., 2019). As a result, slightly more rcntSNV effects (rcntEffs) were observed than the total rcntSNV count. In this study, a total of 596 functional rcntEffs were discovered for all rcntSNVs.
Functional annotation revealed only 31 (5.2%) rcntEffs in the non-coding intergenic regions, and the remaining 565 (94.8%) were located in the genic regions of the SARS-CoV-2 genome. Of the genic-region rcntEffs, 81 (14.3%) were detected in gene regulatory regions, positioned within 200 bp upstream (34) and downstream (47) of genes, and the remaining 484 (85.7%) were found in the coding (exonic) regions. These results are in line with the genomic architecture of SARS-CoV-2, and similar results were also shown by Koyama et al. (2020). The overall functional rcntEffs count for all rcntSNVs and their corresponding distribution across the SARS-CoV-2 genome are shown in Figure 4.
The exonic rcntEffs set comprises 290 missense and 194 synonymous functional effects. Interestingly, across all UVindex categories, missense effects form the largest class (290; 48.7%), suggesting that, in response to the selection pressure imposed by varying degrees of UV radiation, SARS-CoV-2 has relied on an enrichment of high-impact missense variation to tolerate UV radiation stress. Approximately 71.38% (207) of the total missense rcntEffs are found in the High (92; 31.7%), Moderate (65; 22.4%), and Low (50; 17.2%) UVindex categories, suggesting that a UVindex range ≤7 allows more SARS-CoV-2 strains to survive. On the contrary, a UVindex ≥8 imposes strong negative selection pressure on SARS-CoV-2, as only ∼28.62% (83) of the total missense rcntEffs are found in the Extreme (43; 14.8%) and Very_High (40; 13.8%) UVindex categories.
Furthermore, ORF1ab, which occupies two-thirds of the SARS-CoV-2 genome and is expressed as 16 non-structural proteins (NSPs), harbors the highest number (163) of missense rcntEffs. We also observed that the genes encoding the nucleocapsid protein (N) and the spike glycoprotein (S) carry the second and third highest numbers of missense rcntEffs, 43 and 40, respectively. The rcntEffs counts observed in all UVindex categories are presented in Figure 5.
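The percentages reported in this section follow directly from the stated counts, and the short sketch below recomputes them. The counts are taken from the text; the dictionary layout, function names, and one-decimal rounding are choices made for the example.

```python
# Effect counts as reported in the text.
EFFECTS = {"missense": 290, "synonymous": 194, "regulatory": 81, "intergenic": 31}
# Missense rcntEffs per UVindex category, as reported in the text.
MISSENSE_BY_CATEGORY = {"Extreme": 43, "Very_High": 40, "High": 92, "Moderate": 65, "Low": 50}

def percent(part, whole):
    """Share of `part` in `whole`, as a percentage rounded to one decimal."""
    return round(100 * part / whole, 1)

def summarize(effects):
    """Recompute the totals and shares described in the functional-effects text."""
    total = sum(effects.values())                          # all rcntEffs
    genic = total - effects["intergenic"]                  # effects within genic regions
    coding = effects["missense"] + effects["synonymous"]   # exonic effects
    return {
        "total": total,
        "genic": genic,
        "coding": coding,
        "intergenic_pct": percent(effects["intergenic"], total),
        "genic_pct": percent(genic, total),
        "regulatory_pct_of_genic": percent(effects["regulatory"], genic),
        "coding_pct_of_genic": percent(coding, genic),
        "missense_pct": percent(effects["missense"], total),
    }
```

Note the two denominators: the intergenic, genic, and missense shares are taken over all 596 rcntEffs, whereas the regulatory and coding shares are taken over the 565 genic rcntEffs.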

Comparative genomic analysis
The rcntSNVs-based comparative analysis of all studied full-length SARS-CoV-2 genomes revealed a total of 380 (∼73.8% of all rcntSNVs) UVindex category-specific rcntSNVs (CaSp-rcntSNVs), which are not shared between any two or more categories (Extreme 58, Very_High 63, High 107, Moderate 84, and Low 68). The comprehensive annotation of each category-specific rcntSNV is given in Supplementary_info_file.docx, Supplementary Tables 2-6. A total of seven rcntSNVs, five missense and two synonymous, were shared among all UVindex categories, with at least 3,217 overall recurrences, suggesting that these common rcntSNVs are conserved and near fixation (the rcntSNVs-based comparison is shown in Figure 6A). Of the seven shared rcntSNVs, ORF1ab 14159C>T (missense; Pro4720Leu) is the most common rcntSNV, found in the RNA-dependent RNA polymerase.
Functional annotation of all 380 CaSp-rcntSNVs revealed a total of 420 category-specific rcntSNV effects (CaSp-rcntEffs) on gene products (Extreme 64, Very_High 68, High 120, Moderate 94, and Low 74). Among all genes, ORF1ab harbors the highest number of CaSp-rcntEffs (234), followed by the four structural genes (103) and the six accessory genes (73). The detailed CaSp-rcntEffs load per gene for each UVindex category is given in Table 2.
Approximately 69.4% (154/222) of the overall CaSp-rcntMissense effects are detected in the UVindex range ≤7 (UVindex categories: Low 35, Moderate 46, and High 73), whereas the remaining 30.6% (68/222) are observed in the Extreme UVindex (36) and Very_High UVindex (31) categories (for details, see Figures 6B-E). The negatively sloped linear trend line against the UVindex implies that the CaSp-rcntMissense effects count is inversely related to the UVindex, suggesting that a higher UVindex (mostly ≥8) allows significantly fewer SARS-CoV-2 viral strains to survive and hence imposes strong negative selection pressure (Figure 7). The set of all rcntSNVs causing category-specific rcntMissense effects may serve as a potential resource for considerably more effective region-specific vaccine production.
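The direction of the trend described above can be checked with an ordinary least-squares slope over the per-category CaSp-rcntMissense counts. This is an illustrative sketch, not the fit behind Figure 7: the categories are coded by ordinal rank (Low = 1 to Extreme = 5) because per-category UVindex values are not restated here, and that coding is an assumption.

```python
def least_squares_slope(xs, ys):
    """Slope of the ordinary least-squares line through the points (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

# CaSp-rcntMissense counts per category as stated in the text,
# ordered from the lowest to the highest UVindex category.
CASP_MISSENSE = {"Low": 35, "Moderate": 46, "High": 73, "Very_High": 31, "Extreme": 36}

# Assumption: ordinal ranks 1..5 stand in for the UVindex ranges.
RANKS = list(range(1, len(CASP_MISSENSE) + 1))
SLOPE = least_squares_slope(RANKS, list(CASP_MISSENSE.values()))
```

Under this coding the slope is negative, in agreement with the direction of the reported trend, although the counts themselves peak in the High category rather than falling monotonically.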

Phylogeny
To find the evolutionary relationship between SARS-CoV-2 populations prevailing in different UVindex regions, we constructed a phylogenetic tree based on high-quality whole-genome sequences of 125 randomly selected SARS-CoV-2 samples, 25 from each of the UVindex categories (Figure 8).
Our phylogenetic analysis revealed five different branches for the 125 randomly selected high-quality SARS-CoV-2 genome samples (25 from each UVindex region). The tree displays separate branches for SARS-CoV-2 retrieved from the UVindex regions High (orange), Extreme (purple), Low (green), Very_High (red), and Moderate (yellow). The phylogenetic analysis placed the SARS-CoV-2 population of the High UVindex region as an outgroup and the SARS-CoV-2 populations prevailing in the Extreme, Low, Very_High, and Moderate UVindex regions as ingroup populations.
The four ingroup SARS-CoV-2 populations fall into three main lineages, revealing the extent of the relationships between different populations. The Extreme and Low UVindex populations are placed in two separate ingroup lineages, and the SARS-CoV-2 populations from the Very_High and Moderate UVindex regions share the third lineage. This relationship suggests that all SARS-CoV-2 samples included in this study are descended from the populations of the High UVindex region, whereas the SARS-CoV-2 populations from the Very_High and Moderate UVindex regions are closely related to each other.

Conclusion
SARS-CoV-2 is the coronavirus that causes the COVID-19 pandemic, which has posed a great threat to human health in almost all regions of the world. The genome-wide analysis of the rapidly evolving SARS-CoV-2 genomes found a large majority of the rcntSNVs to be distinctive (found uniquely in a specific UVindex region), revealing the differential genomic responses of SARS-CoV-2 to the five WHO-defined UVindex regions. Based on the total number of rcntSNVs predicted in all included SARS-CoV-2 genomes, our analysis showed that the Extreme UVindex applies negative selection pressure, whereas a UVindex range of 6-7 provides the most suitable conditions for SARS-CoV-2 endurance. The phylogenetic relationship indicated the SARS-CoV-2 population of the High UVindex region as the recent progenitor of all included samples. To aid immune evasion and to tolerate the nucleic acid-damaging effects of varying solar UV radiation, SARS-CoV-2 has acquired the highest number of missense rcntSNVs in its spike glycoprotein and nucleocapsid-encoding genes. Since COVID-19 diagnostic tests and vaccines are based on the spike or the nucleocapsid viral proteins, all missense rcntSNVs may need to be included in future diagnostic and vaccine formulations.

Data availability statement
The original contributions presented in the study are included in the article/Supplementary material; further inquiries can be directed to the corresponding author.