Unraveling the shift in bacterial communities profile grown in sediments co-contaminated with chlorolignin waste of pulp-paper mill by metagenomics approach

Pulp-paper mills (PPMs) consistently generate a wide variety of pollutants that are often unidentified and highly resistant to environmental degradation. The current study investigates changes in the profile of indigenous bacterial communities growing in sediment co-contaminated with organic and inorganic pollutants discharged from PPMs. Two sediment samples, designated PPS-1 and PPS-2, were collected from two different sites. Physico-chemical characterization of PPS-1 and PPS-2 revealed the presence of heavy metals (mg kg−1) such as Cu (0.009–0.01), Ni (0.005–0.002), Mn (0.078–0.056), Cr (0.015–0.009), Pb (0.008–0.006), Zn (0.225–0.086), Fe (2.124–0.764), Al (3.477–22.277), and Ti (99.792–45.012), along with high contents of chlorophenol and lignin. Comparative analysis of organic pollutants in the sediment samples using gas chromatography–mass spectrometry (GC–MS) revealed major highly refractory compounds, such as stigmasterol; β-sitosterol; hexadecanoic acid; octadecanoic acid; 2,4-di-tert-butylphenol; heptacosane; dimethyl phthalate; hexachlorobenzene; 1-decanol,2-hexyl; and furane 2,5-dimethyl, which are reported as potentially toxic compounds. Simultaneously, high-throughput sequencing targeting the V3–V4 hypervariable region of the 16S rRNA gene identified 1,249 and 1,345 operational taxonomic units (OTUs), derived from a total of 115,665 and 119,386 sequence reads, in PPS-1 and PPS-2, respectively. Analysis of rarefaction curves indicated a difference in OTU abundance between PPS-1 (1,249 OTUs) and PPS-2 (1,345 OTUs).
Furthermore, taxonomic assignment of the metagenomic sequence data showed that Proteobacteria (55.40%; 56.30%), Bacteroidetes (11.30%; 12.20%), and Planctomycetes (5.40%; 4.70%) were the most abundant phyla, while Alphaproteobacteria (20.50%; 23.50%), Betaproteobacteria (16.00%; 12.30%), and Gammaproteobacteria were the most frequently recorded classes in PPS-1 and PPS-2, respectively. At the genus level, Thiobacillus (7.60%; 4.50%) was the most abundant genus in the sediment samples. The results indicate significant differences in both the diversity and relative abundance of taxa in the bacterial communities associated with PPS-2 when compared to PPS-1. This study unveils key insights into contaminant characteristics and shifts in bacterial communities within contaminated environments. It highlights the potential for developing efficient bioremediation techniques to restore ecological balance in areas polluted by pulp-paper mill waste, and stresses the importance of identifying the significant percentage of unclassified genera and species in order to explore novel genes.


Introduction
The pulp-paper mill (PPM) sector is a significant industrial segment that uses large quantities of lignocellulosic feedstock and freshwater throughout the paper manufacturing process. Although this sector plays a significant role in the world's economy, it is also known to be one of the largest producers of wastewater worldwide, generated during successive industrial operations (Bui et al., 2022; Namdarimonfared et al., 2023). Among the different stages of the papermaking process, wastewater from the bleaching process is probably the major problem of PPMs (Oke et al., 2017; Kumar et al., 2020; Coimbra et al., 2021). It is estimated that 273 to 450 m³ of freshwater is required to make 1 tonne of paper in PPMs, and about 60-300 m³ of bleached wastewater is generated (Mittar et al., 1992). Based on a survey report, approximately 3 billion cubic meters of wastewater are discharged annually from PPMs in the 21st century (FPAC, 2009). There are about 900 PPMs spread across different states of India (Central Pulp Paper Research Institute, 2021).
Bleaching of pulp with chlorine-based chemicals, which removes the characteristic brown color of lignin to whiten the pulp, is still practiced in PPMs in India (Kaur et al., 2018; Rajput et al., 2020) and many other developing countries (Liang et al., 2021). As a consequence, a myriad of chloroorganic compounds, such as chlorolignin, chlorophenols, chlorohydrocarbons, chlorolignosulphonic acids, chlorinated resin acids, dibenzo-p-dioxin, and dibenzofuran, which are formed inadvertently as adsorbable organic halides (AOX), are discharged in the bleaching wastewater stream (Deshmukh et al., 2009; Zhang et al., 2020; Kumar and Verma, 2023). Bleach wastewater exhibits high levels of chemical oxygen demand, suspended solids, dissolved lignin, color, and a blend of various organic compounds (Sharma et al., 2014; Kumar A. et al., 2021; Liang et al., 2021). Recent reports have brought to light the presence of per- and polyfluoroalkyl substances (PFASs) in the effluent wastewater of PPMs (Chow and Foo, 2023). In addition, inorganic chemical contaminants, including various heavy metals, are concomitantly discharged in bleached wastewater (Bhatti et al., 2021; Singh et al., 2021). Organic substances, including AOX, have become a significant and growing concern. They are characterized by their remarkable persistence in open environments over extended periods, high mobility, and the ability to traverse vast distances within ecosystems. Eventually, these substances reach the food chain and accumulate in the adipose tissues of the human body (Costigan et al., 2012; Castro et al., 2018; Balabanič et al., 2021). This not only presents a serious health hazard but also inflicts irreparable damage on pristine ecosystems, biodiversity, soil quality, and water quality. Moreover, non-degradable heavy metals and organometallic pollutants also pose an imminent threat to human health (McElroy et al., 2011; Mitra et al., 2022).
The disposal of bleached wastewater from PPMs into freshwater bodies, prior to the implementation of wastewater treatment processes, has led to the accumulation of a substantial volume of solids in the sediments of the terrestrial ecosystem (Yadav and Chandra, 2018; Tripathy et al., 2022). These deposited solids typically include wood fibers, organochlorine compounds, heavy metals, papermaking fillers, pitch, lignin by-products, and ash. Previous studies have reported that wastewater discharged from PPMs during successive industrial operations at the primary and secondary levels still retains an extreme load of multifaceted pollutants, which significantly deteriorate the quality of the environment if discharged back into it (Orrego et al., 2010, 2019; Yuliani et al., 2019; Sharma et al., 2021a). However, there is still a lack of detailed knowledge regarding the specific nature of the pollutants discharged from PPMs.
It has been reported that indigenous microbial communities, predominantly composed of bacteria, play pivotal roles in the recycling of organic matter, including the biodegradation of organic compounds, and contribute to the sustainable eco-restoration of contaminated sites (Guo et al., 2017; Selvi et al., 2021; Muthukumar et al., 2022; Chen et al., 2023). Wang et al. (2023) demonstrated that colonized functional microbes carried by agro-industrial waste played an important role in co-pollutant removal. Prakash et al. (2021) highlighted the bioreduction of hexavalent chromium Cr(VI) to trivalent chromium Cr(III) by bioelectrokinetic techniques. Next-generation sequencing (NGS) data analysis revealed the presence of Proteobacteria, Firmicutes, Bacteroidetes, Actinobacteria, and Planctomycetes in the bio-electrokinetic system. Proteobacteria are responsible for the bioreduction of Cr(VI) by the formation of FeS particles. Yang X. et al. (2023) explored the environmental pollution behavior and fate of ammunition soil and the microbial remediation of trinitrotoluene (TNT) and its intermediates, such as 2-amino-4,6-dinitrotoluene (2-ADNT), 4-amino-2,6-dinitrotoluene (4-ADNT), and 2,4-diamino-6-nitrotoluene (2,4-DANT). The abundance of Sphingomonadaceae, which show key tolerance to and degradation of TNT, was significantly upregulated. However, organic compounds and heavy metals can impact the growth and survival of microbes in tailings soil and sludge, leading to variations in community structure and diversity (Yergeau et al., 2012; Kumar and Chandra, 2020; Tong et al., 2021; Mapelli et al., 2022; Lee et al., 2023; Yang L. et al., 2023). Contamination with chlorolignin compounds and heavy metals can have adverse effects on the diversity, survival, growth, and other ecological functions of soil microbial communities. An exploration of bacterial structures and community functions can provide insights into how microbial communities respond to variations in contaminant levels within an open environment. Thus, detailed information on the structure and functions of the indigenous bacterial communities surviving in chlorolignin waste contaminated sites will provide insight into possible strategies for decontamination and eco-restoration of polluted sites employing bacterial communities.
Abbreviations: DCM, Dichloromethane; EDCs, Endocrine-disrupting chemicals; GC-MS, Gas chromatography-mass spectrometry; HTS, High-throughput sequencing; NGS, Next-generation sequencing; OTU, Operational taxonomic unit; POPs, Persistent organic pollutants; PPM, Pulp and paper mill; PPMW, Pulp paper mill wastewater; RT, Retention time.
Frontiers in Microbiology, doi: 10.3389/fmicb.2024.1350164 (frontiersin.org)
To date, only a limited number of microbial agents have been identified that can effectively degrade and treat wastewater or pollutants discharged from PPMs (Fernandes et al., 2014; Sonkar et al., 2019; An et al., 2021; Hajdu-Rahkama and Puhakka, 2022; Yue et al., 2023). This scarcity of microbial agents capable of treating paper mill wastewater may be attributed to challenges in isolating and culturing these microbial species under controlled conditions. Culturing has been broadly employed as a direct and efficient technique for growing and characterizing the microbial community thriving in complex environments (Tripathi et al., 2022). However, a significant portion of microbes in natural environments cannot be cultured in artificial laboratory media, and the diversity of uncultured microbes is extensive.
Over the last decade, an array of molecular methods, such as terminal restriction fragment length polymorphism (T-RFLP) (Osborn et al., 2000), denaturing gradient gel electrophoresis (DGGE) (Muyzer and Smalla, 1998), and fluorescent in situ hybridization (FISH) (Moter and Göbel, 2000), has greatly advanced our understanding of the microbial community. However, for a multifaceted environment with vast genetic diversity, these conventional methods fall short of providing a comprehensive view of the bacterial community, offering only limited information about microbial populations (Ranjard et al., 2000; Prada et al., 2022). In recent decades, thanks to progress in second-generation sequencing and computational biology methods, it has become feasible to obtain insights from previously uncultured beneficial microbial communities. It is now possible to scrutinize changes in microbial communities within environmental samples at the molecular level, with an unprecedented level of detail, through metagenomic analysis using NGS technology.
Metagenomics, a robust approach based on high-throughput sequencing (HTS), serves as a powerful tool for the accurate mining of novel microbes and functional genes from environmental samples without identifying them individually (Han et al., 2020; Fan et al., 2022). In addition, it can overcome the shortcomings of conventional microbial community analysis methods and unveil microbial community interactions and their function in contaminated environments (Abia et al., 2018; Qian et al., 2022). Only a limited number of studies have explored the structure and composition of microbial communities growing in environments contaminated with pulp paper mill wastewater (PPMW). Sharma et al. (2021b) used a metagenomics approach to profile the abundance and structure of the microbial community growing in PPMW, which contained a blend of toxic metals and organic chemicals. Tripathi et al. (2022) characterized the wide array of persistent organic pollutants (POPs) that influence the culturable and unculturable bacterial communities in sludge of PPMs discharged after secondary treatment. In a recent study, Sharma et al. (2023) used 16S rRNA analysis to quantify the microbial communities thriving in activated sludge of PPMs containing lignin and chlorophenol. While various efforts have been made to clean up environments polluted by pulp and paper mill discharge by focusing on indigenous microorganisms, the quest for an efficient and cost-effective solution continues (Sonkar et al., 2019; Kumar et al., 2022). Surprisingly, there have been no initiatives to investigate the profile of the bacterial communities within the sediments that accumulate over time in ecosystems receiving discharges from PPMs. Exploring how indigenous microbial communities respond can provide valuable insights for researchers in their search for eco-friendly solutions to combat soil and/or water pollution.
The present study examined variations in the diversity and composition of the bacterial communities growing within two sediment samples. These changes are believed to be the driving force behind the improvement in pollutant conditions, ultimately fostering biological succession and bioremediation. This, in turn, holds promise for the ecological restoration of sites burdened by a heavy load of wastewater pollutants discharged from PPMs. The specific objectives of this study were to: (1) analyze the impact of discharged wastewater and its pollutants on the bacterial community in the sediment samples; (2) detect and characterize the broad range of refractory organic pollutants, extracted with two different solvents, by gas chromatography-mass spectrometry (GC-MS); (3) identify the prevalent indigenous bacterial community, unveiling the microbial niche within this contaminated environment, by a metagenomic approach; and (4) discuss the relationship between the bacterial community and the pollutant-rich environment. The findings presented in this study offer valuable insights that contribute to understanding the mechanisms involved in both ex situ and in situ bioremediation, facilitating the safe disposal of such waste.

Chemicals and reagents
In this study, HPLC grade organic solvents such as ethyl acetate, dichloromethane (DCM), and methanol were obtained from Merck (Merck India) and had a purity exceeding 99%. Ethyl acetate and DCM were employed for the recovery of organic pollutants from the sediment samples. Additionally, other chemicals employed in the derivatization process, including BSTFA (N,O-bis(trimethylsilyl)trifluoroacetamide), dioxane, and pyridine, were sourced from Sigma-Aldrich (Saint Louis, MO, USA). For the preparation of digestion mixtures to extract heavy metals from the sediments, concentrated hydrochloric acid (HCl; 37%) and nitric acid (HNO₃; 69.5%) were procured from HiMedia Laboratories (Maharashtra, India).

Collection of samples
Star Paper Mill Ltd., an integrated paper mill established in 1938 and located in Saharanpur, Uttar Pradesh, India, is one of the leading sellers of industrial and craft paper, printing paper, writing paper, packaging, and cultural papers. The mill employs woody raw feedstock, including poplar, eucalyptus, and veneer chips, for the production of a diverse range of high-grade papers (Dagar et al., 2022). In the paper manufacturing process, the Kraft method is used to make pulp from wood chips, followed by chlorine bleaching to produce white paper. The mill is equipped with a comprehensive effluent treatment plant that employs the activated sludge process, consisting of a primary clarifier, aeration basin, secondary clarifier, and sludge dewatering. After secondary treatment, the mill discharges its wastewater through a covered canal into an open area located in Paragpur village of the Saharanpur District, India (Figure 1). Due to water scarcity, the discharged wastewater is used by the farmers of this village to irrigate agricultural lands, which is a common practice in developing nations including India (Medhi et al., 2011; Singh et al., 2012; Mishra et al., 2023). In a 2012 field study by Kumar and Chopra, wastewater [...] total Kjeldahl nitrogen (TKN), calcium (Ca²⁺), potassium (K⁺), sodium (Na⁺), carbonate (CO₃²⁻), bicarbonate (HCO₃⁻), organic carbon, and various metals in the soil (Kumar and Chopra, 2012). The discharge site is located only about 2 km from the mill and is one of the highly contaminated areas of Saharanpur, as reported by earlier researchers (Kumar et al., 2019; Singh et al., 2021). The wastewater discharged through the covered canal finally mixes with the water of the Hindon River, which in turn joins the Yamuna River (Sharma et al., 2021c; Dagar et al., 2022). The effluent discharge site selected for sample collection was denoted Site-1, while the second site (Site-2) was 1,000 m away from Site-1, as shown in Figure 1. Triplicate sediment samples were randomly collected in sterile plastic containers (1 L) using an excavator bucket from each sampling site at a depth of 0-10 cm. The depth of soil sampling significantly affects the abundance and diversity of the microbial community (Böer et al., 2009; Mendes and Tsai, 2014; Fu et al., 2023). The 0-10 cm depth range for collecting sediment samples was selected based on the findings of earlier researchers (Zhao et al., 2021; Beule et al., 2022). The randomly collected samples were mixed to obtain one composite sample from each site, Site-1 and Site-2, designated PPS-1 and PPS-2, respectively. The collected sediment samples were maintained at 4°C and promptly transported in an ice-packed box to the laboratory for subsequent investigation. A temperature of 4°C is recommended to inhibit bacterial growth and decelerate metabolic processes, ensuring a dependable metagenomic analysis within 24 h of collection (Vandeputte et al., 2017).

FIGURE 1
Map indicating the geographical location of the two sample collection sites situated in Paragpur village, Saharanpur, Uttar Pradesh, India. The red stars represent the two sampling points from which the samples were collected.

Physico-chemical characterization of samples
Samples taken from the two sites (Site-1 and Site-2), as depicted in Figure 1, were analyzed to determine their physico-chemical characteristics and pollutant load, including pH, electrical conductivity (EC), potassium (K⁺), sodium (Na⁺), total organic carbon (TOC), etc., as per the standard methods described earlier (Kumar and Chandra, 2020; Kumar V. et al., 2021). EC and pH were measured in 1:2.5 sediment-water suspensions using a conductivity meter and a digital pH meter, respectively. The total concentrations of sodium (Na⁺), chloride (Cl⁻), and sulfate (SO₄²⁻) were quantified based on the procedure outlined by Kalra and Maynard (1991). Total phenol content in the samples was quantified using the 4-aminoantipyrine reaction method (Ettinger et al., 1951). Lignin content was estimated using the methodology described by Pearl and Benson (1990). The concentrations of chlorophenol in the samples were determined following standard procedures (Choudhary et al., 2012).

Heavy metal analysis
The heavy metal contents in the sediment samples were estimated via ICP-MS, as outlined by Kumar and Chandra (2020). For heavy metal determination, sediment samples underwent digestion using the HNO₃-H₂O₂ digestion method (3050B of USEPA, 1996). After digestion, the volume was adjusted with ultra-pure deionized water and filtered through Whatman filter paper No. 42 prior to analysis. Subsequently, ICP-MS was used to determine the total concentration of various heavy metals in the filtered, digested samples.

Sample pre-processing and solid-liquid extraction
A 25 g sediment sample was weighed and placed into an Erlenmeyer flask (250 mL), and 100 mL of distilled water was added to each flask separately. Thereafter, the mixture was continuously and vigorously agitated at room temperature in a rotary incubator shaker for 48 h, and the suspension was allowed to stand for 6 h. After this, the suspension was filtered through Whatman filter paper No. 42. Next, the filtrate was acidified with 1 M H₂SO₄ to pH 2.0. Subsequently, the acidified filtrate was used to extract the organic compounds with two different solvents (separately), namely ethyl acetate and DCM, as per the standard liquid-liquid extraction procedure (Chandra et al., 2017). The choice of ethyl acetate and DCM as solvents for liquid-liquid extraction in sample pre-processing is often based on their selectivity and compatibility with the classes of phenolic compounds typically found in sediment samples (Möder et al., 1997; Kumar, 2014; Azzam and Hazaimeh, 2021; Sharma et al., 2021a). Briefly, 50 mL of acidified supernatant was transferred into a 500 mL separatory funnel. Subsequently, an equal volume of organic solvent was added, and the mixture was vigorously shaken for 15 min. Thereafter, the organic phase was separated from the aqueous phase. This extraction step was repeated three times in succession to extract the maximum number of organic compounds. Subsequently, the organic phases collected from the separatory funnel were combined in a 250-mL beaker, evaporated to near dryness using a vacuum rotary evaporator at ≤40°C, and then dehydrated over anhydrous sodium sulfate to eliminate water traces. Finally, the dried residues were reconstituted in methanol (2.0 mL) and filtered through 0.22-μm Millipore syringe filters. The final dissolved extracts were analyzed by GC-MS. For GC-MS analysis, a silylation reagent mixture consisting of 100 μL BSTFA, 50 μL dioxane, and 50 μL pyridine was added to the extracts (Sonkar et al., 2019).

Gas chromatography-mass spectrometry analysis
GC-MS analysis of the extracts was performed with a gas chromatograph (Agilent 8890; Agilent Technologies, Inc) equipped with an autosampler and coupled to a 5977B Series double-quadrupole Mass Selective Detector (MSD) system (Agilent Technologies, Inc), operated in full-scan mode over a mass range of 30-550 amu to obtain mass spectra. The mass range of 30-550 amu was chosen to encompass a broad spectrum of high molecular weight organic compounds commonly found in environmental samples (Medeiros, 2018; Kumar V. et al., 2021, 2022). This range was considered to cover a diverse array of organic pollutants, including chlorinated compounds discharged in wastewater that may be present in sediment matrices. In the analysis, a 1.0 μL aliquot of the extract was used. Organic compounds were separated using a DB-5 MS capillary column (0.25 mm × 60 m × 0.25 μm) (Agilent Technologies, Inc) containing 5% phenyl-methylpolysiloxane as the stationary phase. The injector and detector were both operated at 280°C. The GC oven was programmed as follows: 70°C (held 2 min), 6°C min−1 up to 230°C (held 2 min), then 6°C min−1 up to 280°C (held 20 min). Helium with a purity of 99.99% served as the carrier gas, and the column flow was maintained at 1.1 mL min−1. The total chromatographic run was 60 min. MassHunter GC/MS Acquisition 10.1.49 (Agilent Technologies, Inc) was used for instrument control, data acquisition, and evaluation. The organic compounds eluted from the GC were identified by comparing their mass spectra with entries in the National Institute of Standards and Technology (NIST) mass spectral library.
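As a quick consistency check on the oven program described above, the ramp and hold segments can be summed. The sketch below is illustrative only; the function name and the tuple layout are assumptions made here, not part of the instrument method. Under these stated values the segments add up to about 59 min, which is close to the reported 60 min run.

```python
# Illustrative check of the GC oven program stated in the text.
# Each segment is (target temperature in °C, ramp rate in °C/min or None, hold in min).
# The data layout and function name are assumptions for this sketch.

def oven_run_time(start_temp_c, segments):
    """Return total program time in minutes for a ramp/hold oven program."""
    total_min = 0.0
    current = start_temp_c
    for target, rate, hold in segments:
        if rate:  # ramp from the current temperature to the target
            total_min += (target - current) / rate
        total_min += hold  # isothermal hold at the target
        current = target
    return total_min

program = [
    (70, None, 2),   # initial 70°C, held 2 min
    (230, 6, 2),     # 6°C/min to 230°C, held 2 min
    (280, 6, 20),    # 6°C/min to 280°C, held 20 min
]

total = oven_run_time(70, program)
print(f"Programmed oven time: {total:.1f} min")
```

The small gap between the computed value and the reported run time may reflect rounding or an equilibration step, which the text does not specify.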

Characterization of bacterial communities in sediments

Preparation of samples for NGS analyses: DNA extraction and amplification
The metagenome was recovered from 1 g of pre-weighed, representative, homogenized sediment sample under sterile conditions using the NucleoSpin® Soil kit (Macherey-Nagel, GmbH & Co. KG) following the manufacturer's instructions. The quantity and quality of the metagenomic DNA were assessed by measuring the absorbance at 260 nm and 280 nm using a NanoDrop 2000 Spectrophotometer (ThermoFisher Scientific, Massachusetts, USA). Subsequently, gel electrophoresis was conducted using a 1.2% agarose gel as an additional quality control step for the extracted DNA. This was undertaken to identify and evaluate the presence of any contaminants that could potentially interfere with the activity of enzymes necessary for subsequent analyses. The criteria typically used to assess the quality of DNA on a gel are size estimation, band intensity, single band presence, band sharpness and resolution, and the absence of RNA contamination and high molecular weight smearing. Thereafter, the V3 and V4 hypervariable regions of the 16S rRNA gene were amplified with the universal bacterial forward primer 341F (5′-CCTACGGGNGGCWGCAG-3′) and reverse primer 805R (5′-GACTACHVGGGTATCTAATCC-3′). The PCR (polymerase chain reaction) was conducted using a Perkin Elmer Thermocycler following this procedure: initial denaturation at 98°C for 3 min, followed by 27 cycles of denaturation at 98°C for 15 s, annealing at 50°C for 30 s, and extension at 72°C for 30 s. The process concluded with a final extension step at 72°C for 5 min. The first amplicons were checked on a 1.2% agarose gel (2 μL sample), then purified and quality checked again, after which the sequencing library was built.
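The two primers above contain IUPAC degenerate bases (N, W, H, V), so each primer is really a small pool of concrete sequences. The following sketch, which is not part of the original workflow, expands each primer to show the pool size. It assumes the hyphen printed inside the reverse primer is a line-break artifact, and the helper name `expand_primer` is hypothetical.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

def expand_primer(primer):
    """Return every concrete sequence encoded by a degenerate primer."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]

# Primer sequences as given in the text (5' to 3').
fwd_341f = expand_primer("CCTACGGGNGGCWGCAG")
rev_805r = expand_primer("GACTACHVGGGTATCTAATCC")

print(len(fwd_341f), "forward variants")  # N (4 options) x W (2 options)
print(len(rev_805r), "reverse variants")  # H (3 options) x V (3 options)
```

Expanding the pool this way can help when checking primer coverage against reference 16S sequences in silico.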

Library construction, and high-throughput sequencing
The Illumina paired-end multiplexed sequencing library was constructed using the Nextera XT Index kit (Illumina Inc.) following the standard protocol (part#15044223 Rev. B.). The amplicon library was then purified with AMPure XP beads and quantified using a Qubit® 3.0 Fluorometer (Thermo Fisher Scientific, Waltham, USA). The amplified library was analyzed on a 4200 TapeStation System (Agilent Technologies, Santa Clara, USA) with D1000 ScreenTape for quality control, following the manufacturer's instructions. After the Qubit concentration of the libraries and the mean peak size from the TapeStation profile had been determined, the libraries were loaded onto the MiSeq platform (MiSeq PE300; Illumina, San Diego, USA) at an appropriate concentration (10-20 pM) for cluster generation. A 2 × 300 paired-end sequencing run was performed on the MiSeq sequencer to generate the raw reads.
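The text does not state how the fluorometric concentration and mean fragment size were combined to reach the 10-20 pM loading range. A commonly used conversion, shown here only as a sketch, treats double-stranded DNA as roughly 660 g/mol per base pair. The input values below (2.0 ng/µL, 600 bp, 15 pM target) are hypothetical and do not come from this study.

```python
# Hedged sketch: converting a dsDNA library concentration to molarity.
# Assumes an average mass of 660 g/mol per base pair; inputs are hypothetical.

AVG_MASS_PER_BP = 660.0  # g/mol per base pair of double-stranded DNA

def library_nanomolar(conc_ng_per_ul, mean_size_bp):
    """Convert ng/µL and mean fragment size (bp) to nM."""
    return conc_ng_per_ul / (AVG_MASS_PER_BP * mean_size_bp) * 1e6

def dilution_factor(stock_nm, target_pm):
    """Fold dilution needed to bring a stock (nM) down to a target (pM)."""
    return stock_nm * 1000.0 / target_pm

stock = library_nanomolar(2.0, 600)   # hypothetical library
factor = dilution_factor(stock, 15)   # hypothetical target within 10-20 pM

print(f"Stock: {stock:.2f} nM, dilute about {factor:.0f}-fold for 15 pM")
```

In practice the dilution is carried out in stages alongside a denaturation step, following the sequencer manufacturer's guidance.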

Illumina data processing and microbiota characterization
Firstly, the quality of the raw sequencing data generated by the Illumina MiSeq sequencing platform was assessed, and quality trimming (Q > 30) and length trimming were conducted to obtain high quality reads for subsequent analysis. To ensure data reliability, low-quality reads (those with more than 10% of bases below a quality value (QV) of 20 on the Phred scale) were eliminated using the Trimmomatic (v0.38) software (Bolger et al., 2014). The high-quality (QV > 20) paired-end reads were used for read assembly. The filtered metagenomic reads were used for taxonomical assignment. Operational taxonomic unit (OTU) analysis of the high quality paired-end FASTQ reads from Illumina sequencing was performed using the software package Quantitative Insights into Microbial Ecology 2 program (QIIME v1.8.0) (Caporaso et al., 2010). The closed-reference OTU picking method was employed, and sequences were searched against the Greengenes database (version 13_8). Furthermore, the UCLUST classifier was employed within the QIIME platform to group the high-quality sequencing data into OTUs at a 97% similarity threshold. The taxonomic classification of each 16S rRNA gene sequence was performed using the Ribosomal Database Project (RDP) Classifier algorithm, and the representative OTU sequences were compared to those in the Silva (SSU132) 16S rRNA database. Alpha diversity indices, including the Shannon and Simpson diversity indices, were obtained using QIIME (Feranchuk et al., 2018). Stacked bar plots and rarefaction curves were generated using the R package Phyloseq (McMurdie and Holmes, 2013) and Microsoft Excel (Microsoft Office® version 2018). Rarefaction curves were created using the Shannon and Observed-OTU indices through QIIME (version 1.7.0) and visualized using R software (version 2.15.3). Venn diagrams were generated to depict the number and similarity of OTUs using a Venn plotter available at http://bioinformatics.psb.ugent.be/webtools/Venn/. The abundance distribution of dominant bacterial genera among all samples was displayed in the species abundance heat map. The OTU heat map displays the raw OTU count per sample, where the counts are colored based on the contribution of each OTU to the total OTU count present in that sample. Heatmap analysis, focusing on the most abundant OTUs in the entire libraries, was performed using the heatmap.2 function within the R package gplots (version 3.1.0). The taxonomic composition of PPS-1 and PPS-2 was visualized using a Krona chart generated through the Krona web application, available at https://github.com/marbl/Krona/wiki. A workflow representing the major steps of the bioinformatic analysis of the high-throughput sequence data is depicted in Supplementary Figure S1.
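To make the Shannon and Simpson indices mentioned above concrete, the sketch below computes both from a vector of OTU counts. It is a plain-Python illustration rather than the QIIME implementation; the log base 2 for Shannon and the 1 − Σp² form for Simpson are conventions assumed here, and the counts are hypothetical.

```python
import math

def shannon_index(otu_counts):
    """Shannon diversity H = -sum(p_i * log2(p_i)) over non-zero OTUs."""
    total = sum(otu_counts)
    return -sum((c / total) * math.log2(c / total) for c in otu_counts if c > 0)

def simpson_index(otu_counts):
    """Simpson diversity 1 - sum(p_i^2): chance two random reads differ in OTU."""
    total = sum(otu_counts)
    return 1.0 - sum((c / total) ** 2 for c in otu_counts)

# Hypothetical OTU count vectors (not data from this study).
even_sample = [10, 10, 10, 10]     # four equally abundant OTUs
skewed_sample = [37, 1, 1, 1]      # one dominant OTU

for name, counts in [("even", even_sample), ("skewed", skewed_sample)]:
    print(name, round(shannon_index(counts), 3), round(simpson_index(counts), 3))
```

Both indices fall when a single OTU dominates, which is why they are reported alongside the raw OTU counts when comparing samples.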

Heavy metal content
The average concentrations of the analyzed metals in the sediment samples are shown in Supplementary Table S1; certain metals, such as titanium (99.792 ± 0.016 mg kg−1), aluminum (3.477 ± 0.006 mg kg−1), and iron (2.127 ± 0.002 mg kg−1), displayed high concentrations in PPS-1. Of the 17 metals analyzed in PPS-1, vanadium and molybdenum were below the detectable limit. In contrast to PPS-1, the values of various heavy metals determined in the PPS-2 sample were slightly lower, as tabulated in Supplementary Table S1. Therefore, further investigations are imperative to characterize the organic pollutants sequestered in the sediment samples. Profiling of organic pollutants in sediment samples will be useful for the development of effective and novel bioremediation technologies aimed at remediating sites polluted with chlorolignin waste, with consideration for environmental sustainability.

Ethyl acetate extracts
In the GC-MS technique, the many organic compounds found in the extracts are separated on the basis of retention time (RT), resulting in several minor and major peaks in the chromatogram, as shown in Figure 2. The organic compounds found in the samples were identified as trimethylsilyl (TMS) derivatives by comparing their mass spectra at each RT with those of compounds available in the NIST library, and the closest matches are presented in Table 2. From Figure 2A, the major peaks detected in the ethyl acetate extract of PPS-1 at RTs of 21.614 and 48.644 min corresponded to trans-2,4-dimethylthiane, S,S-dioxide and furane 2,5-dimethyl, respectively. Furthermore, multiple minor peaks were observed at different RT values, and the respective compounds are presented in Table 2. In contrast to the PPS-1 extract, GC-MS analysis of PPS-2 extracted with ethyl acetate displayed predominant peaks at RT 13.940, 17.540, 20.795, 23.616, 28.457, 30.574, 32.961, 34.963, 36.508, 37.790, 38.923, 41.750, and 48.662 min (Figure 2B), which were identified as dodecamethy, tetradecamethyl; nonadecane, 2-methyl; furane 2,5-dimethyl; tetradecanoic methyl ester; dimethyl phthalate; hexadecanoic acid; benzene acetic acid; propanoic acid; benzoic acid; octadecane, 3-ethyl-5-(2-) and β-sitosterol, respectively. In addition to the mentioned peaks, very minor peaks were also detected at RTs of 6.456, 9.889, 22.615, 27.353, 43.689, 46.190, and 53.174 min, as illustrated in Figure 2B. These minor peaks were, respectively, identified as spiro[2.4]heptane; 1-ethenyl-5-(1-propenylidene); 1-phenyl-1H-tetrazol-5-, trans-2,4-dimethylthiane, S,S-dioxide; benzene dicarboxylic acid; stigmasterol; heptacosane; and hexachlorobenzene. Tables 2, 3 list the organic compounds detected at various RTs in the ethyl acetate extracts of PPS-1 and PPS-2. The existence of these compounds at different sampling points underscores their persistence, as some of them do not undergo complete degradation during the secondary treatment process.

Dichloromethane extracts
The comparative GC-MS analysis of samples extracted with DCM from PPS-1 and PPS-2 showed the presence of a wide range of organic pollutants, as detailed in Tables 4, 5, respectively. The GC-MS chromatogram of PPS-1 revealed a few dominant peaks at RT 18.020 and 42.968 min (Figure 2C), which were identified using the NIST library as adamantane, 1-isothiocyanato-3-methyl-; and arsenous acid, tris(TMS) ester, respectively. Moreover, the total ion chromatogram (TIC) of PPS-1 showed several minor peaks at RT 26.094, 34.837, 41.303, 45.601, 48.650, and 50.510 min (Figure 2C). These peaks were characterized as dodecanoic acid; 3-methylpyrazol obis(diethylboryl)hydroxide; tris(tert-butyldimethylsilyloxy)arsane; 9,12-octadecadienoic acid(z,z)-2,3-dihydroxypropyl ester; 2 ethyl 4-6 dimethyl-1,3,5-trixane; and ethane, 1,1-diethoxy. Further minor peaks were observed at other RT values, and the corresponding compounds are provided in Table 4. In contrast, the GC-MS chromatogram of the dichloromethane-extracted sample from PPS-2 revealed multiple major and minor peaks at various RT, as illustrated in Figure 2D. These peaks corresponded to different organic compounds, whose identities are provided in Table 5. The major peaks were detected at RT 21.356, 44.691, and 48.650 min and were identified as octadecane, β-sitosterol, and 1-decanol, 2-hexyl, respectively. Additional minor peaks were observed at RT 6.685, 11.835, 18.037, 19.519, 23.084, 26.340, 29.361, 32.531, and 35.341 min, as depicted in Figure 2D. However, these particular peaks could not be identified because their corresponding compounds were not present in the NIST library. In our study, a substantial number of identified compounds were detected in the ethyl acetate extracts of both sediment samples, as summarized in Tables 2, 3.
3.4 Bacterial communities profiling by high throughput sequencing

Isolation of metagenomic DNA and PCR amplification
The DNA samples recovered from PPS-1 and PPS-2 using a soil DNA isolation kit were of high quality. The A260/A230 ratios for PPS-1 and PPS-2 were 1.89 and 1.88, and the A260/A280 ratios were 2.07 and 2.10, respectively. The concentrations of the metagenomic DNA isolated from the samples were 233.3 and 261.8 ng/μl (260 nm). A summary of the quality and quantity of the extracted metagenomic DNA is presented in Supplementary Table S2.

Quality and quantity of MiSeq library
The quality check of the amplified 2 × 300 MiSeq library, analyzed on the 4200 TapeStation system, showed three peaks (minimum, optimum, and upper), as shown in Supplementary Figure S2. The results of the analyzed library are presented in Table 6A.

Metagenomics sequencing reads
In the present study, HTS of the V3 and V4 regions of the 16S rRNA gene derived from PPS-1 and PPS-2 generated a total of 153,675 and 162,691 raw sequence reads, respectively. After trimming low-quality reads, the MiSeq sequencing yielded 115,665 and 119,386 high-quality paired reads with an average length of 250 bases in PPS-1 and PPS-2, respectively, which were further used for de novo assembly. Table 6B provides a summary of the sequence reads that passed each filter. Moreover, a total of 2,594 OTUs were predicted across all the samples using the Greengenes database. OTU cluster analysis identified 1,249 OTUs after clustering at a 97% similarity level from the 115,665 high-quality reads of PPS-1, and 1,345 OTUs were obtained from the unique high-quality read sequences of the PPS-2 sample. The number of OTUs in the contaminated PPS-2 sample was higher than that in the PPS-1 sample (Supplementary Table S3). The high-quality reads of PPS-1 and PPS-2 comprised a total of 56,444,791 and 57,992,884 bases, respectively. Accordingly, the metagenome from PPS-1 yielded 56.44 MB of high-quality (HQ) data, while that from PPS-2 yielded 57.99 MB (Table 6B).
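The read-filtering figures above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch that uses only the read and base counts quoted in the text; the dictionary layout and the function name `summarize` are illustrative assumptions and are not part of the original analysis pipeline.

```python
# Read and base counts as quoted in the text; Table 6B itself is not reproduced here.
samples = {
    "PPS-1": {"raw_reads": 153_675, "hq_reads": 115_665, "hq_bases": 56_444_791},
    "PPS-2": {"raw_reads": 162_691, "hq_reads": 119_386, "hq_bases": 57_992_884},
}


def summarize(stats):
    """Return (percentage of reads retained after filtering, HQ data in megabases)."""
    retained_pct = 100.0 * stats["hq_reads"] / stats["raw_reads"]
    hq_megabases = stats["hq_bases"] / 1_000_000
    return retained_pct, hq_megabases


summary = {name: summarize(stats) for name, stats in samples.items()}

for name, (pct, megabases) in summary.items():
    print(f"{name}: {pct:.1f}% of raw reads retained, {megabases:.2f} MB of HQ data")
```

Under these inputs, roughly three quarters of the raw reads survive quality filtering in each sample, and the megabase totals agree with the 56.44 and 57.99 MB quoted above.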

Abundance of OTU
The rarefaction curves illustrated that the abundance of OTUs varied between PPS-1 and PPS-2, as depicted in Figure 3A. Based on these data, PPS-2 displayed a higher bacterial diversity (richness) than PPS-1. Notably, both rarefaction curves for observed OTUs approached near-saturation (Figure 3A). Additionally, a Venn diagram was constructed to illustrate the overlaps and distinctions between the bacterial communities of the sediment samples, based on the proportion of unique and shared OTUs (Figure 3B). Notable differences were observed in the number of bacterial OTUs between the samples: 1,249 OTUs were identified in PPS-1, while PPS-2 exhibited 1,345 OTUs based on Illumina analysis. Both samples contained unique OTUs, which accounted for 8.6 to 11.6% of the total sequences. Specifically, the number of unique OTUs in PPS-1 and PPS-2 was 279 and 375, respectively. The total number of OTUs shared between the two sediment samples was 970, representing 30.1% of the total number of OTUs observed, which amounted to 2,594. OTUs present in both samples indicate that the same bacterial species occur in both sediment samples.
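The relationship between the per-sample OTU counts, the shared OTUs, and the unique OTUs in a two-set Venn diagram can be sketched as follows. The function name `venn_counts` is a hypothetical helper introduced for illustration; only the three input numbers (1,249, 1,345, and 970) come from the text.

```python
def venn_counts(total_a, total_b, shared):
    """Derive unique and union OTU counts for two samples from totals and overlap."""
    unique_a = total_a - shared
    unique_b = total_b - shared
    union = unique_a + unique_b + shared  # distinct OTUs across both samples
    return {
        "unique_a": unique_a,
        "unique_b": unique_b,
        "shared": shared,
        "union": union,
    }


# Per-sample totals and overlap as reported for PPS-1 and PPS-2.
counts = venn_counts(total_a=1249, total_b=1345, shared=970)
print(counts)
```

The unique counts recovered this way match the 279 and 375 unique OTUs stated above. Note that 1,249 + 1,345 = 2,594, so the total quoted in the text corresponds to the sum of the two per-sample counts, whereas the `union` value in this sketch counts each shared OTU only once.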

Variation in alpha diversity indices
To gain a deeper understanding of bacterial diversity within the samples, α-diversity indices, specifically the Shannon and Simpson indices, were computed based on the number of retrieved OTUs (Feranchuk et al., 2018; Walters and Martiny, 2020). The Shannon and Simpson indices indicate that the sediment PPS-1 had the lower community diversity (Shannon index: 7.99; Simpson index: 0.987), whereas PPS-2 had the higher community diversity (Shannon index: 8.23; Simpson index: 0.991). The details of the observed OTUs and the computed Shannon and Simpson indices are listed in Supplementary Table S3.
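For readers unfamiliar with these indices, the sketch below shows how Shannon and Simpson values are commonly computed from a vector of per-OTU read counts. It is an illustration under stated assumptions, not the authors' pipeline: the base-2 logarithm is assumed because the reported Shannon values (7.99 and 8.23) exceed ln(1,249) ≈ 7.13 and ln(1,345) ≈ 7.20, the maxima attainable with natural logarithms for those OTU counts; the Simpson index is assumed to be the 1 − Σp² form; and `toy_counts` is hypothetical data.

```python
import math


def shannon_index(counts, base=2.0):
    """Shannon index H = -sum(p_i * log(p_i)) over OTUs with non-zero counts."""
    total = sum(counts)
    proportions = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p, base) for p in proportions)


def simpson_index(counts):
    """Simpson diversity index in the 1 - sum(p_i^2) form."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts if c > 0)


# Hypothetical community of four equally abundant OTUs (not study data).
toy_counts = [25, 25, 25, 25]
toy_shannon = shannon_index(toy_counts)
toy_simpson = simpson_index(toy_counts)
print(toy_shannon, toy_simpson)
```

For a perfectly even community of S OTUs the Shannon index equals log(S) and the Simpson index equals 1 − 1/S, which is why both indices rise with richness as well as with evenness.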

Taxonomic profiles of the bacterial community
According to the annotation process, a total of 2,594 OTUs were identified across both sediment samples. These OTUs were further classified into different taxonomic levels, as depicted in Figure 4. Illumina sequencing analysis identified 1,249 OTUs (from 115,665 high-quality reads) in PPS-1 that were taxonomically classified into different phyla, classes, orders, families, genera, and species. The RDP Classifier was used to assign the effective bacterial sequences to different phylogenetic taxa. Notably, the bacterial communities in the sediment samples differed at each taxonomic level, as illustrated in Figure 4.
At the order level, Rhodobacterales was the dominant group in the sediment samples, accounting for 8.80 and 10.80% in PPS-1 and PPS-2, respectively. The other most abundant orders in PPS-1 and PPS-2 were Rhizobiales (7.60 and 7.60%) and Hydrogenophilales (7.60 [...]). At the family level, the relative abundance analysis indicated that Hydrogenophilaceae was the dominant family in PPS-1, accounting for 7.60% (Figure 4D). The second most abundant family in PPS-1 was Rhodobacteraceae (6.70%). In contrast to PPS-1, the most abundant family in PPS-2 was Rhodobacteraceae, with a relative abundance of 8.00%. Other less abundant families in PPS-1 and PPS-2 included Hyphomicrobiaceae, Comamonadaceae, Helicobacteraceae, and Hyphomonadaceae. A substantial portion of the bacteria could not be identified beyond the order and family levels, with the majority categorized as "Unknown." At the genus level, the major genus in PPS-1 and PPS-2 was Thiobacillus (7.60 and 4.50%) (Figure 4E). A significant number of bacteria could not be identified at the genus level; these are labeled "Unidentified," and their relative abundances are displayed in Figure 4E.
At the species level, none of the detected bacterial genera could be assigned to a species. As shown in Figure 4F, a significant portion of the bacteria could not be identified at the species level, with the majority labeled "Unidentified"; their relative abundances are presented in Figure 4F. A summary of the bacteria assigned at different taxonomic levels is given in Table 7 and Supplementary Table S6.
Furthermore, a heatmap of the most abundant OTUs revealed distinctions between the sediment samples from Site-1 and Site-2 (Figure 5). The heatmap separated the sediment samples into distinct clusters based on the correlation between the most abundant OTUs (Figure 5). Krona charts (Figures 6, 7) depict the relative abundance of taxa at various hierarchical levels within the sediment samples.
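The relative abundances reported in this section are percentages of reads assigned to each taxon at a given rank. The sketch below illustrates one way such values could be derived from an OTU table, with reads lacking an assignment at the requested rank pooled as "Unidentified". The function `relative_abundance`, the record layout, and the toy counts are all hypothetical and are not taken from the study's data or software.

```python
from collections import defaultdict


def relative_abundance(otu_table, rank):
    """Collapse OTU read counts to one taxonomic rank and return percentages."""
    totals = defaultdict(int)
    for otu in otu_table:
        # Missing, None, or empty labels are pooled under "Unidentified".
        label = otu["taxonomy"].get(rank) or "Unidentified"
        totals[label] += otu["reads"]
    grand_total = sum(totals.values())
    return {label: 100.0 * reads / grand_total for label, reads in totals.items()}


# Hypothetical toy OTU table; counts and assignments are illustrative only.
toy_otus = [
    {"reads": 40, "taxonomy": {"phylum": "Proteobacteria", "genus": "Thiobacillus"}},
    {"reads": 30, "taxonomy": {"phylum": "Proteobacteria", "genus": None}},
    {"reads": 20, "taxonomy": {"phylum": "Bacteroidetes"}},
    {"reads": 10, "taxonomy": {"phylum": "Planctomycetes", "genus": ""}},
]

phylum_pct = relative_abundance(toy_otus, "phylum")
genus_pct = relative_abundance(toy_otus, "genus")
print(phylum_pct)
print(genus_pct)
```

In this toy table every read has a phylum, but most reads lack a genus, which mirrors the pattern described above in which the "Unidentified" fraction grows at finer taxonomic ranks.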

Variation in physico-chemical parameters
PPMs utilize huge volumes of freshwater in several processing steps, including cleaning, chemical pulping, bleaching, and paper development, simultaneously generating vast volumes of effluent as a by-product (Kumar A. et al., 2021; Conte et al., 2022). The generated effluent is heavily loaded with a multifaceted blend of organic, inorganic, and organometallic pollutants like lignins, tannins, resins, and their derivatives, fatty acids, and sulfur compounds, including chloroorganics like lignins, phenols, hydrocarbons, dioxins, furans, and resin acids (Castro et al., 2018; Kumar and Verma, 2023). Improper discharge of effluent can have serious impacts on the environment, leading to contamination of natural resources and posing serious health risks (Ratia et al., 2012; Orrego et al., 2019; Haq et al., 2022). In this study, two sites were selected for the collection of sediment samples. Notably, significant differences in the physico-chemical properties were detected between PPS-1 (Site-1) and PPS-2 (Site-2), as shown in Table 1. The pH of both sediment samples was found to be alkaline. The elevated pH in the sediment can likely be attributed to residues of sodium hydroxide and sodium sulfide, commonly used in the pulping process, as well as to the high concentrations of ions such as K+, Na+, and Cl−. Furthermore, high EC was found in the sediment samples. EC serves as a valuable indicator of soil health, representing overall salinity. The elevated EC values are likely associated with the presence of hydroxides, carbonates, bicarbonates, and salts, along with ions such as K+, Na+, and Cl−, commonly used in the pulping and bleaching processes. The presence of salt and Cl− may result from byproduct reactions involving sodium sulfide and/or sodium hydroxide during the pulping process.
Consequently, the salinity of the soil/sediment makes it unsuitable for agricultural purposes, as high salt levels can adversely impact crop yields, nutrient availability, and the activity of the microbial community inhabiting the soil. Several authors have reported elevated levels of these cations and anions in wastewater (Eskelinen et al., 2010; Hajdu-Rahkama and Puhakka, 2022) and paper sludge (Chandra et al., 2017; Singh et al., 2020). In India, farmers widely use chloride-containing bleach wastewater for irrigation; chloride is not adsorbed by soil, moves quickly with soil water, is taken up by crops, and ultimately accumulates in the leaves.
Chloride is typically considered more toxic to flora and fauna, including the microbial community, than sulfate. Organic carbon was found in high concentration in PPS-1 and PPS-2. The elevated organic matter content is attributed to the presence of residual chlorolignin fiber material that persists even after secondary treatment. This observation aligns with earlier reports of a high load of various physico-chemical pollutants in discharged paper sludge and sediment (Chandra et al., 2017; Singh et al., 2020). In this study, the chlorophenol, lignin, and total phenol concentrations of PPS-2 were lower than those in PPS-1, as represented in Table 1. The presence of chlorophenols is attributed to higher molecular weight compounds generated as degradation products of lignin and to the substantial bleaching of pulp by chemical agents. The phenol content of sediment sample PPS-1 was recorded as 351 mg kg−1, which was higher than that of PPS-2. Chlorophenols, a common group of toxic pollutants from pulp manufacturing, pose a significant risk to soil, water, and aquatic ecosystems in the food chain, endangering humans and other organisms (Hoovield et al., 1998). The high chlorophenol content suggests that chlorophenol has a toxic impact on the indigenous soil microbial community. Our results are strongly supported by previous observations of lignin, chlorophenol, and total phenol in PPMW (Kumar and Chandra, 2021; Sharma et al., 2021a) and activated sludge (Sharma et al., 2023). This study confirmed that the sediment samples are highly contaminated with the inorganic and organic load of various pollutants discharged in PPMW even after the secondary treatment process. We believe that the biological and chemical profiles of soils in proximity to these informal dumps persistently experience substantial impacts.

FIGURE 5
Heat map showing the relative abundance of major bacterial phyla identified by metagenomic sequencing in the two metagenomic libraries prepared from the metagenomes obtained from sediment samples PPS-1 and PPS-2, collected from different locations. The heatmap legend displays the percentage abundance of phyla in each of the samples.

FIGURE 6
Krona chart showing the taxonomic composition of microbial communities grown in sediment sample PPS-1. The outermost to inner circles represent species, genus, family, order, and phylum levels, respectively. Percent (%) numbers indicate the bacterial abundance.

FIGURE 7
Krona chart showing the taxonomic composition of microbial communities grown in sediment sample PPS-2. The outermost to inner circles represent species, genus, family, order, and phylum levels, respectively. Percent (%) numbers indicate the bacterial abundance.

Heavy metals content in sediments
Elevated levels of heavy metals pose a severe threat to the environment and human health because they are non-degradable and tend to bioaccumulate and biomagnify in the food chain. In our study, Al, Fe, and Ti were found in high concentrations in the sediment samples. The presence of Fe might result from the corrosion of iron vessels and pipes during the alkaline digestion process used in pulping wood chips (Chandra et al., 2017). Our results are consistent with earlier findings, in which excess concentrations of metals were found in pulp-paper mill waste (An et al., 2021; Haq et al., 2022). Yadav and Chandra (2018) examined the physico-chemical characteristics of sediment samples collected from a pulp-paper mill waste dumping site and reported high concentrations of the heavy metals Zn, Fe, Cd, Cr, Mn, Ni, and Cu. Elevated concentrations of these metals can affect soil permeability and texture and, over time, lead to a reduction in the soil microbial community and ultimately in soil productivity (Zhang et al., 2016; Feng et al., 2018; Zhu et al., 2023). Heavy metals like Cr and lead (Pb), along with various phenolic compounds, are categorized as "priority pollutants" by the USEPA (United States Environmental Protection Agency). These substances are known to have significant mutagenic and carcinogenic effects on both animals and humans, posing a serious health risk. The significantly elevated values of the physico-chemical parameters observed in PPS-1 and PPS-2 suggest a substantial pollution load due to discharged wastewater. The exceptionally high levels of these pollution parameters indicate the ecotoxicological and hazardous nature of the sediment.

Identified organic pollutants by GC-MS
GC-MS is a robust analytical tool extensively employed to identify and quantify organic compounds in complex environmental samples (Špánik and Machyňáková, 2018; Maurin et al., 2023). In the present study, GC-MS was used to detect and characterize organic compounds in extracts of PPS-1 and PPS-2 prepared with two different solvents, namely ethyl acetate and DCM. The GC-MS analysis of the ethyl acetate extract from PPS-1 identified over 35 organic compounds, including some that are known mutagens, carcinogens, and environmental endocrine disruptors. Pentadecanone, 1-decanol,2-hexyl, benzene dicarboxylic acid, tetradecanoic methyl ester, octadecane, 3-ethyl-5-(2-), heptacosane, and β-sitosterol constitute the primary components found in the wastewater remaining after secondary treatment. Phenolic compounds and phthalates are documented as potential endocrine-disrupting chemicals (EDCs). Stigmasterol and β-sitosterol, prevalent phytosterols extracted from wood during pulp development processes (Cook et al., 1997; Christianson-Heiska et al., 2008), have the potential to end up in the final discharge of pulp mill effluents. β-Sitosterol is known to induce various endocrine effects in fish, including estrogenic activity such as plasma vitellogenin (VTG) induction (Mellanen et al., 1996; Tremblay and Van Der Kraak, 1998; Orrego et al., 2009), as well as alterations in plasma sex hormone levels and changes in gonadal steroidogenesis (MacLatchy and Van Der Kraak, 1995; Orrego et al., 2010). These chemical pollutants were previously reported to be present in activated sludge (Tripathi et al., 2022), sludge (Chandra et al., 2017), and untreated or treated PPMW (Balabanič et al., 2021; Sharma et al., 2021b). In our study, the GC-MS analysis of the ethyl acetate extract from PPS-2 revealed the presence of pentadecanone, 1-decanol,2-hexyl, benzene dicarboxylic acid, tetradecanoic methyl ester, dimethyl phthalate, hexadecanoic acid, propanoic acid, benzoic acid, octadecane, 3-ethyl-5-(2-), stigmasterol, heptacosane, β-sitosterol, etc. The existence of these compounds is corroborated by previous observations (Chandra et al., 2017; Liang et al., 2021; Tripathi et al., 2022). The identified compounds may pose a significant threat to aquatic organisms due to their cytotoxic, genotoxic, and carcinogenic properties, and several are recognized as potential EDCs by the USEPA (2012).
GC-MS analysis of the DCM extracts from PPS-1 and PPS-2 revealed the presence of several organic compounds. The presence of these organic contaminants suggests that the sediments pose an environmental risk to aquatic flora and fauna at waste disposal sites. Similar compounds have been reported by various researchers (Chandra et al., 2017; Tripathi et al., 2022). Notably, detected compounds such as hexadecanoic acid and octadecanoic acid have been reported as potential EDCs by the USEPA (2012). Octadecanoic acid and hexadecanoic acid have been detected and identified in the final discharge of PPMs by earlier researchers (Yadav and Chandra, 2018; Kumar and Chandra, 2021; Sharma et al., 2021a).
In this study, several other organic compounds were also detected in the ethyl acetate and DCM extracts of PPS-1 and PPS-2. These might be residual pollutants of the pulping and bleaching processes during pulp production. Moreover, some of these compounds may have been produced during microbial biotransformation in the course of effluent treatment, potentially leading to the formation of sulfurous compounds. Nevertheless, the role of numerous other organic contaminants detected in the sediment remains a subject of interest and warrants further study for the sake of environmental safety. This finding underscores that ethyl acetate serves as an excellent organic solvent for extracting pollutants from the waste discharged from PPMs, even after secondary treatment. These conclusions are well supported by earlier results reported by Yadav and Chandra (2018).
The sediment samples in the current experiment were rich in microorganisms as well as in organic matter, especially humic acids, which denature DNA by binding phenolic groups to amides. The ratio of the readings at 260 nm and 230 nm (A260/A230) provides an estimate of DNA purity with respect to contaminants that absorb UV light. An A260/A230 ratio of less than 2 indicates humic acid contamination (Ning et al., 2009), while an A260/A280 ratio of less than 1.8 indicates protein contamination (Maniatis et al., 1982). Samples with an A260/A230 ratio of ~2.0-2.2 and an A260/A280 ratio of ~1.7-2.0 are assumed to be "pure" (Demkina et al., 2023). In our study, the A260/A230 ratios for PPS-1 and PPS-2 were 1.89 and 1.88, respectively, which is close to 2 and indicates a lack of appreciable organic impurities (Olson and Morrow, 2012). The ratio of absorbance at 260 nm and 280 nm is also used to assess the purity of DNA; in this study, the A260/A280 ratios were 2.07 and 2.10 for PPS-1 and PPS-2, respectively. Overall, the extracted DNA was considered free from inhibitors.
The maximum yield of DNA was obtained from the sediment collected from Site-1. The quantity of DNA isolated from PPS-1 and PPS-2 amounted to around 10.8 ng/μl and 9.74 ng/μl, respectively.

High throughput sequencing reads data
Chlorolignin contamination of soil, sediment, groundwater, and surface water is a potential threat to human health and the natural environment. Exploring microbial diversity in contaminated habitats has been a compelling and challenging endeavor, primarily because a significant portion of the microbial community is either non-cultivable or has growth requirements that are not well understood. Metagenomic DNA was obtained from two distinct samples, PPS-1 and PPS-2, which were contaminated with a wide range of toxic and recalcitrant pollutants. In this study, the V3 and V4 hypervariable regions of the 16S rRNA gene were amplified with a pair of universal bacterial primers to profile the bacterial community. The 16S rRNA gene is approximately 1,500 base pairs long and includes nine hypervariable regions of varying conservation (V1-V9) (Tringe and Hugenholtz, 2008; Kim et al., 2011). The hypervariable regions of the 16S rRNA gene show considerable sequence diversity among different bacteria. The use of primer pairs targeting the hypervariable region of the 16S rRNA gene may bias the estimates of bacterial abundance toward certain phyla (Tremblay et al., 2015). This can be mitigated by using primer pairs that target the V3-V4 hypervariable regions of the 16S rRNA gene.
Such primer pairs have been reported to capture the true range of bacterial phyla and thus are effective for Illumina MiSeq analyses of microbial communities that inhabit complex environments (Klindworth et al., 2013; Vargas-Albores et al., 2019; Fadeev et al., 2021). The V3-V4 hypervariable region covers a broad taxonomic range of bacteria, thus capturing more microbial diversity with decreased taxonomic bias (Rintala et al., 2017; Thijs et al., 2017; Vargas-Albores et al., 2019; Oyewusi et al., 2021). Based on a detailed literature survey of studies focusing on microbial community analysis in contaminated environments, we chose only the V3-V4 hypervariable regions of the 16S rRNA gene for bacterial community analysis. After generation of the first amplicon, the MiSeq library was prepared. HTS analysis of the metagenome was performed on an Illumina MiSeq platform using a 2 × 300 paired-end sequencing run to generate the raw reads. Paired-end sequencing enables the sequencing of template fragments in both the forward and reverse directions on the MiSeq platform. In our study, a total of 153,675 and 162,691 raw reads were generated from the V3 and V4 segments of the 16S rRNA gene derived from PPS-1 and PPS-2, respectively. After quality filtering, the MiSeq sequencing yielded 115,665 and 119,386 high-quality paired reads.

Alpha diversity of the bacterial communities
Microbial diversity serves as a crucial ecological indicator, with the belief that higher diversity leads to greater sustainability of microbial communities. To explore this aspect, HTS analysis was carried out to examine the microbial diversity within the sediment samples, PPS-1 and PPS-2. In the present study, the sediment samples were obtained from two different locations polluted with PPMW to unravel the impact of contamination on bacterial community composition and diversity using HTS of the metagenome. The total number of high-quality OTUs obtained per sample was used to generate the rarefaction curves and calculate the α-diversity indices. OTUs are commonly used to group clustered sequence data, where each OTU represents a distinct microbial population in the community (Nguyen et al., 2016). Rarefaction curves were constructed to evaluate whether the sequencing depth adequately captured the bacterial diversity within the sediments. A rarefaction curve illustrates the relationship between the number of OTUs and the number of sequenced reads. As observed in Figure 3, as the number of sequences from a sample increases, the number of OTUs converges toward the true diversity. Rarefaction curves generated at the OTU level demonstrated that all samples reached a plateau. This suggests that the sequencing depth was adequate for a comprehensive characterization of each sample, encompassing a significant portion of the bacterial diversity present in the sediments.
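The idea behind a rarefaction curve can be illustrated with a short simulation. The sketch below is a simplified stand-in and not the procedure used in this study: it shuffles the reads of one sample once and counts the distinct OTUs seen at increasing depths (an accumulation curve along a single random ordering), whereas published rarefaction curves usually average many random subsamples at each depth. The function name, the `step` parameter, and `toy_counts` are hypothetical.

```python
import random


def rarefaction_curve(otu_counts, step, seed=0):
    """Return (depth, observed OTUs) pairs along one random ordering of the reads."""
    # Expand the count vector into one OTU label per read.
    reads = [otu for otu, n in enumerate(otu_counts) for _ in range(n)]
    random.Random(seed).shuffle(reads)

    seen = set()
    curve = []
    for depth, otu in enumerate(reads, start=1):
        seen.add(otu)
        # Record a point every `step` reads and at the final read.
        if depth % step == 0 or depth == len(reads):
            curve.append((depth, len(seen)))
    return curve


# Hypothetical sample: a few dominant OTUs and several rare ones (not study data).
toy_counts = [500, 300, 100, 50, 25, 10, 5, 5, 3, 2]
curve = rarefaction_curve(toy_counts, step=100)

for depth, observed in curve:
    print(depth, observed)
```

Because dominant OTUs are encountered early and rare ones late, the curve rises steeply and then flattens; a plateau of this kind is what is meant above by the sequencing depth being adequate.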
Additionally, the relative abundance and alpha diversity were assessed using metrics such as the Shannon and Simpson indices. These indices indicated differences in alpha diversity between PPS-1 and PPS-2. In our study, the bacterial diversity in PPS-1 was lower than that in PPS-2 based on the Shannon diversity index (Supplementary Table S3). Further, Simpson index values were slightly higher for PPS-2 than for PPS-1. The lower Shannon and Simpson indices suggest that contaminants had a substantial negative impact on bacterial biodiversity. Shannon's index considers both the abundance and evenness of the species present. Similarly, Simpson's diversity index considers both the number of species present and their relative abundances. The Shannon index is regarded as the most effective of the commonly used diversity indices. The α-diversity indices indicate that microorganisms have adapted to prolonged exposure to both organic and inorganic pollution by altering the structure, diversity, and evenness of their communities. A similar result was also reported by Salam et al. (2019).

Taxonomic composition of bacterial community
Currently, a large portion of bacterial and archaeal taxa across diverse ecosystems remains uncultured, limiting our ability to fully characterize environmental microbial communities using traditional culture techniques. Fortunately, the increasing power of NGS-based metagenomics enables us to delve deeper and gain a clearer understanding of the structure, function, and diversity of microbial communities in both engineered and natural environments (Cabreros et al., 2023). In this study, Illumina (MiSeq) sequencing of the metagenome revealed substantial differences in the bacterial community structure between PPS-1 and PPS-2, originating from Site-1 and Site-2, at various taxonomic levels (Figure 4). The relative diversity and abundance of bacterial populations in the sediment samples from Site-1 and Site-2 are visualized as stacked bars. As shown in Figure 4, the taxonomic distribution analysis identified Proteobacteria, followed by Bacteroidetes, Planctomycetes, Acidobacteria, Firmicutes, Actinobacteria, Chloroflexi, and Verrucomicrobia, as the most predominant bacterial phyla in PPS-1 and PPS-2. Proteobacteria, the largest and most metabolically diverse phylum, consists entirely of gram-negative bacteria and encompasses chemolithotrophic, chemoorganotrophic, and phototrophic species, which play crucial roles in nutrient cycling and are often encountered in contaminated environments. Our analysis suggests that Proteobacteria may play a crucial role in the detoxification and assimilation of both organic and inorganic substances, including heavy metal contaminants, in the sediment samples. Many researchers have previously reported Proteobacteria as the dominant phylum at waste-contaminated sites, such as a solid waste dumping site (Kumar R. et al., 2021), distillery waste (Kumar and Chandra, 2020), antibiotic-containing waste (Marathe et al., 2016), tannery wastewater (Ma et al., 2018), and an electronic waste polluted site (Salam and Varma, 2019), and have highlighted its important role in pollutant removal. Heavy metals/metalloids and pH significantly affect the abundance and structure of most microorganisms (Zhao et al., 2020; Hu et al., 2021).
The phylum assignment showed that the total numbers of phyla in PPS-1 and PPS-2 were 37 and 35, respectively, suggesting that the bacterial diversity in PPS-1 was lower than that in PPS-2 even at the phylum level.
In our study, Bacteroidetes was the second most abundant phylum in both sediment samples. Bacteroidetes are gram-negative, non-sporulating, rod-shaped bacteria that are primarily chemoheterotrophs (Willey et al., 2019). These bacteria are known for their role in the digestion of cellulose, proteins, and other macromolecules. They possess the capacity to utilize various polysaccharides and organic chemicals as their sole carbon and nitrogen sources. Bacteroidetes have successfully colonized various ecological niches and are closely associated with detoxification, the breakdown of organic matter, and the carbon cycle within the environment (Willey et al., 2019). This ability positions them as a specialized bacterial community capable of potentially contributing to in situ remediation of toxic pollutants.
In this study, Planctomycetes, the third most abundant phylum, was detected in PPS-1 and PPS-2; it comprises Gram-negative bacteria with diverse cell biology, ecology, and physiology. Within the phylum Proteobacteria, the most abundant classes were Alphaproteobacteria followed by Betaproteobacteria, Gammaproteobacteria, and Deltaproteobacteria. Alphaproteobacteria, the second largest class of Proteobacteria, contains organisms with extensive functional diversity that prefer to grow in environments with low nutrient concentrations (Kersters et al., 2006). However, numerous studies have indicated that Alphaproteobacteria also thrive in organic-rich environments, such as sites contaminated with organic pollution. Betaproteobacteria, the third largest class of Proteobacteria, share metabolic similarities with Alphaproteobacteria (Willey et al., 2019). However, they predominantly utilize substances released from organic decomposition in anoxic habitats. Alphaproteobacteria and Betaproteobacteria made a larger contribution to the communities associated with wastewater or wastewater pollutants, which is in agreement with earlier findings (Desta et al., 2014; Jena et al., 2015).
In our study, Gammaproteobacteria, the largest and most diverse class of Proteobacteria, was detected. The species within Gammaproteobacteria exhibit a range of metabolic and ecological characteristics, comprising facultatively aerobic, nonsporulating, gram-negative rods that are either nonmotile or motile (Williams et al., 2010; Dyksma et al., 2016). Previous studies revealed that Gammaproteobacteria has been the most dominant class in contaminated soils and plays a vital role in degrading various recalcitrant contaminants at polluted sites (Dell' Anno et al., 2021; Yang et al., 2021).
At the genus level, the most abundant genus was Thiobacillus, followed by members of the family Comamonadaceae and a large number of unidentified bacterial genera. Thiobacillus is a rod-shaped, Gram-negative chemolithotroph that belongs to the class Betaproteobacteria (Willey et al., 2019). This genus is particularly noteworthy because it can oxidize reduced sulfur compounds such as thiosulfate, hydrogen sulfide (H₂S), elemental sulfur (S⁰), or thiocyanate (SCN⁻), which serve as electron donors for energy generation (Kuenen and Tuovinen, 1981). Thiobacillus has been reported as the dominant genus in effluent treatment plants that treat tannery and pesticide-contaminated wastewater (Kim et al., 2014; Fang et al., 2018; Pandit et al., 2021). Certain microorganisms with low relative abundance (less than 1%) that nevertheless play crucial roles in degradation were also detected in both PPS-1 and PPS-2. In our study, the sequences that failed to align with the taxonomic database were classified as "unidentified." The presence of unidentified organisms has been reported previously in studies that used the Illumina MiSeq platform to analyze the V3 and V4 regions of the 16S rRNA gene in metagenomes from PPMW (Sharma et al., 2021d) and from activated sludge containing lignin and chlorophenol (Sharma et al., 2023). Numerous uncultured bacteria were also detected at various taxonomic levels in our study. This finding is in agreement with Sharma et al. (2023), who reported a large number of uncultured microbial communities in activated sludge discharged from a PPM.

Linking bacterial community structure to habitat
Microbial communities are crucial for bioremediating refractory organic pollutants at contaminated sites. To develop an effective strategy, it is vital to understand how microbial communities respond to chlorolignin compound degradation. Bacteria show more versatility than fungi in breaking down these compounds, and the sediment samples provide an ecological niche for bacterial growth. Our study shows differences in microbial community structure and dynamics between the two sites. Variations in microbial diversity across these sites may stem from differences in sludge physico-chemical parameters, contaminant loads, or environmental conditions (Prabha et al., 2017; Singh et al., 2022). Environmental factors, especially pH, significantly influence bacterial diversity. Soil pH and organic content affect microbial community structure because they influence the availability of ions and trace metals (Garcia-Sánchez et al., 2015; Zhen et al., 2019; Stefanowicz et al., 2020). Variations in microbial community structure are also linked to environmental conditions and pollutant loads (Mark Ibekwe et al., 2012; Verduzo Garibay et al., 2022). The dominant phyla in the sediment samples include Proteobacteria, Bacteroidetes, Planctomycetes, Acidobacteria, Firmicutes, and Actinobacteria. Specific contaminants promote the local enrichment of particular bacterial taxa, which in turn supports bioremediation. The presence of pollutants strongly shapes microbial community composition, with habitat being more influential than geographical distance.
Future studies should consider physico-chemical factors such as pH and organic concentration, as well as seasonal changes in community structure. Additionally, exploring specific genes that favor survival in harsh environments is warranted. This community contributes to the bioremediation and detoxification of organic and inorganic chemicals, including chlorinated compounds, discharged from PPMs during the pulp and paper-making process.
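The between-site difference in community composition discussed in this section can be quantified with a dissimilarity measure such as Bray-Curtis, which the Figure 4 caption names. The following is a minimal sketch in Python, not the pipeline used in the study. The function name `bray_curtis` and the dictionary layout are illustrative assumptions; the input values are the relative abundances (%) of the three most abundant phyla reported in the abstract, so the result covers only those phyla and is not a study result.

```python
def bray_curtis(profile_a, profile_b):
    """Bray-Curtis dissimilarity between two abundance profiles.

    Each profile maps a taxon name to its abundance; taxa missing
    from one profile are treated as having abundance zero.
    Returns 0.0 for identical profiles and 1.0 for profiles that
    share no taxa.
    """
    taxa = set(profile_a) | set(profile_b)
    numerator = sum(
        abs(profile_a.get(t, 0.0) - profile_b.get(t, 0.0)) for t in taxa
    )
    denominator = sum(
        profile_a.get(t, 0.0) + profile_b.get(t, 0.0) for t in taxa
    )
    if denominator == 0:
        raise ValueError("both profiles are empty or all-zero")
    return numerator / denominator


# Relative abundances (%) of the three most abundant phyla, taken from
# the abstract. Restricting the profile to three phyla is an
# illustrative simplification.
pps1 = {"Proteobacteria": 55.40, "Bacteroidetes": 11.30, "Planctomycetes": 5.40}
pps2 = {"Proteobacteria": 56.30, "Bacteroidetes": 12.20, "Planctomycetes": 4.70}

dissimilarity = bray_curtis(pps1, pps2)
print(f"Bray-Curtis dissimilarity (three phyla only): {dissimilarity:.4f}")
```

A value close to 0 at the phylum level is compatible with larger differences at finer taxonomic ranks, so in practice the same calculation would be repeated on the full OTU table.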

Environmental monitoring tool
Metagenomic analysis is a valuable tool for environmental monitoring, enabling comprehensive identification of microbial communities in specific habitats (Feng et al., 2018; Jurelevicius et al., 2022). In this study, bacterial communities in sediment samples were analyzed using amplicon NGS targeting the V3 and V4 regions of the 16S rRNA gene. Soil and sediment samples, known for their high diversity and varying sensitivities to environmental disturbances, are often employed for assessing ecosystem health. However, traditional taxonomic approaches based on morphological characteristics are labor-intensive, costly, and time-consuming for routine biomonitoring. In contrast, metagenomic approaches offer an environmentally friendly and efficient alternative that requires minimal materials (Lan et al., 2024; Wang et al., 2024). In metagenomic analysis, the number of OTUs may not precisely reflect the number of species because of inherent limitations in high-throughput sequencing and analysis; nevertheless, the two are positively correlated (Nilsson et al., 2019). Notably, the application of metagenomic analysis as a biomonitoring tool for paper mill waste-contaminated sediment has not been previously reported in India. When physico-chemical conditions such as heavy metal concentrations and pH are taken into account, the metagenomic approach to biomonitoring bacterial communities holds great promise for assessing environmental health.
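Because the observed OTU count can underestimate true richness, richness estimators such as the Chao1 index reported in this study add a correction based on rare OTUs. The sketch below implements the commonly used bias-corrected form, S_obs + F1(F1 − 1) / (2(F2 + 1)), where F1 and F2 are the numbers of OTUs observed exactly once and exactly twice. It is an illustration under stated assumptions, not the software used by the authors: the function name `chao1` is hypothetical and the read counts are invented toy values.

```python
from collections import Counter


def chao1(otu_counts):
    """Bias-corrected Chao1 richness estimate.

    otu_counts: iterable of read counts, one per OTU.
    OTUs with zero reads are ignored.
    """
    counts = [c for c in otu_counts if c > 0]
    s_obs = len(counts)            # number of observed OTUs
    frequency = Counter(counts)
    f1 = frequency[1]              # singletons: OTUs seen exactly once
    f2 = frequency[2]              # doubletons: OTUs seen exactly twice
    return s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))


# Hypothetical toy sample (not data from the study):
# 7 observed OTUs, 3 singletons, 2 doubletons.
toy_counts = [10, 5, 1, 1, 1, 2, 2]
print(chao1(toy_counts))
```

When a sample contains no singletons, the correction term vanishes and the estimate equals the observed OTU count.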

Conclusion and recommendations
To the best of our knowledge, this is the first study utilizing a metagenomic approach to unravel the profile of bacterial communities growing in sediment contaminated with recalcitrant pollutants discharged from PPMs. GC-MS analysis revealed that the sediment carries a very high load of chemical contaminants, including numerous refractory organic pollutants such as stigmasterol; β-sitosterol; hexadecanoic acid; octadecanoic acid; 2,4-di-tert-butylphenol; heptacosane; dimethyl phthalate; hexachlorobenzene; 1-decanol,2-hexyl; and furane 2,5-dimethyl, which have been reported as potential endocrine-disrupting chemicals (EDCs) and mutagenic compounds. Sediment sample PPS-1 exhibited a notable increase in alpha diversity, as evidenced by the Chao1 richness index. The bacterial community composition of PPS-2 was highly diverse, as indicated by the number of OTUs. The sediment samples harbored diverse bacterial phyla, such as Proteobacteria, Bacteroidetes, Planctomycetes, Acidobacteria, Firmicutes, and Actinobacteria. The most abundant bacterial genus in PPS-1 and PPS-2 was Thiobacillus, with relative abundances of 7.60% and 4.50%, respectively. The present study showed that the recalcitrant contaminants discharged from PPMs significantly affected the bacterial community structure at disposal sites. The results offer novel insights into the characteristics of contaminants, alterations in microbial community structure, and potential functions under pollution stress at chlorolignin waste-polluted sites. These findings are significant for risk assessment and microbial monitoring efforts. Moreover, this knowledge could guide the development of suitable bioremediation techniques aimed at restoring ecological balance at sites polluted by PPMW. Because a large percentage of the detected genera and species remain unclassified, the study recommends characterizing these taxa further as an avenue for the search for novel genes.

Data availability statement
Illumina paired-end raw read data generated in this study were deposited in the National Center for Biotechnology Information (NCBI) Sequence Read Archive (SRA) under accession number PRJNA1035633, available at https://www.ncbi.nlm.nih.gov/sra/PRJNA1035633. The metagenomic project can also be accessed through NCBI under the BioSample accessions SAMN38095719 and SAMN38095720 with BioProject ID PRJNA1035633 (https://www.ncbi.nlm.nih.gov/bioproject/?term=PRJNA1035633). The metagenomic dataset generated and analyzed during the current study is also available from the Mendeley Data repository at https://data.mendeley.com/datasets/gpmwyfhbp3/2.
from the Star Paper Mill in Saharanpur (Uttar Pradesh) was utilized in various concentrations as irrigant for the cultivation of V. radiata. The study found that the mill's effluent led to a reduction in soil moisture content, water-holding capacity, and bulk density. Simultaneously, it resulted in an increase in pH, electrical conductivity (EC), chloride (Cl−), sulfate (

FIGURE 3
Bacterial community diversity analysis in sediment samples. (A) Rarefaction curve showing bacterial community diversity (OTUs) as a function of the number of sequence reads recovered from polluted sediment samples. The horizontal axis represents the number of sequences, while the vertical axis shows the diversity of the community. Sediment sample PPS-2 had higher bacterial diversity than PPS-1. (B) Venn diagram showing the overlap of the bacterial communities from the sediment samples based on OTUs (0.97 similarity).

FIGURE 4
Comparison of bacterial community structure in PPS-1 and PPS-2 samples at different taxonomic levels. Stacked bar chart of relative abundance (Bray-Curtis distance) of the most dominant bacterial communities within the samples at the level of (A) phylum, (B) class, (C) order, (D) family, (E) genus, and (F) species. The relative abundance is expressed as a percentage of the total effective bacterial sequences in each sample, classified using the RDP Classifier. Low-abundance phyla are shown as "Unknown." "Unclassified" signifies sequences that did not match any known sequence in the database.

FIGURE 6
FIGURE 6 and quantity of isolated DNA

TABLE 1
Physico-chemical characteristics of sediment samples collected from two different sites polluted with chlorolignin contaminants of pulp-paper mills.

TABLE 2
Organic compounds identified in ethyl acetate extract of sediment sample (PPS-1) collected from Site-1, by GC-MS.

TABLE 3
Organic compounds identified in ethyl acetate extract of sediment sample, PPS-2, collected from Site-2, by GC-MS.

TABLE 4
Organic compounds identified in dichloromethane extract of sediment sample, PPS-1, collected from Site-1, by GC-MS.

TABLE 5
Organic pollutants identified in dichloromethane extract of sediment sample, PPS-2, collected from Site-2, by GC-MS.

TABLE 6
(A) A summary of quality and quantity of MiSeq library checked on 4200 TapeStation System; (B) Metagenome high-throughput sequence read count statistics. MB, megabase; HQ, high quality; Conc, concentration; %, percent; bp, base pair.

TABLE 7
Dominant and specific bacterial taxa identified in sediment samples at different taxonomic levels.