Development and Evaluation of a Fusion Polyprotein Based on HspX and Other Antigen Sequences for the Serodiagnosis of Tuberculosis

Background The lack of suitable diagnostic tools contributes to the high prevalence of tuberculosis (TB) worldwide. Serological tests, based on multiple target antigens, represent an attractive option for diagnosis of this disease due to their rapidity, convenience, and low cost. Methods Measures to reduce non-specific reactions and thereby improve the specificity of serological tests were investigated, including blocking antibodies against common bacteria in serum samples and synthesizing polypeptides covering non-conserved dominant B-cell epitopes of antigens. In addition, a fusion polyprotein containing HspX and eight other antigen sequences was constructed and expressed to increase overall sensitivity of the tests. Results Inclusion of Escherichia coli lysate partially increased the specificity of the serological tests, while synthesis and inclusion of peptides containing non-conserved sequences of TB antigens as well as dominant B-cell epitopes reduced non-specific reactions without a decrease in sensitivity of the tests. A polyprotein fusing HspX and eight other antigen sequences was constructed and displayed 60.2% sensitivity, which was higher than that of HspX and the other individual antigen segments. Moreover, the specificity of the polyprotein was 93.8%, which was not significantly decreased when compared with HspX and the other individual antigen segments. Conclusions The roles of the fusion polyprotein in the humoral immune response against TB infection were demonstrated and provide a potential novel approach for the development of TB diagnostics.


INTRODUCTION
Tuberculosis (TB) is one of the most prevalent infectious diseases and among the top 10 causes of death worldwide despite substantial progress toward TB control over the last decades. In 2020, there were an estimated 10.0 million new TB cases and 1.2 million deaths globally due to TB (1). The emergence of multidrug-resistant tuberculosis (MDR-TB) and extensively drug-resistant tuberculosis (XDR-TB), and the spread of HIV/AIDS in TB-endemic regions have impeded efforts to control and eliminate TB and prompted health authorities to strengthen and reinforce control strategies to limit their spread (2)(3)(4).
Several methods for diagnosis of TB are currently available, including sputum smear and culture tests, chest X-ray, and immunodiagnostic detection. However, a rapid, accurate, and cost-effective diagnostic tool for TB is urgently required to control this disease. Biomarkers predicting treatment efficacy and cure of active TB, the reactivation of latent tuberculosis infection (LTBI), and the induction of protective immune responses by vaccination have been investigated (5)(6)(7). However, TB-specific biomarkers have not yet been discovered and the qualification of biomarkers as a surrogate for a clinical endpoint in TB is very challenging. Immunodiagnostic detection of antigens or their cognate antibodies in the blood of patients has been successfully applied to other pathogens (8) and thus is an attractive option for TB. However, the sensitivities and specificities of the currently available options based on single or multiple target antigens for TB are variable and do not yet meet the requirements for clinical use. In 2011, the World Health Organization (WHO) issued a policy recommendation against the use of the various commercial serological tests for TB diagnosis due to their suboptimal sensitivity and specificity (9). However, further research and development in this field, specifically the identification and screening of novel serodiagnostic antigens, is still highly recommended by WHO (10).
A major challenge for TB serodiagnosis is false-positive reactions in healthy individuals, which reduce the specificity of the tests. In addition to cross-reactivity with Bacillus Calmette-Guérin (BCG) vaccination and other environmental mycobacteria, most TB antigens have sequences that are homologous to those of other common bacteria, and TB serum antibodies inevitably cross-react with antigens from these bacteria. One way to reduce such non-specific reactions is to block antibodies against these bacteria by pre-adsorbing serum samples with lysates of common bacteria. Another approach is to use bioinformatics analysis to select fragments of TB antigens that are not conserved in other bacteria while retaining dominant B-cell epitopes where possible. However, sensitivity might decrease with this approach. Goyal et al. used B-cell epitope-containing peptides of RD1 (ESAT-6, CFP-10) and RD2 (CFP-21, MPT-64) antigens for immunodiagnosis of pulmonary TB (11). Afzal et al. constructed fusion proteins tn1FbpC1-tnPstS1 and tn2FbpC1-tnPstS1 with immunodominant B-cell epitope sequences and found that removal of a non-epitopic FbpC1 region (amino-acid residues 34-96) unmasked some of the epitopes, resulting in greater sensitivity (12).
To date, no single TB antigen-based assay has achieved a satisfactory serodiagnostic performance, which impelled us to identify new protein targets and investigate different combinations of currently identified antigens. Strategies to improve sensitivity of the tests include mixing recombinant antigens, fusing recombinant antigens (or segments), and peptide-based antibody detection. Zhang et al. combined three antigens (Rv3425, 38 kDa, and LAM) to develop a multiple-antigen enzyme-linked immunosorbent assay (ELISA) test, which was a potentially useful tool for the serodiagnosis and screening of active TB (13). Yang et al. constructed a recombinant fusion protein Rv0057-Rv1352 that exhibited good immunoreactivity with serum from patients with TB (14). A recombinant fusion of three immunodominant antigens (38-kDa-16-kDa-ESAT-6) was also reported (15). Another approach to TB serodiagnosis involved fusing antigen fragments, which contained dominant linear B-cell epitopes, instead of whole protein antigens. Two novel polyproteins, 38kD-ESAT6-CFP10 (38F) and Mtb8.4-MPT64-TB16.3-Mtb8 (64F), were constructed and evaluated by Feng et al., with the novel 38F-64F indirect ELISA exhibiting effective diagnostic performance (16). Recently, several serological tests based on synthetic peptides derived from highly antigenic proteins were also designed and evaluated. Four peptides (7-10 amino acids in length) corresponding to group-specific epitopes of the Ag 85 complex of Mycobacterium tuberculosis (M. tb) were synthesized, and the peptide-based ELISA was found to be a sensitive, specific, rapid, and cost-effective immunoassay for early diagnosis of pulmonary and extrapulmonary TB (17). Another study evaluated the combination of peptides from B-cell epitopes of ESAT-6, CFP-10, CFP-21, and MPT-64 antigens for immunodiagnosis (11). Our own previous study identified a cocktail of serodiagnostic antigens for TB, but no single antigen had high sensitivity (18,19).
In the current study, nine antigens (or antigen segments), selected to retain non-conserved dominant B-cell epitopes, were combined into a fusion polyprotein, which was evaluated for TB serodiagnosis to ascertain whether it would improve overall sensitivity and specificity compared with individual antigens and other antigen combinations.

Study Population
Serum samples from healthy individuals and patients with TB were obtained from Shanghai Pulmonary Hospital (Shanghai, China) from August 2016 to December 2017. All patients with TB were diagnosed as newly treated active TB and were determined to require a full course of TB treatment according to TB diagnostic criteria. Diagnostic criteria included sputum culture or smear positivity, typical radiological manifestation, and clinical response to anti-TB treatment consistent with active TB. The group with LTBI was recruited from individuals who were referred to the hospital with suspected TB but displayed no clinical symptoms, and whose subsequent medical evaluation indicated LTBI based on a positive QuantiFERON-TB Gold (QFT-G; Qiagen) test result. Based on the manufacturer's information, QFT-G results were interpreted as positive when the value of TB Antigen minus Nil [IU/mL] was > 0.35.

Acquisition of the Lysates From Common Bacteria for Sera Pre-Adsorption
Lysates from Vibrio mimicus, Staphylococcus aureus, Bacillus subtilis, Proteus vulgaris, Staphylococcus epidermidis, Enterobacter aerogenes, γ-Streptococcus, Staphylococcus citreus, and Escherichia coli were used to pre-adsorb the serum samples to block antibodies against bacterial antigens and reduce non-specific reactions. Bacteria were cultured in a 100-mL volume overnight at 37°C, collected by centrifugation at 13,000 rpm at 4°C for 10 min, resuspended in 10 mL phosphate buffered saline (PBS), and then disrupted by sonication. The sonicates were centrifuged at 13,000 rpm and the supernatants were collected. Next, 800 µL supernatant, 200 µL sera, and 2 mL PBS were mixed at room temperature for 5 h to adsorb antibodies against bacterial proteins. After centrifugation, the adsorbed sera were stored at -20°C until use.

B-Cell Epitope Selection
ABCpred software (available at https://webs.iiitd.edu.in/raghava/abcpred/ABC_submission.html, Date of access: August 18, 2021) was used online to screen potential B-cell epitopes (20). Key parameters, including probability of surface exposure, local hydrophobicity, beta-turn amino-acid sequence propensity, atomic flexibility, and experimental high-performance liquid chromatography (HPLC) retention times of synthetic peptides, were considered in this software. For individual antigens, default parameters, such as threshold (more than 0.80) and overlapping filter, were set.

BLAST for Non-Conserved Sequences
Protein BLAST from NCBI (available at https://blast.ncbi.nlm.nih.gov/Blast.cgi, Date of access: August 20, 2021) was used to screen each individual antigen to identify sequences that were homologous with those from other bacteria. First, sequences derived from Mycobacterium that were highly homologous to the target antigen sequences (greater than 90%) were excluded. Next, sequences derived from three to four types of common bacteria with 50-80% identity and occurring frequently were selected for further analysis. The COBALT online web server (available at https://www.ncbi.nlm.nih.gov/tools/cobalt/cobalt.cgi, Date of access: August 20, 2021) was then employed to compare targeted antigen sequences with the above selected sequences and identify non-conserved sequences (21). Final ready-to-synthesize sequences were obtained by integration of B-cell epitope prediction and BLAST and COBALT results.
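The selection rules above can be expressed as a small filter. The following Python sketch is illustrative only: it assumes the BLAST hits have already been exported as records of organism name and percent identity, and the `Hit` record, the `select_comparison_hits` helper, and the example values are hypothetical rather than part of the workflow used in this study.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    organism: str    # source organism of the aligned sequence (hypothetical field)
    identity: float  # percent identity to the target antigen (hypothetical field)


def select_comparison_hits(hits, max_organisms=4):
    """Apply the identity rules described in the text to a list of BLAST hits."""
    # Step 1: exclude Mycobacterium-derived sequences with > 90% identity.
    remaining = [
        h for h in hits
        if not (h.organism.startswith("Mycobacterium") and h.identity > 90.0)
    ]
    # Step 2: keep sequences within the 50-80% identity window.
    in_window = [h for h in remaining if 50.0 <= h.identity <= 80.0]
    # Step 3: keep only the most frequently occurring organisms (three to four
    # in the text; the exact number is a parameter here).
    counts = Counter(h.organism for h in in_window)
    frequent = {organism for organism, _ in counts.most_common(max_organisms)}
    return [h for h in in_window if h.organism in frequent]


# Made-up hits for illustration; these are not data from the study.
example_hits = [
    Hit("Mycobacterium bovis", 99.0),
    Hit("Escherichia coli", 62.0),
    Hit("Escherichia coli", 55.0),
    Hit("Proteus vulgaris", 71.0),
    Hit("Bacillus subtilis", 45.0),
    Hit("Staphylococcus aureus", 85.0),
]
selected = select_comparison_hits(example_hits)
```

The sequences that pass this filter would then be aligned against the target antigen (with COBALT in this study) to locate the non-conserved regions.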

Synthesis of Polypeptides
Polypeptides were synthesized by ChinaPeptides Ltd (Suzhou, China). Purification of recombinant proteins was performed by ion exchange and protein concentrations were determined by the Bradford method (22). Each polypeptide exhibited greater than 95% purity and concentrations of each polypeptide were all =1 mg/mL.

Three-Dimensional (3D) Structure Prediction of the Fusion Polyprotein
To predict the 3D structure of the fusion polyprotein, the I-TASSER online web server (available at https://zhanggroup.org/I-TASSER/, Date of access: August 24, 2021) was used, which is a hierarchical template-based method for protein structure and function prediction (23,24). For a given polyprotein sequence, I-TASSER first identifies super secondary structure motifs from the Protein Data Bank (PDB) library by the multiple threading approach LOMETS (25). A higher score indicates a more confident prediction of secondary structure. The normalized B-factor is then predicted, and negative values indicate that the residue is more stable in the structure. The top 10 threading templates are used by I-TASSER. The alignments are from the top templates, where conserved regions often have higher structural accuracy. A normalized Z-score > 1 indicates a good alignment, and higher values are better. The top five models were further predicted. The confidence of each model is quantitatively measured by the C-score, which is calculated based on the significance of threading template alignments and the convergence parameters of the structure assembly simulations. The C-score is typically in the range of [-5, 2], where a higher C-score signifies a model with higher confidence and vice versa. C-score > -1.5 indicates a high-quality structure prediction. TM-score and RMSD are estimated based on the C-score and protein length following the correlation observed between these qualities.
Protein disorder was predicted by the DEPICTER online web server (available at http://biomine.cs.vcu.edu/servers/DEPICTER/#Help, Date of access: August 24, 2021) (26). Users submit a FASTA-formatted sequence of the input protein using the interface of the server. A subset of predictors, including IUPred L, IUPred S, and SPOT-Disorder, can be selected to run.

Polyprotein Expression
Life Technology (Thermo Fisher Scientific, MA, USA) helped to construct the plasmid expression vector for cloning and expression of the polyprotein. Codon usage was optimized for E. coli. In detail, polyprotein expression in E. coli Rosetta (DE3; Novagen, Germany) was induced with 1 mM isopropyl-β-D-thiogalactoside (IPTG) and analyzed by sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE) and Western blotting. The supernatant and inclusion bodies from the expression process were stored for further analysis.

Statistical Analysis
Statistical analyses were performed using SPSS (Version 20.0, SPSS Inc., Chicago, IL, USA), GraphPad Prism 8.0 (GraphPad Software Inc., San Diego, CA, USA), and MedCalc for Windows (Version 17.8, Ostend, Belgium) software. All statistical tests were two-sided and P < 0.05 was considered statistically significant. Results are expressed as mean ± standard deviation (SD), unless otherwise specified. A positive antibody test was defined as an OD value greater than the cutoff value, i.e., the mean OD value plus three SD from the negative healthy control serum.
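The cutoff rule and the resulting sensitivity and specificity can be illustrated with a short calculation. The Python sketch below is a minimal example under stated assumptions: the OD readings are made up, the sample (n - 1) standard deviation is assumed because the text does not say which form was used, and the function names are hypothetical.

```python
from statistics import mean, stdev


def cutoff_from_controls(control_ods, n_sd=3.0):
    """Cutoff = mean OD of healthy controls + n_sd standard deviations.

    Uses the sample standard deviation (n - 1 denominator); this is an
    assumption, since the text does not specify which form was applied.
    """
    return mean(control_ods) + n_sd * stdev(control_ods)


def sensitivity_specificity(patient_ods, control_ods, cutoff):
    """A test is positive when its OD value is greater than the cutoff."""
    true_positives = sum(od > cutoff for od in patient_ods)
    true_negatives = sum(od <= cutoff for od in control_ods)
    return true_positives / len(patient_ods), true_negatives / len(control_ods)


# Made-up OD readings for illustration only.
controls = [0.10, 0.12, 0.08, 0.11, 0.09]
patients = [0.50, 0.14, 0.30, 0.12]

cutoff = cutoff_from_controls(controls)
sens, spec = sensitivity_specificity(patients, controls, cutoff)
```

In practice, the control sera used to set the cutoff and those used to estimate specificity may differ; the sketch reuses one list only to stay short.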

Using Lysates From Common Bacteria for Sera Pre-Adsorption to Reduce Non-Specific Reactions
Most TB antigens have sequences that are homologous with those of other common bacteria, and TB serum antibodies will inevitably cross-react with antigens from these bacteria. One way to reduce non-specific reactions is to block antibodies against those bacteria. In this study, three serum samples from healthy individuals, including one strong false positive (SFP), one weak false positive (WFP), and one normal, and one positive serum sample from a patient with TB were used. The serum samples were pre-adsorbed with Escherichia coli and other bacterial lysates, including V. mimicus, S. aureus, B. subtilis, P. vulgaris, S. epidermidis, Enterobacter aerogenes, and S. citreus, to block antibodies against antigens from these bacteria. There were no significant differences after pre-adsorption with V. mimicus, S. aureus, B. subtilis, S. epidermidis, and S. citreus (Table 1 and Supplementary Table 2). However, the values of the SFP and WFP serum samples markedly decreased after pre-adsorption with P. vulgaris, Enterobacter aerogenes, and Escherichia coli compared with groups without any bacterial lysates (background groups). Subsequent analyses therefore focused on these three bacteria. Twelve false-positive serum samples from healthy individuals were collected to test pre-adsorption. For protein PstS1, the values of 91.7% (11/12) of serum samples significantly decreased after adding Escherichia coli lysate, and 83.3% (10/12) decreased after pre-adsorption with P. vulgaris or Enterobacter aerogenes (Table 2). However, for Rv1488, only 75% (8/12) decreased after being pre-adsorbed with P. vulgaris (Supplementary Table 3). Escherichia coli and Enterobacter aerogenes were therefore selected for further study. A panel of 96 serum samples comprising 72 patients with TB and 24 healthy controls was employed to compare the difference between Escherichia coli and Escherichia coli & Enterobacter aerogenes. Four proteins, namely PstS1, Rv1488, PanD and EchA3, were used to validate the panel.
There were no significant differences with the addition of Escherichia coli and Enterobacter aerogenes lysate compared with Escherichia coli lysate alone, indicating the redundancy of adding Enterobacter aerogenes lysate (Figure 1 and Table 3). However, although pre-adsorption partially increased the specificity of the serodiagnosis assay, the false-positive rate remained very high for the healthy controls, which warranted exploration of different measures to solve this problem.

Synthesizing Polypeptides to Reduce Non-Specific Reactions
Another measure to reduce non-specific reactions is to synthesize polypeptides that contain dominant B-cell epitopes and fragments that are not conserved in other common bacteria. ABCpred software was used to screen potential B-cell epitopes and COBALT software was further employed to compare and select non-conserved segments of antigens. For example, the full sequence of Rv1488 was submitted to ABCpred software with the parameters set as Threshold (0.80) and overlapping filter (Figure 2A). As shown in Figure 2B, 15 potential B-cell epitopes were predicted. Residues colored in green represent the predicted dominant B-cell epitopes. Simultaneously, BLAST and COBALT from NCBI were employed to screen and identify sequences of the Rv1488 antigen that are homologous to those of other common bacteria, such as Escherichia coli, and non-conserved segments were selected (Figure 2C). The sequence "AADGDDAEVAGWFSTDTDPSIARAVATAEAIARKPVEGSLGTPPRLTQ" was non-conserved and contained a dominant B-cell epitope "AEAIARKPVEGSLGTP" (Figure 2D). Thus, the final ready-to-synthesize sequence (Pep-Rv1488), i.e. "TDPSIARAVATAEAIARKPVEGSLGTPPRLTQ", was obtained by integration of the B-cell epitope prediction and the non-conserved segments from the BLAST results. Similar results were obtained for PstS1 and six other antigens (Supplementary Figure 1).
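The relationship between the non-conserved segment and the final peptide can be checked with a simple substring search. The Python sketch below is illustrative: the two sequences are copied from the text above, the `locate` helper is hypothetical, and the reported positions are relative to the quoted segment, not to full-length Rv1488.

```python
# Sequences quoted in the text for Rv1488.
NON_CONSERVED_SEGMENT = "AADGDDAEVAGWFSTDTDPSIARAVATAEAIARKPVEGSLGTPPRLTQ"
PEP_RV1488 = "TDPSIARAVATAEAIARKPVEGSLGTPPRLTQ"


def locate(segment, parent):
    """Return the 1-based inclusive (start, end) of segment in parent, or None."""
    index = parent.find(segment)
    if index < 0:
        return None
    return index + 1, index + len(segment)


# Position of the final peptide within the non-conserved segment.
span = locate(PEP_RV1488, NON_CONSERVED_SEGMENT)
```

The same helper could be used to confirm that a predicted epitope lies entirely inside a candidate peptide before it is sent for synthesis.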
To test the effect of pre-adsorption by Escherichia coli on the peptides, eight false-positive serum samples from different healthy individuals were collected for each peptide. Two serum samples from normal healthy individuals and two positive serum samples from patients with TB were used as controls. The values of most false-positive serum samples significantly decreased after addition of Escherichia coli lysate (Supplementary Table 4). The efficiency of reducing non-specific reactions ranged from 62.5% (5/8) to 87.5% (7/8), which demonstrated the effect of pre-adsorption by Escherichia coli.
To assess the potential diagnostic value of the polypeptides, a panel of 96 serum samples, comprising 63 patients with TB and 33 healthy individuals, was collected to compare the differences in sensitivities and specificities between these polypeptides and their corresponding antigens. The levels of antibodies against Pep-PstS1 and Pep-Rv1488 in patients with TB and healthy individual groups were significantly decreased when compared with their corresponding antigens (Figure 3), which may be due to the loss of some other B-cell epitopes. However, the sensitivity of each polypeptide decreased only slightly and not significantly (P = 0.38 and P = 0.69) (Table 4). Conversely, some false-positive serum samples became negative when tested with the polypeptides, which improved their specificities (Figure 3). These observations indicated that identification of suitable polypeptides would help effectively reduce non-specific reactions without markedly decreasing sensitivities.

Assembly of the Fusion Polyprotein
In the search for appropriate diagnostic antigens for TB, it was already recognized that no single antigen-based assay had achieved an optimal serodiagnostic performance to date due to the complexity of the human immune response to TB antigens (30). Thus, strategies using multiple antigens either individually or as fusion polyproteins (or segments) have been recommended. However, when the antigens were mixed and tested as one for each patient, sensitivity decreased (Supplementary Figure 2). Thus, a novel M. tb fusion polyprotein containing multiple polypeptides was preferentially constructed and expressed as an antigen with multi-epitopes.
Nine antigens, previously identified as exhibiting TB serodiagnostic potential, were fused as a polyprotein (Supplementary Table 5). Since HspX was a serodiagnostic antigen with 33.33% sensitivity and 100% specificity (18), and expression of the fusion molecule HspX with other antigens increased by about 50% as compared with that of the individual antigens, resulting in cheaper production of the fusion antigens (31), the whole sequence of HspX was retained in the fusion protein to enhance immunogenicity. For the other eight antigens, polypeptides were screened as described in the Materials and Methods and fused together by glycine- and proline-rich (GPGPGPGPGPG) spacers. Finally, a fusion polyprotein containing nine antigens or segments was constructed (Figure 4). The polyprotein sequence and its underlying expression-vector/gene sequence have been deposited at NCBI (GenBank accession number MZ956586). Codon usage of the polyprotein was adapted to the codon bias of Escherichia coli genes (Supplementary Figure 3), and regions of very high
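The spacer-based assembly described in this section can be sketched as a string join. The Python example below is a minimal illustration, not the construct itself: the placeholder sequences are hypothetical, the `assemble_fusion` helper is an invented name, and placing a spacer between the full-length antigen and the first segment is an assumption, since the text does not state how that junction was built. The deposited sequence (GenBank MZ956586) remains the authoritative record.

```python
# Glycine- and proline-rich spacer quoted in the text.
SPACER = "GPGPGPGPGPG"


def assemble_fusion(full_length_antigen, segments, spacer=SPACER):
    """Join a full-length antigen and antigen segments with a flexible spacer.

    Assumption: a spacer is also placed between the full-length antigen and
    the first segment.
    """
    return spacer.join([full_length_antigen, *segments])


# Hypothetical placeholder sequences; these are NOT the real antigen sequences.
placeholder_full_length = "MAAA"
placeholder_segments = ["CCC", "DDD"]

fusion = assemble_fusion(placeholder_full_length, placeholder_segments)
```

With one full-length antigen and eight segments, this scheme would introduce eight spacers, adding 88 residues to the construct under the same assumption.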

3D Structure Prediction of the Fusion Polyprotein
Homology modeling (HM) is widely available for 3D structure prediction of most proteins (32). However, its use was questionable in the current study because the fusion polyprotein was artificially constructed, with the "tail" sequence added to HspX consisting of diverse protein segments linked by glycine- and proline-rich (GPGPGPGPGPG) spacers. HM may therefore be unable to reliably predict the polyprotein structure, and any structure obtained by HM was potentially misleading. I-TASSER software, which combines homology modeling and de novo (ab initio) prediction strategies, was therefore considered more suitable. Prediction of the secondary structure suggested that the polyprotein was an alpha-beta protein containing 17 alpha-helices (in red) and 17 beta-strands (in blue) (Supplementary Figure 4A). The predicted solvent accessibility was presented in 10 levels, from buried (0) to highly exposed (9) (Supplementary Figure 4B). The normalized B-factor was predicted, and the regions at the N- and C-termini and most of the loop regions were predicted with positive values, indicating that these regions, which were enriched for linear/continuous B-cell epitopes, were structurally more flexible and disordered than other regions (Supplementary Figure 4C). A similar observation was made using DEPICTER for the prediction of protein disorder in the polyprotein (Supplementary Figure 5). The top 10 threading templates were used and the top five models were further predicted with global and local accuracy estimations (Supplementary Figures 4D, E). The C-score of the first model was -1.79, with an estimated TM-score = 0.50 ± 0.15 and RMSD = 11.8 ± 4.5 Å relative to the native structure. The low C-score indicated the lack of good templates in the protein structure library, and ab initio modeling of medium-to-large proteins without templates remains to be improved.

Assessment of the Sensitivity and Specificity of the Fusion Polyprotein
The optimized gene was amplified by PCR using the primers F-E.coli 5'-CGGATCCGGCTCTAAACCGCCGTCCG-3' and R-E.coli 5'-CCTCGAGGGTTTCGATACGCTGCTGCAG-3'. The constructed plasmid expression vector was cloned and expressed in E. coli Rosetta and analyzed by SDS-PAGE and Western blotting. The purity of the fusion polyprotein was 90.8%, as determined by Quantity One software (Supplementary Figure 6).
To test the effect of pre-adsorption by Escherichia coli for the polyprotein, eight false-positive serum samples from different healthy individuals were used. The value of 87.5% (7/8) serum samples significantly decreased after adding Escherichia coli lysate (Supplementary Table 5), which was consistent with previous results.
The sensitivity and specificity of the polyprotein were evaluated using an indirect ELISA and a panel of 192 serum samples comprising 128 from patients with TB and 64 from healthy individuals (Figure 5 and Supplementary Table 6). The sensitivity of the polyprotein was 60.2%, which was higher than that of HspX and other individual antigen segments. The specificity of the polyprotein was 93.8%, which was not significantly decreased (Table 5). To further evaluate the factors affecting diagnosis with the polyprotein, patients with TB were divided into different subpopulations based on clinical backgrounds. The sensitivities of the polyprotein in sputum smear-positive and -negative samples were similar, with no significant difference (63.3% vs 58.2%, P = 0.57). Similar observations were made between chest X-ray-positive and -negative samples (55.6% vs 62.0%, P = 0.51) (Supplementary Table 7). To evaluate whether the detection of whole antibodies would improve the sensitivity of the polyprotein, a panel of 48 serum samples from patients with TB and 16 serum samples from healthy individuals was independently collected and horseradish peroxidase (HRP)-labeled LD5 (HRP-LD5) was used to detect human IgG, IgM, and IgA. The sensitivities of the polyprotein for IgG and LD5 detection were 64.6% and 52.1%, respectively, with no significant difference (P = 0.29), and the specificities were both 93.8%, with no significant difference (P = 1.00) (Supplementary Table 8).
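The subgroup comparison can be illustrated with a two-proportion test. The Python sketch below is offered under explicit assumptions: the text does not state which test produced the reported P values, so a Pearson chi-square test without continuity correction is assumed, and the subgroup counts (31/49 smear-positive and 46/79 smear-negative) are back-calculated to match the reported percentages and the 128-patient panel rather than taken from the text. The function name is hypothetical.

```python
import math


def chi_square_2x2(pos_a, n_a, pos_b, n_b):
    """Pearson chi-square test (1 degree of freedom, no continuity correction).

    Compares the proportion of positives in group A (pos_a / n_a) with that
    in group B (pos_b / n_b) and returns (chi2, two-sided p-value).
    """
    a, b = pos_a, n_a - pos_a  # group A: positives, negatives
    c, d = pos_b, n_b - pos_b  # group B: positives, negatives
    n = n_a + n_b
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, the chi-square survival function is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Assumed counts, back-calculated from the reported percentages (not from the text):
# smear-positive 31/49 (63.3%), smear-negative 46/79 (58.2%); 77/128 overall (60.2%).
chi2, p_value = chi_square_2x2(31, 49, 46, 79)
```

Under these assumed counts the test yields a P value of approximately 0.57, in line with the reported figure, which suggests the assumed counts and test are plausible but does not confirm them.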

DISCUSSION
Serology-based tests for TB diagnosis, though rapid, efficient and easily implemented, have exhibited unsatisfactory and suboptimal levels of sensitivity and specificity to date. One possible reason for this is the heterogeneity of the antibody response in patients with TB. The number and types of seropositive antigens vary from person to person and this variation may be linked to genetic polymorphisms of the human leukocyte antigen (HLA) class II alleles (30). WHO does not recommend the use of the current commercial serological tests for TB diagnosis but does still encourage further research and development in this field. Because TB antigens have sequences homologous with those of BCG and other common bacteria, TB serum antibodies inevitably cross-react with antigens from these bacteria, thus inducing high false-positive rates and reducing the specificity of the serodiagnostic tests. Blocking antibodies against these bacteria is a feasible way to reduce non-specific reactions. In the current study, in addition to Escherichia coli that had been used previously, lysates from other bacteria, including V. mimicus, S. aureus, B. subtilis, P. vulgaris, S. epidermidis, Enterobacter aerogenes, and S. citreus, were used to pre-adsorb the serum samples to block antibodies against bacterial antigens. There were no significant differences from adding V. mimicus, S. aureus, B. subtilis, S. epidermidis, and S. citreus, whereas the values of serum samples from the strong false-positive and weak false-positive groups significantly decreased after pre-adsorption with P. vulgaris, Enterobacter aerogenes, and Escherichia coli compared with the groups without any bacterial lysate. Further analysis revealed that Escherichia coli and Enterobacter aerogenes performed better. There were no significant differences resulting from the addition of a combined Escherichia coli and Enterobacter aerogenes lysate compared with only Escherichia coli, indicating the redundancy of adding E. aerogenes lysate. However, despite the partial increase in specificity, the false-positive rate for healthy controls remained high; hence, exploration of further measures was required to solve this problem.
Synthesizing peptides according to immunodominant antigens of M. tb could be an alternative and effective approach for serodiagnosis. B-cell epitope-containing peptides of RD1 (ESAT-6, CFP-10) and RD2 (CFP-21, MPT-64) antigens were used for immunodiagnosis of pulmonary TB (11). Afzal et al. constructed fusion proteins tn1FbpC1-tnPstS1 and tn2FbpC1-tnPstS1 with immunodominant B-cell epitope sequences and found that removal of a non-epitopic FbpC1 region (amino-acid residues 34-96) unmasked some of the epitopes, resulting in greater sensitivity (12). In the current study, these peptides were designed to contain fragments of TB antigens that are not conserved in other bacteria, as determined by bioinformatics analysis, and simultaneously retain dominant B-cell epitopes where possible. To investigate whether pre-adsorption with Escherichia coli lysate reduced non-specific
Since it was impossible to achieve optimal serodiagnostic performance by using a single antigen-based assay, strategies using multiple antigens either individually or as fusion polyproteins (or segments) have been recommended. Our preliminary experiment revealed that a crude mixture of several antigens did not significantly increase the rates of positive reactions, and was even counterproductive (Figure S2). A strategy of constructing and expressing a fusion polyprotein containing multiple polypeptides was therefore explored. Our previous study identified a cocktail of serodiagnostic antigens for active TB (18), and segments from nine of these antigens, containing non-conserved dominant B-cell epitopes, were combined as a fusion polyprotein in the current study, which may improve overall sensitivity and specificity compared with individual antigens and other combination forms.
Since HspX was a serodiagnostic antigen with 33.33% sensitivity and 100% specificity in our previous study (18), and expression of the fusion molecule HspX with other antigens was increased by about 50% as compared with that of the individual antigens, resulting in cheaper production of the fusion antigens (32), the whole sequence of HspX was retained in the fusion protein to enhance immunogenicity. Of the other eight antigens, pstS1 and mpt64 were the most frequently studied, while Rv1488, panD, echA3, cydA, Rv1825 and hns were novel antigens identified in our previous study. The 3D theoretical structure of the polyprotein comprising nine antigens or segments was predicted and the codon usage was adapted to the codon bias of Escherichia coli genes. In the current study, it was unreasonable to use HM models to predict the 3D structure of the polyprotein since it was artificially constructed, such that a structure obtained by HM was potentially misleading and unrealistic. A combination of ab initio protein structure prediction and protein disorder prediction (using I-TASSER and DEPICTER) was used to replace HM models (32). The normalized B-factor prediction indicated that the regions of the polyprotein enriched for linear/continuous B-cell epitopes were structurally more flexible and disordered. A similar observation was made using DEPICTER. However, the low C-score of the 3D model prediction indicated that this artificially constructed polyprotein lacked good templates in the protein structure library, and ab initio modeling of medium-to-large proteins without templates remains to be improved. As Roy et al. suggested (23), other sources of structural information, such as data from mutagenesis or crosslinking experiments on the target protein, can be specified as external restraints to improve the modeling quality.
The fusion polyprotein was successfully expressed and an indirect ELISA assay was conducted to evaluate the sensitivity and specificity of the polyprotein. The sensitivity of the polyprotein was 60.2%, which was higher than that of HspX and other individual antigen segments. The specificity of the polyprotein was 93.8%, which was not significantly decreased compared with HspX and other individual antigen segments. Previous studies indicated that there were some associations between clinical backgrounds, including sex, age, bacterial loads, and chest X-ray status, and antibody reactivity of TB diagnostic antigens (33). However, there were no significant differences between the sensitivities of the polyprotein in sputum smear-positive and -negative samples (63.3% vs 58.2%, P = 0.57), and in chest X-ray-positive and -negative samples (55.6% vs 62.0%, P = 0.51) in the current study. IgM and IgA detection can supplement IgG detection in advanced TB and the simultaneous detection of IgG/IgM/IgA may improve the positivity rate. Li et al. found that the mixture of anti-human IgG and IgM added to a well [Ig(G + M)] had a stronger immunoreactivity to PstS1-LEP than the single antibody (34). Abebe et al. found that there were significant variations in IgA, IgG, and IgM responses to the different antigens, but not all antibody isotype responses are markers of clinical TB (35). Here we used HRP-LD5 to detect human IgG, IgM, and IgA. However, neither the sensitivities nor the specificities of the polyprotein differed significantly between IgG and LD5 detection. The lack of unified standards for antibody-based diagnosis has made the results of different studies variable and controversial, indicating that further research and development in this field are needed.
In summary, serological tests represent an attractive option for TB diagnosis. However, the unsatisfactory sensitivities and specificities of the currently available options based on single or multiple target antigens do not yet meet the requirements for clinical use. This study therefore explored measures to solve this problem. Serum samples were pre-adsorbed with lysates from common bacteria to block antibodies against these bacteria, and polypeptides covering non-conserved dominant B-cell epitopes of antigens were synthesized, to reduce non-specific reactions and thus improve the specificity of the serological tests. Furthermore, construction and evaluation of a fusion polyprotein containing HspX and eight other antigen segments revealed that the sensitivity of the polyprotein was 60.2%, which was higher than that of HspX and other individual antigen segments, while there was no significant decrease in the specificity of the polyprotein (93.8%). This study demonstrates the roles of fusion polyproteins in the humoral immune response against TB infection and provides a potential novel approach for development of TB diagnostics.

DATA AVAILABILITY STATEMENT
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by Naval Medical University. The patients/participants provided their written informed consent to participate in this study.