Finding Potential Therapeutic Targets against Shigella flexneri through Proteome Exploration

Background: Shigella flexneri is a Gram-negative bacterium that causes the infectious disease “shigellosis.” S. flexneri causes diarrhea, fever, and stomach cramps in humans. Patients infected with Shigella are mostly treated with antibiotics, but antibiotic resistance can significantly hinder treatment. Once essential therapeutic targets are identified, vaccines and drugs could provide effective therapy for shigellosis. Methods: The study was designed to identify and qualitatively characterize potential drug targets from S. flexneri by subtractive proteome analysis. A set of computational tools was used to identify essential proteins that are required for the survival of S. flexneri. The total proteome (13,503 proteins) of S. flexneri was retrieved from NCBI and analyzed by subtractive channel analysis. After identifying the metabolic proteins, we also characterized them qualitatively to pave the way for the identification of promising drug targets. Results: Subtractive analysis revealed 53 human non-homologous essential metabolic proteins of S. flexneri that might be used as potential drug targets. We also found that 11 drug targets are involved in unique pathways. Most of these proteins are cytoplasmic, can be used as broad-spectrum drug targets, can interact with other proteins, and show druggable properties. The functionality and drug binding site analyses suggest a promising way to design new drugs against S. flexneri. Conclusion: Among the 53 therapeutic targets identified through this study, 13 were found to have high potential as drug targets based on their physicochemical properties, whilst only one was found to be a vaccine target against S. flexneri. The outcome might also be used for module as well as circuit design in systems biology.

Shigella species are Gram-negative facultative human pathogens that cause intestinal infections with signs and symptoms such as fever, abdominal cramps, and watery or bloody diarrhea. Recent evidence suggests that diarrhea is the third leading cause of global infant mortality (Black et al., 2010). Children below the age of five are the most affected by Shigella (Kotloff et al., 1999; Peng et al., 2002). The majority of endemic dysentery cases are caused by S. flexneri in regions of the world with poor sanitation facilities. In addition, improper use of antibiotics has rendered Shigella resistant. Therefore, antimicrobial development could be a better alternative for the prevention of antibiotic-resistant Shigella, as it is non-toxic, cheap, easy to apply, and produces life-long immunity (Kärnell et al., 1995; Coster et al., 1999; Mukhopadhaya et al., 2003, 2006; Katz et al., 2004; Ranallo et al., 2005; Paterson, 2006). Recently, it has been reported that Shigella serotypes including S. flexneri 2a, 3a, S. dysenteriae 1, and S. sonnei are targeted for vaccine development. In developing countries, the first three are more widespread, while the last serotype is found in regions where sanitation standards are high (Jennison and Verma, 2004; Mukhopadhaya et al., 2006).
Shigella infection is mostly treated with antibiotics. However, the failure rate is increasing day by day due to acquired resistance to commonly used antibiotics (Nessar et al., 2012). Because of emerging multi-drug resistance and the absence of optimal treatment, the discovery of new drugs is necessary to combat these infections. Identifying novel drug target(s) is therefore one of the key routes to drug discovery. The present study aims to identify potential drug targets in S. flexneri by subtractive proteome analysis. Traditional drug discovery requires more time, expensive experiments, and laborious effort, whereas a computational approach could be an effective alternative that accelerates the drug discovery process. The identification of drug targets has been advanced by coupling "omics" data, viz., genomics, proteomics, and metabolomics, with computational approaches. Genome sequencing and proteome elucidation of disease-causing organisms are advancing the search for drug targets based on essential genes of specific pathogens, host-pathogen interaction factors, persistence proteins, resistance genes/resistance-associated proteins, metabolic pathways, and prediction of gene expression levels (Galperin and Koonin, 1999; Yeh et al., 2004; Briken, 2008; Raman et al., 2008; Barh et al., 2011; Vetrivel et al., 2011). These approaches have already been used to identify novel drug targets in several life-threatening pathogens, including Mycobacterium tuberculosis (Anishetty et al., 2005; Asif et al., 2009), M. leprae (Shanmugam and Natarajan, 2010), M. ulcerans (Butt et al., 2012), Helicobacter pylori (Sarkar et al., 2012), Streptococcus pneumoniae (Singh et al., 2007), Yersinia pestis (Sharma and Pan, 2012), and Pseudomonas aeruginosa (Sakharkar et al., 2004).
The foremost criteria for identifying promising therapeutic candidates are essentiality and selectivity/specificity. Inhibiting a protein that is important to the pathogen can lead to its death, but because drugs may also interact with host proteins, the targets must be specific to the pathogen to avoid undesired interactions with the host. The present study integrates various computational methods for the identification and characterization of drug targets of S. flexneri, which enabled us to identify 53 potential therapeutic targets based on essentiality and specificity. The qualitative characterization of the 53 therapeutic candidates predicts their uniqueness in metabolic pathways, capability to act as broad-spectrum targets, cellular location, cellular function, functional association with metabolic proteins, and druggability properties.

MATERIALS AND METHODS
A systematic in silico method consisting of three stages was applied to identify and characterize possible drug targets against S. flexneri. In stage I, the protein dataset was collected from the NCBI FTP site for analysis. In stage II, the protein datasets were refined through a subtractive channel of analysis, and in stage III the possible drug targets obtained from stages I and II were qualitatively characterized.

Stage I: Mining of Protein Datasets
The whole protein sequences of S. flexneri 2a strain 2457T were retrieved from the NCBI (http://www.ncbi.nlm.nih.gov/) protein database in FASTA format.

Stage II: Subtractive Channel of Analysis
The protein datasets were then filtered through a sequence of subtractive proteome analyses. This process allows highly selective and effective drug targets to be identified.

Identification of Paralog Proteins
Paralog proteins were identified by submitting the S. flexneri proteins to the CD-HIT suite (Huang et al., 2010). The CD-HIT server clusters the proteins in a FASTA file by "sequence identity." Users can set a sequence identity of 10-90% in the "sequence identity cut-off" box depending on their requirement (Huang et al., 2010). A 60% sequence identity cut-off is widely accepted as a rigid criterion for removing duplicate proteins (Dutta et al., 2006; Barh and Kumar, 2009; Rahman et al., 2014; Mondal et al., 2015; Hasan et al., 2016). Therefore, we used the CD-HIT server (http://weizhong-lab.ucsd.edu/cdhit_suite/cgi-bin/index.cgi?cmd=cd-hit) with 0.6 (60%) manually set in the "sequence identity cut-off" box for the stringent selection of duplicate proteins. The duplicates were omitted, and from the remaining set of non-paralog proteins those with more than 100 amino acids were retained for further selection.
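The length filter applied after duplicate removal can be illustrated with a short script. This is a minimal sketch, not the authors' pipeline: it assumes the non-paralog proteins are available as FASTA text, and the function names `read_fasta` and `drop_mini_proteins` are hypothetical. The default cut-off keeps sequences with more than 100 residues, following the description above.

```python
def read_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    records, header, chunks = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def drop_mini_proteins(records, min_length=100):
    """Keep only proteins strictly longer than min_length residues."""
    return [(h, s) for h, s in records if len(s) > min_length]
```

Whether a protein of exactly 100 residues should be kept is a boundary choice made here for illustration; the strict comparison can be relaxed if the other convention is intended.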

Identification of Orthologs in Gut Flora
The non-duplicate proteins from the preceding step were analyzed for similarity to the proteome of the human gut microbiota (Fujimura et al., 2010). These proteins were searched with BLAST (Altschul et al., 1990) against the gut flora proteome available from the literature, with an e-value threshold of 0.0001, to exclude orthologs.

Identification of Orthologs in Human Proteome
The qualified proteins from the preceding step were searched with BLASTp (Altschul et al., 1990) against the non-redundant database of H. sapiens with an expectation value (e-value) threshold of 0.0001. Proteins that showed no hits at this e-value were selected for the next step.
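The "no hit below the threshold" selection can be sketched as follows. The example assumes the BLASTp results were saved in the standard 12-column tabular format (query ID in the first column, e-value in the eleventh); the source does not state which output format was used, and the function names are hypothetical. Treating an e-value equal to the cut-off as a hit is also an assumption.

```python
def queries_with_hits(blast_tab_text, evalue_cutoff=1e-4):
    """Return the set of query IDs that have at least one hit at or below the cut-off."""
    hits = set()
    for line in blast_tab_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        # Column 1 is the query ID, column 11 the e-value (tabular BLAST output).
        if float(fields[10]) <= evalue_cutoff:
            hits.add(fields[0])
    return hits


def non_homologous(query_ids, blast_tab_text, evalue_cutoff=1e-4):
    """Keep the queries that show no significant hit against the host proteome."""
    hits = queries_with_hits(blast_tab_text, evalue_cutoff)
    return [q for q in query_ids if q not in hits]
```

The same parser, keeping the queries that do have a significant hit, would mirror the essentiality screen against DEG described in the next subsection.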

Essentiality Analysis
Non-homologous proteins were screened by BLASTp against the Database of Essential Genes (DEG; Zhang and Lin, 2009) to identify essential proteins. Protein alignments with an expect value of <0.0001 (Barh et al., 2011; Sharma and Pan, 2012) were considered significant hits.

Metabolic Pathway Analysis
Metabolic pathway analysis was performed with the KAAS server at KEGG to classify the possible targets among the human non-homologous essential proteins of S. flexneri obtained from DEG (Moriya et al., 2007). KAAS provides functional annotation of genes by BLAST comparison against the manually curated KEGG GENES database. The result comprises KO (KEGG Orthology) assignments that identify the metabolic proteins.

Stage III: Qualitative Characterization of the Short-Listed Targets
Detection of Proteins Involved in Unique Pathways
The KEGG (Kyoto Encyclopedia of Genes and Genomes) Genome Database (Kanehisa et al., 2014) was used to find the metabolic pathways of S. flexneri that are distinct from those of H. sapiens. To compare the metabolic pathways, the three-letter organism codes of the host and the pathogen were entered in the Genome Comparison and Combination box. The pathway maps generated by KEGG Genome were then recognized as distinctive pathways. The previously selected human non-homologous metabolic proteins were screened for their involvement in those distinctive pathways.

Cellular Localization Analysis
In this study, PSORTb 3.0.2 (Yu et al., 2010), CELLO 2.0 (Yu et al., 2006), SignalP 4.1 (Petersen et al., 2011), and Phobius (Käll et al., 2007) were used to identify the location of the shortlisted proteins. These tools use various modules such as SVM, S-TMHMM, and SCL-BLAST, with proteins of known localization from bacteria (Gram-positive and Gram-negative) and archaea as the training set. These servers were used for accurate localization of the identified targets.

Broad Spectrum Analysis
A BLASTp (Altschul et al., 1990) search against a wide range of pathogenic bacteria, with an expectation value threshold of 0.005, was performed to find broad-spectrum targets. In total, 240 disease-causing bacteria from different genera, including other serotypes of S. flexneri, were used in the broad-spectrum analysis. A Cluster of Orthologous Groups of proteins (COG) search was used to identify homologs of the shortlisted targets in other pathogenic bacteria by means of COGnitor from NCBI, which matches the query sequence against the COG database.

Interactome Analysis
STRING 9.0 (Szklarczyk et al., 2011) was used to build a protein-protein interaction network for each of the shortlisted targets. All interactors with a low or medium confidence score (<0.700) were removed from the network to avoid false positives and false negatives.
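The confidence filter can be expressed in a few lines. The sketch below assumes the network has been exported as (protein A, protein B, combined score) tuples with scores on the 0 to 1 scale; the helper names and the low-scoring example edge are hypothetical, while the folP-folK score of 0.999 is the value reported in this study.

```python
def high_confidence_edges(edges, min_score=0.700):
    """Keep interactions whose combined score is at least min_score."""
    return [(a, b, s) for a, b, s in edges if s >= min_score]


def node_degree(edges):
    """Count how many retained interactions each protein takes part in."""
    counts = {}
    for a, b, _ in edges:
        counts[a] = counts.get(a, 0) + 1
        counts[b] = counts.get(b, 0) + 1
    return counts


# Illustrative input: only the folP-folK score comes from the text.
example = [("folP", "folK", 0.999), ("folP", "hypX", 0.420)]
kept = high_confidence_edges(example)
```

The degree count gives a simple view of how many high-confidence partners remain for each query protein after filtering.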

Functionality Analysis
INTERPROSCAN, a tool that incorporates various protein signature recognition methods and databases, was used to predict the role of the hypothetical proteins from the list of possible targets (Mulder and Apweiler, 2007).

Druggability Analysis
In the current study, a similarity search was performed against the DrugBank 3.0 target collection for all of the targets (Knox et al., 2011).

Docking Analysis
We used AutoDock Vina (Trott and Olson, 2010) to analyze binding affinity to the druggable targets. We retrieved all the interacting drugs (.pdb files) for the 6 druggable targets that were found by virtual screening in the DrugBank database. We then generated the .pdbqt files of these 6 druggable targets for the docking experiments. Blind docking was performed to identify the most effective binding site of these drugs. A grid box covering the whole protein was set for all docking runs. Furthermore, we used the ProtParam (http://web.expasy.org/protparam/) and VaxiJen (http://www.ddg-pharmfac.net/vaxijen/VaxiJen/VaxiJen.html) servers to identify the most suitable drug/vaccine target.
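For blind docking, a grid box that encloses the whole protein can be derived from the atomic coordinates. The following is a minimal sketch under stated assumptions: the receptor is a standard fixed-column PDB file, the 5 Å padding is an arbitrary illustrative value, and the function name is hypothetical. The returned center and size correspond to the `center_x/y/z` and `size_x/y/z` settings of AutoDock Vina; the actual box dimensions used in this study are not given in the text.

```python
def bounding_grid_box(pdb_text, padding=5.0):
    """Return (center, size) of a box enclosing all atoms, in angstroms."""
    xs, ys, zs = [], [], []
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")):
            # PDB fixed columns: x = 31-38, y = 39-46, z = 47-54 (1-based).
            xs.append(float(line[30:38]))
            ys.append(float(line[38:46]))
            zs.append(float(line[46:54]))
    if not xs:
        raise ValueError("no ATOM/HETATM records found")
    axes = (xs, ys, zs)
    center = tuple((max(v) + min(v)) / 2 for v in axes)
    size = tuple((max(v) - min(v)) + 2 * padding for v in axes)
    return center, size
```

The padding leaves room for ligands that bind at the protein surface; a larger box increases the search space and may call for a higher exhaustiveness setting.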

Data Acquisition
A number of potential therapeutic targets in S. flexneri were detected and categorized through an in silico method. The strategy applied a hierarchy of subtractive analyses in which functionally essential proteins of S. flexneri were derived from the whole-proteome dataset of 13,503 proteins (Drug Target, Data Sheet S1). A number of subtractive channels were used to select potential candidates that could serve as either drug or vaccine targets for therapeutic treatment. Overall, the strategy comprised stages of data mining, subtractive channel analysis, and qualitative characterization.

Subtraction of Duplicate and Mini Proteins
The CD-HIT suite identified the duplicate or paralog proteins in the proteome. This tool sorted out 4559 proteins as non-duplicates at a sequence identity of up to 60% (Drug Target, Data Sheet S2). Proteins that showed more than 60% identity were considered duplicates or paralogs in this analysis. We chose 60% similarity as the cut-off to maintain very stringent selection criteria for identifying the most effective therapeutic targets. We also kept only one protein from each pair of near-identical sequences (>60% similarity), as they might share protein domains, motifs, binding sites, etc. The non-duplicate proteins vary in length. Proteins shorter than 100 amino acids, known as mini proteins, were excluded from the analysis (Wang et al., 2008; Kumar et al., 2010; Barh et al., 2011). Prokaryotic genomes contain a high level of mini proteins, which have key roles in numerous biological phenomena as well as regulatory purposes (Kumar et al., 2010). These mini proteins were removed from the non-duplicate proteins as they are less likely to represent essential therapeutic candidates (Drug Target, Data Sheet S3). In addition, larger amino acid sequences are more likely to be involved in essential metabolic pathways (Haag et al., 2012).

Subtraction of Orthologs in Gut Flora
A critical step in this study is to identify proteins that are non-homologous to gut flora proteins, in order to avoid severe adverse effects in the host. Around 10^14 microorganisms are reported to exist in the gastrointestinal tract of a normal healthy human (Fujimura et al., 2010). The gut microbiota maintains a symbiotic relationship with the host: it helps metabolism by fermenting indigestible food particles and defends against colonization of the gut by pathogenic bacteria (Rabizadeh and Sears, 2008). If gut flora proteins are inhibited unintentionally, the microbiota may decline, which may harm the host. Therefore, human gut flora proteins were analyzed to ensure non-similarity with our selected proteins. BLASTp was employed to identify homologs between our selected proteins and gut flora proteins. Eventually, we have identified 2708 gut flora similar proteins (Drug Target, Data Sheet S4 and Figure 1).

Subtraction of Orthologs in Human Proteome
The main goal of this analysis is to identify pathogen-specific proteins. This step is important to reduce unwanted cross-reactivity of the drug by preventing it from binding to the active sites of homologous proteins in the host (Sarkar et al., 2012). In in silico drug target identification, filtering out proteins homologous to the human proteome is considered the first step (Anishetty et al., 2005; Sarkar et al., 2012). This non-similarity analysis was carried out on the dataset of proteins non-similar to gut flora. BLASTp was used for a similarity search against the whole proteome of H. sapiens (host) with an e-value threshold of 0.0001 (Altschul et al., 1990). Proteins that showed hits against the human proteome were considered close homologs. Of the 2708 input proteins, 1987 homologous proteins were omitted and 721 proteins that are non-homologous to human were retained (Drug Target, Data Sheet S5 and Figure 1).

Identification of Essential Genes in S. flexneri
The DEG server was used to screen for essential genes of S. flexneri in the list of proteins non-homologous to the human proteome, with an expectation value of 0.0001. DEG 6.1 is a repository of genes necessary for the survival of an organism. It comprises 10,618 essential genes from prokaryotic and eukaryotic organisms. Proteins with hits in DEG were considered essential, as it has been demonstrated that similar proteins that are crucial in one organism are likely to be essential in another. A potential drug target must be an indispensable protein that is crucial for the existence of the pathogen (Sarkar et al., 2012). Of the 721 non-homologous input proteins, 67 (Drug Target, Data Sheet S6 and Figure 1) were nominated for subsequent analysis and considered vital for the existence of the pathogen, as they had homologs with e-values not exceeding the given threshold (Supplementary Table, S1). Proteins showing no hit against DEG were omitted from the analysis and regarded as non-essential.

Metabolic Pathway Analysis
The output of the KAAS server helps to identify potential drug targets by revealing KEGG pathways as well as KO (KEGG Orthology) assignments. Of the essential proteins, 53 are involved in metabolic pathways (Drug Target, Data Sheet S7 and Figure 1). These 53 proteins (Supplementary Table, S2) play key roles in metabolism for bacterial survival.
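The counts reported at each subtraction stage can be tabulated to show how many proteins each filter removed. The sketch below uses only numbers stated in the text; it assumes, following the human-proteome step, that 2708 is the number of proteins carried forward from the gut flora screen, and the count after mini-protein removal is omitted because it is not reported. The stage labels and the function name are illustrative.

```python
# (stage label, proteins remaining), in the order the filters were applied
STAGES = [
    ("total proteome", 13503),
    ("non-duplicate (CD-HIT, 60%)", 4559),
    ("input to human homology screen", 2708),
    ("non-homologous to human", 721),
    ("essential (DEG hits)", 67),
    ("metabolic pathway proteins (KAAS)", 53),
]


def funnel(stages):
    """Return (stage, remaining, removed_by_this_stage) for each filter."""
    rows = []
    for (_, previous), (name, remaining) in zip(stages, stages[1:]):
        rows.append((name, remaining, previous - remaining))
    return rows


for name, remaining, removed in funnel(STAGES):
    print(f"{name}: {remaining} kept, {removed} removed")
```

The row for the human homology screen reproduces the 1987 omitted proteins reported above, which is a quick consistency check on the stage counts.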

Unique Pathway Analysis
Besides identifying metabolic pathway proteins, we also analyzed the unique metabolic pathway proteins, which addresses the question of whether the metabolic pathway proteins are also present in the host. Here, we found 11 unique metabolic pathway proteins that are present only in bacterial metabolic pathways (Table 1). These unique proteins were found in the following pathways: Purine metabolism, Pyrimidine metabolism, Fructose and mannose metabolism, Amino sugar and nucleotide sugar metabolism, Lipopolysaccharide biosynthesis, Pyruvate metabolism, Propanoate metabolism, Butanoate metabolism, Lysine biosynthesis, Terpenoid backbone biosynthesis, Phosphotransferase system (PTS), Peptidoglycan biosynthesis, Flagellar assembly, Arginine and proline metabolism, and bacterial pathogenic cycle.

Subcellular Localization
Proteins can be found in five possible subcellular locations: cytoplasm, periplasm, plasma membrane, outer membrane, and extracellular space. The localization study is important for classifying a protein as a drug or vaccine target. Surface membrane proteins and cytoplasmic proteins can be used as vaccine and drug targets, respectively (Barh et al., 2011). Protein databases like UniProt contain information about the subcellular location of some proteins. PSORTb 3.0.2 (Yu et al., 2010), CELLO 2.0 (Yu et al., 2006), SignalP (Petersen et al., 2011), and Phobius (Käll et al., 2007) were used to predict subcellular localization. We therefore analyzed the 53 essential human non-homologous metabolic pathway proteins to predict their subcellular localization. From this analysis, we found 29 targets to be cytoplasmic, 14 inner membrane, 6 periplasmic, 3 outer membrane, and only 1 extracellular (Table 2).
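The localization counts can be checked against the total of 53 targets and expressed as proportions. This is only an illustrative tally of the numbers reported above; the dictionary and function names are hypothetical.

```python
# Reported number of targets per predicted location (Table 2 summary).
LOCALIZATION = {
    "cytoplasmic": 29,
    "inner membrane": 14,
    "periplasmic": 6,
    "outer membrane": 3,
    "extracellular": 1,
}


def percentages(counts):
    """Return the percentage of targets in each location, rounded to 1 decimal."""
    total = sum(counts.values())
    return {loc: round(100 * n / total, 1) for loc, n in counts.items()}


total_targets = sum(LOCALIZATION.values())
shares = percentages(LOCALIZATION)
```

The five categories sum to 53, so every shortlisted target received exactly one predicted location.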

Broadspectrum Analysis
The development of drug resistance can be greatly reduced by analyzing such broad-spectrum and pathogen-specific targets (Raman et al., 2008). In this study, the list of pathogenic bacteria described in the literature was considered (Griffith et al., 2007; Raman et al., 2008; Shenai et al., 2009). Similarity analysis against each pathogen suggests that targets with close homologs in a larger number of pathogens are more likely to be "promising broad spectrum targets" (Raman et al., 2008). Therefore, we also analyzed broad-spectrum targets by comparing the selected 53 essential metabolic proteins with the sequences of the 240 pathogens, comprising Shigella species as well as other clinically important bacterial pathogens (Drug Target, Data Sheet S7). This was done by BLASTp analysis. In total, 25 targets were found to have close identity with less than 10 bacterial pathogens, 10 targets with less than 20, 8 targets with less than 30, 7 targets with less than 40, and 3 targets with less than 50 bacterial pathogens (Table 3). From these results, it is concluded that these 3 (NP_836465.1, NP_836278.1 and NP_835876.1) or 7 (NP_837443.1, NP_837438.1, NP_836948.1, NP_836937.1, NP_836681.1, NP_836675.1 and NP_836672.1) targets are the exclusive proteins of Shigella.
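The grouping of targets by the number of pathogens in which a close homolog was found can be sketched as a simple binning step. The bin width of 10 follows the groups reported above; the function names and the per-target counts in the usage example are hypothetical, since the individual values are given only in Table 3.

```python
from collections import Counter


def bin_label(n_pathogens, width=10):
    """Label the bin for a homolog count, e.g. 9 -> '<10', 10 -> '<20'."""
    upper = (n_pathogens // width + 1) * width
    return f"<{upper}"


def bin_targets(homolog_counts, width=10):
    """Count targets per bin from a {target_id: n_pathogens} mapping."""
    return Counter(bin_label(n, width) for n in homolog_counts.values())


# Reported group sizes; they should account for all 53 targets.
REPORTED = {"<10": 25, "<20": 10, "<30": 8, "<40": 7, "<50": 3}
```

For example, `bin_targets({"t1": 4, "t2": 12, "t3": 47})` would place one hypothetical target in each of the "<10", "<20", and "<50" groups.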

Interactome Analysis
After data mining, the finally identified proteins were analyzed and depicted as a PPI network (Figure 2). The network was established by STRING analysis. In STRING, each protein interacts with a number of proteins, and the strength of each interaction is given as a score. The interaction score depends on neighborhood in the genome, gene fusions, co-occurrence across genomes, co-expression, association in curated databases, and text mining (Jensen et al., 2009). The protein network contains high-confidence interactors with a score greater than or equal to 0.700. The potential of the targets was determined from the variation in critical network parameter values. The significance of a query protein in the bacterial metabolic system was assessed from the number of interacting proteins (nodes) and interactions (edges) disrupted on its deletion (Kushwaha and Shakya, 2010). At a low confidence score (0.150), all of the proteins interacted with each other. At the highest confidence score (0.900), 30S ribosomal protein (rpsD), DNA-directed RNA polymerase subunit alpha (rpoA), RNA polymerase sigma factor RpoS (RpoS), RNA polymerase-binding transcription factor (dksA), anti-RNA polymerase sigma 70 factor (rsd), Holliday junction resolvase (ruvC), Holliday junction DNA helicase (RuvA), 7,8-dihydropteroate synthase (folP), and 2-amino-4-hydroxy-6-hydroxymethyldihydropteridine pyrophosphokinase (folK) were shown to interact with each other (Figure 2). The highest interaction confidence score, 0.999, was found between folP and folK. The homologous genes of those proteins are neighbors in other species, and in addition, putative homologs are reported to interact in other species.

Functionality Analysis
From the subtractive analysis, we found 6 hypothetical (uncharacterized) proteins in the list of 53 potential targets (Drug Target, Data Sheet S8). Attempts were made to characterize their function. We used INTERPROSCAN to characterize their functional domains, molecular functions, and biological processes (Table 4). However, in some cases no molecular function or biological process was predicted.

Binding Site Analysis
We selected the best template from the Local Meta-Threading-Server (LOMETS; Wu and Zhang, 2007), choosing templates that more than one threading program identified with high scores (align length, coverage, Z-score, and confidence score; Supplementary Figure S1). After selecting the template, we prepared the required files for the sequence of interest and the template (.ali, .pir, .bin, and Python files). We ran the Python-based Modeller software (Modeller 9.17) to generate a number of models (Webb and Sali, 2014). We then applied both the DOPE (Discrete Optimized Protein Energy) and GA341 methods to select the best model from those generated. As GA341 is not as good as DOPE at differentiating between high- and low-quality models, we calculated the DOPE score, which is designed for selecting the best structure from a collection of models built by MODELLER. We selected the model with the lowest DOPE score, as a lower DOPE score is reported to indicate a better model. Thereafter, we assessed the quality of every model built by Modeller using the Ramachandran plot (Laskowski et al., 1993), Verify3D (Eisenberg et al., 1997), and ProSA (Wiederstein and Sippl, 2007). All three tools showed satisfactory results, which are summarized in Supplementary Table, S3, Figures, S2, S3. These results confirmed that the predicted model of each hypothetical protein is of high quality. All the predicted models showed above 90% of residues in the favorable region (Supplementary Table, S3). Verify3D and ProSA also confirmed that the built models are good (Supplementary Figures, S2, S3). We also analyzed the binding site and binding site residues of these hypothetical proteins (Table 5 and Supplementary Figures, S4-S9).
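The model selection rule (lowest DOPE score wins) can be illustrated without running Modeller itself. The sketch below assumes the scores have already been collected into a list of dictionaries; the field names, file names, and score values are hypothetical and are not results from this study.

```python
def best_by_dope(models):
    """Return the model with the lowest (most negative) DOPE score."""
    if not models:
        raise ValueError("no models to choose from")
    return min(models, key=lambda m: m["dope"])


# Hypothetical scores for three generated models of one target.
candidates = [
    {"name": "model_1.pdb", "dope": -21450.3, "ga341": 1.00},
    {"name": "model_2.pdb", "dope": -21987.6, "ga341": 1.00},
    {"name": "model_3.pdb", "dope": -21102.9, "ga341": 0.98},
]
selected = best_by_dope(candidates)
```

In this illustration all three models have similar GA341 values, which reflects why DOPE, rather than GA341, is used to rank them.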

Druggability Analysis
High binding affinity for drug-like molecules is a major feature of a "druggable target." DrugBank contains 6816 experimental and FDA-approved drugs, 4326 drug targets, and 169 drug enzymes/carriers. The module estimates the degree of similarity using the BLASTp similarity search option (Mondal et al., 2015). The presence of a target in the database serves as a sign of its druggable property. In contrast, its absence indicates the novelty of the target, which is therefore categorized as a "novel target" (Knox et al., 2011). From the selected 53 targets, we found 6 targets (Drug Target, Data Sheet S9) that showed similarity with approved, investigational, and experimental drugs when the BLASTp search was run against the DrugBank database with cut-off parameters (Table 6). We observed that target EFS15406.1 showed the highest similarity, with 15 drugs that could act as inhibitors (both approved and investigational). The docking energies also indicated that the interacting drugs have good binding affinity for their respective targets (Table 6). Therefore, of the 53 therapeutic targets, 47 could be considered novel drug targets, as the remaining 6 targets showed druggable properties.

Implications for Drug and Vaccine Development
The 53 targets identified in the current study for vaccine and drug design against S. flexneri will pave the way for the development of effective drug(s) and vaccine(s) (Figure 3). Among the 53 targets, 43 were found to be novel drug targets (Figure 3 and Table 6). The development of such therapeutics might be guided by their qualitative characteristics. A unique metabolic pathway specific to S. flexneri might be a target for drug development (Table 1). Drug and vaccine development might also consider the cellular location of the targets: the bacterial surface appears to be important for immunogenicity, and surface proteins are more accessible for vaccination.
ideal target for the epitope based peptide vaccine as it showed probable antigenicity and non-allergenicity (Supplementary Table, S4). Membrane proteins are the gateways to the cell: many nutrients, ions, waste products, and even DNA and proteins enter and leave cells via membrane proteins, which are tightly controlled to maintain the integrity of the cell. Drugs often target membrane proteins; therefore, understanding their molecular structure helps in designing better drugs to cure diseases. We found that 14 targets identified through this analysis may be suitable for drug design (Table 2).
In addition, membrane-localized proteins are sometimes difficult to purify and assay (Duffield et al., 2010); therefore, cytoplasmic proteins are more favorable as drug targets. We found 35 targets that might be suitable for drug design (Table 2).
A promising drug target should have favorable physicochemical properties, such as increased hydrophobicity, in vivo half-life, propensity for being membrane bound, and protein stability (Bull and Doig, 2015; Supplementary Table, S5). We observed 13 targets as novel candidates, of which only the NP_835770.1 target was found to be druggable (Figure 3).
The functions of each identified target, including the hypothetical targets, may also be exploited to identify or design drugs/vaccines for effective therapy against S. flexneri. From the broad-spectrum analysis, we identified some targets that might be used as "promising broadspectrum targets," as they cover targets in more than 40-50 pathogenic bacteria including S. flexneri. Among the 13 novel drug targets, we identified NP_837443.1 as the most suitable drug candidate, with target similarity in 40 pathogenic bacteria including Shigella (Figure 3). Identifying suitable target(s) is the key step in drug discovery. Our promising target (NP_837443.1) plays an important role in the Holliday junction, which exchanges segments of genetic information during recombination, as well as in helicase activity, for the survival of S. flexneri. For drug discovery against this target, the conserved domain responsible for Holliday junction activity needs to be identified. The effective binding site of this domain and its active residues should also be identified for drug design. Finally, measurement of drug-target binding affinity at this binding site, together with ADMET (Absorption, Distribution, Metabolism, Excretion, and Toxicity) and QSAR (Quantitative Structure-Activity Relationship) analyses, will lead the way to inhibiting the function of this target protein. Further, animal model experiments will be needed to confirm drug efficacy against this target.
As the selection of effective therapeutic targets is crucial for vaccine and drug design (Hasan et al., 2015; Khan et al., 2015; Oany et al., 2015; Hossain et al., 2016a,b; Hossain and Oany, 2016d) as well as for building protein-protein interaction networks (Hossain et al., 2016c), our subtractive strategy may have a considerable impact on therapeutics discovery against S. flexneri.

CONCLUSION
Subtractive channel analysis of the proteome of S. flexneri 2a str. 2457T suggests 53 drug targets, and their qualitative characterization elucidates unique metabolic pathways, localization of targets, broad-spectrum potential, functionality, interactome, and druggable properties. This study might facilitate the development of drugs and vaccines against S. flexneri and may also prove useful for identifying drug candidates against other pathogens.

AUTHOR CONTRIBUTIONS
MS: Conceived, designed, and guided the study, analyzed the data, helped in drafting, and performed critical revision. CK: Guided the study, acquired and analyzed the data, and helped in drafting the manuscript. MM and AH: Guided the study, analyzed the data, and helped in drafting the manuscript. MK and MI: Helped in bioinformatics analysis and drafted the manuscript. MH: Helped to design the study, performed bioinformatics analysis, drafted and developed the manuscript, and performed critical revision. All authors have approved the manuscript.