<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article article-type="research-article" dtd-version="2.3" xml:lang="EN" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Genet.</journal-id>
<journal-title>Frontiers in Genetics</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Genet.</abbrev-journal-title>
<issn pub-type="epub">1664-8021</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="publisher-id">809001</article-id>
<article-id pub-id-type="doi">10.3389/fgene.2021.809001</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Genetics</subject>
<subj-group>
<subject>Original Research</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>SNAREs-SAP: SNARE Proteins Identification With PSSM Profiles</article-title>
<alt-title alt-title-type="left-running-head">Zhang et&#x20;al.</alt-title>
<alt-title alt-title-type="right-running-head">SNARE Proteins Identification With PSSM</alt-title>
</title-group>
<contrib-group>
<contrib contrib-type="author" corresp="yes">
<name>
<surname>Zhang</surname>
<given-names>Zixiao</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
<xref ref-type="corresp" rid="c001">&#x2a;</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Gong</surname>
<given-names>Yue</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
<uri xlink:href="https://loop.frontiersin.org/people/1542946/overview"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Gao</surname>
<given-names>Bo</given-names>
</name>
<xref ref-type="aff" rid="aff2">
<sup>2</sup>
</xref>
<uri xlink:href="https://loop.frontiersin.org/people/1225393/overview"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Li</surname>
<given-names>Hongfei</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Gao</surname>
<given-names>Wentao</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
</contrib>
<contrib contrib-type="author" corresp="yes">
<name>
<surname>Zhao</surname>
<given-names>Yuming</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
<xref ref-type="corresp" rid="c001">&#x2a;</xref>
</contrib>
<contrib contrib-type="author" corresp="yes">
<name>
<surname>Dong</surname>
<given-names>Benzhi</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
<xref ref-type="corresp" rid="c001">&#x2a;</xref>
<uri xlink:href="https://loop.frontiersin.org/people/1544326/overview"/>
</contrib>
</contrib-group>
<aff id="aff1">
<label>
<sup>1</sup>
</label>
<institution>College of Information and Computer Engineering, Northeast Forestry University</institution>, <addr-line>Harbin</addr-line>, <country>China</country>
</aff>
<aff id="aff2">
<label>
<sup>2</sup>
</label>
<institution>Department of Radiology, The Second Affiliated Hospital, Harbin Medical University</institution>, <addr-line>Harbin</addr-line>, <country>China</country>
</aff>
<author-notes>
<fn fn-type="edited-by">
<p>
<bold>Edited by:</bold> <ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/531759/overview">Quan Zou</ext-link>, University of Electronic Science and Technology of China, China</p>
</fn>
<fn fn-type="edited-by">
<p>
<bold>Reviewed by:</bold> <ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/513748/overview">Yi Xiong</ext-link>, Shanghai Jiao Tong University, China</p>
<p>
<ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/870446/overview">Liran Juan</ext-link>, Harbin Institute of Technology, China</p>
</fn>
<corresp id="c001">&#x2a;Correspondence: Zixiao Zhang, <email>zixiao_zhang@nefu.edu.cn</email>; Benzhi Dong, <email>nefudbz@nefu.edu.cn</email>; Yuming Zhao, <email>zym@nefu.edu.cn</email>
</corresp>
<fn fn-type="other">
<p>This article was submitted to Computational Genomics, a section of the journal Frontiers in Genetics</p>
</fn>
</author-notes>
<pub-date pub-type="epub">
<day>20</day>
<month>12</month>
<year>2021</year>
</pub-date>
<pub-date pub-type="collection">
<year>2021</year>
</pub-date>
<volume>12</volume>
<elocation-id>809001</elocation-id>
<history>
<date date-type="received">
<day>04</day>
<month>11</month>
<year>2021</year>
</date>
<date date-type="accepted">
<day>15</day>
<month>11</month>
<year>2021</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#xa9; 2021 Zhang, Gong, Gao, Li, Gao, Zhao and Dong.</copyright-statement>
<copyright-year>2021</copyright-year>
<copyright-holder>Zhang, Gong, Gao, Li, Gao, Zhao and Dong</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/">
<p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these&#x20;terms.</p>
</license>
</permissions>
<abstract>
<p>Soluble N-ethylmaleimide sensitive factor activating protein receptor (SNARE) proteins are a large family of transmembrane proteins located in organelles and vesicles. SNARE proteins initiate the vesicle fusion process and activate and fuse proteins during exocytosis; they are also vital for the transport regulation of membrane proteins and non-regulatory vesicles. Establishing a method to identify SNARE proteins efficiently is therefore of great significance. However, the identification accuracy of existing methods such as SNARE CNN is not satisfactory. In our study, we developed a method based on a support vector machine (SVM) that can effectively recognize SNARE proteins. We used the position-specific scoring matrix (PSSM) method to extract features of SNARE protein sequences, used the support vector machine recursive feature elimination with correlation bias reduction (SVM-RFE-CBR) algorithm to rank the importance of features, and then screened out the optimal feature subset based on the ranking. We used the selected feature data to build the model, trained it with 10-fold cross-validation, and tested model performance on an independent dataset. In independent tests, our method identified SNARE proteins with a sensitivity of 68%, specificity of 94%, accuracy of 92%, area under the curve (AUC) of 84%, and Matthews correlation coefficient (MCC) of 0.48. These results show that our method scores well on the common evaluation indicators, indicating that it performs better than other existing classification methods in identifying SNARE proteins.</p>
</abstract>
<kwd-group>
<kwd>SNARE proteins</kwd>
<kwd>position-specific scoring matrix</kwd>
<kwd>machine learning</kwd>
<kwd>support vector machine</kwd>
<kwd>SVM-RFE-CBR</kwd>
</kwd-group>
<contract-sponsor id="cn001">National Natural Science Foundation of China<named-content content-type="fundref-id">10.13039/501100001809</named-content>
</contract-sponsor>
</article-meta>
</front>
<body>
<sec id="s1">
<title>1 Introduction</title>
<p>The N-ethylmaleimide sensitive factor (NSF) protein (<xref ref-type="bibr" rid="B57">Whiteheart et&#x20;al., 2001</xref>) and soluble NSF attachment proteins (SNAPs) (<xref ref-type="bibr" rid="B56">Whiteheart et&#x20;al., 1993</xref>) are two essential factors for protein transport between membranes (<xref ref-type="bibr" rid="B17">Hanson et&#x20;al., 1997</xref>; <xref ref-type="bibr" rid="B18">Hohl et&#x20;al., 1998</xref>). They were first discovered as essential proteins for protein transport from donor to receptor subcellular structures during the processes of Golgi modification and secretion. Their discovery led to the identification of multiple receptor proteins on transport vesicles and plasma membranes and of SNAP receptors, which are collectively called soluble N-ethylmaleimide-sensitive factor activating protein receptor (SNARE) proteins (<xref ref-type="bibr" rid="B47">Ungar and Hughson, 2003</xref>; <xref ref-type="bibr" rid="B72">Zhao et&#x20;al., 2019</xref>). According to the SNARE theory, exocytosis and secretory processes are completed by precise coordination between SNARE proteins. The specificity of membrane fusion is based on the specific binding of SNARE protein members. At the molecular level, when the transport vesicle is close to the target membrane, syntaxin1A/B on the target membrane receives a signal to recognize, approach and combine with SNAP25, which is also located on the target membrane. At the same time, VAMP2 (an R-SNARE) on the transport vesicle also recognizes, draws close to and binds to these proteins (<xref ref-type="bibr" rid="B25">Kweon et&#x20;al., 2003</xref>) to form a 7S R-Q-SNARE complex, which guides the attachment and fusion of the transport vesicle and the target membrane. The substances in the transport vesicle are then secreted into the new subcellular structure or out of the cell through exocytosis, completing the intracellular transport and extracellular secretion processes.</p>
<p>The binding sites of SNARE proteins are specific, which is the reason for the specificity and precision of exocytosis and secretion in different organisms and organs (<xref ref-type="bibr" rid="B14">Fasshauer et&#x20;al., 1998</xref>; <xref ref-type="bibr" rid="B66">Yin et&#x20;al., 2021</xref>). SNARE theory convincingly explains the key role of synapses in the process of nerve impulse transmission at the molecular level (<xref ref-type="bibr" rid="B3">Chen and Scheller, 2001</xref>). Its new insights in the fields of molecular neurobiology and endocrinology have made research on SNARE proteins a hot spot in the basic life sciences worldwide. Such findings greatly enrich understanding of the regulation of intracellular information transmission, substance transport and exocytosis and secretion at the molecular level and improve knowledge of the interaction between proteins and the plasma membrane (<xref ref-type="bibr" rid="B32">Liu et&#x20;al., 2019a</xref>; <xref ref-type="bibr" rid="B48">Wang et&#x20;al., 2020a</xref>; <xref ref-type="bibr" rid="B58">Xu et&#x20;al., 2021</xref>).</p>
<p>Because SNARE proteins play important roles in cell biology, research on them continues to develop, and a variety of technologies are used to study them (<xref ref-type="bibr" rid="B49">Wang et&#x20;al., 2020b</xref>; <xref ref-type="bibr" rid="B65">Yin et&#x20;al., 2020</xref>). These include the establishment of a SNARE protein database, the retrieval and classification of SNARE proteins, the use of bioinformatics techniques to predict the role of SNARE proteins, and the construction of a neural network model to recognize SNARE proteins.</p>
<p>With the development of computational biology, the application of machine learning to bioinformatics continues to deepen and broaden (<xref ref-type="bibr" rid="B22">Jiang et&#x20;al., 2013</xref>; <xref ref-type="bibr" rid="B46">Tao et&#x20;al., 2020</xref>; <xref ref-type="bibr" rid="B75">Zhao et&#x20;al., 2021</xref>). Machine learning is a complex, interdisciplinary field (<xref ref-type="bibr" rid="B8">Cheng, 2020</xref>). It obtains new knowledge by learning from pre-existing knowledge and can continuously improve itself as more of this knowledge becomes available. Research on machine learning includes the study of computer algorithms and the use of data and previous techniques to improve their performance. Machine learning also has significant implications for the development of artificial intelligence. A typical way to predict proteins is to transform each protein sequence into a numerical feature vector that represents the sequence and then to train a classification model on the feature vectors and labels of the training samples. After feature construction, the classifiers that make predictions about proteins include covariant discriminant (CD) (<xref ref-type="bibr" rid="B10">Chou, 2000</xref>), support vector machine (SVM) (<xref ref-type="bibr" rid="B21">Hua and Sun, 2001</xref>), K-nearest neighbor (KNN) (<xref ref-type="bibr" rid="B39">Shen and Chou, 2006</xref>), deep learning and ensemble classifiers (<xref ref-type="bibr" rid="B39">Shen and Chou, 2006</xref>).</p>
<p>In this study, we constructed a model based on an SVM classifier (<xref ref-type="bibr" rid="B31">Liu et&#x20;al., 2010</xref>) to recognize SNARE proteins. We used position-specific scoring matrix (PSSM) profiles of protein sequences to extract features (<xref ref-type="bibr" rid="B24">Kumar et&#x20;al., 2008</xref>), processed the feature data by the min-max normalization method, built a model based on SVM, trained the model with 10-fold cross-validation and measured the performance of the model on an independent dataset.</p>
</sec>
<sec id="s2">
<title>2 Materials and Methods</title>
<p>We developed a method to recognize SNARE proteins based on PSSM (<xref ref-type="bibr" rid="B9">Chou and Shen, 2007</xref>; <xref ref-type="bibr" rid="B33">Liu et&#x20;al., 2019b</xref>; <xref ref-type="bibr" rid="B19">Hong et&#x20;al., 2020a</xref>; <xref ref-type="bibr" rid="B20">Hong et&#x20;al., 2020b</xref>) profiles and SVM. The method consists of data collection, data processing, feature extraction, feature selection, model training, and model performance evaluation. The overall flow is summarized in <xref ref-type="fig" rid="F1">Figure&#x20;1</xref>, and each step in the figure is described in detail in the following sections. We carried out experiments following this process, adjusted the settings repeatedly, and arrived at the final method for identifying SNARE proteins.</p>
<fig id="F1" position="float">
<label>FIGURE 1</label>
<caption>
<p>Flow chart of SNARE protein recognition based on PSSM profiles and SVM.</p>
</caption>
<graphic xlink:href="fgene-12-809001-g001.tif"/>
</fig>
<sec id="s2-1">
<title>2.1 Feature Extraction</title>
<p>Selecting good feature information is very important for protein recognition (<xref ref-type="bibr" rid="B79">Zuo et&#x20;al., 2017</xref>; <xref ref-type="bibr" rid="B76">Zheng et&#x20;al., 2019</xref>; <xref ref-type="bibr" rid="B44">Tang et&#x20;al., 2020a</xref>; <xref ref-type="bibr" rid="B16">Guo et&#x20;al., 2020</xref>; <xref ref-type="bibr" rid="B71">Zhang et&#x20;al., 2021</xref>). We chose a method based on PSSM profiles to extract the feature information of protein sequence data. We used the National Center for Biotechnology Information basic local alignment search tool (NCBI-BLAST) with the non-redundant (NR) protein sequence database as the comparison dataset, and generated PSSM profiles from the prepared SNARE protein FASTA sequence files. In a PSSM profile, each amino acid of the original sequence is represented by a vector of 20 values. We then transformed the original PSSM files into PSSM profiles with 400 dimensions. Finally, the 400-dimensional data were extracted as the feature data of each protein sequence for the experiment.</p>
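<p>The text does not specify how the L &#xd7; 20 PSSM is reduced to 400 dimensions. The following minimal sketch assumes one common scheme, in which the PSSM rows of all residues of the same amino acid type are summed, giving a 20 &#xd7; 20 block that is flattened to 400 values. The function name <italic>pssm_to_400</italic> and the unnormalized sum are illustrative assumptions, not the authors&#x2019; confirmed procedure.</p>
<preformat><![CDATA[
```python
# Column order of the 20 standard amino acids in PSI-BLAST ASCII PSSM output.
AMINO_ACIDS = "ARNDCQEGHILKMFPSTWYV"


def pssm_to_400(sequence, pssm):
    """Collapse an L x 20 PSSM into a 400-dimensional feature vector.

    Assumed scheme (not confirmed by the paper): row block a of the result
    is the sum of the PSSM rows of every residue of type a, so the output
    has 20 x 20 = 400 entries.
    """
    if len(sequence) != len(pssm):
        raise ValueError("sequence and PSSM must have the same length")
    index = {aa: k for k, aa in enumerate(AMINO_ACIDS)}
    block = [[0.0] * 20 for _ in range(20)]
    for residue, row in zip(sequence, pssm):
        if residue not in index:
            continue  # skip non-standard residues such as X
        target = block[index[residue]]
        for j, score in enumerate(row):
            target[j] += score
    # Flatten the 20 x 20 block row by row.
    return [value for row in block for value in row]
```
]]></preformat>
<p>For a toy sequence such as AAR, the first 20 entries accumulate the two alanine rows and the next 20 hold the arginine row; all other entries stay at zero. Dividing by the sequence length would be a natural variant when proteins differ greatly in length.</p>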
</sec>
<sec id="s2-2">
<title>2.2 Data Processing</title>
<p>The feature data in the datasets are seriously imbalanced; the ratio of positive to negative samples is especially skewed in the independent dataset. A model trained on such data would generalize poorly and have low applicability, and would therefore be unable to identify SNARE proteins effectively. We therefore need to choose an appropriate method to process the data. In this study, the data processing methods we chose included Z-score standardization, min-max normalization and L2 regularization.</p>
<p>Normalization: The normalization method rescales the data to the range [0, 1]. As an effective way to simplify calculation and scale down data values, normalization converts the absolute values in the dataset into relative values, so that subsequent computation is convenient and fast. The method is defined as follows, where min and max denote the minimum and maximum values of the data:<disp-formula id="e1">
<mml:math id="m1">
<mml:mrow>
<mml:mtable>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:msup>
<mml:mtext>x</mml:mtext>
<mml:mtext>&#x2a;</mml:mtext>
</mml:msup>
<mml:mo>&#x3d;</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mtext>x</mml:mtext>
<mml:mo>&#x2212;</mml:mo>
<mml:mtext>min</mml:mtext>
</mml:mrow>
<mml:mrow>
<mml:mtext>max</mml:mtext>
<mml:mo>&#x2212;</mml:mo>
<mml:mtext>min</mml:mtext>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
<mml:mo>,</mml:mo>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:mrow>
</mml:math>
<label>(1)</label>
</disp-formula>
</p>
<p>Normalization changes the scale of the original data and thereby balances the weight of each feature dimension, for example by changing the shape of the data distribution from elongated to roughly circular. Normalization removes the influence of differing units and scales on the experimental results, so that the data of different variables can be compared. Although the minimum and maximum values used in normalization are affected by outliers in the dataset, which makes the resulting data less robust, normalization improves the accuracy of iterative computation and the efficiency of convergence.</p>
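<p>Equation 1 can be sketched per feature column as follows. The helper names <italic>fit_min_max</italic> and <italic>apply_min_max</italic> are illustrative, and computing the minimum and maximum on the training data only, then reusing them for the independent data, is an assumption about good practice rather than a detail stated in the text.</p>
<preformat><![CDATA[
```python
def fit_min_max(rows):
    """Return one (min, max) pair per feature column of the training rows."""
    columns = list(zip(*rows))
    return [(min(col), max(col)) for col in columns]


def apply_min_max(rows, stats):
    """Scale every column with x* = (x - min) / (max - min), as in Eq. 1."""
    scaled = []
    for row in rows:
        out = []
        for value, (lo, hi) in zip(row, stats):
            span = hi - lo
            # A constant column carries no information; map it to 0.0
            # instead of dividing by zero.
            out.append((value - lo) / span if span else 0.0)
        scaled.append(out)
    return scaled
```
]]></preformat>
<p>Training data scaled with its own statistics always falls in [0, 1]. Unseen data scaled with the training statistics can fall outside that range, which is one way the sensitivity to outliers noted above shows up in practice.</p>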
</sec>
<sec id="s2-3">
<title>2.3 Feature Selection</title>
<p>Feature selection refers to ranking features with suitable techniques and algorithms and then filtering out a better-performing subset of features based on the ranking; this is a common technique in bioinformatics (<xref ref-type="bibr" rid="B5">Cheng et&#x20;al., 2018</xref>; <xref ref-type="bibr" rid="B78">Zhu et&#x20;al., 2019</xref>; <xref ref-type="bibr" rid="B73">Zhao et&#x20;al., 2020a</xref>; <xref ref-type="bibr" rid="B74">Zhao et&#x20;al., 2020b</xref>; <xref ref-type="bibr" rid="B37">Shao and Liu, 2021</xref>; <xref ref-type="bibr" rid="B68">Yu et&#x20;al., 2021</xref>). Building the model on the optimal feature subset selected from the existing features can improve its performance. Feature selection is a very important part of building models for pattern recognition and is a high priority in data processing (<xref ref-type="bibr" rid="B54">Wei et&#x20;al., 2018</xref>; <xref ref-type="bibr" rid="B59">Xue et&#x20;al., 2018</xref>; <xref ref-type="bibr" rid="B28">Li et&#x20;al., 2020a</xref>; <xref ref-type="bibr" rid="B61">Yang et&#x20;al., 2020a</xref>; <xref ref-type="bibr" rid="B42">Su et&#x20;al., 2020</xref>; <xref ref-type="bibr" rid="B55">Wei et&#x20;al., 2020</xref>; <xref ref-type="bibr" rid="B67">Yu et&#x20;al., 2020</xref>; <xref ref-type="bibr" rid="B70">Zhang et&#x20;al., 2020</xref>; <xref ref-type="bibr" rid="B77">Zheng et&#x20;al., 2020</xref>; <xref ref-type="bibr" rid="B51">Wang et&#x20;al., 2021a</xref>; <xref ref-type="bibr" rid="B36">Shang et&#x20;al., 2021</xref>; <xref ref-type="bibr" rid="B38">Shao et&#x20;al., 2021</xref>). Selecting the effective features from the original feature dataset and removing the redundant ones reduces the dimensionality of the feature data, and using more effective feature data can improve the performance of the model. Our original feature set consists of the 400-dimensional features extracted from the PSSM profiles. 
This original feature space contains irrelevant, noisy, and redundant features, and a well-performing feature selection method is required to screen them out accurately. After comparing multiple feature selection methods in our experiment, we chose the SVM-RFE-CBR (<xref ref-type="bibr" rid="B60">Yan and Zhang, 2015</xref>) algorithm to screen features. The algorithm ranks the importance of features and selects the optimal subset of features based on the ranking.</p>
<p>SVM-RFE-CBR is an improved algorithm based on support vector machine recursive feature elimination (SVM-RFE) that introduces a correlation bias reduction (CBR) strategy into the feature elimination process. SVM-RFE is a powerful feature selection algorithm that estimates feature importance from the coefficients of the SVM model; it has linear and nonlinear versions. The CBR strategy reduces the potential correlation bias of SVM-RFE, which improves the feature selection result. SVM-RFE applies the sequential backward selection algorithm to SVM, which is based on the maximum-margin principle. During model training, SVM-RFE scores every feature, deletes the feature with the lowest score, passes the remaining feature data to the next round of training, and finally outputs the feature ranking as a table. The optimal feature subset can then be selected according to this ranking. Because SVM is an excellent machine learning classification algorithm, the feature ranking derived from the SVM model performs well and is convenient for subsequent experiments.</p>
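<p>The elimination loop described above can be sketched as follows. This is plain linear SVM-RFE only: the CBR step of SVM-RFE-CBR is not implemented, and the small subgradient-descent trainer, the function names and the default hyperparameters are illustrative assumptions rather than the solver used in the study.</p>
<preformat><![CDATA[
```python
def train_linear_svm(X, y, C=1.0, epochs=200, lr=0.01):
    """Fit w, b by batch subgradient descent on 0.5*||w||^2 + C*sum(hinge)."""
    d = len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = list(w), 0.0  # gradient of 0.5*||w||^2 is w
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:  # sample lies inside the margin or is misclassified
                for j in range(d):
                    grad_w[j] -= C * yi * xi[j]
                grad_b -= C * yi
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def svm_rfe(X, y, **svm_kwargs):
    """Return feature indices ranked from most to least important."""
    remaining = list(range(len(X[0])))
    eliminated = []
    while remaining:
        X_sub = [[row[j] for j in remaining] for row in X]
        w, _ = train_linear_svm(X_sub, y, **svm_kwargs)
        scores = [wj * wj for wj in w]  # linear SVM-RFE criterion w_j^2
        worst = scores.index(min(scores))
        eliminated.append(remaining.pop(worst))  # drop the weakest feature
    return eliminated[::-1]  # last feature eliminated is the most important
```
]]></preformat>
<p>On a toy dataset in which one feature separates the classes, one is weakly informative and one is constant, the ranking places them in that order. Removing one feature per round is the simplest variant; removing several per round trades ranking resolution for speed.</p>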
</sec>
<sec id="s2-4">
<title>2.4 Support Vector Machine</title>
<p>SVM is a commonly used machine learning classifier that classifies data by supervised learning (<xref ref-type="bibr" rid="B6">Cheng et&#x20;al., 2019a</xref>; <xref ref-type="bibr" rid="B7">Cheng et&#x20;al., 2019b</xref>). SVM is commonly used for binary classification. In addition, SVM can perform nonlinear classification by using a kernel function (<xref ref-type="bibr" rid="B11">Ding et&#x20;al., 2020a</xref>; <xref ref-type="bibr" rid="B34">Liu et&#x20;al., 2020a</xref>; <xref ref-type="bibr" rid="B62">Yang et&#x20;al., 2020b</xref>). SVM was developed from the generalized portrait algorithm in pattern recognition. The basic idea of SVM is to construct a hyperplane that separates the samples of the dataset with the maximum geometric margin. SVM maps the samples of a dataset to points in space and finds a boundary that separates these points effectively. SVM uses a hinge loss function to compute the empirical risk, and a regularization term is added to improve its robustness and accuracy. SVM proceeds as follows. Suppose the training set is <inline-formula id="inf1">
<mml:math id="m2">
<mml:mrow>
<mml:msubsup>
<mml:mrow>
<mml:mo>{</mml:mo>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mtext>x</mml:mtext>
<mml:mtext>i</mml:mtext>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mtext>y</mml:mtext>
<mml:mtext>i</mml:mtext>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mo>}</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mtext>i</mml:mtext>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mtext>N</mml:mtext>
</mml:msubsup>
</mml:mrow>
</mml:math>
</inline-formula>, <inline-formula id="inf2">
<mml:math id="m3">
<mml:mrow>
<mml:msub>
<mml:mtext>x</mml:mtext>
<mml:mtext>i</mml:mtext>
</mml:msub>
<mml:mo>&#x2208;</mml:mo>
<mml:msup>
<mml:mtext>R</mml:mtext>
<mml:mtext>D</mml:mtext>
</mml:msup>
</mml:mrow>
</mml:math>
</inline-formula>, <inline-formula id="inf3">
<mml:math id="m4">
<mml:mrow>
<mml:msub>
<mml:mtext>y</mml:mtext>
<mml:mtext>i</mml:mtext>
</mml:msub>
<mml:mo>&#x2208;</mml:mo>
<mml:mrow>
<mml:mo>{</mml:mo>
<mml:mrow>
<mml:mo>&#x2b;</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>,</mml:mo>
<mml:mtext>&#xa0;</mml:mtext>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mo>}</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
</inline-formula>, where <inline-formula id="inf4">
<mml:math id="m5">
<mml:mrow>
<mml:msub>
<mml:mtext>x</mml:mtext>
<mml:mtext>i</mml:mtext>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> is the <italic>i</italic>th sample, N is the sample size, and D is the number of sample features. SVM finds the optimal classification hyperplane: <inline-formula id="inf5">
<mml:math id="m6">
<mml:mrow>
<mml:mtext>&#x3c9;</mml:mtext>
<mml:mo>&#x22c5;</mml:mo>
<mml:mtext>x</mml:mtext>
<mml:mo>&#x2b;</mml:mo>
<mml:mtext>b</mml:mtext>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>0</mml:mn>
</mml:mrow>
</mml:math>
</inline-formula>. The optimization problem that SVM needs to solve is:<disp-formula id="e2">
<mml:math id="m7">
<mml:mtable columnalign="left">
<mml:mtr>
<mml:mtd>
<mml:mtable>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:mtext>min&#xa0;</mml:mtext>
<mml:mfrac>
<mml:mn>1</mml:mn>
<mml:mn>2</mml:mn>
</mml:mfrac>
<mml:msup>
<mml:mrow>
<mml:mrow>
<mml:mo>&#x7c;</mml:mo>
<mml:mrow>
<mml:mrow>
<mml:mo>&#x7c;</mml:mo>
<mml:mtext>&#x3c9;</mml:mtext>
<mml:mo>&#x7c;</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mo>&#x7c;</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msup>
<mml:mo>&#x2b;</mml:mo>
<mml:mi>C</mml:mi>
<mml:mstyle displaystyle="true">
<mml:msubsup>
<mml:mo>&#x2211;</mml:mo>
<mml:mrow>
<mml:mtext>i</mml:mtext>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mtext>N</mml:mtext>
</mml:msubsup>
<mml:mrow>
<mml:msub>
<mml:mi>&#x3b5;</mml:mi>
<mml:mtext>i</mml:mtext>
</mml:msub>
</mml:mrow>
</mml:mstyle>
<mml:mo>.</mml:mo>
</mml:mrow>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:mtext>s</mml:mtext>
<mml:mo>.</mml:mo>
<mml:mtext>t</mml:mtext>
<mml:mo>.</mml:mo>
<mml:msub>
<mml:mtext>&#xa0;y</mml:mtext>
<mml:mtext>i</mml:mtext>
</mml:msub>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mtext>&#x3c9;</mml:mtext>
<mml:mo>&#x22c5;</mml:mo>
<mml:msub>
<mml:mtext>x</mml:mtext>
<mml:mtext>i</mml:mtext>
</mml:msub>
<mml:mo>&#x2b;</mml:mo>
<mml:mtext>b</mml:mtext>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mo>&#x2265;</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>&#x2212;</mml:mo>
<mml:msub>
<mml:mtext>&#x3b5;</mml:mtext>
<mml:mtext>i</mml:mtext>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:mtext>&#xa0;i</mml:mtext>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>,</mml:mo>
<mml:mtext>&#xa0;</mml:mtext>
<mml:mn>2</mml:mn>
<mml:mo>,</mml:mo>
<mml:mtext>&#xa0;</mml:mtext>
<mml:mo>&#x22ef;</mml:mo>
<mml:mo>,</mml:mo>
<mml:mtext>&#xa0;N</mml:mtext>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:msub>
<mml:mtext>&#x3b5;</mml:mtext>
<mml:mtext>i</mml:mtext>
</mml:msub>
<mml:mo>&#x2265;</mml:mo>
<mml:mn>0</mml:mn>
<mml:mo>,</mml:mo>
<mml:mtext>&#xa0;i</mml:mtext>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>,</mml:mo>
<mml:mtext>&#xa0;</mml:mtext>
<mml:mn>2</mml:mn>
<mml:mo>,</mml:mo>
<mml:mtext>&#xa0;</mml:mtext>
<mml:mo>&#x22ef;</mml:mo>
<mml:mo>,</mml:mo>
<mml:mtext>&#xa0;N</mml:mtext>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:math>
<label>(2)</label>
</disp-formula>
</p>
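<p>As a small worked illustration of Equation 2, the sketch below evaluates the objective for a given hyperplane. It relies on the standard observation that, for fixed &#x3c9; and b, the smallest slack satisfying both constraints is max(0, 1 &#x2212; y<sub>i</sub>(&#x3c9;&#xb7;x<sub>i</sub> &#x2b; b)), which is the hinge loss mentioned above. The function name <italic>primal_objective</italic> is illustrative.</p>
<preformat><![CDATA[
```python
def primal_objective(w, b, X, y, C):
    """Evaluate 0.5*||w||^2 + C*sum(eps_i) for a fixed hyperplane (Eq. 2).

    For fixed w and b, the constraints y_i(w.x_i + b) >= 1 - eps_i and
    eps_i >= 0 are met at the least cost by eps_i = max(0, 1 - margin_i).
    Returns the objective value and the list of slack values.
    """
    slacks = []
    for xi, yi in zip(X, y):
        margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
        slacks.append(max(0.0, 1.0 - margin))
    regularizer = 0.5 * sum(wj * wj for wj in w)
    return regularizer + C * sum(slacks), slacks
```
]]></preformat>
<p>A larger C penalizes margin violations more heavily relative to the regularization term, so it trades a wider margin for fewer training errors.</p>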
<p>Transforming the original (primal) problem into its dual problem gives:<disp-formula id="e3">
<mml:math id="m8">
<mml:mrow>
<mml:mtable>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:mi>min</mml:mi>
<mml:mfrac>
<mml:mn>1</mml:mn>
<mml:mn>2</mml:mn>
</mml:mfrac>
<mml:mstyle displaystyle="true">
<mml:msubsup>
<mml:mo>&#x2211;</mml:mo>
<mml:mrow>
<mml:mtext>i</mml:mtext>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mtext>N</mml:mtext>
</mml:msubsup>
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:msubsup>
<mml:mo>&#x2211;</mml:mo>
<mml:mrow>
<mml:mtext>j</mml:mtext>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mtext>N</mml:mtext>
</mml:msubsup>
<mml:mrow>
<mml:msub>
<mml:mtext>&#x3b1;</mml:mtext>
<mml:mtext>i</mml:mtext>
</mml:msub>
<mml:msub>
<mml:mtext>&#x3b1;</mml:mtext>
<mml:mtext>j</mml:mtext>
</mml:msub>
<mml:msub>
<mml:mtext>y</mml:mtext>
<mml:mtext>i</mml:mtext>
</mml:msub>
<mml:msub>
<mml:mtext>y</mml:mtext>
<mml:mtext>j</mml:mtext>
</mml:msub>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mtext>x</mml:mtext>
<mml:mtext>i</mml:mtext>
</mml:msub>
<mml:mo>&#x22c5;</mml:mo>
<mml:msub>
<mml:mtext>x</mml:mtext>
<mml:mtext>j</mml:mtext>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mstyle>
</mml:mrow>
</mml:mstyle>
<mml:mo>&#x2212;</mml:mo>
<mml:mstyle displaystyle="true">
<mml:msubsup>
<mml:mo>&#x2211;</mml:mo>
<mml:mrow>
<mml:mtext>i</mml:mtext>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mtext>N</mml:mtext>
</mml:msubsup>
<mml:mrow>
<mml:msub>
<mml:mi mathvariant="normal">&#x3b1;</mml:mi>
<mml:mtext>i</mml:mtext>
</mml:msub>
</mml:mrow>
</mml:mstyle>
</mml:mrow>
</mml:mtd>
</mml:mtr>
</mml:mtable>
<mml:mo>.</mml:mo>
</mml:mrow>
</mml:math>
<label>(3)</label>
</disp-formula>
<disp-formula id="e4">
<mml:math id="m9">
<mml:mrow>
<mml:mtable>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:mi mathvariant="normal">s</mml:mi>
<mml:mo>.</mml:mo>
<mml:mi mathvariant="normal">t</mml:mi>
<mml:mo>.</mml:mo>
<mml:mo>&#xa0;</mml:mo>
<mml:mstyle displaystyle="true">
<mml:msubsup>
<mml:mo>&#x2211;</mml:mo>
<mml:mrow>
<mml:mtext>i</mml:mtext>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mtext>N</mml:mtext>
</mml:msubsup>
<mml:mrow>
<mml:msub>
<mml:mtext>y</mml:mtext>
<mml:mtext>i</mml:mtext>
</mml:msub>
<mml:msub>
<mml:mtext>&#x3b1;</mml:mtext>
<mml:mtext>i</mml:mtext>
</mml:msub>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>0</mml:mn>
</mml:mrow>
</mml:mstyle>
</mml:mrow>
</mml:mtd>
</mml:mtr>
</mml:mtable>
<mml:mo>.</mml:mo>
</mml:mrow>
</mml:math>
<label>(4)</label>
</disp-formula>
<disp-formula id="equ1">
<mml:math id="m10">
<mml:mrow>
<mml:mn>0</mml:mn>
<mml:mo>&#x2264;</mml:mo>
<mml:msub>
<mml:mtext>&#x3b1;</mml:mtext>
<mml:mtext>i</mml:mtext>
</mml:msub>
<mml:mo>&#x2264;</mml:mo>
<mml:mtext>C</mml:mtext>
<mml:mo>,</mml:mo>
<mml:mtext>&#xa0;i</mml:mtext>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>,</mml:mo>
<mml:mtext>&#xa0;</mml:mtext>
<mml:mn>2</mml:mn>
<mml:mo>,</mml:mo>
<mml:mtext>&#xa0;</mml:mtext>
<mml:mo>&#x22ef;</mml:mo>
<mml:mo>,</mml:mo>
<mml:mtext>&#xa0;N,</mml:mtext>
<mml:mtext>&#xa0;where&#xa0;</mml:mtext>
<mml:msub>
<mml:mtext>&#x3b1;</mml:mtext>
<mml:mtext>i</mml:mtext>
</mml:msub>
<mml:mtext>&#xa0;is&#xa0;a&#xa0;Lagrange&#xa0;multiplier</mml:mtext>
</mml:mrow>
</mml:math>
</disp-formula>
</p>
<p>Finally, the solution for <inline-formula id="inf6">
<mml:math id="m11">
<mml:mtext>&#x3c9;</mml:mtext>
</mml:math>
</inline-formula> is:<disp-formula id="e5">
<mml:math id="m12">
<mml:mrow>
<mml:mtable>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:mi mathvariant="normal">&#x3c9;</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mstyle displaystyle="true">
<mml:msubsup>
<mml:mo>&#x2211;</mml:mo>
<mml:mrow>
<mml:mtext>i</mml:mtext>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi mathvariant="normal">N</mml:mi>
</mml:msubsup>
<mml:mrow>
<mml:msub>
<mml:mtext>&#x3b1;</mml:mtext>
<mml:mtext>i</mml:mtext>
</mml:msub>
<mml:msub>
<mml:mtext>y</mml:mtext>
<mml:mtext>i</mml:mtext>
</mml:msub>
<mml:msub>
<mml:mtext>x</mml:mtext>
<mml:mtext>i</mml:mtext>
</mml:msub>
</mml:mrow>
</mml:mstyle>
<mml:mo>.</mml:mo>
</mml:mrow>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:mrow>
</mml:math>
<label>(5)</label>
</disp-formula>
</p>
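<p>Equation 5 can be illustrated with a minimal sketch that recovers &#x3c9; from given dual coefficients and then classifies a point by the sign of &#x3c9;&#xb7;x &#x2b; b. The function names are illustrative, and recovering b from a support vector with 0 &lt; &#x3b1; &lt; C is a standard step that is assumed here because the text does not spell it out.</p>
<preformat><![CDATA[
```python
def weight_vector(alphas, labels, samples):
    """Eq. 5: omega = sum_i alpha_i * y_i * x_i."""
    dim = len(samples[0])
    omega = [0.0] * dim
    for alpha, y, x in zip(alphas, labels, samples):
        for j in range(dim):
            omega[j] += alpha * y * x[j]
    return omega


def bias_from_support_vector(omega, x_sv, y_sv):
    """For a support vector with 0 < alpha < C, y*(omega.x + b) = 1.

    Because y is +1 or -1, this gives b = y - omega.x.
    """
    return y_sv - sum(wj * xj for wj, xj in zip(omega, x_sv))


def predict(omega, b, x):
    """Classify x by the side of the hyperplane omega.x + b = 0."""
    score = sum(wj * xj for wj, xj in zip(omega, x)) + b
    return 1 if score >= 0 else -1
```
]]></preformat>
<p>For the two-point training set x<sub>1</sub> = (1, 1) with label &#x2b;1 and x<sub>2</sub> = (&#x2212;1, &#x2212;1) with label &#x2212;1, the dual solution is &#x3b1;<sub>1</sub> = &#x3b1;<sub>2</sub> = 0.25, which yields &#x3c9; = (0.5, 0.5) and b = 0. Only samples with nonzero &#x3b1; contribute to &#x3c9;, which is why they are called support vectors.</p>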
<p>When we use SVM to solve nonlinear problems, we need to choose an appropriate kernel function (<xref ref-type="bibr" rid="B12">Ding et&#x20;al., 2020b</xref>; <xref ref-type="bibr" rid="B63">Yang et&#x20;al., 2021a</xref>), which maps the data to a high-dimensional space in which data that are linearly inseparable in the original space can be separated.</p>
<p>In the experiment, the Python interface of LIBSVM, a library for support vector machines, was used to build an SVM model and identify SNARE proteins. The choice of kernel function and the settings of the kernel parameters in LIBSVM are described as follows. The kernel functions (<xref ref-type="bibr" rid="B13">Ding et&#x20;al., 2020c</xref>) available for SVM include the linear kernel function (LKF), polynomial kernel function (PKF), radial basis function (RBF), and sigmoid kernel function (SKF). The formulas of the four kernel functions are as follows:</p>
<p>Linear kernel function:<disp-formula id="e6">
<mml:math id="m13">
<mml:mrow>
<mml:mtable>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:mtext>K</mml:mtext>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mtext>x</mml:mtext>
<mml:mtext>i</mml:mtext>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mtext>x</mml:mtext>
<mml:mtext>j</mml:mtext>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mo>&#x3d;</mml:mo>
<mml:msubsup>
<mml:mtext>x</mml:mtext>
<mml:mtext>i</mml:mtext>
<mml:mtext>T</mml:mtext>
</mml:msubsup>
<mml:msub>
<mml:mtext>x</mml:mtext>
<mml:mtext>j</mml:mtext>
</mml:msub>
<mml:mo>.</mml:mo>
</mml:mrow>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:mrow>
</mml:math>
<label>(6)</label>
</disp-formula>
</p>
<p>Polynomial kernel function:<disp-formula id="e7">
<mml:math id="m14">
<mml:mrow>
<mml:mtable>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:mtext>K</mml:mtext>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mtext>x</mml:mtext>
<mml:mtext>i</mml:mtext>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mtext>x</mml:mtext>
<mml:mtext>j</mml:mtext>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mo>&#x3d;</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="normal">&#x3bd;</mml:mi>
<mml:mtext>x</mml:mtext>
</mml:mrow>
<mml:mtext>i</mml:mtext>
<mml:mtext>T</mml:mtext>
</mml:msubsup>
<mml:msub>
<mml:mtext>x</mml:mtext>
<mml:mtext>j</mml:mtext>
</mml:msub>
<mml:mo>&#x2b;</mml:mo>
<mml:mtext>r</mml:mtext>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mtext>d</mml:mtext>
</mml:msup>
<mml:mo>,</mml:mo>
<mml:mo>&#xa0;</mml:mo>
<mml:mi>&#x3bd;</mml:mi>
<mml:mo>&#x3e;</mml:mo>
<mml:mn>0</mml:mn>
<mml:mo>.</mml:mo>
</mml:mrow>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:mrow>
</mml:math>
<label>(7)</label>
</disp-formula>
</p>
<p>Radial basis function:<disp-formula id="e8">
<mml:math id="m15">
<mml:mrow>
<mml:mtable>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:mtext>K</mml:mtext>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mtext>x</mml:mtext>
<mml:mtext>i</mml:mtext>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mtext>x</mml:mtext>
<mml:mtext>j</mml:mtext>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mo>&#x3d;</mml:mo>
<mml:mi>exp</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mo>&#x2212;</mml:mo>
<mml:mtext>&#x3bd;</mml:mtext>
<mml:msup>
<mml:mrow>
<mml:mrow>
<mml:mo>&#x7c;</mml:mo>
<mml:mrow>
<mml:mrow>
<mml:mo>&#x7c;</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mtext>x</mml:mtext>
<mml:mtext>i</mml:mtext>
</mml:msub>
<mml:mo>&#x2212;</mml:mo>
<mml:msub>
<mml:mtext>x</mml:mtext>
<mml:mtext>j</mml:mtext>
</mml:msub>
</mml:mrow>
<mml:mo>&#x7c;</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mo>&#x7c;</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msup>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mo>,</mml:mo>
<mml:mo>&#xa0;</mml:mo>
<mml:mi>&#x3bd;</mml:mi>
<mml:mo>&#x3e;</mml:mo>
<mml:mn>0</mml:mn>
<mml:mo>.</mml:mo>
</mml:mrow>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:mrow>
</mml:math>
<label>(8)</label>
</disp-formula>
</p>
<p>Sigmoid kernel function:<disp-formula id="e9">
<mml:math id="m16">
<mml:mrow>
<mml:mtable>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:mtext>K</mml:mtext>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mtext>x</mml:mtext>
<mml:mtext>i</mml:mtext>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mtext>x</mml:mtext>
<mml:mtext>j</mml:mtext>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mo>&#x3d;</mml:mo>
<mml:mi>tanh</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mtext>&#x3bd;</mml:mtext>
<mml:msubsup>
<mml:mtext>x</mml:mtext>
<mml:mtext>i</mml:mtext>
<mml:mtext>T</mml:mtext>
</mml:msubsup>
<mml:msub>
<mml:mtext>x</mml:mtext>
<mml:mtext>j</mml:mtext>
</mml:msub>
<mml:mo>&#x2b;</mml:mo>
<mml:mtext>r</mml:mtext>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mo>.</mml:mo>
</mml:mrow>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:mrow>
</mml:math>
<label>(9)</label>
</disp-formula>
</p>
<p>Here, &#x3bd;, r, and d are the parameters of the kernel functions.</p>
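<p>The four kernel functions can be sketched in plain Python as follows. This is an illustrative sketch, not the LIBSVM implementation; the function names are hypothetical, the argument gamma plays the role of &#x3bd;, and each kernel is written in the standard form used by LIBSVM, with the inner product taken between both input vectors.</p>
<preformat>
```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def linear_kernel(xi, xj):
    return dot(xi, xj)


def polynomial_kernel(xi, xj, gamma, r=0.0, d=3):
    return (gamma * dot(xi, xj) + r) ** d


def rbf_kernel(xi, xj, gamma):
    squared_distance = sum((a - b) ** 2 for a, b in zip(xi, xj))
    return math.exp(-gamma * squared_distance)


def sigmoid_kernel(xi, xj, gamma, r=0.0):
    return math.tanh(gamma * dot(xi, xj) + r)


# Hypothetical toy vectors, not taken from the study.
xi = [1.0, 2.0]
xj = [3.0, 4.0]
print(linear_kernel(xi, xj))
print(polynomial_kernel(xi, xj, gamma=0.5, r=1.0, d=2))
print(rbf_kernel(xi, xj, gamma=0.1))
print(sigmoid_kernel(xi, xj, gamma=0.1))
```
</preformat>
<p>The RBF kernel depends only on the distance between the two vectors, so it equals 1 when both vectors are identical and decays toward 0 as they move apart.</p>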
<p>The parameters differ between kernel functions. <inline-formula id="inf7">
<mml:math id="m17">
<mml:mtext>&#x3bd;</mml:mtext>
</mml:math>
</inline-formula> corresponds to the parameter gamma of the kernel function, which is set with the -g option in LIBSVM and defaults to 1/K, where K is the number of features.</p>
<p>r corresponds to the parameter coef0 of the kernel function, which is set with the -r option in LIBSVM and defaults to 0. d is the degree of the polynomial kernel function, which is set with the -d option and defaults to&#x20;3.</p>
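<p>To illustrate how these parameters are passed to LIBSVM, the following sketch assembles an option string from the kernel type and its parameters. The helper name and its keyword arguments are hypothetical; the option letters (-t, -c, -g, -r, -d, -v) are the standard LIBSVM ones, and the example values C &#x3d; 11 and gamma &#x3d; 0.1 are those reported in Section 3.3.</p>
<preformat>
```python
KERNEL_TYPES = {"linear": 0, "polynomial": 1, "rbf": 2, "sigmoid": 3}


def libsvm_options(kernel, cost=1.0, gamma=None, coef0=None, degree=None, folds=None):
    """Build a LIBSVM option string; values left as None keep the LIBSVM defaults."""
    parts = ["-t", str(KERNEL_TYPES[kernel]), "-c", format(cost, "g")]
    if gamma is not None:
        parts += ["-g", format(gamma, "g")]
    if coef0 is not None:
        parts += ["-r", format(coef0, "g")]
    if degree is not None:
        parts += ["-d", str(degree)]
    if folds is not None:
        parts += ["-v", str(folds)]  # n-fold cross-validation mode
    return " ".join(parts)


rbf_options = libsvm_options("rbf", cost=11, gamma=0.1, folds=10)
print(rbf_options)  # -t 2 -c 11 -g 0.1 -v 10
```
</preformat>
<p>A string of this form is what the training function of the LIBSVM Python interface accepts as its options argument; whether the study passed its parameters in exactly this way is an assumption.</p>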
<p>SVM is a powerful model that can learn complex decision boundaries and performs well on both low-dimensional and high-dimensional data. SVM has been widely used in bioinformatics, for example in binding protein prediction and protein methylation site prediction. We use the LIBSVM implementation integrated in the scikit-learn library for Python to train and build the model. During the experiments, we tune the parameters according to the results and finally build the model with the best performance.</p>
</sec>
</sec>
<sec sec-type="results|discussion" id="s3">
<title>3 Results and Discussion</title>
<sec id="s3-1">
<title>3.1 Dataset</title>
<p>Our research is devoted to constructing a method to recognize SNARE proteins. To establish a model that effectively distinguishes SNARE proteins from non-SNARE proteins, we collected a SNARE protein dataset and a non-SNARE protein dataset. The dataset was previously used by Le and Nguyen (<xref ref-type="bibr" rid="B26">Le and Nguyen, 2019</xref>). The data come from the UniProt database, a comprehensive and freely accessible protein database. We collect all SNARE proteins from UniProt using the keyword SNARE. To reduce homology among the collected SNARE protein sequences, we use BLAST to identify and remove redundant sequences. Finally, 682 SNARE protein sequences are obtained as the positive sample dataset. At the same time, we select vesicular transport proteins as negative samples to establish the non-SNARE protein dataset. We divide the two datasets into a cross-validation (training) dataset and an independent test dataset; their sizes are summarized in <xref ref-type="table" rid="T1">Table&#x20;1</xref>.</p>
<table-wrap id="T1" position="float">
<label>TABLE 1</label>
<caption>
<p>Summary of SNARE protein and non-SNARE protein datasets.</p>
</caption>
<table>
<thead valign="top">
<tr>
<th align="left">Dataset</th>
<th align="center">SNARE</th>
<th align="center">Non-SNARE</th>
<th align="center">Total</th>
</tr>
</thead>
<tbody valign="top">
<tr>
<td align="left">Original dataset</td>
<td align="char" char=".">682</td>
<td align="char" char=".">2,583</td>
<td align="char" char=".">3,265</td>
</tr>
<tr>
<td align="left">Train dataset</td>
<td align="char" char=".">644</td>
<td align="char" char=".">2,234</td>
<td align="char" char=".">2,878</td>
</tr>
<tr>
<td align="left">Test dataset</td>
<td align="char" char=".">38</td>
<td align="char" char=".">349</td>
<td align="char" char=".">387</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>
<xref ref-type="table" rid="T1">Table&#x20;1</xref> shows that the SNARE and non-SNARE proteins are divided into two datasets, a training dataset and an independent test dataset, each of which contains positive and negative samples. We train the model on the training dataset with cross validation, evaluate the performance of the model, and optimize it by adjusting the parameters according to the cross-validation results. The independent test dataset is used to measure the predictive ability of the final model.</p>
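<p>One way to set up the cross validation on the training dataset is sketched below. It assumes stratified folds, that is, folds that keep the ratio of positive to negative samples roughly constant; the source does not state whether stratification was used, and the function names are hypothetical. The class sizes (644 SNARE and 2,234 non-SNARE training sequences) are taken from Table&#x20;1.</p>
<preformat>
```python
import random


def stratified_folds(labels, n_folds=10, seed=0):
    """Assign each sample index to a fold while keeping class ratios similar."""
    rng = random.Random(seed)
    folds = [[] for _ in range(n_folds)]
    for cls in sorted(set(labels)):
        indices = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(indices)
        # Deal the shuffled indices of this class round-robin across folds.
        for position, index in enumerate(indices):
            folds[position % n_folds].append(index)
    return folds


def cross_validation_splits(labels, n_folds=10, seed=0):
    """Yield (train_indices, validation_indices) for each fold in turn."""
    folds = stratified_folds(labels, n_folds, seed)
    for k in range(n_folds):
        validation_idx = folds[k]
        train_idx = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield train_idx, validation_idx


# Training set sizes from Table 1: 644 positives and 2,234 negatives.
labels = [1] * 644 + [0] * 2234
folds = stratified_folds(labels)
print([len(fold) for fold in folds])
```
</preformat>
<p>Each sample appears in exactly one validation fold, so averaging a metric over the ten folds uses every training sequence once for validation.</p>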
</sec>
<sec id="s3-2">
<title>3.2 Performance Measurements</title>
<p>Our research aims to establish a model to predict whether an amino acid sequence is a SNARE protein. Therefore, we need universally acknowledged evaluation indicators to measure the performance of the model. When training the model, we use 10-fold cross validation and take the average of the cross-validation results as the training result. We optimize the parameters of SVM, select the best parameters to build the model, and evaluate the model on an independent test dataset to avoid systematic bias in the cross-validation process. This study adopts standard evaluation indicators that are widely used in bioinformatics research (<xref ref-type="bibr" rid="B40">Shen et&#x20;al., 2019a</xref>; <xref ref-type="bibr" rid="B41">Shen et&#x20;al., 2019b</xref>; <xref ref-type="bibr" rid="B1">Ao et&#x20;al., 2020</xref>; <xref ref-type="bibr" rid="B29">Li et&#x20;al., 2020b</xref>; <xref ref-type="bibr" rid="B35">Liu et&#x20;al., 2020b</xref>; <xref ref-type="bibr" rid="B45">Tang et&#x20;al., 2020b</xref>; <xref ref-type="bibr" rid="B65">Yin et&#x20;al., 2020</xref>; <xref ref-type="bibr" rid="B4">Chen et&#x20;al., 2021</xref>). These indicators include sensitivity (Sn), specificity (Sp), accuracy (Acc), area under the curve (AUC), Matthews correlation coefficient (MCC), and F-score (<xref ref-type="bibr" rid="B69">Zhai et&#x20;al., 2020</xref>; <xref ref-type="bibr" rid="B52">Wang et&#x20;al., 2021b</xref>; <xref ref-type="bibr" rid="B64">Yang et&#x20;al., 2021b</xref>). The formulas are as follows, where TP, FP, TN, and FN denote the numbers of true positives, false positives, true negatives, and false negatives, respectively:<disp-formula id="e10">
<mml:math id="m18">
<mml:mrow>
<mml:mtable>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:mtext>Sensitivity</mml:mtext>
<mml:mo>&#x3d;</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mtext>TP</mml:mtext>
</mml:mrow>
<mml:mrow>
<mml:mtext>TP</mml:mtext>
<mml:mo>&#x2b;</mml:mo>
<mml:mtext>FN</mml:mtext>
</mml:mrow>
</mml:mfrac>
<mml:mo>,</mml:mo>
<mml:mo>&#xa0;</mml:mo>
<mml:mn>0</mml:mn>
<mml:mo>&#x2264;</mml:mo>
<mml:mi>S</mml:mi>
<mml:mi>n</mml:mi>
<mml:mo>&#x2264;</mml:mo>
<mml:mn>1.</mml:mn>
</mml:mrow>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:mrow>
</mml:math>
<label>(10)</label>
</disp-formula>
<disp-formula id="e11">
<mml:math id="m19">
<mml:mrow>
<mml:mtable>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:mtext>Specificity</mml:mtext>
<mml:mo>&#x3d;</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mtext>TN</mml:mtext>
</mml:mrow>
<mml:mrow>
<mml:mtext>TN</mml:mtext>
<mml:mo>&#x2b;</mml:mo>
<mml:mtext>FP</mml:mtext>
</mml:mrow>
</mml:mfrac>
<mml:mo>,</mml:mo>
<mml:mo>&#xa0;</mml:mo>
<mml:mn>0</mml:mn>
<mml:mo>&#x2264;</mml:mo>
<mml:mi>S</mml:mi>
<mml:mi>p</mml:mi>
<mml:mo>&#x2264;</mml:mo>
<mml:mn>1.</mml:mn>
</mml:mrow>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:mrow>
</mml:math>
<label>(11)</label>
</disp-formula>
<disp-formula id="e12">
<mml:math id="m20">
<mml:mrow>
<mml:mtable>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:mtext>Accuracy</mml:mtext>
<mml:mo>&#x3d;</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mtext>TN</mml:mtext>
<mml:mo>&#x2b;</mml:mo>
<mml:mtext>TP</mml:mtext>
</mml:mrow>
<mml:mrow>
<mml:mtext>TP</mml:mtext>
<mml:mo>&#x2b;</mml:mo>
<mml:mtext>TN</mml:mtext>
<mml:mo>&#x2b;</mml:mo>
<mml:mtext>FN</mml:mtext>
<mml:mo>&#x2b;</mml:mo>
<mml:mtext>FP</mml:mtext>
</mml:mrow>
</mml:mfrac>
<mml:mo>,</mml:mo>
<mml:mo>&#xa0;</mml:mo>
<mml:mn>0</mml:mn>
<mml:mo>&#x2264;</mml:mo>
<mml:mtext>Acc</mml:mtext>
<mml:mo>&#x2264;</mml:mo>
<mml:mn>1.</mml:mn>
</mml:mrow>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:mrow>
</mml:math>
<label>(12)</label>
</disp-formula>
<disp-formula id="e13">
<mml:math id="m21">
<mml:mrow>
<mml:mtable>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:mtext>MCC</mml:mtext>
<mml:mo>&#x3d;</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mtext>TP&#x2a;TN</mml:mtext>
<mml:mo>&#x2212;</mml:mo>
<mml:mtext>FP&#x2a;FN</mml:mtext>
</mml:mrow>
<mml:mrow>
<mml:msqrt>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mtext>TP</mml:mtext>
<mml:mo>&#x2b;</mml:mo>
<mml:mtext>FN</mml:mtext>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mtext>TN</mml:mtext>
<mml:mo>&#x2b;</mml:mo>
<mml:mtext>FN</mml:mtext>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mtext>TP</mml:mtext>
<mml:mo>&#x2b;</mml:mo>
<mml:mtext>FP</mml:mtext>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mtext>TN</mml:mtext>
<mml:mo>&#x2b;</mml:mo>
<mml:mtext>FP</mml:mtext>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:msqrt>
</mml:mrow>
</mml:mfrac>
<mml:mo>,</mml:mo>
<mml:mo>&#xa0;</mml:mo>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>&#x2264;</mml:mo>
<mml:mtext>MCC</mml:mtext>
<mml:mo>&#x2264;</mml:mo>
<mml:mn>1.</mml:mn>
</mml:mrow>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:mrow>
</mml:math>
<label>(13)</label>
</disp-formula>
<disp-formula id="e14">
<mml:math id="m22">
<mml:mrow>
<mml:mtable>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:mtext>F</mml:mtext>
<mml:mo>-</mml:mo>
<mml:mtext>score</mml:mtext>
<mml:mo>&#x3d;</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mn>2</mml:mn>
<mml:mtext>&#x2a;TP</mml:mtext>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
<mml:mtext>TP</mml:mtext>
<mml:mo>&#x2b;</mml:mo>
<mml:mtext>FN</mml:mtext>
<mml:mo>&#x2b;</mml:mo>
<mml:mtext>FP</mml:mtext>
</mml:mrow>
</mml:mfrac>
<mml:mo>,</mml:mo>
<mml:mo>&#xa0;</mml:mo>
<mml:mn>0</mml:mn>
<mml:mo>&#x2264;</mml:mo>
<mml:mtext>F</mml:mtext>
<mml:mo>-</mml:mo>
<mml:mtext>score</mml:mtext>
<mml:mo>&#x2264;</mml:mo>
<mml:mn>1.</mml:mn>
</mml:mrow>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:mrow>
</mml:math>
<label>(14)</label>
</disp-formula>
</p>
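<p>The indicators in Eqs 10&#x2013;14 can be computed directly from the four confusion-matrix counts, as in the following sketch. The function name and the example counts are hypothetical and do not correspond to any result reported in this study.</p>
<preformat>
```python
import math


def classification_metrics(tp, fp, tn, fn):
    """Compute Sn, Sp, Acc, MCC, and F-score from confusion-matrix counts."""
    sn = tp / (tp + fn)
    sp = tn / (tn + fp)
    acc = (tp + tn) / (tp + tn + fp + fn)
    denominator = math.sqrt((tp + fn) * (tn + fn) * (tp + fp) * (tn + fp))
    # Returning 0 for a zero denominator is a common convention, assumed here.
    mcc = (tp * tn - fp * fn) / denominator if denominator else 0.0
    f_score = 2 * tp / (2 * tp + fn + fp)
    return {"Sn": sn, "Sp": sp, "Acc": acc, "MCC": mcc, "F-score": f_score}


# Hypothetical counts for illustration only.
metrics = classification_metrics(tp=30, fp=10, tn=50, fn=10)
for name, value in metrics.items():
    print(name, round(value, 3))
```
</preformat>
<p>Unlike the other indicators, MCC can be negative: a classifier that gets every sample wrong has MCC &#x3d; &#x2212;1, while a value near 0 corresponds to chance-level prediction.</p>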
<p>In machine learning research, receiver operating characteristic (ROC) curves are commonly used to assess the prediction performance of a model. The AUC is the area under the ROC curve and takes values between 0 and 1; the greater the value, the better the performance of the model. ROC curves and AUCs are widely used, reliable indicators for comparing the performance of different models. MCC is well suited to imbalanced datasets and is one of the most important indicators of binary classification performance. We use Python libraries to process the&#x20;data.</p>
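<p>As an illustration of what the AUC measures, the sketch below computes it as the fraction of positive-negative pairs in which the positive sample receives the higher score, with ties counted as one half. This pairwise form is equivalent to the area under the ROC curve; the function name and the toy scores are hypothetical.</p>
<preformat>
```python
def roc_auc(labels, scores):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical labels and decision scores, not taken from the study.
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.5, 0.1]
print(roc_auc(labels, scores))  # 0.75
```
</preformat>
<p>Because the AUC depends only on the ranking of the scores, it does not change with the decision threshold, which makes it convenient for comparing classifiers on imbalanced data.</p>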
</sec>
<sec id="s3-3">
<title>3.3 Performance Comparison With Different Feature Dimensions</title>
<p>We use the SVM-RFE-CBR algorithm to evaluate the original 400-dimensional feature data. The algorithm is implemented in MATLAB and used to rank the features, and performance is compared as the ranked features are selected. The evaluation results are shown in <xref ref-type="fig" rid="F2">Figure&#x20;2</xref>. <xref ref-type="fig" rid="F2">Figure&#x20;2</xref> shows that Acc reaches its highest value when the top 350 features are used. Therefore, we choose the 350-dimensional feature data for the experiment.</p>
<fig id="F2" position="float">
<label>FIGURE 2</label>
<caption>
<p>Results of dimension reduction using the SVM-RFE-CBR algorithm.</p>
</caption>
<graphic xlink:href="fgene-12-809001-g002.tif"/>
</fig>
<p>We use the top 350 ranked features for the experiment. First, the 350-dimensional feature data are selected from the original training and test feature files according to the indices obtained by the SVM-RFE-CBR algorithm. Then, 10-fold cross validation is performed on the training dataset, and the model is optimized. After repeated experiments, the optimal SVM parameters are obtained: the model achieves its best performance with the radial basis function kernel, penalty coefficient C &#x3d; 11, and gamma &#x3d; 0.1. For comparison, we also run the experiment on the original 400-dimensional feature data with the optimal parameters for that setting. The results for the two feature dimensions are compared in <xref ref-type="table" rid="T2">Table&#x20;2</xref>.</p>
<table-wrap id="T2" position="float">
<label>TABLE 2</label>
<caption>
<p>Comparison of prediction results between SVM-RFE-CBR dimension reduction and original dimension.</p>
</caption>
<table>
<thead valign="top">
<tr>
<th align="left">Feature-dimension</th>
<th align="center">Sn</th>
<th align="center">Sp</th>
<th align="center">Acc</th>
<th align="center">AUC</th>
<th align="center">MCC</th>
<th align="center">F-score</th>
</tr>
</thead>
<tbody valign="top">
<tr>
<td align="left">350</td>
<td align="char" char=".">
<bold>0.68</bold>
</td>
<td align="char" char=".">
<bold>0.94</bold>
</td>
<td align="char" char=".">
<bold>0.92</bold>
</td>
<td align="char" char=".">
<bold>0.84</bold>
</td>
<td align="char" char=".">
<bold>0.48</bold>
</td>
<td align="char" char=".">
<bold>0.5</bold>
</td>
</tr>
<tr>
<td align="left">400</td>
<td align="char" char=".">0.68</td>
<td align="char" char=".">0.94</td>
<td align="char" char=".">0.91</td>
<td align="char" char=".">0.83</td>
<td align="char" char=".">0.48</td>
<td align="char" char=".">0.5</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p>Bold values indicate the maximum value in each column.</p>
</table-wrap-foot>
</table-wrap>
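<p>The experimental results show that Acc and AUC improve slightly after feature dimensionality reduction (from 0.91 to 0.92 and from 0.83 to 0.84, respectively), while Sn, Sp, MCC, and F-score are unchanged at the reported precision. This suggests that dimensionality reduction removes redundant parts of the original features without degrading the performance of the&#x20;model.</p>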
</sec>
<sec id="s3-4">
<title>3.4 Comparison of Different Classifier Performance on Dataset</title>
<p>With the development of computing, machine learning has been widely used in bioinformatics (<xref ref-type="bibr" rid="B43">Tang et&#x20;al., 2019</xref>; <xref ref-type="bibr" rid="B50">Wang et&#x20;al., 2020c</xref>; <xref ref-type="bibr" rid="B15">Fu et&#x20;al., 2020</xref>; <xref ref-type="bibr" rid="B2">Cai et&#x20;al., 2021</xref>; <xref ref-type="bibr" rid="B53">Wang et&#x20;al., 2021c</xref>; <xref ref-type="bibr" rid="B23">Jin et&#x20;al., 2021</xref>), and many classification models are available, including linear classifiers, SVM, naive Bayes, K-nearest neighbor (KNN), decision tree (DT), and ensemble models (e.g., random forest and GBDT). To find the most effective classifier for identifying SNARE proteins, we also construct SNARE protein recognition models with other machine learning classifiers, namely random forest, KNN, and naive Bayes.</p>
<p>We compare the models trained with the different classifiers using the performance measurements described above. The results are shown in <xref ref-type="table" rid="T3">Table&#x20;3</xref>.</p>
<table-wrap id="T3" position="float">
<label>TABLE 3</label>
<caption>
<p>Performance comparison between SVM and other classification methods.</p>
</caption>
<table>
<thead valign="top">
<tr>
<th align="left"/>
<th align="center">Sn</th>
<th align="center">Sp</th>
<th align="center">Acc</th>
<th align="center">MCC</th>
</tr>
</thead>
<tbody valign="top">
<tr>
<td align="left">KNN</td>
<td align="char" char=".">
<bold>0.870</bold>
</td>
<td align="char" char=".">0.906</td>
<td align="char" char=".">0.898</td>
<td align="char" char=".">
<bold>0.73</bold>
</td>
</tr>
<tr>
<td align="left">Random Forest</td>
<td align="char" char=".">0.620</td>
<td align="char" char=".">0.962</td>
<td align="char" char=".">0.900</td>
<td align="char" char=".">0.70</td>
</tr>
<tr>
<td align="left">Na&#xef;ve Bayes</td>
<td align="char" char=".">0.853</td>
<td align="char" char=".">0.595</td>
<td align="char" char=".">0.624</td>
<td align="char" char=".">0.28</td>
</tr>
<tr>
<td align="left">SVM</td>
<td align="char" char=".">0.650</td>
<td align="char" char=".">
<bold>0.970</bold>
</td>
<td align="char" char=".">
<bold>0.900</bold>
</td>
<td align="char" char=".">0.70</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p>Bold values indicate the maximum value in each column.</p>
</table-wrap-foot>
</table-wrap>
<p>As <xref ref-type="table" rid="T3">Table&#x20;3</xref> shows, on the training dataset SVM achieves the highest Sp (0.970) and, together with random forest, the highest Acc (0.900), whereas KNN achieves the highest Sn (0.870) and MCC (0.73).</p>
<p>We also compare the ROC curves of the different classifiers; the results are shown in <xref ref-type="fig" rid="F3">Figure&#x20;3</xref>. As <xref ref-type="fig" rid="F3">Figure&#x20;3</xref> shows, the ROC curve of SVM is better than those of the other three classifiers.</p>
<fig id="F3" position="float">
<label>FIGURE 3</label>
<caption>
<p>ROC curves of different classifier methods.</p>
</caption>
<graphic xlink:href="fgene-12-809001-g003.tif"/>
</fig>
</sec>
<sec id="s3-5">
<title>3.5 Comparison of Different SNARE Protein Identification Methods</title>
<p>We compare the results of SNARE CNN with the performance measurements of our method. The results of the different methods for identifying SNARE proteins are shown in <xref ref-type="fig" rid="F4">Figure&#x20;4</xref>. <xref ref-type="fig" rid="F4">Figure&#x20;4A</xref> compares the performance of our method with that of the other method on the training datasets, and <xref ref-type="fig" rid="F4">Figure&#x20;4B</xref> compares them on the test datasets.</p>
<fig id="F4" position="float">
<label>FIGURE 4</label>
<caption>
<p>
<bold>(A)</bold> Performance comparison between our classification method and the other classification method on the training datasets. <bold>(B)</bold> Performance comparison between our classification method and the other classification method on the test datasets.</p>
</caption>
<graphic xlink:href="fgene-12-809001-g004.tif"/>
</fig>
<p>The results show that our method performs well on both the training and independent test datasets. To compare our method with other methods more accurately, we compare their results on the independent test dataset. As <xref ref-type="fig" rid="F4">Figure&#x20;4B</xref> shows, the independent test results of our method are better than those of SNARE CNN: our method achieves Sn &#x3d; 0.68, Sp &#x3d; 0.94, Acc &#x3d; 0.92, and MCC &#x3d; 0.48, the highest values for all of these indicators. These results indicate that our method outperforms the existing method, especially on the independent test dataset, and can therefore better recognize SNARE proteins.</p>
</sec>
</sec>
<sec id="s4">
<title>4 Discussion</title>
<p>Because of the importance of SNARE proteins in vesicular transport, there is an urgent need for classification methods to identify them. Extracting meaningful features and selecting an appropriate machine learning algorithm can greatly improve the performance of protein prediction models. We propose a method that extracts features from PSSM profiles and uses SVM to construct a model to identify SNARE proteins. We normalize the feature data and use the SVM-RFE-CBR algorithm to reduce the feature dimension. Then, we train the model with 10-fold cross validation and use an independent dataset to test its performance (<xref ref-type="bibr" rid="B27">Li et&#x20;al., 2017</xref>; <xref ref-type="bibr" rid="B30">Li et&#x20;al., 2020c</xref>). Our method achieves good results in terms of accuracy, specificity, sensitivity, AUC, MCC, and other performance indicators. The results show that our model performs better than the other machine learning methods and the neural network method compared in this study. Our method can effectively identify SNARE proteins. Taken together, the method proposed in our study is of great significance for the study of SNARE proteins and may also contribute to the prediction of protein function. Future work may include the investigation of more kinds of proteins.</p>
</sec>
</body>
<back>
<sec id="s5">
<title>Data Availability Statement</title>
<p>The datasets presented in this study can be found in online repositories: <ext-link ext-link-type="uri" xlink:href="https://github.com/First-Leaner/Identify-proteins">https://github.com/First-Leaner/Identify-proteins</ext-link>. The names of the repository/repositories and accession number(s) can also be found in the article/Supplementary Material.</p>
</sec>
<sec id="s6">
<title>Author Contributions</title>
<p>ZZ and BD conceived and designed the project. ZZ, HL, and YG conducted the experiments and analyzed the data. ZZ and BG wrote the manuscript. BD, WG, and YZ revised the manuscript. All authors read and approved the final manuscript.</p>
</sec>
<sec id="s7">
<title>Funding</title>
<p>This work was supported by National Natural Science Foundation of China (No. 62172129).</p>
</sec>
<sec sec-type="COI-statement" id="s8">
<title>Conflict of Interest</title>
<p>The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
</sec>
<sec sec-type="disclaimer" id="s9">
<title>Publisher&#x2019;s Note</title>
<p>All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.</p>
</sec>
<ref-list>
<title>References</title>
<ref id="B1">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ao</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Zhou</surname>
<given-names>W.</given-names>
</name>
<name>
<surname>Gao</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Dong</surname>
<given-names>B.</given-names>
</name>
<name>
<surname>Yu</surname>
<given-names>L.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>Prediction of Antioxidant Proteins Using Hybrid Feature Representation Method and Random forest</article-title>. <source>Genomics</source> <volume>112</volume> (<issue>6</issue>), <fpage>4666</fpage>&#x2013;<lpage>4674</lpage>. <pub-id pub-id-type="doi">10.1016/j.ygeno.2020.08.016</pub-id> </citation>
</ref>
<ref id="B2">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Cai</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Ren</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Fu</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Peng</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Gao</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Zeng</surname>
<given-names>X.</given-names>
</name>
</person-group> (<year>2021</year>). <article-title>iEnhancer-XG: Interpretable Sequence-Based Enhancers and Their Strength Predictor</article-title>. <source>Bioinformatics</source> <volume>37</volume>, <fpage>1060</fpage>&#x2013;<lpage>1067</lpage>. <pub-id pub-id-type="doi">10.1093/bioinformatics/btaa914</pub-id> </citation>
</ref>
<ref id="B3">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Chen</surname>
<given-names>Y. A.</given-names>
</name>
<name>
<surname>Scheller</surname>
<given-names>R. H.</given-names>
</name>
</person-group> (<year>2001</year>). <article-title>SNARE-Mediated Membrane Fusion</article-title>. <source>Nat. Rev. Mol. Cel. Biol.</source> <volume>2</volume> (<issue>2</issue>), <fpage>98</fpage>&#x2013;<lpage>106</lpage>. <pub-id pub-id-type="doi">10.1038/35052017</pub-id> </citation>
</ref>
<ref id="B4">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Chen</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Ma</surname>
<given-names>T.</given-names>
</name>
<name>
<surname>Yang</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Song</surname>
<given-names>B.</given-names>
</name>
<name>
<surname>Zeng</surname>
<given-names>X.</given-names>
</name>
</person-group> (<year>2021</year>). <article-title>MUFFIN: Multi-Scale Feature Fusion for Drug-Drug Interaction Prediction</article-title>. <source>Bioinformatics</source> <volume>37</volume>, <fpage>2651</fpage>&#x2013;<lpage>2658</lpage>. <pub-id pub-id-type="doi">10.1093/bioinformatics/btab169</pub-id> </citation>
</ref>
<ref id="B5">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Cheng</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Hu</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Sun</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Zhou</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Jiang</surname>
<given-names>Q.</given-names>
</name>
</person-group> (<year>2018</year>). <article-title>DincRNA: a Comprehensive Web-Based Bioinformatics Toolkit for Exploring Disease Associations and ncRNA Function</article-title>. <source>Bioinformatics</source> <volume>34</volume> (<issue>11</issue>), <fpage>1953</fpage>&#x2013;<lpage>1956</lpage>. <pub-id pub-id-type="doi">10.1093/bioinformatics/bty002</pub-id> </citation>
</ref>
<ref id="B6">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Cheng</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Yang</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Zhao</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Pei</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Shi</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Sun</surname>
<given-names>J.</given-names>
</name>
<etal/>
</person-group> (<year>2019</year>). <article-title>MetSigDis: a Manually Curated Resource for the Metabolic Signatures of Diseases</article-title>. <source>Brief Bioinform.</source> <volume>20</volume> (<issue>1</issue>), <fpage>203</fpage>&#x2013;<lpage>209</lpage>. <pub-id pub-id-type="doi">10.1093/bib/bbx103</pub-id> </citation>
</ref>
<ref id="B7">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Cheng</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Zhao</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Zhou</surname>
<given-names>W.</given-names>
</name>
<name>
<surname>Luo</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>T.</given-names>
</name>
<etal/>
</person-group> (<year>2019</year>). <article-title>Computational Methods for Identifying Similar Diseases</article-title>. <source>Mol. Ther. - Nucleic Acids</source> <volume>18</volume>, <fpage>590</fpage>&#x2013;<lpage>604</lpage>. <pub-id pub-id-type="doi">10.1016/j.omtn.2019.09.019</pub-id> </citation>
</ref>
<ref id="B8">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Cheng</surname>
<given-names>L.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>Omics Data and Artificial Intelligence: New Challenges for Gene Therapy</article-title>. <source>Curr. Gene Ther.</source> <volume>20</volume> (<issue>1</issue>), <fpage>1</fpage>. <pub-id pub-id-type="doi">10.2174/156652322001200604150041</pub-id> </citation>
</ref>
<ref id="B9">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Chou</surname>
<given-names>K.-C.</given-names>
</name>
<name>
<surname>Shen</surname>
<given-names>H.-B.</given-names>
</name>
</person-group> (<year>2007</year>). <article-title>MemType-2L: A Web Server for Predicting Membrane Proteins and Their Types by Incorporating Evolution Information through Pse-PSSM</article-title>. <source>Biochem. Biophys. Res. Commun.</source> <volume>360</volume> (<issue>2</issue>), <fpage>339</fpage>&#x2013;<lpage>345</lpage>. <pub-id pub-id-type="doi">10.1016/j.bbrc.2007.06.027</pub-id> </citation>
</ref>
<ref id="B10">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Chou</surname>
<given-names>K.-C.</given-names>
</name>
</person-group> (<year>2000</year>). <article-title>Prediction of Protein Subcellular Locations by Incorporating Quasi-Sequence-Order Effect</article-title>. <source>Biochem. Biophys. Res. Commun.</source> <volume>278</volume> (<issue>2</issue>), <fpage>477</fpage>&#x2013;<lpage>483</lpage>. <pub-id pub-id-type="doi">10.1006/bbrc.2000.3815</pub-id> </citation>
</ref>
<ref id="B11">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ding</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Tang</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Guo</surname>
<given-names>F.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>Identification of Drug-Target Interactions via Fuzzy Bipartite Local Model</article-title>. <source>Neural Comput. Appl.</source> <volume>32</volume> (<issue>14</issue>), <fpage>10303</fpage>&#x2013;<lpage>10319</lpage>. <pub-id pub-id-type="doi">10.1007/s00521-019-04569-z</pub-id> </citation>
</ref>
<ref id="B12">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ding</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Tang</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Guo</surname>
<given-names>F.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>Human Protein Subcellular Localization Identification via Fuzzy Model on Kernelized Neighborhood Representation</article-title>. <source>Appl. Soft Comput.</source> <volume>96</volume>, <fpage>106596</fpage>. <pub-id pub-id-type="doi">10.1016/j.asoc.2020.106596</pub-id> </citation>
</ref>
<ref id="B13">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ding</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Tang</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Guo</surname>
<given-names>F.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>Identification of Drug-Target Interactions via Dual Laplacian Regularized Least Squares with Multiple Kernel Fusion</article-title>. <source>Knowledge-Based Syst.</source> <volume>204</volume>, <fpage>106254</fpage>. <pub-id pub-id-type="doi">10.1016/j.knosys.2020.106254</pub-id> </citation>
</ref>
<ref id="B14">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Fasshauer</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Sutton</surname>
<given-names>R. B.</given-names>
</name>
<name>
<surname>Brunger</surname>
<given-names>A. T.</given-names>
</name>
<name>
<surname>Jahn</surname>
<given-names>R.</given-names>
</name>
</person-group> (<year>1998</year>). <article-title>Conserved Structural Features of the Synaptic Fusion Complex: SNARE Proteins Reclassified as Q- and R-SNAREs</article-title>. <source>Proc. Natl. Acad. Sci.</source> <volume>95</volume> (<issue>26</issue>), <fpage>15781</fpage>&#x2013;<lpage>15786</lpage>. <pub-id pub-id-type="doi">10.1073/pnas.95.26.15781</pub-id> </citation>
</ref>
<ref id="B15">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Fu</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Cai</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Zeng</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Zou</surname>
<given-names>Q.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>StackCPPred: a Stacking and Pairwise Energy Content-Based Prediction of Cell-Penetrating Peptides and Their Uptake Efficiency</article-title>. <source>Bioinformatics</source> <volume>36</volume> (<issue>10</issue>), <fpage>3028</fpage>&#x2013;<lpage>3034</lpage>. <pub-id pub-id-type="doi">10.1093/bioinformatics/btaa131</pub-id> </citation>
</ref>
<ref id="B16">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Guo</surname>
<given-names>Z.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Liu</surname>
<given-names>Z.</given-names>
</name>
<name>
<surname>Zhao</surname>
<given-names>Y.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>Discrimination of Thermophilic Proteins and Non-thermophilic Proteins Using Feature Dimension Reduction</article-title>. <source>Front. Bioeng. Biotechnol.</source> <volume>8</volume>, <fpage>584807</fpage>. <pub-id pub-id-type="doi">10.3389/fbioe.2020.584807</pub-id> </citation>
</ref>
<ref id="B17">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hanson</surname>
<given-names>P. I.</given-names>
</name>
<name>
<surname>Heuser</surname>
<given-names>J.&#x20;E.</given-names>
</name>
<name>
<surname>Jahn</surname>
<given-names>R.</given-names>
</name>
</person-group> (<year>1997</year>). <article-title>Neurotransmitter Release - Four Years of SNARE Complexes</article-title>. <source>Curr. Opin. Neurobiol.</source> <volume>7</volume> (<issue>3</issue>), <fpage>310</fpage>&#x2013;<lpage>315</lpage>. <pub-id pub-id-type="doi">10.1016/s0959-4388(97)80057-8</pub-id> </citation>
</ref>
<ref id="B18">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hohl</surname>
<given-names>T. M.</given-names>
</name>
<name>
<surname>Parlati</surname>
<given-names>F.</given-names>
</name>
<name>
<surname>Wimmer</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Rothman</surname>
<given-names>J.&#x20;E.</given-names>
</name>
<name>
<surname>S&#xf6;llner</surname>
<given-names>T. H.</given-names>
</name>
<name>
<surname>Engelhardt</surname>
<given-names>H.</given-names>
</name>
</person-group> (<year>1998</year>). <article-title>Arrangement of Subunits in 20&#x20;S Particles Consisting of NSF, SNAPs, and SNARE Complexes</article-title>. <source>Mol. Cell</source> <volume>2</volume> (<issue>5</issue>), <fpage>539</fpage>&#x2013;<lpage>548</lpage>. <pub-id pub-id-type="doi">10.1016/s1097-2765(00)80153-7</pub-id> </citation>
</ref>
<ref id="B19">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hong</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Luo</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Ying</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Xue</surname>
<given-names>W.</given-names>
</name>
<name>
<surname>Xie</surname>
<given-names>T.</given-names>
</name>
<etal/>
</person-group> (<year>2020</year>). <article-title>Protein Functional Annotation of Simultaneously Improved Stability, Accuracy and False Discovery Rate Achieved by a Sequence-Based Deep Learning</article-title>. <source>Brief. Bioinform.</source> <volume>21</volume> (<issue>4</issue>), <fpage>1437</fpage>&#x2013;<lpage>1447</lpage>. <pub-id pub-id-type="doi">10.1093/bib/bbz081</pub-id> </citation>
</ref>
<ref id="B20">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hong</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Luo</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Mou</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Fu</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Xue</surname>
<given-names>W.</given-names>
</name>
<etal/>
</person-group> (<year>2020</year>). <article-title>Convolutional Neural Network-Based Annotation of Bacterial Type IV Secretion System Effectors with Enhanced Accuracy and Reduced False Discovery</article-title>. <source>Brief. Bioinform.</source> <volume>21</volume> (<issue>5</issue>), <fpage>1825</fpage>&#x2013;<lpage>1836</lpage>. <pub-id pub-id-type="doi">10.1093/bib/bbz120</pub-id> </citation>
</ref>
<ref id="B21">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hua</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Sun</surname>
<given-names>Z.</given-names>
</name>
</person-group> (<year>2001</year>). <article-title>Support Vector Machine Approach for Protein Subcellular Localization Prediction</article-title>. <source>Bioinformatics</source> <volume>17</volume> (<issue>8</issue>), <fpage>721</fpage>&#x2013;<lpage>728</lpage>. <pub-id pub-id-type="doi">10.1093/bioinformatics/17.8.721</pub-id> </citation>
</ref>
<ref id="B22">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Jiang</surname>
<given-names>Q.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>G.</given-names>
</name>
<name>
<surname>Jin</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>Y.</given-names>
</name>
</person-group> (<year>2013</year>). <article-title>Predicting Human microRNA-Disease Associations Based on Support Vector Machine</article-title>. <source>Int. J. Data Min. Bioinform.</source> <volume>8</volume> (<issue>3</issue>), <fpage>282</fpage>&#x2013;<lpage>293</lpage>. <pub-id pub-id-type="doi">10.1504/ijdmb.2013.056078</pub-id> </citation>
</ref>
<ref id="B23">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Jin</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Zeng</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Xia</surname>
<given-names>F.</given-names>
</name>
<name>
<surname>Huang</surname>
<given-names>W.</given-names>
</name>
<name>
<surname>Liu</surname>
<given-names>X.</given-names>
</name>
</person-group> (<year>2021</year>). <article-title>Application of Deep Learning Methods in Biological Networks</article-title>. <source>Brief. Bioinform.</source> <volume>22</volume> (<issue>2</issue>), <fpage>1902</fpage>&#x2013;<lpage>1917</lpage>. <pub-id pub-id-type="doi">10.1093/bib/bbaa043</pub-id> </citation>
</ref>
<ref id="B24">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kumar</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Gromiha</surname>
<given-names>M. M.</given-names>
</name>
<name>
<surname>Raghava</surname>
<given-names>G. P. S.</given-names>
</name>
</person-group> (<year>2008</year>). <article-title>Prediction of RNA Binding Sites in a Protein Using SVM and PSSM Profile</article-title>. <source>Proteins</source> <volume>71</volume> (<issue>1</issue>), <fpage>189</fpage>&#x2013;<lpage>194</lpage>. <pub-id pub-id-type="doi">10.1002/prot.21677</pub-id> </citation>
</ref>
<ref id="B25">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kweon</surname>
<given-names>D.-H.</given-names>
</name>
<name>
<surname>Kim</surname>
<given-names>C. S.</given-names>
</name>
<name>
<surname>Shin</surname>
<given-names>Y.-K.</given-names>
</name>
</person-group> (<year>2003</year>). <article-title>Regulation of Neuronal SNARE Assembly by the Membrane</article-title>. <source>Nat. Struct. Mol. Biol.</source> <volume>10</volume> (<issue>6</issue>), <fpage>440</fpage>&#x2013;<lpage>447</lpage>. <pub-id pub-id-type="doi">10.1038/nsb928</pub-id> </citation>
</ref>
<ref id="B26">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Le</surname>
<given-names>N. Q. K.</given-names>
</name>
<name>
<surname>Nguyen</surname>
<given-names>V.-N.</given-names>
</name>
</person-group> (<year>2019</year>). <article-title>SNARE-CNN: a 2D Convolutional Neural Network Architecture to Identify SNARE Proteins from High-Throughput Sequencing Data</article-title>. <source>PeerJ&#x20;Comp. Sci.</source> <volume>5</volume>, <fpage>e177</fpage>. <pub-id pub-id-type="doi">10.7717/peerj-cs.177</pub-id> </citation>
</ref>
<ref id="B27">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Li</surname>
<given-names>B.</given-names>
</name>
<name>
<surname>Tang</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Yang</surname>
<given-names>Q.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Cui</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>Y.</given-names>
</name>
<etal/>
</person-group> (<year>2017</year>). <article-title>NOREVA: Normalization and Evaluation of MS-based Metabolomics Data</article-title>. <source>Nucleic Acids Res.</source> <volume>45</volume> (<issue>W1</issue>), <fpage>W162</fpage>&#x2013;<lpage>W170</lpage>. <pub-id pub-id-type="doi">10.1093/nar/gkx449</pub-id> </citation>
</ref>
<ref id="B28">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Li</surname>
<given-names>F.</given-names>
</name>
<name>
<surname>Zhou</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Tang</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Yang</surname>
<given-names>Q.</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>Y.</given-names>
</name>
<etal/>
</person-group> (<year>2020</year>). <article-title>SSizer: Determining the Sample Sufficiency for Comparative Biological Study</article-title>. <source>J.&#x20;Mol. Biol.</source> <volume>432</volume> (<issue>11</issue>), <fpage>3411</fpage>&#x2013;<lpage>3421</lpage>. <pub-id pub-id-type="doi">10.1016/j.jmb.2020.01.027</pub-id> </citation>
</ref>
<ref id="B29">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Li</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Pu</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Tang</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Zou</surname>
<given-names>Q.</given-names>
</name>
<name>
<surname>Guo</surname>
<given-names>F.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>DeepATT: a Hybrid Category Attention Neural Network for Identifying Functional Effects of DNA Sequences</article-title>. <source>Brief. Bioinform.</source> <volume>22</volume> (<issue>3</issue>), <fpage>bbaa159</fpage>. <pub-id pub-id-type="doi">10.1093/bib/bbaa159</pub-id> </citation>
</ref>
<ref id="B30">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Li</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Pu</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Tang</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Zou</surname>
<given-names>Q.</given-names>
</name>
<name>
<surname>Guo</surname>
<given-names>F.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>DeepAVP: a Dual-Channel Deep Neural Network for Identifying Variable-Length Antiviral Peptides</article-title>. <source>IEEE J.&#x20;Biomed. Health Inform.</source> <volume>24</volume> (<issue>10</issue>), <fpage>3012</fpage>&#x2013;<lpage>3019</lpage>. <pub-id pub-id-type="doi">10.1109/jbhi.2020.2977091</pub-id> </citation>
</ref>
<ref id="B31">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Liu</surname>
<given-names>T.</given-names>
</name>
<name>
<surname>Zheng</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>J.</given-names>
</name>
</person-group> (<year>2010</year>). <article-title>Prediction of Protein Structural Class for Low-Similarity Sequences Using Support Vector Machine and PSI-BLAST Profile</article-title>. <source>Biochimie</source> <volume>92</volume> (<issue>10</issue>), <fpage>1330</fpage>&#x2013;<lpage>1334</lpage>. <pub-id pub-id-type="doi">10.1016/j.biochi.2010.06.013</pub-id> </citation>
</ref>
<ref id="B32">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Liu</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>G.</given-names>
</name>
<name>
<surname>Zuo</surname>
<given-names>Y.</given-names>
</name>
</person-group> (<year>2019</year>). <article-title>Function Determinants of TET Proteins: the Arrangements of Sequence Motifs with Specific Codes</article-title>. <source>Brief. Bioinform.</source> <volume>20</volume> (<issue>5</issue>), <fpage>1826</fpage>&#x2013;<lpage>1835</lpage>. <pub-id pub-id-type="doi">10.1093/bib/bby053</pub-id> </citation>
</ref>
<ref id="B33">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Liu</surname>
<given-names>B.</given-names>
</name>
<name>
<surname>Gao</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>H.</given-names>
</name>
</person-group> (<year>2019</year>). <article-title>BioSeq-Analysis2.0: an Updated Platform for Analyzing DNA, RNA and Protein Sequences at Sequence Level and Residue Level Based on Machine Learning Approaches</article-title>. <source>Nucleic Acids Res.</source> <volume>47</volume> (<issue>20</issue>), <fpage>e127</fpage>. <pub-id pub-id-type="doi">10.1093/nar/gkz740</pub-id> </citation>
</ref>
<ref id="B34">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Liu</surname>
<given-names>B.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>C.-C.</given-names>
</name>
<name>
<surname>Yan</surname>
<given-names>K.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>DeepSVM-fold: Protein Fold Recognition by Combining Support Vector Machines and Pairwise Sequence Similarity Scores Generated by Deep Learning Networks</article-title>. <source>Brief. Bioinform.</source> <volume>21</volume> (<issue>5</issue>), <fpage>1733</fpage>&#x2013;<lpage>1741</lpage>. <pub-id pub-id-type="doi">10.1093/bib/bbz098</pub-id> </citation>
</ref>
<ref id="B35">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Liu</surname>
<given-names>B.</given-names>
</name>
<name>
<surname>Zhu</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Yan</surname>
<given-names>K.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>Fold-LTR-TCP: Protein Fold Recognition Based on Triadic Closure Principle</article-title>. <source>Brief. Bioinform.</source> <volume>21</volume> (<issue>6</issue>), <fpage>2185</fpage>&#x2013;<lpage>2193</lpage>. <pub-id pub-id-type="doi">10.1093/bib/bbz139</pub-id> </citation>
</ref>
<ref id="B36">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Shang</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Gao</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Zou</surname>
<given-names>Q.</given-names>
</name>
<name>
<surname>Yu</surname>
<given-names>L.</given-names>
</name>
</person-group> (<year>2021</year>). <article-title>Prediction of Drug-Target Interactions Based on Multi-Layer Network Representation Learning</article-title>. <source>Neurocomputing</source> <volume>434</volume>, <fpage>80</fpage>&#x2013;<lpage>89</lpage>. <pub-id pub-id-type="doi">10.1016/j.neucom.2020.12.068</pub-id> </citation>
</ref>
<ref id="B37">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Shao</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Liu</surname>
<given-names>B.</given-names>
</name>
</person-group> (<year>2021</year>). <article-title>ProtFold-DFG: Protein Fold Recognition by Combining Directed Fusion Graph and PageRank Algorithm</article-title>. <source>Brief. Bioinform.</source> <volume>22</volume>, <fpage>bbaa192</fpage>. <pub-id pub-id-type="doi">10.1093/bib/bbaa192</pub-id> </citation>
</ref>
<ref id="B38">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Shao</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Yan</surname>
<given-names>K.</given-names>
</name>
<name>
<surname>Liu</surname>
<given-names>B.</given-names>
</name>
</person-group> (<year>2021</year>). <article-title>FoldRec-C2C: Protein Fold Recognition by Combining Cluster-To-Cluster Model and Protein Similarity Network</article-title>. <source>Brief. Bioinform.</source> <volume>22</volume> (<issue>3</issue>), <fpage>bbaa144</fpage>. <pub-id pub-id-type="doi">10.1093/bib/bbaa144</pub-id> </citation>
</ref>
<ref id="B39">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Shen</surname>
<given-names>H.-B.</given-names>
</name>
<name>
<surname>Chou</surname>
<given-names>K.-C.</given-names>
</name>
</person-group> (<year>2006</year>). <article-title>Ensemble Classifier for Protein Fold Pattern Recognition</article-title>. <source>Bioinformatics</source> <volume>22</volume> (<issue>14</issue>), <fpage>1717</fpage>&#x2013;<lpage>1722</lpage>. <pub-id pub-id-type="doi">10.1093/bioinformatics/btl170</pub-id> </citation>
</ref>
<ref id="B40">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Shen</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Tang</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Guo</surname>
<given-names>F.</given-names>
</name>
</person-group> (<year>2019</year>). <article-title>Identification of Protein Subcellular Localization via Integrating Evolutionary and Physicochemical Information into Chou&#x27;s General PseAAC</article-title>. <source>J.&#x20;Theor. Biol.</source> <volume>462</volume>, <fpage>230</fpage>&#x2013;<lpage>239</lpage>. <pub-id pub-id-type="doi">10.1016/j.jtbi.2018.11.012</pub-id> </citation>
</ref>
<ref id="B41">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Shen</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Ding</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Tang</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Zou</surname>
<given-names>Q.</given-names>
</name>
<name>
<surname>Guo</surname>
<given-names>F.</given-names>
</name>
</person-group> (<year>2019</year>). <article-title>Critical Evaluation of Web-Based Prediction Tools for Human Protein Subcellular Localization</article-title>. <source>Brief. Bioinform.</source> <volume>21</volume> (<issue>5</issue>), <fpage>1628</fpage>&#x2013;<lpage>1640</lpage>. <pub-id pub-id-type="doi">10.1093/bib/bbz106</pub-id> </citation>
</ref>
<ref id="B42">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Su</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Liu</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Wei</surname>
<given-names>L.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>MinE-RFE: Determine the Optimal Subset from RFE by Minimizing the Subset-Accuracy-Defined Energy</article-title>. <source>Brief. Bioinform.</source> <volume>21</volume> (<issue>2</issue>), <fpage>687</fpage>&#x2013;<lpage>698</lpage>. <pub-id pub-id-type="doi">10.1093/bib/bbz021</pub-id> </citation>
</ref>
<ref id="B43">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Tang</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Fu</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Luo</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Yang</surname>
<given-names>Q.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>B.</given-names>
</name>
<etal/>
</person-group> (<year>2019</year>). <article-title>Simultaneous Improvement in the Precision, Accuracy, and Robustness of Label-free Proteome Quantification by Optimizing Data Manipulation Chains</article-title>. <source>Mol. Cell Proteomics</source> <volume>18</volume> (<issue>8</issue>), <fpage>1683</fpage>&#x2013;<lpage>1699</lpage>. <pub-id pub-id-type="doi">10.1074/mcp.ra118.001169</pub-id> </citation>
</ref>
<ref id="B44">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Tang</surname>
<given-names>Y.-J.</given-names>
</name>
<name>
<surname>Pang</surname>
<given-names>Y.-H.</given-names>
</name>
<name>
<surname>Liu</surname>
<given-names>B.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>IDP-Seq2Seq: Identification of Intrinsically Disordered Regions Based on Sequence to Sequence Learning</article-title>. <source>Bioinformatics</source> <volume>36</volume> (<issue>21</issue>), <fpage>5177</fpage>&#x2013;<lpage>5186</lpage>. <pub-id pub-id-type="doi">10.1093/bioinformatics/btaa667</pub-id> </citation>
</ref>
<ref id="B45">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Tang</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Fu</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>B.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Yang</surname>
<given-names>Q.</given-names>
</name>
<etal/>
</person-group> (<year>2020</year>). <article-title>ANPELA: Analysis and Performance Assessment of the Label-free Quantification Workflow for Metaproteomic Studies</article-title>. <source>Brief. Bioinform.</source> <volume>21</volume> (<issue>2</issue>), <fpage>621</fpage>&#x2013;<lpage>636</lpage>. <pub-id pub-id-type="doi">10.1093/bib/bby127</pub-id> </citation>
</ref>
<ref id="B46">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Tao</surname>
<given-names>Z.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Teng</surname>
<given-names>Z.</given-names>
</name>
<name>
<surname>Zhao</surname>
<given-names>Y.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>A Method for Identifying Vesicle Transport Proteins Based on LibSVM and MRMD</article-title>. <source>Comput. Math. Methods Med.</source> <volume>2020</volume>, <fpage>8926750</fpage>. <pub-id pub-id-type="doi">10.1155/2020/8926750</pub-id> </citation>
</ref>
<ref id="B47">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ungar</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Hughson</surname>
<given-names>F. M.</given-names>
</name>
</person-group> (<year>2003</year>). <article-title>SNARE Protein Structure and Function</article-title>. <source>Annu. Rev. Cell Dev. Biol.</source> <volume>19</volume> (<issue>1</issue>), <fpage>493</fpage>&#x2013;<lpage>517</lpage>. <pub-id pub-id-type="doi">10.1146/annurev.cellbio.19.110701.155609</pub-id> </citation>
</ref>
<ref id="B48">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wang</surname>
<given-names>Z.</given-names>
</name>
<name>
<surname>Liu</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Xu</surname>
<given-names>B.</given-names>
</name>
<name>
<surname>Tian</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Zuo</surname>
<given-names>Y.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>Modular Arrangements of Sequence Motifs Determine the Functional Diversity of KDM Proteins</article-title>. <source>Brief. Bioinform.</source> <volume>22</volume> (<issue>3</issue>), <fpage>bbaa215</fpage>. <pub-id pub-id-type="doi">10.1093/bib/bbaa215</pub-id> </citation>
</ref>
<ref id="B49">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wang</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Ding</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Tang</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Guo</surname>
<given-names>F.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>Identification of Membrane Protein Types via Multivariate Information Fusion with Hilbert-Schmidt Independence Criterion</article-title>. <source>Neurocomputing</source> <volume>383</volume>, <fpage>257</fpage>&#x2013;<lpage>269</lpage>. <pub-id pub-id-type="doi">10.1016/j.neucom.2019.11.103</pub-id> </citation>
</ref>
<ref id="B50">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wang</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>F.</given-names>
</name>
<name>
<surname>Zhou</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>Z.</given-names>
</name>
<etal/>
</person-group> (<year>2020</year>). <article-title>Therapeutic Target Database 2020: Enriched Resource for Facilitating Research and Early Development of Targeted Therapeutics</article-title>. <source>Nucleic Acids Res.</source> <volume>48</volume> (<issue>D1</issue>), <fpage>D1031</fpage>&#x2013;<lpage>D1041</lpage>. <pub-id pub-id-type="doi">10.1093/nar/gkz981</pub-id> </citation>
</ref>
<ref id="B51">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wang</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Tang</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Ding</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Guo</surname>
<given-names>F.</given-names>
</name>
</person-group> (<year>2021</year>). <article-title>Exploring Associations of Non-coding RNAs in Human Diseases via Three-Matrix Factorization with Hypergraph-Regular Terms on Center Kernel Alignment</article-title>. <source>Brief. Bioinform.</source> <volume>22</volume> (<issue>5</issue>), <fpage>bbaa409</fpage>. <pub-id pub-id-type="doi">10.1093/bib/bbaa409</pub-id> </citation>
</ref>
<ref id="B52">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wang</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Yang</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Liu</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>G.</given-names>
</name>
</person-group> (<year>2021</year>). <article-title>The Stacking Strategy-Based Hybrid Framework for Identifying Non-coding RNAs</article-title>. <source>Brief. Bioinform.</source> <volume>22</volume> (<issue>5</issue>), <fpage>bbab023</fpage>. <pub-id pub-id-type="doi">10.1093/bib/bbab023</pub-id> </citation>
</ref>
<ref id="B53">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wang</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>Z.</given-names>
</name>
<name>
<surname>Jiang</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Mao</surname>
<given-names>Z.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Lin</surname>
<given-names>H.</given-names>
</name>
<etal/>
</person-group> (<year>2021</year>). <article-title>DM3Loc: Multi-Label mRNA Subcellular Localization Prediction and Analysis Based on Multi-Head Self-Attention Mechanism</article-title>. <source>Nucleic Acids Res.</source> <volume>49</volume> (<issue>8</issue>), <fpage>e46</fpage>. <pub-id pub-id-type="doi">10.1093/nar/gkab016</pub-id> </citation>
</ref>
<ref id="B54">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wei</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Chen</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Su</surname>
<given-names>R.</given-names>
</name>
</person-group> (<year>2018</year>). <article-title>M6APred-EL: A Sequence-Based Predictor for Identifying N6-Methyladenosine Sites Using Ensemble Learning</article-title>. <source>Mol. Ther. - Nucleic Acids</source> <volume>12</volume>, <fpage>635</fpage>&#x2013;<lpage>644</lpage>. <pub-id pub-id-type="doi">10.1016/j.omtn.2018.07.004</pub-id> </citation>
</ref>
<ref id="B55">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wei</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Hu</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>F.</given-names>
</name>
<name>
<surname>Song</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Su</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Zou</surname>
<given-names>Q.</given-names>
</name>
<etal/>
</person-group> (<year>2020</year>). <article-title>Comparative Analysis and Prediction of Quorum-sensing Peptides Using Feature Representation Learning and Machine Learning Algorithms</article-title>. <source>Brief. Bioinform.</source> <volume>21</volume> (<issue>1</issue>), <fpage>106</fpage>&#x2013;<lpage>119</lpage>. <pub-id pub-id-type="doi">10.1093/bib/bby107</pub-id> </citation>
</ref>
<ref id="B56">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Whiteheart</surname>
<given-names>S. W.</given-names>
</name>
<name>
<surname>Griff</surname>
<given-names>I. C.</given-names>
</name>
<name>
<surname>Brunner</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Clary</surname>
<given-names>D. O.</given-names>
</name>
<name>
<surname>Mayer</surname>
<given-names>T.</given-names>
</name>
<name>
<surname>Buhrow</surname>
<given-names>S. A.</given-names>
</name>
<etal/>
</person-group> (<year>1993</year>). <article-title>SNAP Family of NSF Attachment Proteins Includes a Brain-specific Isoform</article-title>. <source>Nature</source> <volume>362</volume> (<issue>6418</issue>), <fpage>353</fpage>&#x2013;<lpage>355</lpage>. <pub-id pub-id-type="doi">10.1038/362353a0</pub-id> </citation>
</ref>
<ref id="B57">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Whiteheart</surname>
<given-names>S. W.</given-names>
</name>
<name>
<surname>Schraw</surname>
<given-names>T.</given-names>
</name>
<name>
<surname>Matveeva</surname>
<given-names>E. A.</given-names>
</name>
</person-group> (<year>2001</year>). <article-title>N-ethylmaleimide Sensitive Factor (NSF) Structure and Function</article-title>. <source>Int. Rev. Cytol.</source> <volume>207</volume>, <fpage>71</fpage>&#x2013;<lpage>112</lpage>. <pub-id pub-id-type="doi">10.1016/s0074-7696(01)07003-6</pub-id> </citation>
</ref>
<ref id="B58">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Xu</surname>
<given-names>B.</given-names>
</name>
<name>
<surname>Liu</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>Z.</given-names>
</name>
<name>
<surname>Tian</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Zuo</surname>
<given-names>Y.</given-names>
</name>
</person-group> (<year>2021</year>). <article-title>Multi-substrate Selectivity Based on Key Loops and Non-homologous Domains: New Insight into ALKBH Family</article-title>. <source>Cell. Mol. Life Sci.</source> <volume>78</volume> (<issue>1</issue>), <fpage>129</fpage>&#x2013;<lpage>141</lpage>. <pub-id pub-id-type="doi">10.1007/s00018-020-03594-9</pub-id> </citation>
</ref>
<ref id="B59">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Xue</surname>
<given-names>W.</given-names>
</name>
<name>
<surname>Yang</surname>
<given-names>F.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Zheng</surname>
<given-names>G.</given-names>
</name>
<name>
<surname>Chen</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Yao</surname>
<given-names>X.</given-names>
</name>
<etal/>
</person-group> (<year>2018</year>). <article-title>What Contributes to Serotonin-Norepinephrine Reuptake Inhibitors&#x27; Dual-Targeting Mechanism? The Key Role of Transmembrane Domain 6 in Human Serotonin and Norepinephrine Transporters Revealed by Molecular Dynamics Simulation</article-title>. <source>ACS Chem. Neurosci.</source> <volume>9</volume> (<issue>5</issue>), <fpage>1128</fpage>&#x2013;<lpage>1140</lpage>. <pub-id pub-id-type="doi">10.1021/acschemneuro.7b00490</pub-id> </citation>
</ref>
<ref id="B60">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Yan</surname>
<given-names>K.</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>D.</given-names>
</name>
</person-group> (<year>2015</year>). <article-title>Feature Selection and Analysis on Correlated Gas Sensor Data with Recursive Feature Elimination</article-title>. <source>Sens. Actuators B: Chem.</source> <volume>212</volume>, <fpage>353</fpage>&#x2013;<lpage>363</lpage>. <pub-id pub-id-type="doi">10.1016/j.snb.2015.02.025</pub-id> </citation>
</ref>
<ref id="B61">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Yang</surname>
<given-names>Q.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>B.</given-names>
</name>
<name>
<surname>Tang</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Cui</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>X.</given-names>
</name>
<etal/>
</person-group> (<year>2020</year>). <article-title>Consistent Gene Signature of Schizophrenia Identified by a Novel Feature Selection Strategy from Comprehensive Sets of Transcriptomic Data</article-title>. <source>Brief. Bioinform.</source> <volume>21</volume> (<issue>3</issue>), <fpage>1058</fpage>&#x2013;<lpage>1068</lpage>. <pub-id pub-id-type="doi">10.1093/bib/bbz049</pub-id> </citation>
</ref>
<ref id="B62">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Yang</surname>
<given-names>Q.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>F.</given-names>
</name>
<name>
<surname>Xia</surname>
<given-names>W.</given-names>
</name>
<name>
<surname>Zhou</surname>
<given-names>Y.</given-names>
</name>
<etal/>
</person-group> (<year>2020</year>). <article-title>NOREVA: Enhanced Normalization and Evaluation of Time-Course and Multi-Class Metabolomic Data</article-title>. <source>Nucleic Acids Res.</source> <volume>48</volume> (<issue>W1</issue>), <fpage>W436</fpage>&#x2013;<lpage>W448</lpage>. <pub-id pub-id-type="doi">10.1093/nar/gkaa258</pub-id> </citation>
</ref>
<ref id="B63">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Yang</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Ding</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Meng</surname>
<given-names>Q.</given-names>
</name>
<name>
<surname>Tang</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Guo</surname>
<given-names>F.</given-names>
</name>
</person-group> (<year>2021a</year>). <article-title>Granular Multiple Kernel Learning for Identifying RNA-Binding Protein Residues via Integrating Sequence and Structure Information</article-title>. <source>Neural Comput. Appl.</source> <volume>33</volume>, <fpage>11387</fpage>&#x2013;<lpage>11399</lpage>. <pub-id pub-id-type="doi">10.1007/s00521-020-05573-4</pub-id> </citation>
</ref>
<ref id="B64">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Yang</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Luo</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Ren</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Wu</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>He</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Peng</surname>
<given-names>B.</given-names>
</name>
<etal/>
</person-group> (<year>2021</year>). <article-title>Risk Prediction of Diabetes: Big Data Mining with Fusion of Multifarious Physical Examination Indicators</article-title>. <source>Inf. Fusion</source> <volume>75</volume>, <fpage>140</fpage>&#x2013;<lpage>149</lpage>. <pub-id pub-id-type="doi">10.1016/j.inffus.2021.02.015</pub-id> </citation>
</ref>
<ref id="B65">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Yin</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Sun</surname>
<given-names>W.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>F.</given-names>
</name>
<name>
<surname>Hong</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Zhou</surname>
<given-names>Y.</given-names>
</name>
<etal/>
</person-group> (<year>2020</year>). <article-title>VARIDT 1.0: Variability of Drug Transporter Database</article-title>. <source>Nucleic Acids Res.</source> <volume>48</volume> (<issue>D1</issue>), <fpage>D1042</fpage>&#x2013;<lpage>D1050</lpage>. <pub-id pub-id-type="doi">10.1093/nar/gkz779</pub-id> </citation>
</ref>
<ref id="B66">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Yin</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>F.</given-names>
</name>
<name>
<surname>Zhou</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Mou</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Lu</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Chen</surname>
<given-names>K.</given-names>
</name>
<etal/>
</person-group> (<year>2021</year>). <article-title>INTEDE: Interactome of Drug-Metabolizing Enzymes</article-title>. <source>Nucleic Acids Res.</source> <volume>49</volume> (<issue>D1</issue>), <fpage>D1233</fpage>&#x2013;<lpage>D1243</lpage>. <pub-id pub-id-type="doi">10.1093/nar/gkaa755</pub-id> </citation>
</ref>
<ref id="B67">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Yu</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Shi</surname>
<given-names>Q.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Zheng</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Gao</surname>
<given-names>L.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>Exploring Drug Treatment Patterns Based on the Action of Drug and Multilayer Network Model</article-title>. <source>Int. J. Mol. Sci.</source> <volume>21</volume> (<issue>14</issue>), <fpage>5014</fpage>. <pub-id pub-id-type="doi">10.3390/ijms21145014</pub-id> </citation>
</ref>
<ref id="B68">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Yu</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Yang</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Xu</surname>
<given-names>F.</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Xie</surname>
<given-names>F.</given-names>
</name>
<etal/>
</person-group> (<year>2021</year>). <article-title>Predicting Therapeutic Drugs for Hepatocellular Carcinoma Based on Tissue-specific Pathways</article-title>. <source>PLoS Comput. Biol.</source> <volume>17</volume> (<issue>2</issue>), <fpage>e1008696</fpage>. <pub-id pub-id-type="doi">10.1371/journal.pcbi.1008696</pub-id> </citation>
</ref>
<ref id="B69">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zhai</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Chen</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Teng</surname>
<given-names>Z.</given-names>
</name>
<name>
<surname>Zhao</surname>
<given-names>Y.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>Identifying Antioxidant Proteins by Using Amino Acid Composition and Protein-Protein Interactions</article-title>. <source>Front. Cell Dev. Biol.</source> <volume>8</volume>, <fpage>591487</fpage>. <pub-id pub-id-type="doi">10.3389/fcell.2020.591487</pub-id> </citation>
</ref>
<ref id="B70">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zhang</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>Z.</given-names>
</name>
<name>
<surname>Pu</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Tang</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Guo</surname>
<given-names>F.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>AIEpred: an Ensemble Predictive Model of Classifier Chain to Identify Anti-Inflammatory Peptides</article-title>. <source>IEEE/ACM Trans. Comput. Biol. Bioinform.</source> <volume>PP</volume>, <fpage>1</fpage>. <pub-id pub-id-type="doi">10.1109/TCBB.2020.2968419</pub-id> </citation>
</ref>
<ref id="B71">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zhang</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Chen</surname>
<given-names>H. D.</given-names>
</name>
<name>
<surname>Zulfiqar</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Yuan</surname>
<given-names>S. S.</given-names>
</name>
<name>
<surname>Huang</surname>
<given-names>Q. L.</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>Z. Y.</given-names>
</name>
<etal/>
</person-group> (<year>2021</year>). <article-title>iBLP: An XGBoost-Based Predictor for Identifying Bioluminescent Proteins</article-title>. <source>Comput. Math. Methods Med.</source> <volume>2021</volume>, <fpage>6664362</fpage>. <pub-id pub-id-type="doi">10.1155/2021/6664362</pub-id> </citation>
</ref>
<ref id="B72">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zhao</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Sun</surname>
<given-names>T.</given-names>
</name>
<name>
<surname>Wu</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Liu</surname>
<given-names>Y.</given-names>
</name>
</person-group> (<year>2019</year>). <article-title>High Mobility Group Box 1: An Immune-Regulatory Protein</article-title>. <source>Curr. Gene Ther.</source> <volume>19</volume> (<issue>2</issue>), <fpage>100</fpage>&#x2013;<lpage>109</lpage>. <pub-id pub-id-type="doi">10.2174/1566523219666190621111604</pub-id> </citation>
</ref>
<ref id="B73">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zhao</surname>
<given-names>T.</given-names>
</name>
<name>
<surname>Hu</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Peng</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Cheng</surname>
<given-names>L.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>DeepLGP: a Novel Deep Learning Method for Prioritizing lncRNA Target Genes</article-title>. <source>Bioinformatics</source> <volume>36</volume> (<issue>16</issue>), <fpage>4466</fpage>&#x2013;<lpage>4472</lpage>. <pub-id pub-id-type="doi">10.1093/bioinformatics/btaa428</pub-id> </citation>
</ref>
<ref id="B74">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zhao</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Jiao</surname>
<given-names>Q.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Wu</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Huang</surname>
<given-names>S.</given-names>
</name>
<etal/>
</person-group> (<year>2020</year>). <article-title>ECFS-DEA: an Ensemble Classifier-Based Feature Selection for Differential Expression Analysis on Expression Profiles</article-title>. <source>BMC Bioinform.</source> <volume>21</volume> (<issue>1</issue>), <fpage>43</fpage>. <pub-id pub-id-type="doi">10.1186/s12859-020-3388-y</pub-id> </citation>
</ref>
<ref id="B75">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zhao</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Wu</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>G.</given-names>
</name>
</person-group> (<year>2021</year>). <article-title>Identifying Plant Pentatricopeptide Repeat Proteins Using a Variable Selection Method</article-title>. <source>Front. Plant Sci.</source> <volume>12</volume>, <fpage>506681</fpage>. <pub-id pub-id-type="doi">10.3389/fpls.2021.506681</pub-id> </citation>
</ref>
<ref id="B76">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zheng</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Huang</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Mu</surname>
<given-names>N.</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Chang</surname>
<given-names>Y.</given-names>
</name>
<etal/>
</person-group> (<year>2019</year>). <article-title>RAACBook: a Web Server of Reduced Amino Acid Alphabet for Sequence-Dependent Inference by Using Chou&#x27;s Five-step Rule</article-title>. <source>Database (Oxford)</source> <volume>2019</volume>, <fpage>baz131</fpage>. <pub-id pub-id-type="doi">10.1093/database/baz131</pub-id> </citation>
</ref>
<ref id="B77">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zheng</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Liu</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Yang</surname>
<given-names>W.</given-names>
</name>
<name>
<surname>Yang</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Zuo</surname>
<given-names>Y.</given-names>
</name>
<etal/>
</person-group> (<year>2020</year>). <article-title>RaacLogo: a New Sequence Logo Generator by Using Reduced Amino Acid Clusters</article-title>. <source>Brief. Bioinform.</source> <volume>22</volume> (<issue>3</issue>), <fpage>bbaa096</fpage>. <pub-id pub-id-type="doi">10.1093/bib/bbaa096</pub-id> </citation>
</ref>
<ref id="B78">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zhu</surname>
<given-names>X.-J.</given-names>
</name>
<name>
<surname>Feng</surname>
<given-names>C.-Q.</given-names>
</name>
<name>
<surname>Lai</surname>
<given-names>H.-Y.</given-names>
</name>
<name>
<surname>Chen</surname>
<given-names>W.</given-names>
</name>
<name>
<surname>Hao</surname>
<given-names>L.</given-names>
</name>
</person-group> (<year>2019</year>). <article-title>Predicting Protein Structural Classes for Low-Similarity Sequences by Evaluating Different Features</article-title>. <source>Knowledge-Based Syst.</source> <volume>163</volume>, <fpage>787</fpage>&#x2013;<lpage>793</lpage>. <pub-id pub-id-type="doi">10.1016/j.knosys.2018.10.007</pub-id> </citation>
</ref>
<ref id="B79">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zuo</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Chen</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>G.</given-names>
</name>
<name>
<surname>Yan</surname>
<given-names>Z.</given-names>
</name>
<name>
<surname>Yang</surname>
<given-names>L.</given-names>
</name>
</person-group> (<year>2017</year>). <article-title>PseKRAAC: a Flexible Web Server for Generating Pseudo K-Tuple Reduced Amino Acids Composition</article-title>. <source>Bioinformatics</source> <volume>33</volume> (<issue>1</issue>), <fpage>122</fpage>&#x2013;<lpage>124</lpage>. <pub-id pub-id-type="doi">10.1093/bioinformatics/btw564</pub-id> </citation>
</ref>
</ref-list>
</back>
</article>