<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article xml:lang="EN" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" article-type="research-article" dtd-version="2.3">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Psychol.</journal-id>
<journal-title>Frontiers in Psychology</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Psychol.</abbrev-journal-title>
<issn pub-type="epub">1664-1078</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.3389/fpsyg.2022.857292</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Psychology</subject>
<subj-group>
<subject>Original Research</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>Big Data Recommendation Research Based on Travel Consumer Sentiment Analysis</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author" corresp="yes">
<name>
<surname>Yuan</surname>
<given-names>Zhu</given-names>
</name>
<xref rid="c001" ref-type="corresp"><sup>&#x002A;</sup></xref>
<uri xlink:href="https://loop.frontiersin.org/people/1640830/overview"/>
</contrib>
</contrib-group>
<aff><institution>School of Business, Jilin Business and Technology College</institution>, <addr-line>Changchun</addr-line>, <country>China</country></aff>
<author-notes>
<fn id="fn0001" fn-type="edited-by"><p>Edited by: Xuefeng Shao, University of Newcastle, Australia</p></fn>
<fn id="fn0002" fn-type="edited-by"><p>Reviewed by: Zan Guo, Shenyang Pharmaceutical University, China; Qing Wang, South China Normal University, China</p></fn>
<corresp id="c001">&#x002A;Correspondence: Zhu Yuan, <email>bamboo20220118@163.com</email></corresp>
<fn id="fn0003" fn-type="other"><p>This article was submitted to Organizational Psychology, a section of the journal Frontiers in Psychology</p></fn>
</author-notes>
<pub-date pub-type="epub">
<day>28</day>
<month>02</month>
<year>2022</year>
</pub-date>
<pub-date pub-type="collection">
<year>2022</year>
</pub-date>
<volume>13</volume>
<elocation-id>857292</elocation-id>
<history>
<date date-type="received">
<day>19</day>
<month>01</month>
<year>2022</year>
</date>
<date date-type="accepted">
<day>04</day>
<month>02</month>
<year>2022</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#x00A9; 2022 Yuan.</copyright-statement>
<copyright-year>2022</copyright-year>
<copyright-holder>Yuan</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/"><p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.</p>
</license>
</permissions>
<abstract>
<p>More and more tourists are sharing their travel feelings and posting their real experiences on the Internet, generating tourism big data. Online travel reviews can fully reflect tourists&#x2019; emotions, and mining and analyzing these reviews can reveal their latent value. In order to analyze the potential value of online travel reviews using big data and machine learning technology, this paper proposes an improved support vector machine (SVM) algorithm based on travel consumer sentiment analysis and builds a Hadoop Distributed File System (HDFS) storage system based on the Map-Reduce model. Firstly, Internet travel reviews are pre-processed for sentiment analysis of the review text. Secondly, an improved SVM algorithm is proposed based on the main features of linear classification and kernel functions, so as to improve the accuracy of sentiment word classification. Then, HDFS data nodes are deployed on the Hadoop platform in line with the actual tourism application context, and the map and reduce functions are designed and implemented based on the Map-Reduce programming model, which greatly improves the degree of parallel processing and reduces time consumption. Finally, the improved SVM algorithm is implemented on the built Hadoop platform. The test results show that online travel reviews can be an important data source for travel big data recommendation, and that the proposed method achieves travel sentiment classification quickly and accurately.</p>
</abstract>
<kwd-group>
<kwd>tourism consumption</kwd>
<kwd>sentiment analysis</kwd>
<kwd>big data analysis</kwd>
<kwd>support vector machine</kwd>
<kwd>Map-Reduce</kwd>
</kwd-group>
<counts>
<fig-count count="6"/>
<table-count count="5"/>
<equation-count count="10"/>
<ref-count count="28"/>
<page-count count="9"/>
<word-count count="5085"/>
</counts>
</article-meta>
</front>
<body>
<sec id="sec1" sec-type="intro">
<title>Introduction</title>
<p>In the era of big data, the whole tourism industry is undergoing a revolution. As the scale of tourism development continues to grow, tourism data are also exploding. Big data are gaining ever more attention as they continue to integrate with the tourism industry, opening new doors for innovation in tourism decision making. Through mobile Internet technology, tourists can search for travel information, book travel services, and share travel experiences, generating &#x201C;tourism big data.&#x201D; In recent years in particular, more and more travel consumers have been posting online travel reviews, which comment on various aspects such as services and prices (<xref ref-type="bibr" rid="ref2">Boley et al., 2018</xref>; <xref ref-type="bibr" rid="ref3">Buttazzoni et al., 2018</xref>; <xref ref-type="bibr" rid="ref4">Cottrill et al., 2018</xref>; <xref ref-type="bibr" rid="ref14">Luo et al., 2018</xref>; <xref ref-type="bibr" rid="ref18">Prem et al., 2018</xref>; <xref ref-type="bibr" rid="ref27">Yunlong and Yuanchang, 2018</xref>). Textual keywords are the most important carriers of this information. By analyzing these textual reviews, it is possible to effectively understand consumers&#x2019; emotional state when traveling and their satisfaction with the service.</p>
<p>Online comment data are an important part of tourism big data and have attracted extensive research because they directly reflect the real emotions of tourists. The effects of tourism big data are mainly reflected in two aspects (<xref ref-type="bibr" rid="ref19">Yi et al., 2014</xref>; <xref ref-type="bibr" rid="ref5">Cynarski, 2018</xref>; <xref ref-type="bibr" rid="ref16">Mayer and Woltering, 2018</xref>; <xref ref-type="bibr" rid="ref26">Woosnam et al., 2018</xref>): on the one hand, tourism big data benefit the transformation of the tourism industry and have a great impact on its healthy and rapid development; on the other hand, tourism big data can contribute to the innovation of tourism management tools. &#x201C;Sentiment analysis&#x201D; takes textual comments as the object of research and automatically determines the emotional tendency of the comment&#x2019;s subject. There are two main branches of sentiment analysis technology (<xref ref-type="bibr" rid="ref6">De Vos, 2019</xref>; <xref ref-type="bibr" rid="ref10">Haynes, 2020</xref>; <xref ref-type="bibr" rid="ref12">Hunold et al., 2020</xref>; <xref ref-type="bibr" rid="ref24">Van and Hieu, 2020</xref>), namely, sentiment analysis based on machine learning and sentiment analysis based on semantic methods. The former relies on algorithms that enable computers to classify data intelligently. The main intelligent classification methods are Bayesian classification, KNN classification, decision tree classification, and support vector machine (SVM) classification; many other text classification algorithms exist, such as text classification using neural networks. <xref ref-type="bibr" rid="ref1">Abubakar and Ilkan (2016)</xref> pointed out that online reviews of tourist destinations affect tourists&#x2019; trust and stimulate their purchase demand. <xref ref-type="bibr" rid="ref8">Gao et al. (2016)</xref> achieved affective polarity classification of text by building semantic features and binary models for sentiment analysis of microblog comment data. In addition, data mining and applications of sentiment analysis in tourism have gradually developed. <xref ref-type="bibr" rid="ref23">Tu and Tang (2016)</xref> pointed out the shortcomings of machine learning-based sentiment analysis methods in analyzing tourism reviews and established a sentiment analysis model based on a semantic lexicon, which effectively improved the management efficiency of tourists&#x2019; evaluation remarks. <xref ref-type="bibr" rid="ref15">Ma et al. (2016)</xref> constructed a sentiment analysis model that is conducive to improving the service quality and image marketing of tourist destinations.</p>
<p>Although online review data have many advantages, their sheer volume and heterogeneous characteristics also cause great difficulty in practical applications. In particular, current analysis and extraction technologies are still immature, with low accuracy, slow extraction speed, and other drawbacks, and can no longer meet the needs of the market, so this problem needs to be solved. Notably, current technology for processing online review data relies mainly on the concurrent storage and processing of massive data. The HDFS concurrent storage system and the Map-Reduce model can store and process massive data concurrently at high speed (<xref ref-type="bibr" rid="ref7">Fujiu et al., 2016</xref>; <xref ref-type="bibr" rid="ref20">Rossini et al., 2016</xref>; <xref ref-type="bibr" rid="ref21">Seera and Taruna, 2018</xref>; <xref ref-type="bibr" rid="ref22">Sengan et al., 2020</xref>), maximizing the use of computer resources with strong concurrency and high efficiency. The SVM algorithm is a vector-based binary classification algorithm suitable for classification and analysis of massive data.</p>
<p>Therefore, this paper proposes an SVM classification solution based on the concurrent processing of massive data, and designs and implements a classifier based on the improved SVM algorithm in the context of big data. The main work of this paper is reflected in the following two points: (1) This paper obtains standardized text data by word separation of travel reviews, so as to exploit the value of travel online reviews using sentiment analysis technology, and makes an innovative exploration in their application. (2) The key technology for the analysis of sentiment vocabulary is classification; therefore, this paper classifies the sentiment of travel online review data with a machine learning algorithm and designs an improved SVM algorithm. The improved SVM algorithm makes full use of the threshold between the support vectors and the point vectors for denoising, which improves the accuracy of the system. Meanwhile, the Map-Reduce model of the Hadoop platform is utilized to greatly improve the degree of parallel processing and reduce time consumption.</p>
<p>The rest of the paper is organized as follows: Section 2 studies the text pre-processing of online travel comment data in detail, while section 3 presents the machine learning-based tourism sentiment classification model. Section 4 provides the results and discussion. Finally, the paper is concluded in section 5.</p>
</sec>
<sec id="sec2">
<title>Text Pre-Processing of Travel Online Review Data</title>
<sec id="sec3">
<title>Comment Type Information Pre-processing</title>
<p>The research data in this paper are taken mainly from the raw data of online travel platforms and microblogs. To convert these raw data into a form that computers can recognize, systematic pre-processing has to be performed, specifically denoising, normalizing the representation, and word separation.</p>
<list list-type="order">
<list-item>
<p>Denoising reduces data interference and makes the study more focused. The objects removed by denoising are interfering comments and duplicate comments in the original comment data. Interfering information is information unrelated to the research topic, such as &#x201C;&#x2026;&#x201D; Such pure-symbol comments do not reflect the commenter&#x2019;s intention and have little value for the research topic, so this paper removes them. Repeated comments are multiple copies of the same comment by a given user, so only one comment from that user is kept as the object of analysis.</p>
</list-item>
<list-item>
<p>Normative representation mainly corrects and transforms the original data so that they can be identified quickly in the data analysis. Since some commenters adopt personalized expressions or Internet phrases that even the most common word separation systems cannot identify, this paper replaces or modifies such data to make the expressions more standardized and uniform.</p>
</list-item>
<list-item>
<p>The role of the word separation module is to use word separation software to match each evaluation dimension with its evaluation word one by one, forming the initial correspondence-type data and paving the way for the subsequent vectorization of the data.</p>
</list-item>
</list>
<p>Assume that the vocabulary term of the current evaluation dimension consists of <italic>i</italic> characters; the first <italic>i</italic> characters of the current string are matched, and the <italic>i</italic> characters following the evaluation dimension are matched as well (<italic>i</italic> ranges from 2 to 10). If these <italic>i</italic> characters form an evaluation word, the field is sliced out and paired with the preceding evaluation dimension, forming the initial correspondence data.</p>
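<p>As an illustration, the matching step above can be sketched as follows. This is a minimal sketch in which the lexicon, the helper name, and the sample text are hypothetical stand-ins for the actual word separation software and evaluation vocabulary used in this study.</p>

```python
# Minimal sketch of pairing an evaluation dimension with the evaluation
# word that follows it in a review string. The lexicon is a hypothetical
# stand-in for the real evaluation vocabulary.
EVAL_WORDS = {"beautiful", "crowded", "friendly", "dirty"}

def match_evaluation(text, dimension, max_len=10):
    """After the evaluation dimension, try windows of 2..max_len characters
    and slice out the first window that is a known evaluation word."""
    start = text.find(dimension)
    if start < 0:
        return None
    pos = start + len(dimension)
    for i in range(2, max_len + 1):            # i ranges over 2..10
        candidate = text[pos:pos + i].strip()
        if candidate in EVAL_WORDS:
            return (dimension, candidate)      # initial correspondence pair
    return None

pair = match_evaluation("scenery beautiful here", "scenery")
```

<p>For a real review corpus, the lexicon would come from the evaluation vocabulary built during pre-processing, and the matching would be repeated for every evaluation dimension found in the text.</p>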
</sec>
<sec id="sec4">
<title>Text Vectorization</title>
<p>When the corresponding data are formed, the vectorization process begins. The model can also be represented by the mathematical formula (<xref ref-type="bibr" rid="ref17">Minaee et al., 2021</xref>):</p>
<disp-formula id="E1"><label>(1)</label><mml:math id="M1"><mml:mrow><mml:mi>D</mml:mi><mml:mo>=</mml:mo><mml:mfenced close="}" open="{"><mml:mrow><mml:msub><mml:mi>t</mml:mi><mml:mn>1</mml:mn></mml:msub><mml:mi mathvariant="normal">,</mml:mi><mml:msub><mml:mi>w</mml:mi><mml:mn>1</mml:mn></mml:msub><mml:mi mathvariant="normal">;</mml:mi><mml:msub><mml:mi>t</mml:mi><mml:mn>2</mml:mn></mml:msub><mml:mi mathvariant="normal">,</mml:mi><mml:msub><mml:mi>w</mml:mi><mml:mn>2</mml:mn></mml:msub><mml:mi mathvariant="normal">;</mml:mi><mml:mo>&#x2026;</mml:mo><mml:mi mathvariant="normal">;</mml:mi><mml:msub><mml:mi>t</mml:mi><mml:mi>n</mml:mi></mml:msub><mml:mi mathvariant="normal">,</mml:mi><mml:msub><mml:mi>w</mml:mi><mml:mi>n</mml:mi></mml:msub></mml:mrow></mml:mfenced></mml:mrow></mml:math></disp-formula>
<p>where <italic>D</italic> is the set of feature terms, <italic>t<sub>n</sub></italic> is each feature term, and <italic>w<sub>n</sub></italic> is the weight corresponding to each feature term.</p>
<p>The specific process of data vectorization is divided into two parts: data vectorization and error tolerance (<xref ref-type="bibr" rid="ref9">Hartmann et al., 2019</xref>).</p>
<sec id="sec5">
<title>Data Vectorization</title>
<p>The key to data vectorization is determining the evaluation dimensions and the evaluation vocabulary weights. Preliminary market research found the frequency of dimension-vocabulary occurrence to be a sound descriptor, so the frequency value is chosen as the evaluation criterion here (<xref ref-type="bibr" rid="ref13">Kadhim, 2019</xref>). Introducing this evaluation dimension into the vector plays a crucial role in the classification results: when a vector is classified, the sign of its third component allows comments to be pre-sorted into positive and negative, after which the SVM algorithm subdivides them, saving considerable time and improving the efficiency of the system. For example, <xref rid="tab1" ref-type="table">Table 1</xref> shows the matching results of the travel-based sentiment evaluation. The effect after vectorizing is shown in <xref rid="tab2" ref-type="table">Table 2</xref>.</p>
<table-wrap position="float" id="tab1">
<label>Table 1</label>
<caption><p>Matching results of tourism-based emotional evaluation.</p></caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th align="left" valign="top">Evaluation dimension</th>
<th align="center" valign="top">Weights</th>
<th align="center" valign="top">Score evaluation</th>
<th align="center" valign="top">Tendency</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left" valign="top">Clogging</td>
<td align="center" valign="top">20%</td>
<td align="center" valign="top">84</td>
<td align="center" valign="top">negative</td>
</tr>
<tr>
<td align="left" valign="top">Traps</td>
<td align="center" valign="top">15%</td>
<td align="center" valign="top">85</td>
<td align="center" valign="top">negative</td>
</tr>
<tr>
<td align="left" valign="top">Dirty and disordered</td>
<td align="center" valign="top">15%</td>
<td align="center" valign="top">81</td>
<td align="center" valign="top">negative</td>
</tr>
<tr>
<td align="left" valign="top">Help</td>
<td align="center" valign="top">20%</td>
<td align="center" valign="top">80</td>
<td align="center" valign="top">positive</td>
</tr>
<tr>
<td align="left" valign="top">Beautiful Scenery</td>
<td align="center" valign="top">15%</td>
<td align="center" valign="top">68</td>
<td align="center" valign="top">positive</td>
</tr>
<tr>
<td align="left" valign="top">Friendliness</td>
<td align="center" valign="top">15%</td>
<td align="center" valign="top">45</td>
<td align="center" valign="top">positive</td>
</tr>
</tbody>
</table>
</table-wrap>
<table-wrap position="float" id="tab2">
<label>Table 2</label>
<caption><p>Vector table.</p></caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th align="left" valign="top">Vector number</th>
<th align="center" valign="top">1</th>
<th align="center" valign="top">2</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left" valign="top">1</td>
<td align="center" valign="top">(11,84,7)</td>
<td align="center" valign="top">(13,80,6)</td>
</tr>
<tr>
<td align="left" valign="top">2</td>
<td align="center" valign="top">(22,85,9)</td>
<td align="center" valign="top">(12,45,4)</td>
</tr>
<tr>
<td align="left" valign="top">3</td>
<td align="center" valign="top">(17,81,7)</td>
<td align="center" valign="top">(13,44,9)</td>
</tr>
<tr>
<td align="left" valign="top">4</td>
<td align="center" valign="top">(4,45,&#x2212;8)</td>
<td align="center" valign="top">(3,9,4)</td>
</tr>
<tr>
<td align="left" valign="top">5</td>
<td align="center" valign="top">(13,45,&#x2212;10)</td>
<td align="center" valign="top">(14,32,&#x2212;5)</td>
</tr>
<tr>
<td align="left" valign="top">6</td>
<td align="center" valign="top">(26,24,50)</td>
<td align="center" valign="top">(15,3,&#x2212;6)</td>
</tr>
</tbody>
</table>
</table-wrap>
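<p>The coarse pre-sorting described above can be sketched as follows: the sign of the third component assigns each vector a provisional positive or negative polarity before the SVM subdivides it. This is a minimal sketch; the helper name is illustrative, and the sample vectors are taken from Table 2.</p>

```python
# Pre-sort vectorized comments by the sign of the third component:
# non-negative -> provisionally positive, negative -> provisionally negative.
def presort(vectors):
    positive, negative = [], []
    for v in vectors:
        (positive if v[2] >= 0 else negative).append(v)
    return positive, negative

# Sample vectors from Table 2.
vectors = [(11, 84, 7), (22, 85, 9), (4, 45, -8), (13, 45, -10)]
pos, neg = presort(vectors)
```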
</sec>
<sec id="sec6">
<title>Error Tolerance</title>
<p>Since Chinese vocabulary combinations vary greatly and the computer&#x2019;s ability to recognize Chinese text is limited, various errors can arise during data pre-processing and mislead the final result. Therefore, in the error-tolerance stage, whenever a vector contains a value that exceeds the boundary value, that vector is decisively excluded as an error vector, improving the accuracy of the system data.</p>
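<p>A minimal sketch of this boundary check follows, assuming illustrative per-component bounds; the actual boundary values used in this study are not reproduced here.</p>

```python
# Error tolerance: discard any vector whose components fall outside
# the admissible boundaries. These bounds are illustrative assumptions.
BOUNDS = ((0, 100), (0, 100), (-50, 50))

def within_bounds(vector, bounds=BOUNDS):
    return all(lo <= x <= hi for x, (lo, hi) in zip(vector, bounds))

# The second vector's third component exceeds its bound, so it is dropped.
clean = [v for v in [(11, 84, 7), (26, 24, 500)] if within_bounds(v)]
```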
</sec>
</sec>
</sec>
<sec id="sec7">
<title>Machine Learning-Based Travel Sentiment Classification Model</title>
<p>The machine learning-based travel sentiment classification model contains the entire process from data collection to recommendation classification, as shown in <xref rid="fig1" ref-type="fig">Figure 1</xref>.</p>
<fig position="float" id="fig1">
<label>Figure 1</label>
<caption><p>Support vector machine (SVM)-based travel sentiment classification model.</p></caption>
<graphic xlink:href="fpsyg-13-857292-g001.tif"/>
</fig>
<sec id="sec8">
<title>SVM Algorithm Improvement</title>
<p>The core of the SVM algorithm is to find the support vectors and the optimal hyperplane (<xref ref-type="bibr" rid="ref25">Varatharajan et al., 2018</xref>; <xref ref-type="bibr" rid="ref28">Zeng et al., 2018</xref>; <xref ref-type="bibr" rid="ref11">Hu et al., 2020</xref>), so the key problem is to find the interval threshold between the vectors corresponding to the data points and the support vectors. Linearly inseparable scenarios often occur in practice, so the selection of the kernel function is key to the linearly inseparable problem. Since the data volume of travel online reviews is very large while the dimensionality is relatively low, a polynomial kernel function is particularly suitable.</p>
<p>For the linearly inseparable case, the innovation of this paper is to improve the computation by expanding each point vector in the SVM classifier into individual sub-vectors and then substituting them into the polynomial in turn. The original polynomial kernel function is:</p>
<disp-formula id="E2"><label>(2)</label><mml:math id="M2"><mml:mrow><mml:mi>K</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>x</mml:mi><mml:mi mathvariant="normal">,</mml:mi><mml:msub><mml:mi>x</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:msup><mml:mrow><mml:mrow><mml:mo>[</mml:mo><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>x</mml:mi><mml:mo>&#x00B7;</mml:mo><mml:msub><mml:mi>x</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>+</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mo>]</mml:mo></mml:mrow></mml:mrow><mml:mi>d</mml:mi></mml:msup></mml:mrow></mml:math></disp-formula>
<p>The SVM classifier decision function in the linearly inseparable case is:</p>
<disp-formula id="E3"><label>(3)</label><mml:math id="M3"><mml:mrow><mml:mfenced close=")" open=""><mml:mrow><mml:mi>f</mml:mi><mml:mfenced><mml:mi>x</mml:mi></mml:mfenced><mml:mo>=</mml:mo><mml:mi>sgn</mml:mi><mml:msup><mml:mrow><mml:mfenced><mml:mrow><mml:munderover><mml:mstyle displaystyle="true"><mml:mo>&#x2211;</mml:mo></mml:mstyle><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mi>n</mml:mi></mml:munderover><mml:msub><mml:mi>&#x03B1;</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:msub><mml:mi>y</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mfenced><mml:mrow><mml:msub><mml:mi>x</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mspace width="0.25em"/><mml:mo>&#x00B7;</mml:mo><mml:mspace width="0.25em"/><mml:mi>x</mml:mi></mml:mrow></mml:mfenced><mml:mo>+</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:mfenced></mml:mrow><mml:mi>d</mml:mi></mml:msup><mml:mo>+</mml:mo><mml:mi>b</mml:mi></mml:mrow></mml:mfenced></mml:mrow></mml:math></disp-formula>
<p>where <italic>x</italic> is the vector to be classified and <italic>x<sub>i</sub></italic> is a support vector; in this formula, <italic>x</italic> is an <italic>n</italic>-dimensional vector.</p>
<p>The total time complexity is therefore:</p>
<disp-formula id="E4"><label>(4)</label><mml:math id="M4"><mml:mrow><mml:mi>f</mml:mi><mml:mfenced><mml:mrow><mml:mi>m</mml:mi><mml:mi mathvariant="normal">,</mml:mi><mml:mi>n</mml:mi><mml:mi mathvariant="normal">,</mml:mi><mml:mi>d</mml:mi></mml:mrow></mml:mfenced><mml:mo>=</mml:mo><mml:mi>m</mml:mi><mml:mfenced><mml:mrow><mml:mi>n</mml:mi><mml:mo>+</mml:mo><mml:mi>d</mml:mi></mml:mrow></mml:mfenced></mml:mrow></mml:math></disp-formula>
<p>It can be seen that, for each of the <italic>m</italic> vectors, <italic>n</italic> multiplications are performed for the inner product and <italic>d</italic> for raising the result to the power <italic>d</italic>.</p>
<p>Therefore, the computational effort of the improved SVM algorithm is related not to the number of support vectors but to the spatial dimensionality of the study context and the order of the polynomial kernel function. The improved SVM algorithm is particularly effective in the present case, where the number of vectors is large while the dimensionality and order are low. When computing the vectors, discarding the expansion sub-vectors of the point vectors has no effect on the classification results or on the calculation of the threshold value; only the normal vectors need to be substituted into the computation, so the efficiency of the algorithm is greatly improved. The penalty function in the linearly inseparable case is:</p>
<disp-formula id="E5"><label>(5)</label><mml:math id="M5"><mml:mrow><mml:mi>min</mml:mi><mml:mfrac><mml:mn>1</mml:mn><mml:mn>2</mml:mn></mml:mfrac><mml:mo>&#x2225;</mml:mo><mml:mi>W</mml:mi><mml:msup><mml:mo>&#x2225;</mml:mo><mml:mn>2</mml:mn></mml:msup><mml:mo>+</mml:mo><mml:mi>C</mml:mi><mml:munderover><mml:mstyle displaystyle="true"><mml:mo>&#x2211;</mml:mo></mml:mstyle><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mi>R</mml:mi></mml:munderover><mml:msub><mml:mi>&#x03B5;</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:mi>s</mml:mi><mml:mo>.</mml:mo><mml:mi>t</mml:mi><mml:mo>.</mml:mo><mml:mo>,</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:msup><mml:mi>w</mml:mi><mml:mi>T</mml:mi></mml:msup><mml:msub><mml:mi>x</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>+</mml:mo><mml:mi>b</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>&#x2265;</mml:mo><mml:mn>1</mml:mn><mml:mo>&#x2212;</mml:mo><mml:msub><mml:mi>&#x03B5;</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi>&#x03B5;</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>&#x2265;</mml:mo><mml:mn>0</mml:mn></mml:mrow></mml:math></disp-formula>
<p>where <italic>C</italic> is a free parameter; based on practical experience, <italic>C</italic> is set to 2 in this study. <italic>R</italic> is the number of vectors.</p>
<p>Another important issue is the choice of the order <italic>d</italic>, which has a great impact on the accuracy and time complexity of the kernel function. The effect of different choices of <italic>d</italic> on the recognition rate of the algorithm is illustrated by the experimental results shown in <xref rid="tab3" ref-type="table">Table 3</xref>.</p>
<table-wrap position="float" id="tab3">
<label>Table 3</label>
<caption><p>Experimental results with different <italic>d</italic> value.</p></caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th align="left" valign="top">Order <italic>d</italic></th>
<th align="center" valign="top">Recognition rate</th>
<th align="center" valign="top">Number of support vectors</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left" valign="top">2</td>
<td align="center" valign="top">0.987</td>
<td align="center" valign="top">355</td>
</tr>
<tr>
<td align="left" valign="top">3</td>
<td align="center" valign="top">0.934</td>
<td align="center" valign="top">299</td>
</tr>
<tr>
<td align="left" valign="top">4</td>
<td align="center" valign="top">0.955</td>
<td align="center" valign="top">270</td>
</tr>
<tr>
<td align="left" valign="top">5</td>
<td align="center" valign="top">0.919</td>
<td align="center" valign="top">248</td>
</tr>
<tr>
<td align="left" valign="top">6</td>
<td align="center" valign="top">0.923</td>
<td align="center" valign="top">234</td>
</tr>
<tr>
<td align="left" valign="top">7</td>
<td align="center" valign="top">0.910</td>
<td align="center" valign="top">255</td>
</tr>
<tr>
<td align="left" valign="top">8</td>
<td align="center" valign="top">0.933</td>
<td align="center" valign="top">211</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>In this paper, the data dimension is 3 and the training set contains 3,776 data points. A polynomial kernel function is used, with parameter <italic>C</italic> set to 2. <xref rid="tab3" ref-type="table">Table 3</xref> shows that the recognition rate is highest when <italic>d</italic>&#x2009;=&#x2009;2, so <italic>d</italic>&#x2009;=&#x2009;2 is used for the polynomial kernel function in this study.</p>
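<p>The polynomial kernel of Eq. (2) with the order selected above (<italic>d</italic> = 2) can be sketched as follows. This is a minimal sketch of the standard kernelized decision rule only; the sub-vector expansion proposed in this paper is not reproduced, and the support vectors, multipliers, and bias below are illustrative placeholders rather than trained values.</p>

```python
import numpy as np

def poly_kernel(x, xi, d=2):
    """Polynomial kernel of Eq. (2): K(x, x_i) = [(x . x_i) + 1]^d."""
    return (np.dot(x, xi) + 1.0) ** d

def svm_decision(x, support_vectors, alphas, ys, b, d=2):
    """Standard kernelized decision rule: sign(sum_i a_i y_i K(x_i, x) + b)."""
    s = sum(a * y * poly_kernel(sv, x, d)
            for a, y, sv in zip(alphas, ys, support_vectors))
    return np.sign(s + b)

# Illustrative placeholders, not trained values.
svs = [np.array([1.0, 0.0, 1.0]), np.array([-1.0, 0.0, -1.0])]
label = svm_decision(np.array([1.0, 0.0, 0.5]), svs,
                     alphas=[0.5, 0.5], ys=[1, -1], b=0.0)
```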
</sec>
<sec id="sec9">
<title>Big Data Processing Module</title>
<p>When deploying an HDFS storage structure on the Hadoop platform, a Master serves as the host and runs the NameNode and the JobTracker. Each Slave usually has a DataNode that stores data information and its backups, and executes Map tasks and Reduce tasks in conjunction with a local TaskTracker according to the application requirements. The design idea of the Map-Reduce model is shown in <xref rid="fig2" ref-type="fig">Figure 2</xref>.</p>
<fig position="float" id="fig2">
<label>Figure 2</label>
<caption><p>Map-Reduce model design ideas.</p></caption>
<graphic xlink:href="fpsyg-13-857292-g002.tif"/>
</fig>
<p>The Map-Reduce framework treats the input of an application as a set of &#x003C;key, {list of value}&#x003E; pairs. The pairs input to the Map function are raw vector-form data, where key represents the evaluation-dimension weight and {list of value} holds the corresponding &#x201C;evaluation vocabulary&#x201D; and &#x201C;positive/negative meaning.&#x201D; The &#x003C;key, value&#x003E; pairs emitted by the Map are passed to the Reduce class, and the Reduce task processes the key-value pairs scattered across many different nodes. The Reduce function saves the classification results of each data node into different classifiers and counts the number and percentage of nodes in each classifier.</p>
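<p>The map, shuffle, and reduce steps described above can be sketched in simplified, single-process form as follows. This is a minimal sketch, not the actual Hadoop implementation; the record layout and label names are illustrative assumptions.</p>

```python
from collections import defaultdict

# Map: emit (sentiment label, 1) for each vectorized comment record.
# A record is (weight, evaluation_word, polarity); the layout is illustrative.
def map_fn(record):
    weight, word, polarity = record
    yield (polarity, 1)

# Reduce: sum the counts emitted for each label.
def reduce_fn(label, counts):
    return (label, sum(counts))

records = [(0.2, "beautiful", "positive"), (0.15, "crowded", "negative"),
           (0.2, "friendly", "positive")]

grouped = defaultdict(list)              # shuffle: group values by key
for rec in records:
    for key, value in map_fn(rec):
        grouped[key].append(value)

totals = dict(reduce_fn(k, v) for k, v in grouped.items())
shares = {k: n / len(records) for k, n in totals.items()}   # percentages
```

<p>On the actual cluster, the grouping step is performed by Hadoop's shuffle phase across DataNodes rather than in memory as here.</p>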
</sec>
</sec>
<sec id="sec10">
<title>Experiment and Result Analysis</title>
<sec id="sec11">
<title>Experimental Environment and Data Sources</title>
<p>The PC environment used for the test experiments consists of an Intel i7 processor, 16 GB of RAM, the Windows 10 operating system, Eclipse, and Matlab. We cooperated with an online tourism platform on big data and captured the online review data using the big data cloud computing service platform it owns. In 2019, the total tourism revenue of Hangzhou was 219.741 billion yuan, an increase of 12.72% year-on-year. This study used the platform to capture online review data for the Hangzhou West Lake Scenic Area in 2019. These data fall into two main types: (1) microblog data, such as Sina Weibo and Tencent Weibo; and (2) OTA online review data, such as <ext-link xlink:href="http://Ctrip.com" ext-link-type="uri">Ctrip.com</ext-link>, <ext-link xlink:href="http://GoWhere.com" ext-link-type="uri">GoWhere.com</ext-link>, and <ext-link xlink:href="http://Tongcheng.com" ext-link-type="uri">Tongcheng.com</ext-link>. The collected raw data were pre-processed; the most important step was to manually classify the data, yielding the dataset shown in <xref rid="tab4" ref-type="table">Table 4</xref>.</p>
<table-wrap position="float" id="tab4">
<label>Table 4</label>
<caption><p>Results of the manual division of the West Lake Scenic Area online comments.</p></caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th/>
<th align="center" valign="top">Number of comments with negative meaning</th>
<th align="center" valign="top">Number of comments with positive meaning</th>
<th align="center" valign="top">Total</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left" valign="top">Training set</td>
<td align="center" valign="top">20,670</td>
<td align="center" valign="top">18,033</td>
<td align="center" valign="top">38,703</td>
</tr>
<tr>
<td align="left" valign="top">Testing set</td>
<td align="center" valign="top">16,145</td>
<td align="center" valign="top">11,578</td>
<td align="center" valign="top">27,723</td>
</tr>
<tr>
<td align="left" valign="top">Total</td>
<td align="center" valign="top">36,815</td>
<td align="center" valign="top">29,611</td>
<td align="center" valign="top">66,426</td>
</tr>
</tbody>
</table>
</table-wrap>
</sec>
<sec id="sec12">
<title>Evaluation Indicators</title>
<p>Travel sentiment classification can be summarized as a binary classification problem: reviews that contain negative evaluations are not-recommended reviews, and reviews that do not are recommended reviews. In general, classification models are evaluated along two major dimensions: classification accuracy and classification efficiency. In this paper, we compare the travel sentiment classification models from these two perspectives. Classification accuracy is measured by accuracy, precision, and recall. Accuracy evaluates the proportion of all samples that are classified correctly; precision is the proportion of samples assigned to a category that actually belong to it; and recall is the proportion of samples in a category that are correctly retrieved. These evaluation metrics are calculated as follows:</p>
<disp-formula id="E6"><label>(6)</label><mml:math id="M6"><mml:mrow><mml:mi>A</mml:mi><mml:mi>C</mml:mi><mml:mi>C</mml:mi><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mi mathvariant="normal">TP</mml:mi><mml:mo>+</mml:mo><mml:mi mathvariant="normal">TN</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="normal">TP</mml:mi><mml:mo>+</mml:mo><mml:mi mathvariant="normal">FN</mml:mi><mml:mo>+</mml:mo><mml:mi mathvariant="normal">FP</mml:mi><mml:mo>+</mml:mo><mml:mi mathvariant="normal">TN</mml:mi></mml:mrow></mml:mfrac></mml:mrow></mml:math></disp-formula>
<disp-formula id="E7"><label>(7)</label><mml:math id="M7"><mml:mrow><mml:mi>P</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>p</mml:mi><mml:mi>o</mml:mi><mml:mi>s</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mi mathvariant="normal">TP</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="normal">TP</mml:mi><mml:mo>+</mml:mo><mml:mi mathvariant="normal">FP</mml:mi></mml:mrow></mml:mfrac></mml:mrow></mml:math></disp-formula>
<disp-formula id="E8"><label>(8)</label><mml:math id="M8"><mml:mrow><mml:mi>R</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>p</mml:mi><mml:mi>o</mml:mi><mml:mi>s</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mi mathvariant="normal">TP</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="normal">TP</mml:mi><mml:mo>+</mml:mo><mml:mi mathvariant="normal">FN</mml:mi></mml:mrow></mml:mfrac></mml:mrow></mml:math></disp-formula>
<disp-formula id="E9"><label>(9)</label><mml:math id="M9"><mml:mrow><mml:mi>P</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>n</mml:mi><mml:mi>e</mml:mi><mml:mi>g</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mi mathvariant="normal">TN</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="normal">TN</mml:mi><mml:mo>+</mml:mo><mml:mi mathvariant="normal">FN</mml:mi></mml:mrow></mml:mfrac></mml:mrow></mml:math></disp-formula>
<disp-formula id="E10"><label>(10)</label><mml:math id="M10"><mml:mrow><mml:mi>R</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>n</mml:mi><mml:mi>e</mml:mi><mml:mi>g</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mi mathvariant="normal">TN</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="normal">FP</mml:mi><mml:mo>+</mml:mo><mml:mi mathvariant="normal">TN</mml:mi></mml:mrow></mml:mfrac></mml:mrow></mml:math></disp-formula>
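<p>Equations (6)&#x2013;(10) can be computed directly from the four counts of the confusion matrix (TP, FN, FP, TN), as in the following sketch; the counts used in the example are hypothetical, chosen only for illustration.</p>

```python
def classification_metrics(tp, fn, fp, tn):
    """Five evaluation metrics of Eqs. (6)-(10); pos = recommended
    review, neg = not-recommended review."""
    return {
        "Acc": (tp + tn) / (tp + fn + fp + tn),  # Eq. (6)
        "P(pos)": tp / (tp + fp),                # Eq. (7)
        "R(pos)": tp / (tp + fn),                # Eq. (8)
        "P(neg)": tn / (tn + fn),                # Eq. (9)
        "R(neg)": tn / (fp + tn),                # Eq. (10)
    }

# Hypothetical confusion-matrix counts for illustration
m = classification_metrics(tp=80, fn=20, fp=10, tn=90)
print(m["Acc"])  # (80 + 90) / 200 = 0.85
```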
</sec>
<sec id="sec13">
<title>Performance Analysis of the Improved SVM Algorithm</title>
<p>To verify the superiority of the improved SVM algorithm over the original algorithm, the two are compared separately on accuracy and time consumption.</p>
<p><xref rid="fig3" ref-type="fig">Figure 3</xref> compares the result accuracy of the original SVM and the improved SVM algorithm. It can be seen that the original SVM algorithm achieves higher accuracy at small data sizes. However, once the data volume grows beyond 600&#x2009;GB, the accuracy gap between the two algorithms gradually narrows. When the data volume exceeds 1&#x2009;TB, the accuracy of the original SVM algorithm starts to decline, while the accuracy of the improved SVM algorithm remains around 80%. It can be inferred that the accuracy of the improved SVM algorithm stabilizes well in the context of big data.</p>
<fig position="float" id="fig3">
<label>Figure 3</label>
<caption><p>The accuracy comparison results of the original SVM and improved SVM algorithm.</p></caption>
<graphic xlink:href="fpsyg-13-857292-g003.tif"/>
</fig>
<p><xref rid="fig4" ref-type="fig">Figure 4</xref> shows the result accuracy of the improved SVM algorithm on the Hadoop system. It can be seen that, after introducing the Hadoop platform, the result accuracy of the algorithm improves, especially when dealing with large data volumes. From <xref rid="fig4" ref-type="fig">Figure 4</xref>, the accuracy rate is maintained at about 80%, which basically meets practical application requirements. A comparison of the time consumption of the improved SVM algorithm and the original SVM algorithm is shown in <xref rid="fig5" ref-type="fig">Figure 5</xref>.</p>
<fig position="float" id="fig4">
<label>Figure 4</label>
<caption><p>The accuracy of improvement algorithms results under the background of big data.</p></caption>
<graphic xlink:href="fpsyg-13-857292-g004.tif"/>
</fig>
<fig position="float" id="fig5">
<label>Figure 5</label>
<caption><p>The time-consuming comparison of two algorithms.</p></caption>
<graphic xlink:href="fpsyg-13-857292-g005.tif"/>
</fig>
<p>From <xref rid="fig5" ref-type="fig">Figure 5</xref>, it is clear that when the amount of data is small, the improved SVM algorithm gains no significant advantage from decomposing the vectors, because the numbers of both point vectors and support vectors are limited; the two algorithms therefore differ little in time cost. However, as the data size grows and linear inseparability gradually increases, the improved SVM algorithm gains a great advantage in kernel function computation, which becomes especially obvious once the data size exceeds 20&#x2009;GB. Overall, in the context of massive data, the improved SVM algorithm greatly improves execution efficiency while maintaining a certain accuracy of the results.</p>
</sec>
<sec id="sec14">
<title>Performance Comparison of Different Classification Models</title>
<p>Sentiment analysis models based on the improved SVM, naive Bayes, and dictionary rules were trained and optimized iteratively according to the five metrics defined above for assessing model accuracy. The classification results of the different models on the same test set are shown in <xref rid="tab5" ref-type="table">Table 5</xref>.</p>
<table-wrap position="float" id="tab5">
<label>Table 5</label>
<caption><p>Classification effects of different models at the same test set.</p></caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th align="left" valign="top">Algorithm</th>
<th align="center" valign="top">Number of test sets</th>
<th align="center" valign="top"><italic>Acc</italic></th>
<th align="center" valign="top"><italic>P</italic>(<italic>po</italic>s)</th>
<th align="center" valign="top"><italic>R</italic>(<italic>po</italic>s)</th>
<th align="center" valign="top"><italic>P</italic>(<italic>neg</italic>)</th>
<th align="center" valign="top"><italic>R</italic>(<italic>neg</italic>)</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left" valign="top" rowspan="6">Improved SVM</td>
<td align="center" valign="top">1,000</td>
<td align="center" valign="top">0.838</td>
<td align="center" valign="top">0.825</td>
<td align="center" valign="top">0.912</td>
<td align="center" valign="top">0.816</td>
<td align="center" valign="top">0.738</td>
</tr>
<tr>
<td align="center" valign="top">5,000</td>
<td align="center" valign="top">0.874</td>
<td align="center" valign="top">0.883</td>
<td align="center" valign="top">0.919</td>
<td align="center" valign="top">0.859</td>
<td align="center" valign="top">0.803</td>
</tr>
<tr>
<td align="center" valign="top">10,000</td>
<td align="center" valign="top">0.880</td>
<td align="center" valign="top">0.889</td>
<td align="center" valign="top">0.909</td>
<td align="center" valign="top">0.866</td>
<td align="center" valign="top">0.837</td>
</tr>
<tr>
<td align="center" valign="top">15,000</td>
<td align="center" valign="top">0.880</td>
<td align="center" valign="top">0.875</td>
<td align="center" valign="top">0.907</td>
<td align="center" valign="top">0.886</td>
<td align="center" valign="top">0.847</td>
</tr>
<tr>
<td align="center" valign="top">20,000</td>
<td align="center" valign="top">0.874</td>
<td align="center" valign="top">0.861</td>
<td align="center" valign="top">0.915</td>
<td align="center" valign="top">0.891</td>
<td align="center" valign="top">0.825</td>
</tr>
<tr>
<td align="center" valign="top">27,723</td>
<td align="center" valign="top">0.870</td>
<td align="center" valign="top">0.856</td>
<td align="center" valign="top">0.915</td>
<td align="center" valign="top">0.890</td>
<td align="center" valign="top">0.817</td>
</tr>
<tr>
<td align="left" valign="top" rowspan="6">NaiveBayes</td>
<td align="center" valign="top">1,000</td>
<td align="center" valign="top">0.838</td>
<td align="center" valign="top">0.836</td>
<td align="center" valign="top">0.895</td>
<td align="center" valign="top">0.842</td>
<td align="center" valign="top">0.762</td>
</tr>
<tr>
<td align="center" valign="top">5,000</td>
<td align="center" valign="top">0.859</td>
<td align="center" valign="top">0.874</td>
<td align="center" valign="top">0.902</td>
<td align="center" valign="top">0.833</td>
<td align="center" valign="top">0.789</td>
</tr>
<tr>
<td align="center" valign="top">10,000</td>
<td align="center" valign="top">0.866</td>
<td align="center" valign="top">0.878</td>
<td align="center" valign="top">0.898</td>
<td align="center" valign="top">0.849</td>
<td align="center" valign="top">0.821</td>
</tr>
<tr>
<td align="center" valign="top">15,000</td>
<td align="center" valign="top">0.850</td>
<td align="center" valign="top">0.851</td>
<td align="center" valign="top">0.875</td>
<td align="center" valign="top">0.847</td>
<td align="center" valign="top">0.820</td>
</tr>
<tr>
<td align="center" valign="top">20,000</td>
<td align="center" valign="top">0.846</td>
<td align="center" valign="top">0.844</td>
<td align="center" valign="top">0.878</td>
<td align="center" valign="top">0.848</td>
<td align="center" valign="top">0.807</td>
</tr>
<tr>
<td align="center" valign="top">27,723</td>
<td align="center" valign="top">0.853</td>
<td align="center" valign="top">0.850</td>
<td align="center" valign="top">0.890</td>
<td align="center" valign="top">0.857</td>
<td align="center" valign="top">0.813</td>
</tr>
<tr>
<td align="left" valign="top" rowspan="6">Dictionary</td>
<td align="center" valign="top">1,000</td>
<td align="center" valign="top">0.880</td>
<td align="center" valign="top">0.960</td>
<td align="center" valign="top">0.828</td>
<td align="center" valign="top">0.800</td>
<td align="center" valign="top">0.952</td>
</tr>
<tr>
<td align="center" valign="top">5,000</td>
<td align="center" valign="top">0.845</td>
<td align="center" valign="top">0.900</td>
<td align="center" valign="top">0.811</td>
<td align="center" valign="top">0.790</td>
<td align="center" valign="top">0.888</td>
</tr>
<tr>
<td align="center" valign="top">10,000</td>
<td align="center" valign="top">0.880</td>
<td align="center" valign="top">0.924</td>
<td align="center" valign="top">0.872</td>
<td align="center" valign="top">0.823</td>
<td align="center" valign="top">0.892</td>
</tr>
<tr>
<td align="center" valign="top">15,000</td>
<td align="center" valign="top">0.855</td>
<td align="center" valign="top">0.930</td>
<td align="center" valign="top">0.836</td>
<td align="center" valign="top">0.753</td>
<td align="center" valign="top">0.889</td>
</tr>
<tr>
<td align="center" valign="top">20,000</td>
<td align="center" valign="top">0.856</td>
<td align="center" valign="top">0.928</td>
<td align="center" valign="top">0.811</td>
<td align="center" valign="top">0.784</td>
<td align="center" valign="top">0.916</td>
</tr>
<tr>
<td align="center" valign="top">27,723</td>
<td align="center" valign="top">0.867</td>
<td align="center" valign="top">0.931</td>
<td align="center" valign="top">0.841</td>
<td align="center" valign="top">0.790</td>
<td align="center" valign="top">0.906</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>The accuracy analysis results of the three models are shown in <xref rid="fig6" ref-type="fig">Figure 6</xref>. All three share a high accuracy rate, above 83%, so all of them can achieve accurate classification of tourism sentiment. However, there are still differences among them. Compared with the model based on sentiment dictionary rules, the classification accuracy of the machine learning-based models is more stable, and the improved SVM classifier generally outperforms the naive Bayes classifier. The classification accuracy of the dictionary model is higher than that of the machine learning models when the test set is small, but it gradually decreases as the test set grows.</p>
<fig position="float" id="fig6">
<label>Figure 6</label>
<caption><p>Sentiment classification accuracy (<italic>Acc</italic>) of each classifier.</p></caption>
<graphic xlink:href="fpsyg-13-857292-g006.tif"/>
</fig>
<p>In summary, the travel sentiment classification model based on a sentiment dictionary performs better when dealing with small amounts of text. However, as the test set grows, the improved support vector machine-based travel sentiment classification model achieves more desirable classification results. Therefore, the improved support vector machine-based sentiment analysis method has a great advantage in handling large amounts of travel review data.</p>
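<p>The dictionary model&#x2019;s dependence on lexicon coverage, which underlies its decline on larger test sets, can be illustrated with a minimal rule-based sketch. The lexicon and reviews below are toy assumptions, not the study&#x2019;s data.</p>

```python
# Toy negative-sentiment lexicon (assumed for illustration only)
NEGATIVE_LEXICON = {"crowded", "dirty", "expensive", "rude"}

def dictionary_classify(tokens):
    """Rule: a review containing any negative evaluation word is 'neg'
    (not recommended); otherwise it is 'pos' (recommended)."""
    return "neg" if NEGATIVE_LEXICON & set(tokens) else "pos"

test_set = [
    (["west", "lake", "beautiful", "scenery"], "pos"),
    (["too", "crowded", "on", "holidays"], "neg"),
    (["tickets", "expensive", "but", "worth", "it"], "pos"),  # rule misfires
    (["staff", "rude", "and", "queues", "long"], "neg"),
]
correct = sum(dictionary_classify(t) == label for t, label in test_set)
print(f"Acc = {correct / len(test_set):.2f}")  # 3 of 4 correct
```

<p>As the test set grows, such misfires (negation, context, out-of-lexicon words) accumulate, whereas a trained classifier can learn these patterns from the training set.</p>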
</sec>
</sec>
<sec id="sec15" sec-type="conclusions">
<title>Conclusion</title>
<p>In this paper, a typical sentiment analysis method is used to build a machine learning model based on support vector machines to analyze the sentiment of online travel comments, and the feasibility and effectiveness of the approach are verified experimentally. In the algorithm study, the support vector machine (SVM) algorithm is chosen and improved for the case of clear categories and a large amount of data. In addition, the improved SVM algorithm is implemented with Map-Reduce to classify large-scale textual keywords, while the HDFS system is used for concurrent storage, reading, and writing of data to improve efficiency when training on large amounts of data. Finally, the efficiency of the proposed method is verified through comparative experiments. A shortcoming of this study is that the data size of the test set used in the experiments is limited and does not fully reflect the speedup of the classification process with more processors. The data size of the experiments and the number of nodes in the experimental platform will be increased in future work.</p>
</sec>
<sec id="sec16" sec-type="data-availability">
<title>Data Availability Statement</title>
<p>The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.</p>
</sec>
<sec id="sec17">
<title>Author Contributions</title>
<p>ZY was responsible for designing the framework of the entire manuscript from topic selection to solution to experimental verification.</p>
</sec>
<sec id="conf1" sec-type="COI-statement">
<title>Conflict of Interest</title>
<p>The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
</sec>
<sec id="sec200" sec-type="disclaimer">
<title>Publisher&#x2019;s Note</title>
<p>All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.</p>
</sec>
</body>
<back>
<ref-list>
<title>References</title>
<ref id="ref19"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Yi</surname> <given-names>S.</given-names></name> <name><surname>Day</surname> <given-names>J.</given-names></name> <name><surname>Cai</surname> <given-names>L. A.</given-names></name></person-group> (<year>2014</year>). <article-title>Exploring tourist perceived value: an investigation of Asian cruise tourists&#x2019; travel experience</article-title>. <source>J. Qual. Assur. Hosp. Tour.</source> <volume>15</volume>, <fpage>63</fpage>&#x2013;<lpage>77</lpage>. doi: <pub-id pub-id-type="doi">10.1080/1528008X.2014.855530</pub-id></citation></ref>
<ref id="ref27"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Yunlong</surname> <given-names>Z.</given-names></name> <name><surname>Yuanchang</surname> <given-names>X.</given-names></name></person-group> (<year>2018</year>). <article-title>Travel mode choice modeling with support vector machines</article-title>. <source>Transp. Res. Rec.</source> <volume>2076</volume>, <fpage>141</fpage>&#x2013;<lpage>150</lpage>. doi: <pub-id pub-id-type="doi">10.3141/2076-16</pub-id></citation></ref>
<ref id="ref28"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zeng</surname> <given-names>N.</given-names></name> <name><surname>Qiu</surname> <given-names>H.</given-names></name> <name><surname>Wang</surname> <given-names>Z.</given-names></name> <name><surname>Liu</surname> <given-names>W.</given-names></name> <name><surname>Zhang</surname> <given-names>H.</given-names></name> <name><surname>Li</surname> <given-names>Y.</given-names></name></person-group> (<year>2018</year>). <article-title>A new switching-delayed-PSO-based optimized SVM algorithm for diagnosis of Alzheimer&#x2019;s disease</article-title>. <source>Neurocomputing</source> <volume>320</volume>, <fpage>195</fpage>&#x2013;<lpage>202</lpage>. doi: <pub-id pub-id-type="doi">10.1016/j.neucom.2018.09.001</pub-id></citation></ref>
</ref-list>
</back>
</article>