<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Big Data</journal-id>
<journal-title>Frontiers in Big Data</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Big Data</abbrev-journal-title>
<issn pub-type="epub">2624-909X</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.3389/fdata.2019.00008</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Big Data</subject>
<subj-group>
<subject>Original Research</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>Abusive Language Detection in Online Conversations by Combining Content- and Graph-Based Features</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name><surname>C&#x000E9;cillon</surname> <given-names>No&#x000E9;</given-names></name>
<uri xlink:href="http://loop.frontiersin.org/people/714333/overview"/>
</contrib>
<contrib contrib-type="author" corresp="yes">
<name><surname>Labatut</surname> <given-names>Vincent</given-names></name>
<xref ref-type="corresp" rid="c001"><sup>&#x0002A;</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/72717/overview"/>
</contrib>
<contrib contrib-type="author">
<name><surname>Dufour</surname> <given-names>Richard</given-names></name>
<uri xlink:href="http://loop.frontiersin.org/people/714359/overview"/>
</contrib>
<contrib contrib-type="author">
<name><surname>Linar&#x000E8;s</surname> <given-names>Georges</given-names></name>
</contrib>
</contrib-group>
<aff><institution>LIA, Avignon University</institution>, <addr-line>Avignon</addr-line>, <country>France</country></aff>
<author-notes>
<fn fn-type="edited-by"><p>Edited by: Sabrina Gaito, University of Milan, Italy</p></fn>
<fn fn-type="edited-by"><p>Reviewed by: Roberto Interdonato, Territoires, Environnement, T&#x000E9;l&#x000E9;d&#x000E8;tection et Information Spatiale (TETIS), France; Eric A. Leclercq, Universit&#x000E9; de Bourgogne, France</p></fn>
<corresp id="c001">&#x0002A;Correspondence: Vincent Labatut <email>vincent.labatut&#x00040;univ-avignon.fr</email></corresp>
<fn fn-type="other" id="fn001"><p>This article was submitted to Data Mining and Management, a section of the journal Frontiers in Big Data</p></fn></author-notes>
<pub-date pub-type="epub">
<day>04</day>
<month>06</month>
<year>2019</year>
</pub-date>
<pub-date pub-type="collection">
<year>2019</year>
</pub-date>
<volume>2</volume>
<elocation-id>8</elocation-id>
<history>
<date date-type="received">
<day>01</day>
<month>04</month>
<year>2019</year>
</date>
<date date-type="accepted">
<day>14</day>
<month>05</month>
<year>2019</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#x000A9; 2019 C&#x000E9;cillon, Labatut, Dufour and Linar&#x000E8;s.</copyright-statement>
<copyright-year>2019</copyright-year>
<copyright-holder>C&#x000E9;cillon, Labatut, Dufour and Linar&#x000E8;s</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/"><p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.</p></license>
</permissions>
<abstract><p>In recent years, online social networks have allowed world-wide users to meet and discuss. As guarantors of these communities, the administrators of these platforms must prevent users from adopting inappropriate behaviors. This verification task, mainly done by humans, is increasingly difficult due to the ever-growing number of messages to check. Methods have been proposed to automate this moderation process, mainly based on the textual content of the exchanged messages. Recent work has also shown that characteristics derived from the structure of conversations, in the form of conversational graphs, can help detect these abusive messages. In this paper, we propose to take advantage of both sources of information, through fusion methods integrating content- and graph-based features. Our experiments on raw chat logs show that not only the content of the messages, but also their dynamics within a conversation contain partially complementary information, allowing performance improvements on an abusive message classification task, with a final <italic>F</italic>-measure of 93.26%.</p></abstract>
<kwd-group>
<kwd>automatic abuse detection</kwd>
<kwd>content analysis</kwd>
<kwd>conversational graph</kwd>
<kwd>online conversations</kwd>
<kwd>social networks</kwd>
</kwd-group>
<counts>
<fig-count count="3"/>
<table-count count="2"/>
<equation-count count="0"/>
<ref-count count="18"/>
<page-count count="7"/>
<word-count count="5361"/>
</counts>
</article-meta>
</front>
<body>
<sec sec-type="intro" id="s1">
<title>1. Introduction</title>
<p>The internet has widely impacted the way we communicate. Online communities, in particular, have grown to become important places for interpersonal communications. They receive increasing attention from companies wishing to advertise their products, and from governments interested in monitoring public discourse. Online communities come in various shapes and forms, but they are all exposed to abusive behavior. The definition of what exactly constitutes abuse depends on the community, but it generally includes personal attacks, as well as discrimination based on race, religion, or sexual orientation.</p>
<p>Abusive behavior is a risk, as it is likely to make important community members leave, thereby endangering the community, and may even trigger legal issues in some countries. Moderation consists in detecting users who act abusively, and in taking actions against them. Currently, this moderation work is mainly a manual process, and since it implies high human and financial costs, companies have a keen interest in its automation. One way of doing so is to consider this task as a classification problem consisting in automatically determining whether a user message is abusive or not.</p>
<p>A number of works have tackled this problem, or related ones, in the literature. Most of them focus only on the content of the targeted message to detect abuse or similar properties. For instance, Spertus (<xref ref-type="bibr" rid="B16">1997</xref>) applies this principle to detect hostility, Dinakar et al. (<xref ref-type="bibr" rid="B6">2011</xref>) cyberbullying, and Chen et al. (<xref ref-type="bibr" rid="B4">2012</xref>) offensive language. These approaches rely on a mix of standard NLP features and manually crafted application-specific resources (e.g., linguistic rules). We also proposed a content-based method (Papegnies et al., <xref ref-type="bibr" rid="B11">2017a</xref>) using a wide array of language features (Bag-of-Words, <italic>tf</italic>-<italic>idf</italic> scores, sentiment scores). Other approaches are more machine learning intensive, but require larger amounts of data. Recently, Wulczyn et al. (<xref ref-type="bibr" rid="B17">2017</xref>) created three datasets containing individual messages collected from Wikipedia discussion pages, annotated for toxicity, personal attacks, and aggression, respectively. They have been leveraged in recent works to train Recurrent Neural Networks operating on word embeddings and character <italic>n</italic>-gram features (Pavlopoulos et al., <xref ref-type="bibr" rid="B14">2017</xref>; Mishra et al., <xref ref-type="bibr" rid="B9">2018</xref>). However, the quality of these direct content-based approaches is strongly tied to the training data used to learn the abuse detection models. In the case of online social networks, the great variety of users, with their very different language registers, spelling mistakes, and intentional obfuscation, makes it almost impossible to train models robust enough to be applied in all cases. Hosseini et al. (<xref ref-type="bibr" rid="B8">2017</xref>) have shown that it is very easy to bypass automatic toxic comment detection systems by making the abusive content harder to detect (intentional spelling mistakes, uncommon negatives&#x02026;).</p>
<p>Because the reactions of other users to an abuse case are completely beyond the abuser&#x00027;s control, some authors consider the content of the messages occurring <italic>around</italic> the targeted message, instead of focusing only on the targeted message itself. For instance, Yin et al. (<xref ref-type="bibr" rid="B18">2009</xref>) use features derived from the sentences neighboring a given message to detect harassment on the Web. Balci and Salah (<xref ref-type="bibr" rid="B1">2015</xref>) take advantage of user features such as gender, number of in-game friends, or number of daily logins to detect abuse in the community of an online game. In our previous work (Papegnies et al., <xref ref-type="bibr" rid="B13">2019</xref>), we proposed a radically different method that completely ignores the textual content of the messages and relies only on a graph-based modeling of the conversation. To our knowledge, it is the only graph-based approach proposed for abusive message detection that ignores the linguistic content. Our conversational network extraction process is inspired by other works leveraging such graphs for other purposes: modeling interactions in chat logs (Mutton, <xref ref-type="bibr" rid="B10">2004</xref>) or online forums (Forestier et al., <xref ref-type="bibr" rid="B7">2011</xref>), and detecting user groups (Camtepe et al., <xref ref-type="bibr" rid="B3">2004</xref>). Additional references on abusive message detection and conversational network modeling can be found in Papegnies et al. (<xref ref-type="bibr" rid="B13">2019</xref>).</p>
<p>In this paper, based on the assumption that the interactions between users and the content of the exchanged messages convey different information, we propose a new method performing abuse detection while leveraging both sources. For this purpose, we take advantage of the content-based (Papegnies et al., <xref ref-type="bibr" rid="B12">2017b</xref>) and graph-based (Papegnies et al., <xref ref-type="bibr" rid="B13">2019</xref>) methods that we previously developed. We propose three different ways to combine them, and compare their performance on a corpus of chat logs originating from the community of a French multiplayer online game. We then perform a feature study, identifying the most informative features and discussing their role. Our contribution is twofold: the exploration of fusion methods, and more importantly the identification of discriminative features for this problem.</p>
<p>The rest of this article is organized as follows. In section 2, we describe the methods and strategies used in this work. In section 3 we present our dataset, the experimental setup we use for this classification task, and the performances we obtained. Finally, we summarize our contributions in section 4 and present some perspectives for this work.</p></sec>
<sec sec-type="methods" id="s2">
<title>2. Methods</title>
<p>In this section, we summarize the content-based method from Papegnies et al. (<xref ref-type="bibr" rid="B12">2017b</xref>) (section 2.1) and the graph-based method from Papegnies et al. (<xref ref-type="bibr" rid="B13">2019</xref>) (section 2.2). We then present the fusion methods proposed in this paper, which aim at taking advantage of both sources of information (section 2.3). <xref ref-type="fig" rid="F1">Figure 1</xref> shows the whole process, and is discussed throughout this section.</p>
<fig id="F1" position="float">
<label>Figure 1</label>
<caption><p>Representation of our processing pipeline. <italic>Existing methods</italic> refers to our previous work described in Papegnies et al. (<xref ref-type="bibr" rid="B12">2017b</xref>) (content-based method) and Papegnies et al. (<xref ref-type="bibr" rid="B13">2019</xref>) (graph-based method), whereas the contribution presented in this article appears on the right side (fusion strategies). Figure available at <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.6084/m9.figshare.7442273.v5">10.6084/m9.figshare.7442273</ext-link> under CC-BY license.</p></caption>
<graphic xlink:href="fdata-02-00008-g0001.tif"/>
</fig>
<sec>
<title>2.1. Content-Based Method</title>
<p>This method corresponds to the bottom-left part of <xref ref-type="fig" rid="F1">Figure 1</xref> (in green). It consists in extracting certain features from the content of each considered message, and in training a Support Vector Machine (SVM) classifier to distinguish abusive (<italic>Abuse</italic> class) from non-abusive (<italic>Non-abuse</italic> class) messages (Papegnies et al., <xref ref-type="bibr" rid="B12">2017b</xref>). These features are quite standard in Natural Language Processing (NLP), so we only describe them briefly here.</p>
<p>We use a number of <italic>morphological features</italic>: the message length, average word length, and maximal word length, all expressed in number of characters. We count the number of unique characters in the message. We distinguish between five classes of characters (letters, digits, punctuation, spaces, and others) and compute two features for each one: number of occurrences, and proportion of the characters in the message. We proceed similarly with capital letters. Abusive messages often contain a lot of copy/pasted text. To deal with such redundancy, we apply the Lempel&#x02013;Ziv&#x02013;Welch (LZW) compression algorithm (Batista and Meira, <xref ref-type="bibr" rid="B2">2004</xref>) to the message and take the ratio of its raw to compressed lengths, expressed in characters. Abusive messages also often contain extra-long words, which can be identified by collapsing the message: extra occurrences of letters repeated more than two times consecutively are removed. For instance, &#x0201C;looooooool&#x0201D; would be collapsed to &#x0201C;lool&#x0201D;. We compute the difference between the raw and collapsed message lengths.</p>
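<p>As an illustration, the collapsing and compression-based features can be sketched in Python as follows (a minimal reimplementation for illustration purposes; function and feature names are ours, not those of the original code):</p>

```python
import re

def collapse(message: str) -> str:
    """Remove extra occurrences of characters repeated more than
    twice in a row, e.g. "looooooool" -> "lool"."""
    return re.sub(r"(.)\1{2,}", r"\1\1", message)

def lzw_ratio(message: str) -> float:
    """Ratio of raw length to LZW-compressed length (in codes).
    Highly redundant messages (copy/paste) yield high ratios."""
    if not message:
        return 1.0
    dictionary = {c: i for i, c in enumerate(sorted(set(message)))}
    current, n_codes = "", 0
    for char in message:
        if current + char in dictionary:
            current += char
        else:
            n_codes += 1
            dictionary[current + char] = len(dictionary)
            current = char
    n_codes += 1  # emit the last pending code
    return len(message) / n_codes

def morphological_features(message: str) -> dict:
    """A subset of the morphological features described above."""
    words = message.split()
    return {
        "length": len(message),
        "avg_word_len": sum(map(len, words)) / max(len(words), 1),
        "max_word_len": max(map(len, words), default=0),
        "n_unique_chars": len(set(message)),
        "n_capitals": sum(c.isupper() for c in message),
        "lzw_ratio": lzw_ratio(message),
        "collapse_diff": len(message) - len(collapse(message)),
    }
```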
<p>We also use <italic>language features</italic>. We count the numbers of words, unique words, and bad words in the message. For the latter, we use a predefined list of insults and symbols considered abusive, and we also count them in the collapsed message. We compute two overall <italic>tf</italic>&#x02013;<italic>idf</italic> scores, corresponding to the sums of the standard <italic>tf</italic>&#x02013;<italic>idf</italic> scores of each individual word in the message: one is computed relative to the <italic>Abuse</italic> class, and the other relative to the <italic>Non-abuse</italic> class. We proceed similarly with the collapsed message. Finally, we lower-case the text and strip punctuation, in order to represent the message as a basic Bag-of-Words (BoW). We then train a Naive Bayes classifier to detect abuse using this sparse binary vector (as represented in the very bottom part of <xref ref-type="fig" rid="F1">Figure 1</xref>). The output of this simple classifier is then used as an input feature for the SVM classifier.</p></sec>
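<p>The last step of the content-based method, i.e., the binary Bag-of-Words representation fed to a Naive Bayes classifier, can be sketched with scikit-learn as follows (toy messages standing in for the actual chat logs; label 1 denotes the <italic>Abuse</italic> class):</p>

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import BernoulliNB

# Toy corpus standing in for the chat logs (1 = Abuse, 0 = Non-abuse).
messages = ["you idiot", "nice attack last night", "idiot idiot", "well played"]
labels = [1, 0, 1, 0]

# Lower-cased, punctuation-free, sparse binary Bag-of-Words representation.
vectorizer = CountVectorizer(lowercase=True, binary=True)
bow = vectorizer.fit_transform(messages)

# The Naive Bayes output probability P(Abuse | BoW) is later used as
# one input feature of the final SVM classifier.
nb = BernoulliNB().fit(bow, labels)
nb_scores = nb.predict_proba(bow)[:, 1]
```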
<sec>
<title>2.2. Graph-Based Method</title>
<p>This method corresponds to the top-left part of <xref ref-type="fig" rid="F1">Figure 1</xref> (in red). It completely ignores the content of the messages, and focuses only on the dynamics of the conversation, based on the interactions between its participants (Papegnies et al., <xref ref-type="bibr" rid="B13">2019</xref>). It proceeds in three steps: (1) extracting a conversational graph based on the considered message as well as the messages preceding and/or following it; (2) computing topological measures of this graph to characterize its structure; and (3) using these values as features to train an SVM to distinguish between abusive and non-abusive messages. The vertices of the graph model the participants of the conversation, whereas its weighted edges represent how intensely they communicate.</p>
<p>The graph extraction is based on a number of concepts illustrated in <xref ref-type="fig" rid="F2">Figure 2</xref>, in which each rectangle represents a message. The extraction process is restricted to a so-called <italic>context period</italic>, i.e., a sub-sequence of messages including the message of interest, itself called <italic>targeted message</italic> and represented in red in <xref ref-type="fig" rid="F2">Figure 2</xref>. Each participant posting at least one message during this period is modeled by a vertex in the produced conversational graph. A mobile window is slid over the whole period, one message at a time. At each step, the network is updated either by creating new links, or by updating the weights of existing ones. This <italic>sliding window</italic> has a fixed length expressed in number of messages, which is derived from ergonomic constraints relative to the online conversation platform studied in section 3. It allows focusing on a smaller part of the context period. At a given time, the last message of the window (in blue in <xref ref-type="fig" rid="F2">Figure 2</xref>) is called <italic>current message</italic> and its author <italic>current author</italic>. The weight update method assumes that the current message is aimed at the authors of the other messages present in the window, and therefore connects the current author to them (or strengthens the corresponding edge weights if they already exist). It also takes chronology into account by favoring the most recent authors in the window. Three different variants of the conversational network are extracted for one given targeted message: the <italic>Before</italic> network is based on the messages posted before the targeted message, the <italic>After</italic> network on those posted after, and the <italic>Full</italic> network on the whole context period. <xref ref-type="fig" rid="F3">Figure 3</xref> shows an example of such networks obtained for a message of the corpus described in section 3.1.</p>
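<p>A simplified sketch of this extraction process is given below, using the networkx library rather than iGraph. The exact weight-update scheme of the original method differs; here, as an assumption for illustration, each link is simply weighted by the inverse of the distance (in messages) separating the current message from the other author's message, so as to favor the most recent authors:</p>

```python
import networkx as nx

def extract_network(messages, window_size=10):
    """Build a weighted conversational graph from a context period.
    `messages` is a chronological list of (author, text) pairs; each
    current message is linked to the authors of the messages in the
    sliding window preceding it, recent authors getting larger weights."""
    graph = nx.Graph()
    graph.add_nodes_from(author for author, _ in messages)
    for i, (current_author, _) in enumerate(messages):
        window = messages[max(0, i - window_size + 1):i]
        for distance, (other_author, _) in enumerate(reversed(window), start=1):
            if other_author == current_author:
                continue  # no self-loops
            weight = 1.0 / distance  # favor the most recent authors
            if graph.has_edge(current_author, other_author):
                graph[current_author][other_author]["weight"] += weight
            else:
                graph.add_edge(current_author, other_author, weight=weight)
    return graph
```

<p>For instance, the <italic>Before</italic> network of a targeted message is obtained by applying this function to the sub-sequence of the context period preceding that message.</p>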
<fig id="F2" position="float">
<label>Figure 2</label>
<caption><p>Illustration of the main concepts used during network extraction (see text for details). Figure available at <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.6084/m9.figshare.7442273.v5">10.6084/m9.figshare.7442273</ext-link> under CC-BY license.</p></caption>
<graphic xlink:href="fdata-02-00008-g0002.tif"/>
</fig>
<fig id="F3" position="float">
<label>Figure 3</label>
<caption><p>Example of the three types of conversational networks extracted for a given context period: <italic>Before</italic> <bold>(Left)</bold>, <italic>After</italic> <bold>(Center)</bold>, and <italic>Full</italic> <bold>(Right)</bold>. The author of the targeted message is represented in red. Figure available at <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.6084/m9.figshare.7442273.v5">10.6084/m9.figshare.7442273</ext-link> under CC-BY license.</p></caption>
<graphic xlink:href="fdata-02-00008-g0003.tif"/>
</fig>
<p>Once the conversational networks have been extracted, they must be described through numeric values in order to feed the SVM classifier. This is done through a selection of standard topological measures allowing us to describe a graph in a number of distinct ways, focusing on different scales and scopes. The <italic>scale</italic> denotes the nature of the characterized entity. In this work, the individual vertex and the whole graph are considered. When considering a single vertex, the measure focuses on the <italic>targeted author</italic> (i.e., the author of the targeted message). The <italic>scope</italic> can be either micro-, meso-, or macroscopic: it corresponds to the amount of information considered by the measure. For instance, the graph density is microscopic, the modularity is mesoscopic, and the diameter is macroscopic. All these measures are computed for each graph, and describe the conversation surrounding the message of interest. The SVM is then trained using these values as features. In this work, we use exactly the same measures as in Papegnies et al. (<xref ref-type="bibr" rid="B13">2019</xref>).</p></sec>
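<p>For illustration, a few of the measures mentioned above (density, degree of the targeted author, modularity, diameter) can be computed with networkx as follows (a sketch only; the actual method uses iGraph and a much larger set of measures):</p>

```python
import networkx as nx
from networkx.algorithms import community

def graph_features(graph, targeted_author):
    """Describe one conversational network through a few standard
    topological measures, at two scales (targeted vertex vs. whole
    graph) and three scopes."""
    communities = community.greedy_modularity_communities(graph)
    return {
        # Microscopic scope: local information only.
        "density": nx.density(graph),
        "author_degree": graph.degree(targeted_author),
        # Mesoscopic scope: community structure.
        "modularity": community.modularity(graph, communities),
        # Macroscopic scope: whole-graph information.
        "diameter": nx.diameter(graph) if nx.is_connected(graph) else -1,
    }
```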
<sec>
<title>2.3. Fusion</title>
<p>We now propose a new method seeking to take advantage of the two methods described above. It is based on the assumption that the content- and graph-based features convey different information. Therefore, they could be complementary, and their combination could improve the classification performance. We experiment with three different fusion strategies, which are represented in the right-hand part of <xref ref-type="fig" rid="F1">Figure 1</xref>.</p>
<p>The first strategy follows the principle of <italic>Early Fusion</italic>. It consists in constituting a global feature set containing all content- and graph-based features from sections 2.1 and 2.2, then training an SVM directly on these features. The rationale here is that the classifier has access to the whole raw data, and must determine which part is relevant to the problem at hand.</p>
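<p>With feature matrices at hand, this strategy amounts to a simple concatenation (the random data below merely stands in for the actual 29 content-based and 459 graph-based features):</p>

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
content_features = rng.normal(size=(100, 29))  # placeholder content features
graph_features = rng.normal(size=(100, 459))   # placeholder graph features
labels = rng.integers(0, 2, size=100)          # 1 = Abuse, 0 = Non-abuse

# Early Fusion: concatenate all raw features, then train a single SVM.
fused = np.hstack([content_features, graph_features])
early_svm = SVC().fit(fused, labels)
```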
<p>The second strategy is <italic>Late Fusion</italic>, and we proceed in two steps. First, we apply separately both methods described in sections 2.1 and 2.2, in order to obtain two scores corresponding to the output probability of each message being abusive, given by the content- and graph-based methods, respectively. Second, we feed these two scores to a third SVM, trained to determine if a message is abusive or not. This approach relies on the assumption that these scores contain all the information the final classifier needs, and not the noise present in the raw features.</p>
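<p>The two-step procedure can be sketched as follows (same placeholder data as before; in practice, the scores used to train the fusion classifier should be produced on held-out data to avoid overfitting):</p>

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
content_features = rng.normal(size=(100, 29))  # placeholder content features
graph_features = rng.normal(size=(100, 459))   # placeholder graph features
labels = rng.integers(0, 2, size=100)          # 1 = Abuse, 0 = Non-abuse

# Step 1: one SVM per feature family, each outputting P(Abuse | message).
content_svm = SVC(probability=True).fit(content_features, labels)
graph_svm = SVC(probability=True).fit(graph_features, labels)
scores = np.column_stack([
    content_svm.predict_proba(content_features)[:, 1],
    graph_svm.predict_proba(graph_features)[:, 1],
])

# Step 2: Late Fusion, a third SVM trained on the two scores only.
late_svm = SVC().fit(scores, labels)
```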
<p>Finally, the third fusion strategy can be considered as <italic>Hybrid Fusion</italic>, as it seeks to combine both previously proposed strategies. We create a feature set containing the content- and graph-based features, like with <italic>Early Fusion</italic>, but also both scores used in <italic>Late Fusion</italic>. This whole set is used to train a new SVM. The idea is to check whether the scores fail to convey some useful information present in the raw features, in which case combining scores and features should lead to better results.</p></sec></sec>
<sec id="s3">
<title>3. Experiments</title>
<p>In this section, we first describe our dataset and the experimental protocol followed in our experiments (section 3.1). We then present and discuss our results, in terms of classification performance (section 3.2) and feature selection (section 3.3).</p>
<sec>
<title>3.1. Experimental Protocol</title>
<p>The dataset is the same as in our previous publications (Papegnies et al., <xref ref-type="bibr" rid="B12">2017b</xref>, <xref ref-type="bibr" rid="B13">2019</xref>). It is a proprietary database containing 4,029,343 messages in French, exchanged on the in-game chat of <italic>SpaceOrigin</italic><xref ref-type="fn" rid="fn0001"><sup>1</sup></xref>, a Massively Multiplayer Online Role-Playing Game (MMORPG). Among them, 779 have been flagged as abusive by at least one user in the game, and confirmed as such by a human moderator. They constitute what we call the <italic>Abuse</italic> class. Some inconsistencies in the database prevent us from retrieving the context of certain messages, which we remove from the set. After this cleaning, the <italic>Abuse</italic> class contains 655 messages. In order to keep a balanced dataset, we further extract the same number of messages at random from the ones that have not been flagged as abusive. This constitutes our <italic>Non-abuse</italic> class. Each message, whatever its class, is associated with its surrounding context (i.e., messages posted in the same thread).</p>
<p>The graph extraction method used to produce the graph-based features requires setting certain parameters. We use the values matching the best performance, obtained during the greedy search of the parameter space performed in Papegnies et al. (<xref ref-type="bibr" rid="B13">2019</xref>). In particular, regarding the two most important parameters (see section 2.2), we fix the <italic>context period</italic> size to 1,350 messages and the <italic>sliding window</italic> length to 10 messages. Implementation-wise, we use the iGraph library (Csardi and Nepusz, <xref ref-type="bibr" rid="B5">2006</xref>) to extract the conversational networks and compute the corresponding features. We use the Sklearn toolkit (Pedregosa et al., <xref ref-type="bibr" rid="B15">2011</xref>) to compute the text-based features, and its SVC (C-Support Vector Classification) implementation for the SVM classifier. Because of the relatively small dataset, we set up our experiments using a 10-fold cross-validation. Each fold is balanced between the <italic>Abuse</italic> and <italic>Non-abuse</italic> classes, 70% of the dataset being used for training and 30% for testing.</p></sec>
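<p>The cross-validation setup can be sketched with scikit-learn as follows (random data standing in for the proprietary corpus; <italic>StratifiedKFold</italic> is one way to keep each fold balanced between the two classes):</p>

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(0)
features = rng.normal(size=(200, 459))  # placeholder graph-based features
labels = np.repeat([0, 1], 100)         # balanced Abuse / Non-abuse classes

# 10-fold cross-validation, each fold balanced between the two classes.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
f1_scores = cross_val_score(SVC(), features, labels, cv=cv, scoring="f1")
```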
<sec>
<title>3.2. Classification Performance</title>
<p><xref ref-type="table" rid="T1">Table 1</xref> presents the Precision, Recall and <italic>F</italic>-measure scores obtained on the <italic>Abuse</italic> class, for both baselines [<italic>Content-based</italic> (Papegnies et al., <xref ref-type="bibr" rid="B12">2017b</xref>) and <italic>Graph-based</italic> (Papegnies et al., <xref ref-type="bibr" rid="B13">2019</xref>)] and all three proposed fusion strategies (<italic>Early Fusion, Late Fusion</italic> and <italic>Hybrid Fusion</italic>). It also shows the number of features used to perform the classification, the time required to compute the features and perform the cross-validation (<italic>Total Runtime</italic>), and the average time required to process one message (<italic>Average Runtime</italic>). Note that <italic>Late Fusion</italic> has only 2 direct inputs (the content- and graph-based SVMs), but these in turn have their own inputs, which explains the values displayed in the table.</p>
<table-wrap position="float" id="T1">
<label>Table 1</label>
<caption><p>Comparison of the performances obtained with the methods (<italic>Content-based, Graph-based, Fusion</italic>) and their subsets of <italic>Top Features</italic> (TF).</p></caption>
<table frame="hsides" rules="groups">
<thead><tr>
<th valign="top" align="left"><bold>Method</bold></th>
<th valign="top" align="center"><bold>Number of features</bold></th>
<th valign="top" align="center"><bold>Total runtime</bold></th>
<th valign="top" align="center"><bold>Average runtime</bold></th>
<th valign="top" align="center"><bold>Precision</bold></th>
<th valign="top" align="center"><bold>Recall</bold></th>
<th valign="top" align="center"><bold>F-measure</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">Content-Based</td>
<td valign="top" align="center">29</td>
<td valign="top" align="center">0:52</td>
<td valign="top" align="center">0.02s</td>
<td valign="top" align="center">78.59</td>
<td valign="top" align="center">83.61</td>
<td valign="top" align="center">81.02</td>
</tr>
<tr>
<td valign="top" align="left">Content-Based TF</td>
<td valign="top" align="center">3</td>
<td valign="top" align="center">0:21</td>
<td valign="top" align="center">0.01s</td>
<td valign="top" align="center">75.82</td>
<td valign="top" align="center">82.57</td>
<td valign="top" align="center">79.05</td>
</tr> <tr style="border-top: thin solid #000000;">
<td valign="top" align="left">Graph-Based</td>
<td valign="top" align="center">459</td>
<td valign="top" align="center">8:19:10</td>
<td valign="top" align="center">7.56s</td>
<td valign="top" align="center">90.21</td>
<td valign="top" align="center">87.63</td>
<td valign="top" align="center">88.90</td>
</tr>
<tr>
<td valign="top" align="left">Graph-Based TF</td>
<td valign="top" align="center">10</td>
<td valign="top" align="center">14:22</td>
<td valign="top" align="center">0.03s</td>
<td valign="top" align="center">88.72</td>
<td valign="top" align="center">84.87</td>
<td valign="top" align="center">86.75</td>
</tr> <tr style="border-top: thin solid #000000;">
<td valign="top" align="left">Early Fusion</td>
<td valign="top" align="center">488</td>
<td valign="top" align="center">8:26:41</td>
<td valign="top" align="center">7.68s</td>
<td valign="top" align="center">91.25</td>
<td valign="top" align="center">89.45</td>
<td valign="top" align="center">90.34</td>
</tr>
<tr>
<td valign="top" align="left">Early Fusion TF</td>
<td valign="top" align="center">4</td>
<td valign="top" align="center">11:29</td>
<td valign="top" align="center">0.17s</td>
<td valign="top" align="center">89.09</td>
<td valign="top" align="center">87.12</td>
<td valign="top" align="center">88.09</td>
</tr> <tr style="border-top: thin solid #000000;">
<td valign="top" align="left">Late Fusion</td>
<td valign="top" align="center">488 (2)</td>
<td valign="top" align="center">8:23:57</td>
<td valign="top" align="center">7.64s</td>
<td valign="top" align="center">94.10</td>
<td valign="top" align="center">92.43</td>
<td valign="top" align="center">93.26</td>
</tr>
<tr>
<td valign="top" align="left">Late Fusion TF</td>
<td valign="top" align="center">13</td>
<td valign="top" align="center">15:42</td>
<td valign="top" align="center">0.24s</td>
<td valign="top" align="center">91.64</td>
<td valign="top" align="center">89.97</td>
<td valign="top" align="center">90.80</td>
</tr> <tr style="border-top: thin solid #000000;">
<td valign="top" align="left">Hybrid Fusion</td>
<td valign="top" align="center">490</td>
<td valign="top" align="center">8:27:01</td>
<td valign="top" align="center">7.68s</td>
<td valign="top" align="center">91.96</td>
<td valign="top" align="center">90.48</td>
<td valign="top" align="center">91.22</td>
</tr>
<tr>
<td valign="top" align="left">Hybrid Fusion TF</td>
<td valign="top" align="center">4</td>
<td valign="top" align="center">16:57</td>
<td valign="top" align="center">0.26s</td>
<td valign="top" align="center">90.74</td>
<td valign="top" align="center">89.00</td>
<td valign="top" align="center">89.86</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p><italic>The total runtime is expressed as h:min:s. See text for details</italic>.</p>
</table-wrap-foot>
</table-wrap>
<p>Our first observation is that we get higher <italic>F</italic>-measure values compared to both baselines when performing the fusion, independently of the fusion strategy. This confirms what we expected, i.e., that the information encoded in the interactions between the users differs from the information conveyed by the content of the messages they exchange. Moreover, this shows that both sources are at least partly complementary, since the performance increases when merging them. On a side note, the correlation between the scores of the graph- and content-based classifiers is 0.56, which is consistent with these observations.</p>
<p>Next, when comparing the fusion strategies, it appears that <italic>Late Fusion</italic> performs better than the others, with an <italic>F</italic>-measure of 93.26. This is somewhat surprising: we were expecting superior results from the <italic>Early Fusion</italic>, which has direct access to a much larger number of <italic>raw</italic> features (488). By comparison, the <italic>Late Fusion</italic> only gets 2 features, which are themselves the outputs of two other classifiers. This means that the <italic>Content-Based</italic> and <italic>Graph-Based</italic> classifiers do a good job of summarizing their inputs, without losing much of the information necessary to efficiently perform the classification task. Moreover, we assume that the <italic>Early Fusion</italic> classifier struggles to estimate an appropriate model when dealing with such a large number of features, whereas the <italic>Late Fusion</italic> one benefits from the pre-processing performed by its two predecessors, which act as a form of dimensionality reduction. This seems to be confirmed by the results of the <italic>Hybrid Fusion</italic>, which produces better results than the <italic>Early Fusion</italic>, but is still below the <italic>Late Fusion</italic>. This point could be explored by switching to a classification algorithm less sensitive to the number of features. Alternatively, the three SVMs used for the <italic>Late Fusion</italic> can be seen as a very basic Multilayer Perceptron, in which each neuron has been trained separately (without system-wide backpropagation). This could indicate that using a regular Multilayer Perceptron directly on the raw features could lead to improved results, especially if enough training data is available.</p>
<p>Regarding runtime, the graph-based approach takes more than 8 h to run on the whole corpus, mainly because of the feature computation step. This is due to the number of features, and to the compute-intensive nature of some of them. The content-based approach is much faster, with a total runtime of &#x0003C; 1 min, for the exact opposite reasons. Fusion methods require computing both content- and graph-based features, so they have the longest runtime.</p></sec>
<sec>
<title>3.3. Feature Study</title>
<p>We now want to identify the most discriminative features for all three fusion strategies. We apply an iterative method based on the <italic>Sklearn</italic> toolkit, which allows us to fit a linear kernel SVM to the dataset and obtain a ranking of the input features reflecting their importance in the classification process. Using this ranking, we identify the least discriminative feature, remove it from the dataset, and train a new model with the remaining features. The impact of this deletion is measured by the performance difference, in terms of <italic>F</italic>-measure. We iterate this process until only one feature remains. We call <italic>Top Features</italic> (TF) the minimal subset of features allowing us to reach 97% of the original performance (obtained with the complete feature set).</p>
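<p>The elimination loop described above can be sketched as follows, using the magnitude of a linear SVM's coefficients as the importance ranking; the synthetic data and evaluation on the training set are simplifying assumptions for illustration.</p>

```python
# Iterative feature-elimination sketch: at each step, fit a linear SVM,
# drop the feature with the smallest |coefficient|, and record the
# F-measure, until one feature remains. Synthetic data for illustration.
import numpy as np
from sklearn.svm import LinearSVC
from sklearn.metrics import f1_score

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 10))
y = (X[:, 0] + 0.5 * X[:, 3] + rng.normal(0, 0.5, 300) > 0).astype(int)

remaining = list(range(X.shape[1]))
history = []
while len(remaining) > 1:
    svm = LinearSVC(max_iter=10000).fit(X[:, remaining], y)
    f1 = f1_score(y, svm.predict(X[:, remaining]))
    history.append((list(remaining), f1))
    weakest = np.argmin(np.abs(svm.coef_[0]))  # least discriminative feature
    remaining.pop(weakest)

# "Top Features": smallest subset keeping >= 97% of the full-set F-measure
full_f1 = history[0][1]
top = min((h for h in history if h[1] >= 0.97 * full_f1),
          key=lambda h: len(h[0]))
print(len(top[0]), round(top[1], 3))
```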
<p>We apply this process to both baselines and all three fusion strategies. We then perform a classification using only their respective TF. The results are presented in <xref ref-type="table" rid="T1">Table 1</xref>. Note that the <italic>Late Fusion TF</italic> performance is obtained using the scores produced by the SVMs trained on the <italic>Content-based TF</italic> and <italic>Graph-based TF</italic>. These scores are also used as features when computing the TF for <italic>Hybrid Fusion TF</italic> (together with the raw content- and graph-based features). In terms of classification performance, by construction, the methods are ranked exactly as when considering all available features.</p>
<p>The <italic>Top Features</italic> obtained for each method are listed in <xref ref-type="table" rid="T2">Table 2</xref>. The last 4 columns specify which variants of the graph-based features are concerned. Indeed, as explained in section 2.2, most of these topological measures can handle or ignore edge weights and/or edge directions, can be vertex- or graph-focused, and can be computed for each of the three types of networks (<italic>Before, After</italic>, and <italic>Full</italic>).</p>
<table-wrap position="float" id="T2">
<label>Table 2</label>
<caption><p>Top features obtained for our 5 methods.</p></caption>
<table frame="hsides" rules="groups">
<thead><tr>
<th valign="top" align="left"><bold>Method</bold></th>
<th valign="top" align="left"><bold>Top Features</bold></th>
<th valign="top" align="center"><bold>Graph</bold></th>
<th valign="top" align="center"><bold>Weights</bold></th>
<th valign="top" align="center"><bold>Directions</bold></th>
<th valign="top" align="center"><bold>Scale</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">Content-Based</td>
<td valign="top" align="left">Naive Bayes</td>
<td valign="top" align="center">&#x02013;</td>
<td valign="top" align="center">&#x02013;</td>
<td valign="top" align="center">&#x02013;</td>
<td valign="top" align="center">&#x02013;</td>
</tr>
<tr>
<td/>
<td valign="top" align="left"><italic>tf</italic>&#x02013;<italic>idf</italic> Abuse Score</td>
<td valign="top" align="center">&#x02013;</td>
<td valign="top" align="center">&#x02013;</td>
<td valign="top" align="center">&#x02013;</td>
<td valign="top" align="center">&#x02013;</td>
</tr>
<tr>
<td/>
<td valign="top" align="left">Character Capital Ratio</td>
<td valign="top" align="center">&#x02013;</td>
<td valign="top" align="center">&#x02013;</td>
<td valign="top" align="center">&#x02013;</td>
<td valign="top" align="center">&#x02013;</td>
</tr> <tr style="border-top: thin solid #000000;">
<td valign="top" align="left">Graph-Based</td>
<td valign="top" align="left">Coreness Score</td>
<td valign="top" align="center">F</td>
<td valign="top" align="center">&#x02013;</td>
<td valign="top" align="center">I</td>
<td valign="top" align="center">G</td>
</tr>
<tr>
<td/>
<td valign="top" align="left">PageRank Centrality</td>
<td valign="top" align="center">A</td>
<td valign="top" align="center">U</td>
<td valign="top" align="center">D</td>
<td valign="top" align="center">N</td>
</tr>
<tr>
<td/>
<td valign="top" align="left">Strength Centrality</td>
<td valign="top" align="center">F</td>
<td valign="top" align="center">W</td>
<td valign="top" align="center">O</td>
<td valign="top" align="center">N</td>
</tr>
<tr>
<td/>
<td valign="top" align="left">Vertex Count</td>
<td valign="top" align="center">F</td>
<td valign="top" align="center">&#x02013;</td>
<td valign="top" align="center">&#x02013;</td>
<td valign="top" align="center">G</td>
</tr>
<tr>
<td/>
<td valign="top" align="left">Closeness Centrality</td>
<td valign="top" align="center">B</td>
<td valign="top" align="center">W</td>
<td valign="top" align="center">O</td>
<td valign="top" align="center">G</td>
</tr>
<tr>
<td/>
<td valign="top" align="left">Closeness Centrality</td>
<td valign="top" align="center">B</td>
<td valign="top" align="center">W</td>
<td valign="top" align="center">O</td>
<td valign="top" align="center">N</td>
</tr>
<tr>
<td/>
<td valign="top" align="left">Authority Score</td>
<td valign="top" align="center">B</td>
<td valign="top" align="center">W</td>
<td valign="top" align="center">D</td>
<td valign="top" align="center">G</td>
</tr>
<tr>
<td/>
<td valign="top" align="left">Hub Score</td>
<td valign="top" align="center">B</td>
<td valign="top" align="center">U</td>
<td valign="top" align="center">D</td>
<td valign="top" align="center">N</td>
</tr>
<tr>
<td/>
<td valign="top" align="left">Reciprocity</td>
<td valign="top" align="center">A</td>
<td valign="top" align="center">&#x02013;</td>
<td valign="top" align="center">D</td>
<td valign="top" align="center">G</td>
</tr>
<tr>
<td/>
<td valign="top" align="left">Closeness Centrality</td>
<td valign="top" align="center">A</td>
<td valign="top" align="center">W</td>
<td valign="top" align="center">U</td>
<td valign="top" align="center">N</td>
</tr> <tr style="border-top: thin solid #000000;">
<td valign="top" align="left">Early Fusion</td>
<td valign="top" align="left">Coreness Score</td>
<td valign="top" align="center">A</td>
<td valign="top" align="center">&#x02013;</td>
<td valign="top" align="center">O</td>
<td valign="top" align="center">G</td>
</tr>
<tr>
<td/>
<td valign="top" align="left">Coreness Score</td>
<td valign="top" align="center">B</td>
<td valign="top" align="center">&#x02013;</td>
<td valign="top" align="center">I</td>
<td valign="top" align="center">G</td>
</tr>
<tr>
<td/>
<td valign="top" align="left">Eccentricity</td>
<td valign="top" align="center">B</td>
<td valign="top" align="center">&#x02013;</td>
<td valign="top" align="center">I</td>
<td valign="top" align="center">G</td>
</tr>
<tr>
<td/>
<td valign="top" align="left">Naive Bayes</td>
<td valign="top" align="center">&#x02013;</td>
<td valign="top" align="center">&#x02013;</td>
<td valign="top" align="center">&#x02013;</td>
<td valign="top" align="center">&#x02013;</td>
</tr> <tr style="border-top: thin solid #000000;">
<td valign="top" align="left">Late Fusion</td>
<td valign="top" align="left"><italic>Content-Based TF</italic> &#x0222A; <italic>Graph-Based TF</italic></td>
<td valign="top" align="center">&#x02013;</td>
<td valign="top" align="center">&#x02013;</td>
<td valign="top" align="center">&#x02013;</td>
<td valign="top" align="center">&#x02013;</td>
</tr> <tr style="border-top: thin solid #000000;">
<td valign="top" align="left">Hybrid Fusion</td>
<td valign="top" align="left">Graph-based output</td>
<td valign="top" align="center">&#x02013;</td>
<td valign="top" align="center">&#x02013;</td>
<td valign="top" align="center">&#x02013;</td>
<td valign="top" align="center">&#x02013;</td>
</tr>
<tr>
<td/>
<td valign="top" align="left">Content-based output</td>
<td valign="top" align="center">&#x02013;</td>
<td valign="top" align="center">&#x02013;</td>
<td valign="top" align="center">&#x02013;</td>
<td valign="top" align="center">&#x02013;</td>
</tr>
<tr>
<td/>
<td valign="top" align="left">Strength Centrality</td>
<td valign="top" align="center">A</td>
<td valign="top" align="center">W</td>
<td valign="top" align="center">O</td>
<td valign="top" align="center">N</td>
</tr>
<tr>
<td/>
<td valign="top" align="left">Coreness Score</td>
<td valign="top" align="center">B</td>
<td valign="top" align="center">&#x02013;</td>
<td valign="top" align="center">I</td>
<td valign="top" align="center">G</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p><italic>The letters in the Graph column stand for Before (B), After (A), and Full (F). Those in the Weights and Directions columns stand for: Unweighted or Undirected (U), Weighted (W), Directed (D), Incoming (I), and Outgoing (O). Those in the Scale column mean Graph-scale (G) or Vertex-scale (N)</italic>.</p>
</table-wrap-foot>
</table-wrap>
<p>There are three <italic>Content-Based TF</italic>. The first is the <italic>Naive Bayes</italic> prediction, which is not surprising as it comes from a fully fledged classifier processing BoWs. The second is the <italic>tf</italic>&#x02013;<italic>idf</italic> <italic>score</italic> computed over the <italic>Abuse</italic> class, which shows that considering term frequencies does improve the classification performance. The third is the <italic>Capital Ratio</italic> (proportion of capital letters in the comment), which is likely explained by abusive messages tending to be shouted, and therefore written in capitals. The <italic>Graph-Based TF</italic> are discussed in depth in our previous article (Papegnies et al., <xref ref-type="bibr" rid="B13">2019</xref>). To summarize, the most important features help detect changes in the direct neighborhood of the targeted author (Coreness, Strength), in the average node centrality at the level of the whole graph in terms of distance (Closeness), and in the general reciprocity of exchanges between users (Reciprocity).</p>
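<p>Two of these content-based features can be sketched as follows; the exact definitions follow our earlier content-based work, so take these functions, their names, and the tiny two-class corpus as illustrative approximations rather than the implementation used in the experiments.</p>

```python
# Sketches of the Capital Ratio and the class tf-idf abuse score.
# Names and exact weighting scheme are illustrative assumptions.
import math
import re
from collections import Counter

def capital_ratio(message: str) -> float:
    """Proportion of capital letters among the letters of the message."""
    letters = [c for c in message if c.isalpha()]
    return sum(c.isupper() for c in letters) / len(letters) if letters else 0.0

def tfidf_abuse_score(message, abuse_docs, clean_docs):
    """Sum of tf-idf weights of the message's words, with term frequency
    estimated on the Abuse class of a (tiny, hypothetical) corpus."""
    docs = abuse_docs + clean_docs
    tokens = lambda d: re.findall(r"\w+", d.lower())
    abuse_tf = Counter(w for d in abuse_docs for w in tokens(d))
    df = Counter(w for d in docs for w in set(tokens(d)))
    return sum(abuse_tf[w] * math.log(len(docs) / (1 + df[w]))
               for w in tokens(message))

print(capital_ratio("STOP POSTING"))  # shouted message -> 1.0
```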
<p>We obtain 4 features for <italic>Early Fusion TF</italic>. One is the <italic>Naive Bayes</italic> feature (content-based), and the other three are topological measures (graph-based features). Two of the latter correspond to the Coreness of the targeted author, computed for the <italic>Before</italic> and <italic>After</italic> graphs. The third topological measure is their Eccentricity. This reflects important changes in the interactions around the targeted author, likely caused by angry users piling up on the abusive user after they post an inflammatory remark. For <italic>Hybrid Fusion TF</italic>, we also get 4 features, but these include, in first place, both SVM outputs from the content- and graph-based classifiers. They are complemented by 2 graph-based features: Strength (also found in the <italic>Graph-based</italic> and <italic>Late Fusion TF</italic>) and Coreness (also found in the <italic>Graph-based, Early Fusion</italic> and <italic>Late Fusion TF</italic>).</p>
<p>Besides providing a better understanding of the dataset and classification process, one interesting use of the TF is that they can decrease the computational cost of the classification. In our case, this is true for all methods: we retain 97% of the performance while using only a handful of features instead of hundreds. For instance, the <italic>Late Fusion TF</italic> require only 3% of the total <italic>Late Fusion</italic> runtime.</p></sec></sec>
<sec id="s4">
<title>4. Conclusion and Perspectives</title>
<p>In this article, we tackle the problem of automatic abuse detection in online communities. We take advantage of the methods that we previously developed to leverage message content (Papegnies et al., <xref ref-type="bibr" rid="B11">2017a</xref>) and interactions between users (Papegnies et al., <xref ref-type="bibr" rid="B13">2019</xref>), and create a new method using both types of information simultaneously. We show that the features extracted by our content- and graph-based approaches are complementary, and that combining them noticeably improves the results, up to an <italic>F</italic>-measure of 93.26. One limitation of our method is the computational time required to extract certain features. However, we show that using only a small subset of relevant features dramatically reduces the processing time (down to 3%) while keeping more than 97% of the original performance.</p>
<p>Another limitation of our work is the small size of our dataset. We must find other corpora to test our methods at a much larger scale. However, all the available datasets are composed of isolated messages, whereas we need whole threads to make the most of our approach. A solution could be to start from datasets such as the Wikipedia-based corpus proposed by Wulczyn et al. (<xref ref-type="bibr" rid="B17">2017</xref>), and complete them by reconstructing the original conversations containing the annotated messages. This would also be an opportunity to test our methods on a language other than French. Our content-based method may be impacted by this change, but this should not be the case for the graph-based method, as it is independent of the content (and therefore the language). Besides language, a different online community is likely to behave differently from the one we studied before. In particular, its members could react differently to abuse. The Wikipedia dataset would therefore allow us to assess how such cultural differences affect our classifiers, and to identify which observations made for Space Origin still apply to Wikipedia.</p></sec>
<sec id="s5">
<title>Data Availability</title>
<p>The datasets for this manuscript are not publicly available because the dataset is private. Requests to access the data should be addressed to the corresponding author, V. Labatut.</p></sec>
<sec id="s6">
<title>Author Contributions</title>
<p>All authors listed have made a substantial, direct and intellectual contribution to the work, and approved it for publication.</p>
<sec>
<title>Conflict of Interest Statement</title>
<p>The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p></sec></sec>
</body>
<back>
<ref-list>
<title>References</title>
<ref id="B1">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Balci</surname> <given-names>K.</given-names></name> <name><surname>Salah</surname> <given-names>A. A.</given-names></name></person-group> (<year>2015</year>). <article-title>Automatic analysis and identification of verbal aggression and abusive behaviors for online social games</article-title>. <source>Comput. Hum. Behav.</source> <volume>53</volume>, <fpage>517</fpage>&#x02013;<lpage>526</lpage>. <pub-id pub-id-type="doi">10.1016/j.chb.2014.10.025</pub-id></citation></ref>
<ref id="B2">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Batista</surname> <given-names>L. V.</given-names></name> <name><surname>Meira</surname> <given-names>M. M.</given-names></name></person-group> (<year>2004</year>). <article-title>&#x0201C;Texture classification using the lempel-ziv-welch algorithm,&#x0201D;</article-title> in <source>Brazilian Symposium on Artificial Intelligence</source> (<publisher-loc>Berlin</publisher-loc>), <fpage>444</fpage>&#x02013;<lpage>453</lpage>. <pub-id pub-id-type="doi">10.1007/978-3-540-28645-5-45</pub-id></citation></ref>
<ref id="B3">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Camtepe</surname> <given-names>A.</given-names></name> <name><surname>Krishnamoorthy</surname> <given-names>M. S.</given-names></name> <name><surname>Yener</surname> <given-names>B.</given-names></name></person-group> (<year>2004</year>). <article-title>&#x0201C;A tool for Internet chatroom surveillance,&#x0201D;</article-title> in <source>International Conference on Intelligence and Security Informatics, Vol 3073 of Lecture Notes in Computer Science</source> (<publisher-loc>Berlin</publisher-loc>: <publisher-name>Springer</publisher-name>), <fpage>252</fpage>&#x02013;<lpage>265</lpage>. <pub-id pub-id-type="doi">10.1007/978-3-540-25952-7-19</pub-id></citation></ref>
<ref id="B4">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Chen</surname> <given-names>Y.</given-names></name> <name><surname>Zhou</surname> <given-names>Y.</given-names></name> <name><surname>Zhu</surname> <given-names>S.</given-names></name> <name><surname>Xu</surname> <given-names>H.</given-names></name></person-group> (<year>2012</year>). <article-title>&#x0201C;Detecting offensive language in social media to protect adolescent online safety,&#x0201D;</article-title> in <source>International Conference on Privacy, Security, Risk and Trust and International Conference on Social Computing</source> (<publisher-loc>Amsterdam</publisher-loc>: <publisher-name>IEEE</publisher-name>), <fpage>71</fpage>&#x02013;<lpage>80</lpage>.</citation></ref>
<ref id="B5">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Csardi</surname> <given-names>G.</given-names></name> <name><surname>Nepusz</surname> <given-names>T.</given-names></name></person-group> (<year>2006</year>). <article-title>The igraph software package for complex network research</article-title>. <source>Int. J.</source> <volume>1695</volume>, <fpage>1</fpage>&#x02013;<lpage>9</lpage>.</citation></ref>
<ref id="B6">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Dinakar</surname> <given-names>K.</given-names></name> <name><surname>Reichart</surname> <given-names>R.</given-names></name> <name><surname>Lieberman</surname> <given-names>H.</given-names></name></person-group> (<year>2011</year>). <article-title>&#x0201C;Modeling the detection of textual cyberbullying,&#x0201D;</article-title> in <source>5th International AAAI Conference on Weblogs and Social Media / Workshop on the Social Mobile Web</source> (<publisher-loc>Barcelona</publisher-loc>: <publisher-name>AAAI</publisher-name>), <fpage>11</fpage>&#x02013;<lpage>17</lpage>.</citation></ref>
<ref id="B7">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Forestier</surname> <given-names>M.</given-names></name> <name><surname>Velcin</surname> <given-names>J.</given-names></name> <name><surname>Zighed</surname> <given-names>D.</given-names></name></person-group> (<year>2011</year>). <article-title>&#x0201C;Extracting social networks to understand interaction,&#x0201D;</article-title> in <source>International Conference on Advances in Social Networks Analysis and Mining</source> (<publisher-loc>Kaohsiung</publisher-loc>: <publisher-name>IEEE</publisher-name>), <fpage>213</fpage>&#x02013;<lpage>219</lpage>. <pub-id pub-id-type="doi">10.1109/ASONAM.2011.64</pub-id></citation></ref>
<ref id="B8">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hosseini</surname> <given-names>H.</given-names></name> <name><surname>Kannan</surname> <given-names>S.</given-names></name> <name><surname>Zhang</surname> <given-names>B.</given-names></name> <name><surname>Poovendran</surname> <given-names>R.</given-names></name></person-group> (<year>2017</year>). <article-title>Deceiving google&#x00027;s perspective api built for detecting toxic comments</article-title>. <source>arXiv</source> arXiv:1702.08138.</citation></ref>
<ref id="B9">
<citation citation-type="web"><person-group person-group-type="author"><name><surname>Mishra</surname> <given-names>P.</given-names></name> <name><surname>Yannakoudakis</surname> <given-names>H.</given-names></name> <name><surname>Shutova</surname> <given-names>E.</given-names></name></person-group> (<year>2018</year>). <article-title>&#x0201C;Neural character-based composition models for abuse detection,&#x0201D;</article-title> in <source>2nd Workshop on Abusive Language Online</source> (<publisher-loc>Brussels</publisher-loc>: <publisher-name>Association for Computational Linguistics</publisher-name>), <fpage>1</fpage>&#x02013;<lpage>10</lpage>. Available online at: <ext-link ext-link-type="uri" xlink:href="https://www.aclweb.org/anthology/W18-5101">https://www.aclweb.org/anthology/W18-5101</ext-link></citation></ref>
<ref id="B10">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Mutton</surname> <given-names>P.</given-names></name></person-group> (<year>2004</year>). <article-title>&#x0201C;Inferring and visualizing social networks on Internet Relay Chat,&#x0201D;</article-title> in <source>8th International Conference on Information Visualisation</source> (<publisher-loc>London</publisher-loc>: <publisher-name>IEEE</publisher-name>) <fpage>35</fpage>&#x02013;<lpage>43</lpage>.</citation></ref>
<ref id="B11">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Papegnies</surname> <given-names>E.</given-names></name> <name><surname>Labatut</surname> <given-names>V.</given-names></name> <name><surname>Dufour</surname> <given-names>R.</given-names></name> <name><surname>Linares</surname> <given-names>G.</given-names></name></person-group> (<year>2017a</year>). <article-title>&#x0201C;Graph-based features for automatic online abuse detection,&#x0201D;</article-title> in <source>International Conference on Statistical Language and Speech Processing, volume 10583 of Lecture Notes in Computer Science</source> (<publisher-loc>Berlin</publisher-loc>: <publisher-name>Springer</publisher-name>), <fpage>70</fpage>&#x02013;<lpage>81</lpage>. <pub-id pub-id-type="doi">10.1007/978-3-319-68456-7-6</pub-id></citation></ref>
<ref id="B12">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Papegnies</surname> <given-names>E.</given-names></name> <name><surname>Labatut</surname> <given-names>V.</given-names></name> <name><surname>Dufour</surname> <given-names>R.</given-names></name> <name><surname>Linares</surname> <given-names>G.</given-names></name></person-group> (<year>2017b</year>). <article-title>&#x0201C;Impact of content features for automatic online abuse detection,&#x0201D;</article-title> in <source>International Conference on Computational Linguistics and Intelligent Text Processing, volume 10762 of Lecture Notes in Computer Science</source> (<publisher-loc>Berlin</publisher-loc>:<publisher-name>Springer</publisher-name>), <fpage>404</fpage>&#x02013;<lpage>419</lpage>. <pub-id pub-id-type="doi">10.1007/978-3-319-77116-8-30</pub-id></citation></ref>
<ref id="B13">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Papegnies</surname> <given-names>E.</given-names></name> <name><surname>Labatut</surname> <given-names>V.</given-names></name> <name><surname>Dufour</surname> <given-names>R.</given-names></name> <name><surname>Linares</surname> <given-names>G.</given-names></name></person-group> (<year>2019</year>). <article-title>Conversational networks for automatic online moderation</article-title>. <source>IEEE Trans. Comput. Soc. Syst.</source> <volume>6</volume>, <fpage>38</fpage>&#x02013;<lpage>55</lpage>. <pub-id pub-id-type="doi">10.1109/TCSS.2018.2887240</pub-id></citation></ref>
<ref id="B14">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Pavlopoulos</surname> <given-names>J.</given-names></name> <name><surname>Malakasiotis</surname> <given-names>P.</given-names></name> <name><surname>Androutsopoulos</surname> <given-names>I.</given-names></name></person-group> (<year>2017</year>). <article-title>&#x0201C;Deep learning for user comment moderation,&#x0201D;</article-title> in <source>1st Workshop on Abusive Language Online</source> (<publisher-loc>Vancouver, BC</publisher-loc>: <publisher-name>ACL</publisher-name>), <fpage>25</fpage>&#x02013;<lpage>35</lpage>. <pub-id pub-id-type="doi">10.18653/v1/W17-3004</pub-id></citation></ref>
<ref id="B15">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Pedregosa</surname> <given-names>F.</given-names></name> <name><surname>Varoquaux</surname> <given-names>G.</given-names></name> <name><surname>Gramfort</surname> <given-names>A.</given-names></name> <name><surname>Michel</surname> <given-names>V.</given-names></name> <name><surname>Thirion</surname> <given-names>B.</given-names></name> <name><surname>Grisel</surname> <given-names>O.</given-names></name> <etal/></person-group>. (<year>2011</year>). <article-title>Scikit-learn: machine learning in Python</article-title>. <source>J. Mach. Learn. Res.</source> <volume>12</volume>, <fpage>2825</fpage>&#x02013;<lpage>2830</lpage>.</citation></ref>
<ref id="B16">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Spertus</surname> <given-names>E.</given-names></name></person-group> (<year>1997</year>). <article-title>&#x0201C;Smokey: automatic recognition of hostile messages,&#x0201D;</article-title> in <source>14th National Conference on Artificial Intelligence and 9th Conference on Innovative Applications of Artificial Intelligence</source> (<publisher-loc>Providence, RI</publisher-loc>: <publisher-name>AAAI</publisher-name>), <fpage>1058</fpage>&#x02013;<lpage>1065</lpage>.</citation></ref>
<ref id="B17">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Wulczyn</surname> <given-names>E.</given-names></name> <name><surname>Thain</surname> <given-names>N.</given-names></name> <name><surname>Dixon</surname> <given-names>L.</given-names></name></person-group> (<year>2017</year>). <article-title>&#x0201C;Ex Machina: personal attacks seen at scale,&#x0201D;</article-title> in <source>26th International Conference on World Wide Web</source> (<publisher-loc>Geneva</publisher-loc>), <fpage>1391</fpage>&#x02013;<lpage>1399</lpage>. <pub-id pub-id-type="doi">10.1145/3038912.3052591</pub-id></citation></ref>
<ref id="B18">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Yin</surname> <given-names>D.</given-names></name> <name><surname>Xue</surname> <given-names>Z.</given-names></name> <name><surname>Hong</surname> <given-names>L.</given-names></name> <name><surname>Davison</surname> <given-names>B. D.</given-names></name> <name><surname>Kontostathis</surname> <given-names>A.</given-names></name> <name><surname>Edwards</surname> <given-names>L.</given-names></name></person-group> (<year>2009</year>). <article-title>Detection of harassment on Web 2.0</article-title>. in <source>WWW Workshop: Content Analysis in the Web 2.0</source> (<publisher-loc>Madrid</publisher-loc>) <fpage>1</fpage>&#x02013;<lpage>7</lpage>.</citation></ref>
</ref-list>
<fn-group>
<fn id="fn0001"><p><sup>1</sup><ext-link ext-link-type="uri" xlink:href="https://play.spaceorigin.fr/">https://play.spaceorigin.fr/</ext-link></p></fn>
</fn-group>
</back>
</article>