<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article xml:lang="EN" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Artif. Intell.</journal-id>
<journal-title>Frontiers in Artificial Intelligence</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Artif. Intell.</abbrev-journal-title>
<issn pub-type="epub">2624-8212</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.3389/frai.2023.1230383</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Artificial Intelligence</subject>
<subj-group>
<subject>Original Research</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>Automated facial characterization and image retrieval by convolutional neural networks</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author" corresp="yes" equal-contrib="yes">
<name><surname>Shah</surname> <given-names>Syed Taimoor Hussain</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
<xref ref-type="corresp" rid="c001"><sup>&#x0002A;</sup></xref>
<xref ref-type="author-notes" rid="fn001"><sup>&#x02020;</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/2327146/overview"/>
</contrib>
<contrib contrib-type="author" equal-contrib="yes">
<name><surname>Shah</surname> <given-names>Syed Adil Hussain</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
<xref ref-type="aff" rid="aff2"><sup>2</sup></xref>
<xref ref-type="author-notes" rid="fn001"><sup>&#x02020;</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/2596615/overview"/>
</contrib>
<contrib contrib-type="author">
<name><surname>Qureshi</surname> <given-names>Shahzad Ahmad</given-names></name>
<xref ref-type="aff" rid="aff3"><sup>3</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/1398339/overview"/>
</contrib>
<contrib contrib-type="author">
<name><surname>Di Terlizzi</surname> <given-names>Angelo</given-names></name>
<xref ref-type="aff" rid="aff2"><sup>2</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/2596642/overview"/>
</contrib>
<contrib contrib-type="author">
<name><surname>Deriu</surname> <given-names>Marco Agostino</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/658046/overview"/>
</contrib>
</contrib-group>
<aff id="aff1"><sup>1</sup><institution>PolitoBIOMed Lab, Department of Mechanical and Aerospace Engineering, Politecnico di Torino</institution>, <addr-line>Turin</addr-line>, <country>Italy</country></aff>
<aff id="aff2"><sup>2</sup><institution>Department of Research and Development (R&#x00026;D), GPI SpA</institution>, <addr-line>Trento</addr-line>, <country>Italy</country></aff>
<aff id="aff3"><sup>3</sup><institution>Department of Computer and Information Sciences, Pakistan Institute of Engineering and Applied Sciences</institution>, <addr-line>Islamabad</addr-line>, <country>Pakistan</country></aff>
<author-notes>
<fn fn-type="edited-by"><p>Edited by: Sreyasee Das Bhattacharjee, University at Buffalo, United States</p></fn>
<fn fn-type="edited-by"><p>Reviewed by: Bayram Akdemir, Konya Technical University, T&#x000FC;rkiye</p>
<p>Shi-Jinn Horng, Asia University, Taiwan</p></fn>
<corresp id="c001">&#x0002A;Correspondence: Syed Taimoor Hussain Shah <email>taimoor.shah&#x00040;polito.it</email></corresp>
<fn fn-type="equal" id="fn001"><p>&#x02020;These authors have contributed equally to this work</p></fn></author-notes>
<pub-date pub-type="epub">
<day>20</day>
<month>12</month>
<year>2023</year>
</pub-date>
<pub-date pub-type="collection">
<year>2023</year>
</pub-date>
<volume>6</volume>
<elocation-id>1230383</elocation-id>
<history>
<date date-type="received">
<day>14</day>
<month>07</month>
<year>2023</year>
</date>
<date date-type="accepted">
<day>21</day>
<month>11</month>
<year>2023</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#x000A9; 2023 Shah, Shah, Qureshi, Di Terlizzi and Deriu.</copyright-statement>
<copyright-year>2023</copyright-year>
<copyright-holder>Shah, Shah, Qureshi, Di Terlizzi and Deriu</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/"><p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.</p></license>
</permissions>
<abstract>
<sec>
<title>Introduction</title>
<p>Developing efficient methods to infer relations among different faces with numerous expressions, or within the same face at different times (e.g., disease progression), is an open issue in imaging-related research. In this study, we present a novel method for facial feature extraction, characterization, and identification based on classical computer vision coupled with deep learning and, more specifically, convolutional neural networks.</p>
</sec>
<sec>
<title>Methods</title>
<p>We describe the hybrid face characterization system named FRetrAIval (FRAI), which is a hybrid of the GoogleNet and AlexNet Neural Network (NN) models. Images analyzed by the FRAI network are preprocessed by computer vision techniques, such as an oriented gradient-based algorithm, which extracts only the face region from any kind of picture. The Aligned Face dataset (AFD) was used to train and test the FRAI solution for extracting image features. The Labeled Faces in the Wild (LFW) holdout dataset was used for external validation.</p>
</sec>
<sec>
<title>Results and discussion</title>
<p>Overall, in comparison to previous techniques, our methodology has shown much better results with k-Nearest Neighbors (KNN), yielding the maximum precision, recall, F1, and F2 score values (92.00, 92.66, 92.33, and 92.52%, respectively) for the AFD and 95.00% for each metric on the LFW dataset, which served as the training and testing datasets, respectively. The FRAI model may potentially be used in healthcare and criminology, as well as in many other applications where it is important to quickly identify facial features, much like a fingerprint, for a specific identification target.</p>
</sec></abstract>
<kwd-group>
<kwd>oriented gradient-based algorithm</kwd>
<kwd>convolutional neural networks</kwd>
<kwd>GoogLeNet</kwd>
<kwd>AlexNet</kwd>
<kwd>KNN</kwd>
<kwd>computer vision</kwd>
<kwd>facial features extraction</kwd>
</kwd-group>
<contract-sponsor id="cn001">H2020 Marie Sk&#x00142;odowska-Curie Actions<named-content content-type="fundref-id">10.13039/100010665</named-content></contract-sponsor>
<counts>
<fig-count count="10"/>
<table-count count="3"/>
<equation-count count="2"/>
<ref-count count="71"/>
<page-count count="16"/>
<word-count count="8726"/>
</counts>
<custom-meta-wrap>
<custom-meta>
<meta-name>section-at-acceptance</meta-name>
<meta-value>Machine Learning and Artificial Intelligence</meta-value>
</custom-meta>
</custom-meta-wrap>
</article-meta>
</front>
<body>
<sec id="s1">
<title>1 Introduction</title>
<p>The current era embodies the essence of a proverb attributed to a Chinese philosopher: &#x0201C;A picture is worth a thousand words&#x0201D; (Huffer et al., <xref ref-type="bibr" rid="B24">2019</xref>). Images play an important role in visual information since a picture can communicate very complex ideas in a relatively simple manner. However, processing and handling large data are cumbersome tasks. Since retrieved information is useless if the essence of the required information is missing from the output (Yang et al., <xref ref-type="bibr" rid="B65">2008</xref>), systems with efficient, minimal response times are required.</p>
<p>With the advancement of multimedia technology, various firms came to the forefront, providing platforms such as Facebook, Twitter, WhatsApp, Amazon, and eBay (Zhang and Wang, <xref ref-type="bibr" rid="B71">2012</xref>). Each platform is itself a vast ocean of multimedia data, on which different recommendation systems operate, delivering valuable information and recommendations to end-users according to their needs (Fayyaz et al., <xref ref-type="bibr" rid="B16">2020</xref>). Likewise, these platforms have led researchers to develop better and more efficient information retrieval algorithms.</p>
<p>Content-Based Image Retrieval (CBIR) is an active research field that continues to attract researchers seeking to meet the needs of the current era. The features used define the success of CBIR (Raghuwanshi and Tyagi, <xref ref-type="bibr" rid="B48">2019</xref>). Initially, researchers mainly focused on techniques for retrieving data via text-based queries with minimal time and acceptable accuracy (Chang and Fu, <xref ref-type="bibr" rid="B11">1980a</xref>,<xref ref-type="bibr" rid="B6">b</xref>; Chang and Kunii, <xref ref-type="bibr" rid="B9">1981</xref>; Tamura and Yokoya, <xref ref-type="bibr" rid="B58">1984</xref>; Chang et al., <xref ref-type="bibr" rid="B10">1988</xref>, <xref ref-type="bibr" rid="B7">1998</xref>; Chang and Hsu, <xref ref-type="bibr" rid="B8">1992</xref>). However, annotating and managing large data collections proved challenging, and researchers recognized the need for new, efficient ways of retrieving image-based data (Tyagi, <xref ref-type="bibr" rid="B60">2017</xref>). Through integrated efforts, a new approach was introduced in the form of feature extraction, such as shape-based information (Mezaris et al., <xref ref-type="bibr" rid="B40">2004</xref>; Zhang and Lu, <xref ref-type="bibr" rid="B68">2004</xref>; Yang et al., <xref ref-type="bibr" rid="B64">2005</xref>, <xref ref-type="bibr" rid="B65">2008</xref>; Zhang et al., <xref ref-type="bibr" rid="B67">2009</xref>) and color-based features (Flickner et al., <xref ref-type="bibr" rid="B18">1995</xref>; Pass and Pass, <xref ref-type="bibr" rid="B47">1996</xref>; Park et al., <xref ref-type="bibr" rid="B46">2013</xref>).</p>
<p>Over the past few years, several researchers have contributed to facial feature extraction and recognition (Samal and Iyengar, <xref ref-type="bibr" rid="B50">1992</xref>; Brunelli and Poggio, <xref ref-type="bibr" rid="B5">1993</xref>; Valentin et al., <xref ref-type="bibr" rid="B61">1994</xref>; Chellappa et al., <xref ref-type="bibr" rid="B12">1995</xref>). In general, face recognition systems are developed in two steps: (1) face detection and (2) face recognition. For face detection, a number of different methods have been introduced, such as edge representation (Jesorsky et al., <xref ref-type="bibr" rid="B26">2001</xref>; Singh et al., <xref ref-type="bibr" rid="B54">2016</xref>), gray-level information (Maio and Maltoni, <xref ref-type="bibr" rid="B36">2000</xref>; Feng and Yuen, <xref ref-type="bibr" rid="B17">2001</xref>), color-based information (Dai and Nakano, <xref ref-type="bibr" rid="B13">1996</xref>), neural network-based detection (Li et al., <xref ref-type="bibr" rid="B32">2015</xref>), morphology-based preprocessing (Han et al., <xref ref-type="bibr" rid="B22">2000</xref>), and geometrical face models (Jeng et al., <xref ref-type="bibr" rid="B25">1998</xref>).
For face recognition, by contrast, the following methods have been proposed: Eigenfaces (Swets, <xref ref-type="bibr" rid="B56">1996</xref>; Zhang et al., <xref ref-type="bibr" rid="B69">1997</xref>), hidden Markov models (Nefian and Hayes, <xref ref-type="bibr" rid="B43">1998</xref>), LDA-based techniques (Swets, <xref ref-type="bibr" rid="B56">1996</xref>; Belhumeur et al., <xref ref-type="bibr" rid="B3">1997</xref>), local autocorrelations with multiscale integration (Goudail et al., <xref ref-type="bibr" rid="B20">1996</xref>), discriminant eigenfeatures (Swets, <xref ref-type="bibr" rid="B56">1996</xref>), algebraic feature extraction (Liu et al., <xref ref-type="bibr" rid="B33">1993</xref>), and probabilistic visual learning for object representation (Moghaddam and Pentland, <xref ref-type="bibr" rid="B41">1997</xref>). In parallel with facial recognition, researchers are also making progress on facial expression analysis, which may be applied in fields such as pain intensity determination (Othman et al., <xref ref-type="bibr" rid="B44">2023</xref>), health prediction (Yiew et al., <xref ref-type="bibr" rid="B66">2023</xref>), vehicle driving (Ashlin Deepa et al., <xref ref-type="bibr" rid="B1">2023</xref>), security checking (Mao et al., <xref ref-type="bibr" rid="B38">2023</xref>), predicting the health status of babies from their facial expressions (Brahnam et al., <xref ref-type="bibr" rid="B4">2023</xref>), and the prediction of outcomes such as neurological status and pain. In this regard, Jiang and Yin (<xref ref-type="bibr" rid="B27">2022</xref>) developed a convolutional attention-based facial expression recognition system with a multi-feature fusion method, focusing on a convolutional attention module that learns facial expressions more effectively. In another study, Mao et al. (<xref ref-type="bibr" rid="B38">2023</xref>) used a customized convolutional deep learning model to train on facial expressions, focusing on deploying a region of interest (ROI) pooling layer with L2 regularization and learning rate decay mechanisms to train well on facial expression landmarks.</p>
<p>In facial recognition, Mahmood et al. (<xref ref-type="bibr" rid="B35">2022</xref>) suggested a facial image retrieval technique based on image texture features with defined pixel patterns, such as the local binary, ternary, and tetra-directional pixel patterns of the input image. PCA was used to select robust features from the collection of extracted feature sets. Finally, these authors used the Manhattan distance to compute feature similarity.</p>
<p>Zhang et al. (<xref ref-type="bibr" rid="B70">2021</xref>) proposed a novel data-structuring technique, the deep hashing method, which focuses on improving image retrieval speed with low memory consumption. On the basis of deep feature similarity, Zhang&#x00027;s model generates hashing codes from the features of the fully connected layer of a convolutional neural network (CNN) and builds a hashing database for retrieval. In this database, the label associated with each image populates a label matrix for classification and retrieval evaluation. A Hamming-cube learning technique is then used to compute central image features for inter-class separability, assisting the retrieval process against the query image.</p>
<p>Sato et al. (<xref ref-type="bibr" rid="B51">2019</xref>) presented a distinctive face retrieval framework for situations in which a user wants to find a particular person based on a single visual memory of that person. The user selects some pictures of that person containing target-specific information and passes them to the trained model; a deep convolutional network processes them, and the resulting features are matched for retrieval. Hasnat et al. (<xref ref-type="bibr" rid="B23">2019</xref>) proposed a new, efficient face image retrieval method, the discriminative ternary census transform histogram (DTCTH) technique, which is specialized in capturing only the required information. Later, Shukla and Kanungo (<xref ref-type="bibr" rid="B52">2019</xref>) also introduced a new approach for face recognition and retrieval, in which a bag of features is built by extracting visual words with the help of the Gray Wolf Optimization algorithm. Wu et al. (<xref ref-type="bibr" rid="B63">2010</xref>) proposed a new method for face-based image retrieval using indexing: they extracted local and global features of faces and quantized the local features, which contain the special properties of faces, into visual elements. Sun et al. (<xref ref-type="bibr" rid="B55">2019</xref>) introduced a new technique for facial images that show large variations in illumination, pose, and facial expression, combining the Face&#x0002B;&#x0002B; algorithm (Fan et al., <xref ref-type="bibr" rid="B15">2014</xref>) with convolutional neural networks for image retrieval.</p>
<p>Tarawneh et al. (<xref ref-type="bibr" rid="B59">2019</xref>) used deep learning methods to retrieve images by analyzing the face mesh. In their study, the authors focused on VGG16 and AlexNet for training and testing. With these networks, different feature representations of the faces were analyzed at different layers of the network and then used in the retrieval process for each query image.</p>
<p>In previous studies, different combinations of methods with convolutional neural networks have been reported for retrieval purposes. Some of these methods increase the complexity of the pipeline, while others still require improvement to retrieve the right information more efficiently and in less time.</p>
<p>In this study, we present a novel pipeline for image retrieval that merges the power of two well-known tools, i.e., GoogleNet and AlexNet. In addition, we have developed a novel convolutional neural network named FRetrAIval (FRAI) for face recognition training.<xref ref-type="fn" rid="fn0001"><sup>1</sup></xref> Two standard datasets were considered for training and testing: AFD (training and validation) and LFW (Gary et al., <xref ref-type="bibr" rid="B19">2008</xref>) (external validation), on which the pipeline showed better classification and retrieval results in terms of accuracy, recall, and precision.</p>
</sec>
<sec id="s2">
<title>2 Materials and methods</title>
<p>In this study, we used two standard datasets, the Aligned Face dataset (AFD) (see text footnote 1) and the LFW database (Gary et al., <xref ref-type="bibr" rid="B19">2008</xref>). Besides containing numerous facial identities for image retrieval purposes, both datasets include different facial expressions, such as happy, sad, neutral, angry, and surprised, as shown in <bold>Figure 10</bold>. This aspect suggests that the model&#x00027;s recognition is based not only on facial landmarks but also on expressions and appearance. In addition, AFD has a larger number of samples per class than the holdout dataset, which encouraged us to use it for training our proposed GoogLeNet-based CNN. After training, the LFW dataset was used for the testing phase of our methodology, as elaborated further in Section 3.</p>
<sec>
<title>2.1 Dataset and image preprocessing</title>
<p>A. In the AFD (see text footnote 1), over 10,000 images of 100 different celebrities were collected from Pinterest. An average of 100 images of each celebrity was included in this dataset.</p>
<p>B. A database for unconstrained face recognition, the LFW database (Gary et al., <xref ref-type="bibr" rid="B19">2008</xref>), was also used in this study. It comprises 13,233 face images of 5,749 unique people collected from the web. Each image contains the face of a person and is labeled with that person&#x00027;s name; 1,680 of the people in this dataset have two or more distinct photos. As shown in <xref ref-type="fig" rid="F1">Figure 1</xref>, dataset images were preprocessed with the help of a local computer-based crawler (Dalal and Triggs, <xref ref-type="bibr" rid="B14">2005</xref>).</p>
<fig id="F1" position="float">
<label>Figure 1</label>
<caption><p>Face region selection using an oriented gradient-based algorithm (Dalal and Triggs, <xref ref-type="bibr" rid="B14">2005</xref>) on AFD faces; the left image in each cell is the original, where the red box represents the detected face, and the right one is the automatically cropped face region.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="frai-06-1230383-g0001.tif"/>
</fig>
<p>This program iteratively scans the local disk drive and fetches images. Image names were decoded into labels, and the absolute paths were stored in a database with the corresponding labels, as shown in the flow diagram of our methodology in <xref ref-type="fig" rid="F2">Figure 2</xref>. Each image in the database was fetched and passed to the face detection model, which was based on the histogram of oriented gradients algorithm (Dalal and Triggs, <xref ref-type="bibr" rid="B14">2005</xref>). The face region thus obtained was stored in labeled face folders. Then, we loaded the data according to their labels and applied augmentation techniques.</p>
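<p>To illustrate the oriented-gradient idea underlying the face detector described above, the following minimal NumPy sketch computes a single Dalal&#x02013;Triggs-style cell histogram. This is an illustration of the descriptor, not the authors&#x00027; detection code; function and parameter names are our own.</p>

```python
import numpy as np

def hog_cell_histogram(patch, n_bins=9):
    """Histogram of oriented gradients for one grayscale cell.

    Gradients use simple [-1, 0, 1] central differences; each pixel
    votes its gradient magnitude into an unsigned-orientation bin
    covering 0-180 degrees (Dalal-Triggs style).
    """
    patch = patch.astype(float)
    gx = np.zeros_like(patch)
    gy = np.zeros_like(patch)
    gx[:, 1:-1] = patch[:, 2:] - patch[:, :-2]
    gy[1:-1, :] = patch[2:, :] - patch[:-2, :]
    mag = np.hypot(gx, gy)
    ang = np.rad2deg(np.arctan2(gy, gx)) % 180.0       # unsigned orientation
    bins = np.minimum((ang / (180.0 / n_bins)).astype(int), n_bins - 1)
    hist = np.zeros(n_bins)
    for b in range(n_bins):
        hist[b] = mag[bins == b].sum()
    return hist

# A vertical intensity edge produces purely horizontal gradients,
# so all votes land in the 0-degree orientation bin.
cell = np.zeros((8, 8))
cell[:, 4:] = 1.0
h = hog_cell_histogram(cell)
```

In the full detector, such cell histograms are block-normalized and concatenated into a window descriptor that a classifier scans over the image to localize faces.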
<fig id="F2" position="float">
<label>Figure 2</label>
<caption><p>An end-to-end automatic facial based image retrieval system.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="frai-06-1230383-g0002.tif"/>
</fig>
</sec>
<sec>
<title>2.2 Integration of GoogleNet and AlexNet CNNs</title>
<p>GoogleNet (Szegedy et al., <xref ref-type="bibr" rid="B57">2014</xref>) and AlexNet were the two CNNs considered for this study. GoogleNet (Szegedy et al., <xref ref-type="bibr" rid="B57">2014</xref>) is a convolutional neural network with a total of 144 layers, comprising convolutional, pooling, fully connected, and softmax layers. The AlexNet network has a total of nine layers, including the input layer and convolutional and fully connected layers. As the last layers of this network, the AlexNet authors introduced three fully connected layers.</p>
<p>The FRAI network uses a modified version of the GoogleNet network in which three further layers are added, as shown in <xref ref-type="fig" rid="F3">Figure 3</xref>. Two of these layers were taken from the AlexNet (<xref ref-type="bibr" rid="B30">Krizhevsky et al., n.d</xref>.) network: two fully connected layers of 500 interconnected neurons each. A final fully connected layer was then added, i.e., the classification layer, with the same number of neurons as the number of identities in the training dataset. In other words, the last classification layer is sized according to the required number of classes.</p>
<fig id="F3" position="float">
<label>Figure 3</label>
<caption><p>Summary of proposed novel FRAI layers.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="frai-06-1230383-g0003.tif"/>
</fig>
<p>For the FRAI architecture, we chose these two networks because of their use in many image recognition studies, such as those of Wu et al. (<xref ref-type="bibr" rid="B62">2015</xref>) and Mehendale (<xref ref-type="bibr" rid="B39">2020</xref>). Individually, GoogleNet is parameter-efficient thanks to its inception design, which uses multiple filter sizes, such as 1 &#x000D7; 1, 3 &#x000D7; 3, and 5 &#x000D7; 5, within the same layer. This allows the network to capture features at different spatial scales without significantly increasing the number of parameters. In addition, it uses 1 &#x000D7; 1 filters within the inception modules to reduce computational complexity. Moreover, its inception module has inspired other CNN architectures, many of which have incorporated it. On the other hand, we adopted the behavior of AlexNet&#x00027;s last fully connected layers by using same-size fully connected layers alternated with dropout layers for better generalization. These choices led to better generalization and robustness in the facial characterization of our model.</p>
<p>Moreover, in the FRAI network, each newly added fully connected layer (from AlexNet) was further connected to a dropout layer with a 50% dropout probability. This strategy helps the network pass only the most informative activations from one layer to the next.</p>
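<p>The added head described above (two 500-neuron fully connected layers, each followed by 50% dropout, then a classification layer) can be sketched as a forward pass in plain NumPy. The 1,024-dimensional input is an assumption standing in for the pooled GoogleNet features, and the weights are random placeholders, not trained FRAI parameters.</p>

```python
import numpy as np

rng = np.random.default_rng(0)

def dense(x, w, b):
    return x @ w + b

def relu(x):
    return np.maximum(x, 0.0)

def dropout(x, p=0.5, train=True):
    """Inverted dropout: zero activations with probability p, rescale the rest."""
    if not train:
        return x
    mask = rng.random(x.shape) >= p
    return x * mask / (1.0 - p)

d_in, n_classes = 1024, 100           # 1024-d backbone features (assumed), 100 AFD identities
w1, b1 = rng.normal(0, 0.01, (d_in, 500)), np.zeros(500)
w2, b2 = rng.normal(0, 0.01, (500, 500)), np.zeros(500)
w3, b3 = rng.normal(0, 0.01, (500, n_classes)), np.zeros(n_classes)

x = rng.normal(size=(1, d_in))        # stand-in for one pooled feature vector
h1 = dropout(relu(dense(x, w1, b1)))  # FC-500 + 50% dropout
h2 = dropout(relu(dense(h1, w2, b2))) # FC-500 + 50% dropout
logits = dense(h2, w3, b3)            # classification layer, one neuron per class
```

After training, dropping the final `dense` call leaves the 500-dimensional activation `h2`, which plays the role of the face vector used for retrieval.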
</sec>
<sec>
<title>2.3 FRAI pipeline</title>
<p>In this section, the FRAI pipeline is described in detail, as shown in <xref ref-type="fig" rid="F2">Figure 2</xref>. We started with dataset preprocessing, followed by grouping the data into training and testing sets. The FRAI model was trained on the training set. After training, the last classification layer was removed, and the model was used to obtain well-represented features, or face vectors, for building a database. During the initial stage of database development, we computed a threshold (discussed for each retrieval algorithm in Section 3) by trial and error on the holdout LFW dataset images, to decide whether a query face matches a character that already exists in the database. This technique supports an automated pipeline that decides whether a new index should be introduced for the current character. After developing a sufficiently large database, we started the retrieval process with query images. Following similar steps, the query image features were first computed and then passed to the feature matching algorithms, which decide, based on the threshold, whether a matching face exists in the database. If no matching face exists, the FRAI pipeline considers that person new and adds their image under a new index. In this manner, our automated pipeline itself expands the database with newly encountered faces/characters.</p>
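<p>The match-or-enroll decision at the core of the pipeline can be sketched as follows. This is a simplified illustration under assumed names and an assumed cosine-similarity threshold, not the authors&#x00027; implementation.</p>

```python
import numpy as np

def cosine_sim(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def query_or_enroll(feat, db, threshold=0.8):
    """Return (index, is_new): match the query vector against the database,
    enrolling it under a new index when no stored vector is similar enough."""
    if db:
        sims = [cosine_sim(feat, v) for v in db]
        best = int(np.argmax(sims))
        if sims[best] >= threshold:
            return best, False        # matched an existing identity
    db.append(feat)                   # unseen face: expand the database
    return len(db) - 1, True

db = []
idx1, new1 = query_or_enroll(np.array([1.0, 0.0, 0.0]), db)   # first face: enrolled
idx2, new2 = query_or_enroll(np.array([0.99, 0.01, 0.0]), db) # near-duplicate: matched
```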
</sec>
<sec>
<title>2.4 Similarity metrics</title>
<p>In the present study, the following similarity metrics were used to compute the accuracy:</p>
<sec>
<title>2.4.1 Euclidean distance</title>
<p>For this metric, the following formula (Malkauthekar, <xref ref-type="bibr" rid="B37">2013</xref>) was used:</p>
<disp-formula id="E1"><label>(1)</label><mml:math id="M1"><mml:mrow><mml:msub><mml:mi>D</mml:mi><mml:mrow><mml:mi>e</mml:mi><mml:mi>u</mml:mi><mml:mi>c</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mtext>&#x000A0;</mml:mtext><mml:msqrt><mml:mrow><mml:mstyle displaystyle='true'><mml:munderover><mml:mo>&#x02211;</mml:mo><mml:mrow><mml:mi>k</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mi>n</mml:mi></mml:munderover><mml:mrow><mml:msup><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:msub><mml:mi>x</mml:mi><mml:mi>k</mml:mi></mml:msub><mml:mo>&#x02212;</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mi>k</mml:mi></mml:msub><mml:mo>&#x000A0;</mml:mo></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mn>2</mml:mn></mml:msup></mml:mrow></mml:mstyle></mml:mrow></mml:msqrt><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>
<p>where <italic>x</italic> is the query image vector, <italic>y</italic> is a database image vector, and <italic>k</italic> indexes the vector components.</p>
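<p>Equation (1) translates directly into code; the following minimal NumPy sketch (names are ours) checks it on a classic 3-4-5 right triangle.</p>

```python
import numpy as np

def euclidean_distance(x, y):
    """D_euc = sqrt(sum_k (x_k - y_k)^2), as in Eq. (1)."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    return float(np.sqrt(np.sum((x - y) ** 2)))

d = euclidean_distance([3.0, 0.0], [0.0, 4.0])  # legs 3 and 4 -> hypotenuse 5
```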
</sec>
<sec>
<title>2.4.2 Cosine similarity</title>
<p>For this similarity measure, we used the following formula (Lahitani et al., <xref ref-type="bibr" rid="B31">2016</xref>):</p>
<disp-formula id="E2"><label>(2)</label><mml:math id="M2"><mml:mrow><mml:msub><mml:mi>D</mml:mi><mml:mrow><mml:mi>c</mml:mi><mml:mi>o</mml:mi><mml:mi>s</mml:mi><mml:mi>i</mml:mi><mml:mi>n</mml:mi><mml:mi>e</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mi>cos</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>&#x003B8;</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mi>x</mml:mi><mml:mtext>&#x000A0;</mml:mtext><mml:mo>.</mml:mo><mml:mtext>&#x000A0;&#x000A0;</mml:mtext><mml:mi>y</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mo>|</mml:mo><mml:mrow><mml:mrow><mml:mo>|</mml:mo><mml:mi>y</mml:mi><mml:mo>|</mml:mo></mml:mrow></mml:mrow><mml:mo>|</mml:mo></mml:mrow><mml:mtext>&#x000A0;</mml:mtext><mml:mrow><mml:mo>|</mml:mo><mml:mrow><mml:mrow><mml:mo>|</mml:mo><mml:mi>x</mml:mi><mml:mo>|</mml:mo></mml:mrow></mml:mrow><mml:mo>|</mml:mo></mml:mrow></mml:mrow></mml:mfrac><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mstyle displaystyle='true'><mml:msubsup><mml:mo>&#x02211;</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mi>n</mml:mi></mml:msubsup><mml:mrow><mml:msub><mml:mi>x</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:msub><mml:mi>y</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:mstyle></mml:mrow><mml:mrow><mml:msqrt><mml:mrow><mml:mstyle displaystyle='true'><mml:msubsup><mml:mo>&#x02211;</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mi>n</mml:mi></mml:msubsup><mml:mrow><mml:msubsup><mml:mi>x</mml:mi><mml:mi>i</mml:mi><mml:mn>2</mml:mn></mml:msubsup></mml:mrow></mml:mstyle></mml:mrow></mml:msqrt><mml:msqrt><mml:mrow><mml:mstyle 
displaystyle='true'><mml:msubsup><mml:mo>&#x02211;</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mi>n</mml:mi></mml:msubsup><mml:mrow><mml:msubsup><mml:mi>y</mml:mi><mml:mi>i</mml:mi><mml:mn>2</mml:mn></mml:msubsup></mml:mrow></mml:mstyle></mml:mrow></mml:msqrt></mml:mrow></mml:mfrac><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>
<p>where <italic>x</italic> is the query image vector, <italic>y</italic> is a database image vector, and <italic>i</italic> indexes the vector components.</p>
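<p>Equation (2) can likewise be sketched in NumPy (names are ours): parallel vectors score 1, and orthogonal vectors score 0.</p>

```python
import numpy as np

def cosine_similarity(x, y):
    """D_cosine = (x . y) / (||x|| ||y||), as in Eq. (2)."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    return float(x @ y / (np.linalg.norm(x) * np.linalg.norm(y)))

s_parallel = cosine_similarity([1.0, 2.0], [2.0, 4.0])    # same direction
s_orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])  # perpendicular
```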
</sec>
<sec>
<title>2.4.3 k-NN algorithm</title>
<p>This algorithm (Guo et al., <xref ref-type="bibr" rid="B21">2003</xref>; Moldagulova and Sulaiman, <xref ref-type="bibr" rid="B42">2017</xref>) predicts a label based on the value of k. When the label is predicted correctly, the accuracy for that query is 100%; otherwise, it is 0%, because the algorithm returns only the best-suited class from the vicinity of the query vector.</p>
<p>In the present study, we used these different metric algorithms to obtain better results than previously proposed techniques. In this system, the feature vector extracted from a query image is matched against the database features, yielding a similarity score. If the score is lower than the defined threshold, the query is added to the database as a new entry. Afterward, images were stored in our database together with their features for future use.</p>
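<p>A minimal k-NN matcher over stored face vectors, as described above, can be sketched as follows. The toy two-dimensional vectors and labels are illustrative assumptions, not dataset features.</p>

```python
import numpy as np
from collections import Counter

def knn_predict(query, db_feats, db_labels, k=3):
    """Majority vote over the k nearest database vectors (Euclidean distance)."""
    dists = np.linalg.norm(db_feats - query, axis=1)
    nearest = np.argsort(dists)[:k]
    votes = Counter(db_labels[i] for i in nearest)
    return votes.most_common(1)[0][0]

feats = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
labels = ["alice", "alice", "bob", "bob"]
pred = knn_predict(np.array([0.05, 0.1]), feats, labels, k=3)
```

For k = 3, the neighborhood of the query contains two "alice" vectors and one "bob" vector, so the majority vote returns the best-suited class.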
</sec>
</sec>
</sec>
<sec id="s3">
<title>3 Results and discussion</title>
<p>Two different standard datasets, Aligned Face and LFW datasets, were used in this study, and these experimentations were carried using Dell Alienware machine, Intel (R) Core (TM) i7-&#x02212;7700 HQ CPU &#x00040; 2.80 GHz, having 32 GB RAM. Initially, we selected the Aligned Face dataset for training because this dataset has a good number of samples per class as compared to the LFW dataset, and convolutional neural networks also require a good/huge number of samples for training (Asmat et al., <xref ref-type="bibr" rid="B2">2021</xref>; Pande et al., <xref ref-type="bibr" rid="B45">2022</xref>). After the selection of the dataset, we employed augmentation (Khalifa et al., <xref ref-type="bibr" rid="B28">2021</xref>) to increase the number of samples further. In augmentation, we applied different parameters such as normalization, i.e., &#x02212;20&#x000B0; to 20&#x000B0; for rotation and &#x02212;30 to 30 pixels for translation and shear, and even reflection. With these augmented images, we started to train the FRAI. To reach the best training results, we employed the hit-and-trial method to find the best parameters for our convolutional neural network (CNN). These parameters were minibatch size, total iterations for training, and learning rate, which were optimized using Adam (Kingma and Ba, <xref ref-type="bibr" rid="B29">2015</xref>), which is an optimization algorithm, and 104, 20, and 0.0003 were the optimized values of the minibatch size, total iterations for training, and learning rate parameters, respectively. At the outset of our study, we initiated the training of our FRAI network with carefully randomly selected parameters. As the training progressed, we observed that, after &#x0007E;650 iterations, further improvements in validation accuracy were minimal, and this pattern continued until we reached a total of 1,640 iterations. To prevent overfitting of the training model, we retrained and stopped the training earlier than expected. 
The reason the network learned in relatively few iterations was our efficient preprocessing of the faces. The cropped characters&#x00027; faces helped the model focus only on the immediate facial region and learn better convolutional features. The second factor is the batch size of 104 images, with which the model converged within a few epochs. The last important factor is that we initialized with GoogleNet weights, so the GoogleNet layers required only small weight updates during training. Conversely, the two newly introduced fully connected layers, which replace pooling layers, were trained with larger weight updates.</p>
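<p>The augmentation ranges above can be illustrated with the following numpy-only sketch, which samples one augmented view per call. The rotation uses a simple nearest-neighbor implementation, and shear is omitted for brevity; this is an illustrative sketch, not the paper&#x00027;s actual augmentation pipeline.</p>

```python
import numpy as np

rng = np.random.default_rng(42)

def rotate_nn(img, deg):
    """Nearest-neighbor rotation about the image center (numpy-only)."""
    t = np.deg2rad(deg)
    h, w = img.shape
    cy, cx = (h - 1) / 2, (w - 1) / 2
    ys, xs = np.mgrid[0:h, 0:w]
    y, x = ys - cy, xs - cx
    # Inverse-map each output pixel to its source location
    sy = np.round(np.cos(t) * y + np.sin(t) * x + cy).astype(int)
    sx = np.round(-np.sin(t) * y + np.cos(t) * x + cx).astype(int)
    valid = (sy >= 0) & (sy < h) & (sx >= 0) & (sx < w)
    out = np.zeros_like(img)
    out[valid] = img[sy[valid], sx[valid]]
    return out

def augment(img):
    # Rotation in [-20, 20] degrees (the paper's range)
    out = rotate_nn(img, rng.uniform(-20, 20))
    # Translation in [-30, 30] pixels along each axis (the paper's range)
    out = np.roll(out, tuple(rng.integers(-30, 31, size=2)), axis=(0, 1))
    # Random horizontal reflection
    if rng.random() < 0.5:
        out = out[:, ::-1]
    return out

face = rng.random((224, 224))  # stand-in for a cropped face image
aug = augment(face)
```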
<p>With the best-found parameters, the CNN showed a validation accuracy of 85.97% with a training loss of 0.07% on a 3:1:1 training/validation/testing split of the dataset. On testing, the trained FRAI model achieved 82, 97.29, 83.16, 85.39, and 81.05% for accuracy, AUC, F1 score, precision, and recall, respectively. The training and validation results of our FRAI are shown in <xref ref-type="fig" rid="F4">Figures 4A</xref>&#x02013;<xref ref-type="fig" rid="F4">C</xref> as training accuracy, training loss, and parameters, respectively. The training graph shows the learning behavior of the network alongside the validation accuracy. Training was stopped after a total of 656 iterations, as also shown in <xref ref-type="fig" rid="F4">Figure 4</xref>. The training loss showed the opposite behavior, decreasing toward a minimum together with the validation loss. After the successful completion of training, we removed the last layer, i.e., softmax, and started to retrieve feature vectors.</p>
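<p>The step of removing the softmax layer and reading out feature vectors can be sketched with a toy network standing in for the trained FRAI; the layer sizes here are hypothetical and only illustrate the idea of stopping at the penultimate activation.</p>

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for the trained FRAI network: one hidden layer + softmax head.
# (Hypothetical sizes; the real model's penultimate layer differs.)
W1, b1 = rng.standard_normal((128, 64)), np.zeros(64)
W2, b2 = rng.standard_normal((64, 10)), np.zeros(10)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def classify(x):
    h = np.maximum(x @ W1 + b1, 0)   # penultimate activations
    return softmax(h @ W2 + b2)      # class probabilities

def extract_features(x):
    # "Removing the softmax layer": stop at the penultimate activation
    # and use it as the image's feature vector for retrieval.
    return np.maximum(x @ W1 + b1, 0)

x = rng.standard_normal(128)         # stand-in for a preprocessed face
feat = extract_features(x)
probs = classify(x)
```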
<fig id="F4" position="float">
<label>Figure 4</label>
<caption><p>FRAI training and validation performance. <bold>(A)</bold> Training and validation accuracies. <bold>(B)</bold> Training and validation losses. <bold>(C)</bold> Graph legend.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="frai-06-1230383-g0004.tif"/>
</fig>
<p>To provide a more comprehensive understanding of the learning capabilities of the FRAI model, an ablation (layer-freezing) study was performed, computing accuracy while freezing portions of the otherwise non-ablated model and using the same training, validation, and testing sets of the AFD dataset. To facilitate comparison, the FRAI model&#x00027;s layers were divided into four quartiles: the first experiment covered layers up to 0.25 of the network, the second up to 0.50, the third up to 0.75, and the final experiment involved training all layers, without ablation. These quartiles were applied in two directions: in the forward direction, the FRAI network was frozen from the first layer toward the last, and in the backward direction, it was frozen from the last layer toward the first, as in transfer learning. In the forward direction, up to the first quartile, the model yielded 14% accuracy by training &#x0007E;38 million parameters. In the second experiment, up to the second quartile, the model showed around 38% accuracy while training &#x0007E;42 million parameters. In the third experiment, up to the third quartile, the model reached 45% accuracy by training &#x0007E;46 million parameters, and training the whole network produced 82% accuracy with &#x0007E;47 million trained parameters. In the backward direction, from the end to the start, up to the first quartile, the model yielded an accuracy of 79% by training &#x0007E;35 million parameters. In the second experiment, up to the second quartile, the model showed &#x0007E;75% accuracy while training &#x0007E;30 million parameters. 
In the third experiment, up to the third quartile, the model reached 70% accuracy by training &#x0007E;27 million parameters, and, again, training the whole network produced 82% accuracy with &#x0007E;47 million trained parameters.</p>
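<p>The quartile-based freezing and parameter counting can be sketched as follows; the per-layer parameter counts are purely illustrative (they sum to 47 million only to echo the whole-network total above, not the real FRAI layer sizes).</p>

```python
import numpy as np

# Illustrative per-layer parameter counts (NOT the real FRAI layer sizes);
# they sum to 47 million to echo the whole-network total quoted above.
layer_params = np.array([4, 3, 2, 1, 5, 6, 8, 18]) * 1_000_000

def trainable_after_freezing(params, quartile, from_start=True):
    """Freeze the given fraction of layers, counting either from the
    start (forward direction) or from the end (backward direction),
    and return how many parameters remain trainable."""
    n = len(params)
    k = int(round(quartile * n))
    frozen = range(k) if from_start else range(n - k, n)
    mask = np.ones(n, dtype=bool)
    mask[list(frozen)] = False
    return int(params[mask].sum())

forward_q1 = trainable_after_freezing(layer_params, 0.25, from_start=True)
backward_q1 = trainable_after_freezing(layer_params, 0.25, from_start=False)
```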
<p>This analysis highlighted the importance of training the entire model, particularly the layers within the first quartile. Skipping the training of these layers decreased accuracy, whereas including them significantly improved overall accuracy. In the forward training approach, when only the ending layers were trained, the model lacked the weights needed to recognize facial landmarks accurately, which is crucial for achieving high accuracy in image recognition, particularly in scenarios involving diverse facial expressions and appearances, as depicted in <xref ref-type="fig" rid="F5">Figure 5</xref>. The general conclusion of the ablation study is that the original pre-trained weights alone do not characterize well faces posing various complex human emotions. It became evident that employing pre-trained weights and then training the entire model was essential. This approach allowed the refinement of weights across all layers, facilitating the ideal combination of features: the initial layers learned to discern edges, corners, and texture details, the middle layers integrated these elements to form shapes, and the later layers refined connections and made accurate shape predictions.</p>
<fig id="F5" position="float">
<label>Figure 5</label>
<caption><p>Comparison of training performance in ablation process. <bold>(A)</bold> Comparison between accuracy and quartile-based freezing of layers in forward way. <bold>(B)</bold> Comparison between number of parameters and quartile-based freezing of layers in forward way. <bold>(C)</bold> Comparison between accuracy and quartile-based freezing of layers in backward way. <bold>(D)</bold> Comparison between number of parameters and quartile-based freezing of layers in backward way.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="frai-06-1230383-g0005.tif"/>
</fig>
<p>To further improve our methodology&#x00027;s results, we employed conventional similarity measures such as Euclidean distance, Cosine similarity, and k-Nearest Neighbors (KNN) for the image retrieval process. For this process, we removed the last layer, i.e., softmax, and retrieved a feature vector for each query image. For each query, we applied the different similarity metrics and computed the results shown in <xref ref-type="fig" rid="F6">Figure 6</xref>. The precision, recall, F1, and F2 score values for the Euclidean and Cosine metrics are reported in <xref ref-type="table" rid="T1">Table 1</xref>. In the case of KNN, we report only average precision, recall, F1, and F2 scores because this algorithm returns only the predicted class, so each individual result is either 0% or 100%; the final scores were computed as average percentages over all queries.</p>
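<p>A minimal sketch of the retrieval step with the Euclidean and Cosine measures, assuming the feature vectors have already been extracted by FRAI (the vector sizes and database here are arbitrary stand-ins):</p>

```python
import numpy as np

def euclidean_topk(query, db, k=5):
    """Rank stored feature vectors by Euclidean distance to the query."""
    dists = np.linalg.norm(db - query, axis=1)
    return np.argsort(dists)[:k]

def cosine_topk(query, db, k=5):
    """Rank stored feature vectors by cosine similarity to the query."""
    sims = (db @ query) / (np.linalg.norm(db, axis=1) * np.linalg.norm(query) + 1e-12)
    return np.argsort(-sims)[:k]

rng = np.random.default_rng(0)
db = rng.standard_normal((100, 64))               # 100 stored feature vectors
query = db[17] + 0.01 * rng.standard_normal(64)   # query close to item 17
top5_euc = euclidean_topk(query, db)
top5_cos = cosine_topk(query, db)
```

<p>Both rankings should place item 17 first for this near-duplicate query. KNN-based retrieval, by contrast, returns only a class label, which is why the text reports it as 0%/100% per query.</p>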
<fig id="F6" position="float">
<label>Figure 6</label>
<caption><p>Top five most similar images found for a query by each similarity measure. <bold>(A)</bold> Neural network query image. <bold>(B)</bold> Top five predictions by neural network. <bold>(C)</bold> Euclidean distance query image. <bold>(D)</bold> Top five predictions by Euclidean distance. <bold>(E)</bold> Cosine similarity query image. <bold>(F)</bold> Top five predictions by Cosine similarity.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="frai-06-1230383-g0006.tif"/>
</fig>
<table-wrap position="float" id="T1">
<label>Table 1</label>
<caption><p>(%) Comparison of precision, recall, <italic>F</italic>1, and <italic>F</italic>2 scores for Aligned Face dataset.</p></caption>
<table frame="box" rules="all">
<thead>
<tr style="background-color:&#x00023;919498;color:&#x00023;ffffff">
<th valign="top" align="left"><bold>Metric</bold></th>
<th valign="top" align="center"><bold>Method</bold></th>
<th valign="top" align="center"><bold>Precision</bold></th>
<th valign="top" align="center"><bold>Recall</bold></th>
<th valign="top" align="center"><bold><italic>F</italic>1 score</bold></th>
<th valign="top" align="center"><bold><italic>F</italic>2 score</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">Euclidean</td>
<td valign="top" align="center">Query with each feature</td>
<td valign="top" align="center">90.08</td>
<td valign="top" align="center">84.16</td>
<td valign="top" align="center">87.02</td>
<td valign="top" align="center">85.28</td>
</tr>
<tr>
<td valign="top" align="left">Cosine</td>
<td valign="top" align="center">Query with each feature</td>
<td valign="top" align="center">92.18</td>
<td valign="top" align="center">86.76</td>
<td valign="top" align="center">89.39</td>
<td valign="top" align="center">87.79</td>
</tr>
<tr>
<td valign="top" align="left">k-NN</td>
<td valign="top" align="center">Nearest Centroid (<italic>k</italic> = 5)</td>
<td valign="top" align="center">92.00</td>
<td valign="top" align="center">92.66</td>
<td valign="top" align="center">92.33</td>
<td valign="top" align="center">92.52</td>
</tr></tbody>
</table>
</table-wrap>
<p>After computing results on the Aligned Face dataset, we used the second, holdout LFW testing dataset to assess our model&#x00027;s performance, with a 4:1 ratio of database to query images. We again began with feature extraction using the already trained FRAI, which is well-trained on facial features/maps. After feature extraction, each query image was passed to FRAI to obtain its unique feature vector, and then the same similarity measures used for the Aligned Face dataset were applied to retrieve images for the query. For each query image, we calculated recall, precision, F1, and F2 score values to assess performance, as shown in <xref ref-type="fig" rid="F7">Figures 7</xref>, <xref ref-type="fig" rid="F8">8</xref>, and <xref ref-type="table" rid="T2">Table 2</xref>. The results of the similarity measures are given in the following subsections.</p>
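<p>The F1 and F2 scores reported throughout are instances of the F-beta score; a quick check against the KNN row of Table 1 (precision 92.00%, recall 92.66%) recovers the tabulated values.</p>

```python
def f_beta(precision, recall, beta=1.0):
    """F-beta score: beta = 1 weighs precision and recall equally (F1);
    beta = 2 weighs recall more heavily (F2)."""
    return (1 + beta**2) * precision * recall / (beta**2 * precision + recall)

# Table 1, KNN row: precision 92.00%, recall 92.66%
f1 = f_beta(92.00, 92.66, beta=1.0)  # ~92.33, matching Table 1
f2 = f_beta(92.00, 92.66, beta=2.0)  # ~92.53 (Table 1 reports 92.52)
```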
<fig id="F7" position="float">
<label>Figure 7</label>
<caption><p>Precision and recall scores of the Euclidean and Cosine algorithms for each retrieved image. <bold>(A)</bold> Euclidean distance precision. <bold>(B)</bold> Euclidean distance recall. <bold>(C)</bold> Cosine precision. <bold>(D)</bold> Cosine recall.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="frai-06-1230383-g0007.tif"/>
</fig>
<fig id="F8" position="float">
<label>Figure 8</label>
<caption><p>Facial image retrieval results on AFD, showing the best three images retrieved using Euclidean and Cosine similarity scores; the red box indicates a wrong retrieval.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="frai-06-1230383-g0008.tif"/>
</fig>
<table-wrap position="float" id="T2">
<label>Table 2</label>
<caption><p>(%) Comparison of precision, recall, <italic>F</italic>1, and <italic>F</italic>2 scores for LFW dataset.</p></caption>
<table frame="box" rules="all">
<thead>
<tr style="background-color:&#x00023;919498;color:&#x00023;ffffff">
<th valign="top" align="left"><bold>Metric</bold></th>
<th valign="top" align="center"><bold>Method</bold></th>
<th valign="top" align="center"><bold>Precision</bold></th>
<th valign="top" align="center"><bold>Recall</bold></th>
<th valign="top" align="center"><bold><italic>F</italic>1 score</bold></th>
<th valign="top" align="center"><bold><italic>F</italic>2 score</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">Euclidean</td>
<td valign="top" align="center">Query with each feature</td>
<td valign="top" align="center">94.91</td>
<td valign="top" align="center">87.33</td>
<td valign="top" align="center">90.96</td>
<td valign="top" align="center">88.74</td>
</tr>
<tr>
<td valign="top" align="left">Cosine</td>
<td valign="top" align="center">Query with each feature</td>
<td valign="top" align="center">91.65</td>
<td valign="top" align="center">87.88</td>
<td valign="top" align="center">89.72</td>
<td valign="top" align="center">88.61</td>
</tr>
<tr>
<td valign="top" align="left">k-NN</td>
<td valign="top" align="center">Nearest Centroid (<italic>k</italic> = 5)</td>
<td valign="top" align="center">95.00</td>
<td valign="top" align="center">95.00</td>
<td valign="top" align="center">95.00</td>
<td valign="top" align="center">95.00</td>
</tr></tbody>
</table>
</table-wrap>
<sec>
<title>3.1 Aligned face dataset</title>
<sec>
<title>3.1.1 Neural network output</title>
<p>After successful training, the top five most similar classes were computed, as shown in <xref ref-type="fig" rid="F6">Figure 6B</xref>, for the face shown in <xref ref-type="fig" rid="F6">Figure 6A</xref>. The training showed good results: the network assigned the predicted class, &#x0201C;Aaron Paul&#x0201D;, a probability of 100%.</p>
</sec>
<sec>
<title>3.1.2 Euclidean distance similarity</title>
<p>This measure showed good results, with precision, recall, F1, and F2 values of 90.08, 84.16, 87.02, and 85.28%, respectively, which are better than previously proposed techniques (Wu et al., <xref ref-type="bibr" rid="B63">2010</xref>; Sun et al., <xref ref-type="bibr" rid="B55">2019</xref>; Tarawneh et al., <xref ref-type="bibr" rid="B59">2019</xref>), as shown in <xref ref-type="table" rid="T1">Table 1</xref>. The normalized threshold used for this similarity measure was 0.552, at which we obtained the best similarity output. A threshold must be defined to decide whether an image is retrieved or not. We determined this threshold by trial and error: images with a score above 0.552 are treated as retrieved. The graph of the top five best recall images is shown in <xref ref-type="fig" rid="F6">Figure 6D</xref> for the query image in <xref ref-type="fig" rid="F6">Figure 6C</xref>. Our model showed similar behavior with other query images.</p>
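<p>The thresholding step can be sketched as follows, assuming a min-max normalization of the similarity scores; only the 0.552 cut-off comes from the text, while the normalization scheme and the example scores are our assumptions for illustration.</p>

```python
import numpy as np

THRESHOLD = 0.552  # best value found by trial and error (from the text)

def retrieved_indices(similarities, threshold=THRESHOLD):
    # Min-max normalize scores to [0, 1]; higher means more similar.
    # (The exact normalization scheme is an assumption for illustration.)
    s = (similarities - similarities.min()) / (similarities.max() - similarities.min() + 1e-12)
    return np.flatnonzero(s > threshold)

scores = np.array([0.10, 0.90, 0.55, 0.80, 0.20])  # hypothetical scores
hits = retrieved_indices(scores)
```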
</sec>
<sec>
<title>3.1.3 Cosine similarity</title>
<p>This measure follows the same trend as Euclidean distance, with precision, recall, F1, and F2 score values of 92.18, 86.76, 89.39, and 87.79%, respectively (shown in <xref ref-type="table" rid="T1">Table 1</xref>). It performed best at a normalized threshold of 0.72. Images with a score &#x0003E;0.72 are treated as retrieved matches from the database. The graph of the top five best recall images is shown in <xref ref-type="fig" rid="F6">Figure 6F</xref> for the query image in <xref ref-type="fig" rid="F6">Figure 6E</xref>. Our model showed similar behavior with other query images.</p>
</sec>
<sec>
<title>3.1.4 k-nearest neighbors</title>
<p>This algorithm showed the most promising results, at 92.00, 92.66, 92.33, and 92.52% for precision, recall, F1, and F2 score values, respectively, on the facial feature vectors (shown in <xref ref-type="table" rid="T1">Table 1</xref>). The random selection method was used for the nearest centroid classifier function with <italic>k</italic> = 5.</p>
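<p>The nearest-centroid classification step can be sketched on synthetic feature clusters; the data sizes and cluster positions below are illustrative, not the AFD feature distribution.</p>

```python
import numpy as np

def nearest_centroid(query, features, labels):
    """Assign the query feature vector to the class whose centroid
    (mean feature vector) is closest in Euclidean distance."""
    classes = np.unique(labels)
    centroids = np.stack([features[labels == c].mean(axis=0) for c in classes])
    return classes[np.argmin(np.linalg.norm(centroids - query, axis=1))]

rng = np.random.default_rng(1)
# Two well-separated synthetic "identity" clusters in feature space.
feats = np.vstack([rng.normal(0.0, 0.1, (20, 8)),
                   rng.normal(5.0, 0.1, (20, 8))])
labels = np.array([0] * 20 + [1] * 20)
pred = nearest_centroid(np.full(8, 4.8), feats, labels)
```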
</sec>
</sec>
<sec>
<title>3.2 LFW face dataset</title>
<sec>
<title>3.2.1 Euclidean distance similarity</title>
<p>This measure has shown good results with maximum precision, recall, F1, and F2 score values at 94.91, 87.33, 90.96, and 88.74%, respectively, as shown in <xref ref-type="table" rid="T2">Table 2</xref>, which are comparably better than those obtained from previous techniques (Wu et al., <xref ref-type="bibr" rid="B63">2010</xref>; Sun et al., <xref ref-type="bibr" rid="B55">2019</xref>; Tarawneh et al., <xref ref-type="bibr" rid="B59">2019</xref>). The graphs of precision and recall are shown in <xref ref-type="fig" rid="F7">Figures 7A</xref>, <xref ref-type="fig" rid="F7">B</xref>, respectively, for each query image.</p>
</sec>
<sec>
<title>3.2.2 Cosine similarity</title>
<p>This measure follows the same trend as Euclidean Distance with precision, recall, F1, and F2 score values of 91.65, 87.88, 89.72, and 88.61%, respectively (<xref ref-type="table" rid="T2">Table 2</xref>). In <xref ref-type="fig" rid="F7">Figures 7C</xref>, <xref ref-type="fig" rid="F7">D</xref>, it is shown that the lower score peaks are higher than those from the Euclidean Distance for the precision graph, but the recall graphs exhibited a similar trend as that of the Euclidean Distance.</p>
</sec>
<sec>
<title>3.2.3 k-nearest neighbors (KNN)</title>
<p>This algorithm showed the most promising results at 95.00, 95.00, 95.00, and 95.00% for precision, recall, F1, and F2 score values, respectively, for facial-based feature vectors (<xref ref-type="table" rid="T2">Table 2</xref>). The random selection was used for the nearest centroid classifier function with <italic>k</italic> = 5.</p>
<p>The FRAI model showed very promising results in the image recognition process on the AFD dataset, irrespective of any particular facial expression. In general, our model produces higher scores for images in which the same character appears with a similar expression. Likewise, for characters with distinctive appearance features such as a beard, glasses, or other visual elements, our model retrieves images with the same appearance to better match the query image. Overall, these scores are not tied to any particular expression; rather, they reflect our model&#x00027;s tendency to treat expression and appearance as relevant in the image retrieval task, as shown in <xref ref-type="fig" rid="F8">Figure 8</xref> for the AFD dataset. <xref ref-type="fig" rid="F10">Figure 10</xref> shows the different facial expressions encountered by the FRAI model during training. Furthermore, the best images retrieved with the Euclidean measure on the test dataset are shown in <xref ref-type="fig" rid="F9">Figure 9</xref> for the LFW dataset. The first cell in <xref ref-type="fig" rid="F9">Figure 9</xref> shows a query image; all following images were retrieved as the best matches for that query.</p>
<fig id="F9" position="float">
<label>Figure 9</label>
<caption><p>Facial image retrieval results on LFW, showing the best retrieved images; the red box indicates a wrong retrieval.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="frai-06-1230383-g0009.tif"/>
</fig>
<fig id="F10" position="float">
<label>Figure 10</label>
<caption><p>Characterization of facial landmarks based on facial expressions and appearances for AFD dataset faces, obtained by applying SHAP and LIME to the FRAI model.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="frai-06-1230383-g0010.tif"/>
</fig>
<p>In comparison with the previous techniques of Wu et al. (<xref ref-type="bibr" rid="B63">2010</xref>), Sun et al. (<xref ref-type="bibr" rid="B55">2019</xref>), and Tarawneh et al. (<xref ref-type="bibr" rid="B59">2019</xref>), our training model FRAI combined with similarity measures makes our technique novel and robust. The study of Wu et al. (<xref ref-type="bibr" rid="B63">2010</xref>), titled &#x0201C;Scalable Face Image Retrieval with Identity-Based Quantization and Multi-reference Re-ranking,&#x0201D; reported an average accuracy of 71% because the authors employed conventional local and global features and then used these features for image retrieval. These authors focused only on appearance features such as eyes, nose, and ears (Wu et al., <xref ref-type="bibr" rid="B63">2010</xref>) but did not consider the whole face map, which is a limitation of their research. On the other hand, in the study titled &#x0201C;Eye-tracking based relevance feedback for iterative face image retrieval&#x0201D; by Sun et al. (<xref ref-type="bibr" rid="B55">2019</xref>), the authors first used the Face&#x0002B;&#x0002B; algorithm to obtain the top 36 ranked images, which were then passed to a neural network for retrieval, yielding a good average precision-recall score of 90.00% (Sun et al., <xref ref-type="bibr" rid="B55">2019</xref>). This model did not exceed 90% accuracy, possibly because of the large number (i.e., 36) of ranked images, among which a few may show similar faces/expressions but not the correct face. Similarly, the study titled &#x0201C;Deep Face Image Retrieval: a Comparative Study with Dictionary Learning&#x0201D; by Tarawneh et al. (<xref ref-type="bibr" rid="B59">2019</xref>) also showed a good average precision-recall score of 90.50%; their method used deep learning to retrieve images by analyzing the face mesh. 
Unlike our study, these works did not employ any additional similarity measure for retrieval, a gap in most of these methodologies that we have addressed with the help of different metrics. All three techniques are compared with our methodology in <xref ref-type="table" rid="T3">Table 3</xref>. The comparison shows that our proposed methodology achieves a considerably higher average precision-recall.</p>
<table-wrap position="float" id="T3">
<label>Table 3</label>
<caption><p>Comparison with previous techniques.</p></caption>
<table frame="box" rules="all">
<thead>
<tr style="background-color:&#x00023;919498;color:&#x00023;ffffff">
<th valign="top" align="left"><bold>Methods</bold></th>
<th valign="top" align="center"><bold>Average precision-recall (APR)</bold></th>
<th valign="top" align="center"><bold>References</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">Proposed methodology</td>
<td valign="top" align="center">95.48%</td>
<td valign="top" align="center">&#x02013;</td>
</tr>
<tr>
<td valign="top" align="left">Deep face image retrieval: a comparative study with dictionary learning</td>
<td valign="top" align="center">90.50%</td>
<td valign="top" align="center">Tarawneh et al., <xref ref-type="bibr" rid="B59">2019</xref></td>
</tr>
<tr>
<td valign="top" align="left">Scalable face image retrieval with identity-based quantization and multi-reference re-ranking</td>
<td valign="top" align="center">70.50%</td>
<td valign="top" align="center">Wu et al., <xref ref-type="bibr" rid="B63">2010</xref></td>
</tr>
<tr>
<td valign="top" align="left">Eye-tracking based relevance feedback for iterative face image retrieval</td>
<td valign="top" align="center">90.00%</td>
<td valign="top" align="center">Sun et al., <xref ref-type="bibr" rid="B55">2019</xref></td>
</tr></tbody>
</table>
</table-wrap>
<p>Our methodology has worked very well for two reasons. First, our training model FRAI is a combination of two renowned models and showed good training and validation results. Second, we did not use it directly to retrieve images from the database; instead, we used FRAI for feature extraction and then, for each query&#x00027;s feature vector, retrieved images with several well-known similarity algorithms, which increased recall, precision, F1, and F2 scores. Together, these two points show that a good model combined with a good feature-matching algorithm, such as KNN in our case, improves the retrieval system.</p>
<p>In our study, we also employed two AI explainability tools, SHAP (Lundberg and Lee, <xref ref-type="bibr" rid="B34">2017</xref>) and LIME (Ribeiro et al., <xref ref-type="bibr" rid="B49">2016</xref>), to interpret the FRAI model&#x00027;s results. To obtain more accurate pixel annotations and segmentations, we conducted up to 5,000 and 20,000 evaluations per query image for SHAP and LIME, respectively. The two tools interpret the FRAI model in different ways: SHAP highlights the most important pixels in the query image that the FRAI model attends to, whereas LIME produces segmentations of the facial features related to the searched face image. From these pixels (SHAP) and regions (LIME), the important segmentation aspects were identified. With this segmentation, the characters&#x00027; facial expressions and appearances, such as a beard, glasses, or other visual characteristics, helped retrieve faces with related expressions most of the time, as shown in <xref ref-type="fig" rid="F10">Figure 10</xref>.</p>
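<p>The idea behind such pixel-level attributions can be illustrated with a simple occlusion test: mask one patch at a time, rerun the scorer, and record the score drop. The sketch below uses a toy scorer standing in for FRAI&#x00027;s class score; it is a conceptual simplification, not the actual SHAP or LIME algorithm.</p>

```python
import numpy as np

def occlusion_importance(img, score_fn, patch=8):
    """Perturbation-based saliency map: zero out one patch at a time and
    record how much the model's score drops. A simplified stand-in for
    SHAP/LIME-style attributions, not either algorithm."""
    base = score_fn(img)
    heat = np.zeros_like(img, dtype=float)
    for i in range(0, img.shape[0], patch):
        for j in range(0, img.shape[1], patch):
            occluded = img.copy()
            occluded[i:i + patch, j:j + patch] = 0.0
            heat[i:i + patch, j:j + patch] = base - score_fn(occluded)
    return heat

# Toy scorer standing in for FRAI's class score: mean intensity of the
# central "face" region (purely illustrative).
def score(img):
    return img[8:24, 8:24].mean()

img = np.ones((32, 32))
heat = occlusion_importance(img, score)
```

<p>Patches overlapping the central region the scorer depends on receive positive importance, while irrelevant corner patches receive zero, mirroring how SHAP highlights the pixels the model attends to.</p>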
<p>The use of FRAI (a tool that can perform face recognition) together with explainable AI tools (which highlight the personalized facial landmarks responsible for FRAI&#x00027;s face identification) has the potential to support practitioners in identifying marker combinations and their relationships with specific facial features that can be related to somatic characteristics, mood or psychological states, and pathological trajectories, among others.</p>
<p>Overall, our study demonstrates the potential of the FRAI model in a range of applications and highlights the value of explainable AI models in interpreting complex problems.</p>
<p>In conclusion, future advancements of the FRAI tool position it as a potential decision-support tool in all applications where the identification of personalized facial recognition markers may be of crucial importance, e.g., in clinical diagnostics, prognostics, criminology, security, and many others.</p>
</sec>
</sec>
</sec>
<sec id="s4">
<title>4 Conclusion</title>
<p>In this study, we developed an efficient, accurate, and computationally light novel technique for the automation of end-to-end facial character identification, focusing specifically on the extraction and subsequent recognition of characters exhibiting diverse facial expressions. We used two standard datasets, namely, the Aligned Face dataset and the LFW dataset, which consist of many characters posing various facial expression landmarks. We started with the Aligned Face dataset and trained our novel model FRAI, a combination of GoogleNet and AlexNet, for the feature extraction process. We then built two databases, one from the Aligned Face dataset and one from the holdout testing LFW dataset. For these databases, precision, recall, F1, and F2 scores were calculated using Euclidean distance, Cosine distance, and k-nearest neighbor (KNN) measures for the retrieval process against each query image. We achieved maximum precision, recall, F1, and F2 score values of 92.00, 92.66, 92.33, and 92.52%, respectively, for the Aligned Face dataset used for training and 95.00% for the LFW dataset used for testing, using the KNN measure. Our methodology shows that a well-trained facial model alone is not sufficient for a facial-image retrieval system; using the model for feature extraction and then employing conventional similarity algorithms for retrieval, KNN in our case, is recommended to increase performance.</p>
<p>The FRAI tool may potentially be used in healthcare, for example, to predict different diseases from facial characterization features, such as neurological diseases and cancer, or brain growth development in babies, and in criminology, among other fields. In the future, we will improve our technique by developing a new database using GANs. On the newly generated faces, a new parallel and vertical combination of neural networks, namely, GoogleNet, AlexNet, and ResNet (Simonyan and Zisserman, <xref ref-type="bibr" rid="B53">2015</xref>), will be employed to enhance the robustness of the features toward the defined goal.</p>
</sec>
<sec sec-type="data-availability" id="s5">
<title>Data availability statement</title>
<p>The original contributions and datasets utilized in the study are publicly accessible and have been appropriately referenced in the article/supplementary material. Further inquiries can be directed to the corresponding author/s.</p>
</sec>
<sec sec-type="author-contributions" id="s6">
<title>Author contributions</title>
<p>Material preparation, data collection, analysis, and methodology implementation were performed by STS and SAS. SQ, AD, and MD managed and guided in the development of the whole strategy. The first draft of the manuscript was written by STS and SAS, and then, all authors commented on previous versions of the manuscript. SQ, AD, and MD read and approved the final manuscript. All authors contributed to the study conception and design. All authors contributed to the article and approved the submitted version.</p>
</sec>
</body>
<back>
<sec sec-type="funding-information" id="s7">
<title>Funding</title>
<p>The author(s) declare financial support was received for the research, authorship, and/or publication of this article. The present research work has been developed as part of the PARENT project, funded by the European Union&#x00027;s Horizon 2020 research and innovation program under the Marie Sklodowska-Curie-Innovative Training Network 2020, Grant Agreement No 956394 (<ext-link ext-link-type="uri" xlink:href="https://parenth2020.com/">https://parenth2020.com/</ext-link>).</p>
</sec>
<ack><p>The authors would like to acknowledge Politecnico di Torino, Italy; Pakistan Institute of Engineering and Applied Sciences, Pakistan; and GPI SpA, Italy, for their technical assistance in this research venture.</p>
</ack>
<sec sec-type="COI-statement" id="conf1">
<title>Conflict of interest</title>
<p>SAS and AD were employed by GPI SpA. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
</sec>
<sec sec-type="disclaimer" id="s8">
<title>Publisher&#x00027;s note</title>
<p>All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.</p>
</sec>
<fn-group>
<fn id="fn0001"><p><sup>1</sup><italic>5 Million Faces&#x02014;Top 15 Free Image Datasets for Facial Recognition</italic>. Lionbridge AI (n.d.). Available online at: <ext-link ext-link-type="uri" xlink:href="https://lionbridge.ai/datasets/5-million-faces-top-15-free-image-datasets-for-facial-recognition/">https://lionbridge.ai/datasets/5-million-faces-top-15-free-image-datasets-for-facial-recognition/</ext-link> (accessed December 8, 2019).</p></fn>
</fn-group>
<ref-list>
<title>References</title>
<ref id="B1">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ashlin Deepa</surname> <given-names>R. N.</given-names></name> <name><surname>Rakesh Reddy</surname> <given-names>D. R.</given-names></name> <name><surname>Milind</surname> <given-names>K.</given-names></name> <name><surname>Vijayalata</surname> <given-names>Y.</given-names></name> <name><surname>Rahul</surname> <given-names>K.</given-names></name></person-group> (<year>2023</year>). <article-title>Drowsiness detection using iot and facial expression</article-title>. <source>Cogn. Sci. Technol.</source> <fpage>679</fpage>&#x02013;<lpage>92</lpage>. <pub-id pub-id-type="doi">10.1007/978-981-19-2358-6_61</pub-id></citation>
</ref>
<ref id="B2">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Asmat</surname> <given-names>N. B.</given-names></name> <name><surname>Malik</surname> <given-names>M. I.</given-names></name> <name><surname>Shahzad</surname> <given-names>M.</given-names></name> <name><surname>Shafait</surname> <given-names>F.</given-names></name></person-group> (<year>2021</year>). <article-title>Segmentation of Text Documents Using Hyperspectral Imaging: A Blend of Deep Cnn and Generative Adversarial Network</article-title>. <source>SSRN Electronic Journal</source>. <pub-id pub-id-type="doi">10.2139/ssrn.3992988</pub-id></citation>
</ref>
<ref id="B3">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Belhumeur</surname> <given-names>N. P.</given-names></name> <name><surname>Hespanha</surname> <given-names>J. P.</given-names></name> <name><surname>Kriegman</surname> <given-names>D. J.</given-names></name></person-group> (<year>1997</year>). <article-title>Eigenfaces vs. fisherfaces: recognition using class specific linear projection</article-title>. <source>IEEE Transact. Pattern Anal. Mach. Intell</source>. <volume>19</volume>, <fpage>711</fpage>&#x02013;<lpage>720</lpage>. <pub-id pub-id-type="doi">10.1109/34.598228</pub-id></citation>
</ref>
<ref id="B4">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Brahnam</surname> <given-names>S.</given-names></name> <name><surname>Nanni</surname> <given-names>L.</given-names></name> <name><surname>McMurtrey</surname> <given-names>S.</given-names></name> <name><surname>Lumini</surname> <given-names>A.</given-names></name> <name><surname>Brattin</surname> <given-names>R.</given-names></name> <name><surname>Slack</surname> <given-names>M.</given-names></name> <etal/></person-group>. (<year>2023</year>). <article-title>Neonatal pain detection in videos using the ICOPEvid dataset and an ensemble of descriptors extracted from gaussian of local descriptors</article-title>. <source>Appl. Comp. Informat.</source> <volume>19</volume>, <fpage>122</fpage>&#x02013;<lpage>143</lpage>. <pub-id pub-id-type="doi">10.1016/j.aci.2019.05.003</pub-id></citation>
</ref>
<ref id="B5">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Brunelli</surname> <given-names>R.</given-names></name> <name><surname>Poggio</surname> <given-names>T.</given-names></name></person-group> (<year>1993</year>). <article-title>Face recognition: features versus templates</article-title>. <source>IEEE Trans. Pattern Anal. Mach. Intell</source>. <volume>15</volume>, <fpage>1042</fpage>&#x02013;<lpage>1052</lpage>. <pub-id pub-id-type="doi">10.1109/34.254061</pub-id></citation>
</ref>
<ref id="B6">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Chang</surname> <given-names>N.-S.</given-names></name> <name><surname>Fu</surname> <given-names>K.-S.</given-names></name></person-group> (<year>1980b</year>). <article-title>Query-by-pictorial-example</article-title>. <source>IEEE Transact. Softw. Eng.</source> <volume>6</volume>, <fpage>519</fpage>&#x02013;<lpage>524</lpage>. <pub-id pub-id-type="doi">10.1109/TSE.1980.230801</pub-id></citation>
</ref>
<ref id="B7">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Chang</surname> <given-names>S. F. U.</given-names></name> <name><surname>Eleftheriadis</surname> <given-names>A.</given-names></name> <name><surname>Mcclintock</surname> <given-names>R.</given-names></name></person-group> (<year>1998</year>). <article-title>Next-generation content representation, creation, and searching for new-media applications in education</article-title>. <source>Proc. IEEE</source>. <volume>86</volume>, <fpage>884</fpage>&#x02013;<lpage>904</lpage>. <pub-id pub-id-type="doi">10.1109/5.664278</pub-id></citation>
</ref>
<ref id="B8">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Chang</surname> <given-names>S. K.</given-names></name> <name><surname>Hsu</surname> <given-names>A.</given-names></name></person-group> (<year>1992</year>). <article-title>Image information systems: where do we go from here?</article-title> <source>IEEE Transact. Knowl. Data Eng</source>. <volume>4</volume>, <fpage>431</fpage>&#x02013;<lpage>442</lpage>. <pub-id pub-id-type="doi">10.1142/9789814343138_0035</pub-id></citation>
</ref>
<ref id="B9">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Chang</surname> <given-names>S. K.</given-names></name> <name><surname>Kunii</surname> <given-names>L. T.</given-names></name></person-group> (<year>1981</year>). <article-title>Pictorial data-base systems</article-title>. <source>Computer</source> <volume>14</volume>, <fpage>13</fpage>&#x02013;<lpage>21</lpage>. <pub-id pub-id-type="doi">10.1109/C-M.1981.220245</pub-id></citation>
</ref>
<ref id="B10">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Chang</surname> <given-names>S. K.</given-names></name> <name><surname>Yan</surname> <given-names>C. W.</given-names></name> <name><surname>Dimitroff</surname> <given-names>D. C.</given-names></name> <name><surname>Arndt</surname> <given-names>T.</given-names></name></person-group> (<year>1988</year>). <article-title>An intelligent image database system</article-title>. <source>IEEE Transact. Softw. Eng</source>. <volume>14</volume>, <fpage>681</fpage>&#x02013;<lpage>688</lpage>. <pub-id pub-id-type="doi">10.1109/32.6147</pub-id></citation>
</ref>
<ref id="B11">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Chang</surname> <given-names>N. S.</given-names></name> <name><surname>Fu</surname> <given-names>K. S.</given-names></name></person-group> (<year>1980a</year>). <source>A Relational Database System for Images</source>. <publisher-loc>Berlin; Heidelberg</publisher-loc>: <publisher-name>Springer Berlin Heidelberg</publisher-name>.</citation>
</ref>
<ref id="B12">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Chellappa</surname> <given-names>R.</given-names></name> <name><surname>Sirohey</surname> <given-names>S.</given-names></name> <name><surname>Wilson</surname> <given-names>L. C.</given-names></name></person-group> (<year>1995</year>). <article-title>Human and machine recognition of faces: a survey</article-title>. <source>Proc. IEEE</source>. <volume>83</volume>, <fpage>705</fpage>&#x02013;<lpage>741</lpage>. <pub-id pub-id-type="doi">10.1109/5.381842</pub-id></citation>
</ref>
<ref id="B13">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Dai</surname> <given-names>Y.</given-names></name> <name><surname>Nakano</surname> <given-names>Y.</given-names></name></person-group> (<year>1996</year>). <article-title>Face-texture model based on SGLD and its application in face detection in a color scene</article-title>. <source>Pattern Recognit</source>. <volume>29</volume>, <fpage>1007</fpage>&#x02013;<lpage>1017</lpage>. <pub-id pub-id-type="doi">10.1016/0031-3203(95)00139-5</pub-id></citation>
</ref>
<ref id="B14">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Dalal</surname> <given-names>N.</given-names></name> <name><surname>Triggs</surname> <given-names>B.</given-names></name></person-group> (<year>2005</year>). <article-title>&#x0201C;Histograms of oriented gradients for human detection,&#x0201D;</article-title> in <source>Proceedings - 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, CVPR 2005</source> (IEEE Computer Society), <fpage>886</fpage>&#x02013;<lpage>893</lpage>.</citation>
</ref>
<ref id="B15">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Fan</surname> <given-names>H.</given-names></name> <name><surname>Cao</surname> <given-names>Z.</given-names></name> <name><surname>Jiang</surname> <given-names>Y.</given-names></name> <name><surname>Yin</surname> <given-names>Q.</given-names></name> <name><surname>Doudou</surname> <given-names>C.</given-names></name></person-group> (<year>2014</year>). <article-title>Learning deep face representation</article-title>. <source>arXiv [Preprint].</source> arXiv:1403.2802. <pub-id pub-id-type="doi">10.1145/2647868.2654960</pub-id></citation>
</ref>
<ref id="B16">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Fayyaz</surname> <given-names>Z.</given-names></name> <name><surname>Ebrahimian</surname> <given-names>M.</given-names></name> <name><surname>Nawara</surname> <given-names>D.</given-names></name> <name><surname>Ibrahim</surname> <given-names>A.</given-names></name> <name><surname>Kashef</surname> <given-names>R.</given-names></name></person-group> (<year>2020</year>). <article-title>Recommendation systems: algorithms, challenges, metrics, and business opportunities</article-title>. <source>Appl. Sci.</source> <volume>10</volume>, <fpage>7748</fpage>.</citation>
</ref>
<ref id="B17">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Feng</surname> <given-names>G. C.</given-names></name> <name><surname>Yuen</surname> <given-names>C. P.</given-names></name></person-group> (<year>2001</year>). <article-title>Multi-cues eye detection on gray intensity image</article-title>. <source>Pattern Recognit</source>. <volume>34</volume>, <fpage>1033</fpage>&#x02013;<lpage>1046</lpage>. <pub-id pub-id-type="doi">10.1016/S0031-3203(00)00042-X</pub-id></citation>
</ref>
<ref id="B18">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Flickner</surname> <given-names>M.</given-names></name> <name><surname>Sawhney</surname> <given-names>H.</given-names></name> <name><surname>Niblack</surname> <given-names>W.</given-names></name> <name><surname>Ashley</surname> <given-names>J.</given-names></name> <name><surname>Huang</surname> <given-names>Q.</given-names></name> <name><surname>Dom</surname> <given-names>B.</given-names></name> <etal/></person-group>. (<year>1995</year>). <article-title>Query by image and video content: the QBIC system</article-title>. <source>Computer</source>. <volume>28</volume>, <fpage>23</fpage>&#x02013;<lpage>32</lpage>. <pub-id pub-id-type="doi">10.1109/2.410146</pub-id></citation>
</ref>
<ref id="B19">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Huang</surname> <given-names>G. B.</given-names></name> <name><surname>Mattar</surname> <given-names>M.</given-names></name> <name><surname>Berg</surname> <given-names>T.</given-names></name> <name><surname>Learned-Miller</surname> <given-names>E.</given-names></name></person-group> (<year>2008</year>). <article-title>Labeled faces in the wild: a database for studying face recognition in unconstrained environments</article-title>. <source>Tech. Rep.</source></citation>
</ref>
<ref id="B20">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Goudail</surname> <given-names>F.</given-names></name> <name><surname>Lange</surname> <given-names>E.</given-names></name> <name><surname>Iwamoto</surname> <given-names>T.</given-names></name> <name><surname>Kyuma</surname> <given-names>K.</given-names></name> <name><surname>Otsu</surname> <given-names>N.</given-names></name></person-group> (<year>1996</year>). <article-title>Face recognition system using local autocorrelations and multiscale integration</article-title>. <source>IEEE Trans. Pattern Anal. Mach. Intell</source>. <volume>18</volume>, <fpage>1024</fpage>&#x02013;<lpage>1028</lpage>. <pub-id pub-id-type="doi">10.1109/34.541411</pub-id></citation>
</ref>
<ref id="B21">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Guo</surname> <given-names>G.</given-names></name> <name><surname>Wang</surname> <given-names>H.</given-names></name> <name><surname>Bell</surname> <given-names>D.</given-names></name> <name><surname>Bi</surname> <given-names>Y.</given-names></name> <name><surname>Greer</surname> <given-names>K.</given-names></name></person-group> (<year>2003</year>). <source>KNN Model-Based Approach in Classification</source>. <publisher-loc>Berlin; Heidelberg</publisher-loc>: <publisher-name>Springer Berlin Heidelberg</publisher-name>. <fpage>986</fpage>&#x02013;<lpage>996</lpage>.</citation>
</ref>
<ref id="B22">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Han</surname> <given-names>C.-C.</given-names></name> <name><surname>Liao</surname> <given-names>H.-Y. M.</given-names></name> <name><surname>Yu</surname> <given-names>G.-J.</given-names></name> <name><surname>Chen</surname> <given-names>L.-H.</given-names></name></person-group> (<year>2000</year>). <article-title>Fast face detection via morphology-based pre-processing</article-title>. <source>Pattern Recognit.</source> <volume>33</volume>, <fpage>1701</fpage>&#x02013;<lpage>1712</lpage>. <pub-id pub-id-type="doi">10.1016/S0031-3203(99)00141-7</pub-id></citation>
</ref>
<ref id="B23">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hasnat</surname> <given-names>A.</given-names></name> <name><surname>Halder</surname> <given-names>S.</given-names></name> <name><surname>Bhattacharjee</surname> <given-names>D.</given-names></name> <name><surname>Nasipuri</surname> <given-names>M.</given-names></name></person-group> (<year>2019</year>). <article-title>Face image retrieval using discriminative ternary census transform and spatial pyramid matching</article-title>. <source>Commun. Comp. Inf. Sci.</source> <volume>1031</volume>, <fpage>316</fpage>&#x02013;<lpage>330</lpage>. <pub-id pub-id-type="doi">10.1007/978-981-13-8581-0_26</pub-id></citation>
</ref>
<ref id="B24">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Huffer</surname> <given-names>D.</given-names></name> <name><surname>Wood</surname> <given-names>C.</given-names></name> <name><surname>Graham</surname> <given-names>S.</given-names></name></person-group> (<year>2019</year>). <article-title>What the machine saw: some questions on the ethics of computer vision and machine learning to investigate human remains trafficking</article-title>. <source>Int. Archaeol</source>. <volume>52</volume>, <fpage>1</fpage>&#x02013;<lpage>10</lpage>. <pub-id pub-id-type="doi">10.11141/ia.52.5</pub-id></citation>
</ref>
<ref id="B25">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Jeng</surname> <given-names>H. S.</given-names></name> <name><surname>Liao</surname> <given-names>H. Y. M.</given-names></name> <name><surname>Han</surname> <given-names>C. C.</given-names></name> <name><surname>Chern</surname> <given-names>M. Y.</given-names></name> <name><surname>Liu</surname> <given-names>Y. T.</given-names></name></person-group> (<year>1998</year>). <article-title>Facial feature detection using geometrical face model: an efficient approach</article-title>. <source>Pattern Recognit</source>. <volume>31</volume>, <fpage>273</fpage>&#x02013;<lpage>282</lpage>. <pub-id pub-id-type="doi">10.1016/S0031-3203(97)00048-4</pub-id></citation>
</ref>
<ref id="B26">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Jesorsky</surname> <given-names>O.</given-names></name> <name><surname>Kirchberg</surname> <given-names>K. J.</given-names></name> <name><surname>Frischholz</surname> <given-names>R. W.</given-names></name></person-group> (<year>2001</year>). <article-title>&#x0201C;Robust face detection using the hausdorff distance BT,&#x0201D;</article-title> in <source>Proc. of Conf. on Audio- and Video-Based Biometric Person Authentication</source> (<publisher-loc>Berlin; Heidelberg</publisher-loc>: <publisher-name>Springer Berlin Heidelberg</publisher-name>).</citation>
</ref>
<ref id="B27">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Jiang</surname> <given-names>M.</given-names></name> <name><surname>Yin</surname> <given-names>S.</given-names></name></person-group> (<year>2022</year>). <article-title>Facial expression recognition based on convolutional block attention module and multi-feature fusion</article-title>. <source>Int. J. Comp. Vis. Robot.</source> <volume>13</volume>, <fpage>21</fpage>&#x02013;<lpage>37</lpage>. <pub-id pub-id-type="doi">10.1504/IJCVR.2023.127298</pub-id></citation>
</ref>
<ref id="B28">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Khalifa</surname> <given-names>E. N.</given-names></name> <name><surname>Loey</surname> <given-names>M.</given-names></name> <name><surname>Mirjalili</surname> <given-names>S.</given-names></name></person-group> (<year>2021</year>). <article-title>A comprehensive survey of recent trends in deep learning for digital images augmentation</article-title>. <source>Artif. Intell. Rev.</source> <volume>55</volume>, <fpage>2351</fpage>&#x02013;<lpage>2377</lpage>. <pub-id pub-id-type="doi">10.1007/s10462-021-10066-4</pub-id><pub-id pub-id-type="pmid">34511694</pub-id></citation></ref>
<ref id="B29">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kingma</surname> <given-names>P. D.</given-names></name> <name><surname>Ba</surname> <given-names>L. J.</given-names></name></person-group> (<year>2015</year>). <article-title>&#x0201C;Adam: a method for stochastic optimization,&#x0201D;</article-title> in <source>3rd International Conference on Learning Representations, ICLR 2015 - Conference Track Proceedings</source> (International Conference on Learning Representations, ICLR).</citation>
</ref>
<ref id="B30">
<citation citation-type="web"><person-group person-group-type="author"><name><surname>Krizhevsky</surname> <given-names>A.</given-names></name> <name><surname>Sutskever</surname> <given-names>I.</given-names></name> <name><surname>Hinton</surname> <given-names>G. E</given-names></name></person-group>. (n.d.). <source>ImageNet Classification with Deep Convolutional Neural Networks</source>. Available online at: <ext-link ext-link-type="uri" xlink:href="http://code.google.com/p/cuda-convnet/">http://code.google.com/p/cuda-convnet/</ext-link> (accessed January 10, 2020).</citation>
</ref>
<ref id="B31">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Lahitani</surname> <given-names>R. A.</given-names></name> <name><surname>Permanasari</surname> <given-names>A. E.</given-names></name> <name><surname>Setiawan</surname> <given-names>N. A.</given-names></name></person-group> (<year>2016</year>). <article-title>&#x0201C;Cosine similarity to determine similarity measure: study case in online essay assessment,&#x0201D;</article-title> in <source>Proceedings of 2016 4th International Conference on Cyber and IT Service Management, CITSM 2016</source> (<publisher-loc>Bandung</publisher-loc>: <publisher-name>Institute of Electrical and Electronics Engineers Inc.</publisher-name>).</citation>
</ref>
<ref id="B32">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Li</surname> <given-names>H.</given-names></name> <name><surname>Lin</surname> <given-names>Z.</given-names></name> <name><surname>Shen</surname> <given-names>X.</given-names></name> <name><surname>Brandt</surname> <given-names>J.</given-names></name> <name><surname>Hua</surname> <given-names>G.</given-names></name></person-group> (<year>2015</year>). <source>A Convolutional Neural Network Cascade for Face Detection</source>. IEEE Computer Society.</citation>
</ref>
<ref id="B33">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Liu</surname> <given-names>K.</given-names></name> <name><surname>Cheng</surname> <given-names>Y. Q.</given-names></name> <name><surname>Yang</surname> <given-names>J. Y.</given-names></name></person-group> (<year>1993</year>). <article-title>Algebraic feature extraction for image recognition based on an optimal discriminant criterion</article-title>. <source>Pattern Recognit</source>. <volume>26</volume>, <fpage>903</fpage>&#x02013;<lpage>911</lpage>. <pub-id pub-id-type="doi">10.1016/0031-3203(93)90056-3</pub-id></citation>
</ref>
<ref id="B34">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lundberg</surname> <given-names>M. S.</given-names></name> <name><surname>Lee</surname> <given-names>S. I.</given-names></name></person-group> (<year>2017</year>). <article-title>A unified approach to interpreting model predictions</article-title>. <source>Adv. Neural Inf. Process. Syst.</source> <fpage>4766</fpage>&#x02013;<lpage>4775</lpage>. <pub-id pub-id-type="doi">10.48550/arXiv.1705.07874</pub-id></citation>
</ref>
<ref id="B35">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Mahmood</surname> <given-names>M. D.</given-names></name> <name><surname>Bibi</surname> <given-names>A.</given-names></name> <name><surname>Masud</surname> <given-names>M.</given-names></name> <name><surname>Ahmed</surname> <given-names>G.</given-names></name> <name><surname>Khan</surname> <given-names>S.</given-names></name> <name><surname>Jhanjhi</surname> <given-names>N. Z.</given-names></name> <etal/></person-group>. (<year>2022</year>). <article-title>PCA-based advanced local octa-directional pattern (ALODP-PCA): a texture feature descriptor for image retrieval</article-title>. <source>Electronics</source> <volume>11</volume>. <pub-id pub-id-type="doi">10.3390/electronics11020202</pub-id></citation>
</ref>
<ref id="B36">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Maio</surname> <given-names>D.</given-names></name> <name><surname>Maltoni</surname> <given-names>D.</given-names></name></person-group> (<year>2000</year>). <article-title>Real-time face location on gray-scale static images</article-title>. <source>Pattern Recognit</source>. <volume>33</volume>, <fpage>1525</fpage>&#x02013;<lpage>1539</lpage>. <pub-id pub-id-type="doi">10.1016/S0031-3203(99)00130-2</pub-id></citation>
</ref>
<ref id="B37">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Malkauthekar</surname> <given-names>M. D.</given-names></name></person-group> (<year>2013</year>). <article-title>&#x0201C;Analysis of euclidean distance and manhattan distance measure in face recognition,&#x0201D;</article-title> in <source>IET Conference Publications</source> (<publisher-loc>Mumbai</publisher-loc>: <publisher-name>Institution of Engineering and Technology</publisher-name>), <fpage>503</fpage>&#x02013;<lpage>507</lpage>.</citation>
</ref>
<ref id="B38">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Mao</surname> <given-names>R.</given-names></name> <name><surname>Meng</surname> <given-names>R.</given-names></name> <name><surname>Mao</surname> <given-names>R. S. R.</given-names></name> <name><surname>Sun</surname> <given-names>R.</given-names></name></person-group> (<year>2023</year>). <article-title>&#x0201C;Facial expression recognition based on deep convolutional neural network,&#x0201D;</article-title> in <source>Proc. SPIE 12509, Third International Conference on Intelligent Computing and Human-Computer Interaction (ICHCI 2022)</source>, 1250923. <pub-id pub-id-type="doi">10.1117/12.2655893</pub-id></citation>
</ref>
<ref id="B39">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Mehendale</surname> <given-names>N.</given-names></name></person-group> (<year>2020</year>). <article-title>Facial emotion recognition using convolutional neural networks (FERC)</article-title>. <source>SN Appl. Sci.</source> <volume>2</volume>, <fpage>1</fpage>&#x02013;<lpage>8</lpage>. <pub-id pub-id-type="doi">10.1007/s42452-020-2234-1</pub-id></citation>
</ref>
<ref id="B40">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Mezaris</surname> <given-names>V.</given-names></name> <name><surname>Kompatsiaris</surname> <given-names>I.</given-names></name> <name><surname>Strintzis</surname> <given-names>M. G.</given-names></name></person-group> (<year>2004</year>). <source>An Ontology Approach to Object-Based Image Retrieval.</source> <publisher-loc>Barcelona</publisher-loc>: <publisher-name>IEEE Computer Society</publisher-name>.</citation>
</ref>
<ref id="B41">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Moghaddam</surname> <given-names>B.</given-names></name> <name><surname>Pentland</surname> <given-names>A.</given-names></name></person-group> (<year>1997</year>). <article-title>Probabilistic visual learning for object representation</article-title>. <source>IEEE Trans. Pattern Anal. Mach. Intell</source>. <volume>19</volume>, <fpage>696</fpage>&#x02013;<lpage>710</lpage>. <pub-id pub-id-type="doi">10.1109/34.598227</pub-id></citation>
</ref>
<ref id="B42">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Moldagulova</surname> <given-names>A.</given-names></name> <name><surname>Sulaiman</surname> <given-names>R. B.</given-names></name></person-group> (<year>2017</year>). <article-title>&#x0201C;Using KNN algorithm for classification of textual documents,&#x0201D;</article-title> in <source>ICIT 2017 - 8th International Conference on Information Technology, Proceedings</source> (<publisher-loc>Amman</publisher-loc>: <publisher-name>Institute of Electrical and Electronics Engineers Inc.</publisher-name>), <fpage>665</fpage>&#x02013;<lpage>671</lpage>.</citation>
</ref>
<ref id="B43">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Nefian</surname> <given-names>V. A.</given-names></name> <name><surname>Hayes</surname> <given-names>M. H.</given-names></name></person-group> (<year>1998</year>). <article-title>&#x0201C;Hidden markov models for face recognition,&#x0201D;</article-title> in <source>ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings</source> (IEEE Computer Society).</citation>
</ref>
<ref id="B44">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Othman</surname> <given-names>E.</given-names></name> <name><surname>Werner</surname> <given-names>P.</given-names></name> <name><surname>Saxen</surname> <given-names>F.</given-names></name> <name><surname>Al-Hamadi</surname> <given-names>A.</given-names></name> <name><surname>Gruss</surname> <given-names>S.</given-names></name> <name><surname>Walter</surname> <given-names>S.</given-names></name></person-group> (<year>2023</year>). <article-title>Classification networks for continuous automatic pain intensity monitoring in video using facial expression on the X-ITE pain database</article-title>. <source>J. Vis. Commun. Image Represent.</source> <volume>91</volume>, <fpage>103743</fpage>. <pub-id pub-id-type="doi">10.1016/j.jvcir.2022.103743</pub-id></citation>
</ref>
<ref id="B45">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Pande</surname> <given-names>D. S.</given-names></name> <name><surname>Jadhav</surname> <given-names>P. P.</given-names></name> <name><surname>Joshi</surname> <given-names>R.</given-names></name> <name><surname>Sawant</surname> <given-names>A. D.</given-names></name> <name><surname>Muddebihalkar</surname> <given-names>V.</given-names></name> <name><surname>Rathod</surname> <given-names>S.</given-names></name> <etal/></person-group>. (<year>2022</year>). <article-title>Digitization of handwritten devanagari text using CNN transfer learning &#x02013; a better customer service support</article-title>. <source>Neurosci. Informat.</source> <volume>2</volume>, <fpage>100016</fpage>. <pub-id pub-id-type="doi">10.1016/j.neuri.2021.100016</pub-id></citation>
</ref>
<ref id="B46">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Park</surname> <given-names>Y.</given-names></name> <name><surname>Park</surname> <given-names>K.</given-names></name> <name><surname>Kim</surname> <given-names>G.</given-names></name></person-group> (<year>2013</year>). <article-title>Content-based image retrieval using colour and shape features</article-title>. <source>Int. J. Comp. Appl. Technol.</source> <volume>48</volume>, <fpage>155</fpage>. <pub-id pub-id-type="doi">10.1504/IJCAT.2013.056023</pub-id></citation>
</ref>
<ref id="B47">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Pass</surname> <given-names>G.</given-names></name></person-group> (<year>1996</year>). <article-title>Histogram refinement</article-title>. <source>Science</source>. <fpage>96</fpage>&#x02013;<lpage>102</lpage>.</citation>
</ref>
<ref id="B48">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Raghuwanshi</surname> <given-names>G.</given-names></name> <name><surname>Tyagi</surname> <given-names>V.</given-names></name></person-group> (<year>2019</year>). <source>Impact of Feature Extraction Techniques on a CBIR System</source> (<publisher-loc>Singapore</publisher-loc>: <publisher-name>Springer Singapore</publisher-name>), <fpage>338</fpage>&#x02013;<lpage>348</lpage>.</citation>
</ref>
<ref id="B49">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ribeiro</surname> <given-names>T. M.</given-names></name> <name><surname>Singh</surname> <given-names>S.</given-names></name> <name><surname>Guestrin</surname> <given-names>C.</given-names></name></person-group> (<year>2016</year>). <article-title>&#x0201C;&#x02018;Why should I trust you?&#x00027;: explaining the predictions of any classifier,&#x0201D;</article-title> in <source>NAACL-HLT 2016 - 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Proceedings of the Demonstrations Session</source> (arXiv), <fpage>97</fpage>&#x02013;<lpage>101</lpage>.</citation>
</ref>
<ref id="B50">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Samal</surname> <given-names>A.</given-names></name> <name><surname>Iyengar</surname> <given-names>A. P.</given-names></name></person-group> (<year>1992</year>). <article-title>Automatic recognition and analysis of human faces and facial expressions: a survey</article-title>. <source>Pattern Recognit</source>. <volume>25</volume>, <fpage>65</fpage>&#x02013;<lpage>77</lpage>. <pub-id pub-id-type="doi">10.1016/0031-3203(92)90007-6</pub-id></citation>
</ref>
<ref id="B51">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Sato</surname> <given-names>Y.</given-names></name> <name><surname>Fukusato</surname> <given-names>T.</given-names></name> <name><surname>Morishima</surname> <given-names>S.</given-names></name></person-group> (<year>2019</year>). <article-title>Interactive face retrieval framework for clarifying user&#x00027;s visual memory</article-title>. <source>ITE Transact. Media Technol. Appl.</source> <volume>7</volume>, <fpage>68</fpage>&#x02013;<lpage>79</lpage>. <pub-id pub-id-type="doi">10.3169/mta.7.68</pub-id></citation>
</ref>
<ref id="B52">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Shukla</surname> <given-names>K. A.</given-names></name> <name><surname>Kanungo</surname> <given-names>S.</given-names></name></person-group> (<year>2019</year>). <source>Enhanced Bag-of-Features Method Using Grey Wolf Optimization for Automated Face Retrieval</source> (<publisher-loc>Singapore</publisher-loc>: <publisher-name>Springer Singapore</publisher-name>), <fpage>519</fpage>&#x02013;<lpage>528</lpage>.</citation>
</ref>
<ref id="B53">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Simonyan</surname> <given-names>K.</given-names></name> <name><surname>Zisserman</surname> <given-names>A.</given-names></name></person-group> (<year>2015</year>). <article-title>&#x0201C;Very deep convolutional networks for large-scale image recognition,&#x0201D;</article-title> in <source>3rd International Conference on Learning Representations, ICLR 2015 - Conference Track Proceedings. International Conference on Learning Representations</source> (ICLR).</citation>
</ref>
<ref id="B54">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Singh</surname> <given-names>A.</given-names></name> <name><surname>Singh</surname> <given-names>M.</given-names></name> <name><surname>Singh</surname> <given-names>B.</given-names></name></person-group> (<year>2016</year>). <article-title>&#x0201C;Face detection and eyes extraction using sobel edge detection and morphological operations,&#x0201D;</article-title> in <source>Conference on Advances in Signal Processing, CASP 2016</source> (<publisher-loc>Pune</publisher-loc>: <publisher-name>IEEE Computer Society</publisher-name>).</citation>
</ref>
<ref id="B55">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Sun</surname> <given-names>M.</given-names></name> <name><surname>Wang</surname> <given-names>J.</given-names></name> <name><surname>Chi</surname> <given-names>Z.</given-names></name></person-group> (<year>2019</year>). <article-title>&#x0201C;Eye-tracking based relevance feedback for iterative face image retrieval,&#x0201D;</article-title> in <source>120. SPIE-Intl Soc Optical Eng</source> (SPIE Digital Library).</citation>
</ref>
<ref id="B56">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Swets</surname> <given-names>D. L.</given-names></name> <name><surname>Weng</surname> <given-names>J.</given-names></name></person-group> (<year>1996</year>). <article-title>Using discriminant eigenfeatures for image retrieval</article-title>. <source>IEEE Trans. Pattern Anal. Mach. Intell</source>. <volume>18</volume>, <fpage>831</fpage>&#x02013;<lpage>836</lpage>. <pub-id pub-id-type="doi">10.1109/34.531802</pub-id></citation>
</ref>
<ref id="B57">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Szegedy</surname> <given-names>C.</given-names></name> <name><surname>Liu</surname> <given-names>W.</given-names></name> <name><surname>Jia</surname> <given-names>Y.</given-names></name> <name><surname>Sermanet</surname> <given-names>P.</given-names></name> <name><surname>Reed</surname> <given-names>S.</given-names></name> <name><surname>Anguelov</surname> <given-names>D.</given-names></name> <etal/></person-group>. (<year>2014</year>). <article-title>Going deeper with convolutions</article-title>. <source>arXiv</source>. <pub-id pub-id-type="doi">10.48550/arXiv.1409.4842</pub-id></citation>
</ref>
<ref id="B58">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Tamura</surname> <given-names>H.</given-names></name> <name><surname>Yokoya</surname> <given-names>N.</given-names></name></person-group> (<year>1984</year>). <article-title>Image database systems: a survey</article-title>. <source>Pattern Recognit</source>. <volume>17</volume>, <fpage>29</fpage>&#x02013;<lpage>43</lpage>. <pub-id pub-id-type="doi">10.1016/0031-3203(84)90033-5</pub-id></citation>
</ref>
<ref id="B59">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Tarawneh</surname> <given-names>A. S.</given-names></name> <name><surname>Hassanat</surname> <given-names>A. B.</given-names></name> <name><surname>Celik</surname> <given-names>C.</given-names></name> <name><surname>Chetverikov</surname> <given-names>D.</given-names></name> <name><surname>Rahman</surname> <given-names>M. S.</given-names></name> <name><surname>Verma</surname> <given-names>C.</given-names></name></person-group> (<year>2019</year>). <article-title>&#x0201C;Deep face image retrieval: a comparative study with dictionary learning,&#x0201D;</article-title> in <source>2019 10th International Conference on Information and Communication Systems (ICICS)</source> (<publisher-loc>Irbid</publisher-loc>: <publisher-name>IEEE Computer Society</publisher-name>), <fpage>185</fpage>&#x02013;<lpage>192</lpage>.</citation>
</ref>
<ref id="B60">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Tyagi</surname> <given-names>V.</given-names></name></person-group> (<year>2017</year>). <article-title>&#x0201C;Content-based image retrieval: an introduction,&#x0201D;</article-title> in <source>Content-Based Image Retrieval</source> (<publisher-loc>Singapore</publisher-loc>: <publisher-name>Springer</publisher-name>), <fpage>1</fpage>&#x02013;<lpage>27</lpage>. <pub-id pub-id-type="doi">10.1007/978-981-10-6759-4_1</pub-id></citation>
</ref>
<ref id="B61">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Valentin</surname> <given-names>D.</given-names></name> <name><surname>Abdi</surname> <given-names>H.</given-names></name> <name><surname>O&#x00027;Toole</surname> <given-names>A. J.</given-names></name> <name><surname>Cottrell</surname> <given-names>G. W.</given-names></name></person-group> (<year>1994</year>). <article-title>Connectionist models of face processing: a survey</article-title>. <source>Pattern Recognit</source>. <volume>27</volume>, <fpage>1209</fpage>&#x02013;<lpage>1230</lpage>. <pub-id pub-id-type="doi">10.1016/0031-3203(94)90006-X</pub-id></citation>
</ref>
<ref id="B62">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wu</surname> <given-names>Y.</given-names></name> <name><surname>Hassner</surname> <given-names>T.</given-names></name> <name><surname>Kim</surname> <given-names>K.</given-names></name> <name><surname>Medioni</surname> <given-names>G.</given-names></name> <name><surname>Natarajan</surname> <given-names>P.</given-names></name></person-group> (<year>2015</year>). <article-title>Facial landmark detection with tweaked convolutional neural networks</article-title>. <source>IEEE Trans. Pattern Anal. Mach. Intell.</source> <volume>40</volume>, <fpage>3067</fpage>&#x02013;<lpage>3074</lpage>. <pub-id pub-id-type="doi">10.1109/TPAMI.2017.2787130</pub-id><pub-id pub-id-type="pmid">29990138</pub-id></citation>
</ref>
<ref id="B63">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Wu</surname> <given-names>Z.</given-names></name> <name><surname>Ke</surname> <given-names>Q.</given-names></name> <name><surname>Sun</surname> <given-names>J.</given-names></name> <name><surname>Shum</surname> <given-names>H.-Y.</given-names></name></person-group> (<year>2010</year>). <article-title>&#x0201C;Scalable face image retrieval with identity-based quantization and multi-reference re-ranking,&#x0201D;</article-title> in <source>2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition</source> (<publisher-loc>San Francisco, CA</publisher-loc>: <publisher-name>IEEE Computer Society</publisher-name>), <fpage>3469</fpage>&#x02013;<lpage>3476</lpage>.</citation>
</ref>
<ref id="B64">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Yang</surname> <given-names>C.</given-names></name> <name><surname>Dong</surname> <given-names>M.</given-names></name> <name><surname>Fotouhi</surname> <given-names>F.</given-names></name></person-group> (<year>2005</year>). <article-title>&#x0201C;Image content annotation using bayesian framework and complement components analysis,&#x0201D;</article-title> in <source>Proceedings - International Conference on Image Processing, ICIP</source> (<publisher-loc>Genova</publisher-loc>: <publisher-name>IEEE Computer Society</publisher-name>).</citation>
</ref>
<ref id="B65">
<citation citation-type="web"><person-group person-group-type="author"><name><surname>Yang</surname> <given-names>M.</given-names></name> <name><surname>Kpalma</surname> <given-names>K.</given-names></name> <name><surname>Ronsin</surname> <given-names>J.</given-names></name></person-group> (<year>2008</year>). <source>Survey of Shape Feature Extraction Techniques</source>, ed. Peng-Yeng Yin. Available online at: <ext-link ext-link-type="uri" xlink:href="https://hal.archives-ouvertes.fr/hal-00446037">https://hal.archives-ouvertes.fr/hal-00446037</ext-link> (accessed July 7, 2019).</citation>
</ref>
<ref id="B66">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Yiew</surname> <given-names>K.</given-names></name> <name><surname>Togher</surname> <given-names>L.</given-names></name> <name><surname>Power</surname> <given-names>E.</given-names></name> <name><surname>Brunner</surname> <given-names>M.</given-names></name> <name><surname>Rietdijk</surname> <given-names>R.</given-names></name></person-group> (<year>2023</year>). <article-title>Differentiating use of facial expression between individuals with and without traumatic brain injury using affectiva software: a pilot study</article-title>. <source>Int. J. Environ. Res. Public Health</source> <volume>20</volume>, <fpage>1169</fpage>. <pub-id pub-id-type="doi">10.3390/ijerph20021169</pub-id><pub-id pub-id-type="pmid">36673925</pub-id></citation>
</ref>
<ref id="B67">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Zhang</surname> <given-names>D.</given-names></name> <name><surname>Islam</surname> <given-names>M. M.</given-names></name> <name><surname>Lu</surname> <given-names>G.</given-names></name> <name><surname>Hou</surname> <given-names>J.</given-names></name></person-group> (<year>2009</year>). <article-title>&#x0201C;Semantic image retrieval using region based inverted file,&#x0201D;</article-title> in <source>DICTA 2009 - Digital Image Computing: Techniques and Applications</source> (<publisher-loc>Melbourne, VIC</publisher-loc>: <publisher-name>IEEE</publisher-name>).</citation>
</ref>
<ref id="B68">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zhang</surname> <given-names>D.</given-names></name> <name><surname>Lu</surname> <given-names>G.</given-names></name></person-group> (<year>2004</year>). <article-title>Review of shape representation and description techniques</article-title>. <source>Pattern Recognit</source>. <volume>37</volume>, <fpage>1</fpage>&#x02013;<lpage>19</lpage>. <pub-id pub-id-type="doi">10.1016/j.patcog.2003.07.008</pub-id></citation>
</ref>
<ref id="B69">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zhang</surname> <given-names>J.</given-names></name> <name><surname>Yan</surname> <given-names>Y.</given-names></name> <name><surname>Lades</surname> <given-names>M.</given-names></name></person-group> (<year>1997</year>). <article-title>Face recognition: eigenface, elastic matching, and neural nets</article-title>. <source>Proc. IEEE</source>. <volume>85</volume>, <fpage>1423</fpage>&#x02013;<lpage>1435</lpage>.</citation>
</ref>
<ref id="B70">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zhang</surname> <given-names>M.</given-names></name> <name><surname>Zhe</surname> <given-names>X.</given-names></name> <name><surname>Chen</surname> <given-names>S.</given-names></name> <name><surname>Yan</surname> <given-names>H.</given-names></name></person-group> (<year>2021</year>). <article-title>Deep center-based dual-constrained hashing for discriminative face image retrieval</article-title>. <source>Pattern Recognit.</source> <volume>117</volume>, <fpage>107976</fpage>. <pub-id pub-id-type="doi">10.1016/j.patcog.2021.107976</pub-id></citation>
</ref>
<ref id="B71">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zhang</surname> <given-names>P.</given-names></name> <name><surname>Wang</surname> <given-names>C.</given-names></name></person-group> (<year>2012</year>). <article-title>The evolution of social commerce: an examination from the people, business, technology, and information perspective</article-title>. <source>CAIS</source> <volume>31</volume>, <fpage>105</fpage>&#x02013;<lpage>127</lpage>. <pub-id pub-id-type="doi">10.17705/1CAIS.03105</pub-id></citation>
</ref>
</ref-list>
</back>
</article>