<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article article-type="research-article" dtd-version="2.3" xml:lang="EN" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Artif. Intell.</journal-id>
<journal-title>Frontiers in Artificial Intelligence</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Artif. Intell.</abbrev-journal-title>
<issn pub-type="epub">2624-8212</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="publisher-id">670538</article-id>
<article-id pub-id-type="doi">10.3389/frai.2021.670538</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Artificial Intelligence</subject>
<subj-group>
<subject>Original Research</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>Kernelized Heterogeneity-Aware Cross-View Face Recognition</article-title>
<alt-title alt-title-type="left-running-head">Dhamecha et&#x20;al.</alt-title>
<alt-title alt-title-type="right-running-head">KHDA for Cross-View Face Recognition</alt-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name>
<surname>Dhamecha</surname>
<given-names>Tejas I.</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
<xref ref-type="fn" rid="fn1">
<sup>&#x2020;</sup>
</xref>
<uri xlink:href="https://loop.frontiersin.org/people/1232104/overview"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Ghosh</surname>
<given-names>Soumyadeep</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
</contrib>
<contrib contrib-type="author" corresp="yes">
<name>
<surname>Vatsa</surname>
<given-names>Mayank</given-names>
</name>
<xref ref-type="aff" rid="aff2">
<sup>2</sup>
</xref>
<xref ref-type="corresp" rid="c001">&#x2a;</xref>
<uri xlink:href="https://loop.frontiersin.org/people/1049931/overview"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Singh</surname>
<given-names>Richa</given-names>
</name>
<xref ref-type="aff" rid="aff2">
<sup>2</sup>
</xref>
<uri xlink:href="https://loop.frontiersin.org/people/1048576/overview"/>
</contrib>
</contrib-group>
<aff id="aff1">
<label>
<sup>1</sup>
</label>IIIT Delhi, <addr-line>New Delhi</addr-line>, <country>India</country>
</aff>
<aff id="aff2">
<label>
<sup>2</sup>
</label>IIT Jodhpur, <addr-line>Jodhpur</addr-line>, <country>India</country>
</aff>
<author-notes>
<fn fn-type="edited-by">
<p>
<bold>Edited by:</bold> <ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/142637/overview">Fabrizio Riguzzi</ext-link>, University of Ferrara, Italy</p>
</fn>
<fn fn-type="edited-by">
<p>
<bold>Reviewed by:</bold> <ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/1254188/overview">Hiranmoy Roy</ext-link>, RCC Institute of Information Technology, India</p>
<p>
<ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/1280214/overview">Francesco Giannini</ext-link>, University of Siena, Italy</p>
</fn>
<corresp id="c001">&#x2a;Correspondence: Mayank Vatsa, <email>mvatsa@iitj.ac.in</email>
</corresp>
<fn fn-type="equal" id="fn1">
<label>
<sup>&#x2020;</sup>
</label>
<p>Work carried out while the author was affiliated with IIIT Delhi.</p>
</fn>
<fn fn-type="other">
<p>This article was submitted to Machine Learning and Artificial Intelligence, a section of the journal Frontiers in Artificial Intelligence</p>
</fn>
</author-notes>
<pub-date pub-type="epub">
<day>20</day>
<month>07</month>
<year>2021</year>
</pub-date>
<pub-date pub-type="collection">
<year>2021</year>
</pub-date>
<volume>4</volume>
<elocation-id>670538</elocation-id>
<history>
<date date-type="received">
<day>21</day>
<month>02</month>
<year>2021</year>
</date>
<date date-type="accepted">
<day>06</day>
<month>05</month>
<year>2021</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#xa9; 2021 Dhamecha, Ghosh, Vatsa and Singh.</copyright-statement>
<copyright-year>2021</copyright-year>
<copyright-holder>Dhamecha, Ghosh, Vatsa and Singh</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/">
<p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these&#x20;terms.</p>
</license>
</permissions>
<abstract>
<p>Cross-view or heterogeneous face matching involves comparing two different views of the face modality, such as two different spectrums or resolutions. In this research, we present two heterogeneity-aware subspace techniques, heterogeneous discriminant analysis (HDA) and its kernel version (KHDA), that encode heterogeneity in the objective function and yield a suitable projection space for improved performance. They can be applied to any feature to make it heterogeneity invariant. We next propose a face recognition framework that uses existing facial features along with HDA/KHDA for matching. The effectiveness of HDA and KHDA is demonstrated using both handcrafted and learned representations on three challenging heterogeneous cross-view face recognition scenarios: (i) visible to near-infrared matching, (ii) cross-resolution matching, and (iii) digital photo to composite sketch matching. Consistently across all case studies, HDA and KHDA help to reduce the heterogeneity variance, as evidenced by the improved results. Comparison with recent heterogeneous matching algorithms shows that HDA- and KHDA-based matching yields state-of-the-art or comparable results on all three case studies. The proposed algorithms yield the best rank-1 accuracy of 99.4% on the CASIA NIR-VIS 2.0 database, up to 100% on CMU Multi-PIE across different resolutions, and a rank-10 accuracy of 95.2% on the e-PRIP database for digital to composite sketch matching.</p>
</abstract>
<kwd-group>
<kwd>face recognition (FR)</kwd>
<kwd>discriminant analysis (DA)</kwd>
<kwd>heterogeneity</kwd>
<kwd>cross-spectral</kwd>
<kwd>cross-resolution</kwd>
</kwd-group>
</article-meta>
</front>
<body>
<sec id="s1">
<title>Introduction</title>
<p>With increasing focus on security and surveillance, face biometrics has found several new applications and challenges in real-world scenarios. In current law enforcement practice, legacy mugshot databases are captured with good quality face cameras operating in the visible spectrum (VIS) with an inter-eye distance of at least 90 pixels (<xref ref-type="bibr" rid="B60">Wilson et&#x20;al., 2007</xref>). However, for security and law enforcement applications, it is difficult to meet these standard requirements. For instance, in a surveillance environment, when the illumination is not sufficient, the majority of surveillance cameras capture videos in the near-infrared spectrum (NIR). Even in a daytime environment, an image captured at a distance may have only a <inline-formula id="inf1">
<mml:math id="m1">
<mml:mrow>
<mml:mn>16</mml:mn>
<mml:mtext>&#x2009;</mml:mtext>
<mml:mo>&#xd7;</mml:mo>
<mml:mtext>&#x2009;</mml:mtext>
<mml:mn>16</mml:mn>
</mml:mrow>
</mml:math>
</inline-formula> facial region available for processing. For these applications, the corresponding gallery or database image is generally a good quality mugshot image captured in a controlled environment. This leads to the challenge of heterogeneity between gallery and probe images. <xref ref-type="fig" rid="F1">Figure&#x20;1</xref> shows samples of these heterogeneous face matching cases. This figure also showcases another interesting application: matching composite sketch images with digital face images. In this problem, composite sketches are generated using a software tool based on an eyewitness description, and this synthetic sketch image is then matched against a database of mugshot face images. Since the information content in sketches and photos is different, matching them can be viewed as a heterogeneous matching problem.</p>
<fig id="F1" position="float">
<label>FIGURE 1</label>
<caption>
<p>Examples of heterogeneous face recognition scenarios. Top row <bold>(A)</bold> shows heterogeneity due to the difference between the visible and near-infrared spectrum; <bold>(B)</bold>&#x20;shows a photo and composite sketches of a person. <bold>(C)</bold>&#x2013;<bold>(F)</bold> illustrate heterogeneity due to resolution variations of 72&#x2009;&#xd7;&#x2009;72, 48&#x2009;&#xd7;&#x2009;48, 32&#x2009;&#xd7;&#x2009;32, and 16&#x2009;&#xd7;&#x2009;16, respectively. (The&#x20;images of different resolutions are stretched to a common size.)</p>
</caption>
<graphic xlink:href="frai-04-670538-g001.tif"/>
</fig>
<p>The challenge of heterogeneous face recognition is posed by the fact that the view<xref ref-type="fn" rid="fn2">
<sup>1</sup>
</xref> of the query face image is not the same as that of the gallery image. In a broader sense, two face images are said to have different views if the facial information in the images is represented differently. For example, visible and near-infrared images are two views. The difference in views may arise due to several factors, such as differences in sensors, their operating spectrum range, and differences in the process of sample generation. Most traditional face recognition research has focused on homogeneous matching (<xref ref-type="bibr" rid="B1">Bhatt et&#x20;al., 2015</xref>), that is, when both gallery and probe images have the same view. In the recent past, researchers have addressed the challenges of heterogeneous face recognition (<xref ref-type="bibr" rid="B59">Tang and Wang, 2003</xref>; <xref ref-type="bibr" rid="B64">Yi et&#x20;al., 2007</xref>; <xref ref-type="bibr" rid="B31">Lei and Li, 2009</xref>; <xref ref-type="bibr" rid="B32">Lei et&#x20;al., 2012a</xref>; <xref ref-type="bibr" rid="B27">Klare and Jain, 2013</xref>; <xref ref-type="bibr" rid="B23">Jin et&#x20;al., 2015</xref>). Compared to homogeneous face recognition, matching face images with different views is a challenging problem, as heterogeneity leads to an increase in intra-class variability.</p>
<sec id="s1-1">
<title>Literature Review</title>
<p>The literature pertaining to heterogeneous face recognition can be grouped into two broad categories: 1) heterogeneity invariant features and 2) heterogeneity-aware classifiers. Heterogeneity invariant feature&#x2013;based approaches focus on extracting features that are invariant across different views. Prominent research includes the use of handcrafted features such as variants of the histogram of oriented gradients (HOG), Gabor, Weber, and local binary patterns (LBP) (<xref ref-type="bibr" rid="B39">Liao et&#x20;al., 2009</xref>; <xref ref-type="bibr" rid="B15">Goswami et&#x20;al., 2011</xref>; <xref ref-type="bibr" rid="B25">Kalka et&#x20;al., 2011</xref>; <xref ref-type="bibr" rid="B6">Chen and Ross, 2013</xref>; <xref ref-type="bibr" rid="B11">Dhamecha et&#x20;al., 2014</xref>), and various learning-based features (<xref ref-type="bibr" rid="B63">Yi et&#x20;al., 2015</xref>; <xref ref-type="bibr" rid="B41">Liu et&#x20;al., 2016</xref>; <xref ref-type="bibr" rid="B51">Reale et&#x20;al., 2016</xref>; <xref ref-type="bibr" rid="B19">He et&#x20;al., 2017</xref>; <xref ref-type="bibr" rid="B21">Hu et&#x20;al., 2018</xref>; <xref ref-type="bibr" rid="B7">Cho et&#x20;al., 2020</xref>). Heterogeneity-aware classifier&#x2013;based approaches focus on learning a model using samples from both views. In this research, we primarily focus on designing a heterogeneity-aware classifier.</p>
<p>One set of work addresses heterogeneity in a projection space or by statistically learning features suitable for heterogeneous matching. On these lines, one of the earliest works on visible to near-infrared matching, proposed by <xref ref-type="bibr" rid="B64">Yi et&#x20;al. (2007)</xref>, utilizes canonical correlation analysis (CCA), which finds the projections in an unsupervised manner. It computes two projection directions, one for each view, such that the correlation between them is maximized in the projection space. Closely related to CCA, <xref ref-type="bibr" rid="B55">Sharma et&#x20;al. (2012)</xref> proposed generalized multi-view analysis (GMA) by adding a constraint that the multi-view samples of each class be as close as possible. A similar multi-view extension of discriminant analysis has also been explored (<xref ref-type="bibr" rid="B26">Kan et&#x20;al., 2016</xref>). Further, dictionary learning has also been utilized for heterogeneous matching (<xref ref-type="bibr" rid="B24">Juefei-Xu et&#x20;al., 2015</xref>; <xref ref-type="bibr" rid="B61">Wu et&#x20;al., 2016</xref>). Efforts to extract heterogeneity-specific features have resulted in the common discriminant feature extractor (CDFE) (<xref ref-type="bibr" rid="B40">Lin and Tang, 2006</xref>), coupled spectral regression (CSR) (<xref ref-type="bibr" rid="B31">Lei and Li, 2009</xref>) and its extensions (<xref ref-type="bibr" rid="B32">Lei et&#x20;al., 2012a</xref>, <xref ref-type="bibr" rid="B33">b</xref>), common feature discriminant analysis (CFDA) (<xref ref-type="bibr" rid="B37">Li et&#x20;al., 2014</xref>), coupled discriminative feature learning (CDFL) (<xref ref-type="bibr" rid="B23">Jin et&#x20;al., 2015</xref>), and coupled compact binary face descriptors (C-CBFD) (<xref ref-type="bibr" rid="B43">Lu et&#x20;al., 2015</xref>). Similarly, mutual component analysis (MCA) (<xref ref-type="bibr" rid="B38">Li et&#x20;al., 2016</xref>) utilizes an iterative EM approach, along with a model of the face generation process, to capture view-invariant characteristics.</p>
<p>Although statistical in spirit, a body of work approaches the heterogeneity challenge as a manifold modeling problem. These works explore manifold learning&#x2013;based approaches to learn a heterogeneity-aware classifier. <xref ref-type="bibr" rid="B35">Li et&#x20;al. (2010)</xref> proposed a locality preserving projections (LPP)&#x2013;based approach that preserves the local neighborhood in the projection space. <xref ref-type="bibr" rid="B4">Biswas et&#x20;al. (2013</xref>, <xref ref-type="bibr" rid="B5">2012)</xref> proposed a multidimensional scaling (MDS)&#x2013;based approach for matching low-resolution face images. The algorithm learns an MDS transformation that maps pairwise distances in the kernel space of one view to the corresponding pairwise distances of the other view. <xref ref-type="bibr" rid="B27">Klare and Jain (2013)</xref> proposed a prototyping-based approach. It explores the intuition that, across different views, the relative coordinates of samples should remain similar. Therefore, the vector of similarities between the query sample and prototype samples in the corresponding view may be used as the feature.</p>
<p>Other research directions, such as maximum margin classifiers (<xref ref-type="bibr" rid="B57">Siena et&#x20;al., 2013</xref>) and transductive learning (<xref ref-type="bibr" rid="B65">Zhu et&#x20;al., 2014</xref>), have also been explored. Further, deep learning&#x2013;based approaches have been proposed for heterogeneous matching to learn shared representations (<xref ref-type="bibr" rid="B63">Yi et&#x20;al., 2015</xref>), to leverage large homogeneous data (<xref ref-type="bibr" rid="B51">Reale et&#x20;al., 2016</xref>), to learn using limited data (<xref ref-type="bibr" rid="B21">Hu et&#x20;al., 2018</xref>), to facilitate transfer learning (<xref ref-type="bibr" rid="B41">Liu et&#x20;al., 2016</xref>), to perform face hallucination <italic>via</italic> disentangling (<xref ref-type="bibr" rid="B12">Duan et&#x20;al., 2020</xref>), and to learn deep models using the Wasserstein distance (<xref ref-type="bibr" rid="B20">He et&#x20;al., 2019</xref>). <xref ref-type="bibr" rid="B10">Deng Z. et&#x20;al. (2019)</xref> extend MCA to utilize convolutional neural networks for heterogeneous matching. Most recent representation learning methods have large parameter spaces and hence require enormous amounts of data for training heterogeneous matching models. Nevertheless, learned face representations from such approaches are found to be very effective (<xref ref-type="bibr" rid="B58">Taigman et&#x20;al., 2014</xref>; <xref ref-type="bibr" rid="B44">Majumdar et&#x20;al., 2016</xref>; <xref ref-type="bibr" rid="B62">Wu et&#x20;al., 2018</xref>; <xref ref-type="bibr" rid="B9">Deng J.&#x20;et&#x20;al., 2019</xref>).</p>
<p>In the literature, we identify scope for improving statistical techniques for heterogeneous matching scenarios. Specifically, we observe that for the heterogeneous matching task, modeling the intra-view variability is not critical, as the task always involves matching an inter-view/heterogeneous face pair. The objective functions of the proposed approaches differ from the literature in focusing only on the inter-view variability. To this end, we present two subspace-based classifiers aimed at reducing the inter-view intra-class variability and increasing the inter-view inter-class variability for heterogeneous face recognition. Specifically, in this article, we<list list-type="simple">
<list-item>
<p>&#x2022; propose heterogeneous discriminant analysis (HDA) and its nonlinear kernel extension (KHDA),</p>
</list-item>
<list-item>
<p>&#x2022; demonstrate the effectiveness of HDA and KHDA using multiple features on three challenging heterogeneous face recognition scenarios: matching visible to near-infrared images, matching cross-resolution face images, and matching digital photos to composite sketches,&#x20;and</p>
</list-item>
<list-item>
<p>&#x2022; utilize deep learning&#x2013;based features and show that, combined with the proposed HDA and KHDA, they yield impressive heterogeneous matching performance.</p>
</list-item>
</list>
</p>
</sec>
</sec>
<sec id="s2">
<title>Heterogeneous Discriminant Analysis</title>
<p>To address the issue of heterogeneity in face recognition, we propose a discriminant analysis&#x2013;based approach. In this context, the heterogeneity can arise due to factors such as spectrum variations as shown in <xref ref-type="fig" rid="F1">Figure&#x20;1</xref>. The same individual may appear somewhat different in two different spectrums. While a feature extractor may filter out some of the heterogeneity, most feature extractors are not designed to be heterogeneity invariant. Therefore, for practical purposes, the heterogeneity of the source image may be retained in the extracted features.</p>
<p>By definition, the end goal of heterogeneous matching is always a cross-view comparison, for example, VIS to NIR matching, and never an intra-view comparison such as VIS to VIS matching. Therefore, the cross-view information would contain stronger cues for the task than the intra-view information. In other words, optimizing the intra-view variation may have limited utility. It is our hypothesis that incorporating only the cross-view (e.g., cross-spectral) information, along with intra- and inter-class variability, can improve heterogeneous matching. The proposed heterogeneous discriminant analysis is inspired by the formulation of linear discriminant analysis. Therefore, we first briefly summarize the formulation and limitations of linear discriminant analysis (LDA), followed by the details of&#x20;HDA.</p>
<p>Traditionally, intra- and inter-class variabilities are represented using within- <inline-formula id="inf2">
<mml:math id="m2">
<mml:mrow>
<mml:msub>
<mml:mi>S</mml:mi>
<mml:mi>W</mml:mi>
</mml:msub>
<mml:mo>&#x3d;</mml:mo>
<mml:mstyle displaystyle="true">
<mml:msubsup>
<mml:mo>&#x2211;</mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>c</mml:mi>
</mml:msubsup>
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:msubsup>
<mml:mo>&#x2211;</mml:mo>
<mml:mrow>
<mml:mi>j</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:msub>
<mml:mi>n</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:msubsup>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>x</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2212;</mml:mo>
<mml:msub>
<mml:mi>&#x3bc;</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>x</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2212;</mml:mo>
<mml:msub>
<mml:mi>&#x3bc;</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mi>T</mml:mi>
</mml:msup>
</mml:mrow>
</mml:mstyle>
</mml:mrow>
</mml:mstyle>
</mml:mrow>
</mml:math>
</inline-formula> and between-class scatter matrices <inline-formula id="inf3">
<mml:math id="m3">
<mml:mrow>
<mml:msub>
<mml:mi>S</mml:mi>
<mml:mi>B</mml:mi>
</mml:msub>
<mml:mo>&#x3d;</mml:mo>
<mml:mstyle displaystyle="true">
<mml:msubsup>
<mml:mo>&#x2211;</mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>c</mml:mi>
</mml:msubsup>
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:msubsup>
<mml:mo>&#x2211;</mml:mo>
<mml:mrow>
<mml:mi>l</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>&#x2b;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>c</mml:mi>
</mml:msubsup>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>&#x3bc;</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>&#x2212;</mml:mo>
<mml:msub>
<mml:mi>&#x3bc;</mml:mi>
<mml:mi>l</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>&#x3bc;</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>&#x2212;</mml:mo>
<mml:msub>
<mml:mi>&#x3bc;</mml:mi>
<mml:mi>l</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mi>T</mml:mi>
</mml:msup>
</mml:mrow>
</mml:mstyle>
</mml:mrow>
</mml:mstyle>
</mml:mrow>
</mml:math>
</inline-formula>; where <italic>c</italic> is the total number of classes, <inline-formula id="inf4">
<mml:math id="m4">
<mml:mrow>
<mml:msub>
<mml:mi>n</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> is the number of samples in <inline-formula id="inf5">
<mml:math id="m5">
<mml:mrow>
<mml:msup>
<mml:mi>i</mml:mi>
<mml:mrow>
<mml:mi>t</mml:mi>
<mml:mi>h</mml:mi>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:math>
</inline-formula> class, <inline-formula id="inf6">
<mml:math id="m6">
<mml:mrow>
<mml:msub>
<mml:mi>x</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> is the <inline-formula id="inf7">
<mml:math id="m7">
<mml:mrow>
<mml:msup>
<mml:mi>j</mml:mi>
<mml:mrow>
<mml:mi>t</mml:mi>
<mml:mi>h</mml:mi>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:math>
</inline-formula> sample of the <inline-formula id="inf8">
<mml:math id="m8">
<mml:mrow>
<mml:msup>
<mml:mi>i</mml:mi>
<mml:mrow>
<mml:mi>t</mml:mi>
<mml:mi>h</mml:mi>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:math>
</inline-formula> class, and <inline-formula id="inf9">
<mml:math id="m9">
<mml:mrow>
<mml:msub>
<mml:mi>&#x3bc;</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> is the mean of the <inline-formula id="inf10">
<mml:math id="m10">
<mml:mrow>
<mml:msup>
<mml:mi>i</mml:mi>
<mml:mrow>
<mml:mi>t</mml:mi>
<mml:mi>h</mml:mi>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:math>
</inline-formula> class. The Fisher criterion <inline-formula id="inf11">
<mml:math id="m11">
<mml:mrow>
<mml:mi>J</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mi>w</mml:mi>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mo>&#x3d;</mml:mo>
<mml:mrow>
<mml:mrow>
<mml:mo>&#x7c;</mml:mo>
<mml:msup>
<mml:mi>w</mml:mi>
<mml:mi>T</mml:mi>
</mml:msup>
<mml:msub>
<mml:mi>S</mml:mi>
<mml:mi>B</mml:mi>
</mml:msub>
<mml:mi>w</mml:mi>
<mml:mo>&#x7c;</mml:mo>
</mml:mrow>
<mml:mo>/</mml:mo>
<mml:mrow>
<mml:mo>&#x7c;</mml:mo>
<mml:msup>
<mml:mi>w</mml:mi>
<mml:mi>T</mml:mi>
</mml:msup>
<mml:msub>
<mml:mi>S</mml:mi>
<mml:mi>W</mml:mi>
</mml:msub>
<mml:mi>w</mml:mi>
<mml:mo>&#x7c;</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mrow>
</mml:math>
</inline-formula> attempts to find the projection directions that minimize the intra-class variability and maximize the inter-class variability in the projected space.</p>
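<p>For concreteness, the scatter matrices and the Fisher criterion above can be sketched in a few lines of NumPy (an illustrative sketch in our notation, not code from a released implementation). Per the definitions above, the between-class scatter sums over all class pairs, and the directions maximizing <italic>J</italic>(<italic>w</italic>) are obtained from the generalized eigenproblem <italic>S<sub>B</sub>w</italic> = <italic>&#x3bb;S<sub>W</sub>w</italic>, with a small ridge added to <italic>S<sub>W</sub></italic> for numerical stability.</p>

```python
import numpy as np

def scatter_matrices(X, y):
    """Within-class scatter S_W and pairwise between-class scatter S_B,
    following the definitions in the text."""
    classes = np.unique(y)
    d = X.shape[1]
    Sw = np.zeros((d, d))
    means = []
    for c in classes:
        Xc = X[y == c]
        mu = Xc.mean(axis=0)
        means.append(mu)
        D = Xc - mu                      # (x_ij - mu_i) for each sample
        Sw += D.T @ D
    Sb = np.zeros((d, d))
    for i in range(len(means)):
        for l in range(i + 1, len(means)):
            diff = (means[i] - means[l])[:, None]
            Sb += diff @ diff.T          # (mu_i - mu_l)(mu_i - mu_l)^T
    return Sw, Sb

def fisher_directions(X, y, eps=1e-6):
    """Directions maximizing J(w) = |w^T S_B w| / |w^T S_W w|,
    via eigendecomposition of (S_W + eps*I)^{-1} S_B."""
    Sw, Sb = scatter_matrices(X, y)
    M = np.linalg.solve(Sw + eps * np.eye(Sw.shape[0]), Sb)
    evals, evecs = np.linalg.eig(M)
    order = np.argsort(-evals.real)
    return evecs[:, order].real
```

<p>Projecting samples onto the leading direction maximizes class separation relative to the within-class spread; with <italic>c</italic> classes, at most <italic>c</italic> &#x2212; 1 directions carry discriminative information.</p>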
<p>The way the scatter matrices are defined ensures that all samples are as close to the corresponding class mean as possible and that class means are as far apart as possible. Any new sample resembling the samples of a certain class would be projected near the corresponding class mean. LDA attempts to optimize the projection directions assuming that the data conform to a normal distribution. Obtaining such a projection space is useful when the samples to be compared are homogeneous, that is, when there is no inherent difference in the sample representation. Even if we assume that each view of each class is normally distributed in itself, this restrictive constraint of LDA is not satisfied. As shown in <xref ref-type="fig" rid="F2">Figure&#x20;2</xref>, when provided with multi-view or heterogeneous data, the projection directions obtained from LDA may be suboptimal and can affect the classification performance. Therefore, for heterogeneous matching problems, we propose to incorporate the view information while computing the between- and within-class scatter matrices.</p>
<fig id="F2" position="float">
<label>FIGURE 2</label>
<caption>
<p>
<bold>(A)</bold> Graphical interpretation of HDA and <bold>(B&#x2013;D)</bold> illustration of the effectiveness of HDA with multiple views. Classes 1 and 2 are generated using a Gaussian mixture with two modes, resulting in two views. <bold>(B)</bold> shows the scatter plot and the projection directions obtained using LDA and HDA (without regularization). The&#x20;histograms of the projections of data samples on the LDA and HDA directions are shown in <bold>(C)</bold> and <bold>(D)</bold>, respectively.</p>
</caption>
<graphic xlink:href="frai-04-670538-g002.tif"/>
</fig>
<p>The formulation of the proposed heterogeneous discriminant analysis is described in the following two stages: 1) adaptation of scatter matrices and 2) analytical solution.</p>
<sec id="s2-1">
<title>Adaptation of Scatter Matrices</title>
<p>Let <inline-formula id="inf12">
<mml:math id="m12">
<mml:mrow>
<mml:msubsup>
<mml:mi>x</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>j</mml:mi>
</mml:mrow>
<mml:mi>a</mml:mi>
</mml:msubsup>
</mml:mrow>
</mml:math>
</inline-formula> and <inline-formula id="inf13">
<mml:math id="m13">
<mml:mrow>
<mml:msubsup>
<mml:mi>x</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>j</mml:mi>
</mml:mrow>
<mml:mi>b</mml:mi>
</mml:msubsup>
</mml:mrow>
</mml:math>
</inline-formula> denote the two views (A and B) of the <inline-formula id="inf14">
<mml:math id="m14">
<mml:mrow>
<mml:msup>
<mml:mi>j</mml:mi>
<mml:mrow>
<mml:mtext>th</mml:mtext>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:math>
</inline-formula> sample of the <inline-formula id="inf15">
<mml:math id="m15">
<mml:mrow>
<mml:msup>
<mml:mi>i</mml:mi>
<mml:mrow>
<mml:mtext>th</mml:mtext>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:math>
</inline-formula> class, respectively, and <inline-formula id="inf16">
<mml:math id="m16">
<mml:mrow>
<mml:msubsup>
<mml:mi>n</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>a</mml:mi>
</mml:msubsup>
</mml:mrow>
</mml:math>
</inline-formula> and <inline-formula id="inf17">
<mml:math id="m17">
<mml:mrow>
<mml:msubsup>
<mml:mi>n</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>b</mml:mi>
</mml:msubsup>
</mml:mrow>
</mml:math>
</inline-formula> represent the number of samples in view A and B of the <inline-formula id="inf18">
<mml:math id="m18">
<mml:mrow>
<mml:msup>
<mml:mi>i</mml:mi>
<mml:mrow>
<mml:mtext>th</mml:mtext>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:math>
</inline-formula> class, respectively. <inline-formula id="inf19">
<mml:math id="m19">
<mml:mrow>
<mml:msubsup>
<mml:mi>&#x3c7;</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>a</mml:mi>
</mml:msubsup>
<mml:mo>&#x3d;</mml:mo>
<mml:mrow>
<mml:mo>{</mml:mo>
<mml:mrow>
<mml:msubsup>
<mml:mi>x</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>j</mml:mi>
</mml:mrow>
<mml:mi>a</mml:mi>
</mml:msubsup>
<mml:mo>|</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>&#x2264;</mml:mo>
<mml:mi>j</mml:mi>
<mml:mo>&#x2264;</mml:mo>
<mml:msubsup>
<mml:mi>n</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>a</mml:mi>
</mml:msubsup>
</mml:mrow>
<mml:mo>}</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
</inline-formula> represents the samples in view A of <inline-formula id="inf20">
<mml:math id="m20">
<mml:mrow>
<mml:msup>
<mml:mi>i</mml:mi>
<mml:mrow>
<mml:mtext>th</mml:mtext>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:math>
</inline-formula> class. For example, <inline-formula id="inf21">
<mml:math id="m21">
<mml:mrow>
<mml:msubsup>
<mml:mi>&#x3c7;</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>a</mml:mi>
</mml:msubsup>
</mml:mrow>
</mml:math>
</inline-formula> represents the visible spectrum face images of <inline-formula id="inf22">
<mml:math id="m22">
<mml:mrow>
<mml:msup>
<mml:mi>i</mml:mi>
<mml:mrow>
<mml:mtext>th</mml:mtext>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:math>
</inline-formula> subject, and <inline-formula id="inf23">
<mml:math id="m23">
<mml:mrow>
<mml:msubsup>
<mml:mi>&#x3c7;</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>b</mml:mi>
</mml:msubsup>
</mml:mrow>
</mml:math>
</inline-formula> represents the near-infrared spectrum face images of the same subject.<list list-type="simple">
<list-item>
<p>
<sub>&#x2022;</sub> <inline-formula id="inf24">
<mml:math id="m24">
<mml:mrow>
<mml:msubsup>
<mml:mi>&#x03C7;</mml:mi>
<mml:mn>1</mml:mn>
<mml:mi>a</mml:mi>
</mml:msubsup>
<mml:mtext>&#x2009;</mml:mtext>
<mml:mo>&#x2212;</mml:mo>
<mml:mtext>&#x2009;</mml:mtext>
<mml:msubsup>
<mml:mi>&#x03C7;</mml:mi>
<mml:mn>1</mml:mn>
<mml:mi>a</mml:mi>
</mml:msubsup>
<mml:mtext>&#x2009;</mml:mtext>
<mml:mtext>and</mml:mtext>
<mml:mtext>&#x2009;</mml:mtext>
<mml:msubsup>
<mml:mi>&#x03C7;</mml:mi>
<mml:mn>1</mml:mn>
<mml:mi>a</mml:mi>
</mml:msubsup>
<mml:mtext>&#x2009;</mml:mtext>
<mml:mo>&#x2212;</mml:mo>
<mml:mtext>&#x2009;</mml:mtext>
<mml:msubsup>
<mml:mi>&#x03C7;</mml:mi>
<mml:mn>1</mml:mn>
<mml:mi>b</mml:mi>
</mml:msubsup>
</mml:mrow>
</mml:math>
</inline-formula> are examples of match pairs, that is, face images in a pair belong to the same subject.</p>
</list-item>
<list-item>
<p>
<sub>&#x2022;</sub> <inline-formula id="inf25">
<mml:math id="m28">
<mml:mrow>
<mml:msubsup>
<mml:mi>&#x03C7;</mml:mi>
<mml:mn>1</mml:mn>
<mml:mi>a</mml:mi>
</mml:msubsup>
<mml:mtext>&#x2009;</mml:mtext>
<mml:mo>&#x2212;</mml:mo>
<mml:mtext>&#x2009;</mml:mtext>
<mml:msubsup>
<mml:mi>&#x03C7;</mml:mi>
<mml:mn>2</mml:mn>
<mml:mi>b</mml:mi>
</mml:msubsup>
<mml:mtext>&#x2009;</mml:mtext>
<mml:mtext>and</mml:mtext>
<mml:mtext>&#x2009;</mml:mtext>
<mml:msubsup>
<mml:mi>&#x03C7;</mml:mi>
<mml:mn>1</mml:mn>
<mml:mi>b</mml:mi>
</mml:msubsup>
<mml:mtext>&#x2009;</mml:mtext>
<mml:mo>&#x2212;</mml:mo>
<mml:mtext>&#x2009;</mml:mtext>
<mml:msubsup>
<mml:mi>&#x03C7;</mml:mi>
<mml:mn>2</mml:mn>
<mml:mi>a</mml:mi>
</mml:msubsup>
</mml:mrow>
</mml:math>
</inline-formula> are examples of non-match pairs consisting of face images of different subjects.</p>
</list-item>
<list-item>
<p>
<sub>&#x2022;</sub> <inline-formula id="inf26">
<mml:math id="m32">
<mml:mrow>
<mml:msubsup>
<mml:mi>&#x03C7;</mml:mi>
<mml:mn>1</mml:mn>
<mml:mi>a</mml:mi>
</mml:msubsup>
<mml:mtext>&#x2009;</mml:mtext>
<mml:mo>&#x2212;</mml:mo>
<mml:mtext>&#x2009;</mml:mtext>
<mml:msubsup>
<mml:mi>&#x03C7;</mml:mi>
<mml:mn>1</mml:mn>
<mml:mi>a</mml:mi>
</mml:msubsup>
<mml:mtext>&#x2009;</mml:mtext>
<mml:mtext>and</mml:mtext>
<mml:mtext>&#x2009;</mml:mtext>
<mml:msubsup>
<mml:mi>&#x03C7;</mml:mi>
<mml:mn>1</mml:mn>
<mml:mi>b</mml:mi>
</mml:msubsup>
<mml:mtext>&#x2009;</mml:mtext>
<mml:mo>&#x2212;</mml:mo>
<mml:mtext>&#x2009;</mml:mtext>
<mml:msubsup>
<mml:mi>&#x03C7;</mml:mi>
<mml:mn>2</mml:mn>
<mml:mi>b</mml:mi>
</mml:msubsup>
</mml:mrow>
</mml:math>
</inline-formula> represent intra-view pairs where face images belong to the same&#x20;view.</p>
</list-item>
<list-item>
<p>
<sub>&#x2022;</sub> <inline-formula id="inf27">
<mml:math id="m36">
<mml:mrow>
<mml:msubsup>
<mml:mi>&#x03C7;</mml:mi>
<mml:mn>1</mml:mn>
<mml:mi>a</mml:mi>
</mml:msubsup>
<mml:mtext>&#x2009;</mml:mtext>
<mml:mo>&#x2212;</mml:mo>
<mml:mtext>&#x2009;</mml:mtext>
<mml:msubsup>
<mml:mi>&#x03C7;</mml:mi>
<mml:mn>1</mml:mn>
<mml:mi>b</mml:mi>
</mml:msubsup>
<mml:mtext>&#x2009;</mml:mtext>
<mml:mtext>and</mml:mtext>
<mml:mtext>&#x2009;</mml:mtext>
<mml:msubsup>
<mml:mi>&#x03C7;</mml:mi>
<mml:mn>1</mml:mn>
<mml:mi>b</mml:mi>
</mml:msubsup>
<mml:mtext>&#x2009;</mml:mtext>
<mml:mo>&#x2212;</mml:mo>
<mml:mtext>&#x2009;</mml:mtext>
<mml:msubsup>
<mml:mi>&#x03C7;</mml:mi>
<mml:mn>2</mml:mn>
<mml:mi>a</mml:mi>
</mml:msubsup>
</mml:mrow>
</mml:math>
</inline-formula> are examples of inter-view pairs, that is, face images in a pair belong to different&#x20;views.</p>
</list-item>
</list>
</p>
<p>There can be four kinds of information: i) inter-class intra-view difference, ii) inter-class inter-view difference, iii) intra-class intra-view difference, and iv) intra-class inter-view difference. Optimizing the intra-view (homogeneous) distances would not contribute to achieving the goal of efficient heterogeneous matching. Therefore, the scatter matrices should be defined such that the objective function reduces the heterogeneity (inter-view variation) along with improving the classification accuracy. The distance between inter-view samples of non-match pairs should be increased, and the distance between inter-view samples of match pairs should be decreased. With this hypothesis, we propose the following two modifications to the scatter matrices for heterogeneous matching:</p>
<p>Inter-class inter-view difference encodes the difference between different views of two individuals (e.g., <inline-formula id="inf28">
<mml:math id="m40">
<mml:mrow>
<mml:msubsup>
<mml:mi>&#x03C7;</mml:mi>
<mml:mn>1</mml:mn>
<mml:mi>a</mml:mi>
</mml:msubsup>
<mml:mtext>&#x2009;</mml:mtext>
<mml:mo>&#x2212;</mml:mo>
<mml:mtext>&#x2009;</mml:mtext>
<mml:msubsup>
<mml:mi>&#x03C7;</mml:mi>
<mml:mn>2</mml:mn>
<mml:mi>b</mml:mi>
</mml:msubsup>
<mml:mtext>&#x2009;</mml:mtext>
<mml:mtext>and</mml:mtext>
<mml:mtext>&#x2009;</mml:mtext>
<mml:msubsup>
<mml:mi>&#x03C7;</mml:mi>
<mml:mn>1</mml:mn>
<mml:mi>b</mml:mi>
</mml:msubsup>
<mml:mtext>&#x2009;</mml:mtext>
<mml:mo>&#x2212;</mml:mo>
<mml:mtext>&#x2009;</mml:mtext>
<mml:msubsup>
<mml:mi>&#x03C7;</mml:mi>
<mml:mn>2</mml:mn>
<mml:mi>a</mml:mi>
</mml:msubsup>
</mml:mrow>
</mml:math>
</inline-formula> pairs). This can be incorporated in the between-class scatter matrix.</p>
<p>Intra-class inter-view difference encodes the difference between two different views of one person (e.g., <inline-formula id="inf30">
<mml:math id="m44">
<mml:mrow>
<mml:msubsup>
<mml:mi>&#x03C7;</mml:mi>
<mml:mn>1</mml:mn>
<mml:mi>a</mml:mi>
</mml:msubsup>
<mml:mtext>&#x2009;</mml:mtext>
<mml:mo>&#x2212;</mml:mo>
<mml:mtext>&#x2009;</mml:mtext>
<mml:msubsup>
<mml:mi>&#x03C7;</mml:mi>
<mml:mn>1</mml:mn>
<mml:mi>b</mml:mi>
</mml:msubsup>
<mml:mtext>&#x2009;</mml:mtext>
<mml:mtext>and</mml:mtext>
<mml:mtext>&#x2009;</mml:mtext>
<mml:msubsup>
<mml:mi>&#x03C7;</mml:mi>
<mml:mn>2</mml:mn>
<mml:mi>b</mml:mi>
</mml:msubsup>
<mml:mtext>&#x2009;</mml:mtext>
<mml:mo>&#x2212;</mml:mo>
<mml:mtext>&#x2009;</mml:mtext>
<mml:msubsup>
<mml:mi>&#x03C7;</mml:mi>
<mml:mn>2</mml:mn>
<mml:mi>a</mml:mi>
</mml:msubsup>
</mml:mrow>
</mml:math>
</inline-formula> pairs). This can be incorporated in the within-class scatter matrix (see <xref ref-type="fig" rid="F2">Figure&#x20;2</xref>).</p>
<p>Incorporating these yields a projection space in which same-class samples from different views are drawn closer, thereby fine-tuning the objective function for heterogeneous matching. The heterogeneous between-class scatter matrix (<inline-formula id="inf32">
<mml:math id="m47">
<mml:mrow>
<mml:msub>
<mml:mi>S</mml:mi>
<mml:mrow>
<mml:mi>H</mml:mi>
<mml:mi>B</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula>) encodes the difference between different views of different classes<disp-formula id="e1">
<mml:math id="m48">
<mml:mtable columnalign="left">
<mml:mtr>
<mml:mtd>
<mml:msub>
<mml:mi>S</mml:mi>
<mml:mrow>
<mml:mi>H</mml:mi>
<mml:mi>B</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x3d;</mml:mo>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:munder>
<mml:mstyle displaystyle="true">
<mml:mo>&#x2211;</mml:mo>
</mml:mstyle>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:munder>
</mml:mrow>
<mml:mi>c</mml:mi>
</mml:mover>
</mml:mrow>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:munder>
<mml:mstyle displaystyle="true">
<mml:mo>&#x2211;</mml:mo>
</mml:mstyle>
<mml:mrow>
<mml:mi>l</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>,</mml:mo>
<mml:mi>l</mml:mi>
<mml:mo>&#x2260;</mml:mo>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:munder>
</mml:mrow>
<mml:mi>c</mml:mi>
</mml:mover>
</mml:mrow>
<mml:msubsup>
<mml:mi>p</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>a</mml:mi>
</mml:msubsup>
<mml:msubsup>
<mml:mi>p</mml:mi>
<mml:mi>l</mml:mi>
<mml:mi>b</mml:mi>
</mml:msubsup>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msubsup>
<mml:mi>&#x3bc;</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>a</mml:mi>
</mml:msubsup>
<mml:mo>&#x2212;</mml:mo>
<mml:msubsup>
<mml:mi>&#x3bc;</mml:mi>
<mml:mi>l</mml:mi>
<mml:mi>b</mml:mi>
</mml:msubsup>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msubsup>
<mml:mi>&#x3bc;</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>a</mml:mi>
</mml:msubsup>
<mml:mo>&#x2212;</mml:mo>
<mml:msubsup>
<mml:mi>&#x3bc;</mml:mi>
<mml:mi>l</mml:mi>
<mml:mi>b</mml:mi>
</mml:msubsup>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mi>T</mml:mi>
</mml:msup>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:msubsup>
<mml:mi>&#x3bc;</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>k</mml:mi>
</mml:msubsup>
<mml:mo>&#x3d;</mml:mo>
<mml:mfrac>
<mml:mn>1</mml:mn>
<mml:mrow>
<mml:msubsup>
<mml:mi>n</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>k</mml:mi>
</mml:msubsup>
</mml:mrow>
</mml:mfrac>
<mml:munder>
<mml:mstyle displaystyle="true">
<mml:mo>&#x2211;</mml:mo>
</mml:mstyle>
<mml:mi>j</mml:mi>
</mml:munder>
<mml:msubsup>
<mml:mi>x</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>j</mml:mi>
</mml:mrow>
<mml:mi>k</mml:mi>
</mml:msubsup>
<mml:mo>,</mml:mo>
<mml:mtext>&#x2009;</mml:mtext>
<mml:msubsup>
<mml:mi>p</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>k</mml:mi>
</mml:msubsup>
<mml:mo>&#x3d;</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:msubsup>
<mml:mi>n</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>k</mml:mi>
</mml:msubsup>
</mml:mrow>
<mml:mrow>
<mml:msup>
<mml:mi>n</mml:mi>
<mml:mi>a</mml:mi>
</mml:msup>
<mml:mo>&#x2b;</mml:mo>
<mml:msup>
<mml:mi>n</mml:mi>
<mml:mi>b</mml:mi>
</mml:msup>
</mml:mrow>
</mml:mfrac>
<mml:mo>,</mml:mo>
<mml:mtext>&#x2009;</mml:mtext>
<mml:mi>k</mml:mi>
<mml:mo>&#x2208;</mml:mo>
<mml:mrow>
<mml:mo>{</mml:mo>
<mml:mrow>
<mml:mi>a</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>b</mml:mi>
</mml:mrow>
<mml:mo>}</mml:mo>
</mml:mrow>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:math>
<label>(1)</label>
</disp-formula>
</p>
<p>Here, <inline-formula id="inf33">
<mml:math id="m49">
<mml:mrow>
<mml:msubsup>
<mml:mi>&#x3bc;</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>a</mml:mi>
</mml:msubsup>
</mml:mrow>
</mml:math>
</inline-formula> and <inline-formula id="inf34">
<mml:math id="m50">
<mml:mrow>
<mml:msubsup>
<mml:mi>p</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>a</mml:mi>
</mml:msubsup>
</mml:mrow>
</mml:math>
</inline-formula> are the mean and prior of view A of class <italic>i</italic>, respectively; <inline-formula id="inf35">
<mml:math id="m51">
<mml:mrow>
<mml:msup>
<mml:mi>n</mml:mi>
<mml:mi>a</mml:mi>
</mml:msup>
</mml:mrow>
</mml:math>
</inline-formula> represents the number of samples in view A. Similarly, <inline-formula id="inf36">
<mml:math id="m52">
<mml:mrow>
<mml:msubsup>
<mml:mi>&#x3bc;</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>b</mml:mi>
</mml:msubsup>
</mml:mrow>
</mml:math>
</inline-formula> and <inline-formula id="inf37">
<mml:math id="m53">
<mml:mrow>
<mml:msubsup>
<mml:mi>p</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>b</mml:mi>
</mml:msubsup>
</mml:mrow>
</mml:math>
</inline-formula> represent the mean and prior of view B of class <italic>i</italic>, respectively; <inline-formula id="inf38">
<mml:math id="m54">
<mml:mrow>
<mml:msup>
<mml:mi>n</mml:mi>
<mml:mi>b</mml:mi>
</mml:msup>
</mml:mrow>
</mml:math>
</inline-formula> represents the number of samples in view B. <inline-formula id="inf39">
<mml:math id="m55">
<mml:mrow>
<mml:msubsup>
<mml:mi>n</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>a</mml:mi>
</mml:msubsup>
</mml:mrow>
</mml:math>
</inline-formula> and <inline-formula id="inf40">
<mml:math id="m56">
<mml:mrow>
<mml:msubsup>
<mml:mi>n</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>b</mml:mi>
</mml:msubsup>
</mml:mrow>
</mml:math>
</inline-formula> represent the number of samples in views A and B of the <inline-formula id="inf41">
<mml:math id="m57">
<mml:mrow>
<mml:msup>
<mml:mi>i</mml:mi>
<mml:mrow>
<mml:mi>t</mml:mi>
<mml:mi>h</mml:mi>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:math>
</inline-formula> class, respectively, and <italic>c</italic> represents the total number of classes. Note that, unlike CCA, the number of samples does not have to be equal in both views. The within-class scatter matrix <inline-formula id="inf42">
<mml:math id="m58">
<mml:mrow>
<mml:msub>
<mml:mi>S</mml:mi>
<mml:mrow>
<mml:mi>H</mml:mi>
<mml:mi>W</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> is proposed as<disp-formula id="e2">
<mml:math id="m59">
<mml:mrow>
<mml:msub>
<mml:mi>S</mml:mi>
<mml:mrow>
<mml:mi>H</mml:mi>
<mml:mi>W</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x3d;</mml:mo>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:munder>
<mml:mstyle displaystyle="true">
<mml:mo>&#x2211;</mml:mo>
</mml:mstyle>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:munder>
</mml:mrow>
<mml:mi>c</mml:mi>
</mml:mover>
</mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mfrac>
<mml:mn>1</mml:mn>
<mml:mrow>
<mml:msubsup>
<mml:mi>n</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>a</mml:mi>
</mml:msubsup>
</mml:mrow>
</mml:mfrac>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:munder>
<mml:mstyle displaystyle="true">
<mml:mo>&#x2211;</mml:mo>
</mml:mstyle>
<mml:mrow>
<mml:mi>j</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:munder>
</mml:mrow>
<mml:mrow>
<mml:msubsup>
<mml:mi>n</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>a</mml:mi>
</mml:msubsup>
</mml:mrow>
</mml:mover>
</mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msubsup>
<mml:mi>x</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>j</mml:mi>
</mml:mrow>
<mml:mi>a</mml:mi>
</mml:msubsup>
<mml:mo>&#x2212;</mml:mo>
<mml:msubsup>
<mml:mi>&#x3bc;</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>b</mml:mi>
</mml:msubsup>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msubsup>
<mml:mi>x</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>j</mml:mi>
</mml:mrow>
<mml:mi>a</mml:mi>
</mml:msubsup>
<mml:mo>&#x2212;</mml:mo>
<mml:msubsup>
<mml:mi>&#x3bc;</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>b</mml:mi>
</mml:msubsup>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mi>T</mml:mi>
</mml:msup>
<mml:mo>&#x2b;</mml:mo>
<mml:mfrac>
<mml:mn>1</mml:mn>
<mml:mrow>
<mml:msubsup>
<mml:mi>n</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>b</mml:mi>
</mml:msubsup>
</mml:mrow>
</mml:mfrac>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:munder>
<mml:mstyle displaystyle="true">
<mml:mo>&#x2211;</mml:mo>
</mml:mstyle>
<mml:mrow>
<mml:mi>j</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:munder>
</mml:mrow>
<mml:mrow>
<mml:msubsup>
<mml:mi>n</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>b</mml:mi>
</mml:msubsup>
</mml:mrow>
</mml:mover>
</mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msubsup>
<mml:mi>x</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>j</mml:mi>
</mml:mrow>
<mml:mi>b</mml:mi>
</mml:msubsup>
<mml:mo>&#x2212;</mml:mo>
<mml:msubsup>
<mml:mi>&#x3bc;</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>a</mml:mi>
</mml:msubsup>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msubsup>
<mml:mi>x</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>j</mml:mi>
</mml:mrow>
<mml:mi>b</mml:mi>
</mml:msubsup>
<mml:mo>&#x2212;</mml:mo>
<mml:msubsup>
<mml:mi>&#x3bc;</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>a</mml:mi>
</mml:msubsup>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mi>T</mml:mi>
</mml:msup>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
<label>(2)</label>
</disp-formula>
</p>
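As a concrete illustration, Eqs. 1 and 2 can be sketched in NumPy as follows. This is a minimal sketch, assuming both views share dimensionality <italic>d</italic>; the function and variable names are illustrative and not taken from the authors' implementation.

```python
import numpy as np

def scatter_matrices(Xa, Xb, ya, yb):
    """Sketch of the heterogeneous scatter matrices in Eqs. 1-2.

    Xa, Xb : (n_a, d) and (n_b, d) samples from views A and B (same d).
    ya, yb : class labels per view; per-view sample counts may differ.
    """
    classes = np.unique(np.concatenate([ya, yb]))
    d = Xa.shape[1]
    n_total = len(ya) + len(yb)
    # Per-class view means and priors: mu_i^k and p_i^k = n_i^k / (n^a + n^b).
    mu_a = {i: Xa[ya == i].mean(axis=0) for i in classes}
    mu_b = {i: Xb[yb == i].mean(axis=0) for i in classes}
    p_a = {i: (ya == i).sum() / n_total for i in classes}
    p_b = {i: (yb == i).sum() / n_total for i in classes}

    # Eq. 1: between-class scatter over inter-view mean differences.
    S_HB = np.zeros((d, d))
    for i in classes:
        for l in classes:
            if l == i:
                continue
            diff = (mu_a[i] - mu_b[l])[:, None]
            S_HB += p_a[i] * p_b[l] * (diff @ diff.T)

    # Eq. 2: within-class scatter, each sample vs. the other view's class mean.
    S_HW = np.zeros((d, d))
    for i in classes:
        Da = Xa[ya == i] - mu_b[i]          # x_{i,j}^a - mu_i^b
        Db = Xb[yb == i] - mu_a[i]          # x_{i,j}^b - mu_i^a
        S_HW += Da.T @ Da / len(Da) + Db.T @ Db / len(Db)
    return S_HB, S_HW
```

Note that, as in the text, nothing here requires the two views to contain the same number of samples per class.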
<p>Since the proposed technique encodes data heterogeneity in the objective function and utilizes the definitions of between- and within-class scatter matrices, it is termed heterogeneous discriminant analysis (HDA). Following the Fisher criterion, the objective function of HDA is proposed as<disp-formula id="e3">
<mml:math id="m60">
<mml:mrow>
<mml:mi>w</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mi>arg</mml:mi>
<mml:mtext>&#x0020;</mml:mtext>
<mml:munder>
<mml:mrow>
<mml:mi>max</mml:mi>
</mml:mrow>
<mml:mi>w</mml:mi>
</mml:munder>
<mml:mtext>&#x2009;</mml:mtext>
<mml:mi>J</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mi>w</mml:mi>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mo>&#x3d;</mml:mo>
<mml:mi>arg</mml:mi>
<mml:mtext>&#x0020;</mml:mtext>
<mml:munder>
<mml:mrow>
<mml:mi>max</mml:mi>
</mml:mrow>
<mml:mi>w</mml:mi>
</mml:munder>
<mml:mfrac>
<mml:mrow>
<mml:mo>&#x7c;</mml:mo>
<mml:msup>
<mml:mi>w</mml:mi>
<mml:mi>T</mml:mi>
</mml:msup>
<mml:msub>
<mml:mi>S</mml:mi>
<mml:mrow>
<mml:mi>H</mml:mi>
<mml:mi>B</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mi>w</mml:mi>
<mml:mo>&#x7c;</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mo>&#x7c;</mml:mo>
<mml:msup>
<mml:mi>w</mml:mi>
<mml:mi>T</mml:mi>
</mml:msup>
<mml:msub>
<mml:mi>S</mml:mi>
<mml:mrow>
<mml:mi>H</mml:mi>
<mml:mi>W</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mi>w</mml:mi>
<mml:mo>&#x7c;</mml:mo>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:math>
<label>(3)</label>
</disp-formula>
</p>
<p>The optimization problem in <xref ref-type="disp-formula" rid="e3">Eq. 3</xref> is modeled as a generalized eigenvalue decomposition problem, which results in a closed-form solution in which <italic>w</italic> is the set of top eigenvectors of <inline-formula id="inf43">
<mml:math id="m61">
<mml:mrow>
<mml:msubsup>
<mml:mi>S</mml:mi>
<mml:mrow>
<mml:mi>H</mml:mi>
<mml:mi>W</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msubsup>
<mml:msub>
<mml:mi>S</mml:mi>
<mml:mrow>
<mml:mi>H</mml:mi>
<mml:mi>B</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula>. The geometric interpretation of HDA in <xref ref-type="fig" rid="F2">Figure&#x20;2</xref> shows that the objective function in <xref ref-type="disp-formula" rid="e3">Eq. 3</xref> tries to achieve the following in the projected space: 1) Bring samples <inline-formula id="inf44">
<mml:math id="m62">
<mml:mrow>
<mml:msubsup>
<mml:mi>&#x3c7;</mml:mi>
<mml:mn>1</mml:mn>
<mml:mi>a</mml:mi>
</mml:msubsup>
</mml:mrow>
</mml:math>
</inline-formula> closer to mean <inline-formula id="inf45">
<mml:math id="m63">
<mml:mrow>
<mml:msubsup>
<mml:mi>&#x3bc;</mml:mi>
<mml:mn>1</mml:mn>
<mml:mi>b</mml:mi>
</mml:msubsup>
</mml:mrow>
</mml:math>
</inline-formula> of <inline-formula id="inf46">
<mml:math id="m64">
<mml:mrow>
<mml:msubsup>
<mml:mi>&#x3c7;</mml:mi>
<mml:mn>1</mml:mn>
<mml:mi>b</mml:mi>
</mml:msubsup>
</mml:mrow>
</mml:math>
</inline-formula> and vice versa; and similarly for class 2. This reduces the inter-view distance within each class, for example, the projections of visible and NIR images of the same person become similar. 2) Increase the distance between mean <inline-formula id="inf47">
<mml:math id="m65">
<mml:mrow>
<mml:msubsup>
<mml:mi>&#x3bc;</mml:mi>
<mml:mn>1</mml:mn>
<mml:mi>a</mml:mi>
</mml:msubsup>
</mml:mrow>
</mml:math>
</inline-formula> of <inline-formula id="inf48">
<mml:math id="m66">
<mml:mrow>
<mml:msubsup>
<mml:mi>&#x3c7;</mml:mi>
<mml:mn>1</mml:mn>
<mml:mi>a</mml:mi>
</mml:msubsup>
</mml:mrow>
</mml:math>
</inline-formula> and mean <inline-formula id="inf49">
<mml:math id="m67">
<mml:mrow>
<mml:msubsup>
<mml:mi>&#x3bc;</mml:mi>
<mml:mn>2</mml:mn>
<mml:mi>b</mml:mi>
</mml:msubsup>
</mml:mrow>
</mml:math>
</inline-formula> of <inline-formula id="inf50">
<mml:math id="m68">
<mml:mrow>
<mml:msubsup>
<mml:mi>&#x3c7;</mml:mi>
<mml:mn>2</mml:mn>
<mml:mi>b</mml:mi>
</mml:msubsup>
</mml:mrow>
</mml:math>
</inline-formula>; and similarly increase the distance between mean of <inline-formula id="inf51">
<mml:math id="m69">
<mml:mrow>
<mml:msubsup>
<mml:mi>&#x3c7;</mml:mi>
<mml:mn>1</mml:mn>
<mml:mi>b</mml:mi>
</mml:msubsup>
</mml:mrow>
</mml:math>
</inline-formula> and mean of <inline-formula id="inf52">
<mml:math id="m70">
<mml:mrow>
<mml:msubsup>
<mml:mi>&#x3c7;</mml:mi>
<mml:mn>2</mml:mn>
<mml:mi>a</mml:mi>
</mml:msubsup>
</mml:mrow>
</mml:math>
</inline-formula>, that is, the projections of the mean visible face image of a subject become different from those of the mean NIR face image of another subject. The proposed way of encoding inter- (<xref ref-type="disp-formula" rid="e1">Eq. 1</xref>) and intra-class (<xref ref-type="disp-formula" rid="e2">Eq. 2</xref>) variations in the heterogeneous scenario requires both views to have the same dimensionality. In face recognition, this is usually not a restrictive constraint, as in practice the same kind of features, with the same dimensionality, are extracted from both views (<xref ref-type="bibr" rid="B11">Dhamecha et&#x20;al., 2014</xref>).</p>
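The closed-form solution of Eq. 3 can be sketched as follows, assuming a nonsingular within-class scatter; the function name is illustrative.

```python
import numpy as np

def hda_projection(S_HB, S_HW, n_components=1):
    # Closed-form solution of Eq. 3: w is the set of top eigenvectors of
    # S_HW^{-1} S_HB, i.e., of the generalized eigenproblem
    # S_HB w = v S_HW w (assumes S_HW is invertible).
    vals, vecs = np.linalg.eig(np.linalg.solve(S_HW, S_HB))
    order = np.argsort(vals.real)[::-1]      # largest eigenvalues first
    return vecs[:, order[:n_components]].real
```

With diagonal scatters, for example, the returned direction is the axis on which the between-class-to-within-class ratio is largest.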
<p>In some applications, including face recognition, the number of training samples is often limited. If the number of training samples is smaller than the feature dimensionality, problems such as a singular within-class scatter matrix arise. In the literature, this is known as the small sample size problem, and shrinkage regularization is generally used to address it (<xref ref-type="bibr" rid="B14">Friedman, 1989</xref>). Utilizing shrinkage regularization, <xref ref-type="disp-formula" rid="e3">Eq. 3</xref> is updated as<disp-formula id="e4">
<mml:math id="m71">
<mml:mrow>
<mml:mi>J</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mi>w</mml:mi>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mo>&#x3d;</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mo>&#x7c;</mml:mo>
<mml:msup>
<mml:mi>w</mml:mi>
<mml:mi>T</mml:mi>
</mml:msup>
<mml:msub>
<mml:mi>S</mml:mi>
<mml:mrow>
<mml:mi>H</mml:mi>
<mml:mi>B</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mi>w</mml:mi>
<mml:mo>&#x7c;</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mo>&#x7c;</mml:mo>
<mml:msup>
<mml:mi>w</mml:mi>
<mml:mi>T</mml:mi>
</mml:msup>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mn>1</mml:mn>
<mml:mo>&#x2212;</mml:mo>
<mml:mi>&#x3bb;</mml:mi>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:msub>
<mml:mi>S</mml:mi>
<mml:mrow>
<mml:mi>H</mml:mi>
<mml:mi>W</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2b;</mml:mo>
<mml:mi>&#x3bb;</mml:mi>
<mml:mi>I</mml:mi>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mi>w</mml:mi>
<mml:mo>&#x7c;</mml:mo>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:math>
<label>(4)</label>
</disp-formula>
</p>
<p>Here, <italic>I</italic> represents the identity matrix and &#x3bb; is the regularization parameter. Note that <inline-formula id="inf53">
<mml:math id="m72">
<mml:mrow>
<mml:mi>&#x3bb;</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>0</mml:mn>
</mml:mrow>
</mml:math>
</inline-formula> results in no regularization, whereas <inline-formula id="inf54">
<mml:math id="m73">
<mml:mrow>
<mml:mi>&#x3bb;</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:math>
</inline-formula> results in not utilizing the within-class scatter matrix&#x20;<inline-formula id="inf55">
<mml:math id="m74">
<mml:mrow>
<mml:msub>
<mml:mi>S</mml:mi>
<mml:mrow>
<mml:mi>H</mml:mi>
<mml:mi>W</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula>.</p>
<p>To visualize the functioning of the proposed HDA as opposed to LDA, the distributions of the projections obtained using LDA and HDA are shown in <xref ref-type="fig" rid="F2">Figure&#x20;2</xref>. <xref ref-type="table" rid="T1">Table&#x20;1</xref> presents a quantitative analysis in terms of the overlap between projections of views of both classes. The overlap between two histograms is calculated as <inline-formula id="inf56">
<mml:math id="m75">
<mml:mrow>
<mml:msub>
<mml:mstyle displaystyle="true">
<mml:mo>&#x2211;</mml:mo>
</mml:mstyle>
<mml:mi>m</mml:mi>
</mml:msub>
<mml:mtext>min</mml:mtext>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>h</mml:mi>
<mml:mn>1</mml:mn>
</mml:msub>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mi>m</mml:mi>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mi>h</mml:mi>
<mml:mn>2</mml:mn>
</mml:msub>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mi>m</mml:mi>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
</inline-formula>, where <inline-formula id="inf57">
<mml:math id="m76">
<mml:mrow>
<mml:msub>
<mml:mi>h</mml:mi>
<mml:mn>1</mml:mn>
</mml:msub>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mi>m</mml:mi>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
</inline-formula> and <inline-formula id="inf58">
<mml:math id="m77">
<mml:mrow>
<mml:msub>
<mml:mi>h</mml:mi>
<mml:mn>2</mml:mn>
</mml:msub>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mi>m</mml:mi>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
</inline-formula> are the values of the <inline-formula id="inf59">
<mml:math id="m78">
<mml:mrow>
<mml:msup>
<mml:mi>m</mml:mi>
<mml:mrow>
<mml:mi>t</mml:mi>
<mml:mi>h</mml:mi>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:math>
</inline-formula> bin of the first and second histograms, respectively. In the ideal case, the projections of different views of the same class should completely overlap (i.e., area of overlap 0.5) and the projections of the views of different classes should be nonoverlapping (i.e., area of overlap 0). Since LDA does not take the view information into account, the overlap between the projections of the two classes is large. Further, it is interesting to note that LDA yields a significant overlap of 0.351 between view A of class 1 and view B of class 2. Such overlap can deteriorate the heterogeneous matching performance. In the heterogeneous analysis (last two rows of <xref ref-type="table" rid="T1">Table&#x20;1</xref>), the overlap between the projections of two views of the same class is relatively low. Note that view A and view B of class 1 result in two individual peaks. This also increases the intra-class variation, that is, the projection distributions of both classes are spread rather than peaked. HDA yields better projection directions, with less than 50% of the inter-class overlap of LDA. For the homogeneous matching scenarios (fourth and fifth rows), HDA has marginally poorer overlap than LDA. However, for the heterogeneous scenarios, the overlap of HDA is significantly lower for the non-match pair view A class 1&#x2013;view B class 2 (seventh row) and higher for the match pairs (last two rows). For view A class 2&#x2013;view B class 1 (eighth row), the numbers are slightly poorer for HDA; however, the difference is small enough to be neglected in the context of the overlap metrics of the other three pairs.</p>
<table-wrap id="T1" position="float">
<label>TABLE 1</label>
<caption>
<p>Analyzing the overlap of projection distributions in <xref ref-type="fig" rid="F2">Figure&#x20;2</xref>. The LDA vs. HDA comparison indicates that ignoring intra-view differences could be beneficial for heterogeneous matching.</p>
</caption>
<table>
<thead valign="top">
<tr>
<th rowspan="2" align="left">Pair</th>
<th colspan="3" align="center">Overlap</th>
</tr>
<tr>
<th align="left">Ideal</th>
<th align="left">LDA</th>
<th align="left">HDA</th>
</tr>
</thead>
<tbody valign="top">
<tr>
<td colspan="4" align="left">
<bold>Overall</bold>
</td>
</tr>
<tr>
<td align="left">Class 1&#x2013;class 2</td>
<td align="left">0.000</td>
<td align="left">0.356</td>
<td align="left">0.159</td>
</tr>
<tr>
<td colspan="4" align="left">
<bold>Homogeneous</bold>
</td>
</tr>
<tr>
<td align="left">View A class 1&#x2013;view A class 2</td>
<td align="left">0.000</td>
<td align="left">0.110</td>
<td align="left">0.135</td>
</tr>
<tr>
<td align="left">View B class 1&#x2013;view B class 2</td>
<td align="left">0.000</td>
<td align="left">0.005</td>
<td align="left">0.013</td>
</tr>
<tr>
<td colspan="4" align="left">
<bold>Heterogeneous</bold>
</td>
</tr>
<tr>
<td align="left">View A class 1&#x2013;view B class 2</td>
<td align="left">0.000</td>
<td align="left">0.351</td>
<td align="left">0.076</td>
</tr>
<tr>
<td align="left">View A class 2&#x2013;view B class 1</td>
<td align="left">0.000</td>
<td align="left">0.000</td>
<td align="left">0.034</td>
</tr>
<tr>
<td align="left">View A class 1&#x2013;view B class 1</td>
<td align="left">0.500</td>
<td align="left">0.025</td>
<td align="left">0.261</td>
</tr>
<tr>
<td align="left">View A class 2&#x2013;view B class 2</td>
<td align="left">0.500</td>
<td align="left">0.174</td>
<td align="left">0.429</td>
</tr>
</tbody>
</table>
</table-wrap>
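The overlap metric used in Table 1 can be sketched as below. This is a minimal NumPy sketch; the joint normalization (dividing the bin counts by the combined sample count) is an assumption chosen so that two identical, equally sized distributions yield the ideal overlap of 0.5 quoted in the text.

```python
import numpy as np

def histogram_overlap(x1, x2, bins=50, range_=None):
    # Overlap of two projection distributions: sum_m min(h1(m), h2(m))
    # computed over a shared set of histogram bins.
    if range_ is None:
        range_ = (min(x1.min(), x2.min()), max(x1.max(), x2.max()))
    h1, _ = np.histogram(x1, bins=bins, range=range_)
    h2, _ = np.histogram(x2, bins=bins, range=range_)
    n = len(x1) + len(x2)
    # Joint normalization: identical equal-sized samples give overlap 0.5,
    # fully separated samples give overlap 0.0.
    return np.minimum(h1 / n, h2 / n).sum()
```

Under this convention, a match pair of views ideally scores 0.5 and a non-match pair scores 0, matching the "Ideal" column of Table 1.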
<p>The time complexity of computing <inline-formula id="inf60">
<mml:math id="m79">
<mml:mrow>
<mml:msub>
<mml:mi>S</mml:mi>
<mml:mrow>
<mml:mi>H</mml:mi>
<mml:mi>B</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> and <inline-formula id="inf61">
<mml:math id="m80">
<mml:mrow>
<mml:msub>
<mml:mi>S</mml:mi>
<mml:mrow>
<mml:mi>H</mml:mi>
<mml:mi>W</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> is <inline-formula id="inf62">
<mml:math id="m81">
<mml:mrow>
<mml:mi>O</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msup>
<mml:mi>c</mml:mi>
<mml:mn>2</mml:mn>
</mml:msup>
<mml:msup>
<mml:mi>d</mml:mi>
<mml:mn>2</mml:mn>
</mml:msup>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
</inline-formula> and <inline-formula id="inf63">
<mml:math id="m82">
<mml:mrow>
<mml:mi>O</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi>n</mml:mi>
<mml:msup>
<mml:mi>d</mml:mi>
<mml:mn>2</mml:mn>
</mml:msup>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
</inline-formula>, respectively. The generalized eigenvalue decomposition in <xref ref-type="disp-formula" rid="e3">Eq. 3</xref> has time complexity of <inline-formula id="inf64">
<mml:math id="m83">
<mml:mrow>
<mml:mi>O</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msup>
<mml:mi>d</mml:mi>
<mml:mn>3</mml:mn>
</mml:msup>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
</inline-formula>, where <italic>n</italic>, <italic>d</italic>, and <italic>c</italic> are the number of training samples, feature dimensionality, and number of classes, respectively.</p>
</sec>
<sec id="s2-2">
<title>Nonlinear Kernel Extension</title>
<p>We further analyze the objective function in <xref ref-type="disp-formula" rid="e3">Eq. 3</xref> to adapt it for nonlinear transformation <inline-formula id="inf65">
<mml:math id="m84">
<mml:mrow>
<mml:mi>x</mml:mi>
<mml:mo>&#x2192;</mml:mo>
<mml:mi>&#x3d5;</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mi>x</mml:mi>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
</inline-formula>. Using the representer theorem (<xref ref-type="bibr" rid="B53">Sch&#xf6;lkopf et&#x20;al., 2001</xref>), the projection direction <italic>w</italic> can be written as a linear combination of the transformed samples, that is, <inline-formula id="inf66">
<mml:math id="m85">
<mml:mrow>
<mml:mi>w</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mstyle displaystyle="true">
<mml:msubsup>
<mml:mo>&#x2211;</mml:mo>
<mml:mrow>
<mml:mi>p</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:msup>
<mml:mi>n</mml:mi>
<mml:mi>a</mml:mi>
</mml:msup>
</mml:mrow>
</mml:msubsup>
<mml:mrow>
<mml:msub>
<mml:mi>&#x3b1;</mml:mi>
<mml:mi>p</mml:mi>
</mml:msub>
<mml:mi>&#x3d5;</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msubsup>
<mml:mi>x</mml:mi>
<mml:mi>p</mml:mi>
<mml:mi>a</mml:mi>
</mml:msubsup>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mstyle>
<mml:mo>&#x2b;</mml:mo>
<mml:mstyle displaystyle="true">
<mml:msubsup>
<mml:mo>&#x2211;</mml:mo>
<mml:mrow>
<mml:mi>q</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:msup>
<mml:mi>n</mml:mi>
<mml:mi>b</mml:mi>
</mml:msup>
</mml:mrow>
</mml:msubsup>
<mml:mrow>
<mml:msub>
<mml:mi>&#x3b2;</mml:mi>
<mml:mi>q</mml:mi>
</mml:msub>
<mml:mi>&#x3d5;</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msubsup>
<mml:mi>x</mml:mi>
<mml:mi>q</mml:mi>
<mml:mi>b</mml:mi>
</mml:msubsup>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mstyle>
</mml:mrow>
</mml:math>
</inline-formula>. Using this property, <xref ref-type="disp-formula" rid="e4">Eq. 4</xref> can be rewritten as<xref ref-type="fn" rid="fn3">
<sup>2</sup>
</xref>
<disp-formula id="e5">
<mml:math id="m86">
<mml:mrow>
<mml:mi>J</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi>&#x3b1;</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>&#x3b2;</mml:mi>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mo>&#x3d;</mml:mo>
<mml:mrow>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mfrac>
<mml:mrow>
<mml:mo>&#x7c;</mml:mo>
<mml:mrow>
<mml:mo>[</mml:mo>
<mml:msup>
<mml:mi>&#x3b1;</mml:mi>
<mml:mi>T</mml:mi>
</mml:msup>
<mml:msup>
<mml:mi>&#x3b2;</mml:mi>
<mml:mi>T</mml:mi>
</mml:msup>
<mml:mo>]</mml:mo>
</mml:mrow>
<mml:msub>
<mml:mi>M</mml:mi>
<mml:mtext>&#x2a;</mml:mtext>
</mml:msub>
<mml:mrow>
<mml:mo>[</mml:mo>
<mml:mtable>
<mml:mtr>
<mml:mtd>
<mml:mi>&#x3b1;</mml:mi>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:mi>&#x3b2;</mml:mi>
</mml:mtd>
</mml:mtr>
</mml:mtable>
<mml:mo>]</mml:mo>
</mml:mrow>
<mml:mo>&#x7c;</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mo>&#x7c;</mml:mo>
<mml:mrow>
<mml:mo>[</mml:mo>
<mml:msup>
<mml:mi>&#x3b1;</mml:mi>
<mml:mi>T</mml:mi>
</mml:msup>
<mml:msup>
<mml:mi>&#x3b2;</mml:mi>
<mml:mi>T</mml:mi>
</mml:msup>
<mml:mo>]</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mo>[</mml:mo>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mn>1</mml:mn>
<mml:mo>&#x2212;</mml:mo>
<mml:mi>&#x3bb;</mml:mi>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:msub>
<mml:mi>N</mml:mi>
<mml:mtext>&#x2a;</mml:mtext>
</mml:msub>
<mml:mo>&#x2b;</mml:mo>
<mml:mi>&#x3bb;</mml:mi>
<mml:mi>I</mml:mi>
<mml:mo>]</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mo>[</mml:mo>
<mml:mtable>
<mml:mtr>
<mml:mtd>
<mml:mi>&#x3b1;</mml:mi>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:mi>&#x3b2;</mml:mi>
</mml:mtd>
</mml:mtr>
</mml:mtable>
<mml:mo>]</mml:mo>
</mml:mrow>
<mml:mo>&#x7c;</mml:mo>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mrow>
</mml:math>
<label>(5)</label>
</disp-formula>where <inline-formula id="inf67">
<mml:math id="m87">
<mml:mrow>
<mml:msub>
<mml:mi>M</mml:mi>
<mml:mtext>&#x2a;</mml:mtext>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> and <inline-formula id="inf68">
<mml:math id="m88">
<mml:mrow>
<mml:msub>
<mml:mi>N</mml:mi>
<mml:mtext>&#x2a;</mml:mtext>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> are analogous to <inline-formula id="inf69">
<mml:math id="m89">
<mml:mrow>
<mml:msub>
<mml:mi>S</mml:mi>
<mml:mrow>
<mml:mi>H</mml:mi>
<mml:mi>B</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> and <inline-formula id="inf70">
<mml:math id="m90">
<mml:mrow>
<mml:msub>
<mml:mi>S</mml:mi>
<mml:mrow>
<mml:mi>H</mml:mi>
<mml:mi>W</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula>, respectively, and are defined as<disp-formula id="equ1">
<mml:math id="m91">
<mml:mrow>
<mml:msub>
<mml:mi>M</mml:mi>
<mml:mo>&#x2217;</mml:mo>
</mml:msub>
<mml:mo>&#x3d;</mml:mo>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:munder>
<mml:mstyle displaystyle="true">
<mml:mo>&#x2211;</mml:mo>
</mml:mstyle>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:munder>
</mml:mrow>
<mml:mi>c</mml:mi>
</mml:mover>
</mml:mrow>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:munder>
<mml:mstyle displaystyle="true">
<mml:mo>&#x2211;</mml:mo>
</mml:mstyle>
<mml:mrow>
<mml:mi>l</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>,</mml:mo>
<mml:mi>l</mml:mi>
<mml:mo>&#x2260;</mml:mo>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:munder>
</mml:mrow>
<mml:mi>c</mml:mi>
</mml:mover>
</mml:mrow>
<mml:msubsup>
<mml:mi>p</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>a</mml:mi>
</mml:msubsup>
<mml:msubsup>
<mml:mi>p</mml:mi>
<mml:mi>l</mml:mi>
<mml:mi>b</mml:mi>
</mml:msubsup>
<mml:mrow>
<mml:mo>[</mml:mo>
<mml:mtable>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:mi mathvariant="normal">&#x2133;</mml:mi>
<mml:msubsup>
<mml:mi mathvariant="script">A</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>a</mml:mi>
</mml:msubsup>
<mml:mo>&#x2212;</mml:mo>
<mml:mi mathvariant="normal">&#x2133;</mml:mi>
<mml:msubsup>
<mml:mi mathvariant="script">A</mml:mi>
<mml:mi>l</mml:mi>
<mml:mi>b</mml:mi>
</mml:msubsup>
</mml:mrow>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:mi mathvariant="normal">&#x2133;</mml:mi>
<mml:msubsup>
<mml:mi mathvariant="normal">&#x212c;</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>a</mml:mi>
</mml:msubsup>
<mml:mo>&#x2212;</mml:mo>
<mml:mi mathvariant="normal">&#x2133;</mml:mi>
<mml:msubsup>
<mml:mi mathvariant="normal">&#x212c;</mml:mi>
<mml:mi>l</mml:mi>
<mml:mi>b</mml:mi>
</mml:msubsup>
</mml:mrow>
</mml:mtd>
</mml:mtr>
</mml:mtable>
<mml:mo>]</mml:mo>
</mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mo>[</mml:mo>
<mml:mtable>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:mi mathvariant="normal">&#x2133;</mml:mi>
<mml:msubsup>
<mml:mi mathvariant="script">A</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>a</mml:mi>
</mml:msubsup>
<mml:mo>&#x2212;</mml:mo>
<mml:mi mathvariant="normal">&#x2133;</mml:mi>
<mml:msubsup>
<mml:mi mathvariant="script">A</mml:mi>
<mml:mi>l</mml:mi>
<mml:mi>b</mml:mi>
</mml:msubsup>
</mml:mrow>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:mi mathvariant="normal">&#x2133;</mml:mi>
<mml:msubsup>
<mml:mi mathvariant="normal">&#x212c;</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>a</mml:mi>
</mml:msubsup>
<mml:mo>&#x2212;</mml:mo>
<mml:mi mathvariant="normal">&#x2133;</mml:mi>
<mml:msubsup>
<mml:mi mathvariant="normal">&#x212c;</mml:mi>
<mml:mi>l</mml:mi>
<mml:mi>b</mml:mi>
</mml:msubsup>
</mml:mrow>
</mml:mtd>
</mml:mtr>
</mml:mtable>
<mml:mo>]</mml:mo>
</mml:mrow>
<mml:mi>T</mml:mi>
</mml:msup>
</mml:mrow>
</mml:math>
</disp-formula>
<disp-formula id="equ2">
<mml:math id="m92">
<mml:mrow>
<mml:msub>
<mml:mi>N</mml:mi>
<mml:mo>&#x2217;</mml:mo>
</mml:msub>
<mml:mo>&#x3d;</mml:mo>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:munder>
<mml:mstyle displaystyle="true">
<mml:mo>&#x2211;</mml:mo>
</mml:mstyle>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:munder>
</mml:mrow>
<mml:mi>c</mml:mi>
</mml:mover>
</mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mfrac>
<mml:mn>1</mml:mn>
<mml:mrow>
<mml:msubsup>
<mml:mi>n</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>a</mml:mi>
</mml:msubsup>
</mml:mrow>
</mml:mfrac>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:munder>
<mml:mstyle displaystyle="true">
<mml:mo>&#x2211;</mml:mo>
</mml:mstyle>
<mml:mrow>
<mml:mi>j</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:munder>
</mml:mrow>
<mml:mrow>
<mml:msubsup>
<mml:mi>n</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>a</mml:mi>
</mml:msubsup>
</mml:mrow>
</mml:mover>
</mml:mrow>
<mml:mrow>
<mml:mo>[</mml:mo>
<mml:mrow>
<mml:mtable>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:mi mathvariant="script">M</mml:mi>
<mml:msubsup>
<mml:mi mathvariant="script">A</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>j</mml:mi>
</mml:mrow>
<mml:mi>a</mml:mi>
</mml:msubsup>
<mml:mo>&#x2212;</mml:mo>
<mml:mi mathvariant="normal">&#x2133;</mml:mi>
<mml:msubsup>
<mml:mi mathvariant="script">A</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>b</mml:mi>
</mml:msubsup>
</mml:mrow>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:mi mathvariant="script">M</mml:mi>
<mml:msubsup>
<mml:mi mathvariant="normal">&#x212c;</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>j</mml:mi>
</mml:mrow>
<mml:mi>a</mml:mi>
</mml:msubsup>
<mml:mo>&#x2212;</mml:mo>
<mml:mi mathvariant="normal">&#x2133;</mml:mi>
<mml:msubsup>
<mml:mi mathvariant="normal">&#x212c;</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>b</mml:mi>
</mml:msubsup>
</mml:mrow>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:mrow>
<mml:mo>]</mml:mo>
</mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mrow>
<mml:mo>[</mml:mo>
<mml:mrow>
<mml:mtable>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:mi mathvariant="script">M</mml:mi>
<mml:msubsup>
<mml:mi mathvariant="script">A</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>j</mml:mi>
</mml:mrow>
<mml:mi>a</mml:mi>
</mml:msubsup>
<mml:mo>&#x2212;</mml:mo>
<mml:mi mathvariant="normal">&#x2133;</mml:mi>
<mml:msubsup>
<mml:mi mathvariant="script">A</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>b</mml:mi>
</mml:msubsup>
</mml:mrow>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:mi mathvariant="script">M</mml:mi>
<mml:msubsup>
<mml:mi mathvariant="normal">&#x212c;</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>j</mml:mi>
</mml:mrow>
<mml:mi>a</mml:mi>
</mml:msubsup>
<mml:mo>&#x2212;</mml:mo>
<mml:mi mathvariant="normal">&#x2133;</mml:mi>
<mml:msubsup>
<mml:mi mathvariant="normal">&#x212c;</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>b</mml:mi>
</mml:msubsup>
</mml:mrow>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:mrow>
<mml:mo>]</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mi>T</mml:mi>
</mml:msup>
<mml:mo>&#x2b;</mml:mo>
<mml:mfrac>
<mml:mn>1</mml:mn>
<mml:mrow>
<mml:msubsup>
<mml:mi>n</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>b</mml:mi>
</mml:msubsup>
</mml:mrow>
</mml:mfrac>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:munder>
<mml:mstyle displaystyle="true">
<mml:mo>&#x2211;</mml:mo>
</mml:mstyle>
<mml:mrow>
<mml:mi>j</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:munder>
</mml:mrow>
<mml:mrow>
<mml:msubsup>
<mml:mi>n</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>b</mml:mi>
</mml:msubsup>
</mml:mrow>
</mml:mover>
</mml:mrow>
<mml:mrow>
<mml:mo>[</mml:mo>
<mml:mrow>
<mml:mtable>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:mi mathvariant="script">M</mml:mi>
<mml:msubsup>
<mml:mi mathvariant="script">A</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>j</mml:mi>
</mml:mrow>
<mml:mi>b</mml:mi>
</mml:msubsup>
<mml:mo>&#x2212;</mml:mo>
<mml:mi mathvariant="normal">&#x2133;</mml:mi>
<mml:msubsup>
<mml:mi mathvariant="script">A</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>a</mml:mi>
</mml:msubsup>
</mml:mrow>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:mi mathvariant="script">M</mml:mi>
<mml:msubsup>
<mml:mi mathvariant="normal">&#x212c;</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>j</mml:mi>
</mml:mrow>
<mml:mi>b</mml:mi>
</mml:msubsup>
<mml:mo>&#x2212;</mml:mo>
<mml:mi mathvariant="normal">&#x2133;</mml:mi>
<mml:msubsup>
<mml:mi mathvariant="normal">&#x212c;</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>a</mml:mi>
</mml:msubsup>
</mml:mrow>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:mrow>
<mml:mo>]</mml:mo>
</mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mrow>
<mml:mo>[</mml:mo>
<mml:mrow>
<mml:mtable>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:mi mathvariant="script">M</mml:mi>
<mml:msubsup>
<mml:mi mathvariant="script">A</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>j</mml:mi>
</mml:mrow>
<mml:mi>b</mml:mi>
</mml:msubsup>
<mml:mo>&#x2212;</mml:mo>
<mml:mi mathvariant="normal">&#x2133;</mml:mi>
<mml:msubsup>
<mml:mi mathvariant="script">A</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>a</mml:mi>
</mml:msubsup>
</mml:mrow>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:mi mathvariant="script">M</mml:mi>
<mml:msubsup>
<mml:mi mathvariant="normal">&#x212c;</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>j</mml:mi>
</mml:mrow>
<mml:mi>b</mml:mi>
</mml:msubsup>
<mml:mo>&#x2212;</mml:mo>
<mml:mi mathvariant="normal">&#x2133;</mml:mi>
<mml:msubsup>
<mml:mi mathvariant="normal">&#x212c;</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>a</mml:mi>
</mml:msubsup>
</mml:mrow>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:mrow>
<mml:mo>]</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mi>T</mml:mi>
</mml:msup>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
</disp-formula>where <inline-formula id="inf71">
<mml:math id="m93">
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi mathvariant="normal">&#x2133;</mml:mi>
<mml:msubsup>
<mml:mi mathvariant="normal">&#x212c;</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>a</mml:mi>
</mml:msubsup>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mi>q</mml:mi>
</mml:msub>
<mml:mo>&#x3d;</mml:mo>
<mml:mfrac>
<mml:mn>1</mml:mn>
<mml:mrow>
<mml:msubsup>
<mml:mi>n</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>a</mml:mi>
</mml:msubsup>
</mml:mrow>
</mml:mfrac>
<mml:munderover>
<mml:mstyle displaystyle="true">
<mml:mo>&#x2211;</mml:mo>
</mml:mstyle>
<mml:mrow>
<mml:mi>s</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:msubsup>
<mml:mi>n</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>a</mml:mi>
</mml:msubsup>
</mml:mrow>
</mml:munderover>
<mml:mi>K</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msubsup>
<mml:mi>x</mml:mi>
<mml:mi>q</mml:mi>
<mml:mi>b</mml:mi>
</mml:msubsup>
<mml:mo>,</mml:mo>
<mml:msubsup>
<mml:mi>x</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>s</mml:mi>
</mml:mrow>
<mml:mi>a</mml:mi>
</mml:msubsup>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
</inline-formula> and <inline-formula id="inf72">
<mml:math id="m94">
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi mathvariant="script">M</mml:mi>
<mml:msubsup>
<mml:mi mathvariant="normal">&#x212c;</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>j</mml:mi>
</mml:mrow>
<mml:mi>a</mml:mi>
</mml:msubsup>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mi>q</mml:mi>
</mml:msub>
<mml:mo>&#x3d;</mml:mo>
<mml:mi>K</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msubsup>
<mml:mi>x</mml:mi>
<mml:mi>q</mml:mi>
<mml:mi>b</mml:mi>
</mml:msubsup>
<mml:mo>,</mml:mo>
<mml:msubsup>
<mml:mi>x</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>j</mml:mi>
</mml:mrow>
<mml:mi>a</mml:mi>
</mml:msubsup>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
</inline-formula>, where <italic>K</italic> is a kernel function. In this work, we use the Gaussian kernel function. <xref ref-type="disp-formula" rid="e5">Eq. 5</xref> with a linear kernel is equivalent to <xref ref-type="disp-formula" rid="e4">Eq. 4</xref>. However, if <inline-formula id="inf73">
<mml:math id="m95">
<mml:mrow>
<mml:mi>d</mml:mi>
<mml:mo>&#x3c;</mml:mo>
<mml:mi>n</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula>, the criterion in <xref ref-type="disp-formula" rid="e4">Eq. 4</xref> is computationally more efficient than <xref ref-type="disp-formula" rid="e5">Eq. 5</xref>; however, if <inline-formula id="inf74">
<mml:math id="m96">
<mml:mrow>
<mml:mi>d</mml:mi>
<mml:mo>&#x3e;</mml:mo>
<mml:mi>n</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula>, <xref ref-type="disp-formula" rid="e5">Eq. 5</xref> is&#x20;computationally more efficient than <xref ref-type="disp-formula" rid="e4">Eq.&#x20;4</xref>.</p>
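The kernel quantities above can be made concrete with a small sketch. The following is a minimal NumPy illustration, not the authors' implementation; the function names and the bandwidth parameter `sigma` are our own assumptions, while the Gaussian kernel itself matches the one stated in the text.

```python
import numpy as np

def gaussian_kernel(x, y, sigma=1.0):
    # Gaussian (RBF) kernel K(x, y) = exp(-||x - y||^2 / (2 * sigma^2))
    return float(np.exp(-np.sum((x - y) ** 2) / (2.0 * sigma ** 2)))

def kernel_class_mean(x_q, class_samples, sigma=1.0):
    # Mean kernel response of a probe x_q against one class's samples,
    # i.e. (1/n) * sum_s K(x_q, x_s), as in the definition above
    return float(np.mean([gaussian_kernel(x_q, x_s, sigma)
                          for x_s in class_samples]))
```

With a linear kernel K(x, y) = x'y these quantities reduce to inner products with class means, which is why Eq. 5 then coincides with Eq. 4.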
</sec>
</sec>
<sec id="s3">
<title>Proposed Cross-View Face Recognition Approach</title>
<p>The main objective of this research is to utilize the proposed heterogeneity-aware classifiers in conjunction with robust and distinctive features for heterogeneous face recognition. <xref ref-type="fig" rid="F3">Figure&#x20;3</xref> showcases the steps involved in the face recognition pipeline. From the given input image, the face region is detected using a Haar face detector or, for digital sketches, manually annotated eye coordinates. It is our assertion that the proposed HDA and KHDA should yield good results with both handcrafted and learnt representations. Based on our formulation, HDA and KHDA should, to a large extent, help obtain a heterogeneity-invariant representation of the features. Therefore, the less heterogeneity-invariant a feature is, the greater the improvement expected from HDA and KHDA. Arguably, learned features are more sophisticated and heterogeneity-invariant than handcrafted features. Therefore, in this research, we have performed experiments with features of both types for a detailed evaluation. In the literature, histogram of oriented gradients (HOG) and local binary patterns (LBP) are commonly used handcrafted features for heterogeneous face matching (<xref ref-type="bibr" rid="B27">Klare and Jain, 2013</xref>, <xref ref-type="bibr" rid="B28">2010</xref>). <xref ref-type="bibr" rid="B11">Dhamecha et&#x20;al. (2014)</xref> compared the performance of different variants of HOG and showed that DSIFT (<xref ref-type="bibr" rid="B42">Lowe, 2004</xref>) yields the best results. Therefore, among handcrafted features, we demonstrate results with DSIFT (extracted at keypoints on a uniform grid and at landmark points).
For learnt representations, we use the local class sparsity&#x2013;based supervised encoder (LCSSE) (<xref ref-type="bibr" rid="B44">Majumdar et&#x20;al., 2016</xref>), LightCNN (<xref ref-type="bibr" rid="B62">Wu et&#x20;al., 2018</xref>), and ArcFace (<xref ref-type="bibr" rid="B9">Deng J.&#x20;et&#x20;al., 2019</xref>). For LightCNN (<monospace>LightCNN29V2</monospace>) and ArcFace, models pretrained on the MS-Celeb-1M dataset are utilized as feature extractors. For LCSSE, we use the pretrained model and fine-tune it with the training samples of each case&#x20;study.</p>
<fig id="F3" position="float">
<label>FIGURE 3</label>
<caption>
<p>Illustrating the steps involved in the face recognition pipeline with the proposed HDA and KHDA.</p>
</caption>
<graphic xlink:href="frai-04-670538-g003.tif"/>
</fig>
<p>As shown in <xref ref-type="fig" rid="F3">Figure&#x20;3</xref>, once the features are obtained, they are first projected onto a PCA space (preserving 99% eigenenergy) and then onto the <inline-formula id="inf77">
<mml:math id="m99">
<mml:mrow>
<mml:mi>c</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:math>
</inline-formula>-dimensional HDA or KHDA space. It is to be noted that learning the PCA subspace does not use class labels, whereas HDA and KHDA training utilizes both identity and view labels. Finally, the distance score between gallery and probe feature vectors is computed using the cosine distance measure.</p>
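The projection pipeline can be sketched as follows. This is an illustrative NumPy version under our own naming, assuming row-wise feature matrices; the HDA/KHDA projection step is omitted, and only the PCA stage (99% eigenenergy) and the cosine distance are shown.

```python
import numpy as np

def pca_fit(X, energy=0.99):
    # Fit PCA on row-wise samples X, keeping the smallest number of
    # components whose cumulative eigenenergy reaches `energy`
    mu = X.mean(axis=0)
    U, s, Vt = np.linalg.svd(X - mu, full_matrices=False)
    cum = np.cumsum(s ** 2) / np.sum(s ** 2)
    k = int(np.searchsorted(cum, energy)) + 1
    return mu, Vt[:k]          # mean vector and (k x d) projection matrix

def pca_project(x, mu, W):
    # Project a single feature vector into the learned subspace
    return W @ (x - mu)

def cosine_distance(u, v):
    return 1.0 - float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))
```

In the full pipeline, an HDA or KHDA projection would follow the PCA step before the distance computation.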
</sec>
<sec id="s4">
<title>Experimental Evaluation</title>
<p>The effectiveness of the proposed heterogeneous discriminant algorithm is evaluated on three case studies of heterogeneous face recognition: 1) visible to near-infrared matching, 2) cross-resolution face matching, and 3) composite sketch (CS) to digital photo (DP) matching. For all three case studies, we have used publicly available benchmark databases: CASIA NIR-VIS 2.0 (<xref ref-type="bibr" rid="B36">Li et&#x20;al., 2013</xref>), CMU Multi-PIE (<xref ref-type="bibr" rid="B16">Gross et&#x20;al., 2010</xref>), and e-PRIP composite sketch (<xref ref-type="bibr" rid="B17">Han et&#x20;al., 2013</xref>; <xref ref-type="bibr" rid="B46">Mittal et&#x20;al., 2014</xref>). <xref ref-type="table" rid="T2">Table&#x20;2</xref> summarizes the characteristics of the three databases. The experiments are performed with existing published protocols so that the results can be directly compared with those reported in the literature.</p>
<table-wrap id="T2" position="float">
<label>TABLE 2</label>
<caption>
<p>Datasets utilized for evaluating the proposed HDA and KHDA on three heterogeneous face recognition challenges.</p>
</caption>
<table>
<thead valign="top">
<tr>
<th align="left" rowspan="2">Case Study</th>
<th align="left" rowspan="2">Gallery</th>
<th align="left" rowspan="2">Probe</th>
<th align="center" rowspan="2">Dataset</th>
<th align="left" rowspan="2">&#x23;Images</th>
<th align="center" colspan="2">&#x23;Subjects</th>
</tr>
<tr>
<th/>
<th align="center">Total Training: Testing (Protocol)</th>
</tr>
</thead>
<tbody valign="top">
<tr>
<td align="left">Cross-spectral</td>
<td align="left">VIS</td>
<td align="left">NIR</td>
<td align="left">CASIA NIR-VIS-2.0 (<xref ref-type="bibr" rid="B36">Li et&#x20;al., 2013</xref>)</td>
<td align="left">17,850</td>
<td align="left">725</td>
<td align="left">357 : 358 (<xref ref-type="bibr" rid="B36">Li et&#x20;al., 2013</xref>)</td>
</tr>
<tr>
<td align="left">Cross-resolution</td>
<td align="left">HR</td>
<td align="left">LR</td>
<td align="left">CMU Multi-PIE (<xref ref-type="bibr" rid="B16">Gross et&#x20;al., 2010</xref>)</td>
<td align="left">18,420</td>
<td align="left">337</td>
<td align="left">100 : 227 (<xref ref-type="bibr" rid="B3">Bhatt et&#x20;al., 2012</xref>; <xref ref-type="bibr" rid="B2">Bhatt et&#x20;al., 2014</xref>)</td>
</tr>
<tr>
<td align="left">Photo to sketch</td>
<td align="left">DP</td>
<td align="left">CS</td>
<td align="left">e-PRIP composite sketch (<xref ref-type="bibr" rid="B17">Han et&#x20;al., 2013</xref>; <xref ref-type="bibr" rid="B46">Mittal et&#x20;al., 2014</xref>)</td>
<td align="left">246</td>
<td align="left">123</td>
<td align="left">48 : 75 (<xref ref-type="bibr" rid="B46">Mittal et&#x20;al., 2014</xref>)</td>
</tr>
</tbody>
</table>
</table-wrap>
<sec id="s4-1">
<title>Cross-Spectral (Visible&#x2013;NIR) Face Matching</title>
<p>Researchers have proposed several algorithms for VIS to NIR matching and primarily used the CASIA NIR-VIS 2.0 face dataset (<xref ref-type="bibr" rid="B36">Li et&#x20;al., 2013</xref>). The protocol defined for performance evaluation consists of 10 splits of train and test sets for random subsampling cross-validation. As required by the predefined protocol, results are reported for both identification (mean and standard deviation of rank-1 identification accuracy) and verification (GAR at 0.1%&#x20;FAR).</p>
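The two reported metrics can be computed as sketched below. This is a generic NumPy illustration with our own function names, not the evaluation code used for the benchmark.

```python
import numpy as np

def rank1_accuracy(scores, gallery_ids, probe_ids):
    # scores[i, j]: similarity of probe i to gallery j; a probe counts as
    # correct when its highest-scoring gallery entry has the same identity
    best = np.asarray(gallery_ids)[np.argmax(scores, axis=1)]
    return float(np.mean(best == np.asarray(probe_ids)))

def gar_at_far(genuine, impostor, far=0.001):
    # Genuine accept rate at the score threshold that admits a `far`
    # fraction of the impostor scores (here, 0.1% FAR)
    threshold = np.quantile(np.asarray(impostor), 1.0 - far)
    return float(np.mean(np.asarray(genuine) >= threshold))
```

Under the predefined protocol, the rank-1 accuracy would be averaged over the 10 train/test splits, with its standard deviation reported alongside.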
<p>The face images are first detected and preprocessed. Seven landmarks (two eye corners, three points on the nose, and two lip corners) are detected (<xref ref-type="bibr" rid="B13">Everingham et&#x20;al., 2009</xref>) in the input face image, and geometric normalization is applied to register the cropped face images. The output of preprocessing is grayscale face images of size <inline-formula id="inf78">
<mml:math id="m100">
<mml:mrow>
<mml:mn>130</mml:mn>
<mml:mo>&#xd7;</mml:mo>
<mml:mn>150</mml:mn>
</mml:mrow>
</mml:math>
</inline-formula> pixels. All the features<xref ref-type="fn" rid="fn4">
<sup>3</sup>
</xref> are extracted from geometrically normalized face images. We evaluate the effectiveness of HDA over LDA. To compare the results with LDA, the pipeline shown in <xref ref-type="fig" rid="F3">Figure&#x20;3</xref> is followed with the exception of using LDA instead of HDA. The results are reported in <xref ref-type="table" rid="T3">Table&#x20;3</xref> and the key observations are discussed below.<xref ref-type="fn" rid="fn5">
<sup>4</sup>
</xref>
</p>
<table-wrap id="T3" position="float">
<label>TABLE 3</label>
<caption>
<p>Rank-1 identification accuracy for visible to near-infrared face matching on the CASIA NIR&#x2013;VIS 2.0 database (<xref ref-type="bibr" rid="B36">Li et&#x20;al., 2013</xref>).</p>
</caption>
<table>
<thead valign="top">
<tr>
<th colspan="2" align="left">Algorithm</th>
<th align="left">DSIFT</th>
<th align="left">LCSSE</th>
<th align="left">LightCNN</th>
<th align="left">ArcFace</th>
</tr>
</thead>
<tbody valign="top">
<tr>
<td align="left">W/O DA</td>
<td align="left">Eucl</td>
<td align="left">12.6&#xb1;0.9</td>
<td align="left">50.3&#xb1;8.3</td>
<td align="left">95.7&#xb1;0.3</td>
<td align="left">97.1&#xb1;0.4</td>
</tr>
<tr>
<td align="left"/>
<td align="left">Cos</td>
<td align="left">19.6&#xb1;1.4</td>
<td align="left">51.6&#xb1;7.8</td>
<td align="left">96.9&#xb1;0.3</td>
<td align="left">97.4&#xb1;0.5</td>
</tr>
<tr>
<td align="left">LDA</td>
<td align="left">Eucl</td>
<td align="left">56.7&#xb1;2.2</td>
<td align="left">82.3&#xb1;4.8</td>
<td align="left">96.8&#xb1;0.3</td>
<td align="left">98.2&#xb1;0.9</td>
</tr>
<tr>
<td align="left"/>
<td align="left">Cos</td>
<td align="left">80.4&#xb1;1.7</td>
<td align="left">88.9&#xb1;3.2</td>
<td align="left">98.1&#xb1;0.5</td>
<td align="left">98.5&#xb1;0.6</td>
</tr>
<tr>
<td align="left">HDA</td>
<td align="left">Eucl</td>
<td align="left">58.0&#xb1;2.1</td>
<td align="left">95.2&#xb1;1.7</td>
<td align="left">96.3&#xb1;0.5</td>
<td align="left">99.1&#xb1;0.2</td>
</tr>
<tr>
<td align="left"/>
<td align="left">Cos</td>
<td align="left">81.0&#xb1;1.9</td>
<td align="left">96.8&#xb1;0.9</td>
<td align="left">98.1&#xb1;0.3</td>
<td align="left">99.3&#xb1;0.2</td>
</tr>
</tbody>
</table>
</table-wrap>
<sec id="s4-2">
<title>Discriminative Learning using HDA</title>
<p>As shown in <xref ref-type="table" rid="T3">Table&#x20;3</xref>, without discriminant analysis (LDA or HDA), the performance of the individual features is lower. The deep learning&#x2013;based LCSSE yields around 50% rank-1 accuracy. The LightCNN and ArcFace features yield impressive rank-1 accuracies of about 95% and 97%, respectively, which shows their superior feature representation. The next experiment illustrates the effect of applying LDA to the individual features. <xref ref-type="table" rid="T3">Table&#x20;3</xref> shows that LDA improves the accuracy by up to 60%. Comparing the performance of HDA with LDA shows that HDA outperforms LDA: utilizing HDA in place of LDA for discriminative learning improves the results by up to 12.9%. The HDA and LDA performance is very high and almost the same for LightCNN, which may point toward its spectrum-invariant representation capabilities. For ArcFace, although small, a consistent improvement of about 1% is observed from raw features to LDA to HDA. Understandably, if a feature is spectrum-invariant, the benefits of a heterogeneity-aware classifier are expected to be limited. The improvement provided by HDA can be attributed to the fact that it learns a discriminative subspace specifically for heterogeneous matching. Similar to the toy example shown in <xref ref-type="fig" rid="F2">Figure&#x20;2</xref>, it can be asserted that the multi-view information yields different clusters in the feature space. Under such scenarios, since the fundamental assumption of a Gaussian data distribution is not satisfied, LDA can exhibit suboptimal results. However, by encoding the view label information, HDA is able to find a better projection space, thereby yielding better results.</p>
</sec>
<sec id="s4-3">
<title>Effect of HDA across Features</title>
<p>The results show that the proposed HDA improves the accuracy of the DSIFT and LCSSE features by 40&#x2013;60%. For instance, applying HDA to LCSSE features improves the results by around 45%. As discussed earlier, even the raw LightCNN and ArcFace features yield very high performance, leaving very little room for improvement by LDA or HDA projections.</p>
</sec>
<sec id="s4-4">
<title>Direction vs Magnitude in Projection Space</title>
<p>Cosine distance encodes only the difference in direction between samples, whereas Euclidean distance encodes both direction and magnitude. For the given experiment, as shown in <xref ref-type="table" rid="T3">Table&#x20;3</xref>, cosine distance generally yields higher accuracy than Euclidean distance. This suggests that for heterogeneous matching, the magnitude of the projections may not provide useful information, and directional information alone suffices for matching.</p>
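The distinction can be seen with a toy example: rescaling a projected vector changes its Euclidean distance to other samples but leaves its cosine distance unchanged. A minimal sketch (our own illustration, not from the article):

```python
import numpy as np

def euclidean_distance(u, v):
    return float(np.linalg.norm(u - v))

def cosine_distance(u, v):
    return 1.0 - float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

u = np.array([1.0, 2.0])
v = 3.0 * u  # same direction, three times the magnitude
# Euclidean distance grows with the rescaling; cosine distance stays zero
```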
</sec>
<sec id="s4-5">
<title>Optimum Combination</title>
<p>From the above analysis, it can be seen that the proposed HDA in combination with DSIFT features and the cosine distance measure yields an impressive 81% rank-1 accuracy for a handcrafted feature. ArcFace features with HDA and the cosine distance measure yield the best results; however, LightCNN and LCSSE are also within 3% of them. For the remaining experiments (and the other case studies), we demonstrate results with DSIFT, LCSSE, LightCNN, and ArcFace features and the cosine distance measure, along with the proposed heterogeneity-aware classifiers.</p>
</sec>
<sec id="s4-6">
<title>Comparison with Existing Algorithms</title>
<p>We next compare the results of the proposed approaches with the results reported in the literature. Comparative analysis is shown with a leading commercial off-the-shelf (COTS) face recognition system, FaceVACS<xref ref-type="fn" rid="fn6">
<sup>5</sup>
</xref>, and 20 recently published results. <xref ref-type="table" rid="T4">Table&#x20;4</xref> shows that with pixel values as input, the proposed HDA approach outperforms the other existing algorithms. For example, MvDA with pixel values yields 41.6% rank-1 identification accuracy and 19.2% GAR at 0.1% FAR, whereas the proposed approach yields a similar rank-1 accuracy with a lower standard deviation and a much higher GAR of 31.4%. Further, <xref ref-type="table" rid="T4">Table&#x20;4</xref> clearly<xref ref-type="fn" rid="fn7">
<sup>6</sup>
</xref> demonstrates the performance improvement due to the proposed HDA and its nonlinear kernel variant KHDA. KHDA with the learnt representation LCSSE and HDA with LightCNN yield almost equal identification accuracy. However, our best results are obtained with ArcFace and KHDA: 99.4% rank-1 accuracy and 99.1% GAR@FAR&#x3d;0.1%. The reported results are comparable to the recently published state of the&#x20;art.</p>
<table-wrap id="T4" position="float">
<label>TABLE 4</label>
<caption>
<p>Comparing the face recognition performance of the proposed and some existing algorithms for VIS to NIR face matching on CASIA NIR&#x2013;VIS 2.0 dataset.</p>
</caption>
<table>
<thead valign="top">
<tr>
<th rowspan="2" align="left">Algorithm</th>
<th rowspan="2" align="left">Year</th>
<th align="center">Rank-1</th>
<th align="center">GAR</th>
</tr>
<tr>
<th align="center">Accuracy (%)</th>
<th align="center">@ FAR &#x3d; 0.1%</th>
</tr>
</thead>
<tbody valign="top">
<tr>
<td align="left">FaceVACS (<xref ref-type="bibr" rid="B11">Dhamecha et&#x20;al., 2014</xref>)</td>
<td align="center">2014</td>
<td align="center">58.6&#xb1;1.2</td>
<td align="center">52.9</td>
</tr>
<tr>
<td colspan="4" align="center">
<bold>Pixels as Features</bold>
</td>
</tr>
<tr>
<td align="left">CCA<xref ref-type="table-fn" rid="Tfn1">
<sup>a</sup>
</xref> (<xref ref-type="bibr" rid="B18">Hardoon et&#x20;al., 2004</xref>)</td>
<td align="center">2004</td>
<td align="center">28.5&#xb1;3.4</td>
<td align="center">10.8</td>
</tr>
<tr>
<td align="left">PLS<xref ref-type="table-fn" rid="Tfn1">
<sup>a</sup>
</xref> (<xref ref-type="bibr" rid="B54">Sharma and Jacobs, 2011</xref>)</td>
<td align="center">2011</td>
<td align="center">17.7&#xb1;1.9</td>
<td align="center">2.3</td>
</tr>
<tr>
<td align="left">CDFE<xref ref-type="table-fn" rid="Tfn1">
<sup>a</sup>
</xref> (<xref ref-type="bibr" rid="B40">Lin and Tang, 2006</xref>)</td>
<td align="center">2006</td>
<td align="center">27.9&#xb1;2.9</td>
<td align="center">6.9</td>
</tr>
<tr>
<td align="left">MvDA<xref ref-type="table-fn" rid="Tfn1">
<sup>a</sup>
</xref> (<xref ref-type="bibr" rid="B26">Kan et&#x20;al., 2016</xref>)</td>
<td align="center">2012</td>
<td align="center">41.6&#xb1;4.1</td>
<td align="center">19.2</td>
</tr>
<tr>
<td align="left">GMLDA<xref ref-type="table-fn" rid="Tfn1">
<sup>a</sup>
</xref> (<xref ref-type="bibr" rid="B55">Sharma et&#x20;al., 2012</xref>)</td>
<td align="center">2012</td>
<td align="center">23.7&#xb1;1.4</td>
<td align="center">5.1</td>
</tr>
<tr>
<td align="left">GMMFA<xref ref-type="table-fn" rid="Tfn1">
<sup>a</sup>
</xref> (<xref ref-type="bibr" rid="B55">Sharma et&#x20;al., 2012</xref>)</td>
<td align="center">2012</td>
<td align="center">24.8&#xb1;1.1</td>
<td align="center">7.6</td>
</tr>
<tr>
<td align="left">PCA&#x2b;Symmetry&#x2b;HCA (<xref ref-type="bibr" rid="B36">Li et&#x20;al., 2013</xref>)</td>
<td align="center">2013</td>
<td align="center">23.7&#xb1;1.9</td>
<td align="center">19.3</td>
</tr>
<tr>
<td align="left">PIXEL&#x2b;HDA</td>
<td align="center">-</td>
<td align="center">41.4&#xb1;1.3</td>
<td align="center">31.4</td>
</tr>
<tr>
<td colspan="4" align="center">
<bold>Other Features/Approaches</bold>
</td>
</tr>
<tr>
<td align="left">DSIFT&#x2b;SDA (<inline-formula id="inf79">
<mml:math id="m101">
<mml:mrow>
<mml:mi>H</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:math>
</inline-formula>) (<xref ref-type="bibr" rid="B66">Zhu and Martinez, 2006</xref>)</td>
<td align="center">2006</td>
<td align="center">75.7&#xb1;1.2</td>
<td align="center">54.8</td>
</tr>
<tr>
<td align="left">Gabor&#x2b;RBM&#x2b;Remove 11 PC (<xref ref-type="bibr" rid="B63">Yi et&#x20;al., 2015</xref>)</td>
<td align="center">2015</td>
<td align="center">86.2&#xb1;1.0</td>
<td align="center">81.3</td>
</tr>
<tr>
<td align="left">C-DFD (s&#x3d;3)<xref ref-type="table-fn" rid="Tfn1">
<sup>a</sup>
</xref> (<xref ref-type="bibr" rid="B30">Lei et&#x20;al., 2014</xref>)</td>
<td align="center">2014</td>
<td align="center">65.8&#xb1;1.6</td>
<td align="center">46.2</td>
</tr>
<tr>
<td align="left">CDFL (s&#x3d;3) (<xref ref-type="bibr" rid="B23">Jin et&#x20;al., 2015</xref>)</td>
<td align="center">2015</td>
<td align="center">71.5&#xb1;1.4</td>
<td align="center">55.1</td>
</tr>
<tr>
<td align="left">C-CBFD&#x2b;LDA (<xref ref-type="bibr" rid="B43">Lu et&#x20;al., 2015</xref>)</td>
<td align="center">2015</td>
<td align="center">81.8&#xb1;2.3</td>
<td align="center">47.3</td>
</tr>
<tr>
<td align="left">Joint Dictionary Learning (<xref ref-type="bibr" rid="B24">Juefei-Xu et&#x20;al., 2015</xref>)</td>
<td align="center">2015</td>
<td align="center">78.5&#xb1;1.7</td>
<td align="center">85.8</td>
</tr>
<tr>
<td align="left">
<xref ref-type="bibr" rid="B52">Saxena and Verbeek (2016)</xref>
</td>
<td align="center">2016</td>
<td align="center">85.9&#xb1;0.9</td>
<td align="center">78.0</td>
</tr>
<tr>
<td align="left">
<xref ref-type="bibr" rid="B51">Reale et&#x20;al. (2016)</xref>
</td>
<td align="center">2016</td>
<td align="center">87.1&#xb1;0.9</td>
<td align="center">74.5</td>
</tr>
<tr>
<td align="left">TRIVET (<xref ref-type="bibr" rid="B41">Liu et&#x20;al., 2016</xref>)</td>
<td align="center">2016</td>
<td align="center">95.7&#xb1;0.5</td>
<td align="center">91.0</td>
</tr>
<tr>
<td align="left">MTC-ELM (<xref ref-type="bibr" rid="B22">Jin et&#x20;al., 2016</xref>)</td>
<td align="center">2016</td>
<td align="center">89.1</td>
<td align="center">-</td>
</tr>
<tr>
<td align="left">
<xref ref-type="bibr" rid="B34">Lezama et&#x20;al. (2017)</xref>
</td>
<td align="center">2017</td>
<td align="center">89.6&#xb1;0.9</td>
<td align="center">-</td>
</tr>
<tr>
<td align="left">
<xref ref-type="bibr" rid="B19">He et&#x20;al. (2017)</xref>
</td>
<td align="center">2017</td>
<td align="center">95.8&#xb1;0.8</td>
<td align="center">94.0</td>
</tr>
<tr>
<td align="left">Gabor&#x2b;HJB (<xref ref-type="bibr" rid="B56">Shi et&#x20;al., 2017</xref>)</td>
<td align="center">2017</td>
<td align="center">91.7&#xb1;0.9</td>
<td align="center">89.9</td>
</tr>
<tr>
<td align="left">G-HFR (<xref ref-type="bibr" rid="B50">Peng et&#x20;al., 2017</xref>)</td>
<td align="center">2017</td>
<td align="center">85.3&#xb1;0.0</td>
<td align="center">-</td>
</tr>
<tr>
<td align="left">Frankenstein (<xref ref-type="bibr" rid="B21">Hu et&#x20;al., 2018</xref>)</td>
<td align="center">2018</td>
<td align="center">85.1&#xb1;0.8</td>
<td align="center">-</td>
</tr>
<tr>
<td align="left">LightCNN (<xref ref-type="bibr" rid="B62">Wu et&#x20;al., 2018</xref>)</td>
<td align="center">2018</td>
<td align="center">96.7&#xb1;0.2</td>
<td align="center">94.8</td>
</tr>
<tr>
<td align="left">WCNN (<xref ref-type="bibr" rid="B20">He et&#x20;al., 2019</xref>)</td>
<td align="center">2019</td>
<td align="center">98.7</td>
<td align="center">98.4</td>
</tr>
<tr>
<td align="left">MC-CNN (<xref ref-type="bibr" rid="B10">Deng et&#x20;al., 2019b</xref>)</td>
<td align="center">2019</td>
<td align="center">99.2&#xb1;0.2</td>
<td align="center">-</td>
</tr>
<tr>
<td align="left">RGM&#x2b;NAU&#x2b;C-softmax (<xref ref-type="bibr" rid="B7">Cho et&#x20;al., 2020</xref>)</td>
<td align="center">2020</td>
<td align="center">99.3&#xb1;0.1</td>
<td align="center">98.9</td>
</tr>
<tr>
<td align="left">PACH (<xref ref-type="bibr" rid="B12">Duan et&#x20;al., 2020</xref>)</td>
<td align="center">2020</td>
<td align="center">98.9&#xb1;0.2</td>
<td align="center">98.3</td>
</tr>
<tr>
<td align="left">DSIFT&#x2b;HDA</td>
<td align="center">-</td>
<td align="center">81.0&#xb1;1.9</td>
<td align="center">62.8</td>
</tr>
<tr>
<td align="left">DSIFT&#x2b;KHDA</td>
<td align="center">-</td>
<td align="center">83.1&#xb1;1.7</td>
<td align="center">62.1</td>
</tr>
<tr>
<td align="left">LCSSE&#x2b;HDA</td>
<td align="center">-</td>
<td align="center">96.8&#xb1;0.9</td>
<td align="center">93.1</td>
</tr>
<tr>
<td align="left">LCSSE&#x2b;KHDA</td>
<td align="center">-</td>
<td align="center">98.1&#xb1;0.5</td>
<td align="center">94.3</td>
</tr>
<tr>
<td align="left">LightCNN&#x2b;HDA</td>
<td align="center">-</td>
<td align="center">98.1&#xb1;0.3</td>
<td align="center">96.5</td>
</tr>
<tr>
<td align="left">ArcFace&#x2b;HDA</td>
<td align="center">-</td>
<td align="center">99.3&#xb1;0.2</td>
<td align="center">98.8</td>
</tr>
<tr>
<td align="left">ArcFace&#x2b;KHDA</td>
<td align="center">-</td>
<td align="center">99.4&#xb1;0.1</td>
<td align="center">99.1</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<fn id="Tfn1">
<label>a</label>
<p>represents the results reported in <xref ref-type="bibr" rid="B23">Jin et&#x20;al. (2015)</xref> and <xref ref-type="bibr" rid="B43">Lu et&#x20;al. (2015)</xref>. Other cited results are as reported in their corresponding publications.</p>
</fn>
</table-wrap-foot>
</table-wrap>
<p>LCSSE&#x2b;KHDA and LightCNN&#x2b;HDA achieve 94.3% and 96.5% GAR at 0.1% FAR, respectively. Note also that, in a fair comparison, DSIFT features with the proposed KHDA yield results comparable to other non-deep learning&#x2013;based approaches.</p>
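The GAR at 0.1% FAR reported above can be computed from the genuine and impostor score distributions: fix the decision threshold so that the desired fraction of impostor scores is accepted, then measure the fraction of genuine scores above it. A minimal illustration with synthetic scores (not the evaluation code used in this work):

```python
import numpy as np

def gar_at_far(genuine, impostor, far=1e-3):
    """Genuine accept rate at a target false accept rate.

    Picks the threshold admitting roughly `far` of the impostor
    scores, then measures the genuine acceptance at that threshold.
    """
    impostor = np.sort(np.asarray(impostor))[::-1]  # descending
    # Index of the threshold admitting the top `far` fraction of impostors.
    k = max(int(np.floor(far * len(impostor))), 1)
    threshold = impostor[k - 1]
    return (np.asarray(genuine) >= threshold).mean()

# Synthetic example: genuine scores centered higher than impostor scores.
rng = np.random.default_rng(0)
genuine = rng.normal(2.0, 1.0, 10_000)
impostor = rng.normal(0.0, 1.0, 100_000)
print(gar_at_far(genuine, impostor, far=1e-3))
```

With finite impostor sets, ties at the threshold can make the realized FAR slightly exceed the target; production evaluations typically interpolate the ROC instead.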
</sec>
</sec>
<sec id="s4-7">
<title>Cross-Resolution Face Matching</title>
<p>Cross-resolution face recognition entails matching high-resolution gallery images with low-resolution probe images. In this scenario, high and low resolutions are treated as two different views of a face image. We compare our approach with <xref ref-type="bibr" rid="B3">Bhatt et&#x20;al. (2012</xref>, <xref ref-type="bibr" rid="B2">2014</xref>), which report among the best results for this problem, and follow their protocol on the CMU Multi-PIE database (<xref ref-type="bibr" rid="B16">Gross et&#x20;al., 2010</xref>). Each image is resized to six resolutions: <inline-formula id="inf80">
<mml:math id="m102">
<mml:mrow>
<mml:mn>16</mml:mn>
<mml:mo>&#xd7;</mml:mo>
<mml:mn>16</mml:mn>
</mml:mrow>
</mml:math>
</inline-formula>, <inline-formula id="inf81">
<mml:math id="m103">
<mml:mrow>
<mml:mn>24</mml:mn>
<mml:mo>&#xd7;</mml:mo>
<mml:mn>24</mml:mn>
</mml:mrow>
</mml:math>
</inline-formula>, <inline-formula id="inf82">
<mml:math id="m104">
<mml:mrow>
<mml:mn>32</mml:mn>
<mml:mo>&#xd7;</mml:mo>
<mml:mn>32</mml:mn>
</mml:mrow>
</mml:math>
</inline-formula>, <inline-formula id="inf83">
<mml:math id="m105">
<mml:mrow>
<mml:mn>48</mml:mn>
<mml:mo>&#xd7;</mml:mo>
<mml:mn>48</mml:mn>
</mml:mrow>
</mml:math>
</inline-formula>, <inline-formula id="inf84">
<mml:math id="m106">
<mml:mrow>
<mml:mn>72</mml:mn>
<mml:mo>&#xd7;</mml:mo>
<mml:mn>72</mml:mn>
</mml:mrow>
</mml:math>
</inline-formula>, and <inline-formula id="inf85">
<mml:math id="m107">
<mml:mrow>
<mml:mn>216</mml:mn>
<mml:mo>&#xd7;</mml:mo>
<mml:mn>216</mml:mn>
</mml:mrow>
</mml:math>
</inline-formula>. In total, <inline-formula id="inf86">
<mml:math id="m108">
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mtable>
<mml:mtr>
<mml:mtd>
<mml:mn>6</mml:mn>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:mn>2</mml:mn>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>15</mml:mn>
</mml:mrow>
</mml:math>
</inline-formula> cross-resolution matching scenarios are considered. Two images are selected per person; images of 100 subjects are used for training, and those of the remaining 237 subjects for&#x20;testing. The results are reported in <xref ref-type="table" rid="T5">Table&#x20;5</xref>. Results for ArcFace&#x2b;KHDA are similar to those of ArcFace&#x2b;HDA and hence not reported here. Since the protocol (<xref ref-type="bibr" rid="B3">Bhatt et&#x20;al., 2012</xref>, <xref ref-type="bibr" rid="B2">2014</xref>) does not involve cross-validation, error intervals are not reported.</p>
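The pairing described above, six resolutions taken two at a time, can be enumerated directly; a short sketch (the assignment of the larger resolution as gallery mirrors the layout of Table 5 and is our illustrative convention, not the authors' code):

```python
from itertools import combinations

# Six square image sizes (in pixels) used in the protocol.
RESOLUTIONS = [16, 24, 32, 48, 72, 216]

def cross_resolution_scenarios(resolutions):
    """Return the C(6, 2) = 15 (gallery, probe) resolution pairs,
    with the higher resolution acting as the gallery."""
    pairs = []
    for lo, hi in combinations(sorted(resolutions), 2):
        pairs.append((hi, lo))
    return pairs

scenarios = cross_resolution_scenarios(RESOLUTIONS)
print(len(scenarios))  # 15
```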
<table-wrap id="T5" position="float">
<label>TABLE 5</label>
<caption>
<p>Rank-1 identification accuracy of the proposed HDA, KHDA and existing algorithms, Cotransfer Learning (CTL) and a commercial off-the-shelf (COTS) (<xref ref-type="bibr" rid="B3">Bhatt et&#x20;al., 2012</xref>, <xref ref-type="bibr" rid="B2">2014</xref>), DSIFT (<xref ref-type="bibr" rid="B42">Lowe, 2004</xref>), LCSSE (<xref ref-type="bibr" rid="B44">Majumdar et al., 2016</xref>), LightCNN, and ArcFace on CMU Multi-PIE database (<xref ref-type="bibr" rid="B16">Gross et al., 2010</xref>) with different gallery and probe image sizes.</p>
</caption>
<table>
<thead valign="top">
<tr>
<th rowspan="2" align="left">Probe res.</th>
<th rowspan="2" align="center">CTL</th>
<th rowspan="2" align="center">COTS</th>
<th colspan="2" align="center">DSIFT</th>
<th colspan="2" align="center">LCSSE</th>
<th align="center">LightCNN</th>
<th align="center">ArcFace</th>
</tr>
<tr>
<th align="center">HDA</th>
<th align="center">KHDA</th>
<th align="center">HDA</th>
<th align="center">KHDA</th>
<th align="center">HDA</th>
<th align="center">HDA</th>
</tr>
</thead>
<tbody valign="top">
<tr>
<td colspan="9" align="center">
<bold>Gallery: 216 &#xd7; 216</bold>
</td>
</tr>
<tr>
<td align="left">
<inline-formula id="inf88">
<mml:math id="m111">
<mml:mrow>
<mml:mn>72</mml:mn>
<mml:mo>&#xd7;</mml:mo>
<mml:mn>72</mml:mn>
</mml:mrow>
</mml:math>
</inline-formula>
</td>
<td align="char" char=".">81.0</td>
<td align="char" char=".">99.5</td>
<td align="char" char=".">94.1</td>
<td align="char" char=".">95.4</td>
<td align="char" char=".">95.8</td>
<td align="char" char=".">97.0</td>
<td align="char" char=".">100</td>
<td align="char" char=".">100</td>
</tr>
<tr>
<td align="left">
<inline-formula id="inf89">
<mml:math id="m112">
<mml:mrow>
<mml:mn>48</mml:mn>
<mml:mo>&#xd7;</mml:mo>
<mml:mn>48</mml:mn>
</mml:mrow>
</mml:math>
</inline-formula>
</td>
<td align="char" char=".">79.7</td>
<td align="char" char=".">98.1</td>
<td align="char" char=".">92.4</td>
<td align="char" char=".">94.1</td>
<td align="char" char=".">93.7</td>
<td align="char" char=".">95.3</td>
<td align="char" char=".">100</td>
<td align="char" char=".">100</td>
</tr>
<tr>
<td align="left">
<inline-formula id="inf90">
<mml:math id="m113">
<mml:mrow>
<mml:mn>32</mml:mn>
<mml:mo>&#xd7;</mml:mo>
<mml:mn>32</mml:mn>
</mml:mrow>
</mml:math>
</inline-formula>
</td>
<td align="char" char=".">65.3</td>
<td align="char" char=".">97.4</td>
<td align="char" char=".">89.0</td>
<td align="char" char=".">90.7</td>
<td align="char" char=".">92.0</td>
<td align="char" char=".">93.2</td>
<td align="char" char=".">99.6</td>
<td align="char" char=".">100</td>
</tr>
<tr>
<td align="left">
<inline-formula id="inf91">
<mml:math id="m114">
<mml:mrow>
<mml:mn>24</mml:mn>
<mml:mo>&#xd7;</mml:mo>
<mml:mn>24</mml:mn>
</mml:mrow>
</mml:math>
</inline-formula>
</td>
<td align="char" char=".">37.7</td>
<td align="char" char=".">54.5</td>
<td align="char" char=".">87.3</td>
<td align="char" char=".">85.7</td>
<td align="char" char=".">89.0</td>
<td align="char" char=".">89.5</td>
<td align="char" char=".">92.0</td>
<td align="char" char=".">95.0</td>
</tr>
<tr>
<td align="left">
<inline-formula id="inf92">
<mml:math id="m115">
<mml:mrow>
<mml:mn>16</mml:mn>
<mml:mo>&#xd7;</mml:mo>
<mml:mn>16</mml:mn>
</mml:mrow>
</mml:math>
</inline-formula>
</td>
<td align="char" char=".">23.6</td>
<td align="char" char=".">10.9</td>
<td align="char" char=".">37.6</td>
<td align="char" char=".">37.6</td>
<td align="char" char=".">61.2</td>
<td align="char" char=".">62.5</td>
<td align="char" char=".">35.0</td>
<td align="char" char=".">46.0</td>
</tr>
<tr>
<td colspan="9" align="center">
<bold>Gallery: 72 &#xd7; 72</bold>
</td>
</tr>
<tr>
<td align="left">
<inline-formula id="inf94">
<mml:math id="m117">
<mml:mrow>
<mml:mn>48</mml:mn>
<mml:mo>&#xd7;</mml:mo>
<mml:mn>48</mml:mn>
</mml:mrow>
</mml:math>
</inline-formula>
</td>
<td align="char" char=".">92.3</td>
<td align="char" char=".">92.7</td>
<td align="char" char=".">95.4</td>
<td align="char" char=".">96.2</td>
<td align="char" char=".">96.6</td>
<td align="char" char=".">97.0</td>
<td align="char" char=".">100</td>
<td align="char" char=".">100</td>
</tr>
<tr>
<td align="left">
<inline-formula id="inf95">
<mml:math id="m118">
<mml:mrow>
<mml:mn>32</mml:mn>
<mml:mo>&#xd7;</mml:mo>
<mml:mn>32</mml:mn>
</mml:mrow>
</mml:math>
</inline-formula>
</td>
<td align="char" char=".">84.1</td>
<td align="char" char=".">84.3</td>
<td align="char" char=".">92.4</td>
<td align="char" char=".">96.2</td>
<td align="char" char=".">92.8</td>
<td align="char" char=".">96.6</td>
<td align="char" char=".">100</td>
<td align="char" char=".">100</td>
</tr>
<tr>
<td align="left">
<inline-formula id="inf96">
<mml:math id="m119">
<mml:mrow>
<mml:mn>24</mml:mn>
<mml:mo>&#xd7;</mml:mo>
<mml:mn>24</mml:mn>
</mml:mrow>
</mml:math>
</inline-formula>
</td>
<td align="char" char=".">77.4</td>
<td align="char" char=".">78.5</td>
<td align="char" char=".">89.0</td>
<td align="char" char=".">91.6</td>
<td align="char" char=".">93.2</td>
<td align="char" char=".">94.1</td>
<td align="char" char=".">95.4</td>
<td align="char" char=".">98.2</td>
</tr>
<tr>
<td align="left">
<inline-formula id="inf97">
<mml:math id="m120">
<mml:mrow>
<mml:mn>16</mml:mn>
<mml:mo>&#xd7;</mml:mo>
<mml:mn>16</mml:mn>
</mml:mrow>
</mml:math>
</inline-formula>
</td>
<td align="char" char=".">72.4</td>
<td align="char" char=".">72.8</td>
<td align="char" char=".">44.3</td>
<td align="char" char=".">54.9</td>
<td align="char" char=".">73.4</td>
<td align="char" char=".">75.1</td>
<td align="char" char=".">39.2</td>
<td align="char" char=".">52.4</td>
</tr>
<tr>
<td colspan="9" align="center">
<bold>Gallery: 48 &#xd7; 48</bold>
</td>
</tr>
<tr>
<td align="left">
<inline-formula id="inf99">
<mml:math id="m122">
<mml:mrow>
<mml:mn>32</mml:mn>
<mml:mo>&#xd7;</mml:mo>
<mml:mn>32</mml:mn>
</mml:mrow>
</mml:math>
</inline-formula>
</td>
<td align="char" char=".">61.8</td>
<td align="char" char=".">96.8</td>
<td align="char" char=".">95.4</td>
<td align="char" char=".">97.1</td>
<td align="char" char=".">96.2</td>
<td align="char" char=".">97.9</td>
<td align="char" char=".">100</td>
<td align="char" char=".">100</td>
</tr>
<tr>
<td align="left">
<inline-formula id="inf100">
<mml:math id="m123">
<mml:mrow>
<mml:mn>24</mml:mn>
<mml:mo>&#xd7;</mml:mo>
<mml:mn>24</mml:mn>
</mml:mrow>
</mml:math>
</inline-formula>
</td>
<td align="char" char=".">57.1</td>
<td align="char" char=".">75.9</td>
<td align="char" char=".">95.4</td>
<td align="char" char=".">94.9</td>
<td align="char" char=".">96.6</td>
<td align="char" char=".">97.5</td>
<td align="char" char=".">89.9</td>
<td align="char" char=".">94.8</td>
</tr>
<tr>
<td align="left">
<inline-formula id="inf101">
<mml:math id="m124">
<mml:mrow>
<mml:mn>16</mml:mn>
<mml:mo>&#xd7;</mml:mo>
<mml:mn>16</mml:mn>
</mml:mrow>
</mml:math>
</inline-formula>
</td>
<td align="char" char=".">32.9</td>
<td align="char" char=".">6.4</td>
<td align="char" char=".">73.8</td>
<td align="char" char=".">71.3</td>
<td align="char" char=".">77.2</td>
<td align="char" char=".">78.1</td>
<td align="char" char=".">34.6</td>
<td align="char" char=".">50.0</td>
</tr>
<tr>
<td colspan="9" align="center">
<bold>Gallery: 32 &#xd7; 32</bold>
</td>
</tr>
<tr>
<td align="left">
<inline-formula id="inf103">
<mml:math id="m126">
<mml:mrow>
<mml:mn>24</mml:mn>
<mml:mo>&#xd7;</mml:mo>
<mml:mn>24</mml:mn>
</mml:mrow>
</mml:math>
</inline-formula>
</td>
<td align="char" char=".">45.7</td>
<td align="char" char=".">78.4</td>
<td align="char" char=".">94.9</td>
<td align="char" char=".">94.5</td>
<td align="char" char=".">95.8</td>
<td align="char" char=".">96.2</td>
<td align="char" char=".">98.7</td>
<td align="char" char=".">100</td>
</tr>
<tr>
<td align="left">
<inline-formula id="inf104">
<mml:math id="m127">
<mml:mrow>
<mml:mn>16</mml:mn>
<mml:mo>&#xd7;</mml:mo>
<mml:mn>16</mml:mn>
</mml:mrow>
</mml:math>
</inline-formula>
</td>
<td align="char" char=".">28.1</td>
<td align="char" char=".">5.4</td>
<td align="char" char=".">88.6</td>
<td align="char" char=".">86.1</td>
<td align="char" char=".">90.3</td>
<td align="char" char=".">91.1</td>
<td align="char" char=".">50.6</td>
<td align="char" char=".">62.4</td>
</tr>
<tr>
<td colspan="9" align="center">
<bold>Gallery: 24 &#xd7; 24</bold>
</td>
</tr>
<tr>
<td align="left">
<inline-formula id="inf106">
<mml:math id="m149">
<mml:mrow>
<mml:mn>16</mml:mn>
<mml:mo>&#xd7;</mml:mo>
<mml:mn>16</mml:mn>
</mml:mrow>
</mml:math>
</inline-formula>
</td>
<td align="char" char=".">43.2</td>
<td align="char" char=".">16.3</td>
<td align="char" char=".">85.7</td>
<td align="char" char=".">85.2</td>
<td align="char" char=".">87.3</td>
<td align="char" char=".">89.0</td>
<td align="char" char=".">56.5</td>
<td align="char" char=".">68.8</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>It can be seen that LCSSE&#x2b;KHDA outperforms cotransfer learning (<xref ref-type="bibr" rid="B3">Bhatt et&#x20;al., 2012</xref>, <xref ref-type="bibr" rid="B2">2014</xref>) in all the cross-resolution matching scenarios. For example, when <inline-formula id="inf107">
<mml:math id="m129">
<mml:mrow>
<mml:mn>48</mml:mn>
<mml:mo>&#xd7;</mml:mo>
<mml:mn>48</mml:mn>
</mml:mrow>
</mml:math>
</inline-formula> pixel gallery images are matched with probe images of <inline-formula id="inf108">
<mml:math id="m130">
<mml:mrow>
<mml:mn>32</mml:mn>
<mml:mo>&#xd7;</mml:mo>
<mml:mn>32</mml:mn>
</mml:mrow>
</mml:math>
</inline-formula>, <inline-formula id="inf109">
<mml:math id="m131">
<mml:mrow>
<mml:mn>24</mml:mn>
<mml:mo>&#xd7;</mml:mo>
<mml:mn>24</mml:mn>
</mml:mrow>
</mml:math>
</inline-formula>, and <inline-formula id="inf110">
<mml:math id="m132">
<mml:mrow>
<mml:mn>16</mml:mn>
<mml:mo>&#xd7;</mml:mo>
<mml:mn>16</mml:mn>
</mml:mrow>
</mml:math>
</inline-formula> pixels, an improvement of about 30%&#x2013;40% is observed. LightCNN and ArcFace yield even higher identification accuracies, except when the probe image is <inline-formula id="inf111">
<mml:math id="m133">
<mml:mrow>
<mml:mn>16</mml:mn>
<mml:mo>&#xd7;</mml:mo>
<mml:mn>16</mml:mn>
</mml:mrow>
</mml:math>
</inline-formula>. We believe the feature extractors are unable to extract representative information at such low resolutions. Analyzing the results across resolutions shows that accuracy decreases as the resolution difference between gallery and probe images increases. FaceVACS yields impressive performance when both gallery and probe sizes are larger than <inline-formula id="inf112">
<mml:math id="m134">
<mml:mrow>
<mml:mn>32</mml:mn>
<mml:mo>&#xd7;</mml:mo>
<mml:mn>32</mml:mn>
</mml:mrow>
</mml:math>
</inline-formula>. However, its performance deteriorates significantly as the gallery image size decreases and as the resolution difference increases. In general, the proposed HDA and KHDA are less affected by resolution differences than FaceVACS and CTL. We also observe that, for cross-resolution face recognition, learned features (LCSSE, LightCNN, and ArcFace) achieve higher accuracies than DSIFT, with a difference of up to&#x20;25%.</p>
</sec>
<sec id="s4-8">
<title>Digital Photo to Composite Sketch Face Matching</title>
<p>In many law enforcement and forensic applications, software tools are used to generate composite sketches based on eyewitness descriptions, and the resulting composite sketch is matched against a gallery of digital photographs. <xref ref-type="bibr" rid="B17">Han et&#x20;al. (2013)</xref> presented a component-based approach followed by score fusion for composite-to-photo matching. Later, <xref ref-type="bibr" rid="B46">Mittal et&#x20;al. (2014</xref>, <xref ref-type="bibr" rid="B48">2013</xref>, <xref ref-type="bibr" rid="B49">2015</xref>, <xref ref-type="bibr" rid="B47">2017)</xref> and <xref ref-type="bibr" rid="B8">Chugh et&#x20;al. (2013)</xref> presented learning-based algorithms for this task. <xref ref-type="bibr" rid="B29">Klum et&#x20;al. (2014)</xref> presented FaceSketchID for matching composite sketches to photos.</p>
<p>For this set of experiments, we utilize the e-PRIP composite sketch dataset (<xref ref-type="bibr" rid="B17">Han et&#x20;al., 2013</xref>; <xref ref-type="bibr" rid="B46">Mittal et&#x20;al., 2014</xref>). The dataset contains composite sketches of 123 face images from the AR face dataset (<xref ref-type="bibr" rid="B45">Martinez, 1998</xref>). The composite sketches were created using two tools, Faces and IdentiKit<xref ref-type="fn" rid="fn8">
<sup>7</sup>
</xref>. The PRIP dataset (<xref ref-type="bibr" rid="B17">Han et&#x20;al., 2013</xref>) originally contains composite sketches prepared by a Caucasian user (with the IdentiKit and Faces software) and an Asian user (with the Faces software). <xref ref-type="bibr" rid="B46">Mittal et&#x20;al. (2014)</xref> later extended the dataset with composite sketches prepared by an Indian user (with the Faces software); the extended set is termed the e-PRIP composite sketch dataset. In this work, we use the composite sketches prepared with the Faces software by the Caucasian and Indian users, as they have been shown to yield better results than the other sets (<xref ref-type="bibr" rid="B46">Mittal et&#x20;al., 2014</xref>, <xref ref-type="bibr" rid="B48">2013</xref>). The experiments are performed with the same protocol as presented by <xref ref-type="bibr" rid="B46">Mittal et&#x20;al. (2014)</xref>. Mean rank-10 identification accuracies across five random cross-validations are reported in <xref ref-type="table" rid="T6">Table&#x20;6</xref>, and <xref ref-type="fig" rid="F4">Figure&#x20;4</xref> shows the corresponding CMC curves.</p>
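Rank-k identification accuracy and the CMC curves of Figure 4 can be derived from a probe-by-gallery similarity matrix: for each probe, rank gallery entries by score and record the rank of the first correct identity. A minimal closed-set sketch with toy scores (not the evaluation code used in this work):

```python
import numpy as np

def cmc_curve(similarity, gallery_ids, probe_ids, max_rank=10):
    """CMC: fraction of probes whose true identity appears within rank k.

    similarity[i, j] is the score between probe i and gallery entry j.
    Assumes closed-set identification: every probe identity is in the gallery.
    """
    order = np.argsort(-similarity, axis=1)              # descending scores
    ranked_ids = np.asarray(gallery_ids)[order]          # (n_probes, n_gallery)
    hits = ranked_ids == np.asarray(probe_ids)[:, None]  # True at correct identity
    first_hit = hits.argmax(axis=1)                      # 0-based rank of first match
    return np.array([(first_hit < k).mean() for k in range(1, max_rank + 1)])

# Toy example: 3 probes scored against a 4-identity gallery.
sim = np.array([[0.9, 0.2, 0.1, 0.3],
                [0.1, 0.4, 0.8, 0.2],
                [0.5, 0.6, 0.3, 0.1]])
cmc = cmc_curve(sim, gallery_ids=[0, 1, 2, 3], probe_ids=[0, 2, 0], max_rank=4)
print(cmc)  # cmc[0] is the rank-1 accuracy, cmc[-1] the rank-4 accuracy
```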
<table-wrap id="T6" position="float">
<label>TABLE 6</label>
<caption>
<p>Results for composite sketch to photo matching.</p>
</caption>
<table>
<thead valign="top">
<tr>
<th rowspan="2" align="left">Algorithm</th>
<th colspan="2" align="center">Rank-10 Accuracy (%)</th>
</tr>
<tr>
<th align="left">Faces (Caucasian)</th>
<th align="left">Faces (Indian)</th>
</tr>
</thead>
<tbody valign="top">
<tr>
<td align="left">
<xref ref-type="bibr" rid="B49">Mittal et&#x20;al. (2015)</xref>
</td>
<td align="center">56.0&#xb1;2.1</td>
<td align="center">60.2&#xb1;2.9</td>
</tr>
<tr>
<td align="left">
<xref ref-type="bibr" rid="B47">Mittal et&#x20;al. (2017)</xref>
</td>
<td align="center">59.3&#xb1;0.8</td>
<td align="center">58.4&#xb1;1.1</td>
</tr>
<tr>
<td align="left">COTS (<xref ref-type="bibr" rid="B46">Mittal et&#x20;al., 2014</xref>)</td>
<td align="center">11.3&#xb1;2.1</td>
<td align="center">9.1&#xb1;1.9</td>
</tr>
<tr>
<td align="left">
<xref ref-type="bibr" rid="B52">Saxena and Verbeek (2016)</xref>
</td>
<td align="center">-</td>
<td align="center">65.6&#xb1;3.7</td>
</tr>
<tr>
<td align="left">DSIFT only</td>
<td align="center">67.5&#xb1;5.8</td>
<td align="center">51.7&#xb1;4.0</td>
</tr>
<tr>
<td align="left">DSIFT&#x2b;HDA</td>
<td align="center">79.5&#xb1;2.8</td>
<td align="center">73.9&#xb1;5.8</td>
</tr>
<tr>
<td align="left">DSIFT&#x2b;KHDA</td>
<td align="center">78.6&#xb1;3.4</td>
<td align="center">74.6&#xb1;3.8</td>
</tr>
<tr>
<td align="left">LCSSE only</td>
<td align="center">68.0&#xb1;2.6</td>
<td align="center">65.3&#xb1;4.1</td>
</tr>
<tr>
<td align="left">LCSSE&#x2b;HDA</td>
<td align="center">85.6&#xb1;1.3</td>
<td align="center">89.0&#xb1;1.5</td>
</tr>
<tr>
<td align="left">LCSSE&#x2b;KHDA</td>
<td align="center">89.6&#xb1;1.9</td>
<td align="center">94.7&#xb1;1.0</td>
</tr>
<tr>
<td align="left">LightCNN only</td>
<td align="center">84.6&#xb1;0.9</td>
<td align="center">75.4&#xb1;1.0</td>
</tr>
<tr>
<td align="left">LightCNN&#x2b;HDA</td>
<td align="center">85.0&#xb1;0.6</td>
<td align="center">72.1&#xb1;0.9</td>
</tr>
<tr>
<td align="left">ArcFace only</td>
<td align="center">86.5&#xb1;0.2</td>
<td align="center">80.6&#xb1;1.3</td>
</tr>
<tr>
<td align="left">ArcFace&#x2b;HDA</td>
<td align="center">89.1&#xb1;0.6</td>
<td align="center">90.8&#xb1;1.1</td>
</tr>
<tr>
<td align="left">ArcFace&#x2b;KHDA</td>
<td align="center">90.2&#xb1;0.4</td>
<td align="center">95.2&#xb1;0.7</td>
</tr>
</tbody>
</table>
</table-wrap>
<fig id="F4" position="float">
<label>FIGURE 4</label>
<caption>
<p>CMC curves for composite sketch to digital photo matching on the e-PRIP composite sketch dataset (<xref ref-type="bibr" rid="B17">Han et&#x20;al., 2013</xref>; <xref ref-type="bibr" rid="B46">Mittal et&#x20;al., 2014</xref>).</p>
</caption>
<graphic xlink:href="frai-04-670538-g004.tif"/>
</fig>
<p>With the above-mentioned experimental protocol, some of the best results in the literature have been reported by <xref ref-type="bibr" rid="B47">Mittal et&#x20;al. (2017)</xref>, with rank-10 identification accuracies of 59.3% (Caucasian) and 58.4% (Indian). <xref ref-type="bibr" rid="B52">Saxena and Verbeek (2016)</xref> report results for the Indian user only, achieving 65.6% rank-10 accuracy. As the results show, the proposed HDA and KHDA improve the performance significantly with both DSIFT and LCSSE. Compared to existing algorithms, DSIFT demonstrates an improvement in the range of 11&#x2013;23%, while LCSSE&#x2b;HDA and LCSSE&#x2b;KHDA improve the rank-10 accuracy by &#x223c;30% with respect to the state of the art (<xref ref-type="bibr" rid="B52">Saxena and Verbeek, 2016</xref>). Interestingly, LightCNN yields lower performance than LCSSE in this case study. ArcFace yields the highest identification accuracy. Consistent with the previous results, this experiment also shows that applying HDA/KHDA improves the results of DSIFT, LCSSE, and ArcFace. However, the degree of improvement varies between handcrafted and learned features.</p>
</sec>
</sec>
<sec sec-type="conclusion" id="s5">
<title>Conclusion</title>
<p>In this research, we have proposed a discriminant analysis approach for heterogeneous face recognition. We formulate heterogeneous discriminant analysis (HDA), which encodes view labels and whose objective function is optimized for heterogeneous matching. Based on the analytical solution, we propose its kernel extension, KHDA. The proposed techniques are heterogeneity-aware and can potentially be applied on top of any features to obtain, to an extent, a heterogeneity-invariant representation. Experiments are performed on three heterogeneous face matching problems, namely, visible to NIR matching, cross-resolution matching, and digital photo to sketch matching, with handcrafted DSIFT and deep learning&#x2013;based LCSSE, LightCNN, and ArcFace features. The results show that incorporating the proposed discriminant analysis technique consistently improves the performance of both learned and handcrafted features without adding much to the computational requirements. The improvement is more pronounced for handcrafted features, offering an efficient way to boost their performance.</p>
</sec>
</body>
<back>
<sec id="s6">
<title>Data Availability Statement</title>
<p>Publicly available datasets were analyzed in this study. This data can be found here: <ext-link ext-link-type="uri" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://www.cbsr.ia.ac.cn/english/HFB_Agreement/NIR-VIS-2.0_agreements.pdf">http://www.cbsr.ia.ac.cn/english/HFB_Agreement/NIR-VIS-2.0_agreements.pdf</ext-link>, <ext-link ext-link-type="uri" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://www.cs.cmu.edu/afs/cs/project/PIE/MultiPie/Multi-Pie/Home.html">https://www.cs.cmu.edu/afs/cs/project/PIE/MultiPie/Multi-Pie/Home.html</ext-link>, <ext-link ext-link-type="uri" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://www.iab-rubric.org/resources/eprip.html">https://www.iab-rubric.org/resources/eprip.html</ext-link>.</p>
</sec>
<sec id="s7">
<title>Ethics Statement</title>
<p>Ethical review and approval was not required for the study on human participants in accordance with the local legislation and institutional requirements. Written informed consent for participation was not required for this study in accordance with the national legislation and the institutional requirements. Written informed consent was not obtained from the individual(s) for the publication of any potentially identifiable images or data included in this article.</p>
</sec>
<sec id="s8">
<title>Author Contributions</title>
<p>TD, MV, and RS discussed the primary approach. TD, SG, and MV performed the experiments and all the authors prepared the manuscript.</p>
</sec>
<sec sec-type="COI-statement" id="s9">
<title>Conflict of Interest</title>
<p>The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
</sec>
<fn-group>
<fn id="fn2">
<label>1</label>
<p>The terms view and domain/modality are used synonymously in the heterogeneous face recognition literature.</p>
</fn>
<fn id="fn3">
<label>2</label>
<p>Detailed formulation is in the supplementary document.</p>
</fn>
<fn id="fn4">
<label>3</label>
<p>Results of LBP, HOG variants, and pixel features are in the supplementary document.</p>
</fn>
<fn id="fn5">
<label>4</label>
<p>There is a slight difference between LightCNN &#x2b; W/O DA in <xref ref-type="table" rid="T3">Table&#x20;3</xref> and LightCNN in <xref ref-type="table" rid="T4">Table&#x20;4</xref>, as the former is our implementation and the latter is as reported in <xref ref-type="bibr" rid="B62">Wu et&#x20;al. (2018)</xref>.</p>
</fn>
<fn id="fn6">
<label>5</label>
<p>
<ext-link ext-link-type="uri" xlink:href="http://www.cognitec.com/technology.html">http://www.cognitec.com/technology.html</ext-link>
</p>
</fn>
<fn id="fn7">
<label>6</label>
<p>ROC curves are in the supplementary document.</p>
</fn>
<fn id="fn8">
<label>7</label>
<p>Faces: <ext-link ext-link-type="uri" xlink:href="http://www.iqbiometrix.com">www.iqbiometrix.com</ext-link>, IdentiKit: <ext-link ext-link-type="uri" xlink:href="http://www.identikit.net">www.identikit.net</ext-link>
</p>
</fn>
</fn-group>
<ref-list>
<title>References</title>
<ref id="B1">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Bhatt</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Singh</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Vatsa</surname>
<given-names>M.</given-names>
</name>
</person-group> (<year>2015</year>). <source>Covariates of Face Recognition</source>. <comment>Tech. Report at IIIT Delhi</comment>.</citation>
</ref>
<ref id="B2">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Bhatt</surname>
<given-names>H. S.</given-names>
</name>
<name>
<surname>Singh</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Vatsa</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Ratha</surname>
<given-names>N. K.</given-names>
</name>
</person-group> (<year>2014</year>). <article-title>Improving Cross-Resolution Face Matching Using Ensemble-Based Co-transfer Learning</article-title>. <source>IEEE Trans. Image Process.</source> <volume>23</volume>, <fpage>5654</fpage>&#x2013;<lpage>5669</lpage>. <pub-id pub-id-type="doi">10.1109/tip.2014.2362658</pub-id> </citation>
</ref>
<ref id="B3">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Bhatt</surname>
<given-names>H. S.</given-names>
</name>
<name>
<surname>Singh</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Vatsa</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Ratha</surname>
<given-names>N.</given-names>
</name>
</person-group> (<year>2012</year>). <article-title>Matching Cross-Resolution Face Images Using Co-transfer Learning</article-title>. <source>IEEE ICIP</source>, <fpage>1453</fpage>&#x2013;<lpage>1456</lpage>. <pub-id pub-id-type="doi">10.1109/ICIP.2012.6467144</pub-id> </citation>
</ref>
<ref id="B4">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Biswas</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Aggarwal</surname>
<given-names>G.</given-names>
</name>
<name>
<surname>Flynn</surname>
<given-names>P. J.</given-names>
</name>
<name>
<surname>Bowyer</surname>
<given-names>K. W.</given-names>
</name>
</person-group> (<year>2013</year>). <article-title>Pose-robust Recognition of Low-Resolution Face Images</article-title>. <source>IEEE Trans. Pattern Anal. Mach. Intell.</source> <volume>35</volume>, <fpage>3037</fpage>&#x2013;<lpage>3049</lpage>. <pub-id pub-id-type="doi">10.1109/tpami.2013.68</pub-id> </citation>
</ref>
<ref id="B5">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Biswas</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Bowyer</surname>
<given-names>K. W.</given-names>
</name>
<name>
<surname>Flynn</surname>
<given-names>P. J.</given-names>
</name>
</person-group> (<year>2012</year>). <article-title>Multidimensional Scaling for Matching Low-Resolution Face Images</article-title>. <source>IEEE Trans. Pattern Anal. Mach. Intell.</source> <volume>34</volume>, <fpage>2019</fpage>&#x2013;<lpage>2030</lpage>. <pub-id pub-id-type="doi">10.1109/tpami.2011.278</pub-id> </citation>
</ref>
<ref id="B6">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Chen</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Ross</surname>
<given-names>A.</given-names>
</name>
</person-group> (<year>2013</year>). <article-title>Local Gradient Gabor Pattern (LGGP) with Applications in Face Recognition, Cross-Spectral Matching, and Soft Biometrics</article-title>. <source>SPIE Defense, Security, and Sensing</source> <volume>8712</volume>, <fpage>87120R</fpage>. <pub-id pub-id-type="doi">10.1117/12.2018230</pub-id> </citation>
</ref>
<ref id="B7">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Cho</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Kim</surname>
<given-names>T.</given-names>
</name>
<name>
<surname>Kim</surname>
<given-names>I.-J.</given-names>
</name>
<name>
<surname>Lee</surname>
<given-names>K.</given-names>
</name>
<name>
<surname>Lee</surname>
<given-names>S.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>Relational Deep Feature Learning for Heterogeneous Face Recognition</article-title>. <source>IEEE Trans. Inf. Forensics Secur.</source> <volume>16</volume>, <fpage>376</fpage>&#x2013;<lpage>388</lpage>. <pub-id pub-id-type="doi">10.1109/TIFS.2020.3013186</pub-id> </citation>
</ref>
<ref id="B8">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Chugh</surname>
<given-names>T.</given-names>
</name>
<name>
<surname>Bhatt</surname>
<given-names>H. S.</given-names>
</name>
<name>
<surname>Singh</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Vatsa</surname>
<given-names>M.</given-names>
</name>
</person-group> (<year>2013</year>). <article-title>Matching Age Separated Composite Sketches and Digital Face Images</article-title>. <source>IEEE BTAS</source>, <fpage>1</fpage>&#x2013;<lpage>6</lpage>. <pub-id pub-id-type="doi">10.1109/BTAS.2013.6712719</pub-id> </citation>
</ref>
<ref id="B9">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Deng</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Guo</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Xue</surname>
<given-names>N.</given-names>
</name>
<name>
<surname>Zafeiriou</surname>
<given-names>S.</given-names>
</name>
</person-group> (<year>2019a</year>). <article-title>ArcFace: Additive Angular Margin Loss for Deep Face Recognition</article-title>. <source>CVPR</source>, <fpage>4690</fpage>&#x2013;<lpage>4699</lpage>. <pub-id pub-id-type="doi">10.1109/CVPR.2019.00482</pub-id> </citation>
</ref>
<ref id="B10">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Deng</surname>
<given-names>Z.</given-names>
</name>
<name>
<surname>Peng</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>Z.</given-names>
</name>
<name>
<surname>Qiao</surname>
<given-names>Y.</given-names>
</name>
</person-group> (<year>2019b</year>). <article-title>Mutual Component Convolutional Neural Networks for Heterogeneous Face Recognition</article-title>. <source>IEEE Trans. Image Process.</source> <volume>28</volume>, <fpage>3102</fpage>&#x2013;<lpage>3114</lpage>. <pub-id pub-id-type="doi">10.1109/tip.2019.2894272</pub-id> </citation>
</ref>
<ref id="B11">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Dhamecha</surname>
<given-names>T. I.</given-names>
</name>
<name>
<surname>Sharma</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Singh</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Vatsa</surname>
<given-names>M.</given-names>
</name>
</person-group> (<year>2014</year>). <article-title>On Effectiveness of Histogram of Oriented Gradient Features for Visible to Near Infrared Face Matching</article-title>. <source>IAPR ICPR</source>, <fpage>1788</fpage>&#x2013;<lpage>1793</lpage>. <pub-id pub-id-type="doi">10.1109/ICPR.2014.314</pub-id> </citation>
</ref>
<ref id="B12">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Duan</surname>
<given-names>B.</given-names>
</name>
<name>
<surname>Fu</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Song</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>He</surname>
<given-names>R.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>Cross-spectral Face Hallucination via Disentangling Independent Factors</article-title>. <source>CVPR</source>, <fpage>7930</fpage>&#x2013;<lpage>7938</lpage>. <pub-id pub-id-type="doi">10.1109/CVPR42600.2020.00795</pub-id> </citation>
</ref>
<ref id="B13">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Everingham</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Sivic</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Zisserman</surname>
<given-names>A.</given-names>
</name>
</person-group> (<year>2009</year>). <article-title>Taking the Bite Out of Automated Naming of Characters in TV Video</article-title>. <source>Image Vis. Comput.</source> <volume>27</volume>, <fpage>545</fpage>&#x2013;<lpage>559</lpage>. <pub-id pub-id-type="doi">10.1016/j.imavis.2008.04.018</pub-id> </citation>
</ref>
<ref id="B14">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Friedman</surname>
<given-names>J.&#x20;H.</given-names>
</name>
</person-group> (<year>1989</year>). <article-title>Regularized Discriminant Analysis</article-title>. <source>J.&#x20;Am. Stat. Assoc.</source> <volume>84</volume>, <fpage>165</fpage>&#x2013;<lpage>175</lpage>. <pub-id pub-id-type="doi">10.1080/01621459.1989.10478752</pub-id> </citation>
</ref>
<ref id="B15">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Goswami</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Chan</surname>
<given-names>C. H.</given-names>
</name>
<name>
<surname>Windridge</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Kittler</surname>
<given-names>J.</given-names>
</name>
</person-group> (<year>2011</year>). <article-title>Evaluation of Face Recognition System in Heterogeneous Environments (Visible vs NIR)</article-title>. <source>IEEE ICCV Workshops</source>, <fpage>2160</fpage>&#x2013;<lpage>2167</lpage>. <pub-id pub-id-type="doi">10.1109/ICCVW.2011.6130515</pub-id> </citation>
</ref>
<ref id="B16">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Gross</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Matthews</surname>
<given-names>I.</given-names>
</name>
<name>
<surname>Cohn</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Kanade</surname>
<given-names>T.</given-names>
</name>
<name>
<surname>Baker</surname>
<given-names>S.</given-names>
</name>
</person-group> (<year>2010</year>). <article-title>Multi-PIE</article-title>. <source>Image Vis. Comput.</source> <volume>28</volume>, <fpage>807</fpage>&#x2013;<lpage>813</lpage>. <pub-id pub-id-type="doi">10.1016/j.imavis.2009.08.002</pub-id> </citation>
</ref>
<ref id="B17">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Han</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Klare</surname>
<given-names>B. F.</given-names>
</name>
<name>
<surname>Bonnen</surname>
<given-names>K.</given-names>
</name>
<name>
<surname>Jain</surname>
<given-names>A. K.</given-names>
</name>
</person-group> (<year>2013</year>). <article-title>Matching Composite Sketches to Face Photos: A Component-Based Approach</article-title>. <source>IEEE Trans. Inf. Forensics Secur.</source> <volume>8</volume>, <fpage>191</fpage>&#x2013;<lpage>204</lpage>. <pub-id pub-id-type="doi">10.1109/TIFS.2012.2228856</pub-id> </citation>
</ref>
<ref id="B18">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hardoon</surname>
<given-names>D. R.</given-names>
</name>
<name>
<surname>Szedmak</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Shawe-Taylor</surname>
<given-names>J.</given-names>
</name>
</person-group> (<year>2004</year>). <article-title>Canonical Correlation Analysis: An Overview with Application to Learning Methods</article-title>. <source>Neural Comput.</source> <volume>16</volume>, <fpage>2639</fpage>&#x2013;<lpage>2664</lpage>. <pub-id pub-id-type="doi">10.1162/0899766042321814</pub-id> </citation>
</ref>
<ref id="B19">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>He</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Wu</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Sun</surname>
<given-names>Z.</given-names>
</name>
<name>
<surname>Tan</surname>
<given-names>T.</given-names>
</name>
</person-group> (<year>2017</year>). <article-title>Learning Invariant Deep Representation for NIR-VIS Face Recognition</article-title>. <source>AAAI</source> <volume>4</volume>, <fpage>7</fpage>. </citation>
</ref>
<ref id="B20">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>He</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Wu</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Sun</surname>
<given-names>Z.</given-names>
</name>
<name>
<surname>Tan</surname>
<given-names>T.</given-names>
</name>
</person-group> (<year>2019</year>). <article-title>Wasserstein CNN: Learning Invariant Features for NIR-VIS Face Recognition</article-title>. <source>IEEE Trans. Pattern Anal. Mach. Intell.</source> <volume>41</volume>, <fpage>1761</fpage>&#x2013;<lpage>1773</lpage>. <pub-id pub-id-type="doi">10.1109/tpami.2018.2842770</pub-id> </citation>
</ref>
<ref id="B21">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hu</surname>
<given-names>G.</given-names>
</name>
<name>
<surname>Peng</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Yang</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Hospedales</surname>
<given-names>T. M.</given-names>
</name>
<name>
<surname>Verbeek</surname>
<given-names>J.</given-names>
</name>
</person-group> (<year>2018</year>). <article-title>Frankenstein: Learning Deep Face Representations Using Small Data</article-title>. <source>IEEE Trans. Image Process.</source> <volume>27</volume>, <fpage>293</fpage>&#x2013;<lpage>303</lpage>. <pub-id pub-id-type="doi">10.1109/tip.2017.2756450</pub-id> </citation>
</ref>
<ref id="B22">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Jin</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Lang</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Ruan</surname>
<given-names>Q.</given-names>
</name>
</person-group> (<year>2016</year>). <article-title>Multi-task Clustering ELM for VIS-NIR Cross-Modal Feature Learning</article-title>. <source>Multidimensional Syst. Signal Process.</source> <fpage>1</fpage>&#x2013;<lpage>16</lpage>. <pub-id pub-id-type="doi">10.1007/s11045-016-0401-8</pub-id> </citation>
</ref>
<ref id="B23">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Jin</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Lu</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Ruan</surname>
<given-names>Q.</given-names>
</name>
</person-group> (<year>2015</year>). <article-title>Coupled Discriminative Feature Learning for Heterogeneous Face Recognition</article-title>. <source>IEEE Trans. Inf. Forensics Secur.</source> <volume>10</volume>, <fpage>640</fpage>&#x2013;<lpage>652</lpage>. <pub-id pub-id-type="doi">10.1109/tifs.2015.2390414</pub-id> </citation>
</ref>
<ref id="B24">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Juefei-Xu</surname>
<given-names>F.</given-names>
</name>
<name>
<surname>Pal</surname>
<given-names>D. K.</given-names>
</name>
<name>
<surname>Savvides</surname>
<given-names>M.</given-names>
</name>
</person-group> (<year>2015</year>). <article-title>NIR-VIS Heterogeneous Face Recognition via Cross-Spectral Joint Dictionary Learning and Reconstruction</article-title>. <source>CVPR Workshops</source>, <fpage>141</fpage>&#x2013;<lpage>150</lpage>. <pub-id pub-id-type="doi">10.1109/CVPRW.2015.7301308</pub-id> </citation>
</ref>
<ref id="B25">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kalka</surname>
<given-names>N. D.</given-names>
</name>
<name>
<surname>Bourlai</surname>
<given-names>T.</given-names>
</name>
<name>
<surname>Cukic</surname>
<given-names>B.</given-names>
</name>
<name>
<surname>Hornak</surname>
<given-names>L.</given-names>
</name>
</person-group> (<year>2011</year>). <article-title>Cross-spectral Face Recognition in Heterogeneous Environments: A Case Study on Matching Visible to Short-Wave Infrared Imagery</article-title>. <source>IEEE IJCB</source>, <fpage>1</fpage>&#x2013;<lpage>8</lpage>. <pub-id pub-id-type="doi">10.1109/IJCB.2011.6117586</pub-id> </citation>
</ref>
<ref id="B26">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kan</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Shan</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Lao</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Chen</surname>
<given-names>X.</given-names>
</name>
</person-group> (<year>2016</year>). <article-title>Multi-view Discriminant Analysis</article-title>. <source>IEEE Trans. Pattern Anal. Mach. Intell.</source> <volume>38</volume>, <fpage>188</fpage>&#x2013;<lpage>194</lpage>. <pub-id pub-id-type="doi">10.1109/tpami.2015.2435740</pub-id> </citation>
</ref>
<ref id="B27">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Klare</surname>
<given-names>B. F.</given-names>
</name>
<name>
<surname>Jain</surname>
<given-names>A. K.</given-names>
</name>
</person-group> (<year>2013</year>). <article-title>Heterogeneous Face Recognition Using Kernel Prototype Similarities</article-title>. <source>IEEE Trans. Pattern Anal. Mach. Intell.</source> <volume>35</volume>, <fpage>1410</fpage>&#x2013;<lpage>1422</lpage>. <pub-id pub-id-type="doi">10.1109/tpami.2012.229</pub-id> </citation>
</ref>
<ref id="B28">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Klare</surname>
<given-names>B.</given-names>
</name>
<name>
<surname>Jain</surname>
<given-names>A. K.</given-names>
</name>
</person-group> (<year>2010</year>). <article-title>Heterogeneous Face Recognition: Matching NIR to Visible Light Images</article-title>. <source>IAPR ICPR</source>, <fpage>1513</fpage>&#x2013;<lpage>1516</lpage>. <pub-id pub-id-type="doi">10.1109/ICPR.2010.374</pub-id> </citation>
</ref>
<ref id="B29">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Klum</surname>
<given-names>S. J.</given-names>
</name>
<name>
<surname>Han</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Klare</surname>
<given-names>B. F.</given-names>
</name>
<name>
<surname>Jain</surname>
<given-names>A. K.</given-names>
</name>
</person-group> (<year>2014</year>). <article-title>The FaceSketchID System: Matching Facial Composites to Mugshots</article-title>. <source>IEEE Trans. Inf. Forensics Secur.</source> <volume>9</volume>, <fpage>2248</fpage>&#x2013;<lpage>2263</lpage>. <pub-id pub-id-type="doi">10.1109/tifs.2014.2360825</pub-id> </citation>
</ref>
<ref id="B30">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Lei</surname>
<given-names>Z.</given-names>
</name>
<name>
<surname>Pietik&#xe4;inen</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>S. Z.</given-names>
</name>
</person-group> (<year>2014</year>). <article-title>Learning Discriminant Face Descriptor</article-title>. <source>IEEE Trans. Pattern Anal. Mach. Intell.</source> <volume>36</volume>, <fpage>289</fpage>&#x2013;<lpage>302</lpage>. <pub-id pub-id-type="doi">10.1109/TPAMI.2013.112</pub-id> </citation>
</ref>
<ref id="B31">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Lei</surname>
<given-names>Z.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>S. Z.</given-names>
</name>
</person-group> (<year>2009</year>). <article-title>Coupled Spectral Regression for Matching Heterogeneous Faces</article-title>. <source>CVPR</source>, <fpage>1123</fpage>&#x2013;<lpage>1128</lpage>. <pub-id pub-id-type="doi">10.1109/CVPR.2009.5206860</pub-id> </citation>
</ref>
<ref id="B32">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Lei</surname>
<given-names>Z.</given-names>
</name>
<name>
<surname>Liao</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Jain</surname>
<given-names>A. K.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>S. Z.</given-names>
</name>
</person-group> (<year>2012a</year>). <article-title>Coupled Discriminant Analysis for Heterogeneous Face Recognition</article-title>. <source>IEEE Trans. Inf. Forensics Secur.</source> <volume>7</volume>, <fpage>1707</fpage>&#x2013;<lpage>1716</lpage>. <pub-id pub-id-type="doi">10.1109/tifs.2012.2210041</pub-id> </citation>
</ref>
<ref id="B33">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Lei</surname>
<given-names>Z.</given-names>
</name>
<name>
<surname>Zhou</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Yi</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Jain</surname>
<given-names>A. K.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>S. Z.</given-names>
</name>
</person-group> (<year>2012b</year>). <article-title>An Improved Coupled Spectral Regression for Heterogeneous Face Recognition</article-title>. <source>IEEE/IAPR Int. Conf. Biometrics</source>, <fpage>7</fpage>&#x2013;<lpage>12</lpage>. <pub-id pub-id-type="doi">10.1109/icb.2012.6199751</pub-id> </citation>
</ref>
<ref id="B34">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Lezama</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Qiu</surname>
<given-names>Q.</given-names>
</name>
<name>
<surname>Sapiro</surname>
<given-names>G.</given-names>
</name>
</person-group> (<year>2017</year>). <article-title>Not Afraid of the Dark: NIR-VIS Face Recognition via Cross-Spectral Hallucination and Low-Rank Embedding</article-title>. <source>CVPR</source>, <fpage>6807</fpage>&#x2013;<lpage>6816</lpage>. <pub-id pub-id-type="doi">10.1109/cvpr.2017.720</pub-id> </citation>
</ref>
<ref id="B35">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Li</surname>
<given-names>B.</given-names>
</name>
<name>
<surname>Chang</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Shan</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Chen</surname>
<given-names>X.</given-names>
</name>
</person-group> (<year>2010</year>). <article-title>Low-resolution Face Recognition via Coupled Locality Preserving Mappings</article-title>. <source>IEEE SPL</source> <volume>17</volume>, <fpage>20</fpage>&#x2013;<lpage>23</lpage>. <pub-id pub-id-type="doi">10.1109/LSP.2009.2031705</pub-id> </citation>
</ref>
<ref id="B36">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Li</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Yi</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Lei</surname>
<given-names>Z.</given-names>
</name>
<name>
<surname>Liao</surname>
<given-names>S.</given-names>
</name>
</person-group> (<year>2013</year>). <article-title>The CASIA NIR-VIS 2.0 Face Database</article-title>. <source>CVPR Workshops</source>, <fpage>348</fpage>&#x2013;<lpage>353</lpage>. <pub-id pub-id-type="doi">10.1109/CVPRW.2013.59</pub-id> </citation>
</ref>
<ref id="B37">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Li</surname>
<given-names>Z.</given-names>
</name>
<name>
<surname>Gong</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Qiao</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Tao</surname>
<given-names>D.</given-names>
</name>
</person-group> (<year>2014</year>). <article-title>Common Feature Discriminant Analysis for Matching Infrared Face Images to Optical Face Images</article-title>. <source>IEEE Trans. Image Process.</source> <volume>23</volume>, <fpage>2436</fpage>&#x2013;<lpage>2445</lpage>. <pub-id pub-id-type="doi">10.1109/TIP.2014.2315920</pub-id> </citation>
</ref>
<ref id="B38">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Li</surname>
<given-names>Z.</given-names>
</name>
<name>
<surname>Gong</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>Q.</given-names>
</name>
<name>
<surname>Tao</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>X.</given-names>
</name>
</person-group> (<year>2016</year>). <article-title>Mutual Component Analysis for Heterogeneous Face Recognition</article-title>. <source>ACM Trans. Intell. Syst. Technol.</source> <volume>7</volume>, <fpage>1</fpage>&#x2013;<lpage>23</lpage>. <pub-id pub-id-type="doi">10.1145/2807705</pub-id> </citation>
</ref>
<ref id="B39">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Liao</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Yi</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Lei</surname>
<given-names>Z.</given-names>
</name>
<name>
<surname>Qin</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>S. Z.</given-names>
</name>
</person-group> (<year>2009</year>). <article-title>Heterogeneous Face Recognition from Local Structures of Normalized Appearance</article-title>. <source>Adv. Biometrics</source>, <fpage>209</fpage>&#x2013;<lpage>218</lpage>. <pub-id pub-id-type="doi">10.1007/978-3-642-01793-3_22</pub-id> </citation>
</ref>
<ref id="B40">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Lin</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Tang</surname>
<given-names>X.</given-names>
</name>
</person-group> (<year>2006</year>). <article-title>Inter-modality Face Recognition</article-title>. <source>ECCV</source>, <fpage>13</fpage>&#x2013;<lpage>26</lpage>. <pub-id pub-id-type="doi">10.1007/11744085_2</pub-id> </citation>
</ref>
<ref id="B41">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Liu</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Song</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Wu</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Tan</surname>
<given-names>T.</given-names>
</name>
</person-group> (<year>2016</year>). <article-title>Transferring Deep Representation for NIR-VIS Heterogeneous Face Recognition</article-title>. <source>IEEE ICB</source>, <fpage>1</fpage>&#x2013;<lpage>8</lpage>. <pub-id pub-id-type="doi">10.1109/ICB.2016.7550064</pub-id> </citation>
</ref>
<ref id="B42">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Lowe</surname>
<given-names>D. G.</given-names>
</name>
</person-group> (<year>2004</year>). <article-title>Distinctive Image Features from Scale-Invariant Keypoints</article-title>. <source>Int. J.&#x20;Comput. Vis.</source> <volume>60</volume>, <fpage>91</fpage>&#x2013;<lpage>110</lpage>. <pub-id pub-id-type="doi">10.1023/b:visi.0000029664.99615.94</pub-id> </citation>
</ref>
<ref id="B43">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Lu</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Liong</surname>
<given-names>V. E.</given-names>
</name>
<name>
<surname>Zhou</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Zhou</surname>
<given-names>J.</given-names>
</name>
</person-group> (<year>2015</year>). <article-title>Learning Compact Binary Face Descriptor for Face Recognition</article-title>. <source>IEEE Trans. Pattern Anal. Mach. Intell.</source> <volume>37</volume>, <fpage>2041</fpage>&#x2013;<lpage>2056</lpage>. <pub-id pub-id-type="doi">10.1109/tpami.2015.2408359</pub-id> </citation>
</ref>
<ref id="B44">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Majumdar</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Singh</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Vatsa</surname>
<given-names>M.</given-names>
</name>
</person-group> (<year>2016</year>). <article-title>Face Verification via Class Sparsity Based Supervised Encoding</article-title>. <source>IEEE Trans. Pattern Anal. Mach. Intell.</source> <volume>39</volume>, <fpage>1273</fpage>&#x2013;<lpage>1280</lpage>. <pub-id pub-id-type="doi">10.1109/TPAMI.2016.2569436</pub-id> </citation>
</ref>
<ref id="B45">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Martinez</surname>
<given-names>A. M.</given-names>
</name>
</person-group> (<year>1998</year>). <article-title>The AR Face Database</article-title>. <source>CVC Tech. Rep.</source> <fpage>24</fpage>. </citation>
</ref>
<ref id="B46">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Mittal</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Jain</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Goswami</surname>
<given-names>G.</given-names>
</name>
<name>
<surname>Singh</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Vatsa</surname>
<given-names>M.</given-names>
</name>
</person-group> (<year>2014</year>). <article-title>Recognizing Composite Sketches with Digital Face Images via SSD Dictionary</article-title>. <source>IEEE IJCB</source>, <fpage>1</fpage>&#x2013;<lpage>6</lpage>. <pub-id pub-id-type="doi">10.1109/BTAS.2014.6996265</pub-id> </citation>
</ref>
<ref id="B47">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Mittal</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Jain</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Goswami</surname>
<given-names>G.</given-names>
</name>
<name>
<surname>Vatsa</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Singh</surname>
<given-names>R.</given-names>
</name>
</person-group> (<year>2017</year>). <article-title>Composite Sketch Recognition Using Saliency and Attribute Feedback</article-title>. <source>Inf. Fusion</source> <volume>33</volume>, <fpage>86</fpage>&#x2013;<lpage>99</lpage>. <pub-id pub-id-type="doi">10.1016/j.inffus.2016.04.003</pub-id> </citation>
</ref>
<ref id="B48">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Mittal</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Jain</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Singh</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Vatsa</surname>
<given-names>M.</given-names>
</name>
</person-group> (<year>2013</year>). <article-title>Boosting Local Descriptors for Matching Composite and Digital Face Images</article-title>. <source>IEEE ICIP</source>, <fpage>2797</fpage>&#x2013;<lpage>2801</lpage>. <pub-id pub-id-type="doi">10.1109/ICIP.2013.6738576</pub-id> </citation>
</ref>
<ref id="B49">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Mittal</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Vatsa</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Singh</surname>
<given-names>R.</given-names>
</name>
</person-group> (<year>2015</year>). <article-title>Composite Sketch Recognition via Deep Network - a Transfer Learning Approach</article-title>. <source>IEEE/IAPR ICB</source>, <fpage>251</fpage>&#x2013;<lpage>256</lpage>. <pub-id pub-id-type="doi">10.1109/ICB.2015.7139092</pub-id> </citation>
</ref>
<ref id="B50">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Peng</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Gao</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>N.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>J.</given-names>
</name>
</person-group> (<year>2017</year>). <article-title>Graphical Representation for Heterogeneous Face Recognition</article-title>. <source>IEEE Trans. Pattern Anal. Mach. Intell.</source> <volume>39</volume>, <fpage>301</fpage>&#x2013;<lpage>312</lpage>. <pub-id pub-id-type="doi">10.1109/tpami.2016.2542816</pub-id> </citation>
</ref>
<ref id="B51">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Reale</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Nasrabadi</surname>
<given-names>N. M.</given-names>
</name>
<name>
<surname>Kwon</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Chellappa</surname>
<given-names>R.</given-names>
</name>
</person-group> (<year>2016</year>). <article-title>Seeing the Forest from the Trees: A Holistic Approach to Near-Infrared Heterogeneous Face Recognition</article-title>. <source>CVPR Workshops</source>, <fpage>320</fpage>&#x2013;<lpage>328</lpage>. <pub-id pub-id-type="doi">10.1109/cvprw.2016.47</pub-id> </citation>
</ref>
<ref id="B52">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Saxena</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Verbeek</surname>
<given-names>J.</given-names>
</name>
</person-group> (<year>2016</year>). <article-title>Heterogeneous Face Recognition with CNNs</article-title>. <source>ECCV Workshops</source>, <fpage>483</fpage>&#x2013;<lpage>491</lpage>. </citation>
</ref>
<ref id="B53">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Sch&#xf6;lkopf</surname>
<given-names>B.</given-names>
</name>
<name>
<surname>Herbrich</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Smola</surname>
<given-names>A. J.</given-names>
</name>
</person-group> (<year>2001</year>). <article-title>A Generalized Representer Theorem</article-title>. <source>Comput. Learn. Theor.</source>, <fpage>416</fpage>&#x2013;<lpage>426</lpage>. <pub-id pub-id-type="doi">10.1007/3-540-44581-1_27</pub-id> </citation>
</ref>
<ref id="B54">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Sharma</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Jacobs</surname>
<given-names>D. W.</given-names>
</name>
</person-group> (<year>2011</year>). <article-title>Bypassing Synthesis: PLS for Face Recognition with Pose, Low-Resolution and Sketch</article-title>. <source>CVPR Workshops</source>, <fpage>593</fpage>&#x2013;<lpage>600</lpage>. <pub-id pub-id-type="doi">10.1109/CVPR.2011.5995350</pub-id> </citation>
</ref>
<ref id="B55">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Sharma</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Kumar</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Daume</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Jacobs</surname>
<given-names>D. W.</given-names>
</name>
</person-group> (<year>2012</year>). <article-title>Generalized Multiview Analysis: A Discriminative Latent Space</article-title>. <source>CVPR</source>, <fpage>2160</fpage>&#x2013;<lpage>2167</lpage>. <pub-id pub-id-type="doi">10.1109/cvpr.2012.6247923</pub-id> </citation>
</ref>
<ref id="B56">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Shi</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Yi</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Lei</surname>
<given-names>Z.</given-names>
</name>
<name>
<surname>Zhu</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>S. Z.</given-names>
</name>
</person-group> (<year>2017</year>). <article-title>Cross-modality Face Recognition via Heterogeneous Joint Bayesian</article-title>. <source>IEEE SPL</source> <volume>24</volume> (<issue>1</issue>), <fpage>81</fpage>&#x2013;<lpage>85</lpage>. <pub-id pub-id-type="doi">10.1109/LSP.2016.2637400</pub-id> </citation>
</ref>
<ref id="B57">
<citation citation-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Siena</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Boddeti</surname>
<given-names>V.</given-names>
</name>
<name>
<surname>Kumar</surname>
<given-names>B.</given-names>
</name>
</person-group> (<year>2013</year>). <article-title>Maximum-Margin Coupled Mappings for Cross-Domain Matching</article-title>. <source>IEEE BTAS</source>, <fpage>1</fpage>&#x2013;<lpage>8</lpage>. <pub-id pub-id-type="doi">10.1109/BTAS.2013.6712686</pub-id> </citation>
</ref>
<ref id="B58">
<citation citation-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Taigman</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Yang</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Ranzato</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Wolf</surname>
<given-names>L.</given-names>
</name>
</person-group> (<year>2014</year>). <article-title>DeepFace: Closing the Gap to Human-Level Performance in Face Verification</article-title>. <source>CVPR</source>, <fpage>1701</fpage>&#x2013;<lpage>1708</lpage>. <pub-id pub-id-type="doi">10.1109/CVPR.2014.220</pub-id> </citation>
</ref>
<ref id="B59">
<citation citation-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Tang</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>X.</given-names>
</name>
</person-group> (<year>2003</year>). <article-title>Face Sketch Synthesis and Recognition</article-title>. <source>IEEE ICCV</source>, <fpage>687</fpage>&#x2013;<lpage>694</lpage>. <pub-id pub-id-type="doi">10.1109/ICCV.2003.1238414</pub-id> </citation>
</ref>
<ref id="B60">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Wilson</surname>
<given-names>C. L.</given-names>
</name>
<name>
<surname>Grother</surname>
<given-names>P. J.</given-names>
</name>
<name>
<surname>Chandramouli</surname>
<given-names>R.</given-names>
</name>
</person-group> (<year>2007</year>). <source>Biometric Data Specification for Personal Identity Verification</source>. <comment>Tech. Report NIST-SP-800-76-1. National Institute of Standards &#x26; Technology</comment>. </citation>
</ref>
<ref id="B61">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wu</surname>
<given-names>F.</given-names>
</name>
<name>
<surname>Jing</surname>
<given-names>X.-Y.</given-names>
</name>
<name>
<surname>You</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Yue</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Hu</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Yang</surname>
<given-names>J.-Y.</given-names>
</name>
</person-group> (<year>2016</year>). <article-title>Multi-view Low-Rank Dictionary Learning for Image Classification</article-title>. <source>Pattern Recognition</source> <volume>50</volume>, <fpage>143</fpage>&#x2013;<lpage>154</lpage>. <pub-id pub-id-type="doi">10.1016/j.patcog.2015.08.012</pub-id> </citation>
</ref>
<ref id="B62">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wu</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>He</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Sun</surname>
<given-names>Z.</given-names>
</name>
<name>
<surname>Tan</surname>
<given-names>T.</given-names>
</name>
</person-group> (<year>2018</year>). <article-title>A Light CNN for Deep Face Representation with Noisy Labels</article-title>. <source>IEEE Trans. Inform. Forensic Secur.</source> <volume>13</volume>, <fpage>2884</fpage>&#x2013;<lpage>2896</lpage>. <pub-id pub-id-type="doi">10.1109/TIFS.2018.2833032</pub-id> </citation>
</ref>
<ref id="B63">
<citation citation-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Yi</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Lei</surname>
<given-names>Z.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>S. Z.</given-names>
</name>
</person-group> (<year>2015</year>). <article-title>Shared Representation Learning for Heterogeneous Face Recognition</article-title>. <source>IEEE FG</source>, <fpage>1</fpage>&#x2013;<lpage>7</lpage>. <pub-id pub-id-type="doi">10.1109/FG.2015.7163093</pub-id> </citation>
</ref>
<ref id="B64">
<citation citation-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Yi</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Liu</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Chu</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Lei</surname>
<given-names>Z.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>S. Z.</given-names>
</name>
</person-group> (<year>2007</year>). <article-title>Face Matching between Near Infrared and Visible Light Images</article-title>. <source>Adv. Biometrics</source>, <fpage>523</fpage>&#x2013;<lpage>530</lpage>. <pub-id pub-id-type="doi">10.1007/978-3-540-74549-5_55</pub-id> </citation>
</ref>
<ref id="B65">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zhu</surname>
<given-names>J.-Y.</given-names>
</name>
<name>
<surname>Zheng</surname>
<given-names>W.-S.</given-names>
</name>
<name>
<surname>Lai</surname>
<given-names>J.-H.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>S. Z.</given-names>
</name>
</person-group> (<year>2014</year>). <article-title>Matching NIR Face to VIS Face Using Transduction</article-title>. <source>IEEE Trans. Inform. Forensic Secur.</source> <volume>9</volume>, <fpage>501</fpage>&#x2013;<lpage>514</lpage>. <pub-id pub-id-type="doi">10.1109/TIFS.2014.2299977</pub-id> </citation>
</ref>
<ref id="B66">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zhu</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Martinez</surname>
<given-names>A. M.</given-names>
</name>
</person-group> (<year>2006</year>). <article-title>Subclass Discriminant Analysis</article-title>. <source>IEEE Trans. Pattern Anal. Mach. Intell.</source> <volume>28</volume>, <fpage>1274</fpage>&#x2013;<lpage>1286</lpage>. <pub-id pub-id-type="doi">10.1109/TPAMI.2006.172</pub-id> </citation>
</ref>
</ref-list>
</back>
</article>