General Commentary ARTICLE Provisionally accepted

Front. Psychol. | doi: 10.3389/fpsyg.2018.01610

Commentary: The Dynamic Features of Lip Corners in Genuine and Posed Smiles

Yingqi Li1, Zhongyong Shi2,3, Lishu Luo4,5, Honglei Zhang4,5 and Guoxin Fan5,6*
  • 1Tongji University, China
  • 2Massachusetts General Hospital, Harvard Medical School, United States
  • 3Shanghai Tenth People's Hospital, Tongji University, China
  • 4Tianjin University, China
  • 5Surgical Planning Lab, Department of Radiology, Brigham and Women's Hospital, United States
  • 6School of Medicine, Tongji University, China

Over thousands of years of human history, we have learned to fake or hide our genuine feelings and emotions from those around us, intentionally or unconsciously. Ironically, this is part of what we call emotional intelligence: it helps us gain advantages, show politeness, handle dilemmas, and deal with many other complicated situations. Posed smiles are among the most common faked expressions in daily life. Distinguishing an individual's genuine smile from a posed one is a challenge for computer vision systems, and sometimes even for the human brain. Recently, an interesting study by Guo et al. (Guo, et al., 2018) employed computer vision techniques to investigate potential differences between genuine and posed smiles in the duration, intensity, speed, and symmetry of lip-corner movement, as well as certain irregularities, based on the UvA-NEMO Smile Database. The results are rewarding: compared with posed smiles, genuine smiles were associated with longer onset, offset, apex, and total durations, as well as greater offset displacement and Irregularity-b. In addition, posed smiles were associated with higher onset and offset speeds, Irregularity-a, Symmetry-a, and Symmetry-d.
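To make the duration and speed features above concrete, the following is a minimal sketch of how they might be computed from a single lip-corner displacement trace. It is not the procedure used by Guo et al.: the function name `smile_dynamics`, the `apex_frac` cut-off, and the phase definitions are assumptions chosen for illustration, and the example trace is made-up toy data.

```python
def smile_dynamics(trace, fps, apex_frac=0.9):
    """Return phase durations (seconds) and mean speeds for one smile.

    trace: lip-corner displacement per video frame (arbitrary units).
    fps: frame rate of the video.
    apex_frac: hypothetical cut-off; frames at or above this fraction of
        the peak displacement are treated as the apex phase.
    """
    if fps <= 0 or len(trace) < 2:
        raise ValueError("need a positive frame rate and at least two frames")
    peak = max(trace)
    if peak <= 0:
        raise ValueError("trace contains no positive displacement")

    # Assumed phase boundaries: onset ends at the first apex frame,
    # offset starts at the last apex frame.
    apex_frames = [i for i, d in enumerate(trace) if d >= apex_frac * peak]
    first, last = apex_frames[0], apex_frames[-1]

    onset = first / fps
    apex = (last - first) / fps
    offset = (len(trace) - 1 - last) / fps

    # Mean speed = displacement change divided by phase duration.
    onset_speed = (trace[first] - trace[0]) / onset if onset > 0 else 0.0
    offset_speed = (trace[last] - trace[-1]) / offset if offset > 0 else 0.0

    return {
        "onset_duration": onset,
        "apex_duration": apex,
        "offset_duration": offset,
        "total_duration": onset + apex + offset,
        "onset_speed": onset_speed,
        "offset_speed": offset_speed,
    }


# Toy trace (invented values): slow rise, plateau, fast fall.
trace = [0.0, 1.0, 2.0, 3.0, 4.0, 4.0, 4.0, 4.0, 2.0, 0.0]
features = smile_dynamics(trace, fps=10)
print(features)
```

Under these assumed definitions, a smile with a slow rise yields a long onset duration and a low onset speed, which is the pattern the cited study reports for genuine smiles.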
We cannot agree with the statement that only a handful of studies on the dynamic features of facial expressions have been conducted because of a lack of user-friendly analytic tools. Over the past decades, hundreds of studies have focused on the dynamic features of facial expressions (Ko, 2018, Sandbach, et al., 2012). Valstar et al. (Valstar, et al., 2006) differentiated spontaneous brow actions from posed ones based on velocity, duration, and order of occurrence. Littlewort et al. (Littlewort, et al., 2009) distinguished fake pain from real pain by analyzing facial actions based on Gabor features. Dibeklioğlu et al. (Dibeklioğlu, et al., 2012) analyzed the dynamics of the eyelids, cheeks, and lip corners to tell genuine smiles from posed ones and extracted 25 features; this work was also cited by the authors. The authors (Guo, et al., 2018) stated that not all of these 25 features could be explained from a psychological perspective, so they extracted only duration, speed, intensity, symmetry, and irregularity in their study. The question is why every potential feature must be explained by psychological theory. Restricting the feature set in this way may discard much information that could help distinguish genuine smiles from posed ones. After all, our knowledge of psychology itself is still very limited.
Indeed, the value of all the above-mentioned pioneering works should be appreciated; they have helped improve the recognition of posed and spontaneous expressions over time. However, hand-crafted features built by rules may lead to inadequate abstraction and representation. We wonder whether the 25 features are the whole story in telling genuine smiles from posed ones, and how much these extracted features would help a computer vision system distinguish posed smiles from genuine facial expressions. Clearly, much work remains to determine the weights of the features identified by different studies on different datasets and to evaluate the resulting recognition performance. At the very least, the presented study provides no diagnostic data showing how much the dynamic features of lip corners help differentiate genuine smiles from posed smiles.
Recently, deep learning has demonstrated overwhelming performance over conventional methods in image and video processing tasks such as facial recognition and classification (Majumder, et al., 2018, Peng, et al., 2017, Rodriguez, et al., 2017, Yu, et al., 2018). Many start-up companies have already built successful businesses on the outstanding performance of facial recognition in security applications. It is not surprising that researchers have already adopted convolutional neural networks (CNNs) to differentiate genuine smiles from posed ones, with promising recognition performance (Kumar, et al., 2017, Mandal, et al., 2017). This raises another question: will deep learning take over this area and eliminate the need to study hand-crafted features?
In fact, the recognition performance of deep learning in classifying genuine and posed smiles may rely heavily on the size of the training data. Unfortunately, datasets containing labeled genuine and posed smiles are limited (Xu, et al., 2017). The good news is that, when data are limited, hand-crafted features combined with deep learning may improve recognition performance over deep learning alone (Pesteie, et al., 2018). We can build a hybrid model that feeds n features learned by deep learning, together with well-established features from conventional methods, into a classifier (Figure 1). We admit that deep learning is also criticized for its lack of interpretability, the so-called black box problem. However, many researchers have recognized the importance of opening the black box of deep learning, and solutions are emerging (Gunning, 2017, Samek, et al., 2017, Shwartz-Ziv, et al., 2017).
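The hybrid model described above can be sketched as feature fusion by concatenation followed by a classifier. This is only an illustrative sketch under stated assumptions: the classifier in Figure 1 is not specified, so a plain logistic regression stands in for it; the "deep" feature values are placeholders rather than outputs of a real CNN; the hand-crafted values (onset duration, onset speed) are made-up toy numbers; and the names `fuse`, `train_logistic`, and `predict` are hypothetical.

```python
import math


def fuse(deep_features, handcrafted_features):
    """Concatenate learned and hand-crafted features for one sample."""
    return list(deep_features) + list(handcrafted_features)


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic(X, y, lr=0.1, epochs=2000):
    """Fit logistic regression with per-sample gradient descent."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            err = p - yi  # gradient of the log loss w.r.t. the logit
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
            b -= lr * err
    return w, b


def predict(w, b, x):
    """Return 1 (genuine) or 0 (posed) for one fused feature vector."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else 0


# Placeholder "deep" features (n = 2); not produced by any real network.
deep = [[0.20, 0.70], [0.30, 0.60], [0.10, 0.80],
        [0.25, 0.65], [0.15, 0.75], [0.30, 0.70]]
# Toy hand-crafted features: [onset duration (s), onset speed].
handcrafted = [[0.8, 1.0], [0.9, 1.2], [0.7, 0.9],
               [0.3, 2.5], [0.2, 3.0], [0.4, 2.2]]
labels = [1, 1, 1, 0, 0, 0]  # 1 = genuine, 0 = posed

X = [fuse(d, h) for d, h in zip(deep, handcrafted)]
w, b = train_logistic(X, labels)
print([predict(w, b, x) for x in X])
```

Concatenation is the simplest fusion choice; it lets the classifier weight learned and hand-crafted features jointly, which is one way to examine how much each hand-crafted feature contributes when labeled data are scarce.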
Considering its outstanding recognition performance, we believe that deep learning may come to dominate image recognition and classification, including, of course, discriminating genuine smiles from posed ones. As for the black box, we should regard it as an inherent characteristic of deep learning rather than a limitation. It would be better still if we could crack the black box, just as Newton worked out why apples always fall to the ground. When that day comes, deep learning may have even more impact than it does today, though we admit that more effort is needed to open the black box of deep learning.

Keywords: facial recognition, deep learning, dynamic features, smiles, lip corners

Received: 06 May 2018; Accepted: 13 Aug 2018.

Edited by:

Xunbing Shen, Department of Psychology, Jiangxi University of Traditional Chinese Medicine, China

Reviewed by:

Lynden K. Miles, University of Aberdeen, United Kingdom  

Copyright: © 2018 Li, Shi, Luo, Zhang and Fan. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence: Dr. Guoxin Fan, School of Medicine, Tongji University, Shanghai, China, 1610707@tongji.edu.cn