Book Review: Perception of Pixelated Images

Goffaux, Valérie

doi:10.3389/fpsyg.2016.01151

BOOK REVIEW article

Front. Psychol., 12 August 2016

Sec. Perception Science

Volume 7 - 2016 | https://doi.org/10.3389/fpsyg.2016.01151

Valérie Goffaux¹,²,³*

¹Psychological Sciences Research Institute, Université Catholique de Louvain, Louvain-la-Neuve, Belgium
²Institute of Neuroscience, Université Catholique de Louvain, Louvain-la-Neuve, Belgium
³Cognitive Neuroscience Department, Maastricht University, Maastricht, Netherlands

A book review on
Perception of Pixelated Images

Edited by Talis Bachmann, London: Elsevier, 2016, 170 Pages. ISBN: 9780128095058

Although we feel immersed in a rich and continuous flow of visual sensations, our visual system samples only a small fraction of the luminance variations present in the environment. Such sparse sampling inevitably entails a loss of information. This loss is advantageous because it reduces the computational and metabolic demands of, e.g., generating, classifying, and storing images. But sampling must be smartly calibrated so that critical cues are not lost. This seems to be the case for the perception of major visual categories, such as faces and letters, which has been found to rely on a restricted but optimized range of spatial resolutions, also called spatial frequencies (SF; Gold et al., 1999; Nasanen, 1999; Majaj et al., 2002).

Early work addressing the SF dependency of human perception manipulated image spatial resolution by means of quantization, also called pixelation. In his recent book, Talis Bachmann reviews how this method has contributed to a better understanding of human vision. Quantization consists of dividing an image into equally sized squares and filling each square with its average luminance value (Figures 1A,B). This operation acts like a low-pass SF filter since it maintains the coarse structure of the original picture (i.e., its low SF) but removes its finer details (i.e., its high SF). But quantization also produces a spurious block structure, which adds “alien” high SF to the image.
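The block-averaging operation described above can be sketched in a few lines. This is a minimal illustration, assuming a 2-D grayscale array whose sides are exact multiples of the block size; the function name `quantize` and its parameters are hypothetical and are not taken from the book or from the studies it reviews.

```python
import numpy as np

def quantize(image: np.ndarray, block_size: int) -> np.ndarray:
    """Replace each block_size x block_size square with its mean luminance."""
    height, width = image.shape
    if height % block_size or width % block_size:
        raise ValueError("image dimensions must be multiples of block_size")
    rows, cols = height // block_size, width // block_size
    # View the image as a (rows, block, cols, block) grid of squares.
    blocks = image.reshape(rows, block_size, cols, block_size)
    # Average the luminance within each square.
    block_means = blocks.mean(axis=(1, 3))
    # Expand each mean back to a full square so the output keeps the input size.
    return np.repeat(np.repeat(block_means, block_size, axis=0), block_size, axis=1)
```

For an 800 by 800 pixel image, `quantize(image, 50)` would yield 16 blocks per image width. Averaging preserves the mean luminance of the picture, while the sharp square edges it creates are the source of the spurious high SF mentioned above.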

Figure 1. (A) Grayscale detail (800 by 800 pixels) of “Girl with a Pearl Earring,” oil painting by Johannes Vermeer. (B) The image has been quantized at different spatial scales, by averaging luminance over square areas of different sizes. The spatial scale of quantized images can be expressed in two ways: block size (i.e., pixels per block or ppb) or the number of blocks per image width (bpw) or height. The latter measure divided by two can be taken to approximate the SF range available in the quantized image. To be recognized, a quantized image of a face needs to contain at least 16 blocks (i.e., approximately 8 cycles per image; e.g., Bachmann, 1991). The sharp edges of the spurious blocks, which occupy high SF directly adjacent to the low SF range of the portrait (from 4 cycles per image upward for an 8-block quantized image), largely disrupt image recognizability. (C) When block edges are attenuated by low-pass filtering, the recognition of the quantized image improves (Harmon and Julesz, 1973). (D,E) Obliquely-quantized images seem to provide cues that do not fully overlap with those carried by cardinally-quantized images (B,C). Using different block structure orientations may yield new insights on how quantization affects shape processing.
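The two scale measures in the caption, and the approximate SF limit derived from them, can be written out as a short worked relation. The symbols W (image width in pixels) and f_max (highest SF retained, in cycles per image) are introduced here for illustration only, and the block sizes of 50 and 100 pixels are simply the values implied by an 800-pixel-wide image.

```latex
\[
\text{bpw} = \frac{W}{\text{ppb}}, \qquad
f_{\max} \approx \frac{\text{bpw}}{2}\ \text{cycles per image}
\]
\[
W = 800:\quad
\text{ppb} = 50 \Rightarrow \text{bpw} = 16 \Rightarrow f_{\max} \approx 8;
\qquad
\text{ppb} = 100 \Rightarrow \text{bpw} = 8 \Rightarrow f_{\max} \approx 4
\]
```

The division by two reflects the fact that one full cycle of a luminance modulation needs at least two blocks, one lighter and one darker.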

The quantization adventure started with the work published by Harmon and Julesz (1973). The authors quantized the iconic portrait of President Abraham Lincoln and found that portrait recognizability decreased as block size increased (Figures 1A,B). Interestingly, the recognition of the quantized portrait recovered to some extent when block edges were attenuated by low-pass SF filtering (Figure 1C). Harmon and Julesz (1973) interpreted this observation as reflecting “critical band masking,” namely that the high SF of the block structure interfere with (or mask) the low SF carrying portrait information. Such masking was proposed to emerge at primary visual stages of SF extraction, before the integration of visual input into a shape.
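The attenuation of block edges by low-pass SF filtering can be illustrated with a Fourier-domain sketch. It assumes an ideal (hard-edged) circular filter with a cutoff expressed in cycles per image, which is a simplification chosen for brevity and not necessarily the filter used by Harmon and Julesz (1973); the function name `low_pass` and the parameter `cutoff_cycles` are hypothetical.

```python
import numpy as np

def low_pass(image: np.ndarray, cutoff_cycles: float) -> np.ndarray:
    """Remove spatial frequencies above cutoff_cycles (in cycles per image)."""
    height, width = image.shape
    # Frequency of each FFT coefficient, in cycles per image, along each axis.
    fy = np.fft.fftfreq(height) * height
    fx = np.fft.fftfreq(width) * width
    # Radial frequency of every coefficient in the 2-D spectrum.
    radius = np.hypot(fy[:, None], fx[None, :])
    spectrum = np.fft.fft2(image)
    # Zero out everything beyond the cutoff (ideal low-pass filter).
    spectrum[radius > cutoff_cycles] = 0
    # The mask is symmetric, so the result is real up to rounding error.
    return np.fft.ifft2(spectrum).real
```

Applied to a quantized image with a cutoff near half the number of blocks per image width, such a filter would remove most of the energy introduced by the block edges while leaving the coarse portrait content in place. A hard cutoff can introduce ringing, so a smoother filter profile may be preferable in practice.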

Later, Morrone et al. (1983) challenged the early “critical band masking” interpretation by reporting a seemingly paradoxical finding: portrait recognition improves when high SF random noise is added to the quantized image. If the disruptive effects of quantization on perception were due to inter-SF competition, increasing the power of high SF by adding noise should interfere even more with the recognition of the low SF portrait. That portrait recognition improves when block shape is destroyed by noise instead suggests that the difficulty of recognizing quantized images is due to a competition between the integration of block and portrait shapes, at a higher visual processing stage than the early SF extraction stage (Bachmann and Kahusk, 1997; see also Caelli and Yuzyk, 1985). Besides the disruptive effect of the high SF block edges on perception, quantization was also reported to distort the second-order properties of the low SF image content (Caelli and Yuzyk, 1985; Bachmann and Kahusk, 1997; Morgan and Watt, 1997; Morrone and Burr, 1997). Although quantization was initially used to investigate the primary SF dependencies of human vision, this evidence shows that it also drastically distorts the higher-level (shape) properties of the image.

Actually, quantization also affects the orientation content of the image. Considering that (1) the visual system preferentially responds to cardinally-oriented edges (at least for meaningless shapes; Furmanski and Engel, 2000) and that (2) distinct orientation ranges are optimal for the perception of core categories such as faces and scenes (Hansen et al., 2003; Dakin and Watt, 2009; Goffaux and Dakin, 2010; Pachai et al., 2013), it is plausible that the standard cardinal orientation of block averaging influenced quantization evidence in peculiar and complex ways. Using a different quantization structure (Figures 1D,E) may yield new insights on the shape-related mechanisms involved when dealing with quantized images.

Because the perception of quantized images reflects complex and still elusive interactions between the integration of block shapes and that of, e.g., portrait shapes, interpreting perceptual findings derived from this technique proves difficult (Costen et al., 1994; Morrison and Schyns, 2001). Therefore, most researchers investigating the optimal SF range for human vision abandoned quantization in favor of Fourier-filtering procedures. As a consequence, the empirical literature related to quantization is relatively limited. The present book describes this limited literature in detail, without providing innovative arguments that might make the reader reconsider the contribution of this technique to the field of vision science. Bachmann defends quantization as a more valid means of manipulating visual perception than SF filtering because of its more disruptive effect on shape integration. However, the elusiveness of quantization effects on shape processing undermines this claim.

Research on quantization may be more illuminating with regard to digital sampling. Over recent decades, the amount of image data on the internet has exploded (e.g., Deng et al., 2009), and our everyday visual diet has become increasingly digital. Analogously to images captured by our visual system, apparently smooth and rich digital images result from a sampling operation that breaks the luminance gradients of the captured scene into discrete units called pixels. As Bachmann states, both visual and digital sampling are bound to the spatial resolution issue, i.e., how fine-grained an image should be to allow for recognition by man and machine. Quantization evidence has the potential to inform the spatial resolution necessary for economical storage of digital images, optimal image classification by computer algorithms, and ultimately the development of efficient artificially intelligent devices. The book casts some light on these potential and more warranted contributions of quantization research.

Author Contributions

The author confirms being the sole contributor of this work and approved it for publication.

Funding

The author is supported by the Belgian National Foundation for Scientific Research (F.R.S.-F.N.R.S.).

Conflict of Interest Statement

The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

Bachmann, T. (1991). Identification of spatially quantised tachistoscopic images of faces: how many pixels does it take to carry identity? Eur. J. Cogn. Psychol. 3, 87–103. doi: 10.1080/09541449108406221

Bachmann, T., and Kahusk, N. (1997). The effects of coarseness of quantisation, exposure duration, and selective spatial attention on the perception of spatially quantised (‘blocked’) visual images. Perception 26, 1181–1196. doi: 10.1068/p261181

Caelli, T., and Yuzyk, J. (1985). What is perceived when two images are combined? Perception 14, 41–48. doi: 10.1068/p140041

Costen, N. P., Parker, D. M., and Craw, I. (1994). Spatial content and spatial quantisation effects in face recognition. Perception 23, 129–146. doi: 10.1068/p230129

Dakin, S. C., and Watt, R. J. (2009). Biological “bar codes” in human faces. J. Vis. 9, 2.1–2.10. doi: 10.1167/9.4.2

Deng, J., Dong, W., Socher, R., Li, L.-J., Li, K., and Fei-Fei, L. (2009). “ImageNet: a large-scale hierarchical image database,” in Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2009. CVPR 2009 (Miami, FL).

Furmanski, C. S., and Engel, S. A. (2000). An oblique effect in human primary visual cortex. Nat. Neurosci. 3, 535–536. doi: 10.1038/75702

Goffaux, V., and Dakin, S. (2010). Horizontal information drives the behavioural signatures of face processing. Front. Psychol. 1:143. doi: 10.3389/fpsyg.2010.00143

Gold, J., Bennett, P. J., and Sekuler, A. B. (1999). Identification of band-pass filtered letters and faces by human and ideal observers. Vision Res. 39, 3537–3560. doi: 10.1016/S0042-6989(99)00080-2

Hansen, B. C., Essock, E. A., Zheng, Y., and DeFord, J. K. (2003). Perceptual anisotropies in visual processing and their relation to natural image statistics. Network 14, 501–526. doi: 10.1088/0954-898X_14_3_307

Harmon, L., and Julesz, B. (1973). Masking in visual recognition: effects of two-dimensional filtered noise. Science 180, 1194–1197. doi: 10.1126/science.180.4091.1194

Majaj, N. J., Pelli, D. G., Kurshan, P., and Palomares, M. (2002). The role of spatial frequency channels in letter identification. Vision Res. 42, 1165–1184. doi: 10.1016/S0042-6989(02)00045-7

Morgan, M. J., and Watt, R. J. (1997). The combination of filters in early spatial vision: a retrospective analysis of the MIRAGE model. Perception 26, 1073–1088. doi: 10.1068/p261073

Morrison, D. J., and Schyns, P. G. (2001). Usage of spatial scales for the categorization of faces, objects, and scenes. Psychon. Bull. Rev. 8, 454–469. doi: 10.3758/BF03196180

Morrone, M. C., and Burr, D. C. (1997). Capture and transparency in coarse quantized images. Vision Res. 37, 2609–2629. doi: 10.1016/S0042-6989(97)00052-7

Morrone, M. C., Burr, D. C., and Ross, J. (1983). Added noise restores recognizability of coarse quantized images. Nature 305, 226–228. doi: 10.1038/305226a0

Nasanen, R. (1999). Spatial frequency bandwidth used in the recognition of facial images. Vision Res. 39, 3824–3833. doi: 10.1016/S0042-6989(99)00096-6

Pachai, M. V., Sekuler, A. B., and Bennett, P. J. (2013). Sensitivity to information conveyed by horizontal contours is correlated with face identification accuracy. Front. Psychol. 4:74. doi: 10.3389/fpsyg.2013.00074

Keywords: quantization, pixelation, spatial frequency, recognition, digitization

Citation: Goffaux V (2016) Book Review: Perception of Pixelated Images. Front. Psychol. 7:1151. doi: 10.3389/fpsyg.2016.01151

Received: 20 June 2016; Accepted: 19 July 2016;
Published: 12 August 2016.

Edited and reviewed by: Haluk Ogmen, University of Houston, USA

Copyright © 2016 Goffaux. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Valérie Goffaux
