Computational Model for Human 3D Shape Perception From a Single Specular Image

In natural conditions the human visual system can estimate the 3D shape of specular objects even from a single image. Although previous studies suggested that the orientation field plays a key role in 3D shape perception from specular reflections, its computational plausibility and possible mechanisms have not been investigated. In this study, to complement the orientation field information, we first add the prior knowledge that objects are illuminated from above and utilize the vertical polarity of the intensity gradient. We then construct an algorithm that incorporates these two image cues to estimate 3D shapes from a single specular image. We evaluated the algorithm with glossy and mirrored surfaces and found that 3D shapes can be recovered with a high correlation coefficient of around 0.8 with the true surface shapes. Moreover, under a specific condition, the algorithm's errors resembled those made by human observers. These findings show that the combination of the orientation field and the vertical polarity of the intensity gradient is computationally sufficient and probably reproduces essential representations used in human shape perception from specular reflections.


Appendix B: Psychophysical experiment results
Five subjects were first asked whether the local 3D surface around the red crosses in Figures 6A,B looks convex or concave. After that, they were asked whether the true 3D shape (Figure 6C) or the recovered 3D shape (Figure 6D or Figure 6E) was more similar to the 3D shape perceived from the image. Four of five subjects answered that the local surface of Figure 6A looked concave and only one thought that it looked convex. All five subjects answered that the local surface of Figure 6B looked convex. Four of five subjects answered that the recovered shape (Figure 6D) was closer to the perceived shape of the image shown in Figure 6A, and one thought that the true shape (Figure 6C) was closer. Four of five subjects answered that the recovered shape (Figure 6E) was closer to the perceived shape of the image shown in Figure 6B, and one answered that the true shape was closer.

Appendix C: Minimization of cost function
We optimized σmax, σmin, and ka by minimizing cost function E (Equation 18) as follows. First, we optimized the second derivative signs, σmax and σmin, assuming that the second derivative magnitude is constant (ka = 1). After that, we optimized the second derivative magnitude ka. We optimized the signs first for the following reasons. First, the signs are more easily optimized because the vertical polarity provides reasonable initial values. Second, a sign error degrades the subsequent magnitude optimization more than a magnitude error degrades the sign optimization, because the magnitude amplifies the effect of the signs.

Mean field algorithm for optimizing the signs
With the assumption that ka = 1, the cost function becomes a simple quadratic form of the summarized signs σall = (σmax^T, σmin^T)^T:

E(σall) = σall^T J σall, (A1)

where J is the symmetric coefficient matrix obtained from Equation (18) with ka = 1. Equation (A1) is a hard optimization problem because σall are binary parameters. However, we can utilize an existing approximation method since Equation (A1) has the same form as the energy function of the Ising model. We used a naive mean field approximation to derive the optimization algorithm (Parisi, 1988).
1. Initial value: The initial mean field values are determined by the vertical polarity (Equation 7) and set to zero where no information is available. Additionally, the initial values near the boundary are set so that σmax = +1 and σmin equals the curvature sign of the image contour (Supplementary Figure 5).

2. Iteration: The mean field values m_all of the signs are updated as

m_all ← tanh(−β J m_all),

and this update is repeated Nσ times (Nσ = 100 in this study), where β is a constant representing the inverse temperature of the system: when β is small the fluctuation is large, and when β is large the fluctuation is small. To search for the global minimum, simulated annealing was used in this study: the initial value of the inverse temperature is β = β0 / |J| and it is increased per iteration as β ← 1.1β, where β0 = 10 and |J| is the maximum eigenvalue of matrix J.

3. Result: The solution is obtained by extracting the sign of the mean field values: σall = sgn(m_all).
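The three steps above can be sketched in NumPy as follows. This is a schematic illustration, not the paper's implementation: the coupling matrix J below is a toy example (an all-equal-sign preference), not the matrix derived from the orientation field, and constant factors from the Ising formulation are absorbed into β.

```python
import numpy as np

def mean_field_signs(J, sigma_init, n_iter=100, beta0=10.0, growth=1.1):
    """Naive mean field minimization of E(sigma) = sigma^T J sigma over
    binary signs, with simulated annealing on the inverse temperature beta."""
    m = sigma_init.astype(float).copy()   # initial means, e.g. from vertical polarity
    # Initial inverse temperature, scaled by the largest eigenvalue magnitude of J.
    beta = beta0 / np.max(np.abs(np.linalg.eigvalsh(J)))
    for _ in range(n_iter):
        # Each mean value relaxes toward the thermal average under the field
        # exerted by the other variables; small beta allows large fluctuations.
        m = np.tanh(-beta * (J @ m))
        beta *= growth                    # annealing: shrink fluctuations over time
    return np.sign(m)                     # extract the binary sign solution

# Toy coupling matrix (illustrative only): off-diagonal -1 favors equal signs.
J = np.eye(4) - np.ones((4, 4))
sigma0 = np.array([1.0, 0.0, 0.0, 0.0])   # one known sign, the rest unknown
signs = mean_field_signs(J, sigma0)
```

With this toy coupling, the single known sign propagates to the undetermined entries through the annealed updates.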

Optimization of second derivative amplitudes
Given σmax and σmin, the cost function becomes a simple quadratic form of the second derivative magnitude ka, with a symmetric coefficient matrix composed of Σmax and Σmin, where Σmax and Σmin are diagonal matrices with diagonal elements σmax and σmin. In the minimization of the cost function, a constraint that the mean value of the second derivative magnitude over the image region Ω is 1 (Σ_Ω ka / |Ω| = 1) is introduced to avoid the trivial solution ka = 0. The solution is obtained by the method of Lagrange multipliers (Equation A8). Again, the sign optimization precedes the magnitude optimization.

Although ka ≥ 0 by definition, the cost function is minimized with this restriction removed. Then, based on Equations (3) and (4), we flip the signs as ka(x,y) ← −ka(x,y), σmax(x,y) ← −σmax(x,y), and σmin(x,y) ← −σmin(x,y) wherever ka(x,y) < 0. This procedure reduces the cost function more than magnitude optimization under the constraint ka ≥ 0 would, because it optimizes the signs and the magnitude simultaneously. We repeat this sign-updating procedure Nk times (Nk = 10 in this study), and alternately repeat the mean field algorithm described above and this procedure Nloop times (Nloop = 10 in this study). Note that, in the mean field algorithm, the assumption ka = 1 holds throughout this loop.
After sufficient optimization of the second derivative signs, the optimized magnitude is given by Equation (A8).
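The constrained magnitude step and the sign-flipping step can be sketched as follows. The matrix Q here is a generic symmetric positive-definite stand-in for the quadratic form of the first cost function (its actual construction is in the main text); the function names are our own labels for this sketch.

```python
import numpy as np

def optimize_magnitude(Q):
    """Minimize k^T Q k subject to mean(k) = 1 via a Lagrange multiplier.
    Stationarity gives 2 Q k proportional to the all-ones vector, so the
    solution is Q^{-1} 1 rescaled to satisfy the mean constraint."""
    n = Q.shape[0]
    ones = np.ones(n)
    k = np.linalg.solve(Q, ones)   # unnormalized solution Q^{-1} 1
    k *= n / (ones @ k)            # rescale so that mean(k) = 1
    return k

def flip_negative(k, s_max, s_min):
    """Fold the unconstrained solution back into k >= 0 by flipping the
    second-derivative signs wherever the magnitude came out negative."""
    neg = k < 0
    return np.where(neg, -k, k), np.where(neg, -s_max, s_max), np.where(neg, -s_min, s_min)
```

For example, `optimize_magnitude(np.array([[2.0, 0.3], [0.3, 1.0]]))` returns a vector whose mean is exactly 1, and `flip_negative` leaves only non-negative magnitudes while negating the corresponding sign entries.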

Appendix D: Minimization of second cost function
The improved solution is obtained from second cost function C′ (Equation 12). The second cost function is complicated, which makes a global search based on it difficult. Therefore, we utilize the second cost function to locally improve the solution obtained by the first cost minimization. The second cost function is represented without z by the same procedure as in Equations (17) and (18), yielding cost function E′. Cost function E′ is minimized by alternately repeating the mean field algorithm and gradient descent. We used the σmax and σmin optimized by the first cost function as initial values. However, the initial value of the second derivative magnitude was set to ka = 1, because the resultant cost function reached a lower value than when the ka optimized by the first cost function was used as the initial value.

Optimization of second derivative amplitudes
Since cost function E′ is complicated with respect to ka, gradient descent was used to optimize ka:

ka ← ka − γ ∂E′/∂ka,

where γ = 0.1 in this study. We repeated this iteration 100 times. Furthermore, we alternately repeated the mean field algorithm and the gradient descent 10 times.
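The gradient descent step can be sketched as below. The gradient of E′ with respect to ka depends on the form of E′ in the main text, so the sketch takes it as a callable; the quadratic used in the example is an illustrative stand-in, not the actual second cost function.

```python
import numpy as np

def gradient_descent_k(k, grad_fn, gamma=0.1, n_iter=100):
    """Plain gradient descent on a cost with respect to the magnitude k_a.
    grad_fn(k) must return the gradient of the cost at k."""
    for _ in range(n_iter):
        k = k - gamma * grad_fn(k)   # fixed step size gamma, as in the text
    return k

# Illustrative check with a known quadratic cost 0.5 * ||k - target||^2,
# whose gradient is simply (k - target); the target values are made up.
target = np.array([1.0, 2.0, 0.5])
k_opt = gradient_descent_k(np.ones(3), lambda k: k - target)
```

With γ = 0.1 and 100 iterations the residual shrinks by a factor of 0.9 per step, so `k_opt` lands very close to the minimizer; in the full algorithm this inner loop alternates with the mean field sign updates.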

Effect of second cost minimization on results
We first obtained intermediate solutions based on the first cost minimization (Equation 11) and improved them based on the second cost minimization (Equation 12). Here, we describe the differences in the results. The shapes of the glossy condition recovered by the first cost minimization are shown in Supplementary Figure 6. The recovered shape looks over-smoothed and less bumpy than the improved solutions shown in Figure 4. The average values of the global and the local interior depth correlations for the 12 objects were rg = 0.82 and rli = 0.71, slightly worse than those of the improved solutions. The small difference is due to the fact that the estimated signs under the two conditions were almost the same (the agreement rates of σmax and σmin were 0.996 and 0.94), indicating that the surface second derivative signs were not improved by the second cost minimization because the gradient descent solutions are easily trapped in local minima.

We interpret the improvement in the appearance of the recovered shapes and in the depth correlation values by the second cost minimization as follows. The first cost function depends strongly on the shapes near the boundary because the cost is proportional to second derivative magnitude ka (Equation 10), and the second derivative magnitude is generally large near the boundary. As a result, the recovered shapes are insensitive to bumps far from the boundary. In contrast, because the second cost function is normalized by second derivative magnitude ka, the shapes were recovered with equal weighting in the inner region as well as near the boundary.