Modeling the Time-Course of Responses for the Border Ownership Selectivity Based on the Integration of Feedforward Signals and Visual Cortical Interactions

Border ownership (BO) indicates which side of a contour owns a border, and it plays a fundamental role in figure-ground segregation. The majority of neurons in areas V2 and V4 of monkeys exhibit BO selectivity. A physiological study reported that the responses of BO-selective cells show a rapid transition when a presented square is flipped along its classical receptive field (CRF) so that the opposite BO is presented, whereas the transition is significantly slower when a square with a clear BO is replaced by an ambiguous edge, e.g., when the square is greatly enlarged. The rapid transition seemed to reflect the influence of feedforward processing on BO selectivity. Herein, we investigated the role of feedforward signals and cortical interactions in the time-courses of BO-selective cells by modeling a visual cortical network comprising V1, V2, and posterior parietal (PP) modules. In our computational model, the recurrent pathways among these modules gradually established the visual progress and the BO assignments. Feedforward inputs mainly determined the activities of these modules. Surrounding suppression/facilitation in early-level areas modulated the activities of V2 cells to provide BO signals. Weak feedback signals from the PP module enhanced the contrast gain extracted in V1, which underlies the attentional modulation of BO signals. Model simulations exhibited time-courses that depended on the BO ambiguity; these were caused by the integration delay of V1 and V2 cells and the local inhibition therein, given the difference in input stimuli. However, our model did not fully explain the characteristics of the crucially slow transition: after the replacement with the ambiguous edge, the responses of BO-selective cells in physiological recordings showed persistent activation several times longer than that of our model. Furthermore, the time-course of BO-selective model cells replicated the attentional modulation of response time in human psychophysical experiments. These attentional modulations of the time-courses were induced by selective enhancement of early-level features due to interactions between V1 and PP. Our proposed model suggests fundamental roles of surrounding suppression/facilitation based on feedforward inputs, as well as of the interactions between early and parietal visual areas, with respect to the ambiguity dependence of the neural dynamics in intermediate-level vision.


The model V1 cell for extracting the contrast from visual inputs and attentional modulation
A detailed mathematical description of the model V1 cell is given here. The input image, Input, is a 124 × 124 pixel grey-scale image with intensity values ranging between zero and one. We designed the model so that 25 pixels correspond to one degree of visual angle. The local contrast, $C_{\theta\omega xy}$, is extracted from Input with the Gabor filter $G_{\theta\omega}$ (Equation S1), where the indices x and y indicate the spatial location and ω indicates the spatial frequency. The local contrasts, $C_{\theta\omega xy}$, have magnitudes ranging between zero and two. Because we used a limited set of input stimuli, we chose a single spatial frequency with a wavelength of 0.5°, which is optimal for extracting contours from the stimuli. The orientation, θ, is chosen from 0, 90, 180, and 270 degrees. m represents the number of pixels in the Gabor filter $G_{\theta\omega}$. Spatial attention represented in PP modulates the contrast gain in V1, as in Equation 3 in the main text: the local contrast, $C_{\theta\omega xy}$, is modulated by the feedback from PP to V1. The modulated contrast, $I^{V1,exc}_{\theta\omega xy}$, is given by Equations S2 to S4, as proposed by Lee et al. (1999) and Peters et al. (2005), where $W_{ij}$ represents the connection weights of the Gaussian (Deco and Lee, 2004) with a standard deviation $\sigma_w$ of 0.53°. The weights $W_{ij}$ were chosen so that their total sum is one.
$C_{\theta\omega xy}$ is identical to that in Equation 2 in the main text. $A^{PP}_{xy}$ is the activity of a model PP cell, as shown in Equation 8 in the main text. k and l indicate the spatial extent of the feedback from PP cells to a single V1 cell. α, χ, γ, and δ are constants. S is a semi-saturation constant that prevents the denominator from being zero and keeps the magnitude of the modulated feedforward input, $I^{V1,exc}_{\theta\omega xy}$, from growing without bound (Lee et al., 1999; Peters et al., 2005). In our simulations, we used α = 0.25, χ = 0.6, γ = 4.0, δ = 3.0, and S = 2.05. These constants were chosen following the references (Deco and Lee, 2004; Lee et al., 1999; Peters et al., 2005) and were fixed throughout all simulations. The major results were insensitive to changes in these parameters, at least in the range between 75% and 150% of the values used in the simulation (Wagatsuma et al., 2008; Wagatsuma et al., 2013).
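The Gaussian connection weights described above (standard deviation 0.53°, normalized so that they sum to one, with 25 pixels per degree) can be sketched as follows. This is a minimal illustration, not the authors' code: the truncation radius `radius_sigmas` and the function name `gaussian_weights` are hypothetical, since the source does not state the kernel extent.

```python
import math

PIXELS_PER_DEGREE = 25   # from the text: 25 pixels correspond to one degree
SIGMA_W_DEG = 0.53       # from the text: standard deviation of W_ij in degrees


def gaussian_weights(sigma_deg=SIGMA_W_DEG, radius_sigmas=3.0):
    """Return a square Gaussian kernel (list of rows) whose entries sum to one.

    radius_sigmas (kernel half-width in units of sigma) is a hypothetical
    truncation choice; the source does not state the kernel extent.
    """
    sigma_px = sigma_deg * PIXELS_PER_DEGREE
    radius = int(math.ceil(radius_sigmas * sigma_px))
    offsets = range(-radius, radius + 1)
    raw = [[math.exp(-(i * i + j * j) / (2.0 * sigma_px ** 2)) for j in offsets]
           for i in offsets]
    total = sum(sum(row) for row in raw)
    # Dividing by the total enforces the "total sum is one" constraint.
    return [[value / total for value in row] for row in raw]


weights = gaussian_weights()
weight_sum = sum(sum(row) for row in weights)
```

Normalizing after truncation keeps the sum exactly one regardless of the chosen radius, so the pooled signal stays on the same scale as the input contrast.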
The denominator of Equation S2 represents the inhibitory effects in V1 and functions as the inhibitory unit of the V1 module. $N_i$ and $N_j$ represent the spatial range of the inhibitory effects and were set to 1.0°. The feedback from PP modulates this inhibition. Spatial attention increases the contrast gain; thus, the extracted signal at the attended location is enhanced.
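The role of the semi-saturation constant and the inhibitory denominator can be illustrated with a generic divisive-normalization sketch. The functional form below is an assumption chosen for illustration and is not Equation S2 itself. Only γ, δ, and S are taken from the text; `attention_gain` is a hypothetical multiplicative stand-in for the PP feedback.

```python
GAMMA = 4.0   # from the text
DELTA = 3.0   # from the text
S = 2.05      # semi-saturation constant, from the text


def normalized_response(contrast, pool_contrasts, attention_gain=1.0):
    """Generic divisive normalization (assumed form, not Equation S2 itself).

    contrast: local contrast at the cell's location (0 to 2 in the text).
    pool_contrasts: contrasts inside the inhibitory pool, including the
        cell's own location.
    attention_gain: hypothetical multiplicative gain standing in for the
        feedback from PP.
    """
    drive = (GAMMA * attention_gain * contrast) ** DELTA
    pool = [(GAMMA * attention_gain * c) ** DELTA for c in pool_contrasts]
    mean_pool = sum(pool) / len(pool) if pool else 0.0
    # S ** DELTA keeps the denominator positive even for a blank input.
    return drive / (S ** DELTA + mean_pool)


blank = normalized_response(0.0, [0.0, 0.0, 0.0])
unattended = normalized_response(1.0, [1.0, 1.0, 1.0])
attended = normalized_response(1.0, [1.0, 1.0, 1.0], attention_gain=1.2)
saturated = normalized_response(2.0, [2.0, 2.0, 2.0])
```

In this sketch a blank input yields zero rather than a division by zero, the attended response exceeds the unattended one, and the response stays below one even at the maximum contrast.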
The activity of a model V1 cell is given by Equation S5, where $I^{V1}_{noise}$ represents random noise and µ represents a scaling constant. In this simulation, we used µ = 0.95. Equation S5 includes the contrast signal, $I^{V1,exc}_{\theta\omega xy}$, which is modulated by spatial attention, so that the activities of model V1 cells at and around the attended location are increased.
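The contrast-extraction stage described at the start of this section relies on Gabor filters at four orientations with a wavelength of 0.5° (12.5 pixels at 25 pixels per degree). A minimal sketch of such a filter bank is given below. The sine (odd-symmetric) phase is an assumption, chosen so that orientations 180° apart differ in contrast polarity; the envelope width `sigma_px` and kernel size `half_size` are hypothetical values not given in the source.

```python
import math

PIXELS_PER_DEGREE = 25                    # from the text
WAVELENGTH_PX = 0.5 * PIXELS_PER_DEGREE   # 0.5 degree wavelength = 12.5 pixels
ORIENTATIONS_DEG = (0, 90, 180, 270)      # from the text


def gabor_kernel(theta_deg, half_size=12, sigma_px=6.0):
    """Odd-symmetric Gabor kernel as a list of rows.

    half_size and sigma_px are hypothetical; the sine phase is an assumption
    that makes orientations 180 degrees apart opposite in contrast polarity.
    """
    theta = math.radians(theta_deg)
    kernel = []
    for y in range(-half_size, half_size + 1):
        row = []
        for x in range(-half_size, half_size + 1):
            # Coordinate along the direction of luminance modulation.
            along = x * math.cos(theta) + y * math.sin(theta)
            envelope = math.exp(-(x * x + y * y) / (2.0 * sigma_px ** 2))
            row.append(envelope * math.sin(2.0 * math.pi * along / WAVELENGTH_PX))
        kernel.append(row)
    return kernel


kernels = {theta: gabor_kernel(theta) for theta in ORIENTATIONS_DEG}
```

Under this assumption the 0° and 180° kernels are sign-flipped copies of each other, which is one way a model could distinguish the two contrast polarities of an edge.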

The mechanism of the surrounding suppression/facilitation of a model BO-selective cell in V2
A mathematical description of the surrounding suppression/facilitation of a model BO-selective cell in the V2 module is given here. The model determines BO based on the surrounding contrast (Sakai and Nishimura, 2006; Sakai et al., 2012).
First, the V2 module pools the contrast signals, which are modulated by spatial attention in the V1 module, over space and frequency. x and y represent the location of the classical receptive field (CRF) of the model cell. m indicates the spatial extent of the feedforward connections from the V1 module. $O^{1}_{\theta xy}$ is the feedforward input from V1. The indices cross and iso represent contrasts orthogonal and parallel to the preferred orientation, θ, of the cell, respectively. $W_{ij}$ represents the Gaussian function as shown in Equation S4. The surrounding modulation of the model BO-selective cell is given by a linear combination of the contrast signals from the excitatory and inhibitory regions, which are defined by Gaussian functions as illustrated in Figure 1(B) in the main text.
$CF^{BO}_{xyN}$ and $CS^{BO}_{xyN}$ represent the contrast signals within the facilitatory and suppressive regions, respectively. The index N represents the type of model BO cell; the types are distinguished by their surround regions. We implemented 10 types of surround regions from a pool of randomly generated Gaussians to reproduce the diversity of BO selectivity (Zhou et al., 2000). $F^{BO}_{N}$ and $S^{BO}_{N}$ represent the facilitatory and suppressive regions of the model BO-selective cell. $n_x$ and $n_y$ indicate the spatial extent of the facilitatory and suppressive regions. The combination of facilitatory and suppressive regions determines the property of BO selectivity. Such a localized, asymmetric, and orientation-dependent organization is observed in the surrounding modulation of V1 neurons (Jones et al., 2002). $c_a$ and $c_b$ are connection strengths. These constants for the surrounding modulation ($n_x$, $n_y$, $c_a$, and $c_b$) were determined following Sakai and Nishimura (2006). The balance of facilitation and suppression determines the modulation of the model BO-selective cell.
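The linear combination of facilitatory and suppressive surround signals can be illustrated with a one-dimensional sketch. This is a simplified illustration under stated assumptions, not the model's implementation: the strengths `C_A` and `C_B`, the region `offset` and `sigma`, and the single pair of Gaussian regions are hypothetical, whereas the model uses 10 types of two-dimensional regions with constants taken from Sakai and Nishimura (2006).

```python
import math

C_A = 1.0   # hypothetical facilitation strength (c_a in the text)
C_B = 1.0   # hypothetical suppression strength (c_b in the text)


def gaussian(x, center, sigma):
    """Unnormalized Gaussian profile of a surround region."""
    return math.exp(-((x - center) ** 2) / (2.0 * sigma ** 2))


def surround_modulation(contrast, crf_x, offset=10, sigma=4.0):
    """Linear combination of contrast pooled in two surround regions.

    contrast: 1-D list of non-negative contrast values along a row.
    The facilitatory region sits to the right of the CRF and the suppressive
    region to the left, so this cell prefers figures on the right.
    offset and sigma (in pixels) are hypothetical values.
    """
    facilitation = sum(gaussian(x, crf_x + offset, sigma) * c
                       for x, c in enumerate(contrast))
    suppression = sum(gaussian(x, crf_x - offset, sigma) * c
                      for x, c in enumerate(contrast))
    return C_A * facilitation - C_B * suppression


def row_with_edges(width, edge_positions):
    """Contrast is one at each listed edge position and zero elsewhere."""
    return [1.0 if x in edge_positions else 0.0 for x in range(width)]


CRF_X = 30
figure_right = row_with_edges(61, {30, 40})   # square spans x = 30..40
figure_left = row_with_edges(61, {20, 30})    # square spans x = 20..30
mod_right = surround_modulation(figure_right, CRF_X)
mod_left = surround_modulation(figure_left, CRF_X)
```

With the same edge in the CRF, the opposite edge of the square falls in the facilitatory region when the figure is on the right and in the suppressive region when it is on the left, so the sign of the modulation follows the side of the figure.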
To determine the direction of figure, the activities of the model BO-selective cells are pooled to represent the population activity. To compute the BO signals, we took the summation of all activities of the BO-right cells and that of the BO-left cells (see the main text for details).

Supplementary Figure
Figure S1. We fitted exponential curves, $N(t) = N_0 \exp(-t/\tau)$, to the BO signal ν of each simulation trial after the replacement (>500 ms) and computed the slopes and time constants of the BO signals, where $N_0$ and τ denote the slope and the time constant, respectively. Asterisks indicate significant differences between the stimulus sets (t-test: ** p < 0.01; * p < 0.05).
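A fit of the form N(t) = N_0 exp(−t/τ) can be sketched as a linear least-squares fit to the logarithm of the signal. This is one simple way to obtain N_0 and τ and is not necessarily the fitting procedure used for Figure S1. The function name `fit_exponential`, the assumption that time is measured from the replacement onset, and the synthetic data are all illustrative.

```python
import math


def fit_exponential(times, values):
    """Least-squares fit of log(values) = log(N0) - times / tau.

    Returns (n0, tau). Requires strictly positive values.
    """
    logs = [math.log(v) for v in values]
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, logs))
    slope = sxy / sxx                     # equals -1 / tau
    intercept = mean_y - slope * mean_t   # equals log(N0)
    return math.exp(intercept), -1.0 / slope


# Synthetic decay (illustrative values, not data from the study).
times_ms = [float(t) for t in range(0, 300, 10)]
signal = [0.8 * math.exp(-t / 60.0) for t in times_ms]
n0, tau = fit_exponential(times_ms, signal)
```

The log-linear fit weights small signal values more heavily than a direct nonlinear fit would, so for noisy traces a nonlinear least-squares routine may be preferable.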