Matching by Adjustment: If X Matches Y, Does Y Match X?

When dealing with pairwise comparisons of stimuli in two fixed observation areas (e.g., one stimulus on the left, one on the right), we say that the stimulus space is regular well-matched if (1) every stimulus is matched by some stimulus in the other observation area, and this matching stimulus is determined uniquely up to matching equivalence (two stimuli being equivalent if they always match or do not match any stimulus together); and (2) if a stimulus is matched by another stimulus then it matches it. The regular well-matchedness property has non-trivial consequences for several issues, ranging from the ancient “sorites” paradox to the “probability-distance hypothesis” to the modeling of discrimination probabilities by means of Thurstonian-type models. We have tested the regular well-matchedness hypothesis for locations of two dots within two side-by-side circles, and for two side-by-side “flower-like” shapes obtained by superposition of two cosine waves with fixed frequencies in polar coordinates. In the location experiment the two coordinates of the dot in one circle were adjusted to match the location of the dot in the other circle. In the shape experiment the two cosine amplitudes of one shape were adjusted to match the other shape. The adjustments on the left and on the right alternated in long series according to the “ping-pong” matching scheme developed in Dzhafarov (2006b, J. Math. Psychol., 50, 74–93). The results have been found to be in good agreement with the regular well-matchedness hypothesis.

This simple observation leads us to propose that a valid theoretical definition of the notion "stimulus y matches stimulus x" 1 should be constructed so that the relation it depicts be symmetric:
y matches x if and only if x matches y. (1)
Note that if x and y in the relation "y matches x" are, say, the left and the right stimuli, respectively (and so the relation in question means that the right stimulus matches the left one in some property or overall, but ignoring the conspicuous difference in locations), then they retain these locations in the relation "x matches y." So, the statement in (1) for left-right stimuli should be read as: y (on the right) matches x (on the left) if and only if x (on the left) matches y (on the right).
Analogously, if x and y in the relation "y matches x" are presented in a temporal succession, x first, y second, then (1) means: y (second) matches x (first) if and only if x (first) matches y (second), and not (contrary to a common procedural mistake) 2

Introduction
Consider a description of an experiment in which two stimuli were visually presented side-by-side. Let the description say, in part, that a participant adjusted the color [or intensity, or shape] of a stimulus on the right until the appearance of this stimulus matched the appearance of the stimulus on the left.
The author of this quote would probably not hesitate to rewrite it as: a participant adjusted the color [or intensity, or shape] of a stimulus on the right until the appearance of this stimulus was matched by the appearance of the stimulus on the left. Or: a participant adjusted the color [or intensity, or shape] of a stimulus on the right until the appearance of this stimulus and the appearance of the stimulus on the left matched each other.
Note that we are not dealing here with differently formulated instructions to a participant, nor with different procedures of adjustment. Rather we have three "theoretical" descriptions of a certain performance (under a given instruction and by a given procedure), and these three descriptions appear interchangeable. This theoretical belief is likely to be shared by the participants in such an experiment themselves: if a participant declares "I think that now this right shape matches this left one," then the questions like "And do you also think that the left one matches the right one?" or "Do you also think they both match each other?" are likely to be met by a questioning stare.
With this definition of matching it is no longer obvious that y matches x if and only if x matches y. In fact, it is very easy to construct models that would be incompatible with this statement. This is true, in particular, for Thurstonian-type models, a widely used theoretical tool about which Luce (1977, p. 462) said that "this conception of internal representation of signals is so simple and so intuitively compelling that no one ever really manages to escape from it." Consider the simplest such model, proposed in Luce and Galanter (1963). Stimuli x and y in this model are mapped into independent univariate normal random variables R x and R y , and the response "same" is given if and only if |R x − R y | is less than some fixed constant. Suppose that the variances σ x 2 and σ y 2 are continuously differentiable functions of the corresponding means, μ x and μ y . Since in this case any two x-values that map into an R x with a given mean, hence also a given variance, are equivalent (i.e., they match or do not match any stimulus y together), and analogously for y-values, we can conveniently speak of "stimuli μ x and μ y " in place of x and y. 4 Assuming that μ x and μ y fill in respective intervals of reals, it can easily be shown (Dzhafarov, 2003a, 2006b) that there are some functions H and G such that (A) any stimulus μ x is matched by a single μ y = H(μ x ), (B) any stimulus μ y is matched by a single μ x = G(μ y ), but (C) G is not the inverse of H unless the variances σ x 2 and σ y 2 have constant values. In other words, if σ x 2 and σ y 2 in this model change with stimuli, then the PSE of the PSE of a given stimulus (μ x or μ y ) is generally different from this stimulus.
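To make the asymmetry concrete, here is a minimal numerical sketch of a model of this kind; it is an illustration, not the authors' computation. Everything numeric in it is an assumption: the criterion c = 1, the stimulus range [0, 10] searched on a grid, and the hypothetical variance function 1 + 0.5μ. The names prob_same, pse_for_x, and pse_for_y are illustrative as well.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def prob_same(mu_x, mu_y, var_x, var_y, c=1.0):
    """Pr(|R_x - R_y| < c) for independent normal R_x and R_y."""
    d = mu_x - mu_y
    s = math.sqrt(var_x(mu_x) + var_y(mu_y))
    return normal_cdf((c - d) / s) - normal_cdf((-c - d) / s)

# Assumed stimulus range [0, 10], searched on a grid with step 0.001.
GRID = [i * 0.001 for i in range(10001)]

def pse_for_x(mu_x, var_x, var_y):
    """H: the mu_y at which 'same' is most probable for the given mu_x."""
    return max(GRID, key=lambda mu_y: prob_same(mu_x, mu_y, var_x, var_y))

def pse_for_y(mu_y, var_x, var_y):
    """G: the mu_x at which 'same' is most probable for the given mu_y."""
    return max(GRID, key=lambda mu_x: prob_same(mu_x, mu_y, var_x, var_y))

def constant_var(mu):
    return 2.0             # hypothetical constant variance

def growing_var(mu):
    return 1.0 + 0.5 * mu  # hypothetical variance increasing with the mean

x0 = 5.0
back_const = pse_for_y(pse_for_x(x0, constant_var, constant_var),
                       constant_var, constant_var)
back_grow = pse_for_y(pse_for_x(x0, growing_var, growing_var),
                      growing_var, growing_var)
print("constant variances: G(H(x0)) =", back_const)
print("growing variances:  G(H(x0)) =", back_grow)
```

Under these assumptions the constant-variance case returns x0 itself, whereas with the growing variance the PSE of the PSE falls below x0, so G is not the inverse of H.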
One can show that this situation cannot be "corrected" by replacing the independent univariate normal distributions in this model with more complex and stochastically interdependent distributions on other probability spaces (provided that the model remains "well-behaved" in some rather non-restrictive sense; see Dzhafarov, 2003a,b, 2006a, and Kujala and Dzhafarov, 2009). We see that the requirement that y match x if and only if x matches y is far from being innocuous: it imposes rather stringent constraints on the possible Thurstonian-type models (see Dzhafarov, 2006b, in response to Ennis, 2006).
Another modeling scheme for which the requirement in question is critical is the "probability-distance hypothesis" (Dzhafarov, 2002a). In this class of models, assuming that both x and y stimuli (say, presented on the left and on the right, respectively) take their values in some common set Z, the probability with which x and y are judged to be different is an increasing function Φ of some metric D imposed on Z:
$$\psi(x, y) = \Phi[D(x, y)]. \tag{4}$$
Although traditionally applied to greater-less rather than same-different judgments, this modeling scheme pertains to what Luce and Edwards (1958, p. 232) called "the old, famous psychological rule of thumb: equally often noticed differences are equal." Now, a direct application of (4) implies that y ↦ ψ(x, y) achieves its minimum (i.e., y matches x) if and only if y = x; and that x ↦ ψ(x, y) achieves its minimum (i.e., x matches y) if and only if x = y. The symmetry requirement therefore must be satisfied in order for the
y (second) matches x (first) if and only if x (second) matches y (first).
The latter statement is generally wrong due to the presence of constant error (here, time order effect).
Our goal in this paper is to construct a definition of matching and to experimentally test its compliance with the symmetry requirement (1) for the matching-by-adjustment paradigm. Given our opening example, one might wonder why we need a theoretical definition of matching in the first place. Why cannot we simply say that stimulus y matches stimulus x when an observer says so? The reason is that pairwise comparisons are probabilistic: one cannot say "y is judged to be the same as x" without adding "in this trial" (and then in another trial this may not be true) or "with this probability" (and then another stimulus y′ will be judged to be the same as x with some other probability). As a result, the identity of a stimulus y matching x has to be computed from a set of responses rather than observed in a single one.
To make this clear, consider the classical paradigm of greater-less comparisons. Let us say x is the stimulus presented on the left, y is presented on the right, and in response to a left-right pair (x,y) a participant says which of the two contains more of a certain property (say, brightness). The participant is not allowed to say that the two stimuli are equally bright, so one could not identify the matching relation with the participant's judgments even if they were deterministic. The fact is, however, they are probabilistic, and each pair of stimuli maps into a probability with which the right stimulus is judged to be greater (in brightness) than the left one,
$$\xi(x, y) = \Pr[y \text{ is judged to be greater than } x]. \tag{2}$$
If we view this function as y ↦ ξ(x, y), with x fixed, then the match (or point of subjective equality, PSE) for x is traditionally defined as any value of y (may not be unique if y is not unidimensional) 3 for which ξ(x,y) = 1/2. Viewing the function as x ↦ ξ(x, y), with y fixed, the match (or PSE) for y is analogously defined as any value of x at which ξ(x,y) = 1/2. It is easy to see that with this definition of matching, y matches x if and only if x matches y.
The symmetry of the matching relation, however, is not always a mathematical necessity. With other definitions of matching it may be an empirical hypothesis. Nor is this hypothesis always innocuous and trivial. It often has in fact unexpectedly restrictive consequences. To see this, consider the paradigm of same-different comparisons. Let stimuli x,y, again, be presented on the left and on the right, respectively, and let a participant say in response to a pair (x,y) whether the two stimuli are different (in some respect, such as brightness, or overall). Each stimulus pair now is associated with the probability
$$\psi(x, y) = \Pr[y \text{ and } x \text{ are judged to be different}]. \tag{3}$$
A natural definition of a match (PSE) for x here is any value of y such that ψ(x,y) is the smallest value of the function y ↦ ψ(x, y). Analogously, any value of x at which the function x ↦ ψ(x, y) achieves its minimum value is taken to be a match (PSE) for y.
x, however close to y. It is reasonable to assume in fact, as is done in all models and fits of psychometric functions known to us from the literature, that the value of y (or x) for which ξ(x,y) = 1/2 is unique for all x (respectively, y), because with conventional choices of stimulus continua y ↦ ξ(x, y) is strictly increasing in the vicinity of its median (respectively, x ↦ ξ(x, y) is strictly decreasing in the vicinity of its median). Even if we speculate, with no empirical justification, that in some cases the function y ↦ ξ(x, y) may have a plateau at the level 1/2 over some interval ]y − ε, y + ε[, it is reasonable to assume then (in the absence of any empirical evidence to the contrary and in accordance with the regular well-matchedness hypothesis formulated in Section 2) that any two y 1 ,y 2 stimuli in this interval are equivalent: ξ(x,y 1 ) = ξ(x,y 2 ) for all x.
Let us return now to our opening example: two stimuli, one of them fixed, the other manipulated by a person until it appears matching the fixed one. A mapping of some physical process (such as trackball rotation) into a set of stimuli normally requires a parametrization of stimuli by reals, so we may assume that x and y are vectors of reals. If we imagine the adjustment procedure repeated infinitely many times under the same conditions, each fixed stimulus, x or y, will correspond to a random variable Y x with y-values (respectively, a random variable X y with x-values). How should one define the matching stimulus (PSE) for x or y in this situation? The traditional answer is to take some measure of central tendency of Y x and X y , such as their expected values or componentwise medians. One needs, however, a theory that would justify suitable choices for this measure. Most important in the present context, given different choices one should opt for those that ensure (or at least make it plausible) that the matching relation is symmetric: denoting a This consideration makes it clear that a suitable definition of the PSE for x or y has to be tied to a particular parameterization of stimuli. Indeed, with no conventional choice of m, if (6) holds for x and y will it also hold for x′ = T 1 (x) and y′ = T 2 (y) across all possible reparametrizations T 1 ,T 2 , even if one confines the latter, as we do in this paper, to diffeomorphisms only (continuously differentiable bijections with continuously differentiable inverses).

Regular Well-Matchedness
The general notion of a regular well-matched stimulus space has been developed in Dzhafarov and Dzhafarov (2010b) for an arbitrary set of stimuli and observation areas (defined, e.g., by multiple locations of stimuli compared in shape, or multiple colors of stimuli compared in brightness). For detailed discussions of the notion of an observation area and its importance in the theory of comparative judgments see Dzhafarov (2002b), Dzhafarov and Colonius (2006), and Dzhafarov and Dzhafarov (2010b). Here we confine our consideration to the case when stimuli belong to two fixed observation areas. Let us agree to use letters x and a to denote stimulus values in one of them (say, left, or first), and letters y and b to denote stimulus values in the other (right, second). More rigorous notation would be (x, 1) or x (1) , meaning the stimulus with value x in observation area 1, and analogously for y, but the simplified notation seems sufficient in the present context.
model to hold. A more sophisticated approach takes into account the possibility of constant error (non-coincidence of the values of a stimulus and its PSE) and modifies (4) as
$$\psi(x, y) = \Phi[D(H(x), y)],$$
where H is some bijective function. It is easy to see that both y ↦ ψ(x, y) and x ↦ ψ(x, y) achieve their (common) minimum if and only if y = H(x), ensuring thereby that y matches x if and only if x matches y.
Yet another issue in which the symmetry in question plays an important role is known in philosophy as the perceptual variety of the "sorites paradox" (see, e.g., the collections of chapters edited by Keefe and Smith, 1999; Beall, 2003). In both philosophy and psychophysics the issue is also known as that of nontransitivity of matches (Goodman, 1951/1997; Luce, 1956). Somewhat simplifying, let the matching y for x be determined uniquely, y = H(x), and let the matching x for y be determined uniquely as well, x = G(y). Then the PSE for y = H(x) is x′ = G ∘ H(x). If G is not the inverse of H, x′ does not generally coincide with x. The PSE for x′ in turn is y′ = H ∘ G ∘ H(x), which does not generally coincide with y and therefore does not match the initial value of x. We obtain thus a "tetradic soritical sequence" (Dzhafarov and Dzhafarov, 2010b), (x, y, x′, y′), in which y matches x, x′ matches y, and y′ matches x′, but y′ does not match x. (5)
This situation does not occur if matches are symmetric, G ≡ H⁻¹. Then x′ = x and y′ = y, that is, the last element of the sequence, y′, matches its first element, x. 5 It should be mentioned, to prevent misunderstandings, that the possibility or impossibility of soritical sequences is determined not only by the issue of symmetry of matches but also by that of their uniqueness. Thus, many authors take it for granted that if y matches x then any y * which is sufficiently close to y will also match x. This position, however, is logically untenable as it leads to a contradiction (see Dzhafarov and Dzhafarov, 2010a, for a detailed analysis). Rather than discuss this on a general level, let matching be determined through the function ξ(x,y) in (2), and let the stimulus values be unidimensional, which we indicate by using the notation x = x, y = y. Let (x,y) be a left-right pair of matching stimuli, which we know to mean that ξ(x,y) = 1/2.
It would be fallacious now to maintain that whenever this happens, ξ(x,y + ε) must remain equal to 1/2 for sufficiently small |ε|: such an assertion would in fact imply that the function y ↦ ξ(x, y) equals 1/2 over all values of y. If the latter is not the case, then there must be at least one value of y matching x such that no value y * to the right and/or to the left of y matches
G ≡ H⁻¹. Figure 1 illustrates three situations of interest: when MS is violated, when it is satisfied, and when it is violated but it is difficult if not hopeless to distinguish it from the case of compliance with MS in a realistic experiment. With an appropriately formulated general model the situations illustrated in the left-hand and middle panels of the figure can be made sources for competing statistical hypotheses.

General Model
The general model in question is as follows. Let the values of x and y (after equivalent stimuli have been identically labeled) be representable by real-valued vectors, x = (x 1 ,…, x n ), y = (y 1 ,…,y n ), filling in two open connected areas of R n . 6 Let the random vectors Y x and X y be as defined above. We assume the existence of two diffeomorphic transformations, x = T 1 (a) and y = T 2 (b), with each of a and b filling in R n , such that where h and g are continuously differentiable functions, and (δa, δb) is a 2n-vector of independent normally distributed variables with zero means 7 . We define the PSE functions for, respectively, x = T 1 (a) and y = T 2 (b) as the continuously differentiable functions

$$T_2(h(a)) = T_2 \circ h \circ T_1^{-1}(x) = H(x), \qquad T_1(g(b)) = T_1 \circ g \circ T_2^{-1}(y) = G(y). \tag{8}$$
Let us assume that the set of all x and y stimuli is endowed with a binary relation M ("is matched by") which can only hold true for two stimuli from different observation areas: xMy or yMx but never x 1 Mx 2 or y 1 My 2 . Let us also define a binary relation E ("is equivalent to") which, on the contrary, only holds for two stimuli from one and the same observation area: x 1 Ex 2 means that for any y, yMx 1 ⇔ yMx 2 ; analogously, y 1 Ey 2 means that for any x, xMy 1 ⇔ xMy 2 .
We say that the x and y stimuli form a regular well-matched space if they satisfy the following statements:
WM (well-matchedness property). For any stimulus (x or y) there is a stimulus in the other observation area (respectively, y or x) such that the two stimuli match each other (xMy and yMx).
R (regularity property). If two stimuli in the same observation area (x 1 ,x 2 or y 1 ,y 2 ), are matched by another stimulus (respectively, y or x), then they are equivalent (x 1 Ex 2 , or y 1 Ey 2 , respectively).
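The two properties can be checked mechanically on a small finite example. The sketch below is only an illustration under stated assumptions: the stimulus labels ("x1", "y1", and so on), the representation of M as a set of ordered pairs (s, t) read as "s is matched by t", and the function name is_regular_well_matched are all hypothetical and do not come from the text.

```python
def is_regular_well_matched(xs, ys, M):
    """Check WM and R for finite stimulus sets xs, ys.

    M is a set of ordered pairs (s, t), read as 's is matched by t';
    it is assumed to relate stimuli from different observation areas only.
    """
    def equivalent(s1, s2, others):
        # s1 E s2: every stimulus in the other area is matched by both or by neither
        return all(((o, s1) in M) == ((o, s2) in M) for o in others)

    def well_matched(area, others):
        # WM: each stimulus has a partner with matching in both directions
        return all(any((s, o) in M and (o, s) in M for o in others)
                   for s in area)

    def regular(area, others):
        # R: stimuli of one area matched by the same stimulus are equivalent
        return all(equivalent(s1, s2, others)
                   for o in others
                   for s1 in area
                   for s2 in area
                   if (s1, o) in M and (s2, o) in M)

    return (well_matched(xs, ys) and well_matched(ys, xs)
            and regular(xs, ys) and regular(ys, xs))

xs, ys = ["x1", "x2"], ["y1", "y2"]

# Symmetric matching: x1 <-> y1 and x2 <-> y2.
symmetric = {("x1", "y1"), ("y1", "x1"), ("x2", "y2"), ("y2", "x2")}

# Every stimulus is matched by exactly one stimulus, but never reciprocally.
cyclic = {("x1", "y1"), ("y1", "x2"), ("x2", "y2"), ("y2", "x1")}

print(is_regular_well_matched(xs, ys, symmetric))  # True
print(is_regular_well_matched(xs, ys, cyclic))     # False
```

The second relation shows that unique matches in each direction are not enough: it fails WM because no pair of stimuli match each other.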
The requirement of regular well-matchedness is all one needs to ensure that matching is "non-paradoxical": no possibility for nontransitive sequences like (5), and no violations of symmetry (1). It is convenient in the present context to reformulate the definition of a regular well-matched space of stimuli in the form maximally emphasizing the symmetry property. Assume that all x and y stimuli have been (re)labeled so that any two equivalent stimuli receive one and the same label. Retaining the same notation (x and y) for thus (re)labeled stimuli, no two different x (or y) stimuli are equivalent. With this proviso, the stimuli form a regular well-matched space if the following statements hold:
MF (matching is a function). For every stimulus there is one and only one stimulus in the other observation area which matches it; that is, there is a function H such that xMy ⇔ y = H(x), and a function G such that yMx ⇔ x = G(y).
The equivalence of MF-MS to WM-R is obvious. The functions H and G are referred to as PSE functions, with H(x) being the PSE for x and G(y) the PSE for y. Once MF is accepted, the property MS says that the functions H and G are bijective and each other's inverses: G ≡ H −1 . This formulation is close to the definitions of Regular Minimality and Regular Mediality given in Dzhafarov (2003a) and Dzhafarov and Colonius (2006) for, respectively, same-different and greater-less comparisons (the formulation in Dzhafarov and Dzhafarov, 2010b, is better suited for multiple observation areas).
The reason MF-MS is more convenient for our purposes than WM-R is that it is usually easy to construct a definition of matching that satisfies MF, and whenever this is the case (as it is, e.g., in the Luce-Galanter model mentioned in Section 1), the question of whether a stimulus space is regular well-matched reduces to the title question of this paper. Most importantly in the present context, MF is trivially satisfied for the matching-by-adjustment paradigm: if each x corresponds to one and only one random variable Y x (with values representing declared y-matches to x in different trials), then any measure of central tendency m[Y x ] is a function of x, m[Y x ] = H(x); and analogously with y and m[X y ] = G(y). The question is whether m can be chosen so that

Figure 1 | x and y stimuli (for illustration purposes unidimensional) with the PSE functions H(x) and G(y). The abscissa segment and ordinate segment depict "sufficiently large" areas of stimuli around x 0 and y 0 = H(x 0 ), respectively. Left-hand panel: the symmetry assumption, MS, is not satisfied, and the two functions do not cross within the areas depicted. Middle panel: MS is satisfied. Right-hand panel: MS is not satisfied but the two functions have numerous crossings within the areas depicted. In the left-hand panel the PSE for the PSE of x 0 is not x 0 itself, and analogously for y 0 = H(x 0 ): there are systematic differences between G ∘ H(x 0 ) and x 0 , and between H(x 0 ) and H ∘ G ∘ H(x 0 ), which may be detectable if the procedure is repeated many times and the errors of matching are sufficiently well-behaved. In the middle panel the PSE for the PSE of x 0 is x 0 itself, and analogously for y 0 = H(x 0 ): if the procedure is repeated many times any variance among successive adjustments of x and y will be due to matching errors only.

6 It is usually assumed that the values of x and y belong to the same set, but this assumption is not critical for our analysis. The latter would even apply to a case when x stimuli are, say, visual and y stimuli auditory.
7 The only properties of these variables essential for this paper are their independence and symmetry around 0. The normality, however, is convenient due to the uniqueness properties mentioned at the end of this section.

Dzhafarov and Perry
Matching by adjustment

$$H^{-1}(y) = T_1 \circ h^{-1} \circ T_2^{-1}(y) = T_1 \circ g \circ T_2^{-1}(y) = G(y),$$
or (as illustrated in the middle panel of Figure 1)
$$G \equiv H^{-1}. \tag{9}$$
From (9) we have then xMy if and only if yMx.

Alternative Model
The alternative model corresponds to the left-hand panel of Figure 1. Since its difference from the right-hand panel is a matter of scale only, the alternative model has to be formulated in reference to the set of stimuli recorded in a specific experiment (whether set by experimenter or adjusted by participant). Let {x 1 ,…,x M } and {y 1 ,…,y N } be these stimuli. Let us define a sufficiently large stimulus area for x as any open connected area X of x-values that contains where G is the true PSE function for y as defined by (8) The alternative model says that in some sufficiently large areas X and Y the graphs of the corresponding components of PSE functions H(x) and G(y) do not cross. This means that for any i = 1,…,n, the ith component of the difference H(x) − y has one and the same sign across all x ∈ X and y ∈ Y such that H(x) ∈ Y and G(y) = x; analogously, for any i = 1,…,n, the ith component of the difference G(y) − x has one and the same sign across all y ∈ Y and x ∈ X such that G(y) ∈ X and H(x) = y.

Ping-Pong Matching Paradigm
If there were no matching error involved, then starting with any x ∈ X one could create two sequences of stimuli, one in each observation area (let them be again "left" and "right"), chain-matched as shown in Figure 2. Under our alternative model, each stimulus in each observation area is different from the one immediately following it. Moreover, for any i = 1,…,n, the ith components of the differences between successive x stimuli have one and the same sign, and so do the ith components of the differences between successive y stimuli in the other observation area. If the null model is true, however, then (in the absence of matching errors) all x's are the same and so are all y's, whence all the componentwise differences between successive stimuli in either observation area are 0.
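The errorless chain described above can be sketched for unidimensional stimuli as follows. The PSE functions used here are hypothetical linear examples chosen only for illustration (H(x) = 2x + 1, with G either its exact inverse or shifted upward by 0.1); they are not taken from the paper.

```python
def chain(x_start, H, G, n_pairs):
    """Errorless chain: each x is matched by y = H(x), each y by the next x = G(y)."""
    xs, ys = [x_start], []
    for _ in range(n_pairs):
        ys.append(H(xs[-1]))
        xs.append(G(ys[-1]))
    return xs, ys

def successive_differences(seq):
    """Differences between each element and the one immediately preceding it."""
    return [b - a for a, b in zip(seq, seq[1:])]

def H(x):
    return 2.0 * x + 1.0          # hypothetical PSE function for x

def G_null(y):
    return (y - 1.0) / 2.0        # exact inverse of H (null model)

def G_alt(y):
    return (y - 1.0) / 2.0 + 0.1  # hypothetical: never crosses the inverse of H

xs_null, ys_null = chain(3.0, H, G_null, 5)
xs_alt, ys_alt = chain(3.0, H, G_alt, 5)

dx_null = successive_differences(xs_null)
dy_null = successive_differences(ys_null)
dx_alt = successive_differences(xs_alt)
dy_alt = successive_differences(ys_alt)

print("null:", dx_null, dy_null)
print("alternative:", dx_alt, dy_alt)
```

With the inverse pair every difference is zero, while with the shifted G all differences in each observation area share one sign.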
The ping-pong matching paradigm proposed in Dzhafarov (2006b) is aimed at distinguishing between these two competing possibilities in the presence of matching errors. The logic of the
In our general notation,
Note that this definition of the PSE functions H and G does not tell us how to compute them from Y x and X y , respectively, as our general model does not specify the transformations T 1 ,T 2 . We will be able to circumvent this difficulty in the application of the model to our experiments (in Section 3.1) by using linear approximations to T 1 and T 2 . In Section 7 we mention an approach which may make the reliance on approximations unnecessary. This issue is related to the uniqueness properties of T 1 ,T 2 , which is worth mentioning even if not essential for the analysis to follow.
Clearly, if T 1 , T 2 exist, then T 1 ∘ L 1 , T 2 ∘ L 2 will be another pair of transformations providing (7), for any choice of orthogonal linear transformations L 1 , L 2 . Linear transformations, however, are inconsequential, as they do not change the PSE functions H and G. If x,y belong to R 1 or R 2 (arguably the most important cases amenable to experimental analysis), then it is known that within a class of transformations including diffeomorphisms (under certain constraints trivially satisfied in our general model), linear transformations are the only ones which preserve the normality of δa and δb (Ghosh, 1969; Khatri and Mukerjee, 1987). In other words, for univariate and bivariate stimuli the PSE functions in the general model are determined uniquely. There are reasons to conjecture (Khatri, 1987) that this is also true for n > 2, but the results we know of are less general than for n = 1, 2. There does not, however, seem to be a known example of a nonlinear diffeomorphism in R n that would map n + 1 normal distributions with distinct means into n + 1 normal distributions with distinct means.

Null Model
We say "null model" instead of "null hypothesis" to emphasize that the former is an essentially non-statistical theoretical construct which may be used as a source of (generally more than one) statistically testable consequences, which then will be referred to as null hypotheses.
The null model is obtained from the general model by positing that h and g in (7) are diffeomorphisms, and
$$g \equiv h^{-1}.$$
It follows from (8) that
Figure 2 | A chain-matched sequence of left and right stimuli. The arrows should be read "is matched by" (i.e., they represent the relation M).
Dzhafarov (2006b), however, does not offer a general model of matching-by-adjustment. Also, one can be skeptical about the generalizability of unidimensional results to multidimensional stimuli. 8 The present work is to fill in these gaps. In the remainder of this section we show how the general model of Section 2 and its null and alternative versions apply to the ping-pong adjustment paradigm.

Application of the General Model
Let us enumerate the trial pairs (as described in the legend to Figure 3) 1,2,…,N, in chronological order. Denote the balance points established in the kth trial pair by (y k ,x k ) and the first-order differences (or ∆'s for short) by ∆x k = x k+1 − x k and ∆y k = y k+1 − y k . It is shown in the Appendix that the general model of Section 2 implies
where M…,N… denote n × n matrices, and o designates any function whose norm |o| (say, the supremal one) is o{1}|(δa 1 ,δb 1 ,…,δa k+1 ,δb k+1 )|. We know that (δa k ,δb k ) is a 2n-vector normally distributed with zero
paradigm is presented in Figure 3. As an example, in three ping-pong matching experiments reported in Dzhafarov (2006b), stimuli were straight line segments presented side-by-side in a frontal plane, and in each trial a participant had to adjust one of the segments until it appeared of the same length as the other one, held fixed. Every time a "balance point" was achieved, the balance was upset by randomly changing the length of the segment which was fixed in the previous trial, and the participant had to adjust it "back," until it matched the length of the other segment (which remained fixed at its previously established value). This alternating procedure was replicated 200 times (100 balance points on each side), and each of these 200-trial series was repeated 10-25 times. In reference to Figure 3, x = x and y = y are unidimensional, so the first-order differences are ∆x k = x k+1 − x k and ∆y k = y k+1 − y k . As shown below (Sections 3.1-3.3), to the extent one can drop non-linear terms in certain Taylor expansions, it follows from the null model that the distributions of the ∆x k and ∆y k should be symmetric around 0. The histograms and statistics shown in Figure 4 do not contradict this prediction.
Trials may or may not be separated by time intervals. A series of adjustments consists of many consecutive trial pairs.
In the first trial of any trial pair, x remains fixed (solid horizontal lines, top panel) at the value established at the end of the previous trial pair; the value of stimulus y at the beginning of this first trial is randomly offset (dashed vertical lines, bottom) so that it generally does not match x, and the participant adjusts this value (oblique solid lines, bottom) until it seems to match x (the encircled points, bottom); in the second trial of the trial pair, y remains fixed (solid horizontal lines, bottom) at the value established at the end of the previous trial; the value of stimulus x at the beginning of this second trial is randomly offset (dashed vertical lines, top) so that it generally does not match y, and the participant adjusts this value (oblique solid lines, top) until it seems to match y (the encircled points, top). The stimuli x 1 ,x 2 ,x 3 ,… and y 1 ,y 2 ,y 3 ,… represented by the encircled points are referred to as "balance points." In this work we focus on the first-order differences ∆x k = x k+1 − x k and ∆y k = y k+1 − y k between balance points.
, all positive or all negative. Let us denote this common sign of the v k i 's by sgn(v i ). By aggregating ∆y k i across all k we create a random variable ∆y i which equals ∆y k i with probability 1/N. Since for any positive numbers α < β,
$$\operatorname{sgn}\left(\Pr[\alpha \le \Delta y_k^i < \beta] - \Pr[-\beta \le \Delta y_k^i < -\alpha]\right) = \operatorname{sgn}(v^i),$$
it follows that the conclusion we have drawn from the null model, that the values of ∆y i in some interval 0 ± ε i should be distributed symmetrically around 0, is false under the alternative model. In particular, sgn(Pr[0 ≤ ∆y i < ε i ] − Pr[−ε i ≤ ∆y i < 0]) = sgn(v i ), whence the median of ∆y i in any interval 0 ± ε i (including for ε i = ∞) also shares the sign with v i . The same is true about the mean of ∆y i , which equals
$$\frac{1}{N}\sum_{k} v_k^i.$$
The consideration of ∆x k i and their mixture ∆x i is analogous and leads to the same conclusions.
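The contrast between the two models' predictions for the first-order differences can be illustrated by a small simulation. This is a sketch under simplifying assumptions that are not in the text: stimuli are unidimensional, the transformations T 1 and T 2 are taken to be identities, matching errors are additive normal with standard deviation 0.1, and the PSE functions are hypothetical (H(x) = x + 1, with G either its inverse or shifted by 0.05). The seed and series length are arbitrary.

```python
import random
import statistics

def ping_pong(H, G, x_start, n_pairs, sd, rng):
    """Alternate noisy adjustments: y is set near H(x), then x near G(y)."""
    xs, ys = [], []
    x = x_start
    for _ in range(n_pairs):
        y = H(x) + rng.gauss(0.0, sd)  # right stimulus adjusted to match x
        x = G(y) + rng.gauss(0.0, sd)  # left stimulus adjusted to match y
        ys.append(y)
        xs.append(x)
    return xs, ys

def first_order_differences(seq):
    """Differences between successive balance points."""
    return [b - a for a, b in zip(seq, seq[1:])]

def H(x):
    return x + 1.0          # hypothetical PSE function for x

def G_null(y):
    return y - 1.0          # inverse of H (null model)

def G_alt(y):
    return y - 1.0 + 0.05   # hypothetical constant shift (alternative model)

rng = random.Random(12345)  # fixed seed so the run is reproducible
_, ys_null = ping_pong(H, G_null, 10.0, 2000, 0.1, rng)
_, ys_alt = ping_pong(H, G_alt, 10.0, 2000, 0.1, rng)

dy_null = first_order_differences(ys_null)
dy_alt = first_order_differences(ys_alt)

print("null:        mean %.4f, median %.4f"
      % (statistics.mean(dy_null), statistics.median(dy_null)))
print("alternative: mean %.4f, median %.4f"
      % (statistics.mean(dy_alt), statistics.median(dy_alt)))
```

In this setup the differences under the null pair are centered near zero, whereas under the shifted G both the mean and the median take the sign of the shift.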
We can now formulate, for each i = 1,…,n and any choice of ε i , i , the alternative hypotheses corresponding to H1 0 −H3 0 of the previous section. We have mentioned in the previous section how we chose the intervals and partitions for the experiments reported below.

Participants
Seven paid volunteers, students at Purdue (six females and one male), and the second author of this paper (LP) served as participants in two experiments. The paid volunteers, naive as to the aims and designs of the experiments, are identified as P1-P3 (in the location experiment) and P4-P7 (in the shape experiment). LP participated in both experiments. All participants were aged around 20 and had normal or corrected-to-normal vision.
mean and a diagonal variance matrix, for every k. Let us additionally assume that (δa k ,δb k ) and (δa k′ ,δb k′ ) are independent for any k ≠ k′. It follows then that to the extent one can ignore the o-terms, every component ∆y k i of ∆y k and every component ∆x k i of ∆x k are approximately normally distributed (i = 1,…,n). Note however that (∆x k ,∆y k ) and (∆x k′ ,∆y k′ ) for k ≠ k′ generally have different means and variances, and any two components of the 4n-vector (∆x k ,∆y k ,∆x k′ ,∆y k′ ) are generally stochastically interdependent. The sequences of the ∆y k i and of the ∆x k i therefore are not generally sequences of iid variables.

Null hypotheses
The situation simplifies considerably under the null model. As shown in the Appendix, (10) then acquires the form where (δa, δa′, δb) is a 3n-vector of independent normal variates with zero means. Since the smaller the values of |∆y i | the more likely it is to correspond to small values of |δa|,|δa′|,|δb| in (12) and the better justified one is in dropping the o-terms, one should expect that for a sufficiently small ε i > 0, the values of ∆y i in the interval 0 ± ε i should be distributed symmetrically around zero; and the same should be true for ∆x i in an interval 0 ±  i . The choice of ε i and  i , for i = 1,…,n, depends on the precision needed (which in turn depends on sample size) and on the test of symmetry one chooses to use (cruder tests allow for wider intervals). Thus, ε i and  i may very well be chosen differently in the three null hypotheses we use to assess the compliance of the experiments reported below with the symmetry prediction of the null model.
[…] where m = 0,1,…,l i − 1 and ε 0 i = 0; and an analogous statement is true for ∆x i and some partition […].

H2 0 : The population mean of ∆y i -values falling between −ε i and ε i is 0; and the same is true for ∆x i between − i and  i .

H3 0 : The population median of ∆y i -values falling between −ε i and ε i is 0; and the same is true for ∆x i between − i and  i .
In order not to bias the outcomes in favor of the nulls, in the analysis of our experiments we simply put ε =  = ∞, that is, we used the entire range of data. In H1 0 , however, we could only choose narrow grouping bins […]

[…] equal to 0, we know also that they are all positive or all negative. Our goal is, however, to formulate H1 A as a simple negation of H1 0 , so that one of them has to be true within the confines of the general model.

In total, each of the four participants in the location experiment worked through 20 ping-pong series. This amounted to a total of 2000 balance points for each of y 1 , y 2 , x 1 , x 2 , yielding 1980 values for each of the corresponding first-order differences.
In the shape experiment the horizontal and vertical rotations of the trackball controlled the amplitudes A 3 (x 1 or y 1 ) and A 5 (x 2 or y 2 ), respectively. Each trial began with the two shapes appearing on the screen. One of the shapes remained the same as established at the end of the previous trial [or, in trial 1, it was at the initial value (A 3 = 14 px, A 5 = 14 px)], while the other shape at the beginning of the trial was randomly chosen as shown in Figure 6. The participant was instructed to adjust this shape until it matched the other, fixed shape, and to click the button on the trackball device when satisfied. With this click the trial ended and the two stimuli disappeared, to appear again 0.5 s later. Each series of ping-pong adjustments consisted of 110 y-adjustments (in the odd-numbered trials) and 110 x-adjustments (in the even-numbered ones), preceded by a practice series of 20 trial pairs (which was not recorded). There was one recorded series per participant per day, with a few minutes' break in the middle (after trial 110). In total, each of the five participants worked through nine ping-pong series, providing a total of 990 balance points for each of y 1 , y 2 , x 1 , x 2 and 981 values for each of the corresponding first-order differences.
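The totals quoted for the two experiments follow from simple bookkeeping, sketched below in Python. The sketch assumes, as the reported counts imply, that first-order differences are taken only within a series, so each series contributes one value fewer than it has balance points. The function name `series_totals` is introduced here for illustration only.

```python
def series_totals(n_series, adjustments_per_series):
    """Return (balance points, first-order differences) per stimulus component."""
    balance_points = n_series * adjustments_per_series
    # Differences are formed between successive balance points of the same
    # series, so every series loses exactly one value.
    differences = n_series * (adjustments_per_series - 1)
    return balance_points, differences


# Location experiment: 20 series of 100 trial pairs per participant.
location_totals = series_totals(n_series=20, adjustments_per_series=100)
# Shape experiment: 9 series of 110 trial pairs per participant.
shape_totals = series_totals(n_series=9, adjustments_per_series=110)

print(location_totals, shape_totals)
```

With the series counts given in the text, this yields 2000 and 1980 for the location experiment and 990 and 981 for the shape experiment.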

Results
The main results are presented in Figures 7-10 (location experiment) and Figures 11-15 (shape experiment). Each panel shows a histogram of first-order differences (∆'s) in one of the two components of x or y. The bins of the histograms are all 1 pixel wide (62 sec arc), but in the location experiment the ∆'s are integer numbers of

Stimuli and procedure
The stimuli used are exemplified in Figure 5 and described in its legend, together with the observation conditions. In each trial a participant changed the parameters of one of the two stimuli by rotating a trackball on which the participant rested her/his dominant hand.
In the location experiment the horizontal and vertical rotations of the trackball controlled the horizontal (x 1 or y 1 ) and vertical (x 2 or y 2 ) coordinates of one of the dots. Each trial began with the two circles with the dots appearing on the screen. In accordance with the logic of ping-pong adjustments (Figure 3), one of the dots was kept at the same location as established at the end of the previous trial [or, in trial 1, at the initial value (27 px, 16 px)], while the other dot at the beginning of the trial was at a randomly chosen location as shown in Figure 6. The participant was instructed to move this dot until its location matched that of the other, fixed dot, and to click a button on the trackball device when satisfied. With this click the trial ended and the two stimuli disappeared, to appear again 0.5 s later. Each series of ping-pong adjustments consisted of 100 trial pairs (100 y-adjustments in the odd-numbered trials and 100 x-adjustments in the even-numbered ones). There were two such series per participant per day, separated by a few minutes, each preceded by a practice series of 20 trial pairs (which was not recorded).

Figure 6 | A detailed view of the adjustment procedure in the location (left) and shape (right) experiments.
The left-hand picture shows the first quadrant of the circle in which the location of the dot is manipulated. The cross shows the location of the dot in the previous trial. Denoting its polar coordinates by (θ, r), at the beginning of the current trial the dot's location is randomly chosen according to the uniform distribution over the rectangle (θ − π/18, θ + π/18) × (r − 0.1 · radius, r + 0.1 · radius) in polar coordinates (shown by the colored area). The right-hand picture shows the space of the A 3 ,A 5 -amplitudes, |A 3 | + |A 5 | ≤ R, for the shape being adjusted. At the beginning of the current trial the values of A 3 ,A 5 (irrespective of their values in the previous trial) are randomly chosen according to the uniform distribution over the square (−0.5R, 0.5R) × (−0.5R, 0.5R). A participant could change the A 3 ,A 5 -values freely within the entire diamond-shaped area, but at any given (A 3 ,A 5 ) the rate of further change (per rotation angle of the trackball) in any of the four directions shown was proportional to the corresponding distances of (A 3 ,A 5 ) to the borders (updating quasicontinuously and ensuring thereby that the boundary could never be reached).

In both experiments the two observation areas are defined as "left" and "right." The two stimuli were displayed on a flat-panel monitor viewed (using a chin rest with forehead support) from the distance of 90 cm, making 1 screen pixel ≈ 62 sec arc. The stimuli were grayish-white on black, of a comfortably low fixed luminance, viewed in darkness. In the location experiment the stimulus values x on the left and y on the right are locations of the dots within their circles: they are measured by the horizontal and vertical Cartesian coordinates of the dots with respect to the circles' centers. The width of the circumferences and the diameter of the dots in the experiment were 5 px, the circles' radii measured 70 px, and the distance between the circles' centers was 150 px. The initial value of x in the experiment was (27 px, 16 px), corresponding to (π/6, 0.45 · radius) in polar coordinates. In the shape experiment the stimulus values x on the left and y on the right are the amplitudes A 3 and A 5 in the formula for a "floral" shape in polar coordinates: R + A 3 cos 3θ + A 5 cos 5θ, where |A 3 | + |A 5 | ≤ R. In the experiment R was 70 px, the distance between the floral shapes' centers was 300 px, and the width of the contours 5 px. The initial value of x in this experiment was A 3 = A 5 = 0.2R = 14 px.
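The stimulus geometry described in the legends can be illustrated with a short Python sketch. It is not the authors' experimental software: the function names are hypothetical, and the sketch only covers the floral contour formula, the polar-to-Cartesian conversion of the initial dot location, and the uniform starting offsets. The trackball dynamics (rate of change proportional to the distance to the border) are not modeled.

```python
import math
import random

RADIUS_PX = 70  # circle radius in the location experiment and R in the shape experiment


def polar_to_cartesian(theta, r):
    """Convert a dot location from polar to Cartesian coordinates (pixels)."""
    return r * math.cos(theta), r * math.sin(theta)


def floral_radius(theta, a3, a5, big_r=RADIUS_PX):
    """Contour radius R + A3*cos(3*theta) + A5*cos(5*theta) of the floral shape."""
    if abs(a3) + abs(a5) > big_r:
        raise ValueError("amplitudes must satisfy |A3| + |A5| <= R")
    return big_r + a3 * math.cos(3 * theta) + a5 * math.cos(5 * theta)


def random_location_start(theta, r, rng, radius=RADIUS_PX):
    """Uniform starting offset around the previous location (theta, r)."""
    new_theta = rng.uniform(theta - math.pi / 18, theta + math.pi / 18)
    new_r = rng.uniform(r - 0.1 * radius, r + 0.1 * radius)
    return new_theta, new_r


def random_shape_start(rng, big_r=RADIUS_PX):
    """Uniform starting amplitudes over the square (-0.5R, 0.5R) x (-0.5R, 0.5R)."""
    return rng.uniform(-0.5 * big_r, 0.5 * big_r), rng.uniform(-0.5 * big_r, 0.5 * big_r)


rng = random.Random(0)  # seeded only to make the illustration repeatable
initial_xy = polar_to_cartesian(math.pi / 6, 0.45 * RADIUS_PX)
start_theta, start_r = random_location_start(math.pi / 6, 0.45 * RADIUS_PX, rng)
start_a3, start_a5 = random_shape_start(rng)
```

Rounded to whole pixels, `initial_xy` agrees with the initial value (27 px, 16 px) quoted in the legend, and any starting pair from `random_shape_start` lies inside the diamond |A 3 | + |A 5 | ≤ R because each amplitude is at most 0.5R in absolute value.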

Dzhafarov and Perry
Matching by adjustment

Figure 7 | Histograms of the first-order differences (∆'s) for the location experiment, participant LP.
The insets show the time series of the matching adjustments from which the ∆'s were computed. Each panel contains the mean and the median of the corresponding ∆ (in sec arc), with the p-values for the hypotheses that the population mean and median are 0, as well as the χ 2 (df = 9) and the p-value for the symmetry test described in the text.

pixels (so the 1-pixel-wide bins are quasicontinuous representations of their integer centers), while in the shape experiment the ∆'s are grouped into the intervals between successive integers. The insets show the time series of the matching adjustments from which the ∆'s were computed: the abscissa of the inset shows successive trials in which the adjustments are made (1, 3, 5,… for the right adjustments and 2, 4, 6,… for the left ones), and the ordinate axis of the inset corresponds to the abscissa of the histogram. Each panel shows the results of three tests: (H1 0 ) that the histogram of ∆'s is symmetric around 0 (against the generic alternative); (H2 0 ) that the expected value of ∆ is 0 (against the two-directional alternative); and (H3 0 ) that the median ∆ in the population is 0 (i.e., that Pr[∆ > 0] + Pr[∆ = 0]/2 = 1/2, against ≠ 1/2).

[…] for the location experiment and the shape experiment, respectively. Note that the frequency of ∆'s in the intervals −9 and 9 was very small in the location experiment, which, combined with the fact that 8 pixels (≈492 sec arc) seems a good candidate for the notion of being "small," was the reason for choosing this range for a "detailed view." For uniformity, we used the same range for the shape experiment, although the frequency of ∆'s in the intervals −9 and 9 was not small for participants P5 and, especially, P4.

The test for the means was the standard t-test with the test statistic

t = (mean ∆) / (st. err. ∆).

The test for the medians was the χ 2 (df = 1) test with the test statistic […]. The symmetry test was the χ 2 (df = 9) test with the test statistic […].
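As a minimal sketch of the quantities entering the tests for the mean and the median, the Python code below computes first-order differences from a series of balance points, the t-statistic (mean ∆ divided by its standard error), and the sample counterpart of Pr[∆ > 0] + Pr[∆ = 0]/2 that H3 0 sets equal to 1/2. The function names and the toy data are assumptions made for illustration; the χ 2 statistics themselves are not reproduced, and the p-values would still have to be obtained from the corresponding reference distributions.

```python
import math
import statistics


def first_order_differences(balance_points):
    """Differences between successive balance points of one series."""
    return [b - a for a, b in zip(balance_points, balance_points[1:])]


def t_statistic(deltas):
    """Mean of the differences divided by its standard error.

    Needs at least two values and a nonzero sample standard deviation.
    """
    mean = statistics.fmean(deltas)
    std_err = statistics.stdev(deltas) / math.sqrt(len(deltas))
    return mean / std_err


def median_balance(deltas):
    """Sample estimate of Pr[delta > 0] + Pr[delta = 0] / 2."""
    n_pos = sum(1 for d in deltas if d > 0)
    n_zero = sum(1 for d in deltas if d == 0)
    return (n_pos + n_zero / 2) / len(deltas)


# Toy series of balance points in pixels (made up, not experimental data).
toy_series = [10, 12, 11, 11, 13, 9]
toy_deltas = first_order_differences(toy_series)
print(toy_deltas, t_statistic(toy_deltas), median_balance(toy_deltas))
```

With several series per participant, the differences would be computed series by series and then pooled, so that no difference spans the boundary between two series.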

Discussion
There are obvious individual differences in the patterns of the time series for balance points (the insets of the graphs). Our goal, however, is confined to their single feature: the lack or presence of a systematic trend, as revealed by the analysis of the first-order differences. In assessing the results, note that the choice of the significance level for a test (the alpha below which a p-value is taken to reject the null hypothesis) is dubious when one deals with multiple tests: the computation of alpha depends on one's subjective decision on how the different tests should be grouped. Setting the alpha for a given test for a given condition for a given participant in a given experiment at 0.05 means that the Type I error probability for 12 generally interdependent tests per participant per experiment (3 tests × 4 ∆'s) is anywhere between 0.05 and 0.6, making the overall Type I error probability across all tests for all conditions and all participants be anywhere between 0.19 and 0.97 for the location experiment, between 0.23 and 0.99 for the shape experiment, and between 0.37 and 1.0 if the two experiments are combined. The formula for these calculations is

1 − (1 − α)^p ≤ Pr[at least one Type I error] ≤ 1 − (1 − kα)^p,

where k is the number of tests per participant per experiment (in our case 12) and p is the number of independent applications of these k tests (four in the location experiment and five in the shape experiment), the tests for different participants × experiments being considered stochastically independent. To fix the lower boundary for the overall Type I error probability at 0.05 one needs to set the alpha for a given test × condition × participant × experiment at 0.013 for the location experiment, at 0.010 for the shape experiment, and at 0.006 if the two experiments are combined. Rounding these figures to the conventional ones, we are justified in comparing the p-values in our tests to 0.05 and 0.01. The results are summarized in Table 1.
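The bounds quoted in this paragraph can be checked numerically. The Python sketch below assumes that the lower bound corresponds to perfectly dependent tests within a participant (per-participant error α), that the upper bound is the Bonferroni-type value kα capped at 1, and that the p participant-level applications are independent. This form is an assumption that reproduces the quoted figures; the function names are introduced here for illustration.

```python
def overall_type1_bounds(alpha, k, p):
    """Lower and upper bounds on the probability of at least one Type I error."""
    per_participant_low = alpha                 # k tests perfectly dependent
    per_participant_high = min(1.0, k * alpha)  # Bonferroni-type upper bound
    low = 1 - (1 - per_participant_low) ** p
    high = 1 - (1 - per_participant_high) ** p
    return low, high


def alpha_for_lower_bound(target, p):
    """Per-test alpha that puts the lower bound of the overall error at target."""
    return 1 - (1 - target) ** (1 / p)


# p = 4 (location), 5 (shape), 9 (both experiments combined); k = 12 tests each.
for label, p in [("location", 4), ("shape", 5), ("combined", 9)]:
    low, high = overall_type1_bounds(alpha=0.05, k=12, p=p)
    print(label, round(low, 2), round(high, 2), round(alpha_for_lower_bound(0.05, p), 3))
```

Under these assumptions the loop gives bounds that round to 0.19 and 0.97, 0.23 and 0.99, and 0.37 and 1.0, with per-test alphas that round to 0.013, 0.010, and 0.006, matching the values in the text.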
The conclusions one can derive from the location experiment are unequivocal. At α = 0.05 the null hypothesis is rejected in none of the 48 tests presented in the 16 panels of Figures 7-10 (although the probability of a rejection happening by chance, with all nulls true, is greater than 0.19). Equally important is that the values of the mean and the median are obviously very small (note that a single screen pixel measured 62 sec arc). The matching regularity hypothesis can be upheld for locations with very high confidence.
For the shape experiment none of the 60 tests presented in the 20 panels of Figures 11-15 rejects the null hypothesis at α = 0.01 (with the overall probability of Type I error exceeding 0.05). The hypothesis that the population means are 0 is not rejected at α = 0.05, and the mean ∆'s are very small. However, in one case out of 60 (Figure 14, right A 5 ) the distribution's symmetry is rejected at α = 0.05, and the hypothesis that the population median is 0 is rejected at α = 0.05 in four out of 60 cases (right A 5 in Figure 12, left A 5 in Figure 14, left A 3 and right A 5 in Figure 15). Still, the logic of our tests leads us to conclude that for the shape experiment, too, there is little if any evidence against the null model of Sections 2.3 and 3.2. Note that there are no figure panels where we see a rejection occurring at α = 0.05 in more than one of the three tests. The occasional rejections can therefore be assumed to be Type I errors (whose probability in the shape experiment exceeds 0.23). Moreover, even if the rejected null hypotheses are indeed false, it is still possible (and probable, in view of the rest of the data) that these were the cases when the error terms were not sufficiently small to warrant dropping the o-terms in (12).

Conclusion
Since the symmetry of matching (MS of Section 2.1) is a "natural" proposition, firmly built into our colloquial language as well as into the language and practice of psychophysics, it seems a reasonable scientific strategy to dismiss it only if the evidence against it is compelling. We have shown that in the matching-by-adjustment paradigm, with a reasonable definition of the PSE functions satisfying MF of Section 2.1, there is no empirical evidence against MS: y matches x if and only if x matches y.
Our paper does not, however, provide an algorithm for computing the precise matches for x and y from the distributions of the balance points Y x and X y , respectively. Rather, to the extent the use of the linear part of (12) is justifiable, our null model upholds the traditional textbook recommendation, usually confined to unidimensional stimuli (see, e.g., Gescheider, 1985, p. 54): approximate the distribution of within-trial matches to a given stimulus by a normal distribution and take its mean as the (approximate) PSE for this stimulus. It is also common to advise (ibid.) that if the distribution is not normal, a transformation may be applied first to make it normal. Our general model (Section 2.2) suggests a multidimensional version of the advice in question: transform the distribution of within-trial matches to a given stimulus into a normal distribution with uncorrelated components and then take its mean as the PSE for this stimulus. Glossing over statistical issues, this procedure provides "direct access" to the variables a,b of (7), modulo linear transformations inconsequential for the analysis, thereby making the use of linear approximations unnecessary. Note that such transformations need not exist: thus, in the unidimensional case, no diffeomorphism would translate three random variables, of which the first two are normal with distinct means and the third one is not normal, into three normal variates (Ghosh, 1969). An empirical demonstration that the transformation postulated in the general model does not exist would not necessarily falsify MS, but one would then have to seek other ways of operationalizing and testing it.