Central Limit Theorem for Linear Eigenvalue Statistics for Submatrices of Wigner Random Matrices

We prove the Central Limit Theorem for finite-dimensional vectors of linear eigenvalue statistics of submatrices of Wigner random matrices under the assumption that test functions are sufficiently smooth. We connect the asymptotic covariance to a family of correlated Gaussian Free Fields.


INTRODUCTION
Wigner random matrices were introduced by Wigner in the 1950s (see e.g., [1][2][3]) to study energy levels of heavy nuclei. Let $\{W_{jj}\}_{j=1}^n$ and $\{W_{jk}\}_{1\le j<k\le n}$ be two independent families of independent and identically distributed real-valued random variables satisfying

$E W_{jk} = 0, \quad E|W_{jk}|^2 = 1 \ \text{for } j < k, \quad E[W_{jj}^2] = \sigma^2.$ (1.1)

Set $W = (W_{jk})_{j,k=1}^n$ with $W_{jk} = W_{kj}$. The Wigner ensemble of normalized real symmetric $n \times n$ matrices consists of matrices $M$ of the form

$M = n^{-1/2} W.$ (1.2)

The archetypal example of a Wigner real symmetric random matrix is the Gaussian Orthogonal Ensemble (GOE), defined as [3]

$A = \frac{1}{2}(B + B^t),$ (1.3)

where the entries of $B$ are i.i.d. real Gaussian random variables with zero mean and variance 1/2. Wigner Hermitian random matrices are defined in a similar fashion. Specifically, we assume that $\{W_{jj}\}_{j=1}^n$ and $\{W_{jk}\}_{1\le j<k\le n}$ are two independent families of independent and identically distributed real, respectively complex, random variables satisfying (1.1). The archetypal example of a Wigner Hermitian random matrix is the Gaussian Unitary Ensemble (GUE),

$A = \frac{1}{2}(B + B^*),$ (1.4)

where the entries of $B$ are i.i.d. complex standard Gaussian random variables [3].

Over the last sixty years, Random Matrix Theory has developed many exciting connections to Quantum Chaos [4], Quantum Gravity [5], Mesoscopic Physics [6], Numerical Analysis [7], Theoretical Neuroscience [8], Optimal Control [9], Number Theory [10], Integrable Systems [11], Combinatorics [12], Random Growth Models [13], Multivariate Statistics [14], and many other fields of Science and Engineering.
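As a quick numerical sanity check (not part of the paper's argument), one can sample a matrix from the ensemble (1.1)-(1.2) with $\sigma^2 = 2$ and verify that the empirical spectral distribution behaves like the semicircle law on $[-2, 2]$. The sketch below is ours; the size $n$ and the seed are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000
sigma2 = 2.0  # diagonal variance; sigma^2 = 2 corresponds to the GOE normalization

# Sample W as in (1.1): symmetric, off-diagonal variance 1, diagonal variance sigma^2.
A = rng.standard_normal((n, n))
W = np.triu(A, 1) + np.triu(A, 1).T
W[np.diag_indices(n)] = np.sqrt(sigma2) * rng.standard_normal(n)

M = W / np.sqrt(n)  # normalization (1.2): M = n^{-1/2} W
eigs = np.linalg.eigvalsh(M)

# For the semicircle law on [-2, 2] the second moment equals 1, and for large n
# essentially all eigenvalues lie inside (a small neighborhood of) [-2, 2].
m2 = float(np.mean(eigs ** 2))
frac_in_support = float(np.mean(np.abs(eigs) <= 2.1))
```

With $n = 1000$ the second moment is already within a few percent of 1.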
The rest of the paper is organized as follows. We formulate our results in section 2. Theorem 2.1 is proved in section 3. Theorem 2.2 is proved in section 4. Auxiliary results are discussed in the Appendices.
Research of the last author has been partially supported by the Simons Foundation Collaboration Grant for Mathematicians # 312391.

STATEMENT OF MAIN RESULTS
This section is devoted to the formulation of the main results of the paper.

For $d \ge 1$, let $B_1, \dots, B_d$ be infinite subsets of $\mathbb{N}$. For each $n$, set $B^n_l = B_l \cap \{1, \dots, n\}$ and

$n_l = |B^n_l|, \quad 1 \le l \le d,$ (2.3)

$n_{lm} = |B^n_l \cap B^n_m|, \quad 1 \le l \le m \le d.$ (2.4)

We assume that the following limits exist:

$\gamma_l := \lim_{n\to\infty} \frac{n_l}{n} > 0, \quad \gamma_{lm} := \lim_{n\to\infty} \frac{n_{lm}}{n}, \quad 1 \le l \le m \le d.$ (2.5)

If it does not lead to ambiguity, we will omit the superindex $n$ in the notation for $B^n_i$, $1 \le i \le d$.

For an $n \times n$ matrix $M$ and $B \subset \{1, 2, \dots, n\}$, consider a spectral linear statistic $\sum_{l=1}^{|B|} \varphi(\lambda_l)$, where $\{\lambda_l\}_{l=1}^{|B|}$ are the eigenvalues of the submatrix $M(B)$. We are going to study the joint fluctuations of linear statistics of the eigenvalues.

It will be beneficial later to view the submatrices from a different perspective. Consider the diagonal matrix $P_B = \mathrm{diag}((P_B)_{jj})$, with $(P_B)_{jj} = 1$ for $j \in B$ and $(P_B)_{jj} = 0$ otherwise, which projects onto the subspace corresponding to the indices in $B$, (2.6) and set $M_B = P_B M P_B$, (2.7) so that

$\sum_{l=1}^{n} \varphi(\lambda^B_l) = \mathrm{Tr}\,\varphi(M_B),$ (2.8)

where $\{\lambda^B_l\}_{l=1}^{n}$ are the eigenvalues of $M_B$. Note that the spectra of $M_B$ and $M(B)$ differ only by a zero eigenvalue of multiplicity $n - |B|$. As a result, when we consider the linear statistics of their eigenvalues, the extra terms $(n - |B|)\varphi(0)$ cancel once we center these random variables. In general, when considering multiple sequences $B_l$, in order to simplify the notation we will write $P^{(l)} = P_{B_l}$ and $M^{(l)} = P^{(l)} M P^{(l)}$. (2.9) Also, denote by $P^{(l,r)}$ the matrix which projects onto the subspace corresponding to the indices in the intersection $B_l \cap B_r$, i.e.,

$P^{(l,r)} = P^{(l)} P^{(r)} = P^{(r)} P^{(l)}.$ (2.10)

Recall that a test function $\varphi : \mathbb{R} \to \mathbb{R}$ belongs to the Sobolev space $H^s$ if

$\|\varphi\|_s^2 := \int_{\mathbb{R}} (1 + |t|)^{2s}\, |\widehat{\varphi}(t)|^2\, dt < \infty,$

where $\widehat{\varphi}$ is its Fourier transform. First we consider Gaussian Wigner matrices.
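The relation between the spectra of $M_B$ and $M(B)$, and the resulting cancellation of the $(n - |B|)\varphi(0)$ terms, can be illustrated numerically. The sketch below is ours: it uses an arbitrary symmetric matrix, an arbitrary index set $B$ (0-based for the code), and a polynomial test function.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 8
B = [0, 2, 5]                        # a subset of indices (0-based in this sketch)

A = rng.standard_normal((n, n))
M = (A + A.T) / (2.0 * np.sqrt(n))   # any real symmetric matrix works here

P_B = np.zeros((n, n))
P_B[B, B] = 1.0                      # projection onto the coordinates in B, as in (2.6)
M_B = P_B @ M @ P_B                  # the n x n matrix M_B = P_B M P_B, as in (2.7)
M_sub = M[np.ix_(B, B)]              # the |B| x |B| submatrix M(B)

def phi(x):
    return x ** 2 + x + 1.0          # a polynomial test function

# Tr phi(M_B) versus the linear statistic of M(B): the two spectra differ
# exactly by a zero eigenvalue of multiplicity n - |B|, as in (2.8).
lhs = float(np.sum(phi(np.linalg.eigvalsh(M_B))))
rhs = float(np.sum(phi(np.linalg.eigvalsh(M_sub))) + (n - len(B)) * phi(0.0))
```

The two quantities agree to machine precision, so after centering, the extra $(n - |B|)\varphi(0)$ terms indeed cancel.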
In the expression for the covariance, $(\varphi_l)_k$ denotes the $k$-th coefficient in the expansion of $\varphi_l$ in the (rescaled) Chebyshev basis, i.e.,

$(\varphi_l)_k = \frac{1}{2\pi\gamma_l} \int_{-2\sqrt{\gamma_l}}^{2\sqrt{\gamma_l}} \varphi_l(x)\, U_k\Big(\frac{x}{2\sqrt{\gamma_l}}\Big) \sqrt{4\gamma_l - x^2}\, dx,$

where $U_k$ is the $k$-th Chebyshev polynomial of the second kind. Note the form of the kernel in the above contour integral expression for the covariance. Since it is (an appropriately scaled version of) the Green's function for the Laplacian on the upper half-plane $\mathbb{H}$ with Dirichlet boundary conditions, the limiting distributions form a family of correlated Gaussian free fields. This is consistent with the previous work of Borodin [26, 27] on the covariance of linear eigenvalue statistics corresponding to polynomial test functions. Now we formulate our result for non-Gaussian Wigner matrices.
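The orthonormality underlying this expansion can be checked numerically. The sketch below is ours: it uses the semicircle-weighted inner product stated above, the substitution $x = 2\sqrt{\gamma}\cos\theta$ (under which $U_k(\cos\theta) = \sin((k+1)\theta)/\sin\theta$), and the test function $\varphi(x) = x^3$, for which $x^3 = 2\gamma^{3/2} U_1(x/(2\sqrt{\gamma})) + \gamma^{3/2} U_3(x/(2\sqrt{\gamma}))$.

```python
import numpy as np

gamma = 0.6
theta = np.linspace(0.0, np.pi, 20001)

def chebyu(j, y):
    # Chebyshev polynomials of the second kind via the three-term recurrence.
    u0, u1 = np.ones_like(y), 2.0 * y
    if j == 0:
        return u0
    for _ in range(j - 1):
        u0, u1 = u1, 2.0 * y * u1 - u0
    return u1

def trap(f, t):
    # simple trapezoid rule
    return float(np.sum((f[:-1] + f[1:]) * np.diff(t)) / 2.0)

# After x = 2*sqrt(gamma)*cos(theta), the inner product
# (1/(2*pi*gamma)) \int U_j U_k sqrt(4*gamma - x^2) dx
# over [-2 sqrt(gamma), 2 sqrt(gamma)] becomes
# (2/pi) \int_0^pi U_j(cos t) U_k(cos t) sin(t)^2 dt.
c = np.cos(theta)
s2 = np.sin(theta) ** 2

def inner(j, k):
    return (2.0 / np.pi) * trap(chebyu(j, c) * chebyu(k, c) * s2, theta)

gram = np.array([[inner(j, k) for k in range(4)] for j in range(4)])

# Coefficients of phi(x) = x^3: expect (phi)_1 = 2 gamma^{3/2},
# (phi)_3 = gamma^{3/2}, and (phi)_0 = (phi)_2 = 0.
x = 2.0 * np.sqrt(gamma) * c
coeffs = [(2.0 / np.pi) * trap((x ** 3) * chebyu(k, c) * s2, theta) for k in range(4)]
```

The Gram matrix comes out as the identity, confirming the orthonormality relation (2.21).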
Theorem 2.2. Let $W = (W_{jk})_{j,k=1}^n$ be an $n \times n$ random matrix and $M = n^{-1/2} W$. Let $B_1, \dots, B_d$ be infinite subsets of $\mathbb{N}$ satisfying (2.2)-(2.4) and (2.5). Assume the following conditions:

(1) All the entries of $W$ are independent random variables.
(2) The fourth moment of the non-zero off-diagonal entries does not depend on $n$.
(3) There exists a constant $\sigma_6$ such that $E\{|W_{jk}|^6\} < \sigma_6$ for all $j, k$.

Let $\varphi_1, \dots, \varphi_d : \mathbb{R} \to \mathbb{R}$ be test functions satisfying the regularity condition $\|\varphi_l\|_s < \infty$ for some $s > 5.5$. Then the random vector (2.12) converges in distribution to the zero-mean Gaussian vector $(G_1, \dots, G_d) \in \mathbb{R}^d$ with covariance $\mathrm{Cov}(G_l, G_p)$ given by (2.13).
In the course of the proof of Theorem 2.1, it is necessary to understand the following bilinear form.

Definition 2.3.
Let $M$ be a Wigner matrix satisfying (1.1), and let $P^{(l)}$, $P^{(l,r)}$ be the projection matrices defined in (2.6) and (2.10). For test functions $f$, $g$, define

$\langle f, g \rangle_{lr} := \lim_{n\to\infty} \frac{1}{n}\, E\,\mathrm{Tr}\{ P^{(l)} f(M^{(l)})\, P^{(l,r)}\, g(M^{(r)})\, P^{(r)} \},$ (2.17)

whenever the limit exists.

Remark 2.4. The bilinear form $\langle \cdot, \cdot \rangle_{lr}$ is well defined on $H^s \times H^s$ as a consequence of Proposition 3.9. The bilinear form is also well defined for polynomial $f$ and $g$; see section 3.2 and also Lemma 2.5 below.
The following diagonalization lemma is an important technical tool for the proof of Theorem 2.1.
In section 3.2, it will also be proved that, with $f$, $g$ given as above, almost surely

$\lim_{n\to\infty} \frac{1}{n}\,\mathrm{Tr}\{ P^{(l)} f(M^{(l)})\, P^{(l,r)}\, g(M^{(r)})\, P^{(r)} \} = \langle f, g \rangle_{lr}.$ (2.20)

Remark 2.6. Recall that the rescaled Chebyshev polynomials of the second kind are orthonormal with respect to the (rescaled) Wigner semicircle law, i.e.,

$\frac{1}{2\pi\gamma} \int_{-2\sqrt{\gamma}}^{2\sqrt{\gamma}} U_j\Big(\frac{x}{2\sqrt{\gamma}}\Big)\, U_k\Big(\frac{x}{2\sqrt{\gamma}}\Big) \sqrt{4\gamma - x^2}\, dx = \delta_{jk}.$ (2.21)

The proof of Theorem 2.1 appears in section 3 and the proof of Theorem 2.2 appears in section 4.
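The almost-sure convergence in (2.20) says that the normalized traces concentrate around a deterministic limit. A small Monte Carlo sketch (ours; monomial test functions $f(x) = g(x) = x^2$ and arbitrary index sets with $\gamma_l \approx \gamma_r \approx 2/3$ and overlap $\gamma_{lr} \approx 1/3$) illustrates this by comparing two independent samples:

```python
import numpy as np

def bilinear_stat(n, seed, mask_l, mask_r, k=2, q=2):
    """(1/n) Tr{ P^(l) f(M^(l)) P^(l,r) g(M^(r)) P^(r) } for f(x)=x^k, g(x)=x^q."""
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((n, n))
    W = np.triu(A, 1) + np.triu(A, 1).T
    W[np.diag_indices(n)] = np.sqrt(2.0) * rng.standard_normal(n)
    M = W / np.sqrt(n)
    Pl = np.diag(mask_l.astype(float))
    Pr = np.diag(mask_r.astype(float))
    Ml = Pl @ M @ Pl
    Mr = Pr @ M @ Pr
    X = Pl @ np.linalg.matrix_power(Ml, k) @ (Pl @ Pr) @ np.linalg.matrix_power(Mr, q) @ Pr
    return float(np.trace(X)) / n

n = 600
idx = np.arange(n)
mask_l = idx < 400        # gamma_l ~ 2/3
mask_r = idx >= 200       # gamma_r ~ 2/3, overlap gamma_lr ~ 1/3

v1 = bilinear_stat(n, 10, mask_l, mask_r)
v2 = bilinear_stat(n, 11, mask_l, mask_r)
```

Two independent draws of size $n = 600$ already agree to a couple of decimal places, consistent with convergence to a non-random limit.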
Remark 2.7. Theorems 2.1 and 2.2 prove convergence of finite-dimensional distributions. This paper does not address the functional convergence, which would require a tightness result.

Stein-Tikhomirov Method
We follow the approach used by Lytova and Pastur [21] for the full Wigner matrix case (see also [28][29][30]). Essentially, it is a modification of the Stein-Tikhomirov method (see e.g., [31]). This approach was also used to prove the CLT for linear eigenvalue statistics of band random matrices in Li and Soshnikov [24], which is connected to our work through the Chu-Vandermonde identity (see section 3.2). While several steps of our proof are similar to the ones in Lytova and Pastur [21], the fact that we are dealing with submatrices introduces new technical difficulties.
We will prove Theorem 2.1 in the present section and extend the technique to non-Gaussian Wigner matrices later. The following inequalities will be used often. As a consequence of the Poincaré inequality, one can bound from above the variance of $\mathrm{Tr}\,\varphi(M)$ for a differentiable test function $\varphi$; we refer the reader to Lytova and Pastur [21] for the details. The next inequality is due to Shcherbina (see [22]). Let $s > 3/2$ and $\varphi \in H^s$. Then there is a constant $C_s > 0$ such that

$\mathrm{Var}\{\mathrm{Tr}\,\varphi(M)\} \le C_s \|\varphi\|_s^2.$ (3.3)

Let $\epsilon > 0$ and set $s = \frac{5}{2} + \epsilon$. Recall that the regularity assumption on the test functions is that $\|\varphi_l\|_{5/2+\epsilon} < \infty$ for $1 \le l \le d$. There exists a $C_\epsilon > 0$ so that

$\mathrm{Var}\{\mathrm{Tr}\,\varphi_l(M(B_l))\} \le C_\epsilon \|\varphi_l\|_{5/2+\epsilon}^2.$

The inequality holds because of (3.3), since $M(B_l)$ is an ordinary $|B_l| \times |B_l|$ Gaussian Wigner matrix. We note that the bound is $n$-independent.
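The $n$-independence of such variance bounds can be seen explicitly for $\varphi(x) = x^2$: writing $\mathrm{Tr}\,M^2 = n^{-1}\sum_{j,k} W_{jk}^2$, a direct computation (ours, for the Gaussian ensemble with $\sigma^2 = 2$) gives $\mathrm{Var}\{\mathrm{Tr}\,M^2\} = 4 + 4/n$, which stays bounded as $n$ grows. The Monte Carlo sketch below confirms this:

```python
import numpy as np

def var_trace_phi(n, n_samples, seed):
    """Sample variance of Tr M^2 over GOE-type Wigner matrices M = W / sqrt(n)."""
    rng = np.random.default_rng(seed)
    vals = np.empty(n_samples)
    for i in range(n_samples):
        A = rng.standard_normal((n, n))
        W = np.triu(A, 1) + np.triu(A, 1).T
        W[np.diag_indices(n)] = np.sqrt(2.0) * rng.standard_normal(n)
        M = W / np.sqrt(n)
        vals[i] = np.sum(M * M)      # Tr M^2, computed without diagonalizing
    return float(np.var(vals))

# Exact value for this phi is 4 + 4/n, so the variance does not grow with n.
v_small = var_trace_phi(50, 300, 2)
v_large = var_trace_phi(200, 300, 3)
```

Both sample variances hover around 4, independent of the matrix size, in contrast to the $O(n)$ growth one would see for sums of $n$ i.i.d. terms.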
It is sufficient to prove the CLT for all linear combinations of the components of the random vector (2.12). Consider a linear combination $\xi := \sum_{l=1}^d \alpha_l N^{(l)\bullet}[\varphi_l]$, and denote its characteristic function by

$Z_n(x) := E\{ e^{ix\xi} \}.$ (3.5)

It is a basic fact that the characteristic function of the centered Gaussian distribution with variance $V$ is given by

$Z(x) = e^{-x^2 V/2}.$ (3.6)

As a consequence of the Lévy continuity theorem, to prove Theorem 2.1 it will be sufficient to demonstrate that for each $x \in \mathbb{R}$,

$\lim_{n\to\infty} Z_n(x) = Z(x),$ (3.7)

where $Z(x)$ is given as above with

$V = \sum_{l,r=1}^d \alpha_l \alpha_r\, \mathrm{Cov}(G_l, G_r).$ (3.8)

So $V$ is the limiting variance of $\xi$. It will be demonstrated that $Z_n(x)$ converges uniformly to the solution of the equation

$Z'(x) = -xV\, Z(x), \qquad Z(0) = 1.$ (3.9)

Note that (3.6) is the unique solution of (3.9) within the class of bounded and continuous functions. Therefore, to prove the theorem, it is sufficient to demonstrate that the pointwise limit of $Z_n(x)$ is a continuous and bounded function which satisfies equation (3.9), with $V$ given by (3.8).
Observe that $Z_n'(x) = i\, E\{ \xi\, e^{ix\xi} \}$. Since $Z_n(0) = 1$, we have by the fundamental theorem of calculus that

$Z_n(x) = 1 + \int_0^x Z_n'(y)\, dy.$ (3.14)

A pre-compactness argument based on the Arzelà-Ascoli theorem will be developed below, which ensures that subsequences converge uniformly, implying that the limit is a continuous function. The estimate $|Z_n(x)| \le 1$, valid for all $n$, shows that the sequence is uniformly bounded. Generally we will abuse the subsequence notation by writing $\{n\}$ for a uniformly converging subsequence. Since (3.11) combined with $\|\varphi_l\|_{5/2+\epsilon} < \infty$ justifies an application of the dominated convergence theorem in (3.12), it follows from (3.13) and (3.14) that the limit of $Z_n(x)$ satisfies equation (3.9). Therefore the pointwise limit (3.7) holds.

We turn our attention to the pre-compactness argument, and will argue later that (3.13) and (3.14) hold. We follow the notation used in Lytova and Pastur [21]. Denote

$U^{(l)}(t) := e^{itM^{(l)}}, \quad u^{(l)}_n(t) := \mathrm{Tr}\, U^{(l)}(t), \quad e_n(x) := e^{ix\xi},$

and for a random variable $\eta$ write $\eta^{\bullet} := \eta - E\eta$ for its centered version. Recall that $U^{(l)}(t)$ is a unitary matrix, and, writing $\beta_{jk} := (1 + \delta_{jk})^{-1}$, we have the differentiation formulas (3.19)-(3.20) for the entries of $U^{(l)}(t)$ with respect to the matrix entries. Applying the Fourier inversion formula (3.21), the linear eigenvalue statistic can be written as

$\mathrm{Tr}\,\varphi_l(M^{(l)}) = \int \widehat{\varphi}_l(t)\, u^{(l)}_n(t)\, dt.$ (3.22)

Using the Fourier representation (3.22) of the linear eigenvalue statistics in (3.10), it follows that

$Z_n'(x) = i \sum_{l=1}^d \alpha_l \int \widehat{\varphi}_l(t)\, Y^{(l)}_n(x, t)\, dt,$ (3.24)

where

$Y^{(l)}_n(x, t) := E\{ u^{(l)\bullet}_n(t)\, e_n(x) \}.$ (3.25)

The limit of $Y^{(l)}_n(x, t)$ is determined later in the proof. Since $u^{(l)}_n(-t) = \overline{u^{(l)}_n(t)}$, we need only consider $t \ge 0$. It will now be demonstrated that each sequence $\{Y^{(l)}_n\}$ is bounded and equicontinuous on compact subsets of $\{x \in \mathbb{R},\ t \ge 0\}$, and that every uniformly converging subsequence has the same limit $Y^{(l)}$, implying (3.13) and (3.14). See Proposition 3.1.

Frontiers in Applied Mathematics and Statistics | www.frontiersin.org
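The Fourier inversion step, writing $\varphi(M) = \int \widehat{\varphi}(t)\, e^{itM}\, dt$, can be sanity-checked numerically. The sketch below is ours; it uses $\varphi(x) = e^{-x^2/2}$, whose Fourier transform in this convention is $\widehat{\varphi}(t) = e^{-t^2/2}/\sqrt{2\pi}$, and compares the spectral-theorem evaluation of $\varphi(M)$ with a numerical Fourier integral.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 6
A = rng.standard_normal((n, n))
M = (A + A.T) / (2.0 * np.sqrt(n))    # a small real symmetric matrix

# phi(x) = exp(-x^2/2); in the convention phi(x) = \int hat_phi(t) e^{itx} dt
# one has hat_phi(t) = exp(-t^2/2) / sqrt(2*pi).
t = np.linspace(-12.0, 12.0, 4001)
dt = t[1] - t[0]
hat_phi = np.exp(-t ** 2 / 2.0) / np.sqrt(2.0 * np.pi)

evals, Q = np.linalg.eigh(M)

# Left side: phi(M) via the spectral theorem.
phi_M = (Q * np.exp(-evals ** 2 / 2.0)) @ Q.T

# Right side: \int hat_phi(t) e^{it lambda_j} dt on each eigenvalue (Riemann sum),
# then rotated back with the eigenvectors.
osc = np.exp(1j * np.outer(t, evals))               # shape (len(t), n)
phi_diag = np.real((hat_phi[:, None] * osc).sum(axis=0) * dt)
phi_M_fourier = (Q * phi_diag) @ Q.T

err = float(np.max(np.abs(phi_M - phi_M_fourier)))
```

The two matrices agree to high accuracy, since the Gaussian decay of $\widehat{\varphi}$ makes the truncated integral converge rapidly.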
Proposition 3.1. In order to prove that the functions $Y^{(l)}_n(x, t)$ converge uniformly to appropriate limits, so that (3.24) implies (3.14), it is sufficient to prove the convergence of $Y^{(l)}_n(x, t)$ on arbitrary compact subsets of $\{x \in \mathbb{R},\ t \ge 0\}$.
Proof: Let $\delta > 0$. Recall that the regularity assumption on the test functions is that $\varphi_l \in H^s$, with $s = 5/2 + \epsilon$. Using the Cauchy-Schwarz inequality, it follows that the integral in (3.33) is finite. A consequence of the finiteness of the integral in (3.33) is that, for each $1 \le l \le d$, there exists a $T > 0$ such that the contribution from $|t| > T$ is smaller than $\delta$, uniformly in $n$. Using (3.24), we can write the corresponding decomposition of $Z_n'(x)$. Notice that the resulting estimate (3.36) is $n$-independent, so that in particular the estimate holds in the limit $n \to \infty$. Since $\delta$ was arbitrary, this completes the proof of the proposition.
This completes the pre-compactness argument, which allows us to pass to the limit in (3.24) and in (3.12), and to conclude that $Z_n(x)$ converges pointwise to the unique solution of equation (3.9) belonging to $C_b(\mathbb{R})$, implying (3.7), and hence the conclusion of the theorem.

Now we show that the limiting behavior of the sequences $Y^{(l)}_n(x, t)$ implies (3.13) and (3.14). Consider the Duhamel-type identity (3.37). Applying this identity, with $M^{(l)} = P^{(l)} M P^{(l)}$, together with the decoupling formula for Gaussian random variables (see Appendix 1), it follows from (3.37) that (3.38) holds. It will be useful to rewrite (3.38) as (3.39). The reason for the rewrite is that it splits the functions $Y^{(l)}_n(x, t)$ into a part that depends on the distribution of the diagonal entries and a part that corresponds to the same term as for the Gaussian Orthogonal Ensemble, for which $\sigma^2 = 2$. Recalling that $e_n(x)$ is given by (3.23), again writing $\beta_{jk} = (1 + \delta_{jk})^{-1}$, and using the differentiation identity for $U^{(l)}(t)$, a direct calculation yields (3.40). Then, for $1 \le l \le d$, using (3.40) and (3.19), we obtain the decomposition into the terms $T_1$ and $T_2$. Using the semigroup property, it follows from (3.41) that $T_1$ can be written in the form (3.43). The following proposition presents the functions $Y^{(l)}_n(x, t)$ in a form that is amenable to asymptotic analysis.

Proposition 3.2. The equation $Y^{(l)}_n(x, t) = T_1 + T_2$ can be rewritten in the form (3.45), with coefficients (3.46)-(3.47) and remainder terms (3.48)-(3.52).

Proof: Begin with the term $T_{11}$, defined in (3.43). Splitting it as in (3.55)-(3.56), the term (3.55) goes into the remainder, where it becomes (3.49), while (3.56) is added to the left-hand side of (3.45). Now consider the term $T_{12}$, defined in (3.43); it becomes (3.48) in the remainder. Consider next the term $T_{13}$, also defined in (3.43). Writing it as in (3.59)-(3.60), with $A^{(l)}_n(t)$ given by (3.46), the term (3.59) becomes (3.50) in the remainder, while (3.60) remains on the right-hand side of (3.45). Now consider the term $T_{21}$, defined in (3.42); this term becomes (3.51) in the remainder. Finally, consider the term $T_{22}$, also defined in (3.42). Writing it as in (3.62)-(3.63), with $Q^{(l)}_n(t)$ given by (3.47), the term (3.62) becomes (3.52) in the remainder, while the term (3.63) remains on the right-hand side of (3.45). This completes the argument for Proposition 3.2.
We now turn our attention to the remainder term $r^{(l)}_n(x, t)$ of Proposition 3.2. The content of the following proposition is that the remainder is negligible in the limit.

Proposition 3.3. Each term of $r^{(l)}_n(x, t)$ converges to 0 uniformly on compact subsets of $\{x \in \mathbb{R},\ t \ge 0\}$, for $1 \le l \le d$. In other words, we have the uniform limit

$\lim_{n\to\infty} r^{(l)}_n(x, t) = 0.$ (3.64)

Proof: Begin with the term (3.48). Applying the estimate (3.29), we obtain the required bound. Now consider the term (3.49). Using the bound $|e^{\bullet}_n(x)| \le 2$, the Cauchy-Schwarz inequality, and (3.27) twice, it follows that this term is also negligible.

Consider the term (3.50) next. Applying (2.20) of Lemma 2.5 to the exponential function and $\varphi'_r$, and noting that $\varphi'_r \in H^{3/2+\epsilon}$, it follows that the relevant normalized traces converge, as in (3.67) and (3.68). While the exponential function does not belong to $H^{3/2+\epsilon}$, we can truncate the exponential function in a smooth fashion outside the support of the semicircle law, so that the truncated exponential function belongs to $H^{3/2+\epsilon}$. We may replace the exponential function by its truncated version because the eigenvalues of the submatrices concentrate in the support of the semicircle law with overwhelming probability. Here it is not so important to know the exact value of the limit; we will only use the fact that we have convergence in the mean and almost surely to the same limit. Note that the convergence in (3.67) implies that the corresponding sequence of numbers is bounded, and the convergence in (3.68) implies that the corresponding random variables are bounded with probability 1. Using (3.67) and (3.68) with the dominated convergence theorem, it now follows that (3.69) holds. Combining the bound $|e_n(x)| \le 1$ with (3.69), it follows that (3.70) holds. Then, using (3.70) in the remainder term (3.50), it follows that this term vanishes in the limit.

Consider (3.51), which is the next term in the remainder. Observe that, again using the Cauchy-Schwarz inequality, we obtain the bound (3.72). For fixed $j, p, q \in B_l$, using (3.19), we obtain (3.73). Using (3.73), recalling that $\beta_{pq} = (1 + \delta_{pq})^{-1} \le 1$, and the Cauchy-Schwarz inequality, it follows that (3.74) holds. Using (3.74), the fact that $|U^{(l)}_{jk}(t)| \le 1$, and the inequality $2ab \le a^2 + b^2$, it follows that (3.75) holds. Using the Poincaré inequality, (3.75), adding more nonnegative terms, and using the fact that the rows of a unitary matrix have unit norm (3.76), we arrive at (3.77). Now, combining (3.72) with (3.77), we obtain (3.78), and (3.79) follows.

Now consider the final term of the remainder, given by (3.52). We apply the identity (3.80), which is a consequence of the matrix version of the Fourier inversion formula (3.21). Using (3.80), the finiteness of the integral (3.33), the estimate (3.78), and the dominated convergence theorem, we conclude that this final term also vanishes in the limit. This completes the proof of Proposition 3.3.

The goal now is to pass to the limit in (3.45). In what follows, we use the notation introduced in (3.82).

Proposition 3.4. Let $A^{(l)}_n(t)$ be given by (3.46), $Q^{(l)}_n(t)$ be given by (3.47), and $v_n(t)$ be given by (3.44). Then the limits of $A^{(l)}_n(t)$, $Q^{(l)}_n(t)$, and $v_n(t)$ as $n \to \infty$ exist, and the limit of $Q^{(l)}_n(t)$ is given by

(3.85), and the limit of $v_n(t)$, after rescaling by $\gamma_l$, is given by (3.86).

Proof: Recall that $A^{(l)}_n(t)$ is built from integrals of the form

$\int_0^t \frac{1}{n}\, E\,\mathrm{Tr}\{ P^{(l)} U^{(l)}(t_1)\, P^{(l,r)}\, \varphi'_r(M^{(r)})\, P^{(r)} \}\, dt_1.$

In the full Wigner matrix case one has $A_n(t) = -2\int_0^t \frac{1}{n}\, E\,\mathrm{Tr}\{ e^{it_1 M} \varphi'(M) \}\, dt_1$, and the limiting behavior follows immediately from the Wigner semicircle law. In the case of submatrices with asymptotically regular intersections there are additional technical difficulties, due to the fact that for the $n \times n$ submatrices $M^{(l)} = P^{(l)} M P^{(l)}$ the summation in (3.87) is restricted to entries common to both submatrices, i.e., to $j, k \in B_l \cap B_r$. It follows from Lemma 2.5 that the limit of $A^{(l)}_n(t)$ exists; this establishes (3.83). The proof of Lemma 2.5 will be given in section 3.2.

We turn our attention to $Q^{(l)}_n(t)$. First it will be argued that the variances of the relevant matrix entries converge to zero. Using the Poincaré inequality, (3.74), (3.76), and Proposition 3.1, it follows that (3.90) holds. Note that in the course of the calculation (3.90), we showed (3.91). The Cauchy-Schwarz inequality implies (3.93). Using the Cauchy-Schwarz inequality and (3.80), it follows that (3.94) holds. Using the Poincaré inequality, (3.91), and (3.94), we obtain (3.95). Using (3.93), (3.95), (3.90), and the Cauchy-Schwarz inequality, we obtain (3.96). This allows us to replace the diagonal entries $U^{(l)}(t)_{jj}$ by their expectations $E[U^{(l)}(t)_{jj}]$ when passing to the limit. We use Proposition 2.1 of Pizzo et al. [32], which guarantees that for $f \in C^7_c(\mathbb{R})$ the diagonal entries $f(M)_{jj}$ converge to $\int f\, d\mu_{sc}$. In order to apply this asymptotic to the exponential function, which is smooth enough, we truncate the exponential function in a smooth fashion outside the support of $\mu_{sc}$. We are justified in replacing the exponential function by its truncated version because the eigenvalues of the submatrices concentrate in the support of the semicircle law with overwhelming probability. It is for this same reason that we may assume $\varphi'_r$ is compactly supported.
This function is not sufficiently smooth, but we can avoid this problem by a density argument using a standard convolution approximation, and then apply the bound (3.3) on the variance of linear eigenvalue statistics.
Using (3.99), we pass to the limit in (3.47) and obtain (3.85). The limit of $v_n(t)$ is given by the (rescaled) Wigner semicircle law, together with a contribution coming from the zero eigenvalues. Alternatively, it can be computed using the bilinear form in Lemma 2.5, with $f(x) = e^{itx}$ and $g(x) = 1$. To facilitate solving the integral equation (3.101) below, it will be useful to rescale by $\gamma_l$. We obtain

$v^{(l)}(t) = \frac{1}{\gamma_l}\, \langle e^{itx}, 1 \rangle_{ll},$

which establishes (3.86). The proposition is proved.

Now, using Propositions 3.2, 3.3, and 3.4, we pass to the limit $n_m \to \infty$ in (3.45), and determine that the limit $Y^{(l)}$ of every uniformly converging subsequence $\{Y^{(l)}_{n_m}\}$ satisfies equation (3.101), where $A^{(l)}(t)$ is given by (3.83), $Q^{(l)}(t)$ is given by (3.85), and $v^{(l)}(t)$ is given by (3.86). The argument now proceeds by solving the integral equation (3.101); we use a version of the technique of Lytova and Pastur [21]. Define the transform (3.102). We check (3.105) after replacing the integral over $L$ by the integral over $[-2\sqrt{\gamma_l}, 2\sqrt{\gamma_l}]$, taking into account that $\sqrt{z^2 - 4\gamma_l}$ equals $\pm i\sqrt{4\gamma_l - \lambda^2}$ on the upper and lower edges of the cut. Then the solution of (3.101) is (3.106). With $F_{lr}$ given by (3.84), we obtain (3.107) and (3.108). Using the regularity condition $\|\varphi_l\|_{5/2+\epsilon} < \infty$ for $1 \le l \le d$, together with (3.107), (3.108), and the dominated convergence theorem, passing to the limit in (3.24) yields (3.109). Applying the Fourier inversion formula (3.21), it follows that (3.110) holds. We will also use the fact (3.111). Expanding the test function $\varphi_l$ in the Chebyshev basis gives (3.112). Returning to the computation of $Z'(x)$, using (3.110), (3.111), and (3.112), we arrive at (3.113). Using the orthogonality of the Chebyshev polynomials (2.21), integrating by parts, and expanding $\varphi_r(y)$ in the Chebyshev basis as well, and recalling that $F_{lr}$ is given by (3.84), it follows that (3.120) holds. We have obtained the expression (2.14) for the asymptotic covariance in terms of Chebyshev polynomials. Now we write this expression as a contour integral.
Let us make the change of variables $x = 2\sqrt{\gamma_l}\cos\theta$, $y = 2\sqrt{\gamma_r}\cos\omega$, and use (2.14) to obtain (3.121). Integrating by parts in $\theta$ and $\omega$, it follows that (3.122) holds. To evaluate the infinite sum above, recall that for $z \in \mathbb{C}$ with $|z| < 1$ we have

$\sum_{k=1}^{\infty} \frac{z^k}{k} = -\log(1 - z).$ (3.123)

Noting that $\beta < 1$ and using (3.123), it follows that

$\sum_{k=1}^{\infty} \frac{\beta^k}{k} \sin(k\theta)\sin(k\omega) = \frac{1}{4}\log\frac{1 - 2\beta\cos(\theta + \omega) + \beta^2}{1 - 2\beta\cos(\theta - \omega) + \beta^2}.$

Making the change of variables $z = \sqrt{\gamma_l}\, e^{i\theta}$, $w = \sqrt{\gamma_r}\, e^{i\omega}$, and recalling that $\beta = \frac{\gamma_{lr}}{\sqrt{\gamma_l \gamma_r}}$, this can be written as (3.125). Combining (3.122) and (3.125), and noting that $\beta < 1$, we arrive at the contour integral representation of the covariance.
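The summation step can be verified directly: applying (3.123) to $z = \beta e^{i(\theta \pm \omega)}$ and using $\sin(k\theta)\sin(k\omega) = \frac{1}{2}[\cos k(\theta - \omega) - \cos k(\theta + \omega)]$ gives the closed form checked numerically below (our sketch, with arbitrary parameter values).

```python
import numpy as np

beta, theta, omega = 0.55, 0.8, 2.1    # arbitrary values with |beta| < 1

# Partial sum of  sum_{k>=1} (beta^k / k) sin(k*theta) sin(k*omega);
# the tail beyond k = 2000 is smaller than beta^2000 and thus negligible.
k = np.arange(1, 2001)
series = float(np.sum(beta ** k / k * np.sin(k * theta) * np.sin(k * omega)))

# Closed form obtained from (3.123) with z = beta * e^{i(theta +/- omega)}.
closed = 0.25 * np.log((1 - 2 * beta * np.cos(theta + omega) + beta ** 2)
                       / (1 - 2 * beta * np.cos(theta - omega) + beta ** 2))
```

The partial sum and the closed form agree to machine precision.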

The Bilinear Form
The main goal of this section is to prove Lemma 2.5, to which we now turn our attention. Begin with the following definition.
For test functions $f$, $g$, set $\langle f, g \rangle_{lr,n} := \frac{1}{n}\, E\,\mathrm{Tr}\{ P^{(l)} f(M^{(l)})\, P^{(l,r)}\, g(M^{(r)})\, P^{(r)} \}$. The large $n$ limit of $\langle f, g \rangle_{lr,n}$ exists for polynomial functions because all moments of the matrix entries of $M$ are finite. Then $\lim_{n\to\infty} \langle f, g \rangle_{lr,n} = \langle f, g \rangle_{lr}$, where $\langle \cdot, \cdot \rangle_{lr}$ is the bilinear form defined in Definition 2.3. We will compute the bilinear form $\langle f, g \rangle_{lr}$ for monomial functions $f(x) = x^k$, $g(x) = x^q$. We will also consider the random variables $\frac{1}{n}\mathrm{Tr}\{ P^{(l)} f(M^{(l)}) P^{(l,r)} g(M^{(r)}) P^{(r)} \}$, and prove their almost sure convergence to the non-random limit described in Lemma 2.5. To this end, we will use some results and techniques from Free Probability. We refer the reader to Anderson et al. [16] for the relevant background concerning noncommutative probability spaces, asymptotic freeness of Wigner matrices, as well as the definition and the properties of the multilinear free cumulant functionals $\kappa_p$, $p \ge 1$.
Since the limiting spectral distribution of $M$ is the Wigner semicircle law with respect to the functional $n^{-1} E\,\mathrm{Tr}$, and is almost surely the Wigner semicircle law with respect to the functional $n^{-1}\mathrm{Tr}$, we have that $\kappa_2(M) = 1$ and $\kappa_p(M) = 0$ for $p \ne 2$. It follows that

$\langle x^k, x^q \rangle_{lr} = 0, \quad \text{if } k + q \text{ is odd},$ (3.130)

and also that, almost surely,

$\lim_{n\to\infty} \frac{1}{n}\,\mathrm{Tr}\{ (P^{(l)} M P^{(l)})^k\, (P^{(r)} M P^{(r)})^q \} = 0, \quad \text{if } k + q \text{ is odd}.$ (3.131)

Suppose then that $k + q$ is even. Continuing the calculation, $\langle x^k, x^q \rangle_{lr}$ is expressed as a sum of the free cumulants $\kappa_{\pi_1}(P^{(l)}, \dots, P^{(r)})$ over non-crossing pair partitions $\pi_2 \in NCP(k+q)$ of the $M$-factors and non-crossing partitions $\pi_1$ of the $k + q + 1$ interlaced projection matrices compatible with the complement, so that $\pi_1 \cup \pi_2 \in NC(2(k+q)+1)$. Here $\pi_1^c = \{S_1, \dots, S_{|\pi_1^c|}\}$ denotes the blocks of the non-crossing complement of a given partition. We have used the complement partitions to rewrite the sum of the free cumulants over the partitions of the projection matrices as a product of joint moments of the projection matrices.
Similarly, with respect to the functional $n^{-1}\mathrm{Tr}$, the same expansion holds almost surely for $\lim_{n\to\infty} \frac{1}{n}\mathrm{Tr}\{ (P^{(l)} M P^{(l)})^k\, (P^{(r)} M P^{(r)})^q \}$. Recall that the non-crossing pair partitions are in bijection with Dyck paths, $NCP(k+q) \to \mathcal{D}^{(k+q)}$; thus the computation for each functional reduces to counting Dyck paths. The number of Dyck paths $(h(0), \dots, h(k+q))$ with $h(k) = j$ is

$\Big[ \binom{k}{\frac{k-j}{2}} - \binom{k}{\frac{k-j}{2}-1} \Big] \Big[ \binom{q}{\frac{q-j}{2}} - \binom{q}{\frac{q-j}{2}-1} \Big].$

Note that $\lim_{n\to\infty} n^{-1}\mathrm{Tr}\{ (P^{(l)})^a (P^{(r)})^b \} = \gamma_{lr}$ for any $a, b \ge 1$. Also note that below the partition $\pi_1^c$ depends on the Dyck path $d \in \mathcal{D}^{(k+q)}$ (which corresponds to some non-crossing pair partition), and that by $|\pi_1^c|$ we denote the number of blocks of $\pi_1^c$. Suppose for now that both $k$ and $q$ are even integers. The height of the path at step $k$ must then be even, say $h(k) = 2j$. Those blocks which consist only of the matrices $P^{(l)}$ contribute a factor of $\gamma_l$ to the product of joint moments.
The number of blocks which contain only the matrices $P^{(l)}$ corresponds to the number of down edges of the path in the first $k$ steps. Denote by $u$ the number of up edges and by $d$ the number of down edges of the path up to step $k$. Then $u + d = k$ and $u - d = 2j$, which implies $d = k/2 - j$. The number of blocks which contain only the matrices $P^{(r)}$ is equal to the number of up edges of the path in the final $q$ steps; this number is the exponent of the factor $\gamma_r$ in the product of joint moments. Denote now by $u$ the number of up edges and by $d$ the number of down edges of the path in the final $q$ steps. Then $u + d = q$ and $d - u = 2j$, which implies $u = q/2 - j$. The remaining blocks of the partition contain projection matrices of mixed type and contribute a factor $\gamma_{lr}$ to the product of joint moments. Since the total number of blocks in the partition is $\frac{k+q}{2} + 1$, the number of factors of $\gamma_{lr}$ in the product of joint moments is $2j + 1$. Partitioning the Dyck paths into equivalence classes based on the height $h(k)$, we obtain the corresponding formula, and also its almost sure counterpart.

Now suppose that both $k$ and $q$ are odd. The height of the path at step $k$ must then be odd, say $h(k) = 2j + 1$. As in the even case, the number of blocks which consist only of the matrices $P^{(l)}$ equals the exponent of $\gamma_l$ in the product of joint moments, and corresponds to the number of down edges of the path in the first $k$ steps. Denote by $u$ the number of up edges and by $d$ the number of down edges of the path up to step $k$. Then $u + d = k$ and $u - d = 2j + 1$, which implies $d = (k-1)/2 - j$. The number of blocks which contain only the matrices $P^{(r)}$ is equal to the number of up edges of the path in the final $q$ steps; this number is the exponent of the factor $\gamma_r$ in the product of joint moments. Denote now by $u$ the number of up edges and by $d$ the number of down edges of the path in the final $q$ steps.
Then $u + d = q$ and $d - u = 2j + 1$, which implies $u = (q-1)/2 - j$. The remaining blocks of the partition contain projection matrices of mixed type and contribute a factor of $\gamma_{lr}$ to the product of joint moments. Since the total number of blocks in the partition is $\frac{k+q}{2} + 1$, the number of factors of $\gamma_{lr}$ in the product of joint moments is $2j + 2$. Partitioning the Dyck paths into equivalence classes based on the height $h(k)$, we obtain the analogous formulas, both in expectation and almost surely.

The intersection of countably many events, each of probability 1, occurs with probability 1. Since there are only countably many polynomials with rational coefficients, we have proved that the random variables $\frac{1}{n}\mathrm{Tr}\{ P^{(l)} f(M^{(l)}) P^{(l,r)} g(M^{(r)}) P^{(r)} \}$ converge almost surely to the same non-random limit, given by the right-hand side of (3.134), whenever $f$, $g$ are polynomials with rational coefficients.
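The Dyck-path counts used above can be cross-checked by brute force. The sketch below (ours) enumerates all nonnegative $\pm 1$ paths of length $k + q$ from 0 back to 0, buckets them by the height $h(k)$, and compares the result with the reflection-principle (ballot) formula; the total count is the Catalan number $C_{(k+q)/2}$.

```python
from itertools import product
from math import comb

def ballot(m, j):
    """Number of +-1 paths of length m from height 0 to height j staying >= 0."""
    if j < 0 or j > m or (m - j) % 2 != 0:
        return 0
    d = (m - j) // 2                     # number of down steps
    return comb(m, d) - (comb(m, d - 1) if d >= 1 else 0)

k, q = 6, 4
counts = {}
for steps in product((1, -1), repeat=k + q):
    h, heights, ok = 0, [0], True
    for s in steps:
        h += s
        if h < 0:
            ok = False
            break
        heights.append(h)
    if ok and h == 0:                    # a Dyck path of length k + q
        counts[heights[k]] = counts.get(heights[k], 0) + 1

# A Dyck path with h(k) = j factors into a nonnegative path 0 -> j in k steps
# and (by time reversal) a nonnegative path 0 -> j in q steps.
predicted = {}
for j in range(k + q + 1):
    c = ballot(k, j) * ballot(q, j)
    if c:
        predicted[j] = c

total = sum(counts.values())             # the Catalan number C_{(k+q)/2}
```

For $k = 6$, $q = 4$ the enumeration matches the product of ballot numbers bucket by bucket, and the total is $C_5 = 42$.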
The bilinear form $\langle f, g \rangle_{lr}$ is diagonalized in the next proposition. Proposition 3.7 is proved in Appendix 2.
The Chebyshev polynomials have rational coefficients, so it follows from the above argument that a.s.
Now the bilinear form $\langle \cdot, \cdot \rangle_{lr}$ will be extended to functions other than polynomials. For this part of the argument, the bound (3.3) on the variance of linear eigenvalue statistics is essential.
Proposition 3.9 is proved in Appendix 3. Lemma 2.5 now follows from Propositions 3.7 and 3.9. This also completes the proof of Theorem 2.1.
Proof: From (4.16) and (4.17), we obtain the stated decomposition, where the error term is bounded by $C_3(x, t)$ as $n \to \infty$. The first and second terms in (4.23) are bounded because of (4.12).
So the limit of $T_{32}$ exists, and if $Y(x, t) = \lim_{n\to\infty} Y_n(x, t)$, then $Y(x, t)$ satisfies the integral equation (4.48). Therefore,