Transverse and Quantum Localization of Light: A Review on Theory and Experiments

Anderson localization is an interference effect that drastically reduces the diffusion of wave packets such as sound, electromagnetic waves, and particle wave functions in the presence of strong disorder, up to its complete suppression. In optics, this effect has been observed and demonstrated unquestionably only in dimensionally reduced systems. In particular, transverse localization (TL) occurs in optical fibers that are disordered orthogonal to, and translationally invariant along, the propagation direction. The resonant, tube-shaped localized states act as micro-fiber-like single-mode transmission channels. Since the first TL models were proposed in the early eighties, fabrication technology and experimental probing techniques have taken giant steps forward: TL has been observed in photo-refractive crystals, in plastic optical fibers, and in glassy platforms, and direct laser writing now makes it possible to tailor and “design” disorder. This review covers all these aspects, which are bringing TL closer to applications such as quantum communication and image transport. We first discuss nonlinear optical phenomena in the TL regime, which enable steering of optical communication channels. We further report on an experiment testing the traditional, approximate way of introducing disorder into Maxwell’s equations for the description of TL. We find that it does not agree with our findings for the average localization length. We present a new theory, which does not involve an approximation and which agrees with our findings. Finally, we report on some quantum aspects, showing how a single-photon state can be localized in some of its inner degrees of freedom and how quantum phenomena can be employed to secure a quantum communication channel.


INTRODUCTION
Transverse localization (TL) is found in media in which the refractive index is randomly modulated only orthogonally to the direction of propagation. In these paraxial systems, Anderson localization (AL) sustains nondiffracting beams: confined light tubes showing many potential applications including fiber optics, quantum communication, and endoscopic imaging. In this review we will summarize recent advances in disordered optical fibers, in which confinement is obtained thanks to localization, discussing the advantages with respect to standard fibers. First we will report the latest experimental results on transverse Anderson localization: the migration of localized states due to nonlinearity, self-focusing, wavefront shaping in the localized regime, and the single-mode transport in disordered paraxial structures. This last result is particularly important as it bridges the physics of Anderson localization to the single-mode properties of optical fibers.
Then we will show how the traditional description of Anderson localization of light, which was based on the analogy to electrons in a random potential, turned out to be in error: it predicts a localization length that depends strongly on the wavelength of the light, which was not observed. We also report on the alternative correct theory, which relies on an analogy to acoustic waves in the presence of random elastic moduli. Regarding quantum aspects, we will report on how a single-photon state localized in some of its inner degrees of freedom could be an effective resource in quantum communication and cryptography, increasing both the amount of information loaded per single particle and the security and performance of protocols based on localized photon quanta. Finally, we will review the so-called random quantum walks, in which the dynamics of a single particle moving on a lattice, conditioned on the state of an ancillary degree of freedom, displays localization under certain conditions. A further aspect of AL of quantum particles is the behavior of multiparticle interference and of particle statistics in quantum walks. In the first proof-of-principle photonic experiments, AL has been observed in the two-photon wave function. In this scenario, it could be possible to simulate even fermionic statistics by proper manipulation of two-photon entangled states generated by single-photon sources.

Modeling Transverse Localization: The Beginning
In the last decades, the idea that Anderson localization could be applied to electromagnetic waves [1,2] has drawn the attention of the scientific community, stimulating experiments and conjectures. The excitement was further propelled by the subsequent observation of the coherent backscattering cone (the so-called weak localization) [3][4][5]. Several experiments claimed strong localization of light in bulk media [6][7][8], but these results are still strongly debated today [9][10][11][12]. First Abdullaev in 1980 [13] and then De Raedt and coworkers in 1989 [14] proposed an alternative form of localization for light: transverse localization. These authors described an optical system that is translationally invariant along the propagation direction of the waves, with a refractive index varying randomly in the directions perpendicular to it (transverse disorder). As usual in diffraction theory, one can reduce the appropriate Helmholtz equation to a paraxial one [15,16]; this reduction is outlined in the following [14].
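We start from the Helmholtz equation for the scalar field $\phi(\mathbf{r})$, which represents one of the components of the electric field $\mathbf{E}(\mathbf{r}, t)$:

$$\nabla^2 \phi(\mathbf{r}) + \frac{\omega^2}{c^2}\, n^2(x, y)\, \phi(\mathbf{r}) = 0 \qquad (1)$$

where $\omega$ is $2\pi$ times the frequency, $c$ is the vacuum light velocity, and $\mathbf{r} = [x, y, z]$. In the case of a longitudinally invariant system, the function $n(x, y)$ is the (transversely varying) refractive index. Due to the translation invariance in the $z$ direction, the wave function can be represented as

$$\phi(\mathbf{r}) = a(\mathbf{r}) \exp(-i k_0 z) \qquad (2)$$

where $a(\mathbf{r})$ is the envelope in the $z$ direction and describes the localization effects in the $(x, y)$ directions. $k_0 = n_0 \omega / c = 2\pi/\lambda$ is the wavenumber in the medium, $n_0$ is the average refractive index in the disordered medium (disordered fiber), and $\lambda$ is the wavelength in the medium. Equation 1 becomes

$$-\frac{\partial^2 a(\mathbf{r})}{\partial z^2} + 2 i k_0 \frac{\partial a(\mathbf{r})}{\partial z} = \frac{\partial^2 a(\mathbf{r})}{\partial x^2} + \frac{\partial^2 a(\mathbf{r})}{\partial y^2} + k_0^2\, \frac{n(x, y)^2 - n_0^2}{n_0^2}\, a(\mathbf{r}). \qquad (3)$$

In the paraxial approximation, the second derivative with respect to $z$ is neglected, which leads to a Schrödinger-like equation for $a(\mathbf{r})$ (Eq. 5), where the potential $U(x, y)$ is defined in terms of the spatially fluctuating refractive index through the ratio $\Delta n(x, y)/n_0$, with $\Delta n(x, y) = n(x, y) - n_0$. Equation 5 is formally equivalent to the Schrödinger equation driving electron localization [17].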
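The step from Eq. 3 to the paraxial, Schrödinger-like form can be sketched as follows. This is a sketch under two assumptions: a slowly varying envelope, $|\partial_z^2 a| \ll 2 k_0 |\partial_z a|$, and weak index fluctuations, $|\Delta n| \ll n_0$. The sign and normalization chosen for $U$ below are one common convention and may differ from the one used in Eq. 5.

```latex
% Paraxial approximation: drop the second z-derivative in Eq. 3
2 i k_0 \frac{\partial a}{\partial z}
  = \nabla_\perp^2 a + k_0^2\, \frac{n^2(x,y) - n_0^2}{n_0^2}\, a ,
\qquad
\nabla_\perp^2 = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2}

% Linearize the index term: n^2 - n_0^2 = (n - n_0)(n + n_0) \approx 2 n_0 \Delta n
2 i k_0 \frac{\partial a}{\partial z}
  \approx \left[ \nabla_\perp^2 - U(x,y) \right] a ,
\qquad
U(x,y) \equiv -2 k_0^2\, \frac{\Delta n(x,y)}{n_0}
```

With $z$ playing the role of time, this has the structure of a two-dimensional Schrödinger equation. In this form the fluctuating term carries an explicit factor $k_0^2$, which is where a wavelength dependence enters the potential-type description.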

EXPERIMENTS ON TRANSVERSE LOCALIZATION
The first papers on transverse localization were focused on theoretical modeling and numerical simulations. The experimental realization of the effect came more than a decade after the appearance of the paper of De Raedt et al. [14], because it required several technological advances on the fabrication side. The difficulty lies in realizing disorder that is translationally invariant in the z direction. This is particularly challenging at optical wavelengths, where one needs to realize "paraxial defects," i.e., defect structures that are invariant along the symmetry axis over a sufficient length and size. The first successful approach was the "writing method" based on the photo-refractive effect, which readily produces z-invariant defect structures by employing Gaussian beams. In a second stage, TL was realized with fiber-drawing technology, which made it possible to produce longer structures, larger refractive index contrasts, and finally application-ready transverse-localized fibers. In the latest stage, femtosecond direct laser writing was applied to TL, which enabled direct control of the defect positioning in order to investigate effects related to the disorder design. We describe each of these approaches in the following.

Early Experiments
The first experimental observation of TL (and actually one of the most unequivocal experimental manifestations of light localization) has been reported by Schwartz and coworkers [18] employing photo-refractive media. The authors employed the optical induction technique [19] to transform the intensity distribution of an array of parallel laser beams into a refractive-index distribution thanks to the nonlinear response of the photo-refractive material. The distribution of the beam intensity is controlled with an interference mask, thus enabling the experimentalist to design the disorder configuration. The approach of Schwartz and coworkers induces a small refractive index contrast ($\Delta n \sim 10^{-4}$) and a large disorder grain size ($\sim 10$ µm), thus the expected mean free path $\ell$ (the spatial length over which the memory of the light propagation direction is lost) could be rather large. However, because the localization length, according to the scaling theory [20,21], is proportional to $e^{k_\perp \ell}$, the authors argue that it remains accessible because they deal with a very small perpendicular wavenumber $k_\perp$. The latter is the projection of the wavevector onto the x−y plane, $k_\perp = k_0 \sin\theta$, with $\theta$ being the incidence angle; see the sketch in Figure 1.
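To illustrate why a small transverse wavenumber matters, the following sketch evaluates $k_\perp = k_0 \sin\theta$ and the exponential factor $e^{k_\perp \ell}$ quoted above. The wavelength, incidence angles, and mean free path are illustrative values chosen here, not parameters taken from [18], and any prefactor of the scaling relation is omitted.

```python
import math


def transverse_wavenumber(wavelength_um, theta_rad):
    """Return k_perp = k0 * sin(theta), with k0 = 2*pi/wavelength (units: 1/um)."""
    k0 = 2.0 * math.pi / wavelength_um
    return k0 * math.sin(theta_rad)


def localization_scaling_factor(k_perp, mean_free_path_um):
    """Return the exponential factor exp(k_perp * l) of the scaling estimate."""
    return math.exp(k_perp * mean_free_path_um)


if __name__ == "__main__":
    wavelength_um = 0.5        # assumed wavelength in the medium
    mean_free_path_um = 100.0  # assumed transverse mean free path
    for theta_mrad in (1.0, 5.0, 20.0):
        k_perp = transverse_wavenumber(wavelength_um, theta_mrad * 1e-3)
        factor = localization_scaling_factor(k_perp, mean_free_path_um)
        print(f"theta = {theta_mrad:4.1f} mrad, "
              f"k_perp = {k_perp:.4f} 1/um, exp(k_perp*l) = {factor:.3e}")
```

Because the dependence is exponential, a modest increase of the incidence angle changes the estimated factor by many orders of magnitude, which is why a near-paraxial launch keeps the localization length within reach even for a long mean free path.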
The big advantage of optical induction is the possibility to completely rearrange the refractive index distribution with a simple and fast rewriting procedure. Performing experiments with several realizations of the disordered $n(x, y)$ pattern makes it possible to retrieve quantities averaged over disorder, which is an important aspect for verifying the presence of light localization. In particular, the authors of [22] demonstrated a dependence of the localization length on the degree of disorder, thereby confirming TL.

Optical Fibers
In 2012, Arash Mafi and coworkers [18] demonstrated TL in an optical fiber composed of polymer materials. They used a novel kind of fiber named disordered binary fiber (DBF), based on the random mixing of tens of thousands of polymer fibers of two types: poly-methyl-methacrylate (PMMA) and polystyrene (PS). The fibers were put together randomly and then thinned in a fiber-drawing tower.
By this procedure, homogeneous fibers were obtained, each hosting one realization of transverse disorder in the refractive index. The binary fiber approach provides several advantages: 1) the disordered refractive index distribution is permanent, 2) the refractive index mismatch between the two materials ($\Delta n \sim 0.1$) is orders of magnitude higher than in the case of the photo-refractive structures, and 3) optical fibers are a mature technology, ready for applications based on localization. Mafi and coworkers also fabricated glass optical fibers hosting transverse disorder and demonstrated TL therein [18]. The glass platform is extremely favorable for applications, providing very high refractive index contrast together with increased stability and lower absorption.

Image Transport
In-fiber implementation of Anderson localization enables the propagation of localized beams with a transverse size comparable to that of the cores of commercial single-mode optical fibers. Thus a single disordered fiber with sufficient transverse extension can act as a coherent fiber bundle [23]. In [24] Mafi and coworkers demonstrated image transport through disordered optical fibers over lengths of up to 5 cm. The transported image quality is comparable to or slightly better than that obtainable with commercially available multicore image fibers, with disorder reducing the pixelation effect present in periodic structures and improving contrast. On the other hand, the imaging resolution is limited by the quality of the cleaving and polishing of the fiber tip, while the transport distance is limited by the optical attenuation and the residual longitudinal disorder resulting from the imperfect drawing process. In this sense, a glass-based disordered fiber, with a higher filling fraction and much lower losses, has the potential to further improve endoscopic disordered fibers.

Experimental Test of the Traditional Theory of Anderson Localization
The traditional theoretical description of Anderson localization of light and, in particular, of transverse Anderson localization [14,18,25] predicts a pronounced dependence of the localization length on the light wavelength. This is implied by the dependence of the potential $U(x, y)$ in Eq. 5 on $k_0 = 2\pi/\lambda$. We call the traditional approach according to Eq. 5 the potential-type (PT) approach. The authors of [26] investigated this effect experimentally (see a sketch of the setup in Figure 2A) in order to verify the validity of the current theory of Anderson localization of light. Figure 2B reports the localization length $\xi$ versus the incident laser wavelength: no dependence on the wavelength is found in the range $0.55 \le \lambda \le 1$ μm. This is in contrast to a simulation by Karbasi et al. [25] of the same specimens, using Eq. 5, which shows a strong dependence on the wavelength $\lambda = 2\pi/k_0$; see the dashed line in Figure 2B. The reason for this discrepancy with the theoretical predictions will be fully explained in the following section. Here, we just sketch the essence of the findings of the authors of [26]:
1) In deriving the wave equation, Eq. 5, it has been tacitly assumed that the divergence of the electric field is zero. In the presence of a spatially varying index of refraction, the divergence is, however, nonzero and is given by $\nabla \cdot \mathbf{E} = -\mathbf{E} \cdot \nabla \ln n^2$. Neglecting this term leads to the strong dependence of the localization length on $k_0$ and, hence, on the laser wavelength (dashed line in Figure 2B) [14,18,25], at variance with the experimental findings (Figure 2B).
2) An alternative wave equation has been derived by the authors of [26], in which the electric modulus $1/n^2(x, y)$ enters ("modulus-type," MT), and which involves no approximation (except the paraxial one). It is written in terms of $b(x, y, z) = e^{-ik_0 z} B(x, y, z)$, where $B$ is the magnetic field. This wave equation does not predict any wavelength dependence of the average localization length (full line in Figure 2B), in agreement with the experimental data.
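The nonzero divergence mentioned in point 1) follows directly from Gauss's law in a dielectric. The short derivation below is a sketch, assuming a non-magnetic, charge-free medium with relative permittivity $n^2(x, y)$ and harmonic time dependence; it is a textbook argument and is not taken from [26].

```latex
% Gauss's law without free charges, with D = \epsilon_0 n^2 E
\nabla \cdot \left( n^2 \mathbf{E} \right) = 0
\;\Longrightarrow\;
n^2\, \nabla \cdot \mathbf{E} + \mathbf{E} \cdot \nabla n^2 = 0
\;\Longrightarrow\;
\nabla \cdot \mathbf{E} = -\,\mathbf{E} \cdot \nabla \ln n^2

% Harmonic wave equation, using
% \nabla \times \nabla \times \mathbf{E} = \nabla(\nabla \cdot \mathbf{E}) - \nabla^2 \mathbf{E}
\nabla^2 \mathbf{E} + \frac{\omega^2}{c^2}\, n^2\, \mathbf{E}
  = \nabla \left( \nabla \cdot \mathbf{E} \right)
  = -\,\nabla \left( \mathbf{E} \cdot \nabla \ln n^2 \right)
```

Setting the right-hand side to zero recovers the scalar Helmholtz equation, Eq. 1, for each field component, which is the approximation underlying the PT approach.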

Nonlinearity in Disordered Optical Fibers
There is an ongoing debate about whether nonlinearity [27,28] may enhance disorder-induced localization. The interplay between disorder and a nonlinear response may strongly modify the process of disorder-induced wave trapping in TL. In particular, in the case of nonlocality, while localization tends to reduce the intermode interactions, a nonlinear perturbation extending beyond the region of the localized state could eventually produce some kind of action at a distance. The first experimental evidence of nonlocality acting together with Anderson localization in an optical fiber has been shown in [29]. In that paper, the disordered fiber was probed with a broadband laser beam, showing a distribution of sharp peaks in the transmittance, as expected from the "resonant" behavior of the disorder-induced localized states [Figure 3A reports the spectrum transmitted by the fiber (blue) compared to the probe spectrum (red)]. The first finding is that the spatial shape of the localized states is strongly affected by the probe beam power. This effect is reported in Figures 3B,C, where the shape of the localized state is shown as a function of the input power. The mode is seen to shrink when the power is increased. The presence of sharp peaks in the spectrum for all power values confirms that the nonlinear action preserves the coherence of the probe light. This self-focusing results from the peculiar interaction between disorder and thermal nonlinearity. In general, the refractive index of a nonlinear optical material varies with the optical intensity $I$ as $n = n_0 + \Delta n(x, y) + n_2 I$, where $\Delta n(x, y)$ is the refractive index fluctuation due to disorder and $n_2$ represents the coefficient of the nonlinear perturbation. A positive $n_2$ coefficient results in a converging wave front that can potentially surpass the diffraction limit. Conversely, a negative value of $n_2$ produces a defocusing nonlinearity and thus an expansion of the beam.
In plastic binary fibers, one expects the slow thermal nonlinearity to yield a negative n 2 , thus defocusing. However, experimental measurements report instead a focusing nonlinearity. This unexpected effect is explained as follows [30]: if the refractive index reduction is more pronounced in one of the two constituent materials of the binary fiber, the refractive index mismatch may increase. Thus the overall refractive index reduction is compensated by a stronger index contrast, resulting in a smaller mean free path. This argument is not affected if we switch from the PT to the MT description, because according to both theories the mean-free path is inversely proportional to the index contrast, resulting effectively in a decrease of the localization length, as shown in Figure 3.
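The argument above can be made concrete with a small numerical sketch. It assumes a binary medium with equal filling fractions and a linear thermo-optic response in each material. The base indices are roughly those of PMMA and PS, but the thermo-optic coefficients and the temperature rise are hypothetical values chosen only to illustrate the mechanism; they are not taken from [30].

```python
def heated_binary_medium(n_low, n_high, dn_dT_low, dn_dT_high, delta_T):
    """Return (mean index, index contrast) of a 50/50 binary medium heated by delta_T kelvin.

    Each material is assumed to respond linearly: n(T) = n + (dn/dT) * delta_T.
    """
    n_low_hot = n_low + dn_dT_low * delta_T
    n_high_hot = n_high + dn_dT_high * delta_T
    mean_index = 0.5 * (n_low_hot + n_high_hot)
    contrast = abs(n_high_hot - n_low_hot)
    return mean_index, contrast


if __name__ == "__main__":
    # Hypothetical coefficients: the low-index material loses index faster on heating.
    params = dict(n_low=1.49, n_high=1.59, dn_dT_low=-2e-4, dn_dT_high=-1e-4)
    for delta_T in (0.0, 5.0, 10.0):
        mean_index, contrast = heated_binary_medium(delta_T=delta_T, **params)
        print(f"dT = {delta_T:4.1f} K: mean n = {mean_index:.5f}, contrast = {contrast:.5f}")
```

In this toy setting the mean index decreases with heating, as expected for a negative thermal nonlinearity, while the contrast between the two materials grows. A larger contrast implies a shorter mean free path and therefore a shorter localization length, which is the focusing behavior described in the text.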
This effect makes local, all-optical tuning of the localization length possible, so that the position of the localized states can be driven in a form of localization-mediated beam steering. The steering effect is reported in Figure 4. Figure 4A shows light reflected by the fiber input: the probe beam (green spot on the left) and the pump beam (red spot on the right). Figures 4B-D show the shape of the probe beam at the output as a function of the pump beam power. Here it is possible to note how strongly the probe beam is attracted towards the pump beam. Figure 4E shows the distance of the probe beam center from the pump beam as a function of the input power.

Figure 3 (caption fragment): The thin red line represents the source spectrum. Panel (B) reports the shape of the most intense mode (at ≃801 nm) for five values of the input power. Panel (C) shows the localization length versus input intensity. All data are from [29].

Frontiers in Physics | www.frontiersin.org | August 2021 | Volume 9 | Article 715663
Nonlocality also acts when more than two modes are involved. The behavior of a group of localized modes is visualized in Figures 4F,G. Here we show (data from [29]) the mode density (number of localized modes per square µm) along the x and y axes at the output of a fiber. The mode density has been characterized for various values of the input power. The modes indeed appear to attract one another and then, after a "collision," start to diverge.

Localized States and "Single Modes"
The Anderson localization (AL) scenario typically comprises a disordered system supporting states which are strongly localized at different locations in space and at different energies [31]. These disorder-induced resonances have thus a poor or negligible spatial and spectral overlap, so that transverse energy transport is substantially slowed down.
While the majority of the studies on AL are focused on the measurement of transport-related quantities (such as diffusion or conductance [6][7][8]21]), it is also interesting to study the properties of disorder-generated localized states. These light structures could be employed for energy storage [28,32] or super-efficient lasing [33,34]. Indeed, localized states act exactly like a microresonator, with the difference that the resonance is sustained by a disordered structure instead of a regular one. In photonics, "lonely" structures of this kind are extensively employed in several fields: the most successful applications are in fiber optics and laser resonators, where they are named single-mode fibers or single-mode resonators. The principle of operation for both applications is very similar: they are resonant structures designed to host a single solution (typically the fundamental one) of the wave equation with no (or very small) losses.
In the case of disordered optical fibers, one may ask to what extent a localized state operates like the single mode hosted in the core of a single-mode fiber. This issue has been extensively studied in [35].
Figure 4 (caption fragment): [...] and the pump beam (red, on the right). Panels (B-D) report the probe beam (pump light has been removed from the detector with a spectral filter) for several values of pump power. Panel (E) shows the distance between probe and pump beam versus the pump power. Panels (F, G) report the mode density along the X axis and Y axis, respectively, for several pump powers. Data are from [29].

In contrast to multimode fibers, disordered binary fibers (DBF) show peculiar transmittance maps. The transmittance map is the total intensity (integrated over the whole fiber output tip) measured as a function of the injection location, using the setup shown in Figure 5. Light from a CW laser is tightly focused into a spot of 0.7 µm diameter at the DBF input. The fiber input tip is mounted on a motorized actuator, which makes it possible to scan the injection location ($\mathbf{r} = [x, y]$) across the input plane. The total transmittance $T(\mathbf{r})$ is thus obtained by summing the whole intensity measured on the output plane ($\mathbf{R} = [X, Y]$) by Camera 2. A typical transmittance map is reported in Figure 5A. High-transmittance locations (green spots) appear rather sparsely and are surrounded by a sea of barely transmitting locations. These "hotspots" are the locations at which the input (which has a size much smaller than the localization length) couples efficiently to a transmission channel corresponding to a transversely localized state. The sparseness of the transmittance map should thus be a consequence of the fact that the coupling conditions are very "strict" (the resonance bandwidth is very small), so that coupling happens only at specific locations. It is now interesting to further investigate the nature of these transmittance hotspots. The most accessible feature is the intensity profile measured at the fiber output: this is reported in Figure 6 for four different input locations. The input locations are identified by small colored dots labeled ($a_i$, $b_i$, $c_i$, $d_i$) in Figure 5A. The intensity profiles in panels A and B of Figure 6 correspond to injection locations in the same hotspot, and they produce two very similar output intensity profiles. On the other hand, two very close input locations lying in a barely transmitting area (panels C and D of Figure 6) produce two very different output intensity profiles.
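The construction of the transmittance map described above can be sketched in a few lines. The array layout, the function names, and the relative threshold used to flag hotspots are assumptions made for this illustration; the synthetic frames stand in for measured camera data.

```python
import numpy as np


def transmittance_map(output_images):
    """Sum each output frame over the camera plane.

    output_images has shape (nx, ny, NX, NY): one camera frame I(R, r) of size
    (NX, NY) for every injection location r = (x, y). Returns T(r), shape (nx, ny).
    """
    return output_images.sum(axis=(2, 3))


def hotspot_mask(t_map, relative_threshold=0.5):
    """Flag injection locations whose transmittance exceeds a fraction of the maximum.

    The threshold value is an arbitrary choice for this sketch.
    """
    return t_map >= relative_threshold * t_map.max()


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-in for measured data: weak background plus one bright channel.
    frames = 0.01 * rng.random((8, 8, 16, 16))
    frames[3, 4] += 1.0
    t_map = transmittance_map(frames)
    print("hotspot locations:", np.argwhere(hotspot_mask(t_map)).tolist())
```

Summing over the last two axes integrates over the output plane, so each entry of the resulting map depends only on how efficiently the corresponding injection location couples into the fiber.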
The intensity profile corresponding to a highly efficient transmission channel is thus a fingerprint of the channel, in the same way as the Gaussian profile emerging from a single-mode fiber is an unmistakable signature of efficient coupling of a laser beam to the fundamental mode of the fiber's core.
To verify this picture, one should observe where the mode's fingerprint is found while scanning the input of the fiber. To perform this measurement systematically, the authors engineered a specialized observable, the degree of similarity $Q(\mathbf{r}_1, \mathbf{r}_2)$, normalized such that $Q(\mathbf{r}, \mathbf{r}) = 1$. The fingerprint of a transmission channel is the output intensity profile retrieved at the location of highest transmittance. For the transmission channel located at $a_i$, this produces the Q-map $Q(\mathbf{r}_a, \mathbf{r}_2) = \int I(\mathbf{R}, \mathbf{r}_a)\, I(\mathbf{R}, \mathbf{r}_2)\, d\mathbf{R}$, where $\mathbf{r}_a$ corresponds to the location of highest intensity of the mode $a_i$. By computing $Q(\mathbf{r}_a, \mathbf{r}_2)$ for all $\mathbf{r}_2$ in the area of view, we retrieve the Q-map reported in Figure 7A. The white/bluish area (where $Q \simeq 1$) corresponds to the dwelling area of the mode: the set of input locations from which the mode can be activated. The dwelling area is very sharp, meaning that when the mode is activated, no other modes (which would modify the fingerprint and immediately lower $Q$) are activated.
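A Q-map of this kind can be computed from a stack of output frames as sketched below. The normalization used here, dividing the overlap integral by the geometric mean of the two self-overlaps, is one possible choice that guarantees $Q(\mathbf{r}, \mathbf{r}) = 1$; the exact normalization adopted in [35] is not specified in the text, so this should be read as an assumption, as should the array layout and function names.

```python
import numpy as np


def similarity(profile_1, profile_2):
    """Normalized overlap of two output intensity profiles (1 for identical shapes)."""
    overlap = np.sum(profile_1 * profile_2)
    norm = np.sqrt(np.sum(profile_1 ** 2) * np.sum(profile_2 ** 2))
    return float(overlap / norm) if norm > 0 else 0.0


def q_map(output_images, reference_index):
    """Return Q(r_a, r_2) for every injection location r_2.

    output_images has shape (nx, ny, NX, NY); reference_index = (ix, iy) selects
    the frame measured at the reference injection location r_a.
    """
    reference = output_images[reference_index]
    nx, ny = output_images.shape[:2]
    q = np.empty((nx, ny))
    for ix in range(nx):
        for iy in range(ny):
            q[ix, iy] = similarity(reference, output_images[ix, iy])
    return q


if __name__ == "__main__":
    # Synthetic example: two injection locations excite the same output profile
    # with different brightness, a third excites a disjoint profile.
    frames = np.zeros((1, 3, 4, 4))
    frames[0, 0, 1, 1] = 1.0
    frames[0, 1, 1, 1] = 0.3
    frames[0, 2, 3, 3] = 1.0
    print(q_map(frames, (0, 0)))
```

With this normalization the similarity is insensitive to the overall brightness of a frame, so a location that excites the same channel less efficiently still yields a value close to 1, while a location exciting a different profile yields a value close to 0.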
A similar situation is found in Figure 7B for the mode in $e_i$. The two modes are only barely overlapping: energy is not flowing from one localized state to the other. The dark area in both maps corresponds to locations from which no intensity is transferred to the localized state. Note that small displacements of the input inside the dwelling area do not cause any modification of the mode fingerprint. Multimode light structures would give rise to a pronounced flickering of the image due to the difference in phase delay of the different modes. The absence of such flickering is relevant evidence of the single-mode nature of the light structures supported by the DBFs.
The fact that the dwelling areas of different modes are barely overlapping is consistent with the picture in which localized states are orthonormal. A definitive confirmation of orthonormality requires measuring all the relevant observables: dwelling area, fingerprint, spectral parameter, and polarization. Such a challenging experiment (requiring the full characterization of thousands of modes) has not yet been carried out, to our knowledge.
On the other hand, Figure 7C, related to mode ($c_i$), provides a Q-map that is almost entirely empty: in the absence of a transmission channel, the retrieved light is not coupled to a localized state. In this case the fiber behaves in a way similar to a (very leaky) large-core multimode fiber, where a small translation of the input produces a complete change of the output due to interference (and thus an immediate decay of the Q value).
Regarding polarization, [24] reports one of the first studies about the impact of polarization: in the Supplementary Figure S3 of reference [24], the authors show that image reconstruction is nearly unaffected by the input polarization. The Supplementary Figure S2 of reference [35] demonstrates the "polarization maintaining" nature of the localized states. To our knowledge, there are no experimental studies for the polarization behavior in the nonlinear case; however, we have no evidence suggesting that the picture changes.
The summary is as follows:
1) High transmission channels in a fiber are sparse.
2) They are separated by a barely transmitting "sea".
3) Independently of how (and where) light is coupled to a fiber, each transmission channel retains its fingerprint (output profile).
4) Modes are excited in an alternative fashion (i.e., the same input location activates only one transmission channel at a time).
In other words, localized states of a disordered binary fiber behave exactly like the single modes of conventional single mode fibers showing the same property: namely the "resilience to the launch conditions."

Designed Disorder in Glass Fibers
Disordered binary fibers (DBF) are a unique architecture [36]: a fiber without cores (thus similar to a multimode fiber), capable of hosting localized/single-mode solutions. However, the high absorbance of the plastic component materials, together with fabrication-induced scattering losses, considerably degrades their transmittance efficiency, which remains very limited, especially if compared with that of silica fibers, which are capable of transmitting light over kilometers with low loss. It is thus very promising to obtain transverse localization in disordered glassy fibers. The first observation of transverse Anderson localization in a glass optical fiber has been obtained by Mafi and coworkers [37]. The glassy disordered fiber was obtained starting from a "porous satin quartz" rod of 8 mm in diameter and 850 mm in length, from which a single 150 m long fiber (diameter 250 µm) was drawn. However, in this system the nonhomogeneous distribution of disorder (lower air hole density in the central region of the fiber) produces localized states only at the borders of the fiber. This uneven distribution of disorder prevents a complete optical exploitation of the waveguide section. Moreover, the positions of the defects (the air bubbles) are random (they result from the natural occurrence of pores in the rod) and cannot be tuned by the experimentalist at the fabrication stage.
On the other hand, the concept of designed disorder [38] is becoming an intense field of research, with applications ranging from the fabrication of waveguide polarizers to light harvesting [39][40][41]. In fact, in some cases, disordered structures, even if fully deterministic, can be more favorable for specific tasks than periodic ones. For example, disordered arrays of defects can be employed to produce a structure displaying different propagation regimes (full photonic bandgap, Anderson localization, or free diffusion), depending on the wavelength employed [42].
In order to implant "designed disorder" into glassy optical fibers, the authors of [43] employed the femtosecond direct laser writing (FDLW) technique. FDLW [44] has existed since the early nineties and enables nanometric resolution in surface ablation. In transparent materials, bulk micromachining can be achieved through nonlinear (two- or three-photon) absorption, thus enabling the fabrication of photonic or microfluidic devices. The strong confinement of the nonlinear absorption volume, together with positioning performed by piezo-actuators with nanometric resolution, enables the fabrication of complex three-dimensional structures. The modifications by nonlinear absorption yield a local refractive-index change (at low power) or even void formation (at high power). Importantly, the changes produced are permanent; thus, the low-power approach makes it possible to produce durable waveguides. The group of Szameit and coworkers reported several experiments on waveguide arrays in which disorder is introduced in the inter-waveguide coupling factors. This approach made it possible to investigate Anderson localization [45][46][47][48], defect localization [49], and also topological insulation [50]. This waveguide-based approach, indeed, gives access to a plethora of intriguing physical phenomena; however, it requires the individual fabrication of each of the transmission channels. In this sense, DBF support localization in a different manner: they can be seen as a continuous meta-material, potentially hosting localized states at any location that supports the resonance condition. The most evident consequence of this difference is that in DBF localized states gradually shift their position when the wavelength is changed, while in waveguide arrays they can be hosted only at the waveguide locations.
In order to transfer the advantage of DBF to glasses, Gianfrate and coworkers [43] employed FDLW in a nontraditional way. In particular, they employed an objective with high numerical aperture (NA = 0.65) to generate tubes with very small diameter and with a refractive index larger than that of the surrounding medium. These paraxial structures play the role of transverse scatterers: because their reduced transverse dimension does not allow them to support propagating modes, they act as paraxial defects (see sketch in Figure 1). This new generation of optical fibers based on paraxial defects has been studied in [43], where the authors demonstrate that the confinement properties depend on the degree of disorder $0 < \chi < 1$, a parameter tuned at the fabrication stage. The paraxial defects are fabricated at the transverse coordinates $x = \delta\,(M_x + \chi\,\theta_{M_x})$ and $y = \delta\,(M_y + \chi\,\theta_{M_y})$, where $M_x$ and $M_y$ are integer numbers between 0 and $S$, $\delta$ is the lattice size, and $\theta$ is a uniform random number in the interval [−0.5, 0.5]. When $\chi$ is 0, the paraxial defects are located on a square lattice with cell side $\delta$ and $S^2$ defects (a square with side $\delta S$). For $\chi > 0$ each defect is displaced by a random amount $\delta\chi\theta_{M_x}$ along x and $\delta\chi\theta_{M_y}$ along y, generating a square lattice with an increased degree of randomness. Figure 8 reports the measured localization length $\xi$ as a function of $\chi$. The localization length decreases up to $\chi = 0.6$ and then starts to increase again. While the decrease is expected as a natural consequence of increasing disorder, the increase above $\chi = 0.6$ results from the appearance of overlapping paraxial defects, which effectively decrease the defect density.
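The defect layout described above can be generated with a short script. This is a minimal sketch under stated assumptions: lattice indices run from 0 to S − 1 so that exactly $S^2$ defects are produced, the random offsets are drawn independently for each defect and each axis, and the defect diameter used to count overlapping pairs is a hypothetical parameter. None of the numerical values are taken from [43].

```python
import numpy as np


def defect_positions(S, delta, chi, rng=None):
    """Return an (S*S, 2) array of defect coordinates on a perturbed square lattice.

    Each defect sits at delta * (M + chi * theta), with integer lattice index M
    and theta drawn uniformly from [-0.5, 0.5].
    """
    rng = np.random.default_rng() if rng is None else rng
    mx, my = np.meshgrid(np.arange(S), np.arange(S), indexing="ij")
    theta_x = rng.uniform(-0.5, 0.5, size=(S, S))
    theta_y = rng.uniform(-0.5, 0.5, size=(S, S))
    x = delta * (mx + chi * theta_x)
    y = delta * (my + chi * theta_y)
    return np.column_stack([x.ravel(), y.ravel()])


def count_overlaps(positions, diameter):
    """Count pairs of defects whose centers are closer than one defect diameter."""
    diff = positions[:, None, :] - positions[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    close = dist < diameter
    # Remove the diagonal (self-pairs) and count each pair once.
    return int((close.sum() - len(positions)) // 2)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    for chi in (0.0, 0.3, 0.6, 0.9):
        pos = defect_positions(S=20, delta=2.0, chi=chi, rng=rng)
        print(f"chi = {chi:.1f}: overlapping pairs = {count_overlaps(pos, diameter=1.2)}")
```

For small $\chi$ the displacements are too small for neighboring defects to touch. Once the maximum displacement becomes comparable to the gap between neighboring defects, overlapping pairs start to appear, which is the mechanism invoked in the text for the increase of the localization length at large $\chi$.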
The realization of localization induced by direct laser-written defects is the first step towards a new generation of glass-based optical fibers characterized by low absorption and greater stability with respect to their plastic counterparts. The ability to tune the defect positions will open the possibility of testing the concepts of designed disorder directly in optical fibers, thus paving the way towards unprecedented applications.

Historical Overview
Since the appearance of Anderson's seminal 1958 article [17], the interest of the condensed-matter community in electron and wave localization has not decreased [31,51]. That Anderson localization is an interference phenomenon, i.e., a consequence of the wave nature of electrons, became clear only in a second seminal paper by the "gang of four" Abrahams, Anderson, Licciardello, and Ramakrishnan [20], in which they combined perturbation theory with a scaling procedure (to be described below) to show that disorder-induced interference always leads to localization in two dimensions and in one dimension. That one-dimensional systems always feature localized states had already been shown by Mott and Twose in 1961 [52].
The rather ad-hoc scaling argument of the gang of four has been given two complementary field-theoretic foundations: the self-consistent localization theory of Vollhardt and Wölfle [53][54][55] and the generalized nonlinear sigma model [56,57], which goes back to a paper by Wegner [58]. Wegner realized that the nonlinear sigma model for planar ferromagnetism exhibits the same scaling behavior as the scaling theory of Anderson localization. Shortly after the self-consistent localization theory of Vollhardt and Wölfle [53][54][55] appeared, it was noted by Economou and Soukoulis [59] that the resulting self-consistent equation for finding the localization length was mathematically analogous to the problem of finding (or not finding) a bound state for single electrons within a potential well ("potential-well analogy"). The potential-well-analogy method was later formulated more rigorously by Soukoulis et al. [60] and Zdetsis et al. [61]. In all three analytic approaches, namely 1) the scaling/nonlinear sigma model theory, 2) the self-consistent theory, and 3) the potential-well analogy, one proceeds in two steps for calculating the localization characteristics (the phase diagram, the conductance in the delocalized regime, and the localization length in the localized regime):
1) Calculating an "unrenormalized" (or "reference") conductance $g_0$ from the disorder statistics of the spatially fluctuating potential.
2) Applying, depending on the approach, the scaling equations (scaling theory/nonlinear sigma model), the self-consistency relation (self-consistent localization theory), or the potential-well relation (potential-well analogy).
It is interesting to note that, as the potential-well analogy is not based on the assumption of weak disorder (i.e., the assumption that the relative variance of the potential fluctuations is small), one can apply the standard effective-medium theory for strong disorder, namely the coherent-potential approximation (CPA), for calculating $g_0$ [61]. Complementary numerical methods for solving the Anderson localization problem are the Green's function method [62,63] and the transfer matrix method [64]. In both methods the one-parameter scaling idea is used to extract the bulk localization features by finite-size scaling. These methods do not suffer from some shortcomings of the analytical approaches: the analytic approaches predict, e.g., a critical exponent of ν = 1.0, whereas the true one, obtained by the numerical work, is ν = 1.57 [62,65].
The standard (Anderson) model for electron localization is given by the following Hamiltonian on a simple hypercubic lattice [17] with lattice constant a:

$$H = \sum_i \epsilon_i\,|i\rangle\langle i| + t\sum_{\langle i,j\rangle}|i\rangle\langle j| \qquad (10)$$

where the indices i, j denote lattice sites and the double sum is over nearest neighbors only. $\epsilon_i = \langle i|V(\mathbf r)|i\rangle$ is the diagonal element of an external potential V(r), which varies randomly in space, and $\langle\mathbf r|i\rangle$ are Wannier basis states. Usually the local potentials $\epsilon_i$ are assumed to be independent random variables, distributed according to a distribution density P(ϵ), which can be a Gaussian or a box-shaped function [66]. The overlap ("hopping") matrix element t is assumed to be constant. Near the bottom (or top) of the band, one can write down a continuum Hamiltonian,

$$H = -\frac{\hbar^2}{2m}\nabla^2 + V(\mathbf r) \qquad (11)$$

with $ta^2 = \hbar^2/2m$. Here a constant 6t has been added (subtracted) at the bottom (top) of the band. This leads to the Schrödinger equation

$$\left[-\frac{\hbar^2}{2m}\nabla^2 + V(\mathbf r)\right]\psi(\mathbf r) = E\,\psi(\mathbf r) \qquad (12)$$

In the continuum description [57], one considers V(r) as a Gaussian random variable with zero mean and a prescribed correlation function $\langle V(\mathbf r)V(\mathbf r')\rangle$. As there is no difference from the mathematical standpoint between a time-Fourier-transformed classical wave equation (Helmholtz equation) and the Schrödinger Eq. 12 for the electrons (identifying $2mE/\hbar^2$ with $\omega^2/c^2$, where ω is the angular frequency and c the wave velocity), it was soon suggested that acoustical [67] and electromagnetic (EM) waves [68,69] should also exhibit Anderson localization. These theoretical approaches were based on the nonlinear-sigma-model formalism. A multiple-scattering approach for localization of (schematically scalar) acoustical waves was published by Kirkpatrick (1985) [70].
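The transfer matrix method mentioned above can be illustrated for the one-dimensional version of the Anderson model. The following is a minimal sketch under stated assumptions, not the finite-size-scaling procedure of [64]: it takes a single chain with hopping t = 1, site energies drawn from a box distribution of width W (a hypothetical parameter name), and estimates the localization length as the inverse Lyapunov exponent of the tight-binding recursion.

```python
import math
import random

def localization_length_1d(W, E=0.0, n_sites=200_000, seed=0):
    """Estimate the localization length (in lattice constants) of the 1D
    Anderson model with hopping t = 1 and site energies uniform in [-W/2, W/2]."""
    rng = random.Random(seed)
    r = 1.0            # ratio psi_{n+1} / psi_n of neighboring amplitudes
    log_growth = 0.0   # accumulated ln|psi_{n+1} / psi_n|
    for _ in range(n_sites):
        eps = rng.uniform(-W / 2.0, W / 2.0)
        # tight-binding recursion psi_{n+1} = (E - eps_n) psi_n - psi_{n-1},
        # rewritten for the amplitude ratio r_n = psi_{n+1} / psi_n
        r = (E - eps) - 1.0 / r
        if r == 0.0:
            r = 1e-300  # guard against an exact zero before the next division
        log_growth += math.log(abs(r))
    lyapunov = log_growth / n_sites  # average exponential growth rate per site
    return 1.0 / lyapunov

xi_weak = localization_length_1d(W=1.0)
xi_strong = localization_length_1d(W=2.0)
```

For weak disorder the estimate can be compared with the standard perturbative result for the inverse localization length, $W^2/[96\,t^2(1 - E^2/4t^2)]$, which is known to be modified by roughly ten percent at the band center. In one dimension the states are localized for any W > 0, in line with the statement attributed to Mott and Twose.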
Considering acoustical waves in a disordered medium, the disorder may come from 1) spatial density fluctuations or 2) spatial fluctuations of elastic moduli. John et al. [67] assumed density fluctuations, and the approach of Kirkpatrick (see Eqs 6a-6c of [70]) is equivalent to considering fluctuating elastic moduli. If one assumes both density and modulus fluctuations, the wave equation for the schematically scalar acoustical displacement field ϕ(r, t) takes the form

$$\rho(\mathbf r)\,\frac{\partial^2}{\partial t^2}\phi(\mathbf r,t) = \nabla\cdot K(\mathbf r)\nabla\phi(\mathbf r,t)$$

or in frequency space

$$-\omega^2\rho(\mathbf r)\,\phi(\mathbf r,\omega) = \nabla\cdot K(\mathbf r)\nabla\phi(\mathbf r,\omega) \qquad (14)$$

If the density ρ does not fluctuate, Eq. 14 constitutes an ordinary eigenvalue problem, which can be solved by discretization, followed by diagonalization. In the case of density fluctuations, one can separate the fluctuations from the mean, $\rho(\mathbf r) = \rho_0 + \Delta\rho(\mathbf r)$:

$$-\omega^2\left[\rho_0 + \Delta\rho(\mathbf r)\right]\phi(\mathbf r,\omega) = \nabla\cdot K(\mathbf r)\nabla\phi(\mathbf r,\omega)$$

Now the term $\omega^2\Delta\rho(\mathbf r)$ looks like a frequency-dependent potential. But from the mathematical standpoint, it is strange that in an eigenvalue problem the potential depends on the eigenvalue. Certainly it would be more sound to divide the whole equation by ρ(r) to obtain an effective modulus K(r)/ρ(r). However, in their approach to acoustical localization, John et al. [67] kept the effective frequency-dependent potential, because then they could take over the established electronic theory of Anderson localization, in particular the nonlinear-sigma-model theory of McKane and Stone [57]. In the seminal article on the localization of light [71], its author pursued the same strategy: he wrote down a wave equation for the electrical field E(r, ω), where the $\omega^2$ term was multiplied by the spatially varying permittivity ϵ(r). In this derivation, he tacitly assumed that the divergence of E would be zero. We already pointed out that, in the presence of a disorder-induced spatially varying permittivity, this is not the case. We shall discuss the consequences of this in the following sections, after introducing the scaling concept of localization theory.
For later reference, let us call a stochastic wave equation with a fluctuating coefficient of $\omega^2$ a "potential-type" (PT) equation and, if the elastic modulus fluctuates, a "modulus-type" (MT) stochastic equation.
Pinski et al. [65,69] used the transfer matrix method to solve the discretized stochastic acoustic wave Eq. 14 for the density of states and the localization properties, comparing the MT case with the PT case. While they find that the critical properties of both models are within the universality class of the electronic Anderson problem [72], the phase diagrams of the two models are rather different. The quantum analogue of the PT model is the Anderson model of Eqs. 10, 12, whereas the quantum analogue of the MT model is an electron system with spatially fluctuating hopping amplitude or effective mass ("off-diagonal disorder") [73], which also has a phase diagram different from that of the Anderson model [73]. So one can state that the PT and MT stochastic wave equations describe two physically different situations of wave disorder. In the electromagnetic case, however, they are supposed to describe the same situation, namely EM waves in a medium with spatially fluctuating permittivity $\epsilon(\mathbf r) = n^2(\mathbf r)$. The solution of this paradox is, as we shall demonstrate below, that the PT description is an approximation (neglecting the ∇·E term), whereas the MT description is exact.

Wave Interference and the Scaling Theory of Anderson Localization
If waves experience disorder, like electrons in an impure metal, they are repeatedly scattered. This multiple scattering process can be interpreted as a random walk, featuring a diffusion constant D. In fact, one can derive a diffusion equation for the wave (diffusion approximation [74,75]). The diffusion approximation is equivalent to the relaxation time approximation [76], leading to the Drude law for the conductivity.
The diffusion approximation assumes that after each scattering event, the phase memory is lost. However, if one follows the scattering amplitudes with phases $k l_{ij}$ (where $l_{ij}$ is the distance between two adjacent scattering centers) in a frozen medium, the phase memory is in principle not lost. This has dramatic consequences for recurrent partial paths, i.e., paths with closed loops: the phase of the recurrent path is exactly equal to that of the path traversed in the reverse direction (see Figure 9). The resulting interference enhances the probability of returning to the starting point, at the expense of forward transport, and therefore leads to a decrease of the diffusivity and, as we shall see, for d ≤ 2 to a vanishing of the diffusivity.
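The effect of the equal phases on a closed loop can be made explicit with a two-path estimate. As a sketch, let $A_1$ and $A_2$ (symbols introduced here only for illustration) be the amplitudes for traversing the same loop in the two opposite directions. Compared with the diffusion approximation, in which the phase memory is discarded and intensities are added, the coherent sum contains an additional interference term:

```latex
\begin{align}
P_{\mathrm{return}}^{\mathrm{coherent}}
  &= |A_1 + A_2|^2
   = |A_1|^2 + |A_2|^2 + 2\,\mathrm{Re}\,(A_1^{*}A_2)
   = 4|A|^2 \qquad (A_1 = A_2 = A),\\
P_{\mathrm{return}}^{\mathrm{incoherent}}
  &= |A_1|^2 + |A_2|^2 = 2|A|^2 .
\end{align}
```

In this idealized estimate the return probability is twice the diffusive one. Since intensity that returns to its origin is not transported further, this is the mechanism behind the reduction of the diffusivity discussed here.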
For describing the interference mechanism, Abrahams et al. [20] have proposed an ingenious scaling scenario. They consider the dependence of a dimensionless conductance g on the sample size L in d dimensions and make the Ansatz

$$\frac{d\ln g}{d\ln L} = \beta(g) \qquad (16)$$

For β > 0, g scales towards infinity with increasing L; for β < 0, g scales towards zero. In the metallic regime (g → ∞), the conductance should depend on the size L of a sample as $g(L) \propto \sigma L^{d-2}$, where σ is the conductivity, so that β(g → ∞) = d − 2 (see Figure 10). On the other hand, for localization (g → 0) one expects $g(L) \propto e^{-L/\xi}$, where ξ is the localization length. This transforms to β(g → 0) ∝ ln g. Abrahams et al. then assumed a smooth interpolation between the two limits to exist (see Figure 10). By means of perturbation theory, they further estimated the correction due to the interference terms to be negative and proportional to 1/g. Their final result for the scaling function is

$$\beta(g) = d - 2 - \frac{c}{g} \qquad (17)$$

where c is a dimensionless constant of the order of 1. It can be verified from Figure 10 that in three dimensions, the scaling with increasing size L depends on the initial value of the conductance, i.e., on the conductivity in diffusion approximation (Drude approximation for electrons). However, as can be seen from Figure 10, in two dimensions and in one dimension g always scales towards 0, i.e., for L → ∞, there is always localization. The scaling function (Eq. 17) is the same for the nonlinear sigma model for planar ferromagnetism, as noticed by Wegner [58]. Later a field-theoretical mapping from a stochastic Helmholtz equation to a generalized nonlinear sigma model was established and applied to the electronic Anderson problem [56,57] as well as the PT description of the classical-wave problem [1,67] and the MT description of acoustical waves [77] and light [26].

FIGURE 12 | Experimental apparatus for quantum cryptography using single-photon localized states. Alice encodes her qudits by preparing single-photon states via a spatial light modulator. She chooses between two bases, namely (K) the eigenstates of the multimode fiber that localize after the propagation and (X) the states that spread after the fiber. Bob measures in the K basis or in the X basis by inserting one or two lenses before the detection stage. After comparing their basis choices, Alice and Bob can extract a secure key.
In two dimensions, the scaling Eq. 17 is solved as

$$g(L) = g_0 - c\,\ln\frac{L}{L_0} \qquad (18)$$

The localization length is given by the value of L at which g has dropped to $g_1 \approx 1$. The reference conductance $g_0$ is the diffusion-approximation conductance

$$g_0 = k\ell$$

where k is the wave number and ℓ the mean free path. For the reference length, traditionally the mean-free path has been taken as well. In our description, instead, we shall take for $L_0$ the correlation length $L_c$ of the disorder fluctuations (see below). From Eq. 18 one obtains the well-known formula [78] for two dimensions:

$$\xi = L_0\,e^{(g_0 - g_1)/c}$$
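The scaling flow can be explored numerically. The following sketch assumes the perturbative form β(g) = d − 2 − c/g, which combines the metallic limit d − 2 with the negative 1/g interference correction described above; the function names, the explicit Euler integration, and the default values c = 1 and g₁ = 1 are choices made here for illustration only.

```python
import math

def beta(g, d, c=1.0):
    """Perturbative scaling function beta(g) = d - 2 - c/g (meaningful for g of order 1 or larger)."""
    return d - 2.0 - c / g

def flow(g0, d, ln_L_ratio, c=1.0, steps=10_000, g_floor=1e-3):
    """Integrate d ln g / d ln L = beta(g) from L0 to L = L0 * exp(ln_L_ratio)
    with explicit Euler steps in ln L and return g(L)."""
    ln_g = math.log(g0)
    h = ln_L_ratio / steps
    for _ in range(steps):
        g = math.exp(ln_g)
        if g < g_floor:
            break  # deep in the localized regime the perturbative form no longer applies
        ln_g += h * beta(g, d, c)
    return math.exp(ln_g)

def xi_2d(g0, L0=1.0, c=1.0, g1=1.0):
    """Length at which the two-dimensional flow g(L) = g0 - c ln(L/L0) reaches g1."""
    return L0 * math.exp((g0 - g1) / c)

g_2d = flow(5.0, d=2, ln_L_ratio=2.0)        # logarithmic decrease in 2D
g_3d_metal = flow(2.0, d=3, ln_L_ratio=3.0)  # starts above g = c: flows upward
g_3d_insul = flow(0.5, d=3, ln_L_ratio=3.0)  # starts below g = c: flows downward
```

In this sketch the three-dimensional flow separates at g = c, where β changes sign, while in two dimensions every starting value flows downward; the exponential dependence of `xi_2d` on g₀ is the reason why weak disorder leads to very large localization lengths.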

Wave Equation for Electromagnetic Waves in a Disordered Environment
As indicated above, almost the entire literature on Anderson localization (AL) of light is based on the potential-type wave equation, i.e., a wave equation in which the spatially fluctuating permittivity ϵ(r) appears as a coefficient of the second time derivative of the wave function (electric field). In a recent article [26], some of the present authors have shown that this wave equation is in error and leads to a fictitious wavelength dependence of the localization length in transverse localization, which is not observed in the experiments. We now review the derivation of the traditional wave equation, showing which error was made, and present the derivation of the correct wave equation.
Maxwell's equations in a medium with spatially varying permittivity ϵ(r) are

$$\nabla\cdot\left[\epsilon(\mathbf r)\mathbf E(\mathbf r,t)\right] = 0, \qquad \nabla\cdot\mathbf B(\mathbf r,t) = 0, \qquad \nabla\times\mathbf E(\mathbf r,t) = -\frac{\partial}{\partial t}\mathbf B(\mathbf r,t), \qquad \nabla\times\mathbf B(\mathbf r,t) = \frac{\epsilon(\mathbf r)}{c_0^2}\frac{\partial}{\partial t}\mathbf E(\mathbf r,t) \qquad (22)$$

For deriving a wave equation for the electromagnetic fields, one can solve either for the electrical field E(r, t) or for the magnetic field B(r, t).

Traditional, Potential-Type Approach
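The traditional procedure (potential-type approach, PT) was to solve for E(r, t):

$$\frac{\epsilon(\mathbf r)}{c_0^2}\frac{\partial^2}{\partial t^2}\mathbf E(\mathbf r,t) = -\nabla\times\nabla\times\mathbf E(\mathbf r,t) = \nabla^2\mathbf E(\mathbf r,t) - \nabla\left(\nabla\cdot\mathbf E(\mathbf r,t)\right) \approx \nabla^2\mathbf E(\mathbf r,t) \qquad (23)$$

where in the last step ∇·E = 0 was assumed. In the frequency regime, we obtain the following stochastic Helmholtz equation:

$$\left[\epsilon(\mathbf r)\frac{\omega^2}{c_0^2} + \nabla^2\right]\mathbf E(\mathbf r,\omega) = 0 \qquad (24)$$

which, separating the fluctuations of the permittivity as $\epsilon(\mathbf r) = \langle\epsilon\rangle + \Delta\epsilon(\mathbf r)$, can be rewritten as

$$\left[\langle\epsilon\rangle\frac{\omega^2}{c_0^2} + \nabla^2 + \frac{\omega^2}{c_0^2}\Delta\epsilon(\mathbf r)\right]\mathbf E(\mathbf r,\omega) = 0 \qquad (25)$$

This equation is mathematically equivalent to a stationary Schrödinger equation for an electron in a frequency-dependent random potential V(ω). This equivalence made it possible to transfer the complete electronic theory of AL [21,53] to electromagnetic waves [21,71,79,80]. We call this approach "potential-type" (PT). We now want to check the validity of the approximation made in Eqs 23-25. We have

$$0 = \nabla\cdot\mathbf D = \nabla\cdot\left(\epsilon(\mathbf r)\mathbf E(\mathbf r)\right) = \epsilon(\mathbf r)\nabla\cdot\mathbf E(\mathbf r) + \mathbf E(\mathbf r)\cdot\nabla\epsilon(\mathbf r) \qquad (26)$$

from which follows [81]

$$\nabla\cdot\mathbf E(\mathbf r) = -\frac{1}{\epsilon(\mathbf r)}\,\mathbf E(\mathbf r)\cdot\nabla\epsilon(\mathbf r) = -\mathbf E(\mathbf r)\cdot\nabla\ln\epsilon(\mathbf r) \qquad (27)$$

One can estimate the error made in Eq. 23 by inserting for E a wave with wavelength λ. If the scale on which the permittivity is varying is large with respect to λ (eikonal limit), the term on the right-hand side of Eq. 27 is negligible. However, if this condition is fulfilled, one deals with very weak disorder. In this case, one has delocalization in three dimensions and, in two dimensions, a very large localization length, exceeding macroscopic sample dimensions, which would make the observation of AL impossible. So, for stronger disorder, where one might have a chance of observing AL, the scale of the permittivity fluctuations must be of order λ. In this case the divergence of E is not negligible. This renders the approximation made in the PT wave Eq. 23 invalid.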
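The size of the neglected term can be put into an order-of-magnitude form. The following is a rough dimensional sketch, not part of the derivation in [26]: it assumes a wave of wavenumber $k = 2\pi/\lambda$, a permittivity that varies by a typical amount $\Delta\epsilon$ over the correlation length $L_c$, and it replaces every derivative of the field by a factor $k$ and every derivative of $\ln\epsilon$ by $\Delta\epsilon/(\langle\epsilon\rangle L_c)$.

```latex
\begin{align}
|\nabla\cdot\mathbf{E}|
  &= |\mathbf{E}\cdot\nabla\ln\epsilon|
  \;\sim\; \frac{|\mathbf{E}|}{L_c}\,\frac{\Delta\epsilon}{\langle\epsilon\rangle},\\
\frac{|\nabla(\nabla\cdot\mathbf{E})|}{|\nabla^2\mathbf{E}|}
  &\sim \frac{k\,|\mathbf{E}|\,\Delta\epsilon/(L_c\langle\epsilon\rangle)}{k^2\,|\mathbf{E}|}
  = \frac{\lambda}{2\pi L_c}\,\frac{\Delta\epsilon}{\langle\epsilon\rangle}.
\end{align}
```

Under these assumptions the dropped term is small only if $L_c \gg \lambda$ or if the relative permittivity fluctuations are very small, which are the weak-disorder conditions under which localization is hard to observe.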

New Approach: Modulus-Type Description
If one solves the Maxwell Eq. 22 for B, one obtains

$$\frac{\partial^2}{\partial t^2}\mathbf B(\mathbf r,t) = -c_0^2\,\nabla\times\frac{1}{\epsilon(\mathbf r)}\nabla\times\mathbf B(\mathbf r,t) \qquad (28)$$

Equation 28 leads to the following stochastic Helmholtz equation:

$$\left[\omega^2 - \nabla\times M(\mathbf r)\nabla\times\right]\mathbf B(\mathbf r,\omega) = 0 \qquad (29)$$

where we have defined the spatially fluctuating dielectric modulus as $M(\mathbf r) = c_0^2/\epsilon(\mathbf r)$. Equation 29 is mathematically equivalent to the Helmholtz equation for an elastic medium with zero bulk modulus and a spatially fluctuating shear modulus M(r). This equation is exact and is called the modulus-type (MT) approach. A theory for a medium with finite (constant) bulk modulus K and a spatially fluctuating shear modulus has been worked out [77,82,83] by some of the present authors and applied for explaining the anomalous vibrational properties of glasses, in particular the enhancement of the vibrational density of states with respect to the Debye law ("boson peak"). Our present theory of Anderson localization of light relies on the analogy to this case. Essentially one needs only to take the K → 0 limit of this theory to obtain a theory for light diffusion and localization in disordered optical systems.
In order to describe transverse Anderson localization, we first map this problem to a two-dimensional problem. We then use the paraxial approximation to map the z dependence of the wave profiles to the time dependence in an effective Schrödinger equation. For estimating the diameter of the large-z profile, the localization length ξ, we apply the scaling theory of Anderson localization [20,21], which is equivalent to the renormalization-group approach to the generalized nonlinear sigma model [56][57][58]. For the calculation of the z dependence of the localization length, we then use the self-consistent localization theory of Vollhardt and Wölfle [53][54][55]80].

Description of Optical Fibers With Transverse Disorder
We now consider an optical fiber with transverse disorder, i.e., the permittivity exhibits spatial fluctuations in x and y direction, but not in z direction.
Because our system is translation invariant with respect to the z direction, we may take a Fourier transform with respect to z: $\mathbf B(x, y, k_z, \omega) = \int dz\, e^{ik_z z}\,\mathbf B(x, y, z, \omega)$. We then obtain an effective two-dimensional Helmholtz equation where $k_0 = \omega/\sqrt{\langle M\rangle} = 2\pi/\lambda$ is the wavenumber of the input laser beam, λ is its wavelength, and θ is the angle between the direction of the incident beam and the optical axis (azimuthal angle); see Figure 1. $E = k_\perp^2 = k_0^2 - k_z^2 = k_0^2\sin^2\theta$ is called the spectral parameter. It replaces the spectral parameter $\omega^2$ of a true two-dimensional system.
For θ ≪ 1 we can make the approximation $E = (k_0 + k_z)(k_0 - k_z) \approx -2k_0(k_z - k_0) \equiv -2k_0\Delta k_z$, which is called the paraxial approximation [16]. The wavenumber $\Delta k_z$ refers to the Fourier component of the envelope $\mathbf b(x, y, z) = \mathbf B(x, y, z)\,e^{-ik_0 z}$, which describes the beats of the magnetic field vector B(x, y, z) in z direction. In the paraxial limit b(x, y, z) obeys the paraxial equation Introducing a "time" $\tau = z/2k_0$ (which has the dimension of a squared length), this equation acquires the form of a Schrödinger equation of an electron in a medium with a randomly varying effective mass: As stated above, such a model is related to a stochastic tight-binding model with a spatially varying hopping amplitude ("off-diagonal disorder") [73].
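The quality of the paraxial approximation can be checked numerically. The sketch below compares the exact spectral parameter $E = k_0^2 - k_z^2$ with its paraxial form $-2k_0(k_z - k_0)$ for a beam tilted by θ with respect to the optical axis, using $k_z = k_0\cos\theta$; the function names and the unit wavelength are illustrative assumptions.

```python
import math

def exact_spectral_parameter(wavelength, theta):
    """E = k0^2 - kz^2 = k0^2 sin^2(theta), with kz = k0 cos(theta)."""
    k0 = 2.0 * math.pi / wavelength
    k_z = k0 * math.cos(theta)
    return k0 ** 2 - k_z ** 2

def paraxial_spectral_parameter(wavelength, theta):
    """Paraxial form E ~ -2 k0 (kz - k0) = -2 k0 * Delta k_z."""
    k0 = 2.0 * math.pi / wavelength
    k_z = k0 * math.cos(theta)
    return -2.0 * k0 * (k_z - k0)

def relative_error(theta, wavelength=1.0):
    """Relative deviation of the paraxial form from the exact spectral parameter."""
    exact = exact_spectral_parameter(wavelength, theta)
    return paraxial_spectral_parameter(wavelength, theta) / exact - 1.0

small_angle_error = relative_error(0.05)  # about 3 degrees
large_angle_error = relative_error(0.5)   # about 29 degrees
```

Because the ratio of the two expressions is $2/(1 + \cos\theta)$, the relative error equals $\tan^2(\theta/2)$ and does not depend on the wavelength; it grows roughly as $\theta^2/4$, so the approximation degrades for strongly tilted beams.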
Let us now compare the MT Eq. 31 with Eq. 5. From this comparison, we can estimate the influence of the wavelength $\lambda = 2\pi/k_0$ on the localization properties: in the exact MT Eq. 31 the wavenumber $k_0$ enters only as a prefactor of the z derivative. Therefore, in the steady-state regime z → ∞, $k_0$ does not enter at all. This is, however, completely different for the PT case described by Eq. 5: here $k_0$ is the prefactor of the fluctuating-disorder term, which governs the mean-free path and hence the localization length.
How can the predictions of two wave equations that are supposed to describe the same physical situation, namely, the wave propagation (or localization) in samples with transverse disorder, differ in such a drastic way? The difference can be traced back to the fact that in deriving Eq. 5 the term ∇·E has been dropped. Therefore, the additional wavelength dependence implied by the PT Eq. 5 must be an artifact of this approximation. We shall come back to the comparison between the PT and MT predictions when we display explicit results obtained in the two approaches.

Rayleigh Scattering and Disorder
The most important aspect of Anderson localization is the disordered environment. In the case of electromagnetic waves, the disorder may appear as randomly distributed scatterers or as a spatially fluctuating permittivity $\epsilon(\mathbf r) = n^2(\mathbf r)$.
Lord Rayleigh considered in his seminal papers on the blue color of the sky [84,85] point-like scatterers, which act like induced Hertzian dipoles. Jackson points out in his textbook [86] that, considering a fluctuating permittivity, one obtains as well the famous $\omega^4$ law for the scattering cross-section, which is inversely proportional to the mean-free path. It is known nowadays that this law becomes $\omega^{d+1}$ in d dimensions [87,88].
Because in our effective two-dimensional system the wave spectral parameter $\omega^2$ is replaced by $E = k_\perp^2$, we must have for small E, by the two-dimensional Rayleigh law,

$$\frac{1}{\ell(E)} \propto k_\perp^3 = E^{3/2} \qquad (33)$$

We shall demonstrate in Section 3.7 that the PT approach violates this requirement.
For weak disorder, i.e., $\langle\Delta M^2\rangle \ll \langle M\rangle^2$, the detailed distribution of the moduli (Gaussian or otherwise) is not important, because, as we shall see, the only parameters which enter into the mean-free path are the mean $\langle M\rangle$, the variance $\langle\Delta M^2\rangle$, and the correlation length $L_c$ of the fluctuating inverse permittivity (modulus). Here $\Delta M(\boldsymbol\rho) = M(\boldsymbol\rho) - \langle M\rangle$ are the fluctuations about the mean. Here and in the following ρ signifies the two-dimensional spatial vector. (The theory may be generalized to include strong disorder using the coherent-potential approximation [89]. This generalization includes percolative aspects, which may be relevant for TL fibers made of glass.)
The correlation length of the spatially fluctuating modulus is an important parameter, because it defines the characteristic length scale of these fluctuations. It is defined by the length scale of the spatial decay of the correlation function of the modulus fluctuations.

Simplified Scalar MT Model and Born Approximation
In this introductory subsection, we consider a simplified MT Helmholtz equation for a schematic scalar wave function b(ρ, E) The corresponding Green's function obeys where s = E + iϵ is the complex spectral parameter. It has been shown in [89] that for a sufficiently small spectral parameter one can use the Born approximation with respect to the fluctuations ΔM of the (in this case elastic) modulus. In order to apply the Born approximation, the Fourier-transformed averaged Green's function is represented in terms of a complex self-energy function Σ(E): and the lowest (quadratic) order in ΔM is given by with the unperturbed Green's function We now represent the correlation function schematically by introducing an upper wavenumber cutoff $q_c \propto L_c^{-1}$: where $C_0$ is a dimensionless constant and θ(x) is the Heaviside step function. From this we get, using the fact that the Green's function depends only on $q = |\mathbf q|$, We fix $C_0$ by requiring that the q sum i.e., $C_0 = 4\pi$. We finally get for the self-energy with the "disorder parameter" $c = \langle(\Delta M)^2\rangle/\langle M\rangle^2$. The inverse mean-free path is given by the imaginary part of Σ, multiplied by $k_\perp$ (see the next subsection): The integral in Eq. 42 or Eq. 44 is elementary and gives $k_\perp^2\pi/q_c^2$ for $k_\perp \le q_c$, so that we obtain from the Born approximation (Eq. 42) the Rayleigh law (Eq. 33) which, as said above, holds in the limit $E = k_\perp^2 \to 0$.

Self-Consistent Born Approximation (SCBA) for the Modulus-Type Approach
For higher values of the spectral parameter, the Born approximation is insufficient, and we need a nonperturbative approach. Using a hand-waving argument, stating that it is inconsistent to work with two different Green's functions, one may replace $G_0(q, s)$ in the Born approximation for Σ(s) by the full Green's function G(q, s). This turns Eq. 44 into a nonlinear, self-consistent equation for Σ(s). This is the self-consistent Born approximation (SCBA): The SCBA may be obtained more rigorously by field-theoretical techniques [56,57,67,77], in which it appears as a saddle point of an effective field theory.
For our detailed calculations (see footnote 2) we return to the full vector Helmholtz Eq. 30. We take advantage of the fact that the equation of motion for an elastic medium with spatially fluctuating shear modulus μ(r) is of the form

$$\left[\rho_m\omega^2 + \nabla\left(\lambda + 2\mu(\mathbf r)\right)\nabla\cdot{} - \nabla\times\mu(\mathbf r)\nabla\times\right]\mathbf u(\mathbf r,\omega) = 0 \qquad (48)$$

where u is the displacement vector, $\rho_m$ the mass density, and λ the longitudinal Lamé modulus. If one discards the longitudinal term, one arrives at the MT Eq. 30. Therefore one can take over the entire theory [77] derived for classical sound waves without the longitudinal excitations, working in two instead of three dimensions. Within this theory the influence of transverse disorder is accounted for by an effective-medium treatment (self-consistent Born approximation, SCBA), derived as a saddle-point approximation within the nonlinear sigma model field theory [77]. In such a treatment the spatial fluctuations of ϵ̃ are transformed into a dependence on the complex spectral parameter s = E + i0 according to where G(k, s) is the Fourier and Laplace transform of one of the two configurationally averaged Green's functions of Eq. 30 (which are equal to each other), and we now represent the Green's function in the following way: where we have introduced an E-dependent complex wave number The SCBA self-consistent equation for Σ(s) is with the local Green's function From the local Green's function, we obtain the spectral density as For $E \ll q_c^2$, we have which can be solved to give Making a variable change $v = q^2$ and neglecting the imaginary part of $k_\Sigma^2$ in the denominator, we obtain for the imaginary part of the self-energy from which follows We now want to relate Σ″(E) to the mean-free path of the scattered waves. We may Fourier-transform the Green's function (Eq. 37) into ρ space to obtain where $H_0^{(1)}(z)$ is the Hankel function of the first kind [90]. For large ρ the intensity is then given by with the mean-free path given by [26].
This generalizes the Born-approximation result (Eq. 46), which is reobtained for small E and/or small c.
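The step from the Hankel function to an exponentially decaying intensity can be sketched with the standard large-argument asymptotics. The notation $k_\Sigma = k' + ik''$ for the complex wavenumber is the one used with the results below; the proportionality of the ρ-space Green's function to $H_0^{(1)}(k_\Sigma\rho)$, without its prefactor, is assumed here for illustration.

```latex
\begin{align}
H_0^{(1)}(z) &\simeq \sqrt{\frac{2}{\pi z}}\; e^{\,i(z - \pi/4)} \qquad (|z| \gg 1),\\
|G(\rho)|^2 &\propto \left|H_0^{(1)}\!\big((k' + ik'')\rho\big)\right|^2
  \simeq \frac{2}{\pi\,|k_\Sigma|\,\rho}\; e^{-2k''\rho}
  \equiv \frac{2}{\pi\,|k_\Sigma|\,\rho}\; e^{-\rho/\ell},
  \qquad \ell = \frac{1}{2k''} .
\end{align}
```

In this sketch the mean-free path is fixed by the imaginary part of the complex wavenumber alone, which is consistent with the relation $g_0 = k'\ell = k'/2k''$ used for the reference conductance.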

Self-Consistent Born Approximation for the PT Approach
The SCBA for the PT approach, due to John et al. [67], adapted to the transverse-disorder case reads [26] with the Green's function As in the MT case, we have for the mean-free path

Diffusion of the Wave Intensity
As the vector character of the magnetic field enters into our mean-field treatment only by doubling the Green's functions, we return to a scalar description of the field amplitude. The multiple scattering of waves in a turbid medium can be well described in terms of a random walk along the possible paths among the scattering centers [74]. The scattered intensity may be shown to obey a diffusion equation. Our object of interest is therefore the intensity propagator $P(\mathbf q, p, E) = \frac{1}{(2\pi)^2}\int d^2k\,\langle G_{\mathbf k+}\ldots$ with p ≡ −iω + ϵ, ϵ → +0. The second line defines P(q, p, E) as the spatial Fourier transform and Laplace transform (with respect to τ) of the intensity propagator P(ρ, τ, E) in the ρ = (x, y) plane. For deriving the diffusion description, it is assumed that after each scattering event the memory of the phase of the wave function is lost. P(q, p, E) then obeys a diffusion equation with an E-dependent modal "diffusivity" (see footnote 3) D(p, E): As a matter of fact, within the saddle-point approximation (SCBA) one is able to calculate the mean-field diffusion coefficient $D_0(E)$, which corresponds to the diffusion approximation. This diffusivity is the analogue of the electronic diffusivity $D_0 = \sigma_0/\rho_F$, where $\sigma_0$ is the Drude conductivity and $\rho_F$ the density of states at the Fermi level. $D_0$ is obtained by considering the Gaussian fluctuations of the field variable Q(ρ, s) around $Q_{\mathrm{saddle}}(s)$ [26,56,57,67,77] and is given by This diffusivity may be related to the dimensionless reference conductivity $g_0$ by the Einstein relation [26].
We see that $g_0$ and $D_0$ in our model are equal to each other within a factor of order unity. In two dimensions, the conductivity is also equal to the conductance. This quantity is relevant to the scaling approach of Anderson localization explained above. For E → 0 the conductance g diverges due to the Rayleigh law [67].
We recall that ξ(E) also depends on $D_0(E) \propto g_0(E)$ via the two-dimensional scaling formula obtained from Eq. 18.

Results for the Localization Length
We have solved the SCBA equations for both the MT and the PT cases (Eqs. 53 and 63, 64, respectively) for four different values of $k_0 = 2\pi/\lambda$, where λ is the laser's wavelength inside the medium. From the results for the complex wavenumber $k_\Sigma(s)$, we evaluated the reference conductance $g_0 = k'(E)\ell(E) = k'(E)/2k''(E)$, which, in turn, is proportional to the logarithm of the E-dependent localization length.
We observe the following features:
• In the MT case, all four curves fall on top of each other (because $k_0$ enters only into the definition of E but not elsewhere).
• In the PT case, one obtains four different curves.
• Furthermore, in the PT case, the curves enter into the negative E regime, which is unphysical and violates the stability law for bosonic excitations [92,93].
How can one estimate the average localization length from this calculation?
The distribution of the localization length is determined by the function ξ(E) by where $E_\alpha(\xi)$, α = 1, 2, are the two branches of the inverse function E(ξ) of ξ(E) in the MT case. In the PT case, there is only one branch. In the MT case, there is a broad E region where $\xi(E) \approx \xi_{\min}$, indicated by the arrow in Figure 11. Therefore P(ξ) has a delta-like peak near $\xi_{\min}$. On the other hand, in the PT case there is a broad range of values for ξ, which, furthermore, varies strongly with $k_0$. So, the average value of ξ will vary correspondingly with $k_0$, as demonstrated by the numerical calculations by Karbasi et al. [25], shown in Figure 2.
In view of the fact that in our measurement we did not find a dependence on k 0 and that the SCBA of John et al. for the PT approach [67,71] leads to unphysical results, we suggest that one should rather abandon the PT approach and use the MT one.

Transverse-Localized Modes and Wavelength Dependence
We stated in the beginning that we experimentally verified [35] that the Anderson-localized wave functions are single modes, i.e., they are single eigenmodes corresponding to a certain eigenvalue $E_i = (k_\perp^{(i)})^2$ of the corresponding characteristic equation. The intensity $I_i(x, y, z) = |\mathbf B_i|^2$ in the steady-state, large-z regime is localized around certain spots $\mathbf r_i$ in the (x, y) plane (see, e.g., Figure 5) and decays to zero on the length scale given by $\xi(E_i)$. As can be seen from Figure 5, the x, y dependence of the wave intensities is rather irregular. Only on the average do we find an exponential decay of the intensity away from $\mathbf r_i$. Let us now consider, in more detail, cases in which one may obtain a wavelength dependence of the localization length. In our experiment, reported in [26], we had a rather large aperture of ∼50°. As shown by the authors, this covers the whole ξ(E) spectrum shown in Figure 11 and leads to a ξ distribution peaked at $\xi_{\min}$, which is independent of $k_0$. On the other hand, if one worked with a narrow-aperture laser, possibly tilted by a certain angle θ with respect to the optical axis, one could "pick" certain single modes. Because $k_\perp = k_0\sin\theta$ has $k_0$ as a prefactor, the mode picked up by this procedure will be a different one if $k_0$ is changed. This opens an interesting method for further investigation of the localized single modes.

Discussion
In this section we have presented a comprehensive theory of transverse Anderson localization of light. We started by deriving the appropriate stochastic Helmholtz equation for electromagnetic waves with spatially fluctuating permittivity. We have shown that the potential-type approach, which is analogous to the Schrödinger equation for an electron in a random potential with the potential depending on the spectral parameter E, relies on an approximation, which is only applicable to very weak disorder, and, for transverse disorder, leads to a wavelength dependence of the localization length. Such a dependence is not observed. In the newly introduced modulus-type approach, which is exact, such a dependence is not predicted, in agreement with our experiments.
Within the modulus-type approach, the localization length, i.e., the radius of the transmitted modes, diverges as the spectral parameter (which is proportional to the square of the azimuthal angle between the direction of the incident radiation and the optical axis) vanishes. This must be so, because a ray in the direction of the optical axis does not experience transverse disorder. The potential-type approach, however, implies a finite mean-free path at zero spectral parameter, and the predicted spectrum penetrates into the negative range of E, rendering the predicted spectrum unstable.
At the end of this section, we would like to comment on the possibility of observing localization of light in three-dimensional systems. As mentioned in the introduction, despite intensive efforts, this has not been observed until now. We emphasized that the modulus-type theory is analogous to the theory of sound waves in solids with spatially fluctuating shear modulus. There it is known that localized states exist at the upper band edge, which in solids is the Debye frequency. In turbid media the analogue of the upper band edge is the inverse of the correlation length of the disorder fluctuations. So if it were possible to prepare materials with spatial fluctuations of the dielectric modulus that have a correlation length of the order of the light wavelength, we would expect a chance of observing three-dimensional Anderson localization.

NONCLASSICAL ANDERSON LOCALIZATION OF LIGHT
According to the seminal studies by Anderson regarding single-particle evolution in lattices, the disorder in the system leads to localization of the wave function. As we have illustrated in the first sections, such a phenomenon is well explained by quantum mechanics in the case of electrons and by classical electrodynamics in the case of light in the classical limit; i.e., no quantum effects are involved. In particular, localization is the result of constructive and destructive interference among the multiple paths of the particle. Being an explicit example of the wave-like behavior of quantum particles, the observation of AL in single-photon states does not display any substantial difference with respect to the experiments carried out with classical light. However, single photons are one of the most promising candidates for quantum information processing in the context of computation, simulation, and cryptography [94]. In this framework, AL has been extensively investigated in photonic quantum walks [95-98]. The latter are versatile platforms for several tasks [99,100], including the simulation of quantum transport effects such as AL. Furthermore, localized single photons have been used as a resource to realize quantum cryptography protocols [101,102]. The investigation of AL at the single-particle level reveals distinctive features when particle-particle interference is taken into account [103,104]. This occurs when more than one particle evolves in the disordered lattice. In this case, other quantum properties of the system, such as particle indistinguishability and statistics, play a crucial role in the spatial distribution of the multiphoton wave function.
This section on quantum AL is organized as follows. First, we introduce the quantum walk model and present single-photon experiments in the context of AL. We further describe practical applications of localized single-photon states in quantum cryptography protocols. Second, we illustrate two-photon quantum walk experiments and the effect of particle statistics on the localization.

Quantum Walks
The concept of quantum walks (QW) was first formulated as a generalization of classical random walks (RW) [105]. In the discrete-time evolution, the walk is performed by a quantum particle that lives in a Hilbert space of d levels corresponding to the positions in the lattice. In the classical random walk, the walker goes forward or backward according to the result of a coin toss. In the quantum case, the coin toss is a unitary operator that manipulates an additional two-dimensional degree of freedom embedded in the walker. The state of a quantum walker is then described by the eigenstates of the position operator, $w = \{|d\rangle\}$, and by the coin basis $c = \{|\uparrow\rangle, |\downarrow\rangle\}$. The evolution is regulated by two operators: the coin $\hat{C}$, which performs rotations in the coin subspace, and the shift $\hat{S}$, which moves the position of the walker conditionally on the coin state c. The evolution operator in the discrete-time scenario is the combination of the coin and shift actions, namely $\hat{U}^n = (\hat{S}\cdot(\hat{C}\otimes\hat{I}_w))^n$, where n is the number of single time-step evolutions and $\hat{I}_w$ is the identity operator in the walker's position space. It is possible to retrieve the evolution operator through the Hamiltonian H of the system describing a particle evolving in a lattice as $U(t) = e^{-iHt}$. In this scenario, it is straightforward to translate the above description to the continuous-time case. The operator H expresses the interactions among the lattice sites, like an adjacency matrix. The resulting QW evolution U(t) is entirely identified by the H matrix, without the need to define a coin operator as in the discrete-time case. The main feature of the QW with respect to a RW with an unbiased coin is the distribution of the walker for t → ∞. This distribution depends on the initial state of the particle, and the walker tends to spread towards the far ends of the lattice. This is in contrast to the typical diffusive behavior of a RW.
This discrepancy is due to the superposition principle in quantum mechanics, which gives rise to the interference effects typical of waves. The formulation of QWs is very general and suitable for different applications and experimental implementations in the quantum information and quantum computation fields [98,106,107]. In particular, QWs lend themselves to realization in photonic platforms [108]. In the various experiments on photonic QWs, the dynamics of the walker has been encoded in the degrees of freedom of single-photon states, such as the polarization for the coin subspace and, for the walker's position, the optical path in bulk [109,110] and integrated interferometers [111-116], the arrival time at the detector [117], the modes supported by a multimode fiber [118], the angular momentum [119-121], and the transverse momentum [122].
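The contrast between the ballistic spread of a QW and the diffusive spread of a RW can be checked with a few lines of code. The sketch below is only illustrative and rests on assumptions that the text does not fix: the coin is taken to be the Hadamard matrix, the shift is taken in the common form $\hat{S} = \sum_d \left(|\uparrow\rangle\langle\uparrow| \otimes |d+1\rangle\langle d| + |\downarrow\rangle\langle\downarrow| \otimes |d-1\rangle\langle d|\right)$, and the walker starts at the central site in a symmetric coin state. The function names are hypothetical.

```python
import math

def hadamard_walk(steps):
    """Position probabilities of a discrete-time quantum walk after `steps` steps.

    Assumed conventions (not taken from the review): Hadamard coin, shift that
    moves 'up' to d+1 and 'down' to d-1, initial coin state (|up> + i|down>)/sqrt(2).
    """
    size = 2 * steps + 1                 # sites -steps..steps, walker starts at index `steps`
    s = 1 / math.sqrt(2)
    up = [0j] * size
    down = [0j] * size
    up[steps] = s
    down[steps] = 1j * s
    for _ in range(steps):
        new_up = [0j] * size
        new_down = [0j] * size
        for d in range(size):
            # Coin: Hadamard rotation in the coin subspace of site d
            coin_up = s * (up[d] + down[d])
            coin_down = s * (up[d] - down[d])
            # Shift: move by one site conditionally on the coin state
            if d + 1 < size:
                new_up[d + 1] += coin_up
            if d - 1 >= 0:
                new_down[d - 1] += coin_down
        up, down = new_up, new_down
    return [abs(u) ** 2 + abs(v) ** 2 for u, v in zip(up, down)]

def variance(probs):
    """Variance of the position distribution, measured from the central site."""
    center = len(probs) // 2
    mean = sum((i - center) * p for i, p in enumerate(probs))
    return sum((i - center - mean) ** 2 * p for i, p in enumerate(probs))

steps = 50
quantum_var = variance(hadamard_walk(steps))
classical_var = float(steps)   # unbiased +/-1 random walk: variance equals the number of steps
print("quantum walk variance:", quantum_var)
print("random walk variance: ", classical_var)
```

The quantum variance grows quadratically with the number of steps, while the classical one grows linearly, which is the ballistic versus diffusive behavior discussed above.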
The QW evolution operator can be modified for different tasks. For example, the QW paradigm has been exploited to observe topologically protected states [108,123], to simulate systems with nontrivial topology [119,121], and to engineer high-dimensional quantum states [120]. As concerns AL in discrete-time QWs, single-photon localization has been investigated by introducing site-dependent disorder in the QW evolution. Such a condition is achieved by implementing site-dependent coin operators. One example is the coin of Eq. 76, in which randomly extracted phase shifts $\phi_d^{\uparrow(\downarrow)}$ operate locally on the site d, thus breaking the translational symmetry of the system. In [94] the authors present a discrete-time QW encoded in the arrival time and polarization of single-photon states. The apparatus comprises two loops of different lengths. At each step, the photons generated by a single-photon source take the shorter or the longer path according to the polarization state, which represents the coin space. The position of the particle is encoded in time. The coin operators of Eq. 76 were manipulated to reproduce 1) the ballistic spread of the quantum walker, by fixing $\phi_d^{\uparrow(\downarrow)} = 0$; 2) Anderson localization (AL), with randomly extracted phase shifts; and 3) the diffusive regime, which resembles the behavior of a classical random walker. This last condition is the result of a dephasing between the two polarizations in Eq. 76 larger than the coherence time of the single-photon packets, which destroys the interference among the paths. This experiment was one of the first proofs of AL at the single-photon level. Another example in this direction is [95]. Here the discrete-time QW was realized through an integrated optical circuit composed of a network of beam splitters and phase shifters [124,125]. Single-photon localization was observed in the output modes of the optical circuit.
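The effect of site-dependent coin phases can be illustrated numerically. The sketch below is a minimal example under stated assumptions, not a reproduction of the cited experiments: the coin is taken as a Hadamard matrix multiplied by a diagonal matrix of random phases, $\hat{C}_d = \hat{H}\,\mathrm{diag}(e^{i\phi_d^{\uparrow}}, e^{i\phi_d^{\downarrow}})$, which is a hypothetical choice and not necessarily the form of Eq. 76. The phases are static (drawn once per site), and the interval width, step number, and number of realizations are arbitrary.

```python
import cmath
import math
import random

def disordered_walk(steps, phase_width, rng):
    """Position probabilities of a walk with a site-dependent coin.

    Hypothetical coin (an assumption): C_d = H . diag(exp(i*phi_up[d]), exp(i*phi_down[d])),
    with phases drawn uniformly from [-phase_width/2, phase_width/2].
    """
    size = 2 * steps + 1
    half = phase_width / 2
    # Static disorder: one pair of phase factors per site, reused at every step
    f_up = [cmath.exp(1j * rng.uniform(-half, half)) for _ in range(size)]
    f_down = [cmath.exp(1j * rng.uniform(-half, half)) for _ in range(size)]
    s = 1 / math.sqrt(2)
    up = [0j] * size
    down = [0j] * size
    up[steps] = s
    down[steps] = 1j * s
    for _ in range(steps):
        new_up = [0j] * size
        new_down = [0j] * size
        for d in range(size):
            a = f_up[d] * up[d]          # local phase on the 'up' component
            b = f_down[d] * down[d]      # local phase on the 'down' component
            if d + 1 < size:
                new_up[d + 1] += s * (a + b)     # Hadamard mixing, then shift to d+1
            if d - 1 >= 0:
                new_down[d - 1] += s * (a - b)   # Hadamard mixing, then shift to d-1
        up, down = new_up, new_down
    return [abs(u) ** 2 + abs(v) ** 2 for u, v in zip(up, down)]

def mean_square_displacement(probs):
    center = len(probs) // 2
    return sum((i - center) ** 2 * p for i, p in enumerate(probs))

def averaged_msd(steps, phase_width, realizations, seed=0):
    """Mean square displacement averaged over independent disorder realizations."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(realizations):
        total += mean_square_displacement(disordered_walk(steps, phase_width, rng))
    return total / realizations

ordered = averaged_msd(60, 0.0, 1)                 # all phases zero: ballistic spread
disordered = averaged_msd(60, 2 * math.pi, 20)     # fully random phases
print("ordered MSD:   ", ordered)
print("disordered MSD:", disordered)
```

With all phases set to zero the walker spreads ballistically, whereas with phases drawn over the full circle the averaged mean square displacement stays far smaller, which is the signature of localization.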
There are further examples of single-photon localization regarding continuous-time QWs. They are typically realized by exploiting continuous coupling among waveguides arranged in a lattice in photonic chips. In this scenario, the time coordinate is again replaced by the distance z covered during the propagation in the waveguides. The single-photon wave function obeys the Schrödinger equation [126], $i\,\partial\psi_d/\partial z = \sum_j c_{dj}\,\psi_j$, where $\psi_d$ is the single-photon amplitude at the site d and the coefficients $c_{ij}$ are the couplings among the modes of the lattice that are expressed in the Hamiltonian H. The length of the device and the coupling coefficients can be engineered to observe AL, as shown in the single-photon experiments in [96,97]. Concerning all the quantum experiments mentioned, it is worth noting that the localized single-photon distribution has the same properties as the distribution of localized modes of classical light described in Sections 2 and 3. The interest in quantum localization is not restricted to the pure observation of localized states. In the following section, we illustrate an application of localized single-photon states in quantum cryptography.
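A continuous-time QW in a waveguide array can be sketched by integrating the coupled-mode equation along z. The example below assumes the form $i\,\partial\psi_d/\partial z = \sum_j c_{dj}\psi_j$ with nearest-neighbor couplings only and no on-site terms; the lattice size, propagation length, coupling range, and function names are illustrative choices rather than parameters of the cited experiments. For a uniform lattice with coupling c the mean square displacement is known analytically, $2c^2z^2$, which gives a simple check of the integrator.

```python
import random

def propagate(couplings, z_max, dz=0.02):
    """Integrate i dpsi_d/dz = sum_j c_dj psi_j with a fourth-order Runge-Kutta scheme.

    couplings[d] links waveguides d and d+1 (nearest neighbors only, an assumption).
    Light is injected into the central waveguide; returns the output intensities.
    """
    n = len(couplings) + 1
    psi = [0j] * n
    psi[n // 2] = 1 + 0j

    def rhs(v):
        # dpsi/dz = -i * H psi for a tridiagonal H with zero diagonal
        out = [0j] * n
        for d in range(n):
            acc = 0j
            if d > 0:
                acc += couplings[d - 1] * v[d - 1]
            if d < n - 1:
                acc += couplings[d] * v[d + 1]
            out[d] = -1j * acc
        return out

    for _ in range(round(z_max / dz)):
        k1 = rhs(psi)
        k2 = rhs([p + 0.5 * dz * k for p, k in zip(psi, k1)])
        k3 = rhs([p + 0.5 * dz * k for p, k in zip(psi, k2)])
        k4 = rhs([p + dz * k for p, k in zip(psi, k3)])
        psi = [p + dz / 6 * (a + 2 * b + 2 * c + e)
               for p, a, b, c, e in zip(psi, k1, k2, k3, k4)]
    return [abs(p) ** 2 for p in psi]

def mean_square_displacement(probs):
    center = len(probs) // 2
    return sum((i - center) ** 2 * p for i, p in enumerate(probs))

n_links = 60
z_max = 10.0

# Ordered lattice: identical couplings, ballistic spreading
ordered_probs = propagate([1.0] * n_links, z_max)
ordered_msd = mean_square_displacement(ordered_probs)

# Disordered lattice: random couplings, averaged over a few realizations
rng = random.Random(0)
realizations = 5
disordered_msd = 0.0
for _ in range(realizations):
    random_couplings = [rng.uniform(0.2, 1.8) for _ in range(n_links)]
    disordered_msd += mean_square_displacement(propagate(random_couplings, z_max)) / realizations

print("ordered MSD:   ", ordered_msd, "(analytic 2*z^2 =", 2 * z_max ** 2, ")")
print("disordered MSD:", disordered_msd)
```

How strongly the disordered spread is suppressed depends on the disorder strength and on the propagation length, so those two parameters are the ones to vary when exploring the crossover to localization.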

Quantum Cryptography Through Localized Single-Photon States
Quantum computing could undermine the security of some current cryptographic protocols. An example is the RSA protocol, whose security is based on the difficulty for a classical computer of finding the prime factors of large integers, while a quantum computer solves the same problem in polynomial time [127]. This motivates the search for more secure cryptographic procedures. Quantum cryptography is the field of quantum information that aims to formulate secure protocols based on the rules of quantum mechanics. In the quantum protocol BB84, two agents, Alice and Bob, exchange a stream of qubits, i.e., quantum states that live in a two-dimensional Hilbert space. Alice randomly chooses to prepare each state in one of two possible bases, $\{|\uparrow\rangle, |\downarrow\rangle\}$ and $\{|+\rangle, |-\rangle\}$, where $|\pm\rangle = \frac{1}{\sqrt{2}}(|\uparrow\rangle \pm |\downarrow\rangle)$. Bob receives the signal and decides randomly in which basis he measures each qubit. He extracts a stream of bits corresponding to 0 when he measures ↑ (+) and to 1 when he measures ↓ (−). Alice's and Bob's bits are therefore not guaranteed to coincide when Bob measures in a basis different from Alice's choice. The two agents compare part of their bit strings and, according to the resulting bit error rate, they can detect a possible eavesdropper attack and extract a secure key [128]. Variants of this protocol exploit entangled states or high-dimensional states instead of qubits. The latter, the so-called qudits, are a generalization of qubits and describe a particle living in a d-dimensional space. Qudits provide advantages in the amount of information stored in the state sent to the receiver and in security [129]. In [100] the authors implement a BB84-inspired protocol using localized states generated by a disordered optical fiber. The experimental setup is similar to the one shown in Figure 12. Alice modulates the single photons obtained from an attenuated laser with a spatial light modulator.
In this way, she can choose to send states that localize, after propagation in the fiber, in either momentum or position at the fiber's output tip. Bob chooses the measurement basis by placing or removing a lens before the single-photon detector. This implementation of the BB84 protocol exploits the quantum duality between the real space and the Fourier space of the lens: a state localized in the first space is indeterminate in the other one and vice versa. The authors prove the feasibility of the protocol using localized single-photon wave functions under experimental conditions. A recent work [101] exploits a similar setup to perform a slightly different cryptographic protocol. In this experiment, the information about the basis chosen by Alice is not shared publicly after the communication between the agents. Alice encodes her message and the basis in two different photons that are sent at different random times. At the end of the protocol, Alice and Bob compare the measurements of some random pairs of photons, and then they are able to extract a secure key. This protocol offers advantages in terms of sensitivity to noise and resilience to a "photon number splitting" eavesdropper attack.
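The logic of basis sifting and eavesdropper detection in BB84 can be sketched with an idealized simulation. The model below is a deliberately simplified assumption: the two bases (labeled 0 and 1, standing for instance for the position and momentum bases of the fiber implementation) are perfectly conjugate, there is no loss or noise, and the eavesdropper performs a plain intercept-resend attack. The function names are hypothetical.

```python
import random

def measure(bit, prepared_basis, measured_basis, rng):
    """Ideal measurement: matching bases return the prepared bit, otherwise a fair coin."""
    if prepared_basis == measured_basis:
        return bit
    return rng.randint(0, 1)

def bb84_error_rate(n_photons, eavesdropper, rng):
    """Bit error rate of the sifted key for an idealized BB84 exchange."""
    kept = 0
    errors = 0
    for _ in range(n_photons):
        alice_bit = rng.randint(0, 1)
        alice_basis = rng.randint(0, 1)
        bit, basis = alice_bit, alice_basis
        if eavesdropper:
            # Intercept-resend attack: Eve measures in a random basis and resends her result
            eve_basis = rng.randint(0, 1)
            bit = measure(bit, basis, eve_basis, rng)
            basis = eve_basis
        bob_basis = rng.randint(0, 1)
        bob_bit = measure(bit, basis, bob_basis, rng)
        if bob_basis == alice_basis:      # sifting: keep only rounds with matching bases
            kept += 1
            errors += bob_bit != alice_bit
    return errors / kept

rng = random.Random(0)
print("error rate without eavesdropper:", bb84_error_rate(20000, False, rng))
print("error rate with eavesdropper:   ", bb84_error_rate(20000, True, rng))
```

In this idealized model the sifted key is error-free without an eavesdropper, while the intercept-resend attack introduces errors in about one quarter of the sifted bits, which is what Alice and Bob detect when they compare part of their strings.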

Multiphoton Localization
Single-photon localized states do not add any further insight into AL with respect to experiments based on wave interference. Nevertheless, the proper description of quantum light is within the framework of second quantization. This representation is necessary for describing many-particle evolution. The electromagnetic field can be expressed through the bosonic annihilation and creation operators $\hat{a}$ and $\hat{a}^\dagger$, i.e., the operators that destroy or create exactly one photon in a given mode [130]. This description reflects the particle statistics and, consequently, explains the quantum interference effects due to the indistinguishability of the particles. This change in the description consists basically in expressing the same state in terms of occupation numbers of the field modes. The system is then identified by the evolution operator acting on the creation and annihilation operators. In the case of QWs, which, as we have seen in the previous section, correspond to a linear transformation among the modes of a given degree of freedom, the creation operator representing one photon in the mode i will be $\hat{b}_i^\dagger = \sum_j U_{ij}\,\hat{a}_j^\dagger$ (Eq. 78), where $U_{ij}$ are the elements of the QW evolution operator in the occupation number representation. One of the most famous examples of two-photon interference, the Hong-Ou-Mandel (HOM) experiment [131], is explained by the latter formulation. Here two indistinguishable photons entering a beam splitter from different ports always come out together in the same output port. This phenomenon is a first example of the role of particle indistinguishability in the evolution of multiphoton states. Two-photon interference has been investigated in the regime of AL. The main result that emerges from these studies is that the way in which the system approaches localization strongly depends on its initial state.
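The HOM effect follows directly from the mode transformation of Eq. 78. As a minimal worked example, assume a balanced beam splitter described by the real matrix $U = \frac{1}{\sqrt{2}}\begin{pmatrix}1 & 1\\ 1 & -1\end{pmatrix}$ (one common convention, chosen here for illustration) and one photon in each input. The amplitude for one photon in each output is $U_{00}U_{11} + U_{01}U_{10}$, and the probability of finding both photons in output x is $2|U_{0x}U_{1x}|^2$. The function name is hypothetical.

```python
import math

def two_photon_statistics(u):
    """Output probabilities for one photon in each input of a two-mode unitary u.

    Uses b_i^dagger = sum_j u[i][j] a_j^dagger: u[i][j] is the amplitude from
    input i to output j.
    """
    # One photon per output: the two indistinguishable paths add coherently
    coincidence = abs(u[0][0] * u[1][1] + u[0][1] * u[1][0]) ** 2
    # Both photons in the same output; the factor 2 comes from (a^dagger)^2|0> = sqrt(2)|2>
    both_in_0 = 2 * abs(u[0][0] * u[1][0]) ** 2
    both_in_1 = 2 * abs(u[0][1] * u[1][1]) ** 2
    # Distinguishable photons: probabilities, not amplitudes, are summed
    classical_coincidence = abs(u[0][0] * u[1][1]) ** 2 + abs(u[0][1] * u[1][0]) ** 2
    return coincidence, both_in_0, both_in_1, classical_coincidence

s = 1 / math.sqrt(2)
beam_splitter = [[s, s], [s, -s]]   # balanced beam splitter (assumed convention)
coincidence, both_in_0, both_in_1, classical_coincidence = two_photon_statistics(beam_splitter)
print("coincidence probability (indistinguishable):", coincidence)
print("both photons in output 0:", both_in_0)
print("both photons in output 1:", both_in_1)
print("coincidence probability (distinguishable):  ", classical_coincidence)
```

The coincidence term vanishes for indistinguishable photons because the two contributions have opposite sign, while distinguishable photons would give a coincidence probability of one half.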
In Figure 13 we report numerical simulations illustrating the two-photon state localization investigated in the theoretical [102,103,132] and experimental works [95-97] carried out on this topic. The first row (Figures 13A-D) reports the two-photon distribution $G(x_1, x_2)$, defined as the probability to detect one photon in the position $x_1$ and the other in $x_2$, averaged over different disorder configurations. For example, in the case of two identical photons injected in the QW in positions 0 and 1 in the state $|\psi_{in}\rangle = \hat{a}_0^\dagger\hat{a}_1^\dagger|0\rangle$, the function $G(x_1, x_2)$ has the following expression: $G(x_1, x_2) = \langle\langle\, |\langle 0|\hat{a}_{x_1}\hat{a}_{x_2}|\psi\rangle|^2 \,\rangle\rangle = \langle\langle\, |U_{0x_1}U_{1x_2} + U_{0x_2}U_{1x_1}|^2 \,\rangle\rangle$, where $\langle\langle\cdot\rangle\rangle$ is the average over the disorder, the bra $\langle 0|\hat{a}_{x_1}\hat{a}_{x_2}$ is the projection on the state with the photons in the positions $x_1$ and $x_2$, respectively, and $|\psi\rangle$ is the output state of the QW. The last term is the result of the application of Eq. 78 to the creation operators in the initial state $|\psi_{in}\rangle$ and of the bosonic commutation rule $[\hat{a}_i, \hat{a}_j^\dagger] = \delta_{ij}$. The distributions reported in the figure illustrate the state of the system after a discrete-time QW of 30 time steps. In the simulations we have extracted the phases $\phi_d^{\uparrow(\downarrow)}$ of Eq. 76 uniformly in an interval of length π/2 around 0. In such a condition, the single-photon wave function still preserves the ballistic spread typical of the QW, while it is starting to localize. In the second row (Figures 13E-H), we show the function $g(\Delta) = \sum_{x_1 - x_2 = \Delta} G(x_1, x_2)$ in the region of localization. All the quantities are normalized to the maximum and averaged over 1,000 configurations of disorder. Figures 13A,B,E,F compare the two functions G and g for the evolution of the initial states $\hat{a}_0^\dagger\hat{a}_1^\dagger|0\rangle$ and $\frac{1}{\sqrt{2}}(\hat{a}_{0,H}^\dagger\hat{a}_{1,V}^\dagger - \hat{a}_{1,H}^\dagger\hat{a}_{0,V}^\dagger)|0\rangle$. These states correspond to two photons created in the sites 0 and 1 with different symmetries with respect to the exchange of the two particles. The first one reproduces the evolution of two noninteracting bosons.
Here, we observe the typical tendency of bosons to occupy the same state, already mentioned in the description of the HOM experiment, i.e., the two photons are found in the same position with high probability. In contrast, the second state is antisymmetric under the exchange of the two particles and presents the opposite behavior: the probability to find the photons in the same site is zero. These states simulate de facto fermion statistics and the Pauli exclusion principle. To reproduce an antisymmetric state, it is necessary to exploit an additional degree of freedom, in this case the polarization. The boson and fermion statistics in AL were observed experimentally for the first time in [95] by exploiting polarization-entangled states and an integrated photonic chip. Here a source based on parametric down-conversion in a nonlinear crystal generates a pair of photons entangled in polarization, such as the state investigated in Figures 13B,F. The state evolves in a discrete-time QW platform, realized in an integrated device that comprises a network of beam splitters and phase shifters (see Section 4.1). The coin operators in the form of Eq. 76 are sampled properly to observe AL. The second type of states investigated in the literature is illustrated in Figures 13C,D,G,H. These states are entangled in the occupation number of the sites 0 and 1. The output distribution depends on the sign in the superposition of the contributions $(\hat{a}_0^\dagger)^2$ and $(\hat{a}_1^\dagger)^2$, which create two photons in the respective modes. Such entangled states in the context of AL were investigated for the first time in [96] and then in [97]. The pair of entangled photons is generated via parametric down-conversion. These photons are strongly correlated in momentum space. Such correlations are transferred to the positions of the QW lattice by means of a lens system. In this way the photons are coupled into the waveguides of the integrated device implementing the QW.
The two experiments with such entangled states have been performed by exploiting continuous-time QWs with random couplings among the waveguides arranged in a lattice (see Section 4.1). In particular, in the most recent experiment [97], the authors report the results averaged over different configurations of random couplings, thus representing one of the most exhaustive experiments on two-photon Anderson localization.
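The role of exchange symmetry in the two-photon distribution can be illustrated for a single realization of a linear network. The sketch below makes several assumptions: the network is a hypothetical mesh of random nearest-neighbor beam splitters standing in for a disordered QW, the photons are injected in modes 0 and 1, the network acts identically on both polarizations, and no disorder average is taken (averaging would simply repeat the calculation over many meshes). The symmetric input gives $|U_{0x_1}U_{1x_2} + U_{0x_2}U_{1x_1}|^2$ and the antisymmetric one gives the same expression with a minus sign.

```python
import cmath
import math
import random

def random_mesh_unitary(n_modes, layers, rng):
    """Unitary built from random nearest-neighbor beam splitters (hypothetical network).

    u[i][j] is the amplitude from input mode i to output mode j,
    as in b_i^dagger = sum_j U_ij a_j^dagger.
    """
    u = [[1 + 0j if i == j else 0j for j in range(n_modes)] for i in range(n_modes)]
    for layer in range(layers):
        for m in range(layer % 2, n_modes - 1, 2):
            theta = rng.uniform(0, math.pi / 2)
            phase = cmath.exp(1j * rng.uniform(0, 2 * math.pi))
            c, s = math.cos(theta), math.sin(theta)
            for row in u:
                # Mix output modes m and m+1 with a 2x2 unitary
                a, b = row[m], row[m + 1]
                row[m] = phase * (c * a - s * b)
                row[m + 1] = s * a + c * b
    return u

def correlation(u, sign):
    """G(x1, x2) for photons injected in modes 0 and 1.

    sign=+1: symmetric (bosonic) input; sign=-1: antisymmetric (fermion-like) input.
    """
    n = len(u)
    return [[abs(u[0][x1] * u[1][x2] + sign * u[0][x2] * u[1][x1]) ** 2
             for x2 in range(n)] for x1 in range(n)]

def distance_profile(g):
    """g(delta): sum of G(x1, x2) over all pairs with x1 - x2 = delta."""
    profile = {}
    for x1, row in enumerate(g):
        for x2, value in enumerate(row):
            profile[x1 - x2] = profile.get(x1 - x2, 0.0) + value
    return profile

rng = random.Random(1)
u = random_mesh_unitary(8, 6, rng)
g_boson = correlation(u, +1)
g_fermion = correlation(u, -1)
print("bosonic g(0):     ", distance_profile(g_boson)[0])
print("fermion-like g(0):", distance_profile(g_fermion)[0])
```

The diagonal of the antisymmetric correlation vanishes identically, so g(0) is zero for the fermion-like state, whereas the symmetric state can place both photons on the same site.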

Discussion
In this section we have illustrated Anderson localization (AL) in the context of quantum light, presenting the most relevant results concerning experimental realizations and applications. We have first formulated AL in the context of quantum walks (QW). We have then described the use of localized states in quantum cryptography. Finally, we have illustrated the problem of localization in quantum optics by considering multiphoton states. Up to now the investigation of multiphoton AL has been confined to the two-photon case. The reasons are various. It is still debated in the literature whether the results reported in the quantum experiments can be reproduced by classical light, i.e., by wave interference. For instance, in [133] it was shown that some features of the distribution reported in Figure 13 could be observed with a laser propagating in a circuit engineered with an appropriate disorder. There are other concerns regarding the intrinsic difficulty of simulating the evolution of noninteracting bosons, such as photons scattered by a random network [132]. This prevents finding an analytical solution for systems with a large number of photons. All these considerations explain why the present investigations of quantum AL have basically been carried out from a phenomenological perspective. This motivates further studies to provide a more rigorous framework for quantum AL.

APPLICATIONS AND PERSPECTIVES
The story of understanding transverse localization of light over the last four decades has been one of constant advances. With respect to the first formulations (which reported just numerical evidence [13,14]), it is now possible to observe and tailor localization on at least four different platforms: photorefractive crystals, plastic binary fibers, disordered glassy fibers, and laser-written glass waveguides. Each of these platforms has its specific features and advantages in terms of applications.
1) Photorefractive crystals [134], proposed in 2007, enable relatively rapid reshaping of disorder together with a nonlinear response; thus a new generation of switches or routers based on disorder guiding can be envisioned. The drawback of this approach is the small refractive index mismatch and the large disorder grain size. To obtain micron-sized localized states, further effort would be needed in improving the nonlinearity engineering.
2) Polymeric binary fibers were proposed in 2012 and subsequently improved. The fabrication technique for these items is extremely cheap and straightforward (if one has access to a fiber drawing tower) and enables realizing kilometer-long fibers starting from a few-centimeter-sized preform. In binary fibers the refractive index mismatch is 0.1 (employing PMMA and polystyrene as plastic components of the preform), and the disorder can be obtained easily with a grain size of the order of a micron. The advantage of this approach is that the micrometer-sized defects need not be individually fabricated: it is the transverse thinning, collectively affecting all individual strands in the preform, that produces this fine-scale disorder. Thus binary fibers support strong localization in the visible range. Binary fibers have thus been extremely successful in terms of potential applications. It has been demonstrated that they can support nonlinearity and switching, image transport, wavefront shaping, controlled focusing, and quantum communication and key distribution. The drawback of binary fibers resides in the large losses: currently between 50 and 100 dB/m. These losses exceed those expected from the intrinsic scattering and absorption of the plastic material: this poor performance is probably due to the assembly and drawing stage (carried out in an unclean environment), which introduces microscopic dust in the preform. Due to these losses, all the experiments have been carried out on short pieces of binary fibers (a few tens of centimeters).
3) Glass-based binary fibers have been fabricated since 2014 from a porous glass. A rod with an initial diameter of 8 mm produces air holes with diameters varying between 0.2 and 5 µm. This approach promises all the advantages of glass (lower losses and enhanced stability) together with easy fabrication. This potential has already been demonstrated in recent results including new applications such as localization-based random lasing. The only drawback of this approach is related to the nonhomogeneity of the disorder. Indeed, air holes tend to be located at the outer boundary of the fibers due to the fabrication process, thus eventually turning localized states into leaky modes.
4) Employing fiber drawing (in either the glassy or the plastic version), it is impossible to get direct control over the position of the defects. This drawback was circumvented in 2020 by employing a direct laser writing approach. Direct laser writing is still prone to high losses due to inefficient coupling and small refractive index contrast. Nevertheless, by individually tuning the paraxial defect positions, it is possible to test how extensive localization properties depend on specific configurations. Thus direct laser-written localization can be employed as a test bench to find out how different localization properties are affected by varying the disorder configurations. Then, if a performance enhancement is found, the optimal configurations can be translated to the more efficient fabrication approaches.
If the technological progress on these platforms continues at the same rate as in recent years, we can envision that one (or perhaps more than one) of these platforms will find its way to application and industrialization in the next few years.

AUTHOR CONTRIBUTIONS
ML reviewed the experiments carried out with coherent light. WS and GR wrote the theoretical background. TG reviewed the experiments with quantum light. All the authors discussed the structure of the work and contributed to the writing of the paper.