Floating-Point Calculations on a Quantum Annealer: Division and Matrix Inversion

Systems of linear equations are employed almost universally across a wide range of disciplines, from physics and engineering to biology, chemistry and statistics. Traditional solution methods such as Gaussian elimination become very time consuming for large matrices, and more efficient computational methods are desired. In the twilight of Moore's Law, quantum computing is perhaps the most direct path out of the darkness. There are two complementary paradigms for quantum computing, namely, gated systems and quantum annealers. In this paper, we express floating point operations such as division and matrix inversion in terms of a quadratic unconstrained binary optimization (QUBO) problem, a formulation that is ideal for a quantum annealer. We first address floating point division, and then move on to matrix inversion. We provide a general algorithm for any number of dimensions, but we provide results from the D-Wave quantum anneler for $2\times 2$ and $3 \times 3$ general matrices. Our algorithm scales to very large numbers of linear equations. We should also mention that our algorithm provides the full solution the the matrix problem, while HHL provides only an expectation value.


I. INTRODUCTION
Systems of linear equations are employed almost universally across a wide range of disciplines, from physics and engineering to biology, chemistry and statistics.An interesting physics application is computational fluid dynamics (CFD), which requires inverting very large matrices to advance the state of the hydrodynamic system from one time step to the next.An application of importance in biology and chemistry would include the protein folding problem.For large matrices, Gaussian elimination and other standard techniques becomes too time consuming, and therefore faster computational methods are desired.As Moore's Law draws to a close, quantum computing offers the most direct path forward, and perhaps the most radical path.In a nutshell, quantum computers are physical systems that exploit the laws of quantum mechanics to perform arithmetic and logical operations exponentially faster than a conventional computer.In the words of Harrow, Hassidim, and Lloyd (HHL) [1], "quantum computers are devices that harness quantum mechanics to perform computations in ways that classical computers cannot.''There are currently two complementary paradigms for quantum computing, namely, gated systems and quantum annealers.Gated systems exploit the deeper properties of quantum mechanics such coherence, entanglement and non-locality, while quantum annealers take advantage of tunneling between metastable states and the ground state.In Ref. [1], HHL introduces a gated method by which the inverse of a matrix can be computed, and Refs.[2][3][4] provide implementations of the algorithm to invert 2 × 2 matrices.Gated methods are limited by the relatively small number of qubits that can be entangled into a fully coherent quantum state, currently of order 32 or so.An alternative approach to quantum computing is the quantum anneler [5], which takes advantage of quantum tunneling between metastable states and the ground state.The D-Wave Quantum Annealers have reached capacities of 2000+ qubits, which suggests that quantum annealers could be quite effective for linear algebra with hundreds to thousands of degrees of freedom.In this paper, we express floating point operations such as division and matrix inversion as quadratic unconstrained binary optimization (QUBO) problems, which are ideal for a quantum annealer.We should mention that our algorithm provides the full solution the the matrix problem, while HHL provides only an expectation value.Furthermore, our algorithm places no contraints on the matrix that we are inverting, such as a sparcity condition.
The first step in mapping a general problem to a QUBO problem begins with constructing a Hamiltonian that encodes the logical problem in terms of a set of qubits.Next, it will be necessary to "embed" the problem on the chip, first by mapping each logical qubit to a collection or "chain" of physical qubits, and then by determining parameter settings for all the physical qubits, including the chain couplings.We have implemented our algorithms on the D-Wave 2000Q and 2X chips, illustrating that division and matrix inversion can indeed be performed on an existing quantum annealer.The algorithms that we propose should scale quite well for large numbers of equations, and should be applicable to matrix inversion of relatively high order (although probably not exponentially higher order as in HHL).
Before examining the various algorithms, it is useful to review the basic formalism and to establish some notation.The general problem starts with a graph G = (V, E), where V is the vertex set and E is the edge set.The QUBO Hamiltonian on G is defined by with Q r ∈ {0, 1} for all r ∈ V.The coefficient A r is called the weight at vertex r, while the coefficient B rs is called the strength between vertices r and s.It might be better to call (1.1) the objective function rather than the Hamiltonian, as H G is a real-valued function and not an operator on a Hilbert-space.However, it is easy to map (1.1) in an equivalent Hilbert space form, where Qr |Q = Q r |Q for all r ∈ V, and |Q ∈ H for Hilbert space H.The hat denotes an operator on the Hilbert space, and Q r is the corresponding Eigenvalue of Qr with Eigenstate |Q .Consequently, we can write and we use the terms Hamiltonian and objective function interchangeably.By the QUBO problem, we mean the problem of finding the lowest energy state |Q of the Hamiltonian (1.2), which corresponds to minimizing Eq. (1.1) with respect to the Q r .This is an NP-hard problem uniquely suited to a quantum annealer.Rather than sampling all 2 #V possible states, quantum tunneling finds the most likely path to the ground state by minimizing the Euclidian action.In the case of the D-Wave 2X chip, the number of distinct quantum states is of order the insanely large number 2 1000 , and the ground state is selected from this jungle of quantum states by tunneling to those states with a smaller Euclidean action.
The Ising model [6] is perhaps the quintessential physical example of a QUBO problem, and indeed, is one of the most studied systems in statistical physics.The Ising model consists of a square lattice of spin-1/2 particles with nearest neighbor spin-spin interactions between sites r and s, and when the system is immersed in a nonuniform magnetic field, FIG.1: The left panel shows the fully connected graph K 8 and the right panel shows the corresponding graph K 4 .To perform a calculation to 8-bit accuracy requires the connectivity of K 8 .We take the vertex and edge sets for be K 8 to To perform a calculation to 4-bit accuracy requires K 4 connectivity, and similarly, the vertex and edge sets for K 4 are V 4 = {0, 1, 2, 3} and this introduces coupling terms at individual sites r, thereby producing a Hamiltonian of the form where J r = ±1/2.The Ising problem is connected to the QUBO problem by J r = Q r − 1/2.
For floating point division to R bits of resolution, the graph G is in fact just the fully connected graph K R .In terms of vertex and edge sets, we write K R = (V R , E R ), and Fig. 1 illustrates K 8 and K 4 .The left panel shows the completely connected graph K 8 , with vertex and edge sets ) while the right panel shows the K 4 graph, Just as 8-bits is called a word, 4-bits is called a nibble.As we will also see, the dynamic range of the D-Wave is most directly suitable to K 4 , and consequently the connectivity of K 4 gives a quantum nibble.
Let us remark about our summation conventions.Rather than summing over an edge set, we find it convenient to sum over all values of r and s taking B rs to be symmetric.In this case, the double sum differs by a factor of two relative to summing over the edge set of the graph, Furthermore, for r = s, there will be a linear contribution from the idempotency condition (1.12) We can write this as Brs Q r Q s . (1.13)

A. Division as a QUBO Problem
In this section we present an algorithm for performing floating point division on a quantum annealer.Given two input parameters m and y to R-bits of resolution, the algorithm calculates the ratio y/m to R bits of resolution.The corresponding division problem can be represented by the linear equation which has the unique solution Solving (2.1) on a quantum annealer amounts to finding an objective function H(x) whose minimum corresponds to the solution that we are seeking, namely (2.2).Although the form of H(x) is not unique, for this work we employ the simple real-valued quadratic function where m and y are continuous parmeters.For an ideal annealer, we do not have to concern ourselves with the numerical range and resolution of the parameters m and y; however, for a real machine such as the D-Wave, this is an important consideration.For a well-conditioned matrix, we require that the parameters m and y possess a numerical range that spans about an order of magnitude, from approximately 0.1 to 1.0.This provides about 3 to 4 bits of resolution: 1/2 0 = 1, 1/2 1 = 0.5, 1/2 2 = 0.25, and 1/2 3 = 0.125.The dynamic range and the connectivity both impact the resolution of a calculation.To proceed, let us formulate floating point division as a quadratic unconstrained binary optimization (QUBO) problem.The algorithm starts by converting the real-valued number x in (2.3) into an R-bit binary format, while the numbers m and y remain real valued parameters of the objective function.For any number χ ∈ [0, 2), the binary representation accurate to R bits of resolution can be expressed by is value of the r-th bit, and the square bracket indicates the binary representation. 1 It is more algebraically useful to express this in terms of the power series in 2 −r , (2.4) 1 Since the infinite geometric series ∞ r=0 2 −r sums to 2, the finite series is less than 2. In binary form we have [1.11 Working to resolution R is like calculating the R-th partial sum of an infinite series.
In order to represent negative number, we perform the binary offset where x ∈ [−1, 3).The objective function now takes the form The constant term (m + y) 2 can be dropped when finding the minimum of (2.6), but we choose to keep it for completeness.Equation (2.4) provides a change of variables (where Q is the collection of the Q r ), and this allows us to express (2.3) in the form In the notation of graph theory, we would write where , and E R is the edge set.We often employ an abuse of notation and write rs ∈ E R to mean {r, s} ∈ E R .We should also use the notation B {r,s} , but instead we write B rs .Since the order of the various elements of a set are immaterial, we require B rs to be symmetric in r and s.Rather than summing over the edge sets rs ∈ E R , we employ the double sum r =s , which introduces a relative factor of two in the convention for the strengths B rs .The goal of this section is to find A r and B rs in terms of m and y.
Note that we can generalize the simple binary offset (2.4) if we scale and shift χ ∈ [0, 2) by When d > 0 and c > d/2, the domain of x always contains a positive and negative region, and the precise values for d and c can be chosen based on the specifics of the problem.For Eq. (2.9), the objective function takes the form (2.10) For simplicity of notation, this paper employs the simple binar offset (2.5), although our Python interface to the D-Wave quantum annealer employs the generalized form (2.10).
Equation (2.4) allows us to express the quadratic term in χ as where we have used the idempotency condition Q 2 r = Q r along the diagonal in the last term of (2.11).Substituting the forms (2.4) and (2.11) into (2.6)provides the Hamiltonian and the Ising coefficients in (2.7) can be read off: Because of the double sum over r and s in the objective function in (2.12), the algorithm requires a graph of connectivity K R .The special cases of K 8 and K 4 have been illustrated in Fig. 1.To obtain higher accuracy than the K R graph allows, we can iterate this procedure in the following manner.Suppose we start with y 0 = y, and we are given a value y n−1 with n > 1, then we advance the iteration to y n in the following manner, solve mx n = y n−1 for x n to R bits (2.15) calculate the error y n = y n−1 − mx n . (2.16) Now that we have the value y n , we can repeat the process to find y n+1 , and we can stop the iterative procedure when the desired level of accuracy has been achieved.
The D-Wave Chimera chip consists of coupled bilayers of micro rf-SQUIDs overlaid in such a way that, while relatively easy to fabricate, results in a fairly limited set of physical connections between the qubits.However, by chaining together well chosen qubits in a positively correlated manner, this limitation can largely be overcome.The process of chaining requires that we (i) embed the logical graph onto the physical graph of the chip (for example ) and that we (ii) assign weights and strengths to the physical graph embedding in such as a way as to preserve the ground state of the logical system.These steps are called graph embedding and Hamiltonian embedding, respectively. (2.17) The set B 8 represents the connections between a given red qubit and the corresponding blue qubits in the Figures.The red and blue dots illustrate the bipartate nature of C 8 , as every red dot is connected to every blue dot, while none of the blue and red dots are connected to one another.
We will denote the physical qubits on the D-Wave chip by q .For the D-Wave 2000Q there is a maximum of 2048 qubits, while the D-Wave 2X has 1152 qubits.For the example calculation in this text, we only use 10 to 50 qubits.The physical Hamiltonian or objective function takes the form where we have introduced a factor of 2 in the strength to account for the symmetric summation over r and s.We will call the qubits Q r of the previous section the logical qubits.
To write a program for the D-Wave means finding an embedding of the logical problem onto the physical collection of qubits q .If the connectivity of the Chimera graphs were large enough, then the logical qubits would coincide exactly with the physical qubits.However, since the graph C 8 possesses less connectivity than K 4 , we must resort to chaining on the D-Wave, even for 4-bit resolution.Figure 4 illustrates the K 4 embedding used by our algorithm, where, as before, the left panel illustrates the bipartate graph in column format, and the right panel illustrates the corresponding graph in cross format.
In Fig. 4 we have labeled the physical qubits by = 1, 2, 3 • • • 8, and we wish to map the logical problem involving Q r Q s Q t onto the four physical qubits q 5 q 1 q 6 q 2 .The embedding requires that we chain together the two qubits 1-6 and 3-8, respectively.We may omit qubits 4 and 7 entirely.As illustrated in Fig. 5, the physical qubits q 1 and q 6 are chained together to simulate a single logical qubit Q t , while qubits q 5 and q 2 are mapped directly The blue lines represent normal connections between qubits, while the red double-lines represent chained qubits, that is to say, qubits that are strictly correlated (and can thereby represent a single logical qubit at a higher level of abstraction).The qubits 1-6 are chained together, as are the qubits 3-8.

FIG. 5:
The left panel shows three logical qubits Q r , Q s , Q t with connectivity between r-t and t-s.The box surrounding qubit t means that it will be modeled by a linear chain of physical qubits, as illustrated in the right panel.The labeling is taken from Fig. 4 for qubits 5-1-6-2, where Q r is mapped to q 5 , Q s is mapped to q 2 , and Q t is split between q 1 nd q 6 .Qubits q 1 and q 6 are chained together to simulate the single logical qubit Q t , while qubits Q r and Q s map directly onto physical qubits q 5 and q 2 .to the logical qubits Q r and Q s , respectively.Qubit q 5 is assigned the weight a 5 = A r and the coupling between q 5 and q 1 is assigned the value b 51 = B rt .Similarly for qubit q 2 , the vertex is assigned weight a 2 = A s , and strength between q 2 and q 6 is b 26 = B st .We must now distribute the logical qubit Q t between q 1 and q 6 by assigning the values a 1 , a 6 and b 16 .We distribute the weight A t uniformly between qubits q 1 and q 2 , giving a 1 = A t /2 and a 6 = A t /2.We must now choose b 16 .To preserve the energy spectrum, we must shift the TABLE I: For two qubits the counter-term Hamiltonian is H CT (q 1 , q 6 ) = a q 1 + a q 6 + 2b q 1 q 6 .The lowest energy state is preserved for b = −α and a = α where α > 0. We will split the weight A t uniformly across the N chained physical qubits, thereby giving a contribution to the physical Hamiltonian H t 16 = A t /2 + α q 1 + α q 6 − 2α q 1 q 6 .The energy spectrum ensures that the two qubits are strictly correlated.
values of the weights a 1 and a 6 .We can do this by adding a counter-term Hamiltonian H CT = a q 1 + a q 6 + 2b 16 q 1 q 6 (2.19) to the physical Hamiltonian.The double lines in Figs.chained together.This means that the qubits are strictly correlated, i.e. when q 1 is up then q 6 is up, and when q 1 is down then q 6 is down.This is accomplished by choosing the coupling strength b 16 to favor a strict correlation; however, to preserve the ground state energy, this also requires shifting the weights for q 1 and q 6 .For q 1 = q 6 = 0 we have H CT = 0. We wish to preserve this condition when q 1 = q 6 = 1, which means 2a + 2b = 0. Furthermore, the state q 1 = 1 and q 6 = 0 must have positive energy, which means a > 0. Similarly for q 1 = 0 and q 6 = 1.We therefore choose a 1 = a 6 = α > 0 and b 16 = −α, where α is an arbitrary parameter.This is illustrated in Table.I.A more complicated case is the linear chain of N qubits as shown in Fig. 6.The counter-term Hamiltonian is taken to be (2.20) Note that H CT vanishes when q m = 0 for all m = 1 • • • N .And conversely, we must arrange the counter-term to vanish when q m = 1.The simplest choice is to take all weights to be the same and all couplings to be identical.Then, to preserve the ground state when the q r = 1, we impose with α > 0 and r = 1 • • • N .The first term in a t r distributes the weight A t uniformly across all N nodes in the chain.The second set of terms b t r,r+1 ensures that the qubits of the chain are strictly correlated.The counter-term energy is positive and is therefore selected against when the linearly chained qubits are not correlated.Table II illustrates the spectrum of the counter-term Hamiltonian for three qubits.We often need to choose large values of α, of order 20 or more, to sufficiently separate the states.The uniform spectrum of 4 states with H CT = a in Table II arises from a permutation symmetry in q 1 , q 2 , q 3 .TABLE II: For a three qubit chain the counter-term Hamiltonian is H CT (q 1 , q 2 , q 3 ) = a q 1 + a q 2 + a q 3 +2b q 1 q 2 +2b q 2 q 3 , where a = 4α/3 and b = −α.The degeneracy in energy of value a arises from a permutation symmetry in q 1 → q 2 → q 3 that preserves the form of the counter-term Hamiltonian.
To review, note that a linear counter-term is represented in Fig. 6.We add a counter-term to break the logical qubits into a chain of physical qubits that preserve the ground state.Let us consider the conditions that we place on the Hamiltonian to ensure strict correlation between the chained qubits.We adjust the values of A r and B rs to ensure that spin alignment is energetically favorable.By slaving several qubits together, we can overcome the limitations of the Chimera connectivity.As a more complex example, consider the four logical qubits of Fig. 7 connected in a circular chains by strengths B 12 , B 24 , B 43 , and B 31 .Suppose the weights are A 1 , A 2 A 3 and A 4 .Figure 8 provides an example in which each logical qubit is chained in a linear fashion to the physical qubits.FIG.8: A possible mapping of the logical qubits in Fig. 7 onto the physical device.Each logical qubit is modeled by a linear chain of strictly correlated qubits.

III. MATRIX INVERSION AS A QUBO PROBLEM
In this section we present an algorithm for solving a system of linear equations on a quantum annealer.To precisely define the mathematical problem, let M be a nonsingular N × N real matrix, and let Y be a real N dimensional vector; we then wish to solve the linear equation The linearity of the system means that there is a unique solution, and the algorithm is realized by specifying an objective function whose ground state is indeed (3.2).The objective function is not unique, although it must be commensurate with the architecture of the hardware.If the inverse matrix itself is required, it can be constructed by solving (3.1) for each of the N linearly independent basis vectors for Y.It is easy to construct a quadratic objective H(x) whose whose minimum is (3.2), namely In terms of matrix components, this can be written Note that Y 2 is just a constant, which will not affect the minimization.In principle all constants can be dropped from the objective function, although we choose to keep them for completeness.One may obtain a floating point representation of each component of T by expanding in powers of 2 multiplied by Boolean-valued variables q i r ∈ {0, 1}, As before, the domains are give by χ i ∈ [0, 2) and x i ∈ [−1, 3), and upon expressing x as a function the q i r we can recast (3.4) in the form The coefficients a i r are called the weights and the coefficients b ij rs are the interaction strengths.Note that the algorithm requires a connectivity of K NR .
Let us first calculate the product x i x j in (3.4).From (3.5) and (3.6) we find where we have used the idempotency condition (q i r ) 2 = q i r in the second term of (3.9).While the second form is one used by the code, it is more convenient algebraically to use the first form.Substituting (3.8) into the first term in (3.4) gives 12) The second term in (3.4) can be expressed as Adding H 1 and H 2 gives The Ising terms are therefore The physical qubits in are accessed by a 1-dimensional linear index, while the logical qubits in are defined in terms of the 2-dimensional indicies i and r, where i = 0, 1, • • • , N − 1 and r = 0, 1, • • • , R − 1.To map the logical qubits onto the physical qubits, we re-parameterize the logical qubits by a single index = 0, 1, • • • , N R − 1.Now we define a 1-1 mapping between these indices and the linear index = 0, 1, • • • , N • R − 1.This is just an ordinary linear indexing for 2-dimensional matrix elements, so we choose the usual row-major linear index mapping, The inverse mapping gives the row and column indices as below, where n is the greatest integer less than or equal to n.The expression " mod R" is modulo R.This is a 1-1, invertible mapping between each pair of values of i and r in the matrix index space to every value of in the linear qubits index space.We can simply replace sums over all index pairs i, r by a single sum over , provided we also rewrite any isolated indices in i and r as functions of via their inverse mapping.We may summarize this observation in the following formal identity.Given some arbitrary quantity, A, that depends functionally upon the tuple (i, r), and possibly upon the individual indices i and r, it is trivial to verify that, where , i , and r are related as in equations (3.19)- (3.22).This identity is useful for formal derivations.For example, we may use it to quickly derive the binary expansion of x i in terms of logical qubits.Inserting (3.23) into (3.6)gives, Clearly, x i has non-zero contributions only for those indices corresponding to i = i = /R , that is, only from those qubits within a row in the q i r array.Also, those contributions are summed along that row, i.e., over r = mod R.This equation will be used to reconstruct the floating-point solution, x, from the components q of the binary solution returned from the D-Wave annealing runs.The weights and strengths now become For a 2 × 2 matrix to 4-bit accuracy, we need K 8 (4 × 2 = 8), and to 8-bit accuracy we need K 16 (8 × 2 = 16).We have inverted matrices up to 3 × 3 to 4-bit accuracy, which requires . For an N × N matrix with R bits of resolutions, we must construct linear embeddings of K RN .We could generalize this procedure for complex matrices.

A. Implementation
The methods above were implemented using D-Wave's Python SAPI interface and tested on a large number of floating-point calculations.Initially, we performed floating-point division on simple test problems with a small resolution.Early on, we discovered that larger graph embeddings tended to produce noisier results.To better understand what was happening we started with a K 8 graph embedding to represent two floating-point numbers with only four bits of resolution.Since the D-Wave's dynamic range is limited to about a factor of 10 in the scale of the QUBO parameters, we determined that we could expect no more than 3 to 4 bits of resolution from any one calculation in any event.However, our binary offset representation (3.5) implies that we should expect no more than 3 bits of resolution in any single run.Indeed, using the K 8 embedding, we were able to get exact solutions from the annealer for any division problems which had answers that were multiples of 0.25, between -1.0, and 1.0.Problems in this range which had solutions that were not exact multiples of 0.25 resulted in approximate solutions, effectively "rounded" to the nearest of ±0.25 or ±0.75.At this point we implemented an iterative scheme that uses the current error, or residual, as a new input, keeping track of the accumulated floating-point solution.
The iteration method has been implemented and tested for floating-point division but we have not yet implemented iteration for matrix inversion.That can be done by using the previous residual (error) as the new inhomogeneous term in the matrix equation.We plan to implement an iterative method in the matrix inversion code soon.However, we already have good preliminary results on matrix inversion that encourages us that this should work reasonably well, at least for well-conditioned matrices.Currently, we are able to solve 2 × 2 and 3 × 3 linear equations involving floating-point numbers up to a resolution of 4 bits, and having well-conditioned matrices, exactly for input vectors with elements defined on [−1, 1] and that are multiples of 0.25.Using an example matrix that is poorly-conditioned, we find that it is generally not possible to get the right answer without first doing some sort of pre-conditioning to the matrix.But more importantly, we were able to obtain some insight about why ill-conditioned matrices can be difficult to solve as QUBO problems on a quantum annealer, which gives some hints about how to ameliorate the problem.We will discuss these results, and the effects of ill-conditioning on the QUBO matrix inverse problem below.

Note on Solution Normalization and Iteration
Allowing both the division and linear equation QUBO solvers to work for arbitrary floating-point numbers, and to allow for iterative techniques, requires normalizing the ratio of the current dividend and the divisor, or the residual and matrix, to a value in a range between −1 and 1.For the division problems, we wanted to avoid "dividing in order to divide", so we normalized each ratio using the difference between the binary exponents of divisor and dividend .These can be found just by using order comparisons, with no explicit divisions.Adding 1 to this yields an "offset" -the largest binary exponent of the ratio -to within a factor of 2 (±1 in the offset), which is sufficient for scaling our QUBO parameters as needed.The fact that our QUBO solutions are always returned in binary representation provides a simple way to bound the solution into a range solvable with the annealer by simply shifting the binary representation of the current dividend by a few bits (using the current offset), which is why we refer to the solution exponent as an "offset".In this way, the solution can easily be guaranteed to be in the correct range without having to perform any divisions in Python.The "offset" is accumulated and used to construct the current approximation to the floating-point solution on each iteration.The iteration process continues until the error of the approximate solution is less than some tolerance specified by the user.

B. Results for Division
First, we present some examples for division without iteration.We used a K 4 graph embedding for expanding the unknown x up to a resolution of 4 bits.However using the binary offset representation we can only get a true precisions of 3 bits.We solved the simple division problem, x = y m .(4.1)

Basic Division Solver
Table III gives an extensive list of tested exact solutions returned by the floating-point annealing algorithm on the D-Wave machine using the K 4 graph embedding with an effective binary resolution of 3, corresponding to the multiples of 0.25 in the range [−1, 1].The "Ground State" column lists the raw binary vector solutions, corresponding to the expansion in Eq (2.5).It is easy to check from Eqs. (2.5) and (2.4) that these give the floating-point solutions found in the corresponding "D-Wave Solution" column.In all of these cases, values of α ≥ 0.5 yielded the solution exactly; however, α is set to 20.0 here because that gives a better approximate solution for the inexact divisions, and faster convergence for the iterated divisions.It does not change the solutions for the exact cases.Table IV lists some illustrative division problems on [−1, 1] that do not have solutions which are multiples of ±0.25, and therefore are not solved exactly by the quantum annealing algorithm to 3 bits of resolution.Note that the energies are different for the ground states because the Hamiltonians are somewhat different for these problems.The "rounding" here occurs naturally in the quantum annealing algorithm as the annealer settles into the lowest energy ground state that approximates the solution.The last four problems are "challenge" problems for the iterated division solver.

Iterated Division Solver
Table V lists a few example division problems returned from the iterated quantum annealing solver.These are problems selected from both Tables III and IV to illustrate the nature of the solutions returned for both cases.These problems were iterated to an error tolerance of 1.0 × 10 −6 .The four "challenge" problems from Table IV can now be solved with the iterative method.The ground state is no longer given since the solution is generally the concatenation of multiple binary vectors for every iteration.Instead, the number of iterations is listed in the last column.Note that some of the energies are the same for the solutions of different problems.We have also left out an "Energy" column, since it only was calculated for the partial solution from the last iteration.

C. Results for Matrix Equations
Note that we have occasionally been somewhat loose in calling this "matrix inversion", since we are technically solving the linear equations, rather than directly inverting the matrices.However, for the problems considered here, we may simply obtain the solutions to the equations using trivial orthonormal eigenvectors such as (1, 0) and (0, 1), in which case the inverse of the matrix will just be the matrix having those solutions as columns.
The linear equation algorithm was implemented and used to solve several 2 × 2 and 3 × 3 matrices on the D-Wave quantum annealer.Floating-point numbers are represented using the same offset binary representation as was used for the division problems.Thus, there are 4 qubits per floating-point number.As in the previous section, this gives an effective resolution of 3 bits for floating-point numbers defined on [−1, 1].In this case, however, we employed the normalization technique discussed in the the division iteration to allow solutions with positive and negative floating-point numbers with larger magnitudes than 1.But, in these problems we still use solution values with relatively small magnitudes, and within an order of magnitude of each other for all solution vector elements.All of the cases shown here are matrix equations with exact solutions, in which case the values of the solution vector elements are multiples of 0.25.This suggests that we could have the iterative solver for the matrix inversion working very soon.
In general, every qubit constituting a floating-point number may be coupled to every other qubit for the same number.In turn, every logical qubit may be connected to every other logical qubit, which implies that every qubit in the logical qubit representation of the problem, may be coupled to every other logical qubit in the problem.Therefore, the linear solution algorithm is implemented using a K 8 graph embedding to solve 2 × 2 matrix equations, having a 2 dimensional solution vector with 4 qubits per element, and using a K 12 graph embedding to solve 3 × 3 matrix equations, having a 3 dimensional solution vector with 4 qubits per element.
Most of these solutions involve well-conditioned matrices; however, one does not generally find a feasible solution when using an ill-conditioned matrix.This is illustrated in two cases, one with a of a 2 × 2 matrix another with a 3 × 3 matrix.We were able to obtain the correct solutions by pre-conditioning these matrices before converting to QUBO form, however the 3 × 3 matrix, still had a nearly degenerate ground state and required a very large chaining penalty α to get the correct solution.This is analyzed and discussed in detail below.

Simple Analytic Problem
Recalling equation (3.1), we shall obtain solutions x of the following matrix equation, using values of M and Y listed in the Appendix.Here we present the first two tests as an example.Consider the following matrix, We can solve equation (3.1) for M , with the following two Y vectors, The exact solutions are, We may obtain M −1 simply as, In the next section we summarize all of the solutions obtained by the DWave for all of our test problems.

QUBO Solution Results
The solutions for the 2 × 2 linear solves are presented in Table VI.Notice that all of the test problems are presented with α = 20.0 except for the last two.This was done to illustrate the affect of pre-conditioning for the ill-conditioned case.However, for this example, the difference disappeared above α = 2.0, and both began to give incorrect answers below α = 1.5.This is in contrast to the 3 × 3 matrix solution cases, which are evidently more sensitive to condition number than the 2 × 2 tests.problem gave only D-Wave solutions with broken chains.One only begins to get solutions with unbroken chains at a value of α above 1000, but those solutions are generally wrong and basically random until one gets to a very high α.We discuss this in greater detail in the following section.The algorithms described here generally worked quite well for these small test cases with the exception of the ill-conditioned 3×3 matrix.The ill-conditioned cases clearly demonstrate not only the limitations of quantum annealing applied solving linear equations, but the limitations of quantum annealing, in general.Consider the two ill-conditioned tests presented here.When translated to a QUBO problem, the Hamiltonian spectra for these tests contain many energy eigenvalues very close to the ground state energy.When these are embedded within a larger graph of physical qubits they result in a very nearly degenerate ground state, typically with thousands of states having energies within the energy uncertainty of the ground state over the annealing time, τ , given by Consider a set of excited states with energy, E n for n > 0, with n = 0, corresponding to the ground state with energy denoted by E 0 , and with E n ordered by energy.The quantum annealer near the ground state evolves adiabatically whenever This is the adiabatic condition for quantum time evolution [8].However, when this condition is badly violated, which can occur dynamically since the instantaneous energies (eigenvalues of H) are time dependent, the time evolution for the system near the ground state deviates significantly from adiabatic behavior, resulting in a relatively slowly evolving superposition of all the eigenstates states that are close in energy to the ground state.Now, the energy spectra corresponding to poorly conditioned matrices have a large number of eigenstates sufficiently near the ground state to strongly violate the adiabaticity condition.Furthermore, these states, in general, will have no correlation to the solution encodings for any particular problem (e.g., the offset binary floating-point representation).For example, they are not, in general, related in any meaningful way to Hamming distance.Therefore, these problems effectively cannot resolve the true ground state and tend to give nearly random lowest energy "solutions" when the final state is measured on any individual annealing run.Since there are so many of these states for ill-conditioned problems, a very large number of "reads" (individual runs) may be have to be specified to sufficient sample the solution space to find the true ground state.
The behavior of the QUBO energies for the ill-conditioned matrices is illustrated in the figures below.The final states are binary vectors corresponding to binary numbers up to 2 N for N qubits.Since our test problems were reasonably small, we directly computed the energies for all 256 final states for the 2 × 2 matrices, and all 4096 states for the 3 × 3 matrices.The array of binary solution states were plotted as what we call "Gray" projections.
By this we mean that the energy is plotted against the solution number ordered in a Gray code sequence.Gray codes are cyclically ordered lists of binary numbers such that every consecutive pair of numbers differ in only one bit, i.e., there us a Hamming distance of 1 between any two sequential numbers [7].We employed the standard "reflected binary" Gray code which is conveniently generated by a recursive algorithm.Table VIII shows an example Gray code for the case of 4 bits.Gray code ordering was used for these plots because, although it is impossible to represent all nearest-neighbor connections for K 8 and K 12 graphs in a 1 dimensional line plot, we were able to at least ensure that all of the adjacent values in the energy plots correspond to neighboring states, which gives some information about the local energy landscape, such as whether neighboring states are close in energy, indicating a relatively "flat" displacement along that direction, or if there is a large jump, indicating a steep "cliff" or "valley" in the energy landscape.Note, also, that this Gray code sequence has global permutation symmetry among its digits, and a cyclic symmetry that repeats periodically for every 2 2 numbers in each digit place.Since this periodicity is also reflected in some periodicity of states separated by a Hamming distance of 1, this makes it possible to infer features of the local energy surface along other directions such as the existence of very deep, but nearly flat, "canyon floors" in the global energy surface.One can see this by noting periodic behavior with even periodicity in the 1 dimensional energy plots.
Figure 9 shows the energy surface for the 2 × 2 ill-conditioned matrix.Note the denseness of the energy spectrum near the ground state energy.The alpha value is listed on this plot, however, with our counter-term parameter setting strategy the energy of all the final states (when there are no broken chains) is actually independent of α.To illustrate this point, compare this to Figure 10, which is the same test problem but computed without using the counter-terms weighted to subtract out the chaining penalty energy contribution.This plot clearly shows why it is crucial to remove the error in the energy caused by not accounting for varying different chain sizes when applying the chaining penalty.Note from the plot that this error is inhomogeneous in state space.Therefore, the lowest energy state for the embedded Hamiltonian will frequently not be the correct one corresponding to the logical Hamiltonian.Note, also, that when the energy of all the chained logical qubits is restricted to be 0, the plots for the embedded and logical QUBO energies overlay each other exactly.Figure 11 shows the energy for the pre-conditioned version of this problem with the counterterm applied.Note that the pre-conditioning has moved many of the previous low-energy eigenvalues significantly higher.The 3×3 ill-conditioned matrix was much more problematic.This is partly because of the nature of this test matrix, which is very nearly singular with an approximate 2 dimensional kernel space.The energy spectrum of the ill-condition problem is given in Figure 12, and the energy spectrum of the pre-conditoned version is given in Figure 13.Although the pre-conditioned version is significantly better, it still required a very large chaining penalty, and even with that, it required thousands of "reads" (annealing runs) to reliably obtain the correct answer.In this case, it required sampling the state space with as many as 2500 or more "reads" with 20µs anneal times.Only several orders of magnitude longer anneal times began to allow for a smaller number of reads.Here there were 2 12 = 4096 final states for the logical Hamiltonian, so we were actually sampling with more than half as many distinct runs as the number of possible states.However, the K 12 embedding used for these runs had 56 qubits.Therefore, the total state space for all qubits in the embedding was of size 2 56 ≈ 7 × 10 16 , so perhaps this large sampling requirement is not as bad as it may at first appear.Still, it is clear that there are still far too many excited states too close to the ground state and that the system is far from perfectly adiabatic here.However, it should be noted that the pre-conditioning still improved things somewhat, because we could not get the correct answer with even 10000 "reads" for the original ill-conditioned problems.
The pre-conditioning method we used was very crude, rather ad-hoc, and ill-suited to practical matrix inversion problems for classical algorithms, but the intent was simply to test the effects of a simple pre-conditioning on the quantum annealing solutions.We have been studying this issue and believe it may be possible to pre-condition the problems better for solving these problems on a quantum annealing machine.In fact, we suspect that methods related to this approach may allow other QUBO problems suffering from similar pathologies to be "pre-conditioned" in way that better separates the ground state energy and allows more practical solutions on the annealer.This, however, is still work in progress, and we plan to further develop and test those ideas in the near future.
Appendix A: Matrix Test Problems The problems we solved to test our quantum annealing algorithm to solve equation (3.1) are listed below.Note that, although the QUBO B ij matrix is symmetric by construction, the matrix M need not be symmetric.

FIG. 2 :FIG. 3 :
FIG. 2: The left panel illustrates the bipartate graph C 8 in column format, while the right panel illustrates the corresponding graph in cross format, often called a Chimera graph.The gray lines represent direct connections between qubits.The cross format is useful since it minimizes the number intersecting connections.The use of red and blue dots emphasize the bipartate nature of C 8 , as every red dot is connected to every blue dot, while none of the red and blue dots are connected to one another.The vertex set of C 8 is taken to be V 8 = {1, 2, • • • , 8} and edge set is B 8 = {{1, 5}, {1, 6}, {1, 7}, {1, 8}, {2, 5}, {2, 6} • • • {7, 8}}.

FIG. 4 :
FIG.4:The K 4 embedding onto C 8 used in our implementation of 4-bit of division on the D-Wave.The blue lines represent normal connections between qubits, while the red double-lines represent chained qubits, that is to say, qubits that are strictly correlated (and can thereby represent a single logical qubit at a higher level of abstraction).The qubits 1-6 are chained together, as are the qubits 3-8.
FIG.6: Generalization of Fig.5to a chain of N linear qubits.The right panel illustrates the chain coupling parmeters used to create strict correlations of the physical qubits within the chain.

TABLE III :
Exact Quantum Annealed Division Problems to 3-bit Resolution.

TABLE IV :
"Rounded" Quantum Annealed Division Solutions to 3-bit Resolution.

TABLE VIII :
Example Gray code for 4 bits 9: Gray Projection of Energy vs State for Test 1(i) FIG.10: Gray Projection of Energy vs State for Test 1(i) without Weighted Counter-terms FIG.11: Gray Projection of Energy vs State for Test 1(j)