Individual subject evaluated difficulty of adjustable mazes generated using quantum annealing

We propose a method for generating mazes using quantum annealing. We reformulate a standard maze-generation algorithm as a quadratic unconstrained binary optimization problem suitable as input to a quantum annealer. To generate more difficult mazes, we introduce an additional cost function $Q_{update}$. The difficulty of the mazes was evaluated by the time that 12 human subjects took to solve them. To check the efficiency of our maze-generation scheme, we investigated the time-to-solution of a quantum processing unit, a classical computer, and a hybrid solver.


INTRODUCTION
A combinatorial optimization problem is the problem of minimizing or maximizing a cost or objective function over many variables that take discrete values. In general, solving a combinatorial optimization problem takes time. To deal with many combinatorial optimization problems efficiently, generic solvers are used. Quantum annealing (QA) is one such generic solver for combinatorial optimization problems Kadowaki and Nishimori (1998), and it uses the quantum tunneling effect. Quantum annealing searches for good solutions by expressing the objective function and constraints of the combinatorial optimization problem in terms of the energy function of the Ising model or its equivalent QUBO (Quadratic Unconstrained Binary Optimization), and then manipulating the Ising model or QUBO to search for low-energy states Shu Tanaka and Seki (2022).
Various applications of QA have been proposed in traffic flow optimization Neukart et al. (2017); Hussain et al. (2020); Inoue et al. (2021), finance Rosenberg et al. (2016); Orús et al. (2019); Venturelli and Kondratyev (2019), logistics Feld et al. (2019); Ding et al. (2021), manufacturing Venturelli et al. (2016); Yonaga et al. (2022); Haba et al. (2022), preprocessing in material experiments Tanaka et al. (2023), marketing Nishimura et al. (2019), steel manufacturing Yonaga et al. (2022), and decoding problems Ide et al. (2020); Arai et al. (2021a). Model-based Bayesian optimization has also been proposed in the literature Koshikawa et al. (2021). A comparative study of quantum annealers was performed as a benchmark test for solving optimization problems Oshiyama and Ohzeki (2022). The quantum effect in the case with multiple optimal solutions has also been discussed Yamamoto et al. (2020); Maruyama et al. (2021). As the environmental effect cannot be avoided, the quantum annealer is sometimes regarded as a simulator for quantum many-body dynamics Bando et al. (2020); Bando and Nishimori (2021); King et al. (2022). Furthermore, applications of quantum annealing as an optimization algorithm in machine learning have also been reported Neven et al. (2012); Khoshaman et al. (2018); O'Malley et al. (2018); Amin et al. (2018); Kumar et al. (2018); Arai et al. (2021b); Sato et al. (2021); Urushibata et al. (2022); Hasegawa et al. (2023); Goto and Ohzeki (2023). In this sense, developing the power of quantum annealing by considering hybrid use with various techniques is important, as in several previous studies Hirama and Ohzeki (2023); Takabayashi and Ohzeki (2023).
In this study, we propose maze generation by quantum annealing. In the application of quantum annealing to mazes, algorithms for finding the shortest path through a maze have been studied Pakin (2017). Automatic map generation is an indispensable technique for game production, including roguelike games. Maze generation has been used to construct random dungeons in roguelike games by assembling mazes mok Bae et al. (2015). Therefore, regarding maze generation as one of the rudiments of this technology, we studied maze generation using a quantum annealing machine. Several maze-generating algorithms have been proposed. Examples are the bar-tipping algorithm Alg (2023a), the wall-extending algorithm Alg (2023b), and the hunt-and-kill algorithm Alg (2023c).
The bar-tipping algorithm generates a maze by extending evenly spaced bars one by one. We first introduce the terminology. A path is an empty, traversable part of the maze, and a bar is a filled, non-traversable part. Figure 1 shows where the outer wall, the bars, and the coordinates (i, j) are in a 3 × 3 maze. The maze is surrounded by an outer wall, as in Figure 1. The algorithm requires the following three constraints. First, each bar can be extended by one cell in only one direction. Second, bars in the first column can be extended in four directions (up, down, left, and right), while bars in the second and subsequent columns can be extended in only three directions (up, down, and right). Third, adjacent bars cannot overlap each other. We explain the detailed process of the bar-tipping algorithm using a 3 × 3 maze. In this study, a maze generated by extending N × N bars is called an N × N maze. First, standing bars are placed in every second cell of a field surrounded by an outer wall, as in Figure 1. Second, the bars are extended step by step, as shown in Figure 2.
If multiple maze solutions are possible, the maze solution is not unique, which reduces the time and difficulty of reaching the goal. The constraints must therefore be followed, for the reasons described below. The first constraint prevents the generation of a maze with multiple solutions and closed circuits. Figure 3 (a) shows a maze state that violates the first constraint: one bar in the upper right corner is extended in two directions.
The second constraint also prevents the generation of a maze with closed circuits and multiple solutions. Figure 3 (b) shows a state that violates the second constraint; as a result, the maze has a closed circuit and multiple solutions. The third constraint prevents the generation of a maze with multiple solutions. Figure 3 (c) shows a state that violates the third constraint: the bars overlap in the upper right corner.
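The three constraints above can be illustrated with a minimal classical sketch of the bar-tipping procedure. This is not the authors' implementation: the grid layout (a (2N+3) × (2N+3) array with bars at even coordinates), the function names, and the direction encoding (0 up, 1 right, 2 down, 3 left) are assumptions made for illustration only.

```python
import random
from collections import deque

# Direction index d -> (row offset, column offset): 0 up, 1 right, 2 down, 3 left.
OFFSETS = {0: (-1, 0), 1: (0, 1), 2: (1, 0), 3: (0, -1)}


def bar_tipping_maze(n, rng=None):
    """Return a (2n+3) x (2n+3) grid; True marks a wall cell, False a path cell."""
    rng = rng or random.Random()
    size = 2 * n + 3
    # Outer wall on the border, path everywhere else.
    grid = [[r in (0, size - 1) or c in (0, size - 1) for c in range(size)]
            for r in range(size)]
    # Standing bars on every second interior cell.
    for i in range(n):
        for j in range(n):
            grid[2 * i + 2][2 * j + 2] = True
    # Tip the bars column by column.
    for j in range(n):
        # Second constraint: "left" is allowed only in the first column.
        allowed = (0, 1, 2, 3) if j == 0 else (0, 1, 2)
        for i in range(n):
            r, c = 2 * i + 2, 2 * j + 2
            # Third constraint: skip directions whose target cell is already a wall.
            free = [d for d in allowed
                    if not grid[r + OFFSETS[d][0]][c + OFFSETS[d][1]]]
            # First constraint: exactly one direction per bar.
            dr, dc = OFFSETS[rng.choice(free)]
            grid[r + dr][c + dc] = True
    return grid


def paths_connected(grid):
    """Breadth-first search check that every path cell is reachable from (1, 1)."""
    total = sum(not cell for row in grid for cell in row)
    seen, queue = {(1, 1)}, deque([(1, 1)])
    while queue:
        r, c = queue.popleft()
        for dr, dc in OFFSETS.values():
            nxt = (r + dr, c + dc)
            if not grid[nxt[0]][nxt[1]] and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return len(seen) == total
```

In this sketch the cell to the right of a bar is always still free when the bar is processed, so a valid direction always exists. The connectivity check reflects the purpose of the constraints: no region of the maze is closed off.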
Next, we describe the wall-extending algorithm, which generates a maze by extending walls. Figure 4 shows the starting coordinates for extension in the wall-extending algorithm, and Figure 5 (a) shows its initial state. As an initial condition, the outer perimeter of the maze is the outer wall, and the rest of the maze is path, as in Figure 5 (a). The coordinate system differs from that of the bar-tipping algorithm: all cells are assigned coordinates. As Figure 4 shows, the coordinates where both x and y are even and that are not walls are listed as starting coordinates for wall extension. The following process is repeated until all starting coordinates have changed to walls, as shown in Figure 5 (c). A coordinate is chosen at random from the starting coordinates that are not yet walls. The extending direction is then chosen at random from the directions in which the adjacent cell is a path. Figure 5 (b) shows how the wall is extended: the extension is repeated while the cell two cells ahead in the extending direction is a path. Following this process, we can generate a maze as in Figure 5 (d).
Third, we explain the hunt-and-kill algorithm, which generates a maze by extending paths. Figure 6 shows the starting coordinates for extension in the hunt-and-kill algorithm, and Figure 7 (a) shows its initial state, in which the entire surface is wall. Coordinates where both x and y are odd are listed as starting coordinates for path extension, as in Figure 6. As with the wall-extending algorithm, all cells are assigned coordinates. A coordinate is chosen at random from the starting coordinates, and the path is extended from there, as in Figure 7 (b). If the path can no longer be extended, a coordinate is randomly selected from the starting coordinates that are already paths, and the extension starts again from it, as in Figure 7 (c). This process is repeated until all the starting coordinates have turned into paths.
Of the three maze generation algorithms mentioned above, the bar-tipping algorithm is relevant to the combinatorial optimization problem. In addition, unlike the other maze generation algorithms, the bar-tipping algorithm is easy to apply because it only requires the consideration of adjacent elements. Thus, we have chosen to deal with this algorithm. Other maze generation algorithms could be generalized by reformulating them as combinatorial optimization problems. The wall-extending and hunt-and-kill algorithms will be implemented in future work, considering the following factors. The former algorithm introduces the rule that adjacent walls are extended and so are their walls. For the latter, the number of connected components will be computed, and the result will be included in the optimization.
We reformulated the bar-tipping algorithm as a combinatorial optimization problem that generates a maze with a longer solving time, and optimized it using quantum annealing. Quantum annealing (DW_2000Q_6 from D-Wave), classical computing (simulated annealing, simulated quantum annealing, and an algorithmic solution of the bar-tipping algorithm), and hybrid computing were compared according to the generation time of mazes, and their performance was evaluated. The solvers used in this experiment are as follows: DW_2000Q_6 from D-Wave, the simulated annealer SASampler and the simulated quantum annealer SQASampler from OpenJij ope (2023), D-Wave's quantum-classical hybrid solver called hybrid binary quadratic model version2 (BQM), and a classical computer (MacBook Pro (14-inch, 2021), OS: macOS Monterey Version 12.5, Chip: Apple M1 Pro, Memory: 16GB). This comparison showed that quantum annealing was faster. This may be because quantum annealing determines the directions of all bars at once, which is several times faster than the classical algorithm. We do not use an exact solver to solve the combinatorial optimization problem. In maze generation, we expect some diversity in the solutions and do not focus only on a single optimal solution. Thus, we compare solvers that generate various optimal solutions.
In addition, we generate mazes that reflect individual characteristics, whereas existing maze generation algorithms rely on randomness and fail to incorporate other factors. Here, we incorporated the maze solving time as one such factor. The maze solving time was defined as the time (in seconds) from the start to the end of solving the maze.
The paper is organized as follows. In the next section, we explain the methods of our experiments. In Sec. 3, we describe the results of our experiments. In Sec. 4, we summarize this paper.

Cost function
To generate a maze with a quantum annealer, we need to set a cost function on the quantum annealer. One of the important features of maze generation is diversity. In this sense, the optimal solution is not always unique. Since it is sufficient to obtain a structure consistent with a maze, the cost function is mainly derived from the necessary constraints of a maze, as explained below. Three constraints form the basis of the bar-tipping algorithm. The cost function is converted to a QUBO matrix to use the quantum annealer. To convert the cost function to a QUBO, it must be written in quadratic form. Using the penalty method, we can convert various constraints written in linear form into a quadratic function. The penalty method rewrites an equality constraint as a quadratic function. For example, it rewrites the equality constraint $x = 1$ as $(x - 1)^2$. Thus, we construct the cost function for generating the maze using the bar-tipping algorithm below.
The constraints of the bar-tipping algorithm correspond to the terms of the cost function described below. The first constraint of the bar-tipping algorithm is that each bar can be extended in only one direction. It prevents closed circuits. The second constraint is that the bars in the first column are extended randomly in one of four directions (up, right, down, and left), and the bars in the second and subsequent columns are extended randomly in one of three directions (up, right, and down). It also prevents closed circuits. The third constraint is that adjacent bars must not overlap. Following these constraints, we can generate a maze with only one path from the start to the goal.
The cost function consists of three terms, which reproduce the bar-tipping algorithm according to the three constraints and determine the start and goal:
$$E = \sum_{i,i'} \sum_{j,j'} \sum_{d,d'} Q_{(i,j,d),(i',j',d')}\, x_{i,j,d}\, x_{i',j',d'} + \lambda_1 \sum_{i} \sum_{j} \Big( \sum_{d} x_{i,j,d} - 1 \Big)^2 + \lambda_2 \Big( \sum_{m} \sum_{n} X_{m,n} - 2 \Big)^2 \quad (1)$$
Here, $x_{i,j,d}$ takes 1 when the bar at coordinate (i, j) is extended in direction d (d = 0, 1, 2, 3 for up, right, down, and left) and 0 otherwise. $Q_{(i,j,d),(i',j',d')}$ in Equation 1 depends on i, j, d, i', j', and d' and is expressed as follows:
$$Q_{(i,j,d),(i',j',d')} = \begin{cases} 1 & (i' = i + 1,\ j' = j,\ d = 2,\ d' = 0) \\ 0 & \text{otherwise} \end{cases} \quad (2)$$
The coefficients $\lambda_1$ and $\lambda_2$ are constants that adjust the effect of each penalty term. The first term prevents bars from extending face-to-face and overlapping. It represents the third constraint of the bar-tipping algorithm. Here, due to the second constraint, bars in the second and subsequent columns cannot be extended to the left. Therefore, adjacent bars in the same row cannot extend toward each other and overlap. This corresponds to the fact that d cannot take 3 when j ≥ 1. Thus, there is no need to consider overlaps between left and right neighbors. The first term only restricts the overlap between vertically adjacent bars. For example, the situation in which the bar at (i, j) is extended down (d = 2) and the lower bar at (i + 1, j) is extended up (d = 0) is represented by $x_{i,j,2}\, x_{i+1,j,0} = 1$, and $Q_{(i,j,2),(i+1,j,0)}$ takes 1. In the same way, for the bar at (i, j) and the upper bar at (i − 1, j), $Q_{(i-1,j,2),(i,j,0)} = 1$. Thus, $Q_{(i-1,j,2),(i,j,0)}\, x_{i-1,j,2}\, x_{i,j,0}$ takes 1, and the value of the cost function increases. In this way, the third constraint is represented by the first term. The second term is a penalty term that limits the direction of extension to one per bar. It represents the first constraint of the bar-tipping algorithm. This means that for a given coordinate (i, j), the sum of $x_{i,j,d}$ over d must take the value 1. Here, by the second constraint, the bars in the second and subsequent columns cannot extend to the left. Thus, d takes (0, 1, 2, 3) when j = 0, and d takes (0, 1, 2) when j ≥ 1. The third term is the penalty term for selecting two coordinates, the start and the goal, from the coordinates (m, n). This means that the sum of $X_{m,n}$ over all coordinates (m, n) takes the value 2.
The start and the goal are interchangeable in the maze. They are randomly assigned to the two coordinates determined by the third term. $X_{m,n}$ denotes whether the start or the goal is set at the m-th row and n-th column of the candidate start and goal coordinates. When the coordinate (m, n) is chosen as the start or the goal, $X_{m,n}$ takes 1. Otherwise, it takes 0. There are no couplings between $X_{m,n}$ and $x_{i,j,d}$ in Equation 1. This means that the maze structure and the choice of the start and goal coordinates are independent. Figure 8 shows the coordinates (m, n) that are candidates for the start and the goal. As Figure 8 shows, (m, n) differs from the coordinates at which bars are set; these cells are located at the four corners around the bars, where the bars do not extend. $X_{m,n}$ and $x_{i,j,d}$ are different: $X_{m,n}$ are the candidates for the start and goal, and $x_{i,j,d}$ are the candidate coordinates and directions in which the bars extend. We have shown the simplest implementation of maze generation following the bar-tipping algorithm on a quantum annealer. With the above, a maze that depends on randomness is generated. To generate a unique maze independent of randomness, we add to the cost function an effect that makes the maze more difficult, where the difficulty is defined in terms of the solving time (in seconds).
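The maze part of the cost function can be checked on a tiny instance with a brute-force sketch in plain Python. It assumes the form described in the text: a coupling of 1 between a bar extended down (d = 2) and the bar directly below it extended up (d = 0), plus the one-direction penalty $\lambda_1 (\sum_d x_{i,j,d} - 1)^2$ expanded with $x^2 = x$. The start and goal term is omitted for brevity, and all function names are hypothetical.

```python
from itertools import product


def build_maze_qubo(n, lam1=2.0):
    """Build the maze part of the QUBO as a dict {(u, v): weight} plus a constant."""
    # Variables (i, j, d); direction d = 3 (left) exists only in column j = 0.
    variables = [(i, j, d) for i in range(n) for j in range(n)
                 for d in range(4 if j == 0 else 3)]
    qubo = {}

    def add(u, v, w):
        key = (u, v) if u <= v else (v, u)
        qubo[key] = qubo.get(key, 0.0) + w

    # First term: vertically adjacent bars must not extend toward each other.
    for i in range(n - 1):
        for j in range(n):
            add((i, j, 2), (i + 1, j, 0), 1.0)
    # Second term: lam1 * (sum_d x - 1)^2 = lam1 * (-sum_d x + 2 sum_{d<d'} x x' + 1).
    for i in range(n):
        for j in range(n):
            dirs = [v for v in variables if v[:2] == (i, j)]
            for a, u in enumerate(dirs):
                add(u, u, -lam1)
                for v in dirs[a + 1:]:
                    add(u, v, 2.0 * lam1)
    offset = lam1 * n * n  # constant "+1" from each one-direction penalty
    return variables, qubo, offset


def energy(x, qubo, offset):
    """Evaluate the cost of a 0/1 assignment x given as a dict over the variables."""
    return offset + sum(w * x[u] * x[v] for (u, v), w in qubo.items())


def ground_states(n, lam1=2.0):
    """Enumerate all assignments (small n only) and return the minimum-energy ones."""
    variables, qubo, offset = build_maze_qubo(n, lam1)
    best, states = None, []
    for bits in product((0, 1), repeat=len(variables)):
        x = dict(zip(variables, bits))
        e = energy(x, qubo, offset)
        if best is None or e < best - 1e-9:
            best, states = e, [x]
        elif abs(e - best) <= 1e-9:
            states.append(x)
    return best, states
```

For a 2 × 2 maze this enumerates 2^14 assignments. Every minimum-energy state assigns exactly one direction to each bar with no facing pair, and there are many such states, which illustrates the diversity of optimal solutions mentioned above.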

Update rule
We propose an additional $Q_{update}$ term to increase the time to solve the maze. It is a random term whose elements change the maze structure, and it is added to Equation 1. First, the additional term, which contains the new QUBO matrix $Q_{update}$, is given by Equation 3, with the indices k and l defined in Equation 4. Figure 9 shows the structure of $Q_{update}$ and the role of each part. Here, k', l' are obtained from k, l by replacing i, j, m, n with i', j', m', n'. N in Equation 4 is the size of the maze. The coefficients $\lambda_{update1}$ and $\lambda_{update2}$ are constants that adjust the effect of the terms. The elements of $Q_{update}$ related to maze generation (part A in Figure 9) are multiplied by $\lambda_{update1}$. The elements of $Q_{update}$ related to the relation between the start and goal determination and the maze generation (parts B and C in Figure 9) are multiplied by $\lambda_{update1}$. The elements of $Q_{update}$ related to the start and goal determination (part D in Figure 9) are multiplied by $\lambda_{update2}$. These coefficients control the maze difficulty without breaking the constraints of the bar-tipping algorithm. Equation 3 is written in terms of the serial number k of each coordinate (i, j) at which bars can extend, and the index l, which is the sum of the total number of coordinates at which the bars can extend and the serial number of the coordinates (m, n) that are candidates for the start and the goal. Furthermore, the second and third terms in Equation 3 allow the maze to take into account the relation between the structure of the maze and the coordinates of the start and the goal.
Second, the new QUBO matrix $Q_{update}$ is given by
$$Q_{update} \leftarrow p(t)\, Q_{update} + (1 - p(t))\, Q_{random} \quad (5)$$
where $Q_{random}$ is a matrix of random elements from −1 to 1, and $p(t)$, given by Equation 6, depends on the time t (in seconds) taken to solve the previous maze. $Q_{update}$ is designed to increase the maze solving time over repeated maze solving. The initial $Q_{update}$ used in the first maze generation is a random matrix, and the $Q_{update}$ used in the second and subsequent maze generations is updated using Equation 5, the maze solving time t, and the previous $Q_{update}$. The longer the solving time t of the maze, the higher the proportion of the previous $Q_{update}$ in the current $Q_{update}$ and the lower the proportion of $Q_{random}$; conversely, when t is small, the proportion of the previous $Q_{update}$ is small and the proportion of $Q_{random}$ is large. In other words, the longer the solving time t of the previous maze, the more characteristics of the previous $Q_{update}$ remain. Here, a is a constant that adjusts the proportion. $p(t)$ is a function that increases monotonically with t and takes values from 0 to 1. Thus, the proportion of $Q_{random}$, that is, of the random elements in $Q_{update}$, decreases as the time t increases. After the maze is solved, the QUBO for the next maze is updated by Equation 5 using the time taken to solve the maze. The update is carried out only once before each maze generation. Repeating the update gradually makes the maze more difficult for the individual.
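The update rule can be sketched as a convex mixture of the previous matrix and a fresh random matrix. The functional form $p(t) = 1 - e^{-at}$ used here is an assumption, since Equation 6 is not reproduced in this text; it was chosen only because it increases monotonically from 0 to 1 and gives about 0.78 at t = 30 s with a = 0.05, which is compatible with the "about 0.8" stated for the experiment. The function names are hypothetical.

```python
import math
import random


def mixing_weight(t, a=0.05):
    """Assumed form of p(t): increases monotonically from 0 toward 1 with t."""
    return 1.0 - math.exp(-a * t)


def update_q(q_prev, t, a=0.05, rng=None):
    """Mix the previous Q_update with a random matrix with elements in [-1, 1].

    A long solving time t keeps most of q_prev; a short one mostly re-randomizes.
    """
    rng = rng or random.Random()
    w = mixing_weight(t, a)
    return [[w * q + (1.0 - w) * rng.uniform(-1.0, 1.0) for q in row]
            for row in q_prev]


def initial_q(size, rng=None):
    """The first Q_update is a purely random matrix."""
    rng = rng or random.Random()
    return [[rng.uniform(-1.0, 1.0) for _ in range(size)] for _ in range(size)]
```

A typical loop would call `initial_q` once, generate and present a maze, measure the solving time t, and then call `update_q` once before generating the next maze. Because the result is a convex combination, every element stays within [-1, 1].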
The sum of Equation 1 and Equation 3 is always used to generate a new maze, with the annealing starting from a maximally mixed state.

Generation of maze
We generate mazes by optimizing the cost function using DW_2000Q_6. Since the generated mazes are not solved by subjects in this experiment, the update term is excluded. We chose $\lambda_1 = 2$ and $\lambda_2 = 2$.

Computational cost
We compare the generation times of N × N mazes on DW_2000Q_6 from D-Wave, the simulated annealer SASampler and the simulated quantum annealer SQASampler from OpenJij, D-Wave's quantum-classical hybrid solver called hybrid binary quadratic model version2 (hereinafter referred to as "Hybrid Solver"), and a classical computer (MacBook Pro (14-inch, 2021), OS: macOS Monterey Version 12.5, Chip: Apple M1 Pro, Memory: 16GB) running the bar-tipping algorithm coded in Python 3.11.5 (hereinafter referred to as "Classic"). The update term was excluded from this experiment. We set $\lambda_1 = 2$ and $\lambda_2 = 2$. DW_2000Q_6 was annealed 1000 times for 20 µs each, and its QPU annealing time for maze generation was calculated using time-to-solution (TTS). SASampler and SQASampler were annealed with 1000 sweeps. These parameters were constant throughout this experiment. Regression curves fitted by the least squares method were drawn from the results to examine the dependence of computation time on maze size.
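The text does not state the TTS formula, so the following sketch assumes the commonly used definition: the annealing time multiplied by the number of repetitions needed to observe a valid solution at least once with a target probability (99% here). The function name, the 99% target, and the handling of the edge cases are assumptions.

```python
import math


def time_to_solution(anneal_time_us, success_prob, target=0.99):
    """Time needed to hit a valid solution at least once with probability `target`.

    anneal_time_us: duration of a single anneal (e.g. 20 for 20 microseconds).
    success_prob:   fraction of anneals that returned a valid maze.
    """
    if not 0.0 <= success_prob <= 1.0:
        raise ValueError("success_prob must lie in [0, 1]")
    if success_prob == 0.0:
        return math.inf  # never observed a valid solution
    if success_prob >= target:
        return float(anneal_time_us)  # a single anneal already suffices
    repeats = math.log(1.0 - target) / math.log(1.0 - success_prob)
    return anneal_time_us * repeats
```

For example, if half of 1000 anneals of 20 µs returned a valid maze, `time_to_solution(20, 0.5)` gives roughly 133 µs under this definition.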

Effect of update term
The solving times of 9 × 9 mazes generated without $Q_{update}$ and with $Q_{update}$ were measured. In this experiment, 12 human subjects were each asked to solve one set of mazes (30 mazes). To prevent the players from memorizing the maze structure, they could see only a limited window of 5 × 5 cells. In other words, only the cells within two cells of the player's position were visible. The rate of increase, relative to the first step, of the simple moving average of ten solving times was plotted. For this experiment, we chose $\lambda_1 = 2$, $\lambda_2 = 2$, $\lambda_{update1} = 0.15$, $\lambda_{update2} = 0.30$, and $a = 0.05$. For the two $\lambda_{update}$ coefficients, we chose larger values that do not violate the constraints of the bar-tipping algorithm. For the constant a, we chose a value for which Equation 6 is about 0.8 (80%) when t = 30 seconds.
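The plotted quantity can be sketched as follows. The text does not define the increase rate precisely, so this sketch assumes it is the ratio of each moving-average value to the first moving-average value; the function names are hypothetical.

```python
def simple_moving_average(values, window=10):
    """Simple moving average over consecutive windows of the given length."""
    if len(values) < window:
        raise ValueError("need at least `window` values")
    return [sum(values[k:k + window]) / window
            for k in range(len(values) - window + 1)]


def increase_rate(solving_times, window=10):
    """Moving average of the solving times, normalized by its first value."""
    sma = simple_moving_average(solving_times, window)
    return [s / sma[0] for s in sma]
```

With one set of 30 solving times and a window of 10, this yields 21 values starting at 1.0; values above 1.0 indicate that mazes are taking longer to solve than at the start of the set.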

Applications
The cost function in this paper has many potential applications when generalized. For example, it can be applied to graph coloring and traffic light optimization. Graph coloring corresponds to requiring adjacent nodes to have different colors. Traffic light optimization can be addressed by regarding maze generation as traffic flow. Roughly speaking, our cost function can be applied to problems in which the next state is determined by looking at adjacent states.
$Q_{update}$ can be applied to problems in which the difficulty of the next state is determined from the previous result. The selection of personalized educational materials is one example. Based on the solving time of previously solved problems, educational materials can be selected at a difficulty suitable for the individual. This is the most fascinating direction for future studies. As described above, we emphasize that the $Q_{update}$ proposed in this paper also has potential use in various fields related to training and education.
Fits of the form $aN^2 + bN + c$ are applied to each of the datasets using the least squares method. The results are as follows. Figure 11 shows the relation between the TTS for maze generation and the maze size on DW_2000Q_6. The TTS of DW_2000Q_6 is $O(N)$ or $O(N^2)$. Even if it depends quadratically on the maze size, its deviation is smaller than that of the other solvers.

Effect of update term
Here, 12 human subjects were each asked to solve one set of mazes (30 mazes), and we examine whether the maze increases in difficulty as it adapts to each subject. Figure 15 (a) shows the rate of increase, relative to the first step, of the simple moving average of 10 solving times for mazes generated without $Q_{update}$, together with the individual rates of increase. The solving time of the mazes without $Q_{update}$ became slightly shorter overall. Figure 15 (b) shows the same quantities for mazes generated using $Q_{update}$. The solving time of the mazes using $Q_{update}$ became longer overall. Most of the players increased their solving time, but for some players it decreased or did not change. In addition, for nine players, the average solving time of the mazes generated using $Q_{update}$ was longer than that of the mazes generated without $Q_{update}$. These results show that $Q_{update}$ has the potential to increase the difficulty of the mazes.

DISCUSSION
In this paper, we show that difficult mazes (mazes with a longer solving time) can be generated with quantum annealing using the bar-tipping algorithm. By reformulating the bar-tipping algorithm as a combinatorial optimization problem, we generalize it so that mazes can be generated more flexibly. In particular, our approach is simple but can adjust the difficulty of solving mazes through quantum annealing.
In Sec. 3.2, comparing the computational costs of our maze generation approach using TTS, DW_2000Q_6 has a smaller coefficient of $N^2$ than its classical counterparts. Therefore, as N increases, the computational cost of DW_2000Q_6 can be expected to remain lower than that of classical simulated annealing over a certain range. Unfortunately, since the number of qubits in the D-Wave quantum annealer is finite, the potential power of generating mazes by quantum annealing is limited. However, our results demonstrate some advantages of quantum annealing over its classical counterpart. In addition, we observed that the hybrid solver's computational cost was constant up to N = 18. This indicates that hybrid solvers will potentially be effective if they are developed to deal with many variables in the future.
In Sec. 3.3, we proposed $Q_{update}$ to increase the solving time using quantum annealing. We demonstrated that introducing $Q_{update}$ increased the time to solve the maze and changed the difficulty compared to the case without $Q_{update}$. In this experiment, the parameters ($\lambda_{update1}$, $\lambda_{update2}$, and a) were fixed. Generating difficult mazes for everyone may be possible by adjusting the parameters individually.
One direction for future study is the application of our cost function in various realms. We emphasize that the $Q_{update}$ proposed in this paper also has potential use in various fields related to training and education. The powerful computation of quantum annealing and its variants opens the way to such realms with high-speed computation and various solutions.
Figure 2 (a) shows the first column of bars extended. The bars in the first column are randomly extended in only one direction with no overlaps, as in Figure 2 (a). The bars can be extended in four directions (up, down, right, left) at this time. Figure 2 (b) shows the second column of bars being extended. Third, the bars in the second column are randomly extended in one direction without overlap, as in Figure 2 (b). The bars can be extended in three directions (up, down, right) at this time. Figure 2 (c) shows the state in which the bars after the second column are extended. Fourth, the bars in subsequent columns are randomly extended in one direction, like the bars in the second column, as in Figure 2 (c). Figure 2 (d) shows the complete maze in its finished state. Following the process, we can generate a maze as in Figure 2 (d).

Figure 2. Steps of the bar-tipping algorithm. (a) Step 1: bars in the first column are extended. (b) Step 2: bars in the second column are extended. (c) Step 3: bars in subsequent columns are extended. (d) Step 4: the complete maze obtained through these steps.
Figure 7 (d) shows the complete maze generated by the hunt-and-kill algorithm. Following the process, we can generate a maze as in Figure 7 (d).

Figure 3. Mazes that violate the constraints. (a) A maze that violates the first constraint. (b) A maze that violates the second constraint. (c) A maze that violates the third constraint.

Figure 4. Red cells represent options of starting coordinates for the wall-extending algorithm.

Figure 6. Red cells represent options of starting coordinates for the hunt-and-kill algorithm.

Figure 8. Black cells represent outer walls and inner bars (i, j).Red cells represent options of start and goal coordinates (m, n).

Figure 9. Structure of $Q_{update}$. Part A is related to maze generation. Part B and part C are related to the relation between maze generation and the start and goal determination. Part D is related to the start and goal determination.

Figure 10 shows execution examples of 9 × 9 and 15 × 15 mazes generated by optimizing the cost function using DW_2000Q_6.

Figure 12. (a) Time to reach the ground state as a function of the maze size in Classic. The error bars represent a 95% confidence interval. The regression curve is $(0.855 \pm 0.090)N^2 + (0.6 \pm 1.5)N + (2.2 \pm 5.1)$. (b) Time to reach the ground state as a function of the maze size in SASampler. The error bars represent a 95% confidence interval. The regression curve is $(28.8 \pm 1.2)N^2 + (36 \pm 20)N + (129 \pm 71)$. (c) Time to reach the ground state as a function of the maze size in SQASampler. The error bars represent a 95% confidence interval. The regression curve is $(172.8 \pm 4.4)N^2 + (287 \pm 73)N - (1.5 \pm 2.5) \cdot 10^2$.

Figure 14. Time to reach the ground state as a function of maze size in the Hybrid Solver. The error bars represent a 95% confidence interval.

Figure 15. (a) Left: Rate of increase, relative to the first step, of the simple moving average of 10 solving times of 9 × 9 mazes generated without $Q_{update}$. The error bars represent standard errors. Right: The same rate of increase for each player, for 9 × 9 mazes generated without $Q_{update}$. (b) Left: Rate of increase, relative to the first step, of the simple moving average of 10 solving times of 9 × 9 mazes generated using $Q_{update}$. The error bars represent standard errors. Right: The same rate of increase for each player, for 9 × 9 mazes generated using $Q_{update}$.