Effectiveness of Augmented Reality Guides for Blind Insertion Tasks

Although many augmented reality (AR)-based assembly support systems have been proposed in academic research and industry, the effectiveness of AR to resolve the occlusion issue in the context of a blind assembly process remains an unexplored topic. Therefore, the present work investigates how AR can assist operators during the execution of blind manual assembly tasks. Specifically, an AR research set-up was designed to provide assistance in occlusion situations during a peg-in-hole task. The set-up featured a see-through device (HoloLens), which provides operators with two modes of visual augmentations that directly overlay on the assembly objects. The first mode, referred to as the “wireframe overlay,” displays the inner part of the objects, providing an inside view of the occluded parts, and the second one, referred to as the “axes overlay,” displays the axes of the objects and their slots, indicating how to align the different parts during the assembly. The effectiveness of these AR visualizations was compared to a baseline augmentation-free situation in a controlled experiment. Following a within-subject design, 30 participants performed a two-stage blind insertion task. Their performance, measured by task completion time, insertion errors, and smoothness of the insertions, was recorded. In addition, a post-task questionnaire collected their subjective perception of task difficulty and their preferences. Results indicated that participants strongly accepted the AR visualizations, which they rated as allowing them to perform the task more easily. However, no statistically significant differences in terms of objective performance measures were found. Yet, it was found that the axes overlay produced smoother trajectories compared to the wireframe overlay, highlighting the potential effect of more abstract visualization aids.


INTRODUCTION
Manual assembly tasks represent one of the most extensively studied manual processes in manufacturing where "automation is not cost-effective, products are highly customized, or processes cannot be done by automatic machines" (Tang et al., 2003). One of the main challenges of these studies is to enhance the assembly information that guides a human operator when performing the assembly process. Assembly information, such as textual instructions, drawings or schematics, in the form of paper or electronic manuals, is often separated from the assembly product. Therefore, the operator usually needs to switch attention between the assembly instructions and the parts being assembled. These switches of attention may lead to reduced productivity, increased assembly times and errors, as well as strain injuries (Khuong et al., 2014). By replacing these types of information and providing adequate guidance to the operator during the manual assembly task, one could reduce operation time and cost, and improve the quality of manufacturing processes.
Thus, to assist operators during such tasks, different approaches have been proposed. Among them is the use of haptic technology to provide more realistic feedback during the assembly process, such as feeling the weight of the parts to be assembled or the contact force when objects collide (Seth et al., 2006). Other studies also suggested the use of haptics to define virtual constraint guidance, for example when wearing gloves (Valentini, 2009) or using vibrotactile feedback (Arbeláez et al., 2019), which helps operators to find the right alignment on the assembly constraint (Tching et al., 2010; Wildenbeest et al., 2012).
While haptic technology for manual assembly tasks has shown certain benefits, many limitations still prevent its widespread adoption. Indeed, to maintain stability in real time, haptic simulations are required to calculate forces at a high rate (1 kHz), which makes them computationally expensive. Moreover, they generally operate with intrusive mechanical structures or equipment (instrumented gloves, exoskeletons, and robotic arms) that disturb operators during the task or restrict their gestures, which in turn affects performance (Bashir et al., 2004). For a more detailed review of assembly with haptic feedback and its limitations, see Perret et al. (2013).
In parallel to the use of haptic technology, an increasingly common approach is the use of Augmented Reality (AR) to provide visual cues that help operators during the assembly process (Unger et al., 2002; Petzold et al., 2004; Funk et al., 2016a). AR is a human-machine interaction tool that overlays computer-generated information (e.g., 3D models and annotations) on the real-world environment perceived by a human user (Azuma, 1997; Azuma et al., 2001). AR makes it possible to display digital assembly information in the operators' field of view according to the situation (i.e., depending on the observed objects). Hence, it can improve assembly operations through essential step-by-step real-time instructions. The operators can concentrate on the tasks at hand without having to change their head or body positions to access the next instruction. Consequently, AR technology could provide an efficient and complementary tool to assist assembly tasks.
Many researchers in the manufacturing industries (Caudell and Mizell, 1992; Curtis et al., 1999), as well as in academic institutes and universities (Doil et al., 2003; Reinhart and Patron, 2003), have explored the use of AR technology in assembly activities. As a result, several prototype applications were introduced, which show the benefits of using AR assistance in manual assembly operations (Reiners et al., 1999; Zenati et al., 2004; Regenbrecht et al., 2005). See Nee et al. (2012) and Ong et al. (2008) for an overview of AR applications in manufacturing. In comparison with conventional guidance methods, such as paper-based work instructions, assembly guidance systems based on AR can help reduce search time for relevant instructions as well as reduce mistakes (Tang et al., 2003; Henderson and Feiner, 2011; Hou and Wang, 2013; Korn et al., 2013; Zhu et al., 2013). In addition, AR allows the user to focus on the task by displaying guidance materials spatially close to the working area to minimize attention switching (Khuong et al., 2014), thus reducing the mental workload (Robertson et al., 2008; Hou and Wang, 2013). Finally, it improves user acceptance (Nilsson and Johansson, 2007; Webel et al., 2013).
However, although the use of AR to support assembly tasks has been a focus of interest over the last decade, few researchers and industrial practitioners have addressed the problem of occlusion that can occur during "blind" manual assembly tasks, i.e., when the view of the operator is blocked, partially or totally, by the elements to be assembled.
The purpose of the present paper is to evaluate the effectiveness of an AR-based assembly prototype consisting of two types of AR visualizations in order to understand how best to assist operators in the context of manual blind assembly tasks.
The remainder of this paper is divided into six sections. The second section provides an overview of related works highlighting the research focus and the main objective of the present study. The third section presents the AR system designed to address the visual occlusion issues that occur during blind assembly tasks. The user evaluation procedure is reported in the fourth section. It is followed by statistical analysis and subsequent results in the fifth section. In the sixth section, these results are discussed. The seventh and last section concludes with some future work directions inspired by the present findings.

RATIONALE AND MOTIVATION
AR assistance for manufacturing and assembly domain activities is about as old as augmented reality itself, with the first AR-based assembly system introduced, in 1992, by engineers at Boeing to aid workers in the assembling of wires on a mounting plate, through displaying pertinent instructions and diagrams on a head-mounted display (HMD) (Caudell and Mizell, 1992;Sims, 1994;Curtis et al., 1999). Although they could demonstrate the feasibility of their system, they encountered several usability issues due to hardware and software limitations.
Since then, many experiments have been conducted to investigate the effectiveness of AR assistance for manual assembly tasks. Baird and Barfield (1999) conducted an experiment in which operators had to assemble computer motherboards using four types of instruction media (paper, model on display, video see-through and optical see-through HMD). Results indicated that AR-based assembly guidance was more effective than other forms of instruction: operators achieved the assembly in a shorter amount of time while making fewer errors. Tang et al. (2003) compared the effectiveness of AR instructions for assembling Duplo blocks against three other types of instructional media [a paper-based instruction set, computer-assisted instruction (CAI) using a monitor-based display, and CAI utilizing HMD]. Results showed that overlaying 3D AR instructions on the actual pieces reduced the error rate for an assembly task by 82% compared to more conventional instruction sets. In the same year, several AR-based assembly guidance systems were developed (Reinhart and Patron, 2003; Zauner et al., 2003; Yuan et al., 2004). Nakanishi et al. (2007) evaluated the use of an AR manual in a wiring task. They found that the wiring time was shortened by about 15% and, at the same time, the error in wiring positions was reduced to almost zero. For a detailed survey of AR-based assembly applications between 1990 and 2015, see Wang et al. (2016).
The majority of the AR research appears to have originated from academia. Industrial AR applications are far less reported in comparison. Yet, AR-based assembly guidance in industry is a strong and growing area. Several industrial projects demonstrated prototypes that allow computer-guided assembly of complex mechanical elements using augmented reality techniques, showing the benefits of AR technology for assembly tasks (Schwald et al., 2001; Hillers et al., 2004; ARVIKA 1 ; ARTESAS 2 ). In fact, the more complex the product is, the greater the potential benefit from the use of AR technology can be. Consequently, many manufacturing companies are integrating AR technology into their assembly activities. For a detailed review of industrial AR applications in manufacturing, see Nee et al. (2012).
With the advent of technological developments in augmented reality systems (Zhou et al., 2008), mainly in tracking techniques and especially the vision-based tracking techniques (Sivaraman and Trivedi, 2013), and display devices (Ardito et al., 2015) such as projection-based displays and head-mounted displays, smaller, sophisticated, and even wearable AR-based manual assembly systems were designed and several academic studies, as well as industrial projects, have been conducted to evaluate their effectiveness.
Thus, recent attempts to investigate AR visual assembly guidance have been proposed. Based on the work of Tang et al. (2003), Funk et al. (2015) proposed Duplo block assembly tasks as a standardized lab-style experiment design to evaluate AR instructions. They followed this design to compare HMD instructions, tablet instructions, and baseline paper instructions to in-situ AR projected instructions. They found that participants were faster and made fewer errors using projection-based AR instructions compared to HMD instructions (Funk et al., 2016b). Following this trend, Blattgerste et al. (2017) compared in-situ instructions to conventional in-view instructions using a smartphone, Microsoft HoloLens, Epson Moverio BT-200 smart glasses, and paper-based instructions. As in the earlier studies, the in-situ instructions consisted of displaying, at each step and at the correct assembly position, a cuboid whose size and color corresponded to the Lego Duplo brick to be assembled. The results showed that the participants were faster using the paper instructions but made fewer errors with in-situ instructions using the Microsoft HoloLens. Nishihara and Okamoto (2015) and Okamoto and Nishihara (2016) proposed an AR system for guiding the assembly of a Pentomino puzzle. The system consisted of a fixed tablet computer between the participant and the parts, on which visual indications of final positions were displayed. Similarly, puzzles have been largely used in AR for testing assembly implementations (Kitagawa and Yamamoto, 2011; Syberfeldt et al., 2015).
In parallel, Radkowski et al. (2015) analyzed the dependency between two factors that may affect the effectiveness of AR assembly guidance systems, namely, the complexity of the manual assembly task (assembly of an axial piston motor in this case) and the complexity of visual features used to present the assembly steps. The features were adapted to the level of difficulty and varied from textual information on the screen describing the task, 2D sketches, and static 3D virtual models, to 3D arrows used to indicate the assembly location or the assembly path, as well as 3D animations to show the assembly method. They found that the visual features must correspond to the relative difficulty level and that the difficulty of the task does not affect the user's assembly performance (i.e., the assembly time). Their results also showed that the visual features for AR assistance increase the user's confidence, even though they did not find statistically significant results regarding assembly time. Syberfeldt et al. (2015) followed the same idea except that they used AR overlaying information on the real objects to identify the correct object to be assembled. Their work was based on results from Pathomaree and Charoenseang (2005) and Seok and Kim (2008), which indicated that simpler visual features can be used when 3D models overwhelm the user. They developed an AR prototype based on the Oculus Rift platform and evaluated it through the assembly of a 3D puzzle, in order to investigate user acceptance. The results showed that the most important keys to improving acceptability were that the complexity of the assembly task must be significant and that the AR system should make the user more efficient. Horejší (2015), on the other hand, proposed to use a monitor placed in front of the user that displayed the final image with virtual 3D models. He focused on displaying the order of the tasks to be performed and measured time improvement in assembly tasks in comparison with the classic method.
More recently, Ojer et al. (2020) presented a new projection-based AR system for assisting operators during electronic component assembly processes. The proposed system consists of four different parts: an illumination system, a 2D high-resolution image acquisition setup, a screen, and a projector located at sufficient height so as not to disturb the operator during manual operation. The main goal of this tool was to generate models able to highlight the missing electronic components on the board. Results of a study they conducted showed that operators find the system more usable, feel more secure with it, and require less time to perform their tasks.
Therefore, AR-based assembly guidance has demonstrated its effectiveness compared with classic assistance methods (digital and paper manuals), in terms of reduced time and error rates and increased user acceptance. By displaying the information directly to the user, it is possible to avoid attention switching and the execution of repetitive movements and, at the same time, to simplify the user's decisions (Tang et al., 2003; Yuan et al., 2008; Henderson and Feiner, 2011; Arbeláez et al., 2019).
While these studies provided strong evidence for the value of AR, they mainly focused on two ways to provide visual aids, namely by:
• Displaying 2D information (such as textual information, numerical values or 2D sketches) that is relevant to what is under observation, e.g., the description of the current operation (Radkowski et al., 2015) or the order of the operations the user needs to follow to perform the task (Horejší, 2015);
• Displaying 3D virtual objects inserted within the real environment in spatially registered positions, which can represent either 3D indications such as arrows to show the correct location or the pose of the real object. Consequently, the user is instructed on how to assemble real components together (Syberfeldt et al., 2015; Funk et al., 2016b; Blattgerste et al., 2017).
These visual features are added to the real components of the assembly task. They represent external information that does not exist outside the framework of the experiment. As a result, they can lead to an overload of the real scene and therefore increase the mental workload (Hou and Wang, 2013;Markov-Vetter and Staadt, 2013). Moreover, although much effort has been expended on this topic, there are still many unsolved issues such as the visual occlusion issue that happens during blind assembly tasks when objects or parts of objects are occluded.
In contrast to these prior works, this paper focuses on integrating extra geometric information into the objects to be assembled. To be useful for blind assembly, the information should represent some important, intrinsic properties of the objects that are not directly visible to the users. The information can be implicit, such as symmetries, axes, etc., or explicit, i.e., portions of objects that are occluded during the assembly. By visualizing hidden information with AR, one could perform blind assembly tasks that would otherwise be difficult or even impossible to accomplish.
Therefore, the aim of this work is to develop an augmented reality (AR) system helping users when performing blind assembly tasks, by providing them with AR visualizations appropriate to this issue. To achieve this purpose, two different modes of AR visualizations are proposed:
(1) Highlight the hidden part of the objects (i.e., the inner and/or the rear part) as well as the parts occluded by other objects; information is selected solely on the basis of visibility criteria.
(2) Display only the axes of the objects (or similar structural features) so that their relative positions become explicit. This time, information is selected based on its relevance with respect to the insertion task.
Then, an evaluation is conducted to explore the potential benefit of these AR visualization methods to assist users in blind assembly situations compared to a baseline situation where no AR is provided.

EXPERIMENTAL DESIGN
In order to provide AR visualizations as support for blind tasks, an AR-based assembly prototype system was designed consisting of 3D visual overlays displayed on a head-mounted device and a controlled blind insertion task designed as follows.

Blind Insertion Task
It was not possible to rely on previous works based on standardized assembly set-ups, which are mainly designed for "pick-and-place" tasks and do not address the occlusion issue (Tang et al., 2003; Funk et al., 2015; Blattgerste et al., 2017). Instead, a blind assembly process was designed based on the "peg-in-hole" manipulation, where an object must be inserted in another without direct visibility of the insertion area (Chhatpar and Branicky, 2001; Park et al., 2013; Abdullah et al., 2015; Zhang et al., 2017). Insertions are an important aspect of assembly: tight tolerances between objects involved in the insertion, as well as positioning accuracies, require some level of compliance and trajectory control (Lim et al., 2007). Insertion tasks are also found in a wide variety of maintenance and automotive applications, making them suitable standardized tasks that should be studied. Therefore, three objects to be assembled were designed and manufactured 3 :
- a box with three non-aligned slots on the top side and one slot on the side,
- a board with three slots in its middle area,
- a second board with no slots.
The objects were made of medium-density fiberboard, a material light enough for easy handling, yet strong enough to guarantee some durability throughout the experiment. In addition, visual targets were engraved on the objects for tracking purposes (see section Tracking Set-Up), once again ensuring pattern durability over time. Informal interviews after the experiment did not reveal any visual confusion due to the targets printed on the objects.
Using these three objects, a two-operation insertion routine was carried out in the following order:
Operation 1: Insert the first board (the one with slots) through the box laterally from left to right.
Operation 2: Insert the second board into one of the three vertical slots on the top of the box, then through the previously inserted board (choosing the correct slot that allows for a vertical insertion).
Refer to Figure 1 for a graphical description of the assembly task.

Visual Overlays
As mentioned above, previous works have focused on procedural augmentation, such as 2D or 3D instructions. In this study, the focus was instead on the later stage of actual assembly, and more precisely on how geometric overlays can supplement human senses during critical phases, such as insertions. Thus, two 3D visual overlays, associated with the assembly objects, were designed: the "wireframe" overlay and the "axes" overlay (see Figure 2). They are described in the following.

Wireframe Overlay
The wireframe overlay employs wireframe models of the assembly objects to display an X-ray vision of the assembly parts. AR X-ray vision has been used in different fields (Bane and Hollerer, 2004; Avery et al., 2007, 2009). In particular, it was used in medical scenarios to provide a 3D view of the regions to be operated on in real time, so that surgeons can intervene in an easier and more accurate manner (Bajura et al., 1992; Navab et al., 2009; Zang et al., 2009; Tabrizi and Mahvash, 2015). Based on the analogy that exists between the regions to be operated on and the objects to be assembled, the wireframe overlay was proposed to improve the perception of relative placement of the objects in an assembly and provide additional depth cues, by virtually representing visible and invisible contours. In other words, this overlay displays all the outlines and inner parts of objects during the assembly. This allows operators to get an inside view of the occluded parts.

Axes Overlay
As previously mentioned, Valentini (2009), Tching et al. (2010), and Wildenbeest et al. (2012) showed that performance can be improved in an assembly task by defining virtual constraints on the objects using haptic devices. It could, therefore, be interesting to reproduce such constraints using only visual guidance in order to encourage operators to follow a certain path while inserting the objects. Thus, in the axes overlay, the axes of the objects and their insertion features (slots) are displayed to indicate to operators how to align the different objects during the assembly.

Device Set-Up
A commonly available AR viewing device is the see-through head-mounted display (HMD). For such a device to be operated in assembly operations, it must be lightweight and small enough not to obstruct the user's view, and computationally powerful enough to be able to interpret specific user input and the environment (Azuma et al., 2001). The user should also be able to interact with the devices in a most natural way, without awkward postures and gestures (Carmigniani et al., 2011).
For these reasons, the decision was made to use a Microsoft HoloLens running a 32-bit version of the Windows 10 operating system, with an Intel Atom x5-z8100 processor consisting of four 64-bit cores running at 1.04 GHz. In addition, it features an HPU/GPU Holographic Processing Unit, 64 GB Flash, 2 GB RAM, and 2-3 h of active battery life that allows the standalone operation of this device (Furlan, 2016). Moreover, it is a completely self-contained HMD, i.e., it does not require the HMD to be tethered to a separate computing device.

Tracking Set-Up
The one area in which the HoloLens falls short is tracking the location of the parts and assembly station. Such an intricate assembly requires precise location capabilities and a high level of accuracy in tracking and superimposition of augmented information (Nee et al., 2012). The HoloLens does have spatial mapping capabilities; however, the mesh created is not accurate enough for a detailed assembly application. Microsoft currently suggests that if a developer wants to use marker-based tracking, the Vuforia plug-in for Unity3D should be used 4 . All implementation details on how to configure a Vuforia app for HoloLens can be found on their website 5 .
Therefore, the HoloLens built-in tracking system was replaced by a more accurate tracking procedure based on a marker-based approach, implemented using the Vuforia 6 SDK. Consequently, each object to be tracked was covered with visual targets that would be recognized by the Vuforia API on the HoloLens. Special care was taken to preserve high local contrasts and avoid repetitive patterns to obtain satisfactory tracking performances. Refer to Vuforia's website 6 for a better understanding of the tracking requirements. Given these precautions, the HoloLens could properly track object positions as the user moves them around in the assembly area. That is crucial because, without the specific location of each component being tracked, the device cannot achieve true AR capabilities.
This approach provided an easily reproducible and ecologically valid system without any external tracking apparatus, and resulted in a completely portable, lightweight, and easy-to-handle set-up. In particular, the portable AR gear was comfortable to wear while providing satisfactory AR assistance.
Finally, the 3D models of the assembly objects were created in Blender 2.6, then imported into Unity3D 5.5.2 where custom Vuforia targets were generated. The AR rendering overlays were implemented in C# using custom shaders in Unity3D.

Factors
A within-subjects experiment was run with two fixed variables:
[VISUAL] The visual overlay, with three modalities labeled WIR, AXE, and BAS representing, respectively, the wireframe overlay, the axes overlay, and a baseline condition with no AR visualization, which allows a comparison of the AR conditions with the natural operator condition during the assembly.
[SLOT] The numbered slot (located on the top of the box and the first board) in which the participants had to insert the second board. There were three modalities representing the three slots numbered from 1 to 3. This variable was considered as a repeated measure in the evaluation.
The order of both variables was counterbalanced across participants, using a Latin square for [VISUAL] and evenly distributed randomization for [SLOT], in order to reduce order effects and avoid biasing the results.
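The counterbalancing described above can be illustrated with a minimal sketch. It assumes a cyclic 3 × 3 Latin square for [VISUAL] and an independent shuffle of the slot order within each condition; the exact square and the balancing rule used for [SLOT] are not given in the text, so the function names, the seed, and the plain shuffle are illustrative assumptions only.

```python
import random

VISUAL = ["WIR", "AXE", "BAS"]
SLOTS = [1, 2, 3]


def latin_square(conditions):
    """Cyclic Latin square: row i is the condition list rotated left by i."""
    n = len(conditions)
    return [[conditions[(i + j) % n] for j in range(n)] for i in range(n)]


def schedule(participant_index, rng):
    """Return [(visual, slot_order), ...] for one participant.

    Assumption: participants cycle through the rows of the square, and the
    slot order is a plain random permutation (not a balanced one).
    """
    rows = latin_square(VISUAL)
    visual_order = rows[participant_index % len(rows)]
    plan = []
    for visual in visual_order:
        slots = SLOTS[:]
        rng.shuffle(slots)
        plan.append((visual, slots))
    return plan


rng = random.Random(0)  # fixed seed, only so the sketch is reproducible
plans = [schedule(p, rng) for p in range(30)]
```

With 30 participants and three rows, each [VISUAL] condition appears ten times in each serial position, which is the property the Latin square is meant to provide.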

USER EVALUATION

Participants
Thirty participants took part in the experiment (21 males, 9 females) with ages ranging from 19 to 59 years old (mean = 29, SD = 10). They reported an average degree of expertise with HMDs of 1.83 on a 5-point Likert scale (1 meaning no experience and 5 meaning very experienced). The only condition to participate was to have normal or corrected-to-normal vision (the HoloLens can accommodate glasses without difficulty).

Procedure
Upon their arrival, participants read and signed an informed consent form containing written instructions about the experiment. They also filled out a background information document and rated their degree of experience with virtual and augmented reality devices. Then, the participants were seated at a table in front of the objects to be assembled with the HoloLens on their heads (including in the BAS condition). Each object was clearly labeled so that no confusion was possible. Figure 3A illustrates a participant before starting the task. The experiment was divided into three phases:

Training
A training session was held before the evaluation in order to reduce the learning effect. Participants underwent a training session of 2 min per [VISUAL] condition, during which the evaluator described the visual overlays and explained the task to be performed. The evaluator also asked the participants to insert the boards. This phase allowed them to get familiar with the task, the three different conditions, as well as with the set-up.
Precisely, they were encouraged to adjust the HoloLens comfortably on their head (improper fitting of a see-through headset can lead to misalignment of the AR elements with respect to the real world). In addition, the evaluator gave them short verbal instructions:
- They were not allowed to move the box (which was fixed on the table), to prevent them from getting extra visibility cues during tasks;
- They were not allowed to lean forward too much and peek behind the box, in order to limit their perception of the actual depth of the box or of the slots' position. An informal poll at the end of the evaluation revealed this was not an issue for the participants;
- They were not allowed to touch the slots in which they had to insert the boards, so as to avoid any haptic support;
- Every time they finished the task, they had to put the boards back in their initial position on the table, indicated by a label, in an attempt to provide the same starting point for all participants and avoid any experimental bias.

Task
During this phase, the participants had to:
(1) First, perform operation 1: insert the first board (with the slots) through the box from left to right.
(2) Then, perform operation 2: a slot number was given to participants orally by the evaluator and through a text instruction displayed on the device to avoid any confusion (see Figure 3B). They then inserted the second board into the box from the top and through the first inserted board, using the correct numbered slot. They did this three times, once for each slot according to the number given to them. If a participant inserted the second board into the wrong slot, the insertion was counted as an insertion error, and they proceeded to the next one.
Participants repeated these two operations three times, once per [VISUAL] condition. In this way, each participant performed nine blind insertions altogether. Figure 1 illustrates the first-person view through the HoloLens at different stages of the task.

Post-assessment
Once the task was completed, participants were asked to state how difficult it was to perform the insertion task in each condition by filling out a 5-point Likert scale questionnaire. The total duration of the evaluation (training, evaluation, and post-assessment) was about 8-9 min for each condition. A short duration was chosen to avoid the nausea and loss of attention that could result from prolonged wearing of the HMD and could therefore reduce task performance (Livingston, 2005). Consequently, the total duration to complete the evaluation was ∼25 min.

Data Collection
Two participants were removed from the evaluation due to technical problems during the test. In total, 252 trials were registered: 3 [VISUAL] × 3 [SLOT] × 28 participants. For each trial, the task completion time and the number of wrong insertions (i.e., inserting the board into a wrong slot) were logged. In addition, positions of both boards were recorded every 15 frames (4 Hz). Participants' responses to the questionnaire regarding the subjective complexity of the task and their preference regarding each condition were also collected.
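The trial count and the sampling rate quoted above can be checked with a short sketch. The variable names are illustrative, and the frame rate of the application is not stated in the text: the value below is only what "every 15 frames (4 Hz)" implies if both figures are taken at face value.

```python
from itertools import product

VISUAL = ["WIR", "AXE", "BAS"]
SLOTS = [1, 2, 3]
N_PARTICIPANTS = 28  # 30 recruited, 2 removed for technical problems

# One trial per (participant, visual condition, slot) combination.
trials = list(product(range(N_PARTICIPANTS), VISUAL, SLOTS))
n_trials = len(trials)
insertions_per_participant = len(VISUAL) * len(SLOTS)

# Board positions were logged every 15 frames, reported as 4 Hz.
SAMPLE_EVERY_N_FRAMES = 15
SAMPLE_RATE_HZ = 4
# Inferred, not reported: frames per second implied by the two figures.
implied_frame_rate = SAMPLE_EVERY_N_FRAMES * SAMPLE_RATE_HZ

print(n_trials, insertions_per_participant, implied_frame_rate)
```

The enumeration reproduces the 252 registered trials and the nine blind insertions per participant mentioned in the task description.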
From this data, three objective measures were extracted:
(1) TCT: the task completion time of successful insertions only (i.e., when participants inserted the second board in the correct slot).
(2) PWI: the percentage of wrong insertions relative to the total number of insertions.
(3) AAO: the average amplitude of oscillation (shaking) of the second board, extracted from X and Y coordinates. It was calculated, for each slot, as the minimum Euclidean distance between the trajectory of the board and the optimal insertion trajectory (i.e., no shaking at all). This provides a measure of how close the trajectory was to the optimal one. Moreover, this measure was calculated during a time interval illustrated in Figure 4. Precisely, the interval ran from the moment participants inserted the second board into one slot of the box (P0 in Figure 4) to the moment they inserted it into the corresponding slot of the first board while inside the box (P1 in Figure 4). Finally, a horizontal threshold of 2 cm was defined empirically to remove possible extreme points due to a loss of tracking.
In addition, responses from participants resulted in one subjective measure: (4) DIF: scores for the difficulty perceived by participants during the assembly.
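As an illustration of how the objective measures above could be derived from logged trials, the following is a minimal Python sketch, not the authors' implementation. It rests on several assumptions that are not stated in the text: the optimal insertion trajectory is approximated by the straight segment between P0 and P1 in the X-Y plane, the 2 cm threshold is read as discarding samples whose deviation exceeds 20 mm, and AAO is taken as the mean of the remaining deviations. All function and parameter names are hypothetical.

```python
import math
from statistics import mean


def point_to_segment_distance(p, a, b):
    """Shortest Euclidean distance from 2D point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        # Degenerate segment: a and b coincide.
        return math.hypot(px - ax, py - ay)
    # Projection of p onto the line, clamped to the segment ends.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def average_amplitude_oscillation(samples, p0, p1, threshold_mm=20.0):
    """Assumed AAO: mean deviation (mm) of the sampled (x, y) positions
    from the straight P0-P1 path, ignoring samples beyond the threshold.
    Returns None when no usable sample remains in the interval."""
    deviations = [point_to_segment_distance(s, p0, p1) for s in samples]
    kept = [d for d in deviations if d <= threshold_mm]
    return mean(kept) if kept else None


def percentage_wrong_insertions(n_wrong, n_total):
    """PWI: wrong insertions as a percentage of all insertions."""
    return 100.0 * n_wrong / n_total


def mean_completion_time(trials):
    """TCT: mean time over successful trials only.
    `trials` is a list of (time_in_seconds, was_successful) pairs."""
    return mean(t for t, ok in trials if ok)
```

Returning `None` for an empty interval mirrors the situation in which a participant's data cannot be used because too few points were recorded between P0 and P1.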

Hypotheses
The main goal of this study was to investigate the effectiveness of the AR overlays proposed to highlight the occluded parts during blind insertions. Therefore, it was expected that [VISUAL] conditions would significantly affect the reported measures. Precisely, it was anticipated that the AR visual overlays would help participants to perform the blind task more efficiently compared to the no AR condition. In addition, it was expected that the wireframe overlay would outperform the axes overlay because it provides more complete information on the objects. Thus, it was hypothesized that:

H1(a): TCT will be the highest in the BAS condition.
H1(b): TCT will be lower in the WIR condition compared to the AXE condition.
H2(a): PWI will be the highest in the BAS condition.
H2(b): PWI will be lower in the WIR condition compared to the AXE condition.
H3(a): AAO will be the highest in the BAS condition.
H3(b): AAO will be lower in the WIR condition compared to the AXE condition.
H4(a): DIF will be the highest in the BAS condition.
H4(b): DIF will be lower in the WIR condition compared to the AXE condition.

RESULTS
In the following, the means and standard deviations are abbreviated by M and α, respectively. The normality of the data was analyzed using visual inspections of the normal Q-Q plots in combination with Shapiro-Wilk tests. When data were non-normally distributed, a log10-transformation was applied to satisfy the assumption of parametric tests. If the data was still not normally distributed (i.e., the log10-transformation did not succeed), non-parametric equivalent tests were substituted. The result of the statistical parametric and non-parametric tests for each measure is reported. For statistically significant effects (p < 0.05), Cohen's d effect size estimate r was computed with threshold values 0.1 (small), 0.3 (medium), and 0.5 (large). All the analyses were performed using R version 3.6.0.
FIGURE 4 | How the time interval was defined in order to calculate AAO measure: P0 represents the point at which participants insert the second board (board 2) into one slot of the box, P1 represents the point where they insert it into the corresponding slot of the first board while inside the box, and P2 represents the point where they complete the task.
The remainder of this section is divided into three parts. The effect of [VISUAL] conditions on the objective measures of performance and the subjective questionnaire are described, respectively, in the first and second parts. The third part investigates the potential order effect.
Figure 5 shows the mean plots for TCT, PWI, and AAO measures. Regarding TCT measure, the mean value for each condition was M BAS = 29.3 s (α BAS = 10 s), M WIR = 28 s (α WIR = 9.1 s), M AXE = 32.7 s (α AXE = 14.2 s). The log10-transformed data was normally distributed (W = 0.98; p = 0.65). Therefore, a one-way repeated measures ANOVA analysis was run that showed no statistically significant difference between [VISUAL] conditions [F (2, 54) = 1.34, p = 0.27], which contradicted H1(a) and H1(b) hypotheses.
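To make the parametric branch of this procedure concrete, the following Python sketch computes the F statistic and degrees of freedom of a one-way repeated-measures ANOVA, with an optional log10 transform applied first. The authors report using R, so this is an illustrative re-expression rather than their code. The Shapiro-Wilk check and the p-value are deliberately left out because they need a statistics library; `rm_anova_oneway` and its arguments are hypothetical names.

```python
import math


def rm_anova_oneway(data, log_transform=False):
    """One-way repeated-measures ANOVA.

    `data` holds one row per participant and one column per condition.
    Returns (F, df_conditions, df_error)."""
    if log_transform:
        # log10 transform, as used when raw data are not normal.
        data = [[math.log10(x) for x in row] for row in data]
    n = len(data)        # participants
    k = len(data[0])     # conditions
    grand = sum(sum(row) for row in data) / (n * k)
    cond_means = [sum(row[j] for row in data) / n for j in range(k)]
    subj_means = [sum(row) / k for row in data]
    # Partition the total sum of squares.
    ss_cond = n * sum((m - grand) ** 2 for m in cond_means)
    ss_subj = k * sum((m - grand) ** 2 for m in subj_means)
    ss_total = sum((x - grand) ** 2 for row in data for x in row)
    ss_err = ss_total - ss_cond - ss_subj
    df_cond = k - 1
    df_err = (k - 1) * (n - 1)
    f_stat = (ss_cond / df_cond) / (ss_err / df_err)
    return f_stat, df_cond, df_err
```

With 28 participants and three [VISUAL] conditions, the function returns degrees of freedom (2, 54), which agrees with the F (2, 54) reported for TCT.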
Finally, concerning AAO measure, the process used to calculate it resulted in removing the data from eight participants due to the lack of points recorded in the specified time interval (detailed in section Data Collection) during their evaluation. Therefore, the analysis below concerns only 20 of the initial 30 participants. The mean value for each condition was M BAS = 18.8 mm (α BAS = 7.6 mm), M WIR = 21.2 mm (α WIR = 7.4 mm), M AXE = 14.4 mm (α AXE = 5.7 mm). The data was normally distributed (W = 0.96; p = 0.07). Therefore, a one-way repeated measures ANOVA analysis was run that showed a statistically significant difference between the conditions [F (2, 38) = 4.43, p < 0.05]. Then, paired t-tests with Bonferroni correction were run, showing a significant difference between WIR and AXE conditions [t (19) = 3.12, p < 0.01, r = 1.02] with AXE outperforming WIR, which was not expected. In contrast, no statistically significant results were found between BAS and WIR conditions [t (19) = −1, p = 0.33], and between BAS and AXE conditions [t (19) = 1.9, p = 0.07]. Therefore, H3(a) and H3(b) were rejected.
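The post hoc step reported for AAO can be sketched as follows in Python. This is a hedged illustration under assumed names (`paired_t`, `bonferroni`), not the authors' R code. It computes the paired t statistic and its degrees of freedom, and shows the standard Bonferroni adjustment applied to a list of p-values; obtaining the p-values themselves requires the t distribution from a statistics library and is not attempted here.

```python
import math
from statistics import mean, stdev


def paired_t(a, b):
    """Paired t statistic and degrees of freedom for two conditions
    measured on the same participants (a[i] and b[i] belong together)."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    # Mean difference divided by its standard error.
    t_stat = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t_stat, n - 1


def bonferroni(p_values):
    """Bonferroni adjustment: multiply each p-value by the number of
    comparisons and cap the result at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]
```

With the 20 participants retained for AAO, `paired_t` yields 19 degrees of freedom, consistent with the t (19) values above. For the three pairwise comparisons among BAS, WIR, and AXE, the adjustment is equivalent to comparing each raw p-value against 0.05 / 3 ≈ 0.0167; the text does not state whether the reported p-values are raw or adjusted.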

Effect on the Subjective Questionnaire
The average value of DIF (Figure 6) was found to be higher in BAS condition M BAS = 3.25 pts (α BAS = 0.85 pts) compared to both AR conditions (M WIR = 2 pts, α WIR = 0.89 pts; M AXE = 2.37 pts, α AXE = 0.82 pts). The data was not normally distributed (W = 0.94; p < 0.001). A Friedman test was then carried out to compare the mean values for each [VISUAL] condition that showed a significant difference [χ² (2) = 19.63, p < 0.001]. Then, Wilcoxon signed-rank dependent tests with continuity correction were conducted. Results showed statistically significant differences between BAS and WIR conditions (V = 357.5, p < 0.001, r = 1.39), and between BAS and AXE conditions (V = 274.5, p < 0.01, r = 1.04). In both cases, AR conditions outperformed the baseline condition, which supported H4(a). However, the results between WIR and AXE conditions showed no significant differences (V = 88, p = 0.07), which went against H4(b).
FIGURE 6 | Effect of the [VISUAL] conditions on the difficulty perceived by the participants. The diamond symbol, the line across the box, and the dots represent, respectively, the mean score, the median, and the outliers.
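For readers unfamiliar with the non-parametric tests used for the DIF scores, the Python sketch below computes the two test statistics involved: the tie-corrected Friedman chi-square and the Wilcoxon signed-rank statistic V (the sum of ranks of positive differences, which is the quantity R labels V). It is an illustration under assumed function names, not the authors' code, and it omits the p-values and the continuity correction, which require distribution functions from a statistics library.

```python
def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for idx in order[i:j + 1]:
            ranks[idx] = shared
        i = j + 1
    return ranks


def friedman_chi2(data):
    """Tie-corrected Friedman statistic and its degrees of freedom.
    `data` holds one row per participant, one column per condition.
    Rows in which every score is tied for all participants would make
    the correction zero; that edge case is not handled here."""
    n, k = len(data), len(data[0])
    rank_sums = [0.0] * k
    tie_term = 0
    for row in data:
        for j, r in enumerate(average_ranks(row)):
            rank_sums[j] += r
        for v in set(row):
            t = row.count(v)
            tie_term += t ** 3 - t
    chi2 = (12.0 / (n * k * (k + 1))) * sum(r * r for r in rank_sums)
    chi2 -= 3.0 * n * (k + 1)
    correction = 1.0 - tie_term / (n * k * (k * k - 1))
    return chi2 / correction, k - 1


def wilcoxon_v(a, b):
    """Sum of the ranks of the positive paired differences a[i] - b[i].
    Zero differences are dropped before ranking."""
    diffs = [x - y for x, y in zip(a, b) if x != y]
    ranks = average_ranks([abs(d) for d in diffs])
    return sum(r for d, r in zip(diffs, ranks) if d > 0)
```

With three conditions, `friedman_chi2` returns 2 degrees of freedom, as in the χ² (2) reported above. Because Likert scores produce many ties, the average-rank handling explains why V can take half-integer values such as 357.5.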

Learning Effect
In order to investigate the learning effect, one-way repeated measures ANOVA analyses and Friedman's tests were computed between the different orders followed by participants during the experiment. The results indicated no statistically significant differences on the four measures of the evaluation, namely, TCT

DISCUSSION
An important issue in the evaluation of the three visual conditions was that participants could learn how to perform the insertions more efficiently as they repeated the same task for each condition. This learning effect could bias the results. Yet, no statistically significant differences were found between the different orders of conditions. Therefore, one can conclude that the preventive measures were sufficient to mitigate the learning effect.
The comparative analysis performed on task completion time and percentage of wrong insertions indicated that augmenting the user's vision, with both wireframe and axes overlays, did not lead to a statistically significant improvement in objective performance in comparison with the no-AR baseline situation. The most likely reason could be the tracking issues met during the experiment. Even with the use of the Vuforia plug-in and visual targets meeting high requirements, the system experienced some loss of tracking of the assembly objects (mostly the boards) when they were covered too much by participants' hands. Another potential reason could be the low resolution of the cameras of the HoloLens that negatively affected the quality of the Vuforia tracking system (Evans et al., 2017). Livingston (2005) highlighted the lack of robustness of current tracking algorithms. This is a common software limitation of AR display devices that has to be resolved in the future in order to provide robust AR assembly assistance. Finally, yet importantly, participants could also experience an incorrect perception of depth (Swan et al., 2015). This could result in a misinterpretation of the overall assembly information augmented on the real objects, and therefore, reduce the performance of users. Nevertheless, given the lack of familiarity of the average user with this type of device (the perceived average skill with HMDs prior to the experiment was reported as only 1.83 on a 5-point Likert scale), one can be comforted by the fact that performance was, at least, not degraded by the current limitations of HMDs [limited field of vision, imprecise tracking, etc. (Livingston, 2005)].
User performance also depends on hardware features (Nee et al., 2012). Therefore, it is necessary to ensure that users feel comfortable using AR devices. In this experiment, participants reported a good acceptance of the system despite its shortcomings. They unanimously perceived both wireframe and axes modes as easier than the default mode. Moreover, the questionnaire indicates a strong preference of participants for both wireframe and axes modes: participants were asked to rank the assembling modes in order of preference from 1 to 3 (rank 1 being their favorite). Results revealed that 42% of the participants preferred the wireframe mode, 36% preferred the axes mode and 22% preferred the default mode. This validated the hypothesis regarding the subjective usefulness of both AR visualizations for blind insertion tasks.
The most interesting result concerned trajectory. Indeed, the axes overlay resulted in a smaller degree of oscillation compared to the wireframe overlay. In other words, using the axes overlay, participants performed smoother trajectories, suggesting that more abstract visualization aids can simplify the perception of the assembly scene and reduce the information to be processed by users, leading to better performance. This parameter could prove useful to build future evaluation systems and possibly apply our findings to real-world assembly tasks.
Informal post-interviews also confirmed the potential value of the axes overlay. It was reported that in some cases, overlaying exhaustive geometric information (wireframe condition) might become counterproductive and actually obfuscate important visual assembly cues. Furthermore, some participants reported the perception of an offset in the wireframe condition, which could be due to the absence of eye tracking calibration during the experiment (since inter-eye distances vary among participants, this may have contributed to display errors). This offset in itself was very small and did not affect the participants' understanding of augmented information. However, although participants noticed it only in the wireframe condition, it also existed in the axes condition. With the wireframe overlay, the virtual content represented 3D objects superimposed on the real objects, which made the offset easier to notice, whereas with the axes overlay, the guides were abstract objects with no real references, making the offset more difficult to notice. Thus, simplified, more abstract features with high information value (holes, axes, slots, etc.) were preferred. An obvious design recommendation might be, therefore, to modify the wireframe overlay to display only the truly useful parts of the assembly instead of all the outlines and inner parts of objects, which can at times obstruct the real-world view. More specifically, it could be interesting to design a "dynamic" wireframe overlay relying on a context-aware approach to display only the relevant information at each stage of the blind assembly process (Zhu et al., 2013; Khuong et al., 2014).
To summarize, both AR visualizations were preferred and perceived as more useful compared to the no AR baseline situation. Conversely, objective indicators suggest no significant gain in performance. This contrast between objective and subjective results can be due to the relative simplicity of the prototypical peg-in-hole task design. To some extent, choosing a task suitable for an AR-based assistance system is still an open research issue since it relies on a good user interface, but before such an interface is developed, one is not sure if the task is suitable (Livingston, 2005). Nevertheless, an outcome of the experiment is the necessity to design tasks that are more difficult and build objects that are more complex. This would allow studying the effect of increasing complexity on both user performance and satisfaction, with or without visual AR (Radkowski et al., 2015).

CONCLUSION AND PERSPECTIVES
Although many AR-based assembly support systems have been proposed in academic research and industry, the occlusion issue that occurs during the process of blind assembly tasks remained an unexplored topic.
In this paper, an AR prototype set-up was designed specifically for blind peg-in-hole insertion tasks. It consisted of assembly objects overlaid with assistance information presented with the AR personal see-through device HoloLens coupled with a Vuforia plug-in for tracking purposes. Precisely, two AR visualization modes that directly overlaid on the physical objects were proposed: one, referred to as the wireframe overlay, that displays all the outlines and inner parts of the objects, thus providing an inside view of the occluded parts, and another, referred to as the axes overlay, in which only the axes of the objects and their slots were rendered, indicating how to align the different parts during the assembly.
Special care was given not to distract or obstruct the user by designing a self-contained, standalone, and lightweight setup. Particular attention was also paid to user interaction by providing natural interaction while manipulating the assembly objects (Carmigniani et al., 2011).
A user experiment was then conducted to comparatively evaluate both AR overlays with a no-AR baseline condition. The evaluation included objective performance measures represented by task completion time, percentage of wrong insertions, and the extent to which the trajectory of the objects oscillated, as well as a subjective questionnaire reporting the degree of difficulty of the task and the perceived user preference.
Results indicated that participants perceived AR overlays as making them more effective at performing their tasks. However, objective measures did not validate these results and showed no significant difference between AR aids and the baseline situation. This could be mainly due to the loss of tracking of the assembly objects when they were covered too much by participants' hands. Another potential cause highlighted by the experiment could be the low resolution of the cameras of the HoloLens, confirming some studies reporting that the low tracking accuracy of the HoloLens prevents the delivery of robust AR assembly experiences (Evans et al., 2017; Palmarini et al., 2018).
With the improvement of tracking algorithms and more accurate response time, future versions of the HoloLens and other AR see-through headsets should more effectively assist assembly operations. Nevertheless, in the meantime, it would be interesting to add an external camera with higher resolution to improve tracking. An additional camera would also allow the implementation of another AR visualization: "the third-person view" which would consist in an indirect view of the assembly objects (similar to a side-view mirror). The next study will consist in designing such a set-up and comparing this new visualization with the current AR overlays. In addition, care must be taken to calibrate eye tracking, which is necessary to provide more accurate depth presentation and avoid biasing the results.
Moreover, since the assembly environment is assumed to be known, it could also be interesting to improve the AR visualizations presented in this paper in order to provide a more effective way of notifying the user when the appropriate insertion depth has been reached.
Apart from improved hardware and software development, another future research direction would be to study the effectiveness of this AR-based assembly system with respect to a particular blind assembly task designed with certain degrees of complexity. The evaluation would include performance and cognitive measures such as the mental and physical workload, and would monitor user satisfaction and acceptance of such a system. Thus, the presented study provided a first insight into the design of AR visualizations for blind assembly support systems. It also highlighted that, despite being a promising device, the HoloLens is not yet ready for deployment in a factory assembly environment.

DATA AVAILABILITY STATEMENT
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

ETHICS STATEMENT
Ethical approval was not provided for this study on human participants because no local ethics committee existed at the time of the experiment. However, the Helsinki protocol was followed. The participants provided their written informed consent to participate in this study.