Applications

Editor: Mike Potel

Augmented Reality for Aircraft Maintenance Training and Operations Support

Francesca De Crescenzio, Massimiliano Fantini, Franco Persiani, Luigi Di Stefano, Pietro Azzari, and Samuele Salti
University of Bologna

Recent statistics on causes of aviation accidents and incidents demonstrate that to increase air-transportation safety, we must reduce human errors' impact on operations.1 So, the industry should first address human factors related to people in stressful roles to significantly minimize such errors. In particular, aviation maintenance employees work under high-pressure conditions; that is, they're under strict time constraints and must adhere to stringent guidelines. Because of such constraints, they might be prone to making errors. Unfortunately, many of these errors might not become apparent until an accident occurs.

Although maintenance errors are a recognized threat to aviation safety, there are few simulation and computer-based tools for managing human-factor issues in this field. The main advantages in using computer-based systems to train or support technicians are that computers don't forget and that they can help humans clearly understand facts. Such features can help reduce errors due to procedure violations, misinterpretation of facts, or insufficient training.

Toward that end, augmented reality (AR) is a promising technology for building advanced interfaces that use interactive and wearable visualization systems to implement new methods of displaying documentation as digital data and graphical databases.2 Nevertheless, many factors, such as cumbersome hardware, the need to put markers on the aircraft, and the need to quickly create digital content, seem to hinder its effective implementation in industry.3 To resolve such limitations, we've developed a prototype system that incorporates four main requirements for efficiency: it's user-centered, it implements markerless camera pose estimation, it provides an efficient authoring procedure, and it keeps interaction simple and easy.


Task Analysis

Airplane maintenance employees face several issues every day. For example, we investigated daily inspections of the Cessna C.172P, an airplane that flight schools often use. We focused on the maintenance check performed before the first flight of the day, using the aircraft maintenance manual and the flight manual as guides. With maintenance experts' help, we performed a hierarchical task analysis of the complete procedure to identify the subtasks and steps (see Figure 1a). According to this analysis, the daily inspection comprises the exterior check and the interior check. Each check is organized into a number of tasks that correspond to different aircraft subsystems to control. The subtasks are the specific procedures that concern the apparatus to be checked and are divided into a number of steps. The exterior check list comprises seven tasks to be performed in a specific sequence (see Figure 1b).

When checking the nose, right side, the maintenance operator must check the engine oil by extracting the dipstick, which is accessible through the access door in the engine cowling. We therefore selected the oil-check subtask as a case study to highlight maintenance-related human errors in this procedure. We could augment the individual steps by applying virtual models and animations that implement different types of digital data (for example, digital replicas of parts and subparts) or graphical symbols (for example, arrows and pointers) to attract the operator's attention or correctly guide technicians through a task. We then prepared a storyboard for each substep; the maintenance experts provided practical examples of the observed error risks and the type of 3D information that could guide technicians (see Table 1).
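To make this decomposition concrete, the following minimal sketch shows one way such a hierarchy could be represented in software. The class names and the exact step list are illustrative; they are not our prototype's actual data model.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Step:
    name: str
    augmented: bool = False      # True for steps that carry an AR overlay

@dataclass
class Subtask:
    name: str
    steps: List[Step] = field(default_factory=list)

@dataclass
class Task:
    name: str
    subtasks: List[Subtask] = field(default_factory=list)

# Engine-oil-level check, roughly as decomposed in Figure 1b (illustrative).
oil_check = Subtask("Engine oil level check", steps=[
    Step("Open access door"),
    Step("Read instructions", augmented=True),
    Step("Unscrew and extract dipstick", augmented=True),
    Step("Check oil level", augmented=True),
    Step("Screw dipstick back in"),
    Step("Close access door"),
])
nose_right_side = Task("Nose, right side", subtasks=[oil_check])
```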


[Figure 1 diagram: the daily inspection decomposes into exterior and interior checks, tasks (cockpit interior; nose, left and right side; fuselage, left and right side; empennage; left and right wing), subtasks, and steps; the engine-oil-level-check flow runs through open door, read instructions (AR step), unscrew and extract (AR step), check level (AR step), screw, and close door.]

Figure 1. Maintenance inspections performed on the Cessna C.172P aircraft before the first flight of the day. (a) A hierarchical task analysis of daily inspections. (b) The augmented reality (AR) procedure for the engine-oil-level-check subtask. The diagrams show the level of detail we achieved by decomposing the operator's work into single steps. We deploy such steps to complete the engine oil check while highlighting the maintenance inspection's operational flow.

Table 1. The augmented reality (AR) virtual layer designed for each step.

Step | Risks | AR information
Open the access door | – | An animation of opening the door
Read the instructions inside | Skipping this step and using the wrong oil specification | Framing the info tag that reports the manufacturer's specifications
Unscrew the oil cap, counterclockwise | Forcing the knob clockwise to unscrew the stick | A circular arrow clearly showing the correct sense of rotation
Extract the dipstick | Touching hot oil | An animation of the 3D stick rotating and moving upward, initially superimposed on the real stick
Check the oil level | Mistakes in unit transformation; mistakes in level check | Depicting correct and incorrect oil levels on the 3D stick to attract the user's attention; a warning message
Screw the oil cap back on | – | An animation of the stick rotating and moving downward
Close the door | – | An animation of closing the door
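A table such as this can also be read as a specification for the virtual layer. The sketch below encodes it as a simple lookup from steps to overlay content; the keys, asset names, and fields are hypothetical, not our database schema.

```python
# Hypothetical encoding of Table 1; asset names are placeholders.
AR_LAYER = {
    "open_access_door":  {"overlay": "door_open_animation"},
    "read_instructions": {"overlay": "info_tag_frame",
                          "risk": "skipping the step / wrong oil specification"},
    "unscrew_oil_cap":   {"overlay": "counterclockwise_arrow",
                          "risk": "forcing the knob clockwise"},
    "extract_dipstick":  {"overlay": "dipstick_moving_up_animation",
                          "risk": "touching hot oil"},
    "check_oil_level":   {"overlay": "dipstick_level_marks", "warning": True,
                          "risk": "unit transformation / level-check mistakes"},
    "screw_oil_cap":     {"overlay": "dipstick_moving_down_animation"},
    "close_access_door": {"overlay": "door_close_animation"},
}

def overlay_for(step_key):
    """Return the overlay content (and optional warning) to render for a step."""
    return AR_LAYER.get(step_key, {})
```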

Markerless Camera Pose Estimation

AR techniques aim to convey information to the user that's spatially coherent with the observed scene. They display such information by augmenting the scene captured through a camera with graphical objects properly aligned with the real-world 3D structures. An important AR technology originating from computer vision is real-time estimation of the camera's 3D position and orientation (jointly called the pose) with respect to the real-world objects to be augmented.

So far, most successful AR systems superimpose specifically designed markers on the real-world objects. This allows for easy tracking of markers in the camera's images and provides the necessary information to infer the camera pose in real time. Hirokazu Kato originally developed the ARToolkit (www.hitl.washington.edu/artoolkit), a widely known marker-based approach.

Unfortunately, as we mentioned earlier, the aerospace industry perceives the need to place markers on aircraft as a limitation of AR. So, we concentrated on computer vision research dealing with markerless camera pose estimation. In this field, a key challenge is effective real-time natural-feature tracking, which involves visual patterns that aren't fixed on physical objects but that naturally exist in the scene. Tracking these patterns throughout the incoming video stream provides the required input data to the pose-estimation algorithm.

Online and Offline Processes

Over the past few years, local-invariant-feature research has gained momentum in the computer vision community. You can effectively adopt such features for tracking natural patterns in markerless AR systems. Local invariant features consist of visual patterns (for example, patches, circular blobs, and arbitrarily shaped regions) that

■ can be detected and matched in natural images and
■ exhibit invariance, or robustness, to scale and viewpoint changes and to brightness variations.

According to this research trend, we developed an AR interface based on an offline and an online process (see Figure 2). In the offline process, we first acquire a reference image of the object we want to augment and extract local invariant features. We then store the extracted features in the system for later use. In the online process, the system continuously processes the camera's video stream, processing each incoming frame in two stages. First, it extracts the local invariant features from the frame and matches them against those of the reference image. If the system finds a sufficient number of matches, it concludes that the camera can see the object of interest and feeds the corresponding features' pixel coordinates to the second stage. This stage consists of an algorithm that, based on the assumption that the camera is viewing a planar object, computes the camera pose from the corresponding features' pixel coordinates.4

Figure 2. Feature tracking and camera pose estimation. (a) A reference image of a Cessna C.172P cockpit, with superimposed boxes highlighting possible local features. (b) Extracted features from the current frame. The arrows indicate matches between features found in the reference image and the current frame. (c) The reference image with a 3D coordinate system linked to the object of interest as perceived by the AR system. (d) The current view with a 3D coordinate system. The rigid motion between the two 3D coordinate systems corresponds to the estimated camera pose.

SIFT and SURF

Feature extraction and matching (also called feature tracking) is crucial for a markerless AR interface because tracked features represent the input data upon which all successive computations must build. To develop our prototype, we considered two algorithms for dealing with local invariant features:

■ SIFT (scale-invariant feature transform)5 and
■ SURF (Speeded-Up Robust Features).6

David Lowe developed SIFT, which is arguably the most widely known method to detect and match natural features between images. Its introduction was a groundbreaking advance in computer vision, and it has proved effective in challenging tasks such as image retrieval, object recognition, and automatic image stitching. SIFT offers state-of-the-art performance at a relatively high computational cost. Assuming that a moving camera is viewing a planar object, AR systems can reliably track SIFT features throughout the video stream owing to their

■ invariance to image translations, rotations, and scale changes and
■ robustness to brightness variations and moderate changes of viewing direction (that is, roughly within 40 degrees).

On the other hand, SURF is newer and aims to extract and match local features while emphasizing computational efficiency. Although other local-feature algorithms specifically conceived for speed exist, SURF seems to offer the most convincing trade-off between quality and computational efficiency.

We've observed that, without being tweaked or using optimized implementations, SIFT can require up to 1 second to process a standard 640 × 480-pixel image. We found a better trade-off between computation time and feature quantity and quality when we avoided the initial image-doubling step proposed in the original algorithm. Avoiding this step can improve our implementation's frame rate by up to 3 fps. Unfortunately, this is still too slow for humans to perceive a sequence of images as a stream. We obtained the best compromise between quantity and quality of matches and fluidity of the video stream by avoiding the initial image doubling and processing subsampled (320 × 240) images. In such a configuration, we still estimated the pose with sufficient accuracy, and the system could process a video at 5 fps.

SURF also includes the initial image-doubling step, which we had to remove to obtain near real-time performance. Using SURF features without doubling the input image, our system processed a full-resolution video at 5.7 fps and a subsampled video (320 × 240) at approximately 10 fps. We chose the latter as our feature-tracking module's default configuration because it didn't imply too severe a loss of accuracy during pose estimation and it allowed for convincing fluid rendering of augmented graphical content.
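To make the online stage concrete, the following sketch reproduces its two steps with OpenCV. It only approximates our pipeline: our prototype used its own SIFT and SURF tracking code and the planar pose algorithm of reference 4, whereas here OpenCV's SIFT implementation and its planar (IPPE) PnP solver stand in, and the camera matrix, metric scale, and file name are assumptions.

```python
# Sketch of the online process: match natural features against the stored
# reference image, then estimate the camera pose assuming a planar object.
# Stand-ins: OpenCV's SIFT and its IPPE planar PnP solver (OpenCV >= 4.4).
import cv2
import numpy as np

K = np.array([[300.0, 0.0, 160.0],      # assumed intrinsics for 320x240 frames
              [0.0, 300.0, 120.0],
              [0.0, 0.0, 1.0]])
METERS_PER_PIXEL = 0.0005               # assumed scale of the reference image
MIN_MATCHES = 15                        # fewer matches -> object not visible

sift = cv2.SIFT_create()
matcher = cv2.BFMatcher(cv2.NORM_L2)

# Offline stage: extract and store the reference image's features once.
ref = cv2.imread("reference_oil_door.png", cv2.IMREAD_GRAYSCALE)
ref_kp, ref_des = sift.detectAndCompute(ref, None)

def estimate_pose(frame_bgr):
    """Return (rvec, tvec) of the camera w.r.t. the planar reference, or None."""
    # Subsample to 320x240, the configuration we found fast enough in practice.
    gray = cv2.resize(cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY), (320, 240))
    kp, des = sift.detectAndCompute(gray, None)
    if des is None:
        return None
    # Keep only distinctive matches (Lowe's ratio test).
    good = []
    for pair in matcher.knnMatch(des, ref_des, k=2):
        if len(pair) == 2 and pair[0].distance < 0.7 * pair[1].distance:
            good.append(pair[0])
    if len(good) < MIN_MATCHES:
        return None
    img_pts = np.float32([kp[m.queryIdx].pt for m in good])
    ref_pts = np.float32([ref_kp[m.trainIdx].pt for m in good])
    # Reject outliers with a RANSAC homography before solving for the pose.
    H, mask = cv2.findHomography(ref_pts, img_pts, cv2.RANSAC, 3.0)
    if H is None or mask.sum() < 4:
        return None
    inl = mask.ravel().astype(bool)
    # The reference is planar: lift its pixels to metric 3D points with z = 0.
    obj_pts = np.hstack([ref_pts[inl] * METERS_PER_PIXEL,
                         np.zeros((int(inl.sum()), 1), np.float32)])
    ok, rvec, tvec = cv2.solvePnP(obj_pts, img_pts[inl], K, None,
                                  flags=cv2.SOLVEPNP_IPPE)
    return (rvec, tvec) if ok else None
```

The returned rotation and translation are exactly what the rendering stage needs to align the virtual layer with the real apparatus.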

The Authoring Procedure

This procedure (see Figure 3) enables quick, flexible creation of digital content. Application developers can import or create digital content and correlate it with the real world. Digital content comprises 3D basic models, such as symbols, arrows, and frames, and 3D models of parts and subparts. If CAD component models are available, content creators can exploit them for the AR visualization. Otherwise, they can model digital replicas of parts or obtain them through reverse engineering. We used Rhinoceros 3D to model a digital dipstick, and we reconstructed the airplane door and fuselage using the Minolta Vivid 9i laser scanner. We use the WRL file format to export the 3D models and store them in the database.

Figure 3. The authoring procedure lets authors import or create digital content and correlate it with the real world. (Diagram elements: CAD models, reverse-engineering models, 3D virtual models, VRML export, model alignment, real apparatus, feature identification, reference system on the apparatus, reference image creation, reference image, camera image, pose estimation, AR virtual layer.)

We correlate the digital-content database with the different airplane apparatus, using an offline procedure based on the image of the oil-check system. So, we base our use of markerless methods on the creation of a reference image that interfaces between the real and virtual worlds. Such an image must exist for each apparatus, and we must perform an offline correlation procedure to align the real reference system with the virtual objects' local reference system. The content creator identifies a feature on the apparatus and virtually locates a reference system on that feature. The feature should have a simple shape (rectangular, circular, or triangular), be easy to measure on the real object, and be on a planar surface of the apparatus. Once the content creator identifies this feature, he or she takes a picture of the apparatus. Afterward, the content creator must derive the reference image by editing and modifying the picture so that the feature is at the image's center and the feature's actual size is replicated in the image.

This correlation procedure lets content creators quickly create a virtual space linked to the real apparatus once they have selected a feature and spatially located the virtual objects offline through a CAD interface. We used the OpenGL graphics libraries to create our system's graphics-rendering feature.
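The correlation step can be sketched as follows: given the edited reference image, with the feature centered, and the feature's measured real width, we derive a metric scale and express every stored feature in the apparatus reference system. This is an illustrative reconstruction rather than our authoring tool's actual code; the function names, file names, and example measurements are ours.

```python
# Illustrative offline correlation: derive a metric scale from the reference
# feature's known real size and store each keypoint's planar 3D coordinates
# in the apparatus reference system (origin at the image center, z = 0).
import cv2
import numpy as np

def build_reference(ref_image_path, feature_width_px, feature_width_m):
    scale = feature_width_m / feature_width_px        # meters per pixel
    ref = cv2.imread(ref_image_path, cv2.IMREAD_GRAYSCALE)
    h, w = ref.shape
    cx, cy = w / 2.0, h / 2.0                         # reference-frame origin
    kp, des = cv2.SIFT_create().detectAndCompute(ref, None)
    obj_pts = np.float32([((p.pt[0] - cx) * scale,
                           (p.pt[1] - cy) * scale,
                           0.0) for p in kp])
    return {"descriptors": des, "object_points": obj_pts, "scale": scale}

# Hypothetical example: the selected feature measures 0.11 m and spans
# 220 pixels in the edited reference image.
# reference = build_reference("reference_oil_door.png", 220, 0.11)
```

Virtual models exported as WRL files are placed in this same reference system, so the pose estimated online aligns them with the real apparatus.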

The Interface

Introducing AR systems in maintenance brings interaction paradigms that are completely new for technicians, so the interface should be designed to make the operator's life easier. Figure 4 shows the display's main window, which overlays the virtual layer on the video stream. The 3D animation in the virtual layer depends on the maintenance intervention's specific status because the maintenance check is a sequence of operations that the technician must perform in the proper order. The four screenshots in Figure 4 correspond to subsequent animations that are overlaid onto the real scene. So, each subtask consists of a finite number of steps for the specified apparatus. Moreover, the interface user must be continuously aware of

■ the current subtask's total number of steps,
■ the current step's active index (the current progressive number), and
■ the task-performance progress.

To provide such information, we designed the step window, which is always at the display's lower left corner. Here, a timeline shows the total number of steps as a set of colored bars. Green indicates completed steps, yellow indicates the current step, and black indicates the remaining steps. The window also indicates the current step's name. This approach intends to keep interaction easy and intuitive. Users can manage passage from one step to the next through a unique activation command that they activate through a simple interface (for example, the Enter key, a button device, or voice recognition).

Figure 4. The display window and steps 4–7 for the oil-check procedure. (a) A red 3D rectangle frames the tag, which reminds the maintenance operator to carefully read the information. (b) The dipstick rotates counterclockwise and translates upward to indicate which operation should be emulated. (c) The dipstick model displays and augments different oil levels between cuts using color-coded information. (d) The dipstick rotates clockwise and translates downward to show how to put it back. A warning message is also provided.
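The step-window logic itself is simple; the following sketch models the state it tracks and the colors it displays. The actual rendering used OpenGL, and the class and method names here are illustrative.

```python
# Step-window state: one colored bar per step, advanced by a single command.
class StepWindow:
    def __init__(self, step_names):
        self.steps = list(step_names)
        self.current = 0                 # index of the active step

    def advance(self):
        """Called on the activation command (Enter key, button, or voice)."""
        if self.current < len(self.steps) - 1:
            self.current += 1

    def timeline(self):
        """Green = completed, yellow = current, black = remaining."""
        return ["green" if i < self.current else
                "yellow" if i == self.current else "black"
                for i in range(len(self.steps))]

    def label(self):
        return (f"Step {self.current + 1} of {len(self.steps)}: "
                f"{self.steps[self.current]}")
```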

When implementing our prototype, we took into account some of the project's basic usability requirements. The system should support operations performed in large areas (for example, a hangar) and shouldn't hamper the operator. Also, it should be comfortable to wear continuously for at least half an hour. So, we used hardware components from off-the-shelf systems to minimize the prototype's weight, maintain its stability, and make it cost effective. The components comprised an adjustable plastic headset fitted with a see-through Liteye LE750 monitor and a Logitech webcam, and a notebook computer (see Figure 5).

Figure 5. The prototype head-mounted display. We used hardware components from off-the-shelf systems to minimize its weight, maintain its stability, and make it cost effective.

Validation

We assessed our prototype's efficiency and usability. By efficiency, we mean the system's ability to provide markerless tracking and 3D animation overlay on several similar apparatuses, on different airplanes, and in different lighting conditions. So, we repeated the oil check on three Cessna 172s.

A system's overall usability relies on different people being able to use it efficiently and achieve a set of specific functional objectives. So, we asked 10 people to participate in the experiments and collected the results using an evaluation form. We also used this form to report the number of software operations needed to progress through the AR procedure and to track participants. In this case study, the assigned subtask required 14 operations. The ratio of the number of operations the participants actually made to the required number of operations was 1.2, and the maximum number of operations was 20. This means that the participants followed the procedure well, didn't perform useless operations, and kept the AR interface synchronized with the real maintenance subtasks.

The training always took less than 30 minutes; it included instructions for using the prototype and checking the oil. The error rate was relatively low, although some problems occurred; the maximum time to resolve such interruptions was 8 minutes. To avoid such problems, we could have provided more training, considering that the participants had no previous AR experience. The average time to complete the task autonomously was 20 minutes.

We measured the workload by applying the NASA-TLX (Task Load Index) form, setting six 10-point rating scales to measure the perceived workload (see Figure 6). The average workload rating didn't exceed 4 points, and performance and satisfaction received high ratings. The participants didn't comment on the system during the procedure. However, they made several strongly positive observations on their evaluation forms. They all reported that the animated components' visual representation let them perform any task by simply emulating what was displayed. They also appreciated the step-by-step reminders.

Figure 6. Participant ratings of the prototype system on six 10-point scales: mental, physical, and temporal workload, effort, performance, and satisfaction. The average workload rating didn't exceed 4 points, and performance and satisfaction received high ratings.

The participants seemed to understand how to use the prototype and said that they would appreciate its implementation. We believe this is because we showed that the technology actually improved the task's efficiency. We also showed potential users how to use the AR interfaces. Our case study indicated that this system is strongly application oriented, demonstrating AR's potential, and we feel we've overcome some skepticism about AR. In addition, the markerless methods this system implements, together with the easy-to-handle graphics-authoring procedure, enable wider applicability, which can scale to different case studies. This demonstrates that interdisciplinary studies contribute to AR's industrial uses.


Acknowledgments

We thank Centro Italiano di Ricerche Aerospaziali for collaboration and for funding the project. We also thank Giovanni Miranda and Iuri Pelliconi for their valuable contributions.

References

1. "Statistical Summary of Commercial Jet Airplane Accidents: Worldwide Operations, 1959–2009," slide presentation, Boeing, 2010; www.boeing.com/news/techissues.
2. T. Haritos and N.D. Macchiarella, "A Mobile Application of Augmented Reality for Aerospace Maintenance," Proc. 24th Digital Avionics Systems Conf. (DASC 05), vol. 1, IEEE Press, 2005, pp. 5.B.3-1–5.B.3-9; http://doi.ieeecomputersociety.org/10.1109/DASC.2005.1563376.
3. H. Regenbrecht, G. Baratoff, and W. Wilke, "Augmented Reality Projects in the Automotive and Aerospace Industry," IEEE Computer Graphics and Applications, vol. 25, no. 6, 2005, pp. 48–56; http://doi.ieeecomputersociety.org/10.1109/MCG.2005.124.
4. G. Schweighofer and A. Pinz, "Robust Pose Estimation from a Planar Target," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 28, no. 12, 2006, pp. 2024–2030.
5. D.G. Lowe, "Distinctive Image Features from Scale-Invariant Keypoints," Int'l J. Computer Vision, vol. 60, no. 2, 2004, pp. 91–110.
6. H. Bay et al., "SURF: Speeded Up Robust Features," Computer Vision and Image Understanding, vol. 110, no. 3, 2008, pp. 346–359.

Francesca De Crescenzio is an assistant professor in the University of Bologna's Second Faculty of Engineering. Contact her at [email protected].

Massimiliano Fantini is a researcher in the University of Bologna's Second Faculty of Engineering. Contact him at [email protected].

Franco Persiani is a professor in the University of Bologna's Second Faculty of Engineering. Contact him at franco.[email protected].

Luigi Di Stefano is an associate professor at the University of Bologna's Department of Electronics, Computer Sciences, and Systems. Contact him at [email protected].

Pietro Azzari is a senior software engineer at Snap-On Equipment. Azzari received his PhD from the University of Bologna's Department of Electronics, Computer Sciences, and Systems. Contact him at [email protected].

Samuele Salti is a PhD student at the University of Bologna's Department of Electronics, Computer Sciences, and Systems. Contact him at [email protected].

Contact department editor Mike Potel at [email protected].