A Human Subjects Study on the Relative Benefit of Immersive Visualization Technologies

Derrick Turner

A project submitted to the faculty of Brigham Young University in partial fulfillment of the requirements for the degree of

Master of Science

Daniel Ames, Chair
Kevin Franke
Gus Williams

Department of Civil and Environmental Engineering
Brigham Young University
June 2014

Copyright 2014 Derrick Turner
All Rights Reserved

ABSTRACT

A Human Subjects Study on the Relative Benefit of Immersive Visualization Technologies

Derrick Turner
Department of Civil and Environmental Engineering, BYU
Master of Science

Large-scale stereoscopic immersive visualization environments typically include 3D stereo displays and head/hand tracking to create an immersive user experience. These components add cost and complication to such a system that may not be warranted if the components do not significantly improve data navigation and interpretation. This paper presents a two-part human subjects study investigating the relative value of head tracking and stereoscopic technologies in terms of improved data navigation and interpretation. Ninety-six individuals performed specified tasks using several datasets in one of four system configurations: motion tracking with stereoscopic 3D, motion tracking with no stereoscopic 3D, no motion tracking with stereoscopic 3D, and no motion tracking with no stereoscopic 3D. Subjects were not informed of their specific configuration; each task was timed, and a score was assigned based on task performance accuracy. Results from Part A of the study (simple navigation, measuring distance, and image interpretation) indicated a lack of statistically significant difference between the performance metrics of the test groups, whereas performance metrics in Part B of the study (complex navigation) were greatest in the case of head tracking with no stereoscopic 3D. Some possible explanations of these results and their potential implications are provided.

Keywords: Stereoscopic 3D, Head tracking, 3D immersive visualization

ACKNOWLEDGEMENTS

I would like to thank my wife, Anika, for always supporting me and dealing with the long hours spent in the Clyde Building. I would also like to thank Dr. Ames for giving me the opportunity to further my studies.

TABLE OF CONTENTS

LIST OF FIGURES
1 Introduction
  1.1 Virtual Environments
  1.2 Stereoscopic 3D and Head Tracking
  1.3 Research Goals
2 Methods
  2.1 Hardware and Software
  2.2 Human Subjects and Training
  2.3 System Configurations
  2.4 Datasets
  2.5 Tasks
  2.6 Assessing Tasks
3 Results
  3.1 Task A1: Human Foot Horizontal Orientation
  3.2 Task A2: Human Foot Vertical Orientation
  3.3 Task A3: Highway Overpass
  3.4 Task A4: Change Detection of the Crowne Plaza Hotel, San Diego
  3.5 Task B: Location A
  3.6 Task B: Location B
  3.7 Task B: Location D
  3.8 Task B: End Location
  3.9 Task B: Total Score
4 Discussion
5 Conclusion
References

LIST OF TABLES

Table 1: Student's t-test Calculated t-values from the Human Foot Horizontal Orientation Task
Table 2: Student's t-test Calculated t-values from the Human Foot Vertical Orientation Task
Table 3: Student's t-test Calculated t-values from the Change Detection of the Crowne Plaza Hotel Task
Table 4: Student's t-test Calculated t-values from Location A
Table 5: Student's t-test Calculated t-values from Location B
Table 6: Student's t-test Calculated t-values from Location D
Table 7: Student's t-test Calculated t-values from the End Location
Table 8: Student's t-test Calculated t-values from the Total Score


LIST OF FIGURES

Figure 1: The VuePod large-scale stereoscopic visualization system used for this study
Figure 2: Scan of the human foot horizontal orientation starting position
Figure 3: LiDAR scan of a bridge overpass
Figure 4: Artist's rendition of the Crowne Plaza Hotel property when built in 1966
Figure 5: 2005 LiDAR scan of the Crowne Plaza Hotel property
Figure 6: Plan view of the cavern
Figure 7: Required view at location A
Figure 8: Image of the end location
Figure 9: Human foot horizontal orientation scaled score results
Figure 10: Human foot vertical orientation scaled score results plot
Figure 11: Change detection Crowne Plaza Hotel scaled score results plot
Figure 12: Location A scaled score results
Figure 13: Location B scaled score results
Figure 14: Location D scaled score results
Figure 15: End location scaled score results
Figure 16: Total score scaled score results


1 INTRODUCTION

Large-scale stereoscopic immersive visualization has been demonstrated as a useful tool for viewing, interpreting, and analyzing scientific data (Koller, Lindstrom et al. 1995, Sulbaran and Baker 2000, Lin, Chen et al. 2013). Two key components of large-scale stereoscopic 3D immersive environments are stereoscopic displays and head tracking devices (Cruz-Neira, Sandin et al. 1992, Dodgson 2005). The approach assumes that stereoscopic images paired with head tracking create a more visually immersive experience, leading to better manipulation and understanding of data and, in turn, more effective use of time and resources (Ware, Arthur et al. 1993, Dodgson 2005, Bowman and McMahan 2007). This paper presents a human subjects study of the relative value of stereoscopic 3D displays and head tracking devices with respect to simple data analysis and interpretation tasks.

1.1 Virtual Environments

A Cave Automatic Virtual Environment, or CAVE, is an immersive virtual reality environment that uses projectors or a large number of LCD video screens to show images on three, four, or six walls, typically in a cube-shaped room. CAVEs have been constructed in several different configurations for specific applications, but all are generally built with the goal of aiding scientists, engineers, technicians, teachers, and others by providing users with a viewer-centered perspective of complex data (Cruz-Neira, Sandin et al. 1992). Advances in computer science and technology have allowed CAVE systems to improve rapidly in recent years (Peterka, Kooima et al. 2008, DeFanti, Acevedo et al. 2011).

A common CAVE configuration uses four to six projection screens forming walls, a floor, and/or a ceiling, with a rear projector casting images onto each screen (Cruz-Neira, Sandin et al. 1992, Browning, Cruz-Neira et al. 1994, Sherman, O'Leary et al. 2010, DeFanti, Acevedo et al. 2011). In this environment, the user wears stereoscopic glasses that allow them to interact with the screens (DeFanti, Acevedo et al. 2011). Six-degrees-of-freedom motion tracking is achieved with a sensor worn by the user that communicates the location of the user's head or hands to the software (DeFanti, Acevedo et al. 2011, Kreylos 2013). A similar configuration can be built with 3D LCD televisions and a motion tracking system (Hayden, Ames et al. 2014).

CAVE and related virtual environments are usually intended for one or more of the following three common uses: 1) as a tool for multi-dimensional spatial analysis and interaction, 2) as a platform for process-based simulation of dynamic geographic phenomena, and/or 3) as a workspace for multi-participant collaborative geographic experiments (Waly and Thabet 2002, Lin, Chen et al. 2013). In all cases, a virtual environment can serve as a means for users to better utilize and/or gain greater information from their data (Fisher, McGreevy et al. 1986). It has been argued that when a user is in control of a virtual environment, a feeling of immersion is created in which the user becomes a critical part of the displayed data; this allows the user to be visually stimulated in ways that do not happen when using a conventional single 2D display (Kruger, Bohn et al. 1995, Slater 2003).

1.2 Stereoscopic 3D and Head Tracking

Stereoscopic 3D displays that require the use of special glasses to create 3D images are generally considered a fundamental component of any CAVE or virtual environment. Several hardware and software solutions for 3D glasses have been developed, including the common theater-style glasses, which use two polarized lenses to selectively pass through to the eyes two distinct images emanating simultaneously from a single display surface. This passive approach is appealing due to the low-cost, lightweight glasses required. Other, active systems use a rapid shuttering system on each lens, synchronized with alternating left and right images on the display. In both cases, the viewer effectively sees two separate images on a single display surface (Dodgson 2005), creating the illusion of depth and volume within the image (Chiang 2013).

Stereoscopic 3D displays can be used to extract visual information on the depth, size, and position of objects to facilitate spatial analysis tasks. The creation of shadows in 3D space can help the 3D images appear to have real volume (Hubona, Wheeler et al. 1999). These realistic aspects of stereoscopic 3D work by triggering the visual, auditory, and other sensory cues that users have experienced in the real world (Bowman and McMahan 2007).

Head tracking refers to the use of a device worn by a user to identify the exact position and orientation of the head, thereby providing an estimate of the user's view direction. Several highly reflective balls or markers are usually attached to the head-mounted device, and a wall- or ceiling-mounted instrument continuously measures the position of these markers to determine the exact position and orientation of the user in space relative to the tracking instrument. The tracking instrument sends this information to a computer, which modifies the current view with the correct size, shape, and location based on the position of the user in space (Chance, Gaunet et al. 1998). When head tracking is used and head position is properly measured, movement parallax is achieved (Dodgson 2005), a condition that prevents the user from noticing that the objects are changing position based on their movement, allowing users to move their head and body naturally (Gibson, Gibson et al. 1959, Billen, Kreylos et al. 2008). To achieve this result, it is important to have a tracking system that adds minimal latency when rendering the image based on the movements of the user, so that the user does not notice the objects and scenes changing.

Head tracking systems can also estimate position, motion, and orientation through the use of internal motion detection electronics. This approach is particularly useful as adapted to head-mounted displays (HMDs), which provide a virtual reality interface by generating an image on a small screen immediately in front of the eyes in accordance with the user's viewpoint (Koller, Lindstrom et al. 1995). The Oculus Rift (http://www.oculusvr.com/) and Google Glass (http://www.google.com/glass/start/) are examples of current leading HMDs. HMDs can provide an extremely intimate and immersive single-user experience. The head-mounted marker/tracker system, in contrast, has the advantage of allowing multiple people to view a single scene together. However, given the lack of support for multiple-user tracking in current CAVE systems, other viewers see the viewpoint of the user being tracked and therefore a slightly distorted object, which can be visually unsettling (Dodgson 2005).

1.3 Research Goals

The remainder of this paper presents a two-part human subjects study to investigate the relative value of both stereoscopic displays and head tracking technology in terms of creating an immersive environment for data analysis and interpretation. A recent, unrelated study regarding the effects of stereoscopic 3D, head tracking, and field of regard concluded that stereoscopic 3D and head tracking enable users to perform better in immersive visualization environments (Ragan, Kopper et al. 2013). Because our study was designed and executed without any prior knowledge of this parallel research effort, our results can be viewed as a potential validation or qualification of those results. Our effort is also unique in that it is the first to focus heavily on a complex LiDAR-dataset-based navigation and interpretation problem. The following sections describe the research methods employed, detailed results, and some interpretation thereof.


2 METHODS

2.1 Hardware and Software

The human subjects study presented here was conducted using a low-cost stereoscopic immersive visualization system called the VuePod (Figure 1). The VuePod is comprised of twelve 55-inch 3D LCD televisions paired with a custom-built, high-end gaming computer containing three video cards, each of which sends simultaneous stereoscopic video output to four of the monitors (Hayden, Ames et al. 2014). The VuePod includes a motion tracking system produced by ARTtrack (http://www.ar-tracking.com/products/tracking-systems/smarttrack/) that uses two cameras to track the position of reflective balls or markers attached to glasses (for head tracking) and to a Nintendo Wii video game remote controller (for hand tracking) within a volume in front of the televisions. The VuePod computer supports both Linux and Windows operating system based software. For the current study, the open-source software application Vrui was used. The Vrui VR Toolkit is a general-purpose virtual reality toolkit capable of outputting stereoscopic 3D images to multiple screens (Kreylos 2008).

Figure 1: The VuePod large-scale stereoscopic visualization system used for this study.

2.2 Human Subjects and Training

Ninety-six study participants were recruited by email and verbal announcement in two groups of 48 each (one group for each part, or phase, of the study, hereafter termed "Part A" and "Part B"). Participants included both male and female individuals ages 18 to 30 and were primarily undergraduate civil engineering students with moderate-to-high technical and computer skills. User suitability was assessed by a pre-participation survey completed by each user. The survey determined whether a potential user had any problems regarding vision, depth perception, balance, fine motor skills, or mobility. These challenges would potentially put users at a disadvantage, so they were addressed before the study. Once a user was determined suitable for the study, each user signed a consent form regarding the logistics of the study, confidentiality, risks, and compensation, as per human subjects research policies established by the Brigham Young University Institutional Review Board.

A short training video was shown to each subject to instruct them on use of the VuePod (www.youtube.com/watch?v=v3pkyfbjguy). The tutorial video shows how to move and rotate objects, orient a scene, fly through a scene, measure distances, and zoom in and out of scenes. These controls were taught by visually showing how they were achieved, with instructions explaining their purpose and how to perform them. The video had periodic breaks to allow the subject to practice using the VuePod until they felt comfortable with the controls and tools taught in the video. The configuration used during training was head tracking with stereoscopic 3D.

2.3 System Configurations

We created four different system configurations, to which each subject was randomly assigned: head tracking with stereoscopic 3D (TS), no head tracking with stereoscopic 3D (NTS), head tracking with no stereoscopic 3D (TNS), and no head tracking with no stereoscopic 3D (NTNS). After the video training, each subject was informed that the level of 3D immersion had been adjusted; however, they were not informed as to exactly what changes had been made, which specific system configurations were being tested, or to which configuration they had been assigned. For example, for the NTNS group, all head tracking and stereoscopic 3D was disabled, whereas for the TS group, stereoscopic 3D and head tracking were both enabled. All subjects wore head tracking 3D glasses while performing tasks, regardless of their assigned system configuration.

2.4 Datasets

Part A of the study used three different datasets for specific visualization and interpretation tasks: 1) a magnetic resonance imaging (MRI) scan of a human foot retrieved from the Visible Human Project at the University of California, Davis (Kreylos 2000), 2) a ground-based LiDAR scan of a highway bridge overpass retrieved from the EarthScope Intermountain Seismic Belt LiDAR Project (NSF, USGS et al. 2008), and 3) an aerial LiDAR scan of a region of San Diego, California (San Diego 2005).

After assessing results from Part A, we devised a Part B study using a single complex ground-based terrestrial laser scanner (TLS) scan of the Crystal Cave located in Sequoia National Park, California. The TLS scan depicts a 3D point cloud of the contours and characteristics of the inside of the cavern. The dataset was provided by the U.S. Department of Energy Idaho National Laboratory.

2.5 Tasks

For Part A of the study, each subject was asked to perform four different tasks (Tasks A1-A4) using the three datasets noted. By random selection, each subject was assigned a system configuration and performed all four tasks using that same configuration. A total of 48 subjects participated in Part A of the study, with 12 subjects performing tasks in each of the four configuration groups (TS, NTS, TNS, and NTNS).

Tasks A1 and A2 tested the subjects' ability to perform simple orientation of a common 3D object (a human foot depicted via MRI scan) using the six-degrees-of-freedom controller and large-scale visualization environment. First, subjects were shown the image oriented in a vertical position that revealed the exterior of the foot (Figure 2) and were instructed to rotate the foot 180° horizontally to show the inside of the foot in a vertical position. Each user was timed while performing this task, and the proctor assigned a score from 0 to 5 based on the accuracy with which the task was performed (5 indicated the greatest accuracy). Rotation was conducted using the Wii remote controller and its attached tracking system reflective balls.
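The four configurations described in Section 2.3 form a 2 × 2 factorial design (head tracking on/off × stereoscopic 3D on/off), with 12 subjects randomly assigned to each group. The following is a minimal sketch of such balanced random assignment; the code and names are hypothetical illustrations, not the procedure actually used in the study.

```python
# A minimal sketch of balanced random assignment to the four
# configuration groups (TS, TNS, NTS, NTNS). Hypothetical code,
# not the study's actual procedure.
import random
from itertools import product

# 2 x 2 design: (tracking, no tracking) x (stereo, no stereo)
CONFIGS = ["".join(pair) for pair in product(("T", "NT"), ("S", "NS"))]
# -> ['TS', 'TNS', 'NTS', 'NTNS']

def assign_groups(subject_ids, per_group=12):
    """Shuffle 48 slots so each configuration gets exactly 12 subjects."""
    slots = CONFIGS * per_group
    random.shuffle(slots)
    return dict(zip(subject_ids, slots))

groups = assign_groups(range(1, 49))
```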

Figure 2: Scan of the human foot horizontal orientation starting position

For Task A2, subjects were instructed to rotate the image from an upside-down vertical position showing the exterior of the foot to an upright vertical position. Each subject was timed while performing this task and was given a score from 0 to 5 based on the accuracy with which the task was performed, with 5 being the most accurate.

For Task A3, each subject was instructed to measure the vertical height from the bottom of the bridge overpass to the top of the road located below the overpass in the LiDAR scan shown in Figure 3. The vertical height measured was recorded, as well as the time required to complete the task. Measurement accuracy was recorded based on the location of the measurement in 3D space. For example, it was noted whether or not the measured line was vertical from all angles and whether the endpoints of the line were located on the LiDAR points or in front of or behind them. The purpose of this task was to determine the ease and accuracy of a simple analytical tool (distance measurement).

Figure 3: LiDAR scan of a bridge overpass

For Task A4, each subject was shown an artist's rendition of the Crowne Plaza Hotel in San Diego, California, as it appeared when built in 1966 (Figure 4), as well as a LiDAR scan of the same property collected in 2005 (Figure 5). Each subject was given 3 minutes to analyze the LiDAR scan and the photo and to identify any notable differences between the two images. The number of differences and the specific differences themselves were recorded. There were approximately 10 notable differences between the original image and the LiDAR scan.

Figure 4: Artist's rendition of the Crowne Plaza Hotel property when built in 1966

Figure 5: 2005 LiDAR scan of the Crowne Plaza Hotel property

Part B of the study involved one complex dataset and associated task (Task B). This task used the cavern LiDAR scan and combined navigation and orientation tools into a difficult task that would measure how well subjects were able to perform in different configurations of head tracking and stereoscopic 3D. We recruited 48 new participants, each randomly assigned to perform the task in one of the four previously noted configurations, making four groups of 12 participants each. To begin the task, each user was provided with an iPad containing instructions, a map, and images of different locations within the cavern, as shown in Figure 6. Each user was instructed to use the map to navigate from the beginning of the cavern to the end, following the path specified on the map. Subjects were instructed to stay within the cavern and not pass through cavern walls.

Figure 6: Plan view of the cavern

Four locations were marked on the map as A, B, C, and D. Subjects were instructed to navigate to each specific location. Upon finding a location, they were instructed to orient the VuePod view such that it matched the associated static image shown on the iPad; for example, the required view for location A is shown in Figure 7. Upon completing this activity, the accuracy with which they oriented the view was measured by the study proctor. The subject was then instructed to start from a previously saved correct view of location A and navigate to location B, and so on. After location D was found and matched, the subject was instructed to find the end of the cavern and match an associated final image (Figure 8).

Figure 7: Required view at location A

Figure 8: Image of the end location

2.6 Assessing Tasks

Tasks A1 and A2 both measured the accuracy with which subjects were able to follow instructions and complete the task. After each task, the proctor observed the final position of the image when the subject stated that the task was complete and gave a score from 0 to 5, with 5 being the best possible score. Once a score was determined, the time required to complete the task was recorded. Scores were assigned based on predetermined accuracy criteria.

Task A3 was assessed by determining how well the user duplicated the actual height of the overpass (10 m). An accurate answer is achieved when the two endpoints of the line are located on specific LiDAR points and the line is vertical from all angles. The measurement length determined by each user was recorded, as well as notes regarding accuracy. The time to complete the task was also recorded.

Task A4 was assessed based on the number of differences identified. Each identified difference was recorded by the proctor and later assessed for correctness. We expected the subjects to identify 10 specific and noticeable differences between the artist's rendition and the LiDAR scan. The number of changes and the time spent finding those changes were used to determine how well each person was able to perform the task in the different system configurations.

Three of these four tasks were assessed based on the time to complete the task and a score for how well the task was completed. Scores were scaled against time of completion by dividing the score by the time and multiplying by 100. The resulting scaled scores for each user were sorted from lowest to highest within each group. Creating ranked scaled scores allowed the time and the actual score to be factored together to better characterize the differences in working in each configuration.
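As a concrete illustration of this scoring scheme, the sketch below computes and ranks scaled scores for one hypothetical configuration group; the data values and function names are illustrative only, not results from the study.

```python
# A minimal sketch of the Part A scaled-score computation described
# above. The (score, time) pairs are illustrative, not study data.

def scaled_score(score, time_seconds):
    """Scale a 0-5 task score against completion time: (score / time) * 100."""
    return score / time_seconds * 100.0

# Hypothetical (score, time in seconds) pairs for one configuration group.
group_ts = [(5, 42.0), (4, 61.5), (3, 30.2)]

# Rank the scaled scores from lowest to highest within the group.
ranked = sorted(scaled_score(s, t) for s, t in group_ts)
print(ranked)  # -> [6.50..., 9.93..., 11.90...]
```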

Task B was a timed activity that also accounted for how well the subjects were able to perform the task. The instructions stated that subjects needed to stay within the cavern and not pass through the cavern walls. To account for this, each subject was given an initial 30 points, and 1 point was subtracted each time the subject passed through a cavern wall, floor, or ceiling. The subject was notified each time they exited the cavern so that they could correct themselves. The time to find and match the image at each of the four locations was recorded. Up to 5 points were given for accuracy in matching the image at each location. The accuracy with which subjects matched the VuePod view to the provided image at each location was assessed based on zoom and angle. The same proctor assessed each subject in this study to limit subjective bias in the assignment of accuracy points. Fifty total points were possible for this task. The total score was divided by the total time to navigate the cavern and multiplied by 100 to create a scaled score. A scaled score was also computed for each individual location based on the time it took to find and match the image and the associated score. The scaled scores were sorted from smallest to largest to create a ranked order of scaled scores based on the configuration used to achieve each score.
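The Task B scoring can be sketched the same way; the penalty and accuracy values below are hypothetical, while the 30-point starting balance, 1-point wall-exit deduction, and 5-point-per-location accuracy cap follow the description above.

```python
# A minimal sketch of the Task B scoring described above.
# Input values are illustrative, not data from the study.

def task_b_scaled_score(wall_exits, accuracy_points, total_time_seconds):
    """Start from 30 points, subtract 1 per wall exit, add up to
    5 accuracy points per location (50 possible), then scale by time."""
    score = 30 - wall_exits + sum(accuracy_points)
    return score / total_time_seconds * 100.0

# Hypothetical subject: 4 wall exits; accuracy scores at the four
# matched locations; 600 seconds to navigate the cavern.
print(task_b_scaled_score(4, [5, 4, 3, 5], 600.0))  # -> 7.1666...
```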


3 RESULTS

A visual assessment of the ranked and scaled scores was performed, and a Student's t-test was used to determine the statistical significance of the results. We used a 95% level of probability with 22 degrees of freedom to determine the tabulated t-value of 2.07, which was then compared with each calculated t-value. Specific results are discussed below.

3.1 Task A1: Human Foot Horizontal Orientation

The results from the human foot horizontal orientation task are shown in Figure 9.

Figure 9: Human foot horizontal orientation scaled score results

The TS group had the highest overall score of any of the four groups. Figure 9 shows that among the top four users in each group (the third tercile), TS scored highest, followed by the TNS group. Interestingly, results from the lowest four subjects in each group (the first tercile) show the opposite, with the TS group scoring lowest. One interpretation of these results is that power users, or people who are comfortable with technology, are more likely to benefit from the addition of head tracking and stereoscopic 3D, whereas novice technology users are more likely to find the additional technologies cumbersome or otherwise limiting. Regardless, the visual assessment of these data is inconclusive regarding the relative benefit of the two technologies.

Table 1 presents the calculated t-values from the human foot horizontal orientation task. From the results, none of the calculated t-values are greater than the tabulated t-value, meaning that, at the 95% level, no group's performance was significantly different from any other group's.

Table 1: Student's t-test Calculated t-values from the Human Foot Horizontal Orientation Task

        TS      NTNS    NTS     TNS
TS              0.03    4.36    0.00
NTNS    0.03            3.35    0.00
NTS     0.36    3.35            0.00
TNS     0.07    4.09    0.00
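Each entry in Table 1 (and in the analogous tables below) is a two-sample Student's t statistic comparing the scaled scores of two 12-subject groups (df = 12 + 12 - 2 = 22). The following is a minimal sketch of this computation using SciPy; the score arrays are illustrative, not the study's data.

```python
# A minimal sketch of the two-sample Student's t-test used throughout
# this section (two groups of 12 subjects, df = 12 + 12 - 2 = 22).
# The scaled scores below are illustrative, not data from the study.
from scipy import stats

group_tns  = [9.1, 7.4, 8.2, 6.9, 10.3, 7.7, 8.8, 9.5, 6.2, 7.0, 8.1, 9.9]
group_ntns = [6.5, 5.9, 7.1, 5.2, 6.8, 7.4, 5.5, 6.0, 6.3, 7.9, 5.8, 6.6]

t_stat, p_value = stats.ttest_ind(group_tns, group_ntns)  # equal variances
print(abs(t_stat) > 2.07)  # True -> significant at the 95% level
```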

3.2 Task A2: Human Foot Vertical Orientation

The results from the vertical rotation of the human foot are shown in Figure 10.

Figure 10: Human foot vertical orientation scaled score results plot

The NTS group included the user with the overall highest score. However, only two of the four NTS users in the third tercile scored above the other three configuration groups. Within the third tercile, TS has the highest average when the top scorer from NTS is disregarded. In the first tercile, TS is the top performer, with the other three configurations having similar scores. The second tercile has a wide spread, with NTNS having the best score, followed by TS, NTS, and TNS. When considering all three terciles, TS scored best.

Table 2 presents the calculated t-values from Task A2 and shows that none of the calculated t-values are greater than the tabulated t-value. This means that, at the 95% level, no group's performance was significantly different from any other group's.

Table 2: Student's t-test Calculated t-values from the Human Foot Vertical Orientation Task

        TS      NTNS    NTS     TNS
TS              0.37    0.33    0.92
NTNS    0.37            0.07    0.62
NTS     0.33    0.07            0.40
TNS     0.92    0.62    0.40

3.3 Task A3: Highway Overpass

Assessment of the numerical results and associated user comments for the bridge overpass measurement task indicated a high level of difficulty in performing the task. Most subjects were unable to correctly measure the height of the bridge overpass. The comments and proctor observations indicate that subjects were unable to place the cursor directly on the required LiDAR points, causing many of the measurements to be taken in front of the points instead of on them. This means that the measurements were essentially taken in random space. Another issue with this task was that users were unable to draw a vertical line. When subjects rotated the image, it was clear that the measurement lines were not exactly vertical, causing measurements to be longer than expected. Only three people were able to obtain a measurement on the points with a distance close to the actual answer. These three subjects were members of groups TS, NTS, and TNS.

3.4 Task A4: Change Detection of the Crowne Plaza Hotel, San Diego

The average times to identify all changes in the Crowne Plaza Hotel scene are shown in Figure 11.

Figure 11: Change detection Crowne Plaza Hotel scaled score results plot

Change detection results for this task were scaled against time as with the other tasks, but with little visible difference in scores, since most users used all three minutes to complete the task. Figure 11 shows that the third tercile in each group had the same high score, but the NTS group had the highest scores of the four. The other three groups each have one of their four third-tercile values differ, but are otherwise the same. The second tercile is very similar to the third, with NTS having the obvious advantage in performance and the other three groups ending with similar scores.

Table 3 shows the results from the statistical analysis of the scaled scores from the change detection of the Crowne Plaza Hotel task. None of the calculated t-values from this task exceed the tabulated t-value of 2.07. Therefore, we can conclude that there is not a significant difference between the groups and their respective results.

Table 3: Student's t-test Calculated t-values from the Change Detection of the Crowne Plaza Hotel Task

        TS      NTNS    NTS     TNS
TS              0.13    0.54    0.45
NTNS    0.13            0.40    0.55
NTS     0.54    0.40            0.91
TNS     0.45    0.55    0.91

3.5 Task B: Location A

The results from finding and matching location A are shown in Figure 12.

Figure 12: Location A scaled score results

The scaled score results show that TNS (tracking with no stereo) consistently performed better than the other configurations. The other three configurations were grouped together except in the third tercile, where TS (tracking with stereoscopic 3D) performed better than NTS or NTNS.

Table 4 presents the calculated t-values from location A. From the results, it can be determined that there is a statistically significant difference between TNS and NTNS, as well as between TNS and NTS. This shows that, when trying to find location A and match the accompanying image, users performed better when head tracking was active and stereoscopic 3D was not than when only stereoscopic 3D was active or when both were turned off. It can be said with 95% confidence that the TNS configuration is better than the NTS and NTNS configurations for this task.

Table 4: Student's t-test Calculated t-values from Location A

        TS      NTNS    NTS     TNS
TS              0.94    0.94    1.71
NTNS    0.94            0.11    2.44
NTS     1.07    0.11            2.48
TNS     1.71    2.44    2.48

3.6 Task B: Location B

Figure 13 shows the results from finding and matching location B.

Figure 13: Location B scaled score results

Figure 13 clearly shows that NTS performed better than the other configurations at this location. Table 5 presents the calculated t-values from this task. None of the calculated t-values are greater than the tabulated t-value, signifying that there is no statistical difference in the performance of users when finding and matching location B.

Table 5: Student's t-test Calculated t-values from Location B

        TS      NTNS    NTS     TNS
TS              0.40    1.61    0.16
NTNS    0.40            2.06    0.61
NTS     1.61    2.06            1.54
TNS     0.16    0.61    1.54

While observing subjects navigate through the cavern, a problem was noticed that skews the results for this task. In many cases, while users were searching for location A, they would pass the correct location and would not realize they had passed it until they stumbled upon location B. At this point they would ask to start back at the beginning to try to find location A once again. This helped show how different configurations helped and hindered users trying to find and match location A. However, once subjects had found location A and started from it toward location B, those who had earlier missed location A and not realized it until they found location B were easily able to find location B the second time. Figure 13 reflects that, in many cases, NTS subjects who had difficulty finding location A were able to find location B more quickly and accurately because they already knew where to find it.

3.7 Task B: Location D

Figure 14 shows the results from finding and matching location D.

Figure 14: Location D scaled score results

TNS performed better across all users, with the exception that the top subjects from TS and NTS performed better than the top TNS subject. The first tercile shows that among TS, NTNS, and NTS, no single configuration outperformed the other two for the second-best score behind TNS. In the second tercile, NTS starts to outperform the other two and continues this trend into the third tercile.

Table 6 presents the calculated t-values from this task. The calculated t-value for TNS versus NTNS is higher than the tabulated t-value, showing that there is a statistically significant difference between the two.

Table 6: Student's t-test Calculated t-values from Location D

        TS      NTNS    NTS     TNS
TS              0.81    0.26    1.91
NTNS    0.81            1.02    3.31
NTS     0.26    1.02            1.44
TNS     1.91    3.31    1.44

3.8 Task B: End Location

Figure 15 shows the results from users finding and matching the end location.

Figure 15: End location scaled score results

TS, NTNS, and TNS had similar scores for all 12 subjects. NTS was grouped with them except in the third tercile, where its top four users performed better than any of the other configurations.

Table 7 presents the calculated t-values from finding and matching the end location. None of the calculated values are higher than the tabulated value, indicating that there was no statistically significant difference between how well users performed in their associated configurations.

Table 7: Student's t-test Calculated t-values from the End Location

        TS      NTNS    NTS     TNS
TS              0.61    1.49    0.98
NTNS    0.61            1.06    0.31
NTS     1.49    1.06            0.87
TNS     0.98    0.31    0.87

Similar to location A, the same problem occurred when users were trying to find location D. Subjects were unsure of themselves when trying to identify location D and would proceed forward and find the end point. When they found the end location, they would use it as a reference point to find location D. Thus, once location D was found, many users already knew where the end location was and would find it very quickly and accurately. As a result, the scores from finding and matching the end location do not accurately represent the benefit of one configuration over the others.

3.9 Task B: Total Score

Figure 16 shows the cumulative scaled score results from navigating through the cavern while finding and matching the associated images.

Figure 16: Total score scaled score results

The cumulative total scores from finding and matching the four different locations and images were scaled and plotted, showing that TNS scored better throughout the ranked subjects. Throughout the first tercile, TNS subjects scored significantly better than the other configurations, whose scores were all grouped together. The second tercile follows a similar trend, but the TS and NTS subjects scored better than the NTNS subjects, and a division begins to form between them. The third tercile shows that TNS scored highest, followed by NTS, TS, and NTNS. The only exception is that the overall highest score came from TS.

Table 8 presents the calculated t-values from the total scaled score. When comparing these to the tabulated t-value of 2.07, the conclusion can be made that there is a statistically significant difference between TNS and NTNS. Since the plot shows that subjects performed better using TNS, subjects with head tracking and no stereoscopic 3D can be expected to perform better than those with no head tracking and no stereoscopic 3D.

Table 8: Student's t-test Calculated t-values from the Total Score

        TS      NTNS    NTS     TNS
TS              1.27    0.21    1.92
NTNS    1.27            1.48    3.75
NTS     0.21    1.48            1.65
TNS     1.92    3.75    1.65


4 DISCUSSION

Visual assessments of the ranked, scaled scores and their associated tasks reveal inconclusive results regarding the relative advantages of head tracking and stereoscopic displays. The TS group scored well on both human foot orientation tasks but did not perform well on the change detection task. The NTS group performed well on the change detection task but did not receive very high scores on the human foot orientation tasks compared to the other groups. The statistical analysis using the Student's t-test approach supports the visual analysis of the scaled score plots: there is not a statistically significant difference between the different configurations tested in Part A. In short, our results are at best inconclusive as to whether stereoscopic 3D with head tracking improves one's ability to view, interpret, or manipulate data. This result might be partially due to the noticeable bezels on the 3D televisions, which tend to diminish the overall sense of immersion by providing a fixed reality reference point.

When considering the navigation and orientation task in Part B, there are statistically significant differences between some of the configurations. Part B contained four separate tasks that measured the orientation and navigation skills of subjects using different configurations; however, two of these tasks produced results that were skewed by subjects failing to interpret the inside of the cavern. Taking into account just the finding of locations A and D together with the total score, the conclusion can be made that TNS improves one's ability to view, interpret, and manipulate data compared to the NTNS configuration. There is also a slight difference between TNS and NTS, but it cannot be said with 95% confidence that users perform better using TNS compared to NTS.

5 CONCLUSION

The purpose of this study was to determine the relative significance and importance of head tracking and stereoscopic 3D as applied to interpreting and navigating data in immersive visualization environments. Part A of the study shows that when interpreting, navigating, or orienting data individually, there are no statistically significant differences between immersive visualization environments with different configurations of head tracking and stereoscopic 3D. Part B combined interpreting, navigating, and orienting into the same task, and the results showed significant differences between different configurations of an immersive visualization environment. When considering the combination of activities users would actually perform in immersive visualization environments, the conclusion can be made that an environment with head tracking and no stereoscopic 3D provides the most beneficial platform for users to perform at high levels.

The initial hypothesis was that head tracking with stereoscopic 3D would provide an environment allowing users to perform better when interpreting and navigating through data. However, the results showed that stereoscopic 3D hinders the success of users.

The VuePod and its differences from other CAVEs need to be considered when analyzing the results of this study. It is possible that performing this study in a CAVE with four or more walls would produce different results regarding head tracking and stereoscopic 3D because of the way the 3D is rendered. When using a multi-walled CAVE, the 3D images are perceived to be projected in open space, which could be more beneficial to users than the one-walled VuePod, where 3D images are projected into the screens instead of in front of them. The VuePod also contains bezels where the 3D televisions meet one another. These bezels make it hard for the 3D images to be rendered realistically when the user can see these objects in front of the 3D images. Our conclusions show that when using low-cost immersive visualization environments such as the VuePod, it is more beneficial for users to navigate and interpret data in a configuration that includes head tracking with no stereoscopic 3D.

REFERENCES

Billen, M. I., et al. (2008). "A geoscience perspective on immersive 3D gridded data visualization." Computers & Geosciences 34(9): 1056-1072.

Bowman, D. and R. P. McMahan (2007). "Virtual Reality: How Much Immersion is Enough?" Computer 40(7): 36-43.

Browning, D. R., et al. (1994). Projection-Based Virtual Environments and Disability. Virtual Reality Conference.

Chance, S. S., et al. (1998). "Locomotion Mode Affects the Updating of Objects Encountered During Travel: The Contribution of Vestibular and Proprioceptive Inputs to Path Integration." Presence: Teleoperators & Virtual Environments 7(2).

Chiang, W.-L. (2013). Three-Dimensional Glasses. U.S. Patent. United States of America, Hon Hai Precision Industry Co., Ltd.

Cruz-Neira, C., et al. (1992). "The CAVE: Audio Visual Experience Automatic Virtual Environment." Communications of the ACM 35(6): 64-72.

DeFanti, T. A., et al. (2011). "The Future of the CAVE." Central European Journal of Engineering 1(1): 16-37.

Dodgson, N. A. (2005). "Autostereoscopic 3D Displays." IEEE Computer: 31-36.

Fisher, S. S., et al. (1986). "Virtual Environment Display System." Interactive 3D Graphics: 77-87.

Gibson, E. J., et al. (1959). "Motion Parallax as a Determinant of Perceived Depth." Journal of Experimental Psychology 58(1): 40-51.

Hayden, S., et al. (2014). A Mobile, Low-Cost, Large-Scale, Immersive Data Visualization Environment for Civil Engineering Applications. Brigham Young University.

Hubona, G. S., et al. (1999). "The Relative Contributions of Stereo, Lighting, and Background Scenes in Promoting 3D Depth Visualization." ACM Transactions on Computer-Human Interaction 6(3): 214-242.

Koller, D., et al. (1995). Virtual GIS: A Real-Time 3D Geographic Information System. 6th IEEE Visualization Conference.

Kreylos, O. (2000). "Sample Datasets." http://idav.ucdavis.edu/~okreylos/phdstudies/spring2000/ecs277/datasets.html.

Kreylos, O. (2008). Environment-Independent VR Development. Las Vegas, Nevada.

Kreylos, O. (2013). "How Head Tracking Makes Holographic Displays."

Kruger, W., et al. (1995). "The Responsive Workbench: A Virtual Work Environment." Computer 28(7): 42-48.

Lin, H., et al. (2013). "Virtual Geographic Environments (VGEs): A New Generation of Geographic Analysis Tool." Earth-Science Reviews 126: 74-84.

NSF, et al. (2008). EarthScope Intermountain Seismic Belt LiDAR Project. OpenTopography.

Peterka, T., et al. (2008). "Advances in the Dynallax Solid-State Dynamic Parallax Barrier Autostereoscopic Visualization Display System." IEEE Transactions on Visualization and Computer Graphics 14(3): 487-499.

Ragan, E. D., et al. (2013). "Studying the Effects of Stereo, Head Tracking, and Field of Regard on a Small-Scale Spatial Judgment Task." IEEE Transactions on Visualization and Computer Graphics 19(5): 886-896.

San Diego, City of (2005). San Diego Urban Region LiDAR. City of San Diego.

Sherman, W. R., et al. (2010). "IQ-Station: A Low Cost Portable Immersive Environment." Advances in Visual Computing 6454: 361-372.

Slater, M. (2003). "A Note on Presence Terminology." Presence Connect 3(3).

Sulbaran, T. and N. C. Baker (2000). Enhancing Engineering Education Through Distributed Virtual Reality. Georgia Institute of Technology, IEEE: 13-18.

Waly, A. F. and W. Y. Thabet (2002). "A Virtual Construction Environment for Preconstruction Planning." Automation in Construction 12: 139-154.

Ware, C., et al. (1993). Fish Tank Virtual Reality. Conference on Human Factors in Computing Systems, ACM.