Do-It-Yourself Object Identification Using Augmented Reality for Visually Impaired People

Atheer S. Al-Khalifa 1 and Hend S. Al-Khalifa 2

1 Electronic and Computer Research Institute, King Abdulaziz City for Science and Technology
2 Information Technology Department, College of Computer and Information Sciences, King Saud University, Riyadh, Saudi Arabia
aalkhalifa@kacst.edu.sa, hendk@ksu.edu.sa

Abstract. In this paper, we present a Do-It-Yourself (DIY) application that helps Visually Impaired People (VIP) identify objects in their day-to-day interaction with the environment. The application uses the Layar™ Augmented Reality (AR) API to build a working prototype for identifying grocery items. Initial results show positive acceptance from the VIP community.

Keywords: Augmented Reality, Visually Impaired, Object Identification, Layar™, Assistive Technology.

1 Introduction

Visually Impaired People (VIP) have difficulty identifying day-to-day objects: the outer shape of an object is often not enough to reveal its contents. VIP typically recognize objects by features such as texture, size, or sound, so object identification is a challenging task that depends heavily on the individual's experience.

Many object-identification assistive technologies, both hardware and software, have been built to help VIP recognize surrounding objects by describing the environment. The description is usually delivered in a form suitable for VIP, either spoken or tactile. However, most of these assistive technologies are specialized for a single task (e.g., talking scales); it is preferable to integrate such functionality into devices people already own and use. One of the most popular and widespread technologies today is the smartphone: smartphones are becoming ubiquitous and offer high processing power.
People of all abilities use mobile phones for their daily communication. Smartphones in particular have grown in popularity, their processing power and memory capacity have increased, and at the same time their prices have fallen, making them affordable for individuals with and without disabilities.

K. Miesenberger et al. (Eds.): ICCHP 2012, Part II, LNCS 7383, pp. 560–565, 2012. © Springer-Verlag Berlin Heidelberg 2012
In this paper we present a Do-It-Yourself (DIY) application that helps VIP identify real-world objects using Augmented Reality (AR) technology. AR is defined as merging digitally generated graphics, precisely aligned with the real-world view [1]; the technology can augment not only graphics but also sounds. Several AR frameworks for mobile phones are on the market, both open source and proprietary: Mixare (mixare.org) is an example of an open-source AR framework, while Layar™ (layar.com) is its proprietary counterpart.

The main contribution of this work is a DIY guide for creating a personalized augmented-reality layer for VIP using Layar™. The guide does not require advanced programming skills; any novice programmer can easily implement it.

The rest of the paper is organized as follows: Section 2 sheds light on previous work in object identification for VIP. Section 3 presents the implemented DIY application in detail. Section 4 reports the results of a preliminary evaluation of the application. Finally, Section 5 concludes the paper with the application's limitations and future work.

2 Previous Work

Different mobile applications have been created to help VIP identify objects, whether through image-processing algorithms (e.g., LookTel) or through human identification (e.g., VizWiz). These contributions made a solid start in using mobile phones to help VIP.

LookTel is an example of using image processing to identify objects. It is a visual-assistance platform developed by Sudol et al. for VIP [2]. It performs currency identification, Optical Character Recognition (OCR) on text, landmark and location recognition, and identification of packaged goods and tagged objects.
Tagging objects with unique vinyl stickers is an added function for recognizing problematic objects that lack distinctive features for the scale-invariant feature transform (SIFT) recognition engine. The user applies a pre-trained sticker to an object, for example a medication bottle or a glass jar, then adds it to the system using the mobile phone along with a recorded audio description.

On the other hand, an example of a human-identification system is the VizWiz iPhone application by Bigham et al. [3]. The application assists VIP in their visual environments by letting them ask general questions that are answered by paid human workers. To identify or locate an object, the user takes a picture of the item with the mobile's camera and then records a spoken question. The photo and audio files are uploaded to a server, which posts them as a job for recruited human workers on Amazon Mechanical Turk (www.mturk.com/mturk/welcome). Once a worker identifies the object by speaking its name, the user receives the audio answer on the mobile phone.
3 The Proposed DIY Application

Our proposed AR personalized-helper DIY application uses the Layar™ environment, which provides a platform that supports end users in creating their own real-time AR environments. It creates a layer of digital objects over physical Points of Interest (POIs) that can trigger a series of actions. These POIs can be geo-locations (using GPS), Layar Vision™ objects, or both [4].

Layar Vision™ [4] is a client-side extension of the Layar™ environment that enables visual detection, tracking, and augmentation of real-world objects, based on fingerprints preloaded into the application's layer. When a user points the mobile phone's camera at a Vision POI (a physical object), the Layar™ client detects it instantly and triggers a set of actions by sending a getPOI request to the layer's service provider. This auto-trigger property can be used to create a personalized visual assistant: users create their own layer with objects of their choice that satisfy the specification of a Layar Vision™ POI.

The following is a step-by-step guide to building a VIP personalized helper using Layar™ AR. The application demonstrates a simple case of identifying five grocery items. These objects come in two physical forms with identical attributes (length, width, height, weight, and texture) but different contents: three potato-chip bags (of different flavors) and two milk boxes (low fat and regular). The application's requirements are as follows:

- Layar Vision™, available in version 6.0 of the Layar™ Reality Browser on Android 2.2 and iPhone iOS 4.0 and above;
- a Layar™ developer account;
- a public web server.

After creating a developer account and a new Vision-enabled layer, the five items' images were uploaded as reference images to the Layar™ publishing website. The service then analyzes these images and rates their suitability for recognition by the application.
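When the client recognizes one of these reference images, it issues a getPOI request to the layer's service endpoint, which must answer with a JSON response. As a rough sketch (the field names follow our reading of the Layar™ getPOI API and may differ in detail; the layer name is hypothetical), an empty response has the following skeleton:

```json
{
  "layer": "viphelper",
  "errorCode": 0,
  "errorString": "ok",
  "hotspots": []
}
```

The hotspots array is where each item's POI object is placed.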
These images were taken with a Samsung Galaxy S II. Each image was then edited by cropping the background to the item's canvas using a paint program. Layar™ publishes best practices for creating reference images [5]: a target object's photo must show a flat surface, have its background cropped, and be non-blurry, free of light reflections, and taken from a front angle.

Next, we used the code example published in Make magazine [6], modifying it to fit our experiment. The example contains an index.php page and a JavaScript Object Notation (JSON) file, along with the page header and footer. The index file concatenates the POI (hotspot) of each item into a single JSON response for the layer. Each hotspot object has an ID, an anchor holding the reference-image name, and actions. The ID must be unique for each item, and the reference-image name must be identical to the name assigned to it in the Layar™ publishing website. Since we need the audio description of a specific item to play as soon as its reference image is detected, we modified the Action object of the getPOI JSON response found in each item's file, as shown in Fig. 1.
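A hotspot of this shape, with its action modified to auto-play audio in the spirit of Fig. 1, might look roughly as follows. This is a hedged sketch, not the paper's exact code: the item ID, reference-image name, and audio URL are invented for illustration, and the autoTrigger/autoTriggerOnly flags are our assumptions based on the Layar™ documentation.

```json
{
  "id": "item_1",
  "anchor": { "referenceImage": "chips_salted" },
  "text": { "title": "Potato chips, salted" },
  "actions": [
    {
      "label": "Item description",
      "uri": "http://example.com/audio/chips_salted.mp3",
      "contentType": "audio/mpeg",
      "autoTrigger": true,
      "autoTriggerOnly": true
    }
  ]
}
```

Marking the action as auto-triggered is what lets the description play without the user having to locate and tap a button on screen.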
Fig. 1. The modified action for automatically playing an audio description of an item

Then we modified the main directory's index.php file by adding each POI's JSON response instance to the POIs array. At the same time, we added a small audio icon that appears when an object listed in the Object Dictionary is recognized (Fig. 2).

Fig. 2. The application running on a Samsung Galaxy S II phone (left) and an iPhone 4S (right); the volume icon indicates that the object's name is being spoken

Finally, we linked the created layer to the web server hosting our application code by filling the layer's API endpoint URL field with the path of the index.php file [7].
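With each item's hotspot concatenated into the array, the complete response returned by index.php then takes roughly the following shape (again a hedged sketch with hypothetical names; only two of the five items are shown, and each entry's actions list, omitted here, carries the auto-trigger audio action described above):

```json
{
  "layer": "viphelper",
  "errorCode": 0,
  "errorString": "ok",
  "hotspots": [
    { "id": "item_1", "anchor": { "referenceImage": "chips_salted" }, "actions": [] },
    { "id": "item_4", "anchor": { "referenceImage": "milk_lowfat" }, "actions": [] }
  ]
}
```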
4 Preliminary Evaluation

A group of three VIP was asked to try the application on two smartphones: (1) a Samsung Galaxy S II running Android 2.3.4 and (2) an iPhone 4S running iOS 5. The participants, aged between 20 and 30, all had good experience using mobile phones.

The three VIP were given an introduction explaining how to use the application. They were then asked to perform the recognition tasks over a 3G connection. Afterwards, they answered a set of questions on a scale from strongly agree to strongly disagree. Table 1 shows the responses to the survey questions.

Table 1. Average response of the three VIP (1 = strongly disagree, 5 = strongly agree)

  Usefulness as a grocery-item reader        3
  Ease of pointing to regions of interest    2
  Reliability of the application             4

The survey results show positive feedback on the first and third questions; however, the second question, on ease of pointing, scored below the midpoint. One blind user commented: "the application is really needed, but for me to be able to use it I need to estimate the distance and angle of pointing the mobile camera in order to get the application to recognize the object". This preliminary evaluation suggests that the application is both useful and needed by the VIP community, but that it must be improved in terms of reliability and ease of pointing.

5 Conclusion, Limitations and Future Work

Several assistive technologies aim to help visually impaired people overcome barriers that arise when everyday products and environments are designed without their disabilities in mind. These technologies take different directions: object identification, personal navigation, or the creation of accessible environments. In this paper we presented a DIY object-identification application using the Layar™ Augmented Reality API.
The application builds on a readily available end-user platform to help VIP recognize real-world items. One major limitation of the resulting application is the restricted number of POIs: no more than 50 POIs can be registered in a layer. We also noticed that in low-lighting environments the application cannot identify objects properly, i.e., it returns wrong results for similar objects. Moreover, the auto-trigger feature in Layar™ sometimes fires only once and cannot fire again unless the user reloads the layer. A final limitation concerns the volume of the played audio, which was low on Android compared to iOS 5.
Despite these limitations, creating the DIY application was easy and straightforward and did not require deep technical skills. Our future work will include crowd-sourcing the process of populating the Object Dictionary, by developing a platform that enables ordinary people to send images of objects, along with audio descriptions, from their mobile phones.

References

1. Haller, M., Billinghurst, M., Thomas, B.H.: Emerging Technologies of Augmented Reality: Interfaces and Design. Idea Group Inc. (IGI) (2007)
2. Sudol, J., Dialameh, O., Blanchard, C., Dorcey, T.: LookTel: A comprehensive platform for computer-aided visual assistance. In: 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), pp. 73–80 (2010)
3. Bigham, J.P., et al.: VizWiz: nearly real-time answers to visual questions. In: Proceedings of the 23rd Annual ACM Symposium on User Interface Software and Technology, New York, NY, USA, pp. 333–342 (2010)
4. Layar Reality Browser / Layar Vision Overview, http://layar.pbworks.com/w/page/43908786/layar%20vision%20overview (accessed January 17, 2012)
5. Layar Reality Browser / Reference Image Best Practices, http://layar.pbworks.com/w/page/43909057/reference%20image%20Best%20Practices (accessed January 20, 2012)
6. Layar Augmented Reality for MAKE, vol. 28 (and How to Make Your Own), http://boingboing.net/2011/11/21/layar-augmented-reality-for-ma.html
7. The example's source code, explained: http://faculty.ksu.edu.sa/hend.alkhalifa/documents/resources/AR-example.rar