GESTURE RECOGNITION SOLUTION FOR PRESENTATION CONTROL

Darko Martinovikj, Nevena Ackovska
Faculty of Computer Science and Engineering, Skopje, R. Macedonia

ABSTRACT

Despite the fact that different presentation control techniques exist, the standard mouse and keyboard are still frequently used for presentation control today. Gesture-controlled solutions for presentation control also exist; they are based on motion-sensing devices such as cameras, data gloves, infrared sensors and other similar devices. In this paper we present a gesture recognition solution for presentation control using the Kinect sensor. We included two gestures and studied their characteristics. For better gesture recognition we introduced 5 parameters and determined their values based on real gesture executions.

Keywords: presentation control techniques, Kinect sensor, gesture recognition

I. INTRODUCTION

When a speaker needs to deliver a talk or a lecture on some subject, there is often a previously prepared presentation that contains the most important notes about the subject. Slideshow presentation software is therefore used more and more often during a speech. To be able to control the slideshow, the speaker needs a way to input the required action, so an appropriate presentation control technique must be chosen [1].

A. Presentation control techniques

The most common and widely used presentation control is the standard keyboard and mouse input. However, this technique has some restrictions for the presenter. When the presenter needs to point to some area of the slide and the projection plane is far away from the computer, walking back and forth between the computer and the projection plane is unavoidable. On the other hand, staying close to the computer leads to reduced body language and eye contact with the audience.

Another technique that is emerging today is the usage of remote control devices and smartphones for presentation control. Because these devices have a limited number of buttons, the number of different actions is also limited. This is not a big disadvantage, but as technology advances, more and more new actions become available. New devices in the area of gesture recognition are also gaining popularity, and successful presentation control software has been built using this type of device. One of them is the Kinect sensor.

B. Kinect sensor

The Kinect sensor is an input device for motion sensing and speech recognition, developed by Microsoft [2]. This sensor allows users to control and interact with an application using real gestures and spoken commands. With the arrival of this sensor a new way of human-computer interaction has been introduced, with a huge impact on many industries, including education, robotics, healthcare and beyond. What made Kinect so popular, compared to the other existing sensors for motion tracking, is its low price, its ability to be used with traditional computer hardware, and the existence of developer tools for Kinect application development.

II. ANALYSIS OF EXISTING MOTION-SENSING SOLUTIONS FOR PRESENTATION CONTROL

There are many existing solutions for motion sensing, and some of them have been applied to presentation control. Gesture-controlled solutions for presentation control are usually based on motion-sensing devices such as cameras, data gloves, infrared sensors and other similar devices. Some of these solutions are described in the sequel.

In [3] a system using an infrared laser tracking device is presented.
The presenter uses a laser pointer to make the appropriate gesture, and an infrared tracking device is used for gesture recognition. This system recognises circling gestures around objects, so an easier selection within a slide can be made. Presentation control is achieved by putting appropriate objects in each slide that represent the corresponding actions.

A presentation control solution using data gloves is presented in [4]. In [5] it is shown that a model can be trained with prerecorded gestures and, using a camera, it can track and recognise gesture movements. A system called PowerGesture has been built on this approach for controlling PowerPoint slides with gestural commands.

Our goal was to create an application that tracks and recognises the user's gestures using the Kinect sensor. We successfully created an application called ki-prez for this purpose, and it is explained in the following section.

III. DESCRIPTION OF OUR SOLUTION

A. Ki-Prez as a Kinect application

We created a C# application that uses the Kinect sensor for gesture control. To be able to record gesture movements, the presenter needs to stand 0.8 to 4 meters in front of the sensor. Fig. 1 shows the graphical interface of the application.

Figure 1: Graphical User Interface of ki-prez

When the application is started, an initialization process for the sensor is executed. Handling unexpected events is crucial, because the sensor may be in use by another application at the same time or may even be turned off. Unexpected things can also happen while the application is running, such as the power or connection cables being accidentally pulled out. Therefore we added handling of unexpected events: when an unexpected event happens, the user gets an appropriate message.

The stream data can be acquired in two ways: using events or polling. If events are used, the application subscribes for the data and afterwards the data is delivered continuously. By contrast, the polling method gives the data on demand, when the application requests it. We have used the event-subscription approach, as we track the user's movements continuously during the presentation.

For developers working with the Kinect sensor, libraries that speed up and support the development process are available. One of them is the official Kinect Software Development Kit (Kinect SDK) from Microsoft, which we used in our application [6]. Using this SDK, three streams of data can be acquired: the RGB, depth and skeleton data streams. The RGB data stream gives the colour information for every pixel, while the depth data stream gives the distance between each pixel and the sensor. The skeleton data stream, which is generated by processing the depth stream data, gives the positions of numerous skeleton joints of the users that are in the tracking area of the sensor. The tracked skeleton joints of the user's body are shown in Fig. 2. For gesture detection we have used the skeleton data stream; as the pixel colours are not needed, the RGB data stream was not used.

Figure 2: Tracked skeleton joints of the user's body
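
The initialization and event-based skeleton acquisition described above can be sketched roughly as follows. This is a minimal illustration assuming the Kinect for Windows SDK 1.x API (Microsoft.Kinect); it is not the ki-prez source, the class and handler names are invented, and the single console message stands in for the application's more elaborate handling of unexpected events.

    using System;
    using System.Linq;
    using Microsoft.Kinect;   // Kinect for Windows SDK 1.x

    class SensorSetup
    {
        private KinectSensor sensor;

        // Find a connected sensor, enable the skeleton stream and
        // subscribe to the event that delivers skeleton frames.
        public void Start()
        {
            sensor = KinectSensor.KinectSensors
                                 .FirstOrDefault(s => s.Status == KinectStatus.Connected);
            if (sensor == null)
            {
                // Placeholder for the "unexpected event" messages described above.
                Console.WriteLine("No Kinect sensor connected.");
                return;
            }

            sensor.SkeletonStream.Enable();                  // only the skeleton stream is needed
            sensor.SkeletonFrameReady += OnSkeletonFrameReady;
            sensor.Start();
        }

        // Called continuously while the sensor is running (event-subscription mode).
        private void OnSkeletonFrameReady(object sender, SkeletonFrameReadyEventArgs e)
        {
            using (SkeletonFrame frame = e.OpenSkeletonFrame())
            {
                if (frame == null) return;                   // frame may no longer be available

                Skeleton[] skeletons = new Skeleton[frame.SkeletonArrayLength];
                frame.CopySkeletonDataTo(skeletons);
                // ... pass the tracked skeletons to the gesture recogniser ...
            }
        }
    }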

B. Gesture recognition

For appropriate presentation control we have included two hand gestures: swipe left and swipe right. The practical realisation of these gestures is shown in Fig. 3. We have analysed these two gestures and developed a detection algorithm based on their characteristics.

Figure 3: The two gestures: swipe left and swipe right

Fig. 4 shows the coordinate system of the skeleton joint data. There are three axes, where the Z axis can have only positive values, representing the distance of the joints from the sensor in meters. Fig. 4 also shows the execution of the swipe left gesture, i.e. the position of the left hand joint as time progresses (the entries' indexes are increasing).

Figure 4: The skeleton coordinate system and the swipe left gesture execution

The characteristics of the swipe left gesture that we observed are:

- The x-axis coordinate values decrease as the gesture is executed;
- The y-axis coordinate values stay nearly equal as the gesture is executed;
- The length of the path, formed as the sum of the distances between the points of the gesture, has to exceed a previously defined value;
- The time elapsed between the first and the last tracked point of the gesture has to be within a previously defined allowed range.

The characteristics of the swipe right gesture are the same, except for the first one: the x-axis coordinate values increase (not decrease) as the gesture is executed.

C. Parameter selection

For every characteristic, appropriate parameter values had to be chosen to achieve a better rate of reliability. The parameters that we introduced are:

- X_max: maximal threshold value on the x-axis between two consecutive hand joint data points, expressed in meters, for a recognised gesture;
- Y_max: maximal threshold value on the y-axis between hand joint data points, expressed in meters, for a recognised gesture;
- L_min: minimal length of a recognised swipe gesture, expressed in meters;
- T_min: minimal duration of a recognised swipe gesture, expressed in milliseconds;
- T_max: maximal duration of a recognised swipe gesture, expressed in milliseconds.

For the purpose of determining the parameter values, we collected gesture data from 5 people, who executed the two gestures from different distances in front of the sensor. Interesting conclusions were made. The parameters Y_max and L_min depend on the distance from which the gesture is executed. This is due to the fact that as the user moves away from the sensor, the deviation between two consecutive hand joint data points decreases. For that purpose we introduced formulas for the calculation of these parameters, given in (1) and (2), where D is the distance of the presenter from the sensor, and L_min and Y_max are the minimal length and maximal threshold, respectively.

L_min = 0.8 - 0.2 * (D / 5)    (1)

Y_max = 0.1 - 0.05 * (D / 5)    (2)

The duration of the executed gesture varies mainly with the gender and age of the presenter, so the values of the minimal and maximal duration were chosen accordingly: T_min = 350 ms and T_max = 1500 ms.

To properly calculate the value of X_max we considered the amount of time between the generation of two successive skeleton joint data frames. As the skeleton data is calculated from the depth image data of the sensor, this time varies, depending on the processing power of the computer on which the application is executed. We have tested the application on an Intel Core 2 Duo, 2.4 GHz, and the average time between two successive skeleton data frames is 40 ms. Because T_max is 1500 ms, one gesture can consist of at most 1500/40 = 37.5 ≈ 38 successive skeleton joint data points. So, the last 38 generated skeleton hand joint data points are tracked, and the older ones are discarded. X_max was then calculated from 0.8 (the maximal length of a swipe gesture), 38 (the maximal number of successive joint data points in a gesture) and a factor of 2 that accounts for time variations in getting successive skeleton data, giving X_max ≈ 0.1.
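
A minimal sketch of how these thresholds could be grouped in code, following equations (1) and (2) and the constant values chosen in this section; the class and member names are illustrative and not taken from the ki-prez source.

    // Gesture-recognition thresholds from Section III.C.
    // Only the constants and formulas (1) and (2) come from the paper;
    // the structure and names are an assumption for illustration.
    public class GestureParameters
    {
        public const double XMax  = 0.1;   // m, max x-step between consecutive hand joints
        public const double TMinMs = 350;  // ms, minimal swipe duration
        public const double TMaxMs = 1500; // ms, maximal swipe duration

        public double YMax { get; private set; }   // m, max y deviation, equation (2)
        public double LMin { get; private set; }   // m, minimal swipe length, equation (1)

        // distanceD is the presenter's distance from the sensor in meters (0.8 - 4 m).
        public GestureParameters(double distanceD)
        {
            LMin = 0.8 - 0.2 * (distanceD / 5.0);   // equation (1)
            YMax = 0.1 - 0.05 * (distanceD / 5.0);  // equation (2)
        }
    }

For example, a presenter standing 2 m from the sensor would get L_min = 0.72 m and Y_max = 0.08 m.
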
To detect a swipe gesture, the successive skeleton hand joint data points must be checked; when the data satisfies all of the parameters, a gesture is detected. This is explained in detail in the next two paragraphs.

For keeping the skeleton hand data, two queues are used: one for the right hand joint data (left swipe gesture) and the other for the left hand joint data (right swipe gesture). The maximal number of elements in each queue is 38 (the maximal number of successive joint data points in a gesture). When new skeleton data arrives, the hand joint data is added to the queues. The last two skeleton joint data entries are checked against the parameters X_max and Y_max for both gestures. If these parameters are not satisfied, the data in the corresponding queue is erased. If they are satisfied, the other three parameters, L_min, T_min and T_max, are also checked. If they are satisfied as well, a swipe gesture is detected.

When a gesture is detected, the press of the appropriate keyboard button is simulated. The left swipe gesture corresponds to pressing the left arrow key, and the right swipe gesture to pressing the right arrow key.

D. Choosing the presenter

According to the Kinect SDK [6], the current version supports tracking of only two skeletons simultaneously. We propose that the presenter be chosen according to the minimal Euclidean distance, measured between the skeletal hip center joint and the sensor, as given in (3), where x, y and z represent the coordinates of the skeletal hip center joint of the corresponding skeleton.

min( sqrt(x0^2 + y0^2 + z0^2), sqrt(x1^2 + y1^2 + z1^2) )    (3)

E. Speech commands

The Kinect SDK also allows working with the sensor's audio data. So, in addition to the gestures, we defined a grammar of speech commands for presentation control using the Kinect Speech Recognition Engine [6]. The defined speech commands and their actions are shown in Table 1. The commands simulate pressing the appropriate button, according to the PowerPoint slideshow software.

Table 1: Speech commands and their actions

Command         Action
Next            The presentation goes to the next slide
Previous        The presentation goes to the previous slide
Start           The presentation is started
Continue        The presentation is continued from the currently opened slide
Exit            The presentation is closed
White screen    The screen switches to white
Black screen    The screen switches to black
Audio on        The speech commands can be used again (if they were deactivated)
Audio off       The speech commands are deactivated; only "Audio on" can be used
Show ki-prez    The interface is brought to the foreground
Hide ki-prez    The interface is minimised
Gestures info   The info block with gesture usage and information is shown
Speech info     The info block with speech command usage and information is shown
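
One plausible reading of the queue-based detection described in Sections III.B-III.D is sketched below. It reuses the GestureParameters sketch from Section III.C; the HandSample and SwipeDetector types are invented for illustration, the consecutive-pair test is our interpretation of the X_max/Y_max check, and the arrow-key press is simulated here with System.Windows.Forms.SendKeys, which is only one possible way to emit the key events.

    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Windows.Forms;   // SendKeys, used to simulate arrow-key presses

    // One tracked hand-joint sample: position in meters plus its timestamp.
    struct HandSample
    {
        public double X, Y;
        public DateTime Time;
    }

    // Detects a swipe over the last <= 38 hand-joint samples, as described in Section III.
    class SwipeDetector
    {
        private const int MaxSamples = 38;               // 1500 ms / 40 ms per frame
        private readonly Queue<HandSample> samples = new Queue<HandSample>();
        private readonly GestureParameters p;            // thresholds from Section III.C
        private readonly int direction;                  // -1 = swipe left (x decreasing), +1 = swipe right

        public SwipeDetector(GestureParameters parameters, int swipeDirection)
        {
            p = parameters;
            direction = swipeDirection;
        }

        public void AddSample(HandSample s)
        {
            if (samples.Count > 0)
            {
                HandSample last = samples.Last();
                double dx = (s.X - last.X) * direction;  // movement along the swipe direction
                double dy = Math.Abs(s.Y - last.Y);

                // Consecutive-pair check against X_max and Y_max; on failure the queue is cleared.
                if (dx <= 0 || dx > GestureParameters.XMax || dy > p.YMax)
                {
                    samples.Clear();
                }
            }

            samples.Enqueue(s);
            if (samples.Count > MaxSamples) samples.Dequeue();   // keep only the last 38 samples

            if (IsSwipe())
            {
                SendKeys.SendWait(direction < 0 ? "{LEFT}" : "{RIGHT}");  // simulate the arrow key
                samples.Clear();
            }
        }

        // Checks the remaining parameters: total length L_min and duration within [T_min, T_max].
        private bool IsSwipe()
        {
            if (samples.Count < 2) return false;

            HandSample[] pts = samples.ToArray();
            double length = 0;
            for (int i = 1; i < pts.Length; i++)
                length += Math.Abs(pts[i].X - pts[i - 1].X);

            double durationMs = (pts[pts.Length - 1].Time - pts[0].Time).TotalMilliseconds;
            return length >= p.LMin
                && durationMs >= GestureParameters.TMinMs
                && durationMs <= GestureParameters.TMaxMs;
        }
    }

In this reading, two detectors would be instantiated, one per hand (mirroring the paper's two queues), and fed with hand-joint samples taken from the skeleton frame handler shown earlier.
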
IV. EVALUATION

We have evaluated our gesture-recognition module by measuring the success rate of the executed gestures from different distances to the sensor. The success rate is shown in Fig. 5. It can be seen that a high success rate was achieved at all of the distances. There is a slight degradation in the success rate as the presenter approaches the furthest allowed distance from the sensor, because the skeleton data becomes prone to errors when the presenter is far away from the sensor.

Figure 5: Results of the distance evaluation of the gesture recognition

We have also evaluated the average CPU usage of the application. The testing was done on an Intel Core 2 Duo, 2.4 GHz processor. The results are shown in Fig. 6, and we concluded that the CPU usage mostly depends on the skeleton data. When the Kinect is powered off and no skeleton data is received, the CPU usage is minimal. There is an increase in CPU usage when the Kinect is plugged in but the presenter is not in the viewing range of the sensor. The CPU usage increases further when the presenter is in front of the sensor and the application needs to process the skeleton data.

Figure 6: CPU usage of the application

V. CONCLUSION

In this paper we presented a gesture recognition solution for presentation control. We used the Kinect sensor and utilized it for PowerPoint presentation control. Two gestures were included and their characteristics were studied. We introduced 5 parameters for better gesture recognition and determined their values based on real gesture executions. As future work, we are currently studying well-known models for spatio-temporal data and we plan to build a gesture pattern recogniser using the Kinect sensor.

REFERENCES

[1] X. Cao, E. Ofek and D. Vronay, "Evaluation of Alternative Presentation Control Techniques," CHI '05 Extended Abstracts on Human Factors in Computing Systems, pp. 1248-1251, 2005.
[2] Kinect for Windows. [Online]. Available: http://www.microsoft.com/en-us/kinectforwindows/
[3] K. Cheng and K. Pulo, "Direct Interaction with Large-Scale Display Systems using Infrared Laser Tracking Devices," APVis '03: Proceedings of the Asia-Pacific Symposium on Information Visualisation, vol. 24, pp. 67-74, 2003.
[4] M. Bhuiyan and R. Picking, "Gesture-controlled user interfaces, what have we done and what's next?," Journal of Software Engineering and Applications, vol. 4, no. 9, pp. 513-521, September 2011.
[5] H.-K. Lee and J. H. Kim, "An HMM-based threshold model approach for gesture recognition," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 21, pp. 961-973, October 1999.
[6] J. Webb and J. Ashley, Beginning Kinect Programming with the Microsoft Kinect SDK, 2012.