Tavares, J. M. R. S.; Ferreira, R. & Freitas, F. / Control a 2-Axis Servomechanism by Gesture Recognition using a Generic WebCam, pp. 039-040, International Journal of Advanced Robotic Systems, Volume 2, Number 1 (2005), ISSN 1729-8806

Control a 2-Axis Servomechanism by Gesture Recognition using a Generic WebCam

Tavares, J. M. R. S.; Ferreira, R. & Freitas, F.
Departamento de Engenharia Mecânica e Gestão Industrial, Faculdade de Engenharia da Universidade do Porto, Portugal
ricardoferreira76@iol.pt, tavares@fe.up.pt, ffreitas@fe.up.pt

Abstract: This paper presents two approaches used to control a 2-axis servomechanism by gesture recognition. For both approaches, the adopted control philosophy was the following: in the system's learning phase, the control images are acquired by a generic webcam and associated with the desired orders; in the working phase, each control image is acquired by the same webcam and, by comparison with the stored order images, the desired order is recognized and processed by the servomechanism in quasi real time. In this paper, the servomechanism used and both implemented approaches are described, and some advantages and weaknesses of each one are indicated. Two examples of control image sets are also presented, and some conclusions and future work are addressed.

Keywords: computational vision, gesture recognition, object moments, orientation histograms, servomechanism control.

1. Introduction

The main objective of this work was the development of a control system for a servomechanism, equipped with two linear motion axes, based on gesture recognition. For that purpose, we decided to use a generic webcam for the image acquisition process, and the following implementation philosophy: in the first phase, the learning phase, the control system associates each hand gesture image with the desired control order.
In the next phase, the working one, by comparing the acquired gesture image with the ones previously considered in the learning phase, the control system processes the desired order.

In the first approach considered for the servomechanism's vision control system, we identify the control orders by calculating the moments associated with the control object, (Awcock, G. & Thomas, R., 1995), (Jain, R.; Kasturi, R.; Schunk, B. & Brian, G., 1995). This approach, although simple, did not allow the distinction of a satisfactory number of orders. To solve that problem, we then implemented a different approach based on orientation histograms, (Freeman, W. & Roth, M., 1995), (Freeman, W.; Tanaka, K.; Ohta, J. & Kyuma, K., 1996), (Freeman, W.; Anderson, D.; Beardsley, P.; Dodge, C.; Roth, M.; Weissman, C.; Yerazunis, W.; Kage, H.; Kyuma, K.; Miyake, Y. & Tanaka, K., 1998), (Freeman, W.; Beardsley, P.; Kage, H.; Tanaka, K.; Kyuma, K. & Weissman, C., 1999). This second approach, although more complex than the first, is still very easy to implement, demands little in computational resources, and allows a reasonable number of orders. Both approaches are described in this paper.

The paper is organized in the following way: in the next section, the servomechanism used is briefly presented. In the third section, we describe the interface developed to control and monitor the servomechanism through a personal computer, and then both approaches implemented for its vision control system: the one based on object moments and the one based on orientation histograms; at the end of that section, we present two examples of satisfactory control image sets. In the fourth, and last, section of this paper, some conclusions and possible future work perspectives are indicated.

2. Servomechanism Description

The 2-axis servomechanism used is composed of a static structural base and a dynamic one, the working table, Fig. 1.
Two hydraulic cylinders, one vertical and one horizontal, with strokes of 600 mm and 350 mm respectively, move this table.
Fig. 1. Servomechanism used in this work

The servomechanism is also equipped with four sensors whose function is to check each cylinder's limits. To enable the interface between the control personal computer and the servomechanism's command system, an AX757 board, which has eight digital signal inputs and eight relays, was used. In this work, the digital signal inputs were used to monitor the four sensors referred above, and two of the eight relays were used in the servomechanism command. The AX757 board allows the sending and receiving of electrical signals between the computer and the hydraulic pump's motor and the loading valve. To send the command signals from the computer, we used an AX5411 board, of which only the digital capabilities were needed, because no speed variation was involved.

As can be seen in Fig. 1, the servomechanism was already equipped with a previously developed user interface element, which allows its manual control. The main objective of this work was to add to this servomechanism a fully automatic control system based on gesture recognition.

3. Servomechanism Control

The first step of this work was the development of a friendly user interface that allows the total control and monitoring of the servomechanism through a personal computer. This interface was developed for Microsoft Windows platforms, (Richter, J., 1998), using the Microsoft Visual C++ programming environment, (Young, M., 1998), and integrated in an already existing generic image processing software package, (Tavares, J., 2000). In Fig. 2 we present the developed interface for the servomechanism computer control. Through this interface, we can check the sensors' state, turn on/off the hydraulic central motor, control the loading valve, move each cylinder, change the speed level (fast/slow), and trigger an emergency stop.

Fig. 2. Developed interface for the servomechanism's control by a personal computer

The next step of this work was the development of the control system based on gesture recognition. The two implemented approaches are presented in the following subsections.

3.1. Object Moments

In the image analysis domain, the use of object moments to characterize objects in position, size and orientation, and therefore recognize them, is very common, (Awcock, G. & Thomas, R., 1995), (Jain, R.; Kasturi, R.; Schunk, B. & Brian, G., 1995). As our problem is indeed a recognition problem, we initiated the development of the vision control system using this approach. Thus, to compute the object's area A (the zero order moment), the object's centroid coordinates (x̄, ȳ) (from the first order moments), and the object's elongation axis orientation θ (from the second order moments), we use:

A = Σi Σj B[i, j], (1)

x̄ = (1/A) Σi Σj j B[i, j], (2)

ȳ = (1/A) Σi Σj i B[i, j], (3)

θ = ½ arctan(b / (a − c)), (4)

where the sums run over the image's dimensions (n, m), and a, b and c, the object's second order central moments, are defined as:

a = Σi Σj (j − x̄)² B[i, j], (5)

b = 2 Σi Σj (j − x̄)(i − ȳ) B[i, j], (6)

c = Σi Σj (i − ȳ)² B[i, j], (7)

with B[i, j] set to 1 (one) if the pixel belongs to the object, or to 0 (zero) if not.
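As an illustration, the moments method of Section 3.1 can be sketched in Python as follows; this is a minimal sketch of our own (the original system was implemented in Visual C++), and all function and variable names here are ours:

```python
import math

def object_moments(B):
    """Area, centroid and elongation axis orientation of a binary image B,
    given as a list of rows with B[i][j] = 1 for object pixels, 0 otherwise."""
    n, m = len(B), len(B[0])
    pixels = [(i, j) for i in range(n) for j in range(m) if B[i][j]]
    # Zero order moment: the object's area A.
    A = len(pixels)
    # First order moments: the centroid coordinates.
    x_bar = sum(j for i, j in pixels) / A
    y_bar = sum(i for i, j in pixels) / A
    # Second order central moments a, b and c.
    a = sum((j - x_bar) ** 2 for i, j in pixels)
    b = 2 * sum((j - x_bar) * (i - y_bar) for i, j in pixels)
    c = sum((i - y_bar) ** 2 for i, j in pixels)
    # Orientation of the elongation axis, reduced to [0, 180) degrees.
    theta = 0.5 * math.atan2(b, a - c)
    return A, (x_bar, y_bar), math.degrees(theta) % 180.0
```

For instance, a vertical bar of object pixels yields an orientation of 90º and a horizontal one 0º, in accordance with the examples of Fig. 3; since the orientation always falls between 0 and 180º, distinct gestures must differ within that range, which is the limitation noted in the text.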
In Fig. 3 we present two examples of the geometric properties of objects obtained using the moments method formulated above. The results are, as can be seen, satisfactory.

Fig. 3. Determination of two objects' geometric properties using the moments method: a) properties of a vertical object; b) properties of a horizontal object

However, although simple, this method presents some disadvantages: the control object must be represented by a single region, preferably of rectangular shape; the image must be binary; and, because the obtained orientation is always between 0 and 180º, the number of possible control orders is reduced.

3.2. Orientation Histograms

To overcome the problems associated with the moments method, previously described, we implemented for the vision control system a second methodology, based on orientation histograms, (Freeman, W. & Roth, M., 1995), (Freeman, W.; Tanaka, K.; Ohta, J. & Kyuma, K., 1996), (Freeman, W.; Anderson, D.; Beardsley, P.; Dodge, C.; Roth, M.; Weissman, C.; Yerazunis, W.; Kage, H.; Kyuma, K.; Miyake, Y. & Tanaka, K., 1998), (Freeman, W.; Beardsley, P.; Kage, H.; Tanaka, K.; Kyuma, K. & Weissman, C., 1999). This methodology consists in calculating the orientation histogram of each acquired control image and comparing it with each histogram considered in the learning phase (determined from the images corresponding to the desired orders). In Fig. 4 and 5 the working diagrams of the vision control system based on the orientation histograms method are represented, for the learning and working phases respectively.

For each 256 grey level control image, we calculate the orientation histogram and store it in a 16-component vector. Then, in order to even out its components, this vector is smoothed. The comparison of the current order's image with the ones considered in the learning phase is obtained by computing the differences between the histogram vectors. The orientation of the pixel with coordinates (i, j) is:

θ[i, j] = arctan(di, dj), (8)

with

di = B[i, j] − B[i−1, j], (9)

dj = B[i, j] − B[i, j−1], (10)

and B[i, j] equal to the pixel's grey level (generally in [0, 255]).

Fig. 4. Learning phase of the vision servomechanism's control based on the orientation histograms method
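A minimal Python sketch of this computation follows, assuming 16 orientation bins over [0, 2π) and a three-component moving average for the smoothing; the paper does not detail the exact binning and smoothing scheme, so those choices, and all names, are our assumptions:

```python
import math

def orientation_histogram(B, bins=16):
    """16-component orientation histogram of a grey level image B
    (list of rows; B[i][j] in [0, 255]), following eqs. (8)-(10)."""
    n, m = len(B), len(B[0])
    hist = [0] * bins
    for i in range(1, n):
        for j in range(1, m):
            di = B[i][j] - B[i - 1][j]   # vertical difference, eq. (9)
            dj = B[i][j] - B[i][j - 1]   # horizontal difference, eq. (10)
            if di == 0 and dj == 0:
                continue                 # no gradient, no defined orientation
            theta = math.atan2(di, dj) % (2 * math.pi)  # eq. (8)
            hist[int(theta / (2 * math.pi) * bins) % bins] += 1
    # Smooth the vector by averaging each component with its two neighbours.
    return [(hist[k - 1] + hist[k] + hist[(k + 1) % bins]) / 3
            for k in range(bins)]

def histogram_distance(h1, h2):
    """Difference between two histogram vectors (sum of absolute differences)."""
    return sum(abs(x - y) for x, y in zip(h1, h2))
```

An image containing only a vertical edge, for example, concentrates all its counts around the horizontal gradient orientation, and the smoothing then spreads them over the neighbouring components.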
In the orientation histogram calculations, we only consider the pixels which have an intensity level above a predetermined value, thus neglecting noise pixels, and a contrast value higher than a certain predefined value, thus neglecting pixels from areas with reduced meaning.

In Fig. 6 and 7 we can see the interfaces implemented for the vision control system in its learning and working phases.

Fig. 6. Developed interface for the servomechanism's vision control system in the learning phase. (A command order is associated with the visible image by choosing the corresponding button)

Fig. 7. Developed interface for the servomechanism's vision control system in the working phase, in manual processing mode. (In the left window the currently captured image is visible, and in the right one the last image processed)

The considered control orders were the following: stop movement, move to the left, move to the right, move up, move down, and movement speed change (fast/slow). In Fig. 8 and 9 two examples of successful order image sets are presented.
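The recognition step of the working phase, matching the current image's histogram against those stored in the learning phase, can be sketched as follows; the order names, the distance measure and the rejection threshold value are illustrative assumptions, not values from the paper:

```python
def recognize_order(current_hist, learned, reject_threshold=2.0):
    """Return the stored order whose histogram is closest to current_hist,
    or None when even the best match differs too much (order rejected).
    `learned` maps order names, e.g. 'move left', to histogram vectors."""
    def dist(h1, h2):
        # Difference between histogram vectors: sum of absolute differences.
        return sum(abs(x - y) for x, y in zip(h1, h2))
    best = min(learned, key=lambda order: dist(current_hist, learned[order]))
    if dist(current_hist, learned[best]) > reject_threshold:
        return None
    return best
```

In the automatic working mode, a rejected image (a None result) simply leaves the servomechanism in its current state.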
Fig. 8. First example of a satisfactory working order image set for the vision control system based on the orientation histograms method

Fig. 9. Second example of a satisfactory working order image set for the vision control system based on the orientation histograms method

In the implementation done, the vision control system, when in automatic working mode, is constantly acquiring images and cyclically, at predetermined time intervals, interprets the active image and processes the associated working order, Fig. 10. When there is a considerable difference between the processed image's orientation histogram and the ones stored in the learning phase, the system rejects the associated order. The four sensors are also monitored cyclically, also at predetermined time intervals, and inhibit, or not, the respective cylinder's movement.

Fig. 10. Servomechanism's vision control system based on the orientation histograms method in the automatic working mode

4. Conclusions and Future Work

In this paper, we presented a servomechanism control system based on gesture recognition. We have described the 2-axis servomechanism used, presented the interface developed to control and monitor it by a personal computer, and described both approaches implemented for the vision control system. The first approach considered, based on object moments, presents simplicity and low computational cost; however, it also presents some disadvantages, mainly the reduced number of possible control orders. The second approach considered, based on orientation histograms, overcomes this problem successfully. In Table 1, we present some of the advantages and disadvantages of each approach considered.

During the several experimental tests done, we concluded that the methodology based on orientation histograms presents two main advantages: implementation simplicity and execution quickness. This approach works in a satisfactory manner controlling the servomechanism used, and could be considered in other kinds of friendly interfaces: games, computerized applications, home systems, robotic systems, remote controls, etc. However, we also found that this approach presents some limitations: as the webcam used does not compensate for light changes, the control system does not react the same way if those variations are considerable. Another problem, with the current control system version, is related to the control object's size and how it dominates the image. This last problem is augmented by the fact that the webcam used does not have an auto-focus system.

Moments: common in computational vision to recognize objects; simple; low computational cost; requires a single region, preferably of rectangular shape; requires binary images; reduced number of possible control orders.

Orientation Histograms: simple (although more complex than the moments method); low computational cost (although heavier than the moments method); the whole image is used; works on grey level images; higher number of possible control orders; more adjustable to the working environment (several control parameters available).

Table 1. Advantages and disadvantages of each approach used for the vision control system
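As a summary, the automatic working mode described above amounts to a polling loop of the following shape; all four callables are hypothetical stand-ins for the real acquisition, recognition, sensor and command routines of the system:

```python
import time

def automatic_mode(acquire_image, recognize, read_sensors, execute,
                   period=0.5, cycles=None):
    """Cyclically acquire the active image, interpret it and process the
    associated order; orders inhibited by the limit sensors are skipped.
    `recognize` returns an order name or None (rejected image); `read_sensors`
    returns the set of orders currently inhibited by the four limit sensors.
    With cycles=None the loop runs forever; an integer bounds it (for tests)."""
    done = 0
    while cycles is None or done < cycles:
        order = recognize(acquire_image())
        if order is not None and order not in read_sensors():
            execute(order)
        time.sleep(period)  # predetermined time interval between cycles
        done += 1
```

The period of the loop corresponds to the predetermined time interval referred to in the text; the sensors are read on every cycle so that a cylinder at its limit cannot receive a further movement order.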
For future work, to make the vision control system based on orientation histograms more robust to the problems previously referred, we can suggest: the tracking of the control object through the image sequence using, for example, Kalman filters, (Maybeck, P., 1979), (Tavares, J., 1995), (Tavares, J. & Padilha, A., 1995a), with active contours, (Bascle, B. & Deriche, R., 1992), (Blake, A. & Isard, M., 1998), (Kass, M.; Witkin, A. & Terzopoulos, D., 1987), (Tavares, J., 2000), as suggested in (Blake, A.; Curwen, R. & Zisserman, A., 1993), (Delagnes, P.; Benois, J. & Barba, D., 1995); and the usage of a more sophisticated camera, together with the comparison of the new control system's behaviour with the one obtained with the current webcam. This camera upgrade, by itself, should make the implemented vision control system more robust.

5. References

Awcock, G. & Thomas, R. (1995). Applied Image Processing, McGraw-Hill International Editions

Bascle, B. & Deriche, R. (1992). Features Extraction using Parametric Snakes, IEEE 11th International Conference on Pattern Recognition, Netherlands

Blake, A. & Isard, M. (1998). Active Contours, Springer-Verlag

Blake, A.; Curwen, R. & Zisserman, A. (1993). A Framework for Spatiotemporal Control in the Tracking of Visual Contours, International Journal of Computer Vision, 11(2), pp. 127-145

Delagnes, P.; Benois, J. & Barba, D. (1995). Active contours approach to object tracking in image sequences with complex background, Pattern Recognition Letters, Vol. 16, pp. 171-178

Freeman, W. & Roth, M. (1995). Orientation histograms for hand gesture recognition, IEEE Intl. Workshop on Automatic Face and Gesture Recognition, Zurich

Freeman, W.; Tanaka, K.; Ohta, J. & Kyuma, K. (1996). Computer Vision for Computer Games, IEEE 2nd International Conference on Automatic Face and Gesture Recognition, Killington, USA

Freeman, W.; Anderson, D.; Beardsley, P.; Dodge, C.; Roth, M.; Weissman, C.; Yerazunis, W.; Kage, H.; Kyuma, K.; Miyake, Y. & Tanaka, K. (1998). Computer Vision for Interactive Computer Graphics, IEEE Computer Graphics and Applications, Vol. 18, No. 3, pp. 42-53, (May-June 1998)

Freeman, W.; Beardsley, P.; Kage, H.; Tanaka, K.; Kyuma, K. & Weissman, C. (1999). Computer Vision for Computer Interaction, SIGGRAPH Computer Graphics magazine, (November 1999)

Jain, R.; Kasturi, R.; Schunk, B. & Brian, G. (1995). Machine Vision, McGraw-Hill International Editions, Computer Science Series

Kass, M.; Witkin, A. & Terzopoulos, D. (1987). Snakes: Active contour models, International Journal of Computer Vision, Vol. 1, pp. 321-331

Maybeck, P. (1979). Stochastic Models, Estimation, and Control, Vol. I, Mathematics in Science and Engineering, Vol. 141, Academic Press

Richter, J. (1998). Advanced Windows, Microsoft Press

Tavares, J. (1995). MSc Thesis: Obtenção de Estrutura Tridimensional a Partir de Movimento de Câmara, Faculdade de Engenharia da Universidade do Porto

Tavares, J. & Padilha, A. (1995a). Matching lines in image sequences with geometric constraints, RecPad'95 - 7th Portuguese Conference on Pattern Recognition, Aveiro, Portugal

Tavares, J. (2000). PhD Thesis: Análise de Movimento de Corpos Deformáveis usando Visão Computacional, Faculdade de Engenharia da Universidade do Porto

Young, M. (1998). Mastering Microsoft Visual C++ 6, Sybex