68 International Journal "Information Theories & Applications" Vol.11

PRACTICAL, COMPUTATION EFFICIENT HIGH-ORDER NEURAL NETWORK FOR ROTATION AND SHIFT INVARIANT PATTERN RECOGNITION

Evgeny Artyomov and Orly Yadid-Pecht

Abstract: In this paper, a modification for the high-order neural network (HONN) is presented. Third-order networks are considered for achieving translation, rotation and scale invariant pattern recognition. They require, however, much storage and computation power for the task. The proposed modified HONN takes into account a priori knowledge of the binary patterns that have to be learned, achieving significant gain in computation time and memory requirements. This modification enables the efficient computation of HONNs for image fields greater than 100 x 100 pixels without any loss of pattern information.

Keywords: HONN, higher-order networks, invariant pattern recognition.

1. Introduction

Invariant pattern recognition using neural networks was found to be attractive due to its similarity to biological systems. There are three different classes of neural network approaches to invariant pattern recognition [1], which differ in the way invariance is achieved: Invariance by Training [2], Invariant Feature Spaces, and Invariance by Structure; good examples of the latter are the Neocognitron and the HONN [3]. In third-order networks, which are a special case of the HONN, invariance is built into the network structure, which enables fast network learning with only one view of each pattern presented at the learning stage. However, the exponentially growing number of interconnections in the network prevents its usage for image fields larger than 18 x 18 pixels [3]. A few different solutions were proposed to minimize the number of HONN interconnections: weight sharing by similar triangles [3]; weight sharing by approximately similar triangles [4]-[5]; coarse coding [6]; non-fully interconnected HONNs [7]. All these methods partially solve the problem of the HONN interconnections but do not help with larger images.
Consequently, the research community in the field of invariant pattern recognition largely abandoned the HONN method. In this paper, a modification for the third-order network is described. The proposed modification takes into account a priori knowledge of the binary patterns that must be learned. By eliminating idle loops, the network achieves significant reductions in computation time as well as in memory requirements for network configuration and weight storage. Better recognition rates (compared to conventionally constructed networks with the same input image field) are attained by the introduction of a new approximately equal triangles scheme for weight sharing. The modified configuration enables efficient computation of image fields larger than 100 x 100 pixels without any loss of image information, an impossible task with any previously proposed algorithm.

2. HONN Architecture

The following equation describes the output of a third-order network:

y_i = f( Σ_jkl w_ijkl x_j x_k x_l ), (1)

where w_ijkl is the weight associated with a particular triangle, y_i is the output, x is a binary input, and j, k and l are the indices of the inputs. A schematic description of this network is shown in Fig. 1.
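For concreteness, equation (1) can be sketched directly in code. This is an illustrative reconstruction, not the authors' implementation; the hard-limit activation and the triple-indexed weight dictionary are assumptions made for the sketch:

```python
from itertools import combinations

def f(net):
    # hard-limit activation (one common choice; the paper does not fix f)
    return 1 if net > 0 else 0

def honn_output(x, w):
    """Naive third-order output per eq. (1): y = f(sum over j<k<l of w_jkl x_j x_k x_l).
    x is the flattened binary input field; w maps an index triple to its shared weight."""
    net = 0.0
    for j, k, l in combinations(range(len(x)), 3):
        if x[j] and x[k] and x[l]:        # products containing a 0 pixel vanish
            net += w[(j, k, l)]
    return f(net)
```

Note how any triple containing a zero pixel contributes nothing to the sum; this is the property the modified network of Section 3 exploits.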
Fig.1. Schematic description of a third-order network

In the training phase, a perceptron-like rule is used:

Δw_ijkl = η (t_i − y_i) x_j x_k x_l, (2)

where t_i is the expected training output, y_i is the actual output, η is the learning rate and x is a binary input. The number of triangles (NoT) can be calculated with the following equation:

NoT = IN! / ((IN − 3)! 3!), (3)

where IN is the number of input nodes. For image fields of 100 x 100 and 256 x 256 pixels, the number of triangles will be 1.6662×10^11 and 4.6910×10^13, respectively. As can be seen, the number of triangles grows very fast, beyond the limits of any current hardware. A few techniques to reduce the number of weights have been proposed in the literature (as described in Section 1), but they do not reduce computation time. The problem of large computational demands arises since the network is constructed in the pre-processing stage, before the learning phase. At this stage, all possible triangles are computed and pointers to the weights are saved [8]. In addition to the pointers, the weight array is also stored. At least two memory bytes are required for each pointer. If, for example, an input field of 100 x 100 pixels is given, the total number of bytes required to store the entire vector of pointers is 3.3324×10^11 bytes. The memory and computation requirements are enormous. To work with large input patterns, significant network modifications are required.

3. The Proposed Modified HONN Method

As noted before, the input pattern is binary: an edge or contour pixel has the value 1 and all other pixels have the value 0. As can be seen from equation (1), each product with a pixel value of 0 will give 0 as a result. This means that only active triangles (in which all pixels belong to an object contour) will influence the result. In addition, the weights that belong to the inactive triangles will not be updated and will keep a zero value during the learning process.
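The growth described by equation (3) is easy to verify, since NoT is simply the binomial coefficient C(IN, 3). A small sketch (the function name is our own):

```python
from math import comb

def number_of_triangles(side):
    """NoT = IN! / ((IN - 3)! 3!) = C(IN, 3) for a binary field of side x side pixels."""
    in_nodes = side * side
    return comb(in_nodes, 3)

for side in (18, 100, 256):
    not_ = number_of_triangles(side)
    pointer_bytes = 2 * not_          # at least two bytes per weight pointer
    print(f"{side} x {side}: NoT = {not_:.4e}, pointer storage = {pointer_bytes:.4e} bytes")
```

Running this reproduces the figures quoted above: about 1.6662×10^11 triangles for a 100 x 100 field and 4.69×10^13 for 256 x 256.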
Following this observation, the network can be modified and all inactive triangles can be disregarded during the construction phase, which eliminates the idle loops from the computation. With this modification, the network configuration strictly depends on the input patterns that have to be learned. In addition, to improve network performance with regard to rotation, distortion and the number of learned classes, we introduce an approximately equal triangles scheme for network construction. This scheme, in addition to the approximately similar triangles scheme presented in [4] for weight sharing, adds triangle area equality. This means that approximately similar triangles with approximately equal areas will share the same weight.

3.1 The Proposed Network Construction

The modified algorithm for network construction can be described as follows:
1. Load all patterns that must be learned.
2. Run through each image and save the coordinates of the contour (boundary) pattern pixels to the
different arrays. A set of such arrays is shown in Fig. 2.
3. Compute the angles of all presented triangles and classify them in order to associate each with a particular weight. Indices X_nm, Y_nm and n_ij correspond to pattern number (n), pixel number (m), weight index (i) and pattern number (j). The variable n_ij is the number of triangles from the particular pattern that correspond to the particular triangle weight index (class).

Fig.2. Arrays of the pixel coordinates of the object contours
Fig.3. General array for classifying triangles by weight index.

The presented method of classification is based on approximately equal triangles. For the association of a triangle with a particular weight, the sets of possible values of the two smallest triangle angles (α, β) and the triangle area (S) are partitioned into bins defined by:

(k−1)w ≤ α < kw, (l−1)w ≤ β < lw, (m−1)s ≤ S < ms, (4)

where w is an angular tolerance, α and β are the smallest angles of the triangle such that α < β, s is a triangle area tolerance, S is the area of the computed triangle, and k, l and m are the bin indices associated with the two angles and the triangle area, respectively. During the classification step, each triangle class is associated with a corresponding weight and is represented by three variables k, l, and m. The array of triangle classes is constructed as shown in Fig.3. After construction, only the array of triangle classes presented in Fig.3 must be stored in memory.

3.2 Network Training

The previously constructed array of triangle classes (Fig. 3) is used as the basis for learning in the training phase. In addition, a zero matrix of weights (W) with the size NoP x NoW is constructed, where NoW is the number of individual weights and NoP is the number of training patterns. Output computation takes into account only the information presented in the weights array (W) and in the triangle array (N) (Fig. 3).
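The binning of equation (4) can be sketched as follows. This is an illustrative reconstruction rather than the authors' code: the default tolerance values, and the use of the law of cosines and the shoelace formula to obtain the angles and area, are our own assumptions.

```python
import math

def triangle_class(p1, p2, p3, w=math.pi / 180, s=10.0):
    """Map a triangle of contour-pixel coordinates to its class (k, l, m) per eq. (4).
    w is the angular tolerance, s the area tolerance (hypothetical defaults).
    Assumes the three pixels are not collinear."""
    # side lengths
    a = math.dist(p2, p3)
    b = math.dist(p1, p3)
    c = math.dist(p1, p2)
    # angles via the law of cosines
    A = math.acos((b * b + c * c - a * a) / (2 * b * c))
    B = math.acos((a * a + c * c - b * b) / (2 * a * c))
    C = math.pi - A - B
    alpha, beta, _ = sorted((A, B, C))    # alpha <= beta: the two smallest angles
    # area via the shoelace formula
    S = abs((p2[0] - p1[0]) * (p3[1] - p1[1])
            - (p3[0] - p1[0]) * (p2[1] - p1[1])) / 2.0
    # bin indices: (k-1)w <= alpha < kw, (l-1)w <= beta < lw, (m-1)s <= S < ms
    k = int(alpha // w) + 1
    l = int(beta // w) + 1
    m = int(S // s) + 1
    return k, l, m
```

For example, a 3-4-5 right triangle with vertices (0,0), (4,0), (0,3) has smallest angles of roughly 36.9° and 53.1° and an area of 6, so with the defaults above it falls into bins (37, 54, 1).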
The output follows the next equation for a particular input image:

y_ik = f( Σ_j w_ij n_jk ), (5)

where i is the output index, j is the weight index, k is the pattern index, w_ij is the weight, and n_jk is the number of triangles that correspond to the particular triangle class (i.e., the particular weight). All weights are updated only once after each iteration, according to the next equation:

Δw_ij = η (t_i − y_i), if n_jk > 0; Δw_ij = 0, if n_jk = 0, (6)

where t_i is the expected training output, y_i is the actual output, and η is the learning rate.
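The training phase thus amounts to a perceptron-style loop over the compact triangle-count representation instead of the raw pixels. A minimal sketch, under our own assumptions of one output node per trained pattern and a hard-limit activation:

```python
def f(net):
    # hard-limit activation (assumed; the paper does not fix f)
    return 1 if net > 0 else 0

def train(N, eta=1.0, max_iter=100):
    """N[k][j]: number of triangles of class j found in training pattern k.
    Returns the NoP x NoW weight matrix W; output i should fire only
    when its own pattern is presented."""
    nop, now = len(N), len(N[0])
    W = [[0.0] * now for _ in range(nop)]
    for _ in range(max_iter):
        converged = True
        for k in range(nop):                  # present pattern k
            for i in range(nop):              # each output node i
                y = f(sum(W[i][j] * N[k][j] for j in range(now)))   # eq. (5)
                t = 1 if i == k else 0
                if y != t:
                    converged = False
                    for j in range(now):
                        if N[k][j] > 0:       # eq. (6): update only active classes
                            W[i][j] += eta * (t - y)
        if converged:
            break
    return W
```

Because each pattern touches only the triangle classes it actually contains, a single epoch costs O(NoP² × NoW) rather than scaling with the number of raw pixel triples.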
After the training phase is complete, only the array of learned weights and the corresponding coefficients k, l, and m that represent the equivalence classes (from the upper three rows of the array in Fig. 3) must be saved.

3.3 Recognition

The algorithm for recognition can be described as follows:
1. Load the pattern intended for recognition.
2. Construct an array of coordinates of contour pixels (as in the construction stage).
3. Construct a zero matrix (N) with size 1 x NoW. This is a counter for triangles in the image that correspond to a particular weight.
4. Run through the coordinate array and compute coefficients k, l and m for all possible triangles as described in 3.1. After each computation, compare the newly found k, l and m with the ones previously saved (upper part of the array from Fig.3). If a matching class for the triangle is found, the counter corresponding to that triangle class position is increased by one (n_j = n_j + 1). Thus, during classification the nonzero one-dimensional matrix of counters (N) is built.
5. Compute outputs according to equation (5), using the weights array (W) built during the construction phase and the triangle counters (N) built at the beginning of the recognition phase.

4. Experimental Results

To study the performance of the modified network and compare computational resources with the conventional network, seven different object classes with 60 x 60 and 170 x 170 pixels were prepared. One object from each class was used in the training phase and 14 rotated patterns of each class were used in the recognition phase. Pattern examples are shown in Fig.4.

Fig.4. Pattern examples

The comparison of computational resource demands for the 60 x 60 and 170 x 170 input fields is presented in Table 1. As can be seen from the table, the gain achieved with the modified network in the number of computational steps is four orders of magnitude for an input field of 60 x 60 and five orders of magnitude for an input field of 170 x 170. This gain will become more significant as image size increases.
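The counting step of the recognition phase can be sketched as follows; the `triangle_class` helper and the layout of the saved class list are hypothetical stand-ins for the structures of Fig. 3:

```python
from itertools import combinations

def count_triangles(contour_pixels, classes, triangle_class):
    """Build the 1 x NoW counter matrix N over all contour-pixel triangles.
    classes: saved (k, l, m) tuples, one per weight index.
    triangle_class: maps three pixel coordinates to their (k, l, m) bin."""
    index = {klm: j for j, klm in enumerate(classes)}
    N = [0] * len(classes)
    for p1, p2, p3 in combinations(contour_pixels, 3):
        klm = triangle_class(p1, p2, p3)
        j = index.get(klm)
        if j is not None:                 # matching class found: n_j = n_j + 1
            N[j] += 1
    return N
```

Triangles whose (k, l, m) bin was never seen during construction are simply discarded, which is what keeps the counter array at the fixed size NoW regardless of image size.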
In addition, the memory resources are also minimized.

Table 1: Comparison of computational resource demands (the approximately similar triangles scheme is used alone; the network was trained for the first five pattern classes; w = π/180; m not used).

Input field size              | 60 x 60                   | 170 x 170
Network type                  | Conventional | Modified   | Conventional | Modified
Computational steps (number)  | 7.8×10^9     | 13.8×10^5  | 4.02×10^12   | 4.5×10^7
Total memory (bytes)          | 15.5×10^9    | 81340      | 8.04×10^12   | 81340

Table 2: Recognition rate (%) for a varying number of trained classes and angular/area tolerances. Input pattern: 60 x 60 pixels.

Angular (w) | Area (S) | 2   | 3  | 4  | 5  | 6  | 7  | Weight number
π/60        | 10       | 100 | 95 | 94 | 84 | 80 | 80 | 17286
π/60        | 20       | 100 | 95 | 95 | 85 | 80 | 80 | 8935
π/20        | 10       | 100 | 95 | 91 | 87 | 82 | 80 | 2502
π/20        | 20       | 100 | 95 | 88 | 80 | -  | -  | 1217
Table 3: Recognition rates (%) of the net with the approximately similar triangles scheme alone. Input pattern: 60 x 60 pixels.

Angular tolerance (w) | 2   | 3  | 4  | 5  | Weight number
π/225                 | 100 | 80 | 75 | 60 | 8533

For comparison with the approximately similar triangles scheme, a few results are provided in Table 3. Only results for the best configuration are shown, and even these show much worse recognition rates. The cause is that similar triangles with very large differences in size are associated with the same triangle class; as a result, some object classes are associated with the same triangle class, preventing each object from having an individual triangle set associated with it. From the experimental data provided, it can be seen that our method enables large input field computation without significant resource demands. Translation invariance is built into the network structure; thus 100% translation invariance is achieved. All experimental data are provided for this particular data set. For other data sets, where object classes differ significantly in size and form, much better recognition results can be achieved.

5. Conclusions

A modified High-Order Neural Network for efficient invariant object recognition has been presented. The proposed modification achieves significant gains in computation time and memory requirements. The gain in computation time is achieved by eliminating the idle loops, taking into account a priori knowledge of the training patterns. With the proposed modified HONN, large input patterns can be processed without large computation demands. The performance of the network is also improved significantly by using the approximately equal triangles scheme.

References

[1] Barnard E., D. Casasent. 1991. Invariance and neural nets. IEEE Transactions on Neural Networks, vol. 2, no. 5, pp. 498-507.
[2] Wood J. 1996. Invariant pattern recognition: a review. Pattern Recognition, vol. 29, no. 1, pp. 1-17.
[3] Spirkovska L., M.B. Reid. 1992.
Robust position, scale, and rotation invariant object recognition using higher-order neural networks. Pattern Recognition, vol. 25, no. 9, pp. 975-985.
[4] Perantonis S.J., P.J.G. Lisboa. 1992. Translation, Rotation, and Scale Invariant Pattern Recognition by Higher-Order Neural Networks and Moment Classifiers. IEEE Transactions on Neural Networks, vol. 3, no. 2, pp. 241-251.
[5] He Z., M.Y. Siyal. 1999. Improvement on Higher-Order Neural Networks for Invariant Object Recognition. Neural Processing Letters, vol. 10, pp. 49-55.
[6] Spirkovska L., M.B. Reid. 1993. Coarse-Coded Higher Order Neural Networks for PSRI Object Recognition. IEEE Transactions on Neural Networks, vol. 4, no. 2, pp. 276-283.
[7] Spirkovska L., M.B. Reid. 1990. Connectivity Strategies for Higher-order Neural Networks applied to Pattern Recognition. Proceedings of IJCNN, vol. 1, San Diego, pp. 21-26.
[8] He Z. 1999. Invariant Pattern Recognition with Higher-Order Neural Networks. Master Thesis, School of Electrical and Computer Engineering, Nanyang Technological University, Singapore.

Authors' Information

Evgeny Artyomov: The VLSI Systems Center, Ben-Gurion University, Beer Sheva, Israel. e-mail: artemov@bgumail.bgu.ac.il
Orly Yadid-Pecht: Dept. of Electrical and Computer Engineering, University of Calgary, Alberta, Canada, or The VLSI Systems Center, Ben-Gurion University, Beer Sheva, Israel. e-mail: oyp@ee.bgu.ac.il