Fast Code Detection Using High Speed Time Delay Neural Networks

Hazem M. El-Bakry 1 and Nikos Mastorakis 2

1 Faculty of Computer Science & Information Systems, Mansoura University, Egypt
helbakry20@yahoo.com
2 Department of Computer Science, Military Institutions of University Education (MIUE) - Hellenic Naval Academy, Greece

Abstract. This paper presents a new approach to speed up the operation of time delay neural networks for fast code detection. The entire data are collected together in a long vector and then tested as a one input pattern. The proposed fast time delay neural networks (FTDNNs) use cross correlation in the frequency domain between the tested data and the input weights of neural networks. It is proved mathematically and practically that the number of computation steps required for the presented time delay neural networks is less than that needed by conventional time delay neural networks (CTDNNs). Simulation results using MATLAB confirm the theoretical computations.

1 Introduction

Recently, time delay neural networks have shown very good results in different areas such as automatic control, speech recognition, blind equalization of time-varying channels and other communication applications. The main objective of this research is to reduce the response time of time delay neural networks. The purpose is to perform the testing process in the frequency domain instead of the time domain. Our approach was successfully applied for sub-image detection using fast neural networks (FNNs) as proposed in [1,2,3]. Furthermore, it was used for fast face detection [7,9] and fast iris detection [8]. Another idea to further increase the speed of FNNs through image decomposition was suggested in [7]. FNNs for detecting a certain code in a one dimensional serial stream of sequential data were described in [4,5]. Compared with conventional neural networks, FNNs based on cross correlation between the tested data and the input weights of neural networks in the frequency domain showed a significant reduction in the number of computation steps required for certain data detection [1,2,3,4,5,7,8,9,11,12]. Here, we make use of our theory on FNNs implemented in the frequency domain to increase the speed of time delay neural networks.
The idea of moving the testing process from the time domain to the frequency domain is applied to time delay neural networks. Theoretical and practical results show that the proposed FTDNNs are faster than CTDNNs. In section 2, our theory on FNNs for detecting certain data in a one dimensional matrix is described. Experimental results for FTDNNs are presented in section 3.

D. Liu et al. (Eds.): ISNN 2007, LNCS 4493, Part III, pp. 764-773, 2007. © Springer-Verlag Berlin Heidelberg 2007
2 Fast Code Detection Using Cross Correlation in the Frequency Domain

Finding a certain code/data in the input one dimensional matrix is a searching problem. Each position in the input matrix is tested for the presence or absence of the required code/data. At each position in the input matrix, each sub-matrix is multiplied by a window of weights, which has the same size as the sub-matrix. The outputs of neurons in the hidden layer are multiplied by the weights of the output layer. When the final output is high, this means that the sub-matrix under test contains the required code/data and vice versa. Thus, we may conclude that this searching problem is a cross correlation between the matrix under test and the weights of the hidden neurons.

The convolution theorem in mathematical analysis says that a convolution of f with h is identical to the result of the following steps: let F and H be the results of the Fourier transformation of f and h in the frequency domain. Multiply F and H* (the conjugate of H) in the frequency domain point by point and then transform this product into the spatial domain via the inverse Fourier transform. As a result, these cross correlations can be represented by a product in the frequency domain. Thus, by using cross correlation in the frequency domain, a speed up of an order of magnitude can be achieved during the detection process [1,2,3,4,5,7,8,9,14].

In the detection phase, a sub-matrix I of size 1xn (sliding window) is extracted from the tested matrix, which has a size of 1xN, and fed to the neural network. Let W_i be the vector of weights between the input sub-matrix and the hidden layer. This vector has a size of 1xn and can be represented as a 1xn matrix. The output of hidden neurons h_i can be calculated as follows:

    h_i = g( Σ_{k=1}^{n} W_i(k) I(k) + b_i )                                 (1)

where g is the activation function and b_i is the bias of each hidden neuron (i). Equation (1) represents the output of each hidden neuron for a particular sub-matrix I. It can be computed for the whole input matrix Z as follows:

    h_i(u) = g( Σ_{k=-n/2}^{n/2} W_i(k) Z(u+k) + b_i )                       (2)

Eq.
(2) represents a cross correlation operation. Given any two functions f and d, their cross correlation can be obtained by:

    d(x) ⊗ f(x) = Σ_{n=-∞}^{∞} f(x+n) d(n)                                   (3)

Therefore, Eq. (2) may be written as follows [1]:

    h_i = g( W_i ⊗ Z + b_i )                                                 (4)

where h_i is the output of the hidden neuron (i) and h_i(u) is the activity of the hidden unit (i) when the sliding window is located at position (u), with (u) ∈ [N-n+1].
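As a quick illustration of the correlation-based search described above, the following sketch (our own, not from the paper; the 8-sample code, the stream length and the seed are arbitrary illustrative choices) slides a code template over a noisy stream and shows that the dot-product score of Eq. (3) peaks at the position where the code is planted:

```python
import numpy as np

# The code to detect: an 8-sample pattern (hypothetical example data).
code = 5.0 * np.array([1, -1, 1, -1, 1, 1, -1, -1], dtype=float)

rng = np.random.default_rng(0)
stream = 0.1 * rng.standard_normal(64)   # tested matrix Z of size 1xN
stream[20:28] = code                     # plant the code at position 20

# Sliding-window cross correlation: dot the window of weights (here the
# code itself) with every sub-matrix of the stream, as in Eq. (3).
scores = np.array([stream[u:u + 8] @ code for u in range(64 - 8 + 1)])

print(int(np.argmax(scores)))            # -> 20, where the code was planted
```

Computing this score at every window position is exactly the work that the frequency-domain formulation then replaces with a single product of transforms.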
Now, the above cross correlation can be expressed in terms of the one dimensional Fast Fourier Transform as follows [1]:

    W_i ⊗ Z = F^{-1}( F(Z) · F*(W_i) )                                       (5)

Hence, by evaluating this cross correlation, a speed up can be obtained comparable to conventional neural networks. Also, the final output of the neural network can be evaluated as follows:

    O(u) = g( Σ_{i=1}^{q} W_o(i) h_i(u) + b_o )                              (6)

where q is the number of neurons in the hidden layer, O(u) is the output of the neural network when the sliding window is located at position (u) in the input matrix Z, and W_o is the weight matrix between the hidden and output layers.

The complexity of cross correlation in the frequency domain can be analyzed as follows:

1- For a tested matrix of 1xN elements, the 1D-FFT requires a number equal to N log2 N of complex computation steps [13]. Also, the same number of complex computation steps is required for computing the 1D-FFT of the weight matrix at each neuron in the hidden layer.

2- At each neuron in the hidden layer, the inverse 1D-FFT is computed. Therefore, q backward and (1+q) forward transforms have to be computed. Therefore, for a given matrix under test, the total number of operations required to compute the 1D-FFT is (2q+1) N log2 N.

3- The number of computation steps required by FNNs is complex and must be converted into a real version. It is known that the one dimensional Fast Fourier Transform requires (N/2) log2 N complex multiplications and N log2 N complex additions [13]. Every complex multiplication is realized by six real floating point operations and every complex addition is implemented by two real floating point operations. Therefore, the total number of computation steps required to obtain the 1D-FFT of a 1xN matrix is:

    ρ = 6( (N/2) log2 N ) + 2( N log2 N )                                    (7)

which may be simplified to:

    ρ = 5 N log2 N                                                           (8)

4- Both the input and the weight matrices should be dot multiplied in the frequency domain. Thus, a number of complex computation steps equal to qN should be considered. This means 6qN real operations will be added to the number of computation steps required by FNNs.
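The equivalence stated by Eq. (5) is easy to check numerically. The sketch below (a minimal illustration with arbitrary sizes, random weights and a tanh activation, none of which are specified in the paper) evaluates the network both ways: once with the sliding window of Eq. (1), and once by zero-padding each weight vector to length N and applying the FFT-based cross correlation; the outputs of Eq. (6) agree to floating point precision:

```python
import numpy as np

rng = np.random.default_rng(1)
N, n, q = 64, 8, 3                      # stream length, window size, hidden neurons
Z = rng.standard_normal(N)              # tested matrix
W = rng.standard_normal((q, n))         # one 1xn weight vector per hidden neuron
b = rng.standard_normal(q)
Wo, bo = rng.standard_normal(q), 0.5
g = np.tanh                             # assumed activation, for illustration

# Conventional evaluation: Eq. (1) at every window position, Eq. (6) on top.
direct = np.array([g(Wo @ g(W @ Z[u:u + n] + b) + bo)
                   for u in range(N - n + 1)])

# Frequency-domain evaluation: pad each W_i with zeros to length N, then
# W_i (x) Z = IFFT( FFT(Z) . conj(FFT(W_i)) ), as in Eq. (5).
FZ = np.fft.fft(Z)
H = np.empty((q, N - n + 1))
for i in range(q):
    Wp = np.zeros(N)
    Wp[:n] = W[i]
    corr = np.fft.ifft(FZ * np.conj(np.fft.fft(Wp))).real
    H[i] = g(corr[:N - n + 1] + b[i])   # valid positions only, no wrap-around
fast = g(Wo @ H + bo)

print(np.allclose(direct, fast))        # -> True
```

The zero padding keeps the circular correlation of the FFT equal to the linear correlation on the first N-n+1 positions, which is why only those positions are kept.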
5- In order to perform cross correlation in the frequency domain, the weight matrix must be extended to have the same size as the input matrix. So, a number of zeros = (N-n) must be added to the weight matrix. This requires a total real number of computation steps = q(N-n) for all neurons. Moreover, after computing the FFT for the weight matrix, the conjugate of this matrix must be obtained. As a result, a real number of computation steps = qN should be added in order to obtain the conjugate of
the weight matrix for all neurons. Also, a number of real computation steps equal to N is required to create the butterfly complex numbers (e^{-jk(2πn/N)}), where 0<K<L. These (N/2) complex numbers are multiplied by the elements of the input matrix or by previous complex numbers during the computation of the FFT. To create a complex number requires two real floating point operations. Thus, the total number of computation steps required for FNNs becomes:

    σ = (2q+1)(5 N log2 N) + 6qN + q(N-n) + qN + N                           (9)

which can be reformulated as:

    σ = (2q+1)(5 N log2 N) + q(8N-n) + N                                     (10)

6- Using a sliding window of size 1xn for the same matrix of 1xN elements, q(2n-1)(N-n+1) computation steps are required when using CTDNNs for certain code detection or processing (n) input data. The theoretical speed up factor η can be evaluated as follows:

    η = q(2n-1)(N-n+1) / [ (2q+1)(5 N log2 N) + q(8N-n) + N ]                (11)

3 Simulation Results

Time delay neural networks accept serial input data with fixed size (n). Therefore, the number of input neurons equals (n). Instead of treating (n) inputs, our new approach is to collect all of the input data together in a long vector (for example, 100xn). The input data is then tested by the time delay neural networks as a single pattern with length L (L=100xn). Such a test is performed in the frequency domain as described in section 2.

Complex-valued neural networks have many applications in fields dealing with complex numbers such as telecommunications, speech recognition and image processing with the Fourier transform [6,10]. Complex-valued neural networks mean that the inputs, weights, thresholds and the activation function have complex values. In this section, formulas for the speed up with different types of inputs will be presented. The special case of only real input values (i.e. imaginary part = 0) will be considered. Also, the speed up in the case of a one and two dimensional input matrix will be concluded. The operation of FNNs depends on computing the Fast Fourier Transform for both the input and weight matrices and obtaining the resulting two matrices.
After performing dot multiplication for the resulting two matrices in the frequency domain, the inverse Fast Fourier Transform is calculated for the final matrix. Here, there is an excellent advantage with FNNs that should be mentioned. The Fast Fourier Transform is already dealing with complex numbers, so there is no change in the number of computation steps required for FNNs. Therefore, the speed up in the case of complex-valued time delay neural networks can be evaluated as follows:

1) In case of real inputs

A) For a one dimensional input matrix

Multiplication of (n) complex-valued weights by (n) real inputs requires (2n) real operations. This produces (n) real numbers and (n) imaginary numbers. The addition
of these numbers requires (2n-2) real operations. Therefore, the number of computation steps required by conventional neural networks can be calculated as:

    θ = 2q(2n-1)(N-n+1)                                                      (12)

The speed up in this case can be computed as follows:

    η = 2q(2n-1)(N-n+1) / [ (2q+1)(5 N log2 N) + q(8N-n) + N ]               (13)

The theoretical speed up for searching short successive (n) data in a long input vector (L) using complex-valued time delay neural networks is shown in Tables 1, 2, and 3. Also, the practical speed up for manipulating matrices of different sizes (L) and different sized weight matrices (n) using a 2.7 GHz processor and MATLAB is shown in Table 4.

Table 1. The theoretical speed up for time delay neural networks (1D real-valued input matrix, n=400)

Length of input matrix   CTDNN computation steps   FTDNN computation steps   Speed up
10000                    4.607e+008                4.96e+007                 10.76
40000                    1.8985e+009               1.9614e+008               9.6793
90000                    4.955e+009                4.7344e+008               9.079
160000                   7.6513e+009               8.819e+008                8.6731
250000                   1.1966e+010               1.475e+009                8.383

Table 2. The theoretical speed up for time delay neural networks (1D real-valued input matrix, n=625)

Length of input matrix   CTDNN computation steps   FTDNN computation steps   Speed up
10000                    7.063e+008                4.919e+007                16.3713
40000                    2.9508e+009               1.9613e+008               15.045
90000                    6.6978e+009               4.7343e+008               14.1474
160000                   1.1944e+010               8.818e+008                13.5388
250000                   1.8688e+010               1.475e+009                13.0915

Table 3. The theoretical speed up for time delay neural networks (1D real-valued input matrix, n=900)

Length of input matrix   CTDNN computation steps   FTDNN computation steps   Speed up
10000                    9.83e+008                 4.911e+007                22.8933
40000                    4.06e+009                 1.961e+008                21.5200
90000                    9.6176e+009               4.7343e+008               20.3149
160000                   1.7173e+010               8.817e+008                19.4671
250000                   2.6888e+010               1.475e+009                18.8356
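The counts of Eqs. (10), (12) and (13) are straightforward to evaluate. The sketch below tabulates the theoretical speed up for the input lengths of Table 1; note that the number of hidden neurons q is not stated in this excerpt, so q = 30 here is an assumed value used purely for illustration:

```python
from math import log2

def theta(N, q, n):
    # Eq. (12): 2n real multiplications plus (2n-2) real additions per
    # hidden neuron at each of the (N-n+1) window positions.
    return 2 * q * (2 * n - 1) * (N - n + 1)

def sigma(N, q, n):
    # Eq. (10): total real operations for the frequency-domain test.
    return (2 * q + 1) * 5 * N * log2(N) + q * (8 * N - n) + N

def eta(N, q, n):
    # Eq. (13): theoretical speed-up ratio.
    return theta(N, q, n) / sigma(N, q, n)

for N in (10000, 40000, 90000, 160000, 250000):
    print(N, round(eta(N, 30, 400), 4))
```

Under these assumptions the ratio starts between 10 and 11 at N = 10000 and decays slowly with N, since σ grows as N log2 N while θ grows only linearly in N.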
Table 4. Practical speed up for time delay neural networks (1D real-valued input matrix)

Length of input matrix   Speed up (n=400)   Speed up (n=625)   Speed up (n=900)
10000                    17.88              25.94              35.1
40000                    17.19              25.11              34.43
90000                    16.65              24.56              33.59
160000                   16.14              24.14              33.05
250000                   15.89              23.76              32.60

B) For a two dimensional input matrix

Multiplication of (n^2) complex-valued weights by (n^2) real inputs requires (2n^2) real operations. This produces (n^2) real numbers and (n^2) imaginary numbers. The addition of these numbers requires (2n^2-2) real operations. Therefore, the number of computation steps required by conventional neural networks can be calculated as:

    θ = 2q(2n^2-1)(N-n+1)^2                                                  (14)

The speed up in this case can be computed as follows:

    η = 2q(2n^2-1)(N-n+1)^2 / [ (2q+1)(5 N^2 log2 N^2) + q(8N^2-n^2) + N^2 ] (15)

The theoretical speed up for detecting an (nxn) real-valued submatrix in a large real-valued matrix (NxN) using complex-valued time delay neural networks is shown in Tables 5, 6, and 7. Also, the practical speed up for manipulating matrices of different sizes (NxN) and different sized weight matrices (n) using a 2.7 GHz processor and MATLAB is shown in Table 8.

Table 5. The theoretical speed up for time delay neural networks (2D real-valued input matrix, n=20)

Size of input matrix   CTDNN computation steps   FTDNN computation steps   Speed up
100x100                3.1453e+008               4.916e+007                7.391
200x200                1.5706e+009               1.9610e+008               8.0091
300x300                3.7854e+009               4.7335e+008               7.9970
400x400                6.9590e+009               8.803e+008                7.8898
500x500                1.1091e+010               1.473e+009                7.7711

Table 6. The theoretical speed up for time delay neural networks (2D real-valued input matrix, n=25)

Size of input matrix   CTDNN computation steps   FTDNN computation steps   Speed up
100x100                4.385e+008                4.909e+007                10.0877
200x200                2.3213e+009               1.9609e+008               11.8380
300x300                5.7086e+009               4.7334e+008               12.0602
400x400                1.0595e+010               8.80e+008                 12.0119
500x500                1.6980e+010               1.473e+009                11.8966
Table 7. The theoretical speed up for time delay neural networks (2D real-valued input matrix, n=30)

Size of input matrix   CTDNN computation steps   FTDNN computation steps   Speed up
100x100                5.4413e+008               4.901e+007                12.6834
200x200                3.1563e+009               1.9608e+008               16.0966
300x300                7.97e+009                 4.7334e+008               16.7476
400x400                1.4857e+010               8.801e+008                16.8444
500x500                2.3946e+010               1.473e+009                16.7773

Table 8. Practical speed up for time delay neural networks (2D real-valued input matrix)

Size of input matrix   Speed up (n=20)   Speed up (n=25)   Speed up (n=30)
100x100                17.19             22.3              31.74
200x200                17.61             22.89             32.55
300x300                16.54             23.66             33.71
400x400                15.98             22.95             34.53
500x500                15.6              22.49             33.3

2) In case of complex inputs

A) For a one dimensional input matrix

Multiplication of (n) complex-valued weights by (n) complex inputs requires (6n) real operations. This produces (n) real numbers and (n) imaginary numbers. The addition of these numbers requires (2n-2) real operations. Therefore, the number of computation steps required by conventional neural networks can be calculated as:

    θ = 2q(4n-1)(N-n+1)                                                      (16)

The speed up in this case can be computed as follows:

    η = 2q(4n-1)(N-n+1) / [ (2q+1)(5 N log2 N) + q(8N-n) + N ]               (17)

Table 9. The theoretical speed up for time delay neural networks (1D complex-valued input matrix, n=400)

Length of input matrix   CTDNN computation steps   FTDNN computation steps   Speed up
10000                    9.111e+008                4.96e+007                 21.4586
40000                    3.7993e+009               1.9614e+008               19.3706
90000                    8.5963e+009               4.7344e+008               18.1571
160000                   1.531e+010                8.819e+008                17.3570
250000                   2.3947e+010               1.475e+009                16.7750

The theoretical speed up for searching short complex successive (n) data in a long complex-valued input vector (L) using complex-valued time delay neural
networks is shown in Tables 9, 10, and 11. Also, the practical speed up for manipulating matrices of different sizes (L) and different sized weight matrices (n) using a 2.7 GHz processor and MATLAB is shown in Table 12.

Table 10. The theoretical speed up for time delay neural networks (1D complex-valued input matrix, n=625)

Length of input matrix   CTDNN computation steps   FTDNN computation steps   Speed up
10000                    1.4058e+009               4.919e+007                32.7558
40000                    5.9040e+009               1.9613e+008               30.105
90000                    1.3401e+010               4.7343e+008               28.3061
160000                   2.3897e+010               8.818e+008                27.0883
250000                   3.7391e+010               1.475e+009                26.1934

Table 11. The theoretical speed up for time delay neural networks (1D complex-valued input matrix, n=900)

Length of input matrix   CTDNN computation steps   FTDNN computation steps   Speed up
10000                    1.9653e+009               4.911e+007                45.7993
40000                    8.4435e+009               1.961e+008                43.0519
90000                    1.940e+010                4.7343e+008               40.6410
160000                   3.4356e+010               8.817e+008                38.9450
250000                   5.3791e+010               1.475e+009                37.6817

Table 12. Practical speed up for time delay neural networks (1D complex-valued input matrix)

Length of input matrix   Speed up (n=400)   Speed up (n=625)   Speed up (n=900)
10000                    37.90              53.58              70.71
40000                    36.8               52.89              69.43
90000                    36.34              52.47              68.69
160000                   35.94              51.88              68.05
250000                   35.69              51.36              67.56

B) For a two dimensional input matrix

Multiplication of (n^2) complex-valued weights by (n^2) complex inputs requires (6n^2) real operations. This produces (n^2) real numbers and (n^2) imaginary numbers. The addition of these numbers requires (2n^2-2) real operations. Therefore, the number of computation steps required by conventional neural networks can be calculated as:

    θ = 2q(4n^2-1)(N-n+1)^2                                                  (18)

The speed up in this case can be computed as follows:

    η = 2q(4n^2-1)(N-n+1)^2 / [ (2q+1)(5 N^2 log2 N^2) + q(8N^2-n^2) + N^2 ] (19)
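Since the FFT already operates on complex numbers, σ of Eq. (10) is unchanged while the conventional cost roughly doubles relative to the real-input case of Eq. (12). The short sketch below evaluates the complex-input counts of Eqs. (16) and (18); as before, q = 30 is an assumed number of hidden neurons, not a value stated in this excerpt:

```python
from math import log2

def sigma(N, q, n):
    # Eq. (10): frequency-domain cost; identical for real or complex inputs.
    return (2 * q + 1) * 5 * N * log2(N) + q * (8 * N - n) + N

def theta_complex_1d(N, q, n):
    # Eq. (16): 6n real operations for the n complex multiplications plus
    # (2n-2) real additions, per hidden neuron and window position.
    return 2 * q * (4 * n - 1) * (N - n + 1)

def theta_complex_2d(N, q, n):
    # Eq. (18): two dimensional analogue, an nxn window over an NxN matrix.
    return 2 * q * (4 * n ** 2 - 1) * (N - n + 1) ** 2

q = 30  # assumed number of hidden neurons (illustration only)
# Eq. (17): 1D complex-input speed up, and Eq. (19): the 2D version,
# whose denominator is sigma evaluated on N^2 elements with n^2 weights.
print(round(theta_complex_1d(10000, q, 400) / sigma(10000, q, 400), 2))
print(round(theta_complex_2d(100, q, 20) / sigma(100 * 100, q, 20 * 20), 2))
```

With these assumptions the 1D ratio lands near the first row of Table 9 and the 2D ratio near the first row of Table 13, roughly twice the corresponding real-input figures.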
The theoretical speed up for detecting an (nxn) complex-valued submatrix in a large complex-valued matrix (NxN) using complex-valued time delay neural networks is shown in Tables 13, 14, and 15. Also, the practical speed up for manipulating matrices of different sizes (NxN) and different sized weight matrices (n) using a 2.7 GHz processor and MATLAB is shown in Table 16.

Table 13. The theoretical speed up for time delay neural networks (2D complex-valued input matrix, n=20)

Size of input matrix   CTDNN computation steps   FTDNN computation steps   Speed up
100x100                6.946e+008                4.916e+007                14.6674
200x200                3.1431e+009               1.9610e+008               16.081
300x300                7.5755e+009               4.7335e+008               16.0040
400x400                1.397e+010                8.803e+008                15.7894
500x500                2.2197e+010               1.473e+009                15.5519

Table 14. The theoretical speed up for time delay neural networks (2D complex-valued input matrix, n=25)

Size of input matrix   CTDNN computation steps   FTDNN computation steps   Speed up
100x100                8.6605e+008               4.909e+007                20.1836
200x200                4.6445e+009               1.9609e+008               23.6856
300x300                1.14e+010                 4.7334e+008               24.1301
400x400                2.1198e+010               8.80e+008                 24.0333
500x500                3.3973e+010               1.473e+009                23.808

Table 15. The theoretical speed up for time delay neural networks (2D complex-valued input matrix, n=30)

Size of input matrix   CTDNN computation steps   FTDNN computation steps   Speed up
100x100                1.0886e+009               4.901e+007                25.3738
200x200                6.3143e+009               1.9608e+008               32.201
300x300                1.5859e+010               4.7334e+008               33.5045
400x400                2.9722e+010               8.801e+008                33.6981
500x500                4.7904e+010               1.473e+009                33.5640

Table 16. Practical speed up for time delay neural networks (2D complex-valued input matrix)

Size of input matrix   Speed up (n=20)   Speed up (n=25)   Speed up (n=30)
100x100                38.33             46.99             62.88
200x200                39.17             47.79             63.77
300x300                38.44             48.86             64.83
400x400                37.9              47.3              65.99
500x500                37.3              46.89             64.89
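As a rough analogue of the practical measurements above (which used MATLAB on a 2.7 GHz processor), the sketch below times a direct sliding-window correlation against the FFT-based one for a single weight vector. The sizes are our own choices and the measured ratio depends entirely on the machine and libraries, so it will not reproduce the tables; the point is only that both methods return the same correlations while doing very different amounts of work:

```python
import time
import numpy as np

N, n = 50_000, 400
rng = np.random.default_rng(2)
Z = rng.standard_normal(N)               # tested stream
w = rng.standard_normal(n)               # one hidden neuron's weight vector

t0 = time.perf_counter()
direct = np.array([Z[u:u + n] @ w for u in range(N - n + 1)])
t1 = time.perf_counter()

Wp = np.zeros(N)                         # zero-pad the weights to length N
Wp[:n] = w
fast = np.fft.ifft(np.fft.fft(Z) * np.conj(np.fft.fft(Wp))).real[:N - n + 1]
t2 = time.perf_counter()

print(np.allclose(direct, fast))         # identical correlations
print(round((t1 - t0) / (t2 - t1), 1))   # measured ratio (machine dependent)
```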
4 Conclusion

New FTDNNs have been presented. Theoretical computations have shown that FTDNNs require fewer computation steps than conventional ones. This has been achieved by applying cross correlation in the frequency domain between the input data and the input weights of time delay neural networks. Simulation results using MATLAB have confirmed this proof. This algorithm can be successfully applied to any application that uses time delay neural networks.

References

[1] El-Bakry, H.M., Zhao, Q.: A Modified Cross Correlation in the Frequency Domain for Fast Pattern Detection Using Neural Networks. International Journal of Signal Processing 1 (2004) 188-194
[2] El-Bakry, H.M., Zhao, Q.: Fast Object/Face Detection Using Neural Networks and Fast Fourier Transform. International Journal of Signal Processing 1 (2004) 182-187
[3] El-Bakry, H.M., Zhao, Q.: Fast Pattern Detection Using Normalized Neural Networks and Cross Correlation in the Frequency Domain. Accepted and under publication in the EURASIP Journal on Applied Signal Processing
[4] El-Bakry, H.M., Zhao, Q.: A Fast Neural Algorithm for Serial Code Detection in a Stream of Sequential Data. International Journal of Information Technology 2 (2005) 71-90
[5] El-Bakry, H.M., Stoyan, H.: FNNs for Code Detection in Sequential Data Using Neural Networks for Communication Applications. Proc. of the First International Conference on Cybernetics and Information Technologies, Systems and Applications: CITSA 2004, Orlando, Florida, USA, Vol. IV, 150-153
[6] Hirose, A.: Complex-Valued Neural Networks: Theories and Applications. Series on Innovative Intelligence 5 (2003)
[7] El-Bakry, H.M.: Face Detection Using Fast Neural Networks and Image Decomposition. Neurocomputing Journal 48 (2002) 1039-1046
[8] El-Bakry, H.M.: Human Iris Detection Using Fast Cooperative Modular Neural Nets and Image Decomposition.
Machine Graphics & Vision Journal (MG&V) 11 (2002) 498-512
[9] El-Bakry, H.M.: Automatic Human Face Recognition Using Modular Neural Networks. Machine Graphics & Vision Journal (MG&V) 10 (2001) 47-73
[10] Jankowski, S., Lozowski, A., Zurada, J.M.: Complex-valued Multistate Neural Associative Memory. IEEE Trans. on Neural Networks 7 (1996) 1491-1496
[11] El-Bakry, H.M., Zhao, Q.: Fast Pattern Detection Using Neural Networks Realized in Frequency Domain. Proc. of the International Conference on Pattern Recognition and Computer Vision, The Second World Enformatika Congress WEC'05, Istanbul, Turkey (2005) 89-92
[12] El-Bakry, H.M., Zhao, Q.: Sub-Image Detection Using Fast Neural Processors and Image Decomposition. Proc. of the International Conference on Pattern Recognition and Computer Vision, The Second World Enformatika Congress WEC'05, Istanbul, Turkey (2005) 85-88
[13] Cooley, J.W., Tukey, J.W.: An Algorithm for the Machine Calculation of Complex Fourier Series. Math. Comput. 19 (1965) 297-301
[14] Klette, R., Zamperoni, P.: Handbook of Image Processing Operators. John Wiley & Sons Ltd (1996)