Spiking Neural Networks for Real-Time Infrared Image Processing in Thermo Vision Systems

Snejana Pleshkova
Department of Telecommunications
Technical University
Kliment Ohridski, 8
Sofia
aabbv@tu-sofia.bg

Abstract: - Thermo vision systems are used in military, police, customs, traffic control, industrial and other specific applications for collecting and processing thermo visual information from infrared images. A problem arises when the developed methods and algorithms for infrared image processing have to be implemented in real-time practical applications of thermo vision systems. This paper proposes to exploit the advances in powerful parallel computer graphics and image processing made for computer vision and computer games applications, namely the graphical processing unit (GPU) and the Compute Unified Device Architecture (CUDA). The parallel processing ability and high-speed memory access of GPUs are essential for real-time applications with neural networks in most infrared image processing applications.

Key-Words: - spiking neural networks; real time infrared image processing; thermo vision systems

1 Introduction

Thermo vision systems are used in military, police, customs, traffic control, industrial and other specific applications for collecting and processing thermo visual information from infrared images [1, 9]. There are many hardware and software development tools for testing the methods and application algorithms for processing captured infrared images in thermo vision systems [2, 3, 10]. The problems arise when the developed methods and algorithms have to be implemented in real-time practical applications of thermo vision systems.

In surveillance and security thermo visual systems, one of the most practical goals is the detection and tracking of moving objects in infrared images captured from a thermo vision camera. The input infrared images are usually separated and processed in small blocks with an appropriately chosen shape (for example rectangular) and size (for example 8x8).
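The block partitioning mentioned above can be sketched as follows. This is a minimal illustration, not the author's implementation: the function name `split_into_blocks`, the synthetic frame, and the requirement that the image dimensions be multiples of the block size are all assumptions made for the example.

```python
def split_into_blocks(image, block=8):
    """Return a dict mapping (block_row, block_col) to a block x block sub-image.

    `image` is a list of equally long rows of pixel values. For simplicity this
    sketch assumes both dimensions are multiples of `block`.
    """
    rows, cols = len(image), len(image[0])
    if rows % block or cols % block:
        raise ValueError("image dimensions must be multiples of the block size")
    blocks = {}
    for br in range(rows // block):
        for bc in range(cols // block):
            # Slice `block` rows, then `block` columns out of each of those rows.
            blocks[(br, bc)] = [
                row[bc * block:(bc + 1) * block]
                for row in image[br * block:(br + 1) * block]
            ]
    return blocks


# Synthetic 16x24 "infrared frame" (hypothetical data): pixel value = row * 24 + col.
frame = [[r * 24 + c for c in range(24)] for r in range(16)]
blocks = split_into_blocks(frame)
```

Each entry of `blocks` is independent of the others, which is what makes block-wise processing a natural fit for parallel hardware.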
In conventional hardware or software implementations of infrared image processing algorithms, the blocks are processed consecutively (in series), and achieving real-time processing is not always possible. The advances in powerful parallel computer graphics and image processing for computer vision and computer games applications, with the developed graphical processing unit (GPU) and Compute Unified Device Architecture (CUDA) [4], offer a powerful development framework for GPU-based computing, integrated with high-level programming languages such as C or C++.

Graphical processing units (GPU) are devices designed to exploit parallel shared-memory-based floating-point computation. They provide memory access speeds superior to those of commodity CPU-based systems. These features, which allow the model variables to be updated in parallel at every iteration, make GPUs attractive compared with other solutions such as programmable logic, integrated circuits, custom shared-memory solutions, and cluster message-passing computing systems. This holds for real-time image processing in general and, in this article, for infrared image processing applications in particular.

This paper proposes to exploit the parallel processing ability and the high-speed memory access of graphical processing units (GPU), which are essential for real-time applications with neural networks in most infrared image processing applications. In most applications of infrared image processing with neural networks, the algorithms are executed sequentially by a CPU, which means that only one neuron is updated at a given time. As a result, the performance degrades quickly as the network size and connectivity increase. This is especially the case for large connectivity, since sequential processors need to iterate over every connection of each neuron. To speed up the operation, supercomputers or distributed computers are normally used for large-scale neural network simulation, but these solutions incur high cost. Traditional CPU architectures are not designed for parallel processing.
To avoid this problem in real-time infrared image processing applications, a suitable type of neural network is proposed: the spiking neural network (SNN), implemented on a graphical processing unit (GPU) with the Compute Unified Device Architecture (CUDA). An example is presented for real-time infrared image processing applications such as moving object detection and tracking in infrared images in surveillance and security thermo visual systems.

ISBN: 978-1-6184-18-1

2 Spiking Neural Networks Performance Useful for Infrared Image Processing

A spiking neural network (SNN) is a model of a biological neural network with a simplified process of synaptic transmission, in which neurons communicate with each other by spikes, modeled as time-stamped potential pulses. The accuracy of a spike time depends on the choice of numerical integration system, which can be classified into the following categories:

- clock-driven (synchronous) systems evaluate model variables only at fixed points in time; the resolution of the time grid, defined by the magnitude of the time step, determines the simulation accuracy and affects the execution time;
- event-driven (asynchronous) systems update variables only at the exact time of a spike event; the accuracy of the event time in these systems is not tied to the precision of any time grid, but depends on the floating-point format chosen (double or single precision);
- hybrid systems combine the advantages of event-driven and clock-driven systems: the model variables are refreshed at fixed points in time, yet events are processed at their exact times.

Two identical spiking neural networks (SNN) excited with identical stimuli, but implemented as clock-driven and event-driven systems, do not produce the same spiking pattern unless the time step in the clock-driven implementation is small enough to achieve the desired accuracy.
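The dependence of spike-time accuracy on the time step can be illustrated with a small sketch. It does not use the neuron model of this paper: as an assumption for the example, a leaky integrate-and-fire neuron dv/dt = (I - v)/tau with v(0) = 0 is used, because its first threshold crossing has a closed form. The function names and parameter values are hypothetical.

```python
import math


def exact_spike_time(current, threshold, tau):
    """Closed-form first crossing of `threshold` for dv/dt = (current - v)/tau, v(0) = 0."""
    return tau * math.log(current / (current - threshold))


def clock_driven_spike_time(current, threshold, tau, dt):
    """Forward-Euler integration on a fixed grid; the spike is reported at a grid point."""
    if current <= threshold:
        raise ValueError("the neuron never reaches threshold for this input")
    v, step = 0.0, 0
    while v < threshold:
        v += dt * (current - v) / tau  # one clock-driven update
        step += 1
    return step * dt


# Hypothetical parameters: input 2.0, threshold 1.0, time constant 10 ms.
reference = exact_spike_time(2.0, 1.0, 10.0)
errors = {
    dt: abs(clock_driven_spike_time(2.0, 1.0, 10.0, dt) - reference)
    for dt in (1.0, 0.1, 0.01)
}
```

With these parameters the error in `errors` shrinks as `dt` decreases, which is the behavior described for clock-driven systems: the spike can only be reported on the time grid, so two otherwise identical networks may diverge unless the step is small enough.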
3 Implementation of the Parker-Sochacki Integration Method in Real-Time Infrared Image Processing

The analysis of the above-mentioned choices of numerical integration systems leads to the proposal to use here, for infrared image processing, the Parker-Sochacki (PS) numerical integration method [5] applied to the biologically plausible phenomenological neuron model developed by Izhikevich [6]. This integration method provides accuracy appropriate for the simulation of spiking neural networks (SNN) with biological mechanisms requiring exact event timing, and achieves full double-precision integration accuracy.

The Parker-Sochacki (PS) numerical integration technique is based on the application of the Maclaurin series to the solution of differential equations with an initial value problem (IVP),

$$\frac{dy}{dt}(t) = f\big(t, y(t)\big), \qquad y(t_0) = y_0, \qquad t \in [t_0 - \alpha,\; t_0 + \alpha]. \tag{1}$$

The method was developed based on the Picard iteration [12] under the assumption that f is locally Lipschitz continuous in y and continuous in t (Picard-Lindelof theorem) [7], so that the solution can be described with a power series. Consequently, because differentiating the series term by term expresses the derivative through the next coefficient of the solution,

$$y(t) = \sum_{p=0}^{\infty} y_p t^p, \qquad y'(t) = \sum_{p=0}^{\infty} (p+1)\, y_{p+1} t^p, \qquad y_p = \frac{y^{(p)}(0)}{p!}, \tag{2}$$

and after substituting (2) in (1), the IVP (1) can be described in terms of power series:

$$\sum_{p=0}^{\infty} (p+1)\, y_{p+1} t^p = f\Big(t, \sum_{p=0}^{\infty} y_p t^p\Big). \tag{3}$$

Provided that f is a linear function, $f(t, y(t)) = k\,y(t) + b$, Eq. (3) becomes (the constant term is temporarily dropped):

$$\sum_{p=0}^{\infty} (p+1)\, y_{p+1} t^p = k \sum_{p=0}^{\infty} y_p t^p. \tag{4}$$

Equation (4) exhibits loop-level parallelism (LLP) and parallel reduction, which can be exploited if all coefficients are pre-calculated. However, provided that f is a quadratic function, $f(t, y(t)) = a\,y^2(t) + b\,y(t) + c$, after series multiplication Eq. (3) becomes:

$$\sum_{p=0}^{\infty} (p+1)\, y_{p+1} t^p = a \sum_{p=0}^{\infty} \Big( \sum_{i=0}^{p} y_i\, y_{p-i} \Big) t^p + b \sum_{p=0}^{\infty} y_p t^p. \tag{5}$$

Exploiting parallel computation is problematic in this case because of the convolution, whose length grows linearly with p and which introduces a loop-carried circular dependence. Partial parallelism can still be exploited in the convolution itself, $\sum_{i=0}^{p} y_i\, y_{p-i}$, and in the linear term $b\,y_p/(p+1)$.
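The coefficient recurrences implied by Eqs. (4) and (5) can be sketched in a few lines. This is a minimal scalar illustration under stated assumptions, not the GPU implementation of this paper: it treats a single equation y' = a y^2 + b y + c with expansion point t = 0 rather than the two-variable Izhikevich model, it keeps the constant c in the zeroth-order term, and the names `ps_coefficients` and `evaluate_series` are hypothetical.

```python
import math


def ps_coefficients(a, b, c, y0, order):
    """Maclaurin coefficients y_0..y_order of y' = a*y^2 + b*y + c with y(0) = y0.

    Matching powers of t gives
        (p + 1) * y_{p+1} = a * sum_{i=0..p} y_i * y_{p-i} + b * y_p + (c if p == 0).
    """
    y = [y0]
    for p in range(order):
        # Cauchy-product (convolution) term; it needs every earlier coefficient,
        # which is the loop-carried dependence discussed in the text.
        conv = sum(y[i] * y[p - i] for i in range(p + 1))
        rhs = a * conv + b * y[p] + (c if p == 0 else 0.0)
        y.append(rhs / (p + 1))
    return y


def evaluate_series(coeffs, t):
    """Evaluate sum_p coeffs[p] * t**p with Horner's rule."""
    result = 0.0
    for coeff in reversed(coeffs):
        result = result * t + coeff
    return result


# Linear case (a = 0): y' = -2y, y(0) = 3 has the solution 3*exp(-2t).
linear = evaluate_series(ps_coefficients(0.0, -2.0, 0.0, 3.0, 20), 0.5)

# Quadratic case: y' = 1 + y^2, y(0) = 0 has the solution tan(t).
quadratic = evaluate_series(ps_coefficients(1.0, 0.0, 1.0, 0.0, 15), 0.1)
```

In the linear case each new coefficient depends only on the previous one, whereas in the quadratic case the inner sum over `i` can be computed in parallel but the outer loop over `p` remains sequential.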
4 The Spiking Neural Networks for Real-Time Infrared Image Processing with Compute Unified Device Architecture (CUDA)

Equations (4) and (5) show two important possibilities: full parallelism in the parallel reduction of all pre-calculated coefficients, or partial parallelism in the convolution, respectively. This is very important in real-time applications of infrared image processing and is well suited to the advances in graphical processing units (GPU) and the Compute Unified Device Architecture (CUDA). Therefore, this article presents the structure of a real-time infrared image processing system with a spiking neural network and compute unified device architecture (CUDA), shown in Fig. 1.

The Infrared Image Capture block captures in real time the thermal images to be processed. The infrared sensor is an EasIR-9, which is a standard thermo vision camera. The captured infrared images are transferred as Pixel Data to the Spike Convertor. The function of this block is to convert each value of the input Pixel Data of the infrared images to the corresponding amplitudes and time spacing of the pulse sequence (Spikes), which represent the inputs of the spike neural network (SNN) used for infrared image processing. The Spikes are inputs of the compute unified device architecture interface (CUDA Interface). This interface distributes the Spikes to the blocks SP, which in the CUDA architecture are named Scalar Processors (SP). The SP blocks in the CUDA architecture are arranged as a Grid of Blocks named Streaming Multiprocessors (SM), with the corresponding Shared Memory and Local Memory. All of the Grids of Blocks existing in a CUDA architecture are connected to the Global Memory.

The control of the infrared image processing and the application of a chosen algorithm for the spike neural network (SNN) is performed from a Digital Signal Processor (DSP) or from a Host Computer. Therefore, Fig. 1 shows a DSP or Host Computer Interface to the CUDA architecture block.
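The paper does not specify the coding rule used by the Spike Convertor. Purely as an illustration of how a pixel value could determine the time spacing of a pulse sequence, the sketch below assumes a simple rate code for 8-bit pixels in which hotter (larger) values produce more closely spaced spikes. The function name, the 100 ms window, and the interval limits are all hypothetical.

```python
def pixel_to_spike_times(pixel, window_ms=100.0, min_interval_ms=2.0, max_interval_ms=50.0):
    """Map an 8-bit pixel value to spike times (ms) inside one coding window.

    Assumed rule: the inter-spike interval falls linearly from max_interval_ms
    (pixel = 0) to min_interval_ms (pixel = 255).
    """
    if not 0 <= pixel <= 255:
        raise ValueError("pixel must be in the range 0..255")
    interval = max_interval_ms - (max_interval_ms - min_interval_ms) * pixel / 255.0
    count = int(window_ms // interval)  # number of whole intervals that fit in the window
    return [k * interval for k in range(1, count + 1)]


cold = pixel_to_spike_times(0)    # widely spaced spikes
warm = pixel_to_spike_times(128)
hot = pixel_to_spike_times(255)   # densely spaced spikes
```

Because each pixel is converted independently, the conversion itself can be distributed over the scalar processors in the same way as the spikes.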
A commonly used Display Interface connected to an LCD Display is also shown in Fig. 1 for visualization of the input and processed infrared images.

Figure 1. Structure of a real-time infrared image processing system with spiking neural network and compute unified device architecture (CUDA)

A more detailed representation of the Grids of Blocks in the CUDA architecture, which execute the spike neural network (SNN) algorithm for infrared image processing, is shown in Fig. 2. It can be seen that each part of the Grid Block can be regarded as an n x n array of sub-blocks, named Thread (1,1) to Thread (n,n). The names Thread ( , ) are chosen from the terminology of the CUDA Programming Model using the OpenCL programming language [8].
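One common way to relate such an n x n array of threads to image data is to let every thread handle one pixel of the block assigned to its grid block. The sketch below shows that index arithmetic under this assumption. The mapping is not taken from the paper, the helper names are hypothetical, and indices are zero-based here, whereas the figure labels threads from (1,1).

```python
def thread_to_pixel(block_idx, thread_idx, n):
    """Pixel (row, col) handled by thread `thread_idx` of grid block `block_idx`.

    Both indices are zero-based (row, col) pairs; each block covers n x n pixels.
    """
    block_row, block_col = block_idx
    thread_row, thread_col = thread_idx
    return block_row * n + thread_row, block_col * n + thread_col


def pixel_to_linear(pixel, width):
    """Row-major offset of a pixel in a flat image buffer of the given width."""
    row, col = pixel
    return row * width + col


# Hypothetical 16x24 image covered by a 2x3 grid of 8x8 thread blocks.
n, height, width = 8, 16, 24
covered = {
    pixel_to_linear(thread_to_pixel((br, bc), (tr, tc), n), width)
    for br in range(height // n)
    for bc in range(width // n)
    for tr in range(n)
    for tc in range(n)
}
```

The set `covered` contains every pixel offset exactly once, so no two threads write to the same location.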
Fig. 3 also shows the necessary blocks Global Memory, Constant Memory and Infrared Image Memory, which are globally connected to all Thread blocks and transfer and distribute to these Thread blocks the global data, constant values and infrared image information as Spike values.

Figure 2. Detailed representation of the Grids of Blocks in CUDA architecture, which execute the spike neural network (SNN) algorithm for infrared image processing

The detailed structure of the Threads ( , ) is shown in Fig. 3. Each Thread is connected to the Shared/Local Memory and indirectly to the Private Memory. These types of memory store and update the local spike signals, the coefficients and the locally executed infrared image processing operations corresponding to the spike neural network (SNN) algorithm for real-time infrared image processing in the CUDA architecture.

5 Results and Conclusion

The experiments for real-time infrared image processing with a spike neural network (SNN) and CUDA architecture were carried out with an NVIDIA GTX 280 GPU card, which consists of 240 scalar processors grouped into 30 Streaming Multiprocessors (SM), each operating at 1.2 GHz. The sustained performance of the GTX 280 GPU card is approximately 350 GFLOPS. Each Streaming Multiprocessor (SM) has a hardware thread scheduler that selects a group of spike neuron threads for execution. If any one of the spike neuron threads in the group issues a costly external memory operation, the thread scheduler automatically switches to a new thread group. At any instant of time, the hardware allows a very high number of spike threads, approximately 768 per Streaming Multiprocessor (SM) in the GTX 280, to be active simultaneously. By swapping thread groups, the thread scheduler can effectively hide costly memory latency. Each GTX 280 GPU contains a 512-bit GDDR3 interface to the graphics display memory with a peak theoretical bandwidth of 143 GB/s.
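A few derived quantities follow from these hardware figures by simple arithmetic. The sketch below assumes the published GTX 280 configuration of 240 scalar processors in 30 SMs, together with the thread count, bus width and peak bandwidth quoted in the text; the derived values are back-of-the-envelope numbers, not measurements.

```python
# Figures quoted for the NVIDIA GTX 280 (taken at face value).
SCALAR_PROCESSORS = 240
STREAMING_MULTIPROCESSORS = 30
THREADS_PER_SM = 768
BUS_WIDTH_BITS = 512
PEAK_BANDWIDTH_GB_S = 143.0

# Scalar processors available inside each streaming multiprocessor.
sp_per_sm = SCALAR_PROCESSORS // STREAMING_MULTIPROCESSORS

# Upper bound on simultaneously active threads across the whole card.
resident_threads = THREADS_PER_SM * STREAMING_MULTIPROCESSORS

# Bytes moved per transfer on the memory bus, and the transfer rate
# (in giga-transfers per second) implied by the quoted peak bandwidth.
bytes_per_transfer = BUS_WIDTH_BITS // 8
implied_transfer_rate_gt_s = PEAK_BANDWIDTH_GB_S / bytes_per_transfer
```

Under these assumptions, each SM holds far more resident threads than it has scalar processors, which is what allows the scheduler to swap thread groups while others wait on memory.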
The results achieved in the experiments for real-time infrared image processing with the spike neural network (SNN) and CUDA architecture implemented on the NVIDIA GTX 280 GPU card are compared with the same spike neural network (SNN) algorithm for infrared image processing running on a standard Pentium chipset with a 64-bit quad-pumped DDR3 interface. The results of this comparison are presented in Table 1.

Figure 3. Detailed structure of the Threads

Table 1

Spike Neural Network (SNN) for Infrared Image Processing | In Programming Language | Speed of Execution | Real Time Ability
With CUDA Architecture and NVIDIA GTX 280 GPU | OpenCL | 35 GB/s | Yes
With Standard Pentium Chipset and 64-bit quad-pumped DDR3 Interface | Microsoft Visual Studio 2010 and OpenCV | 28 GB/s | No
In conclusion, the effectiveness of using a graphical processing unit (GPU) and the Compute Unified Device Architecture (CUDA) in spiking neural networks for real-time infrared image processing can be summarized as: parallelism, high-speed memory access, and high-speed processing.

Acknowledgements

This work was supported by the National Ministry of Science and Education of Bulgaria under Contract DDVU 2/4-7: Thermo Vision Methods and Resources in Information Systems for Customs Control and Combating Terrorism Aimed at Detecting and Tracking Objects and People.

References:

[1] Lebold J. Infrared Thermography and Distribution System Maintenance. Electricity Today, Volume 3. 28, 18-19.
[2] FLIR Application Book. FLIR Company 21.
[3] Coon D. D. and Perera A. G. U. Spectral information coding by infrared photoreceptors. International Journal of Infrared and Millimeter Waves, Volume 7, Number 10, pp. 1571-1583.
[4] NVIDIA CUDA. http://developer.nvidia.com/
[5] G. E. Parker and J. S. Sochacki. Implementing the Picard iteration, Neural, Parallel Sci. Comput., vol. 4, pp. 97-112, 1996.
[6] E. M. Izhikevich and G. M. Edelman. Large-scale model of mammalian thalamocortical systems, Proceedings of the National Academy of Sciences, vol. 105, pp. 3593-3598, 2008.
[7] E. Picard, Traite D'Analyse. Gauthier-Villars, 1922-1928, vol. 3.
[8] (2009, Jul.) NVIDIA CUDA C Programming Best Practices Guide. [Accessed online 4/3/21]. http://developer.nvidia.com/
[9] Andonova A., Thermographic evaluation of electromechanical relays quality in railway automation, International Journal of Electrical and Computer Engineering (IJECE), Feb. 2012, vol. 2, no. 1, pp. 1-6, ISSN: 2088-8708.
[10] Andonova A., S. Todorov, Buried Object Detection by Thermography, Annual Journal of Electronics, vol. 4, no. 1, Sofia, pp. 133-136, 2010, ISSN 1313-1842.