Outline. A.I. Applications. Searching for the solution. Chess game. Deep Blue vs. Kasparov 20/03/2017

Transcription:

Outline
Giorgio Buttazzo
E-mail: g.buttazzo@sssup.it
Scuola Superiore Sant'Anna
retis.sssup.it/~giorgio/slides/neural/inn.html
Motivations, neural processing, learning paradigms, associative memories, pattern recognition, neural networks for control, conclusions.

A.I. Applications
Strategy games: Chess, Mastermind, Go.
Speech recognition: syntactic and semantic analysis.
Theorem proving: logic reasoning.
Expert systems: medical diagnoses, weather forecast.

Searching for the solution
It is based on the exploration of large data structures, often organized as a tree (e.g., with branching factor 3: 3, 9, 27, 81, 243, ... nodes per level).
After 10 steps there are about 60,000 nodes to explore.
After 20 steps the nodes are more than 3.5 billion!

Chess game: Deep Blue vs. Kasparov
Each position enables on average 20 legal positions. The entire search space consists of about 10^120 nodes. With the current computing power a computer can analyze about 10^10 positions/sec, hence an exhaustive search would take T = 10^120 / 10^10 = 10^110 s (for comparison, 10^18 s is already about 30 billion years).
Nevertheless, computers are more efficient than humans at searching large data structures: Deep Blue, analyzing 200 million positions per second, beat Kasparov 3.5-2.5 on May 11, 1997, 19:00 GMT.
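As a quick sanity check of the figures above, here is a minimal Python sketch (not from the slides; the helper names are illustrative) that reproduces the tree-growth and search-time estimates:

```python
# Minimal sketch (not from the slides): reproducing the figures quoted above.

def tree_nodes(branching: int, depth: int) -> int:
    """Number of nodes at a given depth of a complete search tree."""
    return branching ** depth

print(tree_nodes(3, 10))   # 59049 -> "about 60,000 nodes" after 10 steps
print(tree_nodes(3, 20))   # 3486784401 -> roughly 3.5 billion nodes after 20 steps

# Chess: ~10^120 positions in the search space, ~10^10 positions analyzed per second.
seconds = 10 ** 120 // 10 ** 10                        # 10^110 seconds for an exhaustive search
years_in_1e18_seconds = 10 ** 18 // (3600 * 24 * 365)  # ~3*10^10, i.e. about 30 billion years
print(seconds, years_in_1e18_seconds)
```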

The A.I. dilemma: babies beat computers
Computers are excellent at computing and searching, but they miserably fail when programmed to reproduce typical human activities: sensory perception, sensory-motor coordination, image recognition, adaptive behaviors.
Although a PC can beat the world's chess champion, the most powerful computer on the planet is not capable of competing with a 3-year-old baby in face recognition, voice recognition, or Lego construction.

Failure of classical A.I.
The classical A.I. approach consists of defining a set of rules and using an inference engine to combine facts with rules to solve the problem (expert system = inference engine + knowledge base of facts and rules).
The problem: complex actions depend on too many details that cannot be exactly formalized by a set of rules. They must be learned by direct experience. A mind needs a body!

Examples
Grasping: the way we grasp an object depends on several factors: object position, our posture, object dimension and shape, predicted weight, possible obstacles in between.
Speech recognition: it requires a learning phase, necessary for adapting to the speaker, filtering external noises, and separating possible other sounds or voices in the same frequency spectrum.

Character recognition
The difficulty of visual recognition becomes clear when you attempt to write a computer program to do it. Rules like "IF (there is a loop on top) AND (a vertical stroke at bottom right) THEN it is a 9" do not work. When you try to make such rules precise, you quickly get lost in an ocean of exceptions and special cases.

Image recognition
Moreover, pattern recognition also depends on the underlying context.

How do we do it?
How can we recognize characters so easily? How can we catch a ball without solving equations? How can we interpret sounds in a few milliseconds?

The neural approach
The strong difficulties experienced in solving such problems with a computer pushed researchers to investigate the brain more closely, for two different and complementary motivations: physicians, to understand how the brain works; engineers, to find new computational models to solve such problems.

Human Auditory System / Human Vision System
(Figures: anatomy of the ear, with ear canal, malleus, incus, stapes, and auditory nerve, and the frequency map of the cochlea from 60 Hz to 20 kHz; the human visual system.)

How does the brain work?
Our brain solves problems using a different approach:
Massive parallelism.
Lots of examples (experience) are used to infer the rules for recognizing patterns.
Associative memory: each sensory state evokes a cerebral state (an electro-chemical activity), which is stored depending on the needs.

Hitting a tennis ball
The trajectory depends on several factors: force of launch, initial angle, spin, wind speed. Predicting the trajectory requires measuring the variables precisely and solving complex equations, recomputed at every data acquisition. How can we do so?

Learning phase
In a learning phase, we try several actions and store the good ones: when the ball is seen in the upper-left area of the visual field, then make a back step; when the ball ...

Operating phase
Once trained, the brain executes the actions without thinking, based on the learned associations. A similar mechanism is used when we learn to drive a car or play an instrument.

Associative computation
A set of complex equations is solved through a look-up table, constructed from experience and refined with training: each sensory state retrieves a stored action. It offers noise tolerance and low energy consumption.

The biological neuron
(Figure: neuron with nucleus, dendrites, synapses, axon terminal, and spike; connections can be activating or inhibiting.)
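To make the look-up idea concrete, here is a toy Python sketch (purely illustrative, not from the slides; the states and actions are invented):

```python
# Toy sketch (illustrative only, not from the slides): associative computation as a
# look-up table. A discretized sensory state (where the ball appears in the visual field)
# retrieves an action learned by experience, instead of solving trajectory equations online.

learned_associations = {
    ("upper", "left"):  "step back and left",
    ("upper", "right"): "step back and right",
    ("lower", "left"):  "step forward and left",
    ("lower", "right"): "step forward and right",
}

def retrieve_action(vertical: str, horizontal: str) -> str:
    """Operating phase: retrieve the action associated with the current sensory state."""
    return learned_associations.get((vertical, horizontal), "stay put")

print(retrieve_action("upper", "left"))   # -> "step back and left"
```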

Some brain properties
Neuron speed: milliseconds.
Learning: due to synaptic changes.
Number of neurons: 10^11 to 10^12.
Number of synapses: 10^3 to 10^4 per neuron.
Computation: distributed and parallel.
Real-time processing and control, fault tolerance, and graceful degradation.

Hebb's rule, discovered by Donald Hebb in 1949: "When an axon of cell A is near enough to excite cell B and repeatedly or persistently takes part in firing it, some growth process or metabolic change takes place in one or both cells such that A's efficiency, as one of the cells firing B, is increased."

1943: McCulloch & Pitts proposed the first neural model, the Binary-Threshold Neuron. (Warren McCulloch, Walter Pitts)
1957: Rosenblatt, exploiting the results by Hebb in 1949, proposed a new model of neuron able to learn from examples, the Perceptron. The first Perceptron, the Mark I, was built in hardware: inputs were taken by a camera with 20x20 photocells, randomly connected to the neurons; weights were encoded in potentiometers and updated by motors. (Frank Rosenblatt)
1969: Minsky & Papert showed strong limitations of the perceptron, and the interest in neural networks disappeared for many years (a period called the AI winter). (Marvin Minsky, Seymour Papert)
1982: Hopfield developed a neural network capable of behaving as an associative memory. (John Hopfield)
1982: Kohonen developed a competitive learning model to create Self-Organizing Maps. (Teuvo Kohonen)

1983: Barto, Sutton & Anderson proposed a neural network capable of learning without supervision (Reinforcement Learning). (Andrew Barto, Richard Sutton, Charles Anderson)
1986: Rumelhart, Hinton & Williams formalized the process of learning by examples, defining the so-called Backpropagation algorithm. (David Rumelhart, Geoffrey Hinton, Ronald Williams)
1986-2006: On the one hand, BP became very popular in many application fields (engineering, physics, economy, chemistry, medicine, agricultural science, social science, etc.) as a general learning methodology to fit data sets. On the other hand, researchers had a hard time extending BP to more complex networks with many layers (deep networks).
2006-2012: LeCun, Bengio, Hinton found solutions to overcome the difficulties in extending Backpropagation to networks with many layers. Deep learning begins. (Yann LeCun, Yoshua Bengio, Geoffrey Hinton)
(Figure: interest in neural network research and applications over time, with a dip between 1986 and 2006.)
Since 2006, deep networks have been successfully applied in several application fields, including: classification; regression; dimensionality reduction; modeling textures; modeling motion; object segmentation; information retrieval; robotics; natural language processing; collaborative filtering (user prediction based on social data).
2012-today: Resurgence of neural networks. Explosion of interest in deep learning, mainly due to the availability of cheap computing power through GPUs and the interest of major players like Google, Facebook, and Microsoft in analyzing their huge data sets.

Interest in neural networks
(Figure: timeline of interest in neural networks from 1930 to 2020: 1943 McCulloch & Pitts binary neuron; 1949 Hebb; 1957 Rosenblatt Perceptron; 1969 Minsky-Papert and the AI winter; 1982 Hopfield, Kohonen, RL; 1986 BP, followed by the problems in extending BP; 2006 Deep Learning; 2012 GPU boosting.)

General neural model
To define a neural model we need to define:
The number of input channels: n
The type of input values: x_i
The synaptic weights: w_i
The activation function: a = F(x, w)
The output function: y = f(a)
Here x = (x_1, x_2, ..., x_n) is the input vector and w = (w_1, w_2, ..., w_n) is the weight vector.

Binary-threshold neuron (Heaviside's function)
The activation is the weighted sum of the inputs, a = w . x = sum_{i=1..n} w_i x_i, and the output is y = h(a), where the Heaviside function is h(x) = 1 if x > 0, 0 otherwise.
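A minimal Python sketch of the binary-threshold neuron defined above (not from the slides; the example weights are illustrative):

```python
# Minimal sketch (not from the slides): the binary-threshold neuron defined above,
# with activation a = w . x and output y = h(a), h being the Heaviside step function.
import numpy as np

def heaviside(a: float) -> int:
    """h(a) = 1 if a > 0, 0 otherwise."""
    return 1 if a > 0 else 0

def binary_threshold_neuron(x: np.ndarray, w: np.ndarray) -> int:
    """Compute y = h(w . x) for input vector x and synaptic weights w."""
    a = float(np.dot(w, x))   # activation: weighted sum of the inputs
    return heaviside(a)

# Illustrative weights: the output depends only on the sign of the weighted sum.
w = np.array([0.5, -1.0])
print(binary_threshold_neuron(np.array([1.0, 0.2]), w))  # a = 0.3  -> 1
print(binary_threshold_neuron(np.array([0.2, 1.0]), w))  # a = -0.9 -> 0
```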

Other possible output functions
Linear, rectified linear, abs, sigmoid (saturating at 1), tanh (saturating at +1/-1). (Plots of f(a) versus a omitted.)

Network of neurons
To define a neural network we need to define:
A neural model for all the neurons
The network topology
The activation mode for the neurons
The learning paradigm
The learning rule

Network topology
Fully connected network; layered or feed-forward network (input layer, hidden layer, output layer).

Representing connections
w_ji is the weight of neuron j for the connection coming from neuron i.

Fully connected networks
The weights of the network can be specified through a connection matrix (weight matrix) W of size n x n, in which each row contains the weights of one neuron (weights of neuron 1, weights of neuron 2, ..., weights of neuron n).

Layered networks
The weights of a network with L layers can be specified through L-1 weight matrices, one per pair of adjacent layers: W_2 connects layer 1 (input) to layer 2 (hidden), and W_3 connects layer 2 to layer 3 (output); the figure shows matrices of size 3x4 and 4x3.
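A minimal Python sketch of a layered (feed-forward) network specified by L-1 weight matrices, as described above (not from the slides; the layer sizes and random weights are illustrative only):

```python
# Minimal sketch (not from the slides): a layered (feed-forward) network with L = 3 layers
# specified by L - 1 weight matrices. Each matrix maps one layer's outputs to the
# activations of the next layer; sizes and weights below are illustrative.
import numpy as np

def step(a: np.ndarray) -> np.ndarray:
    """Heaviside output function applied element-wise."""
    return (a > 0).astype(float)

def forward(x: np.ndarray, weight_matrices: list) -> np.ndarray:
    """Propagate the input x through the layers of binary-threshold neurons."""
    y = x
    for W in weight_matrices:
        y = step(W @ y)      # activation a = W y, output y = h(a)
    return y

rng = np.random.default_rng(0)
W2 = rng.normal(size=(3, 4))   # input layer (4 neurons) -> hidden layer (3 neurons)
W3 = rng.normal(size=(2, 3))   # hidden layer (3 neurons) -> output layer (2 neurons)
print(forward(np.array([1.0, 0.0, 1.0, 0.5]), [W2, W3]))
```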

Activation modes
Synchronous (parallel): all neurons change state simultaneously, synchronized by a clock.
Asynchronous (sequential): neurons update their state one at a time, hence a specific order must be defined.

Learning
It refers to the capability of a neural network to modify its behavior in a desired direction by changing the synaptic weights. There are three basic learning paradigms: Supervised Learning, Competitive Learning, Reinforcement Learning.

Supervised Learning
A network learns to associate a set of given input patterns with a set of output values. We can distinguish two distinct phases:
Learning phase: the network stores the desired data (the weights are modified as a function of the errors).
Operating phase: the network is used on new data never seen before (the weights are kept fixed).

Learning phase
(Figure: a teacher provides the desired output z_k for each input pattern x_k; the learning rule modifies the weights w of the neural network based on the difference between z_k and the network output y_k.)

Competitive Learning
All neurons compete to specialize themselves in recognizing a particular input stimulus. The learning rule is designed so that the network self-organizes to create an isomorphism (a map) between stimuli and the neurons of the output layer. (Figure: input stimulus x_k enters the neural network with weights w and produces the output pattern y_k.)

Reinforcement Learning
(Figure: the input x_k enters the neural network, whose output y_k drives the system to be controlled; a critic evaluates the system output z_k and returns a reinforcement signal R to the network.)
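As a concrete illustration of supervised learning with weights modified as a function of the errors, here is a minimal Python sketch (not from the slides) using the classic perceptron learning rule on a toy data set:

```python
# Minimal sketch (not from the slides): supervised learning for a single binary-threshold
# neuron with the classic perceptron rule. The weights are modified as a function of the
# error between the desired output z_k and the network output y_k.
import numpy as np

def train_perceptron(patterns, targets, lr=0.1, epochs=20):
    """Learning phase: adjust the weights w from (input pattern, desired output) pairs."""
    w = np.zeros(patterns.shape[1])
    for _ in range(epochs):
        for x_k, z_k in zip(patterns, targets):
            y_k = 1 if np.dot(w, x_k) > 0 else 0   # network output
            w += lr * (z_k - y_k) * x_k            # weight update driven by the error
    return w

# Toy data: learn the logical OR of two inputs (a constant 1 is appended as a bias input).
X = np.array([[0, 0, 1], [0, 1, 1], [1, 0, 1], [1, 1, 1]], dtype=float)
Z = np.array([0, 1, 1, 1])
w = train_perceptron(X, Z)

# Operating phase: the weights are kept fixed and the neuron is applied to the inputs.
print([1 if np.dot(w, x) > 0 else 0 for x in X])   # expected: [0, 1, 1, 1]
```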