AST Catania Lab, STMicroelectronics. Valeria Tomaselli, Embedded Analytics Team. June 2017
Who We Are: a global semiconductor leader, with 2016 revenues of $6.97B. Revenues by region: 45% China and South Asia, 27% EMEA (Europe, Middle East & Africa), 15% Americas, 13% Japan and Korea. Approximately 43,500 employees worldwide, approximately 7,500 of them working in R&D. 11 manufacturing sites (front-end and back-end), over 75 sales & marketing offices. Listed on the New York Stock Exchange, Euronext Paris and Borsa Italiana, Milano.
Microelectronics: the development enabler. Against a worldwide GDP of ~$80,000B, services ($10,700B) are roughly 6x the electronics market ($1,700B), which is in turn roughly 5x the semiconductor market ($335B); semiconductors rest on materials ($36B) and equipment ($38B). All trademarks and logos are the property of their respective owners. All rights reserved. They are used here only as conceptual examples. Sources: IMF (International Monetary Fund) / SIA (Semiconductor Industry Association) / WorldBank.org / WSTS.
Application Strategic Focus: the leading provider of products and solutions for Smart Driving and the Internet of Things — Smart Things, Smart Home & City, Smart Industry, Smart Driving.
Smart Things: Making Every Thing Smarter. A Smart Thing understands the environment, manages data and transforms it into information, connects to the world, protects your data, and is energy efficient.
Smart Home & City: Making Homes & Cities Smarter. Smart City: infrastructure to improve traffic and municipal services; smart grid; intelligent, adaptive street lighting; smart buildings. Smart Home: smart control of heating, air conditioning, appliances, locks and alarms; smart meters to connect homes to the smart grid. More energy efficiency, convenience, comfort and security.
Smart Industry: Enabling smarter, safer and more efficient factories and workplaces. Factories that produce more efficiently; more flexibility and customization possibilities in the supply chain; more sustainable production with less waste and less energy used; safer working environments for people; better man-machine cooperation in the workplace; optimized usage of machines and tools.
Smart Driving: Making driving Safer, Greener and more Connected. Safer: having cars drive better than we can, always watching for threats, and making driving safer for car occupants and other road users by actively avoiding accidents. Greener: improving power and fuel efficiency, helping minimize emissions and car maintenance, and moving towards electric vehicles. More Connected: enabling personalized car entertainment and connectivity, and allowing vehicles to communicate with each other and with the infrastructure (V2X).
The Catania site
A balanced structure. By function: Manufacturing 54%, R&D / Designers 25%, Product Management & Administration 21%. By education: high school 63%, university degree 36%, others 1%. More than 1000 R&D specialists, 690 of whom are graduates. 3949 employees, 2/3 men and 1/3 women, average age 42 years. [Chart: personnel headcount, 1960–2016.] Update Dec 2016.
ST Catania: an integrated site. Integrated Excellence Center: R&D, Design, Production, Marketing & Supply Chain. Recognized leadership in discrete and integrated power. Competencies in key and growing microelectronics sectors (sensors, health, renewable energy, etc.). [Chart: worldwide vs Catania granted patents, 2007–2016.] Catania invention disclosures 2015. Strong partnerships with universities and research centers. Divisions on site: ADG, AMG, AST, FMT, MDG.
Collaborations with Universities and Research Centers University of Catania * University of Palermo University of Messina Politecnico of Torino * Scuola Superiore S.Anna, Pisa * University of Bologna University La Sapienza, Roma University of Calabria, Cosenza Politecnico of Bari University of Firenze INAF (Istituto Nazionale di Astrofisica) INFN (Istituto Nazionale di Fisica Nucleare) CNR (Consiglio Nazionale delle Ricerche) * CEA-LETI and Liten, Grenoble, France University of Tours, France CNES, Grenoble, France CERN, Geneva, Switzerland ESA, Brussels, Belgium IMEC, Brussels, Belgium Fraunhofer Institute, Germany VTT, Helsinki, Finland MIT, Boston, USA Johns Hopkins University, Baltimore, USA Arizona State University, Phoenix, USA IME, IMRE Labs of A*STAR, Singapore University of Tunis, Tunisia Waseda University, Tokyo, Japan (*) Laboratories in the Catania site
AST Catania
Industrial R&D: investigate beyond the state of the art; pave the way to next-generation devices (3-5 years); understand customers' needs; support product divisions and customers; add value to ST products. Marketing analysis: technology evolutions; ST competitors' positioning. Execution: internal R&D; participation in funded projects; collaboration with academic communities; patents and papers.
AST: Advanced Systems Technology. Focused product portfolio: Automotive (low-cost Advanced Driver Assistance Systems); Surveillance; IoT, Industrial and Home Automation, Smart Metering; Sensor Data Fusion & Classification; Low Power Digital Design; Smart Sensing; Digital Transceivers; Health & Wearable; Concrete Pressure Sensors; Wireless charger.
Embedded Analytics: A long journey. From the optic / image sensor through signal processing, image processing, computer vision and machine learning to robot interaction; from multiple sensors to sensor data fusion.
Embedded Analytics Organization: Artificial Intelligence; Video Analytics; Applications & Platforms.
Video Analytics
STV991: VG6640 + STV991, a smart platform for imaging. KEY FEATURES: ARM Cortex-R4 CPU @ 500 MHz; up to 2 MB SRAM; up to 4 MB Flash; real-time image signal processing (HDR, High Dynamic Range), 5 Mpixel; still / video compression (JPEG/H264); embedded video analytics (edge extractor, optical flow).
STV991 Applications: STV991 is an enabler of very complex algorithms. MOT (Moving Object Tracking) for Smart Mirror; MOT for CTA (Cross Traffic Alert) in automotive; MOT for drone landing.
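The slides do not detail the MOT algorithms themselves, and on the STV991 they build on the hardware edge-extractor and optical-flow blocks. Purely as a conceptual sketch, a minimal frame-differencing moving-object detector could look like this (all names and thresholds are made up for illustration):

```python
import numpy as np

def detect_motion(prev_frame, curr_frame, threshold=25):
    """Binary mask of pixels that changed between two grayscale frames."""
    diff = np.abs(curr_frame.astype(np.int16) - prev_frame.astype(np.int16))
    return diff > threshold

def bounding_box(mask):
    """Smallest (row0, row1, col0, col1) box enclosing all True pixels,
    or None when nothing moved."""
    rows = np.any(mask, axis=1)
    cols = np.any(mask, axis=0)
    if not rows.any():
        return None
    r = np.where(rows)[0]
    c = np.where(cols)[0]
    return r[0], r[-1], c[0], c[-1]

# Synthetic example: a bright 8x8 "object" moves right by 4 pixels.
prev = np.zeros((64, 64), dtype=np.uint8)
curr = np.zeros((64, 64), dtype=np.uint8)
prev[20:28, 10:18] = 200
curr[20:28, 14:22] = 200
mask = detect_motion(prev, curr)
box = bounding_box(mask)
```

A real tracker would associate such boxes across frames; this sketch only covers the detection step.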
Artificial Intelligence
Classical representation paradigm. Traditional model of pattern recognition (since the late '50s): fixed engineered features (or fixed kernels) + trainable classifier; deep knowledge of the specific data domain was required. Image → hand-crafted feature extractor → simple trainable classifier → "CAR". Audio → hand-crafted feature extractor → simple trainable classifier → "Rock Music".
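The two-stage pipeline above can be sketched in a few lines. This is a toy illustration, not any specific historical system: a crude hand-engineered gradient-orientation histogram (HOG-like) followed by a simple trainable nearest-centroid classifier:

```python
import numpy as np

def orientation_histogram(img, bins=8):
    """Hand-crafted feature: histogram of gradient orientations,
    weighted by gradient magnitude (a crude HOG-like descriptor)."""
    gy, gx = np.gradient(img.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.arctan2(gy, gx)                      # orientation in [-pi, pi]
    hist, _ = np.histogram(ang, bins=bins, range=(-np.pi, np.pi), weights=mag)
    total = hist.sum()
    return hist / total if total > 0 else hist

def nearest_centroid_fit(X, y):
    """The 'simple trainable classifier': one centroid per class."""
    return {c: np.mean([x for x, t in zip(X, y) if t == c], axis=0)
            for c in set(y)}

def nearest_centroid_predict(centroids, x):
    return min(centroids, key=lambda c: np.linalg.norm(x - centroids[c]))

# Toy data: vertically vs horizontally striped images.
img_v = np.tile(np.array([0.0, 255.0] * 8), (16, 1))      # vertical stripes
img_h = np.tile(np.array([[0.0], [255.0]] * 8), (1, 16))  # horizontal stripes
X = [orientation_histogram(i) for i in (img_v, img_h)]
centroids = nearest_centroid_fit(X, ["vertical", "horizontal"])
pred = nearest_centroid_predict(centroids, orientation_histogram(img_v))
```

The point of the paradigm shift on the next slide is that the `orientation_histogram` step — designed by a domain expert — is exactly what deep networks learn instead.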
Representation paradigm changes: DCNNs. 2012: a Deep Convolutional Neural Network (DCNN) wins the ImageNet challenge with a huge gap over competitors (courtesy of Alex Krizhevsky et al., "ImageNet Classification with Deep Convolutional Neural Networks"). Deep architecture: features are learned; everything becomes adaptive; no distinction between feature extractor and classifier; a big non-linear system trained from raw pixels to labels (Layer 1 → Layer 2 → … → Layer N: an end-to-end recognition system).
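The building block that replaces the hand-crafted extractor is a convolution layer whose filters are learned parameters. A naive numpy forward pass, purely for illustration (real DCNNs stack many such layers and use optimized libraries):

```python
import numpy as np

def conv2d(x, w, b):
    """Naive 'valid' 2-D convolution: x is (H, W), w is (K, kh, kw),
    b is (K,). Returns (K, H-kh+1, W-kw+1) feature maps.
    In a DCNN, w and b are learned by backpropagation rather than
    designed by hand."""
    K, kh, kw = w.shape
    H, W = x.shape
    out = np.zeros((K, H - kh + 1, W - kw + 1))
    for k in range(K):
        for i in range(H - kh + 1):
            for j in range(W - kw + 1):
                out[k, i, j] = np.sum(x[i:i + kh, j:j + kw] * w[k]) + b[k]
    return out

def relu(x):
    """Non-linearity between layers."""
    return np.maximum(x, 0.0)

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 8))      # stand-in for an image patch
w = rng.standard_normal((4, 3, 3))   # 4 learnable 3x3 filters
b = np.zeros(4)
features = relu(conv2d(x, w, b))     # 4 feature maps of size 6x6
```

Training adjusts `w` and `b` so the stacked feature maps become useful for the final classification, with no hand-designed stage anywhere in the chain.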
Learning hierarchical representations: more than one stage of non-linear feature transformation. Low-level feature → mid-level feature → high-level feature → trainable classifier. Feature visualization of a convolutional net on ImageNet [Zeiler & Fergus 2013].
DCNNs' spread. The Convolutional Neural Network was not invented overnight; a similar architecture was already proposed by LeCun et al. for handwriting recognition in 1998. The technology did not become widespread in the following years because of two main factors: the absence of very large datasets, which help reduce over-fitting problems, and the lack of powerful architectures for performing intensive computations. Both problems have been overcome in the recent past: ImageNet (14M images) and powerful GPUs. In the last years, CNNs have become ubiquitous.
DCNN: Applications. Computer vision domain: image/video search; image/video labeling; image/video segmentation; object detection; object tracking, etc. Speech recognition; human action recognition using mobile sensors; sensor data fusion. THE SAME PARADIGM FOR DIFFERENT DOMAINS.
Visual DCNN example: AlexNet, the reference implementation of deep convolutional neural networks (courtesy of Alex Krizhevsky et al., "ImageNet Classification with Deep Convolutional Neural Networks"). It discriminates 1000 ImageNet classes with a top-5 error rate of 18.2%. It consists of 5 convolutional layers and 3 fully connected layers. Original memory requirements: 217 MB.
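AlexNet's footprint can be cross-checked from its layer shapes. The arithmetic below uses the standard published layer sizes (not from the slide) and gives the well-known ~61M parameters, the large majority of them in the fully connected layers — which is why the stored model weighs in at hundreds of megabytes:

```python
# Parameter counts (weights + biases) for AlexNet's layers.
# conv2/4/5 are split into 2 groups in the original, halving their fan-in.
conv = [
    (96,  3 * 11 * 11),        # conv1
    (256, 48 * 5 * 5),         # conv2 (2 groups of 128 x 48x5x5)
    (384, 256 * 3 * 3),        # conv3
    (384, 192 * 3 * 3),        # conv4 (grouped)
    (256, 192 * 3 * 3),        # conv5 (grouped)
]
fc = [
    (4096, 256 * 6 * 6),       # fc6
    (4096, 4096),              # fc7
    (1000, 4096),              # fc8
]
conv_params = sum(n * fan_in + n for n, fan_in in conv)
fc_params = sum(n * fan_in + n for n, fan_in in fc)
total = conv_params + fc_params    # ~61 million parameters
fc_share = fc_params / total       # FC layers hold ~96% of the weights
```

At 32-bit precision that parameter count corresponds to roughly 230 MB, the same ballpark as the 217 MB quoted on the slide (the exact figure depends on how the model is stored).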
Orlando: Accelerating Deep Learning in Embedded Systems. A configurable, scalable and design-time parametric Convolutional Neural Network processing engine. Orlando SoC: ARM Cortex-M4 host, 8 dual-DSP clusters with instruction, data & shared memory, image & DCNN co-processor subsystem, global memory subsystem. Convolutional layers account for more than 90% of DCNN operations, hence 8 convolution HW accelerators allow high efficiency in area vs GOPS vs power consumption. In addition to the ARM Cortex-M4, the 8 DSP clusters allow both programmability and flexible mapping of diversified, custom DCNNs. Embedded memory enables a further reduction of the power consumption required by IoT applications.
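The ">90% of operations in convolutional layers" claim can be checked back-of-envelope on AlexNet (output sizes from the published architecture; biases, pooling and activations ignored):

```python
# Multiply-accumulate (MAC) counts per AlexNet layer:
# spatial output size x output channels x fan-in per output
# (grouping halves the fan-in of conv2/4/5).
conv_macs = [
    55 * 55 * 96  * (3 * 11 * 11),   # conv1
    27 * 27 * 256 * (48 * 5 * 5),    # conv2
    13 * 13 * 384 * (256 * 3 * 3),   # conv3
    13 * 13 * 384 * (192 * 3 * 3),   # conv4
    13 * 13 * 256 * (192 * 3 * 3),   # conv5
]
# Fully connected layers: one MAC per weight.
fc_macs = [4096 * (256 * 6 * 6), 4096 * 4096, 1000 * 4096]
conv_share = sum(conv_macs) / (sum(conv_macs) + sum(fc_macs))  # ~0.92
```

So the compute is concentrated in the convolutions (~92% here) while, as computed on the AlexNet slide, the memory sits mostly in the FC layers — which is exactly the trade-off the dedicated convolution accelerators exploit.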
Orlando Test Chip. [Die plot: coprocessor subsystem, OTP, high-speed camera IF, PLL, chip-to-chip link, M4 host, DSP cores and local memories, global memory subsystem.]
Technology: FD-SOI 28nm. Die size: 6239.2 um (X) x 5598.2 um (Y). Package: FBGA 15x15x1.83. Clock frequency: 200 MHz - 1.175 GHz. Supply voltages: 0.575 V - 1.1 V digital, 1.8 V I/O. On-chip RAM: 4x1 MB (global), 8x192 KB (DSP), 128 KB (host). Host: ARM Cortex-M4. DSPs: 16. Peak DSP performance (1.175 GHz, 1.1 V): 75 GOPS (dual 16b MAC loop). Convolutional accelerators: 8. CA size (including local memory): 0.27 sqmm. Max total CA performance (1.175 GHz, 1.1 V): 676 GOPS. Total CA power consumption @ 200 MHz, 0.575 V (AlexNet): 41 mW. CA peak efficiency @ 200 MHz, 0.575 V (AlexNet): 2.9 TOPS/W. (*) 1 MAC defined as 2 OPS (ADD + MUL).
Orlando DCNN Programming Flow. 1. DCNN TRAINING: training database + DCNN configurations; Caffe and other open-source deep learning frameworks produce DCNN weights + metadata. 2. ORLANDO-READY DCNN CONVERSION (AUTOMATED): the Orlando Configuration Tool (Model2Platform) takes a test/validation image dataset, performs fixed-point analysis, assigns optimal fixed-point precision layer-wise, compresses the weights and transforms their layout into Orlando-ready weights. 3. ORLANDO CONFIGURATION (PLANNED FOR AUTOMATION): network description to network topology; memory management, buffer placement; DMA descriptor-chain generation; optimal mapping and scheduling of network execution on the HW accelerators and DSP clusters.
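The layer-wise fixed-point assignment step can be illustrated with a simplified quantizer. This is a generic sketch of the idea, not the actual Orlando Configuration Tool: the function names and the signed 16-bit format are assumptions, and real tools also validate accuracy on the test dataset:

```python
import numpy as np

def choose_qformat(w, total_bits=16):
    """Pick the number of fractional bits for a layer's weights so the
    largest magnitude still fits in a signed fixed-point word.
    Layers with small weights get more fractional bits (finer precision)."""
    max_abs = np.max(np.abs(w))
    int_bits = max(0, int(np.ceil(np.log2(max_abs + 1e-12))))
    return total_bits - 1 - int_bits          # 1 bit reserved for the sign

def quantize(w, frac_bits, total_bits=16):
    scale = 2.0 ** frac_bits
    lo, hi = -2 ** (total_bits - 1), 2 ** (total_bits - 1) - 1
    return np.clip(np.round(w * scale), lo, hi).astype(np.int16)

def dequantize(q, frac_bits):
    return q.astype(float) / 2.0 ** frac_bits

# Hypothetical layers with different weight ranges.
rng = np.random.default_rng(1)
layers = {"conv1": rng.normal(0, 0.05, 1000), "fc7": rng.normal(0, 0.5, 1000)}
qformats = {name: choose_qformat(w) for name, w in layers.items()}
errors = {name: np.max(np.abs(
              dequantize(quantize(w, qformats[name]), qformats[name]) - w))
          for name, w in layers.items()}
```

Assigning the format per layer rather than globally is what keeps the quantization error small for layers whose weights span very different ranges.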
Orlando @ CES 2017: AlexNet classifies up to 1000 different object categories.
Orlando @ CES 2017: expression recognition is a complex task, as every person displays emotions very differently: there are some major features for each expression, but they are not shared by everyone. ST FacExp classifies up to 7 different expressions (Anger, Disgust, Fear, Happiness, Surprise, Sadness, Neutral).
Food Recognition (in collaboration with IPLab): quite a challenging application, with high intra-class variability and low inter-class variability; standard approaches (such as HOG+SVM) perform very poorly on this task. Benchmarking between a DCNN approach and the UniCT classic approach. DCNN: feature extraction from the last fc7 layer of a fine-tuned AlexNet model + multi-class SVM. UniCT Classic: Bag of SIFT & Bag of Textons + one-class SVM. Training set: UNICTFD889 (3583) + NonFoodFlickr (3583); test set: FoodFlickr (4008) + NonFoodFlickr (4422). Databases available at http://iplab.dmi.unict.it/madima2015/. DCNN confusion matrix: Food classified as Food 94.3%, as Non-Food 5.7%; Non-Food as Food 4.5%, as Non-Food 95.5%; accuracy = 94.9%. UniCT Classic: Food as Food 29%, as Non-Food 71%; Non-Food as Food 6%, as Non-Food 94%; accuracy = 61.5%.
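As a sanity check, the reported 94.9% overall accuracy for the DCNN approach follows from its confusion-matrix rows weighted by the test-set class sizes:

```python
# Test set from the slide: 4008 food and 4422 non-food images, with
# per-class correct-classification rates of 94.3% and 95.5% for the DCNN.
n_food, n_nonfood = 4008, 4422
correct = 0.943 * n_food + 0.955 * n_nonfood
acc = correct / (n_food + n_nonfood)   # ~0.949, matching the slide
```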
Food Recognition: Demo. Web demo available at http://yoda.dmi.unict.it/demofood/; the demo also runs on a PC.
Human Activity Recognition. 5 activities; training: 925 minutes; test: 591 minutes. Development platform: STMicroelectronics SensorTile with STM32L4. We process 3-axis accelerometer data with neural networks and estimate the activity performed by the user. The following activities are classified: stationary, walking/fast walking, running, biking, driving.
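The slide does not give the preprocessing pipeline; a typical sketch would window the 3-axis stream and compute per-window statistics before feeding the network. Everything below is an assumption for illustration — the 50 Hz sampling rate, the window sizes and the feature set are not from the slide:

```python
import numpy as np

FS = 50  # assumed accelerometer sampling rate in Hz

def windows(acc, win_s=2.0, hop_s=1.0, fs=FS):
    """Slice an (N, 3) accelerometer stream into overlapping windows."""
    win, hop = int(win_s * fs), int(hop_s * fs)
    return [acc[i:i + win] for i in range(0, len(acc) - win + 1, hop)]

def features(w):
    """Per-window features: mean and standard deviation of each axis,
    plus of the acceleration magnitude."""
    mag = np.linalg.norm(w, axis=1)
    return np.concatenate([w.mean(0), w.std(0), [mag.mean(), mag.std()]])

# Synthetic streams (units of g): stationary shows only gravity on z
# with little motion; running shows large, noisy accelerations.
rng = np.random.default_rng(2)
stationary = rng.normal([0, 0, 1], 0.01, (500, 3))
running = rng.normal([0, 0, 1], 0.5, (500, 3))
f_stat = [features(w) for w in windows(stationary)]
f_run = [features(w) for w in windows(running)]
```

Even these crude statistics separate the two synthetic activities (the magnitude variance is far larger when "running"), which is why such features are a common front end before a neural-network classifier.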
Example with MEMS: activity recognition
Acoustic Scene Classification (ASC) with CNNs, based on the IEEE DCASE2016 contest dataset. Developed two spectrogram-based CNN models for ASC (15 and 3 classes), starting from a reference model. 15 acoustic classes: beach, bus, cafe/restaurant, car, city_center, forest_path, grocery_store, home, library, metro_station, office, park, residential_area, train, tram. 3 acoustic classes: indoor, outdoor, in vehicle. Complexity estimated for the reference DCASE model and for the 15-class and 3-class ASC models. Started porting the AST ASC models to the target platform (STM32L4 class, e.g. SensorTile). Collecting an AST ASC dataset on microSD with the AudioLog application of SensorTile, for a 9-class target dataset: bus, cafe/restaurant, car, city_center, home, office, park, residential_area, train.
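The spectrogram "images" such CNNs consume can be computed with a plain STFT. A minimal numpy sketch — mel scaling and the actual DCASE front-end parameters are omitted, and all values here are illustrative:

```python
import numpy as np

def log_spectrogram(signal, n_fft=256, hop=128):
    """Log-magnitude STFT spectrogram: the 2-D 'image' a
    spectrogram-based CNN takes as input."""
    window = np.hanning(n_fft)
    frames = [signal[i:i + n_fft] * window
              for i in range(0, len(signal) - n_fft + 1, hop)]
    spec = np.abs(np.fft.rfft(np.array(frames), axis=1))
    return np.log(spec + 1e-10).T     # (freq_bins, time_frames)

# 1 s of a 1 kHz test tone at 16 kHz sampling rate.
fs = 16000
t = np.arange(fs) / fs
tone = np.sin(2 * np.pi * 1000 * t)
S = log_spectrogram(tone)
peak_bin = S.mean(axis=1).argmax()    # bin spacing is fs/n_fft = 62.5 Hz
```

For the pure tone, the energy concentrates in the bin at 1000 / 62.5 = 16, confirming the time-frequency layout the CNN then treats like an image.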
Keyword Spotting (KWS). Target: develop a wake-up-word solution to enable hands-free triggering of consumer devices. Status: an NN-based KWS solution is now available on SensorTile. Architecture: Mel-Frequency Cepstral Coefficients + Multi-Layer Perceptron. Performance evaluation: train: 11,156 samples; test: 4,782 samples; accuracy on the test set: 75%. To be done: continue the dataset word collection to better capture language variance; consolidate the model and improve performance.
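The Multi-Layer Perceptron stage of such a pipeline can be sketched as a plain numpy forward pass. The MFCC front end is assumed to run upstream, and the sizes and random weights below are illustrative only, not ST's model:

```python
import numpy as np

def mlp_forward(x, weights, biases):
    """MLP forward pass: hidden layers use ReLU, the output layer a
    softmax over the classes (e.g. keyword vs not-keyword)."""
    h = x
    for W, b in zip(weights[:-1], biases[:-1]):
        h = np.maximum(h @ W + b, 0.0)
    logits = h @ weights[-1] + biases[-1]
    e = np.exp(logits - logits.max())        # numerically stable softmax
    return e / e.sum()

# Hypothetical sizes: 13 MFCCs in, one hidden layer, 2 output classes.
rng = np.random.default_rng(3)
n_mfcc, n_hidden, n_classes = 13, 32, 2
weights = [rng.standard_normal((n_mfcc, n_hidden)) * 0.1,
           rng.standard_normal((n_hidden, n_classes)) * 0.1]
biases = [np.zeros(n_hidden), np.zeros(n_classes)]
probs = mlp_forward(rng.standard_normal(n_mfcc), weights, biases)
```

In a deployed wake-up-word system the trained weights would be frozen and the same forward pass run on each incoming MFCC frame (or stack of frames) on the microcontroller.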
Arrhythmia classifier. Goal: classify different cardiac arrhythmias. The input is an anomalous beat flagged by the arrhythmia detector; the arrhythmia classifier outputs the arrhythmia type. Runs with the BodyGateway electronic patch.
Applications & Platforms
STM32 and Nucleo boards: a new marketing model in ST, oriented to the mass market. Nucleo is a complete, very low-cost system for fast prototyping based on STM32 microcontrollers; shields make it easy to experiment with ST's solutions (sensors, motor drivers, connectivity, etc.); the Open Development Environment (ODE) allows simple prototyping. AST is working on systems exploiting such devices for: wearables, IoT, drones, robots, wellness, Industry 4.0.
ST Solutions for Customers: easy, affordable and rapid prototyping tools. STM32 ODE: Sense, Connect, Power Drive, Move Actuate, Translate. Reference design & solution boards: Wearable Sensor Unit STEVAL-WESU1; SensorTile STEVAL-STLKT01V1. Product eval boards: 1 W Wearable Wireless Charger STEVAL-ISB038V1 / STEVAL-ISB039V1; Bluetooth Low Energy STEVAL-IDB007V1; Near Field Communication FLEX-M24LR04E; Microphone Coupon Board STEVAL-MKI129V4; ...
ST-Drone Prototype 1/2. [Block diagram: FCU (Flight Control Unit), PWM-to-PPM converter, ESCs (Electronic Speed Controllers).]
ST-Drone Prototype 2/2.
FCU demo board: STM32F756VG (ARM Cortex-M7); LPS22HB pressure sensor; LIS3MDL 3D magnetometer; LSM303AGR 3-axis e-compass; LSM6DS33 6-axis IMU; PX4 open-source FW; PWM RC input (RC1-RC9, 5V, GND); USB.
PWM-to-PPM converter: STM32F303 Nucleo-32, PPM output.
GNSS demo board: Teseo III GNSS location hub; high dynamics (5 to 10 Hz); sensor interfaces (SPI, I2C, ADC).
ESC demo boards (x4): 30 V, 20 A FOC control (3 shunts); for 3s-5s batteries; 5 V BEC for the FCU; L6398 high-voltage gate drivers; STL160NS3LLH7 low-voltage STripFET H7 series; STM32F303 (ARM Cortex-M4).
Robot Assistance. Started activities on robotics for elderly surveillance and assistance, using competences in robotics and computer vision. ST wants to increase robot performance and interaction through ST components. A possible example uses face detection and tracking.
Industry 4.0: Automatic Defect Classification. Scope: today, the images produced by the Scanning Electron Microscope (SEM) showing different types of defects are manually classified by operators; the goal of the project is to use an algorithm to classify those images automatically. Technical explanation: recent improvements in neural networks, and especially deep learning, can now be adapted to our industry to classify images of defects.
PhD & Thesis. Multimodal representation learning (PhD activity with IPLab): multimodal data fusion for context and activity recognition; audio and video inputs are mapped into a shared representation from which both audio and video can be reconstructed. Thesis: internships & theses are possible in all the application fields shown.
QUESTIONS?
THANK YOU