Creating the Right Environment for Machine Learning Codesign. Cliff Young, Google AI

Size: px
Start display at page:

Download "Creating the Right Environment for Machine Learning Codesign. Cliff Young, Google AI"

Transcription

1 Creating the Right Environment for Machine Learning Codesign Cliff Young, Google AI 1

2 Deep Learning has Reinvigorated Hardware GPUs AlexNet, Speech. TPUs Many Google applications: AlphaGo and Translate, WaveNet speech. Startups both training and inference, many different approaches. I m looking forward to test-driving new systems. 2

3 Agenda Classic Codesign versus Codesign for Domain-specific Architectures Codesign in Google s TPUs Recommendations for enabling and supporting Codesign 3

4 ISA Classic Codesign at the HW/SW Interface HW SW Definition: design spanning two fields for a common goal. Classic version is between architecture and compiler. Instruction Set Architecture (ISA) as interface/contract between levels. Example of pushing things back and forth: instruction scheduling. VLIW (static scheduling) OoO (dynamic scheduling) Answer today=both. Ultimately ISA is a single thin layer between the hardware and software domains. 4

5 ISA Codesign for Domain-Specific Architectures HW Physics Compiler Numerics Application Library Algorithms Model (conceptual, not rigorous diagram) Now, there are many different layers, with many different interfaces. TPUs are still digital (for now). Some startups are pushing into physics (NVRAM, Flash, optical). Need to do codesign from physics to application: hard! 5

6 Codesign in TPUs (1): the Hardware Descriptions TPUv1 Large for its time systolic array: 256x256x2 128K ops/cycle. Reduced and mixed precision: quantized int8, int16, and int32. TPUv2 Keep the systolic array. Reduced precision for matrix multiplications in training: bfloat16. System is a torus of chips, an array of systolic arrays. Nice crisp physical description, but we ve missed where the complexity lurks. 6

7 Codesign in TPUs (2): The Implications TPUv1 Large systolic array: system and code dedicated to feeding the beast. Activation pipeline does pooling, elementwise operations, and sigmoids. Quantized 8-bit arithmetic. Software, numerics, and probability estimation issues. TPUv2 Still systolic arrays, but now with back propagation: XLA for code generation. Bfloat16 arithmetic: codesign multiple-win (next slides). Torus of chips: great for SIMD style and scalable data parallelism. WIP: Hardware is actually MIMD, so can support model parallelism. 7

8 Codesign in TPUs (3): Floating-point Formats fp32: Single-precision IEEE Floating Point Format Range: ~1e 38 to ~3e 38 Exponent: 8 bits S E E E E E E E E Mantissa (Significand): 23 bits M M M M M M M M M M M M M M M M M M M M M M M fp16: Half-precision IEEE Floating Point Format Range: ~5.96e 8 to Exponent: 5 bits Mantissa (Significand): 10 bits S E E E E E M M M M M M M M M M bfloat16: Brain Floating Point Format Range: ~1e 38 to ~3e 38 Exponent: 8 bits S E E E E E E E E Mantissa (Significand): 7 bits M M M M M M M 8

9 Codesign in TPUs (4): Bfloat16 as Good Codesign Hardware: shorter mantissa multiplier power, area float32: 23 2 =529 float16: 10 2 =100 bfloat16: 7 2 =49 Software: same dynamic range on number line, same Inf/NaN behavior as float. Numerics: trains without loss scaling [Micikevicius 2017]. System: bfloat16 as an implementation technique inside the matrix multiplier. Can also expose it to save memory capacity and bandwidth, with more work. 9

10 Codesign in TPUs: Summary Three big bets: Systolic array matrix multiplication. Reduced precision numerics, appropriate to inference or training. Torus of chips, for data/simd and model/mimd parallelism. Lots of implications from these bets at all levels of the stack. Is this enough, or can/should we be doing more? 10

11 Some open codesign questions in Machine Learning What s the best architecture? Will the market be the final arbiter? At the end of Moore s Law, perhaps architectural efficiency matters more. Software may matter more than hardware: MultiFlow s Compiler as most important artifact. Ease of use takes time: typically a decade for compilers to mature. What s the lower limit on numerics? Kolmogorov complexity. How much more is sparsity going to matter? Embeddings, attention, compute and memory savings. What else? Brains are sparse. When does batch=1 matter? Definitely for inference. For training? How can we use more weights, but touch fewer of them? Mixture of Experts. 11

12 Codesign for the Individual Contributor Be T-shaped : deep in one core competency, and broad (but shallower) in many. Cherry Murray There are superb engineers who are very narrow, and who are comfortable saying that s not my problem. They can be an important part of the solution, but they re not going to lead the way in a codesign approach. For codesign we need people who are curious, and who take ownership across domains (even when they aren t necessarily experts in that domain). 12

13 Codesign for Organizations Value and enable the connections and the connectors. Take time to have hallway conversations. Beware of Conway s Law: Any organization that designs a system...will inevitably produce a design whose structure is a copy of the organization's communication structure. Harder for big companies than startups (Dunbar number). Being a startup is no guarantee that you won t fall prey. Consider interleaving/rototilling your people. Functional orgs and seating plans discourage codesign interactions. 13

14 Codesign for the Community: Sharing, Metrics, and Infrastructure Research Ideas: huge, rapid flow through arxiv and deep learning conferences. Common Frameworks: TensorFlow and XLA are open-source projects. Benchmarking and Measurement: MLPerf! 14

15 MLPerf (mlperf.org) in One Slide Goal: Build SPEC for Machine Learning. Consortium of companies and universities. Philosophy: Agile development because ML is changing rapidly. Serve both the commercial and research communities. Enforce replicability to ensure reliable results. Use representative workloads, reflecting production use-cases. Keep benchmarking effort affordable (so all can play). Launching v0.5 in October! 15

16 Crisis as both Danger and Opportunity Danger: the end of Moore s Law, Dennard Scaling, and standard CPU performance. Limits of CMOS in sight. Intel 10nm woes, Global Foundries 7nm exit. Opportunity: the revolution in ML. Economic demand for ML accelerators. Architectural and codesign experimentation and transformation. Can we use ML to design better accelerators? Irony: exponential demand for ML computation, just at the end of Moore s Law. Efficiency is going to matter a lot. 16

17 Takeaways Codesign is Fundamental to Domain-specific Architecture TPUs made three big bets (so far), with system-wide consequences. Think hard about the software implications of your hardware choices. There are codesign problems whose solutions could transform ML Systems. For example, an algorithmic advance that plays well with hardware constraints: Large-batch training instead of decreased learning rate. K-FAC for smarter SGD steps. 1-bit training. A sparsity framework that enables novel memory and compute structures. To foster codesign, people, organization, and community matter. 17

Challenges in Transition

Challenges in Transition Challenges in Transition Keynote talk at International Workshop on Software Engineering Methods for Parallel and High Performance Applications (SEM4HPC 2016) 1 Kazuaki Ishizaki IBM Research Tokyo kiszk@acm.org

More information

Creating Intelligence at the Edge

Creating Intelligence at the Edge Creating Intelligence at the Edge Vladimir Stojanović E3S Retreat September 8, 2017 The growing importance of machine learning Page 2 Applications exploding in the cloud Huge interest to move to the edge

More information

Harnessing the Power of AI: An Easy Start with Lattice s sensai

Harnessing the Power of AI: An Easy Start with Lattice s sensai Harnessing the Power of AI: An Easy Start with Lattice s sensai A Lattice Semiconductor White Paper. January 2019 Artificial intelligence, or AI, is everywhere. It s a revolutionary technology that is

More information

Parallel Computing 2020: Preparing for the Post-Moore Era. Marc Snir

Parallel Computing 2020: Preparing for the Post-Moore Era. Marc Snir Parallel Computing 2020: Preparing for the Post-Moore Era Marc Snir THE (CMOS) WORLD IS ENDING NEXT DECADE So says the International Technology Roadmap for Semiconductors (ITRS) 2 End of CMOS? IN THE LONG

More information

Embedding Artificial Intelligence into Our Lives

Embedding Artificial Intelligence into Our Lives Embedding Artificial Intelligence into Our Lives Michael Thompson, Synopsys D&R IP-SOC DAYS Santa Clara April 2018 1 Agenda Introduction What AI is and is Not Where AI is being used Rapid Advance of AI

More information

Lecture 1: Introduction to Digital System Design & Co-Design

Lecture 1: Introduction to Digital System Design & Co-Design Design & Co-design of Embedded Systems Lecture 1: Introduction to Digital System Design & Co-Design Computer Engineering Dept. Sharif University of Technology Winter-Spring 2008 Mehdi Modarressi Topics

More information

Hardware-Software Co-Design Cosynthesis and Partitioning

Hardware-Software Co-Design Cosynthesis and Partitioning Hardware-Software Co-Design Cosynthesis and Partitioning EE8205: Embedded Computer Systems http://www.ee.ryerson.ca/~courses/ee8205/ Dr. Gul N. Khan http://www.ee.ryerson.ca/~gnkhan Electrical and Computer

More information

Digital Systems Design

Digital Systems Design Digital Systems Design Digital Systems Design and Test Dr. D. J. Jackson Lecture 1-1 Introduction Traditional digital design Manual process of designing and capturing circuits Schematic entry System-level

More information

Neural Networks The New Moore s Law

Neural Networks The New Moore s Law Neural Networks The New Moore s Law Chris Rowen, PhD, FIEEE CEO Cognite Ventures December 216 Outline Moore s Law Revisited: Efficiency Drives Productivity Embedded Neural Network Product Segments Efficiency

More information

Overview. 1 Trends in Microprocessor Architecture. Computer architecture. Computer architecture

Overview. 1 Trends in Microprocessor Architecture. Computer architecture. Computer architecture Overview 1 Trends in Microprocessor Architecture R05 Robert Mullins Computer architecture Scaling performance and CMOS Where have performance gains come from? Modern superscalar processors The limits of

More information

AI Application Processing Requirements

AI Application Processing Requirements AI Application Processing Requirements 1 Low Medium High Sensor analysis Activity Recognition (motion sensors) Stress Analysis or Attention Analysis Audio & sound Speech Recognition Object detection Computer

More information

MACHINE LEARNING Games and Beyond. Calvin Lin, NVIDIA

MACHINE LEARNING Games and Beyond. Calvin Lin, NVIDIA MACHINE LEARNING Games and Beyond Calvin Lin, NVIDIA THE MACHINE LEARNING ERA IS HERE And it is transforming every industry... including Game Development OVERVIEW NVIDIA Volta: An Architecture for Machine

More information

Artificial intelligence, made simple. Written by: Dale Benton Produced by: Danielle Harris

Artificial intelligence, made simple. Written by: Dale Benton Produced by: Danielle Harris Artificial intelligence, made simple Written by: Dale Benton Produced by: Danielle Harris THE ARTIFICIAL INTELLIGENCE MARKET IS SET TO EXPLODE AND NVIDIA, ALONG WITH THE TECHNOLOGY ECOSYSTEM INCLUDING

More information

Deep Learning. Dr. Johan Hagelbäck.

Deep Learning. Dr. Johan Hagelbäck. Deep Learning Dr. Johan Hagelbäck johan.hagelback@lnu.se http://aiguy.org Image Classification Image classification can be a difficult task Some of the challenges we have to face are: Viewpoint variation:

More information

Small World Network Architectures. NIPS 2017 Workshop

Small World Network Architectures. NIPS 2017 Workshop Small World Network Architectures NIPS 2017 Workshop Small World Networks We'd like to explore training models with very wide hidden states. More active memory, more information bandwidth, more easily

More information

Computer Architecture

Computer Architecture Computer Architecture Lecture 01 Arkaprava Basu www.csa.iisc.ac.in Acknowledgements Several of the slides in the deck are from Luis Ceze (Washington), Nima Horanmand (Stony Brook), Mark Hill, David Wood,

More information

Rethinking CAD. Brent Stucker, Univ. of Louisville Pat Lincoln, SRI

Rethinking CAD. Brent Stucker, Univ. of Louisville Pat Lincoln, SRI Rethinking CAD Brent Stucker, Univ. of Louisville Pat Lincoln, SRI The views expressed are those of the author and do not reflect the official policy or position of the Department of Defense or the U.S.

More information

Architecting Systems of the Future, page 1

Architecting Systems of the Future, page 1 Architecting Systems of the Future featuring Eric Werner interviewed by Suzanne Miller ---------------------------------------------------------------------------------------------Suzanne Miller: Welcome

More information

Architecture ISCA 16 Luis Ceze, Tom Wenisch

Architecture ISCA 16 Luis Ceze, Tom Wenisch Architecture 2030 @ ISCA 16 Luis Ceze, Tom Wenisch Mark Hill (CCC liaison, mentor) LIVE! Neha Agarwal, Amrita Mazumdar, Aasheesh Kolli (Student volunteers) Context Many fantastic community formation/visioning

More information

What is Artificial Intelligence? Alternate Definitions (Russell + Norvig) Human intelligence

What is Artificial Intelligence? Alternate Definitions (Russell + Norvig) Human intelligence CSE 3401: Intro to Artificial Intelligence & Logic Programming Introduction Required Readings: Russell & Norvig Chapters 1 & 2. Lecture slides adapted from those of Fahiem Bacchus. What is AI? What is

More information

KÜNSTLICHE INTELLIGENZ JOBKILLER VON MORGEN?

KÜNSTLICHE INTELLIGENZ JOBKILLER VON MORGEN? KÜNSTLICHE INTELLIGENZ JOBKILLER VON MORGEN? Marc Stampfli https://www.linkedin.com/in/marcstampfli/ https://twitter.com/marc_stampfli E-Mail: mstampfli@nvidia.com INTELLIGENT ROBOTS AND SMART MACHINES

More information

5G R&D at Huawei: An Insider Look

5G R&D at Huawei: An Insider Look 5G R&D at Huawei: An Insider Look Accelerating the move from theory to engineering practice with MATLAB and Simulink Huawei is the largest networking and telecommunications equipment and services corporation

More information

INTRODUCTION TO DEEP LEARNING. Steve Tjoa June 2013

INTRODUCTION TO DEEP LEARNING. Steve Tjoa June 2013 INTRODUCTION TO DEEP LEARNING Steve Tjoa kiemyang@gmail.com June 2013 Acknowledgements http://ufldl.stanford.edu/wiki/index.php/ UFLDL_Tutorial http://youtu.be/ayzoubkuf3m http://youtu.be/zmnoatzigik 2

More information

CS4617 Computer Architecture

CS4617 Computer Architecture 1/26 CS4617 Computer Architecture Lecture 2 Dr J Vaughan September 10, 2014 2/26 Amdahl s Law Speedup = Execution time for entire task without using enhancement Execution time for entire task using enhancement

More information

DEEP LEARNING ON RF DATA. Adam Thompson Senior Solutions Architect March 29, 2018

DEEP LEARNING ON RF DATA. Adam Thompson Senior Solutions Architect March 29, 2018 DEEP LEARNING ON RF DATA Adam Thompson Senior Solutions Architect March 29, 2018 Background Information Signal Processing and Deep Learning Radio Frequency Data Nuances AGENDA Complex Domain Representations

More information

Image Processing Architectures (and their future requirements)

Image Processing Architectures (and their future requirements) Lecture 16: Image Processing Architectures (and their future requirements) Visual Computing Systems Smart phone processing resources Example SoC: Qualcomm Snapdragon Image credit: Qualcomm Apple A7 (iphone

More information

Lecture 1. Tinoosh Mohsenin

Lecture 1. Tinoosh Mohsenin Lecture 1 Tinoosh Mohsenin Today Administrative items Syllabus and course overview Digital systems and optimization overview 2 Course Communication Email Urgent announcements Web page http://www.csee.umbc.edu/~tinoosh/cmpe650/

More information

PoC #1 On-chip frequency generation

PoC #1 On-chip frequency generation 1 PoC #1 On-chip frequency generation This PoC covers the full on-chip frequency generation system including transport of signals to receiving blocks. 5G frequency bands around 30 GHz as well as 60 GHz

More information

Pragmatic Strategies for Adopting Model-Based Design for Embedded Applications. The MathWorks, Inc.

Pragmatic Strategies for Adopting Model-Based Design for Embedded Applications. The MathWorks, Inc. Pragmatic Strategies for Adopting Model-Based Design for Embedded Applications Larry E. Kendrick, PhD The MathWorks, Inc. Senior Principle Technical Consultant Introduction What s MBD? Why do it? Make

More information

CS Computer Architecture Spring Lecture 04: Understanding Performance

CS Computer Architecture Spring Lecture 04: Understanding Performance CS 35101 Computer Architecture Spring 2008 Lecture 04: Understanding Performance Taken from Mary Jane Irwin (www.cse.psu.edu/~mji) and Kevin Schaffer [Adapted from Computer Organization and Design, Patterson

More information

CPS331 Lecture: Search in Games last revised 2/16/10

CPS331 Lecture: Search in Games last revised 2/16/10 CPS331 Lecture: Search in Games last revised 2/16/10 Objectives: 1. To introduce mini-max search 2. To introduce the use of static evaluation functions 3. To introduce alpha-beta pruning Materials: 1.

More information

The Power of Exponential Thinking

The Power of Exponential Thinking The Power of Exponential Thinking An Introduction to Singularity University 2016 Singularity University What is Singularity University (SU)? We are a global community using exponential technologies to

More information

Proposers Day Workshop

Proposers Day Workshop Proposers Day Workshop Monday, January 23, 2017 @srcjump, #JUMPpdw Cognitive Computing Vertical Research Center Mandy Pant Academic Research Director Intel Corporation Center Motivation Today s deep learning

More information

Application of AI Technology to Industrial Revolution

Application of AI Technology to Industrial Revolution Application of AI Technology to Industrial Revolution By Dr. Suchai Thanawastien 1. What is AI? Artificial Intelligence or AI is a branch of computer science that tries to emulate the capabilities of learning,

More information

GPU ACCELERATED DEEP LEARNING WITH CUDNN

GPU ACCELERATED DEEP LEARNING WITH CUDNN GPU ACCELERATED DEEP LEARNING WITH CUDNN Larry Brown Ph.D. March 2015 AGENDA 1 Introducing cudnn and GPUs 2 Deep Learning Context 3 cudnn V2 4 Using cudnn 2 Introducing cudnn and GPUs 3 HOW GPU ACCELERATION

More information

Goals of this Course. CSE 473 Artificial Intelligence. AI as Science. AI as Engineering. Dieter Fox Colin Zheng

Goals of this Course. CSE 473 Artificial Intelligence. AI as Science. AI as Engineering. Dieter Fox Colin Zheng CSE 473 Artificial Intelligence Dieter Fox Colin Zheng www.cs.washington.edu/education/courses/cse473/08au Goals of this Course To introduce you to a set of key: Paradigms & Techniques Teach you to identify

More information

Using Deep Learning for Sentiment Analysis and Opinion Mining

Using Deep Learning for Sentiment Analysis and Opinion Mining Using Deep Learning for Sentiment Analysis and Opinion Mining Gauging opinions is faster and more accurate. Abstract How does a computer analyze sentiment? How does a computer determine if a comment or

More information

Efficient Deep Learning in Communications

Efficient Deep Learning in Communications Fraunhofer Image Processing Heinrich Hertz Institute Efficient Deep Learning in Communications Dr. Wojciech Samek Fraunhofer HHI, Machine Learning Group Fraunhofer Heinrich Hertz Institute, Einsteinufer

More information

Examen. NU reproducere mecanica ASPC, P11. Foundations of Software Engineering

Examen. NU reproducere mecanica ASPC, P11. Foundations of Software Engineering radu.marinescu@cs.upt.ro 0256-40.40.58 ASPC, P11 1 Examen NU reproducere mecanica Surse multiple de informare n ati u m r fo a va s a re ti c ede v Citi e ct d pun loose.upt.ro/~oose Teorie & Exercitii

More information

Detector Implementations Based on Software Defined Radio for Next Generation Wireless Systems Janne Janhunen

Detector Implementations Based on Software Defined Radio for Next Generation Wireless Systems Janne Janhunen GIGA seminar 11.1.2010 Detector Implementations Based on Software Defined Radio for Next Generation Wireless Systems Janne Janhunen janne.janhunen@ee.oulu.fi 2 Outline Introduction Benefits and Challenges

More information

Computer Aided Design of Electronics

Computer Aided Design of Electronics Computer Aided Design of Electronics [Datorstödd Elektronikkonstruktion] Zebo Peng, Petru Eles, and Nima Aghaee Embedded Systems Laboratory IDA, Linköping University www.ida.liu.se/~tdts01 Electronic Systems

More information

Hardware-Software Codesign. 0. Organization

Hardware-Software Codesign. 0. Organization Hardware-Software Codesign 0. Organization Lothar Thiele 0-1 Overview Introduction and motivation Course synopsis Administrativa 0-2 What is HW-SW Codesign?... integrated design of systems that consist

More information

Challenges of in-circuit functional timing testing of System-on-a-Chip

Challenges of in-circuit functional timing testing of System-on-a-Chip Challenges of in-circuit functional timing testing of System-on-a-Chip David and Gregory Chudnovsky Institute for Mathematics and Advanced Supercomputing Polytechnic Institute of NYU Deep sub-micron devices

More information

Introduction to co-simulation. What is HW-SW co-simulation?

Introduction to co-simulation. What is HW-SW co-simulation? Introduction to co-simulation CPSC489-501 Hardware-Software Codesign of Embedded Systems Mahapatra-TexasA&M-Fall 00 1 What is HW-SW co-simulation? A basic definition: Manipulating simulated hardware with

More information

CPSC 340: Machine Learning and Data Mining. Convolutional Neural Networks Fall 2018

CPSC 340: Machine Learning and Data Mining. Convolutional Neural Networks Fall 2018 CPSC 340: Machine Learning and Data Mining Convolutional Neural Networks Fall 2018 Admin Mike and I finish CNNs on Wednesday. After that, we will cover different topics: Mike will do a demo of training

More information

Understanding Neural Networks : Part II

Understanding Neural Networks : Part II TensorFlow Workshop 2018 Understanding Neural Networks Part II : Convolutional Layers and Collaborative Filters Nick Winovich Department of Mathematics Purdue University July 2018 Outline 1 Convolutional

More information

Technology Transfers Opportunities, Process and Risk Mitigation. Radhika Srinivasan, Ph.D. IBM

Technology Transfers Opportunities, Process and Risk Mitigation. Radhika Srinivasan, Ph.D. IBM Technology Transfers Opportunities, Process and Risk Mitigation Radhika Srinivasan, Ph.D. IBM Abstract Technology Transfer is quintessential to any technology installation or semiconductor fab bring up.

More information

TOOLS AND PROCESSORS FOR COMPUTER VISION. Selected Results from the Embedded Vision Alliance s Spring 2017 Computer Vision Developer Survey

TOOLS AND PROCESSORS FOR COMPUTER VISION. Selected Results from the Embedded Vision Alliance s Spring 2017 Computer Vision Developer Survey TOOLS AND PROCESSORS FOR COMPUTER VISION Selected Results from the Embedded Vision Alliance s Spring 2017 Computer Vision Developer Survey 1 EXECUTIVE SUMMARY Since 2015, the Embedded Vision Alliance has

More information

THE DEEP WATERS OF DEEP LEARNING

THE DEEP WATERS OF DEEP LEARNING THE DEEP WATERS OF DEEP LEARNING THE CURRENT AND FUTURE IMPACT OF ARTIFICIAL INTELLIGENCE ON THE PUBLISHING INDUSTRY. BY AND FRANKFURTER BUCHMESSE 2/6 Given the ever increasing number of publishers exploring

More information

Energy efficient multi-granular arithmetic in a coarse-grain reconfigurable architecture

Energy efficient multi-granular arithmetic in a coarse-grain reconfigurable architecture Eindhoven University of Technology MASTER Energy efficient multi-granular arithmetic in a coarse-grain reconfigurable architecture Louwers, S.T. Award date: 216 Link to publication Disclaimer This document

More information

Applying Automated Optical Inspection Ben Dawson, DALSA Coreco Inc., ipd Group (987)

Applying Automated Optical Inspection Ben Dawson, DALSA Coreco Inc., ipd Group (987) Applying Automated Optical Inspection Ben Dawson, DALSA Coreco Inc., ipd Group bdawson@goipd.com (987) 670-2050 Introduction Automated Optical Inspection (AOI) uses lighting, cameras, and vision computers

More information

Processors Processing Processors. The meta-lecture

Processors Processing Processors. The meta-lecture Simulators 5SIA0 Processors Processing Processors The meta-lecture Why Simulators? Your Friend Harm Why Simulators? Harm Loves Tractors Harm Why Simulators? The outside world Unfortunately for Harm you

More information

Low-Power Digital CMOS Design: A Survey

Low-Power Digital CMOS Design: A Survey Low-Power Digital CMOS Design: A Survey Krister Landernäs June 4, 2005 Department of Computer Science and Electronics, Mälardalen University Abstract The aim of this document is to provide the reader with

More information

Computer Go: from the Beginnings to AlphaGo. Martin Müller, University of Alberta

Computer Go: from the Beginnings to AlphaGo. Martin Müller, University of Alberta Computer Go: from the Beginnings to AlphaGo Martin Müller, University of Alberta 2017 Outline of the Talk Game of Go Short history - Computer Go from the beginnings to AlphaGo The science behind AlphaGo

More information

SMARTPHONE SENSOR BASED GESTURE RECOGNITION LIBRARY

SMARTPHONE SENSOR BASED GESTURE RECOGNITION LIBRARY SMARTPHONE SENSOR BASED GESTURE RECOGNITION LIBRARY Sidhesh Badrinarayan 1, Saurabh Abhale 2 1,2 Department of Information Technology, Pune Institute of Computer Technology, Pune, India ABSTRACT: Gestures

More information

A.I in Automotive? Why and When.

A.I in Automotive? Why and When. A.I in Automotive? Why and When. AGENDA 01 02 03 04 Definitions A.I? A.I in automotive Now? Next big A.I breakthrough in Automotive 01 DEFINITIONS DEFINITIONS Artificial Intelligence Artificial Intelligence:

More information

Outline Simulators and such. What defines a simulator? What about emulation?

Outline Simulators and such. What defines a simulator? What about emulation? Outline Simulators and such Mats Brorsson & Mladen Nikitovic ICT Dept of Electronic, Computer and Software Systems (ECS) What defines a simulator? Why are simulators needed? Classifications Case studies

More information

EN164: Design of Computing Systems Lecture 22: Processor / ILP 3

EN164: Design of Computing Systems Lecture 22: Processor / ILP 3 EN164: Design of Computing Systems Lecture 22: Processor / ILP 3 Professor Sherief Reda http://scale.engin.brown.edu Electrical Sciences and Computer Engineering School of Engineering Brown University

More information

Chapter 3. H/w s/w interface. hardware software Vijaykumar ECE495K Lecture Notes: Chapter 3 1

Chapter 3. H/w s/w interface. hardware software Vijaykumar ECE495K Lecture Notes: Chapter 3 1 Chapter 3 hardware software H/w s/w interface Problems Algorithms Prog. Lang & Interfaces Instruction Set Architecture Microarchitecture (Organization) Circuits Devices (Transistors) Bits 29 Vijaykumar

More information

Perspectives on Neuromorphic Computing

Perspectives on Neuromorphic Computing Perspectives on Neuromorphic Computing Todd Hylton Brain Corporation hylton@braincorporation.com ORNL Neuromorphic Computing Workshop June 29, 2016 Outline Retrospective SyNAPSE Perspective Neuromorphic

More information

MS Project :Trading Accuracy for Power with an Under-designed Multiplier Architecture Parag Kulkarni Adviser : Prof. Puneet Gupta Electrical Eng.

MS Project :Trading Accuracy for Power with an Under-designed Multiplier Architecture Parag Kulkarni Adviser : Prof. Puneet Gupta Electrical Eng. MS Project :Trading Accuracy for Power with an Under-designed Multiplier Architecture Parag Kulkarni Adviser : Prof. Puneet Gupta Electrical Eng., UCLA - http://nanocad.ee.ucla.edu/ 1 Outline Introduction

More information

TOOLS & PROCESSORS FOR COMPUTER VISION. Selected Results from the Embedded Vision Alliance s Fall 2017 Computer Vision Developer Survey

TOOLS & PROCESSORS FOR COMPUTER VISION. Selected Results from the Embedded Vision Alliance s Fall 2017 Computer Vision Developer Survey TOOLS & PROCESSORS FOR COMPUTER VISION Selected Results from the Embedded Vision Alliance s Fall 2017 Computer Vision Developer Survey ABOUT THE EMBEDDED VISION ALLIANCE EXECUTIVE SUMMA Y Since 2015, the

More information

Computer Vision at the Edge and in the Cloud: Architectures, Algorithms, Processors, and Tools

Computer Vision at the Edge and in the Cloud: Architectures, Algorithms, Processors, and Tools Computer Vision at the Edge and in the Cloud: Architectures, Algorithms, Processors, and Tools IEEE Signal Processing Society Santa Clara Valley Chapter - April 11, 2018 Jeff Bier Founder, Embedded Vision

More information

www.ixpug.org @IXPUG1 What is IXPUG? http://www.ixpug.org/ Now Intel extreme Performance Users Group Global community-driven organization (independently ran) Fosters technical collaboration around tuning

More information

Fixed Point Lms Adaptive Filter Using Partial Product Generator

Fixed Point Lms Adaptive Filter Using Partial Product Generator Fixed Point Lms Adaptive Filter Using Partial Product Generator Vidyamol S M.Tech Vlsi And Embedded System Ma College Of Engineering, Kothamangalam,India vidyas.saji@gmail.com Abstract The area and power

More information

Systems Engineering Overview. Axel Claudio Alex Gonzalez

Systems Engineering Overview. Axel Claudio Alex Gonzalez Systems Engineering Overview Axel Claudio Alex Gonzalez Objectives Provide additional insights into Systems and into Systems Engineering Walkthrough the different phases of the product lifecycle Discuss

More information

A Computing Research Perspective on a Learning Healthcare System. Kevin Sullivan Computer Science University of Virginia 4/11/2013

A Computing Research Perspective on a Learning Healthcare System. Kevin Sullivan Computer Science University of Virginia 4/11/2013 A Computing Research Perspective on a Learning Healthcare System Kevin Sullivan Computer Science University of Virginia 4/11/2013 Outline Motivation unmet potential, pressing need Goal use driven, fundamental

More information

Lecture 1 What is AI?

Lecture 1 What is AI? Lecture 1 What is AI? CSE 473 Artificial Intelligence Oren Etzioni 1 AI as Science What are the most fundamental scientific questions? 2 Goals of this Course To teach you the main ideas of AI. Give you

More information

Measuring and Evaluating Computer System Performance

Measuring and Evaluating Computer System Performance Measuring and Evaluating Computer System Performance Performance Marches On... But what is performance? The bottom line: Performance Car Time to Bay Area Speed Passengers Throughput (pmph) Ferrari 3.1

More information

Performance Metrics. Computer Architecture. Outline. Objectives. Basic Performance Metrics. Basic Performance Metrics

Performance Metrics. Computer Architecture. Outline. Objectives. Basic Performance Metrics. Basic Performance Metrics Computer Architecture Prof. Dr. Nizamettin AYDIN naydin@yildiz.edu.tr nizamettinaydin@gmail.com Performance Metrics http://www.yildiz.edu.tr/~naydin 1 2 Objectives How can we meaningfully measure and compare

More information

WHITE PAPER. Hybrid Beamforming for Massive MIMO Phased Array Systems

WHITE PAPER. Hybrid Beamforming for Massive MIMO Phased Array Systems WHITE PAPER Hybrid Beamforming for Massive MIMO Phased Array Systems Introduction This paper demonstrates how you can use MATLAB and Simulink features and toolboxes to: 1. Design and synthesize complex

More information

ΕΠΛ 605: Προχωρημένη Αρχιτεκτονική

ΕΠΛ 605: Προχωρημένη Αρχιτεκτονική ΕΠΛ 605: Προχωρημένη Αρχιτεκτονική Υπολογιστών Presentation of UniServer Horizon 2020 European project findings: X-Gene server chips, voltage-noise characterization, high-bandwidth voltage measurements,

More information

Learning to Play Love Letter with Deep Reinforcement Learning

Learning to Play Love Letter with Deep Reinforcement Learning Learning to Play Love Letter with Deep Reinforcement Learning Madeleine D. Dawson* MIT mdd@mit.edu Robert X. Liang* MIT xbliang@mit.edu Alexander M. Turner* MIT turneram@mit.edu Abstract Recent advancements

More information

Design of Mixed-Signal Microsystems in Nanometer CMOS

Design of Mixed-Signal Microsystems in Nanometer CMOS Design of Mixed-Signal Microsystems in Nanometer CMOS Carl Grace Lawrence Berkeley National Laboratory August 2, 2012 DOE BES Neutron and Photon Detector Workshop Introduction Common themes in emerging

More information

Advanced FPGA Design. Tinoosh Mohsenin CMPE 491/691 Spring 2012

Advanced FPGA Design. Tinoosh Mohsenin CMPE 491/691 Spring 2012 Advanced FPGA Design Tinoosh Mohsenin CMPE 491/691 Spring 2012 Today Administrative items Syllabus and course overview Digital signal processing overview 2 Course Communication Email Urgent announcements

More information

Re-Visiting Power Measurement for the Green500

Re-Visiting Power Measurement for the Green500 Re-Visiting Power Measurement for the Green500 Thomas R. W. Scogland (LLNL/CASC, Green500) The Green500 List and its Continuing 1 Evolution BoF, November 2014 Level 1 Requirements Workload phase: Measure

More information

Convolutional neural networks

Convolutional neural networks Convolutional neural networks Themes Curriculum: Ch 9.1, 9.2 and http://cs231n.github.io/convolutionalnetworks/ The simple motivation and idea How it s done Receptive field Pooling Dilated convolutions

More information

Open Source Digital Camera on Field Programmable Gate Arrays

Open Source Digital Camera on Field Programmable Gate Arrays Open Source Digital Camera on Field Programmable Gate Arrays Cristinel Ababei, Shaun Duerr, Joe Ebel, Russell Marineau, Milad Ghorbani Moghaddam, and Tanzania Sewell Dept. of Electrical and Computer Engineering,

More information

AI Frontiers. Dr. Dario Gil Vice President IBM Research

AI Frontiers. Dr. Dario Gil Vice President IBM Research AI Frontiers Dr. Dario Gil Vice President IBM Research 1 AI is the new IT MIT Intro to Machine Learning course: 2013 138 students 2016 302 students 2017 700 students 2 What is AI? Artificial Intelligence

More information

What We Talk About When We Talk About AI

What We Talk About When We Talk About AI MAGAZINE What We Talk About When We Talk About AI ARTIFICIAL INTELLIGENCE TECHNOLOGY 30 OCT 2015 W e have all seen the films, read the comics or been awed by the prophetic books, and from them we think

More information

Framework Programme 7

Framework Programme 7 Framework Programme 7 1 Joining the EU programmes as a Belarusian 1. Introduction to the Framework Programme 7 2. Focus on evaluation issues + exercise 3. Strategies for Belarusian organisations + exercise

More information

Foundations Required for Novel Compute (FRANC) BAA Frequently Asked Questions (FAQ) Updated: October 24, 2017

Foundations Required for Novel Compute (FRANC) BAA Frequently Asked Questions (FAQ) Updated: October 24, 2017 1. TA-1 Objective Q: Within the BAA, the 48 th month objective for TA-1a/b is listed as functional prototype. What form of prototype is expected? Should an operating system and runtime be provided as part

More information

PURELY NEURAL MACHINE TRANSLATION

PURELY NEURAL MACHINE TRANSLATION PURELY NEURAL MACHINE TRANSLATION ISSUE 1 NEURAL MACHINE TRANSLATION (NMT): LET S GO BACK TO THE ORIGINS Each of us have experienced or heard of deep learning in day-to-day business applications. What

More information

THE AI REVOLUTION. How Artificial Intelligence is Redefining Marketing Automation

THE AI REVOLUTION. How Artificial Intelligence is Redefining Marketing Automation THE AI REVOLUTION How Artificial Intelligence is Redefining Marketing Automation The implications of Artificial Intelligence for modern day marketers The shift from Marketing Automation to Intelligent

More information

A Framework for Assessing the Feasibility of Learning Algorithms in Power-Constrained ASICs

A Framework for Assessing the Feasibility of Learning Algorithms in Power-Constrained ASICs A Framework for Assessing the Feasibility of Learning Algorithms in Power-Constrained ASICs 1 Introduction Alexander Neckar with David Gal, Eric Glass, and Matt Murray (from EE382a) Whether due to injury

More information

Analog Custom Layout Engineer

Analog Custom Layout Engineer Analog Custom Layout Engineer Huawei Canada s rapid growth has created an excellent opportunity to build and grow your career and make a big impact to everyone s life. The IC Lab is currently looking to

More information

SW simulation and Performance Analysis

SW simulation and Performance Analysis SW simulation and Performance Analysis In Multi-Processing Embedded Systems Eugenio Villar University of Cantabria Context HW/SW Embedded Systems Design Flow HW/SW Simulation Performance Analysis Design

More information

THE INFLUENCE OF ACADEMIC RESEARCH ON INDUSTRY R&D. Steve Keckler, Vice President of Architecture Research June 19, 2016

THE INFLUENCE OF ACADEMIC RESEARCH ON INDUSTRY R&D. Steve Keckler, Vice President of Architecture Research June 19, 2016 THE INFLUENCE OF ACADEMIC RESEARCH ON INDUSTRY R&D Steve Keckler, Vice President of Architecture Research June 19, 2016 AGENDA Academic/Industry Partnership Architecture 2030 2 My Background/Experience

More information

Exploring the Software Stack for Underdesigned Computing Machines Rajesh Gupta UC San Diego.

Exploring the Software Stack for Underdesigned Computing Machines Rajesh Gupta UC San Diego. Exploring the Software Stack for Underdesigned Computing Machines Rajesh Gupta UC San Diego. 1 Exploring the Software Stack for Underdesigned Computing Machines 1 Exploring the Software Stack for Underdesigned

More information

Bricken Technologies Corporation Presentations: Bricken Technologies Corporation Corporate: Bricken Technologies Corporation Marketing:

Bricken Technologies Corporation Presentations: Bricken Technologies Corporation Corporate: Bricken Technologies Corporation Marketing: TECHNICAL REPORTS William Bricken compiled 2004 Bricken Technologies Corporation Presentations: 2004: Synthesis Applications of Boundary Logic 2004: BTC Board of Directors Technical Review (quarterly)

More information

Prediction of Cluster System Load Using Artificial Neural Networks

Prediction of Cluster System Load Using Artificial Neural Networks Prediction of Cluster System Load Using Artificial Neural Networks Y.S. Artamonov 1 1 Samara National Research University, 34 Moskovskoe Shosse, 443086, Samara, Russia Abstract Currently, a wide range

More information

신경망기반자동번역기술. Konkuk University Computational Intelligence Lab. 김강일

신경망기반자동번역기술. Konkuk University Computational Intelligence Lab.  김강일 신경망기반자동번역기술 Konkuk University Computational Intelligence Lab. http://ci.konkuk.ac.kr kikim01@kunkuk.ac.kr 김강일 Index Issues in AI and Deep Learning Overview of Machine Translation Advanced Techniques in

More information

Creating a Poker Playing Program Using Evolutionary Computation

Creating a Poker Playing Program Using Evolutionary Computation Creating a Poker Playing Program Using Evolutionary Computation Simon Olsen and Rob LeGrand, Ph.D. Abstract Artificial intelligence is a rapidly expanding technology. We are surrounded by technology that

More information

Copyright 2003 The McGraw-Hill Companies, Inc. Permission required for reproduction or display. Slides prepared by Walid A. Najjar & Brian J.

Copyright 2003 The McGraw-Hill Companies, Inc. Permission required for reproduction or display. Slides prepared by Walid A. Najjar & Brian J. Introduction to Computing Systems from bits & gates to C & beyond Chapter 1 Welcome Aboard! This course is about: What computers consist of How computers work How they are organized internally What are

More information

Enabling Scientific Breakthroughs at the Petascale

Enabling Scientific Breakthroughs at the Petascale Enabling Scientific Breakthroughs at the Petascale Contents Breakthroughs in Science...................................... 2 Breakthroughs in Storage...................................... 3 The Impact

More information

Game-playing: DeepBlue and AlphaGo

Game-playing: DeepBlue and AlphaGo Game-playing: DeepBlue and AlphaGo Brief history of gameplaying frontiers 1990s: Othello world champions refuse to play computers 1994: Chinook defeats Checkers world champion 1997: DeepBlue defeats world

More information

Sno Projects List IEEE. High - Throughput Finite Field Multipliers Using Redundant Basis For FPGA And ASIC Implementations

Sno Projects List IEEE. High - Throughput Finite Field Multipliers Using Redundant Basis For FPGA And ASIC Implementations Sno Projects List IEEE 1 High - Throughput Finite Field Multipliers Using Redundant Basis For FPGA And ASIC Implementations 2 A Generalized Algorithm And Reconfigurable Architecture For Efficient And Scalable

More information

Copyright 2018, Technology Futures, Inc. 1

Copyright 2018, Technology Futures, Inc. 1 Copyright 2018, Technology Futures, Inc. 1 Forecasting Artificial Intelligence Lawrence Vanston, Ph.D. President, Technology Futures, Inc. lvanston@tfi.com 512-415-5965 TFI 2018 January 25-26, 2018 Marriott

More information

Merging Propagation Physics, Theory and Hardware in Wireless. Ada Poon

Merging Propagation Physics, Theory and Hardware in Wireless. Ada Poon HKUST January 3, 2007 Merging Propagation Physics, Theory and Hardware in Wireless Ada Poon University of Illinois at Urbana-Champaign Outline Multiple-antenna (MIMO) channels Human body wireless channels

More information

Experiments with Tensor Flow Roman Weber (Geschäftsführer) Richard Schmid (Senior Consultant)

Experiments with Tensor Flow Roman Weber (Geschäftsführer) Richard Schmid (Senior Consultant) Experiments with Tensor Flow 23.05.2017 Roman Weber (Geschäftsführer) Richard Schmid (Senior Consultant) WEBGATE CONSULTING Gegründet Mitarbeiter CH Inhaber geführt IT Anbieter Partner 2001 Ex 29 Beratung

More information