Software-based Microarchitectural Attacks
Daniel Gruss, Graz University of Technology
April 19, 2018

Whoami
- Daniel Gruss, Graz University of Technology
- daniel.gruss@iaik.tugraz.at

Timeline of Meltdown and Spectre
- Both vulnerabilities existed for many years
- No one discovered them before
- Suddenly, 4 independent teams discover them within 6 months
- Let's create an evidence board

Meltdown vs. Spectre
- Why two names, two papers, etc.?
- Two different problems
- They only have a very loose connection
- Two different teams already had quite mature drafts ready when they learned of each other
- Initially we tried to merge, but all co-authors quickly agreed that it would mix things that don't belong together
- More on that after we understand the attacks

The Fallout
You realize it is something big when...
- it is in the news, all over the world
- you get a Wikipedia article in multiple languages
- there are comics, including xkcd
- you get a lot of Twitter followers after Snowden mentioned you

The Wall

The Core of Meltdown/Spectre
- Kernel is isolated from user space
- This isolation is a combination of hardware and software
- User applications cannot access anything from the kernel
- There is only a well-defined interface: syscalls
[Diagram: Userspace (Applications) | Kernelspace (Operating System) | Memory]

Revolutionary concept! Store your food at home, never go to the grocery store during cooking. Can store ALL kinds of food. ONLY TODAY INSTEAD OF $1,300 ORDER VIA PHONE:

CPU Cache
    printf("%d", i);   // cache miss: request to DRAM, response fills the cache with i (DRAM access, slow)
    printf("%d", i);   // cache hit: i is already cached (no DRAM access, much faster)

Flush+Reload
Attacker and victim share memory.
- Step 1: the attacker flushes the shared line from the cache
- Step 2: the victim may access the shared line, which brings it back into the cache
- Step 3: the attacker accesses the shared line: fast if victim accessed data, slow otherwise

Memory Access Latency [figure]

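The Flush+Reload steps and the latency gap between cache hits and misses can be sketched in C as follows. This is a minimal sketch, not the speaker's code: the helper names (`flush_line`, `reload_cycles`, `pick_threshold`, `was_accessed`) are hypothetical, the x86 part assumes GCC or Clang with `<x86intrin.h>`, and taking the midpoint between the mean hit and mean miss latency is a simplifying assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__x86_64__)
#include <x86intrin.h>

/* Step 1 (attacker): evict the shared cache line. */
static inline void flush_line(const void *addr) {
    _mm_clflush(addr);
}

/* Step 3 (attacker): time one access to the shared line, in TSC cycles. */
static inline uint64_t reload_cycles(const volatile char *addr) {
    unsigned int aux;
    uint64_t start = __rdtscp(&aux);
    (void)*addr;                     /* the timed memory access */
    uint64_t end = __rdtscp(&aux);
    return end - start;
}
#endif

/* Mean of n latency samples. */
static uint64_t mean_cycles(const uint64_t *samples, size_t n) {
    assert(samples != NULL && n > 0);
    uint64_t sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += samples[i];
    return sum / n;
}

/* Assumption: put the threshold halfway between the mean hit latency
 * and the mean miss latency measured during calibration. */
static uint64_t pick_threshold(const uint64_t *hits, size_t n_hits,
                               const uint64_t *misses, size_t n_misses) {
    uint64_t hit = mean_cycles(hits, n_hits);
    uint64_t miss = mean_cycles(misses, n_misses);
    assert(miss > hit);              /* otherwise the channel is unusable */
    return hit + (miss - hit) / 2;
}

/* A fast reload means the victim accessed the line since the last flush. */
static int was_accessed(uint64_t cycles, uint64_t threshold) {
    return cycles < threshold;
}
```

In use, the attacker would calibrate once with known hits and misses, then loop over `flush_line`, a short wait, and `was_accessed(reload_cycles(addr), threshold)`.
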
Cache Template Attack Demo

Cache Template
[Figure: cache template matrix of addresses against keys]
- Address axis: 0x7c680 0x7c6c0 0x7c700 0x7c740 0x7c780 0x7c7c0 0x7c800 0x7c840 0x7c880 0x7c8c0 0x7c900 0x7c940 0x7c980 0x7c9c0 0x7ca00 0x7cb80 0x7cc40 0x7cc80 0x7ccc0 0x7cd00
- Key axis: g h i j k l m n o p q r s t u v w x y z

Wait for an hour
LATENCY

Dependency
Parallelize

Out-of-order Execution
    #include <math.h>
    #include <stdio.h>

    int main(void) {
        int width = 10, height = 5;
        /* Dependency: both results need width and height. */
        float diagonal = sqrt(width * width + height * height);
        /* Parallelize: area does not depend on diagonal. */
        int area = width * height;
        printf("area %d x %d = %d\n", width, height, area);
        return 0;
    }

Building Meltdown
    char data = *(char*)0xffffffff81a000e0;
    printf("%c\n", data);
Result:
    segfault at ffffffff81a000e0 ip sp ffce4a80610 error 5 in reader
- Kernel addresses are not accessible
- Are privilege checks also done when executing instructions out of order?

Building Meltdown: adapted code
    *(volatile char*)0;
    array[84 * 4096] = 0; // unreachable
Static code analyzer is not happy:
    warning: Dereference of null pointer
    *(volatile char*)0;

Building Meltdown
- Flush+Reload over all pages of the array
- [Figure: access time in cycles for each page]
- Unreachable code line was actually executed
- Exception was only thrown afterwards

Building Meltdown: combine the two things
    char data = *(char*)0xffffffff81a000e0;
    array[data * 4096] = 0;
- These two lines are the sending end of a cache covert channel
- Then check whether any part of array is cached: this is the receiving end of a cache covert channel

Building Meltdown
- Flush+Reload over all pages of the array
- [Figure: access time in cycles for each page]
- Index of cache hit reveals data
- Permission check is in some cases not fast enough

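The receiving end described above, where the index of the cache hit reveals the data, can be sketched as a small decoding function. This is an illustrative sketch only: `recover_byte` and `PROBE_PAGES` are hypothetical names, and the timings are assumed to come from one Flush+Reload measurement per page of the array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PROBE_PAGES 256   /* one page per possible byte value */

/* Returns the byte value whose probe page was cached, or -1 if no page
 * was reloaded faster than the threshold (for example because the
 * permission check won the race this time). */
static int recover_byte(const uint64_t timings[PROBE_PAGES], uint64_t threshold) {
    assert(timings != NULL);
    int best = -1;
    uint64_t best_time = threshold;
    for (int page = 0; page < PROBE_PAGES; page++) {
        if (timings[page] < best_time) {   /* fastest page below threshold */
            best_time = timings[page];
            best = page;
        }
    }
    return best;
}
```

A return value of -1 simply means the measurement should be repeated; picking the fastest page rather than the first page below the threshold makes the decoder a little more tolerant of noise.
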
Leaking Passwords from your Password Manager

Not so fast

Take the kernel addresses...
- Kernel addresses in user space are a problem
- Why don't we take the kernel addresses...

...and remove them
- ...and remove them if not needed?
- User accessible check in hardware is not reliable

Idea
- Let's just unmap the kernel in user space
- Kernel addresses are then no longer present
- Memory which is not mapped cannot be accessed at all

KAISER /ˈkʌɪzə/
1. [German] Emperor, ruler of an empire
2. largest penguin, emperor penguin
Kernel Address Isolation to have Side channels Efficiently Removed

[Diagram: one address space with Userspace (Applications) and Kernelspace (Operating System) on top of Memory]
[Diagram: Kernel View with Userspace and Kernelspace, User View with Userspace (Applications) only; a context switch changes between the two views]

Kernel Address Space Isolation
- We published KAISER in July 2017
- Intel and others improved and merged it into Linux as KPTI (Kernel Page Table Isolation)
- Microsoft implemented a similar concept in Windows 10
- Apple implemented it in macOS and called it Double Map
- All share the same idea: switching address spaces on context switch

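The shared idea, switching address spaces on context switch, can be illustrated with a toy model. This is not how any kernel implements it: all names (`address_space`, `USER_VIEW`, `KERNEL_VIEW`, `enter_kernel`, `return_to_user`) are hypothetical, and real implementations keep a small part of the kernel mapped in the user view for entry and exit, which the model ignores.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum region { REGION_USER, REGION_KERNEL };

/* One set of page tables: which regions it maps. */
struct address_space {
    bool maps_user;
    bool maps_kernel;
};

static const struct address_space USER_VIEW   = { true, false };
static const struct address_space KERNEL_VIEW = { true, true  };

struct cpu {
    const struct address_space *active;   /* stands in for the page-table root */
};

/* Kernel entry (syscall, interrupt): switch to the full kernel view. */
static void enter_kernel(struct cpu *c) {
    assert(c != NULL);
    c->active = &KERNEL_VIEW;
}

/* Return to user mode: switch back to the view without the kernel. */
static void return_to_user(struct cpu *c) {
    assert(c != NULL);
    c->active = &USER_VIEW;
}

/* Unmapped memory cannot be accessed at all, not even transiently. */
static bool is_mapped(const struct cpu *c, enum region r) {
    assert(c != NULL && c->active != NULL);
    return r == REGION_USER ? c->active->maps_user : c->active->maps_kernel;
}
```

Every kernel entry and exit pays for one such switch, which is why the cost depends on how often a workload crosses the boundary.
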
Performance
- Depends on how often you need to switch between kernel and user space
- Can be slow, 40% or more on old hardware
- But modern CPUs have additional features
- Performance overhead on average below 2%

Meltdown and Spectre

- Prosciutto
- Funghi
- Diavolo
- Diavolo
- Diavolo
- Diavolo
»A table for 6 please«

Speculative Cooking
»A table for 6 please«

What does Spectre do?
- Mistrains branch prediction
- CPU speculatively executes code which should not be executed
- Can also mistrain indirect calls
- Spectre convinces a program to execute code

Spectre (variant 1)
    index = 0;                    // the animation steps index from 0 to 6
    char* data = "textkey";
    if (index < 4)
        LUT[data[index] * 4096]   // "then" branch
    else
        0                         // "else" branch
- index = 0 to 3: the prediction is "then", the CPU speculates into the "then" branch, and the speculation is correct
- index = 4 to 6: the prediction is still "then", so the CPU speculatively executes LUT[data[index] * 4096] with an out-of-bounds index before the "else" branch is executed

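The mistraining in the animation can be imitated with a toy branch predictor. The sketch below assumes a single 2-bit saturating counter, which is a textbook simplification and not a description of any real CPU; the names `predictor`, `train` and `count_transient_out_of_bounds` are hypothetical. Unlike the slides, this toy counter adapts after two mispredictions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* 2-bit saturating counter: 0 and 1 predict "else", 2 and 3 predict "then". */
struct predictor { int counter; };

static bool predict_then(const struct predictor *p) {
    assert(p != NULL && p->counter >= 0 && p->counter <= 3);
    return p->counter >= 2;
}

/* Move the counter one step towards the real outcome of the branch. */
static void train(struct predictor *p, bool took_then) {
    if (took_then && p->counter < 3) p->counter++;
    if (!took_then && p->counter > 0) p->counter--;
}

/* Runs the branch "if (index < 4)" for index = 0..last and counts how often
 * the "then" body would be entered speculatively although the real outcome
 * is "else", i.e. how often the out-of-bounds access runs transiently. */
static int count_transient_out_of_bounds(struct predictor *p, int last) {
    int transient = 0;
    for (int index = 0; index <= last; index++) {
        bool actual_then = index < 4;
        if (predict_then(p) && !actual_then)
            transient++;
        train(p, actual_then);
    }
    return transient;
}
```

Starting from a trained predictor (`counter = 3`), the in-bounds indices never cause a transient out-of-bounds access, while the first out-of-bounds indices do.
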
Spectre (variant 2)
    Animal* a = bird;   // later in the animation: Animal* a = fish;
    a->move()           // possible targets: fly(), swim()
The diagram shows LUT[data[index] * 4096] and 0 as the code behind the two targets.
- a = bird, prediction is swim(): the CPU speculates into swim(), then executes fly(); the prediction becomes fly()
- a = bird again, prediction is fly(): the CPU speculates into fly(), and the speculation is correct
- a = fish, prediction is fly(): the CPU speculates into fly(), then executes swim(); the prediction becomes swim()

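The indirect-call animation can be imitated with a one-entry model of a branch target buffer. This is a deliberately tiny sketch with hypothetical names (`btb_entry`, `indirect_call`); real branch target buffers are indexed by address bits and shared between code that should not trust each other, which is what makes mistraining possible.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*move_fn)(void);

static int fly(void)  { return 1; }
static int swim(void) { return 2; }

/* One-entry "branch target buffer": remembers the last target of the call site. */
struct btb_entry { move_fn predicted; };

/* Performs the indirect call. *mispredicted is set to 1 if the remembered
 * target differs from the real one; in that case a CPU would transiently
 * run btb->predicted before correcting itself. Returns the real result. */
static int indirect_call(struct btb_entry *btb, move_fn actual, int *mispredicted) {
    assert(btb != NULL && actual != NULL && mispredicted != NULL);
    *mispredicted = (btb->predicted != actual);
    btb->predicted = actual;   /* the predictor learns the real target */
    return actual();
}
```

Calling it with `fly`, `fly`, `swim` on an entry that initially holds `swim` reproduces the sequence of the animation: misprediction, correct prediction, misprediction.
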
Mitigating Spectre
- Trivial approach: disable speculative execution
- No wrong speculation if there is no speculation
- Problem: massive performance hit!
- Also: How to disable it?
- Speculative execution is deeply integrated into the CPU

Spectre Variant 1 Mitigations
- Workaround: insert instructions stopping speculation
- Insert after every bounds check
- x86: LFENCE, ARM: CSDB
- Available on all Intel CPUs, retrofitted to existing ARMv7 and ARMv8
- Speculation barrier requires compiler support
- Already implemented in GCC, LLVM, and MSVC
- Can be automated (MSVC), but not really reliable
- Explicit use by programmer: __builtin_load_no_speculate
- Speculation barrier works if affected code constructs are known
- Programmer has to fully understand the vulnerability
- Automatic detection is not reliable
- Non-negligible performance overhead of barriers

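Placing a barrier right after a bounds check could look like the following sketch. It assumes GCC or Clang inline assembly and only emits a real barrier (LFENCE) on x86-64; on other targets it falls back to a compiler-only barrier, which does not stop the CPU from speculating. The CSDB-based sequence for ARM is different and not shown. `speculation_barrier` and `read_checked` are hypothetical helper names.

```c
#include <assert.h>
#include <stddef.h>

/* Stops speculation past this point on x86-64. Elsewhere this is only a
 * compiler barrier and NOT a hardware speculation barrier. */
static inline void speculation_barrier(void) {
#if defined(__x86_64__)
    __asm__ volatile("lfence" ::: "memory");
#else
    __asm__ volatile("" ::: "memory");
#endif
}

/* Bounds-checked read: returns array[index], or fallback if out of bounds. */
static unsigned char read_checked(const unsigned char *array, size_t len,
                                  size_t index, unsigned char fallback) {
    assert(array != NULL);
    if (index < len) {
        speculation_barrier();   /* inserted right after the bounds check */
        return array[index];
    }
    return fallback;
}
```

The barrier costs time on every call, including the in-bounds ones, which is the performance overhead the list above refers to.
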
Spectre Variant 2 Mitigations (Microcode/MSRs)
Intel released microcode updates
- Indirect Branch Restricted Speculation (IBRS):
  - Do not speculate based on anything before entering IBRS mode
  - lesser privileged code cannot influence predictions
- Indirect Branch Predictor Barrier (IBPB):
  - Flush branch-target buffer
- Single Thread Indirect Branch Predictors (STIBP):
  - Isolates branch prediction state between two hyperthreads

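Software sees these three controls as CPUID feature bits and MSR bits. The sketch below only decodes and composes bit patterns; the bit positions follow Intel's public documentation as far as I know them, while the struct and function names are hypothetical. Actually writing an MSR requires kernel privileges and is not shown.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Feature bits in CPUID.(EAX=7,ECX=0):EDX */
#define CPUID7_EDX_IBRS_IBPB (1u << 26)
#define CPUID7_EDX_STIBP     (1u << 27)

/* MSR numbers and their control bits */
#define MSR_IA32_SPEC_CTRL 0x48u
#define SPEC_CTRL_IBRS     (1u << 0)
#define SPEC_CTRL_STIBP    (1u << 1)
#define MSR_IA32_PRED_CMD  0x49u
#define PRED_CMD_IBPB      (1u << 0)   /* writing this bit issues the barrier */

struct spec_features {
    bool ibrs_ibpb;
    bool stibp;
};

/* Decode the EDX value returned by CPUID leaf 7, subleaf 0. */
static struct spec_features decode_cpuid7_edx(uint32_t edx) {
    struct spec_features f;
    f.ibrs_ibpb = (edx & CPUID7_EDX_IBRS_IBPB) != 0;
    f.stibp     = (edx & CPUID7_EDX_STIBP) != 0;
    return f;
}

/* Value for IA32_SPEC_CTRL that turns on every supported control. */
static uint64_t spec_ctrl_value(struct spec_features f) {
    uint64_t v = 0;
    if (f.ibrs_ibpb) v |= SPEC_CTRL_IBRS;
    if (f.stibp)     v |= SPEC_CTRL_STIBP;
    assert((v >> 2) == 0);   /* only bits 0 and 1 are used here */
    return v;
}
```
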
218 Spectre Variant 2 Mitigations (Software)
- Retpoline (compiler extension)

    push <call_target>
    call 1f
    2:                      ; speculation will continue here
    lfence                  ; speculation barrier
    jmp 2b                  ; endless loop
    1:
    lea 8(%rsp), %rsp       ; restore stack pointer
    ret                     ; the actual call to <call_target>

- the ret is always predicted to enter the endless loop instead of the correct (or wrong) target function
- performance?
- On Broadwell or newer: ret may fall back to the BTB for prediction
  - microcode patches to prevent that
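Because retpoline is a compiler extension, the source code does not change; only the code generated for indirect branches does. The small C sketch below (all names are hypothetical) contains the kind of indirect call that is affected. Building it with GCC's `-mindirect-branch=thunk` or Clang's `-mretpoline` should route the call in `dispatch` through a thunk like the sequence above, although an optimizing compiler may also inline or devirtualize such a tiny example.

```c
#include <assert.h>
#include <stddef.h>

/* Function-pointer type used for run-time dispatch. */
typedef int (*handler_fn)(int);

static int add_one(int x) { return x + 1; }
static int twice(int x)   { return 2 * x; }

/* Indirect call: the target is only known at run time, so without a
 * mitigation the CPU predicts it from the branch-target buffer. */
static int dispatch(handler_fn fn, int arg)
{
    assert(fn != NULL);
    return fn(arg);
}
```

A call such as `dispatch(twice, 21)` behaves identically with and without the flag; the difference is visible only in the disassembly and in the cost of each indirect call.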
223 Spectre Variant 2 Mitigations (Software)
- ARM provides hardened Linux kernel
- Clears branch-predictor state on context switch
  - Either via instruction (BPIALL)...
  - ...or workaround (disable/enable MMU)
- Non-negligible performance overhead ( ns)
229 What does not work
- Prevent access to high-resolution timer
  - attackers build their own timer using a timing thread
- Make the flush instruction privileged
  - cache eviction still works through memory accesses
- Just move secrets into the secure world
  - Spectre works on secure enclaves
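To illustrate why a privileged flush instruction does not help, here is a deliberately naive C sketch of eviction through ordinary loads. The function name `touch_every_line` and the 64-byte line size are assumptions. Real attacks use carefully chosen eviction sets that map to the same cache set as the target and account for the replacement policy; this linear sweep only shows that plain memory accesses are all that is needed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LINE_SIZE 64u /* assumed cache-line size in bytes */

/* Read one byte from every cache line of buf and return the number of
 * lines touched. If buf is larger than the cache, these loads push
 * older lines out without any flush instruction. */
static size_t touch_every_line(const volatile uint8_t *buf, size_t size)
{
    size_t lines = 0;
    assert(buf != NULL || size == 0);
    for (size_t off = 0; off < size; off += LINE_SIZE) {
        (void)buf[off]; /* volatile read: the load is not optimized away */
        lines++;
    }
    return lines;
}
```

Sweeping a buffer several times the size of the last-level cache would be the simplest way to use this; the return value is only there to make the amount of work observable.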
236 Meltdown vs. Spectre
Meltdown
- Out-of-Order Execution
- has nothing to do with branch prediction
- turning off speculative execution entirely has no effect on Meltdown
- melts down the isolation provided by the user-accessible bit
- in theory: OoO not required, pipelining can be sufficient
- mitigated by KAISER
Spectre
- Speculative Execution (subset of Out-of-Order Execution)
- fundamentally builds on branch (mis)prediction
- turning off speculative execution entirely would work
- has nothing to do with the user-accessible bit
- KAISER has no effect on Spectre at all
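The dependence on branch misprediction can be seen in the well-known bounds-check-bypass pattern (Spectre variant 1). The C sketch below uses illustrative names (`array1`, `array2`, `victim`) and is architecturally safe: it never returns data for an out-of-bounds index. The concern is only what a CPU may do transiently if it mispredicts the `if` as taken.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE 4096u /* spacing so that each byte value maps to its own page */

static uint8_t array1[16] = {1, 2, 3, 4, 5, 6, 7, 8,
                             9, 10, 11, 12, 13, 14, 15, 16};
static size_t  array1_size = sizeof array1;
static uint8_t array2[256 * PAGE]; /* probe array, zero-initialized */

/* Returns the probe-array byte selected by array1[x], or -1 if x is out
 * of bounds. If the branch is mispredicted for an out-of-bounds x, the
 * two dependent loads may still run transiently and leave a cache
 * footprint in array2 that depends on the byte read. */
static int victim(size_t x)
{
    if (x < array1_size) {
        return array2[(size_t)array1[x] * PAGE];
    }
    return -1;
}
```

Every access here is one the program is permitted to make, which is why a defense based on the user-accessible bit, such as KAISER, does not apply to it.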
242 Meltdown vs. Spectre
Meltdown
- performs illegal memory accesses
- we need to take care of processor exceptions
  - exception handling
  - exception suppression with TSX
  - exception suppression with branch misprediction
Spectre
- performs only legal memory accesses
- has nothing to do with exception handling or suppression
Hence: two papers, two names, etc.
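The "exception handling" option can be sketched in C as catching the fault and resuming. The example below is not an attack: it only shows how a process survives an illegal access by installing a SIGSEGV handler and jumping back with `siglongjmp`. The helper names are hypothetical, and it assumes Linux with glibc, where touching a `PROT_NONE` page raises SIGSEGV.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

static sigjmp_buf recover_point;

/* Signal handler: abandon the faulting access and resume after sigsetjmp. */
static void on_segv(int sig)
{
    (void)sig;
    siglongjmp(recover_point, 1);
}

/* Hypothetical helper: map one page that may not be read or written. */
static void *map_inaccessible_page(void)
{
    void *p = mmap(NULL, 4096, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Returns 1 if reading *addr faulted and was recovered from, 0 if the
 * read succeeded, and -1 if the handler could not be installed. */
static int read_faults(const volatile uint8_t *addr)
{
    struct sigaction sa = {0}, old;
    int faulted;

    sa.sa_handler = on_segv;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, &old) != 0)
        return -1;

    if (sigsetjmp(recover_point, 1) == 0) {
        (void)*addr;  /* volatile read; raises SIGSEGV on an illegal address */
        faulted = 0;
    } else {
        faulted = 1;  /* reached via siglongjmp from the handler */
    }

    sigaction(SIGSEGV, &old, NULL); /* restore the previous handler */
    return faulted;
}
```

Passing `1` as the second argument of `sigsetjmp` saves the signal mask, so SIGSEGV is unblocked again after the jump and the function can be called repeatedly. The suppression variants with TSX or a mispredicted branch avoid raising the exception in the first place and are not shown here.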
251 But...
- ... why were they named variant 1, 2 and 3 by Google?
  - How can you use speculative execution maliciously?
  - Intel had much interest in not fancy-naming them ;)
- ... why were they presented on the same date and on the same website?
  - We did not choose the date
  - We did not want to have one of them overshadow the other immediately
259 What do we learn from it?
We have ignored microarchitectural attacks for many, many years:
- attacks on crypto
  - software should be fixed
- attacks on ASLR
  - ASLR is broken anyway
- attacks on SGX and TrustZone
  - not part of the threat model
For years we solely optimized for performance.
262 When you read the manuals...
After learning about a side channel you realize:
- the side channels were documented in the Intel manual
- only now do we understand the implications
263 What do we learn from it?
[Figure: Motor Vehicle Deaths in U.S. by Year]
266 Conclusions
- A unique chance to rethink processor design
- grow up, like other fields (car industry, construction industry)
- dedicate more time to identifying problems, not solely to mitigating known ones
267 SCIENCE PASSION TECHNOLOGY Software-based Microarchitectural Attacks Daniel Gruss April 19, 2018 Graz University of Technology
Meltdown & Spectre. Side-channels considered harmful. Qualcomm Mobile Security Summit May, San Diego, CA. Moritz Lipp
Meltdown & Spectre Side-channels considered harmful Qualcomm Mobile Security Summit 2018 17 May, 2018 - San Diego, CA Moritz Lipp (@mlqxyz) Michael Schwarz (@misc0110) Flashback Qualcomm Mobile Security
More informationTransient Execution Attacks
Transient Execution Attacks Daniel Gruss September 12, 2018 Graz University of Technology 1 Daniel Gruss Graz University of Technology Timeline Meltdown/Spectre (1) 19.02.2016: Daniel has an implementation
More informationMicroarchitectural Attacks and Defenses in JavaScript
Microarchitectural Attacks and Defenses in JavaScript Michael Schwarz, Daniel Gruss, Moritz Lipp 25.01.2018 www.iaik.tugraz.at 1 Michael Schwarz, Daniel Gruss, Moritz Lipp www.iaik.tugraz.at Microarchitecture
More informationProject 5: Optimizer Jason Ansel
Project 5: Optimizer Jason Ansel Overview Project guidelines Benchmarking Library OoO CPUs Project Guidelines Use optimizations from lectures as your arsenal If you decide to implement one, look at Whale
More informationDepartment Computer Science and Engineering IIT Kanpur
NPTEL Online - IIT Bombay Course Name Parallel Computer Architecture Department Computer Science and Engineering IIT Kanpur Instructor Dr. Mainak Chaudhuri file:///e /parallel_com_arch/lecture1/main.html[6/13/2012
More informationLecture Topics. Announcements. Today: Memory Management (Stallings, chapter ) Next: continued. Self-Study Exercise #6. Project #4 (due 10/11)
Lecture Topics Today: Memory Management (Stallings, chapter 7.1-7.4) Next: continued 1 Announcements Self-Study Exercise #6 Project #4 (due 10/11) Project #5 (due 10/18) 2 Memory Hierarchy 3 Memory Hierarchy
More informationCUDA Threads. Terminology. How it works. Terminology. Streaming Multiprocessor (SM) A SM processes block of threads
Terminology CUDA Threads Bedrich Benes, Ph.D. Purdue University Department of Computer Graphics Streaming Multiprocessor (SM) A SM processes block of threads Streaming Processors (SP) also called CUDA
More informationInstructor: Dr. Mainak Chaudhuri. Instructor: Dr. S. K. Aggarwal. Instructor: Dr. Rajat Moona
NPTEL Online - IIT Kanpur Instructor: Dr. Mainak Chaudhuri Instructor: Dr. S. K. Aggarwal Course Name: Department: Program Optimization for Multi-core Architecture Computer Science and Engineering IIT
More informationUsing Variable-MHz Microprocessors to Efficiently Handle Uncertainty in Real-Time Systems
Using Variable-MHz Microprocessors to Efficiently Handle Uncertainty in Real-Time Systems Eric Rotenberg Center for Embedded Systems Research (CESR) Department of Electrical & Computer Engineering North
More informationA Static Power Model for Architects
A Static Power Model for Architects J. Adam Butts and Guri Sohi University of Wisconsin-Madison {butts,sohi}@cs.wisc.edu 33rd International Symposium on Microarchitecture Monterey, California December,
More informationOn the Rules of Low-Power Design
On the Rules of Low-Power Design (and Why You Should Break Them) Prof. Todd Austin University of Michigan austin@umich.edu A long time ago, in a not so far away place The Rules of Low-Power Design P =
More information7/19/2012. IF for Load (Review) CSE 2021: Computer Organization. EX for Load (Review) ID for Load (Review) WB for Load (Review) MEM for Load (Review)
CSE 2021: Computer Organization IF for Load (Review) Lecture-11 CPU Design : Pipelining-2 Review, Hazards Shakil M. Khan CSE-2021 July-19-2012 2 ID for Load (Review) EX for Load (Review) CSE-2021 July-19-2012
More informationCSE 2021: Computer Organization
CSE 2021: Computer Organization Lecture-11 CPU Design : Pipelining-2 Review, Hazards Shakil M. Khan IF for Load (Review) CSE-2021 July-14-2011 2 ID for Load (Review) CSE-2021 July-14-2011 3 EX for Load
More informationFinal Report: DBmbench
18-741 Final Report: DBmbench Yan Ke (yke@cs.cmu.edu) Justin Weisz (jweisz@cs.cmu.edu) Dec. 8, 2006 1 Introduction Conventional database benchmarks, such as the TPC-C and TPC-H, are extremely computationally
More informationPipelined Processor Design
Pipelined Processor Design COE 38 Computer Architecture Prof. Muhamed Mudawar Computer Engineering Department King Fahd University of Petroleum and Minerals Presentation Outline Pipelining versus Serial
More informationSupporting x86-64 Address Translation for 100s of GPU Lanes. Jason Power, Mark D. Hill, David A. Wood
Supporting x86-64 Address Translation for 100s of GPU s Jason Power, Mark D. Hill, David A. Wood Summary Challenges: CPU&GPUs physically integrated, but logically separate; This reduces theoretical bandwidth,
More informationKillzone Shadow Fall: Threading the Entity Update on PS4. Jorrit Rouwé Lead Game Tech, Guerrilla Games
Killzone Shadow Fall: Threading the Entity Update on PS4 Jorrit Rouwé Lead Game Tech, Guerrilla Games Introduction Killzone Shadow Fall is a First Person Shooter PlayStation 4 launch title In SP up to
More informationΕΠΛ 605: Προχωρημένη Αρχιτεκτονική
ΕΠΛ 605: Προχωρημένη Αρχιτεκτονική Υπολογιστών Presentation of UniServer Horizon 2020 European project findings: X-Gene server chips, voltage-noise characterization, high-bandwidth voltage measurements,
More informationIntroduction to Real-Time Systems
Introduction to Real-Time Systems Real-Time Systems, Lecture 1 Martina Maggio and Karl-Erik Årzén 16 January 2018 Lund University, Department of Automatic Control Content [Real-Time Control System: Chapter
More informationSATSim: A Superscalar Architecture Trace Simulator Using Interactive Animation
SATSim: A Superscalar Architecture Trace Simulator Using Interactive Animation Mark Wolff Linda Wills School of Electrical and Computer Engineering Georgia Institute of Technology {wolff,linda.wills}@ece.gatech.edu
More informationCS4617 Computer Architecture
1/26 CS4617 Computer Architecture Lecture 2 Dr J Vaughan September 10, 2014 2/26 Amdahl s Law Speedup = Execution time for entire task without using enhancement Execution time for entire task using enhancement
More informationHow to Blog to the Vanguard Website
How to Blog to the Vanguard Website Guidance and Rules for Blogging on the Vanguard Website Version 1.01 March 2018 Step 1. Get an account The bristol vanguard website, like much of the internet these
More informationWarp-Aware Trace Scheduling for GPUS. James Jablin (Brown) Thomas Jablin (UIUC) Onur Mutlu (CMU) Maurice Herlihy (Brown)
Warp-Aware Trace Scheduling for GPUS James Jablin (Brown) Thomas Jablin (UIUC) Onur Mutlu (CMU) Maurice Herlihy (Brown) Historical Trends in GFLOPS: CPUs vs. GPUs Theoretical GFLOP/s 3250 3000 2750 2500
More informationOverview. 1 Trends in Microprocessor Architecture. Computer architecture. Computer architecture
Overview 1 Trends in Microprocessor Architecture R05 Robert Mullins Computer architecture Scaling performance and CMOS Where have performance gains come from? Modern superscalar processors The limits of
More informationProcessors Processing Processors. The meta-lecture
Simulators 5SIA0 Processors Processing Processors The meta-lecture Why Simulators? Your Friend Harm Why Simulators? Harm Loves Tractors Harm Why Simulators? The outside world Unfortunately for Harm you
More informationOutline Simulators and such. What defines a simulator? What about emulation?
Outline Simulators and such Mats Brorsson & Mladen Nikitovic ICT Dept of Electronic, Computer and Software Systems (ECS) What defines a simulator? Why are simulators needed? Classifications Case studies
More informationThe adventures of a Suricate in ebpf land
The adventures of a Suricate in ebpf land É. Leblond Stamus Networks Nov. 10, 2016 É. Leblond (Stamus Networks) The adventures of a Suricate in ebpf land Nov. 10, 2016 1 / 34 1 ebpf technology 2 Suricata
More informationComputer Architecture
Computer Architecture Lecture 01 Arkaprava Basu www.csa.iisc.ac.in Acknowledgements Several of the slides in the deck are from Luis Ceze (Washington), Nima Horanmand (Stony Brook), Mark Hill, David Wood,
More informationSoftware ISP Application Note
NXP Semiconductors Document Number: AN12060 Application Notes Rev. 0, 10/2017 Software ISP Application Note 1. Introduction This document describes the software-based image signal processing application(sw-isp)
More information7/11/2012. Single Cycle (Review) CSE 2021: Computer Organization. Multi-Cycle Implementation. Single Cycle with Jump. Pipelining Analogy
CSE 2021: Computer Organization Single Cycle (Review) Lecture-10 CPU Design : Pipelining-1 Overview, Datapath and control Shakil M. Khan CSE-2021 July-12-2012 2 Single Cycle with Jump Multi-Cycle Implementation
More informationREVOLUTIONIZING THE COMPUTING LANDSCAPE AND BEYOND.
December 3-6, 2018 Santa Clara Convention Center CA, USA REVOLUTIONIZING THE COMPUTING LANDSCAPE AND BEYOND. https://tmt.knect365.com/risc-v-summit @risc_v ACCELERATING INFERENCING ON THE EDGE WITH RISC-V
More informationCompiler Optimisation
Compiler Optimisation 6 Instruction Scheduling Hugh Leather IF 1.18a hleather@inf.ed.ac.uk Institute for Computing Systems Architecture School of Informatics University of Edinburgh 2018 Introduction This
More informationCRYPTOSHOOTER MULTI AGENT BASED SECRET COMMUNICATION IN AUGMENTED VIRTUALITY
CRYPTOSHOOTER MULTI AGENT BASED SECRET COMMUNICATION IN AUGMENTED VIRTUALITY Submitted By: Sahil Narang, Sarah J Andrabi PROJECT IDEA The main idea for the project is to create a pursuit and evade crowd
More informationImproving Loop-Gain Performance In Digital Power Supplies With Latest- Generation DSCs
ISSUE: March 2016 Improving Loop-Gain Performance In Digital Power Supplies With Latest- Generation DSCs by Alex Dumais, Microchip Technology, Chandler, Ariz. With the consistent push for higher-performance
More informationBenchmarking C++ From video games to algorithmic trading. Alexander Radchenko
Benchmarking C++ From video games to algorithmic trading Alexander Radchenko Quiz. How long it takes to run? 3.5GHz Xeon at CentOS 7 Write your name Write your guess as a single number Write time units
More informationPrecise State Recovery. Out-of-Order Pipelines
Precise State Recovery in Out-of-Order Pipelines Nima Honarmand Recall Our Generic OOO Pipeline Instruction flow (pipeline front-end) is in-order Register and memory execution are OOO And, we need a final
More informationAnalysis of Image Compression Algorithm: GUETZLI
Analysis of Image Compression Algorithm: GUETZLI Lingyi Li August 18, 2017 Abstract How to balance picture size and quality is the core of image compression. This paper evaluates Google's jpeg image compression
More informationCSE502: Computer Architecture CSE 502: Computer Architecture
CSE 502: Computer Architecture Speculation and raps in Out-of-Order Cores What is wrong with omasulo s? Branch instructions Need branch prediction to guess what to fetch next Need speculative execution
More informationBlackfin Online Learning & Development
Presentation Title: Introduction to VisualDSP++ Tools Presenter Name: Nicole Wright Chapter 1:Introduction 1a:Module Description 1b:CROSSCORE Products Chapter 2: ADSP-BF537 EZ-KIT Lite Configuration 2a:
More informationEECS 470 Lecture 5. Intro to Dynamic Scheduling (Scoreboarding) Fall 2018 Jon Beaumont
Intro to Dynamic Scheduling (Scoreboarding) Fall 2018 Jon Beaumont http://www.eecs.umich.edu/courses/eecs470 Many thanks to Prof. Martin and Roth of University of Pennsylvania for most of these slides.
More informationTHE DEEP WATERS OF DEEP LEARNING
THE DEEP WATERS OF DEEP LEARNING THE CURRENT AND FUTURE IMPACT OF ARTIFICIAL INTELLIGENCE ON THE PUBLISHING INDUSTRY. BY AND FRANKFURTER BUCHMESSE 2/6 Given the ever increasing number of publishers exploring
More informationOut-of-Order Execution. Register Renaming. Nima Honarmand
Out-of-Order Execution & Register Renaming Nima Honarmand Out-of-Order (OOO) Execution (1) Essence of OOO execution is Dynamic Scheduling Dynamic scheduling: processor hardware determines instruction execution
More informationPropietary Engine VS Commercial engine. by Zalo
Propietary Engine VS Commercial engine by Zalo zalosan@gmail.com About me B.S. Computer Engineering 9 years of experience, 5 different companies 3 propietary engines, 2 commercial engines I have my own
More informationSimulation Performance Optimization of Virtual Prototypes Sammidi Mounika, B S Renuka
Simulation Performance Optimization of Virtual Prototypes Sammidi Mounika, B S Renuka Abstract Virtual prototyping is becoming increasingly important to embedded software developers, engineers, managers
More informationAllegroCache Tutorial. Franz Inc
AllegroCache Tutorial Franz Inc 1 Introduction AllegroCache is an object database built on top of the Common Lisp Object System. In this tutorial we will demonstrate how to use AllegroCache to build, retrieve
More informationHow different FPGA firmware options enable digitizer platforms to address and facilitate multiple applications
How different FPGA firmware options enable digitizer platforms to address and facilitate multiple applications 1 st of April 2019 Marc.Stackler@Teledyne.com March 19 1 Digitizer definition and application
More informationINTERFACING WITH INTERRUPTS AND SYNCHRONIZATION TECHNIQUES
Faculty of Engineering INTERFACING WITH INTERRUPTS AND SYNCHRONIZATION TECHNIQUES Lab 1 Prepared by Kevin Premrl & Pavel Shering ID # 20517153 20523043 3a Mechatronics Engineering June 8, 2016 1 Phase
More informationECE 4750 Computer Architecture, Fall 2016 T09 Advanced Processors: Superscalar Execution
ECE 4750 Computer Architecture, Fall 2016 T09 Advanced Processors: Superscalar Execution School of Electrical and Computer Engineering Cornell University revision: 2016-11-28-17-33 1 In-Order Dual-Issue
More informationSetting up a Digital Darkroom A guide
Setting up a Digital Darkroom A guide http://www.theuniversody.co.uk Planning / Theory Considerations: What does the facility need to be capable of? Downloading images from digital cameras, (in all Raw
More informationRunning head: THE IMPACT OF COMPUTER ENGINEERING 1
Running head: THE IMPACT OF COMPUTER ENGINEERING 1 The Impact of Computer Engineering Oakland University Andrew Nassif 11/21/2015 THE IMPACT OF COMPUTER ENGINEERING 2 Abstract The purpose of this paper
More informationCSE502: Computer Architecture CSE 502: Computer Architecture
CSE 502: Computer Architecture Out-of-Order Execution and Register Rename In Search of Parallelism rivial Parallelism is limited What is trivial parallelism? In-order: sequential instructions do not have
More informationRB-Dev-03 Devantech CMPS03 Magnetic Compass Module
RB-Dev-03 Devantech CMPS03 Magnetic Compass Module This compass module has been specifically designed for use in robots as an aid to navigation. The aim was to produce a unique number to represent the
More informationArchitecture ISCA 16 Luis Ceze, Tom Wenisch
Architecture 2030 @ ISCA 16 Luis Ceze, Tom Wenisch Mark Hill (CCC liaison, mentor) LIVE! Neha Agarwal, Amrita Mazumdar, Aasheesh Kolli (Student volunteers) Context Many fantastic community formation/visioning
More informationLec 24: Parallel Processors. Announcements
Lec 24: Parallel Processors Kavita ala CS 3410, Fall 2008 Computer Science Cornell University P 3 out Hack n Seek nnouncements The goal is to have fun with it Recitations today will talk about it Pizza
More informationEnergy Efficient Soft Real-Time Computing through Cross-Layer Predictive Control
Energy Efficient Soft Real-Time Computing through Cross-Layer Predictive Control Guangyi Cao and Arun Ravindran Department of Electrical and Computer Engineering University of North Carolina at Charlotte
More informationAN PN7150X Frequently Asked Questions. Application note COMPANY PUBLIC. Rev June Document information
Document information Info Content Keywords NFC, PN7150X, FAQs Abstract This document intents to provide answers to frequently asked questions about PN7150X NFC Controller. Revision history Rev Date Description
More informationJanuary 11, 2017 Administrative notes
January 11, 2017 Administrative notes Clickers Updated on Canvas as of people registered yesterday night. REEF/iClicker mobile is not working for everyone. Use at your own risk. If you are having trouble
More informationHow cryptographic benchmarking goes wrong. Thanks to NIST 60NANB12D261 for funding this work, and for not reviewing these slides in advance.
How cryptographic benchmarking goes wrong 1 Daniel J. Bernstein Thanks to NIST 60NANB12D261 for funding this work, and for not reviewing these slides in advance. PRESERVE, ending 2015.06.30, was a European
More informationMultiple Predictors: BTB + Branch Direction Predictors