22nd December Dear Sir/Madam:
Jose Renau
Siebel Center for Computer Science
N. Goodwin, Urbana, IL
Homepage
Phone (217) (mobile), (217) (work)

22nd December 2003

Dear Sir/Madam:

Please find enclosed my application for the position of tenure-track Assistant Professor in your department. I will finish my Ph.D. in computer science at the University of Illinois at Urbana-Champaign this coming summer, and would be available to start working in Fall.

My research area is Computer Architecture. I have been working under the supervision of Professor Josep Torrellas. I have broad expertise in computer architecture, which includes chip and processor architecture, multiprocessor systems architecture, low power design, Thread Level Speculation (TLS), and processor-in-memory systems. I have also worked on compilation support for new architectures and Linux kernel development. Finally, I have developed substantial software systems, including a simulator for computer architectures and a TLS compiler.

Please find enclosed my resume, statement of research and teaching, and samples of my publications. All these documents are also available at my website.

I am very excited at the opportunity of contributing to your department. I am sure that I can enhance its visibility and reputation with my work. I am looking forward to your positive request for an interview.

Yours sincerely,
Jose Renau
Personal Information

Jose Renau
Citizenship: Spain
Siebel Center for Computer Science, 201 N. Goodwin, Urbana, IL
Homepage
Phone (217) (mobile), (217) (work)

Research Interests

Computer architecture, chip multiprocessors, energy/performance trade-offs, thread level speculation, interaction between architecture and compilers, Linux kernel.

Education

University of Illinois at Urbana-Champaign (Advisor: Professor Josep Torrellas)

2004 (expected) Ph.D. Computer Science
Thesis: Chip Multiprocessor with Thread Level Speculation: Performance and Energy
The thesis challenges, for the first time, the commonly-held view that Thread Level Speculation (TLS) consumes excessive energy. It also proposes novel micro-architectural mechanisms to support out-of-order task spawning in Chip Multiprocessors (CMP) with TLS. The experimental work included the development of a full TLS compiler.

M.S. Computer Science
Thesis: Memory Hierarchies in Intelligent Memories: Energy/Performance Design
The thesis describes the FlexRAM architecture, focusing on energy, performance, and complexity issues. FlexRAM is a processor-in-memory architecture.

Ramon Llull University, Spain

1997 M.S. Computer Science
Thesis: Linux Kernel IEEE1284 Implementation
The thesis consisted of building TCP/IP over IEEE1284 and SCSI in Linux. The implementation also included drivers.

B.S. Computer Science
Final project: ILZR, a New Data Compression Algorithm

Awards

IBM Graduate Research Fellowship ( )
J. Poppelbaum Memorial Award, University of Illinois (2003). Given to one graduate student every year for academic merit and creativity in computer architecture.
Publications

Conferences and Journals

[1] Speculative Multithreading Does Not (Necessarily) Waste Energy, Jose Renau, Smruti Sarangi, James Tuck, Karin Strauss, Luis Ceze, Wei Liu, and Josep Torrellas, Submitted to International Symposium on Computer Architecture (ISCA), November
[2] TLS Chip Multiprocessors: Micro-Architectural Mechanisms for Tasking with Out-of-Order Spawn, Jose Renau, James Tuck, Wei Liu, Luis Ceze, Karin Strauss, and Josep Torrellas, Submitted to International Symposium on Computer Architecture (ISCA), November
[3] Managing Multiple Low-Power Adaptation Techniques: The Positional Approach, Michael Huang, Jose Renau, and Josep Torrellas, Sidebar, IEEE Computer Magazine, December
[4] Programming the FlexRAM Parallel Intelligent Memory System, Basilio Fraguela, Jose Renau, Paul Feautrier, David Padua, and Josep Torrellas, International Symposium on Principles and Practice of Parallel Programming (PPoPP), June
[5] Positional Adaptation of Processors: Application to Energy Reduction, Michael Huang, Jose Renau, and Josep Torrellas, International Symposium on Computer Architecture (ISCA), June
[6] Cherry: Checkpointed Early Resource Recycling in Out-of-order Microprocessors, José F. Martínez, Jose Renau, Michael Huang, Milos Prvulovic, and Josep Torrellas, International Symposium on Microarchitecture (MICRO), November
[7] Energy-Efficient Hybrid Wakeup Logic, Michael Huang, Jose Renau, and Josep Torrellas, International Symposium on Low Power Electronics and Design (ISLPED), August
[8] A Framework for Dynamic Energy Efficiency and Temperature Management, Wei Huang, Jose Renau, and Josep Torrellas, Journal on Instruction Level Parallelism (JILP), October
[9] Cache Decomposition for Energy-Efficient Processors, Michael Huang, Jose Renau, Seung-Moon Yoo, and Josep Torrellas, International Symposium on Low Power Electronics and Design (ISLPED), August
[10] A Framework for Dynamic Energy Efficiency and Temperature Management, Wei Huang, Jose Renau, Seung-Moon Yoo, and Josep Torrellas, International Symposium on Microarchitecture (MICRO), December

Workshops

[11] Profile-Based Energy Reduction for High Performance, Wei Huang, Jose Renau, and Josep Torrellas, ACM Workshop on Feedback-Directed and Dynamic Optimization (FDDO), December
[12] Energy/Performance Design of Memory Hierarchies for Processor-In-Memory Chips, Wei Huang, Jose Renau, Seung-Moon Yoo, and Josep Torrellas, Workshop on Intelligent Memory Systems, November. It also appeared in Lecture Notes in Computer Science (Vol. 2107), Springer-Verlag.
[13] Memory Hierarchies in Intelligent Memories: Energy/Performance Design, Wei Huang, Jose Renau, Seung-Moon Yoo, and Josep Torrellas, Ninth Workshop on Scalable Shared Memory Multiprocessors, June

Technical Reports and Theses

[14] CFlex: A Programming Language for the FlexRAM Intelligent Memory Architecture, Basilio Fraguela, Jose Renau, Paul Feautrier, David Padua, and Josep Torrellas, Technical Report UIUCDCS-R, July
[15] FlexRAM Architecture Design Parameters, Seung-Moon Yoo, Jose Renau, Wei Huang, and Josep Torrellas, Technical Report 1584, October
[16] Memory Hierarchies in Intelligent Memories: Energy/Performance Design, Jose Renau, M.S. Thesis, University of Illinois, December
[17] Linux Kernel IEEE1284 Implementation, Jose Renau, M.S. Thesis, Ramon Llull University, June
Talks

As Presenter at Conferences/Workshops
Cherry: Checkpointed Early Resource Recycling in Out-of-order Microprocessors, International Symposium on Microarchitecture (MICRO), November
Cache Decomposition for Energy-Efficient Processors, International Symposium on Low Power Electronics and Design (ISLPED), August
Memory Hierarchies in Intelligent Memories: Energy/Performance Design, The Ninth Workshop on Scalable Shared Memory Multiprocessors, June

As Invited Speaker
Architectural Support for Hierarchical Thread-Level Speculation, IBM T.J. Watson Research Center, New York, August

As Presenter in DARPA PI Meeting
Morphable Multithreaded Memory Tiles (M3T) Architecture, IBM T.J. Watson Research Center, New York, April

Software Created

Designed and implemented a new simulator of computer architectures (Sesc). It is used by several research groups at the University of Illinois, University of Rochester, North Carolina State University, Georgia Institute of Technology, and Cornell University. It models a variety of architectures, including dynamic superscalar processors, CMPs, processor-in-memory, and TLS architectures.

Created a fully automatic TLS compiler pass using GCC. It generates tasks with software value prediction. This is the compiler used to evaluate the architecture proposed in my Ph.D. thesis.

Made some extensions to CACTI, a widely used cache power model. The extensions have been used at the University of Illinois, University of Rochester, North Carolina State University, U.C. Davis, U.C. Irvine, U.C. Riverside, and University of Arizona.

Contributed to official Shared Memory Multiprocessors (SMP) Linux patches to support SMP boards. These patches are included in all the Linux kernel distributions since

Co-developed the IEEE 1284 (parallel port) implementation in Linux. This implementation is included in all Linux kernels since

Developed official GCC patches, which are included in the main distribution (2002).

Developed TCP/IP over SCSI boards, which involved several modifications to the Linux kernel to support a high-performance interconnection system between Linux machines.

Invented a new data compression algorithm (ILZR), a variant of Lempel-Ziv Ross Williams, distributed as public domain for Amiga computers in Aminet-CD (1993).

Developed the superscalar simulation infrastructure used by the Architecture Group at the Computer Science Department of Ramon Llull University ( ).

Teaching Experience

Substitute teacher for some senior- and graduate-level computer architecture classes at the University of Illinois (2002, 2003).

Tutoring graduate students at the University of Illinois (2002, 2003).

Created and taught a course for system administrators at Ramon Llull University, Spain. The course was 4 hours a week for 10 weeks (1997).
Professional Experience

Jan 1999-Aug 2003: Research Assistant. University of Illinois at Urbana-Champaign.
Aug 1998-Dec 1998: System Administrator. University of Illinois at Urbana-Champaign. Worked for the Computing and Communications Services Office.
Jan 1998-Jul 1998: Computer Network Specialist. FIHOCA, S.A. (Spain).
Sep 1996-Sep 1997: System Administrator. Asertel, S.A. (Spain). In charge of the computer infrastructure. Specialized in network security.
May 1995-Sep 1996: Systems Manager. Ramon Llull University (Spain). In charge of the administration of the UNIX machines, PCs, and the network of the University.

Professional Activities and Memberships

Reviewer of papers for conferences and journals in computer architecture (ISCA, MICRO, HPCA, ICS, CAL, and IPDPS).
ACM member since

References

Josep Torrellas (advisor)
Professor & Willet Faculty Scholar
Department of Computer Science
University of Illinois at Urbana-Champaign
Siebel Center for Computer Science
201 North Goodwin, Urbana, IL
(217)

David Padua
Professor
Department of Computer Science
University of Illinois at Urbana-Champaign
Siebel Center for Computer Science
201 North Goodwin, Urbana, IL
(217)

Wen-Mei Hwu
Franklin W. Woeltge Professor
Department of Electrical and Computer Engineering
University of Illinois at Urbana-Champaign
215 Coordinated Science Laboratory
1308 West Main, Urbana, IL
(217)

Mark Snir
Faiman/Muroga Professor
Head, Department of Computer Science
University of Illinois at Urbana-Champaign
Siebel Center for Computer Science

Sarita Adve
Associate Professor
Department of Computer Science
University of Illinois at Urbana-Champaign
Siebel Center for Computer Science
Appendix: Abstracts of my Conference Papers

[1] Speculative Multithreading Does Not (Necessarily) Waste Energy (Submitted to ISCA 2004)

While Chip Multiprocessors (CMP) with Speculative Multithreading (SM) have been gaining momentum, experienced processor designers in industry have reservations about their practical implementation. In particular, it is felt that SM is too energy-inefficient to compete against conventional superscalars. This paper challenges the commonly-held view that SM consumes excessive energy. We show a CMP with SM support that is not only faster but also more energy efficient than a state-of-the-art wide-issue superscalar. We demonstrate it with a new energy-efficient CMP micro-architecture. In addition, we identify the additional sources of energy consumption in SM, and propose energy-centric optimizations that mitigate them. Experiments with the SpecInt 2000 codes show that a CMP with two 4-issue cores and support for SM delivers a speedup of 1.08 over an 8-issue superscalar and consumes only 54% of its power. Alternatively, for the same average power in both chips, the SM CMP is 1.6 times faster than the superscalar on average.

[2] TLS Chip Multiprocessors: Micro-Architectural Mechanisms for Tasking with Out-of-Order Spawn (Submitted to ISCA 2004)

Chip Multiprocessors (CMP) are flexible, high-frequency platforms on which to support Thread-Level Speculation (TLS). However, for TLS to deliver on its promise, CMPs must exploit multiple sources of speculative task-level parallelism, including any nesting levels of both subroutines and loop iterations. Unfortunately, these environments are hard to support in decentralized CMP hardware: since tasks are spawned out-of-order and unpredictably, maintaining key TLS basics such as task ordering and efficient resource allocation is challenging. This paper is the first one to propose micro-architectural mechanisms that, taken together, fundamentally enable fast TLS with out-of-order spawn in a CMP. These simple mechanisms are: Splitting Timestamp Intervals, the Immediate Successor List, and Dynamic Task Merging. To evaluate them, we develop a TLS compiler with out-of-order spawn. With our mechanisms, a TLS CMP with two 4-issue processors increases the average speedup of full SpecInt 2000 applications from 1.15 (no out-of-order spawn) to 1.25 (with out-of-order spawn). Moreover, the resulting CMP outperforms a very aggressive 8-issue superscalar. Specifically, with the same clock frequency, the CMP delivers an average speedup of 1.14 over the 8-issue processor.

[4] Programming the FlexRAM Parallel Intelligent Memory System (PPoPP 2003)

In an intelligent memory architecture, the main memory of a computer is enhanced with many simple processors. The result is a highly-parallel, heterogeneous machine that is able to exploit computation in the main memory. While several instantiations of this architecture have been proposed, the question of how to effectively program them with little effort has remained a major challenge. In this paper, we show how to effectively hand-program an intelligent memory architecture at a high level and with very modest effort. We use FlexRAM as a prototype architecture. To program it, we propose a family of high-level compiler directives inspired by OpenMP called CFlex. Such directives enable the processors in memory to execute the program in cooperation with the main processor. In addition, we propose libraries of highly-optimized functions called Intelligent Memory Operations (IMOs). These functions program the processors in memory through CFlex, but make them completely transparent to the programmer. Simulation results show that, with CFlex and IMOs, a server with 64 simple processors in memory runs on average 10 times faster than a conventional server. Moreover, conventional programs with 240 lines on average are transformed into CFlex parallel form with only 7 CFlex directives and 2 additional statements on average.

[5] Positional Adaptation of Processors: Application to Energy Reduction (ISCA 2003)

Although adaptive processors can exploit application variability to improve performance or save energy, effectively managing their adaptivity is challenging. To address this problem, we introduce a new approach to adaptivity: the Positional approach. In this approach, both the testing of configurations and the application of the chosen configurations are associated with particular code sections. This is in contrast to the currently-used Temporal approach to adaptation, where both the testing and application of configurations are tied to successive intervals in time. We propose to use subroutines as the granularity of code sections in positional adaptation. Moreover, we design three implementations of subroutine-based positional adaptation that target energy reduction in three different workload environments: embedded or specialized server, general purpose, and highly dynamic. All three implementations of positional adaptation are much more effective than temporal schemes. On average, they boost the energy savings of applications by 50% and 84% over temporal schemes in two experiments.

[6] Cherry: Checkpointed Early Resource Recycling in Out-of-order Microprocessors (MICRO 2002)

This paper presents CHeckpointed Early Resource RecYcling (Cherry), a hybrid mode of execution based on ROB and checkpointing that decouples resource recycling and instruction retirement. Resources are recycled early, resulting in a more efficient utilization. Cherry relies on state checkpointing and rollback to service exceptions for instructions whose resources have been recycled. Cherry leverages the ROB to (1) not require in-order execution as a fallback mechanism, (2) allow memory replay traps and branch mispredictions without rolling back to the Cherry checkpoint, and (3) quickly fall back to conventional out-of-order execution without rolling back to the checkpoint or flushing the pipeline. We present a Cherry implementation with early recycling at three different points of the execution engine: the load queue, the store queue, and the register file. We report average speedups of 1.06 and 1.26 in SPECint and SPECfp applications, respectively, relative to an aggressive conventional architecture. We also describe how Cherry and speculative multithreading can be combined and complement each other.

[7] Energy-Efficient Hybrid Wakeup Logic (ISLPED 2002)

The instruction window is a critical component and a major energy consumer in out-of-order superscalar processors. An important source of energy consumption in the instruction window is the instruction wakeup: a completing instruction broadcasts its result register tag and an associative comparison is performed with all the entries in the window. This paper shows that a very large fraction of the completing instructions have to wake up no more than a single instruction currently in the window. Consequently, we propose to save energy by using indexing to only enable the comparator at the single instruction to wake up. Only in the rare case when more than one instruction needs to wake up does our scheme revert to enabling all the comparators or a subset of them. For this reason, we call our scheme Hybrid. Overall, our scheme is very effective: for a processor with a 96-entry window, the number of comparisons performed by the average completing instruction is reduced to 1.1. The exact magnitude of the energy savings will depend on the specific instruction window implementation. Furthermore, in the Hybrid schemes, the application suffers no performance penalty.

[9] Cache Decomposition for Energy-Efficient Processors (ISLPED 2001)

The L1 data cache is a time-critical module and, at the same time, a major source of energy consumption. To reduce its energy-delay product, we apply two principles of low power design: specialize part of the cache structure and break down the cache into smaller caches. To this end, we propose an L1 cache that combines new designs of a stack cache and a PSA cache. Individually, our stack and PSA cache designs have a lower energy-delay product than previously proposed designs. In addition, their combined operation is very effective. Relative to a conventional 2-way 32KB data cache, our design containing a 4-way 32KB PSA cache and a 512B stack cache reduces the energy-delay product of several applications by an average of 44%.
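As a rough illustration of the Hybrid wakeup idea in [7], the sketch below counts comparator activations for a conventional broadcast wakeup and for a simplified hybrid scheme. The `Completion` record, the function names, and the trace are hypothetical; the sketch assumes the number of waiting consumers is known at completion time, that no comparator fires when nothing is waiting, and that the fallback always enables every comparator (the paper also allows a subset).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Completion:
    tag: int        # result register tag produced by the completing instruction
    consumers: int  # window entries waiting on this tag (assumed known here)


def broadcast_comparisons(completions: List[Completion], window_size: int) -> int:
    """Conventional wakeup: each completion is compared against every entry."""
    return len(completions) * window_size


def hybrid_comparisons(completions: List[Completion], window_size: int) -> int:
    """Simplified Hybrid wakeup: index to the single consumer when there is at
    most one; otherwise fall back to enabling all comparators."""
    total = 0
    for c in completions:
        if c.consumers <= 1:
            total += c.consumers   # zero or one comparator enabled
        else:
            total += window_size   # rare case: full associative search
    return total


# Hypothetical trace: most completions wake up at most one instruction.
trace = [Completion(tag=t, consumers=n)
         for t, n in enumerate([1, 0, 1, 1, 3, 1, 0, 1])]
WINDOW = 96  # window size mentioned in the abstract

conventional = broadcast_comparisons(trace, WINDOW)
hybrid = hybrid_comparisons(trace, WINDOW)
print(conventional, hybrid)
```

The made-up trace is not meant to reproduce the 1.1 comparisons per completion reported in the abstract; it only shows why the count drops sharply when multi-consumer completions are rare.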
[10] A Framework for Dynamic Energy Efficiency and Temperature Management (MICRO 2000)

While technology is delivering increasingly sophisticated and powerful chip designs, it is also imposing alarmingly high energy requirements on the chips. One way to address this problem is to manage the energy dynamically. Unfortunately, current dynamic schemes for energy management are relatively limited. In addition, they manage energy either for energy efficiency or for temperature control, but not for both simultaneously. In this paper, we design and evaluate for the first time an energy-management framework that tackles both energy efficiency and temperature control in a unified manner. We call this general approach Dynamic Energy Efficiency and Temperature Management (DEETM). Our framework combines many energy-management techniques and can activate them individually or in groups in a fine-grained manner according to a given policy. The goal of the framework is two-fold: maximize energy savings without extending application execution time beyond a given tolerable limit, and guarantee that the temperature remains below a given limit while minimizing any resulting slowdown. The framework successfully meets these goals. For example, it delivers a 40% energy reduction with only a 10% application slowdown.
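The power, energy, and execution-time figures quoted in abstracts [1] and [10] are linked by the identity energy = average power x execution time. The back-of-the-envelope sketch below applies that identity to the quoted numbers. The derived values are not stated in the papers, the function names are hypothetical, and the calculation treats the reported averages as if they described a single run.

```python
def relative_energy(relative_power: float, speedup: float) -> float:
    """Energy = average power x time; relative time is 1 / speedup."""
    return relative_power / speedup


def relative_average_power(relative_energy_value: float, slowdown: float) -> float:
    """Average power = energy / time; relative time is 1 + slowdown."""
    return relative_energy_value / (1.0 + slowdown)


# Abstract [1]: 54% of the superscalar's power at a speedup of 1.08.
sm_cmp_energy = relative_energy(0.54, 1.08)

# Abstract [10]: 40% energy reduction with a 10% application slowdown.
deetm_power = relative_average_power(1.0 - 0.40, 0.10)

print(f"[1]  relative energy: {sm_cmp_energy:.2f}")
print(f"[10] relative average power: {deetm_power:.3f}")
```

Under these assumptions, the CMP in [1] would use roughly half the energy of the superscalar, and the DEETM example in [10] would correspond to an average power of roughly 55% of the baseline.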
Jose Renau
Research Statement

Research Interests

I am a computer architect with broad interdisciplinary research interests and experience. I have made contributions to chip-level architectures for Thread Level Speculation (TLS) [1,2], superscalar processor microarchitecture [6], low-power architectures [2,7,9] and adaptive processors [3,5,8,10,11], programmability, energy, and performance of processor-in-memory architectures [4,12,13,14,15,16], and compilation support for emerging architectures [1,2,4]. I feel that interdisciplinary research is required to push the envelope in computer architecture.

Past and Present Research

Thread Level Speculation

Most of my research has been on performance and energy trade-offs in chip-level architectures. My thesis focuses on improving the performance and minimizing the energy consumption of TLS architectures [1,2]. The thesis challenges for the first time the commonly-held view that TLS consumes excessive energy. This is an important issue because energy and power are arguably the main design constraints in current processors. My thesis describes the architecture of a Chip Multiprocessor (CMP) with TLS support that is both faster and more energy efficient than a state-of-the-art wide-issue superscalar processor. Additionally, it identifies the sources of energy waste in TLS and proposes novel energy-centric optimizations.

My thesis is also the first one to propose detailed microarchitectural mechanisms to enable speculative tasking with out-of-order task spawn in a TLS CMP. Out-of-order task spawn unlocks higher performance. To evaluate my proposals, I built a detailed simulator and a novel TLS compiler on top of GCC. The compiler generates energy-efficient tasks with out-of-order spawning. Experiments with SPECint codes show that a TLS CMP with two narrow-issue cores delivers significant speedups over a wider-issue superscalar, while consuming a fraction of its average power. Therefore, I claim that TLS CMPs are highly-promising platforms for next-generation processors.

Processor Checkpointing

As part of my TLS work, I reused TLS's support for program state checkpointing and rollback recovery to improve superscalar pipeline design [6]. Current superscalar pipelines are sub-optimal in that instructions retain their resources (registers or load/store queue entries) well past their completion until they retire. In our proposal, called Cherry, we decouple resource recycling and instruction retirement. Registers and load/store queue entries are recycled before instruction retirement, boosting pipeline utilization and, as a result, processor performance. Cherry relies on TLS's support for register and cache state checkpointing and rollback to service exceptions for instructions whose resources have been recycled. The resulting higher resource utilization enabled by Cherry leads to substantial speedups for SPECint and SPECfp codes. Overall, this work is significant in that it proposes enhancing the performance of processors through aggressive resource recycling rather than through adding more resources, which is not scalable.

Low-Power Adaptive Processors

Before working on TLS and Cherry, I worked on low-power adaptive (or reconfigurable) processor architectures. Run-time adaptation of hardware structures such as caches or pipelines is a promising approach to partially solve the problem of high energy and power requirements in current processors. We proposed a hardware and software algorithm that controls energy consumption and temperature in a unified manner [8,10]. We also designed a novel approach to processor adaptation called Positional adaptation [3,5,11]. In positional adaptation, a processor remembers the best configuration when it executes a code section; it then uses the same configuration when the code section is invoked again. This is in contrast to the conventional adaptation schemes. In such schemes, which we call temporal, the configuration is chosen based on the behavior of the code section immediately preceding the current one. In our work, we show that positional adaptation is more effective than temporal at saving processor energy with little performance impact. In addition to this work, I have also proposed new energy-efficient cache [9] and instruction window [7] organizations.

Processor-in-Memory Architectures

I have also worked on the FlexRAM project, which proposes a new processor-in-memory architecture. A FlexRAM chip includes up to 64 simple processors and 64 Mbytes of DRAM. Several such chips can be placed in the memory system of a workstation, resulting in a very versatile computing platform. For example, highly-parallel or memory-intensive tasks can be off-loaded to the memory processors, which can execute in parallel with the main processor. The original FlexRAM architecture did not focus on energy or complexity issues. For my M.S. thesis, I redesigned the FlexRAM chip (on paper) to make it energy-efficient [12,13,15,16]. The resulting architecture has energy and performance advantages over conventional workstations. However, it is quite complex to program. To mitigate this problem, we proposed OpenMP-based extensions to a high-level language to help program FlexRAM. These extensions, called CFlex [4,14], substantially enhance the programmability of FlexRAM and similar processor-in-memory architectures.

Tool Development

During my Ph.D., I have developed a large number of software tools that my colleagues and I have used for research. Specifically, I have designed and implemented a simulator of computer architectures (Sesc). It is used by several research groups at the Univ. of Illinois, Univ. of Rochester, North Carolina State Univ., Cornell Univ., and Georgia Institute of Technology. It models a variety of architectures, including dynamic superscalar processors, CMPs, processor-in-memory, and TLS architectures.

To evaluate the TLS architecture proposed in my thesis, I built, together with three other graduate students, a TLS compiler pass using GCC. The pass automatically generates tasks with software value prediction. In addition, to improve the task selection quality, we built a profiler pass.

I made some extensions to CACTI, a widely-used tool that models power consumption in caches. The extensions have been used at the Univ. of Illinois, Univ. of Rochester, North Carolina State Univ., U.C. Davis, U.C. Irvine, U.C. Riverside, and Univ. of Arizona.

Finally, I also made several open source contributions. I co-developed the IEEE 1284 implementation and some multiprocessor patches for Linux. They are included in all Linux kernels since. I also contributed some official patches for GCC.

Future Research

In the short-to-medium term, I plan to keep investigating TLS architectures. I think that TLS has great potential for future processors. However, there are many issues that need to be solved before we can see commercial processors supporting TLS. I have observed that processor designers in companies such as IBM and Intel have reservations about TLS. They are especially concerned about power consumption and design complexity. I plan to focus on making TLS a viable alternative. I will systematically address the open questions and problems in TLS, starting with the impact of TLS on the chip temperature. This research is challenging because it requires interdisciplinary expertise in energy, performance, and compilation support.

I plan to contribute novel ideas to improve the performance and complexity trade-offs in Chip Multiprocessors (CMPs) with out-of-order superscalars. I believe that Cherry-style checkpointing in modern pipelines is a promising approach to boost performance while limiting complexity. For these architectures, I also want to make fundamental advances in energy, power, and temperature issues, which I consider the true constraints in future CMP designs.

Current microprocessors and multiprocessor systems are very complex. In computer architecture, more complexity implies harder-to-test designs and longer time to market. I believe that microarchitectural proposals to reduce design complexity will be the next big thing in computer architecture. Therefore, in the next 5 years, I would like to open new research areas in complexity management. I plan to make contributions to simplify the design of hardware and software.

Finally, I have worked in many areas because I truly enjoy working in groups. Group work is a very gratifying experience, and I want to keep doing it as I build my research team of graduate students. I plan to build an interdisciplinary research team, with graduate students performing research on microarchitecture, multiprocessor systems architecture, energy and temperature issues, compilers, and performance evaluation.
Jose Renau
Teaching Statement

I consider teaching one of the most effective ways to make the world better. Teachers had a strong influence in my life, second only to my family. While research can affect a large group of people, I feel that teaching has a much larger effect on a small group of people. In my life, some professors have had a bigger impact on me than any paper that I have ever read. I would like my teaching to have this kind of impact.

As a senior student in a large research group at the University of Illinois, I have had the pleasure of coordinating the research of several younger Ph.D. and M.S. students in the group. I have always liked helping new members by suggesting lists of papers to read and research problems to examine. In addition, I have coordinated the work of several group members on our research tool infrastructure.

While working on my M.S. degree in Spain, I instructed a group of 20 system administrators. I prepared and taught a course on networking and security that lasted a couple of months. Moreover, at the University of Illinois, I have been a substitute instructor several times. When my advisor has been out of town, I have volunteered to teach his classes. This has given me the opportunity to interact with students on multiple occasions.

When I teach, I like to balance an abstract global view against real industrial examples. I like to impart a solid understanding with some additional insights that would be difficult to find in a book. I extract most of these insights from conferences or recent news.

Given my background in computer science, I am comfortable teaching any computer architecture class at the graduate and undergraduate level. I would like to teach the following subjects: single processor architectures, multiprocessor architectures, energy and performance issues, and emerging architectural approaches like processors-in-memory and thread level speculation. At the senior undergraduate level, I can also teach compiler and operating systems courses. At the undergraduate level, I can teach VLSI and networking courses.

Aside from already established courses, I would like to create new interdisciplinary courses. I feel that the emerging research topics in computer architecture are at the intersection between multiple areas. I would like to teach courses analyzing the interaction between architectures and compilers, and the interaction between performance and energy optimizations.

University professors are particularly fortunate in that they interact with groups of smart graduate students. I fondly remember becoming interested in computer architecture by participating in small reading groups. As a professor, I would like to create reading groups where junior students can discover their interests in computer architecture.