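Kyushu University Institutional Repository

Energy Reduction of Embedded Processors by Dynamic Control of a Non-uniform Selective Way Cache

Ishitobi, Yuriko (Graduate School of Information Science and Electrical Engineering, Kyushu University)
Ishihara, Tohru (System LSI Research Center, Kyushu University)
Yasuura, Hiroto (Kyushu University)
et al.

http://hdl.handle.net/2324/13843

Publication information: IEICE Technical Report, 108 (464), pp. 13-18, 2009-03-05. The Institute of Electronics, Information and Communication Engineers (IEICE)
Version: accepted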
THE INSTITUTE OF ELECTRONICS, INFORMATION AND COMMUNICATION ENGINEERS
TECHNICAL REPORT OF IEICE

A Dynamic Management Technique of a Non-uniform Selective Way Cache for Reducing the Energy Consumption of Embedded Processors

Yuriko ISHITOBI, Tohru ISHIHARA, and Hiroto YASUURA

Graduate School of Inf. Sci. & E.E., Kyushu University, Motooka 744, Nishi-ku, Fukuoka-shi, 819-0395 Japan
System LSI Research Center, Kyushu University, Momochihama 3-8-33, Sawara-ku, Fukuoka-shi, 814-0001 Japan
Kyushu University, Hakozaki 6-10-1, Higashi-ku, Fukuoka-shi, 812-8581 Japan
E-mail: ishitobi@c.csce.kyushu-u.ac.jp, ishihara@slrc.kyushu-u.ac.jp, yasuura@c.csce.kyushu-u.ac.jp

Abstract  This paper proposes a dynamic management technique for a Non-uniform Selective Way Cache (NSWC) that reduces the total energy consumption of a CPU core, cache memories, and off-chip memories. An NSWC has a way that uses a low supply voltage (Vdd) and a low threshold voltage (Vth). Our approach determines the points in a program at which to insert instructions that change the set of available ways in the NSWC. Experiments using the parameters of a commercial embedded processor and an off-chip SDRAM demonstrate that our algorithm reduces the energy consumption of the processor system by 7% to 20% compared with a processor that has a set-associative cache memory of the same size.

Key words  embedded processor, energy reduction, cache memory

1. [...]

[...] ARM920T(TM) [...] 44% [...] StrongARM [...] 27% [...] [1]-[3] [...]
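[...] [7] [...] [7] [...] Selective Way Cache (SWC) [6] [...] SWC [...] Non-uniform Selective Way Cache (NSWC) [...] NSWC [...] 2 [...] SWC [...] NSWC [...] SP [...] DP [...] DP [...]

Section 2 [...], Section 3 [...] NSWC [...], Section 4 [...], Section 5 [...].

2. [...]

2.1 [...]

Selective Way Cache (SWC)
David H. Albonesi [6] [...] David H. Albonesi [...] SWC [...] Figure 1 [...] SWC [...] Cache Way Select Register (CWS) [...] CWS [...] [6] [...]

[Figure 1: SWC]

Way Placement
Timothy M. Jones et al. [7] [...] Timothy M. Jones [...] (Normal access) [...] Normal area [...] 1 [...] (Way placement access) [...] Way placement area [...] 2 [...] Way placement access [...] Way placement area [...] 1 [...] Way placement area [...]

2.2 Way Placement [...] SWC [...]

[...] SWC [...] Way Placement [...] Way Placement [7] [...] Way placement access [...] Normal access [...] Normal access [...] SWC [...] Figure 2 [...]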
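[Figure 2: SWC [...] Way Placement]

[...] SWC [...] 2 [...] A, D [...] 2 [...] A, B [...] 1 [...] B, C [...] 2 [...] A, C [...] 3 [...] 1 [...] 2 [...] Way Placement [...] 3 [...] A, C [...] C [...] Normal area [...] Way placement access [...] SWC [...] SWC [...] SRAM [...] NSWC (Non-uniform Selective Way Cache) [...]

3. NSWC [...]

3.1 NSWC [...]

3.1.1 NSWC [...]

[...] NSWC [...] Figure 3 [...] NSWC [...] SWC [...] NSWC [...] (SP) [...] (DP) [...] DP [...] SP [...] DP [...] DP [...] SP [...] DP [...] NSWC [...] CWS [...] CWS [...] CWS [...]

[Figure 3: NSWC]

3.1.2 NSWC [...]

The total energy consumption is

\[ TE_{total} = TE_{NSWC} + TE_{DCACHE} + TE_{main} + TE_{logic} \tag{1} \]

In Eq. (1), TE_{NSWC}, TE_{DCACHE}, TE_{main}, and TE_{logic} are the energy consumed by the NSWC, the data cache, the main memory, and the logic, respectively. TE_{NSWC} is

\[ TE_{NSWC} = n_{DP} E_{DP} + n_{SP} E_{SP} + n_{wch} E_{wch} + n_{missI} E_{missI} + t_{all} \left( N_{DP} P^{DP}_{leak} + N_{SP} P^{SP}_{leak} \right) \tag{2} \]

In Eq. (2), n_{DP} and n_{SP} are the numbers of accesses to the DP and SP ways, and E_{DP} and E_{SP} are the energy per access to a DP and an SP way. n_{wch} is the number of way changes in the NSWC and E_{wch} is the energy per way change. n_{missI} is the number of NSWC misses and E_{missI} is the energy per NSWC miss. P^{DP}_{leak} and P^{SP}_{leak} are the leakage power of a DP and an SP way, N_{DP} and N_{SP} are the numbers of DP and SP ways in the NSWC, and t_{all} is the total execution time.

TE_{DCACHE} in Eq. (1) is

\[ TE_{DCACHE} = n_{dcache} E_{dcache} + n_{missD} E_{missD} + t_{all} P_{leakD} \tag{3} \]

In Eq. (3), n_{dcache} is the number of data cache accesses, E_{dcache} is the energy per data cache access, n_{missD} is the number of data cache misses, E_{missD} is the energy per data cache miss, and P_{leakD} is the leakage power of the data cache.

TE_{logic} and TE_{main} in Eq. (1) are given in terms of the logic power P_{logic}, the main memory power P_{main}, the energy per main memory access E_{main}, and the number of main memory accesses n_{main}:

\[ TE_{logic} = t_{all} P_{logic} \tag{4} \]

\[ TE_{main} = t_{all} P_{main} + n_{main} E_{main} \tag{5} \]

The total execution time t_{all} is

\[ t_{all} = n_{hit} T_{hit} + n_{missI} T_{miss} + \alpha \tag{6} \]

where n_{hit} is the number of hits, T_{hit} is the hit time, and T_{miss} is the miss time of the NSWC.

3.2 [...] SWC [...]

[...] 1 [...] 2 [...] 3 [...] 1 [...] 2 [...] 1 [...] 3 [...]

3.2.1 [...]

[...] 4 [...] 4 [...] CFG [...] CFG [...] 5 [...]

[Figure: CFG]

[...] CFG [...] 5 [...] nop [...] nop [...] nop [...] 4 [...] 1 [...] 1 [...] 1 [...] 1 [...] 1 [...] 1 [...]

3.2.2 [...]

[...] [5] [...]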
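[...] 4 [...]

[Figure 6]

[...] 6 [...] 1 [...] 1 [...] 1 [...]

3.2.3 [...]

[...] Eq. (1) [...]

4. [...]

4.1 [...]

[...] (SA), the Way Placement technique of [7] (WP), the SWC (SWC), and [...] (NSWC) [...] Way Placement, SWC, NSWC [...] Eq. (1) [...] EEMBC [...]

4.2 [...]

[...] MeP [...] SAIF [...] Synopsys Power Compiler [...] Micron Mobile DDR SDRAM [10] [...] 4KB [...] 2 [...] 32 [...] RAM [...] CACTI 5.1 [9] [...] CACTI [...] HP (High Performance), LSTP (Low Standby Power), LOP (Low Operating Power) [...] 3 [...] SP, DP [...] LSTP, LOP [...] CWS [...] RAM [...] CWS [...] Table 2 [...]

4.3 [...]

Figure 7 [...]. Compared with the set-associative cache (SA), the NSWC (NSWC) reduces energy consumption by 7% to 20%.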
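The total-energy model of Eqs. (1) to (6) in Section 3.1.2, which Sections 3.2.3 and 4.1 use to compare configurations, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' tool: the class and field names (`EventCounts`, `EnergyParams`, `te_total`, and so on) are hypothetical, all energies are taken in joules, powers in watts, and times in seconds, and the term α of Eq. (6) is treated as an opaque extra-time input.

```python
from dataclasses import dataclass


@dataclass
class EventCounts:
    """Event counts from a profile or simulation (hypothetical inputs)."""
    n_dp: int      # accesses to DP ways
    n_sp: int      # accesses to SP ways
    n_wch: int     # way-change operations
    n_miss_i: int  # NSWC misses
    n_hit: int     # NSWC hits
    n_dcache: int  # data cache accesses
    n_miss_d: int  # data cache misses
    n_main: int    # main memory accesses


@dataclass
class EnergyParams:
    """Per-event energies [J], powers [W], and times [s] (hypothetical inputs)."""
    e_dp: float
    e_sp: float
    e_wch: float
    e_miss_i: float
    e_dcache: float
    e_miss_d: float
    e_main: float
    p_dp_leak: float
    p_sp_leak: float
    p_leak_d: float
    p_logic: float
    p_main: float
    n_dp_ways: int
    n_sp_ways: int
    t_hit: float
    t_miss: float
    alpha: float = 0.0  # extra-time term of Eq. (6), treated as opaque


def total_time(c: EventCounts, p: EnergyParams) -> float:
    """Eq. (6): t_all = n_hit*T_hit + n_missI*T_miss + alpha."""
    return c.n_hit * p.t_hit + c.n_miss_i * p.t_miss + p.alpha


def te_nswc(c: EventCounts, p: EnergyParams) -> float:
    """Eq. (2): dynamic energy of the NSWC plus leakage over t_all."""
    dynamic = (c.n_dp * p.e_dp + c.n_sp * p.e_sp
               + c.n_wch * p.e_wch + c.n_miss_i * p.e_miss_i)
    leak_power = p.n_dp_ways * p.p_dp_leak + p.n_sp_ways * p.p_sp_leak
    return dynamic + total_time(c, p) * leak_power


def te_dcache(c: EventCounts, p: EnergyParams) -> float:
    """Eq. (3): data cache energy."""
    return (c.n_dcache * p.e_dcache + c.n_miss_d * p.e_miss_d
            + total_time(c, p) * p.p_leak_d)


def te_logic(c: EventCounts, p: EnergyParams) -> float:
    """Eq. (4): logic energy."""
    return total_time(c, p) * p.p_logic


def te_main(c: EventCounts, p: EnergyParams) -> float:
    """Eq. (5): main memory energy."""
    return total_time(c, p) * p.p_main + c.n_main * p.e_main


def te_total(c: EventCounts, p: EnergyParams) -> float:
    """Eq. (1): sum of the four components."""
    return te_nswc(c, p) + te_dcache(c, p) + te_logic(c, p) + te_main(c, p)
```

Because t_all appears in every leakage and static-power term, a configuration that lowers dynamic energy but lengthens execution time can still lose overall; evaluating `te_total` for each candidate makes that trade-off explicit.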
[Figure 7]

Table 1: [...]

                     LSTP       LOP
V_dd [V]             1.2        0.8
V_th [mV]            554        315
Read energy [pJ]     52.43      25.06
Leak power [mW]      0.0118     1.431
Access time [ns]     1.93704    1.0999

Table 2: CWS [...]

15 [ns]    48.34896 [pJ]

[...] 1% [...] Way Placement (WP) [...] 2% [...] SWC (SWC) [...] 2.3% [...] WP [...] SWC [...] NSWC [...] nop [...] SWC [...] NSWC [...] SWC [...] NSWC [...] NSWC [...] 1 [...] LOP [...] DP [...] LSTP [...] SP [...] 1 [...] DP [...] NSWC [...]

5. [...]

[...] NSWC [...] NSWC [...]

[...] (JST) [...] (CREST) [...]

References

[1] Ching-Long Su and Alvin Despain, "Cache Design Tradeoffs for Power and Performance Optimization: A Case Study," In Proc. of ISLPED, pp. 63-68, August 1995.
[2] Patrick Hicks, Matthew Walnock, and Robert Michael Owens, "Analysis of Power Consumption in Memory Hierarchies," In Proc. of ISLPED, pp. 239-242, August 1997.
[3] S. Segars, "Low Power Design Techniques for Microprocessors," ISSCC Tutorial note, February 2001.
[4] Lars Wehmeyer and Peter Marwedel, Fast, Efficient and Predictable Memory Accesses, Springer.
[5] Hiroyuki Tomiyama and Hiroto Yasuura, "Optimal Code Placement of Embedded Software for Instruction Caches," In Proc. of European Design and Test Conference, pp. 96-101, March 1996.
[6] David H. Albonesi, "Selective Cache Ways: On-Demand Cache Resource Allocation," In Proc. of the 32nd Annual IEEE/ACM International Symposium on Microarchitecture, pp. 248-259, 1999.
[7] Timothy M. Jones, Sandro Bartolini, Bruno De Bus, John Cavazos, and Michael F.P. O'Boyle, "Instruction Cache Energy Saving Through Compiler Way-Placement," In Proc. of Design Automation and Test in Europe, pp. 1196-1201, March 2008.
[8] [...], DA 2008, pp. 13-18.
[9] Shyamkumar Thoziyoor, Naveen Muralimanohar, Jung Ho Ahn, and Norman P. Jouppi, "CACTI 5.1," Technical Reports, HP Labs, http://www.hpl.hp.com/techreports/2008
[10] "128Mb: x16, x32 Mobile DDR SDRAM Features," http://www.micron.com/
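As a worked illustration of the trade-off behind Table 1, the sketch below estimates the read rate at which a LOP way and a LSTP way draw the same average power. It is a simplified model under loud assumptions: only the read energy and leakage power from Table 1 are used, each read is assumed to cost exactly the listed read energy, and access time, misses, and way-change overhead are ignored. The names `WayTech`, `average_power`, and `break_even_rate` are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WayTech:
    """Per-way figures taken from Table 1, converted to SI units."""
    read_energy_j: float  # energy per read [J]
    leak_power_w: float   # leakage power [W]


# Table 1 values: pJ -> J and mW -> W.
LSTP = WayTech(read_energy_j=52.43e-12, leak_power_w=0.0118e-3)
LOP = WayTech(read_energy_j=25.06e-12, leak_power_w=1.431e-3)


def average_power(way: WayTech, reads_per_second: float) -> float:
    """Average power of one way: dynamic read power plus leakage (assumed model)."""
    return way.read_energy_j * reads_per_second + way.leak_power_w


def break_even_rate(low_leak: WayTech, low_dynamic: WayTech) -> float:
    """Read rate [reads/s] at which both ways draw equal average power."""
    leak_gap = low_dynamic.leak_power_w - low_leak.leak_power_w
    energy_gap = low_leak.read_energy_j - low_dynamic.read_energy_j
    return leak_gap / energy_gap


if __name__ == "__main__":
    rate = break_even_rate(LSTP, LOP)
    print(f"break-even: {rate:.3e} reads/s")
```

Under these assumptions the break-even point is roughly 5.2e7 reads per second: above it the LOP way draws less average power, and below it the LSTP way does. That is consistent with assigning frequently accessed code to a LOP-based DP way and keeping the remaining ways on LSTP.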