Hard Real-Time Scheduling for Low-Energy Using Stochastic Data and DVS Processors

Flavius Gruian
Department of Computer Science, Lund University
Box 118 S-221 00 Lund, Sweden
Tel.: +46 046 2224673
e-mail: Flavius.Gruian@cs.lth.se

ABSTRACT
The work presented in this paper addresses scheduling for reduced energy of hard real-time tasks with fixed priorities assigned in a rate monotonic or deadline monotonic manner. The approach we describe can be implemented exclusively in the RTOS. It targets energy consumption reduction by using both on-line and off-line decisions, taken both at task level and at task-set level. We consider sets of independent tasks running on processors with dynamic voltage supplies (DVS). Taking into account the real behavior of a real-time system, which is often better than the worst case, our methods employ stochastic data to derive energy efficient schedules. The experimental results show that our approach achieves larger energy reductions than other policies from the same class.

Keywords
Low-energy, hard real-time, RTOS, scheduling

1. INTRODUCTION
Low energy consumption is today an increasingly important design requirement for digital systems, with impact on operating time, on system cost, and, of no lesser importance, on the environment. Reducing power and energy dissipation has long been addressed by several research groups, at different abstraction levels. We focus here on methods applicable at system level, where the system to be designed is specified as an abstract set of tasks. Selecting the right architecture has been shown to have a great influence on the system energy consumption [4,5]. Recently, with the advent of dynamic voltage supply (DVS) processors [2,22,25], highly flexible systems can be designed, while still taking advantage of supply voltage scaling to reduce the energy consumption. Since the supply voltage has a direct impact on processor speed, classic task scheduling and supply voltage selection have to be addressed together.
Scheduling thus offers yet another level of possibilities for achieving energy/power efficient systems, especially when the system architecture is fixed or the system exhibits a very dynamic behavior. For such dynamic systems, various power management techniques exist and are reviewed for example in [1,17]. Yet, these mainly target soft real-time systems, where deadlines can be missed if the Quality of Service is kept. Several scheduling techniques for soft real-time tasks running on DVS processors have already been described [3,18,19,23]. Energy reductions can be achieved even in hard real-time systems, where no deadline can be missed, as shown in [6,7,10,20,24]. In this paper, we also focus on hard real-time scheduling techniques, where every deadline has to be met.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or fee. ISLPED 01, August 6-7, 2001, Huntington Beach, California, USA. Copyright 2001 ACM 1-58113-371-5/01/0008...$5.00.

Task-level voltage scheduling decisions can reduce the energy consumption even further. Some of these intra-task scheduling methods use several re-scheduling points inside a task, and are usually compiler assisted [11,16,21]. Alternatively, fixing the schedule before the task starts executing as in [6,7,8] eliminates the internal scheduling overhead, but with possible effects on energy reduction. Statistics can be used to take full advantage of the dynamic behavior of the system, both at task level [16] and at task-set level [24]. In our approach we employ stochastic data to derive efficient voltage schedules without the overhead of intra-task re-scheduling.

The rest of the paper is organized as follows. In section 2 we describe our hard real-time scheduling strategy, pointing out the related work for each decision we make.
Section 3 contains several experimental results conducted both on real-life examples and on randomly generated, large task sets. Finally, we present our conclusions in section 4.

2. RT SCHEDULING FOR LOW-ENERGY
In the work described here, we address independent tasks running on a single processor. The processor has variable speed (supply voltage and energy) adjustable at runtime. The tasks arrive with given periods and have to be executed before certain deadlines. The priorities are fixed, assigned in a rate-monotonic (RM) or deadline monotonic (DM) manner [14]. The runtime scheduling also operates as in RM/DM scheduling, with the difference that each task instance is assigned a maximum allowed execution time. The scheduling strategies we adopt at task level are presented in sub-section 2.1. The allowed execution times are influenced by task-group level decisions, taken both off-line and on-line. The off-line phase is presented in sub-section 2.2 and the on-line phase in sub-section 2.3. Sub-section 2.3 also contains a proof that our scheduling method keeps the response times from the original RM/DM scheduling, and thus does not affect the feasibility of the schedule.

2.1 Task-level Scheduling Decisions
Task-level voltage scheduling has captured the attention of the research community rather recently [8]. Fine grain scheduling, where several re-scheduling points are used inside a task, was presented in [11,16]. In [16] statistical data is used to improve the task-level schedule, by slowing down different regions of a task according to their average execution time. Our approach produces voltage schedules only when a task starts executing, while using stochastic data more aggressively both at task level and at task-set level. At task level we generate voltage schedules that are correlated with the probability distribution of the task execution length. For task-set level scheduling decisions see sub-section 2.3.

In our model a task $\tau_i$ can be executed in phases, at different available voltages, depending on its allowed execution time $A_i$. In the ideal case, the most energy is saved when the processor uses the voltage for which the task exactly covers its allowed execution time. This corresponds to an ideal voltage which may not coincide with any of the available voltages. A close to optimal solution is to execute the task in two phases at two of the available voltages. These two voltages are the ones bounding the ideal voltage [6,8]. An important observation is that tasks may finish, and in many cases do finish, before their worst case execution time (WCET). Therefore it makes sense to execute first at a low voltage and then accelerate, instead of executing at a high voltage first and then decelerating. In this manner, if a task instance is not the worst case, one skips executing the high voltage (and power hungry) regions.

In the following we will distinguish between three modes of execution for a task, as depicted in Figure 1. The ideal case (mode 1) is when the actual execution pattern (the number of clock cycles) becomes known when the task arrives. We can then stretch the actual execution time of the task to exactly fill the allowed time. This mode requires rather accurate execution pattern estimates, depending on the input data, and therefore is rarely achievable in practice. The second mode (mode 2) is WCET stretching: the voltage schedule for the task is determined as if the task will exhibit its worst case behavior. These two modes use at most two voltage regions, and therefore at most one DC-DC switch.
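The two-phase idea behind modes 1 and 2 can be illustrated with a minimal sketch. It assumes a hypothetical list of available clock cycle lengths and ignores the DC-DC switching delay; the function name `two_level_split` and its parameters are illustrative and are not taken from the paper or from [6,8]. The slow phase is listed first, matching the observation that it pays to start at a low voltage and accelerate.

```python
import math

def two_level_split(cycles, allowed, clock_lengths):
    """Split `cycles` clock cycles between the two available clock lengths
    that bound the ideal length allowed / cycles.

    Returns [(clock_length, n_cycles), ...] with the slow phase first.
    Illustrative helper: names and interface are assumptions.
    """
    lengths = sorted(clock_lengths)            # shortest (fastest) clock first
    ideal = allowed / cycles                   # ideal clock cycle length
    if ideal < lengths[0]:
        raise ValueError("allowed time too short even at the fastest clock")
    if ideal >= lengths[-1]:
        return [(lengths[-1], cycles)]         # the slowest clock already fits
    # Largest available length not exceeding the ideal one, and its successor.
    idx = max(i for i, k in enumerate(lengths) if k <= ideal)
    fast, slow = lengths[idx], lengths[idx + 1]
    # Largest integer n_slow with n_slow*slow + (cycles - n_slow)*fast <= allowed.
    n_slow = math.floor((allowed - cycles * fast) / (slow - fast))
    phases = [(slow, n_slow), (fast, cycles - n_slow)]
    return [(k, n) for k, n in phases if n > 0]
```

For mode 2, `cycles` would be the worst case number of clock cycles; for mode 1 it would be the actual number, if that were known at arrival.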
The third mode (mode 3), described in more detail next, uses stochastic data to build a multiple voltage schedule. The purpose of using stochastic data is to minimize the average case energy consumption. Note that the voltage schedules in all three modes are decided at the arrival of a task instance. Unlike in [11,21], no rescheduling is done while the task is executing. The only overhead during task execution is the one given by the changes in the supply voltage. For instance, the lpARM processor [2] needs at most 70µs to switch from 1.2 to 3.8V. For closer voltage levels, the switch occurs faster. Depending on the actual task execution time, this delay may have some impact on the schedule. The same goes for the energy lost during the DC-DC switch. Although our discussion does not cover these, the methods presented here can be adapted to accommodate both the DC-DC delay and energy loss whenever the actual processor requires it.

Figure 1. Voltage scheduling modes for tasks: 1) ideal schedule, 2) WCET oriented schedule, 3) stochastic schedule. (Axes: used energy vs. time; time marks: actual ET, WCET, allowed time.)

The stochastic voltage schedule (mode 3 in Figure 1) for a task is obtained using the probability distribution of the execution pattern of the task (the number of clock cycles used). This probability distribution can be obtained off-line, via simulation, or built and improved at runtime. Let us denote by $X$ the random variable associated with the number of clock cycles used by a task instance. We will use the cumulative distribution function associated with $X$, $cdf_x = P(X \le x)$. This function reflects the probability that a task instance finishes before a certain number of clock cycles. If $WX$ is the worst case number of clock cycles, $cdf_{WX} = 1$. Deciding a voltage schedule for a task means that for every clock cycle up to $WX$ we decide a specific voltage level (and processor speed). Each cycle $y$, depending on the voltage adopted, will consume a specific energy, $e_y$.
But each of these cycles is executed with a certain probability, so on average the energy consumed by cycle $y$ can be computed as $(1 - cdf_y)\, e_y$. To obtain the average energy for the whole task, we have to consider all the cycles up to $WX$:

$$E = \sum_{0 < y \le WX} (1 - cdf_y)\, e_y \qquad (1)$$

This is the value we want to minimize by choosing appropriate voltage levels for each cycle. Since $WX$ may be a large number in practice, in our implementation we group several consecutive clock cycles into equal size groups. For the sake of brevity and clarity we describe here only the simpler case, when the voltage levels are decided clock cycle by clock cycle. A task has to complete its execution during an allowed execution time, $A$. If we denote the clock length associated with clock cycle $y$ by $k_y$, this constraint can be written as:

$$\sum_{0 < y \le WX} k_y \le A \qquad (2)$$

The clock cycle length $k$ depends on the supply voltage $V$ and the threshold voltage $V_T$ according to $k \sim V / (V - V_T)^{\beta}$, where $\beta$ is the velocity saturation index. If $V_T$ is small enough or we use a variable threshold technology [22], this dependency simplifies to $k \sim V^{(1 - \beta)}$. The clock cycle energy $e$ depends directly on the square of the supply voltage, $e \sim V^2$ [6]. Eliminating $V$ from the last two expressions we obtain the dependency between the clock cycle energy and length:

$$e \sim \left( \frac{1}{k} \right)^{\frac{2}{\beta - 1}} \qquad (3)$$

For clarity we now fix $\beta = 2$, but the rest of the calculus can be carried out for any other reasonable value of $\beta$. If we substitute (3) in (1), we obtain:

$$E \sim \sum_{0 < y \le WX} \frac{1 - cdf_y}{k_y^2} \qquad (4)$$

which is the value to be minimized. By mathematical induction one can prove that the right-hand side of (4) has a lower bound (using also (2)):

$$\sum_{0 < y \le WX} \frac{1 - cdf_y}{k_y^2} \ge LB = \frac{1}{A^2} \left( \sum_{0 < y \le WX} \sqrt[3]{1 - cdf_y} \right)^3 \qquad (5)$$

This lower bound is attained if and only if:

$$k_y = A\, \frac{\sqrt[3]{1 - cdf_y}}{\sum_{0 < z \le WX} \sqrt[3]{1 - cdf_z}} \qquad (6)$$
These are the optimal values for the clock cycle length in each clock cycle up to $WX$. In practice these values may not coincide with the available clock lengths, so they have to be converted to real clock cycles. This conversion is done in a similar way to deriving a dual level voltage schedule from an ideal one [6,8]. We find the two bounding available clock lengths, $CK_i < k_y \le CK_{i+1}$, and distribute the work of the ideal cycle between the two such that $k_y = w_y\, CK_i + (1 - w_y)\, CK_{i+1}$, where $w_y$ is the work given to $CK_i$ and the rest is the work given to $CK_{i+1}$. Thus, each cycle in the task will distribute its work between two of the several available clock lengths. Finally, the accumulated work loads for each available clock length are rounded to integers, since one can only execute full clock cycles.

Note that the coefficient of $A$ in (6) can be computed off-line or, if the probability distribution is built at runtime, on-line from time to time. Therefore, the on-line computational complexity for obtaining the stochastic voltage schedule is given by the steps subsequent to (6). One has to compute the ideal clock cycle length for each of the $WX$ clock cycles. Finding the bounding clock lengths takes time logarithmic in the number of voltage levels, $N_v$. This gives a complexity of $O(WX \cdot \log N_v)$.

Two examples of stochastic voltage schedules are given in Figure 2. We assumed a normal probability distribution with a mean of 70 cycles and a standard deviation of 10. $WX$ is 100. Assuming we only have four available clock frequencies, f, f/2, f/3, and f/4, we give two voltage schedules obtained for two different values of the allowed execution time. The schedules are given as the number of clock cycles executed at each available frequency. The allowed execution time is reported as a percentage of the time needed for executing the worst case behavior ($WX$) at the highest clock frequency (f). Some experimental results on how stochastic voltage schedules contribute to saving energy are presented in section 3.
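The computation in (6) and the per-cycle split between two bounding clock lengths can be sketched as below. This is a minimal illustration for $\beta = 2$, not the implementation used in the experiments: the function names are hypothetical, `exec_prob[y]` stands for the probability that cycle y is executed (written $1 - cdf_y$ in the text) and is assumed to be positive for every cycle, and the final rounding of accumulated work to whole cycles is left out.

```python
def ideal_cycle_lengths(exec_prob, allowed):
    """Ideal clock cycle length of every cycle, following (6) with beta = 2.

    exec_prob[y]: probability that cycle y is executed (assumed > 0).
    allowed:      allowed execution time A.
    """
    roots = [p ** (1.0 / 3.0) for p in exec_prob]
    total = sum(roots)
    return [allowed * r / total for r in roots]

def average_energy(exec_prob, lengths):
    """Average energy up to a constant factor, as in (4): sum of p_y / k_y^2."""
    return sum(p / (k * k) for p, k in zip(exec_prob, lengths))

def split_cycle(k_ideal, clock_lengths):
    """Return (CK_lo, w, CK_hi) with k_ideal = w*CK_lo + (1 - w)*CK_hi.

    Ideal lengths outside the available range are clamped to the nearest
    available clock (an assumption of this sketch; clamping to the fastest
    clock means the time budget must be re-checked by the caller).
    """
    lengths = sorted(clock_lengths)
    if k_ideal <= lengths[0]:
        return lengths[0], 1.0, lengths[0]
    if k_ideal >= lengths[-1]:
        return lengths[-1], 1.0, lengths[-1]
    hi = next(k for k in lengths if k >= k_ideal)
    lo = lengths[lengths.index(hi) - 1]
    w = (hi - k_ideal) / (hi - lo)
    return lo, w, hi

# Toy example: a task of at most 4 cycles, executed with decreasing probability.
probs = [1.0, 0.75, 0.5, 0.25]
stochastic = ideal_cycle_lengths(probs, allowed=8.0)
uniform = [2.0] * 4                  # WCET-stretch: every cycle equally long
print(average_energy(probs, stochastic), average_energy(probs, uniform))
```

In the toy example the ideal lengths shrink from the first cycle to the last, so the task starts slowly and accelerates, and its average energy is below that of the uniform schedule that uses the same allowed time.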
2.2 Off-line Task Stretching
The scheduling condition proposed by Liu and Layland [14] is a sufficient one and covers the worst possible case for the task group characteristics. Yet, an exact analysis as proposed in [13] may reveal possibilities for stretching tasks while still keeping the deadlines. Based on this, [20] describes a method to compute the maximum required frequency for a task set (or the minimum stretching factor). In a similar way, we go further and compute minimal stretching factors $\{\alpha_i\}_{1 \le i \le n}$ for each task $\tau_i$ in the task group $\{\tau_i\}_{1 \le i \le n}$. A task is defined by the triple $\tau_i = (C_i, T_i, D_i)$ composed of the WCET, period and deadline of task $\tau_i$. Note that throughout the paper $C_i$ refers to the worst case execution pattern $WX$ running at the fastest clock frequency.

Figure 2. Two stochastic voltage schedules for a task with normally distributed execution time and worst case behavior of 100 cycles. (Plot: the 1-cdf function for a normal distribution with mean 70 and standard deviation 10. Schedule when Allowed is 300% of WX at clock f: 47@f/4, 25@f/3, 8@f/2, 20@f. Schedule when Allowed is 200% of WX at clock f: 27@f/3, 47@f/2, 26@f.)

We consider that the tasks in the group are indexed according to their priority, computed as in RMS. We compute the stretching factors in an iterative manner, from the higher to the lower priority tasks. An index $q$ points to the latest task which has been assigned a stretching factor. Initially, $q = 0$. Each of the tasks $\tau_i$, $q < i \le n$, has to be executed before one of its scheduling points $S_i$ as defined in [13]: $S_i = \{ k T_j \mid 1 \le j \le i;\ 1 \le k \le \lfloor T_i / T_j \rfloor \}$, if $T_i = D_i$. If $T_i \ne D_i$, we only need to change the set of scheduling points according to $S_i' = \{ t \mid (t \in S_i) \wedge (t < D_i) \} \cup \{ D_i \}$. For each of these scheduling points $S_{ij} \in S_i$, task $\tau_i$ exactly meets its deadline if:

$$\sum_{1 \le r \le q} \alpha_r C_r \left\lceil \frac{S_{ij}}{T_r} \right\rceil + \alpha_{ij} \sum_{q < p \le i} C_p \left\lceil \frac{S_{ij}}{T_p} \right\rceil = S_{ij} \qquad (7)$$

Note that for the tasks which have already been assigned a stretching factor we used that one, $\alpha_r$, while for the rest of the tasks we assumed they will all use the same, yet to be computed, stretching factor $\alpha_{ij}$, which depends on the scheduling point.
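Solving (7) for the unknown factor at every scheduling point, and repeatedly fixing the factors of all tasks up to the one whose best factor is the smallest, gives the iterative procedure of this sub-section. The sketch below is one possible reading of it, under the assumptions that tasks are given in priority order with integer periods and that indices are 0-based; the function names are hypothetical. Exact rational arithmetic is used so that the result can be compared with the values of Table 1.

```python
from fractions import Fraction

def ceil_div(a, b):
    """Ceiling of a / b for positive integers."""
    return -(-a // b)

def scheduling_points(i, periods, deadlines):
    """Scheduling points of task i (0-based, tasks sorted by priority)."""
    pts = {k * periods[j]
           for j in range(i + 1)
           for k in range(1, periods[i] // periods[j] + 1)}
    if deadlines[i] != periods[i]:
        pts = {t for t in pts if t < deadlines[i]} | {deadlines[i]}
    return sorted(pts)

def offline_stretching(wcet, periods, deadlines=None):
    """Assign one off-line stretching factor per task by solving (7)."""
    deadlines = list(periods) if deadlines is None else deadlines
    n = len(wcet)
    alpha = [None] * n
    q = 0                                    # tasks 0..q-1 already have a factor
    while q < n:
        best = {}                            # best factor of each remaining task
        for i in range(q, n):
            candidates = []
            for t in scheduling_points(i, periods, deadlines):
                fixed = sum(alpha[r] * wcet[r] * ceil_div(t, periods[r])
                            for r in range(q))
                free = sum(wcet[p] * ceil_div(t, periods[p])
                           for p in range(q, i + 1))
                candidates.append((Fraction(t) - fixed) / free)
            best[i] = max(candidates)
        m = min(best, key=best.get)          # task with the smallest best factor
        for r in range(q, m + 1):
            alpha[r] = best[m]
        q = m + 1
    return alpha

# The task set of Table 1: WCETs and periods, deadlines equal to periods.
factors = offline_stretching([1, 5, 1, 1, 1], [5, 11, 45, 130, 370])
print([str(a) for a in factors])   # ['10/7', '10/7', '25/14', '25/14', '33/14']
```

The exact factors 10/7, 25/14 and 33/14 correspond to the truncated values 1.428, 1.785 and 2.357 listed in Table 1. A factor below 1 would indicate a task set that is not schedulable even at full speed; the sketch does not check for that case.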
For the task $\tau_i$ the best scheduling choice, from the energy point of view, is the largest of its $\alpha_{ij}$. At the same time, from (7), this has to be equal for all tasks $\tau_i$, $q < i \le n$. There is a task with index $m$ for which the best stretching factor is the smallest among all tasks: $\max_j(\alpha_{mj}) = \min_i(\max_j(\alpha_{ij}))$. Note that this is not necessarily the last task, $n$. If $q = 0$, this task sets the minimal clock frequency as computed in [20]. Having the index $m$, all tasks between $q$ and $m$ can be at most stretched (equally) by the stretching factor of $m$. Thus, we assign them stretching factors as $\alpha_r = \max_j(\alpha_{mj})$, $q < r \le m$. With this, an iteration of the algorithm for finding the stretching factors is complete. The next iteration then proceeds for $q = m$. Finally the process ends when $q$ reaches $n$, meaning all tasks have been given their own off-line stretching factors. An example is given in Table 1. Note that tasks 3 and 4 can be stretched off-line more than 1 and 2, while 5 has the largest stretching factor. The processor utilization changes from 0.687 to 0.994. We use the utilization after off-line stretching in computing the energy reduction upper bound in our experiments. For $T_i > D_i$, the difference between the stretching factors grows.

Table 1: Numerical Example for Off-line Stretching

Task $\tau_i$                           Off-line stretching factor $\alpha_i$
No.   WCET ($C_i$)   Period ($T_i$)     value    iterations needed
1     1              5                  1.428    1
2     5              11                 1.428    1
3     1              45                 1.785    2
4     1              130                1.785    2
5     1              370                2.357    3

2.3 On-line Slack Distribution
At runtime it is important to use the variations in execution length of the various task instances to be able to stretch other tasks and thus consume less energy. In [20] the only situation when a task is stretched is when it is the only one running and has enough time until the next task arrives. In all other situations tasks are executed at the speed dictated by the off-line analysis. In [11] tasks are
stretched to their WCET at runtime, independent of other tasks, using several checking/re-scheduling points during a task instance. The work in [10] uses only two voltage levels. The slack produced by finishing a task early is entirely used to run the processor at the low voltage. As soon as this slack is consumed, the task starts running at the high voltage. Our method is perhaps most similar to the optimal scheduling method OPASS presented in [7]. Yet, OPASS performs analysis over task hyperperiods, which may lead to working on a huge number of task instances for certain task sets. Our method keeps the same low computational complexity, regardless of the task set characteristics.

We describe next our strategy for slack distribution. In short, an early finishing task may pass on its unused processor time to any of the tasks executing next. But this time slack cannot be used by any task at any time, since deadlines have to be met. We solve this by considering several levels of slack, with different priorities, as in the slack stealing algorithm [12]. If the tasks in the task set $\{\tau_i = (C_i, T_i, D_i)\}_{1 \le i \le n}$ have $m$ different priorities, we use $m$ levels of slack $\{S_j\}_{1 \le j \le m}$. Without great loss of generality consider that the tasks have different priorities, $m = n$. The slack in each level is a cumulative value, the sum of the unused processor times remaining from the tasks with higher priority. The invariant describing the state of the slacks in every level, at any time, is given by (10). Initially, all level slacks $S_j$ are set to zero. To maintain the relation between slack levels, the levels are managed at runtime as follows.

Whenever an instance of a task $\tau_i$ with priority $i$ starts executing, it can use an arbitrary part $\Delta C_i$ of the slack available at level $i$, $S_i$. So the allowed execution time for task $\tau_i$ will be $A_i = C_i + \Delta C_i$. The remaining slack from level $i$ will degrade into level $i+1$ slack. Each level slack will be updated according to:

$$S_j' = \begin{cases} 0, & j \le i \\ S_j - \Delta C_i, & j > i \end{cases} \qquad (8)$$

Whenever a task instance finishes its execution, it will generate some slack if it finishes before its allowed time. If $E_i$ is the actual execution time, the generated slack is $\Delta A_i = A_i - E_i$.
This slack can be used by the lower priority tasks. In this case the level slacks are updated according to:

$$S_j'' = \begin{cases} S_j, & j \le i \\ S_j + \Delta A_i, & j > i \end{cases} \qquad (9)$$

Idle processor times are subtracted from all slacks. This ensures that the critical instance from the classic RM analysis remains the same.

The computational complexity required by the on-line method depends linearly on the number of slack levels. Note that task instances can only use slack generated by higher priority tasks and produce low priority slack. We call this slack degradation. Whenever the lowest priority task starts executing, all level slacks are reset. Note also that not necessarily all the slack at one level is used by a single task. Various methods can be used, but we mention here only the two we used in our experiments:

Greedy: the task gets all the slack available for its level: $\Delta C_i = S_i$.

Mean proportional: we consider the mean execution time $\mu_j$ for each of the task instances waiting to execute (in the ready queue). The slack is proportionally distributed according to these: $\Delta C_i = S_i\, \mu_i / \sum_{j \in ReadyQ} \mu_j$.

The strategy of managing the slack we just described allows us to keep the critical instance response time for all tasks, as we prove next. The response time $R_i(t)$ for task $\tau_i$ is computed as $R_i(t) = A_i + I_i(t)$, where $A_i$ is its allowed execution time, as before, and $I_i(t)$ is the interference from the other tasks. From the managing strategy given before, the cumulated slack on each level, at a certain time $t$, is of the form:

$$S_i(t) = S_{i-1}(t) - N_{i-1}\, \Delta C_{i-1} + N_{i-1}\, \Delta A_{i-1}, \quad N_{i-1} = \left\lceil \frac{t}{T_{i-1}} \right\rceil \qquad (10)$$

The slack of level $i$ is composed of all the slack from level $i-1$, less the slack used by the instances of the task with priority $i-1$, plus all the slack generated by these. The number of instances executed, $N_{i-1}$, is determined by the task period. Note that $S_1$ is always zero. Eliminating the iteration in the previous formula:

$$S_i(t) = \sum_{j=1}^{i-1} N_j \left( \Delta A_j - \Delta C_j \right), \quad N_j = \left\lceil \frac{t}{T_j} \right\rceil \qquad (11)$$

The task with the highest priority will never receive slack and therefore $\Delta C_1 = 0$.
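The bookkeeping rules (8) and (9), together with the two distribution policies, can be sketched as follows. This is a minimal illustration rather than the RTOS implementation: the class and function names are hypothetical, level 0 is the highest priority, preemption is not modeled, slack is clamped at zero when idle time is subtracted, and the running task is assumed to be part of the ready queue when the mean proportional share is computed.

```python
class SlackLevels:
    """Per-priority slack bookkeeping; level 0 is the highest priority."""

    def __init__(self, n_levels):
        self.slack = [0.0] * n_levels

    def start(self, i, wcet, share=1.0):
        """A task of priority i starts and takes `share` of its level slack.

        share = 1.0 is the greedy policy. Returns the allowed time A_i.
        """
        delta_c = share * self.slack[i]
        for j in range(len(self.slack)):
            # Rule (8): levels up to i are emptied, lower levels lose delta_c.
            self.slack[j] = 0.0 if j <= i else self.slack[j] - delta_c
        return wcet + delta_c

    def finish(self, i, allowed, actual):
        """A task of priority i finishes; unused time goes to lower levels."""
        delta_a = allowed - actual
        for j in range(i + 1, len(self.slack)):
            self.slack[j] += delta_a             # rule (9)

    def idle(self, duration):
        """Idle processor time is subtracted from all slacks (clamped at 0)."""
        self.slack = [max(0.0, s - duration) for s in self.slack]

def mean_proportional_share(i, mean_exec, ready):
    """Share of level-i slack for task i; `ready` is assumed to include i."""
    return mean_exec[i] / sum(mean_exec[j] for j in ready)

levels = SlackLevels(3)
a0 = levels.start(0, wcet=4)                 # highest priority: no slack to take
levels.finish(0, allowed=a0, actual=1)       # leaves 3 time units for levels 1, 2
a1 = levels.start(1, wcet=5)                 # greedy: takes all 3 units
print(a0, a1, levels.slack)
```

Because `start` empties every level up to the starting task, all level slacks are reset automatically when the lowest priority task starts, in line with the reset rule stated in the text.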
The interference from the high priority tasks is the time used to execute all arrived instances of these high priority tasks:

$$I_i(t) = \sum_{j=1}^{i-1} N_j E_j, \quad N_j = \left\lceil \frac{t}{T_j} \right\rceil \qquad (12)$$

With the notations from the slack managing algorithm, $E_j = A_j - \Delta A_j = C_j + \Delta C_j - \Delta A_j$. Introducing this in (12):

$$I_i(t) = \sum_{j=1}^{i-1} N_j \left( C_j + \Delta C_j - \Delta A_j \right), \quad N_j = \left\lceil \frac{t}{T_j} \right\rceil \qquad (13)$$

The last two terms in the sum actually give the slack of level $i$, as in (11), so we can re-write (13) as:

$$I_i(t) = \sum_{j=1}^{i-1} N_j C_j - S_i(t), \quad N_j = \left\lceil \frac{t}{T_j} \right\rceil \qquad (14)$$

Note that the maximal response time for a task is obtained when it uses all the slack available at its level: $R_i(t) = C_i + I_i(t) + S_i(t)$. From the last two equations:

$$R_i(t) = C_i + \sum_{j=1}^{i-1} \left\lceil \frac{t}{T_j} \right\rceil C_j \qquad (15)$$

which is exactly the response time when all tasks execute at WCET. Thus, if the RM analysis decides that a task set is schedulable, it remains schedulable when using our on-line policy.

In our implementation we additionally used a method similar to the on-line method presented in [20]. Namely, whenever there are no tasks in the ready queue, the currently executing task can stretch until the closest arrival time of a task instance. We will refer to this in our experiments as the 1stretch method.

3. EXPERIMENTAL RESULTS
The first experiment examines the energy gains of using a stochastic voltage schedule at task level. For this we considered a single task with execution time varying between a best case (BCET) and a worst case (WCET) according to a normal distribution. The distributions have the mean (BCET+WCET)/2 and standard deviation (WCET-BCET)/6. For several cases ranging from highly flexible execution time (BCET/WCET is 0.1) to almost fixed (BCET/WCET is 0.9) we built stochastic schedules for a range of allowed execution times (from
WCET to 3x WCET). We assumed that our processor has 9 different voltage levels, equally distributed between f and f/3. For a large number of task instances generated according to the given distribution we computed both the energy of the stochastic schedule (mode 3 in Figure 1) and the energy of the WCET-stretch schedule (mode 2 in Figure 1). We depict in Figure 3 the average energy consumption of the stochastic schedule as a fraction of that of the WCET-stretch schedule. Note that when the allowed time approaches either WCET or 3 times WCET, the energy consumptions become equal. The lowest possible clock frequency is f/3, which anyway means 3 times WCET, so there is no better schedule for these cases. On the other hand, when the allowed time approaches WCET, there is no other way but to use the fastest clock. Somewhere between the slowest and the fastest frequencies (Allowed/WCET = 2) is the largest energy gain, since the stochastic schedule can use the whole spectrum of available frequencies. Note that the energy gains become larger when the task execution time varies much (BCET/WCET approaches 0.1). It is important to notice that WCET-stretch already saves a lot of energy compared to the non-scaling case. For example, when the allowed time is twice the WCET, the WCET-stretch energy is around 25% of the no-scaling energy. But a stochastic approach contributes even more to these gains, as the figure shows.

Next we took two real-life hard-RT applications [9, 15] and applied several energy reduction strategies. The results are depicted in Figure 4. We assumed tasks with normal distributions, with the same characteristics as in the previous experiment. The 100% energy is the energy obtained by running all tasks as fast as possible and executing NOPs when no tasks are supposed to run. We assumed that the NOP instruction consumes only 20% of the average power, as in [20]. The virtual processor used for these experiments has 14 voltage levels, with clock frequencies varying between f = 100 MHz and 11 MHz. A power-down mode is also available, in which the processor consumes 5% of the highest frequency average energy.
The upper bound of the energy reduction possibilities is depicted by a separate set of curves. These were obtained in a post-execution analysis, by considering that the tasks are uniformly stretched up to the maximum processor utilization as computed in sub-section 2.2. This limit is hardly achievable in practice, since the actual execution patterns for all task instances are never available beforehand. Moreover, this optimum obtained by uniformly stretching all instances may violate some deadlines, being therefore useless in practice. A more realistic bound is given by the Ideal-stretch curves. The curves named Offline+1stretch were obtained by using only the off-line stretching method and the 1stretch method mentioned in sub-section 2.3. The curves for our complete policy were obtained by using the off-line strategy, the on-line strategy with mean proportional slack distribution (sub-section 2.3), plus the stochastic execution task model (mode 3 in Figure 1). The curves labeled Ideal-stretch were obtained by using the same method, except using an ideal-stretch task execution model (mode 1 in Figure 1). Note that this method implies knowing the actual execution time at a task arrival, which is unlikely in reality. For the last three methods, Offline+1stretch, our complete policy, and Ideal-stretch, whenever the processor is idle, it goes to a power-down mode.

Figure 3. The average energy consumption of a stochastic voltage schedule vs. the energy consumption of a WCET-stretch schedule. (Plot: stochastic schedule energy compared to WCET-stretch, as a function of Allowed/WCET and of BCET/WCET; contour levels 95.5%, 90.8%, 86.1%, 81.4%, 76.6%.)

We also tested our scheduling policy on randomly generated task sets of 50 and 100 tasks. The task sets were generated as follows. For each set, the task periods (and deadlines) were selected using a uniform distribution in 100..5000 and 100..10000 respectively. The worst case execution times were then randomly generated such that the task set would yield approximately 0.67 processor utilization for the fastest clock.
The average utilization after off-line stretching turned out to be 0.92 for the sets of 50 tasks, and 0.85 for the sets of 100 tasks. Using the same processor type as in the previous experiment, we simulated the runtime behavior of several scheduling methods. We also used post-simulation data to obtain the upper bounds, as in the previous experiment. The values depicted in Figure 5 are averages over one hundred sets of tasks. As these experiments show, our policy performs best when little information on task execution is available.

Figure 4. The energy reduction for a) an avionics application [15] (17 tasks) and b) a CNC controller [9] (8 tasks). In b) the area between 70-100% is enlarged.

4. CONCLUSIONS
We presented and analyzed a scheduling policy for hard real-time tasks running on a dynamic voltage supply processor, with the final purpose of reducing the energy consumption. The policy is designed for sets of tasks with fixed priorities assigned in a rate/deadline monotonic manner. It consists of both off-line and on-line scheduling decisions, taken both at task and task-set levels. The off-line decisions use exact timing analysis to derive off-line voltage scaling factors for each task. The on-line policy distributes the available processor time on a priority basis, using slack levels and statistics. Task-level voltage schedules are built using stochastic data, with the goal of minimizing the average case energy consumption. The paper also contains a proof that our scheduling policy meets all deadlines. Our method can be fully implemented in the RTOS, without resorting to special compilers or changing the software. Yet, combined with such methods, our approach may yield even greater energy reductions. The experimental results show that our policy can be successfully used to reduce the energy consumption in a hard real-time system.
Figure 5. The energy reduction using different strategies for sets of 50 tasks (above) and sets of 100 tasks (below). The values are averages over a hundred task sets.

5. ACKNOWLEDGMENTS
This work was funded by ARTES - A network for Real-Time research and graduate Education in Sweden (http://www.artes.uu.se/). The author would like to thank Petru Eles, Kris Kuchcinski, and Per Larsson-Edefors for their helpful comments.

6. REFERENCES
[1] Benini, L., and De Micheli, G. System-level power optimization: techniques and tools, in ACM Trans. on Design Automation of Electronic Systems, No. 2, Vol. 5, April 2000, 115-192.
[2] Burd, T., Pering, T., Stratakos, A., and Brodersen, W. A dynamic voltage scaled microprocessor system, in IEEE Journal of Solid-State Circuits, No. 11, Vol. 35, November 2000, 1571-1580.
[3] Chandrakasan, A., Gutnik, V., and Xanthopoulos, T. Data driven signal processing: an approach for energy efficient computing, in Proceedings of ISLPED 96, 347-352.
[4] Dave, B.P., Lakshminarayana, G., and Jha, N.K. COSYN: hardware-software co-synthesis of embedded systems, in Proceedings of the 34th DAC, 1997, 703-708.
[5] Gruian, F., and Kuchcinski, K. Low-energy directed architecture selection and task scheduling for system-level design, in Proceedings of the 25th Euromicro Conference, 1999, 296-302.
[6] Gruian, F., and Kuchcinski, K. LEneS: task scheduling for low-energy systems using variable voltage processors, in Proceedings of ASP-DAC 2001, 449-455.
[7] Hong, I., Potkonjak, M., and Srivastava, M.B. On-line scheduling of hard real-time tasks on variable voltage processor, in Digest of Technical Papers of ICCAD 98, 653-656.
[8] Ishihara, T., and Yasuura, H. Voltage scheduling problem for dynamically variable voltage processors, in Proceedings of ISLPED 98, 197-202.
[9] Kim, N., Ryu, M., Hong, S., Saksena, M., Choi, C.-H., and Shin, H. Visual assessment of a real-time system design: a case study on a CNC controller, The 17th IEEE Real-Time Systems Symposium, 1996, 300-310.
[10] Lee, Y.-H., and Krishna, C.M. Voltage-clock scaling for low energy consumption in real-time embedded systems, in Proceedings of the 6th International Conference on Real-Time Computing Systems and Applications, 1999, 272-279.
[11] Lee, S., and Sakurai, T. Run-time voltage hopping for low-power real-time systems, in Proceedings of the 37th DAC, 2000, 806-809.
[12] Lehoczky, J., and Ramos-Thuel, S. An optimal algorithm for scheduling soft-aperiodic tasks in fixed-priority preemptive systems, in Proceedings of RTSS 92, 110-123.
[13] Lehoczky, J., Sha, L., and Ding, Y. The rate monotonic scheduling algorithm: exact characterization and average case behavior, in Proceedings of RTSS 89, 166-171.
[14] Liu, C.L., and Layland, J.W. Scheduling algorithms for multiprogramming in a hard real time environment, in JACM 20 (1), 1973, 46-61.
[15] Locke, C.D., Vogel, D.R., and Mesler, T.J. Building a predictable avionics platform in Ada: a case study, in Proceedings of RTSS 91, 181-189.
[16] Mossé, D., Aydin, H., Childers, B., and Melhem, R. Compiler-assisted dynamic power-aware scheduling for real-time applications, Workshop on Compilers and Operating Systems for Low-Power, October 2000.
[17] Pedram, M. Power optimization and management in embedded systems, Proceedings of ASP-DAC 2001, 239-244.
[18] Pering, T., Burd, T., and Brodersen, R. The simulation and evaluation of dynamic voltage scaling algorithms, in Proceedings of ISLPED 98, 76-81.
[19] Pering, T., Burd, T., and Brodersen, R. Voltage scheduling in the lpARM microprocessor system, in Proceedings of ISLPED 00, 96-101.
[20] Shin, Y., and Choi, K. Power conscious fixed priority scheduling for hard real-time systems, in Proceedings of the 36th DAC, 1999, 134-139.
[21] Shin, D., Kim, J., and Lee, S. Intra-task voltage scheduling for low-energy hard real-time applications, Special Issue of IEEE Design and Test of Computers, October 2000.
[22] Suzuki, K., Mita, S., Fujita, T., Yamane, F., Sano, F., Chiba, A., Watanabe, Y., Matsuda, K., Maeda, T., and Kuroda, T. A 300MIPS/W RISC core processor with variable supply-voltage scheme in variable threshold-voltage CMOS, Proceedings of the ICC 97, 587-590.
[23] Weiser, M., Welch, B., Demers, A., and Shenker, S. Scheduling for reduced CPU energy, in Proceedings of the First Symposium on Operating Systems Design and Implementation, November 1994.
[24] Yao, F., Demers, A., and Shenker, S. A scheduling model for reduced CPU energy, in Proceedings of the 36th Symposium on Foundations of Computer Science, 1995, 374-382.
[25] http://www.transmeta.com