Exploiting Dynamic Workload Variation in Low Energy Preemptive Task Scheduling

Lap-Fai Leung, Chi-Ying Tsui
Department of Electrical and Electronic Engineering, Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong SAR, China
{eefai,eetsui}@ee.ust.hk

Xiaobo Sharon Hu
Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN 46556, USA
shu@cse.nd.edu

Abstract

A novel energy reduction strategy to maximally exploit the dynamic workload variation is proposed for the offline voltage scheduling of preemptive systems. The idea is to construct a fully-preemptive schedule that leads to minimum energy consumption when the tasks take on approximately the average execution cycles yet still guarantees no deadline violation during the worst-case scenario. The end-time for each sub-instance of the tasks obtained from the schedule is used for the on-line dynamic voltage scaling (DVS) of the tasks. For tasks that normally require a small number of cycles but occasionally a large number of cycles to complete, such a schedule provides more opportunities for slack utilization and hence results in larger energy saving. The concept is realized by formulating the problem as a Non-Linear Programming (NLP) optimization problem. Experimental results show that, by using the proposed scheme, the total energy consumption at runtime is reduced by as much as 60% for randomly generated task sets when compared with the static scheduling approach that uses only the worst-case workload.

1. Introduction

Energy consumption is one of the critical design issues in real-time embedded systems (RTES), which are prevalent in many applications such as automobiles and consumer electronics. RTES are generally composed of a number of tasks to be executed on one or more embedded processors. Dynamic voltage scaling (DVS), i.e., varying the supply voltage and the corresponding clock frequency of a processor at runtime according to the specific performance constraints and workload, is proven to be very effective for reducing energy consumption [,].

Many modern embedded processors support both variable supply voltage and the controlled shutdown mode [3,4]. How to maximally exploit the benefit provided by such hardware has been an active research topic during the last several years. In this paper, we focus on real-time embedded preemptive systems using variable voltage processors. Having an effective voltage schedule, i.e., the voltage to be used at any given time, is critical to harvest the DVS benefit.

This work was supported in part by the Hong Kong Research Grant Council under Grant CERG HKUST 649/03E and HKUST grant HIA0/03.EG03. This work was supported in part by U.S. National Science Foundation under grant number CCR0-0899 and CNS04-077.

There are two main approaches to find a voltage schedule. One category of approaches [ 5 ] determines the schedule during runtime only. These results can work with either real-time or non-real-time tasks. The basic principle is that only the runtime workload information, which is predicted during the online phase, is used to determine the voltage schedules. Although such approaches have been shown to result in energy saving, they do not exploit the fact that much information about tasks in an RTES, such as task periods, deadlines, worst-case execution cycles (WCEC) and average workload, is available offline. It is not difficult to see that not using such information may lose opportunities to further reduce the energy consumption.

To complement the above runtime approaches, the other category of voltage scheduling work finds the desired voltage schedules offline based on the available task information, e.g., [,,6,7,8,9,0]. These techniques are generally applicable to real-time tasks with hard deadlines. To ensure that the schedule obtained offline does not violate any timing constraint, the worst-case execution cycles (WCEC) of each task are always used in the offline analysis. Such offline voltage schedules can be altered to some extent at runtime by using the slacks that result from the tasks not executing at the WCEC to lower the voltage obtained in the offline phase [,7].

The effectiveness of the offline approach together with the runtime approach is very much dependent on how the slacks are distributed, which in turn depends on the end-time obtained in the static schedule. Therefore, it is important to schedule the tasks in such a way that the potential slack times can be maximally exploited. For many real-time systems, most of the time the workload of the tasks is much smaller than the worst case, and on average the execution cycles of the tasks are close to an average-case execution cycle value (ACEC) instead of the WCEC. In general, the schedules obtained from the WCEC values can greatly limit the flexibility and effectiveness of utilizing the slacks generated from the actual execution cycles during runtime.

In this work, a novel offline scheduling approach, which results in the best slack distribution in terms of energy saving for the ACEC scenarios yet guarantees no deadline violation when tasks assume WCEC, is introduced. We focus on preemptive systems, which are more complicated, and it is easy to transform the formulation for non-preemptive systems. To the best of our knowledge, this is the first work that incorporates the ACEC and the WCEC together during the offline variable voltage scheduling. Given that the workload distribution of many real-time systems can be estimated offline (e.g., using profiling []), our approach can achieve much higher energy saving. Experimental results show that significant energy reduction is achieved when the ACEC is considered during the offline scheduling phase.

530-59/05 $0.00 005 IEEE

2. Preliminaries and Motivation

2.1 System model

In this paper we assume a frame-based preemptive hard real-time system in which a frame of length $L$, which is the hyperperiod of the task set, is executed repeatedly. The rate monotonic (RM) scheduling policy is used to schedule the periodic tasks: the shorter the period of a task, the higher its priority. The priorities of two tasks are the same if they have the same period. A higher priority task will always preempt the current task. We assume no blocking section is available for a task and hence a higher priority task will preempt the lower priority tasks immediately once it is released. The tasks are assumed to be independent of each other. Our technique works for both dependent and independent tasks as well as for multiple processors. For simplicity, we only consider the single processor case in this paper.

Without loss of generality, a set of $N$ periodic tasks is denoted as $\{T_1, T_2, \ldots, T_N\}$, where $T_i$ has a higher priority than $T_j$ if $i < j$. Each task $T_i$ has its own period $P_i$, Worst-Case Execution Cycles (WCEC) $\hat{W}_i$ and Average-Case Execution Cycles (ACEC) $W_i$. The ACEC is defined as the expected value of the execution cycles based on the workload distribution, and it can be obtained by profiling techniques []. The relative deadline is assumed to be equal to the period $P_i$. Each task $T_i$ releases its $j$-th instance $T_{i,j}$ periodically. The first instance of all the tasks is assumed to be released at time $t = 0$. Also, each task instance $T_{i,j}$ has its own absolute release time $R_{i,j}$ and absolute deadline $D_{i,j}$. The period $P_i$ and the relative deadline of each instance of the task are assumed to be the same.

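
The system model above can be illustrated with a minimal Python sketch that enumerates the task instances inside one frame. It assumes integer periods, rate-monotonic indexing (shortest period first), a first release at t = 0 and a relative deadline equal to the period; the function names `hyperperiod` and `task_instances` are hypothetical helpers introduced only for this illustration.

```python
from functools import reduce
from math import gcd


def hyperperiod(periods):
    """Least common multiple of all task periods (the frame length L)."""
    return reduce(lambda a, b: a * b // gcd(a, b), periods)


def task_instances(periods):
    """List (i, j, release, deadline) for every instance in one hyperperiod.

    Tasks are indexed 1..N in rate-monotonic order (shortest period first),
    every first instance is released at t = 0, and the relative deadline
    equals the period.
    """
    frame = hyperperiod(periods)
    instances = []
    for i, period in enumerate(sorted(periods), start=1):
        for j in range(1, frame // period + 1):
            instances.append((i, j, (j - 1) * period, j * period))
    return instances


if __name__ == "__main__":
    # Illustrative periods only; any positive integers work.
    for inst in task_instances([3, 4, 6]):
        print(inst)
```

Each tuple carries the absolute release time and absolute deadline of one instance, which is the information the later formulation needs per task instance.
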
For a lower priority task instance $T_{i,j}$, it may be preempted by others during execution and hence it will be divided into several sub-instances. Each sub-instance is denoted as $T_{i,j,k}$, where $k \in \{1, \ldots, K\}$ if $T_{i,j}$ is preempted into $K$ sub-parts. When there is no preemption for the task $T_{i,j}$, the task instance itself is denoted as $T_{i,j,1}$ in order to have a consistent notation. Also, we denote the number of task instances of $T_i$ as $N_i$ and the upper bound of the number of sub-instances of $T_{i,j}$ as $NS_{i,j}$.

2.2 Motivational example

In this sub-section, we use a non-preemptive system as a motivational example to illustrate the idea of exploiting the workload variation for voltage scheduling. The main idea for preemptive and non-preemptive systems is the same, except that the formulation of the problem is different. The problem formulation for the preemptive system will be discussed in Section 3. Let $C_i$ be the effective switching capacitance and $v_i$ be the supply voltage of task $T_i$. The cycle time, $CT_i$, and the execution time $d_i$ of task $T_i$ can be computed as

$$CT_i = \frac{\lambda\, v_i}{(v_i - V_{th})^{\alpha}} \qquad (1)$$

$$d_i = W_i \cdot CT_i = \frac{W_i\, \lambda\, v_i}{(v_i - V_{th})^{\alpha}} \qquad (2)$$

where $V_{th}$ is the threshold voltage, $\lambda$ is a device related parameter and $\alpha$ is a process constant which is between 1 and 2. The total energy consumption $e_i$ of executing task $T_i$ is given by

$$e_i = C_i\, W_i\, v_i^2 \qquad (3)$$

Figure 1. A motivation example. (a) Static schedule; (b) runtime schedule with slacks S1, S2 and S3. Horizontal axis: Time (ms).

Table 1. Task parameters for the system in Fig. 1

Task   WCEC   ACEC   Actual execution cycles   D (ms)
T1     20     10     10                        10
T2     20     10     10                        15
T3     20     10     10                        20

We use a simple example to illustrate the effect of a static schedule on energy saving when dynamic slack redistribution is employed. Suppose an RTES contains three tasks with the parameters of each task specified in Table 1 (assuming the release time of each task is 0). Figure 1(a) shows the optimal static schedule if the WCEC are taken by all tasks. For simplicity, we assume the clock cycle time is inversely proportional to the supply voltage and the minimum and maximum supply voltages are 0.7V and 5V, respectively.

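
As a small illustration of equations (1) to (3), the following Python sketch evaluates the cycle time, execution time and energy of a task. The default parameter values (`lam=1.0`, `v_th=0.0`, `alpha=2.0`) are assumptions chosen so that the cycle time reduces to 1/v, matching the simplified "inversely proportional" model used in the example; they are not device data from the paper, and the function names are hypothetical.

```python
def cycle_time(v, lam=1.0, v_th=0.0, alpha=2.0):
    """Equation (1): CT = lam * v / (v - v_th) ** alpha."""
    if v <= v_th:
        raise ValueError("supply voltage must exceed the threshold voltage")
    return lam * v / (v - v_th) ** alpha


def execution_time(cycles, v, **model):
    """Equation (2): d = W * CT, for W execution cycles at voltage v."""
    return cycles * cycle_time(v, **model)


def energy(cap, cycles, v):
    """Equation (3): e = C * W * v^2."""
    return cap * cycles * v ** 2


if __name__ == "__main__":
    # 20 cycles at 3 V with the simplified model (units as in the example).
    print(execution_time(20, 3.0), energy(1.0, 20, 3.0))
```

With the simplified defaults, doubling the voltage halves the execution time and quadruples the energy per cycle, which is the trade-off the rest of the example exploits.
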
Figure 1(b) gives the actual dynamic run-time schedule when greedy dynamic slack redistribution is carried out. The supply voltage value at runtime depends on both the WCEC and the end-time obtained in the static schedule and can be computed by equation (2). During runtime, tasks finish earlier since their actual execution cycles are smaller than the WCEC. Greedy slack distribution gives all the slack obtained from the just-finished task to the next task. For example, slack $S_1$ obtained from task $T_1$ is 3.3ms as shown in Figure 1(b) and is utilized fully by the next task $T_2$. The supply voltage of $T_2$ is re-calculated based on the WCEC of $T_2$, that is, $v_2 = 20/(13.3 - 3.3) = 2$V. Similarly, slack $S_2$ generated by task $T_2$ is 5ms and $T_3$ can adopt an even lower voltage. By using equation (3), the overall energy consumption for executing the tasks based on the schedule given in Figure 1(a) is 158.9µJ. It is clear that the dynamic slack redistribution indeed leads to more energy saving. However, if we know that the tasks most probably take the ACEC values during actual execution, can we do better?

Let us examine the static schedule in Figure 1 a little more closely. In this schedule, each task is associated with a predetermined end time, $te$; e.g., the end time of $T_1$ is 6.7ms, that of $T_2$ is 13.3ms, etc. These end times are then used in the dynamic slack distribution process to compute a new voltage schedule. The static schedule essentially determines the end time for each task. (Note that this predetermined end time can be different from the actual end time when a task does not assume the WCEC. Since this predetermined end time is used frequently in our discussion, we simply call it the end time.) Such end times are obtained so that the tasks will complete by their deadlines and the overall energy is minimum if tasks take on the WCEC.

Now, consider a different schedule where the end times of $T_1$, $T_2$ and $T_3$ are 10, 15 and 20 ms, respectively. Using this schedule and the same greedy slack distribution as above, we obtain the runtime schedule shown in Figure 2(a). The overall energy consumption of the schedule is 120µJ, a 24% improvement compared with that of the schedule in Figure 1(b). Though the schedule used by Figure 2 leads to a bigger energy saving, it is important that the schedule can still meet the deadline requirement when tasks assume the WCEC. It is true that the schedule dictates that the end time of each task is no later than its deadline. However, if the schedule is not carefully chosen, the tasks may not be able to finish by their deadlines during runtime. Figure 2(b) shows what happens under the schedule used in Figure 2(a) if the tasks do take the WCEC during runtime. At time zero, 2V is adopted for $T_1$. Since $T_1$ takes the WCEC, it will not finish until 10ms. The voltages for $T_2$ and $T_3$ can be computed accordingly. Note that 4V is needed for both $T_2$ and $T_3$ in order to meet the timing constraints. If the maximum voltage level for the processor is 3.3V, the schedule would not be feasible. Therefore, simply using the task deadlines as the desired end times does not always give a feasible schedule. We would like to point out that the actual schedule in Figure 2(b), when tasks happen to take the WCEC, consumes 720µJ of energy, a 33% increase over the schedule in Figure 1(a). However, in general, the actual execution cycles of a task tend to be close to an average case value and only rarely equal the WCEC value.

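
The arithmetic of the motivational example can be replayed with a short Python sketch of greedy slack redistribution. It assumes the simplified model of the example (cycle time equal to 1/v, unit switching capacitance, non-preemptive tasks executed back to back from t = 0); the function name `greedy_dvs` and its parameters are hypothetical and are not taken from the paper.

```python
def greedy_dvs(end_times, wcec, actual, v_min=0.7, v_max=5.0, cap=1.0):
    """Replay a static schedule with greedy slack redistribution.

    Each task starts when the previous one actually finishes and picks the
    lowest voltage that still fits its WCEC before its static end time.
    Returns (voltages, total_energy). Assumes cycle time = 1 / v.
    """
    now, total = 0.0, 0.0
    voltages = []
    for te, w_hat, w in zip(end_times, wcec, actual):
        if te <= now:
            raise ValueError("no time left before the static end time")
        v = max(w_hat / (te - now), v_min)   # voltage sized for the WCEC
        if v > v_max + 1e-9:
            raise ValueError("schedule infeasible: needs %.2f V" % v)
        voltages.append(v)
        now += w / v                          # actual finish time
        total += cap * w * v * v              # equation (3)
    return voltages, total


if __name__ == "__main__":
    wcec, acec = [20, 20, 20], [10, 10, 10]
    print(greedy_dvs([20 / 3, 40 / 3, 20], wcec, acec))  # WCEC-optimal end times
    print(greedy_dvs([10, 15, 20], wcec, acec))          # deadlines as end times
    print(greedy_dvs([10, 15, 20], wcec, wcec))          # same, tasks take WCEC
```

Under these assumptions the sketch gives roughly 159µJ for the first schedule (the quoted 158.9µJ presumably reflects rounding the third voltage to 1.7V), 120µJ for the second, and 720µJ with voltages of 2V, 4V and 4V when every task takes its WCEC. Passing `v_max=3.3` to the last call raises an error, mirroring the infeasibility discussed in the text.
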
Based on this observation, we would like to find a static schedule that results in better energy saving on average but still satisfies the timing requirements for the worst case. Even though the above example deals with the non-preemptive schedule only, the basic idea is the same for preemptive scheduling, and we will discuss how to formulate the problem of preemptive scheduling in the next sections. In the preemptive system, a task will be preempted into several sub-instances, and how to assign the optimal workload for each sub-instance to obtain the overall minimum average energy consumption is a challenging problem. With the optimal workload assignment, we can find the corresponding end-time in the static schedule. The static end-time as well as the WCEC for each sub-instance will thus be used for the calculation of the runtime supply voltage.

Figure 2. Another schedule for the system in Fig. 1. Panels (a) and (b); horizontal axis: Time (ms).

3. Our Approach

From the discussion above, we can see that the greedy slack distribution (or any other slack distribution) relies heavily on the tasks' end times obtained in the static schedule. Existing static voltage scheduling techniques employ the WCEC in order to guarantee that no deadline violation occurs during runtime. Because of the use of the WCEC, the end time of each task is usually more conservative. If we could extend the end time of each task to as long as that allowed by the worst-case execution scenario, there would be more potential for the dynamic slack distribution to achieve more energy saving for the average cases. So our problem is: given the effective switching capacitance, the workload distribution, WCEC, release time and deadline of each task, find a desired schedule, i.e., the desired end time of each task, which strives to maximize the potential energy saving when the tasks execute according to the workload distribution. In this section, we show that this scheduling problem can be formulated as a mathematical programming problem. We ignore the voltage transition overhead in our formulation.

In most RTES applications, the task execution time is much longer than the voltage transition time. As stated in [], the increase of energy consumption is negligible when the transition time is small compared with the task execution time. In the rest of this section, we adopt the following convention: $x$ and $\hat{x}$ indicate the average and the worst case values of $x$, respectively. For example, $w_{i,j,k}$ and $\hat{w}_{i,j,k}$ are the average execution cycles and the worst case execution cycles of task sub-instance $T_{i,j,k}$, respectively.

3.1 Fully Preemptive Schedule

In our formulation, we want to find the static end-time for each sub-instance by optimally assigning the workload so that the average energy consumption is minimum while all the worst-case requirements are satisfied. In variable voltage scheduling, a task's execution time varies inversely with the supply voltage by equation (2). With a longer execution time, the number of preemptions by the other higher priority tasks, and hence the number of task sub-instances, is higher because the overlapping region with the higher priority tasks is larger. In order to ensure feasibility of the final schedule and to allow maximum flexibility for the mathematical programming to find the optimal assignment, we need to consider all the possible preemptions and the maximum number of task sub-instances. Here we construct a fully preemptive schedule which reflects all possible preemptions based on the periods and priorities of the tasks. All the possible sub-instances of the task instances are found in this schedule.

Figures 3 and 4 show an example of how to obtain the fully preemptive schedule. Suppose we have three tasks with $P_1 = 3$, $P_2 = 4$ and $P_3 = 6$. The initial task instances for a hyper-period are shown in Figure 3. All possible preemptions of a task instance are obtained and the original schedule is expanded to a fully preemptive schedule as shown in Figure 4. However, the actual run-time schedule may not be the same as this schedule because a lower priority task may finish execution before the higher priority task is released. Here, we want to deal with a more general case in which the optimal workload required for each sub-instance is found during the mathematical programming formulation.

Figure 3. Task instances in the hyper-period of an example system. Rows: Task T1, Task T2, Task T3; horizontal axis: Time (0, 3, 6, 9).

Figure 4. A fully preemptive schedule for the system in Fig. 3. Rows: Task T1, Task T2, Task T3; horizontal axis: Time (0, 3, 6, 9).

From the fully preemptive schedule, we can obtain the order of execution of the tasks' sub-instances, which is based on the priority and the release time of each sub-instance. E.g., $T_{2,1,1}$ is preempted by $T_{1,2,1}$ and so $Order_{2,1,2} > Order_{1,2,1}$. Since $T_{2,1,1}$ and $T_{2,1,2}$ originate from the same task instance, we have $Order_{2,1,2} > Order_{2,1,1}$. The total order in Figure 4 is given by: (1,1,1) < (2,1,1) < (3,1,1) < (1,2,1) < (2,1,2) < (3,1,2) < (2,2,1) < (3,1,3) < (1,3,1) < (2,2,2) < (3,2,1) < (2,3,1) < (3,2,2) < (1,4,1) < (2,3,2) < (3,2,3).

3.2 Problem Formulation

Determining the schedule that optimizes the energy consumption for a preemptive task set based on the tasks' workload distribution, while satisfying the timing requirements in the worst case, can be formulated as a Non-Linear Programming (NLP) problem. The model consists of an objective function that minimizes the average energy consumption of the system when the tasks take on some workload distribution, subject to a set of resource and timing constraints. The interesting part of this formulation is the way we relate the average execution cycles, based on the probability density function of the workload, and the worst case execution cycles. If the probability density function is not known, we can use the ACEC as an approximation. In [7], it is shown that this is a good enough approximation of the average energy consumption. We assume the processor can use any voltage value within a specified range.

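
The construction of the fully preemptive schedule can be sketched in Python as follows. This is one plausible reading of the procedure, stated as an assumption: each task instance's window from release to deadline is cut at every release of a higher priority task that falls strictly inside it, and sub-instances are ordered by segment start time with ties broken by priority. The function name `fully_preemptive` is hypothetical.

```python
from functools import reduce
from math import gcd


def fully_preemptive(periods):
    """Return sub-instances (start, i, j, k, end) in execution order.

    Tasks are indexed in rate-monotonic order; instance j of task i occupies
    the window [(j-1)*P_i, j*P_i) and is split at each higher-priority
    release strictly inside that window.
    """
    periods = sorted(periods)
    frame = reduce(lambda a, b: a * b // gcd(a, b), periods)
    subs = []
    for i, period in enumerate(periods, start=1):
        higher = periods[: i - 1]
        for j in range(1, frame // period + 1):
            release, deadline = (j - 1) * period, j * period
            cuts = sorted({r for hp in higher
                           for r in range(0, frame, hp)
                           if release < r < deadline})
            bounds = [release] + cuts + [deadline]
            for k in range(1, len(bounds)):
                subs.append((bounds[k - 1], i, j, k, bounds[k]))
    subs.sort(key=lambda s: (s[0], s[1]))   # by start time, then priority
    return subs


if __name__ == "__main__":
    for start, i, j, k, end in fully_preemptive([3, 4, 6]):
        print((i, j, k), "window", start, "to", end)
```

For the periods 3, 4 and 6 used in the example, this sketch yields 16 sub-instances: one per instance of the highest priority task, two per instance of the second task and three per instance of the third.
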
In the following formulation, $T_{i,j,k}$ denotes the current task sub-instance and $T_{i',j',k'}$ is the previous task sub-instance based on the order of the fully preemptive schedule. Next, we define the variables that we are going to find:

$ts_{i,j,k}$: average start-time of $T_{i,j,k}$
$te_{i,j,k}$: end-time of $T_{i,j,k}$
$w_{i,j,k}$: average workload of $T_{i,j,k}$
$\hat{w}_{i,j,k}$: worst-case workload of $T_{i,j,k}$
$v_{i,j,k}$: supply voltage of $T_{i,j,k}$ based on average workload
$\hat{v}_{i,j,k}$: supply voltage of $T_{i,j,k}$ based on worst-case workload

Among these variables, only the end-time $te_{i,j,k}$ and the worst-case workload variables will be passed to the online DVS phase to calculate the runtime supply voltage. In order to satisfy the worst-case requirements during runtime, the values of $te_{i,j,k}$ and $\hat{w}_{i,j,k}$ will be determined suitably together with the other variables when solving the NLP. It is important to note that the average start-time depends only on the average workload of the previous task, not on the average workload of the task itself, since the start time depends on the slack available from the previous tasks. However, the end-times are the same for both the average-case and the worst-case workload conditions. The complete NLP formulation is described as follows.

The objective function of the NLP formulation is

$$\text{Min.} \quad \sum_{i=1}^{N} \sum_{j=1}^{N_i} \sum_{k=1}^{NS_{i,j}} C_i\, w_{i,j,k}\, v_{i,j,k}^2 \qquad (4)$$

The probability weighted workload can be used in the objective function if the probability density function is known. Here, we use the average workload in the formulation. To meet the release time and deadline requirements as well as the voltage range requirement, the following constraints are used:

$$R_{i,j} \le ts_{i,j,k} \qquad (5)$$

$$te_{i,j,k} \le D_{i,j} \qquad (6)$$

$$V_{min} \le v_{i,j,k},\ \hat{v}_{i,j,k} \le V_{max} \qquad (7)$$

$$te_{i,j,k} = ts_{i,j,k} + \hat{w}_{i,j,k}\, \frac{\lambda\, v_{i,j,k}}{(v_{i,j,k} - V_{th})^{\alpha}} \qquad (8)$$

Also, we need to make sure that there is enough allowable working time between the end time of $T_{i',j',k'}$ and that of $T_{i,j,k}$ for $T_{i,j,k}$ to finish if both tasks use the WCEC. We express this by the following constraint:

$$te_{i,j,k} - te_{i',j',k'} \ge \hat{w}_{i,j,k}\, \frac{\lambda\, \hat{v}_{i,j,k}}{(\hat{v}_{i,j,k} - V_{th})^{\alpha}} \qquad (9)$$

If we do not consider the dynamic slack distribution, we would need $te_{i',j',k'} \le ts_{i,j,k}$ in order to ensure that no task executions overlap.

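
Since only the end-time and the worst-case workload are handed to the online phase, the runtime voltage has to be recovered from a relation of the form of equation (8). The Python sketch below solves that relation for the voltage by bisection. It assumes the execution-time model of equation (2) with `alpha >= 1` and `v_min > v_th`, so that execution time decreases monotonically with voltage; the function names and default parameters are hypothetical and chosen only for illustration.

```python
def exec_time(cycles, v, lam, v_th, alpha):
    """Execution time of `cycles` at voltage v, following equation (2)."""
    return cycles * lam * v / (v - v_th) ** alpha


def runtime_voltage(w_hat, start, end, v_min, v_max,
                    lam=1.0, v_th=0.0, alpha=2.0, tol=1e-9):
    """Lowest voltage in [v_min, v_max] that fits w_hat cycles in [start, end]."""
    budget = end - start
    if budget <= 0 or exec_time(w_hat, v_max, lam, v_th, alpha) > budget:
        raise ValueError("worst-case workload does not fit before the end-time")
    if exec_time(w_hat, v_min, lam, v_th, alpha) <= budget:
        return v_min
    lo, hi = v_min, v_max          # invariant: too slow at lo, fast enough at hi
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if exec_time(w_hat, mid, lam, v_th, alpha) > budget:
            lo = mid
        else:
            hi = mid
    return hi


if __name__ == "__main__":
    # Simplified model (cycle time = 1/v): 20 cycles in 10 time units.
    print(runtime_voltage(20, 0.0, 10.0, 0.7, 5.0))
```

Returning the upper end of the final bracket keeps the result on the safe side: the worst-case workload always completes no later than the given end-time.
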
Allowing the slacks of finished tasks to be utilized by the subsequent tasks can be thought of as the average start time of $T_{i,j,k}$ becoming earlier than the scheduled end time of $T_{i',j',k'}$ if $T_{i',j',k'}$ uses the ACEC instead of the WCEC. Assuming that the greedy slack distribution is used, the difference between $te_{i',j',k'}$ and $ts_{i,j,k}$ is bounded by the difference between the worst case execution time and the average case execution time, i.e., the slack of $T_{i',j',k'}$. Therefore, we have the following constraint:

$$ts_{i,j,k} \ge te_{i',j',k'} - \frac{\lambda\, (\hat{w}_{i',j',k'} - w_{i',j',k'})\, \hat{v}_{i',j',k'}}{(\hat{v}_{i',j',k'} - V_{th})^{\alpha}} \qquad (10)$$

Now, we need to determine the workload assigned to each task sub-instance. The sum of the workloads of the sub-instances is equal to the workload of the task instance because each sub-instance executes only part of the work of its parent task instance. We assume the workload of every instance of the task is the same and hence $W_{i,j} = W_i$, and we have

$$W_i = \sum_{k=1}^{NS_{i,j}} w_{i,j,k} \qquad (11)$$

$$\hat{W}_i = \sum_{k=1}^{NS_{i,j}} \hat{w}_{i,j,k} \qquad (12)$$

The average workload is always less than or equal to the worst-case workload, so we have:

$$w_{i,j,k} \le \hat{w}_{i,j,k} \qquad (13)$$

From equations (11) and (12), we can see that there are many combinations of $w_{i,j,k}$ and $\hat{w}_{i,j,k}$ whose sums are equal to $W_i$ and $\hat{W}_i$. To find an optimal value for each of them, we divide the workload distribution of all the sub-instances into two cases. Here, we need to explain the meaning of the average workload of the sub-instances. It represents the amount of workload that should be executed in that particular sub-instance when the task instance takes the ACEC. For example, suppose the ACEC and WCEC of a task instance are equal to 15 and 30, respectively, and the instance is preempted into three sub-instances, each with a WCEC equal to 10. This means that each of them can execute up to 10 units of execution cycles. During the average-case scenario, the first sub-instance will execute 10 units rather than 5 units (15/3 units), because the next sub-instance will start execution only if the previous sub-instance has already reached its worst-case limit. By the same argument, the ACEC of the second and third sub-instances are 5 and 0 units, respectively. An ACEC of 0 units for the third sub-instance means that this sub-instance does not need to perform any execution during the average case, while it is still reserved enough time slots for when the actual execution needs the worst-case cycles. In that worst case, all the sub-instances need to perform 10 units of execution cycles.

Now we formulate the above idea in the form of mathematical programming. Each sub-instance $T_{i,j,k}$ falls into one of the following cases: (case 1) $\sum_{k'=1}^{k} \hat{w}_{i,j,k'} < W_i$; (case 2) otherwise. From the above discussion, we can see that to satisfy the average workload distribution, we have $\hat{w}_{i,j,k} = w_{i,j,k}$ for all task sub-instances $T_{i,j,k}$ that belong to case 1. For case 2, no additional constraint is needed because the average workload will be automatically assigned a suitable value according to constraint (11) and the fact that the average workloads of the case 1 task sub-instances have already been assigned.

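
The way the average workload fills the sub-instances front to back can be written as a few lines of Python. This is only an illustration of the rule described in the text (each sub-instance executes up to its worst-case share before the next one does any work); the function name `average_split` is hypothetical.

```python
def average_split(w_hat, acec):
    """Distribute the ACEC over sub-instances in execution order.

    w_hat lists the worst-case workload of each sub-instance. A sub-instance
    only receives average workload once all earlier sub-instances have been
    filled up to their worst-case share.
    """
    if acec > sum(w_hat):
        raise ValueError("ACEC cannot exceed the total worst-case workload")
    remaining = acec
    shares = []
    for w_hat_k in w_hat:
        w_k = min(w_hat_k, remaining)
        shares.append(w_k)
        remaining -= w_k
    return shares


if __name__ == "__main__":
    # The example from the text: ACEC 15, three sub-instances of WCEC 10 each.
    print(average_split([10, 10, 10], 15))  # [10, 5, 0]
```

Sub-instances that end up with their full worst-case share correspond to case 1, and the remaining ones to case 2.
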
Consider the example shown in Figure 5, where $T_{i,j}$ has three sub-instances. The first sub-instance $T_{i,j,1}$ belongs to case 1 because $\hat{w}_{i,j,1} < W_i$, and we have $\hat{w}_{i,j,1} = w_{i,j,1}$. The second and third sub-instances, $T_{i,j,2}$ and $T_{i,j,3}$, belong to case 2 because $\hat{w}_{i,j,1} + \hat{w}_{i,j,2} > W_i$ and $\hat{w}_{i,j,1} + \hat{w}_{i,j,2} + \hat{w}_{i,j,3} > W_i$. Since $w_{i,j,1}$ is already assigned the value $\hat{w}_{i,j,1}$, we have $w_{i,j,2} = W_i - \hat{w}_{i,j,1}$, because $T_{i,j,2}$ will execute the remaining workload after $T_{i,j,1}$ finishes execution. Now that $T_{i,j,1}$ and $T_{i,j,2}$ have already executed all the required average workload, $T_{i,j,3}$ does not need to carry out any computation on average (i.e., $w_{i,j,3} = 0$). Note that all the worst-case workloads ($\hat{w}_{i,j,1}$, $\hat{w}_{i,j,2}$, $\hat{w}_{i,j,3}$) are non-negative numbers and their sum is equal to $\hat{W}_i$.

Figure 5. An example of a task with three sub-instances.

In the mathematical programming formulation, we deal with the more general case in which more than three sub-instances are allowed. However, each of the sub-instances still falls into one of the two cases. The NLP formulation of the above idea is presented as follows. A dependent linear variable, $ol_{i,j,k} = W_i - \sum_{k'=1}^{k} w_{i,j,k'}$, is introduced to determine whether the current task sub-instance $T_{i,j,k}$ belongs to case 1 or case 2. If it belongs to case 1, i.e., $\sum_{k'=1}^{k} w_{i,j,k'} < W_i$, then $ol_{i,j,k}$ is positive. $\sum_{k'=1}^{k} w_{i,j,k'} > W_i$ is impossible because the sum of the average workloads of the already executed sub-instances (including the current sub-instance itself) is at most $W_i$. We have $ol_{i,j,k} = 0$ if the task sub-instance belongs to case 2. In order to have $w_{i,j,k} = \hat{w}_{i,j,k}$ for case 1, we have the following additional constraint:

$$ol_{i,j,k}\, w_{i,j,k} \ge ol_{i,j,k}\, \hat{w}_{i,j,k} \qquad (14)$$

When $\sum_{k'=1}^{k} w_{i,j,k'} < W_i$, i.e., $ol_{i,j,k} > 0$, constraint (14) is equivalent to $w_{i,j,k} \ge \hat{w}_{i,j,k}$. Together with constraint (13), the only feasible solution is $w_{i,j,k} = \hat{w}_{i,j,k}$. Otherwise, constraint (14) is trivially true when $ol_{i,j,k} = 0$. Finally, we need to define the worst-case workload for each of the task sub-instances to yield the best energy saving. However, this is already done since its values are governed by equations (12) and (13).

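
The effect of the extra constraint can be checked numerically. The Python sketch below tests a candidate average-workload assignment for one task instance against constraints (11), (13) and (14), assuming that (14) has the form $ol \cdot w \ge ol \cdot \hat{w}$ with $ol$ equal to the ACEC minus the average workload executed so far (including the current sub-instance). The function name and the tolerance `eps` are hypothetical.

```python
def satisfies_workload_constraints(w, w_hat, acec, eps=1e-9):
    """Check one task instance against constraints (11), (13) and (14).

    w and w_hat list the average and worst-case workloads of the
    sub-instances in execution order.
    """
    if abs(sum(w) - acec) > eps:                 # (11): averages sum to ACEC
        return False
    executed = 0.0
    for w_k, w_hat_k in zip(w, w_hat):
        if w_k > w_hat_k + eps:                  # (13): average <= worst case
            return False
        executed += w_k
        ol = acec - executed                     # average work still pending
        if ol * w_k < ol * w_hat_k - eps:        # (14): forces w = w_hat if ol > 0
            return False
    return True


if __name__ == "__main__":
    w_hat = [10, 10, 10]
    print(satisfies_workload_constraints([10, 5, 0], w_hat, 15))
    print(satisfies_workload_constraints([5, 5, 5], w_hat, 15))
```

Under these assumptions only the front-loaded assignment (10, 5, 0) passes for the 15/30 example, while an even split such as (5, 5, 5) is rejected at the first sub-instance.
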
From the above formulation, solving the NLP problem yields the optimal assignment of workload to each sub-instance, and the corresponding end-time of each sub-instance is obtained as well.

4. Experimental Results

To demonstrate the effectiveness of the proposed technique, which we denote as ACS, a series of experiments, including both randomly generated task sets and real-life applications, was carried out. For a given number of tasks, one hundred random task sets were constructed, and each task set results in at most one thousand sub-instances. We repeatedly simulated each task set for one thousand hyper-periods. Similar to the experimental settings in [7], we consider the number of execution cycles of each task to vary between the best case (BCEC) and the worst case (WCEC), following a normal distribution with mean $\mu = \mathrm{ACEC}$ and standard deviation $\sigma = (\mathrm{WCEC} - \mathrm{BCEC})/6$. The BCEC/WCEC ratio ranges from highly flexible execution (0.1) to almost fixed execution (0.9). The deadline $D_{i,j}$ of each task was chosen from a uniform distribution between 0 and 00. The WCEC of a particular task instance $T_{i,j}$ was adjusted such that the processor utilization is about 70% when all the tasks are running at the maximum speed [7].

We compared the energy consumption using ACS with the energy consumption of the static scheduling method that considers only the WCEC when obtaining the schedule. We denote the latter as WCS. The runtime energy consumption is the actual energy consumption after performing Dynamic Voltage Scaling (DVS) based on either the ACS or the WCS static schedule. Figure 6 summarizes the experimental results. Figure 6(a) shows the comparison between ACS and WCS for different numbers of tasks as the BCEC/WCEC ratio varies between 0.1 (highly flexible execution) and 0.9 (almost fixed execution). The Y-axis is the percentage improvement in energy consumption of ACS over WCS. The figure shows that as the number of tasks increases, the energy efficiency of ACS increases. This can be explained by the fact that, as the number of tasks increases, more task sub-instances can use a lower supply voltage by exploiting the workload variation and utilizing the slack time generated by that variation. Compared with WCS, the improvement in energy reduction reaches its highest value, about 60%, when the BCEC/WCEC ratio is 0.1 and the number of tasks is ten. This is because a lot of slack is available when the BCEC/WCEC ratio is low, and ACS provides much better slack utilization in this scenario and minimizes the overall average energy consumption. However, when little slack is available, i.e. when the BCEC/WCEC ratio is high, there is not much improvement, as there is little room for either method to reduce the energy consumption.

To further validate the proposed algorithm, we applied it to two real-life applications, computer numerical control (CNC) [13] and GAP [14]. The comparisons of the energy reduction with WCS are shown in Figure 6(b).
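The workload model used in the random experiments (a normal distribution with mean ACEC and a standard deviation of one sixth of the WCEC to BCEC range) can be sketched as follows. The function name `sample_cycles` is hypothetical. Redrawing samples that fall outside [BCEC, WCEC] is an assumption, since the text does not say how out-of-range draws are handled, and the usage example places the ACEC at the midpoint of the range purely for illustration.

```python
import random


def sample_cycles(bcec, acec, wcec, rng):
    """Draw one execution-cycle count from N(ACEC, ((WCEC - BCEC) / 6)^2).

    Draws outside [BCEC, WCEC] are rejected and redrawn (an assumption of
    this sketch, not something stated in the experimental setup).
    """
    if not bcec <= acec <= wcec:
        raise ValueError("expected BCEC <= ACEC <= WCEC")
    sigma = (wcec - bcec) / 6.0
    if sigma == 0:
        return float(acec)            # fixed execution time, nothing to sample
    while True:
        cycles = rng.gauss(acec, sigma)
        if bcec <= cycles <= wcec:
            return cycles


# Usage: BCEC/WCEC ratio of 0.1, with the ACEC assumed to sit at the midpoint.
rng = random.Random(0)
wcec = 1000.0
bcec = 0.1 * wcec
acec = (bcec + wcec) / 2
samples = [sample_cycles(bcec, acec, wcec, rng) for _ in range(2000)]
print(min(samples) >= bcec, max(samples) <= wcec)
```

With this choice of standard deviation the range [BCEC, WCEC] spans three standard deviations on each side of a midpoint ACEC, so very few draws are rejected.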
It can be seen that the improvements over WCS are as high as 4% and 30% when the BCEC/WCEC ratio is 0.1 for CNC and GAP, respectively.

Figure 6. Experimental results: (a) improvement of ACS over WCS versus the number of tasks, for BCEC/WCEC ratios of 0.1, 0.5 and 0.9; (b) improvement of ACS over WCS for CNC and GAP at the same BCEC/WCEC ratios.

5. Conclusions

A novel energy reduction strategy for the off-line static voltage scheduling phase was introduced. The preemptive nature of the scheduling is taken into account by using a fully preemptive schedule. The potential slack generated by the later tasks can be utilized by the earlier tasks by considering the average execution workload during static voltage scheduling. The problem is formulated as a Non-Linear Programming (NLP) problem, and experimental results showed significant improvement in energy reduction.

6. References

[1] I. Hong, D. Kirovski, G. Qu, M. Potkonjak and M. Srivastava, "Power optimization of variable voltage core-based systems," DAC, pp. 176-181, 1998.
[2] T. Ishihara and H. Yasuura, "Voltage Scheduling Problem for Dynamically Variable Voltage Processors," ISLPED, pp. 197-202, 1998.
[3] T. Burd, T. Pering, A. Stratakos and R. Brodersen, "A dynamic voltage scaled microprocessor system," IEEE Journal of Solid-State Circuits, vol. 35, pp. 1571-1580, 2000.
[4] M. Weiser, B. Welch, A. Demers and S. Shenker, "Scheduling for reduced CPU energy," USENIX Symp. on Operating Systems Design and Implementation, pp. 13-23, 1994.
[5] W. Kim, J. Kim and S. L. Min, "Dynamic Voltage Scaling Algorithm for Fixed-Priority Real-Time Systems Using Work-Demand Analysis," ISLPED, pp. 396-401, 2003.
[6] F. Gruian and K. Kuchcinski, "LEneS: task scheduling for low-energy systems using variable supply voltage processors," ASP-DAC, pp. 449-455, 2001.
[7] F. Gruian, "Hard Real-Time Scheduling for Low-Energy Using Stochastic Data and DVS Processors," ISLPED, pp. 46-51, 2001.
[8] A. Manzak and C. Chakrabarti, "Variable Voltage Task Scheduling Algorithms for Minimizing Energy," ISLPED, pp. 279-282, 2001.
[9] S. Saewong and R. Rajkumar, "Practical voltage-scaling for fixed-priority RT-systems," RTAS, pp. 106-114, 2003.
[10] Y. Liu and A. K. Mok, "An integrated approach for applying dynamic voltage scaling to hard real-time systems," RTAS, pp. 116-123, 2003.
[11] D. Ziegenbein, F. Wolf, K. Richter, M. Jersak and R. Ernst, "Interval-Based Analysis of Software Processes," ACM SIGPLAN Conference on Languages, Compilers, and Tools for Embedded Systems, pp. 94-101, 2001.
[12] B. Mochocki, X. S. Hu and G. Quan, "A realistic variable voltage scheduling model for real-time applications," ICCAD, pp. 726-731, 2002.
[13] N. Kim, M. Ryu, S. Hong, M. Saksena, C.-H. Choi and H. Shin, "Visual assessment of a real-time system design: a case study on a CNC controller," RTSS, pp. 300-310, 1996.
[14] C. D. Locke, D. R. Vogel and T. J. Mesler, "Building a predictable avionics platform in Ada: a case study," RTSS, pp. 181-189, 1991.