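Low Power VLSI System Design
Lecture 8 & 9: Transistor Sizing and Low Power Memory Design
Prof. R. Iris Bahar
October 2 & 4, 2017

Switching threshold

- Define $V_M$ to be the point where $V_{in} = V_{out}$ (both PMOS and NMOS are in saturation since $V_{DS} = V_{GS}$).
- If $V_M = V_{DD}/2$, then this implies symmetric rise/fall behavior for the CMOS gate.
- Recall that at saturation, $I_D = (k'/2)(W/L)(V_{GS} - V_T)^2$, where $k'_n = \mu_n C_{ox} = \mu_n \varepsilon_{ox}/t_{ox}$.
- Setting $I_{Dp} = -I_{Dn}$:

$$\frac{k'_n}{2}\,\frac{W_n}{L_n}\,(V_M - V_{Tn})^2 = \frac{k'_p}{2}\,\frac{W_p}{L_p}\,(V_{DD} - V_M + V_{Tp})^2$$

- Assuming $V_{Tn} = -V_{Tp}$, the two squared terms are equal when $V_M = V_{DD}/2$, so:

$$\frac{W_p/L_p}{W_n/L_n} = \frac{k'_n}{k'_p} = \frac{500}{180} \approx 2.78$$

Switch delay model

[Figure: switch-level RC models of the INVERTER, NAND, and NOR gates, with labels $R_{eq}$ and $C_{int}$]

Input pattern effects on delay

- Delay is dependent on the pattern of inputs.
- 1st order approximation of delay: $t_p \approx 0.69\, R_{eff}\, C_L$
- $R_{eff}$ depends on the input pattern.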
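The switching-threshold condition above can be checked numerically. The following is a minimal sketch, assuming the long-channel saturation model from the slide and $V_{Tn} = -V_{Tp}$. The function names are hypothetical, the values 500 and 180 are used only as relative $k'_n$ and $k'_p$, and the supply and threshold values are illustrative.

```python
import math


def pmos_to_nmos_ratio(kn_prime, kp_prime):
    """(W/L)_p / (W/L)_n that places V_M at V_DD/2, assuming V_Tn = -V_Tp."""
    return kn_prime / kp_prime


def switching_threshold(vdd, vtn, kn_prime, kp_prime, wl_n, wl_p):
    """Solve kn (V_M - V_Tn)^2 = kp (V_DD - V_M - V_Tn)^2 for V_M.

    kn and kp are the process gains multiplied by each device's W/L.
    Taking the positive square root of both sides gives a linear equation.
    """
    kn = kn_prime * wl_n
    kp = kp_prime * wl_p
    r = math.sqrt(kp / kn)
    return (vtn + r * (vdd - vtn)) / (1.0 + r)


# Illustrative values (assumptions, not taken from a specific process).
VDD, VTN = 2.5, 0.4
KN, KP = 500.0, 180.0

ratio = pmos_to_nmos_ratio(KN, KP)
print("ratio for symmetric gate:", round(ratio, 2))
print("V_M with that ratio:", switching_threshold(VDD, VTN, KN, KP, 1.0, ratio))
print("V_M with equal W/L:", switching_threshold(VDD, VTN, KN, KP, 1.0, 1.0))
```

With equal W/L the weaker PMOS shifts $V_M$ below $V_{DD}/2$, which is why the PMOS is widened.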
Input pattern effects on delay

- 0 → 1 transition on output: 2 possibilities
  - one input goes low: what is $R_{eff}$? Delay is $0.69\, R_p\, C_L$
  - both inputs go low: what is $R_{eff}$? Delay is $0.69\,(R_p/2)\, C_L$ since two p-resistors are on in parallel
- 1 → 0 transition on output: 1 possibility
  - both inputs go high: delay is $0.69\,(2R_n)\, C_L$

Transistor sizing

- How should NMOS and PMOS devices be sized relative to an inverter with equal rise/fall times?
- $t_p \approx 0.69\, R_{eff}\, C_L$
- Adding transistors in series (without sizing) slows down the circuit.

[Figure: gate schematics with internal capacitances $C_{int}$]

Sizing a 3-input NAND

- How should a 3-input NAND gate be sized for equal rise and fall delay?
- Consider the worst case for rise/fall.

[Figure: 3-input NAND with inputs A, B, C and internal capacitances $C_{int}$]

A typical memory hierarchy

By taking advantage of the principle of locality, we can present the user with as much memory as is available in the cheapest technology at the speed offered by the fastest technology.

- On-chip components: Datapath, Control, RegFile, ITLB, DTLB, Instr Cache, Data Cache
- Second Level Cache (SRAM)
- Main Memory (DRAM)
- Secondary Memory (Disk)

From the top of the hierarchy to the bottom:

- Speed (ns): .1's, 1's, 10's, 100's, 1,000's
- Size (bytes): 100's, K's, 10K's, M's, T's
- Cost: highest to lowest
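The input-pattern delays and the NAND sizing question from the slides above can be worked through with the first-order model $t_p \approx 0.69\, R_{eff}\, C_L$. This is a rough sketch, assuming on-resistance is inversely proportional to transistor width and ignoring the extra internal capacitance that wider devices add. The function names and the resistance and capacitance values are hypothetical.

```python
def delay(r_eff, c_load):
    """First-order RC delay: t_p ~ 0.69 * R_eff * C_L."""
    return 0.69 * r_eff * c_load


def nand_pull_up_resistance(r_p, inputs_low):
    """Parallel PMOS network: k devices switched on give R_p / k."""
    if inputs_low < 1:
        raise ValueError("at least one input must be low to pull the output up")
    return r_p / inputs_low


def nand_pull_down_resistance(r_n, n_inputs):
    """Series NMOS stack: all n devices must be on, giving n * R_n."""
    return n_inputs * r_n


def nmos_width_multiplier(n_inputs):
    """Width scale per series NMOS so the stack matches one inverter NMOS.

    Assumes R_on is proportional to 1/W, so n devices in series each need n times the width.
    """
    return n_inputs


# Hypothetical unit-inverter values.
R_P, R_N, C_L = 12e3, 6e3, 10e-15

print("2-input NAND, one input low  :", delay(nand_pull_up_resistance(R_P, 1), C_L))
print("2-input NAND, both inputs low:", delay(nand_pull_up_resistance(R_P, 2), C_L))
print("2-input NAND, both inputs high:", delay(nand_pull_down_resistance(R_N, 2), C_L))

# 3-input NAND: the worst-case rise uses a single PMOS, so the PMOS devices keep
# the inverter size; the worst-case fall uses three series NMOS, so each is widened.
k = nmos_width_multiplier(3)
print("NMOS width multiplier for NAND3:", k)
print("sized pull-down resistance:", nand_pull_down_resistance(R_N / k, 3))
```

Under these assumptions the sized 3-input NAND has the same worst-case pull-up and pull-down resistance as the reference inverter.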
Read/write memories (RAMs)

Static (SRAM)
- data is stored as long as the voltage supply is enabled
- large cells (6 FETs/cell), so fewer bits/chip
- fast, so used where speed is important (e.g., caches)
- differential outputs (output BL and !BL)
- compatible with CMOS technology

Dynamic (DRAM)
- periodic refresh required (every 1 to 4 ms)
- small cells (1 to 3 FETs/cell), so more bits/chip
- slower, so used for main memories
- single ended output (output BL only)
- not as easily compatible with CMOS technology

1D Memory Architecture

[Figure: N words of M bits each (Word 0, Word 1, Word 2, ..., Word N-2, Word N-1), each storage cell row selected by its own signal $S_0, S_1, ..., S_{N-1}$, with a shared Input/Output]

- N words require N select signals.

[Figure: the same array with a decoder driven by address bits $A_0, A_1, ..., A_{K-1}$ generating $S_0$ through $S_{N-1}$]

- Decoder reduces # of inputs: $K = \log_2 N$

2D Memory Organization

[Figure: storage cell array with $2^{K-L}$ rows (word lines) and $M \cdot 2^L$ columns (bit lines); the row decoder takes the $K-L$ bit row address $A_L, A_{L+1}, ..., A_{K-1}$; the column decoder takes the $L$ bit column address $A_0, ..., A_{L-1}$; sense amplifiers/drivers sit between the array and the column decoder; Input-Output is M bits wide]

- $N = 2^K$ M-bit words

4x4 SRAM memory

[Figure: 4x4 array of 2-bit words with read precharge and enable signals, bit line precharge on BL and !BL, word lines WL[0] to WL[3] driven by the row decoder from $A_1$, $A_2$, column decoder driven by $A_0$ selecting $BL_i$, $BL_{i+1}$, sense amplifiers, write circuitry, and clocking and control]
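The 2D organization above amounts to splitting a K-bit word address into a row field and a column field. A minimal sketch follows, assuming the column address occupies the low L bits ($A_0$ to $A_{L-1}$) and the row address the remaining high bits, as in the figure labels. The function names are hypothetical.

```python
def array_shape(k_bits, l_bits, m_bits):
    """Rows and columns of the cell array for 2**k_bits words of m_bits each."""
    rows = 2 ** (k_bits - l_bits)      # one word line per row
    cols = m_bits * 2 ** l_bits        # 2**l_bits words side by side in each row
    return rows, cols


def split_address(addr, k_bits, l_bits):
    """Split a word address into (row, column) decoder inputs."""
    if not 0 <= addr < 2 ** k_bits:
        raise ValueError("address out of range")
    column = addr & ((1 << l_bits) - 1)  # low bits A_0 .. A_{L-1}
    row = addr >> l_bits                 # high bits A_L .. A_{K-1}
    return row, column


# Example: 8 words of 2 bits (K = 3, L = 1, M = 2) form a 4x4 array.
print(array_shape(3, 1, 2))
print(split_address(0b101, 3, 1))
```

With K = 3, L = 1, and M = 2 the array is 4 rows by 4 columns, which is consistent with a 4x4 memory of 2-bit words that uses one column address bit and two row address bits.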
2D memory configuration

[Figure: cell arrays with shared Sense Amps]

6-transistor SRAM Cell

[Figure: 6T cell with word line WL, access transistors M5 and M6, cross-coupled inverters M1/M2 and M3/M4, storage nodes !Q and Q, and bit lines !BL and BL]

SRAM Cell Analysis (Read)

[Figure: 6T cell during a read with WL = 1, !BL = 1.0 V, BL = 1.0 V, !Q = 0, Q = 1, and bit line capacitances $C_{bit}$]

- Read-disturb (read-upset): must carefully limit the allowed voltage rise on !Q to a value that prevents the read-upset condition from occurring, while simultaneously maintaining acceptable circuit speed and area constraints.

Read Voltages Ratios

[Plot: voltage rise on !Q versus Cell Ratio (CR), for $V_{DD} = 2.5$ V and $V_{thn} = 0.4$ V]

- Keep cell size minimal while maintaining read stability.
- Make M1 minimum size and increase the L of M5 (to make it weaker): increases load on BL.
- Make M5 minimum size and increase the W of M1.
SRAM Cell Analysis (Write)

[Figure: 6T cell during a write with WL = 1, !BL = 1.0 V, BL = 0 V, !Q = 0, Q = 1, and bit line capacitances $C_{bit}$]

- The !Q side of the cell cannot be pulled high enough to ensure writing of 1 (because of the ON state of M1). So, the new value of the cell has to be written through M6.

Write Voltages Ratios

[Plot: write voltage ($V_Q$) versus Pullup Ratio (PR), for $V_{DD} = 2.5$ V, $|V_{thp}| = 0.4$ V, $\mu_p/\mu_n = 0.5$]

- Need to pull Q below $V_{thn}$ to turn off M1.
- Keep cell size minimal while allowing writes.
- Make M4 and M6 minimum size.

Cell Sizing and Performance

- Keeping cell size minimal is critical for large caches.
- Minimum sized pull down devices (M1 and M3)
  - Requires longer than minimum channel length pass transistors (M5 and M6) to ensure proper Cell Ratio.
  - But up-sizing of the pass transistors increases capacitive load on the word lines and limits the current discharged on the bit lines, both of which can adversely affect the speed of the read cycle.
- Minimum width and length pass transistors
  - Boost the width of the pull downs (M1 and M3).
  - Reduces the loading on the word lines and increases the storage capacitance in the cell. Both are good, but cell size may be slightly larger.
- Performance is determined by the read operation.
  - To accelerate the read time, SRAMs use sense amplifiers (so that the bit line doesn't have to make a full swing).

Decreasing bit line energy

- Reduce the bit line voltage swing
  - need a sense amp for each column to sense/restore the signal
- Isolate memory cells from the bit lines after sensing (to prevent the cells from changing the bit line voltage further): pulsed word line
- Isolate sense amps from bit lines after sensing (to prevent bit lines from having large voltage swings): bit line isolation
- What will these techniques do for performance?
Sense Amplifiers

[Figure: sense amplifier (SA) block with input and output]

- Amplification: resolves data with small bit line swings.
- Delay reduction: compensates for the limited drive capability of the memory cell.
  - $t_p = (C \cdot \Delta V) / I_{av}$
  - $C$ is large and $I_{av}$ is small, so make $\Delta V$ as small as possible.
- Power reduction: eliminates a large part of the power dissipation.
  - $P = \tfrac{1}{2}\, C \cdot V_{DD} \cdot \Delta V \cdot f$
  - Make $\Delta V$ as small as possible.
- Signal restoration: for DRAMs, need to drive the bit lines full swing after sensing (reading) in order to do data refresh.

Pulsed word line feedback signal

[Figure: Read and Complete signals, word line, regular bit lines, and dummy bit lines that are 10% populated]

- Dummy column height is set to 10% of a regular column and its cells are tied to a fixed value.
  - Capacitance is only 10% of a regular column.

Pulsed word line timing

[Figure: timing of Read, Complete, and Word line, with Bit line $\Delta V = 0.1\, V_{dd}$ and Dummy bit line $\Delta V = V_{dd}$]

- Dummy bit lines have reached full swing and trigger pulse shut off when regular bit lines reach 10% swing.

Bit line isolation

[Figure: bit lines with $\Delta V = 0.1\, V_{dd}$, isolate and sense signals, Read sense amplifier with $\Delta V = V_{dd}$, and sense amplifier outputs]
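The two sense amplifier formulas above show why a small bit line swing helps both delay and power. A minimal sketch follows that evaluates them for a full swing and for the 10% swing used in the pulsed word line scheme. The function names and the capacitance, current, and frequency values are hypothetical.

```python
def bitline_delay(c_bit, delta_v, i_avg):
    """t_p = C * dV / I_av: time for the cell to move the bit line by delta_v."""
    return c_bit * delta_v / i_avg


def bitline_power(c_bit, v_dd, delta_v, freq):
    """P = 1/2 * C * V_DD * dV * f, as given on the slide."""
    return 0.5 * c_bit * v_dd * delta_v * freq


# Illustrative values (assumptions): 1 pF bit line, 100 uA cell current, 100 MHz.
C_BIT, I_AVG, V_DD, FREQ = 1e-12, 100e-6, 2.5, 100e6

for label, swing in (("full swing", V_DD), ("10% swing", 0.1 * V_DD)):
    t = bitline_delay(C_BIT, swing, I_AVG)
    p = bitline_power(C_BIT, V_DD, swing, FREQ)
    print(f"{label}: delay = {t:.3e} s, power = {p:.3e} W")
```

Both quantities are linear in $\Delta V$, so under this model cutting the swing to 10% of $V_{DD}$ cuts the bit line delay and the bit line power by a factor of ten. The cost of the sense amplifiers themselves is not included here.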
Bit Line Precharge Logic

[Figure: precharge circuit with !PC driving the devices on BL and !BL]

- First step of the Read cycle is to precharge the bit lines to $V_{DD}$.
  - Every differential signal in the memory must be equalized before Read.
- Turn off PC and enable the WL.
  - The grounded PMOS load limits the bit line swing (speeding up the next precharge cycle).

Bit line precharge logic

[Figure: precharge circuit with !PC, BL, !BL, and an equalization transistor between the bit lines]

- Equalization transistor: speeds up equalization of the bit lines by allowing the capacitance and pull-up device of the nondischarged bit line to assist in precharging the discharged line.
- What are the power implications of this design?
- Would a clocked precharge scheme be better for power?

2D Memory Organization

[Figure: storage cell array with $2^{K-L}$ rows (word lines) and $M \cdot 2^L$ columns (bit lines); row decoder with the $K-L$ bit row address $A_L, A_{L+1}, ..., A_{K-1}$; column decoder with the $L$ bit column address $A_0, ..., A_{L-1}$; sense amplifiers/drivers; Input-Output (M bits); $N = 2^K$ M-bit words]

3D Banked Memory Architecture

[Figure: Block 0, Block 1, ..., Block P-1, each receiving the row address, column address, and block address; control circuitry, block selector, global data bus, and global amplifier/driver connected to I/O]
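The banked architecture adds a block field to the row and column fields of the address, and the block selector picks one block. The sketch below shows one possible decoding. The field order (block in the high bits, then row, then column in the low bits), the function names, and the field widths are all assumptions made for illustration; the slides do not specify them.

```python
def decode_banked_address(addr, block_bits, row_bits, col_bits):
    """Split an address into (block, row, column) fields.

    Assumed layout, most significant to least: [block | row | column].
    """
    total_bits = block_bits + row_bits + col_bits
    if not 0 <= addr < (1 << total_bits):
        raise ValueError("address out of range")
    column = addr & ((1 << col_bits) - 1)
    row = (addr >> col_bits) & ((1 << row_bits) - 1)
    block = addr >> (col_bits + row_bits)
    return block, row, column


def block_enables(block, n_blocks):
    """One-hot enable list: only the addressed block is activated."""
    return [i == block for i in range(n_blocks)]


# Example: 4 blocks, 8 rows per block, 4 column positions per row.
block, row, column = decode_banked_address(0b1001101, 2, 3, 2)
print(block, row, column)
print(block_enables(block, 4))
```

Only one entry of the enable list is true for any address, so the word lines and bit lines of the other blocks stay idle.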
Power Saving

- Block-oriented memory:
  - Lengths of local word and bit lines are kept small.
  - Block address is used to activate the addressed block.
  - Unaddressed blocks are put in power-saving mode: sense amplifier and row/column decoders are disabled.
- Cell array is put in power-saving mode.

Power Saving Modes

- Power-down: disconnect the supply. Data is not retained. Example: caches.
- Increasing thresholds by body biasing: negative bias on non-active cells reduces leakage.
- Sleep mode:
  - Insert resistance in the leakage path; retain data.
  - Lower the supply voltage.

Adding Resistance in Leakage Path

[Figure: SRAM cells connected between internal rails VDD.int and VSS.int; a high-threshold transistor controlled by sleep sits between VDD and VDD.int, and another sleep-controlled transistor sits between VSS.int and GND]

- Sleep = 1: data retention mode

Lowering Supply Voltage

[Figure: SRAM cells whose supply is selected by the sleep signal between VDD and VDDL, with GND as the lower rail; annotation: 100mV for 0.13μ CMOS]