A 32 Gbps 2048-bit 10GBASE-T Ethernet Energy Efficient LDPC Decoder with Split-Row Threshold Decoding Method


A 32 Gbps 2048-bit 10GBASE-T Ethernet Energy Efficient LDPC Decoder with Split-Row Threshold Decoding Method
Tinoosh Mohsenin and Bevan M. Baas
VLSI Computation Lab, ECE Department, University of California, Davis

Outline: Introduction to LDPC Codes and Iterative Decoding; Goals and Key Ideas; Split-Row Threshold Decoding Method; Error Performance Results; Multi-Split-Row Threshold Decoder Implementations and Results; Conclusion

Error Correction in Communication Systems
Signal chain: binary information → encoder (adds redundancy) → encoded information → channel (adds noise) → corrupted information → decoder (error detection and correction) → corrected information.
Error correction is widely used in communication systems.
Low-density parity-check (LDPC) codes have been demonstrated to have very good error correction performance.

LDPC Code Applications
Standards: Digital Video Broadcasting (DVB-S2), 2005; 10 Gigabit Ethernet (10GBASE-T), 2006; WiMAX (802.16e); WiFi (802.11n); WPANs (802.15.3c).
Applications: flash memory; hard disks; deep-space satellite communications.

LDPC Codes
Defined by a large binary matrix, called the parity check matrix or H matrix.
Example: (12,6) LDPC code. Code length (N) = 12; information length (K) = 6; row weight (W_r) = 4; column weight (W_c) = 2; row size (number of parity checks) = 6.

Encoding Picture Example
[Figure: an image is encoded into a codeword V_i (image bits plus parity bits) that satisfies $H \cdot V_i^T = 0$.]
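
The parity-check condition behind this slide, H · V^T = 0 (mod 2), can be sketched in a few lines of Python. The matrix below is a (7,4) Hamming parity-check matrix, chosen only because it is small; it is an assumption for illustration and is neither the (12,6) example code nor the 10GBASE-T matrix.

```python
def syndrome(H, v):
    """Return H · v^T over GF(2): one parity-check result per row of H."""
    return [sum(h * x for h, x in zip(row, v)) % 2 for row in H]

# Toy parity-check matrix (assumption: (7,4) Hamming code, not from the slides).
H = [
    [1, 0, 1, 0, 1, 0, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]

codeword = [1, 1, 1, 0, 0, 0, 0]   # satisfies every parity check
corrupted = codeword.copy()
corrupted[2] ^= 1                  # flip one bit to model channel noise

print(syndrome(H, codeword))    # all zeros: valid codeword
print(syndrome(H, corrupted))   # nonzero entries mark the failed checks
```

A decoder's termination (syndrome) check amounts to testing whether this vector is all zeros.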

Decoding Picture Example
[Figure: transmitter → noisy channel (Ethernet cable, wireless, or hard disk) → receiver with iterative decoding; the received image is shown after successive decoding iterations.]

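Message Passing (Check Node Processing)
Decoding flow: initialization with channel input λ → check node processing (α) → variable node processing (β) → termination check → output.
SPA: $\alpha_{ij} = \prod_{j':\,h_{ij'}=1,\,j'\neq j} \mathrm{sign}(\beta_{ij'}) \times \phi\Big(\sum_{j':\,h_{ij'}=1,\,j'\neq j} \phi(|\beta_{ij'}|)\Big)$, where $\phi(x) = -\log[\tanh(|x|/2)]$
MinSum: $\alpha_{ij} = \prod_{j':\,h_{ij'}=1,\,j'\neq j} \mathrm{sign}(\beta_{ij'}) \times \min_{j':\,h_{ij'}=1,\,j'\neq j} |\beta_{ij'}|$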
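
As a minimal sketch of the two check node updates named on this slide, the Python below computes the outgoing messages α for one row of H from its incoming messages β. The function names and the list-based interface are assumptions made for illustration; a hardware decoder uses fixed-point comparator trees rather than floating point.

```python
import math

def phi(x):
    """phi(x) = -log(tanh(x / 2)), valid for x > 0."""
    return -math.log(math.tanh(x / 2.0))

def sign_product(values):
    """Product of the signs of the given values, as +1.0 or -1.0."""
    s = 1.0
    for v in values:
        if v < 0:
            s = -s
    return s

def check_update_spa(betas):
    """Sum-product update; assumes nonzero inputs of moderate magnitude."""
    out = []
    for j in range(len(betas)):
        others = betas[:j] + betas[j + 1:]      # exclude the target column j
        magnitude = phi(sum(phi(abs(b)) for b in others))
        out.append(sign_product(others) * magnitude)
    return out

def check_update_minsum(betas):
    """MinSum update: replace the phi computation with a minimum."""
    out = []
    for j in range(len(betas)):
        others = betas[:j] + betas[j + 1:]
        out.append(sign_product(others) * min(abs(b) for b in others))
    return out

betas = [2.0, -0.5, 1.5, -3.0]
print(check_update_minsum(betas))   # [0.5, -1.5, 0.5, -0.5]
print(check_update_spa(betas))      # same signs, smaller magnitudes
```

MinSum never underestimates the SPA magnitude, which is why normalized variants scale it down by a correction factor.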

Message Passing (Variable Node Processing)
$\beta_{ij} = \lambda_j + \sum_{i':\,h_{i'j}=1,\,i'\neq i} \alpha_{i'j}$
$\lambda_j$ is the received information from the channel.
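
The variable node update can be sketched the same way: each outgoing message β is the channel value λ plus every incoming check message α except the one from the destination check node. The function name and argument layout below are illustrative assumptions.

```python
def variable_update(lam, alphas):
    """Return one outgoing message per connected check node.

    lam    -- channel information for this variable node
    alphas -- incoming check-to-variable messages, one per connected check
    """
    total = lam + sum(alphas)
    # Subtracting a check's own message excludes it from its outgoing sum.
    return [total - a for a in alphas]

print(variable_update(1.0, [0.5, -1.5, 2.0]))   # [1.5, 3.5, 0.0]
```

Computing the full sum once and subtracting each term is the usual way to avoid recomputing a separate sum per output.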

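Decoding Architectures
Serial and partial parallel decoders: one or multiple row and column processors share a few memory banks. Throughput is in the range of a few Mbps. The memory requirement is large.
[Figure: check processor (Chk) and variable processor (Var) connected through a shared memory (Mem).]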

Serial Decoding
(1) Initialize memory (clear contents). (2) Compute V1, V2, ..., V12 and store. (3) Now compute C1, C2, ..., C6 and store.
[Figure: Chk, Mem, and Var blocks.]

Partial Parallel Decoder Examples
Example 1: 234b, rate-1/2, (3,6) decoder [T. Ishikawa et al., ASP-DAC, 2006]: 36 row and 72 column processors, 85 Kb memory; 36 mm², 180 nm CMOS; 53 Mbps; 3.6 W @ 1.8 V.
Example 2: 64800b DVB-S2 compliant decoder [P. Urard et al., ISSCC, Feb. 2008]: 8 processors, 3.8 Mb memory; 6.7 mm², 65 nm CMOS; 105 Mbps; 360 mW @ 1.2 V.
[Figure: floorplan with 36 row + 72 column processors surrounded by memory blocks.]

Decoding Architectures (Continued)
Full-parallel decoders: row and column processors are connected according to the parity check matrix. Highest throughput, no memory.
Major challenges: routing congestion; large delay, area, and power caused by long global wires.
[Figure: check processors Chk 1 through Chk 384 wired to variable processors Var 1 through Var 2048.]

Full-Parallel Decoding
(1) Initialize registers (clear contents). (2) Compute C1 through C6. (3) Now compute V1 through V12. (4) Store into registers.
[Figure: check processors and variable processors Var 1 through Var 12 operating in parallel.]

Full-Parallel Decoder Examples
Example 1: 1024-bit, irregular code, 4 bits per symbol [A. Blanksby et al., JSSC, Mar. 2002]: 52.5 mm², 160 nm CMOS; 64 MHz, 1 Gbit/s; 690 mW @ 1.5 V.
Example 2: 660-bit [A. Darabiha et al., CICC, Sep. 2007]: 9 mm², 130 nm CMOS; 300 MHz, 3.3 Gbps; 48 mw @.3 V.
[Figures: die layouts, one with 512 row processors and four blocks of 256 column processors, the other with 176 row + 660 column processors.]

LDPC Decoder Design Goals and Features
Key goals: very high throughput and high energy efficiency; area efficiency (small circuit area); suitability for long-length, large-row-weight LDPC codes; easy implementation with automatic CAD tools; good error performance.
Split-Row decoding key features: reduced interconnect complexity; reduced processor complexity.
T. Mohsenin and B. Baas, "Split-Row: A reduced complexity, high throughput LDPC decoder architecture," in ICCD, 2006.
T. Mohsenin and B. Baas, "High-throughput LDPC decoders using a multiple Split-Row method," in ICASSP, 2007.

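Standard MinSum vs. Split-Row Decoding
Standard MinSum decoding: initialization → check processing (full H) → variable processing → syndrome check.
Split-Row decoding: H is divided into column partitions (H_split-sp0 and H_split-sp1), each with its own check processors (C_sp0, C_sp1) and variable processors; the check processors of a row exchange only sign bits (Sign Sp0, Sign Sp1).
Benefits: reduction of input wires to each check processor and reduction of check processor area.
[Figure: the example parity check matrix H and its two split halves, with variable nodes V3, V5, V8, ... highlighted.]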

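MinSum vs. MinSum Split-Row
Each update is the product of a sign term and a magnitude term.
MinSum: $\alpha_{ij}^{MS} = S_{MS} \times \prod_{j':\,h_{ij'}=1,\,j'\neq j} \mathrm{sign}(\beta_{ij'}) \times \min_{j':\,h_{ij'}=1,\,j'\neq j} |\beta_{ij'}|$
MinSum Split-Row: $\alpha_{ij}^{MS\,Split\text{-}Row} = S_{MS\,Split\text{-}Row} \times \prod_{j':\,h_{ij'}=1,\,j'\neq j} \mathrm{sign}(\beta_{ij'}) \times \min_{j':\,h_{ij'}^{Split\text{-}Row}=1,\,j'\neq j} |\beta_{ij'}|$
In Split-Row the sign is taken over the full row, because sign bits are exchanged between partitions, while the minimum is taken only over the partition's own columns.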
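
A minimal Python sketch of the Split-Row check update follows, assuming a row is cut into `spn` equal, contiguous partitions and that the correction factor S is 1. Both choices, and the function names, are illustrative assumptions rather than details taken from the slides.

```python
def sign_product(values):
    """Product of the signs of the given values, as +1.0 or -1.0."""
    s = 1.0
    for v in values:
        if v < 0:
            s = -s
    return s

def check_update_split_row(betas, spn):
    """Split-Row MinSum: global sign, partition-local minimum.

    Assumes len(betas) is divisible by spn and that each partition
    holds at least two inputs.
    """
    width = len(betas) // spn
    out = []
    for j in range(len(betas)):
        others_row = betas[:j] + betas[j + 1:]           # sign uses the whole row
        start = (j // width) * width                     # first column of j's partition
        others_local = [betas[k] for k in range(start, start + width) if k != j]
        out.append(sign_product(others_row) * min(abs(b) for b in others_local))
    return out

betas = [2.0, -0.5, 1.5, -3.0]
print(check_update_split_row(betas, spn=1))   # plain MinSum: [0.5, -1.5, 0.5, -0.5]
print(check_update_split_row(betas, spn=2))   # [0.5, -2.0, 3.0, -1.5]
```

With two partitions, the second partition never sees the row's true minimum (0.5), so its magnitudes are overestimated. This is the error source that the threshold method targets.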

MinSum Split-Row Threshold Algorithm
A signal (Threshold_en) is passed from each partition; it indicates whether that partition has a minimum less than a given threshold (T).
Based on the Threshold_en status, the check nodes of each partition take as their minimum either their own local Min or T.
The optimum threshold value (T) is obtained by empirical simulation.
[Figure: worked example in which partitions exchange Threshold_en and replace overly large local minima with T.]
Mohsenin et al.: Asilomar 2008, ICC 2009, ISCAS 2009
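
The thresholding rule described above can be sketched as follows. This is one reading of the slide text, not the authors' pseudocode: a partition raises Threshold_en when its local minimum is at or below T, and a partition whose candidate minimum exceeds T substitutes T whenever any other partition has raised Threshold_en. The equal contiguous partitioning, the unit correction factor, and all names are assumptions.

```python
def sign_product(values):
    """Product of the signs of the given values, as +1.0 or -1.0."""
    s = 1.0
    for v in values:
        if v < 0:
            s = -s
    return s

def check_update_split_row_threshold(betas, spn, threshold):
    """Split-Row MinSum with a threshold applied to local minima (sketch)."""
    width = len(betas) // spn
    parts = [betas[p * width:(p + 1) * width] for p in range(spn)]
    # Threshold_en per partition: its local minimum is at or below T.
    enables = [min(abs(b) for b in part) <= threshold for part in parts]
    out = []
    for j in range(len(betas)):
        p = j // width
        others_row = betas[:j] + betas[j + 1:]
        others_local = [b for k, b in enumerate(parts[p]) if k != j - p * width]
        local_min = min(abs(b) for b in others_local)
        other_en = any(en for q, en in enumerate(enables) if q != p)
        if local_min > threshold and other_en:
            local_min = threshold        # another partition holds a smaller value
        out.append(sign_product(others_row) * local_min)
    return out

betas = [2.0, -0.5, 1.5, -3.0]
print(check_update_split_row_threshold(betas, spn=2, threshold=1.0))
# [0.5, -2.0, 1.0, -1.0]
```

Compared with plain Split-Row on the same inputs, the second partition's outputs drop from 3.0 and 1.5 in magnitude to T = 1.0, closer to the true MinSum value of 0.5.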

Impact of Threshold Selection
[Figures: bit error probability versus threshold value for the (6,32) (2048,1723) LDPC code. One panel compares SNR values from 3.2 to 4.2 dB at a fixed number of decoding iterations; the other compares iteration counts at SNR = 4.2 dB. Both panels mark the optimum at T=.2.]
Optimum threshold (T) is independent of SNR and decoding iteration.

Multi-Split-Row Threshold Decoding
Divide the parity check matrix into Spn (Spn>2) partitions.
Partitioning can be arbitrary so long as there are at least two variable nodes per partition.
Example: for the (6,32) (2048,1723) LDPC code, each check node connects to 32/Spn variable nodes in each partition.

Error Performance for (2048,1723) 10GBASE-T Code
MS Split-Row-2 Threshold is .7 dB away from MS.
MS Split-Row-16 Threshold is .22 dB away from MS and is .2 dB better than Split-Row-2 Original.
[Figure: bit error probability versus SNR (dB) for SPA, MS Normalized, MS Split-Row-2, -4, -8, and -16 Threshold, and MS Split-Row-2 Original decoders.]
Optimum T: Split-2 .2; Split-4 .23; Split-8 .24; Split-16 .24

Check Node Processor: Split-Row (original)
The check node computes the row update equation.
Split-Row takes the MinSum check node processor and breaks it into two or more simpler row processors: the comparator tree is simplified and the number of check node I/Os is reduced, at the cost of at most 3 XOR gates and a couple of sign wires, while significantly reducing interconnections.
Spn = 1 (no split): $\alpha_{ij}^{MS} = S_{MS} \times \prod_{j':\,h_{ij'}=1,\,j'\neq j} \mathrm{sign}(\beta_{ij'}) \times \min_{j':\,h_{ij'}=1,\,j'\neq j} |\beta_{ij'}|$
Spn = 2: $\alpha_{ij}^{MS\,Split\text{-}Row} = S_{MS\,Split\text{-}Row} \times \prod_{j':\,h_{ij'}=1,\,j'\neq j} \mathrm{sign}(\beta_{ij'}) \times \min_{j':\,h_{ij'}^{Split\text{-}Row}=1,\,j'\neq j} |\beta_{ij'}|$
[Figure: check node processor with inputs β_1 ... β_{Wr/Spn}; a comparator tree with L = log2(Wr/Spn) levels producing Min1, Min2, and IndexMin; incoming sign wires SignSp(i-1)_(i) and SignSp(i+1)_(i); outgoing sign wires SignSp(i)_(i+1) and SignSp(i)_(i-1); outputs α_1 ... α_{Wr/Spn}.]
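
The comparator tree depth L = log2(Wr/Spn) shrinks by one level each time the number of partitions doubles. A short sketch for row weight Wr = 32, the row weight of the (6,32) code used throughout; the function name is an illustrative assumption.

```python
import math

def comparator_levels(row_weight, spn):
    """Levels in a binary comparator tree over the Wr/Spn inputs of one partition."""
    inputs = row_weight // spn
    return math.ceil(math.log2(inputs))

for spn in (1, 2, 4, 8, 16):
    print(spn, 32 // spn, comparator_levels(32, spn))
# Spn, inputs per partition, levels:
# 1 32 5
# 2 16 4
# 4 8 3
# 8 4 2
# 16 2 1
```

At Spn = 16 each partition compares only two inputs, which is also the minimum partition size the Multi-Split-Row slide allows.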

Check Node Proc.: Split-Row Threshold
Split-Row's loss of global minima transmission causes poor BER. This can be overcome by comparing a Split-Row partition's minima with a well-chosen threshold.
Small hardware overhead: 5% increase in area, 7% increase in gate count; negligible effect on the local critical path.
Improved BER: .2 dB improvement over original Split-Row-2.
[Figure: pseudocode for the Threshold algorithm (Split-Row-2).]
T. Mohsenin, P. Urard, and B. Baas, "A Thresholding Algorithm for Improved Split-Row Decoding of LDPC Codes," Asilomar Conference on Signals, Systems and Computers (ACSSC), October 2008.

Check Node Proc.: Split-Row Threshold Improved
Considering the second minimum (Min2) requires more complex logic.
Additional hardware includes two comparators and new select-mux logic.
Split-Row-2 Threshold Improved: BER is .7 dB from original normalized MinSum.
Split-Row-16 Threshold Improved: check node processor area is over x smaller than normalized MinSum at half the latency.
[Figure: check node processor datapath with inputs β_1 ... β_{Wr/2}, comparators producing Min1, Min2, and IndexMin, the Threshold and Threshold_en signals, and thresholding logic producing α_1 ... α_{Wr/2}.]
Check node processor synthesis results (65 nm); columns: area (µm²), gate count, delay (ns)
MinSum (MS): 3578 8 2.
MS Split-Row-2 (original): 767 54.4
MS Split-Row-16 (original): 25 85.8
MS Split-Row-16 Threshold Improved: 37 95.9

Check Node Proc.: Multi-Split-Row Threshold Improved

Variable Node Processor
Based on the column update equation: $\beta_{ij} = \lambda_j + \sum_{i':\,h_{i'j}=1,\,i'\neq i} \alpha_{i'j}$
Split-Row leaves this unchanged from the original MinSum and SPA algorithms.
Variable node hardware complexity is mainly reduced via wordwidth reduction.
[Figure: variable node adder with seven 5-bit inputs.]

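Multi-Split-Row Threshold Decoder Physical Layout
Design flow: RTL → synthesis → power and floorplan → placement → clock tree placement → route → post-route optimization.
[Figure: floorplan with 16 partitions arranged in four rows (Sp0 to Sp3, Sp7 to Sp4, Sp8 to Sp11, Sp15 to Sp12), each containing check processors and variable processors.]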

Delay Analysis for Decoders
Path 1: propagation of Threshold_en passing through Spn-2 partitions.
Path 2: delay path through the check and variable processors.
For small Spn, the interconnect delay is dominant because of wire interconnect complexity.
As the number of partitions increases, the Path 1 delay increases.
[Figure: critical path delay (ns), divided into interconnect delay and gate delay, for MinSum, Split-2, Split-4, Split-8, and Split-16.]

Area Analysis for Decoders
In MinSum, the synthesis area deviates significantly from the layout area due to low utilization.
7% of MinSum decoder is empty space for wiring.
[Figure: area breakdown per subblock (check processors, variable processors, clock tree + registers, wire/empty space) for MinSum and Split-Row-16 Threshold; wire (empty space) is labeled 75% for MinSum.]
[Figure: decoder area (mm²), layout versus synthesis, for MinSum, Split-2, Split-4, Split-8, and Split-16.]

Logic Utilization
[Figure: layout close-ups of one block of the MinSum decoder and of the Split-Row-16 Threshold decoder (area not scaled), showing variable processors, check processors, and registers + buffers.]

Comparison of Decoders
(6,32) (2048,1723) 10GBASE-T code with 11 decoding iterations; 65 nm, 7M, 1.3 V.
Columns: MinSum standard | Split-2 Threshold | Split-4 Threshold | Split-8 Threshold | Split-16 Threshold | Split-16 vs. MinSum
Area utilization: 25% | 4% | 83% | 86% | 89% | 3.6x
Area (mm²): 18.2 | 2.2 | 6.8 | 6.2 | 5.2 | 3.5x
Speed (MHz): 30 | 67 | 91 | 192 | 173 | 5.8x
Throughput @ 11 iterations (Gbps): 5.6 | 12.5 | 16.9 | 35.7 | 32.2 | 5.7x
CAD tool CPU time (hours): >78 36 8 5 >5.6x
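
The throughput of a full-parallel decoder follows from the code length, the clock rate, and the number of iterations per codeword. The sketch below assumes one clock cycle per iteration, a 173 MHz clock, and 11 iterations; these values are assumptions chosen because they reproduce the 32.2 Gbps Split-16 figure, and the function name is hypothetical.

```python
def throughput_gbps(code_length_bits, clock_hz, iterations, cycles_per_iteration=1):
    """Decoded bits per second, in Gbps, for a decoder that processes one
    codeword every iterations * cycles_per_iteration clock cycles."""
    cycles_per_codeword = iterations * cycles_per_iteration
    return code_length_bits * clock_hz / cycles_per_codeword / 1e9

# 2048-bit code, assumed 173 MHz clock and 11 iterations.
print(round(throughput_gbps(2048, 173e6, 11), 1))   # 32.2
# Assumed 30 MHz clock for the standard MinSum decoder.
print(round(throughput_gbps(2048, 30e6, 11), 1))    # 5.6
```

Because the code length and iteration count are fixed, the throughput ratio between two decoders equals their clock ratio.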

Power Analysis for Split-Row-16 Decoder
Predicted voltage scaling on ST 65 nm:
0.69 V: 34 MHz, 34mW (6.4 Gbps)
1.2 V: 148 MHz, 444 mW
1.3 V: 173 MHz, 68mW (32 Gbps)
[Figure: voltage-scaling plot with the region of standard operation marked, and a power breakdown under heavy activity among variable nodes, check nodes, clock tree, and DFFs.]

Early Termination for Split-Row-16 Decoder
[Figure: energy and throughput versus SNR at a maximum of 11 iterations and 1.2 V.]
With early termination, high energy efficiency can be achieved over a variety of SNRs.
@ 3.4 dB: 16.1 pJ/bit, 27.5 Gbps
@ 4.4 dB: 6.9 pJ/bit, 64.5 Gbps
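
Energy per bit is power divided by throughput. The sketch below uses the 444 mW power figure reported for 1.2 V together with the two throughput values on this slide; treating the power as constant across SNR is an assumption, and the function name is hypothetical.

```python
def energy_per_bit_pj(power_w, throughput_bps):
    """Energy per decoded bit in picojoules."""
    return power_w / throughput_bps * 1e12

# Early termination raises throughput at high SNR, so each bit costs less energy.
print(round(energy_per_bit_pj(0.444, 64.5e9), 1))   # 6.9
print(round(energy_per_bit_pj(0.444, 27.5e9), 1))   # 16.1
```

At 4.4 dB most codewords converge well before the iteration limit, which is why throughput more than doubles and energy per bit falls accordingly.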

Comparison with Previous Work
Columns: Darabiha [1] | Zhang [2] | Liu [3] | This work (Split-16)
LDPC code: (4,15) (660,484) for Darabiha [1]; (6,32) (2048,1723) for the others
Technology: 130 nm, - | 65 nm, 7M | 90 nm, 8M | 65 nm, 7M
Voltage (V): 1.2 | 1.2 | - | 1.2
Word length (bit): 4 | 4 | 6 | 5
Utilization: 72% | 8% | 5% | 89%
Area (mm²): 7.3 | 5.35 | 9.8 | 5.2
Speed (MHz): 300 | 700 | 27 | 148
Early termination: Yes | Yes | No | Yes
Max iteration (Imax): 5 8 6
Throughput (Gbps): 3.3 | 47.7 | 5.3 | 64.5
Power (mW): 398 | 2800 | - | 444
Energy per bit (pJ/bit): 8. | 58.7 | - | 7.
[Figure: throughput per area (Gbps/mm²) versus energy per bit (pJ/bit) for this work, Zhang [2], and Darabiha [1]; higher performance and smaller energy lie toward this work.]
[1] A. Darabiha et al., JSSC, 2008
[2] Z. Zhang et al., VLSI Symp., 2009
[3] L. Liu et al., TCAS I, 2008

Future of LDPC in Deep Submicron CMOS
New LDPC codes are being studied and constructed, trying to balance theoretical performance and practical hardware realization. However, code theorists generally are not concerned with transistor power and area.
32 nm technology and below present increased restrictions on the freedom of the backend designer, while wire delay is still increasing. Success requires reducing design dependency on low-level optimizations.
The Split-Row technique presents an algorithmic and architectural solution that can be compatible with both future LDPC codes and submicron CMOS technology.
[Figure: H = low-density parity check matrix with N = 20,000 and M = 10,000 (from Information Theory, Inference, and Learning Algorithms, D. MacKay).]
http://www.gtsav.gatech.edu/candle/research.html

Conclusion
Split-Row reduces VLSI interconnect complexity through message passing reduction on the row update. Partitioning reduces the number of connections between check and variable processors. This results in higher silicon utilization and smaller, more efficient layouts.
The Threshold algorithm does not reduce the effectiveness of the original Split-Row: at most two additional Threshold enable wires per row.
The improved Threshold algorithm increases error performance over the original Split-Row. Split-Row-2: .7 dB away from MinSum Normalized. Split-Row-16: .2 dB better than Split-Row-2 original.
Multi-Split Threshold allows full-parallel decoding for high-speed applications with acceptable error performance loss, high energy efficiency, and low area.
@ 1.2 V and SNR = 4.4 dB: 64.5 Gbps, 444 mW, 7 pJ/bit

Acknowledgements
Support: ST Microelectronics; NSF Grant 439 and CAREER award 54697; Intel; SRC GRC Grant 598 and CSR Grant 659; Intellasys; UC Micro; SEM.
Special thanks: Professor Shu Lin.

VLSI Computation Lab (VCL)
Advisor: Professor Bevan Baas. 7 PhD students, 6 MS students, 3 undergraduate students.
Website: http://www.ece.ucdavis.edu/vcl/

Projects in VCL
High performance and high energy efficiency.
Low-density parity-check (LDPC) decoders.
Programmable processors: many-core DSP, AsAP 1 (36 processors) and AsAP 2 (167 processors).
Special-purpose processors: FFT, Viterbi decoder.
Applications: H.264, biomedical applications.
Circuits: dynamic voltage and frequency scaling (DVFS).
Algorithms/architectures: LDPC decoding, network on chip.