Enhancing System Architecture by Modelling the Flash Translation Layer

Robert Sykes, Sr. Dir. Firmware
OCZ Storage Solutions, A Toshiba Group Company
August 2014

Introduction
- This presentation discusses the benefits of creating a standalone software representation of a real Flash Translation Layer (FTL) that will be used in an end product.
- It covers how building this model helps ensure the system meets the workloads set out in the IDEMA specification.
- It also covers how the FTL model gives flexibility in examining the system dynamics for custom workloads, such as specific dirty performance characterization, write amplification (WA) characteristics, garbage collection (GC) and reclamation, and the impact that over-provisioning (OP) has on the system dynamics.

What is an FTL Model?
- The FTL is the only major software component of an SSD.
- Although it interacts heavily with the hardware, the hardware elements can be simulated, resulting in a software-only representation of a complete system.
- A software model of an FTL can therefore be used to analyze the FTL's interactions with its surrounding hardware, helping with the architecture and system design before any RTL has been written.
- The model should be designed so that its code can be easily ported to the real system.
- The model can be built with a compiler on a desktop or laptop, allowing quick simulations and prototyping.

Why Create an FTL Model?
- Allows for early prototyping of FTL algorithms
- Enables easy experimentation
- Proves validity of system-level concepts
- Easier debugging
- Useful to keep and extend for future generations of the FTL
- Better, faster test platform for FTL development

Scope of the FTL Model
- The FTL is bounded by:
  - The HIL (host interface layer) and FIL (flash interface layer)
  - DRAM (and other utilities)
[Diagram: HIL, FTL, FIL, DRAM]

Model Boundaries
- The model attempts to use these same boundaries.
[Diagram: Command Generator, FTL Model, NAND Model, DRAM]

FTL Model
- Compression: packing of data into pages, and the layout of that data on the NAND
- Map table
- Reclaim: testing the reclaim features, including wear levelling and keeping erase counts close to nominal
- Testing different trigger points for GC and wear levelling, i.e. choosing a block with a low erase count, a block with little valid data, or an old block with a growing error rate
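The victim-selection choices listed above can be compared in a model with a few lines of code. The following is a minimal sketch, not the presenter's implementation: the `Block` fields, the policy names and the `pick_victim` helper are all assumptions introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class Block:
    """Hypothetical per-block bookkeeping kept by the model."""
    block_id: int
    valid_pages: int    # pages still holding live data
    erase_count: int    # number of erases so far
    error_rate: float   # observed raw error rate (arbitrary units)


def pick_victim(blocks, policy):
    """Return the block a reclaim pass would choose under the given policy."""
    if policy == "least_valid":      # cheapest to reclaim: least data to move
        return min(blocks, key=lambda b: b.valid_pages)
    if policy == "lowest_erase":     # wear levelling: recycle the least-worn block
        return min(blocks, key=lambda b: b.erase_count)
    if policy == "highest_error":    # move data off a block that is degrading
        return max(blocks, key=lambda b: b.error_rate)
    raise ValueError(f"unknown policy: {policy}")


blocks = [
    Block(0, valid_pages=10, erase_count=500, error_rate=0.001),
    Block(1, valid_pages=60, erase_count=20, error_rate=0.002),
    Block(2, valid_pages=40, erase_count=300, error_rate=0.050),
]
for policy in ("least_valid", "lowest_erase", "highest_error"):
    print(policy, "->", pick_victim(blocks, policy).block_id)
```

Keeping the policy behind a single function makes it easy to rerun the same traffic with a different trigger or selection rule and compare the resulting erase-count spread.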

FTL Model
- Injecting errors into the software tables and recovering from them
- PFail (power fail): testing the recovery mechanism, and analysing how much data can be backed up with different algorithms, DRAM sizes, bank counts etc.
- Accuracy: the model should be as close as possible to the real thing (in terms of number of banks, and hence throughput) to create the most accurate tests

FTL Model Advanced
- The model NAND should model the flash controller too
  - Add channel bottlenecks and latencies for read, write or erase commands
  - How many errors or retries occur before performance is affected? What are the max/average latencies?
- The model host needs to be able to issue many different sequences: 4KB random, 128KB sequential, a mixture of compressible and incompressible data, a mix of slower commands too, and the JEDEC standards for different commands
  - WA can then be analysed
- The model should be able to run on a desktop PC for speed (and regression testing), but also on the development platform for easy cross-platform testing. Consideration of the HW environment should drive the tool development.

FTL Statistics
- The model can store interesting information for validating concepts
[Figure: "FTL Model: WA vs OP, 4K RW". Y-axis: WA (0 to 15). X-axis: millions of host RW (0 to 152). Series: 33.33% OP, with compression; 33% OP, no compression; 7% OP, no compression.]
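As an illustration of how a model can collect a WA statistic, the sketch below counts NAND page programs against host page writes in a toy page-mapped FTL with greedy reclaim, and runs it at two OP levels. It is not the model behind the figure: the geometry, the uniform random single-page workload, the greedy victim choice, the definition of OP as spare capacity divided by user capacity, and the `simulate_wa` name are all assumptions, and compression is not modelled. The warm-up (first fill of the drive) is included in the measurement.

```python
import random


def simulate_wa(op, num_blocks=64, pages_per_block=32, host_writes=20_000, seed=0):
    """Return write amplification = NAND page programs / host page writes."""
    rng = random.Random(seed)
    user_pages = int(num_blocks * pages_per_block / (1 + op))
    # Two blocks are held back (active + one free), so OP must cover them.
    assert user_pages < (num_blocks - 2) * pages_per_block, "OP too small"

    valid = [set() for _ in range(num_blocks)]  # live LBAs stored in each block
    fill = [0] * num_blocks                     # pages programmed in each block
    where = {}                                  # LBA -> block holding its live copy
    free = list(range(1, num_blocks))           # erased blocks
    active = 0                                  # block currently being written
    nand_writes = 0

    def program(lba):
        nonlocal active, nand_writes
        if fill[active] == pages_per_block:     # active block full: open a new one
            active = free.pop()
        old = where.get(lba)
        if old is not None:
            valid[old].discard(lba)             # previous copy becomes stale
        valid[active].add(lba)
        fill[active] += 1
        where[lba] = active
        nand_writes += 1

    def reclaim():
        # Greedy: pick the closed block with the fewest live pages.
        closed = [b for b in range(num_blocks)
                  if b != active and fill[b] == pages_per_block]
        victim = min(closed, key=lambda b: len(valid[b]))
        for lba in list(valid[victim]):
            program(lba)                        # relocation costs extra NAND writes
        fill[victim] = 0                        # erase
        free.append(victim)

    for _ in range(host_writes):
        while len(free) < 2:                    # keep one block spare for relocation
            reclaim()
        program(rng.randrange(user_pages))
    return nand_writes / host_writes


for op in (0.33, 0.07):
    print(f"OP {op:.0%}: WA = {simulate_wa(op):.2f}")
```

Because the counters live inside the model, the same run can also log WA at intervals to reproduce a WA-versus-host-writes curve rather than a single end value.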

Graphical Reclaim Simulation

FTL Model HIL Basics
- Start with a very simple command generator
  - Random LBA accesses
  - Random read/write command mix
  - Random compression ratio
- Then add more flexible tests, traffic patterns etc.
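A very simple command generator of the kind described above might look like the following sketch. The `HostCommand` fields, the parameter names and the 0.1 to 1.0 range for the compression ratio are assumptions made for illustration; a fixed seed is used so that a failing run can be replayed.

```python
import random
from dataclasses import dataclass


@dataclass
class HostCommand:
    op: str                # "read" or "write"
    lba: int               # logical block address
    compress_ratio: float  # compressed/original size; 1.0 means incompressible


def simple_command_generator(num_commands, max_lba, write_fraction=0.5, seed=0):
    """Yield random host commands: random LBA, random op, random compressibility."""
    rng = random.Random(seed)
    for _ in range(num_commands):
        op = "write" if rng.random() < write_fraction else "read"
        yield HostCommand(op=op,
                          lba=rng.randrange(max_lba),
                          compress_ratio=rng.uniform(0.1, 1.0))


commands = list(simple_command_generator(1000, max_lba=1 << 20))
print(commands[0])
```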

FTL Model HIL Advanced
- Create different traffic profiles
- Vary OP: different OPs will affect NAND utilization, WA and performance, and change the bottlenecks
- Compression: testing different compression algorithms, data patterns, and mechanisms for packing data into NAND pages (splitting onto multiple pages, limiting the number of slices per page etc.)
- NAND configuration: when does the number of banks become critical to the traffic profile?
- The model could be used to measure throughput, and with good models the system performance could be estimated

JEDEC standards
Enterprise requirement:

  Transfer size          Share of workload
  512 bytes (0.5k)       4%
  1024 bytes (1k)        1%
  1536 bytes (1.5k)      1%
  2048 bytes (2k)        1%
  2560 bytes (2.5k)      1%
  3072 bytes (3k)        1%
  3584 bytes (3.5k)      1%
  4096 bytes (4k)        67%
  8192 bytes (8k)        10%
  16,384 bytes (16k)     7%
  32,768 bytes (32k)     3%
  65,536 bytes (64k)     3%

The workload shall be distributed across the SSD such that the following is achieved:
1) 50% of accesses to first 5% of user LBA space (LBA group a)
2) 30% of accesses to next 15% of user LBA space (LBA group b)
3) 20% of accesses to remainder of user LBA space (LBA group c)

**JESD218, Solid State Drive (SSD) Requirements and Endurance Test Method
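The size mix and LBA distribution above translate directly into a weighted random draw. The sketch below uses only the percentages quoted on this slide; the function name, the table layout and the choice to return a start LBA plus a size in bytes are assumptions, and alignment of transfers and any other JESD218 requirements are not modelled.

```python
import random

# Transfer sizes in bytes and their share of the workload (percent).
TRANSFER_SIZES = [512, 1024, 1536, 2048, 2560, 3072, 3584,
                  4096, 8192, 16384, 32768, 65536]
SIZE_WEIGHTS = [4, 1, 1, 1, 1, 1, 1, 67, 10, 7, 3, 3]

# (share of accesses in percent, start of range, end of range) where the
# range is expressed as a percentage of the user LBA space.
LBA_GROUPS = [(50, 0, 5),     # group a
              (30, 5, 20),    # group b
              (20, 20, 100)]  # group c


def jedec_enterprise_access(rng, user_lbas):
    """Draw one (start_lba, size_bytes) pair following the mix above."""
    size = rng.choices(TRANSFER_SIZES, weights=SIZE_WEIGHTS, k=1)[0]
    group_weights = [g[0] for g in LBA_GROUPS]
    _, lo_pct, hi_pct = rng.choices(LBA_GROUPS, weights=group_weights, k=1)[0]
    lba = rng.randrange(user_lbas * lo_pct // 100, user_lbas * hi_pct // 100)
    return lba, size


rng = random.Random(42)
samples = [jedec_enterprise_access(rng, user_lbas=1_000_000) for _ in range(20_000)]
share_4k = sum(1 for _, size in samples if size == 4096) / len(samples)
print(f"observed 4k share: {share_4k:.3f}")
```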

Command Generator
Now that we have a specification of what to test, we need to create the specific data. This can be done in two ways:
1) Create a file on the host PC that has the exact data type you want and feed this into the system, or
2) Create a data generator:
   - This is similar to creating a packet header for the type of data you want the system to generate, i.e. a list of requirements for the data, such as LBA address, random data, compression ratio etc.
   - The data is created on the fly. This adds delay while the data is generated, but that's OK as we don't need to time anything until the data has been created.
   - The advantage of this method is that it provides a more powerful tool for quickly simulating different data patterns.
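The second option, a descriptor-driven data generator, can be sketched as follows. The `DataDescriptor` fields and `generate_payload` are hypothetical names. Mixing random bytes with a zero-filled tail is one simple way to approximate a target compression ratio, chosen here as an assumption; a real compressor in the product may respond differently to this pattern.

```python
import random
import zlib
from dataclasses import dataclass


@dataclass
class DataDescriptor:
    """The 'packet header': what the generated data should look like."""
    lba: int
    length: int            # payload length in bytes
    compress_ratio: float  # target compressed/original size, 0.0 to 1.0


def generate_payload(desc, rng):
    """Build a payload on the fly: random bytes (incompressible) followed by
    zeros (highly compressible), in the proportion the descriptor asks for."""
    random_len = int(desc.length * desc.compress_ratio)
    random_part = bytes(rng.getrandbits(8) for _ in range(random_len))
    return random_part + bytes(desc.length - random_len)


desc = DataDescriptor(lba=0, length=65536, compress_ratio=0.5)
payload = generate_payload(desc, random.Random(1))
achieved = len(zlib.compress(payload)) / len(payload)
print(f"target {desc.compress_ratio}, achieved with zlib {achieved:.3f}")
```

Here `zlib` is used only to check the generated buffer; it stands in for whatever compression the modelled system applies.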

FTL Model FIL
- Basic FIL
  - Start with a simple flash queue interface
  - Simple storage of commands
  - Test out synchronous command completion
- Advanced FIL
  - Real flash queue
  - Modelled NAND channels, including latencies
  - Async status completions back to the FTL
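The basic FIL described above (a simple queue, simple storage, synchronous completion) could start out as small as this sketch. The class name, the command tuple layout and the status strings are assumptions; latencies, channels and asynchronous completion are deliberately left out, as they belong to the advanced FIL.

```python
from collections import deque


class SimpleFil:
    """Toy flash interface layer: commands are queued, then completed
    synchronously against a dictionary that stands in for the NAND."""

    def __init__(self):
        self.queue = deque()
        self.nand = {}  # (block, page) -> data

    def submit(self, op, block, page=0, data=None):
        self.queue.append((op, block, page, data))

    def process_one(self):
        """Execute the oldest queued command and return (status, data)."""
        op, block, page, data = self.queue.popleft()
        if op == "program":
            self.nand[(block, page)] = data
            return ("ok", None)
        if op == "read":
            return ("ok", self.nand.get((block, page)))
        if op == "erase":
            for key in [k for k in self.nand if k[0] == block]:
                del self.nand[key]
            return ("ok", None)
        raise ValueError(f"unknown op: {op}")


fil = SimpleFil()
fil.submit("program", block=3, page=0, data=b"hello")
fil.submit("read", block=3, page=0)
print(fil.process_one())
print(fil.process_one())
```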

FTL Model FIL Advanced
- NAND arbitration (round robin, or some other technique based on traffic profile?)
- Size of a RAID block?
- NAND page size is linked to the number of channels; making the page larger affects performance
- Measure WA over extended periods of time and see the influence that different traffic has on it
- Buffer management: management of the host data ring buffers, reclaim buffers, others?
- Cache hits?
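Round-robin arbitration, the first option mentioned above, can be modelled with one pending-command queue per channel and a rotating starting point. This is a minimal sketch with assumed names (`RoundRobinArbiter`, `enqueue`, `grant`); a traffic-aware arbiter could replace `grant` without changing the rest of the model.

```python
from collections import deque


class RoundRobinArbiter:
    """Grant one command at a time, rotating fairly across channels."""

    def __init__(self, num_channels):
        self.queues = [deque() for _ in range(num_channels)]
        self.next_channel = 0  # where the next search starts

    def enqueue(self, channel, cmd):
        self.queues[channel].append(cmd)

    def grant(self):
        """Return (channel, cmd) for the next non-empty channel, or None."""
        n = len(self.queues)
        for offset in range(n):
            ch = (self.next_channel + offset) % n
            if self.queues[ch]:
                self.next_channel = (ch + 1) % n  # resume after the winner
                return ch, self.queues[ch].popleft()
        return None


arb = RoundRobinArbiter(num_channels=3)
arb.enqueue(0, "a0")
arb.enqueue(0, "a1")
arb.enqueue(1, "b0")
grants = [arb.grant() for _ in range(4)]
print(grants)
```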

FTL Model FIL (cont.)
- Test NAND migration
- T_Prog simulation
- Read Retry algorithms

FTL Model DRAM
- Simple DRAM model: could use static arrays to simulate large parts of DRAM
- Advanced features:
  - Create a real system interface
  - Add statistics: how many accesses etc.
  - Optimizations, caching locally in SRAM
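A simple DRAM model with access statistics might be sketched as below, using a preallocated `bytearray` in place of a static array. The class and counter names are assumptions; timing, caching in SRAM and a real system interface are not modelled.

```python
class DramModel:
    """Toy DRAM: a fixed-size byte array plus read/write access counters."""

    def __init__(self, size_bytes):
        self.mem = bytearray(size_bytes)  # zero-initialised backing store
        self.reads = 0
        self.writes = 0

    def _check(self, addr, length):
        if addr < 0 or length < 0 or addr + length > len(self.mem):
            raise IndexError("access outside modelled DRAM")

    def write(self, addr, data):
        self._check(addr, len(data))
        self.mem[addr:addr + len(data)] = data
        self.writes += 1

    def read(self, addr, length):
        self._check(addr, length)
        self.reads += 1
        return bytes(self.mem[addr:addr + length])


dram = DramModel(size_bytes=1 << 20)
dram.write(0x100, b"\x01\x02\x03\x04")
print(dram.read(0x100, 4), dram.reads, dram.writes)
```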

Limitations
- Hard to keep close to the HW design
- Different timings and latencies cause different bugs in the real world
- Hard to get accurate models
- Extra maintenance

Thank You!