Characterization of a PLL circuit used on a 65 nm analog Neuromorphic Hardware System

Internship Report
Aron Leibfried
May 14, 2018

Contents

1 Introduction
2 Phase Locked Loop (PLL)
    2.1 General Information
    2.2 The PLL on DLS 3
3 The PLL-Config Container
4 PLL-Measurements
    4.1 Measurements of the Capacitive Memory Ramp
    4.2 Measurements with the PPU
    4.3 Determine the hang up frequency
    4.4 Problems with the PPU
    4.5 Automated measurement series
    4.6 Frequency and the corresponding error
    4.7 DCO-Frequency
5 Discussion
References

1 Introduction

The HICANN-DLS 3 chip (High Input Count Analog Neural Network with Digital Learning System) is a neuromorphic chip. The goal of a neuromorphic chip is to emulate neural networks as found in the human brain, and the HICANN-DLS 3 implements this on analog hardware. It consists of 32 neurons with a corresponding array of 32x32 synapses. Each synapse has an individual 6-bit weight, which can be changed. These plasticity processes are the foundation of learning models. Learning models can be realized with the PPU (Plasticity Processing Unit), which allows flexible learning rules to be implemented by accessing all of the on-chip memory. The PPU is based on the PowerPC architecture, has a vector unit, and acts as a co-processor to the analog circuits.

To clock the PPU with an adjustable frequency, a PLL (Phase Locked Loop) is used. The PLL provides the main clock of the digital system components and can be configured via JTAG. This internship is about the PLL. One goal is to write a PLL-Config that makes it easy to configure the PLL from Python. Another goal is to investigate the characteristics of the PLL, which includes classifying the occurring jitter. For the experimental part, the v3-baseboard Jack London was used together with Chip 8: Green Bamboo.

2 Phase Locked Loop (PLL)

2.1 General Information

A PLL (Phase Locked Loop) is an electronic circuit used to generate an adjustable clock signal. A control loop compares the incoming frequency $f_{in}$ with the frequency of an internal oscillator. The aim is an output signal $f_{out}$ whose phase is related to the input phase. A simple PLL is shown in figure 1 and consists of four parts.

The Phase Comparator or Phase Frequency Detector (PFD) compares the phase of $f_{in}$ with the phase of the DCO (Digitally Controlled Oscillator) and outputs an error signal, which is proportional to the phase difference. The Loop Filter delivers the control signal for the DCO in order to keep the phase difference small. This can be done by a PID controller. The DCO generates the output signal $f_{out}$ according to the settings from the Loop Filter. The Divider is connected between the DCO and the Phase Comparator and divides the frequency of the DCO by a factor $N \in \mathbb{N}$. Ideally this gives $f_{out} = N \cdot f_{in}$.

[Figure 1: A simple PLL circuit with a Phase Comparator to compare the two incoming frequencies. It is connected to the Loop Filter, which controls the Digitally Controlled Oscillator. The oscillator outputs a constant frequency. A Divider lowers this frequency before it is compared to the reference clock.]
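The interplay of the four parts can be illustrated with a small behavioural model. The following is only a sketch under stated assumptions: a phase-domain toy loop with a PI loop filter that is updated once per reference period. It is not a model of the ADPLL on the chip. The function name simulate_pll, the free-running frequency and the gains are hypothetical and were chosen by hand so that this discrete loop settles.

```python
def simulate_pll(f_ref, n_div, f_free, kp, ki, steps):
    """Toy phase-domain PLL: oscillator frequency after `steps` reference cycles.

    f_ref   reference frequency in Hz
    n_div   divider factor N in the feedback path
    f_free  free-running oscillator frequency in Hz (hypothetical start value)
    kp, ki  proportional and integral gain in Hz per cycle of phase error
    """
    dt = 1.0 / f_ref        # the loop is updated once per reference period
    phase_error = 0.0       # reference phase minus divided oscillator phase, in cycles
    integrator = 0.0        # integral part of the loop filter, in Hz
    f_osc = f_free
    for _ in range(steps):
        # Loop filter: a PI controller turns the phase error into a frequency setting.
        f_osc = f_free + kp * phase_error + integrator
        integrator += ki * phase_error
        # Phase comparator: the error grows while the divided oscillator is slower
        # than the reference and shrinks while it is faster.
        phase_error += (f_ref - f_osc / n_div) * dt
    return f_osc


F_REF = 50e6   # Hz
N_DIV = 20
# Gains picked by hand so that the loop is stable (an assumption of this sketch).
f_locked = simulate_pll(F_REF, N_DIV, f_free=800e6,
                        kp=0.5 * F_REF * N_DIV, ki=0.05 * F_REF * N_DIV, steps=500)
print(f"{f_locked / 1e6:.3f} MHz")
```

Once the phase error has decayed, the integrator alone holds the oscillator at N times the reference frequency, which is the ideal relation stated above.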

2.2 The PLL on DLS 3

The clock generator used on HICANN-DLS 3 is called hs clockgen and was designed by the Technische Universität Dresden [2]. It uses a so-called ADPLL (All-Digital Phase-Locked Loop) as PLL and is shown in figure 2. It contains two independent ADPLLs, each providing three different output frequencies, and a total of four independent clock outputs. Another feature is the BIST (frequency built-in self test), which tests the different frequencies.

[Figure 2: Schematic of the hs clockgen used in the HICANN-DLS 3 chip. It contains two ADPLLs (ADPLL0 and ADPLL1, each with PFD, filter, DCO and the dividers N, P0, P1, P2, M0 and M1) and a total of four configurable output pins (clk_out_0_o to clk_out_3_o). A frequency built-in self test is also implemented. Figure from [2].]

A reference clock with frequency $f_{ref}$ from the FPGA is connected to the ADPLLs. Each ADPLL contains several dividers (see figure 2) to configure the different frequencies, which can be calculated by

$$f_{dco} = P_0 \cdot N \cdot f_{ref}, \tag{1}$$
$$f_{clk\_dco} = f_{dco} / P_2, \tag{2}$$
$$f_{clk\_core0} = f_{dco} / (P_1 \cdot M_0), \tag{3}$$
$$f_{clk\_core1} = f_{dco} / (P_1 \cdot M_1). \tag{4}$$

The possible settings are collected in table 1.

value   N    P0   P1   P2   M0   M1
max     31   4    4    4    31   31
min     1    2    2    2    1    1

Table 1: Possible settings for the ADPLL used on HICANN-DLS 3.

The reference frequency used is $f_{ref} = 50$ MHz. According to [2] it is recommended to keep $f_{dco}$ between 1000 MHz and 2000 MHz, so it is best to set $f_{dco} = 1500$ MHz, which corresponds to $N \cdot P_0 = 30$. This was also verified in section 4.7.
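Equations 1 to 4 together with the limits of table 1 can be written down as a short calculation. The sketch below is only an illustration. The function name adpll_frequencies and the dictionary keys are made up for this example and are not part of the PLL-config container.

```python
F_REF_MHZ = 50.0  # reference clock from the FPGA, as stated in the text

# Allowed ranges from table 1 as (min, max), both inclusive.
RANGES = {"N": (1, 31), "P0": (2, 4), "P1": (2, 4),
          "P2": (2, 4), "M0": (1, 31), "M1": (1, 31)}


def adpll_frequencies(N, P0, P1, P2, M0, M1, f_ref=F_REF_MHZ):
    """Return the ADPLL frequencies in MHz according to equations 1 to 4."""
    settings = {"N": N, "P0": P0, "P1": P1, "P2": P2, "M0": M0, "M1": M1}
    for name, value in settings.items():
        low, high = RANGES[name]
        if not low <= value <= high:
            raise ValueError(f"{name}={value} is outside [{low}, {high}]")
    f_dco = P0 * N * f_ref                      # equation 1
    return {
        "f_dco": f_dco,
        "f_clk_dco": f_dco / P2,                # equation 2
        "f_clk_core0": f_dco / (P1 * M0),       # equation 3
        "f_clk_core1": f_dco / (P1 * M1),       # equation 4
    }


# Example: N * P0 = 30 gives the recommended f_dco of 1500 MHz.
freqs = adpll_frequencies(N=15, P0=2, P1=3, P2=2, M0=4, M1=2)
print(freqs)
```

With these example values the DCO runs at 1500 MHz, clk_dco at 750 MHz, clk_core0 at 125 MHz and clk_core1 at 250 MHz.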

Each of the four output pins can be enabled and connected to the different outputs of the ADPLLs. A bypass is also possible, which puts $f_{ref}$ at the output. This creates a stable environment for digital tests, because the $f_{ref} = 50$ MHz from the FPGA is then used for the whole chip. This mode should not be used for experiments that use the analog part of the chip. Some digital test results (e.g. SRAM) might not be transferable to higher clock frequencies.

As seen in section 4, the digital support circuitry of the chip is driven by $f_{clk\_out\_0}$. The so-called CapMem-Ramp is created with a capacitor and a counter, which is also clocked by the PLL. A current starts flowing into the capacitor when the counter starts, and the voltage across the capacitor can be measured. When the counter value is reached, the capacitor is discharged and the counter is reset. When the counter reaches its counter value a second time, the capacitor is charged again. The CapMem-Ramp frequency is therefore proportional to the frequency at this output. The PPU is also clocked with this frequency.

As mentioned above, the hs clockgen also contains a built-in self test (BIST). It tests the clock generator by counting the cycles of the selected output clock $f_{clk\_out}$ within a specified number of cycles of the reference clock $f_{ref}$. This number is set by a selectable pre-scaler value $p$ as $2^{p+2}$. This leads to the expected counter value

$$\text{counter value} = \frac{f_{clk\_out}}{f_{ref}} \cdot 2^{p+2}. \tag{5}$$

Once the PLL has been configured with the expected counter value, the test starts and the cycles are counted. Both values are then compared within a configurable tolerance range (check range). The included pass/fail checking unit outputs whether the test failed or succeeded.

The PLL can be configured via JTAG. There are 10 configuration registers, each with 32 bits. The instruction register width is 4 bits. It is important to mention that the register numbers and the corresponding JTAG instruction numbers are offset by 3, i.e. register 0 is configured with JTAG instruction 3 [1].

In the default hardware settings, $f_{clk\_core1}$ from ADPLL0 is connected to the clk_out_0 pin (see figure 2), which drives the digital circuitry. This means that after every chip reset the chip runs with this frequency. This default setting causes problems, as seen in section 4.2. The problem can be solved by using the PLL-config container as described in section 3. Because most of the experiments made on this chip ran with ADPLL0 and its clk_core1 output, only this configuration is studied in section 4. Other possible configurations are not covered by this internship report.
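The BIST arithmetic of equation 5 and the register-to-instruction offset can be sketched in a few lines. This is only an illustration. The function names are hypothetical, the pre-scaler limits are taken from table 5, and the mapping assumes that all 10 registers follow the same offset of 3 as register 0.

```python
def bist_expected_counter(f_clk_out, f_ref, prescaler):
    """Expected BIST counter value from equation 5.

    The selected output clock is counted for 2**(p + 2) reference cycles.
    """
    if not 0 <= prescaler <= 15:
        raise ValueError("pre-scaler p must be in 0..15 (table 5)")
    reference_cycles = 2 ** (prescaler + 2)
    return f_clk_out / f_ref * reference_cycles


def jtag_instruction_for_register(register):
    """JTAG instruction number of a configuration register (offset of 3).

    Assumption: the offset stated for register 0 holds for all 10 registers.
    """
    if not 0 <= register < 10:
        raise ValueError("there are 10 configuration registers (0..9)")
    return register + 3


# Example: a 250 MHz output clock, the 50 MHz reference and the standard p = 8.
expected = bist_expected_counter(250.0, 50.0, prescaler=8)
print(expected, jtag_instruction_for_register(0))
```

The largest instruction number under this assumption is 12, which fits into the 4-bit instruction register.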

3 The PLL-Config Container

To change the parameters of the PLL, a Python-based PLL-config container is used. After an instance of this class has been created, the parameters can be changed and exported to the PLL. To configure the PLL, the export data command has to be called after the parameters inside the class have been changed. The import data command reads the configuration back from the hardware. Printing the class shows the current configuration of the PLL. The frequencies function gives information about the frequencies of the ADPLLs.

As mentioned above, the configuration is written to the PLL via JTAG. Until now it is only possible to write to the JTAG via Impact (see ImpactJTAGDriver). In the future it will be possible to do this with an FPGA driver. The driver can be changed by setting the driver value to the preferred driver. By default it is set to the ImpactJTAGDriver.

Name               min   max    default hardware value   ADPLL0-config value
loop filter int    1     31     2                        2
loop filter prop   1     31     8                        8
loop div N         1     31     10                       10
core div M0        1     31     4                        2
core div M1        1     31     2                        1
pre div P0         2     4      2                        2
pre div P1         2     4      3                        3
pre div P2         2     4      2                        2
tune               0     4095   512                      512
dco power switch   0     63     63                       63
open loop          0     1      0                        0
enforce lock       0     1      0                        1
pfd select         0     1      0                        0
lock window        0     1      0                        0
filter shift       0     3      0                        3
disable lock       0     1      0                        0

Table 2: The tunable parameters of the ADPLL in the hs clockgen with min/max possible values and the standard settings on hardware and in the container.

Table 2 lists the parameters of the ADPLL. In the PLL-config container one has to add pll0 or pll1 to change the ADPLL0 or ADPLL1 configuration. The table also contains the standard values of the ADPLLs (hardware and class settings). For ADPLL1 the container holds the same configuration as the hardware, but for ADPLL0 they are different. This is because of a problem with the standard values on hardware, explained in section 4.2.
So if an instance of the PLL-config container is created and the data gets exported to the PLL, the settings of ADPLL0 will change to the default settings held by the PLL-config container! This fixes the problem, because ADPLL0 is connected to the clk_out_0 output by default.

Name                Description
enable clock clk    Enables the output of the pin: 0 for disable, 1 for enable
enable bypass clk   Sets the pin to bypass mode (FPGA clock) by setting it to 1
select adpll clk    Selects which ADPLL should be connected
select clock clk    Selects the ADPLL output: 0 for clk_core0, 1 for clk_core1 and 2 or 3 for clk_dco

Table 3: Configuration parameters of the hs clockgen output pins.

It is also possible to change the configuration of the output pins clk_out_k with k in [0:3]. Each pin has four parameters, collected in table 3. To change a pin, one has to add k, with k being the pin to change. The standard settings of the output pins can be found in table 4.

Output pin   Enabled   Bypass   ADPLL   Clock
0            yes (1)   no (0)   0       clk_core1
1            yes (1)   no (0)   0       clk_core0
2            yes (1)   no (0)   0       clk_dco
3            yes (1)   no (0)   1       clk_core1

Table 4: Hardware settings of the output pins.

To execute the built-in self test, the function self test can be used. It uses the values collected in table 5. It is not recommended to change the check value parameter, as the function calculates the expected value according to equation 5.

Name            Std   Min   Max        Task
pre scaler p    8     0     15         Pre-scaler p, explained in section 2.2
select source   0     0     3          Chooses the output pin to be tested
check range     2     0     15         Tolerance range to accept the results
check value     -     0     2^20 - 1   Expected counter value

Table 5: BIST-function parameters.

Note that self test calls the export data function at the beginning, so parameters that were previously changed on the PLL are overwritten with the configuration held in the PLL-config class. If the test fails the function returns False, otherwise True. If print info is set to True, the function prints the ADPLL used and the corresponding output with its frequency. The counter values are also compared and the result is printed when print info is set to True.

4 PLL-Measurements

Different measurements are now performed to learn more about the behaviour of the PLL. Unless otherwise specified, the standard PLL-config values from table 2 are used.

4.1 Measurements of the Capacitive Memory Ramp

We measure the frequency of the CapMem-Rampout, $f_{CAP}$ (see section 2.2), for different settings of the PLL. $f_{CAP}$ is directly related to $f_{clk\_core1}$. $f_{dco}$ is observed to find a good frequency range. For this measurement $M_0 = M_1 = 31$ and $P_1 = P_2 = 4$ are fixed values. We measure different values of $N$ for a given $P_0 = 2$:

N            1       2      3      4      5      6      7      8      9      10
f_CAP [Hz]   212.5   10.7   17.5   23.3   29.1   35.0   40.8   46.6   52.4   58.3

N            20      21      30      31
f_CAP [Hz]   116.5   122.3   174.8   180.6

Table 6: Measurement of $f_{CAP}$ for $P_0 = 2$ and different $N$.

If the value of $N$ is doubled, the corresponding $f_{CAP}$ should also double. As table 6 shows, this happens for $3 \le N \le 31$. For $N = 1$ we get the maximum CapMem-Rampout frequency of 212.5 Hz (see table 7). The value for $N = 2$ does not fit the expectations either.

We measure different values of $N$ for a given $P_0 = 4$:

N            2      10      15      16      18      19      20      25      30
f_CAP [Hz]   23.3   116.5   174.8   186.4   209.7   212.5   212.5   212.5   212.5

Table 7: Measurement of $f_{CAP}$ for $P_0 = 4$ and different $N$.

Table 7 shows that the CapMem-Rampout frequency has its maximum at 212.5 Hz. For $N \le 18$ we get the expected results, but for $N \ge 19$ $f_{CAP}$ stays at its maximum value. As seen above, $f_{dco}$ works stably for $4 \le P_0 \cdot N \le 72$. This gives the frequency range of the PLL:

$$4 \cdot f_{ref} = 200\ \text{MHz} \le f_{dco} \le 3600\ \text{MHz} = 72 \cdot f_{ref}. \tag{6}$$

The same results can be measured on different chips. Two additional chips were tested: Chip 7: Green Cheese and Chip 3: Indigo Hammer.
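The proportionality argument can be checked directly on the numbers of tables 6 and 7. The sketch below takes the slope from one point inside the recommended range and lists the products $P_0 \cdot N$ whose $f_{CAP}$ follows that slope. The function name and the tolerance of 2 % are assumptions made for this illustration; they are not part of the original evaluation.

```python
# f_CAP in Hz keyed by N, copied from table 6 (P0 = 2) and table 7 (P0 = 4).
TABLE_6 = {1: 212.5, 2: 10.7, 3: 17.5, 4: 23.3, 5: 29.1, 6: 35.0, 7: 40.8,
           8: 46.6, 9: 52.4, 10: 58.3, 20: 116.5, 21: 122.3, 30: 174.8, 31: 180.6}
TABLE_7 = {2: 23.3, 10: 116.5, 15: 174.8, 16: 186.4, 18: 209.7,
           19: 212.5, 20: 212.5, 25: 212.5, 30: 212.5}


def proportional_products(table, p0, slope, tolerance=0.02):
    """Return the sorted products P0*N whose f_CAP is within `tolerance` of slope * P0 * N."""
    result = []
    for n, f_cap in table.items():
        expected = slope * p0 * n
        if abs(f_cap - expected) <= tolerance * expected:
            result.append(p0 * n)
    return sorted(result)


# Slope in Hz per unit of P0*N, taken from N = 10, P0 = 2 (f_dco = 1000 MHz).
SLOPE = TABLE_6[10] / (2 * 10)

print(proportional_products(TABLE_6, 2, SLOPE))
print(proportional_products(TABLE_7, 4, SLOPE))
```

Under this tolerance the proportional points reach from $P_0 \cdot N = 6$ up to 72, while the saturated values at 212.5 Hz and the outlier at $N = 2$, $P_0 = 2$ are excluded.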

4.2 Measurements with the PPU

To get better results and to automate the measurement, the PPU is used. Most of the instructions executed by the PPU take one clock cycle. We used a PPU application to toggle one of the general-purpose input/output (GPIO) pins by setting the output pin to high and low for a specified period of time. Including the time to execute a jump instruction, the PPU can toggle the pin with a period of three clock cycles. By inserting a configurable number $N_{NOP}$ of NOPs we can scale the toggle frequency $f_{PPU}$ by

$$m_{PPU} = 3 + 2 \cdot N_{NOP}, \tag{7}$$
$$f_{clk\_core1} = m_{PPU} \cdot f_{PPU}. \tag{8}$$

With $m_{PPU} = 15$, the trace of the PPU with an unconfigured PLL connected is measured with an oscilloscope. A frequency of $f_{PPU} = 11.11$ MHz is expected, since the hardware defaults of table 2 give $f_{clk\_core1} = 166.67$ MHz. The signal looks gated with a clock signal of around 50 kHz, as seen in figure 3. During a clock-high signal the expected frequency can be observed, as seen on the left side of figure 4. Between the clock-high signals different things can happen: the signal is either low, or it stays high with some peaks to ground, as seen on the right of figure 4. This can easily be fixed by keeping all settings as they are and setting enforce lock to 1. With this setting the PLL never stops and a frequency of 11.11 MHz can be measured. The difference can be clearly seen by comparing figure 3 with figure 5 and figure 4 with figure 6.

[Figure 3: PLL hardware settings. Panel "PPU Output after a power cycle", Volts [V] against Time [ms].]

[Figure 4: PLL hardware settings. Panel "PPU Output after a power cycle (Zoom)", Volts [V] against Time [ms].]

Because of this behaviour, enforce lock should never be set to 0. By default the PLL-container sets this parameter to 1 to fix the problems explained above.

4.3 Determine the hang up frequency

With a value of $m_{PPU} = 7$, some tests with different PLL settings are done. The PLL-config values of table 2 are used and the parameter $N$ is varied to draw conclusions about the PLL. The data is collected in table 8.
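Equations 7 and 8 from section 4.2 can be turned into a small calculation of the expected toggle frequency. The sketch below is only an illustration with made-up function names. It reproduces the 11.11 MHz expected for the hardware defaults and the theoretical values for $m_{PPU} = 7$ with the container values of table 2.

```python
F_REF_MHZ = 50.0


def cycles_per_toggle(n_nop):
    """Clock cycles per toggle period m_PPU, equation 7."""
    return 3 + 2 * n_nop


def toggle_frequency(f_core1_mhz, n_nop):
    """Toggle frequency f_PPU in MHz, equation 8 solved for f_PPU."""
    return f_core1_mhz / cycles_per_toggle(n_nop)


def clk_core1_mhz(n, p0, p1, m1, f_ref=F_REF_MHZ):
    """f_clk_core1 in MHz from equations 1 and 4 of section 2.2."""
    return p0 * n * f_ref / (p1 * m1)


# Hardware defaults of table 2 (N=10, P0=2, P1=3, M1=2) with six NOPs, i.e. m_PPU = 15.
default_toggle = toggle_frequency(clk_core1_mhz(10, 2, 3, 2), n_nop=6)

# Container values of table 2 (P0=2, P1=3, M1=1) with two NOPs, i.e. m_PPU = 7, N varied.
theory = {n: round(toggle_frequency(clk_core1_mhz(n, 2, 3, 1), n_nop=2), 2)
          for n in (1, 2, 3, 4, 6, 8)}

print(round(default_toggle, 2), theory)
```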

[Figure 5: PLL hardware settings with enforce lock = 1. Panel "PPU Output with enforce_lock = 1", Volts [V] against Time [ms].]

[Figure 6: PLL hardware settings with enforce lock = 1. Panel "PPU Output with enforce_lock = 1 (Zoomed)", Volts [V] against Time [ms].]

N              1      2      3       4       6       8
f_PPU [MHz]    -      7-11   12-15   18-20   27-29   38.0-38.1
f_theo [MHz]   4.76   9.52   14.29   19.05   28.57   38.10

N              9           10          12          14      15
f_PPU [MHz]    42.6-43.1   47.5-47.9   57.1-57.3   66-67   -
f_theo [MHz]   42.86       47.62       57.14       66.67   71.43

Table 8: Measurement of $f_{PPU}$ with $m_{PPU} = 7$ for different values of $N$. If there is no entry for $f_{PPU}$, the chip hung up.

As seen in section 4.1, the PLL is not stable for $N = 1$. That is the reason why the chip hangs up with this setting. The other values fit the expectation, but for low $N$ the error is pretty high. For $N \ge 15$ the chip also crashes. This corresponds to $f_{clk\_core1} = 500$ MHz. We can conclude that the chip will hang up if $f_{clk\_core1} \ge 500$ MHz.

4.4 Problems with the PPU

With the high and low output of the PPU one would expect a rectangular signal as seen in figure 6. For some settings, however, we get a different signal, compare figure 7 and figure 8. This measurement is done with $m_{PPU} = 3$, but some of these effects also occur with a higher value of $m_{PPU}$. At a high frequency, for example $f_{clk\_core1} = 250$ MHz, the signal does not look like the rectangular signal it should be (see figure 7). It looks more like a random signal. Maybe the frequency is too high for the PPU, or measuring with the oscilloscope is not the best technique at such high frequencies. This can be fixed with a higher $m_{PPU}$ value. At low frequencies, for example $f_{clk\_core1} = 4.2$ MHz, the signal looks like a rectangular signal (see figure 8). The problem is a bad peak in the middle of the high signal.

[Figure 7: PPU output with $m_{PPU} = 3$: $N = 15$, $M_1 = 2$, $P_0 = 2$ and $P_1 = 3$. Panel "High Frequency - f_clk_core1 = 250 MHz", Volts [V] against Time [ms].]

[Figure 8: PPU output with $m_{PPU} = 3$: $N = 5$, $M_1 = 30$, $P_0 = 2$ and $P_1 = 4$. Panel "Low Frequency - f_clk_core1 = 4.2 MHz", Volts [V] against Time [ms].]

This peak does not make sense, and at even lower frequencies more bad peaks appear. This problem cannot be fixed with a higher $m_{PPU}$ value, so it may be a problem with the power supply or with the PPU itself. Maybe these bad peaks and the bad peaks from figure 4 are related to each other.

4.5 Automated measurement series

To automate the measurement and to classify the jitter of the PLL, a measurement series was done. As seen in section 4.3, it is possible to hang up the chip, and a power cycle is then necessary to run it again. To be sure that the chip runs at a safe operating point, the PLL values were restricted. Based on the previous measurements the area was chosen as $4 \le P_0 \cdot N \le 72$, $1 \le \frac{P_0 \cdot N}{P_1 \cdot M_1} \le 9$ and $N \ge 2$. In total 2528 single measurements were made.

To classify the jitter of the PLL it would be best to measure with no NOPs, because the deviation of the timing gets lower with more operations, since the mean is taken. But classifying the jitter with $m_{PPU} = 3$ is also a problem, see section 4.4. As a compromise $m_{PPU} = 15$ is used.

The oscilloscope can be accessed via an Ethernet connection to collect the trace data. It would be possible to store every signal and to evaluate them all after the measurement. But every trace takes more than 100 MB, so more than 200 GB would be needed, and the evaluation would take a long time. It is more efficient to evaluate every signal directly during the measurement series. For every trace the times of the rising slopes are determined and stored. They can be used to determine the frequency, and with this information the jitter can also be classified by statistical methods.
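The restricted parameter area of this section can be enumerated with a few lines. This is a sketch under assumptions: it reads the second condition as a limit on $f_{clk\_core1} / f_{ref}$, it uses the divider ranges of table 1, and it varies only $N$, $P_0$, $P_1$ and $M_1$. The exact grid used for the 2528 measurements is not given in the text, so the number of settings produced here is not claimed to reproduce it. The function name safe_settings is hypothetical.

```python
from itertools import product


def safe_settings():
    """Enumerate (N, P0, P1, M1) inside the restricted area of section 4.5."""
    settings = []
    # N starts at 2; the other ranges follow table 1.
    for n, p0, p1, m1 in product(range(2, 32), range(2, 5), range(2, 5), range(1, 32)):
        dco_factor = p0 * n        # f_dco / f_ref
        core_divider = p1 * m1     # f_dco / f_clk_core1
        if not 4 <= dco_factor <= 72:
            continue
        # 1 <= P0*N / (P1*M1) <= 9, written with integers to avoid rounding issues.
        if core_divider <= dco_factor <= 9 * core_divider:
            settings.append((n, p0, p1, m1))
    return settings


settings = safe_settings()
print(len(settings), "settings, for example", settings[0])
```

The upper limit of 9 keeps $f_{clk\_core1}$ at or below 450 MHz, below the 500 MHz at which the chip hung up in section 4.3.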

4.6 Frequency and the corresponding error

[Figure 9: $f_{clk\_core1}$ against the corrected $f_{PPU}$ gives a slope of one. Panel "PPU-Output with 15 Ticks - Frequency"; x-axis: PLL-Output f_clk_core1 [MHz] (calculated); y-axis: PPU-Output Frequency [MHz] (15 tick corrected).]

With the data of the measurement series, the PPU frequency $f_{PPU}$ can be determined by dividing one by the measured times and taking the mean. With this information the standard deviation for one measurement can also be calculated with statistical methods. Correcting $f_{PPU}$ by the factor $m_{PPU} = 15$ should give a straight line when plotted against $f_{clk\_core1}$. The results can be seen in figure 9.

Calculating the coefficient of variation $\sigma / \mu$ of the frequency $f_{PPU}$ and plotting it, as done in figure 10, shows that many points have a very small coefficient $\sigma / \mu$. But there are also points with an error that is two orders of magnitude higher. Plotting some characteristic PLL settings shows that small values of $M_1$ cause a high jitter. That is because the $M_1$ divider removes slopes to lower the frequency by its factor. When many slopes are removed, the jitter of a single peak does not matter too much, so the total jitter decreases. With $M_1 = 1$, however, no slopes are removed, so we measure the whole jitter of $f_{dco}$ (we also have to take $P_1$ into account). This is investigated further in section 4.7.
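The evaluation described above, from the stored times of the rising slopes to $\sigma / \mu$, can be sketched with the Python standard library. The function name, the returned keys and the synthetic input are assumptions for this illustration. The original evaluation code is not part of the report.

```python
from statistics import mean, stdev


def frequency_statistics(rising_edges, m_ppu=15):
    """Evaluate one trace from the times of its rising slopes (in seconds)."""
    if len(rising_edges) < 3:
        raise ValueError("at least three rising slopes are needed")
    # Period between consecutive rising slopes.
    periods = [later - earlier
               for earlier, later in zip(rising_edges, rising_edges[1:])]
    # "Dividing one by the measured times and taking the mean".
    frequencies = [1.0 / period for period in periods]
    mu = mean(frequencies)
    sigma = stdev(frequencies)          # sample standard deviation
    return {
        "f_ppu": mu,
        "f_clk_core1": m_ppu * mu,      # equation 8
        "coefficient_of_variation": sigma / mu,
    }


# Synthetic example: an ideal 90 ns toggle period, i.e. about 11.11 MHz.
edges = [k * 90e-9 for k in range(6)]
stats = frequency_statistics(edges)
print(stats)
```

For a real trace the edge times would come from the oscilloscope data. The synthetic input only shows that an ideal signal gives a coefficient of variation close to zero.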

[Figure 10: Coefficient of variation $\sigma / \mu$ of the frequency $f_{PPU}$. Panel "PPU-Output with 15 Ticks - σ/µ"; x-axis: PLL-Output f_clk_core1 [MHz] (calculated); y-axis: coefficient of variation σ/µ (logarithmic); legend: measured values, settings $M_1 = 1$, $M_1 = 2$, $M_1 = 3$.]

4.7 DCO-Frequency

Now we want to classify the jitter for different values of $f_{dco}$. The dividers $P_1$ and $M_1$ reduce this jitter, because they remove many slopes (compare section 4.6). Because of the chosen PLL values it is also not possible to pick fixed $P_1$ and $M_1$ values and vary $P_0$ and $N$ for $f_{dco}$. For every measurement the standard deviation $\Delta\bar{t}$ of the period $\bar{t}$ is determined. The period $\bar{t}$ is just the mean of the measured values (section 4.5). The real error $\Delta t$ can be calculated by error propagation:

$$\Delta t = P_1 \cdot M_1 \cdot \Delta\bar{t}. \tag{9}$$

The data can be seen in figure 11. The marked area is the originally recommended range from [2]. The jitter is pretty low in this area, as it should be. For frequencies lower than 800 MHz the jitter rises and gets pretty big, especially when $N = 2$ ($N = 1$ was not measured, see section 4.3). For higher frequencies the jitter also rises, but it is not too high. It should be possible to use the PLL with an $f_{dco}$ of up to 3500 MHz.

[Figure 11: The jitter $\Delta t$ for different values of $f_{dco}$. Panel "Uncertainty Δt of the rising slope"; x-axis: PLL DCO-Frequency f_dco [MHz] (calculated); y-axis: error of t [ns] (calculated, logarithmic); legend: measured values, settings $N = 2$, recommended area.]

5 Discussion

The PLL-config container works as expected. The only issue is that Impact has to be used as driver, so the chip must be connected to the server via a proprietary programming cable. Sometimes this method locks the cable, and the problem has to be fixed with the Impact shell.

The PLL itself works fine with the standard settings of the container. The only issues are the reset parameters when the chip is restarted. This should change for the next generation of HICANN. It is recommended to configure the PLL with the PLL-Config whenever the chip is restarted. By using the recommended area of $f_{dco}$ (1000 MHz - 2000 MHz), the jitter can be lowered. Also the dividers $P_1$ and $M_1$ shouldn't be too high. With regard to power consumption, a low value of $f_{dco}$ would be preferable. So it should be best to set $f_{dco} = 1000$ MHz, which is also the default PLL-config container value for ADPLL0.

References

[1] Andreas Hartel and Johannes Schemmel. Specification of the HICANN-DLS ASIC. 2018.
[2] Sebastian Höppner and Stefan Scholze. TUD HPSN Clock Generator Specification for HICANN DLS. 2016.