A13C: Performing Digital Filtering on an MCU

Renesas Electronics America Inc.
Kevin P King, Senior Staff Applications Engineer
13 October 2010, Version 1.2

Kevin P King, Senior Staff Application Engineer
Primary tech support for SH2A, focusing on the medical segment and the SH family
Education: Electrical Engineering, University of Lowell (Edward B. Van Dusen Award for Academic Achievement)
Thirty years of embedded design experience (x86, HC05, HC11, 8051, Philips XA, Atmel AVR, Hitachi, Mitsubishi, etc.)
Five years of emulator design for MetaLink: COP8, 68HC05, 68HC11, 8051 (multiple vendors), National CR16, Hitachi H8/500, etc.
Multiple quality awards for embedded software and hardware development
Specialty is embedded system design: MCU firmware and hardware
Working with the GSL Motor Team to define the Global Motor Platform and standards for the Fixed Point Library

Edward B. Van Dusen Award: number one in my class at the University of Lowell. Thirty years: engineering aide on an Intel 4004 design; first actual design was 8085 based; 80386 co-processor for Sun's i386; first 32-bit color graphics card for the Sun SPARCstation I (24 bits color, 8 bits alpha, 65 Hz refresh rate, 1024x1024 resolution on a Sony Trinitron monitor). Product award at ESC for an 8051 emulator design about the size of an 8 mm tape (MetaLink 8051PE). Work on emulators meant I needed to be able to use multiple vendor and third-party tools. Quality awards: the best kind, voted on by my peers based on schedule and conformance to requirements. These had money tied to them also. Letter of commendation from the Army for work on weapons training systems: I was project leader in charge of both hardware and software, passed the HW design review and code review, and delivered the system on time and on budget. Being both HW and SW conversant, when I joined Renesas, motor control was a natural fit. With my extensive background in SH core designs, I now support SH MCUs.

Renesas Technology and Solution Portfolio
Microcontrollers & Microprocessors: #1 market share worldwide*
ASIC, ASSP & Memory: advanced and proven technologies
Analog and Power Devices: #1 market share in low-voltage MOSFET**
Solutions for Innovation
* MCU: 31% revenue basis from Gartner "Semiconductor Applications Worldwide Annual Market Share: Database", 25 March 2010
** Power MOSFET: 17.1% on unit basis from Marketing Eye 2009

In session 110C, Renesas Next Generation Microcontroller and Microprocessor Technology Roadmap, Ritesh Tyagi introduces this high-level image of where the Renesas products fit: the big picture.

Renesas Technology and Solution Portfolio

This is where our session, A13C Performing Digital Filtering on an MCU, is focused within the big picture of Renesas products: microcontrollers and microprocessors.

Microcontroller and Microprocessor Line-up
Superscalar, MMU, Multimedia: up to 1200 DMIPS; 45, 65 & 90 nm process; video and audio processing on Linux; server, industrial & automotive
High Performance CPU, Low Power: up to 500 DMIPS; 150 & 90 nm process; 600 uA/MHz, 1.5 uA standby; medical, automotive & industrial
High Performance CPU, FPU, DSC: up to 165 DMIPS; 90 nm process; 500 uA/MHz, 2.5 uA standby; Ethernet, CAN, USB, motor control, TFT display
Legacy Cores: next-generation migration to RX
General Purpose: up to 10 DMIPS; 130 nm process; 350 uA/MHz, 1 uA standby; capacitive touch
Ultra Low Power: up to 25 DMIPS; 150 nm process; 190 uA/MHz, 0.3 uA standby; application-specific integration
Embedded Security: up to 25 DMIPS; 180, 90 nm process; 1 mA/MHz, 100 uA standby; crypto engine, hardware security

Here are the MCU and MPU product lines. I am not going to cover any specific information on these families; rather, I want to show you where this session is focused.

Microcontroller and Microprocessor Line-up

Basically, filtering is applicable to all of the microcontroller families. Wherever a processor is sampling data, filtering is not far behind. Obviously, the choice of processor is determined by the application and the requirements of the filtering (i.e. real-time, or whenever it gets done).

Innovation: Arc Fault Circuit Interrupter

A good example of innovation is the arc fault circuit interrupter (AFCI). For the AFCI to be reliable, it must look for specific frequencies in the range of 3-5 kHz. All too often we see engineers doing a design like this by playing with the numbers, so to speak; then the product works and we hear that it is "finely tuned", meaning: don't screw with anything in the software, you might break it.

Innovation: using free or low-cost filter design tools, you can implement a robust digital filter and possibly reduce the requirements for complex external analog filtering.

Position
Renesas provides you a complete set of free development tools and a choice of low-cost MCUs with an ADC peripheral. Reliable digital filtering that meets your system requirements can be achieved.

Yes, this is optimized code, but let's try to get code that actually does something.

Agenda
System block diagram, analog filter
FIR vs IIR
Sampling theorem
Anti-aliasing
Oversampling
Triggering skew
ADC interrupt overhead
Decimation
Fixed point and floating point principles
Fixed point vs. floating point benchmark
Summary

Example Filter Applications
Instrument board: sensor, cabling, 60 Hz filter, ADC, MCU
Voice recorder: microphone, 4 kHz low-pass filter, ADC, MCU

This slide shows two common applications of analog filters. In the first case a sensor is cabled to an MCU. The sensor cable will pick up 60 Hz noise, and the filter limits its effect on the ADC reading. In the second example audio is being sampled at 8 kHz. To prevent aliasing, a 4 kHz analog filter is used.

Filter Applications: The Boxcar Filter
Very common to perform a running average
Sum n samples, scale the output (usually divide by n)
Recalculate each time one new sample comes in
Very simple FIR called a boxcar
All coefficients equal to 1
Example: 8 kHz sampling rate, 8 tap FIR

Many people never consider that they are already implementing digital filters. However, the simple running average is itself a digital filter. The boxcar filter can be implemented using only additions and a single divide, because all the coefficients are 1. Its response is controlled by changing the sampling frequency or the number of taps (the number of samples that are averaged). Even if the running average is only used to minimize the effects of spurious samples, it is important to understand what the actual response is; otherwise you may be rejecting a signal unintentionally.
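As a rough illustration of the running average described here, the following sketch keeps a running sum so that each new sample costs one subtract, one add and one divide. The names boxcar_t, boxcar_init and boxcar_update are made up for this example, and the 8 tap length simply follows the slide.

```c
#include <stdint.h>

#define BOXCAR_TAPS 8u  /* samples averaged; a power of two lets the compiler turn the divide into a shift */

/* Hypothetical filter state: not from any Renesas library. */
typedef struct {
    uint16_t history[BOXCAR_TAPS]; /* the last BOXCAR_TAPS samples */
    uint32_t sum;                  /* running sum of history[] */
    uint8_t  index;                /* slot holding the oldest sample */
} boxcar_t;

/* Clear the filter state. */
void boxcar_init(boxcar_t *f)
{
    for (unsigned i = 0; i < BOXCAR_TAPS; i++) {
        f->history[i] = 0;
    }
    f->sum = 0;
    f->index = 0;
}

/* Take one new sample and return the updated average. */
uint16_t boxcar_update(boxcar_t *f, uint16_t sample)
{
    f->sum -= f->history[f->index];   /* drop the oldest sample from the sum */
    f->sum += sample;                 /* add the newest sample */
    f->history[f->index] = sample;    /* remember it for later removal */
    f->index = (uint8_t)((f->index + 1u) % BOXCAR_TAPS);
    return (uint16_t)(f->sum / BOXCAR_TAPS);
}
```

Until the history has filled with real samples the output ramps up from zero, which is worth remembering when the filter is started.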

Filter Types - FIR
Typically the gain = 1
Does not always have decimation
Decimation can be on the front or back end
X[n]: input samples
nd: decimation factor
Y[n]: decimated output
B[n]: coefficients (multiplies)
Z^-1: delay elements (storage array)

This diagram shows a block representation of an FIR filter. This filter also includes a decimation stage on the front end, though this is not always implemented. The FIR has a finite impulse response: once the impulse has moved through the filter there is no feedback to continue the signal. The filter implements a number of taps, which are delay elements or values in a buffer. Each delayed value of the signal is multiplied by a coefficient, and the results of all the multiplications are summed. Since the input is bounded and the coefficients are bounded, the output is bounded, so the filter is stable.

Filter Types - IIR
In addition to a forward path there is a feedback path
X[n]: input samples
Y[n]: output
b_k: feed-forward coefficients (multiplies)
-a_k: feedback coefficients (multiplies)
Z^-1: delay elements (storage array)

The IIR has feedback elements in addition to the forward elements. Since there is feedback, the response is not necessarily bounded. The response is also infinite, since an impulse can continuously feed back and supply the filter with new values. One of the advantages of the IIR is that it requires fewer taps than an FIR to provide a desired filter response.
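To make the feedback path concrete, here is a minimal sketch of one second-order IIR section in direct form. The struct layout and the name iir2_step are assumptions for illustration; the feedback terms are subtracted, matching the -a_k convention on the slide, and a0 is assumed to be 1.

```c
/* Hypothetical second-order IIR section (direct form I), float arithmetic. */
typedef struct {
    float b0, b1, b2;  /* feed-forward coefficients */
    float a1, a2;      /* feedback coefficients (a0 assumed to be 1) */
    float x1, x2;      /* previous two inputs, x[n-1] and x[n-2] */
    float y1, y2;      /* previous two outputs, y[n-1] and y[n-2] */
} iir2_t;

/* Process one input sample and return one output sample. */
float iir2_step(iir2_t *f, float x)
{
    /* forward path minus feedback path */
    float y = f->b0 * x + f->b1 * f->x1 + f->b2 * f->x2
            - f->a1 * f->y1 - f->a2 * f->y2;

    /* shift the delay elements */
    f->x2 = f->x1;
    f->x1 = x;
    f->y2 = f->y1;
    f->y1 = y;
    return y;
}
```

With b0 = 0.5 and a1 = -0.5 (all other coefficients zero), a single impulse of 1.0 produces 0.5, 0.25, 0.125 and so on: the output keeps halving but never reaches exactly zero, which is the infinite impulse response.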

FIR versus IIR*
FIR
Phase-linear
Simple instructions, single loop
Suited for multi-rate (decimation or interpolation allows some calculations to be omitted)
Desirable numeric properties (finite precision can usually be implemented using a lower number of bits)
Possible to implement with coefficients less than 1.0
May require more memory and calculations than the IIR
Some responses are just impractical to implement in FIR
IIR
Less memory and fewer calculations for a given filtering characteristic
Arithmetic errors compounded by feedback
Harder to implement using fixed point
Not as easy to do multi-rate (decimation and interpolation)
Not phase-linear
* http://www.dspguru.com/dsp/faqs/fir/basics and http://www.dspguru.com/dsp/faqs/iir

Designing the Filter
Programs like ScopeFIR, ScopeIIR or WinFilter simplify the task of designing a filter

Designing an FIR filter is very easy with programs like ScopeFIR or WinFilter. These programs allow you to just input the filter parameters, and they provide the coefficients, graphs of the frequency response and other important characteristics. The output shown above is a screenshot from ScopeFIR.

Identifying the Noise
Programs like ScopeDSP allow inputting ADC data and running an FFT

When trying to apply a filter to a real system it is important to understand what the actual frequency content of the signal is and to evaluate the output after filtering. A program like ScopeDSP, which is part of the ScopeFIR package, allows you to input raw data from an ADC or other source and plot the signal. The diagram above is data captured from a 10-bit ADC; the question is what signals are present in the waveform. ScopeDSP allows you to run an FFT on this data. This is shown in the next slide.

Identifying the Noise
The FFT clearly identifies 1 kHz, 2 kHz, 4 kHz and 8 kHz components

The graph above shows the result of applying an FFT to the previous data. From this diagram it is easy to see that the primary components of the previous waveform are 1, 2, 4 and 8 kHz signals of approximately equal power.

Sampling Theorem
Nyquist-Shannon sampling theorem: "If a function x(t) contains no frequencies higher than B hertz, it is completely determined by giving its ordinates at a series of points spaced 1/(2B) seconds apart." [1]
Simply stated: a signal can only be properly sampled if it contains no frequencies greater than one-half the sampling frequency.
Sometimes this is incorrectly stated as: to not lose information you must sample at twice the highest frequency you are concerned with in a signal. The misstatement overlooks that any energy above half the sampling frequency, even at frequencies you are not concerned with, still aliases into the band you are sampling.

1. C. E. Shannon, "Communication in the presence of noise", Proc. Institute of Radio Engineers, vol. 37, no. 1, pp. 10-21, Jan. 1949. Reprinted as classic paper in Proc. IEEE, vol. 86, no. 2 (Feb. 1998).

Aliasing
Problem: record voice data and store it
Limit voice bandwidth to 4 kHz
Sample at 8 kHz
Problem: audio contains energy above 4 kHz
Anti-aliasing filter: adjust the corner for 4 kHz

To properly sample a signal there must usually be an anti-aliasing filter. The anti-aliasing filter should remove any frequencies above the Nyquist frequency. In real applications this is not fully achievable since there are no brick-wall filters. It is a constant trade-off between attenuating the signal information and limiting the out-of-band information. Aliasing is a problem in audio since it creates new frequencies in the passband. Though the ear is not very sensitive to the shape of a waveform, it does easily perceive the different frequencies.
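To see where out-of-band energy ends up, the small helper below folds a tone frequency back into the range from 0 to half the sampling rate. The function name alias_frequency_hz is made up for this sketch, and it assumes an ideal sampler with no anti-aliasing filter at all.

```c
#include <stdint.h>

/* Apparent frequency, in Hz, of a tone at f_hz after sampling at fs_hz.
 * Hypothetical helper: frequencies fold around multiples of fs_hz / 2. */
uint32_t alias_frequency_hz(uint32_t f_hz, uint32_t fs_hz)
{
    uint32_t folded = f_hz % fs_hz;   /* sampling cannot tell f from f + k*fs */
    if (folded > fs_hz / 2u) {
        folded = fs_hz - folded;      /* upper half mirrors back into the lower half */
    }
    return folded;
}
```

For the voice recorder sampled at 8 kHz, a 5 kHz component would be indistinguishable from a 3 kHz tone, and a 6 kHz component from a 2 kHz tone, both well inside the 4 kHz voice band.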

Anti-aliasing filter
-12 dB is only an attenuation to 1/4

This diagram shows the Bode plot and schematic circuit for a 2nd-order 4 kHz anti-aliasing filter. Notice that the signals between 4 kHz and 8 kHz are not completely removed. In fact, at 8 kHz the signal is only attenuated to 1/4 of its amplitude. The real response of the second-order filter shows even more issues.
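For reference, the 1/4 figure follows from the usual conversion between decibels and amplitude ratio. This assumes the dB values refer to voltage amplitude (the 20 log10 convention), which is how Bode plots are normally drawn; L and A are symbols introduced only for this note.

```latex
\[
  A = 10^{-L/20}, \qquad
  L = 12\ \text{dB} \;\Rightarrow\; A = 10^{-0.6} \approx 0.25, \qquad
  L = 20\ \text{dB} \;\Rightarrow\; A = 10^{-1} = 0.1
\]
```

An ideal second-order low-pass rolls off at about 12 dB per octave above its corner, and 8 kHz is one octave above 4 kHz, which is where the 12 dB comes from.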

Anti-aliasing filter
[Figure: response of the 4 kHz filter measured at R1-P, amplitude in dB (-2 to -10) versus frequency in Hz (1 kHz to 10 kHz)]

Here is the actual response of a 4 kHz filter. Notice that the filter starts to affect the amplitude well before 4 kHz. The real properties of an anti-aliasing filter cause issues for high-quality sampling. In audio the signal could be sampled at 48 kHz, but there could still be some energy above the 24 kHz Nyquist frequency. If the sample rate were increased to a higher value, like 96 kHz, then only signals above 48 kHz would cause aliasing. The requirements on the anti-aliasing filter become much simpler.

Frequency Response of 8 Tap 4 kHz Filter
-12 dB line marked on the plot
20 dB attenuation at 8 kHz, compared to 12 dB for the analog filter

The diagram above shows the frequency response of an 8 tap filter designed using ScopeFIR. Notice that the attenuation at 8 kHz is now 20 dB (a ratio of about 0.1), compared to 12 dB for the analog filter. The 6 kHz response is still at about -5 dB, or roughly 0.5 of the passband level, even though we are trying to cut off at 4 kHz.

Improved 4 kHz Filter
By using 14 taps, notice the improved attenuation at 6 kHz

This diagram shows the flexibility of the digital filter. By increasing the number of taps to 14, the attenuation is significantly improved. The passband ripple is less than 2 dB, which is a gain of about 1.2.

Oversampling and digital filtering
Sample at 32 kHz instead of 8 kHz
Only signals at 16 kHz or greater will alias
Could use a simple RC or no anti-aliasing filter

It is possible to eliminate or significantly simplify the anti-aliasing filter by increasing the sampling rate and applying a digital filter. If the sampling rate is increased to 32 kHz, then only signals greater than 16 kHz can alias. The hardware anti-aliasing filter could become a simple RC circuit. Depending on the microphone and the audio source, it might be possible to eliminate the filter altogether.

Oversampling and digital filtering
Decimate results
Store only 1 of 4 samples
Only calculate the filter at 8 kHz

One of the interesting things about digital filters is the process of decimation, or multi-rate filtering. In the example above the sampling rate is increased from 8 kHz to 32 kHz. This allows applying a digital filter to all signals below 16 kHz without aliasing. However, since we only need data at the 8 kHz rate, we only have to calculate the filter at that rate. Since the FIR filter does not have feedback or rely on information from previous calculations, we can calculate the filter at a lower rate than the sampling rate. It is very common in digital filters to run a first filter at a high sample rate and then decimate the data, either by throwing out samples or by never calculating them.
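A minimal sketch of the "never calculate them" approach: the input arrives at the high rate, but the multiply-accumulate loop only runs once per output sample. The function name fir_decimate and its parameters are assumptions for illustration, not part of any Renesas library.

```c
#include <stddef.h>

/* Hypothetical decimating FIR.
 * in:     n_in input samples at the high (oversampled) rate, oldest first
 * coeff:  taps coefficients, coeff[0] applied to the oldest sample in each window
 * factor: decimation factor (4 for 32 kHz in, 8 kHz out)
 * out:    receives one result per 'factor' input samples
 * Returns the number of output samples written. */
size_t fir_decimate(const float *in, size_t n_in,
                    const float *coeff, size_t taps,
                    size_t factor, float *out)
{
    size_t n_out = 0;

    if (taps == 0 || factor == 0 || n_in < taps) {
        return 0;
    }
    /* Step the window start by 'factor': the skipped outputs are never computed. */
    for (size_t start = 0; start + taps <= n_in; start += factor) {
        float acc = 0.0f;
        for (size_t k = 0; k < taps; k++) {
            acc += coeff[k] * in[start + k];
        }
        out[n_out++] = acc;
    }
    return n_out;
}
```

With a factor of 4 the inner loop runs a quarter as often as it would at the full 32 kHz rate, while every input sample still contributes to some output when the filter has at least 4 taps.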

Multi-rate and Decimation
Temperature cannot change more than 1 degree/hour
Required sampling rate for 1 degree logging
Noise with a 1 second period averages out in 4 readings
Sampling rate for noise
[Figure: timeline contrasting the closely spaced sample points needed to average out the noise with the widely spaced sample points needed to log temperature]

ADC Considerations - Skew
Problems: interrupt skew
32 kHz requires sampling every 31.25 us
Software start of the ADC: possibility of sample skew
Other interrupts in the system
Long instructions required to complete
Solutions:
Possible: make the start interrupt the highest system priority
Preferred: use an ADC system that can be triggered by a timer
Some devices may have to loop a timer output to the ADC trigger

Whenever data is being sampled, a primary requirement is that the data is sampled at a constant interval. At slower rates slight skews will not have much effect; however, at 32 kHz the time between samples is only 31.25 us. If the sampling is under software control there will usually be some skew in the sampled data. This will affect the output of the filter. Even if the sampling is run from a timer interrupt, the system must ensure that the timer interrupt is not delayed. This can be accomplished by using interrupt priorities if available, but there may also be conflicts with making the ADC interrupt a high priority in the system. The interrupt must also wait for the current instruction to complete; normally this is a short time, but it can be long for some instructions, such as divide instructions. A better solution is to use an MCU that can directly trigger the ADC with a timer. In some cases this may require looping a timer output back into an ADC trigger input pin. The ADC complete interrupt can then be used to schedule retrieving the data from the ADC unit.

ADC Considerations - Overhead
Problem: interrupt overhead of storing ADC data
Assume the ADC ISR takes 40 cycles: context save + data save and pointer adjust + context restore
Sampling at 32 kHz, the bandwidth needed to store data = 1.28 million cycles per second
Solution: use a DMA controller
4-5 cycles or less per transfer
CPU bandwidth to store data < 200 thousand cycles per second
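The numbers on this slide are simple products, sketched below so they can be redone for other sample rates. The function names are made up for this example, and the CPU clock passed to cpu_load_permille is an assumption supplied by the caller; the slide itself does not state a clock for this case.

```c
#include <stdint.h>

/* Cycles consumed per second when each sample costs a fixed number of cycles. */
uint32_t cycles_per_second(uint32_t cycles_per_sample, uint32_t sample_rate_hz)
{
    return cycles_per_sample * sample_rate_hz;
}

/* Same cost expressed in tenths of a percent of a given CPU clock.
 * cpu_clock_hz is a caller-supplied assumption and must be non-zero. */
uint32_t cpu_load_permille(uint32_t cycles_per_sample,
                           uint32_t sample_rate_hz,
                           uint32_t cpu_clock_hz)
{
    uint64_t used = (uint64_t)cycles_per_sample * sample_rate_hz;
    return (uint32_t)((used * 1000u) / cpu_clock_hz);
}
```

A 40 cycle ISR at 32 kHz gives 1,280,000 cycles per second, and a 5 cycle DMA transfer gives 160,000. On a hypothetical 32 MHz part that is 4% of the CPU against 0.5%.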

ADC Considerations - Benchmark Example
SH7216 allows triggering the ADC from the MTU2 (timer)
DMAC transfers data to the buffer
HW assist to acquire and transfer data to the buffer saves 7% CPU bandwidth

This diagram shows an optimal solution for acquiring the data. The SH7216 allows triggering the ADC from an MTU (timer). When the ADC has completed the conversion, it triggers a DMA transfer which automatically moves the data to the buffer. The half-ready interrupt on the buffer allows the CPU to process one set of data (ping) while still acquiring a second set of data (pong). In a real application running at a 200 kHz sampling rate, the HW assist saved 7% CPU bandwidth compared to standard interrupt and software data control.
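The ping/pong handoff might look something like the sketch below. Everything here is hypothetical: the buffer length, the two ISR hook names and take_ready_half are invented for illustration, and no SH7216 register or driver call is shown. The only idea taken from the slide is that the DMA fills one half of the buffer while the CPU works on the other.

```c
#include <stddef.h>
#include <stdint.h>

#define HALF_LEN 4u   /* samples per half; a real buffer would be much longer */

/* Destination the DMA controller is assumed to write into, end to end. */
static volatile uint16_t adc_buffer[2u * HALF_LEN];

/* -1: nothing ready, 0: first half (ping) ready, 1: second half (pong) ready */
static volatile int ready_half = -1;

/* Hypothetical hooks, assumed to be called from the DMA half-ready
 * and transfer-complete interrupts. */
void dma_half_ready_isr(void)     { ready_half = 0; }
void dma_full_complete_isr(void)  { ready_half = 1; }

/* Called from the main loop. Returns the half that is safe to process,
 * or NULL if no new half has completed. On real hardware the read and
 * clear below should be done with the DMA interrupt masked. */
const volatile uint16_t *take_ready_half(void)
{
    int half = ready_half;

    if (half < 0) {
        return NULL;
    }
    ready_half = -1;
    return &adc_buffer[(size_t)half * HALF_LEN];
}
```

The processing of one half has to finish before the DMA wraps around and starts overwriting it, which sets the upper limit on the work done per half buffer.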

Calculating the Filter
Design a 4 kHz, 8 tap, lowpass filter
Sampling rate: 32 kHz
Passband: 4 kHz
Stopband: 8 kHz
Stopband attenuation: 12 dB specified, 20 dB actual
Passband ripple: 2 dB specified, 0.76 dB actual
Coefficients:
-0.074778857796693535
0.020358522095065112
0.200149797853876850
0.366925297165379800
0.366925297165379800
0.200149797853876850
0.020358522095065112
-0.074778857796693535

The parameters for a 4 kHz FIR filter are shown. Notice that all the coefficients are fractional values. This is going to complicate the math.

Implementing the Filter
Could calculate the filter as:

    result = 0;
    for (index = 0; index < taps; index++) {
        result += data[index] * coeff[index];
    }

The problem is that the coefficients are all fractional values.
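Written out as a complete function, the loop might look like the sketch below when floating point is available. The coefficient values are the eight from the 4 kHz filter design, rounded to single precision; the names fir_coeff and fir8_float are made up for this example.

```c
#include <stddef.h>

#define FIR_TAPS 8u

/* Coefficients of the 8 tap, 4 kHz low-pass design, rounded to float. */
static const float fir_coeff[FIR_TAPS] = {
    -0.074778858f,  0.020358522f,  0.200149798f,  0.366925297f,
     0.366925297f,  0.200149798f,  0.020358522f, -0.074778858f
};

/* Compute one filter output from the FIR_TAPS most recent samples. */
float fir8_float(const float *data)
{
    float result = 0.0f;

    for (size_t index = 0; index < FIR_TAPS; index++) {
        result += data[index] * fir_coeff[index];
    }
    return result;
}
```

Feeding a constant input of 1.0 returns the sum of the coefficients, about 1.025, which is the DC gain of this particular design.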

Options to Calculate the Filter
Use an MCU with an FPU
  R32C: 32-bit CISC, general purpose, up to 50 MHz
  SH2A (like the SH7216): high-performance RISC, up to 200 MHz
  RX600: high-performance CISC, up to 100 MHz
Use floating point libraries
  Can be very slow
Use fixed point math
  A little more complicated than floating point

There are a few different options for calculating the filter. If the MCU has an FPU, this simplifies the math considerably. If the MCU does not have an FPU and a float calculation is used, libraries will be needed. This will typically be very slow and is usually not practical. The best approach for an MCU without a floating point unit is to use fixed point math.

Floating Point Numbers
[Figure: single-precision bit layout. Bit 31: sign (S); bits 30-23: 8-bit exponent; bits 22-0: 23-bit significand (implied leading 1), with bit weights 2^-1 to 2^-23 to the right of the radix point]

Floating point value: $(-1)^{S} \times (1 + \text{fraction}) \times 2^{\,e - \text{bias}}$
The exponent is expressed in biased form: e = E + bias
Precision is a function of the fraction bits
Floating point supports a very large dynamic range

Parameter         Single Precision   Double Precision
Total bit width   32 bits            64 bits
Sign bit          1 bit              1 bit
Exponent field    8 bits             11 bits
Significand       23 bits            52 bits
Precision         24 bits            53 bits
Bias              +127               +1023
Emax              +127               +1023
Emin              -126               -1022

This slide shows the characteristics of a floating point number. One reason library math is slow is that each number must be parsed and normalized in order to add and multiply.
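The "parsing" a software library has to do can be seen by pulling the three fields out of a single-precision value. This sketch assumes the C float type is a 32-bit IEEE 754 value, which holds on the usual MCU toolchains but is not guaranteed by the C standard; float_fields_t and float_decode are names invented here.

```c
#include <stdint.h>
#include <string.h>

/* The three fields of a single-precision number. */
typedef struct {
    uint32_t sign;      /* 1 bit: 0 = positive, 1 = negative */
    int32_t  exponent;  /* unbiased exponent E = e - 127 */
    uint32_t fraction;  /* 23 stored significand bits (implied 1 not included) */
} float_fields_t;

/* Split a float into sign, exponent and fraction.
 * Assumes sizeof(float) == 4 and IEEE 754 single-precision layout. */
float_fields_t float_decode(float value)
{
    uint32_t bits;
    float_fields_t f;

    memcpy(&bits, &value, sizeof bits);        /* reinterpret without aliasing problems */
    f.sign     = bits >> 31;                   /* bit 31 */
    f.exponent = (int32_t)((bits >> 23) & 0xFFu) - 127;  /* bits 30-23, bias removed */
    f.fraction = bits & 0x7FFFFFu;             /* bits 22-0 */
    return f;
}
```

For example, -6.5 is -1.625 x 2^2, so it decodes to sign 1, exponent 2 and fraction 0x500000 (0.625 x 2^23). A library add or multiply has to do this unpacking, operate on the pieces, then renormalize and repack, all in software.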

Floating Point Hardware
Single precision: minimum normalized value = ~1.18 x 10^-38, maximum value = ~3.4 x 10^38
[Figure: single-precision bit layout. Bit 31: sign (S); bits 30-23: 8-bit exponent; bits 22-0: 23-bit significand (implied leading 1), bit weights 2^-1 to 2^-23 after the radix point]
Double precision: minimum normalized value = ~2.2 x 10^-308, maximum value = ~1.8 x 10^308
[Figure: double-precision bit layout across two 32-bit words. Upper word: bit 63 sign (S), bits 62-52 11-bit exponent, bits 51-32 the first 20 of 52 significand bits (weights 2^-1 to 2^-20); lower word: bits 31-0 the remaining 32 of 52 significand bits (weights 2^-21 to 2^-52); implied leading 1]

Notice that the precision for single-precision floats is quite good. It is precise enough for most filter calculations.

Fixed Point
A fractional value is shifted (multiplied) by a value to make an integer
Example: represent 19.78 using 16-bit fixed point
  1 bit for the sign
  19 requires 5 bits in binary
  10 bits left to represent the fraction
  Multiply the value by 1024 (shift left 10)
  Could allocate more bits for the integer and fewer for the fraction
  Bit weights: S, 2^4 2^3 2^2 2^1 2^0 (radix point) 2^-1 2^-2 ... 2^-10
Example: calculate a 4 tap box filter using fixed point
  Assume ADC samples are 0x100 (256), 0x200 (512), 0x120 (288), 0x150 (336)
  Coefficients are all 0.25
Solution
  Scale the coefficients to be integers by multiplying by 4 (shift left 2)
  Multiply the coefficients by the ADC values: 1*0x100 + 1*0x200 + 1*0x120 + 1*0x150 = 0x570 (1392)
  Restore proper scaling (shift right 2) = 0x15C (348)

A second choice is to use fixed point math. In this case the number is represented by a sign bit, an integer portion and a fractional portion. After allocating one bit for the sign and the appropriate number of bits for the integer, the other bits are left to represent the fractional component. Notice that in this case, even allowing 10 bits for the fraction, the value is not represented exactly: 19.78 x 1024 = 20254.72, which must be rounded to an integer. The calculation of an FIR filter using fixed point is shown in the second example.
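Both examples on this slide can be checked with a few lines of C. The sketch below is only an illustration: to_q10 and box4_fixed are names made up here, and rounding to nearest in to_q10 is my choice, since the slide does not say whether to round or truncate.

```c
#include <stdint.h>

#define Q_FRAC_BITS 10   /* 1 sign bit, 5 integer bits, 10 fraction bits */

/* Convert a real value to 16-bit fixed point with 10 fraction bits,
 * rounding to the nearest step of 1/1024. The caller must keep the
 * value within roughly -32.0 to +31.999. */
int16_t to_q10(double value)
{
    double scaled = value * (double)(1 << Q_FRAC_BITS);
    return (int16_t)(scaled >= 0.0 ? scaled + 0.5 : scaled - 0.5);
}

/* 4 tap box filter: coefficients of 0.25 scaled by 4 become the integer 1. */
uint16_t box4_fixed(const uint16_t *samples)
{
    const uint32_t coeff_scaled = 1u;   /* 0.25 shifted left by 2 */
    uint32_t acc = 0;

    for (unsigned i = 0; i < 4u; i++) {
        acc += coeff_scaled * samples[i];
    }
    return (uint16_t)(acc >> 2);        /* shift right 2 restores the scaling */
}
```

to_q10(19.78) gives 20255, which reads back as 20255 / 1024 = 19.7802734375, so the representation is off by about 0.0003. box4_fixed on the four ADC samples from the slide returns 0x15C (348).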

Precision Requirements
How many bits of coefficient are required?
Do not want round-off error to cause an LSB error
For a 10-bit ADC, need 10 bits of coefficient
Each tap could accumulate error
Additional bits depend on the number of taps
  8 taps add 3 bits
  16 taps add 4 bits
  Etc.

Often people are not sure how many bits of precision they need to carry. Will a 16-bit integer be sufficient to calculate a 16 tap FIR with a 12-bit ADC? The required number of bits is not that hard to calculate in this case, since the ADC values are integer values. To ensure that there is no significant error on the multiplication of the coefficient and the ADC value, the coefficient needs at least as many bits of precision as the ADC. For example, with a 10-bit ADC, if the coefficients used 10 bits the multiplication error would not reach 1 LSB. Since we will be summing the results, we also do not want the accumulated error to reach 1 LSB; to ensure this we need to consider how many additions, or taps, will be used. The extra bits required are enough to express the number of taps in binary. Since most FIR filters have an overall gain of 1, no overflow checks are required. Using a 10-bit ADC and 13 bits of coefficient, the result will always fit into a signed 23-bit format. So you can see that in most cases 16-bit precision is more than adequate.

Pop Quiz
Assuming: 12-bit ADC, 7 tap FIR filter
QUESTION: Is 16-bit fixed point enough resolution?
Answer: 7 taps need the same 3 extra bits as 8 taps, for a total of 15 bits. Don't forget the sign bit: 16 bits total.
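The rule of thumb used in the quiz (ADC bits, plus enough bits to count the taps, plus a sign bit) can be written down directly. The function names below are made up for this sketch, and it encodes only that rule of thumb, not a rigorous error analysis.

```c
/* Extra bits needed so the round-off accumulated over 'taps' additions
 * stays below 1 LSB: the smallest n with 2^n >= taps. Assumes taps is
 * far below the width of unsigned int. */
unsigned extra_bits_for_taps(unsigned taps)
{
    unsigned bits = 0;

    while ((1u << bits) < taps) {
        bits++;
    }
    return bits;
}

/* Coefficient width, including the sign bit, for a given ADC resolution. */
unsigned coeff_bits_with_sign(unsigned adc_bits, unsigned taps)
{
    return adc_bits + extra_bits_for_taps(taps) + 1u;
}
```

For the quiz, coeff_bits_with_sign(12, 7) gives 12 + 3 + 1 = 16. The same 12-bit ADC with 16 taps would need 17 bits, one more than a 16-bit word provides.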

Some Benchmark Results
Using the M16C/65 (16-bit, 32 MHz MCU)
8 tap filter: 280 cycles (35 cycles per tap)
22 tap filter: 780 cycles (about 35 cycles per tap)
8 taps at 8 kHz = 2.2 million cycles per second (approximately 7% of the bandwidth at 32 MHz)
Each tap calculation requires: multiply, sum, two pointer increments

The M16C RMPA instruction does a word multiply-accumulate with a 4 cycle overhead and 9 cycles per loop. The R32C RMPA has an 11 cycle overhead and 1.5 cycles per loop.

A MAC Really Helps
Really need a MAC
M16C/R32C/RX have RMPA (software MAC)
  M16C 8 tap filter: 120 cycles
  M16C 22 tap filter: 240 cycles
  R32C 8 tap filter: 32 cycles
  R32C 22 tap filter: 65 cycles
  RX 8 tap filter: ~47 cycles
A device like the SH7216 can perform a long MAC plus pointer increment in 4 cycles
The SH7216 can also do a 1 cycle floating point MAC (without pointer increment)

Circular Buffer Bottleneck
Most DSPs can handle circular buffers; MCUs typically do not
Inefficient to put a pointer check in the loop
[Figure: classical implementation, data X0 X1 X2 X3 against coefficients C0 C1 C2 C3, shifting to X1 X2 X3 X4 when new data arrives; circular buffer, new data X4 overwrites X0, giving X4 X1 X2 X3 against C0 C1 C2 C3]

One of the issues with trying to use a MAC with quick loops in an MCU is that the data will typically be placed in a circular buffer, so the data values no longer line up with their coefficients. One of the nice features of a DSP is that most of them will handle a circular buffer without any additional code overhead. With an MCU, however, you would typically have to add code in the loop to check and adjust when the data pointer wraps. This slows down the loop calculation and creates problems when using MAC instructions with automatic pointer incrementing. This slide shows the classical implementation of an FIR filter: the data is shifted along a delay line and multiplied by the coefficients. If a circular buffer were used instead, the data value for X4 would replace the data value at X0. The pointer values passed to the routine could start at C0 and X1; however, the wrap of the buffer must still be handled.

Double Coefficient Loops
Loop 1: data X0 X1 X2 X3; coefficients C0 C1 C2 C3 C0 C1 C2 C3
Loop 2: data X4 X1 X2 X3; coefficients C0 C1 C2 C3 C0 C1 C2 C3
Loop 3: data X4 X5 X2 X3; coefficients C0 C1 C2 C3 C0 C1 C2 C3

This slide shows how a double set of coefficients can be used to keep the pointer math linear. There is a little overhead in passing the starting pointer, but it simplifies the loop, where a wrap check would cost quite a bit more.
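One way to code the double coefficient idea is sketched below. The names cfir_t, cfir_init and cfir_step are invented for this example, the 4 tap length matches the slide, and C0 is paired with the oldest sample as in the diagram. The wrap is handled once per sample when the new data is stored, so the multiply-accumulate loop itself runs over two plain linear arrays.

```c
#include <stdint.h>

#define TAPS 4u

/* Hypothetical filter state for the double coefficient scheme. */
typedef struct {
    int32_t  coeff2[2u * TAPS]; /* two back-to-back copies: C0..C3 C0..C3 */
    int32_t  data[TAPS];        /* circular sample buffer */
    uint32_t oldest;            /* index of the oldest sample (next to be overwritten) */
} cfir_t;

/* Store the coefficients twice and clear the sample buffer. */
void cfir_init(cfir_t *f, const int32_t *coeff)
{
    for (uint32_t i = 0; i < TAPS; i++) {
        f->coeff2[i] = coeff[i];
        f->coeff2[i + TAPS] = coeff[i];
        f->data[i] = 0;
    }
    f->oldest = 0;
}

/* Store one new sample and compute one output. */
int32_t cfir_step(cfir_t *f, int32_t sample)
{
    const int32_t *c;
    int32_t acc = 0;

    f->data[f->oldest] = sample;            /* new data replaces the oldest value */
    f->oldest = (f->oldest + 1u) % TAPS;    /* the only wrap check, outside the loop */

    /* Pick the starting coefficient so that C0 lands on the oldest sample. */
    c = &f->coeff2[(TAPS - f->oldest) % TAPS];

    for (uint32_t i = 0; i < TAPS; i++) {   /* both pointers simply increment */
        acc += f->data[i] * c[i];
    }
    return acc;
}
```

With coefficients 1, 2, 3, 4 and inputs 10, 20, 30, 40, 50 the fourth output is 300 and the fifth is 400, the same as shifting the data along a classical delay line. The cost is one extra copy of the coefficient table in memory.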

IIR Filters

IIR
Since round-off error in the output feeds back, IIR requires greater precision
16-bit precision is typically sufficient for FIR
IIR requires 32-bit precision [1]
Floating point simplifies the math

1. The Scientist and Engineer's Guide to Digital Signal Processing, copyright 1997-1998 by Steven W. Smith. For more information visit the book's website at www.dspguide.com

Why use IIR
Design a 5 kHz bandpass filter
Sampling rate: 44 kHz
Center frequency: 5 kHz
Passband: 1 kHz
Stopband attenuation: 40 dB
Passband ripple: 2 dB
FIR filter requires 59 taps
IIR filter only requires 17 taps (13 non-zero)
Forward coefficients: 1, 0, -4, 0, 6, 0, -4, 0, 1
Feedback coefficients: -0.9027953874, 5.5279871696, -16.3895992764, 29.9415524963, -36.6655508659, 30.7172057969, -17.2497536574, 5.9688037639
[Figure: filter frequency response, magnitude in dB (+20 to -100) versus frequency in kHz (0 to 22.5)]

Some Benchmark Results
Calculating the previous filter
Using R32C with FPU: 150 cycles (3 us @ 50 MHz), 12% of bandwidth if run @ 44 kHz
SH2A with FPU: 94 cycles (0.47 us @ 200 MHz), 2% of bandwidth if run @ 44 kHz
Tools like the SH2A Signal Processing Library (SPL) help simplify the calculations

The SPL is available for the RX and R32C families as well.

Summary
System block diagram, analog filter
FIR vs IIR
Sampling theorem
Anti-aliasing
Oversampling
Triggering skew
ADC interrupt overhead
Decimation
Fixed point and floating point principles
Fixed point vs. floating point benchmark

Questions?

Thank You!

Appendix: Additional Information

Resources
ScopeFIR and ScopeDSP: http://www.iowegian.com/
http://www.dspguru.com/
The Scientist and Engineer's Guide to Digital Signal Processing, copyright 1997-1998 by Steven W. Smith. For more information visit the book's website at www.dspguide.com
C. E. Shannon, "Communication in the presence of noise", Proc. Institute of Radio Engineers, vol. 37, no. 1, pp. 10-21, Jan. 1949. Reprinted as classic paper in Proc. IEEE, vol. 86, no. 2 (Feb. 1998).
WinFilter: http://www.winfilter.20m.com

Renesas Electronics America Inc.