A fast Event Preprocessor for the Simbol-X Low-Energy Detector

T. Schanz, C. Tenzer, E. Kendziorra, A. Santangelo

Kepler Center for Astro and Particle Physics, Institute for Astronomy and Astrophysics, Eberhard Karls University of Tübingen, Sand 1, 72076 Tübingen, Germany

E-mail contact: schanz@astro.uni-tuebingen.de, tenzer@astro.uni-tuebingen.de, kendziorra@astro.uni-tuebingen.de, santangelo@astro.uni-tuebingen.de; web: http://astro.uni-tuebingen.de

Space Telescopes and Instrumentation 2008: Ultraviolet to Gamma Ray, edited by Martin J. L. Turner, Kathryn A. Flanagan, Proc. of SPIE Vol. 7011, 70112V, (2008), doi: 10.1117/12.789410

ABSTRACT

The Simbol-X [1] Low Energy Detector (LED), a 128 x 128 pixel DEPFET (Depleted Field Effect Transistor) array, will be read out very fast (8000 frames/second). This requires very fast onboard preprocessing of the raw data. We present an FPGA (Field Programmable Gate Array) based Event Preprocessor (EPP) which can fulfill these requirements. The design is developed in the hardware description language VHDL (VHSIC Hardware Description Language) and can later be ported to an ASIC (Application Specific Integrated Circuit) technology. The EPP performs a pixel-related offset correction and can apply different energy thresholds to each pixel of the frame. It also provides a line-related common-mode correction to reduce noise that is unavoidably caused by the analog readout chip of the DEPFET. An integrated pattern detector can block all invalid pixel patterns. The EPP has an internal pipeline structure and can perform all operations in realtime (< 2 µs per line of 64 pixels) with a base clock frequency of 100 MHz. It uses a fast median-value detection algorithm for common-mode correction and a new pattern scanning algorithm to select only valid events. Both new algorithms were developed during the last year at our institute.

Keywords: Simbol-X, LED, LEDA, LEDE, EPP, Event Preprocessor, IAAT

1. INTRODUCTION

We present an FPGA-based Event Preprocessor (EPP) for the onboard data preprocessing of the Simbol-X low-energy detector (LED). Simbol-X is a formation flying next-generation X-ray telescope of CNES (Centre National d'Études Spatiales) and the Italian Space Agency (ASI). The instrument consists of one mirror module and an X-ray camera containing two stacked detectors: the HED, a 128 x 128 pixel CdTe high energy detector (5-100 keV) from CEA (Commissariat à l'Énergie Atomique - Saclay, France), and the LED [2], a 128 x 128 pixel DEPFET matrix for low energies (0.5 to 17 keV) provided by MPE (Max-Planck-Institut für extraterrestrische Physik, München - Garching, Germany) and MPI-HLL (Halbleiterlabor des Max-Planck-Instituts, München - Neu Perlach, Germany).

The Low Energy Detector assembly (LEDA) consists of four detector quadrants and their dedicated front-end electronics. Each quadrant is connected via bond wires to four ASICs: two first-stage amplifier chips (CAMEX), which are connected alternately to pairs of lines, and the gate and clear switchers. Figure 2 shows a block diagram of the LEDA and its associated electronics. The 64 analogue pixel charge values that are read out in parallel for each line are transmitted serially from the CAMEX to a 14 bit ADC. The resulting raw energy values are then transmitted to the Event Preprocessor (EPP), where several data corrections are applied and valid events (see below) are identified and passed on. The output of each quadrant's EPP is in turn delivered to an interface controller (IFC), which has a serial (SpaceWire) connection to the central payload processor (DPDPA).

Figure 1. The EPP Xilinx Virtex4 prototyping board from IAF GmbH Braunschweig. The EPP is hosted inside a Virtex4 FPGA (Field Programmable Gate Array) during development and will later be transferred into an ASIC (Application Specific Integrated Circuit) technology.

The EPP is an FPGA-based design written in the hardware description language VHDL. VHDL is well suited to the development because design changes can be applied quickly and the results can be observed in computer simulations before any hardware synthesis is carried out. VHDL also keeps the design portable across FPGA and ASIC technologies, so the EPP-design can later be adapted to a radiation-hard ASIC technology.

The readout speed of the Simbol-X LED of about 8000 frames per second requires very fast data processing inside the EPP, which led to a multistage pipeline concept. Each frame of an LED quadrant contains information about the energy (14 bit), the position (12 bit) and the time (32 bit) for each of its 4096 pixels. The raw data of each quadrant therefore form a constant data stream of about 460 Mbit per second. The EPP-design is currently hosted in a Xilinx Virtex4 XC4VSX35/10c FPGA (see also Figure 1) and consumes roughly 38% of the resources of this chip. The FPGA is clocked with a base frequency of 100 MHz, resulting in a clock cycle time of 10 ns. The data volume of the LED requires a processing speed of one pixel every 30 ns, which in turn calls for a heavily pipelined design in order to meet the specifications.

The main purpose of the EPP is data correction and data reduction. Most of the information that the LED and its associated electronics will generate is artificial (noise) and of no scientific use. These data will be filtered out by the EPP to prevent unnecessary telemetry. Furthermore, several correction tasks must be applied to the data and are implemented in the EPP-design.
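The timing and data-rate figures quoted above can be cross-checked with a few lines of arithmetic. The sketch below is not part of the EPP design. It assumes a 64 x 64 pixel quadrant, the 2 µs line time and 100 MHz clock stated in the text, and it counts only the 14 bit energy word per pixel, which is the assumption under which the quoted value of about 460 Mbit per second is reproduced. All variable names are illustrative.

```python
# Cross-check of the timing and data-rate figures (illustrative names only).
PIXELS_PER_LINE = 64
LINES_PER_QUADRANT = 64
QUADRANTS = 4
ENERGY_BITS = 14              # assumption: only the energy word is counted
LINE_TIME_NS = 2000           # upper limit per line stated in the text
CLOCK_PERIOD_NS = 10          # 100 MHz base clock
NOMINAL_FRAME_RATE_HZ = 8000

pixels_per_quadrant = PIXELS_PER_LINE * LINES_PER_QUADRANT      # 4096
frame_time_ns = LINES_PER_QUADRANT * LINE_TIME_NS               # 128000 ns = 128 us
frame_rate_hz = 1e9 / frame_time_ns                             # 7812.5
pixel_time_ns = LINE_TIME_NS / PIXELS_PER_LINE                  # 31.25
cycles_per_pixel = pixel_time_ns / CLOCK_PERIOD_NS              # 3.125

quadrant_rate_bps = pixels_per_quadrant * ENERGY_BITS * NOMINAL_FRAME_RATE_HZ
led_rate_bps = QUADRANTS * quadrant_rate_bps

print(f"frame rate at 2 us per line: {frame_rate_hz} frames/s")
print(f"time budget per pixel: {pixel_time_ns} ns ({cycles_per_pixel} clock cycles)")
print(f"per quadrant: {quadrant_rate_bps / 1e6:.0f} Mbit/s")
print(f"whole LED:    {led_rate_bps / 1e6:.0f} Mbit/s")
```

With these inputs the line-limited frame rate comes out at 7812.5 frames per second, close to the nominal 8000, and one pixel has to be handled roughly every three clock cycles. Four quadrants together give about 1835 Mbit per second, consistent with the approximately 1800 Mbit per second quoted in the summary.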
The modularized structure of the EPP reflects the order of operations that must be applied to the raw detector data.

Figure 2. Block diagram of the LED electronics. Each quadrant has its own Event Preprocessor (EPP) that receives its data from one of two multiplexed ADCs and sends the output to a SpaceWire interface. [Diagram labels: LEDA with quadrants Q_1 to Q_4 and their gate and clear switchers; LEDE with sequencers SEQ_1 to SEQ_4, ADC pairs ADC_1a/ADC_1b to ADC_4a/ADC_4b, EPP_1 to EPP_4, the SpaceWire interface controller (IFC), EPE_nom and EPE_red; DPDPA.]

2. THE EVENT PREPROCESSOR (EPP) FOR THE LED

For the Low Energy Detector four Event Preprocessors are used in total, one for each quadrant, all working simultaneously. The EPP-design is shown in Figure 3, in which the data enter at the top (ADC input) and the processed data leave the EPP at the bottom of the diagram towards the LEDE interface controller (IFC). A sequencer (SEQ), which is a separate design, provides two clock signals to the EPP (Frame-Clock and Pixel-Clock), which drive its internal processing and define the pipeline clock of the EPP. Besides data input (top) and output (bottom), the EPP contains a Command-Interface used to command its operation and program its internal look-up tables.

Figure 3. Block diagram of the EPP. [Diagram labels: 14 bit ADC data input; PAGU (Pixel Address Generator Unit), driven by the Frame-Clock and Pixel-Clock from the SEQ; PCU (Pixel Correction Unit) with a 4096 x 16 bit RAM holding the offset map and bad pixel map and with the MIPS and misfit thresholds; CMCU (Common Mode Correction Unit); ETAU (Energy Threshold Analyzer Unit) with 4096 x 16 bit RAMs for the energy map and the energy neighbour map; PTAU (Pattern and Trace Analyzer Unit); PPFU (Programmable Pixel Filter Unit) with a 6 bit PPF mask; DICU (Data Interface Controller Unit) and CICU (Command Interface Controller Unit), each with USB2 / SpaceWire connections to the LEDE IFC; CC (Command Controller and Address Decoder); HKC (House Keeping Controller); MAXLINE register.]

2.1 The correction tasks of the EPP

To accomplish both tasks, data correction and data reduction, the EPP-design has to perform a number of ordered operations on the data, which are described below.

2.1.1 offset correction

Each pixel has an individual offset amplitude, determined by the dark current generated during the integration time and by further electronic offsets, which is subtracted from its raw value. A matrix containing these offset values for all pixels of a quadrant is called the offset map. The offset map can either be calculated on board from closed calibration wheel observations or uploaded to the EPP from ground. The Most Significant Bit (MSB) of each map entry is used as a bad pixel flag for that pixel. The flagging is always done by command from ground.

2.1.2 MIPS/misfit rejection

A minimum ionizing particle (MIP) usually creates a track of hit pixels in one frame, corresponding to a projection of its incoming direction. The energy loss of MIPs is 1.5 MeV cm²/g. Therefore, a MIP deposits at least 78 keV in one pixel of the LED (450/2 µm × 2.33 g/cm³ (density) × 1.5 MeV cm²/g). As this is far above the energy range for photons to be measured (< 20 keV), one pixel of a MIP track is typically above a high amplitude MIP-threshold or will have triggered the overflow flag of the ADC. These MIP-pixels and all adjacent pixels will be excluded from further processing.

Misfit events occur when photons hit a certain pixel during the short readout time interval between signal sampling and baseline sampling. In this case the baseline will be higher than the signal, resulting in a negative pixel amplitude. Such events will also be rejected.

2.1.3 common-mode correction

The amplifiers within the CAMEX readout chip are not compensated for supply voltage changes. Thus, a small variation of the supply voltage will cause notable changes in the pixel amplitudes. The effect concerns all 64 pixels of one line simultaneously. This common-mode noise can be effectively removed by the EPP with a filter that subtracts the median value of all pixels of a line from the pixel amplitudes.

2.1.4 valid pattern recognition

The charge cloud generated by an incoming photon can be spread over more than one pixel if the event occurred near a pixel border (split events).
The valid event patterns (singles, doubles, triples and quadruples, see also section 2.3.4) will pass the Event Preprocessor; other groups of connected pixels will be rejected and filtered out.

2.1.5 event filter

The event filter of the EPP is programmable from ground and can exclude invalid events from further transmission to ground. Valid events must meet all of the following requirements:
- pixel amplitude is above a lower threshold
- pixel is not flagged as bad
- pixel is not a MIP event, adjacent to a MIP event or a misfit event
- pixel belongs to a valid pixel pattern

2.2 Operation modes of the EPP

The EPP currently supports two operation modes, fullframe-mode and window-mode.

fullframe-mode: In fullframe-mode all 64 lines of the DEPFET-matrix are read out in intervals of 128 µs or slower. The fullframe-mode is considered to be the default readout mode of the LED and the EPP.

window-mode: The window-mode allows the partial readout of the frame, always beginning with line 0 and ending at a selectable line number. The line number can be freely chosen between line 1 and line 63. The value can be commanded from ground and is stored in an EPP internal register called MAXLINE (compare Figure 3). The window-mode will allow an even faster readout of the LED for very bright X-ray sources.
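How much faster the window-mode can be follows from simple arithmetic. The sketch below assumes that the line time stays at the 2 µs implied by the 128 µs fullframe interval, which the text does not state explicitly for window-mode. The function name `frame_period_us` is hypothetical.

```python
LINE_TIME_US = 2  # assumption: 128 us fullframe interval / 64 lines


def frame_period_us(maxline=63):
    """Shortest frame period in microseconds when lines 0..maxline are read out."""
    if not 1 <= maxline <= 63:
        raise ValueError("MAXLINE must lie between 1 and 63")
    return (maxline + 1) * LINE_TIME_US


for maxline in (63, 31, 15):
    period = frame_period_us(maxline)
    print(f"MAXLINE={maxline}: {period} us per frame, {1e6 / period:.1f} frames/s")
```

Under this assumption a window ending at line 15 would be read out four times as often as the full frame.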

2.3 EPP-design details

In the following section some of the EPP's design details are explained and further discussed. The design is modularized (compare Figure 3); each module is a VHDL component and was separately designed and tested. The modules correspond to the filtering and correction tasks defined in section 2.1 and form the base structure of the processing pipeline.

2.3.1 Pixel Correction Unit (PCU)

The output of the CAMEX analogue shift register is converted by a 14 bit ADC. These digital energy values for each pixel enter the EPP pipeline through a FIFO (First In First Out memory) buffer, are first marked with their corresponding pixel coordinates and are then passed on to the following units. Each processing step of the pipeline must be accomplished in < 2 µs for one line of 64 pixels.

The pixel offsets are stored as 16 bit words in an external RAM (offset LUT). For each pixel, an individual offset is subtracted. A new offset table can be calculated on board from raw event amplitudes or uploaded from ground. The most significant bit of each offset table entry is used to flag the pixel as bad. The MIP-flag is applied to those events that still have energies above the MIP-threshold after the offset subtraction, and the misfit-flag to those with energies below a certain misfit-threshold. All pixels then pass on to the common-mode correction (CMCU).

2.3.2 Common-Mode Correction Unit (CMCU)

The common-mode is corrected by subtracting the median energy value from all pixels of a line. In order to calculate the median, a whole line is accumulated within an internal register. Finding the median of 64 values in FPGA hardware in less than 2 µs is a challenge. To meet this requirement we have developed a highly parallel median-counting-algorithm to determine the median value. The median of a finite list of ordered values is defined as the value that separates the higher half of the list from the lower half.
Usually the median can be found by sorting all values from the lowest to the highest and picking the middle one. Such a sorting algorithm can be very time consuming when realized in hardware, hence we turned to another approach: an easy-to-implement and highly parallel counting algorithm. The idea is based on the fact that the median of a set of n different values has an equal number of lower and larger values. Hence we can use n parallel processes and count the number of smaller and greater values for each value in the set. The value that has an equal number of lower and larger values is the median. With slight modifications the algorithm still works in cases where not all n values of the set are distinct; moreover, it is sufficient to count only the number of larger values.

On this basis the median-counting-algorithm is implemented as an array of 64 individual 6-bit pixel-counters that store for each pixel the number of pixels with a larger energy value. This array can be filled by having 64 comparators (one for each pixel) and multiplexing through the line in 64 clock cycles. Each pixel-counter is increased if the comparator finds another pixel containing a greater energy value. After 64 steps each pixel-counter contains the total number of pixels with an ADC value higher than its own pixel energy. The median of the energy values is then the value whose pixel-counter registered n/2 larger values in the line. In case two or more pixels of one line hold the same energy value, such a value may not exist, and the median is then found at the position of the next smaller counter value. This recursive search can take another 32 clock cycles at maximum, when all 64 valid pixels of the line have the same energy value. For an n-pixel line the median-counting-algorithm therefore takes at most n+(n/2) clock cycles to complete its task. Finally, the median is subtracted from each pixel of the line and a constant value is added again in order to avoid negative values.

The median-counting-algorithm was invented and successfully implemented by T. Schanz and C. Tenzer at IAAT in April 2007.

2.3.3 Energy Threshold Analyzer Unit (ETAU)

Each pixel of the LED has its own preamplifier and all 64 pixels of a column are connected to an individual CAMEX channel. Thus, the conversion between raw amplitude in ADU and amplitude measured with a common scale in ADU must be done individually for each pixel. This correction of gain variation and the conversion into eV is not done on board; it has to be performed on ground. In order to compensate for possible gain variations already on board, the lower thresholds discussed below are individually configurable for each pixel from ground.

Two different low energy thresholds are used to determine the occurrence of an event. The Lower Event Threshold flags all pixels with an energy above it as valid events, while the slightly lower Neighbour Pixel Threshold is only used in the pattern recognition, when an adjacent pixel is above the Lower Event Threshold. This method allows better identification of split partners that receive only a very small fraction of the total energy. Both thresholds are applied to all pixels at this stage and the respective flags are set.

2.3.4 Pattern and Trace Analyzing Unit (PTAU)

In order to distinguish valid from invalid pixel patterns, three complete detector lines have to be available inside an internal register pipeline. The algorithm then searches the middle line for pixels with one of the event flags set and checks their surrounding eight pixels for further events.
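Returning briefly to the common-mode correction of section 2.3.2, the median-counting idea can be illustrated in software. The following is a sequential Python sketch, not the authors' VHDL implementation: the hardware performs the comparisons in parallel, whereas here they are simple loops. The function names and the `pedestal` parameter are hypothetical, and the value of the constant that is added back is not given in the text. For an even number of pixels the sketch returns the lower of the two middle values, which is what searching for n/2 larger values yields.

```python
def median_by_counting(line):
    """Return the median of `line` using only 'how many are larger' counts."""
    n = len(line)
    # One counter per pixel: number of pixels in the line with a larger value.
    # In hardware these would be n comparators stepping through the line.
    larger = [sum(1 for other in line if other > value) for value in line]

    # Look for a counter equal to n/2; if ties prevent that, fall back to
    # the next smaller counter value, down to 0 at most.
    for target in range(n // 2, -1, -1):
        for value, count in zip(line, larger):
            if count == target:
                return value
    raise ValueError("line must not be empty")


def common_mode_correct(line, pedestal=0):
    """Subtract the line median and add a constant to avoid negative values."""
    median = median_by_counting(line)
    return [value - median + pedestal for value in line]


if __name__ == "__main__":
    raw_line = [103, 101, 100, 180, 102, 101, 99, 104]
    print(median_by_counting(raw_line))
    print(common_mode_correct(raw_line, pedestal=50))
```

For a 64 pixel line the cycle count given in the text is at most 64 + 32 = 96 clock cycles, or 960 ns at 10 ns per cycle, which is inside the 2 µs budget per line.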
As mentioned earlier, if the amplitude of a pixel is only above the Neighbour Pixel Threshold, but the pixel is adjacent to a pixel with an energy above the Lower Event Threshold, this pixel is considered a split partner of that event. All valid patterns defined for the Simbol-X LED which are recognized by this unit are shown in Figure 4. It can be seen that a valid pattern fits into a 2 x 2 grid with no further surrounding events. Therefore, all patterns that extend over three or more columns or lines are considered invalid. They can be flagged and later filtered by applying only logical operations to the corresponding event flags (EFs) of the relevant eight pixels:

$$
\begin{aligned}
\text{line analysis} &= (EF_{i-1,j-1} \lor EF_{i-1,j} \lor EF_{i-1,j+1}) \land (EF_{i+1,j-1} \lor EF_{i+1,j} \lor EF_{i+1,j+1}) \\
\text{column analysis} &= (EF_{i-1,j-1} \lor EF_{i,j-1} \lor EF_{i+1,j-1}) \land (EF_{i-1,j+1} \lor EF_{i,j+1} \lor EF_{i+1,j+1}) \\
\text{invalid pattern} &= \text{line analysis} \lor \text{column analysis}
\end{aligned}
\tag{1}
$$

where i is the number of the line of the currently analyzed event and j is its column. The equation thus expresses that an otherwise valid pixel belongs to an invalid pattern if at least one of the three closest pixels in the preceding line and at least one of the three closest pixels in the following line have a valid event flag set (line analysis), or if the same holds for the preceding and the following column (column analysis). With the help of this equation, the pattern recognition is reduced to pure logic evaluation, for which an FPGA or ASIC is well suited and which can be accomplished in a single clock cycle.

The method applied in this stage to identify the valid patterns is significantly different from the commonly used method of comparing each pixel pattern with a library of valid patterns using a DSP. It was invented by T. Schanz and C. Tenzer in cooperation at IAAT in April 2007 during the design of the EPP chip.

Figure 4. List of all legal event patterns. 1.) singles, 2.+3.) doubles, 4.) triples, 5.) quadruples. For case 1.) to 4.) also all rotations of the patterns are legal.

As the filter analysis is performed individually for each quadrant, some invalid events that are located at the edge of a quadrant and spread over the quadrant boundaries are transmitted as well if they appear as valid events in the individual quadrants. Those events have to be filtered later on ground.

Valid pixels that are situated adjacent to a MIP pixel can also be flagged as MIP events, because they are most probably part of the MIP track or split partners of a MIP event. The majority of MIP tracks will be rejected by the pattern filter because they do not match one of the valid patterns. However, those that show a valid pattern and contain a MIP-flagged event are correctly flagged as a MIP pattern, indicating that the pattern is not a valid X-ray event but a MIP track.

2.3.5 Programmable Pixel Filter

During the complete event preprocessing procedure, the following flags, attributed to pixels, can be set:
- Bad Pixel Flag
- Misfit Flag
- MIP Flag
- Event Flag
- Neighbour Event Flag
- Invalid Pattern Flag

Events flagged as invalid can be filtered out at this stage and a timestamp is added to the valid data packets. In the normal operation mode, only pixel information from valid events will be transmitted to the EPE for further transmission to ground. A few invalid patterns from the borders of the quadrants may also leak through the event filter. This is acceptable as long as their contribution to the telemetry load is negligible. These events can then later be rejected on ground. However, all valid event patterns have to be transmitted.
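The pure-logic pattern test of equation (1) in section 2.3.4 can be sketched in software as follows. This is a minimal illustration, not the PTAU implementation. It assumes that the three flags of each neighbouring line or column are combined with OR, that the two sides are combined with AND, and that the line and column results are combined with OR, which is the reading suggested by the accompanying prose. The function name `invalid_pattern` and the treatment of pixels outside the array as "no event" are likewise assumptions.

```python
def invalid_pattern(ef, i, j):
    """Evaluate equation (1) for the pixel in line i, column j.

    `ef` is a list of lines, each a list of 0/1 event flags.
    """
    def flag(r, c):
        # Assumption: pixels outside the array count as "no event".
        return 0 <= r < len(ef) and 0 <= c < len(ef[0]) and bool(ef[r][c])

    line_analysis = (
        (flag(i - 1, j - 1) or flag(i - 1, j) or flag(i - 1, j + 1))
        and (flag(i + 1, j - 1) or flag(i + 1, j) or flag(i + 1, j + 1))
    )
    column_analysis = (
        (flag(i - 1, j - 1) or flag(i, j - 1) or flag(i + 1, j - 1))
        and (flag(i - 1, j + 1) or flag(i, j + 1) or flag(i + 1, j + 1))
    )
    return line_analysis or column_analysis


if __name__ == "__main__":
    quadruple = [
        [0, 0, 0, 0],
        [0, 1, 1, 0],
        [0, 1, 1, 0],
        [0, 0, 0, 0],
    ]
    three_in_a_row = [
        [0, 0, 0, 0, 0],
        [0, 1, 1, 1, 0],
        [0, 0, 0, 0, 0],
    ]
    print([invalid_pattern(quadruple, i, j) for i in (1, 2) for j in (1, 2)])
    print(invalid_pattern(three_in_a_row, 1, 2))
```

In this sketch none of the four pixels of a 2 x 2 quadruple is flagged, while the centre pixel of three events in a row is, because it has flagged neighbours in both the preceding and the following column.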
Because the pixel filter is programmable, events that are properly flagged as invalid can optionally also be transmitted to the EPE (for statistical analysis or anti-coincidence schemes).

3. FUNCTIONAL TESTING

In order to verify the function of the EPP we have developed a hardware testbench in our electronics lab. This functional verification model consists of three computers: one serves as detector simulator (data input), one as spacecraft interface simulator (data output) and the third as command interface computer, used to command and program the EPP and its internal Look-Up-Tables (LUTs).

Figure 5. Current lab setup of the EPP testbench at IAAT.

Figure 5 shows the current setup at IAAT. The black box in front contains the EPP, with the command computer (laptop) on top. The data input computer is on the left side in the back, the data output computer on the right side. The current testbench setup is suitable for any functional verification during the EPP logic design, which is almost finished. For the next phase we have already started to build a performance verification model which will verify the timing performance of the EPP.

Figure 6 shows a screenshot of the EPP-Testbench-Software. The software can generate artificial detector data as well as load real detector raw data (e.g. from a 64 x 64 pixel DEPFET matrix) and feed them into the EPP. Another part of the software can visualize the EPP output and compare it with the input data. A third part of the software performs the commanding of the EPP. It can program the Look-Up-Tables of the EPP and control its filter constraints. The EPP-Testbench-Software was also developed in our team at IAAT. The testbench shown in Figure 6 contains an artificial frame showing the IAAT-logo. On the left side is the input frame with several distortions (noise, common-mode, MIPs etc.), on the right side is the filtered output of the EPP.

Figure 6. Screenshot of the running EPP-Testbench software. On the left side an artificially generated input frame, on the right side the output frame of the EPP.

4. SUMMARY AND CONCLUSIONS

On board event preprocessing is required for the LED of Simbol-X. The EPP presented here can fulfill all specified tasks of onboard event preprocessing. It will reduce the huge amount of data generated by the LED, approximately 1800 Mbit per second, down to only a few tens or hundreds of remaining event packets every second, depending on the sources in the field of view and the background strength. The amount of data is on average reduced by a factor of >1000, which considerably relaxes the demands on the on board storage and telemetry capacities. In addition, the EPP plays an important role in background reduction, as invalid patterns and, more importantly, the MIP events created by particle interactions are flagged and filtered out. Identifying and removing these events lowers the overall instrument background significantly.

The EPP-design so far is completely implemented as a VHDL design and has been successfully functionally tested on a Xilinx Virtex4 FPGA. A hardware testbench setup enables further testing with real detector raw data that can be obtained from a laboratory detector setup at our institute or from MPI-HLL. For further performance tests of the EPP, a performance verification model is currently under construction; it will be completed by the end of 2008.
Finally, the EPP-design can be transferred into a radiation-hard ASIC technology and integrated into the LEDA hardware flight module.

Note: one event packet has a data amount of 64 bit.

REFERENCES

[1] Ferrando, P. R., Arnaud, M., Briel, U. G., Cavazzuti, E., Cledassou, R., Counil, J.-L., Fiore, F., Giommi, P., Goldwurm, A., Marle, O. L., Laurent, P., Lebrun, F., Malaguti, G., Mereghetti, S., Micela, G., Pareschi, G., Piermaria, M., Roques, J.-P., and Tagliaferri, G., The Simbol-X mission, in [to appear in Proceedings of the SPIE], (2008).

[2] Lechner, P. H., Andricek, L., Briel, U. G., Heinzinger, K., Herrmann, S., Huber, H., Kendziorra, E., Lauf, T., Lutz, G., Richter, R. H., Schaller, G., Schnecke, M., Schopper, F., Segneri, G., Strüder, L. W., and Treis, J., The low energy detector of Simbol-X, in [to appear in Proceedings of the SPIE], (2008).