Upgrades to the CMS Level-1 Calorimeter Trigger
R. Aggleton h, L. Apanasevich i, M. Baber c, F. Ball 1, R. Barbieri e, F. Beaudette d, J. Berryhill b, J. Brooke h, A. Bundock c, L. Cadamuro d, I. A. Cali e, R. Cavanaugh b,i, M. Cepeda k, M. Citron c, S. Dasu k, L. Dodd k, T. Durkin g, A. Elwood c, R. Forbes k, C. Fountas j, T. Gorski k, M. Guilbaud f, G. Hall c, K. Harder g, S. Harper g, G. Iles c, G. M. Innocenti e, P. Klabbers k, B. Kreis b, C. Laner c, Y.-J. Lee e, A. Levine k, W. Li f, J. Marrouche a, L. Mastrolorenzo d, K. Mishra b, D. Newbold h, M. Northup f, I. Ojalvo k, S. Paramesvaran h, B. Penning c, R. Rivera b, C. Roland e, T. Romanteau d, A. Rose c, T. Ruggles k, J. B. Sauvan d, C. Shepherd-Themistocleous g, D. Smith h, N. Smith k, W. Smith k, T. Strebler d, A. Svetek k, A. Tapper c, A. Thea g, J. Tikalsky k, B. Tran f, L. Uplegger b, M. Vicente k, N. Wardle a, T. Williams g, B. Wyslouch e, A. Zabi d, J. Zhang i
a CERN, b Fermi National Accelerator Laboratory, c Imperial College London, d Laboratoire Leprince-Ringuet, e Massachusetts Institute of Technology, f Rice University, g Rutherford Appleton Laboratory, h University of Bristol, i University of Illinois at Chicago, j University of Ioannina, k University of Wisconsin
TWEPP, Lisbon, Portugal, September 28 - October 2, 2015
Introduction to L1 Calorimeter Trigger
Level-1 (L1) calorimeter trigger finds particle signatures and computes whole-detector-level quantities at 40 MHz
- Dedicated readout paths from calorimeter detectors at reduced position and energy granularity
LHC is exceeding design performance
- Higher pileup and instantaneous luminosity
- Must upgrade L1 to maintain CMS performance and keep L1 trigger rate < 100 kHz
[Figure: Run 1 L1 Trigger (Figure 8.1: Architecture of the Level-1 Trigger), with Electromagnetic Cal. (ECAL) and Hadronic Cal. (HCAL+HF) highlighted]
[Figure: pileup event display, 78 reconstructed interactions]
Introduction to L1 Calorimeter Trigger
Level-1 (L1) calorimeter trigger finds particle signatures and computes whole-detector-level quantities at 40 MHz
- Dedicated readout paths from calorimeter detectors at reduced position and energy granularity
LHC is exceeding design performance
- Higher pileup and instantaneous luminosity
- Must upgrade L1 to maintain CMS performance and keep L1 trigger rate < 100 kHz
[Figure: Detector segmentation. RCT region; trigger tower = 5x5 ECAL crystals + 1 HCAL tower]
[Figure: Run 1 L1 Trigger (Figure 8.1: Architecture of the Level-1 Trigger), with (RCT) and (GCT) highlighted]
[Figure: pileup event display, 78 reconstructed interactions]
Overview of Calorimeter Trigger Upgrades
2015: Stage 1
- Replace Global Calorimeter Trigger
- Improved global algorithms, including pileup subtraction and custom heavy ion algorithms
- New optical links from ECAL and Regional Calorimeter Trigger
2016 and beyond: Stage 2
- Also new HCAL+HF links
- Layer 1 time multiplexes events
- Layer 2 processes events with all-new algorithms at full trigger tower granularity
[Figure: system diagram with blocks ECAL TCC, HCAL HTR, HCAL µHTR, HF µHTR, HCAL optical splitters, oSLB, oRM, oRSC; copper and optical links; Regional (EM candidates, region energies), Global (entire summary), Layer 1 (CTP x18), Layer 2 (MP7 x1)]
Overview of Calorimeter Trigger Upgrades
2015: Stage 1
- Replace Global Calorimeter Trigger
- Improved global algorithms, including pileup subtraction and custom heavy ion algorithms
- New optical links from ECAL and Regional Calorimeter Trigger
2016 and beyond: Stage 2
- Also new HCAL+HF links
- Layer 1 time multiplexes events
- Layer 2 processes events with all-new algorithms at full trigger tower granularity
[Figure: system diagram with blocks ECAL TCC, HCAL HTR, HCAL µHTR, HF µHTR, HCAL optical splitters, oSLB, oRM, oRSC; copper and optical links; Regional (EM candidates, region energies), Global (entire summary), Layer 1 (CTP x18), Layer 2 (MP7 x9)]
Upgraded Links for Stage 1
Optical Synchronization and Link Board (oSLB) to Optical Receiver Mezzanine (oRM) [1]
- 576 x 4.8 Gbps optical
- CERN VTTx [2] to commercial SFP+
Optical Regional Summary Card (oRSC)
- 18 boards with 11 x 2 Gbps output to legacy GCT and up to six copies to Layer 2, each on a pair of 10 Gbps links
[Figure: photos of oSLB and oRM (J. C. Da Silva); system diagram with oSLB, oRM, oRSC, HCAL optical splitters, copper and optical links, Layer 1 (CTP), Layer 2 (MP7)]
[1] 2013 JINST 8 C02036
[2] 2013 JINST 8 C03004
Master Processor 7
AMC card featuring a Xilinx Virtex-7 690T for data processing
- Replaces >20 FPGAs in legacy GCT
I/O
- 72 optical links in each direction at up to 10 Gbps
- Four front-panel MTP-48 to PRIZM LightTurn via Sylex assembly
- Avago Technologies MiniPOD transceivers
- SerDes and LVDS electrical I/O via backplane and front panel
Atmel 32-bit MMC supporting microSDHC interface for firmware upload
http://www.hep.ph.ic.ac.uk/mp7/
Installation and Operation at CMS
Primary MP7 and a second MP7 for testing installed in underground counting room (~100 m fiber length)
MicroTCA Carrier Hub (MCH)
- Manages AMCs, fan trays, power modules
- Gigabit Ethernet for IPbus interface used to configure registers and LUTs
AMC13
- Distributes 40 MHz LHC clock, L1 Accept, and zeroth bunch crossing marker via LVDS
- Receives data for DAQ via SerDes
Commissioned with copy of RCT data and output sent to Global Trigger test crate
Triggering CMS since mid-August
[Figure: crate photo showing MP7, second MP7 for testing, AMC13, and MCH]
[Figure: MP7 firmware block diagram with TTC i/f, MGT, buffers, algo, ctrl, readout ctrl, DAQ i/f, Ethernet, I2C, MMC, IPbus ctrl / system clocks, LHC clocks / BX0, AMC13]
Stage 1 Algorithm Overview
Input
- 36 x 10 Gbps links, six 240 MHz clocks/event
- 396 8/10-bit region energies
- 144 highest energy e/gamma candidates, 10 energy+position bits
New and improved output
- Highest energy jets, taus, and e/gamma candidates
- Global quantities: scalar and vector sums
- All quantities pileup corrected!
- 14 x 3 Gbps links, two 80 MHz clocks/event
- All new triggers for heavy ion running!
Latency
- Pipelined for 40 MHz collision rate
- 10 bunch crossings for e/gammas, 20 for rest
[Figure: region grid, φ (x18) by η (x22), with jet, tau, and e/g patterns]
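The input and output figures on this slide are consistent with one event per 25 ns bunch crossing. The short check below is only an arithmetic sketch using the numbers quoted above; the helper name `event_rate_mhz` is illustrative and not taken from the firmware.

```python
# Arithmetic check of the Stage 1 I/O numbers quoted on the slide.
# The helper and variable names are illustrative assumptions.

def event_rate_mhz(link_clock_mhz: float, clocks_per_event: int) -> float:
    """Event rate sustained when each event occupies a fixed number of clocks."""
    return link_clock_mhz / clocks_per_event

n_regions = 18 * 22                     # phi x eta region grid
input_rate = event_rate_mhz(240, 6)     # six 240 MHz clocks per event
output_rate = event_rate_mhz(80, 2)     # two 80 MHz clocks per event

print(n_regions, input_rate, output_rate)  # 396 40.0 40.0
```

Both sides come out at the 40 MHz collision rate, which is what allows the algorithm to be fully pipelined.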
Pileup Subtraction
Use number of regions with nonzero energy as an event-by-event estimator of pileup
- Addresses LUTs storing η-dependent energy to subtract
Pileup-subtracted regions used to build all objects
Implies whole event must be read in first
- This then drove approach to pipelining
- Generally, operate on half the detector per stage, doing many parallel operations with slower clock (80 MHz)
[Figure: Correlation in data. Number of primary vertices vs. number of regions with E_T > 0 GeV; CMS Preliminary, Run 254790 (13 TeV)]
[Figure: Expected performance improvement. Rate [kHz] of HT trigger (scalar sum of jet energy) vs. instantaneous luminosity [10^30 cm^-2 s^-1], with average pileup on upper axis; Current L1, Upgrade L1, linear extrapolation; CMS 2012, √s = 8 TeV]
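The estimator-plus-LUT scheme can be sketched in software as follows. This is a minimal illustration, not the firmware: the LUT contents are made up, and clamping the subtracted energy at zero is an assumption. The function names are hypothetical.

```python
from typing import List

N_PHI, N_ETA = 18, 22  # region grid quoted in the slides (396 regions)

def count_nonzero_regions(regions: List[List[int]]) -> int:
    """Pileup estimator: number of regions with nonzero energy."""
    return sum(1 for row in regions for et in row if et > 0)

def subtract_pileup(regions: List[List[int]],
                    pu_lut: List[List[int]]) -> List[List[int]]:
    """Subtract an eta-dependent energy looked up by the pileup estimator.

    regions[phi][eta] holds the region energy; pu_lut[n_nonzero][eta] holds
    the energy to subtract. Clamping at zero is an assumption of this sketch.
    """
    n_nonzero = count_nonzero_regions(regions)
    return [[max(0, et - pu_lut[n_nonzero][eta]) for eta, et in enumerate(row)]
            for row in regions]

# Toy LUT with invented values: grows with the estimator, larger at high |eta|.
toy_lut = [[n // 50 + (1 if eta < 4 or eta >= 18 else 0) for eta in range(N_ETA)]
           for n in range(N_PHI * N_ETA + 1)]
```

Because the estimator needs every region before any subtraction can happen, the whole event has to be buffered first, which matches the pipelining remark on the slide.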
Example: Tau Algorithm
1. Start with pileup-subtracted regions from central detector
2. New 2x1 region clustering size
3. New relative isolation check
4. Top-four sort of inclusive and isolated taus
[Figure: Run 2 data; tau and isolation region pattern]
Stage 1 Algorithm Overview
Pipeline stages: read in whole event and subtract PU -> sum/cluster -> isolation checks -> calibration with LUTs -> sort -> pack
- Regions: subtract PU
- Energy sums: energy sums -> rank energy sums -> pack energy sums
- Taus: tau clustering -> tau isolation -> calibrate and rank taus -> sort inclusive taus, sort iso taus -> pack inclusive taus, pack iso taus
- Jets: 3x3 sum -> jet condition -> calibrate and rank central jets, calibrate and rank forward jets -> sort central jets, sort forward jets -> pack central jets, pack forward jets
- e/gammas: unpack -> e/gamma isolation -> sort iso e/gamma, sort non-iso e/gamma -> pack iso e/gamma, pack non-iso e/gamma
- For DAQ: GT-like condition
Firmware Placement
A rather full board
Area constraints to guide placement
- Infrastructure constrained to edges
- Center reserved for algorithms, with first parts constrained near Rx
Vivado meets timing more consistently and more quickly than ISE
[Figure: device floorplan and resource utilization]
Custom Heavy Ion Algorithms
Factor of 4-8 increase in instantaneous luminosity in heavy ion collisions
Custom algorithms, including background subtraction, required to keep L1 rate below 8 kHz
- φ-ring average background subtraction
- Custom triggers include: centrality, single track, barrel/endcap e/gamma separation, and 2x2 jet size
Algorithms defined. Firmware under development with proton-proton firmware as base
[Figure: Efficiency vs. offline centrality (%) for L1 thresholds of 5%, 10%, 15%, 20%, 25%, 50%, 70%, 80%, 85%; CMS Simulation Preliminary]
[Figure from N. Herrmann, J. P. Wessels, T. Wienold, Ann. Rev. Nucl. Part. Sci. 49 (1999) 581]
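A φ-ring average subtraction can be illustrated as below. This is a sketch under stated assumptions, since the slide gives only the name of the technique: the background in each η ring is taken as the integer average over all φ regions in that ring, and subtracted energies are clamped at zero. The function name is hypothetical.

```python
from typing import List

def subtract_phi_ring_average(regions: List[List[int]]) -> List[List[int]]:
    """Subtract, per eta ring, the average energy over all phi regions.

    regions[phi][eta] holds the region energy. Integer averaging and
    clamping at zero are assumptions of this sketch, not confirmed
    firmware behaviour.
    """
    n_phi = len(regions)
    n_eta = len(regions[0])
    out = [row[:] for row in regions]
    for eta in range(n_eta):
        ring_sum = sum(regions[phi][eta] for phi in range(n_phi))
        ring_avg = ring_sum // n_phi
        for phi in range(n_phi):
            out[phi][eta] = max(0, regions[phi][eta] - ring_avg)
    return out
```

A localized deposit on top of a uniform underlying event survives the subtraction, while the uniform part of its ring is removed.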
Data Acquisition and Online Software
5 Gbps output to CMS DAQ system via backplane and AMC13
- Copy of inputs and outputs attached to event data
- Allows for bit-level comparison with C++ emulator
- Online data quality monitoring at ~100 Hz for shift crew
Online Software
- Set up clocks, MGTs, link alignment, LUT and register configuration
- Testing modes (playing events in text files through algorithm, loopback, etc.)
- Monitoring of clock locks, link alignment, CRC, etc. with alarms for shift crew
[Figure: Central jets p_T (GeV), hardware vs. emulator, with emulator/hardware ratio panel; CMS Preliminary, 11.4 pb^-1, √s = 13 TeV; Entries 616972, Mean 62.76]
[Figure: Online software diagram (hardware and software). Central Cell, RCT TS Cell and S1caloL2 TS Cell connected via SOAP; RCT/oRSC SW, PCI2VME, RCT/oRSC HW; AMC13 SW (C++), MP7 SW (C++), IPbus, Control Hub, Bridge PC, network; AMC13 HW, MP7 HW, GT HW; 10 Gbps and 3 Gbps links; Python layer (will stand for testing purposes)]
Stage 2 Upgrade Overview
Goes online in 2016
Improved performance from processing full trigger tower granularity in one layer
Made possible by time multiplexing events in a first layer
- Flexibility from increased latency
[Figure: system diagram with blocks ECAL TCC, HCAL HTR, HCAL µHTR, HF µHTR, HCAL optical splitters, oSLB, oRM, oRSC; copper and optical links; Regional (EM candidates, region energies), Global (entire summary), Layer 1 (CTP x18), Layer 2 (MP7 x9)]
[Figure: Region granularity vs. trigger tower granularity. RCT region; trigger tower = 5x5 ECAL crystals + 1 HCAL tower]
Time Multiplexing Architecture
Layer 1: CTP7
- 18 cards, each receiving 60 links at between 5.0 Gb/s and 6.4 Gb/s of calorimeter data
- Each card spans 8 out of 72 towers in φ and ½ of η
- Layer 1 cards transmit 48 links @ 10 Gb/s
- Virtex 7 690T, ZYNQ 7000
- Presented in poster by Aleš Svetek: Construction, Testing, Installation, Commissioning and Operation of the CMS Calorimeter Trigger Layer-1 CTP7 Cards
Layer 2: MP7
- 72 input links per Layer-2 node
- 6 output links per MP card @ 10 Gb/s
- Same infrastructural firmware and low-level online software as Stage 1
Flexible system
- Simple to upgrade from 16 bit towers to 24 bit towers or provide extra logic resources
[Figure: Layer-1 cards (CTP7 x18) connected to Layer-2 cards (MP7 Node 1, Node 2, Nodes 3 to 9, redundant nodes; x10), followed by MP7 (demux) and μGT]
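The link counts above fit together arithmetically, and the idea of time multiplexing can be sketched in a few lines. The round-robin assignment of bunch crossings to nodes is an assumption made for illustration; the slide states only that events are time multiplexed across Layer-2 nodes. All names below are hypothetical.

```python
# Link-budget arithmetic from the slide, plus a round-robin sketch of
# time multiplexing. The round-robin rule is an assumption of this sketch.

N_LAYER1_CARDS = 18
LINKS_OUT_PER_CARD = 48
LINKS_IN_PER_NODE = 72
N_PROCESSING_NODES = 9   # nodes 1 to 9 in the diagram

total_layer1_links = N_LAYER1_CARDS * LINKS_OUT_PER_CARD      # 864
node_slots = total_layer1_links // LINKS_IN_PER_NODE          # 12
links_per_card_per_node = LINKS_IN_PER_NODE // N_LAYER1_CARDS  # 4

def node_for_bunch_crossing(bx: int, n_nodes: int = N_PROCESSING_NODES) -> int:
    """Return the 1-based node receiving the whole event for crossing bx."""
    return bx % n_nodes + 1

print(total_layer1_links, node_slots, links_per_card_per_node)  # 864 12 4
```

With 864 outgoing links and 72 inputs per node, the fiber plant has room for 12 node positions, which is consistent with nine processing nodes plus the redundant nodes shown in the diagram. Each node sees a complete event and has several bunch crossings to process it before its next turn.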
Links for Stage 2
New HCAL+HF µHTR readout
- Allows for upgrades to trigger primitive computation (energy resolution, depth info, HF granularity)
- μTCA, two Virtex 6 FPGAs, 6.4 Gbps output
Layer 1 to Layer 2
- Time multiplexing routed with 72 to 72 12-fiber MPO cables
- Molex FlexPlane
- Three pizza-box sized enclosures instead of whole rack if using LC
[Figure: Molex enclosure mapping MPO 12 input cables (fibers 1 to 12; first detector region, second detector region) to MPO 12 output cables (fibers 1 to 12; first bunch crossing, second bunch crossing)]
[Figure: system diagram with blocks ECAL TCC, HCAL HTR, HCAL µHTR, HF µHTR, HCAL optical splitters, oSLB, oRM, oRSC; copper and optical links; Layer 1 (CTP), Layer 2 (MP7)]
High Granularity Algorithms
Processing 720 Gbps, 72 x 82 trigger towers
Algorithms clocked at 240 MHz, pipelined by η slices
Pileup subtraction
- e/gamma: Based on trigger tower multiplicity in four η slices
- Jet: Donut pileup subtraction
Described in poster by Alexandre Zabi: Triggering on electrons, jets and tau leptons with the CMS upgraded calorimeter trigger for the LHC RUN II
[Figure: Mean energy density (GeV) vs. number of interactions; CMS Simulation Preliminary, minimum bias, √s = 13 TeV, BX = 50 ns, <PU> = 40]
[Figure: Rate (Hz) vs. efficiency for fourth jet p_T > X GeV, Upgrade vs. Run 1; CMS Simulation Preliminary, inst. lumi. = 7x10^33 cm^-2 s^-1, √s = 13 TeV, BX = 50 ns, <PU> = 40]
Stage 2 Commissioning Status
Layer 1 and Layer 2 fully installed and linked
All links validated
- Detectors to Layer 1
- Layer 1 to Layer 2
- Layer 2 to Demux to new Global Trigger
Expect to be running in parallel in a few weeks (Layer 1 in now)
- Demonstrate readiness for 2016
- Reach perfect agreement between firmware and emulator
- Use data to determine future calibrations and thresholds and to study performance
[Figure: photos of Layer 1 CTP7s and Layer 2 MP7s]
Conclusions
CMS Level-1 Calorimeter trigger is being upgraded in two stages to maintain performance in Run 2
Stage 1 upgrade is online now
- New and improved algorithms, including pileup subtraction and custom heavy ion algorithms
- All done in a single Virtex 7 FPGA
Stage 2 will go online next year
- Time multiplexing events and processing at full trigger tower granularity
- All hardware installed and links validated
- Parallel running in 2015
Backup
More on Motivation
Level-1 (L1) hardware-based trigger
- Data from dedicated readout paths
- 40 MHz to 100 kHz reduction
- 3.8 μs latency
LHC is exceeding design performance
- Higher pileup, 50 interactions per bunch crossing
- Higher instantaneous luminosity (x2 for pp collisions, x4-8 for heavy ion)
Together with increase in collision energy, leads to factor of ~6 increase in the Run 1 L1 trigger rate
[Figure: event display, 78 reconstructed interactions]
Upgraded Links for Stage 1
Optical Synchronization and Link Board (oSLB) [1]
- Synchronizes ECAL detector data at 40 MHz
- Xilinx Kintex 7 and CERN VTTx optical module [2]
- 576 boards with two 4.8 Gbps output copies
Optical Receiver Mezzanine (oRM) [1]
- Receives optical signals from oSLB
- Xilinx Kintex 7 and commercial SFP+ optical receivers
- 504 boards, 72 with two receivers
Optical Regional Summary Card (oRSC)
- Transmits full crate of RCT data to Layer 2 via optical
- 9U VME card with Xilinx Kintex 7
- 18 boards with 11 x 2 Gbps output to legacy GCT and up to six copies to Layer 2, each on a pair of 10 Gbps links
[Figure: photos of oSLB and oRM (J. C. Da Silva); system diagram with oSLB, oRM, oRSC, HCAL optical splitters, copper and optical links, Layer 1 (CTP), Layer 2 (MP7)]
[1] 2013 JINST 8 C02036
[2] 2013 JINST 8 C03004
Example: Tau Algorithm
Start with pileup-subtracted regions from central detector (x252)
New clustering, considering every region as possible tau center (252/2 = 126 in parallel)
New relative isolation check
- (E3x3 - Eobject)/Eobject < value
- 126 LUTs with 16-bit E3x3 & Eobject address and 1-bit isolation decision
Calibrate and compress energy with 126 10-bit to 6-bit LUTs
Top-four sort
- Separately sort inclusive and isolated taus
- 252 6-bit energies with 9-bit positions associated to them
- Balance latency and routing in sort
[Figure: Run 2 data; tau and isolation region pattern]
[Figure: sort network ordering candidates from biggest to smallest, with lower-ranked candidates dropped]
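The isolation condition and the top-four selection can be written out as a short software sketch. It computes the inequality directly rather than through LUTs and uses a library selection rather than a hardware sort network, so it illustrates the logic only. Treating a zero-energy object as non-isolated is an assumption, and the function names and the threshold value in the usage note are hypothetical.

```python
import heapq
from typing import List, Tuple

def is_isolated(e_3x3: int, e_object: int, max_rel_iso: float) -> bool:
    """Relative isolation: (E3x3 - Eobject) / Eobject < max_rel_iso.

    Rearranged to avoid the division. A zero-energy object is treated
    as non-isolated, which is an assumption of this sketch.
    """
    if e_object <= 0:
        return False
    return (e_3x3 - e_object) < max_rel_iso * e_object

def top_four(candidates: List[Tuple[int, int]]) -> List[Tuple[int, int]]:
    """Keep the four highest-energy (energy, position) candidates."""
    return heapq.nlargest(4, candidates, key=lambda cand: cand[0])
```

For example, `is_isolated(12, 10, 0.25)` passes because the surrounding energy (2) is below 25% of the object energy, while `is_isolated(14, 10, 0.25)` fails. Inclusive and isolated taus would each go through their own `top_four` call.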
Data Acquisition and Quality Monitoring
5 Gbps output to CMS DAQ system via backplane and AMC13
- Sends copy of inputs received from RCT for 1% of events accepted by L1
- Sends copy of outputs sent to Global Trigger, also for the two bunch crossings before and after, for every L1-Accept
Data Quality Monitoring system
- ~100 Hz sampling of DAQ data
- Computes bit-level expectations for output based on inputs
- Fast feedback monitored by shift crew
- Can help identify problems due to wrong configuration of LUTs, upstream detector problems, etc.
Also attached to event data for offline analysis
[Figure: Central jets p_T (GeV), hardware vs. emulator, with emulator/hardware ratio panel; CMS Preliminary, 11.4 pb^-1, √s = 13 TeV; Entries 616972, Mean 62.76]
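A bit-level comparison of this kind reduces to checking output words from hardware against words predicted from the same inputs. The sketch below shows only that comparison step; the word format and the function name are assumptions, and the real emulator is the C++ code mentioned in the talk.

```python
from typing import List, Tuple

def find_mismatches(hardware_words: List[int],
                    emulator_words: List[int]) -> List[Tuple[int, int]]:
    """Return (word index, differing-bit mask) for every disagreeing word.

    The mask is the XOR of the two words, so each set bit marks one bit
    position where hardware and emulator differ.
    """
    if len(hardware_words) != len(emulator_words):
        raise ValueError("hardware and emulator outputs differ in length")
    return [(index, hw ^ emu)
            for index, (hw, emu) in enumerate(zip(hardware_words, emulator_words))
            if hw != emu]
```

An empty result means perfect agreement for that event. A nonempty one points at the affected word and bits, which helps separate a misconfigured LUT from an upstream data problem.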
Online Software
High level cells in CMS trigger online software and standalone Python layer
Control AMC13 and MP7
- Set up clocks, MGTs, link alignment
- Write to registers and LUTs from database
Can configure MP7 for various modes of testing
- Useful for debugging MGT or algorithm firmware, for example
Monitoring
- Clock locks, link alignment, CRC, etc.
- Alarms for shift crew
[Figure: Online software diagram (hardware and software). Central Cell, RCT TS Cell and S1caloL2 TS Cell connected via SOAP; RCT/oRSC SW, PCI2VME, RCT/oRSC HW; AMC13 SW (C++), MP7 SW (C++), IPbus, Control Hub, Bridge PC, network; AMC13 HW, MP7 HW, GT HW; 10 Gbps and 3 Gbps links; Python layer (will stand for testing purposes)]
Figure 4: Schematic view of the available link buffer configurations. a) Simple loop b) Link loop c) Algorithm test d) Data capture
Outline
Overview of two-stage calorimeter trigger upgrade
Stage 1
- New optical links
- New data processing card
- Improved algorithms
Stage 2
- New optical links
- Time multiplexing architecture
- High granularity algorithms
- Commissioning status