SOFTWARE CORRELATOR CONCEPT DESCRIPTION


Document number: WP TD 002
Revision: 1
Author: Dominic Ford
Status: Approved for release
Additional authors: A. Faulkner, J. Kim, P. Alexander
Submitted by: D. Ford (UCAM)
Approved by: W. Turner, Signal Processing Domain Specialist, SPDO

DOCUMENT HISTORY
Revision 1: First issue.

DOCUMENT SOFTWARE
Wordprocessor: Microsoft Word

ORGANISATION DETAILS
SKA Program Development Office, Jodrell Bank Centre for Astrophysics, Alan Turing Building, The University of Manchester, Oxford Road, Manchester, UK, M13 9PL

TABLE OF CONTENTS
1 INTRODUCTION
  1.1 Purpose of the document
2 REFERENCES
3 OVERVIEW
  3.1 FX correlator implementations
  3.2 Beamforming
4 SYSTEM DIAGRAM
5 DATA FLOW THROUGH THE CORRELATOR
  5.1 Notation
  5.2 Calculation of data rates
    5.2.1 Input data rate
    5.2.2 Internal data rates
    5.2.3 Output data rate
  5.3 Floating point operation counts
    5.3.1 The X step
    5.3.2 The B step
    5.3.3 The F step
  5.4 Implications for choice of architecture
6 IMPLEMENTATION ON GPUS
  6.1 Overview of the Tesla platform
  6.2 NVIDIA roadmap
  6.3 The X step
  6.4 The F step
  6.5 The data routing problem
  6.6 Hardware cost estimate
  6.7 Power requirements
  6.8 Non-recurring engineering (NRE)
  6.9 Reliability and maintenance
  6.10 Project Execution Plan (PEP)

LIST OF FIGURES
Figure 1: The typical flow of data through an interferometer with an FX correlator.
Figure 2: System diagram of a software correlator for SKA1-Mid (dishes) implemented using GPGPU cards.
Figure 3: System diagram of a software correlator for SKA1-Low (aperture arrays) implemented using GPGPU cards.
Figure 4: Data transport within an FX correlator.
Figure 5: An NVIDIA C2050 Tesla card, offering a theoretical peak performance of 1030 GFLOP/s in single precision.
Figure 6: The CUDA memory model.
Figure 7: NVIDIA's roadmap for the Tesla product line.

LIST OF TABLES
Table 1: Illustration of how a pair of software correlators could be built for SKA1-Low and SKA1-Mid following the designs shown in Figures 2 and 3. The scenario presented assumes that NVIDIA's Maxwell range of GPGPU cards (expected to be available in 2013) is used, interfaced with 40 Gbit/s fibre connections. Table 2 and Table 3 show how the scenario alters if other generations of NVIDIA GPGPUs are used.
Table 2: Estimates of the number of GPGPU cards which would be required in the SKA1-Low (AA) system design presented in Table 1 if various future generations of NVIDIA GPGPU cards, expected to be released before the deployment of SKA1, were used (see Section 6.2 for details of the assumed specifications).
Table 3: Estimates of the number of GPGPU cards which would be required in the SKA1-Mid (dishes) system design presented in Table 1 if various future generations of NVIDIA GPGPU cards, expected to be released before the deployment of SKA1, were used (see Section 6.2 for details of the assumed specifications).
Table 4: Assumed system parameters for an SKA1 system, based upon Memos 125 and 130. The numbers of frequency channels used are left undefined, since they have little to no effect on the correlator workload calculations presented here. They will, however, have a substantial effect on the workload of the UV processor, not discussed here.
Table 5: Estimated processing requirements of one-dimensional FFTs as a function of size.
Table 6: Anticipated performance of future NVIDIA GPGPU cards, as assumed here.
Table 7: The processing capabilities of NVIDIA Tesla cards, expressed as the RF bandwidth a single card could process within a single beam for SKA1-Low and SKA1-Mid, assuming the numbers of baselines indicated in Table 4.

Table 8: The estimated cost of the GPGPU cards required to implement the X step using various generations of NVIDIA Tesla cards.
Table 9: The estimated power dissipation of the GPGPU cards required to implement the X step using various generations of NVIDIA Tesla cards.

LIST OF ABBREVIATIONS
AA ... Aperture Array
ADC ... Analogue-to-Digital Converter
AI ... Arithmetic Intensity
ASIC ... Application-Specific Integrated Circuit
CoDR ... Conceptual Design Review
CMAC ... Complex Multiplication and ACcumulation
CPU ... Central Processing Unit
CUDA™ ... Compute Unified Device Architecture (NVIDIA 2009)
DFT ... Discrete Fourier Transform
DiFX ... Distributed FX correlator (Deller et al. 2007)
DRAM ... Dynamic Random Access Memory
DRM ... Design Reference Mission
DSP ... Digital Signal Processing
e-MERLIN ... extended Multi-Element Radio-Linked Interferometer Network
eVLA ... Extended Very Large Array
FFT ... Fast Fourier Transform
FLOPS ... Floating Point Operations per Second
FPGA ... Field Programmable Gate Array
FoV ... Field of View
GMRT ... Giant Metrewave Radio Telescope
GPGPU ... General-Purpose Graphics Processing Unit
GPU ... Graphics Processing Unit
HPC ... High-Performance Computing
IF ... Intermediate Frequency
LO ... Local Oscillator
LOFAR ... LOw-Frequency ARray
MPI ... Message Passing Interface (MPI Forum 2009)
MWA ... Murchison Widefield Array
NRE ... Non-Recurring Engineering
PEP ... Project Execution Plan
PrepSKA ... Preparatory Phase for the SKA
RF ... Radio Frequency
SEMP ... Systems Engineering Management Plan
SRS ... Systems Requirement Specification
SIMD ... Single Instruction Multiple Data
SKA ... Square Kilometre Array

SKADS ... SKA Design Studies
SPDO ... SKA Program Development Office
TBD ... To Be Decided
VLBA ... Very Long Baseline Array
WIDAR ... Wideband Interferometric Digital ARchitecture (correlator implementation)

1 Introduction

This document describes software-based architectures which could provide an FX correlator for SKA Phase 1. It provides an assessment of the feasibility of implementing such a correlator on the current generation of NVIDIA general-purpose graphics processing unit (GPGPU) cards, showing that the performance delivered by this line of processors is considerably more competitive than that of the current generation of x86-based processors. It also provides a forecast of how we expect the performance metrics derived from this implementation to evolve between now and the construction of SKA Phase 1 (Garrett et al. 2010, Dewdney et al. 2010), making reference to the digital signal processing (DSP) technology roadmap (Turner 2011). Reasonable GPGPU performance expectations by 2016 show that the SKA Phase 1 correlator can be implemented economically using this platform.

1.1 Purpose of the document

The purpose of this document is to provide a concept description as part of a larger document set in support of the SKA Signal Processing concept design review (CoDR). It provides a bottom-up perspective of how a software correlator could be implemented. This document has been produced in accordance with the Systems Engineering Management Plan (SEMP) and the Signal Processing PrepSKA Work Breakdown document, and includes:
o A first-draft block diagram of the relevant subsystem.
o First-draft estimates of cost.
o First-draft estimates of power.
o A discussion of reliability issues.
System parameters for SKA Phase 1 have been drawn from Garrett et al. (2010), Dewdney et al. (2010) and the SKA Phase 1 Design Reference Mission (DRM) while the Systems Requirement Specification (SRS) is being created.

2 References

[1] Garrett, M.A., et al. (2010), A Concept Design for SKA Phase 1 (SKA1), Memo 125
[2] Dewdney, P., et al. (2010), SKA Phase 1: Preliminary System Description, Memo 130
[3] Turner, W. (2011), Technology Roadmap Document for SKA Signal Processing, WP TD 001
[4] System Engineering Management Plan (SEMP), WP MP 001
[5] SKA System Requirement Specification (SRS)
[6] Thompson, A.R., Moran, J.M., and Swenson, G.W. (2001), Interferometry and Aperture Synthesis in Radio Astronomy, second ed., Wiley (New York)
[7] Deller, A.T., et al. (2007), DiFX: A Software Correlator for Very Long Baseline Interferometry using Multiprocessor Computing Environments, PASP, 119, 318
[8] MPI Forum (2009), MPI: A Message Passing Interface Standard, Version 2.2
[9] Roy, J., et al. (2010), A real-time software backend for the GMRT, ExA, 28, 25

[10] Romein, J.W., et al. (2009), Astronomical Real-Time Streaming Signal Processing on a BlueGene/P Supercomputer
[11] Harris, C., et al. (2008), GPU accelerated radio astronomy signal convolution, Exp Astron, 22, 129
[12] van Nieuwpoort, R.V., and Romein, J.W. (2009), Using Many-Core Hardware to Correlate Radio Astronomy Signals, SKADS DS3-T2 Deliverable Document
[13] Wayth, R.B., Greenhill, L.J., and Briggs, F.H. (2009), A GPU-based real-time software correlation system for the Murchison Widefield Array prototype, PASP, 121, 857
[14] NVIDIA (2009), NVIDIA CUDA™ Programming Guide
[15] Alexander, P., et al. (2010), SKA Data Flow and Processing, in Wide Field Astronomy & Technology for the Square Kilometre Array, ed. Torchinsky, S., et al.

3 Overview

The angular resolution of any telescope scales in proportion to the wavelength being observed, and in inverse proportion to the diameter of the telescope's aperture. The long wavelengths of radio waves mean that any single-antenna instrument would have to measure many kilometres across in order to resolve structure within many astronomically interesting objects. Interferometric arrays allow fine structure on the sky to be resolved without the need to build such prohibitively large monolithic antennas, by combining the signals from many widely spaced smaller antennas in a process called aperture synthesis (see, e.g., Thompson et al. 2001).

Central to any interferometer is the correlator, which brings together the signals from the individual antennas. The correlator cross-multiplies complex-valued measurements of the radio frequency (RF) electric field produced by pairs of antennas to produce visibilities, which are related to the spatial brightness distribution of the sky by a transformation which, in the small field-of-view, flat-sky limit, reduces to a Fourier transformation. These complex visibilities are summed over some short period of time, equivalent to taking a finite-length exposure of the sky, before being stored for subsequent calibration, gridding, image formation and deconvolution.

[Figure 1: The typical flow of data through an interferometer with an FX correlator. The antennas feed a time-domain FFT and cross-correlation stage (high data rate, low complexity), followed by time integration, spatial FFT and imaging (lower sample rate, high complexity).]

This process is broken down in more detail in Figure 1. Each antenna produces measurements of the received time-varying RF electric field in one or more polarisations. In a heterodyne receiver, these RF signals may be shifted down to an intermediate frequency (IF) using an analogue mixer connected to a local oscillator (LO). This signal is then passed to an analogue-to-digital converter (ADC) for digitisation. Samples must be taken at a minimum rate of the bandwidth of the antenna, the Nyquist rate, if both the real and imaginary parts of the input sinusoidal signal are recorded, and at twice this rate otherwise. In practice, the sampling rate N_s usually slightly exceeds this, and the signal is said to be oversampled.

In an FX correlator of the type discussed here, the samples from each antenna/polarisation are collected into blocks of length N_f, each of which is Fourier transformed (FFTed) or polyphase filtered (PPFed) to form an electromagnetic spectrum with N_f frequency channels; we term this the F step of the correlator. After the F step, each antenna/polarisation yields samples at a rate of N_s/N_f in each of N_f frequency channels, which are handled independently from here onwards. Within each channel, samples from each pair of antennas are delay-compensated and cross-multiplied to form visibilities; we term this the X step of the correlator. The visibilities are time-integrated and periodically recorded at some dump rate. The maximum period of time over which each visibility can be integrated is constrained by the time taken for the rotation of the Earth to become significant over the angular length scale associated with that visibility. The resultant time-integrated visibilities are then stored, later to be passed to gridding and image-formation algorithms.

This signal chain may be divided into two regimes. Prior to the time-integration step, the sample rate is necessarily high, as it has to exceed the Nyquist rate of the interferometer's bandwidth, but the algorithms being applied to this data have low arithmetic intensities¹ (AIs). After the time-integration step, the sample rate is much reduced², but the algorithms required for image formation have much higher AIs.

¹ Arithmetic intensity is defined as the average number of floating-point operations which need to be performed on a stream of data per byte processed.
² For SKA Phase 2, this assumption may no longer hold (Alexander et al. 2010), as the longest baselines will sample the sky on length scales where the rotation of the Earth becomes significant on very short timescales, making very fast dump rates necessary. For SKA Phase 1, however, this assumption is valid.

3.1 FX correlator implementations

The majority of digital FX correlators built in recent times have used application-specific integrated circuits (ASICs) and field-programmable gate arrays (FPGAs). We term these hardware correlators. Examples include the WIDAR correlators used by the extended Very Large Array (eVLA) and e-MERLIN, that used until recently by the Giant Metrewave Radio Telescope (GMRT), and that being built for the full 128-antenna Murchison Widefield Array (MWA).

Driven by the increasing processing power and decreasing cost of mass-produced desktop PCs, however, an alternative approach of using general-purpose processors, which can be programmed using conventional programming languages, has been growing increasingly competitive. We term these software correlators. Deller et al. (2007) have developed a correlator called DiFX which runs on clusters of off-the-shelf PCs using the MPI messaging system (MPI Forum 2009); this is used by the Australian Long Baseline Array (LBA) and the Very Long Baseline Array (VLBA). LOFAR uses a software correlator running on an IBM BlueGene/P supercomputer (Romein et al. 2009). The GMRT has recently deployed a custom-designed software correlator (Roy et al. 2010).

Several papers have studied the feasibility of deploying software correlators on general-purpose graphics processing units (GPGPUs; see, e.g., Harris et al. 2008 and van Nieuwpoort & Romein 2009). Such processors were originally developed for the computer games market, but interfaces are now available which ease the use of their phenomenal processing power for more generalised high-performance computing (HPC) tasks.

The most widely used of these programming environments is the Compute Unified Device Architecture (CUDA™), a vendor-specific interface for NVIDIA's line of GPGPUs. Wayth et al. (2009) report the results of testing an experimental CUDA-based FX correlator on the 32-antenna prototype of the MWA.

Several key advantages of software correlators can be identified:
o Rapid development cycles. Software correlators are designed using conventional programming languages for which mature software development tools and debugging suites already exist. Unlike ASICs, which have long development cycles, software correlators can track advances in technology very closely.
o Reduced NRE. Pre-existing hardware is used, which is already well tested and mass-produced for the consumer market.
o Easy reconfigurability. It is straightforward to reconfigure the correlator post-deployment, perhaps in order to use alternative algorithms developed after installation, or in response to hardware failure. The same processors could potentially be used for both correlation and beamforming (see Section 3.2). It is conceivable to implement multiple observing modes which correlate different numbers of frequency channels or different numbers of antennas using the same hardware.

These advantages come at the cost of a small decrease in FLOP-per-unit-silicon efficiency as compared to application-specific hardware, since the inherent flexibility of general-purpose processors requires a sizeable fraction of the processor die to be dedicated to flow control. This in turn leads to a small decrease in power efficiency, since the dynamic power consumption of a processor is proportional to the number of gates (see, e.g., Turner 2011). In this document we present a study of the feasibility of deploying a software correlator for SKA Phase 1.

3.2 Beamforming

SKA1 will implement beamforming using at least two hierarchical levels within the signal path. Aperture array stations will beamform their constituent antenna tiles into around 160 beams (Dewdney et al. 2010) before data is transmitted back to the central processing bunker. However, some of the science objectives of SKA1, in particular long-term pulsar monitoring, do not require imaging but rather high time resolution in a single beam. For these science objectives, the correlator and image processor would be replaced by a much simpler beamformer.

As noted above, software correlators are implemented using general-purpose processors which can be reconfigured near-instantaneously to run different algorithms on the same hardware. These algorithms can even run concurrently. It is therefore quite conceivable that a software correlator could be used to image one part of the sky whilst simultaneously forming a beam to monitor a nearby pulsar. We show in Section 6.4 that the cost overhead of doing so is modest, saving the cost of an additional beamforming system.
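To make the F-step, X-step and B-step decomposition described above concrete, the following is a minimal illustrative sketch of an FX correlator operating on simulated data. It is our own toy example rather than part of the SKA design: the array sizes, variable names and unit beamforming weights are assumptions made purely for illustration.

import numpy as np

# Toy sizes (illustrative only; the real SKA1 values are given in Table 4).
N_a, N_f, N_blocks = 4, 256, 100     # antennas, frequency channels, FFT blocks to integrate

# Simulated complex voltage streams, one per antenna (single polarisation).
rng = np.random.default_rng(0)
volt = (rng.normal(size=(N_a, N_f * N_blocks))
        + 1j * rng.normal(size=(N_a, N_f * N_blocks)))

# F step: cut each stream into blocks of N_f samples and FFT each block,
# giving a cube indexed by (antenna, time block, frequency channel).
spectra = np.fft.fft(volt.reshape(N_a, N_blocks, N_f), axis=2)

# X step: for every antenna pair (baseline), multiply one spectrum by the
# conjugate of the other and accumulate over the time blocks (the CMAC loop).
vis = np.zeros((N_a, N_a, N_f), dtype=complex)
for i in range(N_a):
    for j in range(i, N_a):                     # auto- plus cross-correlations
        vis[i, j] = np.sum(spectra[i] * np.conj(spectra[j]), axis=0)

# B step (optional): a beam towards the phase centre is the coherent sum of
# the per-antenna spectra (unit weights here; a real beamformer would apply
# per-antenna phase/delay weights before summing).
beam = spectra.sum(axis=0)

print(vis.shape, beam.shape)    # (4, 4, 256) (100, 256)

Note that the same F-step output feeds both the cross-correlation and the beam sum, which is the point made above about sharing the same processing hardware between the X and B steps.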

4 System diagram

In Section 5, we develop a generic description of a software correlator which is not tied to any particular hardware platform. This approach is taken in recognition of the fact that there are several other lines of processors, known to be on the roadmaps of other vendors, which may well emerge as promising candidate platforms over the coming few years. For example, Intel demonstrated a 32-core 1.2 GHz x86 processor known as Knights Ferry in 2010, and is expected to demonstrate a similar 50-core processor, Knights Corner, in late 2011 (see, e.g., Turner 2011). The roadmap of ATI (owned by AMD) is less clearly known, and though the support of their GPUs for the HPC market is currently limited, this may well change. The line of DSP processors produced by Picochip may also provide a competitive platform for implementing software correlators within the timeframe of SKA1.

In Section 6, we go on to provide a specific example of how such a correlator could be implemented on the NVIDIA Tesla line of GPGPU cards. Figures 2 and 3 summarise the conclusions of this discussion, showing system diagrams for the correlators for SKA1-Mid and SKA1-Low respectively. The data flow through these correlators is given in Table 1. In quoting the numbers of GPGPU cards required for each correlator, we assume that NVIDIA's Maxwell line of GPGPU cards is used, which is expected to be available in 2013, and that these cards achieve 75% of their theoretical performance (see Section 6.3). Tables 2 and 3 show how the number of cards needed might be expected to change for successive generations of NVIDIA cards (see Section 6.2 for the specifications assumed for these). All network interconnections are assumed to operate at 40 Gbit/s, which is already achievable.

In Figures 2 and 3, each antenna is connected to a backend which amplifies, filters, digitises and coarsely channelises the received signal. In the case of aperture array stations, station-wide beamforming occurs immediately before coarse channelisation. The resulting data stream is transmitted over fibre to the central processing bunker. The signals from the N_a antennas are fed into N_FPGA FPGA-based F-step subsystems, which buffer the data to produce the required delays and finely channelise each signal. Each subsystem is anticipated to receive data from up to N_FPGA,in antennas. The channelised output data is divided into a small number N_switches of broad frequency bands, each of which is transmitted over fibre to a separate network switch. Each network switch collates data from all of the F-step subsystems, and therefore all the antennas, within its particular frequency band. Data is then routed to one of a number of PCs hosting GPGPU cards, each of which performs cross-correlation and time integration on a subset of the frequency channels within the switch's frequency band.

Tables 2 and 3 estimate the numbers of GPGPU cards required to implement each correlator using various generations of NVIDIA hardware. For reference, the price of each generation of GPGPU card is not expected to change significantly from 2,500, and each card is expected to dissipate around 250 W. The rate of flow of data onto each card is shown; this increases with time as the cards become able to process more data. The current generation of PCI Express 2.0 cards can receive data at a maximum rate of 64 Gbit/s (theoretical peak). Though PCI Express 3.0 is not currently widely supported, its baseline specification was published in November 2010 and supports a maximum transfer rate of 128 Gbit/s (theoretical peak). It seems reasonable to assume that future NVIDIA cards will use this bus, and that at least another doubling of speed will be achieved before SKA Phase 1 is deployed.

[Figure 2: System diagram of a software correlator for SKA1-Mid (dishes) implemented using GPGPU cards. Each antenna backend performs analogue gain/filtering, digitisation and coarse channelisation before transmission over a fibre link to the central processing bunker. N_a = 250 inputs, two into each F-step subsystem; N_FPGA = 125 FPGA-based subsystems apply bulk delay and fine channelisation. The output frequency channels are divided into sixteen broad frequency bands, each handled by a separate switch: N_switch = 16 switches, each handling data from around a 100 MHz bandwidth segment out of the total 1.55 GHz bandwidth of SKA1-Mid (450 MHz to 2 GHz). The X step uses N_GPGPU = 272 NVIDIA GPGPU cards (Maxwell series, available 2013); each switch is connected to 17 cards via host PCs, with data readout and control and monitoring also attached to the switches.]

[Figure 3: System diagram of a software correlator for SKA1-Low (aperture arrays) implemented using GPGPU cards. Each antenna backend performs analogue gain/filtering, digitisation, station beamforming and coarse channelisation before transmission over a fibre link to the central processing bunker. N_a = 50 inputs, six into each F-step subsystem for each beam; N_FPGA = 9 subsystems per beam apply bulk delay and fine channelisation. For each beam, a single switch collates data from the nine F-step subsystems and passes it on to three GPGPU cards, each of which processes a third of the total bandwidth of SKA1-Low: N_GPGPU = 3 NVIDIA GPGPU cards (Maxwell series, available 2013) per beam. A total of 480 GPGPU cards are required to form 160 beams; these are connected to 160 switches and 1440 FPGA-based F-step subsystems.]

Table 1: Illustration of how a pair of software correlators could be built for SKA1-Low and SKA1-Mid following the designs shown in Figures 2 and 3. The scenario presented assumes that NVIDIA's Maxwell range of GPGPU cards (expected to be available in 2013) is used, interfaced with 40 Gbit/s fibre connections. Table 2 and Table 3 show how the scenario alters if other generations of NVIDIA GPGPUs are used.

Symbol | Description | SKA1-Low (AA), per beam | SKA1-Mid (Dishes)
N_a | Number of input antennas | 50 | 250

F-step subsystems (FPGAs)
N_FPGA | Number of FPGA boards | 9 | 125
N_FPGA,in | Number of inputs to each FPGA subsystem | 6 | 2
N_FPGA,out | Data throughput of each FPGA subsystem | 36.5 Gbit/s | 49.6 Gbit/s
 | Number of outputs from each FPGA subsystem | 1 | 16
g_FPGA,out | Bitrate on each output line | 36.5 Gbit/s | 3.1 Gbit/s

Switches
N_switches | Number of switches (also frequency bands) | 1 | 16
N_FPGA | Number of inputs to each switch | 9 | 125
n_GPGPU | Number of outputs from each switch | 3 | 17
 | Minimum number of ports on switch (including monitoring and control) | — | —
 | Data throughput of switch | 304 Gbit/s | 387.5 Gbit/s

GPGPU cards
n_GPGPU | Number of Maxwell GPGPU cards connected to each switch | 3 * | 17 *
 | Data rate into each Maxwell GPGPU card | 101 Gbit/s * | 23 Gbit/s *
N_GPGPU | Total number of Maxwell GPGPU cards | 480 * | 272 *

* These figures assume that NVIDIA's Maxwell (2013) range of GPGPU cards is used. See Table 2 and Table 3 for the numbers of cards which would be needed if other ranges of GPGPU cards were used.

Table 2: Estimates of the number of GPGPU cards which would be required in the SKA1-Low (AA) system design presented in Table 1 if various future generations of NVIDIA GPGPU cards, expected to be released before the deployment of SKA1, were used (see Section 6.2 for details of the assumed specifications).

GPGPU card | Expected release year | Number of GPGPU cards needed per switch | Total number of GPGPU cards needed | Data rate onto each card (Gbit/s)
Fermi | 2009 | — | — | —
Kepler | 2011 | — | — | —
Maxwell | 2013 | 3 | 480 | 101
??? | 2015 | — | — | —
??? | 2017 | — | — | —

Table 3: Estimates of the number of GPGPU cards which would be required in the SKA1-Mid (Dishes) system design presented in Table 1 if various future generations of NVIDIA GPGPU cards, expected to be released before the deployment of SKA1, were used (see Section 6.2 for details of the assumed specifications).

GPGPU card | Expected release year | Number of GPGPU cards needed per switch | Total number of GPGPU cards needed | Data rate onto each card (Gbit/s)
Fermi | 2009 | — | — | —
Kepler | 2011 | — | — | —
Maxwell | 2013 | 17 | 272 | 23
??? | 2015 | — | — | —
??? | 2017 | — | — | —
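The derivable data-rate entries of Table 1 follow from the system parameters of Table 4 and the expressions developed in Section 5.2 below. The short sketch that follows is our own cross-check under exactly those stated assumptions (Maxwell cards; 16 switches and 17 cards per switch for SKA1-Mid; one switch, three cards and six stations per FPGA subsystem per SKA1-Low beam); it is illustrative rather than part of the design, and the function name and layout arguments are ours.

# Cross-check of the Table 1 data rates from the Table 4 parameters
# (our re-derivation; all layout numbers are taken from Figures 2 and 3).

def rates(name, N_a, N_p, N_beam, N_s, N_b, n_switch, cards_per_switch, ants_per_fpga):
    per_ant = 2 * N_p * N_beam * N_s * N_b            # g_A,out: bits/s from one antenna
    total_in = N_a * per_ant                           # G_A,out: bits/s into the correlator
    per_fpga = ants_per_fpga * 2 * N_p * N_s * N_b     # throughput of one F-step subsystem, per beam
    per_switch = total_in / (N_beam * n_switch)        # one switch handles one beam / one sub-band
    per_card = per_switch / cards_per_switch           # ingest rate of one GPGPU card
    G = 1e9
    print(f"{name}: antenna {per_ant/G:.1f} Gb/s, total {total_in/1e12:.2f} Tb/s, "
          f"FPGA {per_fpga/G:.1f} Gb/s, switch {per_switch/G:.1f} Gb/s, card {per_card/G:.1f} Gb/s")

# SKA1-Low (AA): 50 stations, 160 beams, 380 MHz, 4+4-bit complex samples.
rates("SKA1-Low", N_a=50, N_p=2, N_beam=160, N_s=380e6, N_b=4,
      n_switch=1, cards_per_switch=3, ants_per_fpga=6)

# SKA1-Mid (dishes): 250 dishes, 1 beam, 1.55 GHz, 4+4-bit complex samples.
rates("SKA1-Mid", N_a=250, N_p=2, N_beam=1, N_s=1.55e9, N_b=4,
      n_switch=16, cards_per_switch=17, ants_per_fpga=2)

The computed per-subsystem, per-switch and per-card rates agree with the values which survive in Table 1 (49.6 Gbit/s per SKA1-Mid FPGA subsystem and 304 Gbit/s per SKA1-Low switch).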

5 Data flow through the correlator

A software correlator acts on a continuous flow of data, which it must be able to process in pseudo-real time. This means that it must, on average, be able to process data samples as fast as they arrive. This differs from stricter definitions of real-time processing in the sense that it may be possible to buffer data for a short time, but the amount of buffered data cannot be allowed to grow indefinitely. A streamed processor may be limited either by the rate at which data can be fed into it (data-limited), or by the rate at which it can perform the required computations on the data (processing-limited). In this section, we enumerate the data throughput and processing requirements of a software correlator for SKA Phase 1.

5.1 Notation

To proceed, the specification of the FX correlator must be formalised. We assume that it accepts input data streams from N_a antennas, each measuring N_p polarisations (either one or two) and simultaneously observing N_beam beams. Each input stream consists of complex integer data made up of words consisting of an N_b-bit real component and an N_b-bit imaginary component for each polarisation. These words arrive at a sample rate of N_s, which must exceed the Nyquist rate of Δν, where Δν is the total bandwidth of the antenna.

In the correlator itself, the F step FFTs or PPFs each data stream from the time domain into N_f frequency channels. The X step correlates all N_p² polarisation pairs and all N_B = N_a(N_a+1)/2 antenna pairs, i.e. baselines, within each of these channels. The latter comprises N_a(N_a-1)/2 cross-correlations and N_a auto-correlations. Each correlation product is summed over some time period t before being dumped as output data.

The maximum time period over which each correlation product may be summed is constrained by the time taken for the rotation of the Earth to become significant over the angular scale associated with the correlation product. For short baselines measuring large angular scales, long integrations may be acceptable, but for longer baselines, shorter integrations are necessary to prevent smearing. Though all interferometers built to date typically use fixed (fast) dump rates for all baselines, Alexander et al. (2010) have pointed out that for a telescope with the angular resolution of SKA Phase 2, the dump rates of the longest baselines will need to be so fast that the output data rate from the correlator will exceed the input data rate unless baseline-dependent dump rates are used. This is less of an issue for SKA Phase 1, but we nonetheless write t(B) here, where B is baseline length, noting that we require t(B) < D/(2Bωf) to satisfy the criterion given by Alexander et al. (2010), where D is the diameter of the collectors used, ω ≈ 7.3 × 10⁻⁵ rad/s is the rotation rate of the Earth, and f is the factor by which the visibilities are oversampled. We take f = 4 here. In Table 4, ballpark values for these system parameters are drawn from Memos 125 and 130.
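As a worked example of this constraint, the short sketch below (our own, using the collector diameters of Table 4, the longest SKA1 baseline of 100 km, and the values of ω and f just quoted) evaluates the maximum integration time for each telescope.

# Maximum visibility integration time t(B) < D / (2 B omega f), evaluated for
# the longest SKA1 baseline (illustrative check of the constraint in Section 5.1).
OMEGA = 7.3e-5        # Earth rotation rate, rad/s
F_OVERSAMPLE = 4      # oversampling factor f assumed in the text
B_MAX = 100e3         # longest SKA1 baseline, m

for name, D in [("SKA1-Low station (D = 180 m)", 180.0),
                ("SKA1-Mid dish (D = 15 m)", 15.0)]:
    t_max = D / (2 * B_MAX * OMEGA * F_OVERSAMPLE)
    print(f"{name}: t(B = 100 km) < {t_max:.2f} s")
# -> roughly 3 s for SKA1-Low and 0.26 s for SKA1-Mid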

Table 4: Assumed system parameters for an SKA1 system, based upon Memos 125 and 130. The numbers of frequency channels used are left undefined, since they have little to no effect on the correlator workload calculations presented here. They will, however, have a substantial effect on the workload of the UV processor, not discussed here.

Parameter | SKA1-Low (AA) | SKA1-Mid (Dishes)
N_beam | 160 | 1
D | 180 m | 15 m
N_a | 50 | 250
N_B | 1,275 | 31,375
N_p | 2 | 2
Δν | 380 MHz (70 MHz to 450 MHz) | 1.55 GHz (450 MHz to 2 GHz)
N_s | 380 Msample/s | 1.55 Gsample/s
N_f | N_f,low | N_f,mid
N_b | 4 | 4

Both the F step and the X step consist of many parallel and independent computations, as shown in Figure 4. The F step consists of N_a N_p parallel FFT/PPFs of length N_f per elapsed time interval N_f/N_s (i.e. at a rate of N_s/N_f per second). The X step consists of N_f N_p parallel cross-multiplication and accumulation tasks, each of which receives a single input sample from each antenna/polarisation in the same time interval. In enumerating the data rates into, and the processing requirements of, each calculation, we use G to denote the total requirements of a set of parallel computations, and g to denote the requirements of each individual computation. Next to some of the algebraic expressions below, we quote bit and FLOP rates derived from the above system parameters for SKA1-Low and SKA1-Mid.

[Figure 4: Data transport within an FX correlator. Input data (integer): bitrate = N_a antennas × N_p polarisations × N_beam beams × N_s samples per second × 2 complex components × N_b-bit words. F step: a total of N_a N_p N_beam parallel FFT/PPFs of length N_f, each computed at a rate of N_s/N_f times a second. X step: a total of N_f N_p N_beam parallel correlations; each correlation product within each frequency channel is accumulated over t(B) N_s/N_f discrete samples (floating-point accumulations). Output data: bitrate = N_f channels × N_beam beams × N_B baselines × N_p² polarisation pairs × 1/t(B) samples per second × 2 complex components × 32-bit words (single precision).]

5.2 Calculation of data rates

In this section, we give expressions for the rate at which data enters and leaves each of the computational units shown in Figure 4.

5.2.1 Input data rate

The rate at which each antenna produces data is
  g_A,out = 2 N_p N_beam N_s N_b,
and the total rate at which data enters the correlator is
  G_A,out = 2 N_a N_p N_beam N_s N_b.

5.2.2 Internal data rates

The rate of flow of data into each of the parallel F-step operations is
  g_F,in = 2 N_s N_b,
and the total rate of flow of data into the F step is
  G_F,in = 2 N_a N_p N_beam N_s N_b.
The flow rate of data out of each FFT/PPF is the same as the flow rate into it:

  g_F,out = g_F,in = 2 N_s N_b,
and
  G_F,out = G_F,in = 2 N_a N_p N_beam N_s N_b = G_X,in.
The flow rate of data into the X step for each frequency channel is given by dividing this rate by N_f:
  g_X,in = 2 N_a N_p N_beam N_s N_b / N_f   (6.200/N_f Tbit/s for SKA1-Mid).
If the correlator is also used for beamforming, then each beamformer receives the same flow of data as the corresponding X step for the same frequency channel and input antenna beam:
  g_B,in = g_X,in = 2 N_a N_p N_s N_b / N_f   (6.200/N_f Tbit/s for SKA1-Mid).

5.2.3 Output data rate

In Section 5.1 we noted that the X step can only integrate each visibility for some maximum time period t(B) given by
  t(B) < D/(2Bωf)
before the rotation of the Earth causes smearing in the (u,v) plane (Alexander et al. 2010). This time period is a function of baseline length, and in theory the output data rate from the correlator could be reduced by dumping visibilities at different rates for different baselines. However, in the calculations presented here, we assume for simplicity that all visibilities are dumped at a rate appropriate for a baseline length of 100 km, the longest baseline present in SKA1. If this dump rate is used, the data rate at which each X step produces time-integrated visibilities is
  g_X,out = 2 × 32 × N_B N_p² / t(B).
The total output data rate from the correlator is given by multiplying this by the number of frequency channels (i.e. parallel X steps):
  G_X,out = 2 × 32 × N_f N_B N_p² / t(B).
In practice, it is apparent that the flow of data out of the X step of the correlator is much smaller than the flow into it; we envisage that it will be read out to a central server using the same switches which supply the data to the compute nodes, before being transmitted out of the central processing bunker over fibre.

5.3 Floating point operation counts

The number of arithmetic operations required to perform each task can be similarly quantified.

5.3.1 The X step

Each cross-correlation and accumulation block needs to perform N_s N_p² N_B / N_f complex multiplication-and-accumulation (CMAC) operations per second. Assuming that each CMAC comprises eight floating-point operations³, this corresponds to
  g_X,FLOP = 8 N_s N_p² N_B / N_f   (15.50/N_f TFLOP/s for SKA1-Low; 1.556/N_f PFLOP/s for SKA1-Mid)
per frequency channel per beam, or a total of
  G_X,FLOP = 8 N_beam N_s N_p² N_B.

³ On some platforms, e.g. NVIDIA GPGPU cards, a figure of four floating-point operations is more appropriate, since some processors implement combined multiply-and-accumulate instructions which take the same time as straightforward multiplication instructions.

5.3.2 The B step

If the correlator is also used for beamforming, then the number of operations required by each B step to form a single beam is much smaller than that required by the corresponding X step. In total, N_s N_p N_a / N_f CMAC operations must be performed per second, corresponding to
  g_B,FLOP = 8 N_s N_p N_a / N_f   (304/N_f GFLOP/s for SKA1-Low; 6.2/N_f TFLOP/s for SKA1-Mid)
per frequency channel per beam being formed, or a total of
  G_B,FLOP = 8 N_s N_p N_a   (304 GFLOP/s for SKA1-Low; 6.2 TFLOP/s for SKA1-Mid)
per beam being formed.

5.3.3 The F step

Quantifying the rate of floating-point operations required by each F step is more problematic, owing to the plethora of optimised algorithms which exist for performing FFTs. However, taking the radix-2 Cooley-Tukey algorithm as a suboptimal but representative implementation, we can estimate that each FFT requires around (N_f/2) log₂ N_f multiplication operations and N_f log₂ N_f addition operations, a total of (3N_f/2) log₂ N_f operations per FFT. This FLOP count is evaluated in Table 5 for several values of N_f; the final row shows the FLOP count required to divide the full 380 MHz bandwidth of SKA1-Low into 1 kHz channels. The third column of the table expresses each FLOP count per bit of data input into the FFT.

Table 5: Estimated processing requirements of one-dimensional FFTs as a function of size.

FFT size N_f | FLOP count per FFT / kFLOP | g_F,FLOP / g_F,in
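As an illustration of this estimate, the short sketch below (ours) evaluates it for a few representative channel counts, including the 380,000 channels needed to divide the 380 MHz SKA1-Low band into 1 kHz channels, together with the FLOP-per-input-bit ratio used in Section 5.4 (assuming N_b = 4-bit samples, i.e. 2 N_b bits per complex input sample).

import math

# Radix-2 estimate: (3 N_f / 2) * log2(N_f) floating-point operations per FFT,
# and the same count expressed per input bit (each complex sample is 2 * N_b bits).
N_b = 4
print(f"{'N_f':>10} {'kFLOP per FFT':>15} {'FLOP per input bit':>20}")
for N_f in (1_000, 10_000, 100_000, 380_000):    # 380,000 ~ 380 MHz / 1 kHz channels
    flop = 1.5 * N_f * math.log2(N_f)
    per_bit = flop / (2 * N_b * N_f)             # = (3/2) log2(N_f) / (2 N_b)
    print(f"{N_f:>10} {flop/1e3:>15.0f} {per_bit:>20.2f}")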

The F step of the correlator needs to perform these FFTs at a rate of N_s/N_f times per second per antenna per polarisation, corresponding to a processing rate of
  g_F,FLOP = (3 N_s / 2) log₂ N_f
per antenna per polarisation, or a total of
  G_F,FLOP = N_a N_p (3 N_s / 2) log₂ N_f.

5.4 Implications for choice of architecture

To assess the suitability of various computer architectures for the tasks described above, we consider first the black-box specifications of each architecture in terms of its indivisible unit of system (for example, an x86 server, FPGA board or GPGPU card). We consider the maximum rate G_data at which data can be transferred onto each system, and the rate G_FLOP at which it performs floating-point operations. Comparing these rates against the values calculated above yields a lower limit on the number of parallel systems required, since it neglects the time taken internally transferring data within each system.

The number of parallel blocks shown in Figure 4 is large, and so it is to be envisaged that many blocks will run in parallel on each system; for example, many independent frequency channels will be correlated in parallel on each X-step system. A metric which is useful to determine the extent to which this is feasible is the ratio G_data/G_FLOP. If this ratio is less than the corresponding data-to-FLOP ratios of the tasks described above, then the system will be limited by the rate at which data can be transported into it, and it will be unable to achieve its full processing potential. Both G_data (determined by the speed of the PCI Express bus) and G_FLOP (determined by the speed of the individual GPGPU card) are crudely expected to follow Moore's Law for most systems in coming years, and so this ratio is not expected to change significantly before the deployment of SKA1.

Taking the example of a Tesla C2050 GPGPU card (available now; see Section 6 for more details) working in single-precision arithmetic, G_data = 64 Gbit/s and G_FLOP = 1030 GFLOP/s. Assessing its suitability for the X step, we require that
  g_X,FLOP / g_X,in = 4 N_p N_B / (N_a N_b) > 1030/64 = 16.1 FLOP/bit
if the GPGPU card is to be able to achieve FLOP-limited performance. For values of N_p, N_B and N_a appropriate for SKA1, this equates to N_a > 16 antennas, which is easily satisfied. However, assessing its suitability for the F step, we observe that
  g_F,FLOP / g_F,in = (3/2) log₂ N_f / (2 N_b)
is much less than 16.1 FLOP/bit for any number of channels N_f which might be used in SKA1 (see the third column of Table 5). GPGPUs are only likely to be useful for the F step if data is retained on the same GPGPU card for the subsequent X step, and in practice this is unlikely to be possible.

Beamforming suffers from a similar problem of low arithmetic intensity: g_B,FLOP / g_B,in = 1 FLOP/bit. However, if it can be executed in parallel with an X step working on the same data on the same system, then the data need only be transferred onto each system once for both the X and the B step.

In practice, this is likely to be straightforward to achieve, and the fact that beamforming and cross-correlation can operate on the same data simultaneously is a strong argument for using the same processing hardware for both.

6 Implementation on GPUs

In this section, we provide a specific example of how a software correlator could be implemented using NVIDIA GPGPU cards. These represent the only platform which could, with an architecture which is already available, economically meet the performance requirements of a software correlator for SKA1. Other rival processor lines, such as Intel's Knights Ferry, ATI's line of GPUs, and Picochip's specialist DSP processors, could become competitive within the next five years, but are either not competitive or not available at present. Other options, such as a Beowulf cluster of commodity x86 PCs, may be ruled out as unable to economically meet the performance requirements listed above. We have shown that the X step requires around 4 PFLOP/s between SKA1-Low and SKA1-Mid, which would today require of order hundreds of thousands of commodity x86 PCs in such a cluster; such a cluster will not become economic within the next ten years.

GPUs achieve their high performance by using a single instruction multiple data (SIMD) architecture. Whereas modern CPUs devote large numbers of transistors to flow control and data caching, GPUs attach sixteen or more arithmetic units to each flow-control unit, meaning that each instruction can be applied to many data simultaneously. The result is that a much higher proportion of the transistors on each die can be devoted to arithmetic operations, but that the processors are less flexible. Though GPUs were initially developed for high-speed graphics rendering, a distinction is now drawn between GPUs which are programmed using the traditional graphics interfaces such as DirectX, and general-purpose GPUs (GPGPUs) which can be programmed using more generic interfaces such as NVIDIA's CUDA.

NVIDIA's line of high-performance GPGPU cards is the Tesla series, and the latest-generation Fermi C2050 cards (released 2009; see Figure 5) can each deliver a theoretical peak performance of 1030 GFLOP/s when working with single-precision floating-point data. They retain the traditional physical packaging of a graphics card, interfacing to a PCI Express bus, which means that an x86 host system is needed to house them. In practice, two or four Tesla cards can often be housed side by side in a single x86 host. The latest PCI Express standard is version 2.0, which can transfer data at a theoretical peak rate of 64 Gbit/s between the host and each Tesla card via a 16-lane slot. In practice, actual transfer rates below this theoretical peak are typically reported. A new PCI Express standard, version 3.0, was published in November 2010 and promises theoretical peak transfer speeds of 128 Gbit/s via an equivalent slot, though neither motherboards nor Tesla cards which support this standard have yet appeared on the market.
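The 4 PFLOP/s figure and the arithmetic-intensity argument of Section 5.4 can be reproduced directly from the Table 4 parameters. The sketch below is our own illustrative re-derivation under the stated assumptions (Fermi C2050 peak taken as 1030 GFLOP/s over a 64 Gbit/s PCI Express 2.0 ingest rate); the channel count N_f only enters the weakly varying F-step ratio and is set arbitrarily here, and the function name is ours.

import math

def summarise(name, N_a, N_p, N_beam, N_s, N_b, N_B, N_f):
    # Total X-step load and per-task arithmetic intensities (Sections 5.3 and 5.4).
    x_total = 8 * N_beam * N_s * N_p**2 * N_B          # FLOP/s over all beams
    ai_x = 4 * N_p * N_B / (N_a * N_b)                 # X step, FLOP per input bit
    ai_f = 1.5 * math.log2(N_f) / (2 * N_b)            # F step (radix-2 estimate)
    ai_b = 4 / N_b                                     # B step (beamforming)
    print(f"{name}: X step {x_total/1e15:.2f} PFLOP/s; "
          f"AI X={ai_x:.0f}, F={ai_f:.1f}, B={ai_b:.0f} FLOP/bit")

C2050_AI = 1030 / 64      # card FLOP rate over PCIe 2.0 ingest rate, FLOP/bit
print(f"Tesla C2050: {C2050_AI:.1f} FLOP/bit available")

# N_f below is an illustrative value only (the channel counts are left open in Table 4).
summarise("SKA1-Low", N_a=50,  N_p=2, N_beam=160, N_s=380e6,  N_b=4, N_B=1275,  N_f=380_000)
summarise("SKA1-Mid", N_a=250, N_p=2, N_beam=1,   N_s=1.55e9, N_b=4, N_B=31375, N_f=380_000)

Running this gives roughly 2.5 PFLOP/s for SKA1-Low and 1.6 PFLOP/s for SKA1-Mid, about 4 PFLOP/s in total, with X-step intensities of about 51 and 251 FLOP/bit respectively, well above the 16.1 FLOP/bit available on a C2050, while the F-step and B-step intensities fall far below it.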

Figure 5: An NVIDIA C2050 Tesla card, offering a theoretical peak performance of 1030 GFLOP/s in single precision.

It is worthy of note that any sizeable cluster of Tesla cards sits alongside a significant cluster of x86 processors in the host machines. Whilst the latter have a modest processing capability in comparison, their processors are more flexible, and they may nonetheless be useful for performing operations on the data such as packing.

6.1 Overview of the Tesla platform

In practice, GPGPUs rarely operate at anything close to their theoretical peak performance. For many applications, core utilisation of around 30% is reported. There are a number of reasons for this, as illustrated by the block diagram of NVIDIA's GPGPU architecture shown in Figure 6. Firstly, the processor's SIMD architecture means that many arithmetic units are controlled by a single instruction unit, and can only be utilised when multiple threads follow exactly the same flow-control path. Programs with many conditionally executed blocks of code suffer a significant performance penalty. Also, the lack of a traditional memory cache in the GPGPU's memory model means that careful management of memory accesses is required. Whilst several thousand registers can be accessed at high speed, accesses to the Tesla card's multi-gigabyte device memory entail a performance penalty of around a hundred clock cycles. If, and only if, consecutive threads access consecutive memory locations simultaneously (a coalesced memory access), the threads pay this performance penalty in parallel rather than in series. Hence an optimised GPU program must carefully consider the ordering of memory accesses and the arrangement of data in structures.

However, the correlation tasks discussed in this document are sufficiently simple, and intrinsically parallel, that a core utilisation fraction of considerably more than 30% might be expected. This is in contrast to the imaging processor, where algorithms are complex and peak efficiencies of only a few percent have been reported. The author has himself achieved 20-30% efficiency in a prototype cross-correlation (X-step) code with minimal effort at optimisation. Lincoln Greenhill (Harvard) and Mike Clark report having empirically achieved 79% efficiency in a similar prototype code (private communication; publication in prep.). This figure includes the time penalties associated with internally transferring data within the Tesla card.

Figure 6: The CUDA memory model.

6.2 NVIDIA roadmap

NVIDIA are notoriously secretive about their product roadmaps. What little is known about the future of the Tesla line of GPGPUs⁴ is based upon a single slide displayed by NVIDIA's CEO, Jen-Hsun Huang, at the GPU Technology Conference in September 2010, shown in Figure 7. Numerous technology websites have attempted to reverse-engineer the numbers which went into the slide, based on the strong likelihood that future generations of cards will draw a similar power to the Fermi C2050 cards already available, around 250 W. Compared to Fermi, their floating-point operation rates are widely expected to rise by a factor of 2.7 by the end of 2011 (Kepler), and by a factor of 7.6 by the end of 2013 (Maxwell). In addition, the Maxwell cards will reportedly use a new architecture which will combine an ARM CPU with the GPU cores. It is highly likely that at least one, and perhaps two, further lines of products will be released before the deployment of SKA1. For the purposes of this document, and given the lack of any better information, we assume that these will be released at two-year intervals, in 2015 and 2017, and that each will double the performance of its predecessor (see Table 6).

⁴ Not to be confused with the Tesla C10xx series of products released earlier within this line.

Figure 7: NVIDIA's roadmap for the Tesla product line.

Table 6: Anticipated performance of future NVIDIA GPGPU cards, as assumed here.

GPGPU card | Expected release year | Performance relative to Fermi C2050 card
Fermi | 2009 | 1.0
Kepler | 2011 | 2.7
Maxwell | 2013 | 7.6
??? | 2015 | 15.2
??? | 2017 | 30.4

6.3 The X step

In this document we assume that the X step can be implemented on NVIDIA GPGPU cards with a core utilisation fraction (i.e. processing efficiency) of 75%. As noted in Section 6.1, Lincoln Greenhill (Harvard) and Mike Clark have reported empirically achieving 79% efficiency in an implementation of a prototype system similar to the one that we require (private communication; publication in prep.). This efficiency figure includes the cost of internal data transfer within the Tesla card.

It is possible to express the processing capability of each Tesla card in terms of the observed bandwidth that it can correlate for a single beam. Inverting the equation given in Section 5.3.1 for the processing requirements of the X step yields
  N_s = G_X,FLOP / (8 N_p² N_B),
where N_s may be equated with the bandwidth which a single card can correlate, and G_X,FLOP is equated with 75% of the theoretical processing capability of the GPGPU card. The resulting bandwidths are shown in Table 7.
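Before turning to Table 7, we note that the per-card bandwidths, and the card counts of Tables 2 and 3, follow from the expression above together with the performance factors of Table 6 and the 75% utilisation figure. The sketch below is our own re-derivation under exactly those assumptions (Fermi C2050 peak taken as 1030 GFLOP/s; cards assigned per switch by simple rounding up); small differences from the original tables may arise from rounding conventions.

import math

FERMI_GFLOPS = 1030.0       # C2050 single-precision peak, GFLOP/s
EFFICIENCY = 0.75           # assumed core utilisation (Section 6.3)
GENERATIONS = [("Fermi", 2009, 1.0), ("Kepler", 2011, 2.7), ("Maxwell", 2013, 7.6),
               ("2015 card", 2015, 15.2), ("2017 card", 2017, 30.4)]

# (name, N_p, N_B, total bandwidth in MHz, number of sub-bands/switches, beams)
SYSTEMS = [("SKA1-Low", 2, 1275, 380.0, 1, 160),
           ("SKA1-Mid", 2, 31375, 1550.0, 16, 1)]

for sys_name, N_p, N_B, bw_mhz, n_switch, n_beam in SYSTEMS:
    print(sys_name)
    for card, year, factor in GENERATIONS:
        flops = EFFICIENCY * factor * FERMI_GFLOPS * 1e9
        bw_per_card = flops / (8 * N_p**2 * N_B)           # N_s = G_X,FLOP / (8 N_p^2 N_B), Hz
        per_switch = math.ceil(bw_mhz / n_switch / (bw_per_card / 1e6))
        total = per_switch * n_switch * n_beam
        print(f"  {card} ({year}): {bw_per_card/1e6:7.1f} MHz per card, "
              f"{per_switch} card(s) per switch, {total} in total")

For the Maxwell generation this reproduces the figures which appear in Table 1 and Figures 2 and 3 (about 144 MHz per card and 480 cards for SKA1-Low; about 5.9 MHz per card and 272 cards for SKA1-Mid).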

Table 7: The processing capabilities of NVIDIA Tesla cards, expressed as the RF bandwidth a single card could process within a single beam for SKA1-Low and SKA1-Mid, assuming the numbers of baselines indicated in Table 4.

GPGPU card | Expected release year | Bandwidth for one SKA1-Low beam / MHz | Bandwidth for SKA1-Mid / MHz
Fermi | 2009 | — | —
Kepler | 2011 | — | —
Maxwell | 2013 | — | —
??? | 2015 | — | —
??? | 2017 | — | —

These values, used to compute the numbers of cards quoted in Tables 2 and 3, assume that the performance of the Tesla cards is limited by their processing capability rather than by the rate at which data can be transferred onto them; this was shown to be the case in Section 5.4. We envisage that visibilities would be read out via the same network routers which supply the input data to the processing nodes. Since we showed in Section 5.2.3 that the output data rate from the X step is at least a factor of 100 smaller than the input data rate, we do not envisage this being a bottleneck.

6.4 The F step

Although FFT implementations exist for GPGPU cards (most notably NVIDIA's CUFFT library), we noted in Section 5.4 that the FLOP-per-input-bit ratio of the F step is considerably lower than the ratio of the processing capability of GPGPU cards to the rate at which data can be transferred onto them. Therefore any GPGPU-based implementation of the F step would be limited by the speed of the PCI Express bus. In the system diagrams presented in Figures 2 and 3, we have therefore suggested that the F step be implemented using FPGA boards. In the system presented, we have assumed that data from multiple antennas can be streamed through each FPGA subsystem, up to a maximum data rate of around 40 Gbit/s per board. This bandwidth is already achievable using the current generation of ROACH boards, and we expect each board to be able to process much higher data rates by the time that SKA1 is deployed, leading to a reduction in the number of boards required. The channelised data emerging from each FPGA subsystem is directed to one of a number of network switches, each handling data from a subset of the observed frequency channels. This design reflects the independence of the data in each frequency channel after the F step, and minimises the need for very large switches.

6.5 The data routing problem

The switches shown in Figures 2 and 3 each accept data from all of the F-step subsystems associated with a particular beam. Thus, they receive data from all of the antennas in either SKA1-Low or SKA1-Mid, but only in a subset of the observed frequency channels. The reason for using multiple switches for each beam is that the frequency channels are treated independently after the F step, and the


More information

Signal Processing on GPUs for Radio Telescopes

Signal Processing on GPUs for Radio Telescopes Signal Processing on GPUs for Radio Telescopes John W. Romein Netherlands Institute for Radio Astronomy (ASTRON) Dwingeloo, the Netherlands 1 Overview radio telescopes motivation processing pipelines signal-processing

More information

Memo 130. SKA Phase 1: Preliminary System Description

Memo 130. SKA Phase 1: Preliminary System Description Memo 130 SKA Phase 1: Preliminary System Description P. Dewdney (SPDO) J-G bij de Vaate (ASTRON) K. Cloete (SPDO) A. Gunst (ASTRON) D. Hall (SPDO) R. McCool (SPDO) N. Roddis (SPDO) W. Turner (SPDO) November

More information

SKA-low and the Aperture Array Verification System

SKA-low and the Aperture Array Verification System SKA-low and the Aperture Array Verification System Randall Wayth AADCC Project Scientist On behalf of the Aperture Array Design & Construction Consortium (AADCC) AADCC partners ASTRON (Netherlands) ICRAR/Curtin

More information

SKA-low DSP and computing overview : W. Turner SPDO. 8 th September 2011

SKA-low DSP and computing overview : W. Turner SPDO. 8 th September 2011 SKA-low DSP and computing overview : W. Turner SPDO 8 th September 2011 Agenda An overview of the DSP and Computing within the Signal Processing Domain (mainly SKA1) Channelisation Correlation Central

More information

Towards SKA Multi-beam concepts and technology

Towards SKA Multi-beam concepts and technology Towards SKA Multi-beam concepts and technology SKA meeting Meudon Observatory, 16 June 2009 Philippe Picard Station de Radioastronomie de Nançay philippe.picard@obs-nancay.fr 1 Square Kilometre Array:

More information

All-Digital Wideband Space-Frequency Beamforming for the SKA Aperture Array

All-Digital Wideband Space-Frequency Beamforming for the SKA Aperture Array All-Digital Wideband Space-Frequency Beamforming for the SKA Aperture Array Vasily A. Khlebnikov, 44-0865-273302, w.khlebnikov@ieee.org, Kristian Zarb-Adami, 44-0865-273302, kza@astro.ox.ac.uk, Richard

More information

Phased Array Feeds & Primary Beams

Phased Array Feeds & Primary Beams Phased Array Feeds & Primary Beams Aidan Hotan ASKAP Deputy Project Scientist 3 rd October 2014 CSIRO ASTRONOMY AND SPACE SCIENCE Outline Review of parabolic (dish) antennas. Focal plane response to a

More information

IMPLEMENTATION OF SOFTWARE-BASED 2X2 MIMO LTE BASE STATION SYSTEM USING GPU

IMPLEMENTATION OF SOFTWARE-BASED 2X2 MIMO LTE BASE STATION SYSTEM USING GPU IMPLEMENTATION OF SOFTWARE-BASED 2X2 MIMO LTE BASE STATION SYSTEM USING GPU Seunghak Lee (HY-SDR Research Center, Hanyang Univ., Seoul, South Korea; invincible@dsplab.hanyang.ac.kr); Chiyoung Ahn (HY-SDR

More information

arxiv: v1 [astro-ph.im] 1 Sep 2015

arxiv: v1 [astro-ph.im] 1 Sep 2015 Experimental Astronomy manuscript No. (will be inserted by the editor) A Real-time Coherent Dedispersion Pipeline for the Giant Metrewave Radio Telescope Kishalay De Yashwant Gupta arxiv:1509.00186v1 [astro-ph.im]

More information

INTERFEROMETRY: II Nissim Kanekar (NCRA TIFR)

INTERFEROMETRY: II Nissim Kanekar (NCRA TIFR) INTERFEROMETRY: II Nissim Kanekar (NCRA TIFR) WSRT GMRT VLA ATCA ALMA SKA MID PLAN Introduction. The van Cittert Zernike theorem. A 2 element interferometer. The fringe pattern. 2 D and 3 D interferometers.

More information

EVLA Memo 105. Phase coherence of the EVLA radio telescope

EVLA Memo 105. Phase coherence of the EVLA radio telescope EVLA Memo 105 Phase coherence of the EVLA radio telescope Steven Durand, James Jackson, and Keith Morris National Radio Astronomy Observatory, 1003 Lopezville Road, Socorro, NM, USA 87801 ABSTRACT The

More information

A model for the SKA. Melvyn Wright. Radio Astronomy laboratory, University of California, Berkeley, CA, ABSTRACT

A model for the SKA. Melvyn Wright. Radio Astronomy laboratory, University of California, Berkeley, CA, ABSTRACT SKA memo 16. 21 March 2002 A model for the SKA Melvyn Wright Radio Astronomy laboratory, University of California, Berkeley, CA, 94720 ABSTRACT This memo reviews the strawman design for the SKA telescope.

More information

Adaptive selective sidelobe canceller beamformer with applications in radio astronomy

Adaptive selective sidelobe canceller beamformer with applications in radio astronomy Adaptive selective sidelobe canceller beamformer with applications in radio astronomy Ronny Levanda and Amir Leshem 1 Abstract arxiv:1008.5066v1 [astro-ph.im] 30 Aug 2010 We propose a new algorithm, for

More information

A Scalable Computer Architecture for

A Scalable Computer Architecture for A Scalable Computer Architecture for On-line Pulsar Search on the SKA - Draft Version - G. Knittel, A. Horneffer MPI for Radio Astronomy Bonn with help from: M. Kramer, B. Klein, R. Eatough GPU-Based Pulsar

More information

A DSP ENGINE FOR A 64-ELEMENT ARRAY

A DSP ENGINE FOR A 64-ELEMENT ARRAY A DSP ENGINE FOR A 64-ELEMENT ARRAY S. W. ELLINGSON The Ohio State University ElectroScience Laboratory 1320 Kinnear Road, Columbus, OH 43212 USA E-mail: ellingson.1@osu.edu This paper considers the feasibility

More information

Phased Array Feeds A new technology for wide-field radio astronomy

Phased Array Feeds A new technology for wide-field radio astronomy Phased Array Feeds A new technology for wide-field radio astronomy Aidan Hotan ASKAP Project Scientist 29 th September 2017 CSIRO ASTRONOMY AND SPACE SCIENCE Outline Review of radio astronomy concepts

More information

THE purpose of beamforming is to precisely align the

THE purpose of beamforming is to precisely align the 1 Beamforming Techniques for Large-N Aperture Arrays K. Zarb-Adami, A. Faulkner, J.G. Bij de Vaate, G.W. Kant and P.Picard arxiv:1008.4047v1 [astro-ph.im] 24 Aug 2010 Abstract Beamforming is central to

More information

Merging Propagation Physics, Theory and Hardware in Wireless. Ada Poon

Merging Propagation Physics, Theory and Hardware in Wireless. Ada Poon HKUST January 3, 2007 Merging Propagation Physics, Theory and Hardware in Wireless Ada Poon University of Illinois at Urbana-Champaign Outline Multiple-antenna (MIMO) channels Human body wireless channels

More information

SKA technology: RF systems & signal processing. Mike Jones University of Oxford

SKA technology: RF systems & signal processing. Mike Jones University of Oxford SKA technology: RF systems & signal processing Mike Jones University of Oxford SKA RF processing Dish receivers Cryogenics RF electronics Fast sampling Antenna processing AA receivers RF gain chain Sampling/antenna

More information

Sideband Smear: Sideband Separation with the ALMA 2SB and DSB Total Power Receivers

Sideband Smear: Sideband Separation with the ALMA 2SB and DSB Total Power Receivers and DSB Total Power Receivers SCI-00.00.00.00-001-A-PLA Version: A 2007-06-11 Prepared By: Organization Date Anthony J. Remijan NRAO A. Wootten T. Hunter J.M. Payne D.T. Emerson P.R. Jewell R.N. Martin

More information

Design of Parallel Algorithms. Communication Algorithms

Design of Parallel Algorithms. Communication Algorithms + Design of Parallel Algorithms Communication Algorithms + Topic Overview n One-to-All Broadcast and All-to-One Reduction n All-to-All Broadcast and Reduction n All-Reduce and Prefix-Sum Operations n Scatter

More information

The SKA LOW correlator design challenges

The SKA LOW correlator design challenges The SKA LOW correlator design challenges John Bunton CSP System Engineer C4SKA, Auckland, 9-10 February, 2017 CSIRO ASTRONOMY AND SPACE SCIENCE SKA1 Low antenna station (Australia) Station beamforming

More information

Some Notes on Beamforming.

Some Notes on Beamforming. The Medicina IRA-SKA Engineering Group Some Notes on Beamforming. S. Montebugnoli, G. Bianchi, A. Cattani, F. Ghelfi, A. Maccaferri, F. Perini. IRA N. 353/04 1) Introduction: consideration on beamforming

More information

Implementing Logic with the Embedded Array

Implementing Logic with the Embedded Array Implementing Logic with the Embedded Array in FLEX 10K Devices May 2001, ver. 2.1 Product Information Bulletin 21 Introduction Altera s FLEX 10K devices are the first programmable logic devices (PLDs)

More information

Multiplying Interferometers

Multiplying Interferometers Multiplying Interferometers L1 * L2 T + iv R1 * R2 T - iv L1 * R2 Q + iu R1 * L2 Q - iu Since each antenna can output both L and R polarization, all 4 Stokes parameters are simultaneously measured without

More information

Plan for Imaging Algorithm Research and Development

Plan for Imaging Algorithm Research and Development Plan for Imaging Algorithm Research and Development S. Bhatnagar July 05, 2009 Abstract Many scientific deliverables of the next generation radio telescopes require wide-field imaging or high dynamic range

More information

VLBA Correlator Memo No.-ll

VLBA Correlator Memo No.-ll VLBA Correlator Memo No.-ll (860709) VLBA CORRELATOR HARDWARE TOPICS Ray Escoffier July 9, 1986 I) Introduction This memo will describe the present state of the design of the VLBA correlator. The thoughts

More information

Scalable Front-End Digital Signal Processing for a Phased Array Radar Demonstrator. International Radar Symposium 2012 Warsaw, 24 May 2012

Scalable Front-End Digital Signal Processing for a Phased Array Radar Demonstrator. International Radar Symposium 2012 Warsaw, 24 May 2012 Scalable Front-End Digital Signal Processing for a Phased Array Radar Demonstrator F. Winterstein, G. Sessler, M. Montagna, M. Mendijur, G. Dauron, PM. Besso International Radar Symposium 2012 Warsaw,

More information

Dense Aperture Array for SKA

Dense Aperture Array for SKA Dense Aperture Array for SKA Steve Torchinsky EMBRACE Why a Square Kilometre? Detection of HI in emission at cosmological distances R. Ekers, SKA Memo #4, 2001 P. Wilkinson, 1991 J. Heidmann, 1966! SKA

More information

ADVANCED EMBEDDED MONITORING SYSTEM FOR ELECTROMAGNETIC RADIATION

ADVANCED EMBEDDED MONITORING SYSTEM FOR ELECTROMAGNETIC RADIATION 98 Chapter-5 ADVANCED EMBEDDED MONITORING SYSTEM FOR ELECTROMAGNETIC RADIATION 99 CHAPTER-5 Chapter 5: ADVANCED EMBEDDED MONITORING SYSTEM FOR ELECTROMAGNETIC RADIATION S.No Name of the Sub-Title Page

More information

Why Single Dish? Darrel Emerson NRAO Tucson. NAIC-NRAO School on Single-Dish Radio Astronomy. Green Bank, August 2003.

Why Single Dish? Darrel Emerson NRAO Tucson. NAIC-NRAO School on Single-Dish Radio Astronomy. Green Bank, August 2003. Why Single Dish? Darrel Emerson NRAO Tucson NAIC-NRAO School on Single-Dish Radio Astronomy. Green Bank, August 2003. Why Single Dish? What's the Alternative? Comparisons between Single-Dish, Phased Array

More information

Multiple Antenna Systems in WiMAX

Multiple Antenna Systems in WiMAX WHITEPAPER An Introduction to MIMO, SAS and Diversity supported by Airspan s WiMAX Product Line We Make WiMAX Easy Multiple Antenna Systems in WiMAX An Introduction to MIMO, SAS and Diversity supported

More information

SKA station cost comparison

SKA station cost comparison SKA station cost comparison John D. Bunton, CSIRO Telecommunications and Industrial Physics 4 August 2003 Introduction Current SKA white papers and updates present cost in a variety of ways which makes

More information

REQUIREMENTS DOCUMENT FOR SKA SIGNAL PROCESSING

REQUIREMENTS DOCUMENT FOR SKA SIGNAL PROCESSING REQUIREMENTS DOCUMENT FOR SKA SIGNAL PROCESSING Document number... WP2 040.030.011 TD 001 Revision... 1 Author... W.Turner Date... 2011 03 30 Status... Approved for release Name Designation Affiliation

More information

Wideband Down-Conversion and Channelisation Techniques for FPGA. Eddy Fry RF Engines Ltd

Wideband Down-Conversion and Channelisation Techniques for FPGA. Eddy Fry RF Engines Ltd Wideband Down-Conversion and Channelisation Techniques for FPGA Eddy Fry RF Engines Ltd 1 st RadioNet Engineering Forum Meeting: Workshop on Digital Backends 6 th September 2004 Who are RF Engines? Signal

More information

Roshene McCool Domain Specialist in Signal Transport and Networks SKA Program Development Office

Roshene McCool Domain Specialist in Signal Transport and Networks SKA Program Development Office Roshene McCool Domain Specialist in Signal Transport and Networks SKA Program Development Office mccool@skatelescope.org SKA A description Outline Specifications Long Baselines in the SKA Science drivers

More information

Processing Real-Time LOFAR Telescope Data on a Blue Gene/P

Processing Real-Time LOFAR Telescope Data on a Blue Gene/P Processing Real-Time LOFAR Telescope Data on a Blue Gene/P John W. Romein Stichting ASTRON (Netherlands Institute for Radio Astronomy) Dwingeloo, the Netherlands 1 LOw Frequency ARray radio telescope 10

More information

Digital Logic, Algorithms, and Functions for the CEBAF Upgrade LLRF System Hai Dong, Curt Hovater, John Musson, and Tomasz Plawski

Digital Logic, Algorithms, and Functions for the CEBAF Upgrade LLRF System Hai Dong, Curt Hovater, John Musson, and Tomasz Plawski Digital Logic, Algorithms, and Functions for the CEBAF Upgrade LLRF System Hai Dong, Curt Hovater, John Musson, and Tomasz Plawski Introduction: The CEBAF upgrade Low Level Radio Frequency (LLRF) control

More information

An FPGA Based Architecture for Moving Target Indication (MTI) Processing Using IIR Filters

An FPGA Based Architecture for Moving Target Indication (MTI) Processing Using IIR Filters An FPGA Based Architecture for Moving Target Indication (MTI) Processing Using IIR Filters Ali Arshad, Fakhar Ahsan, Zulfiqar Ali, Umair Razzaq, and Sohaib Sajid Abstract Design and implementation of an

More information

An FPGA-Based Back End for Real Time, Multi-Beam Transient Searches Over a Wide Dispersion Measure Range

An FPGA-Based Back End for Real Time, Multi-Beam Transient Searches Over a Wide Dispersion Measure Range An FPGA-Based Back End for Real Time, Multi-Beam Transient Searches Over a Wide Dispersion Measure Range Larry D'Addario 1, Nathan Clarke 2, Robert Navarro 1, and Joseph Trinh 1 1 Jet Propulsion Laboratory,

More information

Memo 111. SKADS Benchmark Scenario Design and Costing 2 (The SKA Phase 2 AA Scenario)

Memo 111. SKADS Benchmark Scenario Design and Costing 2 (The SKA Phase 2 AA Scenario) Memo 111 SKADS Benchmark Scenario Design and Costing 2 (The SKA Phase 2 AA Scenario) R. Bolton G. Harris A. Faulkner T. Ikin P. Alexander M. Jones S. Torchinsky D. Kant A. van Ardenne D. Kettle P. Wilkinson

More information

Green Bank Instrumentation circa 2030

Green Bank Instrumentation circa 2030 Green Bank Instrumentation circa 2030 Dan Werthimer and 800 CASPER Collaborators http://casper.berkeley.edu Upcoming Nobel Prizes with Radio Instrumentation Gravitational Wave Detection (pulsar timing)

More information

CHAPTER 4 ANALYSIS OF LOW POWER, AREA EFFICIENT AND HIGH SPEED MULTIPLIER TOPOLOGIES

CHAPTER 4 ANALYSIS OF LOW POWER, AREA EFFICIENT AND HIGH SPEED MULTIPLIER TOPOLOGIES 69 CHAPTER 4 ANALYSIS OF LOW POWER, AREA EFFICIENT AND HIGH SPEED MULTIPLIER TOPOLOGIES 4.1 INTRODUCTION Multiplication is one of the basic functions used in digital signal processing. It requires more

More information

Why Single Dish? Darrel Emerson NRAO Tucson. NAIC-NRAO School on Single-Dish Radio Astronomy. Green Bank, August 2003.

Why Single Dish? Darrel Emerson NRAO Tucson. NAIC-NRAO School on Single-Dish Radio Astronomy. Green Bank, August 2003. Why Single Dish? Darrel Emerson NRAO Tucson NAIC-NRAO School on Single-Dish Radio Astronomy. Green Bank, August 2003. Why Single Dish? What's the Alternative? Comparisons between Single-Dish, Phased Array

More information

Receiver Performance and Comparison of Incoherent (bolometer) and Coherent (receiver) detection

Receiver Performance and Comparison of Incoherent (bolometer) and Coherent (receiver) detection At ev gap /h the photons have sufficient energy to break the Cooper pairs and the SIS performance degrades. Receiver Performance and Comparison of Incoherent (bolometer) and Coherent (receiver) detection

More information

GPU-accelerated SDR Implementation of Multi-User Detector for Satellite Return Links

GPU-accelerated SDR Implementation of Multi-User Detector for Satellite Return Links DLR.de Chart 1 GPU-accelerated SDR Implementation of Multi-User Detector for Satellite Return Links Chen Tang chen.tang@dlr.de Institute of Communication and Navigation German Aerospace Center DLR.de Chart

More information

Time-Frequency System Builds and Timing Strategy Research of VHF Band Antenna Array

Time-Frequency System Builds and Timing Strategy Research of VHF Band Antenna Array Journal of Computer and Communications, 2016, 4, 116-125 Published Online March 2016 in SciRes. http://www.scirp.org/journal/jcc http://dx.doi.org/10.4236/jcc.2016.43018 Time-Frequency System Builds and

More information

Practicalities of Radio Interferometry

Practicalities of Radio Interferometry Practicalities of Radio Interferometry Rick Perley, NRAO/Socorro 13 th Synthesis Imaging Summer School 29 May 5 June, 2012 Socorro, NM Topics Practical Extensions to the Theory: Finite bandwidth Rotating

More information

CHAPTER 4 GALS ARCHITECTURE

CHAPTER 4 GALS ARCHITECTURE 64 CHAPTER 4 GALS ARCHITECTURE The aim of this chapter is to implement an application on GALS architecture. The synchronous and asynchronous implementations are compared in FFT design. The power consumption

More information

GPU-based data analysis for Synthetic Aperture Microwave Imaging

GPU-based data analysis for Synthetic Aperture Microwave Imaging GPU-based data analysis for Synthetic Aperture Microwave Imaging 1 st IAEA Technical Meeting on Fusion Data Processing, Validation and Analysis 1 st -3 rd June 2015 J.C. Chorley 1, K.J. Brunner 1, N.A.

More information

Multi-octave radio frequency systems: Developments of antenna technology in radio astronomy and imaging systems

Multi-octave radio frequency systems: Developments of antenna technology in radio astronomy and imaging systems Multi-octave radio frequency systems: Developments of antenna technology in radio astronomy and imaging systems Professor Tony Brown School of Electrical and Electronic Engineering University of Manchester

More information

High Gain Advanced GPS Receiver

High Gain Advanced GPS Receiver High Gain Advanced GPS Receiver NAVSYS Corporation 14960 Woodcarver Road, Colorado Springs, CO 80921 Introduction The NAVSYS High Gain Advanced GPS Receiver (HAGR) is a digital beam steering receiver designed

More information

Application of Maxwell Equations to Human Body Modelling

Application of Maxwell Equations to Human Body Modelling Application of Maxwell Equations to Human Body Modelling Fumie Costen Room E, E0c at Sackville Street Building, fc@cs.man.ac.uk The University of Manchester, U.K. February 5, 0 Fumie Costen Room E, E0c

More information

escience: Pulsar searching on GPUs

escience: Pulsar searching on GPUs escience: Pulsar searching on GPUs Alessio Sclocco Ana Lucia Varbanescu Karel van der Veldt John Romein Joeri van Leeuwen Jason Hessels Rob van Nieuwpoort And many others! Netherlands escience center Science

More information

Interferometry I Parkes Radio School Jamie Stevens ATCA Senior Systems Scientist

Interferometry I Parkes Radio School Jamie Stevens ATCA Senior Systems Scientist Interferometry I Parkes Radio School 2011 Jamie Stevens ATCA Senior Systems Scientist 2011-09-28 References This talk will reuse material from many previous Radio School talks, and from the excellent textbook

More information

Practicalities of Radio Interferometry

Practicalities of Radio Interferometry Practicalities of Radio Interferometry Rick Perley, NRAO/Socorro Fourth INPE Course in Astrophysics: Radio Astronomy in the 21 st Century Topics Practical Extensions to the Theory: Finite bandwidth Rotating

More information

Casper Instrumentation at Green Bank

Casper Instrumentation at Green Bank Casper Instrumentation at Green Bank John Ford September 28, 2009 The NRAO is operated for the National Science Foundation (NSF) by Associated Universities, Inc. (AUI), under a cooperative agreement. GBT

More information

Radio Interferometers Around the World. Amy J. Mioduszewski (NRAO)

Radio Interferometers Around the World. Amy J. Mioduszewski (NRAO) Radio Interferometers Around the World Amy J. Mioduszewski (NRAO) A somewhat biased view of current interferometers Limited to telescopes that exist or are in the process of being built (i.e., I am not

More information

Introduction to Interferometry. Michelson Interferometer. Fourier Transforms. Optics: holes in a mask. Two ways of understanding interferometry

Introduction to Interferometry. Michelson Interferometer. Fourier Transforms. Optics: holes in a mask. Two ways of understanding interferometry Introduction to Interferometry P.J.Diamond MERLIN/VLBI National Facility Jodrell Bank Observatory University of Manchester ERIS: 5 Sept 005 Aim to lay the groundwork for following talks Discuss: General

More information

REAL TIME DIGITAL SIGNAL PROCESSING. Introduction

REAL TIME DIGITAL SIGNAL PROCESSING. Introduction REAL TIME DIGITAL SIGNAL Introduction Why Digital? A brief comparison with analog. PROCESSING Seminario de Electrónica: Sistemas Embebidos Advantages The BIG picture Flexibility. Easily modifiable and

More information

JESD204A for wireless base station and radar systems

JESD204A for wireless base station and radar systems for wireless base station and radar systems November 2010 Maury Wood- NXP Semiconductors Deepak Boppana, an Land - Altera Corporation 0.0 ntroduction - New trends for wireless base station and radar systems

More information

A report on KAT7 and MeerKAT status and plans

A report on KAT7 and MeerKAT status and plans A report on KAT7 and MeerKAT status and plans SKA SA, Cape Town Office 3rd Floor, The Park, Park Road, Pinelands, Cape Town, South Africa E mail: tony@hartrao.ac.za This is a short memo on the current

More information

Submillimeter (continued)

Submillimeter (continued) Submillimeter (continued) Dual Polarization, Sideband Separating Receiver Dual Mixer Unit The 12-m Receiver Here is where the receiver lives, at the telescope focus Receiver Performance T N (noise temperature)

More information

A Closer Look at 2-Stage Digital Filtering in the. Proposed WIDAR Correlator for the EVLA

A Closer Look at 2-Stage Digital Filtering in the. Proposed WIDAR Correlator for the EVLA NRC-EVLA Memo# 1 A Closer Look at 2-Stage Digital Filtering in the Proposed WIDAR Correlator for the EVLA NRC-EVLA Memo# Brent Carlson, June 2, 2 ABSTRACT The proposed WIDAR correlator for the EVLA that

More information

Very Long Baseline Interferometry

Very Long Baseline Interferometry Very Long Baseline Interferometry Cormac Reynolds, JIVE European Radio Interferometry School, Bonn 12 Sept. 2007 VLBI Arrays EVN (Europe, China, South Africa, Arecibo) VLBA (USA) EVN + VLBA coordinate

More information

DECEMBER 1964 NUMBER OF COPIES: 75

DECEMBER 1964 NUMBER OF COPIES: 75 NATIONAL RADIO ASTRONOMY OBSERVATORY Green Bank, West Virginia E ectronics Division Internal Report No. 42 A DIGITAL CROSS-CORRELATION INTERFEROMETER Nigel J. Keen DECEMBER 964 NUMBER OF COPIES: 75 A DIGITAL

More information

Parallel Computing 2020: Preparing for the Post-Moore Era. Marc Snir

Parallel Computing 2020: Preparing for the Post-Moore Era. Marc Snir Parallel Computing 2020: Preparing for the Post-Moore Era Marc Snir THE (CMOS) WORLD IS ENDING NEXT DECADE So says the International Technology Roadmap for Semiconductors (ITRS) 2 End of CMOS? IN THE LONG

More information

Chapter 5: Signal conversion

Chapter 5: Signal conversion Chapter 5: Signal conversion Learning Objectives: At the end of this topic you will be able to: explain the need for signal conversion between analogue and digital form in communications and microprocessors

More information

Advanced Digital Receiver

Advanced Digital Receiver Advanced Digital Receiver MI-750 FEATURES Industry leading performance with up to 4 M samples per second 135 db dynamic range and -150 dbm sensitivity Optimized timing for shortest overall test time Wide

More information