# Supercomputer at CCS: Cygnus

## Multi-Hybrid Accelerated Computing Platform

### Combining the strengths of different types of accelerators: GPU + FPGA

- The GPU is still an essential accelerator for simple, highly parallel workloads, providing ~10 TFLOPS peak performance.
- The FPGA is a new type of accelerator: programmable, application-specific hardware that gains speed by pipelining calculations.
- FPGAs are also well suited to external communication among themselves, with advanced high-speed interconnects of up to 100 Gbps x 4 channels.

#### Construction of "Cygnus"

- Operation started in May 2019.
- Node components: 2x Intel Xeon CPUs, 4x NVIDIA V100 GPUs, 2x Intel Stratix10 FPGAs (Albireo nodes only).
- Deneb: 49 CPU+GPU nodes.
- Albireo: 32 CPU+GPU+FPGA nodes with a dedicated 2D-torus network for the FPGAs (100 Gbps x 4).

**Target GPU:** NVIDIA Tesla V100

**Target FPGA:** Nallatech 520N

### FPGA design plan

- Router
  - Mandatory for the dedicated network.
  - Forwards packets to their destinations.
- User Logic
  - The OpenCL kernel runs here.
  - Inter-FPGA communication can be controlled from the OpenCL kernel.
- SL3 (SerialLite III: Intel FPGA IP)
  - Includes the transceiver modules for inter-FPGA data transfer.
  - Users do not need to be aware of it.

The 64 FPGAs on the Albireo nodes are connected directly in a 2D-torus configuration, without Ethernet switches.

### Specification of Cygnus

| Item | Specification |
| --- | --- |
| Peak performance | 2.4 PFLOPS DP (GPU: 2.2 PFLOPS, CPU: 0.2 PFLOPS, FPGA: 0.6 PFLOPS SP), enhanced by mixed precision and variable precision on FPGA |
| # of nodes | 81 (32 Albireo (GPU+FPGA) nodes, 49 Deneb (GPU-only) nodes) |
| Memory | 192 GiB DDR4-2666/node = 256 GB/s; 32 GiB x 4 for GPU/node = 3.6 TB/s |
| CPU / node | Intel Xeon Gold (SKL) x2 sockets |
| GPU / node | NVIDIA V100 x4 (PCIe) |
| FPGA / node | Intel Stratix10 x2 (each with 100 Gbps x4 links/FPGA and x8 links/node) |
| Global File System | Lustre, RAID6, 2.5 PB |
| Interconnection Network | Mellanox InfiniBand HDR100 x4 (two cables of HDR200 / node), 4 TB/s aggregated bandwidth |
| Programming Language | CPU: C, C++, Fortran, OpenMP; GPU: OpenACC, CUDA; FPGA: OpenCL, Verilog HDL |
| System Vendor | NEC |

Figure: block diagrams of a Deneb node (x48) and an Albireo node (x32). The ordinary inter-node communication channel serves the CPU and GPU, but they can also request communication via the FPGAs.

Check it now! "Cygnus Movie"
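The direct 2D-torus wiring of the 64 Albireo FPGAs described above can be illustrated with a short sketch. It assumes an 8 x 8 grid with row-major FPGA numbering; the source states only the FPGA count and the four 100 Gbps links per FPGA, so the grid shape, the numbering, and the names `torus_neighbors` and `hop_count` are hypothetical.

```python
"""Sketch of a 2D-torus neighbour map for the Albireo FPGAs.

Assumption (not from the source): the 64 FPGAs form an 8 x 8 grid and
are numbered row by row. Only the FPGA count and the four links per
FPGA come from the specification.
"""

ROWS, COLS = 8, 8        # assumed grid shape; 8 * 8 = 64 FPGAs
LINKS_PER_FPGA = 4       # from the spec: 100 Gbps x 4 links per FPGA
LINK_GBPS = 100          # from the spec


def torus_neighbors(fpga_id, rows=ROWS, cols=COLS):
    """Return the four directly wired neighbours of one FPGA."""
    r, c = divmod(fpga_id, cols)
    return {
        "north": ((r - 1) % rows) * cols + c,   # wraps from top row to bottom row
        "south": ((r + 1) % rows) * cols + c,
        "west": r * cols + (c - 1) % cols,      # wraps from first column to last
        "east": r * cols + (c + 1) % cols,
    }


def hop_count(src, dst, rows=ROWS, cols=COLS):
    """Minimum number of router hops between two FPGAs on the torus."""
    sr, sc = divmod(src, cols)
    dr, dc = divmod(dst, cols)
    # On a ring, the shorter of the two directions is taken.
    row_dist = min((sr - dr) % rows, (dr - sr) % rows)
    col_dist = min((sc - dc) % cols, (dc - sc) % cols)
    return row_dist + col_dist


if __name__ == "__main__":
    print("neighbours of FPGA 0:", torus_neighbors(0))
    print("hops from FPGA 0 to FPGA 63:", hop_count(0, 63))
    print("link bandwidth per FPGA:", LINKS_PER_FPGA * LINK_GBPS, "Gbps")
```

Under these assumptions every FPGA has exactly four neighbours, matching the four links per FPGA, and the wrap-around links keep the worst-case distance at 8 hops. The real packet routing is done by the Router module in hardware; this sketch only models the topology.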
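Several figures in the specification can be cross-checked with simple arithmetic. The sketch below uses only numbers stated in the specification; the function and key names are illustrative, and the bandwidth conversion assumes decimal units (1 TB/s = 8,000 Gbps).

```python
"""Cross-check of aggregate figures from the Cygnus specification."""

# Values taken from the specification.
ALBIREO_NODES = 32
DENEB_NODES = 49
GPUS_PER_NODE = 4
FPGAS_PER_ALBIREO_NODE = 2
GPU_PEAK_PFLOPS_DP = 2.2
CPU_PEAK_PFLOPS_DP = 0.2
FPGA_PEAK_PFLOPS_SP = 0.6
GPU_MEM_GIB = 32
HDR100_LINKS_PER_NODE = 4
HDR100_GBPS = 100
FPGA_LINKS = 4
FPGA_LINK_GBPS = 100


def summarize():
    """Derive system-wide totals and per-device shares."""
    nodes = ALBIREO_NODES + DENEB_NODES
    gpus = nodes * GPUS_PER_NODE
    fpgas = ALBIREO_NODES * FPGAS_PER_ALBIREO_NODE
    return {
        "nodes": nodes,
        "gpus": gpus,
        "fpgas": fpgas,
        # System DP peak is the sum of the GPU and CPU contributions.
        "dp_peak_pflops": round(GPU_PEAK_PFLOPS_DP + CPU_PEAK_PFLOPS_DP, 3),
        # Per-device shares implied by the system-wide peaks.
        "tflops_dp_per_gpu": GPU_PEAK_PFLOPS_DP * 1000 / gpus,
        "tflops_sp_per_fpga": FPGA_PEAK_PFLOPS_SP * 1000 / fpgas,
        "gpu_mem_gib_per_node": GPU_MEM_GIB * GPUS_PER_NODE,
        # InfiniBand: Gbps per node, summed over all nodes, converted to TB/s.
        "ib_aggregate_tb_s": nodes * HDR100_LINKS_PER_NODE * HDR100_GBPS / 8 / 1000,
        # Dedicated FPGA network bandwidth per Albireo node.
        "fpga_gbps_per_node": FPGAS_PER_ALBIREO_NODE * FPGA_LINKS * FPGA_LINK_GBPS,
    }


if __name__ == "__main__":
    for key, value in summarize().items():
        print(f"{key}: {value}")
```

The derived totals (81 nodes, 2.4 PFLOPS DP, roughly 4 TB/s of aggregated InfiniBand bandwidth) agree with the stated values. The per-GPU and per-FPGA numbers are simply the system peaks divided by device counts, not vendor datasheet values.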