HA-PACS Base Cluster

HA-PACS: High bandwidth GPU cluster for computational sciences

New generation GPU cluster with rich I/O bandwidth

HA-PACS (Highly Accelerated Parallel Advanced System for Computational Sciences) is the 8th generation of the PACS/PAX series of supercomputers at CCS, University of Tsukuba. It is intended for both development and production runs of cutting-edge scientific computations, with a view toward next-generation accelerated computing. To that end it is equipped with the latest GPUs and CPUs, connected by the new generation of PCI-express to provide rich I/O bandwidth. The two Intel Sandy Bridge-EP CPU sockets in each node connect four NVIDIA M2090 GPUs at full bandwidth, without a performance bottleneck. The interconnection network employs dual-rail Infiniband QDR in a Fat-Tree configuration with full bisection bandwidth.
The system will be delivered in January 2012 with a peak performance of 802 TFLOPS.


System configuration of HA-PACS

System Specification
Item                         Specification
Peak performance             802 TFLOPS (GPU: 713 TF, CPU: 89 TF)
# of nodes                   268
File system                  Lustre, 504 TB user area (DDN SFA10000 ExaScaler)
Infiniband network switch    288-port QDR x 2 (Mellanox IS5300)
Total network bandwidth      2.14 TB/s
Language                     Fortran90, C, C++
MPI                          MVAPICH2, Intel MPI, OpenMPI
System management            Appro Cluster Engine, PBSpro
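
The aggregate figures in the system specification can be cross-checked against the per-node numbers given in the computation node specification. The following Python sketch does that arithmetic. The constant and function names are illustrative, and the 4 GB/s per-rail data rate is an assumption not stated on this page (4x QDR signals at 40 Gbit/s, which 8b/10b encoding reduces to 32 Gbit/s of data per direction).

```python
# Cross-check of the system-level figures from the per-node figures.
NODES = 268
GPU_PEAK_GFLOPS_PER_NODE = 2660.0   # 4 GPUs x 665 GFLOPS
CPU_PEAK_GFLOPS_PER_NODE = 332.8    # 2 sockets of Sandy Bridge-EP
RAILS_PER_NODE = 2                  # dual-rail Infiniband QDR

# Assumption: 4x QDR carries 32 Gbit/s of data per direction, i.e. 4 GB/s per rail.
QDR_GBYTES_PER_SEC_PER_RAIL = 4.0


def system_peak_tflops(nodes, gflops_per_node):
    """Aggregate peak in TFLOPS for a given per-node peak in GFLOPS."""
    return nodes * gflops_per_node / 1000.0


def total_network_bandwidth_tbytes(nodes, rails, gbytes_per_rail):
    """Sum of the injection bandwidth of all nodes, in TB/s."""
    return nodes * rails * gbytes_per_rail / 1000.0


gpu_tf = system_peak_tflops(NODES, GPU_PEAK_GFLOPS_PER_NODE)
cpu_tf = system_peak_tflops(NODES, CPU_PEAK_GFLOPS_PER_NODE)
net_tb = total_network_bandwidth_tbytes(
    NODES, RAILS_PER_NODE, QDR_GBYTES_PER_SEC_PER_RAIL
)

print(f"GPU peak:   {gpu_tf:.1f} TFLOPS")
print(f"CPU peak:   {cpu_tf:.1f} TFLOPS")
print(f"Total peak: {gpu_tf + cpu_tf:.1f} TFLOPS")
print(f"Network:    {net_tb:.2f} TB/s")
```

Under these assumptions the results round to the 713 TF, 89 TF, 802 TFLOPS and 2.14 TB/s listed in the table.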

Computation node of HA-PACS

 


Block diagram of computation node of HA-PACS

Specification of computation node
Item                       Specification
Computation node           Appro Xtreme-X with four GPUs
CPU                        Intel ES (Sandy Bridge EP)
     # of cores            8 cores/socket x 2 sockets = 16 cores/node
     Clock                 2.6 GHz
     Peak performance      332.8 GFLOPS/node
     PCI-express           generation 3 x 80 lanes (40 lanes/CPU)
     Memory                128 GB, DDR3 1600 MHz, 4 channels/socket, 102.4 GB/s/node
GPU                        NVIDIA M2090
     # of GPUs/node        4
     Peak performance      2660 GFLOPS/node (665 GF/GPU)
     Memory                24 GB/node (6 GB/GPU)
Interconnection            Infiniband QDR x 2 rails (Mellanox ConnectX-3 dual head)
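
The per-node figures above can be derived from the component parameters. The sketch below is a rough check rather than an official derivation. It assumes 8 double-precision floating-point operations per core per cycle (one 4-wide AVX add plus one 4-wide AVX multiply, which is how Sandy Bridge peak is usually counted) and 8 bytes per DDR3 transfer per channel. Neither value is stated on this page, and the function names are illustrative.

```python
# Derivation of per-node peak figures from component parameters.
SOCKETS = 2
CORES_PER_SOCKET = 8
CLOCK_GHZ = 2.6
# Assumption: 4-wide AVX add + 4-wide AVX multiply per cycle, double precision.
DP_FLOPS_PER_CYCLE = 8

CHANNELS_PER_SOCKET = 4
DDR3_MEGATRANSFERS_PER_SEC = 1600
# Assumption: each DDR3 channel is 64 bits wide, i.e. 8 bytes per transfer.
BYTES_PER_TRANSFER = 8

GPUS_PER_NODE = 4
GPU_PEAK_GFLOPS = 665.0
GPU_MEMORY_GB = 6


def cpu_peak_gflops():
    """Peak double-precision GFLOPS of the CPUs in one node."""
    return SOCKETS * CORES_PER_SOCKET * CLOCK_GHZ * DP_FLOPS_PER_CYCLE


def memory_bandwidth_gbytes():
    """Peak host memory bandwidth of one node, in GB/s."""
    transfers = SOCKETS * CHANNELS_PER_SOCKET * DDR3_MEGATRANSFERS_PER_SEC
    return transfers * BYTES_PER_TRANSFER / 1000.0


def gpu_peak_gflops():
    """Peak double-precision GFLOPS of the GPUs in one node."""
    return GPUS_PER_NODE * GPU_PEAK_GFLOPS


print(f"CPU peak:         {cpu_peak_gflops():.1f} GFLOPS/node")
print(f"Memory bandwidth: {memory_bandwidth_gbytes():.1f} GB/s/node")
print(f"GPU peak:         {gpu_peak_gflops():.0f} GFLOPS/node")
print(f"GPU memory:       {GPUS_PER_NODE * GPU_MEMORY_GB} GB/node")
```

With these inputs the GPUs account for roughly eight times the CPU peak of a node, which is why the PCI-express lane budget and the full-bandwidth GPU connection are emphasized in the node design.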