High Performance Linpack Benchmark

Paderborn Center for Parallel Computing: Jens Simon (simon(at)upb.de)

Benchmark program:

High Performance Computing Linpack Benchmark (HPL) 1.0

Authors:

A. Petitet, R. C. Whaley, J. Dongarra, A. Cleary
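
The Remarks column of the tables below lists the HPL run parameters: N (order of the dense matrix), NB (block size) and P, Q (dimensions of the process grid, so P*Q should equal the number of MPI processes). The following is a minimal sketch, not part of HPL itself, of how such a cell can be parsed and what it implies for memory and work. The function names are illustrative, and the operation count 2/3 N^3 + 3/2 N^2 is assumed to be the one HPL uses when it reports Gflop/s.

```python
import re


def parse_remarks(remarks: str) -> dict:
    """Extract the integer parameters N, NB, P and Q from a Remarks cell.

    Accepts the spacing variants seen in the tables, for example
    "N=8192, NB=80,P=4, Q=2" or "N=12000 NB=100,P=1, Q=1".
    """
    # "NB" is listed before "N" so the longer key wins at the same position.
    pairs = re.findall(r"\b(NB|N|P|Q)\s*=\s*(\d+)", remarks)
    return {key: int(value) for key, value in pairs}


def matrix_gib(n: int) -> float:
    """Memory for the N x N double-precision matrix in GiB (8 bytes per entry)."""
    return 8 * n * n / 2**30


def hpl_flops(n: int) -> float:
    """Assumed HPL operation count for one solve: 2/3 N^3 + 3/2 N^2."""
    return (2.0 / 3.0) * n**3 + 1.5 * n**2


if __name__ == "__main__":
    params = parse_remarks("N=98304, NB=100,P=8, Q=8")
    print(params["P"] * params["Q"], "processes in the grid")
    print(matrix_gib(params["N"]), "GiB for the matrix across all nodes")
```

For N=98304 the matrix alone needs 72 GiB, which is why the largest runs only appear in the 30 and 32 node rows (4 Gbyte per node).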

Machines / Math Lib
Interconnect
MPI
Number of nodes
Number of MPI proc.
Number of active CPUs
Performance [Gflop/s]
Remarks
Supermicro X6DAE-G2
Dual 3.2 GHz Xeon EM64T
4 Gbyte DDR2-RAM
Gnu gcc 3.2.3
Goto 0.96 Nocona 64bit
InfiniBand 4x
InfiniServ 9000
fw 4.5.0
MTEK43132-C08-S
MPICH 1.2.5
NCSA VMI2.0
THCA 3.2
4 8 8 25,350 N=8192, NB=80,P=4, Q=2
4 8 8 27,310 N=16384, NB=80,P=4, Q=2
4 8 8 29,180 N=32756, NB=80,P=4, Q=2
2 4 4   N=8192, NB=80,P=2, Q=2
2 4 4   N=16384, NB=80,P=2, Q=2
Supermicro X6DAE-G2
Dual 3.2 GHz Xeon EM64T
4 Gbyte DDR2-RAM
Gnu gcc 3.2.3
ATLAS 3.7.8
InfiniBand 4x
InfiniServ 9000
fw 4.5.0
MTEK43132-C08-S
MPICH 1.2.5
NCSA VMI2.0
THCA 3.2
4 8 8 25,250 N=8192, NB=80,P=4, Q=2
4 8 8 27,010 N=16384, NB=80,P=4, Q=2
4 8 8   N=32756, NB=80,P=4, Q=2
2 4 4   N=8192, NB=80,P=2, Q=2
2 4 4   N=16384, NB=80,P=2, Q=2
Supermicro X6DAE-G2
Dual 3.2 GHz Xeon EM64T
4 Gbyte DDR2-RAM
Gnu gcc 3.2.3
Intel MKL 7.2.008
InfiniBand 4x
InfiniServ 9000
fw 4.5.0
MTEK43132-C08-S
MPICH 1.2.5
NCSA VMI2.0
THCA 3.2
4 8 8 25,410 N=8192, NB=80,P=4, Q=2
4 8 8 27,260 N=16384, NB=80,P=4, Q=2
4 8 8 29,120 N=32756, NB=80,P=4, Q=2
2 4 4 14,340 N=8192, NB=80,P=2, Q=2
2 4 4 14,840 N=16384, NB=80,P=2, Q=2
HP rx2600
Dual 1.3 GHz Itanium2
4 Gbyte DDR-RAM
Intel Compiler 8.0 / 7.1
goto lib 0.9
shared memory MPICH 1.2.5
OSU
THCA 3.1.0
VAPI 0.9.2
1 1 1 3,607 N=4096, NB=48,P=1, Q=1
1 1 1 3,950 N=4096, NB=96,P=1, Q=1
1 1 1 3,960 N=4096, NB=100,P=1, Q=1
1 1 1 4,052 N=4096, NB=128,P=1, Q=1
1 1 1 4,126 N=4096, NB=160,P=1, Q=1
1 2 2 6,899 N=4096, NB=100,P=2, Q=1
1 2 2 7,280 N=8192, NB=48,P=2, Q=1
1 2 2 7,839 N=8192, NB=96,P=2, Q=1
1 2 2 7,538 N=8192, NB=100,P=2, Q=1
1 2 2 7,660 N=8192, NB=128,P=2, Q=1
1 2 2 8,040 N=8192, NB=160,P=2, Q=1
1 2 2 7,879 N=16384, NB=48,P=2, Q=1
1 2 2 8,531 N=16384, NB=96,P=2, Q=1
1 2 2 8,294 N=16384, NB=100,P=2, Q=1
1 2 2 8,325 N=16384, NB=128,P=2, Q=1
1 2 2 8,680 N=16384, NB=160,P=2, Q=1
HP rx2600
Dual 1.3 GHz Itanium2
4 Gbyte DDR-RAM
GNU Compiler
goto lib 0.8
shared memory MPIch 1.2.5
NCSA VMI2.0
Beta3b release
1 1 1 3,67 N=4096, NB=48,P=1, Q=1
1 2 2 6,11 N=4096, NB=48,P=2, Q=1
1 2 2 6,98 N=8192, NB=48,P=2, Q=1
1 2 2 7,36 N=8192, NB=100,P=2, Q=1
1 2 2 8,052 N=12000, NB=100,P=1, Q=1
HP rx2600
Dual 1.3 GHz Itanium2
4 Gbyte DDR-RAM
Intel Compiler 8.0 / 7.1
goto lib 0.9
InfiniBand 4x
MTPB23108-CE128
fw 3.1.0 patched
MTS9600-36port
MPIch 1.2.5
NCSA VMI2.0
Beta3 release
THCA 3.1.0
specfile mst
2 4 4 14,70 N=8192, NB=100, P=2, Q=2
4 8 8 30,62 N=16384, NB=100,P=4, Q=2
8 16 16 58,37 N=16384, NB=100,P=4, Q=4
16 32 32 106,60 N=16384, NB=100,P=8, Q=4
16 32 32 133,90 N=65536, NB=100,P=8, Q=4
24 48 48 Complete desc (recv) error 5 N=65536, NB=100,P=8, Q=6
30 60 60 Complete desc (send) error 10 N=98304, NB=100,P=10, Q=6
32 64 64   N=9216, NB=100,P=8, Q=8
32 64 64   N=16384, NB=100,P=8, Q=8
32 64 64   N=98304, NB=100,P=8, Q=8
HP rx2600
Dual 1.3 GHz Itanium2
4 Gbyte DDR-RAM
GNU Compiler
goto lib 0.8
InfiniBand 4x
MTPB23108-CE128
MTS9600-36port
MPIch 1.2.5
NCSA VMI2.0
Beta3b
specfile mst
2 4 4 14,32 N=8192, NB=100, P=2, Q=2
4 8 8 29,87 N=16384, NB=100,P=4, Q=2
8 16 16 58,75 N=16384, NB=100,P=4, Q=4
16 32 32 103,70 N=16384, NB=100,P=8, Q=4
16 32 32 130,50 N=65536, NB=100,P=8, Q=4
24 48 48 194,30 N=65536, NB=100,P=8, Q=6
30 60 60 249,90 N=65536, NB=100,P=10, Q=6
32 64 64 139,10 N=9216, NB=100,P=8, Q=8
32 64 64 278,7 N=98304, NB=100,P=8, Q=8
HP rx2600
Dual 1.3 GHz Itanium2
4 Gbyte DDR-RAM
Intel Compiler 8.0 / 7.1
goto lib 0.9
InfiniBand 4x
MTPB23108-CE128
fw 3.1.0 patched
MTS9600-36port
Scali MPI 3.3.0.1
Sca Connect 4.3.0
THCA 3.1.0
networks smp,InfiniHost0
2 4 4 14,11 N=8192, NB=100, P=2, Q=2
4 8 8 29,46 N=16384, NB=100,P=4, Q=2
8 16 16 60,19 N=16384, NB=104,P=4, Q=4
16 32 32 99,70 N=16384, NB=104,P=8, Q=4
16 32 32 138,60 N=65536, NB=104,P=8, Q=4
24 48 48   N=65536, NB=104,P=8, Q=6
24 48 48 197,70 N=65536, NB=100,P=8, Q=6
30 60 60 258,50 N=98304, NB=100,P=10, Q=6
32 64 64 273,0 N=98304, NB=104,P=8, Q=8
32 64 64 117,40 N=9216, NB=100,P=8, Q=8
32 64 64 176,10 N=16384, NB=100,P=8, Q=8
32 64 64 220,90 N=32756, NB=100,P=8, Q=8
32 64 64 258,40 N=65536, NB=100,P=8, Q=8
32 64 64 280,20 N=98304, NB=100,P=8, Q=8
HP rx2600
Dual 1.3 GHz Itanium2
4 Gbyte DDR-RAM
Intel Compiler 7.0
goto lib 0.8
InfiniBand 4x
MTPB23108-CE128
fw 3.0 patched
MTS9600-36port
Scali MPI 3.2.4.2
Sca Connect 4.2.0
THCA 3.0.1-b001
networks smp,InfiniHost0
2 4 4 13,94 N=8192, NB=100, P=2, Q=2
4 8 8 27,58 N=16384, NB=100,P=4, Q=2
8 16 16 53,68 N=16384, NB=100,P=4, Q=4
16 32 32 90,86 N=16384, NB=100,P=8, Q=4
16 32 32 123,30 N=65536, NB=100,P=8, Q=4
24 48 48 199,00 N=65536, NB=100,P=8, Q=6
32 64 64 116,50 N=9216, NB=100,P=8, Q=8
32 64 64 177,70 N=16384, NB=100,P=8, Q=8
32 64 64 281,00 N=98304, NB=100,P=8, Q=8
HP rx2600
Dual 1.3 GHz Itanium2
4 Gbyte DDR-RAM
GCC Compiler
goto lib 0.8
InfiniBand 4x
MTPB23108-CE128
fw 3.0 patched
MTS9600-36port
Scali MPI 3.2.4.2
Sca Connect 4.2.0
THCA 3.0.1-b001
networks smp,InfiniHost0
2 4 4 14,04 N=8192, NB=100, P=2, Q=2
4 8 8 29,84 N=16384, NB=100,P=4, Q=2
8 16 16 56,49 N=16384, NB=100,P=4, Q=4
16 32 32 96,20 N=16384, NB=100,P=8, Q=4
16 32 32 131,00 N=65536, NB=100,P=8, Q=4
24 48 48 196,10 N=65536, NB=100,P=8, Q=6
32 64 64 116,50 N=9216, NB=100,P=8, Q=8
32 64 64 177,70 N=16384, NB=100,P=8, Q=8
32 64 64 281,00 N=98304, NB=100,P=8, Q=8
HP rx2600
Dual 1.3 GHz Itanium2
4 Gbyte DDR-RAM
Intel Compiler 8.0 / 7.1
goto lib 0.9
InfiniBand 4x
MTPB23108-CE128
fw 3.1.0 patched
MTS9600-36port
MPICH 1.2.5
OSU
THCA 3.1.0
VAPI 0.9.2
2 4 4 15,27 N=8192, NB=104,P=2, Q=1
4 8 8 31,63 N=16384, NB=104,P=4, Q=2
8 16 16 60,73 N=16384, NB=104,P=4, Q=4
16 32 32 105,20 N=16384, NB=104,P=8, Q=4
16 32 32 139,70 N=65536, NB=104,P=8, Q=4
24 48 48 201,80 N=65536, NB=100,P=8, Q=6
30 60 60 262,20 N=98304, NB=100,P=10, Q=6
32 64 64 131,30 N=9216, NB=100,P=8, Q=8
32 64 64 187,70 N=16384, NB=100,P=8, Q=8
32 64 64 230,20 N=32756, NB=100,P=8, Q=8
32 64 64 261,70 N=65536, NB=100,P=8, Q=8
32 64 64 280,00 N=98304, NB=100,P=8, Q=8
HP rx2600
Dual 1.3 GHz Itanium2
4 Gbyte DDR-RAM
Intel Compiler 7.0
goto lib 0.8
InfiniBand 4x
MTPB23108-CE128
fw 3.0 patched
MTS9600-36port
MPICH 1.2.5
OSU
THCA 3.0.1-b001
VAPI 0.9.2
2 4 4 14,61 N=8192, NB=100,P=2, Q=1
4 8 8 28,40 N=16384, NB=100,P=4, Q=2
8 16 16 55,76 N=16384, NB=100,P=4, Q=4
16 32 32 103,30 N=16384, NB=100,P=8, Q=4
16 32 32 129,10 N=65536, NB=100,P=8, Q=4
24 48 48   N=65536, NB=100,P=8, Q=6
32 64 64 136,00 N=9216, NB=100,P=8, Q=8
32 64 64 193,20 N=16384, NB=100,P=8, Q=8
32 64 64 283,10 N=98304, NB=100,P=8, Q=8
HP rx2600
Dual 1.3 GHz Itanium2
4 Gbyte DDR-RAM
GNU Compiler
goto lib 0.8
InfiniBand 4x
MTPB23108-CE128
MTS9600-36port
MPICH 1.2.5
OSU
VAPI 0.9.1-pre
2 4 4   N=4096, NB=100,P=2, Q=1
4 8 8 29,34 N=16384, NB=100,P=4, Q=2
8 16 16 57,14 N=16384, NB=100,P=4, Q=4
16 32 32 103,10 N=16384, NB=100,P=4, Q=4
32 64 64 130,40 N=9216, NB=100,P=8, Q=8
32 64 64 278,70 N=98304, NB=100,P=8, Q=8
HP rx2600
Dual 1.3 GHz Itanium2
4 Gbyte DDR-RAM
Intel Compiler 7.0
HP Mlib 1.0
shared memory * threaded Mlib
1 1 1 3,94 N=4096, NB=48,P=1, Q=1
1 1 2 6,51 N=4096, NB=48,P=1, Q=1
1 1 2 7,14 N=8192, NB=48,P=1, Q=1
1 1 2 7,56 N=8192, NB=100,P=1, Q=1
1 1 2 8,51 N=12000, NB=100,P=1, Q=1
HP rx2600
Dual 1.3 GHz Itanium2
4 Gbyte DDR-RAM
Intel Compiler 7.0
HP Mlib 1.0
InfiniBand 4x
MTPB23108-CE128
MTEK43132-C08-S
MPIch 1.2.5
NCSA VMI2.0
Beta3b
2 2 4 13,75 N=8192, NB=100,P=2, Q=1
4 4 8 24,71 N=8192, NB=100,P=2, Q=2
4 4 8 30,93 N=16384, NB=48,P=2, Q=2
8 8 16 55,82 N=16384, NB=48,P=4, Q=2
16 16 32 105,00 N=16384, NB=48,P=4, Q=4
16 16 32 101,15 N=61440, NB=48,P=4, Q=4
32 32 64 113,80 N=16384, NB=48,P=8, Q=4
32 32 64   N=98304, NB=48,P=8, Q=4
HP rx2600
Dual 1.3 GHz Itanium2
4 Gbyte DDR-RAM
Intel Compiler 7.0
HP Mlib 1.0
InfiniBand 4x
MTPB23108-CE128
MTEK43132-C08-S
MPICH 1.2.5
OSU
VAPI 0.9.1-pre
2 2 4 11,61 N=4096, NB=48,P=2, Q=1
4 4 8 21,61 N=4096, NB=48,P=2, Q=2
4 4 8 31,14 N=16384, NB=48,P=2, Q=2
16 16 32 54,97 N=4096, NB=48,P=4, Q=4
16 16 32   N=16384, NB=48,P=4, Q=4
HP rx2600
Dual 1.3 GHz Itanium2
4 Gbyte DDR-RAM
Intel Compiler 7.0
Intel MKL 6.0b
shared memory * threaded MKL
1 1 2 5,96 N=4096, NB=48,P=1, Q=1
1 1 2 6,22 N=4096, NB=100,P=1, Q=1
1 1 2 6,20 N=8192, NB=100,P=1, Q=1
1 1 2 7,17 N=12000, NB=100,P=1, Q=1
HP rx2600
Dual 1.3 GHz Itanium2
4 Gbyte DDR-RAM
Intel Compiler 7.0
Intel MKL 6.0b
InfiniBand 4x
MTPB23108-CE128
MTEK43132-C08-S
MPIch 1.2.5
NCSA VMI2.0
Beta3b
2 2 4 10,65 N=4096, NB=48,P=2, Q=1
4 4 8 19,88 N=4096, NB=48,P=2, Q=2
4 4 8 27,41 N=16384, NB=48,P=2, Q=2
8 8 16 48,65 N=16384, NB=48,P=4, Q=2
32 32 64 161,6 N=16384, NB=48,P=4, Q=4
32 32 64 186,2 N=98304, NB=48,P=4, Q=4
HP ZX6000
Dual 1.0 GHz Itanium2
8 Gbyte DDR-RAM
Intel Compiler 7.0
Intel MKL 6.0b
shared memory * threaded MKL
1 1 1 2,74 N=4096, NB=48,P=1, Q=1
1 1 2 4,89 N=4096, NB=48,P=1, Q=1
HP ZX6000
Dual 1.0 GHz Itanium2
8 Gbyte DDR-RAM
Intel Compiler 7.0
Intel MKL 6.0b
InfiniBand 4x
MTPB23108-CE128
fw 2.0
MTS9600-36port
THCA 0.2.0-b001
MPICH 1.2.5
OSU
VAPI 0.9.1 pre
2 2 2 5,52 N=4096, NB=48,P=2, Q=1
2 2 4 9,19 N=4096, NB=48,P=2, Q=1
4 4 4 10,69 N=4096, NB=48,P=2, Q=2
4 4 8 17,54 N=4096, NB=48,P=2, Q=2
4 4 8 22,74 N=16384, NB=48,P=2, Q=2
4 4 8 error N=32756, NB=48,P=2, Q=2
HP ZX6000
Dual 1.0 GHz Itanium2
8 Gbyte DDR-RAM
InfiniBand 4x
MTPB23108-CE128
MTEK43132-C08-S
MPIch 1.2.5
NCSA VMI2.0 preBeta2
1   1 2,3 N=4096, NB=48,P=1, Q=1
1   2 3,94 N=4096, NB=48,P=2, Q=1
2   2 4,08 N=4096, NB=48,P=2, Q=1
2   4 7,44 N=4096, NB=48,P=2, Q=2
HP ZX6000
Dual 1.0 GHz Itanium2
8 Gbyte DDR-RAM
Intel Compiler 7.0
Intel MKL 6.0b
InfiniBand 4x
MTPB23108-CE128
fw 2.0
MTS9600-36port
THCA 0.2.0-b001
MPIch 1.2.5
NCSA VMI2.0
pre Beta3
2 2 2 5,53 N=4096, NB=48,P=2, Q=1
2 2 4 9,2 N=4096, NB=48,P=2, Q=1
4 4 4 10,45 N=4096, NB=48,P=2, Q=2
4 4 8 17,26 N=4096, NB=48,P=2, Q=2
4 4 8 22,46 N=16384, NB=48,P=2, Q=2
4 4 8 21,63 N=32756, NB=48,P=2, Q=2
4 4 8   N=61440, NB=48,P=2, Q=2
HP ZX6000
Dual 1.0 GHz Itanium2
8 Gbyte DDR-RAM
InfiniBand 4x
MTPB23108-CE128
MTEK43132-C08-S
MPIch 1.2.5
NCSA VMI2.0
Beta2 rel. 2
1   1 2,7 N=4096, NB=48,P=1, Q=1
1   2 4,9 N=4096, NB=48,P=1, Q=1
2   2 5,51 N=4096, NB=48,P=2, Q=1
2   4 9,18 N=4096, NB=48,P=2, Q=1
4   4 10,5 N=4096, NB=48,P=2, Q=2
4   8 17 N=4096, NB=48,P=2, Q=2
4   8 22,4 N=16384, NB=48,P=2, Q=2
4   8 21,6 N=32756, NB=48,P=2, Q=2
HP ZX6000
Dual 1.0 GHz Itanium2
8 Gbyte DDR-RAM
Intel Compiler 7.0
Intel MKL 6.0b
InfiniBand 4x
MTPB23108-CE128
fw 2.0
MTS9600-36port
THCA 0.2.0-b001
Scali MPI Beta
Connect 4.1.0-b3
2 2 2 5,43 N=4096, NB=48,P=2, Q=1
2 2 4 8,81 N=4096, NB=48,P=2, Q=1
4 4 4 10,00 N=4096, NB=48,P=2, Q=2
4 4 8 15,72 N=4096, NB=48,P=2, Q=2
4 4 8 error N=16384, NB=48,P=2, Q=2
HP ZX6000
Dual 1.0 GHz Itanium2
8 Gbyte DDR-RAM
Intel Compiler 7.0
Intel MKL 6.0b
MyriNet2000
MFM-PCI64B 64/66
M3-SW16-8M
MPIch 1.2.5..10
GM 2.0.5
2 2 2 5,41 N=4096, NB=48,P=2, Q=1
2 2 4 8,84 N=4096, NB=48,P=2, Q=2
4 4 4 10,00 N=4096, NB=48,P=2, Q=2
4 4 8 15,73 N=4096, NB=48, P=4, Q=2
4 4 8 21,91 N=16384, NB=48,P=2, Q=2
4 4 8 21,38 N=32756, NB=48,P=2, Q=2
4 4 8 20,4 N=61440, NB=48,P=2, Q=2
HP ZX6000
Dual 1.0 GHz Itanium2
8 Gbyte DDR-RAM
Intel Compiler 7.0
Intel MKL 6.0b
MyriNet2000
MFM-PCI64B 64/66
M3-SW16-8M
GM 2.0.5

MPIch 1.2.5
NCSA VMI2.0
pre Beta3
2 2 2 5,39 N=4096, NB=48,P=2, Q=1
2 2 4 8,83 N=4096, NB=48,P=2, Q=1
4 4 4 10,07 N=4096, NB=48,P=2, Q=2
4 4 8 16,04 N=4096, NB=48,P=2, Q=2
4 4 8 22,04 N=16384, NB=48,P=2, Q=2
4 4 8 21,25 N=32756, NB=48,P=2, Q=2
4 4 8   N=61440, NB=48,P=2, Q=2
HP ZX6000
Dual 1.0 GHz Itanium2
8 Gbyte DDR-RAM
MyriNet2000
MFM-PCI64B 64/66
M3-SW16-8M

MPIch 1.2.5
NCSA VMI2.0
myrinet GM2.0
Beta2 rel. 2
1   1 2,7 N=4096, NB=48,P=1, Q=1
1   2 4,9 N=4096, NB=48,P=1, Q=1
2   2 5,39 N=4096, NB=48,P=2, Q=1
2   4 8,82 N=4096, NB=48,P=2, Q=1
4   4 10,1 N=4096, NB=48,P=2, Q=2
4   8 15,9 N=4096, NB=48,P=2, Q=2
4   8 22 N=16384, NB=48,P=2, Q=2
4   8   N=32756, NB=48,P=2, Q=2
HP ZX6000
Dual 1.0 GHz Itanium2
8 Gbyte DDR-RAM
Intel Compiler 7.0
Intel MKL 6.0b
MyriNet2000
MFM-PCI64B 64/66
M3-SW16-8M
MPIch 1.2.5..10
GM 1.6.4
2   2 5,47 N=4096, NB=48,P=2, Q=1
2   4 10,08 N=4096, NB=48,P=2, Q=2
4   4 10,27 N=4096, NB=48,P=2, Q=2
4   8 14,72 N=4096, NB=48, P=4, Q=2
HP ZX6000
Dual 1.0 GHz Itanium2
8 Gbyte DDR-RAM
Intel Compiler 7.0
HP MLIB 1.0
shared memory * threaded MLIB
1 1 1 3,14 N=4096, NB=48,P=1, Q=1
1 1 2 6,03 N=4096, NB=48,P=1, Q=1
HP rx2600
Dual 1.3 GHz Itanium2
4 Gbyte DDR-RAM
GNU Compiler
Atlas 3.4.1
shared memory MPIch 1.2.5
NCSA VMI2.0
Beta3b
1 1 1 2,84 N=4096, NB=48,P=1, Q=1
1 2 2 4,83 N=4096, NB=48,P=2, Q=1
HP rx2600
Dual 1.3 GHz Itanium2
4 Gbyte DDR-RAM
GNU Compiler
Atlas 3.4.1
InfiniBand 4x
MTPB23108-CE128
MTEK43132-C08-S
MPIch 1.2.5
NCSA VMI2.0
Beta3b
2 2 2 4,99 N=4096, NB=48,P=2, Q=1
2 2 4 8,85 N=4096, NB=48,P=2, Q=2
4 4 4 9,23 N=4096, NB=48,P=2, Q=2
4 4 4 12,15 N=16384, NB=48,P=2, Q=2
4 8 8 14,17 N=4096, NB=48,P=2, Q=2
4 8 8 21,34 N=16384, NB=48,P=2, Q=2
8 8 8 22,62 N=16384, NB=48,P=2, Q=2
8 16 16 40,60 N=16384, NB=48,P=2, Q=2
16 16 16 error N=16384, NB=48,P=2, Q=2
HP ZX6000
Dual 1.0 GHz Itanium2
8 Gbyte DDR-RAM
GNU Compiler
Atlas 3.4.1
shared memory MPIch 1.2.5
SMP
1 1 1 2,3 N=4096, NB=48,P=1, Q=1
1 2 2 3,93 N=4096, NB=48,P=2, Q=1
HP ZX6000
Dual 1.0 GHz Itanium2
8 Gbyte DDR-RAM
GNU Compiler
Atlas 3.4.1
InfiniBand 4x
MTPB23108-CE128
MTEK43132-C08-S
MPIch 1.2.5
NCSA VMI2.0
Beta2 rel. 2
2 2 2 4,07 N=4096, NB=48,P=2, Q=1
2 4 4 7,3 N=4096, NB=48,P=2, Q=2
4 4 4 7,57 N=4096, NB=48,P=2, Q=2
4 8 8 11,8 N=4096, NB=48,P=4, Q=2
HP ZX6000
Dual 1.0 GHz Itanium2
8 Gbyte DDR-RAM
GNU Compiler
Atlas 3.4.1
MyriNet2000
PCIXD-2
back-to-back
MPIch 1.2.5
NCSA VMI2.0 preBeta2
2 2 2 4,06 N=4096, NB=48,P=2, Q=1
2 4 4 7,21 N=4096, NB=48, P=2, Q=2
HP ZX6000
Dual 1.0 GHz Itanium2
8 Gbyte DDR-RAM
GNU Compiler
Atlas 3.4.1
MyriNet2000
MFM-PCI64B 64/66
M3-SW16-8M
MPIch 1.2.5
NCSA VMI2.0 myrinet GM2.0
Beta2 rel. 2
2 2 2 4,03 N=4096, NB=48,P=2, Q=1
2 4 4 7,07 N=4096, NB=48,P=2, Q=2
4 4 4 7,35 N=4096, NB=48,P=2, Q=2
4 8 8 10,5 N=4096, NB=48, P=4, Q=2
HP ZX6000
Dual 1.0 GHz Itanium2
8 Gbyte DDR-RAM
GNU Compiler
Atlas 3.4.1
MyriNet2000
MFM-PCI64B 64/66
M3-SW16-8M
MPIch 1.2.5
GM 1.6.4
1 1 1 2,3 N=4096, NB=48,P=1, Q=1
1 2 2 4,16 N=4096, NB=48,P=2, Q=1
2 2 2 4,08 N=4096, NB=48,P=2, Q=1
2 4 4 7,55 N=4096, NB=48,P=2, Q=2
4 4 4 7,37 N=4096, NB=48,P=2, Q=2
4 8 8 11,36 N=4096, NB=48, P=4, Q=2
HP ZX6000
Dual 1.0 GHz Itanium2
8 Gbyte DDR-RAM
GNU Compiler
Atlas 3.4.1
Gigabit Ethernet
HP procurve 4104gl


MPIch 1.2.5
NCSA VMI2.0 preBeta2
1 1 1 2,3 N=4096, NB=48,P=1, Q=1
1 2 2 3,94 N=4096, NB=48,P=2, Q=1
2 2 2 3,42 N=4096, NB=48,P=2, Q=1
2 4 4 5,26 N=4096, NB=48,P=2, Q=2
HP ZX6000
Dual 1.0 GHz Itanium2
8 Gbyte DDR-RAM
GNU Compiler
Atlas 3.4.1
Fast Ethernet
HP procurve 4104gl


MPIch 1.2.5
NCSA VMI2.0 tcp
Beta2 rel. 2
1 1 1 2,3 N=4096, NB=48,P=1, Q=1
1 2 2 3,66 N=4096, NB=48,P=2, Q=1
2 2 2 2,13 N=4096, NB=48,P=2, Q=1
2 4 4 3,28 N=4096, NB=48,P=2, Q=2
4 4 4 2,81 N=4096, NB=48,P=2, Q=2
4 8 8 2,05 N=4096, NB=48,P=4, Q=2
HP ZX6000
Dual 1.0 GHz Itanium2
8 Gbyte DDR-RAM
Fast Ethernet
HP procurve 4104gl


MPIch 1.2.5
NCSA VMI2.0 preBeta2
1   1 2,3 N=4096, NB=48,P=1, Q=1
1   2 3,66 N=4096, NB=48,P=2, Q=1
2   2 2,02 N=4096, NB=48,P=2, Q=1
2   4 3,24 N=4096, NB=48,P=2, Q=2
Machines / Math Lib
Interconnect
MPI
Number of nodes
Number of MPI proc.
Number of active CPUs
Performance [Gflop/s]
Remarks
Dual 2.2 GHz Opteron
Fujitsu-Siemens V810,
2 x 1 Gbyte DDR-RAM
GNU compiler gcc 3.2.3
lib goto 0.93 64-bit
shared memory MPICH 1.2.5
NCSA VMI2.0 MST
Beta3 rel
1 1 1 3,38 N=4096, NB=100,P=1, Q=1
1 2 2 6,15 N=4096, NB=100,P=2, Q=1
Dual 2.2 GHz Opteron
Fujitsu-Siemens V810,
2 x 1 Gbyte DDR-RAM
GNU compiler gcc 3.2.3
lib goto 0.93 64-bit
InfiniBand 4x
MTPB23108-CE128
MTEK43132-C08-S
MPICH 1.2.5
NCSA VMI2.0 MST
Beta3 rel
1 1 1 3,38 N=4096, NB=100,P=1, Q=1
1 2 2 6,15 N=4096, NB=100,P=2, Q=1
2 2 2 6,193 N=4096, NB=80,P=2, Q=1
2 4 4 11,38 N=4096, NB=80,P=2, Q=2
Dual 2.2 GHz Opteron
Fujitsu-Siemens V810,
2 x 1 Gbyte DDR-RAM
GNU compiler gcc 3.2.3
acml 2.0 gnu64
shared memory MPICH 1.2.5
NCSA VMI2.0 MST
Beta3 rel
1 1 1 3,02 N=4096, NB=100,P=1, Q=1
1 2 2 5,52 N=4096, NB=100,P=2, Q=1
Dual 2.2 GHz Opteron
Fujitsu-Siemens V810,
2 x 1 Gbyte DDR-RAM
GNU compiler gcc 3.2.3
acml 2.0 gnu64
InfiniBand 4x
MTPB23108-CE128
MTEK43132-C08-S
MPICH 1.2.5
NCSA VMI2.0 MST
Beta3 rel
1 1 1 3,02 N=4096, NB=100,P=1, Q=1
1 2 2 5,52 N=4096, NB=100,P=2, Q=1
2 2 2 5,50 N=4096, NB=100,P=2, Q=1
2 4 4 9,71 N=4096, NB=100,P=2, Q=2
Newisys 2100
Dual 1.4 GHz Opteron
2 Gbyte DDR-RAM
GNU Compiler 64bit
Atlas 3.5.6 AMD64 SSE2
shared memory
1 1 1 2,17 N=4096, NB=48,P=1, Q=1
1 2 2 3,64 N=4096, NB=48,P=2, Q=1
Newisys 2100
Dual 1.4 GHz Opteron
2 Gbyte DDR-RAM
GNU Compiler 64bit
Atlas 3.5.2 AMD64 SSE2
Gigabit-Ethernet
Dell powerconnect 5224
MPIch 1.2.5
1 1 1 2,05 N=4096, NB=48,P=1, Q=1
1 2 2 3,62 N=4096, NB=48,P=2, Q=1
2 2 2 3,31 N=4096, NB=48,P=2, Q=1
2 4 4 5,9 N=4096, NB=48,P=2, Q=2
Machines / Math Lib
Interconnect
MPI
Number of nodes
Number of MPI proc.
Number of active CPUs
Performance [Gflop/s]
Remarks
Dell PowerEdge 2650
Dual 2.4 GHz Xeon
2 Gbyte DDR-RAM
GNU Compiler
Atlas 3.4.1 Xeon SSE2
MyriNet2000
MFM-PCI64B 64/66
M3-SW16-8M
MPIch 1.2.5
GM 1.6.4
1 1 1 1,54 N=4096, NB=48,P=1, Q=1
1 2 2 2,62 N=4096, NB=48,P=2, Q=1
2 2 2 2,66 N=4096, NB=48,P=2, Q=1
2 4 4 4,58 N=4096, NB=48,P=2, Q=2
Dell PowerEdge 2650
Dual 2.4 GHz Xeon
2 Gbyte DDR-RAM
Intel Compiler 7.0
Intel MKL 5.2
MyriNet2000
MFM-PCI64B 64/66
M3-SW16-8M
MPIch 1.2.5
GM 1.6.4
1 1 1 1,39 N=4096, NB=48,P=1, Q=1
1 2 2 1,65 N=4096, NB=48,P=2, Q=1
2 2 2 2,5 N=4096, NB=48,P=2, Q=1
2 4 4 2,7 N=4096, NB=48,P=2, Q=2

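The Performance column can be put in relation to the theoretical peak of each machine. The sketch below is only an illustration under stated assumptions: the values are read with a decimal comma (so 280,20 means 280.20 Gflop/s), and the floating-point operations per cycle (4 for Itanium2, 2 for the Xeon and Opteron CPUs) are assumptions that are not given on this page. All function names are hypothetical.

```python
def parse_gflops(cell: str) -> float:
    """Convert a performance cell such as "280,20" (decimal comma) to a float."""
    return float(cell.strip().replace(",", "."))


def peak_gflops(active_cpus: int, clock_ghz: float, flops_per_cycle: int) -> float:
    """Theoretical peak in Gflop/s: CPUs x clock rate x flops per cycle."""
    return active_cpus * clock_ghz * flops_per_cycle


def efficiency(cell: str, active_cpus: int, clock_ghz: float,
               flops_per_cycle: int) -> float:
    """Fraction of the theoretical peak reached by a measured HPL result."""
    return parse_gflops(cell) / peak_gflops(active_cpus, clock_ghz, flops_per_cycle)


if __name__ == "__main__":
    # HP rx2600 row "32 64 64 280,20": 64 active 1.3 GHz Itanium2 CPUs,
    # assuming 4 flops per cycle per CPU.
    print(f"{efficiency('280,20', 64, 1.3, 4):.1%} of assumed peak")
```

Under these assumptions the 64-CPU Itanium2 runs at N=98304 reach roughly 84 percent of peak, while small problem sizes such as N=4096 stay well below that.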
HPL

http://www.netlib.org/benchmark/hpl

BLAS by Kazushige Goto

http://www.cs.utexas.edu/users/kgoto/

PC²

http://www.upb.de/pc2

All measurements were carried out by us in our own lab.
