Polyhedron Fortran Benchmarks: 64-bit Windows on AMD Phenom II (December 2014)

Benchmark        Absoft(AP)     Absoft      FTN95        G95  Intel(AP)      Intel  Lahey-GNU      Lahey        NAG        PGI
Version              15.0.0     15.0.0       7.10       0.93       15.0       15.0 Lassen-4.9       7.30        6.0       14.9
AC                     7.81       7.65      15.78      14.85       9.69       9.58       8.43      18.02      10.17       8.56
AERMOD                19.38      19.40      33.53      40.67      17.72      17.71      28.17      26.52      24.95      17.75
AIR                    3.16       6.76      13.97       9.34       2.42       4.97       7.46       9.73       7.70       6.68
CAPACITA              32.13      37.64      64.96      57.47      37.91      37.93      43.93      63.52      49.21      32.02
CHANNEL2             115.30     143.98     357.06     549.51     113.99     146.61     150.93     242.77     220.92     141.17
DODUC                 28.15      27.84      49.44      34.60      22.92      22.76      29.73      37.21      31.77      26.47
FATIGUE2              86.68      82.06     406.46    1008.96      62.53      62.33      42.87     261.42     128.06     151.63
GAS_DYN2             103.13     101.18     565.74     723.25     101.69     101.05     228.48     277.85     199.60     131.42
INDUCT2               68.82      96.02     600.89     232.38      44.04     109.46      97.11     480.12     152.65     204.28
LINPK                  9.25       9.38      10.67      10.67       9.73       9.73      10.31      10.19      10.17      10.25
MDBX                  11.55      12.45      23.75      20.43      10.90      10.80      11.34      19.48      12.47      13.60
MP_PROP_DESIGN        41.59     137.66     691.76     156.29      31.52      99.39     292.83     335.77     285.91     126.26
NF                    16.70      18.26      28.17      32.47      13.67      13.70      16.79      24.56      19.31      15.90
PROTEIN               31.11      31.71      56.27      57.48      30.52      30.51      32.59      58.74      32.45      35.03
RNFLOW                21.91      17.38      35.53      32.62      19.75      18.71      22.41      28.79      24.29      23.55
TEST_FPU2             87.85      94.05     192.73     174.28      80.15      85.61     104.96     129.74     117.91      88.75
TFFT2                131.88     135.30     168.22     137.03     136.20     135.18      88.91     135.20     136.93     137.01
Geometric Mean        30.59      35.70      82.44      73.37      27.77      33.12      39.35      63.57      46.65      39.50
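The Geometric Mean row can be checked against the per-benchmark times. The sketch below is illustrative only: it assumes the mean is taken over the 17 benchmark rows of a single column, and it uses the Intel(AP) column as input. The names `intel_ap_times` and `geometric_mean` are made up for this example and are not part of the benchmark harness.

```python
from math import exp, log

# Intel(AP) column copied from the results table (execution time in seconds),
# in row order from AC down to TFFT2.
intel_ap_times = [
    9.69, 17.72, 2.42, 37.91, 113.99, 22.92, 62.53, 101.69, 44.04,
    9.73, 10.90, 31.52, 13.67, 30.52, 19.75, 80.15, 136.20,
]


def geometric_mean(times):
    """Return the n-th root of the product of the timings.

    The product is accumulated as a sum of logarithms so that long
    columns of large timings cannot overflow.
    """
    if not times:
        raise ValueError("need at least one timing")
    return exp(sum(log(t) for t in times) / len(times))


# Should land close to the 27.77 published for Intel(AP).
print(f"Intel(AP) geometric mean: {geometric_mean(intel_ap_times):.2f} s")
```

A geometric mean weights relative differences equally across benchmarks, so a long-running case such as TFFT2 does not dominate the summary the way it would in an arithmetic mean.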

 

Compiler Switches
Absoft (autoparallel)  af90 -m64 -O5 -speed_math=10 -fast_math -march=barcelona -xINTEGER -stack:0x80000000
Absoft                 af90 -m64 -O4 -speed_math=10 -fast_math -march=barcelona -xINTEGER -stack:0x80000000
FTN95                  ftn95 /p6 /optimize (slink was used to increase the stack size)
G95                    g95 -march=opteron -funroll-loops -O3
Intel (autoparallel)   ifort /fast /Qparallel /link /stack:64000000
Intel                  ifort /fast /link /stack:64000000
Lahey-GNU              lgf -64 -ofast -unroll -wpo -stack 64000000
Lahey                  lf95 -inline (35) -o1 -sse2 -nstchk -tp4 -ntrace -unroll (6) -zfm
NAG                    nagfor -abi=64 -O4 -s -v -V
PGI                    pgf90 -V -fastsse -Munroll=n:4 -Mipa=fast,inline
Notes
All figures are execution times in seconds, measured on a machine with an AMD Phenom II X4 955 processor (3.2 GHz) and 4 GB of memory, running Windows 7 64-bit. Each figure is the average over at least 10 runs (many more for some). Measurement error is typically <1%. Green cells highlight figures within 10% of the fastest. Red cells indicate figures which are more than 150% of the fastest.

So far as possible, we have used the compiler switches which give the best overall results. We have not attempted to tune individual benchmarks, and, in particular cases, different switch settings may give better results. We have created and used 64-bit executables where possible, and 32-bit executables where the compiler does not offer a 64-bit option.

The settings used for the Absoft(AP) and Intel(AP) columns enable autoparallelization. Autoparallelization settings are not used on any other compilers because we found that they produced no significant performance benefits on this benchmark set.

Thanks are due to Jos Bergervoet for permission to use his CAPACITA benchmark, to Quetzal Associates for permission to use their CHANNEL, FATIGUE, GAS_DYN, INDUCT, PROTEIN and RNFLOW benchmarks, to David Frank for his TEST_FPU benchmark, to Anthony Falzone for the use of MP_PROP_DESIGN, and to Ted Addison of McVehil-Monnett Associates for permission to use AERMOD, an air quality model used by the US Environmental Protection Agency.

All the benchmarks have been modified slightly to fit into our benchmarking harness.

The NF benchmark uses “nested factorization”, a little-known but very effective iterative linear solver for huge finite difference matrices.
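The green and red highlighting rule described in the notes can be sketched as a small function. This is one reading of the rule, not the page's own code: it assumes "within 10% of the fastest" means a time of at most 1.10 times the row minimum, and "more than 150% of the fastest" means a time above 1.50 times the row minimum. The names `classify_row`, `green_margin` and `red_factor` are hypothetical.

```python
def classify_row(times, green_margin=0.10, red_factor=1.50):
    """Label each timing in one benchmark row relative to the row's fastest.

    Assumed interpretation of the notes:
      green   -> time <= fastest * (1 + green_margin)
      red     -> time >  fastest * red_factor
      neutral -> anything in between
    """
    fastest = min(times)
    labels = []
    for t in times:
        if t <= fastest * (1.0 + green_margin):
            labels.append("green")
        elif t > fastest * red_factor:
            labels.append("red")
        else:
            labels.append("neutral")
    return labels


compilers = ["Absoft(AP)", "Absoft", "FTN95", "G95", "Intel(AP)",
             "Intel", "Lahey-GNU", "Lahey", "NAG", "PGI"]

# AC row from the results table (seconds).
ac_times = [7.81, 7.65, 15.78, 14.85, 9.69, 9.58, 8.43, 18.02, 10.17, 8.56]

for name, label in zip(compilers, classify_row(ac_times)):
    print(f"{name:<11}{label}")
```

Under these assumptions, only the two Absoft columns are green on the AC row (the fastest time is 7.65 s), while FTN95, G95 and Lahey are red.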