We're finally moving to the big machines. Yesterday I gathered the first timing data from our jobs on RRZE's LiMa cluster. We ran DendSim3 with LibGeoDecomp in a Gustafson-Barsis setup. This is also called weak scaling, as we increase the grid volume linearly with the number of cores, so the workload per core stays constant. So far we have achieved a parallel efficiency in excess of 94% while running on 768 cores.
The plot of the execution time vs. the number of cores, shown below, should ideally yield a horizontal line (the lower the better). Granted, it does wobble a bit, and it is slightly rising, but all in all the deviation is negligible. Scaling is almost perfect. Still, these results should be considered preliminary, as we did not yet have the opportunity to run full-system jobs, so stay tuned.
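For readers who want to reproduce the efficiency figure from their own timings, here is a minimal sketch of how weak-scaling efficiency is commonly computed: the runtime of the smallest run divided by the runtime on N cores, with the work per core held fixed. The function names and the timings in the example are made up for illustration; they are not our LiMa measurements and not part of DendSim3 or LibGeoDecomp.

```python
def grid_volume(cells_per_core, cores):
    """Total grid volume in a weak-scaling run: it grows linearly with the core count."""
    return cells_per_core * cores


def weak_scaling_efficiency(base_time, time_on_n_cores):
    """Efficiency = T(baseline) / T(N); 1.0 corresponds to a perfectly horizontal line."""
    if base_time <= 0 or time_on_n_cores <= 0:
        raise ValueError("timings must be positive")
    return base_time / time_on_n_cores


if __name__ == "__main__":
    # Hypothetical wall-clock times in seconds, keyed by core count (NOT measured data).
    timings = {1: 100.0, 48: 102.0, 768: 106.0}
    cells_per_core = 1_000_000  # assumed constant workload per core

    base = timings[min(timings)]
    for cores in sorted(timings):
        eff = weak_scaling_efficiency(base, timings[cores])
        volume = grid_volume(cells_per_core, cores)
        print(f"{cores:4d} cores  {volume:12d} cells  efficiency {eff:.1%}")
```

With this definition, a slowly rising execution-time curve translates directly into an efficiency slightly below 100%.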