Developers' Guide

This page contains details on LibGeoDecomp's internals. We don't have much documentation online, as most of the team works at FAU. However, here are some starting points.

  • The trunk is hosted on GitHub. You can also report bugs there.
  • Mainline development goes through GentryX' repository. Changes get merged to trunk automatically once they clear CI tests.
  • The API documentation is available here.
LibGeoDecomp is written in 1TBS (one true brace style) and mostly follows the coding style of Qt and KDE. Compliance is important to keep the codebase consistent and readable.
  • Pass primitive types (int, double, char, size_t) by value, all others as const references. Exception: if a function is meant to modify its parameters, then pass those as pointers.
  • Use const TYPE& var instead of TYPE const & var.
  • Don't repeat yourself! When writing comments, please focus on the WHY and not the WHAT. What the code does should be obvious from the method name and/or the code itself. But if your code is doing something complex or counterintuitive, a short explanation is helpful.
  • Sometimes you'll need to bind template arguments of a frequently used type in a typedef, e.g. typedef typename CELL_TYPE::Topology Topology (see the example below).
  • Header guards are all caps with underscores. They are simply the path of the given file within the source repository, e.g. #ifndef LIBGEODECOMP_MISC_GRID_H for misc/grid.h.
  • Layout:
    • Indentation: 4 spaces, no tabs
    • No indentation for namespaces
    • No trailing whitespace
  • Naming conventions
    • Use CamelCase for classes, typenames, functions, variables. All types are capitalized, all functions and variables start with lower case characters.
    • ALL_CONSTANTS are all caps, delimited by underscores.
    • Please don't shorten variable names to crptcAbvtns or single letters. No one can remember what k2, alpha, xx and such mean, not even you. Bytes are cheap these days. Why not use descriptiveNames?
#include <cstddef>
#include <utility>

namespace LibGeoDecomp {

template<typename CARGO>
class Container
{
public:
    friend class ThatOtherClass;
    typedef typename CARGO::Topology Topology;
    static const int DIM = Topology::DIM;

    inline Container(std::size_t size, const CARGO& defaultValue)
    {
        myData = new CARGO[size];
        for (std::size_t i = 0; i < size; ++i) {
            myData[i] = defaultValue;
        }
    }

    ~Container()
    {
        // allocated with new[], so it must be released with delete[]
        delete[] myData;
    }

    inline void setFront(const CARGO& newFront)
    {
        myData[0] = newFront;
    }

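    /**
     * Exchanges the front element with *newFront. The argument is
     * passed as a pointer because this function modifies it.
     */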
    inline void swapFront(CARGO *newFront)
    {
        std::swap(myData[0], *newFront);
    }

    inline CARGO *data()
    {
        return myData;
    }

private:
    CARGO *myData;
};

}
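
The header guard rule from the list above (the file's path within the source repository, in all caps with underscores) can be sketched as a small helper. This is only an illustration: headerGuard() is a hypothetical function and not part of LibGeoDecomp, and the fixed LIBGEODECOMP_ prefix is an assumption taken from the misc/grid.h example.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper (not part of LibGeoDecomp): derives the header
// guard for a file from its path relative to the source directory.
// Assumption: the guard always starts with "LIBGEODECOMP_".
std::string headerGuard(const std::string& relativePath)
{
    assert(!relativePath.empty());

    std::string guard = "LIBGEODECOMP_";
    for (std::size_t i = 0; i < relativePath.size(); ++i) {
        unsigned char character = static_cast<unsigned char>(relativePath[i]);
        if (std::isalnum(character)) {
            guard += static_cast<char>(std::toupper(character));
        } else {
            // path separators and the dot before the extension
            // are mapped to underscores
            guard += '_';
        }
    }

    return guard;
}
```

With this sketch, headerGuard("misc/grid.h") yields LIBGEODECOMP_MISC_GRID_H, which matches the guard given in the list above.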

Lines of Code

The C++ stats include libgeodecomp/src only; the build system includes all CMake files and the TypemapGenerator. All stats exclude generated files.

[Plot: lines of code vs. revisions]

Revisions vs. Time

[Plot: revisions vs. date]

Fixmes

How often does the term fixme occur in the code? (Fewer would be better.)

[Plot: occurrences of the term FIXME vs. revisions; lower is better]

Undocumented Classes

How many classes lack (basic) documentation?

[Plot: number of undocumented classes vs. revisions; lower is better]

Unmatched Header Guards

[Plot: number of unmatched header guards vs. revisions; lower is better]

Bad Includes

[Plot: number of bad includes vs. revisions; lower is better]

size_t Misuse

How often is size_t used instead of std::size_t?

[Plot: number of size_t misuses vs. revisions; lower is better]

Performance plots typically use seconds (s, the fewer the better) or Giga Lattice Updates Per Second (GLUPS, the more the better) as their unit of measure. They contain multiple curves which compare implementations based on LibGeoDecomp with manual implementations. The names of the curves indicate which kind of implementation was used:

  • Metal names (bronze, silver, gold, platinum) are assigned to implementations which use LibGeoDecomp, with bronze being the least optimized implementation and platinum representing the highest degree of optimization.
  • Spices (vanilla, pepper) are used for manual, mostly C-style implementations. They mimic the codes developers would need to write if they had no access to LibGeoDecomp. Vanilla is typically a very basic code, while pepper may contain optimizations.

Please mind the different scales in the plots. Some benchmarks were introduced at different times to the various architectures, which is why the horizontal scales differ. And of course the vertical scales differ, too, as the different hardware architectures yield different levels of performance.
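
To make the GLUPS unit more concrete, here is a minimal sketch of how such a figure could be computed from a benchmark run. It assumes that one lattice update is the update of one grid cell for one time step. The function name and parameters are made up for this illustration and are not taken from the benchmark code.

```cpp
#include <cassert>
#include <cstddef>

// Illustrative only. Assumed definition:
//   GLUPS = (grid cells * time steps) / (run time in seconds * 10^9)
double gigaLatticeUpdatesPerSecond(std::size_t cells, std::size_t timeSteps, double seconds)
{
    assert(seconds > 0);

    double updates = static_cast<double>(cells) * static_cast<double>(timeSteps);
    return updates / (seconds * 1e9);
}
```

For example, a grid of 1,000,000 cells that is updated for 1,000 time steps in 2 seconds corresponds to 0.5 GLUPS under this definition.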

1. Mini Apps

This category contains multiple mini apps which are identical to, or at least good representatives of, the kernels found in actual simulation codes. They were chosen so that they allow comparison with competing implementations. We are testing a Lattice Boltzmann Method (LBM, D3Q19), a Reverse Time Migration (RTM, 3D), and a Jacobi smoother (7-point stencil), as well as some tests for unstructured grids (sparse matrix-vector multiplication, SPMVM, with SELL-C-Sigma storage).
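
For readers unfamiliar with the Jacobi smoother, the following is a rough sketch of what a basic, manually written ("vanilla"-style) sweep with a 7-point stencil might look like. It is not the benchmark's actual code. The equal weights of 1/7, the row-major layout and the untouched boundary cells are assumptions made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One Jacobi sweep on a dimX * dimY * dimZ grid stored in row-major
// order (x is the fastest running index). Each interior cell becomes
// the average of itself and its six direct neighbors. Boundary cells
// are copied unchanged (an assumption of this sketch).
void jacobiSweep(
    const std::vector<double>& source,
    std::vector<double> *target,
    std::size_t dimX,
    std::size_t dimY,
    std::size_t dimZ)
{
    assert(source.size() == dimX * dimY * dimZ);
    assert(target->size() == source.size());

    const std::size_t strideY = dimX;
    const std::size_t strideZ = dimX * dimY;

    // copy everything first so that the boundary is preserved
    *target = source;

    for (std::size_t z = 1; z + 1 < dimZ; ++z) {
        for (std::size_t y = 1; y + 1 < dimY; ++y) {
            for (std::size_t x = 1; x + 1 < dimX; ++x) {
                std::size_t index = z * strideZ + y * strideY + x;
                (*target)[index] =
                    (source[index] +
                     source[index - 1] +
                     source[index + 1] +
                     source[index - strideY] +
                     source[index + strideY] +
                     source[index - strideZ] +
                     source[index + strideZ]) * (1.0 / 7.0);
            }
        }
    }
}
```

The output grid is passed as a pointer because the function modifies it, in line with the coding style described earlier on this page.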

  • LBM (Tesla P100, TeslaP100) [plot; higher is better]
  • LBM (Tesla V100, TeslaV100) [plot; higher is better]
  • LBM (Intel Xeon Broadwell EN/EP/EX processor, Intel(R) Xeon(R) CPU @ 2.20GHz, Broadwell) [plot; higher is better]
  • LBM (Intel Cascadelake SP processor, Intel(R) Xeon(R) CPU, CascadeLake) [plot; higher is better]
  • LBM (Intel Skylake SP processor, Intel(R) Xeon(R) CPU @ 2.00GHz, Skylake) [plot; higher is better]
  • RTM (Tesla P100, TeslaP100) [plot; higher is better]
  • RTM (Tesla V100, TeslaV100) [plot; higher is better]
  • Jacobi3D (Intel Xeon Broadwell EN/EP/EX processor, Intel(R) Xeon(R) CPU @ 2.20GHz, Broadwell) [plot; higher is better]
  • Jacobi3D (Intel Cascadelake SP processor, Intel(R) Xeon(R) CPU, CascadeLake) [plot; higher is better]
  • Jacobi3D (Intel Skylake SP processor, Intel(R) Xeon(R) CPU @ 2.00GHz, Skylake) [plot; higher is better]
  • SPMVM (Intel Xeon Broadwell EN/EP/EX processor, Intel(R) Xeon(R) CPU @ 2.20GHz, Broadwell) [plot; higher is better]
  • SPMVM (Intel Cascadelake SP processor, Intel(R) Xeon(R) CPU, CascadeLake) [plot; higher is better]
  • SPMVM (Intel Skylake SP processor, Intel(R) Xeon(R) CPU @ 2.00GHz, Skylake) [plot; higher is better]

2. Geometry Subsystem

The geometry subsystem is responsible for the domain decomposition (i.e. which parts of the grid are assigned to which node, and how can we determine halos and traverse the coordinates?). The primary purpose of these benchmarks is to discover involuntary performance degradations.
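
The idea of assigning parts of the grid to nodes and deriving halos can be illustrated with a deliberately simple striping scheme. The sketch below does not use LibGeoDecomp's classes. RowRange, ownedRows() and rowsWithHalo() are hypothetical names, and it assumes a non-periodic grid that is split along one axis only.

```cpp
#include <cassert>
#include <cstddef>

// Half-open range [begin, end) of grid rows.
struct RowRange
{
    std::size_t begin;
    std::size_t end;
};

// Rows owned by node `rank` if `rows` rows are spread as evenly as
// possible over `nodes` nodes (striping along one axis).
RowRange ownedRows(std::size_t rows, std::size_t nodes, std::size_t rank)
{
    assert(nodes > 0);
    assert(rank < nodes);

    RowRange range;
    range.begin = rows * rank / nodes;
    range.end = rows * (rank + 1) / nodes;
    return range;
}

// Rows a node needs to read for one update: its own rows plus a halo
// of `haloWidth` rows on each side, clamped to the grid boundaries.
RowRange rowsWithHalo(const RowRange& owned, std::size_t rows, std::size_t haloWidth)
{
    assert(owned.begin <= owned.end);
    assert(owned.end <= rows);

    RowRange range;
    range.begin = (owned.begin > haloWidth) ? owned.begin - haloWidth : 0;
    range.end = (owned.end + haloWidth < rows) ? owned.end + haloWidth : rows;
    return range;
}
```

With 10 rows and 3 nodes, the middle node owns rows 3 to 5 and, for a halo width of 1, additionally needs rows 2 and 6 from its neighbors.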

  • CoordEnumeration (Intel Xeon Broadwell EN/EP/EX processor, Intel(R) Xeon(R) CPU @ 2.20GHz, Broadwell) [plot; lower is better]
  • CoordEnumeration (Intel Cascadelake SP processor, Intel(R) Xeon(R) CPU, CascadeLake) [plot; lower is better]
  • CoordEnumeration (Intel Skylake SP processor, Intel(R) Xeon(R) CPU @ 2.00GHz, Skylake) [plot; lower is better]
  • FloatCoordAccumulation (Intel Xeon Broadwell EN/EP/EX processor, Intel(R) Xeon(R) CPU @ 2.20GHz, Broadwell) [plot; lower is better]
  • FloatCoordAccumulation (Intel Cascadelake SP processor, Intel(R) Xeon(R) CPU, CascadeLake) [plot; lower is better]
  • FloatCoordAccumulation (Intel Skylake SP processor, Intel(R) Xeon(R) CPU @ 2.00GHz, Skylake) [plot; lower is better]
  • RegionCount (Intel Xeon Broadwell EN/EP/EX processor, Intel(R) Xeon(R) CPU @ 2.20GHz, Broadwell) [plot; lower is better]
  • RegionCount (Intel Cascadelake SP processor, Intel(R) Xeon(R) CPU, CascadeLake) [plot; lower is better]
  • RegionCount (Intel Skylake SP processor, Intel(R) Xeon(R) CPU @ 2.00GHz, Skylake) [plot; lower is better]
  • RegionInsert (Intel Xeon Broadwell EN/EP/EX processor, Intel(R) Xeon(R) CPU @ 2.20GHz, Broadwell) [plot; lower is better]
  • RegionInsert (Intel Cascadelake SP processor, Intel(R) Xeon(R) CPU, CascadeLake) [plot; lower is better]
  • RegionInsert (Intel Skylake SP processor, Intel(R) Xeon(R) CPU @ 2.00GHz, Skylake) [plot; lower is better]
  • RegionIntersect (Intel Xeon Broadwell EN/EP/EX processor, Intel(R) Xeon(R) CPU @ 2.20GHz, Broadwell) [plot; lower is better]
  • RegionIntersect (Intel Cascadelake SP processor, Intel(R) Xeon(R) CPU, CascadeLake) [plot; lower is better]
  • RegionIntersect (Intel Skylake SP processor, Intel(R) Xeon(R) CPU @ 2.00GHz, Skylake) [plot; lower is better]
  • RegionSubtract (Intel Xeon Broadwell EN/EP/EX processor, Intel(R) Xeon(R) CPU @ 2.20GHz, Broadwell) [plot; lower is better]
  • RegionSubtract (Intel Cascadelake SP processor, Intel(R) Xeon(R) CPU, CascadeLake) [plot; lower is better]
  • RegionSubtract (Intel Skylake SP processor, Intel(R) Xeon(R) CPU @ 2.00GHz, Skylake) [plot; lower is better]
  • RegionUnion (Intel Xeon Broadwell EN/EP/EX processor, Intel(R) Xeon(R) CPU @ 2.20GHz, Broadwell) [plot; lower is better]
  • RegionUnion (Intel Cascadelake SP processor, Intel(R) Xeon(R) CPU, CascadeLake) [plot; lower is better]
  • RegionUnion (Intel Skylake SP processor, Intel(R) Xeon(R) CPU @ 2.00GHz, Skylake) [plot; lower is better]
  • RegionExpand1 (Intel Xeon Broadwell EN/EP/EX processor, Intel(R) Xeon(R) CPU @ 2.20GHz, Broadwell) [plot; lower is better]
  • RegionExpand1 (Intel Cascadelake SP processor, Intel(R) Xeon(R) CPU, CascadeLake) [plot; lower is better]
  • RegionExpand1 (Intel Skylake SP processor, Intel(R) Xeon(R) CPU @ 2.00GHz, Skylake) [plot; lower is better]
  • RegionExpand5 (Intel Xeon Broadwell EN/EP/EX processor, Intel(R) Xeon(R) CPU @ 2.20GHz, Broadwell) [plot; lower is better]
  • RegionExpand5 (Intel Cascadelake SP processor, Intel(R) Xeon(R) CPU, CascadeLake) [plot; lower is better]
  • RegionExpand5 (Intel Skylake SP processor, Intel(R) Xeon(R) CPU @ 2.00GHz, Skylake) [plot; lower is better]
  • PartitionHIndexing (Intel Xeon Broadwell EN/EP/EX processor, Intel(R) Xeon(R) CPU @ 2.20GHz, Broadwell) [plot; lower is better]
  • PartitionHIndexing (Intel Cascadelake SP processor, Intel(R) Xeon(R) CPU, CascadeLake) [plot; lower is better]
  • PartitionHIndexing (Intel Skylake SP processor, Intel(R) Xeon(R) CPU @ 2.00GHz, Skylake) [plot; lower is better]
  • PartitionStriping (Intel Xeon Broadwell EN/EP/EX processor, Intel(R) Xeon(R) CPU @ 2.20GHz, Broadwell) [plot; lower is better]
  • PartitionStriping (Intel Cascadelake SP processor, Intel(R) Xeon(R) CPU, CascadeLake) [plot; lower is better]
  • PartitionStriping (Intel Skylake SP processor, Intel(R) Xeon(R) CPU @ 2.00GHz, Skylake) [plot; lower is better]
  • PartitionHilbert (Intel Xeon Broadwell EN/EP/EX processor, Intel(R) Xeon(R) CPU @ 2.20GHz, Broadwell) [plot; lower is better]
  • PartitionHilbert (Intel Cascadelake SP processor, Intel(R) Xeon(R) CPU, CascadeLake) [plot; lower is better]
  • PartitionHilbert (Intel Skylake SP processor, Intel(R) Xeon(R) CPU @ 2.00GHz, Skylake) [plot; lower is better]
  • PartitionZCurve (Intel Xeon Broadwell EN/EP/EX processor, Intel(R) Xeon(R) CPU @ 2.20GHz, Broadwell) [plot; lower is better]
  • PartitionZCurve (Intel Cascadelake SP processor, Intel(R) Xeon(R) CPU, CascadeLake) [plot; lower is better]
  • PartitionZCurve (Intel Skylake SP processor, Intel(R) Xeon(R) CPU @ 2.00GHz, Skylake) [plot; lower is better]
  • PartMngr3D(RecursiveBisection) (Intel Xeon Broadwell EN/EP/EX processor, Intel(R) Xeon(R) CPU @ 2.20GHz, Broadwell) [plot; lower is better]
  • PartMngr3D(RecursiveBisection) (Intel Cascadelake SP processor, Intel(R) Xeon(R) CPU, CascadeLake) [plot; lower is better]
  • PartMngr3D(RecursiveBisection) (Intel Skylake SP processor, Intel(R) Xeon(R) CPU @ 2.00GHz, Skylake) [plot; lower is better]
  • PartMngr3D(ZCurve) (Intel Xeon Broadwell EN/EP/EX processor, Intel(R) Xeon(R) CPU @ 2.20GHz, Broadwell) [plot; lower is better]
  • PartMngr3D(ZCurve) (Intel Cascadelake SP processor, Intel(R) Xeon(R) CPU, CascadeLake) [plot; lower is better]
  • PartMngr3D(ZCurve) (Intel Skylake SP processor, Intel(R) Xeon(R) CPU @ 2.00GHz, Skylake) [plot; lower is better]

3. Storage Subsystem

No, this is not about disk storage. This section contains benchmarks that exercise the classes which hold simulation data (cells) in memory. For the most part these are grid types and the various operations defined on them.

  • GridLoadSaveRegion (Intel Xeon Broadwell EN/EP/EX processor, Intel(R) Xeon(R) CPU @ 2.20GHz, Broadwell) [plot; higher is better]
  • GridLoadSaveRegion (Intel Cascadelake SP processor, Intel(R) Xeon(R) CPU, CascadeLake) [plot; higher is better]
  • GridLoadSaveRegion (Intel Skylake SP processor, Intel(R) Xeon(R) CPU @ 2.00GHz, Skylake) [plot; higher is better]
  • CUDAGridLoadSaveRegion (Tesla P100, TeslaP100) [plot; higher is better]
  • CUDAGridLoadSaveRegion (Tesla V100, TeslaV100) [plot; higher is better]
  • SELLInit (Intel Xeon Broadwell EN/EP/EX processor, Intel(R) Xeon(R) CPU @ 2.20GHz, Broadwell) [plot; lower is better]
  • SELLInit (Intel Cascadelake SP processor, Intel(R) Xeon(R) CPU, CascadeLake) [plot; lower is better]
  • SELLInit (Intel Skylake SP processor, Intel(R) Xeon(R) CPU @ 2.00GHz, Skylake) [plot; lower is better]

4. MPI

Some of our components naturally rely on MPI. This set of benchmarks evaluates how well they perform.

  • PatchLink(TestCell(3)) (Intel Xeon Broadwell EN/EP/EX processor, Intel(R) Xeon(R) CPU @ 2.20GHz, Broadwell) [plot; higher is better]
  • PatchLink(TestCell(3)) (Intel Cascadelake SP processor, Intel(R) Xeon(R) CPU, CascadeLake) [plot; higher is better]
  • PatchLink(TestCell(3)) (Intel Skylake SP processor, Intel(R) Xeon(R) CPU @ 2.00GHz, Skylake) [plot; higher is better]
  • PatchLink(MySimpleCell) (Intel Xeon Broadwell EN/EP/EX processor, Intel(R) Xeon(R) CPU @ 2.20GHz, Broadwell) [plot; higher is better]
  • PatchLink(MySimpleCell) (Intel Cascadelake SP processor, Intel(R) Xeon(R) CPU, CascadeLake) [plot; higher is better]
  • PatchLink(MySimpleCell) (Intel Skylake SP processor, Intel(R) Xeon(R) CPU @ 2.00GHz, Skylake) [plot; higher is better]
  • CollectingWriter(TestCell(3)) (Intel Xeon Broadwell EN/EP/EX processor, Intel(R) Xeon(R) CPU @ 2.20GHz, Broadwell) [plot; higher is better]
  • CollectingWriter(TestCell(3)) (Intel Cascadelake SP processor, Intel(R) Xeon(R) CPU, CascadeLake) [plot; higher is better]
  • CollectingWriter(TestCell(3)) (Intel Skylake SP processor, Intel(R) Xeon(R) CPU @ 2.00GHz, Skylake) [plot; higher is better]
  • CollectingWriter(MySimpleCell) (Intel Xeon Broadwell EN/EP/EX processor, Intel(R) Xeon(R) CPU @ 2.20GHz, Broadwell) [plot; higher is better]
  • CollectingWriter(MySimpleCell) (Intel Cascadelake SP processor, Intel(R) Xeon(R) CPU, CascadeLake) [plot; higher is better]
  • CollectingWriter(MySimpleCell) (Intel Skylake SP processor, Intel(R) Xeon(R) CPU @ 2.00GHz, Skylake) [plot; higher is better]

5. HPX

This section contains performance tests which utilize the HPX backend. HPX is the first implementation of the upcoming C++ standard for concurrency and parallelism.

  • HPXBusyworkCell (Intel Xeon Broadwell EN/EP/EX processor, Intel(R) Xeon(R) CPU @ 2.20GHz, Broadwell) [plot; higher is better]
  • HPXBusyworkCell (Intel Cascadelake SP processor, Intel(R) Xeon(R) CPU, CascadeLake) [plot; higher is better]
  • HPXBusyworkCell (Intel Skylake SP processor, Intel(R) Xeon(R) CPU @ 2.00GHz, Skylake) [plot; higher is better]

6. LibFlatArray

LibFlatArray performance, as reported by libflatarray/examples/performance_tests.

  • JacobiD3Q7 (Intel Xeon Broadwell EN/EP/EX processor, Intel(R) Xeon(R) CPU @ 2.20GHz, Broadwell) [plot; higher is better]
  • JacobiD3Q7 (Intel Cascadelake SP processor, Intel(R) Xeon(R) CPU, CascadeLake) [plot; higher is better]
  • JacobiD3Q7 (Intel Skylake SP processor, Intel(R) Xeon(R) CPU @ 2.00GHz, Skylake) [plot; higher is better]
  • NBody (Intel Xeon Broadwell EN/EP/EX processor, Intel(R) Xeon(R) CPU @ 2.20GHz, Broadwell) [plot; higher is better]
  • NBody (Intel Cascadelake SP processor, Intel(R) Xeon(R) CPU, CascadeLake) [plot; higher is better]
  • NBody (Intel Skylake SP processor, Intel(R) Xeon(R) CPU @ 2.00GHz, Skylake) [plot; higher is better]
  • ParticleMover (Intel Xeon Broadwell EN/EP/EX processor, Intel(R) Xeon(R) CPU @ 2.20GHz, Broadwell) [plot; lower is better]
  • ParticleMover (Intel Cascadelake SP processor, Intel(R) Xeon(R) CPU, CascadeLake) [plot; lower is better]
  • ParticleMover (Intel Skylake SP processor, Intel(R) Xeon(R) CPU @ 2.00GHz, Skylake) [plot; lower is better]
  • ParticleInteractor (Intel Xeon Broadwell EN/EP/EX processor, Intel(R) Xeon(R) CPU @ 2.20GHz, Broadwell) [plot; lower is better]
  • ParticleInteractor (Intel Cascadelake SP processor, Intel(R) Xeon(R) CPU, CascadeLake) [plot; lower is better]
  • ParticleInteractor (Intel Skylake SP processor, Intel(R) Xeon(R) CPU @ 2.00GHz, Skylake) [plot; lower is better]
  • ConditionalAny (Intel Xeon Broadwell EN/EP/EX processor, Intel(R) Xeon(R) CPU @ 2.20GHz, Broadwell) [plot; lower is better]
  • ConditionalAny (Intel Cascadelake SP processor, Intel(R) Xeon(R) CPU, CascadeLake) [plot; lower is better]
  • ConditionalAny (Intel Skylake SP processor, Intel(R) Xeon(R) CPU @ 2.00GHz, Skylake) [plot; lower is better]

last modified: Wed Apr 04 00:03:20 2018 +0200