
How does OpenMP 3.0 work better with C++?

  

This was one of the questions asked at SC 08. I will try to answer it here, starting with a few topics and adding more as I move through the rest.


OpenMP 3.0 improves C++ support in the following areas:


  • Parallelization of RandomAccess Iterator loops with strict canonical operators
  • Threadprivatization of static class members
  • Unsigned loop control variable support
  • Fully specified constructor call requirements for private, firstprivate, lastprivate, and threadprivate
  • Better match with the C++0x memory model
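
To make the unsigned loop control variable item concrete, here is a minimal sketch. The function name sum_with_unsigned_index is made up for illustration. OpenMP 2.5 required a signed integer loop variable in a worksharing loop, while 3.0 also accepts unsigned types such as std::size_t.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: sums a vector using an unsigned loop control variable.
long long sum_with_unsigned_index(const std::vector<int>& data)
{
    long long total = 0;
    const std::size_t n = data.size();

    // std::size_t is unsigned; this loop form is accepted as of OpenMP 3.0.
    #pragma omp parallel for reduction(+:total)
    for (std::size_t i = 0; i < n; ++i)
    {
        total += data[i];
    }
    return total;
}
```

Without OpenMP enabled, the pragma is ignored and the loop simply runs serially with the same result.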


For-Worksharing with Iterator loops:


We specifically enabled loops over C++ random access iterators and C pointers to be parallelized with explicit directives.


#include <vector>

void iterator_example()
{
    std::vector<int> vec(23);
    std::vector<int>::iterator it;

    #pragma omp parallel for default(none) shared(vec)
    for (it = vec.begin(); it < vec.end(); it++)
    {
        // do work with *it
    }
}
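
The same form works for plain C pointers. The sketch below is only an illustration, with a made-up function name (double_in_place); it walks an array with a pointer as the loop control variable.

```cpp
#include <cstddef>

// Hypothetical helper: doubles every element of an array in place,
// using a pointer as the loop control variable of the worksharing loop.
void double_in_place(int* first, std::size_t count)
{
    int* const last = first + count;

    #pragma omp parallel for
    for (int* p = first; p < last; ++p)
    {
        *p *= 2;  // each iteration touches a distinct element
    }
}
```

As with the iterator loop, the comparison uses < rather than !=, in keeping with the canonical loop form.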


Enabling threadprivatization of static class member variables


In 2.5, ambiguous language in the specification meant that support for this was inconsistent across compilers. In general, the specification was read as saying that a threadprivate variable must have namespace, file, or block scope.


In 3.0, this code is now allowed:

class T {
public:
    static int i;
    #pragma omp threadprivate(i)
};



This may seem a trivial change, but for C++ it enables powerful idioms such as singletons and allocators, which all rely on static class member variables.
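
As a rough sketch of how such a member might be used, the example below keeps a per-thread counter in a static class member. The class CallCounter and the function every_thread_counts_to_three are hypothetical names chosen for this illustration, not part of any library.

```cpp
// Hypothetical class with a per-thread static counter.
class CallCounter
{
public:
    static int count;
    #pragma omp threadprivate(count)  // must follow the declaration, inside the class

    static int bump() { return ++count; }
};

int CallCounter::count = 0;  // the static member still needs its definition

// Each thread resets and bumps its own copy three times.
// Returns true if every thread ends up with a count of exactly 3.
bool every_thread_counts_to_three()
{
    int failures = 0;

    #pragma omp parallel reduction(+:failures)
    {
        CallCounter::count = 0;
        for (int k = 0; k < 3; ++k)
        {
            CallCounter::bump();
        }
        if (CallCounter::count != 3)
        {
            ++failures;
        }
    }
    return failures == 0;
}
```

If count were shared rather than threadprivate, the concurrent increments would race and the final values could differ from 3.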


Semantics of private, first/lastprivate, threadprivate+copyin/copyprivate for C++


Let's get to the good stuff. OpenMP 2.5 did not really specify which constructors should be called for variables in private, firstprivate, lastprivate, and threadprivate.


In some cases, it did not even specify that these apply to non-PODs (POD stands for Plain Old Data, i.e. C structs).


OpenMP 3.0 changed that. Besides covering non-PODs, it also specifies precise rules for the constructor sequence, in line with what the semantics require.


Firstprivate:

For instance, firstprivate on a class type variable requires an accessible copy constructor, because each thread's private copy of a list item must be initialized with the value that the corresponding original item has when the construct is encountered.
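
A minimal sketch of this rule, using a made-up type Tracker and function copies_start_from_original: every thread checks that its private copy starts with the value of the original.

```cpp
// Hypothetical class type used in a firstprivate clause.
struct Tracker
{
    int value;

    explicit Tracker(int v) : value(v) {}

    // firstprivate needs this copy constructor to be accessible:
    // each private copy is built from the original object.
    Tracker(const Tracker& other) : value(other.value) {}
};

// Returns true if every thread's private copy starts out equal to the original.
bool copies_start_from_original()
{
    Tracker t(23);
    int mismatches = 0;

    #pragma omp parallel firstprivate(t) reduction(+:mismatches)
    {
        if (t.value != 23)
        {
            ++mismatches;
        }
    }
    return mismatches == 0;
}
```

Making the copy constructor private or deleted should cause a conforming compiler to reject the firstprivate clause.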


Lastprivate:

For a class type variable, lastprivate requires an accessible, unambiguous copy assignment operator for the class type. It also requires an accessible, unambiguous default constructor, unless the variable is also specified in a firstprivate clause.
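
The two requirements can be seen in a small sketch. Sample and value_after_loop are hypothetical names for this illustration: the default constructor builds each private copy, and the copy assignment operator writes the value from the sequentially last iteration back to the original.

```cpp
// Hypothetical class type used in a lastprivate clause.
struct Sample
{
    int value;

    // Default constructor: used to create each thread's private copy.
    Sample() : value(-1) {}

    // Copy assignment: used to copy the last iteration's value to the original.
    Sample& operator=(const Sample& rhs)
    {
        value = rhs.value;
        return *this;
    }
};

// Returns the value the original object holds after the loop.
// Intended for n > 0, so that a last iteration exists.
int value_after_loop(int n)
{
    Sample s;

    #pragma omp parallel for lastprivate(s)
    for (int i = 0; i < n; ++i)
    {
        s.value = i * 10;
    }
    return s.value;  // value written by iteration n - 1
}
```

For example, value_after_loop(5) yields 40, the value assigned in iteration 4, regardless of which thread ran that iteration.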


Threadprivate:

This is the most interesting case, as it differentiates three kinds of initialization in C++.

1. Without initialization: Object1 o;

  • Default constructor is called

2. Direct initialization: Object1 o( (int)23 );

  • Constructor accepting the argument is called

3. Copy initialization: Object1 o = other_instance;

  • Copy constructor is called


The semantics are that, for the master thread, global static objects and static class members are constructed before main() is entered, in an undefined order.


For the other threads, the exact point in time of object construction is unspecified, but it has to happen before the thread references the object for the first time.
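
The three kinds of initialization can be sketched as follows. This is only an illustration under some assumptions: the Built enum, the fields of Object1, and the function every_thread_sees_expected_objects are invented here to record which constructor ran, and compiler support for threadprivate objects with non-trivial constructors may vary.

```cpp
// Hypothetical tag recording which constructor built an object.
enum Built { ByDefault, ByInt, ByCopy };

struct Object1
{
    Built how;
    int value;

    Object1() : how(ByDefault), value(0) {}
    explicit Object1(int v) : how(ByInt), value(v) {}
    Object1(const Object1& o) : how(ByCopy), value(o.value) {}
};

Object1 other_instance(7);   // ordinary shared global, used as the copy source

Object1 a;                   // 1. without initialization
Object1 b((int)23);          // 2. direct initialization
Object1 c = other_instance;  // 3. copy initialization
#pragma omp threadprivate(a, b, c)

// Returns true if every thread finds its own a, b, and c constructed
// the expected way by the time it first references them.
bool every_thread_sees_expected_objects()
{
    int failures = 0;

    #pragma omp parallel reduction(+:failures)
    {
        if (a.how != ByDefault || a.value != 0)  ++failures;
        if (b.how != ByInt     || b.value != 23) ++failures;
        if (c.how != ByCopy    || c.value != 7)  ++failures;
    }
    return failures == 0;
}
```

The check only relies on the guarantee described above: each thread's copies exist before that thread first references them, whenever the construction actually happens.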


These changes were a long time coming. What used to be only vaguely implied by the 2.5 specification is now clearly specified, so all compilers can conform and C++ users of OpenMP get more consistent behavior.