Many of the following generic algorithms could very well be parallelized.
E.g., one of the following algorithms is tt(generate) (section ref(GEN)),
filling a range's elements with values produced by a generating function. If
those values are randomly generated values then tt(generate) could very well
be parallelized, with each parallel thread handling a separate section of the
range to be filled. Another example where parallel execution may be useful is
sorting series of values. For this the tt(sort) generic algorithm can be
used (section ref(SORT), which also contains an example of parallelized
sorting).
These (and many other) generic algorithms can be executed in parallel,
depending on the specified emi(execution policy). When no execution policy is
specified the algorithms operate in their standard, sequential, way. The
generic algorithms supporting execution policies have overloaded versions
whose first parameter specifies the execution policy to use, followed by
their remaining parameters. Function prototypes listed in the following
sections showing a first parameter ti([ExecPol,]) may specify one of the
execution policies introduced in this section as their first argument. E.g.,
one of the tt(sort) generic algorithm's prototypes is
        verb(    void sort([ExecPol,]
              RandomAccessIterator first, RandomAccessIterator last);)
    and to sort, e.g., a vector of strings (tt(vs)), it can be called as
        verb(    sort(vs.begin(), vs.end());)
    or as (see below)
        verb(    sort(execution::par, vs.begin(), vs.end());)
In order to use execution policies the tthi(execution) header file must be
included, and the linker option ti(-ltbb) must be specified when linking the
compiled object files.
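E.g., using the GNU compiler a source file using execution policies (here
hypothetically named tt(demo.cc)) could be compiled and linked like this:

```shell
g++ -std=c++17 -Wall -o demo demo.cc -ltbb
```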
There are four types of execution policies (all defined in the tt(std)
namespace):
itemization(
ithi(execution::sequenced_policy), whose predefined object
ti(execution::seq) is used to specify this execution policy when
calling generic algorithms.
When calling a generic algorithm specifying this policy it will not
be using parallel execution.
ithi(execution::parallel_policy), whose predefined object
ti(execution::par) is used to specify this execution policy when
calling generic algorithms.
        When calling a generic algorithm specifying this policy it may
        use parallel execution: the generic algorithm may decide against
        parallel execution when the overhead of parallel execution would in
        fact make it slower than sequential execution. E.g., when sorting 100
        elements sequential execution is faster than parallel execution, and
        an algorithm like tt(sort) won't use parallel execution.
ithi(execution::parallel_unsequenced_policy), whose predefined object
ti(execution::par_unseq) is used to specify this execution policy when
calling generic algorithms.
        When calling a generic algorithm specifying this policy it may use
        parallel execution, execution may be migrated across threads
        (using a so-called em(parent-stealing scheduler)), or execution may be
        hi(vectorized execution)vectorized (e.g., a single thread is used,
        accessing data items at completely different locations, like swapping
        the first and middle elements of a vector). When using this
        policy the order in which processed elements are accessed and the
        threads from which these elements are accessed are unspecified.
ithi(execution::unsequenced_policy), whose predefined object
ti(execution::unseq) is used to specify this execution policy when
calling generic algorithms.
        When calling a generic algorithm specifying this policy the algorithm
        may use vectorized execution, but not parallel execution across
        multiple threads.
)
When algorithms are called using the above policy specifications, and
functions called during the execution of these algorithms generate
uncaught exceptions, then tt(std::terminate) is called.
When using parallel execution the objects or functions passed to the generic
algorithms might access data defined elsewhere. If those data are modified
then it is possible that modifications are requested from different execution
threads, which could result in hi(data race)em(data races) or
hi(deadlock)em(deadlocks). The programmer should ensure that data races and
deadlocks cannot occur when using parallel execution.
COMMENT(
int x = 0;
std::mutex m;
int a[] = {1, 2};
std::for_each(std::execution::par, std::begin(a), std::end(a), [&](int)
{
std::lock_guard<std::mutex> guard(m);
++x; // correct
});
Unsequenced execution policies are the only case where function calls are
unsequenced with respect to each other, meaning they can be interleaved. In
all other situations in C++, they are indeterminately-sequenced (cannot
interleave). Because of that, users are not allowed to allocate or deallocate
memory, acquire mutexes, use non-lockfree std::atomic specializations, or, in
general, perform any vectorization-unsafe operations when using these policies
(vectorization-unsafe functions are the ones that synchronize-with another
function, e.g. std::mutex::unlock synchronizes-with the next
std::mutex::lock).
int x = 0;
std::mutex m;
int a[] = {1, 2};
std::for_each(std::execution::par_unseq, std::begin(a), std::end(a), [&](int)
{
std::lock_guard<std::mutex> guard(m); // Error: lock_guard constructor calls m.lock()
++x;
});
If the implementation cannot parallelize or vectorize (e.g. due to lack of
resources), all standard execution policies can fall back to sequential
execution.
END)