std.parallelism.TaskPool.WorkerLocalStorage/workerLocalStorage
- multiple declarations
Function TaskPool.workerLocalStorage
Creates an instance of worker-local storage, initialized with a given
value. The value is lazy so that you can, for example, easily create one
instance of a class for each worker. For a usage example, see the
WorkerLocalStorage struct.
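The effect of the lazy initial value can be illustrated with a minimal sketch. The class `Scratch` and the helper `slotsAreDistinct` are hypothetical names introduced for illustration only; the sketch assumes the initializer expression is evaluated separately for each slot, as the description above implies.

```d
import std.parallelism : TaskPool;

// Hypothetical per-worker helper object; not part of std.parallelism.
class Scratch
{
    int[] buffer;
    this() { buffer = new int[](16); }
}

// Returns true if every slot of the storage holds a different Scratch object.
bool slotsAreDistinct(TaskPool pool)
{
    // `new Scratch` is passed lazily, so it runs once for each slot
    // instead of once at the call site.
    auto storage = pool.workerLocalStorage(new Scratch);

    Scratch[] seen;
    foreach (s; storage.toRange)
    {
        foreach (prev; seen)
        {
            if (prev is s)
                return false;
        }
        seen ~= s;
    }
    return true;
}
```

Had the argument been evaluated eagerly, every slot would refer to the same object and the check would fail.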
Struct TaskPool.WorkerLocalStorage
Struct for creating worker-local storage. Worker-local storage is
thread-local storage that exists only for worker threads in a given
TaskPool plus a single thread outside the pool. It is allocated on the
garbage-collected heap in a way that avoids false sharing, and doesn't
necessarily have global scope within any thread. It can be accessed from
any worker thread in the TaskPool that created it, and from one thread
outside this TaskPool. All threads outside the pool that created a
given instance of worker-local storage share a single slot.
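As a rough illustration of the slot layout described above (one slot per worker thread plus one shared slot for threads outside the pool), the following sketch counts the slots through `toRange`. The function name `slotCount` is a hypothetical helper, not part of the library.

```d
import std.parallelism : TaskPool;

// Hypothetical helper: number of slots in a freshly created storage.
size_t slotCount(TaskPool pool)
{
    auto storage = pool.workerLocalStorage(0);
    // One slot per worker thread, plus the single slot shared by
    // all threads outside the pool.
    return storage.toRange.length;
}
```

For a pool created with `new TaskPool(4)`, this sketch is expected to report five slots, which matches `pool.size + 1`.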
struct WorkerLocalStorage(T);
Since the underlying data for this struct is heap-allocated, this struct has reference semantics when passed between functions.
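The reference semantics can be seen in a small sketch like the one below. The functions `bump` and `bumpTwice` are hypothetical names used only for this illustration; both are assumed to be called from the same thread outside the pool.

```d
import std.parallelism : TaskPool;

// Takes the struct by value, yet modifies the caller's data,
// because the copy shares the same heap-allocated slots.
void bump(TaskPool.WorkerLocalStorage!int storage)
{
    storage.get += 1;
}

// Hypothetical driver: increments the calling thread's slot twice.
int bumpTwice(TaskPool pool)
{
    auto counters = pool.workerLocalStorage(0);
    bump(counters);
    bump(counters);
    return counters.get;
}
```

Both calls to `bump` update the slot that `counters.get` reads afterwards, so no `ref` parameter is needed to share the storage between functions.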
The main use cases for WorkerLocalStorage are:
1. Performing parallel reductions with an imperative, as opposed to
functional, programming style. In this case, it's useful to treat
WorkerLocalStorage as local to each thread for only the parallel
portion of an algorithm.
2. Recycling temporary buffers across iterations of a parallel foreach loop.
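The second use case, buffer recycling, might look roughly like the sketch below. The function `rowSums` and its workload are hypothetical and exist only to show the pattern: each thread obtains its own scratch buffer through `get` and reuses it on every iteration it executes, so no buffer is allocated inside the loop.

```d
import std.parallelism : TaskPool;
import std.range : iota;

// Hypothetical workload: for each row, fill a scratch buffer with
// the values r*cols + c and store the sum of that buffer.
long[] rowSums(TaskPool pool, size_t rows, size_t cols)
{
    auto results = new long[](rows);

    // Lazy initializer: one buffer is allocated per slot, up front.
    auto scratch = pool.workerLocalStorage(new long[](cols));

    foreach (r; pool.parallel(iota(rows)))
    {
        // This thread's buffer, reused across its iterations.
        long[] buf = scratch.get;

        foreach (c, ref v; buf)
            v = cast(long)(r * cols + c);

        long total = 0;
        foreach (v; buf)
            total += v;

        // Each iteration writes to a distinct element, so no locking is needed.
        results[r] = total;
    }
    return results;
}
```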
Properties
Name | Type | Description |
---|---|---|
get [get] | auto | Get the current thread's instance. Returns by ref. Note that calling get from any thread outside the TaskPool that created this instance will return the same reference, so an instance of worker-local storage should only be accessed from one thread outside the pool that created it. If this rule is violated, undefined behavior will result. |
get [set] | T | Assign a value to the current thread's instance. This function has the same caveats as its overload. |
toRange [get] | WorkerLocalStorageRange!T | Returns a range view of the values for all threads, which can be used to further process the results of each thread after running the parallel part of your algorithm. Do not use this method in the parallel portion of your algorithm. |
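The following sketch exercises all three properties together: the getter and setter inside the parallel loop, and `toRange` only after the loop has finished. The function `parallelMax` is a hypothetical helper written for this illustration.

```d
import std.algorithm.comparison : max;
import std.parallelism : TaskPool;

// Hypothetical helper: largest element of `data`, or int.min if it is empty.
int parallelMax(TaskPool pool, int[] data)
{
    // Every slot starts at the smallest possible value.
    auto best = pool.workerLocalStorage(int.min);

    foreach (x; pool.parallel(data))
    {
        if (x > best.get)   // getter: this thread's current maximum
            best.get = x;   // setter: update this thread's slot only
    }

    // Sequential part: combine the per-thread maxima.
    int result = int.min;
    foreach (b; best.toRange)
        result = max(result, b);
    return result;
}
```

Slots belonging to threads that processed no elements still hold `int.min`, so they do not affect the combined result.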
Example
import std.parallelism : parallel, taskPool;
import std.range : iota;

void main()
{
    // Calculate pi as in our synopsis example, but
    // use an imperative instead of a functional style.
    immutable n = 1_000_000_000;
    immutable delta = 1.0L / n;

    // One partial sum per worker thread, plus one for the calling thread.
    auto sums = taskPool.workerLocalStorage(0.0L);
    foreach (i; parallel(iota(n)))
    {
        immutable x = (i - 0.5L) * delta;
        immutable toAdd = delta / (1.0 + x * x);
        sums.get += toAdd;
    }

    // Add up the results from each worker thread.
    real pi = 0;
    foreach (threadResult; sums.toRange)
    {
        pi += 4.0L * threadResult;
    }
}