Function std.parallelism.TaskPool.parallel
Implements a parallel foreach loop over a range.  This works by implicitly
    creating and submitting one Task to the TaskPool for each worker
    thread.  A work unit is a set of consecutive elements of range to
    be processed by a worker thread between communication with any other
    thread.  The number of elements processed per work unit is controlled by the
    workUnitSize parameter.  Smaller work units provide better load
    balancing, but larger work units avoid the overhead of communicating
    with other threads frequently to fetch the next work unit.  Large work
    units also avoid false sharing in cases where the range is being modified.
    The less time a single iteration of the loop takes, the larger
    workUnitSize should be.  For very expensive loop bodies,
    workUnitSize should be 1.  An overload that chooses a default work
    unit size is also available.
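
The trade-off described above can be sketched as two loops with opposite work unit sizes. This is a minimal illustration, not a tuning guide: the value 10_000 is an arbitrary guess, and slowCollatzSteps is a hypothetical helper that only stands in for a genuinely expensive computation.

```d
import std.math : sqrt;
import std.parallelism : taskPool;

// Hypothetical stand-in for an expensive per-element computation:
// counts the Collatz steps needed to reach 1.
ulong slowCollatzSteps(ulong n)
{
    ulong steps = 0;
    while (n > 1)
    {
        n = (n % 2 == 0) ? n / 2 : 3 * n + 1;
        ++steps;
    }
    return steps;
}

// Cheap loop body: a large work unit amortizes the cost of fetching
// the next unit.  10_000 is an illustrative value, not a tuned one.
double[] cheapBody(size_t n)
{
    auto roots = new double[n];
    foreach (i, ref elem; taskPool.parallel(roots, 10_000))
    {
        elem = sqrt(cast(double) i);
    }
    return roots;
}

// Expensive loop body: a work unit of 1 lets an idle thread pick up
// the next element as soon as it is free.
ulong[] expensiveBody(const uint[] inputs)
{
    auto results = new ulong[inputs.length];
    foreach (i, ref elem; taskPool.parallel(results, 1))
    {
        elem = slowCollatzSteps(inputs[i]);
    }
    return results;
}
```

Each iteration writes only to its own element, so no further synchronization is needed inside either loop body.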
						
ParallelForeach!R parallel(R)(R range, size_t workUnitSize);

ParallelForeach!R parallel(R)(R range);

Example
// Find the logarithm of every number from 1 to
// 10_000_000 in parallel.
import std.math : log;
import std.parallelism : taskPool;

auto logs = new double[10_000_000];

// Parallel foreach works with or without an index
// variable.  It can iterate by ref if range.front
// returns by ref.

// Iterate over logs using work units of size 100.
foreach (i, ref elem; taskPool.parallel(logs, 100))
{
    elem = log(i + 1.0);
}

Notes
    The memory usage of this implementation is guaranteed to be constant
    in range.length.
    Breaking from a parallel foreach loop via a break, labeled break,
    labeled continue, return or goto statement throws a
    ParallelForeachError.
    In the case of non-random access ranges, parallel foreach buffers lazily
    to an array of size workUnitSize before executing the parallel portion
    of the loop.  The exception is that, if a parallel foreach is executed
    over a range returned by asyncBuf or map, the copying is elided
    and the buffers are simply swapped.  In this case workUnitSize is
    ignored and the work unit size is set to the buffer size of range.
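
As a rough sketch of the non-random-access case, the loop below iterates over a filtered range, which parallel foreach has to buffer in chunks of workUnitSize elements. The names isEven and sumOfEvens and the work unit size of 64 are assumptions made for illustration only.

```d
import core.atomic : atomicLoad, atomicOp;
import std.algorithm.iteration : filter;
import std.parallelism : taskPool;
import std.range : iota;

// Module-level predicate used by filter below.
bool isEven(int n)
{
    return n % 2 == 0;
}

// Sums the even numbers in [0, limit) in parallel.
// filter yields a forward range without random access, so parallel
// foreach copies elements into a buffer before running the body.
long sumOfEvens(int limit)
{
    auto evens = iota(0, limit).filter!isEven;

    shared long total = 0;
    foreach (n; taskPool.parallel(evens, 64))
    {
        // The body runs on several threads at once, so the shared
        // accumulator is updated atomically.
        atomicOp!"+="(total, n);
    }
    return atomicLoad(total);
}
```

An atomic accumulator keeps the sketch short. For real reductions, TaskPool.reduce is usually a better fit than a contended shared variable.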
A memory barrier is guaranteed to be executed on exit from the loop, so that results produced by all threads are visible in the calling thread.
Exception Handling:
    When at least one exception is thrown from inside a parallel foreach loop,
    the submission of additional Task objects is terminated as soon as
    possible, in a non-deterministic manner.  All executing or
    enqueued work units are allowed to complete.  Then, all exceptions that
    were thrown by any work unit are chained using Throwable.next and
    rethrown.  The order of the exception chaining is non-deterministic.
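
The behavior described above can be observed with a loop whose body throws for some elements. The sketch below catches the rethrown exception in the calling thread and walks the chain through Throwable.next. The function name countChainedExceptions is hypothetical, and the number of chained exceptions is expected to vary between runs.

```d
import std.parallelism : taskPool;
import std.range : iota;

// Runs a parallel loop whose body throws on every multiple of 100,
// then counts the exceptions found in the rethrown chain.
size_t countChainedExceptions()
{
    size_t count = 0;
    try
    {
        foreach (i; taskPool.parallel(iota(1_000), 1))
        {
            if (i % 100 == 0)
                throw new Exception("failed on a multiple of 100");
        }
    }
    catch (Exception first)
    {
        // Follow the chain; the order of the links is not deterministic.
        for (Throwable t = first; t !is null; t = t.next)
            ++count;
    }
    return count;
}
```

Because submission of new work stops soon after the first failure, the chain normally holds fewer exceptions than there are failing elements, but it always holds at least one.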