subpar
Substitutable parallelization for C++ libraries

Substitutable parallelization functions.
Functions

template<bool nothrow_ = false, typename Task_, class Run_>
void parallelize_simple(const Task_ num_tasks, const Run_ run_task)
    Parallelize individual tasks across workers.

template<typename Task_>
int sanitize_num_workers(const int num_workers, const Task_ num_tasks)
    Adjust the number of workers to the number of tasks in parallelize_range().

template<bool nothrow_ = false, typename Task_, class Run_>
int parallelize_range(int num_workers, const Task_ num_tasks, const Run_ run_task_range)
    Parallelize a range of tasks across multiple workers.
int subpar::parallelize_range(int num_workers, const Task_ num_tasks, const Run_ run_task_range)
Parallelize a range of tasks across multiple workers.
This function splits the integer sequence [0, num_tasks) into non-overlapping, non-empty, contiguous ranges. Each range is passed to the user-supplied run_task_range() function for parallel execution by different workers via OpenMP (if available) or <thread> (otherwise). Not all workers may be used, e.g., if num_tasks < num_workers, but each worker will process no more than one range. By default, the ranges are evenly sized for efficient load-sharing across workers.

The partitioning of the ranges is also deterministic: given the same num_workers and num_tasks, parallelize_range() will always call run_task_range() with the same combinations of arguments (w, start and length). This avoids stochasticity in downstream applications that perform, e.g., reductions of floating-point results generated in each worker.
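For illustration, the even and deterministic partitioning described above can be sketched as a stand-alone function. This is a simplified stand-in for exposition, not the library's actual implementation:

```cpp
#include <algorithm>
#include <utility>
#include <vector>

// Split [0, num_tasks) into at most num_workers evenly sized, contiguous,
// non-overlapping ranges. Deterministic: the same inputs always yield the
// same (start, length) pairs.
std::vector<std::pair<int, int>> split_ranges(int num_workers, int num_tasks) {
    num_workers = std::max(num_workers, 1); // non-positive values treated as 1
    int used = std::min(num_workers, num_tasks); // at most one range per worker
    int base = (used > 0 ? num_tasks / used : 0);
    int extra = (used > 0 ? num_tasks % used : 0); // first 'extra' ranges get one more task

    std::vector<std::pair<int, int>> ranges;
    int start = 0;
    for (int w = 0; w < used; ++w) {
        int length = base + (w < extra ? 1 : 0);
        ranges.emplace_back(start, length);
        start += length;
    }
    return ranges;
}
```

For example, split_ranges(4, 10) yields the ranges (0, 3), (3, 3), (6, 2) and (8, 2), whose union is [0, 10).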
The SUBPAR_USES_OPENMP_RANGE macro will be defined as 1 if and only if OpenMP was used in the default scheme. Users can define the SUBPAR_NO_OPENMP_RANGE macro to force parallelize_range() to use <thread> even if OpenMP is available. This is occasionally useful when OpenMP cannot be used in some parts of the application, e.g., with POSIX forks.
Advanced users can substitute in their own parallelization scheme by defining SUBPAR_CUSTOM_PARALLELIZE_RANGE before including the subpar header. For example, we might restrict the number of used workers to the number of physical cores available on the system, or we might create task ranges of different lengths for targeted execution on performance or efficiency cores. SUBPAR_CUSTOM_PARALLELIZE_RANGE should be a function-like macro or the name of a function that accepts the same arguments as parallelize_range(), partitions the tasks into num_workers or fewer ranges, calls run_task_range() on each task range, and returns the number of used workers. All expectations for the arguments and return value for parallelize_range() are still applicable here. Partitioning of task ranges should be deterministic but can vary across compute environments, e.g., with different numbers of available cores. Once the macro is defined, the custom scheme will be used instead of the default scheme whenever parallelize_range() is called.
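As a sketch, a custom scheme that simply runs all tasks sequentially in the calling thread (useful, e.g., for debugging) might look like the following. The function name is illustrative, and the commented-out include marks where the subpar header would be pulled in after the macro definition:

```cpp
// A custom scheme must partition [0, num_tasks) into num_workers or fewer
// ranges, call run_task_range() on each range, and return the number of
// workers actually used. Here we use a single range covering all tasks,
// so at most one "worker" is used.
template<typename Task_, class Run_>
int sequential_parallelize(int num_workers, Task_ num_tasks, Run_ run_task_range) {
    (void) num_workers; // unused in this sequential scheme
    if (num_tasks > 0) {
        run_task_range(0, static_cast<Task_>(0), num_tasks); // w = 0, full range
        return 1;
    }
    return 0; // no tasks, no workers used
}

// Define before including the subpar header so that parallelize_range()
// dispatches to this scheme instead of the default.
#define SUBPAR_CUSTOM_PARALLELIZE_RANGE ::sequential_parallelize
// #include "subpar/subpar.hpp"
```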
If nothrow_ = true, exception handling is omitted from the default parallelization scheme. This avoids some unnecessary work when the caller knows that run_task_range() will never throw. For custom schemes, if SUBPAR_CUSTOM_PARALLELIZE_RANGE_NOTHROW is defined, it will be used if nothrow_ = true; otherwise, SUBPAR_CUSTOM_PARALLELIZE_RANGE will continue to be used. Any definition of SUBPAR_CUSTOM_PARALLELIZE_RANGE_NOTHROW should follow the same rules described above for SUBPAR_CUSTOM_PARALLELIZE_RANGE.
A worker ID of zero may or may not indicate that execution is being performed on the main thread. This relationship is true for the default <thread>-based implementation but may not be for OpenMP. (Note that the OpenMP thread number is not the same as the worker ID.) Custom overrides may also use a non-main thread to execute run_task_range() with w = 0.
Template Parameters

| nothrow_ | Whether the Run_ function cannot throw an exception. |
| Task_ | Integer type for the number of tasks. |
| Run_ | Function that accepts three arguments: w, the worker ID; start, the first task in the range assigned to that worker; and length, the number of tasks in that range. |

Parameters

| num_workers | Number of workers. This should be a positive integer. Any zero or negative values are treated as 1. (See also sanitize_num_workers().) |
| num_tasks | Number of tasks. This should be a non-negative integer. |
| run_task_range | Function to iterate over a range of tasks within a worker. This will be called no more than once in each worker. In each call, w is the worker ID and [start, start + length) is a non-empty range of tasks in [0, num_tasks). This function may throw an exception if nothrow_ = false. |
Returns

The number of workers (K) that were actually used. This is guaranteed to be no greater than num_workers (or 1, if num_workers is not positive). It can be assumed that run_task_range() was called once for each w in [0, 1, ..., K-1], where the union of task ranges across all K workers is [0, num_tasks).

void subpar::parallelize_simple(const Task_ num_tasks, const Run_ run_task)
Parallelize individual tasks across workers.
The aim is to parallelize the execution of tasks across workers, under the assumption that there is a 1:1 mapping between them. This is most relevant when the overall computation has already been split up and assigned to workers outside of subpar. In such cases, parallelize_simple() is more suitable than parallelize_range() as it avoids the unnecessary overhead of partitioning the task interval.
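A minimal <thread>-based stand-in with the same contract illustrates the 1:1 task-to-worker mapping. This is not the library's actual implementation, which also handles exceptions and the OpenMP path:

```cpp
#include <thread>
#include <vector>

// Stand-in with the same contract as parallelize_simple(): one worker per
// task, with run_task(w) called exactly once for each w in [0, num_tasks).
template<typename Task_, class Run_>
void simple_parallelize(Task_ num_tasks, Run_ run_task) {
    std::vector<std::thread> workers;
    workers.reserve(num_tasks);
    for (Task_ w = 0; w < num_tasks; ++w) {
        workers.emplace_back([&run_task, w]() { run_task(w); });
    }
    for (auto& t : workers) {
        t.join();
    }
}
```

Each worker writes to its own slot in any shared output, so no further synchronization is needed beyond the joins.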
The SUBPAR_USES_OPENMP_SIMPLE macro will be defined as 1 if and only if OpenMP was used in the default scheme. Users can define the SUBPAR_NO_OPENMP_SIMPLE macro to force parallelize_simple() to use <thread> even if OpenMP is available. This is occasionally useful when OpenMP cannot be used in some parts of the application, e.g., with POSIX forks.
Advanced users can substitute in their own parallelization scheme by defining SUBPAR_CUSTOM_PARALLELIZE_SIMPLE before including the subpar header. This should be a function-like macro or the name of a function that accepts the same arguments as parallelize_simple(). If defined, the custom scheme will be used instead of the default scheme whenever parallelize_simple() is called. Macro authors should note the expectations on run_task().
If nothrow_ = true, exception handling is omitted from the default parallelization scheme. This avoids some unnecessary work when the caller knows that run_task() will never throw. For custom schemes, if SUBPAR_CUSTOM_PARALLELIZE_SIMPLE_NOTHROW is defined, it will be used if nothrow_ = true; otherwise, SUBPAR_CUSTOM_PARALLELIZE_SIMPLE will continue to be used.
A worker ID of zero may or may not indicate that execution is being performed on the main thread. This relationship is true for the default <thread>-based implementation but may not be for OpenMP. (Note that the OpenMP thread number is not the same as the worker ID.) Custom overrides may also use a non-main thread to execute run_task() with w = 0.
Template Parameters

| nothrow_ | Whether the Run_ function cannot throw an exception. |
| Task_ | Integer type for the number of tasks. |
| Run_ | Function that accepts w, the index of the task (and thus the worker ID) as a Task_. Any return value is ignored. |

Parameters

| num_tasks | Number of tasks. This is also the number of workers, as we assume a 1:1 mapping between tasks and workers. It should be non-negative. |
| run_task | Function to execute each task. This will be called exactly once in its corresponding worker, where w is guaranteed to be in [0, num_tasks). This function may throw an exception if nothrow_ = false. |
int subpar::sanitize_num_workers(const int num_workers, const Task_ num_tasks)
Adjust the number of workers to the number of tasks in parallelize_range().
It is not strictly necessary to run sanitize_num_workers() prior to parallelize_range(), as the latter will automatically behave correctly with all inputs. However, on occasion, applications need a better upper bound on the number of workers, e.g., to pre-allocate expensive per-worker data structures. sanitize_num_workers() provides this upper bound, refining the number of workers prior to the call to parallelize_range().
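The intent can be sketched as follows; this is an assumed stand-in for illustration, not the library's exact rule:

```cpp
#include <algorithm>

// Plausible stand-in for sanitize_num_workers(): clamp non-positive worker
// counts to 1, then cap at the number of tasks so that no worker sits idle.
template<typename Task_>
int sanitize_workers(int num_workers, Task_ num_tasks) {
    num_workers = std::max(num_workers, 1);
    if (static_cast<Task_>(num_workers) > num_tasks) {
        num_workers = static_cast<int>(num_tasks);
    }
    return num_workers;
}
```

An application could then size its per-worker buffers to sanitize_workers(num_workers, num_tasks) before handing the same num_workers and num_tasks to parallelize_range().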
Template Parameters

| Task_ | Integer type for the number of tasks. |

Parameters

| num_workers | Number of workers. This may be negative or zero. |
| num_tasks | Number of tasks. This should be a non-negative integer. |
Returns

A sanitized number of workers, which serves as an upper bound on the return value of parallelize_range() when called with the same num_workers and num_tasks.