Parallel Sum Calculation
- class parallel_statistics.ParallelSum(size, sparse=False)
ParallelSum is a parallel and incremental calculator for sums. “Incremental” means that it does not need to read the entire data set at once, and requires only a single pass through the data.
The calculator is designed to work on data in a collection of different bins, for example a map (where the bins are pixels). The usual life-cycle of this class is:
- create an instance of the class (on each process if in parallel)
- repeatedly call add_data or add_datum on it to add new data points
- call collect (supplying an MPI communicator if in parallel)
You can also call the run method with an iterator to combine these steps.
If only a few indices in the data are expected to be used, the sparse option can be set so that data is stored and returned in a sparse form, which uses less memory and is faster below a certain size.
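The life cycle above can be sketched in serial with a small stand-in class. `SketchSum` is hypothetical and written only for illustration: it mirrors the documented method names but is not the library's implementation, and it leaves out weights and MPI for brevity.

```python
class SketchSum:
    """Hypothetical serial stand-in mirroring the documented method names."""

    def __init__(self, size):
        # One running count and one running total per bin.
        self.counts = [0] * size
        self.sums = [0.0] * size

    def add_datum(self, bin, value):
        # A single data point for one bin.
        self.counts[bin] += 1
        self.sums[bin] += value

    def add_data(self, bin, values):
        # A chunk is a sequence of values that all fall in the same bin.
        for value in values:
            self.add_datum(bin, value)

    def collect(self):
        # Serial case only: nothing to combine across processes.
        return self.counts, self.sums


# Step 1: create an instance. Step 2: add data. Step 3: collect.
calc = SketchSum(size=4)
calc.add_data(0, [1.0, 2.0, 3.0])
calc.add_datum(2, 10.0)
calc.add_data(0, [4.0])  # a later chunk for a bin seen before
count, total = calc.collect()
print(count)  # [4, 0, 1, 0]
print(total)  # [10.0, 0.0, 10.0, 0.0]
```

Because only running totals are stored, each chunk can be discarded as soon as it has been added.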
Bins that contain no objects are given weight=0 and sum=0.
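To illustrate the difference between a dense and a sparse representation, the sketch below accumulates the same points into a full-length list and into a dictionary keyed by bin index. The dictionary is only a stand-in for a sparse form; the library's own SparseArray type is not reproduced here, and the helper names are hypothetical.

```python
from collections import defaultdict


def dense_sums(size, points):
    # Dense form: one slot per bin, whether or not it is ever used.
    sums = [0.0] * size
    for bin, value in points:
        sums[bin] += value
    return sums


def sparse_sums(points):
    # Sparse stand-in: only bins that receive data take up memory.
    sums = defaultdict(float)
    for bin, value in points:
        sums[bin] += value
    return dict(sums)


points = [(3, 1.5), (99_999, 2.0), (3, 0.5)]
dense = dense_sums(100_000, points)
sparse = sparse_sums(points)

print(len(dense), len(sparse))       # 100000 2
print(dense[3], sparse[3])           # 2.0 2.0
# An untouched bin reads as zero in both forms.
print(dense[7], sparse.get(7, 0.0))  # 0.0 0.0
```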
Methods
add_data(bin, values[, weights]): Add a chunk of data in the same bin to the sum.
add_datum(bin, value[, weight]): Add a single data point to the sum.
collect([comm, mode]): Finalize the sum and return the counts and the sums.
run(iterator[, comm, mode]): Run the whole life cycle on an iterator returning data chunks.
- add_data(bin, values, weights=None)
Add a chunk of data in the same bin to the sum.
- Parameters
- bin: int
Index of bin or pixel these values apply to
- values: sequence
Values for this bin to accumulate
- weights: sequence
Optional, weights per value
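As a sketch of what a weighted chunk might accumulate, assume each weight multiplies its value in the sum and is also added to a per-bin weight total, with a missing weight treated as 1. This convention is an assumption made for illustration and is not confirmed by the text above; `accumulate_chunk` is a hypothetical helper.

```python
def accumulate_chunk(weight_total, value_total, bin, values, weights=None):
    """Add one chunk of same-bin values to running per-bin totals, in place."""
    if weights is None:
        # Assumed default: every value carries unit weight.
        weights = [1.0] * len(values)
    if len(weights) != len(values):
        raise ValueError("values and weights must have the same length")
    for v, w in zip(values, weights):
        weight_total[bin] += w
        value_total[bin] += w * v


weight_total = [0.0, 0.0, 0.0]
value_total = [0.0, 0.0, 0.0]

accumulate_chunk(weight_total, value_total, 1, [2.0, 4.0], weights=[0.5, 0.25])
accumulate_chunk(weight_total, value_total, 1, [3.0])  # unweighted chunk

print(weight_total)  # [0.0, 1.75, 0.0]
print(value_total)   # [0.0, 5.0, 0.0]
```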
- add_datum(bin, value, weight=None)
Add a single data point to the sum.
- Parameters
- bin: int
Index of bin or pixel this value applies to
- value: float
Value for this bin to accumulate
- weight: float
Optional, weight for this value
- collect(comm=None, mode='gather')
Finalize the sum and return the counts and the sums.
The “mode” decides whether all processes receive the results (“allgather”) or just the root (“gather”).
- Parameters
- comm: mpi communicator or None
The communicator to use when running in parallel, or None for serial
- mode: str, optional
“gather” or “allgather”
- Returns
- count: array or SparseArray
The number of values hitting each pixel
- sum: array or SparseArray
The total of values hitting each pixel
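The difference between the two modes can be illustrated without MPI by treating a list of per-process partial results as if each entry lived on a separate rank. The function below is a hypothetical simulation, not the library's code. In particular, returning None on non-root ranks in “gather” mode is an assumption about what “just the root” means in practice.

```python
def simulated_collect(partials, mode="gather"):
    """Combine per-rank partial sums; return what each rank would receive."""
    if mode not in ("gather", "allgather"):
        raise ValueError(f"unknown mode: {mode!r}")
    # Element-wise total over all ranks' partial arrays.
    total = [sum(column) for column in zip(*partials)]
    results = []
    for rank in range(len(partials)):
        if mode == "allgather" or rank == 0:
            results.append(list(total))
        else:
            # Assumed: non-root ranks get no result in "gather" mode.
            results.append(None)
    return results


# Partial sums held by three simulated ranks, over four bins.
partials = [
    [1.0, 0.0, 2.0, 0.0],
    [0.5, 0.0, 0.0, 0.0],
    [0.0, 0.0, 3.0, 0.0],
]

gathered = simulated_collect(partials, "gather")
everywhere = simulated_collect(partials, "allgather")
print(gathered)       # [[1.5, 0.0, 5.0, 0.0], None, None]
print(everywhere[2])  # [1.5, 0.0, 5.0, 0.0]
```

With a real communicator the same choice applies: “allgather” is useful when every process needs the totals for later work, while “gather” avoids sending the result to processes that will not use it.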
- run(iterator, comm=None, mode='gather')
Run the whole life cycle on an iterator returning data chunks.
This is equivalent to calling add_data repeatedly and then collect.
- Parameters
- iterator: iterator
Iterator yielding (pixel, values) pairs
- comm: MPI comm or None
The comm, or None for serial
- mode: str, optional
“gather” or “allgather”
- Returns
- count: array or SparseArray
The number of values hitting each pixel
- sum: array or SparseArray
The total of values hitting each pixel
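The equivalence described above can be sketched with a generator that yields (pixel, values) pairs and a loop that feeds each chunk to an accumulator. `chunk_stream` and `run_sketch` are hypothetical names, and the serial accumulation here stands in for the library's add_data and collect steps.

```python
def chunk_stream(pixels, values, chunk_size):
    """Yield (pixel, values) pairs, grouping each slice of the data by pixel."""
    for start in range(0, len(pixels), chunk_size):
        grouped = {}
        for p, v in zip(pixels[start:start + chunk_size],
                        values[start:start + chunk_size]):
            grouped.setdefault(p, []).append(v)
        # Only this slice is grouped at a time, not the full data set.
        yield from grouped.items()


def run_sketch(size, iterator):
    """Serial stand-in: accumulate every chunk, then return the totals."""
    count = [0] * size
    total = [0.0] * size
    for pixel, vals in iterator:
        count[pixel] += len(vals)
        total[pixel] += sum(vals)
    return count, total


pixels = [0, 2, 2, 0, 1, 2]
values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
count, total = run_sketch(3, chunk_stream(pixels, values, chunk_size=4))
print(count)  # [2, 1, 3]
print(total)  # [5.0, 5.0, 11.0]
```

The same pixel may appear in several chunks, as pixel 2 does here; the running totals absorb each one.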