Parallel Histograms
- class parallel_statistics.ParallelHistogram(edges)
ParallelHistogram is a parallel, incremental histogram calculator. “Incremental” means that it does not need to read the entire data set at once, and requires only a single pass through the data.
The usual life-cycle of this class is:
- create an instance of the class (on each process if in parallel)
- repeatedly call add_data or add_datum on it to add new data points
- call collect (supplying an MPI communicator if in parallel)
You can also call the run method with an iterator to combine these.
Since histograms are usually relatively small, sparse arrays are not enabled for this class.
Bin edges must be pre-defined and values outside them will be ignored.
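The life-cycle described above can be illustrated with a small pure-Python stand-in. This is only a sketch, not the parallel_statistics implementation: the class name SketchHistogram is hypothetical, the half-open bin convention [lo, hi) is an assumption (the real class may treat the final edge differently), and only the serial case is shown.

```python
from bisect import bisect_right


class SketchHistogram:
    """Hypothetical stand-in, not the parallel_statistics implementation."""

    def __init__(self, edges):
        self.edges = list(edges)
        self.counts = [0.0] * (len(self.edges) - 1)

    def add_data(self, data):
        for x in data:
            # Index of the bin [edges[i], edges[i+1]) containing x (assumed convention)
            i = bisect_right(self.edges, x) - 1
            if 0 <= i < len(self.counts):
                self.counts[i] += 1.0
            # Values outside the edges fall through and are ignored

    def collect(self):
        # Serial case: nothing to combine, just return the running totals
        return list(self.counts)


hist = SketchHistogram(edges=[0.0, 1.0, 2.0, 3.0])
# Feed the data one chunk at a time, in a single pass
for chunk in ([0.5, 1.5], [2.5, 2.7, -4.0], [9.0, 0.1]):
    hist.add_data(chunk)
counts = hist.collect()
print(counts)  # [2.0, 1.0, 2.0]; -4.0 and 9.0 lie outside the edges
```

Only the per-bin totals are kept between chunks, which is why the full data set never needs to be in memory at once.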
Methods
add_data(data[, weights]): Add a chunk of data to the histogram.
collect([comm]): Finalize and collect together histogram values.
run(iterator[, comm]): Run the whole life cycle on an iterator returning data chunks.
- add_data(data, weights=None)
Add a chunk of data to the histogram.
- Parameters
- data: sequence
Values to be histogrammed
- weights: sequence, optional
Weights per value.
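As a rough illustration of what per-value weights mean, the sketch below adds each value's weight to its bin instead of adding 1. The helper add_weighted is hypothetical and is not part of parallel_statistics; the length check and the half-open bin convention are assumptions made for the example.

```python
from bisect import bisect_right


def add_weighted(counts, edges, data, weights=None):
    """Hypothetical helper: accumulate weights (or 1 per value) into counts in place."""
    if weights is None:
        weights = [1.0] * len(data)
    if len(weights) != len(data):
        # Assumption: one weight per value is required
        raise ValueError("data and weights must have the same length")
    for x, w in zip(data, weights):
        i = bisect_right(edges, x) - 1
        if 0 <= i < len(counts):
            counts[i] += w


edges = [0.0, 10.0, 20.0]
counts = [0.0, 0.0]
add_weighted(counts, edges, [1.0, 2.0, 15.0], weights=[0.5, 0.25, 2.0])
add_weighted(counts, edges, [12.0])  # an unweighted chunk contributes 1 per value
print(counts)  # [0.75, 3.0]
```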
- collect(comm=None)
Finalize and collect together histogram values
- Parameters
- comm: MPI comm or None
The comm, or None for serial
- Returns
- counts: array
Total counts/weights per bin
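Because every process bins its own data against the same edges, combining the results presumably amounts to an element-wise sum of the per-process counts. The sketch below simulates that step without MPI; combine_counts is a hypothetical helper, and the actual reduction call used by the library is not shown in this documentation.

```python
def combine_counts(per_process_counts):
    """Hypothetical helper: element-wise sum of per-process bin counts."""
    totals = [0.0] * len(per_process_counts[0])
    for counts in per_process_counts:
        for i, c in enumerate(counts):
            totals[i] += c
    return totals


# Partial histograms from three pretend processes sharing the same bin edges
partials = [
    [2.0, 0.0, 1.0],
    [0.0, 3.0, 1.0],
    [1.0, 1.0, 0.0],
]
totals = combine_counts(partials)
print(totals)  # [3.0, 4.0, 2.0]
```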
- run(iterator, comm=None)
Run the whole life cycle on an iterator returning data chunks.
This is equivalent to calling add_data repeatedly and then collect.
- Parameters
- iterator: iterator
Iterator yielding values or (values, weights) pairs
- comm: MPI comm or None
The comm, or None for serial
- Returns
- counts: array
Total counts/weights per bin
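A serial sketch of the behaviour described for run: consume an iterator of chunks, accumulate each one, then return the totals. The function run_sketch is hypothetical. How the library tells a plain chunk from a (values, weights) pair is not stated here, so the tuple check below is an assumption made purely for this example.

```python
from bisect import bisect_right


def run_sketch(edges, iterator):
    """Hypothetical serial stand-in for run: consume chunks, return totals."""
    counts = [0.0] * (len(edges) - 1)
    for item in iterator:
        # Assumption for this sketch only: a tuple means (values, weights),
        # anything else is a plain sequence of values.
        if isinstance(item, tuple):
            values, weights = item
        else:
            values, weights = item, [1.0] * len(item)
        for x, w in zip(values, weights):
            i = bisect_right(edges, x) - 1
            if 0 <= i < len(counts):
                counts[i] += w
    return counts


def chunks():
    yield [0.5, 1.5]                # values only
    yield ([1.2, 1.8], [2.0, 3.0])  # (values, weights) pair


totals = run_sketch([0.0, 1.0, 2.0], chunks())
print(totals)  # [1.0, 6.0]
```

A generator is a natural fit for the iterator argument, since each chunk can be read from disk or computed only when it is needed.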