Releases: ROCm/rocRAND
rocRAND 3.2.0 for ROCm 6.3.0
Added
- Added host generator for MT19937
- Support for rocrand_generate_poisson in hipGraphs
- Added engine, distribution, mode, throughput_gigabytes_per_second, and lambda columns for the csv format in benchmark_rocrand_host_api and benchmark_rocrand_device_api. To see these new columns, set --benchmark_format=csv or --benchmark_out_format=csv --benchmark_out="outName.csv".
Changed
- Updated the default value for the -a argument of rmake.py to gfx906:xnack-,gfx1030,gfx1100,gfx1101,gfx1102,gfx1151,gfx1200,gfx1201.
- rocrand_discrete for MTGP32, LFSR113 and ThreeFry generators now uses the alias method, which is faster than binary search in the CDF.
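To illustrate why the alias method mentioned above avoids a search, here is a minimal host-side sketch of Vose's variant in plain C++. It is not rocRAND's implementation: the names AliasTable, build_alias_table, and sample_alias are hypothetical and exist only for this example. The table is built once in O(n), and each sample then costs one table lookup and one comparison, instead of the O(log n) steps of a binary search in the CDF.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical alias table for a discrete distribution over {0, ..., n-1}.
struct AliasTable
{
    std::vector<double>      prob;  // probability of keeping column i
    std::vector<std::size_t> alias; // outcome returned when column i is rejected
};

// Build the table with Vose's algorithm in O(n).
// Assumes `weights` is non-empty, non-negative, and has a positive sum.
AliasTable build_alias_table(const std::vector<double>& weights)
{
    const std::size_t n = weights.size();
    double total = 0.0;
    for(double w : weights)
        total += w;

    AliasTable t{std::vector<double>(n, 1.0), std::vector<std::size_t>(n)};
    std::vector<double>      scaled(n);
    std::vector<std::size_t> small, large;
    for(std::size_t i = 0; i < n; ++i)
    {
        t.alias[i] = i;
        scaled[i]  = weights[i] * n / total; // mean of scaled weights is 1
        (scaled[i] < 1.0 ? small : large).push_back(i);
    }

    // Pair each under-full column with an over-full one.
    while(!small.empty() && !large.empty())
    {
        const std::size_t s = small.back(); small.pop_back();
        const std::size_t l = large.back(); large.pop_back();
        t.prob[s]  = scaled[s];
        t.alias[s] = l;
        scaled[l] -= 1.0 - scaled[s]; // l donated the missing mass of column s
        (scaled[l] < 1.0 ? small : large).push_back(l);
    }
    // Columns left over keep prob = 1.0 and alias = self.
    return t;
}

// Draw one outcome from two uniform numbers in [0, 1):
// u selects a column, v decides between the column and its alias.
std::size_t sample_alias(const AliasTable& t, double u, double v)
{
    std::size_t col = static_cast<std::size_t>(u * t.prob.size());
    if(col >= t.prob.size())
        col = t.prob.size() - 1; // guard against u == 1.0
    return v < t.prob[col] ? col : t.alias[col];
}
```

For weights {1, 3}, the table stores prob = {0.5, 1.0} and alias = {1, 1}, so outcome 0 is returned with probability 0.5 * 0.5 = 0.25, as expected.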
rocRAND 3.1.1 for ROCm 6.2.4
Additions
- GFX1151 Support
rocRAND 3.1.0 for ROCm 6.2.2
rocRAND code for ROCm 6.2.2 did not change. The library was rebuilt for the updated ROCm 6.2.2 stack.
rocRAND 3.1.0 for ROCm 6.2.1
rocRAND code for ROCm 6.2.1 did not change. The library was rebuilt for the updated ROCm 6.2.1 stack.
rocRAND 3.1.0 for ROCm 6.2.0
Additions
- Added rocrand_create_generator_host
  - The following generators are supported:
    - ROCRAND_RNG_PSEUDO_MRG31K3P
    - ROCRAND_RNG_PSEUDO_MRG32K3A
    - ROCRAND_RNG_PSEUDO_PHILOX4_32_10
    - ROCRAND_RNG_PSEUDO_THREEFRY2_32_20
    - ROCRAND_RNG_PSEUDO_THREEFRY2_64_20
    - ROCRAND_RNG_PSEUDO_THREEFRY4_32_20
    - ROCRAND_RNG_PSEUDO_THREEFRY4_64_20
    - ROCRAND_RNG_PSEUDO_XORWOW
    - ROCRAND_RNG_QUASI_SCRAMBLED_SOBOL32
    - ROCRAND_RNG_QUASI_SCRAMBLED_SOBOL64
    - ROCRAND_RNG_QUASI_SOBOL32
    - ROCRAND_RNG_QUASI_SOBOL64
  - The host-side generators support multi-core processing. On Linux, this requires the TBB (Threading Building Blocks) development package to be installed on the system when building rocRAND (libtbb-dev on Ubuntu and derivatives).
    - If TBB is not found when configuring rocRAND, the configuration is still successful, and the host generators are executed on a single CPU thread.
- Added the option to create a host generator to the Python wrapper
- Added the option to create a host generator to the Fortran wrapper
- Added dynamic ordering. This ordering is free to rearrange the produced numbers, which can be specific to devices and distributions. It is implemented for:
  - XORWOW, MRG32K3A, MTGP32, Philox 4x32-10, MRG31K3P, LFSR113, and ThreeFry
- On the NVIDIA platform, compilation using clang as the host compiler is now supported.
- C++ wrapper:
  - lfsr113_engine now also supports being constructed with a seed of type unsigned long long, not only uint4.
  - Added optional order parameter to the constructor of mt19937_engine
- Added the following functions for the ROCRAND_RNG_PSEUDO_MTGP32 generator:
  - rocrand_normal2
  - rocrand_normal_double2
  - rocrand_log_normal2
  - rocrand_log_normal_double2
- Added rocrand_create_generator_host_blocking which dispatches without stream semantics.
- Added host-side generator for ROCRAND_RNG_PSEUDO_MTGP32.
- Added offset and skipahead functionality to LFSR113 generator.
- Added dynamic ordering for architecture gfx1102.
Changes
- For device-side generators, you can now wrap calls to rocrand_generate_* inside of a hipGraph. There are a few things to be aware of:
  - Generator creation (rocrand_create_generator), initialization (rocrand_initialize_generator), and destruction (rocrand_destroy_generator) must still happen outside the hipGraph.
  - After the generator is created, you may call API functions to set its seed, offset, and order.
  - After the generator is initialized (but before stream capture or manual graph creation begins), use rocrand_set_stream to set the stream the generator will use within the graph.
  - A generator's seed, offset, and stream may not be changed from within the hipGraph. Attempting to do so may result in unpredictable behaviour.
  - API calls for the Poisson distribution (e.g. rocrand_generate_poisson) are not yet supported inside of hipGraphs.
  - For sample usage, see the unit tests in test/test_rocrand_hipgraphs.cpp
- Building rocRAND now requires a C++17 capable compiler, as the internal library sources now require it. However, consuming rocRAND is still possible from C++11, as the public headers don't make use of the new features.
- Building rocRAND should be faster on machines with multiple CPU cores as the library has been split into multiple compilation units.
- C++ wrapper: the min() and max() member functions of the generators and distributions are now static constexpr.
- Renamed and unified the existing ROCRAND_DETAIL_.*_BM_NOT_IN_STATE to ROCRAND_DETAIL_BM_NOT_IN_STATE
- Static & dynamic library: moved all internal symbols to namespaces to avoid potential symbol name collisions when linking.
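As a small illustration of what static constexpr min() and max() make possible, the sketch below uses a hypothetical stand-in engine (toy_engine, not a rocRAND type) so that it compiles without the library. The point is only that the output range can now be queried from the type alone and evaluated at compile time, for example inside a static_assert. The helper range_size is likewise an assumption made for this example.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical stand-in for a wrapper engine; not part of rocRAND.
struct toy_engine
{
    using result_type = std::uint32_t;

    // Static constexpr bounds: callable without an engine instance.
    static constexpr result_type min() { return 0; }
    static constexpr result_type max() { return std::numeric_limits<result_type>::max(); }

    // Simple linear congruential step, only here so the type is usable.
    result_type operator()()
    {
        state = state * 1664525u + 1013904223u;
        return state;
    }

    result_type state = 1;
};

// Number of distinct values an engine type can return, computed at compile time.
template<class Engine>
constexpr std::uint64_t range_size()
{
    return static_cast<std::uint64_t>(Engine::max()) - Engine::min() + 1;
}

// Works only because min() and max() are static constexpr.
static_assert(range_size<toy_engine>() == (std::uint64_t{1} << 32),
              "toy_engine is expected to cover the full 32-bit range");
```

With non-static or non-constexpr bounds, the same check would need an engine object and could only run at run time.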
Deprecations
- Deprecated the following typedefs. Please use the unified state_type alias instead.
  - rocrand_device::threefry2x32_20_engine::threefry2x32_20_state
  - rocrand_device::threefry2x64_20_engine::threefry2x64_20_state
  - rocrand_device::threefry4x32_20_engine::threefry4x32_20_state
  - rocrand_device::threefry4x64_20_engine::threefry4x64_20_state
- Deprecated internal header: src/rng/distribution/distributions.hpp
- Deprecated internal header: src/rng/device_engines.hpp
Removals
- Removed references to and workarounds for deprecated hcc.
- Removed support for HIP-CPU
Known issues
- SOBOL64 and SCRAMBLED_SOBOL64 generate Poisson-distributed unsigned long long int numbers instead of unsigned int. This will be fixed in the next major release.
rocRAND 3.0.1 for ROCm 6.1.2
rocRAND code for ROCm 6.1.2 did not change. The library was rebuilt for the updated ROCm 6.1.2 stack.
rocRAND 3.0.1 for ROCm 6.1.1
rocRAND code for ROCm 6.1.1 did not change. The library was rebuilt for the updated ROCm 6.1.1 stack.
rocRAND 3.0.1 for ROCm 6.1.0
Fixes
- Implemented workaround for regressions in XORWOW and LFSR on MI200
rocRAND 3.0.0 for ROCm 6.0.2
rocRAND code for ROCm 6.0.2 did not change. The library was rebuilt for the updated ROCm 6.0.2 stack.
rocRAND 2.10.17 for ROCm 6.0.0
rocRAND code for ROCm 6.0.0 did not change. The library was rebuilt for the updated ROCm 6.0.0 stack.