From 126731e25586054e5c50cb5012577dd7d51249c0 Mon Sep 17 00:00:00 2001 From: Lisa Date: Thu, 30 Nov 2023 22:58:06 -0700 Subject: [PATCH] readme and changelog updates (#410) * readme and changelog updates * Update README.md Co-authored-by: Saad Rahim (AMD) <44449863+saadrahim@users.noreply.github.com> --------- Co-authored-by: Saad Rahim (AMD) <44449863+saadrahim@users.noreply.github.com> --- CHANGELOG.md | 339 ++++++++++++++++++++++++++++++++------------------- README.md | 87 +++++++------ 2 files changed, 262 insertions(+), 164 deletions(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index c6891f202..f538e7eb8 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -1,165 +1,256 @@ -# Change Log for rocRAND +# Changelog for rocRAND -Full documentation for rocRAND is available at [https://rocrand.readthedocs.io/en/latest/](https://rocrand.readthedocs.io/en/latest/) +Documentation for rocRAND is available at +[https://rocm.docs.amd.com/projects/rocRAND/en/latest/](https://rocm.docs.amd.com/projects/rocRAND/en/latest/) ## (Unreleased) rocRAND-x.x.x for ROCm 6.0.0 -### Changed -- Generator classes from `rocrand.hpp` are no longer copyable, in previous versions these copies -would copy internal references to the generators and would lead to double free or memory leak errors. - These types should be moved instead of copied, and move constructors and operators are now defined - for them. -### Optimized -- Improved MT19937 initialization and generation performance. -### Removed -- Removed hipRAND submodule from rocRAND. hipRAND is now only available as a separate package. -- Removed references to and workarounds for deprecated hcc -### Fixed -- `mt19937_engine` from `rocrand.hpp` is now move-constructible and move-assignable. Previously the -move constructor and move assignment operator was deleted for this class. 
-- Various fixes for the C++ wrapper header rocrand.hpp - - fixed the name of `mrg31k3p` it is now correctly spelled (was incorrectly named`mrg31k3a` in - previous versions). - - added missing `order` setter method for `threefry4x64` - - fixed the default ordering parameter for `lfsr113` -- Build error when using clang++ directly due to unsupported references to amdgpu-target + +### Changes + +* Generator classes from `rocrand.hpp` are no longer copyable (in previous versions, copying one + duplicated its internal references to the generator, which led to double-free or memory-leak + errors) + * These types should be moved instead of copied; move constructors and move assignment operators are now + defined + +### Optimizations + +* Improved MT19937 initialization and generation performance + +### Removals + +* Removed the hipRAND submodule from rocRAND; hipRAND is now only available as a separate + package +* Removed references to, and workarounds for, the deprecated hcc + +### Fixes + +* `mt19937_engine` from `rocrand.hpp` is now move-constructible and move-assignable (the move + constructor and move assignment operator were previously deleted for this class) +* Various fixes for the C++ wrapper header `rocrand.hpp` + * The name of `mrg31k3p` is now spelled correctly (was incorrectly named `mrg31k3a` in previous + versions) + * Added the missing `order` setter method for `threefry4x64` + * Fixed the default ordering parameter for `lfsr113` +* Build error when using Clang++ directly resulting from unsupported `amdgpu-target` references ## rocRAND-2.10.17 for ROCm 5.5.0 -### Added -- MT19937 pseudo random number generator based on M. Matsumoto and T. Nishimura, 1998, Mersenne Twister: A 623-dimensionally equidistributed uniform pseudorandom number generator. -- New benchmark for the device API using Google Benchmark, `benchmark_rocrand_device_api`, replacing `benchmark_rocrand_kernel`. `benchmark_rocrand_kernel` is deprecated and will be removed in a future version. 
Likewise, `benchmark_curand_host_api` is added to replace `benchmark_curand_generate` and `benchmark_curand_device_api` is added to replace `benchmark_curand_kernel`. -- experimental HIP-CPU feature -- ThreeFry pseudorandom number generator based on Salmon et al., 2011, "Parallel random numbers: as easy as 1, 2, 3". -- Accessor methods for sobol 32 and 64 direction vectors and constants: - - Enum `rocrand_direction_vector_set` to select the direction vector set. - - `rocrand_get_direction_vectors32(...)` supersedes: - - `rocrand_h_sobol32_direction_vectors` - - `rocrand_h_scrambled_sobol32_direction_vectors` - - `rocrand_get_direction_vectors64(...)` supersedes: - - `rocrand_h_sobol64_direction_vectors` - - `rocrand_h_scrambled_sobol64_direction_vectors` - - `rocrand_get_scramble_constants32(...)` supersedes `h_scrambled_sobol32_constants` - - `rocrand_get_scramble_constants64(...)` supersedes `h_scrambled_sobol64_constants` -### Changed -- Python 2.7 is no longer officially supported. + +### Additions + +* MT19937 pseudo random number generator based on M. Matsumoto and T. 
Nishimura, 1998, + *Mersenne Twister: A 623-dimensionally equidistributed uniform pseudorandom number generator* +* New benchmark APIs for Google Benchmark: + * `benchmark_rocrand_device_api` replaces `benchmark_rocrand_kernel` + * `benchmark_curand_host_api` replaces `benchmark_curand_generate` + * `benchmark_curand_device_api` replaces `benchmark_curand_kernel` +* Experimental HIP-CPU feature +* ThreeFry pseudorandom number generator based on Salmon et al., 2011, *Parallel random numbers: + as easy as 1, 2, 3* +* Accessor methods for SOBOL 32 and 64 direction vectors and constants: + * Enum `rocrand_direction_vector_set` to select the direction vector set + * `rocrand_get_direction_vectors32(...)` supersedes: + * `rocrand_h_sobol32_direction_vectors` + * `rocrand_h_scrambled_sobol32_direction_vectors` + * `rocrand_get_direction_vectors64(...)` supersedes: + * `rocrand_h_sobol64_direction_vectors` + * `rocrand_h_scrambled_sobol64_direction_vectors` + * `rocrand_get_scramble_constants32(...)` supersedes `h_scrambled_sobol32_constants` + * `rocrand_get_scramble_constants64(...)` supersedes `h_scrambled_sobol64_constants` + +### Changes + +* Python 2.7 is no longer officially supported ## rocRAND-2.10.16 for ROCm 5.4.0 -### Added -- MRG31K3P pseudorandom number generator based on L'Ecuyer and Touzin, 2000, "Fast combined multiple recursive generators with multipliers of the form a = ±2q ±2r". -- LFSR113 pseudorandom number generator based on L'Ecuyer, 1999, "Tables of maximally equidistributed combined LFSR generators". -- SCRAMBLED_SOBOL32 and SCRAMBLED_SOBOL64 quasirandom number generators. The Scrambled Sobol sequences are generated by scrambling the output of a Sobol sequence. -### Changed -- The `mrg__distribution` structures, which provided numbers based on MRG32K3A, are now replaced by `mrg_engine__distribution`, where `` is `log_normal`, `normal`, `poisson`, or `uniform`. 
These structures provide numbers for MRG31K3P (with template type `rocrand_state_mrg31k3p`) and MRG32K3A (with template type `rocrand_state_mrg32k3a`). -### Fixed -- Sobol64 now returns 64 bits random numbers, instead of 32 bits random numbers. As a result, the performance of this generator has regressed. -- Fixed a bug that prevented compiling code in C++ mode (with a host compiler) when it included the rocRAND headers on Windows. + +### Additions + +* MRG31K3P pseudorandom number generator based on L'Ecuyer and Touzin, 2000, *Fast combined + multiple recursive generators with multipliers of the form a = ±2^q ±2^r* +* LFSR113 pseudorandom number generator based on L'Ecuyer, 1999, *Tables of maximally + equidistributed combined LFSR generators* +* `SCRAMBLED_SOBOL32` and `SCRAMBLED_SOBOL64` quasirandom number generators (scrambled + Sobol sequences are generated by scrambling the output of a Sobol sequence) + +### Changes + +* The `mrg_<distribution>_distribution` structures, which provided numbers based on MRG32K3A, have been replaced by `mrg_engine_<distribution>_distribution`, where `<distribution>` is `log_normal`, `normal`, `poisson`, or `uniform` + * These structures provide numbers for MRG31K3P (with template type `rocrand_state_mrg31k3p`) + and MRG32K3A (with template type `rocrand_state_mrg32k3a`) + +### Fixes + +* Sobol64 now returns 64-bit (instead of 32-bit) random numbers; as a result, the performance of + this generator has regressed +* Bug that prevented compiling code in C++ mode (with a host compiler) on Windows when it + included the rocRAND headers ## rocRAND-2.10.15 for ROCm 5.3.0 -### Added -- New benchmark for the host api using googlebenchmark replacing `benchmark_rocrand_generate`, - `benchmark_rocrand_generate` is deprecated and will be removed in a future version. -### Changed -- Increased number of warmup iterations for rocrand_benchmark_generate from 5 to 15 to eliminate corner cases that would generate artificially high benchmark scores. 
+ +### Additions + +* New benchmark for the host API using Google benchmark that replaces + `benchmark_rocrand_generate`, which is deprecated + +### Changes + +* Increased the number of warmup iterations for `rocrand_benchmark_generate` from 5 to 15 to + eliminate corner cases that generate artificially high benchmark scores ## (Released) rocRAND-2.10.14 for ROCm 5.2.0 -### Added -- Backward compatibility for deprecated `#include ` using wrapper header files. -- Packages for test and benchmark executables on all supported OSes using CPack. + +### Additions + +* Backward compatibility for `#include ` (deprecated) using wrapper header files +* Packages for test and benchmark executables on all supported operating systems using CPack ## rocRAND-2.10.13 for ROCm 5.1.0 -### Added -- Generating a random sequence different sizes now produces the same sequence without gaps - indepent of how many values are generated per call. - - Only in the case of XORWOW, MRG32K3A, PHILOX4X32_10, SOBOL32 and SOBOL64 - - This only holds true if the size in each call is a divisor of the distributions - `output_width` due to performance - - Similarly the output pointer has to be aligned to `output_width * sizeof(output_type)` -### Changed -- [hipRAND](https://github.com/ROCmSoftwarePlatform/hipRAND.git) split into a separate package -- Header file installation location changed to match other libraries. - - When using the `rocrand.h` header file, users should now use `#include `, rather than `#include ` -- rocRAND still includes hipRAND using a submodule - - The rocRAND package also sets the provides field with hipRAND, so projects which require hipRAND can begin to specify it. -### Fixed -- Fix offset behaviour for XORWOW, MRG32K3A and PHILOX4X32_10 generator, setting offset now - correctly generates the same sequence starting from the offset. 
- - Only uniform int and float will work as these can be generated with a single call to the generator + +### Additions + +* Generating a random sequence in calls of different sizes now produces the same sequence without gaps, + independent of how many values are generated per call + * This applies only to XORWOW, MRG32K3A, PHILOX4X32_10, SOBOL32, and SOBOL64 + * This only holds if the size in each call is a divisor of the distribution's `output_width` (a + performance-related restriction) + * The output pointer must be aligned to `output_width * sizeof(output_type)` + +### Changes + +* [hipRAND](https://github.com/ROCmSoftwarePlatform/hipRAND.git) has been split into a separate + package +* Header file installation location changed to match other libraries + * When using the `rocrand.h` header file, use `#include ` rather than + `#include ` +* rocRAND still includes hipRAND using a submodule + * The rocRAND package sets the provides field with hipRAND, so projects that require hipRAND can + begin to specify it + +### Fixes + +* Offset behavior for the XORWOW, MRG32K3A, and PHILOX4X32_10 generators + * Setting offset now correctly generates the same sequence starting from the offset + * Only uniform `int` and `float` will work, as these can be generated with a single call to the generator + ### Known issues -- kernel_xorwow unit test is failing for certain GPU architectures. + +* `kernel_xorwow` unit test is failing for certain GPU architectures ## rocRAND-2.10.12 for ROCm 5.0.0 -### Changed -- No updates or changes for ROCm 5.0.0. + +There are no updates for this ROCm release. ## rocRAND-2.10.12 for ROCm 4.5.0 -### Addded -- Initial HIP on Windows support. See README for instructions on how to build and install. -### Changed -- Packaging split into a runtime package called rocrand and a development package called rocrand-devel. The development package depends on runtime. The runtime package suggests the development package for all supported OSes except CentOS 7 to aid in the transition. 
The suggests feature in packaging is introduced as a deprecated feature and will be removed in a future rocm release. -### Fixed -- Fix for mrg_uniform_distribution_double generating incorrect range of values -- Fix for order of state calls for log_normal, normal, and uniform + +### Additions + +* Initial HIP on Windows support + +### Changes + +* Packaging has been split into a runtime package (`rocrand`) and a development package + (`rocrand-devel`): + The development package depends on the runtime package. When you install the runtime package, + the package manager suggests installing the development package to aid users + transitioning from the previous version's combined package. The package manager makes this suggestion + on all supported operating systems except CentOS 7. The `suggests` + feature in the runtime package is introduced as a deprecated feature and will be removed in a future + ROCm release. + +### Fixes + +* `mrg_uniform_distribution_double` no longer generates an incorrect range of values +* Order of state calls for `log_normal`, `normal`, and `uniform` + ### Known issues -- kernel_xorwow test is failing for certain GPU architectures. + +* `kernel_xorwow` test is failing for certain GPU architectures ## [rocRAND-2.10.11 for ROCm 4.4.0] -### Added -- Sobol64 support added. -- Benchmark time measurement improvement -- Address Sanitizer build option added. -### Fixed -- nvcc backend fix -- Fix ranges of MRG32k3a device functions. + +### Additions + +* Sobol64 support +* Benchmark time measurement improvement +* AddressSanitizer build option + +### Fixes + +* NVCC backend fix +* Fix ranges of MRG32k3a device functions ## [rocRAND-2.10.10 for ROCm 4.3.0] -### Added -- gfx90a support added. -- gfx1030 support added -- gfx803 supported re-enabled -### Fixed -- Memory leaks in Poisson tests has been fixed. 
-- Memory leaks when generator has been created but setting seed/offset/dimensions throws an exception has been fixed. + +### Additions + +* gfx90a support +* gfx1030 support +* gfx803 support re-enabled + +### Fixes + +* Memory leaks in Poisson tests +* Memory leaks when a generator is created but setting seed/offset/dimensions throws an exception ## [rocRAND-2.10.9 for ROCm 4.2.0] -### Fixed -- rocRAND benchmark performance drop for xorwow has been fixed for older ROCm builds. + +### Fixes + +* The rocRAND benchmark performance drop for `xorwow` has been fixed for older ROCm builds ## [rocRAND-2.10.8 for ROCm 4.1.0] -### Added -- Ability to force install dependencies with new -d flag in install script -### Changed -- rocRAND package name has been updated to support newer versions of ROCm. -### Fixed -- rocRAND benchmark performance drop has been fixed. -- Debug builds via the install script have been fixed. + +### Additions + +* Ability to force install dependencies with new `-d` flag in install script + +### Changes + +* rocRAND package name has been updated to support newer versions of ROCm + +### Fixes + +* rocRAND benchmark performance drop +* Debug builds via the install script ## [rocRAND-2.10.7 for ROCm 4.0.0] -### Added -- No new features + +There are no updates for this ROCm release. ## [rocRAND-2.10.6 for ROCm 3.10] -### Added -- No new features + +There are no updates for this ROCm release. ## [rocRAND-2.10.5 for ROCm 3.9.0] -### Added -- No new features + +There are no updates for this ROCm release. ## [rocRAND-2.10.4 for ROCm 3.8.0] -### Added -- No new features + +There are no updates for this ROCm release. ## [rocRAND-2.10.3 for ROCm 3.7.0] -### Fixed -- Fixed package naming to reflect OS name and architecture. + +### Fixes + +* Package naming now reflects operating system name and architecture ## [rocRAND-2.10.2 for ROCm 3.6.0] -### Added -- No new features + +There are no updates for this ROCm release. 
## [rocRAND-2.10.1 for ROCm 3.5.0] -### Added -- Static library build options added in beta (subject to change in build method and naming in future releases) -### Changed -- Switched to hip-clang as default compiler -### Deprecated -- HCC build deprecated + +### Additions + +* Static library build options were added in beta; these are subject to change (build method and + naming) in future releases + +### Changes + +* HIP-Clang is now the default compiler + +### Deprecations + +* HCC build diff --git a/README.md b/README.md index 6fdb8b003..2467232c2 100644 --- a/README.md +++ b/README.md @@ -1,17 +1,16 @@ # rocRAND -[![codecov](https://codecov.io/gh/ROCmSoftwarePlatform/rocRAND/branch/develop/graph/badge.svg?token=FdKnEdGxA1)](https://codecov.io/gh/ROCmSoftwarePlatform/rocRAND) - -The rocRAND project provides functions that generate pseudo-random and quasi-random numbers. +The rocRAND project provides functions that generate pseudorandom and quasirandom numbers. The rocRAND library is implemented in the [HIP](https://github.com/ROCm-Developer-Tools/HIP) -programming language and optimised for AMD's latest discrete GPUs. It is designed to run on top -of AMD's Radeon Open Compute [ROCm](https://rocm.github.io/) runtime, but it also works on -CUDA enabled GPUs. +programming language and optimized for AMD's latest discrete GPUs. It is designed to run on top +of AMD's [ROCm](https://rocm.docs.amd.com) runtime, but it also works on CUDA-enabled GPUs. -Prior to ROCm version 5.0, this project included the [hipRAND](https://github.com/ROCmSoftwarePlatform/hipRAND.git) wrapper. As of version 5.0, this has been split into a separate library. As of version 6.0, hipRAND can no longer be built from rocRAND. +Prior to ROCm version 5.0, this project included the +[hipRAND](https://github.com/ROCmSoftwarePlatform/hipRAND.git) wrapper. As of version 5.0, it was +split into a separate library. As of version 6.0, hipRAND can no longer be built from rocRAND. 
-## Supported Random Number Generators +## Supported random number generators * XORWOW * MRG31k3p @@ -28,11 +27,10 @@ Prior to ROCm version 5.0, this project included the [hipRAND](https://github.co ## Documentation -Information about the library API and other user topics can be found in the [rocRAND documentation](https://rocrand.readthedocs.io/en/latest). - -### Building the documentation +Documentation for rocRAND is available at +[https://rocm.docs.amd.com/projects/rocRAND/en/latest/](https://rocm.docs.amd.com/projects/rocRAND/en/latest/) -Run the steps below to build documentation locally. +To build documentation locally, use the following code: ```sh # Go to the docs directory @@ -68,18 +66,20 @@ python3 -m http.server Optional: -* [GTest](https://github.com/google/googletest) (required only for tests; building tests is enabled by default) - * Use `GTEST_ROOT` to specify GTest location (also see [FindGTest](https://cmake.org/cmake/help/latest/module/FindGTest.html)) - * Note: If GTest is not already installed, it will be automatically downloaded and built +* [GoogleTest](https://github.com/google/googletest) (required only for tests; building tests is enabled + by default) + * Use `GTEST_ROOT` to specify the GoogleTest location (see also + [FindGTest](https://cmake.org/cmake/help/latest/module/FindGTest.html)) + * Note: If GoogleTest is not already installed, it will be automatically downloaded and built * Fortran compiler (required only for Fortran wrapper) - * `gfortran` is recommended. + * `gfortran` is recommended * Python 3.5+ (required only for Python wrapper) -If some dependencies are missing, cmake script automatically downloads, builds and -installs them. Setting `DEPENDENCIES_FORCE_DOWNLOAD` option `ON` forces script to -not to use system-installed libraries, and to download all dependencies. +If some dependencies are missing, the CMake script automatically downloads, builds, and installs them. 
+Setting the `DEPENDENCIES_FORCE_DOWNLOAD` option to `ON` forces the script to download all +dependencies, rather than using the system-installed libraries. -## Build and Install +## Build and install ```shell git clone https://github.com/ROCmSoftwarePlatform/rocRAND.git @@ -122,7 +122,9 @@ ctest --output-on-failure ### HIP on Windows -Initial support for HIP on Windows has been added. To install, use the provided rmake.py python script: +We've added initial support for HIP on Windows. To install, use the provided `rmake.py` Python +script: + ```shell git clone https://github.com/ROCmSoftwarePlatform/rocRAND.git cd rocRAND @@ -134,15 +136,16 @@ python rmake.py -i python rmake.py -c ``` -Note: Existing gtest library in the system (especially static gtest libraries built with other compilers) -may cause build failure; if errors are encountered with existing gtest library or other dependencies, -`DEPENDENCIES_FORCE_DOWNLOAD` flag can be passed to cmake, as mentioned before, to help solve the problem. +An existing GoogleTest library in the system (especially static GoogleTest libraries built with other +compilers) may cause a build failure; if you encounter errors with an existing GoogleTest library or +other dependencies, you can pass the `DEPENDENCIES_FORCE_DOWNLOAD` flag to CMake, which can +help to solve the problem. -Note: To disable inline assembly optimisations in rocRAND (for both the host library and -the device functions provided in `rocrand_kernel.h`) set cmake option `ENABLE_INLINE_ASM` +To disable inline assembly optimizations in rocRAND (for both the host library and +the device functions provided in `rocrand_kernel.h`), set the CMake option `ENABLE_INLINE_ASM` to `OFF`. 
-## Running Unit Tests +## Running unit tests ```shell # Go to rocRAND build directory @@ -155,7 +158,7 @@ ctest ./test/ ``` -## Running Benchmarks +## Running benchmarks ```shell # Go to rocRAND build directory @@ -210,13 +213,14 @@ cd rocRAND; cd build ### Legacy benchmarks -The legacy benchmarks (before the move to using googlebenchmark) can be disabled by setting the -cmake option `BUILD_LEGACY_BENCHMARK` to `OFF`. For compatibility, this settings defaults to `ON` +You can disable legacy benchmarks (those used prior to Google Benchmark) by setting the +CMake option `BUILD_LEGACY_BENCHMARK` to `OFF`. For compatibility, the default setting is `ON` when `BUILD_BENCHMARK` is set. -The legacy benchmarks are deprecated and will be removed in a future version once all benchmarks are -migrated to the new framework. -## Running Statistical Tests +Legacy benchmarks are deprecated and will be removed in a future version once all benchmarks have +been migrated to the new framework. + +## Running statistical tests ```shell # Go to rocRAND build directory @@ -234,17 +238,20 @@ cd rocRAND; cd build ## Wrappers -* C++ wrappers for host API of rocRAND are in [`rocrand.hpp`](./library/include/rocrand/rocrand.hpp). +* C++ wrappers for the rocRAND host API are in [`rocrand.hpp`](./library/include/rocrand/rocrand.hpp). * [Fortran wrappers](./library/src/fortran/). * [Python wrappers](./python/): [rocRAND](./python/rocrand). ## Support -Bugs and feature requests can be reported through [the issue tracker](https://github.com/ROCmSoftwarePlatform/rocRAND/issues). +Bugs and feature requests can be reported through the +[issue tracker](https://github.com/ROCmSoftwarePlatform/rocRAND/issues). + +## Contributions and license -## Contributions and License +Contributions of any kind are most welcome! You can find more information at +[CONTRIBUTING](./CONTRIBUTING.md). -Contributions of any kind are most welcome! 
More details are found at [CONTRIBUTING](./CONTRIBUTING.md) -and [LICENSE](./LICENSE.txt). Please note that [statistical tests](./test/crush) link to TestU01 library -distributed under GNU General Public License (GPL) version 3, thus GPL version 3 license applies to -that part of the project. +Licensing information is located at [LICENSE](./LICENSE.txt). Note that [statistical tests](./test/crush) link +to the TestU01 library distributed under GNU General Public License (GPL) version 3. Therefore, the GPL +version 3 license applies to that part of the project.
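
The README changes in this patch name several CMake options: `DEPENDENCIES_FORCE_DOWNLOAD`, `ENABLE_INLINE_ASM`, `BUILD_BENCHMARK`, and `BUILD_LEGACY_BENCHMARK`. As a rough sketch, they could be combined into a single configure command as shown here. The option names come from the README text; the particular combination of values, the `CMAKE_ARGS` variable name, and the `..` source path are assumptions made for illustration only. The script prints the command rather than running it.

```shell
# Option names are taken from the README; the chosen values are only an example.
CMAKE_ARGS="-DBUILD_BENCHMARK=ON"                          # build the benchmarks
CMAKE_ARGS="$CMAKE_ARGS -DBUILD_LEGACY_BENCHMARK=OFF"      # skip the deprecated legacy benchmarks
CMAKE_ARGS="$CMAKE_ARGS -DENABLE_INLINE_ASM=OFF"           # disable inline assembly optimizations
CMAKE_ARGS="$CMAKE_ARGS -DDEPENDENCIES_FORCE_DOWNLOAD=ON"  # do not use system-installed dependencies

# Print the configure command instead of executing it. The ".." path (source
# directory relative to a build directory) is an assumption.
echo "cmake $CMAKE_ARGS .."
```

Removing the `echo` and running the printed command from a build directory would perform the actual configuration, assuming CMake and the required toolchain are installed.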