DOCS: Update NEWS for 1.4 (#1119)

Sergei-Lebedev · web-flow · commit 87f840a1cdf5 · 2025-04-15T09:29:32.000+02:00
* DOCS: Update NEWS for 1.4

* DOCS: update authors
diff --git a/AUTHORS b/AUTHORS
@@ -1,17 +1,36 @@
-Alex Margolin               alex.margolin@huawei.com
-Anatoly Vildemanov          anatolyv@nvidia.com
-Boris Karasev               boriska@nvidia.com
-Ching-Hsiang Chu            chchu@fb.com
-Devendar Bureddy            devendar@nvidia.com
-Ferrol Aderholdt            faderholdt@nvidia.com
-Geoffroy Vallee             geoffroy@nvidia.com
-Hessam Mirsadeghi           hmirsadeghi@nvidia.com
-Lior Paz                    liorpa@nvidia.com
-Manjunath Gorentla Venkata  manjunath@nvidia.com
-Mike Dubman                 mdubman@nvidia.com
-Pavel Shamis (Pasha)        shamisp@users.noreply.github.com
-Sergey Lebedev              sergeyle@nvidia.com
-Valentin Petrov             valentinp@nvidia.com
-Xiang Gao                   xgao@nvidia.com
-Artem Ryabov                artemry@nvidia.com
-Shimmy Balsam               sbalsam@nvidia.com
+Alex Margolin              alex.margolin@huawei.com
+Alexey Rivkin              alexey.rivkin@nvidia.com
+Anatoly Vildemanov         anatolyv@nvidia.com
+Andrii Bilokur             abilokur@nvidia.com
+Andy Lin                   32576375+andylin-hao@users.noreply.github.com
+Artem Ryabov               artemry@nvidia.com
+Boris Karasev              boriska@nvidia.com
+Brad Settlemyer            bws@deepcopy.org
+Ching-Hsiang Chu           chchu@fb.com
+Devendar Bureddy           devendar@nvidia.com
+Edgar Gabriel              Edgar.Gabriel@amd.com
+Evgeny Keidar              ekeidar@nvidia.com
+Ferrol Aderholdt           faderholdt@nvidia.com
+Geoffroy Vallee            geoffroy@nvidia.com
+Hessam Mirsadeghi          hmirsadeghi@nvidia.com
+Ilya Kryukov               ikryukov@nvidia.com
+Jiri Kraus                 jkraus@nvidia.com
+Lior Paz                   liorpa@nvidia.com
+Mamzi Bayatpour            mbayatpour@nvidia.com
+Manjunath Gorentla Venkata manjunath@nvidia.com
+Masaki Kozuki              mkozuki@nvidia.com
+Mike Dubman                mdubman@nvidia.com
+Nilesh M Negi              nilesh.negi@amd.com
+Nick Sarkauskas            nsarkauskas@nvidia.com
+Pavel Shamis (Pasha)       shamisp@users.noreply.github.com
+Pedram Alizadeh            pedram.alizadeh@amd.com
+Rob Bradford               rob@robster.org.uk
+Sam Nordmann               snordmann@nvidia.com
+Sergey Lebedev             sergeyle@nvidia.com
+Shimmy Balsam              sbalsam@nvidia.com
+Sourav Chakraborty         sourav.chakraborty@nvidia.com
+Taekyung Heo               taekyung@gatech.edu
+Tommy Janjusic             tjanjusic@nvidia.com
+Valentin Petrov            valentinp@nvidia.com
+Xiang Gao                  xgao@nvidia.com
+Yael Yacobovich            yyacobovich@nvidia.com
diff --git a/NEWS b/NEWS
@@ -6,6 +6,80 @@
 
 ## Current
 
+## 1.4.0 (TBD)
+
+## New Features and Enhancements
+
+### Core
+- Implemented asymmetric memory support {PR #1000}
+- Enhanced error handling and resource cleanup {PR #960, #951}
+- Improved service team handling {PR #1046}
+- Fixed triggered post for zero size collectives {PR #960}
+
+### CL/HIER
+- Added allgatherv support {PR #1111}
+- Implemented node subgroup unpacking {PR #1103}
+- Added reduce to supported collectives {PR #997}
+- Fixed integer overflow in alltoall {PR #944}
+
+### TL/UCP
+- Split single and multithreaded send/receive operations {PR #1109}
+- Added knomial allgather with CUDA memory support {PR #1095}
+- Implemented reduce SRG knomial algorithm {PR #1058}
+- Added radix selection to knomial operations {PR #1072}
+- Added sliding window allreduce implementation {PR #958}
+- Added knomial allgatherv support {PR #1008}
+- Added sparbit algorithm for allgather {PR #940}
+- Extended broadcast active set support for size > 2 {PR #926}
+- Added knomial algorithm for reduce-scatter {PR #970}
+
+### TL/MLX5
+- Added multicast-based zero-copy broadcast {PR #1087}
+- Implemented mcast multi-group support {PR #1060}
+- Added non-blocking CUDA memory copy support {PR #1040}
+- Added device memory multicast broadcast {PR #989}
+- Enhanced mcast allgather staging-based algorithm {PR #994}
+- Improved one-sided mcast reliability initialization {PR #980}
+- Various performance optimizations in alltoall {PR #1067}
+- Fixed fences in all-to-all WQEs {PR #1069}
+- Added context option to disable all-to-all operations {PR #1062}
+- Improved error handling and device checks {PR #1102}
+- Disabled mcast for thread multiple mode {PR #961}
+
+### TL/SHARP
+- Added support for allgather operation {PR #1081}
+- Enabled reduce-scatter with SAT support {PR #1084}
+- Added SHARP multi-channel support {PR #1049}
+- Fixed service team OOB handling {PR #1001}
+- Improved internal OOB usage {PR #986}
+
+### CUDA
+- Added linear broadcast implementation {PR #948}
+- Batch CUDA stream memory operations, reduced CPU and GPU execution overhead {PR #1093}
+- Enhanced error handling for CUDA context operations {PR #1025}
+- Fixed context cleanup in CUDA operations {PR #954}
+
+### Build and Test
+- Added support for specific GPU architectures with ROCM {PR #987}
+- Added UCC pkg-config support {PR #1036}
+- Fixed build compatibility with NVC compiler {PR #1052}
+- Enhanced config parser functionality {PR #1092}
+- Enhanced ASAN/LSAN memory leak detection {PR #1074}
+- Added error checking and exit handling in gtests {PR #1083}
+
+### Documentation
+- Updated README with UCC publication information {PR #1028}
+- Added DOCA_UROM documentation {PR #999}
+- Fixed Doxygen documentation issues {PR #1038}
+- Enhanced code style consistency {PR #1020}
+
+### CL/DOCA_UROM
+- Implemented new DOCA UROM plugin {PR #978}
+- Added support for offloading collective operations to DPUs
+- Implemented allreduce collective
+
+## 1.3.0 (April 18th, 2024)
+
 ## New Features and Enhancements
 
 ### CL/HIER
@@ -207,7 +281,7 @@
 - Added support for multithreaded context progress
 - Added support for nonblocking team destroy
 
-#### CL 
+#### CL
 
 - Added support for hierarchical collectives
 - Added support for hierarchical allreduce collective operation
@@ -219,12 +293,12 @@
 
 ##### UCP
 
-- Added Bcast SAG algorithm for large messages 
-- Added Knomial based reduce algorithm 
+- Added Bcast SAG algorithm for large messages
+- Added Knomial based reduce algorithm
 - Making allgather and alltoall agree with the API
 - Added SRA knomial allreduce algorithm
 - Added pairwise alltoall and alltoallv algorithms
-- Added allgather and allgatherv ring algorithms 
+- Added allgather and allgatherv ring algorithms
 - Added support for collective operations based on one-sided semantics
 - Added support for alltoall with one-sided transfer semantics
 - Bug fixes
@@ -237,7 +311,7 @@
   scatter, bcast, allgather and allgatherv
 
 #### Tests
-- Updated tests to test the newly added algorithms and operations 
+- Updated tests to test the newly added algorithms and operations
 
 
 ## 0.1.0 (TBD)
@@ -256,12 +330,12 @@
 - Added support for configuring UCC library and contexts
 
 
-#### CL 
+#### CL
 
 - Added support for collectives, while the source and destination is either in
-  CPU or device (GPU) 
+  CPU or device (GPU)
 - Added support for UCC_THREAD_MULTIPLE
-- Added support for CUDA stream-based collectives 
+- Added support for CUDA stream-based collectives
 
 
 #### TL
