-
Notifications
You must be signed in to change notification settings - Fork 420
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve stack memory usage at compile time #9963
base: master
Are you sure you want to change the base?
Improve stack memory usage at compile time #9963
Conversation
src/ucs/debug/log.c
Outdated
new_log_file_name = (char *)ucs_malloc(PATH_MAX, "new_log_file_name"); | ||
|
||
if (old_log_file_name == NULL || new_log_file_name == NULL) { | ||
ucs_free(old_log_file_name); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess no need to free when we ucs_fatal()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
valid for all cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you :), please check now @tvegas1
src/ucs/debug/log.c
Outdated
if (access(old_log_file_name, W_OK) != 0) { | ||
ucs_free(old_log_file_name); | ||
ucs_free(new_log_file_name); | ||
ucs_fatal("unable to write to %s", old_log_file_name); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Printing log file names after freeing them seems like not a good idea. Also there is a lot of code duplication, so I would propose something like the following:
if (access(old_log_file_name, W_OK) != 0) { | |
ucs_free(old_log_file_name); | |
ucs_free(new_log_file_name); | |
ucs_fatal("unable to write to %s", old_log_file_name); | |
} | |
if (access(old_log_file_name, W_OK) != 0) { | |
ucs_error("unable to write to %s", old_log_file_name); | |
goto err_out; | |
} | |
... | |
err_out: | |
ucs_free(old_log_file_name); | |
ucs_free(new_log_file_name); | |
if (err_msg) { | |
ucs_fatal("ucs_log_file_rotate() fatal error (see logs before)"); | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you :), please check now @ivankochin
c44c2d2
to
9dd5ab8
Compare
src/ucs/debug/log.c
Outdated
|
||
if (access(old_log_file_name, W_OK) != 0) { | ||
ucs_fatal("unable to write to %s", old_log_file_name); | ||
ucs_error("unable to write to %s", old_log_file_name); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since we fatal anyways, maybe set msg = "..."
then print with ucs_fatal(".. %s.. ", msg);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
or just replace ucs_error() by ucs_fatal()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
replace ucs_error() by ucs_fatal()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
replaced ucs_error() by ucs_fatal()
ucs_free(new_log_file_name); | ||
|
||
err_out: | ||
ucs_free(old_log_file_name); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no need to free i guess, since there would be so many leaked resources aynways..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed it, please check again
src/ucs/debug/log.c
Outdated
err_out: | ||
ucs_free(old_log_file_name); | ||
ucs_free(new_log_file_name); | ||
ucs_fatal("ucs_log_file_rotate() fatal error (see logs before)"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please see previous suggestion
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please check again
90260f5
to
daae19b
Compare
if ((rel_path_buf == NULL) || (abs_path_buf == NULL)) { | ||
ucs_error("Failed to allocate memory for path buffers"); | ||
status = UCS_ERR_NO_MEMORY; | ||
goto out; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remote out
label and use return UCS_ERR_NO_MEMORY
instead, rename out_free_buffers
to out
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there could be a case where one of the buffer was allocated successfully and the other one failed, so I think we still need to free, so I changed it to "goto out", let me know what do you think
@@ -259,13 +259,21 @@ static ucs_status_t ucs_vfs_node_add(void *parent_obj, ucs_vfs_node_type_t type, | |||
va_list ap, ucs_vfs_node_t **new_node) | |||
{ | |||
ucs_vfs_node_t *current_node; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
coding guideline mandates current_node
be after initialized variables
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
need to move current_node
line 264
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
7eac1d4
to
e10a8a8
Compare
ef9cbb9
to
21e1828
Compare
config/m4/compiler.m4
Outdated
# -Wall: All common warnings | ||
# -Werror: Treat warnings as errors | ||
# -Wframe-larger-than=8192: Warn if stack frame size exceeds 8192 bytes (8 KB) | ||
BASE_CFLAGS="-g -Wall -Werror -Wframe-larger-than=8192" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
might as well remove comments since the flags are so common
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed "-Wall", "-Werror" and "-g"
src/tools/perf/lib/uct_tests.cc
Outdated
@@ -773,7 +773,7 @@ class uct_perf_test_runner { | |||
TEST_CASE_ALL_OSD(_perf, _case, UCT_PERF_DATA_LAYOUT_BCOPY) \ | |||
TEST_CASE_ALL_OSD(_perf, _case, UCT_PERF_DATA_LAYOUT_ZCOPY) | |||
|
|||
ucs_status_t uct_perf_test_dispatch(ucx_perf_context_t *perf) | |||
ucs_status_t __attribute__ ((optimize("O1"))) uct_perf_test_dispatch(ucx_perf_context_t *perf) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is attribute needed here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Without optimization the stack frame size > 8192 bytes. I tried using "new" at "src/tools/perf/lib/ucp_tests.cc" to make the runner use the heap instead of the stack and it compiled successfully on the HPC node, but then the CI failed because of the "new" and I couldn't reproduce it.
At the moment I changed it to pragma instead of the attribute.
src/ucs/debug/log.c
Outdated
|
||
if (access(old_log_file_name, W_OK) != 0) { | ||
ucs_fatal("unable to write to %s", old_log_file_name); | ||
ucs_error("unable to write to %s", old_log_file_name); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
replace ucs_error() by ucs_fatal()
@@ -259,13 +259,21 @@ static ucs_status_t ucs_vfs_node_add(void *parent_obj, ucs_vfs_node_type_t type, | |||
va_list ap, ucs_vfs_node_t **new_node) | |||
{ | |||
ucs_vfs_node_t *current_node; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
need to move current_node
line 264
if ((rel_path_buf == NULL) || (abs_path_buf == NULL)) { | ||
ucs_error("Failed to allocate memory for path buffers"); | ||
status = UCS_ERR_NO_MEMORY; | ||
goto out; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return here.
@@ -457,6 +457,11 @@ internal::CartesianProductHolder<Generator...> Combine(const Generator&... g) { | |||
#define GTEST_GET_FIRST_(first, ...) first | |||
#define GTEST_GET_SECOND_(first, second, ...) second | |||
|
|||
// Modified to reduce stack frame size per test file. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is there a way to remove changes to googletest (disabling check for it)? or make it outside of googletest by including gtest-param-test.h
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't find a workaround for this, so @yosefe advised me to make this change
5c24d38
to
6aa969d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe remove #pragma in gtest by not specifing Wframe-large-than when building .cc files
src/tools/perf/lib/ucp_tests.cc
Outdated
{ | ||
ucs_status_t status; | ||
|
||
status = dispatch_osd(perf); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe replace by an array of function pointers
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thank you, done
test/gtest/ucs/test_bitops.cc
Outdated
@@ -90,6 +90,8 @@ UCS_TEST_F(test_bitops, ptr_ctz) { | |||
ASSERT_EQ(88, ucs_count_ptr_trailing_zero_bits(buffer, 160)); | |||
} | |||
|
|||
#pragma GCC diagnostic push | |||
#pragma GCC diagnostic ignored "-Wframe-larger-than=" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
to avoid all pragma in test/gtest, is there a way to not set the -Wframe-larger-than when compiling gtests?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are a few macros that I can change to make those tests use the heap instead of the stack, just like I did in the commit "TEST/GTEST: reduce stack frame size in INSTANTIATE_TEST_SUITE_P macro".
Those macros are "UCS_TEST_P_" and "UCS_TEST_P_", I'm trying to make the syntax work, it's challenging so I added the pragma to see if I can make the CI pass meanwhile.
The reason I want to reduce the stack allocation in the tests is that later on I will check the stack usage on run time alongside running the tests and if the stack usage of the tests are initially above the threshold then it'll be harder to add a rule for the CI to catch cases of large stack usage during run time.
5a31e59
to
ec83329
Compare
6b3ff9e
to
42f80b4
Compare
54c414c
to
746ca85
Compare
746ca85
to
8072eee
Compare
Waiting for:
What?
Improve stack memory usage by introducing an 8KB threshold for stack frames using the GCC flag
-Wframe-larger-than=8192
. This task involves identifying and addressing functions that exceed this limit.Why?
Customers with limited stack sizes have recently experienced stack overflow issues when running UCX.
The goal is to ensure that the entire UCX runs within a defined stack size threshold.
Mitigates the risk of stack overflow, particularly in environments with limited stack sizes.
How?