-
Notifications
You must be signed in to change notification settings - Fork 79
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Optimize many blocks read #351
Optimize many blocks read #351
Conversation
…lds) in one shot when using auto-decomp. Scenarios involving auto-decomp on a mesh with large number of element blocks relative to MPI ranks results in a slow-down due to non-concurrency when unbalanced work is being done in the client code to set up each individual block. Three new API calls on Ioss::Region allow this std::vector<size_t> get_all_block_connectivity_offset() const; std::vector<size_t> get_all_block_connectivity(const std::string &field_name, void *data, size_t data_size) const; template <typename T> std::vector<size_t> get_all_block_field_data(const std::string &field_name, std::vector<T> &field_data) const; Typical return values from data retrieval on an Ioss::Grouping entity is a scalar. In order to parse the data from a one-shot read, the return values are an array of length (numBlocks+1) which provides an offset into the data array
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good. Lots of work. There are lots of comments, but that doesn't mean I don't like the owrk. I think it is very good, but that we can generalize it somewhat and that there seem to be some unneeded functions. Also, with no comments (or incorrect comments), it needs some better names for some of the functions and variables to make it easier to read/figure out what is happening.
return get_all_block_data(data, data_size, field_name); | ||
} | ||
|
||
std::vector<size_t> DatabaseIO::get_all_block_data(void *data, size_t data_size, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure this function adds much value. Seems to reverse the arguments and then just call another function. More confusing than useful...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right .. this might have been an oversight from my original naming convention that I refactored into it's current state.
size_t expected_data_size = offset[i+1] * field_byte_size; | ||
if (data_size < expected_data_size) { | ||
std::ostringstream errmsg; | ||
fmt::print(errmsg, "ERROR: Connectivity data size {} on region {} is less than expected size {}\n\n", data_size, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This error message assumes that the field being returned is always "connectivity". I would recommend outputting actual name of the field and also the name of the entity
instead of the region name. Similar to error message below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops, that was a cut and paste from somewhere else
return offset; | ||
} | ||
|
||
std::vector<size_t> DatabaseIO::get_all_block_connectivity(const std::string &field_name, void *data, size_t data_size) const |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the client's benefit of calling this function instead of just calling get_all_block_field_data
directly? In a non-debug run, they could pass any field name and it would work correctly. If this function does add value, I would make the test on field name active in both debug and non-debug mode.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it
|
||
std::vector<size_t> offset = get_all_block_offset(field_name); | ||
|
||
int64_t num_to_get = offset[num_blocks]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is num_to_get
used anywhere?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another casualty of refactoring ... I'm surprised the compiler didn't flag it as an unused variable
const std::vector<int>& block_component_count) const | ||
{ | ||
Ioss::ParallelUtils util_(m_comm); | ||
int nProc = util_.parallel_size(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Move closer to first use.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok
|
||
std::vector<U> exports; | ||
exports.reserve(export_size); | ||
std::vector<U> imports(import_size); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Move closer to first use (line 593)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok
else if (role == Ioss::Field::TRANSIENT) { | ||
retval = internal_get_all_block_field_data(field_name, field_data.data(), data_size); | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about other role
types? As mentioned above, I think that we don't need to determine the role type and can just call internal_get_all_block_field_data
for all?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suppose we could in terms of generalization .. this implementation had a specific purpose but I can see how that would work for other roles too.
@@ -30,6 +30,16 @@ namespace Ioss { | |||
class Field; | |||
} | |||
namespace Ioex { | |||
struct BlockFieldData { | |||
int64_t id = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer the {0} form, but = 0
is ok also. The {}
form works better for some types.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok. It's a scalar so I figured that was ok
#include <Ioss_Region.h> | ||
|
||
namespace { | ||
double wall_time() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is an Ioss::Utils::timer()
which can be used and eliminates the need for including sys/time.h
which doesn't exist on windows, so need ifdefs (See libraries/supes/ext_lib/excpus.c for example)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was wondering about that ... good to know
@gsjaardema The design is centered around element blocks because of the issues discovered in STK but if you think there is merit in extending this to be more general for other GroupingEntity types, then I can work with that. The naming convention of "all_block" for the functions was so that there wouldn't be a collision with other named functions that currently exist. If we were to generalize this scheme, is there a naming convention that you would prefer? |
I think it is definitely worth looking at generalizing. It would be fairly easy at the Ioss level and becomes more difficult down in the specializations in the ParallelDatabase level. Mabye
|
i will fix the bad merge with master into the branch... |
@tokusanya Is this ready to be merged, or are you working on any additional changes? |
@gsjaardema It can probably be merged as is ... Over the past week, I've been working some more on the generalization and I switched over to using composition instead of templating. It seems to be a much cleaner interface and it looks mostly done for Element Blocks in that the tests pass now. The generalizations comes in two flavors
You can probably go ahead and merge this and then I'll add these changes later. If you'll be around for today's standup, we can chat some more |
@gsjaardema This is my attempt to optimize the many block read issue we were having in StkIO. The biggest issue has to do with concurrency when loading up the blocks individually which means that an unbalanced workload on a per-block basis leads to processors waiting around for others to set up the element block data on the mesh. I realize you'll probably have several revisions so please let me know what you think. The new unit test I put together should show how it works