Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Get topology from external function #133

Merged
merged 4 commits into from
Jul 17, 2023
Merged

Conversation

jhh67
Copy link
Collaborator

@jhh67 jhh67 commented Jun 13, 2023

If HWLOC_GET_TOPOLOGY_FUNCTION is defined, call it to get the hwloc topology instead of calling hwloc_topology_init and hwloc_topology_load. This enables qthreads to share the same hwloc topology with external code, and obviates hwloc_via_chapel affinity. The configure script accepts the argument --with-hwloc-get-topology-function to set HWLOC_GET_TOPOLOGY_FUNCTION.

If HWLOC_GET_TOPOLOGY_FUNCTION is defined, call it to get the hwloc topology instead
of calling hwloc_topology_init and hwloc_topology_load.

Signed-off-by: John H. Hartman <[email protected]>
@olivier-snl
Copy link
Collaborator

There are some failed CI checks on this. We'll need to look into whether they are actually related to the changes in the PR.

@janciesko
Copy link
Collaborator

@jhh67 - Could you please consider resetting topology to NULL in the hwloc teardown? The problem is that a sequence of qthreads_init and qthread_finalize preserves the global variable and thus we do not reenter the branch to repopulate data structures via hwloc_topology_init in a 2nd sequence. Alternatively, do not check for NULL?

@janciesko janciesko self-requested a review June 22, 2023 20:06
Copy link
Collaborator

@janciesko janciesko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This currently changes the logic beyond the intended purpose.

@janciesko
Copy link
Collaborator

Using binders, this would still fail?

@jhh67
Copy link
Collaborator Author

jhh67 commented Jun 23, 2023

I pushed a commit that fixes topology cleanup in the hwloc teardown routine. I tested it on the hello_world and reinitialization tests, I don't know how to run all the tests and verify the results (give me a pointer on how to do that and I'll run all the tests). I'm not sure about binders -- it doesn't have a teardown routine. It should just get the topology each time it is initialized.

extern void * HWLOC_GET_TOPOLOGY_FUNCTION;
topology = (hwloc_topology_t) HWLOC_GET_TOPOLOGY_FUNCTION;
#endif
if (topology == NULL) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about removing this conditional

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to leave the conditional because it allows the external function to return NULL if it cannot provide the topology for some reason, and qthreads will then get the topology in the normal way.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see your point.

Copy link
Collaborator

@janciesko janciesko Jul 5, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jhh67 Could you please add a comment to the code that the lack of a tear_down routine makes qt_affinity_init skip topology initialization in case of qthreads reinitialization? If we update binders at some point we'd likely need to add a tear_down routine.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@ronawho
Copy link
Contributor

ronawho commented Jul 10, 2023

Note that there's leftover hwloc_via_chapel references in configure.ac and src/Makefile.am

hwloc_via_chapel is obviated by the --with-hwloc-get-topology-function
argument to configure.

Signed-off-by: John H. Hartman <[email protected]>
Added comment that binders will not reinitialize hwloc topology during
re-initialization because it lacks a teardown routine.

Signed-off-by: John H. Hartman <[email protected]>
@jhh67 jhh67 requested a review from janciesko July 10, 2023 13:08
@janciesko
Copy link
Collaborator

LGTM

@janciesko janciesko merged commit 946b924 into sandialabs:main Jul 17, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants