Identify provider API calls that require FI_CONTEXT/FI_CONTEXT2 #6492
We can think of this as 2 separate cases.
The first case has been handled through API query calls, for example fi_query_collective and fi_query_atomic. Those query calls are mandatory to use collective/atomic operations. We could introduce a generic query call (fi_query) that can be used to obtain other details. Inline wrappers could be defined to simplify the user interface, such as an fi_query_mode that reports whether a specific mode bit is required. The input to a new query call can be extended as needed. In some cases, use of the query call would be optional (such as determining when fi_context can be ignored). In other situations, an additional query may be mandatory. The latter would be defined as part of requesting a specific capability (as is the case with FI_ATOMIC and FI_COLLECTIVE) or setting some other attribute.
There's not a clear standard for the second case. We have fi_get_val/fi_set_val calls to read and modify a provider setting. Applications can also remove attributes when opening transmit/receive contexts associated with a scalable endpoint. This allows creating a scalable endpoint with RMA and MSG capabilities, but then opening a transmit context that only supports RMA. Neither of these options would work well for specifying that only a subset of collective or atomic operations is needed. A provider can report whether a specific collective/atomic call is supported, but the provider must assume that an application will use all supported ops. This can waste resources, particularly in the collective case. I think the best option for the second case is to pass extended attributes into object creation.
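To make the generic query idea concrete, here is a minimal sketch of what a query call plus an inline mode wrapper might look like. Everything in it is hypothetical: the `sk_` names, the query structure, and the per-operation-class mode table are stand-ins invented for illustration and are not part of libfabric. It only assumes that a provider could record, per class of operation, which mode bits it needs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* All names here are hypothetical stand-ins, not the libfabric API. */
#define SK_MODE_CONTEXT (1ULL << 0)   /* stand-in for the FI_CONTEXT mode bit */

enum sk_op_class { SK_OP_MSG, SK_OP_TAGGED, SK_OP_RMA, SK_OP_ATOMIC, SK_OP_CLASS_MAX };
enum sk_query_type { SK_QUERY_MODE };

/* Extensible query input/output: new query types can add fields later. */
struct sk_query {
    enum sk_query_type type;
    enum sk_op_class op_class;   /* input: which class of calls is being asked about */
    uint64_t mode;               /* output: mode bits required for that class */
};

/* Hypothetical provider state: the mode bits needed per operation class. */
struct sk_provider {
    uint64_t op_mode[SK_OP_CLASS_MAX];
};

/* Generic query entry point; returns 0 on success, -1 on an unsupported query. */
static int sk_query(const struct sk_provider *prov, struct sk_query *query)
{
    assert(prov != NULL && query != NULL);
    if (query->type != SK_QUERY_MODE)
        return -1;
    if ((int)query->op_class < 0 || (int)query->op_class >= SK_OP_CLASS_MAX)
        return -1;
    query->mode = prov->op_mode[query->op_class];
    return 0;
}

/* Inline wrapper: is mode_bit required for this class of operations? */
static inline bool sk_query_mode(const struct sk_provider *prov,
                                 enum sk_op_class op_class, uint64_t mode_bit)
{
    struct sk_query q = { .type = SK_QUERY_MODE, .op_class = op_class, .mode = 0 };
    if (sk_query(prov, &q) != 0)
        return true;   /* conservative: treat the bit as required if the query fails */
    return (q.mode & mode_bit) != 0;
}

/* Example provider that needs the context for messaging calls but not for RMA. */
static const struct sk_provider sk_example_provider = {
    .op_mode = { [SK_OP_MSG] = SK_MODE_CONTEXT, [SK_OP_TAGGED] = SK_MODE_CONTEXT },
};
```

With this shape, an application that never calls the wrapper keeps today's behavior (always supply the context), which matches the optional use of the query described above.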
From the fi_getinfo man page:
FI_CONTEXT : Specifies that the provider requires that applications use struct fi_context as their per operation context parameter.
In practice, a provider may require the structure for only some of the APIs. For example, a provider (e.g., psm2) needs fi_context for messaging, but not for RMA. If the application wants both on an endpoint, it is required to allocate and free context structures unnecessarily for the RMA calls, because there is no way to distinguish the two cases.
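The overhead can be illustrated with a small, self-contained sketch that counts how many context structures a mixed workload must supply under an endpoint-wide requirement versus a per-operation one. The `demo_` names are hypothetical; `struct demo_context` is assumed to mirror the four-pointer layout of struct fi_context and is redefined here only so the sketch builds without libfabric.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed to mirror struct fi_context: four pointers of provider scratch space. */
struct demo_context { void *internal[4]; };

enum demo_op { DEMO_OP_MSG, DEMO_OP_RMA };

/* Hypothetical per-operation rule: only messaging needs a context. */
static bool demo_op_needs_context(enum demo_op op)
{
    return op == DEMO_OP_MSG;
}

/* Count the contexts the application must supply for a batch of operations.
 * endpoint_wide = true models today's single mode bit for the whole endpoint. */
static size_t demo_count_contexts(const enum demo_op *ops, size_t n_ops,
                                  bool endpoint_wide)
{
    assert(ops != NULL || n_ops == 0);
    size_t needed = 0;
    for (size_t i = 0; i < n_ops; i++) {
        if (endpoint_wide || demo_op_needs_context(ops[i]))
            needed++;
    }
    return needed;
}

/* Bytes of context storage implied by the count above. */
static size_t demo_context_bytes(const enum demo_op *ops, size_t n_ops,
                                 bool endpoint_wide)
{
    return demo_count_contexts(ops, n_ops, endpoint_wide) * sizeof(struct demo_context);
}
```

For a batch of two messaging and three RMA operations, the endpoint-wide rule requires five contexts while the per-operation rule requires two; the three extra structures are the unnecessary allocations described above.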
This was discussed in the OFIWG meeting on 1/12/2021 and a couple of points came up: