WeeklyTelcon_20221101

Open MPI Weekly Telecon

Dialup Info: (Do not post to public mailing list or public wiki)

Attendees (on Web-ex)

Austen Lauria (IBM)
Brendan Cunningham (Cornelis Networks)
Brian Barrett (AWS)
David Bernhold (ORNL)
Edgar Gabriel (UoH)
Geoffrey Paulsen (IBM)
Harumi Kuno (HPE)
Howard Pritchard (LANL)
Joseph Schuchart
Josh Fisher (Cornelis Networks)
Josh Hursey (IBM)
Thomas Naughton (ORNL)
Todd Kordenbrock (Sandia)
William Zhang (AWS)

not there today (I keep this for easy cut-n-paste for future notes)

Akshay Venkatesh (NVIDIA)
Artem Polyakov (nVidia)
Aurelien Bouteiller (UTK)
Brandon Yates (Intel)
Charles Shereda (LLNL)
Christoph Niethammer (HLRS)
Erik Zeiske
George Bosilca (UTK)
Hessam Mirsadeghi (UCX/nVidia)
Jan (Sandia)
Jeff Squyres (Cisco)
Jingyin Tang
Marisa Roman (Cornelis Networks)
Mark Allen (IBM)
Matias Cabral (Intel)
Matthew Dosanjh (Sandia)
Michael Heinz (Cornelis Networks)
Nathan Hjelm (Google)
Noah Evans (Sandia)
Raghu Raja (AWS)
Ralph Castain (Intel)
Sam Gutierrez (LLNL)
Scott Breyer (Sandia?)
Shintaro Iwasaki
Tommy Janjusic (nVidia)
Xin Zhao (nVidia)

Ralph joined to ask some binding/mapping opinions.

Default placement: if np <= 2 then map by core, else map by NUMA (if defined), else map by Package.
- The issue is that a customer has a Package inside of NUMA.
  - OMPI recently had a user that DID hit this. They were mapping by NUMA inside the package, which was not what they were expecting.
  - Specifying map-by package solved it.
  - Hard to debug by looking at lstopo. If someone gets something weird when trying to map by NUMA, try map by Package.
- Question: if the default mapping policy (or ANY mapping policy) is in effect but no ranking policy is specified, what should the ranking policy be?

Historically, ranking mirrored the mapping policy.
- But it was pointed out that this isn't the optimal placement (since most apps communicate with neighbors).
- So then it was proposed to map by SLOT.
But then the user looks, and gets confused because that's not what they thought they were getting.
Please think about this, and decide and lock this down.
- Brian thinks that, in the absence of any other information, the default has to be rank by SLOT (less strong thoughts on NUMA or Package).
- Initial thought was that if a user specifies a non-default mapping, they then NEED to specify a ranking, and vice versa.
  - Can print a useful error message.
  - We can't make everyone happy in this case, so this might be best option.
  - If users don't want to specify this every time, they can set an env var or make an entry in a conf file.
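
A minimal sketch of the rules discussed above, written in Python for illustration only. The helper names `default_mapping` and `resolve_policies` are hypothetical and are not Open MPI or PRRTE code. It assumes the default placement as stated (core for np <= 2, else NUMA if defined, else Package), uses Brian's suggestion of rank by SLOT as the default ranking, and applies the proposal that a non-default mapping requires an explicit ranking and vice versa.

```python
from typing import Optional, Tuple


def default_mapping(np: int, has_numa: bool) -> str:
    """Default map-by target as described in the notes (hypothetical helper)."""
    if np <= 2:
        return "core"
    # NUMA is used only if the topology defines it; otherwise fall back to package.
    return "numa" if has_numa else "package"


def resolve_policies(np: int, has_numa: bool,
                     map_by: Optional[str] = None,
                     rank_by: Optional[str] = None) -> Tuple[str, str]:
    """Return (mapping, ranking) under the proposed rule (hypothetical helper).

    Assumption: if the user gives exactly one of the two policies,
    we refuse and print a useful error instead of guessing the other.
    """
    if (map_by is None) != (rank_by is None):
        given = "mapping" if rank_by is None else "ranking"
        missing = "ranking" if rank_by is None else "mapping"
        raise ValueError(
            f"a {given} policy was specified without a {missing} policy; "
            "please specify both"
        )
    if map_by is None:
        # Nothing specified: default mapping, and rank by slot (assumed default).
        return default_mapping(np, has_numa), "slot"
    return map_by, rank_by


if __name__ == "__main__":
    print(resolve_policies(8, has_numa=True))
    print(resolve_policies(8, has_numa=True, map_by="package", rank_by="slot"))
```

With this shape, a user who hits the NUMA/Package surprise would pass both policies explicitly, and a user who passes only one gets an error rather than a ranking they did not expect.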

v4.1.x

v4.1.5
- Posted an RC1 last week. Brian forgot to send email to devel.
- Schedule is still end-of-month.
- May be the last v4.1.5 unless lots of bugs.
- There is a patch that needs some work; it didn't compile. We'd take it if it passes.

v5.0.x

RC went out a couple of weeks ago.
We'll need at least one more RC before we release.
HAN/Adapt is remaining blocker.
- Finally figured out why timings were so variable.
  - Because we select Bruck for Barrier for no reason...
  - since OSC times barrier as well, that was the cause for the variations he was seeing.
- There's a patch that proposes to only use HAN if the rank-distribution if we
- Don't think we should block v5.0 longer
- Don't think we'll figure out how to make HAN faster than tuned if
Don't have a good reason yet why HAN's Barrier is slower.
We promised better collective performance for v5, but we have not delivered.
- What do we do?
  - Two choices:
    - Ship now and say that we're sorry our collective performance
      - We'd need some messaging about how we're handling this.
    - How do we talk to the community about this.
  - Are there any cases where this work actually improves things?
    - Something a bit positive where this work
    - Goes back to where ranks aren't ordered by SLOT.
      - Don't understand why only those are better.
- Do we make it better in the common case? - No.
Super Computing 2021
- ULFM, Threading MCA framework, MTL OFI, UCC
- Pretty sure we DID messaging around this.
Have had a number of new PRs.
- Did make changes to Tuned and had a PR where priorities were adjusted.
- Seeing better performance for OMPI than Intel MPI.
- Whatever the "out-of-box" performance is, that is what they are getting.
- If you only have a few ranks per node, then HAN doesn't help that much.
Preparing for release.
- Nov 14th release date.
- Remaining known blocking issues:
  - OSHMEM blocker issue #10978
  - OPAL LIFO tests fail on s390x; a bad gcc is suspected. Reported to work with v4.1, but fails with v5.0.
    - Doesn't seem to have support for 128bit architectures. Can't use C11
  - Jenkins Pipeline fix (No issue)
Jenkins - make tarball issue.
- RPM builds don't work in Jenkins on v5.0.x
  - Doesn't block RC, but DOES block release.
HAN/Adapt - #10963
- Still some concerns that need to be addressed.
Docs - Remaining blocking issue (besides above) for v5.0.0
- mpirun --help is OUT OF DATE.
- A number of doc issues open.
- See https://github.com/open-mpi/ompi/projects/3 for more info.
- The open-mpi FAQ - refers to things like v1.7
  - Should the open-mpi.org say for v5.0
  - Like the "see all of them" feature.

Main branch

Accelerator framework

Merged to main, and to v5.0.x
- Try it in v5.0.0rc9

MTT

Administrative tasks

Still delayed.
We're probably not getting together in person anytime soon.
- So we'll send around a doodle to have time to talk about our rules.
- Reflect the way we worked several years ago, but not really right now.
We're to review the admin steering committee in July (per our rules):
- https://github.com/open-mpi/ompi/wiki/Administrative-rules#administrative-steering-committee-asc
We're to review the technical steering committee in July (per our rules):
- https://github.com/open-mpi/ompi/wiki/Administrative-rules#technical-steering-committees
We should also review all the OMPI github, slack, and coverity members during the month of July.
- Jeff will kick that off sometime this week or next week.
In the call we mentioned this, but no real discussion.

Face-to-face

Wiki for face to face: https://github.com/open-mpi/ompi/wiki/Meeting-2022
- Might be better to do a half-day/day-long virtual working session.
  - Due to companies' travel policies, and convenience.
  - Could do administrative tasks here too.

Super Computing?

Open MPI missed submitting request for BoF this year.
MPI Forum will be presenting.
