WeeklyTelcon_20160920
- Dialup Info: (Do not post to public mailing list or public wiki)
- Attendees
  - Geoff Paulsen
  - Jeff Squyres
  - Josh Hursey
  - Joshua Ladd
  - Ralph
  - Sylvain Jeaugey
  - Artem Polyakov
  - Brad Benton
- Milestones
  - 1.10.4
    - Nothing new; no driver for a release yet.
  - 2.0.2 in preparation
    - Will create branch 2.0.x.
    - Looks like PRs are non-controversial, just waiting for reviews.
    - 2.x branch will see the 2.1.0 PRs merged in once the 2.0.x branch is created.
  - 2.1.0
    - Was hoping to get the 2.1.0 PRs in before we merge Git repos.
    - Looked at a prototype of the merged GitHub repo, called ompi-all-the-branches.
      - Review mechanism is web-only.
    - Blocking on OSHMEM - needs rebasing.
      - Yoda maintenance.
      - Ongoing performance discussion.
    - Most PRs marked as RM approved.
    - Discussion on a few other items.
- Blocker 2.0.2 issues
  - Issue 2075
    - Non-issue since SIGSEGV is not forwarded.
  - Issue 2049
    - Ticket updated.
  - Issue 2030
    - MTT seems to be the only place to reproduce.
    - Might be a debug-build-related issue in the usage of opal_list_remove_item (see the sketch after this list).
  - Issue 2028
    - yoda needs to be updated for BTL 3.0; 2.1 will not be released until yoda is fixed.
    - Propose: remove yoda from 2.1 and move to UCX.
    - Raises the question: does it make sense to keep OSHMEM in Open MPI if yoda is removed?
- Blocker 2.1.0 issues
  - Issue 1831
  - Issue 2075
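For context on the Issue 2030 note above, here is a minimal sketch of the usual OPAL list pattern around opal_list_remove_item. This is an illustration only (not code from the ticket), using the standard OPAL object/list macros from the Open MPI source tree; per the minutes, debug builds add extra checking in this area, which may be where the problem surfaces.

```c
/* Illustrative sketch only: typical OPAL list usage around
 * opal_list_remove_item(), the call mentioned for Issue 2030. */
#include "opal/class/opal_list.h"

/* A list item type derived from opal_list_item_t. */
typedef struct {
    opal_list_item_t super;   /* must be first: makes this usable as a list item */
    int value;
} my_item_t;
OBJ_CLASS_INSTANCE(my_item_t, opal_list_item_t, NULL, NULL);

static void list_example(void)
{
    opal_list_t list;
    OBJ_CONSTRUCT(&list, opal_list_t);

    my_item_t *item = OBJ_NEW(my_item_t);
    item->value = 42;
    opal_list_append(&list, &item->super);

    /* Debug builds do extra sanity checking around calls like this
     * (the area the minutes suspect for Issue 2030). */
    opal_list_remove_item(&list, &item->super);

    OBJ_RELEASE(item);
    OBJ_DESTRUCT(&list);
}
```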
- OSHMEM - Yoda Maintenance
  - Want to progress both MPI and OSHMEM in the same process; don't want multiple network stacks.
  - The original argument was to run OSHMEM over the BTLs - to use all network stacks (TCP, SM, OpenIB).
    - That was 4 years ago, and things have changed; we don't really have that anymore - we have PMLs and SPMLs.
  - Last week Mellanox proposed moving to UCX.
  - OSHMEM sits on top of the MPI layer, since it uses much of it.
    - Over the last couple of years it has been decoupled from MPI; now it sits off to the side, and no one is interested in maintaining the connection to OPAL and ORTE support. If that's all it's using, there are other projects that share OPAL and ORTE.
    - The only reason to be in the repository is that it is connected at the MPI layer.
    - BUT, when you start OSHMEM, the first thing called is OMPI_MPI_Init (see the sketch after this list).
  - Maybe it would help to identify exactly what in the MPI layer OSHMEM is using.
    - OPAL <- ORTE <- OMPI <- OSHMEM dependency chain.
    - Maybe it would help to show where that is.
    - OSHRUN (really ORTERUN) calls OMPI_MPI_Init; an MCA plugin infrastructure is built on top of that.
    - Can't just slash pieces away.
    - Takes advantage of PMIx, direct modex, the proc structure, and everything that supports this.
    - According to this PR on master, OSHMEM has the same proc structure as OMPI, but actually has some MORE at the end of it.
  - What about the transports? MPI with MXM boils down to libmxm, and so does OSHMEM.
  - Became an issue with the BTL 3.0 API change.
    - A number of things, especially over the last year, have split MPI focus and OSHMEM focus: a number of breaks between MPI and OSHMEM, and release-schedule conflicts.
  - Does it make sense to separate the repositories, or to design a way to make it easy to pull changes between the two projects?
  - Right now there is a regression in the code base.
    - Mellanox can't replace Yoda with UCX in October.
    - Mellanox will fix Yoda for this time (for 2.1.0).
  - Could package UCX alongside the other transports and let the market decide.
  - Want to continue this discussion about the importance of OSHMEM being included with the Open MPI project.
  - We need to have an important discussion about the future of MPI / OSHMEM.
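As a point of reference for the discussion above, a minimal OpenSHMEM program looks like the sketch below. This is an illustrative example, not code from the minutes; per the notes above, in Open MPI's OSHMEM layer shmem_init() is where the MPI-layer initialization (OMPI_MPI_Init) gets pulled in, which is why OSHMEM sits at the top of the OPAL <- ORTE <- OMPI <- OSHMEM chain.

```c
/* Minimal OpenSHMEM example (illustration only).
 * shmem_init() initializes OSHMEM and, in Open MPI, the OMPI/ORTE/OPAL
 * layers underneath it, per the discussion above. */
#include <stdio.h>
#include <shmem.h>

int main(void)
{
    shmem_init();

    int me   = shmem_my_pe();   /* this PE's rank */
    int npes = shmem_n_pes();   /* total number of PEs */

    /* Symmetric heap allocation: every PE allocates the same buffer. */
    int *flag = (int *) shmem_malloc(sizeof(int));
    *flag = -1;
    shmem_barrier_all();

    /* PE 0 writes its rank into PE 1's symmetric buffer (one-sided put). */
    if (npes > 1 && me == 0) {
        shmem_int_p(flag, me, 1);
    }
    shmem_barrier_all();

    if (me == 1) {
        printf("PE %d received %d from PE 0\n", me, *flag);
    }

    shmem_free(flag);
    shmem_finalize();           /* tears down OSHMEM and the layers below it */
    return 0;
}
```

Such a program is launched with oshrun (which, as noted above, is really orterun), e.g. `oshrun -np 2 ./a.out`.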
- SPI - http://www.spi-inc.org/
  - Getting people to approve of these.
  - We'll be on the Oct 12th agenda. Once they formally invite us, we have 60 days to agree / decline.
  - SPI works solely on a volunteer basis, so it is very inexpensive.
  - End of September for soliciting feedback on using SPI.
  - Open MPI will hold a formal vote after we receive the formal invite (in mid-to-late December?).
- New Contribution agreement / Consent agreement / Bylaws
  - Will need a formal vote by members.
  - End of October for discussion of the new contributor agreement / bylaws.
  - After that we'll set a date for voting.
- EuroMPI 2016 in Edinburgh - Sept 25-28
  - MPI Forum: Sept 21-23.
  - People might be traveling next week.
- Review Master MTT testing (https://mtt.open-mpi.org/)
- Date of another face-to-face: January or February? Think about it, and discuss next week.
- Status Update Rotation
  - LANL, Houston, IBM
  - Cisco, ORNL, UTK, NVIDIA
  - Mellanox, Sandia, Intel