
08 12 2020

valentin petrov edited this page Aug 12, 2020 · 2 revisions

Participants

  • Sergey Lebedev
  • Alex Margolin
  • Valentin Petrov

Multiple discussion points were raised for future consideration with a broader audience.

CI Integration

  • UCX is moving its entire CI to Azure Pipelines; we are going to do the same.
  • AI (Sergey, Val): collect more info on how multinode testing is organized and on valgrind/coverity licensing.

Schedule/task/progress engine

  • Went over example implementation/usage of the proposed schedule/task/progress internal interface https://gist.github.com/vspetrov/6b7ac17ea99c24fafd8ae1f3f31f4122
  • The WG agrees that this approach fits all the purposes.
  • Some changes will naturally happen during the actual implementation, e.g.: (i) task storage: list vs. array; (ii) immediate task completion: processed inside the EVENT_COMPLETED handler vs. handled by the EVENT_MANAGER (for example, via the return code from the handler).
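
To make option (ii) concrete, here is a minimal sketch of the variant in which the event manager handles immediate completion through the handler's return code. It is not taken from the gist or from UCC: every name (sk_task_t, sk_subscribe, sk_complete, the status codes) is hypothetical, and listeners are stored in a fixed-size array purely for brevity.

```c
#include <assert.h>
#include <stddef.h>

/* All names here are hypothetical illustrations, not the gist's or UCC's API. */
typedef enum { SK_OK = 0, SK_INPROGRESS = 1 } sk_status_t;

#define SK_MAX_LISTENERS 4

typedef struct sk_task sk_task_t;
typedef sk_status_t (*sk_handler_t)(sk_task_t *task);

struct sk_task {
    sk_handler_t handler;                     /* runs when a dependency completes */
    int          completed;                   /* set once by the event manager */
    sk_task_t   *listeners[SK_MAX_LISTENERS]; /* array storage; a list is the alternative */
    size_t       n_listeners;
};

/* Handler that finishes its work right away. */
static sk_status_t sk_handler_immediate(sk_task_t *task)
{
    (void)task;
    return SK_OK;
}

/* Handler that only posts work; a progress call must complete the task later. */
static sk_status_t sk_handler_deferred(sk_task_t *task)
{
    (void)task;
    return SK_INPROGRESS;
}

/* Subscribe `listener` to the completion event of `src`. Returns 0 or -1 if full. */
static int sk_subscribe(sk_task_t *src, sk_task_t *listener)
{
    if (src->n_listeners == SK_MAX_LISTENERS) {
        return -1;
    }
    src->listeners[src->n_listeners++] = listener;
    return 0;
}

/* Event manager: raise the completion event on `task` and notify its listeners.
 * If a listener's handler returns SK_OK, the listener completed immediately and
 * the event manager raises its completion event as well. */
static void sk_complete(sk_task_t *task)
{
    size_t i;

    assert(!task->completed); /* a task must complete exactly once */
    task->completed = 1;
    for (i = 0; i < task->n_listeners; i++) {
        sk_task_t *next = task->listeners[i];
        if (next->handler(next) == SK_OK) {
            sk_complete(next);
        }
    }
}
```

With this shape a handler never raises events itself; it only reports its status, and a chain of immediately completing tasks is unwound by the event manager. The other option from the notes would move the sk_complete call into the handler.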

User-defined reduction operations and derived datatypes

  • Do we need to support different data types for source and destination? This item will close after the discussion on the email list converges. Only touched upon today; more input is needed. Open question: how do different dtypes in the reduce callback help implement an optimized collective?
  • Discussed non-predefined (custom) dtypes again. Summarizing: (i) UCC defines a set of simple dtypes; (ii) UCC supports an OPAQUE dtype, similar to UCP_DATATYPE_GENERIC, which should support size_query, pack, unpack, and user_reduction_callback (through a wrapper in order to avoid a void* cast); (iii) do we want to ask the runtime to provide a "flattened (iov)" representation of the GENERIC type? (iv) STRIDED, IOV? (v) support through the UCD datatype engine (discussed in the UCX DT WG) is unclear, since there is no convergence on the interface in the UCX WG.
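
As an illustration of item (ii), the sketch below shows one possible shape for an OPAQUE dtype descriptor with the four callbacks named above. It is only an assumption about how such a descriptor could look: sk_opaque_dt_t, the callback signatures, and the sample user type are all hypothetical and are not part of UCC or UCP. The typed sample_sum function plus the thin sample_reduce wrapper is one reading of "through a wrapper in order to avoid a void* cast".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical descriptor; not part of the UCC or UCP API. */
typedef struct sk_opaque_dt {
    size_t (*size_query)(const void *buf, size_t count);           /* packed size in bytes */
    size_t (*pack)(const void *src, size_t count, void *packed);   /* returns bytes written */
    size_t (*unpack)(const void *packed, size_t count, void *dst); /* returns bytes consumed */
    void   (*reduce)(const void *in, void *inout, size_t count);   /* inout[i] = in[i] op inout[i] */
} sk_opaque_dt_t;

/* Sample user type: padded in memory, so the packed form is smaller. */
typedef struct { char tag; double val; } sk_sample_t;
#define SK_SAMPLE_PACKED (sizeof(char) + sizeof(double))

static size_t sample_size_query(const void *buf, size_t count)
{
    (void)buf;
    return count * SK_SAMPLE_PACKED;
}

static size_t sample_pack(const void *src, size_t count, void *packed)
{
    const sk_sample_t *s = src;
    unsigned char     *p = packed;
    size_t             i;

    assert(src != NULL && packed != NULL);
    for (i = 0; i < count; i++) {
        memcpy(p, &s[i].tag, sizeof(char));   p += sizeof(char);
        memcpy(p, &s[i].val, sizeof(double)); p += sizeof(double);
    }
    return (size_t)(p - (unsigned char *)packed);
}

static size_t sample_unpack(const void *packed, size_t count, void *dst)
{
    const unsigned char *p = packed;
    sk_sample_t         *d = dst;
    size_t               i;

    assert(packed != NULL && dst != NULL);
    for (i = 0; i < count; i++) {
        memcpy(&d[i].tag, p, sizeof(char));   p += sizeof(char);
        memcpy(&d[i].val, p, sizeof(double)); p += sizeof(double);
    }
    return (size_t)(p - (const unsigned char *)packed);
}

/* Typed user reduction: the user code itself never touches void*. */
static void sample_sum(const sk_sample_t *in, sk_sample_t *inout, size_t count)
{
    size_t i;
    for (i = 0; i < count; i++) {
        inout[i].val += in[i].val;
    }
}

/* Thin wrapper that adapts the typed callback to the generic signature. */
static void sample_reduce(const void *in, void *inout, size_t count)
{
    sample_sum(in, inout, count);
}

static const sk_opaque_dt_t sk_sample_dt = {
    sample_size_query, sample_pack, sample_unpack, sample_reduce
};
```

A library would call these callbacks only through the descriptor, for example sk_sample_dt.pack(buf, n, wire) before a send. A flattened (iov) or STRIDED representation, as raised in (iii) and (iv), would need additional fields that this sketch does not attempt.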

UCC headers PR [1]

  • Under review - multiple outstanding comments. We touched upon several of them.
  • "EP" definition/usage: discussed with Alex, reached agreement.
  • ucc_coll_id_t tag: what will be the supported "width" of the tag? It should fit into the 64 bits of the UCP tag together with the other values (rank, team_id, etc.).
  • team_id provided by the runtime: 64 bits or 16 bits? We need either to request the runtime to pass the width of the provided team id, or to declare it as uint16_t (limiting it to 16 bits). To be discussed.
  • "Context_EP" (somewhat similar to MPI world rank) - useful to optimize internal EP storage/mapping. AI Val - prepare use case example and API proposal.
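
The tag-width question above can be illustrated with a small packing sketch. The split used here (16-bit collective tag, 16-bit team id, 32-bit rank) is purely an assumption chosen for the example; no widths were decided at the meeting, and all function and macro names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative widths only; the actual split is still under discussion. */
#define SK_COLL_TAG_BITS 16
#define SK_TEAM_ID_BITS  16
#define SK_RANK_BITS     32

/* Pack the three fields into one 64-bit value: [coll_tag | team_id | rank]. */
static uint64_t sk_tag_pack(uint16_t coll_tag, uint16_t team_id, uint32_t rank)
{
    /* The three fields must use exactly the 64 bits of the tag. */
    assert(SK_COLL_TAG_BITS + SK_TEAM_ID_BITS + SK_RANK_BITS == 64);
    return ((uint64_t)coll_tag << (SK_TEAM_ID_BITS + SK_RANK_BITS)) |
           ((uint64_t)team_id << SK_RANK_BITS) |
           (uint64_t)rank;
}

static uint16_t sk_tag_coll(uint64_t tag)
{
    return (uint16_t)(tag >> (SK_TEAM_ID_BITS + SK_RANK_BITS));
}

static uint16_t sk_tag_team(uint64_t tag)
{
    return (uint16_t)((tag >> SK_RANK_BITS) & 0xFFFFu);
}

static uint32_t sk_tag_rank(uint64_t tag)
{
    return (uint32_t)(tag & 0xFFFFFFFFu);
}

/* Narrow a 64-bit team id supplied by the runtime to 16 bits.
 * Returns 0 on success, -1 if the id does not fit. */
static int sk_team_id_narrow(uint64_t runtime_id, uint16_t *out)
{
    if (runtime_id > UINT16_MAX) {
        return -1;
    }
    *out = (uint16_t)runtime_id;
    return 0;
}
```

Declaring the team id as uint16_t, the second option in the notes, would remove the need for sk_team_id_narrow, at the cost of limiting the runtime to 65536 distinct team ids.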

Next meeting

  • August 20th, 2020

Potential Agenda

  • CI info update (Val, Sergey)
  • Context_EP proposal (Val)
  • UCC headers PR [1] - review/discuss

[1] https://github.com/openucx/ucc/pull/1
