Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug7263/pmix setup app api #11

Open
wants to merge 7 commits into
base: slurm/slurm-20.11
Choose a base branch
from

Conversation

karasevb
Copy link
Owner

@karasevb karasevb commented Dec 4, 2020

No description provided.

@karasevb
Copy link
Owner Author

karasevb commented Dec 4, 2020

@artpol84, please review

@@ -209,6 +211,7 @@ extern mpi_plugin_client_state_t *p_mpi_hook_client_prelaunch(
static bool setup_done = false;
uint32_t nnodes, ntasks, **tids;
uint16_t *task_cnt;
char *process_mapping = NULL;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
char *process_mapping = NULL;

Comment on lines 239 to +240
setenvf(env, PMIXP_SLURM_MAPPING_ENV, "%s", process_mapping);
xfree(process_mapping);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
setenvf(env, PMIXP_SLURM_MAPPING_ENV, "%s", process_mapping);
xfree(process_mapping);

@@ -631,9 +635,141 @@ int pmixp_mkdir(char *path, mode_t rights)
return errno;
}

if (chown(path, (uid_t) pmixp_info_jobuid(), (gid_t) -1) < 0) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why can't we use pmixp_info_jobuid() but inside it return the correct value?
This is what pmixp_info is for


static inline int pmixp_info_timeout(void)
{
xassert(_pmixp_job_info.magic == PMIXP_INFO_MAGIC);
return _pmixp_job_info.timeout;
xassert(_pmixp_stepd_info.proc_info.magic == PMIXP_INFO_MAGIC);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's move all static functions to the pmixp_info.c as Danny asked long time ago.

#ifdef PMIX_SERVER_TMPDIR
PMIXP_KVP_ADD(kvp, PMIX_SERVER_TMPDIR,
pmixp_info_tmpdir_lib(), PMIX_STRING);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The job of pmixp_info_tmpdir_lib() is to return the proper value.
I don't see why we need to change function prototypes instead of adjusting pmixp_info to accommodate srun.

@@ -51,6 +51,11 @@ typedef struct {
#endif
char nspace[PMIXP_MAX_NSLEN];
slurm_step_id_t step_id; /* Current step id (or NO_VAL) */
char *lib_tmpdir;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
char *lib_tmpdir;
uint32_t nnodes; /* number of nodes in current step */
uint32_t nnodes_job; /* number of nodes in current job */
uint32_t ntasks; /* total number of tasks in current step */
char *lib_tmpdir;
char *spool_dir;
uid_t uid;
gid_t gid;

Comment on lines 76 to 78
char *spool_dir;
uid_t uid;
gid_t gid;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
char *spool_dir;
uid_t uid;
gid_t gid;

Comment on lines 59 to 61
uint32_t nnodes; /* number of nodes in current step */
uint32_t nnodes_job; /* number of nodes in current job */
uint32_t ntasks; /* total number of tasks in current step */
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
uint32_t nnodes; /* number of nodes in current step */
uint32_t nnodes_job; /* number of nodes in current job */
uint32_t ntasks; /* total number of tasks in current step */

int pmixp_srun_init(const mpi_plugin_client_info_t *job, char ***env)
{
int rc;

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As we discussed, proc mapping generation should go here!

@@ -1283,7 +1311,8 @@ int pmixp_server_direct_conn_early(void)

PMIXP_DEBUG("called");
proc.rank = pmixp_lib_get_wildcard();
strncpy(proc.nspace, _pmixp_job_info.nspace, PMIXP_MAX_NSLEN);
strncpy(proc.nspace, _pmixp_stepd_info.proc_info.nspace,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no direct access to pmixp_info!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants