How to deal with input data resolution in different execution contexts and with a replica catalog #69

@ryuwd

New concepts

  • dirac-run-cwl: runs a DIRAC CWL workflow locally and includes the dirac-cwl features.
  • Replica catalog generation: generates an LFN-to-PFN mapping catalog, the "global" catalog. It can be called from, e.g., the JobWrapper, dirac-run-cwl, or the PushJobAgent.
  • RuntimeContext: configures how the executor should run the workflow (e.g. which global catalog to use, where the tmp directory is, etc.).
  • dirac-job-executor: executes the workflow with the necessary pre- or post-step preparation (e.g. generating the replica catalog for a step from the global catalog plus entries for the outputs of previous steps).
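
As a rough illustration of the per-step preparation described in the last bullet, the sketch below merges a global catalog with the outputs of earlier steps and keeps only the entries one step needs. The catalog shape (a dict from LFN to a list of PFNs), the function name `build_step_catalog`, and the example LFNs and PFNs are all assumptions made for illustration; none of them come from dirac-cwl.

```python
import json
from typing import Dict, List

# Assumed catalog shape: LFN -> list of PFNs (replica locations).
Catalog = Dict[str, List[str]]


def build_step_catalog(
    global_catalog: Catalog,
    step_inputs: List[str],
    previous_outputs: Catalog,
) -> Catalog:
    """Return only the catalog entries that one workflow step needs."""
    # Outputs of earlier steps take precedence over the global catalog.
    merged = {**global_catalog, **previous_outputs}
    missing = [lfn for lfn in step_inputs if lfn not in merged]
    if missing:
        raise KeyError(f"No replica known for: {missing}")
    return {lfn: merged[lfn] for lfn in step_inputs}


# Hypothetical example data.
global_catalog = {
    "/vo/data/run1/raw.dat": ["root://se1.example.org//vo/data/run1/raw.dat"],
    "/vo/data/run2/raw.dat": ["root://se1.example.org//vo/data/run2/raw.dat"],
}
previous_outputs = {
    "/vo/user/step1/out.dat": ["file:///tmp/job/step1/out.dat"],
}

step_catalog = build_step_catalog(
    global_catalog,
    ["/vo/data/run1/raw.dat", "/vo/user/step1/out.dat"],
    previous_outputs,
)
print(json.dumps(step_catalog, indent=2))
```

Failing early on a missing LFN is one possible design choice; an executor could equally defer the error until the file is actually opened.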

Replica catalog

Also remember to implement a file-system access class that resolves LFNs through the replica catalog:

from cwltool.stdfsaccess import StdFsAccess


class DiracCatalogFsAccess(StdFsAccess):
    """Use replica catalog to resolve LFNs."""
    ...
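
The lookup such a class would need can be sketched without cwltool. The snippet below is only an illustration: the `LFN:` prefix, the `resolve` helper, and the "first replica wins" policy are assumptions, not part of the proposal. A `StdFsAccess` subclass could call a helper like this before delegating to the parent's methods (for example `open` or `exists`).

```python
from typing import Dict, List

LFN_PREFIX = "LFN:"  # assumed marker that distinguishes LFNs from local paths


def resolve(path: str, catalog: Dict[str, List[str]]) -> str:
    """Map an LFN to a PFN using the catalog; pass other paths through."""
    if not path.startswith(LFN_PREFIX):
        # Not an LFN: leave local paths and other URLs untouched.
        return path
    lfn = path[len(LFN_PREFIX):]
    replicas = catalog.get(lfn)
    if not replicas:
        raise FileNotFoundError(f"No replica for LFN {lfn}")
    # Assumed policy: use the first listed replica.
    return replicas[0]


# Hypothetical catalog entry.
catalog = {"/vo/data/run1/raw.dat": ["root://se1.example.org//vo/data/run1/raw.dat"]}

pfn = resolve("LFN:/vo/data/run1/raw.dat", catalog)
local = resolve("/tmp/job/input.txt", catalog)
print(pfn)
print(local)
```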

Local execution

dirac-run-cwl

  • generate-replica-catalog
  • RuntimeContext
  • dirac-job-executor

Remote job execution

  1. pre-process: generate-replica-catalog
  2. process: RuntimeContext + dirac-job-executor
  3. post-process: ...
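
The three phases above can be sketched as plain function composition. Everything here is hypothetical: `RuntimeContextSketch` merely stands in for the proposed RuntimeContext, `lookup` stands in for whatever queries the file catalog, and `executor` stands in for dirac-job-executor. The post-process phase is left as an optional hook because the issue does not specify it.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class RuntimeContextSketch:
    """Stand-in for the proposed RuntimeContext (fields are assumptions)."""
    catalog: Dict[str, List[str]] = field(default_factory=dict)
    tmpdir: str = "/tmp"


def run_job(
    input_lfns: List[str],
    lookup: Callable[[str], List[str]],
    executor: Callable[[RuntimeContextSketch], dict],
    post_process: Optional[Callable[[dict], None]] = None,
) -> dict:
    # 1. pre-process: generate the replica catalog for the job inputs.
    ctx = RuntimeContextSketch(catalog={lfn: lookup(lfn) for lfn in input_lfns})
    # 2. process: hand the context to the executor.
    outputs = executor(ctx)
    # 3. post-process: unspecified in the issue, so only an optional hook.
    if post_process is not None:
        post_process(outputs)
    return outputs


# Hypothetical stand-ins for the catalog query and the executor.
def fake_lookup(lfn: str) -> List[str]:
    return [f"root://se1.example.org/{lfn}"]


def fake_executor(ctx: RuntimeContextSketch) -> dict:
    return {"resolved": sorted(ctx.catalog)}


result = run_job(["/vo/data/a.dat", "/vo/data/b.dat"], fake_lookup, fake_executor)
print(result)
```

Keeping catalog generation outside the executor is what would allow the same process phase to run on a node without external connectivity, with the first phase performed elsewhere.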

Remote job execution (no external connectivity)

  1. PushJobAgent: executes the JobWrapper pre-process
  2. PushJobAgent: submits RuntimeContext + dirac-job-executor
  3. PushJobAgent: executes the post-process

Labels: points:8 (Very large, 1-2 weeks, many unknowns)