What is the recommended practice for dealing with functions that have multiple outputs (or list inputs)? #1589
Question

What is considered best practice for dealing with a multi-step electron lattice where you need to pass only part of the output from electron 1 to electron 2?

Example: Multiple Outputs

Consider the following scenario:

    import covalent as ct

    def add(val1, val2):
        return {"output": val1 + val2, "name": "add"}

    def mult(val1, val2):
        return {"output": val1 * val2, "name": "mult"}

    @ct.lattice
    def workflow(val1, val2):
        job1 = ct.electron(add)
        job2 = ct.electron(mult)
        result1 = job1(val1, val2)
        # Indexing happens inside the lattice, on the not-yet-computed result.
        result2 = job2(result1["output"], val2)
        return result2

    dispatch_id = ct.dispatch(workflow)(1, 2)
    result = ct.get_result(dispatch_id, wait=True)
    print(result)

This works just fine, but as you can see in the image below, there is a step in the diagram where covalent needs to [...]

I also tried the following example to see if it made a difference, simply out of curiosity. Perhaps unsurprisingly in hindsight, it made the "problem" worse because now two entries must be queried.

    import covalent as ct

    def add(val1, val2):
        return val1 + val2, "add"

    def mult(val1, val2):
        return val1 * val2, "mult"

    @ct.lattice
    def workflow(val1, val2):
        job1 = ct.electron(add)
        job2 = ct.electron(mult)
        # Tuple unpacking happens inside the lattice.
        out1, _ = job1(val1, val2)
        result2 = job2(out1, val2)
        return result2

    dispatch_id = ct.dispatch(workflow)(1, 2)
    result = ct.get_result(dispatch_id, wait=True)
    print(result)

Related Question: List Inputs

Very closely related, if I provide a list as the input argument, the first compute task is an [...]

An example is shown below:

    import covalent as ct

    @ct.electron
    def summer(vals_list):
        # "total" avoids shadowing the built-in sum()
        total = 0
        for val in vals_list:
            total += val
        return total

    @ct.lattice
    def workflow(vals_list):
        total = summer(vals_list)
        return total

    dispatch_id = ct.dispatch(workflow)([1, 2, 3])
    result = ct.get_result(dispatch_id, wait=True)
    print(result)

It runs successfully and here is the diagram:
Replies: 1 comment 2 replies
Hey @arosen93, finally getting back to this. I'm happy to provide a more comprehensive explanation. Let's dive into this interesting topic.

The Underlying Issue

The problem arises due to two main requirements of the Covalent framework:

Electrons should be capable of returning any and all Python objects.
Lattice should act as a robust compiler for Electrons, without needing users to specify the number of output objects.

As a result, uncomputed operations on Electrons, such as sum([electron_output1, electron_output2, ...]) or electron_output['data'], must also be Electrons.

In older versions of Covalent, these Electrons used the default local executor, running on the same server. This meant that the server had to have all the dependencies of the Electron result objects, which can be a big ask if Covalent is running on a remote self-hosted server instead of locally.

Current Solution: The Workflow Executor

To address this issue, Covalent now uses a Lattice-level parameter called workflow_executor.

How to Work Around the Problem

Short-term approach: One way users can avoid this issue is to have an Electron return a custom data class as a single object, which is then unpacked (or operated on) within other Electrons rather than inside the Lattice. It's important to realize that any operation performed on an Electron inside a Lattice is actually performed on the Future of the Electron, which the Lattice has no knowledge of, so it is always wise to defer these operations to the next Electron it will be connected to. Here's an example:

    @ct.electron
    def task1(data):
        X, y = data  # unpack inside the electron, not in the lattice
        ...

    @ct.electron
    def gen_data(X, y):
        data = transform(X, y)  # transform is user-defined and not shown here
        return data  # returned as a single object

    @ct.electron
    def sum_y(data):
        return sum(data[1])

    @ct.lattice
    def workflow(X, y):
        data = gen_data(X, y)
        result = task1(data)
        result_sum = sum_y(data)
        ...

instead of

    @ct.electron
    def task1(X, y):
        ...

    @ct.electron
    def gen_data(X, y):
        X, y = transform(X, y)
        return X, y

    @ct.lattice
    def workflow(X, y):
        # This unpacking becomes an electron: the number of returned objects
        # is unknown, so the unpickled result (X, y) has to be iterated.
        X, y = gen_data(X, y)
        result = task1(X, y)
        # sum(y) is also converted to an electron, even though the user did
        # not ask for one, because y must be unpickled and summed.
        result_sum = sum(y)
        ...

Medium to long-term solution: Covalent will soon introduce "task packing", which allows users to pack multiple tasks to be shipped to the same executor instance. This will enable us to automatically pack, in the background, these trivial Electrons that are connected serially, so that they execute in the same executor and on the same machine as their parent, without using the workflow_executor. This is ideal because the parent is expected to already have all the environment requirements needed to unpickle and work on these tasks. This feature will make the [...]

Hope this helps clear things up on your end; if not, let's keep the discussion flowing. @cjao, anything more to add?
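For reference, the short-term approach above (return one object, unpack it in the next Electron) might look roughly like this when applied to the add/mult example from the question. This is only a sketch: StepResult is a hypothetical container name, not something Covalent provides. The try/except fallback is an assumption added so the snippet also runs as plain Python when Covalent is not installed.

```python
from dataclasses import dataclass

try:
    import covalent as ct  # use the real decorators when Covalent is available
    electron, lattice = ct.electron, ct.lattice
except ImportError:
    # Assumption for this sketch only: no-op stand-ins for the decorators.
    def electron(func):
        return func

    lattice = electron


@dataclass
class StepResult:
    """Hypothetical container holding every output of one step."""
    output: float
    name: str


@electron
def add(val1, val2):
    return StepResult(output=val1 + val2, name="add")


@electron
def mult(prev, val2):
    # Attribute access happens here, inside the electron, on the real object.
    return StepResult(output=prev.output * val2, name="mult")


@lattice
def workflow(val1, val2):
    result1 = add(val1, val2)
    # The whole object is passed on; nothing is indexed or unpacked here.
    return mult(result1, val2)
```

With Covalent installed, this would be dispatched the same way as in the question, via ct.dispatch(workflow)(1, 2) followed by ct.get_result(dispatch_id, wait=True). Because the lattice body never indexes or unpacks result1, the extra intermediate step described above should not be needed, though that is worth confirming in the diagram for your Covalent version.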
Hey @arosen93, finally getting back to this. I'm happy to provide a more comprehensive explanation. Let's dive into this interesting topic.
The Underlying Issue
The problem arises due to two main requirements of the Covalent framework:
Electrons should be capable of returning any and all Python objects.
Lattice should act as a robust compiler for Electrons, without needing users to specify the number of output objects.
As a result, uncomputed operations on Electrons, such as sum([electron_output1, electron_output2, ...]) or electron_output['data'], must also be Electrons. This is because performing operations like summation on arbitrary Python objects after computing the electron_output
futu…