We would like to test CWL CommandLineTools that produce an output containing an array of files discovered during execution. This ought to be the CWL equivalent of individual datasets in the Advanced Tool Development Topic on dynamic numbers of outputs, where CWL's internal logic provides the same functionality as the Galaxy discover_datasets element.
All of the documentation and examples of how to make assertions for CWL outputs seem to only treat the case where the output is a single file.
Even though the CommandLineTool doesn't know how many outputs will be made, for every concrete test case we do know what outputs to expect, and can name them explicitly in the assertions.
If we try to use element_tests on a set of expected outputs, Planemo raises a TypeError in verify_elements, which suggests that Planemo isn't converting the array of files into a data collection as galaxy/tool_util expects.
- doc: generate some subsets by sampling
  job: sample_job.yaml
  outputs:
    samples:
      element_tests:
        subset-1.txt:
          asserts: {"has_n_lines": {"n": 100}}
        subset-2.txt:
          asserts: {"has_n_lines": {"n": 100}}
The error is:
  File "lib/python3.13/site-packages/galaxy/tool_util/verify/interactor.py", line 1205, in verify_collection
    verify_elements(data_collection["elements"], output_collection_def.element_tests)
Running Planemo under Pdb reveals that data_collection is an array of CWL objects of class File, not a data collection that verify_collection can consume.
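For illustration, the mismatch can be reproduced outside Planemo in a few lines of Python. This is only a sketch: the dictionaries are made-up stand-ins for the CWL File objects we saw in the debugger (the field values are invented), and it assumes the TypeError comes from the data_collection["elements"] lookup shown in the traceback.

```python
# Hypothetical stand-in for the value passed as `data_collection`:
# a plain list of CWL File objects rather than a collection dict.
# The basenames match our test; the locations are invented.
data_collection = [
    {"class": "File", "basename": "subset-1.txt", "location": "file:///tmp/subset-1.txt"},
    {"class": "File", "basename": "subset-2.txt", "location": "file:///tmp/subset-2.txt"},
]


def lookup_elements(collection):
    """Perform the same key lookup as the line shown in the traceback."""
    return collection["elements"]


try:
    lookup_elements(data_collection)
    error = None
except TypeError as exc:
    # A list cannot be indexed with a string key, so Python raises TypeError.
    error = exc

print(type(error).__name__, error)
```

If the value were a dictionary with an "elements" key, the same lookup would succeed, which is why we suspect a missing conversion step.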
So, in decreasing order of preference, the hope is that one of the following is true:
1. Planemo can in fact make assertions about CWL arrays, but we couldn't find it in the documentation. We would be willing to make a PR to improve the documentation.
2. There is a way in the Planemo test to declare that the array of files is a data collection, or to coerce it into one.
3. Planemo needs to be modified to convert CWL arrays to collections, on which assertions can be expressed. I would need advice about where in the code this should happen before I could say whether we could help.
4. There is a workaround that involves using another representation for the array of files. This could be considered but would be costly, since our CWL CommandLineTools really do return arrays that subsequent steps scatter over. Normally I would be reluctant to change the representation and the pipelines just to satisfy the testing framework.
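To make the third item above more concrete, here is a rough sketch of the kind of conversion we have in mind. It is not Planemo or Galaxy code: cwl_files_to_collection is a hypothetical helper, and the key names ("collection_type", "elements", "element_identifier", "element_index", "element_type", "object") are our assumption about the shape a list collection would need, which we have not verified against galaxy/tool_util.

```python
def cwl_files_to_collection(files):
    """Wrap a list of CWL File objects in a list-collection-like dict.

    Hypothetical helper: the output keys are assumed, not taken from
    the Planemo or Galaxy sources.
    """
    elements = []
    for index, cwl_file in enumerate(files):
        if cwl_file.get("class") != "File":
            raise ValueError(f"expected a CWL File, got {cwl_file.get('class')!r}")
        elements.append(
            {
                "element_index": index,
                # Use the file name as the identifier so that tests can
                # refer to elements as subset-1.txt, subset-2.txt, ...
                "element_identifier": cwl_file["basename"],
                "element_type": "hda",  # assumed marker for a dataset element
                "object": cwl_file,
            }
        )
    return {"collection_type": "list", "elements": elements}


# Usage with invented File objects named after the outputs in our test.
collection = cwl_files_to_collection(
    [
        {"class": "File", "basename": "subset-1.txt"},
        {"class": "File", "basename": "subset-2.txt"},
    ]
)
print([element["element_identifier"] for element in collection["elements"]])
```

Using the basename as the element identifier is what would let element_tests name each expected file explicitly, as in our test above.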
Thanks in advance for any advice you might have.