Regression from breaking change in jsonschema v4.10.0 #38

Open
matthewfeickert opened this issue Feb 8, 2023 · 2 comments · May be fixed by #46

@matthewfeickert
Member

There is a breaking change in jsonschema v4.10.0 that is causing a regression with yadage-schemas v0.10.7. Running the examples from https://gitlab.cern.ch/recast-atlas/examples/helloworld with the local backend, without first pinning jsonschema via

python -m pip install --upgrade 'recast-atlas[local]' 'jsonschema<=4.9.1'

results in the following error with yadage-schemas v0.10.7:

2023-02-08 02:23:59,617 | packtivity.asyncback |   INFO | configured pool size to 16
2023-02-08 02:23:59,649 | recastatlas.subcomma |  ERROR | caught exception
Traceback (most recent call last):
  File "/home/feickert/.pyenv/versions/recast-helloworld-dev/lib/python3.10/site-packages/recastatlas/backends/local.py", line 21, in run_workflow
    run_workflow(**spec)
  File "/home/feickert/.pyenv/versions/recast-helloworld-dev/lib/python3.10/site-packages/yadage/steering_api.py", line 19, in run_workflow
    with steering_ctx(*args, **kwargs):
  File "/home/feickert/.pyenv/versions/3.10.4/lib/python3.10/contextlib.py", line 135, in __enter__
    return next(self.gen)
  File "/home/feickert/.pyenv/versions/recast-helloworld-dev/lib/python3.10/site-packages/yadage/steering_api.py", line 89, in steering_ctx
    ys = YadageSteering.create(
  File "/home/feickert/.pyenv/versions/recast-helloworld-dev/lib/python3.10/site-packages/yadage/steering_object.py", line 65, in create
    ctrl = creators["local"](**kw)
  File "/home/feickert/.pyenv/versions/recast-helloworld-dev/lib/python3.10/site-packages/yadage/creators.py", line 58, in local_workflows
    workflow_json = workflow_loader.workflow(
  File "/home/feickert/.pyenv/versions/recast-helloworld-dev/lib/python3.10/site-packages/yadage/workflow_loader.py", line 29, in workflow
    data = yadageschemas.load(
  File "/home/feickert/.pyenv/versions/recast-helloworld-dev/lib/python3.10/site-packages/yadageschemas/__init__.py", line 13, in load
    validate_spec(data,validopts)
  File "/home/feickert/.pyenv/versions/recast-helloworld-dev/lib/python3.10/site-packages/yadageschemas/validator.py", line 13, in validate_spec
    return validator(**validopts).validate(data)
  File "/home/feickert/.pyenv/versions/recast-helloworld-dev/lib/python3.10/site-packages/jsonschema/validators.py", line 283, in validate
    raise error
jsonschema.exceptions.ValidationError: ['init'] is not of type 'object'

Failed validating 'type' in schema['properties']['stages']['items']['properties']['dependencies']:
    {'default': {'dependency_type': 'jsonpath_ready', 'expressions': []},
     'oneOf': [{'$ref': 'predicates/jsonpathready-schema.json#'},
               {'$ref': 'predicates/exprfulfilled-schema.json#'}],
     'type': 'object'}

On instance['stages'][0]['dependencies']:
    ['init']

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/feickert/.pyenv/versions/recast-helloworld-dev/lib/python3.10/site-packages/recastatlas/subcommands/run.py", line 56, in run
    run_sync(name, spec, backend=backend)
  File "/home/feickert/.pyenv/versions/recast-helloworld-dev/lib/python3.10/site-packages/recastatlas/backends/__init__.py", line 77, in run_sync
    BACKENDS[backend].run_workflow(name, spec)
  File "/home/feickert/.pyenv/versions/recast-helloworld-dev/lib/python3.10/site-packages/recastatlas/backends/local.py", line 23, in run_workflow
    raise FailedRunException
recastatlas.exceptions.FailedRunException
Error: Workflow failed
Exception ignored in: <function Pool.__del__ at 0x7f5dbd4064d0>
Traceback (most recent call last):
  File "/home/feickert/.pyenv/versions/3.10.4/lib/python3.10/multiprocessing/pool.py", line 268, in __del__
    self._change_notifier.put(None)
  File "/home/feickert/.pyenv/versions/3.10.4/lib/python3.10/multiprocessing/queues.py", line 378, in put
    self._writer.send_bytes(obj)
  File "/home/feickert/.pyenv/versions/3.10.4/lib/python3.10/multiprocessing/connection.py", line 205, in send_bytes
    self._send_bytes(m[offset:offset + size])
  File "/home/feickert/.pyenv/versions/3.10.4/lib/python3.10/multiprocessing/connection.py", line 416, in _send_bytes
    self._send(header + buf)
  File "/home/feickert/.pyenv/versions/3.10.4/lib/python3.10/multiprocessing/connection.py", line 373, in _send
    n = write(self._handle, buf)
OSError: [Errno 9] Bad file descriptor
@matthewfeickert matthewfeickert self-assigned this Feb 8, 2023
matthewfeickert added a commit that referenced this issue Feb 8, 2023
* Add upper bound of 'jsonschema<=4.9.1' as an emergency temporary workaround.
   - c.f. #38
* This should not be done, and should be viewed as temporary only for release v0.10.8.
@mdonadoni

Hi @matthewfeickert ,

We have also encountered this issue when trying to upgrade jsonschema to a more recent version than the one we are using, which is needed in particular for reanahub/reana-commons#461.

We would still need to update REANA's dependencies to be up-to-date with yadage/adage/yadage-schemas to benefit from this, and this has proven to be somewhat challenging in the past. However, we are currently in the middle of doing some major upgrades (adding compatibility for Python 3.12, dropping support for Python 3.7/3.8, moving to Ubuntu 24.04), so it might be a good occasion to also tackle the yadage update if we have time to do so.

In any case, I just wanted to leave some of the findings I made while investigating this issue in case they can be helpful.

There is this bit of code that is supposed to modify the default validator to "massage" the JSON input, which should avoid this issue, but it no longer seems to take effect.

if schema.get('title',None)=='Yadage Stage':
    if 'dependencies' in instance and type(instance['dependencies'])==list:
        log.debug('dependencies provided as list, assume jsonpath_ready predicate')
        instance['dependencies'] = {
            "dependency_type" : "jsonpath_ready",
            "expressions": instance["dependencies"]
        }

This code was first added in 0b85a48.
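
To make the intended rewrite concrete, here is a minimal standalone sketch of the same transformation applied to a plain dict before any validation happens. It is not the yadage-schemas implementation: the function name `normalize_dependencies` and the stage name `hello_world` are made up for illustration, and it only looks at top-level `stages` (nested subworkflows are ignored).

```python
import copy


def normalize_dependencies(spec):
    """Return a copy of spec with list-style stage dependencies expanded.

    Hypothetical helper: mirrors the snippet above, but runs as a plain
    pre-processing step instead of inside a jsonschema validator hook.
    """
    spec = copy.deepcopy(spec)  # leave the caller's data untouched
    for stage in spec.get("stages", []):
        deps = stage.get("dependencies")
        if isinstance(deps, list):
            # A bare list is shorthand for a jsonpath_ready predicate.
            stage["dependencies"] = {
                "dependency_type": "jsonpath_ready",
                "expressions": deps,
            }
    return spec


# Same shape as the failing instance in the traceback: dependencies == ['init'].
workflow = {"stages": [{"name": "hello_world", "dependencies": ["init"]}]}
normalized = normalize_dependencies(workflow)
print(normalized["stages"][0]["dependencies"])
```

With the list already expanded into an object, the `'type': 'object'` check reported in the traceback would no longer see `['init']`, regardless of which validator class ends up handling the subschema.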

Checking the changelog of jsonschema 4.10, there is only one item:

Add support for referencing schemas with $ref across different versions of the specification than the referrer's

So it might be that yadage's custom validator is somehow swapped back for the default one when a $ref is encountered in the JSON schema?

There is also the option of updating the schemas to allow the dependencies property to be either an object or an array, but this might cause more incompatibilities with other tools that depend on yadage.
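
For illustration only, the schema-relaxing option might look roughly like the sketch below, written as a Python dict that mirrors the `dependencies` subschema printed in the traceback. The added array branch and its `items` constraint are assumptions on my side, not a tested change and not necessarily what #46 does.

```python
import json

# The 'dependencies' subschema as printed in the validation error above.
current = {
    "default": {"dependency_type": "jsonpath_ready", "expressions": []},
    "oneOf": [
        {"$ref": "predicates/jsonpathready-schema.json#"},
        {"$ref": "predicates/exprfulfilled-schema.json#"},
    ],
    "type": "object",
}

# Hypothetical relaxed variant: also accept the list shorthand.
# Assumes the two referenced predicate schemas reject arrays on their own,
# otherwise 'oneOf' could match more than one branch and fail.
relaxed = dict(current)
relaxed["type"] = ["object", "array"]
relaxed["oneOf"] = current["oneOf"] + [
    {"type": "array", "items": {"type": "string"}},  # assumed item type
]

print(json.dumps(relaxed, indent=2))
```

Any tool that consumes validated yadage specs would then have to cope with both forms of `dependencies`, which is the incompatibility risk mentioned above.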

@tiborsimko

Hi @matthewfeickert

We have now hit the same issue while customising the Kubernetes operator for Dask workflows. It would be great to solve this soon, if possible.
