builtins for jwt, some object, bug fix for substr builtin... then started looking at test cases... #25

LouisStAmour · 2020-07-01T20:35:35Z

Added basic io.jwt builtin implementations, added a couple object builtins, found a bug in substr builtin, then started looking at how much work it would be to port test cases using object.replace as an example.

What's actually holding me back from wholeheartedly using wasm here is (a) we don't have enough test cases ported to ensure upstream functionality matches for our builtin ports and (b) I'm seeing different behaviour between the OPA playground and the WASM versions when it comes to referencing non-matching rules in a response hash. Finally (c) I tried to use an optimization level above 0 and the resulting file was far more complicated (with at a guess 24 comparisons when I expected it to need only 2-3) and ignoring that, it did not run correctly...

Ideally we would have a test suite that matches upstream Go code, such that we compile every example at every optimization level, run it, and get the expected results. And after we did all that, I'd start looking at creating inputs via fuzzing and testing both implementations in parallel, with fuzzed function references too. ;-)

Signed-off-by: Louis St-Amour <[email protected]>

…retty or perfect. Signed-off-by: Louis St-Amour <[email protected]>

…from Golang by hand. Signed-off-by: Louis St-Amour <[email protected]>

LouisStAmour · 2020-07-01T21:38:21Z

After all this, I’m tempted to consider just writing C or C++ bindings for OPA’s Go library interface, and then writing Node.js bindings to call the C or C++ and adding a typescript wrapper on top. I’ve done the Node.js bindings for a different C++ library and it’s not that difficult. And then I’d know I’d have relatively low-maintenance-overhead for native compatibility. I’d probably have to do the same to make AssemblyScript bindings that can execute the Go code so that Envoy can access it. Turns out it’s relatively hard to call WASM from WASM, you see... end up having to use a linker and it gets complicated...

Not to say this isn’t a better approach long term, but there’s a lot of Go-isms baked into the standard built in functions so they’re not yet as easily ported to other languages. I do like the idea of extending Rego with new builtins, but it sounds like it might complicate Rego WASM optimization to have functions written in a different languages and called at runtime via JSON-encoded args.

Which is to say I suppose I’m in favour of compiling Rego and making it extensible but I’m not yet sold on WASM runtime bindings as being where we would experiment with that... I’m not sure I can trust WASM until similar compilation completely replaces all other execution of Rego such that we can test its execution using existing Golang tests and with identical, portable built-ins?

patrick-east · 2020-07-01T21:47:24Z

Thanks for working on this!

For the builtin functions be sure to keep an eye on #23 and in particular the list of ones provided by OPA https://github.com/open-policy-agent/opa/blob/master/internal/planner/planner.go#L26 (for OPA v0.21.0 https://github.com/open-policy-agent/opa/blob/v0.21.0/internal/planner/planner.go#L26 ). We do need the JWT ones, and it's unlikely that they'll be implemented in C and baked in anytime soon, but the object builtins and string builtins are much more likely candidates to be implemented over in OPA as part of the WASM binary output.

What's actually holding me back from wholeheartedly using wasm here is (a) we don't have enough test cases ported to ensure upstream functionality matches for our builtin ports

100% agree, and that is primarily the reason this package is still a WIP.

There is a long standing item in the backlog for OPA to split the topdown test cases out into a separate format that can be shared between the OPA golang evaluation testing and the various WASM environments. In the meantime relying on the builtin functions baked into the OPA WASM binary (linked to above) are tested and IMO more likely to give the "correct" behavior as far as matching the OPA runtime.

and (b) I'm seeing different behavior between the OPA playground and the WASM versions when it comes to referencing non-matching rules in a response hash.

Can you elaborate on that one? Seems like potentially a bug that needs to be fixed. Please feel free to file issues either here or on the main OPA github repo.

Finally (c) I tried to use an optimization level above 0 and the resulting file was far more complicated (with at a guess 24 comparisons when I expected it to need only 2-3)

This is likely because the optimizations are done with the intention of taking advantage of the rule indexing in OPA https://www.openpolicyagent.org/docs/latest/policy-performance/#use-indexed-statements Which leads to generating many helper rules with more simple comparisons, at evaluation time with the OPA runtime it will only evaluate the ones relevant for the input which means only those 2-3 (or less) would be evaluated even though the policy now has significantly more defined. The catch here is that the WASM generated binaries do not support indexing like that, so you may not see the same kind of improvements that some of the optimization levels on opa build would provide when compared to using the OPA runtime.

and ignoring that, it did not run correctly...

Please file an issue, performance aside the end result should be the same for a WASM evaluation and a OPA runtime evaluation.

Ideally we would have a test suite that matches upstream Go code, such that we compile every example at every optimization level, run it, and get the expected results. And after we did all that, I'd start looking at creating inputs via fuzzing and testing both implementations in parallel, with fuzzed function references too. ;-)

Big +1, as mentioned above there are some intentions to have that more unified test suite, just a matter of someone spending the time to realize the vision. At some point some of the maintainers will get to it, but in the meantime contributions are very welcome 😄

patrick-east · 2020-07-01T21:51:24Z

😅 missed your latest comment while writing up that last response!

After all this, I’m tempted to consider just writing C or C++ bindings for OPA’s Go library interface, and then writing Node.js bindings to call the C or C++ and adding a typescript wrapper on top. I’ve done the Node.js bindings for a different C++ library and it’s not that difficult. And then I’d know I’d have relatively low-maintenance-overhead for native compatibility. I’d probably have to do the same to make AssemblyScript bindings that can execute the Go code so that Envoy can access it. Turns out it’s relatively hard to call WASM from WASM, you see... end up having to use a linker and it gets complicated...

Not to say this isn’t a better approach long term, but there’s a lot of Go-isms baked into the standard built in functions so they’re not yet as easily ported to other languages. I do like the idea of extending Rego with new builtins, but it sounds like it might complicate Rego WASM optimization to have functions written in a different languages and called at runtime via JSON-encoded args.

Which is to say I suppose I’m in favour of compiling Rego and making it extensible but I’m not yet sold on WASM runtime bindings as being where we would experiment with that... I’m not sure I can trust WASM until similar compilation completely replaces all other execution of Rego such that we can test its execution using existing Golang tests and with identical, portable built-ins?

Keep in mind that any of the builtins can also be implemented in C and baked directly into the OPA generated output. Longer term i'd expect the majority of them to be done that way (barring ones like http.send or like the time* ones).

If you do go down the road of using the C++ bindings I would be curious to see how well it works and if there is anything more we can/should do for OPA to help unlock those use-cases.

LouisStAmour · 2020-07-03T16:39:31Z

@patrick-east Let me know if I should file a bug for this... if this is a supported use-case or if maybe I'm doing something unusual within my OPA?

I imagine one could make a simpler use case that doesn't involve JWT, but this is the one I currently have, so...

My goal originally was to have multiple rules, and evaluate them all, returning a bunch of data for each. But WASM only supports executing one rule (block?) so I've got it configured to trigger "wasm_response" in https://play.openpolicyagent.org/p/RR0eeQIsuH

On the playground, it returns the service it matched as well as a boolean for the path that matched. This way I could avoid parsing the path twice and maybe getting conflicting results between OPA and the parent app. There are alternatives, I could pre-parse the path, for example, and fail if they don't match within OPA. I'm sure there are other approaches I'm not thinking of also.

When I run the code as WASM (see does-not-work.tar.gz) it returns only []. When I run the code and leave out the line that does not match (sending_sms in this case, as in works.tar.gz), it works as expected:

[
  {
    "result": {
      "allow": true,
      "sending_email": true,
      "service": {
        "mode": "testing",
        "team_emails": [
          "[email protected]"
        ],
        "team_phones": [
          "+1-000-000-0000"
        ],
        "id": "00000000-0000-0000-0000-000000000000"
      }
    }
  }
]

I presume it's either hitting an unexpected undefined or null in the Node wrapper or there's a bug in the WASM execution. Not sure which? :)

Note that I'm running off my branch, as in package.json says:

  "dependencies": {
    "@open-policy-agent/opa-wasm": "LouisStAmour/npm-opa-wasm#16ffdd64f09d2db7472c324de108fee3d9344fb3"
  }

It's also worth pointing out that I'm running with my own TypeScript type definitions, though I don't think that would change much:

// Type definitions for @open-policy-agent/opa-wasm 1.1.0
// Project: https://github.com/open-policy-agent/npm-opa-wasm

declare module '@open-policy-agent/opa-wasm' {
    /**
     * LoadedPolicy is a wrapper around a WebAssembly.Instance and WebAssembly.Memory
     * for a compiled Rego policy. There are helpers to run the wasm instance and
     * handle the output from the policy wasm.
     */
    class LoadedPolicy {
        /**
         * Loads and initializes a compiled Rego policy.
         * @param {WebAssembly.WebAssemblyInstantiatedSource} policy
         * @param {WebAssembly.Memory} memory
         */
        constructor(policy: WebAssembly.WebAssemblyInstantiatedSource, memory: WebAssembly.Memory);
        mem: WebAssembly.Memory;
        wasmInstance: WebAssembly.Instance | WebAssembly.WebAssemblyInstantiatedSource;
        dataAddr: number;
        baseHeapPtr: number;
        baseHeapTop: number;
        dataHeapPtr: number;
        dataHeapTop: number;
        /**
         * Evaluates the loaded policy with the given input and
         * return the result set. This should be re-used for multiple evaluations
         * of the same policy with different inputs.
         * @param {object} input
         */
        evaluate(input: object): any;
        /**
         * eval_bool will evaluate the policy and return a boolean answer
         * depending on the return code from the policy evaluation.
         * @deprecated Use `evaluate` instead.
         * @param {object} input
         */
        evalBool(input: object): boolean;
        /**
         * Loads data for use in subsequent evaluations.
         * @param {object} data
         */
        setData(data: object): void;
    }
    /**
     * Takes in either an ArrayBuffer or WebAssembly.Module
     * and will return a LoadedPolicy object which can be used to evaluate
     * the policy.
     * @param {BufferSource | WebAssembly.Module} regoWasm
     */
    export function loadPolicy(regoWasm: ArrayBuffer | ArrayBufferView | WebAssembly.Module): Promise<LoadedPolicy>;
    export { LoadedPolicy };
}

patrick-east · 2020-07-06T22:06:48Z

My goal originally was to have multiple rules, and evaluate them all, returning a bunch of data for each. But WASM only supports executing one rule (block?) so I've got it configured to trigger "wasm_response" in https://play.openpolicyagent.org/p/RR0eeQIsuH

Can you clarify whether this playground example is the correct or error case? I see it includes the sending_sms part, but it seems like it is giving the same result as what you are describing from the WASM binary (undefined for wasm_response). I see the same behavior with opa eval:

{14:59} ~/s/wasm-25 ❯ opa eval -b ./does-not-work.tar.gz -i ./input.json 'data.authz_api.wasm_response'
{}

I presume it's either hitting an unexpected undefined or null in the Node wrapper or there's a bug in the WASM execution. Not sure which? :)

It is an unexpected undefined, but it looks more like a bug in the policy itself. That sms rule is undefined and in turn the response rule is too. When you craft these like entrypoint rules you'll need to be careful about handling undefined values. If you add a default value for them then you'll have an easier time of it.

That being said what would probably be even easier is to use the package so you get the same results as like what the playground gives. Eg: opa build -t wasm -e data.authz_api -b ./bundle/source/or/whatever

LouisStAmour · 2020-07-06T22:10:12Z

Thanks for your reply! I was expecting what the playground gave as the “correct” response but was confused as to how to achieve that using WASM. I really appreciate the pointers, this is why I shared that example instead of trying to explain what I was seeing. All I knew was that the playground seemed to return what I expected and if I left out a line that returned “undefined”, the same was true of the rule I gave...

tbrannam · 2020-10-21T13:54:00Z

Any plans to circle around and push this across the finish line?

srenatus · 2021-07-13T08:50:29Z

Happy to review if someone wants to pick this up. But since it's been inactive for a long time, I'll close this PR. 🧹

LouisStAmour force-pushed the master branch from 833a730 to dea9b95 Compare July 1, 2020 20:43

LouisStAmour added 4 commits July 1, 2020 16:47

Added JWT builtins.

7ac7658

Signed-off-by: Louis St-Amour <[email protected]>

Bugfix for substring builtin.

bdc5334

Signed-off-by: Louis St-Amour <[email protected]>

Added basic implementations of object.union and object.remove - not p…

00302a9

…retty or perfect. Signed-off-by: Louis St-Amour <[email protected]>

Started adding test cases, but wow, it's a lot of work to port these …

16ffdd6

…from Golang by hand. Signed-off-by: Louis St-Amour <[email protected]>

LouisStAmour force-pushed the master branch from dea9b95 to 16ffdd6 Compare July 1, 2020 20:48

srenatus closed this Jul 13, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

builtins for jwt, some object, bug fix for substr builtin... then started looking at test cases... #25

builtins for jwt, some object, bug fix for substr builtin... then started looking at test cases... #25

LouisStAmour commented Jul 1, 2020

LouisStAmour commented Jul 1, 2020 •

edited

Loading

patrick-east commented Jul 1, 2020

patrick-east commented Jul 1, 2020

LouisStAmour commented Jul 3, 2020 •

edited

Loading

patrick-east commented Jul 6, 2020

LouisStAmour commented Jul 6, 2020

tbrannam commented Oct 21, 2020

srenatus commented Jul 13, 2021

builtins for jwt, some object, bug fix for substr builtin... then started looking at test cases... #25

builtins for jwt, some object, bug fix for substr builtin... then started looking at test cases... #25

Conversation

LouisStAmour commented Jul 1, 2020

LouisStAmour commented Jul 1, 2020 • edited Loading

patrick-east commented Jul 1, 2020

patrick-east commented Jul 1, 2020

LouisStAmour commented Jul 3, 2020 • edited Loading

patrick-east commented Jul 6, 2020

LouisStAmour commented Jul 6, 2020

tbrannam commented Oct 21, 2020

srenatus commented Jul 13, 2021

LouisStAmour commented Jul 1, 2020 •

edited

Loading

LouisStAmour commented Jul 3, 2020 •

edited

Loading