Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

builtins for jwt, some object, bug fix for substr builtin... then started looking at test cases... #25

Closed
wants to merge 4 commits into from

Conversation

LouisStAmour
Copy link

Added basic io.jwt builtin implementations, added a couple object builtins, found a bug in substr builtin, then started looking at how much work it would be to port test cases using object.replace as an example.

What's actually holding me back from wholeheartedly using wasm here is (a) we don't have enough test cases ported to ensure upstream functionality matches for our builtin ports and (b) I'm seeing different behaviour between the OPA playground and the WASM versions when it comes to referencing non-matching rules in a response hash. Finally (c) I tried to use an optimization level above 0 and the resulting file was far more complicated (with at a guess 24 comparisons when I expected it to need only 2-3) and ignoring that, it did not run correctly...

Ideally we would have a test suite that matches upstream Go code, such that we compile every example at every optimization level, run it, and get the expected results. And after we did all that, I'd start looking at creating inputs via fuzzing and testing both implementations in parallel, with fuzzed function references too. ;-)

@LouisStAmour
Copy link
Author

LouisStAmour commented Jul 1, 2020

After all this, I’m tempted to consider just writing C or C++ bindings for OPA’s Go library interface, and then writing Node.js bindings to call the C or C++ and adding a typescript wrapper on top. I’ve done the Node.js bindings for a different C++ library and it’s not that difficult. And then I’d know I’d have relatively low-maintenance-overhead for native compatibility. I’d probably have to do the same to make AssemblyScript bindings that can execute the Go code so that Envoy can access it. Turns out it’s relatively hard to call WASM from WASM, you see... end up having to use a linker and it gets complicated...

Not to say this isn’t a better approach long term, but there’s a lot of Go-isms baked into the standard built in functions so they’re not yet as easily ported to other languages. I do like the idea of extending Rego with new builtins, but it sounds like it might complicate Rego WASM optimization to have functions written in a different languages and called at runtime via JSON-encoded args.

Which is to say I suppose I’m in favour of compiling Rego and making it extensible but I’m not yet sold on WASM runtime bindings as being where we would experiment with that... I’m not sure I can trust WASM until similar compilation completely replaces all other execution of Rego such that we can test its execution using existing Golang tests and with identical, portable built-ins?

@patrick-east
Copy link
Contributor

Thanks for working on this!

For the builtin functions be sure to keep an eye on #23 and in particular the list of ones provided by OPA https://github.com/open-policy-agent/opa/blob/master/internal/planner/planner.go#L26 (for OPA v0.21.0 https://github.com/open-policy-agent/opa/blob/v0.21.0/internal/planner/planner.go#L26 ). We do need the JWT ones, and it's unlikely that they'll be implemented in C and baked in anytime soon, but the object builtins and string builtins are much more likely candidates to be implemented over in OPA as part of the WASM binary output.

What's actually holding me back from wholeheartedly using wasm here is (a) we don't have enough test cases ported to ensure upstream functionality matches for our builtin ports

100% agree, and that is primarily the reason this package is still a WIP.

There is a long standing item in the backlog for OPA to split the topdown test cases out into a separate format that can be shared between the OPA golang evaluation testing and the various WASM environments. In the meantime relying on the builtin functions baked into the OPA WASM binary (linked to above) are tested and IMO more likely to give the "correct" behavior as far as matching the OPA runtime.

and (b) I'm seeing different behavior between the OPA playground and the WASM versions when it comes to referencing non-matching rules in a response hash.

Can you elaborate on that one? Seems like potentially a bug that needs to be fixed. Please feel free to file issues either here or on the main OPA github repo.

Finally (c) I tried to use an optimization level above 0 and the resulting file was far more complicated (with at a guess 24 comparisons when I expected it to need only 2-3)

This is likely because the optimizations are done with the intention of taking advantage of the rule indexing in OPA https://www.openpolicyagent.org/docs/latest/policy-performance/#use-indexed-statements Which leads to generating many helper rules with more simple comparisons, at evaluation time with the OPA runtime it will only evaluate the ones relevant for the input which means only those 2-3 (or less) would be evaluated even though the policy now has significantly more defined. The catch here is that the WASM generated binaries do not support indexing like that, so you may not see the same kind of improvements that some of the optimization levels on opa build would provide when compared to using the OPA runtime.

and ignoring that, it did not run correctly...

Please file an issue, performance aside the end result should be the same for a WASM evaluation and a OPA runtime evaluation.

Ideally we would have a test suite that matches upstream Go code, such that we compile every example at every optimization level, run it, and get the expected results. And after we did all that, I'd start looking at creating inputs via fuzzing and testing both implementations in parallel, with fuzzed function references too. ;-)

Big +1, as mentioned above there are some intentions to have that more unified test suite, just a matter of someone spending the time to realize the vision. At some point some of the maintainers will get to it, but in the meantime contributions are very welcome 😄

@patrick-east
Copy link
Contributor

😅 missed your latest comment while writing up that last response!

After all this, I’m tempted to consider just writing C or C++ bindings for OPA’s Go library interface, and then writing Node.js bindings to call the C or C++ and adding a typescript wrapper on top. I’ve done the Node.js bindings for a different C++ library and it’s not that difficult. And then I’d know I’d have relatively low-maintenance-overhead for native compatibility. I’d probably have to do the same to make AssemblyScript bindings that can execute the Go code so that Envoy can access it. Turns out it’s relatively hard to call WASM from WASM, you see... end up having to use a linker and it gets complicated...

Not to say this isn’t a better approach long term, but there’s a lot of Go-isms baked into the standard built in functions so they’re not yet as easily ported to other languages. I do like the idea of extending Rego with new builtins, but it sounds like it might complicate Rego WASM optimization to have functions written in a different languages and called at runtime via JSON-encoded args.

Which is to say I suppose I’m in favour of compiling Rego and making it extensible but I’m not yet sold on WASM runtime bindings as being where we would experiment with that... I’m not sure I can trust WASM until similar compilation completely replaces all other execution of Rego such that we can test its execution using existing Golang tests and with identical, portable built-ins?

Keep in mind that any of the builtins can also be implemented in C and baked directly into the OPA generated output. Longer term i'd expect the majority of them to be done that way (barring ones like http.send or like the time* ones).

If you do go down the road of using the C++ bindings I would be curious to see how well it works and if there is anything more we can/should do for OPA to help unlock those use-cases.

@LouisStAmour
Copy link
Author

LouisStAmour commented Jul 3, 2020

@patrick-east Let me know if I should file a bug for this... if this is a supported use-case or if maybe I'm doing something unusual within my OPA?

I imagine one could make a simpler use case that doesn't involve JWT, but this is the one I currently have, so...

My goal originally was to have multiple rules, and evaluate them all, returning a bunch of data for each. But WASM only supports executing one rule (block?) so I've got it configured to trigger "wasm_response" in https://play.openpolicyagent.org/p/RR0eeQIsuH

On the playground, it returns the service it matched as well as a boolean for the path that matched. This way I could avoid parsing the path twice and maybe getting conflicting results between OPA and the parent app. There are alternatives, I could pre-parse the path, for example, and fail if they don't match within OPA. I'm sure there are other approaches I'm not thinking of also.

When I run the code as WASM (see does-not-work.tar.gz) it returns only []. When I run the code and leave out the line that does not match (sending_sms in this case, as in works.tar.gz), it works as expected:

[
  {
    "result": {
      "allow": true,
      "sending_email": true,
      "service": {
        "mode": "testing",
        "team_emails": [
          "[email protected]"
        ],
        "team_phones": [
          "+1-000-000-0000"
        ],
        "id": "00000000-0000-0000-0000-000000000000"
      }
    }
  }
]

I presume it's either hitting an unexpected undefined or null in the Node wrapper or there's a bug in the WASM execution. Not sure which? :)

Note that I'm running off my branch, as in package.json says:

  "dependencies": {
    "@open-policy-agent/opa-wasm": "LouisStAmour/npm-opa-wasm#16ffdd64f09d2db7472c324de108fee3d9344fb3"
  }

It's also worth pointing out that I'm running with my own TypeScript type definitions, though I don't think that would change much:

// Type definitions for @open-policy-agent/opa-wasm 1.1.0
// Project: https://github.com/open-policy-agent/npm-opa-wasm

declare module '@open-policy-agent/opa-wasm' {
    /**
     * LoadedPolicy is a wrapper around a WebAssembly.Instance and WebAssembly.Memory
     * for a compiled Rego policy. There are helpers to run the wasm instance and
     * handle the output from the policy wasm.
     */
    class LoadedPolicy {
        /**
         * Loads and initializes a compiled Rego policy.
         * @param {WebAssembly.WebAssemblyInstantiatedSource} policy
         * @param {WebAssembly.Memory} memory
         */
        constructor(policy: WebAssembly.WebAssemblyInstantiatedSource, memory: WebAssembly.Memory);
        mem: WebAssembly.Memory;
        wasmInstance: WebAssembly.Instance | WebAssembly.WebAssemblyInstantiatedSource;
        dataAddr: number;
        baseHeapPtr: number;
        baseHeapTop: number;
        dataHeapPtr: number;
        dataHeapTop: number;
        /**
         * Evaluates the loaded policy with the given input and
         * return the result set. This should be re-used for multiple evaluations
         * of the same policy with different inputs.
         * @param {object} input
         */
        evaluate(input: object): any;
        /**
         * eval_bool will evaluate the policy and return a boolean answer
         * depending on the return code from the policy evaluation.
         * @deprecated Use `evaluate` instead.
         * @param {object} input
         */
        evalBool(input: object): boolean;
        /**
         * Loads data for use in subsequent evaluations.
         * @param {object} data
         */
        setData(data: object): void;
    }
    /**
     * Takes in either an ArrayBuffer or WebAssembly.Module
     * and will return a LoadedPolicy object which can be used to evaluate
     * the policy.
     * @param {BufferSource | WebAssembly.Module} regoWasm
     */
    export function loadPolicy(regoWasm: ArrayBuffer | ArrayBufferView | WebAssembly.Module): Promise<LoadedPolicy>;
    export { LoadedPolicy };
}

@patrick-east
Copy link
Contributor

My goal originally was to have multiple rules, and evaluate them all, returning a bunch of data for each. But WASM only supports executing one rule (block?) so I've got it configured to trigger "wasm_response" in https://play.openpolicyagent.org/p/RR0eeQIsuH

Can you clarify whether this playground example is the correct or error case? I see it includes the sending_sms part, but it seems like it is giving the same result as what you are describing from the WASM binary (undefined for wasm_response). I see the same behavior with opa eval:

{14:59} ~/s/wasm-25 ❯ opa eval -b ./does-not-work.tar.gz -i ./input.json 'data.authz_api.wasm_response'
{}

I presume it's either hitting an unexpected undefined or null in the Node wrapper or there's a bug in the WASM execution. Not sure which? :)

It is an unexpected undefined, but it looks more like a bug in the policy itself. That sms rule is undefined and in turn the response rule is too. When you craft these like entrypoint rules you'll need to be careful about handling undefined values. If you add a default value for them then you'll have an easier time of it.

That being said what would probably be even easier is to use the package so you get the same results as like what the playground gives. Eg: opa build -t wasm -e data.authz_api -b ./bundle/source/or/whatever

@LouisStAmour
Copy link
Author

Thanks for your reply! I was expecting what the playground gave as the “correct” response but was confused as to how to achieve that using WASM. I really appreciate the pointers, this is why I shared that example instead of trying to explain what I was seeing. All I knew was that the playground seemed to return what I expected and if I left out a line that returned “undefined”, the same was true of the rule I gave...

@tbrannam
Copy link

Any plans to circle around and push this across the finish line?

@srenatus
Copy link
Contributor

Happy to review if someone wants to pick this up. But since it's been inactive for a long time, I'll close this PR. 🧹

@srenatus srenatus closed this Jul 13, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants