Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

enable direct error handling for bundle plugin trigger method #7143

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

torwunder
Copy link

Why the changes in this PR are needed?

When using opa as a library you may want to directly control when and how opa interacts with the bundle registry.
When switching to manual trigger mode and triggering the the bundle plugin directly you may want to react to errors that occurred when downloading the bundles.
The current capability of using the bundle status requires configuring a listener and wait for it to be triggered which differs from the approach taken f.ex. in the discovery plugin.
By returning the downloader errors directly in the bundle plugin trigger method you can f.ex. check for HTTPError and its status code so to determine if a retry can be successful (status code 5xx) or will yield the same result (4xx).

What are the changes in this PR?

  • creating an error type for the bundle plugin similar to ast.Errors that will collect the different errors and wrap them in a detailed error type carrying the bundle name
  • collecting all downloader errors for bundles in manual trigger mode.
  • return errors in bundle plugin trigger mode or nil if no errors occurred
  • changed the exec_test to also verify the returned errors and not expect zero errors

Notes to assist PR review:

Further comments:

  • the different behaviour of the exec method might be undesired but would IMO be helpful
  • in case the behaviour change is undesired this functionality could be made configurable
  • also the downloader errors for bundles in periodic trigger mode could be returned

Copy link

netlify bot commented Oct 30, 2024

Deploy Preview for openpolicyagent ready!

Name Link
🔨 Latest commit 356764b
🔍 Latest deploy log https://app.netlify.com/sites/openpolicyagent/deploys/6731c3c8d33a7800088d772a
😎 Deploy Preview https://deploy-preview-7143--openpolicyagent.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site configuration.

@torwunder torwunder force-pushed the bundle_plugin_error_return_value branch 3 times, most recently from 201abe0 to c510621 Compare October 31, 2024 13:59
@@ -6522,6 +6522,68 @@ func TestPluginManualTriggerWithTimeout(t *testing.T) {
}
}

func TestPluginManualTriggerWithServerError(t *testing.T) {
Copy link
Author

@torwunder torwunder Oct 31, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test triggers the race condition.
My investigation led to the conclusion that this is triggered by having the bundle with the "periodic" trigger mode in the test.
The downloader.oneShot function will be triggered by the explicitly trigger method and also by the downloader.loop function.
The access to the shared variables (f.ex. etag, client) in Downloader is not synchronized and leads to the race condition.
A possible solution is to only trigger bundle dowloads with "manual" trigger mode in the trigger method.
F.ex. like this (in bundle.Plugin.Trigger)

func (p *Plugin) Trigger(ctx context.Context) error {
[...]
    for name, dl := range p.downloaders {
        downloaders[name] = dl
        p.cfgMtx.RLock()
        triggerMode := p.config.Bundles[name].Trigger
        p.cfgMtx.RUnlock()
        
        if triggerMode != nil && *triggerMode == plugins.TriggerManual {
	        downloaders[name] = dl
        }
    }
[...]

This would also solve the question if Trigger should return errors for bundles with "periodic" trigger mode.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ashutosh-narkar
Do you have a recommendation how to proceed with this?
To make the test green, I could remove the configuration for the bundle in "periodic" trigger mode but I think this might be a problem in the bundle plugin that should be addressed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think having only a manual trigger for the test should be fine.

Copy link
Member

@ashutosh-narkar ashutosh-narkar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the contribution @torwunder! One high level comment I have is about the structure of the error. If we can model it similar to https://github.com/open-policy-agent/opa/blob/main/plugins/bundle/status.go#L64, it would be more informative I think.

@torwunder
Copy link
Author

@ashutosh-narkar I modelled the error similar to the status.
I did not put the collected ast.Error from the ast.Errors as an array into the Err field (in contrast to the bundle status) but rather ast.Errors itself to make it easier to use with single errors like HTTPError
This can ofc. be changed.

return fmt.Sprintf("Bundle name: %s, Code: %s, HTTPCode: %d, Message: %s", e.BundleName, errCode, e.HTTPCode, e.Message)
}

func (e Error) Unwrap() error {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Is there a reason to define this method?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it will allow using errors.As for retrieving the wrapped error. Something like this:

errs := Errors{NewBundleError("example", download.HTTPError{StatusCode: 500})}

var httpError download.HTTPError
errors.As(errs, &httpError) 

if httpError.StatusCode == 500 { ...

There is a test in errors_test.go (TestUnwrap) that will start to fail if we remove the function.

// plugin callback will also log the trigger error and include it in the bundle status
err := d.Trigger(ctx)
// only return errors for TriggerMode manual as periodic bundles will be retried
if err != nil && *p.config.Bundles[name].Trigger == plugins.TriggerManual {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use the Config() method on the plugin to retrieve the config.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ashutosh-narkar do you think it makes sense to only return the errors for bundles in manual trigger mode?
It felt sensible to me as periodic triggermode will be retried by the bundle plugin but might also just create confusion.
it also creates this additional if statement which will then not be tested.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Manual trigger mode can currently only be configured via a library integration. So to keep parity with the discovery plugin, returning errors in the manual mode makes sense.

@@ -6522,6 +6522,68 @@ func TestPluginManualTriggerWithTimeout(t *testing.T) {
}
}

func TestPluginManualTriggerWithServerError(t *testing.T) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think having only a manual trigger for the test should be fine.

@torwunder torwunder force-pushed the bundle_plugin_error_return_value branch 2 times, most recently from 356764b to 2776139 Compare November 12, 2024 08:58
@@ -321,15 +323,32 @@ main contains "hello" if {

ctx, cancel := context.WithCancel(context.Background())
defer cancel()
go func() {
go func(expectedErrors []string) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm trying to understand what we're testing here?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the existing test is verifying that the errors are present in the status,
my goal was to also verify that these errors are also returned directly from the trigger method
handing over the expected error as a parameter became necessary due to a detected race condition as the goroutine will otherwise work on tc.expErrs which is changed concurrently in the outer loop (l. 294)

@ashutosh-narkar
Copy link
Member

@torwunder thanks for addressing the comments. Can you please fix the conflict and squash your changes and we can then get this in? Thanks!

@torwunder torwunder force-pushed the bundle_plugin_error_return_value branch from 2776139 to f9a46e2 Compare November 13, 2024 08:45
…r errors for bundles in manual trigger mode

Signed-off-by: towunderlich <[email protected]>
* model Error similar to bundle status

Signed-off-by: towunderlich <[email protected]>
* only test error handling for bundle in manual trigger mode
* use plugin config method

Signed-off-by: towunderlich <[email protected]>
* enable new test to run in parallel

Signed-off-by: towunderlich <[email protected]>
@torwunder torwunder force-pushed the bundle_plugin_error_return_value branch from f9a46e2 to bd3212c Compare November 13, 2024 08:57
@torwunder
Copy link
Author

@ashutosh-narkar I resolved the conflict and added the parallel setting to the test, in line with the upstream changes.
also thanks for your reviews :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants