fix(otlp): Write protobuf status on error #15097

Merged
merged 2 commits into from
Nov 26, 2024

Conversation

@salvacorts salvacorts commented Nov 25, 2024

What this PR does / why we need it:

The OTLP spec states that:

If the processing of the request fails, the server MUST respond with appropriate HTTP 4xx or HTTP 5xx status code.
The response body for all HTTP 4xx and HTTP 5xx responses MUST be a Protobuf-encoded Status message that describes the problem.
This specification does not use Status.code field and the server MAY omit Status.code field.
The clients are not expected to alter their behavior based on Status.code field but MAY record it for troubleshooting purposes.
The Status.message field SHOULD contain a developer-facing error message as defined in Status message schema.

Loki currently writes the error to the response body as a plain string, so OTLP clients cannot decode it and the message is lost. This PR writes the error as a Protobuf-encoded Status instead.

We also ditch the otlp error interceptor and put its logic inside the OTELError writer.
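For readers who want to see the shape of the behavior described above, here is a minimal sketch. It is not the code from this PR. The names `encodeStatus`, `retryableStatus`, and `writeOTLPError` are hypothetical, and the Status message is hand-encoded with the standard library only so that the example has no external dependencies. Real code would presumably marshal a generated Status type with a protobuf library. The 500 to 503 mapping follows the comment on the removed interceptor.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// encodeStatus hand-encodes a google.rpc.Status that carries only the
// message field (field number 2, length-delimited). Status.code is left
// out, which the OTLP spec quoted above permits. This is an illustrative
// stand-in for marshalling the generated Status type.
func encodeStatus(msg string) []byte {
	var lenBuf [binary.MaxVarintLen64]byte
	n := binary.PutUvarint(lenBuf[:], uint64(len(msg)))

	out := make([]byte, 0, 1+n+len(msg))
	out = append(out, 0x12) // tag byte: (field 2 << 3) | wire type 2
	out = append(out, lenBuf[:n]...)
	return append(out, msg...)
}

// retryableStatus maps 500 to 503 so that OTLP clients retry the request;
// every other status code is returned unchanged.
func retryableStatus(code int) int {
	if code == http.StatusInternalServerError {
		return http.StatusServiceUnavailable
	}
	return code
}

// writeOTLPError (hypothetical helper) writes the error as a
// Protobuf-encoded Status body with the matching content type.
func writeOTLPError(w http.ResponseWriter, code int, err error) {
	w.Header().Set("Content-Type", "application/x-protobuf")
	w.WriteHeader(retryableStatus(code))
	_, _ = w.Write(encodeStatus(err.Error()))
}

func main() {
	rec := httptest.NewRecorder()
	writeOTLPError(rec, http.StatusBadRequest, errors.New("max entry size exceeded"))
	fmt.Println(rec.Code, rec.Header().Get("Content-Type"), rec.Body.Len())
}
```

Keeping the status mapping and the body encoding in one writer means every error path produces a response the client can decode, which is the point of folding the interceptor logic into the writer.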

Without the change, the error message is lost:

2024-11-25T10:48:42.662805076Z 2024-11-25T10:48:42.652Z	error	internal/queue_sender.go:92	Exporting failed. Dropping data.	{"service": "opentelemetry-collector", "kind": "exporter", "data_type": "logs", "name": "otlphttp/loki", "error": "not retryable error: Permanent error: rpc error: code = InvalidArgument desc = error exporting items, request to http://loki:3100/otlp/v1/logs responded with HTTP Status Code 400", "dropped_items": 1456}

With the change, the error message is displayed (tested with max_line_size: 1B):

2024-11-25T10:44:00.295481650Z 2024-11-25T10:44:00.281Z	error	internal/queue_sender.go:92	Exporting failed. Dropping data.	{"service": "opentelemetry-collector", "kind": "exporter", "data_type": "logs", "name": "otlphttp/loki", "error": "not retryable error: Permanent error: rpc error: code = InvalidArgument desc = error exporting items, request to http://loki:3100/otlp/v1/logs responded with HTTP Status Code 400, Message=2 errors like: Max entry size '1' bytes exceeded for stream '{service_name=\"unknown_service\"}' while adding an entry with length '104' bytes; 2 errors like: Max entry size '1' bytes exceeded for stream '{service_name=\"unknown_service\"}' while adding an entry with length '107' bytes; 2 errors like: Max entry size '1' bytes exceeded for stream '{service_name=\"unknown_service\"}' while adding an entry with length '156' bytes; 1 errors like: Max entry size '1' bytes exceeded for stream '{service_name=\"unknown_service\"}' while adding an entry with length '130' bytes; 52 errors like: Max entry size '1' bytes exceeded for stream '{service_name=\"unknown_service\"}' while adding an entry with length '213' bytes; 1 errors like: Max entry size '1' bytes exceeded for stream '{service_name=\"unknown_service\"}' while adding an entry with length '103' bytes; 1 errors like: Max entry size '1' bytes exceeded for stream '{service_name=\"unknown_service\"}' while adding an entry with length '140' bytes; 3 errors like: Max entry size '1' bytes exceeded for stream '{service_name=\"unknown_service\"}' while adding an entry with length '108' bytes; 2 errors like: Max entry size '1' bytes exceeded for stream '{service_name=\"unknown_service\"}' while adding an entry with length '139' bytes; 109 errors like: Max entry size '1' bytes exceeded for stream '{service_name=\"unknown_service\"}' while adding an entry with length '217' bytes; 1 errors like: Max entry size '1' bytes exceeded for stream '{service_name=\"unknown_service\"}' while adding an entry with length 
'150' bytes; 2 errors like: Max entry size '1' bytes exceeded for stream '{service_name=\"unknown_service\"}' while adding an entry with length '119' bytes; 3 errors like: Max entry size '1' bytes exceeded for stream '{service_name=\"unknown_service\"}' while adding an entry with length '117' bytes; 2 errors like: Max entry size '1' bytes exceeded for stream '{service_name=\"unknown_service\"}' while adding an entry with length '152' bytes; 2 errors like: Max entry size '1' bytes exceeded for stream '{service_name=\"unknown_service\"}' while adding an entry with length '173' bytes; 1 errors like: Max entry size '1' bytes exceeded for stream '{service_name=\"unknown_service\"}' while adding an entry with length '206' bytes; 2 errors like: Max entry size '1' bytes exceeded for stream '{service_name=\"unknown_service\"}' while adding an entry with length '215' bytes; 1 errors like: Max entry size '1' bytes exceeded for stream '{service_name=\"unknown_service\"}' while adding an entry with length '123' bytes; 4 errors like: Max entry size '1' bytes exceeded for stream '{service_name=\"unknown_service\"}' while adding an entry with length '168' bytes; 1 errors like: Max entry size '1' bytes exceeded for stream '{service_name=\"unknown_service\"}' while adding an entry with length '118' bytes; 10 errors like: Max entry size '1' bytes exceeded for stream '{service_name=\"unknown_service\"}' while adding an entry with length '131' bytes; 3 errors like: Max entry size '1' bytes exceeded for stream '{service_name=\"unknown_service\"}' while adding an entry with length '112' bytes; 206 errors like: Max entry size '1' bytes exceeded for stream '{service_name=\"unknown_service\"}' while adding an entry with length '110' bytes; 4 errors like: Max entry size '1' bytes exceeded for stream '{service_name=\"unknown_service\"}' while adding an entry with length '219' bytes; 3 errors like: Max entry size '1' bytes exceeded for stream '{service_name=\"unknown_service\"}' while adding 
an entry with length '220' bytes; 4 errors like: Max entry size '1' bytes exceeded for stream '{service_name=\"unknown_service\"}' while adding an entry with length '207' bytes; 1 errors like: Max entry size '1' bytes exceeded for stream '{service_name=\"unknown_service\"}' while adding an entry with length '124' bytes; 1 errors like: Max entry size '1' bytes exceeded for stream '{service_name=\"unknown_service\"}' while adding an entry with length '175' bytes; 4 errors like: Max entry size '1' bytes exceeded for stream '{service_name=\"unknown_service\"}' while adding an entry with length '113' bytes; 30 errors like: Max entry size '1' bytes exceeded for stream '{service_name=\"unknown_service\"}' while adding an entry with length '208' bytes; 24 errors like: Max entry size '1' bytes exceeded for stream '{service_name=\"unknown_service\"}' while adding an entry with length '109' bytes; 11 errors like: Max entry size '1' bytes exceeded for stream '{service_name=\"unknown_service\"}' while adding an entry with length '216' bytes; 131 errors like: Max entry size '1' bytes exceeded for stream '{service_name=\"unknown_service\"}' while adding an entry with length '133' bytes; 1 errors like: Max entry size '1' bytes exceeded for stream '{service_name=\"unknown_service\"}' while adding an entry with length '149' bytes; 1 errors like: Max entry size '1' bytes exceeded for stream '{service_name=\"unknown_service\"}' while adding an entry with length '157' bytes; 303 errors like: Max entry size '1' bytes exceeded for stream '{service_name=\"unknown_service\"}' while adding an entry with length '211' bytes; 1 errors like: Max entry size '1' bytes exceeded for stream '{service_name=\"unknown_service\"}' while adding an entry with length '148' bytes; 1 errors like: Max entry size '1' bytes exceeded for stream '{service_name=\"unknown_service\"}' while adding an entry with length '171' bytes; 1 errors like: Max entry size '1' bytes exceeded for stream 
'{service_name=\"unknown_service\"}' while adding an entry with length '314' bytes; 4 errors like: Max entry size '1' bytes exceeded for stream '{service_name=\"unknown_service\"}' while adding an entry with length '111' bytes; 4 errors like: Max entry size '1' bytes exceeded for stream '{service_name=\"unknown_service\"}' while adding an entry with length '106' bytes; 3 errors like: Max entry size '1' bytes exceeded for stream '{service_name=\"unknown_service\"}' while adding an entry with length '127' bytes; 3 errors like: Max entry size '1' bytes exceeded for stream '{service_name=\"unknown_service\"}' while adding an entry with length '136' bytes; 1 errors like: Max entry size '1' bytes exceeded for stream '{service_name=\"unknown_service\"}' while adding an entry with length '169' bytes; 2 errors like: Max entry size '1' bytes exceeded for stream '{service_name=\"unknown_service\"}' while adding an entry with length '115' bytes; 2 errors like: Max entry size '1' bytes exceeded for stream '{service_name=\"unknown_service\"}' while adding an entry with length '137' bytes; 85 errors like: Max entry size '1' bytes exceeded for stream '{service_name=\"unknown_service\"}' while adding an entry with length '132' bytes; 1 errors like: Max entry size '1' bytes exceeded for stream '{service_name=\"unknown_service\"}' while adding an entry with length '96' bytes; 2 errors like: Max entry size '1' bytes exceeded for stream '{service_name=\"unknown_service\"}' while adding an entry with length '159' bytes; 1 errors like: Max entry size '1' bytes exceeded for stream '{service_name=\"unknown_service\"}' while adding an entry with length '146' bytes; 1 errors like: Max entry size '1' bytes exceeded for stream '{service_name=\"unknown_service\"}' while adding an entry with length '125' bytes; 39 errors like: Max entry size '1' bytes exceeded for stream '{service_name=\"unknown_service\"}' while adding an entry with length '218' bytes; 1 errors like: Max entry size '1' bytes 
exceeded for stream '{service_name=\"unknown_service\"}' while adding an entry with length '101' bytes; 5 errors like: Max entry size '1' bytes exceeded for stream '{service_name=\"unknown_service\"}' while adding an entry with length '114' bytes; 1 errors like: Max entry size '1' bytes exceeded for stream '{service_name=\"unknown_service\"}' while adding an entry with length '100' bytes; 1 errors like: Max entry size '1' bytes exceeded for stream '{service_name=\"unknown_service\"}' while adding an entry with length '214' bytes; 1 errors like: Max entry size '1' bytes exceeded for stream '{service_name=\"unknown_service\"}' while adding an entry with length '178' bytes; 2 errors like: Max entry size '1' bytes exceeded for stream '{service_name=\"unknown_service\"}' while adding an entry with length '129' bytes; 2 errors like: Max entry size '1' bytes exceeded for stream '{service_name=\"unknown_service\"}' while adding an entry with length '116' bytes; 309 errors like: Max entry size '1' bytes exceeded for stream '{service_name=\"unknown_service\"}' while adding an entry with length '209' bytes; 5 errors like: Max entry size '1' bytes exceeded for stream '{service_name=\"unknown_service\"}' while adding an entry with length '134' bytes; 1 errors like: Max entry size '1' bytes exceeded for stream '{service_name=\"unknown_service\"}' while adding an entry with length '135' bytes; 1 errors like: Max entry size '1' bytes exceeded for stream '{service_name=\"unknown_service\"}' while adding an entry with length '183' bytes; 3 errors like: Max entry size '1' bytes exceeded for stream '{service_name=\"unknown_service\"}' while adding an entry with length '120' bytes; 7 errors like: Max entry size '1' bytes exceeded for stream '{service_name=\"unknown_service\"}' while adding an entry with length '212' bytes; 3 errors like: Max entry size '1' bytes exceeded for stream '{service_name=\"unknown_service\"}' while adding an entry with length '126' bytes; 29 errors like: Max 
entry size '1' bytes exceeded for stream '{service_name=\"unknown_service\"}' while adding an entry with length '210' bytes; 3 errors like: Max entry size '1' bytes exceeded for stream '{service_name=\"unknown_service\"}' while adding an entry with length '121' bytes, Details=[]", "dropped_items": 1456}

Special notes for your reviewer:

Checklist

  • Reviewed the CONTRIBUTING.md guide (required)
  • Documentation added
  • Tests updated
  • Title matches the required conventional commits format, see here
    • Note that Promtail is considered to be feature complete, and future development for logs collection will be in Grafana Alloy. As such, feat PRs are unlikely to be accepted unless a case can be made for the feature actually being a bug fix to existing behavior.
  • Changes that require user attention or interaction to upgrade are documented in docs/sources/setup/upgrade/_index.md
  • If the change is deprecating or removing a configuration option, update the deprecated-config.yaml and deleted-config.yaml files respectively in the tools/deprecated-config-checker directory. Example PR

@salvacorts salvacorts force-pushed the salvacorts/otlp-return-status-error branch from 11c1813 to a12cf8c Compare November 25, 2024 10:51
}

// otelErrorHeaderInterceptor maps 500 errors to 503.
// According to the OTLP specification, 500 errors are never retried on the client side, but 503 are.
type otelErrorHeaderInterceptor struct {
Contributor Author

Logic moved into new OTLPError writer

@salvacorts salvacorts marked this pull request as ready for review November 25, 2024 10:56
@salvacorts salvacorts requested a review from a team as a code owner November 25, 2024 10:56
@@ -16,17 +16,19 @@ import (

"github.com/dustin/go-humanize"
"github.com/go-kit/log"
"github.com/gogo/protobuf/proto"
Contributor

Can this be gogo proto? I am not sure how different this is from the Google version.

Contributor Author

Unfortunately not. I tried, but it panics.

Contributor

@sandeepsukhani sandeepsukhani left a comment

LGTM

@salvacorts salvacorts merged commit 63a2442 into main Nov 26, 2024
61 checks passed
@salvacorts salvacorts deleted the salvacorts/otlp-return-status-error branch November 26, 2024 08:04
@salvacorts salvacorts added type/bug Somehing is not working as expected backport k230 labels Nov 26, 2024
loki-gh-app bot pushed a commit that referenced this pull request Nov 26, 2024
@loki-gh-app
Contributor

loki-gh-app bot commented Nov 26, 2024

The backport to k229 failed:

The process '/usr/bin/git' failed with exit code 1

To backport manually, run these commands in your terminal:

# Fetch latest updates from GitHub
git fetch
# Create a new branch
git switch --create backport-15097-to-k229 origin/k229
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x 63a2442191751e32aaafd6227e1602dfa3a95caa

When the conflicts are resolved, stage and commit the changes:

git add . && git cherry-pick --continue

If you have the GitHub CLI installed:

# Push the branch to GitHub:
git push --set-upstream origin backport-15097-to-k229
# Create the PR body template
PR_BODY=$(gh pr view 15097 --json body --template 'Backport 63a2442191751e32aaafd6227e1602dfa3a95caa from #15097{{ "\n\n---\n\n" }}{{ index . "body" }}')
# Create the PR on GitHub
echo "${PR_BODY}" | gh pr create --title 'fix(otlp): Write protobuf status on error (backport k229)' --body-file - --label 'size/L' --label 'type/bug' --label 'backport' --base k229 --milestone k229 --web

Or, if you don't have the GitHub CLI installed (we recommend you install it!):

# Push the branch to GitHub:
git push --set-upstream origin backport-15097-to-k229

# Create a pull request where the `base` branch is `k229` and the `compare`/`head` branch is `backport-15097-to-k229`.

# Remove the local backport branch
git switch main
git branch -D backport-15097-to-k229

salvacorts added a commit that referenced this pull request Nov 26, 2024
@salvacorts salvacorts mentioned this pull request Nov 26, 2024
6 tasks
3 participants