fix(new_relic sink): Do not quote paths containing periods for the event API #21323

bruceg · 2024-09-19T15:00:08Z

This could not be accomplished the same way as #21305 since the event API cannot handle nested JSON data. Instead, a new `LogEvent::convert_to_fields_unquoted` method was added to produce the flattened data without quoting.
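To make the difference between quoted and unquoted flattening concrete, here is a minimal, self-contained sketch. It is not Vector's implementation: the `Value` enum, `is_plain_segment`, `to_fields_quoted`, and `to_fields_unquoted` are hypothetical names, and the set of characters treated as "safe" is an assumption made only for this illustration.

```rust
use std::collections::BTreeMap;

/// A tiny stand-in for a JSON-like value (hypothetical; not Vector's `Value`).
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Str(String),
    Int(i64),
    Object(BTreeMap<String, Value>),
    Array(Vec<Value>),
}

/// Assumption for this sketch: a key needs no quoting when it consists
/// only of ASCII alphanumerics, `_`, or `@`.
fn is_plain_segment(key: &str) -> bool {
    !key.is_empty()
        && key
            .chars()
            .all(|c| c.is_ascii_alphanumeric() || c == '_' || c == '@')
}

/// Recursively walks `value`, pushing `(path, leaf)` pairs into `out`.
/// Empty objects and arrays produce no entries in this sketch.
fn flatten(value: &Value, prefix: String, quote: bool, out: &mut Vec<(String, Value)>) {
    match value {
        Value::Object(map) => {
            // BTreeMap iterates in key order, so paths come out sorted.
            for (key, child) in map {
                let segment = if quote && !is_plain_segment(key) {
                    format!("\"{}\"", key)
                } else {
                    key.clone()
                };
                let path = if prefix.is_empty() {
                    segment
                } else {
                    format!("{}.{}", prefix, segment)
                };
                flatten(child, path, quote, out);
            }
        }
        Value::Array(items) => {
            for (index, child) in items.iter().enumerate() {
                flatten(child, format!("{}[{}]", prefix, index), quote, out);
            }
        }
        leaf => out.push((prefix, leaf.clone())),
    }
}

/// Flattened fields with special keys wrapped in double quotes.
fn to_fields_quoted(root: &Value) -> Vec<(String, Value)> {
    let mut out = Vec::new();
    flatten(root, String::new(), true, &mut out);
    out
}

/// Flattened fields with keys emitted verbatim, periods included.
fn to_fields_unquoted(root: &Value) -> Vec<(String, Value)> {
    let mut out = Vec::new();
    flatten(root, String::new(), false, &mut out);
    out
}
```

For an object with a top-level key `a.b` holding `{ "c": 1 }`, the quoted variant yields the path `"a.b".c`, while the unquoted variant yields `a.b.c`, which is the flat attribute-name shape this PR sends to the event API.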

pront · 2024-09-19T15:52:56Z

lib/vector-core/src/event/util/log/all_fields.rs

@@ -10,14 +10,14 @@ static IS_VALID_PATH_SEGMENT: Lazy<Regex> = Lazy::new(|| Regex::new(r"^[a-zA-Z0-

 /// Iterates over all paths in form `a.b[0].c[1]` in alphabetical order
 /// and their corresponding values.
-pub fn all_fields(fields: &ObjectMap) -> FieldsIter {
-    FieldsIter::new(fields)
+pub fn all_fields(fields: &ObjectMap, quote_periods: bool) -> FieldsIter {


I don't like this change. This iterator now returns parsable paths which is the desirable behavior.

Are you looking for the old behavior of this iterator? Then I would add all_fields_unquoted, similar to all_fields_non_object_root.

With quote_periods = true, the current behavior of the iterator is preserved, as demonstrated by the tests. It only changes the behavior when that parameter is false. I don't want the old behavior where periods are escaped either, as that too caused problems in the new_relic sink.

FWIW I am not opposed to adding a second all_fields_unquoted function as described, as that is what I originally had coded. I used a parameter instead since the two functions differ just in the value of the parameter, and all_fields is actually crate-local and so the change is entirely contained within vector-core.

Plain bool parameters in utils can be an anti-pattern. Also, eventually all these iterators should return OwnedTargetPaths. Are the paths returned by this new iterator parsable with parse_target_path?

I have split up the all_fields function. No, the unquoted paths are no longer parsable. Technically, I don't think the current paths are all parsable either if they contain quotes, since those are not escaped.
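The two points in this thread (named functions instead of a bare bool, and unquoted paths not being parsable) can be sketched together. This is only an illustration: `join_segments`, `join_quoted`, and `join_unquoted` are hypothetical helpers, not functions from vector-core, and the quoting rule here is simplified to "quote a segment if it contains a period".

```rust
/// Private helper that takes the flag; callers never see the bool.
fn join_segments(segments: &[&str], quote_periods: bool) -> String {
    segments
        .iter()
        .map(|segment| {
            if quote_periods && segment.contains('.') {
                // Wrap segments containing a period so the separator stays unambiguous.
                format!("\"{}\"", segment)
            } else {
                segment.to_string()
            }
        })
        .collect::<Vec<String>>()
        .join(".")
}

/// Named entry point: segments with periods are quoted.
pub fn join_quoted(segments: &[&str]) -> String {
    join_segments(segments, true)
}

/// Named entry point: segments are emitted verbatim.
pub fn join_unquoted(segments: &[&str]) -> String {
    join_segments(segments, false)
}
```

With these helpers, `join_unquoted(&["a.b", "c"])` and `join_unquoted(&["a", "b", "c"])` both produce `a.b.c`, so nothing downstream can recover the original segments from the unquoted string. `join_quoted(&["a.b", "c"])` produces `"a.b".c`, which keeps the segment boundary visible, though a segment that itself contains a double quote would still be ambiguous unless it were escaped.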

datadog-vectordotdev · 2024-09-19T16:09:58Z

Datadog Report

Branch report: bruceg/OPA-2327-fix-new-relic-event-quoting
Commit report: a03b8bc
Test service: vector

✅ 0 Failed, 7 Passed, 0 Skipped, 25.47s Total Time

pront

👍

github-actions · 2024-09-19T18:23:21Z

Regression Detector Results

Run ID: 489592f2-7b93-45ec-8332-51d7e37c3e99

Baseline: 8238e5a
Comparison: 141ea8c

Performance changes are noted in the perf column of each table:

✅ = significantly better comparison variant performance
❌ = significantly worse comparison variant performance
➖ = no significant change in performance

No significant changes in experiment optimization goals

Confidence level: 90.00%
Effect size tolerance: |Δ mean %| ≥ 5.00%

There were no significant changes in experiment optimization goals at this confidence level and effect size tolerance.

Experiments ignored for regressions

Regressions in experiments with settings containing erratic: true are ignored.

perf	experiment	goal	Δ mean %	Δ mean % CI	links
✅	file_to_blackhole	egress throughput	+14.45	[+6.87, +22.03]

Fine details of change detection per experiment

perf	experiment	goal	Δ mean %	Δ mean % CI
✅	file_to_blackhole	egress throughput	+14.45	[+6.87, +22.03]
➖	http_text_to_http_json	ingress throughput	+3.89	[+3.78, +4.01]
➖	datadog_agent_remap_blackhole_acks	ingress throughput	+2.83	[+2.70, +2.95]
➖	socket_to_socket_blackhole	ingress throughput	+2.82	[+2.74, +2.89]
➖	otlp_http_to_blackhole	ingress throughput	+1.57	[+1.42, +1.72]
➖	syslog_humio_logs	ingress throughput	+1.51	[+1.39, +1.63]
➖	otlp_grpc_to_blackhole	ingress throughput	+1.38	[+1.26, +1.49]
➖	fluent_elasticsearch	ingress throughput	+1.18	[+0.69, +1.68]
➖	datadog_agent_remap_blackhole	ingress throughput	+0.96	[+0.84, +1.07]
➖	syslog_loki	ingress throughput	+0.91	[+0.82, +0.99]
➖	http_to_s3	ingress throughput	+0.49	[+0.21, +0.77]
➖	http_to_http_noack	ingress throughput	+0.10	[+0.04, +0.17]
➖	http_to_http_json	ingress throughput	+0.05	[+0.00, +0.10]
➖	splunk_hec_to_splunk_hec_logs_noack	ingress throughput	+0.01	[-0.08, +0.10]
➖	splunk_hec_to_splunk_hec_logs_acks	ingress throughput	-0.00	[-0.10, +0.10]
➖	splunk_hec_indexer_ack_blackhole	ingress throughput	-0.01	[-0.09, +0.07]
➖	datadog_agent_remap_datadog_logs	ingress throughput	-0.18	[-0.37, +0.00]
➖	syslog_log2metric_splunk_hec_metrics	ingress throughput	-0.24	[-0.35, -0.13]
➖	datadog_agent_remap_datadog_logs_acks	ingress throughput	-0.38	[-0.54, -0.21]
➖	syslog_log2metric_tag_cardinality_limit_blackhole	ingress throughput	-0.40	[-0.49, -0.31]
➖	http_to_http_acks	ingress throughput	-0.55	[-1.77, +0.68]
➖	syslog_splunk_hec_logs	ingress throughput	-0.60	[-0.69, -0.50]
➖	http_elasticsearch	ingress throughput	-2.20	[-2.39, -2.01]
➖	splunk_hec_route_s3	ingress throughput	-2.63	[-2.96, -2.30]
➖	syslog_log2metric_humio_metrics	ingress throughput	-2.65	[-2.76, -2.55]
➖	syslog_regex_logs2metric_ddmetrics	ingress throughput	-2.89	[-3.03, -2.75]

Explanation

A regression test is an A/B test of target performance in a repeatable rig, where "performance" is measured as "comparison variant minus baseline variant" for an optimization goal (e.g., ingress throughput). Due to intrinsic variability in measuring that goal, we can only estimate its mean value for each experiment; we report uncertainty in that value as a 90.00% confidence interval denoted "Δ mean % CI".

For each experiment, we flag a change in performance as a "regression" -- a change worth investigating further -- if all of the following criteria are true:

Its estimated |Δ mean %| ≥ 5.00%, indicating the change is big enough to merit a closer look.
Its 90.00% confidence interval "Δ mean % CI" does not contain zero, indicating that if our statistical model is accurate, there is at least a 90.00% chance there is a difference in performance between baseline and comparison variants.
Its configuration does not mark it "erratic".
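The three criteria above can be written as a single predicate. The following is a rough sketch, not the regression detector's actual code: the `Experiment` struct, its field names, and `is_regression` are hypothetical, and the 5.00% tolerance is taken from this report.

```rust
/// One row of the report (hypothetical struct for illustration).
struct Experiment {
    delta_mean_pct: f64,
    ci_low: f64,
    ci_high: f64,
    erratic: bool,
}

/// Effect size tolerance used in this report: |Δ mean %| ≥ 5.00%.
const EFFECT_SIZE_TOLERANCE: f64 = 5.0;

/// Returns true only when all three criteria hold.
fn is_regression(e: &Experiment) -> bool {
    // 1. The estimated change is large enough to merit a closer look.
    let big_enough = e.delta_mean_pct.abs() >= EFFECT_SIZE_TOLERANCE;
    // 2. The confidence interval does not contain zero.
    let ci_excludes_zero = e.ci_low > 0.0 || e.ci_high < 0.0;
    // 3. The experiment is not marked erratic.
    big_enough && ci_excludes_zero && !e.erratic
}
```

Applied to this report, `file_to_blackhole` (+14.45, CI [+6.87, +22.03]) meets the first two criteria but appears under the experiments ignored as erratic, so it is not flagged. `http_elasticsearch` (-2.20, CI [-2.39, -2.01]) has a confidence interval that excludes zero but falls below the 5.00% tolerance, so it is not flagged either.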

…ent API (vectordotdev#21323) * fix(new_relic sink): Do not quote paths containing periods for the event API This could not be accomplished the same way as vectordotdev#21305 since the event API cannot handle nested JSON data. Instead, a new `LogEvent::convert_to_fields_unquoted` method was added to produce the flattened data without quoting. * Fix and add tests * Drop the extra parameter from `all_fields`

bruceg added type: bug A code related bug. domain: logs Anything related to Vector's log events sink: new_relic Anything `new_relic` sink related labels Sep 19, 2024

bruceg requested a review from a team as a code owner September 19, 2024 15:00

github-actions bot added domain: sinks Anything related to the Vector's sinks domain: core Anything related to core crates i.e. vector-core, core-common, etc labels Sep 19, 2024

Fix and add tests

bc3462f

pront reviewed Sep 19, 2024

Drop the extra parameter from all_fields

befeb90

pront approved these changes Sep 19, 2024

bruceg enabled auto-merge September 19, 2024 16:54

bruceg added this pull request to the merge queue Sep 19, 2024

Merged via the queue into master with commit 141ea8c Sep 19, 2024
80 checks passed

bruceg deleted the bruceg/OPA-2327-fix-new-relic-event-quoting branch September 19, 2024 18:42
