New Blog Post: Collecting file-based Java logs with OpenTelemetry #5600

Open
wants to merge 34 commits into
base: main
2c4bda8
BlogPost: Collecting file based Java logs with OpenTelemetry
cyrille-leclerc Nov 6, 2024
2d09bca
Update collecting-file-based-java-logs-with-opentelemetry.md
cyrille-leclerc Nov 6, 2024
d38a366
blog post
zeitlinger Nov 11, 2024
e26b392
Update content/en/blog/2024/collecting-file-based-java-logs-with-open…
cyrille-leclerc Nov 12, 2024
e2e759c
Update content/en/blog/2024/collecting-file-based-java-logs-with-open…
cyrille-leclerc Nov 12, 2024
fe9e82c
Update content/en/blog/2024/collecting-file-based-java-logs-with-open…
cyrille-leclerc Nov 12, 2024
df2d0d8
Rename file
cyrille-leclerc Nov 13, 2024
8d22708
cSpell ignore
cyrille-leclerc Nov 13, 2024
9e63f91
WIP
cyrille-leclerc Nov 13, 2024
254c409
Fix issue link
cyrille-leclerc Nov 13, 2024
2a6e869
Update content/en/blog/2024/collecting-file-based-java-logs-with-open…
cyrille-leclerc Nov 15, 2024
69d254d
Update content/en/blog/2024/collecting-file-based-java-logs-with-open…
cyrille-leclerc Nov 15, 2024
74900fc
Apply suggestions from code review
cyrille-leclerc Nov 15, 2024
2520f47
Apply suggestions from code review
cyrille-leclerc Nov 15, 2024
a483250
Apply suggestions from code review
cyrille-leclerc Nov 15, 2024
23be0a1
fix submodules
cyrille-leclerc Nov 15, 2024
54f6d09
Apply suggestions from code review
cyrille-leclerc Nov 15, 2024
67c9361
Update content/en/blog/2024/collecting-file-based-java-logs-with-open…
cyrille-leclerc Nov 15, 2024
1d7ed71
Update content/en/blog/2024/collecting-file-based-java-logs-with-open…
cyrille-leclerc Nov 15, 2024
7d2bc11
wip
cyrille-leclerc Nov 15, 2024
fbe16d6
wip
cyrille-leclerc Nov 15, 2024
1ef7c05
Update content/en/blog/2024/collecting-file-based-java-logs-with-open…
cyrille-leclerc Nov 18, 2024
35e448c
Apply suggestions from code review
cyrille-leclerc Nov 19, 2024
b8f989e
Update content/en/blog/2024/collecting-file-based-java-logs-with-open…
cyrille-leclerc Nov 20, 2024
4175d74
Fix formatting
cyrille-leclerc Nov 20, 2024
e5aee18
Fix formatting
cyrille-leclerc Nov 20, 2024
ee7bfba
Better hyperlink
cyrille-leclerc Nov 20, 2024
15bbe6f
Fix formatting
cyrille-leclerc Nov 20, 2024
0bc6699
Add missing alt on img
cyrille-leclerc Nov 20, 2024
8801600
Results from /fix:refcache
opentelemetrybot Nov 20, 2024
c50d9b0
Apply suggestions from code review
cyrille-leclerc Nov 20, 2024
0f04576
Apply suggestions from code review
cyrille-leclerc Nov 21, 2024
75068dd
Results from /fix:format
opentelemetrybot Nov 21, 2024
3c0354f
Apply suggestions from code review
cyrille-leclerc Nov 21, 2024
@@ -0,0 +1,295 @@
---
title: Collecting OpenTelemetry-compliant Java logs from files
date:
author: >
  [Cyrille Le Clerc](https://github.com/cyrille-leclerc) (Grafana Labs), [Gregor
  Zeitlinger](https://github.com/zeitlinger) (Grafana Labs)
issue: https://github.com/open-telemetry/opentelemetry.io/issues/5606
sig: Java, Specification
# prettier-ignore
cSpell:ignore: Clerc cust Cyrille Dotel Gregor Logback logback otlphttp otlpjson resourcedetection SLF4J stdout Zeitlinger
---

If you want to get logs from your Java application ingested into an
OpenTelemetry-compatible logs backend, the easiest and recommended way is using
an OpenTelemetry protocol (OTLP) exporter. However, some scenarios require logs
to be output to files or stdout due to organizational or reliability needs.

A common approach to centralize logs is to use unstructured logs, parse them
with regular expressions, and add contextual attributes.

However, regular expression parsing is problematic. The expressions quickly
become complex and fragile when they must handle all log fields, line breaks in
exceptions, and unexpected log format changes. Parsing errors are inevitable
with this method.
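
To illustrate the fragility, here is a minimal sketch that assumes a hypothetical text layout (`timestamp level [thread] logger - message`) and a hand-written pattern for it. Neither the layout nor the pattern comes from a real deployment; they only show how stack trace lines fall outside a line-oriented expression.

```java
import java.util.List;
import java.util.regex.Pattern;

public class RegexLogParsing {
    // Hypothetical pattern for lines such as:
    // 2024-11-01 10:03:31 INFO [main] c.m.Checkout - Order placed
    static final Pattern LINE = Pattern.compile(
        "^(\\d{4}-\\d{2}-\\d{2} \\d{2}:\\d{2}:\\d{2}) (\\w+) \\[([^\\]]+)\\] (\\S+) - (.*)$");

    /** Returns true when the whole line matches the expected layout. */
    public static boolean parses(String line) {
        return LINE.matcher(line).matches();
    }

    public static void main(String[] args) {
        List<String> lines = List.of(
            "2024-11-01 10:03:31 INFO [main] c.m.Checkout - Order order-12035 successfully placed",
            "2024-11-01 10:03:32 ERROR [main] c.m.Checkout - Payment failed",
            // The exception and its stack frames do not follow the layout.
            "java.lang.IllegalStateException: card declined",
            "\tat com.mycompany.checkout.Checkout.pay(Checkout.java:42)");
        for (String line : lines) {
            System.out.println((parses(line) ? "parsed:   " : "UNPARSED: ") + line);
        }
    }
}
```

The last two lines are reported as unparsed, so a collector relying on this pattern would need extra multi-line rules to reattach them to the error record.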

## Native solution for Java logs

The OpenTelemetry Java Instrumentation agent and SDK now offer an easy solution
to convert logs from frameworks like SLF4J/Logback or Log4j2 into OTel-compliant
JSON logs on stdout with all resource and log attributes.

This is a true turnkey solution:

- No code or dependency changes, just a few configuration adjustments typical
  for production deployment.
- No complex field mapping in the log collector. Just use the
  [OTLP/JSON connector](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/connector/otlpjsonconnector)
  to ingest the payload.
- Automatic correlation between logs, traces, and metrics.

This blog post shows how to set up this solution step by step.

- In the first part, we'll show how to configure the Java application to output
  logs in the OTLP/JSON format.
- In the second part, we'll show how to configure the OpenTelemetry Collector to
  ingest the logs.
- Finally, we'll show a Kubernetes-specific setup to handle container logs.

## Reference architecture

The deployment architecture looks like the following:
Comment on lines +48 to +50
Contributor

I don't think that we need a section heading here, you're only introducing the diagram.

Also, this isn't a "reference architecture", nor even a "deployment architecture". Prefer the use of the simpler term "deployment diagram". Actually, if you just present the diagram, you don't really need to name it, but if you do (e.g., via the "alt" text in the image element below), then avoid the use of "architecture".

Suggested change
## Reference architecture
The deployment architecture looks like the following:


![OTLP/JSON Architecture](otlpjson-architecture.png)
Contributor

This diagram is really busy, and the content too small. I'd suggest the following improvements:

  • Remove the following text from the diagram and add it instead as prose to this section (before or after the image):
    • "Logs contextualized with OTel metadata including: ..."
    • "experimental-otlp/stdout exporter available in Java Agent v2.10"
  • Move the "Observability Platform" block to be below the rest so that the image becomes less wide -- that is, favor vertical image real estate rather than horizontal. Or, if you prefer, put the app at the top of the diagram with the OTel components below. Something (very roughly) like what I show below.
image

Member Author

What do you think of this style?
image


## Configure Java application to output OTLP/JSON logs

No code changes are needed. Continue using your preferred logging library,
including templated logs, mapped diagnostic context, and structured logging.

```java
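// Logger and MDC come from the org.slf4j package
Logger logger = org.slf4j.LoggerFactory.getLogger(MyClass.class);
// ...
// Mapped diagnostic context: attached to log records emitted by this thread
MDC.put("customerId", customerId);

// Templated log message
logger.info("Order {} successfully placed", orderId);

// Structured logging with key-value pairs (SLF4J 2.x fluent API)
logger.atInfo()
    .addKeyValue("orderId", orderId)
    .addKeyValue("outcome", "success")
    .log("placeOrder");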
```

Export the logs captured by the OTel Java instrumentation to stdout using the
OTel JSON format (aka [OTLP/JSON](/docs/specs/otlp/#json-protobuf-encoding)).
Configuration parameters for
[Logback](https://github.com/open-telemetry/opentelemetry-java-instrumentation/tree/main/instrumentation/logback/logback-appender-1.0/javaagent)
and
[Log4j](https://github.com/open-telemetry/opentelemetry-java-instrumentation/tree/main/instrumentation/log4j/log4j-appender-2.17/javaagent)
are optional but recommended.

```bash
# Tested with opentelemetry-javaagent v2.10.0
#
# Details on the -Dotel.instrumentation.logback-appender.* parameters are on the documentation page:
# https://github.com/open-telemetry/opentelemetry-java-instrumentation/tree/main/instrumentation/logback/logback-appender-1.0/javaagent

java -javaagent:/path/to/opentelemetry-javaagent.jar \
-Dotel.logs.exporter=experimental-otlp/stdout \
-Dotel.instrumentation.logback-appender.experimental-log-attributes=true \
-Dotel.instrumentation.logback-appender.experimental.capture-key-value-pair-attributes=true \
-Dotel.instrumentation.logback-appender.experimental.capture-mdc-attributes=* \
-jar /path/to/my-app.jar
```

The `-Dotel.logs.exporter=experimental-otlp/stdout` JVM argument and the
environment variable `OTEL_LOGS_EXPORTER="experimental-otlp/stdout"` can be used
interchangeably.
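
The same equivalence applies to the other agent settings. As a minimal sketch, assuming the usual OpenTelemetry Java convention (uppercase the property name and replace `.` and `-` with `_`), the environment variable name can be derived from the system property name. The `OtelEnvNames` class and its `toEnvVarName` helper are hypothetical and not part of the agent.

```java
import java.util.Locale;

public class OtelEnvNames {
    /** Derives the environment variable name from a system property name. */
    public static String toEnvVarName(String propertyName) {
        // Assumed convention: uppercase, then '.' and '-' both become '_'.
        return propertyName.toUpperCase(Locale.ROOT)
            .replace('.', '_')
            .replace('-', '_');
    }

    public static void main(String[] args) {
        System.out.println(toEnvVarName("otel.logs.exporter"));
        System.out.println(toEnvVarName(
            "otel.instrumentation.logback-appender.experimental.capture-mdc-attributes"));
    }
}
```

The first call yields `OTEL_LOGS_EXPORTER`, matching the variable named above.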

{{% alert title="Note" color="info" %}}

The `experimental-otlp/stdout` logs exporter is experimental and subject to
change. Check the
[Specification PR](https://github.com/open-telemetry/opentelemetry-specification/pull/4183)
for the latest updates.

{{% /alert %}}

Verify that OTLP/JSON logs are written to stdout. Each line is a single JSON
object in the OTLP/JSON format, with the log records nested in the
`resourceLogs` array under `scopeLogs` and `logRecords`.

<!-- prettier-ignore-start -->

```json
{"resourceLogs":[{"resource":{"attributes":[{"key":"deployment.environment.name","value":{"stringValue":"staging"}},{"key":"service.instance.id","value":{"stringValue":"6ad88e10-238c-4fb7-bf97-38df19053366"}},{"key":"service.name","value":{"stringValue":"checkout"}},{"key":"service.namespace","value":{"stringValue":"shop"}},{"key":"service.version","value":{"stringValue":"1.1"}}]},"scopeLogs":[{"scope":{"name":"com.mycompany.checkout.CheckoutServiceServer$CheckoutServiceImpl","attributes":[]},"logRecords":[{"timeUnixNano":"1730435085776869000","observedTimeUnixNano":"1730435085776944000","severityNumber":9,"severityText":"INFO","body":{"stringValue":"Order order-12035 successfully placed"}, "attributes":[{"key":"customerId","value":{"stringValue":"customer-49"}},{"key":"thread.id","value":{"intValue":"44"}},{"key":"thread.name","value":{"stringValue":"grpc-default-executor-1"}}],"flags":1,"traceId":"42de1f0dd124e27619a9f3c10bccac1c","spanId":"270984d03e94bb8b"}]}],"schemaUrl":"https://opentelemetry.io/schemas/1.24.0"}]}
```

<!-- prettier-ignore-end -->
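
When checking this output by hand, note that OTLP/JSON encodes 64-bit integers such as `timeUnixNano` as decimal strings. The following sketch, using a hypothetical `OtlpTimestamps` helper, converts the value from the sample record into a readable UTC timestamp.

```java
import java.time.Instant;

public class OtlpTimestamps {
    /** Converts an OTLP/JSON nanosecond timestamp string to an Instant. */
    public static Instant fromUnixNano(String unixNano) {
        // ofEpochSecond normalizes a nanosecond adjustment larger than one second.
        return Instant.ofEpochSecond(0L, Long.parseLong(unixNano));
    }

    public static void main(String[] args) {
        // Value of "timeUnixNano" in the sample log record
        System.out.println(fromUnixNano("1730435085776869000"));
        // prints 2024-11-01T04:24:45.776869Z
    }
}
```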
Comment on lines +110 to +116
Contributor

This JSON output is unreadable. Let Prettier format it, or format it manually. If it is too big, then wrap it in a <details> element.


## Configure the Collector to ingest the OTLP/JSON logs

![OpenTelemetry Collector OTLP/JSON pipeline](otel-collector-otlpjson-pipeline.png)

You can also
[view the OTel Collector pipeline](https://www.otelbin.io/s/69739d790cf279c203fc8efc86ad1a876a2fc01a)
with OTelBin.

```yaml
# tested with otelcol-contrib v0.112.0

receivers:
  filelog/otlp-json-logs:
    # start_at: beginning # for testing purpose, use "start_at: beginning"
    include: [/path/to/my-app.otlpjson.log]
  otlp:
    protocols:
      grpc:
      http:

processors:
  batch:
  resourcedetection:
    detectors: ['env', 'system']
    override: false

connectors:
  otlpjson:

service:
  pipelines:
    logs/raw_otlpjson:
      receivers: [filelog/otlp-json-logs]
      # (i) no need for processors before the otlpjson connector
      # Declare processors in the shared "logs" pipeline below
      processors: []
      exporters: [otlpjson]
    logs:
      receivers: [otlp, otlpjson]
      processors: [resourcedetection, batch]
      # remove "debug" for production deployments
      exporters: [otlphttp, debug]

exporters:
  debug:
    verbosity: detailed
  # Exporter to the OTLP backend like `otlphttp`
  otlphttp:
```

Verify the logs collected by the OTel Collector by checking the output of the
OTel Collector Debug exporter:

```log
2024-11-01T10:03:31.074+0530 info Logs {"kind": "exporter", "data_type": "logs", "name": "debug", "resource logs": 1, "log records": 1}
2024-11-01T10:03:31.074+0530 info ResourceLog #0
Resource SchemaURL: https://opentelemetry.io/schemas/1.24.0
Resource attributes:
-> deployment.environment.name: Str(staging)
-> service.instance.id: Str(6ad88e10-238c-4fb7-bf97-38df19053366)
-> service.name: Str(checkout)
-> service.namespace: Str(shop)
-> service.version: Str(1.1)
ScopeLogs #0
ScopeLogs SchemaURL:
InstrumentationScope com.mycompany.checkout.CheckoutServiceServer$CheckoutServiceImpl
LogRecord #0
ObservedTimestamp: 2024-11-01 04:24:45.776944 +0000 UTC
Timestamp: 2024-11-01 04:24:45.776869 +0000 UTC
SeverityText: INFO
SeverityNumber: Info(9)
Body: Str(Order order-12035 successfully placed)
Attributes:
-> customerId: Str(customer-49)
-> thread.id: Int(44)
-> thread.name: Str(grpc-default-executor-1)
Trace ID: 42de1f0dd124e27619a9f3c10bccac1c
Span ID: 270984d03e94bb8b
Flags: 1
{"kind": "exporter", "data_type": "logs", "name": "debug"}
```

Verify the logs in the OpenTelemetry backend.

After the pipeline works end-to-end, ensure production readiness:

- Remove the `debug` exporter from the `logs` pipeline in the OTel Collector
  configuration.
- Disable file and console appenders in the logging framework (for example,
  `logback.xml`) but keep using the logging configuration to filter logs. The
  OTel Java agent will output JSON logs to stdout.

```xml
<!-- tested with logback-classic v1.5.11 -->
<configuration>
  <logger name="com.example" level="debug"/>
  <root level="info">
    <!-- No appender as the OTel Agent emits otlpjson logs through stdout -->
    <!--
      IMPORTANT enable a console appender to troubleshoot cases where
      logs are missing in the OTel backend
    -->
  </root>
</configuration>
```

## Configure an OpenTelemetry Collector in Kubernetes to handle container logs

To handle Kubernetes and container specifics, add a standard parsing step to
the pipeline. No custom mapping configuration is needed.

Use the OTel Collector File Log Receiver's
[`container`](https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/pkg/stanza/docs/operators/container.md)
parser to handle container logging specifics.

Replace `<<namespace>>`, `<<pod_name>>`, and `<<container_name>>` with the
desired values or use a broader [glob pattern](https://pkg.go.dev/v.io/v23/glob)
like `*`.

```yaml
receivers:
  filelog/otlp-json-logs:
    # start_at: beginning # for testing purpose, use "start_at: beginning"
    include: [/var/log/pods/<<namespace>>_<<pod_name>>_*/<<container_name>>/]
    include_file_path: true
    operators:
      - type: container
        add_metadata_from_filepath: true

  otlp:
    protocols:
      grpc:
      http:

processors:
  batch:
  resourcedetection:
    detectors: ['env', 'system']
    override: false

connectors:
  otlpjson:

service:
  pipelines:
    logs/raw_otlpjson:
      receivers: [filelog/otlp-json-logs]
      # (i) no need for processors before the otlpjson connector
      # Declare processors in the shared "logs" pipeline below
      processors: []
      exporters: [otlpjson]
    logs:
      receivers: [otlp, otlpjson]
      # TODO change processors if needed
      processors: [resourcedetection, batch]
      # TODO remove "debug" for production deployments
      exporters: [otlphttp, debug]
  # TODO add "traces" and "metrics" pipelines

exporters:
  debug:
    verbosity: detailed
  # Exporter to the OTLP backend like `otlphttp`
  otlphttp:
```
```
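
Before deploying, it can help to check which pod log files a glob pattern would select. The sketch below assumes the kubelet layout `/var/log/pods/<namespace>_<pod_name>_<pod_uid>/<container_name>/<restart_count>.log` and uses Java's `java.nio` glob matcher on a Unix-like file system. The Collector's own matcher may differ in details. The `shop` namespace and `checkout` container echo the sample record, while the pod names, UIDs, and the `PodLogGlob` class are hypothetical.

```java
import java.nio.file.FileSystems;
import java.nio.file.Path;
import java.nio.file.PathMatcher;

public class PodLogGlob {
    /** Returns true when the path matches the glob, using java.nio glob semantics. */
    public static boolean matches(String glob, String path) {
        PathMatcher matcher = FileSystems.getDefault().getPathMatcher("glob:" + glob);
        return matcher.matches(Path.of(path));
    }

    public static void main(String[] args) {
        // "*" matches within one path segment only, so it never crosses a "/".
        String glob = "/var/log/pods/shop_checkout-*_*/checkout/*.log";
        String[] candidates = {
            "/var/log/pods/shop_checkout-7d9f_0a1b2c/checkout/0.log",
            "/var/log/pods/billing_invoice-1_ffee/invoice/0.log"
        };
        for (String candidate : candidates) {
            System.out.println(matches(glob, candidate) + "  " + candidate);
        }
    }
}
```

Only the first candidate matches, which mirrors how a narrow pattern keeps logs from unrelated pods out of the pipeline.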

## Conclusion

This blog post showed how to collect file-based Java logs with OpenTelemetry.
The setup is simple and turnkey: it converts logs from frameworks like
SLF4J/Logback or Log4j2 into OTel-compliant JSON logs on stdout with all
resource and log attributes. The JSON format is verbose, but it generally has
minimal impact on performance and provides highly contextualized logs that can
be correlated with traces and metrics.

Any feedback or questions? Reach out on
[GitHub](https://github.com/open-telemetry/opentelemetry-specification/pull/4183).
Contributor

Why is this linking to a spec PR?
Is this PR really the best place for folks to comment?
If so, that's ok, but then the link text should explain more clearly what the link target is.

16 changes: 16 additions & 0 deletions static/refcache.json
@@ -7443,6 +7443,10 @@
"StatusCode": 200,
"LastSeen": "2024-01-18T20:05:26.46768-05:00"
},
"https://github.com/open-telemetry/opentelemetry-specification/pull/4183": {
"StatusCode": 200,
"LastSeen": "2024-11-20T10:58:53.525737396Z"
},
"https://github.com/open-telemetry/opentelemetry-specification/pull/4197": {
"StatusCode": 200,
"LastSeen": "2024-10-24T15:10:29.718998+02:00"
@@ -11767,6 +11771,10 @@
"StatusCode": 200,
"LastSeen": "2024-08-02T13:14:36.816743-04:00"
},
"https://pkg.go.dev/v.io/v23/glob": {
"StatusCode": 200,
"LastSeen": "2024-11-20T10:59:02.306122595Z"
},
"https://pkgs.alpinelinux.org/packages": {
"StatusCode": 200,
"LastSeen": "2024-01-18T19:07:29.294901-05:00"
@@ -14279,6 +14287,14 @@
"StatusCode": 200,
"LastSeen": "2024-01-30T16:14:44.039011-05:00"
},
"https://www.otelbin.io/favicon.ico": {
"StatusCode": 206,
"LastSeen": "2024-11-20T10:59:00.492890749Z"
},
"https://www.otelbin.io/s/69739d790cf279c203fc8efc86ad1a876a2fc01a": {
"StatusCode": 200,
"LastSeen": "2024-11-20T10:58:58.366517284Z"
},
"https://www.outreachy.org/": {
"StatusCode": 200,
"LastSeen": "2024-01-18T19:55:46.020866-05:00"