Support for filelog receiver and syslog receiver #175
Hi @mukeshkhicherbr, we could add the Generally speaking, we should rebuild the syslog-release's functionality with Otel. Please take a look at it and try to rebuild everything with the Otel Collector. |
Hi @chombium, thanks for reviewing it. Yes, the overall idea is to build a capability like syslog-release to get CF platform component logs as well as syslog from the VM via Otel. In syslog-release, as per my understanding, the blackbox job tails component log files and forwards them to port 514 on localhost (refer to blackbox_config). Any other syslog received on 514 also gets forwarded to the configured destination using rsyslog-settings. So the filelog receiver can tail component log files like the blackbox job does, but with only the filelog receiver we will not get the VM-level syslog messages received on port 514. IMO, to receive these we will need the syslog receiver as well. |
Yes, that's true. We should check if there are other things connecting to localhost:514 and think about how to deal with that. All platform components are deployed as bosh releases which means that they log to IMO, the long term solution would be to replace the syslog-release with the Otel Collector. Having both and managing both would be overkill. We could also think about whether we want to separate the app logs from the platform component metrics as well, in order to eliminate the problem of app logs affecting the platform metrics and logs. We could configure the Otel Collector release to be deployed either with two instances, one for platform telemetry data and one for application data, or with one instance which collects everything. That way the platform operators can decide what's best for their foundations. |
514 should receive only syslogs which can be read using syslog receiver. We can read logs under directory
Yes, we can do that. In that case two instances of the Otel Collector will run on each VM, one for platform logs and the other for the rest. I was thinking of leveraging the existing Otel Collector instance for collecting these platform logs.
Do you mean giving the platform operator the choice of running either two instances of the Otel Collector or one? |
Let's first check if there are other components besides blackbox which send directly to rsyslog. If we find something, we can discuss how to handle that.
We could start with a single Otel Collector instance and decide later if we need to separate platform from app data. Having separate pipelines will be a must. What worries me is the collector's performance and if the apps with their log load can affect the collection and transmission of platform metrics and logs. That's why I'm thinking of possibly having separate collectors in order to avoid a noisy neighbor problem. |
Yes, there can be some performance hit. However, based on this diagram, I am assuming that app logs are already received by the Otel Collector along with platform metrics, so the additional load will be platform logs only. Is this assumption correct? The Otel Collector running on each VM will only receive platform component logs from the same VM. How much additional load do you expect on Otel due to platform logs? |
|
Yes, the app logs, the container metrics and the platform component metrics are already sent to the Otel Collector. The problem is that an app can start producing too many logs (stack traces, debug log level and similar) and affect the delivery of the platform metrics and logs. The resources of the VMs are limited, and if the Otel Collector eats more resources, the resources available for the applications will go down. We should also think more generally about what to do about the Otel Collector's reliability. |
I think nginx access logs are coming directly to port 514. I did not find anything other than this so far. |
nginx? Which bosh release/job is that and which VM instance type? I guess we should also check the stemcells. |
This is coming from the cloud controller VM. I looked into the inputs of rsyslog. Based on this documentation, rsyslog uses the default configuration defined in /etc/rsyslog.conf along with additional configs from /etc/rsyslog.d/*.conf, which are configured by syslog-release. So rsyslog receives logs from the /dev/log Unix socket, from /proc/kmsg for kernel-level logging, and from the UDP/TCP listener on 514. These logs are then forwarded to the syslog endpoint configured in the syslog-release configuration. Based on this we have to provide support for
I think we can add a forwarding rule to rsyslog, like syslog-release does, to forward these logs to the Otel Collector. The Otel Collector will open a syslog endpoint using the syslog receiver. This forwarding rule can be set as part of the pre-start script in the Otel Collector. |
|
Thanks for looking into this. I'll talk to the folks taking care of the Cloud Controller to see why and how nginx is used. We could also eventually. If we add the Syslog receiver it will be a temporary solution to stay backward compatible, but I think it would be better if we use the filelog receiver for everything. It will save us time if we add the proper configuration once instead of adding the Syslog receiver + forwarding rules in the syslog-release. |
We can use the filelog receiver to read log files from /var/log, where I am assuming these syslogs are written. There are some filter like which are applied in syslog-release before sending to the external syslog endpoint; we may have to build those filters to stay compatible with syslog-release. In that case we have to parse messages in Otel before applying these filters. I checked some files under the /var/log directory (kern.log, syslog, cron.log) and they log in RFC3164 format. Some files, like /var/log/audit/audit.log, are not in syslog format, so we may have to handle parsing of these files differently. This parsing will be based on the regex_parser operator, where we might have to define different regexes for different kinds of messages. There might be other files with logs that are not in syslog format. There is one syslog file under /var/log; is it supposed to contain all logs? If we tail only this file, will it be sufficient? This would simplify the handling in Otel. Otherwise, with the syslog receiver in Otel we can receive logs in syslog format directly, where rsyslog continues to handle parsing and the existing filtering and sends logs to the Otel Collector in RFC5424 format. |
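To make the parsing question above concrete, here is a minimal Python sketch of the kind of regular expression a regex_parser operator would need for RFC3164-style lines as written to files under /var/log. The sample lines, the field names and the `parse_line` helper are all hypothetical, and the pattern assumes the traditional "Mmm dd hh:mm:ss host tag[pid]: message" layout; a stemcell configured with a different rsyslog file template would need a different expression.

```python
import re

# Assumed layout: "Oct 11 22:14:15 host tag[pid]: message" (pid is optional).
# Field names are illustrative, not taken from any existing configuration.
RFC3164_LINE = re.compile(
    r"^(?P<timestamp>[A-Z][a-z]{2}\s+\d{1,2}\s\d{2}:\d{2}:\d{2})\s"
    r"(?P<hostname>\S+)\s"
    r"(?P<appname>[^\s:\[]+)(?:\[(?P<pid>\d+)\])?:\s?"
    r"(?P<message>.*)$"
)


def parse_line(line):
    """Return the named fields as a dict, or None if the line does not match."""
    match = RFC3164_LINE.match(line)
    return match.groupdict() if match else None


if __name__ == "__main__":
    # Hypothetical sample lines: one syslog-style, one audit-style.
    print(parse_line("Oct 11 22:14:15 vm-0 sshd[4721]: Accepted publickey for vcap"))
    print(parse_line("type=SYSCALL msg=audit(1697062455.123:42): arch=c000003e"))
```

The audit-style line does not match and comes back as `None`, which illustrates why files such as /var/log/audit/audit.log would need their own expression.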
|
We should think about how to handle the parsing and forwarding rules... I guess it will be good if we build them with the Otel Collector. The other thing that we should think about is the Otel Semantic Conventions and how we map the data in Otel. We should check if and what the filelog and the syslog receivers are doing in that regard. |
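One input to that mapping, if messages arrive with a syslog header, is the priority value: both RFC3164 and RFC5424 encode it as facility * 8 + severity. The Python sketch below only shows that decoding step; the `decode_pri` helper and the short severity names are illustrative assumptions, and how the receivers translate the result into Otel fields still has to be checked.

```python
# Severity codes 0..7 as defined for syslog; the short names are illustrative.
SEVERITIES = ["emerg", "alert", "crit", "err", "warning", "notice", "info", "debug"]


def decode_pri(pri):
    """Split a syslog PRI value into (facility, severity, severity_name)."""
    if not 0 <= pri <= 191:
        raise ValueError("PRI must be between 0 and 191")
    facility, severity = divmod(pri, 8)
    return facility, severity, SEVERITIES[severity]


if __name__ == "__main__":
    # "<34>" is facility 4 (security/authorization), severity 2 (critical).
    print(decode_pri(34))
```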
|
Yes, I agree, the forwarding rules and message template for rsyslog should be built with the Otel Collector. I will explore these further. I am planning to divide the whole thing into 3 PRs
Could you please approve this PR so that I can work on top of these changes for the other PRs? |
|
Hi @mukeshkhicherbr, I've checked the Nginx job's config template and the logs are duplicated as they are written both to If there are no other jobs which write to the local Rsyslogd, then we don't need the Syslog receiver. |
|
There are other logs, like audit logs, auth logs, cron logs etc., which come to rsyslogd directly, but rsyslog also writes them under the /var/log directory. We can read these from the files /var/log/syslog and /var/log/auth.log, and then we will not need the syslog receiver. |
|
Ok, let's add the Syslog receiver and we can check the metrics to see if and what data gets ingested. We should think about how to avoid data duplication. The easiest way would be to deploy either the syslog-release or the otel-collector-release, but what should happen if both are deployed? What should the Syslog receiver be connected to, and how should the things sending to Syslog be configured? If we use the current configuration, the logs will be sent to both sides. If we say they listen on different ports, everything that sends has to be reconfigured, or maybe rebuilt, to support configuring multiple Syslog endpoints. |
|
@chombium I did some testing and I think using the filelog receiver for reading syslog from /var/vcap/syslog and /var/log/auth.log should be good enough to start with. So I have removed the syslog receiver from the commit. |
|
Thanks for the update and your hard work. That's great news. |
|
@chombium Can we merge this change? |
Support for the filelog (https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/receiver/filelogreceiver/README.md) and syslog (https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/receiver/syslogreceiver/README.md) receivers in the otel collector builder. These receivers may be used to receive platform logs from the platform component logs directory and from syslog by configuring them later in the otel config.