# support
b
Where can I find the query behind /services? I'd like to make a cronjob that will restart the clickhouse-setup_otel-collector_1 container when the results are 0
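(A rough sketch of the kind of check such a cronjob could run; the query-service endpoint path, port, and request payload below are assumptions for illustration, not the confirmed query behind /services:)

```python
#!/usr/bin/env python3
# Hypothetical cron check: if the services list for the last 5 minutes comes
# back empty, restart the collector container. The endpoint path, port, and
# payload shape are assumptions, not the confirmed query-service API.
import json
import subprocess
import time
import urllib.request

end_ns = time.time_ns()
start_ns = end_ns - 5 * 60 * 10**9  # five minutes ago, in nanoseconds

req = urllib.request.Request(
    "http://localhost:8080/api/v1/services",  # assumed query-service endpoint
    data=json.dumps({"start": str(start_ns), "end": str(end_ns)}).encode(),
    headers={"Content-Type": "application/json"},
    method="POST",
)
with urllib.request.urlopen(req, timeout=10) as resp:
    services = json.load(resp)

if not services:  # "results are 0" -> bounce the collector
    subprocess.run(
        ["docker", "restart", "clickhouse-setup_otel-collector_1"], check=True
    )
```

It could then be scheduled with an entry like `*/5 * * * * /usr/local/bin/check_services.py` (the script path is hypothetical).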
s
> I’d like to make a cronjob that will restart the clickhouse-setup_otel-collector_1 container when the results are 0
Sounds like you are trying to work around another issue. Can you share why you want to restart the collector and what's leading to the results being 0?
b
Correct, I'm on version 0.14.0. Every hour or so my Services/Traces stop logging. While this is happening, the memory usage in the clickhouse-setup_otel-collector_1 container increases until it restarts. The cycle takes about an hour, so I basically only get data every other hour. Stats:
```
CONTAINER ID   NAME                                        CPU %     MEM USAGE / LIMIT     MEM %     NET I/O           BLOCK I/O         PIDS
f16d7b1538ca   frontend                                    0.00%     2.621MiB / 15.63GiB   0.02%     8.07MB / 15.1MB   72.9MB / 14.7MB   7
dc7aec357218   clickhouse-setup_otel-collector_1           265.47%   8.823GiB / 15.63GiB   56.47%    49.5GB / 8.83GB   293MB / 3.92MB    14
e57232ccc194   clickhouse-setup_otel-collector-metrics_1   122.27%   1.567GiB / 15.63GiB   10.03%    2.13TB / 88.4GB   48.9GB / 38.3GB   14
e190ef81da7d   query-service                               0.00%     108.4MiB / 15.63GiB   0.68%     869MB / 189MB     15.2GB / 2.51GB   13
e24885bb349f   clickhouse                                  167.03%   1.788GiB / 15.63GiB   11.44%    1.01TB / 6.29TB   2.11TB / 4.88TB   380
725e4dea35b1   zookeeper-1                                 0.23%     50.32MiB / 15.63GiB   0.31%     24.2MB / 30.7MB   32GB / 5.14GB     55
5d41c306f5e7   clickhouse-setup_alertmanager_1             0.10%     12.36MiB / 15.63GiB   0.08%     295kB / 1.63kB    30.2GB / 392MB    13
```
s
Can you share your collector config?
b
@Srikanth Chekuri here you go
s
Can you use this updated config and let us know how it goes?
b
Thanks, will let you know
Services stopped working even faster than normal
a
I found my collectors restarting often, and it's because I was not using memory_limiter. For PROD I would highly advise using it on every pipeline (example below). Docs here: https://github.com/open-telemetry/opentelemetry-collector/blob/main/processor/memorylimiterprocessor/README.md
Since adding it, things have been fairly stable (0 OOM crashes) and garbage collection has run more robustly.
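For reference, a minimal sketch of what that could look like in the collector config, following the memory_limiter README linked above. The limit values are placeholders to size for your own host, and the receiver/exporter names are assumptions about a typical SigNoz pipeline, not the config that was shared:

```yaml
processors:
  # memory_limiter should be the first processor in every pipeline so it can
  # refuse data before the rest of the pipeline allocates memory for it.
  memory_limiter:
    check_interval: 1s      # how often memory usage is checked
    limit_mib: 4000         # hard memory limit for the collector
    spike_limit_mib: 800    # soft limit = limit_mib - spike_limit_mib

service:
  pipelines:
    traces:
      receivers: [otlp]                      # assumed receiver
      processors: [memory_limiter, batch]    # memory_limiter first
      exporters: [clickhousetraces]          # assumed SigNoz exporter name
```

Putting memory_limiter ahead of batch means back-pressure (and, if necessary, dropped data) kicks in before batches are built, which is what keeps the container from climbing toward the host limit.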
s
@Bill Cavalieri based on what you shared earlier I suspected the memory is slowly building up for longer time based on the max_batch_send_size being set to 11k which I thought could be less for you. While the memorylimiter helps with no OOMs by dropping the data you may still want to understand if the ingestion is high that one collector can’t handle it (in that case you may want to scale up) or is it something else. I would be happy to debug this on call when the issue occurs.
a
@Alexei Zenin and @Bill Cavalieri would it be possible to schedule a call with @Srikanth Chekuri to drill down on this? If we get to the root cause, we will fix it ASAP. I guess other users must be facing the same issue too.
b
Yes, I have high tracing volume during the day; at night the process stays running without issue. I'm pretty free today, so I can debug whenever @Srikanth Chekuri is available. It's currently working, but it should fail within the next hour.
s
I have a call shortly for ~30 mins and will be available after that. I will let you know; let's get on a call then.
b
Getting this error:
```
2023-01-19T17:21:42.360Z	error	prometheusexporter@v0.66.0/log.go:34	error encoding and sending metric family: write tcp 172.27.0.8:8889->172.27.0.5:34328: write: broken pipe	{"kind": "exporter", "data_type": "metrics", "name": "prometheus"}
github.com/open-telemetry/opentelemetry-collector-contrib/exporter/prometheusexporter.(*promLogger).Println
	/go/pkg/mod/github.com/open-telemetry/opentelemetry-collector-contrib/exporter/prometheusexporter@v0.66.0/log.go:34
github.com/prometheus/client_golang/prometheus/promhttp.HandlerForTransactional.func1.2
	/go/pkg/mod/github.com/prometheus/client_golang@v1.14.0/prometheus/promhttp/http.go:187
github.com/prometheus/client_golang/prometheus/promhttp.HandlerForTransactional.func1
	/go/pkg/mod/github.com/prometheus/client_golang@v1.14.0/prometheus/promhttp/http.go:205
net/http.HandlerFunc.ServeHTTP
	/usr/local/go/src/net/http/server.go:2084
net/http.(*ServeMux).ServeHTTP
	/usr/local/go/src/net/http/server.go:2462
go.opentelemetry.io/collector/config/confighttp.(*decompressor).wrap.func1
	/go/pkg/mod/go.opentelemetry.io/collector@v0.66.0/config/confighttp/compression.go:162
net/http.HandlerFunc.ServeHTTP
	/usr/local/go/src/net/http/server.go:2084
go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp.(*Handler).ServeHTTP
	/go/pkg/mod/go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp@v0.36.4/handler.go:204
go.opentelemetry.io/collector/config/confighttp.(*clientInfoHandler).ServeHTTP
	/go/pkg/mod/go.opentelemetry.io/collector@v0.66.0/config/confighttp/clientinfohandler.go:39
net/http.serverHandler.ServeHTTP
	/usr/local/go/src/net/http/server.go:2916
net/http.(*conn).serve
	/usr/local/go/src/net/http/server.go:1966
```
s
Can we get on call now?
b
Yes I'm available
s
@Bill Cavalieri there are a couple of things I would like to check for debugging this further. Let me know if we can do the call.
b
Yes, I'm available
s
Give me a few mins; I will send the huddle invite.