Tyler Wells

over 1 year ago
The `signoz-otel-collector` keeps restarting with OOMKilled (exit code 137). There are only ~175k spans and 17k metrics, but it’s using a ton of memory and then crashing. I see this in the logs:
{"level":"info","timestamp":"2024-06-12T13:23:22.493Z","caller":"signozcol/collector.go:121","msg":"Collector service is running"}
{"level":"info","timestamp":"2024-06-12T13:23:22.493Z","logger":"agent-config-manager","caller":"opamp/config_manager.go:168","msg":"Config has not changed"}
{"level":"info","timestamp":"2024-06-12T13:23:23.279Z","caller":"service/service.go:73","msg":"Client started successfully"}
{"level":"info","timestamp":"2024-06-12T13:23:23.279Z","caller":"opamp/client.go:49","msg":"Ensuring collector is running","component":"opamp-server-client"}
2024-06-12T13:24:22.389Z	warn	clickhousemetricsexporter/exporter.go:272	Dropped cumulative histogram metric	{"kind": "exporter", "data_type": "metrics", "name": "clickhousemetricswrite", "name": "signoz_latency"}
2024-06-12T13:24:22.484Z	warn	clickhousemetricsexporter/exporter.go:279	Dropped exponential histogram metric with no data points	{"kind": "exporter", "data_type": "metrics", "name": "clickhousemetricswrite", "name": "signoz_latency"}
2024-06-12T13:25:18.135Z	info	exporterhelper/retry_sender.go:177	Exporting failed. Will retry the request after interval.	{"kind": "exporter", "data_type": "logs", "name": "clickhouselogsexporter", "error": "StatementSend:context deadline exceeded", "interval": "5.882953348s"}
2024-06-12T13:25:24.996Z	info	exporterhelper/retry_sender.go:177	Exporting failed. Will retry the request after interval.	{"kind": "exporter", "data_type": "logs", "name": "clickhouselogsexporter", "error": "StatementSend:context deadline exceeded", "interval": "7.161709269s"}
2024-06-12T13:25:26.504Z	info	exporterhelper/retry_sender.go:177	Exporting failed. Will retry the request after interval.	{"kind": "exporter", "data_type": "logs", "name": "clickhouselogsexporter", "error": "StatementSend:context deadline exceeded", "interval": "6.523426302s"}
2024-06-12T13:25:26.536Z	info	exporterhelper/retry_sender.go:177	Exporting failed. Will retry the request after interval.	{"kind": "exporter", "data_type": "logs", "name": "clickhouselogsexporter", "error": "StatementSend:context deadline exceeded", "interval": "4.419607822s"}
2024-06-12T13:25:26.753Z	info	exporterhelper/retry_sender.go:177	Exporting failed. Will retry the request after interval.	{"kind": "exporter", "data_type": "logs", "name": "clickhouselogsexporter", "error": "StatementSend:context deadline exceeded", "interval": "6.233919422s"}
2024-06-12T13:25:26.763Z	info	exporterhelper/retry_sender.go:177	Exporting failed. Will retry the request after interval.	{"kind": "exporter", "data_type": "logs", "name": "clickhouselogsexporter", "error": "StatementSend:context deadline exceeded", "interval": "2.67037973s"}
2024-06-12T13:25:26.769Z	info	exporterhelper/retry_sender.go:177	Exporting failed. Will retry the request after interval.	{"kind": "exporter", "data_type": "logs", "name": "clickhouselogsexporter", "error": "StatementSend:context deadline exceeded", "interval": "5.126252319s"}
2024-06-12T13:25:26.958Z	info	exporterhelper/retry_sender.go:177	Exporting failed. Will retry the request after interval.	{"kind": "exporter", "data_type": "logs", "name": "clickhouselogsexporter", "error": "StatementSend:context deadline exceeded", "interval": "4.857335267s"}
2024-06-12T13:25:28.494Z	info	exporterhelper/retry_sender.go:177	Exporting failed. Will retry the request after interval.	{"kind": "exporter", "data_type": "logs", "name": "clickhouselogsexporter", "error": "StatementSend:context deadline exceeded", "interval": "4.344819049s"}
Any help would be much appreciated.
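
In case it helps anyone triaging logs like these, here is a minimal sketch for tallying which exporter is retrying and with what error. It assumes the tab-separated console format of the `retry_sender.go` lines in this post (timestamp, level, caller, message, JSON fields); the function name `count_export_retries` is made up for illustration and is not part of SigNoz or the collector.

```python
import json
from collections import Counter


def count_export_retries(lines):
    """Count 'Exporting failed' entries per (exporter name, error) pair.

    Assumes tab-separated console log lines:
    timestamp, level, caller, message, JSON fields.
    Lines in any other shape (e.g. the pure-JSON startup lines) are skipped.
    """
    counts = Counter()
    for line in lines:
        parts = line.rstrip("\n").split("\t")
        if len(parts) != 5 or not parts[3].startswith("Exporting failed"):
            continue
        try:
            fields = json.loads(parts[4])
        except json.JSONDecodeError:
            continue
        counts[(fields.get("name"), fields.get("error"))] += 1
    return counts


# Two lines in the same shape as the log excerpt in this post,
# plus one JSON-format line that should be ignored.
sample = [
    '2024-06-12T13:25:18.135Z\tinfo\texporterhelper/retry_sender.go:177\t'
    'Exporting failed. Will retry the request after interval.\t'
    '{"kind": "exporter", "data_type": "logs", "name": "clickhouselogsexporter", '
    '"error": "StatementSend:context deadline exceeded", "interval": "5.882953348s"}',
    '2024-06-12T13:25:24.996Z\tinfo\texporterhelper/retry_sender.go:177\t'
    'Exporting failed. Will retry the request after interval.\t'
    '{"kind": "exporter", "data_type": "logs", "name": "clickhouselogsexporter", '
    '"error": "StatementSend:context deadline exceeded", "interval": "7.161709269s"}',
    '{"level":"info","msg":"Collector service is running"}',
]

retry_counts = count_export_retries(sample)
for (exporter, error), n in retry_counts.items():
    print(f"{exporter}: {n} retries ({error})")
```

In the excerpt in this post, every retry comes from `clickhouselogsexporter` with `StatementSend:context deadline exceeded`, so the logs pipeline is the one timing out on writes.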
Samuel Olowoyeye

about 1 year ago
Hello, I seem to be having issues with the sending queue being full:
{"level":"error","ts":1727180784.2226639,"caller":"exporterhelper/common.go:296","msg":"Exporting failed. Rejecting data.","kind":"exporter","data_type":"traces","name":"clickhousetraces","error":"sending queue is full","rejected_items":30,"stacktrace":"go.opentelemetry.io/collector/exporter/exporterhelper.(*baseExporter).send\n\t/home/runner/go/pkg/mod/go.opentelemetry.io/collector/exporter@v0.102.0/exporterhelper/common.go:296\ngo.opentelemetry.io/collector/exporter/exporterhelper.NewTracesRequestExporter.func1\n\t/home/runner/go/pkg/mod/go.opentelemetry.io/collector/exporter@v0.102.0/exporterhelper/traces.go:134\ngo.opentelemetry.io/collector/consumer.ConsumeTracesFunc.ConsumeTraces\n\t/home/runner/go/pkg/mod/go.opentelemetry.io/collector/consumer@v0.102.1/traces.go:25\ngo.opentelemetry.io/collector/processor/batchprocessor.(*batchTraces).export\n\t/home/runner/go/pkg/mod/go.opentelemetry.io/collector/processor/batchprocessor@v0.102.0/batch_processor.go:414\ngo.opentelemetry.io/collector/processor/batchprocessor.(*shard).sendItems\n\t/home/runner/go/pkg/mod/go.opentelemetry.io/collector/processor/batchprocessor@v0.102.0/batch_processor.go:261\ngo.opentelemetry.io/collector/processor/batchprocessor.(*shard).startLoop\n\t/home/runner/go/pkg/mod/go.opentelemetry.io/collector/processor/batchprocessor@v0.102.0/batch_processor.go:223"}
However, when I checked the ClickHouse storage, I got:
Filesystem                Size      Used Available Use% Mounted on
/dev/sde                884.8G    265.1G    574.7G  32% /var/lib/clickhouse
chi-my-release-clickhouse-cluster-0-0-0:/$
What could be the issue here?
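
For context on the error itself: the stack trace points at the collector's `exporterhelper`, which, as far as I understand, buffers outgoing batches in a bounded queue inside the collector and rejects new batches once that queue is at capacity. That limit is separate from free disk space on the ClickHouse volume. Below is a toy sketch of that behaviour, not the collector's actual code; the class names `BoundedSendingQueue` and `SendingQueueFull` are hypothetical.

```python
import queue


class SendingQueueFull(Exception):
    """Raised when a batch is rejected because the queue is at capacity."""


class BoundedSendingQueue:
    """Toy in-memory queue that rejects new batches instead of blocking."""

    def __init__(self, capacity):
        self._q = queue.Queue(maxsize=capacity)

    def offer(self, batch):
        # put_nowait raises queue.Full immediately when the queue is at capacity.
        try:
            self._q.put_nowait(batch)
        except queue.Full:
            raise SendingQueueFull("sending queue is full") from None

    def drain_one(self):
        # Stands in for the exporter successfully writing one batch downstream.
        return self._q.get_nowait()


q = BoundedSendingQueue(capacity=2)
q.offer("batch-1")
q.offer("batch-2")

# The consumer has not drained anything yet, so the third batch is rejected.
try:
    q.offer("batch-3")
    rejected = False
except SendingQueueFull:
    rejected = True

# Once one batch is written out, there is room again.
first = q.drain_one()
q.offer("batch-3")
```

The point of the sketch is that rejections depend on how fast batches leave the queue relative to how fast they arrive, so a volume at 32% usage does not rule this error out.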