Bill Cavalieri (12/28/2022, 6:06 PM)
OpenTelemetry error: OTLP exporter received rpc.Status{message=value not found in metricKeyToDimensions cache by key "id-staging-app01\x00GET /admin/tag_management(.:format)\x00SPAN_KIND_SERVER\x00STATUS_CODE_UNSET\x00200\x00default\x00default", details=[]}
Ankit Nayan

Bill Cavalieri (12/28/2022, 6:16 PM)

Ankit Nayan
v0.13.0 is out, which has the fix for the above. Let us know how it goes: https://github.com/SigNoz/signoz/releases/tag/v0.13.0

Bill Cavalieri (12/29/2022, 7:51 PM)

Bill Cavalieri (12/29/2022, 8:21 PM)

Bill Cavalieri (12/29/2022, 10:18 PM)

Ankit Nayan
Attribute cardinality from clickhouse-setup_otel-collector_1 container? https://github.com/SigNoz/signoz-otel-collector/blob/main/processor/signozspanmetricsprocessor/processor.go#L417

Srikanth Chekuri (12/30/2022, 9:19 AM)
signoz/signoz-otel-collector image would be v0.66.1) and let us know if you still face the issue.
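The cache error at the top of the thread is typically a symptom of high attribute cardinality: every distinct combination of service, operation, span kind, and status code becomes its own entry in the span metrics processor's bounded metricKeyToDimensions cache. A rough way to see which keys churn most, sketched as a shell helper (the function name is made up; the container name comes from this thread's docker-compose setup):

```shell
# count_cache_misses: tally the metric keys that trip the
# "value not found in metricKeyToDimensions cache" error, most frequent first.
# Reads collector log lines on stdin.
count_cache_misses() {
  grep -o 'metricKeyToDimensions cache by key "[^"]*"' | sort | uniq -c | sort -rn
}

# Usage against the collector from this thread (requires Docker):
# docker logs clickhouse-setup_otel-collector_1 2>&1 | count_cache_misses | head
```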
Bill Cavalieri (12/30/2022, 5:20 PM)

Bill Cavalieri (12/30/2022, 5:21 PM)

Srikanth Chekuri (12/30/2022, 5:22 PM)

Bill Cavalieri (12/30/2022, 5:23 PM)
Srikanth Chekuri (12/30/2022, 5:53 PM)
github.com/SigNoz/signoz-otel-collector/exporter/clickhousetracesexporter.(*SpanWriter).writeBatch
    /src/exporter/clickhousetracesexporter/writer.go:129
github.com/SigNoz/signoz-otel-collector/exporter/clickhousetracesexporter.(*SpanWriter).backgroundWriter
    /src/exporter/clickhousetracesexporter/writer.go:108
2022-12-30T16:54:32.516Z error clickhousetracesexporter/writer.go:109 Could not write a batch of spans {"kind": "exporter", "data_type": "traces", "name": "clickhousetraces", "error": "dial tcp 172.27.0.2:9000: connect: connection refused"}
github.com/SigNoz/signoz-otel-collector/exporter/clickhousetracesexporter.(*SpanWriter).backgroundWriter
What's surprising is only the traces exporter had the connection error.
Bill Cavalieri (12/30/2022, 5:56 PM)

Vishal Sharma (01/02/2023, 11:04 AM)

Ankit Nayan

Srikanth Chekuri (01/02/2023, 11:33 AM)

Srikanth Chekuri (01/02/2023, 11:46 AM)
Bill Cavalieri (01/03/2023, 3:10 PM)
2023-01-02T15:44:59.119Z error prometheusexporter@v0.66.0/log.go:34 error encoding and sending metric family: write tcp 172.27.0.4:8889->172.27.0.7:49386: write: broken pipe
{"kind": "exporter", "data_type": "metrics", "name": "prometheus"}
github.com/open-telemetry/opentelemetry-collector-contrib/exporter/prometheusexporter.(*promLogger).Println
    /go/pkg/mod/github.com/open-telemetry/opentelemetry-collector-contrib/exporter/prometheusexporter@v0.66.0/log.go:34
github.com/prometheus/client_golang/prometheus/promhttp.HandlerForTransactional.func1.2
    /go/pkg/mod/github.com/prometheus/client_golang@v1.14.0/prometheus/promhttp/http.go:187
github.com/prometheus/client_golang/prometheus/promhttp.HandlerForTransactional.func1
    /go/pkg/mod/github.com/prometheus/client_golang@v1.14.0/prometheus/promhttp/http.go:205
net/http.HandlerFunc.ServeHTTP
    /usr/local/go/src/net/http/server.go:2084
net/http.(*ServeMux).ServeHTTP
    /usr/local/go/src/net/http/server.go:2462
go.opentelemetry.io/collector/config/confighttp.(*decompressor).wrap.func1
    /go/pkg/mod/go.opentelemetry.io/collector@v0.66.0/config/confighttp/compression.go:162
net/http.HandlerFunc.ServeHTTP
    /usr/local/go/src/net/http/server.go:2084
go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp.(*Handler).ServeHTTP
    /go/pkg/mod/go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp@v0.36.4/handler.go:204
go.opentelemetry.io/collector/config/confighttp.(*clientInfoHandler).ServeHTTP
    /go/pkg/mod/go.opentelemetry.io/collector@v0.66.0/config/confighttp/clientinfohandler.go:39
net/http.serverHandler.ServeHTTP
    /usr/local/go/src/net/http/server.go:2916
net/http.(*conn).serve
    /usr/local/go/src/net/http/server.go:1966
Bill Cavalieri (01/03/2023, 3:11 PM)

Vishal Sharma (01/03/2023, 3:12 PM)

Srikanth Chekuri (01/03/2023, 3:12 PM)
Bill Cavalieri (01/03/2023, 3:18 PM)
2023-01-02T16:35:54.865Z info exporterhelper/queued_retry.go:426 Exporting failed. Will retry the request after interval. {"kind": "exporter", "data_type": "logs", "name": "clickhouselogsexporter", "error": "PrepareBatch:read tcp 172.27.0.4:34884->172.27.0.2:9000: i/o timeout", "interval": "6.421334743s"}

Srikanth Chekuri (01/03/2023, 3:21 PM)

Srikanth Chekuri (01/03/2023, 3:23 PM)
How often did you notice this? Your ClickHouse was probably busy at the time and didn't complete the request.
    read tcp 172.27.0.4:34884->172.27.0.2:9000: i/o timeout
Bill Cavalieri (01/03/2023, 3:24 PM)

Bill Cavalieri (01/03/2023, 3:25 PM)

Srikanth Chekuri (01/03/2023, 3:26 PM)
    Every 30-60 minutes this happens during the day.
Does a broken pipe error show up every 30 mins?
Srikanth Chekuri (01/03/2023, 3:28 PM)

Bill Cavalieri (01/03/2023, 3:30 PM)
2023-01-03T15:05:59.355Z error prometheusexporter@v0.66.0/log.go:34 error encoding and sending metric family: write tcp 172.27.0.4:8889->172.27.0.7:50926: write: broken pipe
2023-01-03T15:05:59.355Z error prometheusexporter@v0.66.0/log.go:34 error encoding and sending metric family: write tcp 172.27.0.4:8889->172.27.0.7:50926: write: broken pipe
2023-01-03T15:05:59.355Z error prometheusexporter@v0.66.0/log.go:34 error encoding and sending metric family: write tcp 172.27.0.4:8889->172.27.0.7:50926: write: broken pipe
2023-01-03T15:07:59.583Z error prometheusexporter@v0.66.0/log.go:34 error encoding and sending metric family: write tcp 172.27.0.4:8889->172.27.0.7:36196: write: broken pipe
Bill Cavalieri (01/03/2023, 3:31 PM)

Srikanth Chekuri (01/03/2023, 3:34 PM)

Srikanth Chekuri (01/03/2023, 3:51 PM)

Srikanth Chekuri (01/03/2023, 3:57 PM)
clickhouse-setup_otel-collector-metrics_1? Do you see any scrape timeout errors?
Bill Cavalieri (01/03/2023, 4:00 PM)
2022-12-30T16:43:09.311Z info prometheusreceiver@v0.66.0/metrics_receiver.go:288 Starting scrape manager {"kind": "receiver", "name": "prometheus", "pipeline": "metrics"}
time="2022-12-30T16:54:13Z" level=error msg="read tcp 172.27.0.7:42490->172.27.0.2:9000: read: connection reset by peer" component=clickhouse
time="2022-12-30T16:54:19Z" level=error msg="dial tcp 172.27.0.2:9000: i/o timeout" component=clickhouse
time="2022-12-30T16:54:23Z" level=error msg="dial tcp 172.27.0.2:9000: connect: connection refused" component=clickhouse
time="2022-12-30T16:54:28Z" level=error msg="dial tcp 172.27.0.2:9000: connect: connection refused" component=clickhouse
2022-12-30T16:55:54.582Z info service/collector.go:219 Received signal from OS {"signal": "terminated"}
Bill Cavalieri (01/03/2023, 4:01 PM)
node-exporter:
  image: prom/node-exporter
  environment:
    GOMAXPROCS: '1'
Srikanth Chekuri (01/03/2023, 4:05 PM)
clickhouse-setup_otel-collector-metrics_1?

Srikanth Chekuri (01/03/2023, 4:13 PM)

Srikanth Chekuri (01/03/2023, 4:14 PM)
clickhouse-setup_otel-collector-metrics_1 has scrape failures.
Bill Cavalieri (01/03/2023, 4:16 PM)
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
0964498e706c signoz/frontend:0.13.0 "nginx -g 'daemon of…" 5 days ago Up 3 days 80/tcp, 0.0.0.0:3301->3301/tcp, :::3301->3301/tcp frontend
29743a5780f5 signoz/signoz-otel-collector:0.66.1 "/signoz-collector -…" 5 days ago Up 41 minutes 0.0.0.0:4317-4318->4317-4318/tcp, :::4317-4318->4317-4318/tcp clickhouse-setup_otel-collector_1
2d0db03a384e signoz/signoz-otel-collector:0.66.1 "/signoz-collector -…" 5 days ago Up 3 days 4317-4318/tcp clickhouse-setup_otel-collector-metrics_1
4be9e1f5908b signoz/query-service:0.13.0 "./query-service -co…" 5 days ago Up 3 days (healthy) 8080/tcp query-service
d207004996ec clickhouse/clickhouse-server:22.8.8-alpine "/entrypoint.sh" 3 weeks ago Up 3 days (healthy) 0.0.0.0:8123->8123/tcp, :::8123->8123/tcp, 0.0.0.0:9000->9000/tcp, :::9000->9000/tcp, 0.0.0.0:9181->9181/tcp, :::9181->9181/tcp, 9009/tcp clickhouse
725e4dea35b1 bitnami/zookeeper:3.7.0 "/opt/bitnami/script…" 3 weeks ago Up 3 days 0.0.0.0:2181->2181/tcp, :::2181->2181/tcp, 0.0.0.0:2888->2888/tcp, :::2888->2888/tcp, 0.0.0.0:3888->3888/tcp, :::3888->3888/tcp, 8080/tcp zookeeper-1
5d41c306f5e7 signoz/alertmanager:0.23.0-0.2 "/bin/alertmanager -…" 4 weeks ago Up 4 weeks 9093/tcp
docker stats
CONTAINER ID NAME CPU % MEM USAGE / LIMIT MEM % NET I/O BLOCK I/O PIDS
0964498e706c frontend 0.00% 2.332MiB / 9.72GiB 0.02% 2.43MB / 3.43MB 126MB / 8.9MB 5
29743a5780f5 clickhouse-setup_otel-collector_1 216.59% 2.167GiB / 9.72GiB 22.29% 3.16GB / 6.6GB 294MB / 131kB 11
2d0db03a384e clickhouse-setup_otel-collector-metrics_1 55.05% 1.219GiB / 9.72GiB 12.54% 389GB / 46.9GB 38.8GB / 27.7GB 12
4be9e1f5908b query-service 0.05% 198.9MiB / 9.72GiB 2.00% 490MB / 240MB 15.6GB / 3.79GB 12
d207004996ec clickhouse 57.66% 1.39GiB / 9.72GiB 14.30% 638GB / 762GB 1.09TB / 2.92TB 301
725e4dea35b1 zookeeper-1 0.25% 45.2MiB / 9.72GiB 0.45% 5.97MB / 5.59MB 7.76GB / 1.58GB 47
5d41c306f5e7 clickhouse-setup_alertmanager_1 0.08% 11.62MiB / 9.72GiB 0.12% 24.1MB / 23.8MB 9.14GB / 368MB 11
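Given the 389GB of network I/O on the metrics collector in the stats above, the scrape payload itself may be very large. A hedged helper to check how big and slow one scrape of the collector's Prometheus endpoint is (the function name is made up; port 8889 comes from the broken-pipe log lines, and the host is an assumption, so run it wherever the port is reachable):

```shell
# scrape_size: report the byte size and total time of fetching a
# Prometheus metrics endpoint once. Slow or huge responses explain
# scrape timeouts and the scraper hanging up mid-write (broken pipe).
scrape_size() {
  curl -s -o /dev/null -w 'bytes=%{size_download} time=%{time_total}s\n' "$1"
}

# Against the collector from this thread (host/port assumed):
# scrape_size http://localhost:8889/metrics
```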
Srikanth Chekuri (01/03/2023, 4:17 PM)

Bill Cavalieri (01/03/2023, 4:18 PM)

Srikanth Chekuri (01/03/2023, 4:19 PM)

Bill Cavalieri (01/03/2023, 4:20 PM)

Bill Cavalieri (01/03/2023, 4:24 PM)

Srikanth Chekuri (01/03/2023, 4:25 PM)

Bill Cavalieri (01/03/2023, 4:37 PM)

Srikanth Chekuri (01/03/2023, 4:55 PM)
Can you set scrape_interval to 90s and set scrape_timeout to something like 50s and see how it goes?
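For reference, those two knobs live under the prometheus receiver of the metrics collector's configuration. A sketch, assuming the default SigNoz docker-compose layout (the job name and target below are illustrative, not the exact shipped file; scrape_timeout must not exceed scrape_interval):

```yaml
# Sketch of the otel-collector-metrics config (names/targets assumed)
receivers:
  prometheus:
    config:
      global:
        scrape_interval: 90s  # default is 1m; stretch it so each scrape has room
        scrape_timeout: 50s   # must not exceed scrape_interval
      scrape_configs:
        - job_name: otel-collector
          static_configs:
            - targets: ["otel-collector:8889"]  # prometheus exporter port seen in the logs
```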
Bill Cavalieri (01/03/2023, 5:01 PM)

Ankit Nayan
ExitCode of the container might be helpful. Code 137 is OOMKilled.
docker inspect clickhouse-setup_otel-collector_1 --format='{{.State.ExitCode}}'

Ankit Nayan
docker inspect clickhouse-setup_otel-collector-metrics_1 --format='{{.State.ExitCode}}'
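Container exit codes follow the 128+signal convention, which is why 137 (128 plus SIGKILL's 9) usually means the kernel's OOM killer ended the process. A small sketch that labels whatever docker inspect returns (the helper name is made up):

```shell
# explain_exit: map a container exit code to a human-readable cause.
# 137 = 128 + 9 (SIGKILL), the usual signature of an OOM kill;
# 143 = 128 + 15 (SIGTERM), a normal docker stop.
explain_exit() {
  case "$1" in
    0)   echo "clean exit" ;;
    137) echo "SIGKILL (128+9): likely OOMKilled" ;;
    143) echo "SIGTERM (128+15): graceful stop" ;;
    *)   echo "exit code $1" ;;
  esac
}

# Usage with the commands above (requires Docker):
# explain_exit "$(docker inspect clickhouse-setup_otel-collector_1 --format='{{.State.ExitCode}}')"
explain_exit 137
```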
Bill Cavalieri (01/04/2023, 6:30 PM)

Srikanth Chekuri (01/04/2023, 6:33 PM)
About scrape_interval: it's not there today, so it uses the default value.

Bill Cavalieri (01/04/2023, 6:36 PM)