Signoz pods not Running

Maitryy

over 1 year ago
Hi Team!! I'm trying to deploy SigNoz on Minikube using Helm behind a proxy. The pod status is:
platform        chi-my-release-clickhouse-cluster-0-0-0                     1/1     Running                 0                 10h
platform        my-release-clickhouse-operator-657986696-mtgdq              2/2     Running                 0                 12h
platform        my-release-k8s-infra-otel-agent-pvf29                       1/1     Running                 0                 12h
platform        my-release-k8s-infra-otel-deployment-65767679c6-llgmg       1/1     Running                 0                 12h
platform        my-release-signoz-alertmanager-0                            0/1     Init:0/1                0                 9h
platform        my-release-signoz-frontend-5fc8679d4b-zd5c9                 0/1     Init:0/1                0                 15h
platform        my-release-signoz-frontend-775b95894-rl5pm                  0/1     Init:0/1                0                 11h
platform        my-release-signoz-otel-collector-577f7cc9c6-jswbm           0/1     Init:0/1                0                 12h
platform        my-release-signoz-otel-collector-7b7784c866-hr754           0/1     Init:0/1                0                 15h
platform        my-release-signoz-otel-collector-metrics-54d75b67c7-5ccx9   0/1     Init:0/1                0                 12h
platform        my-release-signoz-otel-collector-metrics-7f9fcd767-tqqxv    0/1     Init:0/1                0                 15h
platform        my-release-signoz-query-service-0                           0/1     Init:0/1                0                 10h
platform        my-release-signoz-schema-migrator-56769c434706-mzm2s        0/1     Init:0/1                0                 12h
platform        my-release-zookeeper-0                                      1/1     Running                 0                 15h
In the init container logs I see:
wget: bad address 'my-release-signoz-query-service:8080'
waiting for query-service
wget: bad address 'my-release-signoz-query-service:8080'
waiting for query-service

---> query-service init container logs:
wget: bad address 'my-release-clickhouse:8123'
waiting for clickhouseDB
wget: bad address 'my-release-clickhouse:8123'
waiting for clickhouseDB
I was thinking it might be a CoreDNS issue. CoreDNS logs:
[INFO] 10.244.0.46:45441 - 56699 "AAAA IN my-release-signoz-query-service. udp 49 false 512" SERVFAIL qr,aa,rd,ra 49 0.000118854s
[INFO] 10.244.0.46:45441 - 31615 "A IN my-release-signoz-query-service. udp 49 false 512" SERVFAIL qr,aa,rd,ra 49 0.000033334s
[INFO] 10.244.0.56:54189 - 49629 "A IN my-release-clickhouse. udp 39 false 512" SERVFAIL qr,rd,ra 39 0.025739722s
[INFO] 10.244.0.56:54189 - 54743 "AAAA IN my-release-clickhouse. udp 39 false 512" SERVFAIL qr,rd,ra 39 0.025829025s
[INFO] 10.244.0.48:49914 - 36437 "AAAA IN my-release-clickhouse.platform-1.svc.cluster.local. udp 68 false 512" NOERROR qr,aa,rd 161 0.000295586s
[INFO] 10.244.0.48:49914 - 43859 "A IN my-release-clickhouse.platform-1.svc.cluster.local. udp 68 false 512" NOERROR qr,aa,rd 134 0.000323042s
[INFO] 10.244.0.43:34472 - 30113 "AAAA IN my-release-signoz-otel-collector.platform-1.svc.cluster.local. udp 90 false 1232" NOERROR qr,aa,rd 172 0.000309202s
[INFO] 10.244.0.43:43698 - 24992 "A IN my-release-signoz-otel-collector.platform-1.svc.cluster.local. udp 90 false 1232" NOERROR qr,aa,rd 156 0.000315525s
Please let me know what the problem is and how to resolve it.
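For anyone hitting the same state: the init containers only run wget against the bare service name, and the "bad address" error means that name never resolved. A quick, hedged way to confirm whether this is a cluster-DNS (or proxy-environment) problem is to repeat the lookups from a throwaway pod; the namespace and release names below are the ones from the output above, so adjust them for your install:
kubectl -n platform run dns-test --rm -it --image=busybox:1.36 --restart=Never -- sh
# inside the pod:
cat /etc/resolv.conf                                                   # search domains the kubelet injected
nslookup my-release-signoz-query-service                               # short name, as the init containers use it
nslookup my-release-signoz-query-service.platform.svc.cluster.local   # fully qualified name
If only the short name fails while the FQDN resolves, the pod's resolv.conf search list (or DNS settings on the minikube node inherited from the proxy setup) is the usual suspect rather than SigNoz itself.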
Please, has anyone exposed their otel-collector through nginx ingress before? I need help, here is my...

Abdulmalik Salawu

9 months ago
Please, has anyone exposed their otel-collector through nginx ingress before? I need help; here are my current settings:
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: signoz-otel-collector-grpc-ingress
  namespace: ops
  annotations:
    kubernetes.io/ingress.class: nginx
    nginx.ingress.kubernetes.io/grpc-backend: "true"
    nginx.ingress.kubernetes.io/backend-protocol: "GRPCS"
    nginx.ingress.kubernetes.io/proxy-buffer-size: "128k"
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
    nginx.ingress.kubernetes.io/proxy-body-size: "0"
    nginx.ingress.kubernetes.io/proxy-connect-timeout: "300"
    nginx.ingress.kubernetes.io/proxy-read-timeout: "300"
    nginx.ingress.kubernetes.io/proxy-send-timeout: "300"
    nginx.ingress.kubernetes.io/upstream-keepalive-timeout: "600"
    nginx.ingress.kubernetes.io/upstream-keepalive-requests: "100"
spec:
  rules:
  - host: otelcollector.domain.com
    http:
      paths:
      - path: /
        pathType: ImplementationSpecific
        backend:
          service:
            name: signoz-otel-collector
            port:
              number: 4317
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: signoz-otel-collector-http-ingress
  namespace: ops
  annotations:
    ingressClassName: nginx
    nginx.ingress.kubernetes.io/backend-protocol: HTTP
    nginx.ingress.kubernetes.io/proxy-connect-timeout: "300"
    nginx.ingress.kubernetes.io/proxy-read-timeout: "300"
    nginx.ingress.kubernetes.io/proxy-send-timeout: "300"
spec:
  rules:
  - host: otelcollector-http.domain.com
    http:
      paths:
      - path: /
        pathType: ImplementationSpecific
        backend:
          service:
            name: signoz-otel-collector
            port:
              number: 4318
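For what it's worth, once an ingress like this is in place applications point their OTLP exporters at the ingress hosts rather than the in-cluster service. A minimal sketch using the standard OpenTelemetry SDK environment variables, with the placeholder hostnames from the manifests above and TLS assumed to be terminated by the nginx ingress on 443:
# OTLP over gRPC through the first ingress
export OTEL_EXPORTER_OTLP_PROTOCOL=grpc
export OTEL_EXPORTER_OTLP_ENDPOINT=https://otelcollector.domain.com:443
# OTLP over HTTP through the second ingress
export OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf
export OTEL_EXPORTER_OTLP_ENDPOINT=http://otelcollector-http.domain.com
One thing worth double-checking: `grpc-backend: "true"` is the legacy form of `backend-protocol: GRPC`, and `GRPCS` tells nginx to speak TLS to the backend, so if the collector's 4317 receiver is not itself serving TLS (the default), `GRPC` is usually the value you want.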
Hello, I have observed a significant increase in CPU utilization after the Signoz version upgrade to...

saurabh biramwar

over 1 year ago
Hello, I have observed a significant increase in CPU utilization after the Signoz version upgrade to v0.41.1. Additionally, we're encountering the following error in the ClickHouse pod:
2024.03.21 13:04:42.644576 [ 1011 ] {bc235080-bd8e-4a26-99d6-69e10ec9bee9} <Error> TCPHandler: Code: 210. DB::NetException: I/O error: Broken pipe, while writing to socket ([::ffff:10.0.145.23]:9000 -> [::ffff:10.0.129.149]:33502). (NETWORK_ERROR), Stack trace (when copying this message, always include the lines below):
0. DB::Exception::Exception(DB::Exception::MessageMasked&&, int, bool) @ 0x000000000c800f1b in /usr/bin/clickhouse
1. DB::NetException::NetException<String, String, String>(int, FormatStringHelperImpl<std::type_identity<String>::type, std::type_identity<String>::type, std::type_identity<String>::type>, String&&, String&&, String&&) @ 0x000000000caa69a1 in /usr/bin/clickhouse
2. DB::WriteBufferFromPocoSocket::nextImpl() @ 0x000000000caa733e in /usr/bin/clickhouse
3. DB::TCPHandler::runImpl() @ 0x000000001292120f in /usr/bin/clickhouse
4. DB::TCPHandler::run() @ 0x0000000012933eb9 in /usr/bin/clickhouse
5. Poco::Net::TCPServerConnection::start() @ 0x00000000153a5a72 in /usr/bin/clickhouse
6. Poco::Net::TCPServerDispatcher::run() @ 0x00000000153a6871 in /usr/bin/clickhouse
7. Poco::PooledThread::run() @ 0x000000001549f047 in /usr/bin/clickhouse
8. Poco::ThreadImpl::runnableEntry(void*) @ 0x000000001549d67d in /usr/bin/clickhouse
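If it helps with triage: a broken pipe while ClickHouse is writing on port 9000 generally means the client on the other end (typically the otel-collector or query-service) dropped the connection mid-query, so this error and the CPU increase are often two symptoms of the same overload rather than independent problems. A rough, hedged way to see what ClickHouse is busy with, reusing the namespace and pod names from the first post in this thread (adjust for your install):
kubectl -n platform top pod chi-my-release-clickhouse-cluster-0-0-0
kubectl -n platform exec -it chi-my-release-clickhouse-cluster-0-0-0 -- \
  clickhouse-client --query "SELECT query_id, elapsed, read_rows, memory_usage, query FROM system.processes ORDER BY elapsed DESC"
The first command needs metrics-server; the second lists the queries currently running inside ClickHouse with their elapsed time and memory usage.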
Hello I keep seeing these error logs in the `signoz-otel-collector` pod 2024-03-20T08:22:57.138Z...

Srinivas Anant

over 1 year ago
Hello, I keep seeing these error logs in the `signoz-otel-collector` pod:
2024-03-20T08:22:57.138Z	error	clickhousetracesexporter/writer.go:421	Could not write a batch of spans to index table: 	{"kind": "exporter", "data_type": "traces", "name": "clickhousetraces", "error": "clickhouse [Append]:  clickhouse: expected 37 arguments, got 36"}
github.com/SigNoz/signoz-otel-collector/exporter/clickhousetracesexporter.(*SpanWriter).WriteBatchOfSpans
	/home/runner/work/signoz-otel-collector/signoz-otel-collector/exporter/clickhousetracesexporter/writer.go:421
github.com/SigNoz/signoz-otel-collector/exporter/clickhousetracesexporter.(*storage).pushTraceData
	/home/runner/work/signoz-otel-collector/signoz-otel-collector/exporter/clickhousetracesexporter/clickhouse_exporter.go:424
go.opentelemetry.io/collector/exporter/exporterhelper.(*tracesRequest).Export
	/home/runner/go/pkg/mod/go.opentelemetry.io/collector/exporter@v0.88.0/exporterhelper/traces.go:60
go.opentelemetry.io/collector/exporter/exporterhelper.(*timeoutSender).send
	/home/runner/go/pkg/mod/go.opentelemetry.io/collector/exporter@v0.88.0/exporterhelper/timeout_sender.go:41
go.opentelemetry.io/collector/exporter/exporterhelper.(*retrySender).send
	/home/runner/go/pkg/mod/go.opentelemetry.io/collector/exporter@v0.88.0/exporterhelper/retry_sender.go:138
go.opentelemetry.io/collector/exporter/exporterhelper.(*tracesExporterWithObservability).send
	/home/runner/go/pkg/mod/go.opentelemetry.io/collector/exporter@v0.88.0/exporterhelper/traces.go:177
go.opentelemetry.io/collector/exporter/exporterhelper.(*queueSender).start.func1
	/home/runner/go/pkg/mod/go.opentelemetry.io/collector/exporter@v0.88.0/exporterhelper/queue_sender.go:126
go.opentelemetry.io/collector/exporter/exporterhelper/internal.(*boundedMemoryQueue).Start.func1
	/home/runner/go/pkg/mod/go.opentelemetry.io/collector/exporter@v0.88.0/exporterhelper/internal/bounded_memory_queue.go:52
2024-03-20T08:22:57.138Z	info	exporterhelper/retry_sender.go:177	Exporting failed. Will retry the request after interval.	{"kind": "exporter", "data_type": "traces", "name": "clickhousetraces", "error": "clickhouse [Append]:  clickhouse: expected 37 arguments, got 36", "interval": "8.596695142s"}
What could cause this issue? I checked the ClickHouse container utilisation and it's normal.
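In case others hit the same thing: the "expected 37 arguments, got 36" error from the Append call usually means the collector's insert statement and the trace index table no longer agree on the column count, which tends to happen when the otel-collector image and the schema (managed by the schema-migrator) end up on different versions. A hedged way to compare the two, assuming the default signoz_traces.signoz_index_v2 table and release names like those earlier in this thread (adjust for your install):
# image/version of the running collector
kubectl -n platform get deploy my-release-signoz-otel-collector -o jsonpath='{.spec.template.spec.containers[0].image}'
# column count of the trace index table
kubectl -n platform exec -it chi-my-release-clickhouse-cluster-0-0-0 -- \
  clickhouse-client --query "SELECT count() FROM system.columns WHERE database = 'signoz_traces' AND table = 'signoz_index_v2'"
If the column count doesn't match what the collector expects, aligning the chart and collector versions (and letting the schema migrator run) is usually the fix.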