# support
k
Hi team, I have self-hosted SigNoz on a self-managed k8s cluster; the SigNoz version is v0.76.2. I'm facing a peculiar issue where, all of a sudden, traces are no longer visible in SigNoz for any of our Spring Boot microservices. The Services tab stopped showing any of the microservices that were showing up earlier and whose traces were also visible. Logs are still visible; it's only the traces that are not being captured and/or displayed anymore. The only change we had was a restart of the microservices. We send these logs and traces using the otel-java-agent, which we enable when starting up our Spring Boot microservices. A similar issue had happened earlier, when I had set up SigNoz v0.57.0, so I thought upgrading SigNoz and its components might help, but the same behaviour is observed again with the latest SigNoz version. Can you help/guide on what could be the reason, and how do I resolve this?
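For context, we attach the agent at startup roughly like this (a sketch; the agent path, service name, and collector endpoint here are placeholders, not our exact values):

```bash
# Rough sketch of how each Spring Boot service is started with the otel-java-agent
# (agent path, service name, and collector endpoint are placeholders / assumptions)
java -javaagent:/opt/otel/opentelemetry-javaagent.jar \
     -Dotel.service.name=<service-name> \
     -Dotel.exporter.otlp.endpoint=http://<release>-signoz-otel-collector.monitoring.svc.cluster.local:4317 \
     -jar app.jar
```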
v
Hi, how are you running 0.76.2?
k
Using the SigNoz Helm chart. Everything is self-hosted, including the k8s cluster.
@Srikanth Chekuri will you be able to help here?
s
Hello @kankan ghosh, Are you not seeing any services/traces or is it only some of the services?
k
I am not seeing any services and/or traces anymore. They were showing up until the day before yesterday, and then suddenly stopped.
Although logs are still showing up for the same services.
s
1. Can you share the configmap values for query-service and signoz-otel-collector?
2. Do you see any errors in the signoz-otel-collector logs?
3. If you can exec into ClickHouse, please get the output of the following.
```sql
SELECT max(timestamp)
FROM signoz_traces.signoz_index_v3
```
cc @Nagesh Bansal
k
Configmap for SigNoz:
Getting the one for the collector too.
s
Please share the signoz pod args too
k
```
[root@j---------- ------]# kubectl describe cm @@@@@@-signoz-otel-collector -n monitoring
Name:         @@@@@@@-signoz-otel-collector
Namespace:    monitoring
Labels:       app.kubernetes.io/component=otel-collector
              app.kubernetes.io/instance=@@@@@@
              app.kubernetes.io/managed-by=Helm
              app.kubernetes.io/name=signoz
              app.kubernetes.io/version=v0.76.2
              helm.sh/chart=signoz-0.74.3
Annotations:  meta.helm.sh/release-name: @@@@@@
              meta.helm.sh/release-namespace: monitoring

Data
====
otel-collector-config.yaml:
----
exporters:
  clickhouselogsexporter:
    dsn: tcp://${env:CLICKHOUSE_USER}:${env:CLICKHOUSE_PASSWORD}@${env:CLICKHOUSE_HOST}:${env:CLICKHOUSE_PORT}/${env:CLICKHOUSE_LOG_DATABASE}
    timeout: 10s
    use_new_schema: true
  clickhousemetricswrite:
    endpoint: tcp://${env:CLICKHOUSE_USER}:${env:CLICKHOUSE_PASSWORD}@${env:CLICKHOUSE_HOST}:${env:CLICKHOUSE_PORT}/${env:CLICKHOUSE_DATABASE}
    resource_to_telemetry_conversion:
      enabled: true
    timeout: 15s
  clickhousetraces:
    datasource: tcp://${env:CLICKHOUSE_USER}:${env:CLICKHOUSE_PASSWORD}@${env:CLICKHOUSE_HOST}:${env:CLICKHOUSE_PORT}/${env:CLICKHOUSE_TRACE_DATABASE}
    low_cardinal_exception_grouping: ${env:LOW_CARDINAL_EXCEPTION_GROUPING}
    use_new_schema: true
  metadataexporter:
    cache:
      provider: in_memory
    dsn: tcp://${env:CLICKHOUSE_USER}:${env:CLICKHOUSE_PASSWORD}@${env:CLICKHOUSE_HOST}:${env:CLICKHOUSE_PORT}/signoz_metadata
    tenant_id: ${env:TENANT_ID}
    timeout: 10s
extensions:
  health_check:
    endpoint: 0.0.0.0:13133
  pprof:
    endpoint: localhost:1777
  zpages:
    endpoint: localhost:55679
processors:
  batch:
    send_batch_max_size: 15000
    send_batch_size: 10000
    timeout: 1s
  memory_limiter:
    check_interval: 1s
    limit_mib: 1000
    spike_limit_mib: 200
  probabilistic_sampler/logs:
    sampling_percentage: 50
  signozspanmetrics/delta:
    aggregation_temporality: AGGREGATION_TEMPORALITY_DELTA
    dimensions:
      - default: default
        name: service.namespace
      - default: default
        name: deployment.environment
      - name: signoz.collector.id
    dimensions_cache_size: 100000
    latency_histogram_buckets:
      - 100us
      - 1ms
      - 2ms
      - 6ms
      - 10ms
      - 50ms
      - 100ms
      - 250ms
      - 500ms
      - 1000ms
      - 1400ms
      - 2000ms
      - 5s
      - 10s
      - 20s
      - 40s
      - 60s
    metrics_exporter: clickhousemetricswrite
  tail_sampling:
    decision_wait: 10s
    expected_new_traces_per_sec: 10
    num_traces: 100
    policies:
      - and:
          and_sub_policy:
            - name: threshold-policy
              string_attribute:
                key: service.name
                values:
                  - backend-service
              type: string_attribute
            - name: route-name-policy
              string_attribute:
                enabled_regex_matching: true
                key: http.route
                values:
                  - /acs-base-service.*
              type: string_attribute
            - latency:
                threshold_ms: 90
              name: latency-policy
              type: latency
        name: threshold
        type: and
receivers:
  httplogreceiver/heroku:
    endpoint: 0.0.0.0:8081
    source: heroku
  httplogreceiver/json:
    endpoint: 0.0.0.0:8082
    source: json
  jaeger:
    protocols:
      grpc:
        endpoint: 0.0.0.0:14250
      thrift_http:
        endpoint: 0.0.0.0:14268
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
        max_recv_msg_size_mib: 16
      http:
        endpoint: 0.0.0.0:4318
service:
  extensions:
    - health_check
    - zpages
    - pprof
  pipelines:
    logs:
      exporters:
        - clickhouselogsexporter
        - metadataexporter
      processors:
        - batch
      receivers:
        - otlp
        - httplogreceiver/heroku
        - httplogreceiver/json
    metrics:
      exporters:
        - clickhousemetricswrite
        - metadataexporter
      processors:
        - batch
      receivers:
        - otlp
    traces:
      exporters:
        - clickhousetraces
        - metadataexporter
      processors:
        - signozspanmetrics/delta
        - batch
      receivers:
        - otlp
        - jaeger
  telemetry:
    logs:
      encoding: json
    metrics:
      address: 0.0.0.0:8888

otel-collector-opamp-config.yaml:
----
server_endpoint: "ws://@@@@@@-signoz:4320/v1/opamp"

BinaryData
====

Events:  <none>
```
here are signoz statefulset pod args
I do see error messages in the signoz-otel-collector logs
s
Please override the config, increase the timeout for the traces exporter, and check again:
```yaml
otelCollector:
    config:
        exporters:
            clickhousetraces:
                timeout: 30s
```
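If it helps, the override can be applied with something like the following (a sketch; it assumes the signoz/signoz chart, the release name and monitoring namespace from the kubectl output above, and that the snippet is saved as override-values.yaml):

```bash
# Sketch: apply the override on top of the existing release values
# (<release> is the Helm release name; override-values.yaml holds the snippet above)
helm upgrade <release> signoz/signoz -n monitoring --reuse-values -f override-values.yaml
```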
k
okay.. let me try it out
the output of the query SELECT max(timestamp) FROM signoz_traces.signoz_index_v3
s
So you should be seeing at least some spans. Are you not able to see anything in traces explorer?
k
yes, now I can see some spans
after making this change
even services tab is showing up the services now
will observe the setup for some time. Thank you very much for your timely response. Really appreciate it
s
Hi @kankan ghosh, Can you give us some idea of your telemetry volume? This will help us better help you and others like you with DIY guides.
k
Hi @Srikanth Chekuri - sorry, I totally missed this ping of yours. Do you have a specific query I can run to give you the telemetry data size? As an example, I got the top tables by size:
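For reference, the kind of query I ran for that was along these lines (a generic ClickHouse system.parts aggregation; the exact query and output I used may differ):

```sql
-- Top tables by on-disk size (active parts only)
SELECT
    database,
    table,
    formatReadableSize(sum(bytes_on_disk)) AS disk_size,
    sum(rows) AS total_rows
FROM system.parts
WHERE active
GROUP BY database, table
ORDER BY sum(bytes_on_disk) DESC
LIMIT 10
```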
Also, do you have any suggestions on how to reduce the amount of data being collected and stored in ClickHouse?