Divyansh Sharma

4 months ago
Hi Team, I've done a fresh setup of SigNoz with ClickHouse on EKS with 2 shards and 2 replicas (https://signoz.io/docs/operate/clickhouse/distributed-clickhouse/#kubernetes-installation). Now, whenever I do a helm upgrade, the signoz-schema-migrator-sync job runs, fails a few times with table-not-found errors, and then succeeds on its own.
Error: code: 60, message: There was an error on [chi-signoz-clickhouse-cluster-1-1:9000]: Code: 60. DB::Exception: Could not find table: time_series_v4. (UNKNOWN_TABLE) (version 24.1.2.5 (official build))
I also see missing-table errors in the ClickHouse logs. Error logs from chi-signoz-clickhouse-cluster-0-0-0:
"message":"Code: 60. DB::Exception: Received from chi-signoz-clickhouse-cluster-1-1:9000. DB::Exception: Table signoz_metrics.samples_v4 does not exist.
"message":"Code: 60. DB::Exception: Received from chi-signoz-clickhouse-cluster-1-1:9000. DB::Exception: Table signoz_metrics.samples_v2 does not exist.
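To see at a glance which host each error points at, the lines quoted above can be tallied with a small script. This is only a rough sketch: the function name `missing_tables_by_host` and the two regexes are illustrative assumptions written against these three sample lines, not anything provided by SigNoz or ClickHouse, and other log formats may need different patterns.

```python
import re
from collections import defaultdict

# Sample lines copied from the errors quoted above.
LOG_LINES = [
    "Error: code: 60, message: There was an error on [chi-signoz-clickhouse-cluster-1-1:9000]: "
    "Code: 60. DB::Exception: Could not find table: time_series_v4. (UNKNOWN_TABLE) "
    "(version 24.1.2.5 (official build))",
    '"message":"Code: 60. DB::Exception: Received from chi-signoz-clickhouse-cluster-1-1:9000. '
    "DB::Exception: Table signoz_metrics.samples_v4 does not exist.",
    '"message":"Code: 60. DB::Exception: Received from chi-signoz-clickhouse-cluster-1-1:9000. '
    "DB::Exception: Table signoz_metrics.samples_v2 does not exist.",
]

# Assumed patterns: the host appears either as "error on [host:port]" or "Received from host:port".
HOST_RE = re.compile(r"(?:error on \[|Received from )([\w.-]+):\d+")
# Assumed patterns: "Could not find table: NAME." or "Table NAME does not exist".
TABLE_RE = re.compile(r"(?:Could not find table: |Table )([\w.]+?)(?:\. | does not exist)")


def missing_tables_by_host(lines):
    """Group the table names reported as missing by the host that reported them."""
    found = defaultdict(set)
    for line in lines:
        host = HOST_RE.search(line)
        table = TABLE_RE.search(line)
        if host and table:
            found[host.group(1)].add(table.group(1))
    # Sort for stable, readable output.
    return {host: sorted(tables) for host, tables in found.items()}


if __name__ == "__main__":
    for host, tables in missing_tables_by_host(LOG_LINES).items():
        print(host, tables)
```

For the three lines above, every error is attributed to chi-signoz-clickhouse-cluster-1-1, which suggests looking at that replica first.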
Then, when I set the retention period in the UI, it just gets stuck, and the query service logs show that it is not able to run GetTTL against ClickHouse:
"msg":"http: panic serving 10.10.249.136:55804: runtime error: invalid memory address or nil pointer dereference\ngoroutine 684 [running]:\nnet/http.(*conn).serve.func1()\n\t/home/runner/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.22.7.linux-amd64/src/net/http/server.go:1903 +0xbe\npanic({0x22593a0?, 0x4155d20?})\n\t/home/runner/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.22.7.linux-amd64/src/runtime/panic.go:770 +0x132\ngo.signoz.io/signoz/pkg/query-service/app/clickhouseReader.(*ClickHouseReader).GetTTL(0xc00011f688, {0x2f132b8, 0xc000d074a0}
I cleared the SQLite DB table as well, but it is still stuck (https://signoz.io/docs/faqs/troubleshooting/#i-am-trying-to-change-the-retention-period-of-traces-but-the-process-gets-stuck-everytime). Am I missing something with respect to the DB schemas? Has anyone been able to make it work with the latest Helm chart, appVersion=0.73.0?