just did a fresh install with latest chart on a new clickhouse ...
2023/07/28 16:01:02 application run finished with error: failed to build pipelines: failed to create "clickhousetraces" exporter for data type "traces": code: 60, message: Table signoz_traces.distributed_signoz_index_v2 doesn't exist
anyone get this problem?
This happens when migration files are not executed properly. OtelCollector and OtelCollector Metrics are responsible for running those to create the tables.
Init containers make sure that those pods run only after clickhouse cluster is healthy. But it seems like something is not right with the cluster health or order.
Could you try restarting SigNoz OtelCollector and OtelCollector Metrics pods?
i did ... it has restarted over 100x by now
i just fixed it 2s ago
i found the last migration, in
and deleted the row then set the other ones to dirty = 0
there are 2 problems that i see:
• the chart doesn't respect
at all
• the pod does not query to see how many shards or replicas are present so it marks 1 as good and other as dirty
is a static value
at the moment, due to the limitations of go-migrate and it being static in the migration files
the pod does not query to see how many shards or replicas are present so it marks 1 as good and other as dirty
Could you please elaborate on this?
if i have 2 replicas (default behavior) then the other replica still has migration marked as dirty
I see. Dirty migration issue. TMK, this is usually resolved in later pod restarts. @Srikanth Chekuri could you please look into this?
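To make the dirty-flag behaviour discussed above concrete, here is a minimal in-memory sketch. It is a toy model, loosely based on how golang-migrate-style tools flag a version as dirty before running it and clear the flag afterwards; it is not SigNoz or golang-migrate code, and `VersionRecord`, `DirtyVersionError`, and `run_pending` are hypothetical names.

```python
class DirtyVersionError(Exception):
    """Raised when the version record is still flagged dirty from an earlier run."""


class VersionRecord:
    """Hypothetical in-memory stand-in for the migration bookkeeping row."""

    def __init__(self):
        self.version = 0      # 0 means nothing has been applied yet
        self.dirty = False


def run_pending(record, migrations):
    """Apply migrations, given as (version, callable) pairs, in ascending order."""
    if record.dirty:
        # Refuse to continue until someone repairs the record by hand.
        raise DirtyVersionError(f"version {record.version} is dirty")
    for version, apply in sorted(migrations, key=lambda m: m[0]):
        if version <= record.version:
            continue          # already recorded as applied
        record.version = version
        record.dirty = True   # flagged before the statement runs
        apply()               # if this raises, the flag is never cleared
        record.dirty = False
```

In this model, a migration that fails halfway leaves the record dirty, so every later run (for example, after a pod restart) stops immediately. Note also that clearing only the flag leaves the recorded version at the failed migration, so the next run skips that migration instead of retrying it.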
2 more issues @Prashant Shahi
signoz-otel-collector-7b86585cd5-nrnnd signoz-otel-collector 2023-08-01T16:57:09.381Z   info    clickhousetracesexporter/clickhouse_factory.go:127      Clickhouse Migrate finished {"kind": "exporter", "data_type": "traces", "name": "clickhousetraces", "error": "migration failed in line 0: \n\n\n\nCREATE MATERIALIZED VIEW IF NOT EXISTS signoz_traces.dependency_graph_minutes_service_calls_mv ON CLUSTER cluster\nTO signoz_traces.dependency_graph_minutes AS\nSELECT\n    A.serviceName as src,\n    B.serviceName as dest,\n    quantilesState(0.5, 0.75, 0.9, 0.95, 0.99)(toFloat64(B.durationNano)) as duration_quantiles_state,\n    countIf(B.statusCode=2) as error_count,\n    count(*) as total_count,\n    toStartOfMinute(B.timestamp) as timestamp\nFROM signoz_traces.signoz_index_v2 AS A, signoz_traces.signoz_index_v2 AS B\nWHERE (A.serviceName != B.serviceName) AND (A.spanID = B.parentSpanID)\nGROUP BY timestamp, src, dest; (details: code: 47, message: Unknown identifier: B.timestamp; there are columns: timestamp, serviceName, B.serviceName, quantilesState(0.5, 0.75, 0.9, 0.95, 0.99)(toFloat64(B.durationNano)), countIf(equals(B.statusCode, 2)), count(): While processing serviceName AS src, B.serviceName AS dest, quantilesState(0.5, 0.75, 0.9, 0.95, 0.99)(toFloat64(B.durationNano)) AS duration_quantiles_state, countIf(B.statusCode = 2) AS error_count, count() AS total_count, toStartOfMinute(B.timestamp) AS timestamp)"}
signoz-otel-collector-7b86585cd5-nrnnd signoz-otel-collector 2023-08-01T17:00:01.906Z   error   clickhousetracesexporter/writer.go:128  Could not write a batch of spans    {"kind": "exporter", "data_type": "traces", "name": "clickhousetraces", "error": "code: 60, message: Table signoz_traces.distributed_span_attributes doesn't exist"}
cc @Ankit Nayan @Vishal Sharma
it seems like the problem is related to the dirty flag that i pointed out earlier
constantly changing it to 0 and restarting the pod resolves it
the issue is that if you don't know how many total migrations there are, you don't know when to stop
apart from the issue that it shouldn't be required 😕
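One way to get a stop condition for that reset-and-restart loop is to derive the highest migration version from the migration file names. The sketch below assumes the golang-migrate naming convention `{version}_{title}.up.{extension}`; the helper names and the example file names in the usage note are hypothetical, and how you obtain the current version and dirty flag from ClickHouse is left out.

```python
import re

# Matches "up" migration files such as "000002_add_index.up.sql".
UP_FILE = re.compile(r"^(\d+)_.*\.up\.[^.]+$")


def latest_version(filenames):
    """Return the highest version number among the 'up' migration files."""
    versions = []
    for name in filenames:
        match = UP_FILE.match(name)
        if match:
            versions.append(int(match.group(1)))
    if not versions:
        raise ValueError("no up-migration files found")
    return max(versions)


def is_done(current_version, dirty, filenames):
    """True once the recorded version is clean and equals the latest file version."""
    return (not dirty) and current_version == latest_version(filenames)
```

For example, with files named `000001_init.up.sql`, `000001_init.down.sql`, and `000002_add_index.up.sql`, `is_done` only returns true once the recorded version is 2 and the dirty flag is cleared.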
Please subscribe to this issue if you want to keep an eye on when it gets fixed https://github.com/SigNoz/signoz-otel-collector/issues/156. The root cause is a lack of atomicity.