Hi everyone! We're currently facing an issue in production with SigNoz + ClickHouse replication, and I’d love to hear if anyone has experienced something similar or has suggestions.
We’re running a ClickHouse cluster with multiple replicas, and we’ve noticed that when opening specific spans (example: signoz/trace/378c3b108bf0859800ab9d9dbb1b05ae?spanId=e8065564f6ac4631&levelUp=0&levelDown=0), we often need to refresh the page several times before the data appears. Our guess is that the query sometimes hits a replica that hasn’t received the replicated data yet.
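To test that guess, here is a rough sketch of how replica lag could be checked. It assumes the cluster is named `cluster` and the trace tables live in a database called `signoz_traces`; both are placeholders to adjust for your deployment. `ReplicaStatus` and `lagging_replicas` are hypothetical helper names used only for illustration. The query reads `system.replicas` on every replica through `clusterAllReplicas`, and the sample rows are made up, not output from our cluster.

```python
from typing import Iterable, List, NamedTuple

# Assumption: 'cluster' and 'signoz_traces' are placeholder names.
# Run this query with any ClickHouse client and feed the rows to lagging_replicas().
LAG_QUERY = """
SELECT hostName() AS host, table, absolute_delay, queue_size
FROM clusterAllReplicas('cluster', system.replicas)
WHERE database = 'signoz_traces'
"""


class ReplicaStatus(NamedTuple):
    host: str
    table: str
    absolute_delay: int  # seconds this replica is behind
    queue_size: int      # replication tasks still waiting to run


def lagging_replicas(
    rows: Iterable[ReplicaStatus],
    max_delay_s: int = 0,
    max_queue: int = 0,
) -> List[ReplicaStatus]:
    """Return the rows whose delay or queue size exceeds the given thresholds."""
    return [
        r for r in rows
        if r.absolute_delay > max_delay_s or r.queue_size > max_queue
    ]


# Made-up sample rows, only to show the shape of the data.
sample = [
    ReplicaStatus("replica-a", "example_table", 0, 0),
    ReplicaStatus("replica-b", "example_table", 12, 4),
]
print([r.host for r in lagging_replicas(sample)])
```

If one replica regularly shows a non-zero `absolute_delay` or `queue_size` right after ingestion, that would be consistent with reads landing on a replica that is still catching up.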
Is there a recommended way to ensure consistent reads across replicas in SigNoz? Can we configure it to query all nodes or route through a coordinator to avoid this inconsistency?
We’re currently testing this in staging before applying any changes to production. Any insights would be greatly appreciated — thanks in advance!