# support
j
Hey guys, we recently moved our SigNoz instance from docker-compose to Kubernetes for better scalability and reliability. We are facing issues where the SigNoz instance crashes: the otel-collector queue fills up and the otel-collector runs out of memory. We have 10 replicas of the otel-collector and are still facing issues. Is this a problem with the ClickHouse database not being able to handle the load? If that is the case, how do I scale the database to handle this load?
n
Are you adding batch processors before sending insert queries to ClickHouse? If yes, try increasing the batch size and the timeout.
j
Doesn't the otel-collector do batch processing by default? How do I increase the batch size and the timeout? Srikanth had suggested that wouldn't be the problem,
because I can see in the Helm chart values that the default send_batch_size is 50k and the timeout is 1s.
This is the otel-collector config that I can see:
n
Can you try increasing the batch timeout to a higher number? ClickHouse is complaining because too many small insert requests are coming in. @Srikanth Chekuri @Prashant Shahi, any more ideas?
j
I've changed the timeout to 10s now, and the batch size is 50k, which I think should be good enough.
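For reference, the batch settings discussed here (50k batch size, 10s timeout) would look roughly like this in the otel-collector config. This is only a sketch, assuming the standard `batch` processor; where the block sits in the Helm values depends on the chart version, and the rest of the pipeline is omitted.

```yaml
# Sketch of the batch processor settings mentioned in this thread.
# Only the two values discussed are shown; everything else is left at defaults.
processors:
  batch:
    # Number of spans/metrics/log records that triggers a send
    send_batch_size: 50000
    # Maximum time to wait before sending a partially filled batch
    # (raised from the 1s default seen in the chart values)
    timeout: 10s
```

A longer timeout means fewer, larger inserts into ClickHouse, at the cost of data being held in collector memory a bit longer before it is flushed.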
s
For anyone reading this: the cause turned out to be the disk, which was making merges slower.