Our queries have gotten very slow even for just 1 ...
# support
d
Our queries have gotten very slow even for just 1 day of data, and I can't query 1 month of data at all as of now. We are using the default retention period and AWS EBS gp3 for disk. Resources allocated: 4 vCPU and 16GB of RAM. SigNoz version 0.86. Are there any suggestions to improve query performance?
@Srikanth Chekuri any idea?
h
It would help to indicate which type of telemetry (metrics, logs, traces), show your query, and say what your table sizes are. That'd make it easier to spot whether you're including indexed columns that allow ClickHouse to filter efficiently rather than scan the entire data set.
s
Yes, @Dhruv garg, please share more context. Did this change without a version upgrade, or did you see a change in resource usage?
d
I am talking about traces and logs mainly.
• the default services tab - /services
• services detail page - services/{service}?relativeTime=1d
• traces search by error_code, name
• for logs, search on attribute funcName, body contains
s
Our queries have gotten very slow even for just 1 day of data
From this I infer that they were fine earlier but have now gotten very slow. Please share more context:
1. Did the volume of the telemetry remain the same or change?
2. Did you see any change in the resource usage?
3. What is the base resource usage without any querying?
4. How much CPU is available for querying?
k
Hi, FYI, I have the same issue on a server with these specs (and nothing else running on it except SigNoz):
• vCores: 16
• RAM: 64 GB
• Disk: 350 GB
I have ~35GB of data, but it's possible that I'm storing too many traces because I only set it up a day ago 😅
Looking at the console, I saw rowsScanned: 5262696 for just a 30-minute window, so I imagine that could be related. But in any case, I think I have the same setup as I had for Datadog, and with Datadog it works pretty well. It only starts to take time if I go looking for 1 month+ of data.
d
1. did the volume of the telemetry remain same or changed?
Yeah, our scale just went up to 1000 to 2000 RPS; earlier it used to be around 300 RPS.
2. did you see any change in the resource usage?
Last I checked, it stayed around 1700m CPU, even though I have given it 4000m CPU.
3. what is the base resource usage without any querying
chi-apm-clickhouse-cluster-0-0-0 1293m 3764Mi
4. how much cpu is available for querying
4000m CPU and 16GB of RAM
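For reference, the headroom and traffic growth implied by these figures can be worked out roughly as below. This is only a minimal sketch: the helper names are made up for illustration, it assumes the 4000m / 16GB allocation applies to the ClickHouse pod shown above, and it assumes "16GB" means 16Gi.

```python
def parse_cpu_millicores(quantity: str) -> int:
    """Convert a Kubernetes CPU quantity such as '1293m' or '4' to millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return int(float(quantity) * 1000)


def parse_memory_mib(quantity: str) -> float:
    """Convert a Kubernetes memory quantity such as '3764Mi' or '16Gi' to MiB."""
    units = {"Ki": 1 / 1024, "Mi": 1, "Gi": 1024}
    for suffix, factor in units.items():
        if quantity.endswith(suffix):
            return float(quantity[: -len(suffix)]) * factor
    raise ValueError(f"unsupported memory quantity: {quantity}")


# Figures quoted in the thread.
cpu_limit = parse_cpu_millicores("4000m")
cpu_idle_usage = parse_cpu_millicores("1293m")
mem_limit = parse_memory_mib("16Gi")  # assumption: "16GB" is read as 16Gi
mem_idle_usage = parse_memory_mib("3764Mi")

# Capacity left for queries once the idle baseline is subtracted.
cpu_headroom = cpu_limit - cpu_idle_usage
mem_headroom = mem_limit - mem_idle_usage

# Growth in request rate: from about 300 RPS to 1000-2000 RPS.
growth_low = 1000 / 300
growth_high = 2000 / 300

print(f"CPU headroom: {cpu_headroom}m")
print(f"Memory headroom: {mem_headroom:.0f}Mi")
print(f"Traffic growth: {growth_low:.1f}x to {growth_high:.1f}x")
```

On these assumptions, about a third of the CPU is already used before any query runs, while the request rate has grown several times over on the same allocation.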
k
Hello, any news here?
FYI, for my use case, I lowered the number of traces I send to SigNoz. I haven't tested it properly yet, but it seems better with a small time range. But if we could get an answer with real numbers, that would be great 👍