
Alexei Zenin

08/25/2022, 2:57 PM
How much memory should the query service need? I am running it with 512MB and it seems to be dying due to OOM errors. Does it need to scale with the amount of ClickHouse data, or might there be a memory leak? I am on version 10.2.

Ankit Nayan

08/25/2022, 3:07 PM
Right now, it should scale based on the number of timeseries. This will be improved soon in upcoming versions.

Alexei Zenin

08/25/2022, 3:07 PM
Does that correlate to the number of services?

Ankit Nayan

08/25/2022, 3:08 PM
Can you run:
select count() from signoz_metrics.time_series_v2;
"Does that correlate to the number of services?"
Yes, as we create APM metrics for each service. It usually comes to around 2K time series per service, but that depends on the number of operations/APIs, etc.

Alexei Zenin

08/25/2022, 3:10 PM
SELECT count()
FROM signoz_metrics.time_series_v2

Query id: 10fe1e90-fe65-41bc-b73f-9c8c4d120917

┌─count()─┐
│  238930 │
└─────────┘

1 row in set. Elapsed: 0.001 sec.
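
As a rough cross-check of the "around 2K time series per service" rule of thumb mentioned earlier in the thread, the count above can be turned into an implied number of services. This is only a sketch: the function name and the constant are illustrative, and the real ratio depends on the number of operations/APIs per service.

```python
# Rule of thumb quoted in the thread: roughly 2K time series per service.
# This is an approximation, not a fixed property of SigNoz.
SERIES_PER_SERVICE = 2_000


def estimate_services(total_series: int,
                      series_per_service: int = SERIES_PER_SERVICE) -> float:
    """Return the rough number of services implied by a time series count."""
    if series_per_service <= 0:
        raise ValueError("series_per_service must be positive")
    return total_series / series_per_service


# Count reported by the query in this thread.
print(round(estimate_services(238_930)))
```

With the count of 238930 reported here, this comes to roughly 119 services' worth of time series under that assumption.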

Ankit Nayan

08/25/2022, 3:39 PM
You should try increasing the memory allocation of the query-service. @Srikanth Chekuri do you have an estimate of the requirements, assuming twice the above-mentioned number to be on the safe side?

Alexei Zenin

08/25/2022, 3:39 PM
I went from 512MB to 1024MB and it seems to be stable for now.

Ankit Nayan

08/25/2022, 3:41 PM
cool

Srikanth Chekuri

08/25/2022, 3:57 PM
This is a rough estimate based on past observations and generic requirements. For 100k time series with a maximum of 30 labels, it will be in the range of 190-240MB. The query service also does other things, and the process runtime needs some memory too. So in your case the time series alone will eat up all the memory you allocated and cause OOM.
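
The estimate above can be applied to the reported count with a few lines of arithmetic. This sketch assumes memory grows linearly with the number of time series, which the thread does not confirm; the function name and parameters are illustrative and are not part of SigNoz.

```python
def estimate_series_memory_mb(total_series: int,
                              low_mb_per_100k: float = 190.0,
                              high_mb_per_100k: float = 240.0) -> tuple[float, float]:
    """Scale the 190-240MB per 100k time series figure to a given series count.

    Assumes linear scaling (an assumption, not a documented guarantee) and
    covers only the time series themselves, not the rest of the query
    service or the process runtime.
    """
    if total_series < 0:
        raise ValueError("total_series must be non-negative")
    scale = total_series / 100_000
    return (low_mb_per_100k * scale, high_mb_per_100k * scale)


# Count reported earlier in this thread.
low, high = estimate_series_memory_mb(238_930)
print(f"{low:.0f}-{high:.0f} MB")
```

For 238930 time series this gives roughly 454-573MB for the time series alone, which is consistent with a 512MB allocation running out of memory.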

Alexei Zenin

08/25/2022, 3:59 PM
I see. Is there some kind of caching going on in the service? I look forward to this being optimized, as our entire company will depend on this one service being up to send alerts to our PagerDuty. If it goes down, we are blind.

Aditya KP

08/25/2022, 4:13 PM
@Ankit Nayan we should add a metrics summary which provides cardinality details for each metric. That would a) help devs understand their cardinality numbers / sample tag values, and b) help track cardinality explosions.
This is a huge driver of costs in observability solutions.

Ankit Nayan

08/25/2022, 4:14 PM
nice idea...
mind opening an issue?

Aditya KP

08/25/2022, 4:14 PM
Sure. Gimme a min.

Ankit Nayan

08/25/2022, 4:15 PM
We should also have some OSS design contributors pick up a few design tasks, or at least suggest a few mockups.

Aditya KP

08/25/2022, 4:26 PM

Alexei Zenin

08/26/2022, 6:02 PM
I added a few more services and had to increase memory again, to 2048MB. Will see how much more I need.