# signoz-cloud
a
How can I dig deeper into ingestion of metrics? I can see we are reaching 1 million data points in a few hours (although we have only enabled 1 service for now). I'm not able to pinpoint what is causing so many metrics to be emitted. Is there any doc that I could use for this?
h
What's your tech stack? I went through this recently for our test env, which is a mix of NodeJS and Python running on K8s and was generating 33M samples a day while idle. I started from Metrics per hour and looked at metric names:
Things that helped me:
1. Turning `collectionInterval` from `30s` to `2m` for all infra collectors (host, kubelet, etc.). This brought their usage down to 25%.
2. A Python app was using OTel auto-instrumentation, which includes `system_metrics`, so pods were reporting redundant "host metrics". Fixed by setting the pod env var `OTEL_PYTHON_DISABLED_INSTRUMENTATIONS=system_metrics` to disable that instrumentation.
3. Changing the infra chart config to disable a lot of metrics. Our pods use external DBs and have no meaningful local storage, so many of the filesystem metrics weren't useful. Many of my services are low-volume, so `replicaset` desired/available metrics were a constant `1` and not useful. Ditto for `pod_state` and a bunch of other k8s metrics that aren't applicable to simpler architectures.
4. Updating the OTel collector to drop a heavyweight HTTP histogram metric, since trace spans already captured the same info.

There's a rough config sketch for 1, 3, and 4 below. The SigNoz team provided additional tuning help for me in this thread. I was able to get from 33M/day down to 2M/day with those changes.
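In case it helps, here's roughly what 1, 3, and 4 look like as OpenTelemetry Collector config. This is only a sketch, not the exact chart values I used: the receiver/processor keys follow the upstream hostmetrics, kubeletstats, k8s_cluster, and filter component docs, and the specific metric names (`k8s.pod.filesystem.*`, `k8s.replicaset.*`, `http.server.duration`) and exporter endpoint are illustrative, so check your k8s-infra chart's values.yaml for how these are exposed in your version:

```yaml
receivers:
  hostmetrics:
    collection_interval: 2m        # was 30s
    scrapers:
      cpu: {}
      memory: {}
      network: {}
      # filesystem/disk scrapers left out: pods use external DBs, no useful local storage
  kubeletstats:
    collection_interval: 2m
    metrics:
      # turn off individual metrics you never look at
      k8s.pod.filesystem.usage:
        enabled: false
      k8s.pod.filesystem.capacity:
        enabled: false
  k8s_cluster:
    collection_interval: 2m
    metrics:
      # a constant "1" for single-replica services, not worth paying for
      k8s.replicaset.desired:
        enabled: false
      k8s.replicaset.available:
        enabled: false

processors:
  # drop the heavy HTTP histogram; trace spans already capture the same info
  filter/drop-http-histogram:
    metrics:
      metric:
        - 'name == "http.server.duration"'

exporters:
  otlp:
    endpoint: ingest.<region>.signoz.cloud:443   # your SigNoz Cloud endpoint
    # auth headers omitted - add your ingestion key here

service:
  pipelines:
    metrics:
      receivers: [hostmetrics, kubeletstats, k8s_cluster]
      processors: [filter/drop-http-histogram]
      exporters: [otlp]
```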
🔥 3
a
@Chitransh Gupta this thread should be converted to docs for all cloud and community users
🙌 1
h
Also ended up dropping all logs containing `kube-probe` (snippet below). Didn't care enough to sample them, but that also reduced my idle cluster volume significantly. All of this is documented, but I agree it'd be very useful to have a targeted guide during setup. I can imagine cash-strapped startups might rule out the product entirely once they find out that monitoring one pod will cost them $6/mo.
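For the kube-probe logs it was just a filter processor condition along these lines (sketch only; the processor name and regex are mine, and you'd add it to your logs pipeline's processors list):

```yaml
processors:
  # drop kubelet liveness/readiness probe access logs before they're exported
  filter/drop-kube-probe:
    logs:
      log_record:
        - 'IsMatch(body, ".*kube-probe.*")'
```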
🙌 1
c
Created an issue for this @Ankit Nayan - https://github.com/SigNoz/signoz-web/issues/1680 Thanks for a very detailed answer @Hien Le!
a
This is super useful @Hien Le. Yes, while it is detailed on the metrics-explorer page, it would be helpful to add a section on "analyzing metrics ingestion in detail". I'm on NodeJS with an ECS cluster of ~10 instances being monitored (to start with), which resulted in >10M metrics a day. I'm updating my collector interval and digging deeper into Metrics explorer now. Thank you!
h
`collectionInterval` is the easiest upfront reduction. Given my health/readiness probe intervals, my pods take about 2 minutes to become ready, so a 30s collection interval seems like overkill. I've been meaning to write up more notes specifically for our workflow, which uses the auto-instrument operator and signoz-k8s-infra. Yeah, everything is very well documented, but it's scattered across various small articles, which can make it hard for folks to find their specific flow. They actually have an Ingestion Analysis Dashboard you can manually install as well.
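Side note on the operator workflow: the `OTEL_PYTHON_DISABLED_INSTRUMENTATIONS` env var from earlier can also be set once on the operator's `Instrumentation` resource instead of on each Deployment. Rough sketch, with placeholder name/namespace:

```yaml
apiVersion: opentelemetry.io/v1alpha1
kind: Instrumentation
metadata:
  name: python-instrumentation   # placeholder name
  namespace: my-apps             # placeholder namespace
spec:
  python:
    env:
      # stop the Python agent's own host/system metrics; k8s-infra already reports these
      - name: OTEL_PYTHON_DISABLED_INSTRUMENTATIONS
        value: system_metrics
```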
If you take `2*60*60*24*30 = 5,184,000` samples/mo, and at least 12 metrics enabled by default per pod with `signoz-k8s-infra`, that's $6.22/mo for one pod, so I can see how it adds up very quickly. I think there are actually more metrics enabled by default, so out-of-the-box SigNoz might be a surprise for most folks.