Hien Le
06/19/2025, 8:05 PM
There are metrics (like k8s_replicaset_available/desired) that we don't care about, but my heaviest metric is http-client/server_duration_bucket, coming from the NodeJS AutoInstrumentation.
There's been 3M samples over the last 24 hours, but much of that is off-hours when our application is doing very little. Is this a case of generating samples very frequently even though the value is usually 0? If so, how could we reduce the samples? It doesn't seem a Batch Processor in the Metrics Pipeline would help.
Srikanth Chekuri
06/20/2025, 7:26 AM
> Is this a case of generating samples very frequently even though the value is usually 0?
Yes.
> If so, how could we reduce the samples?
Please change the temporality to delta by setting the env OTEL_EXPORTER_OTLP_METRICS_TEMPORALITY_PREFERENCE=delta.
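[Editor's note: a minimal sketch of what this could look like when set directly on the application workload; the Deployment and image names below are illustrative, not from this thread.]
apiVersion: apps/v1
kind: Deployment
metadata:
  name: node-app                    # hypothetical name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: node-app
  template:
    metadata:
      labels:
        app: node-app
    spec:
      containers:
        - name: app
          image: node-app:latest    # hypothetical image
          env:
            # Switch the OTLP metrics exporter from cumulative to delta temporality.
            - name: OTEL_EXPORTER_OTLP_METRICS_TEMPORALITY_PREFERENCE
              value: delta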
Hien Le
06/20/2025, 4:34 PM
So I'd set that on the Instrumentation so it applies to injected pods? Would the same env var need to be set in the signoz-k8s-infra chart to reduce the node agent metrics volume?
Srikanth Chekuri
06/21/2025, 1:31 AM
> So I'd set that on the Instrumentation so it applies to injected pods?
It should be part of the application env vars.
> Would the same env var need to be set in the signoz-k8s-infra chart to reduce the node agent metrics volume?
No, the agent's default metrics are k8s resource metrics. Unlike the application metrics, the agent metrics won't see any reduction from this change.
Hien Le
06/23/2025, 4:52 PM
http_server_duration_buckets is still ~1.6M. I'm guessing this could be due to regular kube probes sending metrics?
The http_client_duration_buckets went from 3M to 1.5M even with an extra environment, so it seems temporality=delta is helping, but it is not drastic?
Is it safe to filter those two metrics? They seem redundant with traces.
Hien Le
06/23/2025, 4:54 PM
When trying to determine which metrics to drop to reduce costs, should I sort by Samples or Time Series in the Metrics Explorer?
Srikanth Chekuri
06/24/2025, 12:53 AM
> When trying to determine which metrics to drop to reduce costs, should I sort by Samples or Time Series in the Metrics Explorer?
You should sort by samples.
> The http_client_duration_buckets went from 3M to 1.5M even with an extra environment, so it seems temporality=delta is helping but it is not drastic?
The change would completely cut down the samples produced during off-hours. However, the samples during regular hours would be roughly the same if the attributes keep recurring (as opposed to, say, users who visit the app once and never return). So in that case the gains you see come from the off-hours.
Hien Le
07/03/2025, 1:00 AM
Now seeing many system.disk.operations samples and 1.4M+ k8s.replicaset.desired samples.
Do we need a newer chart like 0.13.0 to select metrics? Should the names be pulled from the Cloud Metrics dashboard? The documented conventions look different.
Srikanth Chekuri
07/03/2025, 2:41 AM
Nothing unusual should be happening with system.disk.operations or k8s.replicaset.desired, because they come from hostmetricsreceiver and kubeletstatsreceiver, neither of which has a bug such as producing duplicate samples.
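[Editor's note: both receivers let individual metrics be turned on or off in the collector config. A sketch, assuming direct access to the agent's collector config; metric names and receiver options should be checked against the hostmetrics and kubeletstats receiver READMEs.]
receivers:
  hostmetrics:
    collection_interval: 60s
    scrapers:
      disk:
        metrics:
          system.disk.operations:
            enabled: false          # drop just this metric from the disk scraper
  kubeletstats:
    collection_interval: 60s
    auth_type: serviceAccount
    metrics:
      k8s.pod.memory.working_set:
        enabled: false              # example of disabling a single kubeletstats metric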
Hien Le
07/03/2025, 4:47 PM
Hien Le
07/04/2025, 12:22 AM
Upgraded to the 0.13.0 chart and used these values to disable:
1. kubeletstatsreceiver metrics that are mostly unchanging for me
2. hostmetricsreceiver completely, since many Node metrics seem available in the pod / container metrics
but I'm not seeing a drop in Samples for the system.disk.* or system.cpu.* metrics over the last 15 minutes. I'd also increased collectionInterval from 60s to 4m and that still didn't seem to reduce samples.
The Flux HelmRelease fragment is attached along with the resulting ConfigMaps (from kubectl describe); they appear correct (no hostmetricsreceiver), and the DaemonSet signoz-k8s-infra-otel-agent and the Deployment signoz-k8s-infra-otel-deployment have been restarted to ensure the pods are created with the latest version of the maps. Any other debugging suggestions?
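[Editor's note: a sketch of the kind of Flux HelmRelease values fragment being described, assuming the k8s-infra chart exposes presets.hostMetrics / presets.kubeletMetrics toggles; the preset key names, namespace, and HelmRepository name are assumptions and should be verified against the chart's values.yaml for 0.13.0.]
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
  name: signoz-k8s-infra
  namespace: monitoring              # hypothetical namespace
spec:
  interval: 10m
  chart:
    spec:
      chart: k8s-infra
      version: "0.13.0"
      sourceRef:
        kind: HelmRepository
        name: signoz                 # hypothetical HelmRepository name
  values:
    presets:
      hostMetrics:
        enabled: false               # assumed key: disables the hostmetrics receiver
      kubeletMetrics:
        collectionInterval: 4m       # assumed key: lengthens the kubeletstats scrape interval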
Hien Le
07/04/2025, 12:27 AM
Would collectionInterval: 5m be okay?
Hien Le
07/07/2025, 7:20 PM
Queried system.disk.io with SUM BY k8s.pod.name to discover that what I thought were hostMetrics were actually being sent by Python containers. Had to edit `instrumentation.yaml`:
python:
  env:
    - name: OTEL_PYTHON_DISABLED_INSTRUMENTATIONS
      value: system_metrics
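[Editor's note: for context, that fragment sits under spec.python in the OpenTelemetry Operator's Instrumentation resource; a minimal sketch with an illustrative name.]
apiVersion: opentelemetry.io/v1alpha1
kind: Instrumentation
metadata:
  name: app-instrumentation          # hypothetical name
spec:
  python:
    env:
      # Stop the Python auto-instrumentation from emitting system.* metrics.
      - name: OTEL_PYTHON_DISABLED_INSTRUMENTATIONS
        value: system_metrics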
Hien Le
07/07/2025, 7:24 PM
processors:
  filter/drop_http_duration_buckets:
    metrics:
      exclude:
        match_type: strict
        metric_names:
          - http.server.duration.bucket
          - http.client.duration.bucket
service:
  pipelines:
    metrics:
      receivers: [otlp]
      processors: [filter/drop_http_duration_buckets, attributes/upsert, resource/upsert, batch]
      exporters: [debug, otlp/local, otlp/cloud]
Any chance this is related to the underscore / period name normalization?
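[Editor's note: the filter processor matches the OTLP metric name, and for histograms that name carries no .bucket/_bucket suffix; the suffix is added at storage time. A corrected sketch, under the assumption the app emits the older semantic-convention names (the next message finds the newer http.server.request.duration instead).]
processors:
  filter/drop_http_duration:
    metrics:
      exclude:
        match_type: strict
        metric_names:
          - http.server.duration     # OTLP name of the histogram, no _bucket suffix
          - http.client.duration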
Hien Le
07/07/2025, 7:48 PM
Turns out the metrics are actually named http.server.request.duration and require a transformer to drop. This is the biggest consumer of my ingestion costs.
Will dropping http.server.request.duration break the Signoz Services view?
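[Editor's note: one way to drop the renamed metric with the filter processor's OTTL conditions instead of a transform processor; a sketch, assuming http.server.request.duration is the name the SDK actually emits.]
processors:
  filter/drop_http_request_duration:
    metrics:
      metric:
        - 'name == "http.server.request.duration"'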
Vibhu Pandey
07/07/2025, 7:52 PM
Hien Le
07/07/2025, 7:55 PM
Vibhu Pandey
07/07/2025, 8:00 PM