Dimitris Mavrommatis
05/17/2025, 4:49 PM
Query Builder and PromQL regarding the count of metrics?
I see that I have 4 distinct counts of a metric in the Query Builder, while PromQL reports only 1.
How does PromQL's sum deduplicate from the same datastore when the Query Builder's Count(Distinct) cannot?
Also, why do I end up with a count of 4 metrics for k8s_pod_cpu_usage when using the SigNoz infra collector without any changes?
Dimitris Mavrommatis
05/18/2025, 11:11 PM
Dimitris Mavrommatis
05/18/2025, 11:25 PM
kubeletstats:
  auth_type: serviceAccount
  collection_interval: 30s
Then infra monitoring looks like it does a SUM and aggregates over 60s, which means it shows every value at double what it actually is?
Look at the screenshot: infra monitoring reports 4 CPU usage on the pod while it was actually 2, which you can see on the right with the avg aggregation function (the PromQL query and the cluster itself report 2 CPU usage as well).
This looks like a major bug that can cause many issues in production if people rely on your metrics. Also, it should be visible that the lowest aggregation interval is 60s; it should not be possible to set it lower and then have the setting ignored.
Dimitris Mavrommatis
05/18/2025, 11:32 PM
SELECT ts, sum(per_series_value) as value
FROM (
    SELECT
        fingerprint,
        toStartOfInterval(toDateTime(intDiv(unix_milli, 1000)), INTERVAL 60 SECOND) as ts,
        max(value) as per_series_value
    FROM signoz_metrics.distributed_samples_v4
    INNER JOIN (
        SELECT DISTINCT fingerprint
        FROM signoz_metrics.time_series_v4
        WHERE metric_name IN ['k8s_pod_cpu_utilization','k8s_pod_cpu_usage']
          AND temporality = 'Unspecified'
          AND __normalized = true
          AND unix_milli >= 1747602000000
          AND unix_milli < 1747603800000
          AND JSONExtractString(labels, 'k8s_pod_name') = 'exec-run-91efd0ba-fab4-4bc1-b483-c9d077e76536'
          AND JSONExtractString(labels, 'k8s_namespace_name') = 'luminai'
    ) as filtered_time_series USING fingerprint
    WHERE metric_name IN ['k8s_pod_cpu_utilization','k8s_pod_cpu_usage']
      AND unix_milli >= 1747602900000
      AND unix_milli < 1747603800000
    GROUP BY fingerprint, ts
    ORDER BY fingerprint, ts
)
WHERE isNaN(per_series_value) = 0
GROUP BY ts
ORDER BY ts ASC
The interval is set to 60 and it does a sum. This means that, because we scrape every 30 seconds, we will scrape twice in the aggregation period and then sum wrongly instead of averaging the values.
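For readers following the query text: as written, it aggregates in two stages. The inner SELECT takes max(value) per fingerprint per 60-second bucket, and the outer SELECT applies sum (or avg) across the fingerprints in each bucket. The sketch below mimics those two stages in Python on made-up data. The fingerprint names, timestamps and values are hypothetical, and it assumes the join matches two series for the pod (one per metric name); the thread does not show how many fingerprints actually match.

```python
from collections import defaultdict

# Hypothetical samples: (fingerprint, unix seconds, value).
# Assumption: two series for the same pod, each scraped every 30s at ~2 cores.
samples = [
    ("fp_usage", 0, 1.9), ("fp_usage", 30, 2.0),
    ("fp_usage", 60, 2.0), ("fp_usage", 90, 1.95),
    ("fp_utilization", 0, 1.9), ("fp_utilization", 30, 2.0),
    ("fp_utilization", 60, 2.0), ("fp_utilization", 90, 1.95),
]


def mean(values):
    return sum(values) / len(values)


def aggregate(samples, across_series, step=60):
    # Stage 1 (inner query): max(value) per fingerprint per time bucket.
    per_series = {}
    for fingerprint, ts, value in samples:
        key = (fingerprint, ts - ts % step)
        per_series[key] = max(value, per_series.get(key, value))

    # Stage 2 (outer query): combine the per-series values in each bucket.
    by_bucket = defaultdict(list)
    for (_, bucket), value in per_series.items():
        by_bucket[bucket].append(value)
    return {b: across_series(v) for b, v in sorted(by_bucket.items())}


summed = aggregate(samples, sum)
averaged = aggregate(samples, mean)
print(summed)    # {0: 4.0, 60: 4.0}
print(averaged)  # {0: 2.0, 60: 2.0}
```

In this sketch the two scrapes inside a bucket are collapsed by the inner max, and the doubled value comes from adding the two series together. Whether the same holds for the real data depends on how many fingerprints the filter matches.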
your code with sum
SELECT ts, sum(per_series_value) as value
FROM (
    SELECT
        fingerprint,
        toStartOfInterval(toDateTime(intDiv(unix_milli, 1000)), INTERVAL 60 SECOND) as ts,
        max(value) as per_series_value
    FROM signoz_metrics.distributed_samples_v4
    INNER JOIN (
        SELECT DISTINCT fingerprint
        FROM signoz_metrics.time_series_v4
        WHERE metric_name IN ['k8s_pod_cpu_utilization','k8s_pod_cpu_usage']
          AND temporality = 'Unspecified'
          AND __normalized = true
          AND unix_milli >= 1747602000000
          AND unix_milli < 1747603800000
          AND JSONExtractString(labels, 'k8s_pod_name') = 'exec-run-91efd0ba-fab4-4bc1-b483-c9d077e76536'
          AND JSONExtractString(labels, 'k8s_namespace_name') = 'luminai'
    ) as filtered_time_series USING fingerprint
    WHERE metric_name IN ['k8s_pod_cpu_utilization','k8s_pod_cpu_usage']
      AND unix_milli >= 1747602900000
      AND unix_milli < 1747603800000
    GROUP BY fingerprint, ts
    ORDER BY fingerprint, ts
)
WHERE isNaN(per_series_value) = 0
GROUP BY ts
ORDER BY ts ASC
Query id: 8437bc7f-e2d1-4d06-9898-612450be141f
    ┌──────────────────ts─┬───────value─┐
 1. │ 2025-05-18 21:15:00 │ 3.833452586 │
 2. │ 2025-05-18 21:16:00 │  3.88995441 │
 3. │ 2025-05-18 21:17:00 │  3.93827644 │
 4. │ 2025-05-18 21:18:00 │  4.00250301 │
 5. │ 2025-05-18 21:19:00 │ 3.994867402 │
 6. │ 2025-05-18 21:20:00 │ 4.005173116 │
 7. │ 2025-05-18 21:21:00 │  4.00739036 │
 8. │ 2025-05-18 21:22:00 │ 3.989971906 │
 9. │ 2025-05-18 21:23:00 │ 4.002892684 │
10. │ 2025-05-18 21:24:00 │  3.99566403 │
11. │ 2025-05-18 21:25:00 │    3.993028 │
12. │ 2025-05-18 21:26:00 │ 3.986259702 │
13. │ 2025-05-18 21:27:00 │ 4.008238804 │
14. │ 2025-05-18 21:28:00 │ 3.992228324 │
15. │ 2025-05-18 21:29:00 │ 3.989568936 │
    └─────────────────────┴─────────────┘
my code with avg
SELECT ts, avg(per_series_value) as value
FROM (
    SELECT
        fingerprint,
        toStartOfInterval(toDateTime(intDiv(unix_milli, 1000)), INTERVAL 60 SECOND) as ts,
        max(value) as per_series_value
    FROM signoz_metrics.distributed_samples_v4
    INNER JOIN (
        SELECT DISTINCT fingerprint
        FROM signoz_metrics.time_series_v4
        WHERE metric_name IN ['k8s_pod_cpu_utilization','k8s_pod_cpu_usage']
          AND temporality = 'Unspecified'
          AND __normalized = true
          AND unix_milli >= 1747602000000
          AND unix_milli < 1747603800000
          AND JSONExtractString(labels, 'k8s_pod_name') = 'exec-run-91efd0ba-fab4-4bc1-b483-c9d077e76536'
          AND JSONExtractString(labels, 'k8s_namespace_name') = 'luminai'
    ) as filtered_time_series USING fingerprint
    WHERE metric_name IN ['k8s_pod_cpu_utilization','k8s_pod_cpu_usage']
      AND unix_milli >= 1747602900000
      AND unix_milli < 1747603800000
    GROUP BY fingerprint, ts
    ORDER BY fingerprint, ts
)
WHERE isNaN(per_series_value) = 0
GROUP BY ts
ORDER BY ts ASC
Query id: 1f89012a-5dfe-4c19-b69c-c14748ce1fe0
    ┌──────────────────ts─┬───────value─┐
 1. │ 2025-05-18 21:15:00 │ 1.916726293 │
 2. │ 2025-05-18 21:16:00 │ 1.944977205 │
 3. │ 2025-05-18 21:17:00 │  1.96913822 │
 4. │ 2025-05-18 21:18:00 │ 2.001251505 │
 5. │ 2025-05-18 21:19:00 │ 1.997433701 │
 6. │ 2025-05-18 21:20:00 │ 2.002586558 │
 7. │ 2025-05-18 21:21:00 │  2.00369518 │
 8. │ 2025-05-18 21:22:00 │ 1.994985953 │
 9. │ 2025-05-18 21:23:00 │ 2.001446342 │
10. │ 2025-05-18 21:24:00 │ 1.997832015 │
11. │ 2025-05-18 21:25:00 │    1.996514 │
12. │ 2025-05-18 21:26:00 │ 1.993129851 │
13. │ 2025-05-18 21:27:00 │ 2.004119402 │
14. │ 2025-05-18 21:28:00 │ 1.996114162 │
15. │ 2025-05-18 21:29:00 │ 1.994784468 │
    └─────────────────────┴─────────────┘
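A quick cross-check of the two result sets, as a minimal sketch: because sum and avg are computed over the same rows in each bucket, sum divided by avg equals the number of per-series values that went into that bucket. The values below are copied from the two tables above; the variable names are only illustrative.

```python
from decimal import Decimal

# Values copied from the sum(...) and avg(...) result tables, in row order.
sum_values = [
    "3.833452586", "3.88995441", "3.93827644", "4.00250301", "3.994867402",
    "4.005173116", "4.00739036", "3.989971906", "4.002892684", "3.99566403",
    "3.993028", "3.986259702", "4.008238804", "3.992228324", "3.989568936",
]
avg_values = [
    "1.916726293", "1.944977205", "1.96913822", "2.001251505", "1.997433701",
    "2.002586558", "2.00369518", "1.994985953", "2.001446342", "1.997832015",
    "1.996514", "1.993129851", "2.004119402", "1.996114162", "1.994784468",
]

# Decimal keeps the comparison exact; floats could introduce rounding noise.
exactly_double = all(
    Decimal(a) * 2 == Decimal(s) for s, a in zip(sum_values, avg_values)
)
print(exactly_double)  # True
```

Every bucket's sum is exactly twice its avg, so the outer aggregation combined exactly two per-series values in each 60-second bucket.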
Dimitris Mavrommatis
05/18/2025, 11:42 PM
Nagesh Bansal
05/19/2025, 9:20 AM