Alexei Zenin
08/30/2022, 3:16 PM
929234 time series at the moment. The query service needed its memory increased to 3GB for the container not to crash due to OOM. Our PROD setup is much more extensive and has more traffic, so will this need dozens of GB to operate the backend service? Is there anything I am doing wrong? I know there are optimizations coming, but it seems tough to handle this linearly increasing memory operationally.
Ankit Nayan
08/30/2022, 4:17 PM
Prashant Shahi
08/30/2022, 6:13 PM
Port-forward port 6060 of the query-service container:
kubectl -n platform port-forward pod/my-release-signoz-query-service-0 6060:6060
In another terminal, run the following to obtain pprof data:
• CPU Profile
curl "http://localhost:6060/debug/pprof/profile?seconds=30" -o query-service.pprof -v
• Heap Profile
curl "http://localhost:6060/debug/pprof/heap" -o query-service-heap.pprof -v
Alexei Zenin
08/30/2022, 6:34 PM
Ankit Nayan
09/20/2022, 4:37 PM
select count() from signoz_metrics.time_series_v2;
select count() from signoz_metrics.samples_v2 where timestamp_ms > toUnixTimestamp(now() - INTERVAL 30 MINUTE)*1000;
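The second query's filter is easy to misread: `timestamp_ms` is in epoch milliseconds while `toUnixTimestamp` returns seconds, hence the `* 1000`. A shell sketch of the same cutoff arithmetic (an editorial illustration, not part of the original thread):

```shell
# Reproduce the cutoff: toUnixTimestamp(now() - INTERVAL 30 MINUTE) * 1000
now_s=$(date +%s)                          # current time, epoch seconds
cutoff_ms=$(( (now_s - 30 * 60) * 1000 ))  # 30 minutes ago, epoch milliseconds
echo "timestamp_ms > ${cutoff_ms}"
```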
Alexei Zenin
09/20/2022, 7:29 PM
SELECT count()
FROM signoz_metrics.time_series_v2
Query id: 195009d4-26ea-49e1-86fb-15dc7de0313a
┌─count()─┐
│ 2449091 │
└─────────┘
1 row in set. Elapsed: 0.002 sec.
SELECT count()
FROM signoz_metrics.samples_v2
WHERE timestamp_ms > (toUnixTimestamp(now() - toIntervalMinute(30)) * 1000)
Query id: 1941f1fd-56c8-4a6d-8f28-e6fc9bdab919
┌─count()─┐
│    2714 │
└─────────┘
1 row in set. Elapsed: 0.005 sec. Processed 40.58 thousand rows, 324.64 KB (8.54 million rows/s., 68.31 MB/s.)
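Read together, the two counts already hint at the problem: every active series must produce at least one sample in 30 minutes, so the ~2.7k recent samples bound the active series from above, leaving the bulk of the 2.45M registered series stale. A rough check of that arithmetic (editorial sketch; numbers are from the query output above):

```shell
total_series=2449091   # count() from time_series_v2
recent_samples=2714    # samples in the last 30 minutes (>= active series count)
stale_at_least=$(( total_series - recent_samples ))
echo "at least ${stale_at_least} of ${total_series} series look inactive"
```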
Ankit Nayan
09/22/2022, 2:51 PM
> Our containers come up and down dozens of times per day, would that affect it (due to rescheduling of EC2 Spot instances)?
yes .. heavily...
Alexei Zenin
09/22/2022, 2:53 PMAnkit Nayan
09/22/2022, 2:57 PM
> what's the retention for metrics?
this can be seen in the UI => settings page
truncate table signoz_metrics.time_series_v2;
and, restart otel-collectors
Alexei Zenin
09/22/2022, 3:01 PM
Ankit Nayan
09/22/2022, 3:02 PM
do you see charts whose data you did not collect, say in the last 1hr?
what's the oldest data you want to see in charts?
Alexei Zenin
09/22/2022, 3:03 PM
Ankit Nayan
09/22/2022, 3:04 PM
Alexei Zenin
09/22/2022, 3:04 PM
Ankit Nayan
09/22/2022, 3:05 PM
Alexei Zenin
09/22/2022, 3:05 PM
Ankit Nayan
09/22/2022, 3:05 PM
Alexei Zenin
09/22/2022, 3:05 PM
Ankit Nayan
09/22/2022, 3:06 PM
select metric_name, count() as count from signoz_metrics.time_series_v2 group by metric_name order by count desc limit 10;
select metric_name, count() as count from signoz_metrics.samples_v2 where timestamp_ms > toUnixTimestamp(now() - INTERVAL 120 MINUTE)*1000 group by metric_name order by count desc limit 10 ;
Alexei Zenin
09/22/2022, 5:23 PM
SELECT
metric_name,
count() AS count
FROM signoz_metrics.time_series_v2
GROUP BY metric_name
ORDER BY count DESC
LIMIT 10
Query id: d4f1d9a8-067d-48e4-a75e-4fff3f3ef6e2
┌─metric_name────────────────────────────────────┬──count─┐
│ otelcol_processor_batch_batch_send_size_bucket │ 765371 │
│ otelcol_exporter_enqueue_failed_spans          │ 208634 │
│ otelcol_exporter_enqueue_failed_metric_points  │ 208634 │
│ otelcol_exporter_enqueue_failed_log_records    │ 208634 │
│ otelcol_process_uptime                         │ 107054 │
│ otelcol_process_cpu_seconds                    │ 107054 │
│ otelcol_process_memory_rss                     │ 107054 │
│ otelcol_process_runtime_heap_alloc_bytes       │ 107054 │
│ otelcol_process_runtime_total_alloc_bytes      │ 107054 │
│ otelcol_exporter_queue_size                    │ 107054 │
└────────────────────────────────────────────────┴────────┘
10 rows in set. Elapsed: 0.096 sec. Processed 2.50 million rows, 2.59 MB (25.92 million rows/s., 26.89 MB/s.)
SELECT
metric_name,
count() AS count
FROM signoz_metrics.samples_v2
WHERE timestamp_ms > (toUnixTimestamp(now() - toIntervalMinute(120)) * 1000)
GROUP BY metric_name
ORDER BY count DESC
LIMIT 10
Query id: fef0992c-1d77-4a6b-9465-8e7aeea03f75
┌─metric_name───────────────────────────────────┬─count─┐
│ up                                            │    66 │
│ scrape_samples_scraped                        │    66 │
│ scrape_series_added                           │    66 │
│ scrape_duration_seconds                       │    66 │
│ scrape_samples_post_metric_relabeling         │    66 │
│ otelcol_exporter_enqueue_failed_metric_points │    59 │
│ otelcol_exporter_enqueue_failed_log_records   │    59 │
│ otelcol_exporter_enqueue_failed_spans         │    59 │
│ otelcol_process_memory_rss                    │    33 │
│ otelcol_process_runtime_total_alloc_bytes     │    33 │
└───────────────────────────────────────────────┴───────┘
10 rows in set. Elapsed: 0.022 sec. Processed 28.82 thousand rows, 261.97 KB (1.31 million rows/s., 11.92 MB/s.)
Ankit Nayan
09/22/2022, 5:31 PM
truncate table signoz_metrics.time_series_v2;
and then restart otel-collectors
Alexei Zenin
09/22/2022, 5:32 PM
Ankit Nayan
09/22/2022, 5:32 PM
Alexei Zenin
09/22/2022, 5:33 PM
Ankit Nayan
09/22/2022, 5:33 PM
Alexei Zenin
09/22/2022, 5:34 PM
Ankit Nayan
09/22/2022, 5:35 PM
select metric_name, count() as count from signoz_metrics.time_series_v2 where metric_name ilike '%signoz%' group by metric_name;
select metric_name, count() as count from signoz_metrics.samples_v2 where timestamp_ms > toUnixTimestamp(now() - INTERVAL 1 DAY)*1000 and metric_name ilike '%signoz%' group by metric_name;
Alexei Zenin
09/22/2022, 8:34 PM
SELECT
metric_name,
count() AS count
FROM signoz_metrics.time_series_v2
WHERE metric_name ILIKE '%signoz%'
GROUP BY metric_name
Query id: 14b718d9-4522-401b-9f15-aa362870b0be
┌─metric_name────────────────────────┬──count─┐
│ signoz_latency_bucket              │ 105108 │
│ signoz_latency_count               │   5532 │
│ signoz_external_call_latency_count │    862 │
│ signoz_db_latency_sum              │    211 │
│ signoz_db_latency_count            │    211 │
│ signoz_calls_total                 │   5754 │
│ signoz_latency_sum                 │   5532 │
│ signoz_external_call_latency_sum   │    862 │
└────────────────────────────────────┴────────┘
8 rows in set. Elapsed: 0.266 sec. Processed 2.50 million rows, 2.59 MB (9.41 million rows/s., 9.76 MB/s.)
SELECT
metric_name,
count() AS count
FROM signoz_metrics.samples_v2
WHERE (timestamp_ms > (toUnixTimestamp(now() - toIntervalDay(1)) * 1000)) AND (metric_name ILIKE '%signoz%')
GROUP BY metric_name
Query id: a5255015-e491-434c-80c7-cffd8c850e44
┌─metric_name────────────────────────┬─count─┐
│ signoz_latency_bucket              │ 16986 │
│ signoz_latency_count               │   894 │
│ signoz_external_call_latency_count │   199 │
│ signoz_db_latency_sum              │   136 │
│ signoz_db_latency_count            │   136 │
│ signoz_calls_total                 │   894 │
│ signoz_latency_sum                 │   894 │
│ signoz_external_call_latency_sum   │   199 │
└────────────────────────────────────┴───────┘
8 rows in set. Elapsed: 0.023 sec. Processed 90.94 thousand rows, 560.54 KB (3.93 million rows/s., 24.22 MB/s.)
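One detail worth pulling out of the time_series_v2 result set above: `signoz_latency_bucket` carries exactly 19 times the series of `signoz_latency_count` (105108 = 5532 × 19), i.e. each histogram label combination fans out into one series per bucket bound (the factor of 19 is inferred from these two numbers, not stated in the thread). A quick check of that arithmetic:

```shell
base_series=5532       # signoz_latency_count series from the result above
bucket_series=105108   # signoz_latency_bucket series from the result above
echo "bucket series per label combination: $(( bucket_series / base_series ))"
```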
Ankit Nayan
09/23/2022, 1:48 AM
Alexei Zenin
10/13/2022, 8:46 PM
Ankit Nayan
10/14/2022, 3:15 AM
v0.11.2
you won't need to truncate anymore. Use metrics freely now
> im assuming the data is independent of each one and will have no impact on span metrics scraping if done in 1:1 fashion instead of 1:N
that is true and should work if scrape configs are correct. But why would you want to do 1:1? Won't it be too much overkill?
Alexei Zenin
10/14/2022, 4:44 AM