This message was deleted SigNoz Community #support

# support

Slackbot

02/26/2024, 10:47 AM

This message was deleted.

Srikanth Chekuri

02/26/2024, 4:27 PM

Which version of SigNoz deployment is this? Do you see any error logs in query-service?

Vibhav Parameswara Chary

02/27/2024, 3:07 AM

I don't see any errors in query-service.

Srikanth Chekuri

02/27/2024, 3:44 AM

How long does it keep spinning? How many services do you have? Would you be able to exec into ClickHouse and run some query?

Vibhav Parameswara Chary

02/27/2024, 4:43 AM

Yeah, I can log in to the ClickHouse DB.

Vibhav Parameswara Chary

02/27/2024, 4:43 AM

It just keeps spinning

Vibhav Parameswara Chary

02/27/2024, 4:46 AM

2024-02-26T13:22:53.933Z ERROR clickhouseReader/reader.go:4609 error while reading time series result write: write tcp 10.107.86.253:57200->172.20.141.21:9000: i/o timeout
2024-02-26T13:23:10.797Z ERROR clickhouseReader/reader.go:4609 error while reading time series result write: write tcp 10.107.86.253:57260->172.20.141.21:9000: i/o timeout
2024-02-26T13:23:10.801Z ERROR clickhouseReader/reader.go:4609 error while reading time series result write: write tcp 10.107.86.253:60758->172.20.141.21:9000: i/o timeout
2024-02-26T13:23:10.801Z INFO utils/time.go:12 func GetTimeSeriesResultV3 took 1m0.002184427s with args [SELECT A.`address` `address`, A.`ts` `ts`, A.value * 100 / B.value as value FROM (SELECT address, ts, sum(rate_value) as value FROM (SELECT address, ts, If((value - lagInFrame(value, 1, 0) OVER rate_window) < 0, nan, If((ts - lagInFrame(ts, 1, toDate('1970-01-01')) OVER rate_window) >= 86400, nan, (value - lagInFrame(value, 1, 0) OVER rate_window) / (ts - lagInFrame(ts, 1, toDate('1970-01-01')) OVER rate_window))) as rate_value FROM(SELECT fingerprint, address, toStartOfInterval(toDateTime(intDiv(timestamp_ms, 1000)), INTERVAL 60 SECOND) as ts, max(value) as value FROM signoz_metrics.distributed_samples_v2 INNER JOIN (SELECT JSONExtractString(labels, 'address') as address, fingerprint FROM signoz_metrics.time_series_v2 WHERE metric_name = 'signoz_external_call_latency_count' AND temporality IN ['Cumulative', 'Unspecified'] AND JSONExtractString(labels, 'service_name') IN ['app-settlement'] AND JSONExtractString(labels, 'status_code') IN ['STATUS_CODE_ERROR']) as filtered_time_series USING fingerprint WHERE metric_name = 'signoz_external_call_latency_count' AND timestamp_ms >= 1708951860000 AND timestamp_ms < 1708953720000 GROUP BY fingerprint, address,ts ORDER BY fingerprint, address ASC, ts) WINDOW rate_window as (PARTITION BY fingerprint, address ORDER BY fingerprint, address ASC, ts) ) WHERE isNaN(rate_value) = 0 GROUP BY GROUPING SETS ( (address, ts), (address) ) ORDER BY address ASC, ts) as A INNER JOIN (SELECT address, ts, sum(rate_value) as value FROM (SELECT address, ts, If((value - lagInFrame(value, 1, 0) OVER rate_window) < 0, nan, If((ts - lagInFrame(ts, 1, toDate('1970-01-01')) OVER rate_window) >= 86400, nan, (value - lagInFrame(value, 1, 0) OVER rate_window) / (ts - lagInFrame(ts, 1, toDate('1970-01-01')) OVER rate_window))) as rate_value FROM(SELECT fingerprint, address, toStartOfInterval(toDateTime(intDiv(timestamp_ms, 1000)), INTERVAL 60 SECOND) as ts, max(value) as value FROM signoz_metrics.distributed_samples_v2 INNER JOIN (SELECT JSONExtractString(labels, 'address') as address, fingerprint FROM signoz_metrics.time_series_v2 WHERE metric_name = 'signoz_external_call_latency_count' AND temporality IN ['Cumulative', 'Unspecified'] AND JSONExtractString(labels, 'service_name') IN ['app-settlement']) as filtered_time_series USING fingerprint WHERE metric_name = 'signoz_external_call_latency_count' AND timestamp_ms >= 1708951860000 AND timestamp_ms < 1708953720000 GROUP BY fingerprint, address,ts ORDER BY fingerprint, address ASC, ts) WINDOW rate_window as (PARTITION BY fingerprint, address ORDER BY fingerprint, address ASC, ts) ) WHERE isNaN(rate_value) = 0 GROUP BY GROUPING SETS ( (address, ts), (address) ) ORDER BY address ASC, ts) as B ON A.`address` = B.`address` AND A.`ts` = B.`ts`]
countIf(statusCode=2) as errorCount,
2024-02-26T13:33:35.137Z INFO utils/time.go:12 func GetTimeSeriesResultV3 took 418.083515ms with args [SELECT B.`ts` `ts`, ((B.value + C.value) / 2) / A.value as value FROM (SELECT ts, sum(rate_value) as value FROM (SELECT ts, If((value - lagInFrame(value, 1, 0) OVER rate_window) < 0, nan, If((ts - lagInFrame(ts, 1, toDate('1970-01-01')) OVER rate_window) >= 86400, nan, (value - lagInFrame(value, 1, 0) OVER rate_window) / (ts - lagInFrame(ts, 1, toDate('1970-01-01')) OVER rate_window))) as rate_value FROM(SELECT fingerprint, toStartOfInterval(toDateTime(intDiv(timestamp_ms, 1000)), INTERVAL 60 SECOND) as ts, max(value) as value FROM signoz_metrics.distributed_samples_v2 INNER JOIN (SELECT fingerprint FROM signoz_metrics.time_series_v2 WHERE metric_name = 'signoz_latency_bucket' AND temporality IN ['Cumulative', 'Unspecified'] AND JSONExtractString(labels, 'status_code') != 'STATUS_CODE_ERROR' AND JSONExtractString(labels, 'le') = '1000' AND JSONExtractString(labels, 'service_name') = 'core-cam-http' AND JSONExtractString(labels, 'operation') IN ['HTTP POST route not found']) as filtered_time_series USING fingerprint WHERE metric_name = 'signoz_latency_bucket' AND timestamp_ms >= 1708951860000 AND timestamp_ms < 1708953720000 GROUP BY fingerprint, ts ORDER BY fingerprint, ts) WINDOW rate_window as (PARTITION BY fingerprint ORDER BY fingerprint, ts) ) WHERE isNaN(rate_value) = 0 GROUP BY ts ORDER BY ts) as B INNER JOIN (SELECT ts, sum(rate_value) as value FROM (SELECT ts, If((value - lagInFrame(value, 1, 0) OVER rate_window) < 0, nan, If((ts - lagInFrame(ts, 1, toDate('1970-01-01')) OVER rate_window) >= 86400, nan, (value - lagInFrame(value, 1, 0) OVER rate_window) / (ts - lagInFrame(ts, 1, toDate('1970-01-01')) OVER rate_window))) as rate_value FROM(SELECT fingerprint, toStartOfInterval(toDateTime(intDiv(timestamp_ms, 1000)), INTERVAL 60 SECOND) as ts, max(value) as value FROM signoz_metrics.distributed_samples_v2 INNER JOIN (SELECT fingerprint FROM signoz_metrics.time_series_v2 WHERE metric_name = 'signoz_latency_bucket' AND temporality IN ['Cumulative', 'Unspecified'] AND JSONExtractString(labels, 'le') = '10000' AND JSONExtractString(labels, 'status_code') != 'STATUS_CODE_ERROR' AND JSONExtractString(labels, 'service_name') = 'core-cam-http' AND JSONExtractString(labels, 'operation') IN ['HTTP POST route not found']) as filtered_time_series USING fingerprint WHERE metric_name = 'signoz_latency_bucket' AND timestamp_ms >= 1708951860000 AND timestamp_ms < 1708953720000 GROUP BY fingerprint, ts ORDER BY fingerprint, ts) WINDOW rate_window as (PARTITION BY fingerprint ORDER BY fingerprint, ts) ) WHERE isNaN(rate_value) = 0 GROUP BY ts ORDER BY ts) as C ON B.`ts` = C.`ts` INNER JOIN (SELECT ts, sum(rate_value) as value FROM (SELECT ts, If((value - lagInFrame(value, 1, 0) OVER rate_window) < 0, nan, If((ts - lagInFrame(ts, 1, toDate('1970-01-01')) OVER rate_window) >= 86400, nan, (value - lagInFrame(value, 1, 0) OVER rate_window) / (ts - lagInFrame(ts, 1, toDate('1970-01-01')) OVER rate_window))) as rate_value FROM(SELECT fingerprint, toStartOfInterval(toDateTime(intDiv(timestamp_ms, 1000)), INTERVAL 60 SECOND) as ts, max(value) as value FROM signoz_metrics.distributed_samples_v2 INNER JOIN (SELECT fingerprint FROM signoz_metrics.time_series_v2 WHERE metric_name = 'signoz_latency_count' AND temporality IN ['Cumulative', 'Unspecified'] AND JSONExtractString(labels, 'service_name') = 'core-cam-http' AND JSONExtractString(labels, 'operation') IN ['HTTP POST route not found']) as filtered_time_series USING fingerprint WHERE metric_name = 'signoz_latency_count' AND timestamp_ms >= 1708951860000 AND timestamp_ms < 1708953720000 GROUP BY fingerprint, ts ORDER BY fingerprint, ts) WINDOW rate_window as (PARTITION BY fingerprint ORDER BY fingerprint, ts) ) WHERE isNaN(rate_value) = 0 GROUP BY ts ORDER BY ts) as A ON C.`ts` = A.`ts`]
2024-02-27T00:20:48.114Z INFO clickhouseReader/reader.go:2464 SELECT id, status, ttl, cold_storage_ttl FROM ttl_status WHERE table_name = ? ORDER BY created_at DESCsignoz_traces.signoz_error_index_v2
2024-02-27T00:20:48.329Z DEBUG clickhouseReader/reader.go:2558 Parsing TTL from: MergeTree PARTITION BY toDate(timestamp) PRIMARY KEY (serviceName, hasError, toStartOfHour(timestamp), name) ORDER BY (serviceName, hasError, toStartOfHour(timestamp), name, timestamp) TTL toDateTime(timestamp) + toIntervalSecond(1296000) SETTINGS index_granularity = 8192, ttl_only_drop_parts = 1
2024-02-27T03:06:18.905Z INFO clickhouseReader/reader.go:1221 SELECT COUNT(*) as numTotal FROM signoz_traces.distributed_signoz_index_v2 WHERE timestamp >= @timestampL AND timestamp <= @timestampU AND hasError = true
2024-02-27T03:06:18.937Z INFO clickhouseReader/reader.go:1232 SELECT COUNT(*) as numTotal FROM signoz_traces.distributed_signoz_index_v2 WHERE timestamp >= @timestampL AND timestamp <= @timestampU AND hasError = false
2024-02-27T03:08:06.918Z INFO clickhouseReader/reader.go:1221 SELECT COUNT(*) as numTotal FROM signoz_traces.distributed_signoz_index_v2 WHERE timestamp >= @timestampL AND timestamp <= @timestampU AND hasError = true
2024-02-27T03:08:06.930Z INFO clickhouseReader/reader.go:1232 SELECT COUNT(*) as numTotal FROM signoz_traces.distributed_signoz_index_v2 WHERE timestamp >= @timestampL AND timestamp <= @timestampU AND hasError = false
2024-02-27T03:27:03.062Z ERROR clickhouseReader/reader.go:853 Error in processing sql query: write: write tcp 10.107.86.253:48690->172.20.141.21:9000: i/o timeout
2024-02-27T03:41:13.609Z ERROR clickhouseReader/reader.go:853 Error in processing sql query: write: write tcp 10.107.86.253:41776->172.20.141.21:9000: i/o timeout
2024-02-27T03:54:40.873Z ERROR clickhouseReader/reader.go:853 Error in processing sql query: write: write tcp 10.107.86.253:58618->172.20.141.21:9000: i/o timeout
posthog 2024/02/27 03:57:45 ERROR: sending request - Post "https://app.posthog.com/batch/": read tcp 10.107.86.253:32882->104.22.58.181:443: read: connection reset by peer
2024-02-27T04:41:21.721Z ERROR clickhouseReader/reader.go:853 Error in processing sql query: write: write tcp 10.107.86.253:59732->172.20.141.21:9000: i/o timeout

Vibhav Parameswara Chary

02/27/2024, 4:47 AM

I do see some errors in query-service.

Srikanth Chekuri

02/27/2024, 6:00 AM

It says i/o timeout. What are the resources given to ClickHouse?
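
One way to check whether ClickHouse itself spent a long time on these queries is its query log. This is only a sketch and was not run in this thread: it assumes `system.query_log` is enabled (the ClickHouse default) and that it is run on the node that served the query. The one-day window and the `LIKE` filter are illustrative choices.

```sql
-- Slowest recent queries touching the metrics samples table named in the query-service log.
SELECT
    event_time,
    type,
    query_duration_ms,
    read_rows,
    formatReadableSize(memory_usage) AS memory,
    substring(query, 1, 120) AS query_head
FROM system.query_log
WHERE type IN ('QueryFinish', 'ExceptionWhileProcessing')
  AND event_time >= now() - INTERVAL 1 DAY
  AND query LIKE '%signoz_metrics.distributed_samples_v2%'
ORDER BY query_duration_ms DESC
LIMIT 10
```

If nothing listed comes close to the 60 seconds reported for `GetTimeSeriesResultV3`, the time is probably being lost outside query execution, for example on the connection between query-service and ClickHouse.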

Vibhav Parameswara Chary

02/27/2024, 6:03 AM

Default values but it can go up to 5-6 cores

Srikanth Chekuri

02/27/2024, 6:04 AM

How much data are you ingesting? And what is the current CPU and memory usage of the ClickHouse pods?
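
A rough way to estimate stored data volume from inside ClickHouse is sketched below. It assumes the SigNoz databases all start with `signoz_`, as the ones named in this thread do. `system.parts` is a standard ClickHouse system table, and the result covers only the node the query runs on.

```sql
-- Row counts and on-disk size per SigNoz table (active parts only).
SELECT
    database,
    table,
    sum(rows) AS total_rows,
    formatReadableSize(sum(bytes_on_disk)) AS size_on_disk
FROM system.parts
WHERE active AND database LIKE 'signoz_%'
GROUP BY database, table
ORDER BY sum(bytes_on_disk) DESC
```

Comparing the totals between two runs taken some time apart gives an approximate ingestion rate.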

Vibhav Parameswara Chary

02/27/2024, 6:06 AM

It's consuming around 700 cores as of now.

Vibhav Parameswara Chary

02/27/2024, 6:07 AM

Memory usage is around 6 GB.

Vibhav Parameswara Chary

02/27/2024, 6:12 AM

How do we disable S3 once it's deployed?

Vibhav Parameswara Chary

02/27/2024, 6:12 AM

The pods start crashing if I disable S3.

Vibhav Parameswara Chary

02/27/2024, 6:12 AM

The last option is to uninstall and reinstall.

Vibhav Parameswara Chary

02/27/2024, 6:13 AM

Deleting the ClickHouse tables and re-running helm fixes the issue for some time, and then the services page goes back into spinning mode.

Srikanth Chekuri

02/27/2024, 6:15 AM

That's not the way to solve the issue. We first need to understand the reason behind this. How much data are you ingesting?

Srikanth Chekuri

02/27/2024, 6:17 AM

Share the output of this

SELECT
    serviceName,
    count()
FROM signoz_traces.distributed_top_level_operations
GROUP BY serviceName

Srikanth Chekuri

02/27/2024, 6:25 AM

What are your table TTL policies?

SHOW CREATE TABLE signoz_traces.signoz_index_v2

Vibhav Parameswara Chary

02/27/2024, 6:41 AM

SHOW CREATE TABLE signoz_traces.signoz_index_v2

Query id: 86614b46-690a-4e88-b8ba-827046d9a858

CREATE TABLE signoz_traces.signoz_index_v2
(
    `timestamp` DateTime64(9) CODEC(DoubleDelta, LZ4),
    `traceID` FixedString(32) CODEC(ZSTD(1)),
    `spanID` String CODEC(ZSTD(1)),
    `parentSpanID` String CODEC(ZSTD(1)),
    `serviceName` LowCardinality(String) CODEC(ZSTD(1)),
    `name` LowCardinality(String) CODEC(ZSTD(1)),
    `kind` Int8 CODEC(T64, ZSTD(1)),
    `durationNano` UInt64 CODEC(T64, ZSTD(1)),
    `statusCode` Int16 CODEC(T64, ZSTD(1)),
    `externalHttpMethod` LowCardinality(String) CODEC(ZSTD(1)),
    `externalHttpUrl` LowCardinality(String) CODEC(ZSTD(1)),
    `component` LowCardinality(String) CODEC(ZSTD(1)),
    `dbSystem` LowCardinality(String) CODEC(ZSTD(1)),
    `dbName` LowCardinality(String) CODEC(ZSTD(1)),
    `dbOperation` LowCardinality(String) CODEC(ZSTD(1)),
    `peerService` LowCardinality(String) CODEC(ZSTD(1)),
    `events` Array(String) CODEC(ZSTD(2)),
    `httpMethod` LowCardinality(String) CODEC(ZSTD(1)),
    `httpUrl` LowCardinality(String) CODEC(ZSTD(1)),
    `httpCode` LowCardinality(String) CODEC(ZSTD(1)),
    `httpRoute` LowCardinality(String) CODEC(ZSTD(1)),
    `httpHost` LowCardinality(String) CODEC(ZSTD(1)),
    `msgSystem` LowCardinality(String) CODEC(ZSTD(1)),
    `msgOperation` LowCardinality(String) CODEC(ZSTD(1)),
    `hasError` Bool CODEC(T64, ZSTD(1)),
    `tagMap` Map(LowCardinality(String), String) CODEC(ZSTD(1)),
    `gRPCMethod` LowCardinality(String) CODEC(ZSTD(1)),
    `gRPCCode` LowCardinality(String) CODEC(ZSTD(1)),
    `rpcSystem` LowCardinality(String) CODEC(ZSTD(1)),
    `rpcService` LowCardinality(String) CODEC(ZSTD(1)),
    `rpcMethod` LowCardinality(String) CODEC(ZSTD(1)),
    `responseStatusCode` LowCardinality(String) CODEC(ZSTD(1)),
    `stringTagMap` Map(String, String) CODEC(ZSTD(1)),
    `numberTagMap` Map(String, Float64) CODEC(ZSTD(1)),
    `boolTagMap` Map(String, Bool) CODEC(ZSTD(1)),
    `resourceTagsMap` Map(LowCardinality(String), String) CODEC(ZSTD(1)),
    INDEX idx_service serviceName TYPE bloom_filter GRANULARITY 4,
    INDEX idx_name name TYPE bloom_filter GRANULARITY 4,
    INDEX idx_kind kind TYPE minmax GRANULARITY 4,
    INDEX idx_duration durationNano TYPE minmax GRANULARITY 1,
    INDEX idx_httpCode httpCode TYPE set(0) GRANULARITY 1,
    INDEX idx_hasError hasError TYPE set(2) GRANULARITY 1,
    INDEX idx_tagMapKeys mapKeys(tagMap) TYPE bloom_filter(0.01) GRANULARITY 64,
    INDEX idx_tagMapValues mapValues(tagMap) TYPE bloom_filter(0.01) GRANULARITY 64,
    INDEX idx_httpRoute httpRoute TYPE bloom_filter GRANULARITY 4,
    INDEX idx_httpUrl httpUrl TYPE bloom_filter GRANULARITY 4,
    INDEX idx_httpHost httpHost TYPE bloom_filter GRANULARITY 4,
    INDEX idx_httpMethod httpMethod TYPE bloom_filter GRANULARITY 4,
    INDEX idx_timestamp timestamp TYPE minmax GRANULARITY 1,
    INDEX idx_rpcMethod rpcMethod TYPE bloom_filter GRANULARITY 4,
    INDEX idx_responseStatusCode responseStatusCode TYPE set(0) GRANULARITY 1,
    INDEX idx_resourceTagsMapKeys mapKeys(resourceTagsMap) TYPE bloom_filter(0.01) GRANULARITY 64,
    INDEX idx_resourceTagsMapValues mapValues(resourceTagsMap) TYPE bloom_filter(0.01) GRANULARITY 64,
    PROJECTION timestampSort ( SELECT * ORDER BY timestamp )
)
ENGINE = MergeTree
PARTITION BY toDate(timestamp)
PRIMARY KEY (serviceName, hasError, toStartOfHour(timestamp), name)
ORDER BY (serviceName, hasError, toStartOfHour(timestamp), name, timestamp)
TTL toDateTime(timestamp) + toIntervalSecond(1296000)
SETTINGS index_granularity = 8192, ttl_only_drop_parts = 1

1 row in set. Elapsed: 0.003 sec.

chi-signoz-release-clickhouse-cluster-0-0-0.chi-signoz-release-clickhouse-cluster-0-0.platform.svc.cluster.local :)
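
The TTL in the statement above is given in seconds. The sketch below converts it to days and lists which daily partitions currently exist for the table. The `system.parts` query reflects only the node it runs on.

```sql
-- 1296000 seconds expressed in days
SELECT 1296000 / 86400 AS ttl_days;   -- returns 15

-- Oldest and newest active daily partitions of the trace index table
SELECT
    min(partition) AS oldest_partition,
    max(partition) AS newest_partition,
    count() AS active_parts
FROM system.parts
WHERE active AND database = 'signoz_traces' AND table = 'signoz_index_v2';
```

With `PARTITION BY toDate(timestamp)` and `ttl_only_drop_parts = 1`, expired data is removed by dropping whole parts, so the oldest partition should be at most roughly 15 days old once the table has that much history.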

Srikanth Chekuri

02/27/2024, 6:44 AM

Run this and see how long the query takes to complete.

SELECT
    serviceName,
    toStartOfInterval(timestamp, toIntervalSecond(60)) AS ts,
    quantile(0.5)(durationNano) AS value
FROM signoz_traces.distributed_signoz_index_v2
WHERE ((timestamp >= '1709013204000000000') AND (timestamp <= '1709015036000000000'))
GROUP BY serviceName, ts
ORDER BY serviceName, ts

Vibhav Parameswara Chary

02/27/2024, 6:45 AM

558 rows in set. Elapsed: 0.025 sec. Processed 59.00 thousand rows, 1.00 MB (2.33 million rows/s., 39.63 MB/s.) Peak memory usage: 4.12 MiB.
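
The bounds in the test query are Unix timestamps in nanoseconds. A small sketch to read them as UTC times and to get the width of the window:

```sql
SELECT
    toDateTime(intDiv(1709013204000000000, 1000000000), 'UTC') AS window_start,
    toDateTime(intDiv(1709015036000000000, 1000000000), 'UTC') AS window_end,
    (1709015036000000000 - 1709013204000000000) / 1000000000 AS window_seconds
```

This should give 2024-02-27 05:53:24, 2024-02-27 06:23:56 and 1832, so the 59 thousand rows processed correspond to roughly half an hour of spans.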

Srikanth Chekuri

02/27/2024, 6:46 AM

Did you purge ClickHouse in the last 1-2 hours?

Vibhav Parameswara Chary

02/27/2024, 6:46 AM

Maybe yesterday.

Vibhav Parameswara Chary

02/27/2024, 6:46 AM

I've been trying all the steps and have purged a couple of times.

Vibhav Parameswara Chary

02/27/2024, 6:47 AM

Once purged it works for some time and then stops

Srikanth Chekuri

02/27/2024, 6:47 AM

I don't see anything wrong based on the outputs you shared.

Srikanth Chekuri

02/27/2024, 6:49 AM

When this happens, there might be a frontend error that keeps the spinner from going away. Please check if that's the case.

Vibhav Parameswara Chary

02/27/2024, 6:53 AM

Restarted the frontend pods, no luck.

Srikanth Chekuri

02/27/2024, 6:55 AM

No, I meant a JavaScript error in the browser. Do you see the spinner now?

Vibhav Parameswara Chary

02/27/2024, 6:55 AM

Yeah, the spinner is still there.

Vibhav Parameswara Chary

02/27/2024, 6:55 AM

It's the same in different browsers.

Srikanth Chekuri

02/27/2024, 6:56 AM

Would you be able to join huddle now?

Vibhav Parameswara Chary

02/27/2024, 6:58 AM

sure

Srikanth Chekuri

02/27/2024, 6:59 AM

Open the network tab and share which requests these are.

Srikanth Chekuri

02/27/2024, 7:02 AM

The queries were running quickly enough not to time out. What does your setup look like? Did you make any custom changes to the SigNoz deployment? I am also waiting on the huddle if you would prefer that.

Vibhav Parameswara Chary

02/27/2024, 7:06 AM

Not sure how huddle works, do you have to send an invite?

Srikanth Chekuri

02/27/2024, 7:07 AM

I did send an invite. You joined for a moment and then left.

Srikanth Chekuri

02/27/2024, 7:26 AM

SHOW CREATE TABLE signoz_traces.top_level_operations
