Hi, I'm facing an Error in processing SQL query in...
# support
m
Hi, I'm facing an "Error in processing SQL query" in the Traces menu after upgrading to 0.15.0. Is this an issue with the application or a limitation of the resources?
p
@moronmon Which version of SigNoz did you upgrade from?
Also, is it happening for all traces or some traces?
v
@moronmon Can you please share logs of query service?
m
Hi @Pranay, I updated from 0.13.0, and it happens to all traces that we have. Hi @Vishal Sharma, here's the log from the query service:
2023-02-07T04:56:07.939Z	INFO	clickhouseReader/reader.go:1217	SELECT durationNano as numTotal FROM signoz_traces.distributed_durationSort WHERE timestamp >= @timestampL AND timestamp <= @timestampU ORDER BY durationNano LIMIT 1
2023-02-07T04:56:10.250Z	INFO	clickhouseReader/reader.go:1229	SELECT durationNano as numTotal FROM signoz_traces.distributed_durationSort WHERE timestamp >= @timestampL AND timestamp <= @timestampU ORDER BY durationNano DESC LIMIT 1
2023-02-07T04:56:10.636Z	INFO	clickhouseReader/reader.go:1679	
2023-02-07T04:56:10.637Z	INFO	app/server.go:236	/api/v1/getTagFilters	timeTaken: 4.361572765s
2023-02-07T04:56:12.692Z	INFO	app/server.go:236	/api/v1/version	timeTaken: 23.08µs
2023-02-07T04:56:13.242Z	INFO	clickhouseReader/reader.go:1166	SELECT COUNT(*) as numTotal FROM signoz_traces.distributed_signoz_index_v2 WHERE timestamp >= @timestampL AND timestamp <= @timestampU AND hasError = true
2023-02-07T04:56:14.752Z	INFO	clickhouseReader/reader.go:1177	SELECT COUNT(*) as numTotal FROM signoz_traces.distributed_signoz_index_v2 WHERE timestamp >= @timestampL AND timestamp <= @timestampU AND hasError = false
2023-02-07T04:56:16.329Z	INFO	clickhouseReader/reader.go:1031	SELECT serviceName, count() as count FROM signoz_traces.distributed_signoz_index_v2 WHERE timestamp >= @timestampL AND timestamp <= @timestampU GROUP BY serviceName
2023-02-07T04:56:16.331Z	INFO	app/server.go:236	/api/v1/getSpanFilters	timeTaken: 10.021926759s
2023-02-07T04:56:17.545Z	INFO	clickhouseReader/reader.go:1432	SELECT timestamp, spanID, traceID, serviceName, name, durationNano, httpMethod, rpcMethod, responseStatusCode FROM signoz_traces.distributed_signoz_index_v2 WHERE timestamp >= @timestampL AND timestamp <= @timestampU LIMIT @limit
2023-02-07T04:56:17.546Z	INFO	app/server.go:236	/api/v1/getFilteredSpans	timeTaken: 892.793894ms
2023-02-07T04:56:18.437Z	INFO	clickhouseReader/reader.go:2146	SELECT toStartOfInterval(timestamp, INTERVAL 1 minute) as time,  count(*) as value  FROM signoz_traces.distributed_signoz_index_v2 WHERE timestamp >= @timestampL AND timestamp <= @timestampU GROUP BY time ORDER BY time
2023-02-07T04:56:18.439Z	INFO	app/server.go:236	/api/v1/getFilteredSpans/aggregates	timeTaken: 1.783472976s
2023-02-07T04:56:23.735Z	INFO	api/traces.go:18	SmartTraceDetail feature is not enabled in this plan
2023-02-07T04:56:36.249Z	INFO	clickhouseReader/reader.go:1679	
2023-02-07T04:56:36.249Z	INFO	app/server.go:236	/api/v1/getTagFilters	timeTaken: 30.007187224s
2023-02-07T04:56:43.115Z	INFO	app/server.go:236	/api/v1/version	timeTaken: 23.735µs
2023-02-07T04:57:13.535Z	INFO	app/server.go:236	/api/v1/version	timeTaken: 23.477µs
2023-02-07T04:57:14.033Z	INFO	clickhouseReader/reader.go:1914	SELECT timestamp, traceID, model FROM signoz_traces.distributed_signoz_spans WHERE traceID=$1
2023-02-07T04:57:14.033Z	DEBUG	clickhouseReader/reader.go:1917	Error in processing sql query: code: 241, message: Memory limit (total) exceeded: would use 3.52 GiB (attempt to allocate chunk of 5644118 bytes), maximum: 3.52 GiB. OvercommitTracker decision: Query was selected to stop by OvercommitTracker.: (while reading column model): (while reading from part /var/lib/clickhouse/store/349/3498ca02-e2c6-4147-b422-37bbd7943d34/20230206_1220283_1220283_0/ from mark 27 with max_rows_to_read = 1024): While executing MergeTreeInOrder
2023-02-07T04:57:14.033Z	INFO	app/server.go:236	/api/v1/traces/{traceId}	timeTaken: 50.298413819s
2023-02-07T04:57:44.001Z	INFO	app/server.go:236	/api/v1/version	timeTaken: 23.757µs
I saw there's a line that mentions a memory limit. I did put limits on several of the containers because of high usage. Any suggestion on how to set up and allocate resources for SigNoz?
v
@moronmon Generally, how many spans does a single trace have?
Here are some recommendations for running signoz on production: https://signoz.io/docs/production-readiness/
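(For anyone landing here later: a minimal sketch of what setting container limits might look like in a Docker Compose override, assuming the services are named clickhouse and query-service as in the standard SigNoz compose files; the numbers are illustrative, not a recommendation.)

```yaml
# Hypothetical docker-compose override; service names and values are assumptions,
# adjust them to your own deployment and hardware.
services:
  clickhouse:
    deploy:
      resources:
        limits:
          cpus: "2.0"
          memory: 8G   # older Compose file versions use service-level mem_limit/cpus instead
  query-service:
    deploy:
      resources:
        limits:
          cpus: "0.5"
          memory: 1G
```

Note that the "maximum: 3.52 GiB" in the error above tracks the memory ClickHouse believes it has available, so raising the container limit raises that cap as well.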
m
How can I see the number of spans for each trace?
Because on 0.13.0 it was working fine; after we upgraded, the usage got higher and the server crashed.
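(One way to see that, sketched against the trace index table that shows up in the query-service logs above; the time window is just an example.)

```sql
-- Sketch: spans per trace, using table/column names visible in the logs above.
SELECT
    traceID,
    count() AS spanCount
FROM signoz_traces.distributed_signoz_index_v2
WHERE timestamp >= now() - INTERVAL 1 DAY
GROUP BY traceID
ORDER BY spanCount DESC
LIMIT 10;
```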
v
Have you set a retention period? How many resources have been given to ClickHouse?
m
Yes, we did set the retention period; we set it to 1 day. As for ClickHouse's resources: 0.8 CPU and 4 GB memory.
Hi @Vishal Sharma, sorry to bother you again. Any idea why ClickHouse keeps hitting OOM?
2023.02.20 04:03:05.333191 [ 90 ] {} <Error> void DB::MergeTreeBackgroundExecutor<DB::MergeMutateRuntimeQueue>::routine(DB::TaskRuntimeDataPtr) [Queue = DB::MergeMutateRuntimeQueue]: Code: 241. DB::Exception: Memory limit (total) exceeded: would use 2.93 GiB (attempt to allocate chunk of 9310195 bytes), maximum: 2.93 GiB. OvercommitTracker decision: Memory overcommit isn't used. Waiting time or overcommit denominator are set to zero. (MEMORY_LIMIT_EXCEEDED), Stack trace (when copying this message, always include the lines below):
Is there any config I missed to keep ClickHouse from dying due to OOM? We are running in a limited-memory environment.
v
This happens when a query is unable to execute due to limited memory in ClickHouse.
You have to increase resources to be able to run heavier queries.
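(If it helps to see which queries are the heavy ones before resizing, ClickHouse keeps per-query memory accounting in system.query_log, which is enabled by default; a diagnostic sketch:)

```sql
-- Sketch: most memory-hungry queries in the last hour, from ClickHouse's own query log.
SELECT
    event_time,
    formatReadableSize(memory_usage) AS peak_memory,
    query_duration_ms,
    substring(query, 1, 120) AS query_head
FROM system.query_log
WHERE type = 'QueryFinish'
  AND event_time > now() - INTERVAL 1 HOUR
ORDER BY memory_usage DESC
LIMIT 10;

-- Per-query limit (0 = unlimited). The "Memory limit (total) exceeded" error above is
-- governed by the server-wide max_server_memory_usage settings, not by this value.
SELECT name, value
FROM system.settings
WHERE name = 'max_memory_usage';
```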
m
Is there another option besides increasing the resources?
Or is there any suggested specification for ClickHouse?
I tried running ClickHouse with 8 GB of memory and it seems to be working well for now, thank you @Vishal Sharma. But I have another question about the SQL query error on the Traces page: is there any timeout setting on query-service? I got this error while opening one of our traces:
DEBUG	clickhouseReader/reader.go:1917	Error in processing sql query: context deadline exceeded
In the ClickHouse log I see the task was tagged as outdated: `[ 259 ] {} <Information> DDLWorker: Task query-0000233047 is outdated, deleting it`
v
Looks like there are too many spans in a single trace. By default, the timeout is set to 60s: https://github.com/SigNoz/signoz/blob/c657f96032a3889f35075603dae6aebee511eede/pkg/query-service/constants/constants.go#L98
m
Can I change the default timeout? Or can I set a limit on the number of spans in a single trace?
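(Per the constants.go linked above, the timeout defaults to 60 seconds. If it is exposed as an environment variable — the CONTEXT_TIMEOUT name below is an assumption, not a documented setting, so verify it against the linked source — it could be raised on the query-service container roughly like this.)

```yaml
# Hypothetical override; CONTEXT_TIMEOUT is an assumed variable name, verify against
# the linked constants.go before relying on it. Value is in seconds.
services:
  query-service:
    environment:
      - CONTEXT_TIMEOUT=120
```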