Hi, I'm facing an Error in processing SQL query in...
# support
m
Hi, I'm facing an "Error in processing SQL query" in the Traces menu after upgrading to 0.15.0. Is this an issue with the application or a limitation of the resources?
p
@moronmon Which version of SigNoz did you upgrade from?
Also, is it happening for all traces or some traces?
v
@moronmon Can you please share logs of query service?
m
Hi @Pranay, I updated from 0.13.0, and it happens to all traces that we have. Hi @Vishal Sharma, here's the log from the query service:
2023-02-07T04:56:07.939Z	INFO	clickhouseReader/reader.go:1217	SELECT durationNano as numTotal FROM signoz_traces.distributed_durationSort WHERE timestamp >= @timestampL AND timestamp <= @timestampU ORDER BY durationNano LIMIT 1
2023-02-07T04:56:10.250Z	INFO	clickhouseReader/reader.go:1229	SELECT durationNano as numTotal FROM signoz_traces.distributed_durationSort WHERE timestamp >= @timestampL AND timestamp <= @timestampU ORDER BY durationNano DESC LIMIT 1
2023-02-07T04:56:10.636Z	INFO	clickhouseReader/reader.go:1679	
2023-02-07T04:56:10.637Z	INFO	app/server.go:236	/api/v1/getTagFilters	timeTaken: 4.361572765s
2023-02-07T04:56:12.692Z	INFO	app/server.go:236	/api/v1/version	timeTaken: 23.08µs
2023-02-07T04:56:13.242Z	INFO	clickhouseReader/reader.go:1166	SELECT COUNT(*) as numTotal FROM signoz_traces.distributed_signoz_index_v2 WHERE timestamp >= @timestampL AND timestamp <= @timestampU AND hasError = true
2023-02-07T04:56:14.752Z	INFO	clickhouseReader/reader.go:1177	SELECT COUNT(*) as numTotal FROM signoz_traces.distributed_signoz_index_v2 WHERE timestamp >= @timestampL AND timestamp <= @timestampU AND hasError = false
2023-02-07T04:56:16.329Z	INFO	clickhouseReader/reader.go:1031	SELECT serviceName, count() as count FROM signoz_traces.distributed_signoz_index_v2 WHERE timestamp >= @timestampL AND timestamp <= @timestampU GROUP BY serviceName
2023-02-07T04:56:16.331Z	INFO	app/server.go:236	/api/v1/getSpanFilters	timeTaken: 10.021926759s
2023-02-07T04:56:17.545Z	INFO	clickhouseReader/reader.go:1432	SELECT timestamp, spanID, traceID, serviceName, name, durationNano, httpMethod, rpcMethod, responseStatusCode FROM signoz_traces.distributed_signoz_index_v2 WHERE timestamp >= @timestampL AND timestamp <= @timestampU LIMIT @limit
2023-02-07T04:56:17.546Z	INFO	app/server.go:236	/api/v1/getFilteredSpans	timeTaken: 892.793894ms
2023-02-07T04:56:18.437Z	INFO	clickhouseReader/reader.go:2146	SELECT toStartOfInterval(timestamp, INTERVAL 1 minute) as time,  count(*) as value  FROM signoz_traces.distributed_signoz_index_v2 WHERE timestamp >= @timestampL AND timestamp <= @timestampU GROUP BY time ORDER BY time
2023-02-07T04:56:18.439Z	INFO	app/server.go:236	/api/v1/getFilteredSpans/aggregates	timeTaken: 1.783472976s
2023-02-07T04:56:23.735Z	INFO	api/traces.go:18	SmartTraceDetail feature is not enabled in this plan
2023-02-07T04:56:36.249Z	INFO	clickhouseReader/reader.go:1679	
2023-02-07T04:56:36.249Z	INFO	app/server.go:236	/api/v1/getTagFilters	timeTaken: 30.007187224s
2023-02-07T04:56:43.115Z	INFO	app/server.go:236	/api/v1/version	timeTaken: 23.735µs
2023-02-07T04:57:13.535Z	INFO	app/server.go:236	/api/v1/version	timeTaken: 23.477µs
2023-02-07T04:57:14.033Z	INFO	clickhouseReader/reader.go:1914	SELECT timestamp, traceID, model FROM signoz_traces.distributed_signoz_spans WHERE traceID=$1
2023-02-07T04:57:14.033Z	DEBUG	clickhouseReader/reader.go:1917	Error in processing sql query: code: 241, message: Memory limit (total) exceeded: would use 3.52 GiB (attempt to allocate chunk of 5644118 bytes), maximum: 3.52 GiB. OvercommitTracker decision: Query was selected to stop by OvercommitTracker.: (while reading column model): (while reading from part /var/lib/clickhouse/store/349/3498ca02-e2c6-4147-b422-37bbd7943d34/20230206_1220283_1220283_0/ from mark 27 with max_rows_to_read = 1024): While executing MergeTreeInOrder
2023-02-07T04:57:14.033Z	INFO	app/server.go:236	/api/v1/traces/{traceId}	timeTaken: 50.298413819s
2023-02-07T04:57:44.001Z	INFO	app/server.go:236	/api/v1/version	timeTaken: 23.757µs
I saw there's a line that mentions a memory limit. I did put limits on several of the containers because of high usage. Any suggestion on how to set up and allocate resources for SigNoz?
v
@moronmon Generally, how many spans does a single trace have?
Here are some recommendations for running signoz on production: https://signoz.io/docs/production-readiness/
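(For anyone landing here later: a minimal sketch of what setting container limits might look like in a Docker Compose override, assuming the services are named clickhouse and query-service as in the standard SigNoz compose files; the numbers are illustrative, not a recommendation.)

```yaml
# Hypothetical docker-compose override; service names and values are assumptions,
# adjust them to your own deployment and hardware.
services:
  clickhouse:
    deploy:
      resources:
        limits:
          cpus: "2.0"
          memory: 8G   # older Compose file versions use service-level mem_limit/cpus instead
  query-service:
    deploy:
      resources:
        limits:
          cpus: "0.5"
          memory: 1G
```

Note that the "maximum: 3.52 GiB" in the error above tracks the memory ClickHouse believes it has available, so raising the container limit raises that cap as well.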
m
How can I see the number of spans for each trace?
Because on 0.13.0 it was working fine; after we upgraded, the usage got higher and the server crashed.
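(One way to see that, sketched against the trace index table that shows up in the query-service logs above; the time window is just an example.)

```sql
-- Sketch: spans per trace, using table/column names visible in the logs above.
SELECT
    traceID,
    count() AS spanCount
FROM signoz_traces.distributed_signoz_index_v2
WHERE timestamp >= now() - INTERVAL 1 DAY
GROUP BY traceID
ORDER BY spanCount DESC
LIMIT 10;
```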
v
Have you set a retention period? How many resources have been given to ClickHouse?
m
Yes, we did set the retention period; we set it to 1 day. As for ClickHouse's resources: 0.8 CPU and 4 GB memory.
Hi @Vishal Sharma, sorry to bother you again. Any idea why ClickHouse keeps hitting OOM?
2023.02.20 04:03:05.333191 [ 90 ] {} <Error> void DB::MergeTreeBackgroundExecutor<DB::MergeMutateRuntimeQueue>::routine(DB::TaskRuntimeDataPtr) [Queue = DB::MergeMutateRuntimeQueue]: Code: 241. DB::Exception: Memory limit (total) exceeded: would use 2.93 GiB (attempt to allocate chunk of 9310195 bytes), maximum: 2.93 GiB. OvercommitTracker decision: Memory overcommit isn't used. Waiting time or overcommit denominator are set to zero. (MEMORY_LIMIT_EXCEEDED), Stack trace (when copying this message, always include the lines below):
Is there any config I missed to keep ClickHouse from dying due to OOM? We are running in a limited-memory environment.
v
This happens when a query is unable to execute due to limited memory in ClickHouse.
You have to increase resources to be able to run heavier queries.
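(If it helps to see which queries are the heavy ones before resizing, ClickHouse keeps per-query memory accounting in system.query_log, which is enabled by default; a diagnostic sketch:)

```sql
-- Sketch: most memory-hungry queries in the last hour, from ClickHouse's own query log.
SELECT
    event_time,
    formatReadableSize(memory_usage) AS peak_memory,
    query_duration_ms,
    substring(query, 1, 120) AS query_head
FROM system.query_log
WHERE type = 'QueryFinish'
  AND event_time > now() - INTERVAL 1 HOUR
ORDER BY memory_usage DESC
LIMIT 10;

-- Per-query limit (0 = unlimited). The "Memory limit (total) exceeded" error above is
-- governed by the server-wide max_server_memory_usage settings, not by this value.
SELECT name, value
FROM system.settings
WHERE name = 'max_memory_usage';
```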
m
Is there another option besides increasing the resources?
Or is there any suggested specification for ClickHouse?
I tried running ClickHouse with 8 GB of memory and it seems to be working well for now, thank you @Vishal Sharma. But I have another question about the SQL query error on the Traces page: is there any timeout setting on query-service? I got this error while opening one of our traces:
DEBUG	clickhouseReader/reader.go:1917	Error in processing sql query: context deadline exceeded
In the ClickHouse log I see the task was tagged as outdated: `[ 259 ] {} <Information> DDLWorker: Task query-0000233047 is outdated, deleting it`
v
Looks like there are too many spans in a single trace. By default, the timeout is set to 60s: https://github.com/SigNoz/signoz/blob/c657f96032a3889f35075603dae6aebee511eede/pkg/query-service/constants/constants.go#L98
m
Can I change the default timeout? Or can I set a limit on the number of spans in a single trace?
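(Per the constants.go linked above, the timeout defaults to 60 seconds. If it is exposed as an environment variable — the CONTEXT_TIMEOUT name below is an assumption, not a documented setting, so verify it against the linked source — it could be raised on the query-service container roughly like this.)

```yaml
# Hypothetical override; CONTEXT_TIMEOUT is an assumed variable name, verify against
# the linked constants.go before relying on it. Value is in seconds.
services:
  query-service:
    environment:
      - CONTEXT_TIMEOUT=120
```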