Hello, I have observed a significant increase in C...
# support
s
Hello, I have observed a significant increase in CPU utilization after the SigNoz version upgrade to v0.41.1. Additionally, we're encountering the following error in the ClickHouse pod.
2024.03.21 13:04:42.644576 [ 1011 ] {bc235080-bd8e-4a26-99d6-69e10ec9bee9} <Error> TCPHandler: Code: 210. DB::NetException: I/O error: Broken pipe, while writing to socket ([::ffff:10.0.145.23]:9000 -> [::ffff:10.0.129.149]:33502). (NETWORK_ERROR), Stack trace (when copying this message, always include the lines below):
0. DB::Exception::Exception(DB::Exception::MessageMasked&&, int, bool) @ 0x000000000c800f1b in /usr/bin/clickhouse
1. DB::NetException::NetException<String, String, String>(int, FormatStringHelperImpl<std::type_identity<String>::type, std::type_identity<String>::type, std::type_identity<String>::type>, String&&, String&&, String&&) @ 0x000000000caa69a1 in /usr/bin/clickhouse
2. DB::WriteBufferFromPocoSocket::nextImpl() @ 0x000000000caa733e in /usr/bin/clickhouse
3. DB::TCPHandler::runImpl() @ 0x000000001292120f in /usr/bin/clickhouse
4. DB::TCPHandler::run() @ 0x0000000012933eb9 in /usr/bin/clickhouse
5. Poco::Net::TCPServerConnection::start() @ 0x00000000153a5a72 in /usr/bin/clickhouse
6. Poco::Net::TCPServerDispatcher::run() @ 0x00000000153a6871 in /usr/bin/clickhouse
7. Poco::PooledThread::run() @ 0x000000001549f047 in /usr/bin/clickhouse
8. Poco::ThreadImpl::runnableEntry(void*) @ 0x000000001549d67d in /usr/bin/clickhouse
r
cc: @Prashant Shahi @Srikanth Chekuri
r
I ran into this too. Are you using a single shard? Also, is CPU usage abnormally high consistently for more than 5 mins on Clickhouse? You'll find some more context here: https://github.com/SigNoz/signoz-otel-collector/issues/247#issuecomment-2012415529
s
Yes, we are using a single shard, and CPU utilization is consistently on the high side.
We ingest around 10-15 GB of data with a retention period of 10 days. The CPU and memory allotted to the ClickHouse pod are 3.5 cores and 6 GB of RAM. Are the memory and CPU sufficient, or should we increase them? Meanwhile, the ClickHouse pod is using around 3 cores of CPU. @Prashant Shahi
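As a rough sanity check on the figures quoted in this message (about 3 cores used against a 3.5-core allocation), the arithmetic can be sketched in Python. The helper name `cpu_utilization` is hypothetical and only illustrates the ratio; it does not query Kubernetes or ClickHouse.

```python
def cpu_utilization(used_cores: float, limit_cores: float) -> float:
    """Return the fraction of the CPU allocation that is in use."""
    if limit_cores <= 0:
        raise ValueError("limit_cores must be positive")
    return used_cores / limit_cores


# Figures taken from the message above: ~3 cores used, 3.5 cores allotted.
used, limit = 3.0, 3.5
ratio = cpu_utilization(used, limit)
headroom = limit - used

print(f"{ratio:.0%} of the allocation in use, {headroom:.1f} cores of headroom")
```

At roughly 86% of the allocation there is little room left for query spikes or background merges, which is consistent with asking whether the allocation should be raised.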
r
Are you running this on a public cloud or on-prem? Apart from increasing the CPU and memory allocations, I'd also recommend not running ClickHouse on shared CPUs - if you're running this on AWS, stay away from `t2`/`t3`/`t3a` instances - `t2` doesn't really let you burst CPU, while the other two allow for bursting CPU but you tend to run out of CPU credits pretty fast and that throttles ClickHouse.
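A minimal sketch of how one might flag the instance families named above when reviewing node types. The set below contains only the families mentioned in this thread, and `is_burstable` is a hypothetical helper, not part of any AWS SDK.

```python
# Families called out in this thread; this is not an exhaustive AWS list.
BURSTABLE_FAMILIES = {"t2", "t3", "t3a"}


def is_burstable(instance_type: str) -> bool:
    """Return True if the instance type belongs to one of the listed families."""
    # An instance type looks like "<family>.<size>", e.g. "t3a.large".
    family = instance_type.split(".", 1)[0].lower()
    return family in BURSTABLE_FAMILIES


for node_type in ["t3a.large", "m5.xlarge"]:
    verdict = "avoid for ClickHouse" if is_burstable(node_type) else "ok"
    print(f"{node_type}: {verdict}")
```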
@saurabh biramwar another thing I noticed - activating cold storage works wonders, and reduces CPU consumption. Def worth considering.
p
Apart from the volume of data, it also depends on the nature of the data and the queries. For example, even a moderate workload with high-cardinality data will contribute to more resource usage.
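To illustrate what high cardinality means here, the sketch below counts distinct label combinations in a handful of made-up records. The records, label names, and the `series_cardinality` helper are all hypothetical and are not taken from SigNoz or ClickHouse.

```python
def series_cardinality(records, label_keys):
    """Count the distinct combinations of the given labels across records."""
    return len({tuple(r.get(k) for k in label_keys) for r in records})


# Made-up sample data: four records from two services and three users.
records = [
    {"service": "api", "user_id": "u1"},
    {"service": "api", "user_id": "u2"},
    {"service": "api", "user_id": "u3"},
    {"service": "web", "user_id": "u1"},
]

low = series_cardinality(records, ["service"])
high = series_cardinality(records, ["service", "user_id"])
print(f"by service: {low}, by service and user_id: {high}")
```

The same four records produce 2 distinct series when grouped by service but 4 once a per-user label is added, so the number of series can grow much faster than the data volume alone suggests.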