# support
a
Hi, I'm using the Helm charts to deploy SigNoz. SigNoz version: 0.11.0. Helm chart version: 0.3.3. First, I have a problem running `signoz-query-services`: it is constantly restarting. The logs look OK:
2022-09-16T12:44:52.645Z	INFO	app/server.go:84	Using ClickHouse as datastore ...
ts=2022-09-16T12:44:52.652594487Z caller=log.go:168 level=info msg="Loading configuration file" filename=/root/config/prometheus.yml
2022-09-16T12:44:52.654Z	INFO	alertManager/notifier.go:94	Starting notifier with alert manager:[http://signoz-alertmanager:9093/api/]
2022-09-16T12:44:52.654Z	INFO	app/server.go:396	rules manager is ready
ts=2022-09-16T12:44:52.656830443Z caller=log.go:168 level=info msg="Completed loading of configuration file" filename=/root/config/prometheus.yml
2022-09-16T12:44:52.657Z	INFO	alertManager/notifier.go:126	msg: Initiating alert notifier...
2022-09-16T12:44:52.658Z	INFO	app/server.go:273	Query server started listening on 0.0.0.0:8080...
2022-09-16T12:44:52.658Z	INFO	app/server.go:286	Query server started listening on private port 0.0.0.0:8085...
starting private http
2022-09-16T12:44:52.658Z	INFO	app/server.go:312	Starting HTTP server{port 11 8080  <nil>} {addr 15 0 0.0.0.0:8080 <nil>}
2022-09-16T12:44:52.658Z	INFO	app/server.go:324	Starting pprof server{addr 15 0 0.0.0.0:6060 <nil>}
2022-09-16T12:44:52.658Z	INFO	app/server.go:338	Starting Private HTTP server{port 11 8085  <nil>} {addr 15 0 0.0.0.0:8085 <nil>}
2022-09-16T12:44:52.700Z	INFO	app/server.go:189	/api/v1/version	timeTaken: 27.736µs
2022-09-16T12:44:55.562Z	INFO	app/server.go:189	/api/v1/version	timeTaken: 16.782µs
2022-09-16T12:44:55.562Z	INFO	app/server.go:189	/api/v1/version	timeTaken: 14.205µs
Is there a way to troubleshoot it, for example by enabling debug logging? Second, is there a way to run `signoz-otel-collector` as a Deployment instead of a DaemonSet?
s
Do you already have some existing data, or is this a fresh installation? Can you share the exit status code for the query service?
a
@Srikanth Chekuri I cannot get the exit code. Is it exposed by SigNoz? I get an empty response from:
kubectl get pod signoz-query-service-0 -n platform -o jsonpath='{.status.containerStatuses[?(@.name=="signoz-query-service")].state.terminated.exitCode}'
It was a fresh installation. After some hours it broke, and now I am trying to debug it.
s
Shouldn't it be `lastState.terminated.exitCode`? The most common and best-known cause is an OOM kill, with exit code 137, when the query service doesn't have enough memory. What are the resource limits for it?
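To illustrate the difference between `state` and `lastState`, here is a minimal Python sketch that reads a pod document (as returned by `kubectl get pod <name> -o json`) and pulls the exit code from `lastState.terminated`. The sample pod data and the function names are hypothetical. The decoding assumes the usual convention that an exit code above 128 means "killed by signal number code minus 128", so 137 maps to SIGKILL, which is what an OOM kill produces.

```python
import signal
from typing import Optional


def last_exit_code(pod: dict, container: str) -> Optional[int]:
    """Return the exit code of the previous run of `container`, if there was one."""
    for status in pod.get("status", {}).get("containerStatuses", []):
        if status.get("name") == container:
            terminated = status.get("lastState", {}).get("terminated")
            return terminated["exitCode"] if terminated else None
    return None


def describe_exit_code(code: int) -> str:
    """Decode exit codes above 128 as 128 + signal number."""
    if code > 128:
        try:
            return signal.Signals(code - 128).name
        except ValueError:
            return "unknown signal"
    return "exited normally" if code == 0 else "application error"


# Hypothetical, trimmed-down pod status: the container has already been
# restarted, so `state` is "running" and only `lastState` holds the exit code.
sample_pod = {
    "status": {
        "containerStatuses": [
            {
                "name": "signoz-query-service",
                "state": {"running": {}},
                "lastState": {
                    "terminated": {"exitCode": 137, "reason": "OOMKilled"}
                },
            }
        ]
    }
}

code = last_exit_code(sample_pod, "signoz-query-service")
print(code, describe_exit_code(code))
```

Because the restarted container is running again, `state.terminated` is absent, which would explain an empty response when querying it.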
a
Yes, it was 137. The resources are:
resources:
  limits:
    memory: 2000Mi
  requests:
    cpu: 500m
    memory: 2000Mi
s
What's the amount of data you are ingesting? Can you share the number of time series you have (run `select count() from signoz_metrics.time_series_v2;` from the ClickHouse client)? Right now the query service needs its resources adjusted for the volume of time series, so if you can share the average number of time series you expect in total, a resource limit can be suggested.
a
@Srikanth Chekuri OK, that makes sense. The query result is 464302.
s
Can you upgrade to `0.11.1`? We made some improvements, and it should run fine for the above number.
a
Yes, it looks like it works.
Is there a way to calculate the amount of resources needed based on the query you shared?
s
You can make a rough estimate of 1M series ~ 1-1.5 GB of RAM.
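As a rough sketch of that rule of thumb, the Python snippet below turns a series count into a memory range. The function name and constants are hypothetical, and linear scaling with the series count is an assumption, not something confirmed for the query service.

```python
from typing import Tuple

# Rule of thumb quoted in this thread: 1M series needs roughly 1 to 1.5 GB of RAM.
GB_PER_MILLION_LOW = 1.0
GB_PER_MILLION_HIGH = 1.5


def estimate_memory_gb(series_count: int) -> Tuple[float, float]:
    """Return a (low, high) RAM estimate in GB, assuming linear scaling."""
    millions = series_count / 1_000_000
    return millions * GB_PER_MILLION_LOW, millions * GB_PER_MILLION_HIGH


# Series count reported earlier in the thread.
low, high = estimate_memory_gb(464_302)
print(f"{low:.2f} to {high:.2f} GB")
```

For the 464302 series reported above, this gives roughly 0.46 to 0.70 GB, before any headroom for query load.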
a
Is that number of time series based on time? For example, every 5 minutes, or something like that?
s
No, it's based on the data being ingested, for instance the number of status codes, method types, etc. Say you have HTTP requests with 4 status codes and 5 methods: then you have 20 time series.
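The arithmetic can be sketched in Python by enumerating every label combination. The label names and values below are hypothetical placeholders; the point is only that each unique combination of label values is its own time series.

```python
from itertools import product

# Hypothetical label values: 4 status codes and 5 HTTP methods.
status_codes = ["200", "404", "500", "503"]
methods = ["GET", "POST", "PUT", "DELETE", "PATCH"]

# Each unique (status_code, method) pair corresponds to one time series.
series = [
    {"status_code": code, "method": method}
    for code, method in product(status_codes, methods)
]

print(len(series))  # 4 * 5 = 20
```

Adding another label multiplies the count again, which is why the number of series depends on what is ingested and not on how long data has been collected.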
a
@Srikanth Chekuri thank you! You saved my day! 🙂
And what about running the collector as a Deployment? Is that possible?
s
I haven’t followed the recent developments there but I believe it was changed to daemonset for some reason. @Prashant Shahi can give you more details on it.
a
I think it is actually running as a DaemonSet because it needs to collect logs. It would be nice to have an option to disable logs when they are not required and deploy it as a Deployment instead. That would probably require some chart modification.
p
@Alejandro Decchi we have moved back to Deployment. There is a release PR pending. It will be merged after review from @Ankit Nayan. You can track it here: https://github.com/SigNoz/charts/pull/82
@Alejandro Decchi The PR above has been merged. The SigNoz otelCollector should now be reverted back to a Deployment, while the `k8s-infra` chart has been introduced to handle logs and metrics collection from the K8s cluster.