This message was deleted SigNoz Community #support

# support

Slackbot

09/20/2022, 11:14 AM

This message was deleted.

Srikanth Chekuri

09/20/2022, 11:34 AM

That shouldn’t be the case. Can you share the output of describe pods for query-service?

sudhanshu dev

09/20/2022, 11:34 AM

I will share

nitya-signoz

09/20/2022, 11:54 AM

Can you also check whether ClickHouse is in a healthy state, or is it restarting as well?

sudhanshu dev

09/20/2022, 11:54 AM

No, ClickHouse is in a healthy state, with no restarts.

sudhanshu dev

09/20/2022, 11:54 AM

I am out

👍 1

sudhanshu dev

09/20/2022, 11:55 AM

I will share the describe command output soon.

👍 1

Srikanth Chekuri

09/20/2022, 12:02 PM

That will help us find the actual reason. A failed background upload to S3 wouldn’t crash the DB and should be unrelated to the query service getting into CrashLoopBackOff.

sudhanshu dev

09/20/2022, 12:03 PM

Got it

sudhanshu dev

09/20/2022, 2:28 PM

describe pods output

sudhanshu dev

09/20/2022, 2:28 PM

Name:         signoz-release-query-service-0
Namespace:    platform
Priority:     0
Node:         ip-10-107-65-6.ap-south-1.compute.internal/10.107.65.6
Start Time:   Tue, 20 Sep 2022 19:51:45 +0530
Labels:       app.kubernetes.io/component=query-service
              app.kubernetes.io/instance=signoz-release
              app.kubernetes.io/name=signoz
              controller-revision-hash=signoz-release-query-service-6b74848544
              statefulset.kubernetes.io/pod-name=signoz-release-query-service-0
Annotations:  checksum/config: 04f4266ae5775a09aa16c23105aec568e83d8e15a04f4d5588eeac26b5bc74e4
              kubernetes.io/psp: eks.privileged
Status:       Running
IP:           10.107.93.107
IPs:
  IP:           10.107.93.107
Controlled By:  StatefulSet/signoz-release-query-service
Init Containers:
  signoz-release-query-service-init:
    Container ID:  docker://ecf7c957bcc41dc3de24949a9a40e46b1b9284460b924cef9f4a467a1398003d
    Image:         docker.io/busybox:1.35
    Image ID:      docker-pullable://busybox@sha256:09439c073bd3eb029a91c72eff2c0d9f12ab9c84f66bdef360fcf3f91a81bf7c
    Port:          <none>
    Host Port:     <none>
    Command:
      sh
      -c
      until wget --spider -q signoz-release-clickhouse:8123/ping; do echo -e "waiting for clickhouseDB"; sleep 5; done; echo -e "clickhouse ready, starting query service now";
    State:          Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Tue, 20 Sep 2022 19:51:55 +0530
      Finished:     Tue, 20 Sep 2022 19:51:55 +0530
    Ready:          True
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-hk7nl (ro)
Containers:
  signoz-release-query-service:
    Container ID:  docker://898705709a98a98b39673b364418bec50048e0a454514fab1f7f6b1031559e6e
    Image:         docker.io/signoz/query-service:0.11.0
    Image ID:      docker-pullable://signoz/query-service@sha256:fbaba7b20e60dfa2cc55a456afdf13bcc94f17b99afab94969250cf1b35bc6dd
    Port:          8080/TCP
    Host Port:     0/TCP
    Args:
      -config=/root/config/prometheus.yml
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       OOMKilled
      Exit Code:    137
      Started:      Tue, 20 Sep 2022 19:57:03 +0530
      Finished:     Tue, 20 Sep 2022 19:57:28 +0530
    Ready:          False
    Restart Count:  5
    Limits:
      cpu:     750m
      memory:  1000Mi
    Requests:
      cpu:      200m
      memory:   300Mi
    Liveness:   http-get http://:http/api/v1/version delay=0s timeout=1s period=10s #success=1 #failure=3
    Readiness:  http-get http://:http/api/v1/version delay=0s timeout=1s period=10s #success=1 #failure=3
    Environment:
      STORAGE:                  clickhouse
      ClickHouseUrl:            tcp://signoz-release-clickhouse:9000?database=signoz_traces&username=admin&password=27ff0399-0d3a-4bd8-919d-17c2181e6fb9
      ALERTMANAGER_API_PREFIX:  http://signoz-release-alertmanager:9093/api/
      GODEBUG:                  netdns=go
      TELEMETRY_ENABLED:        true
      DEPLOYMENT_TYPE:          kubernetes-helm
    Mounts:
      /root/config from prometheus (rw)
      /root/config/dashboards from dashboards (rw)
      /var/lib/signoz/ from signoz-db (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-hk7nl (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  signoz-db:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  signoz-db-signoz-release-query-service-0
    ReadOnly:   false
  prometheus:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      signoz-release-query-service
    Optional:  false
  dashboards:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
    SizeLimit:  <unset>
  kube-api-access-hk7nl:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   Burstable
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason     Age                    From               Message
  ----     ------     ----                   ----               -------
  Normal   Scheduled  5m59s                  default-scheduler  Successfully assigned platform/signoz-release-query-service-0 to ip-10-107-65-6.ap-south-1.compute.internal
  Normal   Pulled     5m49s                  kubelet            Container image "docker.io/busybox:1.35" already present on machine
  Normal   Created    5m49s                  kubelet            Created container signoz-release-query-service-init
  Normal   Started    5m49s                  kubelet            Started container signoz-release-query-service-init
  Warning  Unhealthy  4m10s                  kubelet            Liveness probe failed: Get "http://10.107.93.107:8080/api/v1/version": read tcp 10.107.65.6:45490->10.107.93.107:8080: read: connection reset by peer
  Warning  Unhealthy  4m10s                  kubelet            Readiness probe failed: Get "http://10.107.93.107:8080/api/v1/version": read tcp 10.107.65.6:45488->10.107.93.107:8080: read: connection reset by peer
  Warning  BackOff    4m (x6 over 4m53s)     kubelet            Back-off restarting failed container
  Normal   Created    3m46s (x4 over 5m49s)  kubelet            Created container signoz-release-query-service
  Normal   Started    3m46s (x4 over 5m49s)  kubelet            Started container signoz-release-query-service
  Warning  Unhealthy  3m45s (x2 over 4m36s)  kubelet            Readiness probe failed: Get "http://10.107.93.107:8080/api/v1/version": dial tcp 10.107.93.107:8080: connect: connection refused
  Normal   Pulled     41s (x6 over 5m49s)    kubelet            Container image "docker.io/signoz/query-service:0.11.0" already present on machine

sudhanshu dev

09/20/2022, 2:29 PM

No errors in the SigNoz query service pod logs
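
For context on why the logs can look clean: the describe output above reports the last state as OOMKilled with exit code 137. Exit codes above 128 conventionally mean the process was ended by signal number (code - 128), and signal 9 (SIGKILL) cannot be caught, so the application gets no chance to log anything before it dies. A minimal sketch for decoding such a code and for reading the last termination reason from the pod status follows; the function names are illustrative and not from the thread, while the namespace and pod name are copied from the describe output.

```shell
# Decode a container exit code into the signal that ended the process.
# Exit codes above 128 conventionally mean "killed by signal (code - 128)".
exit_code_signal() {
  if [ "$1" -gt 128 ]; then
    kill -l "$(($1 - 128))"
  else
    echo "exited with status $1 (not a signal)"
  fi
}

# Read the last termination reason of the first container from the pod status.
# Namespace and pod name are taken from the describe output in this thread.
last_termination_reason() {
  kubectl -n platform get pod signoz-release-query-service-0 \
    -o jsonpath='{.status.containerStatuses[0].lastState.terminated.reason}'
}

exit_code_signal 137   # prints KILL
```

Calling `last_termination_reason` against the cluster should print the same reason that appears under Last State in the describe output.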

sudhanshu dev

09/20/2022, 2:30 PM

2022-09-20T14:27:03.866Z	INFO	version/version.go:43

SigNoz version   : v0.11.0
Commit SHA-1     : 73b00f4
Commit timestamp : 2022-08-24T13:32:19Z
Branch           : HEAD
Go version       : go1.17.13

For SigNoz Official Documentation,  visit https://signoz.io/docs
For SigNoz Community Slack,         visit http://signoz.io/slack
For discussions about SigNoz,       visit https://community.signoz.io

Licensed under the MIT License.
Copyright 2022 SigNoz


2022-09-20T14:27:03.867Z	WARN	query-service/main.go:61	No JWT secret key is specified.
main.main
	/go/src/github.com/signoz/signoz/pkg/query-service/main.go:61
runtime.main
	/usr/local/go/src/runtime/proc.go:255
2022-09-20T14:27:04.532Z	INFO	app/server.go:84	Using ClickHouse as datastore ...
ts=2022-09-20T14:27:04.540122131Z caller=log.go:168 level=info msg="Loading configuration file" filename=/root/config/prometheus.yml
2022-09-20T14:27:04.543Z	INFO	alertManager/notifier.go:94	Starting notifier with alert manager:[http://signoz-release-alertmanager:9093/api/]
2022-09-20T14:27:04.543Z	INFO	app/server.go:396	rules manager is ready
2022-09-20T14:27:04.551Z	DEBUG	rules/apiParams.go:83	postable rule(parsed):%!(EXTRA *rules.PostableRule=&{index cpu utilisation  threshold_rule 300000000000 0 {"compositeMetricQuery":{"builderQueries":{"A":{"queryName":"A","metricName":"container_cpu_utilization","tagFilters":{"op":"AND","items":[{"key":"k8s_namespace_name","value":["orange"],"op":"LIKE"}]},"aggregateOperator":5,"expression":"A","disabled":false}},"promQueries":{"A":{"query":"k","disabled":false}},"panelType":0,"queryType":1},"op":"1","target":0.001,"matchType":"1"} map[severity:warning] map[description:A new alert] false https://observability-dash-14e0a46b923382883464f0a5c53159a8.fnpaas.com/alerts/edit?ruleId=1 []  })
2022-09-20T14:27:04.551Z	DEBUG	rules/apiParams.go:124	postable rule:%!(EXTRA *rules.PostableRule=&{index cpu utilisation  threshold_rule 300000000000 60000000000 {"compositeMetricQuery":{"builderQueries":{"A":{"queryName":"A","metricName":"container_cpu_utilization","tagFilters":{"op":"AND","items":[{"key":"k8s_namespace_name","value":["orange"],"op":"LIKE"}]},"aggregateOperator":5,"expression":"A","disabled":false}},"promQueries":{"A":{"query":"k","disabled":false}},"panelType":0,"queryType":1},"op":"1","target":0.001,"matchType":"1"} map[severity:warning] map[description:A new alert] false https://observability-dash-14e0a46b923382883464f0a5c53159a8.fnpaas.com/alerts/edit?ruleId=1 []  }, string=	 condition, string={"compositeMetricQuery":{"builderQueries":{"A":{"queryName":"A","metricName":"container_cpu_utilization","tagFilters":{"op":"AND","items":[{"key":"k8s_namespace_name","value":["orange"],"op":"LIKE"}]},"aggregateOperator":5,"expression":"A","disabled":false}},"promQueries":{"A":{"query":"k","disabled":false}},"panelType":0,"queryType":1},"op":"1","target":0.001,"matchType":"1"})
2022-09-20T14:27:04.551Z	DEBUG	rules/manager.go:344	msg:%!(EXTRA string=adding a new rule task, string=	 task name:, string=1-groupname)
2022-09-20T14:27:04.552Z	INFO	rules/thresholdRule.go:91	msg:creating new alerting rule	 name:index cpu utilisation	 condition:{"compositeMetricQuery":{"builderQueries":{"A":{"queryName":"A","metricName":"container_cpu_utilization","tagFilters":{"op":"AND","items":[{"key":"k8s_namespace_name","value":["orange"],"op":"LIKE"}]},"aggregateOperator":5,"expression":"A","disabled":false}},"promQueries":{"A":{"query":"k","disabled":false}},"panelType":0,"queryType":1},"op":"1","target":0.001,"matchType":"1"}	 generatorURL:https://observability-dash-14e0a46b923382883464f0a5c53159a8.fnpaas.com/alerts/edit?ruleId=1
2022-09-20T14:27:04.552Z	INFO	rules/ruleTask.go:44	msg:initiating a new rule task	 name:1-groupname	 frequency:1m0s
2022-09-20T14:27:04.552Z	INFO	app/server.go:273	Query server started listening on 0.0.0.0:8080...
starting private http
2022-09-20T14:27:04.552Z	INFO	app/server.go:286	Query server started listening on private port 0.0.0.0:8085...
2022-09-20T14:27:04.553Z	INFO	alertManager/notifier.go:126	msg: Initiating alert notifier...
2022-09-20T14:27:04.553Z	INFO	app/server.go:312	Starting HTTP server{port 11 8080  <nil>} {addr 15 0 0.0.0.0:8080 <nil>}
2022-09-20T14:27:04.553Z	INFO	app/server.go:324	Starting pprof server{addr 15 0 0.0.0.0:6060 <nil>}
2022-09-20T14:27:04.553Z	INFO	app/server.go:338	Starting Private HTTP server{port 11 8085  <nil>} {addr 15 0 0.0.0.0:8085 <nil>}
2022-09-20T14:27:04.553Z	DEBUG	rules/ruleTask.go:93	group:%!(EXTRA string=1-groupname, string=	 group run to begin at: , time.Time=2022-09-20 14:27:23.708306749 +0000 UTC)
ts=2022-09-20T14:27:04.556631051Z caller=log.go:168 level=info msg="Completed loading of configuration file" filename=/root/config/prometheus.yml
2022-09-20T14:27:04.703Z	INFO	app/server.go:189	/api/v1/version	timeTaken: 17.861µs
2022-09-20T14:27:04.726Z	INFO	app/server.go:189	/api/v1/version	timeTaken: 16.56µs
2022-09-20T14:27:14.694Z	INFO	app/server.go:189	/api/v1/version	timeTaken: 16.661µs
2022-09-20T14:27:14.714Z	INFO	app/server.go:189	/api/v1/version	timeTaken: 15.18µs
2022-09-20T14:27:23.714Z	DEBUG	rules/ruleTask.go:296	msg:%!(EXTRA string=rule task eval started, string=	 name:, string=1-groupname, string=	 start time:, time.Time=2022-09-20 14:27:23.708306749 +0000 UTC)
2022-09-20T14:27:23.714Z	DEBUG	rules/thresholdRule.go:515	ruleid:%!(EXTRA string=1, string=	 runQueries:, map[string]string=map[A:SELECT  toStartOfInterval(toDateTime(intDiv(timestamp_ms, 1000)), INTERVAL 30 SECOND) as ts, avg(value) as value FROM signoz_metrics.samples_v2 INNER JOIN (SELECT  fingerprint FROM signoz_metrics.time_series_v2 WHERE metric_name = 'container_cpu_utilization' AND like(labels_object.k8s_namespace_name, 'orange')) as filtered_time_series USING fingerprint WHERE metric_name = 'container_cpu_utilization' AND timestamp_ms >= 1663683743708 AND timestamp_ms <= 1663684043708 GROUP BY ts ORDER BY  ts])
2022-09-20T14:27:23.714Z	DEBUG	rules/thresholdRule.go:533	ruleId: %!(EXTRA string=1, string=	 result query label:, string=A)
2022-09-20T14:27:23.804Z	DEBUG	rules/thresholdRule.go:488	ruleid:%!(EXTRA string=1, string=	 resultmap(potential alerts):, int=1)
2022-09-20T14:27:23.804Z	DEBUG	rules/thresholdRule.go:497	ruleid:%!(EXTRA string=1, string=	 result (found alerts):, int=1)
2022-09-20T14:27:23.804Z	INFO	rules/thresholdRule.go:630	rule:index cpu utilisation	 alerts found: 1
2022-09-20T14:27:23.804Z	INFO	rules/thresholdRule.go:291	msg:sending alerts	 rule:index cpu utilisation
2022-09-20T14:27:24.694Z	INFO	app/server.go:189	/api/v1/version	timeTaken: 24.41µs
2022-09-20T14:27:24.695Z	INFO	app/server.go:189	/api/v1/version	timeTaken: 9.95µs

nitya-signoz

09/20/2022, 3:47 PM

I can see OOMKilled as the reason for termination, so we will have to increase the limits. cc @Ankit Nayan @Srikanth Chekuri
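
One way to raise the limit directly on the running workload is `kubectl set resources`. The sketch below assumes the namespace, StatefulSet, and container names shown in the describe output; the helper name is hypothetical, and the value is left as a parameter because the thread has not settled on a number at this point. Since the release is Helm-managed (DEPLOYMENT_TYPE is kubernetes-helm), a later `helm upgrade` would likely reset a change made this way, so a lasting change probably belongs in the chart values, which are not shown in this thread.

```shell
NAMESPACE=platform
STATEFULSET=signoz-release-query-service
CONTAINER=signoz-release-query-service

# Set a new memory limit on the query-service container only.
# $1 is the new limit, for example 2Gi or 4Gi.
set_memory_limit() {
  kubectl -n "$NAMESPACE" set resources "statefulset/$STATEFULSET" \
    --containers="$CONTAINER" --limits="memory=$1"
}

# Example call (not run here):
# set_memory_limit 4Gi
```

Changing the pod template triggers a rolling restart of the StatefulSet pod, so the new limit applies from the next start.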

Srikanth Chekuri

09/20/2022, 4:59 PM

@sudhanshu dev Can you share the output of this query from ClickHouse, so we can give a rough estimate of how much RAM the query service needs to not crash?

select count() from signoz_metrics.time_series_v2;
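
If a ClickHouse client shell is not at hand, the same count can be fetched over the ClickHouse HTTP interface. This is only a sketch: it assumes `signoz-release-clickhouse` is a Service exposing port 8123 (the name and port come from the init container's ping check in the describe output), and `CLICKHOUSE_USER` and `CLICKHOUSE_PASSWORD` are hypothetical environment variables that you would set to your own credentials.

```shell
# Terminal 1: forward the ClickHouse HTTP port to localhost (not run here).
# kubectl -n platform port-forward svc/signoz-release-clickhouse 8123:8123

# Terminal 2: send the query as the POST body; the result comes back as plain text.
# CLICKHOUSE_USER and CLICKHOUSE_PASSWORD are hypothetical variables, set them first.
count_time_series() {
  curl -sS --user "${CLICKHOUSE_USER}:${CLICKHOUSE_PASSWORD}" \
    "http://localhost:8123/" \
    --data-binary "SELECT count() FROM signoz_metrics.time_series_v2"
}

# Example call (not run here):
# count_time_series
```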

sudhanshu dev

09/21/2022, 4:30 AM

Got it

sudhanshu dev

09/21/2022, 5:04 AM

@Srikanth Chekuri Here is the query output

sudhanshu dev

09/21/2022, 5:04 AM

SELECT count() FROM signoz_metrics.time_series_v2

Query id: a268c555-d984-4e25-9fc6-86856e661876

┌─count()─┐
│  309883 │
└─────────┘

1 rows in set. Elapsed: 0.006 sec.

sudhanshu dev

09/21/2022, 5:04 AM

Please suggest a value for the RAM limit

Ankit Nayan

09/21/2022, 5:06 AM

@sudhanshu dev there is some inefficiency in loading timeseries right now. We should be fixing this within 3-4 weeks. Right now we are trying a temp fix and estimate.

sudhanshu dev

09/21/2022, 5:07 AM

got it

Ankit Nayan

09/21/2022, 5:07 AM

Is it possible to run the query service without resource limits for a few minutes (15-30 minutes should be enough)? We want to collect pprof data

Ankit Nayan

09/21/2022, 5:07 AM

and then we could provide a better fix sooner

sudhanshu dev

09/21/2022, 5:07 AM

got it

Ankit Nayan

09/21/2022, 5:10 AM

Also, I see you are running 0.11.0. We have raised a fix to reduce the memory usage of query-service in v0.11.1. Can you give it a try?

sudhanshu dev

09/21/2022, 5:10 AM

Sure

Ankit Nayan

09/21/2022, 5:10 AM

Let us know if the query-service does not run within 4GB of memory

sudhanshu dev

09/21/2022, 5:10 AM

will also do that

sudhanshu dev

09/21/2022, 5:17 AM

I removed the limits from query service statefulset

sudhanshu dev

09/21/2022, 5:17 AM

and now monitoring
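
For anyone repeating this step, a sketch of removing the limits and then watching memory. The kubectl documentation gives `--limits=cpu=0,memory=0` as the way to clear limits with `kubectl set resources`; the resource names below come from the describe output earlier in the thread, the function names are hypothetical, and `kubectl top` assumes metrics-server is installed in the cluster.

```shell
# Clear the CPU and memory limits on the query-service container;
# the requests (200m CPU, 300Mi memory) are left as they are.
remove_limits() {
  kubectl -n platform set resources statefulset/signoz-release-query-service \
    --containers=signoz-release-query-service --limits=cpu=0,memory=0
}

# Show current CPU and memory usage per container in the pod.
# Requires metrics-server (an assumption about this cluster).
watch_memory() {
  kubectl -n platform top pod signoz-release-query-service-0 --containers
}

# Example calls (not run here):
# remove_limits
# watch_memory
```

Without a memory limit the container is bounded only by what the node can spare, so it is worth keeping an eye on node memory while the profile is collected.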

Ankit Nayan

09/21/2022, 5:18 AM

thanks

Ankit Nayan

09/21/2022, 5:18 AM

@Srikanth Chekuri @Prashant Shahi can you share instructions to capture CPU and memory profiles under high usage?

sudhanshu dev

09/21/2022, 5:20 AM

Yes it would help us

sudhanshu dev

09/21/2022, 5:20 AM

To do capacity planning

Prashant Shahi

09/21/2022, 5:25 AM

@Ankit Nayan @sudhanshu dev Port-forward the pprof port of the query-service container:

kubectl -n platform port-forward pod/my-release-signoz-query-service-0 6060:6060

In another terminal, run the following to obtain pprof data:

• CPU Profile

curl "http://localhost:6060/debug/pprof/profile?seconds=30" -o query-service.pprof -v

• Heap Profile

curl "http://localhost:6060/debug/pprof/heap" -o query-service-heap.pprof -v
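
Two notes for anyone following these steps, offered as a sketch rather than official instructions. First, the pod in this thread's describe output is named `signoz-release-query-service-0`, so the port-forward target needs to match your own release name. Second, the downloaded files can be inspected locally with `go tool pprof`, assuming a Go toolchain is installed; the function names below are hypothetical.

```shell
# Pod name as shown in the describe output earlier in this thread.
POD=signoz-release-query-service-0

# Forward the pprof port of the query-service pod to localhost:6060.
forward_pprof() {
  kubectl -n platform port-forward "pod/$POD" 6060:6060
}

# Print the heaviest entries of a saved profile as text.
# $1 is a file written by the curl commands above.
top_of_profile() {
  go tool pprof -top "$1"
}

# Example calls (not run here):
# forward_pprof
# top_of_profile query-service-heap.pprof
# go tool pprof -http=localhost:8081 query-service.pprof   # browser UI
```

The text output of `-top` is usually the easiest form to paste back into a support thread.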

sudhanshu dev

09/21/2022, 5:26 AM

Got it
