Hello everyone, I am already using self-hosted SigNoz for tracing. Now I want to collect metrics from my different backends, which run on GKE (Google Kubernetes Engine). My backends include Kafka, Redis, CockroachDB, and Solr, and I want CPU usage, memory usage, and health-check metrics for each of them. How can I do that? @Prashant Shahi
Hey @Parth prajapati! Please check out this documentation. It should help: https://signoz.io/docs/tutorial/kubernetes-infra-metrics/
Thanks @Chitransh Gupta for your response, but the documentation doesn't mention how to get metrics from a specific backend. For example, I want CPU and memory metrics for backends like "kafka", "solr", "CockroachDB", "redis", and "elastic search".
I am also looking for backend health-check metrics.
K8s-Infra helm chart collects both K8s cluster metrics as well as kubelet metrics. Kubelet metrics include cpu/memory metrics of all application pods running in the cluster.
Thanks @Prashant Shahi for your response. Now suppose I want to get these specific metrics from K8s. How can I do that? Do I have to specify them in the Helm chart's values.yaml file, or somewhere else?
• "k8s_container_restarts"
• "k8s_pod_memory_usage"
• "k8s_pod_cpu_utilization"
• "container_cpu_utilization"
• "container_memory_usage"
Can you look into this @Prashant Shahi @Chitransh Gupta
The above metrics should be auto-collected by the K8s-Infra Helm chart. You can plot charts using those metrics either by creating dashboards manually or by importing the K8s dashboards mentioned in the docs or from here.
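For reference, a minimal sketch of the K8s-Infra values that control this auto-collection. The key names under `presets` are assumptions based on the chart's default values.yaml and should be checked against the output of `helm show values` for the chart version in use. If they match, no per-metric configuration is needed: the kubelet preset would produce the pod and container CPU/memory metrics, and the cluster preset would produce `k8s_container_restarts`.

```yaml
# Sketch only: preset names are assumed from the K8s-Infra chart defaults;
# verify them against your chart version before applying.
presets:
  kubeletMetrics:
    enabled: true   # pod/container CPU and memory metrics (kubelet stats)
  clusterMetrics:
    enabled: true   # cluster-level metrics such as container restarts
```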
Thanks @Prashant Shahi, it works.
I just want to know where exactly this "httpcheck" config should go in my values.yaml file, because there are many receivers in it:
receivers:
  httpcheck:
    targets:
      - endpoint: http://example.com
        method: GET
      - endpoint: http://my-app.com
        method: GET
    collection_interval: 10s
...
...
service:
  ....
    metrics:
      receivers: [otlp, httpcheck]
      processors: [batch]
      exporters: [clickhousemetricswrite]
@Prashant Shahi
otelCollectorMetrics:
  config:
    receivers:
      httpcheck:
        ...
^ In case of OSS SigNoz, this is the method.
In case of SigNoz Cloud or K8s-Infra chart in general, you will use `otel-deployment`:
otelDeployment:
  config:
    receivers:
      httpcheck:
        ...
    # Collector pipelines are defined under service.pipelines
    service:
      pipelines:
        metrics/internal:
          receivers:
            - k8s_cluster
            - httpcheck
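A fuller sketch of the K8s-Infra override, combining the two `httpcheck` targets from the question above with the pipeline wiring. It assumes the collector config follows the standard OpenTelemetry Collector layout, where pipelines sit under `service.pipelines`. Helm replaces lists rather than merging them, so the `receivers` list has to repeat every receiver already present in the chart's defaults (only `k8s_cluster` is shown here, as in the reply above).

```yaml
# Sketch only: endpoints are the placeholder targets from the question.
otelDeployment:
  config:
    receivers:
      httpcheck:
        targets:
          - endpoint: http://example.com
            method: GET
          - endpoint: http://my-app.com
            method: GET
        collection_interval: 10s
    service:
      pipelines:
        metrics/internal:
          # Lists are replaced on override, so restate existing receivers.
          receivers:
            - k8s_cluster
            - httpcheck
```

Processors and exporters for the pipeline are left out on purpose, so the chart's defaults for those keys stay in effect.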
I have applied it successfully. Where in the SigNoz UI can I check and verify it?
@Prashant Shahi
You can create a dashboard/alert using the metric `httpcheck_status`.
Got it @Prashant Shahi, Thanks for your support.
Hello @Prashant Shahi, when I install SigNoz in a GKE cluster, I get this error in the "otel-collector" pod:
"error clickhousetracesexporter/writer.go:431 Could not write a batch of spans to tag/tagKey tables: {"kind": "exporter", "data_type": "traces", "name": "clickhousetraces", "error": "context deadline exceeded"}"
And I get this error in the "clickhouse cluster" pod:
{82223673-b61a-4939-85dc-241295578b24} <Error> TCPHandler: Code: 210. DB::NetException: I/O error: Broken pipe, while writing to socket (10.32.34.28:9000 -> 10.32.34.29:37986). (NETWORK_ERROR), Stack trace (when copying this message, always include the lines below)
I have gone through the documentation but have not found anything related to this issue. Can you please look into it?
Are you using anything different from the default installation? Like custom image tags?
@Vishal Sharma can you please look into this?
Yes, I am using the default installation. I have only disabled logs and changed the CPU and memory resources.
I am getting these logs in the otelCollector pod:
"{"level":"warn","ts":1721910684.4347334,"caller":"clickhousemetricsexporter/exporter.go:279","msg":"Dropped exponential histogram metric with no data points","kind":"exporter","data_type":"metrics","name":"clickhousemetricswrite","name":"signoz_latency"}"
And I am also getting these error logs in the clickhouse pod:
"2024.07.25 12:47:03.101268 [ 801 ] {1080254f-c111-4bd7-a761-b700ef378677} <Error> TCPHandler: Code: 210. DB::NetException: I/O error: Broken pipe, while writing to socket (10.132.22.5:9000 -> 10.132.22.22:60528). (NETWORK_ERROR), Stack trace (when copying this message, always include the lines below):"
@Prashant Shahi @Vishal Sharma Can you look into this and tell me what the issue is, and whether my data is being dropped?