# support
v
Hi Team, I've installed SigNoz via the Helm chart. Initially there was a problem with the alertmanager pod because it was not able to get its private IP. I added the pod IP to extraArgs and the alertmanager pod started running again. Now the problem is that when I'm trying to create an alert, I'm not able to see any metrics graph. It says "no data to display". Any help on this? For system metrics the data is coming, but for CloudWatch metrics it's not. While creating a dashboard, I am able to build panels on CloudWatch metrics.
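(For context, the alertmanager workaround described above would look roughly like this in the Helm values. This is only a sketch: the extraArgs key shape and the cluster.advertise-address flag are assumptions based on standard Alertmanager options, and POD_IP must be exposed to the container via the Kubernetes downward API.)

alertmanager:
  extraArgs:
    # Assumed flag: advertise the pod's private IP explicitly so the
    # cluster listener does not fail to detect it.
    cluster.advertise-address: "$(POD_IP):9094"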
a
This should not be the case. When creating alerts on CloudWatch metrics, can you share the request payload and check whether the request shows as failing in the network tab?
cc: @Amol Umbark
v
Calls are not failing; they return 200 and the payload also looks fine.
[Screenshots attached: Screenshot from 2023-01-19 10-08-20.png, Screenshot from 2023-01-19 10-08-08.png]
a
Can you share the JSON payload instead of the screenshot?
v
{
  "start": 1674102564000,
  "end": 1674102864000,
  "step": 60,
  "variables": {},
  "dataSource": 1,
  "compositeMetricQuery": {
    "queryType": 1,
    "panelType": 1,
    "builderQueries": {
      "A": {
        "queryName": "A",
        "aggregateOperator": 5,
        "metricName": "aws_rds_cpuutilization_average",
        "tagFilters": {
          "op": "AND",
          "items": []
        },
        "groupBy": [
          "dbinstance_identifier"
        ],
        "expression": "A",
        "disabled": false
      }
    }
  }
}
a
I will let @Amol Umbark dive deeper into this
a
@Vinayak Singh Do you see an "alerts found" message with count > 0 when you click the test alert button? Is this a problem only with the chart?
Can you please share a screenshot or the payload for the dashboard as well? Try creating a new panel and entering the same filters; in the network tab you will see a call to the query range API. Please share that payload.
v
No. When I click on test notification, it says: "No alerts found on evaluation. This happens when rule condition is unsatisfied."
This is the request json {"start":1674102976000,"end":1674104776000,"step":60,"variables":{"SIGNOZ_START_TIME":1674102976000,"SIGNOZ_END_TIME":1674104776000},"dataSource":1,"compositeMetricQuery":{"queryType":1,"panelType":1,"builderQueries":{"A":{"queryName":"A","aggregateOperator":5,"metricName":"aws_rds_database_connections_maximum","tagFilters":{"items":[],"op":"AND"},"groupBy":[],"expression":"A","disabled":false}}}}
This is the response {"status":"success","data":{"resultType":"matrix","result":[{"queryName":"A","metric":{},"values":[[1674103080,"4.111111111111111"],[1674103140,"3.3333333333333335"],[1674103200,"3.4444444444444446"],[1674103260,"3.3333333333333335"],[1674103320,"3.4444444444444446"],[1674103380,"4"],[1674103440,"3.3333333333333335"],[1674103500,"3.6666666666666665"],[1674103560,"3.5555555555555554"],[1674103620,"3.4444444444444446"],[1674103680,"3.4444444444444446"],[1674103740,"3.3333333333333335"],[1674103800,"3.5555555555555554"],[1674103860,"3.5555555555555554"],[1674103920,"3.7777777777777777"],[1674103980,"3.3333333333333335"],[1674104040,"3.3333333333333335"],[1674104100,"3.4444444444444446"]]}]}}
One more thing: for system metrics it's working fine, but for CloudWatch metrics it's not working.
a
yeah, got it. Thanks. Let me review and get back.
Can you please create the exact same query? The metric is different, and this could also be the cause of the no-data. Notice the metric name in these two queries; it is different. JSON from Alerts:
{"start":1674102976000,"end":1674104776000,"step":60,"variables":{"SIGNOZ_START_TIME":1674102976000,"SIGNOZ_END_TIME":1674104776000},"dataSource":1,"compositeMetricQuery":{"queryType":1,"panelType":1,"builderQueries":{"A":{"queryName":"A","aggregateOperator":5,"metricName":"*aws_rds_database_connections_maximum*","tagFilters":{"items":[],"op":"AND"},"groupBy":[],"expression":"A","disabled":false}}}}
JSON from Dashboard:
{"start":1674102564000,"end":1674102864000,"step":60,"variables":{},"dataSource":1,"compositeMetricQuery":{"queryType":1,"panelType":1,"builderQueries":{"A":{"queryName":"A","aggregateOperator":5,"metricName":"*aws_rds_cpuutilization_average*","tagFilters":{"op":"AND","items":[]},"groupBy":["dbinstance_identifier"],"expression":"A","disabled":false}}}}
v
json from dashboard {"start":1674104601000,"end":1674106401000,"step":60,"variables":{"SIGNOZ_START_TIME":1674104601000,"SIGNOZ_END_TIME":1674106401000},"dataSource":1,"compositeMetricQuery":{"queryType":1,"panelType":1,"builderQueries":{"A":{"queryName":"A","aggregateOperator":5,"metricName":"aws_rds_database_connections_maximum","tagFilters":{"items":[],"op":"AND"},"groupBy":[],"expression":"A","disabled":false}}}}
@Amol Umbark any luck?
a
Hey @Vinayak Singh, can we get into a quick huddle?
v
sure
a
sent you a request
a
Is this fixed now?
v
Yes @Ankit Nayan, that problem is fixed. It was because of the delay in incoming metrics. Right now I'm pulling CloudWatch metrics from the Prometheus exporter and there is a delay of 10-12 min, which is why the default 5 min queries are not working.
a
oh..ok
v
Actually, I'm using the Prometheus exporter for exporting CloudWatch metrics, but in the otel-collector-metrics logs I'm getting "failed to scrape prometheus endpoint". Also, sometimes it's working and sometimes it's not. Any help?
This is what I've added in the values.yml:
- job_name: "CW-exporter"
              scrape_interval: 60s
              static_configs:
                - targets: ["prometheus-cloudwatch-exporter.platform.svc.cluster.local:9106"]
When I curl the prometheus-cloudwatch-exporter, I can see the CloudWatch metrics.
@Prashant Shahi Any idea, or is there something I'm not doing correctly?
One more thing: how do I enable a Prometheus receiver in the SigNoz Helm chart?
@Ankit Nayan can you help?
p
> sometimes it's working and sometimes it's not
Have you verified that the metrics are scraped and accessible from a dashboard in the SigNoz UI?
OtelCollector Metrics is the right place to include the new scrape job. You would not need a new Prometheus receiver or pipeline.
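(A sketch of how that scrape job might be added under the otel-collector-metrics section of the Helm values. The otelCollectorMetrics.config.receivers.prometheus nesting is an assumption and may differ by chart version, so compare it against the default values.yaml of your chart.)

otelCollectorMetrics:
  config:
    receivers:
      prometheus:
        config:
          scrape_configs:
            # keep the existing jobs and append the exporter endpoint as a new job
            - job_name: "CW-exporter"
              scrape_interval: 60s
              static_configs:
                - targets: ["prometheus-cloudwatch-exporter.platform.svc.cluster.local:9106"]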
@Vinayak Singh You can opt for scraping using the pod annotations signoz.io/scrape and signoz.io/port.
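(For the annotation route, a minimal sketch of what the exporter's pod metadata might look like; the port value is assumed to be the exporter's metrics port, 9106 here.)

metadata:
  annotations:
    # tell the SigNoz otel collector to scrape this pod, and on which port
    signoz.io/scrape: "true"
    signoz.io/port: "9106"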
v
Hi, I wanted to show CloudWatch metrics, so I configured the prometheus-cloudwatch-exporter. I can see all the CloudWatch metrics in Prometheus, but then I followed https://signoz.io/docs/userguide/send-metrics/#enable-a-prometheus-receiver and this Prometheus receiver is not able to scrape the metrics.
p
@Vinayak Singh can you make sure that the exporter endpoint is correct?
v
Yes, it's correct.
How do I add these details in values.yaml?
config:
  scrape_configs:
    - job_name: "CWexporter"
      scrape_interval: 120s
      static_configs:
        - targets: ["192.20.12.111:9106"]
Is this what we need to add in the Docker setup? What is the Kubernetes alternative for this?
It looks like something is wrong with the exporter, something related to the syntax. Sometimes it's working, sometimes it's not.
One more thing: can you guys let me know what this log means?
skipping send alert due to resend delay%!(EXTRA string=	 rule: , string=ES/Opensearch CPUUtilization, string=	 alert:, labels.Labels={alertname="ES/Opensearch CPUUtilization", domain_name="test-store-staging-new", ruleId="1", ruleSource="<https://monitoring.test.com/alerts/edit?ruleId=1>", severity="warning"})
What is this resend delay?
a
It means the alert has already been sent. It waits for some time (30s, I think) before resending it.
v
Okay, thanks. Can you also let me know whether we can integrate Opsgenie with SigNoz? @Amol Umbark @Prashant Shahi
p
@Vinayak Singh we do not have a direct integration with it yet. You will have to use a custom webhook to achieve it.
v
Can you share a few links?
p
> It looks like something is wrong with the exporter, something related to the syntax. Sometimes it's working, sometimes it's not.
If it works sometimes, it means the endpoint is correct. Not sure about the issue with the Otel exporter. @Srikanth Chekuri might be able to help with it.
v
okay thanks
s
> It looks like something is wrong with the exporter, something related to the syntax. Sometimes it's working, sometimes it's not.
It's not clear to me what syntax you are talking about here. What is working sometimes and not working?
v
This is what I'm using in the prometheus-cloudwatch-exporter:
config: |-
  # This is the default configuration for prometheus-cloudwatch-exporter
  region: us-east-1
  period_seconds: 240
  metrics:
  - aws_namespace: AWS/ES
    aws_metric_name: OpenSearchRequests
    aws_dimensions: [ClientId, DomainName]
    aws_statistics: [Sum]

  - aws_namespace: AWS/ES
    aws_metric_name: DeletedDocuments
    aws_dimensions: [ClientId, DomainName]
    aws_statistics: [Sum]

  - aws_namespace: AWS/ES
    aws_metric_name: FreeStorageSpace
    aws_dimensions: [ClientId, DomainName]
    aws_statistics: [Average]

  - aws_namespace: AWS/ES
    aws_metric_name: ClusterUsedSpace
    aws_dimensions: [ClientId, DomainName]
    aws_statistics: [Average]

  - aws_namespace: AWS/ES
    aws_metric_name: ElasticsearchRequests 
    aws_dimensions: [ClientId, DomainName]
    aws_statistics: [Sum]

  - aws_namespace: AWS/ES
    aws_metric_name: CPUUtilization
    aws_dimensions: [ClientId, DomainName]
    aws_statistics: [Average]

  - aws_namespace: AWS/ES
    aws_metric_name: SearchableDocuments
    aws_dimensions: [ClientId, DomainName]
    aws_statistics: [Average]
But when I'm trying to add more metrics, like RDS, it's not working. Also, is there any way I can flush the old data from the ClickHouse DB?
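(For reference, an AWS/RDS block following the same pattern as the AWS/ES entries above might look like the sketch below. The namespace, metric, and dimension names are standard CloudWatch ones and should produce metrics such as aws_rds_cpuutilization_average and aws_rds_database_connections_maximum seen earlier in this thread, but verify them against the AWS docs.)

  # Sketch only: add under the existing metrics list of the exporter config
  - aws_namespace: AWS/RDS
    aws_metric_name: CPUUtilization
    aws_dimensions: [DBInstanceIdentifier]
    aws_statistics: [Average]

  - aws_namespace: AWS/RDS
    aws_metric_name: DatabaseConnections
    aws_dimensions: [DBInstanceIdentifier]
    aws_statistics: [Maximum]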
s
I am not very familiar with the config of the cloudwatch exporter. You can set the TTL in the settings of the SigNoz application, or you can manually truncate the data once if you want to start fresh.