# support
a
cc: @Amol Umbark
a
Hi @Alejandro Decchi, are you able to consistently reproduce issue #1? If so, please share the exact steps and the logs of query service and alert manager, and also check whether the alert shows up in triggered notifications.
I will try to reproduce the second issue (the test alert) and get back to you.
a
@Amol Umbark thanks for your reply. @Pranay created the issue https://github.com/SigNoz/signoz/issues/1986 and I will try to add more details to it.
I updated the issue
a
@Alejandro Decchi there was an issue in 0.11 where the channels were not getting into alert manager. In your alert manager log I don't see any channels.
Let me check whether that issue still exists in 0.11.4. Can you check your channel configuration and test it?
🙌 1
a
I have many channels added in SigNoz.
Sometimes, to resolve the issue, I have to recreate the channel.
I am using
docker.io/signoz/alertmanager:0.23.0-0.2
a
@Alejandro Decchi I have posted an update in the issue. Please take a look: https://github.com/SigNoz/signoz/issues/1986
I am unable to reproduce the issue, so I need further input from you.
a
@Amol Umbark thank you for your feedback. I will review the GitHub issue and add more details.
I updated the ticket 🙂
a
Hi @Alejandro Decchi, I looked at your response. The wget result is unexpected. Would you be able to get on a call to resolve this? I am in the IST time zone. Please share a suitable time for a huddle.
@Alejandro Decchi Also, is it possible to upgrade to v12 and try?
@Prashant Shahi will it be safe to delete the query-service pod to re-create it? I suspect the new deployment did not re-create the pod and an older version of query-service is still active.
p
@Amol Umbark @Alejandro Decchi yes, I have not seen any issues with restarting query-service pods.
a
@Alejandro Decchi can you please try deleting the query service pod and capturing the log after it starts? Please share the log as well.
a
@Amol Umbark I will try deleting it and I will share the logs.
Here is part of the log output:
2023-01-18T16:20:30.226Z	INFO	version/version.go:43	

SigNoz version   : v0.11.4
Commit SHA-1     : 8e55228
Commit timestamp : 2022-11-29T11:43:47Z
Branch           : HEAD
Go version       : go1.17.13

For SigNoz Official Documentation,  visit https://signoz.io/docs
For SigNoz Community Slack,         visit http://signoz.io/slack
For discussions about SigNoz,       visit https://community.signoz.io

Check SigNoz Github repo for license details.
Copyright 2022 SigNoz
2023-01-18T16:20:30.227Z	WARN	query-service/main.go:61	No JWT secret key is specified.
main.main
	/go/src/github.com/signoz/signoz/ee/query-service/main.go:61
runtime.main
	/usr/local/go/src/runtime/proc.go:255
2023-01-18T16:20:30.452Z	INFO	license/manager.go:124	No active license found, defaulting to basic plan
2023-01-18T16:20:30.452Z	INFO	app/server.go:100	Using ClickHouse as datastore ...
ts=2023-01-18T16:20:30.460035085Z caller=log.go:168 level=info msg="Loading configuration file" filename=/root/config/prometheus.yml
ts=2023-01-18T16:20:30.462181835Z caller=log.go:168 level=info msg="Completed loading of configuration file" filename=/root/config/prometheus.yml
2023-01-18T16:20:30.466Z	INFO	alertManager/notifier.go:94	Starting notifier with alert manager:[http://signoz-alertmanager:9093/api/]
2023-01-18T16:20:30.466Z	INFO	app/server.go:428	rules manager is ready
2023-01-18T16:20:30.468Z	DEBUG	rules/apiParams.go:85	postable rule(parsed):%!(EXTRA *rules.PostableRule=&{testing-error-rate   promql_rule 300000000000 0 {"compositeMetricQuery":{"builderQueries":{"A":{"queryName":"A","metricName":"","tagFilters":{"op":"AND","items":[]},"aggregateOperator":1,"expression":"A","disabled":false}},"promQueries":{"A":{"query":"(max(sum(rate(signoz_calls_total{service_name=\"api-ktor-template\", operation=~`HTTP GET|HTTP POST`, status_code=\"STATUS_CODE_ERROR\"}[5m]) OR rate(signoz_calls_total{service_name=\"api-ktor-template\", operation=~`HTTP GET|HTTP POST`, http_status_code=~\"5..\"}[5m]))*100/sum(rate(signoz_calls_total{service_name=\"api-ktor-template\", operation=~`HTTP GET|HTTP POST`}[5m]))) \u003c 1000 OR vector(0))","disabled":false}},"panelType":0,"queryType":3},"op":"1","target":5,"matchType":"1"} map[severity:critical] map[description:A new alert] false <https://signoz.stg.travelx.it/alerts/edit?ruleId=2> [testing-alarms]  })
2023-01-18T16:20:30.468Z	DEBUG	rules/apiParams.go:126	postable rule:%!(EXTRA *rules.PostableRule=&{testing-error-rate   promql_rule 300000000000 60000000000 {"compositeMetricQuery":{"builderQueries":{"A":{"queryName":"A","metricName":"","tagFilters":{"op":"AND","items":[]},"aggregateOperator":1,"expression":"A","disabled":false}},"promQueries":{"A":{"query":"(max(sum(rate(signoz_calls_total{service_name=\"api-ktor-template\", operation=~`HTTP GET|HTTP POST`, status_code=\"STATUS_CODE_ERROR\"}[5m]) OR rate(signoz_calls_total{service_name=\"api-ktor-template\", operation=~`HTTP GET|HTTP POST`, http_status_code=~\"5..\"}[5m]))*100/sum(rate(signoz_calls_total{service_name=\"api-ktor-template\", operation=~`HTTP GET|HTTP POST`}[5m]))) \u003c 1000 OR vector(0))","disabled":false}},"panelType":0,"queryType":3},"op":"1","target":5,"matchType":"1"} map[severity:critical] map[description:A new alert] false <https://signoz.stg.travelx.it/alerts/edit?ruleId=2> [testing-alarms]  }, string=	 condition, string={"compositeMetricQuery":{"builderQueries":{"A":{"queryName":"A","metricName":"","tagFilters":{"op":"AND","items":[]},"aggregateOperator":1,"expression":"A","disabled":false}},"promQueries":{"A":{"query":"(max(sum(rate(signoz_calls_total{service_name=\"api-ktor-template\", operation=~`HTTP GET|HTTP POST`, status_code=\"STATUS_CODE_ERROR\"}[5m]) OR rate(signoz_calls_total{service_name=\"api-ktor-template\", operation=~`HTTP GET|HTTP POST`, http_status_code=~\"5..\"}[5m]))*100/sum(rate(signoz_calls_total{service_name=\"api-ktor-template\", operation=~`HTTP GET|HTTP POST`}[5m]))) \u003c 1000 OR vector(0))","disabled":false}},"panelType":0,"queryType":3},"op":"1","target":5,"matchType":"1"})
2023-01-18T16:20:30.468Z	DEBUG	rules/manager.go:345	msg:%!(EXTRA string=adding a new rule task, string=	 task name:, string=2-groupname)
2023-01-18T16:20:30.468Z	INFO	rules/promRule.go:94	msg:creating new alerting rule	 name:testing-error-rate	 condition:{"compositeMetricQuery":{"builderQueries":{"A":{"queryName":"A","metricName":"","tagFilters":{"op":"AND","items":[]},"aggregateOperator":1,"expression":"A","disabled":false}},"promQueries":{"A":{"query":"(max(sum(rate(signoz_calls_total{service_name=\"api-ktor-template\", operation=~`HTTP GET|HTTP POST`, status_code=\"STATUS_CODE_ERROR\"}[5m]) OR rate(signoz_calls_total{service_name=\"api-ktor-template\", operation=~`HTTP GET|HTTP POST`, http_status_code=~\"5..\"}[5m]))*100/sum(rate(signoz_calls_total{service_name=\"api-ktor-template\", operation=~`HTTP GET|HTTP POST`}[5m]))) \u003c 1000 OR vector(0))","disabled":false}},"panelType":0,"queryType":3},"op":"1","target":5,"matchType":"1"}	 query:(max(sum(rate(signoz_calls_total{service_name="api-ktor-template", operation=~`HTTP GET|HTTP POST`, status_code="STATUS_CODE_ERROR"}[5m]) OR rate(signoz_calls_total{service_name="api-ktor-template", operation=~`HTTP GET|HTTP POST`, http_status_code=~"5.."}[5m]))*100/sum(rate(signoz_calls_total{service_name="api-ktor-template", operation=~`HTTP GET|HTTP POST`}[5m]))) < 1000 OR vector(0)) > 5.000000
2023-01-18T16:20:30.468Z	INFO	rules/promRuleTask.go:42	Initiating a new rule group:2-groupname	 frequency:1m0s
👀 1
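Since the suspicion earlier in the thread was that an older query-service might still be running, one quick check is to read the version out of the startup banner in the captured log. The following is only a rough sketch: `parse_banner` and `running_expected_version` are hypothetical helper names, and the banner layout is assumed to match the excerpt pasted above.

```python
import re

# Excerpt of the startup banner, copied from the log above.
BANNER = """\
SigNoz version   : v0.11.4
Commit SHA-1     : 8e55228
Commit timestamp : 2022-11-29T11:43:47Z
Branch           : HEAD
Go version       : go1.17.13
"""


def parse_banner(log_text):
    """Collect 'key : value' pairs from the startup banner (hypothetical helper)."""
    fields = {}
    for line in log_text.splitlines():
        # The key starts with a letter; the separator is a colon surrounded by whitespace,
        # so timestamps such as 11:43:47 inside the value are left intact.
        match = re.match(r"^\s*([A-Za-z][A-Za-z0-9 \-]*?)\s+:\s+(\S.*)$", line)
        if match:
            fields[match.group(1)] = match.group(2).strip()
    return fields


def running_expected_version(log_text, expected):
    """Return True if the banner reports the expected SigNoz version."""
    return parse_banner(log_text).get("SigNoz version") == expected


print(parse_banner(BANNER)["SigNoz version"])
print(running_expected_version(BANNER, "v0.11.4"))
```

In this case the banner reports v0.11.4, so the recreated pod does appear to be on the intended version.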
a
@Alejandro Decchi thanks for sharing the log. Did recreating the pod fix the issue?
If not, is there a good time when we can get on a huddle? I am in IST.
a
It did not work. I am in GMT-3. When are you available?
Any feedback?
a
Hey, can you try a fresh install of 0.11.4 in a separate environment? I suspect something from earlier versions is still running and causing the issue.
I will confirm whether I can connect tomorrow (Friday) at 7pm IST (same time as you sent your last message).
a
@Amol Umbark this issue happened in 2 different environments. If you want, I am available next Monday 30 at 7pm (GMT-3).
a
Hey, sorry, I was on leave. Can we connect today?
Please ping me when you are online.
By the way, I am in the IST timezone (GMT+5:30), so 7pm (GMT-3) is 3:30am here. Can you take a look at this timeline and share a time that suits you?
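For anyone double-checking the time difference, here is a small sketch of the conversion using fixed UTC offsets. It assumes "Monday 30" means 30 January 2023, which is inferred from the log dates and not stated in the thread; `to_ist` is a hypothetical helper name.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets as mentioned in the thread (no daylight saving handling).
GMT_MINUS_3 = timezone(timedelta(hours=-3))
IST = timezone(timedelta(hours=5, minutes=30))


def to_ist(moment):
    """Convert a timezone-aware datetime to IST (hypothetical helper)."""
    return moment.astimezone(IST)


# Assumption: the proposed slot is Monday 30 January 2023 at 7pm GMT-3.
proposed = datetime(2023, 1, 30, 19, 0, tzinfo=GMT_MINUS_3)
in_ist = to_ist(proposed)

print(proposed.isoformat())
print(in_ist.isoformat())
```

The two zones are 8 hours 30 minutes apart, so a 7pm slot in GMT-3 falls at 3:30am the next day in IST.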
I am available for the next couple of hours. If we don't get to connect, please book a meeting from here: https://calendly.com/amol-umbark/30min?month=2023-02
@Alejandro Decchi I am able to reproduce the error in one of the Helm installations. I will have more updates tomorrow.
🙌 1
a
@Amol Umbark perfect! It's great that you were able to reproduce this random error. I'll keep waiting for your update.
a
@Prashant Shahi has resolved the issue. We will be publishing a PR soon. If possible, we will suggest a point fix so you can apply it in your env and carry on.
a
@Amol Umbark can you share the image/tag to deploy in my environment?
a
@Alejandro Decchi the fix is in the Helm chart, so you would have to update the Helm chart and reinstall. @Prashant Shahi can you please post here once the chart is updated?
p
@Alejandro Decchi The fix is merged and out. Follow our docs to upgrade to the latest chart release: https://signoz.io/docs/operate/kubernetes/#upgrade-signoz-cluster Be sure to include
-f override-values.yaml
if you passed custom values during installation.
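As a rough illustration of the upgrade step, the sketch below assembles a `helm upgrade` command and only appends `-f` when a values file was used at install time. The release name `my-release`, namespace `platform`, and chart reference `signoz/signoz` are placeholder assumptions, not values confirmed in this thread, and `helm_upgrade_command` is a hypothetical helper; check the linked docs for the exact command for your setup.

```python
import shlex


def helm_upgrade_command(release, namespace, values_file=None, chart="signoz/signoz"):
    """Build the argument list for a Helm chart upgrade (hypothetical helper)."""
    cmd = ["helm", "--namespace", namespace, "upgrade", release, chart]
    if values_file:
        # Re-apply the custom values that were passed during installation.
        cmd += ["-f", values_file]
    return cmd


# Placeholder release and namespace names; replace with your own.
cmd = helm_upgrade_command("my-release", "platform", "override-values.yaml")
print(shlex.join(cmd))
```

The printed line can be reviewed and pasted into a shell, or the list can be handed to `subprocess.run(cmd, check=True)` on a machine where Helm is installed.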
a
@Amol Umbark @Prashant Shahi so it is fixed in chart version 0.10.2, which was released 2 hours ago?
p
@Alejandro Decchi yes, it is
a
Perfect, I will try it in Dev.
p
@Alejandro Decchi Okay, do let us know if the issue persists or you face any issues upgrading.
a
thanks @Prashant Shahi
p
happy to help 🙂