Anyone know what info to apply in the alert Query ...
# support
s
Anyone know what info to apply in the alert Query Builder (Metrics) to trigger an alert whenever your app sends an Exception to Signoz?
p
@Shane Snediker I don't think it is supported as of now, but we will soon ship the ability to create alerts on ClickHouse queries. That would enable you to write ClickHouse queries on exceptions data and trigger alerts based on them. Check out this PR - https://github.com/SigNoz/signoz/pull/1706
cc @Amol Umbark
s
Ok, thank you @Pranay. I appreciate your support!
@Pranay This is a fantastic feature to add to the Signoz platform. Any idea of when this feature will ship or how we can know when it ships?
p
Thanks @Shane Snediker. We are planning to ship the first iteration of it by this coming Monday. You can track this PR being merged and included in a release. As of now we are targeting v0.11.4, which is the next release version.
s
AWESOME. Thank you!
p
Would be great to receive feedback if this feature is able to help you in your usecase, so please do share any feedback when you try it
s
I will absolutely provide feedback as soon as the feature is deployed and we are able to implement it! Btw, do you anticipate the docs having an outline of how to design a ClickHouse query?
p
we have some sample clickhouse queries here - https://signoz.io/docs/tutorial/writing-clickhouse-queries-in-dashboard/ If you have any specific type of clickhouse queries which you think might be helpful for you, please create an issue - https://github.com/SigNoz/signoz/issues We will try to add more examples as we find time
s
That’s fantastic. Thank you very much @Pranay!
@Pranay the new release is great! I believe that I have successfully used the new Exception-based ClickHouse query Alert builder to generate Signoz alerts whenever my app throws an exception. I am curious: do you know what provokes the ‘RESOLVED’ alert to fire and if there’s a way to disable the ‘RESOLVED’ alerts?
p
Awesome. Great to hear that! @Ankit Nayan @Amol Umbark May have more insights on firing of resolved alerts
a
'RESOLVED' alerts are thrown when the alert condition goes lower than the set threshold. @Amol Umbark can add more
a
If your alert condition no longer holds, the resolved alert will be sent. The rule engine in SigNoz keeps track of active alerts, and when your query no longer returns the same record or the threshold criteria are no longer met, a resolved message is sent.
s
That makes sense. Thank you all very much. I’m curious if one of you @Pranay, @Ankit Nayan or @Amol Umbark would be able to help me identify the flaw in the logic of my ClickHouse alert. I’m trying to simply create an alert that will send 1 Slack notification whenever an exception occurs in our app. I think I’m close, but I think I still need to fine-tune the time window because the alert pertaining to my current logic fires indefinitely and I need it to just fire 1 time for every exception. Here’s my current logic: SELECT count() as value, toStartOfInterval(timestamp, INTERVAL 5 MINUTE) AS interval FROM signoz_traces.signoz_error_index_v2 WHERE (serviceName='emailer') AND (exceptionType = 'Exception') GROUP BY interval; Send notification when the metric is ‘equal to’ the threshold ‘at least once’ during the last ‘5 minutes’. Alert Threshold: 1. Any ideas how I can refine my logic?
a
@Shane Snediker convert returned value to float64. Eg
SELECT toFloat64(count()) as value, toStartOfInterval(timestamp, INTERVAL 5 MINUTE) AS interval
FROM signoz_traces.signoz_error_index_v2
WHERE (serviceName='emailer')
AND (exceptionType = 'Exception')
GROUP BY interval ORDER BY interval ASC;
s
Awesome @Ankit Nayan thank you very much!
p
Did this work @Shane Snediker?
a
@Amol Umbark is float64 as returned value necessary in alerts?
s
Well, I’m going to try it tomorrow morning. I got it working better earlier today using the following logic (which was adding timestamp to the logic):
It seemed to be only generating 1 alert and the alert would stop firing after 5 minutes, which is basically the functionality I’m going for. I really just want to receive a Slack notification any time an exception occurs in our new app that we’re creating.
a
shouldn't the threshold be 'anytime' during the last 5 mins? Otherwise the value has to be 'exactly equal to 1' for all the time in the last 5 mins for the above alert to work.
a
@Shane Snediker Please choose the options 'above' instead of 'equal to' and 'at least once' instead of 'all the times'. btw, if the exception continues to happen beyond 5 minutes you will keep receiving the alert. Is that a problem? Are you looking to be notified only the first time this condition happens?
your current condition will only send notification if there is exactly one exception in last 5 mins. I am assuming you don't care about the count of exceptions but just when the count > 0
s
@Amol Umbark I’m looking to be notified only the first time an exception occurs, yes you are assuming correct
a
@Shane Snediker it's a tricky one. The alert engine keeps tracking an alert until it's resolved, meaning until the exception count goes back to what it was before the alert started. Are you receiving multiple messages, or are you just concerned about the alert status showing as firing? I would suggest looking at the general count of errors that your app encounters every 5 mins and setting the threshold accordingly. If you are seeing alerts fire often, increasing the threshold is a good option.
When you initially raised this issue, I think you were seeing the firing status all the time because the query didn't have a date range condition ({{.start_datetime}}). Now that you have this condition, the alert will only be raised based on the last 5 mins of data. Makes sense?
s
Yes, @Amol Umbark that makes perfect sense and I’ve been extensively testing it and it is working perfectly. Thank you guys so much, this is going to make a HUGE difference for our company to be able to know when our app throws an error; we’ll be able to get to the bottom of it proactively. Thank you!
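Editor's note: the final working query was not captured in this thread. The sketch below is an assumption that combines the float64 conversion suggested above with the date range condition discussed at the end. The thread only names a start variable, so the `{{.end_datetime}}` variable and the `BETWEEN` form are assumed here, not confirmed by the participants. The table name, service name, and exception type are taken from the original query.

```sql
-- Sketch only: counts exceptions per 5-minute bucket within the alert's evaluation window.
SELECT
    toFloat64(count()) AS value,                                   -- float64 conversion, as suggested in the thread
    toStartOfInterval(timestamp, INTERVAL 5 MINUTE) AS interval
FROM signoz_traces.signoz_error_index_v2
WHERE (serviceName = 'emailer')
  AND (exceptionType = 'Exception')
  -- Assumed date range condition; restricts the query to the evaluation window
  AND timestamp BETWEEN {{.start_datetime}} AND {{.end_datetime}}
GROUP BY interval
ORDER BY interval ASC;
```

With a query like this, the settings suggested in the thread would be: send a notification when the metric is 'above' the threshold 'at least once' during the last 5 minutes, with the threshold set to 0 so that any exception count greater than zero fires the alert.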