This message was deleted SigNoz Community #support

# support

Slackbot

03/06/2023, 1:56 PM

This message was deleted.

Amol Umbark

03/06/2023, 3:40 PM

@Andreas the common use case for alerts is to sum up errors and get notified when the error count is too high or too low. This setup may be a bit experimental. I would suggest grouping exceptions and counting them over 5-minute windows, then sending the details as labels. You would have to write a ClickHouse query. ClickHouse queries for alerts require a value column and an interval column. In this case the value can be the count of records and the interval can be 5 minutes. Set the alert condition to threshold above or equal to 1, in total, for the last 5 minutes. The SELECT can have additional columns apart from value and interval; these columns will be sent as labels.

Amol Umbark

03/07/2023, 5:41 AM

let me share a sample query

Amol Umbark

03/07/2023, 7:10 AM

@Andreas More details on this: I would recommend using a short window in the alert, such as the last 5 minutes (screenshot of conditions attached). This is a sample query you can work with. It groups exceptions into 1-minute intervals. It also adds a limit, which acts as a rough rate limit on the number of results. I would recommend keeping it on the lower side (~100) and experimenting from there.

SELECT
    count() AS value,
    toStartOfInterval(timestamp, toIntervalMinute(1)) AS interval,
    serviceName,
    exceptionMessage,
    exceptionType
FROM signoz_traces.distributed_signoz_error_index_v2
WHERE timestamp BETWEEN {{.start_datetime}} AND {{.end_datetime}}
GROUP BY
    serviceName,
    interval,
    exceptionMessage,
    exceptionType
LIMIT 100

Use the same conditions in the attached screenshot. You would see the following kind of webhook message.

{
  "status": "firing",
  "labels": {
    "additionalInfo": "The rule threshold is set to 0.0000, and the observed metric value is 2.",
    "alertname": "testexceptions_TEST_ALERT",
    "details": "http://<signoz-url>/exceptions",
    "exceptionMessage": "HTTPSConnectionPool(host='xxx', port=443): Max retries exceeded with url: /v3/xxxxx (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x10f8970d0>: Failed to establish a new connection: [Errno 8] nodename nor servname provided, or not known'))",
    "exceptionType": "ConnectionError",
    "ruleId": "testexceptions",
    "ruleSource": "http://<signoz-url>/alerts/edit?ruleId=testexceptions",
    "serviceName": "flaskApp",
    "severity": "warning"
  },
  "startsAt": "2023-03-07T06:52:49.022196837Z",
  "endsAt": "2023-03-07T06:56:49.022196837Z",
  "fingerprint": "221f269939452105"
}

The messages may be repeated, as the rules engine sends the same message with "firing" and then "resolved" status. In your case, the status wouldn't be useful, so I would recommend looking only at messages with "firing" status. Also use the fingerprint in the message to identify whether this is a new record or an older one. As you can see, exceptionType and exceptionMessage show up in the labels section of the message. Do share your use case so I can help you further.
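
For anyone wiring this up on the receiving side, here is a rough Python sketch of the advice above: keep only "firing" messages and use the fingerprint to skip records already seen. It assumes the alert objects (shaped like the sample message) have already been parsed from JSON into a list of dicts; how they are wrapped in the full webhook body is not shown in this thread, and the function names `new_firing_alerts` and `summarize` are made up for illustration.

```python
def new_firing_alerts(alerts, seen_fingerprints):
    """Return alerts that are firing and whose fingerprint is not yet known.

    `alerts` is assumed to be a list of dicts shaped like the sample message;
    `seen_fingerprints` is a set that the caller keeps between webhook calls.
    """
    fresh = []
    for alert in alerts:
        if alert.get("status") != "firing":
            continue  # ignore the "resolved" repeat of the same message
        fingerprint = alert.get("fingerprint")
        if fingerprint in seen_fingerprints:
            continue  # older record, already handled
        seen_fingerprints.add(fingerprint)
        fresh.append(alert)
    return fresh


def summarize(alert):
    """Pull the exception details out of the labels section."""
    labels = alert.get("labels", {})
    return {
        "service": labels.get("serviceName"),
        "type": labels.get("exceptionType"),
        "message": labels.get("exceptionMessage"),
    }


# Trimmed-down copies of the sample message: firing, resolved, firing again.
sample_alerts = [
    {
        "status": "firing",
        "labels": {
            "serviceName": "flaskApp",
            "exceptionType": "ConnectionError",
            "exceptionMessage": "Max retries exceeded",
        },
        "fingerprint": "221f269939452105",
    },
    {"status": "resolved", "labels": {}, "fingerprint": "221f269939452105"},
    {"status": "firing", "labels": {}, "fingerprint": "221f269939452105"},
]

seen = set()
fresh_alerts = new_firing_alerts(sample_alerts, seen)
summaries = [summarize(a) for a in fresh_alerts]
```

With the three sample entries, only the first one comes through; the "resolved" copy and the repeated "firing" copy are dropped. An in-memory set is the simplest option here and is lost on restart, so a real receiver would probably persist the fingerprints somewhere.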
