Hi there, I'm seeking assistance with configuring monitoring and alerts in Signoz. Here are the spec...

saurabh biramwar

03/15/2024, 5:27 PM

Hi there, I'm looking for help configuring monitoring and alerts in SigNoz. Here are the specific areas I need help with: 1) How can I alert on CrashLoopBackOff/OOMKilled pods? Additionally, I need help configuring alerts so that I can retrieve logs from previously terminated pods. 2) How can I set up alerts that notify me when spot instances go down and are subsequently rescheduled? 3) How can I use the default Prometheus metrics within SigNoz to create alerts?

nitya-signoz

03/16/2024, 2:45 AM

@Prashant Shahi will be able to help you here.

saurabh biramwar

03/16/2024, 3:32 AM

@nitya-signoz Thanks

saurabh biramwar

03/16/2024, 3:38 AM

@Prashant Shahi, could you please help me out with the above points?

Prashant Shahi

03/18/2024, 5:33 AM

1) Alert on CrashLoopBackOff/OOMKilled pods. Additionally, I need assistance in configuring alerts to retrieve logs from previously terminated pods.

If you know the error/exception pattern, you can set up an alert based on it and easily view the logs from the alert and its log context. If you do not have any error patterns, or they are not always printed, you will have to opt for a two-step solution:

1. Use the k8s.pod.phase metric to detect pod failure. Check this thread for the query: https://signoz-community.slack.com/archives/C01HWQ1R0BC/p1710329083837019?thread_ts=1710327606.553629&cid=C01HWQ1R0BC

2. View the logs of the pod based on the k8s.pod.name reported in the metrics alert.
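
As a rough illustration of step 1, a phase-based alert expression might look like the sketch below. This is not necessarily the query from the linked thread. It assumes the metric is produced by the OpenTelemetry k8s cluster receiver, which reports the phase as a number (1 Pending, 2 Running, 3 Succeeded, 4 Failed, 5 Unknown), and that your SigNoz version exposes dotted metric and label names to PromQL with underscores (k8s_pod_phase, k8s_pod_name). Check both assumptions against your own setup before relying on it.

```promql
# Assumption: k8s.pod.phase is queryable as k8s_pod_phase in PromQL.
# Assumption: phase value 4 means Failed (k8s cluster receiver convention).
# Returns one series per pod that is currently in the Failed phase;
# the k8s_pod_name label on each result identifies the pod to look up in logs.
k8s_pod_phase == 4
```

The pod name carried on the firing series is what you would then use as the filter in the logs explorer for step 2.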

Prashant Shahi

03/18/2024, 5:35 AM

2) How can I establish alerts to notify when spot instances go down and are subsequently rescheduled.

You can use absent(up{hostname="..."}) in the alert's PromQL query. We recently shipped something equivalent for the Query Builder as well; docs for it should be out soon.
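
To make the behaviour of absent() concrete, here is a small sketch. The hostname value spot-node-1 is a hypothetical placeholder, not something from this thread; substitute whatever label and value identify your instance.

```promql
# absent() returns a single series with value 1 when no series matches
# the selector, and returns nothing while at least one matching series exists.
# "spot-node-1" is a hypothetical hostname used only for illustration.
absent(up{hostname="spot-node-1"})
```

Used as an alert condition, this would fire while the up series for that host is missing and resolve once the series is reported again.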

Prashant Shahi

03/18/2024, 5:36 AM

3) How can I leverage Prometheus default metrics within Signoz to create alerts?

Have you gone through these docs? https://signoz.io/docs/userguide/alerts-management/

