This message was deleted SigNoz Community #support

# support

Slackbot

08/25/2022, 9:25 PM

This message was deleted.

Alexei Zenin

08/25/2022, 9:29 PM

Currently migrating from Datadog to SigNoz for the same reasons

Craig Rodrigues

08/25/2022, 9:34 PM

@Alexei Zenin Very nice! Would love to hear your experience (positive/negative) with this exercise. Datadog costs seem to be a common motivator for looking for alternatives

Alexei Zenin

08/25/2022, 9:35 PM

Biggest headache so far was setting up clickhouse and everything else via Cloudformation (we run on ECS so could not use the Kubernetes stuff SigNoz has written)

Alexei Zenin

08/25/2022, 9:37 PM

I would say it's not complete feature parity just yet, and there are still some UX quirks to iron out, but we've currently migrated a chunk of services and are doing about 250k spans per hour, and it seems to work fine so far
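For a sense of scale, the 250k spans per hour figure can be converted to per-second and per-day rates. This is only a minimal sketch of the arithmetic; the function names are illustrative and the only input is the number quoted above.

```python
# Rough throughput arithmetic for the "about 250k spans per hour" figure.
# Function names are illustrative; the only input is the number from the chat.

SPANS_PER_HOUR = 250_000


def spans_per_second(spans_per_hour: float) -> float:
    """Average ingest rate, assuming spans arrive evenly across the hour."""
    return spans_per_hour / 3600


def spans_per_day(spans_per_hour: int) -> int:
    """Daily volume, assuming the hourly rate holds for all 24 hours."""
    return spans_per_hour * 24


if __name__ == "__main__":
    print(round(spans_per_second(SPANS_PER_HOUR), 1))  # 69.4
    print(spans_per_day(SPANS_PER_HOUR))               # 6000000
```

Real traffic is rarely this even, so peak rates are likely higher than the roughly 69 spans per second average.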

Craig Rodrigues

08/25/2022, 9:37 PM

Yikes, I hate CloudFormation. I used CF when spinning up EKS clusters, and it is a pain. But I’m not using ECS so that may be one hurdle I can avoid

Alexei Zenin

08/25/2022, 9:38 PM

Yeah took me a few weeks to iron out the kinks so would definitely advise going EKS if you can

Craig Rodrigues

08/25/2022, 9:38 PM

What features does it lack compared to Datadog? Are these things that are “good enough” if you don’t have them?

Alexei Zenin

08/25/2022, 9:39 PM

At the moment there's no easy way to drill down on a per-endpoint level in the service page for errors and browse certain traces. You could go to the traces tab and filter on the specific service and endpoints though, I think

Alexei Zenin

08/25/2022, 9:40 PM

I guess overall will be a learning curve to get used to new habits for debugging 😅

Craig Rodrigues

08/25/2022, 9:41 PM

Main thing: despite the quirks, are you confident that this is “good enough” to replace DD for your usage?

Alexei Zenin

08/25/2022, 9:42 PM

At the moment it's the best thing we've found so far for an OpenTelemetry backend that's open source

Craig Rodrigues

08/25/2022, 9:42 PM

what are your thoughts on ClickHouse? that is one thing I don’t have experience with

Alexei Zenin

08/25/2022, 9:42 PM

We wanted to limit the tools needed so went for SigNoz, Prometheus, Grafana

Alexei Zenin

08/25/2022, 9:44 PM

Biggest areas of concern are only the single node Clickhouse deployment option and the backend service which sends alerts being down. Both can only run with 1 instance atm to my understanding. I would say we aren't using SigNoz for everything, only for Traces (due to maturity of other tools at the moment)

Craig Rodrigues

08/25/2022, 9:45 PM

Are you self-hosting everything (SigNoz, Prometheus, Grafana), or are you using any cloud-hosted versions of these services?

Alexei Zenin

08/25/2022, 9:46 PM

Self hosting most things. Grafana is simple to run on fargate so costs like 50 cents a day. Thinking of using managed AWS Prometheus to avoid needing to scale/operate that

Craig Rodrigues

08/25/2022, 9:46 PM

Thanks for your answers, and good luck in your efforts. Maybe I’ll follow your footsteps!!

👍 1

Alexei Zenin

08/25/2022, 9:48 PM

Thanks! Good luck as well with your migration, definitely not an easy task

Craig Rodrigues

08/25/2022, 10:27 PM

Well if it was easy, it wouldn’t be fun. 😉

Ankit Nayan

08/25/2022, 11:28 PM

@Craig Rodrigues

"I’m interested in migrating a Datadog setup which uses about 400K metrics/month"

400K metrics should be easy ... we have users using >1M metrics

Ankit Nayan

08/25/2022, 11:28 PM

Deploying to EKS should be pretty straightforward following https://signoz.io/docs/install/kubernetes/aws/

Ankit Nayan

08/25/2022, 11:31 PM

@Alexei Zenin

"At the moment there's no easy way to drill down on a per-endpoint level in the service page for errors and browse certain traces. You could go to the traces tab and filter on the specific service and endpoints though, I think"

You should be able to click on any endpoint/operation in a service overview page, and that would take you to the traces page filtered by service name and operation. You can then select the "error" option to see only the errors

Alexei Zenin

08/25/2022, 11:33 PM

@Ankit Nayan ah yeah, thanks. I forgot you could do that. I was used to seeing the number of errors per endpoint right on the services page in Datadog

🆗 1

Ankit Nayan

08/25/2022, 11:34 PM

"Biggest areas of concern are only the single node Clickhouse deployment option and the backend service which sends alerts being down. Both can only run with 1 instance atm to my understanding."

We are working to make both ClickHouse and the query service horizontally scalable. Should be out in a month or so. I didn't get the part "the backend service which sends alerts being down." Would love to hear more about it. It would be great if you can open a GitHub issue about it to track it publicly

Pranay

08/26/2022, 12:43 PM

"Biggest headache so far was setting up clickhouse and everything else via Cloudformation (we run on ECS so could not use the Kubernetes stuff SigNoz has written)"

@Alexei Zenin would you be able to share any snippets/docs on how you ran it in ECS/CloudFormation? We don't have official docs for it yet, but some members in the community may be interested in it

Pranay

08/26/2022, 12:43 PM

e.g here - https://signoz-community.slack.com/archives/C01HWUTP4HH/p1661370508056209

Alexei Zenin

08/29/2022, 9:12 PM

Yeah, talked to Jason and saw the open issue. Will try to find some time to share some templates. The collectors were the trickiest so thinking of sharing those first

👍 1

Alexei Zenin

08/29/2022, 9:18 PM

Ankit, I think your work will solve what I am describing. The point I was trying to make is that SigNoz would not provide a resilient, highly available setup if either ClickHouse or the query service goes down (since there is only 1 of each). If ClickHouse goes down, there's no ingestion or querying/alerting. If the query service goes down, there's no alerting for that time period.
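The failure modes described above can be written down as a small truth table. This is only a sketch of the reasoning in the message, not SigNoz code: the `Components` class and `available_capabilities` function are hypothetical names, and treating querying as also dependent on the query service is an assumption (the message only says alerting stops in that case).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Components:
    """Hypothetical model of the two single-instance components discussed."""
    clickhouse_up: bool
    query_service_up: bool


def available_capabilities(c: Components) -> dict[str, bool]:
    """Map component health to what still works, per the message above."""
    return {
        # ClickHouse down: no ingestion.
        "ingestion": c.clickhouse_up,
        # ClickHouse down: no querying. Also requiring the query service
        # here is an assumption, not something stated in the chat.
        "querying": c.clickhouse_up and c.query_service_up,
        # Either component down: no alerting for that period.
        "alerting": c.clickhouse_up and c.query_service_up,
    }


if __name__ == "__main__":
    for ch in (True, False):
        for qs in (True, False):
            state = Components(clickhouse_up=ch, query_service_up=qs)
            print(state, available_capabilities(state))
```

With one instance of each component, any single failure removes alerting, which is why running more than one instance of both matters for this concern.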

Jessie

09/27/2022, 1:48 PM

I would love a copy of those ECS blueprints.
