# general
ł
Hi, I'm looking for a replacement for our current monitoring/observability stack. Currently we grab metrics and logs using metricbeat and filebeat (and some other beats) and send them to an ELK stack. Can we achieve something similar with SigNoz? So far I only see support for pull-based collection (like Prometheus), but I would like to push all data to a public endpoint from the monitored systems (the way Logstash does). Ideally we could stay with metricbeat and filebeat and just change the endpoint. In general I would like to build one monitoring system in a central location and send data there from all other environments.
n
Hi, you can easily forward your logs from Logstash to SigNoz; for that you can follow this guide: https://signoz.io/docs/userguide/logstash_to_signoz
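For context, if I remember right, the collector side of that guide boils down to a tcplog receiver that accepts the JSON lines Logstash pushes over TCP. A minimal sketch (the port, the parsing operator, and the exporter name are assumptions here; the guide has the authoritative config):

```yaml
# Sketch: SigNoz otel-collector listening for logs pushed by Logstash.
# Assumes Logstash uses a tcp output with the json_lines codec pointed at
# this listener; port 2255 is just an example.
receivers:
  tcplog/logstash:
    listen_address: "0.0.0.0:2255"
    operators:
      - type: json_parser          # each Logstash event arrives as one JSON line
service:
  pipelines:
    logs:
      receivers: [tcplog/logstash]
      # SigNoz's ClickHouse log exporter (definition omitted for brevity)
      exporters: [clickhouselogsexporter]
```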
ł
Great. But... I'm not sure I want to stay with Logstash 😉 so I'm looking at all the possibilities to choose the best one. I found that there is a fluentforward receiver which does something like what I want, but for Fluentd/Fluent Bit, etc. Another option is to build Kafka, send all the data there, and have SigNoz consume it from there (which would also give us queueing, as I understand it). I saw that Cloudflare uses Kafka between its servers and the ClickHouse DB. Are there any other/better/easier ways to keep environments that only send data to some endpoint?
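From what I can tell, enabling that receiver would only be a few lines in the collector config; a rough sketch (the port and the exporter name are my assumptions, not something I've verified):

```yaml
# Rough sketch: the collector's fluentforward receiver accepts records pushed
# by Fluentd/Fluent Bit over the Forward protocol; 8006 is just an example port.
receivers:
  fluentforward:
    endpoint: 0.0.0.0:8006
service:
  pipelines:
    logs:
      receivers: [fluentforward]
      # assuming SigNoz's collector build (exporter definition omitted)
      exporters: [clickhouselogsexporter]
```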
n
yeah 😄, the Logstash guide helps you do a quick POC without changing much in your existing architecture. What's your environment, btw? If it's k8s, then all the logs are collected automatically from the host. This uses the architecture shown here:

https://signoz.io/assets/images/n_collectors-af31d02c2a41866aca6410976c4e4c86.png

Also, what is the scale you are currently running at? Something like the size of logs per hour would help us 🙂
ł
We work only with VMs: 20 datacenters with 20-1500 servers each. ELK is outsourced, and because it doesn't store data in one of our datacenters, we need to find a solution for keeping all the data in our own datacenter (and the current solution can't be moved). We have around 300 GB of logs a day and ~10k metrics, and it's doubling every year.
And I need a solution for at least the next 5 years.
n
If I understand correctly, you want to collect logs from all the servers in all 20 datacenters and store them in a single datacenter, right?
ł
Correct.
n
So the problems you are looking to solve are:
• Move from outsourced to in-house management.
• Handle a data rate that is doubling every year, for the next 5 years.
Any other requirements or pain points you have with the existing systems?
You can start with a 16-CPU machine for SigNoz, and deploy otel-collector agents on all your VMs, which will push data to the SigNoz otel-collector, e.g.: link. From our tests, a 16-CPU machine can easily handle 300+ GB of logs/hour, which should comfortably cover your scale.
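To make that concrete, here is a minimal sketch of the agent config on each VM (the hostname, port, and log paths are placeholders; a real pipeline would add more processors and the metrics side):

```yaml
# otel-collector agent on each VM: tail local log files and push them over
# OTLP to the central SigNoz collector.
receivers:
  filelog:
    include: [/var/log/**/*.log]   # placeholder paths
processors:
  batch: {}                        # batch records before export
exporters:
  otlp:
    endpoint: signoz-collector.central.example:4317   # hypothetical central endpoint
    tls:
      insecure: true               # set up TLS properly in production
service:
  pipelines:
    logs:
      receivers: [filelog]
      processors: [batch]
      exporters: [otlp]
```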
ł
When is there a reason to put Kafka between the source and the destination?
Shouldn't it be included within SigNoz?
How can it be scaled out? If we add more pods for ClickHouse, will it re-shard all the data?
n
The scale you are working with would be fine without Kafka for now, but it can be added later. @Ankit Nayan do you want to add anything here?
Support for distributed/sharded ClickHouse is a work in progress and should be out soon.
ł
The smallest change to the current solution would be to move the Elastic Beats output (metricbeat/filebeat/etc.) to Kafka, Redis, or Logstash. Since I don't run any of them currently, I would need to build competency in installing and maintaining one. Another option is to move from Beats to Fluentd, but I'm not sure all of our sources would be covered (we also have auditbeat and winlogbeat). The best solution would be to have Kafka/Redis/Logstash (or any other Beats output) within SigNoz — for Beats that would just mean swapping the output section (sketch below).
I'm looking for a solution where I don't need to learn a lot to use it and it's good enough out of the box 🙂 Currently SigNoz is close to that, which is why I'm here.
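For illustration, the Beats-side change would look something like this (broker addresses and the topic name are placeholders):

```yaml
# filebeat.yml sketch: point the existing Beats output at Kafka instead of
# the current ELK endpoint.
output.kafka:
  hosts: ["kafka-1:9092", "kafka-2:9092"]   # placeholder brokers
  topic: "beats-logs"                        # placeholder topic
  codec.json:
    pretty: false                            # one compact JSON event per message
```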
n
So basically you are thinking of an architecture where the flow of data is something like:
beats -> kafka -> signoz
right?
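In that setup the SigNoz side could consume directly from Kafka with the collector's kafka receiver; a hypothetical sketch (brokers, topic, and the encoding name are assumptions — check the receiver docs for your collector version):

```yaml
# Hypothetical sketch: consume Beats events from a Kafka topic.
receivers:
  kafka:
    brokers: ["kafka-1:9092", "kafka-2:9092"]
    topic: beats-logs
    encoding: json   # decode each Beats event from JSON into the log body
service:
  pipelines:
    logs:
      receivers: [kafka]
      # SigNoz's log exporter (definition omitted for brevity)
      exporters: [clickhouselogsexporter]
```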
ł
Yes, or something similar. We are thinking about using Redis for some caching purposes anyway, so it could be Redis, since we need to start using it regardless. I just saw that Cloudflare uses Kafka.
n
One of the reasons they have Kafka is that they generate about 800k logs/s. At that scale they need Kafka as a broker to scale consumers up and down as required, and it makes things more resilient. It also helps with things like dead-letter queues. But again, the best thing you can do is simulate your load and do a POC, which will boost your confidence. You can start without Kafka for now; we will be happy to help wherever you face an issue. Kafka is definitely something we at SigNoz have in our plans 🙂
ł
I will do the POC, mostly for traces from applications. I'm looking for the best way to switch the current logs and metrics flow. If there were Redis or Kafka within SigNoz, I could just change the endpoint in the current Beats configuration and everything would work. In all other scenarios I need to build Redis or Kafka on my own, and that is what's currently stopping me...