# general
ł
Hi, I'm looking for a replacement for our current monitoring/observability stack. Currently we grab metrics and logs using metricbeat and filebeat (and some other beats) and send them to an ELK stack. Can we achieve something similar with SigNoz? So far I only see support for pull-based collection (like Prometheus), but I would like to push all data to a public endpoint from the monitored systems (the way Logstash does). Ideally we could stay with metricbeat and filebeat and just change the endpoint. In general I would like to build one monitoring system in a central location and send data there from all other environments.
n
Hi, you can easily forward your logs from Logstash to SigNoz; for that you can follow this guide: https://signoz.io/docs/userguide/logstash_to_signoz
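For context, if I remember right, the collector side of that guide boils down to a tcplog receiver that accepts the JSON lines Logstash pushes over TCP. A minimal sketch (the port, the parsing operator, and the exporter name are assumptions here; the guide has the authoritative config):

```yaml
# Sketch: SigNoz otel-collector listening for logs pushed by Logstash.
# Assumes Logstash uses a tcp output with the json_lines codec pointed at
# this listener; port 2255 is just an example.
receivers:
  tcplog/logstash:
    listen_address: "0.0.0.0:2255"
    operators:
      - type: json_parser          # each Logstash event arrives as one JSON line
service:
  pipelines:
    logs:
      receivers: [tcplog/logstash]
      # SigNoz's ClickHouse log exporter (definition omitted for brevity)
      exporters: [clickhouselogsexporter]
```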
ł
Great. But... I'm not sure I want to stay with Logstash 😉 so I'm looking at all the possibilities to choose the best one. I found that there is a fluentforward receiver which does something like what I want, but for Fluentd/Fluent Bit, etc. Another option is to build Kafka, send all the data there, and have SigNoz consume it from there (which would also give us queueing, as I understand it). I saw that Cloudflare uses Kafka between its servers and the ClickHouse DB. Are there any other/better/easier ways to keep environments that only send data to some endpoint?
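From what I can tell, enabling that receiver would only be a few lines in the collector config; a rough sketch (the port and the exporter name are my assumptions, not something I've verified):

```yaml
# Rough sketch: the collector's fluentforward receiver accepts records pushed
# by Fluentd/Fluent Bit over the Forward protocol; 8006 is just an example port.
receivers:
  fluentforward:
    endpoint: 0.0.0.0:8006
service:
  pipelines:
    logs:
      receivers: [fluentforward]
      # assuming SigNoz's collector build (exporter definition omitted)
      exporters: [clickhouselogsexporter]
```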
n
yeah 😄, the Logstash guide helps you do a quick POC without changing much in your existing architecture. What's your environment, btw? If it's k8s, then all the logs are collected automatically from the host. This uses the architecture shown here:

https://signoz.io/assets/images/n_collectors-af31d02c2a41866aca6410976c4e4c86.png

Also, what is the scale you are currently running at? Something like the size of logs per hour would help us 🙂
ł
We work only with VMs: 20 datacenters with 20-1500 servers each. ELK is outsourced, and because it doesn't store data in one of our datacenters, we need to find a solution for keeping all the data in our own datacenter (and the current solution can't be moved). We have around 300 GB of logs a day and ~10k metrics, and it's doubling every year.
And I need a solution for at least the next 5 years.
n
If I understand correctly, you want to collect logs from all the servers in all 20 datacenters and store them in a single datacenter, right?
ł
Correct.
n
So the problems you are looking to solve are:
• Move from outsourced to in-house management.
• Handle a data rate that is doubling every year, for the next 5 years.
Any other requirements or pain points you have with the existing systems?
You can start with a 16-CPU machine for SigNoz, and deploy otel-collector agents on all your VMs, which will push data to the SigNoz otel-collector, e.g.: link. From our tests, a 16-CPU machine can easily handle 300+ GB of logs/hour, which should comfortably cover your scale.
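To make that concrete, here is a minimal sketch of the agent config on each VM (the hostname, port, and log paths are placeholders; a real pipeline would add more processors and the metrics side):

```yaml
# otel-collector agent on each VM: tail local log files and push them over
# OTLP to the central SigNoz collector.
receivers:
  filelog:
    include: [/var/log/**/*.log]   # placeholder paths
processors:
  batch: {}                        # batch records before export
exporters:
  otlp:
    endpoint: signoz-collector.central.example:4317   # hypothetical central endpoint
    tls:
      insecure: true               # set up TLS properly in production
service:
  pipelines:
    logs:
      receivers: [filelog]
      processors: [batch]
      exporters: [otlp]
```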
ł
When is there a reason to put Kafka between the source and the destination?
Shouldn't it be included within SigNoz?
How can it be scaled out? If we add more pods for ClickHouse, will it re-shard all the data?
n
The scale you are working with would be fine without Kafka for now, but it can be added later. @Ankit Nayan do you want to add anything here?
Support for distributed/sharded ClickHouse is a work in progress and should be out soon.
ł
The smallest change to the current solution would be to move the Elastic Beats output (metricbeat/filebeat/etc.) to Kafka, Redis, or Logstash. Since I don't run any of them currently, I would need to build competency in installing and maintaining one. Another option is to move from Beats to Fluentd, but I'm not sure all of our sources would be covered (we also have auditbeat and winlogbeat). The best solution would be to have Kafka/Redis/Logstash (or any other Beats output) within SigNoz — for Beats that would just mean swapping the output section (sketch below).
I'm looking for a solution where I don't need to learn a lot to use it and it's good enough out of the box 🙂 Currently SigNoz is close to that, which is why I'm here.
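For illustration, the Beats-side change would look something like this (broker addresses and the topic name are placeholders):

```yaml
# filebeat.yml sketch: point the existing Beats output at Kafka instead of
# the current ELK endpoint.
output.kafka:
  hosts: ["kafka-1:9092", "kafka-2:9092"]   # placeholder brokers
  topic: "beats-logs"                        # placeholder topic
  codec.json:
    pretty: false                            # one compact JSON event per message
```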
n
So basically you are thinking of an architecture where the flow of data is something like:
beats -> kafka -> signoz
right?
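In that setup the SigNoz side could consume directly from Kafka with the collector's kafka receiver; a hypothetical sketch (brokers, topic, and the encoding name are assumptions — check the receiver docs for your collector version):

```yaml
# Hypothetical sketch: consume Beats events from a Kafka topic.
receivers:
  kafka:
    brokers: ["kafka-1:9092", "kafka-2:9092"]
    topic: beats-logs
    encoding: json   # decode each Beats event from JSON into the log body
service:
  pipelines:
    logs:
      receivers: [kafka]
      # SigNoz's log exporter (definition omitted for brevity)
      exporters: [clickhouselogsexporter]
```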
ł
Yes, or something similar. We are thinking about using Redis for some caching purposes anyway, so it could be Redis, since we need to start using it regardless. I just saw that Cloudflare uses Kafka.
n
One of the reasons they have Kafka is that they generate about 800k logs/s. At that scale they need Kafka as a broker to scale consumers up and down as required, and it makes things more resilient. It also helps with things like dead-letter queues. But again, the best thing you can do is simulate your load and do a POC, which will boost your confidence. You can start without Kafka for now; we will be happy to help wherever you face an issue. Kafka is definitely something we at SigNoz have in our plans 🙂
ł
I will do the POC, mostly for traces from applications. I'm looking for the best way to switch the current logs and metrics flow. If there were Redis or Kafka within SigNoz, I could just change the endpoint in the current Beats configuration and everything would work. In all other scenarios I need to build Redis or Kafka on my own, and that is what's currently stopping me...