# support
ł
Hi, can you help me with configuring fluentD to send correct data? Currently I'm working on two cases:

- IIS logs
- Log4Net application logs

With IIS I have almost everything the way I'd like it. The only problem I see is that I would like to store some fields as numeric/int and as an IP address; currently everything is stored as a string.

With Log4Net I need to build grok parsing and I'm lost with it. In Log4Net we are using this conversion pattern:

```
%d [%t][%P{vRequestId}]*[%P{vSqlServer}]*[%P{vUserName}][%P{vInstance}] %-5p %c - %m%n vRootUrl:%P{vRootUrl} %n vRequestUrl:%P{vRequestUrl} %n Body:%P{vLogExtension}%n
```

And I have no idea how to build a grok filter for it.
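[Editor's note: a grok expression for the single-line part of this layout might look roughly like the one below. It is a sketch only, not verified against real log lines; the field names (`thread`, `requestId`, etc.) are illustrative, the literal `[`, `]`, and `*` separators must be escaped, and the trailing `vRootUrl`/`vRequestUrl`/`Body` lines would additionally need multiline handling in the collector.]

```
%{TIMESTAMP_ISO8601:timestamp} \[%{NUMBER:thread}\]\[%{DATA:requestId}\]\*\[%{DATA:sqlServer}\]\*\[%{DATA:userName}\]\[%{DATA:instance}\] %{LOGLEVEL:level}\s*%{NOTSPACE:logger} - %{GREEDYDATA:message}
```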
a
@nitya-signoz have you worked with grok? Also, can we do the parsing at otel-collector?
n
I have some idea, but I'd like to understand the problem better. @Łukasz Herman can you please create a GitHub issue with the configuration that you are currently using for both problems, and also share some example logs? While we work towards a solution it will help the community as well.
ł
Where should we parse the logs? Currently I understand that fluentD can use grok to parse a log and slice it into different fields, but on the SigNoz side all of them are strings. Is there any other way to parse logs (on the SigNoz side)?
@nitya-signoz @Ankit Nayan Where should I parse logs? In general it would be more suitable for me to do all the parsing in one central location like SigNoz. But how? And once I somehow parse the logs, how do I change the data types (to int, IP, decimal, etc.)?
n
Yes, ideally all the parsing should be done in the otel-collector-config.yaml file. For parsing the logs there are different operators available: https://signoz.io/docs/userguide/logs/#operators-for-parsing-and-manipulating-logs
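[Editor's note: a minimal sketch of what such an operator looks like, using a `regex_parser` operator attached to a `filelog` receiver. The file path and regex are illustrative, not taken from the conversation; named capture groups become log attributes.]

```yaml
receivers:
  filelog:
    include: [ /var/log/app/info.log ]
    operators:
      # extract named capture groups from the log body into attributes
      - type: regex_parser
        regex: '^\[(?P<thread>\d+)\] (?P<level>\w+) - (?P<message>.*)$'
        parse_from: body
```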
If you can share some example log lines, we can help you with how to parse it.
ł
Let's focus on a log like the one below:

```json
{
  "timestamp": 1666659631502000000,
  "id": "2FlLy8mdGBzFof0oKEZy4x4jtMg",
  "trace_id": "",
  "span_id": "",
  "trace_flags": 0,
  "severity_text": "",
  "severity_number": 0,
  "body": ""[216][638022636288199459][SQL123\\SQL2019B][System][saas] INFO Core.WebTasks.Esb.WebTaskJobProcessor - Zaczynam WebTask:AfterUpgrade (saas)\r\n vRootUrl:https://fake.org/ \r\n vRequestUrl:https://fake.org/? \r\n Body:(null)"",
  "resources_string": {},
  "attributes_string": {
    "app": "app1",
    "environment": "UAT",
    "fluent_tag": "log4net",
    "hostname": "AP20",
    "logfile": "info.log",
    "logfilepath": "D:/Data/info.log",
    "module": "app1"
  },
  "attributes_int": {},
  "attributes_float": {}
}
```

I would like to parse `body` and extract some fields, like the thread between "[" and "]" (216), and so on.
n
This is something that you have taken from the SigNoz UI. Can you share a raw log line from the file/source where you are getting the data? Some operators have already been applied here.
ł
I'm grabbing file logs using FluentD and sending them to the fluentforward receiver. Since I would like to parse them on the SigNoz side, it shouldn't matter what the source logs look like.
n
Yeah, if you are using fluentD then the source log format doesn't matter, but the output format of fluentD does. You can use https://docs.fluentd.org/output/stdout to check the output that is going to SigNoz.
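[Editor's note: a sketch of how to inspect the output without losing forwarding, using the `copy` output to duplicate events to both `stdout` and the existing `forward` output. The tag pattern, host, and port are illustrative.]

```
<match log4net.**>
  @type copy
  <store>
    # print each event so you can see what fluentD is emitting
    @type stdout
  </store>
  <store>
    # keep forwarding to the SigNoz otel-collector as before
    @type forward
    <server>
      host signoz-otel-collector
      port 24224
    </server>
  </store>
</match>
```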
ł
On Windows, FluentD runs as a service, so I'm not able to see its standard output. I can save it to a file, but that will be the same as what is sent to fluentforward, won't it?
n
Yeah, it will give us an idea about what's happening with the logs in fluentD before they go downstream.
ł
I was not able to send the logs to a file (it fails and I'm not able to figure out why), but the FluentD log shows what data it has:

```
"2022-10-26T092254+02:00\tlog4net\t{\"message\":\"[46] DEBUG - Init WindsorHttpModule\",\"logfilepath\":\"D:/Data/App1/Komunikacja/info.txt\",\"hostname\":\"AP20\",\"app\":\"App1\",\"module\":\"Komunikacja\",\"logfile\":\"info.txt\",\"environment\":\"UAT\"}\r\n"
```
It's more readable when I grab what is in the FluentD buffer.

For log4Net:

```
2022-10-26T090939+02:00 log4net {"message":"[52] DEBUG - Dispose RequestTimerModule","logfilepath":"D:/Data/App1/Administracja/info.txt","hostname":"AP20","app":"App1","module":"Administracja","logfile":"info.txt","environment":"UAT"}
```

For IIS:

```
2022-10-26T093744+02:00 iis.arr {"message":"W3SVC1 ARR2 10.50.60.122 POST /App1Wcf/App1WcfService.svc X-ARR-CACHE-HIT=0&SERVER-ROUTED=AP23C&X-ARR-LOG-ID=01dcb691-a6c9-4c6d-a30f-9fa6ee484b0b&SERVER-STATUS=200 443 - 10.50.60.163 HTTP/1.1 - - App1.uat 200 0 0 3581 796 11 10.50.60.163:50137","s-sitename":"W3SVC1","s-computername":"ARR2","s-ip":"10.50.60.122","cs-method":"POST","cs-uri-stem":"/App1Wcf/App1WcfService.svc","cs-uri-query":"X-ARR-CACHE-HIT=0&SERVER-ROUTED=AP23C&X-ARR-LOG-ID=01dcb691-a6c9-4c6d-a30f-9fa6ee484b0b&SERVER-STATUS=200","s-port":"443","cs-username":null,"c-ip":"10.50.60.163","cs-version":"HTTP/1.1","cs(User-Agent)":null,"cs(Referer)":null,"cs-host":"App1.uat","sc-status":"200","sc-substatus":"0","sc-win32-status":"0","sc-bytes":"3581","cs-bytes":"796","time-taken":"11","OriginalIP":"10.50.60.163:50137","app":"arr","type":"iis","environment":"UAT"}
```
n
Got it, let me get back to you on the steps for parsing the above logs.
@Łukasz Herman so from the data that you have shared, `message` is the actual log that is being scraped, and you are parsing it in fluentD to extract different values such as `sc-bytes`, `sc-status`, etc. Since these values are parsed as strings in fluentD itself, they are stored as strings in SigNoz. When you are parsing the log in fluentD you can specify the `types` parameter (https://docs.fluentd.org/configuration/parse-section#types-parameter) to parse a field to a specific type.
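[Editor's note: a minimal sketch of the `types` parameter inside a fluentd `<parse>` section. The regexp below is illustrative only, matching a reduced form of the IIS line, not the full W3C field list; the field names mirror the example log above.]

```
<parse>
  @type regexp
  # illustrative: capture only a few fields, not the full IIS format
  expression /^(?<cs-method>\S+) (?<sc-status>\d+) (?<sc-bytes>\d+) (?<time-taken>\d+)$/
  # cast the numeric fields so they reach SigNoz as integers, not strings
  types sc-status:integer,sc-bytes:integer,time-taken:integer
</parse>
```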
ł
Ok, thanks. What about the log4net example, where I would like to parse it outside fluentD? It's more suitable for me to parse it at the destination.
n
The current version of the SigNoz otel collector doesn't support type casting. Once this issue is fixed (https://github.com/SigNoz/signoz-otel-collector/issues/19#issuecomment-1293276797), you will be able to typecast string values to int.
ł
Because of the new version of Fluent Bit (2.0 looks like it solves the most demanding tasks, giving us one tool to grab all logs and metrics), I'm wondering if it would be best to put a few Fluent Bit servers between source and destination for parsing and filtering. Then the whole stack would be HTTP based, as I can send all data in the OTel format. (I like HTTP more than TCP because I can send everything through our load balancers and keep it all under control.)

The pipeline would then be: source servers with Fluent Bit (many servers in many datacenters) -> (maybe) a central local Fluent Bit per environment/datacenter -> load balancer (entrance to the main datacenter where all data will be stored) -> Fluent Bit farm (scaling servers easily, performing all parsing and basic filtering) -> load balancer -> SigNoz OTel Collector.

I need to check if the OTel input and output in Fluent Bit work as I think, and if they can grab the data I would like to have. What do you think about that?
n
Keeping the parsing separate from the ingestion machines will definitely be a better idea, though you could replace the intermediate Fluent Bit instances with otel-collectors if you are parsing the data at the source Fluent Bit. Since, when fluentD/fluentBit finally sends data to SigNoz, you might have to transform a few things, doing that before the data reaches the final signoz-otel-collector is a better idea.
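[Editor's note: a minimal sketch of such an intermediate otel-collector, assuming the source agents export OTLP over HTTP. It receives OTLP, batches, and forwards over OTLP to the next hop; the endpoints are hypothetical.]

```yaml
receivers:
  otlp:
    protocols:
      http:
        endpoint: 0.0.0.0:4318
processors:
  batch: {}
exporters:
  otlp:
    # hypothetical address of the load balancer in front of SigNoz
    endpoint: signoz-lb.internal:4317
    tls:
      insecure: true
service:
  pipelines:
    logs:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlp]
```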