I am using signoz on k8s. This is a log message th...
# support
h
I am using signoz on k8s. This is a log message that arrives at the web ui. "Log Details -> Json". Q: How and where do I extract the fields
level
and
something
from the body?
{
  "timestamp": 1687410330621216300,
...
  "severity_text": "",
  "severity_number": 0,
  "body": "{\"level\":\"info\",\"message\":\"cleaned 0 job(s)\",\"something\":\"foobar\"}",
  "k8s_cluster_name": "",
  "k8s_container_name": "chart",
...
  "log_file_path": "/var/log/pods/foobar/chart/0.log",
  "log_iostream": "stdout",
  "logtag": "F",
  "time": "2023-06-22T07:05:30.621216224+02:00"
}
example
config:
    processors:
      logstransform/internal:
        operators:
          - type: trace_parser ## doesn't work
            if: '"level" in body'
            method:
              parse_from: body.level 

          - type: json_parser ## doesn't work
            if: '"level" in body'
            method:
              parse_from: body.level
or
k8s-infra:
  presets:
    logsCollection:
      operators:
        - type: json_parser ## doesn't work
          if: '"level" in body'
          method:
            parse_from: body.level
always leading to these errors on the otel-collector
Error: failed to get config: cannot unmarshal the configuration: 1 error(s) decoding:

* error decoding 'processors': error reading configuration for "logstransform/internal": 1 error(s) decoding:

* error decoding 'operators[0]': unmarshal to trace_parser: 1 error(s) decoding:

* '' has invalid keys: method
2023/06/22 05:18:33 application run finished with error: failed to get config: cannot unmarshal the configuration: 1 error(s) decoding:

* error decoding 'processors': error reading configuration for "logstransform/internal": 1 error(s) decoding:

* error decoding 'operators[0]': unmarshal to trace_parser: 1 error(s) decoding:

* '' has invalid keys: method
p
@nitya-signoz do you have more insights on this?
h
Q: Why do I have to do this in the first place? I went to make my application create a json body. I would expect that otel can extract things from there (as your examples do so)? What am I doing wrong?
Q2: If that is the way to go - that I have to extract fields by myself (I am cool with that) - how do I do that and how do I make otel-collector don't cry, if other logs don't have that in their body?
n
Hi, parsing from the raw body won't work; you will have to use a JSON parser. Here is an example JSON parser config that I wrote; please modify it to support your use case.
logstransform/parse_json:
        operators:
          - default: noop
            id: router_signoz
            routes:
              - expr: 'body matches "^{.*}$"'
                output: parse_json
            type: router
          - id: parse_json
            type: json_parser
            parse_from: body
            parse_to: attributes.temp
            output: trace_parse
          - type: trace_parser
            id: trace_parse
            trace_id:
              parse_from: attributes.temp.traceId
            span_id:
              parse_from: attributes.temp.spanId
            output: move_message
          - type: move
            id: move_message
            from: attributes.temp.message
            to: attributes.message
            if: "'message' in attributes.temp"
            output: move_enduserid
          - type: move
            id: move_enduserid
            from: attributes.temp.extra.endUserId
            if: "'extra' in attributes.temp and 'endUserId' in attributes.temp.extra"
            to: attributes.endUserId
            output: remove
          - id: remove
            type: remove
            field: attributes.temp
            output: noop
          - id: noop
            type: noop
h
can I add this section to my values.yaml?
n
Q1: Things present in the body are stored as a string (check the OTel logs data model https://opentelemetry.io/docs/specs/otel/logs/data-model/), which is why you need to use the JSON parser on the body. Q2: You can use the if conditions to handle unavailable values.
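A minimal Python sketch of the idea in these two answers, for illustration only: it models the router, JSON parsing, conditional move and cleanup steps on a plain dict. The function name `extract_fields` and the record layout are assumptions made for this example; this is not the collector's code.

```python
import json
import re

# Same pattern as the router expression in the example config.
LOOKS_LIKE_JSON = re.compile(r"^\{.*\}$")


def extract_fields(record, wanted=("level", "something")):
    """Hypothetical model of router -> json_parser -> move (with if) -> remove."""
    body = record.get("body", "")
    # Router: records whose body is not a JSON object pass through unchanged.
    if not isinstance(body, str) or not LOOKS_LIKE_JSON.match(body):
        return record
    try:
        temp = json.loads(body)  # json_parser: body -> temporary map
    except json.JSONDecodeError:
        return record
    attributes = dict(record.get("attributes", {}))
    for key in wanted:
        # move with an `if`: only copy keys that are actually present.
        if key in temp:
            attributes[key] = temp[key]
    # remove: the temporary map is dropped and never reaches the result.
    return {**record, "attributes": attributes}


record = {"body": '{"level":"info","message":"cleaned 0 job(s)","something":"foobar"}'}
print(extract_fields(record)["attributes"])
print(extract_fields({"body": "plain text line"}))
```

Logs without a JSON body, or without one of the wanted keys, simply pass through, which is the behaviour the if conditions are meant to give.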
h
(I don't 10000% understand if this is an "add" or a "replace" operation, if I add this to my values.yaml in helm)
n
You will have to modify it a bit for your use case. If you can give me example log lines and the things you want to extract I can help.
This is a new processor that you are going to add
h
Thx a lot - I give it a try
n
In your case from the above example operators between
trace_parse
and
move_enduserid
will change and the other parts will remain same
h
this is my body - I want to extract the "account_id"
{
  "account_id": "xxx",
  "level": "info",
  "message": "my message",
   ...
}
That is how I understand and modified your code
# parse body as json to tmp
          - id: parse_json
            type: json_parser
            parse_from: body
            parse_to: attributes.temp
            output: trace_parse

          # parse trace id
          - type: trace_parser
            id: trace_parse
            trace_id:
              parse_from: attributes.temp.traceId
            span_id:
              parse_from: attributes.temp.spanId
            output: move_message

          # check if account_id attribute exist and move it to the attributes
          - type: move
            id: move_message
            from: attributes.temp.account_id
            to: attributes.account_id
            if: "'account_id' in attributes.temp"
            output: move_account_id

          # remove tmp
          - type: move
            id: move_account_id
            from: attributes.temp.account_id
            if: "'account_id' in attributes.temp"
            to: attributes.account_id
            output: remove
no account_id in the web browser
@Pranay I am willing to write you guys a markdown file with a howto
n
you can remove
trace_parse
operator and
move_message
operator
h
i tried
accountId
instead of
account_id
I also tried to just use "level" that is debug/info/error
no success
there is one thing I don't understand
if I move these things to "attributes" I don't mentally understand this
- type: trace_parser
            id: trace_parse
            trace_id:
              parse_from: attributes.temp.traceId
here you put it to trace_id
in your example
- type: move
            id: move_enduserid
            from: attributes.temp.extra.endUserId
            if: "'extra' in attributes.temp and 'endUserId' in attributes.temp.extra"
            to: attributes.endUserId
            output: remove
the enduser Id is now in attributes .. .not in endUserId "toplevel"
why are there 2 different concepts?
are they "equivalent"?
n
We follow the otel log model https://opentelemetry.io/docs/specs/otel/logs/data-model/ TraceID is a top level key and is a standard, while
endUserId
is an attribute specific to my log structure
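As a rough illustration of that distinction, here is a small Python sketch. The `LogRecord` class and `apply_parsed` function are hypothetical names for this example; only the split between top-level fields and attributes follows the data model linked above.

```python
from dataclasses import dataclass, field


@dataclass
class LogRecord:
    # Top-level fields that the OTel log data model defines for every record.
    body: str = ""
    trace_id: str = ""
    span_id: str = ""
    severity_text: str = ""
    # Free-form key/value pairs that are specific to one application.
    attributes: dict = field(default_factory=dict)


def apply_parsed(record: LogRecord, temp: dict) -> LogRecord:
    # trace_parser analogue: standard ids land in top-level fields.
    record.trace_id = temp.get("traceId", record.trace_id)
    record.span_id = temp.get("spanId", record.span_id)
    # move analogue: application-specific values go under attributes.
    extra = temp.get("extra", {})
    if "endUserId" in extra:
        record.attributes["endUserId"] = extra["endUserId"]
    return record


rec = apply_parsed(LogRecord(), {"traceId": "abc", "extra": {"endUserId": "u1"}})
print(rec.trace_id, rec.attributes)
```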
h
ok that is a nice information
how can I display the attributes in the log view in signoz? maybe it's "there" and I am just too stupid to see it
n
If you expand your log line you will be able to see the attributes
h
yes - there is no "level" and no "account_id" 😞
n
But this config seems wrong: https://signoz-community.slack.com/archives/C01HWQ1R0BC/p1687421225834239?thread_ts=1687411148.945629&cid=C01HWQ1R0BC The last two operators are doing the same thing. Can you clean it up and share the config you are adding?
h
yes
so this is always a chain?
A->B->C -> remove -> done
n
h
body
{
  ...
  "level": "debug",
  ...
}
rules
operators:
          # match for a body in json type
          - default: noop
            id: router_signoz
            routes:
              - expr: 'body matches "^{.*}$"'
                output: parse_json
            type: router
          # parse body as json to tmp
          - id: parse_json
            type: json_parser
            parse_from: body
            parse_to: attributes.temp
            output: trace_parse
          # parse trace id
          - type: trace_parser
            id: trace_parse
            trace_id:
              parse_from: attributes.temp.traceId
            span_id:
              parse_from: attributes.temp.spanId
            output: move_level
          # check if level attribute exist and move it to the attributes
          - type: move
            id: move_level
            from: attributes.temp.level
            to: attributes.level
            if: "'level' in attributes.temp"
            output: remove
          # remove temp
          - id: remove
            type: remove
            field: attributes.temp
            output: noop
          - id: noop
            type: noop
I can kill the chart and reinstall it?
n
Are you using an override yaml to apply the above ^
h
image.png
yes firing this via ansible
i removed the chart (including the uninstall instructions you provide on your website) and now I am reinstalling the chart
That is an issue of its own 🙈🙈🙈
TASK [k8s_install_signoz : waiting for frontend pod to become ready (max 30m - can be a bit unstable)]
But I don't want to open this can of worms, too 🎉🎉
n
image.png
did you add it to your pipeline as well?
h
I have just that section (+ingress, storage class..) and install it via
helm install ... -f myvalues.yaml
for the operators I have only that section for operators overwritten
I do a lot of things here during installation e.g.
also from my values.yaml (tpl file for ansible)
frontend:
  ingress:
    enabled: true
    className: traefik
    hosts:
      - host: "signoz.{{ domain }}"
        paths:
          - path: /
            pathType: ImplementationSpecific
            port: 3301
    annotations:
      traefik.ingress.kubernetes.io/router.entrypoints: websecure
    tls:
      - secretName: "{{ cluster_name }}-{{ domain_name | replace('.','-') }}-wildcard-tls"
n
Okay, I'm not an expert, so I can't comment on that part. But if your configurations are getting applied then I think it's good to go; Prashant can help with best practices.
Did the new configuration get applied?
h
Well 😉
[k8s_install_signoz : waiting for frontend pod to become ready (max 30m - can be a bit unstable)]
That's the next bugo on my list.... getting coffee
{
  "level": "info",
  "message": "xxx"
}
oh .. here is the issue
helm show values signoz/signoz | grep 'logstransform/parse_json'
has no entries
$ helm show values signoz/signoz | grep 'logstransform/internal'  
      logstransform/internal:
          processors: [logstransform/internal, batch]
so this section is not relevant to the helm charts 🎉🎉🎉
it can't work
image.png
@nitya-signoz thank you!
I have to mess with the /internals
n
Great 👍
h
Should I write a bug report for this?
n
Sure please go ahead
h
@Pranay I will write a short article how to set this up.
image.png
Omg 🎉🎉🎉🥳🥳🥳
Can I remove this attribute also from the body?
https://github.com/SigNoz/charts/issues/243 < done and I am nice πŸ™‚
p
I will write a short article how to set this up.
@Harald Fielker that would be very helpful. Please share it with the community also after you have written the guide
h
@nitya-signoz any idea how I can remove the extracted fields from the body?
[k8s_install_signoz : waiting for frontend pod to become ready (max 30m - can be a bit unstable)]
this is such a pain 😞
There is a chicken/egg issue within the helm charts.
I burned weeks on this
* '' has invalid keys: pipeline
I have no idea how to add the pipeline into your helm chart 😞
I really really want - but I am giving up for today 🙈🙈
n
@Prashant Shahi will be able to help you on above once he gets some bandwidth.
h
I want to document this - it's probably not so hard. But it's pure trial and error for me 😞
p
* '' has invalid keys: pipeline
@Harald Fielker In which component are you getting this error? also, could you share your complete override-values.yaml?
h
Nop - this is 1000% pure trial and error 🙂
We have to come to a point where I can follow the documentation. I am going to reference values.yaml from the helm chart.
this line
it is the "`logstransform/internal:`"
pure wild guessing and speculations without any documentation - I replaced the operators and added pipeline
log says something is not right in the above section
h
I am rephrasing my question.
given your values.yaml - where do I need to put a "pipeline" section?
operators are working if I make a
otelCollector:
  config:
    processors:
      logstransform/internal:
        operators:
....
how can I use pipeline? is it a child of operators?
p
nopes, it is not
willing to create a kind, minikube, etc k8s cluster in a repo and then we can do the testing
p
why is the logstransform processor added to
otelCollector
?
log collection is handled by OtelAgent of k8s-infra dependency chart
you can update operators using
presets
section
h
Sorry let me write one sentence of rant and then I'll answer your question in a constructive way (please forgive me) - ok with that?
p
okay
h
"why is the logstransform processor added to
otelCollector
?" ^^^ I have been asking for a solution here for 3 months 🙂 I love open source, but this turns out to be very time-burning. And I know it's a tiny thing.
I have no idea how you organized your values.yaml files
willing to help to debug and document this
what I want - multiple custom properties (That I collect via logrus in golang) to be available as attributes in the signoz logger
if we have to do this by 1) otel logparsers or 2) my needing to use a logrus -> otel lib - I don't care
I just want to have my account_id, foo_id, whatever_id as attributes 🙂
In Ddog that is - "put it to the extras, send datadog a gazillion of $ to index them, see it in the ui"
p
In SigNoz, Log collection is enabled by default in the same K8s cluster. And in other clusters, it can be easily done using K8s-Infra chart. https://signoz.io/docs/tutorial/kubernetes-infra-metrics/
It requires additional work for setting up the parser based on the nature of the logs.
h
I understand that and based on the work of @nitya-signoz i have one up and working
however - I don't understand how and where to put a pipeline section in your values.yaml
I am 1000% totally cool if you tell me "that's not possible"
That would be an awesome section for a documentation 🙂
p
h
I can collect logs
I want to have custom fields from my body being parsed as attributes
I know how to do that - with the "operators" section in the values.yaml
p
At the end of the k8s pod logs, it mentions about operators.
yes, you are right
h
that's solved
however
it will make super super complicated rules, if i want to do that for 10 custom fields
A -> B -> C ..... P -> Q ...
that is why we have pipelines - if I can't use them (with the values.yaml) 1000% cool - I go with the chains
your colleague @nitya-signoz came up with the suggestion of pipelines
n
So basically it's a new processor that he wants to add @Prashant Shahi
h
You have better words for it - but yes 🙂
p
I see. But
otelCollector.config
is not the right place for it, is it?
it should be included under
k8s-infra.otelAgent.config
h
I really love the McDonalds concept for this 🙂
- extract:
	- from: body
	- json_fields
		- a
		- a
		- c
	- to: attributes
something like this - and I bet no more stupid questions
(if there is no solution, I will make my ansible script create a values.yaml chains following your syntax - just base on an input like this)
n
So basically anywhere works; keeping it at the infra chart level will reduce the load and keep the processing at node level.
p
okay, fair enough.
@Harald Fielker you would have to include the new processor under
config.service.pipelines.logs
if it helps, we can get on a quick call to take a look at this
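The point about listing the new processor in the pipeline can be checked mechanically. The following Python sketch is only an illustration: `unreferenced_processors` is a hypothetical helper, and the dict stands in for an already parsed collector config with the processor names mentioned in this thread.

```python
def unreferenced_processors(config):
    """Return processors that are defined but not used by any pipeline."""
    defined = set(config.get("processors", {}))
    used = set()
    pipelines = config.get("service", {}).get("pipelines", {})
    for pipeline in pipelines.values():
        used.update(pipeline.get("processors", []))
    return sorted(defined - used)


# Stand-in for a parsed config: parse_json is defined but never referenced.
config = {
    "processors": {
        "logstransform/internal": {},
        "logstransform/parse_json": {},
        "batch": {},
    },
    "service": {
        "pipelines": {
            "logs": {"processors": ["logstransform/internal", "batch"]},
        },
    },
}
print(unreferenced_processors(config))
```

A processor that shows up in this list is defined but sits in no pipeline, which would match the symptom earlier in the thread where the new section had no visible effect.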
h
I can create a minikube / kind demo cluster
p
yes, I would recommend
kind
. We use it extensively.
h
(i am not the only one having troubles with this according to git)
Ok - deal - I make it and we continue here
give me some time - it's 5:15 pm and ppl are going home
Good Morning.
https://github.com/egandro/signoz-bugos << I have a full example using kind
it's creating a local registry
I have a sample go application with a sample logger I am using
I provided two values.yaml - one plain and one with operators
I also documented the race condition bug that sometimes is there
(that was a lot of work)
Good morning. Anything I can do to help here?
Does the example suit your needs?
p
Awesome! Thanks @Harald Fielker Would you be willing to publish a tutorial on our website on how you solved this? https://signoz.io/docs/tutorials/ You can raise a PR here - https://github.com/SigNoz/signoz.io/tree/main/docs/tutorial cc @Nočnica Mellifera
h
Dude 🙂 I didn't "Solve" it
there are 2 unsolved issues
or one issue and one missing feature
I worked 1/2 day on creating this sample. My deep wish from the bottom of my heart - someone can fix them within a reasonable amount of time.
Willing to spend 1-2-3 days of my own time, willing to jump with you in a call
p
ah, ok 🙂
h
Hello greetings.
Any updates here? Anything I can do? Any idea if we should schedule a call?
Willing to spend my own time on fixing this.
Good morning.
Any idea when we can jump in a phone call?
p
Hi @Harald Fielker, thanks for all the work on creating complete repository to look into this.
Initially I suggested using the
logstransform
processor on OtelAgent, but it seems that processor is not part of the OpenTelemetry Contrib distribution, only of the SigNoz OtelCollector. https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/processor/logstransformprocessor#logs-transform-processor
@Harald Fielker For the
Want 1
, you can do it using the
override-values.yaml
below:
k8s-infra:
  presets:
    logsCollection:
      blacklist:
        namespaces:
          - default
          - kube-node-lease
          - kube-public
          - kube-system
          - local-path-storage
          - platform
          # we only want this
          #- the-app

otelCollector:
  config:
    processors:
      logstransform/custom:
        operators:
          # match for a body in json type
          - default: noop
            id: router_signoz
            routes:
              - expr: 'body matches "^{.*}$"'
                output: parse_json
            type: router
          # parse body as json to tmp
          - id: parse_json
            type: json_parser
            parse_from: body
            parse_to: attributes.temp
            output: trace_parse
          # parse trace id
          - type: trace_parser
            id: trace_parse
            trace_id:
              parse_from: attributes.temp.traceId
            span_id:
              parse_from: attributes.temp.spanId
            output: move_level
          # check if level attribute exist and move it to the attributes
          - type: move
            id: move_level
            from: attributes.temp.level
            to: attributes.level
            if: "'level' in attributes.temp"
            output: add_database
          # add database resource from body
          - type: add
            id: add_database
            field: resource.database
            value: EXPR(attributes.temp.database)
            output: remove
          # remove temp
          - id: remove
            type: remove
            field: attributes.temp
            output: noop
          # done
          - id: noop
            type: noop
    service:
      pipelines:
        logs:
          receivers: [otlp]
          processors: [logstransform/internal, logstransform/custom, batch]
          exporters: [clickhouselogsexporter]
```
- type: add
  id: add_database
  field: resource.database
  value: EXPR(attributes.temp.database)
  output: remove
```
That is for
database
, you could do the same for other data.
For the
Want 2
, it helps to have a K8s cluster with enough resources.
@Harald Fielker do let us know if this helped
h
Yes I am going to test this today or tomorrow.
For the
want 2
I hope that's a joke
why not do a "while <port of the service I want is not open>; sleep 10" script?
that's an easy fix for you
relying on luck for the services to spawn in the correct order is very very bad
p
we do have init containers in place which wait for dependent components to be ready
h
while ! nc -z localhost 8080; do   
  sleep 0.1 # wait 1/10 of a second before checking again
done
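The same wait loop can be sketched in Python with a timeout so it cannot hang forever. This is only an illustration of the suggestion; `wait_for_port` is a hypothetical helper, not something the chart ships.

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=0.1):
    """Poll until a TCP connection to host:port succeeds or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=max(interval, 0.1)):
                return True
        except OSError:
            # Refused or unreachable: give up once the deadline has passed.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

A call such as `wait_for_port("localhost", 8080, timeout=600)` would then return `False` instead of looping without end when the service never comes up.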
there is a race condition in your software 🙂
not willing to debug this in my free time - I did head bashing for months
it would be very very nice if you put this - at least - as "known bug/don't fix" at some place
at the moment you are suggesting the following thing "Throw more money on your k8s cluster, because the 3-4 lines fix is something we don't do". Which isn't pleasing
p
```
helm install my-release signoz/signoz -n platform \
  --wait \
  --timeout 10m0s \
  ...
```
^ this has been working out fine for us
h
Let me be very very polite here.
p
there is a race condition in your software
appreciate your report, will look into it
h
install-signoz: helm-prepare
	kubectl create ns platform
	helm --namespace platform install my-release signoz/signoz -f ./values.yaml
	sleep 30
	@echo waiting until frontend pod is ready... this is sometimes super unstable and needs to be fixed!
	kubectl -n platform wait --for=condition=ready \
      pod -l "app.kubernetes.io/component=frontend" --timeout=30m
whenever the pods runs into this - you see it in my makefile
I could reproduce this in kind in 1 of 10 attempts
when I had a look at the logs of the broken pods, they were running into 404s or host-not-found errors
and then I stopped
debugging race conditions is one of the ugliest things you can have as developer - so I make this a "you" and not a "me" problem 🙂
but that is the beauty about this .. you can now (with my kind example) run a "make create-cluster install-signoz ... make destroy-cluster" in a while(true) bash loop and wait until it's getting stuck
@Prashant Shahi I finally solved this task
I install Signoz via Ansible and I could template this
{% for item in attributes %}
          - type: move
            id: move_{{ item }}
            from: attributes.temp.{{ item }}
            to: attributes.{{ item }}
            if: "'{{ item }}' in attributes.temp"
{% endfor %}
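For anyone not templating with Ansible, the same generation step can be sketched in plain Python. This is a hedged illustration: `build_move_operators` is a hypothetical helper, and unlike the template above it sets `output` explicitly so the chain order is visible; the last operator hands over to an operator with the id `remove`, as in the configs earlier in the thread.

```python
import json


def build_move_operators(fields, last_output="remove"):
    """Build one conditional `move` operator per field, chained via `output`."""
    operators = []
    for index, name in enumerate(fields):
        is_last = index == len(fields) - 1
        operators.append({
            "type": "move",
            "id": f"move_{name}",
            "from": f"attributes.temp.{name}",
            "to": f"attributes.{name}",
            # Skip the move when the parsed body has no such key.
            "if": f"'{name}' in attributes.temp",
            # Hand over to the next move, or to the cleanup step at the end.
            "output": last_output if is_last else f"move_{fields[index + 1]}",
        })
    return operators


print(json.dumps(build_move_operators(["account_id", "level"]), indent=2))
```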