# support
d
Hi guys, my name is Diogo, nice to meet you all! I'm fairly new to SigNoz and to OpenTelemetry, and I hope this is the right channel to ask this. I'm trying to set up logging for my Python application in such a way that I can correlate the log lines in SigNoz with the traces. I'm having an issue though: even though, using `opentelemetry-instrumentation-logging`, I can get all the information into the log lines (e.g. `2023-11-28 02:07:35,925 INFO [studioregistration.services.registration] [registration.py:51] [trace_id=15789f18fc808508e6ffae468dd0b655 span_id=8f6067a7de71637c resource.service.name=registration trace_sampled=True]`), when I click "View Details" there's no span ID, no trace ID, nothing in the log fields. So when I try to reach these logs from the "Traces" section, I can't find any logs with, say, that trace ID, because those log fields are empty. I'm sure I must be doing something wrong, but what is it? Any ideas? I saw some people talking about using parsers in the collector, but is that really necessary? Thank you in advance!
l
Hi Diogo, can you share a sample JSON output from one of the log lines? Here's one from a Java app.
v
@nitya-signoz ^
n
Seems like you have added trace_id and span_id in the body, so you will need to parse them out of the body. You can add this parser to your collector config and it should parse your trace_id and span_id: https://github.com/SigNoz/signoz/commit/d165f727acebddb519d1c80e8ccae74642c843c9 Or you can use our pipelines feature to extract them from the UI: https://signoz.io/docs/logs-pipelines/introduction/
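For reference, a minimal sketch of what such a parser can look like in the collector config (the processor/operator names and the regex here are illustrative assumptions; the linked commit and docs have the exact version SigNoz ships):

```yaml
processors:
  logstransform/internal:
    operators:
      # Hypothetical example: extract trace_id/span_id from the log body text...
      - type: regex_parser
        regex: 'trace_id=(?P<trace_id>[0-9a-f]+) span_id=(?P<span_id>[0-9a-f]+)'
        parse_from: body
        parse_to: attributes.tmp
      # ...then promote them to the top-level trace fields of the log record.
      - type: trace_parser
        trace_id:
          parse_from: attributes.tmp.trace_id
        span_id:
          parse_from: attributes.tmp.span_id
```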
d
Thanks, guys! @Leigh Finch thanks for that; did you have to add the log parser to get that? Here's a JSON example of one of my log lines:
```json
{
  "body": "2023-11-28 09:50:37,933 INFO [studioregistration.services.registration] [registration.py:51] [trace_id=8f6d9e783f88a869c54b07af617c8bde span_id=691154848cc4707f resource.service.name=registration trace_sampled=True] - Registered user: {\"uuid\": \"\", \"email\": \"r234r243\", \"name\": \"sdfsdr234\"}",
  "id": "2Ynbjv17UEwsow3XRVzPy9V21W2",
  "timestamp": "2023-11-28T09:50:37.934104792Z",
  "attributes": {
    "log_file_path": "/var/log/pods/default_registration-855dfb97f6-h9vxs_0544a01e-e782-410b-8530-eb43df531ba9/registration/0.log",
    "log_iostream": "stderr",
    "logtag": "F",
    "time": "2023-11-28T09:50:37.934104792Z"
  },
  "resources": {
    "k8s_cluster_name": "",
    "k8s_container_name": "registration",
    "k8s_container_restart_count": "0",
    "k8s_deployment_name": "registration",
    "k8s_namespace_name": "default",
    "k8s_node_name": "deepopinion-worker",
    "k8s_pod_name": "registration-855dfb97f6-h9vxs",
    "k8s_pod_start_time": "2023-11-28 09:50:06 +0000 UTC",
    "k8s_pod_uid": "0544a01e-e782-410b-8530-eb43df531ba9"
  },
  "severity_text": "",
  "severity_number": 0,
  "span_id": "",
  "trace_flags": 0,
  "trace_id": ""
}
```
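(To make the problem concrete: in the record above the trace context exists only as text inside `body`, while the top-level `trace_id`/`span_id` fields are empty, so something has to run a regex over the body before those fields can be filled in. The sketch below only illustrates the match in Python; in practice the collector parser does this work:)

```python
import re

# The body string from the JSON record above.
body = (
    "2023-11-28 09:50:37,933 INFO [studioregistration.services.registration] "
    "[registration.py:51] [trace_id=8f6d9e783f88a869c54b07af617c8bde "
    "span_id=691154848cc4707f resource.service.name=registration "
    "trace_sampled=True] - Registered user"
)

# OTel trace IDs are 32 hex chars, span IDs are 16.
pattern = re.compile(
    r"trace_id=(?P<trace_id>[0-9a-f]{32}) span_id=(?P<span_id>[0-9a-f]{16})"
)
match = pattern.search(body)
print(match.group("trace_id"))  # 8f6d9e783f88a869c54b07af617c8bde
print(match.group("span_id"))   # 691154848cc4707f
```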
@nitya-signoz thanks, in that case I'd prefer to change the collector so that I can store logs with the correct attributes from the get-go.
n
Yeah, adding the parser in the collector will do the job. The pipelines feature is just a UI way of doing it, which internally creates the parser; you can try that as well.
d
Got it. Thanks! ☺️
Alright, it worked for `span_id`, but not yet for `trace_id`; I tried changing the regex to match with a word boundary (`\b`) on the left instead of at the start of the string, but it still doesn't work (even though it matches just fine on the regex website I used as a reference).
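(For what it's worth, in Python's `re`, which behaves like the PCRE flavor regex101 defaults to, a `\b` before `trace_id` does match at that position; this sketch only demonstrates the intended anchoring, not the collector's regex engine, so the failure must lie elsewhere:)

```python
import re

line = "[trace_id=15789f18fc808508e6ffae468dd0b655 span_id=8f6067a7de71637c]"

# \btrace_id= anchors at the word boundary after "[" rather than at start of string.
m = re.search(r"\btrace_id=(?P<trace_id>\w+)", line)
print(m.group("trace_id"))  # 15789f18fc808508e6ffae468dd0b655
```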
@nitya-signoz is there a way to change the config in place (inside the container) and just reload the collector there, without having to redeploy it? (I'm running it within a local Kind cluster)
n
Try the pipelines UI; it will generate the config and automatically restart the collector.
d
OK, I'll give that a try, thanks!
@nitya-signoz thanks for the help with this. There seems to be a problem, though: the regex parsing works differently in the collector processor compared to the log pipeline UI. When I apply the regex in the pipeline, it works just fine and processes the log lines appropriately; however, in the collector processor, it misses some of the capture groups of the regex. This is the pattern I'm using:
```
(?P<severity_text>(NOTSET|DEBUG|INFO|WARN|WARNING|ERROR|FATAL|CRITICAL))\/(?P<severity_number>\d+) .+trace_id=(?P<trace_id>[a-zA-Z0-9]+) span_id=(?P<span_id>[a-zA-Z0-9]+) resource\.service\.name=(?P<service_name>[-\w]+)
```
(starting with a blank space) And this is an example log line that should produce all the matches when processed by the collector, but only gets `trace_id` and `span_id`:
```
2023-11-30 01:14:24,681 INFO/20 [studioregistration.services.registration] [registration.py:51] [trace_id=381d38dc1d24127da0482b6b493138dc span_id=c120a55676c4fd6a resource.service.name=studio-registration trace_sampled=True] - Registered user: {\"uuid\": \"\", \"email\": \"dasa\", \"name\": \"123123\"}
```
Maybe the pipeline UI and the collector processor use different regular expression styles?
Alright, I finally made it work. Here's my processor, which now organizes the attributes correctly:
```yaml
logstransform/internal:
  operators:
    - type: regex_parser
      id: logs_to_tags
      # https://regex101.com/r/TUZhgm/5
      regex: ' (?P<severity>NOTSET|DEBUG|INFO|WARN|WARNING|ERROR|FATAL|CRITICAL).*trace_id=(?P<trace_id>[-\w]+) span_id=(?P<span_id>[-\w]+) resource\.service\.name=(?P<service_name>[-\w]+)'
      parse_from: body
      parse_to: attributes.temp_trace
      if: 'body matches "trace_id=\\w+.+span_id=\\w+"'
      output: trace_parser
    - type: trace_parser
      id: trace_parser
      trace_id:
        parse_from: attributes.temp_trace.trace_id
      span_id:
        parse_from: attributes.temp_trace.span_id
      output: severity_parser
    - type: severity_parser
      id: severity_parser
      parse_from: attributes.temp_trace.severity
      if: '"severity" in attributes.temp_trace'
      mapping:
        default: NOTSET
        debug: DEBUG
        info: INFO
        warn:
          - WARN
          - WARNING
        error: ERROR
        fatal:
          - FATAL
          - CRITICAL
      output: move_service_name
    - type: move
      id: move_service_name
      from: attributes.temp_trace.service_name
      to: resource.service_name
      if: '"service_name" in attributes.temp_trace'
      output: remove_temp
    - type: remove
      id: remove_temp
      field: attributes.temp_trace
      if: '"temp_trace" in attributes'
```
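For anyone landing here later: a quick way to sanity-check the operator's regex against a sample line is a small script. Python's `re` accepts the same `(?P<name>...)` named groups as Go's `regexp`, though the two engines differ in other features, so treat this as a local sketch rather than a guarantee of collector behavior:

```python
import re

# The regex from the regex_parser operator above (note the leading space).
PATTERN = re.compile(
    r" (?P<severity>NOTSET|DEBUG|INFO|WARN|WARNING|ERROR|FATAL|CRITICAL)"
    r".*trace_id=(?P<trace_id>[-\w]+) span_id=(?P<span_id>[-\w]+)"
    r" resource\.service\.name=(?P<service_name>[-\w]+)"
)

# The sample log line from earlier in the thread.
line = (
    "2023-11-30 01:14:24,681 INFO/20 [studioregistration.services.registration] "
    "[registration.py:51] [trace_id=381d38dc1d24127da0482b6b493138dc "
    "span_id=c120a55676c4fd6a resource.service.name=studio-registration "
    "trace_sampled=True] - Registered user"
)

m = PATTERN.search(line)
print(m.groupdict())  # all four named groups should be populated
```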