Regarding version 14.0 in signoz, logs page seems ...
# support
a
Regarding version 14.0 in SigNoz, the logs page seems really buggy. Problems I am facing: 1. Getting repeated attribute keys everywhere 2. Search not giving proper results
Quite often search is not returning proper results
p
@nitya-signoz do you have more ideas on what could be going on here? @Apoorva did you upgrade from a previous version, or was this a fresh install?
a
@Pranay I upgraded from a previous version
p
Do you remember which version? Were things working fine in that?
n
Can you give us an idea of when you started noticing the duplicate keys? Did you change something in the configuration after which this started?
a
@Pranay Upgraded from 0.12.0, things were smooth in 0.12.0. @Nityananda Gohain I upgraded and started seeing these UI issues. To be clear, these duplicates are in the UI only and not in the data
[screenshot attached: Screen Shot 2023-01-14 at 5.35.24 PM.png]
p
@Apoorva Have you enabled multiple shards in ClickHouse?
a
@Pranay To the best of my understanding, no. Here is my override-values.yaml (updated):
global:
  storageClass: gp2-resizable

clickhouse:
  cloud: aws
  storageClass: gp2-resizable
k8s-infra:
  # -- Whether to enable K8s infra monitoring
  enabled: true

  # -- Endpoint/IP Address of the SigNoz or any other OpenTelemetry backend.
  # Set it to `ingest.signoz.io:4317` for SigNoz SaaS.
  #
  # If set to null and the chart is installed as dependency, it will attempt
  # to autogenerate the endpoint of SigNoz OtelCollector.
  otelCollectorEndpoint: null

  # -- Whether the OTLP endpoint is insecure.
  # Set this to false, in case of secure OTLP endpoint.
  otelInsecure: true

  # -- API key of SigNoz SaaS
  signozApiKey: ""

  # -- Kubernetes cluster domain used when k8s-infra component installed in different namespace
  clusterDomain: cluster.local

  # -- Which namespace to install k8s-infra components.
  # By default installed to the namespace same as the chart.
  namespace: ""

  # Default values for OtelAgent
  otelAgent:
    name: "otel-agent"
    image:
      registry: docker.io
      repository: otel/opentelemetry-collector-contrib
      tag: 0.62.0
      pullPolicy: IfNotPresent
    imagePullSecrets: []

    # OpenTelemetry Collector executable
    command:
      # -- OtelAgent command name
      name: /otelcol-contrib
      # -- OtelAgent command extra arguments
      extraArgs: []

    configMap:
      # -- Specifies whether a configMap should be created (true by default)
      create: true

    # OtelAgent service
    service:
      # -- Annotations to use by service associated to OtelAgent
      annotations: {}
      # -- Service Type: LoadBalancer (allows external access) or NodePort (more secure, no extra cost)
      type: ClusterIP

    # -- Configure resource requests and limits. Update according to your own use
    # case as these values might not be suitable for your workload.
    # ref: http://kubernetes.io/docs/user-guide/compute-resources/
    # @default -- See `values.yaml` for defaults
    resources:
      requests:
        cpu: 100m
        memory: 100Mi
      # limits:
      #   cpu: 1000m
      #   memory: 1Gi

    # -- Configurations for OtelAgent
    # @default -- See `values.yaml` for defaults
    config:
      receivers:
        otlp:
          protocols:
            grpc:
              endpoint: 0.0.0.0:4317
            http:
              endpoint: 0.0.0.0:4318
        hostmetrics:
          collection_interval: 30s
          scrapers:
            cpu: {}
            load: {}
            memory: {}
            disk: {}
            filesystem: {}
            network: {}
        kubeletstats:
          collection_interval: 60s
          auth_type: "serviceAccount"
          endpoint: "${K8S_NODE_NAME}:10250"
          insecure_skip_verify: true
        filelog/k8s:
          exclude:
          - /var/log/pods/kube-system_*/*/*.log
          - /var/log/pods/*_hotrod-*/*/*.log
          - /var/log/pods/*_locust-*_*/*/*.log
          include:
          - /var/log/pods/*/*/*.log
          include_file_name: false
          include_file_path: true
          operators:
          - id: get-format
            routes:
            - expr: body matches "^\\{"
              output: parser-docker
            - expr: body matches "^[^ Z]+ "
              output: parser-crio
            - expr: body matches "^[^ Z]+Z"
              output: parser-containerd
            type: router
          - id: parser-crio
            output: extract_metadata_from_filepath
            regex: ^(?P<time>[^ Z]+) (?P<stream>stdout|stderr) (?P<logtag>[^ ]*) ?(?P<log>.*)$
            timestamp:
              layout: "2006-01-02T15:04:05.000000000-07:00"
              layout_type: gotime
              parse_from: attributes.time
            type: regex_parser
          - id: parser-containerd
            output: extract_metadata_from_filepath
            regex: ^(?P<time>[^ ^Z]+Z) (?P<stream>stdout|stderr) (?P<logtag>[^ ]*) ?(?P<log>.*)$
            timestamp:
              layout: '%Y-%m-%dT%H:%M:%S.%LZ'
              parse_from: attributes.time
            type: regex_parser
          - id: parser-docker
            output: extract_metadata_from_filepath
            timestamp:
              layout: '%Y-%m-%dT%H:%M:%S.%LZ'
              parse_from: attributes.time
            type: json_parser
          - id: extract_metadata_from_filepath
            parse_from: attributes["log.file.path"]
            regex: ^.*\/(?P<namespace>[^_]+)_(?P<pod_name>[^_]+)_(?P<uid>[a-f0-9\-]+)\/(?P<container_name>[^\._]+)\/(?P<restart_count>\d+)\.log$
            type: regex_parser
          - id: get-json
            routes:
            - expr: attributes["k8s.namespace.name"] == "services"
              output: parser-nucash-json
            type: router
          - id: parser-nucash-json
            parse_from: attributes.log
            type: json_parser
          - from: attributes.log
            to: body
            type: move
        prometheus:
          config:
            global:
              scrape_interval: 60s
            scrape_configs:
              - job_name: otel-agent
                static_configs:
                - targets:
                  - ${MY_POD_IP}:8888
      processors:
        batch:
          send_batch_size: 5000
          timeout: 5s
        # Resource detection processor config.
        # ref: https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/processor/resourcedetectionprocessor/README.md
        resourcedetection:
          detectors: [env, system, eks, ec2]  # Include ec2/eks for AWS, gce/gke for GCP and azure/aks for Azure
          # Using OTEL_RESOURCE_ATTRIBUTES envvar, env detector adds custom labels
          timeout: 2s
          system:
            hostname_sources: [os]  # Alternatively, use [dns,os] for setting FQDN as host.name and os as fallback
        # Memory Limiter processor config.
        # If set to null, will be overridden with values based on k8s resource limits.
        # ref: https://github.com/open-telemetry/opentelemetry-collector/tree/main/processor/memorylimiterprocessor
        memory_limiter: null
      extensions:
        health_check:
          endpoint: 0.0.0.0:13133
        zpages:
          endpoint: localhost:55679
        pprof:
          endpoint: localhost:1777
      exporters:
        otlp:
          endpoint: ${OTEL_EXPORTER_OTLP_ENDPOINT}
          tls:
            insecure: ${OTEL_EXPORTER_OTLP_INSECURE}
          headers:
            "signoz-access-token": "Bearer ${SIGNOZ_API_KEY}"
      service:
        telemetry:
          metrics:
            address: 0.0.0.0:8888
        extensions: [health_check, zpages]
        pipelines:
          traces:
            receivers: [otlp]
            processors: [batch]
            exporters: [otlp]
          metrics:
            receivers: [otlp]
            processors: [batch]
            exporters: [otlp]
          metrics/generic:
            receivers: [hostmetrics, prometheus, kubeletstats]
            processors: [resourcedetection, batch]
            exporters: [otlp]
          logs:
            receivers: [filelog/k8s, otlp]
            processors: [batch]
            exporters: [otlp]
n
Interesting, can you share the API response of the '/fields' API when you load the logs page?
Yeah, you have the same attribute in resources as well as in attributes, which is why it's crashing right now.
a
@Nityananda Gohain it used to work; any quick fix?
n
Do you remember adding the resource attribute?
You can truncate the logs_resource_keys table for a quick fix.
But if data with those resource attributes is ingested again, it will reappear.
Can you create a GitHub issue for this?
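(A minimal sketch of the quick fix suggested above, to run in clickhouse-client; it assumes the table lives in the default signoz_logs database, so adjust the name if yours differs.)
-- Clears the cached resource-key names that feed the UI suggestions.
-- As noted above, the keys reappear once data carrying those resource
-- attributes is ingested again.
TRUNCATE TABLE signoz_logs.logs_resource_keys;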
a
@Apoorva let us know if truncating logs_resource_keys helped
a
@Ankit Nayan Didn't help, those keys immediately get generated again
n
@Apoorva By any chance are you sending data using OTLP which has those keys? From the config you have shared, no resource attributes are created from the filelog/k8s receiver.
a
@Nityananda Gohain To the best of my understanding: 1. Logs are being scraped by the daemonset pods on each node. 2. I am sending only traces and spans using OTLP. This same configuration was working really well before. Should I remove filelog/k8s?
n
Yes, you are correct. Let’s get on a call sometime today if that works for you?
a
Yes, sure, when? (Let me know what time works for you, I will accommodate)
n
2 P.M?
a
Sure works for me
n
Found the issue: the k8sattributes processor was adding duplicate attributes. Since we are already extracting all the attributes, the k8sattributes processor is not required for logs. We have manually disabled it for logs for Apoorva. @Prashant Shahi can you help with what will be the best way to disable it just for logs through override-values.yaml?
p
@nitya-signoz yes, already working on the same
n
Cool, thanks
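(For readers following along, a minimal sketch of one way to express this in override-values.yaml. The kubernetesAttributes preset key is an assumption — check the values.yaml of your k8s-infra chart version, where the key may differ or not exist — and turning it off affects every signal the agent sends, not only logs. Note that the fix eventually used later in this thread went the other way: keep k8sattributes and drop the duplicate keys from the filelog receiver's operators.)
k8s-infra:
  presets:
    # Assumed preset key -- verify against your chart version's values.yaml.
    # Disables the chart-injected k8sattributes processor so it no longer adds
    # resource attributes that the filelog operators already extract.
    kubernetesAttributes:
      enabled: false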
a
Even after these changes search is still failing
https://platform.dev1.nucash.net/api/v1/logs?q=k8s_namespace_name+IN+(%27services%27)&limit=25&orderBy=timestamp&order=desc&timestampStart=1673860453456000000&timestampEnd=1673860753456000000
Query String Parameters
q: k8s_namespace_name IN ('services')
limit: 25
orderBy: timestamp
order: desc
timestampStart: 1673860453456000000
timestampEnd: 1673860753456000000
is giving this response:
{
  "results": []
}
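(One way to cross-check the same filter directly in ClickHouse, using the k8s_namespace_name materialized column and the exact timestamps from the request above — the CREATE TABLE output shared below confirms the column exists; adjust the database/table name if yours differs.)
-- Same namespace filter and time window as the UI request above
-- (timestamps are nanoseconds since epoch).
SELECT count()
FROM signoz_logs.logs
WHERE k8s_namespace_name IN ('services')
  AND timestamp BETWEEN 1673860453456000000 AND 1673860753456000000;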
n
Can you share the output of
show create table signoz_logs.logs
a
CREATE TABLE signoz_logs.logs
(
    `timestamp` UInt64 CODEC(DoubleDelta, LZ4),
    `observed_timestamp` UInt64 CODEC(DoubleDelta, LZ4),
    `id` String CODEC(ZSTD(1)),
    `trace_id` String CODEC(ZSTD(1)),
    `span_id` String CODEC(ZSTD(1)),
    `trace_flags` UInt32,
    `severity_text` LowCardinality(String) CODEC(ZSTD(1)),
    `severity_number` UInt8,
    `body` String CODEC(ZSTD(2)),
    `resources_string_key` Array(String) CODEC(ZSTD(1)),
    `resources_string_value` Array(String) CODEC(ZSTD(1)),
    `attributes_string_key` Array(String) CODEC(ZSTD(1)),
    `attributes_string_value` Array(String) CODEC(ZSTD(1)),
    `attributes_int64_key` Array(String) CODEC(ZSTD(1)),
    `attributes_int64_value` Array(Int64) CODEC(ZSTD(1)),
    `attributes_float64_key` Array(String) CODEC(ZSTD(1)),
    `attributes_float64_value` Array(Float64) CODEC(ZSTD(1)),
    `k8s_container_name` String MATERIALIZED attributes_string_value[indexOf(attributes_string_key, 'k8s_container_name')] CODEC(LZ4),
    `message` String MATERIALIZED attributes_string_value[indexOf(attributes_string_key, 'message')] CODEC(LZ4),
    `level` String MATERIALIZED attributes_string_value[indexOf(attributes_string_key, 'level')] CODEC(LZ4),
    `msg` String MATERIALIZED attributes_string_value[indexOf(attributes_string_key, 'msg')] CODEC(LZ4),
    `errmsg` String MATERIALIZED attributes_string_value[indexOf(attributes_string_key, 'errmsg')] CODEC(LZ4),
    `func` String MATERIALIZED attributes_string_value[indexOf(attributes_string_key, 'func')] CODEC(LZ4),
    `component` String MATERIALIZED attributes_string_value[indexOf(attributes_string_key, 'component')] CODEC(LZ4),
    `file` String MATERIALIZED attributes_string_value[indexOf(attributes_string_key, 'file')] CODEC(LZ4),
    `k8s_namespace_name` String MATERIALIZED attributes_string_value[indexOf(attributes_string_key, 'k8s_namespace_name')] CODEC(LZ4),
    INDEX body_idx body TYPE tokenbf_v1(10240, 3, 0) GRANULARITY 4,
    INDEX id_minmax id TYPE minmax GRANULARITY 1,
    INDEX message_idx message TYPE bloom_filter(0.01) GRANULARITY 64,
    INDEX level_idx level TYPE bloom_filter(0.01) GRANULARITY 64,
    INDEX component_idx component TYPE bloom_filter(0.01) GRANULARITY 64,
    INDEX k8s_container_name_idx k8s_container_name TYPE bloom_filter(0.01) GRANULARITY 64,
    INDEX k8s_namespace_name_idx k8s_namespace_name TYPE bloom_filter(0.01) GRANULARITY 64
)
ENGINE = MergeTree
PARTITION BY toDate(timestamp / 1000000000)
ORDER BY (timestamp, id)
TTL toDateTime(timestamp / 1000000000) + toIntervalSecond(86400)
SETTINGS index_granularity = 8192
n
Yeah, the k8s_namespace_name materialized column is reading from the attributes array, which was populated previously and is now empty for the new data, which is why this is happening. Let me get back with a proper solution for this.
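(A quick query to verify that diagnosis against the schema above: it counts how often the key appears in the attributes arrays versus the resources arrays for recent data. Assumes the default signoz_logs database.)
-- If recent rows only populate in_resources while the materialized column
-- reads from attributes_string_*, the search above will keep coming back empty.
SELECT
    countIf(has(attributes_string_key, 'k8s_namespace_name')) AS in_attributes,
    countIf(has(resources_string_key, 'k8s_namespace_name'))  AS in_resources
FROM signoz_logs.logs
WHERE timestamp >= toUInt64(toUnixTimestamp(now() - INTERVAL 1 HOUR)) * 1000000000;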
s
@nitya-signoz does the presets processors part add attributes to the log record?
n
Yeah, the receiver part added the same keys, which is why duplicate keys are appearing. Had a discussion with Ankit to remove the keys from the receiver part and let the k8sattributes processor take care of it.
s
Yeah, I think that makes sense.
a
@Nityananda Gohain I tried updating my override-values file by updating here, but it's not reflecting in the infra-otel-agent configmap
I think there is some problem with the helm chart, because even after a complete installation the infra-agent configmap is not being updated. The solution recommended by @Nityananda Gohain worked smoothly (with a manual configmap edit); now there are no duplicate keys and search is working perfectly. It now looks something like this:
- id: extract_metadata_from_filepath
  parse_from: attributes["log.file.path"]
  regex: ^.*\/(?P<namespace>[^_]+)_(?P<pod_name>[^_]+)_(?P<uid>[a-f0-9\-]+)\/(?P<container_name>[^\._]+)\/(?P<restart_count>\d+)\.log$
  type: regex_parser
- id: get-json
  routes:
  - expr: attributes.namespace == "services" || attributes.namespace == "staging"
    output: parser-nucash-json
  type: router
- id: parser-nucash-json
  parse_from: attributes.log
  type: json_parser
- from: attributes.log
  to: body
  type: move
n
Yeah, that is correct @Apoorva.
If you are still facing issues with overriding the helm chart, Prashant will be able to help you out; please let us know.
a
@Prashant Shahi I have made this work with a manual configmap edit, but I think there might be some bug in the helm chart https://signoz-community.slack.com/archives/C01HWQ1R0BC/p1673876740144479?thread_ts=1673697393.250009&cid=C01HWQ1R0BC
p
@Apoorva can you tell me more about the issue and how it was resolved?
I have been working on the chart default configurations. I did not find any issues as such.
a
@Prashant Shahi The issue is:
1. I have a signoz-override.yaml and I tried to change the value of k8s-infra.presets.operators
2. I ran helm --namespace platform upgrade --install signoz signoz/signoz -f signoz-override.yaml
3. This used to update the configmap of the infra-agent, but now whatever I set in k8s-infra.presets.operators before running the above command, the configmap is not updated
4. So to solve this I updated the configmap manually and refreshed the pods
My updated configmap: https://signoz-community.slack.com/archives/C01HWQ1R0BC/p1673698255328439?thread_ts=1673697393.250009&cid=C01HWQ1R0BC
p
@Apoorva looks like you are using an old version of SigNoz along with old versions of the SigNoz and K8s-Infra charts. We have had a lot of new changes and improvements since then.
Upgrade guide here: https://signoz.io/docs/operate/kubernetes/#upgrade
I see that you upgraded some time back. After an upgrade, you must compare your custom override-values.yaml with the latest values.yaml changes in the signoz chart and sometimes the k8s-infra chart:
https://github.com/SigNoz/charts/blob/signoz-0.9.0/charts/signoz/values.yaml
https://github.com/SigNoz/charts/blob/k8s-infra-0.5.0/charts/k8s-infra/values.yaml
^ replace signoz-0.9.0 and k8s-infra-0.5.0 with the respective chart versions of the signoz and k8s-infra charts.
Hey @Apoorva! Do let me know if the issue is resolved.
If not, we could get on a call to resolve the issue.