Hi, How Can we filter the span/traces based on pod...
# support
v
Hi, how can we filter spans/traces based on pod name? One approach I can think of is to get the pod name via a system property and manually set it as an attribute on all the spans I emit from my service. Is there a better way to achieve this when deployed on K8s?
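For reference, the manual approach could look something like this (a minimal sketch, assuming the SDK reads the standard OTEL_RESOURCE_ATTRIBUTES environment variable; the container name is only a placeholder):
# Sketch only: expose the pod name via the Kubernetes Downward API and
# hand it to the OpenTelemetry SDK as a resource attribute.
spec:
  containers:
    - name: my-service            # placeholder container name
      env:
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        # Standard OpenTelemetry SDKs read OTEL_RESOURCE_ATTRIBUTES and
        # attach these key=value pairs to every span's resource.
        - name: OTEL_RESOURCE_ATTRIBUTES
          value: "k8s.pod.name=$(POD_NAME)"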
s
v
Thanks Srikanth, will try this out
Hi @Srikanth Chekuri, I have tried this and I'm able to get k8s.pod.ip as an attribute on my spans. I also need k8s.pod.name in the spans. I tried adding k8s.pod.name under pod_association in the k8sattributes section of override.yaml, but no luck. Can you please suggest how I can achieve this?
s
Don't add k8s.pod.name in the association rules. You would put it in extract::metadata. Example:
pod_association:
  - sources:
    - from: resource_attribute
      name: k8s.pod.ip
  - sources:
    - from: resource_attribute
      name: k8s.pod.uid
  - sources:
    - from: connection
extract:
  metadata:
    - k8s.namespace.name
    - k8s.pod.name
    - k8s.pod.uid
    - k8s.pod.start_time
    - k8s.deployment.name
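(The pod_association rules only tell the processor how to match incoming telemetry to a pod, by IP, UID, or connection; extract::metadata is what decides which pod attributes actually get attached as resource attributes on the spans.)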
v
Ohh, those are already present; I have not overridden those values. Still, I'm not able to get those attributes in the spans. Do I need to enable some flag to add these extracted values to the spans?
s
@Vishnu Teja Vallala can you do a dry run and share the final values.yaml that contains the collector config (with any sensitive info redacted)?
v
Thanks @Srikanth Chekuri. Actually, in the final values.yaml, the k8sattributes processor is not present in the otel-collector pipeline, even though I added it in override.yaml. The following is my config in override.yaml:
otelCollector:
  name: "otel-collector"
  replicaCount: 2

  resources:
    requests:
      cpu: 100m
      memory: 200Mi
    limits:
      cpu: "1"
      memory: 2Gi

  config:
    service:
      pipelines:
        traces:
          processors:
          - signozspanmetrics/prometheus
          - k8sattributes
          - batch

  nodeSelector:
    project: signoz
  tolerations:
  - key: "app"
    value: "dev-signoz"
    operator: "Equal"
    effect: "NoSchedule"
Do you see any issue with this config for otelCollector?
s
I don’t see an issue; what does the resulting config look like?
v
@Srikanth Chekuri I got the following with the helm template signoz/signoz command:
# Source: signoz/templates/otel-collector/configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: release-name-signoz-otel-collector
  labels:
    helm.sh/chart: signoz-0.20.0
    app.kubernetes.io/name: signoz
    app.kubernetes.io/instance: release-name
    app.kubernetes.io/component: otel-collector
    app.kubernetes.io/version: "0.24.0"
    app.kubernetes.io/managed-by: Helm
data:
  otel-collector-config.yaml: |-
    exporters:
      clickhouselogsexporter:
        dsn: tcp://${CLICKHOUSE_HOST}:${CLICKHOUSE_PORT}/?username=${CLICKHOUSE_USER}&password=${CLICKHOUSE_PASSWORD}
        retry_on_failure:
          enabled: true
          initial_interval: 5s
          max_elapsed_time: 300s
          max_interval: 30s
        sending_queue:
          queue_size: 100
        timeout: 10s
      clickhousemetricswrite:
        endpoint: tcp://${CLICKHOUSE_HOST}:${CLICKHOUSE_PORT}/?database=${CLICKHOUSE_DATABASE}&username=${CLICKHOUSE_USER}&password=${CLICKHOUSE_PASSWORD}
        resource_to_telemetry_conversion:
          enabled: true
      clickhousetraces:
        datasource: tcp://${CLICKHOUSE_HOST}:${CLICKHOUSE_PORT}/?database=${CLICKHOUSE_TRACE_DATABASE}&username=${CLICKHOUSE_USER}&password=${CLICKHOUSE_PASSWORD}
        low_cardinal_exception_grouping: ${LOW_CARDINAL_EXCEPTION_GROUPING}
      prometheus:
        endpoint: 0.0.0.0:8889
    extensions:
      health_check:
        endpoint: 0.0.0.0:13133
      pprof:
        endpoint: localhost:1777
      zpages:
        endpoint: localhost:55679
    processors:
      batch:
        send_batch_size: 50000
        timeout: 1s
      k8sattributes:
        extract:
          metadata:
          - k8s.namespace.name
          - k8s.pod.name
          - k8s.pod.uid
          - k8s.pod.start_time
          - k8s.deployment.name
          - k8s.node.name
        filter:
          node_from_env_var: K8S_NODE_NAME
        passthrough: false
        pod_association:
        - sources:
          - from: resource_attribute
            name: k8s.pod.ip
        - sources:
          - from: resource_attribute
            name: k8s.pod.uid
        - sources:
          - from: connection
      logstransform/internal:
        operators:
        - if: '"trace_id" in attributes or "span_id" in attributes'
          output: remove_trace_id
          span_id:
            parse_from: attributes.span_id
          trace_id:
            parse_from: attributes.trace_id
          type: trace_parser
        - if: '"traceId" in attributes or "spanId" in attributes'
          output: remove_traceId
          span_id:
            parse_from: attributes.spanId
          trace_id:
            parse_from: attributes.traceId
          type: trace_parser
        - field: attributes.traceId
          id: remove_traceId
          if: '"traceId" in attributes'
          output: remove_spanId
          type: remove
        - field: attributes.spanId
          id: remove_spanId
          if: '"spanId" in attributes'
          type: remove
        - field: attributes.trace_id
          id: remove_trace_id
          if: '"trace_id" in attributes'
          output: remove_span_id
          type: remove
        - field: attributes.span_id
          id: remove_span_id
          if: '"span_id" in attributes'
          type: remove
      memory_limiter: null
      resourcedetection:
        detectors:
        - env
        - system
        system:
          hostname_sources:
          - dns
          - os
        timeout: 2s
      signozspanmetrics/prometheus:
        dimensions:
        - default: default
          name: service.namespace
        - default: default
          name: deployment.environment
        - name: signoz.collector.id
        dimensions_cache_size: 100000
        latency_histogram_buckets:
        - 100us
        - 1ms
        - 2ms
        - 6ms
        - 10ms
        - 50ms
        - 100ms
        - 250ms
        - 500ms
        - 1000ms
        - 1400ms
        - 2000ms
        - 5s
        - 10s
        - 20s
        - 40s
        - 60s
        metrics_exporter: prometheus
    receivers:
      hostmetrics:
        collection_interval: 30s
        scrapers:
          cpu: {}
          disk: {}
          filesystem: {}
          load: {}
          memory: {}
          network: {}
      jaeger:
        protocols:
          grpc:
            endpoint: 0.0.0.0:14250
          thrift_http:
            endpoint: 0.0.0.0:14268
      otlp:
        protocols:
          grpc:
            endpoint: 0.0.0.0:4317
            max_recv_msg_size_mib: 16
          http:
            endpoint: 0.0.0.0:4318
      otlp/spanmetrics:
        protocols:
          grpc:
            endpoint: localhost:12345
    service:
      extensions:
      - health_check
      - zpages
      - pprof
      pipelines:
        logs:
          exporters:
          - clickhouselogsexporter
          processors:
          - logstransform/internal
          - batch
          receivers:
          - otlp
        metrics:
          exporters:
          - clickhousemetricswrite
          processors:
          - batch
          receivers:
          - otlp
        metrics/internal:
          exporters:
          - clickhousemetricswrite
          processors:
          - resourcedetection
          - k8sattributes
          - batch
          receivers:
          - hostmetrics
        metrics/spanmetrics:
          exporters:
          - prometheus
          receivers:
          - otlp/spanmetrics
        traces:
          exporters:
          - clickhousetraces
          processors:
          - signozspanmetrics/prometheus
          - batch
          receivers:
          - otlp
          - jaeger
      telemetry:
        metrics:
          address: 0.0.0.0:8888
---
s
What is the result when you apply override.yaml? I don't think that was done in the above output.
v
Do you mean the ConfigMap of the otel-collector for the currently running pod, or the resulting config when I apply helm upgrade --debug?
s
or the resulting config when I apply helm upgrade --debug?
Yes, what is the result when you do a dry run with override-values.yaml instead of the actual upgrade?
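For instance, rendering the chart with the override file included, e.g. helm template signoz/signoz -f override.yaml (or running helm upgrade against your release with --dry-run --debug -f override.yaml), shows the merged config; helm template without -f only renders the chart defaults, which would explain why k8sattributes is missing from the traces pipeline in the output above.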
v
Shared on DM