# support
v
Hello, SigNoz practitioners, I am trying to install SigNoz on a standalone VM with two additional disks mounted for ClickHouse. Almost all services started as Docker containers, but the OTel Collector can't start because of problems with migrations. I am using the main branch from the GitHub repository (the current versions mentioned there). I've adapted the docker-compose file to work with two partitions for ClickHouse. But all I see is:
```
Error: invalid configuration: service::pipeline::traces: references exporter "clickhousetraces" which is not configured
2023/10/02 07:06:54 application run finished with error: invalid configuration: service::pipeline::traces: references exporter "clickhousetraces" which is not configured
```
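For what it's worth, this particular message usually means that a name listed under `service.pipelines.<pipeline>.exporters` has no matching entry under the top-level `exporters:` key, either because the block is missing, mis-indented, or failed to parse. A minimal sketch of the shape the collector expects (the values are placeholders, not taken from this setup):

```yaml
exporters:
  clickhousetraces:                  # the exporter must be defined here...
    datasource: tcp://clickhouse-1:9000/?database=signoz_traces

service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [clickhousetraces]  # ...before it can be referenced here
```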
a
@Vishal Sharma any idea why this might be happening? @vvpreo can you confirm this is not fixed yet?
Also, @vvpreo I am guessing you might have misconfigured the otel-collector config. Can you paste your config here for us to have a look?
v
I confirm, it is not solved.
I am using several docker-compose files instead of one (if it matters).
@Ankit Nayan
This is the config:
```yaml
receivers:
  tcplog/docker:
    listen_address: "0.0.0.0:2255"
    operators:
      - type: regex_parser
        regex: '^<([0-9]+)>[0-9]+ (?P<timestamp>[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}(\.[0-9]+)?([zZ]|([\+-])([01]\d|2[0-3]):?([0-5]\d)?)?) (?P<container_id>\S+) (?P<container_name>\S+) [0-9]+ - -( (?P<body>.*))?'
        timestamp:
          parse_from: attributes.timestamp
          layout: '%Y-%m-%dT%H:%M:%S.%LZ'
      - type: move
        from: attributes["body"]
        to: body
      - type: remove
        field: attributes.timestamp
        # please remove names from below if you want to collect logs from them
      - type: filter
        id: signoz_logs_filter
        expr: 'attributes.container_name matches "^signoz-(logspout|frontend|alertmanager|query-service|otel-collector|otel-collector-metrics|clickhouse-1|clickhouse-2|zookeeper)"'
  opencensus:
    endpoint: 0.0.0.0:55678
  otlp/spanmetrics:
    protocols:
      grpc:
        endpoint: localhost:12345
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318
  jaeger:
    protocols:
      grpc:
        endpoint: 0.0.0.0:14250
      thrift_http:
        endpoint: 0.0.0.0:14268
      # thrift_compact:
      #   endpoint: 0.0.0.0:6831
      # thrift_binary:
      #   endpoint: 0.0.0.0:6832
  hostmetrics:
    collection_interval: 30s
    scrapers:
      cpu: {}
      load: {}
      memory: {}
      disk: {}
      filesystem: {}
      network: {}
  prometheus:
    config:
      global:
        scrape_interval: 60s
      scrape_configs:
        # otel-collector internal metrics
        - job_name: otel-collector
          static_configs:
          - targets:
              - localhost:8888
            labels:
              job_name: otel-collector


processors:
  logstransform/internal:
    operators:
      - type: trace_parser
        if: '"trace_id" in attributes or "span_id" in attributes'
        trace_id:
          parse_from: attributes.trace_id
        span_id:
          parse_from: attributes.span_id
        output: remove_trace_id
      - type: trace_parser
        if: '"traceId" in attributes or "spanId" in attributes'
        trace_id:
          parse_from: attributes.traceId
        span_id:
          parse_from: attributes.spanId
        output: remove_traceId
      - id: remove_traceId
        type: remove
        if: '"traceId" in attributes'
        field: attributes.traceId
        output: remove_spanId
      - id: remove_spanId
        type: remove
        if: '"spanId" in attributes'
        field: attributes.spanId
      - id: remove_trace_id
        type: remove
        if: '"trace_id" in attributes'
        field: attributes.trace_id
        output: remove_span_id
      - id: remove_span_id
        type: remove
        if: '"span_id" in attributes'
        field: attributes.span_id
  batch:
    send_batch_size: 10000
    send_batch_max_size: 11000
    timeout: 10s
  signozspanmetrics/prometheus:
    metrics_exporter: prometheus
    latency_histogram_buckets: [100us, 1ms, 2ms, 6ms, 10ms, 50ms, 100ms, 250ms, 500ms, 1000ms, 1400ms, 2000ms, 5s, 10s, 20s, 40s, 60s ]
    dimensions_cache_size: 100000
    dimensions:
      - name: service.namespace
        default: default
      - name: deployment.environment
        default: default
      # This is added to ensure the uniqueness of the timeseries
      # Otherwise, identical timeseries produced by multiple replicas of
      # collectors result in incorrect APM metrics
      - name: 'signoz.collector.id'
  # memory_limiter:
  #   # 80% of maximum memory up to 2G
  #   limit_mib: 1500
  #   # 25% of limit up to 2G
  #   spike_limit_mib: 512
  #   check_interval: 5s
  #
  #   # 50% of the maximum memory
  #   limit_percentage: 50
  #   # 20% of max memory usage spike expected
  #   spike_limit_percentage: 20
  # queued_retry:
  #   num_workers: 4
  #   queue_size: 100
  #   retry_on_failure: true
  resourcedetection:
    # Using OTEL_RESOURCE_ATTRIBUTES envvar, env detector adds custom labels.
    detectors: [env, system] # include ec2 for AWS, gcp for GCP and azure for Azure.
    timeout: 2s

extensions:
  health_check:
    endpoint: 0.0.0.0:13133
  zpages:
    endpoint: 0.0.0.0:55679
  pprof:
    endpoint: 0.0.0.0:1777

exporters:
  clickhousetraces:
    datasource: tcp://clickhouse-1:9000/?database=signoz_traces
    docker_multi_node_cluster: ${DOCKER_MULTI_NODE_CLUSTER}
    low_cardinal_exception_grouping: ${LOW_CARDINAL_EXCEPTION_GROUPING}
  clickhousemetricswrite:
    endpoint: tcp://clickhouse-1:9000/?database=signoz_metrics
    resource_to_telemetry_conversion:
      enabled: true
  clickhousemetricswrite/prometheus:
    endpoint: tcp://clickhouse-1:9000/?database=signoz_metrics
  prometheus:
    endpoint: 0.0.0.0:8889
  # logging: {}

  clickhouselogsexporter:
    dsn: tcp://clickhouse-1:9000/
    docker_multi_node_cluster: ${DOCKER_MULTI_NODE_CLUSTER}
    timeout: 5s
    sending_queue:
      queue_size: 100
    retry_on_failure:
      enabled: true
      initial_interval: 5s
      max_interval: 30s
      max_elapsed_time: 300s

service:
  telemetry:
    metrics:
      address: 0.0.0.0:8888
  extensions:
    - health_check
    - zpages
    - pprof
  pipelines:
    traces:
      receivers: [jaeger, otlp]
      processors: [signozspanmetrics/prometheus, batch]
      exporters: [clickhousetraces]
    metrics:
      receivers: [otlp]
      processors: [batch]
      exporters: [clickhousemetricswrite]
    metrics/generic:
      receivers: [hostmetrics]
      processors: [resourcedetection, batch]
      exporters: [clickhousemetricswrite]
    metrics/prometheus:
      receivers: [prometheus]
      processors: [batch]
      exporters: [clickhousemetricswrite/prometheus]
    metrics/spanmetrics:
      receivers: [otlp/spanmetrics]
      exporters: [prometheus]
    logs:
      receivers: [otlp, tcplog/docker]
      processors: [logstransform/internal, batch]
      exporters: [clickhouselogsexporter]
```
This is the compose file:
```yaml
services:
  otel-collector:
    image: signoz/signoz-otel-collector:${OTELCOL_TAG:-0.79.7}
    container_name: signoz-otel-collector
#    restart: unless-stopped
    privileged: true
    # entrypoint: ["sleep", "9999999999"]
    # ./signoz-collector --config=/etc/otel-collector-config.yaml --feature-gates=-pkg.translator.prometheus.NormalizeName
    command:
      [
        "--config=/etc/otel-collector-config.yaml",
        "--feature-gates=-pkg.translator.prometheus.NormalizeName",
      ]
    user: root # required for reading docker container logs
    volumes:
      - "{{remote_project_data_dir}}/otel-collector-config.yaml:/etc/otel-collector-config.yaml"
      - /var/lib/docker/containers:/var/lib/docker/containers:ro
    environment:
      - OTEL_RESOURCE_ATTRIBUTES=host.name=signoz-host,os.type=linux
      - DOCKER_MULTI_NODE_CLUSTER=false
      - LOW_CARDINAL_EXCEPTION_GROUPING=false
    ports:
      # - "1777:1777"     # pprof extension
      - "4317:4317" # OTLP gRPC receiver
      - "4318:4318" # OTLP HTTP receiver
      # - "8888:8888"     # OtelCollector internal metrics
      # - "8889:8889"     # signoz spanmetrics exposed by the agent
      # - "9411:9411"     # Zipkin port
      # - "13133:13133"   # health check extension
      # - "14250:14250"   # Jaeger gRPC
      # - "14268:14268"   # Jaeger thrift HTTP
      # - "55678:55678"   # OpenCensus receiver
      # - "55679:55679"   # zPages extension

    networks:
      - traefik-internal

  otel-collector-metrics:
    image: signoz/signoz-otel-collector:${OTELCOL_TAG:-0.79.7}
    container_name: signoz-otel-collector-metrics
    privileged: true
    command:
      [
        "--config=/etc/otel-collector-metrics-config.yaml",
        "--feature-gates=-pkg.translator.prometheus.NormalizeName",
      ]
    volumes:
      - "{{remote_project_data_dir}}/otel-collector-metrics-config.yaml:/etc/otel-collector-metrics-config.yaml"
    # ports:
    #   - "1777:1777"     # pprof extension
    #   - "8888:8888"     # OtelCollector internal metrics
    #   - "13133:13133"   # Health check extension
    #   - "55679:55679"   # zPages extension
    restart: unless-stopped
    networks:
      - traefik-internal

  logspout:
    image: "gliderlabs/logspout:v3.2.14"
    container_name: signoz-logspout
    volumes:
      - /etc/hostname:/etc/host_hostname:ro
      - /var/run/docker.sock:/var/run/docker.sock
    command: syslog+tcp://otel-collector:2255
    depends_on:
      - otel-collector
    restart: on-failure

networks:
  traefik-internal:
    external: true
```
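On the two extra disks mentioned at the start of the thread: the compose snippet above does not include the ClickHouse service itself, so the following is only a rough, hypothetical sketch of how the partitions might be mounted into the container and pointed at a storage policy. The mount points, file names, and image tag below are assumptions, not taken from this setup:

```yaml
  clickhouse-1:
    image: clickhouse/clickhouse-server:23.7-alpine   # hypothetical tag
    volumes:
      # host partitions, e.g. mounted at /mnt/disk1 and /mnt/disk2
      - /mnt/disk1:/var/lib/clickhouse/disk1
      - /mnt/disk2:/var/lib/clickhouse/disk2
      # a custom storage configuration (disks + policies) referencing
      # /var/lib/clickhouse/disk1/ and /var/lib/clickhouse/disk2/
      - ./clickhouse-storage.xml:/etc/clickhouse-server/config.d/storage.xml
    networks:
      - traefik-internal
```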
v
> Almost all services started as Docker containers, but the OTel Collector can't start because of problems with migrations.
@vvpreo Please share any migration-related logs that you see.
v
moment
```
2023-10-03T08:36:38.292Z        info    clickhouselogsexporter/exporter.go:455  Running migrations from path:   {"kind": "exporter", "data_type": "logs", "name": "clickhouselogsexporter", "test": "/logsmigrations"}
Error: failed to build pipelines: failed to create "clickhouselogsexporter" exporter for data type "logs": cannot configure clickhouse logs exporter: clickhouse Migrate failed to run, error: Dirty database version 1. Fix and force version.
2023/10/03 08:36:38 application run finished with error: failed to build pipelines: failed to create "clickhouselogsexporter" exporter for data type "logs": cannot configure clickhouse logs exporter: clickhouse Migrate failed to run, error: Dirty database version 1. Fix and force version.
```
I just reinstalled ClickHouse and disabled collector-metrics and logspout, and the migrations passed. But later I uncommented the problematic lines in the config, and now I see what I've sent you.
Anyway, the error has been the same since the beginning:
```
cannot configure clickhouse logs exporter: clickhouse Migrate failed to run, error: Dirty database version 1. Fix and force version
```
Maybe it is possible to control when migrations are launched? To be sure that I've run the migrations only once.
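For context, `Dirty database version 1. Fix and force version.` comes from the golang-migrate library the exporter uses for schema migrations: a migration started but did not complete, so the database is flagged dirty and no further migrations run until that state is cleared. You can inspect what was recorded before changing anything; assuming the usual version/dirty layout of the migrations table, something like:

```sql
-- inspect the migration state recorded for the logs database
USE signoz_logs;
SELECT * FROM schema_migrations;
-- a dirty flag set next to version 1 indicates the interrupted migration
```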
v
@vvpreo Please connect to the ClickHouse container and run these commands:
```
docker exec -it signoz-clickhouse /bin/bash

# connect to clickhouse client
clickhouse client

-- clickhouse queries (run inside the client)
use signoz_logs;
drop table schema_migrations;
drop table logs_attribute_keys on CLUSTER cluster;
```
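After dropping these tables, restarting the collector should let the logs exporter recreate them and re-run its migrations from a clean state. A quick check afterwards (database and table names are the SigNoz defaults, adjust if yours differ):

```sql
-- run inside clickhouse client after the collector has restarted
SHOW TABLES FROM signoz_logs;
SELECT * FROM signoz_logs.schema_migrations;  -- the dirty flag should be cleared
```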
v
moment
v
https://signoz-community.slack.com/archives/C01HWQ1R0BC/p1696324551005369?thread_ts=1696230324.525709&cid=C01HWQ1R0BC Yes, there's already an issue on this. We are working on it and it will be fixed soon.
v
Here is what I've got:
v
That's fine. Can you restart the Docker containers now?
v
OTEL collector?
v
Yes
v
moment
Dead again
n
Can you bring up the collectors one by one? First try stopping the crashing collectors, then run the commands for deleting the tables.
We can get on a huddle if you want to fix this by sharing your screen.
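If it helps, that sequence could look roughly like this with the Compose CLI (service and container names are taken from the files earlier in this thread; adjust them and your compose file paths as needed):

```bash
# stop the crashing collectors first
docker compose stop otel-collector otel-collector-metrics

# run the cleanup from the earlier message
docker exec signoz-clickhouse clickhouse client --query "DROP TABLE signoz_logs.schema_migrations"
docker exec signoz-clickhouse clickhouse client --query "DROP TABLE signoz_logs.logs_attribute_keys ON CLUSTER cluster"

# then bring the collectors back one at a time
docker compose up -d otel-collector
docker compose up -d otel-collector-metrics
```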
v
That would be great :)
Thank you for your help! One more question: do these commands delete data, or only the migration info?
```
-- clickhouse queries
use signoz_logs;
drop table schema_migrations on CLUSTER cluster;
drop table logs_attribute_keys on CLUSTER cluster;
```
v
The schema_migrations table only stores migration data. The attribute keys table is metadata used for filter suggestions and aggregate attributes.
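In other words, the raw log records themselves live in separate tables and are not touched by those drops. If you want to double-check, a row count before and after should be unchanged (assuming the default SigNoz schema where logs land in `signoz_logs.logs`):

```sql
-- sanity check: log data is stored apart from the migration bookkeeping
SELECT count() FROM signoz_logs.logs;
```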
v
Thank you