
Anurag Vishwakarma

about 1 year ago
I'm deploying the SigNoz stack on the host network in a Docker VM via a bash script, and I don't know why the OTel collector is crashing. I'm using a custom nginx config in the SigNoz frontend. Here are the script, the collector config, and the env files.
Bash script:
#!/bin/bash

# Define the Host IP
HOST_IP=10.160.0.41

# Create and run containers
docker run -d --name signoz-clickhouse \
  --hostname clickhouse \
  --network host \
  --restart on-failure \
  -v "$(pwd)/clickhouse-config.xml:/etc/clickhouse-server/config.xml" \
  -v "$(pwd)/clickhouse-users.xml:/etc/clickhouse-server/users.xml" \
  -v "$(pwd)/custom-function.xml:/etc/clickhouse-server/custom-function.xml" \
  -v "$(pwd)/clickhouse-cluster.xml:/etc/clickhouse-server/config.d/cluster.xml" \
  -v "$(pwd)/clickhouse-storage.xml:/etc/clickhouse-server/config.d/storage.xml" \
  -v "$(pwd)/data/clickhouse/:/var/lib/clickhouse/" \
  -v "$(pwd)/user_scripts:/var/lib/clickhouse/user_scripts/" \
  --health-cmd "wget --spider -q 0.0.0.0:8123/ping || exit 1" \
  --health-interval=30s \
  --health-timeout=5s \
  --health-retries=3 \
  clickhouse/clickhouse-server:24.1.2-alpine 

docker run -d --name signoz-alertmanager \
  --network host \
  --restart on-failure \
  -v "$(pwd)/data/alertmanager:/data" \
  --health-cmd "wget --spider -q http://localhost:9093/api/v1/status || exit 1" \
  --health-interval=30s \
  --health-timeout=5s \
  --health-retries=3 \
  signoz/alertmanager:0.23.5 --queryService.url=http://$HOST_IP:8085 --storage.path=/data


docker run -d --name signoz-query-service \
  --network host \
  --restart on-failure \
  -v "$(pwd)/prometheus.yml:/root/config/prometheus.yml" \
  -v "$(pwd)/dashboards:/root/config/dashboards" \
  -v "$(pwd)/data/signoz/:/var/lib/signoz/" \
  --env-file signoz-query-service.env \
  --health-cmd "wget --spider -q localhost:8080/api/v1/health || exit 1" \
  --health-interval=30s \
  --health-timeout=5s \
  --health-retries=3 \
  signoz/query-service:0.47.0 -config="/root/config/prometheus.yml"

docker run -d --name signoz-frontend \
  --network host \
  --restart on-failure \
  -v "$(pwd)/nginx.conf:/etc/nginx/conf.d/default.conf" \
  -v "/opt/samespace/samespace-public/samespace.com.crt:/opt/samespace/samespace-public/samespace.com.crt" \
  -v "/opt/samespace/samespace-public/samespace.com.key:/opt/samespace/samespace-public/samespace.com.key" \
  signoz/frontend:0.47.0

docker run -d --name otel-migrator \
  --network host \
  --restart on-failure \
  signoz/signoz-schema-migrator:0.88.26 --dsn="tcp://$HOST_IP:9000"

docker run -d --name signoz-otel-collector \
  --network host \
  --restart on-failure \
  --user root \
  -v "$(pwd)/otel-collector-config.yaml:/etc/otel-collector-config.yaml" \
  -v "$(pwd)/otel-collector-opamp-config.yaml:/etc/manager-config.yaml" \
  -v "/var/lib/docker/containers:/var/lib/docker/containers:ro" \
  -v "/opt/samespace/Cert-mtls/ca.crt:/opt/samespace/Cert-mtls/ca.crt" \
  -v "/opt/samespace/Cert-mtls/gw.key:/opt/samespace/Cert-mtls/gw.key" \
  -v "/opt/samespace/Cert-mtls/mesh.crt:/opt/samespace/Cert-mtls/mesh.crt" \
  --env-file signoz-otel-collector.env \
  --health-cmd "wget --spider -q http://localhost:13133/health || exit 1" \
  --health-interval=30s \
  --health-timeout=5s \
  --health-retries=3 \
  signoz/signoz-otel-collector:0.88.26 --config="/etc/otel-collector-config.yaml" --manager-config="/etc/manager-config.yaml" --copy-path="/var/tmp/collector-config.yaml" --feature-gates="-pkg.translator.prometheus.NormalizeName"
OTEL Config:
receivers:
  tcplog/docker:
    listen_address: "0.0.0.0:2255"
    operators:
      - type: regex_parser
        regex: '^<([0-9]+)>[0-9]+ (?P<timestamp>[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}(\.[0-9]+)?([zZ]|([\+-])([01]\d|2[0-3]):?([0-5]\d)?)?) (?P<container_id>\S+) (?P<container_name>\S+) [0-9]+ - -( (?P<body>.*))?'
        timestamp:
          parse_from: attributes.timestamp
          layout: '%Y-%m-%dT%H:%M:%S.%LZ'
      - type: move
        from: attributes["body"]
        to: body
      - type: remove
        field: attributes.timestamp
        # please remove names from below if you want to collect logs from them
      - type: filter
        id: signoz_logs_filter
        expr: 'attributes.container_name matches "^signoz-(logspout|frontend|alertmanager|query-service|otel-collector|clickhouse|zookeeper)"'
  opencensus:
    endpoint: 0.0.0.0:55678
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
        tls:
          cert_file: /opt/samespace/Cert-mtls/mesh.crt
          key_file: /opt/samespace/Cert-mtls/gw.key   
          ca_file: /opt/samespace/Cert-mtls/ca.crt
      http:
        endpoint: 0.0.0.0:4318
        tls:
          cert_file: /opt/samespace/Cert-mtls/mesh.crt
          key_file: /opt/samespace/Cert-mtls/gw.key   
          ca_file: /opt/samespace/Cert-mtls/ca.crt
  otlp/mtls:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
        tls:
          cert_file: /opt/samespace/Cert-mtls/mesh.crt
          key_file: /opt/samespace/Cert-mtls/gw.key   
          ca_file: /opt/samespace/Cert-mtls/ca.crt   
      http:
        endpoint: 0.0.0.0:4318
        tls:
          cert_file: /opt/samespace/Cert-mtls/mesh.crt
          key_file: /opt/samespace/Cert-mtls/gw.key   
          ca_file: /opt/samespace/Cert-mtls/ca.crt
  jaeger:
    protocols:
      grpc:
        endpoint: 0.0.0.0:14250
      thrift_http:
        endpoint: 0.0.0.0:14268
      # thrift_compact:
      #   endpoint: 0.0.0.0:6831
      # thrift_binary:
      #   endpoint: 0.0.0.0:6832
  hostmetrics:
    collection_interval: 30s
    scrapers:
      cpu: {}
      load: {}
      memory: {}
      disk: {}
      filesystem: {}
      network: {}
  prometheus:
    config:
      global:
        scrape_interval: 60s
      scrape_configs:
        # otel-collector internal metrics
        - job_name: otel-collector
          static_configs:
          - targets:
              - 10.160.0.41:8888
            labels:
              job_name: otel-collector


processors:
  batch:
    send_batch_size: 10000
    send_batch_max_size: 11000
    timeout: 10s
  signozspanmetrics/cumulative:
    metrics_exporter: clickhousemetricswrite
    metrics_flush_interval: 60s
    latency_histogram_buckets: [100us, 1ms, 2ms, 6ms, 10ms, 50ms, 100ms, 250ms, 500ms, 1000ms, 1400ms, 2000ms, 5s, 10s, 20s, 40s, 60s ]
    dimensions_cache_size: 100000
    dimensions:
      - name: service.namespace
        default: default
      - name: deployment.environment
        default: default
      # This is added to ensure the uniqueness of the timeseries
      # Otherwise, identical timeseries produced by multiple replicas of
      # collectors result in incorrect APM metrics
      - name: 'signoz.collector.id'
  # memory_limiter:
  #   # 80% of maximum memory up to 2G
  #   limit_mib: 1500
  #   # 25% of limit up to 2G
  #   spike_limit_mib: 512
  #   check_interval: 5s
  #
  #   # 50% of the maximum memory
  #   limit_percentage: 50
  #   # 20% of max memory usage spike expected
  #   spike_limit_percentage: 20
  # queued_retry:
  #   num_workers: 4
  #   queue_size: 100
  #   retry_on_failure: true
  resourcedetection:
    # Using OTEL_RESOURCE_ATTRIBUTES envvar, env detector adds custom labels.
    detectors: [env, system] # include ec2 for AWS, gcp for GCP and azure for Azure.
    timeout: 2s
  signozspanmetrics/delta:
    metrics_exporter: clickhousemetricswrite
    metrics_flush_interval: 60s
    latency_histogram_buckets: [100us, 1ms, 2ms, 6ms, 10ms, 50ms, 100ms, 250ms, 500ms, 1000ms, 1400ms, 2000ms, 5s, 10s, 20s, 40s, 60s ]
    dimensions_cache_size: 100000
    aggregation_temporality: AGGREGATION_TEMPORALITY_DELTA
    enable_exp_histogram: true
    dimensions:
      - name: service.namespace
        default: default
      - name: deployment.environment
        default: default
      # This is added to ensure the uniqueness of the timeseries
      # Otherwise, identical timeseries produced by multiple replicas of
      # collectors result in incorrect APM metrics
      - name: signoz.collector.id

extensions:
  health_check:
    endpoint: 0.0.0.0:13133
  zpages:
    endpoint: 0.0.0.0:55679
  pprof:
    endpoint: 0.0.0.0:1777

exporters:
  clickhousetraces:
    datasource: tcp://10.160.0.41:9000/signoz_traces
    docker_multi_node_cluster: ${DOCKER_MULTI_NODE_CLUSTER}
    low_cardinal_exception_grouping: ${LOW_CARDINAL_EXCEPTION_GROUPING}
  clickhousemetricswrite:
    endpoint: tcp://10.160.0.41:9000/signoz_metrics
    resource_to_telemetry_conversion:
      enabled: true
  clickhousemetricswrite/prometheus:
    endpoint: tcp://10.160.0.41:9000/signoz_metrics
  clickhouselogsexporter:
    dsn: tcp://10.160.0.41:9000/signoz_logs
    docker_multi_node_cluster: ${DOCKER_MULTI_NODE_CLUSTER}
    timeout: 10s
  # logging: {}

service:
  telemetry:
    metrics:
      address: 0.0.0.0:8888
  extensions:
    - health_check
    - zpages
    - pprof
  pipelines:
    traces:
      receivers: [jaeger, otlp]
      processors: [signozspanmetrics/cumulative, signozspanmetrics/delta, batch]
      exporters: [clickhousetraces]
    metrics:
      receivers: [otlp]
      processors: [batch]
      exporters: [clickhousemetricswrite]
    metrics/generic:
      receivers: [hostmetrics]
      processors: [resourcedetection, batch]
      exporters: [clickhousemetricswrite]
    metrics/prometheus:
      receivers: [prometheus]
      processors: [batch]
      exporters: [clickhousemetricswrite/prometheus]
    logs:
      receivers: [otlp, tcplog/docker]
      processors: [batch]
      exporters: [clickhouselogsexporter]
ENV
OTEL_RESOURCE_ATTRIBUTES=host.name=signoz-host,os.type=linux
DOCKER_MULTI_NODE_CLUSTER=false
LOW_CARDINAL_EXCEPTION_GROUPING=false

ClickHouseUrl=tcp://10.160.0.41:9000
ALERTMANAGER_API_PREFIX=http://10.160.0.41:9093/api/
SIGNOZ_LOCAL_DB_PATH=/var/lib/signoz/signoz.db
DASHBOARDS_PATH=/root/config/dashboards
STORAGE=clickhouse
GODEBUG=netdns=go
TELEMETRY_ENABLED=true
DEPLOYMENT_TYPE=docker-standalone-amd

server_endpoint: ws://10.160.0.41:4320/v1/opamp
This is the error I'm getting. OTel collector logs:
{"level":"error","timestamp":"2024-06-07T06:22:58.034Z","caller":"opamp/server_client.go:216","msg":"failed to apply config","component":"opamp-server-client","error":"failed to reload config: /var/tmp/collector-config.yaml: collector failed to restart: failed to build pipelines: failed to create \"clickhouselogsexporter\" exporter for data type \"logs\": cannot configure clickhouse logs exporter: code: 81, message: Database signoz_logs does not exist","stacktrace":"github.com/SigNoz/signoz-otel-collector/opamp.(*serverClient).onRemoteConfigHandler\n\t/home/runner/work/signoz-otel-collector/signoz-otel-collector/opamp/server_client.go:216\ngithub.com/SigNoz/signoz-otel-collector/opamp.(*serverClient).onMessageFuncHandler\n\t/home/runner/work/signoz-otel-collector/signoz-otel-collector/opamp/server_client.go:199\ngithub.com/open-telemetry/opamp-go/client/types.CallbacksStruct.OnMessage\n\t/home/runner/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.5.0/client/types/callbacks.go:162\ngithub.com/open-telemetry/opamp-go/client/internal.(*receivedProcessor).ProcessReceivedMessage\n\t/home/runner/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.5.0/client/internal/receivedprocessor.go:131\ngithub.com/open-telemetry/opamp-go/client/internal.(*wsReceiver).ReceiverLoop\n\t/home/runner/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.5.0/client/internal/wsreceiver.go:57\ngithub.com/open-telemetry/opamp-go/client.(*wsClient).runOneCycle\n\t/home/runner/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.5.0/client/wsclient.go:243\ngithub.com/open-telemetry/opamp-go/client.(*wsClient).runUntilStopped\n\t/home/runner/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.5.0/client/wsclient.go:265\ngithub.com/open-telemetry/opamp-go/client/internal.(*ClientCommon).StartConnectAndRun.func1\n\t/home/runner/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.5.0/client/internal/clientcommon.go:197"}
Help please! @nitya-signoz
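One thing that may be worth checking, offered as a sketch and not a confirmed diagnosis: the error says the signoz_logs database does not exist, which is what one would expect if the collector starts before the schema migrator has finished creating the databases. The script starts both containers detached, one right after the other. Assuming that is the cause, the script could wait for the database before launching the collector. The helper names wait_until and signoz_logs_exists below are hypothetical, and the check assumes clickhouse-client is available inside the signoz-clickhouse container and can reach the server with its defaults.

```shell
#!/bin/bash
# Hypothetical helper (not part of SigNoz): retry a command until it
# succeeds or the attempts run out.
# Usage: wait_until <max_attempts> <sleep_seconds> <command> [args...]
wait_until() {
  local attempts=$1 delay=$2 i=1
  shift 2
  while [ "$i" -le "$attempts" ]; do
    # Run the given command quietly; stop as soon as it succeeds.
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  return 1
}

# Assumed check: ask ClickHouse whether the database named in the error
# exists. EXISTS DATABASE prints 1 or 0.
signoz_logs_exists() {
  [ "$(docker exec signoz-clickhouse clickhouse-client \
        --query 'EXISTS DATABASE signoz_logs')" = "1" ]
}

# Example use in the deploy script (commented out so this snippet has no
# side effects when sourced):
# wait_until 30 5 signoz_logs_exists || { echo "signoz_logs still missing" >&2; exit 1; }
# docker run -d --name signoz-otel-collector ...   # same flags as in the script above
```

If the database never appears, the output of docker logs otel-migrator would be the next place to look, since the migrator itself may be failing to reach ClickHouse at the DSN it was given.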

Romario Lopez C

7 months ago
Hi, I’m migrating from an old signoz version to the latest one, I have this issue:
✘ Container otel-migrator-sync  service "otel-collector-migrator-sync" didn't complete successfully: exit 1                                                                                                               0.0s
I saw the same issue on GitHub here. I tried to delete the contents of data/clickhouse/ with rm, which returns:
rm: cannot remove 'docker/clickhouse-setup/data/clickhouse/.gitkeep': Permission denied
rm: cannot remove 'docker/clickhouse-setup/data/clickhouse/user_scripts': Permission denied
rm: cannot remove 'docker/clickhouse-setup/data/clickhouse/tmp': Permission denied
rm: cannot remove 'docker/clickhouse-setup/data/clickhouse/user_files': Permission denied
rm: cannot remove 'docker/clickhouse-setup/data/clickhouse/format_schemas': Permission denied
rm: cannot remove 'docker/clickhouse-setup/data/clickhouse/preprocessed_configs': Permission denied
rm: cannot remove 'docker/clickhouse-setup/data/clickhouse/uuid': Permission denied
rm: cannot remove 'docker/clickhouse-setup/data/clickhouse/flags': Permission denied
rm: cannot remove 'docker/clickhouse-setup/data/clickhouse/dictionaries_lib': Permission denied
rm: cannot remove 'docker/clickhouse-setup/data/clickhouse/data': Permission denied
rm: cannot remove 'docker/clickhouse-setup/data/clickhouse/metadata': Permission denied
rm: cannot remove 'docker/clickhouse-setup/data/clickhouse/metadata_dropped': Permission denied
rm: cannot remove 'docker/clickhouse-setup/data/clickhouse/rocksdb': Permission denied
rm: cannot remove 'docker/clickhouse-setup/data/clickhouse/access': Permission denied
rm: cannot remove 'docker/clickhouse-setup/data/clickhouse/store': Permission denied
rm: cannot remove 'docker/clickhouse-setup/data/clickhouse/user_defined': Permission denied
rm: cannot remove 'docker/clickhouse-setup/data/clickhouse/named_collections': Permission denied
rm: cannot remove 'docker/clickhouse-setup/data/clickhouse/status': Permission denied
Has someone had this issue?
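A possible explanation, stated as an assumption and not something confirmed in this thread: rm reports "Permission denied" even for .gitkeep, which suggests the data/clickhouse directory itself is no longer writable by the current user, for example because the ClickHouse container changed ownership of the bind-mounted directory. A small sketch for checking that follows; the helper name list_foreign_owned is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: print the directory itself and any direct children
# that are NOT owned by the current user (numeric uid/gid shown by ls -n).
list_foreign_owned() {
  find "$1" -maxdepth 1 ! -user "$(id -u)" -exec ls -ldn {} +
}

# Example (path taken from the rm output above; commented out so the
# snippet has no side effects when sourced):
# list_foreign_owned docker/clickhouse-setup/data/clickhouse
#
# If that prints entries, removing them needs matching privileges, e.g.:
# sudo rm -rf docker/clickhouse-setup/data/clickhouse/*
```

Stopping the stack first (so ClickHouse is not writing to the directory) is the safer order before deleting anything there.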

PG

10 months ago
Another question: I am trying to set up the OTel collector in docker-compose by following https://signoz.io/opentelemetry/collector-nodejs/, but for a Golang application. I am getting:
otel-collector-1  | 2024-09-15T16:45:05.721-0700	error	scraperhelper/scrapercontroller.go:197	Error scraping metrics	{"kind": "receiver", "name": "hostmetrics", "data_type": "metrics", "error": "error reading username for process \"otelcol-contrib\" (pid 1): open /etc/passwd: no such file or directory", "scraper": "hostmetrics"}
otel-collector-1  | go.opentelemetry.io/collector/receiver/scraperhelper.(*controller).scrapeMetricsAndReport
otel-collector-1  | 	go.opentelemetry.io/collector/receiver@v0.109.0/scraperhelper/scrapercontroller.go:197
otel-collector-1  | go.opentelemetry.io/collector/receiver/scraperhelper.(*controller).startScraping.func1
otel-collector-1  | 	go.opentelemetry.io/collector/receiver@v0.109.0/scraperhelper/scrapercontroller.go:169
I've uploaded the app to https://github.com/GaikwadPratik/signoz-test. Can someone please help me set it up?