brandon
03/04/2024, 5:31 PM
i updated my docker-compose.yaml and otel-collector-config.yaml files to point to a single log file, but i get the following when i view the logs for the otel-collector container:
{
  "level": "fatal",
  "timestamp": "2024-03-04T17:29:06.159Z",
  "caller": "signozcollector/main.go:72",
  "msg": "failed to create collector service:",
  "error": "failed to create server client: failed to create collector config: failed to upsert instance id failed to parse config file /var/tmp/collector-config.yaml: yaml: line 166: did not find expected key",
  "stacktrace": "main.main\n\t/home/runner/work/signoz-otel-collector/signoz-otel-collector/cmd/signozcollector/main.go:72\nruntime.main\n\t/opt/hostedtoolcache/go/1.21.7/x64/src/runtime/proc.go:267"
}
here is my corresponding config block:
    logs:
      receivers: [otlp, tcplog/docker, filelog]
      processors: [batch]
      exporters: [clickhouselogsexporter]
      filelog:
        include: [/cloudadmins/logs/OTHER/D202306/M5WEB_PRESENTATION_COMMON_CSILOGON.ASPX_638224444396847090_0.txt]
        start_at: beginning
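Note: the "yaml: line 166" in the fatal above points into the merged copy at /var/tmp/collector-config.yaml, which can differ in line numbering from otel-collector-config.yaml. Since docker cp works even on a crash-looping container, the offending region can be copied out and inspected. A sketch, assuming the container name signoz-otel-collector used in the compose file later in the thread:

    docker cp signoz-otel-collector:/var/tmp/collector-config.yaml ./merged-config.yaml
    sed -n '160,170p' ./merged-config.yaml    # show the lines around the reported line 166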
brandon
03/04/2024, 5:40 PM

brandon
03/04/2024, 6:36 PM
re-ran docker-compose, it rebuilt the signoz-otel-collector, but it's still flapping with the same:
{
  "level": "fatal",
  "timestamp": "2024-03-04T18:34:51.261Z",
  "caller": "signozcollector/main.go:72",
  "msg": "failed to create collector service:",
  "error": "failed to create server client: failed to create collector config: failed to upsert instance id failed to parse config file /var/tmp/collector-config.yaml: yaml: line 166: did not find expected key",
  "stacktrace": "main.main\n\t/home/runner/work/signoz-otel-collector/signoz-otel-collector/cmd/signozcollector/main.go:72\nruntime.main\n\t/opt/hostedtoolcache/go/1.21.7/x64/src/runtime/proc.go:267"
}
couldn't say what's missing. gonna go see if i can find the code on github to dig into the missing key...

Pedro Carvalho
03/05/2024, 3:12 AM
    exporters: [clickhouselogsexporter]
    filelog:
      include:
I believe you may be mixing two parts of the opentelemetry config. You should have top-level receivers and exporters keys, where you make all your component definitions such as filelog, and then you should have a pipelines key under service, where you define which components to use for logs.
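For reference, the shape Pedro is describing looks like this (a minimal sketch assembled from brandon's own snippet, not a complete SigNoz config; component definitions live at the top level, and service.pipelines only references them by name):

    receivers:
      filelog:
        include: [/cloudadmins/logs/OTHER/D202306/M5WEB_PRESENTATION_COMMON_CSILOGON.ASPX_638224444396847090_0.txt]
        start_at: beginning

    exporters:
      clickhouselogsexporter:
        dsn: tcp://clickhouse:9000/

    service:
      pipelines:
        logs:
          receivers: [otlp, tcplog/docker, filelog]
          processors: [batch]
          exporters: [clickhouselogsexporter]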
Pedro Carvalho
03/05/2024, 3:12 AM

Pedro Carvalho
03/05/2024, 3:17 AM

Srikanth Chekuri
03/05/2024, 9:30 AM

brandon
03/05/2024, 4:53 PM

brandon
03/05/2024, 4:58 PM
receivers:
  tcplog/docker:
    listen_address: "0.0.0.0:2255"
    operators:
      - type: regex_parser
        regex: '^<([0-9]+)>[0-9]+ (?P<timestamp>[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}(\.[0-9]+)?([zZ]|([\+-])([01]\d|2[0-3]):?([0-5]\d)?)?) (?P<container_id>\S+) (?P<container_name>\S+) [0-9]+ - -( (?P<body>.*))?'
        timestamp:
          parse_from: attributes.timestamp
          layout: '%Y-%m-%dT%H:%M:%S.%LZ'
      - type: move
        from: attributes["body"]
        to: body
      - type: remove
        field: attributes.timestamp
      # please remove names from below if you want to collect logs from them
      - type: filter
        id: signoz_logs_filter
        expr: 'attributes.container_name matches "^signoz-(logspout|frontend|alertmanager|query-service|otel-collector|clickhouse|zookeeper)"'
  opencensus:
    endpoint: 0.0.0.0:55678
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318
  jaeger:
    protocols:
      grpc:
        endpoint: 0.0.0.0:14250
      thrift_http:
        endpoint: 0.0.0.0:14268
      # thrift_compact:
      #   endpoint: 0.0.0.0:6831
      # thrift_binary:
      #   endpoint: 0.0.0.0:6832
  hostmetrics:
    collection_interval: 30s
    scrapers:
      cpu: {}
      load: {}
      memory: {}
      disk: {}
      filesystem: {}
      network: {}
  prometheus:
    config:
      global:
        scrape_interval: 60s
      scrape_configs:
        # otel-collector internal metrics
        - job_name: otel-collector
          static_configs:
            - targets:
                - localhost:8888
              labels:
                job_name: otel-collector
  filelog/app:
    include: ["/cloudadmins/logs/20240304_process_monitor.log"]
    start_at: beginning
processors:
  batch:
    send_batch_size: 10000
    send_batch_max_size: 11000
    timeout: 10s
  signozspanmetrics/cumulative:
    metrics_exporter: clickhousemetricswrite
    metrics_flush_interval: 60s
    latency_histogram_buckets: [100us, 1ms, 2ms, 6ms, 10ms, 50ms, 100ms, 250ms, 500ms, 1000ms, 1400ms, 2000ms, 5s, 10s, 20s, 40s, 60s]
    dimensions_cache_size: 100000
    dimensions:
      - name: service.namespace
        default: default
      - name: deployment.environment
        default: default
      # This is added to ensure the uniqueness of the timeseries
      # Otherwise, identical timeseries produced by multiple replicas of
      # collectors result in incorrect APM metrics
      - name: 'signoz.collector.id'
  # memory_limiter:
  #   # 80% of maximum memory up to 2G
  #   limit_mib: 1500
  #   # 25% of limit up to 2G
  #   spike_limit_mib: 512
  #   check_interval: 5s
  #
  #   # 50% of the maximum memory
  #   limit_percentage: 50
  #   # 20% of max memory usage spike expected
  #   spike_limit_percentage: 20
  # queued_retry:
  #   num_workers: 4
  #   queue_size: 100
  #   retry_on_failure: true
  resourcedetection:
    # Using OTEL_RESOURCE_ATTRIBUTES envvar, env detector adds custom labels.
    detectors: [env, system] # include ec2 for AWS, gcp for GCP and azure for Azure.
    timeout: 2s
  signozspanmetrics/delta:
    metrics_exporter: clickhousemetricswrite
    metrics_flush_interval: 60s
    latency_histogram_buckets: [100us, 1ms, 2ms, 6ms, 10ms, 50ms, 100ms, 250ms, 500ms, 1000ms, 1400ms, 2000ms, 5s, 10s, 20s, 40s, 60s]
    dimensions_cache_size: 100000
    aggregation_temporality: AGGREGATION_TEMPORALITY_DELTA
    enable_exp_histogram: true
    dimensions:
      - name: service.namespace
        default: default
      - name: deployment.environment
        default: default
      # This is added to ensure the uniqueness of the timeseries
      # Otherwise, identical timeseries produced by multiple replicas of
      # collectors result in incorrect APM metrics
      - name: signoz.collector.id
extensions:
  health_check:
    endpoint: 0.0.0.0:13133
  zpages:
    endpoint: 0.0.0.0:55679
  pprof:
    endpoint: 0.0.0.0:1777
exporters:
  clickhousetraces:
    datasource: tcp://clickhouse:9000/?database=signoz_traces
    docker_multi_node_cluster: ${DOCKER_MULTI_NODE_CLUSTER}
    low_cardinal_exception_grouping: ${LOW_CARDINAL_EXCEPTION_GROUPING}
  clickhousemetricswrite:
    endpoint: tcp://clickhouse:9000/?database=signoz_metrics
    resource_to_telemetry_conversion:
      enabled: true
  clickhousemetricswrite/prometheus:
    endpoint: tcp://clickhouse:9000/?database=signoz_metrics
  # logging: {}
  clickhouselogsexporter:
    dsn: tcp://clickhouse:9000/
    docker_multi_node_cluster: ${DOCKER_MULTI_NODE_CLUSTER}
    timeout: 10s
service:
  telemetry:
    metrics:
      address: 0.0.0.0:8888
  extensions:
    - health_check
    - zpages
    - pprof
  pipelines:
    traces:
      receivers: [jaeger, otlp]
      processors: [signozspanmetrics/cumulative, signozspanmetrics/delta, batch]
      exporters: [clickhousetraces]
    metrics:
      receivers: [otlp]
      processors: [batch]
      exporters: [clickhousemetricswrite]
    metrics/generic:
      receivers: [hostmetrics]
      processors: [resourcedetection, batch]
      exporters: [clickhousemetricswrite]
    metrics/prometheus:
      receivers: [prometheus]
      processors: [batch]
      exporters: [clickhousemetricswrite/prometheus]
    logs:
      receivers: [otlp, filelog/app, tcplog/docker]
      processors: [batch]
      exporters: [clickhouselogsexporter]
      # exporters: [otlp]
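A structurally broken config like the earlier one can also be caught locally, before restarting the stack, since the collector only reports the parse error after the container boots. A quick check, assuming Python with PyYAML installed (yaml.safe_load raises an error naming the line and column of the first structural problem):

    python3 -c "import yaml; yaml.safe_load(open('otel-collector-config.yaml'))"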
brandon
03/05/2024, 4:58 PM
version: "2.4"

x-clickhouse-defaults: &clickhouse-defaults
  restart: on-failure
  # addding non LTS version due to this fix https://github.com/ClickHouse/ClickHouse/commit/32caf8716352f45c1b617274c7508c86b7d1afab
  image: clickhouse/clickhouse-server:24.1.2-alpine
  tty: true
  depends_on:
    - zookeeper-1
    # - zookeeper-2
    # - zookeeper-3
  logging:
    options:
      max-size: 50m
      max-file: "3"
  healthcheck:
    # "clickhouse", "client", "-u ${CLICKHOUSE_USER}", "--password ${CLICKHOUSE_PASSWORD}", "-q 'SELECT 1'"
    test:
      [
        "CMD",
        "wget",
        "--spider",
        "-q",
        "localhost:8123/ping"
      ]
    interval: 30s
    timeout: 5s
    retries: 3
  ulimits:
    nproc: 65535
    nofile:
      soft: 262144
      hard: 262144

x-db-depend: &db-depend
  depends_on:
    clickhouse:
      condition: service_healthy
    otel-collector-migrator:
      condition: service_completed_successfully
    # clickhouse-2:
    #   condition: service_healthy
    # clickhouse-3:
    #   condition: service_healthy

services:
  zookeeper-1:
    image: bitnami/zookeeper:3.7.1
    container_name: signoz-zookeeper-1
    hostname: zookeeper-1
    user: root
    ports:
      - "2181:2181"
      - "2888:2888"
      - "3888:3888"
    volumes:
      - ./data/zookeeper-1:/bitnami/zookeeper
    environment:
      - ZOO_SERVER_ID=1
      # - ZOO_SERVERS=0.0.0.0:2888:3888,zookeeper-2:2888:3888,zookeeper-3:2888:3888
      - ALLOW_ANONYMOUS_LOGIN=yes
      - ZOO_AUTOPURGE_INTERVAL=1

  # zookeeper-2:
  #   image: bitnami/zookeeper:3.7.0
  #   container_name: signoz-zookeeper-2
  #   hostname: zookeeper-2
  #   user: root
  #   ports:
  #     - "2182:2181"
  #     - "2889:2888"
  #     - "3889:3888"
  #   volumes:
  #     - ./data/zookeeper-2:/bitnami/zookeeper
  #   environment:
  #     - ZOO_SERVER_ID=2
  #     - ZOO_SERVERS=zookeeper-1:2888:3888,0.0.0.0:2888:3888,zookeeper-3:2888:3888
  #     - ALLOW_ANONYMOUS_LOGIN=yes
  #     - ZOO_AUTOPURGE_INTERVAL=1

  # zookeeper-3:
  #   image: bitnami/zookeeper:3.7.0
  #   container_name: signoz-zookeeper-3
  #   hostname: zookeeper-3
  #   user: root
  #   ports:
  #     - "2183:2181"
  #     - "2890:2888"
  #     - "3890:3888"
  #   volumes:
  #     - ./data/zookeeper-3:/bitnami/zookeeper
  #   environment:
  #     - ZOO_SERVER_ID=3
  #     - ZOO_SERVERS=zookeeper-1:2888:3888,zookeeper-2:2888:3888,0.0.0.0:2888:3888
  #     - ALLOW_ANONYMOUS_LOGIN=yes
  #     - ZOO_AUTOPURGE_INTERVAL=1

  clickhouse:
    <<: *clickhouse-defaults
    container_name: signoz-clickhouse
    hostname: clickhouse
    ports:
      - "9000:9000"
      - "8123:8123"
      - "9181:9181"
    volumes:
      - ./clickhouse-config.xml:/etc/clickhouse-server/config.xml
      - ./clickhouse-users.xml:/etc/clickhouse-server/users.xml
      - ./custom-function.xml:/etc/clickhouse-server/custom-function.xml
      - ./clickhouse-cluster.xml:/etc/clickhouse-server/config.d/cluster.xml
      # - ./clickhouse-storage.xml:/etc/clickhouse-server/config.d/storage.xml
      - ./data/clickhouse/:/var/lib/clickhouse/
      - ./user_scripts:/var/lib/clickhouse/user_scripts/

  # clickhouse-2:
  #   <<: *clickhouse-defaults
  #   container_name: signoz-clickhouse-2
  #   hostname: clickhouse-2
  #   ports:
  #     - "9001:9000"
  #     - "8124:8123"
  #     - "9182:9181"
  #   volumes:
  #     - ./clickhouse-config.xml:/etc/clickhouse-server/config.xml
  #     - ./clickhouse-users.xml:/etc/clickhouse-server/users.xml
  #     - ./custom-function.xml:/etc/clickhouse-server/custom-function.xml
  #     - ./clickhouse-cluster.xml:/etc/clickhouse-server/config.d/cluster.xml
  #     # - ./clickhouse-storage.xml:/etc/clickhouse-server/config.d/storage.xml
  #     - ./data/clickhouse-2/:/var/lib/clickhouse/
  #     - ./user_scripts:/var/lib/clickhouse/user_scripts/

  # clickhouse-3:
  #   <<: *clickhouse-defaults
  #   container_name: signoz-clickhouse-3
  #   hostname: clickhouse-3
  #   ports:
  #     - "9002:9000"
  #     - "8125:8123"
  #     - "9183:9181"
  #   volumes:
  #     - ./clickhouse-config.xml:/etc/clickhouse-server/config.xml
  #     - ./clickhouse-users.xml:/etc/clickhouse-server/users.xml
  #     - ./custom-function.xml:/etc/clickhouse-server/custom-function.xml
  #     - ./clickhouse-cluster.xml:/etc/clickhouse-server/config.d/cluster.xml
  #     # - ./clickhouse-storage.xml:/etc/clickhouse-server/config.d/storage.xml
  #     - ./data/clickhouse-3/:/var/lib/clickhouse/
  #     - ./user_scripts:/var/lib/clickhouse/user_scripts/

  alertmanager:
    image: signoz/alertmanager:${ALERTMANAGER_TAG:-0.23.4}
    container_name: signoz-alertmanager
    volumes:
      - ./data/alertmanager:/data
    depends_on:
      query-service:
        condition: service_healthy
    restart: on-failure
    command:
      - --queryService.url=http://query-service:8085
      - --storage.path=/data

  # Notes for Maintainers/Contributors who will change Line Numbers of Frontend & Query-Section. Please Update Line Numbers in `./scripts/commentLinesForSetup.sh` & `./CONTRIBUTING.md`

  query-service:
    image: signoz/query-service:${DOCKER_TAG:-0.40.0}
    container_name: signoz-query-service
    command:
      [
        "-config=/root/config/prometheus.yml",
        # "--prefer-delta=true"
      ]
    # ports:
    #   - "6060:6060" # pprof port
    #   - "8080:8080" # query-service port
    volumes:
      - ./prometheus.yml:/root/config/prometheus.yml
      - ../dashboards:/root/config/dashboards
      - ./data/signoz/:/var/lib/signoz/
    environment:
      - ClickHouseUrl=tcp://clickhouse:9000
      - ALERTMANAGER_API_PREFIX=http://alertmanager:9093/api/
      - SIGNOZ_LOCAL_DB_PATH=/var/lib/signoz/signoz.db
      - DASHBOARDS_PATH=/root/config/dashboards
      - STORAGE=clickhouse
      - GODEBUG=netdns=go
      - TELEMETRY_ENABLED=true
      - DEPLOYMENT_TYPE=docker-standalone-amd
    restart: on-failure
    healthcheck:
      test:
        [
          "CMD",
          "wget",
          "--spider",
          "-q",
          "localhost:8080/api/v1/health"
        ]
      interval: 30s
      timeout: 5s
      retries: 3
    <<: *db-depend

  frontend:
    image: signoz/frontend:${DOCKER_TAG:-0.40.0}
    container_name: signoz-frontend
    restart: on-failure
    depends_on:
      - alertmanager
      - query-service
    ports:
      - "3301:3301"
    volumes:
      - ../common/nginx-config.conf:/etc/nginx/conf.d/default.conf

  otel-collector-migrator:
    image: signoz/signoz-schema-migrator:${OTELCOL_TAG:-0.88.14}
    container_name: otel-migrator
    command:
      - "--dsn=tcp://clickhouse:9000"
    depends_on:
      clickhouse:
        condition: service_healthy
      # clickhouse-2:
      #   condition: service_healthy
      # clickhouse-3:
      #   condition: service_healthy

  otel-collector:
    image: signoz/signoz-otel-collector:${OTELCOL_TAG:-0.88.14}
    container_name: signoz-otel-collector
    command:
      [
        "--config=/etc/otel-collector-config.yaml",
        "--manager-config=/etc/manager-config.yaml",
        "--copy-path=/var/tmp/collector-config.yaml",
        "--feature-gates=-pkg.translator.prometheus.NormalizeName"
      ]
    user: root # required for reading docker container logs
    volumes:
      # added /cloudadmins/logs/**/*.txt
      - ./otel-collector-config.yaml:/etc/otel-collector-config.yaml
      - ./otel-collector-opamp-config.yaml:/etc/manager-config.yaml
      - /var/lib/docker/containers:/var/lib/docker/containers:ro
      - "/cloudadmins/logs/20240304_process_monitor.log"
    environment:
      - OTEL_RESOURCE_ATTRIBUTES=host.name=signoz-host,os.type=linux
      - DOCKER_MULTI_NODE_CLUSTER=false
      - LOW_CARDINAL_EXCEPTION_GROUPING=false
    ports:
      # - "1777:1777" # pprof extension
      - "4317:4317" # OTLP gRPC receiver
      - "4318:4318" # OTLP HTTP receiver
      # - "8888:8888" # OtelCollector internal metrics
      # - "8889:8889" # signoz spanmetrics exposed by the agent
      # - "9411:9411" # Zipkin port
      # - "13133:13133" # health check extension
      # - "14250:14250" # Jaeger gRPC
      # - "14268:14268" # Jaeger thrift HTTP
      # - "55678:55678" # OpenCensus receiver
      # - "55679:55679" # zPages extension
    restart: on-failure
    depends_on:
      clickhouse:
        condition: service_healthy
      otel-collector-migrator:
        condition: service_completed_successfully
      query-service:
        condition: service_healthy

  logspout:
    image: "gliderlabs/logspout:v3.2.14"
    container_name: signoz-logspout
    volumes:
      - /etc/hostname:/etc/host_hostname:ro
      - /var/run/docker.sock:/var/run/docker.sock
    command: syslog+tcp://otel-collector:2255
    depends_on:
      - otel-collector
    restart: on-failure

  hotrod:
    image: jaegertracing/example-hotrod:1.30
    container_name: hotrod
    logging:
      options:
        max-size: 50m
        max-file: "3"
    command: [ "all" ]
    environment:
      - JAEGER_ENDPOINT=http://otel-collector:14268/api/traces

  load-hotrod:
    image: "signoz/locust:1.2.3"
    container_name: load-hotrod
    hostname: load-hotrod
    environment:
      ATTACKED_HOST: http://hotrod:8080
      LOCUST_MODE: standalone
      NO_PROXY: standalone
      TASK_DELAY_FROM: 5
      TASK_DELAY_TO: 30
      QUIET_MODE: "${QUIET_MODE:-false}"
      LOCUST_OPTS: "--headless -u 10 -r 1"
    volumes:
      - ../common/locust-scripts:/locust
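One detail worth flagging in the otel-collector service above: the volumes entry - "/cloudadmins/logs/20240304_process_monitor.log" has no host-side path, so Compose treats it as an anonymous volume created inside the container rather than a bind mount, and the host file may never appear at that path in the container. A host:container bind-mount form would look like this (a sketch; the host path is taken from this thread, and :ro is optional):

    volumes:
      - /cloudadmins/logs/20240304_process_monitor.log:/cloudadmins/logs/20240304_process_monitor.log:ro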
brandon
03/05/2024, 5:01 PM
{"timestamp":"2024-03-04T17:10:14-0600","source":"process_monitor","log_level":"INFO","action":"startup","message":"tomlq version tomlq 3.2.3 found.","hostname":"WHWH1WD9F2","host_ip":""}
{"timestamp":"2024-03-04T17:10:14-0600","source":"process_monitor","log_level":"INFO","action":"startup","message":"jq version jq-1.7.1 found.","hostname":"WHWH1WD9F2","host_ip":""}
{"timestamp":"2024-03-04T17:10:15-0600","source":"process_monitor","log_level":"INFO","action":"startup","message":"Created new log file /Users/brandon/bin/bash_process_monitor/log/20240304_process_monitor.log","hostname":"WHWH1WD9F2","host_ip":""}
{"timestamp":"2024-03-04T17:10:15-0600","source":"process_monitor","log_level":"WARN","action":"process_check","message":"Process found (6 actual of 1 expected) with pattern kitty with PIDs 21069,22746,78381,78889,78953,81667.","pid":"21069,22746,78381,78889,78953,81667","process_count":"6","expected_count":"1","pattern":"kitty","status":"WARN","n_status":"1","hostname":"WHWH1WD9F2","host_ip":""}
{"timestamp":"2024-03-04T17:11:13-0600","source":"process_monitor","log_level":"INFO","action":"startup","message":"tomlq version tomlq 3.2.3 found.","hostname":"WHWH1WD9F2","host_ip":"127.0.0.1"}
{"timestamp":"2024-03-04T17:11:13-0600","source":"process_monitor","log_level":"INFO","action":"startup","message":"jq version jq-1.7.1 found.","hostname":"WHWH1WD9F2","host_ip":"127.0.0.1"}
{"timestamp":"2024-03-04T17:11:13-0600","source":"process_monitor","log_level":"INFO","action":"startup","message":"Using existing log file /Users/brandon/bin/bash_process_monitor/log/20240304_process_monitor.log","hostname":"WHWH1WD9F2","host_ip":"127.0.0.1"}
{"timestamp":"2024-03-04T17:11:14-0600","source":"process_monitor","log_level":"WARN","action":"process_check","message":"Process found (6 actual of 1 expected) with pattern kitty with PIDs 21069,22746,78381,78889,78953,81667.","pid":"21069,22746,78381,78889,78953,81667","process_count":"6","expected_count":"1","pattern":"kitty","status":"WARN","n_status":"1","hostname":"WHWH1WD9F2","host_ip":"127.0.0.1"}
Srikanth Chekuri
03/05/2024, 5:12 PM

brandon
03/05/2024, 5:25 PM

brandon
03/05/2024, 5:26 PM

brandon
03/05/2024, 5:26 PM
seeing info and warn entries when doing a docker logs, but pulling into IDE so i can more easily look thru the entries.

brandon
03/05/2024, 6:03 PM
there's a warn about no matching files:
2024-03-05T17:05:31.895Z  warn  fileconsumer/file.go:61  finding files: no files match the configured criteria
{
  "kind": "receiver",
  "name": "filelog/app",
  "data_type": "logs",
  "component": "fileconsumer"
}
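That warning comes from the filelog receiver's file finder: nothing inside the container matched the include path, which is consistent with the single-path volume noted after the compose file above. A quick existence check against the running container (a sketch, using the container name from the compose file):

    docker exec signoz-otel-collector ls -l /cloudadmins/logs/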
brandon
03/05/2024, 6:04 PM

Srikanth Chekuri
03/06/2024, 1:40 AM

brandon
03/06/2024, 5:04 PM

brandon
03/06/2024, 5:04 PMSigNoz is an open-source APM. It helps developers monitor their applications & troubleshoot problems, an open-source alternative to DataDog, NewRelic, etc.
Powered by