# support
t
Hello! I'm trying to set up a simple test of Prometheus/Node Exporter to demo internally to our company (so we can decide to start using SigNoz more generally). I'm running Node Exporter on a host and can confirm that I see Prometheus successfully getting those metrics. However, I can't seem to get SigNoz to receive those same metrics. I've followed this (https://signoz.io/docs/userguide/send-metrics/#enable-a-prometheus-receiver) and added the endpoint where Node Exporter is exposing these metrics to my deploy/docker/clickhouse-setup/otel-collector-metrics-config.yaml. However, I can't seem to find any of these metrics showing up in SigNoz. Could I get some help troubleshooting? more info in 🧵
if i curl the <hostname>:9100/metrics endpoint from the machine SigNoz is running on, i see all the metrics output:
# HELP go_gc_duration_seconds A summary of the pause duration of garbage collection cycles.
# TYPE go_gc_duration_seconds summary
go_gc_duration_seconds{quantile="0"} 1.464e-05
go_gc_duration_seconds{quantile="0.25"} 2.888e-05
go_gc_duration_seconds{quantile="0.5"} 3.2991e-05
go_gc_duration_seconds{quantile="0.75"} 3.8511e-05
go_gc_duration_seconds{quantile="1"} 7.2621e-05
go_gc_duration_seconds_sum 0.129597327
go_gc_duration_seconds_count 3379
...
...
so i know the Node Exporter is working and is accessible to SigNoz.
this might just be my inability to visualize these Node Exporter metrics successfully. I tried to just import the Grafana Node Exporter full dashboard, but in SigNoz that doesn't seem to work (or maybe it would work, if i knew what variable values to provide?)
s
Was it a case of the imported dashboard not being able to show the data, or was no data scraped at all? Can you use the query builder tab to check whether the Node Exporter metric names show up?
t
i do not see them show up in the query builder. i think the ones i do see are metrics that SigNoz collects already.
s
Can you share the logs of the collector metrics receiver? And what was the scrape config?
t
here's my scrape_config:
receivers:
  otlp:
    protocols:
      grpc:
      http:
  prometheus:
    config:
      scrape_configs:
        # otel-collector-metrics internal metrics
        - job_name: otel-collector-metrics
          scrape_interval: 60s
          static_configs:
            - targets: ["localhost:8888", "pacific:9100"]
              labels:
                job_name: otel-collector-metrics
        # SigNoz span metrics
        - job_name: signozspanmetrics-collector
          scrape_interval: 60s
          static_configs:
            - targets:
              - otel-collector:8889
pacific:9100 is the machine i want to hit. it's using tailscale's MagicDNS. OH. well now i know i should have checked the clickhouse-setup_otel-collector-metrics_1 logs.
$ docker logs clickhouse-setup_otel-collector-metrics_1
...
...
2023-01-13T01:03:18.173Z	info	prometheusreceiver@v0.66.0/metrics_receiver.go:288	Starting scrape manager	{"kind": "receiver", "name": "prometheus", "pipeline": "metrics"}
2023-01-13T01:04:19.900Z	warn	internal/transaction.go:120	Failed to scrape Prometheus endpoint	{"kind": "receiver", "name": "prometheus", "pipeline": "metrics", "scrape_timestamp": 1673571859900, "target_labels": "{__name__=\"up\", instance=\"pacific:9100\", job=\"otel-collector-metrics\", job_name=\"otel-collector-metrics\"}"}
s
It says failed to scrape; that’s the reason they aren’t showing up. How does the pod/container know about Tailscale’s magic?
t
hmm.. it's running on a host that has tailscale configured. i assumed it would just work.
s
No, that would have been correct if you were running the collector binary directly. Since you are running a Docker container, it doesn’t work.
Please configure an address the scraper can reach and check again. You should be able to see the data in the SigNoz UI.
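A quick way to check is to fetch the endpoint from inside the collector container, since that is where the scrape originates. This is only a rough sketch: probe_from_container is a made-up helper name, and it assumes the image ships a busybox-style wget (swap in curl if that is what the image has).

```shell
# Hypothetical helper: print the first few lines of <target>/metrics
# as seen from inside the given container.
# Assumption: the container image includes a busybox-style wget.
probe_from_container() {
  container="$1"   # e.g. the otel-collector-metrics container
  target="$2"      # host:port of the exporter
  docker exec "$container" wget -q -O - -T 5 "http://${target}/metrics" | head -n 5
}

# Usage (names taken from this thread):
#   probe_from_container clickhouse-setup_otel-collector-metrics_1 pacific:9100
```

If this prints HELP/TYPE lines, the container can reach the exporter over HTTP. If it hangs or errors, the problem is likely name resolution, routing, or a firewall rather than the scrape config.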
t
alright. i'll play with it. thanks for your help!
Is there any way to get more information on why a Prometheus scrape failed? I just see "Failed to scrape Prometheus endpoint" in the logs.
2023-01-13T21:06:31.894Z	warn	internal/transaction.go:120	Failed to scrape Prometheus endpoint	{"kind": "receiver", "name": "prometheus", "pipeline": "metrics", "scrape_timestamp": 1673643981893, "target_labels": "{__name__=\"up\", hostname=\"pacific\", instance=\"pacific.numbersstation.ai:9090\", job=\"otel-collector-metrics\", job_name=\"node_exporter\"}"}
however, i've verified the clickhouse-setup_otel-collector-metrics_1 container has access to the endpoint.
$ docker exec -it clickhouse-setup_otel-collector-metrics_1 sh
/ $ ping pacific.numbersstation.ai:9090
PING pacific.numbersstation.ai:9090 (100.87.98.118): 56 data bytes
64 bytes from 100.87.98.118: seq=0 ttl=42 time=0.068 ms
64 bytes from 100.87.98.118: seq=1 ttl=42 time=0.075 ms
64 bytes from 100.87.98.118: seq=2 ttl=42 time=0.065 ms
s
t
sorry... where does the logging key go?
'service.telemetry' has invalid keys: logging
2023/01/17 16:56:13 application run finished with error: failed to get config: cannot unmarshal the configuration: 1 error(s) decoding:

* 'service.telemetry' has invalid keys: logging
i think i've tried nesting it under all of them -- service, telemetry, or metrics, with no luck.
service:
  telemetry:
    logging:
      loglevel: debug
    metrics:
      address: 0.0.0.0:8888
s
Ah sorry it should be logs instead
service:
  telemetry:
    logs:
      level: debug
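Once that is in place, restart the collector container and filter its logs for scrape messages. Here is a rough sketch, assuming the container name from earlier in this thread and a host with grep and sed. failed_scrape_targets is a made-up helper name, and the sed pattern assumes the escaped-quote log format pasted above.

```shell
# Hypothetical helper: read collector logs on stdin and print the
# instance label of every target that failed to scrape, de-duplicated.
failed_scrape_targets() {
  grep 'Failed to scrape Prometheus endpoint' |
    sed -n 's/.*instance=\\"\([^\\]*\)\\".*/\1/p' |
    sort -u
}

# Usage (container name taken from this thread):
#   docker restart clickhouse-setup_otel-collector-metrics_1
#   docker logs clickhouse-setup_otel-collector-metrics_1 2>&1 | failed_scrape_targets
```

With the debug level set, the lines around each failure should hopefully carry more detail about why the scrape did not succeed.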
t
i finally got around to working on this again. works.
$ sudo ufw allow 9100
🤦‍♂️