Travis Chambers

01/13/2023, 1:07 AM
Hello! I'm trying to set up a simple test of Prometheus/Node Exporter to demo internally to our company (so we can decide to start using SigNoz more generally). I'm running Node Exporter on a host and can confirm that I see Prometheus successfully getting those metrics. However, I can't seem to get SigNoz to receive those same metrics. I've followed this (https://signoz.io/docs/userguide/send-metrics/#enable-a-prometheus-receiver) and added the endpoint where Node Exporter is exposing these metrics to my deploy/docker/clickhouse-setup/otel-collector-metrics-config.yaml. However, I can't seem to find any of these metrics showing up in SigNoz. Could I get some help troubleshooting? more info in 🧵
if i curl the <hostname>:9100/metrics endpoint from the machine SigNoz is running on, i see all the metrics output:
# HELP go_gc_duration_seconds A summary of the pause duration of garbage collection cycles.
# TYPE go_gc_duration_seconds summary
go_gc_duration_seconds{quantile="0"} 1.464e-05
go_gc_duration_seconds{quantile="0.25"} 2.888e-05
go_gc_duration_seconds{quantile="0.5"} 3.2991e-05
go_gc_duration_seconds{quantile="0.75"} 3.8511e-05
go_gc_duration_seconds{quantile="1"} 7.2621e-05
go_gc_duration_seconds_sum 0.129597327
go_gc_duration_seconds_count 3379
...
...
so i know the Node Exporter is working and is accessible to SigNoz.
this might just be my inability to visualize these Node Exporter metrics successfully. I tried to just import the Grafana Node Exporter full dashboard, but in SigNoz that doesn't seem to work (or maybe it would work, if i knew what variable values to provide?)

Srikanth Chekuri

01/13/2023, 1:12 AM
Was it a case of the imported dashboard not being able to show the data, or was no data scraped at all? Can you use the query builder tab and check whether the Node Exporter metric names show up?

Travis Chambers

01/13/2023, 1:15 AM
i do not see them show up in the query builder. i think the ones i do see are metrics that SigNoz collects already.

Srikanth Chekuri

01/13/2023, 1:15 AM
Can you share the logs of the collector metrics receiver? And what was the scrape config?

Travis Chambers

01/13/2023, 1:18 AM
here's my scrape_config:
receivers:
  otlp:
    protocols:
      grpc:
      http:
  prometheus:
    config:
      scrape_configs:
        # otel-collector-metrics internal metrics
        - job_name: otel-collector-metrics
          scrape_interval: 60s
          static_configs:
            - targets: ["localhost:8888", "pacific:9100"]
              labels:
                job_name: otel-collector-metrics
        # SigNoz span metrics
        - job_name: signozspanmetrics-collector
          scrape_interval: 60s
          static_configs:
            - targets:
              - otel-collector:8889
pacific:9100 is the machine i want to hit. it's using tailscale's MagicDNS. OH. well now i know i should have checked the clickhouse-setup_otel-collector-metrics_1 logs.
$ docker logs clickhouse-setup_otel-collector-metrics_1
...
...
2023-01-13T01:03:18.173Z	info	prometheusreceiver@v0.66.0/metrics_receiver.go:288	Starting scrape manager	{"kind": "receiver", "name": "prometheus", "pipeline": "metrics"}
2023-01-13T01:04:19.900Z	warn	internal/transaction.go:120	Failed to scrape Prometheus endpoint	{"kind": "receiver", "name": "prometheus", "pipeline": "metrics", "scrape_timestamp": 1673571859900, "target_labels": "{__name__=\"up\", instance=\"pacific:9100\", job=\"otel-collector-metrics\", job_name=\"otel-collector-metrics\"}"}
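When the collector log is long, it can help to pull out just the targets behind each "Failed to scrape Prometheus endpoint" warning. The following is a minimal sketch, not part of SigNoz or the collector: it assumes the log fields are tab-separated with a JSON object last, as in the line above, and the names `SAMPLE`, `LABEL_RE`, and `failed_targets` are hypothetical.

```python
import json
import re

# Sample line copied from the collector log above (fields are tab-separated).
SAMPLE = (
    '2023-01-13T01:04:19.900Z\twarn\tinternal/transaction.go:120\t'
    'Failed to scrape Prometheus endpoint\t'
    '{"kind": "receiver", "name": "prometheus", "pipeline": "metrics", '
    '"scrape_timestamp": 1673571859900, "target_labels": '
    '"{__name__=\\"up\\", instance=\\"pacific:9100\\", '
    'job=\\"otel-collector-metrics\\", job_name=\\"otel-collector-metrics\\"}"}'
)

# Matches key="value" pairs inside the target_labels string.
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def failed_targets(log_text):
    """Return the label dict of every target the collector failed to scrape."""
    targets = []
    for line in log_text.splitlines():
        if "Failed to scrape Prometheus endpoint" not in line:
            continue
        # Assumption: the last tab-separated field is a JSON object.
        fields = json.loads(line.split("\t")[-1])
        targets.append(dict(LABEL_RE.findall(fields["target_labels"])))
    return targets


if __name__ == "__main__":
    for labels in failed_targets(SAMPLE):
        # Prints: otel-collector-metrics pacific:9100
        print(labels.get("job"), labels.get("instance"))
```

To use it on a real log, the output of `docker logs clickhouse-setup_otel-collector-metrics_1` could be saved to a file and passed to `failed_targets` as a string.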

Srikanth Chekuri

01/13/2023, 1:19 AM
It says failed to scrape; that’s the reason they aren’t showing up. How does the pod/container know about Tailscale’s MagicDNS?

Travis Chambers

01/13/2023, 1:20 AM
hmm.. it's running on a host that has tailscale configured. i assumed it would just work.

Srikanth Chekuri

01/13/2023, 1:23 AM
No, that would have been correct if you were running the collector binary directly. Since you are running a Docker container, it doesn’t work.
Please configure an address the scraper can reach and check again. You should then be able to see the data in the SigNoz UI.

Travis Chambers

01/13/2023, 1:30 AM
alright. i'll play with it. thanks for your help!
Is there any way to get more information on why a prometheus scrape failed? I just see "failed to scrape prometheus endpoint" in the logs.
2023-01-13T21:06:31.894Z	warn	internal/transaction.go:120	Failed to scrape Prometheus endpoint	{"kind": "receiver", "name": "prometheus", "pipeline": "metrics", "scrape_timestamp": 1673643981893, "target_labels": "{__name__=\"up\", hostname=\"pacific\", instance=\"pacific.numbersstation.ai:9090\", job=\"otel-collector-metrics\", job_name=\"node_exporter\"}"}
however, i've verified the clickhouse-setup_otel-collector-metrics_1 container has access to the endpoint.
$ docker exec -it clickhouse-setup_otel-collector-metrics_1 sh
/ $ ping pacific.numbersstation.ai:9090
PING pacific.numbersstation.ai:9090 (100.87.98.118): 56 data bytes
64 bytes from 100.87.98.118: seq=0 ttl=42 time=0.068 ms
64 bytes from 100.87.98.118: seq=1 ttl=42 time=0.075 ms
64 bytes from 100.87.98.118: seq=2 ttl=42 time=0.065 ms
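A successful ping only shows that ICMP reaches the host. The scraper needs a TCP connection to the metrics port, which a firewall can block independently. Below is a minimal sketch of a TCP and HTTP check. The helper names `tcp_port_open` and `fetch_metrics_head` are hypothetical, and it assumes Python is available wherever it is run (the collector image itself may not ship it, so it may have to run from another container on the same Docker network).

```python
import socket
import urllib.request


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False


def fetch_metrics_head(url, lines=5, timeout=5.0):
    """Return the first few lines of a Prometheus text-format endpoint."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        body = resp.read().decode("utf-8", errors="replace")
    return body.splitlines()[:lines]


# Example usage (host and port taken from the scrape config earlier in the thread):
#   if tcp_port_open("pacific", 9100):
#       print("\n".join(fetch_metrics_head("http://pacific:9100/metrics")))
#   else:
#       print("cannot open a TCP connection to pacific:9100")
```

If `tcp_port_open` returns False while ping succeeds, name resolution and routing are probably fine and the port itself is unreachable, which points toward a firewall rule or a service not listening on that port.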

Srikanth Chekuri

01/14/2023, 2:29 AM

Travis Chambers

01/17/2023, 4:57 PM
sorry... where does the logging key go?
'service.telemetry' has invalid keys: logging
2023/01/17 16:56:13 application run finished with error: failed to get config: cannot unmarshal the configuration: 1 error(s) decoding:

* 'service.telemetry' has invalid keys: logging
i think i've tried nesting it under all of them -- service, telemetry, or metrics, with no luck.
service:
  telemetry:
    logging:
      loglevel: debug
    metrics:
      address: 0.0.0.0:8888

Srikanth Chekuri

01/18/2023, 4:45 AM
Ah sorry, the key should be logs instead, with level rather than loglevel:
service:
  telemetry:
    logs:
      level: debug

Travis Chambers

03/09/2023, 8:04 PM
i finally got around to working on this again. works.
$ sudo ufw allow 9100
🤦‍♂️