# general
m
Hello, does SigNoz have agents for the other servers I want to collect data from, or must the SigNoz server be installed on every server for this to work?
p
Hey @Miloš Hlavička, are you on Kubernetes, or do you want to collect data from independent servers? If on Kubernetes, you can check this tutorial - https://signoz.io/docs/tutorial/kubernetes-infra-metrics/
m
I want to install SigNoz with Docker on one of my many servers, which are mostly VMs from different providers such as Contabo and others.
p
@Miloš Hlavička you can run node exporter in each of the servers and send data to SigNoz in Prometheus format, and configure SigNoz otel collector to scrape those targets. Check this doc - https://signoz.io/docs/userguide/send-metrics/
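For the scrape step, here is a minimal sketch. The helper prints a Prometheus receiver fragment for the SigNoz OTel collector config, with one target per server. The function name `scrape_targets`, the job name, the scrape interval, and the host names are made up for illustration; 9100 is node exporter's default port. If your collector config already has a `prometheus` receiver, merge the `scrape_configs` entry into it rather than pasting the whole fragment, and make sure the receiver is listed in a metrics pipeline.

```shell
#!/bin/sh
# Print a prometheus receiver fragment that scrapes node exporter on each host.
# Assumptions: the job name "node-exporter" and the 60s interval are arbitrary;
# 9100 is node exporter's default listen port.
scrape_targets() {
  printf 'receivers:\n'
  printf '  prometheus:\n'
  printf '    config:\n'
  printf '      scrape_configs:\n'
  printf '        - job_name: node-exporter\n'
  printf '          scrape_interval: 60s\n'
  printf '          static_configs:\n'
  printf '            - targets:\n'
  # One target line per host passed on the command line.
  for host in "$@"; do
    printf '                - "%s:9100"\n' "$host"
  done
}

# Example (hypothetical host names):
#   scrape_targets vm-contabo-1 vm-other-2
```

Each server still needs node exporter running and its port reachable from the machine where the SigNoz collector runs.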
m
great will check it out
p
👍
p
@Miloš Hlavička Another alternative would be to use the OTel binary to send host metrics data to SigNoz. We are yet to include this in our docs, but you should be able to follow the steps from here: https://github.com/SigNoz/benchmark/tree/main/docker#binary We also have a script to generate a hostmetrics dashboard. Replace `test-instance` with your instance hostname - output of `echo $HOST`:
curl -sL https://github.com/SigNoz/benchmark/raw/main/dashboards/hostmetrics/hostmetrics-import.sh \
    | HOSTNAME="test-instance" DASHBOARD_TITLE="HostMetrics Dashboard for test-instance" bash
👀 1
@Miloš Hlavička end result would be something like this:
👍 1
m
@Prashant Shahi I tried to set it up on the same machine the server is running on, and ran into trouble: when I spin up that metrics Docker container, the port is already taken by the server installation. I looked into it, and it seems this collector is installed by the default installation, so I proceeded with just generating the dashboard JSON, but all fields except the SigNoz collector are empty. What did I miss? 😞
p
Questions:
• Did you set up the OTel binary on the server and point its config to the SigNoz OTel 4317 port? And is SigNoz OTel accessible from the server?
• Did you generate the dashboard on the correct machine?
@Miloš Hlavička
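To check whether SigNoz OTel is accessible from the server, a minimal sketch of a reachability test could look like this. The function name `port_open` and the example address are made up for illustration, and the check assumes `bash` and coreutils `timeout` are installed on the machine that runs it. It only confirms that a TCP connection to the port succeeds, not that the collector accepts your data.

```shell
#!/bin/sh
# Return 0 if a TCP connection to host:port succeeds within 3 seconds.
# Assumes bash (for /dev/tcp) and coreutils `timeout` are available.
port_open() {
  timeout 3 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$1" "$2" 2>/dev/null
}

# Example (hypothetical address of the SigNoz host), run from the monitored server:
#   port_open 203.0.113.10 4317 && echo "port 4317 reachable" || echo "port 4317 NOT reachable"
```

If the port is not reachable, look at firewall rules and the published ports of the SigNoz Docker installation before touching the collector config.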
m
@Prashant Shahi I did the standalone Docker installation of the server, which already has a preconfigured OTel collector with a receiver listening on port 4317, so I didn't have to change the basic docker-compose.yaml, as it already looks properly configured. Then I generated the JSON on the SigNoz server, which is where I want to monitor metrics for now, before I start moving OTel to the other servers.
p
@Miloš Hlavička If you haven't configured another OTel collector at the server level and want to use the hostmetrics collected by SigNoz itself, you can import the following dashboard: https://raw.githubusercontent.com/SigNoz/benchmark/main/dashboards/hostmetrics/hostmetrics-signoz.json
m
@Prashant Shahi It finally works. I wanted to create alerts, but it seems you support only Slack and some generic webhooks? Not even email? We are using Rocket.Chat, but email is the best place to send alerts, together with Rocket.Chat. Any chance of adding email and Rocket.Chat alerts in the future? Also, would it be possible for one alert to go to several channels at once, e.g. mail, Rocket.Chat, and/or a webhook?
p
@Miloš Hlavička Glad to know that you were able to get the dashboard to work. Yes, currently we support Slack, PagerDuty, and Generic Webhook. Using the generic webhook, you should be able to plug in anything on the other end. However, if you require support for something specific, can you please create a feature request issue on GitHub, so that we can evaluate and prioritise it?
m
ok will do
s
@Prashant Shahi Sorry to hijack this thread, but I have a related query. I am trying out the docs for monitoring K8s metrics https://signoz.io/docs/tutorial/kubernetes-infra-metrics/ (multi-node K8s cluster running on Azure). I can see the CPU etc. metrics for the cluster; however, the host metrics for the nodes are empty. Are there any additional steps to get the metrics for the nodes?
p
@som neema No worries. We recently had a minor change in the dashboard format, which might be causing it. Let me check and get back to you.
s
thanks, I am running 0.8.1 Signoz
@Prashant Shahi do you think the issue you mentioned affects 0.8.1 ?
p
@som neema It should work fine. Can you tell me how you generated the dashboard JSON? I created a new local K8s cluster, and the generated dashboard JSON worked for me.
s
I had cloned the hostmetrics repo and ran the shell script
cat hostmetrics-import.sh | HOSTNAME="aks-agentpool-14188195-vmss000003" DASHBOARD_TITLE="HostMetrics Dashboard for aks-agentpool-14188195-vmss000003" bash
cat hostmetrics-import.sh | HOSTNAME="aks-agentpool-14188195-vmss000002" DASHBOARD_TITLE="HostMetrics Dashboard for aks-agentpool-14188195-vmss000002" bash
I just restarted the cluster, let me run through the steps again
p
are you sure the hostname entered is correct one? Is it hostname of the K8s node or an external instance?
s
yes its the k8s node
p
can you run this?
for node in $(kubectl get nodes -o name | sed -e "s/^node\///"); do
  curl -sL https://github.com/SigNoz/benchmark/raw/main/dashboards/hostmetrics/hostmetrics-import.sh \
    | HOSTNAME="$node" DASHBOARD_TITLE="Node Metrics Dashboard for $node" bash
done
we had recently updated hostmetrics template and script.
s
som@SomNeema-pc:~/ws/repos/benchmark/dashboards/hostmetrics$ kubectl get nodes -o name | sed -e "s/^node\///"
aks-agentpool-14188195-vmss000006
aks-agentpool-14188195-vmss000007
aks-userpool-14188195-vmss000008
running hostmetrics with these now
p
Also remember to `git pull` the latest changes from the hostmetrics repo.
s
ok, let me do that
Looks ok? :
Recent commits
5bc3ecf origin/main chore: 🏗️ migrate dashboard to v0.8.1 format
e5f2969 chore: 🔧 move out config with dockerstats
c44fd80 chore: 🔧 update hostmetrics dashboard
32491ff chore(config): 🔧 update OTel config and hostmetrics dashboard files
still the same
p
s
anything I can provide for you to help isolate the issue ?
yes, the CPU and Memory metrics are fine (I guess), but I'm not sure why it does not list the containers/pods running in the main namespace. Do I have to write PromQL queries to get that?
p
Oh, found it!
The PR still hasn't been merged yet. https://github.com/SigNoz/otel-collector-k8s/pull/3
s
ok, should I cherry-pick and try ?
looks like I need to restart signoz after this ?
p
signoz need not be restarted.
s
ok, let me try. Also, let me know if you know why I don't see my app containers in the CPU/Memory metrics dashboard
p
I have asked for review again. It should get merged by tonight. Meanwhile, you can clone the fork repository and follow the steps from the above tutorial: https://github.com/prashant-shahi/otel-collector-k8s
"I dont see my app containers"
If you hover over the graph or scroll through the labels, you should see your application. You can edit the widgets to show only your application namespace. For example:
k8s_container_cpu_request{k8s_namespace_name="my-app-namespace"}
s
Hmm, I checked by scrolling and expanding, but I don't see them. My apps are in the default (main) namespace, so will it be
k8s_container_cpu_request(k8s_namespace_name="") ?
p
it will be:
k8s_container_cpu_request{k8s_namespace_name="default"}
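As a small sketch of the point above: the selector uses curly braces, and workloads created without an explicit namespace live in the namespace literally named `default`, not in one with an empty name. The helper name `cpu_request_query` is made up for illustration; the metric and label names are the ones used in this thread.

```shell
#!/bin/sh
# Print the container CPU request selector for one namespace.
# With no argument it falls back to "default", the namespace Kubernetes
# uses for workloads created without an explicit namespace.
cpu_request_query() {
  ns="${1:-default}"
  printf 'k8s_container_cpu_request{k8s_namespace_name="%s"}\n' "$ns"
}

# Example:
#   cpu_request_query                    # selector for the default namespace
#   cpu_request_query my-app-namespace   # selector for a named namespace
```

The printed selector can be pasted into a dashboard widget query.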
s
ok, thanks. Let me try. Also, I am trying with the changes from the pending PR. The otel-collector-agent does not seem to come up:
Warning  Unhealthy  8s (x2 over 58s)   kubelet            Readiness probe failed: Get "http://10.244.1.24:13133/": dial tcp 10.244.1.24:13133: connect: connection refused
Warning  BackOff    5s (x10 over 56s)  kubelet            Back-off restarting failed container
will cleanup the namespace and try again
nope, the agent is still crashing .. on the health check at port 13133, reverting the change and checking
ok, reverting the changes, fixes the crash
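When the agent crash-loops like this, the logs of the previous container instance usually say more than the probe failure does. A minimal sketch of the commands, assuming the `signoz-infra-metrics` namespace used in this thread; the function names and the `NS`/`KUBECTL` variables are made up for illustration, and the pod name is whatever `get pods` reports for the agent:

```shell
#!/bin/sh
# Namespace used in this thread; both variables can be overridden from the environment.
NS="${NS:-signoz-infra-metrics}"
KUBECTL="${KUBECTL:-kubectl}"

# List the pods so the crashing agent pod name can be picked out.
list_agent_pods() {
  $KUBECTL -n "$NS" get pods -o wide
}

# Show the logs of the previous (crashed) container instance of a pod.
crash_logs() {
  $KUBECTL -n "$NS" logs "$1" --previous
}

# Show pod details and events, including readiness probe failures.
pod_events() {
  $KUBECTL -n "$NS" describe pod "$1"
}

# Example (hypothetical pod name):
#   list_agent_pods
#   crash_logs otel-collector-agent-abcde
#   pod_events otel-collector-agent-abcde
```

A connection refused on port 13133 typically means the collector process exited before its health check endpoint came up, so the previous container's logs are the place to look for a config error.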
p
@som neema is the issue resolved now?
s
no, I am trying to debug and understand the logs
p
Can you clean up the previous installation from `signoz-infra-metrics`?
kubectl -n signoz-infra-metrics delete -Rf agent
kubectl -n signoz-infra-metrics delete -Rf deployment
And after the cleanup, try again with the latest K8s manifests.
s
I did try deleting ( including the namespace ) but I saw the crash with the fix. Let me pull the latest
p
Did you follow the steps from the tutorial to include the SigNoz endpoint? Also, the previous PR has been merged, so you can switch to it: https://github.com/SigNoz/otel-collector-k8s/pull/3
s
include the signoz endpoint?
Yes I have the ip addresses of the endpoint updated in the yaml
👍 1