# support
v
Hi everyone, I just discovered SigNoz. I'm trying to deploy it on Kubernetes by following the Helm tutorial. Some pods are crashing because they fail to access the DB. Logs from the pod my-release-signoz-otel-collector-ddb76f957-kxjp9:
time="2022-09-01T12:49:09Z" level=info msg="Executing:\nCREATE DATABASE IF NOT EXISTS signoz_metrics\n" component=clickhouse
Error: cannot build pipelines: failed to create "clickhouselogsexporter" exporter, in pipeline "logs": cannot configure clickhouse logs exporter: code: 516, message: admin: Authentication failed: password is incorrect or there is no user with such name
p
Hey @Valentin Lorand 👋 That usually happens due to clickhouse IP whitelisting. Could you please share the output of the following?
kubectl -n platform get pods -o=wide
v
NAME                                                       READY   STATUS             RESTARTS        AGE   IP             NODE                                             NOMINATED NODE   READINESS GATES
chi-signoz-cluster-0-0-0                                   1/1     Running            0               14m   100.64.0.55    scw-k8s-vld-default-368a16ee0c6b403eb3ee422bcb   <none>           <none>
clickhouse-operator-6c966f59cd-dk7pv                       2/2     Running            0               14m   100.64.0.197   scw-k8s-vld-default-368a16ee0c6b403eb3ee422bcb   <none>           <none>
my-release-signoz-alertmanager-0                           0/1     Pending            0               14m   <none>         <none>                                           <none>           <none>
my-release-signoz-frontend-cf9b8d8c9-rkj7n                 0/1     Init:0/1           0               14m   100.64.0.145   scw-k8s-vld-default-368a16ee0c6b403eb3ee422bcb   <none>           <none>
my-release-signoz-otel-collector-ddb76f957-kg7lk           0/1     CrashLoopBackOff   3 (26s ago)     74s   100.64.0.106   scw-k8s-vld-default-368a16ee0c6b403eb3ee422bcb   <none>           <none>
my-release-signoz-otel-collector-metrics-5dd74686c-9g9sc   0/1     CrashLoopBackOff   6 (2m55s ago)   14m   100.64.0.119   scw-k8s-vld-default-368a16ee0c6b403eb3ee422bcb   <none>           <none>
my-release-signoz-query-service-0                          0/1     CrashLoopBackOff   7 (19s ago)     14m   100.64.0.98    scw-k8s-vld-default-368a16ee0c6b403eb3ee422bcb   <none>           <none>
my-release-zookeeper-0                                     1/1     Running            0               14m   100.64.0.19    scw-k8s-vld-default-368a16ee0c6b403eb3ee422bcb   <none>           <none>
Indeed, it seems to be an IP access problem. More logs here:
[chi-signoz-cluster-0-0-0] 2022.09.01 13:37:57.003959 [ 11 ] {} <Error> Access(user directories): from: ::ffff:100.64.0.119, user: admin: Authentication failed: Code: 195. DB::Exception: Connections from ::ffff:100.64.0.119 are not allowed. (IP_ADDRESS_NOT_ALLOWED), Stack trace (when copying this message, always include the lines below):
[chi-signoz-cluster-0-0-0] 2022.09.01 13:37:57.003780 [ 11 ] {} <Warning> AddressPatterns: Failed to check if the allowed client hosts contain address ::ffff:100.64.0.119. DB::Exception: Cannot getnameinfo(::ffff:100.64.0.119): Name or service not known, code = 198
p
You can either whitelist all IPv4/IPv6 sources or allow the SigNoz pod IP range `100.64.0.0/16`.
https://github.com/SigNoz/charts/blob/main/charts/signoz/values.yaml#L117-L120
Include the above IP range in the `clickhouse.allowedNetworkIps` list.
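A minimal sketch of that override, assuming a values file named `override-values.yaml` in the current directory (the file name is taken from later in this thread; the key is the `clickhouse.allowedNetworkIps` list named above). Helm replaces lists in values files rather than merging them, so the private ranges the chart already allows are repeated next to the new range:

```shell
# Write a Helm values override that adds the pod network to the ClickHouse allow-list.
# Assumption: 100.64.0.0/16 matches your cluster's pod CIDR, as suggested above.
cat > override-values.yaml <<'EOF'
clickhouse:
  allowedNetworkIps:
    # Private ranges the chart allows by default (per this thread)
    - "10.0.0.0/8"
    - "172.16.0.0/12"
    - "192.168.0.0/16"
    # Pod network range seen in the `kubectl get pods -o=wide` output
    - "100.64.0.0/16"
EOF
```

The file could then be applied with something like `helm -n platform upgrade my-release signoz/signoz -f override-values.yaml`, assuming the release was installed from a repo alias named `signoz`; adjust the chart reference to match your install.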
v
Hmm, it seems that, for ClickHouse, "By default anything within a private network will be allowed." 🤔 Another strange thing, and I don't know if it's related:
[my-release-signoz-otel-collector-metrics-54d4778cbf-b2qs4 my-release-signoz-otel-collector-metrics-init] wget: bad address 'my-release-clickhouse:8123' 
[my-release-signoz-otel-collector-metrics-54d4778cbf-b2qs4 my-release-signoz-otel-collector-metrics-init] waiting for clickhouseDB
p
Yes, the signoz/clickhouse chart allows the following IPs by default:
- "10.0.0.0/8"
- "172.16.0.0/12"
- "192.168.0.0/16"
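To see why the collector's address was rejected, it can help to test the pod IP from the ClickHouse log (`100.64.0.119`) against these ranges. The following is a rough POSIX shell sketch; the helper names `ip_to_int` and `in_cidr` are made up for illustration and only handle plain IPv4 dotted-quad input:

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
# The function body runs in a subshell, so the IFS change does not leak.
ip_to_int() (
  IFS=.
  set -- $1
  echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
)

# Succeed (exit status 0) if IPv4 address $1 lies inside CIDR block $2.
in_cidr() (
  ip=$(ip_to_int "$1")
  net=$(ip_to_int "${2%/*}")
  bits=${2#*/}
  mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
  [ $(( ip & mask )) -eq $(( net & mask )) ]
)

pod_ip=100.64.0.119
for cidr in 10.0.0.0/8 172.16.0.0/12 192.168.0.0/16 100.64.0.0/16; do
  if in_cidr "$pod_ip" "$cidr"; then
    echo "$pod_ip is inside $cidr"
  else
    echo "$pod_ip is NOT inside $cidr"
  fi
done
```

The pod IP falls outside the three default ranges and only matches `100.64.0.0/16`, which is consistent with the `IP_ADDRESS_NOT_ALLOWED` error above.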
b
[my-release-signoz-otel-collector-5cgbv my-release-signoz-otel-collector-init] waiting for clickhouseDB
[my-release-signoz-query-service-0 my-release-signoz-query-service-init] wget: bad address 'my-release-clickhouse:8123'
[my-release-signoz-query-service-0 my-release-signoz-query-service-init] waiting for clickhouseDB
[my-release-signoz-otel-collector-5cgbv my-release-signoz-otel-collector-init] wget: bad address 'my-release-clickhouse:8123'
[my-release-signoz-otel-collector-5cgbv my-release-signoz-otel-collector-init] waiting for clickhouseDB
[my-release-signoz-otel-collector-metrics-54d4778cbf-p6hp2 my-release-signoz-otel-collector-metrics-init] wget: bad address 'my-release-clickhouse:8123'
[my-release-signoz-otel-collector-metrics-54d4778cbf-p6hp2 my-release-signoz-otel-collector-metrics-init] waiting for clickhouseDB
[my-release-signoz-query-service-0 my-release-signoz-query-service-init] wget: bad address 'my-release-clickhouse:8123'
[my-release-signoz-query-service-0 my-release-signoz-query-service-init] waiting for clickhouseDB
Hi, I had `- "0.0.0.0/0"` in the allowedNetworksIps array
Hmm, maybe a serviceAccount problem?
I0901 14:48:48.489450       1 labeler.go:80] OPERATOR_POD_NAMESPACE=platform OPERATOR_POD_NAME=my-release-clickhouse-operator-5cbd5c88d5-6frhb
E0901 14:48:48.576474       1 labeler.go:209] labelDeployment():platform/my-release-clickhouse-operator:ERROR get Deployment platform/my-release-clickhouse-operator
E0901 14:48:48.576707       1 controller.go:488] Run():ERROR label objects, will retry. Err: deployments.apps "my-release-clickhouse-operator" is forbidden: User "system:serviceaccount:platform:my-release-clickhouse-operator" cannot get resource "deployments" in API group "apps" in the namespace "platform"
I0901 14:48:53.577790       1 labeler.go:80] OPERATOR_POD_NAMESPACE=platform OPERATOR_POD_NAME=my-release-clickhouse-operator-5cbd5c88d5-6frhb
E0901 14:48:53.688019       1 labeler.go:209] labelDeployment():platform/my-release-clickhouse-operator:ERROR get Deployment platform/my-release-clickhouse-operator
E0901 14:48:53.688047       1 controller.go:488] Run():ERROR label objects, will retry. Err: deployments.apps "my-release-clickhouse-operator" is forbidden: User "system:serviceaccount:platform:my-release-clickhouse-operator" cannot get resource "deployments" in API group "apps" in the namespace "platform"
p
I don't think it is related to the previous issue. That one was an IP whitelisting issue. The one above seems to be an RBAC issue.
Could you please share more details about your K8s cluster? How did you set it up? And how many nodes and their types?
It works without any issue on EKS, kind, and k3s.
@Benjamin Carriou If you are using an updated `override-values.yaml`, do share that as well.
deployments.apps "my-release-clickhouse-operator" is forbidden
I was able to reproduce the error above. Creating PR to fix it in some time.
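In the meantime, the denial in the operator log can be checked from outside the pod by impersonating the service account. This is only a sketch: the namespace and release name are the ones from this thread, and the function name `check_operator_rbac` is made up for illustration. It relies on `kubectl auth can-i` with the `--as` impersonation flag, which requires that your own user is allowed to impersonate service accounts:

```shell
# Values taken from the log lines in this thread; change them for your cluster.
NAMESPACE=platform
RELEASE=my-release
SA="system:serviceaccount:${NAMESPACE}:${RELEASE}-clickhouse-operator"

# Ask the API server whether the operator's service account may read Deployments.
# Prints "yes" or "no"; "no" matches the "is forbidden" error in the operator log.
check_operator_rbac() {
  kubectl auth can-i get deployments.apps --namespace "$NAMESPACE" --as "$SA"
}
```

Calling `check_operator_rbac` before and after upgrading the chart would show whether the operator's permissions changed.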
b
OK, thanks. Do you know the fix, so I can try it on my cluster?
p
@Benjamin Carriou It is out now. However, the error above is unrelated to the IP whitelist issue.
@Benjamin Carriou @Valentin Lorand The IP whitelisting issue is resolved in the latest charts release. By default, we now whitelist all private IP ranges from the reserved IPv4 list. https://en.wikipedia.org/wiki/Reserved_IP_addresses#IPv4