# support
d
I0705 06:56:45.088037       1 poller.go:245] pollStatefulSet():platform/chi-apm-clickhouse-cluster-0-0:%s/%s - TIMEOUT reached
E0705 06:56:45.088053       1 creator.go:118] updateStatefulSet():StatefulSet update wait failed. err: waitStatefulSet(platform/chi-apm-clickhouse-cluster-0-0) - wait timeout
I0705 06:56:45.088067       1 creator.go:237] onStatefulSetUpdateFailed():going to ROLLBACK FAILED StatefulSet platform/chi-apm-clickhouse-cluster-0-0
W0705 06:56:45.126433       1 warnings.go:70] spec.template.spec.containers[0].ports[3]: duplicate port definition with spec.template.spec.containers[0].ports[1]
I0705 06:56:45.126799       1 deleter.go:97] Delete Pod platform/chi-apm-clickhouse-cluster-0-0-0
E0705 06:56:45.144447       1 deleter.go:105] statefulSetDeletePod():FAIL delete Pod platform/chi-apm-clickhouse-cluster-0-0-0 err:pods "chi-apm-clickhouse-cluster-0-0-0" is forbidden: User "system:serviceaccount:platform:apm-clickhouse-operator" cannot delete resource "pods" in API group "" in the namespace "platform"
I0705 06:56:45.144471       1 worker.go:1458] Got abort. Abort
E0705 06:56:45.144586       1 worker-reconciler.go:735] reconcileStatefulSet():FAILED to reconcile StatefulSet: chi-apm-clickhouse-cluster-0-0 CHI: apm-clickhouse 
I0705 06:56:45.238676       1 creator.go:58] CreateServiceCHI():platform/apm-clickhouse/45fae902-f819-4d37-9016-a9013c4bcf8f:platform/apm-clickhouse
I0705 06:56:45.251515       1 worker.go:1201] updateService():platform/apm-clickhouse/45fae902-f819-4d37-9016-a9013c4bcf8f:Update Service platform/apm-clickhouse
E0705 06:56:45.323817       1 worker-reconciler.go:91] reconcileCHI():platform/apm-clickhouse/45fae902-f819-4d37-9016-a9013c4bcf8f:FAILED to update err: crud error - should abort
I0705 06:56:45.418894       1 worker.go:657] markReconcileComplete():platform/apm-clickhouse/45fae902-f819-4d37-9016-a9013c4bcf8f:reconcile completed unsuccessfully, task id: 45fae902-f819-4d37-9016-a9013c4bcf8f
We are suddenly getting this error in our production SigNoz. Is there a quick way to resolve it? @nitya-signoz @Prashant Shahi
apm-clickhouse-operator-676658c454-vjxxr            2/2     Running     0             22d
apm-k8s-infra-otel-agent-24gqj                      1/1     Running     0             7d21h
apm-k8s-infra-otel-agent-5ggrm                      1/1     Running     0             7d21h
apm-k8s-infra-otel-agent-f5glt                      1/1     Running     0             7d21h
apm-k8s-infra-otel-agent-hxp87                      1/1     Running     0             7d21h
apm-k8s-infra-otel-agent-n79k7                      1/1     Running     0             7d21h
apm-k8s-infra-otel-agent-nfv6j                      1/1     Running     0             7d21h
apm-k8s-infra-otel-deployment-dfb9b77bf-lp5dj       1/1     Running     0             95m
apm-signoz-alertmanager-0                           1/1     Running     0             22d
apm-signoz-frontend-7b4dd6989c-cg2d9                1/1     Running     0             95m
apm-signoz-otel-collector-789cf6c675-fmxkc          1/1     Running     0             7d21h
apm-signoz-otel-collector-789cf6c675-zkndn          1/1     Running     3 (90m ago)   95m
apm-signoz-otel-collector-metrics-f69ff5867-wxgpt   1/1     Running     0             95m
apm-signoz-query-service-0                          0/1     Running     0             7d21h
apm-signoz-schema-migrator-upgrade-dfjwj            0/1     Completed   0             89m
apm-zookeeper-0                                     1/1     Running     0             22d
There is also no ClickHouse pod here right now.
n
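The error logs point to a permissions problem: the operator's service account (system:serviceaccount:platform:apm-clickhouse-operator) is forbidden from deleting pods in the platform namespace, so the StatefulSet rollback aborts. Can you try the steps below?
• First, check that the ClickHouse custom resource is present and correctly defined: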
kubectl get clickhouseinstallations -n platform
• Look at the logs of the ClickHouse operator pod for more detailed error messages:
kubectl logs -n platform apm-clickhouse-operator-676658c454-vjxxr
• It appears that the ClickHouse operator doesn't have the necessary permissions. Review and possibly update the RBAC rules for the ClickHouse operator:
kubectl get clusterrole clickhouse-operator-cluster-role -o yaml
kubectl get role -n platform clickhouse-operator-role -o yaml
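To confirm the permission gap directly, you can ask the API server whether the operator's service account is allowed to perform the action named in the error. A minimal sketch, assuming a POSIX shell with sed: it parses the forbidden message copied from the log above and builds the matching kubectl auth can-i command. The variable names are only illustrative, the kubectl call is skipped when kubectl is not installed, and your own user needs impersonation rights for --as to work.

```shell
# Error text copied from the operator log earlier in this thread.
err='pods "chi-apm-clickhouse-cluster-0-0-0" is forbidden: User "system:serviceaccount:platform:apm-clickhouse-operator" cannot delete resource "pods" in API group "" in the namespace "platform"'

# Extract the identity, verb, resource and namespace from the message.
sa=$(printf '%s\n' "$err" | sed -n 's/.*User "\([^"]*\)".*/\1/p')
verb=$(printf '%s\n' "$err" | sed -n 's/.*cannot \([a-z]*\) resource.*/\1/p')
resource=$(printf '%s\n' "$err" | sed -n 's/.*resource "\([^"]*\)".*/\1/p')
ns=$(printf '%s\n' "$err" | sed -n 's/.*namespace "\([^"]*\)".*/\1/p')

# Show the check that corresponds to the failed call.
check_cmd="kubectl auth can-i $verb $resource --as=$sa -n $ns"
echo "$check_cmd"

# Run it only where kubectl is available; it answers "yes" or "no".
if command -v kubectl >/dev/null 2>&1; then
  kubectl auth can-i "$verb" "$resource" --as="$sa" -n "$ns" || true
fi
```

A "no" answer would confirm that the Role or ClusterRole bound to that service account lacks the delete verb on pods. The role names in your release may differ from the ones in the commands above.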
• If the ClickHouse pod is missing, you might need to recreate it. However, be cautious as this might lead to data loss if not done correctly. First, check if the StatefulSet exists:
kubectl get statefulset -n platform
If it doesn't exist, you might need to reapply your SigNoz Helm chart.
• Ensure that the Persistent Volumes for ClickHouse are still intact:
kubectl get pvc -n platform
kubectl get pv
• Try restarting the ClickHouse operator:
kubectl rollout restart deployment apm-clickhouse-operator -n platform
• Ensure all other SigNoz components are running correctly. The query-service pod seems to be in a running state but not ready (0/1). Check its logs:
kubectl logs -n platform apm-signoz-query-service-0
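To spot pods like this quickly, the READY column of the pod listing can be filtered. A rough sketch, assuming the default kubectl get pods column layout (NAME, READY, STATUS, ...): not_ready is a hypothetical helper name, and the sample rows are copied from the listing earlier in this thread.

```shell
# Print pods whose ready count is below the desired count, skipping finished jobs.
not_ready() {
  awk '$1 != "NAME" && $3 != "Completed" {
         split($2, r, "/")
         if (r[1] != r[2]) print $1, $2, $3
       }'
}

# Sample rows copied from the pod listing above. Against a live cluster,
# pipe the output of: kubectl get pods -n platform
sample='apm-signoz-otel-collector-metrics-f69ff5867-wxgpt   1/1     Running     0             95m
apm-signoz-query-service-0                          0/1     Running     0             7d21h
apm-signoz-schema-migrator-upgrade-dfjwj            0/1     Completed   0             89m
apm-zookeeper-0                                     1/1     Running     0             22d'

printf '%s\n' "$sample" | not_ready
```

This only reports pods that exist, so the missing ClickHouse pod still has to be checked against the StatefulSet as described above.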