# support
e
Hey! I'm currently trying SigNoz for my k8s cluster. I have the problem that the alertmanager pod is stuck in Pending. What do I need to do to fix this?
s
What does the `describe pod` output show?
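For anyone following along, a minimal sketch of the commands that surface this; the pod name and namespace are placeholders here:

```sh
# Describe the Pending pod; the Events section at the bottom usually explains
# why scheduling is blocked (volumes, node selectors, resources, ...)
kubectl describe pod <alertmanager-pod-name> -n <namespace>

# Alternatively, list the events filtered to that pod
kubectl get events -n <namespace> --field-selector involvedObject.name=<alertmanager-pod-name>
```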
e
oh... Now there's an error message:
```
Warning  FailedScheduling  2m8s  default-scheduler  running PreBind plugin "VolumeBinding": binding volumes: timed out waiting for the condition
-------
Normal   WaitForFirstConsumer  60m                  persistentvolume-controller  waiting for first consumer to be created before binding
Warning  ProvisioningFailed    35m (x15 over 60m)   cloud.ionos.com_csi-ionoscloud-547ff5c6cf-xf55x_ec448156-856a-40a1-b1d4-0d307d8bc24b  failed to provision volume with StorageClass "ionos-enterprise-hdd": rpc error: code = OutOfRange desc = requested size 104857600 must be between 1073741824 and 4398046511104 bytes
Normal   Provisioning          5m5s (x23 over 60m)  cloud.ionos.com_csi-ionoscloud-547ff5c6cf-xf55x_ec448156-856a-40a1-b1d4-0d307d8bc24b  External provisioner is provisioning volume for claim "platform/storage-signoz-monitoring-alertmanager-0"
Normal   ExternalProvisioning  30s (x242 over 60m)  persistentvolume-controller  waiting for a volume to be created, either by external provisioner "cloud.ionos.com" or manually created by system administrator
```
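To spell out the numbers in that error: the claim asks for 104857600 bytes, i.e. 100 MiB, while the StorageClass only provisions volumes between 1073741824 bytes (1 GiB) and 4398046511104 bytes (4 TiB). A quick way to double-check the conversion with GNU coreutils:

```sh
# Convert the byte counts from the error message into binary (IEC) units
numfmt --to=iec-i 104857600 1073741824 4398046511104
# prints: 100Mi, 1.0Gi, 4.0Ti (one per line)
```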
Do I understand it right that the PVC is too small? 😄
p
@Elias It looks like your storage class limits PVC sizes to between 1 GiB and 4 TiB. Can you increase the PVC size of alertmanager and re-install using `helm upgrade`?
override-values.yaml
```yaml
alertmanager:
  persistence:
    size: 1Gi
```
If you face any issues when using `helm upgrade`, remove the alertmanager statefulset and retry the command.
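A rough sketch of that sequence; the StatefulSet name is taken from the pod description further down, and the release and namespace from the helm command in the later messages:

```sh
# Drop the old StatefulSet so the chart can recreate it with the new volumeClaimTemplate
kubectl delete statefulset signoz-monitoring-alertmanager -n platform

# Re-apply the chart with the size override
helm upgrade signoz-monitoring signoz/signoz -f override-values.yaml --namespace=platform
```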
e
Thanks! I deleted the statefulset and upgraded with `helm upgrade signoz-monitoring signoz/signoz -f override-values.yaml --namespace=platform`.
Alertmanager has still been pending for 8 minutes now. Same message in the pod logs. The pod description isn't showing an error:
```
Name:             signoz-monitoring-alertmanager-0
Namespace:        platform
Priority:         0
Service Account:  signoz-monitoring-alertmanager
Node:             <none>
Labels:           app.kubernetes.io/component=alertmanager
                  app.kubernetes.io/instance=signoz-monitoring
                  app.kubernetes.io/name=signoz
                  controller-revision-hash=signoz-monitoring-alertmanager-6d448ccf6d
                  statefulset.kubernetes.io/pod-name=signoz-monitoring-alertmanager-0
Annotations:      checksum/config: 01ba4719c80b6fe911b091a7c05124b64eeece964e09c058ef8f9805daca546b
Status:           Pending
IP:
IPs:              <none>
Controlled By:    StatefulSet/signoz-monitoring-alertmanager
Init Containers:
  signoz-monitoring-alertmanager-init:
    Image:      docker.io/busybox:1.35
    Port:       <none>
    Host Port:  <none>
    Command:
      sh
      -c
      until wget --spider -q signoz-monitoring-query-service:8080/api/v1/version; do echo -e "waiting for query-service"; sleep 5; done; echo -e "query-service ready, starting alertmanager now";
    Environment:  <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-xfvkb (ro)
Containers:
  signoz-monitoring-alertmanager:
    Image:      docker.io/signoz/alertmanager:0.23.0-0.2
    Port:       9093/TCP
    Host Port:  0/TCP
    Args:
      --storage.path=/alertmanager
      --queryService.url=http://signoz-monitoring-query-service:8085
    Requests:
      cpu:      100m
      memory:   100Mi
    Liveness:   http-get http://:http/ delay=0s timeout=1s period=10s #success=1 #failure=3
    Readiness:  http-get http://:http/ delay=0s timeout=1s period=10s #success=1 #failure=3
    Environment:
      POD_IP:   (v1:status.podIP)
    Mounts:
      /alertmanager from storage (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-xfvkb (ro)
Volumes:
  storage:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  storage-signoz-monitoring-alertmanager-0
    ReadOnly:   false
  kube-api-access-xfvkb:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   Burstable
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:                      <none>
```
Hmm, seems like the PVC is still pending.
```
C:\Users\elias\signoz>kubectl describe pvc --namespace=platform storage-signoz-monitoring-alertmanager-0
Name:          storage-signoz-monitoring-alertmanager-0
Namespace:     platform
StorageClass:  ionos-enterprise-hdd
Status:        Pending
Volume:
Labels:        app.kubernetes.io/component=alertmanager
               app.kubernetes.io/instance=signoz-monitoring
               app.kubernetes.io/name=signoz
Annotations:   volume.beta.kubernetes.io/storage-provisioner: cloud.ionos.com
               volume.kubernetes.io/selected-node: prod-performance-o3niqbe464
               volume.kubernetes.io/storage-provisioner: cloud.ionos.com
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:
Access Modes:
VolumeMode:    Filesystem
Used By:       signoz-monitoring-alertmanager-0
Events:
  Type    Reason                Age                      From                                                                                   Message
  ----    ------                ----                     ----                                                                                   -------
  Normal  Provisioning          7m52s (x63 over 3h33m)   cloud.ionos.com_csi-ionoscloud-547ff5c6cf-xf55x_ec448156-856a-40a1-b1d4-0d307d8bc24b  External provisioner is provisioning volume for claim "platform/storage-signoz-monitoring-alertmanager-0"
  Normal  ExternalProvisioning  3m16s (x842 over 3h33m)  persistentvolume-controller                                                           waiting for a volume to be created, either by external provisioner "cloud.ionos.com" or manually created by system administrator
```
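One likely explanation for why it is still Pending: the StatefulSet controller only creates PVCs that do not exist yet and never modifies existing ones, so this claim, created before the override, presumably still requests the original 104857600 bytes. A quick check, with the claim name and namespace taken from the output above:

```sh
# Print the size the pending claim is actually requesting
kubectl get pvc storage-signoz-monitoring-alertmanager-0 -n platform \
  -o jsonpath='{.spec.resources.requests.storage}'
```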
The other PVCs are working:
```
C:\Users\elias\signoz>kubectl get pvc --namespace=platform
NAME                                                                       STATUS    VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS           AGE
data-signoz-monitoring-zookeeper-0                                         Bound     pvc-b675e319-b64e-4039-bea8-748869aed061   8Gi        RWO            ionos-enterprise-hdd   3h34m
data-volumeclaim-template-chi-signoz-monitoring-clickhouse-cluster-0-0-0   Bound     pvc-4bf90e0e-2ff5-4dd4-888c-473d87822e0f   20Gi       RWO            ionos-enterprise-hdd   3h34m
signoz-db-signoz-monitoring-query-service-0                                Bound     pvc-ca1049e9-de4d-4dbd-a9a3-677f4338e02d   1Gi        RWO            ionos-enterprise-hdd   3h34m
storage-signoz-monitoring-alertmanager-0                                   Pending                                                                        ionos-enterprise-hdd   3h34m
```
Deleted the PVC and the statefulset and upgraded again. Now it's working. Thanks for your quick help!
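For anyone hitting the same issue, the sequence that resolved it was roughly the following, using the same names as above; deleting the StatefulSet first lets the pvc-protection finalizer release the claim once the pod is gone:

```sh
# Remove the pod's owner first, then the under-sized claim
kubectl delete statefulset signoz-monitoring-alertmanager -n platform
kubectl delete pvc storage-signoz-monitoring-alertmanager-0 -n platform

# Re-apply the chart; the recreated StatefulSet provisions a fresh PVC with the 1Gi request
helm upgrade signoz-monitoring signoz/signoz -f override-values.yaml --namespace=platform
```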
p
That's great to hear 👍