Slackbot
07/19/2022, 2:10 PMAnkit Nayan
Ankit Nayan
Kaouther Abrougui
07/19/2022, 7:15 PMAnkit Nayan
Ankit Nayan
Kaouther Abrougui
07/20/2022, 1:33 PMAnkit Nayan
Ankit Nayan
Kaouther Abrougui
07/20/2022, 2:03 PMKaouther Abrougui
07/20/2022, 2:04 PMKaouther Abrougui
07/20/2022, 2:04 PMKaouther Abrougui
07/20/2022, 2:06 PMKaouther Abrougui
07/20/2022, 2:07 PMKaouther Abrougui
07/20/2022, 2:15 PMKaouther Abrougui
07/20/2022, 2:45 PMAnkit Nayan
Kaouther Abrougui
07/20/2022, 3:12 PMKaouther Abrougui
07/20/2022, 3:13 PMAnkit Nayan
Ankit Nayan
Ankit Nayan
Kaouther Abrougui
07/20/2022, 3:17 PMKaouther Abrougui
07/20/2022, 3:19 PMAnkit Nayan
Kaouther Abrougui
07/20/2022, 3:20 PMAnkit Nayan
are the logs persisted somewhere?
Not for K8s, IMO. @Prashant Shahi is this correct?
Prashant Shahi
07/20/2022, 3:29 PMPrashant Shahi
07/20/2022, 3:30 PM
OTEL_COLLECTOR_POD=$(kubectl get pods -n platform -o jsonpath='{..metadata.name}' -l "app.kubernetes.io/component=otel-collector")
kubectl logs -n platform $OTEL_COLLECTOR_POD
previous logs:
kubectl logs -n platform $OTEL_COLLECTOR_POD --previous
Ankit Nayan
Kaouther Abrougui
07/20/2022, 3:32 PMKaouther Abrougui
07/20/2022, 3:32 PMPrashant Shahi
07/20/2022, 3:32 PM
kubectl get events -n platform
Prashant Shahi
07/20/2022, 3:33 PMKaouther Abrougui
07/20/2022, 3:33 PM
LAST SEEN   TYPE      REASON      OBJECT                                               MESSAGE
40m         Normal    Pulling     pod/signoz-release-otel-collector-678f68755c-fnj24   Pulling image "docker.io/signoz/otelcontribcol:0.45.1-1.1"
40m         Warning   BackOff     pod/signoz-release-otel-collector-678f68755c-fnj24   Back-off restarting failed container
49m         Normal    Pulled      pod/signoz-release-otel-collector-678f68755c-fnj24   Successfully pulled image "docker.io/signoz/otelcontribcol:0.45.1-1.1" in 671.012383ms
30m         Normal    Created     pod/signoz-release-query-service-0                   Created container signoz-release-query-service
30m         Normal    Started     pod/signoz-release-query-service-0                   Started container signoz-release-query-service
30m         Normal    Pulled      pod/signoz-release-query-service-0                   Container image "docker.io/signoz/query-service:0.10.0" already present on machine
30m         Warning   BackOff     pod/signoz-release-query-service-0                   Back-off restarting failed container
30m         Warning   Unhealthy   pod/signoz-release-query-service-0                   Readiness probe failed: Get "http://10.16.169.30:8080/api/v1/version": dial tcp 10.16.169.30:8080: connect: connection refused
Prashant Shahi
07/20/2022, 3:35 PMPrashant Shahi
07/20/2022, 3:35 PM
kubectl describe -n platform pod/$OTEL_COLLECTOR_POD
Kaouther Abrougui
07/20/2022, 3:37 PM
kubectl describe -n platform pod/$OTEL_COLLECTOR_POD
W0720 16:35:56.616742 68666 gcp.go:120] WARNING: the gcp auth plugin is deprecated in v1.22+, unavailable in v1.25+; use gcloud instead.
To learn more, consult https://cloud.google.com/blog/products/containers-kubernetes/kubectl-auth-changes-in-gke
Name: signoz-release-otel-collector-678f68755c-fnj24
Namespace: platform
Priority: 0
Node: xxx
Start Time: Wed, 20 Jul 2022 13:10:00 +0100
Labels: app.kubernetes.io/component=otel-collector
app.kubernetes.io/instance=signoz-release
app.kubernetes.io/name=signoz
pod-template-hash=678f68755c
Annotations: checksum/config: 7511037609a6822915f6adc83937e8de5da3aceca3ef20b4f83f4cba72d2eaf5
Status: Running
IP: 10.16.1.79
IPs:
IP: 10.16.1.79
Controlled By: ReplicaSet/signoz-release-otel-collector-678f68755c
Init Containers:
signoz-release-otel-collector-init:
Container ID: containerd://1f1c64a60323017a3b84f6bc576e2de21481a7d23d5aa737bb0a2534eacdc22d
Image: docker.io/busybox:1.35
Image ID: docker.io/library/busybox@sha256:8c40df61d40166f5791f44b3d90b77b4c7f59ed39a992fd9046886d3126ffa68
Port: <none>
Host Port: <none>
Command:
sh
-c
until wget --spider -q signoz-release-clickhouse:8123/ping; do echo -e "waiting for clickhouseDB"; sleep 5; done; echo -e "clickhouse ready, starting otel collector now";
State: Terminated
Reason: Completed
Exit Code: 0
Started: Wed, 20 Jul 2022 13:10:03 +0100
Finished: Wed, 20 Jul 2022 13:11:12 +0100
Ready: True
Restart Count: 0
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-5jzhr (ro)
Containers:
signoz-release-otel-collector:
Container ID: containerd://20b6eb0471ab94c2049f66c3d49b0091b0c9798c0cd36265b4b051dc77cbe655
Image: docker.io/signoz/otelcontribcol:0.45.1-1.1
Image ID: docker.io/signoz/otelcontribcol@sha256:f3378be7a69b38ebb03c4cfa941fa35715f927e374519b7051f525a6c5a020c3
Port: <none>
Host Port: <none>
Command:
/otelcontribcol
--config=/conf/otel-collector-config.yaml
State: Running
Started: Wed, 20 Jul 2022 15:52:18 +0100
Last State: Terminated
Reason: OOMKilled
Exit Code: 137
Started: Wed, 20 Jul 2022 15:43:51 +0100
Finished: Wed, 20 Jul 2022 15:52:04 +0100
Ready: True
Restart Count: 16
Limits:
cpu: 1
memory: 2Gi
Requests:
cpu: 200m
memory: 400Mi
Environment:
CLICKHOUSE_HOST: signoz-release-clickhouse
CLICKHOUSE_PORT: 9000
CLICKHOUSE_HTTP_PORT: 8123
CLICKHOUSE_CLUSTER: cluster
CLICKHOUSE_DATABASE: signoz_metrics
CLICKHOUSE_TRACE_DATABASE: signoz_traces
CLICKHOUSE_USER: admin
CLICKHOUSE_PASSWORD: 27ff0399-0d3a-4bd8-919d-17c2181e6fb9
CLICKHOUSE_SECURE: false
CLICKHOUSE_VERIFY: false
Mounts:
/conf from otel-collector-config-vol (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-5jzhr (ro)
Conditions:
Type Status
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes:
otel-collector-config-vol:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: signoz-release-otel-collector
Optional: false
kube-api-access-5jzhr:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: Burstable
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type     Reason   Age                    From     Message
----     ------   ----                   ----     -------
Normal   Pulled   52m                    kubelet  Successfully pulled image "docker.io/signoz/otelcontribcol:0.45.1-1.1" in 671.012383ms
Warning  BackOff  43m (x215 over 148m)   kubelet  Back-off restarting failed container
Normal   Pulling  43m (x17 over 3h24m)   kubelet  Pulling image "docker.io/signoz/otelcontribcol:0.45.1-1.1"
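The describe output above reports `Reason: OOMKilled` together with `Exit Code: 137`. By shell convention an exit code above 128 means the process was ended by a signal (code minus 128), so 137 corresponds to signal 9 (SIGKILL), which is what the kernel OOM killer sends. A minimal sketch of that arithmetic; the helper name `signal_from_exit_code` is made up for illustration:

```shell
# Hypothetical helper: decode a container exit code into the number of the
# signal that terminated it (convention: exit code = 128 + signal number).
signal_from_exit_code() {
  code=$1
  if [ "$code" -gt 128 ]; then
    echo $((code - 128))
  else
    echo 0   # 0 here means "not terminated by a signal"
  fi
}

signal_from_exit_code 137   # 137 - 128 = 9, i.e. SIGKILL
```

Against a live cluster, the same reason should be readable without scanning the full describe output via `kubectl get pod ... -o jsonpath='{.status.containerStatuses[*].lastState.terminated.reason}'`.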
Prashant Shahi
07/20/2022, 3:38 PM
Last State: Terminated
Reason: OOMKilled
Exit Code: 137
Started: Wed, 20 Jul 2022 15:43:51 +0100
Finished: Wed, 20 Jul 2022 15:52:04 +0100
cc @Ankit Nayan @Srikanth Chekuri
Prashant Shahi
07/20/2022, 3:39 PMKaouther Abrougui
07/20/2022, 3:39 PMAnkit Nayan
Kaouther Abrougui
07/20/2022, 3:40 PMAnkit Nayan
Prashant Shahi
07/20/2022, 3:45 PMPrashant Shahi
07/20/2022, 3:45 PM
override-values.yml is recommended.
Kaouther Abrougui
07/20/2022, 3:46 PMPrashant Shahi
07/20/2022, 3:46 PM
helm upgrade commands would overwrite back to defaults
Ankit Nayan
Ankit Nayan
kubectl describe ... would help identify the issue with the query-service pod
Kaouther Abrougui
07/20/2022, 3:52 PMKaouther Abrougui
07/20/2022, 3:53 PM
Containers:
signoz-release-query-service:
Container ID: containerd://84efdf05080a6e07b646a34bc6ef204bd19f924bff2711674a1e6c3de2a62de6
Image: docker.io/signoz/query-service:0.10.0
Image ID: docker.io/signoz/query-service@sha256:1cbf6d2e0b55f1a2a7e8bb0f9b199c438198340096879511e57e4d9f8edf8cf8
Port: 8080/TCP
Host Port: 0/TCP
Args:
-config=/root/config/prometheus.yml
State: Running
Started: Wed, 20 Jul 2022 16:02:34 +0100
Last State: Terminated
Reason: OOMKilled
Exit Code: 137
Started: Wed, 20 Jul 2022 15:56:53 +0100
Finished: Wed, 20 Jul 2022 16:02:06 +0100
Kaouther Abrougui
07/20/2022, 3:57 PMPrashant Shahi
07/20/2022, 4:04 PMAnkit Nayan
Kaouther Abrougui
07/20/2022, 4:10 PMKaouther Abrougui
07/20/2022, 4:10 PMKaouther Abrougui
07/20/2022, 5:00 PM
Limits:
cpu: 2
memory: 4Gi
Requests:
cpu: 1
memory: 2Gi
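Limits and requests like the ones above can be kept in an override file so that a later `helm upgrade` does not reset them to chart defaults. A minimal sketch, assuming these values belong to the otel-collector and that the chart reads them from `otelCollector.resources` (worth checking against the chart's values.yaml); the file name is only a default:

```shell
# Write a values override carrying the limits/requests shown above.
# Assumption: the SigNoz chart exposes them under otelCollector.resources.
OVERRIDE_FILE="${OVERRIDE_FILE:-override-values.yaml}"

cat > "$OVERRIDE_FILE" <<'EOF'
otelCollector:
  resources:
    limits:
      cpu: 2
      memory: 4Gi
    requests:
      cpu: 1
      memory: 2Gi
EOF

echo "wrote $OVERRIDE_FILE"
```

The file would then be passed with `-f` on each `helm upgrade`, so the settings travel with the release rather than living in a one-off edit.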
Kaouther Abrougui
07/20/2022, 5:00 PMKaouther Abrougui
07/20/2022, 5:26 PMAnkit Nayan
Kaouther Abrougui
07/20/2022, 5:32 PMAnkit Nayan
Kaouther Abrougui
07/20/2022, 5:36 PMAnkit Nayan
Kaouther Abrougui
07/20/2022, 5:39 PMKaouther Abrougui
07/20/2022, 5:39 PMKaouther Abrougui
07/20/2022, 5:40 PM
Events:
Type     Reason               Age                    From                Message
----     ------               ----                   ----                -------
Warning  FailedScheduling     32m                    default-scheduler   0/151 nodes are available: 146 node(s) had taint {nvidia.com/gpu: present}, that the pod didn't tolerate, 3 Insufficient memory, 5 Insufficient cpu.
Normal   NotTriggerScaleUp    32m                    cluster-autoscaler  pod didn't trigger scale-up: 3 node(s) had taint {nvidia.com/gpu: present}, that the pod didn't tolerate, 1 Insufficient cpu, 1 Insufficient memory, 1 max node group size reached
Normal   Scheduled            32m                    default-scheduler   Successfully assigned platform/signoz-release-otel-collector-dc4fb6bff-rz4l8 to gke-benchmarks-clust-pool-e2-standard-9a225627-76rw
Normal   Pulling              32m                    kubelet             Pulling image "docker.io/busybox:1.35"
Normal   Pulled               32m                    kubelet             Successfully pulled image "docker.io/busybox:1.35" in 8.72024562s
Normal   Created              32m                    kubelet             Created container signoz-release-otel-collector-init
Normal   Started              32m                    kubelet             Started container signoz-release-otel-collector-init
Normal   Pulled               31m                    kubelet             Successfully pulled image "docker.io/signoz/otelcontribcol:0.45.1-1.1" in 6.005665963s
Normal   Pulling              14m (x2 over 32m)      kubelet             Pulling image "docker.io/signoz/otelcontribcol:0.45.1-1.1"
Normal   Pulled               14m                    kubelet             Successfully pulled image "docker.io/signoz/otelcontribcol:0.45.1-1.1" in 524.349113ms
Normal   Created              14m (x2 over 31m)      kubelet             Created container signoz-release-otel-collector
Normal   Started              14m (x2 over 31m)      kubelet             Started container signoz-release-otel-collector
Warning  Evicted              2m42s                  kubelet             The node was low on resource: memory. Container signoz-release-otel-collector was using 5119956Ki, which exceeds its request of 2Gi.
Normal   Killing              2m41s                  kubelet             Stopping container signoz-release-otel-collector
Warning  Evicted              2m36s                  kubelet             The node was low on resource: memory. Container signoz-release-otel-collector was using 5365200Ki, which exceeds its request of 2Gi.
Warning  ExceededGracePeriod  2m26s (x2 over 2m32s)  kubelet             Container runtime did not kill the pod within specified grace period.
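The eviction events above report usage in Ki while the request is stated in Gi, which makes the gap hard to see at a glance. A small sketch of the conversion, with a helper name (`ki_to_gib`) invented for illustration:

```shell
# Convert a kubelet-reported Ki figure to GiB (1 GiB = 1024 * 1024 Ki).
ki_to_gib() {
  awk -v ki="$1" 'BEGIN { printf "%.2f\n", ki / (1024 * 1024) }'
}

ki_to_gib 5119956   # usage in the first Evicted event  -> 4.88
ki_to_gib 5365200   # usage in the second Evicted event -> 5.12
```

Both figures are well over twice the 2Gi request quoted in the events, which is consistent with the kubelet picking this container when the node ran low on memory.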
Ankit Nayan
Ankit Nayan
Kaouther Abrougui
07/20/2022, 5:43 PMAnkit Nayan
Kaouther Abrougui
07/20/2022, 5:45 PMAnkit Nayan
Kaouther Abrougui
07/20/2022, 5:47 PMAnkit Nayan
Ankit Nayan
no they are not rare, that is the standard.
And how many such traces are produced in an hour or a day? If such traces were produced at, say, 10K traces/s, each with 150K spans, that would be 1.5B spans/s of ingestion.
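The ingestion estimate in the message above is a straight multiplication. A quick sketch with the same illustrative figures (they are the hypothetical numbers from the message, not measurements from this cluster):

```shell
# Illustrative figures quoted in the message above, not measured values.
TRACES_PER_SEC=10000      # 10K traces/s
SPANS_PER_TRACE=150000    # 150K spans per trace

SPANS_PER_SEC=$((TRACES_PER_SEC * SPANS_PER_TRACE))
echo "$SPANS_PER_SEC spans/s"   # 1500000000, i.e. 1.5B spans/s
```

Plugging in the real trace rate in place of the 10K placeholder gives the number the collector and ClickHouse would actually have to absorb.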
Kaouther Abrougui
07/20/2022, 5:51 PMKaouther Abrougui
07/20/2022, 5:52 PMAnkit Nayan
Kaouther Abrougui
07/20/2022, 5:59 PMAnkit Nayan
processors: [batch]
https://github.com/SigNoz/charts/blob/main/charts/signoz/values.yaml#L881
and removing section
https://github.com/SigNoz/charts/blob/main/charts/signoz/values.yaml#L883-L889
Ankit Nayan
So 2 aspects here: first getting all spans of traces and not losing any, and second being able to visualize traces on UI...
correct ✅
Kaouther Abrougui
07/20/2022, 11:27 PMKaouther Abrougui
07/20/2022, 11:31 PMKaouther Abrougui
07/20/2022, 11:31 PMSrikanth Chekuri
07/21/2022, 2:35 AMKaouther Abrougui
07/21/2022, 7:44 AMKaouther Abrougui
07/21/2022, 7:48 AMKaouther Abrougui
07/21/2022, 7:56 AMKaouther Abrougui
07/21/2022, 8:02 AMKaouther Abrougui
07/21/2022, 8:14 AMSrikanth Chekuri
07/21/2022, 8:53 AMKaouther Abrougui
07/21/2022, 8:55 AMSrikanth Chekuri
07/21/2022, 8:58 AMKaouther Abrougui
07/21/2022, 9:00 AMKaouther Abrougui
07/21/2022, 9:00 AMSrikanth Chekuri
07/21/2022, 9:01 AMKaouther Abrougui
07/21/2022, 11:13 AMKaouther Abrougui
07/21/2022, 12:09 PMAnkit Nayan
Kaouther Abrougui
07/21/2022, 12:11 PMKaouther Abrougui
07/21/2022, 12:14 PMAnkit Nayan
v0.55.0 otel collector?
Srikanth Chekuri
07/21/2022, 6:34 PM
signoz/signoz-otel-collector:0.55.0-rc.1
Kaouther Abrougui
07/21/2022, 6:35 PMKaouther Abrougui
07/21/2022, 6:56 PM
otelCollector:
  image:
    tag: 0.55.0-rc.1
otelCollectorMetrics:
  image:
    tag: 0.55.0-rc.1
Srikanth Chekuri
07/21/2022, 6:57 PMKaouther Abrougui
07/21/2022, 6:58 PM
otelCollector:
  image:
    tag: signoz/signoz-otel-collector:0.55.0-rc.1
otelCollectorMetrics:
  image:
    tag: signoz/signoz-otel-collector:0.55.0-rc.1
Kaouther Abrougui
07/21/2022, 6:58 PMSrikanth Chekuri
07/21/2022, 6:59 PM
registry: docker.io
repository: signoz/signoz-otel-collector
tag: 0.55.0-rc.1
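Put together, the three keys above sit under each component's `image` block: the image name goes in `repository` and only the version goes in `tag`. A minimal sketch that writes such an override; the file name is hypothetical, and applying the same image to `otelCollectorMetrics` is an assumption carried over from the earlier snippets in this thread:

```shell
# Hypothetical file name. helm accepts several -f flags, so this can be
# passed alongside an existing override file.
IMAGE_OVERRIDE_FILE="${IMAGE_OVERRIDE_FILE:-image-override.yaml}"

cat > "$IMAGE_OVERRIDE_FILE" <<'EOF'
otelCollector:
  image:
    registry: docker.io
    repository: signoz/signoz-otel-collector
    tag: 0.55.0-rc.1
otelCollectorMetrics:
  image:
    registry: docker.io
    repository: signoz/signoz-otel-collector
    tag: 0.55.0-rc.1
EOF

echo "wrote $IMAGE_OVERRIDE_FILE"
```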
Kaouther Abrougui
07/21/2022, 6:59 PMPrashant Shahi
07/21/2022, 7:11 PMKaouther Abrougui
07/21/2022, 7:12 PM
Events:
Type     Reason     Age                From               Message
----     ------     ----               ----               -------
Normal   Scheduled  2m11s              default-scheduler  Successfully assigned platform/signoz-release-otel-collector-5bfbf89f9-vbh8z to xxx
Normal   Pulled     2m10s              kubelet            Container image "docker.io/busybox:1.35" already present on machine
Normal   Created    2m10s              kubelet            Created container signoz-release-otel-collector-init
Normal   Started    2m10s              kubelet            Started container signoz-release-otel-collector-init
Normal   Pulled     84s                kubelet            Successfully pulled image "docker.io/signoz/signoz-otel-collector:0.55.0-rc.1" in 428.412212ms
Normal   Pulled     83s                kubelet            Successfully pulled image "docker.io/signoz/signoz-otel-collector:0.55.0-rc.1" in 426.96839ms
Normal   Pulled     69s                kubelet            Successfully pulled image "docker.io/signoz/signoz-otel-collector:0.55.0-rc.1" in 530.406348ms
Normal   Pulling    46s (x4 over 84s)  kubelet            Pulling image "docker.io/signoz/signoz-otel-collector:0.55.0-rc.1"
Normal   Created    45s (x4 over 84s)  kubelet            Created container signoz-release-otel-collector
Warning  Failed     45s (x4 over 83s)  kubelet            Error: failed to create containerd task: failed to create shim: OCI runtime create failed: container_linux.go:380: starting container process caused: exec: "/otelcontribcol": stat /otelcontribcol: no such file or directory: unknown
Normal   Pulled     45s                kubelet            Successfully pulled image "docker.io/signoz/signoz-otel-collector:0.55.0-rc.1" in 444.906797ms
Warning  BackOff    16s (x6 over 82s)  kubelet            Back-off restarting failed container
Kaouther Abrougui
07/21/2022, 7:12 PMKaouther Abrougui
07/21/2022, 7:14 PMPrashant Shahi
07/21/2022, 7:17 PMPrashant Shahi
07/21/2022, 7:18 PM
signoz-collector
Srikanth Chekuri
07/21/2022, 7:19 PM
/otelcontribcol to new name right?
Prashant Shahi
07/21/2022, 7:20 PM
otelcontribcol should be updated to signoz-collector in the following templates of the chart.
https://github.com/SigNoz/charts/blob/main/charts/signoz/templates/otel-collector/deployment.yaml#L39
https://github.com/SigNoz/charts/blob/main/charts/signoz/templates/otel-collector-metrics/deployment.yaml#L39
Prashant Shahi
07/21/2022, 7:20 PMKaouther Abrougui
07/21/2022, 7:23 PMPrashant Shahi
07/21/2022, 8:02 PM
otel-0.55-changes branch.
git clone https://github.com/SigNoz/charts.git && cd charts
git checkout otel-0.55-changes
helm upgrade my-release -n platform -f override-values.yaml charts/signoz
Kaouther Abrougui
07/21/2022, 8:43 PMKaouther Abrougui
07/21/2022, 8:48 PMPrashant Shahi
07/21/2022, 8:52 PMKaouther Abrougui
08/17/2022, 10:56 PM