# support
Share the full stacktrace.
Defaulted container "signoz-otel-collector" out of: signoz-otel-collector, signoz-otel-collector-migrate-init (init)
{"level":"info","timestamp":"2024-02-16T16:10:25.369Z","logger":"dynamic-config","caller":"opamp/config_manager.go:89","msg":"Added instance id to config file","component":"opamp-server-client","instance_id":"4e26f14f-4e76-4823-b64b-428d325b197e"}
{"level":"info","timestamp":"2024-02-16T16:10:25.370Z","caller":"service/service.go:69","msg":"Starting service"}
{"level":"info","timestamp":"2024-02-16T16:10:25.370Z","caller":"opamp/server_client.go:171","msg":"Waiting for initial remote config","component":"opamp-server-client"}
{"level":"info","timestamp":"2024-02-16T16:10:25.374Z","caller":"opamp/server_client.go:127","msg":"Connected to the server.","component":"opamp-server-client"}
{"level":"info","timestamp":"2024-02-16T16:10:25.382Z","logger":"agent-config-manager","caller":"opamp/config_manager.go:172","msg":"Config has changed, reloading","path":"/var/tmp/collector-config.yaml"}
2024-02-16T16:10:25.395Z	info	service@v0.88.0/telemetry.go:84	Setting up own telemetry...
2024-02-16T16:10:25.396Z	info	service@v0.88.0/telemetry.go:201	Serving Prometheus metrics	{"address": "0.0.0.0:8888", "level": "Basic"}
{"level":"info","timestamp":"2024-02-16T16:10:26.370Z","caller":"service/service.go:73","msg":"Client started successfully"}
{"level":"info","timestamp":"2024-02-16T16:10:26.370Z","caller":"opamp/client.go:49","msg":"Ensuring collector is running","component":"opamp-server-client"}
{"level":"info","timestamp":"2024-02-16T16:11:27.263Z","caller":"signozcollector/main.go:90","msg":"Context done, shutting down..."}
{"level":"info","timestamp":"2024-02-16T16:11:27.263Z","caller":"service/service.go:79","msg":"Shutting down service"}
{"level":"info","timestamp":"2024-02-16T16:11:27.263Z","caller":"opamp/server_client.go:185","msg":"Stopping OpAMP server client","component":"opamp-server-client"}
The above are the OtelCollector pod logs.
This is the error I have been facing since yesterday.
This happens when the collector process receives a SIGINT or SIGTERM signal.
I have increased the readiness and liveness probe values to initialDelaySeconds: 20, periodSeconds: 10, timeoutSeconds: 10, failureThreshold: 8, successThreshold: 1, but the pods are still going into CrashLoopBackOff state.
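For reference, a minimal sketch of what those probe settings look like on the collector container in a plain Kubernetes pod spec. The placement (direct pod spec vs. the SigNoz Helm values keys) and the health endpoint/port are assumptions, not values confirmed in this thread:

```yaml
# Hypothetical container spec fragment for the signoz-otel-collector pod.
# The probe endpoint and port are assumptions; point them at whatever
# health endpoint your collector build exposes (e.g. the health_check
# extension, which defaults to port 13133).
containers:
  - name: signoz-otel-collector
    livenessProbe:
      httpGet:
        path: /
        port: 13133
      initialDelaySeconds: 20
      periodSeconds: 10
      timeoutSeconds: 10
      failureThreshold: 8
      successThreshold: 1
    readinessProbe:
      httpGet:
        path: /
        port: 13133
      initialDelaySeconds: 20
      periodSeconds: 10
      timeoutSeconds: 10
      failureThreshold: 8
      successThreshold: 1
```

Note that if the container is being OOMKilled (as turns out to be the case later in the thread), probe tuning alone will not stop the restarts.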
All the pods sometimes crash suddenly even though resources are available. What could be the reason, @Srikanth Chekuri?
Is this because of the otel-collector image tag? Should I keep the image tag pinned to a fixed version?
What image tag are you using? What are the error logs? It would help if you could share more context.
warn	k8sattributesprocessor@v0.88.0/processor.go:54	k8s.pod.start_time value will be changed to use RFC3339 format in v0.83.0. see https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/24016 for more information. enable feature-gate k8sattr.rfc3339 to opt into this change.	{"kind": "processor", "name": "k8sattributes", "pipeline": "metrics/internal"}

warn	internal@v0.88.0/warning.go:40	Using the 0.0.0.0 address exposes this server to every network interface, which may facilitate Denial of Service attacks	{"kind": "receiver", "name": "jaeger", "data_type": "traces", "documentation": "https://github.com/open-telemetry/opentelemetry-collector/blob/main/docs/security-best-practices.md#safeguards-against-denial-of-service-attacks"}
The above are the warnings I could see in the pod logs.
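If you do want to silence the first warning, the k8sattr.rfc3339 feature gate can be opted into via the collector's command-line flags. A minimal sketch, assuming the SigNoz collector accepts the upstream --feature-gates flag and that you can add extra args to the container (neither is confirmed in this thread):

```yaml
# Hypothetical Deployment fragment: append the feature gate flag to the
# collector container's args. The config path is taken from the logs above;
# where extra args live in the SigNoz Helm values is an assumption.
containers:
  - name: signoz-otel-collector
    args:
      - --config=/var/tmp/collector-config.yaml
      - --feature-gates=k8sattr.rfc3339
```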
The warnings are not harmful. I am not sure why the pods crash; I believe it's specific to your environment.
The pod first goes into OOMKilled state and then into CrashLoopBackOff. Is this because of the memory limit I set for the otel-collector pod?
What memory limit did you set?
Hi @Srikanth Chekuri, the issue was with the memory limit for the otel-collector pod. It's fixed now. Thanks!
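For anyone hitting the same OOMKilled loop: the fix here was raising the collector's memory limit. A minimal sketch of the relevant resources block; the numbers are placeholders, not the values used in this thread, and the exact Helm values path may differ:

```yaml
# Hypothetical resources block for the signoz-otel-collector container.
# Size the numbers from the collector's actual memory usage (for example
# the container_memory_working_set_bytes metric) rather than copying them.
resources:
  requests:
    cpu: 200m
    memory: 512Mi
  limits:
    cpu: "1"
    memory: 1Gi   # raise this if the pod keeps getting OOMKilled
```

Pairing the container limit with the collector's memory_limiter processor, so the collector applies backpressure before the kernel kills it, is a common complement, though it was not discussed in this thread.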