# support
i
Hi Team, I'm trying to use SigNoz for the first time and I'm configuring auto-instrumentation for my Java applications, but unfortunately, when I add the annotations, the services and traces don't show up on the SigNoz page. I tested my collector using the troubleshoot tool and it appears to be working fine, but it never works with my pods. Any tips? PS: I'm already using the sidecar and Instrumentation instance options, running on Kubernetes and trying to trace Kubernetes pods.
p
@Prashant Shahi may have more insights here
@igor estevan jasinski Have you checked this grid on providing otel-collector addresses? https://signoz.io/docs/install/troubleshooting/#signoz-otel-collector-address-grid It might be helpful!
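For reference, a rough sketch of the kinds of endpoint values that grid distinguishes; the release name, namespace, and addresses below are placeholders, not values from this thread:
```
# SigNoz in the same Kubernetes cluster: point at the collector Service DNS name
export OTEL_EXPORTER_OTLP_ENDPOINT="http://my-release-signoz-otel-collector.platform.svc.cluster.local:4317"

# SigNoz in a different cluster (or outside Kubernetes): use a reachable
# LoadBalancer / ingress address rather than a pod or node IP
export OTEL_EXPORTER_OTLP_ENDPOINT="http://<loadbalancer-address>:4317"
```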
p
@igor estevan jasinski are you following the OpenTelemetry Operator guide for the auto-instrumentation?
i
Yes. I actually used the troubleshooting tool to test, from inside the cluster where my applications run, that they can reach the otel-collector, which is set up in another cluster:
@Prashant Shahi I'm probably doing something wrong during the auto-instrumentation phase. I'll share my YAML files; maybe you can help me.
sidecar yaml:
```
apiVersion: opentelemetry.io/v1alpha1
kind: OpenTelemetryCollector
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"opentelemetry.io/v1alpha1","kind":"OpenTelemetryCollector","metadata":{"annotations":{},"name":"signoz-otel-collector-sidecar","namespace":"default"},"spec":{"config":"receivers:\n  otlp:\n    protocols:\n      http:\n      grpc:\nprocessors:\n  batch:\nexporters:\n  logging:\n  otlp:\n    endpoint: 100.94.35.234:4317\n    tls:\n      insecure: true\nservice:\n  pipelines:\n    traces:\n      receivers: [otlp]\n      processors: [batch]\n      exporters: [logging, otlp]\n    metrics:\n      receivers: [otlp]\n      processors: [batch]\n      exporters: [logging, otlp]\n","mode":"sidecar"}}
  creationTimestamp: "2023-03-15T12:50:40Z"
  generation: 3
  name: signoz-otel-collector-sidecar
  namespace: default
  resourceVersion: "850306767"
  uid: c6a30f0e-1a88-4628-a6ac-c57400ff20c9
spec:
  config: |
    receivers:
      otlp:
        protocols:
          http:
          grpc:
    processors:
      batch:
    exporters:
      logging:
      otlp:
        endpoint: 100.94.35.234:4317
        tls:
          insecure: true
    service:
      pipelines:
        traces:
          receivers: [otlp]
          processors: [batch]
          exporters: [logging, otlp]
        metrics:
          receivers: [otlp]
          processors: [batch]
          exporters: [logging, otlp]
  mode: sidecar
```
instrumentation yaml:
```
apiVersion: opentelemetry.io/v1alpha1
kind: Instrumentation
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"opentelemetry.io/v1alpha1","kind":"Instrumentation","metadata":{"annotations":{},"name":"signoz-otel-collector-instrumentation","namespace":"default"},"spec":{"exporter":{"endpoint":"https://signoz-otel-collector.dev.sicredi.cloud:4317"},"java":{"image":"ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-java:latest"},"nodejs":{"image":"ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-nodejs:latest"},"propagators":["tracecontext","baggage","b3"],"python":{"image":"ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-python:latest"},"sampler":{"argument":"0.25","type":"parentbased_traceidratio"}}}
  creationTimestamp: "2023-03-15T01:54:20Z"
  generation: 4
  name: signoz-otel-collector-instrumentation
  namespace: default
  resourceVersion: "850306883"
  uid: 2ca19a50-dcea-4c99-97fb-4f60cef3ba03
spec:
  exporter:
    endpoint: 100.94.35.234:4317
  java:
    image: ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-java:latest
  nodejs:
    image: ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-nodejs:latest
  propagators:
    - tracecontext
    - baggage
    - b3
  python:
    image: ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-python:latest
  sampler:
    argument: "0.25"
    type: parentbased_traceidratio
```
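For reference, a quick way to confirm the operator registered both custom resources after applying the YAML above (file names and namespace are placeholders):
```
# Apply both custom resources and list them; the names should match the YAML above
kubectl apply -f sidecar.yaml -f instrumentation.yaml
kubectl -n default get opentelemetrycollectors,instrumentations
```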
application pod deployment yaml:
```
apiVersion: apps/v1
kind: Deployment
metadata:
  annotations:
    deployment.kubernetes.io/revision: "23"
  creationTimestamp: "2022-07-04T15:57:14Z"
  generation: 40
  labels:
    app: plataforma-monitoramento-infraphone-dialer
    devconsole: managed
    devconsole_application: plataforma-monitoramento
    devconsole_component: plataforma-monitoramento-infraphone-dialer
  name: plataforma-monitoramento-infraphone-dialer-deployment
  namespace: default
  resourceVersion: "850309601"
  uid: 77e93b40-e95e-4e60-b065-641f516f664a
spec:
  progressDeadlineSeconds: 220
  replicas: 1
  revisionHistoryLimit: 1
  selector:
    matchLabels:
      app: plataforma-monitoramento-infraphone-dialer
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      annotations:
        instrumentation.opentelemetry.io/inject-java: "true"
        sidecar.opentelemetry.io/inject: "true"
      creationTimestamp: null
      labels:
        app: plataforma-monitoramento-infraphone-dialer
        date: "1668455468726"
```
p
can you check the logs of the otel sidecar?
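A quick way to do that, assuming the operator injected the sidecar as `otc-container`; the pod name and namespace below are placeholders:
```
# The instrumented pod should list an extra container once the sidecar is injected
kubectl -n default get pod <your-app-pod> -o jsonpath='{.spec.containers[*].name}'

# Tail the injected collector's logs
kubectl -n default logs <your-app-pod> -c otc-container -f
```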
> endpoint: 100.94.35.234:4317
I don't think the endpoint should point directly to a private internal IP like that.
If the internal IP of the Pod/Service changes, it will stop working.
Use the service name instead if SigNoz is in the same cluster.
@igor estevan jasinski also make sure the libraries/frameworks used by your application are supported: https://github.com/open-telemetry/opentelemetry-java-instrumentation/blob/main/docs/supported-libraries.md
i
I will check the libraries, but I'm using a static IP to connect, and the troubleshoot tool shows that the collector is reachable. Any other tips on how to set up the sidecar and the annotations? PS: I have SigNoz installed in one cluster, and the pods with the applications I want to monitor are in another cluster.
Looking at my otel-collector pod, the only log I found is the one below; nothing related to my application pods:
[screenshot attached: image.png]
p
> I have SigNoz installed in one cluster, and the pods with the applications I want to monitor are in another cluster
I do not think an internal static private IP can be used across clusters unless you have configured some internal solution to enable it. In any case, make sure the endpoint you pass (static IP or public load balancer endpoint) is reachable across clusters by running `troubleshoot` or `telemetrygen` from the application cluster.
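A sketch of running that check from the application cluster, roughly following the SigNoz troubleshooting docs; treat the exact image name and flags as assumptions and verify them against the docs linked earlier:
```
kubectl -n default run troubleshoot --image=signoz/troubleshoot \
  --restart='Never' -i --tty --rm --command -- \
  ./troubleshoot checkEndpoint --endpoint=100.94.35.234:4317
```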
i
@Prashant Shahi That IP is an external IP from the load balancer for the otel-collector service, and I already used the troubleshoot tool from the application cluster; it reaches the otel-collector.
I think my problem is in the sidecar/Instrumentation configuration or the pod annotation part.
p
I see. Let me verify the configuration you shared.
i
thank you
@Prashant Shahi I also tried creating the petclinic test application, but I got the same error.
@Prashant Shahi If you have any tip or any other way to test, it would be very helpful, because SigNoz looks amazing.
@Prashant Shahi Were you able to take a look at the configuration I sent you?
p
Hey @igor estevan jasinski! I was able to see the pet clinic app in SigNoz without any issue using the OpenTelemetry Operator.
Also, check the logs of the SigNoz otel-collector, as well as those of the `otc-container` container inside the instrumented pod; you should see logs like this:
```
2023-04-03T16:25:14.770Z	info	MetricsExporter	{"kind": "exporter", "data_type": "metrics", "name": "logging", "#metrics": 78}
2023-04-03T16:26:42.474Z	info	TracesExporter	{"kind": "exporter", "data_type": "traces", "name": "logging", "#spans": 5}
2023-04-03T16:26:47.529Z	info	TracesExporter	{"kind": "exporter", "data_type": "traces", "name": "logging", "#spans": 20}
2023-04-03T16:29:14.866Z	info	MetricsExporter	{"kind": "exporter", "data_type": "metrics", "name": "logging", "#metrics": 83}
```
> endpoint: 100.94.35.234:4317
Can you try to use this public endpoint with `telemetrygen` instead? Afterwards, verify it in the SigNoz UI:
```
telemetrygen traces --traces 1 --otlp-endpoint 100.94.35.234:4317 --otlp-insecure
```
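If `telemetrygen` isn't installed on the machine you're testing from, it can be built with a Go toolchain from the upstream collector-contrib module (version pinning is up to you):
```
go install github.com/open-telemetry/opentelemetry-collector-contrib/cmd/telemetrygen@latest
```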
i
Here are a couple of log lines from the otel-collector:
```
/go/pkg/mod/go.opentelemetry.io/collector/processor/batchprocessor@v0.66.0/batch_processor.go:144
2023-04-03T17:40:06.880Z	warn	batchprocessor@v0.66.0/batch_processor.go:178	Sender failed	{"kind": "processor", "name": "batch", "pipeline": "logs", "error": "sending_queue is full"}
2023-04-03T17:40:07.881Z	warn	batchprocessor@v0.66.0/batch_processor.go:178	Sender failed	{"kind": "processor", "name": "batch", "pipeline": "logs", "error": "sending_queue is full"}
2023-04-03T17:40:08.178Z	info	exporterhelper/queued_retry.go:426	Exporting failed. Will retry the request after interval.	{"kind": "exporter", "data_type": "metrics", "name": "clickhousemetricswrite", "error": "code: 243, message: Cannot reserve 1.00 MiB, not enough space", "interval": "137.169576ms"}
2023-04-03T17:40:08.882Z	warn	batchprocessor@v0.66.0/batch_processor.go:178	Sender failed	{"kind": "processor", "name": "batch", "pipeline": "logs", "error": "sending_queue is full"}
2023-04-03T17:40:09.883Z	warn	batchprocessor@v0.66.0/batch_processor.go:178	Sender failed	{"kind": "processor", "name": "batch", "pipeline": "logs", "error": "sending_queue is full"}
2023-04-03T17:40:10.883Z	warn	batchprocessor@v0.66.0/batch_processor.go:178	Sender failed	{"kind": "processor", "name": "batch", "pipeline": "logs", "error": "sending_queue is full"}
2023-04-03T17:40:11.884Z	warn	batchprocessor@v0.66.0/batch_processor.go:178	Sender failed	{"kind": "processor", "name": "batch", "pipeline": "logs", "error": "sending_queue is full"}
2023-04-03T17:40:12.886Z	warn	batchprocessor@v0.66.0/batch_processor.go:178	Sender failed	{"kind": "processor", "name": "batch", "pipeline": "logs", "error": "sending_queue is full"}
2023-04-03T17:40:13.887Z	warn	batchprocessor@v0.66.0/batch_processor.go:178	Sender failed	{"kind": "processor", "name": "batch", "pipeline": "logs", "error": "sending_queue is full"}
2023-04-03T17:40:14.887Z	warn	batchprocessor@v0.66.0/batch_processor.go:178	Sender failed	{"kind": "processor", "name": "batch", "pipeline": "logs", "error": "sending_queue is full"}
2023-04-03T17:40:15.889Z	warn	batchprocessor@v0.66.0/batch_processor.go:178	Sender failed	{"kind": "processor", "name": "batch", "pipeline": "logs", "error": "sending_queue is full"}
2023-04-03T17:40:16.890Z	error	exporterhelper/queued_retry.go:310	Dropping data because sending_queue is full. Try increasing queue_size.	{"kind": "exporter", "data_type": "logs", "name": "clickhouselogsexporter", "dropped_items": 1144}
go.opentelemetry.io/collector/exporter/exporterhelper.(*queuedRetrySender).send
	/go/pkg/mod/go.opentelemetry.io/collector@v0.66.0/exporter/exporterhelper/queued_retry.go:310
go.opentelemetry.io/collector/exporter/exporterhelper.NewLogsExporter.func2
	/go/pkg/mod/go.opentelemetry.io/collector@v0.66.0/exporter/exporterhelper/logs.go:114
go.opentelemetry.io/collector/consumer.ConsumeLogsFunc.ConsumeLogs
	/go/pkg/mod/go.opentelemetry.io/collector/consumer@v0.66.0/logs.go:36
go.opentelemetry.io/collector/processor/batchprocessor.(*batchLogs).export
	/go/pkg/mod/go.opentelemetry.io/collector/processor/batchprocessor@v0.66.0/batch_processor.go:339
go.opentelemetry.io/collector/processor/batchprocessor.(*batchProcessor).sendItems
	/go/pkg/mod/go.opentelemetry.io/collector/processor/batchprocessor@v0.66.0/batch_processor.go:176
go.opentelemetry.io/collector/processor/batchprocessor.(*batchProcessor).startProcessingCycle
	/go/pkg/mod/go.opentelemetry.io/collector/processor/batchprocessor@v0.66.0/batch_processor.go:144
2023-04-03T17:40:16.890Z	warn	batchprocessor@v0.66.0/batch_processor.go:178	Sender failed	{"kind": "processor", "name": "batch", "pipeline": "logs", "error": "sending_queue is full"}
2023-04-03T17:40:17.891Z	warn	batchprocessor@v0.66.0/batch_processor.go:178	Sender failed	{"kind": "processor", "name": "batch", "pipeline": "logs", "error": "sending_queue is full"}
2023-04-03T17:40:18.226Z	info	exporterhelper/queued_retry.go:426	Exporting failed. Will retry the request after interval.	{"kind": "exporter", "data_type": "metrics", "name": "clickhousemetricswrite", "error": "code: 243, message: Cannot reserve 1.00 MiB, not enough space", "interval": "125.471225ms"}
2023-04-03T17:40:18.892Z	warn	batchprocessor@v0.66.0/batch_processor.go:178	Sender failed	{"kind": "processor", "name": "batch", "pipeline": "logs", "error": "sending_queue is full"}
2023-04-03T17:40:19.892Z	warn	batchprocessor@v0.66.0/batch_processor.go:178	Sender failed	{"kind": "processor", "name": "batch", "pipeline": "logs", "error": "sending_queue is full"}
2023-04-03T17:40:20.893Z	warn	batchprocessor@v0.66.0/batch_processor.go:178	Sender failed	{"kind": "processor", "name": "batch", "pipeline": "logs", "error": "sending_queue is full"}
2023-04-03T17:40:21.894Z	warn	batchprocessor@v0.66.0/batch_processor.go:178	Sender failed	{"kind": "processor", "name": "batch", "pipeline": "logs", "error": "sending_queue is full"}
2023-04-03T17:40:22.895Z	warn	batchprocessor@v0.66.0/batch_processor.go:178	Sender failed	{"kind": "processor", "name": "batch", "pipeline": "logs", "error": "sending_queue is full"}
2023-04-03T17:40:23.896Z	warn	batchprocessor@v0.66.0/batch_processor.go:178	Sender failed	{"kind": "processor", "name": "batch", "pipeline": "logs", "error": "sending_queue is full"}
2023-04-03T17:40:24.445Z	error	exporterhelper/queued_retry.go:310	Dropping data because sending_queue is full. Try increasing queue_size.	{"kind": "exporter", "data_type": "metrics", "name": "clickhousemetricswrite", "dropped_items": 442}
```
p
```
2023-04-03T17:40:18.226Z	info	exporterhelper/queued_retry.go:426	Exporting failed. Will retry the request after interval.	{"kind": "exporter", "data_type": "metrics", "name": "clickhousemetricswrite", "error": "code: 243, message: Cannot reserve 1.00 MiB, not enough space", "interval": "125.471225ms"}
```
@igor estevan jasinski You might want to increase the ClickHouse PVC size. It seems to be a disk-full issue.
Later, you can reduce the TTL from the SigNoz UI settings page to an appropriate duration so this doesn't happen again.
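A minimal sketch of increasing the ClickHouse volume size via the Helm chart, assuming SigNoz was installed with the official chart and the storage class supports volume expansion; the release name, namespace, and values key are assumptions, so check `helm show values signoz/signoz` first:
```
# Bump the ClickHouse PVC size (values key assumed; verify against the chart's values)
helm -n platform upgrade my-release signoz/signoz \
  --set clickhouse.persistence.size=100Gi
```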