# support
i
Hi Team, I'm trying to use SigNoz for the first time and I'm configuring auto-instrumentation for my Java applications, but unfortunately, when I add the annotations, the services and traces don't show up on the SigNoz page. I tested my collector using the troubleshoot tool and it appears to be working fine, but it never works with my pods. Any tips? PS: I'm already using the sidecar and Instrumentation instance options, running on Kubernetes and trying to trace Kubernetes pods.
p
@Prashant Shahi may have more insights here
@igor estevan jasinski Have you checked this grid on providing otel-collector addresses? https://signoz.io/docs/install/troubleshooting/#signoz-otel-collector-address-grid It might be helpful!
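For reference, a rough sketch of the kinds of endpoint values that grid distinguishes; the release name, namespace, and addresses below are placeholders, not values from this thread:
```
# SigNoz in the same Kubernetes cluster: point at the collector Service DNS name
export OTEL_EXPORTER_OTLP_ENDPOINT="http://my-release-signoz-otel-collector.platform.svc.cluster.local:4317"

# SigNoz in a different cluster (or outside Kubernetes): use a reachable
# LoadBalancer / ingress address rather than a pod or node IP
export OTEL_EXPORTER_OTLP_ENDPOINT="http://<loadbalancer-address>:4317"
```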
p
@igor estevan jasinski are you following the OpenTelemetry Operator guide for the auto-instrumentation?
i
Yes. I actually used the troubleshooting tool to test, from inside the cluster where my applications run, that they can reach the otel-collector, which is set up in another cluster:
@Prashant Shahi I'm probably doing something wrong during the auto-instrumentation phase. I'll share my YAML files; maybe you can help me.
sidecar yaml:
```
apiVersion: opentelemetry.io/v1alpha1
kind: OpenTelemetryCollector
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"opentelemetry.io/v1alpha1","kind":"OpenTelemetryCollector","metadata":{"annotations":{},"name":"signoz-otel-collector-sidecar","namespace":"default"},"spec":{"config":"receivers:\n  otlp:\n    protocols:\n      http:\n      grpc:\nprocessors:\n  batch:\nexporters:\n  logging:\n  otlp:\n    endpoint: 100.94.35.234:4317\n    tls:\n      insecure: true\nservice:\n  pipelines:\n    traces:\n      receivers: [otlp]\n      processors: [batch]\n      exporters: [logging, otlp]\n    metrics:\n      receivers: [otlp]\n      processors: [batch]\n      exporters: [logging, otlp]\n","mode":"sidecar"}}
  creationTimestamp: "2023-03-15T12:50:40Z"
  generation: 3
  name: signoz-otel-collector-sidecar
  namespace: default
  resourceVersion: "850306767"
  uid: c6a30f0e-1a88-4628-a6ac-c57400ff20c9
spec:
  config: |
    receivers:
      otlp:
        protocols:
          http:
          grpc:
    processors:
      batch:
    exporters:
      logging:
      otlp:
        endpoint: 100.94.35.234:4317
        tls:
          insecure: true
    service:
      pipelines:
        traces:
          receivers: [otlp]
          processors: [batch]
          exporters: [logging, otlp]
        metrics:
          receivers: [otlp]
          processors: [batch]
          exporters: [logging, otlp]
  mode: sidecar
```
instrumentation yaml:
```
apiVersion: opentelemetry.io/v1alpha1
kind: Instrumentation
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"opentelemetry.io/v1alpha1","kind":"Instrumentation","metadata":{"annotations":{},"name":"signoz-otel-collector-instrumentation","namespace":"default"},"spec":{"exporter":{"endpoint":"https://signoz-otel-collector.dev.sicredi.cloud:4317"},"java":{"image":"ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-java:latest"},"nodejs":{"image":"ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-nodejs:latest"},"propagators":["tracecontext","baggage","b3"],"python":{"image":"ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-python:latest"},"sampler":{"argument":"0.25","type":"parentbased_traceidratio"}}}
  creationTimestamp: "2023-03-15T01:54:20Z"
  generation: 4
  name: signoz-otel-collector-instrumentation
  namespace: default
  resourceVersion: "850306883"
  uid: 2ca19a50-dcea-4c99-97fb-4f60cef3ba03
spec:
  exporter:
    endpoint: 100.94.35.234:4317
  java:
    image: ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-java:latest
  nodejs:
    image: ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-nodejs:latest
  propagators:
    - tracecontext
    - baggage
    - b3
  python:
    image: ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-python:latest
  sampler:
    argument: "0.25"
    type: parentbased_traceidratio
```
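For reference, a quick way to confirm the operator registered both custom resources after applying the YAML above (file names and namespace are placeholders):
```
# Apply both custom resources and list them; the names should match the YAML above
kubectl apply -f sidecar.yaml -f instrumentation.yaml
kubectl -n default get opentelemetrycollectors,instrumentations
```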
application pod deployment yaml:
```
apiVersion: apps/v1
kind: Deployment
metadata:
  annotations:
    deployment.kubernetes.io/revision: "23"
  creationTimestamp: "2022-07-04T15:57:14Z"
  generation: 40
  labels:
    app: plataforma-monitoramento-infraphone-dialer
    devconsole: managed
    devconsole_application: plataforma-monitoramento
    devconsole_component: plataforma-monitoramento-infraphone-dialer
  name: plataforma-monitoramento-infraphone-dialer-deployment
  namespace: default
  resourceVersion: "850309601"
  uid: 77e93b40-e95e-4e60-b065-641f516f664a
spec:
  progressDeadlineSeconds: 220
  replicas: 1
  revisionHistoryLimit: 1
  selector:
    matchLabels:
      app: plataforma-monitoramento-infraphone-dialer
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      annotations:
        instrumentation.opentelemetry.io/inject-java: "true"
        sidecar.opentelemetry.io/inject: "true"
      creationTimestamp: null
      labels:
        app: plataforma-monitoramento-infraphone-dialer
        date: "1668455468726"
```
p
can you check the logs of the otel sidecar?
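A quick way to do that, assuming the operator injected the sidecar as `otc-container`; the pod name and namespace below are placeholders:
```
# The instrumented pod should list an extra container once the sidecar is injected
kubectl -n default get pod <your-app-pod> -o jsonpath='{.spec.containers[*].name}'

# Tail the injected collector's logs
kubectl -n default logs <your-app-pod> -c otc-container -f
```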
> endpoint: 100.94.35.234:4317
I don't think the endpoint should point directly to a private internal IP like that.
If the internal IP of the Pod/Service changes, it will stop working.
Use the service name instead if SigNoz is in the same cluster.
@igor estevan jasinski also make sure the libraries/frameworks used by your application are supported: https://github.com/open-telemetry/opentelemetry-java-instrumentation/blob/main/docs/supported-libraries.md
i
I will check the libraries, but I'm using a static IP to connect, and the troubleshoot tool shows that the collector is reachable. Any other tips on how to set up the sidecar and the annotations? PS: I have SigNoz installed in one cluster, and the pods with the applications I want to monitor are in another cluster.
Looking at my otel-collector pod, the only log I found is the one below; nothing related to my application pods:
[screenshot attached: image.png]
p
> I have SigNoz installed in one cluster, and the pods with the applications I want to monitor are in another cluster
I do not think an internal static private IP can be used across clusters unless you have configured some internal solution to enable it. In any case, make sure the endpoint you pass (static IP or public load balancer endpoint) is reachable across clusters by running `troubleshoot` or `telemetrygen` from the application cluster.
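A sketch of running that check from the application cluster, roughly following the SigNoz troubleshooting docs; treat the exact image name and flags as assumptions and verify them against the docs linked earlier:
```
kubectl -n default run troubleshoot --image=signoz/troubleshoot \
  --restart='Never' -i --tty --rm --command -- \
  ./troubleshoot checkEndpoint --endpoint=100.94.35.234:4317
```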
i
@Prashant Shahi That IP is an external IP from the load balancer for the otel-collector service, and I already used the troubleshoot tool from the application cluster; it reaches the otel-collector.
I think my problem is in the sidecar/Instrumentation configuration or the pod annotation part.
p
I see. Let me verify the configuration you shared.
i
thank you
@Prashant Shahi I also tried creating the petclinic test application, but I got the same error.
@Prashant Shahi If you have any tip or any other way to test, it would be very helpful, because SigNoz looks amazing.
@Prashant Shahi Were you able to take a look at the configuration I sent you?
p
Hey @igor estevan jasinski! I was able to see the pet clinic app in SigNoz without any issue using the OpenTelemetry Operator.
Also, check the logs of the SigNoz otel-collector, as well as those of the `otc-container` container inside the instrumented pod; you should see logs like this:
```
2023-04-03T16:25:14.770Z	info	MetricsExporter	{"kind": "exporter", "data_type": "metrics", "name": "logging", "#metrics": 78}
2023-04-03T16:26:42.474Z	info	TracesExporter	{"kind": "exporter", "data_type": "traces", "name": "logging", "#spans": 5}
2023-04-03T16:26:47.529Z	info	TracesExporter	{"kind": "exporter", "data_type": "traces", "name": "logging", "#spans": 20}
2023-04-03T16:29:14.866Z	info	MetricsExporter	{"kind": "exporter", "data_type": "metrics", "name": "logging", "#metrics": 83}
```
> endpoint: 100.94.35.234:4317
Can you try to use this public endpoint with `telemetrygen` instead? Afterwards, verify it in the SigNoz UI:
```
telemetrygen traces --traces 1 --otlp-endpoint 100.94.35.234:4317 --otlp-insecure
```
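If `telemetrygen` isn't installed on the machine you're testing from, it can be built with a Go toolchain from the upstream collector-contrib module (version pinning is up to you):
```
go install github.com/open-telemetry/opentelemetry-collector-contrib/cmd/telemetrygen@latest
```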
i
Here are a couple of log lines from the otel-collector:
```
/go/pkg/mod/go.opentelemetry.io/collector/processor/batchprocessor@v0.66.0/batch_processor.go:144
2023-04-03T17:40:06.880Z	warn	batchprocessor@v0.66.0/batch_processor.go:178	Sender failed	{"kind": "processor", "name": "batch", "pipeline": "logs", "error": "sending_queue is full"}
2023-04-03T17:40:07.881Z	warn	batchprocessor@v0.66.0/batch_processor.go:178	Sender failed	{"kind": "processor", "name": "batch", "pipeline": "logs", "error": "sending_queue is full"}
2023-04-03T17:40:08.178Z	info	exporterhelper/queued_retry.go:426	Exporting failed. Will retry the request after interval.	{"kind": "exporter", "data_type": "metrics", "name": "clickhousemetricswrite", "error": "code: 243, message: Cannot reserve 1.00 MiB, not enough space", "interval": "137.169576ms"}
2023-04-03T17:40:08.882Z	warn	batchprocessor@v0.66.0/batch_processor.go:178	Sender failed	{"kind": "processor", "name": "batch", "pipeline": "logs", "error": "sending_queue is full"}
2023-04-03T17:40:09.883Z	warn	batchprocessor@v0.66.0/batch_processor.go:178	Sender failed	{"kind": "processor", "name": "batch", "pipeline": "logs", "error": "sending_queue is full"}
2023-04-03T17:40:10.883Z	warn	batchprocessor@v0.66.0/batch_processor.go:178	Sender failed	{"kind": "processor", "name": "batch", "pipeline": "logs", "error": "sending_queue is full"}
2023-04-03T17:40:11.884Z	warn	batchprocessor@v0.66.0/batch_processor.go:178	Sender failed	{"kind": "processor", "name": "batch", "pipeline": "logs", "error": "sending_queue is full"}
2023-04-03T17:40:12.886Z	warn	batchprocessor@v0.66.0/batch_processor.go:178	Sender failed	{"kind": "processor", "name": "batch", "pipeline": "logs", "error": "sending_queue is full"}
2023-04-03T17:40:13.887Z	warn	batchprocessor@v0.66.0/batch_processor.go:178	Sender failed	{"kind": "processor", "name": "batch", "pipeline": "logs", "error": "sending_queue is full"}
2023-04-03T17:40:14.887Z	warn	batchprocessor@v0.66.0/batch_processor.go:178	Sender failed	{"kind": "processor", "name": "batch", "pipeline": "logs", "error": "sending_queue is full"}
2023-04-03T17:40:15.889Z	warn	batchprocessor@v0.66.0/batch_processor.go:178	Sender failed	{"kind": "processor", "name": "batch", "pipeline": "logs", "error": "sending_queue is full"}
2023-04-03T17:40:16.890Z	error	exporterhelper/queued_retry.go:310	Dropping data because sending_queue is full. Try increasing queue_size.	{"kind": "exporter", "data_type": "logs", "name": "clickhouselogsexporter", "dropped_items": 1144}
go.opentelemetry.io/collector/exporter/exporterhelper.(*queuedRetrySender).send
	/go/pkg/mod/go.opentelemetry.io/collector@v0.66.0/exporter/exporterhelper/queued_retry.go:310
go.opentelemetry.io/collector/exporter/exporterhelper.NewLogsExporter.func2
	/go/pkg/mod/go.opentelemetry.io/collector@v0.66.0/exporter/exporterhelper/logs.go:114
go.opentelemetry.io/collector/consumer.ConsumeLogsFunc.ConsumeLogs
	/go/pkg/mod/go.opentelemetry.io/collector/consumer@v0.66.0/logs.go:36
go.opentelemetry.io/collector/processor/batchprocessor.(*batchLogs).export
	/go/pkg/mod/go.opentelemetry.io/collector/processor/batchprocessor@v0.66.0/batch_processor.go:339
go.opentelemetry.io/collector/processor/batchprocessor.(*batchProcessor).sendItems
	/go/pkg/mod/go.opentelemetry.io/collector/processor/batchprocessor@v0.66.0/batch_processor.go:176
go.opentelemetry.io/collector/processor/batchprocessor.(*batchProcessor).startProcessingCycle
	/go/pkg/mod/go.opentelemetry.io/collector/processor/batchprocessor@v0.66.0/batch_processor.go:144
2023-04-03T17:40:16.890Z	warn	batchprocessor@v0.66.0/batch_processor.go:178	Sender failed	{"kind": "processor", "name": "batch", "pipeline": "logs", "error": "sending_queue is full"}
2023-04-03T17:40:17.891Z	warn	batchprocessor@v0.66.0/batch_processor.go:178	Sender failed	{"kind": "processor", "name": "batch", "pipeline": "logs", "error": "sending_queue is full"}
2023-04-03T17:40:18.226Z	info	exporterhelper/queued_retry.go:426	Exporting failed. Will retry the request after interval.	{"kind": "exporter", "data_type": "metrics", "name": "clickhousemetricswrite", "error": "code: 243, message: Cannot reserve 1.00 MiB, not enough space", "interval": "125.471225ms"}
2023-04-03T17:40:18.892Z	warn	batchprocessor@v0.66.0/batch_processor.go:178	Sender failed	{"kind": "processor", "name": "batch", "pipeline": "logs", "error": "sending_queue is full"}
2023-04-03T17:40:19.892Z	warn	batchprocessor@v0.66.0/batch_processor.go:178	Sender failed	{"kind": "processor", "name": "batch", "pipeline": "logs", "error": "sending_queue is full"}
2023-04-03T17:40:20.893Z	warn	batchprocessor@v0.66.0/batch_processor.go:178	Sender failed	{"kind": "processor", "name": "batch", "pipeline": "logs", "error": "sending_queue is full"}
2023-04-03T17:40:21.894Z	warn	batchprocessor@v0.66.0/batch_processor.go:178	Sender failed	{"kind": "processor", "name": "batch", "pipeline": "logs", "error": "sending_queue is full"}
2023-04-03T17:40:22.895Z	warn	batchprocessor@v0.66.0/batch_processor.go:178	Sender failed	{"kind": "processor", "name": "batch", "pipeline": "logs", "error": "sending_queue is full"}
2023-04-03T17:40:23.896Z	warn	batchprocessor@v0.66.0/batch_processor.go:178	Sender failed	{"kind": "processor", "name": "batch", "pipeline": "logs", "error": "sending_queue is full"}
2023-04-03T17:40:24.445Z	error	exporterhelper/queued_retry.go:310	Dropping data because sending_queue is full. Try increasing queue_size.	{"kind": "exporter", "data_type": "metrics", "name": "clickhousemetricswrite", "dropped_items": 442}
```
p
```
2023-04-03T17:40:18.226Z	info	exporterhelper/queued_retry.go:426	Exporting failed. Will retry the request after interval.	{"kind": "exporter", "data_type": "metrics", "name": "clickhousemetricswrite", "error": "code: 243, message: Cannot reserve 1.00 MiB, not enough space", "interval": "125.471225ms"}
```
@igor estevan jasinski You might want to increase the ClickHouse PVC size. It seems to be a disk-full issue.
Later, you can reduce the TTL from the SigNoz UI settings page to an appropriate duration so this doesn't happen again.
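A minimal sketch of increasing the ClickHouse volume size via the Helm chart, assuming SigNoz was installed with the official chart and the storage class supports volume expansion; the release name, namespace, and values key are assumptions, so check `helm show values signoz/signoz` first:
```
# Bump the ClickHouse PVC size (values key assumed; verify against the chart's values)
helm -n platform upgrade my-release signoz/signoz \
  --set clickhouse.persistence.size=100Gi
```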