# support
s
and the image is otel/opentelemetry-collector-contrib:0.43.0
p
It should be 0.55.0
s
Is there any dependency on the EKS version?
n
There are a lot of breaking changes between v0.43.0 and v0.55.0. Some protocol buffers are also updated. For exact details, you can visit the releases page: https://github.com/open-telemetry/opentelemetry-collector/releases
s
Let me try with the updated one
@nitya-signoz Will the SigNoz repo support 0.55.0?
When I upgraded the tag, the otelcontribcol deployment started giving an error.
p
Oh, right! That version should be updated.
@sudhanshu dev can you try re-installing otel-collector-k8s components?
n
Yeah, we will have to make a change in otel-collector-k8s
s
Do I need to take the update from your git repo and then redeploy
with the new tag?
Because in the SigNoz repo the image tag is 0.45.0
@Prashant Shahi could you please let me know what you are trying to say.
Does redeploy mean
taking the updated configuration and then re-applying it?
p
# cleanup
kubectl -n signoz-infra-metrics delete -Rf agent
kubectl -n signoz-infra-metrics delete -Rf deployment

# re-install
kubectl -n signoz-infra-metrics apply -Rf agent
kubectl -n signoz-infra-metrics apply -Rf deployment
@sudhanshu dev You have updated the version in both the agent and the deployment to 0.55.0, right?
s
yes
same error
p
@Srikanth Chekuri any insights here? cc @Ankit Nayan
s
Ok let me try
Added
data:
  otel-collector-config: |
    receivers:
      otlp:
        protocols:
          grpc:
            max_recv_msg_size_mib: 300
            endpoint: 0.0.0.0:4317
          http:
            endpoint: 0.0.0.0:4318
now deploying
@nitya-signoz same error
I made the changes in the infra-metrics.yaml file.
I hope you were referring to the same one.
n
Just confirming here:
• You are running multiple otel collectors, and the otel collectors that are sending the data to the SigNoz otel collector are throwing the error.
• You applied the changes on the SigNoz otel collector, which is accepting the data from the other otel collectors, right?
Let me know if the above two assumptions are correct.
s
Yes, one daemonset and one deployment are running.
I am getting the error in the deployment,
in the last pod.
n
Ohh, that's interesting. I expected this error to occur on the daemonset collectors.
s
yes
That's why
I sent you the file,
because we made the changes in the daemonset configmap
and are getting the error in the deployment.
No error in the daemonset.
n
Ahh the number is high.
s
Where should we define this?
n
Try increasing the value of max_recv_msg_size_mib to a bigger number like 10000 and see if you get the error again.
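For reference, a minimal sketch of how that setting could look in the receiving collector's otlp receiver block (same shape as the snippet above; 10000 is just the illustrative value suggested here, not a recommendation):

receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
        # maximum size (in MiB) of a single gRPC message the receiver will accept
        max_recv_msg_size_mib: 10000
      http:
        endpoint: 0.0.0.0:4318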
s
In the same file, infra-metrics.yaml?
n
Not sure about the file, but it should be on the otel collector which is receiving the data from the other otel collectors. So it should be for the deployment. @Prashant Shahi can you confirm whether that points to infra-metrics.yaml?
s
I didn't follow the whole conversation, but continuously increasing the receiver message size is not the correct approach. You should be able to batch the export so that it fits within some limit, without touching the receiver side. Can you try configuring
send_batch_max_size
to a sensible limit for you? This should also be configurable from some SDKs, and this issue tries to make it official: https://github.com/open-telemetry/opentelemetry-specification/issues/2772
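A minimal sketch of what that could look like in the batch processor of the collector that is exporting the data (the field names belong to the batch processor; the 8192 value is only an example, pick a limit that keeps a single export under the receiver's message size):

processors:
  batch:
    # flush once this many items have accumulated
    send_batch_size: 8192
    # hard upper bound so a single exported batch never exceeds this size
    send_batch_max_size: 8192
    timeout: 10s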
s
Same error.
I started getting this error when I upgraded to 0.55.0
from 0.43.0.
@nitya-signoz
s
Where and how much did you configure? If the batch size is not reasonable, the total payload size will still be big and the receiver throws an error. This was the case for a user I know who faced a similar problem, and they got it fixed by using
send_batch_max_size
. Make sure you set it on the collector that is exporting (sending) the data, not on the receiver side.
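In other words, the batch processor with send_batch_max_size goes in the config of the collector that sends the data, and it must be referenced in that collector's pipeline. A sketch under that assumption (the receiver and exporter names here are placeholders; the sending collector's actual pipeline may list different components):

service:
  pipelines:
    metrics:
      receivers: [otlp]
      processors: [batch]   # batch processor with send_batch_max_size, on the sending side
      exporters: [otlp]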
a
@sudhanshu dev is this error fixed now?
s
@Prashant Shahi After adding the properties,
the issue seems to be fixed.
kubectl logs -f otelcontribcol-65f7cbcd58-wcr9c -n signoz-infra-metrics
2022/09/16 11:01:09 proto: duplicate proto type registered: jaeger.api_v2.PostSpansRequest
2022/09/16 11:01:09 proto: duplicate proto type registered: jaeger.api_v2.PostSpansResponse
2022-09-16T11:01:09.879Z info service/telemetry.go:103 Setting up own telemetry...
2022-09-16T11:01:09.879Z info service/telemetry.go:138 Serving Prometheus metrics {"address": ":8888", "level": "basic"}
2022-09-16T11:01:09.913Z info extensions/extensions.go:42 Starting extensions...
2022-09-16T11:01:09.913Z info pipelines/pipelines.go:74 Starting exporters...
2022-09-16T11:01:09.913Z info pipelines/pipelines.go:78 Exporter is starting... {"kind": "exporter", "data_type": "metrics", "name": "otlp"}
2022-09-16T11:01:09.913Z info pipelines/pipelines.go:82 Exporter started. {"kind": "exporter", "data_type": "metrics", "name": "otlp"}
2022-09-16T11:01:09.913Z info pipelines/pipelines.go:86 Starting processors...
2022-09-16T11:01:09.913Z info pipelines/pipelines.go:90 Processor is starting... {"kind": "processor", "name": "batch", "pipeline": "metrics"}
2022-09-16T11:01:09.913Z info pipelines/pipelines.go:94 Processor started. {"kind": "processor", "name": "batch", "pipeline": "metrics"}
2022-09-16T11:01:09.913Z info pipelines/pipelines.go:98 Starting receivers...
2022-09-16T11:01:09.913Z info pipelines/pipelines.go:102 Receiver is starting... {"kind": "receiver", "name": "k8s_cluster", "pipeline": "metrics"}
2022-09-16T11:01:09.914Z info pipelines/pipelines.go:106 Receiver started. {"kind": "receiver", "name": "k8s_cluster", "pipeline": "metrics"}
2022-09-16T11:01:09.914Z info service/collector.go:215 Starting otelcol-contrib... {"Version": "0.55.0", "NumCPU": 32}
2022-09-16T11:01:09.914Z info service/collector.go:128 Everything is ready. Begin running and processing data.
2022-09-16T11:01:09.914Z info k8sclusterreceiver@v0.55.0/receiver.go:59 Starting shared informers and wait for initial cache sync. {"kind": "receiver", "name": "k8s_cluster", "pipeline": "metrics"}
2022-09-16T11:01:14.916Z info k8sclusterreceiver@v0.55.0/receiver.go:80 Completed syncing shared informer caches. {"kind": "receiver", "name": "k8s_cluster", "pipeline": "metrics"}
No more message drop errors.