# support
s
Hi Team, I followed https://signoz.io/docs/instrumentation/fastapi/ in my local setup.
requirement.txt:
opentelemetry-api==1.22.0
opentelemetry-distro==0.43b0
opentelemetry-instrumentation==0.43b0
opentelemetry-exporter-otlp==1.22.0
Command to run the uvicorn app:
OTEL_RESOURCE_ATTRIBUTES=service.name=translationService OTEL_EXPORTER_OTLP_ENDPOINT="localhost:4317" OTEL_EXPORTER_OTLP_PROTOCOL=grpc opentelemetry-instrument uvicorn main:app
Still, the service and traces are not showing up in SigNoz.
@Srikanth Chekuri any help please 🙏
s
Can you share the full output of pip freeze in venv?
s
(translation_api) sumitroy@Sumit-ka-MacBook-Pro app % pip freeze anyio==4.2.0 asgiref==3.7.2 backoff==2.2.1 boto3==1.26.134 botocore==1.29.165 cachetools==5.3.2 certifi==2024.2.2 charset-normalizer==3.3.2 click==8.1.7 confluent-kafka==2.3.0 ctranslate2==3.0.2 Deprecated==1.2.14 docker==7.0.0 exceptiongroup==1.2.0 fastapi==0.85.2 fasttext==0.9.2 filelock==3.13.1 fsspec==2024.2.0 google-api-core==2.17.0 google-auth==2.27.0 google-cloud-core==2.4.1 google-cloud-translate==3.15.1 googleapis-common-protos==1.62.0 grpcio==1.60.1 grpcio-status==1.60.1 h11==0.14.0 huggingface-hub==0.20.3 idna==3.6 importlib-metadata==6.0.1 jmespath==1.0.1 joblib==1.3.2 MarkupSafe==2.1.5 nltk==3.7 numpy==1.26.4 opentelemetry-api==1.22.0 opentelemetry-distro==0.43b0 opentelemetry-exporter-otlp==1.22.0 opentelemetry-exporter-otlp-proto-common==1.22.0 opentelemetry-exporter-otlp-proto-grpc==1.22.0 opentelemetry-exporter-otlp-proto-http==1.22.0 opentelemetry-instrumentation==0.43b0 opentelemetry-instrumentation-asgi==0.43b0 opentelemetry-instrumentation-aws-lambda==0.43b0 opentelemetry-instrumentation-boto3sqs==0.43b0 opentelemetry-instrumentation-botocore==0.43b0 opentelemetry-instrumentation-dbapi==0.43b0 opentelemetry-instrumentation-fastapi==0.43b0 opentelemetry-instrumentation-grpc==0.43b0 opentelemetry-instrumentation-logging==0.43b0 opentelemetry-instrumentation-requests==0.43b0 opentelemetry-instrumentation-sqlite3==0.43b0 opentelemetry-instrumentation-tortoiseorm==0.43b0 opentelemetry-instrumentation-urllib==0.43b0 opentelemetry-instrumentation-urllib3==0.43b0 opentelemetry-instrumentation-wsgi==0.43b0 opentelemetry-propagator-aws-xray==1.0.1 opentelemetry-proto==1.22.0 opentelemetry-sdk==1.22.0 opentelemetry-semantic-conventions==0.43b0 opentelemetry-util-http==0.43b0 packaging==23.2 proto-plus==1.23.0 protobuf==4.25.2 pyasn1==0.5.1 pyasn1-modules==0.3.0 pybind11==2.11.1 pydantic==1.10.14 python-dateutil==2.8.2 PyYAML==6.0.1 regex==2023.12.25 requests==2.31.0 rsa==4.9 
s3transfer==0.6.2 sacremoses==0.0.53 sentencepiece==0.1.97 six==1.16.0 slack-sdk==3.11.0 sniffio==1.3.0 starlette==0.20.4 tokenizers==0.13.3 torch==1.13.0 tqdm==4.66.1 transformers==4.24.0 typing_extensions==4.9.0 urllib3==1.26.18 uvicorn==0.22.0 Werkzeug==3.0.1 wrapt==1.16.0 zipp==3.17.0
Can you please also suggest an approach for sending logs from the FastAPI service?
s
Let's address one at a time. Can you use this command
OTEL_RESOURCE_ATTRIBUTES=service.name=translationService OTEL_TRACES_EXPORTER=console opentelemetry-instrument uvicorn main:app
and try hitting some endpoints and see if it produces json traces to console/stdout/terminal?
s
cool, let me try this. thanks
yup, getting json traces on console:
2024-02-16 14:27:12,934 - process_ticket_util - INFO - Performed health check
INFO: 127.0.0.1:49591 - "GET / HTTP/1.1" 200 OK
{ "name": "GET / http send", "context": { "trace_id": "0x15a8d59a75179ba3870405f68e00a7fb", "span_id": "0x372153c4a19aa7c5", "trace_state": "[]" }, "kind": "SpanKind.INTERNAL", "parent_id": "0x96ece8851e0ffab2", "start_time": "2024-02-16T08:57:10.902625Z", "end_time": "2024-02-16T08:57:10.904424Z", "status": { "status_code": "UNSET" }, "attributes": { "http.status_code": 200, "type": "http.response.start" }, "events": [], "links": [], "resource": { "attributes": { "telemetry.sdk.language": "python", "telemetry.sdk.name": "opentelemetry", "telemetry.sdk.version": "1.22.0", "service.name": "translationService", "telemetry.auto.version": "0.43b0" }, "schema_url": "" } }
s
Did you see any errors when you tried earlier
OTEL_EXPORTER_OTLP_ENDPOINT="localhost:4317" OTEL_EXPORTER_OTLP_PROTOCOL=grpc
s
no
No errors, but the service did not show up in SigNoz then, and it does not now either. These traces appear only in the console.
s
Were there any errors in signoz-otel-collector?
s
not there too
s
I am not sure what might be the issue then. The console exporter indicates that instrumentation is working. And if you don't see any errors with grpc exporter and collector then it's unclear where the error might be.
s
hmm, so I hit the (http) endpoint a lot of times, and now I am able to see the service in SigNoz along with its traces (not logs). Not sure what made it work. Can we please discuss sending logs from the same service to SigNoz? Context: there are distributed microservices (including this one) connected via Kafka, and I am trying to set up logging so that I can see the logs of multiple services handling the same request end to end. I believe this can be achieved with the help of the trace ID.
I am able to see the HTTP traces, but not the traces of Kafka message processing from the same service.
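The idea of correlating logs across services by trace ID can be sketched with the standard library alone. This is only a concept illustration, not how the OpenTelemetry logging instrumentation is implemented: `current_trace_id`, `TraceIdFilter`, `build_logger`, and `demo` are hypothetical names, and the trace ID value is taken from the console span shown earlier in this thread. A log line emitted outside any request keeps the all-zero ID.

```python
import contextvars
import io
import logging

# Hypothetical holder for the trace id of the request currently being handled.
# The default is the all-zero id, meaning "no active trace".
current_trace_id = contextvars.ContextVar("current_trace_id", default="0" * 32)


class TraceIdFilter(logging.Filter):
    """Stamp the current trace id onto every log record that passes through."""

    def filter(self, record):
        record.trace_id = current_trace_id.get()
        return True


def build_logger(name, stream):
    """Create a logger whose output lines carry a trace_id field."""
    logger = logging.getLogger(name)
    logger.handlers.clear()          # avoid duplicate handlers on repeated calls
    logger.setLevel(logging.INFO)
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(
        logging.Formatter("%(levelname)s trace_id=%(trace_id)s %(message)s")
    )
    handler.addFilter(TraceIdFilter())
    logger.addHandler(handler)
    return logger


def demo():
    """Log once inside a 'request' and once outside, returning the lines."""
    stream = io.StringIO()
    logger = build_logger("translation_demo", stream)
    token = current_trace_id.set("15a8d59a75179ba3870405f68e00a7fb")
    try:
        logger.info("processing request")
    finally:
        current_trace_id.reset(token)
    logger.info("outside any request")
    return stream.getvalue().splitlines()
```

If every service writes the same trace ID into its log lines, the logs for one request can be joined on that field in the backend.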
WARNING:opentelemetry.exporter.otlp.proto.grpc.exporter:Transient error StatusCode.DEADLINE_EXCEEDED encountered while exporting metrics to localhost:4317, retrying in 1s.
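When the exporter reports a timeout like this, one quick check is whether anything is accepting TCP connections on the collector port at all. The helper below is a minimal sketch using only the standard library; `can_connect` is a hypothetical name, and a successful TCP connection does not prove that the OTLP gRPC endpoint itself is healthy, only that the port is reachable.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False
```

For the setup in this thread the call would be `can_connect("localhost", 4317)`, since 4317 is the port used in the `OTEL_EXPORTER_OTLP_ENDPOINT` value above.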
s
Set
OTEL_PYTHON_LOGGING_AUTO_INSTRUMENTATION_ENABLED=true
to get the logs exported.
s
hey thanks. let me try
Logs are not being exported even after setting the above property.
After setting OTEL_LOGS_EXPORTER=otlp I am able to see only error logs. Is there any flag to see all the logs?
{ "body": "unhandled exception during asyncio.run() shutdown\ntask: <Task finished name='Task-4' coro=<consume() done, defined at /Users/sumitroy/Desktop/worklspace/translation_api/app/kafka_util.py:34> exception=AttributeError(\"'cimpl.Consumer' object has no attribute 'flush'\")>", "severity_number": "<SeverityNumber.ERROR: 17>", "severity_text": "ERROR", "attributes": { "trace_id": "00000000000000000000000000000000", "span_id": "0000000000000000", "exception.type": "AttributeError", "exception.message": "'cimpl.Consumer' object has no attribute 'flush'", "exception.stacktrace": "Traceback (most recent call last):\n File \"/Users/sumitroy/Desktop/worklspace/translation_api/app/kafka_util.py\", line 48, in consume\n await asyncio.sleep(1)\n File \"/opt/homebrew/Cellar/python@3.10/3.10.13_2/Frameworks/Python.framework/Versions/3.10/lib/python3.10/asyncio/tasks.py\", line 605, in sleep\n return await future\nasyncio.exceptions.CancelledError\n\nDuring handling of the above exception, another exception occurred:\n\nTraceback (most recent call last):\n File \"/Users/sumitroy/Desktop/worklspace/translation_api/app/kafka_util.py\", line 62, in consume\n await consumer.flush()\nAttributeError: 'cimpl.Consumer' object has no attribute 'flush'\n" }, "dropped_attributes": 0, "timestamp": "2024-02-16T13:40:15.625087Z", "trace_id": "0x00000000000000000000000000000000", "span_id": "0x0000000000000000", "trace_flags": 0, "resource": "BoundedAttributes({'telemetry.sdk.language': 'python', 'telemetry.sdk.name': 'opentelemetry', 'telemetry.sdk.version': '1.22.0', 'service.name': 'translationAPIService', 'telemetry.auto.version': '0.43b0'}, maxlen=None)" }
OTel is forwarding logs only when I close the FastAPI service.
Hi @Srikanth Chekuri, I am now able to get logs and traces into SigNoz via OTel from the FastAPI service, which is auto-instrumented. I need help with one more thing: Service A (a Spring Boot app, also auto-instrumented) sends a Kafka message to Service B (FastAPI) for processing, and B sends the result back to Service A. The trace ID is not preserved across this flow, and every log about the Kafka message processing shows TraceID: 00000000000000000000000000000000 SpanID: 0000000000000000. Can you please help me with this issue? It would be really helpful, as I'm almost on the last leg.
s
You need to serialize the trace context and send it along with the message, and then in the downstream service use that context when starting the span. Example here: https://signoz.io/blog/opentelemetry-context-propagation/#manual-context-propagation-taking-control
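To make the serialize-and-extract step concrete, here is a stdlib-only sketch of carrying a W3C `traceparent` value in message headers. It is an illustration under assumptions, not the code from the linked post: `inject_traceparent`, `extract_traceparent`, and `new_span_id` are hypothetical helpers, and the headers are modelled as a list of `(str, bytes)` tuples, which is assumed to match how the Kafka client in use represents them. In a real service the OpenTelemetry API's `opentelemetry.propagate.inject` and `extract` functions would normally do this work against a carrier.

```python
import re
import secrets

# W3C traceparent, version 00: "00-<32 hex trace id>-<16 hex span id>-<2 hex flags>"
TRACEPARENT_RE = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def new_span_id():
    """Generate a random 16-hex-character span id."""
    return secrets.token_hex(8)


def inject_traceparent(headers, trace_id, span_id, sampled=True):
    """Producer side: append a traceparent header to the message headers."""
    flags = "01" if sampled else "00"
    value = f"00-{trace_id}-{span_id}-{flags}"
    headers.append(("traceparent", value.encode("ascii")))
    return headers


def extract_traceparent(headers):
    """Consumer side: return (trace_id, parent_span_id, sampled) or None."""
    for key, raw in headers or []:
        if key != "traceparent":
            continue
        match = TRACEPARENT_RE.match(raw.decode("ascii", errors="replace"))
        if not match:
            return None
        trace_id, span_id, flags = match.groups()
        # All-zero ids are invalid; they indicate that no trace was active.
        if trace_id == "0" * 32 or span_id == "0" * 16:
            return None
        return trace_id, span_id, bool(int(flags, 16) & 0x01)
    return None
```

The consumer would start its processing span as a child of the extracted IDs. When nothing is extracted, spans and logs fall back to a fresh or empty context, which is consistent with the all-zero TraceID seen in the logs above.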
s
thanks let me have a look
Hi @Srikanth Chekuri, I have hosted SigNoz in AWS EKS following https://signoz.io/docs/tutorial/setting-up-tls-for-signoz/, configured ingress-nginx and cert-manager, and I am able to access SigNoz via a public domain. Is there any best practice for configuring OTel so that my apps can send logs and traces from outside this cluster?
s
Add a load balancer to the collector.
s
thanks, it worked. Thanks for all the help
Hi @Srikanth Chekuri, we have hosted SigNoz on EKS and are facing 'insufficient memory' continuously. We have 2 instances with a total of 4 CPUs, 16 GB RAM, and 200 GB. This deployment is only for dev, where we are doing testing, so there are not many logs. What could be the bottleneck? Which is advised: one bigger instance or multiple smaller instances?
s
Are you just using logs?
s
traces too
s
What is the approximate volume of data?
s
for a day, can be 15k
around 20k
s
It shouldn't take that many resources then. Please check which process is taking the memory.
s
And which is advised: one bigger instance or multiple smaller instances?
@John Silvan
j
Hi Srikanth, this is the resource usage. It looks like the OTel collector and the ClickHouse DB are the main consumers.
And if there's a big request to fetch logs, it looks like the query service usage jumps, the nodes run out of memory, and they go into memory reclaiming.
@Srikanth Chekuri can you help here?
s
The memory usage of the collector is unusually high. Can you please collect the heap profile and share it?
s
Hi @Srikanth Chekuri hope this finds you well. we had to delay the prod deployment with the on-going hiccups on dev and other items were prioritised. Wondering, if we can connect over meet/huddle to iron out the challenges. will really appreciate the help and effort. cc: @John Silvan
j
Hey @Srikanth Chekuri, we are looking at this again. The collector keeps getting killed with an OOMKilled error.
s
Please share more info. What version of SigNoz are you using? Did you set any memory limits? What are the resources available?
j
I've created a new thread, let's follow up there.