# support
j
Hello team, just one more question. I am implementing manual instrumentation via the OpenTelemetry Python API. My use case is to have some pretty basic error counters that I can alert on when they are above a certain threshold. When setting up the normal OTLPMetricExporter, I am able to query for these custom metrics, but the values I query for are cumulative (counters don't reset back to zero after exporting). Because of this, it makes it impossible to create a query that can alarm if there are x amount of errors in time interval y. The OpenTelemetry API docs say that you can add a "preferred_temporality" argument that allows you to override this and specify DELTA, which should do what I need and reset the counter to zero after exporting. The problem is that when I set this, SigNoz is not receiving the metric. Is this not supported? If so, how should I go about doing something as simple as this? Some code below (when I remove AggregationTemporality.DELTA, metrics are exported, but not in a form that works for my use case):
resource = Resource(attributes={
    "profile": profile_name,
    "environment": env,
})

reader = PeriodicExportingMetricReader(
    OTLPMetricExporter(
        endpoint=_monitoring_api_url,
        preferred_temporality={Counter: AggregationTemporality.DELTA},
    ),
    export_interval_millis=60000,
)

provider = MeterProvider(resource=resource, metric_readers=[reader])
metrics.set_meter_provider(provider)
s
SigNoz doesn’t support delta temporality yet.
it makes it impossible to create a query that can alarm if there are x amount of errors in time interval y.
You can use rate to achieve this.
To be more explicit: the rate of change gives you the per-second value over a 30-second interval. You get the absolute value by multiplying this result by 30. IIUC, you wanted this absolute value so that you can set an alert on it. There is a formula tab in the builder where you can write the expression A*30, where A is the rate query.
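For illustration, the arithmetic behind A*30 can be sketched in Python. This is only a rough sketch: the sample values are made up, the helper name `rate_per_second` is hypothetical, and the 30-second interval is assumed from the reply above.

```python
def rate_per_second(prev_value, curr_value, interval_seconds):
    """Per-second rate of change between two cumulative counter samples."""
    return (curr_value - prev_value) / interval_seconds


interval_seconds = 30              # assumed evaluation interval
prev_total, curr_total = 120, 135  # made-up cumulative error counts

rate = rate_per_second(prev_total, curr_total, interval_seconds)  # plays the role of A
errors_in_interval = rate * interval_seconds                      # plays the role of A*30

print(rate)                # 0.5
print(errors_in_interval)  # 15.0
```

Multiplying the rate back by the interval length recovers the raw difference between the two cumulative samples, which is the number to compare against an alert threshold.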
j
Thanks! Makes sense. I played with rate before but just assumed it would be the raw difference, and was confused by the number it gave because the interval length isn't specified. Can we assume cumulative aggregation would be a robust measurement for the following scenarios?
• Multiple services pushing the same metrics with frequent restarts (will rate be negative if a service restarts?). How do we take the rate per service and sum them (SUM_RATE)?
• Export interval is greater than alarm interval (is rate correctly zero on no-exports?)
Thanks again for your help 🙂
s
The counter resets are not handled correctly, so that can be a problem. Yes, if there is no change in the counter for an interval, the rate will be zero.
However, PromQL evaluation takes the resets into account. If you could share the error counter metric you created, I could share the expression by service for this. It would roughly look like the following:
sum by (service_name) (rate(your_metric_name_here{your_filters}[5m]))
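To make the reset handling concrete, here is a simplified Python sketch of the idea behind that expression. It is an approximation, not how PromQL is implemented: it ignores the extrapolation PromQL's rate() applies at window boundaries, and the function names, instance names, and sample values are all made up for illustration.

```python
def increase_with_resets(samples):
    """Total increase across cumulative samples, treating any drop as a restart from zero."""
    total = 0.0
    for prev, curr in zip(samples, samples[1:]):
        if curr >= prev:
            total += curr - prev
        else:
            # The counter went down, so assume the process restarted
            # and the new value was counted up from zero.
            total += curr
    return total


def summed_rate(series_by_instance, window_seconds):
    """Compute the increase per time series first, then sum and divide by the window."""
    total_increase = sum(increase_with_resets(s) for s in series_by_instance.values())
    return total_increase / window_seconds


series = {
    "instance-a": [10, 12, 15, 2, 4],         # restarts after reaching 15
    "instance-b": [100, 101, 101, 103, 106],  # no restart
}

print(increase_with_resets(series["instance-a"]))  # 9.0
print(increase_with_resets(series["instance-b"]))  # 6.0
```

The order matters: the increase is computed for each series separately and only then summed. Summing the raw counters first would hide the restart of one instance inside the total.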
j
Your example should be helpful enough to get there if I need it. It will only affect dashboard graphs a little, since I only alert above a threshold. Just confirming that, besides this, SUM_RATE in the query builder should work correctly out of the box with identical horizontally scaled services? aka each service counter is tracked separately as a time series and the sum of differences of those will be calculated. Or since they are identical they are pooled in the same time series?
s
aka each service counter is tracked separately as a time series and the sum of differences of those will be calculated
This is correct.
since they are identical they are pooled in same time series?
How are they identical? There should be some identifier that differentiates each scaled instance and that should come from the instrumentation.
Unfortunately, the SDK doesn’t do this today. Since you mentioned the Python SDK, here is the issue in its tracker: https://github.com/open-telemetry/opentelemetry-python/issues/2113
You should work around this by creating your own unique ID until the SDK supports it.
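One possible way to do that, as a minimal sketch: generate an ID at process start and add it to the resource attributes. The helper name `build_resource_attributes` and the ID format are hypothetical; `service.instance.id` is the attribute key that the OpenTelemetry semantic conventions define for this purpose. The sketch uses only the standard library, and the resulting dict would be passed to Resource(attributes=...) as in the snippet earlier in the thread.

```python
import os
import socket
import uuid


def build_resource_attributes(profile_name, env):
    """Resource attributes plus an ID that is unique to this process."""
    # Hostname and PID make the ID easier to recognise; the UUID makes it unique.
    instance_id = f"{socket.gethostname()}-{os.getpid()}-{uuid.uuid4()}"
    return {
        "profile": profile_name,
        "environment": env,
        "service.instance.id": instance_id,
    }


# Placeholder values for illustration only
attributes = build_resource_attributes("example-profile", "staging")
# Then, with the SDK: resource = Resource(attributes=attributes)
```

Because the ID changes on every restart, each restarted process shows up as a new time series rather than as a reset of an existing one.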
j
Cool, so I must set this manually from the Python SDK then?
s
Yes