Powered by Linen
support
  • b

    Blake Romano

    06/30/2022, 5:15 PM
    New Update is giving me Otel Collector Error
    2022-06-30T17:14:03.114Z	info	service/collector.go:124	Everything is ready. Begin running and processing data.
    panic: runtime error: invalid memory address or nil pointer dereference
    [signal SIGSEGV: segmentation violation code=0x1 addr=0x28 pc=0x1063552]
    
    goroutine 211 [running]:
    github.com/open-telemetry/opentelemetry-collector-contrib/exporter/clickhousemetricsexporter.(*PrwExporter).export.func1()
    	/src/exporter/clickhousemetricsexporter/exporter.go:279 +0xf2
    created by github.com/open-telemetry/opentelemetry-collector-contrib/exporter/clickhousemetricsexporter.(*PrwExporter).export
    	/src/exporter/clickhousemetricsexporter/exporter.go:275 +0x256
    ✅ 1
    p
    • 2
    • 15
  • b

    Blake Romano

    07/01/2022, 4:07 PM
    Have you guys thought of leveraging Kuberhealthy to support Synthetic Checks? I think that may be a great way to add synthetics to the platform. https://github.com/kuberhealthy/kuberhealthy < This is one thing that I know my company heavily uses within New Relic/DataDog that Signoz doesn’t support
    p
    s
    • 3
    • 13
  • u

    Usman Ali

    07/04/2022, 2:21 PM
    Hello all, could anyone please respond? I'm at my wits' end here. Our Node.js app runs under pm2 in cluster mode and is unable to connect to SigNoz, although fork mode works well. We explicitly need pm2 cluster mode for our Node.js app. Please help us connect our cluster with SigNoz, or share a doc or something regarding this. Thanks.
    s
    • 2
    • 3
  • t

    teja

    07/05/2022, 6:14 AM
    hi team,
  • t

    teja

    07/05/2022, 6:14 AM
    I am facing an issue with alertmanager. Could you please help me out with this?
  • t

    teja

    07/05/2022, 6:14 AM
    unable to initialize gossip mesh" err="create memberlist: Failed to get final advertise address: No private IP address found, and explicit IP not provided"
    ✅ 1
    p
    a
    • 3
    • 28
  • b

    Brian Bills

    07/05/2022, 9:34 PM
    This used to work for me, I don't remember how I passed the backend, now my containers don't talk to the backend:
    docker run -it -p 8080:8080 -e SW_AGENT_NAME=docmagic-dev-torrance::bluelagoon -e POD_NAMESPACE=docmagic-dev-torrance cfpb-brian bash
    
    docker run -it -p 8081:8080 -e SW_AGENT_NAME=docmagic-preprod-torrance::jetblue -e POD_NAMESPACE=docmagic-preprod-torrance cfpb-brian bash
    
    docker run -it -p 8082:8080 -e SW_AGENT_NAME=docmagic-qa-torrance::chinaair -e POD_NAMESPACE=docmagic-qa-torrance cfpb-brian bash
    
    docker run -it -p 8083:8080 -e SW_AGENT_NAME=docmagic-stage-torrance::evaair -e POD_NAMESPACE=docmagic-stage-torrance cfpb-brian bash
    
    docker run -it -p 8084:8080 -e SW_AGENT_NAME=docmagic-dev-torrance::manadarinair -e POD_NAMESPACE=docmagic-dev-torrance cfpb-brian bash
    
    docker run -it -p 8085:8080 -e SW_AGENT_NAME=docmagic-stage-torrance::starlux -e POD_NAMESPACE=docmagic-stage-torrance cfpb-brian bash
    
    docker run -it -p 8086:8080 -e SW_AGENT_NAME=docmagic-qa-torrance::tigerair -e POD_NAMESPACE=docmagic-qa-torrance cfpb-brian bash
    
    docker run -it -p 8087:8080 -e SW_AGENT_NAME=docmagic-preprod-torrance::delta -e POD_NAMESPACE=docmagic-preprod-torrance cfpb-brian bash
    
    docker run -it -p 8088:8080 -e SW_AGENT_NAME=boa-preprod-torrance::blue22 -e POD_NAMESPACE=boa-preprod-torrance cfpb-brian bash
    Dockerfile:
    FROM harbor.docmagic.com/library/dm-openjdk:8-jdk-slim-buster-012122164131
    ADD agent /app/agent
    COPY cfpb-service.war /app
    COPY entrypoint.sh /usr/local/bin/
    ENTRYPOINT ["entrypoint.sh", "-j", "/app/cfpb-service.war"]
    entrypoint.sh:
    #!/bin/bash
    
    while getopts ":j:f:h" opt; do
      case ${opt} in
        j)
          JAR=$OPTARG
          ;;
        f)
          JVM_EXTRA_FLAGS=$OPTARG
          ;;
        h)
          echo "USAGE: entrypoint.sh -j app.jar"
          echo "  -j: /path/to/jar"
          echo "  -f: "JVM_EXTRA_FLAGS" (optional)"
          exit 0
          ;;
        \?)
          echo "Invalid option: $OPTARG" 1>&2
          exit 1
          ;;
        :)
          echo "Invalid option: $OPTARG requires an argument" 1>&2
          exit 1
          ;;
      esac
    done
    shift $((OPTIND -1))
    
    if [ -f "$JAR" ]; then
        JAVA_BIN=$(/usr/bin/which java)
        umask 002
        exec $JAVA_BIN \
            -Dfile.encoding=ISO-8859-1 \
            -Dserver.port=8080 \
            -Djava.security.egd=file:/dev/./urandom \
            -XX:+UnlockExperimentalVMOptions \
            -XX:+UseContainerSupport \
            -Dcom.sun.management.jmxremote \
            -Djava.rmi.server.hostname=127.0.0.1 \
            -Dcom.sun.management.jmxremote.port=1083 \
            -Dcom.sun.management.jmxremote.rmi.port=1083 \
            -Dcom.sun.management.jmxremote.local.only=false \
            -Dcom.sun.management.jmxremote.ssl=false \
            -Dcom.sun.management.jmxremote.authenticate=true \
            -Dcom.sun.management.jmxremote.access.file=$JAVA_HOME/conf/jmx.access \
            -Dcom.sun.management.jmxremote.password.file=$JAVA_HOME/conf/jmx.password \
            $JVM_EXTRA_FLAGS \
            -javaagent:/app/opentelemetry-javaagent.jar \
            -jar \
            $JAR
    else
        echo "File $JAR does not exist"
        exit 1
    fi
  • b

    Brian Bills

    07/05/2022, 9:49 PM
    I'm getting:
    [otel.javaagent 2022-07-05 14:48:40:134 -0700] [OkHttp http://localhost:4317/...] ERROR io.opentelemetry.exporter.internal.grpc.OkHttpGrpcExporter - Failed to export spans. The request could not be executed. Full error message: Failed to connect to localhost/127.0.0.1:4317
  • b

    Brian Bills

    07/05/2022, 10:23 PM
    I had to run my containers like this:
    docker run -it -p 8080:8080 -e OTEL_EXPORTER_OTLP_ENDPOINT="http://10.1.130.93:4317" -e OTEL_RESOURCE_ATTRIBUTES=service.name=bluelagoon -e POD_NAMESPACE=docmagic-dev-torrance cfpb-brian bash
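Passing an explicit endpoint helps because `localhost` inside a container refers to the container itself, so an exporter pointed at `localhost:4317` (as in the error above) cannot reach a collector running on another host. A minimal sketch, assuming Python is available wherever the check is run, for confirming that an OTLP endpoint accepts TCP connections before starting the app. `parse_endpoint` and `is_reachable` are hypothetical helper names, not part of any OpenTelemetry tooling:

```python
import socket
from urllib.parse import urlparse


def parse_endpoint(endpoint: str) -> tuple[str, int]:
    """Split an endpoint such as http://10.1.130.93:4317 into (host, port)."""
    parsed = urlparse(endpoint)
    if parsed.hostname is None or parsed.port is None:
        raise ValueError(f"endpoint needs a scheme, host and port: {endpoint!r}")
    return parsed.hostname, parsed.port


def is_reachable(endpoint: str, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to the endpoint can be opened."""
    host, port = parse_endpoint(endpoint)
    try:
        # Only checks that something is listening; it does not speak OTLP.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling `is_reachable("http://10.1.130.93:4317")` from inside the container would then show whether the address used in the working command is visible from there. A successful TCP connection does not prove the listener is an OTLP receiver.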
  • l

    Leon Jones

    07/06/2022, 12:07 AM
    Hi. I'm struggling to understand what I should expect to see in the 'Services' tab. I've seen other questions in this area, but none quite answer my query. I'm putting together a POC for a client and have a basic 3-hop kafka setup. I'm using spring boot, sleuth and the otel autoconfig. Everything is running in docker, and I see individual traces (via OTLP Grpc @ 4317). I've tried passing resource attrs in the env. Can you give me an idea as to what I should expect to see, and what I might be missing. Thanks.
    OTEL_RESOURCE_ATTRIBUTES: 'service:name=a_service'
    s
    a
    • 3
    • 4
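For reference, the OpenTelemetry specification defines `OTEL_RESOURCE_ATTRIBUTES` as a comma-separated list of `key=value` pairs, and the service name key is `service.name` (with a dot). The value quoted above uses `service:name`, which would set an attribute literally named `service:name`. Whether that explains the empty Services tab is not confirmed in this thread. A minimal sketch of the parsing rule, where `parse_resource_attributes` is a hypothetical helper and not an SDK function:

```python
from urllib.parse import unquote


def parse_resource_attributes(raw: str) -> dict[str, str]:
    """Parse a comma-separated list of key=value pairs into a dict."""
    attrs: dict[str, str] = {}
    for pair in raw.split(","):
        key, sep, value = pair.partition("=")
        if not sep or not key.strip():
            continue  # skip empty or malformed entries
        # Values may be percent-encoded, so decode them.
        attrs[key.strip()] = unquote(value.strip())
    return attrs


as_posted = parse_resource_attributes("service:name=a_service")
with_dot = parse_resource_attributes("service.name=a_service")
print("service.name" in as_posted)  # False
print("service.name" in with_dot)   # True
```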
  • e

    Edson F Cunha

    07/06/2022, 8:26 PM
    I'm trying to install via helm with a values.yaml config file, but I'm getting this error:
    efcunha@DevOps:~/k8s-cluster/signoz$ helm install --dry-run --debug signoz -n platform signoz/signoz -f values.yaml
    install.go:178: [debug] Original chart version: ""
    install.go:195: [debug] CHART PATH: /home/efcunha/.cache/helm/repository/signoz-0.1.3.tgz
    install.go:210: [debug] WARNING: This chart or one of its subcharts contains CRDs. Rendering may fail or contain inaccuracies.
    Error: INSTALLATION FAILED: template: signoz/templates/otel-collector/ingress.yaml:3:23: executing "signoz/templates/otel-collector/ingress.yaml" at <.Values.otelCollector.service.port>: nil pointer evaluating interface {}.port
    helm.go:84: [debug] template: signoz/templates/otel-collector/ingress.yaml:3:23: executing "signoz/templates/otel-collector/ingress.yaml" at <.Values.otelCollector.service.port>: nil pointer evaluating interface {}.port
    INSTALLATION FAILED
    main.newInstallCmd.func2
    	helm.sh/helm/v3/cmd/helm/install.go:127
    github.com/spf13/cobra.(*Command).execute
    	github.com/spf13/cobra@v1.4.0/command.go:856
    github.com/spf13/cobra.(*Command).ExecuteC
    	github.com/spf13/cobra@v1.4.0/command.go:974
    github.com/spf13/cobra.(*Command).Execute
    	github.com/spf13/cobra@v1.4.0/command.go:902
    main.main
    	helm.sh/helm/v3/cmd/helm/helm.go:83
    runtime.main
    	runtime/proc.go:255
    runtime.goexit
    	runtime/asm_amd64.s:1581
    p
    • 2
    • 12
  • v

    Vikash Kashyap

    07/07/2022, 4:38 AM
    Hi, can someone help me how to connect ClickHouse to Azure Blob storage for long term storage
  • s

    Shreyas Mishra

    07/07/2022, 11:30 AM
    Hey i am trying to send the tracing data(using postgres) to signoz using otelgorm i am following the example seen here(link) but not able to see it on signoz
    n
    v
    • 3
    • 16
  • s

    Sandeep Sarpe

    07/08/2022, 11:51 AM
    Hi All, I have installed SigNoz on my k8s cluster (v.0.8.0). I am getting the error below because my disk is full:
    2022.07.08 09:07:44.827179 [ 313 ] {57c20762-2e62-427b-ac38-8bd93a05eb5b} <Error> executeQuery: Code: 243. DB::Exception: Cannot reserve 1.00 MiB, not enough space. (NOT_ENOUGH_SPACE) (version 21.12.3.32 (official build)) (from 192.168.103.85:51576) (in query: INSERT INTO signoz_metrics.samples (fingerprint, timestamp_ms, value) VALUES), Stack trace (when copying this message, always include the lines below):
    I do not want to increase the storage space; instead I want to delete the older metrics and traces to free up disk space. I have tried to set the retention period to 7 days for metrics and traces, but it is giving the below error. How can I delete older metrics and traces from SigNoz? Can I delete them by SSHing into the container? Please help!
    ✅ 1
    p
    • 2
    • 4
  • s

    Sandeep Sarpe

    07/09/2022, 6:09 AM
    Hi All Is there any way to create Alert channel to send notification to email account? Please advise!
    a
    • 2
    • 2
  • s

    Shiwam Jaiswal

    07/09/2022, 10:26 AM
    Hi Team, I am having a hard time running query-service with alertmanager locally; please help me out if you know the issue. I am getting a connection refused error when I try to create a new channel. Here is my understanding of this issue so far; tell me if I am correct.
    alertmanager:
        image: signoz/alertmanager:0.23.0-0.1
        volumes:
          - ./data/alertmanager:/data
        expose:
          - "9093"
        ports:
          - "9093:9093"
        # depends_on:
        #   query-service:
        #     condition: service_healthy
        restart: on-failure
        command:
          - --queryService.url=172.17.0.1:8085
          - --storage.path=/data
    I have mapped my host port to dockers port both are same i.e. 9093. So my local alert manager will send alerts to this port which will be received by the application running on port 9093 in docker. Here I am trying to create a new alert from the frontend and then I am trying to ingest it into the DB, before it goes to the DB the following function gets triggered.
    apiError := r.alertManager.AddRoute(receiver)
    Here is AddRoute method's defn.
    func (m *manager) AddRoute(receiver *Receiver) *model.ApiError {
    
    	receiverString, _ := json.Marshal(receiver)
    
    	amURL := prepareAmChannelApiURL()
    	response, err := http.Post(amURL, contentType, bytes.NewBuffer(receiverString))
    
    	if err != nil {
    		zap.S().Errorf(fmt.Sprintf("Error in getting response of API call to alertmanager(POST %s)\n", amURL), err)
    		return &model.ApiError{Typ: model.ErrorInternal, Err: err}
    	}
    
    	if response.StatusCode > 299 {
    		err := fmt.Errorf(fmt.Sprintf("Error in getting 2xx response in API call to alertmanager(POST %s)\n", amURL), response.Status)
    		zap.S().Error(err)
    		return &model.ApiError{Typ: model.ErrorInternal, Err: err}
    	}
    	return nil
    }
    The error occurs at line
    response, err := http.Post(amURL, contentType, bytes.NewBuffer(receiverString))
    the amURL is as follows:
    http://localhost:9093/api/v1/routes
    I am getting the error as connection refused:
    accessJwt eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJlbWFpbCI6InNoaXdhbUBzaWdub3ouaW8iLCJleHAiOjE2NTczNTg3NTMsImdpZCI6ImMyMjk5NGM1LTU2ZWItNGRkYy04NmM0LWU4MTBjY2JjYmY0NSIsImlkIjoiZTZlZmZhMmUtZTc3NC00MDI5LWIyNjQtMDBlMTRlNTE4ZGE3In0.sZi3OYomyl28bb78hpWI0RdIESleXSst1CanBhb_4TA
    user  &{{e6effa2e-e774-4029-b264-00e14e518da7  shiwam@signoz.io  0   c22994c5-56eb-4ddc-86c4-e810ccbcbf45}  }
    2022-07-09T14:26:01.365+0530	ERROR	alertManager/manager.go:63	Error in getting response of API call to alertmanager(POST http://localhost:9093/api/v1/routes)
    %!(EXTRA *url.Error=Post "http://localhost:9093/api/v1/routes": dial tcp 127.0.0.1:9093: connect: connection refused)
    go.signoz.io/query-service/integrations/alertManager.(*manager).AddRoute
    	/home/ella/dev/signoz/pkg/query-service/integrations/alertManage
    Also there is this error that I see during the startup:
    s=2022-07-09T08:55:07.844081945Z caller=log.go:168 component=notifier level=error alertmanager=http://127.0.0.1:9093/api/v1/alerts count=0 msg="Error sending alert" err="Post \"http://127.0.0.1:9093/api/v1/alerts\": dial tcp 127.0.0.1:9093: connect: connection refused"
    2022-07-09T14:25:53.204+0530	DEBUG	auth/auth.go:294	Login method called for user:
    Setting the following env variable didn't help either: ALERTMANAGER_API_PREFIX=http://localhost:9093/api/
    ✅ 1
    p
    • 2
    • 1
  • s

    Shiwam Jaiswal

    07/10/2022, 7:27 PM
    Hi Team, is there any existing WebHook Url for slack which one can use for testing purposes?
    a
    • 2
    • 3
  • r

    Rishabh Tripathi

    07/11/2022, 5:09 AM
    My requirement is to get alerts if any service returns a status code <=199 or >=400. What should the expression be? I tried multiple expressions for alerts but didn't get the metrics:
    1. http_status_code
    2. status_code
    3. http_status_code
    4. probe_http_status_code
    The above expressions didn't work. Kindly help me.
    p
    s
    a
    • 4
    • 12
  • y

    Yuriy

    07/11/2022, 11:16 AM
    Hello, I installed SigNoz using docker compose and ran
    php 2-send-trace-to-collector.php
    (from https://github.com/SigNoz/sample-php-app), but I don't see any data in the panel (http://localhost:3301/application).
    p
    p
    a
    • 4
    • 8
  • b

    Blake Romano

    07/11/2022, 1:12 PM
    Does anyone have experience of setting up the Otel Collector to grab data from all instances from the kube-prom operator?
  • b

    Blake Romano

    07/11/2022, 1:21 PM
    Also, a different but related question: what is the difference between the Otel Collector and the Otel Metrics Collector in the Helm chart? Why not have one Otel Collector for everything? I am a bit confused about the distinction.
    a
    p
    • 3
    • 57
  • b

    blackmoja

    07/12/2022, 7:31 AM
    Hi All, I have a question. Is it possible to check userInteraction on SigNoz’s dashboard? The development environment is Next.js.
    p
    p
    +2
    • 5
    • 13
  • t

    Tamir Shkolnik

    07/12/2022, 12:09 PM
    Hi All, I managed to run and log in to the SigNoz app, but I cannot see my NestJS server data in SigNoz. I would like to set up a Console Exporter to make sure I'm actually generating telemetry data. Can anyone help me, please? Thanks in advance 🙂
    s
    • 2
    • 2
  • s

    Shashank Gupta

    07/12/2022, 5:48 PM
    I want to know where the generatorURL and externalURL point to in the payload we get for an alert generated in SigNoz.
    {
      "receiver": "w1",
      "status": "firing",
      "alerts": [
        {
          "status": "firing",
          "labels": {
            "alertname": "DiskRunningFull",
            "dev": "sda3",
            "instance": "example3",
            "severity": "critical"
          },
          "annotations": {
            "info": "The disk sda3 is running full",
            "summary": "please check the instance example1"
          },
          "startsAt": "2022-04-25T14:35:19.490146+05:30",
          "endsAt": "0001-01-01T00:00:00Z",
          "generatorURL": "",
          "fingerprint": "ad592b0afcbe2e79"
        }
      ],
      "groupLabels": {
        "alertname": "DiskRunningFull"
      },
      "commonLabels": {
        "alertname": "DiskRunningFull",
        "dev": "sda3",
        "instance": "example3",
        "severity": "critical"
      },
      "commonAnnotations": {
        "info": "The disk sda3 is running full",
        "summary": "please check the instance example1"
      },
      "externalURL": "http://Apples-MacBook-Pro-3.local:9093",
      "version": "4",
      "groupKey": "{}/{}:{alertname=\"DiskRunningFull\"}",
      "truncatedAlerts": 0
    }
    p
    a
    • 3
    • 7
  • s

    Shashank Gupta

    07/13/2022, 5:51 AM
    Hi All, in the alert payload we receive from SigNoz we have an array named alerts. Can someone share a scenario where we would have more than one object inside the alerts array?
    ✅ 2
    p
    • 2
    • 3
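As background for the question above: in upstream Alertmanager, alerts that share the same values for the route's `group_by` labels are batched into one notification, so the `alerts` array holds one entry per alert in that group. With `groupLabels` of `{"alertname": "DiskRunningFull"}`, as in the payload posted earlier, two disks or instances firing the same alert name would arrive together. A minimal sketch of that grouping idea, assuming SigNoz's bundled Alertmanager behaves like upstream here. `group_alerts` is a hypothetical name, the `sda1`, `example1` and `HighCPU` values are made up for illustration, and this is not Alertmanager's actual code:

```python
from collections import defaultdict


def group_alerts(alerts, group_by):
    """Bucket alerts by the values of the group_by labels."""
    groups = defaultdict(list)
    for alert in alerts:
        # Alerts with identical values for every group_by label share a key.
        key = tuple((name, alert["labels"].get(name, "")) for name in group_by)
        groups[key].append(alert)
    return dict(groups)


firing = [
    {"labels": {"alertname": "DiskRunningFull", "dev": "sda3", "instance": "example3"}},
    {"labels": {"alertname": "DiskRunningFull", "dev": "sda1", "instance": "example1"}},
    {"labels": {"alertname": "HighCPU", "instance": "example1"}},
]

for key, members in group_alerts(firing, ["alertname"]).items():
    print(dict(key), "->", len(members), "alert(s) in one payload")
```

Grouping by `alertname` puts both `DiskRunningFull` alerts in one bucket, which corresponds to one payload whose `alerts` array has two objects.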
  • a

    Alexei Zenin

    07/13/2022, 5:46 PM
    Are there any guides on how to use a clickhouse cluster with SigNoz? How should we configure the URL for the query service to use such that it is a HA setup (not just pointing to 1 clickhouse node) and any other components that need the URL?
    p
    • 2
    • 4
  • q

    Quyet Nguyen Duc

    07/14/2022, 5:27 AM
    Hi, I've installed the latest SigNoz helm chart and otelcollector no longer creates the signoz_traces database (it did the last time I tried)
    a
    p
    • 3
    • 20
  • q

    Quyet Nguyen Duc

    07/14/2022, 8:42 AM
    Hi, I still wasn't able to set SigNoz up: the query service returns "Table signoz_traces.signoz_index_v2 doesn't exist"
  • a

    Anil Kumar Bandrapalli

    07/14/2022, 10:38 AM
    Hi, we have a situation where we configured a SigNoz alert channel with an AWS Lambda function. We get emails when triggering it via the API or in the browser, but when we configure the same URL in the SigNoz alert channels, we do not get the email. This email is to be sent when the API takes more than 200 ms.
    a
    • 2
    • 5
  • a

    Anil Kumar Bandrapalli

    07/14/2022, 10:40 AM
    Alert script:
    - alert: High Latency of operation ProcessStart
      expr: histogram_quantile(0.99, sum(rate(signoz_latency_bucket{service_name="workflow-service", operation="/api/task/complete"}[1m])) by (le)) > 100
      for: 0m
      labels:
        severity: critical
      annotations:
        summary: High Latency of operation ProcessStart in Workflow Service
        description: "Latency is > 200 VALUE = {{ $value }} LABELS = {{ $labels }}"
    a
    p
    +5
    • 8
    • 102
a

Ankit Nayan

07/14/2022, 11:16 AM
try changing the interval to 2m or 5m in expr
histogram_quantile(0.99, sum(rate(signoz_latency_bucket{service_name="workflow-service", operation="/api/task/complete"}[5m])) by (le)) > 100
a

Anil Kumar Bandrapalli

07/14/2022, 11:22 AM
ok
i will try this and let you know
👍 1
its not working
any suggestions ?
a

Ankit Nayan

07/14/2022, 1:15 PM
this should be working
which version of signoz are you using?
a

Anil Kumar Bandrapalli

07/14/2022, 1:15 PM
v0.8.1
actually my scenario is when ever a http request taking more time i need to trigger that email
a

Ankit Nayan

07/14/2022, 1:22 PM
ahhh...I think you need to upgrade to v0.9.2 and follow migration docs to do that
I remember an issue with alerts being sent to channels in earlier versions
here are the migration docs https://signoz.io/docs/operate/migration/upgrade-0.9/
though we have a new release coming in a day or two that will make setting alerts seamless using charts and builders..a sneak peek
p

Priyansh

07/14/2022, 2:32 PM
glad there is a threshold limit line now 😅 which I had just mentioned in yesterday's query builder session. Kudos 🚀
r

Rahul Tiwari

07/15/2022, 5:59 AM
@Ankit Nayan we are getting below error while migrating signoz from 0.8.1 to 0.9
[ec2-user@ip-10-0-4-191 ~]$ kubectl -n platform run -i -t signoz-migrate-clickhouse --image=signoz/migrate:0.9-clickhouse \
-- -host=my-release-clickhouse -port=9000 -userName=admin -password=27ff0399-0d3a-4bd8-919d-17c2181e6fb9
kubectl run --generator=deployment/apps.v1 is DEPRECATED and will be removed in a future version. Use kubectl run --generator=run-pod/v1 or kubectl create instead.
If you don't see a command prompt, try pressing enter.
Writing samples to DB
2022/07/15 05:44:09 Error while writing samples to DB code: 60, message: Table signoz_metrics.samples_v2 doesn't exist
Session ended, resume using 'kubectl attach signoz-migrate-clickhouse-56767c457-sqpl2 -c signoz-migrate-clickhouse -i -t' command when the pod is running
[ec2-user@ip-10-0-4-191 ~]$
[ec2-user@ip-10-0-4-191 ~]$ kubectl logs signoz-migrate-clickhouse-56767c457-kfmdj -n platform -f
my-release-clickhouse 9000 admin 27ff0399-0d3a-4bd8-919d-17c2181e6fb9 signoz_metrics
Total Rows: 63262424
There are total 63262424 samples rows, starting migration...
Total Rows: 2555
There are total 2555 time series rows, starting migration...
Writing samples to DB
2022/07/15 05:58:08 Error while writing samples to DB code: 60, message: Table signoz_metrics.samples_v2 doesn't exist
[ec2-user@ip-10-0-4-191 ~]$
Can anyone help me on this.
a

Anil Kumar Bandrapalli

07/15/2022, 10:43 AM
Hi @Ankit Nayan, we upgraded to 9.2 but we can't trigger any alerts.
the same issue only
a

Ankit Nayan

07/15/2022, 11:44 AM
@Amol Umbark possible to look into this?
a

Amol Umbark

07/15/2022, 11:44 AM
yep on it
👍 1
a

Anil Kumar Bandrapalli

07/15/2022, 11:48 AM
@Amol Umbark FYI, my requirement is very simple: when the request time of one specific API crosses 100ms, I need to send an email.
Also, where can we get the full list of metrics like signoz_latency_bucket? @Ankit Nayan you are referring to the new release, right? When will it be released?
a

Amol Umbark

07/15/2022, 11:52 AM
@Anil Kumar Bandrapalli are you facing issue with this particular alert ‘High Latency of operation ProcessStart’ or all the alerts? Can you please share log of alert manager and query service? Do you see any alerts in triggered alerts when condition is met? If you do then we should focus on getting channel setup right. I am assuming your channel is working correctly (?). if you are not sure then please go to settings>>channels, pick a channel to edit and click Test. See if you receive a test message. Also, please try setting up a simple alert (may be system_cpu_load_average_15m > 0.15 ) and test that alert setup works.
a

Anil Kumar Bandrapalli

07/15/2022, 12:27 PM
@Amol Umbark i have tested my channel via test button. i am able to receive the mail. i will set up the simple alert that you mentioned
@Amol Umbark i am able to receive the alert for the sample one which you mentioned
Can you kindly look into what went wrong with the code below?
- alert: High Latency of operation ProcessStart
  expr: histogram_quantile(0.99, sum(rate(signoz_latency_bucket{service_name="workflow-service", operation="/api/task/complete"}[1m])) by (le)) > 50
  for: 1m
  labels:
    severity: critical
  annotations:
    summary: High Latency of operation ProcessStart in Workflow Service
    description: "Latency is > 200 VALUE = {{ $value }} LABELS = {{ $labels }}"
This is the alert manager log:
level=info ts=2022-07-15T09:21:48.402Z caller=main.go:237 msg="Starting Alertmanager" version="(version=0.23.0, branch=release/v0.23.0-0.1, revision=6f8c41aa660a379880af00d7b42fd8ed8af854bd)"
level=info ts=2022-07-15T09:21:48.403Z caller=main.go:238 build_context="(go=go1.18, user=ubuntu@ip-172-31-87-228, date=20220503-10:50:46)"
level=info ts=2022-07-15T09:21:48.405Z caller=cluster.go:184 component=cluster msg="setting advertise address explicitly" addr=10.0.1.11 port=9094
level=info ts=2022-07-15T09:21:48.407Z caller=cluster.go:679 component=cluster msg="Waiting for gossip to settle..." interval=2s
level=info ts=2022-07-15T09:21:48.702Z caller=coordinator.go:141 component=configuration msg="Loading a new configuration"
level=warn ts=2022-07-15T09:21:48.718Z caller=configLoader.go:61 component=configuration msg="No channels found in query service "
level=info ts=2022-07-15T09:21:48.718Z caller=coordinator.go:156 component=configuration msg="Completed loading of configuration file"
RouteOpts: {default-receiver map[alertname:{}] false 30s 5m0s 4h0m0s []}
RouteOpts: {default-receiver map[alertname:{}] false 30s 5m0s 4h0m0s []}
RouteOpts: {default-receiver map[alertname:{}] false 30s 5m0s 4h0m0s []}
level=info ts=2022-07-15T09:21:48.725Z caller=main.go:570 msg=Listening address=:9093
level=info ts=2022-07-15T09:21:48.726Z caller=tls_config.go:191 msg="TLS is disabled." http2=false
level=info ts=2022-07-15T09:21:50.408Z caller=cluster.go:704 component=cluster msg="gossip not settled" polls=0 before=0 now=1 elapsed=2.000953591s
level=info ts=2022-07-15T09:21:58.413Z caller=cluster.go:696 component=cluster msg="gossip settled; proceeding" elapsed=10.006777612s
RouteOpts: {default-receiver map[alertname:{}] false 30s 5m0s 4h0m0s []}
RouteOpts: {High Transaction Time Alert map[alertname:{}] false 30s 5m0s 4h0m0s []}
RouteOpts: {default-receiver map[alertname:{}] false 30s 5m0s 4h0m0s []}
RouteOpts: {High Transaction Time Alert map[alertname:{}] false 30s 5m0s 4h0m0s []}
RouteOpts: {default-receiver map[alertname:{}] false 30s 5m0s 4h0m0s []}
RouteOpts: {High Transaction Time Alert map[alertname:{}] false 30s 5m0s 4h0m0s []}
a

Ankit Nayan

07/15/2022, 12:56 PM
@Anil Kumar Bandrapalli are you able to plot this query in any sample dashboard panel? Does the chart show anything?
histogram_quantile(0.99, sum(rate(signoz_latency_bucket{service_name="workflow-service", operation="/api/task/complete"}[1m])) by (le))
a

Anil Kumar Bandrapalli

07/15/2022, 1:00 PM
Yes, PromQL is showing no data.
a

Ankit Nayan

07/15/2022, 1:01 PM
now change it to [2m]
do you see a chart now?
a

Anil Kumar Bandrapalli

07/15/2022, 1:06 PM
nope empty dash board
a

Ankit Nayan

07/15/2022, 1:09 PM
so you do not have the data to set an alert on
a

Anil Kumar Bandrapalli

07/15/2022, 1:10 PM
let me try this way . i will ignite a test for 15 mins and then i will check whether some data is populating or not
👍 1
a

Ankit Nayan

07/15/2022, 1:11 PM
are you using docker installation on 1 VM or k8s? you should exec -it into your clickhouse container and connect to db by running clickhouse client inside the container
a

Anil Kumar Bandrapalli

07/15/2022, 1:12 PM
we are running k8s i will do that
a

Ankit Nayan

07/15/2022, 1:13 PM
then
use signoz_metrics;
and
select * from time_series_v2 where metric_name='signoz_latency_bucket';
and try to search for rows which has workflow-service and /api/task/complete
unless you see a chart with the above query plotting with 2m time range..your alert won't work
a

Anil Kumar Bandrapalli

07/15/2022, 1:40 PM
When I logged into the clickhouse container and executed this command: curl -fO "https://packages.clickhouse.com/tgz/stable/clickhouse-client-22.6.3.35-amd64.tgz"
it showed permission denied.
When I run the clickhouse client command, it shows the error "command not found".
a

Amol Umbark

07/15/2022, 1:42 PM
there must be a client already in the container.. try
clickhouse client --host localhost --port 9000
a

Ankit Nayan

07/15/2022, 2:14 PM
@Prashant Shahi how can a user connect to clickhouse db in k8s?
p

Prashant Shahi

07/15/2022, 3:16 PM
Follow the commands below to connect to clickhouse pod:
kubectl -n platform exec -i --tty pod/chi-signoz-cluster-0-0-0 -- bash
Followed by:
clickhouse-client
a

Anil Kumar Bandrapalli

07/15/2022, 4:09 PM
Hi @Ankit Nayan, it is working. In the query I modified the operation value to POST /api/task/complete
and now it is firing alerts.
a

Ankit Nayan

07/15/2022, 4:32 PM
Cool 👍 the name needs to be an exact match
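To illustrate why the original query returned no data: a PromQL `=` matcher compares the whole label value, so `operation="/api/task/complete"` does not select a series whose label is `POST /api/task/complete`. A regex matcher (`=~`) is fully anchored in Prometheus, so it needs something like a leading `.*` to match a suffix. A minimal Python sketch of both rules. `matches_equal` and `matches_regex` are hypothetical names that only mimic the matching semantics, and Python's `re` is not the RE2 engine Prometheus uses, so this is an approximation for simple patterns:

```python
import re


def matches_equal(label_value: str, wanted: str) -> bool:
    """Mimic the `=` matcher: the whole label value must be identical."""
    return label_value == wanted


def matches_regex(label_value: str, pattern: str) -> bool:
    """Mimic the `=~` matcher: the pattern must match the entire value."""
    return re.fullmatch(pattern, label_value) is not None


stored = "POST /api/task/complete"

print(matches_equal(stored, "/api/task/complete"))       # False
print(matches_equal(stored, "POST /api/task/complete"))  # True
print(matches_regex(stored, "/api/task/complete"))       # False
print(matches_regex(stored, ".* /api/task/complete"))    # True
```

Looking up the stored label values first, as was done in ClickHouse earlier in this thread, is the more reliable way to find the exact string to match.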
a

Anil Kumar Bandrapalli

07/15/2022, 4:38 PM
Yes, I only got to know that when looking into the DB. Thanks a lot for helping me resolve this issue. I am excited to see the new version that you mentioned.
a

Ankit Nayan

07/15/2022, 4:51 PM
releasing this hour..would be great if you can try when you get time
a

Anil Kumar Bandrapalli

07/15/2022, 5:08 PM
sure
a

Ankit Nayan

07/16/2022, 10:56 AM
@Anil Kumar Bandrapalli https://github.com/SigNoz/signoz/releases/tag/v0.10.0 migration docs - https://signoz.io/docs/operate/migration/upgrade-0.10/ Let me know if you face any issues in the new alerts UI
r

Rahul Tiwari

07/18/2022, 6:10 AM
@Ankit Nayan and @Prashant Shahi I am getting the below error while upgrading SigNoz from 0.9.2 to 0.10
[ec2-user@ip-10-0-4-191 ~]$ k get pods -n platform
NAME                                                        READY   STATUS             RESTARTS   AGE
chi-signoz-cluster-0-0-0                                    1/1     Running            0          2d20h
clickhouse-operator-787f8989cd-kr52v                        2/2     Running            0          2d20h
my-release-signoz-alertmanager-0                            1/1     Running            0          2d20h
my-release-signoz-frontend-68b56fc4b8-zg6hl                 1/1     Running            0          2d20h
my-release-signoz-otel-collector-57d668b84c-zcbr5           1/1     Running            0          2d20h
my-release-signoz-otel-collector-metrics-59556558b5-7gks2   1/1     Running            0          2d20h
my-release-signoz-query-service-0                           1/1     Running            0          2d20h
my-release-zookeeper-0                                      1/1     Running            0          2d20h
signoz-migrate-846b558f6-s6bdg                              0/1     CrashLoopBackOff   7          13m
[ec2-user@ip-10-0-4-191 ~]$ k logs signoz-migrate-846b558f6-s6bdg -n platform
my-release-clickhouse 9000 admin 27ff0399-0d3a-4bd8-919d-17c2181e6fb9
No TTL found, skipping TTL migration
No data found in clickhouse
[ec2-user@ip-10-0-4-191 ~]$
p

Prashant Shahi

07/18/2022, 6:27 AM
My guess is that your migration script was already run once. You can delete the pod.
@Vishal Sharma if the migration script was already run, we should have exited with a 0 status code
v

Vishal Sharma

07/18/2022, 6:33 AM
@Prashant Shahi I see that there was no data in exceptions table so data was not found. @Rahul Tiwari Do you use exceptions feature? https://signoz.io/docs/userguide/exceptions/#viewing-exceptions
r

Rahul Tiwari

07/18/2022, 6:53 AM
i have attached the screen shot.
v

Vishal Sharma

07/18/2022, 6:54 AM
Then it’s fine, the migration script ran successfully as you are not using exceptions feature.
r

Rahul Tiwari

07/18/2022, 6:57 AM
@Vishal Sharma and @Prashant Shahi the signoz-migrate pod is going into crashloopbackoff state, with below error.
[ec2-user@ip-10-0-4-191 ~]$ k get pods -n platform
NAME                                                        READY   STATUS             RESTARTS   AGE
chi-signoz-cluster-0-0-0                                    1/1     Running            0          2d21h
clickhouse-operator-787f8989cd-kr52v                        2/2     Running            0          2d21h
my-release-signoz-alertmanager-0                            1/1     Running            0          2d21h
my-release-signoz-frontend-68b56fc4b8-zg6hl                 1/1     Running            0          2d21h
my-release-signoz-otel-collector-57d668b84c-zcbr5           1/1     Running            0          2d21h
my-release-signoz-otel-collector-metrics-59556558b5-7gks2   1/1     Running            0          2d21h
my-release-signoz-query-service-0                           1/1     Running            0          2d21h
my-release-zookeeper-0                                      1/1     Running            0          2d21h
signoz-migrate-846b558f6-s6bdg                              0/1     CrashLoopBackOff   16         61m
[ec2-user@ip-10-0-4-191 ~]$ k logs signoz-migrate-846b558f6-s6bdg -n platform
my-release-clickhouse 9000 admin 27ff0399-0d3a-4bd8-919d-17c2181e6fb9
No TTL found, skipping TTL migration
No data found in clickhouse
[ec2-user@ip-10-0-4-191 ~]$
v

Vishal Sharma

07/18/2022, 6:58 AM
@Rahul Tiwari You can delete migration pods with this command:
kubectl -n platform delete pod signoz-migrate
r

Rahul Tiwari

07/18/2022, 7:01 AM
@Vishal Sharma i tried deleting it but it is again giving the same error.
[ec2-user@ip-10-0-4-191 ~]$ k get pods -n platform
NAME                                                        READY   STATUS             RESTARTS   AGE
chi-signoz-cluster-0-0-0                                    1/1     Running            0          2d21h
clickhouse-operator-787f8989cd-kr52v                        2/2     Running            0          2d21h
my-release-signoz-alertmanager-0                            1/1     Running            0          2d21h
my-release-signoz-frontend-68b56fc4b8-zg6hl                 1/1     Running            0          2d21h
my-release-signoz-otel-collector-57d668b84c-zcbr5           1/1     Running            0          2d21h
my-release-signoz-otel-collector-metrics-59556558b5-7gks2   1/1     Running            0          2d21h
my-release-signoz-query-service-0                           1/1     Running            0          2d21h
my-release-zookeeper-0                                      1/1     Running            0          2d21h
signoz-migrate-846b558f6-p6rtb                              0/1     CrashLoopBackOff   3          81s
[ec2-user@ip-10-0-4-191 ~]$ k logs signoz-migrate-846b558f6-p6rtb -n platform
my-release-clickhouse 9000 admin 27ff0399-0d3a-4bd8-919d-17c2181e6fb9
No TTL found, skipping TTL migration
No data found in clickhouse
[ec2-user@ip-10-0-4-191 ~]$
@Vishal Sharma and @Prashant Shahi I have completely uninstalled SigNoz ver. 9.1 and installed 10.0. Thank you for your support.
👍 1
a

Anil Kumar Bandrapalli

07/18/2022, 1:08 PM
@Ankit Nayan, in the PromQL tab we entered this query: "histogram_quantile(0.99, sum(rate(signoz_latency_bucket{service_name="workflow-service", operation="POST /api/task/complete"}[1m])) by (le)) > 50". When saving, we got the error "at least one metric condition is required". Previously the same query used to work. Could you please help us solve this issue? We also tried to create the query using the query builder, but how can the "histogram_quantile" function be added to a query in the query builder?
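For readers unfamiliar with the function in the query above: `histogram_quantile` estimates a quantile from cumulative `le` buckets by linear interpolation inside the bucket that contains the target rank. Below is a minimal Python sketch of that idea. The bucket values are made up, and the code simplifies what Prometheus does (for example, it skips special handling of non-positive bounds); it is not the SigNoz or Prometheus implementation.

```python
import math
from typing import List, Tuple

def histogram_quantile(q: float, buckets: List[Tuple[float, float]]) -> float:
    """Estimate the q-quantile from (upper_bound, cumulative_count) buckets.

    Simplified sketch: expects a +Inf bucket and 0 <= q <= 1.
    """
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must be between 0 and 1")
    buckets = sorted(buckets)
    if len(buckets) < 2 or not math.isinf(buckets[-1][0]):
        return math.nan
    total = buckets[-1][1]
    if total == 0:
        return math.nan

    rank = q * total
    # Index of the first bucket whose cumulative count reaches the rank.
    idx = next(i for i, (_, count) in enumerate(buckets) if count >= rank)
    if idx == len(buckets) - 1:
        # Rank falls in the +Inf bucket: report the highest finite bound.
        return buckets[-2][0]

    upper, count = buckets[idx]
    lower, prev = buckets[idx - 1] if idx > 0 else (0.0, 0.0)
    if count == prev:
        return lower
    # Linear interpolation inside the selected bucket.
    return lower + (upper - lower) * (rank - prev) / (count - prev)

# Hypothetical latency buckets: (le, cumulative count)
example = [(10.0, 50.0), (50.0, 90.0), (100.0, 99.0), (math.inf, 100.0)]
print(histogram_quantile(0.5, example))   # 10.0
print(histogram_quantile(0.99, example))  # 100.0
```

Because the result is interpolated, its precision depends on how the bucket boundaries are laid out around the threshold being alerted on.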
a

Amol Umbark

07/18/2022, 1:11 PM
@Anil Kumar Bandrapalli when saving the rule you need to keep the PromQL tab active. On saving you should also see a message saying the query will be saved with the PromQL expression instead of the query builder. Can you please try this?
a

Anil Kumar Bandrapalli

07/18/2022, 1:13 PM
Yes, I did the same thing but I am still getting the same error.
a

Amol Umbark

07/18/2022, 1:14 PM
I will try to reproduce this, but meanwhile can you please create a new alert rule and proceed?
The issue could be a result of switching from PromQL to the query builder.
a

Anil Kumar Bandrapalli

07/18/2022, 1:14 PM
sure
a

Amol Umbark

07/18/2022, 1:15 PM
Also try to input just the metric query in the PromQL expression so the graph can be plotted. Once your graph looks good, add the threshold in the second step.
a

Anil Kumar Bandrapalli

07/18/2022, 1:15 PM
For the fresh alert it is showing the error "metric name is missing in A",
but I am in the PromQL tab only.
a

Amol Umbark

07/18/2022, 1:17 PM
that's unexpected. let me review and get back
👍 1
But are you able to plot the graph for the PromQL query?
a

Anil Kumar Bandrapalli

07/18/2022, 1:23 PM
Nope.
We are not able to save it.
With the query builder we are able to see the graph.
a

Amol Umbark

07/18/2022, 1:25 PM
To see the graph there is no need to save.
Can you please share a screenshot of your alert?
a

Anil Kumar Bandrapalli

07/18/2022, 1:27 PM
Sorry, the graph is showing.
a

Amol Umbark

07/18/2022, 1:27 PM
OK great, let me get back to you on the save issue.
a

Anil Kumar Bandrapalli

07/18/2022, 1:27 PM
But we are not able to save that alert.
a

Amol Umbark

07/18/2022, 1:28 PM
got it
Can you try selecting a metric name in the query builder, but keep the PromQL tab active right before you save?
Select a random metric; it should not matter.
a

Anil Kumar Bandrapalli

07/18/2022, 1:32 PM
ok
I am able to save the alert.
a

Amol Umbark

07/18/2022, 1:37 PM
Cool, I will resolve the metric name error issue.
a

Anil Kumar Bandrapalli

07/18/2022, 1:37 PM
OK
a

Ankit Nayan

07/18/2022, 3:54 PM
@Anil Kumar Bandrapalli
histogram_quantile(0.99, sum(rate(signoz_latency_bucket{service_name="workflow-service", operation="POST /api/task/complete"}[1m])) by (le))
Try changing the [1m] to [5m]. Does the chart plot now?
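One possible reason a wider range helps (an assumption, not confirmed in the thread) is that `rate()` needs at least two samples inside the range window. If the metric is scraped roughly every 60 seconds, a `[1m]` window often contains a single sample and returns no data. Below is a minimal Python sketch with made-up samples and a deliberately simplified rate, with no extrapolation and no counter-reset handling:

```python
from typing import List, Optional, Tuple

Sample = Tuple[float, float]  # (timestamp in seconds, counter value)

def simple_rate(samples: List[Sample], now: float, window_s: float) -> Optional[float]:
    """Per-second increase over the window (now - window_s, now].

    Simplified illustration only: returns None when fewer than two
    samples fall inside the window, which is the case of interest here.
    """
    in_window = [s for s in samples if now - window_s < s[0] <= now]
    if len(in_window) < 2:
        return None
    (t0, v0), (t1, v1) = in_window[0], in_window[-1]
    return (v1 - v0) / (t1 - t0)

# Hypothetical counter scraped every 60 seconds.
samples = [
    (0.0, 100.0), (60.0, 130.0), (120.0, 190.0),
    (180.0, 220.0), (240.0, 280.0), (300.0, 340.0),
]

print(simple_rate(samples, now=310.0, window_s=60.0))   # None: only one sample in range
print(simple_rate(samples, now=310.0, window_s=300.0))  # 0.875
```

A common rule of thumb is to choose a range several times larger than the scrape interval, so that a late or missed scrape does not leave the window with too few points.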
a

Anil Kumar Bandrapalli

07/18/2022, 5:48 PM
@Ankit Nayan it is working fine now. We are able to receive alerts. I have one more question: do you have an integration with the Camunda platform?
a

Ankit Nayan

07/18/2022, 6:14 PM
Never heard of Camunda... what do you want to do with the integration? I am curious!
a

Anil Kumar Bandrapalli

07/18/2022, 6:16 PM
We would like to integrate SigNoz with the Camunda platform to see the metrics and set alerts.
a

Ankit Nayan

07/18/2022, 6:19 PM
Does Camunda support a webhook receiver? You can use the webhook channel in SigNoz to send any alert to any webhook integration platform, like Zapier.
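As a rough illustration of the receiving side of a webhook channel, here is a minimal Python sketch of an HTTP endpoint that accepts a JSON alert payload. The payload shape (an `alerts` list whose items carry `status` and `labels.alertname`) is an assumption modeled on the Alertmanager webhook format, and the port and names are hypothetical; check the actual payload sent by your SigNoz version before relying on it.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Any, Dict, List

def summarize_alerts(payload: Dict[str, Any]) -> List[str]:
    """Turn an assumed Alertmanager-style payload into short text lines."""
    lines = []
    for alert in payload.get("alerts", []):
        labels = alert.get("labels", {})
        status = alert.get("status", "unknown")
        name = labels.get("alertname", "<unnamed>")
        lines.append(f"{status}: {name}")
    return lines

class AlertHandler(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        try:
            payload = json.loads(body)
        except ValueError:
            payload = None
        if not isinstance(payload, dict):
            # Reject bodies that are not a JSON object.
            self.send_response(400)
            self.end_headers()
            return
        for line in summarize_alerts(payload):
            print(line)  # a real receiver would forward this to the target system
        self.send_response(200)
        self.end_headers()

def run(port: int = 8080) -> None:
    """Start the receiver; 8080 is an arbitrary example port."""
    HTTPServer(("0.0.0.0", port), AlertHandler).serve_forever()
```

Calling `run()` starts the listener; the webhook channel would then be pointed at that host and port.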
a

Anil Kumar Bandrapalli

07/18/2022, 6:20 PM
You can get more info from this link: https://camunda.com/ We are actually working on workflows, like process flows in Jira.
Apart from alerts, can we integrate this and get metrics, like how we are able to see p99, top endpoints and so on?
Hi @Ankit Nayan, we were able to integrate SigNoz with a Tomcat Java application that uses MySQL as its DB. Now we are able to see the DB calls and traces as well. But we are seeing a question mark (?) in the db.statement. Can we get the exact value that is being passed to the query?
a

Ankit Nayan

07/25/2022, 9:20 AM
I am afraid I have not seen anybody using it like that. @Srikanth Chekuri do you have any idea if this can be enabled somewhere?
s

Srikanth Chekuri

07/25/2022, 10:16 AM
@Anil Kumar Bandrapalli The question marks will remain in the statement. There should be an optional flag to capture the params, but the Java instrumentation doesn't support it yet: https://github.com/open-telemetry/opentelemetry-java-instrumentation/issues/400.
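To illustrate why a recorded statement can show `?` while the query still runs with real values, here is a small Python and SQLite analogy. This is not the Java agent's code, and the table and column names are made up. With parameterized queries, the SQL text and the bound values travel separately, so anything that records only the SQL text sees the placeholders.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE task (id INTEGER PRIMARY KEY, status TEXT)")
conn.execute("INSERT INTO task (id, status) VALUES (?, ?)", (1, "complete"))

# The statement text contains a placeholder; the value is passed separately.
statement = "SELECT status FROM task WHERE id = ?"
params = (1,)
row = conn.execute(statement, params).fetchone()

# A tracer that records only the statement text (as assumed here)
# would store the placeholder, not the bound value.
recorded_db_statement = statement

print(recorded_db_statement)  # SELECT status FROM task WHERE id = ?
print(row[0])                 # complete
```

Keeping values out of the recorded statement also avoids leaking sensitive data into traces, which is a common reason parameter capture is opt-in where it exists.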