# support
h
I recently upgraded the SigNoz chart from 0.53.1 to 0.87.1 and ran into
{"level":"error","timestamp":"...","caller":"opamp/opamp_server.go:117","msg":"Failed to find or create agent","agentID":"...","error":"cannot create agent without orgId", ...}
errors. It also seems I'm suddenly running enterprise SigNoz. Is that the reason I get these errors, and if so, how do I change to the community edition?
I changed signoz.image.repository to signoz/signoz-community, and now it seems I've got the community edition, but I still get:
{
  "level": "error",
  "timestamp": "2025-07-18T13:19:22.619Z",
  "caller": "opamp/opamp_server.go:117",
  "msg": "Failed to find or create agent",
  "agentID": "01981d7c-4b43-7a92-b78e-e2e8f59592fd",
  "error": "cannot create agent without orgId",
  "errorVerbose": "cannot create agent without orgId\<http://ngithub.com/SigNoz/signoz/pkg/query-service/app/opamp/model.(*Agents).FindOrCreateAgent|ngithub.com/SigNoz/signoz/pkg/query-service/app/opamp/model.(*Agents).FindOrCreateAgent>\n\t/home/runner/work/signoz/signoz/pkg/query-service/app/opamp/model/agents.go:91\ngithub.com/SigNoz/signoz/pkg/query-service/app/opamp.(*Server).OnMessage\n\t/home/runner/work/signoz/signoz/pkg/query-service/app/opamp/opamp_server.go:115\ngithub.com/open-telemetry/opamp-go/server.(*server).handleWSConnection\n\t/home/runner/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.19.0/server/serverimpl.go:253\nruntime.goexit\n\t/opt/hostedtoolcache/go/1.23.10/x64/src/runtime/asm_amd64.s:1700",
  "stacktrace": "<http://github.com/SigNoz/signoz/pkg/query-service/app/opamp.(*Server).OnMessage|github.com/SigNoz/signoz/pkg/query-service/app/opamp.(*Server).OnMessage>\n\t/home/runner/work/signoz/signoz/pkg/query-service/app/opamp/opamp_server.go:117\ngithub.com/open-telemetry/opamp-go/server.(*server).handleWSConnection\n\t/home/runner/go/pkg/mod/github.com/open-telemetry/opamp-go@v0.19.0/server/serverimpl.go:253"
}
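For reference, a minimal sketch of switching the main container to the community image with Helm. Only signoz.image.repository=signoz/signoz-community comes from this thread; the release name, namespace, and use of --reuse-values are placeholders for illustration:
# Sketch: point the main signoz container at the community image
# <release> and <namespace> are placeholders for your installation
helm upgrade <release> signoz/signoz -n <namespace> \
  --reuse-values \
  --set signoz.image.repository=signoz/signoz-community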
v
Is the container still failing?
There is no issue with using signoz/signoz.
h
oh, isn't there? yeah, I still got that error
v
Do you have some time now to debug it together?
h
it depends on how long that takes, but maybe yeah
v
Since you mentioned you were on 0.53.1, did you happen to follow this: https://signoz.io/docs/operate/upgrade/
h
no, I just changed the chart version 🙂
v
signoz:
  initContainers:
    migration:
      enabled: true
      image:
        registry: docker.io
        repository: busybox
        tag: 1.35
      command:
        - /bin/sh
        - -c
        - |
          echo "Running migration..."
          cp -pv /var/lib/old-signoz/signoz.db /var/lib/signoz/signoz.db
          echo "Migration complete..."
      additionalVolumes:
        - name: old-signoz-db
          persistentVolumeClaim:
            claimName: signoz-db-my-release-signoz-query-service-0
      additionalVolumeMounts:
        - name: old-signoz-db
          mountPath: /var/lib/old-signoz
Running this should be the first step
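A hedged sketch of how that first step might be wired into the upgrade, assuming the values snippet above is saved as override-values.yaml and the chart comes from the usual signoz Helm repo (release name, namespace, and file name are placeholders):
# Sketch: upgrade with the migration init container enabled
helm repo update
helm upgrade <release> signoz/signoz -n <namespace> \
  --version 0.87.1 \
  -f override-values.yaml
# Then watch the copy happen (the init container name may differ; check kubectl describe pod)
kubectl -n <namespace> logs signoz-0 -c signoz-migration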
h
but the k8s chart has schema migrators, do I still need it?
signoz-schema-migrator-async-cscbq            0/1     Completed   0          39m
signoz-schema-migrator-sync-9q7w6             0/1     Completed   0          40m
v
Yes, this is for the main signoz container. Starting with 0.76.0, we have merged 3 components into 1.
h
ok, working on getting that into my Helm values
v
You can just skim through this guide: https://signoz.io/docs/operate/migration/upgrade-0.76 to understand all implications.
h
I can't run that migration
at 17:40:40 ❯ kubectl get event
LAST SEEN   TYPE      REASON             OBJECT                                  MESSAGE
65m         Warning   FailedScheduling   pod/signoz-0                            0/6 nodes are available: 1 Insufficient cpu, 5 node(s) had volume node affinity conflict. preemption: 0/6 nodes are available: 1 No preemption victims found for incoming pod, 5 Preemption is not helpful for scheduling.
60m         Warning   FailedScheduling   pod/signoz-0                            0/6 nodes are available: 1 Insufficient cpu, 5 node(s) had volume node affinity conflict. preemption: 0/6 nodes are available: 1 No preemption victims found for incoming pod, 5 Preemption is not helpful for scheduling.
60m         Warning   FailedScheduling   pod/signoz-0                            0/6 nodes are available: 1 Insufficient cpu, 5 node(s) had volume node affinity conflict. preemption: 0/6 nodes are available: 1 No preemption victims found for incoming pod, 5 Preemption is not helpful for scheduling.
60m         Warning   FailedScheduling   pod/signoz-0                            0/6 nodes are available: 1 Insufficient cpu, 5 node(s) had volume node affinity conflict. preemption: 0/6 nodes are available: 1 No preemption victims found for incoming pod, 5 Preemption is not helpful for scheduling.
31m         Warning   FailedScheduling   pod/signoz-0                            0/6 nodes are available: 1 Insufficient cpu, 5 node(s) had volume node affinity conflict. preemption: 0/6 nodes are available: 1 No preemption victims found for incoming pod, 5 Preemption is not helpful for scheduling.
27m         Warning   FailedScheduling   pod/signoz-0                            0/6 nodes are available: 1 Insufficient cpu, 5 node(s) had volume node affinity conflict. preemption: 0/6 nodes are available: 1 No preemption victims found for incoming pod, 5 Preemption is not helpful for scheduling.
20m         Warning   FailedScheduling   pod/signoz-0                            0/6 nodes are available: 1 Insufficient cpu, 5 node(s) had volume node affinity conflict. preemption: 0/6 nodes are available: 1 No preemption victims found for incoming pod, 5 Preemption is not helpful for scheduling.
19m         Warning   FailedScheduling   pod/signoz-0                            0/6 nodes are available: 1 Insufficient cpu, 5 node(s) had volume node affinity conflict. preemption: 0/6 nodes are available: 1 No preemption victims found for incoming pod, 5 Preemption is not helpful for scheduling.
9m10s       Warning   FailedScheduling   pod/signoz-0                            0/6 nodes are available: 1 Insufficient cpu, 5 node(s) had volume node affinity conflict. preemption: 0/6 nodes are available: 1 No preemption victims found for incoming pod, 5 Preemption is not helpful for scheduling.
if I reference the claim signoz-db-signoz-query-service-0
v
There is a volume conflict: the new volume and the old volume have an affinity conflict. Maybe they got created in separate AZs?
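One quick way to check that, as a sketch (the claim name is the one mentioned above; the zone label is the standard topology.kubernetes.io/zone, add -n <namespace> if needed):
# Find the PV behind the old query-service claim and inspect its zone affinity
PV=$(kubectl get pvc signoz-db-signoz-query-service-0 -o jsonpath='{.spec.volumeName}')
kubectl get pv "$PV" -o yaml | grep -A6 nodeAffinity
# Compare against the zones your nodes actually sit in
kubectl get nodes -L topology.kubernetes.io/zone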
h
Yeah, I know we have nodes in different AZs. I hadn't thought about a PV being limited to one AZ.
We have spot nodes, so not much control over them.
I got signoz.db copied over now, which resulted in this error:
{"level":"warn","timestamp":"2025-07-18T18:10:27.511Z","caller":"query-service/main.go:130","msg":"No JWT secret key is specified."}
{"timestamp":"2025-07-18T18:10:27.51233567Z","level":"INFO","code":{"function":"github.com/SigNoz/signoz/pkg/signoz.New","file":"/home/runner/work/signoz/signoz/pkg/signoz/signoz.go","line":75},"msg":"starting signoz","version":"v0.90.1","variant":"enterprise","commit":"cf4e44d","branch":"v0.90.1","go":"go1.23.10","timestamp":"2025-07-16T11:36:18Z"}
{"timestamp":"2025-07-18T18:10:27.512671898Z","level":"INFO","code":{"function":"github.com/SigNoz/signoz/pkg/sqlstore/sqlitesqlstore.New","file":"/home/runner/work/signoz/signoz/pkg/sqlstore/sqlitesqlstore/provider.go","line":44},"msg":"connected to sqlite","logger":"github.com/SigNoz/signoz/pkg/sqlitesqlstore","path":"/var/lib/signoz/signoz.db"}
{"timestamp":"2025-07-18T18:10:27.512804954Z","level":"ERROR","code":{"function":"github.com/prometheus/prometheus/promql.NewActiveQueryTracker","file":"/home/runner/go/pkg/mod/github.com/prometheus/prometheus@v0.304.1/promql/query_logger.go","line":137},"msg":"Failed to create directory for logging active queries","logger":"github.com/SigNoz/signoz/pkg/prometheus/clickhouseprometheus"}
{"timestamp":"2025-07-18T18:10:27.513444163Z","level":"INFO","code":{"function":"github.com/SigNoz/signoz/pkg/sqlmigrator.(*migrator).Migrate","file":"/home/runner/work/signoz/signoz/pkg/sqlmigrator/migrator.go","line":43},"msg":"starting sqlstore migrations","logger":"github.com/SigNoz/signoz/pkg/sqlmigrator","dialect":"sqlite"}
{"timestamp":"2025-07-18T18:10:27.520243311Z","level":"INFO","code":{"function":"github.com/SigNoz/signoz/pkg/sqlmigrator.(*migrator).Lock","file":"/home/runner/work/signoz/signoz/pkg/sqlmigrator/migrator.go","line":90},"msg":"acquired migration lock","logger":"github.com/SigNoz/signoz/pkg/sqlmigrator","dialect":"sqlite"}
{"level":"fatal","timestamp":"2025-07-18T18:10:27.529Z","caller":"query-service/main.go:162","msg":"Failed to create signoz","error":"no such table: notification_channel","stacktrace":"main.main\n\t/home/runner/work/signoz/signoz/ee/query-service/main.go:162\nruntime.main\n\t/opt/hostedtoolcache/go/1.23.10/x64/src/runtime/proc.go:272"}
I downgraded to version 76 and that works, but then I tried upgrading to the newest again, and it failed with:
{"timestamp":"2025-07-18T19:52:06.509236749Z","level":"INFO","code":{"function":"<http://github.com/SigNoz/signoz/pkg/signoz.New|github.com/SigNoz/signoz/pkg/signoz.New>","file":"/home/runner/work/signoz/signoz/pkg/signoz/signoz.go","line":75},"msg":"starting signoz","version":"v0.90.1","variant":"enterprise","commit":"cf4e44d","branch":"v0.90.1","go":"go1.23.10","timestamp":"2025-07-16T11:36:18Z"}
{"timestamp":"2025-07-18T19:52:06.509922005Z","level":"INFO","code":{"function":"<http://github.com/SigNoz/signoz/pkg/sqlstore/sqlitesqlstore.New|github.com/SigNoz/signoz/pkg/sqlstore/sqlitesqlstore.New>","file":"/home/runner/work/signoz/signoz/pkg/sqlstore/sqlitesqlstore/provider.go","line":44},"msg":"connected to sqlite","logger":"<http://github.com/SigNoz/signoz/pkg/sqlitesqlstore|github.com/SigNoz/signoz/pkg/sqlitesqlstore>","path":"/var/lib/signoz/signoz.db"}
{"timestamp":"2025-07-18T19:52:06.51013338Z","level":"ERROR","code":{"function":"<http://github.com/prometheus/prometheus/promql.NewActiveQueryTracker|github.com/prometheus/prometheus/promql.NewActiveQueryTracker>","file":"/home/runner/go/pkg/mod/github.com/prometheus/prometheus@v0.304.1/promql/query_logger.go","line":137},"msg":"Failed to create directory for logging active queries","logger":"<http://github.com/SigNoz/signoz/pkg/prometheus/clickhouseprometheus|github.com/SigNoz/signoz/pkg/prometheus/clickhouseprometheus>"}
{"timestamp":"2025-07-18T19:52:06.511047132Z","level":"INFO","code":{"function":"<http://github.com/SigNoz/signoz/pkg/sqlmigrator.(*migrator).Migrate|github.com/SigNoz/signoz/pkg/sqlmigrator.(*migrator).Migrate>","file":"/home/runner/work/signoz/signoz/pkg/sqlmigrator/migrator.go","line":43},"msg":"starting sqlstore migrations","logger":"<http://github.com/SigNoz/signoz/pkg/sqlmigrator|github.com/SigNoz/signoz/pkg/sqlmigrator>","dialect":"sqlite"}
{"timestamp":"2025-07-18T19:52:06.51875841Z","level":"INFO","code":{"function":"<http://github.com/SigNoz/signoz/pkg/sqlmigrator.(*migrator).Lock|github.com/SigNoz/signoz/pkg/sqlmigrator.(*migrator).Lock>","file":"/home/runner/work/signoz/signoz/pkg/sqlmigrator/migrator.go","line":90},"msg":"acquired migration lock","logger":"<http://github.com/SigNoz/signoz/pkg/sqlmigrator|github.com/SigNoz/signoz/pkg/sqlmigrator>","dialect":"sqlite"}
{"level":"fatal","timestamp":"2025-07-18T19:52:06.531Z","caller":"query-service/main.go:162","msg":"Failed to create signoz","error":"invalid UUID format","stacktrace":"main.main\n\t/home/runner/work/signoz/signoz/ee/query-service/main.go:162\nruntime.main\n\t/opt/hostedtoolcache/go/1.23.10/x64/src/runtime/proc.go:272"}
so I downgraded to 76 again, and that works. I guess I'll fix the invalid UUID another time
oh, never mind, I couldn't downgrade to 76 now:
{"timestamp":"2025-07-18T19:58:40.419235371Z","level":"INFO","code":{"function":"<http://go.signoz.io/signoz/pkg/sqlstore/sqlitesqlstore.New|go.signoz.io/signoz/pkg/sqlstore/sqlitesqlstore.New>","file":"/home/runner/work/signoz/signoz/pkg/sqlstore/sqlitesqlstore/provider.go","line":45},"msg":"connected to sqlite","logger":"<http://go.signoz.io/signoz/pkg/sqlitesqlstore|go.signoz.io/signoz/pkg/sqlitesqlstore>","path":"/var/lib/signoz/signoz.db"}
{"timestamp":"2025-07-18T19:58:40.419527221Z","level":"INFO","code":{"function":"<http://go.signoz.io/signoz/pkg/sqlmigrator.(*migrator).Migrate|go.signoz.io/signoz/pkg/sqlmigrator.(*migrator).Migrate>","file":"/home/runner/work/signoz/signoz/pkg/sqlmigrator/migrator.go","line":43},"msg":"starting sqlstore migrations","logger":"<http://go.signoz.io/signoz/pkg/sqlmigrator|go.signoz.io/signoz/pkg/sqlmigrator>","dialect":"sqlite"}
{"timestamp":"2025-07-18T19:58:40.426551612Z","level":"INFO","code":{"function":"<http://go.signoz.io/signoz/pkg/sqlmigrator.(*migrator).Lock|go.signoz.io/signoz/pkg/sqlmigrator.(*migrator).Lock>","file":"/home/runner/work/signoz/signoz/pkg/sqlmigrator/migrator.go","line":90},"msg":"acquired migration lock","logger":"<http://go.signoz.io/signoz/pkg/sqlmigrator|go.signoz.io/signoz/pkg/sqlmigrator>","dialect":"sqlite"}
{"timestamp":"2025-07-18T19:58:40.426908796Z","level":"INFO","code":{"function":"<http://go.signoz.io/signoz/pkg/sqlmigrator.(*migrator).Migrate|go.signoz.io/signoz/pkg/sqlmigrator.(*migrator).Migrate>","file":"/home/runner/work/signoz/signoz/pkg/sqlmigrator/migrator.go","line":60},"msg":"no new migrations to run (database is up to date)","logger":"<http://go.signoz.io/signoz/pkg/sqlmigrator|go.signoz.io/signoz/pkg/sqlmigrator>","dialect":"sqlite"}
{"level":"WARN","timestamp":"2025-07-18T19:58:40.432Z","caller":"query-service/main.go:167","msg":"No JWT secret key is specified."}
2025/07/18 19:58:40 WARN: bun: Organization.IsAnonymous has unknown tag option: "CHECK(is_anonymous IN (0"
2025/07/18 19:58:40 WARN: bun: Organization.IsAnonymous has unknown tag option: "1))"
2025/07/18 19:58:40 WARN: bun: Organization.HasOptedUpdates has unknown tag option: "1))"
2025/07/18 19:58:40 WARN: bun: Organization.HasOptedUpdates has unknown tag option: "CHECK(has_opted_updates IN (0"
{"level":"FATAL","timestamp":"2025-07-18T19:58:40.576Z","caller":"query-service/main.go:196","msg":"Failed to create server","error":"no such column: organization.is_anonymous","stacktrace":"main.main\n\t/home/runner/work/signoz/signoz/ee/query-service/main.go:196\nruntime.main\n\t/opt/hostedtoolcache/go/1.22.12/x64/src/runtime/proc.go:271"}
I guess I'll just go for the newest again, and have it be broken until I fix it some other day
v
Can you dump the output of SELECT * FROM migration; from the sqlite db?
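In case it helps, a sketch of running that in place, assuming sqlite3 exists inside the image (otherwise kubectl cp the DB out and query it locally):
# Dump the sqlmigrator bookkeeping table from the signoz pod
kubectl exec -it signoz-0 -- sqlite3 /var/lib/signoz/signoz.db 'SELECT * FROM migration;'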
h
1|000|1|2025-07-18 18:10:03
2|001|1|2025-07-18 18:10:03
3|002|1|2025-07-18 18:10:03
4|003|1|2025-07-18 18:10:03
5|004|1|2025-07-18 18:10:03
6|005|1|2025-07-18 18:10:03
7|006|1|2025-07-18 18:10:03
8|007|1|2025-07-18 18:10:03
9|008|1|2025-07-18 18:10:03
10|009|1|2025-07-18 18:10:03
11|011|1|2025-07-18 18:10:04
12|012|1|2025-07-18 18:10:04
13|013|1|2025-07-18 18:10:04
14|014|2|2025-07-18 19:28:58
15|015|2|2025-07-18 19:28:58
16|016|3|2025-07-18 19:50:26
17|017|3|2025-07-18 19:50:26
18|018|3|2025-07-18 19:50:26
19|019|3|2025-07-18 19:50:26
20|020|3|2025-07-18 19:50:26
21|021|3|2025-07-18 19:50:26
22|022|3|2025-07-18 19:50:26
23|023|3|2025-07-18 19:50:26
24|024|3|2025-07-18 19:50:26
25|025|3|2025-07-18 19:50:26
26|026|3|2025-07-18 19:50:26
27|027|3|2025-07-18 19:50:26
28|028|3|2025-07-18 19:50:26
29|029|3|2025-07-18 19:50:26
30|030|3|2025-07-18 19:50:26
31|031|3|2025-07-18 19:50:26
32|032|3|2025-07-18 19:50:26
33|033|3|2025-07-18 19:50:26
34|034|3|2025-07-18 19:50:26
35|035|3|2025-07-18 19:50:26
36|036|3|2025-07-18 19:50:26
37|037|3|2025-07-18 19:50:26
and here is the initContainers section from the pod spec:
initContainers:
      - command:
        - sh
        - -c
        - until wget --user "${CLICKHOUSE_USER}:${CLICKHOUSE_PASSWORD}" --spider -q
          signoz-clickhouse:8123/ping; do echo -e "waiting for clickhouseDB"; sleep
          5; done; echo -e "clickhouse ready, starting query service now";
        env:
        - name: CLICKHOUSE_HOST
          value: signoz-clickhouse
        - name: CLICKHOUSE_PORT
          value: "9000"
        - name: CLICKHOUSE_HTTP_PORT
          value: "8123"
        - name: CLICKHOUSE_CLUSTER
          value: cluster
        - name: CLICKHOUSE_USER
          value: admin
        - name: CLICKHOUSE_PASSWORD
          value: ****
        - name: CLICKHOUSE_SECURE
          value: "false"
        image: docker.io/busybox:1.35
        imagePullPolicy: IfNotPresent
        name: signoz-init
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      - env:
        - name: CLICKHOUSE_HOST
          value: signoz-clickhouse
        - name: CLICKHOUSE_PORT
          value: "9000"
        - name: CLICKHOUSE_HTTP_PORT
          value: "8123"
        - name: CLICKHOUSE_CLUSTER
          value: cluster
        - name: CLICKHOUSE_USER
          value: admin
        - name: CLICKHOUSE_PASSWORD
          value: *****
        - name: CLICKHOUSE_SECURE
          value: "false"
        image: docker.io/busybox:1.35
        imagePullPolicy: IfNotPresent
        name: signoz-migration
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /var/lib/signoz
          name: signoz-db
The 2nd init container seems to have an empty command, is that on purpose?
v
There is a dashboard with an invalid UUID. Please run SELECT uuid FROM dashboards to identify the invalid UUID, then run DELETE FROM dashboards WHERE uuid = ? to remove that particular dashboard.
Then you can take it to the latest version.
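A sketch of narrowing it down without eyeballing every row; the table and column names come from the queries in this thread, and the length/character checks are only a heuristic for the canonical UUID shape:
# Flag dashboard rows whose uuid is not 36 chars or contains a non-hex, non-dash character
sqlite3 /var/lib/signoz/signoz.db \
  "SELECT uuid FROM dashboards WHERE length(uuid) != 36 OR uuid GLOB '*[^0-9a-fA-F-]*';"
# After backing up signoz.db, delete the offending row (<bad-uuid> is a placeholder)
sqlite3 /var/lib/signoz/signoz.db "DELETE FROM dashboards WHERE uuid = '<bad-uuid>';"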
h
sqlite> select uuid from dashboards;
4b5315a4-0697-4fce-8c98-b30f40a4b004
5171b8e8-76cc-4068-857d-206b1b98954f
5171b8e8-76cc-4068-857d-206b1x98954f
583af360-0ed8-4a3f-a129-fb239e05e1a1
6482ccd0-1252-45b0-ade5-43f7605a6d98
6492ccd0-1252-45b0-ade5-43f7605a6d98
a208430f-7012-4d8f-b156-078f0a58764d
c5e8bf93-f4a9-44a5-b8b6-76452e2b2638
e4e98472-d9b8-4495-a85d-70f635fc7140
e4e98472-d9b8-4495-a85d-70f635fc7141
they kinda all seem ok... oh, two of the same at the end, maybe that's it. or only slightly the same
v
6482ccd0-1252-45b0-ade5-43f7605a6d98
6492ccd0-1252-45b0-ade5-43f7605a6d98
No wait these are different as well
Interesting
h
are there other tables with a uuid?
hm, seems it was one of those... not sure which. I deleted all dashboards (after a backup) and now it runs
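For completeness, a sketch of that backup-then-delete route (paths are placeholders; the important part is copying the whole signoz.db before touching it):
# Take an online backup of the SQLite store, then drop all dashboards
sqlite3 /var/lib/signoz/signoz.db ".backup /var/lib/signoz/signoz-backup.db"
sqlite3 /var/lib/signoz/signoz.db "DELETE FROM dashboards;"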
v
Nicee, which version did you upgrade to?
h
0.90.1
v
Nice!
h
might just recreate the dashboards, as migrating them seemed difficult for now
the new version looks very nice btw 🙂