# support
m
Hi support, I still have a lot of duplicate data due to the PromQL dot to underscore migration. How should this be resolved in the future?
┌─metric_name───────────────────────────────────────────────┬─sample_count─┐
│ datastore_time_customer_rate                              │      4742473 │
│ datastore_time_customer:rate                              │      4386560 │
│ datastore_time_customer_seconds_bucket                    │      4376946 │
│ nginx_ingress_controller_request_duration_seconds_bucket  │      2759941 │
│ nginx_ingress_controller_request_duration_seconds.bucket  │      2739037 │
│ customer_adrequest_seconds_bucket_rate                    │      2626712 │
│ nginx_ingress_controller_response_size_bucket             │      2531205 │
│ nginx_ingress_controller_request_size_bucket              │      2529700 │
│ nginx_ingress_controller_request_size.bucket              │      2510255 │
│ nginx_ingress_controller_response_size.bucket             │      2509547 │
│ customer:adrequest_seconds_bucket:rate                    │      2439399 │
│ customer_user_check_count_total_rate                      │      2357282 │
│ customer:user_check_count_total:rate                      │      2229869 │
│ nginx_ingress_controller_response_duration_seconds_bucket │      2010263 │
│ nginx_ingress_controller_response_duration_seconds.bucket │      1994099 │
│ nginx_ingress_controller_connect_duration_seconds_bucket  │      1963446 │
│ nginx_ingress_controller_connect_duration_seconds.bucket  │      1947365 │
│ nginx_ingress_controller_bytes_sent_bucket                │      1840603 │
│ nginx_ingress_controller_bytes_sent.bucket                │      1825860 │
│ nginx_ingress_controller_header_duration_seconds_bucket   │      1764742 │
│ nginx_ingress_controller_header_duration_seconds.bucket   │      1750758 │
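A listing like the one above can be grouped to see which names are variants of the same metric. The following is only a minimal sketch: it assumes that `.` and `:` both map to `_` in the normalized name (which matches the pairs visible in the table, but is not confirmed as the exact rule SigNoz applies), and the helper names `normalize` and `find_duplicate_groups` are hypothetical.

```python
from collections import defaultdict


def normalize(name: str) -> str:
    """Assumed normalization: treat '.' and ':' as equivalent to '_'."""
    return name.replace(".", "_").replace(":", "_")


def find_duplicate_groups(counts: dict[str, int]) -> dict[str, dict[str, int]]:
    """Group metric names by normalized form; keep groups with more than one variant."""
    groups: dict[str, dict[str, int]] = defaultdict(dict)
    for name, sample_count in counts.items():
        groups[normalize(name)][name] = sample_count
    return {key: variants for key, variants in groups.items() if len(variants) > 1}


# A few rows copied from the table above.
sample = {
    "datastore_time_customer_rate": 4742473,
    "datastore_time_customer:rate": 4386560,
    "datastore_time_customer_seconds_bucket": 4376946,
    "nginx_ingress_controller_bytes_sent_bucket": 1840603,
    "nginx_ingress_controller_bytes_sent.bucket": 1825860,
}

dups = find_duplicate_groups(sample)
for key, variants in sorted(dups.items()):
    print(key, variants)
```

Metrics that appear under only one spelling, such as `datastore_time_customer_seconds_bucket` in this sample, are left out of the result.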
n
@Srikanth Chekuri, could you please check this out?
s
Hi @Matti, we expect to publish the docs for completing the migration next week as we are still figuring out some details.
m
Good to hear @Srikanth Chekuri, we have been a fairly heavy SigNoz user for quite some time now. I think this will reduce some load on our collectors eventually.
(these were metrics from the last hour, we get billions of metrics in total 🙂)
s
Correct, it is also expected to reduce the load on ClickHouse, as we are cleaning up the inefficient stuff lurking around.
m
@Srikanth Chekuri do you have an update on these docs? 🤔
n
Hey @Matti, we have released the docs, PTAL: https://signoz.io/docs/operate/migration/upgrade-0.88
m
ah, there is no reference to this in the changelogs
n
Yeah, will be releasing the announcement soon - we just released it today
m
it seems like a lot to do 😄
n
It depends on how you are running SigNoz 😅
m
To upgrade the dashboards/alerts etc. on the self-hosted version, how would we configure this? I'm not aware of an admin password 🤔
initContainers:
      - name: migration
        image: signoz/migrate:v0.70.5
        imagePullPolicy: IfNotPresent
        env:
          - name: SQL_DB_PATH
            value: /var/lib/signoz/signoz.db
          - name: CH_ADDR
            value: "your-clickhouse-service:9000" # Replace with your ClickHouse service
          - name: CH_DATABASE
            value: signoz_metrics
          - name: CH_USER
            value: admin
          - name: CH_PASS
            value: "your-password"  # Replace with your ClickHouse password
          - name: CH_MAX_OPEN_CONNS
            value: "10"
          - name: SKIP_METRICS_MAP
            value: "dd_internal_stats_payload=true"
          - name: CH_MAX_MEMORY_USAGE
            value: "8388608000"                        # 8 GB
          - name: CH_MAX_BYTES_BEFORE_EXTERNAL_GROUP_BY
            value: "4194304000"                        # 4 GB
          - name: CH_MAX_BYTES_BEFORE_EXTERNAL_SORT
            value: "4194304000"                        # 4 GB
          - name: NOT_FOUND_METRICS_MAP
            value: |-
              rpc_server_responses_per_rpc_bucket=rpc.server.responses_per_rpc.bucket
          - name: NOT_FOUND_ATTR_MAP
            value: |-
              http_scheme=http.scheme,
        args:
          - migrate-meta
        resources: {}   # add limits/requests as needed
        volumeMounts:
          - name: signoz-db
            mountPath: /var/lib/signoz
Please do not share the credentials here in the thread, and please check the value: if you haven't changed it explicitly, it will be the default one.
s
@Matti were you able to migrate? Do you need any help?
m
Still need to initiate the migration itself, the upgrade went fine though!
I will probably migrate on Monday, since we don't want to lose our alerts over the weekend 😄
Just triggered the init migration script, not much is happening 😕
signoz-0                                                   0/1     Init:1/2    0               4m55s
no logs from the init pod
s
It takes a bit of time to collect the mapping data from old to new
m
alright, will keep it running
maybe a good idea to write a log line saying that something has started 😄
{"level":"error","ts":1751883128.9652975,"caller":"migrate/main.go:1501","msg":"⚠️ skipping widget due to error: 1:115: parse error: bad number or duration syntax: \"\", for dashboards - 78595c13-fd95-402c-a429-b82410822661","stacktrace":"main.(*DashAlertsMigrator).applyReplacementsToDashboard\n\t/go/src/github.com/signoz/migrate/main.go:1501\nmain.(*DashAlertsMigrator).migrateDashboards\n\t/go/src/github.com/signoz/migrate/main.go:1107\nmain.migrateMeta\n\t/go/src/github.com/signoz/migrate/main.go:247\nmain.main\n\t/go/src/github.com/signoz/migrate/main.go:293\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:283"}
{"level":"error","ts":1751883128.970428,"caller":"migrate/main.go:1501","msg":"⚠️ skipping widget due to error: 1:120: parse error: bad number or duration syntax: \"\", for dashboards - 78595c13-fd95-402c-a429-b82410822661","stacktrace":"main.(*DashAlertsMigrator).applyReplacementsToDashboard\n\t/go/src/github.com/signoz/migrate/main.go:1501\nmain.(*DashAlertsMigrator).migrateDashboards\n\t/go/src/github.com/signoz/migrate/main.go:1107\nmain.migrateMeta\n\t/go/src/github.com/signoz/migrate/main.go:247\nmain.main\n\t/go/src/github.com/signoz/migrate/main.go:293\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:283"}
{"level":"error","ts":1751883128.977158,"caller":"migrate/main.go:1501","msg":"⚠️ skipping widget due to error: 1:133: parse error: bad number or duration syntax: \"\", for dashboards - 78595c13-fd95-402c-a429-b82410822661","stacktrace":"main.(*DashAlertsMigrator).applyReplacementsToDashboard\n\t/go/src/github.com/signoz/migrate/main.go:1501\nmain.(*DashAlertsMigrator).migrateDashboards\n\t/go/src/github.com/signoz/migrate/main.go:1107\nmain.migrateMeta\n\t/go/src/github.com/signoz/migrate/main.go:247\nmain.main\n\t/go/src/github.com/signoz/migrate/main.go:293\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:283"}
{"level":"error","ts":1751883128.9778268,"caller":"migrate/main.go:1501","msg":"⚠️ skipping widget due to error: 1:120: parse error: bad number or duration syntax: \"\", for dashboards - 78595c13-fd95-402c-a429-b82410822661","stacktrace":"main.(*DashAlertsMigrator).applyReplacementsToDashboard\n\t/go/src/github.com/signoz/migrate/main.go:1501\nmain.(*DashAlertsMigrator).migrateDashboards\n\t/go/src/github.com/signoz/migrate/main.go:1107\nmain.migrateMeta\n\t/go/src/github.com/signoz/migrate/main.go:247\nmain.main\n\t/go/src/github.com/signoz/migrate/main.go:293\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:283"}
{"level":"error","ts":1751883128.9783733,"caller":"migrate/main.go:1501","msg":"⚠️ skipping widget due to error: 1:131: parse error: bad number or duration syntax: \"\", for dashboards - 78595c13-fd95-402c-a429-b82410822661","stacktrace":"main.(*DashAlertsMigrator).applyReplacementsToDashboard\n\t/go/src/github.com/signoz/migrate/main.go:1501\nmain.(*DashAlertsMigrator).migrateDashboards\n\t/go/src/github.com/signoz/migrate/main.go:1107\nmain.migrateMeta\n\t/go/src/github.com/signoz/migrate/main.go:247\nmain.main\n\t/go/src/github.com/signoz/migrate/main.go:293\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:283"}
{"level":"error","ts":1751883128.9791212,"caller":"migrate/main.go:1501","msg":"⚠️ skipping widget due to error: 1:120: parse error: bad number or duration syntax: \"\", for dashboards - 78595c13-fd95-402c-a429-b82410822661","stacktrace":"main.(*DashAlertsMigrator).applyReplacementsToDashboard\n\t/go/src/github.com/signoz/migrate/main.go:1501\nmain.(*DashAlertsMigrator).migrateDashboards\n\t/go/src/github.com/signoz/migrate/main.go:1107\nmain.migrateMeta\n\t/go/src/github.com/signoz/migrate/main.go:247\nmain.main\n\t/go/src/github.com/signoz/migrate/main.go:293\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:283"}
{"level":"error","ts":1751883129.0604925,"caller":"migrate/main.go:1501","msg":"⚠️ skipping widget due to error: 1:116: parse error: bad number or duration syntax: \"\", for dashboards - 5db8d094-a0c9-42fe-93aa-2c68fe3c7ca0","stacktrace":"main.(*DashAlertsMigrator).applyReplacementsToDashboard\n\t/go/src/github.com/signoz/migrate/main.go:1501\nmain.(*DashAlertsMigrator).migrateDashboards\n\t/go/src/github.com/signoz/migrate/main.go:1107\nmain.migrateMeta\n\t/go/src/github.com/signoz/migrate/main.go:247\nmain.main\n\t/go/src/github.com/signoz/migrate/main.go:293\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:283"}
{"level":"error","ts":1751883129.07712,"caller":"migrate/main.go:1501","msg":"⚠️ skipping widget due to error: failed to parse query: line 1:19 unknown interval type: <\"metric_name\">\nextract(metric_name, 'app_002_exec_([a-zA-Z]+)_publish_time_response_time') AS customer,\n                   ^\n, for dashboards - 75c5db8a-f9be-4d5a-b742-97affc377e04","stacktrace":"main.(*DashAlertsMigrator).applyReplacementsToDashboard\n\t/go/src/github.com/signoz/migrate/main.go:1501\nmain.(*DashAlertsMigrator).migrateDashboards\n\t/go/src/github.com/signoz/migrate/main.go:1107\nmain.migrateMeta\n\t/go/src/github.com/signoz/migrate/main.go:247\nmain.main\n\t/go/src/github.com/signoz/migrate/main.go:293\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:283"}
{"level":"error","ts":1751883129.0772693,"caller":"migrate/main.go:1501","msg":"⚠️ skipping widget due to error: failed to parse query: line 6:22 unknown interval type: <\"ds\">\n            extract(ds.metric_name, 'app_002_exec_([a-zA-Z0-9]+)_publish_time_response_time') AS customer\n                      ^\n, for dashboards - 75c5db8a-f9be-4d5a-b742-97affc377e04","stacktrace":"main.(*DashAlertsMigrator).applyReplacementsToDashboard\n\t/go/src/github.com/signoz/migrate/main.go:1501\nmain.(*DashAlertsMigrator).migrateDashboards\n\t/go/src/github.com/signoz/migrate/main.go:1107\nmain.migrateMeta\n\t/go/src/github.com/signoz/migrate/main.go:247\nmain.main\n\t/go/src/github.com/signoz/migrate/main.go:293\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:283"}
{"level":"error","ts":1751883129.081397,"caller":"migrate/main.go:1501","msg":"⚠️ skipping widget due to error: failed to parse query: line 5:27 unknown interval type: <\"metric_name\">\n        extract(metric_name, 'app_002_exec_([a-zA-Z0-9]+)_publish_time_response_time') AS customer,\n                           ^\n, for dashboards - 75c5db8a-f9be-4d5a-b742-97affc377e04","stacktrace":"main.(*DashAlertsMigrator).applyReplacementsToDashboard\n\t/go/src/github.com/signoz/migrate/main.go:1501\nmain.(*DashAlertsMigrator).migrateDashboards\n\t/go/src/github.com/signoz/migrate/main.go:1107\nmain.migrateMeta\n\t/go/src/github.com/signoz/migrate/main.go:247\nmain.main\n\t/go/src/github.com/signoz/migrate/main.go:293\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:283"}
{"level":"error","ts":1751883129.9470227,"caller":"migrate/main.go:1109","msg":"error getting for dashboard-id: 1e550af5-84c4-42e8-a173-e994cd0e547a, for error  - error processing variables: failed to parse query: line 0:30 unknown interval type: <\"labels\">\nSELECT distinct extract(labels, 'customer\":\"([^,]+)\"') AS customer\n                              ^\n for file name - /var/lib/signoz/signoz.db","stacktrace":"main.(*DashAlertsMigrator).migrateDashboards\n\t/go/src/github.com/signoz/migrate/main.go:1109\nmain.migrateMeta\n\t/go/src/github.com/signoz/migrate/main.go:247\nmain.main\n\t/go/src/github.com/signoz/migrate/main.go:293\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:283"}
{"level":"error","ts":1751883130.0158603,"caller":"migrate/main.go:1501","msg":"⚠️ skipping widget due to error: failed to parse query: line 5:33 unknown interval type: <\"JSONExtractString\">\n        extract(JSONExtractString(labels, 'k8s.deployment.name'), 'fangorn-([^/-]+)') AS customer,\n                                 ^\n, for dashboards - a5fa9cb6-ebbc-4b0e-915d-1478f5552e5d","stacktrace":"main.(*DashAlertsMigrator).applyReplacementsToDashboard\n\t/go/src/github.com/signoz/migrate/main.go:1501\nmain.(*DashAlertsMigrator).migrateDashboards\n\t/go/src/github.com/signoz/migrate/main.go:1107\nmain.migrateMeta\n\t/go/src/github.com/signoz/migrate/main.go:247\nmain.main\n\t/go/src/github.com/signoz/migrate/main.go:293\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:283"}
{"level":"error","ts":1751883130.0164826,"caller":"migrate/main.go:1501","msg":"⚠️ skipping widget due to error: failed to parse query: line 2:21 unknown interval type: <\"JSONExtractString\">\n    JSONExtractString(labels, 'k8s_deployment_name'),\n                     ^\n, for dashboards - a5fa9cb6-ebbc-4b0e-915d-1478f5552e5d","stacktrace":"main.(*DashAlertsMigrator).applyReplacementsToDashboard\n\t/go/src/github.com/signoz/migrate/main.go:1501\nmain.(*DashAlertsMigrator).migrateDashboards\n\t/go/src/github.com/signoz/migrate/main.go:1107\nmain.migrateMeta\n\t/go/src/github.com/signoz/migrate/main.go:247\nmain.main\n\t/go/src/github.com/signoz/migrate/main.go:293\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:283"}
✅ updated 38 dashboards in signoz.db
✅ updated 62 rules in /var/lib/signoz/signoz.db
seems pretty good for most of them, I will update the dotenv variable and I guess they should start working again then
s
Yes, and for those dashboards where migration failed, please review them once and update anything needed.
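To get a review list out of migration output like the log above, the JSON error lines can be tallied per dashboard ID. This is a rough sketch only: it assumes the two message shapes seen in this thread (`for dashboards - <id>` and `dashboard-id: <id>`), the function name `failed_dashboards` is hypothetical, and the sample lines are shortened versions of the real log entries.

```python
import json
import re
from collections import Counter

UUID = r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}"
# Matches the two message shapes seen in the migration log in this thread.
DASHBOARD_RE = re.compile(rf"(?:for dashboards - |dashboard-id: )({UUID})")


def failed_dashboards(lines):
    """Count error-level log entries per dashboard ID; non-JSON lines are skipped."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line.startswith("{"):
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue
        if entry.get("level") != "error":
            continue
        match = DASHBOARD_RE.search(entry.get("msg", ""))
        if match:
            counts[match.group(1)] += 1
    return counts


# Shortened sample lines modeled on the log output above.
sample_lines = [
    '{"level":"error","msg":"skipping widget due to error: 1:115: parse error, for dashboards - 78595c13-fd95-402c-a429-b82410822661"}',
    '{"level":"error","msg":"skipping widget due to error: 1:120: parse error, for dashboards - 78595c13-fd95-402c-a429-b82410822661"}',
    '{"level":"error","msg":"error getting for dashboard-id: 1e550af5-84c4-42e8-a173-e994cd0e547a, for error  - error processing variables"}',
    "updated 38 dashboards in signoz.db",
]

counts = failed_dashboards(sample_lines)
for dashboard_id, errors in counts.most_common():
    print(dashboard_id, errors)
```

In practice the lines could come from the init container's log output saved to a file and read line by line.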
m
great, thanks Srikanth!
our ClickHouse cluster should also start releasing storage over the next few days due to _ metrics being emitted?
s
As of the release before last, the CPU usage of ClickHouse should be cut down by half. In the next release we will remove the old exporter completely, which will reduce storage and CPU for ClickHouse by another half. And for the collector there should be a 30% reduction.
m
ah perfect, makes sense, thanks!