Number of bars in top bar which shows total log co...
# support
p
The bar chart at the top, which shows the total log count, reports 6 logs, but only 4 logs are listed below it. I have shards in ClickHouse. Re-running the query also returns a different set of logs every time. Helm chart version: 0.56.1, app version: 0.58.1
Screenshot 2024-12-04 at 11.51.01 AM.png
s
How many shards do you have? Please also share the query you ran.
p
2 shards. How do I share the query? I'm using the UI for querying.
s
Do you have any replicas?
p
Yup, 2 shards and 2 replicas.
s
That may be it. When you run the query in the UI, check the query-service logs and share the SQL query that was executed.
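One way to pull the executed SQL out of the query-service logs is to keep only the JSON log lines that carry a `query` field. This is a minimal sketch, assuming the logs are one JSON object per line; the function name and the shortened sample lines are hypothetical.

```python
import json

def extract_queries(log_lines):
    """Return (func_name, query) pairs from JSON log lines that contain a "query" field."""
    found = []
    for line in log_lines:
        line = line.strip()
        if not line.startswith("{"):
            continue  # skip non-JSON lines
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip truncated or malformed lines
        if isinstance(entry, dict) and "query" in entry:
            found.append((entry.get("func_name", ""), entry["query"]))
    return found

# Hypothetical, shortened sample lines for illustration only.
sample = [
    '{"level":"INFO","msg":"Elapsed time","func_name":"GetListResultV3","query":"SELECT 1"}',
    '{"level":"INFO","msg":"/api/v3/query_range","timeTaken":619}',
]
for func_name, query in extract_queries(sample):
    print(func_name, "->", query)
```

Saving the query-service log output to a file and passing its lines to `extract_queries` would list each executed statement next to the function that ran it.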
p
{"level":"INFO","timestamp":"2024-12-04T13:52:50.731Z","caller":"querycache/query_range_cache.go:64","msg":"Number of non-overlapping cached series data","count":1}
{"level":"INFO","timestamp":"2024-12-04T13:52:50.731Z","caller":"querier/helper.go:122","msg":"cache misses for logs query","misses":[{"Start":1733318520000,"End":1733320369000}]}
{"level":"INFO","timestamp":"2024-12-04T13:52:50.792Z","caller":"utils/time.go:17","msg":"Elapsed time","func_name":"GetTimeSeriesResultV3","duration":61,"path":"/logs/logs-explorer","dashboardID":"","query":"SELECT toStartOfInterval(fromUnixTimestamp64Nano(timestamp), INTERVAL 240 SECOND) AS ts, severity_text as `severity_text`, toFloat64(count(*)) as value from signoz_logs.distributed_logs where (timestamp >= 1733318520000000000 AND timestamp <= 1733320369000000000) AND resources_string_value[indexOf(resources_string_key, 'k8s.container.name')] = 'prices' AND resources_string_value[indexOf(resources_string_key, 'k8s.namespace.name')] = 'staging' AND lower(body) LIKE lower('%Breakout for NSE:NIFTY 50%') group by `severity_text`,ts order by value DESC","alertID":"","source":"logs-explorer","client":"browser","viewName":"","servicesTab":""}
{"level":"INFO","timestamp":"2024-12-04T13:52:50.793Z","caller":"app/server.go:396","msg":"/api/v3/query_range","timeTaken":136,"path":"/api/v3/query_range"}
{"level":"INFO","timestamp":"2024-12-04T13:52:51.275Z","caller":"utils/time.go:17","msg":"Elapsed time","func_name":"GetListResultV3","duration":417,"viewName":"","servicesTab":"","path":"/logs/logs-explorer","dashboardID":"","alertID":"","query":"SELECT timestamp, id, trace_id, span_id, trace_flags, severity_text, severity_number, scope_name, scope_version, body,CAST((attributes_string_key, attributes_string_value), 'Map(String, String)') as  attributes_string,CAST((attributes_int64_key, attributes_int64_value), 'Map(String, Int64)') as  attributes_int64,CAST((attributes_float64_key, attributes_float64_value), 'Map(String, Float64)') as  attributes_float64,CAST((attributes_bool_key, attributes_bool_value), 'Map(String, Bool)') as  attributes_bool,CAST((resources_string_key, resources_string_value), 'Map(String, String)') as resources_string,CAST((scope_string_key, scope_string_value), 'Map(String, String)') as scope from signoz_logs.distributed_logs where (timestamp >= 1733233969000000000 AND timestamp <= 1733320369000000000) AND resources_string_value[indexOf(resources_string_key, 'k8s.container.name')] = 'prices' AND resources_string_value[indexOf(resources_string_key, 'k8s.namespace.name')] = 'staging' AND lower(body) LIKE lower('%Breakout for NSE:NIFTY 50%') order by timestamp desc LIMIT 100","source":"logs-explorer","client":"browser"}
{"level":"INFO","timestamp":"2024-12-04T13:52:51.276Z","caller":"app/server.go:396","msg":"/api/v3/query_range","timeTaken":619,"path":"/api/v3/query_range"}
{"level":"INFO","timestamp":"2024-12-04T13:52:51.375Z","caller":"app/server.go:396","msg":"/api/v1/event","timeTaken":0,"path":"/api/v1/event"}
s
SELECT toStartOfInterval(fromUnixTimestamp64Nano(timestamp), INTERVAL 240 SECOND) AS ts, severity_text as `severity_text`, toFloat64(count(*)) as value from signoz_logs.distributed_logs where (timestamp >= 1733318520000000000 AND timestamp <= 1733320369000000000) AND resources_string_value[indexOf(resources_string_key, 'k8s.container.name')] = 'prices' AND resources_string_value[indexOf(resources_string_key, 'k8s.namespace.name')] = 'staging' AND lower(body) LIKE lower('%Breakout for NSE:NIFTY 50%') group by `severity_text`,ts order by value DESC
SELECT timestamp, id, trace_id, span_id, trace_flags, severity_text, severity_number, scope_name, scope_version, body,CAST((attributes_string_key, attributes_string_value), 'Map(String, String)') as  attributes_string,CAST((attributes_int64_key, attributes_int64_value), 'Map(String, Int64)') as  attributes_int64,CAST((attributes_float64_key, attributes_float64_value), 'Map(String, Float64)') as  attributes_float64,CAST((attributes_bool_key, attributes_bool_value), 'Map(String, Bool)') as  attributes_bool,CAST((resources_string_key, resources_string_value), 'Map(String, String)') as resources_string,CAST((scope_string_key, scope_string_value), 'Map(String, String)') as scope from signoz_logs.distributed_logs where (timestamp >= 1733233969000000000 AND timestamp <= 1733320369000000000) AND resources_string_value[indexOf(resources_string_key, 'k8s.container.name')] = 'prices' AND resources_string_value[indexOf(resources_string_key, 'k8s.namespace.name')] = 'staging' AND lower(body) LIKE lower('%Breakout for NSE:NIFTY 50%') order by timestamp desc LIMIT 100
Exec into the replicas and check whether these queries return the same result.
If you don't see the same result in both replicas, then it's a data problem that needs investigation.
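The two queries do not cover the same time window, which is worth keeping in mind when comparing the chart with the list. Here is a small sketch, assuming the bounds are Unix timestamps in nanoseconds (which the use of `fromUnixTimestamp64Nano` in the first query suggests); the variable and function names are illustrative.

```python
from datetime import datetime, timezone

NS_PER_SECOND = 1_000_000_000

def ns_to_utc(ns):
    """Convert a Unix timestamp in nanoseconds to a UTC datetime (second precision)."""
    return datetime.fromtimestamp(ns // NS_PER_SECOND, tz=timezone.utc)

# Bounds copied from the two queries above.
chart_start, chart_end = 1733318520000000000, 1733320369000000000
list_start, list_end = 1733233969000000000, 1733320369000000000

chart_seconds = (chart_end - chart_start) // NS_PER_SECOND
list_seconds = (list_end - list_start) // NS_PER_SECOND

print("chart:", ns_to_utc(chart_start), "to", ns_to_utc(chart_end), f"({chart_seconds} s)")
print("list: ", ns_to_utc(list_start), "to", ns_to_utc(list_end), f"({list_seconds} s)")
```

The logged chart query spans 1849 seconds (about 31 minutes), while the list query spans 86400 seconds (24 hours). The `cache misses for logs query` entry in the logs lists the same start and end as the chart query, so the logged chart query likely covers only the uncached part of the selected range.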
p
Can I call you on Slack to show the issue?
s
I would say let's run the queries and see if they give the same result first.
p
ohk
let me check
I'm getting this error: Code: 81. DB::Exception: Database signoz_logs does not exist. (UNKNOWN_DATABASE)
s
What dbs does this replica have?
SHOW databases;
p
│ INFORMATION_SCHEMA │
│ default            │
│ information_schema │
│ system             │
s
This has no SigNoz schemas and data isn't written here. How was this added?
p
It's deployed using Helm.
Is there any other way to connect to ClickHouse?
I connected to a pod of clickhouse-cluster and used the command
clickhouse
s
Can you share output of get pods?
p
Screenshot 2024-12-04 at 7.36.40 PM.png
s
0-0-0 and 1-0-0 are one replica and the remaining pods are another replica. Which replica did you run the query on?
Please exec into each pod and see which ones have the DBs and schemas created and which don't.
p
I ran it in all of them.
All of these have the same DBs.
s
That doesn't sound right. Are you saying no pod has the SigNoz schemas?
p
Yeah, I'm also not sure why that is.
wait
s
How did you exec into the pod, and what did you run to enter the client interface?
p
I ran
clickhouse
and then ran
show databases
clickhouse client
should be run instead?
s
yes
Exec into any one shard of each replica and see if they give the same result.
p
<a>-<b>-0 what is a and what is b which is shard and which is replica ?
in 0-0-0 -- got 5 rows
in 0-1-0 -- got 3 rows
in 1-0-0 -- got 4 rows
in 1-1-0 -- got 3 rows
This is for the 2nd query above.
s
0-0-0; 1-0-0 = replica0
0-1-0; 1-1-0 = replica1
The data in the two replicas is not the same. And, surprisingly, even within the same replica the query gives different results on different shards.
Can you rerun it in replica0 and see the results?
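To make the comparison easier to read, the per-pod counts reported above can be grouped by replica. This is a minimal sketch, assuming the pod suffix has the form `<shard>-<replica>-0` as implied by the mapping above; the helper name is hypothetical.

```python
from collections import defaultdict

# Row counts reported in this thread for the second (list) query, keyed by pod suffix.
rows_by_pod = {"0-0-0": 5, "0-1-0": 3, "1-0-0": 4, "1-1-0": 3}

def group_by_replica(counts):
    """Group row counts by replica index, assuming suffixes look like <shard>-<replica>-0."""
    grouped = defaultdict(dict)
    for pod, rows in counts.items():
        shard, replica, _ = pod.split("-")
        grouped[int(replica)][int(shard)] = rows
    return {replica: dict(shards) for replica, shards in grouped.items()}

for replica, shards in sorted(group_by_replica(rows_by_pod).items()):
    consistent = len(set(shards.values())) == 1
    print(f"replica{replica}: {shards} consistent={consistent}")
```

Each count is the result of the distributed query as run from that pod, not the number of rows stored on it, so pods in the same replica would be expected to agree.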
p
Got 4 rows in 1-0-0.
I re-ran it again.
s
What do you get in 0-0-0?
p
5 rows
Actually, running this same query multiple times in 0-0-0 also gives 4 rows sometimes and 5 rows at other times.
4 rows in set. Elapsed: 0.526 sec. Processed 30.89 thousand rows, 67.04 MB (58.77 thousand rows/s., 127.54 MB/s.)
5 rows in set. Elapsed: 0.386 sec. Processed 31.64 thousand rows, 66.59 MB (81.90 thousand rows/s., 172.39 MB/s.)
The same pod gives these results across multiple runs.
Query:
SELECT
    timestamp,
    id,
    trace_id,
    span_id,
    trace_flags,
    severity_text,
    severity_number,
    scope_name,
    scope_version,
    body,
    CAST((attributes_string_key, attributes_string_value), 'Map(String, String)') AS attributes_string,
    CAST((attributes_int64_key, attributes_int64_value), 'Map(String, Int64)') AS attributes_int64,
    CAST((attributes_float64_key, attributes_float64_value), 'Map(String, Float64)') AS attributes_float64,
    CAST((attributes_bool_key, attributes_bool_value), 'Map(String, Bool)') AS attributes_bool,
    CAST((resources_string_key, resources_string_value), 'Map(String, String)') AS resources_string,
    CAST((scope_string_key, scope_string_value), 'Map(String, String)') AS scope
FROM signoz_logs.distributed_logs
WHERE ((timestamp >= 1733233969000000000) AND (timestamp <= 1733320369000000000)) AND ((resources_string_value[indexOf(resources_string_key, 'k8s.container.name')]) = 'prices') AND ((resources_string_value[indexOf(resources_string_key, 'k8s.namespace.name')]) = 'staging') AND (lower(body) LIKE lower('%Breakout for NSE:NIFTY 50%'))
ORDER BY timestamp DESC
LIMIT 100
s
Interesting. Have you configured any additional ClickHouse settings?
p
No
s
I don't have anything off the top of my head for this case. I need to check the ClickHouse repo to see whether it's a known issue and why it occurs.
p
oh ohk
But sometimes I see that multiple shards are also not being aggregated when querying.
s
Do you mean the query is fetching data from the other shard of the same replica?
p
I mean that when running from the SigNoz web UI, I get different results across runs:
sometimes 2 rows, sometimes 3, sometimes 5 rows.
Ideally, I think this query should return 8 rows:
5 + 3
s
No. A request goes to one of the replicas. Each replica is supposed to maintain the same data, and the data is distributed between the shards within the same replica. The result should be consistent, and it should be 5 or whichever count is correct. All the prepared queries run on the distributed table.
p
ohk
s
It is one issue when data is not the same between replicas; here, you get 3 rows for replica1, which is consistent. However, that's not the full data (as there are more rows in the other replica). Why some rows are missing in replica1 needs a separate investigation. There is another issue at hand: even within replica0, the query on the distributed table gives different results on different runs, which is strange and unexpected.
Now, assume we run two queries, one for the chart and one for the list rows. We don't control which query goes where, so you see a random combination of results between the chart and the list.
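The effect described above can be illustrated with a toy model. The sketch below assumes that the replicas of a shard have diverged and that each query on the distributed table reads every shard from one of its replicas, chosen independently per query. The row ids and their placement are made up for illustration and are not taken from this cluster.

```python
from itertools import product

# Hypothetical placement: rows held by each (shard, replica). Replica 1 is missing rows.
rows = {
    (0, 0): {"r1", "r2", "r3"},
    (0, 1): {"r1", "r2"},
    (1, 0): {"r4", "r5"},
    (1, 1): {"r4"},
}
SHARDS = (0, 1)
REPLICAS = (0, 1)

def result_count(replica_per_shard):
    """Rows returned when shard i is read from replica replica_per_shard[i]."""
    return sum(len(rows[(shard, replica)]) for shard, replica in zip(SHARDS, replica_per_shard))

# Every way one query can be routed, and the row count it would return.
possible_counts = sorted({result_count(pick) for pick in product(REPLICAS, repeat=len(SHARDS))})
print("possible counts for one query:", possible_counts)

# The chart and the list are separate queries, so their counts can pair up in any combination.
mismatched = [(chart, listed) for chart in possible_counts
              for listed in possible_counts if chart != listed]
print("mismatched (chart, list) pairs:", mismatched)
```

In this toy setup a single query can return 3, 4, or 5 rows depending on routing, so under these assumptions the inconsistency would come from the diverged replicas rather than from the query text.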
p
ohk
But this has been observed multiple times; I often run the query several times to make sure that I get the data I am looking for.