# general
oluchi orji:
Hello SigNoz team, I noticed that after a while our Services UI goes blank. We have set up retention with an S3 bucket; what could actually be wrong?
Ankit Nayan:
How many replicas of query-service are defined? It should be 1.
Do the services appear and disappear, or have they never appeared after adding S3?
@oluchi orji
oluchi orji:
Hello @Ankit Nayan, thanks for your response. 1. We have just one replica of the SigNoz query-service. 2. They disappear, and they reappear after we uninstall and reinstall SigNoz (the S3 setup and annotations are added in the values.yaml file).
Ankit Nayan:
Okay... can you share the ClickHouse logs? I am guessing that if the S3 connection fails, ClickHouse doesn't show any data. Also, can you check whether the size of the data in S3 is increasing?
oluchi orji:
One second, let me do the checks 👇 @Ankit Nayan
```
worker.go:445:dropReplicas():start:infra/signoz-clickhouse/e9e59dca-39f7-4444-91e9-5fb092c9daa1:drop replicas based on AP
I0205 00:16:26.531186       1 worker.go:462] worker.go:462:dropReplicas():end:infra/signoz-clickhouse/e9e59dca-39f7-4444-91e9-5fb092c9daa1:processed replicas: 0
I0205 00:16:26.531219       1 worker.go:419] includeStopped():infra/signoz-clickhouse/e9e59dca-39f7-4444-91e9-5fb092c9daa1:add CHI to monitoring
I0205 00:16:26.802933       1 worker.go:485] infra/signoz-clickhouse/9ca4c129-c258-425d-80b1-a956508a0752:IPs of the CHI [*****]
I0205 00:16:26.815881       1 worker.go:489] infra/signoz-clickhouse/342fa60b-416a-4027-ae25-6de4bca505b7:Update users IPS
I0205 00:16:27.042605       1 worker.go:505] markReconcileComplete():infra/signoz-clickhouse/e9e59dca-39f7-4444-91e9-5fb092c9daa1:reconcile completed
I0215 20:17:43.965089       1 controller.go:309] infra/signoz-clickhouse:endpointsInformer.UpdateFunc: IP ASSIGNED: []v1.EndpointSubset{
  v1.EndpointSubset{
    Addresses: []v1.EndpointAddress{
      v1.EndpointAddress{
        IP: "172.********",
        Hostname: "",
        NodeName: &"ip-*******l",
        TargetRef: nil,
      },
    },
    NotReadyAddresses: nil,
    Ports: []v1.EndpointPort{
      v1.EndpointPort{
        Name: "http",
        Port: 8123,
        Protocol: "TCP",
        AppProtocol: nil,
      },
      v1.EndpointPort{
        Name: "tcp",
        Port: 9000,
        Protocol: "TCP",
        AppProtocol: nil,
      },
    },
  },
}
I0215 20:17:44.020501       1 worker.go:299] infra/signoz-clickhouse/f48fbf51-ff72-45f1-abd8-96a17e4f8191:IPs of the CHI [*******]
I0215 20:17:44.026758       1 worker.go:303] infra/signoz-clickhouse/9afb9ed0-a38e-44a2-a57d-598971239d44:Update users IPS
I0215 20:17:44.035005       1 worker.go:1645] updateConfigMap():infra/signoz-clickhouse/9afb9ed0-a38e-44a2-a57d-598971239d44:Update ConfigMap infra/chi-signoz-clickhouse-common-usersd
```
Ankit Nayan:
This does not have much useful information. Can you grep by `s3`?
Also, can you check the size of S3 to see whether it is receiving data?
oluchi orji:
Checking... @Ankit Nayan
No useful info came up with `s3`, except the following @Ankit Nayan
```
{e899fee7-1eea-4e3f-b6dc-6e7bd6141071} <Error> TCPHandler: Code: 243. DB::Exception: Cannot reserve 1.00 MiB, not enough space. (NOT_ENOUGH_SPACE), Stack trace (when copying this message, always include the lines below):
```
Ankit Nayan:
How much space is left on the disk?
cc: @Prashant Shahi what's the default config? Maybe we want to change the ClickHouse defaults for better operation at scale.
@oluchi orji any idea how much data you were trying to ingest?
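As an aside for readers following along: free disk space can be checked from inside ClickHouse via the standard `system.disks` system table. This query is a generic sketch, not a command from the thread:

```sql
-- Free vs. total space (in bytes) for every disk ClickHouse knows about
SELECT name, path, free_space, total_space
FROM system.disks;
```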
oluchi orji:
One second, checking now @Ankit Nayan
Ankit Nayan:
And this message is also temporary; it gets fixed once heavy ingestion is over. Can you check the time of the error?
oluchi orji:
The time of the error is an hour ago.
About 10 GB is still left @Ankit Nayan
Ankit Nayan:
https://github.com/SigNoz/signoz/issues/2272 might be related. I will let @Srikanth Chekuri dive deeper into the issue.
oluchi orji:
Alright @Ankit Nayan, thank you for your time!
Srikanth Chekuri:
@oluchi orji Can you share your S3 configuration? Our retention is currently based on the span timestamp, and only then does it move the data to cold storage. However, you need to move the data based on disk availability. Did you configure the `move_factor`? What is the approximate ingestion estimate?
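For context, the timestamp-based retention Srikanth describes corresponds to a ClickHouse table TTL that relocates old parts to a cold-storage volume. A generic sketch follows; the volume name and exact DDL are assumptions for illustration, not SigNoz's actual statement:

```sql
-- Move parts whose spans are older than 7 days to the volume named 's3'
-- in the table's storage policy (volume name is illustrative)
ALTER TABLE signoz_traces.signoz_index_v2
    MODIFY TTL toDateTime(timestamp) + toIntervalDay(7) TO VOLUME 's3';
```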
Prashant Shahi:
```
{e899fee7-1eea-4e3f-b6dc-6e7bd6141071} <Error> TCPHandler: Code: 243. DB::Exception: Cannot reserve 1.00 MiB, not enough space. (NOT_ENOUGH_SPACE), Stack trace (when copying this message, always include the lines below):
```
I have seen this error occur when there is not enough storage for the ClickHouse storage PVC, i.e. the `/var/lib/clickhouse` mount.
But yeah, do share your S3 configuration so that we can have a look at it.
oluchi orji:
My default cold storage setup @Prashant Shahi
```yaml
clickhouse:
  cloud: aws
  installCustomStorageClass: false
  persistence:
    size: 30Gi
  # Cold storage configuration
  coldStorage:
    enabled: true
    defaultKeepFreeSpaceBytes: "10485760"
```
S3 config:
```json
{
  "Statement": [
    {
      "Action": [
        "s3:GetObject",
        "s3:GetObjectVersion",
        "s3:PutBucketVersioning",
        "s3:PutObject"
      ],
      "Effect": "Allow",
      "Resource": [
        "arn:aws:s3:::<bucket name>",
        "arn:aws:s3:::<bucket_name>/*"
      ]
    }
  ],
  "Version": "2012-10-17"
}
```
Srikanth Chekuri:
`defaultKeepFreeSpaceBytes` is used to reserve some free space on a disk, but it doesn't move any data. What was your `move_factor`?
oluchi orji:
Is `move_factor` a value in the values.yaml file?
Srikanth Chekuri:
I see this is unavailable in our charts, but I believe you could override it, and I think that's the reason you are not seeing services. Your disk space is getting filled, but the default retention (7 days) is based on the timestamp of the span, so data will not move for a week, and you haven't set up any `move_factor` (i.e. the fraction of free disk space that should always exist; when free space drops below this threshold, ClickHouse moves data to cold storage).
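For reference, `move_factor` lives in a ClickHouse storage policy: when the ratio of free space on a volume drops below it, ClickHouse starts moving parts to the next volume in the policy. A minimal sketch of such a policy; disk, endpoint, and policy names here are illustrative, not SigNoz's actual configuration:

```xml
<storage_configuration>
  <disks>
    <default/>
    <s3_disk>
      <type>s3</type>
      <endpoint>https://<bucket>.s3.amazonaws.com/data/</endpoint>
    </s3_disk>
  </disks>
  <policies>
    <tiered>
      <volumes>
        <hot><disk>default</disk></hot>
        <cold><disk>s3_disk</disk></cold>
      </volumes>
      <!-- start moving parts to the next volume when free space < 30% -->
      <move_factor>0.3</move_factor>
    </tiered>
  </policies>
</storage_configuration>
```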
oluchi orji:
Okay, thank you @Srikanth Chekuri, I will look up how to override the `move_factor`.
Srikanth Chekuri:
@Prashant Shahi how can @oluchi orji add the `move_factor` for volumes in our charts https://github.com/SigNoz/charts/blob/f0f467bdfb34f464c4bb14f699a038db16332be4/cha[…]ickhouse/templates/clickhouse-instance/clickhouse-instance.yaml? I am not sure if this can be done with override.yaml.
Prashant Shahi:
It would not be possible right now with override.yaml, except maybe by using the `clickhouse.files` configuration.
@Srikanth Chekuri isn't the `move_factor` set to `0.1` by default? Shouldn't that be sufficient?
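A hypothetical sketch of the `clickhouse.files` approach Prashant mentions, which injects extra configuration files into ClickHouse through the Helm chart. The file name and XML contents below are assumptions for illustration; check the chart's documentation before relying on them:

```yaml
# override.yaml (sketch, unverified)
clickhouse:
  files:
    config.d/storage_move_factor.xml: |
      <clickhouse>
        <storage_configuration>
          <policies>
            <tiered>
              <!-- move data to cold storage once free space drops below 30% -->
              <move_factor>0.3</move_factor>
            </tiered>
          </policies>
        </storage_configuration>
      </clickhouse>
```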
Srikanth Chekuri:
That's why I was asking for the ingestion rate. If the rate is high enough, the data gets dropped before the background task can move it. I wanted them to try something higher and test it.
oluchi orji:
Hello @Srikanth Chekuri, how do I check the ingestion rate? Is it a kubectl command, or do I have to exec into the ClickHouse pods?
Srikanth Chekuri:
Yeah, you can get the relevant info by querying ClickHouse. Let me share a query that outputs the span count per interval.
Can you exec into ClickHouse and share the output of this?
```sql
SELECT
    toStartOfInterval(timestamp, toIntervalMinute(10)) AS time,
    count() AS count
FROM signoz_traces.signoz_index_v2
GROUP BY time
ORDER BY time ASC
```
oluchi orji:
Not found @Srikanth Chekuri
```
/ $ SELECT
sh: SELECT: not found
/ $     toStartOfInterval(timestamp, toIntervalMinute(10)) AS time,
sh: syntax error: unexpected word (expecting ")")
/ $     count() AS count
/ $ FROM signoz_traces.signoz_index_v2
sh: FROM: not found
/ $ GROUP BY time
sh: GROUP: not found
/ $ ORDER BY time ASC
sh: ORDER: not found
/ $
```
Prashant Shahi:
@oluchi orji you will have to execute it using `clickhouse client`.
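For readers following along, running the query from outside the pod typically looks something like the following; the pod name and namespace are assumptions, so adjust them to your cluster:

```
# Exec into the ClickHouse pod (name and namespace assumed) and run the query
kubectl exec -n platform -it chi-signoz-clickhouse-cluster-0-0-0 -- \
  clickhouse-client --query "
    SELECT toStartOfInterval(timestamp, toIntervalMinute(10)) AS time,
           count() AS count
    FROM signoz_traces.signoz_index_v2
    GROUP BY time
    ORDER BY time ASC"
```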
oluchi orji:
I thought as much @Prashant Shahi, thanks!
a
How can I drop data from a given day onward?
Prashant Shahi:
@Srikanth Chekuri can you please look into this?