# support
p
I am facing an issue with the Kubernetes signoz-otel-collector.
2023-03-30T09:40:18.329Z	info	kube/client.go:101	k8s filtering	{"kind": "processor", "name": "k8sattributes", "pipeline": "metrics/generic", "labelSelector": "", "fieldSelector": "spec.nodeName=ip-10-0-6-126.ap-south-1.compute.internal"}
2023-03-30T09:40:18.469Z	info	clickhouselogsexporter/exporter.go:356	Running migrations from path: 	{"kind": "exporter", "data_type": "logs", "name": "clickhouselogsexporter", "test": "/logsmigrations"}
Error: cannot build pipelines: failed to create "clickhouselogsexporter" exporter, in pipeline "logs": cannot configure clickhouse logs exporter: clickhouse Migrate failed to run, error: Dirty database version 5. Fix and force version.
2023/03/30 09:40:18 application run finished with error: cannot build pipelines: failed to create "clickhouselogsexporter" exporter, in pipeline "logs": cannot configure clickhouse logs exporter: clickhouse Migrate failed to run, error: Dirty database version 5. Fix and force version.
n
Hi, can you please tell me how you reached this state? Did it happen while you were upgrading, or did you make any changes to the logs schema manually? Also, if your existing logs data is not that important, a hacky way to get things back to normal would be to delete the signoz_logs database.
p
I dropped the ClickHouse PVC and restarted ClickHouse, as I had some issues updating the logs filter rules.
n
What do you mean by “updating logs filter rules”?
If the PV is the same, then there shouldn’t be a problem; correct me if I am wrong @Prashant Shahi
p
I added these in the processors:
- type: filter
  expr: 'attributes.namespace == "signoz"'
- type: filter
  expr: 'attributes.namespace == "tools"'
- type: filter
  expr: 'attributes.container_name == "otc-container"'
A new volume has come up, and I think the otel-collector created the tables and then got stuck in the migration.
n
Oh, you deleted the PV as well. Applying filter processors won't cause any issues on the otel-collector.
p
somehow it happened
n
Can you delete the signoz_logs database and restart your collectors?
p
And I also see these logs in ClickHouse:
0. DB::Exception::Exception(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, int, bool) @ 0xa3ef75a in /usr/bin/clickhouse
1. DB::Block::getByName(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, bool) const @ 0x13ef0872 in /usr/bin/clickhouse
2. DB::getBlockAndPermute(DB::Block const&, std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > > const&, DB::PODArray<unsigned long, 4096ul, Allocator<false, false>, 15ul, 16ul> const*) @ 0x158db96f in /usr/bin/clickhouse
3. DB::MergeTreeDataPartWriterCompact::writeDataBlockPrimaryIndexAndSkipIndices(DB::Block const&, std::__1::vector<DB::Granule, std::__1::allocator<DB::Granule> > const&) @ 0x158d682e in /usr/bin/clickhouse
4. DB::MergeTreeDataPartWriterCompact::fillDataChecksums(DB::MergeTreeDataPartChecksums&) @ 0x158d7bc2 in /usr/bin/clickhouse
5. DB::MergeTreeDataPartWriterCompact::fillChecksums(DB::MergeTreeDataPartChecksums&) @ 0x158d847c in /usr/bin/clickhouse
6. DB::MergedBlockOutputStream::finalizePartAsync(std::__1::shared_ptr<DB::IMergeTreeDataPart>&, bool, DB::NamesAndTypesList const*, DB::MergeTreeDataPartChecksums*) @ 0x159c9396 in /usr/bin/clickhouse
7. DB::MutateAllPartColumnsTask::finalize() @ 0x159ee9c5 in /usr/bin/clickhouse
8. ? @ 0x159ecfec in /usr/bin/clickhouse
9. DB::MutatePlainMergeTreeTask::executeStep() @ 0x159d562e in /usr/bin/clickhouse
10. DB::MergeTreeBackgroundExecutor<DB::MergeMutateRuntimeQueue>::routine(std::__1::shared_ptr<DB::TaskRuntimeData>) @ 0xa3b9f1b in /usr/bin/clickhouse
11. DB::MergeTreeBackgroundExecutor<DB::MergeMutateRuntimeQueue>::threadFunction() @ 0xa3b9950 in /usr/bin/clickhouse
12. ThreadPoolImpl<ThreadFromGlobalPool>::worker(std::__1::__list_iterator<ThreadFromGlobalPool, void*>) @ 0xa4b38a6 in /usr/bin/clickhouse
13. void std::__1::__function::__policy_invoker<void ()>::__call_impl<std::__1::__function::__default_alloc_func<ThreadFromGlobalPool::ThreadFromGlobalPool<void ThreadPoolImpl<ThreadFromGlobalPool>::scheduleImpl<void>(std::__1::function<void ()>, int, std::__1::optional<unsigned long>)::'lambda0'()>(void&&)::'lambda'(), void ()> >(std::__1::__function::__policy_storage const*) @ 0xa4b51f7 in /usr/bin/clickhouse
14. ThreadPoolImpl<std::__1::thread>::worker(std::__1::__list_iterator<std::__1::thread, void*>) @ 0xa4b11c8 in /usr/bin/clickhouse
15. ? @ 0xa4b43dd in /usr/bin/clickhouse
16. ? @ 0x7fac3fccb609 in ?
17. clone @ 0x7fac3fbf0133 in ?
 (version 22.8.8.3 (official build))
2023.03.30 10:29:17.039355 [ 20 ] {35ae2841-cf20-43d4-ae32-f7bcc0e99ad6::20230330_482_482_0_485} <Error> MutatePlainMergeTreeTask: Code: 10. DB::Exception: Not found column os_type in block. There are only columns: timestamp, id, trace_id, span_id, severity_text, severity_number, body, k8s_container_name, k8s_namespace_name, observed_timestamp, trace_flags, resources_string_key, resources_string_value, attributes_string_key, attributes_string_value, attributes_int64_key, attributes_int64_value, attributes_float64_key, attributes_float64_value. (NOT_FOUND_COLUMN_IN_BLOCK) (version 22.8.8.3 (official build))
2023.03.30 10:29:17.041098 [ 20 ] {35ae2841-cf20-43d4-ae32-f7bcc0e99ad6::20230330_482_482_0_485} <Error> virtual bool DB::MutatePlainMergeTreeTask::executeStep(): Code: 10. DB::Exception: Not found column os_type in block. There are only columns: timestamp, id, trace_id, span_id, severity_text, severity_number, body, k8s_container_name, k8s_namespace_name, observed_timestamp, trace_flags, resources_string_key, resources_string_value, attributes_string_key, attributes_string_value, attributes_int64_key, attributes_int64_value, attributes_float64_key, attributes_float64_value. (NOT_FOUND_COLUMN_IN_BLOCK), Stack trace (when copying this message, always include the lines below):
n
Generally, how do you connect to the ClickHouse DB?
And run
drop database signoz_logs
to drop the database.
p
I have 2 shards; just to confirm, it needs to be done on both, right?
n
You can run
drop database signoz_logs on cluster cluster
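For reference, a rough sketch of doing this via kubectl, assuming a default SigNoz Helm install; the namespace and pod name here are assumptions and may differ in your setup:
# namespace and pod name assumed from a typical SigNoz Helm install; adjust for your cluster
kubectl exec -n platform -it chi-signoz-clickhouse-cluster-0-0-0 -- clickhouse-client
-- then, inside clickhouse-client:
DROP DATABASE signoz_logs ON CLUSTER cluster;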
p
Looks like dropping them worked, but if this occurs again, is there no option apart from losing the logs?
n
No, we can get it back to a normal state; it's just that you will have to check the migrations to see what went wrong and compare the schemas. It will require more manual effort.
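As a rough sketch of that manual check, assuming the exporter's migrations record their state in golang-migrate's default schema_migrations table inside signoz_logs and that the logs table is signoz_logs.logs (both are assumptions, not confirmed in this thread):
-- check the recorded migration version and dirty flag (assumed golang-migrate layout)
SELECT * FROM signoz_logs.schema_migrations;
-- compare the actual schema with what the failing migration expected
SHOW CREATE TABLE signoz_logs.logs;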
p
oh
If you don't mind, another question on ClickHouse cold storage.
n
Sure
p
I have enabled cold storage on S3, and I saw that there was around 3 GB of data in the S3 bucket. But somehow there was a big spike in S3 usage cost; NATbytesTransferred was around 120 GB.
How does S3 cold storage work?
Does SigNoz always read from S3?
n
Have you enabled it for all of metrics, traces, and logs? Ideally, data is read from S3 only when you query it; apart from that, it shouldn't be read. For logs, it's basically the time range that you select. @Ankit Nayan do you have more idea about the NATbytesTransferred?
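One way to sanity-check this is to see which parts actually sit on the cold-storage disk; a sketch, assuming the S3 disk is named s3 in your storage configuration:
-- bytes per table and disk; the disk name 's3' is an assumption from a typical cold-storage setup
SELECT table, disk_name, formatReadableSize(sum(bytes_on_disk)) AS size
FROM system.parts
WHERE active
GROUP BY table, disk_name
ORDER BY table;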
p
Screenshot 2023-03-30 at 4.14.53 PM.png
FYI cost spike in AWS
a
Yeah... surprisingly, I also observed a spike in cost a few days back. It was RequestsTier1 for us too, and it is not the case for every SaaS user. I will be diving deeper into this soon. @Pruthvi Raj Eranti can you please create a GitHub issue at SigNoz? At least we should do an analysis of the cost. cc @Prashant Shahi
p
Sure, will create an issue.