# support
Good morning all, on one of our busy environments we are seeing this in the Docker logs:
{
  "level": "warn",
  "ts": 1744271384.3731384,
  "caller": "internal/transaction.go:128",
  "msg": "Failed to scrape Prometheus endpoint",
  "kind": "receiver",
  "name": "prometheus",
  "data_type": "metrics",
  "scrape_timestamp": 1744271384369,
  "target_labels": "{__name__=\"up\", instance=\"localhost:8888\", job=\"otel-collector\", job_name=\"otel-collector\"}"
}
I've added a second OTel collector but am struggling to configure it correctly. Our dashboards are also falling behind: the data shown is almost 5 minutes old.
{"level":"info","ts":1744271376.5092738,"caller":"internal/retry_sender.go:118","msg":"Exporting failed. Will retry the request after interval.","kind":"exporter","data_type":"metrics","name":"clickhousemetricswrite","error":"context deadline exceeded","interval":"7.278325194s"} 
We are also getting the error above.
During busy periods the Docker stack stops working completely. We really want to use the product, but until we can get it stable we cannot move forward with it.
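In case it helps with triage, here is a rough sketch of how the two log entries above could be lined up in time. It assumes `ts` is Unix epoch seconds (which matches the millisecond `scrape_timestamp` in the first entry); the `summarize` helper is a made-up name for illustration, and the scrape warning is trimmed to the fields the script reads.

```python
import json
from datetime import datetime, timezone


def summarize(line: str) -> dict:
    """Parse one JSON log line and keep the fields useful for triage."""
    entry = json.loads(line)
    # Assumption: "ts" is Unix epoch seconds, as the values in the logs suggest.
    when = datetime.fromtimestamp(entry["ts"], tz=timezone.utc)
    return {
        "time_utc": when.isoformat(timespec="seconds"),
        "level": entry.get("level"),
        "component": f'{entry.get("kind")}/{entry.get("name")}',
        "msg": entry.get("msg"),
        "error": entry.get("error"),  # None when the entry has no error field
    }


# Exporter retry entry, copied from the log above.
export_line = r'{"level":"info","ts":1744271376.5092738,"caller":"internal/retry_sender.go:118","msg":"Exporting failed. Will retry the request after interval.","kind":"exporter","data_type":"metrics","name":"clickhousemetricswrite","error":"context deadline exceeded","interval":"7.278325194s"}'

# Scrape warning, trimmed to the fields used here.
scrape_line = r'{"level":"warn","ts":1744271384.3731384,"msg":"Failed to scrape Prometheus endpoint","kind":"receiver","name":"prometheus"}'

export_entry = summarize(export_line)
scrape_entry = summarize(scrape_line)

for item in (export_entry, scrape_entry):
    print(item["time_utc"], item["level"], item["component"], item["msg"], item["error"])

# Seconds between the export retry and the failed self-scrape.
gap_seconds = json.loads(scrape_line)["ts"] - json.loads(export_line)["ts"]
print(f"scrape warning logged {gap_seconds:.1f}s after the export retry")
```

Run against these two entries, the failed self-scrape is logged about 8 seconds after the export retry, so both seem to fall in the same busy window.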