author Suren A. Chilingaryan <csa@suren.me> 2019-10-06 05:00:55 +0200
committer Suren A. Chilingaryan <csa@suren.me> 2019-10-06 05:00:55 +0200
commit ba144fab071258a97cf3c42a0defeb0aae41a353
tree 2e738d4e4774d754b56d79021cc8781b3c0835a5 /docs/logs.txt
parent efe4b9bbe3c9cb950378de9697eed2030ac49ca2
Document latest problems with docker images and resource reclamation, add docker performance checks to the monitoring scripts, add helpers to filter the logs
Diffstat (limited to 'docs/logs.txt')
-rw-r--r-- docs/logs.txt 10
1 files changed, 9 insertions, 1 deletions
diff --git a/docs/logs.txt b/docs/logs.txt
index e27b1ff..d33ef0a 100644
--- a/docs/logs.txt
+++ b/docs/logs.txt
@@ -2,6 +2,10 @@
=================
- Various RPC errors.
... rpc error: code = # desc = xxx ...
+
+ - PLEG is not healthy: pleg was last seen active 3m0.448988393s ago; threshold is 3m0s
+ This is severe and indicates a communication problem (or at least high latency) with the docker daemon. As a result, the node can be
+ temporarily marked NotReady, which causes eviction of all resident pods.
- container kill failed because of 'container not found' or 'no such process': Cannot kill container ###: rpc error: code = 2 desc = no such process"
Despite the errror, the containers are actually killed and pods destroyed. However, this error likely triggers
@@ -25,10 +29,14 @@
There are no adverse effects to this. It is a potential kernel issue, but should be just ignored by the customer. Nothing is going to break.
https://bugzilla.redhat.com/show_bug.cgi?id=1425278
-
- E0625 03:59:52.438970 23953 watcher.go:210] watch chan error: etcdserver: mvcc: required revision has been compacted
seems fine and can be ignored.
+ - E0926 09:29:50.744454 93115 mount_linux.go:172] Mount failed: exit status 1
+ Output: Failed to start transient scope unit: Connection timed out
+ It seems that too many parallel mounts (about 500 per node) may cause systemd to hang.
+ Details: https://github.com/kubernetes/kubernetes/issues/79194
+ * The suggested workaround is to mount volumes with 'setsid' instead of 'systemd-run'
/var/log/openvswitch/ovs-vswitchd.log
=====================================