Diffstat (limited to 'docs/consistency.txt')
-rw-r--r-- docs/consistency.txt 36
1 files changed, 36 insertions, 0 deletions
diff --git a/docs/consistency.txt b/docs/consistency.txt
new file mode 100644
index 0000000..127d9a7
--- /dev/null
+++ b/docs/consistency.txt
@@ -0,0 +1,36 @@
+General overview
+=================
+ - etcd services (worth checking both ports)
+ etcdctl3 --endpoints="192.168.213.1:2379" member list - reports members only, does not check health
+ oc get cs - only the etcd entries are meaningful (other services will fail on OpenShift)
+ - All nodes and pods are up and running, and all PVCs are bound
+ oc get nodes
+ oc get pods --all-namespaces -o wide
+ oc get pvc --all-namespaces -o wide
+ - API health check
+ curl -k https://apiserver.kube-service-catalog.svc/healthz
+
+Storage
+=======
+ - Heketi status
+ heketi-cli -s http://heketi-storage.glusterfs.svc.cluster.local:8080 --user admin --secret "$(oc get secret heketi-storage-admin-secret -n glusterfs -o jsonpath='{.data.key}' | base64 -d)" topology info
+ - Status of Gluster volumes (and their bricks, which often fail with heketi)
+ gluster volume info
+ ./gluster.sh info all_heketi
+ - Check available storage space on the system partition and LVM volumes (docker, heketi, ands)
+ Run 'df -h' and 'lvdisplay' on each node
+
+Networking
+==========
+ - Check that both internal and external addresses are resolvable from all hosts.
+ * I.e. we should be able to resolve 'google.com'
+ * And we should be able to resolve 'heketi-storage.glusterfs.svc.cluster.local'
+
+ - Check that the keepalived service is up and that the corresponding IPs are actually assigned to one
+ of the nodes (the vagrant provisioner would remove keepalived-tracked IPs, but keepalived will
+ continue running without noticing it)
+
+ - Ensure that cluster_name is not still overridden to the first master (which we do during the
+ provisioning of OpenShift plays)
+
+
\ No newline at end of file
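
The checks in the "General overview" and "Networking" sections of this patch could be partly scripted. The following is only a rough sketch and not part of the commit: the helper names (unbound_pvcs, unhealthy_pods, check_resolves) are hypothetical, and the column positions assume the default table output of 'oc get ... --all-namespaces --no-headers'. A pod that is Running but not fully ready would not be caught by this filter.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical and not part of the repository.

# Print "namespace/name status" for every PVC whose STATUS is not Bound.
# Reads the output of: oc get pvc --all-namespaces --no-headers
# (assumed columns: NAMESPACE NAME STATUS ...)
unbound_pvcs() {
    awk 'NF >= 3 && $3 != "Bound" { print $1 "/" $2, $3 }'
}

# Print "namespace/name status" for every pod that is neither Running
# nor Completed.
# Reads the output of: oc get pods --all-namespaces --no-headers
# (assumed columns: NAMESPACE NAME READY STATUS ...)
unhealthy_pods() {
    awk 'NF >= 4 && $4 != "Running" && $4 != "Completed" { print $1 "/" $2, $4 }'
}

# Return success only if every given host name resolves on this host.
# Unresolvable names are reported on stderr.
check_resolves() {
    rc=0
    for name in "$@"; do
        if ! getent hosts "$name" >/dev/null; then
            echo "cannot resolve: $name" >&2
            rc=1
        fi
    done
    return $rc
}
```

Possible usage on a master node: 'oc get pvc --all-namespaces --no-headers | unbound_pvcs', 'oc get pods --all-namespaces --no-headers | unhealthy_pods', and 'check_resolves google.com heketi-storage.glusterfs.svc.cluster.local'. Empty output from the first two commands would mean nothing suspicious was found.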