path: root/roles/openshift_health_checker
Commit message (author, date, files changed, lines -removed/+added)
...
* | | Make aos_version module handle multiple versions (Rodolfo Carvalho, 2017-07-17, 2 files, -24/+93)
| | | Some packages are supported at more than one major.minor version at the same time. Support is added keeping backward compatibility: the 'version' key can be either a string (single version) or a list of versions.
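A minimal sketch of the "string or list" handling described above. The helper names `normalize_versions` and `is_supported` are hypothetical and are not taken from the aos_version module; the sketch only illustrates how a single string and a list can be treated uniformly.

```python
def normalize_versions(version):
    """Return the accepted versions as a list of strings.

    Assumption: 'version' is either a single string such as "3.6"
    or an iterable of such strings.
    """
    if isinstance(version, str):
        return [version]
    return [str(v) for v in version]


def is_supported(found_version, version):
    """Compare only the major.minor part of the version that was found."""
    major_minor = ".".join(found_version.split(".")[:2])
    return major_minor in normalize_versions(version)
```

Callers that still pass a plain string keep working, which is the backward compatibility the commit message refers to.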
* | | Split positive and negative unit tests (Rodolfo Carvalho, 2017-07-17, 1 file, -50/+40)
| | | Split positive and negative tests into their own functions. This means less lines of code, clearer purpose, easier to understand what each test does or doesn't and to add new test cases.
* | | add scheduled pods check (juanvallejo, 2017-07-11, 2 files, -2/+32)
| | |
* | | Only store failures that were not ignored. (Rodolfo Carvalho, 2017-07-11, 1 file, -1/+2)
| | | In the past, health checks were implemented with ignore_errors: True in the playbook level, requiring us to store all failures, ignored or not, so that we could report on all failed checks.
| | | Now checks are run from a single action plugin entry point, without ignoring errors (all errors are aggregated via the action plugin).
| | | Since the integration of the openshift_health_checker role with the install playbook, failure summaries are part of the output of a lot more calls to ansible-playbook. We shall report only failures that caused the execution to stop, as ignored failures in the summary only serve to confuse users.
* | | Add overlay to supported Docker storage drivers (Rodolfo Carvalho, 2017-07-11, 2 files, -3/+3)
| | | Fixes https://bugzilla.redhat.com/show_bug.cgi?id=1467809
| | | As a next step, we can refine under which conditions the overlay driver is supported.
* | | openshift_checks: fix execute_module params (Luke Meyer, 2017-07-11, 4 files, -4/+4)
| | | Fix where execute_module was being passed task_vars in place of tmp param. Most modules don't seem to use either and so this doesn't fail; but under some conditions (perhaps different per version of ansible?) it tried to treat the dict as a string and came back with a python stack trace.
* | | Merge pull request #4655 from sosiouxme/20170630-atomic-etcd-bz1466622 (OpenShift Bot, 2017-06-30, 2 files, -1/+16)
|\ \ \
| | | | Merged by openshift-bot
| * | | docker_image_availability: fix containerized etcd (Luke Meyer, 2017-06-30, 2 files, -1/+16)
| | |/
| |/|
| | | fixes bug 1466622 - docker_image_availability check on etcd host failed for 'openshift_image_tag' is undefined
* | | Merge pull request #4607 from sosiouxme/20170627-docker-storage-vgs-units (OpenShift Bot, 2017-06-30, 1 file, -1/+1)
|\ \ \
| | | | Merged by openshift-bot
| * | | docker_storage check: make vgs return sane output (Luke Meyer, 2017-06-27, 1 file, -1/+1)
| |/ /
| | | fix bug 1464974
| | | https://bugzilla.redhat.com/show_bug.cgi?id=1464974
| | | Specify --units on vgs call. In my testing with lvm 2.0.2.171(2) on RHEL Atomic Host 7.4, this turned a response of "<4.07g" into "4.07g" which should resolve the issue. I haven't found what the "<" is for in the first place but I'm thinking this should at least be a safe change.
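The commit above changes the vgs invocation itself. As a complementary illustration only, the sketch below shows defensive parsing that accepts both "<4.07g" and "4.07g". The function name `parse_vg_size_gb` is hypothetical, and the assumption that the size is reported with a "g" suffix comes from the sample values quoted in the message, not from the check's code.

```python
import re

# Optional leading "<", a decimal number, then a "g" unit suffix.
_SIZE_RE = re.compile(r"^\s*<?\s*(\d+(?:\.\d+)?)\s*g\s*$", re.IGNORECASE)


def parse_vg_size_gb(text):
    """Parse a size such as "4.07g" or "<4.07g" into a float.

    Raises ValueError when the text does not look like such a size.
    """
    match = _SIZE_RE.match(text)
    if match is None:
        raise ValueError("unexpected vgs size output: %r" % (text,))
    return float(match.group(1))
```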
* | | Merge pull request #4565 from rhcarvalho/handle-incorrect-check-names (OpenShift Bot, 2017-06-30, 2 files, -4/+13)
|\ \ \
| | | | Merged by openshift-bot
| * | | Capture exceptions when resolving available checks (Rodolfo Carvalho, 2017-06-23, 2 files, -4/+13)
| |/ /
| | | Calling the action plugin (e.g. when running a playbook) with an incorrect check name was raising an unhandled exception, leading to poor output in Ansible (requiring a higher verbosity level to see what is going wrong).
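A rough sketch of the idea, not the action plugin's actual code: resolve the requested check names inside a try block and turn a bad name into a normal failed result with a readable message. `KNOWN_CHECKS`, `UnknownCheckError`, `resolve_checks` and `run` are all hypothetical names introduced for illustration.

```python
# Hypothetical registry of check names, for illustration only.
KNOWN_CHECKS = frozenset(["disk_availability", "memory_availability"])


class UnknownCheckError(Exception):
    """Raised when a requested check name is not registered."""


def resolve_checks(names, known=KNOWN_CHECKS):
    """Return the requested names, or raise if any of them is unknown."""
    unknown = sorted(set(names) - set(known))
    if unknown:
        raise UnknownCheckError("unknown check(s): " + ", ".join(unknown))
    return sorted(set(names))


def run(names):
    """Report a resolution error as a failed result instead of a traceback."""
    result = {"failed": False}
    try:
        result["checks"] = resolve_checks(names)
    except UnknownCheckError as exc:
        result["failed"] = True
        result["msg"] = str(exc)
    return result
```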
* | | Enable disk check on containerized installs (Rodolfo Carvalho, 2017-06-22, 2 files, -15/+11)
| | | According to the docs the disk requirements should be similar to non-containerized installs.
| | | https://docs.openshift.org/latest/install_config/install/rpm_vs_containerized.html#containerized-storage-requirements
* | | Add module docstring (Rodolfo Carvalho, 2017-06-22, 1 file, -1/+2)
| | |
* | | Add suggestion to check disk space in any path (Rodolfo Carvalho, 2017-06-22, 1 file, -1/+5)
| | |
* | | Require at least 1GB in /usr/bin/local and tempdir (Rodolfo Carvalho, 2017-06-22, 2 files, -1/+15)
| | | During install, those paths are used and require some free space.
* | | Refactor DiskAvailability for arbitrary paths (Rodolfo Carvalho, 2017-06-22, 2 files, -34/+64)
|/ /
| | Prepare the check to support verifying multiple paths, not only /var.
* | Disable TLS verification in skopeo inspect (Rodolfo Carvalho, 2017-06-19, 1 file, -1/+1)
| | Some registries are not configured with valid certificates and thus the check fails with 'http: server gave HTTP response to HTTPS client'. Since this is not fetching images, but only checking for existence, trade security for convenience.
* | Rename cockpit-shell -> cockpit-system (Rodolfo Carvalho, 2017-06-16, 1 file, -1/+1)
| | The package name has changed. See
| | https://bugzilla.redhat.com/show_bug.cgi?id=1461689
| | https://bugzilla.redhat.com/show_bug.cgi?id=1419718
* | pre-install checks: add more during byo install (Luke Meyer, 2017-06-14, 9 files, -91/+184)
|/
| Add the docker and RPM checks to the list that run at install time. They can be disabled the same as the existing ones.
| Removed cockpit-kubernetes RPM requirement as it no longer is.
| Fixed up docker_image_availability to handle oreg_url and other nuances. Switched to using the openshift_image_tag that's set by openshift_version for both component and infrastructure images.
| Fixed a bug where execute_module was being called with incorrect positional arg "tmp" as a dict which caused errors down the call stack.
* Merge pull request #3787 from juanvallejo/jvallejo/docker-storage-check (OpenShift Bot, 2017-06-09, 5 files, -29/+468)
|\
| | Merged by openshift-bot
| * Consider previous value of 'changed' when updating (Rodolfo Carvalho, 2017-06-09, 1 file, -1/+1)
| | This avoids unintentionally overriding the value from `True` to `False`.
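The pattern described above can be sketched as follows. The function name `merge_changed` and the dict shapes are assumptions made for illustration; the point is only that the new value is OR-ed with the previous one rather than assigned over it.

```python
def merge_changed(result, check_result):
    """Fold a check's 'changed' flag into the overall result.

    Using 'or' keeps an earlier True from being overwritten by a later False.
    """
    previous = result.get("changed", False)
    result["changed"] = previous or check_result.get("changed", False)
    return result
```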
| * Improve code readability (Rodolfo Carvalho, 2017-06-09, 1 file, -1/+3)
| |
| * docker checks: finish and refactor (Luke Meyer, 2017-06-07, 7 files, -444/+397)
| | Incorporated docker_storage_driver into docker_storage as both need driver info. Corrected storage calculation to include VG free space, not just the current amount in the LV pool. Now makes no assumptions about pool name. Improved user messaging. Factored out some methods that can be shared with docker_image_availability.
| * add docker storage, docker driver checks (juanvallejo, 2017-06-01, 4 files, -0/+484)
| |
* | Merge pull request #3643 from juanvallejo/jvallejo/elastic-search-check (OpenShift Bot, 2017-06-06, 13 files, -6/+1575)
|\ \
| | | Merged by openshift-bot
| * | add elasticseatch, fluentd, kibana check (juanvallejo, 2017-06-02, 13 files, -6/+1575)
| | |
* | | Merge pull request #4064 from juanvallejo/jvallejo/add-ovs-version-check (OpenShift Bot, 2017-06-05, 4 files, -92/+334)
|\ \ \
| | | | Merged by openshift-bot
| * | | Remove unnecessary comment. (Rodolfo Carvalho, 2017-05-24, 1 file, -1/+1)
| | | | Capturing the ImportError is a common idiom in Ansible modules, and it is not specific to tox.
| * | | update aos_version module to support generic pkgs and versions (juanvallejo, 2017-05-24, 4 files, -91/+333)
| | | |
* | | | Merge pull request #4157 from juanvallejo/jvallejo/add-retroactive-ovs-version-check (OpenShift Bot, 2017-06-05, 4 files, -0/+376)
|\ \ \ \
| |_|_|/
|/| | |
| | | | Merged by openshift-bot
| * | | add existing_ovs_version check (juanvallejo, 2017-05-19, 5 files, -0/+377)
| | |/
| |/|
* | | memory check: use GiB/MiB and adjust memtotal (Luke Meyer, 2017-05-29, 2 files, -20/+36)
| | | fixes https://bugzilla.redhat.com/show_bug.cgi?id=1455884
| | | Various things reserve memory such that memtotal is quite lower than the actual physical RAM of the system. It's larger as RAM increases but it's not really proportional so I just added a flat 1GiB adjustment in the comparison. This ought to "pass when it's close enough."
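A small sketch of the flat adjustment described above, assuming sizes are compared in bytes. The names `has_enough_memory` and `MEMTOTAL_ADJUSTMENT` are hypothetical; only the 1 GiB figure comes from the commit message.

```python
GIB = 1024 ** 3

# Flat allowance for memory reserved by firmware, kernel, etc.
MEMTOTAL_ADJUSTMENT = 1 * GIB


def has_enough_memory(memtotal_bytes, recommended_bytes,
                      adjustment=MEMTOTAL_ADJUSTMENT):
    """Pass when the reported memtotal is within 'adjustment' of the target."""
    return memtotal_bytes >= recommended_bytes - adjustment
```

With this comparison, a host sold as 16 GiB that reports about 15.5 GiB of memtotal still passes a 16 GiB recommendation, while a host reporting 14 GiB does not.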
* | | memory health check: adjust threshold for etcd (Luke Meyer, 2017-05-23, 2 files, -4/+10)
| | |
* | | health checks: specify check skip reason (Luke Meyer, 2017-05-23, 2 files, -9/+24)
| | | Added indicator to check result for why that check was skipped. Note that currently the user will only see it with ansible-playbook -vv
* | | health checks: configure failure output in playbooks (Luke Meyer, 2017-05-23, 2 files, -45/+68)
| | | Customized the error summary to depend on the intent of the playbook run. Ensured output makes sense when failures are unrelated to running checks.
* | | disk/memory checks: make threshold configurable (Luke Meyer, 2017-05-23, 4 files, -12/+65)
| | |
* | | Show help on how to disable checks after failure (Rodolfo Carvalho, 2017-05-23, 1 file, -0/+22)
| | |
* | | Allow disabling checks via Ansible variable (Rodolfo Carvalho, 2017-05-23, 1 file, -1/+8)
| |/
|/|
| | Example usage:
| |     $ ansible-playbook -i hosts playbooks/byo/config.yml -e openshift_disable_check=memory_availability,disk_availability
| | Or add the variable to the inventory / hosts file.
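One way the comma-separated value shown in the example usage could be interpreted is sketched below. `parse_disabled_checks` and `enabled_checks` are hypothetical helpers, not the role's actual functions; the check names are the ones from the example command.

```python
def parse_disabled_checks(value):
    """Split a comma-separated string into a set of check names.

    Surrounding whitespace and empty entries are ignored.
    """
    return {name.strip() for name in value.split(",") if name.strip()}


def enabled_checks(all_checks, disabled_value):
    """Return the checks that remain after removing the disabled ones."""
    disabled = parse_disabled_checks(disabled_value)
    return [check for check in all_checks if check not in disabled]
```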
* | remove skopeo dependency on docker-py (juanvallejo, 2017-05-19, 2 files, -152/+143)
| |
* | improve error handling for missing vars (juanvallejo, 2017-05-19, 3 files, -66/+226)
|/
* Merge pull request #3630 from juanvallejo/jvallejo/add-etcd-volume-check (OpenShift Bot, 2017-05-19, 2 files, -0/+207)
|\
| | Merged by openshift-bot
| * revert role-specific var name (juanvallejo, 2017-05-15, 1 file, -1/+1)
| |
| * Merge branch 'jvallejo/add-etcd-volume-check' of github.com:juanvallejo/openshift-ansible into jvallejo/add-etcd-volume-check (juanvallejo, 2017-05-12, 1 file, -5/+7)
| |\
| | * Update variable name to standard (Rodolfo Carvalho, 2017-05-11, 1 file, -1/+1)
| | | It was agreed to name role variables as `r_ROLE_NAME_VARIABLE_NAME`. Giving it a try.
| | * Make class attribute name shorter (Rodolfo Carvalho, 2017-05-11, 1 file, -4/+4)
| | |
| | * Add module docstring (Rodolfo Carvalho, 2017-05-11, 1 file, -0/+2)
| | |
| * | check if hostname is in list of etcd hosts (juanvallejo, 2017-05-12, 1 file, -3/+4)
| |/
| * Update check (Rodolfo Carvalho, 2017-05-10, 2 files, -48/+46)
| |
| * int -> float (Rodolfo Carvalho, 2017-05-10, 1 file, -3/+3)
| | We don't need to convert to int and then to float. Read it as float from the start.