path: root/roles/openshift_health_checker
Commit message (Author, Age, Files, Lines)
* Remove openshift.common.{is_atomic|is_containerized} (Michael Gugino, 2017-12-20, 10 files, -58/+41)
| We set these variables using facts in init, no need to duplicate the logic all around the codebase.
* Relocate filter plugins to lib_utils (Michael Gugino, 2017-12-18, 1 file, -0/+1)
| This commit relocates filter_plugins to lib_utils and changes the namespacing to prevent unintended use of older versions that may be present in the filter_plugins/ directory on existing installs.
| Add lib_utils to meta depends for roles.
| Also consolidate some plugins into lib_utils from various other areas.
| Update rpm spec, obsolete plugin rpms.
* Cleanup byo references (Russell Teague, 2017-12-08, 1 file, -1/+1)
|
* Remove openshift.common.service_type (Michael Gugino, 2017-12-07, 8 files, -14/+31)
| This commit removes openshift.common.service_type in favor of openshift_service_type.
| This commit also removes r_openshift_excluder_service_type from plays in favor of using the role's defaults.
* registry-console: align image and check (Luke Meyer, 2017-11-15, 2 files, -8/+12)
| enable option to configure basename in image
| docker_image_availability check: follow registry-console image options
* Merge pull request #5829 from sosiouxme/20171020-registry-console-bz1497310 (Scott Dodson, 2017-11-08, 2 files, -11/+67)
|\
| | reconcile registry-console and docker_image_availability
| * reconcile registry-console and docker_image_availability (Luke Meyer, 2017-11-06, 2 files, -11/+67)
| | Fixes bug 1497310 https://bugzilla.redhat.com/show_bug.cgi?id=1497310
| | The registry console is a special case in more than one way. This adds logic to incorporate the openshift_cockpit_deployer_* variables into determining what its image will be in docker_image_availability.
| | Along the way I noticed the origin and enterprise templates for this were not consistent. Now they are, and the example hosts file is updated.
* | openshift_checks: Add OVS versions for OCP 3.7 (Miciah Masters, 2017-11-06, 3 files, -3/+6)
|/
| Update the ovs_version check with the allowed Open vSwitch versions for OCP 3.7. Add OVS 2.8 to the allowed versions for OCP 3.6 as well.
| Fixes https://bugzilla.redhat.com/show_bug.cgi?id=1509163
* Merge pull request #5816 from sosiouxme/20171019-disk-check-sum-var (OpenShift Merge Robot, 2017-10-27, 2 files, -3/+33)
|\
| | Automatic merge from submit-queue.
| | disk_availability check: include submount storage
| | Fixes bug [1491566](https://bugzilla.redhat.com/show_bug.cgi?id=1491566)
| | In order to determine how much storage is under a path, include any mounts that are below it in addition to the path itself.
| * disk_availability check: include submount storage (Luke Meyer, 2017-10-20, 2 files, -3/+33)
| | Fixes bug 1491566 https://bugzilla.redhat.com/show_bug.cgi?id=1491566
| | In order to determine how much storage is under a path, include any mounts that are below it in addition to the path itself.
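Illustrative sketch, not taken from the commit: one way to sum the free space under a path, counting the mount that holds the path plus every mount below it. The function names and the `mounts` mapping (mount point to free bytes) are hypothetical.

```python
def is_under(path, parent):
    """Return True if path is parent itself or lies below it."""
    parent = parent.rstrip("/") or "/"
    return path == parent or path.startswith(parent.rstrip("/") + "/")


def find_mount(path, mounts):
    """Pick the longest mount point that contains path."""
    candidates = [mount for mount in mounts if is_under(path, mount)]
    return max(candidates, key=len)


def free_bytes_under(path, mounts):
    """Free space on the path's own mount plus all mounts below the path."""
    own = find_mount(path, mounts)
    total = mounts[own]
    for mount, free in mounts.items():
        if mount != own and is_under(mount, path):
            total += free
    return total


# Hypothetical data: free bytes per mount point.
example_mounts = {"/": 10, "/var": 20, "/var/lib/docker": 30, "/home": 5}
print(free_bytes_under("/var", example_mounts))
```

The prefix test compares against `parent + "/"` so that a sibling such as `/variable` is not mistaken for something below `/var`.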
* | Merge pull request #5742 from mtnbikenc/refactor-checks (Scott Dodson, 2017-10-27, 1 file, -1/+1)
|\ \
| | | 1504593 Refactor health check playbooks
| * | Refactor health check playbooks (Russell Teague, 2017-10-12, 1 file, -1/+1)
| |/
| | - Standardize play/tasks naming
| | - Move install checks to separate playbook with checkpointing
| | - Correct 'docker_storage' tags
* | Ensure proper variable templating for skopeo auth credentials (Michael Gugino, 2017-10-17, 3 files, -3/+12)
|/
| Currently, the docker_image_availability.py plugin check uses the raw strings for variables from task_vars. As a result, any variables used within the plugin are un-templated. For instance, if variable "x" is set to "{{ y }}" and y is set to "2", one would expect that x == 2 inside the plugin. Currently, the plugin will use the string "{{ y }}" for the value of x instead of templating the variable.
| This commit ensures skopeo registry auth credentials are templated properly.
| Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1500698
* docker_image_availability: credentials to skopeo (Michael Gugino, 2017-10-06, 3 files, -83/+76)
| Currently, docker_image_availability health_check does not support authenticated registries. This commit adds the '--creds=' option to skopeo if needed to support authentication credentials.
| Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1316341
| Some other fixes to handle docker config better: Should now account properly for blocked registries, insecure registries, multiple additional registries, and oreg_url registry with or without credentials. Output on failure should be clearer about what was tried.
| Fixed a bug in the action_plugin_test exposed by these changes.
* openshift_checks: lb and nfs do not need docker (Luke Meyer, 2017-10-04, 2 files, -5/+7)
| fixes bug 1496760 https://bugzilla.redhat.com/show_bug.cgi?id=1496760
* openshift_checks: use oo group names everywhere (Luke Meyer, 2017-10-04, 19 files, -112/+116)
|
* openshift_checks: Fix incorrect list cast (Steve Milner, 2017-10-02, 1 file, -1/+10)
| docker_image_availability cast openshift_docker_additional_registries to a list using the list() function. If a string was returned (i.e. only a single registry was added), the result would be the string split up into its component characters. This change forces a string result from get_var to be placed inside a list. If the result is anything BUT a string, the original list() function is called on the result.
| Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1497274
| Signed-off-by: Steve Milner <smilner@redhat.com>
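Illustrative sketch of the pitfall described above, not the commit's actual code: calling `list()` on a string yields its characters, so a lone string has to be wrapped instead. The helper name `as_list` is hypothetical, and the sketch assumes Python 3 `str`.

```python
def as_list(value):
    """Wrap a lone string in a list; convert any other iterable with list()."""
    if isinstance(value, str):
        return [value]
    return list(value)


# list() on a string splits it into single characters.
print(list("reg.io"))
# Wrapping keeps the registry name intact.
print(as_list("reg.io"))
```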
* Migrate enterprise registry logic to docker role (Michael Gugino, 2017-09-27, 2 files, -6/+6)
| Currently, the enterprise registry is forcefully added in openshift_facts. Recently, the docker role has been modified to consume registry variables directly, bypassing openshift_facts.
| This commit cleans up unused code in openshift_facts, and migrates enterprise registry logic to the docker role.
| Fixes: https://github.com/openshift/openshift-ansible/issues/5557
* Merge pull request #5491 from sosiouxme/20170920-diagnostics-check (OpenShift Merge Robot, 2017-09-22, 7 files, -20/+138)
|\
| | Automatic merge from submit-queue
| | health checks: add diagnostics check
| | Adds a health check that runs `oc adm diagnostics` with each individual diagnostic.
| | Also, moved `is_first_master` method into superclass for reuse. And look at `oo_first_master` and `ansible_host` instead of `masters` and `ansible_ssh_host`.
| * health checks: add diagnostics check (Luke Meyer, 2017-09-21, 7 files, -20/+138)
| | Also, moved is_first_master method into superclass for reuse. And look at oo_first_master and ansible_host instead of masters and ansible_ssh_host.
* | Cleanup old deployment types (Michael Gugino, 2017-09-20, 2 files, -5/+34)
|/
| Previously, openshift-ansible supported various types of deployments using the variable "openshift_deployment_type".
| Currently, openshift-ansible only supports two deployment types, "origin" and "openshift-enterprise".
| This commit removes all logic and references to deprecated deployment types.
* openshift_checks: enable providing file outputs (Luke Meyer, 2017-09-18, 15 files, -82/+430)
| Some refactoring of checks and the action plugin to enable writing files locally about the check operation and results, if the user wants them. This is aimed at enabling persistent and machine-readable results from recurring runs of health checks.
| Now, rather than trying to build a result hash to return from running each check, checks can just register what they need to as they're going along, and the action plugin processes state when the check is done. Checks can register failures, notes about what they saw, and arbitrary files to be saved into a directory structure where the user specifies. If no directory is specified, no files are written.
| At this time checks can still return a result hash, but that will likely be refactored away in the next iteration.
| Multiple failures can be registered without halting check execution. Throwing an exception or returning a hash with "failed" is registered as a failure.
| execute_module now does a little more with the results. Results are automatically included in notes and written individually as files. "changed" results are propagated. Some json results are decoded.
| A few of the checks were enhanced to use these features; all get some of the features for free.
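Illustrative sketch of the register-as-you-go idea described above, under the assumption that a check accumulates failures and notes and the caller builds the result afterwards. The class, its method names, and `process_state` are hypothetical and are not the role's actual API.

```python
class CheckSketch:
    """Hypothetical stand-in for a health check that records state as it runs."""

    def __init__(self):
        self.failures = []  # every failure seen, in order
        self.notes = {}     # free-form observations keyed by name

    def register_failure(self, message):
        # Recording a failure does not stop the check from continuing.
        self.failures.append(message)

    def register_note(self, name, value):
        self.notes[name] = value


def process_state(check):
    """Build the result after the check finishes, as an action plugin might."""
    result = {"failed": bool(check.failures), "notes": dict(check.notes)}
    if check.failures:
        result["failures"] = list(check.failures)
    return result
```

Because failures are appended rather than raised, a check written this way can report several problems from a single run.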
* Merge pull request #5365 from sosiouxme/20170908-disconnected-image-check (OpenShift Bot, 2017-09-12, 12 files, -144/+184)
|\
| | Merged by openshift-bot
| * openshift_health_check: allow disabling all checks (Luke Meyer, 2017-09-12, 2 files, -3/+8)
| | Can now set openshift_disable_check=* to disable all checks without needing to know their names.
| | fixes bug 1462106 https://bugzilla.redhat.com/show_bug.cgi?id=1462106
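Illustrative sketch of the wildcard behaviour described above, not the commit's code. It assumes the disable value is a comma-separated string of check names. The function name `enabled_checks` is hypothetical.

```python
def enabled_checks(resolved_checks, disable_spec):
    """Drop disabled checks; a '*' entry disables every check."""
    disabled = {name.strip() for name in disable_spec.split(",") if name.strip()}
    if "*" in disabled:
        return []
    return [name for name in resolved_checks if name not in disabled]
```

For example, `enabled_checks(["disk_availability", "memory_availability"], "*")` returns an empty list without the caller having to know any check names.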
| * docker_image_availability: fix local image search (Luke Meyer, 2017-09-12, 1 file, -5/+9)
| | An image in the docker index may be tagged by name or by registry plus name. In order to find the image correctly locally and prevent looking for it externally, make sure all possible variations are searched.
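Illustrative sketch of searching all name variations, not the commit's code: an image counts as locally present if it is tagged either by bare name or by any configured registry plus name. The function names and the example registry are hypothetical.

```python
def local_name_candidates(image, registries):
    """All names under which the image might be tagged locally."""
    candidates = {image}
    for registry in registries:
        candidates.add(registry + "/" + image)
    return candidates


def is_local(image, registries, local_images):
    """True if any candidate name appears among the locally known images."""
    return bool(local_name_candidates(image, registries) & set(local_images))
```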
| * docker_image_availability: probe registry connectivity (Luke Meyer, 2017-09-12, 2 files, -122/+132)
| | Probe whether the host has connectivity to the registry before trying to inspect it for images, and remember the result. Also if later inspection fails due to timeout, mark registry as unreachable. Note in failure output if any registries were unreachable.
| | Registry order should match what is configured into docker now as well.
| | Fixes bug 1480195 https://bugzilla.redhat.com/show_bug.cgi?id=1480195
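Illustrative sketch of probing once and remembering the result, not the commit's code. The class name is hypothetical, and the probe itself is passed in as a callable so that no particular network call is assumed.

```python
class RegistryProber:
    """Cache reachability per registry so each one is probed at most once."""

    def __init__(self, probe):
        self._probe = probe   # callable: registry name -> truthy if reachable
        self.reachable = {}   # remembered results

    def is_reachable(self, registry):
        if registry not in self.reachable:
            self.reachable[registry] = bool(self._probe(registry))
        return self.reachable[registry]

    def mark_unreachable(self, registry):
        # Used when a later inspection times out despite an earlier good probe.
        self.reachable[registry] = False

    def unreachable(self):
        """Registries known to be unreachable, for the failure message."""
        return sorted(name for name, ok in self.reachable.items() if not ok)
```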
| * openshift_checks: add retries in python (Luke Meyer, 2017-09-12, 10 files, -22/+43)
| |
* | Skip failure dedup instead of crashing (Rodolfo Carvalho, 2017-09-11, 2 files, -2/+29)
|/
| This makes the callback plugin behave better when dedup is not possible: work with the original list of failures instead of raising an unhandled exception and producing confusing output for users.
* Merge pull request #5296 from nak3/skeopeo-command-output (OpenShift Bot, 2017-09-06, 1 file, -4/+6)
|\
| | Merged by openshift-bot
| * output skopeo image check command (Kenjiro Nakayama, 2017-09-05, 1 file, -4/+6)
| |
* | openshift_checks aos_version: also check installed under yum (Luke Meyer, 2017-09-06, 3 files, -17/+21)
| | Tweaks to the logic around using yum vs dnf; now uses ansible_pkg_mgr to determine which is in effect for a host.
| | Also, extended the yum logic to check installed packages in addition to available packages in the aos_version module so that disconnected installs and others with weird repo configs need not disable the package_version check.
* | Import dnf only if importing yum fails (Jakub Hadvig, 2017-09-05, 1 file, -6/+12)
|/
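Illustrative sketch of the fallback-import pattern named in the commit title above, not the commit's code. The helper `import_first` is hypothetical. It tries each module name in order and returns the first one that imports.

```python
import importlib


def import_first(names):
    """Return (name, module) for the first importable module, else (None, None)."""
    for name in names:
        try:
            return name, importlib.import_module(name)
        except ImportError:
            continue
    return None, None


# Prefer yum; fall back to dnf only when yum cannot be imported.
pkg_mgr_name, pkg_mgr = import_first(("yum", "dnf"))
```

On a host with neither library installed, `pkg_mgr` is `None` and the caller can report that no supported package manager binding was found.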
* Merge pull request #5035 from Miciah/openshift_checks-ignore-hidden-files-in-checks-directory (Rodolfo Carvalho, 2017-08-31, 1 file, -1/+1)
|\
| | openshift_checks: ignore hidden files in checks dir
| * openshift_checks: ignore hidden files in checks dir (Miciah Masters, 2017-08-08, 1 file, -1/+1)
| | load_checks: Ignore hidden files when scanning the directory for checks.
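Illustrative sketch of skipping hidden files while collecting check modules, not the actual `load_checks` implementation. The function name and the `.py` and `__init__.py` filtering are assumptions made for the example.

```python
def check_module_names(filenames):
    """Module names for check files, skipping hidden files and non-modules."""
    names = []
    for filename in sorted(filenames):
        if filename.startswith("."):
            continue  # hidden file, e.g. an editor lock file
        if not filename.endswith(".py") or filename == "__init__.py":
            continue
        names.append(filename[:-3])
    return names
```

Without the hidden-file test, an editor lock file such as `.#disk_availability.py` would be treated as a check module.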
* | Merge pull request #5271 from sosiouxme/20170830-disk-avail-bug (Rodolfo Carvalho, 2017-08-31, 1 file, -4/+1)
|\ \
| | | disk_availability: fix bug where msg is overwritten
| * | disk_availability: fix bug where msg is overwritten (Luke Meyer, 2017-08-30, 1 file, -4/+1)
| | |
* | | Merge pull request #5228 from sosiouxme/20170825-timeout-skopeo (Rodolfo Carvalho, 2017-08-30, 1 file, -1/+4)
|\ \ \
| | | | docker_image_availability: timeout skopeo inspect
| * | | docker_image_availability: timeout skopeo inspect (Luke Meyer, 2017-08-28, 1 file, -1/+4)
| | | | Set a 10 second timeout when using skopeo to inspect remote registries, so that it does not wait for a tcp timeout to fail if they are unreachable.
* | | | Update error message: s/non-unique/duplicate (Rodolfo Carvalho, 2017-08-24, 2 files, -6/+4)
| | | |
* | | | Make pylint disables more specific (Rodolfo Carvalho, 2017-08-24, 1 file, -15/+26)
| | | | And beautify the code a bit.
* | | | Handle exceptions in failure summary cb plugin (Rodolfo Carvalho, 2017-08-24, 1 file, -2/+11)
| | | | This serves two purposes:
| | | | - Gracefully omit the summary if there was an error computing it, no confusion to the regular end user.
| | | | - Provide a stacktrace of the error when running verbose, giving developers or users reporting bugs a better insight of what went wrong, as opposed to Ansible's opaque handling of errors in callback plugins.
* | | | Rewrite failure summary callback plugin (Rodolfo Carvalho, 2017-08-24, 3 files, -119/+243)
| | | | The intent is to deduplicate similar errors that happened in many hosts, making the summary more concise.
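Illustrative sketch of deduplicating failures across hosts, not the callback plugin's actual code. It assumes each failure is a dict with hypothetical `host`, `check`, and `msg` keys and groups identical check/message pairs into a single entry listing all affected hosts.

```python
from collections import OrderedDict


def dedup_failures(failures):
    """Merge failures that share the same check and message across hosts."""
    grouped = OrderedDict()
    for failure in failures:
        key = (failure["check"], failure["msg"])
        grouped.setdefault(key, []).append(failure["host"])
    return [
        {"check": check, "msg": msg, "hosts": sorted(hosts)}
        for (check, msg), hosts in grouped.items()
    ]
```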
* | | | Handle more exceptions when running checks (Rodolfo Carvalho, 2017-08-24, 1 file, -19/+30)
| | | | This prevents an exception in one check from interfering with other checks. Skips checks that raise an exception in their is_active method.
| | | | Whenever capturing a broad exception in the `is_active` or `run` methods, include traceback information that can be useful in bug reports.
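Illustrative sketch of isolating checks from one another, not the action plugin's actual code. It assumes each check object has `is_active()` and `run()` methods. The result keys are hypothetical.

```python
import traceback


def run_checks(checks):
    """Run every check; an exception in one never prevents the others."""
    results = {}
    for name, check in checks.items():
        try:
            active = check.is_active()
        except Exception as exc:  # broad on purpose: skip rather than crash
            results[name] = {
                "skipped": True,
                "skipped_reason": "is_active failed: %s" % exc,
                "exception": traceback.format_exc(),
            }
            continue
        if not active:
            results[name] = {"skipped": True, "skipped_reason": "not active"}
            continue
        try:
            results[name] = check.run() or {}
        except Exception as exc:  # broad on purpose: record and move on
            results[name] = {
                "failed": True,
                "msg": str(exc),
                "exception": traceback.format_exc(),
            }
    return results
```

Keeping the formatted traceback in the result means a bug report can include it even though the exception itself was swallowed.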
* | | | List known checks/tags when check name is invalid (Rodolfo Carvalho, 2017-08-24, 2 files, -16/+22)
| | | |
* | | | List existing health checks when none is requested (Rodolfo Carvalho, 2017-08-24, 2 files, -8/+43)
| | | | This is a simple mechanism to learn what health checks are available.
| | | | Note that we defer task_vars verification, so that we can compute requested_checks and resolved_checks earlier, allowing us to list checks even if openshift_facts has not run.
* | | | Add playbook for running arbitrary health checks (Rodolfo Carvalho, 2017-08-24, 1 file, -1/+1)
|/ / /
| | | This is useful on its own, and also aids in developing/testing new checks that are not part of any playbook.
| | | Since the intent when running this playbook is to execute checks, opt for a less verbose explanation on the error summary.
* | | Merge pull request #5101 from maxamillion/add-dnf-support (Scott Dodson, 2017-08-23, 1 file, -17/+49)
|\ \ \
| |/ /
|/| |
| | | Add dnf support
| * | remove out of scope variable from exception message (Adam Miller, 2017-08-18, 1 file, -1/+0)
| | | Signed-off-by: Adam Miller <maxamillion@fedoraproject.org>
| * | raise AosVersionException if no expected packages found by dnf query (Adam Miller, 2017-08-18, 1 file, -0/+8)
| | | Signed-off-by: Adam Miller <maxamillion@fedoraproject.org>
| * | add dnf support to roles/openshift_health_checker/library/aos_version.py (Adam Miller, 2017-08-16, 1 file, -17/+42)
| | | Signed-off-by: Adam Miller <maxamillion@fedoraproject.org>