OpenShift Prometheus
====================

OpenShift Prometheus Installation

Requirements
------------


Role Variables
--------------

For default values, see [`defaults/main.yaml`](defaults/main.yaml).

- `openshift_prometheus_state`: `present` installs or updates the components; `absent` uninstalls them.

- `openshift_prometheus_namespace`: project (i.e. namespace) where the components will be
  deployed.

- `openshift_prometheus_node_selector`: selector for the nodes on which Prometheus will be deployed.

- `openshift_prometheus_<COMPONENT>_image_prefix`: image prefix for the component.

- `openshift_prometheus_<COMPONENT>_image_version`: image version for the component.
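
As a rough illustration, the variables above might be set together in a group variables file like this. Every value is an assumption chosen for the example (the project name, the node label, the registry prefix and the version tag are hypothetical), and the dictionary form of the node selector is also assumed; the actual defaults are in `defaults/main.yaml`.

```yaml
# Illustrative values only; see defaults/main.yaml for the real defaults.
openshift_prometheus_state: present          # use "absent" to uninstall
openshift_prometheus_namespace: my-metrics   # hypothetical project name
openshift_prometheus_node_selector:          # assumed to accept a label mapping
  region: infra                              # hypothetical node label
# <COMPONENT> replaced with "alertmanager"; prefix and version are placeholders.
openshift_prometheus_alertmanager_image_prefix: "registry.example.com/openshift/"
openshift_prometheus_alertmanager_image_version: "v0.0.0"
```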

## PVC related variables
Each Prometheus component (prometheus, alertmanager, alertbuffer) can request a persistent volume claim (PVC) by setting the corresponding role variables:
```
openshift_prometheus_<COMPONENT>_storage_type: <VALUE> (pvc, emptydir)
openshift_prometheus_<COMPONENT>_pvc_(name|size|access_modes|pv_selector): <VALUE>
```
e.g.
```
openshift_prometheus_storage_type: pvc
openshift_prometheus_alertmanager_pvc_name: alertmanager
openshift_prometheus_alertbuffer_pvc_size: 10G
openshift_prometheus_pvc_access_modes: [ReadWriteOnce]
```
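
For a single component, the full set of PVC variables could look like the sketch below. It substitutes `alertmanager` for `<COMPONENT>`; the claim name, size and selector label are hypothetical, and the mapping form of `pv_selector` is an assumption rather than something this README specifies.

```yaml
# Sketch: persistent storage for the alertmanager component only.
openshift_prometheus_alertmanager_storage_type: pvc
openshift_prometheus_alertmanager_pvc_name: alertmanager
openshift_prometheus_alertmanager_pvc_size: 10G
openshift_prometheus_alertmanager_pvc_access_modes: [ReadWriteOnce]
# Assumed to be a label mapping used to pick a matching PV (hypothetical label).
openshift_prometheus_alertmanager_pvc_pv_selector:
  storage: alertmanager
```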

## NFS PV Storage variables
Each Prometheus component (prometheus, alertmanager, alertbuffer) can use an NFS persistent volume (PV) by setting the corresponding variables:
```
openshift_prometheus_<COMPONENT>_storage_kind=<VALUE>
openshift_prometheus_<COMPONENT>_storage_(access_modes|host|labels)=<VALUE>
openshift_prometheus_<COMPONENT>_storage_volume_(name|size)=<VALUE>
openshift_prometheus_<COMPONENT>_storage_nfs_(directory|options)=<VALUE>
```
e.g.
```
openshift_prometheus_storage_kind=nfs
openshift_prometheus_storage_access_modes=['ReadWriteOnce']
openshift_prometheus_storage_host=nfs.example.com # for external host
openshift_prometheus_storage_nfs_directory=/exports
openshift_prometheus_alertmanager_storage_nfs_options='*(rw,root_squash)'
openshift_prometheus_storage_volume_name=prometheus
openshift_prometheus_alertbuffer_storage_volume_size=10Gi
openshift_prometheus_storage_labels={'storage': 'prometheus'}
```

NOTE: Setting `openshift_prometheus_<COMPONENT>_storage_labels` overrides `openshift_prometheus_<COMPONENT>_pvc_pv_selector`


## Additional Alert Rules file variable
An external file with alert rules can be added by setting the following variable to the path of the file:
```
openshift_prometheus_additional_rules_file: <PATH> 
```

The file content must be in the Prometheus alerting rules format.
The following example defines a rule that fires an alert when one of the cluster nodes is down:

```
groups:
- name: example-rules
  interval: 30s # defaults to global interval
  rules:
  - alert: Node Down
    expr: up{job="kubernetes-nodes"} == 0
    annotations:
      miqTarget: "ContainerNode"
      severity: "HIGH"
      message: "{{ '{{' }}{{ '$labels.instance' }}{{ '}}' }} is down"
```
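
To hook a rules file like the one above into the role, the variable only needs to point at it. A minimal sketch, assuming the file was saved at the placeholder path shown; this README does not say whether the path is resolved on the control host or on the target host, so check the role's tasks before relying on either.

```yaml
# Placeholder path; replace with the actual location of the rules file.
openshift_prometheus_additional_rules_file: /path/to/custom.rules.yaml
```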


## Additional variables to control resource limits
Each Prometheus component (prometheus, alertmanager, alertbuffer, oauth-proxy) can specify CPU and memory limits and requests by setting
the corresponding role variables:
```
openshift_prometheus_<COMPONENT>_(limits|requests)_(memory|cpu): <VALUE>
```
e.g.
```
openshift_prometheus_alertmanager_limits_memory: 1Gi
openshift_prometheus_oauth_proxy_requests_cpu: 100
```
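
Expanding the pattern for one component gives four variables. The sketch below uses `alertmanager` and assumes the values are passed through as Kubernetes resource quantities (for example `100m` for a tenth of a CPU core); the numbers are illustrative, not sizing recommendations.

```yaml
# Sketch: requests and limits for the alertmanager component.
# Values assume Kubernetes quantity notation and are examples only.
openshift_prometheus_alertmanager_requests_cpu: 100m
openshift_prometheus_alertmanager_requests_memory: 256Mi
openshift_prometheus_alertmanager_limits_cpu: 500m
openshift_prometheus_alertmanager_limits_memory: 1Gi
```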

Dependencies
------------

openshift_facts


Example Playbook
----------------

```
- name: Configure openshift-prometheus
  hosts: oo_first_master
  roles:
  - role: openshift_prometheus
```
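
Role variables can be passed in the same playbook. As a sketch, the following variant would uninstall the components by setting `openshift_prometheus_state` to `absent`; the play name is made up for the example, and the host group is copied from the playbook above.

```yaml
- name: Remove openshift-prometheus
  hosts: oo_first_master
  roles:
  - role: openshift_prometheus
    vars:
      # "absent" uninstalls; "present" installs or updates.
      openshift_prometheus_state: absent
```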

License
-------

Apache License, Version 2.0