# OpenShift-Ansible - CFME Role

# PROOF OF CONCEPT - Alpha Version

This role is based on the work in the upstream
[manageiq/manageiq-pods](https://github.com/ManageIQ/manageiq-pods)
project. For additional literature on configuration specific to
ManageIQ (optional post-installation tasks), visit the project's
[upstream documentation page](http://manageiq.org/docs/get-started/basic-configuration).

Please submit a
[new issue](https://github.com/openshift/openshift-ansible/issues/new)
if you run into bugs with this role or wish to request enhancements.

# Important Notes

This is an early *proof of concept* role to install the Cloud Forms
Management Engine (ManageIQ) on OpenShift Container Platform (OCP).

* This role is still in **ALPHA STATUS**
* Many options are still hard-coded (e.g., NFS setup)
* Few options are configurable yet
* **Should** be run on a dedicated cluster
* **Will not run** on undersized infrastructure
* The terms *CFME* and *MIQ* / *ManageIQ* are interchangeable

## Requirements

**NOTE:** These requirements are copied from the upstream
[manageiq/manageiq-pods](https://github.com/ManageIQ/manageiq-pods)
project.

### Prerequisites:

* [OpenShift Origin 1.5](https://docs.openshift.com/container-platform/3.5/welcome/index.html) or [higher](https://docs.openshift.com/container-platform/latest/welcome/index.html) provisioned
* NFS or other compatible volume provider
* A cluster-admin user (created by role if required)

### Cluster Sizing

In order to avoid random deployment failures due to resource
starvation, we recommend a minimum cluster size for a **test**
environment.

| Type           | Size    | CPUs     | Memory   |
|----------------|---------|----------|----------|
| Masters        | `1+`    | `8`      | `12GB`   |
| Nodes          | `2+`    | `4`      | `8GB`    |
| PV Storage     | `25GB`  | `N/A`    | `N/A`    |


![Basic CFME Deployment](img/CFMEBasicDeployment.png)

**CFME has hard requirements for memory. CFME will NOT install if your
  infrastructure does not meet or exceed the requirements given
  above. Do not run this playbook if you do not have the required
  memory; you will only waste your time.**


### Other sizing considerations

* Recommendations assume MIQ will be the **only application running**
  on this cluster.
* Alternatively, you can provision an infrastructure node to run
  registry/metrics/router/logging pods.
* Each MIQ application pod will consume at least `3GB` of RAM on initial
  deployment (blank deployment without providers).
* RAM consumption will increase with appliance use; once providers are
  added, expect higher resource consumption.


### Assumptions

1) You meet/exceed the [cluster sizing](#cluster-sizing) requirements
1) Your NFS server is on your master host
1) Your PV backing NFS storage volume is mounted on `/exports/`

Required directories that NFS will export to back the PVs:

* `/exports/miq-pv0[123]`

If the required directories are not present at install-time, they will
be created using the recommended permissions per the
[upstream documentation](https://github.com/ManageIQ/manageiq-pods#make-persistent-volumes-to-host-the-miq-database-and-application-data):

* UID/GID: `root`/`root`
* Mode: `0775`

**IMPORTANT:** If you are using a separate volume (`/dev/vdX`) for NFS
  storage, **ensure** it is mounted on `/exports/` **before** running
  this role.
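
If you prefer to prepare the export directories yourself ahead of time
(the role will create them for you if they are missing), a minimal
sketch might look like the following. The `findmnt` check and the
brace expansion are illustrative only:

```
# Verify the NFS backing volume is mounted where this role expects it
findmnt /exports/

# Pre-create the export directories with the recommended ownership and mode
mkdir -p /exports/miq-pv0{1,2,3}
chown root:root /exports/miq-pv0*
chmod 0775 /exports/miq-pv0*
```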



## Role Variables

Core variables in this role:

| Name                          | Default value | Description   |
|-------------------------------|---------------|---------------|
| `openshift_cfme_install_app`  | `False`       | `True`: Install everything and create a new CFME app, `False`: Just install all of the templates and scaffolding |


Variables you may override have defaults defined in
[defaults/main.yml](defaults/main.yml).
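
For example, to have the role deploy the CFME application itself
rather than only the templates and scaffolding, you could set the
variable in your inventory. This is a sketch that assumes the usual
`[OSEv3:vars]` section of an openshift-ansible inventory:

```
[OSEv3:vars]
# Deploy the CFME app in addition to the templates/scaffolding
openshift_cfme_install_app=True
```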


# Usage

This section describes the basic usage of this role. All parameters
will use their [default values](defaults/main.yml).

## Pre-flight Checks

**IMPORTANT:** As documented above in [the prerequisites](#prerequisites),
  you **must already** have your OCP cluster up and running.

**Optional:** The ManageIQ image is fairly large (about 1.7 GB), so to
save some spin-up time post-deployment you can begin pre-pulling the
docker image to each of your nodes now:

```
root@node0x # docker pull docker.io/manageiq/manageiq-pods:app-latest
```
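
If you would rather not log in to each node individually, the same
pull can be pushed out with an ad-hoc Ansible command. This is a
sketch that assumes your inventory defines the standard `nodes` group:

```
$ ansible -i <INVENTORY_FILE> nodes -m command -a "docker pull docker.io/manageiq/manageiq-pods:app-latest"
```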

## Getting Started

1) The entry point playbook to install CFME is located in
[the BYO playbooks](../../playbooks/byo/openshift-cfme/config.yml)
directory

2) Using your existing `hosts` inventory file, run `ansible-playbook`
with the entry point playbook:

```
$ ansible-playbook -v -i <INVENTORY_FILE> playbooks/byo/openshift-cfme/config.yml
```
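
If you want the application deployed into a project other than the
default, the `openshift_cfme_project` variable (mentioned in the next
section) can also be overridden on the command line. A sketch, where
`cfme-test` is just an example project name:

```
$ ansible-playbook -v -i <INVENTORY_FILE> playbooks/byo/openshift-cfme/config.yml -e openshift_cfme_project=cfme-test
```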

## Next Steps

Once complete, the playbook will let you know:


```
TASK [openshift_cfme : Status update] *********************************************************
ok: [ho.st.na.me] => {
    "msg": "CFME has been deployed. Note that there will be a delay before it is fully initialized.\n"
}
```

Initialization will take several minutes (*possibly 10 or more*,
depending on your network connection). However, you can get some
insight into the deployment process while it runs.

On your master node, switch to the `cfme` project (or whatever you
named it if you overrode the `openshift_cfme_project` variable) and
check on the pod states:

```
[root@cfme-master01 ~]# oc project cfme
Now using project "cfme" on server "https://10.10.0.100:8443".

[root@cfme-master01 ~]# oc get pod
NAME                 READY     STATUS    RESTARTS   AGE
manageiq-0           0/1       Running   0          14m
memcached-1-3lk7g    1/1       Running   0          14m
postgresql-1-12slb   1/1       Running   0          14m
```

Note how the `manageiq-0` pod says `0/1` under the **READY**
column. After some time (depending on your network connection) you'll
be able to `rsh` into the pod to see what is happening in real time:

```
[root@cfme-master01 ~]# oc rsh manageiq-0 bash -l
```

The `rsh` command opens a shell in your pod, in this case the pod
called `manageiq-0`. `systemd` manages the services in this pod, so we
can use the `list-units` command to see what is currently running:
`# systemctl list-units | grep appliance`.

If you see the `appliance-initialize` service running, this indicates
that basic setup is still in progress. We can monitor the process with
the `journalctl` command like so:


```
[root@manageiq-0 vmdb]# journalctl -f -u appliance-initialize.service
Jun 14 14:55:52 manageiq-0 appliance-initialize.sh[58]: == Checking deployment status ==
Jun 14 14:55:52 manageiq-0 appliance-initialize.sh[58]: No pre-existing EVM configuration found on region PV
Jun 14 14:55:52 manageiq-0 appliance-initialize.sh[58]: == Checking for existing data on server PV ==
Jun 14 14:55:52 manageiq-0 appliance-initialize.sh[58]: == Starting New Deployment ==
Jun 14 14:55:52 manageiq-0 appliance-initialize.sh[58]: == Applying memcached config ==
Jun 14 14:55:53 manageiq-0 appliance-initialize.sh[58]: == Initializing Appliance ==
Jun 14 14:55:57 manageiq-0 appliance-initialize.sh[58]: create encryption key
Jun 14 14:55:57 manageiq-0 appliance-initialize.sh[58]: configuring external database
Jun 14 14:55:57 manageiq-0 appliance-initialize.sh[58]: Checking for connections to the database...
Jun 14 14:56:09 manageiq-0 appliance-initialize.sh[58]: Create region starting
Jun 14 14:58:15 manageiq-0 appliance-initialize.sh[58]: Create region complete
Jun 14 14:58:15 manageiq-0 appliance-initialize.sh[58]: == Initializing PV data ==
Jun 14 14:58:16 manageiq-0 appliance-initialize.sh[58]: == Initializing PV data backup ==
Jun 14 14:58:16 manageiq-0 appliance-initialize.sh[58]: sending incremental file list
Jun 14 14:58:16 manageiq-0 appliance-initialize.sh[58]: created directory /persistent/server-deploy/backup/backup_2017_06_14_145816
Jun 14 14:58:16 manageiq-0 appliance-initialize.sh[58]: region-data/
Jun 14 14:58:16 manageiq-0 appliance-initialize.sh[58]: region-data/var/
Jun 14 14:58:16 manageiq-0 appliance-initialize.sh[58]: region-data/var/www/
Jun 14 14:58:16 manageiq-0 appliance-initialize.sh[58]: region-data/var/www/miq/
Jun 14 14:58:16 manageiq-0 appliance-initialize.sh[58]: region-data/var/www/miq/vmdb/
Jun 14 14:58:16 manageiq-0 appliance-initialize.sh[58]: region-data/var/www/miq/vmdb/REGION
Jun 14 14:58:16 manageiq-0 appliance-initialize.sh[58]: region-data/var/www/miq/vmdb/certs/
Jun 14 14:58:16 manageiq-0 appliance-initialize.sh[58]: region-data/var/www/miq/vmdb/certs/v2_key
Jun 14 14:58:16 manageiq-0 appliance-initialize.sh[58]: region-data/var/www/miq/vmdb/config/
Jun 14 14:58:16 manageiq-0 appliance-initialize.sh[58]: region-data/var/www/miq/vmdb/config/database.yml
Jun 14 14:58:16 manageiq-0 appliance-initialize.sh[58]: server-data/
Jun 14 14:58:16 manageiq-0 appliance-initialize.sh[58]: server-data/var/
Jun 14 14:58:16 manageiq-0 appliance-initialize.sh[58]: server-data/var/www/
Jun 14 14:58:16 manageiq-0 appliance-initialize.sh[58]: server-data/var/www/miq/
Jun 14 14:58:16 manageiq-0 appliance-initialize.sh[58]: server-data/var/www/miq/vmdb/
Jun 14 14:58:16 manageiq-0 appliance-initialize.sh[58]: server-data/var/www/miq/vmdb/GUID
Jun 14 14:58:16 manageiq-0 appliance-initialize.sh[58]: sent 1330 bytes  received 136 bytes  2932.00 bytes/sec
Jun 14 14:58:16 manageiq-0 appliance-initialize.sh[58]: total size is 770  speedup is 0.53
Jun 14 14:58:16 manageiq-0 appliance-initialize.sh[58]: == Restoring PV data symlinks ==
Jun 14 14:58:16 manageiq-0 appliance-initialize.sh[58]: /var/www/miq/vmdb/REGION symlink is already in place, skipping
Jun 14 14:58:16 manageiq-0 appliance-initialize.sh[58]: /var/www/miq/vmdb/config/database.yml symlink is already in place, skipping
Jun 14 14:58:16 manageiq-0 appliance-initialize.sh[58]: /var/www/miq/vmdb/certs/v2_key symlink is already in place, skipping
Jun 14 14:58:16 manageiq-0 appliance-initialize.sh[58]: /var/www/miq/vmdb/log symlink is already in place, skipping
Jun 14 14:58:28 manageiq-0 systemctl[304]: Removed symlink /etc/systemd/system/multi-user.target.wants/appliance-initialize.service.
Jun 14 14:58:29 manageiq-0 systemd[1]: Started Initialize Appliance Database.
```

Most of what we see here is the initial database seeding
process. This process is not very quick, so be patient.

At the bottom of the log there is a line from `systemctl`: `Removed
symlink
/etc/systemd/system/multi-user.target.wants/appliance-initialize.service`. This
means the `appliance-initialize` service is no longer enabled, which
indicates that the base application initialization is complete.

We're not done yet, though; other ancillary services run in this pod
to support the application. *Still in the rsh shell*, use the `ps`
command to watch for the `httpd` processes to start. You will see
output similar to the following when that stage has completed:

```
[root@manageiq-0 vmdb]# ps aux | grep http
root       1941  0.0  0.1 249820  7640 ?        Ss   15:02   0:00 /usr/sbin/httpd -DFOREGROUND
apache     1942  0.0  0.0 250752  6012 ?        S    15:02   0:00 /usr/sbin/httpd -DFOREGROUND
apache     1943  0.0  0.0 250472  5952 ?        S    15:02   0:00 /usr/sbin/httpd -DFOREGROUND
apache     1944  0.0  0.0 250472  5916 ?        S    15:02   0:00 /usr/sbin/httpd -DFOREGROUND
apache     1945  0.0  0.0 250360  5764 ?        S    15:02   0:00 /usr/sbin/httpd -DFOREGROUND
```

You can also broaden the search by looking for processes with `MIQ`
in their name:

```
[root@manageiq-0 vmdb]# ps aux | grep miq
root        333 27.7  4.2 555884 315916 ?       Sl   14:58   3:59 MIQ Server
root       1976  0.6  4.0 507224 303740 ?       SNl  15:02   0:03 MIQ: MiqGenericWorker id: 1, queue: generic
root       1984  0.6  4.0 507224 304312 ?       SNl  15:02   0:03 MIQ: MiqGenericWorker id: 2, queue: generic
root       1992  0.9  4.0 508252 304888 ?       SNl  15:02   0:05 MIQ: MiqPriorityWorker id: 3, queue: generic
root       2000  0.7  4.0 510308 304696 ?       SNl  15:02   0:04 MIQ: MiqPriorityWorker id: 4, queue: generic
root       2008  1.2  4.0 514000 303612 ?       SNl  15:02   0:07 MIQ: MiqScheduleWorker id: 5
root       2026  0.2  4.0 517504 303644 ?       SNl  15:02   0:01 MIQ: MiqEventHandler id: 6, queue: ems
root       2036  0.2  4.0 518532 303768 ?       SNl  15:02   0:01 MIQ: MiqReportingWorker id: 7, queue: reporting
root       2044  0.2  4.0 519560 303812 ?       SNl  15:02   0:01 MIQ: MiqReportingWorker id: 8, queue: reporting
root       2059  0.2  4.0 528372 303956 ?       SNl  15:02   0:01 puma 3.3.0 (tcp://127.0.0.1:5000) [MIQ: Web Server Worker]
root       2067  0.9  4.0 529664 305716 ?       SNl  15:02   0:05 puma 3.3.0 (tcp://127.0.0.1:3000) [MIQ: Web Server Worker]
root       2075  0.2  4.0 529408 304056 ?       SNl  15:02   0:01 puma 3.3.0 (tcp://127.0.0.1:4000) [MIQ: Web Server Worker]
root       2329  0.0  0.0  10640   972 ?        S+   15:13   0:00 grep --color=auto -i miq
```

Finally, *still in the rsh shell*, to test whether the application is
running correctly, we can request the application homepage. If the
page is available, its title will be `ManageIQ: Login`:

```
[root@manageiq-0 vmdb]# curl -s -k https://localhost | grep -A2 '<title>'
<title>
ManageIQ: Login
</title>
```

**Note:** The `-s` flag makes `curl` silent, and the `-k` flag tells
it to ignore errors about untrusted certificates.



# Additional Upstream Resources

Below are some resources from the upstream project documentation that
you may find useful.

* [Verify Setup Was Successful](https://github.com/ManageIQ/manageiq-pods#verifying-the-setup-was-successful)
* [POD Access And Routes](https://github.com/ManageIQ/manageiq-pods#pod-access-and-routes)
* [Troubleshooting](https://github.com/ManageIQ/manageiq-pods#troubleshooting)


# Manual Cleanup

At this time uninstallation/cleanup is still a manual process. You
will have to follow a few steps to fully remove CFME from your
cluster.

Delete the project:

* `oc delete project cfme`

Delete the PVs:

* `oc delete pv miq-pv01`
* `oc delete pv miq-pv02`
* `oc delete pv miq-pv03`

Clean out the old PV data:

* `cd /exports/`
* `find miq* -type f -delete`
* `find miq* -type d -delete`

Remove the NFS exports:

* `rm /etc/exports.d/openshift_cfme.exports`
* `exportfs -ar`

Delete the user:

* `oc delete user cfme`

**NOTE:** The `oc delete project cfme` command returns quickly, but
project deletion continues in the background. Keep running `oc get
pods` after you've completed the other tasks to monitor the pod
termination progress. Likewise, run `oc get project` after the pods
have disappeared to confirm that the `cfme` project has been
terminated as well.
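
If you prefer to run the cleanup in one go, the individual steps above
can be collected into a small script. A minimal sketch, assuming the
default `cfme` project/user names and the `/exports/` layout used by
this role:

```
#!/bin/bash
# Manual CFME cleanup -- assumes the defaults used by this role.

# Delete the project and its PVs (project deletion continues in the background)
oc delete project cfme
oc delete pv miq-pv01 miq-pv02 miq-pv03

# Clean out the old PV data
cd /exports/
find miq* -type f -delete
find miq* -type d -delete

# Remove the NFS exports
rm /etc/exports.d/openshift_cfme.exports
exportfs -ar

# Delete the user
oc delete user cfme
```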