Introduction
We have the following cluster:
- Ceph 15.2.1 installed with cephadm
- 14 OSD
- 3 MON
- 3 RGW
- 3 MDS
- 3 crash
Version 15.2.4 has been released, and we want to upgrade.
Getting Started
Cephadm provides a mechanism for upgrading, but it didn’t work for me right away, so I had to assist it. Let’s see what exactly needs to be upgraded.
```
ceph orch upgrade check ceph/ceph 15.2.4
```
The command prints a JSON report listing the daemons that are already on the target version and those that differ from it (at the moment, all of them). Time to get started.
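For reference, the upgrade itself is started with the same image and version arguments, and `ceph orch upgrade status` reports whether a run is in progress:

```
# kick off the upgrade (same arguments as the check above)
ceph orch upgrade start ceph/ceph 15.2.4

# confirm the orchestrator has picked it up
ceph orch upgrade status
```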
So the procedure has been launched; a couple of commands are worth keeping open to follow it:
- `ceph status` shows general information about the cluster's state. You can keep it running in a separate terminal with `watch -n 10 ceph status`.
- `ceph -W cephadm` streams the orchestrator's log; definitely keep this one running in a terminal.
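If you would rather not keep a watch session open, recent orchestrator events can also be pulled on demand:

```
# print the most recent cephadm entries from the cluster log
ceph log last cephadm
```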
Temporary Difficulties
We run `ceph orch ps` and see that all monitors and managers have been upgraded. Good, but now the cluster is in HEALTH_ERR with the error shown below.
```
module 'cephadm' has failed: auth get failed: failed to find client.crash.01 in keyring retval: -2
```
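The same message also shows up in `ceph health detail`, which is handy if you missed it in the log:

```
# show the full text of the current health errors and warnings
ceph health detail
```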
It turns out that some keys are missing: one has to be created for each crash daemon, and then the orchestration restarted.
```
# create the missing client.crash.<host> keys with the crash profiles
for i in $(ceph orch ps | grep -oE 'crash.[0-9]+'); do
    ceph auth get-or-create "client.$i" mgr "profile crash" mon "profile crash"
done

# restart the cephadm module and resume the upgrade
ceph orch upgrade stop
ceph mgr module disable cephadm
ceph mgr module enable cephadm
ceph orch upgrade start ceph/ceph 15.2.4
```
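A quick way to double-check that the keys exist and the crash daemons are visible again (plain `grep` over standard commands):

```
# the newly created keys should show up here
ceph auth ls | grep crash

# and the crash daemons should be listed by the orchestrator
ceph orch ps | grep crash
```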
Now we see that the crash daemons are upgraded too. Good, but then everything seems to stall: the cluster is HEALTH_OK, yet `ceph orch ps` shows all OSDs still on the old version. The log shows the following.
```
Upgrade: It is NOT safe to stop osd.0
```
Aha, the orchestrator is afraid to stop OSD daemons that hold data. Let's help it along. Here is my situation.
```
# ceph osd tree
ID  CLASS  WEIGHT    TYPE NAME      STATUS  REWEIGHT  PRI-AFF
-1         14.00000  root default
-3          5.00000      host 01
 0    hdd   1.00000          osd.0      up   1.00000  1.00000
 1    hdd   1.00000          osd.1      up   1.00000  1.00000
 2    hdd   1.00000          osd.2      up   1.00000  1.00000
 3    hdd   1.00000          osd.3      up   1.00000  1.00000
 4    hdd   1.00000          osd.4      up   1.00000  1.00000
-5          5.00000      host 02
 5    hdd   1.00000          osd.5      up   1.00000  1.00000
 6    hdd   1.00000          osd.6      up   1.00000  1.00000
 7    hdd   1.00000          osd.7      up   1.00000  1.00000
 8    hdd   1.00000          osd.8      up   1.00000  1.00000
 9    hdd   1.00000          osd.9      up   1.00000  1.00000
-7          4.00000      host 03
10    hdd   1.00000          osd.10     up   1.00000  1.00000
11    hdd   1.00000          osd.11     up   1.00000  1.00000
12    hdd   1.00000          osd.12     up   1.00000  1.00000
13    hdd   1.00000          osd.13     up   1.00000  1.00000
```
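As an aside, you can ask the cluster directly whether stopping a given OSD (or a whole set) would make placement groups unavailable; this is essentially the check behind the message above:

```
# is it safe to stop a single OSD?
ceph osd ok-to-stop osd.0

# or a whole host's worth at once
ceph osd ok-to-stop 0 1 2 3 4
```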
At first I stopped OSDs one at a time. Then I relaxed and started marking out a whole server's OSDs at once. When we mark osd.0 out, then after some time, once rebalancing finishes, `ceph -W cephadm` starts complaining about osd.1, and `ceph orch ps` shows that osd.0 has been upgraded. The idea is clear; just don't forget to mark it back in.
In short: mark the first server's OSDs out, wait until cephadm complains about the next ones, then mark them back in. And wait for HEALTH_OK before taking out the next server's OSDs, or you could lose data…
Based on the OSD tree above, the plan, if done from scratch, is as follows (a shell sketch of the first round follows the list):
- out 0, 1, 2, 3, 4
- wait for complaints about 5
- in 0, 1, 2, 3, 4
- wait for HEALTH_OK
- out 5, 6, 7, 8, 9
- wait for complaints about 10
- in 5, 6, 7, 8, 9
- wait for HEALTH_OK
- out 10, 11, 12, 13
- check that everything is upgraded with `ceph orch ps`
- in 10, 11, 12, 13
- wait for HEALTH_OK
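Here is a minimal sketch of the first round, assuming the numeric OSD IDs from the tree above; the HEALTH_OK wait loop is just my own convenience, not something cephadm provides:

```
# round 1: mark host 01's OSDs out so the orchestrator is allowed to restart them
ceph osd out 0 1 2 3 4

# ... wait until `ceph -W cephadm` starts complaining about osd.5 ...

# bring them back in once they show the new version in `ceph orch ps`
ceph osd in 0 1 2 3 4

# wait for the cluster to settle before touching the next host
until ceph health | grep -q HEALTH_OK; do sleep 30; done
```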
```plain
╔═╗╔═╗╔═╗╦ ╦ ╦ ╦╔═╗╔═╗╦═╗╔═╗╔╦╗╔═╗╔╦╗┬
║ ║╣ ╠═╝╠═╣ ║ ║╠═╝║ ╦╠╦╝╠═╣ ║║║╣ ║║│
╚═╝╚═╝╩ ╩ ╩ ╚═╝╩ ╚═╝╩╚═╩ ╩═╩╝╚═╝═╩╝o
```
Notes
The cluster is more or less empty (see the `ceph osd df` output below), so the whole thing went quickly. Had it held real data, the rebalancing would have taken days…
```
% sudo ceph osd df
ID  CLASS  WEIGHT   REWEIGHT  SIZE     RAW USE  DATA     OMAP     META      AVAIL    %USE  VAR   PGS  STATUS
 0    hdd  1.00000   1.00000  5.5 TiB  1.4 GiB  438 MiB   39 KiB  1024 MiB  5.5 TiB  0.03  1.11   37      up
 1    hdd  1.00000   1.00000  5.5 TiB  1.4 GiB  454 MiB  2.2 MiB  1022 MiB  5.5 TiB  0.03  1.12   46      up
 2    hdd  1.00000   1.00000  5.5 TiB  1.3 GiB  314 MiB  1.0 MiB  1023 MiB  5.5 TiB  0.02  1.01   24      up
 3    hdd  1.00000   1.00000  5.5 TiB  1.2 GiB  161 MiB  2.5 MiB  1022 MiB  5.5 TiB  0.02  0.90   34      up
 4    hdd  1.00000   1.00000  5.5 TiB  1.2 GiB  242 MiB   21 KiB  1024 MiB  5.5 TiB  0.02  0.96   34      up
 5    hdd  1.00000   1.00000  5.5 TiB  1.2 GiB  212 MiB  3.0 MiB  1021 MiB  5.5 TiB  0.02  0.93   34      up
 6    hdd  1.00000   1.00000  5.5 TiB  1.5 GiB  485 MiB  533 KiB  1023 MiB  5.5 TiB  0.03  1.14   35      up
 7    hdd  1.00000   1.00000  5.5 TiB  1.3 GiB  280 MiB  1.8 MiB  1022 MiB  5.5 TiB  0.02  0.99   41      up
 8    hdd  1.00000   1.00000  5.5 TiB  1.3 GiB  265 MiB   19 MiB  1005 MiB  5.5 TiB  0.02  0.97   38      up
 9    hdd  1.00000   1.00000  5.5 TiB  1.4 GiB  359 MiB  949 KiB  1023 MiB  5.5 TiB  0.02  1.05   28      up
10    hdd  1.00000   1.00000  5.5 TiB  1.2 GiB  179 MiB  467 KiB  1024 MiB  5.5 TiB  0.02  0.91   33      up
11    hdd  1.00000   1.00000  5.5 TiB  1.1 GiB  132 MiB  2.8 MiB  1021 MiB  5.5 TiB  0.02  0.87   23      up
12    hdd  1.00000   1.00000  5.5 TiB  1.3 GiB  358 MiB   19 MiB  1005 MiB  5.5 TiB  0.02  1.04   36      up
13    hdd  1.00000   1.00000  5.5 TiB  1.3 GiB  301 MiB  734 KiB  1023 MiB  5.5 TiB  0.02  1.00   39      up
                        TOTAL   76 TiB   18 GiB  4.1 GiB   54 MiB    14 GiB   76 TiB  0.02
MIN/MAX VAR: 0.87/1.14  STDDEV: 0.00
```
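As a final sanity check, `ceph versions` summarizes which release every daemon is actually running, so any stragglers stand out immediately:

```
# every daemon should now report 15.2.4
ceph versions
```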