Introduction to CEPH

Installation

CEPH is a distributed network storage system that supports S3, iSCSI, RBD, etc. It’s a great solution, with plenty of documentation available, but it can be tricky to figure out the correct installation method at first.

Rook

Rook is the officially recommended way to install CEPH on Kubernetes. The team at Flant has written about their experience with it.

Cephadm

Using cephadm, I wrote some Ansible playbooks and successfully managed a cluster deployed on a host. It accepted 40 GiB over the local network in 2 minutes (roughly 340 MiB/s) and remained stable :)

The methods I tried that turned out to be either non-working or deprecated:

  • ceph-deploy: somewhere in the documentation, I found that this method is not recommended. Unfortunately, I had already spent a couple of days on it. I can’t provide a link since their SSL certificate expired, and I’m blocked from accessing the site due to HSTS.
  • ceph-ansible: comprehensive and feature-rich, but CentOS 7 is no longer supported on the master branch, and the monitors didn’t synchronize. I couldn’t find a way to install the dashboard without monitoring, so I implemented it myself. I gained experience with Ansible, but then new issues arose, prompting me to switch to cephadm, which I’m glad I did.

Useful Commands

Here’s a cheatsheet I found helpful.
To view the list of disks in use on the servers:

ceph orch device ls
ceph device ls
ceph device ls-by-host ceph-server-0

To check the space usage summary:

ceph osd df
ceph df
ceph df | grep -oE '[0-9.]+\s*[a-zA-Z]+$' | uniq | sort -n

To view the cluster status:

ceph status
ceph -w

To see the infrastructure, including which services/containers are running and where:

ceph orch ls
ceph orch ps

Replacing an OSD

Here’s a good article from the same Flant team, along with a useful comment that complements it well. I recreated the OSD when I expanded the block device on the server (virtual machine) to add space to the cluster. I followed this process:

  1. The cluster must be in HEALTH_OK state.
  2. Remove the OSD from the cluster.
  3. Delete the daemon (container) holding this OSD.
  4. Purge the OSD.
  5. Clear the partition.
  6. Reconnect it.
  7. Wait for HEALTH_OK.
# The OSD we want to recreate
OSD='osd.0'

# Find the host
HOST="$(ceph device ls | awk "/${OSD}/{print \$2}" | cut -d: -f1)"

# Get the name of the block device allocated to the OSD
DEV="/dev/$(ceph device ls | awk "/${OSD}/{print \$2}" | cut -d: -f2)"

# Step 1
ceph health | grep -q 'HEALTH_OK' || { echo 'Cluster is not HEALTH_OK, aborting' >&2; exit 1; }

# Step 2
ceph osd out "$OSD"

# Step 3
ceph orch daemon rm "$OSD" --force 

# Step 4
ceph osd purge "$OSD" --force 

# Step 5
ceph orch device zap "$HOST" "$DEV" --force

# Step 6
NEW_ID="osd.$(ceph orch daemon add osd "${HOST}:${DEV}" | awk '{print $3}')"
sleep 5
ceph osd crush reweight "$NEW_ID" 1

# Step 7
while ! ceph health | grep -q 'HEALTH_OK'; do
  sleep 5
done
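
One caveat about the lookup at the top of this script: the `awk "/${OSD}/"` pattern is a substring regex, so `osd.1` would also match a row for `osd.10`. Below is a minimal sketch of an exact-match variant. The function name `osd_host_dev` is hypothetical, and the column layout (DEVICE, HOST:DEV, then one or more daemon names) is an assumption based on the `ceph device ls` output the script already relies on.

```shell
# Print "HOST /dev/DEV" for the device whose daemon list contains exactly
# the given OSD name. Reads `ceph device ls`-style text on stdin.
# Assumed columns: DEVICE  HOST:DEV  DAEMONS...  (header on the first line)
osd_host_dev() {
  awk -v osd="$1" '
    NR > 1 {
      # Compare every field from the DAEMONS column onward as a whole word.
      for (i = 3; i <= NF; i++) {
        if ($i == osd) {
          split($2, hd, ":")
          print hd[1], "/dev/" hd[2]
          exit
        }
      }
    }'
}

# Possible usage (not run here):
#   set -- $(ceph device ls | osd_host_dev "$OSD"); HOST="$1"; DEV="$2"
```

Comparing whole fields avoids both the prefix problem and the `.` in the OSD name being treated as a regex wildcard.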

In my experience, replacing an OSD with 15 GiB of data took about 30 minutes. This was tested on three CX21 Hetzner instances.
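
Since recovery can take that long, the Step 7 loop could wait forever if the cluster never returns to HEALTH_OK. Here is a minimal sketch of a bounded wait. The function name `wait_for_health_ok` and the default timeout and interval are my own illustrative choices, not something cephadm provides.

```shell
# Poll `ceph health` until it reports HEALTH_OK or the timeout is reached.
# $1: timeout in seconds (default 3600, an arbitrary choice)
# $2: polling interval in seconds (default 5, as in the loop above)
wait_for_health_ok() {
  local timeout="${1:-3600}" interval="${2:-5}" waited=0
  until ceph health | grep -q 'HEALTH_OK'; do
    if [ "$waited" -ge "$timeout" ]; then
      echo "cluster not HEALTH_OK after ${waited}s" >&2
      return 1
    fi
    sleep "$interval"
    waited=$((waited + interval))
  done
}

# Possible usage (not run here):
#   wait_for_health_ok 3600 5 || exit 1
```

The elapsed time is counted in sleep intervals only, so the real wait is slightly longer than the timeout; for a recovery that takes tens of minutes that seems precise enough.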

RBD and Docker

Setting up RBD and integrating it with Docker:

  1. Create a pool.
  2. Initialize the pool.
  3. Create a user for the setup.
  4. Disable unnecessary features. Some RBD features aren’t supported in Linux kernels before 4.17, so they had to be disabled. You can check the options with:
ceph config help rbd_default_features
  5. Install the Docker volume plugin.
  6. Test in a container.
  7. Mount the volume and verify the file.
# Step 1
ceph osd pool create rbd

# Step 2
rbd pool init rbd

# Step 3
ceph auth get-or-create client.rbd mon 'profile rbd' osd 'profile rbd pool=rbd' mgr 'profile rbd pool=rbd' > /etc/ceph/ceph.client.rbd.keyring

# Step 4
ceph config set global rbd_default_features 7
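
The value 7 in Step 4 is a bitmask: layering (1) + striping (2) + exclusive-lock (4). The sketch below shows how such a mask can be computed from feature names. The helper name `rbd_feature_mask` is hypothetical, and the bit values are the ones I understand RBD to use; compare them with the output of `ceph config help rbd_default_features` on your version before relying on them.

```shell
# Combine named RBD features into the numeric mask expected by
# rbd_default_features. Bit values are assumed, see the note above.
rbd_feature_mask() {
  local mask=0 name
  for name in "$@"; do
    case "$name" in
      layering)       mask=$((mask | 1)) ;;
      striping)       mask=$((mask | 2)) ;;
      exclusive-lock) mask=$((mask | 4)) ;;
      object-map)     mask=$((mask | 8)) ;;
      fast-diff)      mask=$((mask | 16)) ;;
      deep-flatten)   mask=$((mask | 32)) ;;
      journaling)     mask=$((mask | 64)) ;;
      *) echo "unknown feature: $name" >&2; return 1 ;;
    esac
  done
  echo "$mask"
}

# Possible usage (not run here):
#   ceph config set global rbd_default_features "$(rbd_feature_mask layering striping exclusive-lock)"
```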

I used the wetopi/rbd volume driver. Other options I found hadn’t been updated in 3-4 years and seemed abandoned. This driver doesn’t require anything to run on the side and integrates directly into Docker. Install it like this:

# Step 5
docker plugin install wetopi/rbd \
  --alias=rbd \
  --grant-all-permissions \
  LOG_LEVEL=1 \
  RBD_CONF_POOL=rbd \
  RBD_CONF_CLUSTER=ceph \
  RBD_CONF_KEYRING_USER=client.rbd \
  MOUNT_OPTIONS='--options=noatime,discard'

Run a simple Docker Compose stack and mount the image:

version: "3.8"

services:
  test:
    image: alpine
    command: sh -c 'cat /vol/test; date | tee /vol/test'
    volumes:
      - data-volume:/vol

volumes:
  data-volume:
    name: vol1
    driver: rbd
    driver_opts:
      size: 123  # MiB

Run and verify:

# Step 6
sudo docker-compose up

# Step 7
rbd map vol1
mkdir /tmp/vol1
mount /dev/rbd0 /tmp/vol1
cd /tmp/vol1
ls -la 
cat test
cd /
umount /tmp/vol1 
rbd unmap /dev/rbd0
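
The Step 7 commands assume the image lands on /dev/rbd0, which only holds if nothing else is mapped. A minimal sketch of the same check that uses the device path printed by `rbd map` and cleans up even if a step fails. The function name `rbd_cat_file` and its two parameters are hypothetical, and it needs the same root privileges as the commands above.

```shell
# Map an RBD image, mount it on a temporary directory, print one file,
# then unmount and unmap regardless of whether the read succeeded.
# $1: image name (e.g. vol1)   $2: file path inside the volume (e.g. test)
rbd_cat_file() {
  local image="$1" file="$2" dev mnt status=0
  dev="$(rbd map "$image")" || return 1   # rbd map prints the device path
  mnt="$(mktemp -d)" || { rbd unmap "$dev"; return 1; }
  if mount "$dev" "$mnt"; then
    cat "$mnt/$file" || status=1
    umount "$mnt" || status=1
  else
    status=1
  fi
  rmdir "$mnt"
  rbd unmap "$dev" || status=1
  return "$status"
}

# Possible usage (not run here):
#   rbd_cat_file vol1 test
```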
Licensed under Apache License, Version 2.0
Last updated on Dec 10, 2024 14:01 +0200