Ceph as Default Storage for K3s on Alpine VMs

Using Ceph as your default storage is an excellent choice for a production-ready cluster! It provides distributed, replicated storage across all your nodes. Here’s how to set it up:

Architecture Overview

  • Ceph Cluster: Runs on your 3 VMs (k3s-1, k3s-2, k3s-3)
  • Each node contributes: a dedicated raw data disk for OSDs (e.g. ~40GB)
  • Replication: 3x replication for data safety
  • Kubernetes integration: Rook operator manages Ceph within K3s

Part 1: Prepare Nodes for Ceph

Step 1.1: Install Required Packages on ALL Nodes

# On ALL nodes (k3s-1, k3s-2, k3s-3)
apk add lvm2 udev e2fsprogs findutils util-linux

# Load the rbd kernel module now, and on every boot
# (Alpine's OpenRC reads /etc/modules rather than /etc/modules-load.d)
modprobe rbd
echo "rbd" >> /etc/modules

# Verify
lsmod | grep rbd

Step 1.2: Prepare a Raw Disk on Each Node

Rook v1.3 and later can only create OSDs on raw block devices (or PVCs), not on directories, so attach a dedicated data disk (e.g. ~40GB) to each VM. The commands below assume the disk appears as /dev/sdb; adjust for your VMs:

# On ALL nodes: confirm the new disk is present and empty (no filesystem, no partitions)
lsblk -f

# Wipe any old signatures so Rook will consume the disk (DESTROYS data on /dev/sdb!)
wipefs --all /dev/sdb

Part 2: Install Rook (Ceph Operator)

Step 2.1: Clone Rook Repository

# On master node (k3s-1)
git clone --single-branch --branch v1.14.0 https://github.com/rook/rook.git
cd rook/deploy/examples

Step 2.2: Apply CRDs and Common Resources

# Create CRDs
kubectl create -f crds.yaml
kubectl create -f common.yaml

# Create operator
kubectl create -f operator.yaml

# Wait for operator to be ready
kubectl -n rook-ceph get pods -w
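If you prefer a command that blocks instead of watching interactively, `kubectl wait` does the same job; a sketch (adjust the timeout to taste):

```shell
# Block until the Rook operator pod is Ready (times out after 5 minutes)
kubectl -n rook-ceph wait --for=condition=Ready pod \
  -l app=rook-ceph-operator --timeout=300s
```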

Step 2.3: Create Ceph Cluster Configuration

# Create cluster.yaml for your 3-node setup
cat > cluster.yaml << EOF
apiVersion: ceph.rook.io/v1
kind: CephCluster
metadata:
  name: rook-ceph
  namespace: rook-ceph
spec:
  cephVersion:
    image: quay.io/ceph/ceph:v18.2.2
  dataDirHostPath: /var/lib/rook
  mon:
    count: 3
    allowMultiplePerNode: false
  dashboard:
    enabled: true
    ssl: true
  storage:
    useAllNodes: false
    useAllDevices: false
    # device names assume a dedicated data disk at /dev/sdb on each node; adjust as needed
    nodes:
    - name: "k3s-1"
      devices:
      - name: "sdb"
    - name: "k3s-2"
      devices:
      - name: "sdb"
    - name: "k3s-3"
      devices:
      - name: "sdb"
  healthCheck:
    daemonHealth:
      mon:
        interval: 45s
      osd:
        interval: 60s
      status:
        interval: 60s
EOF

# Apply the cluster
kubectl create -f cluster.yaml

# Watch cluster creation (this takes 5-10 minutes)
kubectl -n rook-ceph get pods -w

Step 2.4: Verify Ceph Cluster Status

# Check all pods are running
kubectl -n rook-ceph get pods

# You should see:
# - rook-ceph-mon-* (3 pods)
# - rook-ceph-mgr-* (1-2 pods; one of them serves the dashboard)
# - rook-ceph-osd-* (3 pods, one per node)
# - csi-rbdplugin-* and csi-cephfsplugin-* (CSI driver pods)

# Check Ceph status using toolbox
kubectl create -f toolbox.yaml

# Exec into toolbox
kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- bash

# Inside toolbox, check cluster health
ceph status
ceph osd status
ceph df
exit
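To script the same check without an interactive shell, you can exec the commands directly; a sketch that polls until the cluster is healthy:

```shell
# Poll until Ceph reports HEALTH_OK (Ctrl-C to stop early)
until kubectl -n rook-ceph exec deploy/rook-ceph-tools -- ceph health | grep -q HEALTH_OK; do
  echo "waiting for HEALTH_OK..."
  sleep 10
done
```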

Part 3: Create Storage Classes

Step 3.1: Create Block Storage (RBD) – For RWO volumes

# Create block storage class
cat << EOF | kubectl apply -f -
apiVersion: ceph.rook.io/v1
kind: CephBlockPool
metadata:
  name: replicapool
  namespace: rook-ceph
spec:
  replicated:
    size: 3
    requireSafeReplicaSize: true
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ceph-block
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: rook-ceph.rbd.csi.ceph.com
parameters:
  clusterID: rook-ceph
  pool: replicapool
  imageFormat: "2"
  imageFeatures: layering
  csi.storage.k8s.io/provisioner-secret-name: rook-csi-rbd-provisioner
  csi.storage.k8s.io/provisioner-secret-namespace: rook-ceph
  csi.storage.k8s.io/controller-expand-secret-name: rook-csi-rbd-provisioner
  csi.storage.k8s.io/controller-expand-secret-namespace: rook-ceph
  csi.storage.k8s.io/node-stage-secret-name: rook-csi-rbd-node
  csi.storage.k8s.io/node-stage-secret-namespace: rook-ceph
  csi.storage.k8s.io/fstype: ext4
allowVolumeExpansion: true
reclaimPolicy: Delete
EOF
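Before moving on, it's worth confirming both objects exist (each command should return one line):

```shell
# Verify the block pool and storage class were created
kubectl -n rook-ceph get cephblockpool replicapool
kubectl get storageclass ceph-block
```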

Step 3.2: Create Shared Filesystem (CephFS) – For RWX volumes

# Create filesystem
cat << EOF | kubectl apply -f -
apiVersion: ceph.rook.io/v1
kind: CephFilesystem
metadata:
  name: cephfs
  namespace: rook-ceph
spec:
  metadataPool:
    replicated:
      size: 3
  dataPools:
    - replicated:
        size: 3
  preserveFilesystemOnDelete: true
  metadataServer:
    activeCount: 1
    activeStandby: true
EOF

# Create storage class for CephFS
cat << EOF | kubectl apply -f -
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: cephfs
provisioner: rook-ceph.cephfs.csi.ceph.com
parameters:
  clusterID: rook-ceph
  fsName: cephfs
  pool: cephfs-data0
  csi.storage.k8s.io/provisioner-secret-name: rook-csi-cephfs-provisioner
  csi.storage.k8s.io/provisioner-secret-namespace: rook-ceph
  csi.storage.k8s.io/controller-expand-secret-name: rook-csi-cephfs-provisioner
  csi.storage.k8s.io/controller-expand-secret-namespace: rook-ceph
  csi.storage.k8s.io/node-stage-secret-name: rook-csi-cephfs-node
  csi.storage.k8s.io/node-stage-secret-namespace: rook-ceph
allowVolumeExpansion: true
reclaimPolicy: Delete
EOF

Part 4: Make Ceph the Default Storage Class

# Remove the default annotation from K3s's built-in local-path storage class
kubectl patch storageclass local-path -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"false"}}}'

# Ensure ceph-block is the default (it was already annotated at creation; this is a safe re-apply)
kubectl patch storageclass ceph-block -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'

# Verify
kubectl get storageclass
# ceph-block should have (default) next to it
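For a scripted check, jsonpath can print every storage class alongside its default annotation (dots inside the annotation key are escaped with `\.`):

```shell
# Print each storage class and its is-default-class annotation
kubectl get storageclass -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.metadata.annotations.storageclass\.kubernetes\.io/is-default-class}{"\n"}{end}'
```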

Part 5: Test Ceph Storage

Step 5.1: Test Block Storage (RWO)

# Create PVC using default storage class
cat << EOF | kubectl apply -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ceph-block-test
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: ceph-block-test-pod
spec:
  containers:
  - name: app
    image: alpine
    command: ["sleep", "3600"]
    volumeMounts:
    - name: data
      mountPath: /data
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: ceph-block-test
EOF

# Test writing data
kubectl exec -it ceph-block-test-pod -- sh -c "echo 'Ceph block storage test' > /data/test.txt"
kubectl exec -it ceph-block-test-pod -- cat /data/test.txt
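Behind the scenes the PVC is backed by an RBD image in replicapool; you can confirm from the toolbox:

```shell
# List RBD images in the pool - the PVC above should appear as a csi-vol-* image
kubectl -n rook-ceph exec deploy/rook-ceph-tools -- rbd ls replicapool
```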

Step 5.2: Test Shared Filesystem (RWX)

# Create RWX PVC
cat << EOF | kubectl apply -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: cephfs-test
spec:
  storageClassName: cephfs
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 5Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cephfs-test-deploy
spec:
  replicas: 2
  selector:
    matchLabels:
      app: cephfs-test
  template:
    metadata:
      labels:
        app: cephfs-test
    spec:
      containers:
      - name: app1
        image: alpine
        command: ["sleep", "3600"]
        volumeMounts:
        - name: shared
          mountPath: /data
      volumes:
      - name: shared
        persistentVolumeClaim:
          claimName: cephfs-test
EOF

# Test that both pods can read/write the same data
POD1=$(kubectl get pods -l app=cephfs-test -o jsonpath='{.items[0].metadata.name}')
POD2=$(kubectl get pods -l app=cephfs-test -o jsonpath='{.items[1].metadata.name}')

# Write from pod1
kubectl exec $POD1 -- sh -c "echo 'Shared data from pod1' > /data/shared.txt"

# Read from pod2
kubectl exec $POD2 -- cat /data/shared.txt

Part 6: Access Ceph Dashboard

Step 6.1: Expose Dashboard

# Create a service to access dashboard
cat << EOF | kubectl apply -f -
apiVersion: v1
kind: Service
metadata:
  name: rook-ceph-mgr-dashboard-external
  namespace: rook-ceph
spec:
  type: NodePort
  ports:
  - name: dashboard
    # the mgr dashboard listens on 8443 when ssl: true (7000 when ssl is disabled)
    port: 8443
    targetPort: 8443
    nodePort: 30444
  selector:
    app: rook-ceph-mgr
    rook_cluster: rook-ceph
EOF

# Get dashboard password
kubectl -n rook-ceph get secret rook-ceph-dashboard-password -o jsonpath="{['data']['password']}" | base64 --decode

Step 6.2: Access Dashboard

  • URL: `https://10.50.1.101:30444`
  • Username: admin
  • Password: (from command above)
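If you'd rather not expose a NodePort, port-forwarding the dashboard service that Rook creates works too:

```shell
# Forward the dashboard to your workstation, then browse to https://localhost:8443
kubectl -n rook-ceph port-forward svc/rook-ceph-mgr-dashboard 8443:8443
```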

Part 7: Monitor and Maintain Ceph

Step 7.1: Check Ceph Health

# Use toolbox
kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- ceph status

# Check OSD status
kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- ceph osd tree

# Check storage usage
kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- ceph df

Step 7.2: Monitor in Rancher

  1. Go to Rancher UI
  2. Navigate to “Cluster Explorer”
  3. Go to “Storage Classes” – you’ll see ceph-block and cephfs
  4. Go to “Persistent Volumes” to see Ceph volumes
  5. Browse the rook-ceph namespace under Workloads to monitor the Ceph pods

Part 8: Backup and Recovery

Step 8.1: Backup Ceph Configuration

# Backup critical resources
kubectl -n rook-ceph get cephcluster -o yaml > ceph-cluster-backup.yaml
kubectl -n rook-ceph get cephblockpool -o yaml > ceph-pool-backup.yaml
kubectl get storageclass ceph-block -o yaml > storageclass-backup.yaml

Step 8.2: Snapshots (Optional)

# Install snapshot CRDs (consider pinning a released external-snapshotter tag instead of master)
kubectl apply -f https://raw.githubusercontent.com/kubernetes-csi/external-snapshotter/master/client/config/crd/snapshot.storage.k8s.io_volumesnapshots.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes-csi/external-snapshotter/master/client/config/crd/snapshot.storage.k8s.io_volumesnapshotclasses.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes-csi/external-snapshotter/master/client/config/crd/snapshot.storage.k8s.io_volumesnapshotcontents.yaml

# Note: the snapshot-controller deployment from the same repo must also be running
# for snapshots to be processed; the CRDs alone are not enough

# Create snapshot class
cat << EOF | kubectl apply -f -
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: ceph-block-snapclass
driver: rook-ceph.rbd.csi.ceph.com
deletionPolicy: Delete
EOF

Resource Considerations

With ~40GB per node dedicated to OSDs and 3x replication:
  • Raw capacity: 3 × 40GB = 120GB
  • Usable before overhead: ~40GB (each 1GB of data consumes 3GB of raw space)
  • Ceph overhead: ~5GB per node
  • Available for workloads: ~35GB
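The capacity arithmetic can be sketched as a quick shell calculation (the sizes below are the assumed values from this guide, not measured numbers):

```shell
#!/bin/sh
# Assumed sizing: 3 nodes, ~40GB per node for OSDs, 3x replication, ~5GB overhead per node
NODES=3
PER_NODE_GB=40
REPLICAS=3
OVERHEAD_GB=5

RAW=$((NODES * PER_NODE_GB))                          # total raw capacity
USABLE=$((RAW / REPLICAS))                            # usable before overhead
AVAILABLE=$(((RAW - NODES * OVERHEAD_GB) / REPLICAS)) # usable after Ceph overhead

echo "raw=${RAW}GB usable=${USABLE}GB available=${AVAILABLE}GB"
```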

If you need more storage, consider:
1. Adding additional disks to VMs
2. Reducing replication to 2 (less safe but more space)
3. Adding more nodes


Quick Commands Reference

# Check Ceph status
kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- ceph status

# List pools
kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- ceph osd lspools

# Show storage usage
kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- ceph df

# Restart Ceph tools pod (if needed)
kubectl -n rook-ceph delete pod -l app=rook-ceph-tools

# View Ceph logs
kubectl -n rook-ceph logs -l app=rook-ceph-osd

# Access dashboard password
kubectl -n rook-ceph get secret rook-ceph-dashboard-password -o jsonpath="{['data']['password']}" | base64 --decode

Clean Up (if needed)

# Delete test resources
kubectl delete pod ceph-block-test-pod
kubectl delete pvc ceph-block-test
kubectl delete deployment cephfs-test-deploy
kubectl delete pvc cephfs-test

# To completely remove Ceph (DANGER - deletes all data!)
kubectl -n rook-ceph delete cephcluster rook-ceph

# Wait until the CephCluster resource is fully gone before deleting the namespace,
# then wipe /var/lib/rook and the OSD disks on each node before any reinstall
kubectl delete namespace rook-ceph

Your cluster now has enterprise-grade distributed storage with Ceph as the default! All PVCs will automatically use Ceph block storage with 3x replication across your nodes.