Red Hat Ceph recommendations

The following are recommendations for the optimal usage of Red Hat Ceph Storage:
  1. It is recommended to use a replication factor of 3 for pools backed by HDD OSDs, and a replication factor of 2 for pools backed by SSD/NVMe OSDs in the Ceph cluster (see the sketch after this item).
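
     For example, a minimal sketch of applying these replication factors with the Ceph CLI; the pool names ("hdd-pool" and "ssd-pool") are placeholders:

       # Replication factor 3 for a pool backed by HDD OSDs.
       ceph osd pool set hdd-pool size 3
       ceph osd pool set hdd-pool min_size 2

       # Replication factor 2 for a pool backed by SSD/NVMe OSDs.
       ceph osd pool set ssd-pool size 2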

  2. It is recommended to maintain a proportionate ratio between Ceph nodes and OSDs per node, in line with NEBS compliance. The recommendation is 12 OSDs per node for HDDs, as this allows quicker data rebalancing in case of a disk failure (see the sketch after this note for verifying the distribution).
    NOTE:
    • During a node failure, the Ceph cluster starts rebalancing data, which consumes significant network bandwidth. This impacts the overall performance of the Ceph cluster.

    • Rebalancing data does not consume much CPU. If one CPU core is allocated per OSD (in the case of HDDs), the impact on system CPU utilization is minimal. An OSD failure consumes more network bandwidth than the addition of a new OSD to the Ceph cluster.
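
     A minimal sketch for verifying how OSDs are distributed across nodes, using standard Ceph CLI commands:

       # List OSDs grouped by host to confirm the OSDs-per-node ratio.
       ceph osd tree

       # Show per-OSD utilization and capacity, grouped by the CRUSH hierarchy.
       ceph osd df tree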

  3. It is recommended to configure the Placement Group (PG) count with future scaling requirements in mind. This helps with quick rebalancing and/or recovery of data and keeps the system healthy. Use the Ceph PG calculator to determine the PG count, as sketched below.
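
     As a sketch of the common rule of thumb the calculator applies (total PGs ≈ (OSD count × 100) / replication factor, rounded to the nearest power of two), with an illustrative pool name and numbers:

       # Example: 120 OSDs with replication factor 3:
       #   (120 * 100) / 3 = 4000, rounded to the nearest power of two = 4096
       ceph osd pool set mypool pg_num 4096
       ceph osd pool set mypool pgp_num 4096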

  4. It is recommended to add or remove Ceph nodes during a maintenance window (when the Ceph cluster is not in use). This avoids the situation where the system tries to rebalance data because one of the Ceph nodes is down, which causes high network bandwidth consumption, degrades overall IOPS, and leaves the Ceph cluster in an unhealthy state. Rebalancing can be suppressed for the duration of the maintenance, as sketched below.
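
     A minimal sketch of suppressing rebalancing during planned maintenance, using standard cluster flags:

       # Before maintenance: prevent OSDs from being marked out and data from moving.
       ceph osd set noout
       ceph osd set norebalance

       # ... add or remove the Ceph nodes ...

       # After maintenance: clear the flags so the cluster converges normally.
       ceph osd unset norebalance
       ceph osd unset noout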

  5. It is recommended to select the OSD media based on the workload requirements: choose HDDs for higher storage capacity requirements, and SSD or NVMe as OSD media for higher IOPS. Pools can be pinned to a media type through CRUSH device classes, as sketched below.
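
     A sketch of directing a pool onto a specific media type using CRUSH device classes; the rule name ("fast-rule") and pool name ("fast-pool") are placeholders:

       # Create a CRUSH rule that places data only on SSD-class OSDs.
       ceph osd crush rule create-replicated fast-rule default host ssd

       # Assign the rule to a pool that needs higher IOPS.
       ceph osd pool set fast-pool crush_rule fast-rule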

  6. It is recommended to use 10 GbE NICs for an HDD-based Ceph cluster and 25 GbE NICs for an SSD/NVMe-based cluster.
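
     If these NICs carry dedicated Ceph networks, a sketch of declaring them; the subnets are placeholders for your environment:

       # Bind Ceph public and cluster traffic to the dedicated networks.
       ceph config set global public_network 192.168.10.0/24
       ceph config set global cluster_network 192.168.20.0/24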

  7. It is recommended to have two partitions (that is, two OSDs) on each disk when using SSDs as OSD media, and four partitions on each disk when using NVMe as OSD media (see the sketch after this item).
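
     A sketch of provisioning multiple OSDs per device with ceph-volume; the device paths are placeholders:

       # Two OSDs per SSD device.
       ceph-volume lvm batch --osds-per-device 2 /dev/sdb

       # Four OSDs per NVMe device.
       ceph-volume lvm batch --osds-per-device 4 /dev/nvme0n1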

  8. It is recommended to use SSD/NVMe as OSD media if the workload requires a consistent number of IOPS. With HDDs as OSDs, IOPS keep decreasing as the disk/OSD storage fills up. The behavior can be verified with a benchmark, as sketched below.
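
     A sketch of measuring pool IOPS with the built-in RADOS benchmark; the pool name ("testpool") and 30-second duration are placeholders, and this should run against a non-production pool:

       # Write benchmark for 30 seconds, keeping the objects for the read test.
       rados bench -p testpool 30 write --no-cleanup

       # Random-read benchmark against the objects written above, then clean up.
       rados bench -p testpool 30 rand
       rados -p testpool cleanup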