Ceph health check threshold values

From Thomas-Krenn-Wiki

Ceph performs health checks to safeguard data integrity in a Ceph cluster. These checks include verifying the fill level of the OSDs. When certain threshold values are reached, Ceph starts processes that protect against data loss.

This article explains these threshold values and their impact within a Proxmox VE Ceph cluster.

Threshold values

The thresholds for the fill level of OSDs in a Ceph cluster are:

  • mon_osd_nearfull_ratio: Above this threshold, warnings about the fill level of the OSDs are issued.
  • mon_osd_backfillfull_ratio: Above this threshold, no more backfill is executed on the affected OSDs.
  • mon_osd_full_ratio: Once this threshold is reached, writes to this OSD and to other OSDs in the same pool are no longer permitted. The pool becomes read-only.
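
The relationship between the three thresholds can be sketched in a few lines of Python. This is a minimal illustration, not Ceph's implementation: the class Thresholds and the function classify_osd are hypothetical names, the ratios are the default values described in this article, and treating a fill level exactly at a threshold as having reached it is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    """Default ratios as described in this article (fractions of OSD capacity)."""
    nearfull: float = 0.85
    backfillfull: float = 0.90
    full: float = 0.95


def classify_osd(utilization: float, t: Thresholds = Thresholds()) -> str:
    """Return the highest threshold that an OSD fill level has reached.

    utilization is the used fraction of the OSD, e.g. 0.87 for 87 %.
    Assumption: a value equal to a threshold counts as reached.
    """
    if utilization >= t.full:
        return "full"
    if utilization >= t.backfillfull:
        return "backfillfull"
    if utilization >= t.nearfull:
        return "nearfull"
    return "ok"
```

Calling classify_osd(0.87) with the default thresholds returns "nearfull". The checks run from the highest threshold downwards so that an OSD is always reported with the most severe state it has reached.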

The following sections explain each threshold value in detail.

mon_osd_nearfull_ratio

Default value: 85%

If one or more OSDs reach this threshold value, a warning is shown on the Ceph dashboard of the cluster. The cluster's storage space should be expanded at the latest when this threshold value is reached.

When operating a Ceph cluster, check whether the default value of 85% is too high for your setup. An analysis can be found in the following article: Determine the optimal nearfull ratio in the Proxmox Ceph 3-node cluster.
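
One way to reason about whether 85% is too high is to estimate how the fill level changes when a single OSD fails. The following sketch is a simplification and not necessarily the method used in the linked article: it assumes that all OSDs in a host have the same size and fill level and that the data of the failed OSD is redistributed evenly to the remaining OSDs of the same host. The function names are hypothetical.

```python
def fill_after_osd_loss(utilization: float, osds_per_host: int) -> float:
    """Estimated fill level of the remaining OSDs after one OSD fails.

    Assumption: equally sized, equally filled OSDs, and the failed OSD's
    data is spread evenly over the remaining OSDs in the same host.
    """
    if osds_per_host < 2:
        raise ValueError("at least two OSDs per host are required")
    return utilization * osds_per_host / (osds_per_host - 1)


def max_fill_before_loss(limit: float, osds_per_host: int) -> float:
    """Highest fill level that stays at or below limit after one OSD fails."""
    if osds_per_host < 2:
        raise ValueError("at least two OSDs per host are required")
    return limit * (osds_per_host - 1) / osds_per_host
```

Under these assumptions, a host with four OSDs at 60% would end up at roughly 80% on the three remaining OSDs, and staying below a limit of 85% after the failure would require a fill level of about 64% beforehand.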

Figure: The mon_osd_nearfull_ratio threshold value is reached.

mon_osd_backfillfull_ratio

Default value: 90%

If an OSD reaches this threshold value, no more backfill is executed on this OSD[1]. This affects both backfilling after the cluster has been expanded and recovery in the event of an OSD or host failure.

Normal write operations, that is, writing new data and its replicas, continue to be carried out.

This threshold value is intended to prevent OSDs from reaching mon_osd_full_ratio prematurely[1].
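
The effect of this threshold on an individual OSD can be summarised in a small sketch. It only models the behaviour described in this article with the default values of 90% and 95%; the function allowed_operations is a hypothetical helper and not part of Ceph.

```python
def allowed_operations(
    utilization: float,
    backfillfull: float = 0.90,
    full: float = 0.95,
) -> dict[str, bool]:
    """Which operations an OSD still accepts at a given fill level.

    "backfill" covers backfilling and recovery, "client_writes" covers
    normal writes of new data and replicas, as described in the text.
    Assumption: a value equal to a threshold counts as reached.
    """
    return {
        "backfill": utilization < backfillfull,
        "client_writes": utilization < full,
    }
```

For an OSD at 92%, allowed_operations(0.92) reports that backfill is no longer accepted while normal writes are still possible.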

Figure: The mon_osd_backfillfull_ratio threshold value is reached.

mon_osd_full_ratio

Default value: 95%

Once an OSD has reached this threshold, it is considered full. To avoid data loss, no more data will be written to the pool from this point on[2][3].

Write operations become possible again only after the cluster has been expanded or storage space has been freed.
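
How much space has to be freed on a full OSD can be estimated with simple arithmetic. The following is a rough sketch with a hypothetical function name; it looks at a single OSD in isolation, uses the default value of 95%, and ignores that Ceph redistributes data across OSDs when the cluster is expanded.

```python
def space_to_free(capacity: float, used: float, full_ratio: float = 0.95) -> float:
    """Amount of data to remove so that the OSD drops back to full_ratio.

    capacity and used must use the same unit (for example GB).
    Returns 0.0 if the OSD is already at or below the threshold.
    """
    return max(0.0, used - full_ratio * capacity)
```

For an OSD with a capacity of 4000 GB and 3900 GB in use (97.5%), about 100 GB would have to be freed to get back to the 95% threshold. In practice more space should be freed, because the OSD would otherwise sit exactly at the limit.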

Figure: The mon_osd_full_ratio threshold value is reached.

References

  1. Ceph Health Checks (docs.ceph.com, 16.12.2025)
  2. No free drive space in Ceph (docs.ceph.com, 16.12.2025)
  3. OSD Full (docs.ceph.com, 16.12.2025)


Author: Stefan Bohn

Stefan Bohn has been employed at Thomas-Krenn.AG since 2020. Originally based in PreSales as a consultant for IT solutions, he moved to Product Management in 2022. There he dedicates himself to knowledge transfer and also drives the Thomas-Krenn Wiki.

Translator: Alina Ranzinger

Alina has been working at Thomas-Krenn.AG since 2024. After training as a multilingual business assistant, she joined Product Management as an assistant, where she is responsible for translating texts and for organising the department.


Related articles

Ceph Recovery Stop in the event of node failure
Determine the optimal nearfull ratio in the Proxmox Ceph 3-node cluster
Proxmox GUI remove old Ceph Health Warnings