NVMe physical block size
SSDs store data in storage cells that are grouped into pages; multiple pages form a block. However, SSDs do not report physical_block_size themselves. Since kernel version 5.3, the Linux kernel calculates physical_block_size from several parameters reported by the SSD.
Calculation of physical_block_size in the Linux kernel
Since Linux kernel version 5.3, the physical_block_size of NVMe SSDs is calculated as follows: [1]
- To start with, the following applies:
- phys_bs = logical_block_size
- The logical block size is 512 bytes or 4,096 bytes, depending on the block size of the SSD's default namespace or on the block size with which the namespace has been formatted.
- If an NPWG (Namespace Preferred Write Granularity) is defined for the affected NVMe namespace, phys_bs is increased to the corresponding multiple of the logical block size:
- phys_bs = (1 + NPWG) * logical_block_size
- In addition, atomic_bs is determined (the atomic block size, that is, the amount of data that is still written atomically in the event of a power failure).
- Linux file systems assume that writing a single physical block is an atomic operation. Therefore, the smaller of the two values (phys_bs, atomic_bs) is used as physical_block_size:
- physical_block_size = min(phys_bs, atomic_bs)
Calculation in detail
The following information explains the calculation in detail.
atomic_bs
Definitions:[2]
- Namespace Features (NSFEAT):
- Bit 1 (NSABP), if set to '1', indicates that the fields NAWUN, NAWUPF and NACWU are defined for this namespace and should be used by the host instead of the fields AWUN, AWUPF and ACWU in the Identify Controller data structure. If this bit is cleared to '0', the controller does not support the fields NAWUN, NAWUPF and NACWU for this namespace. In this case, the host should use the fields AWUN, AWUPF and ACWU.
- Namespace Atomic Boundary Offset (NABO)
- This field states the LBA in this namespace at which the first atomic boundary starts. Typically, NABO refers to LBA 0.
- Namespace Atomic Write Unit Power Fail (NAWUPF) (≥ AWUPF)
- This field states the namespace-specific size of the write operation that is guaranteed to be written atomically to the NVM during a power failure or error condition. If the NSABP bit is set to '0', this field is reserved.
- A value of 0h means that the size for this namespace is the same as that specified in the AWUPF field of the Identify Controller data structure. All other values state the size in logical blocks using the same encoding as the AWUPF field.
- Atomic Write Unit Power Fail (AWUPF)
- This field states the size of the writing process, which is guaranteed to be written atomically to the NVM across all namespaces with any supported namespace format during a power failure or fault condition.
- If a specific namespace guarantees a larger size than specified in this field, this namespace-specific size is specified in the NAWUPF field in the Identify namespace data structure.
- This field is stated in logical blocks and is a 0-based value. The AWUPF value must be less than or equal to the AWUN value.
- When a write command with a size less than or equal to the AWUPF value is submitted, the host is guaranteed that the write to the NVM is atomic with respect to other read or write commands. When a write command larger than this size is submitted, there is no guarantee of command atomicity. If the write size is less than or equal to the AWUPF value and the write command fails, subsequent read commands for the affected logical blocks return the data from the previous successful write command. If a write command with a size greater than the AWUPF value is submitted, there is no guarantee as to which data subsequent read commands return for the affected logical blocks.
Calculation:
- atomic_bs = bs by default, where bs is the logical block size
- If NABO has the value 0:
- If the NSABP bit is set and NAWUPF is non-zero, the following applies: atomic_bs = (1 + NAWUPF) * bs
- Otherwise, the following applies: atomic_bs = (1 + AWUPF) * bs
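The rules above can be sketched in Python. This is only an illustration of the logic described here, not the kernel's C code. The function name atomic_block_size and its parameters are hypothetical; the NSABP bit is taken from the raw NSFEAT value, and a NAWUPF of 0 is treated as "use AWUPF", in line with the definition of NAWUPF above.

```python
NSFEAT_NSABP = 1 << 1  # NSFEAT bit 1: NAWUN, NAWUPF and NACWU are defined


def atomic_block_size(bs, nsfeat, nabo, nawupf, awupf):
    """Return atomic_bs in bytes for a logical block size bs (hypothetical helper)."""
    if nabo != 0:
        # Non-zero atomic boundary offset: stay with one logical block.
        return bs
    if nsfeat & NSFEAT_NSABP and nawupf != 0:
        # Namespace-specific value; the field is 0-based, hence "1 +".
        return (1 + nawupf) * bs
    # Controller-wide value from Identify Controller; also 0-based.
    return (1 + awupf) * bs
```

For example, with a logical block size of 512 bytes, NSABP set, NABO = 0 and NAWUPF = 511, the function returns 512 * 512 = 262,144 bytes (256 KiB).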
phys_bs
Definitions:[2]
- Namespace Features (NSFEAT):
- Bit 4 (OPTPERF), if set to '1', indicates that the fields NPWG, NPWA, NPDG, NPDA and NOWS are defined for this namespace and should be used by the host for I/O optimization. If the value is '0', the NVMe SSD does not support the NPWG, NPWA, NPDG, NPDA and NOWS fields for this namespace.
- Namespace Preferred Write Granularity (NPWG)
- This field specifies the smallest recommended write granularity in logical blocks for this namespace. It is a 0-based value. If the OPTPERF bit is set to "0", this field is reserved. The specified size should be less than or equal to the maximum data transfer size (MDTS), which is specified in units of the minimum memory page size. The value of this field may change when the namespace is reformatted. The size should be a multiple of Namespace Preferred Write Alignment (NPWA).
Calculation:
- phys_bs = bs by default
- If OPTPERF has the value 1, the following applies: phys_bs = (1 + NPWG) * bs
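As a minimal sketch, the phys_bs rule and the final min() step could look as follows in Python. The function names preferred_block_size and physical_block_size are hypothetical, and atomic_bs is assumed to have been determined separately as described in the atomic_bs section.

```python
NSFEAT_OPTPERF = 1 << 4  # NSFEAT bit 4: NPWG, NPWA, NPDG, NPDA and NOWS are defined


def preferred_block_size(bs, nsfeat, npwg):
    """Return phys_bs in bytes for a logical block size bs (hypothetical helper)."""
    if nsfeat & NSFEAT_OPTPERF:
        # NPWG is a 0-based count of logical blocks, hence "1 +".
        return (1 + npwg) * bs
    return bs


def physical_block_size(phys_bs, atomic_bs):
    """The smaller of the two values is used as physical_block_size."""
    return min(phys_bs, atomic_bs)
```

With a logical block size of 512 bytes, OPTPERF set and NPWG = 7, preferred_block_size returns 8 * 512 = 4,096 bytes. Capped by an atomic_bs of 262,144 bytes, the result remains 4,096 bytes.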
Example of an NVMe SSD
The following example shows a Micron 7450 Pro M.2 SSD with a logical block size of 512 bytes and E2MU200 firmware (older firmware versions reported larger values for NPWG):[3]
root@ubuntu2204:~# nvme list
Node                  SN                   Model                                    Namespace Usage                      Format           FW Rev
--------------------- -------------------- ---------------------------------------- --------- -------------------------- ---------------- --------
/dev/nvme0n1          22503FB3D7CE         Micron_7450_MTFDKBA960TFR                1         0,00 B / 960,20 GB         512 B + 0 B      E2MU200
root@ubuntu2204:~# cat /sys/devices/pci0000:00/0000:00:1c.0/0000:01:00.0/nvme/nvme0/nvme0n1/queue/physical_block_size
4096
root@ubuntu2204:~# cat /sys/devices/pci0000:00/0000:00:1c.0/0000:01:00.0/nvme/nvme0/nvme0n1/queue/logical_block_size
512
root@ubuntu2204:~# nvme id-ns -H /dev/nvme0n1 |grep -i NAWUPF
[1:1] : 0x1 Namespace uses NAWUN, NAWUPF, and NACWU
nawupf : 511
root@ubuntu2204:~# nvme id-ctrl -H /dev/nvme0 |grep -i AWUPF
awupf : 63
root@ubuntu2204:~# nvme id-ns -H /dev/nvme0n1 |grep -i NPWG
[4:4] : 0x1 NPWG, NPWA, NPDG, NPDA, and NOWS are Supported
npwg : 7
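To check these values against the calculation, the "name : value" lines of the nvme-cli output can be parsed and the arithmetic repeated. The following Python sketch is only an illustration: parse_fields is a hypothetical helper, SAMPLE repeats the grep output shown above, and NABO = 0 is an assumption, because NABO does not appear in the output.

```python
import re

# Lines as printed by the grep commands in the example above.
SAMPLE = """\
[1:1] : 0x1 Namespace uses NAWUN, NAWUPF, and NACWU
nawupf : 511
[4:4] : 0x1 NPWG, NPWA, NPDG, NPDA, and NOWS are Supported
npwg : 7
"""


def parse_fields(text):
    """Collect 'name : value' lines; bit-field lines starting with '[' are skipped."""
    fields = {}
    for line in text.splitlines():
        match = re.match(r"^\s*([a-z]\w*)\s*:\s*(0x[0-9a-fA-F]+|\d+)\s*$", line)
        if match:
            value = match.group(2)
            fields[match.group(1)] = int(value, 16) if value.startswith("0x") else int(value)
    return fields


fields = parse_fields(SAMPLE)
bs = 512                                  # logical_block_size from sysfs
phys_bs = (1 + fields["npwg"]) * bs       # OPTPERF bit is set: 8 * 512
atomic_bs = (1 + fields["nawupf"]) * bs   # NSABP bit is set, assuming NABO = 0: 512 * 512
print(min(phys_bs, atomic_bs))            # prints 4096
```

The result of 4,096 bytes matches the physical_block_size reported in sysfs for this SSD.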
The following example shows a Micron 7450 Max U.2 SSD with a logical block size of 4,096 bytes and E2MU200 firmware:
root@ubuntu2204:~# nvme list
Node                  SN                   Model                                    Namespace Usage                      Format           FW Rev
--------------------- -------------------- ---------------------------------------- --------- -------------------------- ---------------- --------
[...]
/dev/nvme1n1          22323B6AC5C5         Micron_7450_MTFDKCC3T2TFS                1         977,32 MB / 3,20 TB        4 KiB + 0 B      E2MU200
root@ubuntu2204:~# cat /sys/devices/pci0000:5d/0000:5d:01.0/0000:5e:00.0/nvme/nvme1/nvme1n1/queue/physical_block_size
4096
root@ubuntu2204:~# cat /sys/devices/pci0000:5d/0000:5d:01.0/0000:5e:00.0/nvme/nvme1/nvme1n1/queue/logical_block_size
4096
root@ubuntu2204:~# nvme id-ns -H /dev/nvme1n1 |grep -i NAWUPF
[1:1] : 0x1 Namespace uses NAWUN, NAWUPF, and NACWU
nawupf : 63
root@ubuntu2204:~# nvme id-ctrl -H /dev/nvme1 |grep -i AWUPF
awupf : 63
root@ubuntu2204:~# nvme id-ns -H /dev/nvme1n1 |grep -i NPWG
[4:4] : 0x1 NPWG, NPWA, NPDG, NPDA, and NOWS are Supported
npwg : 0
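The two sysfs values can also be read programmatically. The following Python sketch assumes that the device is reachable under /sys/block/<device>/queue, which points to the same queue directory as the long PCI paths used above. The function name read_block_sizes and its sysfs_root parameter are hypothetical.

```python
from pathlib import Path


def read_block_sizes(device, sysfs_root="/sys/block"):
    """Return (logical_block_size, physical_block_size) in bytes, e.g. for 'nvme1n1'."""
    queue = Path(sysfs_root) / device / "queue"
    logical = int((queue / "logical_block_size").read_text().strip())
    physical = int((queue / "physical_block_size").read_text().strip())
    return logical, physical
```

On the second SSD shown above, read_block_sizes("nvme1n1") would be expected to return (4096, 4096), in line with the cat output.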
More information
- Optimal Performance Parameters for NVMe SSDs (snia.org, SDC 2022, 13.09.2022)
References
- [1] nvme: set physical block size and optimal I/O size (git.kernel.org, Bart Van Assche, 28.06.2019)
- [2] NVM Command Set Specification, Revision 1.0c (nvmexpress.org, 03.10.2022)
- [3] ashift=18 needed for NVMe with physical block size 256k (github.com/openzfs/zfs/issues)
Author: Werner Fischer
Werner Fischer, working in the Knowledge Transfer team at Thomas-Krenn, completed his studies of Computer and Media Security at FH Hagenberg in Austria. He is a regular speaker at many conferences like LinuxTag, OSMC, OSDC, LinuxCon, and author for various IT magazines. In his spare time he enjoys playing the piano and training for a good result at the annual Linz marathon relay.
Translator: Alina Ranzinger
Alina has been working at Thomas-Krenn.AG since 2024. After her training as multilingual business assistant, she got her job as assistant of the Product Management and is responsible for the translation of texts and for the organisation of the department.


