Linux Multi-Queue Block IO Queueing Mechanism (blk-mq) Details

From Thomas-Krenn-Wiki

blk-mq (Multi-Queue Block IO Queueing Mechanism) is a new framework for the Linux block layer that was introduced with Linux kernel 3.13 and became feature-complete with kernel 3.16.[1] Blk-mq allows for over 15 million IOPS with high-performance flash devices (e.g. PCIe SSDs) on 8-socket servers, and single- and dual-socket servers also benefit considerably from it.[2] To use a device with blk-mq, the device's driver must support blk-mq.

This article explains how blk-mq integrates into the Linux storage stack and which devices have blk-mq compatible drivers already included in the Linux kernel.

blk-mq in the Linux Storage Stack

Two-level Linux block layer design of blk-mq.[2]

Blk-mq integrates seamlessly into the Linux storage stack. It provides device drivers with basic functions for mapping I/O requests to multiple queues. The work is distributed across multiple threads and therefore across multiple CPU cores (per-core software queues). Blk-mq compatible drivers tell blk-mq how many parallel hardware queues a device supports (the number of submission queues, as part of the hardware dispatch queue registration).
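The two-level design can be illustrated with a small toy model. The following Python sketch is not kernel code: the function name map_cpus_to_hw_queues and the plain round-robin assignment are assumptions made for illustration, and the real kernel mapping also takes the CPU topology into account.

```python
from collections import defaultdict


def map_cpus_to_hw_queues(nr_cpus, nr_hw_queues):
    """Assign each per-core software queue to one hardware dispatch queue.

    Simplifying assumption: plain round-robin by CPU number.
    """
    if nr_cpus < 1 or nr_hw_queues < 1:
        raise ValueError("need at least one CPU and one hardware queue")
    # In this model, hardware queues beyond the number of CPUs stay unused.
    used_queues = min(nr_cpus, nr_hw_queues)
    mapping = defaultdict(list)
    for cpu in range(nr_cpus):
        mapping[cpu % used_queues].append(cpu)
    return dict(mapping)


if __name__ == "__main__":
    # 8 cores sharing a device that registered 2 hardware queues
    for hw_queue, cpus in map_cpus_to_hw_queues(8, 2).items():
        print(f"hardware queue {hw_queue}: software queues of CPUs {cpus}")
```

With one hardware queue, every software queue feeds the same dispatch queue. With as many hardware queues as cores, each core gets a queue of its own.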

Blk-mq-based device drivers bypass the previous Linux I/O scheduler. Some drivers already did this before blk-mq existed (iomemory-vsl, nvme, mtip32xx), but as bio-based (block-I/O-based) drivers they had to implement many generic functions on their own ("stacked" approach).
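Whether a device goes through an I/O scheduler can usually be seen in the sysfs file /sys/block/<device>/queue/scheduler. The sketch below parses the content of that file. The helper name active_scheduler is hypothetical, and the expectation that a blk-mq device on the kernels discussed here reports "none" is stated as an assumption, not taken from the sources of this article.

```python
def active_scheduler(scheduler_text):
    """Return the active scheduler named in a `queue/scheduler` string.

    Request-based devices list all schedulers and mark the active one with
    square brackets, e.g. "noop deadline [cfq]". A single unbracketed entry
    (assumed here: "none" for a blk-mq device) is returned as-is.
    """
    tokens = scheduler_text.split()
    for token in tokens:
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    if len(tokens) == 1:
        return tokens[0]
    return None  # empty or unrecognized content


if __name__ == "__main__":
    for sample in ("noop deadline [cfq]\n", "none\n"):
        print(repr(sample), "->", active_scheduler(sample))
```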

All device drivers that use the previous block I/O layer continue to work independently of blk-mq as request-based drivers that go through the Linux I/O scheduler (request_fn based approach, see Linux I/O Stack Diagram).[3] As of July 2014, it is unclear how much longer this request_fn based approach will remain in the Linux kernel.[4][5]
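On a running system, blk-mq devices can be told apart from request_fn based devices by looking for the mq directory that blk-mq creates under /sys/block/<device>/, with one numbered subdirectory per hardware queue. The following is a minimal sketch under that assumption. The function names and the sysfs_root parameter (which only exists so the code can be tried against a test directory) are hypothetical.

```python
import os


def uses_blk_mq(device, sysfs_root="/sys"):
    """Return True if the block device exposes a blk-mq `mq` directory."""
    return os.path.isdir(os.path.join(sysfs_root, "block", device, "mq"))


def hardware_queue_count(device, sysfs_root="/sys"):
    """Count the numbered hardware queue directories below `mq`."""
    mq_dir = os.path.join(sysfs_root, "block", device, "mq")
    if not os.path.isdir(mq_dir):
        return 0  # not a blk-mq device (or device does not exist)
    return sum(1 for name in os.listdir(mq_dir) if name.isdigit())


if __name__ == "__main__":
    block_dir = "/sys/block"
    if os.path.isdir(block_dir):
        for dev in sorted(os.listdir(block_dir)):
            if uses_blk_mq(dev):
                print(f"{dev}: blk-mq, {hardware_queue_count(dev)} hardware queue(s)")
            else:
                print(f"{dev}: no mq directory found")
```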

Device Drivers

| Driver | Device Name | Supported Devices | blk-mq Since Kernel Version |
|---|---|---|---|
| null_blk | /dev/nullb*[6] | none (test driver) | 3.13 (git commit) |
| virtio-blk | /dev/vd* | Virtual guest drivers (e.g. under KVM[7][8]) | 3.13 (git commit) |
| mtip32xx | /dev/rssd* | Micron RealSSD PCIe | 3.16 (git commit) |
| scsi (scsi_mq) | /dev/sd* | e.g. SAS and SATA SSDs/HDDs | 3.17 (git commit) |
| NVMe | /dev/nvme* | e.g. Intel SSD DC P3600 and DC P3700 Series[9] | 3.19 (git commit) |
| rbd | /dev/rbd* | RADOS Block Device (Ceph) | 4.0 (git commit) |
| ubi/block | /dev/ubiblock* | | 4.0 (git commit) |
| loop | /dev/loop* | Loopback device | 4.0 (git commit) |
| dm / dm-mpath | | request-based device mapper targets (currently only dm-multipath) | planned for 4.1[10] |
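The driver table can also be expressed as a small lookup that compares a kernel release string against the version in which a driver gained blk-mq support. This is only a sketch built from the table above. The names BLK_MQ_SINCE, parse_kernel_version and driver_has_blk_mq are hypothetical, and dm / dm-mpath is left out because its support was only planned at the time of writing.

```python
import re

# (major, minor) kernel version in which each driver gained blk-mq support,
# taken from the table in this article.
BLK_MQ_SINCE = {
    "null_blk": (3, 13),
    "virtio-blk": (3, 13),
    "mtip32xx": (3, 16),
    "scsi": (3, 17),
    "nvme": (3, 19),
    "rbd": (4, 0),
    "ubi/block": (4, 0),
    "loop": (4, 0),
}


def parse_kernel_version(release):
    """Turn a release string such as '3.16.0-4-amd64' into (3, 16)."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def driver_has_blk_mq(driver, release):
    """Return True/False per the table, or None if the driver is not listed."""
    since = BLK_MQ_SINCE.get(driver.lower())
    if since is None:
        return None
    return parse_kernel_version(release) >= since


if __name__ == "__main__":
    import platform

    release = platform.release()
    for name in sorted(BLK_MQ_SINCE):
        print(name, release, driver_has_blk_mq(name, release))
```

A result of True only means that the kernel version contains the blk-mq variant of the driver. It does not show whether a particular device is actually using it.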

Additional Resources

References

  1. Blk-mq Is Almost Feature Complete & Fast With Linux 3.16 (phoronix.com, June 2, 2014)
  2. Linux Block IO: Introducing Multi-queue SSD Access on Multi-core Systems (Matias Bjørling, Jens Axboe, David Nellans, Philippe Bonnet at SYSTOR 2013 - 6th Annual International Systems and Storage Conference)
  3. blk-mq: New Multi-queue Block IO Queueing Mechanism (git commit by Jens Axboe from Oct. 25, 2013)
  4. Re: (PATCH RFC - TAKE TWO - 00/12) New Version of the BFQ I/O Scheduler: "... yes, we're likely going to maintain that code for a long time, so it's not going anywhere anytime soon..." (Jens Axboe, June 2, 2014)
  5. Re: (PATCH RFC - TAKE TWO - 00/12) New Version of the BFQ I/O Scheduler: "I'd really planning on not maintaining the old request based SCSI code for a long time once we get positive reports in from users of various kinds of older hardware." (Christoph Hellwig, June 4, 2014)
  6. Null block device driver (kernel.org/doc/Documentation)
  7. Virtio (www.linux-kvm.org)
  8. Boot from virtio block device (www.linux-kvm.org)
  9. Intel Solid-State Drive Data Center Family for PCIe (www.intel.com)
  10. dm: add full blk-mq support to request-based DM (dm-devel mailing list, March 13, 2015)



Author: Werner Fischer

Werner Fischer works in the Knowledge Transfer team at Thomas-Krenn and completed his studies of Computer and Media Security at FH Hagenberg in Austria. He is a regular speaker at conferences such as LinuxTag, OSMC, OSDC, and LinuxCon, and writes for various IT magazines. In his spare time he enjoys playing the piano and training for a good result at the annual Linz marathon relay.