Linux Multi-Queue Block IO Queueing Mechanism (blk-mq)
blk-mq (Multi-Queue Block IO Queueing Mechanism) is a new framework for the Linux block layer that was introduced with Linux kernel 3.13 and became feature-complete with kernel 3.16. Blk-mq allows for over 15 million IOPS with high-performance flash devices (e.g. PCIe SSDs) on 8-socket servers, and single- and dual-socket servers also benefit considerably from it. A device can only be used with blk-mq if its driver supports blk-mq.
This article explains how blk-mq integrates into the Linux storage stack and which devices have blk-mq compatible drivers already included in the Linux kernel.
blk-mq in the Linux Storage Stack
Blk-mq integrates seamlessly into the Linux storage stack. It provides device drivers with basic functions for mapping I/O requests to multiple queues. The work is distributed across multiple threads and therefore across multiple CPU cores (per-core software queues). Blk-mq compatible drivers tell blk-mq how many parallel hardware queues a device supports (the number of submission queues, passed as part of the hardware dispatch queue registration).
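The relationship between per-core software queues and hardware dispatch queues can be illustrated with a small sketch. This is a simplified model, not the kernel's actual algorithm: the function name `map_cpus_to_hw_queues` is hypothetical, and the even block distribution shown here is an assumption (the real mapping in the kernel also considers CPU topology).

```python
from collections import defaultdict
from typing import Dict, List


def map_cpus_to_hw_queues(nr_cpus: int, nr_hw_queues: int) -> Dict[int, List[int]]:
    """Assign each per-CPU software queue to one hardware dispatch queue.

    Simplified illustration only: CPUs are split into contiguous,
    roughly equal blocks, one block per hardware queue.
    """
    if nr_cpus < 1 or nr_hw_queues < 1:
        raise ValueError("nr_cpus and nr_hw_queues must both be >= 1")

    # More hardware queues than CPUs gives no benefit in this model,
    # so only as many hardware queues as there are CPUs are used.
    used_queues = min(nr_cpus, nr_hw_queues)

    mapping: Dict[int, List[int]] = defaultdict(list)
    for cpu in range(nr_cpus):
        hw_queue = cpu * used_queues // nr_cpus
        mapping[hw_queue].append(cpu)
    return dict(mapping)


if __name__ == "__main__":
    # Hypothetical example: 8 CPU cores, device reports 2 hardware queues.
    for hw_queue, cpus in map_cpus_to_hw_queues(8, 2).items():
        print(f"hardware queue {hw_queue}: software queues of CPUs {cpus}")
```

With a single hardware queue, all software queues feed into that one queue. With at least as many hardware queues as cores, each core gets a hardware queue of its own in this model.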
Blk-mq-based device drivers bypass the previous Linux I/O scheduler. Some drivers already did this before blk-mq existed (iomemory-vsl, nvme, mtip32xx), but as bio-based (block-I/O-based) drivers they had to implement many generic functions on their own ("stacked" approach).
All device drivers that use the previous block I/O layer continue to work independently of blk-mq as request-based drivers that go through the Linux I/O scheduler (request_fn based approach, see Linux I/O Stack Diagram). How much longer this request_fn based approach will remain in the Linux kernel is currently unclear (as of July 2014).
|Driver||Device Name||Supported Devices||blk-mq Since Kernel Version|
|null_blk||/dev/nullb*||none (test drivers)||3.13 (git commit)|
|virtio-blk||/dev/vd*||Virtual guest drivers (e.g. under KVM)||3.13 (git commit)|
|mtip32xx||/dev/rssd*||Micron RealSSD PCIe||3.16 (git commit)|
|scsi (scsi_mq)||/dev/sd*||e.g. SAS and SATA SSDs/HDDs||3.17 (git commit)|
|NVMe||/dev/nvme*||e.g. Intel SSD DC P3600 and DC P3700 Series||3.19 (git commit)|
|rbd||/dev/rbd*||RADOS Block Device (Ceph)||4.0 (git commit)|
|ubi/block||/dev/ubiblock*|| ||4.0 (git commit)|
|loop||/dev/loop*||Loopback device||4.0 (git commit)|
|dm / dm-mpath|| ||Request-based device mapper targets (currently only dm-multipath)||planned for 4.1|
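To see which block devices on a running system are actually handled by blk-mq, the sysfs tree can be inspected. The following is a minimal sketch under stated assumptions: it assumes that blk-mq devices expose an `mq` directory under `/sys/block/<device>/` with one numbered subdirectory per hardware queue, each containing a `cpu_list` file. This layout depends on the kernel version, so it should be verified on the target system. The function name `blk_mq_hw_queues` is hypothetical.

```python
import os
from typing import Dict


def blk_mq_hw_queues(device: str, sysfs_root: str = "/sys/block") -> Dict[int, str]:
    """Return {hardware queue index: CPU list} for a block device.

    An empty dict means no 'mq' directory was found, i.e. the device
    does not appear to be driven through blk-mq (assumed sysfs layout).
    """
    mq_dir = os.path.join(sysfs_root, device, "mq")
    if not os.path.isdir(mq_dir):
        return {}

    queues: Dict[int, str] = {}
    for entry in os.listdir(mq_dir):
        cpu_list_path = os.path.join(mq_dir, entry, "cpu_list")
        # Only numbered subdirectories with a cpu_list file are counted.
        if entry.isdigit() and os.path.isfile(cpu_list_path):
            with open(cpu_list_path) as handle:
                queues[int(entry)] = handle.read().strip()
    return queues


if __name__ == "__main__":
    root = "/sys/block"
    if os.path.isdir(root):
        for dev in sorted(os.listdir(root)):
            queues = blk_mq_hw_queues(dev, root)
            if queues:
                print(f"{dev}: {len(queues)} hardware queue(s)")
                for index in sorted(queues):
                    print(f"  queue {index}: CPUs {queues[index]}")
            else:
                print(f"{dev}: no mq directory found")
```

The `sysfs_root` parameter exists only so that the function can be tried against a copy or mock of the sysfs tree.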
- Kernel-Log – What's New in 3.13 (1): File Systems and Storage (heise.de, Dec. 10, 2013)
- The Multi-queue Block Layer (lwn.net, June 5, 2013)
- Linux & NVM File and Storage System Challenges (snia.org, Slides from Ric Wheeler, Senior Engineering Manager Kernel File Systems, Red Hat, Inc.)
- Blk-mq Is Almost Feature Complete & Fast With Linux 3.16 (phoronix.com, June 2, 2014)
- Linux Block IO: Introducing Multi-queue SSD Access on Multi-core Systems (Matias Bjørling, Jens Axboe, David Nellans, Philippe Bonnet at SYSTOR 2013 - 6th Annual International Systems and Storage Conference)
- blk-mq: New Multi-queue Block IO Queueing Mechanism (git commit by Jens Axboe from Oct. 25, 2013)
- Re: (PATCH RFC - TAKE TWO - 00/12) New Version of the BFQ I/O Scheduler: "... yes, we're likely going to maintain that code for a long time, so it's not going anywhere anytime soon..." (Jens Axboe, June 2, 2014)
- Re: (PATCH RFC - TAKE TWO - 00/12) New Version of the BFQ I/O Scheduler: "I'd really planning on not maintaining the old request based SCSI code for a long time once we get positive reports in from users of various kinds of older hardware." (Christoph Hellwig, June 4, 2014)
- Null block device driver (kernel.org/doc/Documentation)
- Virtio (www.linux-kvm.org)
- Boot from virtio block device (www.linux-kvm.org)
- Intel Solid-State Drive Data Center Family for PCIe (www.intel.com)
- dm: add full blk-mq support to request-based DM (dm-devel mailing list, March 13, 2015)
Author: Werner Fischer
Werner Fischer, working in the Web Operations & Knowledge Transfer team at Thomas-Krenn, completed his studies of Computer and Media Security at FH Hagenberg in Austria. He is a regular speaker at many conferences like LinuxTag, OSMC, OSDC, LinuxCon, and author for various IT magazines. In his spare time he enjoys playing the piano and training for a good result at the annual Linz marathon relay.