* RFC: nvme multipath support
From: Christoph Hellwig @ 2017-08-23 17:58 UTC
  To: Jens Axboe; +Cc: Keith Busch, Sagi Grimberg, linux-nvme, linux-block

Hi all,

this series adds support for multipathing, that is, accessing nvme
namespaces through multiple controllers, to the nvme core driver.

It is a very thin and efficient implementation that relies on
close cooperation with other bits of the nvme driver, and a few small
and simple block helpers.

Compared to dm-multipath the important differences are how management
of the paths is done, and how the I/O path works.

Management of the paths is fully integrated into the nvme driver:
for each newly found nvme controller we check if there are other
controllers that refer to the same subsystem, and if so we link them
up in the nvme driver.  Then for each namespace found we check if
the namespace id and identifiers match, to determine whether multiple
controllers refer to the same namespace.  For now path
availability is based entirely on the controller status, which at
least for fabrics will be continuously updated based on the mandatory
keep alive timer.  Once the Asymmetric Namespace Access (ANA)
proposal passes in NVMe we will also get per-namespace states in
addition to that, but for now any details of that remain confidential
to NVMe members.
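
To make the matching rule above concrete, here is a minimal userspace
sketch.  It is not the driver code: the struct and function names are
made up for illustration, and it assumes that "same subsystem" can be
decided by comparing a subsystem NQN string and that the identifiers
are the EUI-64, NGUID and UUID values.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for the driver's structures. */
struct ns_ids_model {
	uint8_t eui64[8];
	uint8_t nguid[16];
	uint8_t uuid[16];
};

struct ctrl_model {
	char subnqn[256];	/* assumption: subsystem identified by its NQN */
};

struct ns_model {
	const struct ctrl_model *ctrl;	/* controller this namespace was found on */
	uint32_t nsid;
	struct ns_ids_model ids;
};

/* Two controllers are linked up if they report the same subsystem. */
static bool same_subsystem(const struct ctrl_model *a,
			   const struct ctrl_model *b)
{
	assert(a && b);
	return strncmp(a->subnqn, b->subnqn, sizeof(a->subnqn)) == 0;
}

static bool ids_equal(const struct ns_ids_model *a,
		      const struct ns_ids_model *b)
{
	return memcmp(a->eui64, b->eui64, sizeof(a->eui64)) == 0 &&
	       memcmp(a->nguid, b->nguid, sizeof(a->nguid)) == 0 &&
	       memcmp(a->uuid, b->uuid, sizeof(a->uuid)) == 0;
}

/*
 * Two namespace instances are paths to the same namespace if they sit
 * behind different controllers of one subsystem and both the namespace
 * id and the identifiers match.
 */
static bool same_namespace(const struct ns_model *a,
			   const struct ns_model *b)
{
	assert(a && b && a->ctrl && b->ctrl);
	return a->ctrl != b->ctrl &&
	       same_subsystem(a->ctrl, b->ctrl) &&
	       a->nsid == b->nsid &&
	       ids_equal(&a->ids, &b->ids);
}
```

Requiring both the nsid and the identifiers to match is the
conservative choice in this sketch: a matching nsid with different
identifiers is treated as two distinct namespaces.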

The I/O path is very different from the existing multipath drivers,
which is enabled by the fact that NVMe (unlike SCSI) does not support
partial completions - a controller will either complete a whole
command or not, but never only complete parts of it.  Because of that
there is no need to clone bios or requests - the I/O path simply
redirects the I/O to a suitable path.  For successful commands
multipath is not in the completion stack at all.  For failed commands
we decide if the error could be a path failure, and if so remove
the bios from the request structure and requeue them before completing
the request.  Altogether this means there is no performance
degradation compared to normal nvme operation when using the multipath
device node (at least not until I find a dual ported DRAM backed
device :))
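
The flow described above can be modelled in a few lines of userspace
C.  This is only a sketch under assumptions, not the code from the
patches: all names here are hypothetical, the status codes are
invented for the example, and path state is a simple flag that
something else (the controller state machine) is assumed to update.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model types, not the kernel's structures. */
enum path_state { PATH_LIVE, PATH_DOWN };
enum io_status { IO_OK, IO_MEDIA_ERROR, IO_PATH_ERROR };

struct path_model {
	enum path_state state;	/* mirrors the controller status */
};

struct bio_model {
	struct bio_model *next;
	int id;
};

struct bio_list_model {
	struct bio_model *head, *tail;
};

struct request_model {
	struct bio_model *bio, *biotail;	/* chain of bios in the request */
};

/* Submission side: redirect I/O to the first usable path, no cloning. */
static struct path_model *find_path(struct path_model *paths, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (paths[i].state == PATH_LIVE)
			return &paths[i];
	return NULL;
}

/* Move all bios from the request to the tail of the requeue list. */
static void steal_bios(struct bio_list_model *list, struct request_model *rq)
{
	assert(list && rq);
	if (!rq->bio)
		return;
	if (list->tail)
		list->tail->next = rq->bio;
	else
		list->head = rq->bio;
	list->tail = rq->biotail;
	rq->bio = rq->biotail = NULL;
}

/*
 * Completion side: only a possible path failure triggers failover.
 * Returns true if the bios were taken off the request for resubmission;
 * otherwise the request completes as in normal nvme operation.
 */
static bool complete_request(struct request_model *rq, enum io_status status,
			     struct bio_list_model *requeue)
{
	if (status != IO_PATH_ERROR)
		return false;
	steal_bios(requeue, rq);
	return true;
}
```

In this model a successful command never touches the requeue list,
which mirrors the point that multipath stays out of the completion
stack for the common case.  The requeued bios would later be
resubmitted through find_path() once the failed path is marked down.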

There are a couple of questions left in the individual patches;
comments welcome.

Note that this series requires the previous series to remove bi_bdev;
if in doubt, use the git tree below for testing.

A git tree is available at:

   git://git.infradead.org/users/hch/block.git nvme-mpath

gitweb:

   http://git.infradead.org/users/hch/block.git/shortlog/refs/heads/nvme-mpath
