From: Mike Snitzer <snitzer@redhat.com>
To: Hannes Reinecke <hare@suse.de>
Cc: linux-nvme@lists.infradead.org, Keith Busch <keith.busch@intel.com>,
	Sagi Grimberg <sagi@grimberg.me>, hch@lst.de, axboe@kernel.dk,
	Martin Wilck <mwilck@suse.com>, lijie <lijie34@huawei.com>,
	xose.vazquez@gmail.com, chengjike.cheng@huawei.com,
	shenhong09@huawei.com, dm-devel@redhat.com,
	wangzhoumengjian@huawei.com, christophe.varoqui@opensvc.com,
	bmarzins@redhat.com, sschremm@netapp.com,
	linux-block@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: nvme: allow ANA support to be independent of native multipathing
Date: Fri, 16 Nov 2018 09:01:53 -0500
Message-ID: <20181116140153.GB28870@redhat.com>
In-Reply-To: <f7fe5b65-d030-bc5d-233f-e5bc2f995efc@suse.de>

On Fri, Nov 16 2018 at 2:25am -0500,
Hannes Reinecke <hare@suse.de> wrote:

> On 11/15/18 6:46 PM, Mike Snitzer wrote:
> >Whether or not ANA is present is a choice of the target implementation;
> >the host (and whether it supports multipathing) has _zero_ influence on
> >this. If the target declares a path as 'inaccessible' the path _is_
> >inaccessible to the host. As such, ANA support should be functional
> >even if native multipathing is not.
> >
> >Introduce ability to always re-read ANA log page as required due to ANA
> >error and make current ANA state available via sysfs -- even if native
> >multipathing is disabled on the host (e.g. nvme_core.multipath=N).
> >
> >This affords userspace access to the current ANA state independent of
> >which layer might be doing multipathing. It also allows multipath-tools
> >to rely on the NVMe driver for ANA support while dm-multipath takes care
> >of multipathing.
> >
> >While implementing these changes care was taken to preserve the exact
> >ANA functionality and code sequence native multipathing has provided.
> >This manifests as native multipathing's nvme_failover_req() being
> >tweaked to call __nvme_update_ana() which was factored out to allow
> >nvme_update_ana() to be called independent of nvme_failover_req().
> >
> >And as always, if embedded NVMe users do not want any performance
> >overhead associated with ANA or native NVMe multipathing they can
> >disable CONFIG_NVME_MULTIPATH.
> >
> >Signed-off-by: Mike Snitzer <snitzer@redhat.com>
> >---
> > drivers/nvme/host/core.c      | 10 +++++----
> > drivers/nvme/host/multipath.c | 49 +++++++++++++++++++++++++++++++++----------
> > drivers/nvme/host/nvme.h      |  4 ++++
> > 3 files changed, 48 insertions(+), 15 deletions(-)
> >
> >diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> >index fe957166c4a9..3df607905628 100644
> >--- a/drivers/nvme/host/core.c
> >+++ b/drivers/nvme/host/core.c
> >@@ -255,10 +255,12 @@ void nvme_complete_rq(struct request *req)
> > 		nvme_req(req)->ctrl->comp_seen = true;
> > 	if (unlikely(status != BLK_STS_OK && nvme_req_needs_retry(req))) {
> >-		if ((req->cmd_flags & REQ_NVME_MPATH) &&
> >-		    blk_path_error(status)) {
> >-			nvme_failover_req(req);
> >-			return;
> >+		if (blk_path_error(status)) {
> >+			if (req->cmd_flags & REQ_NVME_MPATH) {
> >+				nvme_failover_req(req);
> >+				return;
> >+			}
> >+			nvme_update_ana(req);
> > 		}
> > 		if (!blk_queue_dying(req->q)) {

...

> >diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c
> >index 8e03cda770c5..0adbcff5fba2 100644
> >--- a/drivers/nvme/host/multipath.c
> >+++ b/drivers/nvme/host/multipath.c
> >@@ -58,25 +87,22 @@ void nvme_failover_req(struct request *req)
> > 	spin_unlock_irqrestore(&ns->head->requeue_lock, flags);
> > 	blk_mq_end_request(req, 0);
> >-	switch (status & 0x7ff) {
> >-	case NVME_SC_ANA_TRANSITION:
> >-	case NVME_SC_ANA_INACCESSIBLE:
> >-	case NVME_SC_ANA_PERSISTENT_LOSS:
> >+	if (nvme_ana_error(status)) {
> > 		/*
> > 		 * If we got back an ANA error we know the controller is alive,
> > 		 * but not ready to serve this namespaces. The spec suggests
> > 		 * we should update our general state here, but due to the fact
> > 		 * that the admin and I/O queues are not serialized that is
> > 		 * fundamentally racy. So instead just clear the current path,
> >-		 * mark the the path as pending and kick of a re-read of the ANA
> >+		 * mark the path as pending and kick off a re-read of the ANA
> > 		 * log page ASAP.
> > 		 */
> > 		nvme_mpath_clear_current_path(ns);
> >-		if (ns->ctrl->ana_log_buf) {
> >-			set_bit(NVME_NS_ANA_PENDING, &ns->flags);
> >-			queue_work(nvme_wq, &ns->ctrl->ana_work);
> >-		}
> >-		break;
> >+		__nvme_update_ana(ns);
> >+		goto kick_requeue;
> >+	}
> >+
> >+	switch (status & 0x7ff) {
> > 	case NVME_SC_HOST_PATH_ERROR:
> > 		/*
> > 		 * Temporary transport disruption in talking to the controller.
> >@@ -93,6 +119,7 @@ void nvme_failover_req(struct request *req)
> > 		break;
> > 	}
> >+kick_requeue:
> > 	kblockd_schedule_work(&ns->head->requeue_work);
> > }
> Doesn't the need to be protected by 'if (ns->head->disk)' or somesuch?

No. nvme_failover_req() is only ever called by native multipathing; see
nvme_complete_rq()'s check for req->cmd_flags & REQ_NVME_MPATH as the
condition for calling nvme_failover_req().

The previous RFC-style patch I posted muddled ANA and multipathing in
nvme_update_ana() but this final patch submission was fixed because I
saw a cleaner way forward by having nvme_failover_req() also do ANA work
just like it always has -- albeit with new helpers that
nvme_update_ana() also calls.

Mike
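
The dispatch described in this reply can be illustrated with a small
standalone userspace model. This is only a sketch under stated
assumptions, not the kernel code: every sketch_* name is hypothetical,
the path_error field stands in for the result of blk_path_error(), the
ana_refresh_queued flag stands in for setting NVME_NS_ANA_PENDING and
queueing ana_work, and the status values are assumed to match the
kernel's include/linux/nvme.h. The body of nvme_update_ana() is not
shown in the quoted diff, so the model assumes it only acts on ANA
errors.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed to mirror the NVMe path-related status codes. */
enum {
	SKETCH_SC_ANA_PERSISTENT_LOSS = 0x301,
	SKETCH_SC_ANA_INACCESSIBLE    = 0x302,
	SKETCH_SC_ANA_TRANSITION      = 0x303,
	SKETCH_SC_HOST_PATH_ERROR     = 0x370,
};

/* What the completion path decides to do with a failed request. */
enum sketch_action {
	SKETCH_FAILOVER,	/* native multipathing requeues on another path */
	SKETCH_RETRY,		/* ordinary retry/error handling continues */
};

/* Toy request: only the fields the dispatch decision looks at. */
struct sketch_req {
	uint16_t status;		/* NVMe status of the completion */
	bool path_error;		/* stand-in for blk_path_error() */
	bool mpath;			/* stand-in for REQ_NVME_MPATH */
	bool ana_refresh_queued;	/* stand-in for the ANA log re-read */
};

/* Same three cases the removed switch statement matched. */
static bool sketch_ana_error(uint16_t status)
{
	switch (status & 0x7ff) {
	case SKETCH_SC_ANA_TRANSITION:
	case SKETCH_SC_ANA_INACCESSIBLE:
	case SKETCH_SC_ANA_PERSISTENT_LOSS:
		return true;
	default:
		return false;
	}
}

/* Shared helper: mark the ANA state stale so it gets re-read. */
static void sketch_update_ana(struct sketch_req *req)
{
	if (sketch_ana_error(req->status))
		req->ana_refresh_queued = true;
}

/* Models the reworked branch in nvme_complete_rq(). */
static enum sketch_action sketch_complete(struct sketch_req *req)
{
	assert(req != NULL);

	if (req->path_error) {
		if (req->mpath) {
			/* Failover is reachable only with the mpath flag set. */
			sketch_update_ana(req);
			return SKETCH_FAILOVER;
		}
		/* No native multipathing: still refresh ANA state. */
		sketch_update_ana(req);
	}
	return SKETCH_RETRY;
}
```

In this model a request without the mpath flag can never return
SKETCH_FAILOVER, which is the reason given above for not needing an
extra 'if (ns->head->disk)' guard inside the failover path. The real
driver additionally requires ns->ctrl->ana_log_buf to be set before
queueing the re-read; that check is left out here for brevity.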