From: Christoph Hellwig <hch@lst.de>
To: Keith Busch <keith.busch@intel.com>
Cc: Christoph Hellwig <hch@lst.de>, Jens Axboe <axboe@kernel.dk>,
	Sagi Grimberg <sagi@grimberg.me>, linux-nvme@lists.infradead.org,
	linux-block@vger.kernel.org
Subject: Re: [PATCH 10/10] nvme: implement multipath access to nvme subsystems
Date: Tue, 29 Aug 2017 16:55:59 +0200
Message-ID: <20170829145559.GA32760@lst.de>
In-Reply-To: <20170829145417.GA4428@localhost.localdomain>

On Tue, Aug 29, 2017 at 10:54:17AM -0400, Keith Busch wrote:
> On Wed, Aug 23, 2017 at 07:58:15PM +0200, Christoph Hellwig wrote:
> > +	/* Anything else could be a path failure, so should be retried */
> > +	spin_lock_irqsave(&ns->head->requeue_lock, flags);
> > +	blk_steal_bios(&ns->head->requeue_list, req);
> > +	spin_unlock_irqrestore(&ns->head->requeue_lock, flags);
> > +
> > +	nvme_reset_ctrl(ns->ctrl);
> > +	kblockd_schedule_work(&ns->head->requeue_work);
> > +	return true;
> > +}
>
> It appears this isn't going to cause the path selection to failover for
> the requeued work. The bio's bi_disk is unchanged from the failed path when the
> requeue_work submits the bio again so it will use the same path, right?

Oh.  This did indeed break with the bi_bdev -> bi_disk refactoring
I did just before sending this out.

> It also looks like new submissions will get a new path only from the
> fact that the original/primary is being reset. The controller reset
> itself seems a bit heavy-handed. Can we just set head->current_path to
> the next active controller in the list?

For ANA we'll have to do that anyway, but if we got a failure
that clearly indicates a path failure what benefit is there in not
resetting the controller?  But yeah, maybe we can just switch the
path for non-ANA controllers and wait for timeouts to do their work.

> > +static void nvme_requeue_work(struct work_struct *work)
> > +{
> > +	struct nvme_ns_head *head =
> > +		container_of(work, struct nvme_ns_head, requeue_work);
> > +	struct bio *bio, *next;
> > +
> > +	spin_lock_irq(&head->requeue_lock);
> > +	next = bio_list_get(&head->requeue_list);
> > +	spin_unlock_irq(&head->requeue_lock);
> > +
> > +	while ((bio = next) != NULL) {
> > +		next = bio->bi_next;
> > +		bio->bi_next = NULL;
> > +		generic_make_request_fast(bio);
> > +	}
> > +}
>
> Here, I think we need to reevaluate the path (nvme_find_path) and set
> bio->bi_disk accordingly.

Yes.  Previously this was opencoded and always used head->disk, but
I messed it up last minute.  In the end it still worked for my cases
because the controller would either already be reset or fail all
I/O, but this behavior clearly is not intended and suboptimal.
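
The point raised in the review (a requeued bio keeps pointing at the failed
path unless the path is looked up again at resubmit time) can be modelled
outside the kernel.  What follows is only a userspace sketch under stated
assumptions, not the patch and not kernel code: struct path, struct io,
struct head, find_path(), fail_io() and requeue_work() are all hypothetical
stand-ins, the locking is omitted, and "submitting" is reduced to bumping a
counter.  It also takes the lighter option discussed above, switching the
path without resetting the controller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_PATHS 4

/* Hypothetical stand-in for one controller path to a shared namespace. */
struct path {
	bool live;	/* false once the path has been marked failed */
	int submitted;	/* number of I/Os sent down this path */
};

/* Hypothetical stand-in for struct bio; 'path' plays the role of bi_disk. */
struct io {
	struct io *next;
	int path;
};

/* Hypothetical stand-in for struct nvme_ns_head (no locking modelled). */
struct head {
	struct path paths[MAX_PATHS];
	int nr_paths;
	int current_path;
	struct io *requeue_list;
};

/* Pick the first live path, starting the search at current_path. */
static int find_path(struct head *h)
{
	assert(h->nr_paths <= MAX_PATHS);
	for (int i = 0; i < h->nr_paths; i++) {
		int idx = (h->current_path + i) % h->nr_paths;

		if (h->paths[idx].live) {
			h->current_path = idx;
			return idx;
		}
	}
	return -1;	/* no usable path at the moment */
}

/* Mark the I/O's path as failed and park the I/O for a later retry. */
static void fail_io(struct head *h, struct io *io)
{
	h->paths[io->path].live = false;
	io->next = h->requeue_list;
	h->requeue_list = io;
}

/*
 * Drain the requeue list.  The path is re-evaluated for every I/O
 * instead of reusing the stale io->path.  Returns how many I/Os were
 * resubmitted; I/Os with no live path are parked again.
 */
static int requeue_work(struct head *h)
{
	struct io *io, *next = h->requeue_list;
	int resubmitted = 0;

	h->requeue_list = NULL;
	while ((io = next) != NULL) {
		int idx;

		next = io->next;
		io->next = NULL;

		idx = find_path(h);
		if (idx < 0) {
			io->next = h->requeue_list;
			h->requeue_list = io;
			continue;
		}
		io->path = idx;		/* retarget before resubmitting */
		h->paths[idx].submitted++;
		resubmitted++;
	}
	return resubmitted;
}
```

With two live paths, failing an I/O on path 0 and then running
requeue_work() moves it to path 1.  If the retargeting line is dropped, the
I/O goes back to the path that just failed, which is the behaviour the
review points out.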