From: Dongli Zhang <dongli.zhang@oracle.com> To: linux-nvme@lists.infradead.org Cc: linux-kernel@vger.kernel.org, kbusch@kernel.org, axboe@fb.com, hch@lst.de, sagi@grimberg.me, ming.lei@redhat.com Subject: [PATCH v2 1/1] nvme-pci: avoid race between nvme_reap_pending_cqes() and nvme_poll() Date: Wed, 27 May 2020 09:13:52 -0700 [thread overview] Message-ID: <20200527161352.21227-1-dongli.zhang@oracle.com> (raw) There may be a race between nvme_reap_pending_cqes() and nvme_poll(), e.g., when doing live reset while polling the nvme device. CPU X CPU Y nvme_poll() nvme_dev_disable() -> nvme_stop_queues() -> nvme_suspend_io_queues() -> nvme_suspend_queue() -> spin_lock(&nvmeq->cq_poll_lock); -> nvme_reap_pending_cqes() -> nvme_process_cq() -> nvme_process_cq() In the above scenario, the nvme_process_cq() for the same queue may be running on both CPU X and CPU Y concurrently. It is much more easier to reproduce the issue when CONFIG_PREEMPT is enabled in kernel. When CONFIG_PREEMPT is disabled, it would take longer time for nvme_stop_queues()-->blk_mq_quiesce_queue() to wait for grace period. This patch protects nvme_process_cq() with nvmeq->cq_poll_lock in nvme_reap_pending_cqes(). Fixes: fa46c6fb5d61 ("nvme/pci: move cqe check after device shutdown") Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Keith Busch <kbusch@kernel.org> --- Changed since v1: - Add "Fixes" tag drivers/nvme/host/pci.c | 11 +++++++---- 1 file changed, 7 insertions(+), 4 deletions(-) diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c index 3726dc780d15..cc46e250fcac 100644 --- a/drivers/nvme/host/pci.c +++ b/drivers/nvme/host/pci.c @@ -1382,16 +1382,19 @@ static void nvme_disable_admin_queue(struct nvme_dev *dev, bool shutdown) /* * Called only on a device that has been disabled and after all other threads - * that can check this device's completion queues have synced. This is the - * last chance for the driver to see a natural completion before - * nvme_cancel_request() terminates all incomplete requests. + * that can check this device's completion queues have synced, except + * nvme_poll(). This is the last chance for the driver to see a natural + * completion before nvme_cancel_request() terminates all incomplete requests. */ static void nvme_reap_pending_cqes(struct nvme_dev *dev) { int i; - for (i = dev->ctrl.queue_count - 1; i > 0; i--) + for (i = dev->ctrl.queue_count - 1; i > 0; i--) { + spin_lock(&dev->queues[i].cq_poll_lock); nvme_process_cq(&dev->queues[i]); + spin_unlock(&dev->queues[i].cq_poll_lock); + } } static int nvme_cmb_qdepth(struct nvme_dev *dev, int nr_io_queues, -- 2.17.1
WARNING: multiple messages have this Message-ID (diff)
From: Dongli Zhang <dongli.zhang@oracle.com> To: linux-nvme@lists.infradead.org Cc: sagi@grimberg.me, linux-kernel@vger.kernel.org, ming.lei@redhat.com, axboe@fb.com, kbusch@kernel.org, hch@lst.de Subject: [PATCH v2 1/1] nvme-pci: avoid race between nvme_reap_pending_cqes() and nvme_poll() Date: Wed, 27 May 2020 09:13:52 -0700 [thread overview] Message-ID: <20200527161352.21227-1-dongli.zhang@oracle.com> (raw) There may be a race between nvme_reap_pending_cqes() and nvme_poll(), e.g., when doing live reset while polling the nvme device. CPU X CPU Y nvme_poll() nvme_dev_disable() -> nvme_stop_queues() -> nvme_suspend_io_queues() -> nvme_suspend_queue() -> spin_lock(&nvmeq->cq_poll_lock); -> nvme_reap_pending_cqes() -> nvme_process_cq() -> nvme_process_cq() In the above scenario, the nvme_process_cq() for the same queue may be running on both CPU X and CPU Y concurrently. It is much more easier to reproduce the issue when CONFIG_PREEMPT is enabled in kernel. When CONFIG_PREEMPT is disabled, it would take longer time for nvme_stop_queues()-->blk_mq_quiesce_queue() to wait for grace period. This patch protects nvme_process_cq() with nvmeq->cq_poll_lock in nvme_reap_pending_cqes(). Fixes: fa46c6fb5d61 ("nvme/pci: move cqe check after device shutdown") Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Keith Busch <kbusch@kernel.org> --- Changed since v1: - Add "Fixes" tag drivers/nvme/host/pci.c | 11 +++++++---- 1 file changed, 7 insertions(+), 4 deletions(-) diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c index 3726dc780d15..cc46e250fcac 100644 --- a/drivers/nvme/host/pci.c +++ b/drivers/nvme/host/pci.c @@ -1382,16 +1382,19 @@ static void nvme_disable_admin_queue(struct nvme_dev *dev, bool shutdown) /* * Called only on a device that has been disabled and after all other threads - * that can check this device's completion queues have synced. This is the - * last chance for the driver to see a natural completion before - * nvme_cancel_request() terminates all incomplete requests. + * that can check this device's completion queues have synced, except + * nvme_poll(). This is the last chance for the driver to see a natural + * completion before nvme_cancel_request() terminates all incomplete requests. */ static void nvme_reap_pending_cqes(struct nvme_dev *dev) { int i; - for (i = dev->ctrl.queue_count - 1; i > 0; i--) + for (i = dev->ctrl.queue_count - 1; i > 0; i--) { + spin_lock(&dev->queues[i].cq_poll_lock); nvme_process_cq(&dev->queues[i]); + spin_unlock(&dev->queues[i].cq_poll_lock); + } } static int nvme_cmb_qdepth(struct nvme_dev *dev, int nr_io_queues, -- 2.17.1 _______________________________________________ linux-nvme mailing list linux-nvme@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-nvme
next reply other threads:[~2020-05-27 16:21 UTC|newest] Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top 2020-05-27 16:13 Dongli Zhang [this message] 2020-05-27 16:13 ` [PATCH v2 1/1] nvme-pci: avoid race between nvme_reap_pending_cqes() and nvme_poll() Dongli Zhang 2020-05-27 18:33 ` Christoph Hellwig 2020-05-27 18:33 ` Christoph Hellwig
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=20200527161352.21227-1-dongli.zhang@oracle.com \ --to=dongli.zhang@oracle.com \ --cc=axboe@fb.com \ --cc=hch@lst.de \ --cc=kbusch@kernel.org \ --cc=linux-kernel@vger.kernel.org \ --cc=linux-nvme@lists.infradead.org \ --cc=ming.lei@redhat.com \ --cc=sagi@grimberg.me \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: linkBe sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.