From: Keith Busch <kbusch@kernel.org>
To: Dongli Zhang <dongli.zhang@oracle.com>
Cc: Sagi Grimberg <sagi@grimberg.me>,
	linux-nvme@lists.infradead.org, linux-block@vger.kernel.org
Subject: Re: nvme double __blk_mq_complete_request() bugs
Date: Mon, 25 May 2020 10:45:16 -0600
Message-ID: <20200525164516.GC73686@C02WT3WMHTD6>
In-Reply-To: <49f32df9-81a9-4c15-9950-aceff8fb291e@oracle.com>

On Sun, May 24, 2020 at 07:33:02AM -0700, Dongli Zhang wrote:
> >> After code analysis, I think this applies to nvme-pci as well.
> >>
> >>                                         nvme_process_cq()
> >>                                         -> nvme_handle_cqe()
> >>                                            -> nvme_end_request()
> >>                                               -> blk_mq_complete_request()
> >> nvme_reset_work()
> >> -> nvme_dev_disable()
> >>     -> nvme_reap_pending_cqes()
> >>        -> nvme_process_cq()
> >>           -> nvme_handle_cqe()
> >>              -> nvme_end_request()
> >>                 -> blk_mq_complete_request()
> >>                    -> __blk_mq_complete_request()
> >>                                                  -> __blk_mq_complete_request()
> > 
> > nvme_dev_disable() will first disable the queues before reaping the pending
> > cqes, so it shouldn't have this issue.
> > 
> 
> Would you mind helping to explain how nvme_dev_disable() would avoid this issue?
> 
> nvme_dev_disable() would:
> 
> 1. freeze all the queues so that new requests would not enter and submit
> 2. NOT wait for freezing during a live reset, so that q->q_usage_counter is not
> guaranteed to be zero
> 3. quiesce all the queues so that new requests would not dispatch
> 4. delete the queues and free their irqs
>
> However, I do not find a mechanism to prevent the case where an
> nvme_end_request() is already in progress.
> 
> E.g., suppose __blk_mq_complete_request() is already triggered on cpu 3 and is
> about to execute its first line "WRITE_ONCE(rq->state, MQ_RQ_COMPLETE)" ...
> while another cpu is doing a live reset. I do not see how to prevent such a
> race.

The queues are torn down and their interrupts freed and synchronized before the
reset reclaims uncompleted requests. There's no other context that can be
running completions at that point.
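
For reference, here is a condensed sketch of the ordering in the v5.7-era
nvme_dev_disable() (drivers/nvme/host/pci.c). It is paraphrased rather than
quoted verbatim, with unrelated code elided and comments added to mark where
the synchronization happens:

  static void nvme_dev_disable(struct nvme_dev *dev, bool shutdown)
  {
          ...
          nvme_stop_queues(&dev->ctrl);          /* quiesce: no new dispatch */
          ...
          nvme_suspend_io_queues(dev);           /* per queue: pci_free_irq() */
          nvme_suspend_queue(&dev->queues[0]);   /* same for the admin queue */
          nvme_pci_disable(dev);
          nvme_reap_pending_cqes(dev);           /* walk the CQs only now */

          blk_mq_tagset_busy_iter(&dev->tagset, nvme_cancel_request, &dev->ctrl);
          ...
  }

pci_free_irq() ends up in free_irq(), which does not return while a handler for
that vector is still running, so it waits out any in-flight
nvme_irq() -> nvme_process_cq(). By the time nvme_reap_pending_cqes() (and later
the busy_iter cancellations) touch a request, the interrupt-driven completion
path in your trace above cannot still be executing.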
