linux-pci.vger.kernel.org archive mirror
From: Lukas Wunner <lukas@wunner.de>
To: Christoph Hellwig <hch@lst.de>
Cc: "Haeuptle, Michael" <michael.haeuptle@hpe.com>,
	"linux-pci@vger.kernel.org" <linux-pci@vger.kernel.org>,
	"michaelhaeuptle@gmail.com" <michaelhaeuptle@gmail.com>
Subject: Re: Deadlock during PCIe hot remove
Date: Sun, 29 Mar 2020 15:04:20 +0200
Message-ID: <20200329130420.hggbkgx57qqvu6om@wunner.de>
In-Reply-To: <20200325104018.GA30853@lst.de>

On Wed, Mar 25, 2020 at 11:40:18AM +0100, Christoph Hellwig wrote:
> On Tue, Mar 24, 2020 at 05:15:34PM +0100, Lukas Wunner wrote:
> > The pci_dev_trylock() in pci_try_reset_function() looks questionable
> > to me.  It was added by commit b014e96d1abb ("PCI: Protect
> > pci_error_handlers->reset_notify() usage with device_lock()")
> > with the following rationale:
> > 
> >     Every method in struct device_driver or structures derived from it like
> >     struct pci_driver MUST provide exclusion vs the driver's ->remove()
> >     method, usually by using device_lock().
> >     [...]
> >     Without this, ->reset_notify() may race with ->remove() calls, which
> >     can be easily triggered in NVMe.
> > 
> > The intersection of drivers defining a ->reset_notify() hook and files
> > invoking pci_try_reset_function() appears to be empty.  So I don't quite
> > understand the problem the commit sought to address.  What am I missing?
> 
> No driver defines ->reset_notify as that has been split into
> ->reset_prepare and ->reset_done a while ago, and plenty of drivers
> define those.  And we can't call into drivers unless we know the driver
> actually still is bound to the device, which is why we need the locking.

Sure, you need to hold the driver in place while you're invoking one of
its callbacks.  But is it really necessary to hold the device lock while
performing the actual reset?  That locking seems awfully coarse-grained.

Do you see any potential problem in pushing down the pci_dev_lock() and
pci_dev_unlock() calls into pci_dev_save_and_disable() and
pci_dev_restore()?  I.e., acquire the lock for the invocation of
->reset_prepare() and ->reset_done() and release it immediately
afterwards?

That would seem to fix the deadlock Michael reported.

Of course that could result in ->reset_prepare() being invoked but
->reset_done() not being invoked if the driver is no longer bound.
Or in ->reset_done() being called for a different driver if the
device was rebound in the meantime.  Would this cause issues?
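
To make the two cases above concrete, here is a rough userspace sketch,
not kernel code.  Everything named model_* (and counting_driver) is
hypothetical, and a pthread mutex merely stands in for the device lock.
It only illustrates taking the lock around the ->reset_prepare() and
->reset_done() invocations while the reset itself runs unlocked, and how
an unbind in between leaves ->reset_done() uncalled:

```c
/* Userspace model only; none of these types or functions exist in the kernel. */
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

struct model_dev;

struct model_driver {
	void (*reset_prepare)(struct model_dev *dev);
	void (*reset_done)(struct model_dev *dev);
};

struct model_dev {
	pthread_mutex_t lock;		/* stands in for the device lock */
	struct model_driver *driver;	/* NULL once the driver is unbound */
	int prepare_calls;
	int done_calls;
	int resets;
};

static void model_dev_init(struct model_dev *dev, struct model_driver *drv)
{
	assert(dev != NULL);
	pthread_mutex_init(&dev->lock, NULL);
	dev->driver = drv;
	dev->prepare_calls = dev->done_calls = dev->resets = 0;
}

/* Hold the lock only while calling into the driver. */
static void model_save_and_disable(struct model_dev *dev)
{
	pthread_mutex_lock(&dev->lock);
	if (dev->driver && dev->driver->reset_prepare)
		dev->driver->reset_prepare(dev);
	pthread_mutex_unlock(&dev->lock);
}

static void model_restore(struct model_dev *dev)
{
	pthread_mutex_lock(&dev->lock);
	if (dev->driver && dev->driver->reset_done)
		dev->driver->reset_done(dev);
	pthread_mutex_unlock(&dev->lock);
}

/* Models a driver unbind, which needs the same lock. */
static void model_unbind(struct model_dev *dev)
{
	pthread_mutex_lock(&dev->lock);
	dev->driver = NULL;
	pthread_mutex_unlock(&dev->lock);
}

/*
 * The reset itself runs with the lock dropped.  during_reset simulates
 * something else needing the lock while the reset is in flight; it would
 * block forever if the lock were still held here.
 */
static void model_reset(struct model_dev *dev,
			void (*during_reset)(struct model_dev *dev))
{
	model_save_and_disable(dev);
	dev->resets++;
	if (during_reset)
		during_reset(dev);
	model_restore(dev);
}

static void count_prepare(struct model_dev *dev) { dev->prepare_calls++; }
static void count_done(struct model_dev *dev) { dev->done_calls++; }

static struct model_driver counting_driver = {
	.reset_prepare = count_prepare,
	.reset_done = count_done,
};
```

With a device initialized via model_dev_init(&dev, &counting_driver),
model_reset(&dev, model_unbind) returns without blocking because the
unbind can take the lock mid-reset.  Afterwards the prepare callback has
run once and the done callback not at all, which is the imbalance I'm
asking about.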

Thanks,

Lukas
