linux-pci.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Alex Williamson <alex.williamson@redhat.com>
To: Bjorn Helgaas <helgaas@kernel.org>
Cc: Ross Lagerwall <ross.lagerwall@citrix.com>,
	linux-pci@vger.kernel.org, Sinan Kaya <okaya@kernel.org>,
	"Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
Subject: Re: Regression caused by "PCI: Probe for device reset support during enumeration"
Date: Fri, 14 Dec 2018 11:40:36 -0700	[thread overview]
Message-ID: <20181214114036.43566c86@x1.home> (raw)
In-Reply-To: <20181214171838.GA20725@google.com>

On Fri, 14 Dec 2018 11:18:38 -0600
Bjorn Helgaas <helgaas@kernel.org> wrote:

> [+cc Sinan, Alex, Rafael]
> 
> On Fri, Dec 14, 2018 at 01:15:58PM +0000, Ross Lagerwall wrote:
> > Hi Bjorn,
> > 
> > Your commit 5b0764cac9f1 ("PCI: Probe for device reset support during
> > enumeration") moved checking whether a device could be reset much earlier to
> > when the device is first probed. When the device is first probed, the other
> > devices on the bus may not have been discovered yet. This means that we will
> > claim to support SBR as a reset mechanism because it is the only device
> > behind the bus at that pointer meanwhile the others simply haven't been
> > discovered yet. This results in dev->reset_fn being incorrectly set to true
> > and a reset file being created. When userspace actually tries to use the
> > reset file it fails because now there are other sibling devices preventing
> > the use of an SBR.  
> 
> First of all, I'm very sorry about the regression and thanks very much
> for the report!  I know it's a lot of work to track down things like
> this.
> 
> We run pci_probe_reset_function() during pci_init_capabilities(),
> which is before other devices on the bus are discovered, as you say.
> That checks not just for SBR support, but for other types of reset
> (FLR, PM, etc) as well.
> 
> In your case the userspace reset actually fails, so I assume that
> means SBR is the *only* supported reset type for that device?
> 
> Since you have two devices on the bus, I guess SBR of device A
> *should* work during the interval between enumerating A and B (it will
> reset both A and B in hardware, but since the OS doesn't know about B
> yet, that's probably OK).  After we enumerate B and potentially bind a
> driver to it, we can't use SBR any more, of course.
> 
> I could imagine removing A's sysfs reset file when B is enumerated,
> but the locking might be ugly.  And of course A might support other
> types of reset, and then you *want* A's sysfs file.

I've actually wished for the opposite of this in the past, given a bus
with devices A and B with no sysfs reset file, I wish I could 'echo 1 >
B/remove' to rescan for SBR for device A.

> What happened prior to 5b0764cac9f1?  There was never a sysfs reset
> file at all, so userspace didn't even attempt the reset?

Yep.

> We could wait to create the sysfs reset file until after we've
> enumerated all the devices on the bus, but then we're opening a race
> when a device could be hot-added to the bus.  E.g., boot-time
> enumeration finds only A, we create its sysfs reset file, then we
> hot-add device B (admittedly mostly a conventional PCI scenario), and
> now we're in the same situation you're seeing where A's reset file
> exists, but it always fails because SBR is no longer safe.

I'd welcome dynamically re-evaluating the reset availability on device
add and remove, but it might get a bit complicated.  Thanks,

Alex

      reply	other threads:[~2018-12-14 18:40 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-12-14 13:15 Regression caused by "PCI: Probe for device reset support during enumeration" Ross Lagerwall
2018-12-14 17:18 ` Bjorn Helgaas
2018-12-14 18:40   ` Alex Williamson [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20181214114036.43566c86@x1.home \
    --to=alex.williamson@redhat.com \
    --cc=helgaas@kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=okaya@kernel.org \
    --cc=rafael.j.wysocki@intel.com \
    --cc=ross.lagerwall@citrix.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).