From: Alex Williamson <alex.williamson@redhat.com>
To: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>,
	x86@kernel.org, rjw@rjwysocki.net, mingo@redhat.com,
	bp@alien8.de, lv.zheng@intel.com, hpa@zytor.com,
	tglx@linutronix.de, yinghai@kernel.org, lenb@kernel.org,
	linux-pci@vger.kernel.org, tony.luck@intel.com,
	linux-acpi@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] x86/PCI: Fully disable devices before releasing IRQ resource
Date: Wed, 11 Mar 2015 10:47:30 -0600
Message-ID: <1426092450.3643.7.camel@redhat.com>
In-Reply-To: <1425613912.5200.344.camel@redhat.com>

On Thu, 2015-03-05 at 20:51 -0700, Alex Williamson wrote:
> On Fri, 2015-03-06 at 09:49 +0800, Jiang Liu wrote:
> > On 2015/3/6 5:06, Alex Williamson wrote:
> > > > The IRQ resource for a device is established when pci_enable_device()
> > > > is called on a fully disabled device (i.e. enable_cnt == 0).  With
> > > commit b4b55cda5874 ("x86/PCI: Refine the way to release PCI IRQ
> > > resources") this same IRQ resource is released when the driver is
> > > unbound from the device, regardless of the enable_cnt.  This presents
> > > the situation that an ill-behaved driver can now make a device
> > > unusable to subsequent drivers by an imbalance in their use of
> > > pci_enable/disable_device().  It's one thing to break your own device
> > > if you're one of these ill-behaved drivers, but it's a serious
> > > regression for secondary drivers like vfio-pci, which are innocent
> > > of the transgressions of the previous driver.
> > > 
> > > Resolve by pushing the device to a fully disabled state before
> > > releasing the IRQ resource.
> > > 
> > > Fixes: b4b55cda5874 ("x86/PCI: Refine the way to release PCI IRQ resources")
> > > Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
> > > Cc: Jiang Liu <jiang.liu@linux.intel.com>
> > > ---
> > >  arch/x86/pci/common.c |   13 ++++++++++++-
> > >  1 file changed, 12 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/arch/x86/pci/common.c b/arch/x86/pci/common.c
> > > index 3d2612b..4810194 100644
> > > --- a/arch/x86/pci/common.c
> > > +++ b/arch/x86/pci/common.c
> > > @@ -527,8 +527,19 @@ static int pci_irq_notifier(struct notifier_block *nb, unsigned long action,
> > >  	if (action != BUS_NOTIFY_UNBOUND_DRIVER)
> > >  		return NOTIFY_DONE;
> > >  
> > > -	if (pcibios_disable_irq)
> > > +	if (pcibios_disable_irq) {
> > > +		/*
> > > +		 * Broken drivers may allow a device to be .remove()'d while
> > > +		 * still enabled.  pci_enable_device() will only re-establish
> > > +		 * dev->irq if the devices is fully disabled.  So if we want
> > > +		 * to release the IRQ, we need to make sure the next driver
> > > +		 * can re-establish it using pci_enable_device().
> > > +		 */
> > > +		while (pci_is_enabled(dev))
> > > +			pci_disable_device(dev);
> > > +
> > >  		pcibios_disable_irq(dev);
> > > +	}
> > Hi Alex,
> > 	Thanks for debugging and fixing it.
> > 	Will it be feasible to give a debug message to remind those
> > driver authors to correctly disable PCI when unbinding?
> 
> I can certainly add a warning to the loop, it loses a bit of its teeth
> here though since we can't specify which driver to blame at this point.
> Maybe that warning and perhaps this enabling roll-back should happen in
> drivers/pci/pci-driver.c:pci_device_remove().  Bjorn, would you prefer
> it be done generically there?  Thanks,

Unfortunately there's a long-standing comment in pci_device_remove():

        /*
         * We would love to complain here if pci_dev->is_enabled is set, that
         * the driver should have called pci_disable_device(), but the
         * unfortunate fact is there are too many odd BIOS and bridge setups
         * that don't like drivers doing that all of the time.
         * Oh well, we can dream of sane hardware when we sleep, no matter how
         * horrible the crap we have to deal with is when we are awake...
         */

So, unless we can somehow ignore that comment, I suspect forcing the
device to be disabled on driver remove, whether done from pci-core or
from x86/pci, is going to cause all sorts of breakage.  Are the
expectations set by b4b55cda5874 really valid?  It seems like something
needs to be done to allow the IRQ to be automatically re-established on
x86 regardless of the driver doing the right thing when releasing the
device.  We're still looking at a regression for v4.0 as a result of
b4b55cda5874.  Thanks,

Alex

