linux-usb.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] usb: hcd: Revert 306c54d0edb6ba94d39877524dddebaad7770cf2: Try MSI interrupts on PCI devices
@ 2021-07-13 18:50 Laurence Oberman
  2021-07-13 19:15 ` Alan Stern
                   ` (3 more replies)
  0 siblings, 4 replies; 9+ messages in thread
From: Laurence Oberman @ 2021-07-13 18:50 UTC (permalink / raw)
  To: linux-usb, andriy.shevchenko, stable, loberman, emilne, djeffery,
	apanagio, torez

Customers have been reporting that the I/O is radically being
slowed down to HPE virtual USB ILO served DVD images during installation.

Lots of investigation by the Red Hat lab has found that the issue is 
because MSI edge interrupts do not work properly for these 
ILO USB devices.
We start fast and then drop to polling mode and its unusable.

The issue exists currently upstream on 5.13 as tested by Red Hat, 
and reverting the mentioned patch corrects this upstream.

David Jeffery has this explanation:

The problem with the patch turning on MSI appears to be that the ehci 
driver (and possibly other usb controller types too) wasn't written to
support edge-triggered interrupts.
The ehci_irq routine appears to be written in such a way that it will 
be racy with multiple interrupt source bits.
With a level-triggered interrupt, it gets called another time and cleans 
up other interrupt sources.
But with MSI edge, the interrupt state staying high results in no 
new interrupt and ehci has to run based on polling.

static irqreturn_t ehci_irq (struct usb_hcd *hcd)
{
...
        status = ehci_readl(ehci, &ehci->regs->status);

        /* e.g. cardbus physical eject */
        if (status == ~(u32) 0) {
                ehci_dbg (ehci, "device removed\n");
                goto dead;
        }

        /*
         * We don't use STS_FLR, but some controllers don't like it to
         * remain on, so mask it out along with the other status bits.
         */
        masked_status = status & (INTR_MASK | STS_FLR);

        /* Shared IRQ? */
        if (!masked_status || unlikely(ehci->rh_state == EHCI_RH_HALTED)) {
                spin_unlock_irqrestore(&ehci->lock, flags);
                return IRQ_NONE;
        }

        /* clear (just) interrupts */
        ehci_writel(ehci, masked_status, &ehci->regs->status);
...

ehci_irq() reads the interrupt status register and then writes the active 
interrupt-related bits back out to ack the interrupt cause.
But with an edge interrupt, this is racy as another source of interrupt 
could be raised by ehci between the read and the write reaching the 
hardware. 
e.g.  If STS_IAA was set during the initial read, but some other bit like 
STS_INT gets raised by the hardware between the read and the write to the 
interrupt status register, the interrupt signal state won't drop.
The interrupt state says high, and since it is now edged triggered with 
MSI, no new invocation of the interrupt handler gets triggered.

Suggested-by: David Jeffery <djeffery@redhat.com>
Suggested-by: Alexandros Panagiotou <apanagio@redhat.com>
Tested-by: Laurence Oberman <loberman@redhat.com>
Signed-off-by: Laurence Oberman <loberman@redhat.com>
Fixes: 306c54d0edb6ba94d39877524dddebaad7770cf2 PCI: Disable MSI for 
Pericom PCIe-USB adapter 
---
 drivers/usb/core/hcd-pci.c | 14 ++++----------
 1 file changed, 4 insertions(+), 10 deletions(-)

diff --git a/drivers/usb/core/hcd-pci.c b/drivers/usb/core/hcd-pci.c
index d630ccc..522a179 100644
--- a/drivers/usb/core/hcd-pci.c
+++ b/drivers/usb/core/hcd-pci.c
@@ -195,21 +195,20 @@ int usb_hcd_pci_probe(struct pci_dev *dev, const struct pci_device_id *id,
 	 * make sure irq setup is not touched for xhci in generic hcd code
 	 */
 	if ((driver->flags & HCD_MASK) < HCD_USB3) {
-		retval = pci_alloc_irq_vectors(dev, 1, 1, PCI_IRQ_LEGACY | PCI_IRQ_MSI);
-		if (retval < 0) {
+		if (!dev->irq) {
 			dev_err(&dev->dev,
 			"Found HC with no IRQ. Check BIOS/PCI %s setup!\n",
 				pci_name(dev));
 			retval = -ENODEV;
 			goto disable_pci;
 		}
-		hcd_irq = pci_irq_vector(dev, 0);
+		hcd_irq = dev->irq;
 	}
 
 	hcd = usb_create_hcd(driver, &dev->dev, pci_name(dev));
 	if (!hcd) {
 		retval = -ENOMEM;
-		goto free_irq_vectors;
+		goto disable_pci;
 	}
 
 	hcd->amd_resume_bug = (usb_hcd_amd_remote_wakeup_quirk(dev) &&
@@ -288,9 +287,6 @@ int usb_hcd_pci_probe(struct pci_dev *dev, const struct pci_device_id *id,
 
 put_hcd:
 	usb_put_hcd(hcd);
-free_irq_vectors:
-	if ((driver->flags & HCD_MASK) < HCD_USB3)
-		pci_free_irq_vectors(dev);
 disable_pci:
 	pci_disable_device(dev);
 	dev_err(&dev->dev, "init %s fail, %d\n", pci_name(dev), retval);
@@ -352,8 +348,6 @@ void usb_hcd_pci_remove(struct pci_dev *dev)
 		up_read(&companions_rwsem);
 	}
 	usb_put_hcd(hcd);
-	if ((hcd_driver_flags & HCD_MASK) < HCD_USB3)
-		pci_free_irq_vectors(dev);
 	pci_disable_device(dev);
 }
 EXPORT_SYMBOL_GPL(usb_hcd_pci_remove);
@@ -465,7 +459,7 @@ static int suspend_common(struct device *dev, bool do_wakeup)
 	 * synchronized here.
 	 */
 	if (!hcd->msix_enabled)
-		synchronize_irq(pci_irq_vector(pci_dev, 0));
+		synchronize_irq(pci_dev->irq);
 
 	/* Downstream ports from this root hub should already be quiesced, so
 	 * there will be no DMA activity.  Now we can shut down the upstream
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH] usb: hcd: Revert 306c54d0edb6ba94d39877524dddebaad7770cf2: Try MSI interrupts on PCI devices
  2021-07-13 18:50 [PATCH] usb: hcd: Revert 306c54d0edb6ba94d39877524dddebaad7770cf2: Try MSI interrupts on PCI devices Laurence Oberman
@ 2021-07-13 19:15 ` Alan Stern
  2021-07-13 20:05   ` Laurence Oberman
  2021-07-13 20:30 ` kernel test robot
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 9+ messages in thread
From: Alan Stern @ 2021-07-13 19:15 UTC (permalink / raw)
  To: Laurence Oberman
  Cc: linux-usb, andriy.shevchenko, stable, emilne, djeffery, apanagio, torez

On Tue, Jul 13, 2021 at 02:50:42PM -0400, Laurence Oberman wrote:
> Customers have been reporting that the I/O is radically being
> slowed down to HPE virtual USB ILO served DVD images during installation.
> 
> Lots of investigation by the Red Hat lab has found that the issue is 
> because MSI edge interrupts do not work properly for these 
> ILO USB devices.
> We start fast and then drop to polling mode and its unusable.
> 
> The issue exists currently upstream on 5.13 as tested by Red Hat, 
> and reverting the mentioned patch corrects this upstream.
> 
> David Jeffery has this explanation:
> 
> The problem with the patch turning on MSI appears to be that the ehci 
> driver (and possibly other usb controller types too) wasn't written to
> support edge-triggered interrupts.
> The ehci_irq routine appears to be written in such a way that it will 
> be racy with multiple interrupt source bits.
> With a level-triggered interrupt, it gets called another time and cleans 
> up other interrupt sources.
> But with MSI edge, the interrupt state staying high results in no 
> new interrupt and ehci has to run based on polling.
> 
> static irqreturn_t ehci_irq (struct usb_hcd *hcd)
> {
> ...
>         status = ehci_readl(ehci, &ehci->regs->status);
> 
>         /* e.g. cardbus physical eject */
>         if (status == ~(u32) 0) {
>                 ehci_dbg (ehci, "device removed\n");
>                 goto dead;
>         }
> 
>         /*
>          * We don't use STS_FLR, but some controllers don't like it to
>          * remain on, so mask it out along with the other status bits.
>          */
>         masked_status = status & (INTR_MASK | STS_FLR);
> 
>         /* Shared IRQ? */
>         if (!masked_status || unlikely(ehci->rh_state == EHCI_RH_HALTED)) {
>                 spin_unlock_irqrestore(&ehci->lock, flags);
>                 return IRQ_NONE;
>         }
> 
>         /* clear (just) interrupts */
>         ehci_writel(ehci, masked_status, &ehci->regs->status);
> ...
> 
> ehci_irq() reads the interrupt status register and then writes the active 
> interrupt-related bits back out to ack the interrupt cause.
> But with an edge interrupt, this is racy as another source of interrupt 
> could be raised by ehci between the read and the write reaching the 
> hardware. 
> e.g.  If STS_IAA was set during the initial read, but some other bit like 
> STS_INT gets raised by the hardware between the read and the write to the 
> interrupt status register, the interrupt signal state won't drop.
> The interrupt state says high, and since it is now edged triggered with 
> MSI, no new invocation of the interrupt handler gets triggered.

Wouldn't it be better to change these other PCI drivers by adding 
proper MSI support?  I don't know what would be involved, but 
presumably it wouldn't be very hard.  (Just run the handler in a loop 
until all the interrupt status bits are off?)

Alan Stern

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] usb: hcd: Revert 306c54d0edb6ba94d39877524dddebaad7770cf2: Try MSI interrupts on PCI devices
  2021-07-13 19:15 ` Alan Stern
@ 2021-07-13 20:05   ` Laurence Oberman
  2021-07-13 20:30     ` Andy Shevchenko
  0 siblings, 1 reply; 9+ messages in thread
From: Laurence Oberman @ 2021-07-13 20:05 UTC (permalink / raw)
  To: Alan Stern
  Cc: linux-usb, andriy.shevchenko, stable, emilne, djeffery, apanagio, torez

On Tue, 2021-07-13 at 15:15 -0400, Alan Stern wrote:
> On Tue, Jul 13, 2021 at 02:50:42PM -0400, Laurence Oberman wrote:
> > Customers have been reporting that the I/O is radically being
> > slowed down to HPE virtual USB ILO served DVD images during
> > installation.
> > 
> > Lots of investigation by the Red Hat lab has found that the issue
> > is 
> > because MSI edge interrupts do not work properly for these 
> > ILO USB devices.
> > We start fast and then drop to polling mode and its unusable.
> > 
> > The issue exists currently upstream on 5.13 as tested by Red Hat, 
> > and reverting the mentioned patch corrects this upstream.
> > 
> > David Jeffery has this explanation:
> > 
> > The problem with the patch turning on MSI appears to be that the
> > ehci 
> > driver (and possibly other usb controller types too) wasn't written
> > to
> > support edge-triggered interrupts.
> > The ehci_irq routine appears to be written in such a way that it
> > will 
> > be racy with multiple interrupt source bits.
> > With a level-triggered interrupt, it gets called another time and
> > cleans 
> > up other interrupt sources.
> > But with MSI edge, the interrupt state staying high results in no 
> > new interrupt and ehci has to run based on polling.
> > 
> > static irqreturn_t ehci_irq (struct usb_hcd *hcd)
> > {
> > ...
> >         status = ehci_readl(ehci, &ehci->regs->status);
> > 
> >         /* e.g. cardbus physical eject */
> >         if (status == ~(u32) 0) {
> >                 ehci_dbg (ehci, "device removed\n");
> >                 goto dead;
> >         }
> > 
> >         /*
> >          * We don't use STS_FLR, but some controllers don't like it
> > to
> >          * remain on, so mask it out along with the other status
> > bits.
> >          */
> >         masked_status = status & (INTR_MASK | STS_FLR);
> > 
> >         /* Shared IRQ? */
> >         if (!masked_status || unlikely(ehci->rh_state ==
> > EHCI_RH_HALTED)) {
> >                 spin_unlock_irqrestore(&ehci->lock, flags);
> >                 return IRQ_NONE;
> >         }
> > 
> >         /* clear (just) interrupts */
> >         ehci_writel(ehci, masked_status, &ehci->regs->status);
> > ...
> > 
> > ehci_irq() reads the interrupt status register and then writes the
> > active 
> > interrupt-related bits back out to ack the interrupt cause.
> > But with an edge interrupt, this is racy as another source of
> > interrupt 
> > could be raised by ehci between the read and the write reaching
> > the 
> > hardware. 
> > e.g.  If STS_IAA was set during the initial read, but some other
> > bit like 
> > STS_INT gets raised by the hardware between the read and the write
> > to the 
> > interrupt status register, the interrupt signal state won't drop.
> > The interrupt state says high, and since it is now edged triggered
> > with 
> > MSI, no new invocation of the interrupt handler gets triggered.
> 
> Wouldn't it be better to change these other PCI drivers by adding 
> proper MSI support?  I don't know what would be involved, but 
> presumably it wouldn't be very hard.  (Just run the handler in a
> loop 
> until all the interrupt status bits are off?)
> 
> Alan Stern
> 

Hello

Agree with you that is a big hammer approach,  but it's such a key
piece of the massive number of HPE servers out there and we have many
affected customers.

While I did all the test work and discovery etc, I am definitely not a
USB kernel guy very often, I spend most of my time in storage.
I will listen for the other replies to see how the folks who know the
subsystem better than I would want this reolved.

Thanks
Laurence




^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] usb: hcd: Revert 306c54d0edb6ba94d39877524dddebaad7770cf2: Try MSI interrupts on PCI devices
  2021-07-13 18:50 [PATCH] usb: hcd: Revert 306c54d0edb6ba94d39877524dddebaad7770cf2: Try MSI interrupts on PCI devices Laurence Oberman
  2021-07-13 19:15 ` Alan Stern
@ 2021-07-13 20:30 ` kernel test robot
  2021-07-13 20:33 ` Andy Shevchenko
  2021-07-13 23:11 ` kernel test robot
  3 siblings, 0 replies; 9+ messages in thread
From: kernel test robot @ 2021-07-13 20:30 UTC (permalink / raw)
  To: Laurence Oberman, linux-usb, andriy.shevchenko, stable, emilne,
	djeffery, apanagio, torez
  Cc: kbuild-all

[-- Attachment #1: Type: text/plain, Size: 6299 bytes --]

Hi Laurence,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on usb/usb-testing]
[also build test WARNING on peter.chen-usb/for-usb-next balbi-usb/testing/next v5.14-rc1 next-20210713]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Laurence-Oberman/usb-hcd-Revert-306c54d0edb6ba94d39877524dddebaad7770cf2-Try-MSI-interrupts-on-PCI-devices/20210714-025312
base:   https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git usb-testing
config: arc-allyesconfig (attached as .config)
compiler: arceb-elf-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/0day-ci/linux/commit/3ea2a748176f21120e150f0645bc3c22e1cea48f
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Laurence-Oberman/usb-hcd-Revert-306c54d0edb6ba94d39877524dddebaad7770cf2-Try-MSI-interrupts-on-PCI-devices/20210714-025312
        git checkout 3ea2a748176f21120e150f0645bc3c22e1cea48f
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=arc 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

   drivers/usb/core/hcd-pci.c: In function 'usb_hcd_pci_remove':
>> drivers/usb/core/hcd-pci.c:316:8: warning: variable 'hcd_driver_flags' set but not used [-Wunused-but-set-variable]
     316 |  int   hcd_driver_flags;
         |        ^~~~~~~~~~~~~~~~


vim +/hcd_driver_flags +316 drivers/usb/core/hcd-pci.c

^1da177e4c3f41 Linus Torvalds     2005-04-16  300  
^1da177e4c3f41 Linus Torvalds     2005-04-16  301  /**
^1da177e4c3f41 Linus Torvalds     2005-04-16  302   * usb_hcd_pci_remove - shutdown processing for PCI-based HCDs
^1da177e4c3f41 Linus Torvalds     2005-04-16  303   * @dev: USB Host Controller being removed
41631d3616c363 Ahmed S. Darwish   2020-10-19  304   *
41631d3616c363 Ahmed S. Darwish   2020-10-19  305   * Context: task context, might sleep
^1da177e4c3f41 Linus Torvalds     2005-04-16  306   *
^1da177e4c3f41 Linus Torvalds     2005-04-16  307   * Reverses the effect of usb_hcd_pci_probe(), first invoking
^1da177e4c3f41 Linus Torvalds     2005-04-16  308   * the HCD's stop() method.  It is always called from a thread
^1da177e4c3f41 Linus Torvalds     2005-04-16  309   * context, normally "rmmod", "apmd", or something similar.
^1da177e4c3f41 Linus Torvalds     2005-04-16  310   *
^1da177e4c3f41 Linus Torvalds     2005-04-16  311   * Store this function in the HCD's struct pci_driver as remove().
^1da177e4c3f41 Linus Torvalds     2005-04-16  312   */
^1da177e4c3f41 Linus Torvalds     2005-04-16  313  void usb_hcd_pci_remove(struct pci_dev *dev)
^1da177e4c3f41 Linus Torvalds     2005-04-16  314  {
^1da177e4c3f41 Linus Torvalds     2005-04-16  315  	struct usb_hcd		*hcd;
7b2816dd293031 Andy Shevchenko    2020-08-14 @316  	int			hcd_driver_flags;
^1da177e4c3f41 Linus Torvalds     2005-04-16  317  
^1da177e4c3f41 Linus Torvalds     2005-04-16  318  	hcd = pci_get_drvdata(dev);
^1da177e4c3f41 Linus Torvalds     2005-04-16  319  	if (!hcd)
^1da177e4c3f41 Linus Torvalds     2005-04-16  320  		return;
^1da177e4c3f41 Linus Torvalds     2005-04-16  321  
7b2816dd293031 Andy Shevchenko    2020-08-14  322  	hcd_driver_flags = hcd->driver->flags;
7b2816dd293031 Andy Shevchenko    2020-08-14  323  
3da7cff4e79e4a Alan Stern         2010-06-25  324  	if (pci_dev_run_wake(dev))
3da7cff4e79e4a Alan Stern         2010-06-25  325  		pm_runtime_get_noresume(&dev->dev);
3da7cff4e79e4a Alan Stern         2010-06-25  326  
c548795abe0d35 Alan Stern         2010-06-09  327  	/* Fake an interrupt request in order to give the driver a chance
c548795abe0d35 Alan Stern         2010-06-09  328  	 * to test whether the controller hardware has been removed (e.g.,
c548795abe0d35 Alan Stern         2010-06-09  329  	 * cardbus physical eject).
c548795abe0d35 Alan Stern         2010-06-09  330  	 */
c548795abe0d35 Alan Stern         2010-06-09  331  	local_irq_disable();
c548795abe0d35 Alan Stern         2010-06-09  332  	usb_hcd_irq(0, hcd);
c548795abe0d35 Alan Stern         2010-06-09  333  	local_irq_enable();
c548795abe0d35 Alan Stern         2010-06-09  334  
05768918b9a122 Alan Stern         2013-03-28  335  	/* Note: dev_set_drvdata must be called while holding the rwsem */
05768918b9a122 Alan Stern         2013-03-28  336  	if (dev->class == CL_EHCI) {
05768918b9a122 Alan Stern         2013-03-28  337  		down_write(&companions_rwsem);
05768918b9a122 Alan Stern         2013-03-28  338  		for_each_companion(dev, hcd, ehci_remove);
05768918b9a122 Alan Stern         2013-03-28  339  		usb_remove_hcd(hcd);
05768918b9a122 Alan Stern         2013-03-28  340  		dev_set_drvdata(&dev->dev, NULL);
05768918b9a122 Alan Stern         2013-03-28  341  		up_write(&companions_rwsem);
05768918b9a122 Alan Stern         2013-03-28  342  	} else {
05768918b9a122 Alan Stern         2013-03-28  343  		/* Not EHCI; just clear the companion pointer */
05768918b9a122 Alan Stern         2013-03-28  344  		down_read(&companions_rwsem);
05768918b9a122 Alan Stern         2013-03-28  345  		hcd->self.hs_companion = NULL;
^1da177e4c3f41 Linus Torvalds     2005-04-16  346  		usb_remove_hcd(hcd);
05768918b9a122 Alan Stern         2013-03-28  347  		dev_set_drvdata(&dev->dev, NULL);
05768918b9a122 Alan Stern         2013-03-28  348  		up_read(&companions_rwsem);
05768918b9a122 Alan Stern         2013-03-28  349  	}
^1da177e4c3f41 Linus Torvalds     2005-04-16  350  	usb_put_hcd(hcd);
^1da177e4c3f41 Linus Torvalds     2005-04-16  351  	pci_disable_device(dev);
^1da177e4c3f41 Linus Torvalds     2005-04-16  352  }
782e70c6fc2290 Greg Kroah-Hartman 2008-01-25  353  EXPORT_SYMBOL_GPL(usb_hcd_pci_remove);
^1da177e4c3f41 Linus Torvalds     2005-04-16  354  

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 68516 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] usb: hcd: Revert 306c54d0edb6ba94d39877524dddebaad7770cf2: Try MSI interrupts on PCI devices
  2021-07-13 20:05   ` Laurence Oberman
@ 2021-07-13 20:30     ` Andy Shevchenko
  0 siblings, 0 replies; 9+ messages in thread
From: Andy Shevchenko @ 2021-07-13 20:30 UTC (permalink / raw)
  To: Laurence Oberman
  Cc: Alan Stern, linux-usb, stable, emilne, djeffery, apanagio, torez

On Tue, Jul 13, 2021 at 04:05:06PM -0400, Laurence Oberman wrote:
> On Tue, 2021-07-13 at 15:15 -0400, Alan Stern wrote:
> > On Tue, Jul 13, 2021 at 02:50:42PM -0400, Laurence Oberman wrote:
> > > Customers have been reporting that the I/O is radically being
> > > slowed down to HPE virtual USB ILO served DVD images during
> > > installation.

Thanks for the report!

> > > Lots of investigation by the Red Hat lab has found that the issue
> > > is 
> > > because MSI edge interrupts do not work properly for these 
> > > ILO USB devices.
> > > We start fast and then drop to polling mode and its unusable.
> > > 
> > > The issue exists currently upstream on 5.13 as tested by Red Hat, 
> > > and reverting the mentioned patch corrects this upstream.
> > > 
> > > David Jeffery has this explanation:
> > > 
> > > The problem with the patch turning on MSI appears to be that the
> > > ehci 
> > > driver (and possibly other usb controller types too) wasn't written
> > > to
> > > support edge-triggered interrupts.
> > > The ehci_irq routine appears to be written in such a way that it
> > > will 
> > > be racy with multiple interrupt source bits.
> > > With a level-triggered interrupt, it gets called another time and
> > > cleans 
> > > up other interrupt sources.
> > > But with MSI edge, the interrupt state staying high results in no 
> > > new interrupt and ehci has to run based on polling.
> > > 
> > > static irqreturn_t ehci_irq (struct usb_hcd *hcd)
> > > {
> > > ...
> > >         status = ehci_readl(ehci, &ehci->regs->status);
> > > 
> > >         /* e.g. cardbus physical eject */
> > >         if (status == ~(u32) 0) {
> > >                 ehci_dbg (ehci, "device removed\n");
> > >                 goto dead;
> > >         }
> > > 
> > >         /*
> > >          * We don't use STS_FLR, but some controllers don't like it
> > > to
> > >          * remain on, so mask it out along with the other status
> > > bits.
> > >          */
> > >         masked_status = status & (INTR_MASK | STS_FLR);
> > > 
> > >         /* Shared IRQ? */
> > >         if (!masked_status || unlikely(ehci->rh_state ==
> > > EHCI_RH_HALTED)) {
> > >                 spin_unlock_irqrestore(&ehci->lock, flags);
> > >                 return IRQ_NONE;
> > >         }
> > > 
> > >         /* clear (just) interrupts */
> > >         ehci_writel(ehci, masked_status, &ehci->regs->status);
> > > ...
> > > 
> > > ehci_irq() reads the interrupt status register and then writes the
> > > active 
> > > interrupt-related bits back out to ack the interrupt cause.
> > > But with an edge interrupt, this is racy as another source of
> > > interrupt 
> > > could be raised by ehci between the read and the write reaching
> > > the 
> > > hardware. 
> > > e.g.  If STS_IAA was set during the initial read, but some other
> > > bit like 
> > > STS_INT gets raised by the hardware between the read and the write
> > > to the 
> > > interrupt status register, the interrupt signal state won't drop.
> > > The interrupt state says high, and since it is now edged triggered
> > > with 
> > > MSI, no new invocation of the interrupt handler gets triggered.
> > 
> > Wouldn't it be better to change these other PCI drivers by adding 
> > proper MSI support?  I don't know what would be involved, but 
> > presumably it wouldn't be very hard.  (Just run the handler in a
> > loop 
> > until all the interrupt status bits are off?)

My first impression is the same as Alan's. Can we have at least more
information on this?

> Agree with you that is a big hammer approach,  but it's such a key
> piece of the massive number of HPE servers out there and we have many
> affected customers.
> 
> While I did all the test work and discovery etc, I am definitely not a
> USB kernel guy very often, I spend most of my time in storage.
> I will listen for the other replies to see how the folks who know the
> subsystem better than I would want this reolved.

As a quick fix I would suggest to quirk out the current EHCI controllers on
the affected machines rather then drop MSI for all.

It may be done via PCI quirk mechanism. In any case I prefer what Alan says.

-- 
With Best Regards,
Andy Shevchenko



^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] usb: hcd: Revert 306c54d0edb6ba94d39877524dddebaad7770cf2: Try MSI interrupts on PCI devices
  2021-07-13 18:50 [PATCH] usb: hcd: Revert 306c54d0edb6ba94d39877524dddebaad7770cf2: Try MSI interrupts on PCI devices Laurence Oberman
  2021-07-13 19:15 ` Alan Stern
  2021-07-13 20:30 ` kernel test robot
@ 2021-07-13 20:33 ` Andy Shevchenko
  2021-07-13 20:44   ` Laurence Oberman
  2021-07-13 21:57   ` Laurence Oberman
  2021-07-13 23:11 ` kernel test robot
  3 siblings, 2 replies; 9+ messages in thread
From: Andy Shevchenko @ 2021-07-13 20:33 UTC (permalink / raw)
  To: Laurence Oberman; +Cc: linux-usb, stable, emilne, djeffery, apanagio, torez

On Tue, Jul 13, 2021 at 02:50:42PM -0400, Laurence Oberman wrote:
> Customers have been reporting that the I/O is radically being
> slowed down to HPE virtual USB ILO served DVD images during installation.

Side note:

Simple changing

-		retval = pci_alloc_irq_vectors(dev, 1, 1, PCI_IRQ_LEGACY | PCI_IRQ_MSI);
+		retval = pci_alloc_irq_vectors(dev, 1, 1, PCI_IRQ_LEGACY);

should have the same effect without touching tons of a good code.

-- 
With Best Regards,
Andy Shevchenko



^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] usb: hcd: Revert 306c54d0edb6ba94d39877524dddebaad7770cf2: Try MSI interrupts on PCI devices
  2021-07-13 20:33 ` Andy Shevchenko
@ 2021-07-13 20:44   ` Laurence Oberman
  2021-07-13 21:57   ` Laurence Oberman
  1 sibling, 0 replies; 9+ messages in thread
From: Laurence Oberman @ 2021-07-13 20:44 UTC (permalink / raw)
  To: Andy Shevchenko; +Cc: linux-usb, stable, emilne, djeffery, apanagio, torez

On Tue, 2021-07-13 at 23:33 +0300, Andy Shevchenko wrote:
> On Tue, Jul 13, 2021 at 02:50:42PM -0400, Laurence Oberman wrote:
> > Customers have been reporting that the I/O is radically being
> > slowed down to HPE virtual USB ILO served DVD images during
> > installation.
> 
> Side note:
> 
> Simple changing
> 
> -		retval = pci_alloc_irq_vectors(dev, 1, 1,
> PCI_IRQ_LEGACY | PCI_IRQ_MSI);
> +		retval = pci_alloc_irq_vectors(dev, 1, 1,
> PCI_IRQ_LEGACY);
> 
> should have the same effect without touching tons of a good code.
> 

Hi Andy, I am fine with that, and actually its one of the suggestions
David Jeffery had as well.

I will etst it today.

I can follow up with a new patch once everybody agrees.
We probbaly should get this initially fixed while we wait for a final
complete fix for other devices.

Regards
Laurence


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] usb: hcd: Revert 306c54d0edb6ba94d39877524dddebaad7770cf2: Try MSI interrupts on PCI devices
  2021-07-13 20:33 ` Andy Shevchenko
  2021-07-13 20:44   ` Laurence Oberman
@ 2021-07-13 21:57   ` Laurence Oberman
  1 sibling, 0 replies; 9+ messages in thread
From: Laurence Oberman @ 2021-07-13 21:57 UTC (permalink / raw)
  To: Andy Shevchenko; +Cc: linux-usb, stable, emilne, djeffery, apanagio, torez

On Tue, 2021-07-13 at 23:33 +0300, Andy Shevchenko wrote:
> On Tue, Jul 13, 2021 at 02:50:42PM -0400, Laurence Oberman wrote:
> > Customers have been reporting that the I/O is radically being
> > slowed down to HPE virtual USB ILO served DVD images during
> > installation.
> 
> Side note:
> 
> Simple changing
> 
> -		retval = pci_alloc_irq_vectors(dev, 1, 1,
> PCI_IRQ_LEGACY | PCI_IRQ_MSI);
> +		retval = pci_alloc_irq_vectors(dev, 1, 1,
> PCI_IRQ_LEGACY);
> 
> should have the same effect without touching tons of a good code.
> 

Andy and Alan,
I tested this and as expected it worked as well.
However David has given me a possible ehci fix which would be the best
outcome you wanted, I am testing that now and will follow up tomorrow.

Regards
Laurence


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] usb: hcd: Revert 306c54d0edb6ba94d39877524dddebaad7770cf2: Try MSI interrupts on PCI devices
  2021-07-13 18:50 [PATCH] usb: hcd: Revert 306c54d0edb6ba94d39877524dddebaad7770cf2: Try MSI interrupts on PCI devices Laurence Oberman
                   ` (2 preceding siblings ...)
  2021-07-13 20:33 ` Andy Shevchenko
@ 2021-07-13 23:11 ` kernel test robot
  3 siblings, 0 replies; 9+ messages in thread
From: kernel test robot @ 2021-07-13 23:11 UTC (permalink / raw)
  To: Laurence Oberman, linux-usb, andriy.shevchenko, stable, emilne,
	djeffery, apanagio, torez
  Cc: clang-built-linux, kbuild-all

[-- Attachment #1: Type: text/plain, Size: 6590 bytes --]

Hi Laurence,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on usb/usb-testing]
[also build test WARNING on peter.chen-usb/for-usb-next balbi-usb/testing/next v5.14-rc1 next-20210713]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Laurence-Oberman/usb-hcd-Revert-306c54d0edb6ba94d39877524dddebaad7770cf2-Try-MSI-interrupts-on-PCI-devices/20210714-025312
base:   https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git usb-testing
config: x86_64-randconfig-a002-20210713 (attached as .config)
compiler: clang version 13.0.0 (https://github.com/llvm/llvm-project 8d69635ed9ecf36fd0ca85906bfde17949671cbe)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # install x86_64 cross compiling tool for clang build
        # apt-get install binutils-x86-64-linux-gnu
        # https://github.com/0day-ci/linux/commit/3ea2a748176f21120e150f0645bc3c22e1cea48f
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Laurence-Oberman/usb-hcd-Revert-306c54d0edb6ba94d39877524dddebaad7770cf2-Try-MSI-interrupts-on-PCI-devices/20210714-025312
        git checkout 3ea2a748176f21120e150f0645bc3c22e1cea48f
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=x86_64 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

>> drivers/usb/core/hcd-pci.c:316:8: warning: variable 'hcd_driver_flags' set but not used [-Wunused-but-set-variable]
           int                     hcd_driver_flags;
                                   ^
   1 warning generated.


vim +/hcd_driver_flags +316 drivers/usb/core/hcd-pci.c

^1da177e4c3f4152 Linus Torvalds     2005-04-16  300  
^1da177e4c3f4152 Linus Torvalds     2005-04-16  301  /**
^1da177e4c3f4152 Linus Torvalds     2005-04-16  302   * usb_hcd_pci_remove - shutdown processing for PCI-based HCDs
^1da177e4c3f4152 Linus Torvalds     2005-04-16  303   * @dev: USB Host Controller being removed
41631d3616c36305 Ahmed S. Darwish   2020-10-19  304   *
41631d3616c36305 Ahmed S. Darwish   2020-10-19  305   * Context: task context, might sleep
^1da177e4c3f4152 Linus Torvalds     2005-04-16  306   *
^1da177e4c3f4152 Linus Torvalds     2005-04-16  307   * Reverses the effect of usb_hcd_pci_probe(), first invoking
^1da177e4c3f4152 Linus Torvalds     2005-04-16  308   * the HCD's stop() method.  It is always called from a thread
^1da177e4c3f4152 Linus Torvalds     2005-04-16  309   * context, normally "rmmod", "apmd", or something similar.
^1da177e4c3f4152 Linus Torvalds     2005-04-16  310   *
^1da177e4c3f4152 Linus Torvalds     2005-04-16  311   * Store this function in the HCD's struct pci_driver as remove().
^1da177e4c3f4152 Linus Torvalds     2005-04-16  312   */
^1da177e4c3f4152 Linus Torvalds     2005-04-16  313  void usb_hcd_pci_remove(struct pci_dev *dev)
^1da177e4c3f4152 Linus Torvalds     2005-04-16  314  {
^1da177e4c3f4152 Linus Torvalds     2005-04-16  315  	struct usb_hcd		*hcd;
7b2816dd293031b9 Andy Shevchenko    2020-08-14 @316  	int			hcd_driver_flags;
^1da177e4c3f4152 Linus Torvalds     2005-04-16  317  
^1da177e4c3f4152 Linus Torvalds     2005-04-16  318  	hcd = pci_get_drvdata(dev);
^1da177e4c3f4152 Linus Torvalds     2005-04-16  319  	if (!hcd)
^1da177e4c3f4152 Linus Torvalds     2005-04-16  320  		return;
^1da177e4c3f4152 Linus Torvalds     2005-04-16  321  
7b2816dd293031b9 Andy Shevchenko    2020-08-14  322  	hcd_driver_flags = hcd->driver->flags;
7b2816dd293031b9 Andy Shevchenko    2020-08-14  323  
3da7cff4e79e4a71 Alan Stern         2010-06-25  324  	if (pci_dev_run_wake(dev))
3da7cff4e79e4a71 Alan Stern         2010-06-25  325  		pm_runtime_get_noresume(&dev->dev);
3da7cff4e79e4a71 Alan Stern         2010-06-25  326  
c548795abe0d3520 Alan Stern         2010-06-09  327  	/* Fake an interrupt request in order to give the driver a chance
c548795abe0d3520 Alan Stern         2010-06-09  328  	 * to test whether the controller hardware has been removed (e.g.,
c548795abe0d3520 Alan Stern         2010-06-09  329  	 * cardbus physical eject).
c548795abe0d3520 Alan Stern         2010-06-09  330  	 */
c548795abe0d3520 Alan Stern         2010-06-09  331  	local_irq_disable();
c548795abe0d3520 Alan Stern         2010-06-09  332  	usb_hcd_irq(0, hcd);
c548795abe0d3520 Alan Stern         2010-06-09  333  	local_irq_enable();
c548795abe0d3520 Alan Stern         2010-06-09  334  
05768918b9a122ce Alan Stern         2013-03-28  335  	/* Note: dev_set_drvdata must be called while holding the rwsem */
05768918b9a122ce Alan Stern         2013-03-28  336  	if (dev->class == CL_EHCI) {
05768918b9a122ce Alan Stern         2013-03-28  337  		down_write(&companions_rwsem);
05768918b9a122ce Alan Stern         2013-03-28  338  		for_each_companion(dev, hcd, ehci_remove);
05768918b9a122ce Alan Stern         2013-03-28  339  		usb_remove_hcd(hcd);
05768918b9a122ce Alan Stern         2013-03-28  340  		dev_set_drvdata(&dev->dev, NULL);
05768918b9a122ce Alan Stern         2013-03-28  341  		up_write(&companions_rwsem);
05768918b9a122ce Alan Stern         2013-03-28  342  	} else {
05768918b9a122ce Alan Stern         2013-03-28  343  		/* Not EHCI; just clear the companion pointer */
05768918b9a122ce Alan Stern         2013-03-28  344  		down_read(&companions_rwsem);
05768918b9a122ce Alan Stern         2013-03-28  345  		hcd->self.hs_companion = NULL;
^1da177e4c3f4152 Linus Torvalds     2005-04-16  346  		usb_remove_hcd(hcd);
05768918b9a122ce Alan Stern         2013-03-28  347  		dev_set_drvdata(&dev->dev, NULL);
05768918b9a122ce Alan Stern         2013-03-28  348  		up_read(&companions_rwsem);
05768918b9a122ce Alan Stern         2013-03-28  349  	}
^1da177e4c3f4152 Linus Torvalds     2005-04-16  350  	usb_put_hcd(hcd);
^1da177e4c3f4152 Linus Torvalds     2005-04-16  351  	pci_disable_device(dev);
^1da177e4c3f4152 Linus Torvalds     2005-04-16  352  }
782e70c6fc2290a0 Greg Kroah-Hartman 2008-01-25  353  EXPORT_SYMBOL_GPL(usb_hcd_pci_remove);
^1da177e4c3f4152 Linus Torvalds     2005-04-16  354  

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 35209 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2021-07-13 23:12 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-07-13 18:50 [PATCH] usb: hcd: Revert 306c54d0edb6ba94d39877524dddebaad7770cf2: Try MSI interrupts on PCI devices Laurence Oberman
2021-07-13 19:15 ` Alan Stern
2021-07-13 20:05   ` Laurence Oberman
2021-07-13 20:30     ` Andy Shevchenko
2021-07-13 20:30 ` kernel test robot
2021-07-13 20:33 ` Andy Shevchenko
2021-07-13 20:44   ` Laurence Oberman
2021-07-13 21:57   ` Laurence Oberman
2021-07-13 23:11 ` kernel test robot

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).