All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] serial: 8250_pci: Avoid irq sharing for MSI(-X) interrupts.
@ 2021-07-29  4:33 Mario Kleiner
  2021-07-29  8:37 ` [EXT] " Ralf Ramsauer
  0 siblings, 1 reply; 7+ messages in thread
From: Mario Kleiner @ 2021-07-29  4:33 UTC (permalink / raw)
  To: linux-serial; +Cc: mario.kleiner.de, Greg Kroah-Hartman, Ralf Ramsauer

This attempts to fix a bug found with a serial port card which uses
an MCS9922 chip, one of the 4 models for which MSI-X interrupts are
currently supported. I don't possess such a card, and i'm not
experienced with the serial subsystem, so this patch is based on what
i think i found as a likely reason for failure, based on walking the
user who actually owns the card through some diagnostic.

The user who reported the problem finds the following in his dmesg
output for the relevant ttyS4 and ttyS5:

[    0.580425] serial 0000:02:00.0: enabling device (0000 -> 0003)
[    0.601448] 0000:02:00.0: ttyS4 at I/O 0x3010 (irq = 125, base_baud = 115200) is a ST16650V2
[    0.603089] serial 0000:02:00.1: enabling device (0000 -> 0003)
[    0.624119] 0000:02:00.1: ttyS5 at I/O 0x3000 (irq = 126, base_baud = 115200) is a ST16650V2
...
[    6.323784] genirq: Flags mismatch irq 128. 00000080 (ttyS5) vs. 00000000 (xhci_hcd)
[    6.324128] genirq: Flags mismatch irq 128. 00000080 (ttyS5) vs. 00000000 (xhci_hcd)
...

Output of setserial -a:

/dev/ttyS4, Line 4, UART: 16650V2, Port: 0x3010, IRQ: 127
	Baud_base: 115200, close_delay: 50, divisor: 0
	closing_wait: 3000
	Flags: spd_normal skip_test

This suggests to me that the serial driver wants to register and share a
MSI/MSI-X irq 128 with the xhci_hcd driver, whereas the xhci driver does
not want to share the irq, as flags 0x00000080 (== IRQF_SHARED) from the
serial port driver means to share the irq, and this mismatch ends in some
failed irq init?

With this setup, data reception works very unreliable, with dropped data,
already at a transmission rate of only a 16 Bytes chunk every 1/120th of
a second, ie. 1920 Bytes/sec, presumably due to rx fifo overflow due to
mishandled or not used at all rx irq's?

See full discussion thread with attempted diagnosis at:

https://psychtoolbox.discourse.group/t/issues-with-iscan-serial-port-recording/3886

Disabling the use of MSI interrupts for the serial port pci card did
fix the reliability problems. The user executed the following sequence
of commands to achieve this:

# Disable PCI serial port driver, shut down card:
echo 0000:02:00.0 | sudo tee /sys/bus/pci/drivers/serial/unbind
echo 0000:02:00.1 | sudo tee /sys/bus/pci/drivers/serial/unbind

# Disallow use of MSI/MSI-X interrupts on pci serial port card:
echo 0 | sudo tee /sys/bus/pci/devices/0000:02:00.0/msi_bus
echo 0 | sudo tee /sys/bus/pci/devices/0000:02:00.1/msi_bus

# Restart driver, reinitialize card, hopefully without MSI irqs now:
echo 0000:02:00.0 | sudo tee /sys/bus/pci/drivers/serial/bind
echo 0000:02:00.1 | sudo tee /sys/bus/pci/drivers/serial/bind

This resulted in the following log output:

[   82.179021] pci 0000:02:00.0: MSI/MSI-X disallowed for future drivers
[   87.003031] pci 0000:02:00.1: MSI/MSI-X disallowed for future drivers
[   98.537010] 0000:02:00.0: ttyS4 at I/O 0x3010 (irq = 17, base_baud = 115200) is a ST16650V2
[  103.648124] 0000:02:00.1: ttyS5 at I/O 0x3000 (irq = 18, base_baud = 115200) is a ST16650V2

This patch attempts to fix the problem by disabling irq sharing when
using MSI irq's. Note that all i know for sure is that disabling MSI
irq's fixed the problem for the user, so this patch could be wrong and
is untested. Please review with caution, keeping this in mind.

Fixes: 8428413b1d14 ("serial: 8250_pci: Implement MSI(-X) support")
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Ralf Ramsauer <ralf.ramsauer@oth-regensburg.de>
---
 drivers/tty/serial/8250/8250_pci.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/tty/serial/8250/8250_pci.c b/drivers/tty/serial/8250/8250_pci.c
index 780cc99732b6..35fd5c4e831a 100644
--- a/drivers/tty/serial/8250/8250_pci.c
+++ b/drivers/tty/serial/8250/8250_pci.c
@@ -3964,6 +3964,7 @@ pciserial_init_ports(struct pci_dev *dev, const struct pciserial_board *board)
 		if (pci_match_id(pci_use_msi, dev)) {
 			dev_dbg(&dev->dev, "Using MSI(-X) interrupts\n");
 			pci_set_master(dev);
+			uart.port.flags &= ~UPF_SHARE_IRQ;
 			rc = pci_alloc_irq_vectors(dev, 1, 1, PCI_IRQ_ALL_TYPES);
 		} else {
 			dev_dbg(&dev->dev, "Using legacy interrupts\n");
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [EXT] [PATCH] serial: 8250_pci: Avoid irq sharing for MSI(-X) interrupts.
  2021-07-29  4:33 [PATCH] serial: 8250_pci: Avoid irq sharing for MSI(-X) interrupts Mario Kleiner
@ 2021-07-29  8:37 ` Ralf Ramsauer
  2021-07-29 15:25   ` Andy Shevchenko
  0 siblings, 1 reply; 7+ messages in thread
From: Ralf Ramsauer @ 2021-07-29  8:37 UTC (permalink / raw)
  To: Mario Kleiner, linux-serial
  Cc: Greg Kroah-Hartman, Jiri Slaby, Andy Shevchenko



On 29/07/2021 06:33, Mario Kleiner wrote:
> This attempts to fix a bug found with a serial port card which uses
> an MCS9922 chip, one of the 4 models for which MSI-X interrupts are
> currently supported. I don't possess such a card, and i'm not
> experienced with the serial subsystem, so this patch is based on what
> i think i found as a likely reason for failure, based on walking the
> user who actually owns the card through some diagnostic.

As there's currently some (stuck) discussion on how to generally handle
MSI capable serial cards, and this is issue related to some degree, let
me Cc Jiri and Andy.

  Ralf

> 
> The user who reported the problem finds the following in his dmesg
> output for the relevant ttyS4 and ttyS5:
> 
> [    0.580425] serial 0000:02:00.0: enabling device (0000 -> 0003)
> [    0.601448] 0000:02:00.0: ttyS4 at I/O 0x3010 (irq = 125, base_baud = 115200) is a ST16650V2
> [    0.603089] serial 0000:02:00.1: enabling device (0000 -> 0003)
> [    0.624119] 0000:02:00.1: ttyS5 at I/O 0x3000 (irq = 126, base_baud = 115200) is a ST16650V2
> ...
> [    6.323784] genirq: Flags mismatch irq 128. 00000080 (ttyS5) vs. 00000000 (xhci_hcd)
> [    6.324128] genirq: Flags mismatch irq 128. 00000080 (ttyS5) vs. 00000000 (xhci_hcd)
> ...
> 
> Output of setserial -a:
> 
> /dev/ttyS4, Line 4, UART: 16650V2, Port: 0x3010, IRQ: 127
> 	Baud_base: 115200, close_delay: 50, divisor: 0
> 	closing_wait: 3000
> 	Flags: spd_normal skip_test
> 
> This suggests to me that the serial driver wants to register and share a
> MSI/MSI-X irq 128 with the xhci_hcd driver, whereas the xhci driver does
> not want to share the irq, as flags 0x00000080 (== IRQF_SHARED) from the
> serial port driver means to share the irq, and this mismatch ends in some
> failed irq init?
> 
> With this setup, data reception works very unreliable, with dropped data,
> already at a transmission rate of only a 16 Bytes chunk every 1/120th of
> a second, ie. 1920 Bytes/sec, presumably due to rx fifo overflow due to
> mishandled or not used at all rx irq's?
> 
> See full discussion thread with attempted diagnosis at:
> 
> https://psychtoolbox.discourse.group/t/issues-with-iscan-serial-port-recording/3886
> 
> Disabling the use of MSI interrupts for the serial port pci card did
> fix the reliability problems. The user executed the following sequence
> of commands to achieve this:
> 
> # Disable PCI serial port driver, shut down card:
> echo 0000:02:00.0 | sudo tee /sys/bus/pci/drivers/serial/unbind
> echo 0000:02:00.1 | sudo tee /sys/bus/pci/drivers/serial/unbind
> 
> # Disallow use of MSI/MSI-X interrupts on pci serial port card:
> echo 0 | sudo tee /sys/bus/pci/devices/0000:02:00.0/msi_bus
> echo 0 | sudo tee /sys/bus/pci/devices/0000:02:00.1/msi_bus
> 
> # Restart driver, reinitialize card, hopefully without MSI irqs now:
> echo 0000:02:00.0 | sudo tee /sys/bus/pci/drivers/serial/bind
> echo 0000:02:00.1 | sudo tee /sys/bus/pci/drivers/serial/bind
> 
> This resulted in the following log output:
> 
> [   82.179021] pci 0000:02:00.0: MSI/MSI-X disallowed for future drivers
> [   87.003031] pci 0000:02:00.1: MSI/MSI-X disallowed for future drivers
> [   98.537010] 0000:02:00.0: ttyS4 at I/O 0x3010 (irq = 17, base_baud = 115200) is a ST16650V2
> [  103.648124] 0000:02:00.1: ttyS5 at I/O 0x3000 (irq = 18, base_baud = 115200) is a ST16650V2
> 
> This patch attempts to fix the problem by disabling irq sharing when
> using MSI irq's. Note that all i know for sure is that disabling MSI
> irq's fixed the problem for the user, so this patch could be wrong and
> is untested. Please review with caution, keeping this in mind.
> 
> Fixes: 8428413b1d14 ("serial: 8250_pci: Implement MSI(-X) support")
> Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Cc: Ralf Ramsauer <ralf.ramsauer@oth-regensburg.de>
> ---
>  drivers/tty/serial/8250/8250_pci.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/tty/serial/8250/8250_pci.c b/drivers/tty/serial/8250/8250_pci.c
> index 780cc99732b6..35fd5c4e831a 100644
> --- a/drivers/tty/serial/8250/8250_pci.c
> +++ b/drivers/tty/serial/8250/8250_pci.c
> @@ -3964,6 +3964,7 @@ pciserial_init_ports(struct pci_dev *dev, const struct pciserial_board *board)
>  		if (pci_match_id(pci_use_msi, dev)) {
>  			dev_dbg(&dev->dev, "Using MSI(-X) interrupts\n");
>  			pci_set_master(dev);
> +			uart.port.flags &= ~UPF_SHARE_IRQ;
>  			rc = pci_alloc_irq_vectors(dev, 1, 1, PCI_IRQ_ALL_TYPES);
>  		} else {
>  			dev_dbg(&dev->dev, "Using legacy interrupts\n");
> 

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [EXT] [PATCH] serial: 8250_pci: Avoid irq sharing for MSI(-X) interrupts.
  2021-07-29  8:37 ` [EXT] " Ralf Ramsauer
@ 2021-07-29 15:25   ` Andy Shevchenko
  2021-07-30  8:20     ` Greg Kroah-Hartman
  0 siblings, 1 reply; 7+ messages in thread
From: Andy Shevchenko @ 2021-07-29 15:25 UTC (permalink / raw)
  To: Ralf Ramsauer
  Cc: Mario Kleiner, open list:SERIAL DRIVERS, Greg Kroah-Hartman, Jiri Slaby

On Thu, Jul 29, 2021 at 11:37 AM Ralf Ramsauer
<ralf.ramsauer@oth-regensburg.de> wrote:
> On 29/07/2021 06:33, Mario Kleiner wrote:
> > This attempts to fix a bug found with a serial port card which uses
> > an MCS9922 chip, one of the 4 models for which MSI-X interrupts are
> > currently supported. I don't possess such a card, and i'm not
> > experienced with the serial subsystem, so this patch is based on what
> > i think i found as a likely reason for failure, based on walking the
> > user who actually owns the card through some diagnostic.
>
> As there's currently some (stuck) discussion on how to generally handle
> MSI capable serial cards, and this is issue related to some degree, let
> me Cc Jiri and Andy.

Thanks, Ralf!

> > The user who reported the problem finds the following in his dmesg
> > output for the relevant ttyS4 and ttyS5:
> >
> > [    0.580425] serial 0000:02:00.0: enabling device (0000 -> 0003)
> > [    0.601448] 0000:02:00.0: ttyS4 at I/O 0x3010 (irq = 125, base_baud = 115200) is a ST16650V2
> > [    0.603089] serial 0000:02:00.1: enabling device (0000 -> 0003)
> > [    0.624119] 0000:02:00.1: ttyS5 at I/O 0x3000 (irq = 126, base_baud = 115200) is a ST16650V2
> > ...
> > [    6.323784] genirq: Flags mismatch irq 128. 00000080 (ttyS5) vs. 00000000 (xhci_hcd)
> > [    6.324128] genirq: Flags mismatch irq 128. 00000080 (ttyS5) vs. 00000000 (xhci_hcd)
> > ...
> >
> > Output of setserial -a:
> >
> > /dev/ttyS4, Line 4, UART: 16650V2, Port: 0x3010, IRQ: 127
> >       Baud_base: 115200, close_delay: 50, divisor: 0
> >       closing_wait: 3000
> >       Flags: spd_normal skip_test
> >
> > This suggests to me that the serial driver wants to register and share a
> > MSI/MSI-X irq 128 with the xhci_hcd driver, whereas the xhci driver does
> > not want to share the irq, as flags 0x00000080 (== IRQF_SHARED) from the
> > serial port driver means to share the irq, and this mismatch ends in some
> > failed irq init?
> >
> > With this setup, data reception works very unreliable, with dropped data,
> > already at a transmission rate of only a 16 Bytes chunk every 1/120th of
> > a second, ie. 1920 Bytes/sec, presumably due to rx fifo overflow due to
> > mishandled or not used at all rx irq's?
> >
> > See full discussion thread with attempted diagnosis at:
> >
> > https://psychtoolbox.discourse.group/t/issues-with-iscan-serial-port-recording/3886
> >
> > Disabling the use of MSI interrupts for the serial port pci card did
> > fix the reliability problems. The user executed the following sequence
> > of commands to achieve this:
> >
> > # Disable PCI serial port driver, shut down card:
> > echo 0000:02:00.0 | sudo tee /sys/bus/pci/drivers/serial/unbind
> > echo 0000:02:00.1 | sudo tee /sys/bus/pci/drivers/serial/unbind
> >
> > # Disallow use of MSI/MSI-X interrupts on pci serial port card:
> > echo 0 | sudo tee /sys/bus/pci/devices/0000:02:00.0/msi_bus
> > echo 0 | sudo tee /sys/bus/pci/devices/0000:02:00.1/msi_bus
> >
> > # Restart driver, reinitialize card, hopefully without MSI irqs now:
> > echo 0000:02:00.0 | sudo tee /sys/bus/pci/drivers/serial/bind
> > echo 0000:02:00.1 | sudo tee /sys/bus/pci/drivers/serial/bind
> >
> > This resulted in the following log output:
> >
> > [   82.179021] pci 0000:02:00.0: MSI/MSI-X disallowed for future drivers
> > [   87.003031] pci 0000:02:00.1: MSI/MSI-X disallowed for future drivers
> > [   98.537010] 0000:02:00.0: ttyS4 at I/O 0x3010 (irq = 17, base_baud = 115200) is a ST16650V2
> > [  103.648124] 0000:02:00.1: ttyS5 at I/O 0x3000 (irq = 18, base_baud = 115200) is a ST16650V2
> >
> > This patch attempts to fix the problem by disabling irq sharing when
> > using MSI irq's. Note that all i know for sure is that disabling MSI

In general the shared MSI interrupts are weird things that can be done
with IRQs.

> > irq's fixed the problem for the user, so this patch could be wrong and
> > is untested. Please review with caution, keeping this in mind.

I think it's a good idea in general. I have no objections.

> > Fixes: 8428413b1d14 ("serial: 8250_pci: Implement MSI(-X) support")
> > Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
> > Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> > Cc: Ralf Ramsauer <ralf.ramsauer@oth-regensburg.de>
> > ---
> >  drivers/tty/serial/8250/8250_pci.c | 1 +
> >  1 file changed, 1 insertion(+)
> >
> > diff --git a/drivers/tty/serial/8250/8250_pci.c b/drivers/tty/serial/8250/8250_pci.c
> > index 780cc99732b6..35fd5c4e831a 100644
> > --- a/drivers/tty/serial/8250/8250_pci.c
> > +++ b/drivers/tty/serial/8250/8250_pci.c
> > @@ -3964,6 +3964,7 @@ pciserial_init_ports(struct pci_dev *dev, const struct pciserial_board *board)
> >               if (pci_match_id(pci_use_msi, dev)) {
> >                       dev_dbg(&dev->dev, "Using MSI(-X) interrupts\n");
> >                       pci_set_master(dev);
> > +                     uart.port.flags &= ~UPF_SHARE_IRQ;
> >                       rc = pci_alloc_irq_vectors(dev, 1, 1, PCI_IRQ_ALL_TYPES);
> >               } else {
> >                       dev_dbg(&dev->dev, "Using legacy interrupts\n");
> >



--
With Best Regards,
Andy Shevchenko

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [EXT] [PATCH] serial: 8250_pci: Avoid irq sharing for MSI(-X) interrupts.
  2021-07-29 15:25   ` Andy Shevchenko
@ 2021-07-30  8:20     ` Greg Kroah-Hartman
  2021-07-30  8:36       ` Andy Shevchenko
  0 siblings, 1 reply; 7+ messages in thread
From: Greg Kroah-Hartman @ 2021-07-30  8:20 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Ralf Ramsauer, Mario Kleiner, open list:SERIAL DRIVERS, Jiri Slaby

On Thu, Jul 29, 2021 at 06:25:54PM +0300, Andy Shevchenko wrote:
> On Thu, Jul 29, 2021 at 11:37 AM Ralf Ramsauer
> <ralf.ramsauer@oth-regensburg.de> wrote:
> > On 29/07/2021 06:33, Mario Kleiner wrote:
> > > This attempts to fix a bug found with a serial port card which uses
> > > an MCS9922 chip, one of the 4 models for which MSI-X interrupts are
> > > currently supported. I don't possess such a card, and i'm not
> > > experienced with the serial subsystem, so this patch is based on what
> > > i think i found as a likely reason for failure, based on walking the
> > > user who actually owns the card through some diagnostic.
> >
> > As there's currently some (stuck) discussion on how to generally handle
> > MSI capable serial cards, and this is issue related to some degree, let
> > me Cc Jiri and Andy.
> 
> Thanks, Ralf!
> 
> > > The user who reported the problem finds the following in his dmesg
> > > output for the relevant ttyS4 and ttyS5:
> > >
> > > [    0.580425] serial 0000:02:00.0: enabling device (0000 -> 0003)
> > > [    0.601448] 0000:02:00.0: ttyS4 at I/O 0x3010 (irq = 125, base_baud = 115200) is a ST16650V2
> > > [    0.603089] serial 0000:02:00.1: enabling device (0000 -> 0003)
> > > [    0.624119] 0000:02:00.1: ttyS5 at I/O 0x3000 (irq = 126, base_baud = 115200) is a ST16650V2
> > > ...
> > > [    6.323784] genirq: Flags mismatch irq 128. 00000080 (ttyS5) vs. 00000000 (xhci_hcd)
> > > [    6.324128] genirq: Flags mismatch irq 128. 00000080 (ttyS5) vs. 00000000 (xhci_hcd)
> > > ...
> > >
> > > Output of setserial -a:
> > >
> > > /dev/ttyS4, Line 4, UART: 16650V2, Port: 0x3010, IRQ: 127
> > >       Baud_base: 115200, close_delay: 50, divisor: 0
> > >       closing_wait: 3000
> > >       Flags: spd_normal skip_test
> > >
> > > This suggests to me that the serial driver wants to register and share a
> > > MSI/MSI-X irq 128 with the xhci_hcd driver, whereas the xhci driver does
> > > not want to share the irq, as flags 0x00000080 (== IRQF_SHARED) from the
> > > serial port driver means to share the irq, and this mismatch ends in some
> > > failed irq init?
> > >
> > > With this setup, data reception works very unreliable, with dropped data,
> > > already at a transmission rate of only a 16 Bytes chunk every 1/120th of
> > > a second, ie. 1920 Bytes/sec, presumably due to rx fifo overflow due to
> > > mishandled or not used at all rx irq's?
> > >
> > > See full discussion thread with attempted diagnosis at:
> > >
> > > https://psychtoolbox.discourse.group/t/issues-with-iscan-serial-port-recording/3886
> > >
> > > Disabling the use of MSI interrupts for the serial port pci card did
> > > fix the reliability problems. The user executed the following sequence
> > > of commands to achieve this:
> > >
> > > # Disable PCI serial port driver, shut down card:
> > > echo 0000:02:00.0 | sudo tee /sys/bus/pci/drivers/serial/unbind
> > > echo 0000:02:00.1 | sudo tee /sys/bus/pci/drivers/serial/unbind
> > >
> > > # Disallow use of MSI/MSI-X interrupts on pci serial port card:
> > > echo 0 | sudo tee /sys/bus/pci/devices/0000:02:00.0/msi_bus
> > > echo 0 | sudo tee /sys/bus/pci/devices/0000:02:00.1/msi_bus
> > >
> > > # Restart driver, reinitialize card, hopefully without MSI irqs now:
> > > echo 0000:02:00.0 | sudo tee /sys/bus/pci/drivers/serial/bind
> > > echo 0000:02:00.1 | sudo tee /sys/bus/pci/drivers/serial/bind
> > >
> > > This resulted in the following log output:
> > >
> > > [   82.179021] pci 0000:02:00.0: MSI/MSI-X disallowed for future drivers
> > > [   87.003031] pci 0000:02:00.1: MSI/MSI-X disallowed for future drivers
> > > [   98.537010] 0000:02:00.0: ttyS4 at I/O 0x3010 (irq = 17, base_baud = 115200) is a ST16650V2
> > > [  103.648124] 0000:02:00.1: ttyS5 at I/O 0x3000 (irq = 18, base_baud = 115200) is a ST16650V2
> > >
> > > This patch attempts to fix the problem by disabling irq sharing when
> > > using MSI irq's. Note that all i know for sure is that disabling MSI
> 
> In general the shared MSI interrupts are weird things that can be done
> with IRQs.
> 
> > > irq's fixed the problem for the user, so this patch could be wrong and
> > > is untested. Please review with caution, keeping this in mind.
> 
> I think it's a good idea in general. I have no objections.

So is that a "reviewed-by" for this patch?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [EXT] [PATCH] serial: 8250_pci: Avoid irq sharing for MSI(-X) interrupts.
  2021-07-30  8:20     ` Greg Kroah-Hartman
@ 2021-07-30  8:36       ` Andy Shevchenko
  2021-08-13 16:45         ` Mario Kleiner
  0 siblings, 1 reply; 7+ messages in thread
From: Andy Shevchenko @ 2021-07-30  8:36 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Ralf Ramsauer, Mario Kleiner, open list:SERIAL DRIVERS, Jiri Slaby

On Fri, Jul 30, 2021 at 11:20 AM Greg Kroah-Hartman
<gregkh@linuxfoundation.org> wrote:
>
> On Thu, Jul 29, 2021 at 06:25:54PM +0300, Andy Shevchenko wrote:
> > On Thu, Jul 29, 2021 at 11:37 AM Ralf Ramsauer
> > <ralf.ramsauer@oth-regensburg.de> wrote:
> > > On 29/07/2021 06:33, Mario Kleiner wrote:
> > > > This attempts to fix a bug found with a serial port card which uses
> > > > an MCS9922 chip, one of the 4 models for which MSI-X interrupts are
> > > > currently supported. I don't possess such a card, and i'm not
> > > > experienced with the serial subsystem, so this patch is based on what
> > > > i think i found as a likely reason for failure, based on walking the
> > > > user who actually owns the card through some diagnostic.
> > >
> > > As there's currently some (stuck) discussion on how to generally handle
> > > MSI capable serial cards, and this is issue related to some degree, let
> > > me Cc Jiri and Andy.
> >
> > Thanks, Ralf!
> >
> > > > The user who reported the problem finds the following in his dmesg
> > > > output for the relevant ttyS4 and ttyS5:
> > > >
> > > > [    0.580425] serial 0000:02:00.0: enabling device (0000 -> 0003)
> > > > [    0.601448] 0000:02:00.0: ttyS4 at I/O 0x3010 (irq = 125, base_baud = 115200) is a ST16650V2
> > > > [    0.603089] serial 0000:02:00.1: enabling device (0000 -> 0003)
> > > > [    0.624119] 0000:02:00.1: ttyS5 at I/O 0x3000 (irq = 126, base_baud = 115200) is a ST16650V2
> > > > ...
> > > > [    6.323784] genirq: Flags mismatch irq 128. 00000080 (ttyS5) vs. 00000000 (xhci_hcd)
> > > > [    6.324128] genirq: Flags mismatch irq 128. 00000080 (ttyS5) vs. 00000000 (xhci_hcd)
> > > > ...
> > > >
> > > > Output of setserial -a:
> > > >
> > > > /dev/ttyS4, Line 4, UART: 16650V2, Port: 0x3010, IRQ: 127
> > > >       Baud_base: 115200, close_delay: 50, divisor: 0
> > > >       closing_wait: 3000
> > > >       Flags: spd_normal skip_test
> > > >
> > > > This suggests to me that the serial driver wants to register and share a
> > > > MSI/MSI-X irq 128 with the xhci_hcd driver, whereas the xhci driver does
> > > > not want to share the irq, as flags 0x00000080 (== IRQF_SHARED) from the
> > > > serial port driver means to share the irq, and this mismatch ends in some
> > > > failed irq init?
> > > >
> > > > With this setup, data reception works very unreliable, with dropped data,
> > > > already at a transmission rate of only a 16 Bytes chunk every 1/120th of
> > > > a second, ie. 1920 Bytes/sec, presumably due to rx fifo overflow due to
> > > > mishandled or not used at all rx irq's?
> > > >
> > > > See full discussion thread with attempted diagnosis at:
> > > >
> > > > https://psychtoolbox.discourse.group/t/issues-with-iscan-serial-port-recording/3886
> > > >
> > > > Disabling the use of MSI interrupts for the serial port pci card did
> > > > fix the reliability problems. The user executed the following sequence
> > > > of commands to achieve this:
> > > >
> > > > # Disable PCI serial port driver, shut down card:
> > > > echo 0000:02:00.0 | sudo tee /sys/bus/pci/drivers/serial/unbind
> > > > echo 0000:02:00.1 | sudo tee /sys/bus/pci/drivers/serial/unbind
> > > >
> > > > # Disallow use of MSI/MSI-X interrupts on pci serial port card:
> > > > echo 0 | sudo tee /sys/bus/pci/devices/0000:02:00.0/msi_bus
> > > > echo 0 | sudo tee /sys/bus/pci/devices/0000:02:00.1/msi_bus
> > > >
> > > > # Restart driver, reinitialize card, hopefully without MSI irqs now:
> > > > echo 0000:02:00.0 | sudo tee /sys/bus/pci/drivers/serial/bind
> > > > echo 0000:02:00.1 | sudo tee /sys/bus/pci/drivers/serial/bind
> > > >
> > > > This resulted in the following log output:
> > > >
> > > > [   82.179021] pci 0000:02:00.0: MSI/MSI-X disallowed for future drivers
> > > > [   87.003031] pci 0000:02:00.1: MSI/MSI-X disallowed for future drivers
> > > > [   98.537010] 0000:02:00.0: ttyS4 at I/O 0x3010 (irq = 17, base_baud = 115200) is a ST16650V2
> > > > [  103.648124] 0000:02:00.1: ttyS5 at I/O 0x3000 (irq = 18, base_baud = 115200) is a ST16650V2
> > > >
> > > > This patch attempts to fix the problem by disabling irq sharing when
> > > > using MSI irq's. Note that all i know for sure is that disabling MSI
> >
> > In general the shared MSI interrupts are weird things that can be done
> > with IRQs.
> >
> > > > irq's fixed the problem for the user, so this patch could be wrong and
> > > > is untested. Please review with caution, keeping this in mind.
> >
> > I think it's a good idea in general. I have no objections.
>
> So is that a "reviewed-by" for this patch?

You may consider this way.

Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>

-- 
With Best Regards,
Andy Shevchenko

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [EXT] [PATCH] serial: 8250_pci: Avoid irq sharing for MSI(-X) interrupts.
  2021-07-30  8:36       ` Andy Shevchenko
@ 2021-08-13 16:45         ` Mario Kleiner
  2021-08-13 17:04           ` Andy Shevchenko
  0 siblings, 1 reply; 7+ messages in thread
From: Mario Kleiner @ 2021-08-13 16:45 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Greg Kroah-Hartman, Ralf Ramsauer, open list:SERIAL DRIVERS, Jiri Slaby

Btw., one question to the experts, after much more grepping through
the irq handling code and being no wiser than before:

My understanding of MSI(-X) is that one of the big advantages over
legacy IRQ is that there are so many available MSI IRQ's, and so IRQ
sharing is normally not needed. So why did the driver choose an MSI
irq shared with the USB host controller? Or iow., why did

pci_alloc_irq_vectors(dev, 1, 1, PCI_IRQ_ALL_TYPES); and uart.port.irq
= pci_irq_vector(dev, 0); return a IRQ number that was already in use
at that time? Is this something to be expected, or some quirk specific
to serial drivers or this hw?

Thanks and have a nice weekend,
-mario

On Fri, Jul 30, 2021 at 10:37 AM Andy Shevchenko
<andy.shevchenko@gmail.com> wrote:
>
> On Fri, Jul 30, 2021 at 11:20 AM Greg Kroah-Hartman
> <gregkh@linuxfoundation.org> wrote:
> >
> > On Thu, Jul 29, 2021 at 06:25:54PM +0300, Andy Shevchenko wrote:
> > > On Thu, Jul 29, 2021 at 11:37 AM Ralf Ramsauer
> > > <ralf.ramsauer@oth-regensburg.de> wrote:
> > > > On 29/07/2021 06:33, Mario Kleiner wrote:
> > > > > This attempts to fix a bug found with a serial port card which uses
> > > > > an MCS9922 chip, one of the 4 models for which MSI-X interrupts are
> > > > > currently supported. I don't possess such a card, and i'm not
> > > > > experienced with the serial subsystem, so this patch is based on what
> > > > > i think i found as a likely reason for failure, based on walking the
> > > > > user who actually owns the card through some diagnostic.
> > > >
> > > > As there's currently some (stuck) discussion on how to generally handle
> > > > MSI capable serial cards, and this is issue related to some degree, let
> > > > me Cc Jiri and Andy.
> > >
> > > Thanks, Ralf!
> > >
> > > > > The user who reported the problem finds the following in his dmesg
> > > > > output for the relevant ttyS4 and ttyS5:
> > > > >
> > > > > [    0.580425] serial 0000:02:00.0: enabling device (0000 -> 0003)
> > > > > [    0.601448] 0000:02:00.0: ttyS4 at I/O 0x3010 (irq = 125, base_baud = 115200) is a ST16650V2
> > > > > [    0.603089] serial 0000:02:00.1: enabling device (0000 -> 0003)
> > > > > [    0.624119] 0000:02:00.1: ttyS5 at I/O 0x3000 (irq = 126, base_baud = 115200) is a ST16650V2
> > > > > ...
> > > > > [    6.323784] genirq: Flags mismatch irq 128. 00000080 (ttyS5) vs. 00000000 (xhci_hcd)
> > > > > [    6.324128] genirq: Flags mismatch irq 128. 00000080 (ttyS5) vs. 00000000 (xhci_hcd)
> > > > > ...
> > > > >
> > > > > Output of setserial -a:
> > > > >
> > > > > /dev/ttyS4, Line 4, UART: 16650V2, Port: 0x3010, IRQ: 127
> > > > >       Baud_base: 115200, close_delay: 50, divisor: 0
> > > > >       closing_wait: 3000
> > > > >       Flags: spd_normal skip_test
> > > > >
> > > > > This suggests to me that the serial driver wants to register and share a
> > > > > MSI/MSI-X irq 128 with the xhci_hcd driver, whereas the xhci driver does
> > > > > not want to share the irq, as flags 0x00000080 (== IRQF_SHARED) from the
> > > > > serial port driver means to share the irq, and this mismatch ends in some
> > > > > failed irq init?
> > > > >
> > > > > With this setup, data reception works very unreliable, with dropped data,
> > > > > already at a transmission rate of only a 16 Bytes chunk every 1/120th of
> > > > > a second, ie. 1920 Bytes/sec, presumably due to rx fifo overflow due to
> > > > > mishandled or not used at all rx irq's?
> > > > >
> > > > > See full discussion thread with attempted diagnosis at:
> > > > >
> > > > > https://psychtoolbox.discourse.group/t/issues-with-iscan-serial-port-recording/3886
> > > > >
> > > > > Disabling the use of MSI interrupts for the serial port pci card did
> > > > > fix the reliability problems. The user executed the following sequence
> > > > > of commands to achieve this:
> > > > >
> > > > > # Disable PCI serial port driver, shut down card:
> > > > > echo 0000:02:00.0 | sudo tee /sys/bus/pci/drivers/serial/unbind
> > > > > echo 0000:02:00.1 | sudo tee /sys/bus/pci/drivers/serial/unbind
> > > > >
> > > > > # Disallow use of MSI/MSI-X interrupts on pci serial port card:
> > > > > echo 0 | sudo tee /sys/bus/pci/devices/0000:02:00.0/msi_bus
> > > > > echo 0 | sudo tee /sys/bus/pci/devices/0000:02:00.1/msi_bus
> > > > >
> > > > > # Restart driver, reinitialize card, hopefully without MSI irqs now:
> > > > > echo 0000:02:00.0 | sudo tee /sys/bus/pci/drivers/serial/bind
> > > > > echo 0000:02:00.1 | sudo tee /sys/bus/pci/drivers/serial/bind
> > > > >
> > > > > This resulted in the following log output:
> > > > >
> > > > > [   82.179021] pci 0000:02:00.0: MSI/MSI-X disallowed for future drivers
> > > > > [   87.003031] pci 0000:02:00.1: MSI/MSI-X disallowed for future drivers
> > > > > [   98.537010] 0000:02:00.0: ttyS4 at I/O 0x3010 (irq = 17, base_baud = 115200) is a ST16650V2
> > > > > [  103.648124] 0000:02:00.1: ttyS5 at I/O 0x3000 (irq = 18, base_baud = 115200) is a ST16650V2
> > > > >
> > > > > This patch attempts to fix the problem by disabling irq sharing when
> > > > > using MSI irq's. Note that all i know for sure is that disabling MSI
> > >
> > > In general the shared MSI interrupts are weird things that can be done
> > > with IRQs.
> > >
> > > > > irq's fixed the problem for the user, so this patch could be wrong and
> > > > > is untested. Please review with caution, keeping this in mind.
> > >
> > > I think it's a good idea in general. I have no objections.
> >
> > So is that a "reviewed-by" for this patch?
>
> You may consider this way.
>
> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
>
> --
> With Best Regards,
> Andy Shevchenko

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [EXT] [PATCH] serial: 8250_pci: Avoid irq sharing for MSI(-X) interrupts.
  2021-08-13 16:45         ` Mario Kleiner
@ 2021-08-13 17:04           ` Andy Shevchenko
  0 siblings, 0 replies; 7+ messages in thread
From: Andy Shevchenko @ 2021-08-13 17:04 UTC (permalink / raw)
  To: Mario Kleiner, Thomas Gleixner
  Cc: Greg Kroah-Hartman, Ralf Ramsauer, open list:SERIAL DRIVERS, Jiri Slaby

On Fri, Aug 13, 2021 at 7:45 PM Mario Kleiner
<mario.kleiner.de@gmail.com> wrote:
>
> Btw., one question to the experts, after much more grepping through
> the irq handling code and being no wiser than before:
>
> My understanding of MSI(-X) is that one of the big advantages over
> legacy IRQ is that there are so many available MSI IRQ's, and so IRQ
> sharing is normally not needed. So why did the driver choose an MSI
> irq shared with the USB host controller? Or iow., why did
>
> pci_alloc_irq_vectors(dev, 1, 1, PCI_IRQ_ALL_TYPES); and uart.port.irq
> = pci_irq_vector(dev, 0); return a IRQ number that was already in use
> at that time? Is this something to be expected, or some quirk specific
> to serial drivers or this hw?

My theory is that they are sharing the same pin on the PCI bus and
drivers have a SHARED flag and some code inside IRQ core decides that
it's fine to have them shared. But I think the best person to answer
is Thomas.

> > > > > On 29/07/2021 06:33, Mario Kleiner wrote:
> > > > > > This attempts to fix a bug found with a serial port card which uses
> > > > > > an MCS9922 chip, one of the 4 models for which MSI-X interrupts are
> > > > > > currently supported. I don't possess such a card, and i'm not
> > > > > > experienced with the serial subsystem, so this patch is based on what
> > > > > > i think i found as a likely reason for failure, based on walking the
> > > > > > user who actually owns the card through some diagnostic.
> > > > >
> > > > > As there's currently some (stuck) discussion on how to generally handle
> > > > > MSI capable serial cards, and this is issue related to some degree, let
> > > > > me Cc Jiri and Andy.
> > > >
> > > > Thanks, Ralf!
> > > >
> > > > > > The user who reported the problem finds the following in his dmesg
> > > > > > output for the relevant ttyS4 and ttyS5:
> > > > > >
> > > > > > [    0.580425] serial 0000:02:00.0: enabling device (0000 -> 0003)
> > > > > > [    0.601448] 0000:02:00.0: ttyS4 at I/O 0x3010 (irq = 125, base_baud = 115200) is a ST16650V2
> > > > > > [    0.603089] serial 0000:02:00.1: enabling device (0000 -> 0003)
> > > > > > [    0.624119] 0000:02:00.1: ttyS5 at I/O 0x3000 (irq = 126, base_baud = 115200) is a ST16650V2
> > > > > > ...
> > > > > > [    6.323784] genirq: Flags mismatch irq 128. 00000080 (ttyS5) vs. 00000000 (xhci_hcd)
> > > > > > [    6.324128] genirq: Flags mismatch irq 128. 00000080 (ttyS5) vs. 00000000 (xhci_hcd)
> > > > > > ...
> > > > > >
> > > > > > Output of setserial -a:
> > > > > >
> > > > > > /dev/ttyS4, Line 4, UART: 16650V2, Port: 0x3010, IRQ: 127
> > > > > >       Baud_base: 115200, close_delay: 50, divisor: 0
> > > > > >       closing_wait: 3000
> > > > > >       Flags: spd_normal skip_test
> > > > > >
> > > > > > This suggests to me that the serial driver wants to register and share a
> > > > > > MSI/MSI-X irq 128 with the xhci_hcd driver, whereas the xhci driver does
> > > > > > not want to share the irq, as flags 0x00000080 (== IRQF_SHARED) from the
> > > > > > serial port driver means to share the irq, and this mismatch ends in some
> > > > > > failed irq init?
> > > > > >
> > > > > > With this setup, data reception works very unreliable, with dropped data,
> > > > > > already at a transmission rate of only a 16 Bytes chunk every 1/120th of
> > > > > > a second, ie. 1920 Bytes/sec, presumably due to rx fifo overflow due to
> > > > > > mishandled or not used at all rx irq's?
> > > > > >
> > > > > > See full discussion thread with attempted diagnosis at:
> > > > > >
> > > > > > https://psychtoolbox.discourse.group/t/issues-with-iscan-serial-port-recording/3886
> > > > > >
> > > > > > Disabling the use of MSI interrupts for the serial port pci card did
> > > > > > fix the reliability problems. The user executed the following sequence
> > > > > > of commands to achieve this:
> > > > > >
> > > > > > # Disable PCI serial port driver, shut down card:
> > > > > > echo 0000:02:00.0 | sudo tee /sys/bus/pci/drivers/serial/unbind
> > > > > > echo 0000:02:00.1 | sudo tee /sys/bus/pci/drivers/serial/unbind
> > > > > >
> > > > > > # Disallow use of MSI/MSI-X interrupts on pci serial port card:
> > > > > > echo 0 | sudo tee /sys/bus/pci/devices/0000:02:00.0/msi_bus
> > > > > > echo 0 | sudo tee /sys/bus/pci/devices/0000:02:00.1/msi_bus
> > > > > >
> > > > > > # Restart driver, reinitialize card, hopefully without MSI irqs now:
> > > > > > echo 0000:02:00.0 | sudo tee /sys/bus/pci/drivers/serial/bind
> > > > > > echo 0000:02:00.1 | sudo tee /sys/bus/pci/drivers/serial/bind
> > > > > >
> > > > > > This resulted in the following log output:
> > > > > >
> > > > > > [   82.179021] pci 0000:02:00.0: MSI/MSI-X disallowed for future drivers
> > > > > > [   87.003031] pci 0000:02:00.1: MSI/MSI-X disallowed for future drivers
> > > > > > [   98.537010] 0000:02:00.0: ttyS4 at I/O 0x3010 (irq = 17, base_baud = 115200) is a ST16650V2
> > > > > > [  103.648124] 0000:02:00.1: ttyS5 at I/O 0x3000 (irq = 18, base_baud = 115200) is a ST16650V2
> > > > > >
> > > > > > This patch attempts to fix the problem by disabling irq sharing when
> > > > > > using MSI irq's. Note that all i know for sure is that disabling MSI
> > > >
> > > > In general the shared MSI interrupts are weird things that can be done
> > > > with IRQs.
> > > >
> > > > > > irq's fixed the problem for the user, so this patch could be wrong and
> > > > > > is untested. Please review with caution, keeping this in mind.
> > > >
> > > > I think it's a good idea in general. I have no objections.

-- 
With Best Regards,
Andy Shevchenko

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2021-08-13 17:05 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-07-29  4:33 [PATCH] serial: 8250_pci: Avoid irq sharing for MSI(-X) interrupts Mario Kleiner
2021-07-29  8:37 ` [EXT] " Ralf Ramsauer
2021-07-29 15:25   ` Andy Shevchenko
2021-07-30  8:20     ` Greg Kroah-Hartman
2021-07-30  8:36       ` Andy Shevchenko
2021-08-13 16:45         ` Mario Kleiner
2021-08-13 17:04           ` Andy Shevchenko

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.