linux-kernel.vger.kernel.org archive mirror
* Regression caused by commit 7bb05b85bc2d ("r8169: don't use MSI-X on RTL8106e")
@ 2018-09-12  3:42 Kai-Heng Feng
  2018-09-12  4:56 ` Jian-Hong Pan
  2018-09-12  6:32 ` Thomas Gleixner
  0 siblings, 2 replies; 8+ messages in thread
From: Kai-Heng Feng @ 2018-09-12  3:42 UTC (permalink / raw)
  To: jian-hong, Thomas Gleixner; +Cc: Linux Netdev List, Linux Kernel Mailing List

Hi Jian-Hong,

There's a Dell machine with an RTL8106e that stops working after S3 since
the commit was introduced.
So I am wondering if it's possible to revert the commit and use a
DMI/subsystem ID based quirk table?

It's because commit bc976233a872 ("genirq/msi, x86/vector: Prevent
reservation mode for non maskable MSI") cleared the reservation mode, and I
can see this after S3:

[   94.872838] do_IRQ: 3.33 No irq handler for vector

If the device uses MSI-X instead of MSI, the issue doesn't happen because  
of reservation mode.


Hi Thomas,

Is this something that should be handled by the x86 BIOS? I ask because I
don't see this issue when I use Suspend-to-Idle, which doesn't use the BIOS
to do suspend.

Kai-Heng
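
For readers scanning logs for the message quoted above: on x86 the two numbers in "do_IRQ: 3.33" appear to be the CPU number and the vector number, in that order (an assumption based on the format string in arch/x86/kernel/irq.c around this kernel version). A minimal sketch of a parser for such lines follows; the function and regex names are made up for illustration.

```python
import re

# Matches "do_IRQ: <cpu>.<vector> No irq handler for vector", optionally
# preceded by a dmesg timestamp such as "[   94.872838]".
NO_HANDLER_RE = re.compile(
    r"^(?:\[\s*(?P<ts>\d+\.\d+)\]\s*)?"
    r"do_IRQ: (?P<cpu>\d+)\.(?P<vector>\d+) No irq handler for vector"
)


def parse_no_handler(line):
    """Return (timestamp, cpu, vector) for a matching line, else None."""
    m = NO_HANDLER_RE.match(line.strip())
    if m is None:
        return None
    ts = float(m.group("ts")) if m.group("ts") else None
    return ts, int(m.group("cpu")), int(m.group("vector"))


print(parse_no_handler("[   94.872838] do_IRQ: 3.33 No irq handler for vector"))
```

Under that reading, the line in this report says that CPU 3 received vector 33 with no handler installed for it.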



* Re: Regression caused by commit 7bb05b85bc2d ("r8169: don't use MSI-X on RTL8106e")
  2018-09-12  3:42 Regression caused by commit 7bb05b85bc2d ("r8169: don't use MSI-X on RTL8106e") Kai-Heng Feng
@ 2018-09-12  4:56 ` Jian-Hong Pan
  2018-09-12  5:57   ` Kai-Heng Feng
  2018-09-12  6:32 ` Thomas Gleixner
  1 sibling, 1 reply; 8+ messages in thread
From: Jian-Hong Pan @ 2018-09-12  4:56 UTC (permalink / raw)
  To: Kai-Heng Feng
  Cc: Thomas Gleixner, Linux Netdev List, Linux Kernel Mailing List,
	Daniel Drake

2018-09-12 11:42 GMT+08:00 Kai-Heng Feng <kai.heng.feng@canonical.com>:
> Hi Jian-Hong,
>
> There's a Dell machine with RTL8106e stops to work after S3 since the commit
> introduced.
> So I am wondering if it's possible to revert the commit and use
> DMI/subsystem id based quirk table?
>
> It's because of commit bc976233a872 ("genirq/msi, x86/vector: Prevent
> reservation mode for non maskable MSI") cleared the reservation mode, and I
> can see this after S3:
>
> [   94.872838] do_IRQ: 3.33 No irq handler for vector
>
> If the device uses MSI-X instead of MSI, the issue doesn't happen because of
> reservation mode.

Interesting! The opposite symptom!
Could you try the patch
https://marc.info/?l=linux-pci&m=153629858601668&w=4 with and without
reverting the commit?

If the patch does not work, another suggestion: You can try falling
back to only PCI_IRQ_LEGACY.

Regards,
Jian-Hong Pan
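
For context on the PCI_IRQ_LEGACY suggestion: pci_alloc_irq_vectors() takes a mask of allowed interrupt types and tries MSI-X first, then MSI, then the legacy line. The sketch below only models that ordering in userspace. The flag values match include/linux/pci.h, but pick_irq_mode() is a hypothetical helper, and the model ignores every other reason an allocation can fail.

```python
# Flag values as defined in include/linux/pci.h.
PCI_IRQ_LEGACY = 1 << 0
PCI_IRQ_MSI = 1 << 1
PCI_IRQ_MSIX = 1 << 2


def pick_irq_mode(flags, has_msix, has_msi):
    """Model the order in which the allowed interrupt types are tried."""
    if (flags & PCI_IRQ_MSIX) and has_msix:
        return "MSI-X"
    if (flags & PCI_IRQ_MSI) and has_msi:
        return "MSI"
    if flags & PCI_IRQ_LEGACY:
        return "legacy"
    return None  # nothing allowed by the caller is available


all_types = PCI_IRQ_LEGACY | PCI_IRQ_MSI | PCI_IRQ_MSIX
print(pick_irq_mode(all_types, has_msix=True, has_msi=True))
print(pick_irq_mode(all_types & ~PCI_IRQ_MSIX, has_msix=True, has_msi=True))
print(pick_irq_mode(PCI_IRQ_LEGACY, has_msix=True, has_msi=True))
```

Dropping PCI_IRQ_MSIX from the mask is presumably what the commit under discussion does for the RTL8106e; passing only PCI_IRQ_LEGACY corresponds to the fallback suggested above.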

>
> Hi Thomas,
>
> Is it something should be handled by x86 BIOS? Because I don't see this
> issue when I use Suspend-to-Idle, which doesn't use BIOS to do suspend.
>
> Kai-Heng
>


* Re: Regression caused by commit 7bb05b85bc2d ("r8169: don't use MSI-X on RTL8106e")
  2018-09-12  4:56 ` Jian-Hong Pan
@ 2018-09-12  5:57   ` Kai-Heng Feng
  0 siblings, 0 replies; 8+ messages in thread
From: Kai-Heng Feng @ 2018-09-12  5:57 UTC (permalink / raw)
  To: Jian-Hong Pan
  Cc: Thomas Gleixner, Linux Netdev List, Linux Kernel Mailing List,
	Daniel Drake

at 12:56, Jian-Hong Pan <jian-hong@endlessm.com> wrote:

> 2018-09-12 11:42 GMT+08:00 Kai-Heng Feng <kai.heng.feng@canonical.com>:
>> Hi Jian-Hong,
>>
>> There's a Dell machine with RTL8106e stops to work after S3 since the  
>> commit
>> introduced.
>> So I am wondering if it's possible to revert the commit and use
>> DMI/subsystem id based quirk table?
>>
>> It's because of commit bc976233a872 ("genirq/msi, x86/vector: Prevent
>> reservation mode for non maskable MSI") cleared the reservation mode,  
>> and I
>> can see this after S3:
>>
>> [   94.872838] do_IRQ: 3.33 No irq handler for vector
>>
>> If the device uses MSI-X instead of MSI, the issue doesn't happen  
>> because of
>> reservation mode.
>
> Interesting!  Opposite symptom!
> Could you help try the patch
> https://marc.info/?l=linux-pci&m=153629858601668&w=4 with and without
> reverting the commit?

Same issue after applying this patch. MSI-X works, MSI doesn't work.

>
> If the patch does not work, another suggestion: You can try falling
> back to only PCI_IRQ_LEGACY.

This device is capable of using MSI-X, so I don't think falling back to
legacy interrupts is a good idea.
Instead, using a quirk table should be more appropriate.

Kai-Heng
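
When repeating the "MSI-X works, MSI doesn't work" comparison, it can help to confirm which mode the device actually ended up in. The sysfs directory msi_irqs under a PCI device lists one file per allocated IRQ, each containing "msi" or "msix". A minimal sketch that reads it follows; the device address is a placeholder and irq_modes() is a made-up helper name.

```python
import os


def irq_modes(bdf, sysfs_root="/sys/bus/pci/devices"):
    """Map IRQ number -> "msi" or "msix" for one PCI device."""
    msi_dir = os.path.join(sysfs_root, bdf, "msi_irqs")
    modes = {}
    if not os.path.isdir(msi_dir):
        # No MSI/MSI-X vectors allocated, or no such device.
        return modes
    for name in os.listdir(msi_dir):
        with open(os.path.join(msi_dir, name)) as f:
            modes[int(name)] = f.read().strip()
    return modes


# "0000:02:00.0" is a placeholder; substitute the NIC's address from lspci.
print(irq_modes("0000:02:00.0"))
```

An empty result means the device has no MSI or MSI-X vectors allocated, which is also what a device on the legacy interrupt line would show.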

>
> Regards,
> Jian-Hong Pan
>
>> Hi Thomas,
>>
>> Is it something should be handled by x86 BIOS? Because I don't see this
>> issue when I use Suspend-to-Idle, which doesn't use BIOS to do suspend.
>>
>> Kai-Heng




* Re: Regression caused by commit 7bb05b85bc2d ("r8169: don't use MSI-X on RTL8106e")
  2018-09-12  3:42 Regression caused by commit 7bb05b85bc2d ("r8169: don't use MSI-X on RTL8106e") Kai-Heng Feng
  2018-09-12  4:56 ` Jian-Hong Pan
@ 2018-09-12  6:32 ` Thomas Gleixner
  2018-09-12  8:19   ` Kai-Heng Feng
  1 sibling, 1 reply; 8+ messages in thread
From: Thomas Gleixner @ 2018-09-12  6:32 UTC (permalink / raw)
  To: Kai-Heng Feng; +Cc: jian-hong, Linux Netdev List, Linux Kernel Mailing List

On Wed, 12 Sep 2018, Kai-Heng Feng wrote:

> There's a Dell machine with RTL8106e stops to work after S3 since the
> commit introduced. So I am wondering if it's possible to revert the
> commit and use DMI/subsystem id based quirk table?

Probably.

> It's because of commit bc976233a872 ("genirq/msi, x86/vector: Prevent
> reservation mode for non maskable MSI") cleared the reservation mode, and I
> can see this after S3:
> 
> [   94.872838] do_IRQ: 3.33 No irq handler for vector

It's not because of that commit, really. There is an interrupt sent after
resume to the wrong vector for whatever reason. It seems the MSI vector
cannot be masked in the device, but the driver should quiesce the device to
a point where it does not send interrupts.

> If the device uses MSI-X instead of MSI, the issue doesn't happen because of
> reservation mode.

Reservation mode has absolutely nothing to do with that. What prevents the
issue is the fact that MSI-X can be masked by the IRQ core.

> Is it something should be handled by x86 BIOS? Because I don't see this issue
> when I use Suspend-to-Idle, which doesn't use BIOS to do suspend.

Suspend to idle works completely differently and I don't see the BIOS at
fault here. It's more an issue of MSI not being maskable on that device,
which can't be fixed in BIOS, or it's some half quiesced state which is
used when suspending, and that's a pure driver issue.

Thanks,

	tglx
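
The masking argument in this message can be illustrated with a toy model. This is only a sketch of the reasoning, not of how the kernel or the hardware is structured, and all class names are invented. A source that can be masked holds its interrupt as pending, while a source that cannot be masked delivers it to whatever vector it targets, even if no handler is installed there.

```python
class ToyVectorTable:
    """Toy stand-in for a CPU's vector table."""

    def __init__(self):
        self.handlers = {}  # vector number -> handler name

    def deliver(self, cpu, vector):
        handler = self.handlers.get(vector)
        if handler is None:
            return f"do_IRQ: {cpu}.{vector} No irq handler for vector"
        return f"handled by {handler}"


class ToyDevice:
    def __init__(self, vector, maskable):
        self.vector = vector      # vector the device currently targets
        self.maskable = maskable  # True models MSI-X, False this device's MSI
        self.masked = False
        self.pending = False

    def mask(self):
        # Masking only takes effect if the hardware supports it.
        self.masked = self.maskable

    def raise_irq(self, table, cpu):
        if self.masked:
            self.pending = True   # held back until the source is unmasked
            return "pending"
        return table.deliver(cpu, self.vector)


table = ToyVectorTable()          # no handler installed for vector 33
for maskable in (True, False):
    dev = ToyDevice(vector=33, maskable=maskable)
    dev.mask()                    # the IRQ core tries to mask the source
    print(maskable, dev.raise_irq(table, cpu=3))
```

In the model, only the non-maskable source produces the "No irq handler" message, which matches the explanation that maskability, not reservation mode, is what hides the problem with MSI-X.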


* Re: Regression caused by commit 7bb05b85bc2d ("r8169: don't use MSI-X on RTL8106e")
  2018-09-12  6:32 ` Thomas Gleixner
@ 2018-09-12  8:19   ` Kai-Heng Feng
  2018-09-13  5:50     ` Jian-Hong Pan
  0 siblings, 1 reply; 8+ messages in thread
From: Kai-Heng Feng @ 2018-09-12  8:19 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: jian-hong, Linux Netdev List, Linux Kernel Mailing List

at 14:32, Thomas Gleixner <tglx@linutronix.de> wrote:

> On Wed, 12 Sep 2018, Kai-Heng Feng wrote:
>
>> There's a Dell machine with RTL8106e stops to work after S3 since the
>> commit introduced. So I am wondering if it's possible to revert the
>> commit and use DMI/subsystem id based quirk table?
>
> Probably.

Hopefully Jian-Hong can cook up a quirk table for the issue.

>
>> It's because of commit bc976233a872 ("genirq/msi, x86/vector: Prevent
>> reservation mode for non maskable MSI") cleared the reservation mode,  
>> and I
>> can see this after S3:
>>
>> [   94.872838] do_IRQ: 3.33 No irq handler for vector
>
> It's not because of that commit, really. There is a interrupt sent after
> resume to the wrong vector for whatever reason. The MSI vector cannot be
> masked it seems in the device, but the driver should quiescen the device to
> a point where it does not send interrupts.

Understood.

>
>> If the device uses MSI-X instead of MSI, the issue doesn't happen  
>> because of
>> reservation mode.
>
> Reservation mode has absolutely nothing to do with that. What prevents the
> issue is the fact that MSI-X can be masked by the IRQ core.

So in this case I think keeping the device on MSI-X is the better route;
it's MSI-X capable anyway.

>
>> Is it something should be handled by x86 BIOS? Because I don't see this  
>> issue
>> when I use Suspend-to-Idle, which doesn't use BIOS to do suspend.
>
> Suspend to idle works completely different and I don't see the BIOS at
> fault here. it's more an issue of MSI not being maskable on that device,
> which can't be fixed in BIOS or it's some half quiescened state which is
> used when suspending and that's a pure driver issue.

Understood.
Thanks for all the info!

Kai-Heng
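
The quirk-table idea raised in this message could look roughly like the model below: keep MSI-X enabled by default and drop it only for machines known to need that. This is a userspace sketch of the lookup logic, not driver code. 0x1043 is the ASUSTeK PCI vendor ID and 0x1028 is Dell's, but the subsystem device IDs and the helper name are made-up placeholders.

```python
# Flag values as defined in include/linux/pci.h.
PCI_IRQ_LEGACY = 1 << 0
PCI_IRQ_MSI = 1 << 1
PCI_IRQ_MSIX = 1 << 2

# (subsystem vendor ID, subsystem device ID) pairs that should avoid MSI-X.
# The device ID 0xFFFF is a placeholder, not a real machine.
NO_MSIX_QUIRKS = {
    (0x1043, 0xFFFF),
}


def irq_flags_for(subsys_vendor, subsys_device):
    """Return the allowed interrupt types for a given subsystem ID pair."""
    flags = PCI_IRQ_LEGACY | PCI_IRQ_MSI | PCI_IRQ_MSIX
    if (subsys_vendor, subsys_device) in NO_MSIX_QUIRKS:
        flags &= ~PCI_IRQ_MSIX
    return flags


print(irq_flags_for(0x1043, 0xFFFF))  # quirked: MSI-X removed
print(irq_flags_for(0x1028, 0x0001))  # placeholder Dell ID: all types allowed
```

A table keyed this way would leave the Dell machine on MSI-X while still covering the affected ASUS laptops.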

>
> Thanks,
>
> 	tglx




* Re: Regression caused by commit 7bb05b85bc2d ("r8169: don't use MSI-X on RTL8106e")
  2018-09-12  8:19   ` Kai-Heng Feng
@ 2018-09-13  5:50     ` Jian-Hong Pan
  2018-09-21 17:08       ` Andy Shevchenko
  0 siblings, 1 reply; 8+ messages in thread
From: Jian-Hong Pan @ 2018-09-13  5:50 UTC (permalink / raw)
  To: Kai-Heng Feng, Heiner Kallweit
  Cc: Thomas Gleixner, Linux Netdev List, Linux Kernel Mailing List,
	Linux Upstreaming Team, Daniel Drake, Steve Dodd

2018-09-12 16:19 GMT+08:00 Kai-Heng Feng <kai.heng.feng@canonical.com>:
> at 14:32, Thomas Gleixner <tglx@linutronix.de> wrote:
>
>> On Wed, 12 Sep 2018, Kai-Heng Feng wrote:
>>
>>> There's a Dell machine with RTL8106e stops to work after S3 since the
>>> commit introduced. So I am wondering if it's possible to revert the
>>> commit and use DMI/subsystem id based quirk table?
>>
>>
>> Probably.
>
>
> Hopefully Jian-Hong can cook up a quirk table for the issue.

The r8169 module gets nothing from the PCI BAR after the system resumes,
which makes MSI-X fail on some ASUS laptops equipped with the RTL8106e chip.
https://www.spinics.net/lists/linux-pci/msg75598.html

Actually, I am waiting for the patch "PCI: Reprogram bridge prefetch
registers on resume" to be merged.
https://marc.info/?l=linux-pm&m=153680987814299&w=2

It fixes the case where drivers get nothing from the PCI BAR after the
system resumes.

After that, I can remove the fallback code for the RTL8106e.

Heiner, any comment?

Regards,
Jian-Hong Pan

>>
>>> It's because of commit bc976233a872 ("genirq/msi, x86/vector: Prevent
>>> reservation mode for non maskable MSI") cleared the reservation mode, and
>>> I
>>> can see this after S3:
>>>
>>> [   94.872838] do_IRQ: 3.33 No irq handler for vector
>>
>>
>> It's not because of that commit, really. There is a interrupt sent after
>> resume to the wrong vector for whatever reason. The MSI vector cannot be
>> masked it seems in the device, but the driver should quiescen the device
>> to
>> a point where it does not send interrupts.
>
>
> Understood.
>
>>
>>> If the device uses MSI-X instead of MSI, the issue doesn't happen because
>>> of
>>> reservation mode.
>>
>>
>> Reservation mode has absolutely nothing to do with that. What prevents the
>> issue is the fact that MSI-X can be masked by the IRQ core.
>
>
> So in this case I think keep the device using MSI-X is a better route, it's
> MSI-X capable anyway.
>
>>
>>> Is it something should be handled by x86 BIOS? Because I don't see this
>>> issue
>>> when I use Suspend-to-Idle, which doesn't use BIOS to do suspend.
>>
>>
>> Suspend to idle works completely different and I don't see the BIOS at
>> fault here. it's more an issue of MSI not being maskable on that device,
>> which can't be fixed in BIOS or it's some half quiescened state which is
>> used when suspending and that's a pure driver issue.
>
>
> Understood.
> Thanks for all the info!
>
> Kai-Heng
>
>>
>> Thanks,
>>
>>         tglx
>
>
>


* Re: Regression caused by commit 7bb05b85bc2d ("r8169: don't use MSI-X on RTL8106e")
  2018-09-13  5:50     ` Jian-Hong Pan
@ 2018-09-21 17:08       ` Andy Shevchenko
  2018-09-27  9:03         ` Jian-Hong Pan
  0 siblings, 1 reply; 8+ messages in thread
From: Andy Shevchenko @ 2018-09-21 17:08 UTC (permalink / raw)
  To: Jian-Hong Pan
  Cc: Kai-Heng Feng, Heiner Kallweit, Thomas Gleixner, netdev,
	Linux Kernel Mailing List, Linux Upstreaming Team, Daniel Drake,
	steved424

On Thu, Sep 13, 2018 at 8:53 AM Jian-Hong Pan <jian-hong@endlessm.com> wrote:
>
> 2018-09-12 16:19 GMT+08:00 Kai-Heng Feng <kai.heng.feng@canonical.com>:
> > at 14:32, Thomas Gleixner <tglx@linutronix.de> wrote:
> >
> >> On Wed, 12 Sep 2018, Kai-Heng Feng wrote:
> >>
> >>> There's a Dell machine with RTL8106e stops to work after S3 since the
> >>> commit introduced. So I am wondering if it's possible to revert the
> >>> commit and use DMI/subsystem id based quirk table?
> >>
> >>
> >> Probably.

Have you seen this thread:
https://patchwork.ozlabs.org/cover/968924/

and this one:
https://patchwork.kernel.org/patch/10583229/

?

> >
> >
> > Hopefully Jian-Hong can cook up a quirk table for the issue.
>
> Module r8169 gets nothing in the PCI BAR after system resumes which
> makes MSI-X fail on some ASUS laptops equipped with RTL8106e chip.
> https://www.spinics.net/lists/linux-pci/msg75598.html
>
> Actually, I am waiting for the patch "PCI: Reprogram bridge prefetch
> registers on resume" being merged.
> https://marc.info/?l=linux-pm&m=153680987814299&w=2
>
> It resolves the drivers which get nothing in PCI BAR after system resumes.
>
> After that, I can remove the falling back code of RTL8106e.
>
> Heiner, any comment?
>
> Regards,
> Jian-Hong Pan
>
> >>
> >>> It's because of commit bc976233a872 ("genirq/msi, x86/vector: Prevent
> >>> reservation mode for non maskable MSI") cleared the reservation mode, and
> >>> I
> >>> can see this after S3:
> >>>
> >>> [   94.872838] do_IRQ: 3.33 No irq handler for vector
> >>
> >>
> >> It's not because of that commit, really. There is a interrupt sent after
> >> resume to the wrong vector for whatever reason. The MSI vector cannot be
> >> masked it seems in the device, but the driver should quiescen the device
> >> to
> >> a point where it does not send interrupts.
> >
> >
> > Understood.
> >
> >>
> >>> If the device uses MSI-X instead of MSI, the issue doesn't happen because
> >>> of
> >>> reservation mode.
> >>
> >>
> >> Reservation mode has absolutely nothing to do with that. What prevents the
> >> issue is the fact that MSI-X can be masked by the IRQ core.
> >
> >
> > So in this case I think keep the device using MSI-X is a better route, it's
> > MSI-X capable anyway.
> >
> >>
> >>> Is it something should be handled by x86 BIOS? Because I don't see this
> >>> issue
> >>> when I use Suspend-to-Idle, which doesn't use BIOS to do suspend.
> >>
> >>
> >> Suspend to idle works completely different and I don't see the BIOS at
> >> fault here. it's more an issue of MSI not being maskable on that device,
> >> which can't be fixed in BIOS or it's some half quiescened state which is
> >> used when suspending and that's a pure driver issue.
> >
> >
> > Understood.
> > Thanks for all the info!
> >
> > Kai-Heng
> >
> >>
> >> Thanks,
> >>
> >>         tglx
> >
> >
> >



-- 
With Best Regards,
Andy Shevchenko


* Re: Regression caused by commit 7bb05b85bc2d ("r8169: don't use MSI-X on RTL8106e")
  2018-09-21 17:08       ` Andy Shevchenko
@ 2018-09-27  9:03         ` Jian-Hong Pan
  0 siblings, 0 replies; 8+ messages in thread
From: Jian-Hong Pan @ 2018-09-27  9:03 UTC (permalink / raw)
  To: andy.shevchenko
  Cc: Kai-Heng Feng, Heiner Kallweit, Thomas Gleixner,
	Linux Netdev List, Linux Kernel, Linux Upstreaming Team,
	Daniel Drake, Steve Dodd

On Sat, Sep 22, 2018 at 1:08 AM Andy Shevchenko <andy.shevchenko@gmail.com> wrote:
>
> On Thu, Sep 13, 2018 at 8:53 AM Jian-Hong Pan <jian-hong@endlessm.com> wrote:
> >
> > 2018-09-12 16:19 GMT+08:00 Kai-Heng Feng <kai.heng.feng@canonical.com>:
> > > at 14:32, Thomas Gleixner <tglx@linutronix.de> wrote:
> > >
> > >> On Wed, 12 Sep 2018, Kai-Heng Feng wrote:
> > >>
> > >>> There's a Dell machine with RTL8106e stops to work after S3 since the
> > >>> commit introduced. So I am wondering if it's possible to revert the
> > >>> commit and use DMI/subsystem id based quirk table?
> > >>
> > >>
> > >> Probably.
>
> Have you seen this thread:
> https://patchwork.ozlabs.org/cover/968924/
>
> and this one:
> https://patchwork.kernel.org/patch/10583229/

Yes, that is the one. It is also discussed in bugzilla:
https://bugzilla.kernel.org/show_bug.cgi?id=201181
The revert patch has now been submitted: https://lkml.org/lkml/2018/9/27/224
Thanks for the information anyway. :)

Regards,
Jian-Hong Pan

> ?
>
> > >
> > >
> > > Hopefully Jian-Hong can cook up a quirk table for the issue.
> >
> > Module r8169 gets nothing in the PCI BAR after system resumes which
> > makes MSI-X fail on some ASUS laptops equipped with RTL8106e chip.
> > https://www.spinics.net/lists/linux-pci/msg75598.html
> >
> > Actually, I am waiting for the patch "PCI: Reprogram bridge prefetch
> > registers on resume" being merged.
> > https://marc.info/?l=linux-pm&m=153680987814299&w=2
> >
> > It resolves the drivers which get nothing in PCI BAR after system resumes.
> >
> > After that, I can remove the falling back code of RTL8106e.
> >
> > Heiner, any comment?
> >
> > Regards,
> > Jian-Hong Pan
> >
> > >>
> > >>> It's because of commit bc976233a872 ("genirq/msi, x86/vector: Prevent
> > >>> reservation mode for non maskable MSI") cleared the reservation mode, and
> > >>> I
> > >>> can see this after S3:
> > >>>
> > >>> [   94.872838] do_IRQ: 3.33 No irq handler for vector
> > >>
> > >>
> > >> It's not because of that commit, really. There is a interrupt sent after
> > >> resume to the wrong vector for whatever reason. The MSI vector cannot be
> > >> masked it seems in the device, but the driver should quiescen the device
> > >> to
> > >> a point where it does not send interrupts.
> > >
> > >
> > > Understood.
> > >
> > >>
> > >>> If the device uses MSI-X instead of MSI, the issue doesn't happen because
> > >>> of
> > >>> reservation mode.
> > >>
> > >>
> > >> Reservation mode has absolutely nothing to do with that. What prevents the
> > >> issue is the fact that MSI-X can be masked by the IRQ core.
> > >
> > >
> > > So in this case I think keep the device using MSI-X is a better route, it's
> > > MSI-X capable anyway.
> > >
> > >>
> > >>> Is it something should be handled by x86 BIOS? Because I don't see this
> > >>> issue
> > >>> when I use Suspend-to-Idle, which doesn't use BIOS to do suspend.
> > >>
> > >>
> > >> Suspend to idle works completely different and I don't see the BIOS at
> > >> fault here. it's more an issue of MSI not being maskable on that device,
> > >> which can't be fixed in BIOS or it's some half quiescened state which is
> > >> used when suspending and that's a pure driver issue.
> > >
> > >
> > > Understood.
> > > Thanks for all the info!
> > >
> > > Kai-Heng
> > >
> > >>
> > >> Thanks,
> > >>
> > >>         tglx
> > >
> > >
> > >
>
>
>
> --
> With Best Regards,
> Andy Shevchenko


end of thread, other threads:[~2018-09-27  9:04 UTC | newest]

Thread overview: 8+ messages
2018-09-12  3:42 Regression caused by commit 7bb05b85bc2d ("r8169: don't use MSI-X on RTL8106e") Kai-Heng Feng
2018-09-12  4:56 ` Jian-Hong Pan
2018-09-12  5:57   ` Kai-Heng Feng
2018-09-12  6:32 ` Thomas Gleixner
2018-09-12  8:19   ` Kai-Heng Feng
2018-09-13  5:50     ` Jian-Hong Pan
2018-09-21 17:08       ` Andy Shevchenko
2018-09-27  9:03         ` Jian-Hong Pan
