All of lore.kernel.org
 help / color / mirror / Atom feed
* Re: rc7 + tip/master suspend fun
       [not found] <20140728175326.GA7100@pd.tnic>
@ 2014-07-28 20:59 ` Rafael J. Wysocki
  2014-07-28 21:02   ` Thomas Gleixner
  0 siblings, 1 reply; 24+ messages in thread
From: Rafael J. Wysocki @ 2014-07-28 20:59 UTC (permalink / raw)
  To: Borislav Petkov; +Cc: Thomas Gleixner, x86-ml, lkml, Peter Zijlstra

On Monday, July 28, 2014 07:53:26 PM Borislav Petkov wrote:
> 
> --Nq2Wo0NMKNjxTN9z
> Content-Type: text/plain; charset=utf-8
> Content-Disposition: inline
> 
> Hi guys,
> 
> so during my rc7 + tip/master testing today, I've hit the second WARN
> in remove_proc_entry, see attached pic. This happens right before I
> suspend to disk and I can't successfully suspend because sda gets choked
> afterwards and floods dmesg with something-has-timeout messages which I
> can't read - whizzing by too fast.
> 
> And this seems consistent with the warning because sda1 is behind AHCI
> for which the warning is for.
> 
> Oh, and I'm saying it is tip/master-related because plain rc7 is fine.
> 
> Other strange things I was able to observe in dmesg between plain rc7
> and rc7+tip/master are that something in the interrupts allocation is
> different now (plenty of movement in that area recently) leading to the
> following diffs between dmesg:
> 
> --- 16-rc7	2014-07-28 19:43:13.000000000 +0200
> +++ 16-rc7+	2014-07-28 11:11:24.000000000 +0200
> 
> @@ -137,11 +138,9 @@ IOAPIC[1]: apic_id 10, version 33, addre
>  ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
>  ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 low level)
>  ACPI: IRQ0 used by override.
> -ACPI: IRQ2 used by override.
>  ACPI: IRQ9 used by override.
>  Using ACPI (MADT) for SMP configuration information
>  smpboot: Allowing 8 CPUs, 0 hotplug CPUs
> -nr_irqs_gsi: 72
>  PM: Registered nosave memory: [mem 0x0009e000-0x0009efff]
>  PM: Registered nosave memory: [mem 0x0009f000-0x0009ffff]
>  PM: Registered nosave memory: [mem 0x000a0000-0x000dffff]
> 
> that gsi thing went away, apparently.
> 
> -NR_IRQS:4352 nr_irqs:1288 16
> +NR_IRQS:4352 nr_irqs:1032 0
> 
> fun.
> 
> And then something lead to different interrupts being assigned:
> 
> -pci 0000:00:00.2: irq 72 for MSI/MSI-X
> +pci 0000:00:00.2: irq 27 for MSI/MSI-X
> 
> -ahci 0000:04:00.0: irq 73 for MSI/MSI-X
> +ahci 0000:04:00.0: irq 29 for MSI/MSI-X
> 
> and lookie lookie, it is ahci which changes IRQ lines.
> 
> Suggestions and ideas how to narrow down are appreciated.

Please try to revert this patch from Peter:

http://marc.info/?l=linux-kernel&m=140620918218199

Rafael


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: rc7 + tip/master suspend fun
  2014-07-28 20:59 ` rc7 + tip/master suspend fun Rafael J. Wysocki
@ 2014-07-28 21:02   ` Thomas Gleixner
  2014-07-28 22:02     ` Rafael J. Wysocki
                       ` (2 more replies)
  0 siblings, 3 replies; 24+ messages in thread
From: Thomas Gleixner @ 2014-07-28 21:02 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Borislav Petkov, x86-ml, lkml, Peter Zijlstra, Jiang Liu

On Mon, 28 Jul 2014, Rafael J. Wysocki wrote:
> On Monday, July 28, 2014 07:53:26 PM Borislav Petkov wrote:
> > 
> > --Nq2Wo0NMKNjxTN9z
> > Content-Type: text/plain; charset=utf-8
> > Content-Disposition: inline
> > 
> > Hi guys,
> > 
> > so during my rc7 + tip/master testing today, I've hit the second WARN
> > in remove_proc_entry, see attached pic. This happens right before I
> > suspend to disk and I can't successfully suspend because sda gets choked
> > afterwards and floods dmesg with something-has-timeout messages which I
> > can't read - whizzing by too fast.
> > 
> > And this seems consistent with the warning because sda1 is behind AHCI
> > for which the warning is for.
> > 
> > Oh, and I'm saying it is tip/master-related because plain rc7 is fine.
> > 
> > Other strange things I was able to observe in dmesg between plain rc7
> > and rc7+tip/master are that something in the interrupts allocation is
> > different now (plenty of movement in that area recently) leading to the
> > following diffs between dmesg:
> > 
> > --- 16-rc7	2014-07-28 19:43:13.000000000 +0200
> > +++ 16-rc7+	2014-07-28 11:11:24.000000000 +0200
> > 
> > @@ -137,11 +138,9 @@ IOAPIC[1]: apic_id 10, version 33, addre
> >  ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
> >  ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 low level)
> >  ACPI: IRQ0 used by override.
> > -ACPI: IRQ2 used by override.
> >  ACPI: IRQ9 used by override.
> >  Using ACPI (MADT) for SMP configuration information
> >  smpboot: Allowing 8 CPUs, 0 hotplug CPUs
> > -nr_irqs_gsi: 72
> >  PM: Registered nosave memory: [mem 0x0009e000-0x0009efff]
> >  PM: Registered nosave memory: [mem 0x0009f000-0x0009ffff]
> >  PM: Registered nosave memory: [mem 0x000a0000-0x000dffff]
> > 
> > that gsi thing went away, apparently.
> > 
> > -NR_IRQS:4352 nr_irqs:1288 16
> > +NR_IRQS:4352 nr_irqs:1032 0
> > 
> > fun.

Right, that's from Jiangs ioapic rework.
 
> > And then something lead to different interrupts being assigned:
> > 
> > -pci 0000:00:00.2: irq 72 for MSI/MSI-X
> > +pci 0000:00:00.2: irq 27 for MSI/MSI-X
> > 
> > -ahci 0000:04:00.0: irq 73 for MSI/MSI-X
> > +ahci 0000:04:00.0: irq 29 for MSI/MSI-X

Borislav, can you please upload a full bootlog with "apic=verbose" on
the commandline?

Jiang, can you please have a look?
 
> > and lookie lookie, it is ahci which changes IRQ lines.
> > 
> > Suggestions and ideas how to narrow down are appreciated.
> 
> Please try to revert this patch from Peter:
> 
> http://marc.info/?l=linux-kernel&m=140620918218199

How is that related to a leaked interrupt handler? That patch gives a
splat at request_irq() time. Completely unrelated to this problem.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: rc7 + tip/master suspend fun
  2014-07-28 21:02   ` Thomas Gleixner
@ 2014-07-28 22:02     ` Rafael J. Wysocki
  2014-07-29  2:17     ` Jiang Liu
  2014-07-30  9:37     ` [PATCH] x86, irq: Keep IRQ assignment for PCI devices during suspend/hibernation Jiang Liu
  2 siblings, 0 replies; 24+ messages in thread
From: Rafael J. Wysocki @ 2014-07-28 22:02 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: Borislav Petkov, x86-ml, lkml, Peter Zijlstra, Jiang Liu

On Monday, July 28, 2014 11:02:09 PM Thomas Gleixner wrote:
> On Mon, 28 Jul 2014, Rafael J. Wysocki wrote:
> > On Monday, July 28, 2014 07:53:26 PM Borislav Petkov wrote:
> > > 
> > > --Nq2Wo0NMKNjxTN9z
> > > Content-Type: text/plain; charset=utf-8
> > > Content-Disposition: inline
> > > 
> > > Hi guys,
> > > 
> > > so during my rc7 + tip/master testing today, I've hit the second WARN
> > > in remove_proc_entry, see attached pic. This happens right before I
> > > suspend to disk and I can't successfully suspend because sda gets choked
> > > afterwards and floods dmesg with something-has-timeout messages which I
> > > can't read - whizzing by too fast.
> > > 
> > > And this seems consistent with the warning because sda1 is behind AHCI
> > > for which the warning is for.
> > > 
> > > Oh, and I'm saying it is tip/master-related because plain rc7 is fine.
> > > 
> > > Other strange things I was able to observe in dmesg between plain rc7
> > > and rc7+tip/master are that something in the interrupts allocation is
> > > different now (plenty of movement in that area recently) leading to the
> > > following diffs between dmesg:
> > > 
> > > --- 16-rc7	2014-07-28 19:43:13.000000000 +0200
> > > +++ 16-rc7+	2014-07-28 11:11:24.000000000 +0200
> > > 
> > > @@ -137,11 +138,9 @@ IOAPIC[1]: apic_id 10, version 33, addre
> > >  ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
> > >  ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 low level)
> > >  ACPI: IRQ0 used by override.
> > > -ACPI: IRQ2 used by override.
> > >  ACPI: IRQ9 used by override.
> > >  Using ACPI (MADT) for SMP configuration information
> > >  smpboot: Allowing 8 CPUs, 0 hotplug CPUs
> > > -nr_irqs_gsi: 72
> > >  PM: Registered nosave memory: [mem 0x0009e000-0x0009efff]
> > >  PM: Registered nosave memory: [mem 0x0009f000-0x0009ffff]
> > >  PM: Registered nosave memory: [mem 0x000a0000-0x000dffff]
> > > 
> > > that gsi thing went away, apparently.
> > > 
> > > -NR_IRQS:4352 nr_irqs:1288 16
> > > +NR_IRQS:4352 nr_irqs:1032 0
> > > 
> > > fun.
> 
> Right, that's from Jiangs ioapic rework.
>  
> > > And then something lead to different interrupts being assigned:
> > > 
> > > -pci 0000:00:00.2: irq 72 for MSI/MSI-X
> > > +pci 0000:00:00.2: irq 27 for MSI/MSI-X
> > > 
> > > -ahci 0000:04:00.0: irq 73 for MSI/MSI-X
> > > +ahci 0000:04:00.0: irq 29 for MSI/MSI-X
> 
> Borislav, can you please upload a full bootlog with "apic=verbose" on
> the commandline?
> 
> Jiang, can you please have a look?
>  
> > > and lookie lookie, it is ahci which changes IRQ lines.
> > > 
> > > Suggestions and ideas how to narrow down are appreciated.
> > 
> > Please try to revert this patch from Peter:
> > 
> > http://marc.info/?l=linux-kernel&m=140620918218199
> 
> How is that related to a leaked interrupt handler? That patch gives a
> splat at request_irq() time. Completely unrelated to this problem.

Right, sorry for the noise.

Rafael


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: rc7 + tip/master suspend fun
  2014-07-28 21:02   ` Thomas Gleixner
  2014-07-28 22:02     ` Rafael J. Wysocki
@ 2014-07-29  2:17     ` Jiang Liu
  2014-07-30  9:37     ` [PATCH] x86, irq: Keep IRQ assignment for PCI devices during suspend/hibernation Jiang Liu
  2 siblings, 0 replies; 24+ messages in thread
From: Jiang Liu @ 2014-07-29  2:17 UTC (permalink / raw)
  To: Thomas Gleixner, Rafael J. Wysocki
  Cc: Borislav Petkov, x86-ml, lkml, Peter Zijlstra

Hi Borislav, Thomas and Rafael,
	Thanks for testing and reporting this issue.
	I guess the issue is that, we shouldn't release IOAPIC
pin reference count when suspend/hibernate system. I'm reading
PM code and working on a solution for this issue. The basic
idea is to register an PM notifier to stop releasing IOAPIC
pin count during system suspend/hibernate/resume.
	Hi Rafael, is that the right direction to go?

Regards!
Gerry

On 2014/7/29 5:02, Thomas Gleixner wrote:
> On Mon, 28 Jul 2014, Rafael J. Wysocki wrote:
>> On Monday, July 28, 2014 07:53:26 PM Borislav Petkov wrote:
>>>
>>> --Nq2Wo0NMKNjxTN9z
>>> Content-Type: text/plain; charset=utf-8
>>> Content-Disposition: inline
>>>
>>> Hi guys,
>>>
>>> so during my rc7 + tip/master testing today, I've hit the second WARN
>>> in remove_proc_entry, see attached pic. This happens right before I
>>> suspend to disk and I can't successfully suspend because sda gets choked
>>> afterwards and floods dmesg with something-has-timeout messages which I
>>> can't read - whizzing by too fast.
>>>
>>> And this seems consistent with the warning because sda1 is behind AHCI
>>> for which the warning is for.
>>>
>>> Oh, and I'm saying it is tip/master-related because plain rc7 is fine.
>>>
>>> Other strange things I was able to observe in dmesg between plain rc7
>>> and rc7+tip/master are that something in the interrupts allocation is
>>> different now (plenty of movement in that area recently) leading to the
>>> following diffs between dmesg:
>>>
>>> --- 16-rc7	2014-07-28 19:43:13.000000000 +0200
>>> +++ 16-rc7+	2014-07-28 11:11:24.000000000 +0200
>>>
>>> @@ -137,11 +138,9 @@ IOAPIC[1]: apic_id 10, version 33, addre
>>>  ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
>>>  ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 low level)
>>>  ACPI: IRQ0 used by override.
>>> -ACPI: IRQ2 used by override.
>>>  ACPI: IRQ9 used by override.
>>>  Using ACPI (MADT) for SMP configuration information
>>>  smpboot: Allowing 8 CPUs, 0 hotplug CPUs
>>> -nr_irqs_gsi: 72
>>>  PM: Registered nosave memory: [mem 0x0009e000-0x0009efff]
>>>  PM: Registered nosave memory: [mem 0x0009f000-0x0009ffff]
>>>  PM: Registered nosave memory: [mem 0x000a0000-0x000dffff]
>>>
>>> that gsi thing went away, apparently.
>>>
>>> -NR_IRQS:4352 nr_irqs:1288 16
>>> +NR_IRQS:4352 nr_irqs:1032 0
>>>
>>> fun.
> 
> Right, that's from Jiangs ioapic rework.
>  
>>> And then something lead to different interrupts being assigned:
>>>
>>> -pci 0000:00:00.2: irq 72 for MSI/MSI-X
>>> +pci 0000:00:00.2: irq 27 for MSI/MSI-X
>>>
>>> -ahci 0000:04:00.0: irq 73 for MSI/MSI-X
>>> +ahci 0000:04:00.0: irq 29 for MSI/MSI-X
> 
> Borislav, can you please upload a full bootlog with "apic=verbose" on
> the commandline?
> 
> Jiang, can you please have a look?
>  
>>> and lookie lookie, it is ahci which changes IRQ lines.
>>>
>>> Suggestions and ideas how to narrow down are appreciated.
>>
>> Please try to revert this patch from Peter:
>>
>> http://marc.info/?l=linux-kernel&m=140620918218199
> 
> How is that related to a leaked interrupt handler? That patch gives a
> splat at request_irq() time. Completely unrelated to this problem.
> 
> Thanks,
> 
> 	tglx
> 

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH] x86, irq: Keep IRQ assignment for PCI devices during suspend/hibernation
  2014-07-28 21:02   ` Thomas Gleixner
  2014-07-28 22:02     ` Rafael J. Wysocki
  2014-07-29  2:17     ` Jiang Liu
@ 2014-07-30  9:37     ` Jiang Liu
  2014-07-30 17:58       ` Borislav Petkov
  2 siblings, 1 reply; 24+ messages in thread
From: Jiang Liu @ 2014-07-30  9:37 UTC (permalink / raw)
  To: Borislav Petkov, Rafael J . Wysocki, Bjorn Helgaas,
	Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86
  Cc: Jiang Liu, lkml, linux-pci

Function pci_disable_device() may be called for PCI devices during
suspend/hibernation, which in turn may release IRQ assigned to PCI
interrupt. Later when pci_enable_device() is called during resume,
a different IRQ may be assigned and thus break the driver.
So keep IRQ assignment for PCI devices during suspend/hibernation.
We check pci_dev.dev.power.is_prepared to detect that pci_disable_device()
and pci_enable_device() is called during suspend/hibernation.

Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
---
Hi Borislav,
	I can't reproduce the issue with my test machine. So could you
please help to test this patch?
Regards!
Gerry
---
 arch/x86/pci/common.c |    5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/x86/pci/common.c b/arch/x86/pci/common.c
index 059a76c29739..417de4c92e8f 100644
--- a/arch/x86/pci/common.c
+++ b/arch/x86/pci/common.c
@@ -662,14 +662,15 @@ int pcibios_enable_device(struct pci_dev *dev, int mask)
 	if ((err = pci_enable_resources(dev, mask)) < 0)
 		return err;
 
-	if (!pci_dev_msi_enabled(dev))
+	if (!pci_dev_msi_enabled(dev) && !dev->dev.pm.is_prepared)
 		return pcibios_enable_irq(dev);
 	return 0;
 }
 
 void pcibios_disable_device (struct pci_dev *dev)
 {
-	if (!pci_dev_msi_enabled(dev) && pcibios_disable_irq)
+	if (!pci_dev_msi_enabled(dev) && pcibios_disable_irq &&
+	    !dev->dev.pm.is_prepared)
 		pcibios_disable_irq(dev);
 }
 
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH] x86, irq: Keep IRQ assignment for PCI devices during suspend/hibernation
  2014-07-30  9:37     ` [PATCH] x86, irq: Keep IRQ assignment for PCI devices during suspend/hibernation Jiang Liu
@ 2014-07-30 17:58       ` Borislav Petkov
  2014-07-31  0:33         ` Jiang Liu
  0 siblings, 1 reply; 24+ messages in thread
From: Borislav Petkov @ 2014-07-30 17:58 UTC (permalink / raw)
  To: Jiang Liu
  Cc: Rafael J . Wysocki, Bjorn Helgaas, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, x86, lkml, linux-pci

On Wed, Jul 30, 2014 at 05:37:13PM +0800, Jiang Liu wrote:
> Function pci_disable_device() may be called for PCI devices during
> suspend/hibernation, which in turn may release IRQ assigned to PCI
> interrupt. Later when pci_enable_device() is called during resume,
> a different IRQ may be assigned and thus break the driver.
> So keep IRQ assignment for PCI devices during suspend/hibernation.
> We check pci_dev.dev.power.is_prepared to detect that pci_disable_device()
> and pci_enable_device() is called during suspend/hibernation.
> 
> Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
> ---
> Hi Borislav,
> 	I can't reproduce the issue with my test machine. So could you
> please help to test this patch?
> Regards!
> Gerry
> ---
>  arch/x86/pci/common.c |    5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/pci/common.c b/arch/x86/pci/common.c
> index 059a76c29739..417de4c92e8f 100644
> --- a/arch/x86/pci/common.c
> +++ b/arch/x86/pci/common.c
> @@ -662,14 +662,15 @@ int pcibios_enable_device(struct pci_dev *dev, int mask)
>  	if ((err = pci_enable_resources(dev, mask)) < 0)
>  		return err;
>  
> -	if (!pci_dev_msi_enabled(dev))
> +	if (!pci_dev_msi_enabled(dev) && !dev->dev.pm.is_prepared)
>  		return pcibios_enable_irq(dev);
>  	return 0;
>  }
>  
>  void pcibios_disable_device (struct pci_dev *dev)
>  {
> -	if (!pci_dev_msi_enabled(dev) && pcibios_disable_irq)
> +	if (!pci_dev_msi_enabled(dev) && pcibios_disable_irq &&
> +	    !dev->dev.pm.is_prepared)
>  		pcibios_disable_irq(dev);

Did you even build-test this patch?

arch/x86/pci/common.c: In function ‘pcibios_enable_device’:
arch/x86/pci/common.c:665:44: error: ‘struct device’ has no member named ‘pm’
  if (!pci_dev_msi_enabled(dev) && !dev->dev.pm.is_prepared)
                                            ^
arch/x86/pci/common.c: In function ‘pcibios_disable_device’:
arch/x86/pci/common.c:673:15: error: ‘struct device’ has no member named ‘pm’
      !dev->dev.pm.is_prepared)
               ^
make[1]: *** [arch/x86/pci/common.o] Error 1
make[1]: *** Waiting for unfinished jobs....
make: *** [arch/x86/pci] Error 2
make: *** Waiting for unfinished jobs....



My tree is rc7 + tip/master from Sunday.

-- 
Regards/Gruss,
    Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH] x86, irq: Keep IRQ assignment for PCI devices during suspend/hibernation
  2014-07-30 17:58       ` Borislav Petkov
@ 2014-07-31  0:33         ` Jiang Liu
  2014-07-31 10:39           ` Borislav Petkov
  0 siblings, 1 reply; 24+ messages in thread
From: Jiang Liu @ 2014-07-31  0:33 UTC (permalink / raw)
  To: Borislav Petkov, Rafael J . Wysocki, Thomas Gleixner,
	Bjorn Helgaas, Ingo Molnar, H. Peter Anvin, x86
  Cc: Jiang Liu, linux-pci, lkml

Function pci_disable_device() may be called for PCI devices during
suspend/hibernation, which in turn may release IRQ assigned for PCI
interrupt. Later when pci_enable_device() is called during resume,
a different IRQ may be assigned and thus break the driver.
So keep IRQ assignment for PCI devices during suspend/hibernation.
We check pci_dev.dev.power.is_prepared to detect that pci_disable_device()
and pci_enable_device() is called during suspend/hibernation.

Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
---
Hi Borilav,
	Sorry for the incovenience, something is wrong when synchronizing
the patch.  I have found that "pm" should be "power" when building and
testing this patch on my test machine. And I have also fixed it on my
development machine, but forgot to do "stg refresh" before "git format-patch",
so caused the inconvenience.
Regards!
Gerry
---
 arch/x86/pci/common.c |    5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/x86/pci/common.c b/arch/x86/pci/common.c
index 059a76c29739..9115e7ae564b 100644
--- a/arch/x86/pci/common.c
+++ b/arch/x86/pci/common.c
@@ -662,14 +662,15 @@ int pcibios_enable_device(struct pci_dev *dev, int mask)
 	if ((err = pci_enable_resources(dev, mask)) < 0)
 		return err;
 
-	if (!pci_dev_msi_enabled(dev))
+	if (!pci_dev_msi_enabled(dev) && !dev->dev.power.is_prepared)
 		return pcibios_enable_irq(dev);
 	return 0;
 }
 
 void pcibios_disable_device (struct pci_dev *dev)
 {
-	if (!pci_dev_msi_enabled(dev) && pcibios_disable_irq)
+	if (!pci_dev_msi_enabled(dev) && pcibios_disable_irq &&
+	    !dev->dev.power.is_prepared)
 		pcibios_disable_irq(dev);
 }
 
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH] x86, irq: Keep IRQ assignment for PCI devices during suspend/hibernation
  2014-07-31  0:33         ` Jiang Liu
@ 2014-07-31 10:39           ` Borislav Petkov
  2014-07-31 14:41               ` Jiang Liu
  2014-07-31 16:36             ` Jiang Liu
  0 siblings, 2 replies; 24+ messages in thread
From: Borislav Petkov @ 2014-07-31 10:39 UTC (permalink / raw)
  To: Jiang Liu
  Cc: Rafael J . Wysocki, Thomas Gleixner, Bjorn Helgaas, Ingo Molnar,
	H. Peter Anvin, x86, linux-pci, lkml

On Thu, Jul 31, 2014 at 08:33:26AM +0800, Jiang Liu wrote:
> Sorry for the incovenience, something is wrong when synchronizing
> the patch. I have found that "pm" should be "power" when building and
> testing this patch on my test machine. And I have also fixed it on
> my development machine, but forgot to do "stg refresh" before "git
> format-patch",

Yeah, use git only :-)

Anyway, still b0rked although I can get further and actually suspend.
Full splat below.

The IOMMU PFs point to

01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] RV635 [Radeon HD 3650/3750/4570/4580]

and

00:12.0 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI0 Controller

respectively.

and the IOPFs are all this OHCI controller. Which points at a timing
problem during suspend, I'm being told, when the device gets returned
to the BIOS but IOMMU is still on and filters accesses. Or something to
that effect. In any case, still b0rked.

-> start suspend to disk:

[   31.422901] hib.sh (3786): drop_caches: 3
[   31.427932] PM: Hibernation mode set to 'shutdown'
[   31.444522] PM: Syncing filesystems ... done.
[   31.452558] Freezing user space processes ... (elapsed 0.002 seconds) done.
[   31.462404] PM: Marking nosave pages: [mem 0x0009e000-0x000fffff]
[   31.468544] PM: Marking nosave pages: [mem 0xba9b8000-0xbca4dfff]
[   31.474868] PM: Marking nosave pages: [mem 0xbca4f000-0xbcc54fff]
[   31.480995] PM: Marking nosave pages: [mem 0xbd083000-0xbd7f3fff]
[   31.487136] PM: Marking nosave pages: [mem 0xbd800000-0x100000fff]
[   31.493891] PM: Basic memory bitmaps created
[   31.498178] PM: Preallocating image memory... done (allocated 143539 pages)
[   32.001145] PM: Allocated 574156 kbytes in 0.49 seconds (1171.74 MB/s)
[   32.007690] Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
[   33.220531] serial 00:06: activated
[   33.220959] [drm] PCIE gen 2 link speeds already enabled
[   33.228236] [drm] PCIE GART of 512M enabled (table at 0x0000000000040000).
[   33.228340] radeon 0000:01:00.0: WB enabled
[   33.228347] radeon 0000:01:00.0: fence driver on ring 0 use gpu addr 0x0000000020000c00 and cpu addr 0xffff88042939ac00
[   33.233314] snd_hda_intel 0000:01:00.1: irq 49 for MSI/MSI-X
[   33.259732] [drm] ring test on 0 succeeded in 0 usecs
[   33.259890] [drm] ib test on ring 0 succeeded in 0 usecs
[   33.319713] r8169 0000:02:00.0 eth0: link down
[   33.448144] PM: thaw of devices complete after 229.835 msecs
[   33.455071] PM: writing image.
[   33.535747] ata5: SATA link down (SStatus 0 SControl 300)
[   33.541393] ata6: SATA link down (SStatus 0 SControl 300)
[   33.707603] ata4: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[   33.714056] ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[   33.720477] ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[   33.728938] ata4.00: configured for UDMA/133
[   33.734567] ata3.00: configured for UDMA/133
[   33.739453] ata2.00: configured for UDMA/133
[   33.745111] PM: Using 3 thread(s) for compression.
[   33.745111] PM: Compressing and saving image data (149256 pages)...
[   33.756336] PM: Image saving progress:   0%
[   33.915592] PM: Image saving progress:  10%
[   33.995331] PM: Image saving progress:  20%
[   34.123934] PM: Image saving progress:  30%
[   34.362314] PM: Image saving progress:  40%
[   34.463433] PM: Image saving progress:  50%
[   34.550636] PM: Image saving progress:  60%
[   34.651601] PM: Image saving progress:  70%
[   34.756381] PM: Image saving progress:  80%
[   34.863658] PM: Image saving progress:  90%
[   34.973828] PM: Image saving progress: 100%
[   34.993966] PM: Image saving done.
[   34.998214] PM: Wrote 597024 kbytes in 1.23 seconds (485.38 MB/s)
[   35.005742] PM: S|
[   35.122141] r8169 0000:02:00.0 eth0: link up
[   35.368945] sd 3:0:0:0: [sdc] Synchronizing SCSI cache
[   35.376839] sd 3:0:0:0: [sdc] Stopping disk
[   35.898610] sd 2:0:0:0: [sdb] Synchronizing SCSI cache
[   35.906277] sd 2:0:0:0: [sdb] Stopping disk
[   36.783830] sd 1:0:0:0: [sda] Synchronizing SCSI cache
[   36.792035] sd 1:0:0:0: [sda] Stopping disk
[   36.833993] pcieport 0000:00:04.0: System wakeup enabled by ACPI
[   36.855314] ------------[ cut here ]------------
[   36.862344] WARNING: CPU: 2 PID: 3786 at fs/proc/generic.c:521 remove_proc_entry+0x19c/0x1b0()
[   36.864089] AMD-Vi: Event logged [IO_PAGE_FAULT device=01:00.0 domain=0x0014 address=0x0000000020001000 flags=0x0000]
[   36.886423] remove_proc_entry: removing non-empty directory 'irq/23', leaking at least 'ehci_hcd:usb3'
[   36.898325] Modules linked in: xt_iprange ipt_MASQUERADE iptable_mangle iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_filter ip_tables x_tables cpufreq_powersave cpufreq_userspace cpufreq_stats cpufreq_conservative ipv6 vfat fat fuse dm_crypt dm_mod amd64_edac_mod edac_core fam15h_power k10temp radeon drm_kms_helper ttm acpi_cpufreq r8169
[   36.934730] CPU: 3 PID: 3786 Comm: hib.sh Not tainted 3.16.0-rc7+ #1
[   36.942103] Hardware name: To be filled by O.E.M. To be filled by O.E.M./M5A97 EVO R2.0, BIOS 1503 01/16/2013
[   36.953051]  0000000000000009 ffff88041e4ebb40 ffffffff81617d4e ffff88041e4ebb88
[   36.961616]  ffff88041e4ebb78 ffffffff8104f82c ffff880037a91e00 ffff880429842e30
[   36.970232]  0000000000000002 ffff88041e4ebc4e ffff880037a91eb9 ffff88041e4ebbd8
[   36.978822] Call Trace:
[   36.982324]  [<ffffffff81617d4e>] dump_stack+0x4e/0x7a
[   36.988496]  [<ffffffff8104f82c>] warn_slowpath_common+0x8c/0xc0
[   36.995558]  [<ffffffff8104f8ac>] warn_slowpath_fmt+0x4c/0x50
[   37.002395]  [<ffffffff811dc11c>] remove_proc_entry+0x19c/0x1b0
[   37.011279]  [<ffffffff810b479e>] unregister_irq_proc+0xce/0xf0
[   37.020129]  [<ffffffff810ae689>] free_desc+0x39/0x90
[   37.028097]  [<ffffffff810ae729>] irq_free_descs+0x49/0x90
[   37.036478]  [<ffffffff810b3f36>] irq_dispose_mapping+0x36/0x60
[   37.045329]  [<ffffffff8103c2d2>] mp_unmap_irq+0x92/0xc0
[   37.053540]  [<ffffffff81033a81>] acpi_unregister_gsi_ioapic+0x31/0x40
[   37.063012]  [<ffffffff81033987>] acpi_unregister_gsi+0x17/0x20
[   37.071870]  [<ffffffff81340c44>] acpi_pci_irq_disable+0x5c/0x63
[   37.080756]  [<ffffffff81532b6e>] pcibios_disable_device+0x2e/0x40
[   37.089838]  [<ffffffff8130bb22>] do_pci_disable_device+0x52/0x60
[   37.098860]  [<ffffffff8130bb88>] pci_disable_device+0x58/0xd0
[   37.107561]  [<ffffffff81459182>] usb_hcd_pci_shutdown+0x42/0x50
[   37.116425]  [<ffffffff8130db28>] pci_device_shutdown+0x28/0x40
[   37.125248]  [<ffffffff813cf112>] device_shutdown+0xd2/0x1f0
[   37.133792]  [<ffffffff810754b5>] kernel_power_off+0x35/0x80
[   37.142348]  [<ffffffff81617455>] power_down+0x45/0xa1
[   37.150335]  [<ffffffff810a527d>] hibernate+0x19d/0x1d0
[   37.158377]  [<ffffffff810a1c9c>] state_store+0xec/0x100
[   37.166526]  [<ffffffff812d448f>] kobj_attr_store+0xf/0x20
[   37.174858]  [<ffffffff811e91d0>] sysfs_kf_write+0x50/0x70
[   37.183144]  [<ffffffff811e84e5>] kernfs_fop_write+0x105/0x190
[   37.191700]  [<ffffffff81171dcd>] vfs_write+0xbd/0x1d0
[   37.199609]  [<ffffffff81172942>] SyS_write+0x52/0xc0
[   37.207357]  [<ffffffff81620216>] system_call_fastpath+0x1a/0x1f
[   37.216091] ---[ end trace 555995a6cab7d4fe ]---
[   37.222168] ------------[ cut here ]------------
[   37.227753] WARNING: CPU: 2 PID: 3786 at fs/proc/generic.c:521 remove_proc_entry+0x19c/0x1b0()
[   37.237344] remove_proc_entry: removing non-empty directory 'irq/22', leaking at least 'ohci_hcd:usb7'
[   37.247595] Modules linked in: xt_iprange ipt_MASQUERADE iptable_mangle iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_filter ip_tables x_tables cpufreq_powersave cpufreq_userspace cpufreq_stats cpufreq_conservative ipv6 vfat fat fuse dm_crypt dm_mod amd64_edac_mod edac_core fam15h_power k10temp radeon drm_kms_helper ttm acpi_cpufreq r8169
[   37.284164] CPU: 3 PID: 3786 Comm: hib.sh Tainted: G        W     3.16.0-rc7+ #1
[   37.292669] Hardware name: To be filled by O.E.M. To be filled by O.E.M./M5A97 EVO R2.0, BIOS 1503 01/16/2013
[   37.303684]  0000000000000009 ffff88041e4ebb40 ffffffff81617d4e ffff88041e4ebb88
[   37.312346]  ffff88041e4ebb78 ffffffff8104f82c ffff8804298b7500 ffff880429b2e630
[   37.321035]  0000000000000002 ffff88041e4ebc4e ffff8804298b75b9 ffff88041e4ebbd8
[   37.329713] Call Trace:
[   37.333294]  [<ffffffff81617d4e>] dump_stack+0x4e/0x7a
[   37.339569]  [<ffffffff8104f82c>] warn_slowpath_common+0x8c/0xc0
[   37.346722]  [<ffffffff8104f8ac>] warn_slowpath_fmt+0x4c/0x50
[   37.353623]  [<ffffffff811dc11c>] remove_proc_entry+0x19c/0x1b0
[   37.360688]  [<ffffffff810b479e>] unregister_irq_proc+0xce/0xf0
[   37.367776]  [<ffffffff810ae689>] free_desc+0x39/0x90
[   37.374023]  [<ffffffff810ae729>] irq_free_descs+0x49/0x90
[   37.382851]  [<ffffffff810b3f36>] irq_dispose_mapping+0x36/0x60
[   37.392223]  [<ffffffff8103c2d2>] mp_unmap_irq+0x92/0xc0
[   37.400915]  [<ffffffff81033a81>] acpi_unregister_gsi_ioapic+0x31/0x40
[   37.410875]  [<ffffffff81033987>] acpi_unregister_gsi+0x17/0x20
[   37.420244]  [<ffffffff81340c44>] acpi_pci_irq_disable+0x5c/0x63
[   37.429713]  [<ffffffff81532b6e>] pcibios_disable_device+0x2e/0x40
[   37.439354]  [<ffffffff8130bb22>] do_pci_disable_device+0x52/0x60
[   37.448906]  [<ffffffff8130bb88>] pci_disable_device+0x58/0xd0
[   37.458130]  [<ffffffff81459182>] usb_hcd_pci_shutdown+0x42/0x50
[   37.467572]  [<ffffffff8130db28>] pci_device_shutdown+0x28/0x40
[   37.476968]  [<ffffffff813cf112>] device_shutdown+0xd2/0x1f0
[   37.486079]  [<ffffffff810754b5>] kernel_power_off+0x35/0x80
[   37.495206]  [<ffffffff81617455>] power_down+0x45/0xa1
[   37.503789]  [<ffffffff810a527d>] hibernate+0x19d/0x1d0
[   37.512466]  [<ffffffff810a1c9c>] state_store+0xec/0x100
[   37.521217]  [<ffffffff812d448f>] kobj_attr_store+0xf/0x20
[   37.530104]  [<ffffffff811e91d0>] sysfs_kf_write+0x50/0x70
[   37.538959]  [<ffffffff811e84e5>] kernfs_fop_write+0x105/0x190
[   37.548096]  [<ffffffff81171dcd>] vfs_write+0xbd/0x1d0
[   37.556513]  [<ffffffff81172942>] SyS_write+0x52/0xc0
[   37.564743]  [<ffffffff81620216>] system_call_fastpath+0x1a/0x1f
[   37.573837] ---[ end trace 555995a6cab7d4ff ]---
[   37.583013] ------------[ cut here ]------------
[   37.588736] WARNING: CPU: 0 PID: 3786 at fs/proc/generic.c:521 remove_proc_entry+0x19c/0x1b0()
[   37.598369] remove_proc_entry: removing non-empty directory 'irq/21', leaking at least 'ehci_hcd:usb2'
[   37.608655] Modules linked in: xt_iprange ipt_MASQUERADE iptable_mangle iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_filter ip_tables x_tables cpufreq_powersave cpufreq_userspace cpufreq_stats cpufreq_conservative ipv6 vfat fat fuse dm_crypt dm_mod amd64_edac_mod edac_core fam15h_power k10temp radeon drm_kms_helper ttm acpi_cpufreq r8169
[   37.647451] CPU: 4 PID: 3786 Comm: hib.sh Tainted: G        W     3.16.0-rc7+ #1
[   37.658115] Hardware name: To be filled by O.E.M. To be filled by O.E.M./M5A97 EVO R2.0, BIOS 1503 01/16/2013
[   37.671304]  0000000000000009 ffff88041e4ebb40 ffffffff81617d4e ffff88041e4ebb88
[   37.682276]  ffff88041e4ebb78 ffffffff8104f82c ffff880429f6c000 ffff880429842e30
[   37.693179]  0000000000000002 ffff88041e4ebc4e ffff880429f6c0b9 ffff88041e4ebbd8
[   37.704132] Call Trace:
[   37.709898]  [<ffffffff81617d4e>] dump_stack+0x4e/0x7a
[   37.718393]  [<ffffffff8104f82c>] warn_slowpath_common+0x8c/0xc0
[   37.727774]  [<ffffffff8104f8ac>] warn_slowpath_fmt+0x4c/0x50
[   37.736888]  [<ffffffff811dc11c>] remove_proc_entry+0x19c/0x1b0
[   37.746167]  [<ffffffff810b479e>] unregister_irq_proc+0xce/0xf0
[   37.755444]  [<ffffffff810ae689>] free_desc+0x39/0x90
[   37.763897]  [<ffffffff810ae729>] irq_free_descs+0x49/0x90
[   37.772742]  [<ffffffff810b3f36>] irq_dispose_mapping+0x36/0x60
[   37.782019]  [<ffffffff8103c2d2>] mp_unmap_irq+0x92/0xc0
[   37.790656]  [<ffffffff81033a81>] acpi_unregister_gsi_ioapic+0x31/0x40
[   37.800534]  [<ffffffff81033987>] acpi_unregister_gsi+0x17/0x20
[   37.809810]  [<ffffffff81340c44>] acpi_pci_irq_disable+0x5c/0x63
[   37.819130]  [<ffffffff81532b6e>] pcibios_disable_device+0x2e/0x40
[   37.828650]  [<ffffffff8130bb22>] do_pci_disable_device+0x52/0x60
[   37.838120]  [<ffffffff8130bb88>] pci_disable_device+0x58/0xd0
[   37.847324]  [<ffffffff81459182>] usb_hcd_pci_shutdown+0x42/0x50
[   37.856698]  [<ffffffff8130db28>] pci_device_shutdown+0x28/0x40
[   37.866005]  [<ffffffff813cf112>] device_shutdown+0xd2/0x1f0
[   37.875035]  [<ffffffff810754b5>] kernel_power_off+0x35/0x80
[   37.884060]  [<ffffffff81617455>] power_down+0x45/0xa1
[   37.892580]  [<ffffffff810a527d>] hibernate+0x19d/0x1d0
[   37.901173]  [<ffffffff810a1c9c>] state_store+0xec/0x100
[   37.909851]  [<ffffffff812d448f>] kobj_attr_store+0xf/0x20
[   37.918705]  [<ffffffff811e91d0>] sysfs_kf_write+0x50/0x70
[   37.927558]  [<ffffffff811e84e5>] kernfs_fop_write+0x105/0x190
[   37.936710]  [<ffffffff81171dcd>] vfs_write+0xbd/0x1d0
[   37.945157]  [<ffffffff81172942>] SyS_write+0x52/0xc0
[   37.953358]  [<ffffffff81620216>] system_call_fastpath+0x1a/0x1f
[   37.962486] ---[ end trace 555995a6cab7d500 ]---
[   37.971327] ------------[ cut here ]------------
[   37.978904] WARNING: CPU: 2 PID: 3786 at fs/proc/generic.c:521 remove_proc_entry+0x19c/0x1b0()
[   37.990542] remove_proc_entry: removing non-empty directory 'irq/20', leaking at least 'ohci_hcd:usb5'
[   38.002874] Modules linked in: xt_iprange ipt_MASQUERADE iptable_mangle iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_filter ip_tables x_tables cpufreq_powersave cpufreq_userspace cpufreq_stats cpufreq_conservative ipv6 vfat fat fuse dm_crypt dm_mod amd64_edac_mod edac_core fam15h_power k10temp radeon drm_kms_helper ttm acpi_cpufreq r8169
[   38.039571] CPU: 3 PID: 3786 Comm: hib.sh Tainted: G        W     3.16.0-rc7+ #1
[   38.048075] Hardware name: To be filled by O.E.M. To be filled by O.E.M./M5A97 EVO R2.0, BIOS 1503 01/16/2013
[   38.059363]  0000000000000009 ffff88041e4ebb40 ffffffff81617d4e ffff88041e4ebb88
[   38.070381]  ffff88041e4ebb78 ffffffff8104f82c ffff880429843e00 ffff880429b2e630
[   38.081335]  0000000000000002 ffff88041e4ebc4e ffff880429843eb9 ffff88041e4ebbd8
[   38.092303] Call Trace:
[   38.098120]  [<ffffffff81617d4e>] dump_stack+0x4e/0x7a
[   38.106619]  [<ffffffff8104f82c>] warn_slowpath_common+0x8c/0xc0
[   38.115986]  [<ffffffff8104f8ac>] warn_slowpath_fmt+0x4c/0x50
[   38.125119]  [<ffffffff811dc11c>] remove_proc_entry+0x19c/0x1b0
[   38.134401]  [<ffffffff810b479e>] unregister_irq_proc+0xce/0xf0
[   38.143665]  [<ffffffff810ae689>] free_desc+0x39/0x90
[   38.152113]  [<ffffffff810ae729>] irq_free_descs+0x49/0x90
[   38.160971]  [<ffffffff810b3f36>] irq_dispose_mapping+0x36/0x60
[   38.170251]  [<ffffffff8103c2d2>] mp_unmap_irq+0x92/0xc0
[   38.178918]  [<ffffffff81033a81>] acpi_unregister_gsi_ioapic+0x31/0x40
[   38.188836]  [<ffffffff81033987>] acpi_unregister_gsi+0x17/0x20
[   38.198133]  [<ffffffff81340c44>] acpi_pci_irq_disable+0x5c/0x63
[   38.207534]  [<ffffffff81532b6e>] pcibios_disable_device+0x2e/0x40
[   38.217056]  [<ffffffff8130bb22>] do_pci_disable_device+0x52/0x60
[   38.226486]  [<ffffffff8130bb88>] pci_disable_device+0x58/0xd0
[   38.235667]  [<ffffffff81459182>] usb_hcd_pci_shutdown+0x42/0x50
[   38.245025]  [<ffffffff8130db28>] pci_device_shutdown+0x28/0x40
[   38.254323]  [<ffffffff813cf112>] device_shutdown+0xd2/0x1f0
[   38.263346]  [<ffffffff810754b5>] kernel_power_off+0x35/0x80
[   38.272404]  [<ffffffff81617455>] power_down+0x45/0xa1
[   38.280890]  [<ffffffff810a527d>] hibernate+0x19d/0x1d0
[   38.289488]  [<ffffffff810a1c9c>] state_store+0xec/0x100
[   38.298180]  [<ffffffff812d448f>] kobj_attr_store+0xf/0x20
[   38.307040]  [<ffffffff811e91d0>] sysfs_kf_write+0x50/0x70
[   38.315880]  [<ffffffff811e84e5>] kernfs_fop_write+0x105/0x190
[   38.325035]  [<ffffffff81171dcd>] vfs_write+0xbd/0x1d0
[   38.333471]  [<ffffffff81172942>] SyS_write+0x52/0xc0
[   38.341680]  [<ffffffff81620216>] system_call_fastpath+0x1a/0x1f
[   38.350836] ---[ end trace 555995a6cab7d501 ]---
[   38.359898] ------------[ cut here ]------------
[   38.367518] WARNING: CPU: 6 PID: 3786 at fs/proc/generic.c:521 remove_proc_entry+0x19c/0x1b0()
[   38.377354] remove_proc_entry: removing non-empty directory 'irq/17', leaking at least 'ehci_hcd:usb1'
[   38.387685] Modules linked in: xt_iprange ipt_MASQUERADE iptable_mangle iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_filter ip_tables x_tables cpufreq_powersave cpufreq_userspace cpufreq_stats cpufreq_conservative ipv6 vfat fat fuse dm_crypt dm_mod amd64_edac_mod edac_core fam15h_power k10temp radeon drm_kms_helper ttm acpi_cpufreq r8169
[   38.424577] CPU: 7 PID: 3786 Comm: hib.sh Tainted: G        W     3.16.0-rc7+ #1
[   38.433267] Hardware name: To be filled by O.E.M. To be filled by O.E.M./M5A97 EVO R2.0, BIOS 1503 01/16/2013
[   38.444345]  0000000000000009 ffff88041e4ebb40 ffffffff81617d4e ffff88041e4ebb88
[   38.452997]  ffff88041e4ebb78 ffffffff8104f82c ffff88042b199000 ffff880429842e30
[   38.461711]  0000000000000002 ffff88041e4ebc4e ffff88042b1990b9 ffff88041e4ebbd8
[   38.470362] Call Trace:
[   38.474000]  [<ffffffff81617d4e>] dump_stack+0x4e/0x7a
[   38.480276]  [<ffffffff8104f82c>] warn_slowpath_common+0x8c/0xc0
[   38.487466]  [<ffffffff8104f8ac>] warn_slowpath_fmt+0x4c/0x50
[   38.494344]  [<ffffffff811dc11c>] remove_proc_entry+0x19c/0x1b0
[   38.501441]  [<ffffffff810b479e>] unregister_irq_proc+0xce/0xf0
[   38.508498]  [<ffffffff810ae689>] free_desc+0x39/0x90
[   38.514772]  [<ffffffff810ae729>] irq_free_descs+0x49/0x90
[   38.523647]  [<ffffffff810b3f36>] irq_dispose_mapping+0x36/0x60
[   38.532943]  [<ffffffff8103c2d2>] mp_unmap_irq+0x92/0xc0
[   38.541628]  [<ffffffff81033a81>] acpi_unregister_gsi_ioapic+0x31/0x40
[   38.551512]  [<ffffffff81033987>] acpi_unregister_gsi+0x17/0x20
[   38.560841]  [<ffffffff81340c44>] acpi_pci_irq_disable+0x5c/0x63
[   38.570218]  [<ffffffff81532b6e>] pcibios_disable_device+0x2e/0x40
[   38.579774]  [<ffffffff8130bb22>] do_pci_disable_device+0x52/0x60
[   38.589277]  [<ffffffff8130bb88>] pci_disable_device+0x58/0xd0
[   38.598485]  [<ffffffff81459182>] usb_hcd_pci_shutdown+0x42/0x50
[   38.607886]  [<ffffffff8130db28>] pci_device_shutdown+0x28/0x40
[   38.617206]  [<ffffffff813cf112>] device_shutdown+0xd2/0x1f0
[   38.626273]  [<ffffffff810754b5>] kernel_power_off+0x35/0x80
[   38.635346]  [<ffffffff81617455>] power_down+0x45/0xa1
[   38.643849]  [<ffffffff810a527d>] hibernate+0x19d/0x1d0
[   38.652446]  [<ffffffff810a1c9c>] state_store+0xec/0x100
[   38.661129]  [<ffffffff812d448f>] kobj_attr_store+0xf/0x20
[   38.670047]  [<ffffffff811e91d0>] sysfs_kf_write+0x50/0x70
[   38.678911]  [<ffffffff811e84e5>] kernfs_fop_write+0x105/0x190
[   38.688123]  [<ffffffff81171dcd>] vfs_write+0xbd/0x1d0
[   38.696609]  [<ffffffff81172942>] SyS_write+0x52/0xc0
[   38.704866]  [<ffffffff81620216>] system_call_fastpath+0x1a/0x1f
[   38.713983] ---[ end trace 555995a6cab7d502 ]---
[   38.722834] ------------[ cut here ]------------
[   38.730422] WARNING: CPU: 0 PID: 3786 at fs/proc/generic.c:521 remove_proc_entry+0x19c/0x1b0()
[   38.740241] remove_proc_entry: removing non-empty directory 'irq/18', leaking at least 'ohci_hcd:usb6'
[   38.750583] Modules linked in: xt_iprange ipt_MASQUERADE iptable_mangle iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4[   38.762023] AMD-Vi: Event logged [IO_PAGE_FAULT device=00:12.0 domain=0x0009 address=0x0000000000000080 flags=0x0020]
[   38.762030] AMD-Vi: Event logged [IO_PAGE_FAULT device=00:12.0 domain=0x0009 address=0x0000000000000000 flags=0x0000]
[   38.762034] AMD-Vi: Event logged [IO_PAGE_FAULT device=00:12.0 domain=0x0009 address=0x00000000ffffffc0 flags=0x0010]
[   38.762038] AMD-Vi: Event logged [IO_PAGE_FAULT device=00:12.0 domain=0x0009 address=0x00000000ffffffc0 flags=0x0010]
[   38.762041] AMD-Vi: Event logged [IO_PAGE_FAULT device=00:12.0 domain=0x0009 address=0x00000000ffffffc0 flags=0x0010]
[   38.762044] AMD-Vi: Event logged [IO_PAGE_FAULT device=00:12.0 domain=0x0009 address=0x00000000ffffffc0 flags=0x0010]
[   38.762048] AMD-Vi: Event logged [IO_PAGE_FAULT device=00:12.0 domain=0x0009 address=0x00000000ffffffc0 flags=0x0010]
[   38.762051] AMD-Vi: Event logged [IO_PAGE_FAULT device=00:12.0 domain=0x0009 address=0x00000000ffffffc0 flags=0x0010]

lotta those...

[   45.876957] Clocksource tsc unstable (delta = 4686965640 ns)
[   45.883734]  nf_nat nf_conntrack iptable_filter ip_tables x_tables cpufreq_powersave cpufreq_userspace cpufreq_stats cpufreq_conservative ipv6 vfat fat fuse dm_crypt dm_mod amd64_edac_mod edac_core fam15h_power k10temp radeon drm_kms_helper ttm acpi_cpufreq r8169
[   45.909515] CPU: 0 PID: 3786 Comm: hib.sh Tainted: G        W     3.16.0-rc7+ #1
[   45.909616] Switched to clocksource acpi_pm
[   45.921887] Hardware name: To be filled by O.E.M. To be filled by O.E.M./M5A97 EVO R2.0, BIOS 1503 01/16/2013
[   45.932301]  0000000000000009 ffff88041e4ebb40 ffffffff81617d4e ffff88041e4ebb88
[   45.940316]  ffff88041e4ebb78 ffffffff8104f82c ffff880429842e00 ffff880429b2e630
[   45.948346]  0000000000000002 ffff88041e4ebc4e ffff880429842eb9 ffff88041e4ebbd8
[   45.956357] Call Trace:
[   45.959304]  [<ffffffff81617d4e>] dump_stack+0x4e/0x7a
[   45.964975]  [<ffffffff8104f82c>] warn_slowpath_common+0x8c/0xc0
[   45.971491]  [<ffffffff8104f8ac>] warn_slowpath_fmt+0x4c/0x50
[   45.977757]  [<ffffffff811dc11c>] remove_proc_entry+0x19c/0x1b0
[   45.984240]  [<ffffffff810b479e>] unregister_irq_proc+0xce/0xf0
[   45.990696]  [<ffffffff810ae689>] free_desc+0x39/0x90
[   45.996299]  [<ffffffff810ae729>] irq_free_descs+0x49/0x90
[   46.002380]  [<ffffffff810b3f36>] irq_dispose_mapping+0x36/0x60
[   46.008893]  [<ffffffff8103c2d2>] mp_unmap_irq+0x92/0xc0
[   46.014794]  [<ffffffff81033a81>] acpi_unregister_gsi_ioapic+0x31/0x40
[   46.021911]  [<ffffffff81033987>] acpi_unregister_gsi+0x17/0x20
[   46.028427]  [<ffffffff81340c44>] acpi_pci_irq_disable+0x5c/0x63
[   46.035046]  [<ffffffff81532b6e>] pcibios_disable_device+0x2e/0x40
[   46.041809]  [<ffffffff8130bb22>] do_pci_disable_device+0x52/0x60
[   46.048495]  [<ffffffff8130bb88>] pci_disable_device+0x58/0xd0
[   46.054996]  [<ffffffff81459182>] usb_hcd_pci_shutdown+0x42/0x50
[   46.061611]  [<ffffffff8130db28>] pci_device_shutdown+0x28/0x40
[   46.068202]  [<ffffffff813cf112>] device_shutdown+0xd2/0x1f0
[   46.074543]  [<ffffffff810754b5>] kernel_power_off+0x35/0x80
[   46.080886]  [<ffffffff81617455>] power_down+0x45/0xa1
[   46.086695]  [<ffffffff810a527d>] hibernate+0x19d/0x1d0
[   46.092603]  [<ffffffff810a1c9c>] state_store+0xec/0x100
[   46.098599]  [<ffffffff812d448f>] kobj_attr_store+0xf/0x20
[   46.104755]  [<ffffffff811e91d0>] sysfs_kf_write+0x50/0x70
[   46.110921]  [<ffffffff811e84e5>] kernfs_fop_write+0x105/0x190
[   46.117434]  [<ffffffff81171dcd>] vfs_write+0xbd/0x1d0
[   46.123291]  [<ffffffff81172942>] SyS_write+0x52/0xc0
[   46.129069]  [<ffffffff81620216>] system_call_fastpath+0x1a/0x1f
[   46.135798] ---[ end trace 555995a6cab7d503 ]---
[   46.142058] ACPI: Preparing to enter system sleep state S5
[   46.148449] [Firmware Bug]: ACPI: BIOS _OSI(Linux) query honored via cmdline
[   46.159388] reboot: Power down
[   46.163504] acpi_power_off called

-- 
Regards/Gruss,
    Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] x86, irq: Keep IRQ assignment for PCI devices during suspend/hibernation
@ 2014-07-31 14:41               ` Jiang Liu
  0 siblings, 0 replies; 24+ messages in thread
From: Jiang Liu @ 2014-07-31 14:41 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Rafael J . Wysocki, Thomas Gleixner, Bjorn Helgaas, Ingo Molnar,
	H. Peter Anvin, x86, linux-pci, lkml, Joerg Roedel, iommu

Hi Borislav,
	Really appreciate your help. There are two issues left
according to the log messages.
	The first issue is " WARNING: CPU: 2 PID: 3786 at
fs/proc/generic.c:521 remove_proc_entry+0x19c/0x1b0()".
It's caused by an issue in usb driver and a patch is ready to fix
this issue, please refer to https://lkml.org/lkml/2014/7/20/277.
This patch has been merged into Greg's usb repository, so it should
be merge into v3.17 too.
	The second issue is "AMD-Vi: Event logged [IO_PAGE_FAULT
device=00:12.0 domain=0x0009 address=0x0000000000000000 flags=0x0000]".
I feel it's not caused by IRQ related changes, but may be caused
by IO page fault related work for AMD IOMMU. I need more time
to investigate this and also CC Joerg for help too.
	According to the log message, I feel my patch has fixed the
original issue caused by IRQ related changes. What's your thoughts?

Regards!
Gerry

On 2014/7/31 18:39, Borislav Petkov wrote:
> On Thu, Jul 31, 2014 at 08:33:26AM +0800, Jiang Liu wrote:
>> Sorry for the incovenience, something is wrong when synchronizing
>> the patch. I have found that "pm" should be "power" when building and
>> testing this patch on my test machine. And I have also fixed it on
>> my development machine, but forgot to do "stg refresh" before "git
>> format-patch",
> 
> Yeah, use git only :-)
> 
> Anyway, still b0rked although I can get further and actually suspend.
> Full splat below.
> 
> The IOMMU PFs point to
> 
> 01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] RV635 [Radeon HD 3650/3750/4570/4580]
> 
> and
> 
> 00:12.0 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI0 Controller
> 
> respectively.
> 
> and the IOPFs are all this OHCI controller. Which points at a timing
> problem during suspend, I'm being told, when the device gets returned
> to the BIOS but IOMMU is still on and filters accesses. Or something to
> that effect. In any case, still b0rked.
> 
> -> start suspend to disk:
> 
> [   31.422901] hib.sh (3786): drop_caches: 3
> [   31.427932] PM: Hibernation mode set to 'shutdown'
> [   31.444522] PM: Syncing filesystems ... done.
> [   31.452558] Freezing user space processes ... (elapsed 0.002 seconds) done.
> [   31.462404] PM: Marking nosave pages: [mem 0x0009e000-0x000fffff]
> [   31.468544] PM: Marking nosave pages: [mem 0xba9b8000-0xbca4dfff]
> [   31.474868] PM: Marking nosave pages: [mem 0xbca4f000-0xbcc54fff]
> [   31.480995] PM: Marking nosave pages: [mem 0xbd083000-0xbd7f3fff]
> [   31.487136] PM: Marking nosave pages: [mem 0xbd800000-0x100000fff]
> [   31.493891] PM: Basic memory bitmaps created
> [   31.498178] PM: Preallocating image memory... done (allocated 143539 pages)
> [   32.001145] PM: Allocated 574156 kbytes in 0.49 seconds (1171.74 MB/s)
> [   32.007690] Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
> [   33.220531] serial 00:06: activated
> [   33.220959] [drm] PCIE gen 2 link speeds already enabled
> [   33.228236] [drm] PCIE GART of 512M enabled (table at 0x0000000000040000).
> [   33.228340] radeon 0000:01:00.0: WB enabled
> [   33.228347] radeon 0000:01:00.0: fence driver on ring 0 use gpu addr 0x0000000020000c00 and cpu addr 0xffff88042939ac00
> [   33.233314] snd_hda_intel 0000:01:00.1: irq 49 for MSI/MSI-X
> [   33.259732] [drm] ring test on 0 succeeded in 0 usecs
> [   33.259890] [drm] ib test on ring 0 succeeded in 0 usecs
> [   33.319713] r8169 0000:02:00.0 eth0: link down
> [   33.448144] PM: thaw of devices complete after 229.835 msecs
> [   33.455071] PM: writing image.
> [   33.535747] ata5: SATA link down (SStatus 0 SControl 300)
> [   33.541393] ata6: SATA link down (SStatus 0 SControl 300)
> [   33.707603] ata4: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
> [   33.714056] ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
> [   33.720477] ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
> [   33.728938] ata4.00: configured for UDMA/133
> [   33.734567] ata3.00: configured for UDMA/133
> [   33.739453] ata2.00: configured for UDMA/133
> [   33.745111] PM: Using 3 thread(s) for compression.
> [   33.745111] PM: Compressing and saving image data (149256 pages)...
> [   33.756336] PM: Image saving progress:   0%
> [   33.915592] PM: Image saving progress:  10%
> [   33.995331] PM: Image saving progress:  20%
> [   34.123934] PM: Image saving progress:  30%
> [   34.362314] PM: Image saving progress:  40%
> [   34.463433] PM: Image saving progress:  50%
> [   34.550636] PM: Image saving progress:  60%
> [   34.651601] PM: Image saving progress:  70%
> [   34.756381] PM: Image saving progress:  80%
> [   34.863658] PM: Image saving progress:  90%
> [   34.973828] PM: Image saving progress: 100%
> [   34.993966] PM: Image saving done.
> [   34.998214] PM: Wrote 597024 kbytes in 1.23 seconds (485.38 MB/s)
> [   35.005742] PM: S|
> [   35.122141] r8169 0000:02:00.0 eth0: link up
> [   35.368945] sd 3:0:0:0: [sdc] Synchronizing SCSI cache
> [   35.376839] sd 3:0:0:0: [sdc] Stopping disk
> [   35.898610] sd 2:0:0:0: [sdb] Synchronizing SCSI cache
> [   35.906277] sd 2:0:0:0: [sdb] Stopping disk
> [   36.783830] sd 1:0:0:0: [sda] Synchronizing SCSI cache
> [   36.792035] sd 1:0:0:0: [sda] Stopping disk
> [   36.833993] pcieport 0000:00:04.0: System wakeup enabled by ACPI
> [   36.855314] ------------[ cut here ]------------
> [   36.862344] WARNING: CPU: 2 PID: 3786 at fs/proc/generic.c:521 remove_proc_entry+0x19c/0x1b0()
> [   36.864089] AMD-Vi: Event logged [IO_PAGE_FAULT device=01:00.0 domain=0x0014 address=0x0000000020001000 flags=0x0000]
> [   36.886423] remove_proc_entry: removing non-empty directory 'irq/23', leaking at least 'ehci_hcd:usb3'
> [   36.898325] Modules linked in: xt_iprange ipt_MASQUERADE iptable_mangle iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_filter ip_tables x_tables cpufreq_powersave cpufreq_userspace cpufreq_stats cpufreq_conservative ipv6 vfat fat fuse dm_crypt dm_mod amd64_edac_mod edac_core fam15h_power k10temp radeon drm_kms_helper ttm acpi_cpufreq r8169
> [   36.934730] CPU: 3 PID: 3786 Comm: hib.sh Not tainted 3.16.0-rc7+ #1
> [   36.942103] Hardware name: To be filled by O.E.M. To be filled by O.E.M./M5A97 EVO R2.0, BIOS 1503 01/16/2013
> [   36.953051]  0000000000000009 ffff88041e4ebb40 ffffffff81617d4e ffff88041e4ebb88
> [   36.961616]  ffff88041e4ebb78 ffffffff8104f82c ffff880037a91e00 ffff880429842e30
> [   36.970232]  0000000000000002 ffff88041e4ebc4e ffff880037a91eb9 ffff88041e4ebbd8
> [   36.978822] Call Trace:
> [   36.982324]  [<ffffffff81617d4e>] dump_stack+0x4e/0x7a
> [   36.988496]  [<ffffffff8104f82c>] warn_slowpath_common+0x8c/0xc0
> [   36.995558]  [<ffffffff8104f8ac>] warn_slowpath_fmt+0x4c/0x50
> [   37.002395]  [<ffffffff811dc11c>] remove_proc_entry+0x19c/0x1b0
> [   37.011279]  [<ffffffff810b479e>] unregister_irq_proc+0xce/0xf0
> [   37.020129]  [<ffffffff810ae689>] free_desc+0x39/0x90
> [   37.028097]  [<ffffffff810ae729>] irq_free_descs+0x49/0x90
> [   37.036478]  [<ffffffff810b3f36>] irq_dispose_mapping+0x36/0x60
> [   37.045329]  [<ffffffff8103c2d2>] mp_unmap_irq+0x92/0xc0
> [   37.053540]  [<ffffffff81033a81>] acpi_unregister_gsi_ioapic+0x31/0x40
> [   37.063012]  [<ffffffff81033987>] acpi_unregister_gsi+0x17/0x20
> [   37.071870]  [<ffffffff81340c44>] acpi_pci_irq_disable+0x5c/0x63
> [   37.080756]  [<ffffffff81532b6e>] pcibios_disable_device+0x2e/0x40
> [   37.089838]  [<ffffffff8130bb22>] do_pci_disable_device+0x52/0x60
> [   37.098860]  [<ffffffff8130bb88>] pci_disable_device+0x58/0xd0
> [   37.107561]  [<ffffffff81459182>] usb_hcd_pci_shutdown+0x42/0x50
> [   37.116425]  [<ffffffff8130db28>] pci_device_shutdown+0x28/0x40
> [   37.125248]  [<ffffffff813cf112>] device_shutdown+0xd2/0x1f0
> [   37.133792]  [<ffffffff810754b5>] kernel_power_off+0x35/0x80
> [   37.142348]  [<ffffffff81617455>] power_down+0x45/0xa1
> [   37.150335]  [<ffffffff810a527d>] hibernate+0x19d/0x1d0
> [   37.158377]  [<ffffffff810a1c9c>] state_store+0xec/0x100
> [   37.166526]  [<ffffffff812d448f>] kobj_attr_store+0xf/0x20
> [   37.174858]  [<ffffffff811e91d0>] sysfs_kf_write+0x50/0x70
> [   37.183144]  [<ffffffff811e84e5>] kernfs_fop_write+0x105/0x190
> [   37.191700]  [<ffffffff81171dcd>] vfs_write+0xbd/0x1d0
> [   37.199609]  [<ffffffff81172942>] SyS_write+0x52/0xc0
> [   37.207357]  [<ffffffff81620216>] system_call_fastpath+0x1a/0x1f
> [   37.216091] ---[ end trace 555995a6cab7d4fe ]---
> [   37.222168] ------------[ cut here ]------------
> [   37.227753] WARNING: CPU: 2 PID: 3786 at fs/proc/generic.c:521 remove_proc_entry+0x19c/0x1b0()
> [   37.237344] remove_proc_entry: removing non-empty directory 'irq/22', leaking at least 'ohci_hcd:usb7'
> [   37.247595] Modules linked in: xt_iprange ipt_MASQUERADE iptable_mangle iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_filter ip_tables x_tables cpufreq_powersave cpufreq_userspace cpufreq_stats cpufreq_conservative ipv6 vfat fat fuse dm_crypt dm_mod amd64_edac_mod edac_core fam15h_power k10temp radeon drm_kms_helper ttm acpi_cpufreq r8169
> [   37.284164] CPU: 3 PID: 3786 Comm: hib.sh Tainted: G        W     3.16.0-rc7+ #1
> [   37.292669] Hardware name: To be filled by O.E.M. To be filled by O.E.M./M5A97 EVO R2.0, BIOS 1503 01/16/2013
> [   37.303684]  0000000000000009 ffff88041e4ebb40 ffffffff81617d4e ffff88041e4ebb88
> [   37.312346]  ffff88041e4ebb78 ffffffff8104f82c ffff8804298b7500 ffff880429b2e630
> [   37.321035]  0000000000000002 ffff88041e4ebc4e ffff8804298b75b9 ffff88041e4ebbd8
> [   37.329713] Call Trace:
> [   37.333294]  [<ffffffff81617d4e>] dump_stack+0x4e/0x7a
> [   37.339569]  [<ffffffff8104f82c>] warn_slowpath_common+0x8c/0xc0
> [   37.346722]  [<ffffffff8104f8ac>] warn_slowpath_fmt+0x4c/0x50
> [   37.353623]  [<ffffffff811dc11c>] remove_proc_entry+0x19c/0x1b0
> [   37.360688]  [<ffffffff810b479e>] unregister_irq_proc+0xce/0xf0
> [   37.367776]  [<ffffffff810ae689>] free_desc+0x39/0x90
> [   37.374023]  [<ffffffff810ae729>] irq_free_descs+0x49/0x90
> [   37.382851]  [<ffffffff810b3f36>] irq_dispose_mapping+0x36/0x60
> [   37.392223]  [<ffffffff8103c2d2>] mp_unmap_irq+0x92/0xc0
> [   37.400915]  [<ffffffff81033a81>] acpi_unregister_gsi_ioapic+0x31/0x40
> [   37.410875]  [<ffffffff81033987>] acpi_unregister_gsi+0x17/0x20
> [   37.420244]  [<ffffffff81340c44>] acpi_pci_irq_disable+0x5c/0x63
> [   37.429713]  [<ffffffff81532b6e>] pcibios_disable_device+0x2e/0x40
> [   37.439354]  [<ffffffff8130bb22>] do_pci_disable_device+0x52/0x60
> [   37.448906]  [<ffffffff8130bb88>] pci_disable_device+0x58/0xd0
> [   37.458130]  [<ffffffff81459182>] usb_hcd_pci_shutdown+0x42/0x50
> [   37.467572]  [<ffffffff8130db28>] pci_device_shutdown+0x28/0x40
> [   37.476968]  [<ffffffff813cf112>] device_shutdown+0xd2/0x1f0
> [   37.486079]  [<ffffffff810754b5>] kernel_power_off+0x35/0x80
> [   37.495206]  [<ffffffff81617455>] power_down+0x45/0xa1
> [   37.503789]  [<ffffffff810a527d>] hibernate+0x19d/0x1d0
> [   37.512466]  [<ffffffff810a1c9c>] state_store+0xec/0x100
> [   37.521217]  [<ffffffff812d448f>] kobj_attr_store+0xf/0x20
> [   37.530104]  [<ffffffff811e91d0>] sysfs_kf_write+0x50/0x70
> [   37.538959]  [<ffffffff811e84e5>] kernfs_fop_write+0x105/0x190
> [   37.548096]  [<ffffffff81171dcd>] vfs_write+0xbd/0x1d0
> [   37.556513]  [<ffffffff81172942>] SyS_write+0x52/0xc0
> [   37.564743]  [<ffffffff81620216>] system_call_fastpath+0x1a/0x1f
> [   37.573837] ---[ end trace 555995a6cab7d4ff ]---
> [   37.583013] ------------[ cut here ]------------
> [   37.588736] WARNING: CPU: 0 PID: 3786 at fs/proc/generic.c:521 remove_proc_entry+0x19c/0x1b0()
> [   37.598369] remove_proc_entry: removing non-empty directory 'irq/21', leaking at least 'ehci_hcd:usb2'
> [   37.608655] Modules linked in: xt_iprange ipt_MASQUERADE iptable_mangle iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_filter ip_tables x_tables cpufreq_powersave cpufreq_userspace cpufreq_stats cpufreq_conservative ipv6 vfat fat fuse dm_crypt dm_mod amd64_edac_mod edac_core fam15h_power k10temp radeon drm_kms_helper ttm acpi_cpufreq r8169
> [   37.647451] CPU: 4 PID: 3786 Comm: hib.sh Tainted: G        W     3.16.0-rc7+ #1
> [   37.658115] Hardware name: To be filled by O.E.M. To be filled by O.E.M./M5A97 EVO R2.0, BIOS 1503 01/16/2013
> [   37.671304]  0000000000000009 ffff88041e4ebb40 ffffffff81617d4e ffff88041e4ebb88
> [   37.682276]  ffff88041e4ebb78 ffffffff8104f82c ffff880429f6c000 ffff880429842e30
> [   37.693179]  0000000000000002 ffff88041e4ebc4e ffff880429f6c0b9 ffff88041e4ebbd8
> [   37.704132] Call Trace:
> [   37.709898]  [<ffffffff81617d4e>] dump_stack+0x4e/0x7a
> [   37.718393]  [<ffffffff8104f82c>] warn_slowpath_common+0x8c/0xc0
> [   37.727774]  [<ffffffff8104f8ac>] warn_slowpath_fmt+0x4c/0x50
> [   37.736888]  [<ffffffff811dc11c>] remove_proc_entry+0x19c/0x1b0
> [   37.746167]  [<ffffffff810b479e>] unregister_irq_proc+0xce/0xf0
> [   37.755444]  [<ffffffff810ae689>] free_desc+0x39/0x90
> [   37.763897]  [<ffffffff810ae729>] irq_free_descs+0x49/0x90
> [   37.772742]  [<ffffffff810b3f36>] irq_dispose_mapping+0x36/0x60
> [   37.782019]  [<ffffffff8103c2d2>] mp_unmap_irq+0x92/0xc0
> [   37.790656]  [<ffffffff81033a81>] acpi_unregister_gsi_ioapic+0x31/0x40
> [   37.800534]  [<ffffffff81033987>] acpi_unregister_gsi+0x17/0x20
> [   37.809810]  [<ffffffff81340c44>] acpi_pci_irq_disable+0x5c/0x63
> [   37.819130]  [<ffffffff81532b6e>] pcibios_disable_device+0x2e/0x40
> [   37.828650]  [<ffffffff8130bb22>] do_pci_disable_device+0x52/0x60
> [   37.838120]  [<ffffffff8130bb88>] pci_disable_device+0x58/0xd0
> [   37.847324]  [<ffffffff81459182>] usb_hcd_pci_shutdown+0x42/0x50
> [   37.856698]  [<ffffffff8130db28>] pci_device_shutdown+0x28/0x40
> [   37.866005]  [<ffffffff813cf112>] device_shutdown+0xd2/0x1f0
> [   37.875035]  [<ffffffff810754b5>] kernel_power_off+0x35/0x80
> [   37.884060]  [<ffffffff81617455>] power_down+0x45/0xa1
> [   37.892580]  [<ffffffff810a527d>] hibernate+0x19d/0x1d0
> [   37.901173]  [<ffffffff810a1c9c>] state_store+0xec/0x100
> [   37.909851]  [<ffffffff812d448f>] kobj_attr_store+0xf/0x20
> [   37.918705]  [<ffffffff811e91d0>] sysfs_kf_write+0x50/0x70
> [   37.927558]  [<ffffffff811e84e5>] kernfs_fop_write+0x105/0x190
> [   37.936710]  [<ffffffff81171dcd>] vfs_write+0xbd/0x1d0
> [   37.945157]  [<ffffffff81172942>] SyS_write+0x52/0xc0
> [   37.953358]  [<ffffffff81620216>] system_call_fastpath+0x1a/0x1f
> [   37.962486] ---[ end trace 555995a6cab7d500 ]---
> [   37.971327] ------------[ cut here ]------------
> [   37.978904] WARNING: CPU: 2 PID: 3786 at fs/proc/generic.c:521 remove_proc_entry+0x19c/0x1b0()
> [   37.990542] remove_proc_entry: removing non-empty directory 'irq/20', leaking at least 'ohci_hcd:usb5'
> [   38.002874] Modules linked in: xt_iprange ipt_MASQUERADE iptable_mangle iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_filter ip_tables x_tables cpufreq_powersave cpufreq_userspace cpufreq_stats cpufreq_conservative ipv6 vfat fat fuse dm_crypt dm_mod amd64_edac_mod edac_core fam15h_power k10temp radeon drm_kms_helper ttm acpi_cpufreq r8169
> [   38.039571] CPU: 3 PID: 3786 Comm: hib.sh Tainted: G        W     3.16.0-rc7+ #1
> [   38.048075] Hardware name: To be filled by O.E.M. To be filled by O.E.M./M5A97 EVO R2.0, BIOS 1503 01/16/2013
> [   38.059363]  0000000000000009 ffff88041e4ebb40 ffffffff81617d4e ffff88041e4ebb88
> [   38.070381]  ffff88041e4ebb78 ffffffff8104f82c ffff880429843e00 ffff880429b2e630
> [   38.081335]  0000000000000002 ffff88041e4ebc4e ffff880429843eb9 ffff88041e4ebbd8
> [   38.092303] Call Trace:
> [   38.098120]  [<ffffffff81617d4e>] dump_stack+0x4e/0x7a
> [   38.106619]  [<ffffffff8104f82c>] warn_slowpath_common+0x8c/0xc0
> [   38.115986]  [<ffffffff8104f8ac>] warn_slowpath_fmt+0x4c/0x50
> [   38.125119]  [<ffffffff811dc11c>] remove_proc_entry+0x19c/0x1b0
> [   38.134401]  [<ffffffff810b479e>] unregister_irq_proc+0xce/0xf0
> [   38.143665]  [<ffffffff810ae689>] free_desc+0x39/0x90
> [   38.152113]  [<ffffffff810ae729>] irq_free_descs+0x49/0x90
> [   38.160971]  [<ffffffff810b3f36>] irq_dispose_mapping+0x36/0x60
> [   38.170251]  [<ffffffff8103c2d2>] mp_unmap_irq+0x92/0xc0
> [   38.178918]  [<ffffffff81033a81>] acpi_unregister_gsi_ioapic+0x31/0x40
> [   38.188836]  [<ffffffff81033987>] acpi_unregister_gsi+0x17/0x20
> [   38.198133]  [<ffffffff81340c44>] acpi_pci_irq_disable+0x5c/0x63
> [   38.207534]  [<ffffffff81532b6e>] pcibios_disable_device+0x2e/0x40
> [   38.217056]  [<ffffffff8130bb22>] do_pci_disable_device+0x52/0x60
> [   38.226486]  [<ffffffff8130bb88>] pci_disable_device+0x58/0xd0
> [   38.235667]  [<ffffffff81459182>] usb_hcd_pci_shutdown+0x42/0x50
> [   38.245025]  [<ffffffff8130db28>] pci_device_shutdown+0x28/0x40
> [   38.254323]  [<ffffffff813cf112>] device_shutdown+0xd2/0x1f0
> [   38.263346]  [<ffffffff810754b5>] kernel_power_off+0x35/0x80
> [   38.272404]  [<ffffffff81617455>] power_down+0x45/0xa1
> [   38.280890]  [<ffffffff810a527d>] hibernate+0x19d/0x1d0
> [   38.289488]  [<ffffffff810a1c9c>] state_store+0xec/0x100
> [   38.298180]  [<ffffffff812d448f>] kobj_attr_store+0xf/0x20
> [   38.307040]  [<ffffffff811e91d0>] sysfs_kf_write+0x50/0x70
> [   38.315880]  [<ffffffff811e84e5>] kernfs_fop_write+0x105/0x190
> [   38.325035]  [<ffffffff81171dcd>] vfs_write+0xbd/0x1d0
> [   38.333471]  [<ffffffff81172942>] SyS_write+0x52/0xc0
> [   38.341680]  [<ffffffff81620216>] system_call_fastpath+0x1a/0x1f
> [   38.350836] ---[ end trace 555995a6cab7d501 ]---
> [   38.359898] ------------[ cut here ]------------
> [   38.367518] WARNING: CPU: 6 PID: 3786 at fs/proc/generic.c:521 remove_proc_entry+0x19c/0x1b0()
> [   38.377354] remove_proc_entry: removing non-empty directory 'irq/17', leaking at least 'ehci_hcd:usb1'
> [   38.387685] Modules linked in: xt_iprange ipt_MASQUERADE iptable_mangle iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_filter ip_tables x_tables cpufreq_powersave cpufreq_userspace cpufreq_stats cpufreq_conservative ipv6 vfat fat fuse dm_crypt dm_mod amd64_edac_mod edac_core fam15h_power k10temp radeon drm_kms_helper ttm acpi_cpufreq r8169
> [   38.424577] CPU: 7 PID: 3786 Comm: hib.sh Tainted: G        W     3.16.0-rc7+ #1
> [   38.433267] Hardware name: To be filled by O.E.M. To be filled by O.E.M./M5A97 EVO R2.0, BIOS 1503 01/16/2013
> [   38.444345]  0000000000000009 ffff88041e4ebb40 ffffffff81617d4e ffff88041e4ebb88
> [   38.452997]  ffff88041e4ebb78 ffffffff8104f82c ffff88042b199000 ffff880429842e30
> [   38.461711]  0000000000000002 ffff88041e4ebc4e ffff88042b1990b9 ffff88041e4ebbd8
> [   38.470362] Call Trace:
> [   38.474000]  [<ffffffff81617d4e>] dump_stack+0x4e/0x7a
> [   38.480276]  [<ffffffff8104f82c>] warn_slowpath_common+0x8c/0xc0
> [   38.487466]  [<ffffffff8104f8ac>] warn_slowpath_fmt+0x4c/0x50
> [   38.494344]  [<ffffffff811dc11c>] remove_proc_entry+0x19c/0x1b0
> [   38.501441]  [<ffffffff810b479e>] unregister_irq_proc+0xce/0xf0
> [   38.508498]  [<ffffffff810ae689>] free_desc+0x39/0x90
> [   38.514772]  [<ffffffff810ae729>] irq_free_descs+0x49/0x90
> [   38.523647]  [<ffffffff810b3f36>] irq_dispose_mapping+0x36/0x60
> [   38.532943]  [<ffffffff8103c2d2>] mp_unmap_irq+0x92/0xc0
> [   38.541628]  [<ffffffff81033a81>] acpi_unregister_gsi_ioapic+0x31/0x40
> [   38.551512]  [<ffffffff81033987>] acpi_unregister_gsi+0x17/0x20
> [   38.560841]  [<ffffffff81340c44>] acpi_pci_irq_disable+0x5c/0x63
> [   38.570218]  [<ffffffff81532b6e>] pcibios_disable_device+0x2e/0x40
> [   38.579774]  [<ffffffff8130bb22>] do_pci_disable_device+0x52/0x60
> [   38.589277]  [<ffffffff8130bb88>] pci_disable_device+0x58/0xd0
> [   38.598485]  [<ffffffff81459182>] usb_hcd_pci_shutdown+0x42/0x50
> [   38.607886]  [<ffffffff8130db28>] pci_device_shutdown+0x28/0x40
> [   38.617206]  [<ffffffff813cf112>] device_shutdown+0xd2/0x1f0
> [   38.626273]  [<ffffffff810754b5>] kernel_power_off+0x35/0x80
> [   38.635346]  [<ffffffff81617455>] power_down+0x45/0xa1
> [   38.643849]  [<ffffffff810a527d>] hibernate+0x19d/0x1d0
> [   38.652446]  [<ffffffff810a1c9c>] state_store+0xec/0x100
> [   38.661129]  [<ffffffff812d448f>] kobj_attr_store+0xf/0x20
> [   38.670047]  [<ffffffff811e91d0>] sysfs_kf_write+0x50/0x70
> [   38.678911]  [<ffffffff811e84e5>] kernfs_fop_write+0x105/0x190
> [   38.688123]  [<ffffffff81171dcd>] vfs_write+0xbd/0x1d0
> [   38.696609]  [<ffffffff81172942>] SyS_write+0x52/0xc0
> [   38.704866]  [<ffffffff81620216>] system_call_fastpath+0x1a/0x1f
> [   38.713983] ---[ end trace 555995a6cab7d502 ]---
> [   38.722834] ------------[ cut here ]------------
> [   38.730422] WARNING: CPU: 0 PID: 3786 at fs/proc/generic.c:521 remove_proc_entry+0x19c/0x1b0()
> [   38.740241] remove_proc_entry: removing non-empty directory 'irq/18', leaking at least 'ohci_hcd:usb6'
> [   38.750583] Modules linked in: xt_iprange ipt_MASQUERADE iptable_mangle iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4[   38.762023] AMD-Vi: Event logged [IO_PAGE_FAULT device=00:12.0 domain=0x0009 address=0x0000000000000080 flags=0x0020]
> [   38.762030] AMD-Vi: Event logged [IO_PAGE_FAULT device=00:12.0 domain=0x0009 address=0x0000000000000000 flags=0x0000]
> [   38.762034] AMD-Vi: Event logged [IO_PAGE_FAULT device=00:12.0 domain=0x0009 address=0x00000000ffffffc0 flags=0x0010]
> [   38.762038] AMD-Vi: Event logged [IO_PAGE_FAULT device=00:12.0 domain=0x0009 address=0x00000000ffffffc0 flags=0x0010]
> [   38.762041] AMD-Vi: Event logged [IO_PAGE_FAULT device=00:12.0 domain=0x0009 address=0x00000000ffffffc0 flags=0x0010]
> [   38.762044] AMD-Vi: Event logged [IO_PAGE_FAULT device=00:12.0 domain=0x0009 address=0x00000000ffffffc0 flags=0x0010]
> [   38.762048] AMD-Vi: Event logged [IO_PAGE_FAULT device=00:12.0 domain=0x0009 address=0x00000000ffffffc0 flags=0x0010]
> [   38.762051] AMD-Vi: Event logged [IO_PAGE_FAULT device=00:12.0 domain=0x0009 address=0x00000000ffffffc0 flags=0x0010]
> 
> lotta those...
> 
> [   45.876957] Clocksource tsc unstable (delta = 4686965640 ns)
> [   45.883734]  nf_nat nf_conntrack iptable_filter ip_tables x_tables cpufreq_powersave cpufreq_userspace cpufreq_stats cpufreq_conservative ipv6 vfat fat fuse dm_crypt dm_mod amd64_edac_mod edac_core fam15h_power k10temp radeon drm_kms_helper ttm acpi_cpufreq r8169
> [   45.909515] CPU: 0 PID: 3786 Comm: hib.sh Tainted: G        W     3.16.0-rc7+ #1
> [   45.909616] Switched to clocksource acpi_pm
> [   45.921887] Hardware name: To be filled by O.E.M. To be filled by O.E.M./M5A97 EVO R2.0, BIOS 1503 01/16/2013
> [   45.932301]  0000000000000009 ffff88041e4ebb40 ffffffff81617d4e ffff88041e4ebb88
> [   45.940316]  ffff88041e4ebb78 ffffffff8104f82c ffff880429842e00 ffff880429b2e630
> [   45.948346]  0000000000000002 ffff88041e4ebc4e ffff880429842eb9 ffff88041e4ebbd8
> [   45.956357] Call Trace:
> [   45.959304]  [<ffffffff81617d4e>] dump_stack+0x4e/0x7a
> [   45.964975]  [<ffffffff8104f82c>] warn_slowpath_common+0x8c/0xc0
> [   45.971491]  [<ffffffff8104f8ac>] warn_slowpath_fmt+0x4c/0x50
> [   45.977757]  [<ffffffff811dc11c>] remove_proc_entry+0x19c/0x1b0
> [   45.984240]  [<ffffffff810b479e>] unregister_irq_proc+0xce/0xf0
> [   45.990696]  [<ffffffff810ae689>] free_desc+0x39/0x90
> [   45.996299]  [<ffffffff810ae729>] irq_free_descs+0x49/0x90
> [   46.002380]  [<ffffffff810b3f36>] irq_dispose_mapping+0x36/0x60
> [   46.008893]  [<ffffffff8103c2d2>] mp_unmap_irq+0x92/0xc0
> [   46.014794]  [<ffffffff81033a81>] acpi_unregister_gsi_ioapic+0x31/0x40
> [   46.021911]  [<ffffffff81033987>] acpi_unregister_gsi+0x17/0x20
> [   46.028427]  [<ffffffff81340c44>] acpi_pci_irq_disable+0x5c/0x63
> [   46.035046]  [<ffffffff81532b6e>] pcibios_disable_device+0x2e/0x40
> [   46.041809]  [<ffffffff8130bb22>] do_pci_disable_device+0x52/0x60
> [   46.048495]  [<ffffffff8130bb88>] pci_disable_device+0x58/0xd0
> [   46.054996]  [<ffffffff81459182>] usb_hcd_pci_shutdown+0x42/0x50
> [   46.061611]  [<ffffffff8130db28>] pci_device_shutdown+0x28/0x40
> [   46.068202]  [<ffffffff813cf112>] device_shutdown+0xd2/0x1f0
> [   46.074543]  [<ffffffff810754b5>] kernel_power_off+0x35/0x80
> [   46.080886]  [<ffffffff81617455>] power_down+0x45/0xa1
> [   46.086695]  [<ffffffff810a527d>] hibernate+0x19d/0x1d0
> [   46.092603]  [<ffffffff810a1c9c>] state_store+0xec/0x100
> [   46.098599]  [<ffffffff812d448f>] kobj_attr_store+0xf/0x20
> [   46.104755]  [<ffffffff811e91d0>] sysfs_kf_write+0x50/0x70
> [   46.110921]  [<ffffffff811e84e5>] kernfs_fop_write+0x105/0x190
> [   46.117434]  [<ffffffff81171dcd>] vfs_write+0xbd/0x1d0
> [   46.123291]  [<ffffffff81172942>] SyS_write+0x52/0xc0
> [   46.129069]  [<ffffffff81620216>] system_call_fastpath+0x1a/0x1f
> [   46.135798] ---[ end trace 555995a6cab7d503 ]---
> [   46.142058] ACPI: Preparing to enter system sleep state S5
> [   46.148449] [Firmware Bug]: ACPI: BIOS _OSI(Linux) query honored via cmdline
> [   46.159388] reboot: Power down
> [   46.163504] acpi_power_off called
> 

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] x86, irq: Keep IRQ assignment for PCI devices during suspend/hibernation
@ 2014-07-31 14:41               ` Jiang Liu
  0 siblings, 0 replies; 24+ messages in thread
From: Jiang Liu @ 2014-07-31 14:41 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: linux-pci-u79uwXL29TY76Z2rM5mHXA, x86-DgEjT+Ai2ygdnm+yROfE0A,
	Rafael J . Wysocki, lkml, iommu, Ingo Molnar, H. Peter Anvin,
	Bjorn Helgaas, Thomas Gleixner

Hi Borislav,
	Really appreciate your help. There are two issues left
according to the log messages.
	The first issue is " WARNING: CPU: 2 PID: 3786 at
fs/proc/generic.c:521 remove_proc_entry+0x19c/0x1b0()".
It's caused by an issue in usb driver and a patch is ready to fix
this issue, please refer to https://lkml.org/lkml/2014/7/20/277.
This patch has been merged into Greg's usb repository, so it should
be merge into v3.17 too.
	The second issue is "AMD-Vi: Event logged [IO_PAGE_FAULT
device=00:12.0 domain=0x0009 address=0x0000000000000000 flags=0x0000]".
I feel it's not caused by IRQ related changes, but may be caused
by IO page fault related work for AMD IOMMU. I need more time
to investigate this and also CC Joerg for help too.
	According to the log message, I feel my patch has fixed the
original issue caused by IRQ related changes. What's your thoughts?

Regards!
Gerry

On 2014/7/31 18:39, Borislav Petkov wrote:
> On Thu, Jul 31, 2014 at 08:33:26AM +0800, Jiang Liu wrote:
>> Sorry for the incovenience, something is wrong when synchronizing
>> the patch. I have found that "pm" should be "power" when building and
>> testing this patch on my test machine. And I have also fixed it on
>> my development machine, but forgot to do "stg refresh" before "git
>> format-patch",
> 
> Yeah, use git only :-)
> 
> Anyway, still b0rked although I can get further and actually suspend.
> Full splat below.
> 
> The IOMMU PFs point to
> 
> 01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] RV635 [Radeon HD 3650/3750/4570/4580]
> 
> and
> 
> 00:12.0 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI0 Controller
> 
> respectively.
> 
> and the IOPFs are all this OHCI controller. Which points at a timing
> problem during suspend, I'm being told, when the device gets returned
> to the BIOS but IOMMU is still on and filters accesses. Or something to
> that effect. In any case, still b0rked.
> 
> -> start suspend to disk:
> 
> [   31.422901] hib.sh (3786): drop_caches: 3
> [   31.427932] PM: Hibernation mode set to 'shutdown'
> [   31.444522] PM: Syncing filesystems ... done.
> [   31.452558] Freezing user space processes ... (elapsed 0.002 seconds) done.
> [   31.462404] PM: Marking nosave pages: [mem 0x0009e000-0x000fffff]
> [   31.468544] PM: Marking nosave pages: [mem 0xba9b8000-0xbca4dfff]
> [   31.474868] PM: Marking nosave pages: [mem 0xbca4f000-0xbcc54fff]
> [   31.480995] PM: Marking nosave pages: [mem 0xbd083000-0xbd7f3fff]
> [   31.487136] PM: Marking nosave pages: [mem 0xbd800000-0x100000fff]
> [   31.493891] PM: Basic memory bitmaps created
> [   31.498178] PM: Preallocating image memory... done (allocated 143539 pages)
> [   32.001145] PM: Allocated 574156 kbytes in 0.49 seconds (1171.74 MB/s)
> [   32.007690] Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
> [   33.220531] serial 00:06: activated
> [   33.220959] [drm] PCIE gen 2 link speeds already enabled
> [   33.228236] [drm] PCIE GART of 512M enabled (table at 0x0000000000040000).
> [   33.228340] radeon 0000:01:00.0: WB enabled
> [   33.228347] radeon 0000:01:00.0: fence driver on ring 0 use gpu addr 0x0000000020000c00 and cpu addr 0xffff88042939ac00
> [   33.233314] snd_hda_intel 0000:01:00.1: irq 49 for MSI/MSI-X
> [   33.259732] [drm] ring test on 0 succeeded in 0 usecs
> [   33.259890] [drm] ib test on ring 0 succeeded in 0 usecs
> [   33.319713] r8169 0000:02:00.0 eth0: link down
> [   33.448144] PM: thaw of devices complete after 229.835 msecs
> [   33.455071] PM: writing image.
> [   33.535747] ata5: SATA link down (SStatus 0 SControl 300)
> [   33.541393] ata6: SATA link down (SStatus 0 SControl 300)
> [   33.707603] ata4: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
> [   33.714056] ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
> [   33.720477] ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
> [   33.728938] ata4.00: configured for UDMA/133
> [   33.734567] ata3.00: configured for UDMA/133
> [   33.739453] ata2.00: configured for UDMA/133
> [   33.745111] PM: Using 3 thread(s) for compression.
> [   33.745111] PM: Compressing and saving image data (149256 pages)...
> [   33.756336] PM: Image saving progress:   0%
> [   33.915592] PM: Image saving progress:  10%
> [   33.995331] PM: Image saving progress:  20%
> [   34.123934] PM: Image saving progress:  30%
> [   34.362314] PM: Image saving progress:  40%
> [   34.463433] PM: Image saving progress:  50%
> [   34.550636] PM: Image saving progress:  60%
> [   34.651601] PM: Image saving progress:  70%
> [   34.756381] PM: Image saving progress:  80%
> [   34.863658] PM: Image saving progress:  90%
> [   34.973828] PM: Image saving progress: 100%
> [   34.993966] PM: Image saving done.
> [   34.998214] PM: Wrote 597024 kbytes in 1.23 seconds (485.38 MB/s)
> [   35.005742] PM: S|
> [   35.122141] r8169 0000:02:00.0 eth0: link up
> [   35.368945] sd 3:0:0:0: [sdc] Synchronizing SCSI cache
> [   35.376839] sd 3:0:0:0: [sdc] Stopping disk
> [   35.898610] sd 2:0:0:0: [sdb] Synchronizing SCSI cache
> [   35.906277] sd 2:0:0:0: [sdb] Stopping disk
> [   36.783830] sd 1:0:0:0: [sda] Synchronizing SCSI cache
> [   36.792035] sd 1:0:0:0: [sda] Stopping disk
> [   36.833993] pcieport 0000:00:04.0: System wakeup enabled by ACPI
> [   36.855314] ------------[ cut here ]------------
> [   36.862344] WARNING: CPU: 2 PID: 3786 at fs/proc/generic.c:521 remove_proc_entry+0x19c/0x1b0()
> [   36.864089] AMD-Vi: Event logged [IO_PAGE_FAULT device=01:00.0 domain=0x0014 address=0x0000000020001000 flags=0x0000]
> [   36.886423] remove_proc_entry: removing non-empty directory 'irq/23', leaking at least 'ehci_hcd:usb3'
> [   36.898325] Modules linked in: xt_iprange ipt_MASQUERADE iptable_mangle iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_filter ip_tables x_tables cpufreq_powersave cpufreq_userspace cpufreq_stats cpufreq_conservative ipv6 vfat fat fuse dm_crypt dm_mod amd64_edac_mod edac_core fam15h_power k10temp radeon drm_kms_helper ttm acpi_cpufreq r8169
> [   36.934730] CPU: 3 PID: 3786 Comm: hib.sh Not tainted 3.16.0-rc7+ #1
> [   36.942103] Hardware name: To be filled by O.E.M. To be filled by O.E.M./M5A97 EVO R2.0, BIOS 1503 01/16/2013
> [   36.953051]  0000000000000009 ffff88041e4ebb40 ffffffff81617d4e ffff88041e4ebb88
> [   36.961616]  ffff88041e4ebb78 ffffffff8104f82c ffff880037a91e00 ffff880429842e30
> [   36.970232]  0000000000000002 ffff88041e4ebc4e ffff880037a91eb9 ffff88041e4ebbd8
> [   36.978822] Call Trace:
> [   36.982324]  [<ffffffff81617d4e>] dump_stack+0x4e/0x7a
> [   36.988496]  [<ffffffff8104f82c>] warn_slowpath_common+0x8c/0xc0
> [   36.995558]  [<ffffffff8104f8ac>] warn_slowpath_fmt+0x4c/0x50
> [   37.002395]  [<ffffffff811dc11c>] remove_proc_entry+0x19c/0x1b0
> [   37.011279]  [<ffffffff810b479e>] unregister_irq_proc+0xce/0xf0
> [   37.020129]  [<ffffffff810ae689>] free_desc+0x39/0x90
> [   37.028097]  [<ffffffff810ae729>] irq_free_descs+0x49/0x90
> [   37.036478]  [<ffffffff810b3f36>] irq_dispose_mapping+0x36/0x60
> [   37.045329]  [<ffffffff8103c2d2>] mp_unmap_irq+0x92/0xc0
> [   37.053540]  [<ffffffff81033a81>] acpi_unregister_gsi_ioapic+0x31/0x40
> [   37.063012]  [<ffffffff81033987>] acpi_unregister_gsi+0x17/0x20
> [   37.071870]  [<ffffffff81340c44>] acpi_pci_irq_disable+0x5c/0x63
> [   37.080756]  [<ffffffff81532b6e>] pcibios_disable_device+0x2e/0x40
> [   37.089838]  [<ffffffff8130bb22>] do_pci_disable_device+0x52/0x60
> [   37.098860]  [<ffffffff8130bb88>] pci_disable_device+0x58/0xd0
> [   37.107561]  [<ffffffff81459182>] usb_hcd_pci_shutdown+0x42/0x50
> [   37.116425]  [<ffffffff8130db28>] pci_device_shutdown+0x28/0x40
> [   37.125248]  [<ffffffff813cf112>] device_shutdown+0xd2/0x1f0
> [   37.133792]  [<ffffffff810754b5>] kernel_power_off+0x35/0x80
> [   37.142348]  [<ffffffff81617455>] power_down+0x45/0xa1
> [   37.150335]  [<ffffffff810a527d>] hibernate+0x19d/0x1d0
> [   37.158377]  [<ffffffff810a1c9c>] state_store+0xec/0x100
> [   37.166526]  [<ffffffff812d448f>] kobj_attr_store+0xf/0x20
> [   37.174858]  [<ffffffff811e91d0>] sysfs_kf_write+0x50/0x70
> [   37.183144]  [<ffffffff811e84e5>] kernfs_fop_write+0x105/0x190
> [   37.191700]  [<ffffffff81171dcd>] vfs_write+0xbd/0x1d0
> [   37.199609]  [<ffffffff81172942>] SyS_write+0x52/0xc0
> [   37.207357]  [<ffffffff81620216>] system_call_fastpath+0x1a/0x1f
> [   37.216091] ---[ end trace 555995a6cab7d4fe ]---
> [   37.222168] ------------[ cut here ]------------
> [   37.227753] WARNING: CPU: 2 PID: 3786 at fs/proc/generic.c:521 remove_proc_entry+0x19c/0x1b0()
> [   37.237344] remove_proc_entry: removing non-empty directory 'irq/22', leaking at least 'ohci_hcd:usb7'
> [   37.247595] Modules linked in: xt_iprange ipt_MASQUERADE iptable_mangle iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_filter ip_tables x_tables cpufreq_powersave cpufreq_userspace cpufreq_stats cpufreq_conservative ipv6 vfat fat fuse dm_crypt dm_mod amd64_edac_mod edac_core fam15h_power k10temp radeon drm_kms_helper ttm acpi_cpufreq r8169
> [   37.284164] CPU: 3 PID: 3786 Comm: hib.sh Tainted: G        W     3.16.0-rc7+ #1
> [   37.292669] Hardware name: To be filled by O.E.M. To be filled by O.E.M./M5A97 EVO R2.0, BIOS 1503 01/16/2013
> [   37.303684]  0000000000000009 ffff88041e4ebb40 ffffffff81617d4e ffff88041e4ebb88
> [   37.312346]  ffff88041e4ebb78 ffffffff8104f82c ffff8804298b7500 ffff880429b2e630
> [   37.321035]  0000000000000002 ffff88041e4ebc4e ffff8804298b75b9 ffff88041e4ebbd8
> [   37.329713] Call Trace:
> [   37.333294]  [<ffffffff81617d4e>] dump_stack+0x4e/0x7a
> [   37.339569]  [<ffffffff8104f82c>] warn_slowpath_common+0x8c/0xc0
> [   37.346722]  [<ffffffff8104f8ac>] warn_slowpath_fmt+0x4c/0x50
> [   37.353623]  [<ffffffff811dc11c>] remove_proc_entry+0x19c/0x1b0
> [   37.360688]  [<ffffffff810b479e>] unregister_irq_proc+0xce/0xf0
> [   37.367776]  [<ffffffff810ae689>] free_desc+0x39/0x90
> [   37.374023]  [<ffffffff810ae729>] irq_free_descs+0x49/0x90
> [   37.382851]  [<ffffffff810b3f36>] irq_dispose_mapping+0x36/0x60
> [   37.392223]  [<ffffffff8103c2d2>] mp_unmap_irq+0x92/0xc0
> [   37.400915]  [<ffffffff81033a81>] acpi_unregister_gsi_ioapic+0x31/0x40
> [   37.410875]  [<ffffffff81033987>] acpi_unregister_gsi+0x17/0x20
> [   37.420244]  [<ffffffff81340c44>] acpi_pci_irq_disable+0x5c/0x63
> [   37.429713]  [<ffffffff81532b6e>] pcibios_disable_device+0x2e/0x40
> [   37.439354]  [<ffffffff8130bb22>] do_pci_disable_device+0x52/0x60
> [   37.448906]  [<ffffffff8130bb88>] pci_disable_device+0x58/0xd0
> [   37.458130]  [<ffffffff81459182>] usb_hcd_pci_shutdown+0x42/0x50
> [   37.467572]  [<ffffffff8130db28>] pci_device_shutdown+0x28/0x40
> [   37.476968]  [<ffffffff813cf112>] device_shutdown+0xd2/0x1f0
> [   37.486079]  [<ffffffff810754b5>] kernel_power_off+0x35/0x80
> [   37.495206]  [<ffffffff81617455>] power_down+0x45/0xa1
> [   37.503789]  [<ffffffff810a527d>] hibernate+0x19d/0x1d0
> [   37.512466]  [<ffffffff810a1c9c>] state_store+0xec/0x100
> [   37.521217]  [<ffffffff812d448f>] kobj_attr_store+0xf/0x20
> [   37.530104]  [<ffffffff811e91d0>] sysfs_kf_write+0x50/0x70
> [   37.538959]  [<ffffffff811e84e5>] kernfs_fop_write+0x105/0x190
> [   37.548096]  [<ffffffff81171dcd>] vfs_write+0xbd/0x1d0
> [   37.556513]  [<ffffffff81172942>] SyS_write+0x52/0xc0
> [   37.564743]  [<ffffffff81620216>] system_call_fastpath+0x1a/0x1f
> [   37.573837] ---[ end trace 555995a6cab7d4ff ]---
> [   37.583013] ------------[ cut here ]------------
> [   37.588736] WARNING: CPU: 0 PID: 3786 at fs/proc/generic.c:521 remove_proc_entry+0x19c/0x1b0()
> [   37.598369] remove_proc_entry: removing non-empty directory 'irq/21', leaking at least 'ehci_hcd:usb2'
> [   37.608655] Modules linked in: xt_iprange ipt_MASQUERADE iptable_mangle iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_filter ip_tables x_tables cpufreq_powersave cpufreq_userspace cpufreq_stats cpufreq_conservative ipv6 vfat fat fuse dm_crypt dm_mod amd64_edac_mod edac_core fam15h_power k10temp radeon drm_kms_helper ttm acpi_cpufreq r8169
> [   37.647451] CPU: 4 PID: 3786 Comm: hib.sh Tainted: G        W     3.16.0-rc7+ #1
> [   37.658115] Hardware name: To be filled by O.E.M. To be filled by O.E.M./M5A97 EVO R2.0, BIOS 1503 01/16/2013
> [   37.671304]  0000000000000009 ffff88041e4ebb40 ffffffff81617d4e ffff88041e4ebb88
> [   37.682276]  ffff88041e4ebb78 ffffffff8104f82c ffff880429f6c000 ffff880429842e30
> [   37.693179]  0000000000000002 ffff88041e4ebc4e ffff880429f6c0b9 ffff88041e4ebbd8
> [   37.704132] Call Trace:
> [   37.709898]  [<ffffffff81617d4e>] dump_stack+0x4e/0x7a
> [   37.718393]  [<ffffffff8104f82c>] warn_slowpath_common+0x8c/0xc0
> [   37.727774]  [<ffffffff8104f8ac>] warn_slowpath_fmt+0x4c/0x50
> [   37.736888]  [<ffffffff811dc11c>] remove_proc_entry+0x19c/0x1b0
> [   37.746167]  [<ffffffff810b479e>] unregister_irq_proc+0xce/0xf0
> [   37.755444]  [<ffffffff810ae689>] free_desc+0x39/0x90
> [   37.763897]  [<ffffffff810ae729>] irq_free_descs+0x49/0x90
> [   37.772742]  [<ffffffff810b3f36>] irq_dispose_mapping+0x36/0x60
> [   37.782019]  [<ffffffff8103c2d2>] mp_unmap_irq+0x92/0xc0
> [   37.790656]  [<ffffffff81033a81>] acpi_unregister_gsi_ioapic+0x31/0x40
> [   37.800534]  [<ffffffff81033987>] acpi_unregister_gsi+0x17/0x20
> [   37.809810]  [<ffffffff81340c44>] acpi_pci_irq_disable+0x5c/0x63
> [   37.819130]  [<ffffffff81532b6e>] pcibios_disable_device+0x2e/0x40
> [   37.828650]  [<ffffffff8130bb22>] do_pci_disable_device+0x52/0x60
> [   37.838120]  [<ffffffff8130bb88>] pci_disable_device+0x58/0xd0
> [   37.847324]  [<ffffffff81459182>] usb_hcd_pci_shutdown+0x42/0x50
> [   37.856698]  [<ffffffff8130db28>] pci_device_shutdown+0x28/0x40
> [   37.866005]  [<ffffffff813cf112>] device_shutdown+0xd2/0x1f0
> [   37.875035]  [<ffffffff810754b5>] kernel_power_off+0x35/0x80
> [   37.884060]  [<ffffffff81617455>] power_down+0x45/0xa1
> [   37.892580]  [<ffffffff810a527d>] hibernate+0x19d/0x1d0
> [   37.901173]  [<ffffffff810a1c9c>] state_store+0xec/0x100
> [   37.909851]  [<ffffffff812d448f>] kobj_attr_store+0xf/0x20
> [   37.918705]  [<ffffffff811e91d0>] sysfs_kf_write+0x50/0x70
> [   37.927558]  [<ffffffff811e84e5>] kernfs_fop_write+0x105/0x190
> [   37.936710]  [<ffffffff81171dcd>] vfs_write+0xbd/0x1d0
> [   37.945157]  [<ffffffff81172942>] SyS_write+0x52/0xc0
> [   37.953358]  [<ffffffff81620216>] system_call_fastpath+0x1a/0x1f
> [   37.962486] ---[ end trace 555995a6cab7d500 ]---
> [   37.971327] ------------[ cut here ]------------
> [   37.978904] WARNING: CPU: 2 PID: 3786 at fs/proc/generic.c:521 remove_proc_entry+0x19c/0x1b0()
> [   37.990542] remove_proc_entry: removing non-empty directory 'irq/20', leaking at least 'ohci_hcd:usb5'
> [   38.002874] Modules linked in: xt_iprange ipt_MASQUERADE iptable_mangle iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_filter ip_tables x_tables cpufreq_powersave cpufreq_userspace cpufreq_stats cpufreq_conservative ipv6 vfat fat fuse dm_crypt dm_mod amd64_edac_mod edac_core fam15h_power k10temp radeon drm_kms_helper ttm acpi_cpufreq r8169
> [   38.039571] CPU: 3 PID: 3786 Comm: hib.sh Tainted: G        W     3.16.0-rc7+ #1
> [   38.048075] Hardware name: To be filled by O.E.M. To be filled by O.E.M./M5A97 EVO R2.0, BIOS 1503 01/16/2013
> [   38.059363]  0000000000000009 ffff88041e4ebb40 ffffffff81617d4e ffff88041e4ebb88
> [   38.070381]  ffff88041e4ebb78 ffffffff8104f82c ffff880429843e00 ffff880429b2e630
> [   38.081335]  0000000000000002 ffff88041e4ebc4e ffff880429843eb9 ffff88041e4ebbd8
> [   38.092303] Call Trace:
> [   38.098120]  [<ffffffff81617d4e>] dump_stack+0x4e/0x7a
> [   38.106619]  [<ffffffff8104f82c>] warn_slowpath_common+0x8c/0xc0
> [   38.115986]  [<ffffffff8104f8ac>] warn_slowpath_fmt+0x4c/0x50
> [   38.125119]  [<ffffffff811dc11c>] remove_proc_entry+0x19c/0x1b0
> [   38.134401]  [<ffffffff810b479e>] unregister_irq_proc+0xce/0xf0
> [   38.143665]  [<ffffffff810ae689>] free_desc+0x39/0x90
> [   38.152113]  [<ffffffff810ae729>] irq_free_descs+0x49/0x90
> [   38.160971]  [<ffffffff810b3f36>] irq_dispose_mapping+0x36/0x60
> [   38.170251]  [<ffffffff8103c2d2>] mp_unmap_irq+0x92/0xc0
> [   38.178918]  [<ffffffff81033a81>] acpi_unregister_gsi_ioapic+0x31/0x40
> [   38.188836]  [<ffffffff81033987>] acpi_unregister_gsi+0x17/0x20
> [   38.198133]  [<ffffffff81340c44>] acpi_pci_irq_disable+0x5c/0x63
> [   38.207534]  [<ffffffff81532b6e>] pcibios_disable_device+0x2e/0x40
> [   38.217056]  [<ffffffff8130bb22>] do_pci_disable_device+0x52/0x60
> [   38.226486]  [<ffffffff8130bb88>] pci_disable_device+0x58/0xd0
> [   38.235667]  [<ffffffff81459182>] usb_hcd_pci_shutdown+0x42/0x50
> [   38.245025]  [<ffffffff8130db28>] pci_device_shutdown+0x28/0x40
> [   38.254323]  [<ffffffff813cf112>] device_shutdown+0xd2/0x1f0
> [   38.263346]  [<ffffffff810754b5>] kernel_power_off+0x35/0x80
> [   38.272404]  [<ffffffff81617455>] power_down+0x45/0xa1
> [   38.280890]  [<ffffffff810a527d>] hibernate+0x19d/0x1d0
> [   38.289488]  [<ffffffff810a1c9c>] state_store+0xec/0x100
> [   38.298180]  [<ffffffff812d448f>] kobj_attr_store+0xf/0x20
> [   38.307040]  [<ffffffff811e91d0>] sysfs_kf_write+0x50/0x70
> [   38.315880]  [<ffffffff811e84e5>] kernfs_fop_write+0x105/0x190
> [   38.325035]  [<ffffffff81171dcd>] vfs_write+0xbd/0x1d0
> [   38.333471]  [<ffffffff81172942>] SyS_write+0x52/0xc0
> [   38.341680]  [<ffffffff81620216>] system_call_fastpath+0x1a/0x1f
> [   38.350836] ---[ end trace 555995a6cab7d501 ]---
> [   38.359898] ------------[ cut here ]------------
> [   38.367518] WARNING: CPU: 6 PID: 3786 at fs/proc/generic.c:521 remove_proc_entry+0x19c/0x1b0()
> [   38.377354] remove_proc_entry: removing non-empty directory 'irq/17', leaking at least 'ehci_hcd:usb1'
> [   38.387685] Modules linked in: xt_iprange ipt_MASQUERADE iptable_mangle iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_filter ip_tables x_tables cpufreq_powersave cpufreq_userspace cpufreq_stats cpufreq_conservative ipv6 vfat fat fuse dm_crypt dm_mod amd64_edac_mod edac_core fam15h_power k10temp radeon drm_kms_helper ttm acpi_cpufreq r8169
> [   38.424577] CPU: 7 PID: 3786 Comm: hib.sh Tainted: G        W     3.16.0-rc7+ #1
> [   38.433267] Hardware name: To be filled by O.E.M. To be filled by O.E.M./M5A97 EVO R2.0, BIOS 1503 01/16/2013
> [   38.444345]  0000000000000009 ffff88041e4ebb40 ffffffff81617d4e ffff88041e4ebb88
> [   38.452997]  ffff88041e4ebb78 ffffffff8104f82c ffff88042b199000 ffff880429842e30
> [   38.461711]  0000000000000002 ffff88041e4ebc4e ffff88042b1990b9 ffff88041e4ebbd8
> [   38.470362] Call Trace:
> [   38.474000]  [<ffffffff81617d4e>] dump_stack+0x4e/0x7a
> [   38.480276]  [<ffffffff8104f82c>] warn_slowpath_common+0x8c/0xc0
> [   38.487466]  [<ffffffff8104f8ac>] warn_slowpath_fmt+0x4c/0x50
> [   38.494344]  [<ffffffff811dc11c>] remove_proc_entry+0x19c/0x1b0
> [   38.501441]  [<ffffffff810b479e>] unregister_irq_proc+0xce/0xf0
> [   38.508498]  [<ffffffff810ae689>] free_desc+0x39/0x90
> [   38.514772]  [<ffffffff810ae729>] irq_free_descs+0x49/0x90
> [   38.523647]  [<ffffffff810b3f36>] irq_dispose_mapping+0x36/0x60
> [   38.532943]  [<ffffffff8103c2d2>] mp_unmap_irq+0x92/0xc0
> [   38.541628]  [<ffffffff81033a81>] acpi_unregister_gsi_ioapic+0x31/0x40
> [   38.551512]  [<ffffffff81033987>] acpi_unregister_gsi+0x17/0x20
> [   38.560841]  [<ffffffff81340c44>] acpi_pci_irq_disable+0x5c/0x63
> [   38.570218]  [<ffffffff81532b6e>] pcibios_disable_device+0x2e/0x40
> [   38.579774]  [<ffffffff8130bb22>] do_pci_disable_device+0x52/0x60
> [   38.589277]  [<ffffffff8130bb88>] pci_disable_device+0x58/0xd0
> [   38.598485]  [<ffffffff81459182>] usb_hcd_pci_shutdown+0x42/0x50
> [   38.607886]  [<ffffffff8130db28>] pci_device_shutdown+0x28/0x40
> [   38.617206]  [<ffffffff813cf112>] device_shutdown+0xd2/0x1f0
> [   38.626273]  [<ffffffff810754b5>] kernel_power_off+0x35/0x80
> [   38.635346]  [<ffffffff81617455>] power_down+0x45/0xa1
> [   38.643849]  [<ffffffff810a527d>] hibernate+0x19d/0x1d0
> [   38.652446]  [<ffffffff810a1c9c>] state_store+0xec/0x100
> [   38.661129]  [<ffffffff812d448f>] kobj_attr_store+0xf/0x20
> [   38.670047]  [<ffffffff811e91d0>] sysfs_kf_write+0x50/0x70
> [   38.678911]  [<ffffffff811e84e5>] kernfs_fop_write+0x105/0x190
> [   38.688123]  [<ffffffff81171dcd>] vfs_write+0xbd/0x1d0
> [   38.696609]  [<ffffffff81172942>] SyS_write+0x52/0xc0
> [   38.704866]  [<ffffffff81620216>] system_call_fastpath+0x1a/0x1f
> [   38.713983] ---[ end trace 555995a6cab7d502 ]---
> [   38.722834] ------------[ cut here ]------------
> [   38.730422] WARNING: CPU: 0 PID: 3786 at fs/proc/generic.c:521 remove_proc_entry+0x19c/0x1b0()
> [   38.740241] remove_proc_entry: removing non-empty directory 'irq/18', leaking at least 'ohci_hcd:usb6'
> [   38.750583] Modules linked in: xt_iprange ipt_MASQUERADE iptable_mangle iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4[   38.762023] AMD-Vi: Event logged [IO_PAGE_FAULT device=00:12.0 domain=0x0009 address=0x0000000000000080 flags=0x0020]
> [   38.762030] AMD-Vi: Event logged [IO_PAGE_FAULT device=00:12.0 domain=0x0009 address=0x0000000000000000 flags=0x0000]
> [   38.762034] AMD-Vi: Event logged [IO_PAGE_FAULT device=00:12.0 domain=0x0009 address=0x00000000ffffffc0 flags=0x0010]
> [   38.762038] AMD-Vi: Event logged [IO_PAGE_FAULT device=00:12.0 domain=0x0009 address=0x00000000ffffffc0 flags=0x0010]
> [   38.762041] AMD-Vi: Event logged [IO_PAGE_FAULT device=00:12.0 domain=0x0009 address=0x00000000ffffffc0 flags=0x0010]
> [   38.762044] AMD-Vi: Event logged [IO_PAGE_FAULT device=00:12.0 domain=0x0009 address=0x00000000ffffffc0 flags=0x0010]
> [   38.762048] AMD-Vi: Event logged [IO_PAGE_FAULT device=00:12.0 domain=0x0009 address=0x00000000ffffffc0 flags=0x0010]
> [   38.762051] AMD-Vi: Event logged [IO_PAGE_FAULT device=00:12.0 domain=0x0009 address=0x00000000ffffffc0 flags=0x0010]
> 
> lotta those...
> 
> [   45.876957] Clocksource tsc unstable (delta = 4686965640 ns)
> [   45.883734]  nf_nat nf_conntrack iptable_filter ip_tables x_tables cpufreq_powersave cpufreq_userspace cpufreq_stats cpufreq_conservative ipv6 vfat fat fuse dm_crypt dm_mod amd64_edac_mod edac_core fam15h_power k10temp radeon drm_kms_helper ttm acpi_cpufreq r8169
> [   45.909515] CPU: 0 PID: 3786 Comm: hib.sh Tainted: G        W     3.16.0-rc7+ #1
> [   45.909616] Switched to clocksource acpi_pm
> [   45.921887] Hardware name: To be filled by O.E.M. To be filled by O.E.M./M5A97 EVO R2.0, BIOS 1503 01/16/2013
> [   45.932301]  0000000000000009 ffff88041e4ebb40 ffffffff81617d4e ffff88041e4ebb88
> [   45.940316]  ffff88041e4ebb78 ffffffff8104f82c ffff880429842e00 ffff880429b2e630
> [   45.948346]  0000000000000002 ffff88041e4ebc4e ffff880429842eb9 ffff88041e4ebbd8
> [   45.956357] Call Trace:
> [   45.959304]  [<ffffffff81617d4e>] dump_stack+0x4e/0x7a
> [   45.964975]  [<ffffffff8104f82c>] warn_slowpath_common+0x8c/0xc0
> [   45.971491]  [<ffffffff8104f8ac>] warn_slowpath_fmt+0x4c/0x50
> [   45.977757]  [<ffffffff811dc11c>] remove_proc_entry+0x19c/0x1b0
> [   45.984240]  [<ffffffff810b479e>] unregister_irq_proc+0xce/0xf0
> [   45.990696]  [<ffffffff810ae689>] free_desc+0x39/0x90
> [   45.996299]  [<ffffffff810ae729>] irq_free_descs+0x49/0x90
> [   46.002380]  [<ffffffff810b3f36>] irq_dispose_mapping+0x36/0x60
> [   46.008893]  [<ffffffff8103c2d2>] mp_unmap_irq+0x92/0xc0
> [   46.014794]  [<ffffffff81033a81>] acpi_unregister_gsi_ioapic+0x31/0x40
> [   46.021911]  [<ffffffff81033987>] acpi_unregister_gsi+0x17/0x20
> [   46.028427]  [<ffffffff81340c44>] acpi_pci_irq_disable+0x5c/0x63
> [   46.035046]  [<ffffffff81532b6e>] pcibios_disable_device+0x2e/0x40
> [   46.041809]  [<ffffffff8130bb22>] do_pci_disable_device+0x52/0x60
> [   46.048495]  [<ffffffff8130bb88>] pci_disable_device+0x58/0xd0
> [   46.054996]  [<ffffffff81459182>] usb_hcd_pci_shutdown+0x42/0x50
> [   46.061611]  [<ffffffff8130db28>] pci_device_shutdown+0x28/0x40
> [   46.068202]  [<ffffffff813cf112>] device_shutdown+0xd2/0x1f0
> [   46.074543]  [<ffffffff810754b5>] kernel_power_off+0x35/0x80
> [   46.080886]  [<ffffffff81617455>] power_down+0x45/0xa1
> [   46.086695]  [<ffffffff810a527d>] hibernate+0x19d/0x1d0
> [   46.092603]  [<ffffffff810a1c9c>] state_store+0xec/0x100
> [   46.098599]  [<ffffffff812d448f>] kobj_attr_store+0xf/0x20
> [   46.104755]  [<ffffffff811e91d0>] sysfs_kf_write+0x50/0x70
> [   46.110921]  [<ffffffff811e84e5>] kernfs_fop_write+0x105/0x190
> [   46.117434]  [<ffffffff81171dcd>] vfs_write+0xbd/0x1d0
> [   46.123291]  [<ffffffff81172942>] SyS_write+0x52/0xc0
> [   46.129069]  [<ffffffff81620216>] system_call_fastpath+0x1a/0x1f
> [   46.135798] ---[ end trace 555995a6cab7d503 ]---
> [   46.142058] ACPI: Preparing to enter system sleep state S5
> [   46.148449] [Firmware Bug]: ACPI: BIOS _OSI(Linux) query honored via cmdline
> [   46.159388] reboot: Power down
> [   46.163504] acpi_power_off called
> 

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] x86, irq: Keep IRQ assignment for PCI devices during suspend/hibernation
@ 2014-07-31 15:21                 ` Borislav Petkov
  0 siblings, 0 replies; 24+ messages in thread
From: Borislav Petkov @ 2014-07-31 15:21 UTC (permalink / raw)
  To: Jiang Liu
  Cc: Rafael J . Wysocki, Thomas Gleixner, Bjorn Helgaas, Ingo Molnar,
	H. Peter Anvin, x86, linux-pci, lkml, Joerg Roedel, iommu

Hi,

On Thu, Jul 31, 2014 at 10:41:36PM +0800, Jiang Liu wrote:
> 	Really appreciate your help. There are two issues left
> according to the log messages.

first of all:

A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
A: Top-posting.
Q: What is the most annoying thing in e-mail?

So please refrain from top-posting on lkml.

> 	The first issue is " WARNING: CPU: 2 PID: 3786 at
> fs/proc/generic.c:521 remove_proc_entry+0x19c/0x1b0()".
> It's caused by an issue in usb driver and a patch is ready to fix
> this issue, please refer to https://lkml.org/lkml/2014/7/20/277.
> This patch has been merged into Greg's usb repository, so it should
> be merge into v3.17 too.

I can apply this patch to test.

> 	The second issue is "AMD-Vi: Event logged [IO_PAGE_FAULT
> device=00:12.0 domain=0x0009 address=0x0000000000000000 flags=0x0000]".
> I feel it's not caused by IRQ related changes, but may be caused
> by IO page fault related work for AMD IOMMU. I need more time
> to investigate this and also CC Joerg for help too.

I don't think that's the issue - plain rc7 doesn't trigger the issue.
Only if I merge tip/master in it happens. And there are no AMD iommu
patches in tip/master since Joerg doesn't send stuff through tip
anymore.

> 	According to the log message, I feel my patch has fixed the
> original issue caused by IRQ related changes. What's your thoughts?

I still think the issue is in tip/master. I'm currently trying to
reproduce on another box so that I can bisect it. (Bisecting on my
workstation is always nasty :-\).

Thanks.

-- 
Regards/Gruss,
    Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] x86, irq: Keep IRQ assignment for PCI devices during suspend/hibernation
@ 2014-07-31 15:21                 ` Borislav Petkov
  0 siblings, 0 replies; 24+ messages in thread
From: Borislav Petkov @ 2014-07-31 15:21 UTC (permalink / raw)
  To: Jiang Liu
  Cc: linux-pci-u79uwXL29TY76Z2rM5mHXA, x86-DgEjT+Ai2ygdnm+yROfE0A,
	Rafael J . Wysocki, lkml, iommu, Ingo Molnar, H. Peter Anvin,
	Bjorn Helgaas, Thomas Gleixner

Hi,

On Thu, Jul 31, 2014 at 10:41:36PM +0800, Jiang Liu wrote:
> 	Really appreciate your help. There are two issues left
> according to the log messages.

first of all:

A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
A: Top-posting.
Q: What is the most annoying thing in e-mail?

So please refrain from top-posting on lkml.

> 	The first issue is " WARNING: CPU: 2 PID: 3786 at
> fs/proc/generic.c:521 remove_proc_entry+0x19c/0x1b0()".
> It's caused by an issue in usb driver and a patch is ready to fix
> this issue, please refer to https://lkml.org/lkml/2014/7/20/277.
> This patch has been merged into Greg's usb repository, so it should
> be merge into v3.17 too.

I can apply this patch to test.

> 	The second issue is "AMD-Vi: Event logged [IO_PAGE_FAULT
> device=00:12.0 domain=0x0009 address=0x0000000000000000 flags=0x0000]".
> I feel it's not caused by IRQ related changes, but may be caused
> by IO page fault related work for AMD IOMMU. I need more time
> to investigate this and also CC Joerg for help too.

I don't think that's the issue - plain rc7 doesn't trigger the issue.
Only if I merge tip/master in it happens. And there are no AMD iommu
patches in tip/master since Joerg doesn't send stuff through tip
anymore.

> 	According to the log message, I feel my patch has fixed the
> original issue caused by IRQ related changes. What's your thoughts?

I still think the issue is in tip/master. I'm currently trying to
reproduce on another box so that I can bisect it. (Bisecting on my
workstation is always nasty :-\).

Thanks.

-- 
Regards/Gruss,
    Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] x86, irq: Keep IRQ assignment for PCI devices during suspend/hibernation
  2014-07-31 10:39           ` Borislav Petkov
  2014-07-31 14:41               ` Jiang Liu
@ 2014-07-31 16:36             ` Jiang Liu
  2014-07-31 16:56               ` Borislav Petkov
  1 sibling, 1 reply; 24+ messages in thread
From: Jiang Liu @ 2014-07-31 16:36 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Rafael J . Wysocki, Thomas Gleixner, Bjorn Helgaas, Ingo Molnar,
	H. Peter Anvin, x86, linux-pci, lkml



On 2014/7/31 18:39, Borislav Petkov wrote:
> On Thu, Jul 31, 2014 at 08:33:26AM +0800, Jiang Liu wrote:
>> Sorry for the incovenience, something is wrong when synchronizing
>> the patch. I have found that "pm" should be "power" when building and
>> testing this patch on my test machine. And I have also fixed it on
>> my development machine, but forgot to do "stg refresh" before "git
>> format-patch",
> 
> Yeah, use git only :-)
> 
> Anyway, still b0rked although I can get further and actually suspend.
> Full splat below.
> 
> The IOMMU PFs point to
> 
> 01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] RV635 [Radeon HD 3650/3750/4570/4580]
> 
> and
> 
> 00:12.0 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI0 Controller
> 
> respectively.
> 
> and the IOPFs are all this OHCI controller. Which points at a timing
> problem during suspend, I'm being told, when the device gets returned
> to the BIOS but IOMMU is still on and filters accesses. Or something to
> that effect. In any case, still b0rked.
Hi Borislav,
	I might have a theory based on above statement.
1) There's a race window between device_shutdown() and
syscore_shutdown(). USB device may be returned to BIOS
during device_shutdown() and IOMMU units are disabled
during syscore_shutdown().
2) Printing debug messages enlarges the race window,
especially when using serial console.
Then it triggers warning message "AMD-Vi: Event logged [IO_PAGE_FAULT
device=00:12.0 domain=0x0009 address=0x00000000ffffffc0 flags=0x0010]".

But I have no AMD platforms to verify this theory. Could you
please help to try adding delays between device_shutdown()
and syscore_shutdown() in function kernel_power_off()?
Regards!
Gerry

> 
> -> start suspend to disk:
> 
> [   31.422901] hib.sh (3786): drop_caches: 3
> [   31.427932] PM: Hibernation mode set to 'shutdown'
> [   31.444522] PM: Syncing filesystems ... done.
> [   31.452558] Freezing user space processes ... (elapsed 0.002 seconds) done.
> [   31.462404] PM: Marking nosave pages: [mem 0x0009e000-0x000fffff]
> [   31.468544] PM: Marking nosave pages: [mem 0xba9b8000-0xbca4dfff]
> [   31.474868] PM: Marking nosave pages: [mem 0xbca4f000-0xbcc54fff]
> [   31.480995] PM: Marking nosave pages: [mem 0xbd083000-0xbd7f3fff]
> [   31.487136] PM: Marking nosave pages: [mem 0xbd800000-0x100000fff]
> [   31.493891] PM: Basic memory bitmaps created
> [   31.498178] PM: Preallocating image memory... done (allocated 143539 pages)
> [   32.001145] PM: Allocated 574156 kbytes in 0.49 seconds (1171.74 MB/s)
> [   32.007690] Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
> [   33.220531] serial 00:06: activated
> [   33.220959] [drm] PCIE gen 2 link speeds already enabled
> [   33.228236] [drm] PCIE GART of 512M enabled (table at 0x0000000000040000).
> [   33.228340] radeon 0000:01:00.0: WB enabled
> [   33.228347] radeon 0000:01:00.0: fence driver on ring 0 use gpu addr 0x0000000020000c00 and cpu addr 0xffff88042939ac00
> [   33.233314] snd_hda_intel 0000:01:00.1: irq 49 for MSI/MSI-X
> [   33.259732] [drm] ring test on 0 succeeded in 0 usecs
> [   33.259890] [drm] ib test on ring 0 succeeded in 0 usecs
> [   33.319713] r8169 0000:02:00.0 eth0: link down
> [   33.448144] PM: thaw of devices complete after 229.835 msecs
> [   33.455071] PM: writing image.
> [   33.535747] ata5: SATA link down (SStatus 0 SControl 300)
> [   33.541393] ata6: SATA link down (SStatus 0 SControl 300)
> [   33.707603] ata4: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
> [   33.714056] ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
> [   33.720477] ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
> [   33.728938] ata4.00: configured for UDMA/133
> [   33.734567] ata3.00: configured for UDMA/133
> [   33.739453] ata2.00: configured for UDMA/133
> [   33.745111] PM: Using 3 thread(s) for compression.
> [   33.745111] PM: Compressing and saving image data (149256 pages)...
> [   33.756336] PM: Image saving progress:   0%
> [   33.915592] PM: Image saving progress:  10%
> [   33.995331] PM: Image saving progress:  20%
> [   34.123934] PM: Image saving progress:  30%
> [   34.362314] PM: Image saving progress:  40%
> [   34.463433] PM: Image saving progress:  50%
> [   34.550636] PM: Image saving progress:  60%
> [   34.651601] PM: Image saving progress:  70%
> [   34.756381] PM: Image saving progress:  80%
> [   34.863658] PM: Image saving progress:  90%
> [   34.973828] PM: Image saving progress: 100%
> [   34.993966] PM: Image saving done.
> [   34.998214] PM: Wrote 597024 kbytes in 1.23 seconds (485.38 MB/s)
> [   35.005742] PM: S|
> [   35.122141] r8169 0000:02:00.0 eth0: link up
> [   35.368945] sd 3:0:0:0: [sdc] Synchronizing SCSI cache
> [   35.376839] sd 3:0:0:0: [sdc] Stopping disk
> [   35.898610] sd 2:0:0:0: [sdb] Synchronizing SCSI cache
> [   35.906277] sd 2:0:0:0: [sdb] Stopping disk
> [   36.783830] sd 1:0:0:0: [sda] Synchronizing SCSI cache
> [   36.792035] sd 1:0:0:0: [sda] Stopping disk
> [   36.833993] pcieport 0000:00:04.0: System wakeup enabled by ACPI
> [   36.855314] ------------[ cut here ]------------
> [   36.862344] WARNING: CPU: 2 PID: 3786 at fs/proc/generic.c:521 remove_proc_entry+0x19c/0x1b0()
> [   36.864089] AMD-Vi: Event logged [IO_PAGE_FAULT device=01:00.0 domain=0x0014 address=0x0000000020001000 flags=0x0000]
> [   36.886423] remove_proc_entry: removing non-empty directory 'irq/23', leaking at least 'ehci_hcd:usb3'
> [   36.898325] Modules linked in: xt_iprange ipt_MASQUERADE iptable_mangle iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_filter ip_tables x_tables cpufreq_powersave cpufreq_userspace cpufreq_stats cpufreq_conservative ipv6 vfat fat fuse dm_crypt dm_mod amd64_edac_mod edac_core fam15h_power k10temp radeon drm_kms_helper ttm acpi_cpufreq r8169
> [   36.934730] CPU: 3 PID: 3786 Comm: hib.sh Not tainted 3.16.0-rc7+ #1
> [   36.942103] Hardware name: To be filled by O.E.M. To be filled by O.E.M./M5A97 EVO R2.0, BIOS 1503 01/16/2013
> [   36.953051]  0000000000000009 ffff88041e4ebb40 ffffffff81617d4e ffff88041e4ebb88
> [   36.961616]  ffff88041e4ebb78 ffffffff8104f82c ffff880037a91e00 ffff880429842e30
> [   36.970232]  0000000000000002 ffff88041e4ebc4e ffff880037a91eb9 ffff88041e4ebbd8
> [   36.978822] Call Trace:
> [   36.982324]  [<ffffffff81617d4e>] dump_stack+0x4e/0x7a
> [   36.988496]  [<ffffffff8104f82c>] warn_slowpath_common+0x8c/0xc0
> [   36.995558]  [<ffffffff8104f8ac>] warn_slowpath_fmt+0x4c/0x50
> [   37.002395]  [<ffffffff811dc11c>] remove_proc_entry+0x19c/0x1b0
> [   37.011279]  [<ffffffff810b479e>] unregister_irq_proc+0xce/0xf0
> [   37.020129]  [<ffffffff810ae689>] free_desc+0x39/0x90
> [   37.028097]  [<ffffffff810ae729>] irq_free_descs+0x49/0x90
> [   37.036478]  [<ffffffff810b3f36>] irq_dispose_mapping+0x36/0x60
> [   37.045329]  [<ffffffff8103c2d2>] mp_unmap_irq+0x92/0xc0
> [   37.053540]  [<ffffffff81033a81>] acpi_unregister_gsi_ioapic+0x31/0x40
> [   37.063012]  [<ffffffff81033987>] acpi_unregister_gsi+0x17/0x20
> [   37.071870]  [<ffffffff81340c44>] acpi_pci_irq_disable+0x5c/0x63
> [   37.080756]  [<ffffffff81532b6e>] pcibios_disable_device+0x2e/0x40
> [   37.089838]  [<ffffffff8130bb22>] do_pci_disable_device+0x52/0x60
> [   37.098860]  [<ffffffff8130bb88>] pci_disable_device+0x58/0xd0
> [   37.107561]  [<ffffffff81459182>] usb_hcd_pci_shutdown+0x42/0x50
> [   37.116425]  [<ffffffff8130db28>] pci_device_shutdown+0x28/0x40
> [   37.125248]  [<ffffffff813cf112>] device_shutdown+0xd2/0x1f0
> [   37.133792]  [<ffffffff810754b5>] kernel_power_off+0x35/0x80
> [   37.142348]  [<ffffffff81617455>] power_down+0x45/0xa1
> [   37.150335]  [<ffffffff810a527d>] hibernate+0x19d/0x1d0
> [   37.158377]  [<ffffffff810a1c9c>] state_store+0xec/0x100
> [   37.166526]  [<ffffffff812d448f>] kobj_attr_store+0xf/0x20
> [   37.174858]  [<ffffffff811e91d0>] sysfs_kf_write+0x50/0x70
> [   37.183144]  [<ffffffff811e84e5>] kernfs_fop_write+0x105/0x190
> [   37.191700]  [<ffffffff81171dcd>] vfs_write+0xbd/0x1d0
> [   37.199609]  [<ffffffff81172942>] SyS_write+0x52/0xc0
> [   37.207357]  [<ffffffff81620216>] system_call_fastpath+0x1a/0x1f
> [   37.216091] ---[ end trace 555995a6cab7d4fe ]---
> [   37.222168] ------------[ cut here ]------------
> [   37.227753] WARNING: CPU: 2 PID: 3786 at fs/proc/generic.c:521 remove_proc_entry+0x19c/0x1b0()
> [   37.237344] remove_proc_entry: removing non-empty directory 'irq/22', leaking at least 'ohci_hcd:usb7'
> [   37.247595] Modules linked in: xt_iprange ipt_MASQUERADE iptable_mangle iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_filter ip_tables x_tables cpufreq_powersave cpufreq_userspace cpufreq_stats cpufreq_conservative ipv6 vfat fat fuse dm_crypt dm_mod amd64_edac_mod edac_core fam15h_power k10temp radeon drm_kms_helper ttm acpi_cpufreq r8169
> [   37.284164] CPU: 3 PID: 3786 Comm: hib.sh Tainted: G        W     3.16.0-rc7+ #1
> [   37.292669] Hardware name: To be filled by O.E.M. To be filled by O.E.M./M5A97 EVO R2.0, BIOS 1503 01/16/2013
> [   37.303684]  0000000000000009 ffff88041e4ebb40 ffffffff81617d4e ffff88041e4ebb88
> [   37.312346]  ffff88041e4ebb78 ffffffff8104f82c ffff8804298b7500 ffff880429b2e630
> [   37.321035]  0000000000000002 ffff88041e4ebc4e ffff8804298b75b9 ffff88041e4ebbd8
> [   37.329713] Call Trace:
> [   37.333294]  [<ffffffff81617d4e>] dump_stack+0x4e/0x7a
> [   37.339569]  [<ffffffff8104f82c>] warn_slowpath_common+0x8c/0xc0
> [   37.346722]  [<ffffffff8104f8ac>] warn_slowpath_fmt+0x4c/0x50
> [   37.353623]  [<ffffffff811dc11c>] remove_proc_entry+0x19c/0x1b0
> [   37.360688]  [<ffffffff810b479e>] unregister_irq_proc+0xce/0xf0
> [   37.367776]  [<ffffffff810ae689>] free_desc+0x39/0x90
> [   37.374023]  [<ffffffff810ae729>] irq_free_descs+0x49/0x90
> [   37.382851]  [<ffffffff810b3f36>] irq_dispose_mapping+0x36/0x60
> [   37.392223]  [<ffffffff8103c2d2>] mp_unmap_irq+0x92/0xc0
> [   37.400915]  [<ffffffff81033a81>] acpi_unregister_gsi_ioapic+0x31/0x40
> [   37.410875]  [<ffffffff81033987>] acpi_unregister_gsi+0x17/0x20
> [   37.420244]  [<ffffffff81340c44>] acpi_pci_irq_disable+0x5c/0x63
> [   37.429713]  [<ffffffff81532b6e>] pcibios_disable_device+0x2e/0x40
> [   37.439354]  [<ffffffff8130bb22>] do_pci_disable_device+0x52/0x60
> [   37.448906]  [<ffffffff8130bb88>] pci_disable_device+0x58/0xd0
> [   37.458130]  [<ffffffff81459182>] usb_hcd_pci_shutdown+0x42/0x50
> [   37.467572]  [<ffffffff8130db28>] pci_device_shutdown+0x28/0x40
> [   37.476968]  [<ffffffff813cf112>] device_shutdown+0xd2/0x1f0
> [   37.486079]  [<ffffffff810754b5>] kernel_power_off+0x35/0x80
> [   37.495206]  [<ffffffff81617455>] power_down+0x45/0xa1
> [   37.503789]  [<ffffffff810a527d>] hibernate+0x19d/0x1d0
> [   37.512466]  [<ffffffff810a1c9c>] state_store+0xec/0x100
> [   37.521217]  [<ffffffff812d448f>] kobj_attr_store+0xf/0x20
> [   37.530104]  [<ffffffff811e91d0>] sysfs_kf_write+0x50/0x70
> [   37.538959]  [<ffffffff811e84e5>] kernfs_fop_write+0x105/0x190
> [   37.548096]  [<ffffffff81171dcd>] vfs_write+0xbd/0x1d0
> [   37.556513]  [<ffffffff81172942>] SyS_write+0x52/0xc0
> [   37.564743]  [<ffffffff81620216>] system_call_fastpath+0x1a/0x1f
> [   37.573837] ---[ end trace 555995a6cab7d4ff ]---
> [   37.583013] ------------[ cut here ]------------
> [   37.588736] WARNING: CPU: 0 PID: 3786 at fs/proc/generic.c:521 remove_proc_entry+0x19c/0x1b0()
> [   37.598369] remove_proc_entry: removing non-empty directory 'irq/21', leaking at least 'ehci_hcd:usb2'
> [   37.608655] Modules linked in: xt_iprange ipt_MASQUERADE iptable_mangle iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_filter ip_tables x_tables cpufreq_powersave cpufreq_userspace cpufreq_stats cpufreq_conservative ipv6 vfat fat fuse dm_crypt dm_mod amd64_edac_mod edac_core fam15h_power k10temp radeon drm_kms_helper ttm acpi_cpufreq r8169
> [   37.647451] CPU: 4 PID: 3786 Comm: hib.sh Tainted: G        W     3.16.0-rc7+ #1
> [   37.658115] Hardware name: To be filled by O.E.M. To be filled by O.E.M./M5A97 EVO R2.0, BIOS 1503 01/16/2013
> [   37.671304]  0000000000000009 ffff88041e4ebb40 ffffffff81617d4e ffff88041e4ebb88
> [   37.682276]  ffff88041e4ebb78 ffffffff8104f82c ffff880429f6c000 ffff880429842e30
> [   37.693179]  0000000000000002 ffff88041e4ebc4e ffff880429f6c0b9 ffff88041e4ebbd8
> [   37.704132] Call Trace:
> [   37.709898]  [<ffffffff81617d4e>] dump_stack+0x4e/0x7a
> [   37.718393]  [<ffffffff8104f82c>] warn_slowpath_common+0x8c/0xc0
> [   37.727774]  [<ffffffff8104f8ac>] warn_slowpath_fmt+0x4c/0x50
> [   37.736888]  [<ffffffff811dc11c>] remove_proc_entry+0x19c/0x1b0
> [   37.746167]  [<ffffffff810b479e>] unregister_irq_proc+0xce/0xf0
> [   37.755444]  [<ffffffff810ae689>] free_desc+0x39/0x90
> [   37.763897]  [<ffffffff810ae729>] irq_free_descs+0x49/0x90
> [   37.772742]  [<ffffffff810b3f36>] irq_dispose_mapping+0x36/0x60
> [   37.782019]  [<ffffffff8103c2d2>] mp_unmap_irq+0x92/0xc0
> [   37.790656]  [<ffffffff81033a81>] acpi_unregister_gsi_ioapic+0x31/0x40
> [   37.800534]  [<ffffffff81033987>] acpi_unregister_gsi+0x17/0x20
> [   37.809810]  [<ffffffff81340c44>] acpi_pci_irq_disable+0x5c/0x63
> [   37.819130]  [<ffffffff81532b6e>] pcibios_disable_device+0x2e/0x40
> [   37.828650]  [<ffffffff8130bb22>] do_pci_disable_device+0x52/0x60
> [   37.838120]  [<ffffffff8130bb88>] pci_disable_device+0x58/0xd0
> [   37.847324]  [<ffffffff81459182>] usb_hcd_pci_shutdown+0x42/0x50
> [   37.856698]  [<ffffffff8130db28>] pci_device_shutdown+0x28/0x40
> [   37.866005]  [<ffffffff813cf112>] device_shutdown+0xd2/0x1f0
> [   37.875035]  [<ffffffff810754b5>] kernel_power_off+0x35/0x80
> [   37.884060]  [<ffffffff81617455>] power_down+0x45/0xa1
> [   37.892580]  [<ffffffff810a527d>] hibernate+0x19d/0x1d0
> [   37.901173]  [<ffffffff810a1c9c>] state_store+0xec/0x100
> [   37.909851]  [<ffffffff812d448f>] kobj_attr_store+0xf/0x20
> [   37.918705]  [<ffffffff811e91d0>] sysfs_kf_write+0x50/0x70
> [   37.927558]  [<ffffffff811e84e5>] kernfs_fop_write+0x105/0x190
> [   37.936710]  [<ffffffff81171dcd>] vfs_write+0xbd/0x1d0
> [   37.945157]  [<ffffffff81172942>] SyS_write+0x52/0xc0
> [   37.953358]  [<ffffffff81620216>] system_call_fastpath+0x1a/0x1f
> [   37.962486] ---[ end trace 555995a6cab7d500 ]---
> [   37.971327] ------------[ cut here ]------------
> [   37.978904] WARNING: CPU: 2 PID: 3786 at fs/proc/generic.c:521 remove_proc_entry+0x19c/0x1b0()
> [   37.990542] remove_proc_entry: removing non-empty directory 'irq/20', leaking at least 'ohci_hcd:usb5'
> [   38.002874] Modules linked in: xt_iprange ipt_MASQUERADE iptable_mangle iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_filter ip_tables x_tables cpufreq_powersave cpufreq_userspace cpufreq_stats cpufreq_conservative ipv6 vfat fat fuse dm_crypt dm_mod amd64_edac_mod edac_core fam15h_power k10temp radeon drm_kms_helper ttm acpi_cpufreq r8169
> [   38.039571] CPU: 3 PID: 3786 Comm: hib.sh Tainted: G        W     3.16.0-rc7+ #1
> [   38.048075] Hardware name: To be filled by O.E.M. To be filled by O.E.M./M5A97 EVO R2.0, BIOS 1503 01/16/2013
> [   38.059363]  0000000000000009 ffff88041e4ebb40 ffffffff81617d4e ffff88041e4ebb88
> [   38.070381]  ffff88041e4ebb78 ffffffff8104f82c ffff880429843e00 ffff880429b2e630
> [   38.081335]  0000000000000002 ffff88041e4ebc4e ffff880429843eb9 ffff88041e4ebbd8
> [   38.092303] Call Trace:
> [   38.098120]  [<ffffffff81617d4e>] dump_stack+0x4e/0x7a
> [   38.106619]  [<ffffffff8104f82c>] warn_slowpath_common+0x8c/0xc0
> [   38.115986]  [<ffffffff8104f8ac>] warn_slowpath_fmt+0x4c/0x50
> [   38.125119]  [<ffffffff811dc11c>] remove_proc_entry+0x19c/0x1b0
> [   38.134401]  [<ffffffff810b479e>] unregister_irq_proc+0xce/0xf0
> [   38.143665]  [<ffffffff810ae689>] free_desc+0x39/0x90
> [   38.152113]  [<ffffffff810ae729>] irq_free_descs+0x49/0x90
> [   38.160971]  [<ffffffff810b3f36>] irq_dispose_mapping+0x36/0x60
> [   38.170251]  [<ffffffff8103c2d2>] mp_unmap_irq+0x92/0xc0
> [   38.178918]  [<ffffffff81033a81>] acpi_unregister_gsi_ioapic+0x31/0x40
> [   38.188836]  [<ffffffff81033987>] acpi_unregister_gsi+0x17/0x20
> [   38.198133]  [<ffffffff81340c44>] acpi_pci_irq_disable+0x5c/0x63
> [   38.207534]  [<ffffffff81532b6e>] pcibios_disable_device+0x2e/0x40
> [   38.217056]  [<ffffffff8130bb22>] do_pci_disable_device+0x52/0x60
> [   38.226486]  [<ffffffff8130bb88>] pci_disable_device+0x58/0xd0
> [   38.235667]  [<ffffffff81459182>] usb_hcd_pci_shutdown+0x42/0x50
> [   38.245025]  [<ffffffff8130db28>] pci_device_shutdown+0x28/0x40
> [   38.254323]  [<ffffffff813cf112>] device_shutdown+0xd2/0x1f0
> [   38.263346]  [<ffffffff810754b5>] kernel_power_off+0x35/0x80
> [   38.272404]  [<ffffffff81617455>] power_down+0x45/0xa1
> [   38.280890]  [<ffffffff810a527d>] hibernate+0x19d/0x1d0
> [   38.289488]  [<ffffffff810a1c9c>] state_store+0xec/0x100
> [   38.298180]  [<ffffffff812d448f>] kobj_attr_store+0xf/0x20
> [   38.307040]  [<ffffffff811e91d0>] sysfs_kf_write+0x50/0x70
> [   38.315880]  [<ffffffff811e84e5>] kernfs_fop_write+0x105/0x190
> [   38.325035]  [<ffffffff81171dcd>] vfs_write+0xbd/0x1d0
> [   38.333471]  [<ffffffff81172942>] SyS_write+0x52/0xc0
> [   38.341680]  [<ffffffff81620216>] system_call_fastpath+0x1a/0x1f
> [   38.350836] ---[ end trace 555995a6cab7d501 ]---
> [   38.359898] ------------[ cut here ]------------
> [   38.367518] WARNING: CPU: 6 PID: 3786 at fs/proc/generic.c:521 remove_proc_entry+0x19c/0x1b0()
> [   38.377354] remove_proc_entry: removing non-empty directory 'irq/17', leaking at least 'ehci_hcd:usb1'
> [   38.387685] Modules linked in: xt_iprange ipt_MASQUERADE iptable_mangle iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_filter ip_tables x_tables cpufreq_powersave cpufreq_userspace cpufreq_stats cpufreq_conservative ipv6 vfat fat fuse dm_crypt dm_mod amd64_edac_mod edac_core fam15h_power k10temp radeon drm_kms_helper ttm acpi_cpufreq r8169
> [   38.424577] CPU: 7 PID: 3786 Comm: hib.sh Tainted: G        W     3.16.0-rc7+ #1
> [   38.433267] Hardware name: To be filled by O.E.M. To be filled by O.E.M./M5A97 EVO R2.0, BIOS 1503 01/16/2013
> [   38.444345]  0000000000000009 ffff88041e4ebb40 ffffffff81617d4e ffff88041e4ebb88
> [   38.452997]  ffff88041e4ebb78 ffffffff8104f82c ffff88042b199000 ffff880429842e30
> [   38.461711]  0000000000000002 ffff88041e4ebc4e ffff88042b1990b9 ffff88041e4ebbd8
> [   38.470362] Call Trace:
> [   38.474000]  [<ffffffff81617d4e>] dump_stack+0x4e/0x7a
> [   38.480276]  [<ffffffff8104f82c>] warn_slowpath_common+0x8c/0xc0
> [   38.487466]  [<ffffffff8104f8ac>] warn_slowpath_fmt+0x4c/0x50
> [   38.494344]  [<ffffffff811dc11c>] remove_proc_entry+0x19c/0x1b0
> [   38.501441]  [<ffffffff810b479e>] unregister_irq_proc+0xce/0xf0
> [   38.508498]  [<ffffffff810ae689>] free_desc+0x39/0x90
> [   38.514772]  [<ffffffff810ae729>] irq_free_descs+0x49/0x90
> [   38.523647]  [<ffffffff810b3f36>] irq_dispose_mapping+0x36/0x60
> [   38.532943]  [<ffffffff8103c2d2>] mp_unmap_irq+0x92/0xc0
> [   38.541628]  [<ffffffff81033a81>] acpi_unregister_gsi_ioapic+0x31/0x40
> [   38.551512]  [<ffffffff81033987>] acpi_unregister_gsi+0x17/0x20
> [   38.560841]  [<ffffffff81340c44>] acpi_pci_irq_disable+0x5c/0x63
> [   38.570218]  [<ffffffff81532b6e>] pcibios_disable_device+0x2e/0x40
> [   38.579774]  [<ffffffff8130bb22>] do_pci_disable_device+0x52/0x60
> [   38.589277]  [<ffffffff8130bb88>] pci_disable_device+0x58/0xd0
> [   38.598485]  [<ffffffff81459182>] usb_hcd_pci_shutdown+0x42/0x50
> [   38.607886]  [<ffffffff8130db28>] pci_device_shutdown+0x28/0x40
> [   38.617206]  [<ffffffff813cf112>] device_shutdown+0xd2/0x1f0
> [   38.626273]  [<ffffffff810754b5>] kernel_power_off+0x35/0x80
> [   38.635346]  [<ffffffff81617455>] power_down+0x45/0xa1
> [   38.643849]  [<ffffffff810a527d>] hibernate+0x19d/0x1d0
> [   38.652446]  [<ffffffff810a1c9c>] state_store+0xec/0x100
> [   38.661129]  [<ffffffff812d448f>] kobj_attr_store+0xf/0x20
> [   38.670047]  [<ffffffff811e91d0>] sysfs_kf_write+0x50/0x70
> [   38.678911]  [<ffffffff811e84e5>] kernfs_fop_write+0x105/0x190
> [   38.688123]  [<ffffffff81171dcd>] vfs_write+0xbd/0x1d0
> [   38.696609]  [<ffffffff81172942>] SyS_write+0x52/0xc0
> [   38.704866]  [<ffffffff81620216>] system_call_fastpath+0x1a/0x1f
> [   38.713983] ---[ end trace 555995a6cab7d502 ]---
> [   38.722834] ------------[ cut here ]------------
> [   38.730422] WARNING: CPU: 0 PID: 3786 at fs/proc/generic.c:521 remove_proc_entry+0x19c/0x1b0()
> [   38.740241] remove_proc_entry: removing non-empty directory 'irq/18', leaking at least 'ohci_hcd:usb6'
> [   38.750583] Modules linked in: xt_iprange ipt_MASQUERADE iptable_mangle iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4[   38.762023] AMD-Vi: Event logged [IO_PAGE_FAULT device=00:12.0 domain=0x0009 address=0x0000000000000080 flags=0x0020]
> [   38.762030] AMD-Vi: Event logged [IO_PAGE_FAULT device=00:12.0 domain=0x0009 address=0x0000000000000000 flags=0x0000]
> [   38.762034] AMD-Vi: Event logged [IO_PAGE_FAULT device=00:12.0 domain=0x0009 address=0x00000000ffffffc0 flags=0x0010]
> [   38.762038] AMD-Vi: Event logged [IO_PAGE_FAULT device=00:12.0 domain=0x0009 address=0x00000000ffffffc0 flags=0x0010]
> [   38.762041] AMD-Vi: Event logged [IO_PAGE_FAULT device=00:12.0 domain=0x0009 address=0x00000000ffffffc0 flags=0x0010]
> [   38.762044] AMD-Vi: Event logged [IO_PAGE_FAULT device=00:12.0 domain=0x0009 address=0x00000000ffffffc0 flags=0x0010]
> [   38.762048] AMD-Vi: Event logged [IO_PAGE_FAULT device=00:12.0 domain=0x0009 address=0x00000000ffffffc0 flags=0x0010]
> [   38.762051] AMD-Vi: Event logged [IO_PAGE_FAULT device=00:12.0 domain=0x0009 address=0x00000000ffffffc0 flags=0x0010]
> 
> lotta those...
> 
> [   45.876957] Clocksource tsc unstable (delta = 4686965640 ns)
> [   45.883734]  nf_nat nf_conntrack iptable_filter ip_tables x_tables cpufreq_powersave cpufreq_userspace cpufreq_stats cpufreq_conservative ipv6 vfat fat fuse dm_crypt dm_mod amd64_edac_mod edac_core fam15h_power k10temp radeon drm_kms_helper ttm acpi_cpufreq r8169
> [   45.909515] CPU: 0 PID: 3786 Comm: hib.sh Tainted: G        W     3.16.0-rc7+ #1
> [   45.909616] Switched to clocksource acpi_pm
> [   45.921887] Hardware name: To be filled by O.E.M. To be filled by O.E.M./M5A97 EVO R2.0, BIOS 1503 01/16/2013
> [   45.932301]  0000000000000009 ffff88041e4ebb40 ffffffff81617d4e ffff88041e4ebb88
> [   45.940316]  ffff88041e4ebb78 ffffffff8104f82c ffff880429842e00 ffff880429b2e630
> [   45.948346]  0000000000000002 ffff88041e4ebc4e ffff880429842eb9 ffff88041e4ebbd8
> [   45.956357] Call Trace:
> [   45.959304]  [<ffffffff81617d4e>] dump_stack+0x4e/0x7a
> [   45.964975]  [<ffffffff8104f82c>] warn_slowpath_common+0x8c/0xc0
> [   45.971491]  [<ffffffff8104f8ac>] warn_slowpath_fmt+0x4c/0x50
> [   45.977757]  [<ffffffff811dc11c>] remove_proc_entry+0x19c/0x1b0
> [   45.984240]  [<ffffffff810b479e>] unregister_irq_proc+0xce/0xf0
> [   45.990696]  [<ffffffff810ae689>] free_desc+0x39/0x90
> [   45.996299]  [<ffffffff810ae729>] irq_free_descs+0x49/0x90
> [   46.002380]  [<ffffffff810b3f36>] irq_dispose_mapping+0x36/0x60
> [   46.008893]  [<ffffffff8103c2d2>] mp_unmap_irq+0x92/0xc0
> [   46.014794]  [<ffffffff81033a81>] acpi_unregister_gsi_ioapic+0x31/0x40
> [   46.021911]  [<ffffffff81033987>] acpi_unregister_gsi+0x17/0x20
> [   46.028427]  [<ffffffff81340c44>] acpi_pci_irq_disable+0x5c/0x63
> [   46.035046]  [<ffffffff81532b6e>] pcibios_disable_device+0x2e/0x40
> [   46.041809]  [<ffffffff8130bb22>] do_pci_disable_device+0x52/0x60
> [   46.048495]  [<ffffffff8130bb88>] pci_disable_device+0x58/0xd0
> [   46.054996]  [<ffffffff81459182>] usb_hcd_pci_shutdown+0x42/0x50
> [   46.061611]  [<ffffffff8130db28>] pci_device_shutdown+0x28/0x40
> [   46.068202]  [<ffffffff813cf112>] device_shutdown+0xd2/0x1f0
> [   46.074543]  [<ffffffff810754b5>] kernel_power_off+0x35/0x80
> [   46.080886]  [<ffffffff81617455>] power_down+0x45/0xa1
> [   46.086695]  [<ffffffff810a527d>] hibernate+0x19d/0x1d0
> [   46.092603]  [<ffffffff810a1c9c>] state_store+0xec/0x100
> [   46.098599]  [<ffffffff812d448f>] kobj_attr_store+0xf/0x20
> [   46.104755]  [<ffffffff811e91d0>] sysfs_kf_write+0x50/0x70
> [   46.110921]  [<ffffffff811e84e5>] kernfs_fop_write+0x105/0x190
> [   46.117434]  [<ffffffff81171dcd>] vfs_write+0xbd/0x1d0
> [   46.123291]  [<ffffffff81172942>] SyS_write+0x52/0xc0
> [   46.129069]  [<ffffffff81620216>] system_call_fastpath+0x1a/0x1f
> [   46.135798] ---[ end trace 555995a6cab7d503 ]---
> [   46.142058] ACPI: Preparing to enter system sleep state S5
> [   46.148449] [Firmware Bug]: ACPI: BIOS _OSI(Linux) query honored via cmdline
> [   46.159388] reboot: Power down
> [   46.163504] acpi_power_off called
> 

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] x86, irq: Keep IRQ assignment for PCI devices during suspend/hibernation
  2014-07-31 16:36             ` Jiang Liu
@ 2014-07-31 16:56               ` Borislav Petkov
  2014-08-01 10:56                 ` [PATCH] x86, irq: Keep IRQ assignment for PCI devices during suspend/hibernation, bisected Borislav Petkov
  0 siblings, 1 reply; 24+ messages in thread
From: Borislav Petkov @ 2014-07-31 16:56 UTC (permalink / raw)
  To: Jiang Liu
  Cc: Rafael J . Wysocki, Thomas Gleixner, Bjorn Helgaas, Ingo Molnar,
	H. Peter Anvin, x86, linux-pci, lkml

On Fri, Aug 01, 2014 at 12:36:59AM +0800, Jiang Liu wrote:
> 1) There's a race window between device_shutdown() and
> syscore_shutdown(). USB device may be returned to BIOS
> during device_shutdown() and IOMMU units are disabled
> during syscore_shutdown().
> 2) Printing debug messages enlarges the race window,
> especially when using serial console.
> Then it triggers warning message "AMD-Vi: Event logged [IO_PAGE_FAULT
> device=00:12.0 domain=0x0009 address=0x00000000ffffffc0 flags=0x0010]".
> 
> But I have no AMD platforms to verify this theory. Could you
> please help to try adding delays between device_shutdown()
> and syscore_shutdown() in function kernel_power_off()?

Ok, but this whole deal started appearing only after rc6. And this
ordering hasn't been touched recently, AFAICT. So it must be something
new...

I'm still trying to get that other system to trigger...

-- 
Regards/Gruss,
    Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] x86, irq: Keep IRQ assignment for PCI devices during suspend/hibernation, bisected
  2014-07-31 16:56               ` Borislav Petkov
@ 2014-08-01 10:56                 ` Borislav Petkov
  2014-08-01 12:27                   ` Jiang Liu
  0 siblings, 1 reply; 24+ messages in thread
From: Borislav Petkov @ 2014-08-01 10:56 UTC (permalink / raw)
  To: Jiang Liu
  Cc: Rafael J . Wysocki, Thomas Gleixner, Bjorn Helgaas, Ingo Molnar,
	H. Peter Anvin, x86, linux-pci, lkml

Ok,

so I went and looked at tip and which branches could be the usual
suspects. And whaddya know, tip/x86/apic was the first to pick and it
contained the culprit:

6a38fa0e3c94 ("x86, irq, ACPI: Release IOAPIC pin when PCI device is disabled")

It seems like my machine doesn't like releasing the IOAPIC pin. Or,
maybe it is IOMMU-related and the IOMMU doesn't like this releasing of
the pin *after* the device gets suspended and *before* the IOMMU.

Strange.

Oh well, a well deserved lunch now :)

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] x86, irq: Keep IRQ assignment for PCI devices during suspend/hibernation, bisected
  2014-08-01 10:56                 ` [PATCH] x86, irq: Keep IRQ assignment for PCI devices during suspend/hibernation, bisected Borislav Petkov
@ 2014-08-01 12:27                   ` Jiang Liu
  2014-08-01 14:39                     ` Borislav Petkov
  0 siblings, 1 reply; 24+ messages in thread
From: Jiang Liu @ 2014-08-01 12:27 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Rafael J . Wysocki, Thomas Gleixner, Bjorn Helgaas, Ingo Molnar,
	H. Peter Anvin, x86, linux-pci, lkml


On 2014/8/1 18:56, Borislav Petkov wrote:
> Ok,
> 
> so I went and looked at tip and which branches could be the usual
> suspects. And whaddya know, tip/x86/apic was the first to pick and it
> contained the culprit:
> 
> 6a38fa0e3c94 ("x86, irq, ACPI: Release IOAPIC pin when PCI device is disabled")
Hi Borislav,
	The above commit may cause failure of suspend/hiberrnation.
The reason is:
1) With recent changers, we dynamically allocate irq number for IOAPIC
   pins.
2) The allocated irq will be released when suspending/hibernating.
	pci_disable_device()->pcibios_disable_irq()
3) When resuming, a different irq may be assigned to the PCI device.
	pci_enable_device()->pcibios_enable_irq()
4) Now the hardware will deliver interrupt to the new allocated irq
   but the interrupt handler is still registered on old irq, so it
   breaks the driver and causes failure of hibernation.

The patch sent out by me is to fix this issue by keeping the allocated
irq when suspend/hibernate. And seems it works as expected.

But I still don't know why it causes IOMMU related warnings as:
AMD-Vi: Event logged [IO_PAGE_FAULT device=00:12.0 domain=0x0009
address=0x0000000000000000 flags=0x0000]

Regards!
Gerry
> 
> It seems like my machine doesn't like releasing the IOAPIC pin. Or,
> maybe it is IOMMU-related and the IOMMU doesn't like this releasing of
> the pin *after* the device gets suspended and *before* the IOMMU.
> 
> Strange.
> 
> Oh well, a well deserved lunch now :)
> 

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] x86, irq: Keep IRQ assignment for PCI devices during suspend/hibernation, bisected
  2014-08-01 12:27                   ` Jiang Liu
@ 2014-08-01 14:39                     ` Borislav Petkov
  2014-08-01 16:11                       ` Borislav Petkov
  0 siblings, 1 reply; 24+ messages in thread
From: Borislav Petkov @ 2014-08-01 14:39 UTC (permalink / raw)
  To: Jiang Liu
  Cc: Rafael J . Wysocki, Thomas Gleixner, Bjorn Helgaas, Ingo Molnar,
	H. Peter Anvin, x86, linux-pci, lkml, Jörg Rödel

On Fri, Aug 01, 2014 at 08:27:23PM +0800, Jiang Liu wrote:
> 	The above commit may cause failure of suspend/hiberrnation.
> The reason is:
> 1) With recent changers, we dynamically allocate irq number for IOAPIC
>    pins.
> 2) The allocated irq will be released when suspending/hibernating.
> 	pci_disable_device()->pcibios_disable_irq()
> 3) When resuming, a different irq may be assigned to the PCI device.
> 	pci_enable_device()->pcibios_enable_irq()
> 4) Now the hardware will deliver interrupt to the new allocated irq
>    but the interrupt handler is still registered on old irq, so it
>    breaks the driver and causes failure of hibernation.
> 
> The patch sent out by me is to fix this issue by keeping the allocated
> irq when suspend/hibernate. And seems it works as expected.
> 
> But I still don't know why it causes IOMMU related warnings as:
> AMD-Vi: Event logged [IO_PAGE_FAULT device=00:12.0 domain=0x0009
> address=0x0000000000000000 flags=0x0000]

Well, the devices in those change during my test runs. Once it was
00:13.0, once this, once the GPU. This box has IOMMU so it might be
related to that somehow.

I could try to disable the IOMMU and see whether it still triggers. That
could tell us something.

-- 
Regards/Gruss,
    Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] x86, irq: Keep IRQ assignment for PCI devices during suspend/hibernation, bisected
  2014-08-01 14:39                     ` Borislav Petkov
@ 2014-08-01 16:11                       ` Borislav Petkov
  2014-08-01 22:14                         ` Jörg Rödel
  0 siblings, 1 reply; 24+ messages in thread
From: Borislav Petkov @ 2014-08-01 16:11 UTC (permalink / raw)
  To: Jiang Liu, Jörg Rödel
  Cc: Rafael J . Wysocki, Thomas Gleixner, Bjorn Helgaas, Ingo Molnar,
	H. Peter Anvin, x86, linux-pci, lkml

On Fri, Aug 01, 2014 at 04:39:22PM +0200, Borislav Petkov wrote:
> I could try to disable the IOMMU and see whether it still triggers.
> That could tell us something.

Ok, let me summarize what I've been able to observe so far:

* https://lkml.kernel.org/r/1406766807-5745-1-git-send-email-jiang.liu@linux.intel.com

this is definitely needed for suspend/resume so for that patch

Acked-and-tested-by: Borislav Petkov <bp@suse.de>

(I need to somehow justify a whole day of bisecting today :-P)

* Then, this
---
diff --git a/drivers/usb/core/hcd-pci.c b/drivers/usb/core/hcd-pci.c
index 82044b5d6113..efc953119ce2 100644
--- a/drivers/usb/core/hcd-pci.c
+++ b/drivers/usb/core/hcd-pci.c
@@ -380,6 +380,8 @@ void usb_hcd_pci_shutdown(struct pci_dev *dev)
 	if (test_bit(HCD_FLAG_HW_ACCESSIBLE, &hcd->flags) &&
 			hcd->driver->shutdown) {
 		hcd->driver->shutdown(hcd);
+		if (usb_hcd_is_primary_hcd(hcd) && hcd->irq > 0)
+			free_irq(hcd->irq, hcd);
 		pci_disable_device(dev);
 	}
 }
--

is needed for not triggering the remove_proc_entry() WARN_ON.

Finally, even with this hunk above, suspend works fine but I see IOMMU
PFs sometimes(!). Yes, sometimes as in it suspends fine without even
screaming at all and sometimes I get a few of those right before the
machine goes down:

[   89.040795] pcieport 0000:00:04.0: System wakeup enabled by ACPI
[   89.061697] AMD-Vi: Event logged [IO_PAGE_FAULT device=01:00.0 domain=0x0014 address=0x0000000020001000 flags=0x0000]
[   89.071871] ACPI: Preparing to enter system sleep state S5
[   89.072117] [Firmware Bug]: ACPI: BIOS _OSI(Linux) query honored via cmdline
[   89.089832] AMD-Vi: Event logged [IO_PAGE_FAULT device=00:12.0 domain=0x0009 address=0x0000000000000080 flags=0x0020]
[   89.102239] AMD-Vi: Event logged [IO_PAGE_FAULT device=00:12.0 domain=0x0009 address=0x0000000000000000 flags=0x0000]
[   89.114684] AMD-Vi: Event logged [IO_PAGE_FAULT device=00:12.0 domain=0x0009 address=0x00000000ffffffc0 flags=0x0010]
[   89.127162] AMD-Vi: Event logged [IO_PAGE_FAULT device=00:12.0 domain=0x0009 address=0x00000000ffffffc0 flags=0x0010]
[   89.139576] AMD-Vi: Event logged [IO_PAGE_FAULT device=00:12.0 domain=0x0009 address=0x00000000ffffffc0 flags=0x0010]
[   89.152017] AMD-Vi: Event logged [IO_PAGE_FAULT device=00:12.0 domain=0x0009 address=0x00000000ffffffc0 flags=0x0010]
[   89.164481] AMD-Vi: Event logged [IO_PAGE_FAULT device=00:12.0 domain=0x0009 address=0x00000000ffffffc0 flags=0x0010]
[   89.176994] AMD-Vi: Event logged [[   89.177657] reboot: Power down
[   89.185286] acpi_power_off called

Now this device 00:12.0 is that OHCI thing for which we have the
hcd-pci.c hunk applied above, AFAICT:

00:12.0 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI0 Controller

so it must be still some timing issue there after disabling the device
and *before* disabling the IOMMU.

I don't have a clue how to further debug that. Joerg is on CC.

-- 
Regards/Gruss,
    Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH] x86, irq: Keep IRQ assignment for PCI devices during suspend/hibernation, bisected
  2014-08-01 16:11                       ` Borislav Petkov
@ 2014-08-01 22:14                         ` Jörg Rödel
  2014-08-01 22:50                           ` Borislav Petkov
  0 siblings, 1 reply; 24+ messages in thread
From: Jörg Rödel @ 2014-08-01 22:14 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Jiang Liu, Rafael J . Wysocki, Thomas Gleixner, Bjorn Helgaas,
	Ingo Molnar, H. Peter Anvin, x86, linux-pci, lkml

On Fri, Aug 01, 2014 at 06:11:08PM +0200, Borislav Petkov wrote:
> [   89.040795] pcieport 0000:00:04.0: System wakeup enabled by ACPI
> [   89.061697] AMD-Vi: Event logged [IO_PAGE_FAULT device=01:00.0 domain=0x0014 address=0x0000000020001000 flags=0x0000]
> [   89.071871] ACPI: Preparing to enter system sleep state S5
> [   89.072117] [Firmware Bug]: ACPI: BIOS _OSI(Linux) query honored via cmdline
> [   89.089832] AMD-Vi: Event logged [IO_PAGE_FAULT device=00:12.0 domain=0x0009 address=0x0000000000000080 flags=0x0020]
> [   89.102239] AMD-Vi: Event logged [IO_PAGE_FAULT device=00:12.0 domain=0x0009 address=0x0000000000000000 flags=0x0000]
> [   89.114684] AMD-Vi: Event logged [IO_PAGE_FAULT device=00:12.0 domain=0x0009 address=0x00000000ffffffc0 flags=0x0010]
> [   89.127162] AMD-Vi: Event logged [IO_PAGE_FAULT device=00:12.0 domain=0x0009 address=0x00000000ffffffc0 flags=0x0010]
> [   89.139576] AMD-Vi: Event logged [IO_PAGE_FAULT device=00:12.0 domain=0x0009 address=0x00000000ffffffc0 flags=0x0010]
> [   89.152017] AMD-Vi: Event logged [IO_PAGE_FAULT device=00:12.0 domain=0x0009 address=0x00000000ffffffc0 flags=0x0010]
> [   89.164481] AMD-Vi: Event logged [IO_PAGE_FAULT device=00:12.0 domain=0x0009 address=0x00000000ffffffc0 flags=0x0010]
> [   89.176994] AMD-Vi: Event logged [[   89.177657] reboot: Power down
> [   89.185286] acpi_power_off called
> 
> Now this device 00:12.0 is that OHCI thing for which we have the
> hcd-pci.c hunk applied above, AFAICT:
> 
> 00:12.0 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI0 Controller
> 
> so it must be still some timing issue there after disabling the device
> and *before* disabling the IOMMU.

My guess is that the firmware takes back the device after the OS
released it and now the legacy emulation tries to do DMA with it. But
since there is an IOMMU the physical addresses it tries to DMA to is
not mapped and it generated IO page faults

In a perfect world the BIOS would define Unity Mapping regions in the
IVRS ACPI table for the USB controler so that the IOMMU driver would
keep these regions mapped.  But I have never seen those regions defined
by any BIOS of an AMD machine, so this is probably the cause why you are
seeing these IO page faults.

But this doesn't explain the GPU faults, though.


	Joerg


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] x86, irq: Keep IRQ assignment for PCI devices during suspend/hibernation, bisected
  2014-08-01 22:14                         ` Jörg Rödel
@ 2014-08-01 22:50                           ` Borislav Petkov
  2014-08-02  2:25                               ` Jiang Liu
  0 siblings, 1 reply; 24+ messages in thread
From: Borislav Petkov @ 2014-08-01 22:50 UTC (permalink / raw)
  To: Jörg Rödel
  Cc: Jiang Liu, Rafael J . Wysocki, Thomas Gleixner, Bjorn Helgaas,
	Ingo Molnar, H. Peter Anvin, x86, linux-pci, lkml

On Sat, Aug 02, 2014 at 12:14:10AM +0200, Jörg Rödel wrote:
> My guess is that the firmware takes back the device after the OS
> released it and now the legacy emulation tries to do DMA with it. But
> since there is an IOMMU the physical addresses it tries to DMA to is
> not mapped and it generated IO page faults
> 
> In a perfect world the BIOS would define Unity Mapping regions in the
> IVRS ACPI table for the USB controler so that the IOMMU driver would
> keep these regions mapped.  But I have never seen those regions defined
> by any BIOS of an AMD machine, so this is probably the cause why you are
> seeing these IO page faults.

Ok but what changed? Apparently we didn't have that small window earlier
as I have never seen those IOMMU PFs before. So Jiang's stuff simply
opens that hole now. And I bet this whole work is aiming at physical
hotplug which is all fine and dandy but it shouldn't cause regressions.

So, IIUC, the dynamic IOAPIC stuff of freeing an irq number during
suspend opens this hole. Which leads me to the naive thinking that maybe
this new behavior should be configurable so that systems can choose.

And I bet I won't be the last one to trigger those when those changes
hit 3.17...

> But this doesn't explain the GPU faults, though.

I think this got fixed by

https://lkml.kernel.org/r/1406766807-5745-1-git-send-email-jiang.liu@linux.intel.com

which keeps the IRQ numbers across S/R.

I'll watch out for those though.

-- 
Regards/Gruss,
    Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH] iommu/amd: Implement syscore_ops.shutdown()
@ 2014-08-02  2:25                               ` Jiang Liu
  0 siblings, 0 replies; 24+ messages in thread
From: Jiang Liu @ 2014-08-02  2:25 UTC (permalink / raw)
  To: Borislav Petkov, Rafael J . Wysocki, Thomas Gleixner, Joerg Roedel
  Cc: Jiang Liu, x86-ml, lkml, iommu

During hibernation or shutdown, AMD iommu generates warnings on some
platforms as below:
[   89.089832] AMD-Vi: Event logged [IO_PAGE_FAULT device=00:12.0 domain=0x0009 address=0x0000000000000080 flags=0x0020]
[   89.102239] AMD-Vi: Event logged [IO_PAGE_FAULT device=00:12.0 domain=0x0009 address=0x0000000000000000 flags=0x0000]
[   89.114684] AMD-Vi: Event logged [IO_PAGE_FAULT device=00:12.0 domain=0x0009 address=0x00000000ffffffc0 flags=0x0010]
[   89.127162] AMD-Vi: Event logged [IO_PAGE_FAULT device=00:12.0 domain=0x0009 address=0x00000000ffffffc0 flags=0x0010]
[   89.139576] AMD-Vi: Event logged [IO_PAGE_FAULT device=00:12.0 domain=0x0009 address=0x00000000ffffffc0 flags=0x0010]
[   89.152017] AMD-Vi: Event logged [IO_PAGE_FAULT device=00:12.0 domain=0x0009 address=0x00000000ffffffc0 flags=0x0010]
[   89.164481] AMD-Vi: Event logged [IO_PAGE_FAULT device=00:12.0 domain=0x0009 address=0x00000000ffffffc0 flags=0x0010]

It may be caused by that the firmware takes back the device after the OS
released it and now the legacy emulation tries to do DMA with it. But
since there is an IOMMU the physical addresses it tries to DMA to is
not mapped and it generated IO page faults. So explicitly shutdown
IOMMU units during hibernation or poweroff.

Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
---
Hi Borislav,
	Could you please help to verify whether this patch help to minimize
the race window and reduce IOMMU warning messages?
Regards!
Gerry
---
 drivers/iommu/amd_iommu_init.c |    7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/drivers/iommu/amd_iommu_init.c b/drivers/iommu/amd_iommu_init.c
index 0e08545d7298..4d1752fe12e1 100644
--- a/drivers/iommu/amd_iommu_init.c
+++ b/drivers/iommu/amd_iommu_init.c
@@ -1687,9 +1687,16 @@ static int amd_iommu_suspend(void)
 	return 0;
 }
 
+static void amd_iommu_shutdown(void)
+{
+	/* disable IOMMUs to go out of the way for BIOS */
+	disable_iommus();
+}
+
 static struct syscore_ops amd_iommu_syscore_ops = {
 	.suspend = amd_iommu_suspend,
 	.resume = amd_iommu_resume,
+	.shutdown = amd_iommu_shutdown,
 };
 
 static void __init free_on_init_error(void)
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH] iommu/amd: Implement syscore_ops.shutdown()
@ 2014-08-02  2:25                               ` Jiang Liu
  0 siblings, 0 replies; 24+ messages in thread
From: Jiang Liu @ 2014-08-02  2:25 UTC (permalink / raw)
  To: Borislav Petkov, Rafael J . Wysocki, Thomas Gleixner, Joerg Roedel
  Cc: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, x86-ml,
	Jiang Liu, lkml

During hibernation or shutdown, AMD iommu generates warnings on some
platforms as below:
[   89.089832] AMD-Vi: Event logged [IO_PAGE_FAULT device=00:12.0 domain=0x0009 address=0x0000000000000080 flags=0x0020]
[   89.102239] AMD-Vi: Event logged [IO_PAGE_FAULT device=00:12.0 domain=0x0009 address=0x0000000000000000 flags=0x0000]
[   89.114684] AMD-Vi: Event logged [IO_PAGE_FAULT device=00:12.0 domain=0x0009 address=0x00000000ffffffc0 flags=0x0010]
[   89.127162] AMD-Vi: Event logged [IO_PAGE_FAULT device=00:12.0 domain=0x0009 address=0x00000000ffffffc0 flags=0x0010]
[   89.139576] AMD-Vi: Event logged [IO_PAGE_FAULT device=00:12.0 domain=0x0009 address=0x00000000ffffffc0 flags=0x0010]
[   89.152017] AMD-Vi: Event logged [IO_PAGE_FAULT device=00:12.0 domain=0x0009 address=0x00000000ffffffc0 flags=0x0010]
[   89.164481] AMD-Vi: Event logged [IO_PAGE_FAULT device=00:12.0 domain=0x0009 address=0x00000000ffffffc0 flags=0x0010]

It may be caused by that the firmware takes back the device after the OS
released it and now the legacy emulation tries to do DMA with it. But
since there is an IOMMU the physical addresses it tries to DMA to is
not mapped and it generated IO page faults. So explicitly shutdown
IOMMU units during hibernation or poweroff.

Signed-off-by: Jiang Liu <jiang.liu-VuQAYsv1563Yd54FQh9/CA@public.gmane.org>
---
Hi Borislav,
	Could you please help to verify whether this patch help to minimize
the race window and reduce IOMMU warning messages?
Regards!
Gerry
---
 drivers/iommu/amd_iommu_init.c |    7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/drivers/iommu/amd_iommu_init.c b/drivers/iommu/amd_iommu_init.c
index 0e08545d7298..4d1752fe12e1 100644
--- a/drivers/iommu/amd_iommu_init.c
+++ b/drivers/iommu/amd_iommu_init.c
@@ -1687,9 +1687,16 @@ static int amd_iommu_suspend(void)
 	return 0;
 }
 
+static void amd_iommu_shutdown(void)
+{
+	/* disable IOMMUs to go out of the way for BIOS */
+	disable_iommus();
+}
+
 static struct syscore_ops amd_iommu_syscore_ops = {
 	.suspend = amd_iommu_suspend,
 	.resume = amd_iommu_resume,
+	.shutdown = amd_iommu_shutdown,
 };
 
 static void __init free_on_init_error(void)
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH] iommu/amd: Implement syscore_ops.shutdown()
@ 2014-08-04 10:12                                 ` Borislav Petkov
  0 siblings, 0 replies; 24+ messages in thread
From: Borislav Petkov @ 2014-08-04 10:12 UTC (permalink / raw)
  To: Jiang Liu
  Cc: Rafael J . Wysocki, Thomas Gleixner, Joerg Roedel, x86-ml, lkml, iommu

On Sat, Aug 02, 2014 at 10:25:39AM +0800, Jiang Liu wrote:
> During hibernation or shutdown, AMD iommu generates warnings on some
> platforms as below:
> [   89.089832] AMD-Vi: Event logged [IO_PAGE_FAULT device=00:12.0 domain=0x0009 address=0x0000000000000080 flags=0x0020]
> [   89.102239] AMD-Vi: Event logged [IO_PAGE_FAULT device=00:12.0 domain=0x0009 address=0x0000000000000000 flags=0x0000]
> [   89.114684] AMD-Vi: Event logged [IO_PAGE_FAULT device=00:12.0 domain=0x0009 address=0x00000000ffffffc0 flags=0x0010]
> [   89.127162] AMD-Vi: Event logged [IO_PAGE_FAULT device=00:12.0 domain=0x0009 address=0x00000000ffffffc0 flags=0x0010]
> [   89.139576] AMD-Vi: Event logged [IO_PAGE_FAULT device=00:12.0 domain=0x0009 address=0x00000000ffffffc0 flags=0x0010]
> [   89.152017] AMD-Vi: Event logged [IO_PAGE_FAULT device=00:12.0 domain=0x0009 address=0x00000000ffffffc0 flags=0x0010]
> [   89.164481] AMD-Vi: Event logged [IO_PAGE_FAULT device=00:12.0 domain=0x0009 address=0x00000000ffffffc0 flags=0x0010]
> 
> It may be caused by that the firmware takes back the device after the OS
> released it and now the legacy emulation tries to do DMA with it. But
> since there is an IOMMU the physical addresses it tries to DMA to is
> not mapped and it generated IO page faults. So explicitly shutdown
> IOMMU units during hibernation or poweroff.

Unfortunately, this doesn't help. The current observation is that I
don't get the IOPFs during *every* suspend-to-disk cycle but I do get
them from time to time. Here are all IOPFs from 10ish suspend/resume
cycles with this patch. The newlines between them show how the IOPFs
happen: sometimes it is the GPU (device 01:00.0) with a single IOPF...

[   78.482122] AMD-Vi: Event logged [IO_PAGE_FAULT device=01:00.0 domain=0x0014 address=0x0000000020001000 flags=0x0000]

[  171.593660] AMD-Vi: Event logged [IO_PAGE_FAULT device=01:00.0 domain=0x0014 address=0x0000000020001000 flags=0x0000]

and sometimes it is the GPU and OHCI0 with multiple ones.

[  280.680624] IO_PAGE_FAULT device=01:00.0 domain=0x0014 address=0x0000000020001000 flags=0x0000]
[  280.694539] AMD-Vi: Event logged [IO_PAGE_FAULT device=00:12.0 domain=0x0009 address=0x0000000000000080 flags=0x0020]
[  280.707917] AMD-Vi: Event logged [IO_PAGE_FAULT device=00:12.0 domain=0x0009 address=0x0000000000000000 flags=0x0000]
[  280.721234] AMD-Vi: Event logged [IO_PAGE_FAULT device=00:12.0 domain=0x0009 address=0x00000000ffffffc0 flags=0x0010]
[  280.734565] AMD-Vi: Event logged [IO_PAGE_FAULT device=00:12.0 domain=0x0009 address=0x00000000ffffffc0 flags=0x0010]
[  280.747877] AMD-Vi: Event logged [IO_PAGE_FAULT device=00:12.0 domain=0x0009 address=0x00000000ffffffc0 flags=0x0010]
[  280.761227] AMD-Vi: Event logged [IO_PAGE_FAULT device=00:12.0 domain=0x0009 address=0x00000000ffffffc0 flags=0x0010]
[  280.774444] AMD-Vi: Event logged [IO_PAGE_FAULT device=00:12.0 domain=0x0009 address=0x00000000ffffffc0 flags=0x0010]
[  290.686170] AMD-Vi: Event logged [IO_PAGE_FAULT device=01:00.0 domain=0x0014 address=0x0000000020001000 flags=0x0000]

[   49.393307] IO_PAGE_FAULT device=01:00.0 domain=0x0014 address=0x0000000020001000 flags=0x0000]

Other than that, the box is fine AFAICT.

-- 
Regards/Gruss,
    Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] iommu/amd: Implement syscore_ops.shutdown()
@ 2014-08-04 10:12                                 ` Borislav Petkov
  0 siblings, 0 replies; 24+ messages in thread
From: Borislav Petkov @ 2014-08-04 10:12 UTC (permalink / raw)
  To: Jiang Liu
  Cc: x86-ml, Rafael J . Wysocki, lkml,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	Thomas Gleixner

On Sat, Aug 02, 2014 at 10:25:39AM +0800, Jiang Liu wrote:
> During hibernation or shutdown, AMD iommu generates warnings on some
> platforms as below:
> [   89.089832] AMD-Vi: Event logged [IO_PAGE_FAULT device=00:12.0 domain=0x0009 address=0x0000000000000080 flags=0x0020]
> [   89.102239] AMD-Vi: Event logged [IO_PAGE_FAULT device=00:12.0 domain=0x0009 address=0x0000000000000000 flags=0x0000]
> [   89.114684] AMD-Vi: Event logged [IO_PAGE_FAULT device=00:12.0 domain=0x0009 address=0x00000000ffffffc0 flags=0x0010]
> [   89.127162] AMD-Vi: Event logged [IO_PAGE_FAULT device=00:12.0 domain=0x0009 address=0x00000000ffffffc0 flags=0x0010]
> [   89.139576] AMD-Vi: Event logged [IO_PAGE_FAULT device=00:12.0 domain=0x0009 address=0x00000000ffffffc0 flags=0x0010]
> [   89.152017] AMD-Vi: Event logged [IO_PAGE_FAULT device=00:12.0 domain=0x0009 address=0x00000000ffffffc0 flags=0x0010]
> [   89.164481] AMD-Vi: Event logged [IO_PAGE_FAULT device=00:12.0 domain=0x0009 address=0x00000000ffffffc0 flags=0x0010]
> 
> It may be caused by that the firmware takes back the device after the OS
> released it and now the legacy emulation tries to do DMA with it. But
> since there is an IOMMU the physical addresses it tries to DMA to is
> not mapped and it generated IO page faults. So explicitly shutdown
> IOMMU units during hibernation or poweroff.

Unfortunately, this doesn't help. The current observation is that I
don't get the IOPFs during *every* suspend-to-disk cycle but I do get
them from time to time. Here are all IOPFs from 10ish suspend/resume
cycles with this patch. The newlines between them show how the IOPFs
happen: sometimes it is the GPU (device 01:00.0) with a single IOPF...

[   78.482122] AMD-Vi: Event logged [IO_PAGE_FAULT device=01:00.0 domain=0x0014 address=0x0000000020001000 flags=0x0000]

[  171.593660] AMD-Vi: Event logged [IO_PAGE_FAULT device=01:00.0 domain=0x0014 address=0x0000000020001000 flags=0x0000]

and sometimes it is the GPU and OHCI0 with multiple ones.

[  280.680624] IO_PAGE_FAULT device=01:00.0 domain=0x0014 address=0x0000000020001000 flags=0x0000]
[  280.694539] AMD-Vi: Event logged [IO_PAGE_FAULT device=00:12.0 domain=0x0009 address=0x0000000000000080 flags=0x0020]
[  280.707917] AMD-Vi: Event logged [IO_PAGE_FAULT device=00:12.0 domain=0x0009 address=0x0000000000000000 flags=0x0000]
[  280.721234] AMD-Vi: Event logged [IO_PAGE_FAULT device=00:12.0 domain=0x0009 address=0x00000000ffffffc0 flags=0x0010]
[  280.734565] AMD-Vi: Event logged [IO_PAGE_FAULT device=00:12.0 domain=0x0009 address=0x00000000ffffffc0 flags=0x0010]
[  280.747877] AMD-Vi: Event logged [IO_PAGE_FAULT device=00:12.0 domain=0x0009 address=0x00000000ffffffc0 flags=0x0010]
[  280.761227] AMD-Vi: Event logged [IO_PAGE_FAULT device=00:12.0 domain=0x0009 address=0x00000000ffffffc0 flags=0x0010]
[  280.774444] AMD-Vi: Event logged [IO_PAGE_FAULT device=00:12.0 domain=0x0009 address=0x00000000ffffffc0 flags=0x0010]
[  290.686170] AMD-Vi: Event logged [IO_PAGE_FAULT device=01:00.0 domain=0x0014 address=0x0000000020001000 flags=0x0000]

[   49.393307] IO_PAGE_FAULT device=01:00.0 domain=0x0014 address=0x0000000020001000 flags=0x0000]

Other than that, the box is fine AFAICT.

-- 
Regards/Gruss,
    Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2014-08-04 10:12 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <20140728175326.GA7100@pd.tnic>
2014-07-28 20:59 ` rc7 + tip/master suspend fun Rafael J. Wysocki
2014-07-28 21:02   ` Thomas Gleixner
2014-07-28 22:02     ` Rafael J. Wysocki
2014-07-29  2:17     ` Jiang Liu
2014-07-30  9:37     ` [PATCH] x86, irq: Keep IRQ assignment for PCI devices during suspend/hibernation Jiang Liu
2014-07-30 17:58       ` Borislav Petkov
2014-07-31  0:33         ` Jiang Liu
2014-07-31 10:39           ` Borislav Petkov
2014-07-31 14:41             ` Jiang Liu
2014-07-31 14:41               ` Jiang Liu
2014-07-31 15:21               ` Borislav Petkov
2014-07-31 15:21                 ` Borislav Petkov
2014-07-31 16:36             ` Jiang Liu
2014-07-31 16:56               ` Borislav Petkov
2014-08-01 10:56                 ` [PATCH] x86, irq: Keep IRQ assignment for PCI devices during suspend/hibernation, bisected Borislav Petkov
2014-08-01 12:27                   ` Jiang Liu
2014-08-01 14:39                     ` Borislav Petkov
2014-08-01 16:11                       ` Borislav Petkov
2014-08-01 22:14                         ` Jörg Rödel
2014-08-01 22:50                           ` Borislav Petkov
2014-08-02  2:25                             ` [PATCH] iommu/amd: Implement syscore_ops.shutdown() Jiang Liu
2014-08-02  2:25                               ` Jiang Liu
2014-08-04 10:12                               ` Borislav Petkov
2014-08-04 10:12                                 ` Borislav Petkov

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.