linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Suspend-resume failure on Intel Eagle Lake Core2Duo
@ 2017-07-24 13:45 Martin Peres
  2017-07-24 15:28 ` Thomas Gleixner
  0 siblings, 1 reply; 45+ messages in thread
From: Martin Peres @ 2017-07-24 13:45 UTC (permalink / raw)
  To: jeffy.chen, Thomas Gleixner, Sarvela, Tomi P; +Cc: linux-kernel

Hello,

I am contacting you because your patch "bf22ff45b genirq: Avoid 
unnecessary low level irq function calls" broke suspend/resume on an old 
Intel platform.

This was not caught in linux-next and only got introduced in linux 4.13-rc1.

Here is the current place where the bug has been reported[0] but I can 
move it to bugzilla.kernel.org if you would want that.

Here are the kernel logs[1]. What information would you need?

Thanks in advance,
Martin

[0] https://bugs.freedesktop.org/show_bug.cgi?id=101838
[1] 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_2845/fi-elk-e7500/dmesg-before.log

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Suspend-resume failure on Intel Eagle Lake Core2Duo
  2017-07-24 13:45 Suspend-resume failure on Intel Eagle Lake Core2Duo Martin Peres
@ 2017-07-24 15:28 ` Thomas Gleixner
  2017-07-24 15:41   ` Martin Peres
  0 siblings, 1 reply; 45+ messages in thread
From: Thomas Gleixner @ 2017-07-24 15:28 UTC (permalink / raw)
  To: Martin Peres; +Cc: jeffy.chen, Sarvela, Tomi P, linux-kernel

On Mon, 24 Jul 2017, Martin Peres wrote:
> 
> I am contacting you because your patch "bf22ff45b genirq: Avoid unnecessary
> low level irq function calls" broke suspend/resume on an old Intel platform.
> 
> This was not caught in linux-next and only got introduced in linux 4.13-rc1.
> 
> Here is the current place where the bug has been reported[0] but I can move it
> to bugzilla.kernel.org if you would want that.

Can we please handle this per mail?

> Here are the kernel logs[1]. What information would you need?

Output of 'cat /proc/interrupts' and a description what kind of 'old' Intel
platform that is.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Suspend-resume failure on Intel Eagle Lake Core2Duo
  2017-07-24 15:28 ` Thomas Gleixner
@ 2017-07-24 15:41   ` Martin Peres
  2017-07-24 16:35     ` Thomas Gleixner
  0 siblings, 1 reply; 45+ messages in thread
From: Martin Peres @ 2017-07-24 15:41 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: jeffy.chen, Sarvela, Tomi P, linux-kernel

On 24/07/17 18:28, Thomas Gleixner wrote:
> On Mon, 24 Jul 2017, Martin Peres wrote:
>>
>> I am contacting you because your patch "bf22ff45b genirq: Avoid unnecessary
>> low level irq function calls" broke suspend/resume on an old Intel platform.
>>
>> This was not caught in linux-next and only got introduced in linux 4.13-rc1.
>>
>> Here is the current place where the bug has been reported[0] but I can move it
>> to bugzilla.kernel.org if you would want that.
> 
> Can we please handle this per mail?

Sure, if you prefer.

> 
>> Here are the kernel logs[1]. What information would you need?
> 
> Output of 'cat /proc/interrupts' and a description what kind of 'old' Intel
> platform that is.

Sorry, I should have repeated the name outside of just the subject of 
the email. It is an Intel Eagle Lake Core2Duo.

$ sudo cat /proc/interrupts
            CPU0       CPU1 
 

   0:        182          0   IO-APIC   2-edge      timer 
 

   1:          2          1   IO-APIC   1-edge      i8042 
 

   4:          6          4   IO-APIC   4-edge      serial 
 

   5:          0          0   IO-APIC   5-edge      ehci_hcd:usb1, 
uhci_hcd:usb5, uhci_hcd:usb8 

   8:          1          0   IO-APIC   8-edge      rtc0 
 

   9:          0          0   IO-APIC   9-fasteoi   acpi 
 

  10:          0          0   IO-APIC  10-edge      ehci_hcd:usb2, 
uhci_hcd:usb3, uhci_hcd:usb6 

  11:          0          0   IO-APIC  11-edge      uhci_hcd:usb4, 
uhci_hcd:usb7 

  12:          3          2   IO-APIC  12-edge      i8042 
 

  24:      63345          0  HPET-MSI   2-edge      hpet2 
 

  25:          0      70778  HPET-MSI   3-edge      hpet3 
 

  28:          0          0   PCI-MSI 458752-edge      PCIe PME 
 

  29:          0          0   PCI-MSI 466944-edge      PCIe PME 
 

  30:       5349      10818   PCI-MSI 512000-edge 
ahci[0000:00:1f.2] 

  31:        810      10234   PCI-MSI 409600-edge      enp0s25 
 

  32:          9          9   PCI-MSI 49152-edge      mei_me 
 

  33:        233        223   PCI-MSI 442368-edge 
snd_hda_intel:card0 

  34:          1          0   PCI-MSI 32768-edge      i915 
 

NMI:         38         33   Non-maskable interrupts 
 

LOC:        710        438   Local timer interrupts 
 

SPU:          0          0   Spurious interrupts 
 

PMI:         38         33   Performance monitoring interrupts 
 

IWI:          0          0   IRQ work interrupts 
 

RTR:          0          0   APIC ICR read retries 
 

RES:       2629       2425   Rescheduling interrupts 
 

CAL:        618        722   Function call interrupts 
 

TLB:        321        309   TLB shootdowns 
 

TRM:          0          0   Thermal event interrupts 
 

THR:          0          0   Threshold APIC interrupts 
 

MCE:          0          0   Machine check exceptions 
 

MCP:          7          7   Machine check polls 
 

ERR:          0 
 

MIS:          0 
 

PIN:          0          0   Posted-interrupt notification event 
 

PIW:          0          0   Posted-interrupt wakeup event

Cheers,
Martin

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Suspend-resume failure on Intel Eagle Lake Core2Duo
  2017-07-24 15:41   ` Martin Peres
@ 2017-07-24 16:35     ` Thomas Gleixner
  2017-07-24 16:37       ` Martin Peres
  0 siblings, 1 reply; 45+ messages in thread
From: Thomas Gleixner @ 2017-07-24 16:35 UTC (permalink / raw)
  To: Martin Peres; +Cc: jeffy.chen, Sarvela, Tomi P, linux-kernel

On Mon, 24 Jul 2017, Martin Peres wrote:
> On 24/07/17 18:28, Thomas Gleixner wrote:
> > Output of 'cat /proc/interrupts' and a description what kind of 'old' Intel
> > platform that is.
> 
> Sorry, I should have repeated the name outside of just the subject of the
> email. It is an Intel Eagle Lake Core2Duo.

I hoped you would give me a little bit more details, but I Gurgled it
now. I try to find such a beast and look what's wrong there. If I can't
find one, I'll come back with some debug patches.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Suspend-resume failure on Intel Eagle Lake Core2Duo
  2017-07-24 16:35     ` Thomas Gleixner
@ 2017-07-24 16:37       ` Martin Peres
  2017-07-25  7:01         ` Tomi Sarvela
  0 siblings, 1 reply; 45+ messages in thread
From: Martin Peres @ 2017-07-24 16:37 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: jeffy.chen, Sarvela, Tomi P, linux-kernel

On 24/07/17 19:35, Thomas Gleixner wrote:
> On Mon, 24 Jul 2017, Martin Peres wrote:
>> On 24/07/17 18:28, Thomas Gleixner wrote:
>>> Output of 'cat /proc/interrupts' and a description what kind of 'old' Intel
>>> platform that is.
>>
>> Sorry, I should have repeated the name outside of just the subject of the
>> email. It is an Intel Eagle Lake Core2Duo.
> 
> I hoped you would give me a little bit more details, but I Gurgled it
> now. I try to find such a beast and look what's wrong there. If I can't
> find one, I'll come back with some debug patches.

Ha, sorry!

Here is all the info I have:
  - Hardware: HP Compaq 8000
  - CPU: Eagle Lake Core2Duo E7500

I hope this helps!

Martin

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Suspend-resume failure on Intel Eagle Lake Core2Duo
  2017-07-24 16:37       ` Martin Peres
@ 2017-07-25  7:01         ` Tomi Sarvela
  2017-07-26 13:47           ` Martin Peres
  0 siblings, 1 reply; 45+ messages in thread
From: Tomi Sarvela @ 2017-07-25  7:01 UTC (permalink / raw)
  To: Martin Peres, Thomas Gleixner; +Cc: jeffy.chen, linux-kernel

On 24/07/17 19:37, Martin Peres wrote:
> On 24/07/17 19:35, Thomas Gleixner wrote:
>> On Mon, 24 Jul 2017, Martin Peres wrote:
>>> On 24/07/17 18:28, Thomas Gleixner wrote:
>>>> Output of 'cat /proc/interrupts' and a description what kind of 
>>>> 'old' Intel
>>>> platform that is.
>>>
>>> Sorry, I should have repeated the name outside of just the subject of 
>>> the
>>> email. It is an Intel Eagle Lake Core2Duo.
>>
>> I hoped you would give me a little bit more details, but I Gurgled it
>> now. I try to find such a beast and look what's wrong there. If I can't
>> find one, I'll come back with some debug patches.
> 
> Ha, sorry!
> 
> Here is all the info I have:
>   - Hardware: HP Compaq 8000
>   - CPU: Eagle Lake Core2Duo E7500

As mentioned, it's HP Compaq 8000 Elite, desktop tower in most basic 
configuration: one SATA SSD, 4GB memory and so on. Most recent BIOS from 
around 2012, no UEFI.

If you have proposed fix for the issue, I'll be happy to test them.

Best regards,

Tomi
-- 
Intel Finland Oy - BIC 0357606-4 - Westendinkatu 7, 02160 Espoo

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Suspend-resume failure on Intel Eagle Lake Core2Duo
  2017-07-25  7:01         ` Tomi Sarvela
@ 2017-07-26 13:47           ` Martin Peres
  2017-07-26 14:26             ` Thomas Gleixner
  0 siblings, 1 reply; 45+ messages in thread
From: Martin Peres @ 2017-07-26 13:47 UTC (permalink / raw)
  To: Tomi Sarvela, Thomas Gleixner; +Cc: jeffy.chen, linux-kernel



On 25/07/17 10:01, Tomi Sarvela wrote:
> On 24/07/17 19:37, Martin Peres wrote:
>> On 24/07/17 19:35, Thomas Gleixner wrote:
>>> On Mon, 24 Jul 2017, Martin Peres wrote:
>>>> On 24/07/17 18:28, Thomas Gleixner wrote:
>>>>> Output of 'cat /proc/interrupts' and a description what kind of 
>>>>> 'old' Intel
>>>>> platform that is.
>>>>
>>>> Sorry, I should have repeated the name outside of just the subject 
>>>> of the
>>>> email. It is an Intel Eagle Lake Core2Duo.
>>>
>>> I hoped you would give me a little bit more details, but I Gurgled it
>>> now. I try to find such a beast and look what's wrong there. If I can't
>>> find one, I'll come back with some debug patches.
>>
>> Ha, sorry!
>>
>> Here is all the info I have:
>>   - Hardware: HP Compaq 8000
>>   - CPU: Eagle Lake Core2Duo E7500
> 
> As mentioned, it's HP Compaq 8000 Elite, desktop tower in most basic 
> configuration: one SATA SSD, 4GB memory and so on. Most recent BIOS from 
> around 2012, no UEFI.
> 
> If you have proposed fix for the issue, I'll be happy to test them.
> 
> Best regards,

Thanks for the information Tomi.

Thomas, any update on this? Another user reported the problem:

"I have the same thing on an old cheap laptop.
model name      : Intel(R) Core(TM)2 CPU
cpu family      : 6
model           : 15

Previously, this never happened on the kernel series 4.{10,11,12}.
After reverted commit bf22ff45bed664 on 4.13-rc2, unfortunately it still 
hangs when it wakes up.
Difficult to diagnose because it does not happen right away. After some 
time of action."

I hope this helps!

Cheers,
Martin

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Suspend-resume failure on Intel Eagle Lake Core2Duo
  2017-07-26 13:47           ` Martin Peres
@ 2017-07-26 14:26             ` Thomas Gleixner
  2017-07-27  7:36               ` Tomi Sarvela
  0 siblings, 1 reply; 45+ messages in thread
From: Thomas Gleixner @ 2017-07-26 14:26 UTC (permalink / raw)
  To: Martin Peres; +Cc: Tomi Sarvela, jeffy.chen, linux-kernel

On Wed, 26 Jul 2017, Martin Peres wrote:
> On 25/07/17 10:01, Tomi Sarvela wrote:
> > On 24/07/17 19:37, Martin Peres wrote:
> > > On 24/07/17 19:35, Thomas Gleixner wrote:
> > > > On Mon, 24 Jul 2017, Martin Peres wrote:
> > > > > On 24/07/17 18:28, Thomas Gleixner wrote:
> > > > > > Output of 'cat /proc/interrupts' and a description what kind of
> > > > > > 'old' Intel
> > > > > > platform that is.
> > > > > 
> > > > > Sorry, I should have repeated the name outside of just the subject of
> > > > > the
> > > > > email. It is an Intel Eagle Lake Core2Duo.
> > > > 
> > > > I hoped you would give me a little bit more details, but I Gurgled it
> > > > now. I try to find such a beast and look what's wrong there. If I can't
> > > > find one, I'll come back with some debug patches.
> > > 
> > > Ha, sorry!
> > > 
> > > Here is all the info I have:
> > >   - Hardware: HP Compaq 8000
> > >   - CPU: Eagle Lake Core2Duo E7500
> > 
> > As mentioned, it's HP Compaq 8000 Elite, desktop tower in most basic
> > configuration: one SATA SSD, 4GB memory and so on. Most recent BIOS from
> > around 2012, no UEFI.
> > 
> > If you have proposed fix for the issue, I'll be happy to test them.
> > 
> > Best regards,
> 
> Thanks for the information Tomi.
> 
> Thomas, any update on this? Another user reported the problem:

Not yet.

> "I have the same thing on an old cheap laptop.
> model name      : Intel(R) Core(TM)2 CPU
> cpu family      : 6
> model           : 15
> 
> Previously, this never happened on the kernel series 4.{10,11,12}.
> After reverted commit bf22ff45bed664 on 4.13-rc2, unfortunately it still hangs
> when it wakes up.

So reverting that commit does not help. Does it help on your machine?

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Suspend-resume failure on Intel Eagle Lake Core2Duo
  2017-07-26 14:26             ` Thomas Gleixner
@ 2017-07-27  7:36               ` Tomi Sarvela
  2017-07-27  7:42                 ` Thomas Gleixner
  2017-07-28 12:34                 ` Martin Peres
  0 siblings, 2 replies; 45+ messages in thread
From: Tomi Sarvela @ 2017-07-27  7:36 UTC (permalink / raw)
  To: Thomas Gleixner, Martin Peres; +Cc: jeffy.chen, linux-kernel

On 26/07/17 17:26, Thomas Gleixner wrote:
> On Wed, 26 Jul 2017, Martin Peres wrote:
>> On 25/07/17 10:01, Tomi Sarvela wrote:
>>> On 24/07/17 19:37, Martin Peres wrote:
>>>> On 24/07/17 19:35, Thomas Gleixner wrote:
>>>>> On Mon, 24 Jul 2017, Martin Peres wrote:
>>>>>> On 24/07/17 18:28, Thomas Gleixner wrote:
>>>>>>> Output of 'cat /proc/interrupts' and a description what kind of
>>>>>>> 'old' Intel
>>>>>>> platform that is.
>>>>>>
>>>>>> Sorry, I should have repeated the name outside of just the subject of
>>>>>> the
>>>>>> email. It is an Intel Eagle Lake Core2Duo.
>>>>>
>>>>> I hoped you would give me a little bit more details, but I Gurgled it
>>>>> now. I try to find such a beast and look what's wrong there. If I can't
>>>>> find one, I'll come back with some debug patches.
>>>>
>>>> Ha, sorry!
>>>>
>>>> Here is all the info I have:
>>>>    - Hardware: HP Compaq 8000
>>>>    - CPU: Eagle Lake Core2Duo E7500
>>>
>>> As mentioned, it's HP Compaq 8000 Elite, desktop tower in most basic
>>> configuration: one SATA SSD, 4GB memory and so on. Most recent BIOS from
>>> around 2012, no UEFI.
>>>
>>> If you have proposed fix for the issue, I'll be happy to test them.
>>>
>>> Best regards,
>>
>> Thanks for the information Tomi.
>>
>> Thomas, any update on this? Another user reported the problem:
> 
> Not yet.
> 
>> "I have the same thing on an old cheap laptop.
>> model name      : Intel(R) Core(TM)2 CPU
>> cpu family      : 6
>> model           : 15
>>
>> Previously, this never happened on the kernel series 4.{10,11,12}.
>> After reverted commit bf22ff45bed664 on 4.13-rc2, unfortunately it still hangs
>> when it wakes up.
> 
> So reverting that commit does not help. Does it help on your machine?

Yes. Reverting it does not cause the machine to lock up on resume.

I haven't tested if the machine locks up later on, but at least it 
survives couple of s/r cycles.

Tomi
-- 
Intel Finland Oy - BIC 0357606-4 - Westendinkatu 7, 02160 Espoo

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Suspend-resume failure on Intel Eagle Lake Core2Duo
  2017-07-27  7:36               ` Tomi Sarvela
@ 2017-07-27  7:42                 ` Thomas Gleixner
  2017-07-27  8:12                   ` Tomi Sarvela
  2017-07-28 12:34                 ` Martin Peres
  1 sibling, 1 reply; 45+ messages in thread
From: Thomas Gleixner @ 2017-07-27  7:42 UTC (permalink / raw)
  To: Tomi Sarvela; +Cc: Martin Peres, jeffy.chen, linux-kernel

On Thu, 27 Jul 2017, Tomi Sarvela wrote:
> On 26/07/17 17:26, Thomas Gleixner wrote:
> > So reverting that commit does not help. Does it help on your machine?
> 
> Yes. Reverting it does not cause the machine to lock up on resume.
> 
> I haven't tested if the machine locks up later on, but at least it survives
> couple of s/r cycles.

Can you please try to add 'nohpet' to the kernel command line?

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Suspend-resume failure on Intel Eagle Lake Core2Duo
  2017-07-27  7:42                 ` Thomas Gleixner
@ 2017-07-27  8:12                   ` Tomi Sarvela
  2017-07-27 19:42                     ` Thomas Gleixner
  0 siblings, 1 reply; 45+ messages in thread
From: Tomi Sarvela @ 2017-07-27  8:12 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: Martin Peres, jeffy.chen, linux-kernel

On 27/07/17 10:42, Thomas Gleixner wrote:
> On Thu, 27 Jul 2017, Tomi Sarvela wrote:
>> On 26/07/17 17:26, Thomas Gleixner wrote:
>>> So reverting that commit does not help. Does it help on your machine?
>>
>> Yes. Reverting it does not cause the machine to lock up on resume.
>>
>> I haven't tested if the machine locks up later on, but at least it survives
>> couple of s/r cycles.
> 
> Can you please try to add 'nohpet' to the kernel command line?

Option nohpet didn't change anything, still hangs on s/r.

Tomi
-- 
Intel Finland Oy - BIC 0357606-4 - Westendinkatu 7, 02160 Espoo

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Suspend-resume failure on Intel Eagle Lake Core2Duo
  2017-07-27  8:12                   ` Tomi Sarvela
@ 2017-07-27 19:42                     ` Thomas Gleixner
  2017-07-27 20:13                       ` Thomas Gleixner
  0 siblings, 1 reply; 45+ messages in thread
From: Thomas Gleixner @ 2017-07-27 19:42 UTC (permalink / raw)
  To: Tomi Sarvela; +Cc: Martin Peres, jeffy.chen, linux-kernel

On Thu, 27 Jul 2017, Tomi Sarvela wrote:

> On 27/07/17 10:42, Thomas Gleixner wrote:
> > On Thu, 27 Jul 2017, Tomi Sarvela wrote:
> > > On 26/07/17 17:26, Thomas Gleixner wrote:
> > > > So reverting that commit does not help. Does it help on your machine?
> > > 
> > > Yes. Reverting it does not cause the machine to lock up on resume.
> > > 
> > > I haven't tested if the machine locks up later on, but at least it
> > > survives
> > > couple of s/r cycles.
> > 
> > Can you please try to add 'nohpet' to the kernel command line?
> 
> Option nohpet didn't change anything, still hangs on s/r.

Ok. Was a shot in the dark. I tried on a similar machine, but that one
resumes fine (except that the AHCI controller plays silly buggers, but
nothing interrupt related). I might have access to another core2duo machine
tomorrow.

I'll send you a debug patch shortly, but can you please first check when
the wreckage happens by testing the states in

    /sys/power/pm_test

freezer
devices
platform
processors
core

That might give some first hint to narrow down the issue.

Debug patch follows soon.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Suspend-resume failure on Intel Eagle Lake Core2Duo
  2017-07-27 19:42                     ` Thomas Gleixner
@ 2017-07-27 20:13                       ` Thomas Gleixner
  2017-07-27 21:08                         ` Thomas Gleixner
  0 siblings, 1 reply; 45+ messages in thread
From: Thomas Gleixner @ 2017-07-27 20:13 UTC (permalink / raw)
  To: Tomi Sarvela; +Cc: Martin Peres, jeffy.chen, linux-kernel

On Thu, 27 Jul 2017, Thomas Gleixner wrote:
> On Thu, 27 Jul 2017, Tomi Sarvela wrote:
> 
> > On 27/07/17 10:42, Thomas Gleixner wrote:
> > > On Thu, 27 Jul 2017, Tomi Sarvela wrote:
> > > > On 26/07/17 17:26, Thomas Gleixner wrote:
> > > > > So reverting that commit does not help. Does it help on your machine?
> > > > 
> > > > Yes. Reverting it does not cause the machine to lock up on resume.
> > > > 
> > > > I haven't tested if the machine locks up later on, but at least it
> > > > survives
> > > > couple of s/r cycles.
> > > 
> > > Can you please try to add 'nohpet' to the kernel command line?
> > 
> > Option nohpet didn't change anything, still hangs on s/r.
> 
> Ok. Was a shot in the dark. I tried on a similar machine, but that one
> resumes fine (except that the AHCI controller plays silly buggers, but
> nothing interrupt related). I might have access to another core2duo machine
> tomorrow.
> 
> I'll send you a debug patch shortly, but can you please first check when
> the wreckage happens by testing the states in
> 
>     /sys/power/pm_test
> 
> freezer
> devices
> platform
> processors
> core

Actually for suspend to ram we only have

	 freezer, devices, platform

I assume it's platform because that is where the actual interrupt
suspend/resume happens.

If that survives, then it's the low level architecture s/r code which
fiddles with the interrupt controllers and leaves them in a state which is
not known to the core code.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Suspend-resume failure on Intel Eagle Lake Core2Duo
  2017-07-27 20:13                       ` Thomas Gleixner
@ 2017-07-27 21:08                         ` Thomas Gleixner
  2017-07-28 12:29                           ` Tomi Sarvela
  0 siblings, 1 reply; 45+ messages in thread
From: Thomas Gleixner @ 2017-07-27 21:08 UTC (permalink / raw)
  To: Tomi Sarvela; +Cc: Martin Peres, jeffy.chen, linux-kernel

On Thu, 27 Jul 2017, Thomas Gleixner wrote:
> On Thu, 27 Jul 2017, Thomas Gleixner wrote:
> > On Thu, 27 Jul 2017, Tomi Sarvela wrote:
> > 
> > > On 27/07/17 10:42, Thomas Gleixner wrote:
> > > > On Thu, 27 Jul 2017, Tomi Sarvela wrote:
> > > > > On 26/07/17 17:26, Thomas Gleixner wrote:
> > > > > > So reverting that commit does not help. Does it help on your machine?
> > > > > 
> > > > > Yes. Reverting it does not cause the machine to lock up on resume.
> > > > > 
> > > > > I haven't tested if the machine locks up later on, but at least it
> > > > > survives
> > > > > couple of s/r cycles.
> > > > 
> > > > Can you please try to add 'nohpet' to the kernel command line?
> > > 
> > > Option nohpet didn't change anything, still hangs on s/r.
> > 
> > Ok. Was a shot in the dark. I tried on a similar machine, but that one
> > resumes fine (except that the AHCI controller plays silly buggers, but
> > nothing interrupt related). I might have access to another core2duo machine
> > tomorrow.
> > 
> > I'll send you a debug patch shortly, but can you please first check when
> > the wreckage happens by testing the states in
> > 
> >     /sys/power/pm_test
> > 
> > freezer
> > devices
> > platform
> > processors
> > core
> 
> Actually for suspend to ram we only have
> 
> 	 freezer, devices, platform
> 
> I assume it's platform because that is where the actual interrupt
> suspend/resume happens.
> 
> If that survives, then it's the low level architecture s/r code which
> fiddles with the interrupt controllers and leaves them in a state which is
> not known to the core code.

Debug patch below. It should make the machine resume again. Emphasis on
"should". Please provide the output of /sys/kernel/debug/tracing/trace
after resume.

Thanks,

	tglx

8<-----------

--- a/kernel/irq/chip.c
+++ b/kernel/irq/chip.c
@@ -304,7 +304,10 @@ void irq_shutdown(struct irq_desc *desc)
 
 void irq_enable(struct irq_desc *desc)
 {
-	if (!irqd_irq_disabled(&desc->irq_data)) {
+	if (irq_suspend_resume)
+		irq_trace_state("preenable", desc);
+
+	if (!irqd_irq_disabled(&desc->irq_data) && !irq_suspend_resume) {
 		unmask_irq(desc);
 	} else {
 		irq_state_clr_disabled(desc);
@@ -315,10 +318,16 @@ void irq_enable(struct irq_desc *desc)
 			unmask_irq(desc);
 		}
 	}
+
+	if (irq_suspend_resume)
+		irq_trace_state("postenable", desc);
 }
 
 static void __irq_disable(struct irq_desc *desc, bool mask)
 {
+	if (irq_suspend_resume)
+		irq_trace_state("predisable", desc);
+
 	if (irqd_irq_disabled(&desc->irq_data)) {
 		if (mask)
 			mask_irq(desc);
@@ -331,6 +340,9 @@ static void __irq_disable(struct irq_des
 			mask_irq(desc);
 		}
 	}
+
+	if (irq_suspend_resume)
+		irq_trace_state("postdisable", desc);
 }
 
 /**
@@ -390,6 +402,9 @@ static inline void mask_ack_irq(struct i
 
 void mask_irq(struct irq_desc *desc)
 {
+	if (irq_suspend_resume)
+		irq_trace_state("premask", desc);
+
 	if (irqd_irq_masked(&desc->irq_data))
 		return;
 
@@ -397,17 +412,26 @@ void mask_irq(struct irq_desc *desc)
 		desc->irq_data.chip->irq_mask(&desc->irq_data);
 		irq_state_set_masked(desc);
 	}
+
+	if (irq_suspend_resume)
+		irq_trace_state("postmask", desc);
 }
 
 void unmask_irq(struct irq_desc *desc)
 {
-	if (!irqd_irq_masked(&desc->irq_data))
+	if (irq_suspend_resume)
+		irq_trace_state("preunmask", desc);
+
+	if (!irqd_irq_masked(&desc->irq_data) && !irq_suspend_resume)
 		return;
 
 	if (desc->irq_data.chip->irq_unmask) {
 		desc->irq_data.chip->irq_unmask(&desc->irq_data);
 		irq_state_clr_masked(desc);
 	}
+
+	if (irq_suspend_resume)
+		irq_trace_state("postunmask", desc);
 }
 
 void unmask_threaded_irq(struct irq_desc *desc)
--- a/kernel/irq/internals.h
+++ b/kernel/irq/internals.h
@@ -459,3 +459,11 @@ static inline void irq_remove_debugfs_en
 {
 }
 #endif /* CONFIG_GENERIC_IRQ_DEBUGFS */
+
+bool irq_suspend_resume;
+
+static inline void irq_trace_state(const char *what, struct irq_desc *desc)
+{
+	trace_printk("%s %d state %08x\n", what, irq_desc_get_irq(desc),
+		     irqd_get(&desc->irq_data));
+}
--- a/kernel/irq/pm.c
+++ b/kernel/irq/pm.c
@@ -14,6 +14,8 @@
 
 #include "internals.h"
 
+bool irq_suspend_resume;
+
 bool irq_pm_check_wakeup(struct irq_desc *desc)
 {
 	if (irqd_is_wakeup_armed(&desc->irq_data)) {
@@ -120,6 +122,7 @@ void suspend_device_irqs(void)
 	struct irq_desc *desc;
 	int irq;
 
+	irq_suspend_resume = true;
 	for_each_irq_desc(irq, desc) {
 		unsigned long flags;
 		bool sync;
@@ -127,7 +130,9 @@ void suspend_device_irqs(void)
 		if (irq_settings_is_nested_thread(desc))
 			continue;
 		raw_spin_lock_irqsave(&desc->lock, flags);
+		irq_trace_state("presuspend", desc);
 		sync = suspend_device_irq(desc);
+		irq_trace_state("postsuspend", desc);
 		raw_spin_unlock_irqrestore(&desc->lock, flags);
 
 		if (sync)
@@ -172,9 +177,14 @@ static void resume_irqs(bool want_early)
 			continue;
 
 		raw_spin_lock_irqsave(&desc->lock, flags);
+		irq_trace_state("preresume", desc);
 		resume_irq(desc);
+		irq_trace_state("postresume", desc);
 		raw_spin_unlock_irqrestore(&desc->lock, flags);
 	}
+
+	if (!want_early)
+		irq_suspend_resume = false;
 }
 
 /**

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Suspend-resume failure on Intel Eagle Lake Core2Duo
  2017-07-27 21:08                         ` Thomas Gleixner
@ 2017-07-28 12:29                           ` Tomi Sarvela
  2017-07-28 12:42                             ` Thomas Gleixner
  0 siblings, 1 reply; 45+ messages in thread
From: Tomi Sarvela @ 2017-07-28 12:29 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: Martin Peres, jeffy.chen, linux-kernel

On 28/07/17 00:08, Thomas Gleixner wrote:
> On Thu, 27 Jul 2017, Thomas Gleixner wrote:
>> On Thu, 27 Jul 2017, Thomas Gleixner wrote:
>>> On Thu, 27 Jul 2017, Tomi Sarvela wrote:
>>>> On 27/07/17 10:42, Thomas Gleixner wrote:
>>>>> On Thu, 27 Jul 2017, Tomi Sarvela wrote:
>>>>>> On 26/07/17 17:26, Thomas Gleixner wrote:
>>>>>>> So reverting that commit does not help. Does it help on your machine?
>>>>>>
>>>>>> Yes. Reverting it does not cause the machine to lock up on resume.
>>>>>>
>>>>>> I haven't tested if the machine locks up later on, but at least it
>>>>>> survives
>>>>>> couple of s/r cycles.
>>>>>
>>>>> Can you please try to add 'nohpet' to the kernel command line?
>>>>
>>>> Option nohpet didn't change anything, still hangs on s/r.
>>>
>>> Ok. Was a shot in the dark. I tried on a similar machine, but that one
>>> resumes fine (except that the AHCI controller plays silly buggers, but
>>> nothing interrupt related). I might have access to another core2duo machine
>>> tomorrow.
>>>
>>> I'll send you a debug patch shortly, but can you please first check when
>>> the wreckage happens by testing the states in
>>>
>>>      /sys/power/pm_test
>>>
>>> freezer
>>> devices
>>> platform
>>> processors
>>> core
>>
>> Actually for suspend to ram we only have
>>
>> 	 freezer, devices, platform
>>
>> I assume it's platform because that is where the actual interrupt
>> suspend/resume happens.
>>
>> If that survives, then it's the low level architecture s/r code which
>> fiddles with the interrupt controllers and leaves them in a state which is
>> not known to the core code.
> 
> Debug patch below. It should make the machine resume again. Emphasis on
> "should". Please provide the output of /sys/kernel/debug/tracing/trace
> after resume.

The patch didn't apply cleanly: can you tell exact commit or tag it has 
been created against? I tried to hand-wrangle the changes in, but then I 
got compilation errors:

CC      ipc/compat.o
+0x0): multiple definition of `irq_suspend_resume'
kernel/irq/irqdesc.o:(.bss+0x0): first defined here
kernel/irq/manage.o:(.bss+0x8): multiple definition of `irq_suspend_resume'
kernel/irq/irqdesc.o:/home/testrunner/drm-tip/kernel/irq/irqdesc.c:270: 
first defined here
kernel/irq/spurious.o:(.bss+0x0): multiple definition of 
`irq_suspend_resume'

Also, the usage of /sys/power/pm_test was not intuitive to me. Can you 
explain which kind of combinations do you want to test?

Best regards,

Tomi


> Thanks,
> 
> 	tglx
> 
> 8<-----------
> 
> --- a/kernel/irq/chip.c
> +++ b/kernel/irq/chip.c
> @@ -304,7 +304,10 @@ void irq_shutdown(struct irq_desc *desc)
>   
>   void irq_enable(struct irq_desc *desc)
>   {
> -	if (!irqd_irq_disabled(&desc->irq_data)) {
> +	if (irq_suspend_resume)
> +		irq_trace_state("preenable", desc);
> +
> +	if (!irqd_irq_disabled(&desc->irq_data) && !irq_suspend_resume) {
>   		unmask_irq(desc);
>   	} else {
>   		irq_state_clr_disabled(desc);
> @@ -315,10 +318,16 @@ void irq_enable(struct irq_desc *desc)
>   			unmask_irq(desc);
>   		}
>   	}
> +
> +	if (irq_suspend_resume)
> +		irq_trace_state("postenable", desc);
>   }
>   
>   static void __irq_disable(struct irq_desc *desc, bool mask)
>   {
> +	if (irq_suspend_resume)
> +		irq_trace_state("predisable", desc);
> +
>   	if (irqd_irq_disabled(&desc->irq_data)) {
>   		if (mask)
>   			mask_irq(desc);
> @@ -331,6 +340,9 @@ static void __irq_disable(struct irq_des
>   			mask_irq(desc);
>   		}
>   	}
> +
> +	if (irq_suspend_resume)
> +		irq_trace_state("postdisable", desc);
>   }
>   
>   /**
> @@ -390,6 +402,9 @@ static inline void mask_ack_irq(struct i
>   
>   void mask_irq(struct irq_desc *desc)
>   {
> +	if (irq_suspend_resume)
> +		irq_trace_state("premask", desc);
> +
>   	if (irqd_irq_masked(&desc->irq_data))
>   		return;
>   
> @@ -397,17 +412,26 @@ void mask_irq(struct irq_desc *desc)
>   		desc->irq_data.chip->irq_mask(&desc->irq_data);
>   		irq_state_set_masked(desc);
>   	}
> +
> +	if (irq_suspend_resume)
> +		irq_trace_state("postmask", desc);
>   }
>   
>   void unmask_irq(struct irq_desc *desc)
>   {
> -	if (!irqd_irq_masked(&desc->irq_data))
> +	if (irq_suspend_resume)
> +		irq_trace_state("preunmask", desc);
> +
> +	if (!irqd_irq_masked(&desc->irq_data) && !irq_suspend_resume)
>   		return;
>   
>   	if (desc->irq_data.chip->irq_unmask) {
>   		desc->irq_data.chip->irq_unmask(&desc->irq_data);
>   		irq_state_clr_masked(desc);
>   	}
> +
> +	if (irq_suspend_resume)
> +		irq_trace_state("postunmask", desc);
>   }
>   
>   void unmask_threaded_irq(struct irq_desc *desc)
> --- a/kernel/irq/internals.h
> +++ b/kernel/irq/internals.h
> @@ -459,3 +459,11 @@ static inline void irq_remove_debugfs_en
>   {
>   }
>   #endif /* CONFIG_GENERIC_IRQ_DEBUGFS */
> +
> +bool irq_suspend_resume;
> +
> +static inline void irq_trace_state(const char *what, struct irq_desc *desc)
> +{
> +	trace_printk("%s %d state %08x\n", what, irq_desc_get_irq(desc),
> +		     irqd_get(&desc->irq_data));
> +}
> --- a/kernel/irq/pm.c
> +++ b/kernel/irq/pm.c
> @@ -14,6 +14,8 @@
>   
>   #include "internals.h"
>   
> +bool irq_suspend_resume;
> +
>   bool irq_pm_check_wakeup(struct irq_desc *desc)
>   {
>   	if (irqd_is_wakeup_armed(&desc->irq_data)) {
> @@ -120,6 +122,7 @@ void suspend_device_irqs(void)
>   	struct irq_desc *desc;
>   	int irq;
>   
> +	irq_suspend_resume = true;
>   	for_each_irq_desc(irq, desc) {
>   		unsigned long flags;
>   		bool sync;
> @@ -127,7 +130,9 @@ void suspend_device_irqs(void)
>   		if (irq_settings_is_nested_thread(desc))
>   			continue;
>   		raw_spin_lock_irqsave(&desc->lock, flags);
> +		irq_trace_state("presuspend", desc);
>   		sync = suspend_device_irq(desc);
> +		irq_trace_state("postsuspend", desc);
>   		raw_spin_unlock_irqrestore(&desc->lock, flags);
>   
>   		if (sync)
> @@ -172,9 +177,14 @@ static void resume_irqs(bool want_early)
>   			continue;
>   
>   		raw_spin_lock_irqsave(&desc->lock, flags);
> +		irq_trace_state("preresume", desc);
>   		resume_irq(desc);
> +		irq_trace_state("postresume", desc);
>   		raw_spin_unlock_irqrestore(&desc->lock, flags);
>   	}
> +
> +	if (!want_early)
> +		irq_suspend_resume = false;
>   }
>   
>   /**
> 
-- 
Intel Finland Oy - BIC 0357606-4 - Westendinkatu 7, 02160 Espoo

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Suspend-resume failure on Intel Eagle Lake Core2Duo
  2017-07-27  7:36               ` Tomi Sarvela
  2017-07-27  7:42                 ` Thomas Gleixner
@ 2017-07-28 12:34                 ` Martin Peres
  1 sibling, 0 replies; 45+ messages in thread
From: Martin Peres @ 2017-07-28 12:34 UTC (permalink / raw)
  To: Tomi Sarvela, Thomas Gleixner; +Cc: jeffy.chen, linux-kernel

On 27/07/17 10:36, Tomi Sarvela wrote:
> On 26/07/17 17:26, Thomas Gleixner wrote:
>> On Wed, 26 Jul 2017, Martin Peres wrote:
>>> On 25/07/17 10:01, Tomi Sarvela wrote:
>>>> On 24/07/17 19:37, Martin Peres wrote:
>>>>> On 24/07/17 19:35, Thomas Gleixner wrote:
>>>>>> On Mon, 24 Jul 2017, Martin Peres wrote:
>>>>>>> On 24/07/17 18:28, Thomas Gleixner wrote:
>>>>>>>> Output of 'cat /proc/interrupts' and a description what kind of
>>>>>>>> 'old' Intel
>>>>>>>> platform that is.
>>>>>>>
>>>>>>> Sorry, I should have repeated the name outside of just the 
>>>>>>> subject of
>>>>>>> the
>>>>>>> email. It is an Intel Eagle Lake Core2Duo.
>>>>>>
>>>>>> I hoped you would give me a little bit more details, but I Gurgled it
>>>>>> now. I try to find such a beast and look what's wrong there. If I 
>>>>>> can't
>>>>>> find one, I'll come back with some debug patches.
>>>>>
>>>>> Ha, sorry!
>>>>>
>>>>> Here is all the info I have:
>>>>>    - Hardware: HP Compaq 8000
>>>>>    - CPU: Eagle Lake Core2Duo E7500
>>>>
>>>> As mentioned, it's HP Compaq 8000 Elite, desktop tower in most basic
>>>> configuration: one SATA SSD, 4GB memory and so on. Most recent BIOS 
>>>> from
>>>> around 2012, no UEFI.
>>>>
>>>> If you have proposed fix for the issue, I'll be happy to test them.
>>>>
>>>> Best regards,
>>>
>>> Thanks for the information Tomi.
>>>
>>> Thomas, any update on this? Another user reported the problem:
>>
>> Not yet.
>>
>>> "I have the same thing on an old cheap laptop.
>>> model name      : Intel(R) Core(TM)2 CPU
>>> cpu family      : 6
>>> model           : 15
>>>
>>> Previously, this never happened on the kernel series 4.{10,11,12}.
>>> After reverted commit bf22ff45bed664 on 4.13-rc2, unfortunately it 
>>> still hangs
>>> when it wakes up.
>>
>> So reverting that commit does not help. Does it help on your machine?
> 
> Yes. Reverting it does not cause the machine to lock up on resume.
> 
> I haven't tested if the machine locks up later on, but at least it 
> survives couple of s/r cycles

FYI, the user who reported reverting the patch was not sufficient 
actually now says he was hit by a second bug which he found.

So, a revert is always possible.

Cheers,
Martin

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Suspend-resume failure on Intel Eagle Lake Core2Duo
  2017-07-28 12:29                           ` Tomi Sarvela
@ 2017-07-28 12:42                             ` Thomas Gleixner
  2017-07-28 13:15                               ` Thomas Gleixner
  0 siblings, 1 reply; 45+ messages in thread
From: Thomas Gleixner @ 2017-07-28 12:42 UTC (permalink / raw)
  To: Tomi Sarvela; +Cc: Martin Peres, jeffy.chen, linux-kernel

On Fri, 28 Jul 2017, Tomi Sarvela wrote:
> On 28/07/17 00:08, Thomas Gleixner wrote:
> The patch didn't apply cleanly: can you tell exact commit or tag it has been
> created against?

Linus head

> I tried to hand-wrangle the changes in, but then I got compilation
> errors:

Sorry, forgot to refresh the patch. Working version below.

> Also, the usage of /sys/power/pm_test was not intuitive to me. Can you explain
> which kind of combinations do you want to test?

# echo freezer >/sys/power/pm_test
# echo mem >/sys/power/state

# echo devices >/sys/power/pm_test
# echo mem >/sys/power/state

# echo platform >/sys/power/pm_test
# echo mem >/sys/power/state

To reset the test stuff do:
# echo none >/sys/power/pm_test

Please try that before applying the debug patch.

Thanks,

	tglx

8<-----------

--- a/kernel/irq/chip.c
+++ b/kernel/irq/chip.c
@@ -304,7 +304,10 @@ void irq_shutdown(struct irq_desc *desc)
 
 void irq_enable(struct irq_desc *desc)
 {
-	if (!irqd_irq_disabled(&desc->irq_data)) {
+	if (irq_suspend_resume)
+		irq_trace_state("preenable", desc);
+
+	if (!irqd_irq_disabled(&desc->irq_data) && !irq_suspend_resume) {
 		unmask_irq(desc);
 	} else {
 		irq_state_clr_disabled(desc);
@@ -315,10 +318,16 @@ void irq_enable(struct irq_desc *desc)
 			unmask_irq(desc);
 		}
 	}
+
+	if (irq_suspend_resume)
+		irq_trace_state("postenable", desc);
 }
 
 static void __irq_disable(struct irq_desc *desc, bool mask)
 {
+	if (irq_suspend_resume)
+		irq_trace_state("predisable", desc);
+
 	if (irqd_irq_disabled(&desc->irq_data)) {
 		if (mask)
 			mask_irq(desc);
@@ -331,6 +340,9 @@ static void __irq_disable(struct irq_des
 			mask_irq(desc);
 		}
 	}
+
+	if (irq_suspend_resume)
+		irq_trace_state("postdisable", desc);
 }
 
 /**
@@ -390,6 +402,9 @@ static inline void mask_ack_irq(struct i
 
 void mask_irq(struct irq_desc *desc)
 {
+	if (irq_suspend_resume)
+		irq_trace_state("premask", desc);
+
 	if (irqd_irq_masked(&desc->irq_data))
 		return;
 
@@ -397,17 +412,26 @@ void mask_irq(struct irq_desc *desc)
 		desc->irq_data.chip->irq_mask(&desc->irq_data);
 		irq_state_set_masked(desc);
 	}
+
+	if (irq_suspend_resume)
+		irq_trace_state("postmask", desc);
 }
 
 void unmask_irq(struct irq_desc *desc)
 {
-	if (!irqd_irq_masked(&desc->irq_data))
+	if (irq_suspend_resume)
+		irq_trace_state("preunmask", desc);
+
+	if (!irqd_irq_masked(&desc->irq_data) && !irq_suspend_resume)
 		return;
 
 	if (desc->irq_data.chip->irq_unmask) {
 		desc->irq_data.chip->irq_unmask(&desc->irq_data);
 		irq_state_clr_masked(desc);
 	}
+
+	if (irq_suspend_resume)
+		irq_trace_state("postunmask", desc);
 }
 
 void unmask_threaded_irq(struct irq_desc *desc)
--- a/kernel/irq/internals.h
+++ b/kernel/irq/internals.h
@@ -459,3 +459,11 @@ static inline void irq_remove_debugfs_en
 {
 }
 #endif /* CONFIG_GENERIC_IRQ_DEBUGFS */
+
+bool irq_suspend_resume;
+
+static inline void irq_trace_state(const char *what, struct irq_desc *desc)
+{
+	trace_printk("%s %d state %08x\n", what, irq_desc_get_irq(desc),
+		     irqd_get(&desc->irq_data));
+}
--- a/kernel/irq/pm.c
+++ b/kernel/irq/pm.c
@@ -14,6 +14,8 @@
 
 #include "internals.h"
 
+extern bool irq_suspend_resume;
+
 bool irq_pm_check_wakeup(struct irq_desc *desc)
 {
 	if (irqd_is_wakeup_armed(&desc->irq_data)) {
@@ -120,6 +122,7 @@ void suspend_device_irqs(void)
 	struct irq_desc *desc;
 	int irq;
 
+	irq_suspend_resume = true;
 	for_each_irq_desc(irq, desc) {
 		unsigned long flags;
 		bool sync;
@@ -127,7 +130,9 @@ void suspend_device_irqs(void)
 		if (irq_settings_is_nested_thread(desc))
 			continue;
 		raw_spin_lock_irqsave(&desc->lock, flags);
+		irq_trace_state("presuspend", desc);
 		sync = suspend_device_irq(desc);
+		irq_trace_state("postsuspend", desc);
 		raw_spin_unlock_irqrestore(&desc->lock, flags);
 
 		if (sync)
@@ -172,9 +177,14 @@ static void resume_irqs(bool want_early)
 			continue;
 
 		raw_spin_lock_irqsave(&desc->lock, flags);
+		irq_trace_state("preresume", desc);
 		resume_irq(desc);
+		irq_trace_state("postresume", desc);
 		raw_spin_unlock_irqrestore(&desc->lock, flags);
 	}
+
+	if (!want_early)
+		irq_suspend_resume = false;
 }
 
 /**

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Suspend-resume failure on Intel Eagle Lake Core2Duo
  2017-07-28 12:42                             ` Thomas Gleixner
@ 2017-07-28 13:15                               ` Thomas Gleixner
  2017-07-28 13:25                                 ` Tomi Sarvela
  0 siblings, 1 reply; 45+ messages in thread
From: Thomas Gleixner @ 2017-07-28 13:15 UTC (permalink / raw)
  To: Tomi Sarvela; +Cc: Martin Peres, jeffy.chen, linux-kernel

On Fri, 28 Jul 2017, Thomas Gleixner wrote:
> On Fri, 28 Jul 2017, Tomi Sarvela wrote:
> > On 28/07/17 00:08, Thomas Gleixner wrote:
> > The patch didn't apply cleanly: can you tell exact commit or tag it has been
> > created against?
> 
> Linus head
> 
> > I tried to hand-wrangle the changes in, but then I got compilation
> > errors:
> 
> Sorry, forgot to refresh the patch. Working version below.
> 
> > Also, the usage of /sys/power/pm_test was not intuitive to me. Can you explain
> > which kind of combinations do you want to test?
> 
> # echo freezer >/sys/power/pm_test
> # echo mem >/sys/power/state
> 
> # echo devices >/sys/power/pm_test
> # echo mem >/sys/power/state
> 
> # echo platform >/sys/power/pm_test
> # echo mem >/sys/power/state
> 
> To reset the test stuff do:
> # echo none >/sys/power/pm_test
> 
> Please try that before applying the debug patch.

Another question. Is the machine completely dead or not?

Can you ping it or does the serial console show the start of resume or just
nothing at all?

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Suspend-resume failure on Intel Eagle Lake Core2Duo
  2017-07-28 13:15                               ` Thomas Gleixner
@ 2017-07-28 13:25                                 ` Tomi Sarvela
  2017-07-28 14:13                                   ` Thomas Gleixner
  0 siblings, 1 reply; 45+ messages in thread
From: Tomi Sarvela @ 2017-07-28 13:25 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: Martin Peres, jeffy.chen, linux-kernel

On 28/07/17 16:15, Thomas Gleixner wrote:
> On Fri, 28 Jul 2017, Thomas Gleixner wrote:
>> On Fri, 28 Jul 2017, Tomi Sarvela wrote:
>>> On 28/07/17 00:08, Thomas Gleixner wrote:
>>> The patch didn't apply cleanly: can you tell exact commit or tag it has been
>>> created against?
>>
>> Linus head
>>
>>> I tried to hand-wrangle the changes in, but then I got compilation
>>> errors:
>>
>> Sorry, forgot to refresh the patch. Working version below.
>>
>>> Also, the usage of /sys/power/pm_test was not intuitive to me. Can you explain
>>> which kind of combinations do you want to test?

This tested against just fetched 4.13.0-rc2 origin/master.

>> # echo freezer >/sys/power/pm_test
>> # echo mem >/sys/power/state

Works.

>> # echo devices >/sys/power/pm_test
>> # echo mem >/sys/power/state

Works.

>> # echo platform >/sys/power/pm_test
>> # echo mem >/sys/power/state

Works.

>> To reset the test stuff do:
>> # echo none >/sys/power/pm_test

>> Please try that before applying the debug patch.

Going with the suspend test, and it still leaves host dead.

root@elk:/opt/igt# ./tests/gem_exec_suspend --r basic-S3
IGT-Version: 1.19-gf1454751 (x86_64) (Linux: 4.13.0-rc2+ x86_64)
rtcwake: wakeup from "mem" using /dev/rtc0 at Fri Jul 28 13:21:27 2017

> Another question. Is the machine completely dead or not?

Completely dead. Powerled is on, so host isn't shut down. Serial or 
network if don't give any signs of life.

Patch applies cleanly but still getting the same error:
   LD      vmlinux.o
kernel/irq/handle.o:(.bss+0x0): multiple definition of `irq_suspend_resume'
kernel/irq/irqdesc.o:(.bss+0x0): first defined here
kernel/irq/manage.o:(.bss+0x8): multiple definition of `irq_suspend_resume'
kernel/irq/irqdesc.o:/home/testrunner/kernel/kernel/irq/irqdesc.c:270: 
first defined here
kernel/irq/spurious.o:(.bss+0x0): multiple definition of 
`irq_suspend_resume'
kernel/irq/irqdesc.o:/home/testrunner/kernel/kernel/irq/irqdesc.c:270: 
first defined here
kernel/irq/resend.o:(.bss+0x0): multiple definition of `irq_suspend_resume'
kernel/irq/irqdesc.o:/home/testrunner/kernel/kernel/irq/irqdesc.c:270: 
first defined here
kernel/irq/chip.o:(.bss+0x0): multiple definition of `irq_suspend_resume'
kernel/irq/irqdesc.o:/home/testrunner/kernel/kernel/irq/irqdesc.c:270: 
first defined here
kernel/irq/dummychip.o:(.bss+0x0): multiple definition of 
`irq_suspend_resume'
kernel/irq/irqdesc.o:/home/testrunner/kernel/kernel/irq/irqdesc.c:270: 
first defined here
kernel/irq/devres.o:(.bss+0x0): multiple definition of `irq_suspend_resume'
kernel/irq/irqdesc.o:/home/testrunner/kernel/kernel/irq/irqdesc.c:270: 
first defined here
kernel/irq/autoprobe.o:(.bss+0x0): multiple definition of 
`irq_suspend_resume'
kernel/irq/irqdesc.o:/home/testrunner/kernel/kernel/irq/irqdesc.c:270: 
first defined here
kernel/irq/proc.o:(.bss+0x4): multiple definition of `irq_suspend_resume'
kernel/irq/irqdesc.o:/home/testrunner/kernel/kernel/irq/irqdesc.c:270: 
first defined here
kernel/irq/migration.o:(.bss+0x0): multiple definition of 
`irq_suspend_resume'
kernel/irq/irqdesc.o:/home/testrunner/kernel/kernel/irq/irqdesc.c:270: 
first defined here
kernel/irq/cpuhotplug.o:(.bss+0x0): multiple definition of 
`irq_suspend_resume'
kernel/irq/irqdesc.o:/home/testrunner/kernel/kernel/irq/irqdesc.c:270: 
first defined here
kernel/irq/pm.o:(.bss+0x0): multiple definition of `irq_suspend_resume'
kernel/irq/irqdesc.o:/home/testrunner/kernel/kernel/irq/irqdesc.c:270: 
first defined here


Best regards,

Tomi
-- 
Intel Finland Oy - BIC 0357606-4 - Westendinkatu 7, 02160 Espoo

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Suspend-resume failure on Intel Eagle Lake Core2Duo
  2017-07-28 13:25                                 ` Tomi Sarvela
@ 2017-07-28 14:13                                   ` Thomas Gleixner
  2017-07-28 14:46                                     ` Tomi Sarvela
  0 siblings, 1 reply; 45+ messages in thread
From: Thomas Gleixner @ 2017-07-28 14:13 UTC (permalink / raw)
  To: Tomi Sarvela; +Cc: Martin Peres, jeffy.chen, linux-kernel

On Fri, 28 Jul 2017, Tomi Sarvela wrote:
> On 28/07/17 16:15, Thomas Gleixner wrote:
> > Another question. Is the machine completely dead or not?
> 
> Completely dead. Powerled is on, so host isn't shut down.

So that means it does not even power the machine down. That's what I
expected least.

> Serial or network if don't give any signs of life.

> Patch applies cleanly but still getting the same error:

Sorry for the noise. I'm an idiot trying to do 10 things at once. This time
it actually compiles and links.

If the machine does still not powerdown with this applied, then please redo
the 'platform' test and grab the trace for that one.

Thanks,

	tglx

8<-----------
--- a/kernel/irq/chip.c
+++ b/kernel/irq/chip.c
@@ -304,7 +304,10 @@ void irq_shutdown(struct irq_desc *desc)
 
 void irq_enable(struct irq_desc *desc)
 {
-	if (!irqd_irq_disabled(&desc->irq_data)) {
+	if (irq_suspend_resume)
+		irq_trace_state("preenable", desc);
+
+	if (!irqd_irq_disabled(&desc->irq_data) && !irq_suspend_resume) {
 		unmask_irq(desc);
 	} else {
 		irq_state_clr_disabled(desc);
@@ -315,10 +318,16 @@ void irq_enable(struct irq_desc *desc)
 			unmask_irq(desc);
 		}
 	}
+
+	if (irq_suspend_resume)
+		irq_trace_state("postenable", desc);
 }
 
 static void __irq_disable(struct irq_desc *desc, bool mask)
 {
+	if (irq_suspend_resume)
+		irq_trace_state("predisable", desc);
+
 	if (irqd_irq_disabled(&desc->irq_data)) {
 		if (mask)
 			mask_irq(desc);
@@ -331,6 +340,9 @@ static void __irq_disable(struct irq_des
 			mask_irq(desc);
 		}
 	}
+
+	if (irq_suspend_resume)
+		irq_trace_state("postdisable", desc);
 }
 
 /**
@@ -390,6 +402,9 @@ static inline void mask_ack_irq(struct i
 
 void mask_irq(struct irq_desc *desc)
 {
+	if (irq_suspend_resume)
+		irq_trace_state("premask", desc);
+
 	if (irqd_irq_masked(&desc->irq_data))
 		return;
 
@@ -397,17 +412,26 @@ void mask_irq(struct irq_desc *desc)
 		desc->irq_data.chip->irq_mask(&desc->irq_data);
 		irq_state_set_masked(desc);
 	}
+
+	if (irq_suspend_resume)
+		irq_trace_state("postmask", desc);
 }
 
 void unmask_irq(struct irq_desc *desc)
 {
-	if (!irqd_irq_masked(&desc->irq_data))
+	if (irq_suspend_resume)
+		irq_trace_state("preunmask", desc);
+
+	if (!irqd_irq_masked(&desc->irq_data) && !irq_suspend_resume)
 		return;
 
 	if (desc->irq_data.chip->irq_unmask) {
 		desc->irq_data.chip->irq_unmask(&desc->irq_data);
 		irq_state_clr_masked(desc);
 	}
+
+	if (irq_suspend_resume)
+		irq_trace_state("postunmask", desc);
 }
 
 void unmask_threaded_irq(struct irq_desc *desc)
--- a/kernel/irq/internals.h
+++ b/kernel/irq/internals.h
@@ -459,3 +459,11 @@ static inline void irq_remove_debugfs_en
 {
 }
 #endif /* CONFIG_GENERIC_IRQ_DEBUGFS */
+
+extern bool irq_suspend_resume;
+
+static inline void irq_trace_state(const char *what, struct irq_desc *desc)
+{
+	trace_printk("%s %d state %08x\n", what, irq_desc_get_irq(desc),
+		     irqd_get(&desc->irq_data));
+}
--- a/kernel/irq/pm.c
+++ b/kernel/irq/pm.c
@@ -14,6 +14,8 @@
 
 #include "internals.h"
 
+bool irq_suspend_resume;
+
 bool irq_pm_check_wakeup(struct irq_desc *desc)
 {
 	if (irqd_is_wakeup_armed(&desc->irq_data)) {
@@ -120,6 +122,7 @@ void suspend_device_irqs(void)
 	struct irq_desc *desc;
 	int irq;
 
+	irq_suspend_resume = true;
 	for_each_irq_desc(irq, desc) {
 		unsigned long flags;
 		bool sync;
@@ -127,7 +130,9 @@ void suspend_device_irqs(void)
 		if (irq_settings_is_nested_thread(desc))
 			continue;
 		raw_spin_lock_irqsave(&desc->lock, flags);
+		irq_trace_state("presuspend", desc);
 		sync = suspend_device_irq(desc);
+		irq_trace_state("postsuspend", desc);
 		raw_spin_unlock_irqrestore(&desc->lock, flags);
 
 		if (sync)
@@ -172,9 +177,14 @@ static void resume_irqs(bool want_early)
 			continue;
 
 		raw_spin_lock_irqsave(&desc->lock, flags);
+		irq_trace_state("preresume", desc);
 		resume_irq(desc);
+		irq_trace_state("postresume", desc);
 		raw_spin_unlock_irqrestore(&desc->lock, flags);
 	}
+
+	if (!want_early)
+		irq_suspend_resume = false;
 }
 
 /**

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Suspend-resume failure on Intel Eagle Lake Core2Duo
  2017-07-28 14:13                                   ` Thomas Gleixner
@ 2017-07-28 14:46                                     ` Tomi Sarvela
  2017-07-28 14:50                                       ` Thomas Gleixner
  0 siblings, 1 reply; 45+ messages in thread
From: Tomi Sarvela @ 2017-07-28 14:46 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: Martin Peres, jeffy.chen, linux-kernel

On 28/07/17 17:13, Thomas Gleixner wrote:
> On Fri, 28 Jul 2017, Tomi Sarvela wrote:
>> On 28/07/17 16:15, Thomas Gleixner wrote:
>>> Another question. Is the machine completely dead or not?
>>
>> Completely dead. Powerled is on, so host isn't shut down.
> 
> So that means it does not even power the machine down. That's what I
> expected least.
> 
>> Serial or network if don't give any signs of life.
> 
>> Patch applies cleanly but still getting the same error:
> 
> Sorry for the noise. I'm an idiot trying to do 10 things at once. This time
> it actually compiles and links.
> 
> If the machine does still not powerdown with this applied, then please redo
> the 'platform' test and grab the trace for that one.

This patch fixes the issue. Below is the dmesg from the testrun (sorry 
for the spam, we're primarily testing i915 issues).

Tested-by: Tomi Sarvela <tomi.p.sarvela@intel.com>

Tomi


[ 1189.061801] [IGT] gem_exec_suspend: executing
[ 1189.071208] Setting dangerous option reset - tainting kernel
[ 1189.071438] [IGT] gem_exec_suspend: starting subtest basic-S3
[ 1189.239034] PM: Syncing filesystems ... done.
[ 1189.245968] PM: Preparing system for sleep (mem)
[ 1189.246209] Freezing user space processes ... (elapsed 0.001 seconds) 
done.
[ 1189.247530] OOM killer disabled.
[ 1189.247531] Freezing remaining freezable tasks ... (elapsed 0.001 
seconds) done.
[ 1189.248690] PM: Suspending system (mem)
[ 1189.248706] Suspending console(s) (use no_console_suspend to debug)
[ 1189.249177] serial 00:03: disabled
[ 1189.250266] sd 0:0:0:0: [sda] Synchronizing SCSI cache
[ 1189.255064] e1000e: EEE TX LPI TIMER: 00000000
[ 1189.256460] [drm:intel_power_well_enable [i915]] enabling always-on
[ 1189.257541] sd 0:0:0:0: [sda] Stopping disk
[ 1189.560357] PM: suspend of devices complete after 311.444 msecs
[ 1189.560653] [drm:intel_power_well_disable [i915]] disabling always-on
[ 1189.572123] PM: late suspend of devices complete after 11.761 msecs
[ 1189.584066] PM: noirq suspend of devices complete after 11.939 msecs
[ 1189.584580] ACPI: Preparing to enter system sleep state S3
[ 1189.586661] PM: Saving platform NVS memory
[ 1189.586686] Disabling non-boot CPUs ...
[ 1189.596350] IRQ 8: no longer affine to CPU1
[ 1189.596356] IRQ 16: no longer affine to CPU1
[ 1189.596367] IRQ 25: no longer affine to CPU1
[ 1189.596369] IRQ 28: no longer affine to CPU1
[ 1189.597397] smpboot: CPU 1 is now offline
[ 1189.597665] ACPI: Low-level resume complete
[ 1189.597665] PM: Restoring platform NVS memory
[ 1189.597665] Suspended for 14.825 seconds
[ 1189.597665] Delta way too big! 18446743991837909721 
ts=18446744056274518328 write stamp = 64436608607
                If you just came from a suspend/resume,
                please switch to the trace global clock:
                  echo global > /sys/kernel/debug/tracing/trace_clock
[ 1189.597665] ------------[ cut here ]------------
[ 1189.597665] WARNING: CPU: 0 PID: 1332 at 
kernel/trace/ring_buffer.c:2647 rb_handle_timestamp.isra.32+0x71/0x80
[ 1189.597665] Modules linked in: snd_hda_codec_realtek i915 
snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hwdep snd_hda_core 
coretemp snd_pcm e1000e lpc_ich mei_me ptp mei pps_core
[ 1189.597665] CPU: 0 PID: 1332 Comm: rtcwake Tainted: G     U 
4.13.0-rc2+ #5
[ 1189.597665] Hardware name: Hewlett-Packard HP Compaq 8000 Elite CMT 
PC/3647h, BIOS 786G7 v01.13 07/20/2011
[ 1189.597665] task: ffff88010dce9880 task.stack: ffffc900000c8000
[ 1189.597665] RIP: 0010:rb_handle_timestamp.isra.32+0x71/0x80
[ 1189.597665] RSP: 0018:ffffc900000cbab0 EFLAGS: 00010082
[ 1189.597665] RAX: 00000000000000e0 RBX: ffffc900000cbad0 RCX: 
0000000000000004
[ 1189.597665] RDX: 0000000080000004 RSI: 0000000000000082 RDI: 
00000000ffffffff
[ 1189.597665] RBP: ffffc900000cbac0 R08: 0000000000000000 R09: 
00000000000000e0
[ 1189.597665] R10: ffffc900000cbbe0 R11: 00000000000ebc08 R12: 
ffff880117008890
[ 1189.597665] R13: ffff8801170088b0 R14: 00000000000003e8 R15: 
0000000000000005
[ 1189.597665] FS:  00007f93d957e700(0000) GS:ffff88011bc00000(0000) 
knlGS:0000000000000000
[ 1189.597665] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1189.597665] CR2: 000000f00255b068 CR3: 00000000d9063000 CR4: 
00000000000406f0
[ 1189.597665] Call Trace:
[ 1189.597665]  ring_buffer_lock_reserve+0x1fa/0x350
[ 1189.597665]  trace_vbprintk+0xdc/0x260
[ 1189.597665]  ? __irq_disable+0x1b/0xc0
[ 1189.597665]  __trace_bprintk+0x4a/0x60
[ 1189.597665]  ? preempt_count_add+0x9e/0xb0
[ 1189.597665]  __irq_disable+0x3f/0xc0
[ 1189.597665]  irq_disable+0x17/0x20
[ 1189.597665]  __disable_irq_nosync+0x59/0x70
[ 1189.597665]  disable_hardirq+0x11/0x30
[ 1189.597665]  hpet_msi_resume+0x85/0xd0
[ 1189.597665]  clockevents_tick_resume+0x14/0x20
[ 1189.597665]  tick_resume_local+0x32/0x60
[ 1189.597665]  tick_resume+0x13/0x20
[ 1189.597665]  timekeeping_resume+0x149/0x1a0
[ 1189.597665]  syscore_resume+0x4b/0x190
[ 1189.597665]  ? syscore_resume+0x4b/0x190
[ 1189.597665]  suspend_devices_and_enter+0x6b9/0x810
[ 1189.597665]  pm_suspend+0x367/0x540
[ 1189.597665]  state_store+0x7e/0xf0
[ 1189.597665]  kobj_attr_store+0xf/0x20
[ 1189.597665]  sysfs_kf_write+0x37/0x40
[ 1189.597665]  kernfs_fop_write+0x110/0x1a0
[ 1189.597665]  __vfs_write+0x28/0x130
[ 1189.597665]  ? __this_cpu_preempt_check+0x13/0x20
[ 1189.597665]  ? __sb_start_write+0x55/0xe0
[ 1189.597665]  vfs_write+0xb6/0x1a0
[ 1189.597665]  SyS_write+0x46/0xb0
[ 1189.597665]  entry_SYSCALL_64_fastpath+0x17/0x98
[ 1189.597665] RIP: 0033:0x7f93d90ac8f0
[ 1189.597665] RSP: 002b:00007ffd80a9ca88 EFLAGS: 00000246 ORIG_RAX: 
0000000000000001
[ 1189.597665] RAX: ffffffffffffffda RBX: 000000f00255a050 RCX: 
00007f93d90ac8f0
[ 1189.597665] RDX: 0000000000000004 RSI: 000000f00255a060 RDI: 
0000000000000007
[ 1189.597665] RBP: 00007f93d9375b00 R08: 000000f002557d90 R09: 
00007f93d957e700
[ 1189.597665] R10: 00007f93d9375b58 R11: 0000000000000246 R12: 
00007f93d9375b58
[ 1189.597665] R13: 00007f93d9375b58 R14: 000000000000270f R15: 
0000000000001010
[ 1189.597665] Code: f0 48 8b 73 08 85 c0 48 8b 13 48 c7 c0 88 9a a5 81 
49 c7 c0 1a 3b a8 81 4c 0f 44 c0 48 8b 0f 48 c7 c7 10 9b a5 81 e8 c0 c8 
fa ff <0f> ff eb a7 90 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 41 56
[ 1189.597665] ---[ end trace 7d99ac836f161565 ]---
[ 1189.597665] Enabling non-boot CPUs ...
[ 1189.597665] x86: Booting SMP configuration:
[ 1189.597665] smpboot: Booting Node 0 Processor 1 APIC 0x1
[ 1189.598576]  cache: parent cpu1 should not be sleeping
[ 1189.598651] hpet: hpet3 irq 25 for MSI
[ 1189.598710] CPU1 is up
[ 1189.600116] ACPI: Waking up from system sleep state S3
[ 1189.612296] PM: noirq resume of devices complete after 12.089 msecs
[ 1189.612703] hpet1: lost 615 rtc interrupts
[ 1189.612875] [drm:intel_power_well_enable [i915]] enabling always-on
[ 1189.612971] PM: early resume of devices complete after 0.166 msecs
[ 1189.614754] [drm:i915_redisable_vga_power_on [i915]] Something 
enabled VGA plane, disabling it
[ 1189.615088] [drm:intel_opregion_setup [i915]] graphic opregion 
physical addr: 0xdd7a1640
[ 1189.615119] [drm:intel_opregion_setup [i915]] SWSCI supported
[ 1189.615143] [drm:swsci [i915]] SWSCI request already in progress
[ 1189.615167] [drm:swsci [i915]] SWSCI request already in progress
[ 1189.615190] [drm:intel_opregion_setup [i915]] SWSCI GBDA callbacks 
00000001, SBCB callbacks 00000001
[ 1189.615215] [drm:intel_opregion_setup [i915]] Found valid VBT in ACPI 
OpRegion (Mailbox #4)
[ 1189.615255] [drm:init_workarounds_ring [i915]] rcs0: Number of 
context specific w/a: 0
[ 1189.615284] [drm:intel_hpll_vco [i915]] HPLL VCO 3200000 kHz
[ 1189.615309] [drm:intel_update_cdclk [i915]] Current CD clock rate: 
400000 kHz, VCO: 3200000 kHz, ref: 0 kHz
[ 1189.615340] [drm:intel_set_plane_visible [i915]] pipe A active planes 0x0
[ 1189.615364] [drm:intel_modeset_setup_hw_state [i915]] [CRTC:32:pipe 
A] hw state readout: disabled
[ 1189.615389] [drm:intel_set_plane_visible [i915]] pipe B active planes 0x0
[ 1189.615413] [drm:intel_modeset_setup_hw_state [i915]] [CRTC:39:pipe 
B] hw state readout: disabled
[ 1189.615439] [drm:intel_modeset_setup_hw_state [i915]] 
[ENCODER:41:CRT] hw state readout: disabled, pipe A
[ 1189.615463] [drm:intel_modeset_setup_hw_state [i915]] 
[ENCODER:42:HDMI C] hw state readout: disabled, pipe A
[ 1189.615488] [drm:intel_modeset_setup_hw_state [i915]] [ENCODER:47:DP 
C] hw state readout: disabled, pipe A
[ 1189.615513] [drm:intel_modeset_setup_hw_state [i915]] 
[CONNECTOR:40:VGA-1] hw state readout: disabled
[ 1189.615538] [drm:intel_modeset_setup_hw_state [i915]] 
[CONNECTOR:43:HDMI-A-1] hw state readout: disabled
[ 1189.615563] [drm:intel_modeset_setup_hw_state [i915]] 
[CONNECTOR:48:DP-1] hw state readout: disabled
[ 1189.615588] [drm:intel_dump_pipe_config [i915]] [CRTC:32:pipe 
A][setup_hw_state]
[ 1189.615613] [drm:intel_dump_pipe_config [i915]] cpu_transcoder: A, 
pipe bpp: 0, dithering: 0
[ 1189.615637] [drm:intel_dump_pipe_config [i915]] audio: 0, infoframes: 0
[ 1189.615660] [drm:intel_dump_pipe_config [i915]] requested mode:
[ 1189.615664] [drm:drm_mode_debug_printmodeline] Modeline 0:"" 0 0 0 0 
0 0 0 0 0 0 0x0 0x0
[ 1189.615688] [drm:intel_dump_pipe_config [i915]] adjusted mode:
[ 1189.615690] [drm:drm_mode_debug_printmodeline] Modeline 0:"" 0 0 0 0 
0 0 0 0 0 0 0x0 0x0
[ 1189.615715] [drm:intel_dump_pipe_config [i915]] crtc timings: 0 0 0 0 
0 0 0 0 0, type: 0x0 flags: 0x0
[ 1189.615739] [drm:intel_dump_pipe_config [i915]] port clock: 0, pipe 
src size: 0x0, pixel rate 0
[ 1189.615763] [drm:intel_dump_pipe_config [i915]] gmch pfit: control: 
0x00000000, ratios: 0x00000000, lvds border: 0x00000000
[ 1189.615787] [drm:intel_dump_pipe_config [i915]] ips: 0, double wide: 0
[ 1189.615811] [drm:intel_dpll_dump_hw_state [i915]] dpll_hw_state: 
dpll: 0x0, dpll_md: 0x0, fp0: 0x0, fp1: 0x0
[ 1189.615834] [drm:intel_dump_pipe_config [i915]] planes on this crtc
[ 1189.615859] [drm:intel_dump_pipe_config [i915]] [PLANE:26:primary A] 
disabled, scaler_id = 0
[ 1189.615883] [drm:intel_dump_pipe_config [i915]] [PLANE:28:sprite A] 
disabled, scaler_id = 0
[ 1189.615907] [drm:intel_dump_pipe_config [i915]] [PLANE:30:cursor A] 
disabled, scaler_id = 0
[ 1189.615932] [drm:intel_dump_pipe_config [i915]] [CRTC:39:pipe 
B][setup_hw_state]
[ 1189.615956] [drm:intel_dump_pipe_config [i915]] cpu_transcoder: B, 
pipe bpp: 0, dithering: 0
[ 1189.615979] [drm:intel_dump_pipe_config [i915]] audio: 0, infoframes: 0
[ 1189.616003] [drm:intel_dump_pipe_config [i915]] requested mode:
[ 1189.616012] [drm:drm_mode_debug_printmodeline] Modeline 0:"" 0 0 0 0 
0 0 0 0 0 0 0x0 0x0
[ 1189.616036] [drm:intel_dump_pipe_config [i915]] adjusted mode:
[ 1189.616038] [drm:drm_mode_debug_printmodeline] Modeline 0:"" 0 0 0 0 
0 0 0 0 0 0 0x0 0x0
[ 1189.616062] [drm:intel_dump_pipe_config [i915]] crtc timings: 0 0 0 0 
0 0 0 0 0, type: 0x0 flags: 0x0
[ 1189.616086] [drm:intel_dump_pipe_config [i915]] port clock: 0, pipe 
src size: 0x0, pixel rate 0
[ 1189.616110] [drm:intel_dump_pipe_config [i915]] gmch pfit: control: 
0x00000000, ratios: 0x00000000, lvds border: 0x00000000
[ 1189.616134] [drm:intel_dump_pipe_config [i915]] ips: 0, double wide: 0
[ 1189.616158] [drm:intel_dpll_dump_hw_state [i915]] dpll_hw_state: 
dpll: 0x0, dpll_md: 0x0, fp0: 0x0, fp1: 0x0
[ 1189.616182] [drm:intel_dump_pipe_config [i915]] planes on this crtc
[ 1189.616206] [drm:intel_dump_pipe_config [i915]] [PLANE:33:primary B] 
disabled, scaler_id = 0
[ 1189.616230] [drm:intel_dump_pipe_config [i915]] [PLANE:35:sprite B] 
disabled, scaler_id = 0
[ 1189.616254] [drm:intel_dump_pipe_config [i915]] [PLANE:37:cursor B] 
disabled, scaler_id = 0
[ 1189.616276] [drm:g4x_wm_get_hw_state [i915]] Initial watermarks: pipe 
A, plane=15, cursor=15, sprite=15
[ 1189.616296] [drm:g4x_wm_get_hw_state [i915]] Initial watermarks: pipe 
B, plane=15, cursor=15, sprite=15
[ 1189.616316] [drm:g4x_wm_get_hw_state [i915]] Initial SR watermarks: 
plane=127, cursor=0 fbc=1
[ 1189.616335] [drm:g4x_wm_get_hw_state [i915]] Initial HPLL watermarks: 
plane=0, SR cursor=0 fbc=5
[ 1189.616355] [drm:g4x_wm_get_hw_state [i915]] Initial SR=no HPLL=no FBC=no
[ 1189.616377] [drm:intel_power_well_disable [i915]] disabling always-on
[ 1189.616397] [drm:intel_power_well_enable [i915]] enabling always-on
[ 1189.616417] [drm:intel_power_well_disable [i915]] disabling always-on
[ 1189.616512] [drm:intel_power_well_enable [i915]] enabling always-on
[ 1189.616538] [drm:intel_atomic_commit_tail [i915]] [ENCODER:41:CRT]
[ 1189.616562] [drm:intel_atomic_commit_tail [i915]] [ENCODER:42:HDMI C]
[ 1189.616586] [drm:intel_atomic_commit_tail [i915]] [ENCODER:47:DP C]
[ 1189.616610] [drm:verify_connector_state.isra.72 [i915]] 
[CONNECTOR:40:VGA-1]
[ 1189.616635] [drm:verify_connector_state.isra.72 [i915]] 
[CONNECTOR:43:HDMI-A-1]
[ 1189.616659] [drm:verify_connector_state.isra.72 [i915]] 
[CONNECTOR:48:DP-1]
[ 1189.616685] [drm:intel_atomic_commit_tail [i915]] [CRTC:32:pipe A]
[ 1189.616710] [drm:intel_atomic_commit_tail [i915]] [CRTC:39:pipe B]
[ 1189.616731] [drm:intel_power_well_disable [i915]] disabling always-on
[ 1189.616764] [drm:intel_crt_detect [i915]] [CONNECTOR:40:VGA-1] force=0
[ 1189.616784] [drm:intel_power_well_enable [i915]] enabling always-on
[ 1189.617427] usb usb3: root hub lost power or was reset
[ 1189.617462] usb usb4: root hub lost power or was reset
[ 1189.617494] usb usb5: root hub lost power or was reset
[ 1189.617629] usb usb6: root hub lost power or was reset
[ 1189.617662] usb usb7: root hub lost power or was reset
[ 1189.617695] usb usb8: root hub lost power or was reset
[ 1189.618430] sd 0:0:0:0: [sda] Starting disk
[ 1189.618599] serial 00:03: activated
[ 1189.628872] [drm:intel_crt_detect [i915]] CRT not detected via hotplug
[ 1189.630945] [drm:gmbus_xfer [i915]] GMBUS [i915 gmbus vga] NAK for 
addr: 0050 w(1)
[ 1189.630993] [drm:gmbus_xfer [i915]] GMBUS [i915 gmbus vga] NAK on 
first message, retry
[ 1189.633063] [drm:gmbus_xfer [i915]] GMBUS [i915 gmbus vga] NAK for 
addr: 0050 w(1)
[ 1189.633067] [drm:drm_do_probe_ddc_edid] drm: skipping non-existent 
adapter i915 gmbus vga
[ 1189.633110] [drm:intel_crt_get_edid [i915]] CRT GMBUS EDID read 
failed, retry using GPIO bit-banging
[ 1189.633154] [drm:intel_gmbus_force_bit [i915]] enabling bit-banging 
on i915 gmbus vga. force bit now 1
[ 1189.633663] [drm:drm_do_probe_ddc_edid] drm: skipping non-existent 
adapter i915 gmbus vga
[ 1189.633687] [drm:intel_gmbus_force_bit [i915]] disabling bit-banging 
on i915 gmbus vga. force bit now 0
[ 1189.633711] [drm:intel_crt_detect_ddc [i915]] CRT not detected via 
DDC:0x50 [no valid EDID found]
[ 1189.633731] [drm:intel_power_well_disable [i915]] disabling always-on
[ 1189.633734] [drm:drm_helper_hpd_irq_event] [CONNECTOR:40:VGA-1] 
status updated from disconnected to disconnected
[ 1189.633758] [drm:intel_hdmi_detect [i915]] [CONNECTOR:43:HDMI-A-1]
[ 1189.633778] [drm:intel_power_well_enable [i915]] enabling always-on
[ 1189.635849] [drm:gmbus_xfer [i915]] GMBUS [i915 gmbus dpc] NAK for 
addr: 0050 w(1)
[ 1189.635892] [drm:gmbus_xfer [i915]] GMBUS [i915 gmbus dpc] NAK on 
first message, retry
[ 1189.637962] [drm:gmbus_xfer [i915]] GMBUS [i915 gmbus dpc] NAK for 
addr: 0050 w(1)
[ 1189.637965] [drm:drm_do_probe_ddc_edid] drm: skipping non-existent 
adapter i915 gmbus dpc
[ 1189.640036] [drm:gmbus_xfer [i915]] GMBUS [i915 gmbus dpc] NAK for 
addr: 0040 w(1)
[ 1189.640092] [drm:gmbus_xfer [i915]] GMBUS [i915 gmbus dpc] NAK on 
first message, retry
[ 1189.642161] [drm:gmbus_xfer [i915]] GMBUS [i915 gmbus dpc] NAK for 
addr: 0040 w(1)
[ 1189.642166] [drm:drm_dp_dual_mode_detect] DP dual mode HDMI ID:  (err -6)
[ 1189.642203] [drm:intel_power_well_disable [i915]] disabling always-on
[ 1189.642205] [drm:drm_helper_hpd_irq_event] [CONNECTOR:43:HDMI-A-1] 
status updated from disconnected to disconnected
[ 1189.642229] [drm:intel_dp_detect [i915]] [CONNECTOR:48:DP-1]
[ 1189.642249] [drm:intel_power_well_enable [i915]] enabling always-on
[ 1189.642270] [drm:intel_power_well_disable [i915]] disabling always-on
[ 1189.642272] [drm:drm_helper_hpd_irq_event] [CONNECTOR:48:DP-1] status 
updated from disconnected to disconnected
[ 1189.726634] [drm:intel_get_hpd_pins [i915]] hotplug event received, 
stat 0x00100000, dig 0x00100000, pins 0x00000040
[ 1189.726678] [drm:intel_hpd_irq_handler [i915]] digital hpd port C - long
[ 1189.726722] [drm:intel_hpd_irq_handler [i915]] Received HPD interrupt 
on PIN 6 - cnt: 0
[ 1189.726776] [drm:intel_dp_hpd_pulse [i915]] got hpd irq on port C - long
[ 1189.726810] [drm:i915_hotplug_work_func [i915]] running encoder 
hotplug functions
[ 1189.726835] [drm:i915_hotplug_work_func [i915]] Connector HDMI-A-1 
(pin 6) received hotplug event.
[ 1189.726859] [drm:intel_hdmi_detect [i915]] [CONNECTOR:43:HDMI-A-1]
[ 1189.726879] [drm:intel_power_well_enable [i915]] enabling always-on
[ 1189.728946] [drm:gmbus_xfer [i915]] GMBUS [i915 gmbus dpc] NAK for 
addr: 0050 w(1)
[ 1189.728989] [drm:gmbus_xfer [i915]] GMBUS [i915 gmbus dpc] NAK on 
first message, retry
[ 1189.730071] [drm:gmbus_xfer [i915]] GMBUS [i915 gmbus dpc] NAK for 
addr: 0050 w(1)
[ 1189.730075] [drm:drm_do_probe_ddc_edid] drm: skipping non-existent 
adapter i915 gmbus dpc
[ 1189.732142] [drm:gmbus_xfer [i915]] GMBUS [i915 gmbus dpc] NAK for 
addr: 0040 w(1)
[ 1189.732185] [drm:gmbus_xfer [i915]] GMBUS [i915 gmbus dpc] NAK on 
first message, retry
[ 1189.734069] [drm:gmbus_xfer [i915]] GMBUS [i915 gmbus dpc] NAK for 
addr: 0040 w(1)
[ 1189.734073] [drm:drm_dp_dual_mode_detect] DP dual mode HDMI ID:  (err -6)
[ 1189.734111] [drm:intel_power_well_disable [i915]] disabling always-on
[ 1189.734155] [drm:i915_hotplug_work_func [i915]] Connector DP-1 (pin 
6) received hotplug event.
[ 1189.734201] [drm:intel_dp_detect [i915]] [CONNECTOR:48:DP-1]
[ 1189.734221] [drm:intel_power_well_enable [i915]] enabling always-on
[ 1189.734241] [drm:intel_power_well_disable [i915]] disabling always-on
[ 1189.924416] ata5: SATA link down (SStatus 0 SControl 300)
[ 1189.924432] ata3: SATA link down (SStatus 0 SControl 300)
[ 1189.924447] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[ 1189.924461] ata4: SATA link down (SStatus 0 SControl 300)
[ 1189.924477] ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[ 1189.924857] ata2.00: ACPI cmd ef/03:0c:00:00:00:a0 (SET FEATURES) 
filtered out
[ 1189.924860] ata2.00: ACPI cmd ef/03:45:00:00:00:a0 (SET FEATURES) 
filtered out
[ 1189.924862] ata2.00: ACPI cmd ef/10:03:00:00:00:a0 (SET FEATURES) 
filtered out
[ 1189.933311] ata2.00: ACPI cmd ef/03:0c:00:00:00:a0 (SET FEATURES) 
filtered out
[ 1189.933313] ata2.00: ACPI cmd ef/03:45:00:00:00:a0 (SET FEATURES) 
filtered out
[ 1189.933316] ata2.00: ACPI cmd ef/10:03:00:00:00:a0 (SET FEATURES) 
filtered out
[ 1189.933320] ata2.00: configured for UDMA/100
[ 1189.935151] ata1.00: ACPI cmd ef/03:0c:00:00:00:a0 (SET FEATURES) 
filtered out
[ 1189.935153] ata1.00: ACPI cmd ef/03:45:00:00:00:a0 (SET FEATURES) 
filtered out
[ 1189.936958] ata1.00: ACPI cmd c6/00:10:00:00:00:a0 (SET MULTIPLE 
MODE) succeeded
[ 1189.946040] ata1.00: ACPI cmd e3/00:00:00:00:00:a0 (IDLE) succeeded
[ 1189.946043] ata1.00: ACPI cmd f5/00:00:00:00:00:a0 (SECURITY FREEZE 
LOCK) filtered out
[ 1189.946045] ata1.00: ACPI cmd ef/10:03:00:00:00:a0 (SET FEATURES) 
filtered out
[ 1189.986668] ata1.00: Security Log not supported
[ 1189.996656] ata1.00: ACPI cmd ef/03:0c:00:00:00:a0 (SET FEATURES) 
filtered out
[ 1189.996659] ata1.00: ACPI cmd ef/03:45:00:00:00:a0 (SET FEATURES) 
filtered out
[ 1189.998508] ata1.00: ACPI cmd c6/00:10:00:00:00:a0 (SET MULTIPLE 
MODE) succeeded
[ 1190.007519] ata1.00: ACPI cmd e3/00:00:00:00:00:a0 (IDLE) succeeded
[ 1190.007522] ata1.00: ACPI cmd f5/00:00:00:00:00:a0 (SECURITY FREEZE 
LOCK) filtered out
[ 1190.007524] ata1.00: ACPI cmd ef/10:03:00:00:00:a0 (SET FEATURES) 
filtered out
[ 1190.048147] ata1.00: Security Log not supported
[ 1190.048151] ata1.00: configured for UDMA/133
[ 1190.182093] PM: resume of devices complete after 569.117 msecs
[ 1190.182296] PM: Finishing wakeup.
[ 1190.182297] OOM killer enabled.
[ 1190.182297] Restarting tasks ... done.
[ 1190.300710] [IGT] gem_exec_suspend: exiting, ret=0

-- 
Intel Finland Oy - BIC 0357606-4 - Westendinkatu 7, 02160 Espoo

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Suspend-resume failure on Intel Eagle Lake Core2Duo
  2017-07-28 14:46                                     ` Tomi Sarvela
@ 2017-07-28 14:50                                       ` Thomas Gleixner
  2017-07-28 14:55                                         ` Tomi Sarvela
  0 siblings, 1 reply; 45+ messages in thread
From: Thomas Gleixner @ 2017-07-28 14:50 UTC (permalink / raw)
  To: Tomi Sarvela; +Cc: Martin Peres, jeffy.chen, linux-kernel

On Fri, 28 Jul 2017, Tomi Sarvela wrote:
> On 28/07/17 17:13, Thomas Gleixner wrote:
> > On Fri, 28 Jul 2017, Tomi Sarvela wrote:
> > > On 28/07/17 16:15, Thomas Gleixner wrote:
> > > > Another question. Is the machine completely dead or not?
> > > 
> > > Completely dead. Powerled is on, so host isn't shut down.
> > 
> > So that means it does not even power the machine down. That's what I
> > expected least.
> > 
> > > Serial or network if don't give any signs of life.
> > 
> > > Patch applies cleanly but still getting the same error:
> > 
> > Sorry for the noise. I'm an idiot trying to do 10 things at once. This time
> > it actually compiles and links.
> > 
> > If the machine does still not powerdown with this applied, then please redo
> > the 'platform' test and grab the trace for that one.
> 
> This patch fixes the issue. Below is the dmesg from the testrun (sorry for the
> spam, we're primarily testing i915 issues).

Can you please retrieve the trace data from:

/sys/kernel/debug/tracing/trace

and provide that. The dmesg does not help much.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Suspend-resume failure on Intel Eagle Lake Core2Duo
  2017-07-28 14:50                                       ` Thomas Gleixner
@ 2017-07-28 14:55                                         ` Tomi Sarvela
  2017-07-28 16:26                                           ` Thomas Gleixner
  2017-07-31  8:35                                           ` Thomas Gleixner
  0 siblings, 2 replies; 45+ messages in thread
From: Tomi Sarvela @ 2017-07-28 14:55 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: Martin Peres, jeffy.chen, linux-kernel

On 28/07/17 17:50, Thomas Gleixner wrote:
> On Fri, 28 Jul 2017, Tomi Sarvela wrote:
>> On 28/07/17 17:13, Thomas Gleixner wrote:
>>> On Fri, 28 Jul 2017, Tomi Sarvela wrote:
>>>> On 28/07/17 16:15, Thomas Gleixner wrote:
>>>>> Another question. Is the machine completely dead or not?
>>>>
>>>> Completely dead. Powerled is on, so host isn't shut down.
>>>
>>> So that means it does not even power the machine down. That's what I
>>> expected least.
>>>
>>>> Serial or network if don't give any signs of life.
>>>
>>>> Patch applies cleanly but still getting the same error:
>>>
>>> Sorry for the noise. I'm an idiot trying to do 10 things at once. This time
>>> it actually compiles and links.
>>>
>>> If the machine does still not powerdown with this applied, then please redo
>>> the 'platform' test and grab the trace for that one.
>>
>> This patch fixes the issue. Below is the dmesg from the testrun (sorry for the
>> spam, we're primarily testing i915 issues).
> 
> Can you please retrieve the trace data from:
> 
> /sys/kernel/debug/tracing/trace
> 
> and provide that. The dmesg does not help much.

Right, here you go.

$ sudo cat /sys/kernel/debug/tracing/trace
# tracer: nop
#
#                              _-----=> irqs-off
#                             / _----=> need-resched
#                            | / _---=> hardirq/softirq
#                            || / _--=> preempt-depth
#                            ||| /     delay
#           TASK-PID   CPU#  ||||    TIMESTAMP  FUNCTION
#              | |       |   ||||       |         |
          rtcwake-1332  [000] d..1    64.411098: suspend_device_irqs: 
presuspend 0 state 00400600
          rtcwake-1332  [000] d..1    64.411101: suspend_device_irqs: 
postsuspend 0 state 00400600
          rtcwake-1332  [000] d..1    64.411102: suspend_device_irqs: 
presuspend 1 state 00030000
          rtcwake-1332  [000] d..1    64.411103: suspend_device_irqs: 
postsuspend 1 state 00030000
          rtcwake-1332  [000] d..1    64.411104: suspend_device_irqs: 
presuspend 2 state 00030000
          rtcwake-1332  [000] d..1    64.411104: suspend_device_irqs: 
postsuspend 2 state 00030000
          rtcwake-1332  [000] d..1    64.411105: suspend_device_irqs: 
presuspend 3 state 00030000
          rtcwake-1332  [000] d..1    64.411106: suspend_device_irqs: 
postsuspend 3 state 00030000
          rtcwake-1332  [000] d..1    64.411107: suspend_device_irqs: 
presuspend 4 state 00031000
          rtcwake-1332  [000] d..1    64.411107: suspend_device_irqs: 
postsuspend 4 state 00031000
          rtcwake-1332  [000] d..1    64.411108: suspend_device_irqs: 
presuspend 5 state 00030000
          rtcwake-1332  [000] d..1    64.411108: suspend_device_irqs: 
postsuspend 5 state 00030000
          rtcwake-1332  [000] d..1    64.411109: suspend_device_irqs: 
presuspend 6 state 00030000
          rtcwake-1332  [000] d..1    64.411110: suspend_device_irqs: 
postsuspend 6 state 00030000
          rtcwake-1332  [000] d..1    64.411110: suspend_device_irqs: 
presuspend 7 state 00030000
          rtcwake-1332  [000] d..1    64.411111: suspend_device_irqs: 
postsuspend 7 state 00030000
          rtcwake-1332  [000] d..1    64.411112: suspend_device_irqs: 
presuspend 8 state 00401200
          rtcwake-1332  [000] d..1    64.411112: __irq_disable: 
predisable 8 state 00401200
          rtcwake-1332  [000] d..1    64.411113: __irq_disable: 
postdisable 8 state 00411200
          rtcwake-1332  [000] d..1    64.411114: suspend_device_irqs: 
postsuspend 8 state 00411200
          rtcwake-1332  [000] d..1    64.411115: suspend_device_irqs: 
presuspend 9 state 00403300
          rtcwake-1332  [000] d..1    64.411115: __irq_disable: 
predisable 9 state 00403300
          rtcwake-1332  [000] d..1    64.411116: __irq_disable: 
postdisable 9 state 00413300
          rtcwake-1332  [000] d..1    64.411116: suspend_device_irqs: 
postsuspend 9 state 00413300
          rtcwake-1332  [000] d..1    64.411117: suspend_device_irqs: 
presuspend 10 state 00030000
          rtcwake-1332  [000] d..1    64.411118: suspend_device_irqs: 
postsuspend 10 state 00030000
          rtcwake-1332  [000] d..1    64.411119: suspend_device_irqs: 
presuspend 11 state 00030000
          rtcwake-1332  [000] d..1    64.411119: suspend_device_irqs: 
postsuspend 11 state 00030000
          rtcwake-1332  [000] d..1    64.411120: suspend_device_irqs: 
presuspend 12 state 00030000
          rtcwake-1332  [000] d..1    64.411120: suspend_device_irqs: 
postsuspend 12 state 00030000
          rtcwake-1332  [000] d..1    64.411121: suspend_device_irqs: 
presuspend 13 state 00030000
          rtcwake-1332  [000] d..1    64.411122: suspend_device_irqs: 
postsuspend 13 state 00030000
          rtcwake-1332  [000] d..1    64.411122: suspend_device_irqs: 
presuspend 14 state 00030000
          rtcwake-1332  [000] d..1    64.411123: suspend_device_irqs: 
postsuspend 14 state 00030000
          rtcwake-1332  [000] d..1    64.411124: suspend_device_irqs: 
presuspend 15 state 00030000
          rtcwake-1332  [000] d..1    64.411124: suspend_device_irqs: 
postsuspend 15 state 00030000
          rtcwake-1332  [000] d..1    64.411125: suspend_device_irqs: 
presuspend 16 state 00403200
          rtcwake-1332  [000] d..1    64.411126: __irq_disable: 
predisable 16 state 00403200
          rtcwake-1332  [000] d..1    64.411126: __irq_disable: 
postdisable 16 state 00413200
          rtcwake-1332  [000] d..1    64.411127: suspend_device_irqs: 
postsuspend 16 state 00413200
          rtcwake-1332  [000] d..1    64.411128: suspend_device_irqs: 
presuspend 17 state 00033000
          rtcwake-1332  [000] d..1    64.411128: suspend_device_irqs: 
postsuspend 17 state 00033000
          rtcwake-1332  [000] d..1    64.411129: suspend_device_irqs: 
presuspend 18 state 00032000
          rtcwake-1332  [000] d..1    64.411130: suspend_device_irqs: 
postsuspend 18 state 00032000
          rtcwake-1332  [000] d..1    64.411130: suspend_device_irqs: 
presuspend 19 state 00032000
          rtcwake-1332  [000] d..1    64.411131: suspend_device_irqs: 
postsuspend 19 state 00032000
          rtcwake-1332  [000] d..1    64.411132: suspend_device_irqs: 
presuspend 20 state 00403300
          rtcwake-1332  [000] d..1    64.411132: __irq_disable: 
predisable 20 state 00403300
          rtcwake-1332  [000] d..1    64.411133: __irq_disable: 
postdisable 20 state 00413300
          rtcwake-1332  [000] d..1    64.411133: suspend_device_irqs: 
postsuspend 20 state 00413300
          rtcwake-1332  [000] d..1    64.411134: suspend_device_irqs: 
presuspend 21 state 00403300
          rtcwake-1332  [000] d..1    64.411134: __irq_disable: 
predisable 21 state 00403300
          rtcwake-1332  [000] d..1    64.411135: __irq_disable: 
postdisable 21 state 00413300
          rtcwake-1332  [000] d..1    64.411136: suspend_device_irqs: 
postsuspend 21 state 00413300
          rtcwake-1332  [000] d..1    64.411136: suspend_device_irqs: 
presuspend 22 state 00403300
          rtcwake-1332  [000] d..1    64.411137: __irq_disable: 
predisable 22 state 00403300
          rtcwake-1332  [000] d..1    64.411137: __irq_disable: 
postdisable 22 state 00413300
          rtcwake-1332  [000] d..1    64.411138: suspend_device_irqs: 
postsuspend 22 state 00413300
          rtcwake-1332  [000] d..1    64.411139: suspend_device_irqs: 
presuspend 24 state 00409600
          rtcwake-1332  [000] d..1    64.411139: suspend_device_irqs: 
postsuspend 24 state 00409600
          rtcwake-1332  [000] d..1    64.411140: suspend_device_irqs: 
presuspend 25 state 00409600
          rtcwake-1332  [000] d..1    64.411141: suspend_device_irqs: 
postsuspend 25 state 00409600
          rtcwake-1332  [000] d..1    64.411142: suspend_device_irqs: 
presuspend 26 state 00038000
          rtcwake-1332  [000] d..1    64.411142: suspend_device_irqs: 
postsuspend 26 state 00038000
          rtcwake-1332  [000] d..1    64.411143: suspend_device_irqs: 
presuspend 27 state 00038000
          rtcwake-1332  [000] d..1    64.411143: suspend_device_irqs: 
postsuspend 27 state 00038000
          rtcwake-1332  [000] d..1    64.411144: suspend_device_irqs: 
presuspend 28 state 00401200
          rtcwake-1332  [000] d..1    64.411145: __irq_disable: 
predisable 28 state 00401200
          rtcwake-1332  [000] d..1    64.411145: __irq_disable: 
postdisable 28 state 00411200
          rtcwake-1332  [000] d..1    64.411146: suspend_device_irqs: 
postsuspend 28 state 00411200
          rtcwake-1332  [001] d.H1    64.425561: mask_irq: premask 8 
state 00411200
          rtcwake-1332  [001] d.H1    64.425565: mask_irq: postmask 8 
state 00431200
          rtcwake-1332  [000] dN.1    64.436605: __irq_disable: 
predisable 25 state 00409600
          rtcwake-1332  [000] dN.1    64.436607: mask_irq: premask 25 
state 00419600
          rtcwake-1332  [000] dN.1    64.436608: mask_irq: postmask 25 
state 00439600
          rtcwake-1332  [000] dN.1    64.436609: __irq_disable: 
postdisable 25 state 00439600
          rtcwake-1332  [000] d..1 576460734.868390: __irq_disable: 
predisable 24 state 00409600
          rtcwake-1332  [000] d..1 576460734.868508: __irq_disable: 
postdisable 24 state 00419600
          rtcwake-1332  [000] d..1 576460734.868511: irq_enable: 
preenable 24 state 00419600
          rtcwake-1332  [000] d..1 576460734.868511: unmask_irq: 
preunmask 24 state 00409600
          rtcwake-1332  [000] d..1 576460734.868512: unmask_irq: 
postunmask 24 state 00409600
          rtcwake-1332  [000] d..1 576460734.868512: irq_enable: 
postenable 24 state 00409600
          rtcwake-1332  [000] dNh1 576460734.868533: mask_irq: premask 9 
state 00413200
          rtcwake-1332  [000] dNh1 576460734.868535: mask_irq: postmask 
9 state 00433200
      kworker/1:1-1039  [001] d..1 576460734.869322: irq_enable: 
preenable 25 state 00039600
      kworker/1:1-1039  [001] d..1 576460734.869324: unmask_irq: 
preunmask 25 state 00029600
      kworker/1:1-1039  [001] d..1 576460734.869325: unmask_irq: 
postunmask 25 state 00009600
      kworker/1:1-1039  [001] d..1 576460734.869326: irq_enable: 
postenable 25 state 00009600
      kworker/1:1-1039  [001] d..1 576460734.869329: __irq_disable: 
predisable 25 state 00409600
      kworker/1:1-1039  [001] d..1 576460734.869329: __irq_disable: 
postdisable 25 state 00419600
      kworker/1:1-1039  [001] d..1 576460734.869332: irq_enable: 
preenable 25 state 00419600
      kworker/1:1-1039  [001] d..1 576460734.869332: unmask_irq: 
preunmask 25 state 00409600
      kworker/1:1-1039  [001] d..1 576460734.869333: unmask_irq: 
postunmask 25 state 00409600
      kworker/1:1-1039  [001] d..1 576460734.869333: irq_enable: 
postenable 25 state 00409600
          rtcwake-1332  [000] d..1 576460734.882983: resume_irqs: 
preresume 0 state 00400600
          rtcwake-1332  [000] d..1 18446744056.289114: resume_irqs: 
postresume 0 state 00400600
          rtcwake-1332  [000] d..1 18446744056.289116: resume_irqs: 
preresume 1 state 00030000
          rtcwake-1332  [000] d..1 18446744056.289116: resume_irqs: 
postresume 1 state 00030000
          rtcwake-1332  [000] d..1 18446744056.289117: resume_irqs: 
preresume 2 state 00030000
          rtcwake-1332  [000] d..1 18446744056.289118: resume_irqs: 
postresume 2 state 00030000
          rtcwake-1332  [000] d..1 18446744056.289118: resume_irqs: 
preresume 3 state 00030000
          rtcwake-1332  [000] d..1 18446744056.289119: resume_irqs: 
postresume 3 state 00030000
          rtcwake-1332  [000] d..1 18446744056.289120: resume_irqs: 
preresume 4 state 00031000
          rtcwake-1332  [000] d..1 18446744056.289120: resume_irqs: 
postresume 4 state 00031000
          rtcwake-1332  [000] d..1 18446744056.289121: resume_irqs: 
preresume 5 state 00030000
          rtcwake-1332  [000] d..1 18446744056.289122: resume_irqs: 
postresume 5 state 00030000
          rtcwake-1332  [000] d..1 18446744056.289122: resume_irqs: 
preresume 6 state 00030000
          rtcwake-1332  [000] d..1 18446744056.289123: resume_irqs: 
postresume 6 state 00030000
          rtcwake-1332  [000] d..1 18446744056.289124: resume_irqs: 
preresume 7 state 00030000
          rtcwake-1332  [000] d..1 18446744056.289124: resume_irqs: 
postresume 7 state 00030000
          rtcwake-1332  [000] d..1 18446744056.289125: resume_irqs: 
preresume 8 state 00431200
          rtcwake-1332  [000] d..1 18446744056.289126: irq_enable: 
preenable 8 state 00431200
          rtcwake-1332  [000] d..1 18446744056.289126: unmask_irq: 
preunmask 8 state 00421200
          rtcwake-1332  [000] d..1 18446744056.289128: unmask_irq: 
postunmask 8 state 00401200
          rtcwake-1332  [000] d..1 18446744056.289128: irq_enable: 
postenable 8 state 00401200
          rtcwake-1332  [000] d..1 18446744056.289129: resume_irqs: 
postresume 8 state 00401200
          rtcwake-1332  [000] d..1 18446744056.289570: resume_irqs: 
preresume 9 state 00433200
          rtcwake-1332  [000] d..1 18446744056.289571: irq_enable: 
preenable 9 state 00433200
          rtcwake-1332  [000] d..1 18446744056.289571: unmask_irq: 
preunmask 9 state 00423200
          rtcwake-1332  [000] d..1 18446744056.289572: unmask_irq: 
postunmask 9 state 00403200
          rtcwake-1332  [000] d..1 18446744056.289572: irq_enable: 
postenable 9 state 00403200
          rtcwake-1332  [000] d..1 18446744056.289572: resume_irqs: 
postresume 9 state 00403200
          rtcwake-1332  [000] d..1 18446744056.289573: resume_irqs: 
preresume 10 state 00030000
          rtcwake-1332  [000] d..1 18446744056.289573: resume_irqs: 
postresume 10 state 00030000
          rtcwake-1332  [000] d..1 18446744056.289588: resume_irqs: 
preresume 11 state 00030000
          rtcwake-1332  [000] d..1 18446744056.289588: resume_irqs: 
postresume 11 state 00030000
          rtcwake-1332  [000] d..1 18446744056.289589: resume_irqs: 
preresume 12 state 00030000
          rtcwake-1332  [000] d..1 18446744056.289589: resume_irqs: 
postresume 12 state 00030000
          rtcwake-1332  [000] d..1 18446744056.289589: resume_irqs: 
preresume 13 state 00030000
          rtcwake-1332  [000] d..1 18446744056.289590: resume_irqs: 
postresume 13 state 00030000
          rtcwake-1332  [000] d..1 18446744056.289590: resume_irqs: 
preresume 14 state 00030000
          rtcwake-1332  [000] d..1 18446744056.289591: resume_irqs: 
postresume 14 state 00030000
          rtcwake-1332  [000] d..1 18446744056.289591: resume_irqs: 
preresume 15 state 00030000
          rtcwake-1332  [000] d..1 18446744056.289591: resume_irqs: 
postresume 15 state 00030000
          rtcwake-1332  [000] d..1 18446744056.289592: resume_irqs: 
preresume 16 state 00413200
          rtcwake-1332  [000] d..1 18446744056.289592: irq_enable: 
preenable 16 state 00413200
          rtcwake-1332  [000] d..1 18446744056.289593: unmask_irq: 
preunmask 16 state 00403200
          rtcwake-1332  [000] d..1 18446744056.289593: unmask_irq: 
postunmask 16 state 00403200
          rtcwake-1332  [000] d..1 18446744056.289594: irq_enable: 
postenable 16 state 00403200
          rtcwake-1332  [000] d..1 18446744056.289594: resume_irqs: 
postresume 16 state 00403200
          rtcwake-1332  [000] d..1 18446744056.289594: resume_irqs: 
preresume 17 state 00033000
          rtcwake-1332  [000] d..1 18446744056.289595: resume_irqs: 
postresume 17 state 00033000
          rtcwake-1332  [000] d..1 18446744056.289595: resume_irqs: 
preresume 18 state 00032000
          rtcwake-1332  [000] d..1 18446744056.289596: resume_irqs: 
postresume 18 state 00032000
          rtcwake-1332  [000] d..1 18446744056.289596: resume_irqs: 
preresume 19 state 00032000
          rtcwake-1332  [000] d..1 18446744056.289596: resume_irqs: 
postresume 19 state 00032000
          rtcwake-1332  [000] d..1 18446744056.289597: resume_irqs: 
preresume 20 state 00413200
          rtcwake-1332  [000] d..1 18446744056.289597: irq_enable: 
preenable 20 state 00413200
          rtcwake-1332  [000] d..1 18446744056.289598: unmask_irq: 
preunmask 20 state 00403200
          rtcwake-1332  [000] d..1 18446744056.289598: unmask_irq: 
postunmask 20 state 00403200
          rtcwake-1332  [000] d..1 18446744056.289598: irq_enable: 
postenable 20 state 00403200
          rtcwake-1332  [000] d..1 18446744056.289599: resume_irqs: 
postresume 20 state 00403200
          rtcwake-1332  [000] d..1 18446744056.289599: resume_irqs: 
preresume 21 state 00413200
          rtcwake-1332  [000] d..1 18446744056.289600: irq_enable: 
preenable 21 state 00413200
          rtcwake-1332  [000] d..1 18446744056.289600: unmask_irq: 
preunmask 21 state 00403200
          rtcwake-1332  [000] d..1 18446744056.289600: unmask_irq: 
postunmask 21 state 00403200
          rtcwake-1332  [000] d..1 18446744056.289601: irq_enable: 
postenable 21 state 00403200
          rtcwake-1332  [000] d..1 18446744056.289601: resume_irqs: 
postresume 21 state 00403200
          rtcwake-1332  [000] d..1 18446744056.289602: resume_irqs: 
preresume 22 state 00413200
          rtcwake-1332  [000] d..1 18446744056.289602: irq_enable: 
preenable 22 state 00413200
          rtcwake-1332  [000] d..1 18446744056.289602: unmask_irq: 
preunmask 22 state 00403200
          rtcwake-1332  [000] d..1 18446744056.289603: unmask_irq: 
postunmask 22 state 00403200
          rtcwake-1332  [000] d..1 18446744056.289603: irq_enable: 
postenable 22 state 00403200
          rtcwake-1332  [000] d..1 18446744056.289603: resume_irqs: 
postresume 22 state 00403200
          rtcwake-1332  [000] d..1 18446744056.289604: resume_irqs: 
preresume 24 state 00409600
          rtcwake-1332  [000] d..1 18446744056.289604: resume_irqs: 
postresume 24 state 00409600
          rtcwake-1332  [000] d..1 18446744056.289604: resume_irqs: 
preresume 25 state 00409600
          rtcwake-1332  [000] d..1 18446744056.289605: resume_irqs: 
postresume 25 state 00409600
          rtcwake-1332  [000] d..1 18446744056.289605: resume_irqs: 
preresume 26 state 00038000
          rtcwake-1332  [000] d..1 18446744056.289606: resume_irqs: 
postresume 26 state 00038000
          rtcwake-1332  [000] d..1 18446744056.289606: resume_irqs: 
preresume 27 state 00038000
          rtcwake-1332  [000] d..1 18446744056.289606: resume_irqs: 
postresume 27 state 00038000
          rtcwake-1332  [000] d..1 18446744056.289607: resume_irqs: 
preresume 28 state 00411200
          rtcwake-1332  [000] d..1 18446744056.289607: irq_enable: 
preenable 28 state 00411200
          rtcwake-1332  [000] d..1 18446744056.289608: unmask_irq: 
preunmask 28 state 00401200
          rtcwake-1332  [000] d..1 18446744056.289608: unmask_irq: 
postunmask 28 state 00401200
          rtcwake-1332  [000] d..1 18446744056.289608: irq_enable: 
postenable 28 state 00401200
          rtcwake-1332  [000] d..1 18446744056.289609: resume_irqs: 
postresume 28 state 00401200


Tomi
-- 
Intel Finland Oy - BIC 0357606-4 - Westendinkatu 7, 02160 Espoo

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Suspend-resume failure on Intel Eagle Lake Core2Duo
  2017-07-28 14:55                                         ` Tomi Sarvela
@ 2017-07-28 16:26                                           ` Thomas Gleixner
  2017-07-31  7:21                                             ` Tomi Sarvela
  2017-07-31  8:35                                           ` Thomas Gleixner
  1 sibling, 1 reply; 45+ messages in thread
From: Thomas Gleixner @ 2017-07-28 16:26 UTC (permalink / raw)
  To: Tomi Sarvela; +Cc: Martin Peres, jeffy.chen, linux-kernel

On Fri, 28 Jul 2017, Tomi Sarvela wrote:
> On 28/07/17 17:50, Thomas Gleixner wrote:
> > On Fri, 28 Jul 2017, Tomi Sarvela wrote:
> > > On 28/07/17 17:13, Thomas Gleixner wrote:
> > > > On Fri, 28 Jul 2017, Tomi Sarvela wrote:
> > > > > On 28/07/17 16:15, Thomas Gleixner wrote:
> > > > > > Another question. Is the machine completely dead or not?
> > > > > 
> > > > > Completely dead. Powerled is on, so host isn't shut down.
> > > > 
> > > > So that means it does not even power the machine down. That's what I
> > > > expected least.
> > > > 
> > > > > Serial or network if don't give any signs of life.
> > > > 
> > > > > Patch applies cleanly but still getting the same error:
> > > > 
> > > > Sorry for the noise. I'm an idiot trying to do 10 things at once. This
> > > > time
> > > > it actually compiles and links.
> > > > 
> > > > If the machine does still not powerdown with this applied, then please
> > > > redo
> > > > the 'platform' test and grab the trace for that one.
> > > 
> > > This patch fixes the issue. Below is the dmesg from the testrun (sorry for
> > > the spam, we're primarily testing i915 issues).
> > 
> > Can you please retrieve the trace data from:
> > 
> > /sys/kernel/debug/tracing/trace
> > 
> > and provide that. The dmesg does not help much.
> 
> Right, here you go.

Thanks for providing the data. Just to be sure, that data was from a real
suspend, not the 'platform' test, right?

If so, that does not make any sense at all. The patch merily changes the
enable/resume path and adds the debug trace printks which have no influence
on the disable logic. But you said that the machine does not power off in
the bad case. That does not make any sense at all as the enable logic is
not involved at all in the suspend path.

Did you change anything else compared to the tests before ?

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Suspend-resume failure on Intel Eagle Lake Core2Duo
  2017-07-28 16:26                                           ` Thomas Gleixner
@ 2017-07-31  7:21                                             ` Tomi Sarvela
  2017-07-31  7:45                                               ` Thomas Gleixner
  0 siblings, 1 reply; 45+ messages in thread
From: Tomi Sarvela @ 2017-07-31  7:21 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: Martin Peres, jeffy.chen, linux-kernel

On 28/07/17 19:26, Thomas Gleixner wrote:
> On Fri, 28 Jul 2017, Tomi Sarvela wrote:
>> On 28/07/17 17:50, Thomas Gleixner wrote:
>>> On Fri, 28 Jul 2017, Tomi Sarvela wrote:
>>>> On 28/07/17 17:13, Thomas Gleixner wrote:
>>>>> On Fri, 28 Jul 2017, Tomi Sarvela wrote:
>>>>>> On 28/07/17 16:15, Thomas Gleixner wrote:
>>>>>>> Another question. Is the machine completely dead or not?
>>>>>>
>>>>>> Completely dead. Powerled is on, so host isn't shut down.
>>>>>
>>>>> So that means it does not even power the machine down. That's what I
>>>>> expected least.
>>>>>
>>>>>> Serial or network if don't give any signs of life.
>>>>>
>>>>>> Patch applies cleanly but still getting the same error:
>>>>>
>>>>> Sorry for the noise. I'm an idiot trying to do 10 things at once. This
>>>>> time
>>>>> it actually compiles and links.
>>>>>
>>>>> If the machine does still not powerdown with this applied, then please
>>>>> redo
>>>>> the 'platform' test and grab the trace for that one.
>>>>
>>>> This patch fixes the issue. Below is the dmesg from the testrun (sorry for
>>>> the spam, we're primarily testing i915 issues).
>>>
>>> Can you please retrieve the trace data from:
>>>
>>> /sys/kernel/debug/tracing/trace
>>>
>>> and provide that. The dmesg does not help much.
>>
>> Right, here you go.
> 
> Thanks for providing the data. Just to be sure, that data was from a real
> suspend, not the 'platform' test, right?
> 
> If so, that does not make any sense at all. The patch merily changes the
> enable/resume path and adds the debug trace printks which have no influence
> on the disable logic. But you said that the machine does not power off in
> the bad case. That does not make any sense at all as the enable logic is
> not involved at all in the suspend path.
> 
> Did you change anything else compared to the tests before ?

I did check that the problem persisted in linus-HEAD before testing your 
patch. The testing was done in order (reading from console logs I happen 
to still have in one window):

- reboot to patched kernel

[    0.000000] Linux version 4.13.0-rc2+ (testrunner@elk) (gcc version 
6.3.0 20170406 (Ubuntu 6.3.0-12ubuntu2)) #5 SMP PREEMPT Fri Jul 28 
17:15:47 EEST 2017
[    0.000000] Command line: BOOT_IMAGE=/boot/tsa.efi root=/dev/sda1 
console=ttyS0,115200n8 console=tty0 intel_iommu=igfx_off drm.debug=0xe 
nmi_watchdog=panic,auto panic=1 softdog.soft_panic=1 
scsi_mod.use_blk_mq=0 rootwait ro 3

- "real" 15sec suspend test through IGT/piglit and rtcwake

$ ./scripts/run-tests.sh -vt igt@gem_exec_suspend@basic-s3 -x devices

- dmesg to go with suspend

[ 1189.597665] Suspended for 14.825 seconds
[ 1189.597665] Delta way too big! 18446743991837909721 
ts=18446744056274518328 write stamp = 64436608607
                If you just came from a suspend/resume,
                please switch to the trace global clock:
                  echo global > /sys/kernel/debug/tracing/trace_clock
[ 1189.597665] ------------[ cut here ]------------
[ 1189.597665] WARNING: CPU: 0 PID: 1332 at 
kernel/trace/ring_buffer.c:2647 rb_handle_timestamp.isra.32+0x71/0x80
[ 1189.597665] Modules linked in: snd_hda_codec_realtek i915 
snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hwdep snd_hda_core 
coretemp snd_pcm e1000e lpc_ich mei_me ptp mei pps_core
[ 1189.597665] CPU: 0 PID: 1332 Comm: rtcwake Tainted: G     U 
4.13.0-rc2+ #5
[ 1189.597665] Hardware name: Hewlett-Packard HP Compaq 8000 Elite CMT 
PC/3647h, BIOS 786G7 v01.13 07/20/2011
[ 1189.597665] task: ffff88010dce9880 task.stack: ffffc900000c8000
[ 1189.597665] RIP: 0010:rb_handle_timestamp.isra.32+0x71/0x80
[ 1189.597665] RSP: 0018:ffffc900000cbab0 EFLAGS: 00010082
[ 1189.597665] RAX: 00000000000000e0 RBX: ffffc900000cbad0 RCX: 
0000000000000004
[ 1189.597665] RDX: 0000000080000004 RSI: 0000000000000082 RDI: 
00000000ffffffff
[ 1189.597665] RBP: ffffc900000cbac0 R08: 0000000000000000 R09: 
00000000000000e0
[ 1189.597665] R10: ffffc900000cbbe0 R11: 00000000000ebc08 R12: 
ffff880117008890
[ 1189.597665] R13: ffff8801170088b0 R14: 00000000000003e8 R15: 
0000000000000005
[ 1189.597665] FS:  00007f93d957e700(0000) GS:ffff88011bc00000(0000) 
knlGS:0000000000000000
[ 1189.597665] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1189.597665] CR2: 000000f00255b068 CR3: 00000000d9063000 CR4: 
00000000000406f0
[ 1189.597665] Call Trace:
[ 1189.597665]  ring_buffer_lock_reserve+0x1fa/0x350
[ 1189.597665]  trace_vbprintk+0xdc/0x260
[ 1189.597665]  ? __irq_disable+0x1b/0xc0
[ 1189.597665]  __trace_bprintk+0x4a/0x60
[ 1189.597665]  ? preempt_count_add+0x9e/0xb0
[ 1189.597665]  __irq_disable+0x3f/0xc0
[ 1189.597665]  irq_disable+0x17/0x20
[ 1189.597665]  __disable_irq_nosync+0x59/0x70
[ 1189.597665]  disable_hardirq+0x11/0x30
[ 1189.597665]  hpet_msi_resume+0x85/0xd0
[ 1189.597665]  clockevents_tick_resume+0x14/0x20
[ 1189.597665]  tick_resume_local+0x32/0x60
[ 1189.597665]  tick_resume+0x13/0x20
[ 1189.597665]  timekeeping_resume+0x149/0x1a0
[ 1189.597665]  syscore_resume+0x4b/0x190
[ 1189.597665]  ? syscore_resume+0x4b/0x190
[ 1189.597665]  suspend_devices_and_enter+0x6b9/0x810
[ 1189.597665]  pm_suspend+0x367/0x540
[ 1189.597665]  state_store+0x7e/0xf0
[ 1189.597665]  kobj_attr_store+0xf/0x20
[ 1189.597665]  sysfs_kf_write+0x37/0x40
[ 1189.597665]  kernfs_fop_write+0x110/0x1a0
[ 1189.597665]  __vfs_write+0x28/0x130
[ 1189.597665]  ? __this_cpu_preempt_check+0x13/0x20
[ 1189.597665]  ? __sb_start_write+0x55/0xe0
[ 1189.597665]  vfs_write+0xb6/0x1a0
[ 1189.597665]  SyS_write+0x46/0xb0
[ 1189.597665]  entry_SYSCALL_64_fastpath+0x17/0x98
[ 1189.597665] RIP: 0033:0x7f93d90ac8f0
[ 1189.597665] RSP: 002b:00007ffd80a9ca88 EFLAGS: 00000246 ORIG_RAX: 
0000000000000001
[ 1189.597665] RAX: ffffffffffffffda RBX: 000000f00255a050 RCX: 
00007f93d90ac8f0
[ 1189.597665] RDX: 0000000000000004 RSI: 000000f00255a060 RDI: 
0000000000000007
[ 1189.597665] RBP: 00007f93d9375b00 R08: 000000f002557d90 R09: 
00007f93d957e700
[ 1189.597665] R10: 00007f93d9375b58 R11: 0000000000000246 R12: 
00007f93d9375b58
[ 1189.597665] R13: 00007f93d9375b58 R14: 000000000000270f R15: 
0000000000001010
[ 1189.597665] Code: f0 48 8b 73 08 85 c0 48 8b 13 48 c7 c0 88 9a a5 81 
49 c7 c0 1a 3b a8 81 4c 0f 44 c0 48 8b 0f 48 c7 c7 10 9b a5 81 e8 c0 c8 
fa ff <0f> ff eb a7 90 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 41 56
[ 1189.597665] ---[ end trace 7d99ac836f161565 ]---

- printing out the trace from /sys/kernel/debug/tracing/trace


Tomi
-- 
Intel Finland Oy - BIC 0357606-4 - Westendinkatu 7, 02160 Espoo

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Suspend-resume failure on Intel Eagle Lake Core2Duo
  2017-07-31  7:21                                             ` Tomi Sarvela
@ 2017-07-31  7:45                                               ` Thomas Gleixner
  2017-07-31  7:55                                                 ` Tomi Sarvela
  0 siblings, 1 reply; 45+ messages in thread
From: Thomas Gleixner @ 2017-07-31  7:45 UTC (permalink / raw)
  To: Tomi Sarvela; +Cc: Martin Peres, jeffy.chen, linux-kernel

On Mon, 31 Jul 2017, Tomi Sarvela wrote:
> On 28/07/17 19:26, Thomas Gleixner wrote:
> > Did you change anything else compared to the tests before ?
> 
> I did check that the problem persisted in linus-HEAD before testing your
> patch. The testing was done in order (reading from console logs I happen to
> still have in one window):

What I still do not understand is why this would affect the suspend path in
any way.

Can you remove the previous patch and apply the one below. If it resumes,
please provide the data from the trace buffer again.

Thanks,

	tglx

8<--------------
--- a/kernel/irq/internals.h
+++ b/kernel/irq/internals.h
@@ -459,3 +459,11 @@ static inline void irq_remove_debugfs_en
 {
 }
 #endif /* CONFIG_GENERIC_IRQ_DEBUGFS */
+
+extern bool irq_suspend_resume;
+
+static inline void irq_trace_state(const char *what, struct irq_desc *desc)
+{
+	trace_printk("%s %d state %08x\n", what, irq_desc_get_irq(desc),
+		     irqd_get(&desc->irq_data));
+}
--- a/kernel/irq/pm.c
+++ b/kernel/irq/pm.c
@@ -14,6 +14,8 @@
 
 #include "internals.h"
 
+bool irq_suspend_resume;
+
 bool irq_pm_check_wakeup(struct irq_desc *desc)
 {
 	if (irqd_is_wakeup_armed(&desc->irq_data)) {
@@ -120,6 +122,7 @@ void suspend_device_irqs(void)
 	struct irq_desc *desc;
 	int irq;
 
+	irq_suspend_resume = true;
 	for_each_irq_desc(irq, desc) {
 		unsigned long flags;
 		bool sync;
@@ -127,7 +130,9 @@ void suspend_device_irqs(void)
 		if (irq_settings_is_nested_thread(desc))
 			continue;
 		raw_spin_lock_irqsave(&desc->lock, flags);
+		irq_trace_state("presuspend", desc);
 		sync = suspend_device_irq(desc);
+		irq_trace_state("postsuspend", desc);
 		raw_spin_unlock_irqrestore(&desc->lock, flags);
 
 		if (sync)
@@ -150,8 +155,9 @@ static void resume_irq(struct irq_desc *
 	/* Pretend that it got disabled ! */
 	desc->depth++;
 	irq_state_set_disabled(desc);
-	irq_state_set_masked(desc);
+
 resume:
+	irq_state_set_masked(desc);
 	desc->istate &= ~IRQS_SUSPENDED;
 	__enable_irq(desc);
 }
@@ -172,9 +178,14 @@ static void resume_irqs(bool want_early)
 			continue;
 
 		raw_spin_lock_irqsave(&desc->lock, flags);
+		irq_trace_state("preresume", desc);
 		resume_irq(desc);
+		irq_trace_state("postresume", desc);
 		raw_spin_unlock_irqrestore(&desc->lock, flags);
 	}
+
+	if (!want_early)
+		irq_suspend_resume = false;
 }
 
 /**

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Suspend-resume failure on Intel Eagle Lake Core2Duo
  2017-07-31  7:45                                               ` Thomas Gleixner
@ 2017-07-31  7:55                                                 ` Tomi Sarvela
  2017-07-31  8:29                                                   ` Thomas Gleixner
  2017-07-31 15:06                                                   ` Thomas Gleixner
  0 siblings, 2 replies; 45+ messages in thread
From: Tomi Sarvela @ 2017-07-31  7:55 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: Martin Peres, jeffy.chen, linux-kernel

On 31/07/17 10:45, Thomas Gleixner wrote:
> On Mon, 31 Jul 2017, Tomi Sarvela wrote:
>> On 28/07/17 19:26, Thomas Gleixner wrote:
>>> Did you change anything else compared to the tests before ?
>>
>> I did check that the problem persisted in linus-HEAD before testing your
>> patch. The testing was done in order (reading from console logs I happen to
>> still have in one window):
> 
> What I still do not understand is why this would affect the suspend path in
> any way.
> 
> Can you remove the previous patch and apply the one below. If it resumes,
> please provide the data from the trace buffer again.

No such luck. ELK hangs in the suspend-test with earlier patch removed, 
this added. Checked again that the power-led is on, no serial output.

Tree not pulled: still testing against the previous head -rc2, not 
current 4.13.0-rc3

Best regards,

Tomi
-- 
Intel Finland Oy - BIC 0357606-4 - Westendinkatu 7, 02160 Espoo

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Suspend-resume failure on Intel Eagle Lake Core2Duo
  2017-07-31  7:55                                                 ` Tomi Sarvela
@ 2017-07-31  8:29                                                   ` Thomas Gleixner
  2017-07-31  8:41                                                     ` Tomi Sarvela
  2017-07-31 15:06                                                   ` Thomas Gleixner
  1 sibling, 1 reply; 45+ messages in thread
From: Thomas Gleixner @ 2017-07-31  8:29 UTC (permalink / raw)
  To: Tomi Sarvela; +Cc: Martin Peres, jeffy.chen, linux-kernel

On Mon, 31 Jul 2017, Tomi Sarvela wrote:
> On 31/07/17 10:45, Thomas Gleixner wrote:
> > On Mon, 31 Jul 2017, Tomi Sarvela wrote:
> > > On 28/07/17 19:26, Thomas Gleixner wrote:
> > > > Did you change anything else compared to the tests before ?
> > > 
> > > I did check that the problem persisted in linus-HEAD before testing your
> > > patch. The testing was done in order (reading from console logs I happen
> > > to
> > > still have in one window):
> > 
> > What I still do not understand is why this would affect the suspend path in
> > any way.
> > 
> > Can you remove the previous patch and apply the one below. If it resumes,
> > please provide the data from the trace buffer again.
> 
> No such luck. ELK hangs in the suspend-test with earlier patch removed, this
> added. Checked again that the power-led is on, no serial output.
> 
> Tree not pulled: still testing against the previous head -rc2, not current
> 4.13.0-rc3

Shouldn't make a difference. Can you please try the following:

Offline CPU1 before invoking suspend.

# echo 0 >/sys/devices/system/cpus/cpu1/offline

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Suspend-resume failure on Intel Eagle Lake Core2Duo
  2017-07-28 14:55                                         ` Tomi Sarvela
  2017-07-28 16:26                                           ` Thomas Gleixner
@ 2017-07-31  8:35                                           ` Thomas Gleixner
  1 sibling, 0 replies; 45+ messages in thread
From: Thomas Gleixner @ 2017-07-31  8:35 UTC (permalink / raw)
  To: Tomi Sarvela; +Cc: Martin Peres, jeffy.chen, LKML, Rafael J. Wysocki

Rafael,

On Fri, 28 Jul 2017, Tomi Sarvela wrote:
>          rtcwake-1332  [000] d..1    64.411112: suspend_device_irqs:
> presuspend 8 state 00401200
>          rtcwake-1332  [000] d..1    64.411112: __irq_disable: predisable 8
> state 00401200
>          rtcwake-1332  [000] d..1    64.411113: __irq_disable: postdisable 8
> state 00411200
>          rtcwake-1332  [000] d..1    64.411114: suspend_device_irqs:
> postsuspend 8 state 00411200
>          rtcwake-1332  [000] d..1    64.411115: suspend_device_irqs:
> presuspend 9 state 00403300
>          rtcwake-1332  [000] d..1    64.411115: __irq_disable: predisable 9
> state 00403300
>          rtcwake-1332  [000] d..1    64.411116: __irq_disable: postdisable 9
> state 00413300
>          rtcwake-1332  [000] d..1    64.411116: suspend_device_irqs:
> postsuspend 9 state 00413300
>          rtcwake-1332  [000] d..1    64.411117: suspend_device_irqs:

I'm trying to analyze that suspend/resume problem. While staring at the
trace Tomi provided I noticed, that neither the RTC irq nor the ACPI irq
have the wakeup armed bit set, but the system resumes from an RTC
interrupt. That's inconsistent at least, or am I missing something
essential?

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Suspend-resume failure on Intel Eagle Lake Core2Duo
  2017-07-31  8:29                                                   ` Thomas Gleixner
@ 2017-07-31  8:41                                                     ` Tomi Sarvela
  2017-07-31 14:04                                                       ` Thomas Gleixner
  0 siblings, 1 reply; 45+ messages in thread
From: Tomi Sarvela @ 2017-07-31  8:41 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: Martin Peres, jeffy.chen, linux-kernel

On 31/07/17 11:29, Thomas Gleixner wrote:
> On Mon, 31 Jul 2017, Tomi Sarvela wrote:
>> On 31/07/17 10:45, Thomas Gleixner wrote:
>>> On Mon, 31 Jul 2017, Tomi Sarvela wrote:
>>>> On 28/07/17 19:26, Thomas Gleixner wrote:
>>>>> Did you change anything else compared to the tests before ?
>>>>
>>>> I did check that the problem persisted in linus-HEAD before testing your
>>>> patch. The testing was done in order (reading from console logs I happen
>>>> to
>>>> still have in one window):
>>>
>>> What I still do not understand is why this would affect the suspend path in
>>> any way.
>>>
>>> Can you remove the previous patch and apply the one below. If it resumes,
>>> please provide the data from the trace buffer again.
>>
>> No such luck. ELK hangs in the suspend-test with earlier patch removed, this
>> added. Checked again that the power-led is on, no serial output.
>>
>> Tree not pulled: still testing against the previous head -rc2, not current
>> 4.13.0-rc3
> 
> Shouldn't make a difference. Can you please try the following:
> 
> Offline CPU1 before invoking suspend.
> 
> # echo 0 >/sys/devices/system/cpus/cpu1/offline

Tested with your latest patch (irq_trace_state):

echo 0 >/sys/devices/system/cpu/cpu1/online

./scripts/run-tests.sh -vt igt@gem_exec_suspend@basic-s3 -x devices

No change, no wakey at all.

Tomi
-- 
Intel Finland Oy - BIC 0357606-4 - Westendinkatu 7, 02160 Espoo

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Suspend-resume failure on Intel Eagle Lake Core2Duo
  2017-07-31  8:41                                                     ` Tomi Sarvela
@ 2017-07-31 14:04                                                       ` Thomas Gleixner
  2017-07-31 14:13                                                         ` Tomi Sarvela
  0 siblings, 1 reply; 45+ messages in thread
From: Thomas Gleixner @ 2017-07-31 14:04 UTC (permalink / raw)
  To: Tomi Sarvela; +Cc: Martin Peres, jeffy.chen, linux-kernel

On Mon, 31 Jul 2017, Tomi Sarvela wrote:
> On 31/07/17 11:29, Thomas Gleixner wrote:
> > On Mon, 31 Jul 2017, Tomi Sarvela wrote:
> > > On 31/07/17 10:45, Thomas Gleixner wrote:
> > > > On Mon, 31 Jul 2017, Tomi Sarvela wrote:
> > > > > On 28/07/17 19:26, Thomas Gleixner wrote:
> > > > > > Did you change anything else compared to the tests before ?
> > > > > 
> > > > > I did check that the problem persisted in linus-HEAD before testing
> > > > > your
> > > > > patch. The testing was done in order (reading from console logs I
> > > > > happen
> > > > > to
> > > > > still have in one window):
> > > > 
> > > > What I still do not understand is why this would affect the suspend path
> > > > in
> > > > any way.
> > > > 
> > > > Can you remove the previous patch and apply the one below. If it
> > > > resumes,
> > > > please provide the data from the trace buffer again.
> > > 
> > > No such luck. ELK hangs in the suspend-test with earlier patch removed,
> > > this
> > > added. Checked again that the power-led is on, no serial output.
> > > 
> > > Tree not pulled: still testing against the previous head -rc2, not current
> > > 4.13.0-rc3
> > 
> > Shouldn't make a difference. Can you please try the following:
> > 
> > Offline CPU1 before invoking suspend.
> > 
> > # echo 0 >/sys/devices/system/cpus/cpu1/offline
> 
> Tested with your latest patch (irq_trace_state):
> 
> echo 0 >/sys/devices/system/cpu/cpu1/online
> 
> ./scripts/run-tests.sh -vt igt@gem_exec_suspend@basic-s3 -x devices

So this "igt@gem_exec_suspend@basic-s3" thingy is that executing anything
extra aside of 'echo mem > /sys/power/state'?

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Suspend-resume failure on Intel Eagle Lake Core2Duo
  2017-07-31 14:04                                                       ` Thomas Gleixner
@ 2017-07-31 14:13                                                         ` Tomi Sarvela
  0 siblings, 0 replies; 45+ messages in thread
From: Tomi Sarvela @ 2017-07-31 14:13 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: Martin Peres, jeffy.chen, linux-kernel

On 31/07/17 17:04, Thomas Gleixner wrote:
> On Mon, 31 Jul 2017, Tomi Sarvela wrote:
>> On 31/07/17 11:29, Thomas Gleixner wrote:
>>> On Mon, 31 Jul 2017, Tomi Sarvela wrote:
>>>> On 31/07/17 10:45, Thomas Gleixner wrote:
>>>>> On Mon, 31 Jul 2017, Tomi Sarvela wrote:
>>>>>> On 28/07/17 19:26, Thomas Gleixner wrote:
>>>>>>> Did you change anything else compared to the tests before ?
>>>>>>
>>>>>> I did check that the problem persisted in linus-HEAD before testing
>>>>>> your
>>>>>> patch. The testing was done in order (reading from console logs I
>>>>>> happen
>>>>>> to
>>>>>> still have in one window):
>>>>>
>>>>> What I still do not understand is why this would affect the suspend path
>>>>> in
>>>>> any way.
>>>>>
>>>>> Can you remove the previous patch and apply the one below. If it
>>>>> resumes,
>>>>> please provide the data from the trace buffer again.
>>>>
>>>> No such luck. ELK hangs in the suspend-test with earlier patch removed,
>>>> this
>>>> added. Checked again that the power-led is on, no serial output.
>>>>
>>>> Tree not pulled: still testing against the previous head -rc2, not current
>>>> 4.13.0-rc3
>>>
>>> Shouldn't make a difference. Can you please try the following:
>>>
>>> Offline CPU1 before invoking suspend.
>>>
>>> # echo 0 >/sys/devices/system/cpus/cpu1/offline
>>
>> Tested with your latest patch (irq_trace_state):
>>
>> echo 0 >/sys/devices/system/cpu/cpu1/online
>>
>> ./scripts/run-tests.sh -vt igt@gem_exec_suspend@basic-s3 -x devices
> 
> So this "igt@gem_exec_suspend@basic-s3" thingy is that executing anything
> extra aside of 'echo mem > /sys/power/state'?

It's setting wakeup with rtcwake to +15 seconds, then suspending.

Complete information glanceable from sources:

https://cgit.freedesktop.org/xorg/app/intel-gpu-tools/tree/tests/gem_exec_suspend.c 
->

https://cgit.freedesktop.org/xorg/app/intel-gpu-tools/tree/lib/igt_aux.c:void 
igt_system_suspend_autoresume

Tomi
-- 
Intel Finland Oy - BIC 0357606-4 - Westendinkatu 7, 02160 Espoo

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Suspend-resume failure on Intel Eagle Lake Core2Duo
  2017-07-31  7:55                                                 ` Tomi Sarvela
  2017-07-31  8:29                                                   ` Thomas Gleixner
@ 2017-07-31 15:06                                                   ` Thomas Gleixner
  2017-07-31 15:44                                                     ` Tomi Sarvela
  1 sibling, 1 reply; 45+ messages in thread
From: Thomas Gleixner @ 2017-07-31 15:06 UTC (permalink / raw)
  To: Tomi Sarvela; +Cc: Martin Peres, jeffy.chen, linux-kernel

On Mon, 31 Jul 2017, Tomi Sarvela wrote:
> On 31/07/17 10:45, Thomas Gleixner wrote:
> > On Mon, 31 Jul 2017, Tomi Sarvela wrote:
> > > On 28/07/17 19:26, Thomas Gleixner wrote:
> > > > Did you change anything else compared to the tests before ?
> > > 
> > > I did check that the problem persisted in linus-HEAD before testing your
> > > patch. The testing was done in order (reading from console logs I happen
> > > to
> > > still have in one window):
> > 
> > What I still do not understand is why this would affect the suspend path in
> > any way.
> > 
> > Can you remove the previous patch and apply the one below. If it resumes,
> > please provide the data from the trace buffer again.
> 
> No such luck. ELK hangs in the suspend-test with earlier patch removed, this
> added. Checked again that the power-led is on, no serial output.

Can you please remove the patch. And try the following:

# echo N > /sys/module/printk/parameters/console_suspend

# echo mem > /sys/power/state

and log the output of the serial console. That way we might get a clue
where it gets stuck.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Suspend-resume failure on Intel Eagle Lake Core2Duo
  2017-07-31 15:06                                                   ` Thomas Gleixner
@ 2017-07-31 15:44                                                     ` Tomi Sarvela
  2017-07-31 15:55                                                       ` Thomas Gleixner
  0 siblings, 1 reply; 45+ messages in thread
From: Tomi Sarvela @ 2017-07-31 15:44 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: Martin Peres, jeffy.chen, linux-kernel

On 31/07/17 18:06, Thomas Gleixner wrote:
> On Mon, 31 Jul 2017, Tomi Sarvela wrote:
>> On 31/07/17 10:45, Thomas Gleixner wrote:
>>> On Mon, 31 Jul 2017, Tomi Sarvela wrote:
>>>> On 28/07/17 19:26, Thomas Gleixner wrote:
>>>>> Did you change anything else compared to the tests before ?
>>>>
>>>> I did check that the problem persisted in linus-HEAD before testing your
>>>> patch. The testing was done in order (reading from console logs I happen
>>>> to
>>>> still have in one window):
>>>
>>> What I still do not understand is why this would affect the suspend path in
>>> any way.
>>>
>>> Can you remove the previous patch and apply the one below. If it resumes,
>>> please provide the data from the trace buffer again.
>>
>> No such luck. ELK hangs in the suspend-test with earlier patch removed, this
>> added. Checked again that the power-led is on, no serial output.
> 
> Can you please remove the patch. And try the following:
> 
> # echo N > /sys/module/printk/parameters/console_suspend
> 
> # echo mem > /sys/power/state
> 
> and log the output of the serial console. That way we might get a clue
> where it gets stuck.

I'm afraid it hangs right away. No response from SSH, no output to serial.

Tomi
-- 
Intel Finland Oy - BIC 0357606-4 - Westendinkatu 7, 02160 Espoo

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Suspend-resume failure on Intel Eagle Lake Core2Duo
  2017-07-31 15:44                                                     ` Tomi Sarvela
@ 2017-07-31 15:55                                                       ` Thomas Gleixner
  2017-08-03  7:32                                                         ` Masahiro Yamada
  0 siblings, 1 reply; 45+ messages in thread
From: Thomas Gleixner @ 2017-07-31 15:55 UTC (permalink / raw)
  To: Tomi Sarvela; +Cc: Martin Peres, jeffy.chen, linux-kernel

On Mon, 31 Jul 2017, Tomi Sarvela wrote:
> On 31/07/17 18:06, Thomas Gleixner wrote:
> > Can you please remove the patch. And try the following:
> > 
> > # echo N > /sys/module/printk/parameters/console_suspend
> > 
> > # echo mem > /sys/power/state
> > 
> > and log the output of the serial console. That way we might get a clue
> > where it gets stuck.
> 
> I'm afraid it hangs right away. No response from SSH, no output to serial.

What means hangs right away? Is there no output at all on the serial
console? Or does it just stop at some point?

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Suspend-resume failure on Intel Eagle Lake Core2Duo
  2017-07-31 15:55                                                       ` Thomas Gleixner
@ 2017-08-03  7:32                                                         ` Masahiro Yamada
  2017-08-03  8:23                                                           ` Peter Zijlstra
  2017-08-03  8:41                                                           ` Marc Zyngier
  0 siblings, 2 replies; 45+ messages in thread
From: Masahiro Yamada @ 2017-08-03  7:32 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Tomi Sarvela, Martin Peres, Jeffy Chen, Linux Kernel Mailing List

Hi.

2017-08-01 0:55 GMT+09:00 Thomas Gleixner <tglx@linutronix.de>:
> On Mon, 31 Jul 2017, Tomi Sarvela wrote:
>> On 31/07/17 18:06, Thomas Gleixner wrote:
>> > Can you please remove the patch. And try the following:
>> >
>> > # echo N > /sys/module/printk/parameters/console_suspend
>> >
>> > # echo mem > /sys/power/state
>> >
>> > and log the output of the serial console. That way we might get a clue
>> > where it gets stuck.
>>
>> I'm afraid it hangs right away. No response from SSH, no output to serial.
>
> What means hangs right away? Is there no output at all on the serial
> console? Or does it just stop at some point?
>
> Thanks,
>
>         tglx
>

Sorry for jumping in.
Finally, I found this thread.


My environment is completely different (ARM64 board),
I am also suffering from a hibernation problem
since this commit.


I get no response on the serial console
after "Restarting tasks ... done." log message.


By reverting bf22ff45bed6 ("genirq: Avoid unnecessary low level
irq function calls", I can get hibernation working again.


SW info:
defconfig:  arch/arm64/configs/defconfig
DT       :  arch/arm64/boot/dts/socionext/uniphier-ld20-ref.dts
PSCI     :  ARM Trusted Firmware


SoC info:
CPU      :  Cortex-A72 * 2 + Cortex-A53 * 2
irqchip  :  GICv3 (drivers/irq/irq-gic-v3.c)





-- 
Best Regards
Masahiro Yamada

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Suspend-resume failure on Intel Eagle Lake Core2Duo
  2017-08-03  7:32                                                         ` Masahiro Yamada
@ 2017-08-03  8:23                                                           ` Peter Zijlstra
  2017-08-03  8:41                                                           ` Marc Zyngier
  1 sibling, 0 replies; 45+ messages in thread
From: Peter Zijlstra @ 2017-08-03  8:23 UTC (permalink / raw)
  To: Masahiro Yamada
  Cc: Thomas Gleixner, Tomi Sarvela, Martin Peres, Jeffy Chen,
	Linux Kernel Mailing List

On Thu, Aug 03, 2017 at 04:32:09PM +0900, Masahiro Yamada wrote:

> My environment is completely different (ARM64 board),
> I am also suffering from a hibernation problem
> since this commit.
> 
> 
> I get no response on the serial console
> after "Restarting tasks ... done." log message.
> 
> 
> By reverting bf22ff45bed6 ("genirq: Avoid unnecessary low level
> irq function calls", I can get hibernation working again.

https://lkml.kernel.org/r/alpine.DEB.2.20.1707312158590.2287@nanos

Is the patch that cured the x86 issue, but maybe that gives a clue as
what to look for in your platform.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Suspend-resume failure on Intel Eagle Lake Core2Duo
  2017-08-03  7:32                                                         ` Masahiro Yamada
  2017-08-03  8:23                                                           ` Peter Zijlstra
@ 2017-08-03  8:41                                                           ` Marc Zyngier
  2017-08-03 12:52                                                             ` Masahiro Yamada
  1 sibling, 1 reply; 45+ messages in thread
From: Marc Zyngier @ 2017-08-03  8:41 UTC (permalink / raw)
  To: Masahiro Yamada, Thomas Gleixner
  Cc: Tomi Sarvela, Martin Peres, Jeffy Chen, Linux Kernel Mailing List

Hi Masahiro,

On 03/08/17 08:32, Masahiro Yamada wrote:
> Hi.
> 
> 2017-08-01 0:55 GMT+09:00 Thomas Gleixner <tglx@linutronix.de>:
>> On Mon, 31 Jul 2017, Tomi Sarvela wrote:
>>> On 31/07/17 18:06, Thomas Gleixner wrote:
>>>> Can you please remove the patch. And try the following:
>>>>
>>>> # echo N > /sys/module/printk/parameters/console_suspend
>>>>
>>>> # echo mem > /sys/power/state
>>>>
>>>> and log the output of the serial console. That way we might get a clue
>>>> where it gets stuck.
>>>
>>> I'm afraid it hangs right away. No response from SSH, no output to serial.
>>
>> What means hangs right away? Is there no output at all on the serial
>> console? Or does it just stop at some point?
>>
>> Thanks,
>>
>>         tglx
>>
> 
> Sorry for jumping in.
> Finally, I found this thread.
> 
> 
> My environment is completely different (ARM64 board),
> I am also suffering from a hibernation problem
> since this commit.
> 
> 
> I get no response on the serial console
> after "Restarting tasks ... done." log message.
> 
> 
> By reverting bf22ff45bed6 ("genirq: Avoid unnecessary low level
> irq function calls", I can get hibernation working again.
> 
> 
> SW info:
> defconfig:  arch/arm64/configs/defconfig
> DT       :  arch/arm64/boot/dts/socionext/uniphier-ld20-ref.dts
> PSCI     :  ARM Trusted Firmware
> 
> 
> SoC info:
> CPU      :  Cortex-A72 * 2 + Cortex-A53 * 2
> irqchip  :  GICv3 (drivers/irq/irq-gic-v3.c)

Let me take an educated guess: It feels like your firmware doesn't
save/restore the GIC context across suspend/resume. Is that something
you could check, assuming you have access to the firmware source code?

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Suspend-resume failure on Intel Eagle Lake Core2Duo
  2017-08-03  8:41                                                           ` Marc Zyngier
@ 2017-08-03 12:52                                                             ` Masahiro Yamada
  2017-08-03 13:30                                                               ` Marc Zyngier
  0 siblings, 1 reply; 45+ messages in thread
From: Masahiro Yamada @ 2017-08-03 12:52 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: Thomas Gleixner, Tomi Sarvela, Martin Peres, Jeffy Chen,
	Linux Kernel Mailing List

Hi Marc,

2017-08-03 17:41 GMT+09:00 Marc Zyngier <marc.zyngier@arm.com>:
> Hi Masahiro,
>
> On 03/08/17 08:32, Masahiro Yamada wrote:
>> Hi.
>>
>> 2017-08-01 0:55 GMT+09:00 Thomas Gleixner <tglx@linutronix.de>:
>>> On Mon, 31 Jul 2017, Tomi Sarvela wrote:
>>>> On 31/07/17 18:06, Thomas Gleixner wrote:
>>>>> Can you please remove the patch. And try the following:
>>>>>
>>>>> # echo N > /sys/module/printk/parameters/console_suspend
>>>>>
>>>>> # echo mem > /sys/power/state
>>>>>
>>>>> and log the output of the serial console. That way we might get a clue
>>>>> where it gets stuck.
>>>>
>>>> I'm afraid it hangs right away. No response from SSH, no output to serial.
>>>
>>> What means hangs right away? Is there no output at all on the serial
>>> console? Or does it just stop at some point?
>>>
>>> Thanks,
>>>
>>>         tglx
>>>
>>
>> Sorry for jumping in.
>> Finally, I found this thread.
>>
>>
>> My environment is completely different (ARM64 board),
>> I am also suffering from a hibernation problem
>> since this commit.
>>
>>
>> I get no response on the serial console
>> after "Restarting tasks ... done." log message.
>>
>>
>> By reverting bf22ff45bed6 ("genirq: Avoid unnecessary low level
>> irq function calls", I can get hibernation working again.
>>
>>
>> SW info:
>> defconfig:  arch/arm64/configs/defconfig
>> DT       :  arch/arm64/boot/dts/socionext/uniphier-ld20-ref.dts
>> PSCI     :  ARM Trusted Firmware
>>
>>
>> SoC info:
>> CPU      :  Cortex-A72 * 2 + Cortex-A53 * 2
>> irqchip  :  GICv3 (drivers/irq/irq-gic-v3.c)
>
> Let me take an educated guess: It feels like your firmware doesn't
> save/restore the GIC context across suspend/resume. Is that something
> you could check, assuming you have access to the firmware source code?

Thanks for your comments.


I do not know much about the manner of preserving GICv3 context.

I can see this patch  (rejected?) :
https://patchwork.kernel.org/patch/9343061/


Is it something that should be completely cared by firmware
instead of kernel?


ARM Trusted Firmware (https://github.com/ARM-software/arm-trusted-firmware)
is open source software, and I pushed my platform code to the upstream.

So, yes, I (and everybody) can have access to the firmware source code.


I am not sure how ATF saves the context during hibernation, though.



-- 
Best Regards
Masahiro Yamada

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Suspend-resume failure on Intel Eagle Lake Core2Duo
  2017-08-03 12:52                                                             ` Masahiro Yamada
@ 2017-08-03 13:30                                                               ` Marc Zyngier
  2017-08-07  4:45                                                                 ` Masahiro Yamada
  0 siblings, 1 reply; 45+ messages in thread
From: Marc Zyngier @ 2017-08-03 13:30 UTC (permalink / raw)
  To: Masahiro Yamada
  Cc: Thomas Gleixner, Tomi Sarvela, Martin Peres, Jeffy Chen,
	Linux Kernel Mailing List

On 03/08/17 13:52, Masahiro Yamada wrote:
> Hi Marc,
> 
> 2017-08-03 17:41 GMT+09:00 Marc Zyngier <marc.zyngier@arm.com>:
>> Hi Masahiro,
>>
>> On 03/08/17 08:32, Masahiro Yamada wrote:
>>> Hi.
>>>
>>> 2017-08-01 0:55 GMT+09:00 Thomas Gleixner <tglx@linutronix.de>:
>>>> On Mon, 31 Jul 2017, Tomi Sarvela wrote:
>>>>> On 31/07/17 18:06, Thomas Gleixner wrote:
>>>>>> Can you please remove the patch. And try the following:
>>>>>>
>>>>>> # echo N > /sys/module/printk/parameters/console_suspend
>>>>>>
>>>>>> # echo mem > /sys/power/state
>>>>>>
>>>>>> and log the output of the serial console. That way we might get a clue
>>>>>> where it gets stuck.
>>>>>
>>>>> I'm afraid it hangs right away. No response from SSH, no output to serial.
>>>>
>>>> What means hangs right away? Is there no output at all on the serial
>>>> console? Or does it just stop at some point?
>>>>
>>>> Thanks,
>>>>
>>>>         tglx
>>>>
>>>
>>> Sorry for jumping in.
>>> Finally, I found this thread.
>>>
>>>
>>> My environment is completely different (ARM64 board),
>>> I am also suffering from a hibernation problem
>>> since this commit.
>>>
>>>
>>> I get no response on the serial console
>>> after "Restarting tasks ... done." log message.
>>>
>>>
>>> By reverting bf22ff45bed6 ("genirq: Avoid unnecessary low level
>>> irq function calls", I can get hibernation working again.
>>>
>>>
>>> SW info:
>>> defconfig:  arch/arm64/configs/defconfig
>>> DT       :  arch/arm64/boot/dts/socionext/uniphier-ld20-ref.dts
>>> PSCI     :  ARM Trusted Firmware
>>>
>>>
>>> SoC info:
>>> CPU      :  Cortex-A72 * 2 + Cortex-A53 * 2
>>> irqchip  :  GICv3 (drivers/irq/irq-gic-v3.c)
>>
>> Let me take an educated guess: It feels like your firmware doesn't
>> save/restore the GIC context across suspend/resume. Is that something
>> you could check, assuming you have access to the firmware source code?
> 
> Thanks for your comments.
> 
> 
> I do not know much about the manner of preserving GICv3 context.
> 
> I can see this patch  (rejected?) :
> https://patchwork.kernel.org/patch/9343061/
> 
> 
> Is it something that should be completely cared by firmware
> instead of kernel?

That was definitely the intention, but it looks like something that ATF
has only started supporting very recently:

https://github.com/ARM-software/arm-trusted-firmware/pull/1047

> ARM Trusted Firmware (https://github.com/ARM-software/arm-trusted-firmware)
> is open source software, and I pushed my platform code to the upstream.
> 
> So, yes, I (and everybody) can have access to the firmware source code.
> 
> 
> I am not sure how ATF saves the context during hibernation, though.

See the above link. Is there any chance of you trying this into your
firmware?

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Suspend-resume failure on Intel Eagle Lake Core2Duo
  2017-08-03 13:30                                                               ` Marc Zyngier
@ 2017-08-07  4:45                                                                 ` Masahiro Yamada
  2017-08-07  8:17                                                                   ` Marc Zyngier
  0 siblings, 1 reply; 45+ messages in thread
From: Masahiro Yamada @ 2017-08-07  4:45 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: Thomas Gleixner, Tomi Sarvela, Martin Peres, Jeffy Chen,
	Linux Kernel Mailing List

Hi Marc,


2017-08-03 22:30 GMT+09:00 Marc Zyngier <marc.zyngier@arm.com>:
> On 03/08/17 13:52, Masahiro Yamada wrote:
>> Hi Marc,
>>
>> 2017-08-03 17:41 GMT+09:00 Marc Zyngier <marc.zyngier@arm.com>:
>>> Hi Masahiro,
>>>
>>> On 03/08/17 08:32, Masahiro Yamada wrote:
>>>> Hi.
>>>>
>>>> 2017-08-01 0:55 GMT+09:00 Thomas Gleixner <tglx@linutronix.de>:
>>>>> On Mon, 31 Jul 2017, Tomi Sarvela wrote:
>>>>>> On 31/07/17 18:06, Thomas Gleixner wrote:
>>>>>>> Can you please remove the patch. And try the following:
>>>>>>>
>>>>>>> # echo N > /sys/module/printk/parameters/console_suspend
>>>>>>>
>>>>>>> # echo mem > /sys/power/state
>>>>>>>
>>>>>>> and log the output of the serial console. That way we might get a clue
>>>>>>> where it gets stuck.
>>>>>>
>>>>>> I'm afraid it hangs right away. No response from SSH, no output to serial.
>>>>>
>>>>> What means hangs right away? Is there no output at all on the serial
>>>>> console? Or does it just stop at some point?
>>>>>
>>>>> Thanks,
>>>>>
>>>>>         tglx
>>>>>
>>>>
>>>> Sorry for jumping in.
>>>> Finally, I found this thread.
>>>>
>>>>
>>>> My environment is completely different (ARM64 board),
>>>> I am also suffering from a hibernation problem
>>>> since this commit.
>>>>
>>>>
>>>> I get no response on the serial console
>>>> after "Restarting tasks ... done." log message.
>>>>
>>>>
>>>> By reverting bf22ff45bed6 ("genirq: Avoid unnecessary low level
>>>> irq function calls", I can get hibernation working again.
>>>>
>>>>
>>>> SW info:
>>>> defconfig:  arch/arm64/configs/defconfig
>>>> DT       :  arch/arm64/boot/dts/socionext/uniphier-ld20-ref.dts
>>>> PSCI     :  ARM Trusted Firmware
>>>>
>>>>
>>>> SoC info:
>>>> CPU      :  Cortex-A72 * 2 + Cortex-A53 * 2
>>>> irqchip  :  GICv3 (drivers/irq/irq-gic-v3.c)
>>>
>>> Let me take an educated guess: It feels like your firmware doesn't
>>> save/restore the GIC context across suspend/resume. Is that something
>>> you could check, assuming you have access to the firmware source code?
>>
>> Thanks for your comments.
>>
>>
>> I do not know much about the manner of preserving GICv3 context.
>>
>> I can see this patch  (rejected?) :
>> https://patchwork.kernel.org/patch/9343061/
>>
>>
>> Is it something that should be completely cared by firmware
>> instead of kernel?
>
> That was definitely the intention, but it looks like something that ATF
> has only started supporting very recently:
>
> https://github.com/ARM-software/arm-trusted-firmware/pull/1047
>
>> ARM Trusted Firmware (https://github.com/ARM-software/arm-trusted-firmware)
>> is open source software, and I pushed my platform code to the upstream.
>>
>> So, yes, I (and everybody) can have access to the firmware source code.
>>
>>
>> I am not sure how ATF saves the context during hibernation, though.
>
> See the above link. Is there any chance of you trying this into your
> firmware?
>
> Thanks,

Thanks for the pointer.


Yes.  I will try that once GIC-v3 context save/restore is supported in ATF.

I think that will basically work for suspend-to-ram
because all contexts including both non-secure and secure worlds will
be retained in the main memory.

However, I still do not understand how the context is preserved during
the hibernation (suspend-to-disk).


If my understanding is correct, hibernation on Linux works like follows:

[1] Freeze all tasks
[2] CPU_OFF for non-boot CPUs
[3] Create a hibernation image
[4] CPU_ON for non-boot CPUs
[5] Write the hibernation image to the disk (=swap area)
[6] SYSTEM_OFF


IIUC, [5] only writes the context Linux takes care of (only non-secure).

If so, where and how does the firmware write the GIC-v3 context
to the disk?


-- 
Best Regards
Masahiro Yamada

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Suspend-resume failure on Intel Eagle Lake Core2Duo
  2017-08-07  4:45                                                                 ` Masahiro Yamada
@ 2017-08-07  8:17                                                                   ` Marc Zyngier
  2017-08-08  1:30                                                                     ` Masahiro Yamada
  0 siblings, 1 reply; 45+ messages in thread
From: Marc Zyngier @ 2017-08-07  8:17 UTC (permalink / raw)
  To: Masahiro Yamada
  Cc: Thomas Gleixner, Tomi Sarvela, Martin Peres, Jeffy Chen,
	Linux Kernel Mailing List

On 07/08/17 05:45, Masahiro Yamada wrote:
> Hi Marc,
> 
> 
> 2017-08-03 22:30 GMT+09:00 Marc Zyngier <marc.zyngier@arm.com>:
>> On 03/08/17 13:52, Masahiro Yamada wrote:
>>> Hi Marc,
>>>
>>> 2017-08-03 17:41 GMT+09:00 Marc Zyngier <marc.zyngier@arm.com>:
>>>> Hi Masahiro,
>>>>
>>>> On 03/08/17 08:32, Masahiro Yamada wrote:
>>>>> Hi.
>>>>>
>>>>> 2017-08-01 0:55 GMT+09:00 Thomas Gleixner <tglx@linutronix.de>:
>>>>>> On Mon, 31 Jul 2017, Tomi Sarvela wrote:
>>>>>>> On 31/07/17 18:06, Thomas Gleixner wrote:
>>>>>>>> Can you please remove the patch. And try the following:
>>>>>>>>
>>>>>>>> # echo N > /sys/module/printk/parameters/console_suspend
>>>>>>>>
>>>>>>>> # echo mem > /sys/power/state
>>>>>>>>
>>>>>>>> and log the output of the serial console. That way we might get a clue
>>>>>>>> where it gets stuck.
>>>>>>>
>>>>>>> I'm afraid it hangs right away. No response from SSH, no output to serial.
>>>>>>
>>>>>> What means hangs right away? Is there no output at all on the serial
>>>>>> console? Or does it just stop at some point?
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>>         tglx
>>>>>>
>>>>>
>>>>> Sorry for jumping in.
>>>>> Finally, I found this thread.
>>>>>
>>>>>
>>>>> My environment is completely different (ARM64 board),
>>>>> I am also suffering from a hibernation problem
>>>>> since this commit.
>>>>>
>>>>>
>>>>> I get no response on the serial console
>>>>> after "Restarting tasks ... done." log message.
>>>>>
>>>>>
>>>>> By reverting bf22ff45bed6 ("genirq: Avoid unnecessary low level
>>>>> irq function calls", I can get hibernation working again.
>>>>>
>>>>>
>>>>> SW info:
>>>>> defconfig:  arch/arm64/configs/defconfig
>>>>> DT       :  arch/arm64/boot/dts/socionext/uniphier-ld20-ref.dts
>>>>> PSCI     :  ARM Trusted Firmware
>>>>>
>>>>>
>>>>> SoC info:
>>>>> CPU      :  Cortex-A72 * 2 + Cortex-A53 * 2
>>>>> irqchip  :  GICv3 (drivers/irq/irq-gic-v3.c)
>>>>
>>>> Let me take an educated guess: It feels like your firmware doesn't
>>>> save/restore the GIC context across suspend/resume. Is that something
>>>> you could check, assuming you have access to the firmware source code?
>>>
>>> Thanks for your comments.
>>>
>>>
>>> I do not know much about the manner of preserving GICv3 context.
>>>
>>> I can see this patch  (rejected?) :
>>> https://patchwork.kernel.org/patch/9343061/
>>>
>>>
>>> Is it something that should be completely cared by firmware
>>> instead of kernel?
>>
>> That was definitely the intention, but it looks like something that ATF
>> has only started supporting very recently:
>>
>> https://github.com/ARM-software/arm-trusted-firmware/pull/1047
>>
>>> ARM Trusted Firmware (https://github.com/ARM-software/arm-trusted-firmware)
>>> is open source software, and I pushed my platform code to the upstream.
>>>
>>> So, yes, I (and everybody) can have access to the firmware source code.
>>>
>>>
>>> I am not sure how ATF saves the context during hibernation, though.
>>
>> See the above link. Is there any chance of you trying this into your
>> firmware?
>>
>> Thanks,
> 
> Thanks for the pointer.
> 
> 
> Yes.  I will try that once GIC-v3 context save/restore is supported in ATF.
> 
> I think that will basically work for suspend-to-ram
> because all contexts including both non-secure and secure worlds will
> be retained in the main memory.
> 
> However, I still do not understand how the context is preserved during
> the hibernation (suspend-to-disk).
> 
> 
> If my understanding is correct, hibernation on Linux works like follows:
> 
> [1] Freeze all tasks
> [2] CPU_OFF for non-boot CPUs
> [3] Create a hibernation image
> [4] CPU_ON for non-boot CPUs
> [5] Write the hibernation image to the disk (=swap area)
> [6] SYSTEM_OFF
> 
> 
> IIUC, [5] only writes the context Linux takes care of (only non-secure).
> 
> If so, where and how does the firmware write the GIC-v3 context
> to the disk?

Gah, I completely missed the fact that you were talking about suspend to
disk, sorry about that.

It is likely that some driver doesn't restore its state properly. Is
there any chance that you could pinpoint which device creates the issue?

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Suspend-resume failure on Intel Eagle Lake Core2Duo
  2017-08-07  8:17                                                                   ` Marc Zyngier
@ 2017-08-08  1:30                                                                     ` Masahiro Yamada
  2017-08-08  7:39                                                                       ` Marc Zyngier
  0 siblings, 1 reply; 45+ messages in thread
From: Masahiro Yamada @ 2017-08-08  1:30 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: Thomas Gleixner, Tomi Sarvela, Martin Peres, Jeffy Chen,
	Linux Kernel Mailing List

Hi Marc,

2017-08-07 17:17 GMT+09:00 Marc Zyngier <marc.zyngier@arm.com>:
> On 07/08/17 05:45, Masahiro Yamada wrote:
>> Hi Marc,
>>
>>
>> 2017-08-03 22:30 GMT+09:00 Marc Zyngier <marc.zyngier@arm.com>:
>>> On 03/08/17 13:52, Masahiro Yamada wrote:
>>>> Hi Marc,
>>>>
>>>> 2017-08-03 17:41 GMT+09:00 Marc Zyngier <marc.zyngier@arm.com>:
>>>>> Hi Masahiro,
>>>>>
>>>>> On 03/08/17 08:32, Masahiro Yamada wrote:
>>>>>> Hi.
>>>>>>
>>>>>> 2017-08-01 0:55 GMT+09:00 Thomas Gleixner <tglx@linutronix.de>:
>>>>>>> On Mon, 31 Jul 2017, Tomi Sarvela wrote:
>>>>>>>> On 31/07/17 18:06, Thomas Gleixner wrote:
>>>>>>>>> Can you please remove the patch. And try the following:
>>>>>>>>>
>>>>>>>>> # echo N > /sys/module/printk/parameters/console_suspend
>>>>>>>>>
>>>>>>>>> # echo mem > /sys/power/state
>>>>>>>>>
>>>>>>>>> and log the output of the serial console. That way we might get a clue
>>>>>>>>> where it gets stuck.
>>>>>>>>
>>>>>>>> I'm afraid it hangs right away. No response from SSH, no output to serial.
>>>>>>>
>>>>>>> What means hangs right away? Is there no output at all on the serial
>>>>>>> console? Or does it just stop at some point?
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>>         tglx
>>>>>>>
>>>>>>
>>>>>> Sorry for jumping in.
>>>>>> Finally, I found this thread.
>>>>>>
>>>>>>
>>>>>> My environment is completely different (ARM64 board),
>>>>>> I am also suffering from a hibernation problem
>>>>>> since this commit.
>>>>>>
>>>>>>
>>>>>> I get no response on the serial console
>>>>>> after "Restarting tasks ... done." log message.
>>>>>>
>>>>>>
>>>>>> By reverting bf22ff45bed6 ("genirq: Avoid unnecessary low level
>>>>>> irq function calls", I can get hibernation working again.
>>>>>>
>>>>>>
>>>>>> SW info:
>>>>>> defconfig:  arch/arm64/configs/defconfig
>>>>>> DT       :  arch/arm64/boot/dts/socionext/uniphier-ld20-ref.dts
>>>>>> PSCI     :  ARM Trusted Firmware
>>>>>>
>>>>>>
>>>>>> SoC info:
>>>>>> CPU      :  Cortex-A72 * 2 + Cortex-A53 * 2
>>>>>> irqchip  :  GICv3 (drivers/irq/irq-gic-v3.c)
>>>>>
>>>>> Let me take an educated guess: It feels like your firmware doesn't
>>>>> save/restore the GIC context across suspend/resume. Is that something
>>>>> you could check, assuming you have access to the firmware source code?
>>>>
>>>> Thanks for your comments.
>>>>
>>>>
>>>> I do not know much about the manner of preserving GICv3 context.
>>>>
>>>> I can see this patch  (rejected?) :
>>>> https://patchwork.kernel.org/patch/9343061/
>>>>
>>>>
>>>> Is it something that should be completely cared by firmware
>>>> instead of kernel?
>>>
>>> That was definitely the intention, but it looks like something that ATF
>>> has only started supporting very recently:
>>>
>>> https://github.com/ARM-software/arm-trusted-firmware/pull/1047
>>>
>>>> ARM Trusted Firmware (https://github.com/ARM-software/arm-trusted-firmware)
>>>> is open source software, and I pushed my platform code to the upstream.
>>>>
>>>> So, yes, I (and everybody) can have access to the firmware source code.
>>>>
>>>>
>>>> I am not sure how ATF saves the context during hibernation, though.
>>>
>>> See the above link. Is there any chance of you trying this into your
>>> firmware?
>>>
>>> Thanks,
>>
>> Thanks for the pointer.
>>
>>
>> Yes.  I will try that once GIC-v3 context save/restore is supported in ATF.
>>
>> I think that will basically work for suspend-to-ram
>> because all contexts including both non-secure and secure worlds will
>> be retained in the main memory.
>>
>> However, I still do not understand how the context is preserved during
>> the hibernation (suspend-to-disk).
>>
>>
>> If my understanding is correct, hibernation on Linux works like follows:
>>
>> [1] Freeze all tasks
>> [2] CPU_OFF for non-boot CPUs
>> [3] Create a hibernation image
>> [4] CPU_ON for non-boot CPUs
>> [5] Write the hibernation image to the disk (=swap area)
>> [6] SYSTEM_OFF
>>
>>
>> IIUC, [5] only writes the context Linux takes care of (only non-secure).
>>
>> If so, where and how does the firmware write the GIC-v3 context
>> to the disk?
>
> Gah, I completely missed the fact that you were talking about suspend to
> disk, sorry about that.
>
> It is likely that some driver doesn't restore its state properly. Is
> there any chance that you could pinpoint which device creates the issue?
>

I use eMMC to store the hibernation image, but
I do not think eMMC driver is the cause of the issue.

I guess the cause of the issue is GIC-v3 context is lost.


I am not an expert in this, so I will ask the ATF community
about how ATF can support suspend-to-disk.


-- 
Best Regards
Masahiro Yamada

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Suspend-resume failure on Intel Eagle Lake Core2Duo
  2017-08-08  1:30                                                                     ` Masahiro Yamada
@ 2017-08-08  7:39                                                                       ` Marc Zyngier
  2017-08-09  4:03                                                                         ` Masahiro Yamada
  0 siblings, 1 reply; 45+ messages in thread
From: Marc Zyngier @ 2017-08-08  7:39 UTC (permalink / raw)
  To: Masahiro Yamada
  Cc: Thomas Gleixner, Tomi Sarvela, Martin Peres, Jeffy Chen,
	Linux Kernel Mailing List

On 08/08/17 02:30, Masahiro Yamada wrote:
> Hi Marc,
> 
> 2017-08-07 17:17 GMT+09:00 Marc Zyngier <marc.zyngier@arm.com>:
>> On 07/08/17 05:45, Masahiro Yamada wrote:
>>> Hi Marc,
>>>
>>>
>>> 2017-08-03 22:30 GMT+09:00 Marc Zyngier <marc.zyngier@arm.com>:
>>>> On 03/08/17 13:52, Masahiro Yamada wrote:
>>>>> Hi Marc,
>>>>>
>>>>> 2017-08-03 17:41 GMT+09:00 Marc Zyngier <marc.zyngier@arm.com>:
>>>>>> Hi Masahiro,
>>>>>>
>>>>>> On 03/08/17 08:32, Masahiro Yamada wrote:
>>>>>>> Hi.
>>>>>>>
>>>>>>> 2017-08-01 0:55 GMT+09:00 Thomas Gleixner <tglx@linutronix.de>:
>>>>>>>> On Mon, 31 Jul 2017, Tomi Sarvela wrote:
>>>>>>>>> On 31/07/17 18:06, Thomas Gleixner wrote:
>>>>>>>>>> Can you please remove the patch. And try the following:
>>>>>>>>>>
>>>>>>>>>> # echo N > /sys/module/printk/parameters/console_suspend
>>>>>>>>>>
>>>>>>>>>> # echo mem > /sys/power/state
>>>>>>>>>>
>>>>>>>>>> and log the output of the serial console. That way we might get a clue
>>>>>>>>>> where it gets stuck.
>>>>>>>>>
>>>>>>>>> I'm afraid it hangs right away. No response from SSH, no output to serial.
>>>>>>>>
>>>>>>>> What means hangs right away? Is there no output at all on the serial
>>>>>>>> console? Or does it just stop at some point?
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>>         tglx
>>>>>>>>
>>>>>>>
>>>>>>> Sorry for jumping in.
>>>>>>> Finally, I found this thread.
>>>>>>>
>>>>>>>
>>>>>>> My environment is completely different (ARM64 board),
>>>>>>> I am also suffering from a hibernation problem
>>>>>>> since this commit.
>>>>>>>
>>>>>>>
>>>>>>> I get no response on the serial console
>>>>>>> after "Restarting tasks ... done." log message.
>>>>>>>
>>>>>>>
>>>>>>> By reverting bf22ff45bed6 ("genirq: Avoid unnecessary low level
>>>>>>> irq function calls", I can get hibernation working again.
>>>>>>>
>>>>>>>
>>>>>>> SW info:
>>>>>>> defconfig:  arch/arm64/configs/defconfig
>>>>>>> DT       :  arch/arm64/boot/dts/socionext/uniphier-ld20-ref.dts
>>>>>>> PSCI     :  ARM Trusted Firmware
>>>>>>>
>>>>>>>
>>>>>>> SoC info:
>>>>>>> CPU      :  Cortex-A72 * 2 + Cortex-A53 * 2
>>>>>>> irqchip  :  GICv3 (drivers/irq/irq-gic-v3.c)
>>>>>>
>>>>>> Let me take an educated guess: It feels like your firmware doesn't
>>>>>> save/restore the GIC context across suspend/resume. Is that something
>>>>>> you could check, assuming you have access to the firmware source code?
>>>>>
>>>>> Thanks for your comments.
>>>>>
>>>>>
>>>>> I do not know much about the manner of preserving GICv3 context.
>>>>>
>>>>> I can see this patch  (rejected?) :
>>>>> https://patchwork.kernel.org/patch/9343061/
>>>>>
>>>>>
>>>>> Is it something that should be completely cared by firmware
>>>>> instead of kernel?
>>>>
>>>> That was definitely the intention, but it looks like something that ATF
>>>> has only started supporting very recently:
>>>>
>>>> https://github.com/ARM-software/arm-trusted-firmware/pull/1047
>>>>
>>>>> ARM Trusted Firmware (https://github.com/ARM-software/arm-trusted-firmware)
>>>>> is open source software, and I pushed my platform code to the upstream.
>>>>>
>>>>> So, yes, I (and everybody) can have access to the firmware source code.
>>>>>
>>>>>
>>>>> I am not sure how ATF saves the context during hibernation, though.
>>>>
>>>> See the above link. Is there any chance of you trying this into your
>>>> firmware?
>>>>
>>>> Thanks,
>>>
>>> Thanks for the pointer.
>>>
>>>
>>> Yes.  I will try that once GIC-v3 context save/restore is supported in ATF.
>>>
>>> I think that will basically work for suspend-to-ram
>>> because all contexts including both non-secure and secure worlds will
>>> be retained in the main memory.
>>>
>>> However, I still do not understand how the context is preserved during
>>> the hibernation (suspend-to-disk).
>>>
>>>
>>> If my understanding is correct, hibernation on Linux works like follows:
>>>
>>> [1] Freeze all tasks
>>> [2] CPU_OFF for non-boot CPUs
>>> [3] Create a hibernation image
>>> [4] CPU_ON for non-boot CPUs
>>> [5] Write the hibernation image to the disk (=swap area)
>>> [6] SYSTEM_OFF
>>>
>>>
>>> IIUC, [5] only writes the context Linux takes care of (only non-secure).
>>>
>>> If so, where and how does the firmware write the GIC-v3 context
>>> to the disk?
>>
>> Gah, I completely missed the fact that you were talking about suspend to
>> disk, sorry about that.
>>
>> It is likely that some driver doesn't restore its state properly. Is
>> there any chance that you could pinpoint which device creates the issue?
>>
> 
> I use eMMC to store the hibernation image, but
> I do not think eMMC driver is the cause of the issue.
> 
> I guess the cause of the issue is GIC-v3 context is lost.

It is not lost. The boot kernel has re-initialized its state. What is
missing is that one driver in your system fails to restore its own state
correctly, and relies on doing something such as enabling/disabling the
interrupt in its PM handler, things will start working again (probably
because they use the same PM callbacks functions for both suspend/resume
and hibernation). This is in no way a guarantee.

Please try:

> swapon -a
> echo test_resume > /sys/power/disk
> echo disk > /sys/power/state

and let me know how this fares.

> I am not an expert in this, so I will ask the ATF community
> about how ATF can support suspend-to-disk.

As you pointed out, ARF is not involved at all in that context, so
that's pretty pointless.

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Suspend-resume failure on Intel Eagle Lake Core2Duo
  2017-08-08  7:39                                                                       ` Marc Zyngier
@ 2017-08-09  4:03                                                                         ` Masahiro Yamada
  0 siblings, 0 replies; 45+ messages in thread
From: Masahiro Yamada @ 2017-08-09  4:03 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: Thomas Gleixner, Tomi Sarvela, Martin Peres, Jeffy Chen,
	Linux Kernel Mailing List

Hi Marc,


2017-08-08 16:39 GMT+09:00 Marc Zyngier <marc.zyngier@arm.com>:
> On 08/08/17 02:30, Masahiro Yamada wrote:
>> Hi Marc,
>>
>> 2017-08-07 17:17 GMT+09:00 Marc Zyngier <marc.zyngier@arm.com>:
>>> On 07/08/17 05:45, Masahiro Yamada wrote:
>>>> Hi Marc,
>>>>
>>>>
>>>> 2017-08-03 22:30 GMT+09:00 Marc Zyngier <marc.zyngier@arm.com>:
>>>>> On 03/08/17 13:52, Masahiro Yamada wrote:
>>>>>> Hi Marc,
>>>>>>
>>>>>> 2017-08-03 17:41 GMT+09:00 Marc Zyngier <marc.zyngier@arm.com>:
>>>>>>> Hi Masahiro,
>>>>>>>
>>>>>>> On 03/08/17 08:32, Masahiro Yamada wrote:
>>>>>>>> Hi.
>>>>>>>>
>>>>>>>> 2017-08-01 0:55 GMT+09:00 Thomas Gleixner <tglx@linutronix.de>:
>>>>>>>>> On Mon, 31 Jul 2017, Tomi Sarvela wrote:
>>>>>>>>>> On 31/07/17 18:06, Thomas Gleixner wrote:
>>>>>>>>>>> Can you please remove the patch. And try the following:
>>>>>>>>>>>
>>>>>>>>>>> # echo N > /sys/module/printk/parameters/console_suspend
>>>>>>>>>>>
>>>>>>>>>>> # echo mem > /sys/power/state
>>>>>>>>>>>
>>>>>>>>>>> and log the output of the serial console. That way we might get a clue
>>>>>>>>>>> where it gets stuck.
>>>>>>>>>>
>>>>>>>>>> I'm afraid it hangs right away. No response from SSH, no output to serial.
>>>>>>>>>
>>>>>>>>> What means hangs right away? Is there no output at all on the serial
>>>>>>>>> console? Or does it just stop at some point?
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>>
>>>>>>>>>         tglx
>>>>>>>>>
>>>>>>>>
>>>>>>>> Sorry for jumping in.
>>>>>>>> Finally, I found this thread.
>>>>>>>>
>>>>>>>>
>>>>>>>> My environment is completely different (ARM64 board),
>>>>>>>> I am also suffering from a hibernation problem
>>>>>>>> since this commit.
>>>>>>>>
>>>>>>>>
>>>>>>>> I get no response on the serial console
>>>>>>>> after "Restarting tasks ... done." log message.
>>>>>>>>
>>>>>>>>
>>>>>>>> By reverting bf22ff45bed6 ("genirq: Avoid unnecessary low level
>>>>>>>> irq function calls", I can get hibernation working again.
>>>>>>>>
>>>>>>>>
>>>>>>>> SW info:
>>>>>>>> defconfig:  arch/arm64/configs/defconfig
>>>>>>>> DT       :  arch/arm64/boot/dts/socionext/uniphier-ld20-ref.dts
>>>>>>>> PSCI     :  ARM Trusted Firmware
>>>>>>>>
>>>>>>>>
>>>>>>>> SoC info:
>>>>>>>> CPU      :  Cortex-A72 * 2 + Cortex-A53 * 2
>>>>>>>> irqchip  :  GICv3 (drivers/irq/irq-gic-v3.c)
>>>>>>>
>>>>>>> Let me take an educated guess: It feels like your firmware doesn't
>>>>>>> save/restore the GIC context across suspend/resume. Is that something
>>>>>>> you could check, assuming you have access to the firmware source code?
>>>>>>
>>>>>> Thanks for your comments.
>>>>>>
>>>>>>
>>>>>> I do not know much about the manner of preserving GICv3 context.
>>>>>>
>>>>>> I can see this patch  (rejected?) :
>>>>>> https://patchwork.kernel.org/patch/9343061/
>>>>>>
>>>>>>
>>>>>> Is it something that should be completely cared by firmware
>>>>>> instead of kernel?
>>>>>
>>>>> That was definitely the intention, but it looks like something that ATF
>>>>> has only started supporting very recently:
>>>>>
>>>>> https://github.com/ARM-software/arm-trusted-firmware/pull/1047
>>>>>
>>>>>> ARM Trusted Firmware (https://github.com/ARM-software/arm-trusted-firmware)
>>>>>> is open source software, and I pushed my platform code to the upstream.
>>>>>>
>>>>>> So, yes, I (and everybody) can have access to the firmware source code.
>>>>>>
>>>>>>
>>>>>> I am not sure how ATF saves the context during hibernation, though.
>>>>>
>>>>> See the above link. Is there any chance of you trying this into your
>>>>> firmware?
>>>>>
>>>>> Thanks,
>>>>
>>>> Thanks for the pointer.
>>>>
>>>>
>>>> Yes.  I will try that once GIC-v3 context save/restore is supported in ATF.
>>>>
>>>> I think that will basically work for suspend-to-ram
>>>> because all contexts including both non-secure and secure worlds will
>>>> be retained in the main memory.
>>>>
>>>> However, I still do not understand how the context is preserved during
>>>> the hibernation (suspend-to-disk).
>>>>
>>>>
>>>> If my understanding is correct, hibernation on Linux works like follows:
>>>>
>>>> [1] Freeze all tasks
>>>> [2] CPU_OFF for non-boot CPUs
>>>> [3] Create a hibernation image
>>>> [4] CPU_ON for non-boot CPUs
>>>> [5] Write the hibernation image to the disk (=swap area)
>>>> [6] SYSTEM_OFF
>>>>
>>>>
>>>> IIUC, [5] only writes the context Linux takes care of (only non-secure).
>>>>
>>>> If so, where and how does the firmware write the GIC-v3 context
>>>> to the disk?
>>>
>>> Gah, I completely missed the fact that you were talking about suspend to
>>> disk, sorry about that.
>>>
>>> It is likely that some driver doesn't restore its state properly. Is
>>> there any chance that you could pinpoint which device creates the issue?
>>>
>>
>> I use eMMC to store the hibernation image, but
>> I do not think eMMC driver is the cause of the issue.
>>
>> I guess the cause of the issue is GIC-v3 context is lost.
>
> It is not lost. The boot kernel has re-initialized its state. What is
> missing is that one driver in your system fails to restore its own state
> correctly, and relies on doing something such as enabling/disabling the
> interrupt in its PM handler, things will start working again (probably
> because they use the same PM callbacks functions for both suspend/resume
> and hibernation). This is in no way a guarantee.
>
> Please try:
>
>> swapon -a
>> echo test_resume > /sys/power/disk
>> echo disk > /sys/power/state
>
> and let me know how this fares.
>
>> I am not an expert in this, so I will ask the ATF community
>> about how ATF can support suspend-to-disk.
>
> As you pointed out, ARF is not involved at all in that context, so
> that's pretty pointless.
>



I tried it based on Linus' tree.



Commit: bfa738cf3dfa  (Today's Linus tree HEAD)
defconfig:  arch/arm64/configs/defconfig
DT:  arch/arm64/boot/dts/socionext/uniphier-ld20-ref.dts


I am using eMMC (/dev/mmcblk0)  as a disk.

The eMMC driver is in upstream  (drivers/mmc/host/sdhci-cadence.c)



"test_resume" seems to work.


The full log is as follows:





Starting kernel ...

WARNING: BL31: cortex_a53: errata workaround for 835769 was missing!
WARNING: BL31: cortex_a53: errata workaround for 843419 was missing!
INFO:    BL31: cortex_a53: errata workaround for 855873 was applied
[    0.000000] Booting Linux on physical CPU 0x0
[    0.000000] Linux version 4.13.0-rc4-00052-gbfa738cf3dfa
(masahiro@pug) (gcc version 6.1.1 20160711 (Linaro GCC 6.1-2016.08))
#445 SMP PREEMPT Wed Aug 9 12:48:33 JST 2017
[    0.000000] Boot CPU: AArch64 Processor [410fd082]
[    0.000000] Machine model: UniPhier LD20 Reference Board
[    0.000000] efi: Getting EFI parameters from FDT:
[    0.000000] efi: UEFI not found.
[    0.000000] cma: Reserved 16 MiB at 0x00000000fec00000
[    0.000000] NUMA: No NUMA configuration found
[    0.000000] NUMA: Faking a node at [mem
0x0000000000000000-0x000000013fffffff]
[    0.000000] NUMA: NODE_DATA [mem 0x13ffe2400-0x13ffe3eff]
[    0.000000] Zone ranges:
[    0.000000]   DMA      [mem 0x0000000080000000-0x00000000ffffffff]
[    0.000000]   Normal   [mem 0x0000000100000000-0x000000013fffffff]
[    0.000000] Movable zone start for each node
[    0.000000] Early memory node ranges
[    0.000000]   node   0: [mem 0x0000000080000000-0x000000013fffffff]
[    0.000000] Initmem setup node 0 [mem 0x0000000080000000-0x000000013fffffff]
[    0.000000] psci: probing for conduit method from DT.
[    0.000000] psci: PSCIv1.0 detected in firmware.
[    0.000000] psci: Using standard PSCI v0.2 function IDs
[    0.000000] psci: MIGRATE_INFO_TYPE not supported.
[    0.000000] percpu: Embedded 26 pages/cpu @ffff8000bff78000 s67352
r8192 d30952 u106496
[    0.000000] Detected PIPT I-cache on CPU0
[    0.000000] Built 1 zonelists in Node order, mobility grouping on.
Total pages: 774144
[    0.000000] Policy zone: Normal
[    0.000000] Kernel command line: resume=/dev/mmcblk0 no_console_suspend
[    0.000000] PID hash table entries: 4096 (order: 3, 32768 bytes)
[    0.000000] software IO TLB [mem 0xfac00000-0xfec00000] (64MB)
mapped at [ffff80007ac00000-ffff80007ebfffff]
[    0.000000] Memory: 2963680K/3145728K available (9212K kernel code,
1130K rwdata, 4200K rodata, 1088K init, 400K bss, 165664K reserved,
16384K cma-reserved)
[    0.000000] Virtual kernel memory layout:
[    0.000000]     modules : 0xffff000000000000 - 0xffff000008000000
(   128 MB)
[    0.000000]     vmalloc : 0xffff000008000000 - 0xffff7dffbfff0000
(129022 GB)
[    0.000000]       .text : 0xffff000008080000 - 0xffff000008980000
(  9216 KB)
[    0.000000]     .rodata : 0xffff000008980000 - 0xffff000008da0000
(  4224 KB)
[    0.000000]       .init : 0xffff000008da0000 - 0xffff000008eb0000
(  1088 KB)
[    0.000000]       .data : 0xffff000008eb0000 - 0xffff000008fcaa00
(  1131 KB)
[    0.000000]        .bss : 0xffff000008fcaa00 - 0xffff00000902ece0
(   401 KB)
[    0.000000]     fixed   : 0xffff7dfffe7fd000 - 0xffff7dfffec00000
(  4108 KB)
[    0.000000]     PCI I/O : 0xffff7dfffee00000 - 0xffff7dffffe00000
(    16 MB)
[    0.000000]     vmemmap : 0xffff7e0000000000 - 0xffff800000000000
(  2048 GB maximum)
[    0.000000]               0xffff7e0000000000 - 0xffff7e0003000000
(    48 MB actual)
[    0.000000]     memory  : 0xffff800000000000 - 0xffff8000c0000000
(  3072 MB)
[    0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=4, Nodes=1
[    0.000000] Preemptible hierarchical RCU implementation.
[    0.000000]  RCU restricting CPUs from NR_CPUS=64 to nr_cpu_ids=4.
[    0.000000] RCU: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=4
[    0.000000] NR_IRQS: 64, nr_irqs: 64, preallocated irqs: 0
[    0.000000] GICv3: CPU0: found redistributor 0 region 0:0x000000005fe80000
[    0.000000] arch_timer: cp15 timer(s) running at 50.00MHz (virt).
[    0.000000] clocksource: arch_sys_counter: mask: 0xffffffffffffff
max_cycles: 0xb8812736b, max_idle_ns: 440795202655 ns
[    0.000003] sched_clock: 56 bits at 50MHz, resolution 20ns, wraps
every 4398046511100ns
[    0.000131] Console: colour dummy device 80x25
[    0.000587] console [tty0] enabled
[    0.000628] Calibrating delay loop (skipped), value calculated
using timer frequency.. 100.00 BogoMIPS (lpj=200000)
[    0.000651] pid_max: default: 32768 minimum: 301
[    0.000708] Security Framework initialized
[    0.003117] Dentry cache hash table entries: 524288 (order: 10,
4194304 bytes)
[    0.004482] Inode-cache hash table entries: 262144 (order: 9, 2097152 bytes)
[    0.004561] Mount-cache hash table entries: 8192 (order: 4, 65536 bytes)
[    0.004617] Mountpoint-cache hash table entries: 8192 (order: 4, 65536 bytes)
[    0.020049] ASID allocator initialised with 65536 entries
[    0.028039] Hierarchical SRCU implementation.
[    0.036324] EFI services will not be available.
[    0.044056] smp: Bringing up secondary CPUs ...
[    0.072170] Detected PIPT I-cache on CPU1
[    0.072196] GICv3: CPU1: found redistributor 1 region 0:0x000000005fea0000
[    0.072217] CPU1: Booted secondary processor [410fd082]
[    0.118125] CPU features: enabling workaround for ARM erratum 845719
[    0.118132] Detected VIPT I-cache on CPU2
[    0.118155] GICv3: CPU2: found redistributor 100 region 0:0x000000005fec0000
[    0.118178] CPU2: Booted secondary processor [410fd034]
[    0.144235] Detected VIPT I-cache on CPU3
[    0.144251] GICv3: CPU3: found redistributor 101 region 0:0x000000005fee0000
[    0.144264] CPU3: Booted secondary processor [410fd034]
[    0.144338] smp: Brought up 1 node, 4 CPUs
[    0.144437] SMP: Total of 4 processors activated.
[    0.144449] CPU features: detected feature: GIC system register CPU interface
[    0.144461] CPU features: detected feature: 32-bit EL0 Support
[    0.144639] CPU: All CPU(s) started at EL1
[    0.144667] alternatives: patching kernel code
[    0.145650] devtmpfs: initialized
[    0.146949] clocksource: jiffies: mask: 0xffffffff max_cycles:
0xffffffff, max_idle_ns: 7645041785100000 ns
[    0.147003] futex hash table entries: 1024 (order: 5, 131072 bytes)
[    0.147364] pinctrl core: initialized pinctrl subsystem
[    0.147896] DMI not present or invalid.
[    0.148088] random: get_random_u32 called from
bucket_table_alloc+0xf8/0x280 with crng_init=0
[    0.148214] NET: Registered protocol family 16
[    0.148830] random: fast init done
[    0.152312] cpuidle: using governor menu
[    0.152645] vdso: 2 pages (1 code @ ffff000008986000, 1 data @
ffff000008eb5000)
[    0.152680] hw-breakpoint: found 6 breakpoint and 4 watchpoint registers.
[    0.153493] DMA: preallocated 256 KiB pool for atomic allocations
[    0.153823] Serial: AMBA PL011 UART driver
[    0.164789] HugeTLB registered 2.00 MiB page size, pre-allocated 0 pages
[    0.174604] ACPI: Interpreter disabled.
[    0.175163] vgaarb: loaded
[    0.175378] SCSI subsystem initialized
[    0.176629] usbcore: registered new interface driver usbfs
[    0.176686] usbcore: registered new interface driver hub
[    0.176762] usbcore: registered new device driver usb
[    0.177205] pps_core: LinuxPPS API ver. 1 registered
[    0.177223] pps_core: Software ver. 5.3.6 - Copyright 2005-2007
Rodolfo Giometti <giometti@linux.it>
[    0.177266] PTP clock support registered
[    0.177370] EDAC MC: Ver: 3.0.0
[    0.177638] dmi: Firmware registration failed.
[    0.177822] Advanced Linux Sound Architecture Driver Initialized.
[    0.178276] clocksource: Switched to clocksource arch_sys_counter
[    0.178375] VFS: Disk quotas dquot_6.6.0
[    0.178421] VFS: Dquot-cache hash table entries: 512 (order 0, 4096 bytes)
[    0.178554] pnp: PnP ACPI: disabled
[    0.182345] NET: Registered protocol family 2
[    0.182740] TCP established hash table entries: 32768 (order: 6,
262144 bytes)
[    0.182937] TCP bind hash table entries: 32768 (order: 7, 524288 bytes)
[    0.183486] TCP: Hash tables configured (established 32768 bind 32768)
[    0.183644] UDP hash table entries: 2048 (order: 4, 65536 bytes)
[    0.183692] UDP-Lite hash table entries: 2048 (order: 4, 65536 bytes)
[    0.183818] NET: Registered protocol family 1
[    0.184138] RPC: Registered named UNIX socket transport module.
[    0.184163] RPC: Registered udp transport module.
[    0.184178] RPC: Registered tcp transport module.
[    0.184192] RPC: Registered tcp NFSv4.1 backchannel transport module.
[    0.184389] Unpacking initramfs...
[    0.240364] Freeing initrd memory: 1108K
[    0.240832] kvm [1]: HYP mode not available
[    0.241653] audit: initializing netlink subsys (disabled)
[    0.241824] audit: type=2000 audit(0.239:1): state=initialized
audit_enabled=0 res=1
[    0.242045] workingset: timestamp_bits=44 max_order=20 bucket_order=0
[    0.247990] squashfs: version 4.0 (2009/01/31) Phillip Lougher
[    0.248572] NFS: Registering the id_resolver key type
[    0.248612] Key type id_resolver registered
[    0.248628] Key type id_legacy registered
[    0.248649] nfs4filelayout_init: NFSv4 File Layout Driver Registering...
[    0.248812] 9p: Installing v9fs 9p2000 file system support
[    0.250783] Block layer SCSI generic (bsg) driver version 0.4
loaded (major 247)
[    0.250815] io scheduler noop registered
[    0.250988] io scheduler cfq registered (default)
[    0.251005] io scheduler mq-deadline registered
[    0.251021] io scheduler kyber registered
[    0.251698] uniphier-system-bus 58c00000.system-bus: could not find
pctldev for node /soc@0/soc-glue@5f800000/pinctrl/system_bus_grp,
deferring probe
[    0.261491] Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
[    0.263260] 54006800.serial: ttyS0 at MMIO 0x54006800 (irq = 6,
base_baud = 3676470) is a 16550A
[    1.091069] console [ttyS0] enabled
[    1.095058] SuperH (H)SCI(F) driver initialized
[    1.099835] msm_serial: driver initialized
[    1.104419] cacheinfo: Unable to detect cache hierarchy for CPU 0
[    1.115648] loop: module loaded
[    1.119515] hisi_sas: driver version v1.6
[    1.125404] libphy: Fixed MDIO Bus: probed
[    1.129973] tun: Universal TUN/TAP device driver, 1.6
[    1.135741] e1000e: Intel(R) PRO/1000 Network Driver - 3.2.6-k
[    1.141610] e1000e: Copyright(c) 1999 - 2015 Intel Corporation.
[    1.147598] igb: Intel(R) Gigabit Ethernet Network Driver - version 5.4.0-k
[    1.154592] igb: Copyright (c) 2007-2014 Intel Corporation.
[    1.160222] igbvf: Intel(R) Gigabit Virtual Function Network Driver
- version 2.4.0-k
[    1.168090] igbvf: Copyright (c) 2009 - 2012 Intel Corporation.
[    1.174225] sky2: driver version 1.30
[    1.178248] VFIO - User Level meta-driver version: 0.3
[    1.184273] ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
[    1.190836] ehci-pci: EHCI PCI platform driver
[    1.195329] ehci-platform: EHCI generic platform driver
[    1.200642] ehci-orion: EHCI orion driver
[    1.204724] ehci-exynos: EHCI EXYNOS driver
[    1.208981] ehci-msm: Qualcomm On-Chip EHCI Host Controller
[    1.214636] ohci_hcd: USB 1.1 'Open' Host Controller (OHCI) Driver
[    1.220887] ohci-pci: OHCI PCI platform driver
[    1.225387] ohci-platform: OHCI generic platform driver
[    1.230697] ohci-exynos: OHCI EXYNOS driver
[    1.235177] usbcore: registered new interface driver usb-storage
[    1.242531] i2c /dev entries driver
[    1.250270] sdhci: Secure Digital Host Controller Interface driver
[    1.256518] sdhci: Copyright(c) Pierre Ossman
[    1.261074] Synopsys Designware Multimedia Card Interface Driver
[    1.267620] sdhci-pltfm: SDHCI platform and OF driver helper
[    1.318318] mmc0: SDHCI controller on 5a000000.sdhc [5a000000.sdhc]
using ADMA 64-bit
[    1.330068] ledtrig-cpu: registered to indicate activity on CPUs
[    1.336862] usbcore: registered new interface driver usbhid
[    1.342482] usbhid: USB HID core driver
[    1.348120] NET: Registered protocol family 17
[    1.352689] 9pnet: Installing 9P2000 support
[    1.357074] Key type dns_resolver registered
[    1.362787] registered taskstats version 1
[    1.370774] smsc911x 43f00000.ethernet: 43f00000.ethernet supply
vdd33a not found, using dummy regulator
[    1.380376] smsc911x 43f00000.ethernet: 43f00000.ethernet supply
vddvario not found, using dummy regulator
[    1.399995] mmc0: new HS200 MMC card at address 0001
[    1.405310] mmcblk0: mmc0:0001 004G60 3.69 GiB
[    1.409971] mmcblk0boot0: mmc0:0001 004G60 partition 1 2.00 MiB
[    1.412556] libphy: smsc911x-mdio: probed
[    1.412947] smsc911x 43f00000.ethernet eth0: MAC Address: 6a:fe:ad:31:b1:51
[    1.414166] 43fb0000.uart: ttyS1 at MMIO 0x43fb0000 (irq = 0,
base_baud = 768000) is a 16550A
[    1.414785] hctosys: unable to open rtc device (rtc0)
[    1.440767] mmcblk0boot1: mmc0:0001 004G60 partition 2 2.00 MiB
[    1.446817] mmcblk0rpmb: mmc0:0001 004G60 partition 3 512 KiB
[    1.454926] ALSA device list:
[    1.457909]   No soundcards found.
[    1.462026] Freeing unused kernel memory: 1088K
Starting logging: OK
Initializing random number generator... done.
Starting network: OK

Welcome to Buildroot
buildroot login: root
# mkswap  /dev/mmcblk0
Setting up swapspace version 1, size = 3959418880 bytes
# swapon  /dev/mmcblk0
[   36.214903] Adding 3866620k swap on /dev/mmcblk0.  Priority:-1
extents:1 across:3866620k SS
# echo  test_resume  > /sys/power/disk
# cat /sys/power/disk
shutdown reboot suspend [test_resume]
# echo disk > /sys/power/state
[   71.137589] PM: Syncing filesystems ...
[   71.141579] PM: done.
[   71.143871] Freezing user space processes ... (elapsed 0.001 seconds) done.
[   71.151948] OOM killer disabled.
[   71.155295] PM: Preallocating image memory... done (allocated 47630 pages)
[   71.360608] PM: Allocated 190520 kbytes in 0.19 seconds (1002.73 MB/s)
[   71.367149] Freezing remaining freezable tasks ... (elapsed 0.001
seconds) done.
[   71.391858] Disabling non-boot CPUs ...
[   71.414571] CPU1: shutdown
[   71.417281] psci: CPU1 killed.
[   71.442560] CPU2: shutdown
[   71.445271] psci: CPU2 killed.
[   71.470963] CPU3: shutdown
[   71.473674] psci: CPU3 killed.
[   71.477083] PM: Creating hibernation image:
[   71.477083] PM: Need to copy 46899 pages
[   71.477083] PM: Hibernation image created (46899 pages copied)
[   71.477083] Suspended for 0.569 seconds
[   71.477110] Enabling non-boot CPUs ...
[   71.484903] Detected PIPT I-cache on CPU1
[   71.488941] GICv3: CPU1: found redistributor 1 region 0:0x000000005fea0000
[   71.495853] CPU1: Booted secondary processor [410fd082]
[   71.501314] CPU1 is up
[   71.503769] Detected VIPT I-cache on CPU2
[   71.507802] GICv3: CPU2: found redistributor 100 region 0:0x000000005fec0000
[   71.514882] CPU2: Booted secondary processor [410fd034]
[   71.520791] CPU2 is up
[   71.523242] Detected VIPT I-cache on CPU3
[   71.527266] GICv3: CPU3: found redistributor 101 region 0:0x000000005fee0000
[   71.534334] CPU3: Booted secondary processor [410fd034]
[   71.539807] CPU3 is up
[   71.543493] hibernate: Hibernating on CPU 0 [mpidr:0x0]
[   71.621511] PM: Using 3 thread(s) for compression.
[   71.621511] PM: Compressing and saving image data (46991 pages)...
[   71.632559] PM: Image saving progress:   0%
[   71.729828] PM: Image saving progress:  10%
[   71.844900] PM: Image saving progress:  20%
[   71.958230] PM: Image saving progress:  30%
[   72.036647] PM: Image saving progress:  40%
[   72.099759] PM: Image saving progress:  50%
[   72.160172] PM: Image saving progress:  60%
[   72.233590] PM: Image saving progress:  70%
[   72.299996] PM: Image saving progress:  80%
[   72.369865] PM: Image saving progress:  90%
[   72.454932] PM: Image saving progress: 100%
[   74.105739] PM: Image saving done.
[   74.109213] PM: Wrote 187964 kbytes in 2.47 seconds (76.09 MB/s)
[   74.115637] PM: S
[   74.116848] |
[   74.182660] PM: Using 3 thread(s) for decompression.
[   74.182660] PM: Loading and decompressing image data (46991 pages)...
[   74.194105] hibernate: Hibernated on CPU 0 [mpidr:0x0]
[   74.416966] PM: Image loading progress:   0%
[   74.667431] PM: Image loading progress:  10%
[   74.768269] PM: Image loading progress:  20%
[   74.863927] PM: Image loading progress:  30%
[   74.957324] PM: Image loading progress:  40%
[   75.050884] PM: Image loading progress:  50%
[   75.144371] PM: Image loading progress:  60%
[   75.230387] PM: Image loading progress:  70%
[   75.313445] PM: Image loading progress:  80%
[   75.403118] PM: Image loading progress:  90%
[   75.494369] PM: Image loading progress: 100%
[   75.499042] PM: Image loading done.
[   75.502549] PM: Read 187964 kbytes in 1.30 seconds (144.58 MB/s)
[   75.527824] Disabling non-boot CPUs ...
[   75.558505] CPU1: shutdown
[   75.561219] psci: CPU1 killed.
[   75.586483] CPU2: shutdown
[   75.589193] psci: CPU2 killed.
[   75.614940] CPU3: shutdown
[   75.617654] psci: CPU3 killed.
[   71.477083] Suspended for 9.134 seconds
[   71.477111] Enabling non-boot CPUs ...
[   71.484909] Detected PIPT I-cache on CPU1
[   71.488946] GICv3: CPU1: found redistributor 1 region 0:0x000000005fea0000
[   71.495857] CPU1: Booted secondary processor [410fd082]
[   71.501316] CPU1 is up
[   71.503777] Detected VIPT I-cache on CPU2
[   71.507817] GICv3: CPU2: found redistributor 100 region 0:0x000000005fec0000
[   71.514895] CPU2: Booted secondary processor [410fd034]
[   71.520841] CPU2 is up
[   71.523293] Detected VIPT I-cache on CPU3
[   71.527315] GICv3: CPU3: found redistributor 101 region 0:0x000000005fee0000
[   71.534384] CPU3: Booted secondary processor [410fd034]
[   71.539852] CPU3 is up
[   71.596208] OOM killer enabled.
[   71.599374] Restarting tasks ... done.
# ls
# uname -r
4.13.0-rc4-00052-gbfa738cf3dfa









-- 
Best Regards
Masahiro Yamada

^ permalink raw reply	[flat|nested] 45+ messages in thread

end of thread, other threads:[~2017-08-09  4:03 UTC | newest]

Thread overview: 45+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-07-24 13:45 Suspend-resume failure on Intel Eagle Lake Core2Duo Martin Peres
2017-07-24 15:28 ` Thomas Gleixner
2017-07-24 15:41   ` Martin Peres
2017-07-24 16:35     ` Thomas Gleixner
2017-07-24 16:37       ` Martin Peres
2017-07-25  7:01         ` Tomi Sarvela
2017-07-26 13:47           ` Martin Peres
2017-07-26 14:26             ` Thomas Gleixner
2017-07-27  7:36               ` Tomi Sarvela
2017-07-27  7:42                 ` Thomas Gleixner
2017-07-27  8:12                   ` Tomi Sarvela
2017-07-27 19:42                     ` Thomas Gleixner
2017-07-27 20:13                       ` Thomas Gleixner
2017-07-27 21:08                         ` Thomas Gleixner
2017-07-28 12:29                           ` Tomi Sarvela
2017-07-28 12:42                             ` Thomas Gleixner
2017-07-28 13:15                               ` Thomas Gleixner
2017-07-28 13:25                                 ` Tomi Sarvela
2017-07-28 14:13                                   ` Thomas Gleixner
2017-07-28 14:46                                     ` Tomi Sarvela
2017-07-28 14:50                                       ` Thomas Gleixner
2017-07-28 14:55                                         ` Tomi Sarvela
2017-07-28 16:26                                           ` Thomas Gleixner
2017-07-31  7:21                                             ` Tomi Sarvela
2017-07-31  7:45                                               ` Thomas Gleixner
2017-07-31  7:55                                                 ` Tomi Sarvela
2017-07-31  8:29                                                   ` Thomas Gleixner
2017-07-31  8:41                                                     ` Tomi Sarvela
2017-07-31 14:04                                                       ` Thomas Gleixner
2017-07-31 14:13                                                         ` Tomi Sarvela
2017-07-31 15:06                                                   ` Thomas Gleixner
2017-07-31 15:44                                                     ` Tomi Sarvela
2017-07-31 15:55                                                       ` Thomas Gleixner
2017-08-03  7:32                                                         ` Masahiro Yamada
2017-08-03  8:23                                                           ` Peter Zijlstra
2017-08-03  8:41                                                           ` Marc Zyngier
2017-08-03 12:52                                                             ` Masahiro Yamada
2017-08-03 13:30                                                               ` Marc Zyngier
2017-08-07  4:45                                                                 ` Masahiro Yamada
2017-08-07  8:17                                                                   ` Marc Zyngier
2017-08-08  1:30                                                                     ` Masahiro Yamada
2017-08-08  7:39                                                                       ` Marc Zyngier
2017-08-09  4:03                                                                         ` Masahiro Yamada
2017-07-31  8:35                                           ` Thomas Gleixner
2017-07-28 12:34                 ` Martin Peres

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).