* Unreliable hibernation on Lenovo x230 (regression)
@ 2015-04-01 19:47 rhn
  2015-04-02 15:28 ` Pavel Machek
  0 siblings, 1 reply; 16+ messages in thread
From: rhn @ 2015-04-01 19:47 UTC (permalink / raw)
  To: linux-pm; +Cc: Rafael J. Wysocki, Pavel Machek

Hello,

Between kernel 3.16 and 3.17, a regression has been introduced where the first hibernation after regular shutdown always fails to resume. Subsequent hibernations succeed.

The system is a Lenovo x230 with Intel i5, booting with EFI, with the hibernate partition located on a secondary SSD drive. Installed system is Fedora 20, hibernation and reboots were issued using the KDE shutdown dialog.

I have tracked the problem to first appear in the commit
e67ee10190e69332f929bdd6594a312363321a66	Merge branches 'pm-sleep', 'pm-cpufreq' and 'pm-cpuidle'

The problem itself manifests in dmesg as follows (system was first restarted, then hibernated - this log is from the subsequent resume):

[    0.603802] PM: Checking hibernation image partition UUID=8f9fa995-7456-4353-9e43-4f9b2aa3c85a
[    1.523053] PM: Hibernation image not present or could not be loaded.
[    2.888873] PM: Starting manual resume from disk
[    2.888879] PM: Hibernation image partition 8:17 present
[    2.888881] PM: Looking for hibernation image.
[    2.891705] PM: Image signature found, resuming
[    2.891834] PM: Preparing processes for restore.
[    2.895006] PM: Loading hibernation image.
[    2.895188] PM: Marking nosave pages: [mem 0x00090000-0x000fffff]
[    2.895194] PM: Marking nosave pages: [mem 0x20000000-0x201fffff]
[    2.895207] PM: Marking nosave pages: [mem 0x40004000-0x40004fff]
[    2.895209] PM: Marking nosave pages: [mem 0x5b8fd000-0x5bafefff]
[    2.895222] PM: Marking nosave pages: [mem 0x9d3d3000-0x9d3d3fff]
[    2.895223] PM: Marking nosave pages: [mem 0x9d3e3000-0x9d3e3fff]
[    2.895225] PM: Marking nosave pages: [mem 0xd6850000-0xdaffefff]
[    2.895626] PM: Marking nosave pages: [mem 0xdb000000-0xffffffff]
[    2.896472] PM: Basic memory bitmaps created
[    2.930394] PM: Using 3 thread(s) for decompression.
PM: Loading and decompressing image data (355423 pages)...
[    3.054656] PM: Image loading progress:   0%
[    3.138824] PM: 0x9d3d3000 in e820 nosave region: [mem 0x9d3d3000-0x9d3d3fff]
[    3.139696] PM: Read 1421692 kbytes in 0.20 seconds (7108.46 MB/s)
[    3.140736] PM: Error -14 resuming
[    3.140762] PM: Failed to load hibernation image, recovering.
[    3.141520] PM: Basic memory bitmaps freed
[    3.141591] PM: Hibernation image not present or could not be loaded.
[    3.159767] PM: Starting manual resume from disk
[    3.159772] PM: Hibernation image partition 8:17 present
[    3.159774] PM: Looking for hibernation image.
[    3.160992] PM: Image not found (code -22)
[    3.160995] PM: Hibernation image not present or could not be loaded.

(the system boots normally after this)
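
To make the conflict in the log above easier to spot, here is a minimal sketch (not part of any kernel tooling) that scans dmesg text for the "in e820 nosave region" message and compares it with the ranges announced by "Marking nosave pages". The function names and the SAMPLE constant are illustrative assumptions; the log lines themselves are copied from the failed resume above.

```python
import re

# Patterns for the two PM message shapes seen in the failed-resume log.
NOSAVE_RE = re.compile(
    r"PM: Marking nosave pages: \[mem (0x[0-9a-f]+)-(0x[0-9a-f]+)\]"
)
CONFLICT_RE = re.compile(
    r"PM: (0x[0-9a-f]+) in e820 nosave region: \[mem (0x[0-9a-f]+)-(0x[0-9a-f]+)\]"
)


def nosave_ranges(dmesg_text):
    """Return the (start, end) pairs announced by 'Marking nosave pages'."""
    return [(int(a, 16), int(b, 16)) for a, b in NOSAVE_RE.findall(dmesg_text)]


def e820_conflicts(dmesg_text):
    """Return (address, start, end) for each 'in e820 nosave region' line."""
    return [tuple(int(x, 16) for x in m) for m in CONFLICT_RE.findall(dmesg_text)]


# A few lines copied from the log above (hypothetical constant name).
SAMPLE = """\
[    2.895222] PM: Marking nosave pages: [mem 0x9d3d3000-0x9d3d3fff]
[    2.895223] PM: Marking nosave pages: [mem 0x9d3e3000-0x9d3e3fff]
[    3.138824] PM: 0x9d3d3000 in e820 nosave region: [mem 0x9d3d3000-0x9d3d3fff]
[    3.140736] PM: Error -14 resuming
"""

if __name__ == "__main__":
    ranges = nosave_ranges(SAMPLE)
    for addr, start, end in e820_conflicts(SAMPLE):
        marked = (start, end) in ranges
        print(f"{addr:#x} is in [{start:#x}-{end:#x}]; marked nosave this boot: {marked}")
```

In this sample the reported address is the first page of a range that the resuming kernel had already marked as nosave, which fits the failure mode described in commit 84c91b7.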

The failure mode looks similar to the one specified by commit
84c91b7ae07c62cf6dee7fde3277f4be21331f85	PM / hibernate: avoid unsafe pages in e820 reserved regions
which I believe is related.

In 3.16, dmesg after resume is the same as the second (successful) attempt in the broken commit:

[   82.104560] PM: Hibernation mode set to 'platform'
[   82.124615] PM: Syncing filesystems ... done.
[   82.338692] PM: Marking nosave pages: [mem 0x00090000-0x000fffff]
[   82.338698] PM: Marking nosave pages: [mem 0x20000000-0x201fffff]
[   82.338708] PM: Marking nosave pages: [mem 0x40004000-0x40004fff]
[   82.338710] PM: Marking nosave pages: [mem 0x5b8fd000-0x5bafefff]
[   82.338720] PM: Marking nosave pages: [mem 0x9d3d3000-0x9d3d3fff]
[   82.338721] PM: Marking nosave pages: [mem 0x9d3e3000-0x9d3e3fff]
[   82.338723] PM: Marking nosave pages: [mem 0xd6850000-0xdaffefff]
[   82.339042] PM: Marking nosave pages: [mem 0xdb000000-0xffffffff]
[   82.339731] PM: Basic memory bitmaps created
[   82.339802] PM: Preallocating image memory... done (allocated 369125 pages)
[   82.749218] PM: Allocated 1476500 kbytes in 0.40 seconds (3691.25 MB/s)
[   83.943966] PM: freeze of devices complete after 1193.117 msecs
[   83.944513] PM: late freeze of devices complete after 0.542 msecs
[   83.949333] PM: noirq freeze of devices complete after 4.815 msecs
[   83.952668] PM: Saving platform NVS memory
[   83.960083] PM: Creating hibernation image:
[   84.288693] PM: Need to copy 368027 pages
[   84.288697] PM: Normal pages needed: 368027 + 1024, available pages: 2730849
[   83.961192] PM: Restoring platform NVS memory
[   84.092687] PM: noirq restore of devices complete after 11.067 msecs
[   84.092844] PM: early restore of devices complete after 0.132 msecs
[   84.941509] PM: restore of devices complete after 793.686 msecs
[   84.941713] PM: Image restored successfully.
[   84.941764] PM: Basic memory bitmaps freed
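
The size and throughput figures in both logs can be cross-checked with simple arithmetic. This sketch assumes 4 KiB pages and that the MB/s figure is kbytes per second divided by 1000; both assumptions are inferred from the numbers in the logs, not taken from the kernel source.

```python
PAGE_KB = 4  # assumed page size in kbytes (4 KiB pages)


def pages_to_kbytes(pages):
    """Convert a page count to kbytes under the assumed page size."""
    return pages * PAGE_KB


def rate_mb_per_s(kbytes, seconds):
    """Throughput computed as kbytes per second divided by 1000."""
    return kbytes / seconds / 1000


if __name__ == "__main__":
    # Failed resume: "Loading and decompressing image data (355423 pages)"
    print(pages_to_kbytes(355423), round(rate_mb_per_s(1421692, 0.20), 2))
    # Hibernation: "Preallocating image memory... done (allocated 369125 pages)"
    print(pages_to_kbytes(369125), round(rate_mb_per_s(1476500, 0.40), 2))
```

Under these assumptions the logged values are consistent: 355423 pages correspond to 1421692 kbytes, and 369125 pages to 1476500 kbytes. The "Read" figure in the failed resume matches the full image size even though progress was still at 0%, so it presumably does not reflect the amount of data actually loaded.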

The bug persists with kernel 4.0.0-rc5.

While I'm not familiar enough with x64 PM to fix it myself, I'll be happy to provide more info, or apply patches that provide deeper debugging, or check out various combinations of the merged branches (but which ones?).

Cheers,
rhn


* Re: Unreliable hibernation on Lenovo x230 (regression)
  2015-04-01 19:47 Unreliable hibernation on Lenovo x230 (regression) rhn
@ 2015-04-02 15:28 ` Pavel Machek
  2015-04-02 16:50   ` joeyli
  2015-04-03 15:58   ` rhn
  0 siblings, 2 replies; 16+ messages in thread
From: Pavel Machek @ 2015-04-02 15:28 UTC (permalink / raw)
  To: rhn, kernel list, joeyli.kernel; +Cc: linux-pm, Rafael J. Wysocki

On Wed 2015-04-01 21:47:43, rhn wrote:
> Hello,
> 
> Between kernel 3.16 and 3.17, a regression has been introduced where the first hibernation after regular shutdown always fails to resume. Subsequent hibernations succeed.
> 
> The system is a Lenovo x230 with Intel i5, booting with EFI, with the hibernate partition located on a secondary SSD drive. Installed system is Fedora 20, hibernation and reboots were issued using the KDE shutdown dialog.
> 
> I have tracked the problem to first appear in the commit
> e67ee10190e69332f929bdd6594a312363321a66	Merge branches 'pm-sleep', 'pm-cpufreq' and 'pm-cpuidle'
> 
> The problem itself manifests in dmesg as follows (system was first
> restarted, then hibernated - this log is from the subsequent
> resume):

Ok, can you try to disable cpufreq and cpuidle, and then try if it
reproduces?

At that point, this is the candidate:

commit e67ee10190e69332f929bdd6594a312363321a66
Merge: 21c806d 84c91b7 39c8bba 372ba8c
Author: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Date:   Mon Aug 11 23:19:48 2014 +0200

    Merge branches 'pm-sleep', 'pm-cpufreq' and 'pm-cpuidle'

    * pm-sleep:
          PM / hibernate: avoid unsafe pages in e820 reserved regions

...
Alternatively, you can just try to revert

commit 84c91b7ae07c62cf6dee7fde3277f4be21331f85
Author: Lee, Chun-Yi <joeyli.kernel@gmail.com>
Date:   Mon Aug 4 23:23:21 2014 +0800

    PM / hibernate: avoid unsafe pages in e820 reserved regions

    When the machine doesn't well handle the e820 persistent when
    hibernate resuming, then it may cause page fault when writing
    image to snapshot buffer:


...

Thanks,
									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


* Re: Unreliable hibernation on Lenovo x230 (regression)
  2015-04-02 15:28 ` Pavel Machek
@ 2015-04-02 16:50   ` joeyli
  2015-04-02 17:22     ` joeyli
  2015-04-03 15:58   ` rhn
  1 sibling, 1 reply; 16+ messages in thread
From: joeyli @ 2015-04-02 16:50 UTC (permalink / raw)
  To: Pavel Machek, rhn; +Cc: kernel list, joeyli.kernel, linux-pm, Rafael J. Wysocki

Hi, 

On Thu, Apr 02, 2015 at 05:28:05PM +0200, Pavel Machek wrote:
> Alternatively, you can just try to revert
> 
> commit 84c91b7ae07c62cf6dee7fde3277f4be21331f85
> Author: Lee, Chun-Yi <joeyli.kernel@gmail.com>
> Date:   Mon Aug 4 23:23:21 2014 +0800
> 
>     PM / hibernate: avoid unsafe pages in e820 reserved regions
> 
>     When the machine doesn't well handle the e820 persistent when
>     hibernate resuming, then it may cause page fault when writing
>     image to snapshot buffer:
> 
> 
> ...
> 
> Thanks,
> 									Pavel

Before reverting the 84c91b7ae patch, please check whether a log line similar to
the following appears in dmesg when the hibernate resume fails:

[   24.349777] PM: 0xab9bc000 in e820 nosave region: [mem 0xab9bc000-0xab9c2fff]

The address may be different, but you should see the "e820 nosave region" message.
Otherwise we have a different problem.


Thanks a lot!
Joey Lee


* Re: Unreliable hibernation on Lenovo x230 (regression)
  2015-04-02 16:50   ` joeyli
@ 2015-04-02 17:22     ` joeyli
  2015-04-02 18:12       ` rhn
  0 siblings, 1 reply; 16+ messages in thread
From: joeyli @ 2015-04-02 17:22 UTC (permalink / raw)
  To: Pavel Machek, rhn; +Cc: kernel list, joeyli.kernel, linux-pm, Rafael J. Wysocki

On Fri, Apr 03, 2015 at 12:50:54AM +0800, joeyli wrote:
> Before reverting the 84c91b7ae patch, please check whether a log line similar to
> the following appears in dmesg when the hibernate resume fails:
> 
> [   24.349777] PM: 0xab9bc000 in e820 nosave region: [mem 0xab9bc000-0xab9c2fff]
> 
> The address may be different, but you should see the "e820 nosave region" message.
> Otherwise we have a different problem.
>

Forgot to mention: please add "debug no_console_suspend=1 loglevel=9" to the kernel
parameters, then try to reproduce the issue and look at dmesg.
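
Before reproducing, it may help to confirm that the requested parameters actually reached the running kernel. This is a minimal sketch, assuming the command line is available as text (on Linux it can be read from /proc/cmdline); the helper name and the WANTED list are hypothetical, and the parameters are spelled exactly as requested above.

```python
def missing_params(cmdline, wanted):
    """Return the entries of `wanted` that are not whole tokens of `cmdline`."""
    present = set(cmdline.split())
    return [p for p in wanted if p not in present]


# Parameters exactly as requested in this message.
WANTED = ["debug", "no_console_suspend=1", "loglevel=9"]

if __name__ == "__main__":
    # Illustrative command line; on a live system, pass the contents of
    # /proc/cmdline instead.
    example = "ro quiet debug loglevel=9"
    print(missing_params(example, WANTED))
```

An empty list means every requested parameter is present.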


Thanks a lot!
Joey Lee 


* Re: Unreliable hibernation on Lenovo x230 (regression)
  2015-04-02 17:22     ` joeyli
@ 2015-04-02 18:12       ` rhn
  2015-04-03  1:23         ` joeyli
  0 siblings, 1 reply; 16+ messages in thread
From: rhn @ 2015-04-02 18:12 UTC (permalink / raw)
  To: joeyli
  Cc: Pavel Machek, rhn, kernel list, joeyli.kernel, linux-pm,
	Rafael J. Wysocki

On Fri, 3 Apr 2015 01:22:21 +0800
joeyli <jlee@suse.com> wrote:

> On Fri, Apr 03, 2015 at 12:50:54AM +0800, joeyli wrote:
> > Before reverting the 84c91b7ae patch, please check whether a log line similar to
> > the following appears in dmesg when the hibernate resume fails:
> > 
> > [   24.349777] PM: 0xab9bc000 in e820 nosave region: [mem 0xab9bc000-0xab9c2fff]
> > 
> > The address may be different, but you should see the "e820 nosave region" message.
> > Otherwise we have a different problem.
> >
> 
> Forgot to mention: please add "debug no_console_suspend=1 loglevel=9" to the kernel
> parameters, then try to reproduce the issue and look at dmesg.
> 
> 
> Thanks a lot!
> Joey Lee 

Yes, it's present in dmesg when hibernate fails (default kernel params):
[    3.138824] PM: 0x9d3d3000 in e820 nosave region: [mem 0x9d3d3000-0x9d3d3fff]

I probably didn't make it clear - the top dmesg in my original message was from failed resume.

Cheers,
rhn


* Re: Unreliable hibernation on Lenovo x230 (regression)
  2015-04-02 18:12       ` rhn
@ 2015-04-03  1:23         ` joeyli
  2015-04-03 16:00           ` rhn
  0 siblings, 1 reply; 16+ messages in thread
From: joeyli @ 2015-04-03  1:23 UTC (permalink / raw)
  To: rhn; +Cc: Pavel Machek, kernel list, joeyli.kernel, linux-pm, Rafael J. Wysocki

On Thu, Apr 02, 2015 at 08:12:00PM +0200, rhn wrote:
> Yes, it's present in dmesg when hibernate fails (default kernel params):
> [    3.138824] PM: 0x9d3d3000 in e820 nosave region: [mem 0x9d3d3000-0x9d3d3fff]
>

OK, so the message means that address 0x9d3d3000 was used by the image kernel but
lies in an e820 nosave region of the current boot. We need to check whether this
e820 region is used by setup_data and is therefore reserved as E820_RESERVED_KERN.

I need your complete dmesg to verify the e820 table. If the above assumption is
true, then Yinghai Lu's patchset could fix this problem:

x86: Kill E820_RESERVED_KERN
https://lkml.org/lkml/2015/3/4/434

The target kernel version for merging his patches is v4.1.
 
> I probably didn't make it clear - the top dmesg in my original message was from failed resume.
> 
> Cheers,
> rhn

On the other hand, could you please check that you are using platform mode to
power off the machine when hibernating?

$ cat /sys/power/disk
[platform] shutdown reboot suspend
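
The selected mode is the entry shown in square brackets. As a minimal sketch, assuming the file content has the shape shown above, it could be extracted like this (the function name is hypothetical):

```python
import re


def selected_mode(disk_text):
    """Return the bracketed entry from /sys/power/disk content, or None."""
    match = re.search(r"\[(\w+)\]", disk_text)
    return match.group(1) if match else None


if __name__ == "__main__":
    # Sample taken from the output shown above; on a live system the text
    # would come from reading /sys/power/disk.
    print(selected_mode("[platform] shutdown reboot suspend"))
```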

And, if possible, please file a bug on bugzilla.kernel.org and give me the bug
number. I prefer to collect logs and debugging history in bugzilla for further
tracking.


Thanks a lot!
Joey Lee


* Re: Unreliable hibernation on Lenovo x230 (regression)
  2015-04-02 15:28 ` Pavel Machek
  2015-04-02 16:50   ` joeyli
@ 2015-04-03 15:58   ` rhn
  2015-04-03 16:40     ` Pavel Machek
  2015-04-03 21:43     ` Rafael J. Wysocki
  1 sibling, 2 replies; 16+ messages in thread
From: rhn @ 2015-04-03 15:58 UTC (permalink / raw)
  To: Pavel Machek; +Cc: rhn, kernel list, joeyli.kernel, linux-pm, Rafael J. Wysocki

On Thu, 2 Apr 2015 17:28:05 +0200
Pavel Machek <pavel@ucw.cz> wrote:

> Ok, can you try to disable cpufreq and cpuidle, and then try if it
> reproduces?
> 
> At that point, this is the candidate:
> 
> commit e67ee10190e69332f929bdd6594a312363321a66
> Merge: 21c806d 84c91b7 39c8bba 372ba8c
> Author: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> Date:   Mon Aug 11 23:19:48 2014 +0200
> 
>     Merge branches 'pm-sleep', 'pm-cpufreq' and 'pm-cpuidle'
> 
>     * pm-sleep:
>           PM / hibernate: avoid unsafe pages in e820 reserved regions
> 
> ...
> Alternatively, you can just try to revert
> 
> commit 84c91b7ae07c62cf6dee7fde3277f4be21331f85
> Author: Lee, Chun-Yi <joeyli.kernel@gmail.com>
> Date:   Mon Aug 4 23:23:21 2014 +0800
> 
>     PM / hibernate: avoid unsafe pages in e820 reserved regions
> 
>     When the machine doesn't well handle the e820 persistent when
>     hibernate resuming, then it may cause page fault when writing
>     image to snapshot buffer:
> 
> 
> ...
> 
> Thanks,
> 									Pavel

I tried to disable CONFIG_CPU_IDLE and CONFIG_CPU_FREQ, however for some reason I could only disable CONFIG_CPU_FREQ.

The bug persisted.

Reverting the commit 84c91b7 on top of e67ee10 fixes the problem.

I created a copy of the bug report here: https://bugzilla.kernel.org/show_bug.cgi?id=96111

Cheers,
rhn


* Re: Unreliable hibernation on Lenovo x230 (regression)
  2015-04-03  1:23         ` joeyli
@ 2015-04-03 16:00           ` rhn
  0 siblings, 0 replies; 16+ messages in thread
From: rhn @ 2015-04-03 16:00 UTC (permalink / raw)
  To: joeyli
  Cc: rhn, Pavel Machek, kernel list, joeyli.kernel, linux-pm,
	Rafael J. Wysocki

On Fri, 3 Apr 2015 09:23:35 +0800
joeyli <jlee@suse.com> wrote:

> OK, so the message means that address 0x9d3d3000 was used by the image kernel but
> lies in an e820 nosave region of the current boot. We need to check whether this
> e820 region is used by setup_data and is therefore reserved as E820_RESERVED_KERN.
> 
> I need your complete dmesg to verify the e820 table. If the above assumption is
> true, then Yinghai Lu's patchset could fix this problem:
> 
> x86: Kill E820_RESERVED_KERN
> https://lkml.org/lkml/2015/3/4/434
> 
> The target kernel version for merging his patches is v4.1.
>  
> On the other hand, could you please check that you are using platform mode to
> power off the machine when hibernating?
> 
> $ cat /sys/power/disk
> [platform] shutdown reboot suspend
> 
> And, if possible, please file a bug on bugzilla.kernel.org and give me the bug
> number. I prefer to collect logs and debugging history in bugzilla for further
> tracking.
> 
> 
> Thanks a lot!
> Joey Lee

Yes, platform mode was used in all instances - both working and broken kernels.

I included full dmesg in the bug report on bugzilla:

https://bugzilla.kernel.org/show_bug.cgi?id=96111

Cheers,
rhn


* Re: Unreliable hibernation on Lenovo x230 (regression)
  2015-04-03 15:58   ` rhn
@ 2015-04-03 16:40     ` Pavel Machek
  2015-04-03 21:43     ` Rafael J. Wysocki
  1 sibling, 0 replies; 16+ messages in thread
From: Pavel Machek @ 2015-04-03 16:40 UTC (permalink / raw)
  To: rhn; +Cc: kernel list, joeyli.kernel, linux-pm, Rafael J. Wysocki

Hi!

> I tried to disable CONFIG_CPU_IDLE and CONFIG_CPU_FREQ, however for some reason I could only disable CONFIG_CPU_FREQ.
> 
> The bug persisted.
> 
> Reverting the commit 84c91b7 on top of e67ee10 fixes the problem.

OK, I guess the next steps would be to verify whether 4.0 has the problem, and
whether reverting 84c91b7 there fixes it, too... maybe we should revert it for
4.0?


									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


* Re: Unreliable hibernation on Lenovo x230 (regression)
  2015-04-03 15:58   ` rhn
  2015-04-03 16:40     ` Pavel Machek
@ 2015-04-03 21:43     ` Rafael J. Wysocki
  2015-04-04  8:12       ` rhn
  2015-04-05  7:26       ` joeyli
  1 sibling, 2 replies; 16+ messages in thread
From: Rafael J. Wysocki @ 2015-04-03 21:43 UTC (permalink / raw)
  To: rhn; +Cc: Pavel Machek, kernel list, joeyli.kernel, linux-pm

On Friday, April 03, 2015 05:58:25 PM rhn wrote:
> I tried to disable CONFIG_CPU_IDLE and CONFIG_CPU_FREQ, however for some reason I could only disable CONFIG_CPU_FREQ.
> 
> The bug persisted.
> 
> Reverting the commit 84c91b7 on top of e67ee10 fixes the problem.
> 
> I created a copy of the bug report here: https://bugzilla.kernel.org/show_bug.cgi?id=96111

Please check if 4.0-rc6 still has the problem and if reverting the commit in
question on top of it fixes the problem too.


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.


* Re: Unreliable hibernation on Lenovo x230 (regression)
  2015-04-03 21:43     ` Rafael J. Wysocki
@ 2015-04-04  8:12       ` rhn
  2015-04-05  7:24         ` joeyli
  2015-04-05  7:26       ` joeyli
  1 sibling, 1 reply; 16+ messages in thread
From: rhn @ 2015-04-04  8:12 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: rhn, Pavel Machek, kernel list, joeyli.kernel, linux-pm, joeyli

On Fri, 03 Apr 2015 23:43:30 +0200
"Rafael J. Wysocki" <rjw@rjwysocki.net> wrote:

> On Friday, April 03, 2015 05:58:25 PM rhn wrote:
> > On Thu, 2 Apr 2015 17:28:05 +0200
> > Pavel Machek <pavel@ucw.cz> wrote:
> > 
> > > On Wed 2015-04-01 21:47:43, rhn wrote:
> > > > Hello,
> > > > 
> > > > Between kernel 3.16 and 3.17, a regression has been introduced where the first hibernation after regular shutdown always fails to resume. Subsequent hibernations succeed.
> > > > 
> > > > The system is a Lenovo x230 with Intel i5, booting with EFI, with the hibernate partition located on a secondary SSD drive. Installed system is Fedora 20, hibernation and reboots were issued using the KDE shutdown dialog.
> > > > 
> > > > I have tracked the problem to first appear in the commit
> > > > e67ee10190e69332f929bdd6594a312363321a66	Merge branches 'pm-sleep', 'pm-cpufreq' and 'pm-cpuidle'
> > > > 
> > > > The problem itself manifests in dmesg as follows (system was first
> > > > restarted, then hibernated - this log is from the subsequent
> > > resume):
> > > 
> > > Ok, can you try to disable cpufreq and cpuidle, and then try if it
> > > reproduces?
> > > 
> > > At that point, this is the candidate:
> > > 
> > > commit e67ee10190e69332f929bdd6594a312363321a66
> > > Merge: 21c806d 84c91b7 39c8bba 372ba8c
> > > Author: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > > Date:   Mon Aug 11 23:19:48 2014 +0200
> > > 
> > >     Merge branches 'pm-sleep', 'pm-cpufreq' and 'pm-cpuidle'
> > > 
> > >     * pm-sleep:
> > >           PM / hibernate: avoid unsafe pages in e820 reserved regions
> > > 
> > > ...
> > > Alternatively, you can just try to revert
> > > 
> > > commit 84c91b7ae07c62cf6dee7fde3277f4be21331f85
> > > Author: Lee, Chun-Yi <joeyli.kernel@gmail.com>
> > > Date:   Mon Aug 4 23:23:21 2014 +0800
> > > 
> > >     PM / hibernate: avoid unsafe pages in e820 reserved regions
> > > 
> > >     When the machine doesn't well handle the e820 persistent when
> > >     hibernate
> > >         resuming, then it may cause page fault when writing image to
> > >     snapshot
> > >         buffer:
> > > 
> > > 
> > > ...
> > > 
> > > Thanks,
> > > 									Pavel
> > 
> > I tried to disable CONFIG_CPU_IDLE and CONFIG_CPU_FREQ, however for some reason I could only disable CONFIG_CPU_FREQ.
> > 
> > The bug persisted.
> > 
> > Reverting the commit 84c91b7 on top of e67ee10 fixes the problem.
> > 
> > I created a copy of the bug report here: https://bugzilla.kernel.org/show_bug.cgi?id=96111
> 
> Please check if 4.0-rc6 still has the problem and if reverting the commit in
> question on top of it fixes the problem too.
> 
> 

I took the commit 8f778bbc542ddf8f6243b21d6aca087e709cabdc as the base for further checking (I started building before I read your message). It's a descendant of 4.0-rc6, so I hope it's not going to make a difference.

Results:
8f778bb : bad
8f778bb + reverted 84c91b7 : good
8f778bb + patch [1] : good

Thanks!

[1]:
x86: Kill E820_RESERVED_KERN  https://lkml.org/lkml/2015/3/4/434 as suggested in joeyli's other email.


* Re: Unreliable hibernation on Lenovo x230 (regression)
  2015-04-04  8:12       ` rhn
@ 2015-04-05  7:24         ` joeyli
  2015-04-05  7:51           ` Yinghai Lu
  0 siblings, 1 reply; 16+ messages in thread
From: joeyli @ 2015-04-05  7:24 UTC (permalink / raw)
  To: rhn; +Cc: Rafael J. Wysocki, Pavel Machek, kernel list, joeyli.kernel, linux-pm

Hi Rafael, 

On Sat, Apr 04, 2015 at 10:12:43AM +0200, rhn wrote:
> On Fri, 03 Apr 2015 23:43:30 +0200
> "Rafael J. Wysocki" <rjw@rjwysocki.net> wrote:
> 
> > On Friday, April 03, 2015 05:58:25 PM rhn wrote:
> > > On Thu, 2 Apr 2015 17:28:05 +0200
> > > Pavel Machek <pavel@ucw.cz> wrote:
> > > 
> > > > On Wed 2015-04-01 21:47:43, rhn wrote:
> > > > > Hello,
> > > > > 
> > > > > Between kernel 3.16 and 3.17, a regression has been introduced where the first hibernation after regular shutdown always fails to resume. Subsequent hibernations succeed.
> > > > > 
> > > > > The system is a Lenovo x230 with Intel i5, booting with EFI, with the hibernate partition located on a secondary SSD drive. Installed system is Fedora 20, hibernation and reboots were issued using the KDE shutdown dialog.
> > > > > 
> > > > > I have tracked the problem to first appear in the commit
> > > > > e67ee10190e69332f929bdd6594a312363321a66	Merge branches 'pm-sleep', 'pm-cpufreq' and 'pm-cpuidle'
> > > > > 
> > > > > The problem itself manifests in dmesg as follows (system was first
> > > > > restarted, then hibernated - this log is from the subsequent
> > > > resume):
> > > > 
> > > > Ok, can you try to disable cpufreq and cpuidle, and then try if it
> > > > reproduces?
> > > > 
> > > > At that point, this is the candidate:
> > > > 
> > > > commit e67ee10190e69332f929bdd6594a312363321a66
> > > > Merge: 21c806d 84c91b7 39c8bba 372ba8c
> > > > Author: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > > > Date:   Mon Aug 11 23:19:48 2014 +0200
> > > > 
> > > >     Merge branches 'pm-sleep', 'pm-cpufreq' and 'pm-cpuidle'
> > > > 
> > > >     * pm-sleep:
> > > >           PM / hibernate: avoid unsafe pages in e820 reserved regions
> > > > 
> > > > ...
> > > > Alternatively, you can just try to revert
> > > > 
> > > > commit 84c91b7ae07c62cf6dee7fde3277f4be21331f85
> > > > Author: Lee, Chun-Yi <joeyli.kernel@gmail.com>
> > > > Date:   Mon Aug 4 23:23:21 2014 +0800
> > > > 
> > > >     PM / hibernate: avoid unsafe pages in e820 reserved regions
> > > > 
> > > >     When the machine doesn't well handle the e820 persistent when
> > > >     hibernate
> > > >         resuming, then it may cause page fault when writing image to
> > > >     snapshot
> > > >         buffer:
> > > > 
> > > > 
> > > > ...
> > > > 
> > > > Thanks,
> > > > 									Pavel
> > > 
> > > I tried to disable CONFIG_CPU_IDLE and CONFIG_CPU_FREQ, however for some reason I could only disable CONFIG_CPU_FREQ.
> > > 
> > > The bug persisted.
> > > 
> > > Reverting the commit 84c91b7 on top of e67ee10 fixes the problem.
> > > 
> > > I created a copy of the bug report here: https://bugzilla.kernel.org/show_bug.cgi?id=96111
> > 
> > Please check if 4.0-rc6 still has the problem and if reverting the commit in
> > question on top of it fixes the problem too.
> > 
> > 
> 
> I took the commit 8f778bbc542ddf8f6243b21d6aca087e709cabdc as the base for further checking (I started building before I read your message). It's a descendant of 4.0-rc6, so I hope it's not going to make a difference.
> 
> Results:
> 8f778bb : bad
> 8f778bb + reverted 84c91b7 : good
> 8f778bb + patch [1] : good

Thanks for your dmesg on bko#96111.
I checked it and can confirm that setup_data is reserved as E820_RESERVED_KERN on this machine.
I will add a comment on Bugzilla.

> 
> Thanks!
> 
> [1]:
> x86: Kill E820_RESERVED_KERN  https://lkml.org/lkml/2015/3/4/434 as suggested in joeyli's other email.

I think we should just revert 84c91b7ae until Yinghai Lu's patches are merged for v4.1.
I will resend the 84c91b7ae patch once Yinghai Lu's patches are merged.


Regards
Joey Lee


* Re: Unreliable hibernation on Lenovo x230 (regression)
  2015-04-03 21:43     ` Rafael J. Wysocki
  2015-04-04  8:12       ` rhn
@ 2015-04-05  7:26       ` joeyli
  2015-04-06 23:28         ` Rafael J. Wysocki
  1 sibling, 1 reply; 16+ messages in thread
From: joeyli @ 2015-04-05  7:26 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: rhn, Pavel Machek, kernel list, joeyli.kernel, linux-pm

On Fri, Apr 03, 2015 at 11:43:30PM +0200, Rafael J. Wysocki wrote:
> On Friday, April 03, 2015 05:58:25 PM rhn wrote:
> > On Thu, 2 Apr 2015 17:28:05 +0200
> > Pavel Machek <pavel@ucw.cz> wrote:
> > 
> > > On Wed 2015-04-01 21:47:43, rhn wrote:
> > > > Hello,
> > > > 
> > > > Between kernel 3.16 and 3.17, a regression has been introduced where the first hibernation after regular shutdown always fails to resume. Subsequent hibernations succeed.
> > > > 
> > > > The system is a Lenovo x230 with Intel i5, booting with EFI, with the hibernate partition located on a secondary SSD drive. Installed system is Fedora 20, hibernation and reboots were issued using the KDE shutdown dialog.
> > > > 
> > > > I have tracked the problem to first appear in the commit
> > > > e67ee10190e69332f929bdd6594a312363321a66	Merge branches 'pm-sleep', 'pm-cpufreq' and 'pm-cpuidle'
> > > > 
> > > > The problem itself manifests in dmesg as follows (system was first
> > > > restarted, then hibernated - this log is from the subsequent
> > > resume):
> > > 
> > > Ok, can you try to disable cpufreq and cpuidle, and then try if it
> > > reproduces?
> > > 
> > > At that point, this is the candidate:
> > > 
> > > commit e67ee10190e69332f929bdd6594a312363321a66
> > > Merge: 21c806d 84c91b7 39c8bba 372ba8c
> > > Author: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > > Date:   Mon Aug 11 23:19:48 2014 +0200
> > > 
> > >     Merge branches 'pm-sleep', 'pm-cpufreq' and 'pm-cpuidle'
> > > 
> > >     * pm-sleep:
> > >           PM / hibernate: avoid unsafe pages in e820 reserved regions
> > > 
> > > ...
> > > Alternatively, you can just try to revert
> > > 
> > > commit 84c91b7ae07c62cf6dee7fde3277f4be21331f85
> > > Author: Lee, Chun-Yi <joeyli.kernel@gmail.com>
> > > Date:   Mon Aug 4 23:23:21 2014 +0800
> > > 
> > >     PM / hibernate: avoid unsafe pages in e820 reserved regions
> > > 
> > >     When the machine doesn't well handle the e820 persistent when
> > >     hibernate
> > >         resuming, then it may cause page fault when writing image to
> > >     snapshot
> > >         buffer:
> > > 
> > > 
> > > ...
> > > 
> > > Thanks,
> > > 									Pavel
> > 
> > I tried to disable CONFIG_CPU_IDLE and CONFIG_CPU_FREQ, however for some reason I could only disable CONFIG_CPU_FREQ.
> > 
> > The bug persisted.
> > 
> > Reverting the commit 84c91b7 on top of e67ee10 fixes the problem.
> > 
> > I created a copy of the bug report here: https://bugzilla.kernel.org/show_bug.cgi?id=96111
> 
> Please check if 4.0-rc6 still has the problem and if reverting the commit in
> question on top of it fixes the problem too.
> 
> 
> -- 
> I speak only for myself.
> Rafael J. Wysocki, Intel Open Source Technology Center.

I think we should just revert 84c91b7ae until Yinghai Lu's patches are merged for v4.1.
I will resend the 84c91b7ae patch once Yinghai Lu's patches are merged.


Regards
Joey Lee


* Re: Unreliable hibernation on Lenovo x230 (regression)
  2015-04-05  7:24         ` joeyli
@ 2015-04-05  7:51           ` Yinghai Lu
  2015-04-06  7:12             ` Ingo Molnar
  0 siblings, 1 reply; 16+ messages in thread
From: Yinghai Lu @ 2015-04-05  7:51 UTC (permalink / raw)
  To: joeyli, Ingo Molnar, H. Peter Anvin, Borislav Petkov, Thomas Gleixner
  Cc: rhn, Rafael J. Wysocki, Pavel Machek, kernel list, Lee, Chun-Yi,
	Linux PM list, Andrew Morton

On Sun, Apr 5, 2015 at 12:24 AM, joeyli <jlee@suse.com> wrote:
>> >
>>
>> I took the commit 8f778bbc542ddf8f6243b21d6aca087e709cabdc as the base for further checking (I started building before I read your message). It's a descendant of 4.0-rc6, so I hope it's not going to make a difference.
>>
>> Results:
>> 8f778bb : bad
>> 8f778bb + reverted 84c91b7 : good
>> 8f778bb + patch [1] : good
>
> Thanks for your dmesg on bko#96111.
> I checked and confirm there have the situation of setup_data reserved as E820_RESERVED_KERN.
> I will add comment on bugzilla.
>
>>
>> Thanks!
>>
>> [1]:
>> x86: Kill E820_RESERVED_KERN  https://lkml.org/lkml/2015/3/4/434 as suggested in joeyli's other email.
>
> I think just revert 84c91b7ae until Yinghai Lu's patches merged to v4.1.
> I will resend 84c91b7ae patch until Yinghai Lu's patches merged.

Can you please put https://lkml.org/lkml/2015/3/4/434
into tip?

Thanks

Yinghai


* Re: Unreliable hibernation on Lenovo x230 (regression)
  2015-04-05  7:51           ` Yinghai Lu
@ 2015-04-06  7:12             ` Ingo Molnar
  0 siblings, 0 replies; 16+ messages in thread
From: Ingo Molnar @ 2015-04-06  7:12 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: joeyli, Ingo Molnar, H. Peter Anvin, Borislav Petkov,
	Thomas Gleixner, rhn, Rafael J. Wysocki, Pavel Machek,
	kernel list, Lee, Chun-Yi, Linux PM list, Andrew Morton,
	Borislav Petkov


* Yinghai Lu <yinghai@kernel.org> wrote:

> On Sun, Apr 5, 2015 at 12:24 AM, joeyli <jlee@suse.com> wrote:
> >> >
> >>
> >> I took the commit 8f778bbc542ddf8f6243b21d6aca087e709cabdc as the base for further checking (I started building before I read your message). It's a descendant of 4.0-rc6, so I hope it's not going to make a difference.
> >>
> >> Results:
> >> 8f778bb : bad
> >> 8f778bb + reverted 84c91b7 : good
> >> 8f778bb + patch [1] : good
> >
> > Thanks for your dmesg on bko#96111.
> > I checked and confirm there have the situation of setup_data reserved as E820_RESERVED_KERN.
> > I will add comment on bugzilla.
> >
> >>
> >> Thanks!
> >>
> >> [1]:
> >> x86: Kill E820_RESERVED_KERN  https://lkml.org/lkml/2015/3/4/434 as suggested in joeyli's other email.
> >
> > I think just revert 84c91b7ae until Yinghai Lu's patches merged to v4.1.
> > I will resend 84c91b7ae patch until Yinghai Lu's patches merged.
> 
> Can you please put https://lkml.org/lkml/2015/3/4/434
> into tip?

I cannot apply this patch without a readable changelog, see:

  http://lkml.iu.edu/hypermail/linux/kernel/1503.1/05342.html

Your changelog (again) violates about half of the principles I tried 
to outline in that post.

Thanks,

	Ingo


* Re: Unreliable hibernation on Lenovo x230 (regression)
  2015-04-05  7:26       ` joeyli
@ 2015-04-06 23:28         ` Rafael J. Wysocki
  0 siblings, 0 replies; 16+ messages in thread
From: Rafael J. Wysocki @ 2015-04-06 23:28 UTC (permalink / raw)
  To: joeyli; +Cc: rhn, Pavel Machek, kernel list, joeyli.kernel, linux-pm

On Sunday, April 05, 2015 03:26:13 PM joeyli wrote:
> On Fri, Apr 03, 2015 at 11:43:30PM +0200, Rafael J. Wysocki wrote:
> > On Friday, April 03, 2015 05:58:25 PM rhn wrote:
> > > On Thu, 2 Apr 2015 17:28:05 +0200
> > > Pavel Machek <pavel@ucw.cz> wrote:
> > > 
> > > > On Wed 2015-04-01 21:47:43, rhn wrote:
> > > > > Hello,
> > > > > 
> > > > > Between kernel 3.16 and 3.17, a regression has been introduced where the first hibernation after regular shutdown always fails to resume. Subsequent hibernations succeed.
> > > > > 
> > > > > The system is a Lenovo x230 with Intel i5, booting with EFI, with the hibernate partition located on a secondary SSD drive. Installed system is Fedora 20, hibernation and reboots were issued using the KDE shutdown dialog.
> > > > > 
> > > > > I have tracked the problem to first appear in the commit
> > > > > e67ee10190e69332f929bdd6594a312363321a66	Merge branches 'pm-sleep', 'pm-cpufreq' and 'pm-cpuidle'
> > > > > 
> > > > > The problem itself manifests in dmesg as follows (system was first
> > > > > restarted, then hibernated - this log is from the subsequent
> > > > resume):
> > > > 
> > > > Ok, can you try to disable cpufreq and cpuidle, and then try if it
> > > > reproduces?
> > > > 
> > > > At that point, this is the candidate:
> > > > 
> > > > commit e67ee10190e69332f929bdd6594a312363321a66
> > > > Merge: 21c806d 84c91b7 39c8bba 372ba8c
> > > > Author: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > > > Date:   Mon Aug 11 23:19:48 2014 +0200
> > > > 
> > > >     Merge branches 'pm-sleep', 'pm-cpufreq' and 'pm-cpuidle'
> > > > 
> > > >     * pm-sleep:
> > > >           PM / hibernate: avoid unsafe pages in e820 reserved regions
> > > > 
> > > > ...
> > > > Alternatively, you can just try to revert
> > > > 
> > > > commit 84c91b7ae07c62cf6dee7fde3277f4be21331f85
> > > > Author: Lee, Chun-Yi <joeyli.kernel@gmail.com>
> > > > Date:   Mon Aug 4 23:23:21 2014 +0800
> > > > 
> > > >     PM / hibernate: avoid unsafe pages in e820 reserved regions
> > > > 
> > > >     When the machine doesn't well handle the e820 persistent when
> > > >     hibernate
> > > >         resuming, then it may cause page fault when writing image to
> > > >     snapshot
> > > >         buffer:
> > > > 
> > > > 
> > > > ...
> > > > 
> > > > Thanks,
> > > > 									Pavel
> > > 
> > > I tried to disable CONFIG_CPU_IDLE and CONFIG_CPU_FREQ, however for some reason I could only disable CONFIG_CPU_FREQ.
> > > 
> > > The bug persisted.
> > > 
> > > Reverting the commit 84c91b7 on top of e67ee10 fixes the problem.
> > > 
> > > I created a copy of the bug report here: https://bugzilla.kernel.org/show_bug.cgi?id=96111
> > 
> > Please check if 4.0-rc6 still has the problem and if reverting the commit in
> > question on top of it fixes the problem too.
> > 
> > 
> 
> I think just revert 84c91b7ae until Yinghai Lu's patches merged to v4.1.
> I will resend 84c91b7ae patch until Yinghai Lu's patches merged.

OK, I'll queue up a revert of 84c91b7ae as a fix for 4.0.


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.


Thread overview: 16+ messages
2015-04-01 19:47 Unreliable hibernation on Lenovo x230 (regression) rhn
2015-04-02 15:28 ` Pavel Machek
2015-04-02 16:50   ` joeyli
2015-04-02 17:22     ` joeyli
2015-04-02 18:12       ` rhn
2015-04-03  1:23         ` joeyli
2015-04-03 16:00           ` rhn
2015-04-03 15:58   ` rhn
2015-04-03 16:40     ` Pavel Machek
2015-04-03 21:43     ` Rafael J. Wysocki
2015-04-04  8:12       ` rhn
2015-04-05  7:24         ` joeyli
2015-04-05  7:51           ` Yinghai Lu
2015-04-06  7:12             ` Ingo Molnar
2015-04-05  7:26       ` joeyli
2015-04-06 23:28         ` Rafael J. Wysocki
