linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* 2.6.35.5: hibernation broken... AGAIN
@ 2010-11-17 10:39 Ondrej Zary
  2010-11-17 20:53 ` Rafael J. Wysocki
  0 siblings, 1 reply; 24+ messages in thread
From: Ondrej Zary @ 2010-11-17 10:39 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: KOSAKI Motohiro, Rafael J. Wysocki, Kernel development list,
	Andrew Morton, Balbir Singh

Hello,
the nasty memory-corrupting hibernation bug 
https://bugzilla.kernel.org/show_bug.cgi?id=15753 is back since 2.6.35.5.
2.6.35.4 works fine, 2.6.35.5 crashes after two days.

It seems to be caused by b77c254d8d66e5e9aa81239fedba9f3d568097d9.

-- 
Ondrej Zary

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: 2.6.35.5: hibernation broken... AGAIN
  2010-11-17 10:39 2.6.35.5: hibernation broken... AGAIN Ondrej Zary
@ 2010-11-17 20:53 ` Rafael J. Wysocki
  2010-11-17 21:04   ` Andrew Morton
  0 siblings, 1 reply; 24+ messages in thread
From: Rafael J. Wysocki @ 2010-11-17 20:53 UTC (permalink / raw)
  To: Ondrej Zary
  Cc: KAMEZAWA Hiroyuki, KOSAKI Motohiro, Kernel development list,
	Andrew Morton, Balbir Singh

On Wednesday, November 17, 2010, Ondrej Zary wrote:
> Hello,
> the nasty memory-corrupting hibernation bug 
> https://bugzilla.kernel.org/show_bug.cgi?id=15753 is back since 2.6.35.5.
> 2.6.35.4 works fine, 2.6.35.5 crashes after two days.
> 
> It seems to be caused by b77c254d8d66e5e9aa81239fedba9f3d568097d9.

Can you, _please_, always put all of the basic commit info (subject, author,
who signed it off) in bug reports?

In this particular case I don't have a -stable tree handy at the moment and
can't even see what the commit is.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: 2.6.35.5: hibernation broken... AGAIN
  2010-11-17 20:53 ` Rafael J. Wysocki
@ 2010-11-17 21:04   ` Andrew Morton
  2010-11-17 21:12     ` Rafael J. Wysocki
  0 siblings, 1 reply; 24+ messages in thread
From: Andrew Morton @ 2010-11-17 21:04 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Ondrej Zary, KAMEZAWA Hiroyuki, KOSAKI Motohiro,
	Kernel development list, Balbir Singh

On Wed, 17 Nov 2010 21:53:52 +0100
"Rafael J. Wysocki" <rjw@sisk.pl> wrote:

> On Wednesday, November 17, 2010, Ondrej Zary wrote:
> > Hello,
> > the nasty memory-corrupting hibernation bug 
> > https://bugzilla.kernel.org/show_bug.cgi?id=15753 is back since 2.6.35.5.
> > 2.6.35.4 works fine, 2.6.35.5 crashes after two days.
> > 
> > It seems to be caused by b77c254d8d66e5e9aa81239fedba9f3d568097d9.
> 
> Can you, _please_, always put all of the basic commit info (subject, author,
> who signed it off) in bug reports?
> 
> In this particular case I don't have a -stable tree handy at the moment and
> can't even see what the commit is.

You can simply do a google search for the commit ID.  It is


commit b77c254d8d66e5e9aa81239fedba9f3d568097d9
Author: Hugh Dickins <hughd@google.com>
Date:   Thu Sep 9 16:38:09 2010 -0700

    swap: prevent reuse during hibernation
 

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: 2.6.35.5: hibernation broken... AGAIN
  2010-11-17 21:04   ` Andrew Morton
@ 2010-11-17 21:12     ` Rafael J. Wysocki
  2010-11-17 21:18       ` Ondrej Zary
  0 siblings, 1 reply; 24+ messages in thread
From: Rafael J. Wysocki @ 2010-11-17 21:12 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Ondrej Zary, KAMEZAWA Hiroyuki, KOSAKI Motohiro,
	Kernel development list, Balbir Singh, Hugh Dickins

On Wednesday, November 17, 2010, Andrew Morton wrote:
> On Wed, 17 Nov 2010 21:53:52 +0100
> "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> 
> > On Wednesday, November 17, 2010, Ondrej Zary wrote:
> > > Hello,
> > > the nasty memory-corrupting hibernation bug 
> > > https://bugzilla.kernel.org/show_bug.cgi?id=15753 is back since 2.6.35.5.
> > > 2.6.35.4 works fine, 2.6.35.5 crashes after two days.
> > > 
> > > It seems to be caused by b77c254d8d66e5e9aa81239fedba9f3d568097d9.
> > 
> > Can you, _please_, always put all of the basic commit info (subject, author,
> > who signed it off) in bug reports?
> > 
> > In this particular case I don't have a -stable tree handy at the moment and
> > can't even see what the commit is.
> 
> You can simply do a google search for the commit ID.  It is

Thanks! :-)

> commit b77c254d8d66e5e9aa81239fedba9f3d568097d9
> Author: Hugh Dickins <hughd@google.com>
> Date:   Thu Sep 9 16:38:09 2010 -0700
> 
>     swap: prevent reuse during hibernation

So I guess we should Cc Hugh ...

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: 2.6.35.5: hibernation broken... AGAIN
  2010-11-17 21:12     ` Rafael J. Wysocki
@ 2010-11-17 21:18       ` Ondrej Zary
  2010-11-17 23:46         ` Hugh Dickins
  0 siblings, 1 reply; 24+ messages in thread
From: Ondrej Zary @ 2010-11-17 21:18 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Andrew Morton, KAMEZAWA Hiroyuki, KOSAKI Motohiro,
	Kernel development list, Balbir Singh, Hugh Dickins

On Wednesday 17 November 2010 22:12:01 Rafael J. Wysocki wrote:
> On Wednesday, November 17, 2010, Andrew Morton wrote:
> > On Wed, 17 Nov 2010 21:53:52 +0100
> >
> > "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> > > On Wednesday, November 17, 2010, Ondrej Zary wrote:
> > > > Hello,
> > > > the nasty memory-corrupting hibernation bug
> > > > https://bugzilla.kernel.org/show_bug.cgi?id=15753 is back since
> > > > 2.6.35.5. 2.6.35.4 works fine, 2.6.35.5 crashes after two days.
> > > >
> > > > It seems to be caused by b77c254d8d66e5e9aa81239fedba9f3d568097d9.
> > >
> > > Can you, _please_, always put all of the basic commit info (subject,
> > > author, who signed it off) in bug reports?
> > >
> > > In this particular case I don't have a -stable tree handy at the moment
> > > and can't even see what the commit is.
> >
> > You can simply do a google search for the commit ID.  It is
>
> Thanks! :-)
>
> > commit b77c254d8d66e5e9aa81239fedba9f3d568097d9
> > Author: Hugh Dickins <hughd@google.com>
> > Date:   Thu Sep 9 16:38:09 2010 -0700
> >
> >     swap: prevent reuse during hibernation
>
> So I guess we should Cc Hugh ...

Oops, my bad. I was in hurry and somehow missed that he's not on the Cc list.

-- 
Ondrej Zary

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: 2.6.35.5: hibernation broken... AGAIN
  2010-11-17 21:18       ` Ondrej Zary
@ 2010-11-17 23:46         ` Hugh Dickins
  2010-11-20 15:07           ` Ondrej Zary
                             ` (2 more replies)
  0 siblings, 3 replies; 24+ messages in thread
From: Hugh Dickins @ 2010-11-17 23:46 UTC (permalink / raw)
  To: Ondrej Zary
  Cc: Rafael J. Wysocki, Andrew Morton, KAMEZAWA Hiroyuki,
	KOSAKI Motohiro, Kernel development list, Balbir Singh

On Wed, 17 Nov 2010, Ondrej Zary wrote:
> On Wednesday 17 November 2010 22:12:01 Rafael J. Wysocki wrote:
> > On Wednesday, November 17, 2010, Andrew Morton wrote:
> > > On Wed, 17 Nov 2010 21:53:52 +0100
> > > "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> > > > On Wednesday, November 17, 2010, Ondrej Zary wrote:
> > > > > Hello,
> > > > > the nasty memory-corrupting hibernation bug
> > > > > https://bugzilla.kernel.org/show_bug.cgi?id=15753 is back since
> > > > > 2.6.35.5. 2.6.35.4 works fine, 2.6.35.5 crashes after two days.

That's distressing, for both and all of us: I'm sorry.

> > > > >
> > > > > It seems to be caused by b77c254d8d66e5e9aa81239fedba9f3d568097d9.
> >
> > > commit b77c254d8d66e5e9aa81239fedba9f3d568097d9
> > > Author: Hugh Dickins <hughd@google.com>
> > > Date:   Thu Sep 9 16:38:09 2010 -0700
> > >
> > >     swap: prevent reuse during hibernation

Embarrassing: I suspect that I've been confused, not for the first
time, by the fork-like nature of hibernation and its images.
I wonder if this patch below fixes it, Ondrej?

(And is it kernel swsusp or user swsusp that you're using?  May not
matter at all, but will help us to think more clearly about it,
if the corruption remains after this patch.)

Rafael, do you agree that this patch was actually required even for
your original commit 452aa6999e6703ffbddd7f6ea124d3968915f3e3
mm/pm: force GFP_NOIO during suspend/hibernation and resume?

Or am I still just as confused?  Or if not, are there more forking
places which require a similar patch?

Not signing it off yet,
Hugh

---

 kernel/power/hibernate.c |    9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

--- 2.6.37-rc2/kernel/power/hibernate.c	2010-11-01 13:01:31.000000000 -0700
+++ linux/kernel/power/hibernate.c	2010-11-17 15:23:36.000000000 -0800
@@ -348,16 +348,19 @@ int hibernation_snapshot(int platform_mo
 		goto Recover_platform;
 
 	error = create_image(platform_mode);
-	/* Control returns here after successful restore */
+	/* Control returns here after successful restore,
+	 * and also after creating the image in memory (or failing to do so).
+	 */
 
  Resume_devices:
 	/* We may need to release the preallocated image pages here. */
-	if (error || !in_suspend)
+	if (error || !in_suspend) {
 		swsusp_free();
+		set_gfp_allowed_mask(saved_mask);
+	}
 
 	dpm_resume_end(in_suspend ?
 		(error ? PMSG_RECOVER : PMSG_THAW) : PMSG_RESTORE);
-	set_gfp_allowed_mask(saved_mask);
 	resume_console();
  Close:
 	platform_end(platform_mode);

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: 2.6.35.5: hibernation broken... AGAIN
  2010-11-17 23:46         ` Hugh Dickins
@ 2010-11-20 15:07           ` Ondrej Zary
  2010-11-26 11:43           ` Ondrej Zary
  2010-11-26 20:24           ` Rafael J. Wysocki
  2 siblings, 0 replies; 24+ messages in thread
From: Ondrej Zary @ 2010-11-20 15:07 UTC (permalink / raw)
  To: Hugh Dickins
  Cc: Rafael J. Wysocki, Andrew Morton, KAMEZAWA Hiroyuki,
	KOSAKI Motohiro, Kernel development list, Balbir Singh

On Thursday 18 November 2010 00:46:51 Hugh Dickins wrote:
> On Wed, 17 Nov 2010, Ondrej Zary wrote:
> > On Wednesday 17 November 2010 22:12:01 Rafael J. Wysocki wrote:
> > > On Wednesday, November 17, 2010, Andrew Morton wrote:
> > > > On Wed, 17 Nov 2010 21:53:52 +0100
> > > >
> > > > "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> > > > > On Wednesday, November 17, 2010, Ondrej Zary wrote:
> > > > > > Hello,
> > > > > > the nasty memory-corrupting hibernation bug
> > > > > > https://bugzilla.kernel.org/show_bug.cgi?id=15753 is back since
> > > > > > 2.6.35.5. 2.6.35.4 works fine, 2.6.35.5 crashes after two days.
>
> That's distressing, for both and all of us: I'm sorry.
>
> > > > > > It seems to be caused by
> > > > > > b77c254d8d66e5e9aa81239fedba9f3d568097d9.
> > > >
> > > > commit b77c254d8d66e5e9aa81239fedba9f3d568097d9
> > > > Author: Hugh Dickins <hughd@google.com>
> > > > Date:   Thu Sep 9 16:38:09 2010 -0700
> > > >
> > > >     swap: prevent reuse during hibernation
>
> Embarrassing: I suspect that I've been confused, not for the first
> time, by the fork-like nature of hibernation and its images.
> I wonder if this patch below fixes it, Ondrej?

It seems to work with this patch. 2nd day and no crashes yet.
2.6.35.4 is OK
2.6.35.4 + b77c254d8d66e5e9aa81239fedba9f3d568097d9 is bad
2.6.35.4 + b77c254d8d66e5e9aa81239fedba9f3d568097d9 + this patch seems OK

> (And is it kernel swsusp or user swsusp that you're using?  May not
> matter at all, but will help us to think more clearly about it,
> if the corruption remains after this patch.)

I'm using "echo disk >/sys/power/state".

> Rafael, do you agree that this patch was actually required even for
> your original commit 452aa6999e6703ffbddd7f6ea124d3968915f3e3
> mm/pm: force GFP_NOIO during suspend/hibernation and resume?
>
> Or am I still just as confused?  Or if not, are there more forking
> places which require a similar patch?
>
> Not signing it off yet,
> Hugh
>
> ---
>
>  kernel/power/hibernate.c |    9 ++++++---
>  1 file changed, 6 insertions(+), 3 deletions(-)
>
> --- 2.6.37-rc2/kernel/power/hibernate.c	2010-11-01 13:01:31.000000000 -0700
> +++ linux/kernel/power/hibernate.c	2010-11-17 15:23:36.000000000 -0800
> @@ -348,16 +348,19 @@ int hibernation_snapshot(int platform_mo
>  		goto Recover_platform;
>
>  	error = create_image(platform_mode);
> -	/* Control returns here after successful restore */
> +	/* Control returns here after successful restore,
> +	 * and also after creating the image in memory (or failing to do so).
> +	 */
>
>   Resume_devices:
>  	/* We may need to release the preallocated image pages here. */
> -	if (error || !in_suspend)
> +	if (error || !in_suspend) {
>  		swsusp_free();
> +		set_gfp_allowed_mask(saved_mask);
> +	}
>
>  	dpm_resume_end(in_suspend ?
>  		(error ? PMSG_RECOVER : PMSG_THAW) : PMSG_RESTORE);
> -	set_gfp_allowed_mask(saved_mask);
>  	resume_console();
>   Close:
>  	platform_end(platform_mode);



-- 
Ondrej Zary

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: 2.6.35.5: hibernation broken... AGAIN
  2010-11-17 23:46         ` Hugh Dickins
  2010-11-20 15:07           ` Ondrej Zary
@ 2010-11-26 11:43           ` Ondrej Zary
  2010-11-26 23:10             ` Rafael J. Wysocki
  2010-11-26 20:24           ` Rafael J. Wysocki
  2 siblings, 1 reply; 24+ messages in thread
From: Ondrej Zary @ 2010-11-26 11:43 UTC (permalink / raw)
  To: Hugh Dickins
  Cc: Rafael J. Wysocki, Andrew Morton, KAMEZAWA Hiroyuki,
	KOSAKI Motohiro, Kernel development list, Balbir Singh

On Thursday 18 November 2010, Hugh Dickins wrote:
> On Wed, 17 Nov 2010, Ondrej Zary wrote:
> > On Wednesday 17 November 2010 22:12:01 Rafael J. Wysocki wrote:
> > > On Wednesday, November 17, 2010, Andrew Morton wrote:
> > > > On Wed, 17 Nov 2010 21:53:52 +0100
> > > >
> > > > "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> > > > > On Wednesday, November 17, 2010, Ondrej Zary wrote:
> > > > > > Hello,
> > > > > > the nasty memory-corrupting hibernation bug
> > > > > > https://bugzilla.kernel.org/show_bug.cgi?id=15753 is back since
> > > > > > 2.6.35.5. 2.6.35.4 works fine, 2.6.35.5 crashes after two days.
>
> That's distressing, for both and all of us: I'm sorry.
>
> > > > > > It seems to be caused by
> > > > > > b77c254d8d66e5e9aa81239fedba9f3d568097d9.
> > > >
> > > > commit b77c254d8d66e5e9aa81239fedba9f3d568097d9
> > > > Author: Hugh Dickins <hughd@google.com>
> > > > Date:   Thu Sep 9 16:38:09 2010 -0700
> > > >
> > > >     swap: prevent reuse during hibernation
>
> Embarrassing: I suspect that I've been confused, not for the first
> time, by the fork-like nature of hibernation and its images.
> I wonder if this patch below fixes it, Ondrej?
>
> (And is it kernel swsusp or user swsusp that you're using?  May not
> matter at all, but will help us to think more clearly about it,
> if the corruption remains after this patch.)
>
> Rafael, do you agree that this patch was actually required even for
> your original commit 452aa6999e6703ffbddd7f6ea124d3968915f3e3
> mm/pm: force GFP_NOIO during suspend/hibernation and resume?
>
> Or am I still just as confused?  Or if not, are there more forking
> places which require a similar patch?
>
> Not signing it off yet,
> Hugh

Could you please do that? The patch fixes the problem.


> ---
>
>  kernel/power/hibernate.c |    9 ++++++---
>  1 file changed, 6 insertions(+), 3 deletions(-)
>
> --- 2.6.37-rc2/kernel/power/hibernate.c	2010-11-01 13:01:31.000000000 -0700
> +++ linux/kernel/power/hibernate.c	2010-11-17 15:23:36.000000000 -0800
> @@ -348,16 +348,19 @@ int hibernation_snapshot(int platform_mo
>  		goto Recover_platform;
>
>  	error = create_image(platform_mode);
> -	/* Control returns here after successful restore */
> +	/* Control returns here after successful restore,
> +	 * and also after creating the image in memory (or failing to do so).
> +	 */
>
>   Resume_devices:
>  	/* We may need to release the preallocated image pages here. */
> -	if (error || !in_suspend)
> +	if (error || !in_suspend) {
>  		swsusp_free();
> +		set_gfp_allowed_mask(saved_mask);
> +	}
>
>  	dpm_resume_end(in_suspend ?
>  		(error ? PMSG_RECOVER : PMSG_THAW) : PMSG_RESTORE);
> -	set_gfp_allowed_mask(saved_mask);
>  	resume_console();
>   Close:
>  	platform_end(platform_mode);
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/



-- 
Ondrej Zary

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: 2.6.35.5: hibernation broken... AGAIN
  2010-11-17 23:46         ` Hugh Dickins
  2010-11-20 15:07           ` Ondrej Zary
  2010-11-26 11:43           ` Ondrej Zary
@ 2010-11-26 20:24           ` Rafael J. Wysocki
  2 siblings, 0 replies; 24+ messages in thread
From: Rafael J. Wysocki @ 2010-11-26 20:24 UTC (permalink / raw)
  To: Hugh Dickins
  Cc: Ondrej Zary, Andrew Morton, KAMEZAWA Hiroyuki, KOSAKI Motohiro,
	Kernel development list, Balbir Singh

On Thursday, November 18, 2010, Hugh Dickins wrote:
> On Wed, 17 Nov 2010, Ondrej Zary wrote:
> > On Wednesday 17 November 2010 22:12:01 Rafael J. Wysocki wrote:
> > > On Wednesday, November 17, 2010, Andrew Morton wrote:
> > > > On Wed, 17 Nov 2010 21:53:52 +0100
> > > > "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> > > > > On Wednesday, November 17, 2010, Ondrej Zary wrote:
> > > > > > Hello,
> > > > > > the nasty memory-corrupting hibernation bug
> > > > > > https://bugzilla.kernel.org/show_bug.cgi?id=15753 is back since
> > > > > > 2.6.35.5. 2.6.35.4 works fine, 2.6.35.5 crashes after two days.
> 
> That's distressing, for both and all of us: I'm sorry.
> 
> > > > > >
> > > > > > It seems to be caused by b77c254d8d66e5e9aa81239fedba9f3d568097d9.
> > >
> > > > commit b77c254d8d66e5e9aa81239fedba9f3d568097d9
> > > > Author: Hugh Dickins <hughd@google.com>
> > > > Date:   Thu Sep 9 16:38:09 2010 -0700
> > > >
> > > >     swap: prevent reuse during hibernation
> 
> Embarrassing: I suspect that I've been confused, not for the first
> time, by the fork-like nature of hibernation and its images.
> I wonder if this patch below fixes it, Ondrej?
> 
> (And is it kernel swsusp or user swsusp that you're using?  May not
> matter at all, but will help us to think more clearly about it,
> if the corruption remains after this patch.)
> 
> Rafael, do you agree that this patch was actually required even for
> your original commit 452aa6999e6703ffbddd7f6ea124d3968915f3e3
> mm/pm: force GFP_NOIO during suspend/hibernation and resume?

(Sorry for the late response).

Well, not exactly.

First, IMO set_gfp_allowed_mask(saved_mask) should not be called before
dpm_resume_end(), because IO allocations may try to use suspended devices in
theory (although that's not likely due to the swsusp_free() before).  So the
right thing to do appears to be:

if (error || !in_suspend) {
  		swsusp_free();

dpm_resume_end(in_suspend ?
  		(error ? PMSG_RECOVER : PMSG_THAW) : PMSG_RESTORE);

if (error || !in_suspend)
  		set_gfp_allowed_mask(saved_mask);

resume_console();

Second, since we don't call set_gfp_allowed_mask(saved_mask) in the
(in_suspend && !error) case, hibernation_platform_enter() also should be
updated (I think we don't need to use set_gfp_allowed_mask() in it at all).

There's one more subtlety.  Namely, the saving of an image by s2disk
may be aborted and in that case we need to restore the original
gfp_allowed_mask too, but I need to look deeper into the code to see how to do
it cleanly.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: 2.6.35.5: hibernation broken... AGAIN
  2010-11-26 11:43           ` Ondrej Zary
@ 2010-11-26 23:10             ` Rafael J. Wysocki
  2010-11-27 14:58               ` Ondrej Zary
  0 siblings, 1 reply; 24+ messages in thread
From: Rafael J. Wysocki @ 2010-11-26 23:10 UTC (permalink / raw)
  To: Ondrej Zary
  Cc: Hugh Dickins, Andrew Morton, KAMEZAWA Hiroyuki, KOSAKI Motohiro,
	Kernel development list, Balbir Singh

On Friday, November 26, 2010, Ondrej Zary wrote:
> On Thursday 18 November 2010, Hugh Dickins wrote:
> > On Wed, 17 Nov 2010, Ondrej Zary wrote:
> > > On Wednesday 17 November 2010 22:12:01 Rafael J. Wysocki wrote:
> > > > On Wednesday, November 17, 2010, Andrew Morton wrote:
> > > > > On Wed, 17 Nov 2010 21:53:52 +0100
> > > > >
> > > > > "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> > > > > > On Wednesday, November 17, 2010, Ondrej Zary wrote:
> > > > > > > Hello,
> > > > > > > the nasty memory-corrupting hibernation bug
> > > > > > > https://bugzilla.kernel.org/show_bug.cgi?id=15753 is back since
> > > > > > > 2.6.35.5. 2.6.35.4 works fine, 2.6.35.5 crashes after two days.
> >
> > That's distressing, for both and all of us: I'm sorry.
> >
> > > > > > > It seems to be caused by
> > > > > > > b77c254d8d66e5e9aa81239fedba9f3d568097d9.
> > > > >
> > > > > commit b77c254d8d66e5e9aa81239fedba9f3d568097d9
> > > > > Author: Hugh Dickins <hughd@google.com>
> > > > > Date:   Thu Sep 9 16:38:09 2010 -0700
> > > > >
> > > > >     swap: prevent reuse during hibernation
> >
> > Embarrassing: I suspect that I've been confused, not for the first
> > time, by the fork-like nature of hibernation and its images.
> > I wonder if this patch below fixes it, Ondrej?
> >
> > (And is it kernel swsusp or user swsusp that you're using?  May not
> > matter at all, but will help us to think more clearly about it,
> > if the corruption remains after this patch.)
> >
> > Rafael, do you agree that this patch was actually required even for
> > your original commit 452aa6999e6703ffbddd7f6ea124d3968915f3e3
> > mm/pm: force GFP_NOIO during suspend/hibernation and resume?
> >
> > Or am I still just as confused?  Or if not, are there more forking
> > places which require a similar patch?
> >
> > Not signing it off yet,
> > Hugh
> 
> Could you please do that? The patch fixes the problem.

Can you check if the problem is also fixed by the patch below, please?

Rafael

---
 kernel/power/hibernate.c |   36 ++++++++++++++++++++++++++----------
 kernel/power/power.h     |    1 +
 kernel/power/user.c      |    1 +
 3 files changed, 28 insertions(+), 10 deletions(-)

Index: linux-2.6/kernel/power/hibernate.c
===================================================================
--- linux-2.6.orig/kernel/power/hibernate.c
+++ linux-2.6/kernel/power/hibernate.c
@@ -29,6 +29,21 @@
 #include "power.h"
 
 
+static gfp_t saved_gfp_mask;
+
+static void hibernate_restrict_gfp_mask(void)
+{
+	saved_gfp_mask = clear_gfp_allowed_mask(GFP_IOFS);
+}
+
+void hibernate_restore_gfp_mask(void)
+{
+	if (saved_gfp_mask) {
+		set_gfp_allowed_mask(saved_gfp_mask);
+		saved_gfp_mask = 0;
+	}
+}
+
 static int nocompress = 0;
 static int noresume = 0;
 static char resume_file[256] = CONFIG_PM_STD_PARTITION;
@@ -327,7 +342,6 @@ static int create_image(int platform_mod
 int hibernation_snapshot(int platform_mode)
 {
 	int error;
-	gfp_t saved_mask;
 
 	error = platform_begin(platform_mode);
 	if (error)
@@ -339,7 +353,7 @@ int hibernation_snapshot(int platform_mo
 		goto Close;
 
 	suspend_console();
-	saved_mask = clear_gfp_allowed_mask(GFP_IOFS);
+	hibernate_restrict_gfp_mask();
 	error = dpm_suspend_start(PMSG_FREEZE);
 	if (error)
 		goto Recover_platform;
@@ -348,7 +362,10 @@ int hibernation_snapshot(int platform_mo
 		goto Recover_platform;
 
 	error = create_image(platform_mode);
-	/* Control returns here after successful restore */
+	/*
+	 * Control returns here (1) after the image has been created or the
+	 * image creation has failed and (2) after a successful restore.
+	 */
 
  Resume_devices:
 	/* We may need to release the preallocated image pages here. */
@@ -357,7 +374,10 @@ int hibernation_snapshot(int platform_mo
 
 	dpm_resume_end(in_suspend ?
 		(error ? PMSG_RECOVER : PMSG_THAW) : PMSG_RESTORE);
-	set_gfp_allowed_mask(saved_mask);
+
+	if (error || !in_suspend)
+		hibernate_restore_gfp_mask();
+
 	resume_console();
  Close:
 	platform_end(platform_mode);
@@ -452,17 +472,16 @@ static int resume_target_kernel(bool pla
 int hibernation_restore(int platform_mode)
 {
 	int error;
-	gfp_t saved_mask;
 
 	pm_prepare_console();
 	suspend_console();
-	saved_mask = clear_gfp_allowed_mask(GFP_IOFS);
+	hibernate_restrict_gfp_mask();
 	error = dpm_suspend_start(PMSG_QUIESCE);
 	if (!error) {
 		error = resume_target_kernel(platform_mode);
 		dpm_resume_end(PMSG_RECOVER);
 	}
-	set_gfp_allowed_mask(saved_mask);
+	hibernate_restore_gfp_mask();
 	resume_console();
 	pm_restore_console();
 	return error;
@@ -476,7 +495,6 @@ int hibernation_restore(int platform_mod
 int hibernation_platform_enter(void)
 {
 	int error;
-	gfp_t saved_mask;
 
 	if (!hibernation_ops)
 		return -ENOSYS;
@@ -492,7 +510,6 @@ int hibernation_platform_enter(void)
 
 	entering_platform_hibernation = true;
 	suspend_console();
-	saved_mask = clear_gfp_allowed_mask(GFP_IOFS);
 	error = dpm_suspend_start(PMSG_HIBERNATE);
 	if (error) {
 		if (hibernation_ops->recover)
@@ -536,7 +553,6 @@ int hibernation_platform_enter(void)
  Resume_devices:
 	entering_platform_hibernation = false;
 	dpm_resume_end(PMSG_RESTORE);
-	set_gfp_allowed_mask(saved_mask);
 	resume_console();
 
  Close:
Index: linux-2.6/kernel/power/power.h
===================================================================
--- linux-2.6.orig/kernel/power/power.h
+++ linux-2.6/kernel/power/power.h
@@ -52,6 +52,7 @@ static inline char *check_image_kernel(s
 extern int hibernation_snapshot(int platform_mode);
 extern int hibernation_restore(int platform_mode);
 extern int hibernation_platform_enter(void);
+extern void hibernate_restore_gfp_mask(void);
 
 #else /* !CONFIG_HIBERNATION */
 
Index: linux-2.6/kernel/power/user.c
===================================================================
--- linux-2.6.orig/kernel/power/user.c
+++ linux-2.6/kernel/power/user.c
@@ -263,6 +263,7 @@ static long snapshot_ioctl(struct file *
 	case SNAPSHOT_UNFREEZE:
 		if (!data->frozen || data->ready)
 			break;
+		hibernate_restore_gfp_mask();
 		thaw_processes();
 		usermodehelper_enable();
 		data->frozen = 0;

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: 2.6.35.5: hibernation broken... AGAIN
  2010-11-26 23:10             ` Rafael J. Wysocki
@ 2010-11-27 14:58               ` Ondrej Zary
  2010-11-27 20:47                 ` Rafael J. Wysocki
  0 siblings, 1 reply; 24+ messages in thread
From: Ondrej Zary @ 2010-11-27 14:58 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Hugh Dickins, Andrew Morton, KAMEZAWA Hiroyuki, KOSAKI Motohiro,
	Kernel development list, Balbir Singh

On Saturday 27 November 2010 00:10:04 Rafael J. Wysocki wrote:
> On Friday, November 26, 2010, Ondrej Zary wrote:
> > On Thursday 18 November 2010, Hugh Dickins wrote:
> > > On Wed, 17 Nov 2010, Ondrej Zary wrote:
> > > > On Wednesday 17 November 2010 22:12:01 Rafael J. Wysocki wrote:
> > > > > On Wednesday, November 17, 2010, Andrew Morton wrote:
> > > > > > On Wed, 17 Nov 2010 21:53:52 +0100
> > > > > >
> > > > > > "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> > > > > > > On Wednesday, November 17, 2010, Ondrej Zary wrote:
> > > > > > > > Hello,
> > > > > > > > the nasty memory-corrupting hibernation bug
> > > > > > > > https://bugzilla.kernel.org/show_bug.cgi?id=15753 is back
> > > > > > > > since 2.6.35.5. 2.6.35.4 works fine, 2.6.35.5 crashes after
> > > > > > > > two days.
> > >
> > > That's distressing, for both and all of us: I'm sorry.
> > >
> > > > > > > > It seems to be caused by
> > > > > > > > b77c254d8d66e5e9aa81239fedba9f3d568097d9.
> > > > > >
> > > > > > commit b77c254d8d66e5e9aa81239fedba9f3d568097d9
> > > > > > Author: Hugh Dickins <hughd@google.com>
> > > > > > Date:   Thu Sep 9 16:38:09 2010 -0700
> > > > > >
> > > > > >     swap: prevent reuse during hibernation
> > >
> > > Embarrassing: I suspect that I've been confused, not for the first
> > > time, by the fork-like nature of hibernation and its images.
> > > I wonder if this patch below fixes it, Ondrej?
> > >
> > > (And is it kernel swsusp or user swsusp that you're using?  May not
> > > matter at all, but will help us to think more clearly about it,
> > > if the corruption remains after this patch.)
> > >
> > > Rafael, do you agree that this patch was actually required even for
> > > your original commit 452aa6999e6703ffbddd7f6ea124d3968915f3e3
> > > mm/pm: force GFP_NOIO during suspend/hibernation and resume?
> > >
> > > Or am I still just as confused?  Or if not, are there more forking
> > > places which require a similar patch?
> > >
> > > Not signing it off yet,
> > > Hugh
> >
> > Could you please do that? The patch fixes the problem.
>
> Can you check if the problem is also fixed by the patch below, please?

The patch does not apply to 2.6.35.4, 2.6.35.5 and also 2.6.36. What version 
should I test?


> Rafael
>
> ---
>  kernel/power/hibernate.c |   36 ++++++++++++++++++++++++++----------
>  kernel/power/power.h     |    1 +
>  kernel/power/user.c      |    1 +
>  3 files changed, 28 insertions(+), 10 deletions(-)
>
> Index: linux-2.6/kernel/power/hibernate.c
> ===================================================================
> --- linux-2.6.orig/kernel/power/hibernate.c
> +++ linux-2.6/kernel/power/hibernate.c
> @@ -29,6 +29,21 @@
>  #include "power.h"
>
>
> +static gfp_t saved_gfp_mask;
> +
> +static void hibernate_restrict_gfp_mask(void)
> +{
> +	saved_gfp_mask = clear_gfp_allowed_mask(GFP_IOFS);
> +}
> +
> +void hibernate_restore_gfp_mask(void)
> +{
> +	if (saved_gfp_mask) {
> +		set_gfp_allowed_mask(saved_gfp_mask);
> +		saved_gfp_mask = 0;
> +	}
> +}
> +
>  static int nocompress = 0;
>  static int noresume = 0;
>  static char resume_file[256] = CONFIG_PM_STD_PARTITION;
> @@ -327,7 +342,6 @@ static int create_image(int platform_mod
>  int hibernation_snapshot(int platform_mode)
>  {
>  	int error;
> -	gfp_t saved_mask;
>
>  	error = platform_begin(platform_mode);
>  	if (error)
> @@ -339,7 +353,7 @@ int hibernation_snapshot(int platform_mo
>  		goto Close;
>
>  	suspend_console();
> -	saved_mask = clear_gfp_allowed_mask(GFP_IOFS);
> +	hibernate_restrict_gfp_mask();
>  	error = dpm_suspend_start(PMSG_FREEZE);
>  	if (error)
>  		goto Recover_platform;
> @@ -348,7 +362,10 @@ int hibernation_snapshot(int platform_mo
>  		goto Recover_platform;
>
>  	error = create_image(platform_mode);
> -	/* Control returns here after successful restore */
> +	/*
> +	 * Control returns here (1) after the image has been created or the
> +	 * image creation has failed and (2) after a successful restore.
> +	 */
>
>   Resume_devices:
>  	/* We may need to release the preallocated image pages here. */
> @@ -357,7 +374,10 @@ int hibernation_snapshot(int platform_mo
>
>  	dpm_resume_end(in_suspend ?
>  		(error ? PMSG_RECOVER : PMSG_THAW) : PMSG_RESTORE);
> -	set_gfp_allowed_mask(saved_mask);
> +
> +	if (error || !in_suspend)
> +		hibernate_restore_gfp_mask();
> +
>  	resume_console();
>   Close:
>  	platform_end(platform_mode);
> @@ -452,17 +472,16 @@ static int resume_target_kernel(bool pla
>  int hibernation_restore(int platform_mode)
>  {
>  	int error;
> -	gfp_t saved_mask;
>
>  	pm_prepare_console();
>  	suspend_console();
> -	saved_mask = clear_gfp_allowed_mask(GFP_IOFS);
> +	hibernate_restrict_gfp_mask();
>  	error = dpm_suspend_start(PMSG_QUIESCE);
>  	if (!error) {
>  		error = resume_target_kernel(platform_mode);
>  		dpm_resume_end(PMSG_RECOVER);
>  	}
> -	set_gfp_allowed_mask(saved_mask);
> +	hibernate_restore_gfp_mask();
>  	resume_console();
>  	pm_restore_console();
>  	return error;
> @@ -476,7 +495,6 @@ int hibernation_restore(int platform_mod
>  int hibernation_platform_enter(void)
>  {
>  	int error;
> -	gfp_t saved_mask;
>
>  	if (!hibernation_ops)
>  		return -ENOSYS;
> @@ -492,7 +510,6 @@ int hibernation_platform_enter(void)
>
>  	entering_platform_hibernation = true;
>  	suspend_console();
> -	saved_mask = clear_gfp_allowed_mask(GFP_IOFS);
>  	error = dpm_suspend_start(PMSG_HIBERNATE);
>  	if (error) {
>  		if (hibernation_ops->recover)
> @@ -536,7 +553,6 @@ int hibernation_platform_enter(void)
>   Resume_devices:
>  	entering_platform_hibernation = false;
>  	dpm_resume_end(PMSG_RESTORE);
> -	set_gfp_allowed_mask(saved_mask);
>  	resume_console();
>
>   Close:
> Index: linux-2.6/kernel/power/power.h
> ===================================================================
> --- linux-2.6.orig/kernel/power/power.h
> +++ linux-2.6/kernel/power/power.h
> @@ -52,6 +52,7 @@ static inline char *check_image_kernel(s
>  extern int hibernation_snapshot(int platform_mode);
>  extern int hibernation_restore(int platform_mode);
>  extern int hibernation_platform_enter(void);
> +extern void hibernate_restore_gfp_mask(void);
>
>  #else /* !CONFIG_HIBERNATION */
>
> Index: linux-2.6/kernel/power/user.c
> ===================================================================
> --- linux-2.6.orig/kernel/power/user.c
> +++ linux-2.6/kernel/power/user.c
> @@ -263,6 +263,7 @@ static long snapshot_ioctl(struct file *
>  	case SNAPSHOT_UNFREEZE:
>  		if (!data->frozen || data->ready)
>  			break;
> +		hibernate_restore_gfp_mask();
>  		thaw_processes();
>  		usermodehelper_enable();
>  		data->frozen = 0;



-- 
Ondrej Zary

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: 2.6.35.5: hibernation broken... AGAIN
  2010-11-27 14:58               ` Ondrej Zary
@ 2010-11-27 20:47                 ` Rafael J. Wysocki
  2010-11-30 22:16                   ` Hugh Dickins
  2010-12-06 21:09                   ` Ondrej Zary
  0 siblings, 2 replies; 24+ messages in thread
From: Rafael J. Wysocki @ 2010-11-27 20:47 UTC (permalink / raw)
  To: Ondrej Zary
  Cc: Hugh Dickins, Andrew Morton, KAMEZAWA Hiroyuki, KOSAKI Motohiro,
	Kernel development list, Balbir Singh

On Saturday, November 27, 2010, Ondrej Zary wrote:
> On Saturday 27 November 2010 00:10:04 Rafael J. Wysocki wrote:
> > On Friday, November 26, 2010, Ondrej Zary wrote:
> > > On Thursday 18 November 2010, Hugh Dickins wrote:
> > > > On Wed, 17 Nov 2010, Ondrej Zary wrote:
> > > > > On Wednesday 17 November 2010 22:12:01 Rafael J. Wysocki wrote:
> > > > > > On Wednesday, November 17, 2010, Andrew Morton wrote:
> > > > > > > On Wed, 17 Nov 2010 21:53:52 +0100
> > > > > > >
> > > > > > > "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> > > > > > > > On Wednesday, November 17, 2010, Ondrej Zary wrote:
> > > > > > > > > Hello,
> > > > > > > > > the nasty memory-corrupting hibernation bug
> > > > > > > > > https://bugzilla.kernel.org/show_bug.cgi?id=15753 is back
> > > > > > > > > since 2.6.35.5. 2.6.35.4 works fine, 2.6.35.5 crashes after
> > > > > > > > > two days.
> > > >
> > > > That's distressing, for both and all of us: I'm sorry.
> > > >
> > > > > > > > > It seems to be caused by
> > > > > > > > > b77c254d8d66e5e9aa81239fedba9f3d568097d9.
> > > > > > >
> > > > > > > commit b77c254d8d66e5e9aa81239fedba9f3d568097d9
> > > > > > > Author: Hugh Dickins <hughd@google.com>
> > > > > > > Date:   Thu Sep 9 16:38:09 2010 -0700
> > > > > > >
> > > > > > >     swap: prevent reuse during hibernation
> > > >
> > > > Embarrassing: I suspect that I've been confused, not for the first
> > > > time, by the fork-like nature of hibernation and its images.
> > > > I wonder if this patch below fixes it, Ondrej?
> > > >
> > > > (And is it kernel swsusp or user swsusp that you're using?  May not
> > > > matter at all, but will help us to think more clearly about it,
> > > > if the corruption remains after this patch.)
> > > >
> > > > Rafael, do you agree that this patch was actually required even for
> > > > your original commit 452aa6999e6703ffbddd7f6ea124d3968915f3e3
> > > > mm/pm: force GFP_NOIO during suspend/hibernation and resume?
> > > >
> > > > Or am I still just as confused?  Or if not, are there more forking
> > > > places which require a similar patch?
> > > >
> > > > Not signing it off yet,
> > > > Hugh
> > >
> > > Could you please do that? The patch fixes the problem.
> >
> > Can you check if the problem is also fixed by the patch below, please?
> 
> The patch does not apply to 2.6.35.4, 2.6.35.5 and also 2.6.36. What version 
> should I test?

The patch was against the current mainline.

The one below was rebased on top of 2.6.36, so please test it with this kernel.

Thanks,
Rafael

---
 kernel/power/hibernate.c |   36 ++++++++++++++++++++++++++----------
 kernel/power/power.h     |    1 +
 kernel/power/user.c      |    2 ++
 3 files changed, 29 insertions(+), 10 deletions(-)

Index: suspend-2.6/kernel/power/hibernate.c
===================================================================
--- suspend-2.6.orig/kernel/power/hibernate.c
+++ suspend-2.6/kernel/power/hibernate.c
@@ -29,6 +29,21 @@
 #include "power.h"
 
 
+static gfp_t saved_gfp_mask;
+
+static void hibernate_restrict_gfp_mask(void)
+{
+	saved_gfp_mask = clear_gfp_allowed_mask(GFP_IOFS);
+}
+
+void hibernate_restore_gfp_mask(void)
+{
+	if (saved_gfp_mask) {
+		set_gfp_allowed_mask(saved_gfp_mask);
+		saved_gfp_mask = 0;
+	}
+}
+
 static int noresume = 0;
 static char resume_file[256] = CONFIG_PM_STD_PARTITION;
 dev_t swsusp_resume_device;
@@ -326,7 +341,6 @@ static int create_image(int platform_mod
 int hibernation_snapshot(int platform_mode)
 {
 	int error;
-	gfp_t saved_mask;
 
 	error = platform_begin(platform_mode);
 	if (error)
@@ -338,7 +352,7 @@ int hibernation_snapshot(int platform_mo
 		goto Close;
 
 	suspend_console();
-	saved_mask = clear_gfp_allowed_mask(GFP_IOFS);
+	hibernate_restrict_gfp_mask();
 	error = dpm_suspend_start(PMSG_FREEZE);
 	if (error)
 		goto Recover_platform;
@@ -347,7 +361,10 @@ int hibernation_snapshot(int platform_mo
 		goto Recover_platform;
 
 	error = create_image(platform_mode);
-	/* Control returns here after successful restore */
+	/*
+	 * Control returns here (1) after the image has been created or the
+	 * image creation has failed and (2) after a successful restore.
+	 */
 
  Resume_devices:
 	/* We may need to release the preallocated image pages here. */
@@ -356,7 +373,10 @@ int hibernation_snapshot(int platform_mo
 
 	dpm_resume_end(in_suspend ?
 		(error ? PMSG_RECOVER : PMSG_THAW) : PMSG_RESTORE);
-	set_gfp_allowed_mask(saved_mask);
+
+	if (error || !in_suspend)
+		hibernate_restore_gfp_mask();
+
 	resume_console();
  Close:
 	platform_end(platform_mode);
@@ -451,17 +471,16 @@ static int resume_target_kernel(bool pla
 int hibernation_restore(int platform_mode)
 {
 	int error;
-	gfp_t saved_mask;
 
 	pm_prepare_console();
 	suspend_console();
-	saved_mask = clear_gfp_allowed_mask(GFP_IOFS);
+	hibernate_restrict_gfp_mask();
 	error = dpm_suspend_start(PMSG_QUIESCE);
 	if (!error) {
 		error = resume_target_kernel(platform_mode);
 		dpm_resume_end(PMSG_RECOVER);
 	}
-	set_gfp_allowed_mask(saved_mask);
+	hibernate_restore_gfp_mask();
 	resume_console();
 	pm_restore_console();
 	return error;
@@ -475,7 +494,6 @@ int hibernation_restore(int platform_mod
 int hibernation_platform_enter(void)
 {
 	int error;
-	gfp_t saved_mask;
 
 	if (!hibernation_ops)
 		return -ENOSYS;
@@ -491,7 +509,6 @@ int hibernation_platform_enter(void)
 
 	entering_platform_hibernation = true;
 	suspend_console();
-	saved_mask = clear_gfp_allowed_mask(GFP_IOFS);
 	error = dpm_suspend_start(PMSG_HIBERNATE);
 	if (error) {
 		if (hibernation_ops->recover)
@@ -535,7 +552,6 @@ int hibernation_platform_enter(void)
  Resume_devices:
 	entering_platform_hibernation = false;
 	dpm_resume_end(PMSG_RESTORE);
-	set_gfp_allowed_mask(saved_mask);
 	resume_console();
 
  Close:
Index: suspend-2.6/kernel/power/power.h
===================================================================
--- suspend-2.6.orig/kernel/power/power.h
+++ suspend-2.6/kernel/power/power.h
@@ -49,6 +49,7 @@ static inline char *check_image_kernel(s
 extern int hibernation_snapshot(int platform_mode);
 extern int hibernation_restore(int platform_mode);
 extern int hibernation_platform_enter(void);
+extern void hibernate_restore_gfp_mask(void);
 #endif
 
 extern int pfn_is_nosave(unsigned long);
Index: suspend-2.6/kernel/power/user.c
===================================================================
--- suspend-2.6.orig/kernel/power/user.c
+++ suspend-2.6/kernel/power/user.c
@@ -263,6 +263,7 @@ static long snapshot_ioctl(struct file *
 	case SNAPSHOT_UNFREEZE:
 		if (!data->frozen || data->ready)
 			break;
+		hibernate_restore_gfp_mask();
 		thaw_processes();
 		usermodehelper_enable();
 		data->frozen = 0;
@@ -275,6 +276,7 @@ static long snapshot_ioctl(struct file *
 			error = -EPERM;
 			break;
 		}
+		hibernate_restore_gfp_mask();
 		error = hibernation_snapshot(data->platform_support);
 		if (!error)
 			error = put_user(in_suspend, (int __user *)arg);

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: 2.6.35.5: hibernation broken... AGAIN
  2010-11-27 20:47                 ` Rafael J. Wysocki
@ 2010-11-30 22:16                   ` Hugh Dickins
  2010-11-30 23:07                     ` Rafael J. Wysocki
  2010-12-06 21:09                   ` Ondrej Zary
  1 sibling, 1 reply; 24+ messages in thread
From: Hugh Dickins @ 2010-11-30 22:16 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Ondrej Zary, Andrew Morton, KAMEZAWA Hiroyuki, KOSAKI Motohiro,
	Kernel development list, Balbir Singh

On Sat, 27 Nov 2010, Rafael J. Wysocki wrote:
> On Saturday, November 27, 2010, Ondrej Zary wrote:
> > > > >
> > > > > Not signing it off yet,
> > > > > Hugh
> > > >
> > > > Could you please do that? The patch fixes the problem.

Sorry for going quiet for so long.

Thanks for testing, Ondrej.  I completely agree with Rafael that my
patch was inadequate: it turned out to be good enough for your useful
feedback on the main path through hibernation, but missed a number
of details.  We do need something like Rafael's patch - thanks.

(And I realize that it's tiresome and frustrating for you to be
trying first that and then this etc; but we had bugs here precisely
because this is a confusing area, so I think it is worth some effort
to get right now.)

> > >
> > > Can you check if the problem is also fixed by the patch below, please?
> > 
> > The patch does not apply to 2.6.35.4, 2.6.35.5 and also 2.6.36. What version 
> > should I test?
> 
> The patch was against the current mainline.
> 
> The one below was rebased on top of 2.6.36, so please test it with this kernel.

I'll comment on this version, since it has one more line than your original.

> 
> Thanks,
> Rafael
> 
> ---
>  kernel/power/hibernate.c |   36 ++++++++++++++++++++++++++----------
>  kernel/power/power.h     |    1 +
>  kernel/power/user.c      |    2 ++
>  3 files changed, 29 insertions(+), 10 deletions(-)
> 
> Index: suspend-2.6/kernel/power/hibernate.c
> ===================================================================
> --- suspend-2.6.orig/kernel/power/hibernate.c
> +++ suspend-2.6/kernel/power/hibernate.c
> @@ -29,6 +29,21 @@
>  #include "power.h"
>  
>  
> +static gfp_t saved_gfp_mask;
> +
> +static void hibernate_restrict_gfp_mask(void)
> +{
> +	saved_gfp_mask = clear_gfp_allowed_mask(GFP_IOFS);
> +}
> +
> +void hibernate_restore_gfp_mask(void)
> +{
> +	if (saved_gfp_mask) {
> +		set_gfp_allowed_mask(saved_gfp_mask);
> +		saved_gfp_mask = 0;
> +	}
> +}
> +

Trivial point, I suppose, but it bothers me that PM is accumulating
wrappers around wrappers around gfp_allowed_mask.  Looks like
clear_gfp_allowed_mask and set_gfp_allowed_mask (oddly asymmetrical)
were not really what we need.  How about scrapping them, and putting
pm_restrict_gfp_mask() and pm_restore_gfp_mask() into page_alloc.c?

>  static int noresume = 0;
>  static char resume_file[256] = CONFIG_PM_STD_PARTITION;
>  dev_t swsusp_resume_device;
> @@ -326,7 +341,6 @@ static int create_image(int platform_mod
>  int hibernation_snapshot(int platform_mode)
>  {
>  	int error;
> -	gfp_t saved_mask;
>  
>  	error = platform_begin(platform_mode);
>  	if (error)
> @@ -338,7 +352,7 @@ int hibernation_snapshot(int platform_mo
>  		goto Close;
>  
>  	suspend_console();
> -	saved_mask = clear_gfp_allowed_mask(GFP_IOFS);
> +	hibernate_restrict_gfp_mask();

Yes.

>  	error = dpm_suspend_start(PMSG_FREEZE);
>  	if (error)
>  		goto Recover_platform;
> @@ -347,7 +361,10 @@ int hibernation_snapshot(int platform_mo
>  		goto Recover_platform;
>  
>  	error = create_image(platform_mode);
> -	/* Control returns here after successful restore */
> +	/*
> +	 * Control returns here (1) after the image has been created or the
> +	 * image creation has failed and (2) after a successful restore.
> +	 */
>  
>   Resume_devices:
>  	/* We may need to release the preallocated image pages here. */
> @@ -356,7 +373,10 @@ int hibernation_snapshot(int platform_mo
>  
>  	dpm_resume_end(in_suspend ?
>  		(error ? PMSG_RECOVER : PMSG_THAW) : PMSG_RESTORE);
> -	set_gfp_allowed_mask(saved_mask);
> +
> +	if (error || !in_suspend)
> +		hibernate_restore_gfp_mask();
> +

I'm worried that it's hard to find and maintain the places that need
this restoration - and if we miss one, we won't find out about it for
ages.  It would help a lot if the gfp restoration always accompanies
some other essential stage - thaw_processes() looks to be right,
so could we skip conditionally restoring here if we do it then?

I suggest that in part because I cannot find where you restore in
the case that hibernate()'s swsusp_write() fails - that was the
case that made me realize my little patch was too simplistic.

>  	resume_console();
>   Close:
>  	platform_end(platform_mode);
> @@ -451,17 +471,16 @@ static int resume_target_kernel(bool pla
>  int hibernation_restore(int platform_mode)
>  {
>  	int error;
> -	gfp_t saved_mask;
>  
>  	pm_prepare_console();
>  	suspend_console();
> -	saved_mask = clear_gfp_allowed_mask(GFP_IOFS);
> +	hibernate_restrict_gfp_mask();

Yes.

>  	error = dpm_suspend_start(PMSG_QUIESCE);
>  	if (!error) {
>  		error = resume_target_kernel(platform_mode);
>  		dpm_resume_end(PMSG_RECOVER);
>  	}
> -	set_gfp_allowed_mask(saved_mask);
> +	hibernate_restore_gfp_mask();

But this could be left until software_resume()'s thaw_processes()?
Ah, that won't cover the SNAPSHOT_ATOMIC_RESTORE case, hmmm.

>  	resume_console();
>  	pm_restore_console();
>  	return error;
> @@ -475,7 +494,6 @@ int hibernation_restore(int platform_mod
>  int hibernation_platform_enter(void)
>  {
>  	int error;
> -	gfp_t saved_mask;
>  
>  	if (!hibernation_ops)
>  		return -ENOSYS;
> @@ -491,7 +509,6 @@ int hibernation_platform_enter(void)
>  
>  	entering_platform_hibernation = true;
>  	suspend_console();
> -	saved_mask = clear_gfp_allowed_mask(GFP_IOFS);

Right.

>  	error = dpm_suspend_start(PMSG_HIBERNATE);
>  	if (error) {
>  		if (hibernation_ops->recover)
> @@ -535,7 +552,6 @@ int hibernation_platform_enter(void)
>   Resume_devices:
>  	entering_platform_hibernation = false;
>  	dpm_resume_end(PMSG_RESTORE);
> -	set_gfp_allowed_mask(saved_mask);

Right.

>  	resume_console();
>  
>   Close:
> Index: suspend-2.6/kernel/power/power.h
> ===================================================================
> --- suspend-2.6.orig/kernel/power/power.h
> +++ suspend-2.6/kernel/power/power.h
> @@ -49,6 +49,7 @@ static inline char *check_image_kernel(s
>  extern int hibernation_snapshot(int platform_mode);
>  extern int hibernation_restore(int platform_mode);
>  extern int hibernation_platform_enter(void);
> +extern void hibernate_restore_gfp_mask(void);
>  #endif
>  
>  extern int pfn_is_nosave(unsigned long);
> Index: suspend-2.6/kernel/power/user.c
> ===================================================================
> --- suspend-2.6.orig/kernel/power/user.c
> +++ suspend-2.6/kernel/power/user.c
> @@ -263,6 +263,7 @@ static long snapshot_ioctl(struct file *
>  	case SNAPSHOT_UNFREEZE:
>  		if (!data->frozen || data->ready)
>  			break;
> +		hibernate_restore_gfp_mask();

Right, here you have one next to thaw_processes().
But the SNAPSHOT ioctls are a nightmarish maze to me,
I cannot comment much on what's right here.

>  		thaw_processes();
>  		usermodehelper_enable();
>  		data->frozen = 0;
> @@ -275,6 +276,7 @@ static long snapshot_ioctl(struct file *
>  			error = -EPERM;
>  			break;
>  		}
> +		hibernate_restore_gfp_mask();
>  		error = hibernation_snapshot(data->platform_support);

That might be good safety, but it does strongly suggest that you
needed to put it somewhere else, and couldn't work out where?

>  		if (!error)
>  			error = put_user(in_suspend, (int __user *)arg);
> 

Hugh

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: 2.6.35.5: hibernation broken... AGAIN
  2010-11-30 22:16                   ` Hugh Dickins
@ 2010-11-30 23:07                     ` Rafael J. Wysocki
  2010-12-01  0:38                       ` Rafael J. Wysocki
  0 siblings, 1 reply; 24+ messages in thread
From: Rafael J. Wysocki @ 2010-11-30 23:07 UTC (permalink / raw)
  To: Hugh Dickins
  Cc: Ondrej Zary, Andrew Morton, KAMEZAWA Hiroyuki, KOSAKI Motohiro,
	Kernel development list, Balbir Singh

On Tuesday, November 30, 2010, Hugh Dickins wrote:
> On Sat, 27 Nov 2010, Rafael J. Wysocki wrote:
> > On Saturday, November 27, 2010, Ondrej Zary wrote:
> > > > > >
> > > > > > Not signing it off yet,
> > > > > > Hugh
> > > > >
> > > > > Could you please do that? The patch fixes the problem.
> 
> Sorry for going quiet for so long.
> 
> Thanks for testing, Ondrej.  I completely agree with Rafael that my
> patch was inadequate: it turned out to be good enough for your useful
> feedback on the main path through hibernation, but missed a number
> of details.  We do need something like Rafael's patch - thanks.
> 
> (And I realize that it's tiresome and frustrating for you to be
> trying first that and then this etc; but we had bugs here precisely
> because this is a confusing area, so I think it is worth some effort
> to get right now.)
> 
> > > >
> > > > Can you check if the problem is also fixed by the patch below, please?
> > > 
> > > The patch does not apply to 2.6.35.4, 2.6.35.5 and also 2.6.36. What version 
> > > should I test?
> > 
> > The patch was against the current mainline.
> > 
> > The one below was rebased on top of 2.6.36, so please test it with this kernel.
> 
> I'll comment on this version, since it has one more line than your original.
> 
> > 
> > Thanks,
> > Rafael
> > 
> > ---
> >  kernel/power/hibernate.c |   36 ++++++++++++++++++++++++++----------
> >  kernel/power/power.h     |    1 +
> >  kernel/power/user.c      |    2 ++
> >  3 files changed, 29 insertions(+), 10 deletions(-)
> > 
> > Index: suspend-2.6/kernel/power/hibernate.c
> > ===================================================================
> > --- suspend-2.6.orig/kernel/power/hibernate.c
> > +++ suspend-2.6/kernel/power/hibernate.c
> > @@ -29,6 +29,21 @@
> >  #include "power.h"
> >  
> >  
> > +static gfp_t saved_gfp_mask;
> > +
> > +static void hibernate_restrict_gfp_mask(void)
> > +{
> > +	saved_gfp_mask = clear_gfp_allowed_mask(GFP_IOFS);
> > +}
> > +
> > +void hibernate_restore_gfp_mask(void)
> > +{
> > +	if (saved_gfp_mask) {
> > +		set_gfp_allowed_mask(saved_gfp_mask);
> > +		saved_gfp_mask = 0;
> > +	}
> > +}
> > +
> 
> Trivial point, I suppose, but it bothers me that PM is accumulating
> wrappers around wrappers around gfp_allowed_mask.  Looks like
> clear_gfp_allowed_mask and set_gfp_allowed_mask (oddly asymmetrical)
> were not really what we need.  How about scrapping them, and putting
> pm_restrict_gfp_mask() and pm_restore_gfp_mask() into page_alloc.c?

Sure, that sounds like a good idea indeed.

> >  static int noresume = 0;
> >  static char resume_file[256] = CONFIG_PM_STD_PARTITION;
> >  dev_t swsusp_resume_device;
> > @@ -326,7 +341,6 @@ static int create_image(int platform_mod
> >  int hibernation_snapshot(int platform_mode)
> >  {
> >  	int error;
> > -	gfp_t saved_mask;
> >  
> >  	error = platform_begin(platform_mode);
> >  	if (error)
> > @@ -338,7 +352,7 @@ int hibernation_snapshot(int platform_mo
> >  		goto Close;
> >  
> >  	suspend_console();
> > -	saved_mask = clear_gfp_allowed_mask(GFP_IOFS);
> > +	hibernate_restrict_gfp_mask();
> 
> Yes.
> 
> >  	error = dpm_suspend_start(PMSG_FREEZE);
> >  	if (error)
> >  		goto Recover_platform;
> > @@ -347,7 +361,10 @@ int hibernation_snapshot(int platform_mo
> >  		goto Recover_platform;
> >  
> >  	error = create_image(platform_mode);
> > -	/* Control returns here after successful restore */
> > +	/*
> > +	 * Control returns here (1) after the image has been created or the
> > +	 * image creation has failed and (2) after a successful restore.
> > +	 */
> >  
> >   Resume_devices:
> >  	/* We may need to release the preallocated image pages here. */
> > @@ -356,7 +373,10 @@ int hibernation_snapshot(int platform_mo
> >  
> >  	dpm_resume_end(in_suspend ?
> >  		(error ? PMSG_RECOVER : PMSG_THAW) : PMSG_RESTORE);
> > -	set_gfp_allowed_mask(saved_mask);
> > +
> > +	if (error || !in_suspend)
> > +		hibernate_restore_gfp_mask();
> > +
> 
> I'm worried that it's hard to find and maintain the places that need
> this restoration - and if we miss one, we won't find out about it for
> ages.  It would help a lot if the gfp restoration always accompanies
> some other essential stage - thaw_processes() looks to be right,
> so could we skip conditionally restoring here if we do it then?
> 
> I suggest that in part because I cannot find where you restore in
> the case that hibernate()'s swsusp_write() fails - that was the
> case that made me realize my little patch was too simplistic.

Good point.  I'm not totally sure if coupling this with thaw_processes() is
the right thing yet, but I guess it is, except for the s2disk-specific things
below.

> >  	resume_console();
> >   Close:
> >  	platform_end(platform_mode);
> > @@ -451,17 +471,16 @@ static int resume_target_kernel(bool pla
> >  int hibernation_restore(int platform_mode)
> >  {
> >  	int error;
> > -	gfp_t saved_mask;
> >  
> >  	pm_prepare_console();
> >  	suspend_console();
> > -	saved_mask = clear_gfp_allowed_mask(GFP_IOFS);
> > +	hibernate_restrict_gfp_mask();
> 
> Yes.
> 
> >  	error = dpm_suspend_start(PMSG_QUIESCE);
> >  	if (!error) {
> >  		error = resume_target_kernel(platform_mode);
> >  		dpm_resume_end(PMSG_RECOVER);
> >  	}
> > -	set_gfp_allowed_mask(saved_mask);
> > +	hibernate_restore_gfp_mask();
> 
> But this could be left until software_resume()'s thaw_processes()?
> Ah, that won't cover the SNAPSHOT_ATOMIC_RESTORE case, hmmm.

That's correct.

> >  	resume_console();
> >  	pm_restore_console();
> >  	return error;
> > @@ -475,7 +494,6 @@ int hibernation_restore(int platform_mod
> >  int hibernation_platform_enter(void)
> >  {
> >  	int error;
> > -	gfp_t saved_mask;
> >  
> >  	if (!hibernation_ops)
> >  		return -ENOSYS;
> > @@ -491,7 +509,6 @@ int hibernation_platform_enter(void)
> >  
> >  	entering_platform_hibernation = true;
> >  	suspend_console();
> > -	saved_mask = clear_gfp_allowed_mask(GFP_IOFS);
> 
> Right.
> 
> >  	error = dpm_suspend_start(PMSG_HIBERNATE);
> >  	if (error) {
> >  		if (hibernation_ops->recover)
> > @@ -535,7 +552,6 @@ int hibernation_platform_enter(void)
> >   Resume_devices:
> >  	entering_platform_hibernation = false;
> >  	dpm_resume_end(PMSG_RESTORE);
> > -	set_gfp_allowed_mask(saved_mask);
> 
> Right.
> 
> >  	resume_console();
> >  
> >   Close:
> > Index: suspend-2.6/kernel/power/power.h
> > ===================================================================
> > --- suspend-2.6.orig/kernel/power/power.h
> > +++ suspend-2.6/kernel/power/power.h
> > @@ -49,6 +49,7 @@ static inline char *check_image_kernel(s
> >  extern int hibernation_snapshot(int platform_mode);
> >  extern int hibernation_restore(int platform_mode);
> >  extern int hibernation_platform_enter(void);
> > +extern void hibernate_restore_gfp_mask(void);
> >  #endif
> >  
> >  extern int pfn_is_nosave(unsigned long);
> > Index: suspend-2.6/kernel/power/user.c
> > ===================================================================
> > --- suspend-2.6.orig/kernel/power/user.c
> > +++ suspend-2.6/kernel/power/user.c
> > @@ -263,6 +263,7 @@ static long snapshot_ioctl(struct file *
> >  	case SNAPSHOT_UNFREEZE:
> >  		if (!data->frozen || data->ready)
> >  			break;
> > +		hibernate_restore_gfp_mask();
> 
> Right, here you have one next to thaw_processes().
> But the SNAPSHOT ioctls are a nightmarish maze to me,
> I cannot comment much on what's right here.

Well, sorry about that.  A few of them should just be dropped, but they are
lingering so that oldish user space works with the new kernels.

Perhaps it's time to just drop them ...

> >  		thaw_processes();
> >  		usermodehelper_enable();
> >  		data->frozen = 0;
> > @@ -275,6 +276,7 @@ static long snapshot_ioctl(struct file *
> >  			error = -EPERM;
> >  			break;
> >  		}
> > +		hibernate_restore_gfp_mask();
> >  		error = hibernation_snapshot(data->platform_support);
> 
> That might be good safety, but it does strongly suggest that you
> needed to put it somewhere else, and couldn't work out where?

This is because of how s2disk works.  If it fails to allocate enough swap,
it tries again with minimum image size without thawing the other tasks.
Now, because hibernation_snapshot() saves the "original" GFP mask, it has
to be restored before calling that function.

> >  		if (!error)
> >  			error = put_user(in_suspend, (int __user *)arg);
> > 

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: 2.6.35.5: hibernation broken... AGAIN
  2010-11-30 23:07                     ` Rafael J. Wysocki
@ 2010-12-01  0:38                       ` Rafael J. Wysocki
  2010-12-01  0:53                         ` KAMEZAWA Hiroyuki
  2010-12-01  1:21                         ` Hugh Dickins
  0 siblings, 2 replies; 24+ messages in thread
From: Rafael J. Wysocki @ 2010-12-01  0:38 UTC (permalink / raw)
  To: Hugh Dickins
  Cc: Ondrej Zary, Andrew Morton, KAMEZAWA Hiroyuki, KOSAKI Motohiro,
	Kernel development list, Balbir Singh

On Wednesday, December 01, 2010, Rafael J. Wysocki wrote:
> On Tuesday, November 30, 2010, Hugh Dickins wrote:
> > On Sat, 27 Nov 2010, Rafael J. Wysocki wrote:
> > > On Saturday, November 27, 2010, Ondrej Zary wrote:
...
> > 
> > Trivial point, I suppose, but it bothers me that PM is accumulating
> > wrappers around wrappers around gfp_allowed_mask.  Looks like
> > clear_gfp_allowed_mask and set_gfp_allowed_mask (oddly asymmetrical)
> > were not really what we need.  How about scrapping them, and putting
> > pm_restrict_gfp_mask() and pm_restore_gfp_mask() into page_alloc.c?
> 
> Sure, that sounds like a good idea indeed.

Below is an updated patch in which I tried to address your comments.

I didn't find it very useful to couple pm_restore_gfp_mask() with the thawing
of tasks, but nevertheless I think all of the spots where it's needed are
covered now.

The patch has only been compile-tested for now, so caveat emptor.

Thanks,
Rafael

---
 include/linux/gfp.h      |    4 ++--
 kernel/power/hibernate.c |   22 ++++++++++++----------
 kernel/power/suspend.c   |    5 ++---
 kernel/power/user.c      |    2 ++
 mm/page_alloc.c          |   18 +++++++++++-------
 5 files changed, 29 insertions(+), 22 deletions(-)

Index: linux-2.6/kernel/power/hibernate.c
===================================================================
--- linux-2.6.orig/kernel/power/hibernate.c
+++ linux-2.6/kernel/power/hibernate.c
@@ -327,7 +327,6 @@ static int create_image(int platform_mod
 int hibernation_snapshot(int platform_mode)
 {
 	int error;
-	gfp_t saved_mask;
 
 	error = platform_begin(platform_mode);
 	if (error)
@@ -339,7 +338,7 @@ int hibernation_snapshot(int platform_mo
 		goto Close;
 
 	suspend_console();
-	saved_mask = clear_gfp_allowed_mask(GFP_IOFS);
+	pm_restrict_gfp_mask();
 	error = dpm_suspend_start(PMSG_FREEZE);
 	if (error)
 		goto Recover_platform;
@@ -348,7 +347,10 @@ int hibernation_snapshot(int platform_mo
 		goto Recover_platform;
 
 	error = create_image(platform_mode);
-	/* Control returns here after successful restore */
+	/*
+	 * Control returns here (1) after the image has been created or the
+	 * image creation has failed and (2) after a successful restore.
+	 */
 
  Resume_devices:
 	/* We may need to release the preallocated image pages here. */
@@ -357,7 +359,10 @@ int hibernation_snapshot(int platform_mo
 
 	dpm_resume_end(in_suspend ?
 		(error ? PMSG_RECOVER : PMSG_THAW) : PMSG_RESTORE);
-	set_gfp_allowed_mask(saved_mask);
+
+	if (error || !in_suspend)
+		pm_restore_gfp_mask();
+
 	resume_console();
  Close:
 	platform_end(platform_mode);
@@ -452,17 +457,16 @@ static int resume_target_kernel(bool pla
 int hibernation_restore(int platform_mode)
 {
 	int error;
-	gfp_t saved_mask;
 
 	pm_prepare_console();
 	suspend_console();
-	saved_mask = clear_gfp_allowed_mask(GFP_IOFS);
+	pm_restrict_gfp_mask();
 	error = dpm_suspend_start(PMSG_QUIESCE);
 	if (!error) {
 		error = resume_target_kernel(platform_mode);
 		dpm_resume_end(PMSG_RECOVER);
 	}
-	set_gfp_allowed_mask(saved_mask);
+	pm_restore_gfp_mask();
 	resume_console();
 	pm_restore_console();
 	return error;
@@ -476,7 +480,6 @@ int hibernation_restore(int platform_mod
 int hibernation_platform_enter(void)
 {
 	int error;
-	gfp_t saved_mask;
 
 	if (!hibernation_ops)
 		return -ENOSYS;
@@ -492,7 +495,6 @@ int hibernation_platform_enter(void)
 
 	entering_platform_hibernation = true;
 	suspend_console();
-	saved_mask = clear_gfp_allowed_mask(GFP_IOFS);
 	error = dpm_suspend_start(PMSG_HIBERNATE);
 	if (error) {
 		if (hibernation_ops->recover)
@@ -536,7 +538,6 @@ int hibernation_platform_enter(void)
  Resume_devices:
 	entering_platform_hibernation = false;
 	dpm_resume_end(PMSG_RESTORE);
-	set_gfp_allowed_mask(saved_mask);
 	resume_console();
 
  Close:
@@ -647,6 +648,7 @@ int hibernate(void)
 		if (!error)
 			power_down();
 		in_suspend = 0;
+		pm_restore_gfp_mask();
 	} else {
 		pr_debug("PM: Image restored successfully.\n");
 	}
Index: linux-2.6/kernel/power/user.c
===================================================================
--- linux-2.6.orig/kernel/power/user.c
+++ linux-2.6/kernel/power/user.c
@@ -263,6 +263,7 @@ static long snapshot_ioctl(struct file *
 	case SNAPSHOT_UNFREEZE:
 		if (!data->frozen || data->ready)
 			break;
+		pm_restore_gfp_mask();
 		thaw_processes();
 		usermodehelper_enable();
 		data->frozen = 0;
@@ -275,6 +276,7 @@ static long snapshot_ioctl(struct file *
 			error = -EPERM;
 			break;
 		}
+		pm_restore_gfp_mask();
 		error = hibernation_snapshot(data->platform_support);
 		if (!error)
 			error = put_user(in_suspend, (int __user *)arg);
Index: linux-2.6/mm/page_alloc.c
===================================================================
--- linux-2.6.orig/mm/page_alloc.c
+++ linux-2.6/mm/page_alloc.c
@@ -104,19 +104,23 @@ gfp_t gfp_allowed_mask __read_mostly = G
  * only be modified with pm_mutex held, unless the suspend/hibernate code is
  * guaranteed not to run in parallel with that modification).
  */
-void set_gfp_allowed_mask(gfp_t mask)
+
+static gfp_t saved_gfp_mask;
+
+void pm_restore_gfp_mask(void)
 {
 	WARN_ON(!mutex_is_locked(&pm_mutex));
-	gfp_allowed_mask = mask;
+	if (saved_gfp_mask) {
+		gfp_allowed_mask = saved_gfp_mask;
+		saved_gfp_mask = 0;
+	}
 }
 
-gfp_t clear_gfp_allowed_mask(gfp_t mask)
+void pm_restrict_gfp_mask(void)
 {
-	gfp_t ret = gfp_allowed_mask;
-
 	WARN_ON(!mutex_is_locked(&pm_mutex));
-	gfp_allowed_mask &= ~mask;
-	return ret;
+	saved_gfp_mask = gfp_allowed_mask;
+	gfp_allowed_mask &= ~GFP_IOFS;
 }
 #endif /* CONFIG_PM_SLEEP */
 
Index: linux-2.6/include/linux/gfp.h
===================================================================
--- linux-2.6.orig/include/linux/gfp.h
+++ linux-2.6/include/linux/gfp.h
@@ -360,7 +360,7 @@ void drain_local_pages(void *dummy);
 
 extern gfp_t gfp_allowed_mask;
 
-extern void set_gfp_allowed_mask(gfp_t mask);
-extern gfp_t clear_gfp_allowed_mask(gfp_t mask);
+extern void pm_restrict_gfp_mask(void);
+extern void pm_restore_gfp_mask(void);
 
 #endif /* __LINUX_GFP_H */
Index: linux-2.6/kernel/power/suspend.c
===================================================================
--- linux-2.6.orig/kernel/power/suspend.c
+++ linux-2.6/kernel/power/suspend.c
@@ -197,7 +197,6 @@ static int suspend_enter(suspend_state_t
 int suspend_devices_and_enter(suspend_state_t state)
 {
 	int error;
-	gfp_t saved_mask;
 
 	if (!suspend_ops)
 		return -ENOSYS;
@@ -208,7 +207,7 @@ int suspend_devices_and_enter(suspend_st
 			goto Close;
 	}
 	suspend_console();
-	saved_mask = clear_gfp_allowed_mask(GFP_IOFS);
+	pm_restrict_gfp_mask();
 	suspend_test_start();
 	error = dpm_suspend_start(PMSG_SUSPEND);
 	if (error) {
@@ -225,7 +224,7 @@ int suspend_devices_and_enter(suspend_st
 	suspend_test_start();
 	dpm_resume_end(PMSG_RESUME);
 	suspend_test_finish("resume devices");
-	set_gfp_allowed_mask(saved_mask);
+	pm_restore_gfp_mask();
 	resume_console();
  Close:
 	if (suspend_ops->end)

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: 2.6.35.5: hibernation broken... AGAIN
  2010-12-01  0:38                       ` Rafael J. Wysocki
@ 2010-12-01  0:53                         ` KAMEZAWA Hiroyuki
  2010-12-01 21:25                           ` Rafael J. Wysocki
  2010-12-01  1:21                         ` Hugh Dickins
  1 sibling, 1 reply; 24+ messages in thread
From: KAMEZAWA Hiroyuki @ 2010-12-01  0:53 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Hugh Dickins, Ondrej Zary, Andrew Morton, KOSAKI Motohiro,
	Kernel development list, Balbir Singh

On Wed, 1 Dec 2010 01:38:53 +0100
"Rafael J. Wysocki" <rjw@sisk.pl> wrote:

> On Wednesday, December 01, 2010, Rafael J. Wysocki wrote:
> > On Tuesday, November 30, 2010, Hugh Dickins wrote:
> > > On Sat, 27 Nov 2010, Rafael J. Wysocki wrote:
> > > > On Saturday, November 27, 2010, Ondrej Zary wrote:
> ...
> > > 
> > > Trivial point, I suppose, but it bothers me that PM is accumulating
> > > wrappers around wrappers around gfp_allowed_mask.  Looks like
> > > clear_gfp_allowed_mask and set_gfp_allowed_mask (oddly asymmetrical)
> > > were not really what we need.  How about scrapping them, and putting
> > > pm_restrict_gfp_mask() and pm_restore_gfp_mask() into page_alloc.c?
> > 
> > Sure, that sounds like a good idea indeed.
> 
> Below is an updated patch in which I tried to address your comments.
> 
> I didn't find it very useful to couple pm_restore_gfp_mask() with the thawing
> of tasks, but nevertheless I think all of the spots where it's needed are
> covered now.
> 
> The patch has only been compile-tested for now, so caveat emptor.
> 

Hmm, can't we have some error check as 

> +static gfp_t saved_gfp_mask;

atomic_t gfp_mask_save_mode_counter;

> +
> +void pm_restore_gfp_mask(void)
>  {
>  	WARN_ON(!mutex_is_locked(&pm_mutex));
> -	gfp_allowed_mask = mask;

	if (atomic_dec_return(&gfp_mask_save_mode_counter))
		WARN_ONCE()

> +	if (saved_gfp_mask) {
> +		gfp_allowed_mask = saved_gfp_mask;
> +		saved_gfp_mask = 0;
> +	}
>  }

> +void pm_restrict_gfp_mask(void)
>  {
> -	gfp_t ret = gfp_allowed_mask;
> -
>  	WARN_ON(!mutex_is_locked(&pm_mutex));
> -	gfp_allowed_mask &= ~mask;
> -	return ret;
> +	saved_gfp_mask = gfp_allowed_mask;
> +	gfp_allowed_mask &= ~GFP_IOFS;

	if (atomic_inc_return(&gfp_mask_save_mode_counter) > 1)
		WARN_ONCE()

or some ?

Thanks,
-Kame


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: 2.6.35.5: hibernation broken... AGAIN
  2010-12-01  0:38                       ` Rafael J. Wysocki
  2010-12-01  0:53                         ` KAMEZAWA Hiroyuki
@ 2010-12-01  1:21                         ` Hugh Dickins
  2010-12-01 20:57                           ` Rafael J. Wysocki
  1 sibling, 1 reply; 24+ messages in thread
From: Hugh Dickins @ 2010-12-01  1:21 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Ondrej Zary, Andrew Morton, KAMEZAWA Hiroyuki, KOSAKI Motohiro,
	Kernel development list, Balbir Singh

On Wed, 1 Dec 2010, Rafael J. Wysocki wrote:
> Below is an updated patch in which I tried to address your comments.
> 
> I didn't find it very useful to couple pm_restore_gfp_mask() with the thawing
> of tasks, but nevertheless I think all of the spots where it's needed are
> covered now.

So far as I can see, yes, they are: I'm still a bit worried about
future changes leaving the wrong gfp mask behind by mistake;
but I trust you more than I trust me on that, so you can add my
Acked-by: Hugh Dickins <hughd@google.com> when it comes to
sending in the patch.

Thanks,
Hugh

> 
> The patch has only been compile-tested for now, so caveat emptor.
> 
> Thanks,
> Rafael
> 
> ---
>  include/linux/gfp.h      |    4 ++--
>  kernel/power/hibernate.c |   22 ++++++++++++----------
>  kernel/power/suspend.c   |    5 ++---
>  kernel/power/user.c      |    2 ++
>  mm/page_alloc.c          |   18 +++++++++++-------
>  5 files changed, 29 insertions(+), 22 deletions(-)
> 
> Index: linux-2.6/kernel/power/hibernate.c
> ===================================================================
> --- linux-2.6.orig/kernel/power/hibernate.c
> +++ linux-2.6/kernel/power/hibernate.c
> @@ -327,7 +327,6 @@ static int create_image(int platform_mod
>  int hibernation_snapshot(int platform_mode)
>  {
>  	int error;
> -	gfp_t saved_mask;
>  
>  	error = platform_begin(platform_mode);
>  	if (error)
> @@ -339,7 +338,7 @@ int hibernation_snapshot(int platform_mo
>  		goto Close;
>  
>  	suspend_console();
> -	saved_mask = clear_gfp_allowed_mask(GFP_IOFS);
> +	pm_restrict_gfp_mask();
>  	error = dpm_suspend_start(PMSG_FREEZE);
>  	if (error)
>  		goto Recover_platform;
> @@ -348,7 +347,10 @@ int hibernation_snapshot(int platform_mo
>  		goto Recover_platform;
>  
>  	error = create_image(platform_mode);
> -	/* Control returns here after successful restore */
> +	/*
> +	 * Control returns here (1) after the image has been created or the
> +	 * image creation has failed and (2) after a successful restore.
> +	 */
>  
>   Resume_devices:
>  	/* We may need to release the preallocated image pages here. */
> @@ -357,7 +359,10 @@ int hibernation_snapshot(int platform_mo
>  
>  	dpm_resume_end(in_suspend ?
>  		(error ? PMSG_RECOVER : PMSG_THAW) : PMSG_RESTORE);
> -	set_gfp_allowed_mask(saved_mask);
> +
> +	if (error || !in_suspend)
> +		pm_restore_gfp_mask();
> +
>  	resume_console();
>   Close:
>  	platform_end(platform_mode);
> @@ -452,17 +457,16 @@ static int resume_target_kernel(bool pla
>  int hibernation_restore(int platform_mode)
>  {
>  	int error;
> -	gfp_t saved_mask;
>  
>  	pm_prepare_console();
>  	suspend_console();
> -	saved_mask = clear_gfp_allowed_mask(GFP_IOFS);
> +	pm_restrict_gfp_mask();
>  	error = dpm_suspend_start(PMSG_QUIESCE);
>  	if (!error) {
>  		error = resume_target_kernel(platform_mode);
>  		dpm_resume_end(PMSG_RECOVER);
>  	}
> -	set_gfp_allowed_mask(saved_mask);
> +	pm_restore_gfp_mask();
>  	resume_console();
>  	pm_restore_console();
>  	return error;
> @@ -476,7 +480,6 @@ int hibernation_restore(int platform_mod
>  int hibernation_platform_enter(void)
>  {
>  	int error;
> -	gfp_t saved_mask;
>  
>  	if (!hibernation_ops)
>  		return -ENOSYS;
> @@ -492,7 +495,6 @@ int hibernation_platform_enter(void)
>  
>  	entering_platform_hibernation = true;
>  	suspend_console();
> -	saved_mask = clear_gfp_allowed_mask(GFP_IOFS);
>  	error = dpm_suspend_start(PMSG_HIBERNATE);
>  	if (error) {
>  		if (hibernation_ops->recover)
> @@ -536,7 +538,6 @@ int hibernation_platform_enter(void)
>   Resume_devices:
>  	entering_platform_hibernation = false;
>  	dpm_resume_end(PMSG_RESTORE);
> -	set_gfp_allowed_mask(saved_mask);
>  	resume_console();
>  
>   Close:
> @@ -647,6 +648,7 @@ int hibernate(void)
>  		if (!error)
>  			power_down();
>  		in_suspend = 0;
> +		pm_restore_gfp_mask();
>  	} else {
>  		pr_debug("PM: Image restored successfully.\n");
>  	}
> Index: linux-2.6/kernel/power/user.c
> ===================================================================
> --- linux-2.6.orig/kernel/power/user.c
> +++ linux-2.6/kernel/power/user.c
> @@ -263,6 +263,7 @@ static long snapshot_ioctl(struct file *
>  	case SNAPSHOT_UNFREEZE:
>  		if (!data->frozen || data->ready)
>  			break;
> +		pm_restore_gfp_mask();
>  		thaw_processes();
>  		usermodehelper_enable();
>  		data->frozen = 0;
> @@ -275,6 +276,7 @@ static long snapshot_ioctl(struct file *
>  			error = -EPERM;
>  			break;
>  		}
> +		pm_restore_gfp_mask();
>  		error = hibernation_snapshot(data->platform_support);
>  		if (!error)
>  			error = put_user(in_suspend, (int __user *)arg);
> Index: linux-2.6/mm/page_alloc.c
> ===================================================================
> --- linux-2.6.orig/mm/page_alloc.c
> +++ linux-2.6/mm/page_alloc.c
> @@ -104,19 +104,23 @@ gfp_t gfp_allowed_mask __read_mostly = G
>   * only be modified with pm_mutex held, unless the suspend/hibernate code is
>   * guaranteed not to run in parallel with that modification).
>   */
> -void set_gfp_allowed_mask(gfp_t mask)
> +
> +static gfp_t saved_gfp_mask;
> +
> +void pm_restore_gfp_mask(void)
>  {
>  	WARN_ON(!mutex_is_locked(&pm_mutex));
> -	gfp_allowed_mask = mask;
> +	if (saved_gfp_mask) {
> +		gfp_allowed_mask = saved_gfp_mask;
> +		saved_gfp_mask = 0;
> +	}
>  }
>  
> -gfp_t clear_gfp_allowed_mask(gfp_t mask)
> +void pm_restrict_gfp_mask(void)
>  {
> -	gfp_t ret = gfp_allowed_mask;
> -
>  	WARN_ON(!mutex_is_locked(&pm_mutex));
> -	gfp_allowed_mask &= ~mask;
> -	return ret;
> +	saved_gfp_mask = gfp_allowed_mask;
> +	gfp_allowed_mask &= ~GFP_IOFS;
>  }
>  #endif /* CONFIG_PM_SLEEP */
>  
> Index: linux-2.6/include/linux/gfp.h
> ===================================================================
> --- linux-2.6.orig/include/linux/gfp.h
> +++ linux-2.6/include/linux/gfp.h
> @@ -360,7 +360,7 @@ void drain_local_pages(void *dummy);
>  
>  extern gfp_t gfp_allowed_mask;
>  
> -extern void set_gfp_allowed_mask(gfp_t mask);
> -extern gfp_t clear_gfp_allowed_mask(gfp_t mask);
> +extern void pm_restrict_gfp_mask(void);
> +extern void pm_restore_gfp_mask(void);
>  
>  #endif /* __LINUX_GFP_H */
> Index: linux-2.6/kernel/power/suspend.c
> ===================================================================
> --- linux-2.6.orig/kernel/power/suspend.c
> +++ linux-2.6/kernel/power/suspend.c
> @@ -197,7 +197,6 @@ static int suspend_enter(suspend_state_t
>  int suspend_devices_and_enter(suspend_state_t state)
>  {
>  	int error;
> -	gfp_t saved_mask;
>  
>  	if (!suspend_ops)
>  		return -ENOSYS;
> @@ -208,7 +207,7 @@ int suspend_devices_and_enter(suspend_st
>  			goto Close;
>  	}
>  	suspend_console();
> -	saved_mask = clear_gfp_allowed_mask(GFP_IOFS);
> +	pm_restrict_gfp_mask();
>  	suspend_test_start();
>  	error = dpm_suspend_start(PMSG_SUSPEND);
>  	if (error) {
> @@ -225,7 +224,7 @@ int suspend_devices_and_enter(suspend_st
>  	suspend_test_start();
>  	dpm_resume_end(PMSG_RESUME);
>  	suspend_test_finish("resume devices");
> -	set_gfp_allowed_mask(saved_mask);
> +	pm_restore_gfp_mask();
>  	resume_console();
>   Close:
>  	if (suspend_ops->end)
> 

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: 2.6.35.5: hibernation broken... AGAIN
  2010-12-01  1:21                         ` Hugh Dickins
@ 2010-12-01 20:57                           ` Rafael J. Wysocki
  0 siblings, 0 replies; 24+ messages in thread
From: Rafael J. Wysocki @ 2010-12-01 20:57 UTC (permalink / raw)
  To: Hugh Dickins, Andrew Morton
  Cc: Ondrej Zary, KAMEZAWA Hiroyuki, KOSAKI Motohiro,
	Kernel development list, Balbir Singh

On Wednesday, December 01, 2010, Hugh Dickins wrote:
> On Wed, 1 Dec 2010, Rafael J. Wysocki wrote:
> > Below is an updated patch in which I tried to address your comments.
> > 
> > I didn't find it very useful to couple pm_restore_gfp_mask() with the thawing
> > of tasks, but nevertheless I think all of the spots where it's needed are
> > covered now.
> 
> So far as I can see, yes, they are: I'm still a bit worried about
> future changes leaving the wrong gfp mask behind by mistake;
> but I trust you more than I trust me on that, so you can add my
> Acked-by: Hugh Dickins <hughd@google.com> when it comes to
> sending in the patch.

Thanks a lot!

Andrew, are the changes in page_alloc.c fine from your standpoint?

Rafael


> > The patch has only been compile-tested for now, so caveat emptor.
> > 
> > Thanks,
> > Rafael
> > 
> > ---
> >  include/linux/gfp.h      |    4 ++--
> >  kernel/power/hibernate.c |   22 ++++++++++++----------
> >  kernel/power/suspend.c   |    5 ++---
> >  kernel/power/user.c      |    2 ++
> >  mm/page_alloc.c          |   18 +++++++++++-------
> >  5 files changed, 29 insertions(+), 22 deletions(-)
> > 
> > Index: linux-2.6/kernel/power/hibernate.c
> > ===================================================================
> > --- linux-2.6.orig/kernel/power/hibernate.c
> > +++ linux-2.6/kernel/power/hibernate.c
> > @@ -327,7 +327,6 @@ static int create_image(int platform_mod
> >  int hibernation_snapshot(int platform_mode)
> >  {
> >  	int error;
> > -	gfp_t saved_mask;
> >  
> >  	error = platform_begin(platform_mode);
> >  	if (error)
> > @@ -339,7 +338,7 @@ int hibernation_snapshot(int platform_mo
> >  		goto Close;
> >  
> >  	suspend_console();
> > -	saved_mask = clear_gfp_allowed_mask(GFP_IOFS);
> > +	pm_restrict_gfp_mask();
> >  	error = dpm_suspend_start(PMSG_FREEZE);
> >  	if (error)
> >  		goto Recover_platform;
> > @@ -348,7 +347,10 @@ int hibernation_snapshot(int platform_mo
> >  		goto Recover_platform;
> >  
> >  	error = create_image(platform_mode);
> > -	/* Control returns here after successful restore */
> > +	/*
> > +	 * Control returns here (1) after the image has been created or the
> > +	 * image creation has failed and (2) after a successful restore.
> > +	 */
> >  
> >   Resume_devices:
> >  	/* We may need to release the preallocated image pages here. */
> > @@ -357,7 +359,10 @@ int hibernation_snapshot(int platform_mo
> >  
> >  	dpm_resume_end(in_suspend ?
> >  		(error ? PMSG_RECOVER : PMSG_THAW) : PMSG_RESTORE);
> > -	set_gfp_allowed_mask(saved_mask);
> > +
> > +	if (error || !in_suspend)
> > +		pm_restore_gfp_mask();
> > +
> >  	resume_console();
> >   Close:
> >  	platform_end(platform_mode);
> > @@ -452,17 +457,16 @@ static int resume_target_kernel(bool pla
> >  int hibernation_restore(int platform_mode)
> >  {
> >  	int error;
> > -	gfp_t saved_mask;
> >  
> >  	pm_prepare_console();
> >  	suspend_console();
> > -	saved_mask = clear_gfp_allowed_mask(GFP_IOFS);
> > +	pm_restrict_gfp_mask();
> >  	error = dpm_suspend_start(PMSG_QUIESCE);
> >  	if (!error) {
> >  		error = resume_target_kernel(platform_mode);
> >  		dpm_resume_end(PMSG_RECOVER);
> >  	}
> > -	set_gfp_allowed_mask(saved_mask);
> > +	pm_restore_gfp_mask();
> >  	resume_console();
> >  	pm_restore_console();
> >  	return error;
> > @@ -476,7 +480,6 @@ int hibernation_restore(int platform_mod
> >  int hibernation_platform_enter(void)
> >  {
> >  	int error;
> > -	gfp_t saved_mask;
> >  
> >  	if (!hibernation_ops)
> >  		return -ENOSYS;
> > @@ -492,7 +495,6 @@ int hibernation_platform_enter(void)
> >  
> >  	entering_platform_hibernation = true;
> >  	suspend_console();
> > -	saved_mask = clear_gfp_allowed_mask(GFP_IOFS);
> >  	error = dpm_suspend_start(PMSG_HIBERNATE);
> >  	if (error) {
> >  		if (hibernation_ops->recover)
> > @@ -536,7 +538,6 @@ int hibernation_platform_enter(void)
> >   Resume_devices:
> >  	entering_platform_hibernation = false;
> >  	dpm_resume_end(PMSG_RESTORE);
> > -	set_gfp_allowed_mask(saved_mask);
> >  	resume_console();
> >  
> >   Close:
> > @@ -647,6 +648,7 @@ int hibernate(void)
> >  		if (!error)
> >  			power_down();
> >  		in_suspend = 0;
> > +		pm_restore_gfp_mask();
> >  	} else {
> >  		pr_debug("PM: Image restored successfully.\n");
> >  	}
> > Index: linux-2.6/kernel/power/user.c
> > ===================================================================
> > --- linux-2.6.orig/kernel/power/user.c
> > +++ linux-2.6/kernel/power/user.c
> > @@ -263,6 +263,7 @@ static long snapshot_ioctl(struct file *
> >  	case SNAPSHOT_UNFREEZE:
> >  		if (!data->frozen || data->ready)
> >  			break;
> > +		pm_restore_gfp_mask();
> >  		thaw_processes();
> >  		usermodehelper_enable();
> >  		data->frozen = 0;
> > @@ -275,6 +276,7 @@ static long snapshot_ioctl(struct file *
> >  			error = -EPERM;
> >  			break;
> >  		}
> > +		pm_restore_gfp_mask();
> >  		error = hibernation_snapshot(data->platform_support);
> >  		if (!error)
> >  			error = put_user(in_suspend, (int __user *)arg);
> > Index: linux-2.6/mm/page_alloc.c
> > ===================================================================
> > --- linux-2.6.orig/mm/page_alloc.c
> > +++ linux-2.6/mm/page_alloc.c
> > @@ -104,19 +104,23 @@ gfp_t gfp_allowed_mask __read_mostly = G
> >   * only be modified with pm_mutex held, unless the suspend/hibernate code is
> >   * guaranteed not to run in parallel with that modification).
> >   */
> > -void set_gfp_allowed_mask(gfp_t mask)
> > +
> > +static gfp_t saved_gfp_mask;
> > +
> > +void pm_restore_gfp_mask(void)
> >  {
> >  	WARN_ON(!mutex_is_locked(&pm_mutex));
> > -	gfp_allowed_mask = mask;
> > +	if (saved_gfp_mask) {
> > +		gfp_allowed_mask = saved_gfp_mask;
> > +		saved_gfp_mask = 0;
> > +	}
> >  }
> >  
> > -gfp_t clear_gfp_allowed_mask(gfp_t mask)
> > +void pm_restrict_gfp_mask(void)
> >  {
> > -	gfp_t ret = gfp_allowed_mask;
> > -
> >  	WARN_ON(!mutex_is_locked(&pm_mutex));
> > -	gfp_allowed_mask &= ~mask;
> > -	return ret;
> > +	saved_gfp_mask = gfp_allowed_mask;
> > +	gfp_allowed_mask &= ~GFP_IOFS;
> >  }
> >  #endif /* CONFIG_PM_SLEEP */
> >  
> > Index: linux-2.6/include/linux/gfp.h
> > ===================================================================
> > --- linux-2.6.orig/include/linux/gfp.h
> > +++ linux-2.6/include/linux/gfp.h
> > @@ -360,7 +360,7 @@ void drain_local_pages(void *dummy);
> >  
> >  extern gfp_t gfp_allowed_mask;
> >  
> > -extern void set_gfp_allowed_mask(gfp_t mask);
> > -extern gfp_t clear_gfp_allowed_mask(gfp_t mask);
> > +extern void pm_restrict_gfp_mask(void);
> > +extern void pm_restore_gfp_mask(void);
> >  
> >  #endif /* __LINUX_GFP_H */
> > Index: linux-2.6/kernel/power/suspend.c
> > ===================================================================
> > --- linux-2.6.orig/kernel/power/suspend.c
> > +++ linux-2.6/kernel/power/suspend.c
> > @@ -197,7 +197,6 @@ static int suspend_enter(suspend_state_t
> >  int suspend_devices_and_enter(suspend_state_t state)
> >  {
> >  	int error;
> > -	gfp_t saved_mask;
> >  
> >  	if (!suspend_ops)
> >  		return -ENOSYS;
> > @@ -208,7 +207,7 @@ int suspend_devices_and_enter(suspend_st
> >  			goto Close;
> >  	}
> >  	suspend_console();
> > -	saved_mask = clear_gfp_allowed_mask(GFP_IOFS);
> > +	pm_restrict_gfp_mask();
> >  	suspend_test_start();
> >  	error = dpm_suspend_start(PMSG_SUSPEND);
> >  	if (error) {
> > @@ -225,7 +224,7 @@ int suspend_devices_and_enter(suspend_st
> >  	suspend_test_start();
> >  	dpm_resume_end(PMSG_RESUME);
> >  	suspend_test_finish("resume devices");
> > -	set_gfp_allowed_mask(saved_mask);
> > +	pm_restore_gfp_mask();
> >  	resume_console();
> >   Close:
> >  	if (suspend_ops->end)
> > 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
> 
> 


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: 2.6.35.5: hibernation broken... AGAIN
  2010-12-01  0:53                         ` KAMEZAWA Hiroyuki
@ 2010-12-01 21:25                           ` Rafael J. Wysocki
  2010-12-01 22:23                             ` Rafael J. Wysocki
  0 siblings, 1 reply; 24+ messages in thread
From: Rafael J. Wysocki @ 2010-12-01 21:25 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: Hugh Dickins, Ondrej Zary, Andrew Morton, KOSAKI Motohiro,
	Kernel development list, Balbir Singh

On Wednesday, December 01, 2010, KAMEZAWA Hiroyuki wrote:
> On Wed, 1 Dec 2010 01:38:53 +0100
> "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> 
> > On Wednesday, December 01, 2010, Rafael J. Wysocki wrote:
> > > On Tuesday, November 30, 2010, Hugh Dickins wrote:
> > > > On Sat, 27 Nov 2010, Rafael J. Wysocki wrote:
> > > > > On Saturday, November 27, 2010, Ondrej Zary wrote:
> > ...
> > > > 
> > > > Trivial point, I suppose, but it bothers me that PM is accumulating
> > > > wrappers around wrappers around gfp_allowed_mask.  Looks like
> > > > clear_gfp_allowed_mask and set_gfp_allowed_mask (oddly asymmetrical)
> > > > were not really what we need.  How about scrapping them, and putting
> > > > pm_restrict_gfp_mask() and pm_restore_gfp_mask() into page_alloc.c?
> > > 
> > > Sure, that sounds like a good idea indeed.
> > 
> > Below is an updated patch in which I tried to address your comments.
> > 
> > I didn't find it very useful to couple pm_restore_gfp_mask() with the thawing
> > of tasks, but nevertheless I think all of the spots where it's needed are
> > covered now.
> > 
> > The patch has only been compile-tested for now, so caveat emptor.
> > 
> 
> Hmm, can't we have some error check as 
> 
> > +static gfp_t saved_gfp_mask;
> 
> atomic_t gfp_mask_save_mode_counter;
> 
> > +
> > +void pm_restore_gfp_mask(void)
> >  {
> >  	WARN_ON(!mutex_is_locked(&pm_mutex));
> > -	gfp_allowed_mask = mask;
> 
> 	if (atomic_dec_return(&gfp_mask_save_mode_counter))
> 		WARN_ONCE()
> 
> > +	if (saved_gfp_mask) {
> > +		gfp_allowed_mask = saved_gfp_mask;
> > +		saved_gfp_mask = 0;
> > +	}
> >  }
> 
> > +void pm_restrict_gfp_mask(void)
> >  {
> > -	gfp_t ret = gfp_allowed_mask;
> > -
> >  	WARN_ON(!mutex_is_locked(&pm_mutex));
> > -	gfp_allowed_mask &= ~mask;
> > -	return ret;
> > +	saved_gfp_mask = gfp_allowed_mask;
> > +	gfp_allowed_mask &= ~GFP_IOFS;
> 
> 	if (atomic_inc_return(&gfp_mask_save_mode_counter) > 1)
> 		WARN_ONCE()
> 
> or some ?

What exactly would that be useful for?

Rafael

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: 2.6.35.5: hibernation broken... AGAIN
  2010-12-01 21:25                           ` Rafael J. Wysocki
@ 2010-12-01 22:23                             ` Rafael J. Wysocki
  2010-12-01 23:37                               ` KAMEZAWA Hiroyuki
  0 siblings, 1 reply; 24+ messages in thread
From: Rafael J. Wysocki @ 2010-12-01 22:23 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: Hugh Dickins, Ondrej Zary, Andrew Morton, KOSAKI Motohiro,
	Kernel development list, Balbir Singh

On Wednesday, December 01, 2010, Rafael J. Wysocki wrote:
> On Wednesday, December 01, 2010, KAMEZAWA Hiroyuki wrote:
> > On Wed, 1 Dec 2010 01:38:53 +0100
> > "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> > 
> > > On Wednesday, December 01, 2010, Rafael J. Wysocki wrote:
> > > > On Tuesday, November 30, 2010, Hugh Dickins wrote:
> > > > > On Sat, 27 Nov 2010, Rafael J. Wysocki wrote:
> > > > > > On Saturday, November 27, 2010, Ondrej Zary wrote:
> > > ...
> > > > > 
> > > > > Trivial point, I suppose, but it bothers me that PM is accumulating
> > > > > wrappers around wrappers around gfp_allowed_mask.  Looks like
> > > > > clear_gfp_allowed_mask and set_gfp_allowed_mask (oddly asymmetrical)
> > > > > were not really what we need.  How about scrapping them, and putting
> > > > > pm_restrict_gfp_mask() and pm_restore_gfp_mask() into page_alloc.c?
> > > > 
> > > > Sure, that sounds like a good idea indeed.
> > > 
> > > Below is an updated patch in which I tried to address your comments.
> > > 
> > > I didn't find it very useful to couple pm_restore_gfp_mask() with the thawing
> > > of tasks, but nevertheless I think all of the spots where it's needed are
> > > covered now.
> > > 
> > > The patch has only been compile-tested for now, so caveat emptor.
> > > 
> > 
> > Hmm, can't we have some error check as 
> > 
> > > +static gfp_t saved_gfp_mask;
> > 
> > atomic_t gfp_mask_save_mode_counter;
> > 
> > > +
> > > +void pm_restore_gfp_mask(void)
> > >  {
> > >  	WARN_ON(!mutex_is_locked(&pm_mutex));
> > > -	gfp_allowed_mask = mask;
> > 
> > 	if (atomic_dec_return(&gfp_mask_save_mode_counter))
> > 		WARN_ONCE()
> > 
> > > +	if (saved_gfp_mask) {
> > > +		gfp_allowed_mask = saved_gfp_mask;
> > > +		saved_gfp_mask = 0;
> > > +	}
> > >  }
> > 
> > > +void pm_restrict_gfp_mask(void)
> > >  {
> > > -	gfp_t ret = gfp_allowed_mask;
> > > -
> > >  	WARN_ON(!mutex_is_locked(&pm_mutex));
> > > -	gfp_allowed_mask &= ~mask;
> > > -	return ret;
> > > +	saved_gfp_mask = gfp_allowed_mask;
> > > +	gfp_allowed_mask &= ~GFP_IOFS;
> > 
> > 	if (atomic_inc_return(&gfp_mask_save_mode_counter) > 1)
> > 		WARN_ONCE()
> > 
> > or some ?
> 
> What exactly would that be useful for?

Please note that pm_restore_gfp_mask() can be legitimately called before
pm_restrict_gfp_mask() via the SNAPSHOT_CREATE_IMAGE hibernate ioctl, so the
test you're suggesting wouldn't really work.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: 2.6.35.5: hibernation broken... AGAIN
  2010-12-01 22:23                             ` Rafael J. Wysocki
@ 2010-12-01 23:37                               ` KAMEZAWA Hiroyuki
  2010-12-02  0:00                                 ` Rafael J. Wysocki
  0 siblings, 1 reply; 24+ messages in thread
From: KAMEZAWA Hiroyuki @ 2010-12-01 23:37 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Hugh Dickins, Ondrej Zary, Andrew Morton, KOSAKI Motohiro,
	Kernel development list, Balbir Singh

On Wed, 1 Dec 2010 23:23:31 +0100
"Rafael J. Wysocki" <rjw@sisk.pl> wrote:

> On Wednesday, December 01, 2010, Rafael J. Wysocki wrote:
> > On Wednesday, December 01, 2010, KAMEZAWA Hiroyuki wrote:
> > > On Wed, 1 Dec 2010 01:38:53 +0100
> > > "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> > > 
> > > > On Wednesday, December 01, 2010, Rafael J. Wysocki wrote:
> > > > > On Tuesday, November 30, 2010, Hugh Dickins wrote:
> > > > > > On Sat, 27 Nov 2010, Rafael J. Wysocki wrote:
> > > > > > > On Saturday, November 27, 2010, Ondrej Zary wrote:
> > > > ...
> > > > > > 
> > > > > > Trivial point, I suppose, but it bothers me that PM is accumulating
> > > > > > wrappers around wrappers around gfp_allowed_mask.  Looks like
> > > > > > clear_gfp_allowed_mask and set_gfp_allowed_mask (oddly asymmetrical)
> > > > > > were not really what we need.  How about scrapping them, and putting
> > > > > > pm_restrict_gfp_mask() and pm_restore_gfp_mask() into page_alloc.c?
> > > > > 
> > > > > Sure, that sounds like a good idea indeed.
> > > > 
> > > > Below is an updated patch in which I tried to address your comments.
> > > > 
> > > > I didn't find it very useful to couple pm_restore_gfp_mask() with the thawing
> > > > of tasks, but nevertheless I think all of the spots where it's needed are
> > > > covered now.
> > > > 
> > > > The patch has only been compile-tested for now, so caveat emptor.
> > > > 
> > > 
> > > Hmm, can't we have some error check as 
> > > 
> > > > +static gfp_t saved_gfp_mask;
> > > 
> > > atomic_t gfp_mask_save_mode_counter;
> > > 
> > > > +
> > > > +void pm_restore_gfp_mask(void)
> > > >  {
> > > >  	WARN_ON(!mutex_is_locked(&pm_mutex));
> > > > -	gfp_allowed_mask = mask;
> > > 
> > > 	if (atomic_dec_return(&gfp_mask_save_mode_counter))
> > > 		WARN_ONCE()
> > > 
> > > > +	if (saved_gfp_mask) {
> > > > +		gfp_allowed_mask = saved_gfp_mask;
> > > > +		saved_gfp_mask = 0;
> > > > +	}
> > > >  }
> > > 
> > > > +void pm_restrict_gfp_mask(void)
> > > >  {
> > > > -	gfp_t ret = gfp_allowed_mask;
> > > > -
> > > >  	WARN_ON(!mutex_is_locked(&pm_mutex));
> > > > -	gfp_allowed_mask &= ~mask;
> > > > -	return ret;
> > > > +	saved_gfp_mask = gfp_allowed_mask;
> > > > +	gfp_allowed_mask &= ~GFP_IOFS;
> > > 
> > > 	if (atomic_inc_return(&gfp_mask_save_mode_counter) > 1)
> > > 		WARN_ONCE()
> > > 
> > > or some ?
> > 
> > What exactly would that be useful for?
> 
> Please note that pm_restore_gfp_mask() can be legitimately called before
> pm_restrict_gfp_mask() via the SNAPSHOT_CREATE_IMAGE hibernate ioctl, so the
> test you're suggesting wouldn't really work.
> 

Hm, I just wonder some tests not for breaking gfp_allowed_mask by
 - double call
 - forget to restore
That will be fatal.

I'm not very interested in implementation detail

Thanks,
-Kame




^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: 2.6.35.5: hibernation broken... AGAIN
  2010-12-01 23:37                               ` KAMEZAWA Hiroyuki
@ 2010-12-02  0:00                                 ` Rafael J. Wysocki
  0 siblings, 0 replies; 24+ messages in thread
From: Rafael J. Wysocki @ 2010-12-02  0:00 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: Hugh Dickins, Ondrej Zary, Andrew Morton, KOSAKI Motohiro,
	Kernel development list, Balbir Singh

On Thursday, December 02, 2010, KAMEZAWA Hiroyuki wrote:
> On Wed, 1 Dec 2010 23:23:31 +0100
> "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> 
> > On Wednesday, December 01, 2010, Rafael J. Wysocki wrote:
> > > On Wednesday, December 01, 2010, KAMEZAWA Hiroyuki wrote:
> > > > On Wed, 1 Dec 2010 01:38:53 +0100
> > > > "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> > > > 
> > > > > On Wednesday, December 01, 2010, Rafael J. Wysocki wrote:
> > > > > > On Tuesday, November 30, 2010, Hugh Dickins wrote:
> > > > > > > On Sat, 27 Nov 2010, Rafael J. Wysocki wrote:
> > > > > > > > On Saturday, November 27, 2010, Ondrej Zary wrote:
> > > > > ...
> > > > > > > 
> > > > > > > Trivial point, I suppose, but it bothers me that PM is accumulating
> > > > > > > wrappers around wrappers around gfp_allowed_mask.  Looks like
> > > > > > > clear_gfp_allowed_mask and set_gfp_allowed_mask (oddly asymmetrical)
> > > > > > > were not really what we need.  How about scrapping them, and putting
> > > > > > > pm_restrict_gfp_mask() and pm_restore_gfp_mask() into page_alloc.c?
> > > > > > 
> > > > > > Sure, that sounds like a good idea indeed.
> > > > > 
> > > > > Below is an updated patch in which I tried to address your comments.
> > > > > 
> > > > > I didn't find it very useful to couple pm_restore_gfp_mask() with the thawing
> > > > > of tasks, but nevertheless I think all of the spots where it's needed are
> > > > > covered now.
> > > > > 
> > > > > The patch has only been compile-tested for now, so caveat emptor.
> > > > > 
> > > > 
> > > > Hmm, can't we have some error check as 
> > > > 
> > > > > +static gfp_t saved_gfp_mask;
> > > > 
> > > > atomic_t gfp_mask_save_mode_counter;
> > > > 
> > > > > +
> > > > > +void pm_restore_gfp_mask(void)
> > > > >  {
> > > > >  	WARN_ON(!mutex_is_locked(&pm_mutex));
> > > > > -	gfp_allowed_mask = mask;
> > > > 
> > > > 	if (atomic_dec_return(&gfp_mask_save_mode_counter))
> > > > 		WARN_ONCE()
> > > > 
> > > > > +	if (saved_gfp_mask) {
> > > > > +		gfp_allowed_mask = saved_gfp_mask;
> > > > > +		saved_gfp_mask = 0;
> > > > > +	}
> > > > >  }
> > > > 
> > > > > +void pm_restrict_gfp_mask(void)
> > > > >  {
> > > > > -	gfp_t ret = gfp_allowed_mask;
> > > > > -
> > > > >  	WARN_ON(!mutex_is_locked(&pm_mutex));
> > > > > -	gfp_allowed_mask &= ~mask;
> > > > > -	return ret;
> > > > > +	saved_gfp_mask = gfp_allowed_mask;
> > > > > +	gfp_allowed_mask &= ~GFP_IOFS;
> > > > 
> > > > 	if (atomic_inc_return(&gfp_mask_save_mode_counter) > 1)
> > > > 		WARN_ONCE()
> > > > 
> > > > or some ?
> > > 
> > > What exactly would that be useful for?
> > 
> > Please note that pm_restore_gfp_mask() can be legitimately called before
> > pm_restrict_gfp_mask() via the SNAPSHOT_CREATE_IMAGE hibernate ioctl, so the
> > test you're suggesting wouldn't really work.
> > 
> 
> Hm, I just wonder some tests not for breaking gfp_allowed_mask by
>  - double call

That's easy.  It's sufficient to do WARN_ON(saved_gfp_mask) at the beginning
of pm_restrict_gfp_mask().  This also takes care of the "forget to restore"
case to some extent, because pm_restrict_gfp_mask() will generally be only
called twice in a row if there's a missing pm_restore_gfp_mask() in between.

>  - forget to restore

This is kind of difficult in general.  You can't detect a missing
pm_restore_gfp_mask() other than by checking if pm_restrict_gfp_mask() is not
called twice in a row IMHO.

> That will be fatal.

Yes, it will.

> I'm not very interested in implementation detail

Well ...

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: 2.6.35.5: hibernation broken... AGAIN
  2010-11-27 20:47                 ` Rafael J. Wysocki
  2010-11-30 22:16                   ` Hugh Dickins
@ 2010-12-06 21:09                   ` Ondrej Zary
  2010-12-06 22:57                     ` Rafael J. Wysocki
  1 sibling, 1 reply; 24+ messages in thread
From: Ondrej Zary @ 2010-12-06 21:09 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Hugh Dickins, Andrew Morton, KAMEZAWA Hiroyuki, KOSAKI Motohiro,
	Kernel development list, Balbir Singh

On Saturday 27 November 2010 21:47:01 Rafael J. Wysocki wrote:
> On Saturday, November 27, 2010, Ondrej Zary wrote:
> > On Saturday 27 November 2010 00:10:04 Rafael J. Wysocki wrote:
> > > On Friday, November 26, 2010, Ondrej Zary wrote:
> > > > On Thursday 18 November 2010, Hugh Dickins wrote:
> > > > > On Wed, 17 Nov 2010, Ondrej Zary wrote:
> > > > > > On Wednesday 17 November 2010 22:12:01 Rafael J. Wysocki wrote:
> > > > > > > On Wednesday, November 17, 2010, Andrew Morton wrote:
> > > > > > > > On Wed, 17 Nov 2010 21:53:52 +0100
> > > > > > > >
> > > > > > > > "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> > > > > > > > > On Wednesday, November 17, 2010, Ondrej Zary wrote:
> > > > > > > > > > Hello,
> > > > > > > > > > the nasty memory-corrupting hibernation bug
> > > > > > > > > > https://bugzilla.kernel.org/show_bug.cgi?id=15753 is back
> > > > > > > > > > since 2.6.35.5. 2.6.35.4 works fine, 2.6.35.5 crashes
> > > > > > > > > > after two days.
> > > > >
> > > > > That's distressing, for both and all of us: I'm sorry.
> > > > >
> > > > > > > > > > It seems to be caused by
> > > > > > > > > > b77c254d8d66e5e9aa81239fedba9f3d568097d9.
> > > > > > > >
> > > > > > > > commit b77c254d8d66e5e9aa81239fedba9f3d568097d9
> > > > > > > > Author: Hugh Dickins <hughd@google.com>
> > > > > > > > Date:   Thu Sep 9 16:38:09 2010 -0700
> > > > > > > >
> > > > > > > >     swap: prevent reuse during hibernation
> > > > >
> > > > > Embarrassing: I suspect that I've been confused, not for the first
> > > > > time, by the fork-like nature of hibernation and its images.
> > > > > I wonder if this patch below fixes it, Ondrej?
> > > > >
> > > > > (And is it kernel swsusp or user swsusp that you're using?  May not
> > > > > matter at all, but will help us to think more clearly about it,
> > > > > if the corruption remains after this patch.)
> > > > >
> > > > > Rafael, do you agree that this patch was actually required even for
> > > > > your original commit 452aa6999e6703ffbddd7f6ea124d3968915f3e3
> > > > > mm/pm: force GFP_NOIO during suspend/hibernation and resume?
> > > > >
> > > > > Or am I still just as confused?  Or if not, are there more forking
> > > > > places which require a similar patch?
> > > > >
> > > > > Not signing it off yet,
> > > > > Hugh
> > > >
> > > > Could you please do that? The patch fixes the problem.
> > >
> > > Can you check if the problem is also fixed by the patch below, please?
> >
> > The patch does not apply to 2.6.35.4, 2.6.35.5 and also 2.6.36. What
> > version should I test?
>
> The patch was against the current mainline.
>
> The one below was rebased on top of 2.6.36, so please test it with this
> kernel.

This seems to work fine (5 days and no crash).


> Thanks,
> Rafael
>
> ---
>  kernel/power/hibernate.c |   36 ++++++++++++++++++++++++++----------
>  kernel/power/power.h     |    1 +
>  kernel/power/user.c      |    2 ++
>  3 files changed, 29 insertions(+), 10 deletions(-)
>
> Index: suspend-2.6/kernel/power/hibernate.c
> ===================================================================
> --- suspend-2.6.orig/kernel/power/hibernate.c
> +++ suspend-2.6/kernel/power/hibernate.c
> @@ -29,6 +29,21 @@
>  #include "power.h"
>
>
> +static gfp_t saved_gfp_mask;
> +
> +static void hibernate_restrict_gfp_mask(void)
> +{
> +	saved_gfp_mask = clear_gfp_allowed_mask(GFP_IOFS);
> +}
> +
> +void hibernate_restore_gfp_mask(void)
> +{
> +	if (saved_gfp_mask) {
> +		set_gfp_allowed_mask(saved_gfp_mask);
> +		saved_gfp_mask = 0;
> +	}
> +}
> +
>  static int noresume = 0;
>  static char resume_file[256] = CONFIG_PM_STD_PARTITION;
>  dev_t swsusp_resume_device;
> @@ -326,7 +341,6 @@ static int create_image(int platform_mod
>  int hibernation_snapshot(int platform_mode)
>  {
>  	int error;
> -	gfp_t saved_mask;
>
>  	error = platform_begin(platform_mode);
>  	if (error)
> @@ -338,7 +352,7 @@ int hibernation_snapshot(int platform_mo
>  		goto Close;
>
>  	suspend_console();
> -	saved_mask = clear_gfp_allowed_mask(GFP_IOFS);
> +	hibernate_restrict_gfp_mask();
>  	error = dpm_suspend_start(PMSG_FREEZE);
>  	if (error)
>  		goto Recover_platform;
> @@ -347,7 +361,10 @@ int hibernation_snapshot(int platform_mo
>  		goto Recover_platform;
>
>  	error = create_image(platform_mode);
> -	/* Control returns here after successful restore */
> +	/*
> +	 * Control returns here (1) after the image has been created or the
> +	 * image creation has failed and (2) after a successful restore.
> +	 */
>
>   Resume_devices:
>  	/* We may need to release the preallocated image pages here. */
> @@ -356,7 +373,10 @@ int hibernation_snapshot(int platform_mo
>
>  	dpm_resume_end(in_suspend ?
>  		(error ? PMSG_RECOVER : PMSG_THAW) : PMSG_RESTORE);
> -	set_gfp_allowed_mask(saved_mask);
> +
> +	if (error || !in_suspend)
> +		hibernate_restore_gfp_mask();
> +
>  	resume_console();
>   Close:
>  	platform_end(platform_mode);
> @@ -451,17 +471,16 @@ static int resume_target_kernel(bool pla
>  int hibernation_restore(int platform_mode)
>  {
>  	int error;
> -	gfp_t saved_mask;
>
>  	pm_prepare_console();
>  	suspend_console();
> -	saved_mask = clear_gfp_allowed_mask(GFP_IOFS);
> +	hibernate_restrict_gfp_mask();
>  	error = dpm_suspend_start(PMSG_QUIESCE);
>  	if (!error) {
>  		error = resume_target_kernel(platform_mode);
>  		dpm_resume_end(PMSG_RECOVER);
>  	}
> -	set_gfp_allowed_mask(saved_mask);
> +	hibernate_restore_gfp_mask();
>  	resume_console();
>  	pm_restore_console();
>  	return error;
> @@ -475,7 +494,6 @@ int hibernation_restore(int platform_mod
>  int hibernation_platform_enter(void)
>  {
>  	int error;
> -	gfp_t saved_mask;
>
>  	if (!hibernation_ops)
>  		return -ENOSYS;
> @@ -491,7 +509,6 @@ int hibernation_platform_enter(void)
>
>  	entering_platform_hibernation = true;
>  	suspend_console();
> -	saved_mask = clear_gfp_allowed_mask(GFP_IOFS);
>  	error = dpm_suspend_start(PMSG_HIBERNATE);
>  	if (error) {
>  		if (hibernation_ops->recover)
> @@ -535,7 +552,6 @@ int hibernation_platform_enter(void)
>   Resume_devices:
>  	entering_platform_hibernation = false;
>  	dpm_resume_end(PMSG_RESTORE);
> -	set_gfp_allowed_mask(saved_mask);
>  	resume_console();
>
>   Close:
> Index: suspend-2.6/kernel/power/power.h
> ===================================================================
> --- suspend-2.6.orig/kernel/power/power.h
> +++ suspend-2.6/kernel/power/power.h
> @@ -49,6 +49,7 @@ static inline char *check_image_kernel(s
>  extern int hibernation_snapshot(int platform_mode);
>  extern int hibernation_restore(int platform_mode);
>  extern int hibernation_platform_enter(void);
> +extern void hibernate_restore_gfp_mask(void);
>  #endif
>
>  extern int pfn_is_nosave(unsigned long);
> Index: suspend-2.6/kernel/power/user.c
> ===================================================================
> --- suspend-2.6.orig/kernel/power/user.c
> +++ suspend-2.6/kernel/power/user.c
> @@ -263,6 +263,7 @@ static long snapshot_ioctl(struct file *
>  	case SNAPSHOT_UNFREEZE:
>  		if (!data->frozen || data->ready)
>  			break;
> +		hibernate_restore_gfp_mask();
>  		thaw_processes();
>  		usermodehelper_enable();
>  		data->frozen = 0;
> @@ -275,6 +276,7 @@ static long snapshot_ioctl(struct file *
>  			error = -EPERM;
>  			break;
>  		}
> +		hibernate_restore_gfp_mask();
>  		error = hibernation_snapshot(data->platform_support);
>  		if (!error)
>  			error = put_user(in_suspend, (int __user *)arg);



-- 
Ondrej Zary

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: 2.6.35.5: hibernation broken... AGAIN
  2010-12-06 21:09                   ` Ondrej Zary
@ 2010-12-06 22:57                     ` Rafael J. Wysocki
  0 siblings, 0 replies; 24+ messages in thread
From: Rafael J. Wysocki @ 2010-12-06 22:57 UTC (permalink / raw)
  To: Ondrej Zary
  Cc: Hugh Dickins, Andrew Morton, KAMEZAWA Hiroyuki, KOSAKI Motohiro,
	Kernel development list, Balbir Singh

On Monday, December 06, 2010, Ondrej Zary wrote:
> On Saturday 27 November 2010 21:47:01 Rafael J. Wysocki wrote:
> > On Saturday, November 27, 2010, Ondrej Zary wrote:
> > > On Saturday 27 November 2010 00:10:04 Rafael J. Wysocki wrote:
> > > > On Friday, November 26, 2010, Ondrej Zary wrote:
> > > > > On Thursday 18 November 2010, Hugh Dickins wrote:
> > > > > > On Wed, 17 Nov 2010, Ondrej Zary wrote:
> > > > > > > On Wednesday 17 November 2010 22:12:01 Rafael J. Wysocki wrote:
> > > > > > > > On Wednesday, November 17, 2010, Andrew Morton wrote:
> > > > > > > > > On Wed, 17 Nov 2010 21:53:52 +0100
> > > > > > > > >
> > > > > > > > > "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> > > > > > > > > > On Wednesday, November 17, 2010, Ondrej Zary wrote:
> > > > > > > > > > > Hello,
> > > > > > > > > > > the nasty memory-corrupting hibernation bug
> > > > > > > > > > > https://bugzilla.kernel.org/show_bug.cgi?id=15753 is back
> > > > > > > > > > > since 2.6.35.5. 2.6.35.4 works fine, 2.6.35.5 crashes
> > > > > > > > > > > after two days.
> > > > > >
> > > > > > That's distressing, for both and all of us: I'm sorry.
> > > > > >
> > > > > > > > > > > It seems to be caused by
> > > > > > > > > > > b77c254d8d66e5e9aa81239fedba9f3d568097d9.
> > > > > > > > >
> > > > > > > > > commit b77c254d8d66e5e9aa81239fedba9f3d568097d9
> > > > > > > > > Author: Hugh Dickins <hughd@google.com>
> > > > > > > > > Date:   Thu Sep 9 16:38:09 2010 -0700
> > > > > > > > >
> > > > > > > > >     swap: prevent reuse during hibernation
> > > > > >
> > > > > > Embarrassing: I suspect that I've been confused, not for the first
> > > > > > time, by the fork-like nature of hibernation and its images.
> > > > > > I wonder if this patch below fixes it, Ondrej?
> > > > > >
> > > > > > (And is it kernel swsusp or user swsusp that you're using?  May not
> > > > > > matter at all, but will help us to think more clearly about it,
> > > > > > if the corruption remains after this patch.)
> > > > > >
> > > > > > Rafael, do you agree that this patch was actually required even for
> > > > > > your original commit 452aa6999e6703ffbddd7f6ea124d3968915f3e3
> > > > > > mm/pm: force GFP_NOIO during suspend/hibernation and resume?
> > > > > >
> > > > > > Or am I still just as confused?  Or if not, are there more forking
> > > > > > places which require a similar patch?
> > > > > >
> > > > > > Not signing it off yet,
> > > > > > Hugh
> > > > >
> > > > > Could you please do that? The patch fixes the problem.
> > > >
> > > > Can you check if the problem is also fixed by the patch below, please?
> > >
> > > The patch does not apply to 2.6.35.4, 2.6.35.5 and also 2.6.36. What
> > > version should I test?
> >
> > The patch was against the current mainline.
> >
> > The one below was rebased on top of 2.6.36, so please test it with this
> > kernel.
> 
> This seems to work fine (5 days and no crash).

Thanks for testing!

Rafael


> > ---
> >  kernel/power/hibernate.c |   36 ++++++++++++++++++++++++++----------
> >  kernel/power/power.h     |    1 +
> >  kernel/power/user.c      |    2 ++
> >  3 files changed, 29 insertions(+), 10 deletions(-)
> >
> > Index: suspend-2.6/kernel/power/hibernate.c
> > ===================================================================
> > --- suspend-2.6.orig/kernel/power/hibernate.c
> > +++ suspend-2.6/kernel/power/hibernate.c
> > @@ -29,6 +29,21 @@
> >  #include "power.h"
> >
> >
> > +static gfp_t saved_gfp_mask;
> > +
> > +static void hibernate_restrict_gfp_mask(void)
> > +{
> > +	saved_gfp_mask = clear_gfp_allowed_mask(GFP_IOFS);
> > +}
> > +
> > +void hibernate_restore_gfp_mask(void)
> > +{
> > +	if (saved_gfp_mask) {
> > +		set_gfp_allowed_mask(saved_gfp_mask);
> > +		saved_gfp_mask = 0;
> > +	}
> > +}
> > +
> >  static int noresume = 0;
> >  static char resume_file[256] = CONFIG_PM_STD_PARTITION;
> >  dev_t swsusp_resume_device;
> > @@ -326,7 +341,6 @@ static int create_image(int platform_mod
> >  int hibernation_snapshot(int platform_mode)
> >  {
> >  	int error;
> > -	gfp_t saved_mask;
> >
> >  	error = platform_begin(platform_mode);
> >  	if (error)
> > @@ -338,7 +352,7 @@ int hibernation_snapshot(int platform_mo
> >  		goto Close;
> >
> >  	suspend_console();
> > -	saved_mask = clear_gfp_allowed_mask(GFP_IOFS);
> > +	hibernate_restrict_gfp_mask();
> >  	error = dpm_suspend_start(PMSG_FREEZE);
> >  	if (error)
> >  		goto Recover_platform;
> > @@ -347,7 +361,10 @@ int hibernation_snapshot(int platform_mo
> >  		goto Recover_platform;
> >
> >  	error = create_image(platform_mode);
> > -	/* Control returns here after successful restore */
> > +	/*
> > +	 * Control returns here (1) after the image has been created or the
> > +	 * image creation has failed and (2) after a successful restore.
> > +	 */
> >
> >   Resume_devices:
> >  	/* We may need to release the preallocated image pages here. */
> > @@ -356,7 +373,10 @@ int hibernation_snapshot(int platform_mo
> >
> >  	dpm_resume_end(in_suspend ?
> >  		(error ? PMSG_RECOVER : PMSG_THAW) : PMSG_RESTORE);
> > -	set_gfp_allowed_mask(saved_mask);
> > +
> > +	if (error || !in_suspend)
> > +		hibernate_restore_gfp_mask();
> > +
> >  	resume_console();
> >   Close:
> >  	platform_end(platform_mode);
> > @@ -451,17 +471,16 @@ static int resume_target_kernel(bool pla
> >  int hibernation_restore(int platform_mode)
> >  {
> >  	int error;
> > -	gfp_t saved_mask;
> >
> >  	pm_prepare_console();
> >  	suspend_console();
> > -	saved_mask = clear_gfp_allowed_mask(GFP_IOFS);
> > +	hibernate_restrict_gfp_mask();
> >  	error = dpm_suspend_start(PMSG_QUIESCE);
> >  	if (!error) {
> >  		error = resume_target_kernel(platform_mode);
> >  		dpm_resume_end(PMSG_RECOVER);
> >  	}
> > -	set_gfp_allowed_mask(saved_mask);
> > +	hibernate_restore_gfp_mask();
> >  	resume_console();
> >  	pm_restore_console();
> >  	return error;
> > @@ -475,7 +494,6 @@ int hibernation_restore(int platform_mod
> >  int hibernation_platform_enter(void)
> >  {
> >  	int error;
> > -	gfp_t saved_mask;
> >
> >  	if (!hibernation_ops)
> >  		return -ENOSYS;
> > @@ -491,7 +509,6 @@ int hibernation_platform_enter(void)
> >
> >  	entering_platform_hibernation = true;
> >  	suspend_console();
> > -	saved_mask = clear_gfp_allowed_mask(GFP_IOFS);
> >  	error = dpm_suspend_start(PMSG_HIBERNATE);
> >  	if (error) {
> >  		if (hibernation_ops->recover)
> > @@ -535,7 +552,6 @@ int hibernation_platform_enter(void)
> >   Resume_devices:
> >  	entering_platform_hibernation = false;
> >  	dpm_resume_end(PMSG_RESTORE);
> > -	set_gfp_allowed_mask(saved_mask);
> >  	resume_console();
> >
> >   Close:
> > Index: suspend-2.6/kernel/power/power.h
> > ===================================================================
> > --- suspend-2.6.orig/kernel/power/power.h
> > +++ suspend-2.6/kernel/power/power.h
> > @@ -49,6 +49,7 @@ static inline char *check_image_kernel(s
> >  extern int hibernation_snapshot(int platform_mode);
> >  extern int hibernation_restore(int platform_mode);
> >  extern int hibernation_platform_enter(void);
> > +extern void hibernate_restore_gfp_mask(void);
> >  #endif
> >
> >  extern int pfn_is_nosave(unsigned long);
> > Index: suspend-2.6/kernel/power/user.c
> > ===================================================================
> > --- suspend-2.6.orig/kernel/power/user.c
> > +++ suspend-2.6/kernel/power/user.c
> > @@ -263,6 +263,7 @@ static long snapshot_ioctl(struct file *
> >  	case SNAPSHOT_UNFREEZE:
> >  		if (!data->frozen || data->ready)
> >  			break;
> > +		hibernate_restore_gfp_mask();
> >  		thaw_processes();
> >  		usermodehelper_enable();
> >  		data->frozen = 0;
> > @@ -275,6 +276,7 @@ static long snapshot_ioctl(struct file *
> >  			error = -EPERM;
> >  			break;
> >  		}
> > +		hibernate_restore_gfp_mask();
> >  		error = hibernation_snapshot(data->platform_support);
> >  		if (!error)
> >  			error = put_user(in_suspend, (int __user *)arg);
> 
> 
> 
> 


^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2010-12-06 22:58 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-11-17 10:39 2.6.35.5: hibernation broken... AGAIN Ondrej Zary
2010-11-17 20:53 ` Rafael J. Wysocki
2010-11-17 21:04   ` Andrew Morton
2010-11-17 21:12     ` Rafael J. Wysocki
2010-11-17 21:18       ` Ondrej Zary
2010-11-17 23:46         ` Hugh Dickins
2010-11-20 15:07           ` Ondrej Zary
2010-11-26 11:43           ` Ondrej Zary
2010-11-26 23:10             ` Rafael J. Wysocki
2010-11-27 14:58               ` Ondrej Zary
2010-11-27 20:47                 ` Rafael J. Wysocki
2010-11-30 22:16                   ` Hugh Dickins
2010-11-30 23:07                     ` Rafael J. Wysocki
2010-12-01  0:38                       ` Rafael J. Wysocki
2010-12-01  0:53                         ` KAMEZAWA Hiroyuki
2010-12-01 21:25                           ` Rafael J. Wysocki
2010-12-01 22:23                             ` Rafael J. Wysocki
2010-12-01 23:37                               ` KAMEZAWA Hiroyuki
2010-12-02  0:00                                 ` Rafael J. Wysocki
2010-12-01  1:21                         ` Hugh Dickins
2010-12-01 20:57                           ` Rafael J. Wysocki
2010-12-06 21:09                   ` Ondrej Zary
2010-12-06 22:57                     ` Rafael J. Wysocki
2010-11-26 20:24           ` Rafael J. Wysocki

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).