From: Alan Jenkins <sourcejedi.lkml@googlemail.com>
To: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>, Mel Gorman <mel@csn.ul.ie>,
	hugh.dickins@tiscali.co.uk, Pavel Machek <pavel@ucw.cz>,
	pm list <linux-pm@lists.linux-foundation.org>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	Kernel Testers List <kernel-testers@vger.kernel.org>,
	Linux MM <linux-mm@kvack.org>
Subject: Re: s2disk hang update
Date: Wed, 24 Feb 2010 20:19:32 +0000
Message-ID: <9b2b86521002241219v648458c1gad1c18b0c3e7ca83@mail.gmail.com>
In-Reply-To: <20100224102037.2cca4f83.kamezawa.hiroyu@jp.fujitsu.com>

On 2/24/10, KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> wrote:
> On Tue, 23 Feb 2010 22:13:56 +0100
> "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
>
>> Well, it still looks like we're waiting for create_workqueue_thread() to
>> return, which probably is trying to allocate memory for the thread
>> structure.
>>
>> My guess is that the preallocated memory pages freed by
>> free_unnecessary_pages() go into a place from where they cannot be taken
>> for subsequent NOIO allocations.  I have no idea why that happens though.
>>
>> To test that theory you can try to change GFP_IOFS to GFP_KERNEL in the
>> calls to clear_gfp_allowed_mask() in kernel/power/hibernate.c (and in
>> kernel/power/suspend.c for completeness).
>>
>
> If allocation of kernel threads for stop_machine_run() is the problem,
>
> What happens when
> 1. use CONFIG_4KSTACKS

Interesting question.  4KSTACK doesn't stop it though; it hangs in the
same place.

> or
> 2. make use of stop_machine_create(), stop_machine_destroy().
>    A new interface added by this commit.
>   http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=9ea09af3bd3090e8349ca2899ca2011bd94cda85
>    You can do no-fail stop_machine_run().
>
> Thanks,
> -Kame

Since this is a uni-processor machine that would make it a single 4K
allocation.  AIUI this is supposed to be ok.  The hibernation code
tries to make sure there is over 1000x that much free RAM (ish), in
anticipation of this sort of requirement.
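
For readers following along, the idea behind a create/destroy pair
around stop_machine can be sketched in user space.  This is only a
model under stated assumptions: every sm_model_* name is hypothetical,
the 4K-per-CPU figure is taken from the discussion above, and none of
this is the kernel's actual code.  The point it illustrates is that
all allocation happens in the create step, so the later run step has
nothing left to allocate.

```c
#include <stddef.h>
#include <stdlib.h>

/* User-space model only; hypothetical stand-ins for the kernel interface. */
struct sm_model {
    int refcount;   /* number of outstanding create() calls */
    void *workers;  /* stands in for the preallocated worker state */
};

/* The only step that allocates, and therefore the only one that can
 * fail for lack of memory. */
static int sm_model_create(struct sm_model *sm, size_t ncpus)
{
    if (sm->refcount == 0) {
        sm->workers = calloc(ncpus, 4096); /* assumed 4K per CPU */
        if (!sm->workers)
            return -1;
    }
    sm->refcount++;
    return 0;
}

/* Allocates nothing: it relies on what create() already reserved. */
static int sm_model_run(struct sm_model *sm, int (*fn)(void *), void *data)
{
    if (sm->refcount == 0)
        return -1; /* caller did not call create() first */
    return fn(data);
}

/* Drop one reference; release the reservation when the last one goes. */
static void sm_model_destroy(struct sm_model *sm)
{
    if (sm->refcount > 0 && --sm->refcount == 0) {
        free(sm->workers);
        sm->workers = NULL;
    }
}

/* Example callback: counts how many times it was run. */
static int sm_model_count(void *data)
{
    ++*(int *)data;
    return 0;
}
```

In this model a caller would do the create step early, while
allocations are still unrestricted, and the destroy step once the
critical section is over.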

There appear to be some deficiencies in the way this allowance works,
which have recently been exposed.  And unfortunately the allocation
hangs instead of failing, so we're in unclean shutdown territory.
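
A rough user-space model of the allowed-mask idea may help show why
the GFP_IOFS versus GFP_KERNEL experiment quoted above is interesting.
The bit values and all model_* names below are assumptions made for
illustration, not the kernel's definitions.  The model relies only on
GFP_IOFS being the I/O and filesystem bits, and GFP_KERNEL being those
two plus the "may sleep" bit.

```c
#include <stdbool.h>

/* Illustrative bit values; the real kernel constants differ. */
enum {
    M_WAIT = 1 << 0, /* caller may sleep while memory is reclaimed */
    M_IO   = 1 << 1, /* reclaim may start disk I/O */
    M_FS   = 1 << 2  /* reclaim may call into filesystems */
};
#define M_IOFS   (M_IO | M_FS)
#define M_KERNEL (M_WAIT | M_IO | M_FS)

/* Bits that any allocation is currently permitted to use. */
static unsigned int model_allowed_mask = M_KERNEL;

/* Remove bits from the permitted set; return the previous mask. */
static unsigned int model_clear_allowed(unsigned int mask)
{
    unsigned int old = model_allowed_mask;
    model_allowed_mask &= ~mask;
    return old;
}

/* What a request is reduced to once the permitted set is applied. */
static unsigned int model_effective(unsigned int requested)
{
    return requested & model_allowed_mask;
}

/* A request that keeps the wait bit can block waiting for reclaim. */
static bool model_may_sleep(unsigned int requested)
{
    return (model_effective(requested) & M_WAIT) != 0;
}
```

In this model, clearing only the IOFS bits leaves an ordinary request
able to sleep but unable to start I/O, which is the situation where a
request could wait rather than return an error.  Clearing all of the
KERNEL bits also removes the wait bit, so the same request could no
longer sleep.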

I have three test scenarios at the moment.  I've tested two patches
which appear to fix the common cases, but there's still a third test
scenario to figure out.  (Repeated hibernation attempts with
insufficient swap - encountered during real-world use, believe it or
not).

Alan
