All of lore.kernel.org
 help / color / mirror / Atom feed
* [Patch v4 1/2] freezer: check OOM kill while being frozen
@ 2014-09-04 22:30 Cong Wang
  2014-09-04 22:30 ` [Patch v4 2/2] freezer: remove obsolete comments in __thaw_task() Cong Wang
                   ` (2 more replies)
  0 siblings, 3 replies; 61+ messages in thread
From: Cong Wang @ 2014-09-04 22:30 UTC (permalink / raw)
  To: linux-kernel
  Cc: Cong Wang, David Rientjes, Michal Hocko, Rafael J. Wysocki,
	Tejun Heo, Andrew Morton

There is a race condition between OOM killer and freezer when
they try to operate on the same process, something like below:

        Process A       Process B               Process C
trigger page fault
then trigger oom
B=oom_scan_process_thread()
                                        cgroup freezer freeze(A, B)
                        ...
                        try_to_freeze()
                        stay in D state
oom_kill_process(B)
restart page fault
...

In this case, process A triggered a page fault in user-space,
and the kernel page fault handler triggered OOM, then kernel
selected process B as the victim, right before being killed
process B was frozen by process C therefore went to D state,
then kernel sent SIGKILL but it is already too late as
process B will not care about pending signals any more.

David Rientjes tried to fix same issue with commit
f660daac474c6f (oom: thaw threads if oom killed thread is
frozen before deferring) but it doesn't work any more, because
__thaw_task() just checks if it's frozen and then wakes it up,
but the frozen task, after waking up, will check if freezing()
is still true and continue to freeze itself if so. __thaw_task()
can't make freezing() return false since it doesn't change any
of these conditions, especially cgroup_freezing().

Fix this straightly by checking if the frozen process itself
has been killed by OOM killer, so that the frozen process will
thaw itself and be killed finally.

Cc: David Rientjes <rientjes@google.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Tejun Heo <tj@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
---
 kernel/freezer.c | 18 ++++++++++++++++--
 mm/oom_kill.c    |  2 --
 2 files changed, 16 insertions(+), 4 deletions(-)

diff --git a/kernel/freezer.c b/kernel/freezer.c
index aa6a8aa..dbcb87c 100644
--- a/kernel/freezer.c
+++ b/kernel/freezer.c
@@ -52,6 +52,21 @@ bool freezing_slow_path(struct task_struct *p)
 }
 EXPORT_SYMBOL(freezing_slow_path);
 
+static bool should_thaw_current(bool check_kthr_stop)
+{
+	if (!freezing(current))
+		return true;
+
+	if (check_kthr_stop && kthread_should_stop())
+		return true;
+
+	/* It might not be safe to check TIF_MEMDIE for pm freeze. */
+	if (cgroup_freezing(current) && test_thread_flag(TIF_MEMDIE))
+		return true;
+
+	return false;
+}
+
 /* Refrigerator is place where frozen processes are stored :-). */
 bool __refrigerator(bool check_kthr_stop)
 {
@@ -67,8 +82,7 @@ bool __refrigerator(bool check_kthr_stop)
 
 		spin_lock_irq(&freezer_lock);
 		current->flags |= PF_FROZEN;
-		if (!freezing(current) ||
-		    (check_kthr_stop && kthread_should_stop()))
+		if (should_thaw_current(check_kthr_stop))
 			current->flags &= ~PF_FROZEN;
 		spin_unlock_irq(&freezer_lock);
 
diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index 1e11df8..112c278 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -266,8 +266,6 @@ enum oom_scan_t oom_scan_process_thread(struct task_struct *task,
 	 * Don't allow any other task to have access to the reserves.
 	 */
 	if (test_tsk_thread_flag(task, TIF_MEMDIE)) {
-		if (unlikely(frozen(task)))
-			__thaw_task(task);
 		if (!force_kill)
 			return OOM_SCAN_ABORT;
 	}
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [Patch v4 2/2] freezer: remove obsolete comments in __thaw_task()
  2014-09-04 22:30 [Patch v4 1/2] freezer: check OOM kill while being frozen Cong Wang
@ 2014-09-04 22:30 ` Cong Wang
  2014-09-05  9:45   ` Michal Hocko
  2014-09-05 14:09   ` Tejun Heo
  2014-09-05 14:08 ` [Patch v4 1/2] freezer: check OOM kill while being frozen Tejun Heo
  2014-09-15 11:22 ` Michal Hocko
  2 siblings, 2 replies; 61+ messages in thread
From: Cong Wang @ 2014-09-04 22:30 UTC (permalink / raw)
  To: linux-kernel
  Cc: Cong Wang, David Rientjes, Rafael J. Wysocki, Tejun Heo,
	Andrew Morton, Michal Hocko

__thaw_task() no longer clears frozen flag since commit a3201227f803
(freezer: make freezing() test freeze conditions in effect instead of TIF_FREEZE).

Cc: David Rientjes <rientjes@google.com>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Tejun Heo <tj@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.cz>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
---
 kernel/freezer.c | 6 ------
 1 file changed, 6 deletions(-)

diff --git a/kernel/freezer.c b/kernel/freezer.c
index dbcb87c..438ecca 100644
--- a/kernel/freezer.c
+++ b/kernel/freezer.c
@@ -161,12 +161,6 @@ void __thaw_task(struct task_struct *p)
 {
 	unsigned long flags;
 
-	/*
-	 * Clear freezing and kick @p if FROZEN.  Clearing is guaranteed to
-	 * be visible to @p as waking up implies wmb.  Waking up inside
-	 * freezer_lock also prevents wakeups from leaking outside
-	 * refrigerator.
-	 */
 	spin_lock_irqsave(&freezer_lock, flags);
 	if (frozen(p))
 		wake_up_process(p);
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 61+ messages in thread

* Re: [Patch v4 2/2] freezer: remove obsolete comments in __thaw_task()
  2014-09-04 22:30 ` [Patch v4 2/2] freezer: remove obsolete comments in __thaw_task() Cong Wang
@ 2014-09-05  9:45   ` Michal Hocko
  2014-09-05 14:09   ` Tejun Heo
  1 sibling, 0 replies; 61+ messages in thread
From: Michal Hocko @ 2014-09-05  9:45 UTC (permalink / raw)
  To: Cong Wang
  Cc: linux-kernel, David Rientjes, Rafael J. Wysocki, Tejun Heo,
	Andrew Morton

On Thu 04-09-14 15:30:42, Cong Wang wrote:
> __thaw_task() no longer clears frozen flag since commit a3201227f803
> (freezer: make freezing() test freeze conditions in effect instead of TIF_FREEZE).
> 
> Cc: David Rientjes <rientjes@google.com>
> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
> Cc: Tejun Heo <tj@kernel.org>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Michal Hocko <mhocko@suse.cz>
> Reviewed-by: Michal Hocko <mhocko@suse.cz>
> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>

Thanks for extending the description.
Acked-by: Michal Hocko <mhocko@suse.cz>

> ---
>  kernel/freezer.c | 6 ------
>  1 file changed, 6 deletions(-)
> 
> diff --git a/kernel/freezer.c b/kernel/freezer.c
> index dbcb87c..438ecca 100644
> --- a/kernel/freezer.c
> +++ b/kernel/freezer.c
> @@ -161,12 +161,6 @@ void __thaw_task(struct task_struct *p)
>  {
>  	unsigned long flags;
>  
> -	/*
> -	 * Clear freezing and kick @p if FROZEN.  Clearing is guaranteed to
> -	 * be visible to @p as waking up implies wmb.  Waking up inside
> -	 * freezer_lock also prevents wakeups from leaking outside
> -	 * refrigerator.
> -	 */
>  	spin_lock_irqsave(&freezer_lock, flags);
>  	if (frozen(p))
>  		wake_up_process(p);
> -- 
> 1.8.3.1
> 

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [Patch v4 1/2] freezer: check OOM kill while being frozen
  2014-09-04 22:30 [Patch v4 1/2] freezer: check OOM kill while being frozen Cong Wang
  2014-09-04 22:30 ` [Patch v4 2/2] freezer: remove obsolete comments in __thaw_task() Cong Wang
@ 2014-09-05 14:08 ` Tejun Heo
  2014-09-05 16:31   ` Cong Wang
  2014-09-05 16:43   ` Cong Wang
  2014-09-15 11:22 ` Michal Hocko
  2 siblings, 2 replies; 61+ messages in thread
From: Tejun Heo @ 2014-09-05 14:08 UTC (permalink / raw)
  To: Cong Wang
  Cc: linux-kernel, David Rientjes, Michal Hocko, Rafael J. Wysocki,
	Andrew Morton

Hello,

On Thu, Sep 04, 2014 at 03:30:41PM -0700, Cong Wang wrote:
...
> Cc: David Rientjes <rientjes@google.com>
> Cc: Michal Hocko <mhocko@suse.cz>
> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
> Cc: Tejun Heo <tj@kernel.org>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Acked-by: Michal Hocko <mhocko@suse.cz>
> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>

Prolly should stable@vger.kernel.org?

> +static bool should_thaw_current(bool check_kthr_stop)
> +{
> +	if (!freezing(current))
> +		return true;
> +
> +	if (check_kthr_stop && kthread_should_stop())
> +		return true;
> +
> +	/* It might not be safe to check TIF_MEMDIE for pm freeze. */

This is just another representation of the following code which isn't
particularly useful.  Wouldn't it be better if the comment actually
explains why this might not be safe?

> +	if (cgroup_freezing(current) && test_thread_flag(TIF_MEMDIE))
> +		return true;
> +
> +	return false;
> +}

Other than that,

Acked-by: Tejun Heo <tj@kernel.org>

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [Patch v4 2/2] freezer: remove obsolete comments in __thaw_task()
  2014-09-04 22:30 ` [Patch v4 2/2] freezer: remove obsolete comments in __thaw_task() Cong Wang
  2014-09-05  9:45   ` Michal Hocko
@ 2014-09-05 14:09   ` Tejun Heo
  1 sibling, 0 replies; 61+ messages in thread
From: Tejun Heo @ 2014-09-05 14:09 UTC (permalink / raw)
  To: Cong Wang
  Cc: linux-kernel, David Rientjes, Rafael J. Wysocki, Andrew Morton,
	Michal Hocko

On Thu, Sep 04, 2014 at 03:30:42PM -0700, Cong Wang wrote:
> __thaw_task() no longer clears frozen flag since commit a3201227f803
> (freezer: make freezing() test freeze conditions in effect instead of TIF_FREEZE).
> 
> Cc: David Rientjes <rientjes@google.com>
> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
> Cc: Tejun Heo <tj@kernel.org>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Michal Hocko <mhocko@suse.cz>
> Reviewed-by: Michal Hocko <mhocko@suse.cz>
> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>

Acked-by: Tejun Heo <tj@kernel.org>

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [Patch v4 1/2] freezer: check OOM kill while being frozen
  2014-09-05 14:08 ` [Patch v4 1/2] freezer: check OOM kill while being frozen Tejun Heo
@ 2014-09-05 16:31   ` Cong Wang
  2014-09-05 18:00     ` Tejun Heo
  2014-09-05 16:43   ` Cong Wang
  1 sibling, 1 reply; 61+ messages in thread
From: Cong Wang @ 2014-09-05 16:31 UTC (permalink / raw)
  To: Tejun Heo
  Cc: LKML, David Rientjes, Michal Hocko, Rafael J. Wysocki, Andrew Morton

On Fri, Sep 5, 2014 at 7:08 AM, Tejun Heo <tj@kernel.org> wrote:
>> +     /* It might not be safe to check TIF_MEMDIE for pm freeze. */
>
> This is just another representation of the following code which isn't
> particularly useful.  Wouldn't it be better if the comment actually
> explains why this might not be safe?
>

I don't know actually, I never understand pm code, just don't
want to take the risk of breaking it as you told. :)

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [Patch v4 1/2] freezer: check OOM kill while being frozen
  2014-09-05 14:08 ` [Patch v4 1/2] freezer: check OOM kill while being frozen Tejun Heo
  2014-09-05 16:31   ` Cong Wang
@ 2014-09-05 16:43   ` Cong Wang
  2014-09-05 16:54     ` Michal Hocko
  1 sibling, 1 reply; 61+ messages in thread
From: Cong Wang @ 2014-09-05 16:43 UTC (permalink / raw)
  To: Tejun Heo
  Cc: LKML, David Rientjes, Michal Hocko, Rafael J. Wysocki, Andrew Morton

On Fri, Sep 5, 2014 at 7:08 AM, Tejun Heo <tj@kernel.org> wrote:
>
> Prolly should stable@vger.kernel.org?

Oh, yes. I assume the maintainer (Michal?) will send it to stable
once it is accepted, or I can do that after it is merged.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [Patch v4 1/2] freezer: check OOM kill while being frozen
  2014-09-05 16:43   ` Cong Wang
@ 2014-09-05 16:54     ` Michal Hocko
  0 siblings, 0 replies; 61+ messages in thread
From: Michal Hocko @ 2014-09-05 16:54 UTC (permalink / raw)
  To: Cong Wang
  Cc: Tejun Heo, LKML, David Rientjes, Rafael J. Wysocki, Andrew Morton

On Fri 05-09-14 09:43:52, Cong Wang wrote:
> On Fri, Sep 5, 2014 at 7:08 AM, Tejun Heo <tj@kernel.org> wrote:
> >
> > Prolly should stable@vger.kernel.org?
> 
> Oh, yes. I assume the maintainer (Michal?) will send it to stable

I guess you meant Rafael.

> once it is accepted, or I can do that after it is merged.

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [Patch v4 1/2] freezer: check OOM kill while being frozen
  2014-09-05 16:31   ` Cong Wang
@ 2014-09-05 18:00     ` Tejun Heo
  2014-09-05 18:12       ` Cong Wang
  0 siblings, 1 reply; 61+ messages in thread
From: Tejun Heo @ 2014-09-05 18:00 UTC (permalink / raw)
  To: Cong Wang
  Cc: LKML, David Rientjes, Michal Hocko, Rafael J. Wysocki, Andrew Morton

On Fri, Sep 05, 2014 at 09:31:50AM -0700, Cong Wang wrote:
> On Fri, Sep 5, 2014 at 7:08 AM, Tejun Heo <tj@kernel.org> wrote:
> >> +     /* It might not be safe to check TIF_MEMDIE for pm freeze. */
> >
> > This is just another representation of the following code which isn't
> > particularly useful.  Wouldn't it be better if the comment actually
> > explains why this might not be safe?
> 
> I don't know actually, I never understand pm code, just don't
> want to take the risk of breaking it as you told. :)

Let's please try to understand and explain it properly.  Reading
intricate code like this w/ comment which doesn't really explain
anything can be very frustrating for other people reading the code.
Rafael, can you please help?

Shouldn't the primary goal of the comment be explaining why we need
TIF_MEMDIE check there at all anyway?  The deadlock possiblity is not
very obvious.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [Patch v4 1/2] freezer: check OOM kill while being frozen
  2014-09-05 18:00     ` Tejun Heo
@ 2014-09-05 18:12       ` Cong Wang
  2014-09-05 22:45         ` Tejun Heo
  0 siblings, 1 reply; 61+ messages in thread
From: Cong Wang @ 2014-09-05 18:12 UTC (permalink / raw)
  To: Tejun Heo
  Cc: LKML, David Rientjes, Michal Hocko, Rafael J. Wysocki, Andrew Morton

On Fri, Sep 5, 2014 at 11:00 AM, Tejun Heo <tj@kernel.org> wrote:
> On Fri, Sep 05, 2014 at 09:31:50AM -0700, Cong Wang wrote:
>> On Fri, Sep 5, 2014 at 7:08 AM, Tejun Heo <tj@kernel.org> wrote:
>> >> +     /* It might not be safe to check TIF_MEMDIE for pm freeze. */
>> >
>> > This is just another representation of the following code which isn't
>> > particularly useful.  Wouldn't it be better if the comment actually
>> > explains why this might not be safe?
>>
>> I don't know actually, I never understand pm code, just don't
>> want to take the risk of breaking it as you told. :)
>
> Let's please try to understand and explain it properly.  Reading
> intricate code like this w/ comment which doesn't really explain
> anything can be very frustrating for other people reading the code.

That comment just explains why we need a cgroup_freezing() check,
nothing else. If you don't think it helps, I can definitely remove it.

> Rafael, can you please help?

Rafael is known not responsive at least for this topic. :)

>
> Shouldn't the primary goal of the comment be explaining why we need
> TIF_MEMDIE check there at all anyway?  The deadlock possiblity is not
> very obvious.
>

The changelog is not long enough?? ;-) I hate to copy+paste changelog
into comments, changelog is essentially necessary for people to understand
kernel code (at least networking) , so I don't think we have to move it
into comments in this case.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [Patch v4 1/2] freezer: check OOM kill while being frozen
  2014-09-05 18:12       ` Cong Wang
@ 2014-09-05 22:45         ` Tejun Heo
  2014-09-05 23:32           ` Rafael J. Wysocki
  0 siblings, 1 reply; 61+ messages in thread
From: Tejun Heo @ 2014-09-05 22:45 UTC (permalink / raw)
  To: Cong Wang
  Cc: LKML, David Rientjes, Michal Hocko, Rafael J. Wysocki, Andrew Morton

Hello,

On Fri, Sep 05, 2014 at 11:12:24AM -0700, Cong Wang wrote:
> > Rafael, can you please help?
> 
> Rafael is known not responsive at least for this topic. :)

:(

> > Shouldn't the primary goal of the comment be explaining why we need
> > TIF_MEMDIE check there at all anyway?  The deadlock possiblity is not
> > very obvious.
> 
> The changelog is not long enough?? ;-) I hate to copy+paste changelog
> into comments, changelog is essentially necessary for people to understand
> kernel code (at least networking) , so I don't think we have to move it
> into comments in this case.

It doesn't have to be the same text but the current comment is
basically content-less.  e.g. it can just say "OOM killer may get
stuck trying to kill a cgroup frozen task" and actualy provide
information on what condition the conditional tries to address.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [Patch v4 1/2] freezer: check OOM kill while being frozen
  2014-09-05 22:45         ` Tejun Heo
@ 2014-09-05 23:32           ` Rafael J. Wysocki
  2014-09-08 17:40             ` Cong Wang
  0 siblings, 1 reply; 61+ messages in thread
From: Rafael J. Wysocki @ 2014-09-05 23:32 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Cong Wang, LKML, David Rientjes, Michal Hocko, Andrew Morton

On Saturday, September 06, 2014 07:45:54 AM Tejun Heo wrote:
> Hello,
> 
> On Fri, Sep 05, 2014 at 11:12:24AM -0700, Cong Wang wrote:
> > > Rafael, can you please help?
> > 
> > Rafael is known not responsive at least for this topic. :)
> 
> :(

Well, am I?

I haven't commented patches in this thread so far, mostly because other
people have.

How can I help actually?

> > > Shouldn't the primary goal of the comment be explaining why we need
> > > TIF_MEMDIE check there at all anyway?  The deadlock possiblity is not
> > > very obvious.
> > 
> > The changelog is not long enough?? ;-) I hate to copy+paste changelog
> > into comments, changelog is essentially necessary for people to understand
> > kernel code (at least networking) , so I don't think we have to move it
> > into comments in this case.
> 
> It doesn't have to be the same text but the current comment is
> basically content-less.  e.g. it can just say "OOM killer may get
> stuck trying to kill a cgroup frozen task" and actualy provide
> information on what condition the conditional tries to address.

Or something like "We need to check X to prevent Y from happening".

Rafael


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [Patch v4 1/2] freezer: check OOM kill while being frozen
  2014-09-05 23:32           ` Rafael J. Wysocki
@ 2014-09-08 17:40             ` Cong Wang
  2014-09-08 20:54               ` Rafael J. Wysocki
  0 siblings, 1 reply; 61+ messages in thread
From: Cong Wang @ 2014-09-08 17:40 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Tejun Heo, LKML, David Rientjes, Michal Hocko, Andrew Morton

On Fri, Sep 5, 2014 at 4:32 PM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> On Saturday, September 06, 2014 07:45:54 AM Tejun Heo wrote:
>> Hello,
>>
>> On Fri, Sep 05, 2014 at 11:12:24AM -0700, Cong Wang wrote:
>> > > Rafael, can you please help?
>> >
>> > Rafael is known not responsive at least for this topic. :)
>>
>> :(
>
> Well, am I?
>
> I haven't commented patches in this thread so far, mostly because other
> people have.
>
> How can I help actually?


We asked you to comment on either if this patch is safe for PM freeze
if we don't have the cgroup_freezing() check, or if it is not safe why (so that
I can put it in the comment).

>
>> > > Shouldn't the primary goal of the comment be explaining why we need
>> > > TIF_MEMDIE check there at all anyway?  The deadlock possiblity is not
>> > > very obvious.
>> >
>> > The changelog is not long enough?? ;-) I hate to copy+paste changelog
>> > into comments, changelog is essentially necessary for people to understand
>> > kernel code (at least networking) , so I don't think we have to move it
>> > into comments in this case.
>>
>> It doesn't have to be the same text but the current comment is
>> basically content-less.  e.g. it can just say "OOM killer may get
>> stuck trying to kill a cgroup frozen task" and actualy provide
>> information on what condition the conditional tries to address.
>
> Or something like "We need to check X to prevent Y from happening".
>

OK, maybe just one or two sentences. Let me know if the following
comment is okay for you:

/* OOM killer might decide to kill this process after it is frozen,
  in this case it should thaw and die. */

Thanks.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [Patch v4 1/2] freezer: check OOM kill while being frozen
  2014-09-08 17:40             ` Cong Wang
@ 2014-09-08 20:54               ` Rafael J. Wysocki
  2014-09-08 20:58                 ` Cong Wang
  0 siblings, 1 reply; 61+ messages in thread
From: Rafael J. Wysocki @ 2014-09-08 20:54 UTC (permalink / raw)
  To: Cong Wang; +Cc: Tejun Heo, LKML, David Rientjes, Michal Hocko, Andrew Morton

On Monday, September 08, 2014 10:40:17 AM Cong Wang wrote:
> On Fri, Sep 5, 2014 at 4:32 PM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> > On Saturday, September 06, 2014 07:45:54 AM Tejun Heo wrote:
> >> Hello,
> >>
> >> On Fri, Sep 05, 2014 at 11:12:24AM -0700, Cong Wang wrote:
> >> > > Rafael, can you please help?
> >> >
> >> > Rafael is known not responsive at least for this topic. :)
> >>
> >> :(
> >
> > Well, am I?
> >
> > I haven't commented patches in this thread so far, mostly because other
> > people have.
> >
> > How can I help actually?
> 
> 
> We asked you to comment on either if this patch is safe for PM freeze
> if we don't have the cgroup_freezing() check, or if it is not safe why (so that
> I can put it in the comment).

OK, which version of it?  Anything that has been posted as a complete patch you
have a link to?  Or am I supposed to go through the whole thread and figure out
how the patch *might* have looked had it been posted and then comment?

> >> > > Shouldn't the primary goal of the comment be explaining why we need
> >> > > TIF_MEMDIE check there at all anyway?  The deadlock possiblity is not
> >> > > very obvious.
> >> >
> >> > The changelog is not long enough?? ;-) I hate to copy+paste changelog
> >> > into comments, changelog is essentially necessary for people to understand
> >> > kernel code (at least networking) , so I don't think we have to move it
> >> > into comments in this case.
> >>
> >> It doesn't have to be the same text but the current comment is
> >> basically content-less.  e.g. it can just say "OOM killer may get
> >> stuck trying to kill a cgroup frozen task" and actualy provide
> >> information on what condition the conditional tries to address.
> >
> > Or something like "We need to check X to prevent Y from happening".
> >
> 
> OK, maybe just one or two sentences. Let me know if the following
> comment is okay for you:
> 
> /* OOM killer might decide to kill this process after it is frozen,
>   in this case it should thaw and die. */

I don't think it's sufficient.  In particular, what does "thaw and die" mean
exactly?  It should be thawed immediately and then die or it should die right
after it's thawed (at one point in the future)?

Rafael


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [Patch v4 1/2] freezer: check OOM kill while being frozen
  2014-09-08 20:54               ` Rafael J. Wysocki
@ 2014-09-08 20:58                 ` Cong Wang
  2014-09-08 21:53                   ` Rafael J. Wysocki
  0 siblings, 1 reply; 61+ messages in thread
From: Cong Wang @ 2014-09-08 20:58 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Tejun Heo, LKML, David Rientjes, Michal Hocko, Andrew Morton

On Mon, Sep 8, 2014 at 1:54 PM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> On Monday, September 08, 2014 10:40:17 AM Cong Wang wrote:
>>
>> We asked you to comment on either if this patch is safe for PM freeze
>> if we don't have the cgroup_freezing() check, or if it is not safe why (so that
>> I can put it in the comment).
>
> OK, which version of it?  Anything that has been posted as a complete patch you
> have a link to?  Or am I supposed to go through the whole thread and figure out
> how the patch *might* have looked had it been posted and then comment?

Sorry, I meant this version, the one we are discussing about.
In case you missed it too, let me show the related code for you:

+       /* It might not be safe to check TIF_MEMDIE for pm freeze. */
+       if (cgroup_freezing(current) && test_thread_flag(TIF_MEMDIE))
+               return true;
+

(return true means it will break the loop of freezing, that is thaw itself.)

So our question is if we really need the cgroup_freezing() check here?
IOW, is it safe to check TIF_MEMDIE and then thaw itself even if the
freeze request is from PM? Hope it is clear now.

>> OK, maybe just one or two sentences. Let me know if the following
>> comment is okay for you:
>>
>> /* OOM killer might decide to kill this process after it is frozen,
>>   in this case it should thaw and die. */
>
> I don't think it's sufficient.  In particular, what does "thaw and die" mean
> exactly?  It should be thawed immediately and then die or it should die right
> after it's thawed (at one point in the future)?
>

OK, let me try again:

/* OOM killer may decide to kill this process after it is frozen,
  in this case SIGKILL can never be handled, so we should check
 TIF_MEMDIE and if it is set, thaw and let SIGKILL kill it. */

(Again, more detailed explanation is in the changelog, comment is just
for a quick reference, this seems what we all agree on.)

Thanks!

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [Patch v4 1/2] freezer: check OOM kill while being frozen
  2014-09-08 20:58                 ` Cong Wang
@ 2014-09-08 21:53                   ` Rafael J. Wysocki
  2014-09-08 22:22                     ` Tejun Heo
  0 siblings, 1 reply; 61+ messages in thread
From: Rafael J. Wysocki @ 2014-09-08 21:53 UTC (permalink / raw)
  To: Cong Wang; +Cc: Tejun Heo, LKML, David Rientjes, Michal Hocko, Andrew Morton

On Monday, September 08, 2014 01:58:30 PM Cong Wang wrote:
> On Mon, Sep 8, 2014 at 1:54 PM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> > On Monday, September 08, 2014 10:40:17 AM Cong Wang wrote:
> >>
> >> We asked you to comment on either if this patch is safe for PM freeze
> >> if we don't have the cgroup_freezing() check, or if it is not safe why (so that
> >> I can put it in the comment).
> >
> > OK, which version of it?  Anything that has been posted as a complete patch you
> > have a link to?  Or am I supposed to go through the whole thread and figure out
> > how the patch *might* have looked had it been posted and then comment?
> 
> Sorry, I meant this version, the one we are discussing about.
> In case you missed it too, let me show the related code for you:
> 
> +       /* It might not be safe to check TIF_MEMDIE for pm freeze. */
> +       if (cgroup_freezing(current) && test_thread_flag(TIF_MEMDIE))
> +               return true;
> +
> 
> (return true means it will break the loop of freezing, that is thaw itself.)
> 
> So our question is if we really need the cgroup_freezing() check here?

No, I don't think so.

> IOW, is it safe to check TIF_MEMDIE and then thaw itself even if the
> freeze request is from PM? Hope it is clear now.

Yes, it is clear, thanks.

Why would the cgroup_freezing() test matter here at all?

> >> OK, maybe just one or two sentences. Let me know if the following
> >> comment is okay for you:
> >>
> >> /* OOM killer might decide to kill this process after it is frozen,
> >>   in this case it should thaw and die. */
> >
> > I don't think it's sufficient.  In particular, what does "thaw and die" mean
> > exactly?  It should be thawed immediately and then die or it should die right
> > after it's thawed (at one point in the future)?
> >
> 
> OK, let me try again:
> 
> /* OOM killer may decide to kill this process after it is frozen,
>   in this case SIGKILL can never be handled, so we should check
>  TIF_MEMDIE and if it is set, thaw and let SIGKILL kill it. */
> 
> (Again, more detailed explanation is in the changelog, comment is just
> for a quick reference, this seems what we all agree on.)

OK

I think you can simply add the test_thread_flag(TIF_MEMDIE) check to
__refrigerator() and it should be OK without the comment.

But please tell me this: If TIF_MEMDIE is set and we thaw the process,
can it do *anything* before dying or will it die immediately?

Rafael


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [Patch v4 1/2] freezer: check OOM kill while being frozen
  2014-09-08 21:53                   ` Rafael J. Wysocki
@ 2014-09-08 22:22                     ` Tejun Heo
  2014-09-08 22:48                       ` Rafael J. Wysocki
  0 siblings, 1 reply; 61+ messages in thread
From: Tejun Heo @ 2014-09-08 22:22 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Cong Wang, LKML, David Rientjes, Michal Hocko, Andrew Morton

On Mon, Sep 08, 2014 at 11:53:34PM +0200, Rafael J. Wysocki wrote:
> But please tell me this: If TIF_MEMDIE is set and we thaw the process,
> can it do *anything* before dying or will it die immediately?

The task could be in freezer from anywhere and there's no limit in
what the task may do before it eventually dies.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [Patch v4 1/2] freezer: check OOM kill while being frozen
  2014-09-08 22:22                     ` Tejun Heo
@ 2014-09-08 22:48                       ` Rafael J. Wysocki
  2014-09-08 22:50                         ` Tejun Heo
  0 siblings, 1 reply; 61+ messages in thread
From: Rafael J. Wysocki @ 2014-09-08 22:48 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Cong Wang, LKML, David Rientjes, Michal Hocko, Andrew Morton

On Tuesday, September 09, 2014 07:22:53 AM Tejun Heo wrote:
> On Mon, Sep 08, 2014 at 11:53:34PM +0200, Rafael J. Wysocki wrote:
> > But please tell me this: If TIF_MEMDIE is set and we thaw the process,
> > can it do *anything* before dying or will it die immediately?
> 
> The task could be in freezer from anywhere and there's no limit in
> what the task may do before it eventually dies.

Well, not exactly from anywhere.  Just from where try_to_freeze() is called
I suppose?

Which means that if this is a user space task, it won't to a lot before dying,
will it?

Rafael


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [Patch v4 1/2] freezer: check OOM kill while being frozen
  2014-09-08 22:48                       ` Rafael J. Wysocki
@ 2014-09-08 22:50                         ` Tejun Heo
  2014-09-08 23:15                           ` Rafael J. Wysocki
  2014-09-09 15:16                           ` Michal Hocko
  0 siblings, 2 replies; 61+ messages in thread
From: Tejun Heo @ 2014-09-08 22:50 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Cong Wang, LKML, David Rientjes, Michal Hocko, Andrew Morton

Hello,

On Tue, Sep 09, 2014 at 12:48:28AM +0200, Rafael J. Wysocki wrote:
> Well, not exactly from anywhere.  Just from where try_to_freeze() is called
> I suppose?

Yeap, anywhere try_to_freeze() may be called.

> Which means that if this is a user space task, it won't to a lot before dying,
> will it?

Userland tasks aren't likely to a lot of damages before dying but then
again there's no guarantee.  As it currently stands, try_to_freeze()
can be anywhere.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [Patch v4 1/2] freezer: check OOM kill while being frozen
  2014-09-08 23:15                           ` Rafael J. Wysocki
@ 2014-09-08 23:00                             ` Cong Wang
  2014-09-08 23:23                               ` Rafael J. Wysocki
  0 siblings, 1 reply; 61+ messages in thread
From: Cong Wang @ 2014-09-08 23:00 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Tejun Heo, LKML, David Rientjes, Michal Hocko, Andrew Morton

On Mon, Sep 8, 2014 at 4:15 PM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> On Tuesday, September 09, 2014 07:50:12 AM Tejun Heo wrote:
>> Hello,
>>
>> On Tue, Sep 09, 2014 at 12:48:28AM +0200, Rafael J. Wysocki wrote:
>> > Well, not exactly from anywhere.  Just from where try_to_freeze() is called
>> > I suppose?
>>
>> Yeap, anywhere try_to_freeze() may be called.
>>
>> > Which means that if this is a user space task, it won't to a lot before dying,
>> > will it?
>>
>> Userland tasks aren't likely to a lot of damages before dying but then
>> again there's no guarantee.  As it currently stands, try_to_freeze()
>> can be anywhere.
>
> Well, in that case the TIF_MEMDIE test alone is not sufficient in my opinion,
> because we can't guarantee that the task will not do something it was frozen
> to prevent it from doing.
>
>

That is true, there is a small window between TIF_MEMDIE is set
and SIGKILL is handled (aka process is killed), but this seems to
be a general signal handling issue not related with freezer after all
OOM relies on SIGKILL to function. No?

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [Patch v4 1/2] freezer: check OOM kill while being frozen
  2014-09-08 22:50                         ` Tejun Heo
@ 2014-09-08 23:15                           ` Rafael J. Wysocki
  2014-09-08 23:00                             ` Cong Wang
  2014-09-09 15:16                           ` Michal Hocko
  1 sibling, 1 reply; 61+ messages in thread
From: Rafael J. Wysocki @ 2014-09-08 23:15 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Cong Wang, LKML, David Rientjes, Michal Hocko, Andrew Morton

On Tuesday, September 09, 2014 07:50:12 AM Tejun Heo wrote:
> Hello,
> 
> On Tue, Sep 09, 2014 at 12:48:28AM +0200, Rafael J. Wysocki wrote:
> > Well, not exactly from anywhere.  Just from where try_to_freeze() is called
> > I suppose?
> 
> Yeap, anywhere try_to_freeze() may be called.
> 
> > Which means that if this is a user space task, it won't to a lot before dying,
> > will it?
> 
> Userland tasks aren't likely to a lot of damages before dying but then
> again there's no guarantee.  As it currently stands, try_to_freeze()
> can be anywhere.

Well, in that case the TIF_MEMDIE test alone is not sufficient in my opinion,
because we can't guarantee that the task will not do something it was frozen
to prevent it from doing.

I'm not sure about the cgroups freezing case, however.

Rafael


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [Patch v4 1/2] freezer: check OOM kill while being frozen
  2014-09-08 23:23                               ` Rafael J. Wysocki
@ 2014-09-08 23:16                                 ` Cong Wang
  2014-09-08 23:42                                   ` Rafael J. Wysocki
  0 siblings, 1 reply; 61+ messages in thread
From: Cong Wang @ 2014-09-08 23:16 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Tejun Heo, LKML, David Rientjes, Michal Hocko, Andrew Morton

On Mon, Sep 8, 2014 at 4:23 PM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
>
> The reason why it matters for the suspend-time freezing is that we freeze tasks
> to take them out of the picture entirely until they are thawed.  Therefore we
> can't allow them to go back to the picture just for a while until they are
> killed.  Frozen tasks are not supposed to get back to the picture at all.
>


Ok, then checking TIF_MEMDIE is unsafe for PM freeze, we should
keep the cgroup_freezing() test to make sure freeze request is from
cgroup not PM. Question got answered. :)

I will put the following as a comment:

/* OOM killer may decide to kill this process after it is frozen,
   in this case SIGKILL can never be handled, so we should check
  TIF_MEMDIE and if it is set, thaw and let SIGKILL kill it.
  But for PM freeze, it is not allowed to get out even for a while,
  so we have to keep it being frozen. */

Let me know if it looks good to you.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [Patch v4 1/2] freezer: check OOM kill while being frozen
  2014-09-08 23:00                             ` Cong Wang
@ 2014-09-08 23:23                               ` Rafael J. Wysocki
  2014-09-08 23:16                                 ` Cong Wang
  0 siblings, 1 reply; 61+ messages in thread
From: Rafael J. Wysocki @ 2014-09-08 23:23 UTC (permalink / raw)
  To: Cong Wang; +Cc: Tejun Heo, LKML, David Rientjes, Michal Hocko, Andrew Morton

On Monday, September 08, 2014 04:00:41 PM Cong Wang wrote:
> On Mon, Sep 8, 2014 at 4:15 PM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> > On Tuesday, September 09, 2014 07:50:12 AM Tejun Heo wrote:
> >> Hello,
> >>
> >> On Tue, Sep 09, 2014 at 12:48:28AM +0200, Rafael J. Wysocki wrote:
> >> > Well, not exactly from anywhere.  Just from where try_to_freeze() is called
> >> > I suppose?
> >>
> >> Yeap, anywhere try_to_freeze() may be called.
> >>
> >> > Which means that if this is a user space task, it won't to a lot before dying,
> >> > will it?
> >>
> >> Userland tasks aren't likely to a lot of damages before dying but then
> >> again there's no guarantee.  As it currently stands, try_to_freeze()
> >> can be anywhere.
> >
> > Well, in that case the TIF_MEMDIE test alone is not sufficient in my opinion,
> > because we can't guarantee that the task will not do something it was frozen
> > to prevent it from doing.
> >
> >
> 
> That is true, there is a small window between TIF_MEMDIE is set
> and SIGKILL is handled (aka process is killed), but this seems to
> be a general signal handling issue not related with freezer after all
> OOM relies on SIGKILL to function. No?

The reason why it matters for the suspend-time freezing is that we freeze tasks
to take them out of the picture entirely until they are thawed.  Therefore we
can't allow them to go back to the picture just for a while until they are
killed.  Frozen tasks are not supposed to get back to the picture at all.

Rafael


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [Patch v4 1/2] freezer: check OOM kill while being frozen
  2014-09-08 23:42                                   ` Rafael J. Wysocki
@ 2014-09-08 23:29                                     ` Cong Wang
  2014-09-08 23:59                                       ` Rafael J. Wysocki
  2014-09-10 20:30                                       ` Cong Wang
  0 siblings, 2 replies; 61+ messages in thread
From: Cong Wang @ 2014-09-08 23:29 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Tejun Heo, LKML, David Rientjes, Michal Hocko, Andrew Morton

On Mon, Sep 8, 2014 at 4:42 PM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> On Monday, September 08, 2014 04:16:15 PM Cong Wang wrote:
>> On Mon, Sep 8, 2014 at 4:23 PM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
>> >
>> > The reason why it matters for the suspend-time freezing is that we freeze tasks
>> > to take them out of the picture entirely until they are thawed.  Therefore we
>> > can't allow them to go back to the picture just for a while until they are
>> > killed.  Frozen tasks are not supposed to get back to the picture at all.
>> >
>>
>>
>> Ok, then checking TIF_MEMDIE is unsafe for PM freeze, we should
>> keep the cgroup_freezing() test to make sure freeze request is from
>> cgroup not PM. Question got answered. :)
>
> Do I think correctly that cgroups freezing and system suspend are
> mutually exclusive?  If not, then this still is problematic.

Good point! Although rare, but it is possible we freeze a process both from
cgroup and PM. Hmm, this means we have to explicitly exclude PM rather
just checking cgroup freeze? Interesting, but I am not familiar with PM.

>
>> I will put the following as a comment:
>>
>> /* OOM killer may decide to kill this process after it is frozen,
>>    in this case SIGKILL can never be handled, so we should check
>>   TIF_MEMDIE and if it is set, thaw and let SIGKILL kill it.
>>   But for PM freeze, it is not allowed to get out even for a while,
>>   so we have to keep it being frozen. */
>>
>> Let me know if it looks good to you.
>
> Well, it reflects the reality, so it's good enough.  But please adhere to the
> coding style rules for comments.
>

Sure thing.

Thanks!

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [Patch v4 1/2] freezer: check OOM kill while being frozen
  2014-09-08 23:16                                 ` Cong Wang
@ 2014-09-08 23:42                                   ` Rafael J. Wysocki
  2014-09-08 23:29                                     ` Cong Wang
  0 siblings, 1 reply; 61+ messages in thread
From: Rafael J. Wysocki @ 2014-09-08 23:42 UTC (permalink / raw)
  To: Cong Wang; +Cc: Tejun Heo, LKML, David Rientjes, Michal Hocko, Andrew Morton

On Monday, September 08, 2014 04:16:15 PM Cong Wang wrote:
> On Mon, Sep 8, 2014 at 4:23 PM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> >
> > The reason why it matters for the suspend-time freezing is that we freeze tasks
> > to take them out of the picture entirely until they are thawed.  Therefore we
> > can't allow them to go back to the picture just for a while until they are
> > killed.  Frozen tasks are not supposed to get back to the picture at all.
> >
> 
> 
> Ok, then checking TIF_MEMDIE is unsafe for PM freeze, we should
> keep the cgroup_freezing() test to make sure freeze request is from
> cgroup not PM. Question got answered. :)

Do I think correctly that cgroups freezing and system suspend are
mutually exclusive?  If not, then this still is problematic.

> I will put the following as a comment:
> 
> /* OOM killer may decide to kill this process after it is frozen,
>    in this case SIGKILL can never be handled, so we should check
>   TIF_MEMDIE and if it is set, thaw and let SIGKILL kill it.
>   But for PM freeze, it is not allowed to get out even for a while,
>   so we have to keep it being frozen. */
> 
> Let me know if it looks good to you.

Well, it reflects the reality, so it's good enough.  But please adhere to the
coding style rules for comments.

-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [Patch v4 1/2] freezer: check OOM kill while being frozen
  2014-09-08 23:29                                     ` Cong Wang
@ 2014-09-08 23:59                                       ` Rafael J. Wysocki
  2014-09-10 20:30                                       ` Cong Wang
  1 sibling, 0 replies; 61+ messages in thread
From: Rafael J. Wysocki @ 2014-09-08 23:59 UTC (permalink / raw)
  To: Cong Wang; +Cc: Tejun Heo, LKML, David Rientjes, Michal Hocko, Andrew Morton

On Monday, September 08, 2014 04:29:24 PM Cong Wang wrote:
> On Mon, Sep 8, 2014 at 4:42 PM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> > On Monday, September 08, 2014 04:16:15 PM Cong Wang wrote:
> >> On Mon, Sep 8, 2014 at 4:23 PM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> >> >
> >> > The reason why it matters for the suspend-time freezing is that we freeze tasks
> >> > to take them out of the picture entirely until they are thawed.  Therefore we
> >> > can't allow them to go back to the picture just for a while until they are
> >> > killed.  Frozen tasks are not supposed to get back to the picture at all.
> >> >
> >>
> >>
> >> Ok, then checking TIF_MEMDIE is unsafe for PM freeze, we should
> >> keep the cgroup_freezing() test to make sure freeze request is from
> >> cgroup not PM. Question got answered. :)
> >
> > Do I think correctly that cgroups freezing and system suspend are
> > mutually exclusive?  If not, then this still is problematic.
> 
> Good point! Although rare, but it is possible we freeze a process both from
> cgroup and PM. Hmm, this means we have to explicitly exclude PM rather
> just checking cgroup freeze?

There's no way to do that currently as PM freezing is all-inclusive.

> Interesting, but I am not familiar with PM.

I'm wondering if the OOM killer might avoid killing frozen tasks instead?

Or there might be a "suspend" operation for the OOM killer that would be
run from a PM notifier to guaratee that the OOM killer would not mess up
with things during the entire system suspend (which surely is not quite
useful anyway).

-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [Patch v4 1/2] freezer: check OOM kill while being frozen
  2014-09-08 22:50                         ` Tejun Heo
  2014-09-08 23:15                           ` Rafael J. Wysocki
@ 2014-09-09 15:16                           ` Michal Hocko
  2014-09-09 15:23                             ` Tejun Heo
  1 sibling, 1 reply; 61+ messages in thread
From: Michal Hocko @ 2014-09-09 15:16 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Rafael J. Wysocki, Cong Wang, LKML, David Rientjes, Andrew Morton

On Tue 09-09-14 07:50:12, Tejun Heo wrote:
> Hello,
> 
> On Tue, Sep 09, 2014 at 12:48:28AM +0200, Rafael J. Wysocki wrote:
> > Well, not exactly from anywhere.  Just from where try_to_freeze() is called
> > I suppose?
> 
> Yeap, anywhere try_to_freeze() may be called.
> 
> > Which means that if this is a user space task, it won't to a lot before dying,
> > will it?
> 
> Userland tasks aren't likely to a lot of damages before dying but then
> again there's no guarantee.  As it currently stands, try_to_freeze()
> can be anywhere.

But OOM killer doesn't kill kernel threads as they do not own any
memory. So the check should be safe, no?
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [Patch v4 1/2] freezer: check OOM kill while being frozen
  2014-09-09 15:16                           ` Michal Hocko
@ 2014-09-09 15:23                             ` Tejun Heo
  2014-09-09 16:06                               ` Michal Hocko
  2014-09-09 20:48                               ` Rafael J. Wysocki
  0 siblings, 2 replies; 61+ messages in thread
From: Tejun Heo @ 2014-09-09 15:23 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Rafael J. Wysocki, Cong Wang, LKML, David Rientjes, Andrew Morton

On Tue, Sep 09, 2014 at 05:16:55PM +0200, Michal Hocko wrote:
> But OOM killer doesn't kill kernel threads as they do not own any
> memory. So the check should be safe, no?

Even for userland tasks, try_to_freeze() can currently be anywhere in
the kernel.  The frequently used ones are few but there are some odd
ones out, and, again, there's nothing enforcing any structure on
try_to_freeze() usage.  The other thing is that we may do quite a bit
during exiting including allocating memory.  Are those safe for system
PM?  Rafael, what exactly are the rules for PM?  What shouldn't
change?

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [Patch v4 1/2] freezer: check OOM kill while being frozen
  2014-09-09 15:23                             ` Tejun Heo
@ 2014-09-09 16:06                               ` Michal Hocko
  2014-09-09 16:46                                 ` Tejun Heo
  2014-09-09 20:48                               ` Rafael J. Wysocki
  1 sibling, 1 reply; 61+ messages in thread
From: Michal Hocko @ 2014-09-09 16:06 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Rafael J. Wysocki, Cong Wang, LKML, David Rientjes, Andrew Morton

On Wed 10-09-14 00:23:36, Tejun Heo wrote:
> On Tue, Sep 09, 2014 at 05:16:55PM +0200, Michal Hocko wrote:
> > But OOM killer doesn't kill kernel threads as they do not own any
> > memory. So the check should be safe, no?
> 
> Even for userland tasks, try_to_freeze() can currently be anywhere in
> the kernel.  The frequently used ones are few but there are some odd

I always thought that user space tasks can be in the fridge only on the
way out from the kernel (get_signal_to_deliver). I have quickly greped
the code and the only place I can see seems to be run_guest but that
one bails out quickly when there are signals pending so it should be
safe in this context.
cifs is doing something suspicious (cifs_reconnect) but I didn't check
more closely all the contexts it is called from.

> ones out, and, again, there's nothing enforcing any structure on
> try_to_freeze() usage. 

Would it make sense to have try_to_freeze_user_task or similar and check
for kernel thread in try_to_freeze and complain loudly if called from
user task context? I mean does it even make sense to call try_to_freeze
in the middle of kernel operation for a user task?

> The other thing is that we may do quite a bit during exiting including
> allocating memory. 

yes, we can allocate memory and even page fault on the exit path. But
TIF_MEMDIE will make sure that the allocation will be successful if
there is some memory left.

> Are those safe for system PM?  Rafael, what exactly are the rules for
> PM?  What shouldn't change?
> 
> Thanks.
> 
> -- 
> tejun

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [Patch v4 1/2] freezer: check OOM kill while being frozen
  2014-09-09 16:06                               ` Michal Hocko
@ 2014-09-09 16:46                                 ` Tejun Heo
  2014-09-09 17:12                                   ` Michal Hocko
  2014-09-09 20:53                                   ` Rafael J. Wysocki
  0 siblings, 2 replies; 61+ messages in thread
From: Tejun Heo @ 2014-09-09 16:46 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Rafael J. Wysocki, Cong Wang, LKML, David Rientjes, Andrew Morton

Hello,

On Tue, Sep 09, 2014 at 06:06:25PM +0200, Michal Hocko wrote:
> > Even for userland tasks, try_to_freeze() can currently be anywhere in
> > the kernel.  The frequently used ones are few but there are some odd
> 
> I always thought that user space tasks can be in the fridge only on the
> way out from the kernel (get_signal_to_deliver). I have quickly greped

It *can* be anywhere.  We used to have some deep in nfs.  They got
removed later due to deadlocks but in theory they still can be
anywhere.

> the code and the only place I can see seems to be run_guest but that
> one bails out quickly when there are signals pending so it should be
> safe in this context.
> cifs is doing something suspicious (cifs_reconnect) but I didn't check
> more closely all the contexts it is called from.

Prolly something similar with what nfs was doing?

> > ones out, and, again, there's nothing enforcing any structure on
> > try_to_freeze() usage. 
> 
> Would it make sense to have try_to_freeze_user_task or similar and check
> for kernel thread in try_to_freeze and complain loudly if called from
> user task context? I mean does it even make sense to call try_to_freeze
> in the middle of kernel operation for a user task?

I personally think the whole try_to_freeze() was a mistake at least
for userland tasks.  We should have collected them in a (mostly)
single place like a jobctl stop.  I'm not sure whether distinguishing
the two interfaces would buy us much tho.

> > The other thing is that we may do quite a bit during exiting including
> > allocating memory. 
> 
> yes, we can allocate memory and even page fault on the exit path. But
> TIF_MEMDIE will make sure that the allocation will be successful if
> there is some memory left.

TIF_MEMDIE ensures forward progress so that the task can exit;
however, I'm not sure whether all the things that a task does during
exit are safe for PM freezes.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [Patch v4 1/2] freezer: check OOM kill while being frozen
  2014-09-09 16:46                                 ` Tejun Heo
@ 2014-09-09 17:12                                   ` Michal Hocko
  2014-09-09 20:53                                   ` Rafael J. Wysocki
  1 sibling, 0 replies; 61+ messages in thread
From: Michal Hocko @ 2014-09-09 17:12 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Rafael J. Wysocki, Cong Wang, LKML, David Rientjes, Andrew Morton

On Wed 10-09-14 01:46:58, Tejun Heo wrote:
> Hello,
> 
> On Tue, Sep 09, 2014 at 06:06:25PM +0200, Michal Hocko wrote:
> > > Even for userland tasks, try_to_freeze() can currently be anywhere in
> > > the kernel.  The frequently used ones are few but there are some odd
> > 
> > I always thought that user space tasks can be in the fridge only on the
> > way out from the kernel (get_signal_to_deliver). I have quickly greped
> 
> It *can* be anywhere.  We used to have some deep in nfs.  They got
> removed later due to deadlocks but in theory they still can be
> anywhere.
> 
> > the code and the only place I can see seems to be run_guest but that
> > one bails out quickly when there are signals pending so it should be
> > safe in this context.
> > cifs is doing something suspicious (cifs_reconnect) but I didn't check
> > more closely all the contexts it is called from.
> 
> Prolly something similar with what nfs was doing?
> 
> > > ones out, and, again, there's nothing enforcing any structure on
> > > try_to_freeze() usage. 
> > 
> > Would it make sense to have try_to_freeze_user_task or similar and check
> > for kernel thread in try_to_freeze and complain loudly if called from
> > user task context? I mean does it even make sense to call try_to_freeze
> > in the middle of kernel operation for a user task?
> 
> I personally think the whole try_to_freeze() was a mistake at least
> for userland tasks.  We should have collected them in a (mostly)
> single place like a jobctl stop.  I'm not sure whether distinguishing
> the two interfaces would buy us much tho.

We can at least catch those who are doing something dubious. Then we
would have only a single place for user tasks and signal handling code
makes sense to me.

> > > The other thing is that we may do quite a bit during exiting including
> > > allocating memory. 
> > 
> > yes, we can allocate memory and even page fault on the exit path. But
> > TIF_MEMDIE will make sure that the allocation will be successful if
> > there is some memory left.
> 
> TIF_MEMDIE ensures forward progress so that the task can exit;
> however, I'm not sure whether all the things that a task does during
> exit are safe for PM freezes.

Last time we were discussing this it was safe (I cannot
find the particular discussion but here is the Rafael's
Ack for the previous patch which handled the same case
http://lkml.iu.edu//hypermail/linux/kernel/1109.3/00448.html)
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [Patch v4 1/2] freezer: check OOM kill while being frozen
  2014-09-09 15:23                             ` Tejun Heo
  2014-09-09 16:06                               ` Michal Hocko
@ 2014-09-09 20:48                               ` Rafael J. Wysocki
  2014-09-10  5:21                                 ` Cong Wang
  1 sibling, 1 reply; 61+ messages in thread
From: Rafael J. Wysocki @ 2014-09-09 20:48 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Michal Hocko, Cong Wang, LKML, David Rientjes, Andrew Morton

On Wednesday, September 10, 2014 12:23:36 AM Tejun Heo wrote:
> On Tue, Sep 09, 2014 at 05:16:55PM +0200, Michal Hocko wrote:
> > But OOM killer doesn't kill kernel threads as they do not own any
> > memory. So the check should be safe, no?
> 
> Even for userland tasks, try_to_freeze() can currently be anywhere in
> the kernel.  The frequently used ones are few but there are some odd
> ones out, and, again, there's nothing enforcing any structure on
> try_to_freeze() usage.  The other thing is that we may do quite a bit
> during exiting including allocating memory.  Are those safe for system
> PM?  Rafael, what exactly are the rules for PM?  What shouldn't
> change?

We can't make any assumptions regarding the availability of any devices.
That is, whatever can end up in device access may potentially fail.

Rafael


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [Patch v4 1/2] freezer: check OOM kill while being frozen
  2014-09-09 16:46                                 ` Tejun Heo
  2014-09-09 17:12                                   ` Michal Hocko
@ 2014-09-09 20:53                                   ` Rafael J. Wysocki
  2014-09-10 13:24                                     ` Michal Hocko
  1 sibling, 1 reply; 61+ messages in thread
From: Rafael J. Wysocki @ 2014-09-09 20:53 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Michal Hocko, Cong Wang, LKML, David Rientjes, Andrew Morton

On Wednesday, September 10, 2014 01:46:58 AM Tejun Heo wrote:
> Hello,
> 
> On Tue, Sep 09, 2014 at 06:06:25PM +0200, Michal Hocko wrote:
> > > Even for userland tasks, try_to_freeze() can currently be anywhere in
> > > the kernel.  The frequently used ones are few but there are some odd
> > 
> > I always thought that user space tasks can be in the fridge only on the
> > way out from the kernel (get_signal_to_deliver). I have quickly greped
> 
> It *can* be anywhere.  We used to have some deep in nfs.  They got
> removed later due to deadlocks but in theory they still can be
> anywhere.

Well, it would be good to determine the difference between theory and practice
in this particular respect, because if in practice it can't be anywhere,
we can just set an "every new instance of try_to_freeze() has to be documented"
rule (which may not be a bad idea anyway) and disallow people to break things.

Rafael


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [Patch v4 1/2] freezer: check OOM kill while being frozen
  2014-09-09 20:48                               ` Rafael J. Wysocki
@ 2014-09-10  5:21                                 ` Cong Wang
  0 siblings, 0 replies; 61+ messages in thread
From: Cong Wang @ 2014-09-10  5:21 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Tejun Heo, Michal Hocko, LKML, David Rientjes, Andrew Morton

On Tue, Sep 9, 2014 at 1:48 PM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> On Wednesday, September 10, 2014 12:23:36 AM Tejun Heo wrote:
>> On Tue, Sep 09, 2014 at 05:16:55PM +0200, Michal Hocko wrote:
>> > But OOM killer doesn't kill kernel threads as they do not own any
>> > memory. So the check should be safe, no?
>>
>> Even for userland tasks, try_to_freeze() can currently be anywhere in
>> the kernel.  The frequently used ones are few but there are some odd
>> ones out, and, again, there's nothing enforcing any structure on
>> try_to_freeze() usage.  The other thing is that we may do quite a bit
>> during exiting including allocating memory.  Are those safe for system
>> PM?  Rafael, what exactly are the rules for PM?  What shouldn't
>> change?
>
> We can't make any assumptions regarding the availability of any devices.
> That is, whatever can end up in device access may potentially fail.
>

So we still have to rule out PM freeze, right?

I am thinking about adding a new task flag or adding a new parameter
to __refrigerator() to distinguish PM freeze with cgroup freeze. Will try
to code tomorrow.

Thanks!

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [Patch v4 1/2] freezer: check OOM kill while being frozen
  2014-09-09 20:53                                   ` Rafael J. Wysocki
@ 2014-09-10 13:24                                     ` Michal Hocko
  2014-09-11 13:08                                       ` Michal Hocko
  2014-09-12 23:48                                       ` Tejun Heo
  0 siblings, 2 replies; 61+ messages in thread
From: Michal Hocko @ 2014-09-10 13:24 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Tejun Heo, Cong Wang, LKML, David Rientjes, Andrew Morton

On Tue 09-09-14 22:53:32, Rafael J. Wysocki wrote:
> On Wednesday, September 10, 2014 01:46:58 AM Tejun Heo wrote:
> > Hello,
> > 
> > On Tue, Sep 09, 2014 at 06:06:25PM +0200, Michal Hocko wrote:
> > > > Even for userland tasks, try_to_freeze() can currently be anywhere in
> > > > the kernel.  The frequently used ones are few but there are some odd
> > > 
> > > I always thought that user space tasks can be in the fridge only on the
> > > way out from the kernel (get_signal_to_deliver). I have quickly greped
> > 
> > It *can* be anywhere.  We used to have some deep in nfs.  They got
> > removed later due to deadlocks but in theory they still can be
> > anywhere.
> 
> Well, it would be good to determine the difference between theory and practice
> in this particular respect, because if in practice it can't be anywhere,
> we can just set an "every new instance of try_to_freeze() has to be documented"
> rule (which may not be a bad idea anyway) and disallow people to break things.

What do you think about this way to help distinguish kernel threads and
user processes and keep the future maintenance of freezer saner?
---
>From 0a28200a4439283753c10acabd1bcd7cc5989aae Mon Sep 17 00:00:00 2001
From: Michal Hocko <mhocko@suse.cz>
Date: Wed, 10 Sep 2014 11:04:40 +0200
Subject: [PATCH] freezer: make freezing of user tasks and kernel threads clear

There are slightly different requirements for kernel threads and user
tasks freezing. User tasks might get killed while they are in the fridge
(e.g. by OOM killer) and we have to make sure that such a task will die
as soon as possible. This means that try_to_freeze cannot be called from
an arbitrary code path.

The obviously safe place is on the way out from the kernel. We have two
such places currently, get_signal_to_deliver which is the main point
for user tasks freezing called on the way to userspace and lguest which
expedites return to the userspace as well when signals are pending.

Any new caller should be added with extreme caution and proper
justification. This is quite hard right now because both kernel threads
and user tasks context are calling the same function. This patch
keeps the old try_to_freeze only for kernel threads and adds a new
try_to_freeze_user_task which is aimed at user tasks. Both check process
flag and warn about improper usage.

Signed-off-by: Michal Hocko <mhocko@suse.cz>
---
 drivers/lguest/core.c   |  2 +-
 include/linux/freezer.h | 28 +++++++++++++++++++++++++++-
 kernel/signal.c         |  2 +-
 3 files changed, 29 insertions(+), 3 deletions(-)

diff --git a/drivers/lguest/core.c b/drivers/lguest/core.c
index 6590558d1d31..d8eb5ec8bcc7 100644
--- a/drivers/lguest/core.c
+++ b/drivers/lguest/core.c
@@ -239,7 +239,7 @@ int run_guest(struct lg_cpu *cpu, unsigned long __user *user)
 		 * thing called the freezer.  If the Host is trying to suspend,
 		 * it stops us.
 		 */
-		try_to_freeze();
+		try_to_freeze_user_task();
 
 		/* Check for signals */
 		if (signal_pending(current))
diff --git a/include/linux/freezer.h b/include/linux/freezer.h
index 7fd81b8c4897..2683ad5fdfe8 100644
--- a/include/linux/freezer.h
+++ b/include/linux/freezer.h
@@ -59,13 +59,39 @@ static inline bool try_to_freeze_unsafe(void)
 	return __refrigerator(false);
 }
 
-static inline bool try_to_freeze(void)
+/* A helper function. Do not use directly. */
+static inline bool __try_to_freeze(void)
 {
 	if (!(current->flags & PF_NOFREEZE))
 		debug_check_no_locks_held();
 	return try_to_freeze_unsafe();
 }
 
+/*
+ * Only kernel threads are allowed to call try_to_freeze.
+ */
+static inline bool try_to_freeze(void)
+{
+	WARN_ON(!(current->flags & PF_KTHREAD));
+
+	return __try_to_freeze();
+}
+
+/*
+ * User tasks might get frozen only on specific places
+ * where we know they will get to user space or die on
+ * signal very quickly.
+ *
+ * Every new user of this function has to be consulted with
+ * PM/freezer maintainers.
+ */
+static inline bool try_to_freeze_user_task(void)
+{
+	WARN_ON(current->flags & PF_KTHREAD);
+
+	return __try_to_freeze();
+}
+
 extern bool freeze_task(struct task_struct *p);
 extern bool set_freezable(void);
 
diff --git a/kernel/signal.c b/kernel/signal.c
index a4077e90f19f..c1860564a74d 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -2184,7 +2184,7 @@ int get_signal_to_deliver(siginfo_t *info, struct k_sigaction *return_ka,
 	 * do_signal_stop() and ptrace_stop() do freezable_schedule() and
 	 * thus do not need another check after return.
 	 */
-	try_to_freeze();
+	try_to_freeze_user_task();
 
 relock:
 	spin_lock_irq(&sighand->siglock);
-- 
2.1.0

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* Re: [Patch v4 1/2] freezer: check OOM kill while being frozen
  2014-09-08 23:29                                     ` Cong Wang
  2014-09-08 23:59                                       ` Rafael J. Wysocki
@ 2014-09-10 20:30                                       ` Cong Wang
  2014-09-10 23:38                                         ` Rafael J. Wysocki
  2014-09-11 16:30                                         ` Michal Hocko
  1 sibling, 2 replies; 61+ messages in thread
From: Cong Wang @ 2014-09-10 20:30 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Tejun Heo, LKML, David Rientjes, Michal Hocko, Andrew Morton

On Mon, Sep 8, 2014 at 4:29 PM, Cong Wang <xiyou.wangcong@gmail.com> wrote:
> On Mon, Sep 8, 2014 at 4:42 PM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
>> On Monday, September 08, 2014 04:16:15 PM Cong Wang wrote:
>>> On Mon, Sep 8, 2014 at 4:23 PM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
>>> >
>>> > The reason why it matters for the suspend-time freezing is that we freeze tasks
>>> > to take them out of the picture entirely until they are thawed.  Therefore we
>>> > can't allow them to go back to the picture just for a while until they are
>>> > killed.  Frozen tasks are not supposed to get back to the picture at all.
>>> >
>>>
>>>
>>> Ok, then checking TIF_MEMDIE is unsafe for PM freeze, we should
>>> keep the cgroup_freezing() test to make sure freeze request is from
>>> cgroup not PM. Question got answered. :)
>>
>> Do I think correctly that cgroups freezing and system suspend are
>> mutually exclusive?  If not, then this still is problematic.
>
> Good point! Although rare, but it is possible we freeze a process both from
> cgroup and PM. Hmm, this means we have to explicitly exclude PM rather
> just checking cgroup freeze? Interesting, but I am not familiar with PM.
>

I am wondering if the folllowing check makes any sense with regarding
to rule out PM freeze:

        if ((!pm_nosig_freezing && !pm_freezing) &&
            cgroup_freezing(current) && test_thread_flag(TIF_MEMDIE))
                return true;

PM will not freeze the process again if it is already frozen by cgroup,
so here we just make sure that it will not thaw if PM freeze is in progress.

Am I missing anything?

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [Patch v4 1/2] freezer: check OOM kill while being frozen
  2014-09-10 23:38                                         ` Rafael J. Wysocki
@ 2014-09-10 23:20                                           ` Cong Wang
  0 siblings, 0 replies; 61+ messages in thread
From: Cong Wang @ 2014-09-10 23:20 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Tejun Heo, LKML, David Rientjes, Michal Hocko, Andrew Morton

On Wed, Sep 10, 2014 at 4:38 PM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> On Wednesday, September 10, 2014 01:30:25 PM Cong Wang wrote:
>> I am wondering if the folllowing check makes any sense with regarding
>> to rule out PM freeze:
>>
>>         if ((!pm_nosig_freezing && !pm_freezing) &&
>>             cgroup_freezing(current) && test_thread_flag(TIF_MEMDIE))
>>                 return true;
>>
>> PM will not freeze the process again if it is already frozen by cgroup,
>> so here we just make sure that it will not thaw if PM freeze is in progress.
>>
>> Am I missing anything?
>
> And where exactly would you like to put this check?  Into __refrigerator()?

Yes, still in should_thaw_current() which is called in __refrigerator().

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [Patch v4 1/2] freezer: check OOM kill while being frozen
  2014-09-10 20:30                                       ` Cong Wang
@ 2014-09-10 23:38                                         ` Rafael J. Wysocki
  2014-09-10 23:20                                           ` Cong Wang
  2014-09-11 16:30                                         ` Michal Hocko
  1 sibling, 1 reply; 61+ messages in thread
From: Rafael J. Wysocki @ 2014-09-10 23:38 UTC (permalink / raw)
  To: Cong Wang; +Cc: Tejun Heo, LKML, David Rientjes, Michal Hocko, Andrew Morton

On Wednesday, September 10, 2014 01:30:25 PM Cong Wang wrote:
> On Mon, Sep 8, 2014 at 4:29 PM, Cong Wang <xiyou.wangcong@gmail.com> wrote:
> > On Mon, Sep 8, 2014 at 4:42 PM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> >> On Monday, September 08, 2014 04:16:15 PM Cong Wang wrote:
> >>> On Mon, Sep 8, 2014 at 4:23 PM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> >>> >
> >>> > The reason why it matters for the suspend-time freezing is that we freeze tasks
> >>> > to take them out of the picture entirely until they are thawed.  Therefore we
> >>> > can't allow them to go back to the picture just for a while until they are
> >>> > killed.  Frozen tasks are not supposed to get back to the picture at all.
> >>> >
> >>>
> >>>
> >>> Ok, then checking TIF_MEMDIE is unsafe for PM freeze, we should
> >>> keep the cgroup_freezing() test to make sure freeze request is from
> >>> cgroup not PM. Question got answered. :)
> >>
> >> Do I think correctly that cgroups freezing and system suspend are
> >> mutually exclusive?  If not, then this still is problematic.
> >
> > Good point! Although rare, but it is possible we freeze a process both from
> > cgroup and PM. Hmm, this means we have to explicitly exclude PM rather
> > just checking cgroup freeze? Interesting, but I am not familiar with PM.
> >
> 
> I am wondering if the folllowing check makes any sense with regarding
> to rule out PM freeze:
> 
>         if ((!pm_nosig_freezing && !pm_freezing) &&
>             cgroup_freezing(current) && test_thread_flag(TIF_MEMDIE))
>                 return true;
> 
> PM will not freeze the process again if it is already frozen by cgroup,
> so here we just make sure that it will not thaw if PM freeze is in progress.
> 
> Am I missing anything?

And where exactly would you like to put this check?  Into __refrigerator()?

-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [Patch v4 1/2] freezer: check OOM kill while being frozen
  2014-09-10 13:24                                     ` Michal Hocko
@ 2014-09-11 13:08                                       ` Michal Hocko
  2014-09-11 14:17                                         ` Rafael J. Wysocki
  2014-09-12 23:48                                       ` Tejun Heo
  1 sibling, 1 reply; 61+ messages in thread
From: Michal Hocko @ 2014-09-11 13:08 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Tejun Heo, Cong Wang, LKML, David Rientjes, Andrew Morton

On Wed 10-09-14 15:24:17, Michal Hocko wrote:
> On Tue 09-09-14 22:53:32, Rafael J. Wysocki wrote:
> > On Wednesday, September 10, 2014 01:46:58 AM Tejun Heo wrote:
> > > Hello,
> > > 
> > > On Tue, Sep 09, 2014 at 06:06:25PM +0200, Michal Hocko wrote:
> > > > > Even for userland tasks, try_to_freeze() can currently be anywhere in
> > > > > the kernel.  The frequently used ones are few but there are some odd
> > > > 
> > > > I always thought that user space tasks can be in the fridge only on the
> > > > way out from the kernel (get_signal_to_deliver). I have quickly greped
> > > 
> > > It *can* be anywhere.  We used to have some deep in nfs.  They got
> > > removed later due to deadlocks but in theory they still can be
> > > anywhere.
> > 
> > Well, it would be good to determine the difference between theory and practice
> > in this particular respect, because if in practice it can't be anywhere,
> > we can just set an "every new instance of try_to_freeze() has to be documented"
> > rule (which may not be a bad idea anyway) and disallow people to break things.
> 
> What do you think about this way to help distinguish kernel threads and
> user processes and keep the future maintenance of freezer saner?

This is not enough. I have completely missed freezer_count... Will have
to think about this. There is quite large base of callers of this... It
won't be trivial to check whether all of them are safe. Especially after
467de1fc67d1b which made the functionality usable for !user tasks as
well. Btw. is this even desirable? The follow up patch (33e638b9070b)
have (ab)used freezer_{do_not_}count just to be reverted later by
72081624d5ad3. I haven't checked other kernel thread users which might be
added later. Don't rather put try_to_freeze_user_task into freezer_count
to reflect the original intention (assuming that a separate API for user
and kthread is desirable of course).

> ---
> From 0a28200a4439283753c10acabd1bcd7cc5989aae Mon Sep 17 00:00:00 2001
> From: Michal Hocko <mhocko@suse.cz>
> Date: Wed, 10 Sep 2014 11:04:40 +0200
> Subject: [PATCH] freezer: make freezing of user tasks and kernel threads clear
> 
> There are slightly different requirements for kernel threads and user
> tasks freezing. User tasks might get killed while they are in the fridge
> (e.g. by OOM killer) and we have to make sure that such a task will die
> as soon as possible. This means that try_to_freeze cannot be called from
> an arbitrary code path.
> 
> The obviously safe place is on the way out from the kernel. We have two
> such places currently, get_signal_to_deliver which is the main point
> for user tasks freezing called on the way to userspace and lguest which
> expedites return to the userspace as well when signals are pending.
> 
> Any new caller should be added with extreme caution and proper
> justification. This is quite hard right now because both kernel threads
> and user tasks context are calling the same function. This patch
> keeps the old try_to_freeze only for kernel threads and adds a new
> try_to_freeze_user_task which is aimed at user tasks. Both check process
> flag and warn about improper usage.
> 
> Signed-off-by: Michal Hocko <mhocko@suse.cz>
> ---
>  drivers/lguest/core.c   |  2 +-
>  include/linux/freezer.h | 28 +++++++++++++++++++++++++++-
>  kernel/signal.c         |  2 +-
>  3 files changed, 29 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/lguest/core.c b/drivers/lguest/core.c
> index 6590558d1d31..d8eb5ec8bcc7 100644
> --- a/drivers/lguest/core.c
> +++ b/drivers/lguest/core.c
> @@ -239,7 +239,7 @@ int run_guest(struct lg_cpu *cpu, unsigned long __user *user)
>  		 * thing called the freezer.  If the Host is trying to suspend,
>  		 * it stops us.
>  		 */
> -		try_to_freeze();
> +		try_to_freeze_user_task();
>  
>  		/* Check for signals */
>  		if (signal_pending(current))
> diff --git a/include/linux/freezer.h b/include/linux/freezer.h
> index 7fd81b8c4897..2683ad5fdfe8 100644
> --- a/include/linux/freezer.h
> +++ b/include/linux/freezer.h
> @@ -59,13 +59,39 @@ static inline bool try_to_freeze_unsafe(void)
>  	return __refrigerator(false);
>  }
>  
> -static inline bool try_to_freeze(void)
> +/* A helper function. Do not use directly. */
> +static inline bool __try_to_freeze(void)
>  {
>  	if (!(current->flags & PF_NOFREEZE))
>  		debug_check_no_locks_held();
>  	return try_to_freeze_unsafe();
>  }
>  
> +/*
> + * Only kernel threads are allowed to call try_to_freeze.
> + */
> +static inline bool try_to_freeze(void)
> +{
> +	WARN_ON(!(current->flags & PF_KTHREAD));
> +
> +	return __try_to_freeze();
> +}
> +
> +/*
> + * User tasks might get frozen only on specific places
> + * where we know they will get to user space or die on
> + * signal very quickly.
> + *
> + * Every new user of this function has to be consulted with
> + * PM/freezer maintainers.
> + */
> +static inline bool try_to_freeze_user_task(void)
> +{
> +	WARN_ON(current->flags & PF_KTHREAD);
> +
> +	return __try_to_freeze();
> +}
> +
>  extern bool freeze_task(struct task_struct *p);
>  extern bool set_freezable(void);
>  
> diff --git a/kernel/signal.c b/kernel/signal.c
> index a4077e90f19f..c1860564a74d 100644
> --- a/kernel/signal.c
> +++ b/kernel/signal.c
> @@ -2184,7 +2184,7 @@ int get_signal_to_deliver(siginfo_t *info, struct k_sigaction *return_ka,
>  	 * do_signal_stop() and ptrace_stop() do freezable_schedule() and
>  	 * thus do not need another check after return.
>  	 */
> -	try_to_freeze();
> +	try_to_freeze_user_task();
>  
>  relock:
>  	spin_lock_irq(&sighand->siglock);
> -- 
> 2.1.0
> 
> -- 
> Michal Hocko
> SUSE Labs
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [Patch v4 1/2] freezer: check OOM kill while being frozen
  2014-09-11 14:17                                         ` Rafael J. Wysocki
@ 2014-09-11 14:04                                           ` Michal Hocko
  2014-09-11 14:26                                             ` Rafael J. Wysocki
  0 siblings, 1 reply; 61+ messages in thread
From: Michal Hocko @ 2014-09-11 14:04 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Tejun Heo, Cong Wang, LKML, David Rientjes, Andrew Morton

On Thu 11-09-14 16:17:56, Rafael J. Wysocki wrote:
[...]
> And I'm still wondering if the OOM killer may be made avoid killing frozen
> tasks.

This is really tricky. OOM killer aims at the biggest memory hog. We
shouldn't ignore it just because it hides into the fridge... So even
if we "fix" oom killer to ignore frozen tasks (which is inherently
racy btw.) then we have a potential problem of freezer abuse (e.g. in
container environments). So I strongly believe that the OOM killer has
to be able to kill a frozen tasks.
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [Patch v4 1/2] freezer: check OOM kill while being frozen
  2014-09-11 14:26                                             ` Rafael J. Wysocki
@ 2014-09-11 14:10                                               ` Michal Hocko
  2014-09-11 14:32                                                 ` Rafael J. Wysocki
  0 siblings, 1 reply; 61+ messages in thread
From: Michal Hocko @ 2014-09-11 14:10 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Tejun Heo, Cong Wang, LKML, David Rientjes, Andrew Morton

On Thu 11-09-14 16:26:56, Rafael J. Wysocki wrote:
> On Thursday, September 11, 2014 04:04:48 PM Michal Hocko wrote:
> > On Thu 11-09-14 16:17:56, Rafael J. Wysocki wrote:
> > [...]
> > > And I'm still wondering if the OOM killer may be made avoid killing frozen
> > > tasks.
> > 
> > This is really tricky. OOM killer aims at the biggest memory hog. We
> > shouldn't ignore it just because it hides into the fridge... So even
> > if we "fix" oom killer to ignore frozen tasks (which is inherently
> > racy btw.) then we have a potential problem of freezer abuse (e.g. in
> > container environments). So I strongly believe that the OOM killer has
> > to be able to kill a frozen tasks.
> 
> OK
> 
> Is the OOM killer the only place where TIF_MEMDIE is set?

Yes. To be precise, lowmemorykiller (staging android thingy), global OOM
killer and memcg OOM killer. Any other users would be an abuse.

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [Patch v4 1/2] freezer: check OOM kill while being frozen
  2014-09-11 13:08                                       ` Michal Hocko
@ 2014-09-11 14:17                                         ` Rafael J. Wysocki
  2014-09-11 14:04                                           ` Michal Hocko
  0 siblings, 1 reply; 61+ messages in thread
From: Rafael J. Wysocki @ 2014-09-11 14:17 UTC (permalink / raw)
  To: Michal Hocko; +Cc: Tejun Heo, Cong Wang, LKML, David Rientjes, Andrew Morton

On Thursday, September 11, 2014 03:08:40 PM Michal Hocko wrote:
> On Wed 10-09-14 15:24:17, Michal Hocko wrote:
> > On Tue 09-09-14 22:53:32, Rafael J. Wysocki wrote:
> > > On Wednesday, September 10, 2014 01:46:58 AM Tejun Heo wrote:
> > > > Hello,
> > > > 
> > > > On Tue, Sep 09, 2014 at 06:06:25PM +0200, Michal Hocko wrote:
> > > > > > Even for userland tasks, try_to_freeze() can currently be anywhere in
> > > > > > the kernel.  The frequently used ones are few but there are some odd
> > > > > 
> > > > > I always thought that user space tasks can be in the fridge only on the
> > > > > way out from the kernel (get_signal_to_deliver). I have quickly greped
> > > > 
> > > > It *can* be anywhere.  We used to have some deep in nfs.  They got
> > > > removed later due to deadlocks but in theory they still can be
> > > > anywhere.
> > > 
> > > Well, it would be good to determine the difference between theory and practice
> > > in this particular respect, because if in practice it can't be anywhere,
> > > we can just set an "every new instance of try_to_freeze() has to be documented"
> > > rule (which may not be a bad idea anyway) and disallow people to break things.
> > 
> > What do you think about this way to help distinguish kernel threads and
> > user processes and keep the future maintenance of freezer saner?
> 
> This is not enough. I have completely missed freezer_count... Will have
> to think about this. There is quite large base of callers of this... It
> won't be trivial to check whether all of them are safe. Especially after
> 467de1fc67d1b which made the functionality usable for !user tasks as
> well. Btw. is this even desirable? The follow up patch (33e638b9070b)
> have (ab)used freezer_{do_not_}count just to be reverted later by
> 72081624d5ad3. I haven't checked other kernel thread users which might be
> added later. Don't rather put try_to_freeze_user_task into freezer_count
> to reflect the original intention (assuming that a separate API for user
> and kthread is desirable of course).

Well, to me the only difference is that user space is frozen "automatically"
while kernel threads need to flag themselves as "freezable" and then call
try_to_freeze() directly somewhere.  Whether or not that difference is big
enough for a separate API is a good question and I'm not sure if they can be
entirely separate anyway.

And I'm still wondering if the OOM killer may be made avoid killing frozen
tasks.

Rafael


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [Patch v4 1/2] freezer: check OOM kill while being frozen
  2014-09-11 14:04                                           ` Michal Hocko
@ 2014-09-11 14:26                                             ` Rafael J. Wysocki
  2014-09-11 14:10                                               ` Michal Hocko
  0 siblings, 1 reply; 61+ messages in thread
From: Rafael J. Wysocki @ 2014-09-11 14:26 UTC (permalink / raw)
  To: Michal Hocko; +Cc: Tejun Heo, Cong Wang, LKML, David Rientjes, Andrew Morton

On Thursday, September 11, 2014 04:04:48 PM Michal Hocko wrote:
> On Thu 11-09-14 16:17:56, Rafael J. Wysocki wrote:
> [...]
> > And I'm still wondering if the OOM killer may be made avoid killing frozen
> > tasks.
> 
> This is really tricky. OOM killer aims at the biggest memory hog. We
> shouldn't ignore it just because it hides into the fridge... So even
> if we "fix" oom killer to ignore frozen tasks (which is inherently
> racy btw.) then we have a potential problem of freezer abuse (e.g. in
> container environments). So I strongly believe that the OOM killer has
> to be able to kill a frozen tasks.

OK

Is the OOM killer the only place where TIF_MEMDIE is set?

Rafael


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [Patch v4 1/2] freezer: check OOM kill while being frozen
  2014-09-11 14:32                                                 ` Rafael J. Wysocki
@ 2014-09-11 14:28                                                   ` Michal Hocko
  2014-09-11 14:52                                                     ` Rafael J. Wysocki
  0 siblings, 1 reply; 61+ messages in thread
From: Michal Hocko @ 2014-09-11 14:28 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Tejun Heo, Cong Wang, LKML, David Rientjes, Andrew Morton

On Thu 11-09-14 16:32:50, Rafael J. Wysocki wrote:
> On Thursday, September 11, 2014 04:10:51 PM Michal Hocko wrote:
> > On Thu 11-09-14 16:26:56, Rafael J. Wysocki wrote:
> > > On Thursday, September 11, 2014 04:04:48 PM Michal Hocko wrote:
> > > > On Thu 11-09-14 16:17:56, Rafael J. Wysocki wrote:
> > > > [...]
> > > > > And I'm still wondering if the OOM killer may be made avoid killing frozen
> > > > > tasks.
> > > > 
> > > > This is really tricky. OOM killer aims at the biggest memory hog. We
> > > > shouldn't ignore it just because it hides into the fridge... So even
> > > > if we "fix" oom killer to ignore frozen tasks (which is inherently
> > > > racy btw.) then we have a potential problem of freezer abuse (e.g. in
> > > > container environments). So I strongly believe that the OOM killer has
> > > > to be able to kill a frozen tasks.
> > > 
> > > OK
> > > 
> > > Is the OOM killer the only place where TIF_MEMDIE is set?
> > 
> > Yes. To be precise, lowmemorykiller (staging android thingy), global OOM
> > killer and memcg OOM killer. Any other users would be an abuse.
> 
> OK
> 
> So can we ensure that those things don't trigger during system suspend (or
> equivalent) and then simply use the TIF_MEMDIE check in __refrigerator()?

That would require that no memory allocation triggers OOM killer during
suspend. I don't think this will work out. OOM killer is the last resort
action. We cannot simply give access to memory reserves just because the
current context is in the middle of suspend.

What is the worst thing that might happen when a task is killed in the
middle of suspend? I thought that suspend would fail after several
attempts to suspend all existing tasks.

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [Patch v4 1/2] freezer: check OOM kill while being frozen
  2014-09-11 14:10                                               ` Michal Hocko
@ 2014-09-11 14:32                                                 ` Rafael J. Wysocki
  2014-09-11 14:28                                                   ` Michal Hocko
  0 siblings, 1 reply; 61+ messages in thread
From: Rafael J. Wysocki @ 2014-09-11 14:32 UTC (permalink / raw)
  To: Michal Hocko; +Cc: Tejun Heo, Cong Wang, LKML, David Rientjes, Andrew Morton

On Thursday, September 11, 2014 04:10:51 PM Michal Hocko wrote:
> On Thu 11-09-14 16:26:56, Rafael J. Wysocki wrote:
> > On Thursday, September 11, 2014 04:04:48 PM Michal Hocko wrote:
> > > On Thu 11-09-14 16:17:56, Rafael J. Wysocki wrote:
> > > [...]
> > > > And I'm still wondering if the OOM killer may be made avoid killing frozen
> > > > tasks.
> > > 
> > > This is really tricky. OOM killer aims at the biggest memory hog. We
> > > shouldn't ignore it just because it hides into the fridge... So even
> > > if we "fix" oom killer to ignore frozen tasks (which is inherently
> > > racy btw.) then we have a potential problem of freezer abuse (e.g. in
> > > container environments). So I strongly believe that the OOM killer has
> > > to be able to kill a frozen tasks.
> > 
> > OK
> > 
> > Is the OOM killer the only place where TIF_MEMDIE is set?
> 
> Yes. To be precise, lowmemorykiller (staging android thingy), global OOM
> killer and memcg OOM killer. Any other users would be an abuse.

OK

So can we ensure that those things don't trigger during system suspend (or
equivalent) and then simply use the TIF_MEMDIE check in __refrigerator()?

Rafael


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [Patch v4 1/2] freezer: check OOM kill while being frozen
  2014-09-11 14:52                                                     ` Rafael J. Wysocki
@ 2014-09-11 14:45                                                       ` Michal Hocko
  2014-09-14 16:39                                                         ` Rafael J. Wysocki
  0 siblings, 1 reply; 61+ messages in thread
From: Michal Hocko @ 2014-09-11 14:45 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Tejun Heo, Cong Wang, LKML, David Rientjes, Andrew Morton

On Thu 11-09-14 16:52:50, Rafael J. Wysocki wrote:
> On Thursday, September 11, 2014 04:28:22 PM Michal Hocko wrote:
> > On Thu 11-09-14 16:32:50, Rafael J. Wysocki wrote:
> > > On Thursday, September 11, 2014 04:10:51 PM Michal Hocko wrote:
> > > > On Thu 11-09-14 16:26:56, Rafael J. Wysocki wrote:
> > > > > On Thursday, September 11, 2014 04:04:48 PM Michal Hocko wrote:
> > > > > > On Thu 11-09-14 16:17:56, Rafael J. Wysocki wrote:
> > > > > > [...]
> > > > > > > And I'm still wondering if the OOM killer may be made avoid killing frozen
> > > > > > > tasks.
> > > > > > 
> > > > > > This is really tricky. OOM killer aims at the biggest memory hog. We
> > > > > > shouldn't ignore it just because it hides into the fridge... So even
> > > > > > if we "fix" oom killer to ignore frozen tasks (which is inherently
> > > > > > racy btw.) then we have a potential problem of freezer abuse (e.g. in
> > > > > > container environments). So I strongly believe that the OOM killer has
> > > > > > to be able to kill a frozen tasks.
> > > > > 
> > > > > OK
> > > > > 
> > > > > Is the OOM killer the only place where TIF_MEMDIE is set?
> > > > 
> > > > Yes. To be precise, lowmemorykiller (staging android thingy), global OOM
> > > > killer and memcg OOM killer. Any other users would be an abuse.
> > > 
> > > OK
> > > 
> > > So can we ensure that those things don't trigger during system suspend (or
> > > equivalent) and then simply use the TIF_MEMDIE check in __refrigerator()?
> > 
> > That would require that no memory allocation triggers OOM killer during
> > suspend. I don't think this will work out. OOM killer is the last resort
> > action. We cannot simply give access to memory reserves just because the
> > current context is in the middle of suspend.
> 
> But we can fail the allocation, can't we?

We already do that by oom_killer_disable after all tasks are frozen in
freeze_processes. This is before all other device specific things are
done so I guess we cannot start killing after any device is suspended,
right? This should be sufficient.

> > What is the worst thing that might happen when a task is killed in the
> > middle of suspend? I thought that suspend would fail after several
> > attempts to suspend all existing tasks.
> 
> The problem is what to do when we need to kill a frozen task.
> 
> In that case we need to thaw it and then it will die eventually.  Unfortunately,
> it generally can do something undesirable before dying.  That may be accessing
> a suspended device, for example.

OK, I have misunderstood you obviously. I thought we were discussing OOM
while we are in the middle of freezing tasks. After they are frozen
there is no OOM killer as per above.
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [Patch v4 1/2] freezer: check OOM kill while being frozen
  2014-09-11 14:28                                                   ` Michal Hocko
@ 2014-09-11 14:52                                                     ` Rafael J. Wysocki
  2014-09-11 14:45                                                       ` Michal Hocko
  0 siblings, 1 reply; 61+ messages in thread
From: Rafael J. Wysocki @ 2014-09-11 14:52 UTC (permalink / raw)
  To: Michal Hocko; +Cc: Tejun Heo, Cong Wang, LKML, David Rientjes, Andrew Morton

On Thursday, September 11, 2014 04:28:22 PM Michal Hocko wrote:
> On Thu 11-09-14 16:32:50, Rafael J. Wysocki wrote:
> > On Thursday, September 11, 2014 04:10:51 PM Michal Hocko wrote:
> > > On Thu 11-09-14 16:26:56, Rafael J. Wysocki wrote:
> > > > On Thursday, September 11, 2014 04:04:48 PM Michal Hocko wrote:
> > > > > On Thu 11-09-14 16:17:56, Rafael J. Wysocki wrote:
> > > > > [...]
> > > > > > And I'm still wondering if the OOM killer may be made avoid killing frozen
> > > > > > tasks.
> > > > > 
> > > > > This is really tricky. OOM killer aims at the biggest memory hog. We
> > > > > shouldn't ignore it just because it hides into the fridge... So even
> > > > > if we "fix" oom killer to ignore frozen tasks (which is inherently
> > > > > racy btw.) then we have a potential problem of freezer abuse (e.g. in
> > > > > container environments). So I strongly believe that the OOM killer has
> > > > > to be able to kill a frozen tasks.
> > > > 
> > > > OK
> > > > 
> > > > Is the OOM killer the only place where TIF_MEMDIE is set?
> > > 
> > > Yes. To be precise, lowmemorykiller (staging android thingy), global OOM
> > > killer and memcg OOM killer. Any other users would be an abuse.
> > 
> > OK
> > 
> > So can we ensure that those things don't trigger during system suspend (or
> > equivalent) and then simply use the TIF_MEMDIE check in __refrigerator()?
> 
> That would require that no memory allocation triggers OOM killer during
> suspend. I don't think this will work out. OOM killer is the last resort
> action. We cannot simply give access to memory reserves just because the
> current context is in the middle of suspend.

But we can fail the allocation, can't we?

> What is the worst thing that might happen when a task is killed in the
> middle of suspend? I thought that suspend would fail after several
> attempts to suspend all existing tasks.

The problem is what to do when we need to kill a frozen task.

In that case we need to thaw it and then it will die eventually.  Unfortunately,
it generally can do something undesirable before dying.  That may be accessing
a suspended device, for example.

Rafael


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [Patch v4 1/2] freezer: check OOM kill while being frozen
  2014-09-10 20:30                                       ` Cong Wang
  2014-09-10 23:38                                         ` Rafael J. Wysocki
@ 2014-09-11 16:30                                         ` Michal Hocko
  2014-09-12 23:59                                           ` Tejun Heo
  1 sibling, 1 reply; 61+ messages in thread
From: Michal Hocko @ 2014-09-11 16:30 UTC (permalink / raw)
  To: Cong Wang
  Cc: Rafael J. Wysocki, Tejun Heo, LKML, David Rientjes, Andrew Morton

On Wed 10-09-14 13:30:25, Cong Wang wrote:
> On Mon, Sep 8, 2014 at 4:29 PM, Cong Wang <xiyou.wangcong@gmail.com> wrote:
> > On Mon, Sep 8, 2014 at 4:42 PM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> >> On Monday, September 08, 2014 04:16:15 PM Cong Wang wrote:
> >>> On Mon, Sep 8, 2014 at 4:23 PM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> >>> >
> >>> > The reason why it matters for the suspend-time freezing is that we freeze tasks
> >>> > to take them out of the picture entirely until they are thawed.  Therefore we
> >>> > can't allow them to go back to the picture just for a while until they are
> >>> > killed.  Frozen tasks are not supposed to get back to the picture at all.
> >>> >
> >>>
> >>>
> >>> Ok, then checking TIF_MEMDIE is unsafe for PM freeze, we should
> >>> keep the cgroup_freezing() test to make sure freeze request is from
> >>> cgroup not PM. Question got answered. :)
> >>
> >> Do I think correctly that cgroups freezing and system suspend are
> >> mutually exclusive?  If not, then this still is problematic.
> >
> > Good point! Although rare, but it is possible we freeze a process both from
> > cgroup and PM. Hmm, this means we have to explicitly exclude PM rather
> > just checking cgroup freeze? Interesting, but I am not familiar with PM.
> >
> 
> I am wondering if the folllowing check makes any sense with regarding
> to rule out PM freeze:
> 
>         if ((!pm_nosig_freezing && !pm_freezing) &&
>             cgroup_freezing(current) && test_thread_flag(TIF_MEMDIE))
>                 return true;

Why is this needed in the first place? What if OOM happens in the middle
of task freezing during freeze_processes? Also killing a task at that
stage should be safe as no devices are suspended yet and OOM killer is
disabled after all tasks are frozen and allocations fail at that stage.
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [Patch v4 1/2] freezer: check OOM kill while being frozen
  2014-09-10 13:24                                     ` Michal Hocko
  2014-09-11 13:08                                       ` Michal Hocko
@ 2014-09-12 23:48                                       ` Tejun Heo
  2014-09-15 14:28                                         ` Michal Hocko
  1 sibling, 1 reply; 61+ messages in thread
From: Tejun Heo @ 2014-09-12 23:48 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Rafael J. Wysocki, Cong Wang, LKML, David Rientjes, Andrew Morton

Hello, Michal.

On Wed, Sep 10, 2014 at 03:24:17PM +0200, Michal Hocko wrote:
> What do you think about this way to help distinguish kernel threads and
> user processes and keep the future maintenance of freezer saner?

I'm not sure either way.  Please note that we have quite a few
wrappers around try_to_freeze() - freezable_schedule*() and
wait_event_freezable*() - and several places where a userland can be
frozen.  Are we gonna introduce an alternate versions for all of them?
What do we actually gain?

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [Patch v4 1/2] freezer: check OOM kill while being frozen
  2014-09-11 16:30                                         ` Michal Hocko
@ 2014-09-12 23:59                                           ` Tejun Heo
  2014-09-14 16:43                                             ` Rafael J. Wysocki
  0 siblings, 1 reply; 61+ messages in thread
From: Tejun Heo @ 2014-09-12 23:59 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Cong Wang, Rafael J. Wysocki, LKML, David Rientjes, Andrew Morton

On Thu, Sep 11, 2014 at 06:30:04PM +0200, Michal Hocko wrote:
> > I am wondering if the folllowing check makes any sense with regarding
> > to rule out PM freeze:
> > 
> >         if ((!pm_nosig_freezing && !pm_freezing) &&
> >             cgroup_freezing(current) && test_thread_flag(TIF_MEMDIE))
> >                 return true;

Doesn't this mean that if PM freezing and OOM killing race each other,
the system may hang?  Driver PM operation may try to allocate memory
-> triggers OOM -> OOM killer selects an already frozen task ->
nothing happens.  I wonder whether OOM killing and PM operations
should be mutually exclusive at a higher level.  e.g. make OOM killing
always override freezing but let hibernation abort operation before
taking snapshot if OOM killing has happened since the beginning of the
PM operation.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [Patch v4 1/2] freezer: check OOM kill while being frozen
  2014-09-11 14:45                                                       ` Michal Hocko
@ 2014-09-14 16:39                                                         ` Rafael J. Wysocki
  0 siblings, 0 replies; 61+ messages in thread
From: Rafael J. Wysocki @ 2014-09-14 16:39 UTC (permalink / raw)
  To: Michal Hocko; +Cc: Tejun Heo, Cong Wang, LKML, David Rientjes, Andrew Morton

On Thursday, September 11, 2014 04:45:09 PM Michal Hocko wrote:
> On Thu 11-09-14 16:52:50, Rafael J. Wysocki wrote:
> > On Thursday, September 11, 2014 04:28:22 PM Michal Hocko wrote:
> > > On Thu 11-09-14 16:32:50, Rafael J. Wysocki wrote:
> > > > On Thursday, September 11, 2014 04:10:51 PM Michal Hocko wrote:
> > > > > On Thu 11-09-14 16:26:56, Rafael J. Wysocki wrote:
> > > > > > On Thursday, September 11, 2014 04:04:48 PM Michal Hocko wrote:
> > > > > > > On Thu 11-09-14 16:17:56, Rafael J. Wysocki wrote:
> > > > > > > [...]
> > > > > > > > And I'm still wondering if the OOM killer may be made avoid killing frozen
> > > > > > > > tasks.
> > > > > > > 
> > > > > > > This is really tricky. OOM killer aims at the biggest memory hog. We
> > > > > > > shouldn't ignore it just because it hides into the fridge... So even
> > > > > > > if we "fix" oom killer to ignore frozen tasks (which is inherently
> > > > > > > racy btw.) then we have a potential problem of freezer abuse (e.g. in
> > > > > > > container environments). So I strongly believe that the OOM killer has
> > > > > > > to be able to kill a frozen tasks.
> > > > > > 
> > > > > > OK
> > > > > > 
> > > > > > Is the OOM killer the only place where TIF_MEMDIE is set?
> > > > > 
> > > > > Yes. To be precise, lowmemorykiller (staging android thingy), global OOM
> > > > > killer and memcg OOM killer. Any other users would be an abuse.
> > > > 
> > > > OK
> > > > 
> > > > So can we ensure that those things don't trigger during system suspend (or
> > > > equivalent) and then simply use the TIF_MEMDIE check in __refrigerator()?
> > > 
> > > That would require that no memory allocation triggers OOM killer during
> > > suspend. I don't think this will work out. OOM killer is the last resort
> > > action. We cannot simply give access to memory reserves just because the
> > > current context is in the middle of suspend.
> > 
> > But we can fail the allocation, can't we?
> 
> We already do that by oom_killer_disable after all tasks are frozen in
> freeze_processes. This is before all other device specific things are
> done so I guess we cannot start killing after any device is suspended,
> right? This should be sufficient.

Yes, it should.

> > > What is the worst thing that might happen when a task is killed in the
> > > middle of suspend? I thought that suspend would fail after several
> > > attempts to suspend all existing tasks.
> > 
> > The problem is what to do when we need to kill a frozen task.
> > 
> > In that case we need to thaw it and then it will die eventually.  Unfortunately,
> > it generally can do something undesirable before dying.  That may be accessing
> > a suspended device, for example.
> 
> OK, I have misunderstood you obviously. I thought we were discussing OOM
> while we are in the middle of freezing tasks. After they are frozen
> there is no OOM killer as per above.

Right.

-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [Patch v4 1/2] freezer: check OOM kill while being frozen
  2014-09-12 23:59                                           ` Tejun Heo
@ 2014-09-14 16:43                                             ` Rafael J. Wysocki
  2014-09-15  0:56                                               ` Tejun Heo
  0 siblings, 1 reply; 61+ messages in thread
From: Rafael J. Wysocki @ 2014-09-14 16:43 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Michal Hocko, Cong Wang, LKML, David Rientjes, Andrew Morton

On Saturday, September 13, 2014 08:59:35 AM Tejun Heo wrote:
> On Thu, Sep 11, 2014 at 06:30:04PM +0200, Michal Hocko wrote:
> > > I am wondering if the folllowing check makes any sense with regarding
> > > to rule out PM freeze:
> > > 
> > >         if ((!pm_nosig_freezing && !pm_freezing) &&
> > >             cgroup_freezing(current) && test_thread_flag(TIF_MEMDIE))
> > >                 return true;
> 
> Doesn't this mean that if PM freezing and OOM killing race each other,
> the system may hang?  Driver PM operation may try to allocate memory
> -> triggers OOM -> OOM killer selects an already frozen task ->
> nothing happens.  I wonder whether OOM killing and PM operations
> should be mutually exclusive at a higher level.  e.g. make OOM killing
> always override freezing but let hibernation abort operation before
> taking snapshot if OOM killing has happened since the beginning of the
> PM operation.

As Michal noted, we do oom_killer_disable() in freeze_processes(), so the
scenario above cannot actually happen to my eyes.  Or am I missing anything?

-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [Patch v4 1/2] freezer: check OOM kill while being frozen
  2014-09-14 16:43                                             ` Rafael J. Wysocki
@ 2014-09-15  0:56                                               ` Tejun Heo
  2014-09-15  3:34                                                 ` Rafael J. Wysocki
  0 siblings, 1 reply; 61+ messages in thread
From: Tejun Heo @ 2014-09-15  0:56 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Michal Hocko, Cong Wang, LKML, David Rientjes, Andrew Morton

On Sun, Sep 14, 2014 at 06:43:31PM +0200, Rafael J. Wysocki wrote:
> On Saturday, September 13, 2014 08:59:35 AM Tejun Heo wrote:
> > Doesn't this mean that if PM freezing and OOM killing race each other,
> > the system may hang?  Driver PM operation may try to allocate memory
> > -> triggers OOM -> OOM killer selects an already frozen task ->
> > nothing happens.  I wonder whether OOM killing and PM operations
> > should be mutually exclusive at a higher level.  e.g. make OOM killing
> > always override freezing but let hibernation abort operation before
> > taking snapshot if OOM killing has happened since the beginning of the
> > PM operation.
> 
> As Michal noted, we do oom_killer_disable() in freeze_processes(), so the
> scenario above cannot actually happen to my eyes.  Or am I missing anything?

Ah, okay, that's better but it doesn't seem enough.  It does prevent
new invocations of the oom killer but doesn't do anything if oom
killing is already in progress.  If we do block out oom killing
properly across PM freeze/thaw, it shoud be fine.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [Patch v4 1/2] freezer: check OOM kill while being frozen
  2014-09-15  0:56                                               ` Tejun Heo
@ 2014-09-15  3:34                                                 ` Rafael J. Wysocki
  2014-09-15  9:36                                                   ` Michal Hocko
  0 siblings, 1 reply; 61+ messages in thread
From: Rafael J. Wysocki @ 2014-09-15  3:34 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Michal Hocko, Cong Wang, LKML, David Rientjes, Andrew Morton

On Monday, September 15, 2014 09:56:57 AM Tejun Heo wrote:
> On Sun, Sep 14, 2014 at 06:43:31PM +0200, Rafael J. Wysocki wrote:
> > On Saturday, September 13, 2014 08:59:35 AM Tejun Heo wrote:
> > > Doesn't this mean that if PM freezing and OOM killing race each other,
> > > the system may hang?  Driver PM operation may try to allocate memory
> > > -> triggers OOM -> OOM killer selects an already frozen task ->
> > > nothing happens.  I wonder whether OOM killing and PM operations
> > > should be mutually exclusive at a higher level.  e.g. make OOM killing
> > > always override freezing but let hibernation abort operation before
> > > taking snapshot if OOM killing has happened since the beginning of the
> > > PM operation.
> > 
> > As Michal noted, we do oom_killer_disable() in freeze_processes(), so the
> > scenario above cannot actually happen to my eyes.  Or am I missing anything?
> 
> Ah, okay, that's better but it doesn't seem enough.  It does prevent
> new invocations of the oom killer but doesn't do anything if oom
> killing is already in progress.  If we do block out oom killing
> properly across PM freeze/thaw, it shoud be fine.

OK, so my assumption was that oom_killer_disable() would wait for the OOM
killing in progress to complete.  Alternatively, it can return an error code
if OOM killing is in progress and we can simply fail the freezing in that
case.

Rafael


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [Patch v4 1/2] freezer: check OOM kill while being frozen
  2014-09-15  3:34                                                 ` Rafael J. Wysocki
@ 2014-09-15  9:36                                                   ` Michal Hocko
  2014-09-16 22:55                                                     ` Cong Wang
  0 siblings, 1 reply; 61+ messages in thread
From: Michal Hocko @ 2014-09-15  9:36 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Tejun Heo, Cong Wang, LKML, David Rientjes, Andrew Morton

On Mon 15-09-14 05:34:36, Rafael J. Wysocki wrote:
> On Monday, September 15, 2014 09:56:57 AM Tejun Heo wrote:
> > On Sun, Sep 14, 2014 at 06:43:31PM +0200, Rafael J. Wysocki wrote:
> > > On Saturday, September 13, 2014 08:59:35 AM Tejun Heo wrote:
> > > > Doesn't this mean that if PM freezing and OOM killing race each other,
> > > > the system may hang?  Driver PM operation may try to allocate memory
> > > > -> triggers OOM -> OOM killer selects an already frozen task ->
> > > > nothing happens.  I wonder whether OOM killing and PM operations
> > > > should be mutually exclusive at a higher level.  e.g. make OOM killing
> > > > always override freezing but let hibernation abort operation before
> > > > taking snapshot if OOM killing has happened since the beginning of the
> > > > PM operation.
> > > 
> > > As Michal noted, we do oom_killer_disable() in freeze_processes(), so the
> > > scenario above cannot actually happen to my eyes.  Or am I missing anything?
> > 
> > Ah, okay, that's better but it doesn't seem enough.  It does prevent
> > new invocations of the oom killer but doesn't do anything if oom
> > killing is already in progress.  If we do block out oom killing
> > properly across PM freeze/thaw, it shoud be fine.
> 
> OK, so my assumption was that oom_killer_disable() would wait for the OOM
> killing in progress to complete.  Alternatively, it can return an error code
> if OOM killing is in progress and we can simply fail the freezing in that
> case.

You will need to check all the tasks again after oom_killer_disable.
Something like the following should work. I am not familiar with PM much
so I might have missed something. I didn't like direct do_each_thread loop
but there doesn't seem to be any helper and other callers are doing
something slightly different in the loop.

This patch builds on top of Cong Wang's. What do you think?
---
>From cdf97a20b107ee584352f07274a88d7c3f014ab2 Mon Sep 17 00:00:00 2001
From: Michal Hocko <mhocko@suse.cz>
Date: Mon, 15 Sep 2014 10:52:30 +0200
Subject: [PATCH] OOM, PM: OOM killed task cannot escape PM suspend

PM freezer relies on having all tasks frozen by the time devices are
getting frozen so that no task will touch them while they are getting
frozen. But OOM killer is allowed to kill an already frozen task in
order to handle OOM situtation. In order to protect from late wake ups
OOM killer is disabled after all tasks are frozen. This, however, still
keeps an window open when a killed task didn't manage to die by the time
freeze_processes finishes. Fix this by checking all tasks after OOM
killer has been disabled. To prevent from useless check also introduce
and check oom_kills count which gets incremented when a task is killed
by OOM killer. All the tasks have to be checked only if the counter
changes.

Frozen tasks might get killed by OOM killer.

Signed-off-by: Michal Hocko <mhocko@suse.cz>
---
 include/linux/oom.h    |  2 ++
 kernel/power/process.c | 31 ++++++++++++++++++++++++++++++-
 mm/oom_kill.c          | 14 ++++++++++++++
 3 files changed, 46 insertions(+), 1 deletion(-)

diff --git a/include/linux/oom.h b/include/linux/oom.h
index 647395a1a550..8927b6e443b5 100644
--- a/include/linux/oom.h
+++ b/include/linux/oom.h
@@ -50,6 +50,8 @@ static inline bool oom_task_origin(const struct task_struct *p)
 extern unsigned long oom_badness(struct task_struct *p,
 		struct mem_cgroup *memcg, const nodemask_t *nodemask,
 		unsigned long totalpages);
+
+extern int oom_kills_count(void);
 extern void oom_kill_process(struct task_struct *p, gfp_t gfp_mask, int order,
 			     unsigned int points, unsigned long totalpages,
 			     struct mem_cgroup *memcg, nodemask_t *nodemask,
diff --git a/kernel/power/process.c b/kernel/power/process.c
index 4ee194eb524b..6ccc2e10724d 100644
--- a/kernel/power/process.c
+++ b/kernel/power/process.c
@@ -118,6 +118,7 @@ static int try_to_freeze_tasks(bool user_only)
 int freeze_processes(void)
 {
 	int error;
+	int oom_kills_saved;
 
 	error = __usermodehelper_disable(UMH_FREEZING);
 	if (error)
@@ -131,12 +132,40 @@ int freeze_processes(void)
 
 	printk("Freezing user space processes ... ");
 	pm_freezing = true;
+	oom_kills_saved = oom_kills_count();
 	error = try_to_freeze_tasks(true);
 	if (!error) {
-		printk("done.");
 		__usermodehelper_set_disable_depth(UMH_DISABLED);
 		oom_killer_disable();
+
+		/*
+		 * There was a OOM kill while we were freezing tasks
+		 * and the killed task might be still on the way out
+		 * so we have to double check for race.
+		 */
+		if (oom_kills_count() != oom_kills_saved) {
+			struct task_struct *g, *p;
+
+			read_lock(&tasklist_lock);
+			do_each_thread(g, p) {
+				if (p == current || freezer_should_skip(p) ||
+				    frozen(p))
+					continue;
+				error = -EBUSY;
+				break;
+			} while_each_thread(g, p);
+			read_unlock(&tasklist_lock);
+
+			if (error) {
+				__usermodehelper_set_disable_depth(UMH_ENABLED);
+				oom_killer_enable();
+				printk("OOM in progress. ");
+				goto done;
+			}
+		}
+		printk("done.");
 	}
+done:
 	printk("\n");
 	BUG_ON(in_atomic());
 
diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index bbf405a3a18f..57dddd7d1f12 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -404,6 +404,18 @@ static void dump_header(struct task_struct *p, gfp_t gfp_mask, int order,
 		dump_tasks(memcg, nodemask);
 }
 
+/*
+ * Number of OOM killer invocations (including memcg OOM killer).
+ * Primarily used by PM freezer to check for potential races with
+ * OOM killed frozen task.
+ */
+static atomic_t oom_kills = ATOMIC_INIT(0);
+
+int oom_kills_count(void)
+{
+	return atomic_read(&oom_kills);
+}
+
 #define K(x) ((x) << (PAGE_SHIFT-10))
 /*
  * Must be called while holding a reference to p, which will be released upon
@@ -506,11 +518,13 @@ void oom_kill_process(struct task_struct *p, gfp_t gfp_mask, int order,
 			pr_err("Kill process %d (%s) sharing same memory\n",
 				task_pid_nr(p), p->comm);
 			task_unlock(p);
+			atomic_inc(&oom_kills);
 			do_send_sig_info(SIGKILL, SEND_SIG_FORCED, p, true);
 		}
 	rcu_read_unlock();
 
 	set_tsk_thread_flag(victim, TIF_MEMDIE);
+	atomic_inc(&oom_kills);
 	do_send_sig_info(SIGKILL, SEND_SIG_FORCED, victim, true);
 	put_task_struct(victim);
 }
-- 
2.1.0


-- 
Michal Hocko
SUSE Labs

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* Re: [Patch v4 1/2] freezer: check OOM kill while being frozen
  2014-09-04 22:30 [Patch v4 1/2] freezer: check OOM kill while being frozen Cong Wang
  2014-09-04 22:30 ` [Patch v4 2/2] freezer: remove obsolete comments in __thaw_task() Cong Wang
  2014-09-05 14:08 ` [Patch v4 1/2] freezer: check OOM kill while being frozen Tejun Heo
@ 2014-09-15 11:22 ` Michal Hocko
  2014-09-16 22:58   ` Cong Wang
  2 siblings, 1 reply; 61+ messages in thread
From: Michal Hocko @ 2014-09-15 11:22 UTC (permalink / raw)
  To: Cong Wang
  Cc: linux-kernel, David Rientjes, Rafael J. Wysocki, Tejun Heo,
	Andrew Morton

On Thu 04-09-14 15:30:41, Cong Wang wrote:
> There is a race condition between OOM killer and freezer when
> they try to operate on the same process, something like below:
> 
>         Process A       Process B               Process C
> trigger page fault
> then trigger oom
> B=oom_scan_process_thread()
>                                         cgroup freezer freeze(A, B)
>                         ...
>                         try_to_freeze()
>                         stay in D state
> oom_kill_process(B)
> restart page fault
> ...
> 
> In this case, process A triggered a page fault in user-space,
> and the kernel page fault handler triggered OOM, then kernel
> selected process B as the victim, right before being killed
> process B was frozen by process C therefore went to D state,
> then kernel sent SIGKILL but it is already too late as
> process B will not care about pending signals any more.

I have just got back to this patch again and I guess that the
description is a bit misleading. Sure there is a race but the main
problem is that frozen tasks are OOM unkillable in principle. So a task
might hide into the fridge and livelock OOM killer. This has been broken
since __thaw_task doesn't thaw anything (I guess it was a3201227f803
which broke it).

What do you think about the following changelog instead?
"
Since f660daac474c6f (oom: thaw threads if oom killed thread is frozen
before deferring) OOM killer relies on being able to thaw a frozen task
to handle OOM situation but a3201227f803 (freezer: make freezing() test
freeze conditions in effect instead of TIF_FREEZE) has reorganized the
code and stopped clearing freeze flag in __thaw_task. This means that
the target task only wakes up and goes into the fridge again because the
freezing condition hasn't changed for it. This reintroduces the bug
fixed by f660daac474c6f.

Fix the issue by checking for TIF_MEMDIE thread flag and get away from
the fridge if it is set. oom_scan_process_thread doesn't have to check
for the frozen task anymore because do_send_sig_info will wake up the
thread and TIF_MEMDIE is already set by that time.

Fixes: a3201227f803 (freezer: make freezing() test freeze conditions in effect instead of TIF_FREEZE)
Cc: stable@vger.kernel.org # 3.3+
"

[...]
> +static bool should_thaw_current(bool check_kthr_stop)
> +{
> +	if (!freezing(current))
> +		return true;
> +
> +	if (check_kthr_stop && kthread_should_stop())
> +		return true;
> +
> +	/* It might not be safe to check TIF_MEMDIE for pm freeze. */
> +	if (cgroup_freezing(current) && test_thread_flag(TIF_MEMDIE))
> +		return true;

+ cgroup_freezing check can go away.

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [Patch v4 1/2] freezer: check OOM kill while being frozen
  2014-09-12 23:48                                       ` Tejun Heo
@ 2014-09-15 14:28                                         ` Michal Hocko
  2014-09-16  5:56                                           ` Tejun Heo
  0 siblings, 1 reply; 61+ messages in thread
From: Michal Hocko @ 2014-09-15 14:28 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Rafael J. Wysocki, Cong Wang, LKML, David Rientjes, Andrew Morton

On Sat 13-09-14 08:48:46, Tejun Heo wrote:
> Hello, Michal.
> 
> On Wed, Sep 10, 2014 at 03:24:17PM +0200, Michal Hocko wrote:
> > What do you think about this way to help distinguish kernel threads and
> > user processes and keep the future maintenance of freezer saner?
> 
> I'm not sure either way.  Please note that we have quite a few
> wrappers around try_to_freeze() - freezable_schedule*() and
> wait_event_freezable*() - and several places where a userland can be
> frozen.  

I wasn't aware of those until I discovered freezer_count and company. I
didn't get to study why we need so many freezing point in the kernel yet
and whether they are safe at all. 

> Are we gonna introduce an alternate versions for all of them?

Are they intended for kernel threads in the first place? The primary
objective seems to be to not wake up user tasks just to put them into
the fridge when you know they are in a "deep" sleep currently. Kernel
threads know they might freeze themselves when returning from schedule,
no?

freezable_schedule{_timeout} seem to be used only by user tasks AFAICS.
freezable_schedule_unsafe seems to be more complicated due to nfs and
cifs but both of them are checking for fatal signal pending and return
ERESTARTSYS which looks like userspace stuff. So maybe we do not need
any alternate version for kernel threads.

I haven't check the workqueue_freezable API.

> What do we actually gain?

I thought we could easily document and add some rules to where
try_to_freeze can be called from for userspace tasks. Kernel threads
know when they are safe to get frozen but we should be more careful
about user tasks. Whether that can be enforced by an API is a question.
I do admit this much more complicated than I originally anticipated.

try_to_freeze should be a nice barrier where you know that waking into
a new world is safe for the caller (there is no state crossing the
freezing point). Whether a separate API makes this happen is an open
question. I would hope for much higher bar for adding new freezing
points for users tasks as those are more tricky to check.
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [Patch v4 1/2] freezer: check OOM kill while being frozen
  2014-09-15 14:28                                         ` Michal Hocko
@ 2014-09-16  5:56                                           ` Tejun Heo
  0 siblings, 0 replies; 61+ messages in thread
From: Tejun Heo @ 2014-09-16  5:56 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Rafael J. Wysocki, Cong Wang, LKML, David Rientjes, Andrew Morton

Hello, Michal.

On Mon, Sep 15, 2014 at 04:28:21PM +0200, Michal Hocko wrote:
> > Are we gonna introduce an alternate versions for all of them?
> 
> Are they intended for kernel threads in the first place? The primary

There's no restriction in using them and I'm pretty sure some of them
are used by kernel threads.

> objective seems to be to not wake up user tasks just to put them into
> the fridge when you know they are in a "deep" sleep currently. Kernel

That's one of the uses but not the only one.

> threads know they might freeze themselves when returning from schedule,
> no?

I'm not following.  Can you elaborate?

> freezable_schedule{_timeout} seem to be used only by user tasks AFAICS.
> freezable_schedule_unsafe seems to be more complicated due to nfs and
> cifs but both of them are checking for fatal signal pending and return
> ERESTARTSYS which looks like userspace stuff. So maybe we do not need
> any alternate version for kernel threads.
> 
> I haven't check the workqueue_freezable API.

That one is actually easy and should be fine.

> > What do we actually gain?
> 
> I thought we could easily document and add some rules to where
> try_to_freeze can be called from for userspace tasks. Kernel threads
> know when they are safe to get frozen but we should be more careful
> about user tasks. Whether that can be enforced by an API is a question.
> I do admit this much more complicated than I originally anticipated.
> 
> try_to_freeze should be a nice barrier where you know that waking into
> a new world is safe for the caller (there is no state crossing the

It isn't and I'm not sure it can ever be used as "world" barrier.  It
just doesn't mean that.

> freezing point). Whether a separate API makes this happen is an open
> question. I would hope for much higher bar for adding new freezing
> points for users tasks as those are more tricky to check.

Heh, as far as I can recall, freezer usage has always been a rather
large mess.  In the long term, I think the right thing to do is
merging userland freezing into job control stop so that it doesn't get
the special SIGKILL immunity.  We shouldn't have exposed this
unkillable state the beginning.  :(

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [Patch v4 1/2] freezer: check OOM kill while being frozen
  2014-09-15  9:36                                                   ` Michal Hocko
@ 2014-09-16 22:55                                                     ` Cong Wang
  2014-09-22  8:21                                                       ` Michal Hocko
  0 siblings, 1 reply; 61+ messages in thread
From: Cong Wang @ 2014-09-16 22:55 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Rafael J. Wysocki, Tejun Heo, LKML, David Rientjes, Andrew Morton

On Mon, Sep 15, 2014 at 2:36 AM, Michal Hocko <mhocko@suse.cz> wrote:
> This patch builds on top of Cong Wang's. What do you think?

I think it is should a preliminary of my patch, not a followup. I can carry
this patch for you if you want.

> ---
> From cdf97a20b107ee584352f07274a88d7c3f014ab2 Mon Sep 17 00:00:00 2001
> From: Michal Hocko <mhocko@suse.cz>
> Date: Mon, 15 Sep 2014 10:52:30 +0200
> Subject: [PATCH] OOM, PM: OOM killed task cannot escape PM suspend
>
> PM freezer relies on having all tasks frozen by the time devices are
> getting frozen so that no task will touch them while they are getting
> frozen. But OOM killer is allowed to kill an already frozen task in
> order to handle OOM situtation. In order to protect from late wake ups
> OOM killer is disabled after all tasks are frozen. This, however, still
> keeps an window open when a killed task didn't manage to die by the time
> freeze_processes finishes. Fix this by checking all tasks after OOM
> killer has been disabled. To prevent from useless check also introduce
> and check oom_kills count which gets incremented when a task is killed
> by OOM killer. All the tasks have to be checked only if the counter
> changes.


Not sure if I understand your patch correctly, seems you are checking if
there is any ongoing OOM killer during PM suspend, since oom_kills
will always increase, maybe a seqlock is more clear?

Or you mean totally forbid PM suspend when an OOM killer is ongoing?
If so, you should call atomic_dec() after OOM is done.

Thanks.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [Patch v4 1/2] freezer: check OOM kill while being frozen
  2014-09-15 11:22 ` Michal Hocko
@ 2014-09-16 22:58   ` Cong Wang
  0 siblings, 0 replies; 61+ messages in thread
From: Cong Wang @ 2014-09-16 22:58 UTC (permalink / raw)
  To: Michal Hocko
  Cc: LKML, David Rientjes, Rafael J. Wysocki, Tejun Heo, Andrew Morton

On Mon, Sep 15, 2014 at 4:22 AM, Michal Hocko <mhocko@suse.cz> wrote:
> On Thu 04-09-14 15:30:41, Cong Wang wrote:
>> There is a race condition between OOM killer and freezer when
>> they try to operate on the same process, something like below:
>>
>>         Process A       Process B               Process C
>> trigger page fault
>> then trigger oom
>> B=oom_scan_process_thread()
>>                                         cgroup freezer freeze(A, B)
>>                         ...
>>                         try_to_freeze()
>>                         stay in D state
>> oom_kill_process(B)
>> restart page fault
>> ...
>>
>> In this case, process A triggered a page fault in user-space,
>> and the kernel page fault handler triggered OOM, then kernel
>> selected process B as the victim, right before being killed
>> process B was frozen by process C therefore went to D state,
>> then kernel sent SIGKILL but it is already too late as
>> process B will not care about pending signals any more.
>
> I have just got back to this patch again and I guess that the
> description is a bit misleading. Sure there is a race but the main
> problem is that frozen tasks are OOM unkillable in principle. So a task
> might hide into the fridge and livelock OOM killer. This has been broken
> since __thaw_task doesn't thaw anything (I guess it was a3201227f803
> which broke it).
>
> What do you think about the following changelog instead?
> "
> Since f660daac474c6f (oom: thaw threads if oom killed thread is frozen
> before deferring) OOM killer relies on being able to thaw a frozen task
> to handle OOM situation but a3201227f803 (freezer: make freezing() test
> freeze conditions in effect instead of TIF_FREEZE) has reorganized the
> code and stopped clearing freeze flag in __thaw_task. This means that
> the target task only wakes up and goes into the fridge again because the
> freezing condition hasn't changed for it. This reintroduces the bug
> fixed by f660daac474c6f.
>
> Fix the issue by checking for TIF_MEMDIE thread flag and get away from
> the fridge if it is set. oom_scan_process_thread doesn't have to check
> for the frozen task anymore because do_send_sig_info will wake up the
> thread and TIF_MEMDIE is already set by that time.
>
> Fixes: a3201227f803 (freezer: make freezing() test freeze conditions in effect instead of TIF_FREEZE)
> Cc: stable@vger.kernel.org # 3.3+
> "


Yes, looks better to me.


>
> + cgroup_freezing check can go away.

With your patch, yes.

Thanks.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [Patch v4 1/2] freezer: check OOM kill while being frozen
  2014-09-16 22:55                                                     ` Cong Wang
@ 2014-09-22  8:21                                                       ` Michal Hocko
  0 siblings, 0 replies; 61+ messages in thread
From: Michal Hocko @ 2014-09-22  8:21 UTC (permalink / raw)
  To: Cong Wang
  Cc: Rafael J. Wysocki, Tejun Heo, LKML, David Rientjes, Andrew Morton

[Sorry for the late reply, I was mostly offline throughout last week]

On Tue 16-09-14 15:55:48, Cong Wang wrote:
> On Mon, Sep 15, 2014 at 2:36 AM, Michal Hocko <mhocko@suse.cz> wrote:
> > This patch builds on top of Cong Wang's. What do you think?
> 
> I think it is should a preliminary of my patch, not a followup.

Well, this looks like a chicken/egg problem. OOM killer currently
doesn't kill any frozen task which is fixed by your patch. So it should
be logically the first to apply. On the other hand that patch opens a
race fixed by this patch. I think the race is really unlikely and we
your patch basically fixes a regression so I would expect it to come
first.

Anyway, I will leave the proper ordering on you if you do not mind.

> I can carry this patch for you if you want.

Would be great, thanks!

> > ---
> > From cdf97a20b107ee584352f07274a88d7c3f014ab2 Mon Sep 17 00:00:00 2001
> > From: Michal Hocko <mhocko@suse.cz>
> > Date: Mon, 15 Sep 2014 10:52:30 +0200
> > Subject: [PATCH] OOM, PM: OOM killed task cannot escape PM suspend
> >
> > PM freezer relies on having all tasks frozen by the time devices are
> > getting frozen so that no task will touch them while they are getting
> > frozen. But OOM killer is allowed to kill an already frozen task in
> > order to handle OOM situtation. In order to protect from late wake ups
> > OOM killer is disabled after all tasks are frozen. This, however, still
> > keeps an window open when a killed task didn't manage to die by the time
> > freeze_processes finishes. Fix this by checking all tasks after OOM
> > killer has been disabled. To prevent from useless check also introduce
> > and check oom_kills count which gets incremented when a task is killed
> > by OOM killer. All the tasks have to be checked only if the counter
> > changes.
> 
> 
> Not sure if I understand your patch correctly, seems you are checking if
> there is any ongoing OOM killer during PM suspend, since oom_kills
> will always increase,

Yeah, I didn't want to pointlessly iterate over all tasks when nothing
happened. OOM is a rare action so this check should be in a slow path.

> maybe a seqlock is more clear?

I wanted to have it as simple as possible and simple counter sounded
like a good fit. seqlock sounds like overkill to me.
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 61+ messages in thread

end of thread, other threads:[~2014-09-22  8:21 UTC | newest]

Thread overview: 61+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-09-04 22:30 [Patch v4 1/2] freezer: check OOM kill while being frozen Cong Wang
2014-09-04 22:30 ` [Patch v4 2/2] freezer: remove obsolete comments in __thaw_task() Cong Wang
2014-09-05  9:45   ` Michal Hocko
2014-09-05 14:09   ` Tejun Heo
2014-09-05 14:08 ` [Patch v4 1/2] freezer: check OOM kill while being frozen Tejun Heo
2014-09-05 16:31   ` Cong Wang
2014-09-05 18:00     ` Tejun Heo
2014-09-05 18:12       ` Cong Wang
2014-09-05 22:45         ` Tejun Heo
2014-09-05 23:32           ` Rafael J. Wysocki
2014-09-08 17:40             ` Cong Wang
2014-09-08 20:54               ` Rafael J. Wysocki
2014-09-08 20:58                 ` Cong Wang
2014-09-08 21:53                   ` Rafael J. Wysocki
2014-09-08 22:22                     ` Tejun Heo
2014-09-08 22:48                       ` Rafael J. Wysocki
2014-09-08 22:50                         ` Tejun Heo
2014-09-08 23:15                           ` Rafael J. Wysocki
2014-09-08 23:00                             ` Cong Wang
2014-09-08 23:23                               ` Rafael J. Wysocki
2014-09-08 23:16                                 ` Cong Wang
2014-09-08 23:42                                   ` Rafael J. Wysocki
2014-09-08 23:29                                     ` Cong Wang
2014-09-08 23:59                                       ` Rafael J. Wysocki
2014-09-10 20:30                                       ` Cong Wang
2014-09-10 23:38                                         ` Rafael J. Wysocki
2014-09-10 23:20                                           ` Cong Wang
2014-09-11 16:30                                         ` Michal Hocko
2014-09-12 23:59                                           ` Tejun Heo
2014-09-14 16:43                                             ` Rafael J. Wysocki
2014-09-15  0:56                                               ` Tejun Heo
2014-09-15  3:34                                                 ` Rafael J. Wysocki
2014-09-15  9:36                                                   ` Michal Hocko
2014-09-16 22:55                                                     ` Cong Wang
2014-09-22  8:21                                                       ` Michal Hocko
2014-09-09 15:16                           ` Michal Hocko
2014-09-09 15:23                             ` Tejun Heo
2014-09-09 16:06                               ` Michal Hocko
2014-09-09 16:46                                 ` Tejun Heo
2014-09-09 17:12                                   ` Michal Hocko
2014-09-09 20:53                                   ` Rafael J. Wysocki
2014-09-10 13:24                                     ` Michal Hocko
2014-09-11 13:08                                       ` Michal Hocko
2014-09-11 14:17                                         ` Rafael J. Wysocki
2014-09-11 14:04                                           ` Michal Hocko
2014-09-11 14:26                                             ` Rafael J. Wysocki
2014-09-11 14:10                                               ` Michal Hocko
2014-09-11 14:32                                                 ` Rafael J. Wysocki
2014-09-11 14:28                                                   ` Michal Hocko
2014-09-11 14:52                                                     ` Rafael J. Wysocki
2014-09-11 14:45                                                       ` Michal Hocko
2014-09-14 16:39                                                         ` Rafael J. Wysocki
2014-09-12 23:48                                       ` Tejun Heo
2014-09-15 14:28                                         ` Michal Hocko
2014-09-16  5:56                                           ` Tejun Heo
2014-09-09 20:48                               ` Rafael J. Wysocki
2014-09-10  5:21                                 ` Cong Wang
2014-09-05 16:43   ` Cong Wang
2014-09-05 16:54     ` Michal Hocko
2014-09-15 11:22 ` Michal Hocko
2014-09-16 22:58   ` Cong Wang

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.