linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] freezer: fix freeze timeout on exec
@ 2018-11-08 10:09 Chanho Min
  2018-11-08 10:50 ` Oleg Nesterov
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Chanho Min @ 2018-11-08 10:09 UTC (permalink / raw)
  To: Rafael J. Wysocki, Pavel Machek, Len Brown, Andrew Morton,
	Eric W. Biederman, Christian Brauner, Oleg Nesterov,
	Anna-Maria Gleixner
  Cc: linux-pm, linux-kernel, Seungho Park, Chanho Min

Suspend fails due to the exec family of fuctnions blocking the freezer.
This issue has been found that it is mentioned in the ancient mail thread.
The casue is that de_thread() sleeps in TASK_UNINTERRUPTIBLE waiting for all
sub-threads to die, and we have the "deadlock" if one of them is frozen.
It causes freeze timeout as bellows.

Freezing of tasks failed after 20.010 seconds (1 tasks refusing to freeze, wq_busy=0):
setcpushares-ls D ffffffc00008ed70     0  5817   1483 0x0040000d
 Call trace:
[<ffffffc00008ed70>] __switch_to+0x88/0xa0
[<ffffffc000d1c30c>] __schedule+0x1bc/0x720
[<ffffffc000d1ca90>] schedule+0x40/0xa8
[<ffffffc0001cd784>] flush_old_exec+0xdc/0x640
[<ffffffc000220360>] load_elf_binary+0x2a8/0x1090
[<ffffffc0001ccff4>] search_binary_handler+0x9c/0x240
[<ffffffc00021c584>] load_script+0x20c/0x228
[<ffffffc0001ccff4>] search_binary_handler+0x9c/0x240
[<ffffffc0001ce8e0>] do_execveat_common.isra.14+0x4f8/0x6e8
[<ffffffc0001cedd0>] compat_SyS_execve+0x38/0x48
[<ffffffc00008de30>] el0_svc_naked+0x24/0x28

To fix this, I suggest a patch by emboding the mentioned solution.
First, revive and rework cancel_freezing_and_thaw() function whitch stops the
task from sleeping in refrigirator reliably. And, The task to be killed does not
allow to freeze.

Reference:
http://lkml.iu.edu/hypermail//linux/kernel/0702.2/1300.html

Signed-off-by: Chanho Min <chanho.min@lge.com>
---
 include/linux/freezer.h |  7 +++++++
 kernel/freezer.c        | 14 ++++++++++++++
 kernel/signal.c         |  7 +++++++
 3 files changed, 28 insertions(+)

diff --git a/include/linux/freezer.h b/include/linux/freezer.h
index 21f5aa0..1c1e3cb 100644
--- a/include/linux/freezer.h
+++ b/include/linux/freezer.h
@@ -57,6 +57,12 @@ static inline bool try_to_freeze_unsafe(void)
 	might_sleep();
 	if (likely(!freezing(current)))
 		return false;
+	/*
+	 * we are going to call do_exit() really soon,
+	 * we have a pending SIGKILL
+	 */
+	if (unlikely(current->signal->flags & SIGNAL_GROUP_EXIT))
+		return false;
 	return __refrigerator(false);
 }
 
@@ -68,6 +74,7 @@ static inline bool try_to_freeze(void)
 }
 
 extern bool freeze_task(struct task_struct *p);
+extern void cancel_freezing_thaw_task(struct task_struct *p);
 extern bool set_freezable(void);
 
 #ifdef CONFIG_CGROUP_FREEZER
diff --git a/kernel/freezer.c b/kernel/freezer.c
index b162b74..584c5c8 100644
--- a/kernel/freezer.c
+++ b/kernel/freezer.c
@@ -148,6 +148,20 @@ bool freeze_task(struct task_struct *p)
 	return true;
 }
 
+void cancel_freezing_thaw_task(struct task_struct *p)
+{
+	unsigned long flags;
+
+	spin_lock_irqsave(&freezer_lock, flags);
+	if (freezing(p)) {
+		spin_lock(&p->sighand->siglock);
+		recalc_sigpending_and_wake(p);
+		spin_unlock(&p->sighand->siglock);
+	} else if (frozen(p))
+		wake_up_process(p);
+	spin_unlock_irqrestore(&freezer_lock, flags);
+}
+
 void __thaw_task(struct task_struct *p)
 {
 	unsigned long flags;
diff --git a/kernel/signal.c b/kernel/signal.c
index 5843c54..ca9b25b 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -41,6 +41,7 @@
 #include <linux/compiler.h>
 #include <linux/posix-timers.h>
 #include <linux/livepatch.h>
+#include <linux/freezer.h>
 
 #define CREATE_TRACE_POINTS
 #include <trace/events/signal.h>
@@ -1273,6 +1274,12 @@ int zap_other_threads(struct task_struct *p)
 		/* Don't bother with already dead threads */
 		if (t->exit_state)
 			continue;
+
+		/*
+		 * we can check sig->group_exit_task to detect de_thread,
+		 * but perhaps it doesn't hurt if the caller is do_group_exit
+		 */
+		cancel_freezing_thaw_task(t);
 		sigaddset(&t->pending.signal, SIGKILL);
 		signal_wake_up(t, 1);
 	}
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH] freezer: fix freeze timeout on exec
  2018-11-08 10:09 [PATCH] freezer: fix freeze timeout on exec Chanho Min
@ 2018-11-08 10:50 ` Oleg Nesterov
  2018-11-09  0:29   ` Chanho Min
  2018-11-09  7:50   ` Chanho Min
  2018-11-11  9:50 ` kbuild test robot
  2018-11-11  9:51 ` kbuild test robot
  2 siblings, 2 replies; 6+ messages in thread
From: Oleg Nesterov @ 2018-11-08 10:50 UTC (permalink / raw)
  To: Chanho Min
  Cc: Rafael J. Wysocki, Pavel Machek, Len Brown, Andrew Morton,
	Eric W. Biederman, Christian Brauner, Anna-Maria Gleixner,
	linux-pm, linux-kernel, Seungho Park

On 11/08, Chanho Min wrote:
>
> Suspend fails due to the exec family of fuctnions blocking the freezer.
> This issue has been found that it is mentioned in the ancient mail thread.
> The casue is that de_thread() sleeps in TASK_UNINTERRUPTIBLE waiting for all
> sub-threads to die, and we have the "deadlock" if one of them is frozen.
> It causes freeze timeout as bellows.
>
> Freezing of tasks failed after 20.010 seconds (1 tasks refusing to freeze, wq_busy=0):
> setcpushares-ls D ffffffc00008ed70     0  5817   1483 0x0040000d
>  Call trace:
> [<ffffffc00008ed70>] __switch_to+0x88/0xa0
> [<ffffffc000d1c30c>] __schedule+0x1bc/0x720
> [<ffffffc000d1ca90>] schedule+0x40/0xa8
> [<ffffffc0001cd784>] flush_old_exec+0xdc/0x640
> [<ffffffc000220360>] load_elf_binary+0x2a8/0x1090
> [<ffffffc0001ccff4>] search_binary_handler+0x9c/0x240
> [<ffffffc00021c584>] load_script+0x20c/0x228
> [<ffffffc0001ccff4>] search_binary_handler+0x9c/0x240
> [<ffffffc0001ce8e0>] do_execveat_common.isra.14+0x4f8/0x6e8
> [<ffffffc0001cedd0>] compat_SyS_execve+0x38/0x48
> [<ffffffc00008de30>] el0_svc_naked+0x24/0x28
>
> To fix this, I suggest a patch by emboding the mentioned solution.
> First, revive and rework cancel_freezing_and_thaw() function whitch stops the
> task from sleeping in refrigirator reliably. And, The task to be killed does not
> allow to freeze.

Can't we simply change de_thread() to use freezable_schedule() ?

Oleg.


^ permalink raw reply	[flat|nested] 6+ messages in thread

* RE: [PATCH] freezer: fix freeze timeout on exec
  2018-11-08 10:50 ` Oleg Nesterov
@ 2018-11-09  0:29   ` Chanho Min
  2018-11-09  7:50   ` Chanho Min
  1 sibling, 0 replies; 6+ messages in thread
From: Chanho Min @ 2018-11-09  0:29 UTC (permalink / raw)
  To: 'Oleg Nesterov'
  Cc: 'Rafael J. Wysocki', 'Pavel Machek',
	'Len Brown', 'Andrew Morton',
	'Eric W. Biederman', 'Christian Brauner',
	'Anna-Maria Gleixner',
	linux-pm, linux-kernel, 'Seungho Park',
	'Jongsung Kim', 'Inkyu Hwang',
	'donghwan.jung'

> >
> > To fix this, I suggest a patch by emboding the mentioned solution.
> > First, revive and rework cancel_freezing_and_thaw() function whitch
> > stops the task from sleeping in refrigirator reliably. And, The task
> > to be killed does not allow to freeze.
> 
> Can't we simply change de_thread() to use freezable_schedule() ?
> 
> Oleg.

We need to change freezable_schedule_timeout() instead.
freezable_schedule also can't be frozen if sub-threads can't stop
schedule().
Furthermore, I'm not sure if it is safe to freeze it at de_thread().

diff --git a/fs/exec.c b/fs/exec.c
index 9c5ee2a..291cbd6 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -942,7 +942,7 @@ static int de_thread(struct task_struct *tsk)
        while (sig->notify_count) {
                __set_current_state(TASK_KILLABLE);
                spin_unlock_irq(lock);
-               schedule();
+               while (!freezable_schedule_timeout(HZ));
                if (unlikely(__fatal_signal_pending(tsk)))
                        goto killed;
                spin_lock_irq(lock);

Chanho


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* RE: [PATCH] freezer: fix freeze timeout on exec
  2018-11-08 10:50 ` Oleg Nesterov
  2018-11-09  0:29   ` Chanho Min
@ 2018-11-09  7:50   ` Chanho Min
  1 sibling, 0 replies; 6+ messages in thread
From: Chanho Min @ 2018-11-09  7:50 UTC (permalink / raw)
  To: 'Oleg Nesterov'
  Cc: 'Rafael J. Wysocki', 'Pavel Machek',
	'Len Brown', 'Andrew Morton',
	'Eric W. Biederman', 'Christian Brauner',
	'Anna-Maria Gleixner',
	linux-pm, linux-kernel, 'Seungho Park',
	'Jongsung Kim', 'Inkyu Hwang',
	'donghwan.jung'

> >
> > Can't we simply change de_thread() to use freezable_schedule() ?
> >
> > Oleg.
> 
> We need to change freezable_schedule_timeout() instead.
> freezable_schedule also can't be frozen if sub-threads can't stop
> schedule().
> Furthermore, I'm not sure if it is safe to freeze it at de_thread().
> 
> diff --git a/fs/exec.c b/fs/exec.c
> index 9c5ee2a..291cbd6 100644
> --- a/fs/exec.c
> +++ b/fs/exec.c
> @@ -942,7 +942,7 @@ static int de_thread(struct task_struct *tsk)
>         while (sig->notify_count) {
>                 __set_current_state(TASK_KILLABLE);
>                 spin_unlock_irq(lock);
> -               schedule();
> +               while (!freezable_schedule_timeout(HZ));
>                 if (unlikely(__fatal_signal_pending(tsk)))
>                         goto killed;
>                 spin_lock_irq(lock);
> 
> Chanho

Sorry, I might misunderstand freezer.
Changes to freezable_schedule() works fine. It looks safe.
I'll apply patch again.

Chanho


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] freezer: fix freeze timeout on exec
  2018-11-08 10:09 [PATCH] freezer: fix freeze timeout on exec Chanho Min
  2018-11-08 10:50 ` Oleg Nesterov
@ 2018-11-11  9:50 ` kbuild test robot
  2018-11-11  9:51 ` kbuild test robot
  2 siblings, 0 replies; 6+ messages in thread
From: kbuild test robot @ 2018-11-11  9:50 UTC (permalink / raw)
  To: Chanho Min
  Cc: kbuild-all, Rafael J. Wysocki, Pavel Machek, Len Brown,
	Andrew Morton, Eric W. Biederman, Christian Brauner,
	Oleg Nesterov, Anna-Maria Gleixner, linux-pm, linux-kernel,
	Seungho Park, Chanho Min

[-- Attachment #1: Type: text/plain, Size: 4532 bytes --]

Hi Chanho,

I love your patch! Perhaps something to improve:

[auto build test WARNING on linus/master]
[also build test WARNING on v4.20-rc1 next-20181109]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Chanho-Min/freezer-fix-freeze-timeout-on-exec/20181111-171200
config: i386-randconfig-x070-201845 (attached as .config)
compiler: gcc-7 (Debian 7.3.0-1) 7.3.0
reproduce:
        # save the attached .config to linux build tree
        make ARCH=i386 

All warnings (new ones prefixed by >>):

   In file included from arch/x86/include/asm/atomic.h:5:0,
                    from include/linux/atomic.h:7,
                    from include/linux/crypto.h:20,
                    from arch/x86/kernel/asm-offsets.c:9:
   include/linux/freezer.h: In function 'try_to_freeze_unsafe':
   include/linux/freezer.h:64:30: error: dereferencing pointer to incomplete type 'struct signal_struct'
     if (unlikely(current->signal->flags & SIGNAL_GROUP_EXIT))
                                 ^
   include/linux/compiler.h:58:30: note: in definition of macro '__trace_if'
     if (__builtin_constant_p(!!(cond)) ? !!(cond) :   \
                                 ^~~~
>> include/linux/freezer.h:64:2: note: in expansion of macro 'if'
     if (unlikely(current->signal->flags & SIGNAL_GROUP_EXIT))
     ^~
   include/linux/compiler.h:48:24: note: in expansion of macro '__branch_check__'
    #  define unlikely(x) (__branch_check__(x, 0, __builtin_constant_p(x)))
                           ^~~~~~~~~~~~~~~~
   include/linux/freezer.h:64:6: note: in expansion of macro 'unlikely'
     if (unlikely(current->signal->flags & SIGNAL_GROUP_EXIT))
         ^~~~~~~~
   include/linux/freezer.h:64:40: error: 'SIGNAL_GROUP_EXIT' undeclared (first use in this function)
     if (unlikely(current->signal->flags & SIGNAL_GROUP_EXIT))
                                           ^
   include/linux/compiler.h:58:30: note: in definition of macro '__trace_if'
     if (__builtin_constant_p(!!(cond)) ? !!(cond) :   \
                                 ^~~~
>> include/linux/freezer.h:64:2: note: in expansion of macro 'if'
     if (unlikely(current->signal->flags & SIGNAL_GROUP_EXIT))
     ^~
   include/linux/compiler.h:48:24: note: in expansion of macro '__branch_check__'
    #  define unlikely(x) (__branch_check__(x, 0, __builtin_constant_p(x)))
                           ^~~~~~~~~~~~~~~~
   include/linux/freezer.h:64:6: note: in expansion of macro 'unlikely'
     if (unlikely(current->signal->flags & SIGNAL_GROUP_EXIT))
         ^~~~~~~~
   include/linux/freezer.h:64:40: note: each undeclared identifier is reported only once for each function it appears in
     if (unlikely(current->signal->flags & SIGNAL_GROUP_EXIT))
                                           ^
   include/linux/compiler.h:58:30: note: in definition of macro '__trace_if'
     if (__builtin_constant_p(!!(cond)) ? !!(cond) :   \
                                 ^~~~
>> include/linux/freezer.h:64:2: note: in expansion of macro 'if'
     if (unlikely(current->signal->flags & SIGNAL_GROUP_EXIT))
     ^~
   include/linux/compiler.h:48:24: note: in expansion of macro '__branch_check__'
    #  define unlikely(x) (__branch_check__(x, 0, __builtin_constant_p(x)))
                           ^~~~~~~~~~~~~~~~
   include/linux/freezer.h:64:6: note: in expansion of macro 'unlikely'
     if (unlikely(current->signal->flags & SIGNAL_GROUP_EXIT))
         ^~~~~~~~
   make[2]: *** [arch/x86/kernel/asm-offsets.s] Error 1
   make[2]: Target '__build' not remade because of errors.
   make[1]: *** [prepare0] Error 2
   make[1]: Target 'prepare' not remade because of errors.
   make: *** [sub-make] Error 2

vim +/if +64 include/linux/freezer.h

    50	
    51	/*
    52	 * DO NOT ADD ANY NEW CALLERS OF THIS FUNCTION
    53	 * If try_to_freeze causes a lockdep warning it means the caller may deadlock
    54	 */
    55	static inline bool try_to_freeze_unsafe(void)
    56	{
    57		might_sleep();
    58		if (likely(!freezing(current)))
    59			return false;
    60		/*
    61		 * we are going to call do_exit() really soon,
    62		 * we have a pending SIGKILL
    63		 */
  > 64		if (unlikely(current->signal->flags & SIGNAL_GROUP_EXIT))
    65			return false;
    66		return __refrigerator(false);
    67	}
    68	

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 27301 bytes --]

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] freezer: fix freeze timeout on exec
  2018-11-08 10:09 [PATCH] freezer: fix freeze timeout on exec Chanho Min
  2018-11-08 10:50 ` Oleg Nesterov
  2018-11-11  9:50 ` kbuild test robot
@ 2018-11-11  9:51 ` kbuild test robot
  2 siblings, 0 replies; 6+ messages in thread
From: kbuild test robot @ 2018-11-11  9:51 UTC (permalink / raw)
  To: Chanho Min
  Cc: kbuild-all, Rafael J. Wysocki, Pavel Machek, Len Brown,
	Andrew Morton, Eric W. Biederman, Christian Brauner,
	Oleg Nesterov, Anna-Maria Gleixner, linux-pm, linux-kernel,
	Seungho Park, Chanho Min

[-- Attachment #1: Type: text/plain, Size: 1928 bytes --]

Hi Chanho,

I love your patch! Yet something to improve:

[auto build test ERROR on linus/master]
[also build test ERROR on v4.20-rc1 next-20181109]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Chanho-Min/freezer-fix-freeze-timeout-on-exec/20181111-171200
config: i386-randconfig-x002-201845 (attached as .config)
compiler: gcc-7 (Debian 7.3.0-1) 7.3.0
reproduce:
        # save the attached .config to linux build tree
        make ARCH=i386 

All errors (new ones prefixed by >>):

   kernel/signal.c: In function 'zap_other_threads':
>> kernel/signal.c:1283:3: error: implicit declaration of function 'cancel_freezing_thaw_task' [-Werror=implicit-function-declaration]
      cancel_freezing_thaw_task(t);
      ^~~~~~~~~~~~~~~~~~~~~~~~~
   cc1: some warnings being treated as errors

vim +/cancel_freezing_thaw_task +1283 kernel/signal.c

  1260	
  1261	/*
  1262	 * Nuke all other threads in the group.
  1263	 */
  1264	int zap_other_threads(struct task_struct *p)
  1265	{
  1266		struct task_struct *t = p;
  1267		int count = 0;
  1268	
  1269		p->signal->group_stop_count = 0;
  1270	
  1271		while_each_thread(p, t) {
  1272			task_clear_jobctl_pending(t, JOBCTL_PENDING_MASK);
  1273			count++;
  1274	
  1275			/* Don't bother with already dead threads */
  1276			if (t->exit_state)
  1277				continue;
  1278	
  1279			/*
  1280			 * we can check sig->group_exit_task to detect de_thread,
  1281			 * but perhaps it doesn't hurt if the caller is do_group_exit
  1282			 */
> 1283			cancel_freezing_thaw_task(t);
  1284			sigaddset(&t->pending.signal, SIGKILL);
  1285			signal_wake_up(t, 1);
  1286		}
  1287	
  1288		return count;
  1289	}
  1290	

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 27026 bytes --]

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2018-11-11  9:51 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-11-08 10:09 [PATCH] freezer: fix freeze timeout on exec Chanho Min
2018-11-08 10:50 ` Oleg Nesterov
2018-11-09  0:29   ` Chanho Min
2018-11-09  7:50   ` Chanho Min
2018-11-11  9:50 ` kbuild test robot
2018-11-11  9:51 ` kbuild test robot

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).