linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Regression with orderly_poweroff()
@ 2013-03-12  3:25 Benjamin Herrenschmidt
  2013-03-12 14:46 ` Linus Torvalds
  0 siblings, 1 reply; 29+ messages in thread
From: Benjamin Herrenschmidt @ 2013-03-12  3:25 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-kernel, Paul Mackerras, hongfeng, david

Hi Linus !

A couple of weeks ago, David sent an email that went unanswered about a
regression concerning orderly_poweroff(). I think the original patch
causing it should be reverted, here's the actual email with the
explanation:

<<<
Subject: orderly_poweroff() is no longer safe in atomic context

Commit 6c0c0d4d1080840eabb3d055d2fd81911111c5fd "poweroff: fix bug in
orderly_poweroff()" apparently fixes one bug in orderly_poweroff(),
but introduces another.  The comments on orderly_poweroff() claim it
can be called from any context - and indeed we call it from interrupt
context in arch/powerpc/platforms/pseries/ras.c for example.  But
since that commit this is no longer safe, since
call_usermodehelper_fns() is not safe in interrupt context without the
UMH_NO_WAIT option.

I'm having trouble understanding the commit message to see what the
original bug being fixed was.  Specifically I can't make sense of:

  |  The bug here is, step 1 is always successful with param
  |  UMH_NO_WAIT, which obey the design goal of orderly_poweroff.

And without understanding the original bug, I'm not sure what the
correct fix is.
>>>

Cheers,
Ben.



^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: Regression with orderly_poweroff()
  2013-03-12  3:25 Regression with orderly_poweroff() Benjamin Herrenschmidt
@ 2013-03-12 14:46 ` Linus Torvalds
  2013-03-12 17:46   ` Oleg Nesterov
                     ` (2 more replies)
  0 siblings, 3 replies; 29+ messages in thread
From: Linus Torvalds @ 2013-03-12 14:46 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Linux Kernel Mailing List, Paul Mackerras, david, Kees Cook,
	Serge Hallyn, Rafael J. Wysocki, Andrew Morton, Feng Hong,
	Oleg Nesterov, Lucas De Marchi

On Mon, Mar 11, 2013 at 8:25 PM, Benjamin Herrenschmidt
<benh@kernel.crashing.org> wrote:
>
> A couple of weeks ago, David sent an email that went unanswered about a
> regression concerning orderly_poweroff(). I think the original patch
> causing it should be reverted, here's the actual email with the
> explanation:

Hmm.. You should really have cc'd the people who acked it and were in
the sign-off chain too, because all those people are involved with the
patch as well.

Also, the patch doesn't revert cleanly any more after commit
7ff6764061ec ("usermodehelper: cleanup/fix __orderly_poweroff() &&
argv_free()") which seems to be a real bug-fix for a double free, but
which really doesn't seem to work together with UMH_NO_WAIT.

So before reverting that one too, let's at least get the people who
were involved with the original patch (and the bugfix that relies on
it) in the email thread.

I'm leaving David's quoted report for the new people..

                Linus

---
> Subject: orderly_poweroff() is no longer safe in atomic context
>
> Commit 6c0c0d4d1080840eabb3d055d2fd81911111c5fd "poweroff: fix bug in
> orderly_poweroff()" apparently fixes one bug in orderly_poweroff(),
> but introduces another.  The comments on orderly_poweroff() claim it
> can be called from any context - and indeed we call it from interrupt
> context in arch/powerpc/platforms/pseries/ras.c for example.  But
> since that commit this is no longer safe, since
> call_usermodehelper_fns() is not safe in interrupt context without the
> UMH_NO_WAIT option.
>
> I'm having trouble understanding the commit message to see what the
> original bug being fixed was.  Specifically I can't make sense of:
>
>   |  The bug here is, step 1 is always successful with param
>   |  UMH_NO_WAIT, which obey the design goal of orderly_poweroff.
>
> And without understanding the original bug, I'm not sure what the
> correct fix is.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: Regression with orderly_poweroff()
  2013-03-12 14:46 ` Linus Torvalds
@ 2013-03-12 17:46   ` Oleg Nesterov
  2013-03-12 17:54   ` Lucas De Marchi
  2013-03-12 19:28   ` Benjamin Herrenschmidt
  2 siblings, 0 replies; 29+ messages in thread
From: Oleg Nesterov @ 2013-03-12 17:46 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Benjamin Herrenschmidt, Linux Kernel Mailing List,
	Paul Mackerras, david, Kees Cook, Serge Hallyn,
	Rafael J. Wysocki, Andrew Morton, Feng Hong, Lucas De Marchi

On 03/12, Linus Torvalds wrote:
>
> On Mon, Mar 11, 2013 at 8:25 PM, Benjamin Herrenschmidt
> <benh@kernel.crashing.org> wrote:
> >
> > A couple of weeks ago, David sent an email that went unanswered about a
> > regression concerning orderly_poweroff(). I think the original patch
> > causing it should be reverted, here's the actual email with the
> > explanation:
>
> Hmm.. You should really have cc'd the people who acked it and were in
> the sign-off chain too, because all those people are involved with the
> patch as well.
>
> Also, the patch doesn't revert cleanly any more after commit
> 7ff6764061ec ("usermodehelper: cleanup/fix __orderly_poweroff() &&
> argv_free()") which seems to be a real bug-fix for a double free, but
> which really doesn't seem to work together with UMH_NO_WAIT.
>
> So before reverting that one too, let's at least get the people who
> were involved with the original patch (and the bugfix that relies on
> it) in the email thread.
>
> I'm leaving David's quoted report for the new people..
>
>                 Linus
>
> ---
> > Subject: orderly_poweroff() is no longer safe in atomic context
> >
> > Commit 6c0c0d4d1080840eabb3d055d2fd81911111c5fd "poweroff: fix bug in
> > orderly_poweroff()" apparently fixes one bug in orderly_poweroff(),
> > but introduces another.  The comments on orderly_poweroff() claim it
> > can be called from any context - and indeed we call it from interrupt
> > context in arch/powerpc/platforms/pseries/ras.c for example.  But
> > since that commit this is no longer safe, since
> > call_usermodehelper_fns() is not safe in interrupt context without the
> > UMH_NO_WAIT option.
> >
> > I'm having trouble understanding the commit message to see what the
> > original bug being fixed was.  Specifically I can't make sense of:
> >
> >   |  The bug here is, step 1 is always successful with param
> >   |  UMH_NO_WAIT, which obey the design goal of orderly_poweroff.

I guess this means that UMH_NO_WAIT is pointless, it (almost) never
fails and thus we do not do kernel_power_off() if, say, there is no
/sbin/poweroff.

Well, if it can be called from interrupt, we should either skip
call_usermodehelper() or use schedule_work() for that...

And I didn't notice argv_split(GFP_ATOMIC), this is pointless because
we are going to sleep anyway.

So,

	- We can simply change orderly_poweroff() to use queue_work().

	  This makes it asynchronous even if we do not run the command.

	  And with this change it can only return the error if powerof_work
	  is already pending, perhaps this is fine. Only 2 callers check
	  the returned error just to print the warning. And this way
	  we can kill inprog/shutting_down in envctrl_do_shutdown() and
	  do_envctrl_shutdown().

	- We can add orderly_poweroff_async() which does this, and change
	  the in_atomic() callers.

	- We can add "bool in_atomic" argument which means do-not-exec
	  or use-workqueue.

	- Anything else?

Oleg.


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: Regression with orderly_poweroff()
  2013-03-12 14:46 ` Linus Torvalds
  2013-03-12 17:46   ` Oleg Nesterov
@ 2013-03-12 17:54   ` Lucas De Marchi
  2013-03-12 18:22     ` Oleg Nesterov
  2013-03-12 19:28   ` Benjamin Herrenschmidt
  2 siblings, 1 reply; 29+ messages in thread
From: Lucas De Marchi @ 2013-03-12 17:54 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Benjamin Herrenschmidt, Linux Kernel Mailing List,
	Paul Mackerras, david, Kees Cook, Serge Hallyn,
	Rafael J. Wysocki, Andrew Morton, Feng Hong, Oleg Nesterov,
	Lucas De Marchi

On Tue, Mar 12, 2013 at 11:46 AM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Mon, Mar 11, 2013 at 8:25 PM, Benjamin Herrenschmidt
> <benh@kernel.crashing.org> wrote:
>>
>> A couple of weeks ago, David sent an email that went unanswered about a
>> regression concerning orderly_poweroff(). I think the original patch
>> causing it should be reverted, here's the actual email with the
>> explanation:
>
> Hmm.. You should really have cc'd the people who acked it and were in
> the sign-off chain too, because all those people are involved with the
> patch as well.
>
> Also, the patch doesn't revert cleanly any more after commit
> 7ff6764061ec ("usermodehelper: cleanup/fix __orderly_poweroff() &&
> argv_free()") which seems to be a real bug-fix for a double free, but
> which really doesn't seem to work together with UMH_NO_WAIT.

Yep, using it in that way with UMH_NO_WAIT will not work.

>
> So before reverting that one too, let's at least get the people who
> were involved with the original patch (and the bugfix that relies on
> it) in the email thread.

I have some pending patches on LKML to remove
call_usermodehelper_fns() and export call_usermodehelper_{setup,exec}.
 Doing this we can separate the allocation part using GFP_ATOMIC in
order to make it work on interrupt context. May I suggest going with
something like below after that patches are applied (sorry, whitespace
damaged)? I can also rework the patch series so this can be applied
regardless of the rest.


Lucas De Marchi

diff --git a/kernel/sys.c b/kernel/sys.c
index bd15276..6df5ec7 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -2194,7 +2194,8 @@ static int __orderly_poweroff(void)
  "PATH=/sbin:/bin:/usr/sbin:/usr/bin",
  NULL
  };
- int ret;
+ struct subprocess_info *info;
+ int ret = -ENOMEM;

  argv = argv_split(GFP_ATOMIC, poweroff_cmd, &argc);
  if (argv == NULL) {
@@ -2203,7 +2204,10 @@ static int __orderly_poweroff(void)
  return -ENOMEM;
  }

- ret = call_usermodehelper(argv[0], argv, envp, UMH_WAIT_EXEC);
+ info = call_usermodehelper_setup(argv[0], argv, envp, GFP_ATOMIC,
+ NULL, NULL, NULL);
+ if (info)
+ ret = call_usermodehelper_exec(info, UMH_WAIT_EXEC);
  argv_free(argv);

  return ret;

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: Regression with orderly_poweroff()
  2013-03-12 17:54   ` Lucas De Marchi
@ 2013-03-12 18:22     ` Oleg Nesterov
  2013-03-12 18:42       ` Linus Torvalds
  0 siblings, 1 reply; 29+ messages in thread
From: Oleg Nesterov @ 2013-03-12 18:22 UTC (permalink / raw)
  To: Lucas De Marchi
  Cc: Linus Torvalds, Benjamin Herrenschmidt,
	Linux Kernel Mailing List, Paul Mackerras, david, Kees Cook,
	Serge Hallyn, Rafael J. Wysocki, Andrew Morton, Feng Hong,
	Lucas De Marchi

On 03/12, Lucas De Marchi wrote:
>
> I have some pending patches on LKML to remove
> call_usermodehelper_fns() and export call_usermodehelper_{setup,exec}.
>  Doing this we can separate the allocation part using GFP_ATOMIC

But this is pointless. Contrary, we should change __orderly_poweroff()
to do not use GFP_ATOMIC for argv_split(), see below.

> --- a/kernel/sys.c
> +++ b/kernel/sys.c
> @@ -2194,7 +2194,8 @@ static int __orderly_poweroff(void)
>   "PATH=/sbin:/bin:/usr/sbin:/usr/bin",
>   NULL
>   };
> - int ret;
> + struct subprocess_info *info;
> + int ret = -ENOMEM;
> 
>   argv = argv_split(GFP_ATOMIC, poweroff_cmd, &argc);
>   if (argv == NULL) {
> @@ -2203,7 +2204,10 @@ static int __orderly_poweroff(void)
>   return -ENOMEM;
>   }
> 
> - ret = call_usermodehelper(argv[0], argv, envp, UMH_WAIT_EXEC);
> + info = call_usermodehelper_setup(argv[0], argv, envp, GFP_ATOMIC,
> + NULL, NULL, NULL);
> + if (info)
> + ret = call_usermodehelper_exec(info, UMH_WAIT_EXEC);

And how this can help? The real problem is not GFP_KERNEL.
call_usermodehelper_exec(UMH_WAIT_EXEC) will block.

Oleg.


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: Regression with orderly_poweroff()
  2013-03-12 18:22     ` Oleg Nesterov
@ 2013-03-12 18:42       ` Linus Torvalds
  2013-03-12 19:11         ` Oleg Nesterov
  0 siblings, 1 reply; 29+ messages in thread
From: Linus Torvalds @ 2013-03-12 18:42 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Lucas De Marchi, Benjamin Herrenschmidt,
	Linux Kernel Mailing List, Paul Mackerras, david, Kees Cook,
	Serge Hallyn, Rafael J. Wysocki, Andrew Morton, Feng Hong,
	Lucas De Marchi

On Tue, Mar 12, 2013 at 11:22 AM, Oleg Nesterov <oleg@redhat.com> wrote:
>
> And how this can help? The real problem is not GFP_KERNEL.
> call_usermodehelper_exec(UMH_WAIT_EXEC) will block.

Well, it's probably a starting point.

You need to do the argument handling atomically, because you cannot
delay that in a workqueue (the arguments will be long gone by the time
the workqueue starts up). So I think the fix is a combination of your
and Lucas' code, where you first do the setup atomically (copying the
arguments and allocating that space with GFP_ATOMIC) and then you do a
workqueue to actually do the real work of the usermode helper thing.

                 Linus

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: Regression with orderly_poweroff()
  2013-03-12 18:42       ` Linus Torvalds
@ 2013-03-12 19:11         ` Oleg Nesterov
  2013-03-12 19:20           ` Linus Torvalds
  2013-03-12 20:13           ` Regression with orderly_poweroff() Andi Kleen
  0 siblings, 2 replies; 29+ messages in thread
From: Oleg Nesterov @ 2013-03-12 19:11 UTC (permalink / raw)
  To: Linus Torvalds, Andi Kleen
  Cc: Lucas De Marchi, Benjamin Herrenschmidt,
	Linux Kernel Mailing List, Paul Mackerras, david, Kees Cook,
	Serge Hallyn, Rafael J. Wysocki, Andrew Morton, Feng Hong,
	Lucas De Marchi

On 03/12, Linus Torvalds wrote:
>
> On Tue, Mar 12, 2013 at 11:22 AM, Oleg Nesterov <oleg@redhat.com> wrote:
> >
> > And how this can help? The real problem is not GFP_KERNEL.
> > call_usermodehelper_exec(UMH_WAIT_EXEC) will block.
>
> Well, it's probably a starting point.
>
> You need to do the argument handling atomically, because you cannot
> delay that in a workqueue (the arguments will be long gone by the time
> the workqueue starts up).

Confused... which arguments? The only argument is poweroff_cmd, it can't
go away and kmalloc(GFP_KERNEL) is fine in work->func() ?

> So I think the fix is a combination of your
> and Lucas' code, where you first do the setup atomically (copying the
> arguments and allocating that space with GFP_ATOMIC) and then you do a
> workqueue to actually do the real work of the usermode helper thing.

OK, whatever I missed we can do this, and the pending patches from Lucas
(split allocation and call_usermodehelper_exec) makes sense anyway.

In fact we can do more. On the top of Lucas's changes we can change
call_usermodehelper_freeinfo() to not call kfree(info) unconditionally,
and then we can avoid even kmalloc(GFP_ATOMIC). Not sure this actually
makes sense though.

So do you agree that orderly_poweroff() can simply use schedule_work() ?




Btw. There is another "strange" user, arch/x86/kernel/cpu/mcheck/mce.c.
It uses mce_trigger_work to call call_usermodehelper(UMH_NO_WAIT).
Why? UMH_NO_WAIT is already atomic. And the !work_pending() check is
confusing, schedule_work(schedule_work) checks it is not pending.

Oleg.


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: Regression with orderly_poweroff()
  2013-03-12 19:11         ` Oleg Nesterov
@ 2013-03-12 19:20           ` Linus Torvalds
  2013-03-12 20:35             ` Oleg Nesterov
  2013-03-12 20:13           ` Regression with orderly_poweroff() Andi Kleen
  1 sibling, 1 reply; 29+ messages in thread
From: Linus Torvalds @ 2013-03-12 19:20 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Andi Kleen, Lucas De Marchi, Benjamin Herrenschmidt,
	Linux Kernel Mailing List, Paul Mackerras, david, Kees Cook,
	Serge Hallyn, Rafael J. Wysocki, Andrew Morton, Feng Hong,
	Lucas De Marchi

On Tue, Mar 12, 2013 at 12:11 PM, Oleg Nesterov <oleg@redhat.com> wrote:
>
> Confused... which arguments? The only argument is poweroff_cmd, it can't
> go away and kmalloc(GFP_KERNEL) is fine in work->func() ?

Hmm. I guess in this particular situation, we really don't have any
arguments from callers, no. I was confused by the fact that we had
that argv_split/argv_free thing, but that doesn't actually come from
the callers, it just comes from our own buffer. So yeah, I guess
everything could just go into a workqueue.

               Linus

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: Regression with orderly_poweroff()
  2013-03-12 14:46 ` Linus Torvalds
  2013-03-12 17:46   ` Oleg Nesterov
  2013-03-12 17:54   ` Lucas De Marchi
@ 2013-03-12 19:28   ` Benjamin Herrenschmidt
  2 siblings, 0 replies; 29+ messages in thread
From: Benjamin Herrenschmidt @ 2013-03-12 19:28 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Linux Kernel Mailing List, Paul Mackerras, david, Kees Cook,
	Serge Hallyn, Rafael J. Wysocki, Andrew Morton, Feng Hong,
	Oleg Nesterov, Lucas De Marchi

On Tue, 2013-03-12 at 07:46 -0700, Linus Torvalds wrote:
> 
> Hmm.. You should really have cc'd the people who acked it and were in
> the sign-off chain too, because all those people are involved with the
> patch as well.

Ah oops, my bad, just used David's original CC list, too much stuff on
my plate before I head out for a big family holiday tomorrow :-)

David, you'll handle it from there ?

Cheers,
Ben.



^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: Regression with orderly_poweroff()
  2013-03-12 19:11         ` Oleg Nesterov
  2013-03-12 19:20           ` Linus Torvalds
@ 2013-03-12 20:13           ` Andi Kleen
  1 sibling, 0 replies; 29+ messages in thread
From: Andi Kleen @ 2013-03-12 20:13 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Linus Torvalds, Andi Kleen, Lucas De Marchi,
	Benjamin Herrenschmidt, Linux Kernel Mailing List,
	Paul Mackerras, david, Kees Cook, Serge Hallyn,
	Rafael J. Wysocki, Andrew Morton, Feng Hong, Lucas De Marchi

> Btw. There is another "strange" user, arch/x86/kernel/cpu/mcheck/mce.c.
> It uses mce_trigger_work to call call_usermodehelper(UMH_NO_WAIT).
> Why? UMH_NO_WAIT is already atomic. And the !work_pending() check is
> confusing, schedule_work(schedule_work) checks it is not pending.

I think you're right, the additional step shouldn't be needed.

The MCE Handler uses the trick to do this in MCE context if the
interrupts were enabled earlier. 

-Andi

-- 
ak@linux.intel.com -- Speaking for myself only.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: Regression with orderly_poweroff()
  2013-03-12 19:20           ` Linus Torvalds
@ 2013-03-12 20:35             ` Oleg Nesterov
  2013-03-13 17:46               ` [PATCH 0/1] poweroff: change orderly_poweroff() to use schedule_work() Oleg Nesterov
  0 siblings, 1 reply; 29+ messages in thread
From: Oleg Nesterov @ 2013-03-12 20:35 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Andi Kleen, Lucas De Marchi, Benjamin Herrenschmidt,
	Linux Kernel Mailing List, Paul Mackerras, david, Kees Cook,
	Serge Hallyn, Rafael J. Wysocki, Andrew Morton, Feng Hong,
	Lucas De Marchi

On 03/12, Linus Torvalds wrote:
>
> So yeah, I guess
> everything could just go into a workqueue.

OK, I'll try to make the patch tomorrow. Should be trivial but it is
not clear how we should pass "bool force" without allocating the
work_struct which would be nice to avoid.

And I am a bit worried about UMH_WAIT_EXEC from system_wq. Should be
fine afaics...

And. It seems there is another problem. argv_split(poweroff_cmd) can
obviously race with proc_dostring() ? If nothing else, argv_split()
doesn't check 'argc' when it does "*argvp++ = t". And this is not
__orderly_poweroff-specific. This looks simple... and probably we can
even simplify argv_split/argv_free. We can simply kstrndup() the
original string once and do s/space/0/.

Oleg.


^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH 0/1] poweroff: change orderly_poweroff() to use schedule_work()
  2013-03-12 20:35             ` Oleg Nesterov
@ 2013-03-13 17:46               ` Oleg Nesterov
  2013-03-13 17:47                 ` [PATCH 1/1] " Oleg Nesterov
  2013-03-13 23:35                 ` [PATCH 0/1] poweroff: change orderly_poweroff() to use schedule_work() Lucas De Marchi
  0 siblings, 2 replies; 29+ messages in thread
From: Oleg Nesterov @ 2013-03-13 17:46 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Andi Kleen, Lucas De Marchi, Benjamin Herrenschmidt,
	Linux Kernel Mailing List, Paul Mackerras, david, Kees Cook,
	Serge Hallyn, Rafael J. Wysocki, Andrew Morton, Feng Hong,
	Lucas De Marchi

On 03/12, Oleg Nesterov wrote:
>
> On 03/12, Linus Torvalds wrote:
> >
> > So yeah, I guess
> > everything could just go into a workqueue.
>
> OK, I'll try to make the patch tomorrow. Should be trivial but it is
> not clear how we should pass "bool force" without allocating the
> work_struct which would be nice to avoid.

Yes, it would be nice to keep it simple and use a single work/arg.

Could you review? The change is trivial but

	- orderly_poweroff() always return 0.

	- the patch assumes that orderly_poweroff(false) after
	  orderly_poweroff(true) acts as "force = true". Only xen
	  uses "false", I hope this is fine.

	  In fact I think we can change poweroff_force argument
	  unconditionally, this "if (force)" check is mostly
	  documentation.

	  But we can add the locking or even allocate work_struct
	  every time if this is wrong (or just looks wrong).

	- The patch assumes that orderly_poweroff() doesn't need
	  the keventd_up() check, I hope this is correct...


Lucas, Andrew, sorry. If this patch will be applied, then

	kernel-sysc-use-the-simpler-call_usermodehelper.patch

should be dropped. Or I can redo this fix on top of -mm cleanup.

> And. It seems there is another problem. argv_split(poweroff_cmd) can
> obviously race with proc_dostring() ?

I'll send another patch...

Oleg.


^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH 1/1] poweroff: change orderly_poweroff() to use schedule_work()
  2013-03-13 17:46               ` [PATCH 0/1] poweroff: change orderly_poweroff() to use schedule_work() Oleg Nesterov
@ 2013-03-13 17:47                 ` Oleg Nesterov
  2013-03-14 22:28                   ` Andrew Morton
  2013-03-13 23:35                 ` [PATCH 0/1] poweroff: change orderly_poweroff() to use schedule_work() Lucas De Marchi
  1 sibling, 1 reply; 29+ messages in thread
From: Oleg Nesterov @ 2013-03-13 17:47 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Andi Kleen, Lucas De Marchi, Benjamin Herrenschmidt,
	Linux Kernel Mailing List, Paul Mackerras, david, Kees Cook,
	Serge Hallyn, Rafael J. Wysocki, Andrew Morton, Feng Hong,
	Lucas De Marchi

orderly_poweroff() can be used from any context but UMH_WAIT_EXEC
is sleepable. Move the "force" logic into __orderly_poweroff() and
change orderly_poweroff() to use the global poweroff_work which
simply calls __orderly_poweroff().

While at it, remove the unneeded "int argc" and change argv_split()
to use GFP_KERNEL.

We use the global "bool poweroff_force" to pass the argument, this
can obviously affect the previous request if it is pending/running.
So we only allow the "false => true" transition assuming that the
pending "true" should succeed anyway. If schedule_work() fails after
that we know that work->func() was not called yet, it must see the
new value.

This means that orderly_poweroff() becomes async even if we do not
run the command and always succeeds, schedule_work() can only fail
if the work is already pending. We can export __orderly_poweroff()
and change the non-atomic callers which want the old semantics.

Reported-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---
 kernel/sys.c |   57 ++++++++++++++++++++++++++++++++-------------------------
 1 files changed, 32 insertions(+), 25 deletions(-)

diff --git a/kernel/sys.c b/kernel/sys.c
index e02a036..7c9565f 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -2184,9 +2184,8 @@ SYSCALL_DEFINE3(getcpu, unsigned __user *, cpup, unsigned __user *, nodep,
 
 char poweroff_cmd[POWEROFF_CMD_PATH_LEN] = "/sbin/poweroff";
 
-static int __orderly_poweroff(void)
+static int __orderly_poweroff(bool force)
 {
-	int argc;
 	char **argv;
 	static char *envp[] = {
 		"HOME=/",
@@ -2195,20 +2194,40 @@ static int __orderly_poweroff(void)
 	};
 	int ret;
 
-	argv = argv_split(GFP_ATOMIC, poweroff_cmd, &argc);
-	if (argv == NULL) {
+	argv = argv_split(GFP_KERNEL, poweroff_cmd, NULL);
+	if (argv) {
+		ret = call_usermodehelper(argv[0], argv, envp, UMH_WAIT_EXEC);
+		argv_free(argv);
+	} else {
 		printk(KERN_WARNING "%s failed to allocate memory for \"%s\"\n",
-		       __func__, poweroff_cmd);
-		return -ENOMEM;
+					 __func__, poweroff_cmd);
+		ret = -ENOMEM;
 	}
 
-	ret = call_usermodehelper_fns(argv[0], argv, envp, UMH_WAIT_EXEC,
-				      NULL, NULL, NULL);
-	argv_free(argv);
+	if (ret && force) {
+		printk(KERN_WARNING "Failed to start orderly shutdown: "
+					"forcing the issue\n");
+		/*
+		 * I guess this should try to kick off some daemon to sync and
+		 * poweroff asap.  Or not even bother syncing if we're doing an
+		 * emergency shutdown?
+		 */
+		emergency_sync();
+		kernel_power_off();
+	}
 
 	return ret;
 }
 
+static bool poweroff_force;
+
+static void poweroff_work_func(struct work_struct *work)
+{
+	__orderly_poweroff(poweroff_force);
+}
+
+static DECLARE_WORK(poweroff_work, poweroff_work_func);
+
 /**
  * orderly_poweroff - Trigger an orderly system poweroff
  * @force: force poweroff if command execution fails
@@ -2218,21 +2237,9 @@ static int __orderly_poweroff(void)
  */
 int orderly_poweroff(bool force)
 {
-	int ret = __orderly_poweroff();
-
-	if (ret && force) {
-		printk(KERN_WARNING "Failed to start orderly shutdown: "
-		       "forcing the issue\n");
-
-		/*
-		 * I guess this should try to kick off some daemon to sync and
-		 * poweroff asap.  Or not even bother syncing if we're doing an
-		 * emergency shutdown?
-		 */
-		emergency_sync();
-		kernel_power_off();
-	}
-
-	return ret;
+	if (force) /* do not override the pending "true" */
+		poweroff_force = true;
+	schedule_work(&poweroff_work);
+	return 0;
 }
 EXPORT_SYMBOL_GPL(orderly_poweroff);
-- 
1.5.5.1



^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: [PATCH 0/1] poweroff: change orderly_poweroff() to use schedule_work()
  2013-03-13 17:46               ` [PATCH 0/1] poweroff: change orderly_poweroff() to use schedule_work() Oleg Nesterov
  2013-03-13 17:47                 ` [PATCH 1/1] " Oleg Nesterov
@ 2013-03-13 23:35                 ` Lucas De Marchi
  1 sibling, 0 replies; 29+ messages in thread
From: Lucas De Marchi @ 2013-03-13 23:35 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Linus Torvalds, Andi Kleen, Lucas De Marchi,
	Benjamin Herrenschmidt, Linux Kernel Mailing List,
	Paul Mackerras, david, Kees Cook, Serge Hallyn,
	Rafael J. Wysocki, Andrew Morton, Feng Hong

Hi Oleg,

On Wed, Mar 13, 2013 at 2:46 PM, Oleg Nesterov <oleg@redhat.com> wrote:
> On 03/12, Oleg Nesterov wrote:
>>
>> On 03/12, Linus Torvalds wrote:
>> >
>> > So yeah, I guess
>> > everything could just go into a workqueue.
>>
>> OK, I'll try to make the patch tomorrow. Should be trivial but it is
>> not clear how we should pass "bool force" without allocating the
>> work_struct which would be nice to avoid.
>
> Yes, it would be nice to keep it simple and use a single work/arg.
>
> Could you review? The change is trivial but
>
>         - orderly_poweroff() always return 0.
>
>         - the patch assumes that orderly_poweroff(false) after
>           orderly_poweroff(true) acts as "force = true". Only xen
>           uses "false", I hope this is fine.
>
>           In fact I think we can change poweroff_force argument
>           unconditionally, this "if (force)" check is mostly
>           documentation.

I'm not so familiar with this code, but for me it looks reasonable to
let orderly_poweroff(true) win even if there's an
orderly_poweroff(false) later.

>
>           But we can add the locking or even allocate work_struct
>           every time if this is wrong (or just looks wrong).
>
>         - The patch assumes that orderly_poweroff() doesn't need
>           the keventd_up() check, I hope this is correct...
>
>
> Lucas, Andrew, sorry. If this patch will be applied, then
>
>         kernel-sysc-use-the-simpler-call_usermodehelper.patch

No problem for me... your patch already does what this one is doing.

Lucas De Marchi

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 1/1] poweroff: change orderly_poweroff() to use schedule_work()
  2013-03-13 17:47                 ` [PATCH 1/1] " Oleg Nesterov
@ 2013-03-14 22:28                   ` Andrew Morton
  2013-03-15 16:39                     ` Oleg Nesterov
  0 siblings, 1 reply; 29+ messages in thread
From: Andrew Morton @ 2013-03-14 22:28 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Linus Torvalds, Andi Kleen, Lucas De Marchi,
	Benjamin Herrenschmidt, Linux Kernel Mailing List,
	Paul Mackerras, david, Kees Cook, Serge Hallyn,
	Rafael J. Wysocki, Feng Hong, Lucas De Marchi

On Wed, 13 Mar 2013 18:47:05 +0100 Oleg Nesterov <oleg@redhat.com> wrote:

> orderly_poweroff() can be used from any context but UMH_WAIT_EXEC
> is sleepable. Move the "force" logic into __orderly_poweroff() and
> change orderly_poweroff() to use the global poweroff_work which
> simply calls __orderly_poweroff().
> 
> While at it, remove the unneeded "int argc" and change argv_split()
> to use GFP_KERNEL.
> 
> We use the global "bool poweroff_force" to pass the argument, this
> can obviously affect the previous request if it is pending/running.
> So we only allow the "false => true" transition assuming that the
> pending "true" should succeed anyway. If schedule_work() fails after
> that we know that work->func() was not called yet, it must see the
> new value.
> 
> This means that orderly_poweroff() becomes async even if we do not
> run the command and always succeeds, schedule_work() can only fail
> if the work is already pending. We can export __orderly_poweroff()
> and change the non-atomic callers which want the old semantics.
> 
> ...
>
> @@ -2218,21 +2237,9 @@ static int __orderly_poweroff(void)
>   */
>  int orderly_poweroff(bool force)
>  {
> -	int ret = __orderly_poweroff();
> -
> -	if (ret && force) {
> -		printk(KERN_WARNING "Failed to start orderly shutdown: "
> -		       "forcing the issue\n");
> -
> -		/*
> -		 * I guess this should try to kick off some daemon to sync and
> -		 * poweroff asap.  Or not even bother syncing if we're doing an
> -		 * emergency shutdown?
> -		 */
> -		emergency_sync();
> -		kernel_power_off();
> -	}
> -
> -	return ret;
> +	if (force) /* do not override the pending "true" */
> +		poweroff_force = true;
> +	schedule_work(&poweroff_work);
> +	return 0;
>  }

afaict the current version of orderly_poweroff() will never return -
either __orderly_poweroff() will block until the machine shuts down or
kernel_power_off() will do so.

However with this patch there is a path via which orderly_poweroff()
can return to its caller, I think?  If so, the caller might be rather
surprised and we're exercising never-before-used code paths.  In fact
if the surprised caller goes oops, the poweroff might not occur at all.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 1/1] poweroff: change orderly_poweroff() to use schedule_work()
  2013-03-14 22:28                   ` Andrew Morton
@ 2013-03-15 16:39                     ` Oleg Nesterov
  2013-03-16 20:23                       ` [PATCH 0/2] finx argv_split() vs sysctl race Oleg Nesterov
  0 siblings, 1 reply; 29+ messages in thread
From: Oleg Nesterov @ 2013-03-15 16:39 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Linus Torvalds, Andi Kleen, Lucas De Marchi,
	Benjamin Herrenschmidt, Linux Kernel Mailing List,
	Paul Mackerras, david, Kees Cook, Serge Hallyn,
	Rafael J. Wysocki, Feng Hong, Lucas De Marchi

On 03/14, Andrew Morton wrote:
>
> On Wed, 13 Mar 2013 18:47:05 +0100 Oleg Nesterov <oleg@redhat.com> wrote:
>
> > This means that orderly_poweroff() becomes async even if we do not
> > run the command and always succeeds, schedule_work() can only fail
> > if the work is already pending. We can export __orderly_poweroff()
> > and change the non-atomic callers which want the old semantics.
> >
> > ...
> >
> > @@ -2218,21 +2237,9 @@ static int __orderly_poweroff(void)
> >   */
> >  int orderly_poweroff(bool force)
> >  {
> > +	if (force) /* do not override the pending "true" */
> > +		poweroff_force = true;
> > +	schedule_work(&poweroff_work);
> > +	return 0;
> >  }
>
> afaict the current version of orderly_poweroff() will never return -
> either __orderly_poweroff() will block until the machine shuts down or
> kernel_power_off() will do so.

Note that __orderly_poweroff() uses UMH_WAIT_EXEC, not UMH_WAIT_PROC,
so it returns right after /sbin/poweroff starts to execute.

So it is already asynchronous unless execve() fails.

> However with this patch there is a path via which orderly_poweroff()
> can return to its caller, I think?

See above, but please also read the changelog.

With this patch orderly_poweroff() is always async, even if exec fails,
but

> If so, the caller might be rather
> surprised and we're exercising never-before-used code paths.  In fact
> if the surprised caller goes oops, the poweroff might not occur at all.

This should not happen.

Anyway. Please also note that now we can export __orderly_poweroff() and
probably change it, it can have another argument "bool use_UMH_WAIT_PROC".

	int __orderly_poweroff(bool force, bool sync)
	{
		int wait = sync ? UMH_WAIT_EXEC : UMH_WAIT_EXEC;

		ret = call_usermodehelper(argv[0], argv, envp, wait);

		if (force) {
			// EXEC failed or /sbin/poweroff didn't do its work
			if (ret || sync)
				kernel_power_off();
		}
	}

The non-atomic callers can use __orderly_poweroff(sync => true).

---------------------------------------------------------------------------
And, Andrew, et all... Could you help with another mentioned problem? It is
really trivial, but exactly because it is trivial I do not know what should
I do.

To remind, say, argv_split(poweroff_cmd) can race with sysctl changing this
string, in this case it can write to the memory after argv[] array. We can
fix this, or we can rewrite argv_split/free:

	void argv_free(char **argv)
	{
		kfree(argv[-1]);
		kfree(argv);
	}

	char **argv_split(gfp_t gfp, const char *str, int *argcp)
	{
		char *argv_str;
		bool was_space;
		char **argv, **argv_ret;
		int argc;

		argv_str = kstrndup(str, KMALLOC_MAX_SIZE, gfp);
		if (!argv_str)
			return NULL;

		argc = count_argc(argv_str);
		argv = kmalloc(sizeof(*argv) * (argc + 2), gfp);
		if (!argv) {
			kfree(argv_str);
			return NULL;
		}

		*argv = argv_str;
		argv_ret = ++argv;
		for (was_space = true; *argv_str; argv_str++) {
			if (isspace(*argv_str)) {
				was_space = true;
				*argv_str = 0;
			} else if (was_space) {
				was_space = false;
				*argv++ = argv_str;
			}
		}
		*argv = NULL;

		if (argcp)
			*argcp = argc;
		return argv_ret;
	}

This way it uses a single kstrndup() to keep the arguments and it is
always safe.

But, whatever we do with argv_split(), it can hit the string "in between".
Personally I think we do not really care, but...

Perhaps we should add proc_dostring_lock() which takes some lock and
modify the callers of argv_split() (or add argv_split_lock) ?

Or perhaps we should introduce the rwsem which should protect every
sysctl-string and proc_dostring() should take this lock?

Help! I'd prefer to rewrite argv_split(), but I agree with any suggestion
in advance.

Oleg.


^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH 0/2] finx argv_split() vs sysctl race
  2013-03-15 16:39                     ` Oleg Nesterov
@ 2013-03-16 20:23                       ` Oleg Nesterov
  2013-03-16 20:23                         ` [PATCH 1/2] teach argv_split() to handle the mutable strings Oleg Nesterov
                                           ` (2 more replies)
  0 siblings, 3 replies; 29+ messages in thread
From: Oleg Nesterov @ 2013-03-16 20:23 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Linus Torvalds, Andi Kleen, Lucas De Marchi,
	Benjamin Herrenschmidt, Linux Kernel Mailing List,
	Paul Mackerras, david, Kees Cook, Serge Hallyn,
	Rafael J. Wysocki, Feng Hong, Lucas De Marchi

On 03/15, Oleg Nesterov wrote:
>
> To remind, say, argv_split(poweroff_cmd) can race with sysctl changing this
> string, in this case it can write to the memory after argv[] array. We can
> fix this, or we can rewrite argv_split/free:

OK, please see 1/2.

And this reminds me about set_task_comm() which pretends it does something
meaningful for the reader of the mutable ->comm, see the offtopic 2/2.

> But, whatever we do with argv_split(), it can hit the string "in between".
> Personally I think we do not really care, but...
>
> Perhaps we should add proc_dostring_lock() which takes some lock and
> modify the callers of argv_split() (or add argv_split_lock) ?
>
> Or perhaps we should introduce the rwsem which should protect every
> sysctl-string and proc_dostring() should take this lock?

Please tell me if you think we should do something with that.

Oleg.


^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH 1/2] teach argv_split() to handle the mutable strings
  2013-03-16 20:23                       ` [PATCH 0/2] finx argv_split() vs sysctl race Oleg Nesterov
@ 2013-03-16 20:23                         ` Oleg Nesterov
  2013-03-18 16:03                           ` [PATCH v2 " Oleg Nesterov
  2013-03-18 21:53                           ` [PATCH " Andrew Morton
  2013-03-16 20:24                         ` [PATCH 2/2] set_task_comm: kill the pointless memset() + wmb() Oleg Nesterov
  2013-03-16 20:32                         ` [PATCH 0/2] finx argv_split() vs sysctl race Andi Kleen
  2 siblings, 2 replies; 29+ messages in thread
From: Oleg Nesterov @ 2013-03-16 20:23 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Linus Torvalds, Andi Kleen, Lucas De Marchi,
	Benjamin Herrenschmidt, Linux Kernel Mailing List,
	Paul Mackerras, david, Kees Cook, Serge Hallyn,
	Rafael J. Wysocki, Feng Hong, Lucas De Marchi

argv_split() allocates argv[count_argc(str)] array and assumes that
it will find the same number of arguments later. This is obviously
wrong if this string can be changed, say, by sysctl.

With this patch argv_split() kstrndup's the whole string and does
not split it, we simply replace the spaces with zeroes and keep the
allocated memory in argv[-1] for argv_free(arg).

We do not use argv[0] because:

	- str can be all-spaces or empty. In fact this case is fine,
	  we could kfree() it before return, but:

	- str can have a space at the start, and we can not rely on
	  kstrndup(skip_arg(str)) because it can equally race if this
	  string is mutable.

Also, simplify count_argc() and kill the no longer used skip_arg().

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---
 lib/argv_split.c |   82 +++++++++++++++++++++++-------------------------------
 1 files changed, 35 insertions(+), 47 deletions(-)

diff --git a/lib/argv_split.c b/lib/argv_split.c
index 1e9a6cb..dfa534d 100644
--- a/lib/argv_split.c
+++ b/lib/argv_split.c
@@ -8,23 +8,17 @@
 #include <linux/slab.h>
 #include <linux/export.h>
 
-static const char *skip_arg(const char *cp)
-{
-	while (*cp && !isspace(*cp))
-		cp++;
-
-	return cp;
-}
-
 static int count_argc(const char *str)
 {
 	int count = 0;
+	bool was_space;
 
-	while (*str) {
-		str = skip_spaces(str);
-		if (*str) {
+	for (was_space = true; *str; str++) {
+		if (isspace(*str)) {
+			was_space = true;
+		} else if (was_space) {
+			was_space = false;
 			count++;
-			str = skip_arg(str);
 		}
 	}
 
@@ -39,10 +33,7 @@ static int count_argc(const char *str)
  */
 void argv_free(char **argv)
 {
-	char **p;
-	for (p = argv; *p; p++)
-		kfree(*p);
-
+	kfree(argv[-1]);
 	kfree(argv);
 }
 EXPORT_SYMBOL(argv_free);
@@ -62,40 +53,37 @@ EXPORT_SYMBOL(argv_free);
  */
 char **argv_split(gfp_t gfp, const char *str, int *argcp)
 {
-	int argc = count_argc(str);
-	char **argv = kzalloc(sizeof(*argv) * (argc+1), gfp);
-	char **argvp;
-
-	if (argv == NULL)
-		goto out;
-
-	if (argcp)
-		*argcp = argc;
-
-	argvp = argv;
-
-	while (*str) {
-		str = skip_spaces(str);
-
-		if (*str) {
-			const char *p = str;
-			char *t;
-
-			str = skip_arg(str);
+	char *argv_str;
+	bool was_space;
+	char **argv, **argv_ret;
+	int argc;
+
+	argv_str = kstrndup(str, KMALLOC_MAX_SIZE, gfp);
+	if (!argv_str)
+		return NULL;
+
+	argc = count_argc(argv_str);
+	argv = kmalloc(sizeof(*argv) * (argc + 2), gfp);
+	if (!argv) {
+		kfree(argv_str);
+		return NULL;
+	}
 
-			t = kstrndup(p, str-p, gfp);
-			if (t == NULL)
-				goto fail;
-			*argvp++ = t;
+	*argv = argv_str;
+	argv_ret = ++argv;
+	for (was_space = true; *argv_str; argv_str++) {
+		if (isspace(*argv_str)) {
+			was_space = true;
+			*argv_str = 0;
+		} else if (was_space) {
+			was_space = false;
+			*argv++ = argv_str;
 		}
 	}
-	*argvp = NULL;
-
-  out:
-	return argv;
+	*argv = NULL;
 
-  fail:
-	argv_free(argv);
-	return NULL;
+	if (argcp)
+		*argcp = argc;
+	return argv_ret;
 }
 EXPORT_SYMBOL(argv_split);
-- 
1.5.5.1



^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 2/2] set_task_comm: kill the pointless memset() + wmb()
  2013-03-16 20:23                       ` [PATCH 0/2] finx argv_split() vs sysctl race Oleg Nesterov
  2013-03-16 20:23                         ` [PATCH 1/2] teach argv_split() to handle the mutable strings Oleg Nesterov
@ 2013-03-16 20:24                         ` Oleg Nesterov
  2013-03-16 20:32                         ` [PATCH 0/2] finx argv_split() vs sysctl race Andi Kleen
  2 siblings, 0 replies; 29+ messages in thread
From: Oleg Nesterov @ 2013-03-16 20:24 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Linus Torvalds, Andi Kleen, Lucas De Marchi,
	Benjamin Herrenschmidt, Linux Kernel Mailing List,
	Paul Mackerras, david, Kees Cook, Serge Hallyn,
	Rafael J. Wysocki, Feng Hong, Lucas De Marchi

set_task_comm() does memset() + wmb() before strlcpy(). This buys
nothing but adds the confusion, the comment is wrong.

- We do not need memset() to be "safe from non-terminating string
  reads", the final char is always zero and we never change it.

- wmb() is paired with nothing, it can't not prevent from printing
  the mixture of the old/new data unless the reader takes the lock.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---
 fs/exec.c |   10 ----------
 1 files changed, 0 insertions(+), 10 deletions(-)

diff --git a/fs/exec.c b/fs/exec.c
index bea2f7d..b270844 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1030,17 +1030,7 @@ EXPORT_SYMBOL_GPL(get_task_comm);
 void set_task_comm(struct task_struct *tsk, char *buf)
 {
 	task_lock(tsk);
-
 	trace_task_rename(tsk, buf);
-
-	/*
-	 * Threads may access current->comm without holding
-	 * the task lock, so write the string carefully.
-	 * Readers without a lock may see incomplete new
-	 * names but are safe from non-terminating string reads.
-	 */
-	memset(tsk->comm, 0, TASK_COMM_LEN);
-	wmb();
 	strlcpy(tsk->comm, buf, sizeof(tsk->comm));
 	task_unlock(tsk);
 	perf_event_comm(tsk);
-- 
1.5.5.1



^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: [PATCH 0/2] finx argv_split() vs sysctl race
  2013-03-16 20:23                       ` [PATCH 0/2] finx argv_split() vs sysctl race Oleg Nesterov
  2013-03-16 20:23                         ` [PATCH 1/2] teach argv_split() to handle the mutable strings Oleg Nesterov
  2013-03-16 20:24                         ` [PATCH 2/2] set_task_comm: kill the pointless memset() + wmb() Oleg Nesterov
@ 2013-03-16 20:32                         ` Andi Kleen
  2013-03-16 20:45                           ` Oleg Nesterov
  2 siblings, 1 reply; 29+ messages in thread
From: Andi Kleen @ 2013-03-16 20:32 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Andrew Morton, Linus Torvalds, Andi Kleen, Lucas De Marchi,
	Benjamin Herrenschmidt, Linux Kernel Mailing List,
	Paul Mackerras, david, Kees Cook, Serge Hallyn,
	Rafael J. Wysocki, Feng Hong, Lucas De Marchi

On Sat, Mar 16, 2013 at 09:23:27PM +0100, Oleg Nesterov wrote:
> On 03/15, Oleg Nesterov wrote:
> >
> > To remind, say, argv_split(poweroff_cmd) can race with sysctl changing this
> > string, in this case it can write to the memory after argv[] array. We can
> > fix this, or we can rewrite argv_split/free:
> 
> OK, please see 1/2.
> 
> And this reminds me about set_task_comm() which pretends it does something
> meaningful for the reader of the mutable ->comm, see the offtopic 2/2.

I had "rcu strings" to handle the sysctl string race problem in a
generic way some time ago.

http://lwn.net/Articles/368684/

Unfortunately never made it in. Perhaps it should be revisited.
In fact I believe the old patchkit fixed the reboot command race.

-andi

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 0/2] finx argv_split() vs sysctl race
  2013-03-16 20:32                         ` [PATCH 0/2] finx argv_split() vs sysctl race Andi Kleen
@ 2013-03-16 20:45                           ` Oleg Nesterov
  2013-03-16 20:56                             ` Andi Kleen
  0 siblings, 1 reply; 29+ messages in thread
From: Oleg Nesterov @ 2013-03-16 20:45 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Andrew Morton, Linus Torvalds, Lucas De Marchi,
	Benjamin Herrenschmidt, Linux Kernel Mailing List,
	Paul Mackerras, david, Kees Cook, Serge Hallyn,
	Rafael J. Wysocki, Feng Hong, Lucas De Marchi

On 03/16, Andi Kleen wrote:
>
> On Sat, Mar 16, 2013 at 09:23:27PM +0100, Oleg Nesterov wrote:
> > On 03/15, Oleg Nesterov wrote:
> > >
> > > To remind, say, argv_split(poweroff_cmd) can race with sysctl changing this
> > > string, in this case it can write to the memory after argv[] array. We can
> > > fix this, or we can rewrite argv_split/free:
> >
> > OK, please see 1/2.
> >
> > And this reminds me about set_task_comm() which pretends it does something
> > meaningful for the reader of the mutable ->comm, see the offtopic 2/2.
>
> I had "rcu strings" to handle the sysctl string race problem in a
> generic way some time ago.
>
> http://lwn.net/Articles/368684/
>
> Unfortunately never made it in. Perhaps it should be revisited.

Perhaps rcu can be better, although a global rwsem looks simpler,
I dunno.

But argv_split() or its usage should be changed anyway, and GFP_KERNEL
won't work under rcu_read_lock().

To me 1/2 looks as a simplification anyway, but I won't argue if we
decide to add rcu/locking and avoid this patch.

Oleg.


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 0/2] finx argv_split() vs sysctl race
  2013-03-16 20:45                           ` Oleg Nesterov
@ 2013-03-16 20:56                             ` Andi Kleen
  2013-03-16 21:23                               ` Oleg Nesterov
  0 siblings, 1 reply; 29+ messages in thread
From: Andi Kleen @ 2013-03-16 20:56 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Andi Kleen, Andrew Morton, Linus Torvalds, Lucas De Marchi,
	Benjamin Herrenschmidt, Linux Kernel Mailing List,
	Paul Mackerras, david, Kees Cook, Serge Hallyn,
	Rafael J. Wysocki, Feng Hong, Lucas De Marchi

> Perhaps rcu can be better, although a global rwsem looks simpler,
> I dunno.

It's a general problem with lots of sysctls.
> 
> But argv_split() or its usage should be changed anyway, and GFP_KERNEL
> won't work under rcu_read_lock().

rcu strings has a helper function to copy the string for sleepy cases.

> 
> To me 1/2 looks as a simplification anyway, but I won't argue if we
> decide to add rcu/locking and avoid this patch.

Ok I'll revisit. I think the problem last time was that the rcu strings
needed a kernel_text_address() function that worked on all
architectures.

Andi

-- 
ak@linux.intel.com -- Speaking for myself only.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 0/2] finx argv_split() vs sysctl race
  2013-03-16 20:56                             ` Andi Kleen
@ 2013-03-16 21:23                               ` Oleg Nesterov
  2013-03-16 21:54                                 ` Andi Kleen
  0 siblings, 1 reply; 29+ messages in thread
From: Oleg Nesterov @ 2013-03-16 21:23 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Andrew Morton, Linus Torvalds, Lucas De Marchi,
	Benjamin Herrenschmidt, Linux Kernel Mailing List,
	Paul Mackerras, david, Kees Cook, Serge Hallyn,
	Rafael J. Wysocki, Feng Hong, Lucas De Marchi

On 03/16, Andi Kleen wrote:
>
> > Perhaps rcu can be better, although a global rwsem looks simpler,
> > I dunno.
>
> It's a general problem with lots of sysctls.
> >
> > But argv_split() or its usage should be changed anyway, and GFP_KERNEL
> > won't work under rcu_read_lock().
>
> rcu strings has a helper function to copy the string for sleepy cases.

Then you need to pre-allocate, take rcu_read_lock(), copy, and check
that it actually fits the pre-allocated buffer. Not sure why the simple
rwsem is worse.

But I won't argue in any case

> > To me 1/2 looks as a simplification anyway, but I won't argue if we
> > decide to add rcu/locking and avoid this patch.
>
> Ok I'll revisit.

OK, but do you agree with 1/2?

Once again, this is subjective of course but imho this patch could be
considered as a cleanup/microoptimization, and it won't affect the rcu
changes.

Oleg.


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 0/2] finx argv_split() vs sysctl race
  2013-03-16 21:23                               ` Oleg Nesterov
@ 2013-03-16 21:54                                 ` Andi Kleen
  2013-03-17 14:15                                   ` Oleg Nesterov
  0 siblings, 1 reply; 29+ messages in thread
From: Andi Kleen @ 2013-03-16 21:54 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Andi Kleen, Andrew Morton, Linus Torvalds, Lucas De Marchi,
	Benjamin Herrenschmidt, Linux Kernel Mailing List,
	Paul Mackerras, david, Kees Cook, Serge Hallyn,
	Rafael J. Wysocki, Feng Hong, Lucas De Marchi

On Sat, Mar 16, 2013 at 10:23:51PM +0100, Oleg Nesterov wrote:
> On 03/16, Andi Kleen wrote:
> >
> > > Perhaps rcu can be better, although a global rwsem looks simpler,
> > > I dunno.
> >
> > It's a general problem with lots of sysctls.
> > >
> > > But argv_split() or its usage should be changed anyway, and GFP_KERNEL
> > > won't work under rcu_read_lock().
> >
> > rcu strings has a helper function to copy the string for sleepy cases.
> 
> Then you need to pre-allocate, take rcu_read_lock(), copy, and check
> that it actually fits the pre-allocated buffer. Not sure why the simple
> rwsem is worse.

The reason I did it originally like that was that some of the sysctls weren't
as "slow path" as power off. And for anything that is even moderately
often used a global lock is going to hurt eventually. The "read" in the
sem also doesn't help because it's still a hot cache line. 
I agree if it the goal was only to fix poweroff RCU is somewhat
overkill and a global lock would be fine.

> But I won't argue in any case
> 
> > > To me 1/2 looks as a simplification anyway, but I won't argue if we
> > > decide to add rcu/locking and avoid this patch.
> >
> > Ok I'll revisit.
> 
> OK, but do you agree with 1/2?

It doesn't solve the race alone because when the 0 byte can move it's
not safe to run kstrndup() in parallel. Ok given the n and that it 
force terminates it could only lead to some junk at the end.

But it seems like a useful small optimization, although I don't know
if it's used in any non slow paths.

I assume you audited all callers that they comprehend that they need
to free differently now.

-Andi

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 0/2] finx argv_split() vs sysctl race
  2013-03-16 21:54                                 ` Andi Kleen
@ 2013-03-17 14:15                                   ` Oleg Nesterov
  2013-03-18 16:03                                     ` Oleg Nesterov
  0 siblings, 1 reply; 29+ messages in thread
From: Oleg Nesterov @ 2013-03-17 14:15 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Andrew Morton, Linus Torvalds, Lucas De Marchi,
	Benjamin Herrenschmidt, Linux Kernel Mailing List,
	Paul Mackerras, david, Kees Cook, Serge Hallyn,
	Rafael J. Wysocki, Feng Hong, Lucas De Marchi

On 03/16, Andi Kleen wrote:
>
> On Sat, Mar 16, 2013 at 10:23:51PM +0100, Oleg Nesterov wrote:
> > >
> > > rcu strings has a helper function to copy the string for sleepy cases.
> >
> > Then you need to pre-allocate, take rcu_read_lock(), copy, and check
> > that it actually fits the pre-allocated buffer. Not sure why the simple
> > rwsem is worse.
>
> The reason I did it originally like that was that some of the sysctls weren't
> as "slow path" as power off.

OK, in this case rwsem is not fine, I agree.

> > > > To me 1/2 looks as a simplification anyway, but I won't argue if we
> > > > decide to add rcu/locking and avoid this patch.
> > >
> > > Ok I'll revisit.
> >
> > OK, but do you agree with 1/2?
>
> It doesn't solve the race alone because when the 0 byte can move it's
> not safe to run kstrndup() in parallel.

I don't think this is possible... To simplify, lets consider
poweroff_cmd[POWEROFF_CMD_PATH_LEN]. proc_dostring() or whatever can
never overwrite the last byte == 0, otherwise it will exceed this array
later to write the final zero.

But this doesn't matter,

> Ok given the n and that it
> force terminates it could only lead to some junk at the end.

Yes, kstrndup(max) is always safe even if the memory is not null
terminated.

> But it seems like a useful small optimization, although I don't know
> if it's used in any non slow paths.

Ah, I didn't mean the patch makes sense because of optimization. My
point was, we can fix the race without making this code worse (in fact
it tries to make the code better but this is subjective).

"fix the race" only means "make it safe" of course, a string in between
is unlikely useful.

> I assume you audited all callers that they comprehend that they need
> to free differently now.

Yes, /bin/grep shows that all callers use argv_free().

Oleg.


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 0/2] finx argv_split() vs sysctl race
  2013-03-17 14:15                                   ` Oleg Nesterov
@ 2013-03-18 16:03                                     ` Oleg Nesterov
  0 siblings, 0 replies; 29+ messages in thread
From: Oleg Nesterov @ 2013-03-18 16:03 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Andrew Morton, Linus Torvalds, Lucas De Marchi,
	Benjamin Herrenschmidt, Linux Kernel Mailing List,
	Paul Mackerras, david, Kees Cook, Serge Hallyn,
	Rafael J. Wysocki, Feng Hong, Lucas De Marchi

On 03/17, Oleg Nesterov wrote:
>
> Ah, I didn't mean the patch makes sense because of optimization. My
> point was, we can fix the race without making this code worse (in fact
> it tries to make the code better but this is subjective).

OK, I am stupid.  argv_free() passes the wrong poiinter to kfree().
I am sendind v2 in reply to 1/2, sorry for noise.

Oleg.


^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH v2 1/2] teach argv_split() to handle the mutable strings
  2013-03-16 20:23                         ` [PATCH 1/2] teach argv_split() to handle the mutable strings Oleg Nesterov
@ 2013-03-18 16:03                           ` Oleg Nesterov
  2013-03-18 21:53                           ` [PATCH " Andrew Morton
  1 sibling, 0 replies; 29+ messages in thread
From: Oleg Nesterov @ 2013-03-18 16:03 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Linus Torvalds, Andi Kleen, Lucas De Marchi,
	Benjamin Herrenschmidt, Linux Kernel Mailing List,
	Paul Mackerras, david, Kees Cook, Serge Hallyn,
	Rafael J. Wysocki, Feng Hong, Lucas De Marchi

argv_split() allocates argv[count_argc(str)] array and assumes that
it will find the same number of arguments later. This is obviously
wrong if this string can be changed, say, by sysctl.

With this patch argv_split() kstrndup's the whole string and does
not split it, we simply replace the spaces with zeroes and keep the
allocated memory in argv[-1] for argv_free(arg).

We do not use argv[0] because:

	- str can be all-spaces or empty. In fact this case is fine,
	  we could kfree() it before return, but:

	- str can have a space at the start, and we can not rely on
	  kstrndup(skip_spaces(str)) because it can equally race if
	  this string is mutable.

Also, simplify count_argc() and kill the no longer used skip_arg().

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---
 lib/argv_split.c |   83 +++++++++++++++++++++++------------------------------
 1 files changed, 36 insertions(+), 47 deletions(-)

diff --git a/lib/argv_split.c b/lib/argv_split.c
index 1e9a6cb..fa7d30a 100644
--- a/lib/argv_split.c
+++ b/lib/argv_split.c
@@ -8,23 +8,17 @@
 #include <linux/slab.h>
 #include <linux/export.h>
 
-static const char *skip_arg(const char *cp)
-{
-	while (*cp && !isspace(*cp))
-		cp++;
-
-	return cp;
-}
-
 static int count_argc(const char *str)
 {
 	int count = 0;
+	bool was_space;
 
-	while (*str) {
-		str = skip_spaces(str);
-		if (*str) {
+	for (was_space = true; *str; str++) {
+		if (isspace(*str)) {
+			was_space = true;
+		} else if (was_space) {
+			was_space = false;
 			count++;
-			str = skip_arg(str);
 		}
 	}
 
@@ -39,10 +33,8 @@ static int count_argc(const char *str)
  */
 void argv_free(char **argv)
 {
-	char **p;
-	for (p = argv; *p; p++)
-		kfree(*p);
-
+	argv--;
+	kfree(argv[0]);
 	kfree(argv);
 }
 EXPORT_SYMBOL(argv_free);
@@ -62,40 +54,37 @@ EXPORT_SYMBOL(argv_free);
  */
 char **argv_split(gfp_t gfp, const char *str, int *argcp)
 {
-	int argc = count_argc(str);
-	char **argv = kzalloc(sizeof(*argv) * (argc+1), gfp);
-	char **argvp;
-
-	if (argv == NULL)
-		goto out;
-
-	if (argcp)
-		*argcp = argc;
-
-	argvp = argv;
-
-	while (*str) {
-		str = skip_spaces(str);
-
-		if (*str) {
-			const char *p = str;
-			char *t;
-
-			str = skip_arg(str);
+	char *argv_str;
+	bool was_space;
+	char **argv, **argv_ret;
+	int argc;
+
+	argv_str = kstrndup(str, KMALLOC_MAX_SIZE, gfp);
+	if (!argv_str)
+		return NULL;
+
+	argc = count_argc(argv_str);
+	argv = kmalloc(sizeof(*argv) * (argc + 2), gfp);
+	if (!argv) {
+		kfree(argv_str);
+		return NULL;
+	}
 
-			t = kstrndup(p, str-p, gfp);
-			if (t == NULL)
-				goto fail;
-			*argvp++ = t;
+	*argv = argv_str;
+	argv_ret = ++argv;
+	for (was_space = true; *argv_str; argv_str++) {
+		if (isspace(*argv_str)) {
+			was_space = true;
+			*argv_str = 0;
+		} else if (was_space) {
+			was_space = false;
+			*argv++ = argv_str;
 		}
 	}
-	*argvp = NULL;
-
-  out:
-	return argv;
+	*argv = NULL;
 
-  fail:
-	argv_free(argv);
-	return NULL;
+	if (argcp)
+		*argcp = argc;
+	return argv_ret;
 }
 EXPORT_SYMBOL(argv_split);
-- 
1.5.5.1



^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: [PATCH 1/2] teach argv_split() to handle the mutable strings
  2013-03-16 20:23                         ` [PATCH 1/2] teach argv_split() to handle the mutable strings Oleg Nesterov
  2013-03-18 16:03                           ` [PATCH v2 " Oleg Nesterov
@ 2013-03-18 21:53                           ` Andrew Morton
  2013-03-19 19:54                             ` [PATCH -mm] argv_split-teach-it-to-handle-mutable-strings-fix-2 Oleg Nesterov
  1 sibling, 1 reply; 29+ messages in thread
From: Andrew Morton @ 2013-03-18 21:53 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Linus Torvalds, Andi Kleen, Lucas De Marchi,
	Benjamin Herrenschmidt, Linux Kernel Mailing List,
	Paul Mackerras, david, Kees Cook, Serge Hallyn,
	Rafael J. Wysocki, Feng Hong, Lucas De Marchi

On Sat, 16 Mar 2013 21:23:53 +0100 Oleg Nesterov <oleg@redhat.com> wrote:

> argv_split() allocates argv[count_argc(str)] array and assumes that
> it will find the same number of arguments later. This is obviously
> wrong if this string can be changed, say, by sysctl.
> 
> With this patch argv_split() kstrndup's the whole string and does
> not split it, we simply replace the spaces with zeroes and keep the
> allocated memory in argv[-1] for argv_free(arg).
> 
> We do not use argv[0] because:
> 
> 	- str can be all-spaces or empty. In fact this case is fine,
> 	  we could kfree() it before return, but:
> 
> 	- str can have a space at the start, and we can not rely on
> 	  kstrndup(skip_arg(str)) because it can equally race if this
> 	  string is mutable.
> 
> Also, simplify count_argc() and kill the no longer used skip_arg().
> 
> ...
>
>  char **argv_split(gfp_t gfp, const char *str, int *argcp)
>  {
> -	int argc = count_argc(str);
> -	char **argv = kzalloc(sizeof(*argv) * (argc+1), gfp);
> -	char **argvp;
> -
> -	if (argv == NULL)
> -		goto out;
> -
> -	if (argcp)
> -		*argcp = argc;
> -
> -	argvp = argv;
> -
> -	while (*str) {
> -		str = skip_spaces(str);
> -
> -		if (*str) {
> -			const char *p = str;
> -			char *t;
> -
> -			str = skip_arg(str);
> +	char *argv_str;
> +	bool was_space;
> +	char **argv, **argv_ret;
> +	int argc;
> +
> +	argv_str = kstrndup(str, KMALLOC_MAX_SIZE, gfp);

hm, I think that works.

kstrndup() does kmalloc_track_caller(len+1, gfp) so your
KMALLOC_MAX_SIZE is off-by-one?


>From reading the code it is rather unobvious why things were
implemented in this fashion.  People may come along in five years and
"clean it up".  Hence we should explain, no?

--- a/lib/argv_split.c~argv_split-teach-it-to-handle-mutable-strings-fix
+++ a/lib/argv_split.c
@@ -51,6 +51,10 @@ EXPORT_SYMBOL(argv_free);
  * considered to be a single argument separator.  The returned array
  * is always NULL-terminated.  Returns NULL on memory allocation
  * failure.
+ *
+ * The source string at `str' may be undergoing concurrent alteration via
+ * userspace sysctl activity (at least).  The argv_split() implementation
+ * attempts to handle this gracefully by taking a local copy to work on.
  */
 char **argv_split(gfp_t gfp, const char *str, int *argcp)
 {
_


^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH -mm] argv_split-teach-it-to-handle-mutable-strings-fix-2
  2013-03-18 21:53                           ` [PATCH " Andrew Morton
@ 2013-03-19 19:54                             ` Oleg Nesterov
  0 siblings, 0 replies; 29+ messages in thread
From: Oleg Nesterov @ 2013-03-19 19:54 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Linus Torvalds, Andi Kleen, Lucas De Marchi,
	Benjamin Herrenschmidt, Linux Kernel Mailing List,
	Paul Mackerras, david, Kees Cook, Serge Hallyn,
	Rafael J. Wysocki, Feng Hong, Lucas De Marchi

On 03/18, Andrew Morton wrote:
>
> On Sat, 16 Mar 2013 21:23:53 +0100 Oleg Nesterov <oleg@redhat.com> wrote:
>
> > +	argv_str = kstrndup(str, KMALLOC_MAX_SIZE, gfp);
>
> kstrndup() does kmalloc_track_caller(len+1, gfp) so your
> KMALLOC_MAX_SIZE is off-by-one?

Yes... 'max' is strlen(), not sizeof()...

Actually we could even use ULONG_MAX, the last zero byte in "str" should
be never overwritten. Or we could use some "reasonable" and lower limit.

But I agree, kstrndup(KMALLOC_MAX_SIZE) doesn't look good, please find
fix-2 below.

> From reading the code it is rather unobvious why things were
> implemented in this fashion.  People may come along in five years and
> "clean it up".  Hence we should explain, no?

Yes, thanks for this comment!
---
 lib/argv_split.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/lib/argv_split.c b/lib/argv_split.c
index cac7ec4..e927ed0 100644
--- a/lib/argv_split.c
+++ b/lib/argv_split.c
@@ -63,7 +63,7 @@ char **argv_split(gfp_t gfp, const char *str, int *argcp)
 	char **argv, **argv_ret;
 	int argc;
 
-	argv_str = kstrndup(str, KMALLOC_MAX_SIZE, gfp);
+	argv_str = kstrndup(str, KMALLOC_MAX_SIZE - 1, gfp);
 	if (!argv_str)
 		return NULL;
 
-- 
1.5.5.1



^ permalink raw reply related	[flat|nested] 29+ messages in thread

end of thread, other threads:[~2013-03-19 19:56 UTC | newest]

Thread overview: 29+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-03-12  3:25 Regression with orderly_poweroff() Benjamin Herrenschmidt
2013-03-12 14:46 ` Linus Torvalds
2013-03-12 17:46   ` Oleg Nesterov
2013-03-12 17:54   ` Lucas De Marchi
2013-03-12 18:22     ` Oleg Nesterov
2013-03-12 18:42       ` Linus Torvalds
2013-03-12 19:11         ` Oleg Nesterov
2013-03-12 19:20           ` Linus Torvalds
2013-03-12 20:35             ` Oleg Nesterov
2013-03-13 17:46               ` [PATCH 0/1] poweroff: change orderly_poweroff() to use schedule_work() Oleg Nesterov
2013-03-13 17:47                 ` [PATCH 1/1] " Oleg Nesterov
2013-03-14 22:28                   ` Andrew Morton
2013-03-15 16:39                     ` Oleg Nesterov
2013-03-16 20:23                       ` [PATCH 0/2] finx argv_split() vs sysctl race Oleg Nesterov
2013-03-16 20:23                         ` [PATCH 1/2] teach argv_split() to handle the mutable strings Oleg Nesterov
2013-03-18 16:03                           ` [PATCH v2 " Oleg Nesterov
2013-03-18 21:53                           ` [PATCH " Andrew Morton
2013-03-19 19:54                             ` [PATCH -mm] argv_split-teach-it-to-handle-mutable-strings-fix-2 Oleg Nesterov
2013-03-16 20:24                         ` [PATCH 2/2] set_task_comm: kill the pointless memset() + wmb() Oleg Nesterov
2013-03-16 20:32                         ` [PATCH 0/2] finx argv_split() vs sysctl race Andi Kleen
2013-03-16 20:45                           ` Oleg Nesterov
2013-03-16 20:56                             ` Andi Kleen
2013-03-16 21:23                               ` Oleg Nesterov
2013-03-16 21:54                                 ` Andi Kleen
2013-03-17 14:15                                   ` Oleg Nesterov
2013-03-18 16:03                                     ` Oleg Nesterov
2013-03-13 23:35                 ` [PATCH 0/1] poweroff: change orderly_poweroff() to use schedule_work() Lucas De Marchi
2013-03-12 20:13           ` Regression with orderly_poweroff() Andi Kleen
2013-03-12 19:28   ` Benjamin Herrenschmidt

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).