linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Odd ENOMEM being returned in 3.8-rcX
@ 2013-02-07 21:57 Josh Boyer
  2013-02-07 22:15 ` Andrew Morton
  0 siblings, 1 reply; 19+ messages in thread
From: Josh Boyer @ 2013-02-07 21:57 UTC (permalink / raw)
  To: Andrew Morton, Al Viro, Eric W. Biederman, Mel Gorman; +Cc: linux-kernel

Hi All,

We've hit a weird error in Fedora using the 3.8-rcX kernels.  It seems
the mock tool is getting back ENOMEM when doing very simple things that
normally just work.  The 3.7 kernels on the same userspace work just
fine.  It seems just running 'mock init -v' is enough to cause the
failure.

Because this is the rawhide kernel, we have some debug options enabled.
This happens to trigger this error:

[   89.143660] BUG: sleeping function called from invalid context at kernel/nsproxy.c:217
[   89.143729] in_atomic(): 0, irqs_disabled(): 1, pid: 1329, name: mock
[   89.143776] no locks held by mock/1329.
[   89.143778] irq event stamp: 324562
[   89.143781] hardirqs last  enabled at (324561): [<ffffffff81163a8d>] get_page_from_freelist+0x51d/0x990
[   89.143791] hardirqs last disabled at (324562): [<ffffffff816daa9d>] _raw_spin_lock_irq+0x1d/0x60
[   89.143798] softirqs last  enabled at (323936): [<ffffffff81070438>] __do_softirq+0x168/0x3d0
[   89.143804] softirqs last disabled at (323931): [<ffffffff816e587c>] call_softirq+0x1c/0x30
[   89.143811] Pid: 1329, comm: mock Not tainted 3.8.0-0.rc6.git1.1.fc19.x86_64 #1
[   89.143814] Call Trace:
[   89.143823]  [<ffffffff8109f8d9>] __might_sleep+0x179/0x230
[   89.143828]  [<ffffffff81097887>] switch_task_namespaces+0x27/0x60
[   89.143833]  [<ffffffff810978d0>] exit_task_namespaces+0x10/0x20
[   89.143839]  [<ffffffff81064692>] copy_process.part.22+0xe32/0x1640
[   89.143844]  [<ffffffff81064f95>] do_fork+0xa5/0x450
[   89.143849]  [<ffffffff816db718>] ? retint_swapgs+0x13/0x1b
[   89.143854]  [<ffffffff810653c6>] sys_clone+0x16/0x20
[   89.143859]  [<ffffffff816e48b9>] stub_clone+0x69/0x90
[   89.143864]  [<ffffffff816e44d9>] ? system_call_fastpath+0x16/0x1b

At first glance it seems copy_io is failing (possibly because
get_task_io_context fails), and then the above fallout is printed.  The
warning seems fairly valid, but I don't think that is the root of the
problem.

We've seen this as far back as Linux v3.8-rc2-116-g5f243b9 so far.  I
can still hit it with 3.8-rc6 as well.

I'm still trying to see if the ENOMEM hits without the debug options set,
and exactly which commit caused it.  I just wanted to see if anyone else
had seen odd python issues or other things failing with ENOMEM when they
shouldn't while I'm off debugging.

Thoughts/tips would be appreciated.

josh

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Odd ENOMEM being returned in 3.8-rcX
  2013-02-07 21:57 Odd ENOMEM being returned in 3.8-rcX Josh Boyer
@ 2013-02-07 22:15 ` Andrew Morton
  2013-02-08  0:35   ` Josh Boyer
  0 siblings, 1 reply; 19+ messages in thread
From: Andrew Morton @ 2013-02-07 22:15 UTC (permalink / raw)
  To: Josh Boyer; +Cc: Al Viro, Eric W. Biederman, Mel Gorman, linux-kernel

On Thu, 7 Feb 2013 16:57:42 -0500
Josh Boyer <jwboyer@redhat.com> wrote:

> Hi All,
> 
> We've hit a weird error in Fedora using the 3.8-rcX kernels.  It seems
> the mock tool is getting back ENOMEM when doing very simple things that
> normally just work.  The 3.7 kernels on the same userspace work just
> fine.  It seems just running 'mock init -v' is enough to cause the
> failure.

I assume you're not seeing the "page allocation failure" message and
backtrace.  This means that either

a) it's a __GFP_NOWARN callsite.  This is rare.  Or

b) it's actually a different error but someone went and overwrote a
   callee's return value with -ENOMEM.  We do this a lot and it sucks.

> Because this is the rawhide kernel, we have some debug options enabled.
> This happens to trigger this error:
> 
> [   89.143660] BUG: sleeping function called from invalid context at kernel/nsproxy.c:217
> [   89.143729] in_atomic(): 0, irqs_disabled(): 1, pid: 1329, name: mock
> [   89.143776] no locks held by mock/1329.
> [   89.143778] irq event stamp: 324562
> [   89.143781] hardirqs last  enabled at (324561): [<ffffffff81163a8d>] get_page_from_freelist+0x51d/0x990
> [   89.143791] hardirqs last disabled at (324562): [<ffffffff816daa9d>] _raw_spin_lock_irq+0x1d/0x60
> [   89.143798] softirqs last  enabled at (323936): [<ffffffff81070438>] __do_softirq+0x168/0x3d0
> [   89.143804] softirqs last disabled at (323931): [<ffffffff816e587c>] call_softirq+0x1c/0x30
> [   89.143811] Pid: 1329, comm: mock Not tainted 3.8.0-0.rc6.git1.1.fc19.x86_64 #1
> [   89.143814] Call Trace:
> [   89.143823]  [<ffffffff8109f8d9>] __might_sleep+0x179/0x230
> [   89.143828]  [<ffffffff81097887>] switch_task_namespaces+0x27/0x60
> [   89.143833]  [<ffffffff810978d0>] exit_task_namespaces+0x10/0x20
> [   89.143839]  [<ffffffff81064692>] copy_process.part.22+0xe32/0x1640
> [   89.143844]  [<ffffffff81064f95>] do_fork+0xa5/0x450
> [   89.143849]  [<ffffffff816db718>] ? retint_swapgs+0x13/0x1b
> [   89.143854]  [<ffffffff810653c6>] sys_clone+0x16/0x20
> [   89.143859]  [<ffffffff816e48b9>] stub_clone+0x69/0x90
> [   89.143864]  [<ffffffff816e44d9>] ? system_call_fastpath+0x16/0x1b
> 
> At first glance it seems copy_io is failing (possibly because
> get_task_io_context fails), and then the above fallout is printed.  The
> warning seems fairly valid, but I don't think that is the root of the
> problem.

yes, get_task_io_context() might be the place.  Tried adding a few
error-path printks in there to see what's happening?

I can't see anything around there which leaves interrupts disabled
though.  It's quite likely that there's some code with is forgetting to
reenable interrupts on a rarely-tested error path, and that ENOMEM is
tickling the bug.

> We've seen this as far back as Linux v3.8-rc2-116-g5f243b9 so far.  I
> can still hit it with 3.8-rc6 as well.
> 
> I'm still trying to see if the ENOMEM hits without the debug options set,
> and exactly which commit caused it.  I just wanted to see if anyone else
> had seen odd python issues or other things failing with ENOMEM when they
> shouldn't while I'm off debugging.
> 
> Thoughts/tips would be appreciated.



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Odd ENOMEM being returned in 3.8-rcX
  2013-02-07 22:15 ` Andrew Morton
@ 2013-02-08  0:35   ` Josh Boyer
  2013-02-08 18:19     ` Josh Boyer
  0 siblings, 1 reply; 19+ messages in thread
From: Josh Boyer @ 2013-02-08  0:35 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Al Viro, Eric W. Biederman, Mel Gorman, linux-kernel

On Thu, Feb 07, 2013 at 02:15:02PM -0800, Andrew Morton wrote:
> On Thu, 7 Feb 2013 16:57:42 -0500
> Josh Boyer <jwboyer@redhat.com> wrote:
> 
> > Hi All,
> > 
> > We've hit a weird error in Fedora using the 3.8-rcX kernels.  It seems
> > the mock tool is getting back ENOMEM when doing very simple things that
> > normally just work.  The 3.7 kernels on the same userspace work just
> > fine.  It seems just running 'mock init -v' is enough to cause the
> > failure.
> 
> I assume you're not seeing the "page allocation failure" message and
> backtrace.  This means that either

Right.  If I disable our debug options, I see no backtraces at all and
the python app still gets ENOMEM returned.  (See below for those
interested).

> a) it's a __GFP_NOWARN callsite.  This is rare.  Or
> 
> b) it's actually a different error but someone went and overwrote a
>    callee's return value with -ENOMEM.  We do this a lot and it sucks.

We do it in copy_io :\.

> > At first glance it seems copy_io is failing (possibly because
> > get_task_io_context fails), and then the above fallout is printed.  The
> > warning seems fairly valid, but I don't think that is the root of the
> > problem.
> 
> yes, get_task_io_context() might be the place.  Tried adding a few
> error-path printks in there to see what's happening?

Yeah, that's my next step.  I guess I know what I'll be doing tomorrow.

> I can't see anything around there which leaves interrupts disabled
> though.  It's quite likely that there's some code with is forgetting to
> reenable interrupts on a rarely-tested error path, and that ENOMEM is
> tickling the bug.

Right, agreed.  As I said, I think that is mostly a secondary issue.
Hopefully it will be easy to fix once we figure out why we're getting
the ENOMEM error.

Python backtrace below.  Seems to be failing on forking a umount command
after init'ing the chroot.  I can put the full output somewhere if
people are interested.

josh

DEBUG: Executing command: ['/bin/umount', '-n', '-l', '/var/lib/mock/fedora-16-x86_64/root/proc'] with env {'LANG': 'en_US.utf8', 'TERM': 'vt100', 'SHELL': '/bin/bash', 'HOSTNAME': 'mock', 'HOME': '/builddir', 'PATH': '/usr/bin:/bin:/usr/sbin:/sbin'}
Traceback (most recent call last):
  File "/usr/sbin/mock", line 539, in <module>
    def do_buildsrpm(config_opts, chroot, options, args):
  File "/usr/sbin/mock", line 782, in main
    chroot.init()
  File "<peak.util.decorators.rewrap wrapping mockbuild.backend.init at 0x00D59320>", line 3, in init
  File "/usr/lib/python2.7/site-packages/mockbuild/trace_decorator.py", line 70, in trace
    result = func(*args, **kw)
  File "/usr/lib/python2.7/site-packages/mockbuild/backend.py", line 279, in init
    self._init()
  File "<peak.util.decorators.rewrap wrapping mockbuild.backend._init at 0x00D596E0>", line 3, in _init
  File "/usr/lib/python2.7/site-packages/mockbuild/trace_decorator.py", line 70, in trace
    result = func(*args, **kw)
  File "/usr/lib/python2.7/site-packages/mockbuild/backend.py", line 425, in _init
    self._umountall()
  File "<peak.util.decorators.rewrap wrapping mockbuild.backend._umountall at 0x00D5C320>", line 3, in _umountall
  File "/usr/lib/python2.7/site-packages/mockbuild/trace_decorator.py", line 70, in trace
    result = func(*args, **kw)
  File "/usr/lib/python2.7/site-packages/mockbuild/backend.py", line 872, in _umountall
    self.mounts.umountall(nowarn=nowarn)
  File "<peak.util.decorators.rewrap wrapping mockbuild.mounts.umountall at 0x00D43398>", line 3, in umountall
  File "/usr/lib/python2.7/site-packages/mockbuild/trace_decorator.py", line 70, in trace
    result = func(*args, **kw)
  File "/usr/lib/python2.7/site-packages/mockbuild/mounts.py", line 133, in umountall
    m.umount()
  File "<peak.util.decorators.rewrap wrapping mockbuild.mounts.umount at 0x00D4EC08>", line 3, in umount
  File "/usr/lib/python2.7/site-packages/mockbuild/trace_decorator.py", line 70, in trace
    result = func(*args, **kw)
  File "/usr/lib/python2.7/site-packages/mockbuild/mounts.py", line 69, in umount
    mockbuild.util.do(cmd)
  File "<peak.util.decorators.rewrap wrapping mockbuild.util.do at 0x00D47140>", line 3, in do
  File "/usr/lib/python2.7/site-packages/mockbuild/trace_decorator.py", line 70, in trace
    result = func(*args, **kw)
  File "/usr/lib/python2.7/site-packages/mockbuild/util.py", line 323, in do
    preexec_fn = preexec,
  File "/usr/lib64/python2.7/subprocess.py", line 679, in __init__
    errread, errwrite)
  File "/usr/lib64/python2.7/subprocess.py", line 1143, in _execute_child
    self.pid = os.fork()
OSError: [Errno 12] Cannot allocate memory


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Odd ENOMEM being returned in 3.8-rcX
  2013-02-08  0:35   ` Josh Boyer
@ 2013-02-08 18:19     ` Josh Boyer
  2013-02-08 20:13       ` Eric W. Biederman
  2013-02-08 20:18       ` Josh Boyer
  0 siblings, 2 replies; 19+ messages in thread
From: Josh Boyer @ 2013-02-08 18:19 UTC (permalink / raw)
  To: Andrew Morton, Eric W. Biederman; +Cc: Al Viro, Mel Gorman, linux-kernel

On Thu, Feb 07, 2013 at 07:35:01PM -0500, Josh Boyer wrote:
> On Thu, Feb 07, 2013 at 02:15:02PM -0800, Andrew Morton wrote:
> > On Thu, 7 Feb 2013 16:57:42 -0500
> > Josh Boyer <jwboyer@redhat.com> wrote:
> > 
> > > Hi All,
> > > 
> > > We've hit a weird error in Fedora using the 3.8-rcX kernels.  It seems
> > > the mock tool is getting back ENOMEM when doing very simple things that
> > > normally just work.  The 3.7 kernels on the same userspace work just
> > > fine.  It seems just running 'mock init -v' is enough to cause the
> > > failure.
> > 
> > I assume you're not seeing the "page allocation failure" message and
> > backtrace.  This means that either
> 
> Right.  If I disable our debug options, I see no backtraces at all and
> the python app still gets ENOMEM returned.  (See below for those
> interested).
> 
> > a) it's a __GFP_NOWARN callsite.  This is rare.  Or
> > 
> > b) it's actually a different error but someone went and overwrote a
> >    callee's return value with -ENOMEM.  We do this a lot and it sucks.
> 
> We do it in copy_io :\.
> 
> > > At first glance it seems copy_io is failing (possibly because
> > > get_task_io_context fails), and then the above fallout is printed.  The
> > > warning seems fairly valid, but I don't think that is the root of the
> > > problem.
> > 
> > yes, get_task_io_context() might be the place.  Tried adding a few
> > error-path printks in there to see what's happening?
> 
> Yeah, that's my next step.  I guess I know what I'll be doing tomorrow.
> 
> > I can't see anything around there which leaves interrupts disabled
> > though.  It's quite likely that there's some code with is forgetting to
> > reenable interrupts on a rarely-tested error path, and that ENOMEM is
> > tickling the bug.
> 
> Right, agreed.  As I said, I think that is mostly a secondary issue.
> Hopefully it will be easy to fix once we figure out why we're getting
> the ENOMEM error.
> 
> Python backtrace below.  Seems to be failing on forking a umount command
> after init'ing the chroot.  I can put the full output somewhere if
> people are interested.

OK.  I've bisected this down to:

50804fe3737ca6a5942fdc2057a18a8141d00141 is the first bad commit
commit 50804fe3737ca6a5942fdc2057a18a8141d00141
Author: Eric W. Biederman <ebiederm@xmission.com>
Date:   Tue Mar 2 15:41:50 2010 -0800

    pidns: Support unsharing the pid namespace.
    

I haven't really gotten much farther than that yet, but the bisect was
pretty straight forward.  Eric, is there anything specific I can gather
or do to help figure out why that is causing mock to get such a weird
error?  I can provide the bisect log if you'd like.

josh

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Odd ENOMEM being returned in 3.8-rcX
  2013-02-08 18:19     ` Josh Boyer
@ 2013-02-08 20:13       ` Eric W. Biederman
  2013-02-08 20:23         ` Josh Boyer
                           ` (2 more replies)
  2013-02-08 20:18       ` Josh Boyer
  1 sibling, 3 replies; 19+ messages in thread
From: Eric W. Biederman @ 2013-02-08 20:13 UTC (permalink / raw)
  To: Josh Boyer; +Cc: Andrew Morton, Al Viro, Mel Gorman, linux-kernel

Josh Boyer <jwboyer@redhat.com> writes:

> On Thu, Feb 07, 2013 at 07:35:01PM -0500, Josh Boyer wrote:
>> On Thu, Feb 07, 2013 at 02:15:02PM -0800, Andrew Morton wrote:
>> > On Thu, 7 Feb 2013 16:57:42 -0500
>> > Josh Boyer <jwboyer@redhat.com> wrote:
>> > 
>> > > Hi All,
>> > > 
>> > > We've hit a weird error in Fedora using the 3.8-rcX kernels.  It seems
>> > > the mock tool is getting back ENOMEM when doing very simple things that
>> > > normally just work.  The 3.7 kernels on the same userspace work just
>> > > fine.  It seems just running 'mock init -v' is enough to cause the
>> > > failure.
>> > 
>> > I assume you're not seeing the "page allocation failure" message and
>> > backtrace.  This means that either
>> 
>> Right.  If I disable our debug options, I see no backtraces at all and
>> the python app still gets ENOMEM returned.  (See below for those
>> interested).
>> 
>> > a) it's a __GFP_NOWARN callsite.  This is rare.  Or
>> > 
>> > b) it's actually a different error but someone went and overwrote a
>> >    callee's return value with -ENOMEM.  We do this a lot and it sucks.
>> 
>> We do it in copy_io :\.
>> 
>> > > At first glance it seems copy_io is failing (possibly because
>> > > get_task_io_context fails), and then the above fallout is printed.  The
>> > > warning seems fairly valid, but I don't think that is the root of the
>> > > problem.
>> > 
>> > yes, get_task_io_context() might be the place.  Tried adding a few
>> > error-path printks in there to see what's happening?
>> 
>> Yeah, that's my next step.  I guess I know what I'll be doing tomorrow.
>> 
>> > I can't see anything around there which leaves interrupts disabled
>> > though.  It's quite likely that there's some code with is forgetting to
>> > reenable interrupts on a rarely-tested error path, and that ENOMEM is
>> > tickling the bug.
>> 
>> Right, agreed.  As I said, I think that is mostly a secondary issue.
>> Hopefully it will be easy to fix once we figure out why we're getting
>> the ENOMEM error.
>> 
>> Python backtrace below.  Seems to be failing on forking a umount command
>> after init'ing the chroot.  I can put the full output somewhere if
>> people are interested.
>
> OK.  I've bisected this down to:
>
> 50804fe3737ca6a5942fdc2057a18a8141d00141 is the first bad commit
> commit 50804fe3737ca6a5942fdc2057a18a8141d00141
> Author: Eric W. Biederman <ebiederm@xmission.com>
> Date:   Tue Mar 2 15:41:50 2010 -0800
>
>     pidns: Support unsharing the pid namespace.
>     
>
> I haven't really gotten much farther than that yet, but the bisect was
> pretty straight forward.  Eric, is there anything specific I can gather
> or do to help figure out why that is causing mock to get such a weird
> error?  I can provide the bisect log if you'd like.

My best guess in some dark corner of mock has untested code to unshare a
pid namespace, and that corner started doing something now that
unsharing of the pid namespace actually works.

If mock has called unshare(CLONE_NEWPID). And then forked a process and
that process exited, and then forked anothe process that second and all
subsequent fork calls will fail with -ENOMEM (because init has exited in
the pid namespace).  -ENOMEM will be generated because of a failure of
alloc_pid.

Looking at that code path a little closer that just about has to be it,
because I goofed and the error path drops the lock but not irqs.  The
patch below should fix the nasty warning and confirm where the code is
failing in copy_process.

An strace to see which syscalls mock is making and with which flags
would be very interesting.  I am almost certain that there is a
unshare(CLONE_NEWPID) somewhere in there.  But in a remote corner of
possibility it could weird clone flags, or something else.

Beyond that I suspect we want to work with the mock folks so they get
their code to use a pid namespace working the way they intended.

Eric

From: "Eric W. Biederman" <ebiederm@xmission.com>
Date: Fri, 8 Feb 2013 12:05:54 -0800
Subject: [PATCH] pid: unlock_irq when alloc_pid fails because init has
 exited.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 kernel/pid.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/kernel/pid.c b/kernel/pid.c
index de9af60..f2c6a68 100644
--- a/kernel/pid.c
+++ b/kernel/pid.c
@@ -331,7 +331,7 @@ out:
 	return pid;
 
 out_unlock:
-	spin_unlock(&pidmap_lock);
+	spin_unlock_irq(&pidmap_lock);
 out_free:
 	while (++i <= ns->level)
 		free_pidmap(pid->numbers + i);
-- 
1.7.5.4

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: Odd ENOMEM being returned in 3.8-rcX
  2013-02-08 18:19     ` Josh Boyer
  2013-02-08 20:13       ` Eric W. Biederman
@ 2013-02-08 20:18       ` Josh Boyer
  2013-02-08 20:36         ` Eric W. Biederman
  1 sibling, 1 reply; 19+ messages in thread
From: Josh Boyer @ 2013-02-08 20:18 UTC (permalink / raw)
  To: Andrew Morton, Eric W. Biederman; +Cc: Al Viro, Mel Gorman, linux-kernel

On Fri, Feb 08, 2013 at 01:19:49PM -0500, Josh Boyer wrote:
> On Thu, Feb 07, 2013 at 07:35:01PM -0500, Josh Boyer wrote:
> > On Thu, Feb 07, 2013 at 02:15:02PM -0800, Andrew Morton wrote:
> > > On Thu, 7 Feb 2013 16:57:42 -0500
> > > Josh Boyer <jwboyer@redhat.com> wrote:
> > > 
> > > > Hi All,
> > > > 
> > > > We've hit a weird error in Fedora using the 3.8-rcX kernels.  It seems
> > > > the mock tool is getting back ENOMEM when doing very simple things that
> > > > normally just work.  The 3.7 kernels on the same userspace work just
> > > > fine.  It seems just running 'mock init -v' is enough to cause the
> > > > failure.
> > > 
> > > I assume you're not seeing the "page allocation failure" message and
> > > backtrace.  This means that either
> > 
> > Right.  If I disable our debug options, I see no backtraces at all and
> > the python app still gets ENOMEM returned.  (See below for those
> > interested).
> > 
> > > a) it's a __GFP_NOWARN callsite.  This is rare.  Or
> > > 
> > > b) it's actually a different error but someone went and overwrote a
> > >    callee's return value with -ENOMEM.  We do this a lot and it sucks.
> > 
> > We do it in copy_io :\.
> > 
> > > > At first glance it seems copy_io is failing (possibly because
> > > > get_task_io_context fails), and then the above fallout is printed.  The
> > > > warning seems fairly valid, but I don't think that is the root of the
> > > > problem.
> > > 
> > > yes, get_task_io_context() might be the place.  Tried adding a few
> > > error-path printks in there to see what's happening?
> > 
> > Yeah, that's my next step.  I guess I know what I'll be doing tomorrow.
> > 
> > > I can't see anything around there which leaves interrupts disabled
> > > though.  It's quite likely that there's some code with is forgetting to
> > > reenable interrupts on a rarely-tested error path, and that ENOMEM is
> > > tickling the bug.
> > 
> > Right, agreed.  As I said, I think that is mostly a secondary issue.
> > Hopefully it will be easy to fix once we figure out why we're getting
> > the ENOMEM error.
> > 
> > Python backtrace below.  Seems to be failing on forking a umount command
> > after init'ing the chroot.  I can put the full output somewhere if
> > people are interested.
> 
> OK.  I've bisected this down to:
> 
> 50804fe3737ca6a5942fdc2057a18a8141d00141 is the first bad commit
> commit 50804fe3737ca6a5942fdc2057a18a8141d00141
> Author: Eric W. Biederman <ebiederm@xmission.com>
> Date:   Tue Mar 2 15:41:50 2010 -0800
> 
>     pidns: Support unsharing the pid namespace.
>     
> 
> I haven't really gotten much farther than that yet, but the bisect was
> pretty straight forward.  Eric, is there anything specific I can gather
> or do to help figure out why that is causing mock to get such a weird
> error?  I can provide the bisect log if you'd like.

I took a look at what mock was doing and it was mostly very simple
stuff.  The two exceptions were that it was calling unshare, then doing
some file checks and I/O, and then calling fork to exec off some helper
things.  Up until the point it fails, the forks work and the children go
do whatever it is they were supposed to do.  I've CC'd Clark Williams
just in case people have questions on mock itself, but I'm not sure that
will be needed.

I have a very simple testcase (warning, ugly code) that hits this issue
now.  Build with "gcc -D_GNU_SOURCE -g unshare.c -o unshare" and then
just run it as "sudo ./unshare 10".  The sudo is needed for the unshare
call to work.

I get the following output on kernels that have the above commit:

[jwboyer@vader ~]$ sudo ./unshare 10
Calling unshare 738328576
Forked 6684
Forked 6685
Fork failed: Cannot allocate memory
Fork failed: Cannot allocate memory
Fork failed: Cannot allocate memory
Fork failed: Cannot allocate memory
Fork failed: Cannot allocate memory
Fork failed: Cannot allocate memory
Fork failed: Cannot allocate memory
Fork failed: Cannot allocate memory
[jwboyer@vader ~]$

which is consistent with what mock is seeing.  If I comment out the call
to unshare, it seems to always work.  It seems to consistently fail with
ENOMEM after the first 3-5 forked children, but it varies within that
range.

On a kernel without the above commit, this works every time.  I've tried
variations of 10, 100, and 10,000 iterations successfully with and
without the unshare call.

Hopefully this helps.  Testcase below.

josh

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <sched.h>

int main(int argc, char **argv)
{
	int i, flags = CLONE_NEWNS | CLONE_NEWUTS | CLONE_NEWPID | CLONE_NEWIPC;
	int fd;
	pid_t pid;
	
	fd = open("./wtf-unshare", O_CREAT|O_RDWR, 0666);
	write(fd, "blah", 4);

	printf("Calling unshare %d\n", flags);
	unshare(flags);

	for (i = 0; i < atoi(argv[1]); i++) {
		pid = fork();

		if (pid == 0)
			exit(0);
		else if (pid == -1)
			perror("Fork failed");
		else
			printf("Forked %d\n", pid);
	}
	
	return 0;
}

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Odd ENOMEM being returned in 3.8-rcX
  2013-02-08 20:13       ` Eric W. Biederman
@ 2013-02-08 20:23         ` Josh Boyer
  2013-02-08 20:45           ` Eric W. Biederman
  2013-02-08 22:12         ` Josh Boyer
  2013-02-11 23:57         ` Andrew Morton
  2 siblings, 1 reply; 19+ messages in thread
From: Josh Boyer @ 2013-02-08 20:23 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Andrew Morton, Al Viro, Mel Gorman, linux-kernel, williams

On Fri, Feb 08, 2013 at 12:13:09PM -0800, Eric W. Biederman wrote:
> Josh Boyer <jwboyer@redhat.com> writes:
> >> Right, agreed.  As I said, I think that is mostly a secondary issue.
> >> Hopefully it will be easy to fix once we figure out why we're getting
> >> the ENOMEM error.
> >> 
> >> Python backtrace below.  Seems to be failing on forking a umount command
> >> after init'ing the chroot.  I can put the full output somewhere if
> >> people are interested.
> >
> > OK.  I've bisected this down to:
> >
> > 50804fe3737ca6a5942fdc2057a18a8141d00141 is the first bad commit
> > commit 50804fe3737ca6a5942fdc2057a18a8141d00141
> > Author: Eric W. Biederman <ebiederm@xmission.com>
> > Date:   Tue Mar 2 15:41:50 2010 -0800
> >
> >     pidns: Support unsharing the pid namespace.
> >     
> >
> > I haven't really gotten much farther than that yet, but the bisect was
> > pretty straight forward.  Eric, is there anything specific I can gather
> > or do to help figure out why that is causing mock to get such a weird
> > error?  I can provide the bisect log if you'd like.

< Two emails fly past each other in the night >

> My best guess in some dark corner of mock has untested code to unshare a
> pid namespace, and that corner started doing something now that
> unsharing of the pid namespace actually works.
> 
> If mock has called unshare(CLONE_NEWPID). And then forked a process and
> that process exited, and then forked anothe process that second and all
> subsequent fork calls will fail with -ENOMEM (because init has exited in
> the pid namespace).  -ENOMEM will be generated because of a failure of
> alloc_pid.
> 
> Looking at that code path a little closer that just about has to be it,
> because I goofed and the error path drops the lock but not irqs.  The
> patch below should fix the nasty warning and confirm where the code is
> failing in copy_process.

OK.  I'll turn the debug option back on and give this patch a try.

> An strace to see which syscalls mock is making and with which flags
> would be very interesting.  I am almost certain that there is a
> unshare(CLONE_NEWPID) somewhere in there.  But in a remote corner of
> possibility it could weird clone flags, or something else.

Oh, I have that but it's a python app with a helper C app and it's a...
verbose strace.  It's here for one failure:

http://jwboyer.fedorapeople.org/pub/mock-strace

Hopefully the testcase from my other email will help though.  It's much
simpler.

> Beyond that I suspect we want to work with the mock folks so they get
> their code to use a pid namespace working the way they intended.

Right.  CC'd Clark (for real this time).

I'll let you know on the patch.

josh

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Odd ENOMEM being returned in 3.8-rcX
  2013-02-08 20:18       ` Josh Boyer
@ 2013-02-08 20:36         ` Eric W. Biederman
  2013-02-08 20:40           ` Josh Boyer
  0 siblings, 1 reply; 19+ messages in thread
From: Eric W. Biederman @ 2013-02-08 20:36 UTC (permalink / raw)
  To: Josh Boyer; +Cc: Andrew Morton, Al Viro, Mel Gorman, linux-kernel

Josh Boyer <jwboyer@redhat.com> writes:

> On Fri, Feb 08, 2013 at 01:19:49PM -0500, Josh Boyer wrote:
>> On Thu, Feb 07, 2013 at 07:35:01PM -0500, Josh Boyer wrote:
>> > On Thu, Feb 07, 2013 at 02:15:02PM -0800, Andrew Morton wrote:
>> > > On Thu, 7 Feb 2013 16:57:42 -0500
>> > > Josh Boyer <jwboyer@redhat.com> wrote:
>> > > 
>> > > > Hi All,
>> > > > 
>> > > > We've hit a weird error in Fedora using the 3.8-rcX kernels.  It seems
>> > > > the mock tool is getting back ENOMEM when doing very simple things that
>> > > > normally just work.  The 3.7 kernels on the same userspace work just
>> > > > fine.  It seems just running 'mock init -v' is enough to cause the
>> > > > failure.
>> > > 
>> > > I assume you're not seeing the "page allocation failure" message and
>> > > backtrace.  This means that either
>> > 
>> > Right.  If I disable our debug options, I see no backtraces at all and
>> > the python app still gets ENOMEM returned.  (See below for those
>> > interested).
>> > 
>> > > a) it's a __GFP_NOWARN callsite.  This is rare.  Or
>> > > 
>> > > b) it's actually a different error but someone went and overwrote a
>> > >    callee's return value with -ENOMEM.  We do this a lot and it sucks.
>> > 
>> > We do it in copy_io :\.
>> > 
>> > > > At first glance it seems copy_io is failing (possibly because
>> > > > get_task_io_context fails), and then the above fallout is printed.  The
>> > > > warning seems fairly valid, but I don't think that is the root of the
>> > > > problem.
>> > > 
>> > > yes, get_task_io_context() might be the place.  Tried adding a few
>> > > error-path printks in there to see what's happening?
>> > 
>> > Yeah, that's my next step.  I guess I know what I'll be doing tomorrow.
>> > 
>> > > I can't see anything around there which leaves interrupts disabled
>> > > though.  It's quite likely that there's some code with is forgetting to
>> > > reenable interrupts on a rarely-tested error path, and that ENOMEM is
>> > > tickling the bug.
>> > 
>> > Right, agreed.  As I said, I think that is mostly a secondary issue.
>> > Hopefully it will be easy to fix once we figure out why we're getting
>> > the ENOMEM error.
>> > 
>> > Python backtrace below.  Seems to be failing on forking a umount command
>> > after init'ing the chroot.  I can put the full output somewhere if
>> > people are interested.
>> 
>> OK.  I've bisected this down to:
>> 
>> 50804fe3737ca6a5942fdc2057a18a8141d00141 is the first bad commit
>> commit 50804fe3737ca6a5942fdc2057a18a8141d00141
>> Author: Eric W. Biederman <ebiederm@xmission.com>
>> Date:   Tue Mar 2 15:41:50 2010 -0800
>> 
>>     pidns: Support unsharing the pid namespace.
>>     
>> 
>> I haven't really gotten much farther than that yet, but the bisect was
>> pretty straight forward.  Eric, is there anything specific I can gather
>> or do to help figure out why that is causing mock to get such a weird
>> error?  I can provide the bisect log if you'd like.
>
> I took a look at what mock was doing and it was mostly very simple
> stuff.  The two exceptions were that it was calling unshare, then doing
> some file checks and I/O, and then calling fork to exec off some helper
> things.  Up until the point it fails, the forks work and the children go
> do whatever it is they were supposed to do.  I've CC'd Clark Williams
> just in case people have questions on mock itself, but I'm not sure that
> will be needed.

Our emails crossed paths.  You have just confirmed my suspicion about
what was going wrong.

The practical question is why mock is calling unshare(CLONE_NEWPID)
because it clearly seems not to understand how to unshare the pid
namespace and use it that way.

Except for forgeting to reenable irqs in the failure path of alloc_pid
the behavior is exactly correct and is how the pid namespace is designed
to behave in the case of unshare.


> which is consistent with what mock is seeing.  If I comment out the call
> to unshare, it seems to always work.  It seems to consistently fail with
> ENOMEM after the first 3-5 forked children, but it varies within that
> range.

If you add a waitpid or space out your forks you will see that it always
fails after your first child in the pid namespace has exited.

We don't allow children in a pid namespace after fork has exited.

Eric

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Odd ENOMEM being returned in 3.8-rcX
  2013-02-08 20:36         ` Eric W. Biederman
@ 2013-02-08 20:40           ` Josh Boyer
  0 siblings, 0 replies; 19+ messages in thread
From: Josh Boyer @ 2013-02-08 20:40 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Andrew Morton, Al Viro, Mel Gorman, linux-kernel, williams

On Fri, Feb 08, 2013 at 12:36:08PM -0800, Eric W. Biederman wrote:
> Josh Boyer <jwboyer@redhat.com> writes:
> >> OK.  I've bisected this down to:
> >> 
> >> 50804fe3737ca6a5942fdc2057a18a8141d00141 is the first bad commit
> >> commit 50804fe3737ca6a5942fdc2057a18a8141d00141
> >> Author: Eric W. Biederman <ebiederm@xmission.com>
> >> Date:   Tue Mar 2 15:41:50 2010 -0800
> >> 
> >>     pidns: Support unsharing the pid namespace.
> >>     
> >> 
> >> I haven't really gotten much farther than that yet, but the bisect was
> >> pretty straight forward.  Eric, is there anything specific I can gather
> >> or do to help figure out why that is causing mock to get such a weird
> >> error?  I can provide the bisect log if you'd like.
> >
> > I took a look at what mock was doing and it was mostly very simple
> > stuff.  The two exceptions were that it was calling unshare, then doing
> > some file checks and I/O, and then calling fork to exec off some helper
> > things.  Up until the point it fails, the forks work and the children go
> > do whatever it is they were supposed to do.  I've CC'd Clark Williams
> > just in case people have questions on mock itself, but I'm not sure that
> > will be needed.
> 
> Our emails crossed paths.  You have just confirmed my suspicion about
> what was going wrong.
> 
> The practical question is why mock is calling unshare(CLONE_NEWPID)
> because it clearly seems not to understand how to unshare the pid
> namespace and use it that way.

I don't know either, and I've CC'd Clark on this thread now (again for
real this time).

> Except for forgeting to reenable irqs in the failure path of alloc_pid
> the behavior is exactly correct and is how the pid namespace is designed
> to behave in the case of unshare.

Well... ok.  Personally, I'm confused why mock was working before that
commit and now it doesn't.  Were we just buggy in the kernel prior to
that?

> > which is consistent with what mock is seeing.  If I comment out the call
> > to unshare, it seems to always work.  It seems to consistently fail with
> > ENOMEM after the first 3-5 forked children, but it varies within that
> > range.
> 
> If you add a waitpid or space out your forks you will see that it always
> fails after your first child in the pid namespace has exited.
> 
> We don't allow children in a pid namespace after fork has exited.

Hm.  Mock seems to make it a ways along as well though, and it isn't
doing rapid fire fork/exec.  I believe it's forking helpers and waiting
for them to return before continuing on.

josh

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Odd ENOMEM being returned in 3.8-rcX
  2013-02-08 20:23         ` Josh Boyer
@ 2013-02-08 20:45           ` Eric W. Biederman
  2013-02-08 21:27             ` Josh Boyer
  0 siblings, 1 reply; 19+ messages in thread
From: Eric W. Biederman @ 2013-02-08 20:45 UTC (permalink / raw)
  To: Josh Boyer; +Cc: Andrew Morton, Al Viro, Mel Gorman, linux-kernel, williams

Josh Boyer <jwboyer@redhat.com> writes:

> < Two emails fly past each other in the night >

Yep.

>> My best guess in some dark corner of mock has untested code to unshare a
>> pid namespace, and that corner started doing something now that
>> unsharing of the pid namespace actually works.
>> 
>> If mock has called unshare(CLONE_NEWPID). And then forked a process and
>> that process exited, and then forked anothe process that second and all
>> subsequent fork calls will fail with -ENOMEM (because init has exited in
>> the pid namespace).  -ENOMEM will be generated because of a failure of
>> alloc_pid.
>> 
>> Looking at that code path a little closer that just about has to be it,
>> because I goofed and the error path drops the lock but not irqs.  The
>> patch below should fix the nasty warning and confirm where the code is
>> failing in copy_process.
>
> OK.  I'll turn the debug option back on and give this patch a try.

Thanks.  Your minimal test case also confirms my hunch. But we should
fix the error path as well.

>> An strace to see which syscalls mock is making and with which flags
>> would be very interesting.  I am almost certain that there is a
>> unshare(CLONE_NEWPID) somewhere in there.  But in a remote corner of
>> possibility it could weird clone flags, or something else.
>
> Oh, I have that but it's a python app with a helper C app and it's a...
> verbose strace.  It's here for one failure:
>
> http://jwboyer.fedorapeople.org/pub/mock-strace
>
> Hopefully the testcase from my other email will help though.  It's much
> simpler.

Yes.  Your other test case confirms my patch you bisected this to is
working correctly.

>> Beyond that I suspect we want to work with the mock folks so they get
>> their code to use a pid namespace working the way they intended.
>
> Right.  CC'd Clark (for real this time).
>
> I'll let you know on the patch.

Cool.  Looking at the strace I can't figure out what mock expected
to happen or how mock was working before this.  As mock is calling
unshare(CLONE_NEWNS|CLONE_NEWUTS|CLONE_NEWIPC|CLONE_NEWPID) all in one
go.

Previous to my patch enabling CLONE_NEWPID that would cause the unshare
to fail.

So it looks mock is taking a buggy untested code path and things are not
working as it expected.

Eric


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Odd ENOMEM being returned in 3.8-rcX
  2013-02-08 20:45           ` Eric W. Biederman
@ 2013-02-08 21:27             ` Josh Boyer
  2013-02-08 22:05               ` Eric W. Biederman
  2013-02-08 22:10               ` Clark Williams
  0 siblings, 2 replies; 19+ messages in thread
From: Josh Boyer @ 2013-02-08 21:27 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Andrew Morton, Al Viro, Mel Gorman, linux-kernel, williams

On Fri, Feb 08, 2013 at 12:45:47PM -0800, Eric W. Biederman wrote:
> Josh Boyer <jwboyer@redhat.com> writes:
> 
> > < Two emails fly past each other in the night >
> 
> Yep.
> 
> >> My best guess in some dark corner of mock has untested code to unshare a
> >> pid namespace, and that corner started doing something now that
> >> unsharing of the pid namespace actually works.
> >> 
> >> If mock has called unshare(CLONE_NEWPID). And then forked a process and
> >> that process exited, and then forked anothe process that second and all
> >> subsequent fork calls will fail with -ENOMEM (because init has exited in
> >> the pid namespace).  -ENOMEM will be generated because of a failure of
> >> alloc_pid.
> >> 
> >> Looking at that code path a little closer that just about has to be it,
> >> because I goofed and the error path drops the lock but not irqs.  The
> >> patch below should fix the nasty warning and confirm where the code is
> >> failing in copy_process.
> >
> > OK.  I'll turn the debug option back on and give this patch a try.
> 
> Thanks.  Your minimal test case also confirms my hunch. But we should
> fix the error path as well.
> 
> >> An strace to see which syscalls mock is making and with which flags
> >> would be very interesting.  I am almost certain that there is a
> >> unshare(CLONE_NEWPID) somewhere in there.  But in a remote corner of
> >> possibility it could weird clone flags, or something else.
> >
> > Oh, I have that but it's a python app with a helper C app and it's a...
> > verbose strace.  It's here for one failure:
> >
> > http://jwboyer.fedorapeople.org/pub/mock-strace
> >
> > Hopefully the testcase from my other email will help though.  It's much
> > simpler.
> 
> Yes.  Your other test case confirms my patch you bisected this to is
> working correctly.
> 
> >> Beyond that I suspect we want to work with the mock folks so they get
> >> their code to use a pid namespace working the way they intended.
> >
> > Right.  CC'd Clark (for real this time).
> >
> > I'll let you know on the patch.

The patch appears to work as expected.  With CONFIG_DEBUG_ATOMIC_SLEEP
set I don't see the backtrace error.  That should go in.

> Cool.  Looking at the strace I can't figure out what mock expected
> to happen or how mock was working before this.  As mock is calling
> unshare(CLONE_NEWNS|CLONE_NEWUTS|CLONE_NEWIPC|CLONE_NEWPID) all in one
> go.
> 
> Previous to my patch enabling CLONE_NEWPID that would cause the unshare
> to fail.

Oh.  Indeed.  On a kernel without the commit in question I see this from
mock:

DEBUG: Unsharing. Flags: 738328576
DEBUG: unshare(738328576) failed, falling back to unshare(67239936)
DEBUG: Unsharing. Flags: 67239936

So it's trying with NEWPID and that fails, so it falls back to just
NEWNS | NEWUTS.  That explains it working.

> So it looks mock is taking a buggy untested code path and things are not
> working as it expected.

Quite possibly, yes.  I instrumented the kernel a bit and it is indeed
failing in the alloc_pid call.

Clark, thoughts here?

josh

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Odd ENOMEM being returned in 3.8-rcX
  2013-02-08 21:27             ` Josh Boyer
@ 2013-02-08 22:05               ` Eric W. Biederman
  2013-02-08 22:40                 ` Clark Williams
  2013-02-08 22:10               ` Clark Williams
  1 sibling, 1 reply; 19+ messages in thread
From: Eric W. Biederman @ 2013-02-08 22:05 UTC (permalink / raw)
  To: Josh Boyer; +Cc: Andrew Morton, Al Viro, Mel Gorman, linux-kernel, williams

Josh Boyer <jwboyer@redhat.com> writes:

>> So it looks mock is taking a buggy untested code path and things are not
>> working as it expected.
>
> Quite possibly, yes.  I instrumented the kernel a bit and it is indeed
> failing in the alloc_pid call.
>
> Clark, thoughts here?

I will just add the solution is probably for mock to fork immediate
after the unshare succeeds in creating a pid namespace.  With the
original process waiting for mock to exit and the child process
doing everything that mock does now.

That will allow mock to act as the init process in the pid namespace it
just created.

Eric


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Odd ENOMEM being returned in 3.8-rcX
  2013-02-08 21:27             ` Josh Boyer
  2013-02-08 22:05               ` Eric W. Biederman
@ 2013-02-08 22:10               ` Clark Williams
  2013-02-08 22:40                 ` Eric W. Biederman
  1 sibling, 1 reply; 19+ messages in thread
From: Clark Williams @ 2013-02-08 22:10 UTC (permalink / raw)
  To: Josh Boyer
  Cc: Eric W. Biederman, Andrew Morton, Al Viro, Mel Gorman, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 3885 bytes --]

On Fri, 8 Feb 2013 16:27:26 -0500
Josh Boyer <jwboyer@redhat.com> wrote:

> On Fri, Feb 08, 2013 at 12:45:47PM -0800, Eric W. Biederman wrote:
> > Josh Boyer <jwboyer@redhat.com> writes:
> > 
> > > < Two emails fly past each other in the night >
> > 
> > Yep.
> > 
> > >> My best guess in some dark corner of mock has untested code to unshare a
> > >> pid namespace, and that corner started doing something now that
> > >> unsharing of the pid namespace actually works.
> > >> 
> > >> If mock has called unshare(CLONE_NEWPID). And then forked a process and
> > >> that process exited, and then forked anothe process that second and all
> > >> subsequent fork calls will fail with -ENOMEM (because init has exited in
> > >> the pid namespace).  -ENOMEM will be generated because of a failure of
> > >> alloc_pid.
> > >> 
> > >> Looking at that code path a little closer that just about has to be it,
> > >> because I goofed and the error path drops the lock but not irqs.  The
> > >> patch below should fix the nasty warning and confirm where the code is
> > >> failing in copy_process.
> > >
> > > OK.  I'll turn the debug option back on and give this patch a try.
> > 
> > Thanks.  Your minimal test case also confirms my hunch. But we should
> > fix the error path as well.
> > 
> > >> An strace to see which syscalls mock is making and with which flags
> > >> would be very interesting.  I am almost certain that there is a
> > >> unshare(CLONE_NEWPID) somewhere in there.  But in a remote corner of
> > >> possibility it could weird clone flags, or something else.
> > >
> > > Oh, I have that but it's a python app with a helper C app and it's a...
> > > verbose strace.  It's here for one failure:
> > >
> > > http://jwboyer.fedorapeople.org/pub/mock-strace
> > >
> > > Hopefully the testcase from my other email will help though.  It's much
> > > simpler.
> > 
> > Yes.  Your other test case confirms my patch you bisected this to is
> > working correctly.
> > 
> > >> Beyond that I suspect we want to work with the mock folks so they get
> > >> their code to use a pid namespace working the way they intended.
> > >
> > > Right.  CC'd Clark (for real this time).
> > >
> > > I'll let you know on the patch.
> 
> The patch appears to work as expected.  With CONFIG_DEBUG_ATOMIC_SLEEP
> set I don't see the backtrace error.  That should go in.
> 
> > Cool.  Looking at the strace I can't figure out what mock expected
> > to happen or how mock was working before this.  As mock is calling
> > unshare(CLONE_NEWNS|CLONE_NEWUTS|CLONE_NEWIPC|CLONE_NEWPID) all in one
> > go.
> > 
> > Previous to my patch enabling CLONE_NEWPID that would cause the unshare
> > to fail.
> 
> Oh.  Indeed.  On a kernel without the commit in question I see this from
> mock:
> 
> DEBUG: Unsharing. Flags: 738328576
> DEBUG: unshare(738328576) failed, falling back to unshare(67239936)
> DEBUG: Unsharing. Flags: 67239936
> 
> So it's trying with NEWPID and that fails, so it falls back to just
> NEWNS | NEWUTS.  That explains it working.
> 
> > So it looks mock is taking a buggy untested code path and things are not
> > working as it expected.
> 
> Quite possibly, yes.  I instrumented the kernel a bit and it is indeed
> failing in the alloc_pid call.
> 
> Clark, thoughts here?
> 
> josh

The more I look at that the more I think I should nuke CLONE_NEWPID in
mock. It came in with a commit that added NEWIPC, which I think is valid
for mock managing a chroot, but we're not looking to do full-up
containers at this point and it looks like containers is the only place
you'd want to start a new set of pids. 

I'll do that and run some tests to make sure I'm not depending on
something there (doubtful since it looks like it never really worked
and that we've always been using the base flags). 

Clark

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 198 bytes --]

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Odd ENOMEM being returned in 3.8-rcX
  2013-02-08 20:13       ` Eric W. Biederman
  2013-02-08 20:23         ` Josh Boyer
@ 2013-02-08 22:12         ` Josh Boyer
  2013-02-11 23:57         ` Andrew Morton
  2 siblings, 0 replies; 19+ messages in thread
From: Josh Boyer @ 2013-02-08 22:12 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: Andrew Morton, Al Viro, Mel Gorman, linux-kernel

On Fri, Feb 08, 2013 at 12:13:09PM -0800, Eric W. Biederman wrote:

For posterity's sake

> From: "Eric W. Biederman" <ebiederm@xmission.com>
> Date: Fri, 8 Feb 2013 12:05:54 -0800
> Subject: [PATCH] pid: unlock_irq when alloc_pid fails because init has
>  exited.
> 
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

Tested-by: Josh Boyer <jwboyer@redhat.com>

> ---
>  kernel/pid.c |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
> 
> diff --git a/kernel/pid.c b/kernel/pid.c
> index de9af60..f2c6a68 100644
> --- a/kernel/pid.c
> +++ b/kernel/pid.c
> @@ -331,7 +331,7 @@ out:
>  	return pid;
>  
>  out_unlock:
> -	spin_unlock(&pidmap_lock);
> +	spin_unlock_irq(&pidmap_lock);
>  out_free:
>  	while (++i <= ns->level)
>  		free_pidmap(pid->numbers + i);
> -- 
> 1.7.5.4

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Odd ENOMEM being returned in 3.8-rcX
  2013-02-08 22:10               ` Clark Williams
@ 2013-02-08 22:40                 ` Eric W. Biederman
  2013-02-08 22:56                   ` Clark Williams
  0 siblings, 1 reply; 19+ messages in thread
From: Eric W. Biederman @ 2013-02-08 22:40 UTC (permalink / raw)
  To: Clark Williams
  Cc: Josh Boyer, Andrew Morton, Al Viro, Mel Gorman, linux-kernel

Clark Williams <williams@redhat.com> writes:

> The more I look at that the more I think I should nuke CLONE_NEWPID in
> mock. It came in with a commit that added NEWIPC, which I think is valid
> for mock managing a chroot, but we're not looking to do full-up
> containers at this point and it looks like containers is the only place
> you'd want to start a new set of pids. 

Just taking the code out seems reasonable.  Howerver there is a
practical use for a pid namespace in a setup like mock.  A pid namespace
makes it so your sub processes can not reparent and get away from you,
which could be handy in case someone starts a system daemon in a post
install script.

Eric


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Odd ENOMEM being returned in 3.8-rcX
  2013-02-08 22:05               ` Eric W. Biederman
@ 2013-02-08 22:40                 ` Clark Williams
  0 siblings, 0 replies; 19+ messages in thread
From: Clark Williams @ 2013-02-08 22:40 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Josh Boyer, Andrew Morton, Al Viro, Mel Gorman, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 1405 bytes --]

On Fri, 08 Feb 2013 14:05:55 -0800
ebiederm@xmission.com (Eric W. Biederman) wrote:

> Josh Boyer <jwboyer@redhat.com> writes:
> 
> >> So it looks mock is taking a buggy untested code path and things are not
> >> working as it expected.
> >
> > Quite possibly, yes.  I instrumented the kernel a bit and it is indeed
> > failing in the alloc_pid call.
> >
> > Clark, thoughts here?
> 
> I will just add the solution is probably for mock to fork immediate
> after the unshare succeeds in creating a pid namespace.  With the
> original process waiting for mock to exit and the child process
> doing everything that mock does now.
> 
> That will allow mock to act as the init process in the pid namespace it
> just created.
> 
> Eric
> 

Well, mock is not really setup to act as an init (i.e. reap all the
child processes in the new namespace). Not really sure that's what I
want anyway, verses just nuking CLONE_NEWPID and running in the same
pidspace. 

I added some debugging output in the code around the unshare() calls and
so far I can't seem to get unshare() to succeed when NEWPID is one of
the flags. I'll admit that I'm running a realtime kernel (3.6.11-rt28
with CONFIG_PID_NS=y) so I'll boot into the latest F18 and make sure of
this, but I'm leaning towards just removing NEWPID, since we weren't
using it right and I'm not convinced that we need it. 

Clark

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 198 bytes --]

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Odd ENOMEM being returned in 3.8-rcX
  2013-02-08 22:40                 ` Eric W. Biederman
@ 2013-02-08 22:56                   ` Clark Williams
  0 siblings, 0 replies; 19+ messages in thread
From: Clark Williams @ 2013-02-08 22:56 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Josh Boyer, Andrew Morton, Al Viro, Mel Gorman, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 1317 bytes --]

On Fri, 08 Feb 2013 14:40:13 -0800
ebiederm@xmission.com (Eric W. Biederman) wrote:

> Clark Williams <williams@redhat.com> writes:
> 
> > The more I look at that the more I think I should nuke CLONE_NEWPID in
> > mock. It came in with a commit that added NEWIPC, which I think is valid
> > for mock managing a chroot, but we're not looking to do full-up
> > containers at this point and it looks like containers is the only place
> > you'd want to start a new set of pids. 
> 
> Just taking the code out seems reasonable.  Howerver there is a
> practical use for a pid namespace in a setup like mock.  A pid namespace
> makes it so your sub processes can not reparent and get away from you,
> which could be handy in case someone starts a system daemon in a post
> install script.
> 

Ok, I *think* I'm up to speed now (I'm old and slow so gimme a break). 

Unsharing pidns only works after your commit in 3.8; that's why my
unshare was always failing. Does it make sense for me to make an
additional unshare() call with just NEWPID as an argument? That is,
call unshare with the NEWNS, NEWIPC, and NEWUTS flags, then when that
succeeds, try NEWPID. If the NEWPID call succeeds, do:

    pid = os.fork()
    if pid:
        os.waitpid(pid, 0)

So that the child continues on?

Clark

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 198 bytes --]

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Odd ENOMEM being returned in 3.8-rcX
  2013-02-08 20:13       ` Eric W. Biederman
  2013-02-08 20:23         ` Josh Boyer
  2013-02-08 22:12         ` Josh Boyer
@ 2013-02-11 23:57         ` Andrew Morton
  2013-02-12 10:34           ` Eric W. Biederman
  2 siblings, 1 reply; 19+ messages in thread
From: Andrew Morton @ 2013-02-11 23:57 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: Josh Boyer, Al Viro, Mel Gorman, linux-kernel

On Fri, 08 Feb 2013 12:13:09 -0800
ebiederm@xmission.com (Eric W. Biederman) wrote:

> If mock has called unshare(CLONE_NEWPID). And then forked a process and
> that process exited, and then forked anothe process that second and all
> subsequent fork calls will fail with -ENOMEM (because init has exited in
> the pid namespace).  -ENOMEM will be generated because of a failure of
> alloc_pid.

Can we please fix this?  The system is *not* out of memory and it's
wildly misleading to report this to userspace.

If alloc_pid() can fail for multiple reasons then it should be
returning an ERR_PTR on failure, not NULL.


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Odd ENOMEM being returned in 3.8-rcX
  2013-02-11 23:57         ` Andrew Morton
@ 2013-02-12 10:34           ` Eric W. Biederman
  0 siblings, 0 replies; 19+ messages in thread
From: Eric W. Biederman @ 2013-02-12 10:34 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Josh Boyer, Al Viro, Mel Gorman, linux-kernel

Andrew Morton <akpm@linux-foundation.org> writes:

> On Fri, 08 Feb 2013 12:13:09 -0800
> ebiederm@xmission.com (Eric W. Biederman) wrote:
>
>> If mock has called unshare(CLONE_NEWPID). And then forked a process and
>> that process exited, and then forked anothe process that second and all
>> subsequent fork calls will fail with -ENOMEM (because init has exited in
>> the pid namespace).  -ENOMEM will be generated because of a failure of
>> alloc_pid.
>
> Can we please fix this?  The system is *not* out of memory and it's
> wildly misleading to report this to userspace.
>
> If alloc_pid() can fail for multiple reasons then it should be
> returning an ERR_PTR on failure, not NULL.

We might be able to fix this.

alloc_pid causing fork to fail with ENOMEM when there are simply not
enough pids available to use is a long standing failure mode, and error
code.

I believe to change this responsibily you would have read a lot of
programs to see if fork failing with an additional error code would be
handled or if things would break in subtle ways.  There would need to be
research to see what posix says about this, and the posix view on fork
would have any impact on this in any way.

Furthermore this error is a resource shortage, so ENOMEM isn't even
exactly wrong.

I can understand the problem but I totally don't care enough to push an
ABI change like that through.

Eric

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2013-02-12 10:34 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-02-07 21:57 Odd ENOMEM being returned in 3.8-rcX Josh Boyer
2013-02-07 22:15 ` Andrew Morton
2013-02-08  0:35   ` Josh Boyer
2013-02-08 18:19     ` Josh Boyer
2013-02-08 20:13       ` Eric W. Biederman
2013-02-08 20:23         ` Josh Boyer
2013-02-08 20:45           ` Eric W. Biederman
2013-02-08 21:27             ` Josh Boyer
2013-02-08 22:05               ` Eric W. Biederman
2013-02-08 22:40                 ` Clark Williams
2013-02-08 22:10               ` Clark Williams
2013-02-08 22:40                 ` Eric W. Biederman
2013-02-08 22:56                   ` Clark Williams
2013-02-08 22:12         ` Josh Boyer
2013-02-11 23:57         ` Andrew Morton
2013-02-12 10:34           ` Eric W. Biederman
2013-02-08 20:18       ` Josh Boyer
2013-02-08 20:36         ` Eric W. Biederman
2013-02-08 20:40           ` Josh Boyer

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).