* Odd ENOMEM being returned in 3.8-rcX @ 2013-02-07 21:57 Josh Boyer 2013-02-07 22:15 ` Andrew Morton 0 siblings, 1 reply; 19+ messages in thread From: Josh Boyer @ 2013-02-07 21:57 UTC (permalink / raw) To: Andrew Morton, Al Viro, Eric W. Biederman, Mel Gorman; +Cc: linux-kernel Hi All, We've hit a weird error in Fedora using the 3.8-rcX kernels. It seems the mock tool is getting back ENOMEM when doing very simple things that normally just work. The 3.7 kernels on the same userspace work just fine. It seems just running 'mock init -v' is enough to cause the failure. Because this is the rawhide kernel, we have some debug options enabled. This happens to trigger this error: [ 89.143660] BUG: sleeping function called from invalid context at kernel/nsproxy.c:217 [ 89.143729] in_atomic(): 0, irqs_disabled(): 1, pid: 1329, name: mock [ 89.143776] no locks held by mock/1329. [ 89.143778] irq event stamp: 324562 [ 89.143781] hardirqs last enabled at (324561): [<ffffffff81163a8d>] get_page_from_freelist+0x51d/0x990 [ 89.143791] hardirqs last disabled at (324562): [<ffffffff816daa9d>] _raw_spin_lock_irq+0x1d/0x60 [ 89.143798] softirqs last enabled at (323936): [<ffffffff81070438>] __do_softirq+0x168/0x3d0 [ 89.143804] softirqs last disabled at (323931): [<ffffffff816e587c>] call_softirq+0x1c/0x30 [ 89.143811] Pid: 1329, comm: mock Not tainted 3.8.0-0.rc6.git1.1.fc19.x86_64 #1 [ 89.143814] Call Trace: [ 89.143823] [<ffffffff8109f8d9>] __might_sleep+0x179/0x230 [ 89.143828] [<ffffffff81097887>] switch_task_namespaces+0x27/0x60 [ 89.143833] [<ffffffff810978d0>] exit_task_namespaces+0x10/0x20 [ 89.143839] [<ffffffff81064692>] copy_process.part.22+0xe32/0x1640 [ 89.143844] [<ffffffff81064f95>] do_fork+0xa5/0x450 [ 89.143849] [<ffffffff816db718>] ? retint_swapgs+0x13/0x1b [ 89.143854] [<ffffffff810653c6>] sys_clone+0x16/0x20 [ 89.143859] [<ffffffff816e48b9>] stub_clone+0x69/0x90 [ 89.143864] [<ffffffff816e44d9>] ? system_call_fastpath+0x16/0x1b At first glance it seems copy_io is failing (possibly because get_task_io_context fails), and then the above fallout is printed. The warning seems fairly valid, but I don't think that is the root of the problem. We've seen this as far back as Linux v3.8-rc2-116-g5f243b9 so far. I can still hit it with 3.8-rc6 as well. I'm still trying to see if the ENOMEM hits without the debug options set, and exactly which commit caused it. I just wanted to see if anyone else had seen odd python issues or other things failing with ENOMEM when they shouldn't while I'm off debugging. Thoughts/tips would be appreciated. josh ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: Odd ENOMEM being returned in 3.8-rcX 2013-02-07 21:57 Odd ENOMEM being returned in 3.8-rcX Josh Boyer @ 2013-02-07 22:15 ` Andrew Morton 2013-02-08 0:35 ` Josh Boyer 0 siblings, 1 reply; 19+ messages in thread From: Andrew Morton @ 2013-02-07 22:15 UTC (permalink / raw) To: Josh Boyer; +Cc: Al Viro, Eric W. Biederman, Mel Gorman, linux-kernel On Thu, 7 Feb 2013 16:57:42 -0500 Josh Boyer <jwboyer@redhat.com> wrote: > Hi All, > > We've hit a weird error in Fedora using the 3.8-rcX kernels. It seems > the mock tool is getting back ENOMEM when doing very simple things that > normally just work. The 3.7 kernels on the same userspace work just > fine. It seems just running 'mock init -v' is enough to cause the > failure. I assume you're not seeing the "page allocation failure" message and backtrace. This means that either a) it's a __GFP_NOWARN callsite. This is rare. Or b) it's actually a different error but someone went and overwrote a callee's return value with -ENOMEM. We do this a lot and it sucks. > Because this is the rawhide kernel, we have some debug options enabled. > This happens to trigger this error: > > [ 89.143660] BUG: sleeping function called from invalid context at kernel/nsproxy.c:217 > [ 89.143729] in_atomic(): 0, irqs_disabled(): 1, pid: 1329, name: mock > [ 89.143776] no locks held by mock/1329. > [ 89.143778] irq event stamp: 324562 > [ 89.143781] hardirqs last enabled at (324561): [<ffffffff81163a8d>] get_page_from_freelist+0x51d/0x990 > [ 89.143791] hardirqs last disabled at (324562): [<ffffffff816daa9d>] _raw_spin_lock_irq+0x1d/0x60 > [ 89.143798] softirqs last enabled at (323936): [<ffffffff81070438>] __do_softirq+0x168/0x3d0 > [ 89.143804] softirqs last disabled at (323931): [<ffffffff816e587c>] call_softirq+0x1c/0x30 > [ 89.143811] Pid: 1329, comm: mock Not tainted 3.8.0-0.rc6.git1.1.fc19.x86_64 #1 > [ 89.143814] Call Trace: > [ 89.143823] [<ffffffff8109f8d9>] __might_sleep+0x179/0x230 > [ 89.143828] [<ffffffff81097887>] switch_task_namespaces+0x27/0x60 > [ 89.143833] [<ffffffff810978d0>] exit_task_namespaces+0x10/0x20 > [ 89.143839] [<ffffffff81064692>] copy_process.part.22+0xe32/0x1640 > [ 89.143844] [<ffffffff81064f95>] do_fork+0xa5/0x450 > [ 89.143849] [<ffffffff816db718>] ? retint_swapgs+0x13/0x1b > [ 89.143854] [<ffffffff810653c6>] sys_clone+0x16/0x20 > [ 89.143859] [<ffffffff816e48b9>] stub_clone+0x69/0x90 > [ 89.143864] [<ffffffff816e44d9>] ? system_call_fastpath+0x16/0x1b > > At first glance it seems copy_io is failing (possibly because > get_task_io_context fails), and then the above fallout is printed. The > warning seems fairly valid, but I don't think that is the root of the > problem. yes, get_task_io_context() might be the place. Tried adding a few error-path printks in there to see what's happening? I can't see anything around there which leaves interrupts disabled though. It's quite likely that there's some code with is forgetting to reenable interrupts on a rarely-tested error path, and that ENOMEM is tickling the bug. > We've seen this as far back as Linux v3.8-rc2-116-g5f243b9 so far. I > can still hit it with 3.8-rc6 as well. > > I'm still trying to see if the ENOMEM hits without the debug options set, > and exactly which commit caused it. I just wanted to see if anyone else > had seen odd python issues or other things failing with ENOMEM when they > shouldn't while I'm off debugging. > > Thoughts/tips would be appreciated. ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: Odd ENOMEM being returned in 3.8-rcX 2013-02-07 22:15 ` Andrew Morton @ 2013-02-08 0:35 ` Josh Boyer 2013-02-08 18:19 ` Josh Boyer 0 siblings, 1 reply; 19+ messages in thread From: Josh Boyer @ 2013-02-08 0:35 UTC (permalink / raw) To: Andrew Morton; +Cc: Al Viro, Eric W. Biederman, Mel Gorman, linux-kernel On Thu, Feb 07, 2013 at 02:15:02PM -0800, Andrew Morton wrote: > On Thu, 7 Feb 2013 16:57:42 -0500 > Josh Boyer <jwboyer@redhat.com> wrote: > > > Hi All, > > > > We've hit a weird error in Fedora using the 3.8-rcX kernels. It seems > > the mock tool is getting back ENOMEM when doing very simple things that > > normally just work. The 3.7 kernels on the same userspace work just > > fine. It seems just running 'mock init -v' is enough to cause the > > failure. > > I assume you're not seeing the "page allocation failure" message and > backtrace. This means that either Right. If I disable our debug options, I see no backtraces at all and the python app still gets ENOMEM returned. (See below for those interested). > a) it's a __GFP_NOWARN callsite. This is rare. Or > > b) it's actually a different error but someone went and overwrote a > callee's return value with -ENOMEM. We do this a lot and it sucks. We do it in copy_io :\. > > At first glance it seems copy_io is failing (possibly because > > get_task_io_context fails), and then the above fallout is printed. The > > warning seems fairly valid, but I don't think that is the root of the > > problem. > > yes, get_task_io_context() might be the place. Tried adding a few > error-path printks in there to see what's happening? Yeah, that's my next step. I guess I know what I'll be doing tomorrow. > I can't see anything around there which leaves interrupts disabled > though. It's quite likely that there's some code with is forgetting to > reenable interrupts on a rarely-tested error path, and that ENOMEM is > tickling the bug. Right, agreed. As I said, I think that is mostly a secondary issue. Hopefully it will be easy to fix once we figure out why we're getting the ENOMEM error. Python backtrace below. Seems to be failing on forking a umount command after init'ing the chroot. I can put the full output somewhere if people are interested. josh DEBUG: Executing command: ['/bin/umount', '-n', '-l', '/var/lib/mock/fedora-16-x86_64/root/proc'] with env {'LANG': 'en_US.utf8', 'TERM': 'vt100', 'SHELL': '/bin/bash', 'HOSTNAME': 'mock', 'HOME': '/builddir', 'PATH': '/usr/bin:/bin:/usr/sbin:/sbin'} Traceback (most recent call last): File "/usr/sbin/mock", line 539, in <module> def do_buildsrpm(config_opts, chroot, options, args): File "/usr/sbin/mock", line 782, in main chroot.init() File "<peak.util.decorators.rewrap wrapping mockbuild.backend.init at 0x00D59320>", line 3, in init File "/usr/lib/python2.7/site-packages/mockbuild/trace_decorator.py", line 70, in trace result = func(*args, **kw) File "/usr/lib/python2.7/site-packages/mockbuild/backend.py", line 279, in init self._init() File "<peak.util.decorators.rewrap wrapping mockbuild.backend._init at 0x00D596E0>", line 3, in _init File "/usr/lib/python2.7/site-packages/mockbuild/trace_decorator.py", line 70, in trace result = func(*args, **kw) File "/usr/lib/python2.7/site-packages/mockbuild/backend.py", line 425, in _init self._umountall() File "<peak.util.decorators.rewrap wrapping mockbuild.backend._umountall at 0x00D5C320>", line 3, in _umountall File "/usr/lib/python2.7/site-packages/mockbuild/trace_decorator.py", line 70, in trace result = func(*args, **kw) File "/usr/lib/python2.7/site-packages/mockbuild/backend.py", line 872, in _umountall self.mounts.umountall(nowarn=nowarn) File "<peak.util.decorators.rewrap wrapping mockbuild.mounts.umountall at 0x00D43398>", line 3, in umountall File "/usr/lib/python2.7/site-packages/mockbuild/trace_decorator.py", line 70, in trace result = func(*args, **kw) File "/usr/lib/python2.7/site-packages/mockbuild/mounts.py", line 133, in umountall m.umount() File "<peak.util.decorators.rewrap wrapping mockbuild.mounts.umount at 0x00D4EC08>", line 3, in umount File "/usr/lib/python2.7/site-packages/mockbuild/trace_decorator.py", line 70, in trace result = func(*args, **kw) File "/usr/lib/python2.7/site-packages/mockbuild/mounts.py", line 69, in umount mockbuild.util.do(cmd) File "<peak.util.decorators.rewrap wrapping mockbuild.util.do at 0x00D47140>", line 3, in do File "/usr/lib/python2.7/site-packages/mockbuild/trace_decorator.py", line 70, in trace result = func(*args, **kw) File "/usr/lib/python2.7/site-packages/mockbuild/util.py", line 323, in do preexec_fn = preexec, File "/usr/lib64/python2.7/subprocess.py", line 679, in __init__ errread, errwrite) File "/usr/lib64/python2.7/subprocess.py", line 1143, in _execute_child self.pid = os.fork() OSError: [Errno 12] Cannot allocate memory ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: Odd ENOMEM being returned in 3.8-rcX 2013-02-08 0:35 ` Josh Boyer @ 2013-02-08 18:19 ` Josh Boyer 2013-02-08 20:13 ` Eric W. Biederman 2013-02-08 20:18 ` Josh Boyer 0 siblings, 2 replies; 19+ messages in thread From: Josh Boyer @ 2013-02-08 18:19 UTC (permalink / raw) To: Andrew Morton, Eric W. Biederman; +Cc: Al Viro, Mel Gorman, linux-kernel On Thu, Feb 07, 2013 at 07:35:01PM -0500, Josh Boyer wrote: > On Thu, Feb 07, 2013 at 02:15:02PM -0800, Andrew Morton wrote: > > On Thu, 7 Feb 2013 16:57:42 -0500 > > Josh Boyer <jwboyer@redhat.com> wrote: > > > > > Hi All, > > > > > > We've hit a weird error in Fedora using the 3.8-rcX kernels. It seems > > > the mock tool is getting back ENOMEM when doing very simple things that > > > normally just work. The 3.7 kernels on the same userspace work just > > > fine. It seems just running 'mock init -v' is enough to cause the > > > failure. > > > > I assume you're not seeing the "page allocation failure" message and > > backtrace. This means that either > > Right. If I disable our debug options, I see no backtraces at all and > the python app still gets ENOMEM returned. (See below for those > interested). > > > a) it's a __GFP_NOWARN callsite. This is rare. Or > > > > b) it's actually a different error but someone went and overwrote a > > callee's return value with -ENOMEM. We do this a lot and it sucks. > > We do it in copy_io :\. > > > > At first glance it seems copy_io is failing (possibly because > > > get_task_io_context fails), and then the above fallout is printed. The > > > warning seems fairly valid, but I don't think that is the root of the > > > problem. > > > > yes, get_task_io_context() might be the place. Tried adding a few > > error-path printks in there to see what's happening? > > Yeah, that's my next step. I guess I know what I'll be doing tomorrow. > > > I can't see anything around there which leaves interrupts disabled > > though. It's quite likely that there's some code with is forgetting to > > reenable interrupts on a rarely-tested error path, and that ENOMEM is > > tickling the bug. > > Right, agreed. As I said, I think that is mostly a secondary issue. > Hopefully it will be easy to fix once we figure out why we're getting > the ENOMEM error. > > Python backtrace below. Seems to be failing on forking a umount command > after init'ing the chroot. I can put the full output somewhere if > people are interested. OK. I've bisected this down to: 50804fe3737ca6a5942fdc2057a18a8141d00141 is the first bad commit commit 50804fe3737ca6a5942fdc2057a18a8141d00141 Author: Eric W. Biederman <ebiederm@xmission.com> Date: Tue Mar 2 15:41:50 2010 -0800 pidns: Support unsharing the pid namespace. I haven't really gotten much farther than that yet, but the bisect was pretty straight forward. Eric, is there anything specific I can gather or do to help figure out why that is causing mock to get such a weird error? I can provide the bisect log if you'd like. josh ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: Odd ENOMEM being returned in 3.8-rcX 2013-02-08 18:19 ` Josh Boyer @ 2013-02-08 20:13 ` Eric W. Biederman 2013-02-08 20:23 ` Josh Boyer ` (2 more replies) 2013-02-08 20:18 ` Josh Boyer 1 sibling, 3 replies; 19+ messages in thread From: Eric W. Biederman @ 2013-02-08 20:13 UTC (permalink / raw) To: Josh Boyer; +Cc: Andrew Morton, Al Viro, Mel Gorman, linux-kernel Josh Boyer <jwboyer@redhat.com> writes: > On Thu, Feb 07, 2013 at 07:35:01PM -0500, Josh Boyer wrote: >> On Thu, Feb 07, 2013 at 02:15:02PM -0800, Andrew Morton wrote: >> > On Thu, 7 Feb 2013 16:57:42 -0500 >> > Josh Boyer <jwboyer@redhat.com> wrote: >> > >> > > Hi All, >> > > >> > > We've hit a weird error in Fedora using the 3.8-rcX kernels. It seems >> > > the mock tool is getting back ENOMEM when doing very simple things that >> > > normally just work. The 3.7 kernels on the same userspace work just >> > > fine. It seems just running 'mock init -v' is enough to cause the >> > > failure. >> > >> > I assume you're not seeing the "page allocation failure" message and >> > backtrace. This means that either >> >> Right. If I disable our debug options, I see no backtraces at all and >> the python app still gets ENOMEM returned. (See below for those >> interested). >> >> > a) it's a __GFP_NOWARN callsite. This is rare. Or >> > >> > b) it's actually a different error but someone went and overwrote a >> > callee's return value with -ENOMEM. We do this a lot and it sucks. >> >> We do it in copy_io :\. >> >> > > At first glance it seems copy_io is failing (possibly because >> > > get_task_io_context fails), and then the above fallout is printed. The >> > > warning seems fairly valid, but I don't think that is the root of the >> > > problem. >> > >> > yes, get_task_io_context() might be the place. Tried adding a few >> > error-path printks in there to see what's happening? >> >> Yeah, that's my next step. I guess I know what I'll be doing tomorrow. >> >> > I can't see anything around there which leaves interrupts disabled >> > though. It's quite likely that there's some code with is forgetting to >> > reenable interrupts on a rarely-tested error path, and that ENOMEM is >> > tickling the bug. >> >> Right, agreed. As I said, I think that is mostly a secondary issue. >> Hopefully it will be easy to fix once we figure out why we're getting >> the ENOMEM error. >> >> Python backtrace below. Seems to be failing on forking a umount command >> after init'ing the chroot. I can put the full output somewhere if >> people are interested. > > OK. I've bisected this down to: > > 50804fe3737ca6a5942fdc2057a18a8141d00141 is the first bad commit > commit 50804fe3737ca6a5942fdc2057a18a8141d00141 > Author: Eric W. Biederman <ebiederm@xmission.com> > Date: Tue Mar 2 15:41:50 2010 -0800 > > pidns: Support unsharing the pid namespace. > > > I haven't really gotten much farther than that yet, but the bisect was > pretty straight forward. Eric, is there anything specific I can gather > or do to help figure out why that is causing mock to get such a weird > error? I can provide the bisect log if you'd like. My best guess in some dark corner of mock has untested code to unshare a pid namespace, and that corner started doing something now that unsharing of the pid namespace actually works. If mock has called unshare(CLONE_NEWPID). And then forked a process and that process exited, and then forked anothe process that second and all subsequent fork calls will fail with -ENOMEM (because init has exited in the pid namespace). -ENOMEM will be generated because of a failure of alloc_pid. Looking at that code path a little closer that just about has to be it, because I goofed and the error path drops the lock but not irqs. The patch below should fix the nasty warning and confirm where the code is failing in copy_process. An strace to see which syscalls mock is making and with which flags would be very interesting. I am almost certain that there is a unshare(CLONE_NEWPID) somewhere in there. But in a remote corner of possibility it could weird clone flags, or something else. Beyond that I suspect we want to work with the mock folks so they get their code to use a pid namespace working the way they intended. Eric From: "Eric W. Biederman" <ebiederm@xmission.com> Date: Fri, 8 Feb 2013 12:05:54 -0800 Subject: [PATCH] pid: unlock_irq when alloc_pid fails because init has exited. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> --- kernel/pid.c | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/kernel/pid.c b/kernel/pid.c index de9af60..f2c6a68 100644 --- a/kernel/pid.c +++ b/kernel/pid.c @@ -331,7 +331,7 @@ out: return pid; out_unlock: - spin_unlock(&pidmap_lock); + spin_unlock_irq(&pidmap_lock); out_free: while (++i <= ns->level) free_pidmap(pid->numbers + i); -- 1.7.5.4 ^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: Odd ENOMEM being returned in 3.8-rcX 2013-02-08 20:13 ` Eric W. Biederman @ 2013-02-08 20:23 ` Josh Boyer 2013-02-08 20:45 ` Eric W. Biederman 2013-02-08 22:12 ` Josh Boyer 2013-02-11 23:57 ` Andrew Morton 2 siblings, 1 reply; 19+ messages in thread From: Josh Boyer @ 2013-02-08 20:23 UTC (permalink / raw) To: Eric W. Biederman Cc: Andrew Morton, Al Viro, Mel Gorman, linux-kernel, williams On Fri, Feb 08, 2013 at 12:13:09PM -0800, Eric W. Biederman wrote: > Josh Boyer <jwboyer@redhat.com> writes: > >> Right, agreed. As I said, I think that is mostly a secondary issue. > >> Hopefully it will be easy to fix once we figure out why we're getting > >> the ENOMEM error. > >> > >> Python backtrace below. Seems to be failing on forking a umount command > >> after init'ing the chroot. I can put the full output somewhere if > >> people are interested. > > > > OK. I've bisected this down to: > > > > 50804fe3737ca6a5942fdc2057a18a8141d00141 is the first bad commit > > commit 50804fe3737ca6a5942fdc2057a18a8141d00141 > > Author: Eric W. Biederman <ebiederm@xmission.com> > > Date: Tue Mar 2 15:41:50 2010 -0800 > > > > pidns: Support unsharing the pid namespace. > > > > > > I haven't really gotten much farther than that yet, but the bisect was > > pretty straight forward. Eric, is there anything specific I can gather > > or do to help figure out why that is causing mock to get such a weird > > error? I can provide the bisect log if you'd like. < Two emails fly past each other in the night > > My best guess in some dark corner of mock has untested code to unshare a > pid namespace, and that corner started doing something now that > unsharing of the pid namespace actually works. > > If mock has called unshare(CLONE_NEWPID). And then forked a process and > that process exited, and then forked anothe process that second and all > subsequent fork calls will fail with -ENOMEM (because init has exited in > the pid namespace). -ENOMEM will be generated because of a failure of > alloc_pid. > > Looking at that code path a little closer that just about has to be it, > because I goofed and the error path drops the lock but not irqs. The > patch below should fix the nasty warning and confirm where the code is > failing in copy_process. OK. I'll turn the debug option back on and give this patch a try. > An strace to see which syscalls mock is making and with which flags > would be very interesting. I am almost certain that there is a > unshare(CLONE_NEWPID) somewhere in there. But in a remote corner of > possibility it could weird clone flags, or something else. Oh, I have that but it's a python app with a helper C app and it's a... verbose strace. It's here for one failure: http://jwboyer.fedorapeople.org/pub/mock-strace Hopefully the testcase from my other email will help though. It's much simpler. > Beyond that I suspect we want to work with the mock folks so they get > their code to use a pid namespace working the way they intended. Right. CC'd Clark (for real this time). I'll let you know on the patch. josh ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: Odd ENOMEM being returned in 3.8-rcX 2013-02-08 20:23 ` Josh Boyer @ 2013-02-08 20:45 ` Eric W. Biederman 2013-02-08 21:27 ` Josh Boyer 0 siblings, 1 reply; 19+ messages in thread From: Eric W. Biederman @ 2013-02-08 20:45 UTC (permalink / raw) To: Josh Boyer; +Cc: Andrew Morton, Al Viro, Mel Gorman, linux-kernel, williams Josh Boyer <jwboyer@redhat.com> writes: > < Two emails fly past each other in the night > Yep. >> My best guess in some dark corner of mock has untested code to unshare a >> pid namespace, and that corner started doing something now that >> unsharing of the pid namespace actually works. >> >> If mock has called unshare(CLONE_NEWPID). And then forked a process and >> that process exited, and then forked anothe process that second and all >> subsequent fork calls will fail with -ENOMEM (because init has exited in >> the pid namespace). -ENOMEM will be generated because of a failure of >> alloc_pid. >> >> Looking at that code path a little closer that just about has to be it, >> because I goofed and the error path drops the lock but not irqs. The >> patch below should fix the nasty warning and confirm where the code is >> failing in copy_process. > > OK. I'll turn the debug option back on and give this patch a try. Thanks. Your minimal test case also confirms my hunch. But we should fix the error path as well. >> An strace to see which syscalls mock is making and with which flags >> would be very interesting. I am almost certain that there is a >> unshare(CLONE_NEWPID) somewhere in there. But in a remote corner of >> possibility it could weird clone flags, or something else. > > Oh, I have that but it's a python app with a helper C app and it's a... > verbose strace. It's here for one failure: > > http://jwboyer.fedorapeople.org/pub/mock-strace > > Hopefully the testcase from my other email will help though. It's much > simpler. Yes. Your other test case confirms my patch you bisected this to is working correctly. >> Beyond that I suspect we want to work with the mock folks so they get >> their code to use a pid namespace working the way they intended. > > Right. CC'd Clark (for real this time). > > I'll let you know on the patch. Cool. Looking at the strace I can't figure out what mock expected to happen or how mock was working before this. As mock is calling unshare(CLONE_NEWNS|CLONE_NEWUTS|CLONE_NEWIPC|CLONE_NEWPID) all in one go. Previous to my patch enabling CLONE_NEWPID that would cause the unshare to fail. So it looks mock is taking a buggy untested code path and things are not working as it expected. Eric ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: Odd ENOMEM being returned in 3.8-rcX 2013-02-08 20:45 ` Eric W. Biederman @ 2013-02-08 21:27 ` Josh Boyer 2013-02-08 22:05 ` Eric W. Biederman 2013-02-08 22:10 ` Clark Williams 0 siblings, 2 replies; 19+ messages in thread From: Josh Boyer @ 2013-02-08 21:27 UTC (permalink / raw) To: Eric W. Biederman Cc: Andrew Morton, Al Viro, Mel Gorman, linux-kernel, williams On Fri, Feb 08, 2013 at 12:45:47PM -0800, Eric W. Biederman wrote: > Josh Boyer <jwboyer@redhat.com> writes: > > > < Two emails fly past each other in the night > > > Yep. > > >> My best guess in some dark corner of mock has untested code to unshare a > >> pid namespace, and that corner started doing something now that > >> unsharing of the pid namespace actually works. > >> > >> If mock has called unshare(CLONE_NEWPID). And then forked a process and > >> that process exited, and then forked anothe process that second and all > >> subsequent fork calls will fail with -ENOMEM (because init has exited in > >> the pid namespace). -ENOMEM will be generated because of a failure of > >> alloc_pid. > >> > >> Looking at that code path a little closer that just about has to be it, > >> because I goofed and the error path drops the lock but not irqs. The > >> patch below should fix the nasty warning and confirm where the code is > >> failing in copy_process. > > > > OK. I'll turn the debug option back on and give this patch a try. > > Thanks. Your minimal test case also confirms my hunch. But we should > fix the error path as well. > > >> An strace to see which syscalls mock is making and with which flags > >> would be very interesting. I am almost certain that there is a > >> unshare(CLONE_NEWPID) somewhere in there. But in a remote corner of > >> possibility it could weird clone flags, or something else. > > > > Oh, I have that but it's a python app with a helper C app and it's a... > > verbose strace. It's here for one failure: > > > > http://jwboyer.fedorapeople.org/pub/mock-strace > > > > Hopefully the testcase from my other email will help though. It's much > > simpler. > > Yes. Your other test case confirms my patch you bisected this to is > working correctly. > > >> Beyond that I suspect we want to work with the mock folks so they get > >> their code to use a pid namespace working the way they intended. > > > > Right. CC'd Clark (for real this time). > > > > I'll let you know on the patch. The patch appears to work as expected. With CONFIG_DEBUG_ATOMIC_SLEEP set I don't see the backtrace error. That should go in. > Cool. Looking at the strace I can't figure out what mock expected > to happen or how mock was working before this. As mock is calling > unshare(CLONE_NEWNS|CLONE_NEWUTS|CLONE_NEWIPC|CLONE_NEWPID) all in one > go. > > Previous to my patch enabling CLONE_NEWPID that would cause the unshare > to fail. Oh. Indeed. On a kernel without the commit in question I see this from mock: DEBUG: Unsharing. Flags: 738328576 DEBUG: unshare(738328576) failed, falling back to unshare(67239936) DEBUG: Unsharing. Flags: 67239936 So it's trying with NEWPID and that fails, so it falls back to just NEWNS | NEWUTS. That explains it working. > So it looks mock is taking a buggy untested code path and things are not > working as it expected. Quite possibly, yes. I instrumented the kernel a bit and it is indeed failing in the alloc_pid call. Clark, thoughts here? josh ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: Odd ENOMEM being returned in 3.8-rcX 2013-02-08 21:27 ` Josh Boyer @ 2013-02-08 22:05 ` Eric W. Biederman 2013-02-08 22:40 ` Clark Williams 2013-02-08 22:10 ` Clark Williams 1 sibling, 1 reply; 19+ messages in thread From: Eric W. Biederman @ 2013-02-08 22:05 UTC (permalink / raw) To: Josh Boyer; +Cc: Andrew Morton, Al Viro, Mel Gorman, linux-kernel, williams Josh Boyer <jwboyer@redhat.com> writes: >> So it looks mock is taking a buggy untested code path and things are not >> working as it expected. > > Quite possibly, yes. I instrumented the kernel a bit and it is indeed > failing in the alloc_pid call. > > Clark, thoughts here? I will just add the solution is probably for mock to fork immediate after the unshare succeeds in creating a pid namespace. With the original process waiting for mock to exit and the child process doing everything that mock does now. That will allow mock to act as the init process in the pid namespace it just created. Eric ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: Odd ENOMEM being returned in 3.8-rcX 2013-02-08 22:05 ` Eric W. Biederman @ 2013-02-08 22:40 ` Clark Williams 0 siblings, 0 replies; 19+ messages in thread From: Clark Williams @ 2013-02-08 22:40 UTC (permalink / raw) To: Eric W. Biederman Cc: Josh Boyer, Andrew Morton, Al Viro, Mel Gorman, linux-kernel [-- Attachment #1: Type: text/plain, Size: 1405 bytes --] On Fri, 08 Feb 2013 14:05:55 -0800 ebiederm@xmission.com (Eric W. Biederman) wrote: > Josh Boyer <jwboyer@redhat.com> writes: > > >> So it looks mock is taking a buggy untested code path and things are not > >> working as it expected. > > > > Quite possibly, yes. I instrumented the kernel a bit and it is indeed > > failing in the alloc_pid call. > > > > Clark, thoughts here? > > I will just add the solution is probably for mock to fork immediate > after the unshare succeeds in creating a pid namespace. With the > original process waiting for mock to exit and the child process > doing everything that mock does now. > > That will allow mock to act as the init process in the pid namespace it > just created. > > Eric > Well, mock is not really setup to act as an init (i.e. reap all the child processes in the new namespace). Not really sure that's what I want anyway, verses just nuking CLONE_NEWPID and running in the same pidspace. I added some debugging output in the code around the unshare() calls and so far I can't seem to get unshare() to succeed when NEWPID is one of the flags. I'll admit that I'm running a realtime kernel (3.6.11-rt28 with CONFIG_PID_NS=y) so I'll boot into the latest F18 and make sure of this, but I'm leaning towards just removing NEWPID, since we weren't using it right and I'm not convinced that we need it. Clark [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 198 bytes --] ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: Odd ENOMEM being returned in 3.8-rcX 2013-02-08 21:27 ` Josh Boyer 2013-02-08 22:05 ` Eric W. Biederman @ 2013-02-08 22:10 ` Clark Williams 2013-02-08 22:40 ` Eric W. Biederman 1 sibling, 1 reply; 19+ messages in thread From: Clark Williams @ 2013-02-08 22:10 UTC (permalink / raw) To: Josh Boyer Cc: Eric W. Biederman, Andrew Morton, Al Viro, Mel Gorman, linux-kernel [-- Attachment #1: Type: text/plain, Size: 3885 bytes --] On Fri, 8 Feb 2013 16:27:26 -0500 Josh Boyer <jwboyer@redhat.com> wrote: > On Fri, Feb 08, 2013 at 12:45:47PM -0800, Eric W. Biederman wrote: > > Josh Boyer <jwboyer@redhat.com> writes: > > > > > < Two emails fly past each other in the night > > > > > Yep. > > > > >> My best guess in some dark corner of mock has untested code to unshare a > > >> pid namespace, and that corner started doing something now that > > >> unsharing of the pid namespace actually works. > > >> > > >> If mock has called unshare(CLONE_NEWPID). And then forked a process and > > >> that process exited, and then forked anothe process that second and all > > >> subsequent fork calls will fail with -ENOMEM (because init has exited in > > >> the pid namespace). -ENOMEM will be generated because of a failure of > > >> alloc_pid. > > >> > > >> Looking at that code path a little closer that just about has to be it, > > >> because I goofed and the error path drops the lock but not irqs. The > > >> patch below should fix the nasty warning and confirm where the code is > > >> failing in copy_process. > > > > > > OK. I'll turn the debug option back on and give this patch a try. > > > > Thanks. Your minimal test case also confirms my hunch. But we should > > fix the error path as well. > > > > >> An strace to see which syscalls mock is making and with which flags > > >> would be very interesting. I am almost certain that there is a > > >> unshare(CLONE_NEWPID) somewhere in there. But in a remote corner of > > >> possibility it could weird clone flags, or something else. > > > > > > Oh, I have that but it's a python app with a helper C app and it's a... > > > verbose strace. It's here for one failure: > > > > > > http://jwboyer.fedorapeople.org/pub/mock-strace > > > > > > Hopefully the testcase from my other email will help though. It's much > > > simpler. > > > > Yes. Your other test case confirms my patch you bisected this to is > > working correctly. > > > > >> Beyond that I suspect we want to work with the mock folks so they get > > >> their code to use a pid namespace working the way they intended. > > > > > > Right. CC'd Clark (for real this time). > > > > > > I'll let you know on the patch. > > The patch appears to work as expected. With CONFIG_DEBUG_ATOMIC_SLEEP > set I don't see the backtrace error. That should go in. > > > Cool. Looking at the strace I can't figure out what mock expected > > to happen or how mock was working before this. As mock is calling > > unshare(CLONE_NEWNS|CLONE_NEWUTS|CLONE_NEWIPC|CLONE_NEWPID) all in one > > go. > > > > Previous to my patch enabling CLONE_NEWPID that would cause the unshare > > to fail. > > Oh. Indeed. On a kernel without the commit in question I see this from > mock: > > DEBUG: Unsharing. Flags: 738328576 > DEBUG: unshare(738328576) failed, falling back to unshare(67239936) > DEBUG: Unsharing. Flags: 67239936 > > So it's trying with NEWPID and that fails, so it falls back to just > NEWNS | NEWUTS. That explains it working. > > > So it looks mock is taking a buggy untested code path and things are not > > working as it expected. > > Quite possibly, yes. I instrumented the kernel a bit and it is indeed > failing in the alloc_pid call. > > Clark, thoughts here? > > josh The more I look at that the more I think I should nuke CLONE_NEWPID in mock. It came in with a commit that added NEWIPC, which I think is valid for mock managing a chroot, but we're not looking to do full-up containers at this point and it looks like containers is the only place you'd want to start a new set of pids. I'll do that and run some tests to make sure I'm not depending on something there (doubtful since it looks like it never really worked and that we've always been using the base flags). Clark [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 198 bytes --] ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: Odd ENOMEM being returned in 3.8-rcX 2013-02-08 22:10 ` Clark Williams @ 2013-02-08 22:40 ` Eric W. Biederman 2013-02-08 22:56 ` Clark Williams 0 siblings, 1 reply; 19+ messages in thread From: Eric W. Biederman @ 2013-02-08 22:40 UTC (permalink / raw) To: Clark Williams Cc: Josh Boyer, Andrew Morton, Al Viro, Mel Gorman, linux-kernel Clark Williams <williams@redhat.com> writes: > The more I look at that the more I think I should nuke CLONE_NEWPID in > mock. It came in with a commit that added NEWIPC, which I think is valid > for mock managing a chroot, but we're not looking to do full-up > containers at this point and it looks like containers is the only place > you'd want to start a new set of pids. Just taking the code out seems reasonable. Howerver there is a practical use for a pid namespace in a setup like mock. A pid namespace makes it so your sub processes can not reparent and get away from you, which could be handy in case someone starts a system daemon in a post install script. Eric ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: Odd ENOMEM being returned in 3.8-rcX 2013-02-08 22:40 ` Eric W. Biederman @ 2013-02-08 22:56 ` Clark Williams 0 siblings, 0 replies; 19+ messages in thread From: Clark Williams @ 2013-02-08 22:56 UTC (permalink / raw) To: Eric W. Biederman Cc: Josh Boyer, Andrew Morton, Al Viro, Mel Gorman, linux-kernel [-- Attachment #1: Type: text/plain, Size: 1317 bytes --] On Fri, 08 Feb 2013 14:40:13 -0800 ebiederm@xmission.com (Eric W. Biederman) wrote: > Clark Williams <williams@redhat.com> writes: > > > The more I look at that the more I think I should nuke CLONE_NEWPID in > > mock. It came in with a commit that added NEWIPC, which I think is valid > > for mock managing a chroot, but we're not looking to do full-up > > containers at this point and it looks like containers is the only place > > you'd want to start a new set of pids. > > Just taking the code out seems reasonable. Howerver there is a > practical use for a pid namespace in a setup like mock. A pid namespace > makes it so your sub processes can not reparent and get away from you, > which could be handy in case someone starts a system daemon in a post > install script. > Ok, I *think* I'm up to speed now (I'm old and slow so gimme a break). Unsharing pidns only works after your commit in 3.8; that's why my unshare was always failing. Does it make sense for me to make an additional unshare() call with just NEWPID as an argument? That is, call unshare with the NEWNS, NEWIPC, and NEWUTS flags, then when that succeeds, try NEWPID. If the NEWPID call succeeds, do: pid = os.fork() if pid: os.waitpid(pid, 0) So that the child continues on? Clark [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 198 bytes --] ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: Odd ENOMEM being returned in 3.8-rcX 2013-02-08 20:13 ` Eric W. Biederman 2013-02-08 20:23 ` Josh Boyer @ 2013-02-08 22:12 ` Josh Boyer 2013-02-11 23:57 ` Andrew Morton 2 siblings, 0 replies; 19+ messages in thread From: Josh Boyer @ 2013-02-08 22:12 UTC (permalink / raw) To: Eric W. Biederman; +Cc: Andrew Morton, Al Viro, Mel Gorman, linux-kernel On Fri, Feb 08, 2013 at 12:13:09PM -0800, Eric W. Biederman wrote: For posterity's sake > From: "Eric W. Biederman" <ebiederm@xmission.com> > Date: Fri, 8 Feb 2013 12:05:54 -0800 > Subject: [PATCH] pid: unlock_irq when alloc_pid fails because init has > exited. > > Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Tested-by: Josh Boyer <jwboyer@redhat.com> > --- > kernel/pid.c | 2 +- > 1 files changed, 1 insertions(+), 1 deletions(-) > > diff --git a/kernel/pid.c b/kernel/pid.c > index de9af60..f2c6a68 100644 > --- a/kernel/pid.c > +++ b/kernel/pid.c > @@ -331,7 +331,7 @@ out: > return pid; > > out_unlock: > - spin_unlock(&pidmap_lock); > + spin_unlock_irq(&pidmap_lock); > out_free: > while (++i <= ns->level) > free_pidmap(pid->numbers + i); > -- > 1.7.5.4 ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: Odd ENOMEM being returned in 3.8-rcX 2013-02-08 20:13 ` Eric W. Biederman 2013-02-08 20:23 ` Josh Boyer 2013-02-08 22:12 ` Josh Boyer @ 2013-02-11 23:57 ` Andrew Morton 2013-02-12 10:34 ` Eric W. Biederman 2 siblings, 1 reply; 19+ messages in thread From: Andrew Morton @ 2013-02-11 23:57 UTC (permalink / raw) To: Eric W. Biederman; +Cc: Josh Boyer, Al Viro, Mel Gorman, linux-kernel On Fri, 08 Feb 2013 12:13:09 -0800 ebiederm@xmission.com (Eric W. Biederman) wrote: > If mock has called unshare(CLONE_NEWPID). And then forked a process and > that process exited, and then forked anothe process that second and all > subsequent fork calls will fail with -ENOMEM (because init has exited in > the pid namespace). -ENOMEM will be generated because of a failure of > alloc_pid. Can we please fix this? The system is *not* out of memory and it's wildly misleading to report this to userspace. If alloc_pid() can fail for multiple reasons then it should be returning an ERR_PTR on failure, not NULL. ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: Odd ENOMEM being returned in 3.8-rcX 2013-02-11 23:57 ` Andrew Morton @ 2013-02-12 10:34 ` Eric W. Biederman 0 siblings, 0 replies; 19+ messages in thread From: Eric W. Biederman @ 2013-02-12 10:34 UTC (permalink / raw) To: Andrew Morton; +Cc: Josh Boyer, Al Viro, Mel Gorman, linux-kernel Andrew Morton <akpm@linux-foundation.org> writes: > On Fri, 08 Feb 2013 12:13:09 -0800 > ebiederm@xmission.com (Eric W. Biederman) wrote: > >> If mock has called unshare(CLONE_NEWPID). And then forked a process and >> that process exited, and then forked anothe process that second and all >> subsequent fork calls will fail with -ENOMEM (because init has exited in >> the pid namespace). -ENOMEM will be generated because of a failure of >> alloc_pid. > > Can we please fix this? The system is *not* out of memory and it's > wildly misleading to report this to userspace. > > If alloc_pid() can fail for multiple reasons then it should be > returning an ERR_PTR on failure, not NULL. We might be able to fix this. alloc_pid causing fork to fail with ENOMEM when there are simply not enough pids available to use is a long standing failure mode, and error code. I believe to change this responsibily you would have read a lot of programs to see if fork failing with an additional error code would be handled or if things would break in subtle ways. There would need to be research to see what posix says about this, and the posix view on fork would have any impact on this in any way. Furthermore this error is a resource shortage, so ENOMEM isn't even exactly wrong. I can understand the problem but I totally don't care enough to push an ABI change like that through. Eric ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: Odd ENOMEM being returned in 3.8-rcX 2013-02-08 18:19 ` Josh Boyer 2013-02-08 20:13 ` Eric W. Biederman @ 2013-02-08 20:18 ` Josh Boyer 2013-02-08 20:36 ` Eric W. Biederman 1 sibling, 1 reply; 19+ messages in thread From: Josh Boyer @ 2013-02-08 20:18 UTC (permalink / raw) To: Andrew Morton, Eric W. Biederman; +Cc: Al Viro, Mel Gorman, linux-kernel On Fri, Feb 08, 2013 at 01:19:49PM -0500, Josh Boyer wrote: > On Thu, Feb 07, 2013 at 07:35:01PM -0500, Josh Boyer wrote: > > On Thu, Feb 07, 2013 at 02:15:02PM -0800, Andrew Morton wrote: > > > On Thu, 7 Feb 2013 16:57:42 -0500 > > > Josh Boyer <jwboyer@redhat.com> wrote: > > > > > > > Hi All, > > > > > > > > We've hit a weird error in Fedora using the 3.8-rcX kernels. It seems > > > > the mock tool is getting back ENOMEM when doing very simple things that > > > > normally just work. The 3.7 kernels on the same userspace work just > > > > fine. It seems just running 'mock init -v' is enough to cause the > > > > failure. > > > > > > I assume you're not seeing the "page allocation failure" message and > > > backtrace. This means that either > > > > Right. If I disable our debug options, I see no backtraces at all and > > the python app still gets ENOMEM returned. (See below for those > > interested). > > > > > a) it's a __GFP_NOWARN callsite. This is rare. Or > > > > > > b) it's actually a different error but someone went and overwrote a > > > callee's return value with -ENOMEM. We do this a lot and it sucks. > > > > We do it in copy_io :\. > > > > > > At first glance it seems copy_io is failing (possibly because > > > > get_task_io_context fails), and then the above fallout is printed. The > > > > warning seems fairly valid, but I don't think that is the root of the > > > > problem. > > > > > > yes, get_task_io_context() might be the place. Tried adding a few > > > error-path printks in there to see what's happening? > > > > Yeah, that's my next step. I guess I know what I'll be doing tomorrow. > > > > > I can't see anything around there which leaves interrupts disabled > > > though. It's quite likely that there's some code with is forgetting to > > > reenable interrupts on a rarely-tested error path, and that ENOMEM is > > > tickling the bug. > > > > Right, agreed. As I said, I think that is mostly a secondary issue. > > Hopefully it will be easy to fix once we figure out why we're getting > > the ENOMEM error. > > > > Python backtrace below. Seems to be failing on forking a umount command > > after init'ing the chroot. I can put the full output somewhere if > > people are interested. > > OK. I've bisected this down to: > > 50804fe3737ca6a5942fdc2057a18a8141d00141 is the first bad commit > commit 50804fe3737ca6a5942fdc2057a18a8141d00141 > Author: Eric W. Biederman <ebiederm@xmission.com> > Date: Tue Mar 2 15:41:50 2010 -0800 > > pidns: Support unsharing the pid namespace. > > > I haven't really gotten much farther than that yet, but the bisect was > pretty straight forward. Eric, is there anything specific I can gather > or do to help figure out why that is causing mock to get such a weird > error? I can provide the bisect log if you'd like. I took a look at what mock was doing and it was mostly very simple stuff. The two exceptions were that it was calling unshare, then doing some file checks and I/O, and then calling fork to exec off some helper things. Up until the point it fails, the forks work and the children go do whatever it is they were supposed to do. I've CC'd Clark Williams just in case people have questions on mock itself, but I'm not sure that will be needed. I have a very simple testcase (warning, ugly code) that hits this issue now. Build with "gcc -D_GNU_SOURCE -g unshare.c -o unshare" and then just run it as "sudo ./unshare 10". The sudo is needed for the unshare call to work. I get the following output on kernels that have the above commit: [jwboyer@vader ~]$ sudo ./unshare 10 Calling unshare 738328576 Forked 6684 Forked 6685 Fork failed: Cannot allocate memory Fork failed: Cannot allocate memory Fork failed: Cannot allocate memory Fork failed: Cannot allocate memory Fork failed: Cannot allocate memory Fork failed: Cannot allocate memory Fork failed: Cannot allocate memory Fork failed: Cannot allocate memory [jwboyer@vader ~]$ which is consistent with what mock is seeing. If I comment out the call to unshare, it seems to always work. It seems to consistently fail with ENOMEM after the first 3-5 forked children, but it varies within that range. On a kernel without the above commit, this works every time. I've tried variations of 10, 100, and 10,000 iterations successfully with and without the unshare call. Hopefully this helps. Testcase below. josh #include <stdio.h> #include <stdlib.h> #include <string.h> #include <unistd.h> #include <sys/types.h> #include <sys/stat.h> #include <fcntl.h> #include <sched.h> int main(int argc, char **argv) { int i, flags = CLONE_NEWNS | CLONE_NEWUTS | CLONE_NEWPID | CLONE_NEWIPC; int fd; pid_t pid; fd = open("./wtf-unshare", O_CREAT|O_RDWR, 0666); write(fd, "blah", 4); printf("Calling unshare %d\n", flags); unshare(flags); for (i = 0; i < atoi(argv[1]); i++) { pid = fork(); if (pid == 0) exit(0); else if (pid == -1) perror("Fork failed"); else printf("Forked %d\n", pid); } return 0; } ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: Odd ENOMEM being returned in 3.8-rcX 2013-02-08 20:18 ` Josh Boyer @ 2013-02-08 20:36 ` Eric W. Biederman 2013-02-08 20:40 ` Josh Boyer 0 siblings, 1 reply; 19+ messages in thread From: Eric W. Biederman @ 2013-02-08 20:36 UTC (permalink / raw) To: Josh Boyer; +Cc: Andrew Morton, Al Viro, Mel Gorman, linux-kernel Josh Boyer <jwboyer@redhat.com> writes: > On Fri, Feb 08, 2013 at 01:19:49PM -0500, Josh Boyer wrote: >> On Thu, Feb 07, 2013 at 07:35:01PM -0500, Josh Boyer wrote: >> > On Thu, Feb 07, 2013 at 02:15:02PM -0800, Andrew Morton wrote: >> > > On Thu, 7 Feb 2013 16:57:42 -0500 >> > > Josh Boyer <jwboyer@redhat.com> wrote: >> > > >> > > > Hi All, >> > > > >> > > > We've hit a weird error in Fedora using the 3.8-rcX kernels. It seems >> > > > the mock tool is getting back ENOMEM when doing very simple things that >> > > > normally just work. The 3.7 kernels on the same userspace work just >> > > > fine. It seems just running 'mock init -v' is enough to cause the >> > > > failure. >> > > >> > > I assume you're not seeing the "page allocation failure" message and >> > > backtrace. This means that either >> > >> > Right. If I disable our debug options, I see no backtraces at all and >> > the python app still gets ENOMEM returned. (See below for those >> > interested). >> > >> > > a) it's a __GFP_NOWARN callsite. This is rare. Or >> > > >> > > b) it's actually a different error but someone went and overwrote a >> > > callee's return value with -ENOMEM. We do this a lot and it sucks. >> > >> > We do it in copy_io :\. >> > >> > > > At first glance it seems copy_io is failing (possibly because >> > > > get_task_io_context fails), and then the above fallout is printed. The >> > > > warning seems fairly valid, but I don't think that is the root of the >> > > > problem. >> > > >> > > yes, get_task_io_context() might be the place. Tried adding a few >> > > error-path printks in there to see what's happening? >> > >> > Yeah, that's my next step. I guess I know what I'll be doing tomorrow. >> > >> > > I can't see anything around there which leaves interrupts disabled >> > > though. It's quite likely that there's some code with is forgetting to >> > > reenable interrupts on a rarely-tested error path, and that ENOMEM is >> > > tickling the bug. >> > >> > Right, agreed. As I said, I think that is mostly a secondary issue. >> > Hopefully it will be easy to fix once we figure out why we're getting >> > the ENOMEM error. >> > >> > Python backtrace below. Seems to be failing on forking a umount command >> > after init'ing the chroot. I can put the full output somewhere if >> > people are interested. >> >> OK. I've bisected this down to: >> >> 50804fe3737ca6a5942fdc2057a18a8141d00141 is the first bad commit >> commit 50804fe3737ca6a5942fdc2057a18a8141d00141 >> Author: Eric W. Biederman <ebiederm@xmission.com> >> Date: Tue Mar 2 15:41:50 2010 -0800 >> >> pidns: Support unsharing the pid namespace. >> >> >> I haven't really gotten much farther than that yet, but the bisect was >> pretty straight forward. Eric, is there anything specific I can gather >> or do to help figure out why that is causing mock to get such a weird >> error? I can provide the bisect log if you'd like. > > I took a look at what mock was doing and it was mostly very simple > stuff. The two exceptions were that it was calling unshare, then doing > some file checks and I/O, and then calling fork to exec off some helper > things. Up until the point it fails, the forks work and the children go > do whatever it is they were supposed to do. I've CC'd Clark Williams > just in case people have questions on mock itself, but I'm not sure that > will be needed. Our emails crossed paths. You have just confirmed my suspicion about what was going wrong. The practical question is why mock is calling unshare(CLONE_NEWPID) because it clearly seems not to understand how to unshare the pid namespace and use it that way. Except for forgeting to reenable irqs in the failure path of alloc_pid the behavior is exactly correct and is how the pid namespace is designed to behave in the case of unshare. > which is consistent with what mock is seeing. If I comment out the call > to unshare, it seems to always work. It seems to consistently fail with > ENOMEM after the first 3-5 forked children, but it varies within that > range. If you add a waitpid or space out your forks you will see that it always fails after your first child in the pid namespace has exited. We don't allow children in a pid namespace after fork has exited. Eric ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: Odd ENOMEM being returned in 3.8-rcX 2013-02-08 20:36 ` Eric W. Biederman @ 2013-02-08 20:40 ` Josh Boyer 0 siblings, 0 replies; 19+ messages in thread From: Josh Boyer @ 2013-02-08 20:40 UTC (permalink / raw) To: Eric W. Biederman Cc: Andrew Morton, Al Viro, Mel Gorman, linux-kernel, williams On Fri, Feb 08, 2013 at 12:36:08PM -0800, Eric W. Biederman wrote: > Josh Boyer <jwboyer@redhat.com> writes: > >> OK. I've bisected this down to: > >> > >> 50804fe3737ca6a5942fdc2057a18a8141d00141 is the first bad commit > >> commit 50804fe3737ca6a5942fdc2057a18a8141d00141 > >> Author: Eric W. Biederman <ebiederm@xmission.com> > >> Date: Tue Mar 2 15:41:50 2010 -0800 > >> > >> pidns: Support unsharing the pid namespace. > >> > >> > >> I haven't really gotten much farther than that yet, but the bisect was > >> pretty straight forward. Eric, is there anything specific I can gather > >> or do to help figure out why that is causing mock to get such a weird > >> error? I can provide the bisect log if you'd like. > > > > I took a look at what mock was doing and it was mostly very simple > > stuff. The two exceptions were that it was calling unshare, then doing > > some file checks and I/O, and then calling fork to exec off some helper > > things. Up until the point it fails, the forks work and the children go > > do whatever it is they were supposed to do. I've CC'd Clark Williams > > just in case people have questions on mock itself, but I'm not sure that > > will be needed. > > Our emails crossed paths. You have just confirmed my suspicion about > what was going wrong. > > The practical question is why mock is calling unshare(CLONE_NEWPID) > because it clearly seems not to understand how to unshare the pid > namespace and use it that way. I don't know either, and I've CC'd Clark on this thread now (again for real this time). > Except for forgeting to reenable irqs in the failure path of alloc_pid > the behavior is exactly correct and is how the pid namespace is designed > to behave in the case of unshare. Well... ok. Personally, I'm confused why mock was working before that commit and now it doesn't. Were we just buggy in the kernel prior to that? > > which is consistent with what mock is seeing. If I comment out the call > > to unshare, it seems to always work. It seems to consistently fail with > > ENOMEM after the first 3-5 forked children, but it varies within that > > range. > > If you add a waitpid or space out your forks you will see that it always > fails after your first child in the pid namespace has exited. > > We don't allow children in a pid namespace after fork has exited. Hm. Mock seems to make it a ways along as well though, and it isn't doing rapid fire fork/exec. I believe it's forking helpers and waiting for them to return before continuing on. josh ^ permalink raw reply [flat|nested] 19+ messages in thread
end of thread, other threads:[~2013-02-12 10:34 UTC | newest] Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2013-02-07 21:57 Odd ENOMEM being returned in 3.8-rcX Josh Boyer 2013-02-07 22:15 ` Andrew Morton 2013-02-08 0:35 ` Josh Boyer 2013-02-08 18:19 ` Josh Boyer 2013-02-08 20:13 ` Eric W. Biederman 2013-02-08 20:23 ` Josh Boyer 2013-02-08 20:45 ` Eric W. Biederman 2013-02-08 21:27 ` Josh Boyer 2013-02-08 22:05 ` Eric W. Biederman 2013-02-08 22:40 ` Clark Williams 2013-02-08 22:10 ` Clark Williams 2013-02-08 22:40 ` Eric W. Biederman 2013-02-08 22:56 ` Clark Williams 2013-02-08 22:12 ` Josh Boyer 2013-02-11 23:57 ` Andrew Morton 2013-02-12 10:34 ` Eric W. Biederman 2013-02-08 20:18 ` Josh Boyer 2013-02-08 20:36 ` Eric W. Biederman 2013-02-08 20:40 ` Josh Boyer
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).