From: ebiederm@xmission.com (Eric W. Biederman)
To: Mike Galbraith <efault@gmx.de>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Oleg Nesterov <oleg@redhat.com>,
	LKML <linux-kernel@vger.kernel.org>,
	Pavel Emelyanov <xemul@parallels.com>,
	Cyrill Gorcunov <gorcunov@openvz.org>,
	Louis Rilling <louis.rilling@kerlabs.com>
Subject: Re: [PATCH]  Re: [RFC PATCH] namespaces: fix leak on fork() failure
Date: Fri, 04 May 2012 07:13:57 -0700
Message-ID: <m1vckcdoey.fsf@fess.ebiederm.org>
In-Reply-To: <1336124716.25479.36.camel@marge.simpson.net> (Mike Galbraith's message of "Fri, 04 May 2012 11:45:16 +0200")

Mike Galbraith <efault@gmx.de> writes:

> On Fri, 2012-05-04 at 00:55 -0700, Eric W. Biederman wrote:
>
>> CLONE_NEWUSER?  I presume you have applied my latest user namespace
>> patches?  Otherwise you are running completely half baked code.
>
> I Removed CLONE_NEWUSER flag.
>
>> hackbench?  Which kernel are you running.  Hackbench in some kernels is
>> really good at triggering cache ping-pong effects with pids, and creds.
>
> Not when pinned.  3.0 kernel without the debug stuff enabled in 3.4.git.
> 
> marge:/usr/local/tmp/starvation # taskset -c 3 ./hackbench
> Running with 10*40 (== 400) tasks.
> Time: 0.868
> marge:/usr/local/tmp/starvation # taskset -c 3 ./hackbench -namespace
> Running with 10*40 (== 400) tasks.
> Time: 7.582
> marge:/usr/local/tmp/starvation # taskset -c 3 ./hackbench -namespace -all
> Running with 10*40 (== 400) tasks.
> Time: 29.677

Interesting.  I guess what truly puzzles me is what serializes all of
the processes.  Even synchronize_rcu should sleep and thus let other
synchronize_rcu calls run in parallel.

Did you have HZ=100 in that kernel?  400 tasks at 100Hz all serialized
somehow and then doing synchronize_rcu at a jiffy each would account
for 4 seconds.  And the nsproxy certainly has a synchronize_rcu call.
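
To make the arithmetic behind that 4 second figure explicit, here is a minimal sketch. It assumes HZ=100 and that each synchronize_rcu wait costs exactly one jiffy, which are the guesses from the paragraph above and not measurements. The function names are made up for illustration and do not correspond to anything in the kernel.

```python
# Back-of-the-envelope model only. HZ=100 and "one jiffy per
# synchronize_rcu wait" are assumptions taken from the discussion,
# not measured values. All names here are hypothetical.

def jiffy_seconds(hz: int) -> float:
    """Length of one timer tick in seconds."""
    return 1.0 / hz


def serialized_wait(tasks: int, hz: int, jiffies_per_wait: int = 1) -> float:
    """Total wall time if every task's wait runs strictly one after another."""
    return tasks * jiffies_per_wait * jiffy_seconds(hz)


def overlapped_wait(tasks: int, hz: int, jiffies_per_wait: int = 1) -> float:
    """Total wall time in the ideal case where all waits sleep concurrently."""
    return jiffies_per_wait * jiffy_seconds(hz) if tasks else 0.0


if __name__ == "__main__":
    print(f"serialized: {serialized_wait(400, 100):.2f} s")
    print(f"overlapped: {overlapped_wait(400, 100):.2f} s")
```

With 400 tasks at HZ=100 the serialized case comes to 4 seconds, while the overlapped case stays at a single jiffy. The gap between the two is why the serialization itself, rather than the cost of one wait, is the puzzling part.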

The network namespace is comparatively heavyweight, at least in the
amount of code and other things it has to go through, so that would be
my prime suspect for those 29 seconds.  There are 2-4 synchronize_rcu
calls needed to put the loopback device.  Still, we use
synchronize_rcu_expedited, that work should be out of line, and all of
those calls should batch.

Mike, is this something you are looking at pursuing further?

I want to guess the serialization comes from waiting on children to be
reaped, but the namespaces are all cleaned up in exit_notify() called
from do_exit(), so that theory doesn't hold water.  The worst case
I can see is detach_pid from exit_signal running under the task list
lock, but nothing sleeps under that lock.  :(

So I am very puzzled why the code serializes itself in a way that leads
to those long delays.  Shrug.

Eric
