linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Frederic Weisbecker <frederic@kernel.org>
To: Oleg Nesterov <oleg@redhat.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>,
	"Paul E . McKenney" <paulmck@kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Neeraj Upadhyay <quic_neeraju@quicinc.com>,
	Pengfei Xu <pengfei.xu@intel.com>,
	Boqun Feng <boqun.feng@gmail.com>,
	Lai Jiangshan <jiangshanlai@gmail.com>,
	rcu@vger.kernel.org
Subject: Re: [PATCH 3/3] rcu-tasks: Fix synchronize_rcu_tasks() VS zap_pid_ns_processes()
Date: Fri, 9 Dec 2022 21:26:56 +0100	[thread overview]
Message-ID: <20221209202656.GA1865787@lothringen> (raw)
In-Reply-To: <20221207203859.GD5421@redhat.com>

On Wed, Dec 07, 2022 at 09:39:00PM +0100, Oleg Nesterov wrote:
> On 12/07, Frederic Weisbecker wrote:
> >
> > On Tue, Dec 06, 2022 at 05:49:28PM +0100, Oleg Nesterov wrote:
> > >
> > > At least I think it should not wait for the tasks injected into this ns.
> > >
> > > Because this looks like a kernel bug even if we forget about this deadlock.
> > >
> > I think this was made that way on purpose,
> 
> Well maybe. But to me we have this behaviour only because we (me at least)
> do not know how to avoid the "hang" in this case.
> 
> > see the comment in zap_pid_ns_processes():
> 
> Heh ;) I wrote this comment in a53b83154914 ("exit: pidns: fix/update the
> comments in zap_pid_ns_processes()") exactly because I didn't like this
> behaviour, but I thought it must be documented.

Bah! I should have guessed ;-)

> 
> > I can't say I like the fact that a parent not belonging to a new namespace
> > can create more than one child within that namespace
> 
> not sure I understand but this looks fine and useful to me,

I mean if only one task could be injected within a new namespace, we could be sure
that all subsequent tasks belonging to that namespace would be descendents of
that first task (the same way that every task in the default namespace is a
descendant of the real init_task) and thus we wouldn't be bothered with such
deadlocks. But I guess namespaces aren't designed to work like that. I don't
know much about them so what I'm saying is very likely irrelevant.

> > but anyway this all look like an ABI that can't be reverted now.
> 
> perhaps... But you know, I wrote my previous email because 2 weeks ago
> I had to investigate a bug report which blamed the kernel, while the
> problem (unkillable process sleeping in zap_pid_ns_processes) was caused
> by the dangling zombie injected into that process's namespace. And I am
> still trying to convince the customer they need to fix userspace.

Heh :-/

I wish we could fix this but I have no idea how. I guess the child_reaper
of an ns could avoid waiting for the rest of the ns and designate its parent
as the new child reaper.

Or we could arrange for all tasks in the ns to autoreap if they ever fall back
to be reaped by their ns->child_reaper and that child_reaper is dead.

But that would look like ABI breakages...

Thanks.

  reply	other threads:[~2022-12-09 20:27 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-11-25 13:54 [PATCH 0/3] rcu-tasks: Fix race against exiting pid_ns Frederic Weisbecker
2022-11-25 13:54 ` [PATCH 1/3] rcu-tasks: Improve comments explaining tasks_rcu_exit_srcu purpose Frederic Weisbecker
2022-11-25 13:54 ` [PATCH 2/3] rcu-tasks: Remove preemption disablement around srcu_read_[un]lock() calls Frederic Weisbecker
2022-11-25 13:55 ` [PATCH 3/3] rcu-tasks: Fix synchronize_rcu_tasks() VS zap_pid_ns_processes() Frederic Weisbecker
2022-11-30 18:37   ` Eric W. Biederman
2022-12-02 19:51     ` Paul E. McKenney
2022-12-02 22:54     ` Frederic Weisbecker
2022-12-02 23:28       ` Eric W. Biederman
2022-12-04  0:03         ` Frederic Weisbecker
2022-12-06 16:49     ` Oleg Nesterov
2022-12-07 14:34       ` Paul E. McKenney
2022-12-07 20:01       ` Frederic Weisbecker
2022-12-07 20:39         ` Oleg Nesterov
2022-12-09 20:26           ` Frederic Weisbecker [this message]
2022-11-29  0:22 ` [PATCH 0/3] rcu-tasks: Fix race against exiting pid_ns Paul E. McKenney
2022-11-29  9:55   ` Frederic Weisbecker
2022-11-29 14:48     ` Paul E. McKenney
2022-12-02 22:55       ` Frederic Weisbecker

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20221209202656.GA1865787@lothringen \
    --to=frederic@kernel.org \
    --cc=boqun.feng@gmail.com \
    --cc=ebiederm@xmission.com \
    --cc=jiangshanlai@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=oleg@redhat.com \
    --cc=paulmck@kernel.org \
    --cc=pengfei.xu@intel.com \
    --cc=quic_neeraju@quicinc.com \
    --cc=rcu@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).