From: Tycho Andersen <tycho@tycho.pizza>
To: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: "Serge E. Hallyn" <serge@hallyn.com>,
	Miklos Szeredi <miklos@szeredi.hu>,
	linux-kernel@vger.kernel.org, Oleg Nesterov <oleg@redhat.com>
Subject: Re: [PATCH] sched: __fatal_signal_pending() should also check PF_EXITING
Date: Wed, 27 Jul 2022 11:55:43 -0600
Message-ID: <YuF8H3ZVNugbLtFC@tycho.pizza>
In-Reply-To: <871qu6bjp3.fsf@email.froward.int.ebiederm.org>

On Wed, Jul 27, 2022 at 11:32:08AM -0500, Eric W. Biederman wrote:
> Tycho Andersen <tycho@tycho.pizza> writes:
> 
> > Hi all,
> >
> > On Wed, Jul 20, 2022 at 08:54:59PM -0500, Serge E. Hallyn wrote:
> >> Oh - I didn't either - checking the sigkill in shared signals *seems*
> >> legit if they can be put there - but since you posted the new patch I
> >> assumed his reasoning was clear to you.  I know Eric's busy, cc:ing Oleg
> >> for his interpretation too.
> >
> > Any thoughts on this?
> 
> Having __fatal_signal_pending check SIGKILL in shared signals is
> completely and utterly wrong.
> 
> What __fatal_signal_pending reports is if a signal has gone through
> short circuit delivery after determining that the delivery of the
> signal will terminate the process.

This short-circuiting you're talking about happens in __send_signal()?
The problem here is that __send_signal() will add things to the shared
queue:

    pending = (type != PIDTYPE_PID) ? &t->signal->shared_pending : &t->pending;

and indeed we add it to the shared set because of the way
zap_pid_ns_processes() calls it:

    group_send_sig_info(SIGKILL, SEND_SIG_PRIV, task, PIDTYPE_MAX);

> Using "sigismember(&tsk->pending.signal, SIGKILL)" to report that a
> fatal signal has experienced short circuit delivery is a bit of an
> abuse, but essentially harmless as tkill of SIGKILL to a thread will
> result in every thread in the process experiencing short circuit
> delivery of the fatal SIGKILL.  So a pending SIGKILL can't really mean
> anything else.

This is the part I don't follow. If it's ok to send a signal to this
set, why is it not ok to also look there (other than that it was a
slight hack in the first place)? Maybe it will short circuit
more threads, but that seems ok.
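
To make the mismatch concrete, here is a rough userspace model of what I
mean. It is not kernel code: every model_* name is made up for
illustration, the task is treated as single-threaded, and the rule that
short-circuit delivery is skipped for a task that is already exiting is
an assumption taken from this discussion, not something the model proves.

```c
/* Userspace model only; none of these names are the kernel's own. */
#include <stdbool.h>

#define MODEL_SIGKILL 9

struct model_task {
	unsigned long pending;          /* stands in for t->pending.signal */
	unsigned long *shared_pending;  /* stands in for t->signal->shared_pending.signal */
	bool exiting;                   /* stands in for PF_EXITING */
};

static unsigned long model_sigmask(int sig)
{
	return 1UL << (sig - 1);
}

/* Queue selection, mirroring the line quoted from __send_signal(). */
static void model_send_signal(struct model_task *t, int sig, bool thread_directed)
{
	unsigned long *pending = thread_directed ? &t->pending : t->shared_pending;

	*pending |= model_sigmask(sig);

	/*
	 * Assumption: short-circuit delivery copies a fatal signal into the
	 * per-thread set, but is skipped when the task is already exiting.
	 */
	if (!t->exiting && sig == MODEL_SIGKILL)
		t->pending |= model_sigmask(sig);
}

/* Looks only at the per-thread set, like the existing check. */
static bool model_fatal_pending_private(const struct model_task *t)
{
	return (t->pending & model_sigmask(MODEL_SIGKILL)) != 0;
}

/* Hypothetical variant that also looks at the shared set. */
static bool model_fatal_pending_either(const struct model_task *t)
{
	return ((t->pending | *t->shared_pending) & model_sigmask(MODEL_SIGKILL)) != 0;
}
```

Under those assumptions, a group-directed SIGKILL sent to an exiting task
sets only the shared bit, so the private check stays false while the
either-set check turns true. For a task that is not exiting, both agree.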

> After having looked at the code a little more I can unfortunately also
> say that testing PF_EXITING in __fatal_signal_pending will cause
> kernel_wait4 in zap_pid_ns_processes to not sleep, and instead to return
> 0.  Which will cause zap_pid_ns_processes to busy wait.  That seems very
> unfortunate.
> 
> I hadn't realized it at the time I wrote zap_pid_ns_processes but I
> think anything called from do_exit that cares about signal pending state
> is pretty much broken and needs to be fixed.

> So the question is how do we fix the problem in fuse that shows up
> during a pid namespace exit without having interruptible sleeps we need
> to wake up?
> 
> What are the code paths that experience the problem?

[<0>] request_wait_answer+0x282/0x710 [fuse]
[<0>] fuse_simple_request+0x502/0xc10 [fuse]
[<0>] fuse_flush+0x431/0x630 [fuse]
[<0>] filp_close+0x96/0x120
[<0>] put_files_struct+0x15c/0x2c0
[<0>] do_exit+0xa00/0x2450
[<0>] do_group_exit+0xb2/0x2a0
[<0>] get_signal+0x1eed/0x2090
[<0>] arch_do_signal_or_restart+0x89/0x1bc0
[<0>] exit_to_user_mode_prepare+0x11d/0x1b0
[<0>] syscall_exit_to_user_mode+0x19/0x50
[<0>] do_syscall_64+0x50/0x90
[<0>] entry_SYSCALL_64_after_hwframe+0x46/0xb0

is the full call stack. I have a reproducer here (make check will run
it): https://github.com/tych0/kernel-utils/tree/master/fuse2

In addition to fuse, it looks like nfs_file_flush() eventually ends up
in __fatal_signal_pending(), and probably a few others that want to
synchronize with something outside the local kernel.

> Will refactoring zap_pid_ns_processes as I have proposed so that it does
> not use kernel_wait4 help sort this out?  AKA make it work something
> like thread group leader of a process and not allow wait to reap the
> init process of a pid namespace until all of the processes in a pid
> namespaces have been gone.  Not that I see the problem in using
> kernel_wait4 it looks like zap_pid_ns_processes needs to stop calling
> kernel_wait4 regardless of the fuse problem.

I can look at this, but I really don't think it will help: in this
brave new world, what wakes up tasks stuck like the above? They're
still looking at the wrong signal set.

Tycho
