linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Eric W. Biederman" <ebiederm@xmission.com>
To: Tycho Andersen <tycho@tycho.pizza>
Cc: Miklos Szeredi <miklos@szeredi.hu>,
	Christian Brauner <brauner@kernel.org>,
	fuse-devel <fuse-devel@lists.sourceforge.net>,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: strange interaction between fuse + pidns
Date: Mon, 11 Jul 2022 18:06:21 -0500	[thread overview]
Message-ID: <87zghf6yhe.fsf@email.froward.int.ebiederm.org> (raw)
In-Reply-To: <Ysyp8Kbl8FzhApUb@netflix> (Tycho Andersen's message of "Mon, 11 Jul 2022 16:53:36 -0600")

Tycho Andersen <tycho@tycho.pizza> writes:

> On Mon, Jul 11, 2022 at 04:37:12PM -0500, Eric W. Biederman wrote:
>> Tycho Andersen <tycho@tycho.pizza> writes:
>> 
>> > Hi all,
>> >
>> > On Mon, Jul 11, 2022 at 03:59:15PM +0200, Miklos Szeredi wrote:
>> >> On Mon, 11 Jul 2022 at 12:35, Miklos Szeredi <miklos@szeredi.hu> wrote:
>> >> >
>> >> > Can you try the attached untested patch?
>> >> 
>> >> Updated patch to avoid use after free on req->args.
>> >> 
>> >> Still mostly untested.
>> >
>> > Thanks, when I applied your patch, I still ended up with tasks stuck
>> > waiting with a SIGKILL pending. So I looked into that and came up with
>> > the patch below. With both your patch and mine, my testcase exits
>> > cleanly.
>> >
>> > Eric (or Christian, or anyone), can you comment on the patch below? I
>> > have no idea what this will break. Maybe instead a better approach is
>> > some additional special case in __send_signal_locked()?
>> >
>> > Tycho
>> >
>> > From b7ea26adcf3546be5745063cc86658acb5ed37e9 Mon Sep 17 00:00:00 2001
>> > From: Tycho Andersen <tycho@tycho.pizza>
>> > Date: Mon, 11 Jul 2022 11:26:58 -0600
>> > Subject: [PATCH] sched: __fatal_signal_pending() should also check shared
>> >  signals
>> >
>> > The wait_* code uses signal_pending_state() to test whether a thread has
>> > been interrupted, which ultimately uses __fatal_signal_pending() to detect
>> > if there is a fatal signal.
>> >
>> > When a pid ns dies, in zap_pid_ns_processes() it does:
>> >
>> >     group_send_sig_info(SIGKILL, SEND_SIG_PRIV, task, PIDTYPE_MAX);
>> >
>> > for all the tasks in the pid ns. That calls through:
>> >
>> >     group_send_sig_info() ->
>> >       do_send_sig_info() ->
>> >         send_signal_locked() ->
>> >           __send_signal_locked()
>> >
>> > which does:
>> >
>> >     pending = (type != PIDTYPE_PID) ? &t->signal->shared_pending : &t->pending;
>> >
>> > which puts sigkill in the set of shared signals, but not the individual
>> > pending ones. If tasks are stuck in a killable wait (e.g. a fuse flush
>> > operation), they won't see this shared signal, and will hang forever, since
>> > TIF_SIGPENDING is set, but the fatal signal can't be detected.
>> 
>> Hmm.
>> 
>> That is perplexing.
>
> Thanks for taking a look.
>
>> __send_signal_locked calls complete_signal.  Then if any of the tasks of
>> the process can receive the signal, complete_signal will loop through
>> all of the tasks of the process and set the per thread SIGKILL.  Pretty
>> much by definition tasks can always receive SIGKILL.
>> 
>> Is complete_signal not being able to do that?
>
> In my specific case it was because my testcase was already trying to
> exit and had set PF_EXITING when the signal is delivered, so
> complete_signal() was indeed not able to do that since PF_EXITING is
> checked before SIGKILL in wants_signal().
>
> But I changed my testacase to sleep instead of exit, and I get the
> same hang behavior, even though complete_signal() does add SIGKILL to
> the set. So there's something else going on there...
>
>> The patch below really should not be necessary, and I have pending work
>> that if I can push over the finish line won't even make sense.
>> 
>> As it is currently an abuse to use the per thread SIGKILL to indicate
>> that a fatal signal has been short circuit delivered.  That abuse as
>> well as being unclean tends to confuse people reading the code.
>
> How close is your work? I'm wondering if it's worth investigating the
> non-PF_EXITING case further, or if we should just land this since it
> fixes the PF_EXITING case as well. Or maybe just do something like
> this in addition:

It is not different enough to change the semantics.  What I am aiming
for is having a dedicated flag indicating a task will exit, that
fatal_signal_pending can check.  And I intend to make that flag one way
so that once it is set it will never be cleared.

Which sort of argues against your patch below.

It most definitely does not sort out the sleep case.


The other thing I have played with that might be relevant was removing
the explicit wait in zap_pid_ns_processes and simply not allowing wait
to reap the pid namespace init until all it's children had been reaped.
Essentially how we deal with the thread group leader for ordinary
processes.  Does that sound like it might help in the fuse case?

I am not certain how far such a change would be (it has been a couple
years since I played with implementing it) but it can be made much
sooner if it demonstratively breaks some dead-locks, and generally makes
the kernel easier to maintain.

Eric

  reply	other threads:[~2022-07-11 23:06 UTC|newest]

Thread overview: 61+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-06-23 17:21 strange interaction between fuse + pidns Tycho Andersen
2022-06-23 21:55 ` Vivek Goyal
2022-06-23 23:41   ` Tycho Andersen
2022-06-24 17:36     ` Vivek Goyal
2022-07-11 10:35 ` Miklos Szeredi
2022-07-11 13:59   ` Miklos Szeredi
2022-07-11 20:25     ` Tycho Andersen
2022-07-11 21:37       ` Eric W. Biederman
2022-07-11 22:53         ` Tycho Andersen
2022-07-11 23:06           ` Eric W. Biederman [this message]
2022-07-12 13:43             ` Tycho Andersen
2022-07-12 14:34               ` Eric W. Biederman
2022-07-12 15:14                 ` Tycho Andersen
2022-07-13 17:53                   ` [PATCH] sched: __fatal_signal_pending() should also check PF_EXITING Tycho Andersen
2022-07-20 15:03                     ` Serge E. Hallyn
2022-07-20 20:58                       ` Tycho Andersen
2022-07-21  1:54                         ` Serge E. Hallyn
2022-07-27 15:44                           ` Tycho Andersen
2022-07-27 16:32                             ` Eric W. Biederman
2022-07-27 17:55                               ` Tycho Andersen
2022-07-28 18:48                                 ` Eric W. Biederman
2022-07-27 17:55                             ` Oleg Nesterov
2022-07-27 18:18                               ` Tycho Andersen
2022-07-27 19:19                                 ` Oleg Nesterov
2022-07-27 19:40                                   ` Tycho Andersen
2022-07-28  9:12                                     ` Oleg Nesterov
2022-07-28 21:20                                       ` Tycho Andersen
2022-07-29  5:04                                         ` Eric W. Biederman
2022-07-29 13:50                                           ` Tycho Andersen
2022-07-29 16:15                                             ` Eric W. Biederman
2022-07-29 16:48                                               ` Tycho Andersen
2022-07-29 17:40                                                 ` [RFC][PATCH] fuse: In fuse_flush only wait if someone wants the return code Eric W. Biederman
2022-07-29 20:47                                                   ` Oleg Nesterov
2022-07-30  0:15                                                     ` Al Viro
2022-07-30  5:10                                                       ` [RFC][PATCH v2] " Eric W. Biederman
2022-08-01 15:16                                                         ` Tycho Andersen
2022-08-02 12:50                                                         ` Miklos Szeredi
2022-08-15 13:59                                                         ` Tycho Andersen
2022-08-15 17:55                                                           ` Serge E. Hallyn
2022-09-01 14:06                                                           ` [PATCH] " Tycho Andersen
2022-09-19 15:03                                                             ` Tycho Andersen
2022-09-20 18:02                                                               ` Serge E. Hallyn
2022-09-26 14:17                                                               ` Tycho Andersen
2022-09-27  9:46                                                             ` Miklos Szeredi
2022-09-29 14:05                                                               ` [fuse-devel] " Stef Bon
2022-09-29 16:39                                                               ` [PATCH v2] " Tycho Andersen
2022-09-30 13:35                                                                 ` Miklos Szeredi
2022-09-30 14:01                                                                   ` Tycho Andersen
2022-09-30 14:41                                                                     ` Miklos Szeredi
2022-09-30 16:09                                                                       ` Tycho Andersen
2022-10-26  9:01                                                                         ` Miklos Szeredi
2022-11-14 16:02                                                                           ` [PATCH v3] " Tycho Andersen
2022-11-28 15:00                                                                             ` Tycho Andersen
2022-12-08 14:26                                                                               ` Miklos Szeredi
2022-12-08 17:49                                                                                 ` Tycho Andersen
2022-12-19 19:16                                                                                   ` Tycho Andersen
2023-01-03 14:51                                                                                     ` Tycho Andersen
2023-01-05 15:15                                                                                       ` Serge E. Hallyn
2023-01-26 14:12                                                                                       ` Miklos Szeredi
2022-09-30 19:47                                                               ` [PATCH] " Serge E. Hallyn
2022-09-19 15:46                                                           ` [RFC][PATCH v2] " Eric W. Biederman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87zghf6yhe.fsf@email.froward.int.ebiederm.org \
    --to=ebiederm@xmission.com \
    --cc=brauner@kernel.org \
    --cc=fuse-devel@lists.sourceforge.net \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=miklos@szeredi.hu \
    --cc=tycho@tycho.pizza \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).