linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Matthew Bobrowski <repnop@google.com>
To: Christian Brauner <christian.brauner@ubuntu.com>
Cc: Amir Goldstein <amir73il@gmail.com>, Jan Kara <jack@suse.cz>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	Linux API <linux-api@vger.kernel.org>
Subject: Re: [PATCH 0/5] Add pidfd support to the fanotify API
Date: Wed, 2 Jun 2021 20:56:54 +1000	[thread overview]
Message-ID: <YLdj9pk4Jpz1qqVl@google.com> (raw)
In-Reply-To: <20210602084854.sokpeqr2wgz7ci4a@wittgenstein>

On Wed, Jun 02, 2021 at 10:48:54AM +0200, Christian Brauner wrote:
> On Wed, Jun 02, 2021 at 10:18:36AM +0300, Amir Goldstein wrote:
> > On Wed, Jun 2, 2021 at 9:30 AM Matthew Bobrowski <repnop@google.com> wrote:
> > >
> > > On Tue, Jun 01, 2021 at 01:46:28PM +0200, Christian Brauner wrote:
> > > > On Tue, Jun 01, 2021 at 09:03:26PM +1000, Matthew Bobrowski wrote:
> > > > > On Wed, May 26, 2021 at 08:05:29PM +0200, Christian Brauner wrote:
> > > > > > On Wed, May 26, 2021 at 09:20:55AM +1000, Matthew Bobrowski wrote:
> > > > > > > On Tue, May 25, 2021 at 12:31:33PM +0200, Christian Brauner wrote:
> > > > > > > > On Mon, May 24, 2021 at 10:47:46AM +0200, Jan Kara wrote:
> > > > > > > > > On Sat 22-05-21 09:32:36, Matthew Bobrowski wrote:
> > > > > > > > > > On Fri, May 21, 2021 at 12:40:56PM +0200, Jan Kara wrote:
> > > > > > > > > > > On Fri 21-05-21 20:15:35, Matthew Bobrowski wrote:
> > > > > > > > > > > > On Thu, May 20, 2021 at 03:55:27PM +0200, Jan Kara wrote:
> > > > > > > > > > > > There's one thing that I'd like to mention, and it's something in
> > > > > > > > > > > > regards to the overall approach we've taken that I'm not particularly
> > > > > > > > > > > > happy about and I'd like to hear all your thoughts. Basically, with
> > > > > > > > > > > > this approach the pidfd creation is done only once an event has been
> > > > > > > > > > > > queued and the notification worker wakes up and picks up the event
> > > > > > > > > > > > from the queue processes it. There's a subtle latency introduced when
> > > > > > > > > > > > taking such an approach which at times leads to pidfd creation
> > > > > > > > > > > > failures. As in, by the time pidfd_create() is called the struct pid
> > > > > > > > > > > > has already been reaped, which then results in FAN_NOPIDFD being
> > > > > > > > > > > > returned in the pidfd info record.
> > > > > > > > > > > >
> > > > > > > > > > > > Having said that, I'm wondering what the thoughts are on doing pidfd
> > > > > > > > > > > > creation earlier on i.e. in the event allocation stages? This way, the
> > > > > > > > > > > > struct pid is pinned earlier on and rather than FAN_NOPIDFD being
> > > > > > > > > > > > returned in the pidfd info record because the struct pid has been
> > > > > > > > > > > > already reaped, userspace application will atleast receive a valid
> > > > > > > > > > > > pidfd which can be used to check whether the process still exists or
> > > > > > > > > > > > not. I think it'll just set the expectation better from an API
> > > > > > > > > > > > perspective.
> > > > > > > > > > >
> > > > > > > > > > > Yes, there's this race. OTOH if FAN_NOPIDFD is returned, the listener can
> > > > > > > > > > > be sure the original process doesn't exist anymore. So is it useful to
> > > > > > > > > > > still receive pidfd of the dead process?
> > > > > > > > > >
> > > > > > > > > > Well, you're absolutely right. However, FWIW I was approaching this
> > > > > > > > > > from two different angles:
> > > > > > > > > >
> > > > > > > > > > 1) I wanted to keep the pattern in which the listener checks for the
> > > > > > > > > >    existence/recycling of the process consistent. As in, the listener
> > > > > > > > > >    would receive the pidfd, then send the pidfd a signal via
> > > > > > > > > >    pidfd_send_signal() and check for -ESRCH which clearly indicates
> > > > > > > > > >    that the target process has terminated.
> > > > > > > > > >
> > > > > > > > > > 2) I didn't want to mask failed pidfd creation because of early
> > > > > > > > > >    process termination and other possible failures behind a single
> > > > > > > > > >    FAN_NOPIDFD. IOW, if we take the -ESRCH approach above, the
> > > > > > > > > >    listener can take clear corrective branches as what's to be done
> > > > > > > > > >    next if a race is to have been detected, whereas simply returning
> > > > > > > > > >    FAN_NOPIDFD at this stage can mean multiple things.
> > > > > > > > > >
> > > > > > > > > > Now that I've written the above and keeping in mind that we'd like to
> > > > > > > > > > refrain from doing anything in the event allocation stages, perhaps we
> > > > > > > > > > could introduce a different error code for detecting early process
> > > > > > > > > > termination while attempting to construct the info record. WDYT?
> > > > > > > > >
> > > > > > > > > Sure, I wouldn't like to overengineer it but having one special fd value for
> > > > > > > > > "process doesn't exist anymore" and another for general "creating pidfd
> > > > > > > > > failed" looks OK to me.
> > > > > > > >
> > > > > > > > FAN_EPIDFD -> "creation failed"
> > > > > > > > FAN_NOPIDFD -> "no such process"
> > > > > > >
> > > > > > > Yes, I was thinking something along the lines of this...
> > > > > > >
> > > > > > > With the approach that I've proposed in this series, the pidfd
> > > > > > > creation failure trips up in pidfd_create() at the following
> > > > > > > condition:
> > > > > > >
> > > > > > >         if (!pid || !pid_has_task(pid, PIDTYPE_TGID))
> > > > > > >                  return -EINVAL;
> > > > > > >
> > > > > > > Specifically, the following check:
> > > > > > >         !pid_has_task(pid, PIDTYPE_TGID)
> > > > > > >
> > > > > > > In order to properly report either FAN_NOPIDFD/FAN_EPIDFD to
> > > > > > > userspace, AFAIK I'll have to do one of either two things to better
> > > > > > > distinguish between why the pidfd creation had failed:
> > > > > >
> > > > > > Ok, I see. You already do have a reference to a struct pid and in that
> > > > > > case we should just always return a pidfd to the caller. For
> > > > > > pidfd_open() for example we only report an error when
> > > > > > find_get_pid(<pidnr>) doesn't find a struct pid to refer to. But in your
> > > > > > case here you already have a struct pid so I think we should just keep
> > > > > > this simple and always return a pidfd to the caller and in fact do
> > > > > > burden them with figuring out that the process is gone via
> > > > > > pidfd_send_signal() instead of complicating our lives here.
> > > > >
> > > > > Ah, actually Christian... Before, I go ahead and send through the updated
> > > > > series. Given what you've mentioned above I'm working with the assumption
> > > > > that you're OK with dropping the pid_has_task() check from pidfd_create()
> > > > > [0]. Is that right?
> > > > >
> > > > > If so, I don't know how I feel about this given that pidfd_create() is now
> > > > > to be exposed to the rest of the kernel and the pidfd API, as it stands,
> > > > > doesn't support the creation of pidfds for non-thread-group leaders. I
> > > > > suppose what I don't want is other kernel subsystems, if any, thinking it's
> > > > > OK to call pidfd_create() with an arbitrary struct pid and setting the
> > > > > expectation that a fully functional pidfd will be returned.
> > > > >
> > > > > The way I see it, I think we've got two options here:
> > > > >
> > > > > 1) Leave the pid_has_task() check within pidfd_create() and perform another
> > > > >    explicit pid_has_task() check from the fanotify code before calling
> > > > >    pidfd_create(). If it returns false, we set something like FAN_NOPIDFD
> > > > >    indicating to userspace that there's no such process when the event was
> > > > >    created.
> > > > >
> > > > > 2) Scrap using pidfd_create() all together and implement a fanotify
> > > > >    specific pidfd creation wrapper which would allow for more
> > > > >    control. Something along the lines of what you've done in kernel/fork.c
> > > > >    [1]. Not the biggest fan of this idea just yet given the possibility of
> > > > >    it leading to an API drift over time.
> > > > >
> > > > > WDYT?
> > > >
> > > > Hm, why would you have to drop the pid_has_task() check again?
> > >
> > > Because of the race that I brielfy decscribed here [0]. The race exists
> > 
> > Sorry for being thich. I still don't understand what's racy about this.
> > Won't the event reader get a valid pidfd?
> > Can't the event reader verify that the pidfd points to a dead process?
> > I don't mind returning FAN_NOPIDFD for convenience, but user
> > will have to check the pidfd that it got anyway, because process
> > can die at any time between reading the event and acting on the
> > pidfd.
> 
> (Replying to this part of the thread so we don't have to many parallel
> replies.)
> 
> Hm, so quoting from link [0] Matthew posted so we all have some context
> here:
> "Basically, with this approach the pidfd creation is done only once an
> event has been queued and the notification worker wakes up and picks up
> the event from the queue processes it. There's a subtle latency
> introduced when taking such an approach which at times leads to pidfd
> creation failures. As in, by the time pidfd_create() is called the
> struct pid has already been reaped, which then results in FAN_NOPIDFD
> being returned in the pidfd info record."
> 
> I don't think that's a race and even if I don't think that it's a
> meaningful one. So when the event is queued the process is still alive
> but when the notification is actually delivered the process is dead.
> 
> And your point, Matthew, seems to be that the caller should always get a
> pidfd back even if the process has already exited _and_ been reaped,
> i.e. is dead because it was alive when the event was generated.
> 
> I think that's no how it needs to work and I have a hard time seeing
> this as a good argument. What's problematic about just returning
> FAN_NOPIDFD in that case? After all the process is gone. All the caller
> can do with such a pidfd is to send it a signal and immediately realize
> that the process is gone, i.e. -ESRCH anyway.

To get things straight, there's no argument here. There's a discussion
about what the best approach is for communicating to the event listener
that a process has been killed prior to a pidfd being created by/from
fanotify. I have no issues with communicating FAN_NOPIDFD to the event
listener in such cases. I just want to make sure everyone else is OK with
it.

> > > because we perform the pidfd creation during the notification queue
> > > processing and not in the event allocation stages (for reasons that Jan has
> > > already covered here [1]). So, tl;dr there is the case where the fanotify
> > > calls pidfd_create() and the check for pid_has_task() fails because the
> > > struct pid that we're hanging onto within an event no longer contains a
> > > task of type PIDTYPE_TGID...
> > >
> > > [0] https://www.spinics.net/lists/linux-api/msg48630.html
> > > [1] https://www.spinics.net/lists/linux-api/msg48632.html

/M

  reply	other threads:[~2021-06-02 10:58 UTC|newest]

Thread overview: 38+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-05-20  2:09 [PATCH 0/5] Add pidfd support to the fanotify API Matthew Bobrowski
2021-05-20  2:10 ` [PATCH 1/5] kernel/pid.c: remove static qualifier from pidfd_create() Matthew Bobrowski
2021-05-20  2:10 ` [PATCH 2/5] kernel/pid.c: implement checks on parameters passed to pidfd_create() Matthew Bobrowski
2021-05-20  2:11 ` [PATCH 3/5] fanotify_user.c: minor cosmetic adjustments to fid labels Matthew Bobrowski
2021-05-20  2:11 ` [PATCH 4/5] fanotify/fanotify_user.c: introduce a generic info record copying function Matthew Bobrowski
2021-05-20 13:59   ` Amir Goldstein
2021-05-21  9:26     ` Matthew Bobrowski
2021-05-20  2:11 ` [PATCH 5/5] fanotify: Add pidfd info record support to the fanotify API Matthew Bobrowski
2021-05-20  8:17   ` Christian Brauner
2021-05-20 13:43     ` Amir Goldstein
2021-05-21  9:21       ` Matthew Bobrowski
2021-05-21  9:41         ` Amir Goldstein
2021-05-21 10:24           ` Jan Kara
2021-05-21 11:10             ` Amir Goldstein
2021-05-21 13:19               ` Jan Kara
2021-05-21 13:52                 ` Amir Goldstein
2021-05-21 15:14                   ` Jan Kara
2021-05-22  0:41                     ` Matthew Bobrowski
2021-05-22  9:01                       ` Amir Goldstein
2021-05-20 13:55 ` [PATCH 0/5] Add pidfd " Jan Kara
2021-05-21 10:15   ` Matthew Bobrowski
2021-05-21 10:40     ` Jan Kara
2021-05-21 23:32       ` Matthew Bobrowski
2021-05-24  8:47         ` Jan Kara
2021-05-25 10:31           ` Christian Brauner
2021-05-25 23:20             ` Matthew Bobrowski
2021-05-26 18:05               ` Christian Brauner
2021-05-26 22:54                 ` Matthew Bobrowski
2021-06-01 11:03                 ` Matthew Bobrowski
2021-06-01 11:46                   ` Christian Brauner
2021-06-02  6:30                     ` Matthew Bobrowski
2021-06-02  7:18                       ` Amir Goldstein
2021-06-02  8:48                         ` Christian Brauner
2021-06-02 10:56                           ` Matthew Bobrowski [this message]
2021-06-02 12:46                             ` Christian Brauner
2021-06-02 10:43                         ` Matthew Bobrowski
2021-06-02 12:18                           ` Amir Goldstein
2021-06-03  1:24                             ` Matthew Bobrowski

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YLdj9pk4Jpz1qqVl@google.com \
    --to=repnop@google.com \
    --cc=amir73il@gmail.com \
    --cc=christian.brauner@ubuntu.com \
    --cc=jack@suse.cz \
    --cc=linux-api@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).