From: Alice Ryhl <aliceryhl@google.com>
To: Benno Lossin <benno.lossin@proton.me>
Cc: a.hindborg@samsung.com, alex.gaynor@gmail.com, arve@android.com,
	bjorn3_gh@protonmail.com, boqun.feng@gmail.com,
	brauner@kernel.org, cmllamas@google.com,
	dan.j.williams@intel.com, dxu@dxuuu.xyz, gary@garyguo.net,
	gregkh@linuxfoundation.org, joel@joelfernandes.org,
	keescook@chromium.org, linux-fsdevel@vger.kernel.org,
	linux-kernel@vger.kernel.org, maco@android.com, ojeda@kernel.org,
	peterz@infradead.org, rust-for-linux@vger.kernel.org,
	surenb@google.com, tglx@linutronix.de, tkjos@android.com,
	viro@zeniv.linux.org.uk, wedsonaf@gmail.com, willy@infradead.org
Subject: Re: [PATCH v2 6/7] rust: file: add `DeferredFdCloser`
Date: Tue, 12 Dec 2023 10:35:02 +0100
Message-ID: <CAH5fLggB_33jR1eyXSFhN=DN34wD7E6-ckSU8ABmQ50H-L3P-w@mail.gmail.com>
In-Reply-To: <DNn_nN0MKmn9OoY7Gjn4fCUcwKD6ijDZyDXVHvouEa2w0o2yiXeRox3EUfAcbfoWqx0I24-8HqqzONjuTQIVxu2cfAoNQpUFJygPtQNXPM4=@proton.me>

On Mon, Dec 11, 2023 at 6:23 PM Benno Lossin <benno.lossin@proton.me> wrote:
>
> >>> +        unsafe { bindings::init_task_work(callback_head, Some(Self::do_close_fd)) };
> >>> +        // SAFETY: The `callback_head` pointer points at a valid and fully initialized task work
> >>> +        // that is ready to be scheduled.
> >>> +        //
> >>> +        // If the task work gets scheduled as-is, then it will be a no-op. However, we will update
> >>> +        // the file pointer below to tell it which file to fput.
> >>> +        let res = unsafe { bindings::task_work_add(current, callback_head, TWA_RESUME) };
> >>> +
> >>> +        if res != 0 {
> >>> +            // SAFETY: Scheduling the task work failed, so we still have ownership of the box, so
> >>> +            // we may destroy it.
> >>> +            unsafe { drop(Box::from_raw(inner)) };
> >>> +
> >>> +            return Err(DeferredFdCloseError::TaskWorkUnavailable);
> >>
> >> Just curious, what could make the `task_work_add` call fail? I imagine
> >> an OOM situation, but is that all?
> >
> > Actually, no, this doesn't fail in OOM situations since we pass it an
> > allocation for its linked list. It fails only when the current task is
> > exiting and won't run task work again.
>
> Interesting, is there some way to check for this aside from calling
> `task_work_add`?

I don't think so. We would need to access the `work_exited` constant
in `kernel/task_work.c` to do that, but it is not exported.
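To illustrate the failure mode, here is a rough userspace sketch of the idea,
not the kernel code. It assumes the queue head is either a list of pending work
or an "exited" marker, and that adding work to an exited task fails with ESRCH.
The names `ToyTask` and `WorkHead` are made up for this example.

```rust
/// Linux errno value for "No such process".
pub const ESRCH: i32 = 3;

/// Toy stand-in for a task's work-queue head (hypothetical type).
pub enum WorkHead {
    /// The task can still run task work; holds the names of queued items.
    Pending(Vec<&'static str>),
    /// The task is exiting and will never run task work again.
    Exited,
}

/// Toy stand-in for a task (hypothetical type).
pub struct ToyTask {
    head: WorkHead,
}

impl ToyTask {
    pub fn new() -> Self {
        ToyTask { head: WorkHead::Pending(Vec::new()) }
    }

    /// Queues a work item. The caller supplies the item, so nothing is
    /// allocated on behalf of the queue and the only failure is an exited task.
    pub fn task_work_add(&mut self, work: &'static str) -> Result<(), i32> {
        match &mut self.head {
            WorkHead::Pending(list) => {
                list.push(work);
                Ok(())
            }
            WorkHead::Exited => Err(ESRCH),
        }
    }

    /// Marks the task as exiting; later `task_work_add` calls fail.
    pub fn exit(&mut self) {
        self.head = WorkHead::Exited;
    }
}

/// Adds one item before the task exits and one after.
pub fn demo() -> (Result<(), i32>, Result<(), i32>) {
    let mut task = ToyTask::new();
    let before = task.task_work_add("do_close_fd");
    task.exit();
    let after = task.task_work_add("do_close_fd");
    (before, after)
}
```

In this model the only way to learn that the task has exited is to try adding
work, which mirrors the situation described above.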

> >> Also, you do not call it when `file` is null, which I imagine to be
> >> fine, but I do not know that since the C comment does not cover that
> >> case.
> >
> > Null pointer means that the fd doesn't exist, and it's correct to do
> > nothing in that case.
>
> I would also mention that in a comment (or the SAFETY comment).

Okay.

> >>> +        let file = unsafe { bindings::close_fd_get_file(fd) };
> >>> +        if file.is_null() {
> >>> +            // We don't clean up the task work since that might be expensive if the task work queue
> >>> +            // is long. Just let it execute and let it clean up for itself.
> >>> +            return Err(DeferredFdCloseError::BadFd);
> >>> +        }
> >>> +
> >>> +        // SAFETY: The `file` pointer points at a valid file.
> >>> +        unsafe { bindings::get_file(file) };
> >>> +
> >>> +        // SAFETY: Due to the above `get_file`, even if the current task holds an `fdget` to
> >>> +        // this file right now, the refcount will not drop to zero until after it is released
> >>> +        // with `fdput`. This is because when using `fdget`, you must always use `fdput` before
> >>
> >> Shouldn't this be "the refcount will not drop to zero until after it is
> >> released with `fput`."?
> >>
> >> Why is this the SAFETY comment for `filp_close`? I am not understanding
> >> the requirement on that function that needs this. This seems more a
> >> justification for accessing `file` inside `do_close_fd`. In which case I
> >> think it would be better to make it a type invariant of
> >> `DeferredFdCloserInner`.
> >
> > It's because `filp_close` decreases the refcount for the file, and doing
> > that is not always safe even if you have a refcount to the file. To drop
> > the refcount, at least one of the two following must be the case:
> >
> > * If the refcount decreases to a non-zero value, then it is okay.
> > * If there are no users of `fdget` on the file, then it is okay.
>
> I see, that makes sense. Is this written down somewhere? Or how does one
> know about this?

I don't think there's a single place to read about this. The comments
on `__fget_light` allude to something similar, but they make the blanket
statement that you can't call `filp_close` while an `fdget` reference
exists, even though the reality is a bit more nuanced.
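As a rough illustration, the two conditions quoted above can be written as a
predicate over a toy model. This is only a sketch: `FileState` and
`may_drop_ref` are made-up names, and the real kernel keeps no count of
`fdget` users.

```rust
/// Toy model of a file's lifetime state (hypothetical, not a kernel type).
#[derive(Copy, Clone, Debug)]
pub struct FileState {
    /// Number of counted references (fd table entry, `get_file`, ...).
    pub refcount: u32,
    /// Number of active `fdget` users relying on a reference they do not own.
    pub fdget_users: u32,
}

/// Returns whether dropping one reference is okay: either the refcount
/// stays non-zero afterwards, or nobody is using `fdget` on the file.
pub fn may_drop_ref(file: FileState) -> bool {
    let stays_nonzero = file.refcount > 1;
    let no_fdget_users = file.fdget_users == 0;
    stays_nonzero || no_fdget_users
}

/// Walks through the three situations relevant to `DeferredFdCloser`.
pub fn demo() -> [bool; 3] {
    // The fd table holds the only reference while an `fdget` user borrows it.
    let borrowed = FileState { refcount: 1, fdget_users: 1 };
    // Same situation after taking an extra reference with `get_file`.
    let pinned = FileState { refcount: 2, fdget_users: 1 };
    // Later, in the task work: the `fdget` user is gone and only our extra
    // reference remains.
    let deferred = FileState { refcount: 1, fdget_users: 0 };
    [may_drop_ref(borrowed), may_drop_ref(pinned), may_drop_ref(deferred)]
}
```

The first case is the one the `get_file` call avoids. The second is the
`filp_close` call, and the third is the deferred `fput`.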

> >>> +        // We update the file pointer that the task work is supposed to fput.
> >>> +        //
> >>> +        // SAFETY: Task works are executed on the current thread once we return to userspace, so
> >>> +        // this write is guaranteed to happen before `do_close_fd` is called, which means that a
> >>> +        // race is not possible here.
> >>> +        //
> >>> +        // It's okay to pass this pointer to the task work, since we just acquired a refcount with
> >>> +        // the previous call to `get_file`. Furthermore, the refcount will not drop to zero during
> >>> +        // an `fdget` call, since we defer the `fput` until after returning to userspace.
> >>> +        unsafe { *file_field = file };
> >>
> >> A synchronization question: who guarantees that this write is actually
> >> available to the cpu that executes `do_close_fd`? Is there some
> >> synchronization run when returning to userspace?
> >
> > It's on the same thread, so it's just a sequenced-before relation.
> >
> > It's not like an interrupt. It runs after the syscall invocation has
> > exited, but before it does the actual return-to-userspace stuff.
>
> Reasonable, can you also put this in a comment?

What do you want me to add? I already say that it will be executed on
the same thread.
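To make the ordering concrete, here is a small single-threaded sketch. It is a
userspace model, not kernel code: `ToyTask` and `return_to_userspace` are
made-up names, and a plain `u32` stands in for the file pointer. The work item
is queued first, the field is written afterwards, and the work still sees the
write because it only runs later on the same thread.

```rust
use std::cell::Cell;
use std::rc::Rc;

/// Toy stand-in for a task with a queue of pending task work (hypothetical).
pub struct ToyTask {
    pending: Vec<Box<dyn FnOnce()>>,
}

impl ToyTask {
    pub fn new() -> Self {
        ToyTask { pending: Vec::new() }
    }

    /// Queues a work item; it does not run yet.
    pub fn task_work_add(&mut self, work: Box<dyn FnOnce()>) {
        self.pending.push(work);
    }

    /// Models the point after the syscall body has finished: all queued
    /// work runs here, on the same thread that queued it.
    pub fn return_to_userspace(&mut self) {
        for work in self.pending.drain(..) {
            work();
        }
    }
}

/// Returns the value the task work observed in the file field.
pub fn demo() -> Option<u32> {
    let mut task = ToyTask::new();
    let file_field: Rc<Cell<Option<u32>>> = Rc::new(Cell::new(None));
    let observed: Rc<Cell<Option<u32>>> = Rc::new(Cell::new(None));

    {
        // Schedule the work while the field is still empty.
        let file_field = Rc::clone(&file_field);
        let observed = Rc::clone(&observed);
        task.task_work_add(Box::new(move || observed.set(file_field.get())));
    }

    // Update the field after scheduling, as the patch does.
    file_field.set(Some(42));

    // The work runs only now, so it is sequenced after the write.
    task.return_to_userspace();
    observed.get()
}
```

The model uses `Rc<Cell<_>>` rather than atomics on purpose: no cross-thread
synchronization is involved.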

> >>> +/// Represents a failure to close an fd in a deferred manner.
> >>> +#[derive(Copy, Clone, Eq, PartialEq)]
> >>> +pub enum DeferredFdCloseError {
> >>> +    /// Closing the fd failed because we were unable to schedule a task work.
> >>> +    TaskWorkUnavailable,
> >>> +    /// Closing the fd failed because the fd does not exist.
> >>> +    BadFd,
> >>> +}
> >>> +
> >>> +impl From<DeferredFdCloseError> for Error {
> >>> +    fn from(err: DeferredFdCloseError) -> Error {
> >>> +        match err {
> >>> +            DeferredFdCloseError::TaskWorkUnavailable => ESRCH,
> >>
> >> This error reads "No such process", I am not sure if that is the best
> >> way to express the problem in that situation. I took a quick look at the
> >> other error codes, but could not find a better fit. Do you have any
> >> better ideas? Or is this the error that C binder uses?
> >
> > This is the error code that task_work_add returns. (It can't happen in
> > Binder.)
> >
> > And I do think that it is a reasonable choice, because the error only
> > happens if you're calling the method from a context that has no
> > userspace process associated with it.
>
> I see.
>
> What do you think of making the Rust error more descriptive? So instead
> of implementing `Debug` like you currently do, you print
>
>     $error ($variant)
>
> where $error = Error::from(*self) and $variant is the name of the
> variant?
>
> This is more of a general suggestion, I don't think that this error type
> in particular warrants this. But in general with Rust we do have the
> option to have good error messages for every error while maintaining
> efficient error values.

I can #[derive(Debug)] instead, I guess?
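For comparison, a hand-written impl producing the suggested
`$error ($variant)` format could look roughly like the sketch below. It is a
userspace approximation: the `Error` struct here is a stand-in for the kernel's
`Error` type, and mapping `BadFd` to `EBADF` is an assumption, since that arm
is not part of the quoted code.

```rust
use std::fmt;

/// Stand-in for the kernel's `Error` type; holds a positive errno value.
/// (Hypothetical: the real type lives in the `kernel` crate.)
#[derive(Copy, Clone, Eq, PartialEq, Debug)]
pub struct Error(i32);

/// Linux errno values for "No such process" and "Bad file descriptor".
pub const ESRCH: Error = Error(3);
pub const EBADF: Error = Error(9);

/// Represents a failure to close an fd in a deferred manner.
#[derive(Copy, Clone, Eq, PartialEq)]
pub enum DeferredFdCloseError {
    /// Closing the fd failed because we were unable to schedule a task work.
    TaskWorkUnavailable,
    /// Closing the fd failed because the fd does not exist.
    BadFd,
}

impl From<DeferredFdCloseError> for Error {
    fn from(err: DeferredFdCloseError) -> Error {
        match err {
            DeferredFdCloseError::TaskWorkUnavailable => ESRCH,
            // Assumption: this arm is not shown in the quoted patch.
            DeferredFdCloseError::BadFd => EBADF,
        }
    }
}

impl fmt::Debug for DeferredFdCloseError {
    /// Prints the converted error followed by the variant name in parentheses.
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let variant = match self {
            Self::TaskWorkUnavailable => "TaskWorkUnavailable",
            Self::BadFd => "BadFd",
        };
        write!(f, "{:?} ({})", Error::from(*self), variant)
    }
}
```

With this stand-in, formatting `DeferredFdCloseError::TaskWorkUnavailable` with
`{:?}` gives `Error(3) (TaskWorkUnavailable)`, whereas a plain
`#[derive(Debug)]` would print only the variant name.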

Alice
