* [PATCH 0/7] File abstractions needed by Rust Binder
@ 2023-11-29 12:51 Alice Ryhl
  2023-11-29 12:51 ` [PATCH 1/7] rust: file: add Rust abstraction for `struct file` Alice Ryhl
                   ` (7 more replies)
  0 siblings, 8 replies; 96+ messages in thread
From: Alice Ryhl @ 2023-11-29 12:51 UTC (permalink / raw)
  To: Miguel Ojeda, Alex Gaynor, Wedson Almeida Filho, Boqun Feng,
	Gary Guo, Björn Roy Baron, Benno Lossin, Andreas Hindborg,
	Peter Zijlstra, Alexander Viro, Christian Brauner,
	Greg Kroah-Hartman, Arve Hjønnevåg, Todd Kjos,
	Martijn Coenen, Joel Fernandes, Carlos Llamas,
	Suren Baghdasaryan
  Cc: Alice Ryhl, Dan Williams, Kees Cook, Matthew Wilcox,
	Thomas Gleixner, Daniel Xu, linux-kernel, rust-for-linux,
	linux-fsdevel

This patchset contains the file abstractions needed by the Rust
implementation of the Binder driver.

Please see the Rust Binder RFC for usage examples:
https://lore.kernel.org/rust-for-linux/20231101-rust-binder-v1-0-08ba9197f637@google.com/

Users of "rust: file: add Rust abstraction for `struct file`":
	[PATCH RFC 02/20] rust_binder: add binderfs support to Rust binder
	[PATCH RFC 03/20] rust_binder: add threading support

Users of "rust: cred: add Rust abstraction for `struct cred`":
	[PATCH RFC 05/20] rust_binder: add nodes and context managers
	[PATCH RFC 06/20] rust_binder: add oneway transactions
	[PATCH RFC 11/20] rust_binder: send nodes in transaction
	[PATCH RFC 13/20] rust_binder: add BINDER_TYPE_FD support

Users of "rust: security: add abstraction for security_secid_to_secctx":
	[PATCH RFC 06/20] rust_binder: add oneway transactions

Users of "rust: file: add `FileDescriptorReservation`":
	[PATCH RFC 13/20] rust_binder: add BINDER_TYPE_FD support
	[PATCH RFC 14/20] rust_binder: add BINDER_TYPE_FDA support

Users of "rust: file: add `Kuid` wrapper":
	[PATCH RFC 05/20] rust_binder: add nodes and context managers
	[PATCH RFC 06/20] rust_binder: add oneway transactions

Users of "rust: file: add `DeferredFdCloser`":
	[PATCH RFC 14/20] rust_binder: add BINDER_TYPE_FDA support

Users of "rust: file: add abstraction for `poll_table`":
	[PATCH RFC 07/20] rust_binder: add epoll support

This patchset has some uses of read_volatile in place of READ_ONCE.
Please see the following RFC for context on this:
https://lore.kernel.org/all/20231025195339.1431894-1-boqun.feng@gmail.com/
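
For readers unfamiliar with the pattern, the following is a minimal
userspace sketch of a volatile field read standing in for READ_ONCE. The
`Demo` struct and the `read_flags` helper are hypothetical names used
only for illustration; they are not part of this series.

```rust
use core::ptr::addr_of;

/// Hypothetical stand-in for a C struct whose field may be updated elsewhere.
#[repr(C)]
struct Demo {
    f_flags: u32,
}

/// Reads `f_flags` once through a raw pointer, without creating a reference
/// to the field.
///
/// # Safety
///
/// `p` must point to a valid, properly aligned `Demo`.
unsafe fn read_flags(p: *const Demo) -> u32 {
    // SAFETY: The caller guarantees that `p` is valid and aligned.
    unsafe { addr_of!((*p).f_flags).read_volatile() }
}

fn main() {
    let demo = Demo { f_flags: 0o4000 };
    // SAFETY: `&demo` is a valid pointer for the duration of the call.
    let flags = unsafe { read_flags(&demo) };
    println!("flags = {flags:#o}");
}
```

A volatile read only prevents the compiler from eliding or duplicating
the access; whether that matches READ_ONCE in every respect is the
subject of the RFC linked above.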

This was previously sent as an RFC:
https://lore.kernel.org/all/20230720152820.3566078-1-aliceryhl@google.com/

Signed-off-by: Alice Ryhl <aliceryhl@google.com>
---
Alice Ryhl (4):
      rust: security: add abstraction for security_secid_to_secctx
      rust: file: add `Kuid` wrapper
      rust: file: add `DeferredFdCloser`
      rust: file: add abstraction for `poll_table`

Wedson Almeida Filho (3):
      rust: file: add Rust abstraction for `struct file`
      rust: cred: add Rust abstraction for `struct cred`
      rust: file: add `FileDescriptorReservation`

 rust/bindings/bindings_helper.h |   9 ++
 rust/bindings/lib.rs            |   1 +
 rust/helpers.c                  |  94 +++++++++++
 rust/kernel/cred.rs             |  73 +++++++++
 rust/kernel/file.rs             | 345 ++++++++++++++++++++++++++++++++++++++++
 rust/kernel/file/poll_table.rs  |  97 +++++++++++
 rust/kernel/lib.rs              |   3 +
 rust/kernel/security.rs         |  78 +++++++++
 rust/kernel/sync/condvar.rs     |   2 +-
 rust/kernel/task.rs             |  71 ++++++++-
 10 files changed, 771 insertions(+), 2 deletions(-)
---
base-commit: 98b1cc82c4affc16f5598d4fa14b1858671b2263
change-id: 20231123-alice-file-525b98e8a724

Best regards,
-- 
Alice Ryhl <aliceryhl@google.com>


* [PATCH 1/7] rust: file: add Rust abstraction for `struct file`
  2023-11-29 12:51 [PATCH 0/7] File abstractions needed by Rust Binder Alice Ryhl
@ 2023-11-29 12:51 ` Alice Ryhl
  2023-11-29 15:13   ` Matthew Wilcox
                     ` (2 more replies)
  2023-11-29 12:51 ` [PATCH 2/7] rust: cred: add Rust abstraction for `struct cred` Alice Ryhl
                   ` (6 subsequent siblings)
  7 siblings, 3 replies; 96+ messages in thread
From: Alice Ryhl @ 2023-11-29 12:51 UTC (permalink / raw)

From: Wedson Almeida Filho <wedsonaf@gmail.com>

This abstraction makes it possible to manipulate the open files for a
process. The new `File` struct wraps the C `struct file`. When accessing
it using the smart pointer `ARef<File>`, the pointer will own a
reference count to the file. When accessing it as `&File`, then the
reference does not own a refcount, but the borrow checker will ensure
that the reference count does not hit zero while the `&File` is live.
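
As a rough analogy only, the owned/borrowed distinction can be sketched
in userspace with `std::sync::Arc`. This is not how `ARef` is
implemented (for `File` the count lives inside the C `struct file`), and
`Object`, `peek` and `keep` are hypothetical names.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for a ref-counted kernel object.
struct Object {
    id: u32,
}

/// Borrows the object: no refcount change, and the borrow checker keeps the
/// owner alive for as long as the reference is in use.
fn peek(obj: &Object) -> u32 {
    obj.id
}

/// Takes an owned handle: the value passed in carries one reference count.
fn keep(obj: Arc<Object>) -> Arc<Object> {
    obj
}

fn main() {
    let owner = Arc::new(Object { id: 7 });
    assert_eq!(Arc::strong_count(&owner), 1);

    // Borrowing does not touch the count.
    assert_eq!(peek(&owner), 7);
    assert_eq!(Arc::strong_count(&owner), 1);

    // Cloning takes a new count, much like an `ARef<File>` owning a refcount.
    let second = keep(Arc::clone(&owner));
    assert_eq!(Arc::strong_count(&owner), 2);

    // Dropping the owned handle gives the count back.
    drop(second);
    assert_eq!(Arc::strong_count(&owner), 1);
}
```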

Since this is intended to manipulate the open files of a process, we
introduce a `from_fd` constructor that corresponds to the C `fget`
function. In future patches, it will become possible to create a new fd in
a process and bind it to a `File`. Rust Binder will use these to send
fds from one process to another.

We also provide a method for accessing the file's flags. Rust Binder
will use this to access the flags of the Binder fd to check whether the
non-blocking flag is set, which affects what the Binder ioctl does.
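
A minimal sketch of such a flag check, runnable in userspace. The
constant values below are the asm-generic ones and are written out here
only as an assumption for illustration; the module in this patch takes
them from `bindings`. The helper names are hypothetical.

```rust
// Illustrative asm-generic values (assumption); some architectures differ.
const O_ACCMODE: u32 = 0o3;
const O_RDONLY: u32 = 0o0;
const O_RDWR: u32 = 0o2;
const O_NONBLOCK: u32 = 0o4000;

/// Returns true if the non-blocking bit is set in `flags`.
fn is_nonblocking(flags: u32) -> bool {
    (flags & O_NONBLOCK) != 0
}

/// Returns true if the access mode is read-only. The access mode is a small
/// field rather than a single bit, so it is masked before comparing.
fn is_read_only(flags: u32) -> bool {
    (flags & O_ACCMODE) == O_RDONLY
}

fn main() {
    let flags = O_RDWR | O_NONBLOCK;
    assert!(is_nonblocking(flags));
    assert!(!is_read_only(flags));
    assert!(is_read_only(O_RDONLY | O_NONBLOCK));
}
```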

This introduces a struct for the EBADF error type, rather than just
using the Error type directly. This has two advantages:
* `File::from_fd` returns a `Result<ARef<File>, BadFdError>`, which the
  compiler will represent as a single pointer, with null being an error.
  This is possible because the compiler understands that `BadFdError`
  has only one possible value, and it also understands that the
  `ARef<File>` smart pointer is guaranteed non-null.
* Additionally, we promise to users of the method that the method can
  only fail with EBADF, which means that they can rely on this promise
  without having to inspect its implementation.
That said, there are also two disadvantages:
* Defining additional error types involves boilerplate.
* The question mark operator will only utilize the `From` trait once,
  which prevents you from using the question mark operator on
  `BadFdError` in methods that return some third error type that the
  kernel `Error` is convertible into. (However, it works fine in methods
  that return `Error`.)
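
The trade-offs listed above can be sketched in userspace as follows.
`BadFd`, `Errno`, `DriverError` and `lookup` are hypothetical stand-ins
for `BadFdError`, the kernel `Error`, a third error type and
`File::from_fd`; only the layout and conversion behaviour is meant to
carry over.

```rust
use core::mem::size_of;
use core::ptr::NonNull;

/// Zero-sized error with a single possible value, like `BadFdError`.
#[derive(Debug, PartialEq)]
struct BadFd;

/// Hypothetical stand-in for the kernel `Error` type.
#[derive(Debug, PartialEq)]
struct Errno(i32);

/// Hypothetical third error type that `Errno` converts into.
#[derive(Debug, PartialEq)]
struct DriverError(Errno);

impl From<BadFd> for Errno {
    fn from(_: BadFd) -> Errno {
        Errno(9) // EBADF
    }
}

impl From<Errno> for DriverError {
    fn from(e: Errno) -> DriverError {
        DriverError(e)
    }
}

/// Stand-in for a lookup that can only fail in one way.
fn lookup(ok: bool) -> Result<NonNull<u8>, BadFd> {
    if ok { Ok(NonNull::dangling()) } else { Err(BadFd) }
}

/// `?` applies `From<BadFd> for Errno` directly.
fn to_errno(ok: bool) -> Result<(), Errno> {
    lookup(ok)?;
    Ok(())
}

/// `?` does not chain two conversions, so the first hop is spelled out.
fn to_driver_error(ok: bool) -> Result<(), DriverError> {
    lookup(ok).map_err(Errno::from)?;
    Ok(())
}

fn main() {
    // The null niche of `NonNull` encodes the `Err` case.
    assert_eq!(size_of::<Result<NonNull<u8>, BadFd>>(), size_of::<*const u8>());
    assert_eq!(to_errno(false), Err(Errno(9)));
    assert_eq!(to_driver_error(false), Err(DriverError(Errno(9))));
    assert_eq!(to_driver_error(true), Ok(()));
}
```

Writing `lookup(ok)?` inside `to_driver_error` would not compile,
because there is no `From<BadFd> for DriverError`; the explicit
`map_err` is the boilerplate the second disadvantage refers to.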

Signed-off-by: Wedson Almeida Filho <wedsonaf@gmail.com>
Co-developed-by: Daniel Xu <dxu@dxuuu.xyz>
Signed-off-by: Daniel Xu <dxu@dxuuu.xyz>
Co-developed-by: Alice Ryhl <aliceryhl@google.com>
Signed-off-by: Alice Ryhl <aliceryhl@google.com>
---
 rust/bindings/bindings_helper.h |   2 +
 rust/helpers.c                  |   7 ++
 rust/kernel/file.rs             | 182 ++++++++++++++++++++++++++++++++++++++++
 rust/kernel/lib.rs              |   1 +
 4 files changed, 192 insertions(+)

diff --git a/rust/bindings/bindings_helper.h b/rust/bindings/bindings_helper.h
index 85f013ed4ca4..beed3ef1fbc3 100644
--- a/rust/bindings/bindings_helper.h
+++ b/rust/bindings/bindings_helper.h
@@ -8,6 +8,8 @@
 
 #include <kunit/test.h>
 #include <linux/errname.h>
+#include <linux/file.h>
+#include <linux/fs.h>
 #include <linux/slab.h>
 #include <linux/refcount.h>
 #include <linux/wait.h>
diff --git a/rust/helpers.c b/rust/helpers.c
index 70e59efd92bc..03141a3608a4 100644
--- a/rust/helpers.c
+++ b/rust/helpers.c
@@ -25,6 +25,7 @@
 #include <linux/build_bug.h>
 #include <linux/err.h>
 #include <linux/errname.h>
+#include <linux/fs.h>
 #include <linux/mutex.h>
 #include <linux/refcount.h>
 #include <linux/sched/signal.h>
@@ -157,6 +158,12 @@ void rust_helper_init_work_with_key(struct work_struct *work, work_func_t func,
 }
 EXPORT_SYMBOL_GPL(rust_helper_init_work_with_key);
 
+struct file *rust_helper_get_file(struct file *f)
+{
+	return get_file(f);
+}
+EXPORT_SYMBOL_GPL(rust_helper_get_file);
+
 /*
  * `bindgen` binds the C `size_t` type as the Rust `usize` type, so we can
  * use it in contexts where Rust expects a `usize` like slice (array) indices.
diff --git a/rust/kernel/file.rs b/rust/kernel/file.rs
new file mode 100644
index 000000000000..ee4ec8b919af
--- /dev/null
+++ b/rust/kernel/file.rs
@@ -0,0 +1,182 @@
+// SPDX-License-Identifier: GPL-2.0
+
+//! Files and file descriptors.
+//!
+//! C headers: [`include/linux/fs.h`](../../../../include/linux/fs.h) and
+//! [`include/linux/file.h`](../../../../include/linux/file.h)
+
+use crate::{
+    bindings,
+    error::{code::*, Error, Result},
+    types::{ARef, AlwaysRefCounted, Opaque},
+};
+use core::ptr;
+
+/// Flags associated with a [`File`].
+pub mod flags {
+    /// File is opened in append mode.
+    pub const O_APPEND: u32 = bindings::O_APPEND;
+
+    /// Signal-driven I/O is enabled.
+    pub const O_ASYNC: u32 = bindings::FASYNC;
+
+    /// Close-on-exec flag is set.
+    pub const O_CLOEXEC: u32 = bindings::O_CLOEXEC;
+
+    /// File was created if it didn't already exist.
+    pub const O_CREAT: u32 = bindings::O_CREAT;
+
+    /// Direct I/O is enabled for this file.
+    pub const O_DIRECT: u32 = bindings::O_DIRECT;
+
+    /// File must be a directory.
+    pub const O_DIRECTORY: u32 = bindings::O_DIRECTORY;
+
+    /// Like [`O_SYNC`] except metadata is not synced.
+    pub const O_DSYNC: u32 = bindings::O_DSYNC;
+
+    /// Ensure that this file is created with the `open(2)` call.
+    pub const O_EXCL: u32 = bindings::O_EXCL;
+
+    /// Large file size enabled (`off64_t` over `off_t`).
+    pub const O_LARGEFILE: u32 = bindings::O_LARGEFILE;
+
+    /// Do not update the file last access time.
+    pub const O_NOATIME: u32 = bindings::O_NOATIME;
+
+    /// File should not be used as process's controlling terminal.
+    pub const O_NOCTTY: u32 = bindings::O_NOCTTY;
+
+    /// If basename of path is a symbolic link, fail open.
+    pub const O_NOFOLLOW: u32 = bindings::O_NOFOLLOW;
+
+    /// File is using nonblocking I/O.
+    pub const O_NONBLOCK: u32 = bindings::O_NONBLOCK;
+
+    /// Also known as `O_NDELAY`.
+    ///
+    /// This is effectively the same flag as [`O_NONBLOCK`] on all architectures
+    /// except SPARC64.
+    pub const O_NDELAY: u32 = bindings::O_NDELAY;
+
+    /// Used to obtain a path file descriptor.
+    pub const O_PATH: u32 = bindings::O_PATH;
+
+    /// Write operations on this file will flush data and metadata.
+    pub const O_SYNC: u32 = bindings::O_SYNC;
+
+    /// This file is an unnamed temporary regular file.
+    pub const O_TMPFILE: u32 = bindings::O_TMPFILE;
+
+    /// File should be truncated to length 0.
+    pub const O_TRUNC: u32 = bindings::O_TRUNC;
+
+    /// Bitmask for access mode flags.
+    ///
+    /// # Examples
+    ///
+    /// ```
+    /// use kernel::file;
+    /// # fn do_something() {}
+    /// # let flags = 0;
+    /// if (flags & file::flags::O_ACCMODE) == file::flags::O_RDONLY {
+    ///     do_something();
+    /// }
+    /// ```
+    pub const O_ACCMODE: u32 = bindings::O_ACCMODE;
+
+    /// File is read only.
+    pub const O_RDONLY: u32 = bindings::O_RDONLY;
+
+    /// File is write only.
+    pub const O_WRONLY: u32 = bindings::O_WRONLY;
+
+    /// File can be both read and written.
+    pub const O_RDWR: u32 = bindings::O_RDWR;
+}
+
+/// Wraps the kernel's `struct file`.
+///
+/// # Invariants
+///
+/// Instances of this type are always ref-counted, that is, a call to `get_file` ensures that the
+/// allocation remains valid at least until the matching call to `fput`.
+#[repr(transparent)]
+pub struct File(Opaque<bindings::file>);
+
+// SAFETY: By design, the only way to access a `File` is via an immutable reference or an `ARef`.
+// This means that the only situation in which a `File` can be accessed mutably is when the
+// refcount drops to zero and the destructor runs. It is safe for that to happen on any thread, so
+// it is ok for this type to be `Send`.
+unsafe impl Send for File {}
+
+// SAFETY: It's OK to access `File` through shared references from other threads because we're
+// either accessing properties that don't change or that are properly synchronised by C code.
+unsafe impl Sync for File {}
+
+impl File {
+    /// Constructs a new `struct file` wrapper from a file descriptor.
+    ///
+    /// The file descriptor belongs to the current process.
+    pub fn from_fd(fd: u32) -> Result<ARef<Self>, BadFdError> {
+        // SAFETY: FFI call, there are no requirements on `fd`.
+        let ptr = ptr::NonNull::new(unsafe { bindings::fget(fd) }).ok_or(BadFdError)?;
+
+        // INVARIANT: `fget` increments the refcount before returning.
+        Ok(unsafe { ARef::from_raw(ptr.cast()) })
+    }
+
+    /// Creates a reference to a [`File`] from a valid pointer.
+    ///
+    /// # Safety
+    ///
+    /// The caller must ensure that `ptr` points at a valid file and that its refcount does not
+    /// reach zero during the lifetime 'a.
+    pub unsafe fn from_ptr<'a>(ptr: *const bindings::file) -> &'a File {
+        // INVARIANT: The safety requirements guarantee that the refcount does not hit zero during
+        // 'a. The cast is okay because `File` is `repr(transparent)`.
+        unsafe { &*ptr.cast() }
+    }
+
+    /// Returns the flags associated with the file.
+    ///
+    /// The flags are a combination of the constants in [`flags`].
+    pub fn flags(&self) -> u32 {
+        // This `read_volatile` is intended to correspond to a READ_ONCE call.
+        //
+        // SAFETY: The file is valid because the shared reference guarantees a nonzero refcount.
+        //
+        // TODO: Replace with `read_once` when available on the Rust side.
+        unsafe { core::ptr::addr_of!((*self.0.get()).f_flags).read_volatile() }
+    }
+}
+
+// SAFETY: The type invariants guarantee that `File` is always ref-counted.
+unsafe impl AlwaysRefCounted for File {
+    fn inc_ref(&self) {
+        // SAFETY: The existence of a shared reference means that the refcount is nonzero.
+        unsafe { bindings::get_file(self.0.get()) };
+    }
+
+    unsafe fn dec_ref(obj: ptr::NonNull<Self>) {
+        // SAFETY: The safety requirements guarantee that the refcount is nonzero.
+        unsafe { bindings::fput(obj.cast().as_ptr()) }
+    }
+}
+
+/// Represents the `EBADF` error code.
+///
+/// Used for methods that can only fail with `EBADF`.
+pub struct BadFdError;
+
+impl From<BadFdError> for Error {
+    fn from(_: BadFdError) -> Error {
+        EBADF
+    }
+}
+
+impl core::fmt::Debug for BadFdError {
+    fn fmt(&self, f: &mut core::fmt::Formatter<'_>) -> core::fmt::Result {
+        f.pad("EBADF")
+    }
+}
diff --git a/rust/kernel/lib.rs b/rust/kernel/lib.rs
index e6aff80b521f..ce9abceab784 100644
--- a/rust/kernel/lib.rs
+++ b/rust/kernel/lib.rs
@@ -34,6 +34,7 @@
 mod allocator;
 mod build_assert;
 pub mod error;
+pub mod file;
 pub mod init;
 pub mod ioctl;
 #[cfg(CONFIG_KUNIT)]

-- 
2.43.0.rc1.413.gea7ed67945-goog


* [PATCH 2/7] rust: cred: add Rust abstraction for `struct cred`
  2023-11-29 12:51 [PATCH 0/7] File abstractions needed by Rust Binder Alice Ryhl
  2023-11-29 12:51 ` [PATCH 1/7] rust: file: add Rust abstraction for `struct file` Alice Ryhl
@ 2023-11-29 12:51 ` Alice Ryhl
  2023-11-30 16:17   ` Benno Lossin
  2023-11-29 13:11 ` [PATCH 3/7] rust: security: add abstraction for secctx Alice Ryhl
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 96+ messages in thread
From: Alice Ryhl @ 2023-11-29 12:51 UTC (permalink / raw)

From: Wedson Almeida Filho <wedsonaf@gmail.com>

Add a wrapper around `struct cred` called `Credential`, and provide
functionality to get the `Credential` associated with a `File`.

Rust Binder must check the credentials of processes when they attempt to
perform various operations, and these checks usually take a
`&Credential` as parameter. The security_binder_set_context_mgr function
would be one example. This patch is necessary to access these security_*
functions from Rust.

Signed-off-by: Wedson Almeida Filho <wedsonaf@gmail.com>
Co-developed-by: Alice Ryhl <aliceryhl@google.com>
Signed-off-by: Alice Ryhl <aliceryhl@google.com>
---
 rust/bindings/bindings_helper.h |  1 +
 rust/helpers.c                  | 13 +++++++++
 rust/kernel/cred.rs             | 64 +++++++++++++++++++++++++++++++++++++++++
 rust/kernel/file.rs             | 16 +++++++++++
 rust/kernel/lib.rs              |  1 +
 5 files changed, 95 insertions(+)

diff --git a/rust/bindings/bindings_helper.h b/rust/bindings/bindings_helper.h
index beed3ef1fbc3..6d1bd2229aab 100644
--- a/rust/bindings/bindings_helper.h
+++ b/rust/bindings/bindings_helper.h
@@ -7,6 +7,7 @@
  */
 
 #include <kunit/test.h>
+#include <linux/cred.h>
 #include <linux/errname.h>
 #include <linux/file.h>
 #include <linux/fs.h>
diff --git a/rust/helpers.c b/rust/helpers.c
index 03141a3608a4..10ed69f76424 100644
--- a/rust/helpers.c
+++ b/rust/helpers.c
@@ -23,6 +23,7 @@
 #include <kunit/test-bug.h>
 #include <linux/bug.h>
 #include <linux/build_bug.h>
+#include <linux/cred.h>
 #include <linux/err.h>
 #include <linux/errname.h>
 #include <linux/fs.h>
@@ -164,6 +165,18 @@ struct file *rust_helper_get_file(struct file *f)
 }
 EXPORT_SYMBOL_GPL(rust_helper_get_file);
 
+const struct cred *rust_helper_get_cred(const struct cred *cred)
+{
+	return get_cred(cred);
+}
+EXPORT_SYMBOL_GPL(rust_helper_get_cred);
+
+void rust_helper_put_cred(const struct cred *cred)
+{
+	put_cred(cred);
+}
+EXPORT_SYMBOL_GPL(rust_helper_put_cred);
+
 /*
  * `bindgen` binds the C `size_t` type as the Rust `usize` type, so we can
  * use it in contexts where Rust expects a `usize` like slice (array) indices.
diff --git a/rust/kernel/cred.rs b/rust/kernel/cred.rs
new file mode 100644
index 000000000000..497058ec89bb
--- /dev/null
+++ b/rust/kernel/cred.rs
@@ -0,0 +1,64 @@
+// SPDX-License-Identifier: GPL-2.0
+
+//! Credentials management.
+//!
+//! C header: [`include/linux/cred.h`](../../../../include/linux/cred.h)
+//!
+//! Reference: <https://www.kernel.org/doc/html/latest/security/credentials.html>
+
+use crate::{
+    bindings,
+    types::{AlwaysRefCounted, Opaque},
+};
+
+/// Wraps the kernel's `struct cred`.
+///
+/// # Invariants
+///
+/// Instances of this type are always ref-counted, that is, a call to `get_cred` ensures that the
+/// allocation remains valid at least until the matching call to `put_cred`.
+#[repr(transparent)]
+pub struct Credential(pub(crate) Opaque<bindings::cred>);
+
+// SAFETY: By design, the only way to access a `Credential` is via an immutable reference or an
+// `ARef`. This means that the only situation in which a `Credential` can be accessed mutably is
+// when the refcount drops to zero and the destructor runs. It is safe for that to happen on any
+// thread, so it is ok for this type to be `Send`.
+unsafe impl Send for Credential {}
+
+// SAFETY: It's OK to access `Credential` through shared references from other threads because
+// we're either accessing properties that don't change or that are properly synchronised by C code.
+unsafe impl Sync for Credential {}
+
+impl Credential {
+    /// Creates a reference to a [`Credential`] from a valid pointer.
+    ///
+    /// # Safety
+    ///
+    /// The caller must ensure that `ptr` is valid and remains valid for the lifetime of the
+    /// returned [`Credential`] reference.
+    pub unsafe fn from_ptr<'a>(ptr: *const bindings::cred) -> &'a Credential {
+        // SAFETY: The safety requirements guarantee the validity of the dereference, while the
+        // `Credential` type being transparent makes the cast ok.
+        unsafe { &*ptr.cast() }
+    }
+
+    /// Returns the effective UID of the given credential.
+    pub fn euid(&self) -> bindings::kuid_t {
+        // SAFETY: By the type invariant, we know that `self.0` is valid.
+        unsafe { (*self.0.get()).euid }
+    }
+}
+
+// SAFETY: The type invariants guarantee that `Credential` is always ref-counted.
+unsafe impl AlwaysRefCounted for Credential {
+    fn inc_ref(&self) {
+        // SAFETY: The existence of a shared reference means that the refcount is nonzero.
+        unsafe { bindings::get_cred(self.0.get()) };
+    }
+
+    unsafe fn dec_ref(obj: core::ptr::NonNull<Self>) {
+        // SAFETY: The safety requirements guarantee that the refcount is nonzero.
+        unsafe { bindings::put_cred(obj.cast().as_ptr()) };
+    }
+}
diff --git a/rust/kernel/file.rs b/rust/kernel/file.rs
index ee4ec8b919af..f1f71c3d97e2 100644
--- a/rust/kernel/file.rs
+++ b/rust/kernel/file.rs
@@ -7,6 +7,7 @@
 
 use crate::{
     bindings,
+    cred::Credential,
     error::{code::*, Error, Result},
     types::{ARef, AlwaysRefCounted, Opaque},
 };
@@ -138,6 +139,21 @@ pub unsafe fn from_ptr<'a>(ptr: *const bindings::file) -> &'a File {
         unsafe { &*ptr.cast() }
     }
 
+    /// Returns the credentials of the task that originally opened the file.
+    pub fn cred(&self) -> &Credential {
+        // This `read_volatile` is intended to correspond to a READ_ONCE call.
+        //
+        // SAFETY: The file is valid because the shared reference guarantees a nonzero refcount.
+        //
+        // TODO: Replace with `read_once` when available on the Rust side.
+        let ptr = unsafe { core::ptr::addr_of!((*self.0.get()).f_cred).read_volatile() };
+
+        // SAFETY: The signature of this function ensures that the caller will only access the
+        // returned credential while the file is still valid, and the credential must stay valid
+        // while the file is valid.
+        unsafe { Credential::from_ptr(ptr) }
+    }
+
     /// Returns the flags associated with the file.
     ///
     /// The flags are a combination of the constants in [`flags`].
diff --git a/rust/kernel/lib.rs b/rust/kernel/lib.rs
index ce9abceab784..097fe9bb93ed 100644
--- a/rust/kernel/lib.rs
+++ b/rust/kernel/lib.rs
@@ -33,6 +33,7 @@
 #[cfg(not(testlib))]
 mod allocator;
 mod build_assert;
+pub mod cred;
 pub mod error;
 pub mod file;
 pub mod init;

-- 
2.43.0.rc1.413.gea7ed67945-goog


* [PATCH 3/7] rust: security: add abstraction for secctx
  2023-11-29 12:51 [PATCH 0/7] File abstractions needed by Rust Binder Alice Ryhl
  2023-11-29 12:51 ` [PATCH 1/7] rust: file: add Rust abstraction for `struct file` Alice Ryhl
  2023-11-29 12:51 ` [PATCH 2/7] rust: cred: add Rust abstraction for `struct cred` Alice Ryhl
@ 2023-11-29 13:11 ` Alice Ryhl
  2023-11-30 16:26   ` Benno Lossin
  2023-11-29 13:11 ` [PATCH 4/7] rust: file: add `FileDescriptorReservation` Alice Ryhl
                   ` (4 subsequent siblings)
  7 siblings, 1 reply; 96+ messages in thread
From: Alice Ryhl @ 2023-11-29 13:11 UTC (permalink / raw)

Adds an abstraction for viewing the string representation of a security
context.

This is needed by Rust Binder because it has a feature where a process can
view the string representation of the security context for incoming
transactions. The process can use that to authenticate incoming
transactions, and since the feature is provided by the kernel, the
process can trust that the security context is legitimate.

Signed-off-by: Alice Ryhl <aliceryhl@google.com>
---
 rust/bindings/bindings_helper.h |  1 +
 rust/helpers.c                  | 21 +++++++++
 rust/kernel/cred.rs             |  8 ++++
 rust/kernel/lib.rs              |  1 +
 rust/kernel/security.rs         | 78 +++++++++++++++++++++++++++++++++
 5 files changed, 109 insertions(+)
 create mode 100644 rust/kernel/security.rs

diff --git a/rust/bindings/bindings_helper.h b/rust/bindings/bindings_helper.h
index 6d1bd2229aab..81b13a953eae 100644
--- a/rust/bindings/bindings_helper.h
+++ b/rust/bindings/bindings_helper.h
@@ -11,6 +11,7 @@
 #include <linux/errname.h>
 #include <linux/file.h>
 #include <linux/fs.h>
+#include <linux/security.h>
 #include <linux/slab.h>
 #include <linux/refcount.h>
 #include <linux/wait.h>
diff --git a/rust/helpers.c b/rust/helpers.c
index 10ed69f76424..fd633d9db79a 100644
--- a/rust/helpers.c
+++ b/rust/helpers.c
@@ -30,6 +30,7 @@
 #include <linux/mutex.h>
 #include <linux/refcount.h>
 #include <linux/sched/signal.h>
+#include <linux/security.h>
 #include <linux/spinlock.h>
 #include <linux/wait.h>
 #include <linux/workqueue.h>
@@ -177,6 +178,26 @@ void rust_helper_put_cred(const struct cred *cred)
 }
 EXPORT_SYMBOL_GPL(rust_helper_put_cred);
 
+#ifndef CONFIG_SECURITY
+void rust_helper_security_cred_getsecid(const struct cred *c, u32 *secid)
+{
+	security_cred_getsecid(c, secid);
+}
+EXPORT_SYMBOL_GPL(rust_helper_security_cred_getsecid);
+
+int rust_helper_security_secid_to_secctx(u32 secid, char **secdata, u32 *seclen)
+{
+	return security_secid_to_secctx(secid, secdata, seclen);
+}
+EXPORT_SYMBOL_GPL(rust_helper_security_secid_to_secctx);
+
+void rust_helper_security_release_secctx(char *secdata, u32 seclen)
+{
+	security_release_secctx(secdata, seclen);
+}
+EXPORT_SYMBOL_GPL(rust_helper_security_release_secctx);
+#endif
+
 /*
  * `bindgen` binds the C `size_t` type as the Rust `usize` type, so we can
  * use it in contexts where Rust expects a `usize` like slice (array) indices.
diff --git a/rust/kernel/cred.rs b/rust/kernel/cred.rs
index 497058ec89bb..3794937b5294 100644
--- a/rust/kernel/cred.rs
+++ b/rust/kernel/cred.rs
@@ -43,6 +43,14 @@ pub unsafe fn from_ptr<'a>(ptr: *const bindings::cred) -> &'a Credential {
         unsafe { &*ptr.cast() }
     }
 
+    /// Get the id for this security context.
+    pub fn get_secid(&self) -> u32 {
+        let mut secid = 0;
+        // SAFETY: The invariants of this type ensure that the pointer is valid.
+        unsafe { bindings::security_cred_getsecid(self.0.get(), &mut secid) };
+        secid
+    }
+
     /// Returns the effective UID of the given credential.
     pub fn euid(&self) -> bindings::kuid_t {
         // SAFETY: By the type invariant, we know that `self.0` is valid.
diff --git a/rust/kernel/lib.rs b/rust/kernel/lib.rs
index 097fe9bb93ed..342cb02c495a 100644
--- a/rust/kernel/lib.rs
+++ b/rust/kernel/lib.rs
@@ -42,6 +42,7 @@
 pub mod kunit;
 pub mod prelude;
 pub mod print;
+pub mod security;
 mod static_assert;
 #[doc(hidden)]
 pub mod std_vendor;
diff --git a/rust/kernel/security.rs b/rust/kernel/security.rs
new file mode 100644
index 000000000000..69c10ed89a57
--- /dev/null
+++ b/rust/kernel/security.rs
@@ -0,0 +1,78 @@
+// SPDX-License-Identifier: GPL-2.0
+
+//! Linux Security Modules (LSM).
+//!
+//! C header: [`include/linux/security.h`](../../../../include/linux/security.h).
+
+use crate::{
+    bindings,
+    error::{to_result, Result},
+};
+
+/// A security context string.
+///
+/// The struct has the invariant that it always contains a valid security context.
+pub struct SecurityCtx {
+    secdata: *mut core::ffi::c_char,
+    seclen: usize,
+}
+
+impl SecurityCtx {
+    /// Get the security context given its id.
+    pub fn from_secid(secid: u32) -> Result<Self> {
+        let mut secdata = core::ptr::null_mut();
+        let mut seclen = 0;
+        // SAFETY: Just a C FFI call. The pointers are valid for writes.
+        unsafe {
+            to_result(bindings::security_secid_to_secctx(
+                secid,
+                &mut secdata,
+                &mut seclen,
+            ))?;
+        }
+
+        // If the above call did not fail, then we have a valid security
+        // context, so the invariants are not violated.
+        Ok(Self {
+            secdata,
+            seclen: usize::try_from(seclen).unwrap(),
+        })
+    }
+
+    /// Returns whether the security context is empty.
+    pub fn is_empty(&self) -> bool {
+        self.seclen == 0
+    }
+
+    /// Returns the length of this security context.
+    pub fn len(&self) -> usize {
+        self.seclen
+    }
+
+    /// Returns the bytes for this security context.
+    pub fn as_bytes(&self) -> &[u8] {
+        let mut ptr = self.secdata;
+        if ptr.is_null() {
+            // Many C APIs will use null pointers for strings of length zero, but
+            // `slice::from_raw_parts` doesn't allow the pointer to be null even if the length is
+            // zero. Replace the pointer with a dangling but non-null pointer in this case.
+            debug_assert_eq!(self.seclen, 0);
+            ptr = core::ptr::NonNull::dangling().as_ptr();
+        }
+
+        // SAFETY: The call to `security_secid_to_secctx` guarantees that the pointer is valid for
+        // `seclen` bytes. Furthermore, if the length is zero, then we have ensured that the
+        // pointer is not null.
+        unsafe { core::slice::from_raw_parts(ptr.cast(), self.seclen) }
+    }
+}
+
+impl Drop for SecurityCtx {
+    fn drop(&mut self) {
+        // SAFETY: This frees a pointer that came from a successful call to
+        // `security_secid_to_secctx`.
+        unsafe {
+            bindings::security_release_secctx(self.secdata, self.seclen as u32);
+        }
+    }
+}
-- 
2.43.0.rc1.413.gea7ed67945-goog


* [PATCH 4/7] rust: file: add `FileDescriptorReservation`
  2023-11-29 12:51 [PATCH 0/7] File abstractions needed by Rust Binder Alice Ryhl
                   ` (2 preceding siblings ...)
  2023-11-29 13:11 ` [PATCH 3/7] rust: security: add abstraction for secctx Alice Ryhl
@ 2023-11-29 13:11 ` Alice Ryhl
  2023-11-29 16:14   ` Christian Brauner
  2023-11-30 16:40   ` Benno Lossin
  2023-11-29 13:12 ` [PATCH 5/7] rust: file: add `Kuid` wrapper Alice Ryhl
                   ` (3 subsequent siblings)
  7 siblings, 2 replies; 96+ messages in thread
From: Alice Ryhl @ 2023-11-29 13:11 UTC (permalink / raw)

From: Wedson Almeida Filho <wedsonaf@gmail.com>

Allow for the creation of a file descriptor in two steps: first, we
reserve a slot for it, then we commit or drop the reservation. The first
step may fail (e.g., the current process ran out of available slots),
but commit and drop never fail (and are mutually exclusive).

This is needed by Rust Binder when fds are sent from one process to
another. It has to be a two-step process to properly handle the case
where multiple fds are sent: The operation must fail or succeed
atomically, which we achieve by first reserving the fds we need, and
only installing the files once we have reserved enough fds to send the
files.

Fd reservations assume that the value of `current` does not change
between the call to get_unused_fd_flags and the call to fd_install (or
put_unused_fd). By not implementing the Send trait, this abstraction
ensures that the `FileDescriptorReservation` cannot be moved into a
different process.
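As a rough illustration of how a marker field removes `Send`, the following standalone sketch compares a struct with a `PhantomData<*mut ()>` field against one without it. The `Probe` and `Fallback` items are hypothetical helpers that exist only to observe the auto trait at compile time; they are not part of this patch.

```rust
use std::marker::PhantomData;

/// Shaped like the type in this patch: the raw-pointer marker opts out of
/// the `Send` and `Sync` auto traits.
#[allow(dead_code)]
struct Reservation {
    fd: u32,
    _not_send_sync: PhantomData<*mut ()>,
}

/// The same data without the marker; `Send` is derived automatically.
#[allow(dead_code)]
struct PlainFd {
    fd: u32,
}

/// Fallback answer, used when the inherent constant below does not apply.
trait Fallback {
    const IS_SEND: bool = false;
}
impl<T: ?Sized> Fallback for T {}

/// Hypothetical probe type used only to ask "is `T: Send`?".
#[allow(dead_code)]
struct Probe<T: ?Sized>(PhantomData<T>);

#[allow(dead_code)]
impl<T: ?Sized + Send> Probe<T> {
    /// Inherent constants take priority over trait constants, so this one
    /// is picked whenever `T: Send` holds.
    const IS_SEND: bool = true;
}

fn plain_is_send() -> bool {
    <Probe<PlainFd>>::IS_SEND
}

fn reservation_is_send() -> bool {
    <Probe<Reservation>>::IS_SEND
}
```

In practice the more visible effect is a compile error: passing such a value to an API that requires `Send`, for example a thread spawn, is rejected by the compiler.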

Signed-off-by: Wedson Almeida Filho <wedsonaf@gmail.com>
Co-developed-by: Alice Ryhl <aliceryhl@google.com>
Signed-off-by: Alice Ryhl <aliceryhl@google.com>
---
 rust/kernel/file.rs | 64 ++++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 63 insertions(+), 1 deletion(-)

diff --git a/rust/kernel/file.rs b/rust/kernel/file.rs
index f1f71c3d97e2..2186a6ea3f2f 100644
--- a/rust/kernel/file.rs
+++ b/rust/kernel/file.rs
@@ -11,7 +11,7 @@
     error::{code::*, Error, Result},
     types::{ARef, AlwaysRefCounted, Opaque},
 };
-use core::ptr;
+use core::{marker::PhantomData, ptr};
 
 /// Flags associated with a [`File`].
 pub mod flags {
@@ -180,6 +180,68 @@ unsafe fn dec_ref(obj: ptr::NonNull<Self>) {
     }
 }
 
+/// A file descriptor reservation.
+///
+/// This allows the creation of a file descriptor in two steps: first, we reserve a slot for it,
+/// then we commit or drop the reservation. The first step may fail (e.g., the current process ran
+/// out of available slots), but commit and drop never fail (and are mutually exclusive).
+///
+/// Dropping the reservation happens in the destructor of this type.
+///
+/// # Invariants
+///
+/// The fd stored in this struct must correspond to a reserved file descriptor of the current task.
+pub struct FileDescriptorReservation {
+    fd: u32,
+    /// Prevent values of this type from being moved to a different task.
+    ///
+    /// This is necessary because the C FFI calls assume that `current` is set to the task that
+    /// owns the fd in question.
+    _not_send_sync: PhantomData<*mut ()>,
+}
+
+impl FileDescriptorReservation {
+    /// Creates a new file descriptor reservation.
+    pub fn new(flags: u32) -> Result<Self> {
+        // SAFETY: FFI call, there are no safety requirements on `flags`.
+        let fd: i32 = unsafe { bindings::get_unused_fd_flags(flags) };
+        if fd < 0 {
+            return Err(Error::from_errno(fd));
+        }
+        Ok(Self {
+            fd: fd as _,
+            _not_send_sync: PhantomData,
+        })
+    }
+
+    /// Returns the file descriptor number that was reserved.
+    pub fn reserved_fd(&self) -> u32 {
+        self.fd
+    }
+
+    /// Commits the reservation.
+    ///
+    /// The previously reserved file descriptor is bound to `file`. This method consumes the
+    /// [`FileDescriptorReservation`], so it will not be usable after this call.
+    pub fn commit(self, file: ARef<File>) {
+        // SAFETY: `self.fd` was previously returned by `get_unused_fd_flags`, and `file.ptr` is
+        // guaranteed to have an owned ref count by its type invariants.
+        unsafe { bindings::fd_install(self.fd, file.0.get()) };
+
+        // `fd_install` consumes both the file descriptor and the file reference, so we cannot run
+        // the destructors.
+        core::mem::forget(self);
+        core::mem::forget(file);
+    }
+}
+
+impl Drop for FileDescriptorReservation {
+    fn drop(&mut self) {
+        // SAFETY: `self.fd` was returned by a previous call to `get_unused_fd_flags`.
+        unsafe { bindings::put_unused_fd(self.fd) };
+    }
+}
+
 /// Represents the `EBADF` error code.
 ///
 /// Used for methods that can only fail with `EBADF`.

-- 
2.43.0.rc1.413.gea7ed67945-goog



* [PATCH 5/7] rust: file: add `Kuid` wrapper
  2023-11-29 12:51 [PATCH 0/7] File abstractions needed by Rust Binder Alice Ryhl
                   ` (3 preceding siblings ...)
  2023-11-29 13:11 ` [PATCH 4/7] rust: file: add `FileDescriptorReservation` Alice Ryhl
@ 2023-11-29 13:12 ` Alice Ryhl
  2023-11-29 16:28   ` Christian Brauner
                     ` (2 more replies)
  2023-11-29 13:12 ` [PATCH 6/7] rust: file: add `DeferredFdCloser` Alice Ryhl
                   ` (2 subsequent siblings)
  7 siblings, 3 replies; 96+ messages in thread
From: Alice Ryhl @ 2023-11-29 13:12 UTC (permalink / raw)
  To: Miguel Ojeda, Alex Gaynor, Wedson Almeida Filho, Boqun Feng,
	Gary Guo, Björn Roy Baron, Benno Lossin, Andreas Hindborg,
	Peter Zijlstra, Alexander Viro, Christian Brauner,
	Greg Kroah-Hartman, Arve Hjønnevåg, Todd Kjos,
	Martijn Coenen, Joel Fernandes, Carlos Llamas,
	Suren Baghdasaryan
  Cc: Alice Ryhl, Dan Williams, Kees Cook, Matthew Wilcox,
	Thomas Gleixner, Daniel Xu, linux-kernel, rust-for-linux,
	linux-fsdevel

Adds a wrapper around `kuid_t` called `Kuid`. This allows us to define
various operations on kuids such as equality and current_euid. It also
lets us provide conversions from kuid into userspace values.

Rust Binder needs these operations because it needs to compare kuids for
equality, and it needs to tell userspace about the pid and uid of
incoming transactions.
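The shape of such a wrapper can be sketched outside the kernel as a newtype with its own equality and a namespace-relative conversion. Everything below is a simplified model: `RawKuid` stands in for the bindgen-generated `kuid_t`, and `UserNs` is a hypothetical single-range mapping rather than the kernel's `struct user_namespace`. The only behaviour borrowed from C is that an unmapped id converts to `(uid_t)-1`.

```rust
/// Stand-in for the C `kuid_t` (assumption: the real type comes from bindgen).
#[derive(Copy, Clone)]
struct RawKuid {
    val: u32,
}

/// Toy user namespace mapping one contiguous range of kernel uids
/// onto userspace uids (hypothetical, much simpler than the kernel's).
struct UserNs {
    kernel_base: u32,
    user_base: u32,
    len: u32,
}

/// Newtype wrapper, so that operations on kuids live in one place.
#[derive(Copy, Clone)]
struct Kuid {
    kuid: RawKuid,
}

impl Kuid {
    fn from_raw(kuid: RawKuid) -> Self {
        Self { kuid }
    }

    fn into_raw(self) -> RawKuid {
        self.kuid
    }

    /// Translate into the uid visible inside `ns`. Unmapped ids become
    /// `u32::MAX`, mirroring the `(uid_t)-1` convention of `from_kuid`.
    fn into_uid_in_ns(self, ns: &UserNs) -> u32 {
        let offset = self.kuid.val.wrapping_sub(ns.kernel_base);
        if offset < ns.len {
            ns.user_base + offset
        } else {
            u32::MAX
        }
    }
}

impl PartialEq for Kuid {
    fn eq(&self, other: &Kuid) -> bool {
        // The kernel wrapper delegates to `uid_eq`; here we compare directly.
        self.kuid.val == other.kuid.val
    }
}

impl Eq for Kuid {}
```

Keeping the raw value private to the wrapper means callers compare and convert kuids only through these methods, which is the point of introducing the type.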

Signed-off-by: Alice Ryhl <aliceryhl@google.com>
---
 rust/bindings/bindings_helper.h |  1 +
 rust/helpers.c                  | 45 ++++++++++++++++++++++++++
 rust/kernel/cred.rs             |  5 +--
 rust/kernel/task.rs             | 71 ++++++++++++++++++++++++++++++++++++++++-
 4 files changed, 119 insertions(+), 3 deletions(-)

diff --git a/rust/bindings/bindings_helper.h b/rust/bindings/bindings_helper.h
index 81b13a953eae..700f01840188 100644
--- a/rust/bindings/bindings_helper.h
+++ b/rust/bindings/bindings_helper.h
@@ -11,6 +11,7 @@
 #include <linux/errname.h>
 #include <linux/file.h>
 #include <linux/fs.h>
+#include <linux/pid_namespace.h>
 #include <linux/security.h>
 #include <linux/slab.h>
 #include <linux/refcount.h>
diff --git a/rust/helpers.c b/rust/helpers.c
index fd633d9db79a..58e3a9dff349 100644
--- a/rust/helpers.c
+++ b/rust/helpers.c
@@ -142,6 +142,51 @@ void rust_helper_put_task_struct(struct task_struct *t)
 }
 EXPORT_SYMBOL_GPL(rust_helper_put_task_struct);
 
+kuid_t rust_helper_task_uid(struct task_struct *task)
+{
+	return task_uid(task);
+}
+EXPORT_SYMBOL_GPL(rust_helper_task_uid);
+
+kuid_t rust_helper_task_euid(struct task_struct *task)
+{
+	return task_euid(task);
+}
+EXPORT_SYMBOL_GPL(rust_helper_task_euid);
+
+#ifndef CONFIG_USER_NS
+uid_t rust_helper_from_kuid(struct user_namespace *to, kuid_t uid)
+{
+	return from_kuid(to, uid);
+}
+EXPORT_SYMBOL_GPL(rust_helper_from_kuid);
+#endif /* CONFIG_USER_NS */
+
+bool rust_helper_uid_eq(kuid_t left, kuid_t right)
+{
+	return uid_eq(left, right);
+}
+EXPORT_SYMBOL_GPL(rust_helper_uid_eq);
+
+kuid_t rust_helper_current_euid(void)
+{
+	return current_euid();
+}
+EXPORT_SYMBOL_GPL(rust_helper_current_euid);
+
+struct user_namespace *rust_helper_current_user_ns(void)
+{
+	return current_user_ns();
+}
+EXPORT_SYMBOL_GPL(rust_helper_current_user_ns);
+
+pid_t rust_helper_task_tgid_nr_ns(struct task_struct *tsk,
+				  struct pid_namespace *ns)
+{
+	return task_tgid_nr_ns(tsk, ns);
+}
+EXPORT_SYMBOL_GPL(rust_helper_task_tgid_nr_ns);
+
 struct kunit *rust_helper_kunit_get_current_test(void)
 {
 	return kunit_get_current_test();
diff --git a/rust/kernel/cred.rs b/rust/kernel/cred.rs
index 3794937b5294..fbc749788bfa 100644
--- a/rust/kernel/cred.rs
+++ b/rust/kernel/cred.rs
@@ -8,6 +8,7 @@
 
 use crate::{
     bindings,
+    task::Kuid,
     types::{AlwaysRefCounted, Opaque},
 };
 
@@ -52,9 +53,9 @@ pub fn get_secid(&self) -> u32 {
     }
 
     /// Returns the effective UID of the given credential.
-    pub fn euid(&self) -> bindings::kuid_t {
+    pub fn euid(&self) -> Kuid {
         // SAFETY: By the type invariant, we know that `self.0` is valid.
-        unsafe { (*self.0.get()).euid }
+        Kuid::from_raw(unsafe { (*self.0.get()).euid })
     }
 }
 
diff --git a/rust/kernel/task.rs b/rust/kernel/task.rs
index b2299bc7ac1f..1a27b968a907 100644
--- a/rust/kernel/task.rs
+++ b/rust/kernel/task.rs
@@ -5,7 +5,12 @@
 //! C header: [`include/linux/sched.h`](../../../../include/linux/sched.h).
 
 use crate::{bindings, types::Opaque};
-use core::{marker::PhantomData, ops::Deref, ptr};
+use core::{
+    cmp::{Eq, PartialEq},
+    marker::PhantomData,
+    ops::Deref,
+    ptr,
+};
 
 /// Returns the currently running task.
 #[macro_export]
@@ -78,6 +83,12 @@ unsafe impl Sync for Task {}
 /// The type of process identifiers (PIDs).
 type Pid = bindings::pid_t;
 
+/// The type of user identifiers (UIDs).
+#[derive(Copy, Clone)]
+pub struct Kuid {
+    kuid: bindings::kuid_t,
+}
+
 impl Task {
     /// Returns a task reference for the currently executing task/thread.
     ///
@@ -132,12 +143,34 @@ pub fn pid(&self) -> Pid {
         unsafe { *ptr::addr_of!((*self.0.get()).pid) }
     }
 
+    /// Returns the UID of the given task.
+    pub fn uid(&self) -> Kuid {
+        // SAFETY: By the type invariant, we know that `self.0` is valid.
+        Kuid::from_raw(unsafe { bindings::task_uid(self.0.get()) })
+    }
+
+    /// Returns the effective UID of the given task.
+    pub fn euid(&self) -> Kuid {
+        // SAFETY: By the type invariant, we know that `self.0` is valid.
+        Kuid::from_raw(unsafe { bindings::task_euid(self.0.get()) })
+    }
+
     /// Determines whether the given task has pending signals.
     pub fn signal_pending(&self) -> bool {
         // SAFETY: By the type invariant, we know that `self.0` is valid.
         unsafe { bindings::signal_pending(self.0.get()) != 0 }
     }
 
+    /// Returns the given task's pid in the current pid namespace.
+    pub fn pid_in_current_ns(&self) -> Pid {
+        // SAFETY: We know that `self.0.get()` is valid by the type invariant. The rest is just FFI
+        // calls.
+        unsafe {
+            let namespace = bindings::task_active_pid_ns(bindings::get_current());
+            bindings::task_tgid_nr_ns(self.0.get(), namespace)
+        }
+    }
+
     /// Wakes up the task.
     pub fn wake_up(&self) {
         // SAFETY: By the type invariant, we know that `self.0.get()` is non-null and valid.
@@ -147,6 +180,42 @@ pub fn wake_up(&self) {
     }
 }
 
+impl Kuid {
+    /// Get the current euid.
+    pub fn current_euid() -> Kuid {
+        // SAFETY: Just an FFI call.
+        Self {
+            kuid: unsafe { bindings::current_euid() },
+        }
+    }
+
+    /// Create a `Kuid` given the raw C type.
+    pub fn from_raw(kuid: bindings::kuid_t) -> Self {
+        Self { kuid }
+    }
+
+    /// Turn this kuid into the raw C type.
+    pub fn into_raw(self) -> bindings::kuid_t {
+        self.kuid
+    }
+
+    /// Converts this kernel UID into a UID that userspace understands. Uses the namespace of the
+    /// current task.
+    pub fn into_uid_in_current_ns(self) -> bindings::uid_t {
+        // SAFETY: Just an FFI call.
+        unsafe { bindings::from_kuid(bindings::current_user_ns(), self.kuid) }
+    }
+}
+
+impl PartialEq for Kuid {
+    fn eq(&self, other: &Kuid) -> bool {
+        // SAFETY: Just an FFI call.
+        unsafe { bindings::uid_eq(self.kuid, other.kuid) }
+    }
+}
+
+impl Eq for Kuid {}
+
 // SAFETY: The type invariants guarantee that `Task` is always ref-counted.
 unsafe impl crate::types::AlwaysRefCounted for Task {
     fn inc_ref(&self) {

-- 
2.43.0.rc1.413.gea7ed67945-goog



* [PATCH 6/7] rust: file: add `DeferredFdCloser`
  2023-11-29 12:51 [PATCH 0/7] File abstractions needed by Rust Binder Alice Ryhl
                   ` (4 preceding siblings ...)
  2023-11-29 13:12 ` [PATCH 5/7] rust: file: add `Kuid` wrapper Alice Ryhl
@ 2023-11-29 13:12 ` Alice Ryhl
  2023-11-30 17:12   ` Benno Lossin
  2023-11-29 13:12 ` [PATCH 7/7] rust: file: add abstraction for `poll_table` Alice Ryhl
  2023-11-29 16:31 ` [PATCH 0/7] File abstractions needed by Rust Binder Christian Brauner
  7 siblings, 1 reply; 96+ messages in thread
From: Alice Ryhl @ 2023-11-29 13:12 UTC (permalink / raw)
  To: Miguel Ojeda, Alex Gaynor, Wedson Almeida Filho, Boqun Feng,
	Gary Guo, Björn Roy Baron, Benno Lossin, Andreas Hindborg,
	Peter Zijlstra, Alexander Viro, Christian Brauner,
	Greg Kroah-Hartman, Arve Hjønnevåg, Todd Kjos,
	Martijn Coenen, Joel Fernandes, Carlos Llamas,
	Suren Baghdasaryan
  Cc: Alice Ryhl, Dan Williams, Kees Cook, Matthew Wilcox,
	Thomas Gleixner, Daniel Xu, linux-kernel, rust-for-linux,
	linux-fsdevel

To close an fd from kernel space, we could call `ksys_close`. However,
if we do this to an fd that is held using `fdget`, then we may trigger a
use-after-free. Introduce a helper that can be used to close an fd even
if the fd is currently held with `fdget`. This is done by grabbing an
extra refcount to the file and dropping it in a task work once we return
to userspace.

This is necessary for Rust Binder because otherwise the user might try
to have Binder close its fd for /dev/binder, which would cause problems
as this happens inside an ioctl on /dev/binder, and ioctls hold the fd
using `fdget`.

Additional motivation can be found in commit 80cd795630d6 ("binder: fix
use-after-free due to ksys_close() during fdget()") and in the comments
on `binder_do_fd_close`.

If there is some way to detect whether an fd is currently held with
`fdget`, then this could be optimized to skip the allocation and task
work when this is not the case. Another possible optimization would be
to combine several fds into a single task work, since this is used with
fd arrays that might hold several fds.

That said, it might not be necessary to optimize it, because Rust Binder
has two ways to send fds: BINDER_TYPE_FD and BINDER_TYPE_FDA. With
BINDER_TYPE_FD, it is userspace's responsibility to close the fd, so
this mechanism is used only by BINDER_TYPE_FDA, but fd arrays are used
rarely these days.
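The deferral idea can be modelled in userspace Rust with reference counting. This is a sketch under loud assumptions: `Task`, `File`, `close_fd_deferred` and `return_to_userspace` are hypothetical names, `Rc` plays the role of the file refcount, and a `Weak` handle plays the role of an `fdget` borrower that relies on the file staying alive without holding a count of its own.

```rust
use std::collections::HashMap;
use std::rc::Rc;

struct File {
    name: &'static str,
}

/// Toy task with an fd table and a queue of work to run on return to userspace.
struct Task {
    fd_table: HashMap<u32, Rc<File>>,
    task_work: Vec<Box<dyn FnOnce()>>,
}

impl Task {
    /// Remove `fd` from the table now, but keep the file alive until the
    /// queued task work has run. Returns `false` if `fd` was not open.
    fn close_fd_deferred(&mut self, fd: u32) -> bool {
        let file = match self.fd_table.remove(&fd) {
            Some(file) => file,
            None => return false,
        };
        // Take the extra reference first (the `get_file` step) ...
        let extra = Rc::clone(&file);
        // ... then give up the reference that the table used to own
        // (the `filp_close` step).
        drop(file);
        // The extra reference is released later (the `fput` step).
        self.task_work.push(Box::new(move || drop(extra)));
        true
    }

    /// Run all pending task work, as happens before re-entering userspace.
    fn return_to_userspace(&mut self) {
        for work in self.task_work.drain(..) {
            work();
        }
    }
}
```

While the simulated ioctl is still running, the borrowed file remains valid even though the fd is gone from the table; the file is freed only once the task work runs.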

Signed-off-by: Alice Ryhl <aliceryhl@google.com>
---
 rust/bindings/bindings_helper.h |  2 +
 rust/helpers.c                  |  8 ++++
 rust/kernel/file.rs             | 84 ++++++++++++++++++++++++++++++++++++++++-
 3 files changed, 93 insertions(+), 1 deletion(-)

diff --git a/rust/bindings/bindings_helper.h b/rust/bindings/bindings_helper.h
index 700f01840188..c8daee341df6 100644
--- a/rust/bindings/bindings_helper.h
+++ b/rust/bindings/bindings_helper.h
@@ -9,6 +9,7 @@
 #include <kunit/test.h>
 #include <linux/cred.h>
 #include <linux/errname.h>
+#include <linux/fdtable.h>
 #include <linux/file.h>
 #include <linux/fs.h>
 #include <linux/pid_namespace.h>
@@ -17,6 +18,7 @@
 #include <linux/refcount.h>
 #include <linux/wait.h>
 #include <linux/sched.h>
+#include <linux/task_work.h>
 #include <linux/workqueue.h>
 
 /* `bindgen` gets confused at certain things. */
diff --git a/rust/helpers.c b/rust/helpers.c
index 58e3a9dff349..d146bbf25aec 100644
--- a/rust/helpers.c
+++ b/rust/helpers.c
@@ -32,6 +32,7 @@
 #include <linux/sched/signal.h>
 #include <linux/security.h>
 #include <linux/spinlock.h>
+#include <linux/task_work.h>
 #include <linux/wait.h>
 #include <linux/workqueue.h>
 
@@ -243,6 +244,13 @@ void rust_helper_security_release_secctx(char *secdata, u32 seclen)
 EXPORT_SYMBOL_GPL(rust_helper_security_release_secctx);
 #endif
 
+void rust_helper_init_task_work(struct callback_head *twork,
+				task_work_func_t func)
+{
+	init_task_work(twork, func);
+}
+EXPORT_SYMBOL_GPL(rust_helper_init_task_work);
+
 /*
  * `bindgen` binds the C `size_t` type as the Rust `usize` type, so we can
  * use it in contexts where Rust expects a `usize` like slice (array) indices.
diff --git a/rust/kernel/file.rs b/rust/kernel/file.rs
index 2186a6ea3f2f..578ee307093f 100644
--- a/rust/kernel/file.rs
+++ b/rust/kernel/file.rs
@@ -11,7 +11,8 @@
     error::{code::*, Error, Result},
     types::{ARef, AlwaysRefCounted, Opaque},
 };
-use core::{marker::PhantomData, ptr};
+use alloc::boxed::Box;
+use core::{alloc::AllocError, marker::PhantomData, mem, ptr};
 
 /// Flags associated with a [`File`].
 pub mod flags {
@@ -242,6 +243,87 @@ fn drop(&mut self) {
     }
 }
 
+/// Helper used for closing file descriptors in a way that is safe even if the file is currently
+/// held using `fdget`.
+///
+/// Additional motivation can be found in commit 80cd795630d6 ("binder: fix use-after-free due to
+/// ksys_close() during fdget()") and in the comments on `binder_do_fd_close`.
+pub struct DeferredFdCloser {
+    inner: Box<DeferredFdCloserInner>,
+}
+
+// SAFETY: This just holds an allocation with no real content, so there's no safety issue with
+// moving it across threads.
+unsafe impl Send for DeferredFdCloser {}
+unsafe impl Sync for DeferredFdCloser {}
+
+#[repr(C)]
+struct DeferredFdCloserInner {
+    twork: mem::MaybeUninit<bindings::callback_head>,
+    file: *mut bindings::file,
+}
+
+impl DeferredFdCloser {
+    /// Create a new [`DeferredFdCloser`].
+    pub fn new() -> Result<Self, AllocError> {
+        Ok(Self {
+            inner: Box::try_new(DeferredFdCloserInner {
+                twork: mem::MaybeUninit::uninit(),
+                file: core::ptr::null_mut(),
+            })?,
+        })
+    }
+
+    /// Schedule a task work that closes the file descriptor when this task returns to userspace.
+    pub fn close_fd(mut self, fd: u32) {
+        use bindings::task_work_notify_mode_TWA_RESUME as TWA_RESUME;
+
+        let file = unsafe { bindings::close_fd_get_file(fd) };
+        if file.is_null() {
+            // Nothing further to do. The allocation is freed by the destructor of `self.inner`.
+            return;
+        }
+
+        self.inner.file = file;
+
+        // SAFETY: Since `DeferredFdCloserInner` is `#[repr(C)]`, casting the pointers gives a
+        // pointer to the `twork` field.
+        let inner = Box::into_raw(self.inner) as *mut bindings::callback_head;
+
+        // SAFETY: Getting a pointer to current is always safe.
+        let current = unsafe { bindings::get_current() };
+        // SAFETY: The `file` pointer points at a valid file.
+        unsafe { bindings::get_file(file) };
+        // SAFETY: Due to the above `get_file`, even if the current task holds an `fdget` to
+        // this file right now, the refcount will not drop to zero until after it is released
+        // with `fdput`. This is because when using `fdget`, you must always use `fdput` before
+        // returning to userspace, and our task work runs after any `fdget` users have returned
+        // to userspace.
+        //
+        // Note: fl_owner_t is currently a void pointer.
+        unsafe { bindings::filp_close(file, (*current).files as bindings::fl_owner_t) };
+        // SAFETY: The `inner` pointer is compatible with the `do_close_fd` method.
+        unsafe { bindings::init_task_work(inner, Some(Self::do_close_fd)) };
+        // SAFETY: The `inner` pointer points at a valid and fully initialized task work that is
+        // ready to be scheduled.
+        unsafe { bindings::task_work_add(current, inner, TWA_RESUME) };
+    }
+
+    // SAFETY: This function is an implementation detail of `close_fd`, so its safety comments
+    // should be read in extension of that method.
+    unsafe extern "C" fn do_close_fd(inner: *mut bindings::callback_head) {
+        // SAFETY: In `close_fd` we use this method together with a pointer that originates from a
+        // `Box<DeferredFdCloserInner>`, and we have just been given ownership of that allocation.
+        let inner = unsafe { Box::from_raw(inner as *mut DeferredFdCloserInner) };
+        // SAFETY: This drops a refcount we acquired in `close_fd`. Since this callback runs in a
+        // task work after we return to userspace, it is guaranteed that the current thread doesn't
+        // hold this file with `fdget`, as `fdget` must be released before returning to userspace.
+        unsafe { bindings::fput(inner.file) };
+        // Free the allocation.
+        drop(inner);
+    }
+}
+
 /// Represents the `EBADF` error code.
 ///
 /// Used for methods that can only fail with `EBADF`.

-- 
2.43.0.rc1.413.gea7ed67945-goog



* [PATCH 7/7] rust: file: add abstraction for `poll_table`
  2023-11-29 12:51 [PATCH 0/7] File abstractions needed by Rust Binder Alice Ryhl
                   ` (5 preceding siblings ...)
  2023-11-29 13:12 ` [PATCH 6/7] rust: file: add `DeferredFdCloser` Alice Ryhl
@ 2023-11-29 13:12 ` Alice Ryhl
  2023-11-30 17:42   ` Benno Lossin
                     ` (2 more replies)
  2023-11-29 16:31 ` [PATCH 0/7] File abstractions needed by Rust Binder Christian Brauner
  7 siblings, 3 replies; 96+ messages in thread
From: Alice Ryhl @ 2023-11-29 13:12 UTC (permalink / raw)
  To: Miguel Ojeda, Alex Gaynor, Wedson Almeida Filho, Boqun Feng,
	Gary Guo, Björn Roy Baron, Benno Lossin, Andreas Hindborg,
	Peter Zijlstra, Alexander Viro, Christian Brauner,
	Greg Kroah-Hartman, Arve Hjønnevåg, Todd Kjos,
	Martijn Coenen, Joel Fernandes, Carlos Llamas,
	Suren Baghdasaryan
  Cc: Alice Ryhl, Dan Williams, Kees Cook, Matthew Wilcox,
	Thomas Gleixner, Daniel Xu, linux-kernel, rust-for-linux,
	linux-fsdevel

The existing `CondVar` abstraction is a wrapper around the C
`wait_queue_head` type (stored in its `wait_list` field), but it does
not support all use-cases of that type. To be specific, a `CondVar`
cannot be registered with a `struct poll_table`.
This limitation has the advantage that you do not need to call
`synchronize_rcu` when destroying a `CondVar`.

However, we need the ability to register a `poll_table` with a
`wait_list` in Rust Binder. To enable this, introduce a type called
`PollCondVar`, which is like `CondVar` except that you can register a
`poll_table`. We also introduce `PollTable`, which is a safe wrapper
around `poll_table` that is intended to be used with `PollCondVar`.

The destructor of `PollCondVar` unconditionally calls `synchronize_rcu`
to ensure that the removal of epoll waiters has fully completed before
the `wait_list` is destroyed.

Signed-off-by: Alice Ryhl <aliceryhl@google.com>
---
That said, `synchronize_rcu` is rather expensive and is not needed in
all cases: If we have never registered a `poll_table` with the
`wait_list`, then we don't need to call `synchronize_rcu`. (And this is
a common case in Binder - not all processes use Binder with epoll.) The
current implementation does not account for this, but we could change it
to store a boolean next to the `wait_list` to keep track of whether a
`poll_table` has ever been registered. It is up to discussion whether
this is desirable.

It is not clear to me whether we can implement the above without storing
an extra boolean. We could check whether the `wait_list` is empty, but
it is not clear that this is sufficient. Perhaps someone knows the
answer? If a `poll_table` has previously been registered with a
`wait_list`, is it the case that we can kfree the `wait_list` after
observing that the `wait_list` is empty without waiting for an rcu grace
period?
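For discussion purposes, the boolean-tracking variant could look roughly like the userspace sketch below. It is not a proposal for the final code: `TrackedCondVar`, `expensive_grace_period_wait` and the counter are hypothetical stand-ins (the counter replaces `synchronize_rcu` so that the effect is observable), and the sketch does not answer whether an empty-list check alone would be sufficient.

```rust
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};

/// Counts calls to the stand-in for the expensive grace-period wait.
static GRACE_PERIOD_WAITS: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical placeholder for `synchronize_rcu`.
fn expensive_grace_period_wait() {
    GRACE_PERIOD_WAITS.fetch_add(1, Ordering::SeqCst);
}

/// Toy condition variable that remembers whether a poll table was ever
/// registered with it.
struct TrackedCondVar {
    ever_registered: AtomicBool,
}

impl TrackedCondVar {
    fn new() -> Self {
        Self {
            ever_registered: AtomicBool::new(false),
        }
    }

    /// Stand-in for `PollTable::register_wait`: only records the fact.
    fn register_wait(&self) {
        self.ever_registered.store(true, Ordering::Release);
    }
}

impl Drop for TrackedCondVar {
    fn drop(&mut self) {
        // We have exclusive access in the destructor, so a plain read suffices.
        if *self.ever_registered.get_mut() {
            expensive_grace_period_wait();
        }
    }
}
```

A condvar that was never registered is destroyed without the wait, while one that was registered at least once still pays for it.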

 rust/bindings/bindings_helper.h |  2 +
 rust/bindings/lib.rs            |  1 +
 rust/kernel/file.rs             |  3 ++
 rust/kernel/file/poll_table.rs  | 97 +++++++++++++++++++++++++++++++++++++++++
 rust/kernel/sync/condvar.rs     |  2 +-
 5 files changed, 104 insertions(+), 1 deletion(-)

diff --git a/rust/bindings/bindings_helper.h b/rust/bindings/bindings_helper.h
index c8daee341df6..14f84aeef62d 100644
--- a/rust/bindings/bindings_helper.h
+++ b/rust/bindings/bindings_helper.h
@@ -13,6 +13,7 @@
 #include <linux/file.h>
 #include <linux/fs.h>
 #include <linux/pid_namespace.h>
+#include <linux/poll.h>
 #include <linux/security.h>
 #include <linux/slab.h>
 #include <linux/refcount.h>
@@ -25,3 +26,4 @@
 const size_t BINDINGS_ARCH_SLAB_MINALIGN = ARCH_SLAB_MINALIGN;
 const gfp_t BINDINGS_GFP_KERNEL = GFP_KERNEL;
 const gfp_t BINDINGS___GFP_ZERO = __GFP_ZERO;
+const __poll_t BINDINGS_POLLFREE = POLLFREE;
diff --git a/rust/bindings/lib.rs b/rust/bindings/lib.rs
index 9bcbea04dac3..eeb291cc60db 100644
--- a/rust/bindings/lib.rs
+++ b/rust/bindings/lib.rs
@@ -51,3 +51,4 @@ mod bindings_helper {
 
 pub const GFP_KERNEL: gfp_t = BINDINGS_GFP_KERNEL;
 pub const __GFP_ZERO: gfp_t = BINDINGS___GFP_ZERO;
+pub const POLLFREE: __poll_t = BINDINGS_POLLFREE;
diff --git a/rust/kernel/file.rs b/rust/kernel/file.rs
index 578ee307093f..35576678c993 100644
--- a/rust/kernel/file.rs
+++ b/rust/kernel/file.rs
@@ -14,6 +14,9 @@
 use alloc::boxed::Box;
 use core::{alloc::AllocError, marker::PhantomData, mem, ptr};
 
+mod poll_table;
+pub use self::poll_table::{PollCondVar, PollTable};
+
 /// Flags associated with a [`File`].
 pub mod flags {
     /// File is opened in append mode.
diff --git a/rust/kernel/file/poll_table.rs b/rust/kernel/file/poll_table.rs
new file mode 100644
index 000000000000..a26b64df0106
--- /dev/null
+++ b/rust/kernel/file/poll_table.rs
@@ -0,0 +1,97 @@
+// SPDX-License-Identifier: GPL-2.0
+
+//! Utilities for working with `struct poll_table`.
+
+use crate::{
+    bindings,
+    file::File,
+    prelude::*,
+    sync::{CondVar, LockClassKey},
+    types::Opaque,
+};
+use core::ops::Deref;
+
+/// Creates a [`PollCondVar`] initialiser with the given name and a newly-created lock class.
+#[macro_export]
+macro_rules! new_poll_condvar {
+    ($($name:literal)?) => {
+        $crate::file::PollCondVar::new($crate::optional_name!($($name)?), $crate::static_lock_class!())
+    };
+}
+
+/// Wraps the kernel's `struct poll_table`.
+#[repr(transparent)]
+pub struct PollTable(Opaque<bindings::poll_table>);
+
+impl PollTable {
+    /// Creates a reference to a [`PollTable`] from a valid pointer.
+    ///
+    /// # Safety
+    ///
+    /// The caller must ensure that for the duration of 'a, the pointer will point at a valid poll
+    /// table, and that it is only accessed via the returned reference.
+    pub unsafe fn from_ptr<'a>(ptr: *mut bindings::poll_table) -> &'a mut PollTable {
+        // SAFETY: The safety requirements guarantee the validity of the dereference, while the
+        // `PollTable` type being transparent makes the cast ok.
+        unsafe { &mut *ptr.cast() }
+    }
+
+    fn get_qproc(&self) -> bindings::poll_queue_proc {
+        let ptr = self.0.get();
+        // SAFETY: The `ptr` is valid because it originates from a reference, and the `_qproc`
+        // field is not modified concurrently with this call.
+        unsafe { (*ptr)._qproc }
+    }
+
+    /// Register this [`PollTable`] with the provided [`PollCondVar`], so that it can be notified
+    /// using the condition variable.
+    pub fn register_wait(&mut self, file: &File, cv: &PollCondVar) {
+        if let Some(qproc) = self.get_qproc() {
+            // SAFETY: The pointers to `self` and `file` are valid because they are references.
+            //
+            // Before the wait list is destroyed, the destructor of `PollCondVar` will clear
+            // everything in the wait list, so the wait list is not used after it is freed.
+            unsafe { qproc(file.0.get() as _, cv.wait_list.get(), self.0.get()) };
+        }
+    }
+}
+
+/// A wrapper around [`CondVar`] that makes it usable with [`PollTable`].
+///
+/// [`CondVar`]: crate::sync::CondVar
+#[pin_data(PinnedDrop)]
+pub struct PollCondVar {
+    #[pin]
+    inner: CondVar,
+}
+
+impl PollCondVar {
+    /// Constructs a new condvar initialiser.
+    #[allow(clippy::new_ret_no_self)]
+    pub fn new(name: &'static CStr, key: &'static LockClassKey) -> impl PinInit<Self> {
+        pin_init!(Self {
+            inner <- CondVar::new(name, key),
+        })
+    }
+}
+
+// Make the `CondVar` methods callable on `PollCondVar`.
+impl Deref for PollCondVar {
+    type Target = CondVar;
+
+    fn deref(&self) -> &CondVar {
+        &self.inner
+    }
+}
+
+#[pinned_drop]
+impl PinnedDrop for PollCondVar {
+    fn drop(self: Pin<&mut Self>) {
+        // Clear anything registered using `register_wait`.
+        self.inner.notify(1, bindings::POLLHUP | bindings::POLLFREE);
+        // Wait for epoll items to be properly removed.
+        //
+        // SAFETY: Just an FFI call.
+        unsafe { bindings::synchronize_rcu() };
+    }
+}
diff --git a/rust/kernel/sync/condvar.rs b/rust/kernel/sync/condvar.rs
index b679b6f6dbeb..2d276a013ec8 100644
--- a/rust/kernel/sync/condvar.rs
+++ b/rust/kernel/sync/condvar.rs
@@ -143,7 +143,7 @@ pub fn wait_uninterruptible<T: ?Sized, B: Backend>(&self, guard: &mut Guard<'_,
     }
 
     /// Calls the kernel function to notify the appropriate number of threads with the given flags.
-    fn notify(&self, count: i32, flags: u32) {
+    pub(crate) fn notify(&self, count: i32, flags: u32) {
         // SAFETY: `wait_list` points to valid memory.
         unsafe {
             bindings::__wake_up(

-- 
2.43.0.rc1.413.gea7ed67945-goog



* Re: [PATCH 1/7] rust: file: add Rust abstraction for `struct file`
  2023-11-29 12:51 ` [PATCH 1/7] rust: file: add Rust abstraction for `struct file` Alice Ryhl
@ 2023-11-29 15:13   ` Matthew Wilcox
  2023-11-29 15:23     ` Peter Zijlstra
                       ` (2 more replies)
  2023-11-29 17:06   ` Christian Brauner
  2023-11-30 14:53   ` Benno Lossin
  2 siblings, 3 replies; 96+ messages in thread
From: Matthew Wilcox @ 2023-11-29 15:13 UTC (permalink / raw)
  To: Alice Ryhl
  Cc: Miguel Ojeda, Alex Gaynor, Wedson Almeida Filho, Boqun Feng,
	Gary Guo, Björn Roy Baron, Benno Lossin, Andreas Hindborg,
	Peter Zijlstra, Alexander Viro, Christian Brauner,
	Greg Kroah-Hartman, Arve Hjønnevåg, Todd Kjos,
	Martijn Coenen, Joel Fernandes, Carlos Llamas,
	Suren Baghdasaryan, Dan Williams, Kees Cook, Thomas Gleixner,
	Daniel Xu, linux-kernel, rust-for-linux, linux-fsdevel

On Wed, Nov 29, 2023 at 12:51:07PM +0000, Alice Ryhl wrote:
> This introduces a struct for the EBADF error type, rather than just
> using the Error type directly. This has two advantages:
> * `File::from_fd` returns a `Result<ARef<File>, BadFdError>`, which the
>   compiler will represent as a single pointer, with null being an error.
>   This is possible because the compiler understands that `BadFdError`
>   has only one possible value, and it also understands that the
>   `ARef<File>` smart pointer is guaranteed non-null.
> * Additionally, we promise to users of the method that the method can
>   only fail with EBADF, which means that they can rely on this promise
>   without having to inspect its implementation.
> That said, there are also two disadvantages:
> * Defining additional error types involves boilerplate.
> * The question mark operator will only utilize the `From` trait once,
>   which prevents you from using the question mark operator on
>   `BadFdError` in methods that return some third error type that the
>   kernel `Error` is convertible into. (However, it works fine in methods
>   that return `Error`.)

I haven't looked at how Rust-for-Linux handles errors yet, but it's
disappointing to see that it doesn't do something like the PTR_ERR /
ERR_PTR / IS_ERR C thing under the hood.
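The single-pointer claim in the quoted text can be checked with a small standalone program. This is a sketch with stand-in types: `FileRef` is a hypothetical non-null wrapper playing the role of `ARef<File>`, and `BadFdError` is modelled as a field-less struct. The sizes are what current compilers produce for these types; the general layout of Rust enums is otherwise unspecified.

```rust
use std::mem::size_of;
use std::ptr::NonNull;

/// Zero-sized error with exactly one possible value.
#[allow(dead_code)]
struct BadFdError;

/// Hypothetical stand-in for `ARef<File>`: a pointer that is never null.
#[allow(dead_code)]
#[repr(transparent)]
struct FileRef(NonNull<u8>);

/// Returns the size of a pointer, of the compact result, and of a result
/// whose error type carries an integer code.
fn sizes() -> (usize, usize, usize) {
    (
        size_of::<usize>(),
        // The null pointer value is free to encode `Err(BadFdError)`.
        size_of::<Result<FileRef, BadFdError>>(),
        // An `i32` payload needs storage of its own next to the pointer.
        size_of::<Result<FileRef, i32>>(),
    )
}
```

So the `BadFdError` variant costs no more space than a nullable pointer, which is comparable in size to the C encoding, although it can express only one error value.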

> @@ -157,6 +158,12 @@ void rust_helper_init_work_with_key(struct work_struct *work, work_func_t func,
>  }
>  EXPORT_SYMBOL_GPL(rust_helper_init_work_with_key);
>  
> +struct file *rust_helper_get_file(struct file *f)
> +{
> +	return get_file(f);
> +}
> +EXPORT_SYMBOL_GPL(rust_helper_get_file);

This is ridiculous.  A function call instead of doing the
atomic_long_inc() in Rust?



* Re: [PATCH 1/7] rust: file: add Rust abstraction for `struct file`
  2023-11-29 15:13   ` Matthew Wilcox
@ 2023-11-29 15:23     ` Peter Zijlstra
  2023-11-29 17:08       ` Boqun Feng
  2023-11-29 16:42     ` Alice Ryhl
  2023-11-30 15:02     ` Benno Lossin
  2 siblings, 1 reply; 96+ messages in thread
From: Peter Zijlstra @ 2023-11-29 15:23 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Alice Ryhl, Miguel Ojeda, Alex Gaynor, Wedson Almeida Filho,
	Boqun Feng, Gary Guo, Björn Roy Baron, Benno Lossin,
	Andreas Hindborg, Alexander Viro, Christian Brauner,
	Greg Kroah-Hartman, Arve Hjønnevåg, Todd Kjos,
	Martijn Coenen, Joel Fernandes, Carlos Llamas,
	Suren Baghdasaryan, Dan Williams, Kees Cook, Thomas Gleixner,
	Daniel Xu, linux-kernel, rust-for-linux, linux-fsdevel

On Wed, Nov 29, 2023 at 03:13:22PM +0000, Matthew Wilcox wrote:

> > @@ -157,6 +158,12 @@ void rust_helper_init_work_with_key(struct work_struct *work, work_func_t func,
> >  }
> >  EXPORT_SYMBOL_GPL(rust_helper_init_work_with_key);
> >  
> > +struct file *rust_helper_get_file(struct file *f)
> > +{
> > +	return get_file(f);
> > +}
> > +EXPORT_SYMBOL_GPL(rust_helper_get_file);
> 
> This is ridiculous.  A function call instead of doing the
> atomic_long_inc() in Rust?

Yeah, I complained about something similar a while ago. And recently
talked to Boqun about this as well.

Bindgen *could* in theory 'compile' the inline C headers into (unsafe)
Rust; the immediate problem is that Rust has a wildly different inline
asm syntax (because Rust needs terrible syntax or whatever).

Boqun said it should all be fixable, but is a non-trivial amount of
work.



* Re: [PATCH 4/7] rust: file: add `FileDescriptorReservation`
  2023-11-29 13:11 ` [PATCH 4/7] rust: file: add `FileDescriptorReservation` Alice Ryhl
@ 2023-11-29 16:14   ` Christian Brauner
  2023-11-29 16:55     ` Alice Ryhl
  2023-11-30 16:40   ` Benno Lossin
  1 sibling, 1 reply; 96+ messages in thread
From: Christian Brauner @ 2023-11-29 16:14 UTC (permalink / raw)
  To: Alice Ryhl
  Cc: Miguel Ojeda, Alex Gaynor, Wedson Almeida Filho, Boqun Feng,
	Gary Guo, Björn Roy Baron, Benno Lossin, Andreas Hindborg,
	Peter Zijlstra, Alexander Viro, Greg Kroah-Hartman,
	Arve Hjønnevåg, Todd Kjos, Martijn Coenen,
	Joel Fernandes, Carlos Llamas, Suren Baghdasaryan, Dan Williams,
	Kees Cook, Matthew Wilcox, Thomas Gleixner, Daniel Xu,
	linux-kernel, rust-for-linux, linux-fsdevel

On Wed, Nov 29, 2023 at 01:11:56PM +0000, Alice Ryhl wrote:
> From: Wedson Almeida Filho <wedsonaf@gmail.com>
> 
> Allow for the creation of a file descriptor in two steps: first, we
> reserve a slot for it, then we commit or drop the reservation. The first
> step may fail (e.g., the current process ran out of available slots),
> but commit and drop never fail (and are mutually exclusive).
> 
> This is needed by Rust Binder when fds are sent from one process to
> another. It has to be a two-step process to properly handle the case
> where multiple fds are sent: The operation must fail or succeed
> atomically, which we achieve by first reserving the fds we need, and
> only installing the files once we have reserved enough fds to send the
> files.
> 
> Fd reservations assume that the value of `current` does not change
> between the call to get_unused_fd_flags and the call to fd_install (or
> put_unused_fd). By not implementing the Send trait, this abstraction
> ensures that the `FileDescriptorReservation` cannot be moved into a
> different process.
> 
> Signed-off-by: Wedson Almeida Filho <wedsonaf@gmail.com>
> Co-developed-by: Alice Ryhl <aliceryhl@google.com>
> Signed-off-by: Alice Ryhl <aliceryhl@google.com>
> ---
>  rust/kernel/file.rs | 64 ++++++++++++++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 63 insertions(+), 1 deletion(-)
> 
> diff --git a/rust/kernel/file.rs b/rust/kernel/file.rs
> index f1f71c3d97e2..2186a6ea3f2f 100644
> --- a/rust/kernel/file.rs
> +++ b/rust/kernel/file.rs
> @@ -11,7 +11,7 @@
>      error::{code::*, Error, Result},
>      types::{ARef, AlwaysRefCounted, Opaque},
>  };
> -use core::ptr;
> +use core::{marker::PhantomData, ptr};
>  
>  /// Flags associated with a [`File`].
>  pub mod flags {
> @@ -180,6 +180,68 @@ unsafe fn dec_ref(obj: ptr::NonNull<Self>) {
>      }
>  }
>  
> +/// A file descriptor reservation.
> +///
> +/// This allows the creation of a file descriptor in two steps: first, we reserve a slot for it,
> +/// then we commit or drop the reservation. The first step may fail (e.g., the current process ran
> +/// out of available slots), but commit and drop never fail (and are mutually exclusive).
> +///
> +/// Dropping the reservation happens in the destructor of this type.
> +///
> +/// # Invariants
> +///
> +/// The fd stored in this struct must correspond to a reserved file descriptor of the current task.
> +pub struct FileDescriptorReservation {

Can we follow the traditional file terminology, i.e.,
get_unused_fd_flags() and fd_install()? At least at the beginning this
might be quite helpful instead of having to mentally map new() and
commit() onto the C functions.

> +    fd: u32,
> +    /// Prevent values of this type from being moved to a different task.
> +    ///
> +    /// This is necessary because the C FFI calls assume that `current` is set to the task that
> +    /// owns the fd in question.
> +    _not_send_sync: PhantomData<*mut ()>,

I don't fully understand this. Can you explain in a little more detail
what you mean by this and how this works?

> +}
> +
> +impl FileDescriptorReservation {
> +    /// Creates a new file descriptor reservation.
> +    pub fn new(flags: u32) -> Result<Self> {
> +        // SAFETY: FFI call, there are no safety requirements on `flags`.
> +        let fd: i32 = unsafe { bindings::get_unused_fd_flags(flags) };
> +        if fd < 0 {
> +            return Err(Error::from_errno(fd));
> +        }
> +        Ok(Self {
> +            fd: fd as _,

This is a cast to a u32?

> +            _not_send_sync: PhantomData,

Can you please draft a quick example how that return value would be
expected to be used by a caller? It's really not clear 

> +        })
> +    }
> +
> +    /// Returns the file descriptor number that was reserved.
> +    pub fn reserved_fd(&self) -> u32 {
> +        self.fd
> +    }
> +
> +    /// Commits the reservation.
> +    ///
> +    /// The previously reserved file descriptor is bound to `file`. This method consumes the
> +    /// [`FileDescriptorReservation`], so it will not be usable after this call.
> +    pub fn commit(self, file: ARef<File>) {
> +        // SAFETY: `self.fd` was previously returned by `get_unused_fd_flags`, and `file.ptr` is
> +        // guaranteed to have an owned ref count by its type invariants.
> +        unsafe { bindings::fd_install(self.fd, file.0.get()) };

Why file.0.get()? Where did that come from?

> +
> +        // `fd_install` consumes both the file descriptor and the file reference, so we cannot run
> +        // the destructors.
> +        core::mem::forget(self);
> +        core::mem::forget(file);
> +    }
> +}
> +
> +impl Drop for FileDescriptorReservation {
> +    fn drop(&mut self) {
> +        // SAFETY: `self.fd` was returned by a previous call to `get_unused_fd_flags`.
> +        unsafe { bindings::put_unused_fd(self.fd) };
> +    }
> +}
> +
>  /// Represents the `EBADF` error code.
>  ///
>  /// Used for methods that can only fail with `EBADF`.
> 
> -- 
> 2.43.0.rc1.413.gea7ed67945-goog
> 


* Re: [PATCH 5/7] rust: file: add `Kuid` wrapper
  2023-11-29 13:12 ` [PATCH 5/7] rust: file: add `Kuid` wrapper Alice Ryhl
@ 2023-11-29 16:28   ` Christian Brauner
  2023-11-29 16:48     ` Peter Zijlstra
  2023-11-30  9:36     ` Alice Ryhl
  2023-11-30 10:36   ` Peter Zijlstra
  2023-11-30 16:48   ` Benno Lossin
  2 siblings, 2 replies; 96+ messages in thread
From: Christian Brauner @ 2023-11-29 16:28 UTC (permalink / raw)
  To: Alice Ryhl
  Cc: Miguel Ojeda, Alex Gaynor, Wedson Almeida Filho, Boqun Feng,
	Gary Guo, Björn Roy Baron, Benno Lossin, Andreas Hindborg,
	Peter Zijlstra, Alexander Viro, Greg Kroah-Hartman,
	Arve Hjønnevåg, Todd Kjos, Martijn Coenen,
	Joel Fernandes, Carlos Llamas, Suren Baghdasaryan, Dan Williams,
	Kees Cook, Matthew Wilcox, Thomas Gleixner, Daniel Xu,
	linux-kernel, rust-for-linux, linux-fsdevel

On Wed, Nov 29, 2023 at 01:12:17PM +0000, Alice Ryhl wrote:
> Adds a wrapper around `kuid_t` called `Kuid`. This allows us to define
> various operations on kuids such as equality and current_euid. It also
> lets us provide conversions from kuid into userspace values.
> 
> Rust Binder needs these operations because it needs to compare kuids for
> equality, and it needs to tell userspace about the pid and uid of
> incoming transactions.
> 
> Signed-off-by: Alice Ryhl <aliceryhl@google.com>
> ---
>  rust/bindings/bindings_helper.h |  1 +
>  rust/helpers.c                  | 45 ++++++++++++++++++++++++++
>  rust/kernel/cred.rs             |  5 +--
>  rust/kernel/task.rs             | 71 ++++++++++++++++++++++++++++++++++++++++-
>  4 files changed, 119 insertions(+), 3 deletions(-)
> 
> diff --git a/rust/bindings/bindings_helper.h b/rust/bindings/bindings_helper.h
> index 81b13a953eae..700f01840188 100644
> --- a/rust/bindings/bindings_helper.h
> +++ b/rust/bindings/bindings_helper.h
> @@ -11,6 +11,7 @@
>  #include <linux/errname.h>
>  #include <linux/file.h>
>  #include <linux/fs.h>
> +#include <linux/pid_namespace.h>
>  #include <linux/security.h>
>  #include <linux/slab.h>
>  #include <linux/refcount.h>
> diff --git a/rust/helpers.c b/rust/helpers.c
> index fd633d9db79a..58e3a9dff349 100644
> --- a/rust/helpers.c
> +++ b/rust/helpers.c
> @@ -142,6 +142,51 @@ void rust_helper_put_task_struct(struct task_struct *t)
>  }
>  EXPORT_SYMBOL_GPL(rust_helper_put_task_struct);
>  
> +kuid_t rust_helper_task_uid(struct task_struct *task)
> +{
> +	return task_uid(task);
> +}
> +EXPORT_SYMBOL_GPL(rust_helper_task_uid);
> +
> +kuid_t rust_helper_task_euid(struct task_struct *task)
> +{
> +	return task_euid(task);
> +}
> +EXPORT_SYMBOL_GPL(rust_helper_task_euid);
> +
> +#ifndef CONFIG_USER_NS
> +uid_t rust_helper_from_kuid(struct user_namespace *to, kuid_t uid)
> +{
> +	return from_kuid(to, uid);
> +}
> +EXPORT_SYMBOL_GPL(rust_helper_from_kuid);
> +#endif /* CONFIG_USER_NS */
> +
> +bool rust_helper_uid_eq(kuid_t left, kuid_t right)
> +{
> +	return uid_eq(left, right);
> +}
> +EXPORT_SYMBOL_GPL(rust_helper_uid_eq);
> +
> +kuid_t rust_helper_current_euid(void)
> +{
> +	return current_euid();
> +}
> +EXPORT_SYMBOL_GPL(rust_helper_current_euid);
> +
> +struct user_namespace *rust_helper_current_user_ns(void)
> +{
> +	return current_user_ns();
> +}
> +EXPORT_SYMBOL_GPL(rust_helper_current_user_ns);
> +
> +pid_t rust_helper_task_tgid_nr_ns(struct task_struct *tsk,
> +				  struct pid_namespace *ns)
> +{
> +	return task_tgid_nr_ns(tsk, ns);
> +}
> +EXPORT_SYMBOL_GPL(rust_helper_task_tgid_nr_ns);

I'm a bit puzzled by all these rust_helper_*() calls. Can you explain
why they are needed? Because they are/can be static inlines and that
somehow doesn't work?

> +
>  struct kunit *rust_helper_kunit_get_current_test(void)
>  {
>  	return kunit_get_current_test();
> diff --git a/rust/kernel/cred.rs b/rust/kernel/cred.rs
> index 3794937b5294..fbc749788bfa 100644
> --- a/rust/kernel/cred.rs
> +++ b/rust/kernel/cred.rs
> @@ -8,6 +8,7 @@
>  
>  use crate::{
>      bindings,
> +    task::Kuid,
>      types::{AlwaysRefCounted, Opaque},
>  };
>  
> @@ -52,9 +53,9 @@ pub fn get_secid(&self) -> u32 {
>      }
>  
>      /// Returns the effective UID of the given credential.
> -    pub fn euid(&self) -> bindings::kuid_t {
> +    pub fn euid(&self) -> Kuid {
>          // SAFETY: By the type invariant, we know that `self.0` is valid.
> -        unsafe { (*self.0.get()).euid }
> +        Kuid::from_raw(unsafe { (*self.0.get()).euid })
>      }
>  }
>  
> diff --git a/rust/kernel/task.rs b/rust/kernel/task.rs
> index b2299bc7ac1f..1a27b968a907 100644
> --- a/rust/kernel/task.rs
> +++ b/rust/kernel/task.rs
> @@ -5,7 +5,12 @@
>  //! C header: [`include/linux/sched.h`](../../../../include/linux/sched.h).
>  
>  use crate::{bindings, types::Opaque};
> -use core::{marker::PhantomData, ops::Deref, ptr};
> +use core::{
> +    cmp::{Eq, PartialEq},
> +    marker::PhantomData,
> +    ops::Deref,
> +    ptr,
> +};
>  
>  /// Returns the currently running task.
>  #[macro_export]
> @@ -78,6 +83,12 @@ unsafe impl Sync for Task {}
>  /// The type of process identifiers (PIDs).
>  type Pid = bindings::pid_t;
>  
> +/// The type of user identifiers (UIDs).
> +#[derive(Copy, Clone)]
> +pub struct Kuid {
> +    kuid: bindings::kuid_t,
> +}
> +
>  impl Task {
>      /// Returns a task reference for the currently executing task/thread.
>      ///
> @@ -132,12 +143,34 @@ pub fn pid(&self) -> Pid {
>          unsafe { *ptr::addr_of!((*self.0.get()).pid) }
>      }
>  
> +    /// Returns the UID of the given task.
> +    pub fn uid(&self) -> Kuid {
> +        // SAFETY: By the type invariant, we know that `self.0` is valid.
> +        Kuid::from_raw(unsafe { bindings::task_uid(self.0.get()) })
> +    }
> +
> +    /// Returns the effective UID of the given task.
> +    pub fn euid(&self) -> Kuid {
> +        // SAFETY: By the type invariant, we know that `self.0` is valid.
> +        Kuid::from_raw(unsafe { bindings::task_euid(self.0.get()) })
> +    }
> +
>      /// Determines whether the given task has pending signals.
>      pub fn signal_pending(&self) -> bool {
>          // SAFETY: By the type invariant, we know that `self.0` is valid.
>          unsafe { bindings::signal_pending(self.0.get()) != 0 }
>      }
>  
> +    /// Returns the given task's pid in the current pid namespace.
> +    pub fn pid_in_current_ns(&self) -> Pid {
> +        // SAFETY: We know that `self.0.get()` is valid by the type invariant. The rest is just FFI
> +        // calls.
> +        unsafe {
> +            let namespace = bindings::task_active_pid_ns(bindings::get_current());
> +            bindings::task_tgid_nr_ns(self.0.get(), namespace)
> +        }
> +    }
> +
>      /// Wakes up the task.
>      pub fn wake_up(&self) {
>          // SAFETY: By the type invariant, we know that `self.0.get()` is non-null and valid.
> @@ -147,6 +180,42 @@ pub fn wake_up(&self) {
>      }
>  }
>  
> +impl Kuid {
> +    /// Get the current euid.
> +    pub fn current_euid() -> Kuid {
> +        // SAFETY: Just an FFI call.
> +        Self {
> +            kuid: unsafe { bindings::current_euid() },
> +        }
> +    }
> +
> +    /// Create a `Kuid` given the raw C type.
> +    pub fn from_raw(kuid: bindings::kuid_t) -> Self {
> +        Self { kuid }
> +    }
> +
> +    /// Turn this kuid into the raw C type.
> +    pub fn into_raw(self) -> bindings::kuid_t {
> +        self.kuid
> +    }
> +
> +    /// Converts this kernel UID into a UID that userspace understands. Uses the namespace of the
> +    /// current task.
> +    pub fn into_uid_in_current_ns(self) -> bindings::uid_t {

Hm, I wouldn't special-case this. Just expose from_kuid() and let it
take a namespace argument, no? You don't need to provide bindings for
namespaces ofc.

> +        // SAFETY: Just an FFI call.
> +        unsafe { bindings::from_kuid(bindings::current_user_ns(), self.kuid) }
> +    }
> +}
> +
> +impl PartialEq for Kuid {
> +    fn eq(&self, other: &Kuid) -> bool {
> +        // SAFETY: Just an FFI call.
> +        unsafe { bindings::uid_eq(self.kuid, other.kuid) }
> +    }
> +}
> +
> +impl Eq for Kuid {}

Do you need that?

> +
>  // SAFETY: The type invariants guarantee that `Task` is always ref-counted.
>  unsafe impl crate::types::AlwaysRefCounted for Task {
>      fn inc_ref(&self) {
> 
> -- 
> 2.43.0.rc1.413.gea7ed67945-goog
> 


* Re: [PATCH 0/7] File abstractions needed by Rust Binder
  2023-11-29 12:51 [PATCH 0/7] File abstractions needed by Rust Binder Alice Ryhl
                   ` (6 preceding siblings ...)
  2023-11-29 13:12 ` [PATCH 7/7] rust: file: add abstraction for `poll_table` Alice Ryhl
@ 2023-11-29 16:31 ` Christian Brauner
  2023-11-29 16:48   ` Miguel Ojeda
  2023-12-06 20:05   ` Kent Overstreet
  7 siblings, 2 replies; 96+ messages in thread
From: Christian Brauner @ 2023-11-29 16:31 UTC (permalink / raw)
  To: Alice Ryhl
  Cc: Miguel Ojeda, Alex Gaynor, Wedson Almeida Filho, Boqun Feng,
	Gary Guo, Björn Roy Baron, Benno Lossin, Andreas Hindborg,
	Peter Zijlstra, Alexander Viro, Greg Kroah-Hartman,
	Arve Hjønnevåg, Todd Kjos, Martijn Coenen,
	Joel Fernandes, Carlos Llamas, Suren Baghdasaryan, Dan Williams,
	Kees Cook, Matthew Wilcox, Thomas Gleixner, Daniel Xu,
	linux-kernel, rust-for-linux, linux-fsdevel

On Wed, Nov 29, 2023 at 12:51:06PM +0000, Alice Ryhl wrote:
> This patchset contains the file abstractions needed by the Rust
> implementation of the Binder driver.
> 
> Please see the Rust Binder RFC for usage examples:
> https://lore.kernel.org/rust-for-linux/20231101-rust-binder-v1-0-08ba9197f637@google.com/
> 
> Users of "rust: file: add Rust abstraction for `struct file`":
> 	[PATCH RFC 02/20] rust_binder: add binderfs support to Rust binder
> 	[PATCH RFC 03/20] rust_binder: add threading support
> 
> Users of "rust: cred: add Rust abstraction for `struct cred`":
> 	[PATCH RFC 05/20] rust_binder: add nodes and context managers
> 	[PATCH RFC 06/20] rust_binder: add oneway transactions
> 	[PATCH RFC 11/20] rust_binder: send nodes in transaction
> 	[PATCH RFC 13/20] rust_binder: add BINDER_TYPE_FD support
> 
> Users of "rust: security: add abstraction for security_secid_to_secctx":
> 	[PATCH RFC 06/20] rust_binder: add oneway transactions
> 
> Users of "rust: file: add `FileDescriptorReservation`":
> 	[PATCH RFC 13/20] rust_binder: add BINDER_TYPE_FD support
> 	[PATCH RFC 14/20] rust_binder: add BINDER_TYPE_FDA support
> 
> Users of "rust: file: add kuid getters":
> 	[PATCH RFC 05/20] rust_binder: add nodes and context managers
> 	[PATCH RFC 06/20] rust_binder: add oneway transactions
> 
> Users of "rust: file: add `DeferredFdCloser`":
> 	[PATCH RFC 14/20] rust_binder: add BINDER_TYPE_FDA support
> 
> Users of "rust: file: add abstraction for `poll_table`":
> 	[PATCH RFC 07/20] rust_binder: add epoll support
> 
> This patchset has some uses of read_volatile in place of READ_ONCE.
> Please see the following rfc for context on this:
> https://lore.kernel.org/all/20231025195339.1431894-1-boqun.feng@gmail.com/
> 
> This was previously sent as an rfc:
> https://lore.kernel.org/all/20230720152820.3566078-1-aliceryhl@google.com/
> 
> Signed-off-by: Alice Ryhl <aliceryhl@google.com>
> ---
> Alice Ryhl (4):
>       rust: security: add abstraction for security_secid_to_secctx
>       rust: file: add `Kuid` wrapper
>       rust: file: add `DeferredFdCloser`
>       rust: file: add abstraction for `poll_table`
> 
> Wedson Almeida Filho (3):
>       rust: file: add Rust abstraction for `struct file`
>       rust: cred: add Rust abstraction for `struct cred`
>       rust: file: add `FileDescriptorReservation`
> 
>  rust/bindings/bindings_helper.h |   9 ++
>  rust/bindings/lib.rs            |   1 +
>  rust/helpers.c                  |  94 +++++++++++
>  rust/kernel/cred.rs             |  73 +++++++++
>  rust/kernel/file.rs             | 345 ++++++++++++++++++++++++++++++++++++++++
>  rust/kernel/file/poll_table.rs  |  97 +++++++++++

That's pretty far away from the subsystem these wrappers belong to. I
would prefer if wrappers such as this would live directly in fs/rust/
and so live within the subsystem they belong to. I think I mentioned
that before. Maybe I missed some sort of agreement here?

>  rust/kernel/lib.rs              |   3 +
>  rust/kernel/security.rs         |  78 +++++++++
>  rust/kernel/sync/condvar.rs     |   2 +-
>  rust/kernel/task.rs             |  71 ++++++++-
>  10 files changed, 771 insertions(+), 2 deletions(-)
> ---
> base-commit: 98b1cc82c4affc16f5598d4fa14b1858671b2263
> change-id: 20231123-alice-file-525b98e8a724
> 
> Best regards,
> -- 
> Alice Ryhl <aliceryhl@google.com>
> 


* Re: [PATCH 1/7] rust: file: add Rust abstraction for `struct file`
  2023-11-29 15:13   ` Matthew Wilcox
  2023-11-29 15:23     ` Peter Zijlstra
@ 2023-11-29 16:42     ` Alice Ryhl
  2023-11-29 16:45       ` Peter Zijlstra
  2023-11-30 15:02     ` Benno Lossin
  2 siblings, 1 reply; 96+ messages in thread
From: Alice Ryhl @ 2023-11-29 16:42 UTC (permalink / raw)
  To: willy
  Cc: a.hindborg, alex.gaynor, aliceryhl, arve, benno.lossin,
	bjorn3_gh, boqun.feng, brauner, cmllamas, dan.j.williams, dxu,
	gary, gregkh, joel, keescook, linux-fsdevel, linux-kernel, maco,
	ojeda, peterz, rust-for-linux, surenb, tglx, tkjos, viro,
	wedsonaf

Matthew Wilcox <willy@infradead.org>
> I haven't looked at how Rust-for-Linux handles errors yet, but it's
> disappointing to see that it doesn't do something like the PTR_ERR /
> ERR_PTR / IS_ERR C thing under the hood.

It would be cool to do that, but we haven't written the infrastructure
to do that yet. (Note that in this particular case, the C function also
returns the error as a null pointer.)
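
As a rough userspace illustration of the single-pointer claim in the
commit message (not kernel code): `BadFd` and `from_raw` below are
made-up stand-ins for `BadFdError` and `File::from_fd`, and the size
check reflects what current compilers do with a one-value error type
next to a non-null pointer.

```rust
use core::mem::size_of;
use core::ptr::NonNull;

/// Error type with exactly one possible value, modelled on `BadFdError`.
/// (Hypothetical stand-in, not the type from the patch.)
#[derive(Debug, PartialEq, Eq)]
pub struct BadFd;

/// Hypothetical lookup: a null pointer handed back by "C" becomes `Err(BadFd)`.
pub fn from_raw(ptr: *mut u8) -> Result<NonNull<u8>, BadFd> {
    NonNull::new(ptr).ok_or(BadFd)
}

/// True when the whole `Result` fits in one pointer, i.e. the error case
/// is stored as the otherwise impossible null value.
pub fn result_is_pointer_sized() -> bool {
    size_of::<Result<NonNull<u8>, BadFd>>() == size_of::<*mut u8>()
}
```

So the Rust-side return value already has the same shape as the C
function's "pointer or NULL" convention; only the ERR_PTR-style
encoding of several distinct errno values is missing.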

>> @@ -157,6 +158,12 @@ void rust_helper_init_work_with_key(struct work_struct *work, work_func_t func,
>>  }
>>  EXPORT_SYMBOL_GPL(rust_helper_init_work_with_key);
>>  
>> +struct file *rust_helper_get_file(struct file *f)
>> +{
>> +	return get_file(f);
>> +}
>> +EXPORT_SYMBOL_GPL(rust_helper_get_file);
> 
> This is ridiculous.  A function call instead of doing the
> atomic_long_inc() in Rust?

I think there are two factors to consider here:

First, doing the atomic increment from Rust currently runs into the
memory model split between the C++ and LKMM memory models. It would be
like using the C11 atomic_fetch_add instead of the one that the kernel
defines for LKMM using inline assembly. When I discussed this with Paul
McKenney, we were advised that it's best to avoid mixing the memory
models.

Avoiding this would require that we replicate the inline assembly that C
uses to define its atomic operations on the Rust side. This is something
that I think should be done, but it hasn't been done yet.
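
For reference, a sketch of what the "just do the increment in Rust"
variant would look like with the atomics from `core`. `Counted` is a
made-up type, not the layout of `struct file`, and the point of the
sketch is that `fetch_add` here follows the C++-style model rather than
LKMM, which is exactly the mixing described above.

```rust
use core::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical refcounted object; it does not mirror `struct file`.
pub struct Counted {
    count: AtomicUsize,
}

impl Counted {
    /// Starts with one reference, held by the creator.
    pub fn new() -> Self {
        Self { count: AtomicUsize::new(1) }
    }

    /// Takes another reference and returns the new count.
    /// This is a relaxed increment under Rust's (C++-style) memory model,
    /// not the kernel's LKMM `atomic_long_inc()`.
    pub fn get(&self) -> usize {
        self.count.fetch_add(1, Ordering::Relaxed) + 1
    }
}
```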


Second, there's potentially an increased maintenance burden when C
methods are reimplemented in Rust. Any change to the implementation on
the C side would have to be reflected on the Rust side.

Alice


* Re: [PATCH 1/7] rust: file: add Rust abstraction for `struct file`
  2023-11-29 16:42     ` Alice Ryhl
@ 2023-11-29 16:45       ` Peter Zijlstra
  0 siblings, 0 replies; 96+ messages in thread
From: Peter Zijlstra @ 2023-11-29 16:45 UTC (permalink / raw)
  To: Alice Ryhl
  Cc: willy, a.hindborg, alex.gaynor, arve, benno.lossin, bjorn3_gh,
	boqun.feng, brauner, cmllamas, dan.j.williams, dxu, gary, gregkh,
	joel, keescook, linux-fsdevel, linux-kernel, maco, ojeda,
	rust-for-linux, surenb, tglx, tkjos, viro, wedsonaf

On Wed, Nov 29, 2023 at 04:42:51PM +0000, Alice Ryhl wrote:
> Second, there's potentially an increased maintenance burden when C
> methods are reimplemented in Rust. Any change to the implementation on
> the C side would have to be reflected on the Rust side.

C to Rust compiler FTW :-)


* Re: [PATCH 5/7] rust: file: add `Kuid` wrapper
  2023-11-29 16:28   ` Christian Brauner
@ 2023-11-29 16:48     ` Peter Zijlstra
  2023-11-30 12:46       ` Christian Brauner
  2023-11-30  9:36     ` Alice Ryhl
  1 sibling, 1 reply; 96+ messages in thread
From: Peter Zijlstra @ 2023-11-29 16:48 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Alice Ryhl, Miguel Ojeda, Alex Gaynor, Wedson Almeida Filho,
	Boqun Feng, Gary Guo, Björn Roy Baron, Benno Lossin,
	Andreas Hindborg, Alexander Viro, Greg Kroah-Hartman,
	Arve Hjønnevåg, Todd Kjos, Martijn Coenen,
	Joel Fernandes, Carlos Llamas, Suren Baghdasaryan, Dan Williams,
	Kees Cook, Matthew Wilcox, Thomas Gleixner, Daniel Xu,
	linux-kernel, rust-for-linux, linux-fsdevel

On Wed, Nov 29, 2023 at 05:28:27PM +0100, Christian Brauner wrote:

> > +pid_t rust_helper_task_tgid_nr_ns(struct task_struct *tsk,
> > +				  struct pid_namespace *ns)
> > +{
> > +	return task_tgid_nr_ns(tsk, ns);
> > +}
> > +EXPORT_SYMBOL_GPL(rust_helper_task_tgid_nr_ns);
> 
> I'm a bit puzzled by all these rust_helper_*() calls. Can you explain
> why they are needed? Because they are/can be static inlines and that
> somehow doesn't work?

Correct, because Rust can only talk to C ABI, it cannot use C headers.
Bindgen would need to translate the full C headers into valid Rust for
that to work.

I really think the Rust peoples should spend more effort on that,
because you are quite right, all this wrappery is tedious at best.


* Re: [PATCH 0/7] File abstractions needed by Rust Binder
  2023-11-29 16:31 ` [PATCH 0/7] File abstractions needed by Rust Binder Christian Brauner
@ 2023-11-29 16:48   ` Miguel Ojeda
  2023-12-06 20:05   ` Kent Overstreet
  1 sibling, 0 replies; 96+ messages in thread
From: Miguel Ojeda @ 2023-11-29 16:48 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Alice Ryhl, Miguel Ojeda, Alex Gaynor, Wedson Almeida Filho,
	Boqun Feng, Gary Guo, Björn Roy Baron, Benno Lossin,
	Andreas Hindborg, Peter Zijlstra, Alexander Viro,
	Greg Kroah-Hartman, Arve Hjønnevåg, Todd Kjos,
	Martijn Coenen, Joel Fernandes, Carlos Llamas,
	Suren Baghdasaryan, Dan Williams, Kees Cook, Matthew Wilcox,
	Thomas Gleixner, Daniel Xu, linux-kernel, rust-for-linux,
	linux-fsdevel

On Wed, Nov 29, 2023 at 5:31 PM Christian Brauner <brauner@kernel.org> wrote:
>
> That's pretty far away from the subsystem these wrappers belong to. I
> would prefer if wrappers such as this would live directly in fs/rust/
> and so live within the subsystem they belong to. I think I mentioned
> that before. Maybe I missed some sort of agreement here?

The plan is that the code will be moved to the right places when the
new build system is in place (WIP). Currently the "abstractions" go
inside the `kernel` crate.

Cheers,
Miguel


* Re: [PATCH 4/7] rust: file: add `FileDescriptorReservation`
  2023-11-29 16:14   ` Christian Brauner
@ 2023-11-29 16:55     ` Alice Ryhl
  2023-11-29 17:14       ` Alice Ryhl
  2023-11-30  9:09       ` Christian Brauner
  0 siblings, 2 replies; 96+ messages in thread
From: Alice Ryhl @ 2023-11-29 16:55 UTC (permalink / raw)
  To: brauner
  Cc: a.hindborg, alex.gaynor, aliceryhl, arve, benno.lossin,
	bjorn3_gh, boqun.feng, cmllamas, dan.j.williams, dxu, gary,
	gregkh, joel, keescook, linux-fsdevel, linux-kernel, maco, ojeda,
	peterz, rust-for-linux, surenb, tglx, tkjos, viro, wedsonaf,
	willy

Christian Brauner <brauner@kernel.org> writes:
> Can we follow the traditional file terminology, i.e.,
> get_unused_fd_flags() and fd_install()? At least at the beginning this
> might be quite helpful instead of having to mentally map new() and
> commit() onto the C functions.

Sure, I'll do that in the next version.

>> +    /// Prevent values of this type from being moved to a different task.
>> +    ///
>> +    /// This is necessary because the C FFI calls assume that `current` is set to the task that
>> +    /// owns the fd in question.
>> +    _not_send_sync: PhantomData<*mut ()>,
> 
> I don't fully understand this. Can you explain in a little more detail
> what you mean by this and how this works?

Yeah, so, this has to do with the Rust trait `Send` that controls
whether it's okay for a value to get moved from one thread to another.
In this case, we don't want it to be `Send` so that it can't be moved to
another thread, since current might be different there.

The `Send` trait is automatically applied to structs whenever *all*
fields of the struct are `Send`. So to ensure that a struct is not
`Send`, you add a field that is not `Send`.

The `PhantomData` type used here is a special zero-sized type.
Basically, it says "pretend this struct has a field of type `*mut ()`,
but don't actually add the field". So for the purposes of `Send`, it has
a non-Send field, but since it's wrapped in `PhantomData`, the field is
not there at runtime.
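
A minimal userspace sketch of that mechanism, assuming nothing from the
kernel crate: `Reservation` is a hypothetical stand-in for
`FileDescriptorReservation`, with the same marker field.

```rust
use core::marker::PhantomData;
use core::mem::size_of;

/// Hypothetical stand-in for `FileDescriptorReservation`; not the kernel type.
pub struct Reservation {
    fd: u32,
    // Raw pointers are neither `Send` nor `Sync`, so this marker opts the
    // whole struct out of both auto traits without storing anything.
    _not_send_sync: PhantomData<*mut ()>,
}

impl Reservation {
    pub fn new(fd: u32) -> Self {
        Self { fd, _not_send_sync: PhantomData }
    }

    pub fn fd(&self) -> u32 {
        self.fd
    }
}

/// The marker is zero-sized: the struct is exactly as large as its `u32`.
pub fn marker_is_free() -> bool {
    size_of::<Reservation>() == size_of::<u32>()
}

// Uncommenting the two lines below makes compilation fail, because
// `Reservation` does not implement `Send`:
//
// fn assert_send<T: Send>() {}
// fn check() { assert_send::<Reservation>(); }
```

The same compile error is what a caller would hit when trying to hand
such a value to another thread, which is how the "must stay on the task
that reserved it" rule gets enforced.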

>> +        Ok(Self {
>> +            fd: fd as _,
> 
> This is a cast to a u32?

Yes.

> Can you please draft a quick example how that return value would be
> expected to be used by a caller? It's really not clear

The most basic usage would look like this:

	// First, reserve the fd.
	let reservation = FileDescriptorReservation::new(O_CLOEXEC)?;

	// Then, somehow get a file to put in it.
	let file = get_file_using_fallible_operation()?;

	// Finally, commit it to the fd.
	reservation.commit(file);

In Rust Binder, reservations are used here:
https://github.com/Darksonn/linux/blob/dca45e6c7848e024709b165a306cdbe88e5b086a/drivers/android/allocation.rs#L199-L210
https://github.com/Darksonn/linux/blob/dca45e6c7848e024709b165a306cdbe88e5b086a/drivers/android/allocation.rs#L512-L541
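
To show why the two-step shape gives all-or-nothing behaviour when
several fds are sent, here is a userspace simulation. Everything in it
(`FdTable`, `Slot`, `send_all`) is invented for illustration and only
mimics the reserve / commit / drop pattern; it does not call the real fd
functions.

```rust
use std::cell::RefCell;

/// State of one slot in the hypothetical fd table.
#[derive(Debug, Clone, PartialEq)]
pub enum Slot {
    Free,
    Reserved,
    Installed(String),
}

/// Hypothetical fixed-size fd table standing in for a task's fd table.
pub struct FdTable {
    slots: RefCell<Vec<Slot>>,
}

impl FdTable {
    pub fn new(n: usize) -> Self {
        Self { slots: RefCell::new(vec![Slot::Free; n]) }
    }

    pub fn slot(&self, fd: usize) -> Slot {
        self.slots.borrow()[fd].clone()
    }

    /// Step one: may fail when no slot is free.
    pub fn reserve(&self) -> Option<Reservation<'_>> {
        let mut slots = self.slots.borrow_mut();
        let fd = slots.iter().position(|s| *s == Slot::Free)?;
        slots[fd] = Slot::Reserved;
        Some(Reservation { table: self, fd })
    }
}

pub struct Reservation<'a> {
    table: &'a FdTable,
    fd: usize,
}

impl Reservation<'_> {
    /// Step two: cannot fail. Consumes the reservation.
    pub fn commit(self, file: String) {
        self.table.slots.borrow_mut()[self.fd] = Slot::Installed(file);
        // The slot now belongs to the table; skip `Drop` so it is not freed.
        core::mem::forget(self);
    }
}

impl Drop for Reservation<'_> {
    /// Dropping an uncommitted reservation gives the slot back.
    fn drop(&mut self) {
        self.table.slots.borrow_mut()[self.fd] = Slot::Free;
    }
}

/// Installs every file or none of them.
pub fn send_all(table: &FdTable, files: Vec<String>) -> bool {
    let mut reserved = Vec::new();
    for _ in &files {
        match table.reserve() {
            Some(r) => reserved.push(r),
            // `reserved` is dropped on return, releasing every slot taken so far.
            None => return false,
        }
    }
    for (r, f) in reserved.into_iter().zip(files) {
        r.commit(f);
    }
    true
}
```

With a two-slot table, sending three files fails and leaves both slots
free again; sending two installs both.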

>> +    pub fn commit(self, file: ARef<File>) {
>> +        // SAFETY: `self.fd` was previously returned by `get_unused_fd_flags`, and `file.ptr` is
>> +        // guaranteed to have an owned ref count by its type invariants.
>> +        unsafe { bindings::fd_install(self.fd, file.0.get()) };
> 
> Why file.0.get()? Where did that come from?

This gets a raw pointer to the C type.

The `.0` part is a field access. `File` is a tuple struct, so its fields
are unnamed, but they can still be accessed by index. The access goes
through the `ARef`, which dereferences to the `File` it points at.
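
A small standalone sketch of that access pattern. The types are
hypothetical: `RawFile` plays the C struct, `File` a tuple-struct
wrapper around it, and `Owned` an owning smart pointer in the role of
`ARef<File>`; the real kernel types differ in detail.

```rust
use core::cell::UnsafeCell;
use core::ops::Deref;

/// Hypothetical stand-in for the C `struct file`.
pub struct RawFile {
    pub flags: u32,
}

/// Tuple struct: its single field has no name and is reached as `.0`.
pub struct File(UnsafeCell<RawFile>);

/// Hypothetical owning smart pointer, playing the role of `ARef<File>`.
pub struct Owned(Box<File>);

impl Deref for Owned {
    type Target = File;

    fn deref(&self) -> &File {
        &self.0
    }
}

pub fn new_file(flags: u32) -> Owned {
    Owned(Box::new(File(UnsafeCell::new(RawFile { flags }))))
}

/// `file.0` is the cell holding the C-side data, and `.get()` yields the
/// raw pointer that an FFI call would take.
pub fn raw_ptr(file: &File) -> *mut RawFile {
    file.0.get()
}
```

Calling `raw_ptr(&owned)` on an `Owned` works because the reference is
dereferenced to `&File` first; the `.0.get()` chain then produces the
`*mut RawFile`.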

Alice


* Re: [PATCH 1/7] rust: file: add Rust abstraction for `struct file`
  2023-11-29 12:51 ` [PATCH 1/7] rust: file: add Rust abstraction for `struct file` Alice Ryhl
  2023-11-29 15:13   ` Matthew Wilcox
@ 2023-11-29 17:06   ` Christian Brauner
  2023-11-29 21:27     ` Alice Ryhl
  2023-11-30 14:53   ` Benno Lossin
  2 siblings, 1 reply; 96+ messages in thread
From: Christian Brauner @ 2023-11-29 17:06 UTC (permalink / raw)
  To: Alice Ryhl
  Cc: Miguel Ojeda, Alex Gaynor, Wedson Almeida Filho, Boqun Feng,
	Gary Guo, Björn Roy Baron, Benno Lossin, Andreas Hindborg,
	Peter Zijlstra, Alexander Viro, Greg Kroah-Hartman,
	Arve Hjønnevåg, Todd Kjos, Martijn Coenen,
	Joel Fernandes, Carlos Llamas, Suren Baghdasaryan, Dan Williams,
	Kees Cook, Matthew Wilcox, Thomas Gleixner, Daniel Xu,
	linux-kernel, rust-for-linux, linux-fsdevel

On Wed, Nov 29, 2023 at 12:51:07PM +0000, Alice Ryhl wrote:
> From: Wedson Almeida Filho <wedsonaf@gmail.com>
> 
> This abstraction makes it possible to manipulate the open files for a
> process. The new `File` struct wraps the C `struct file`. When accessing
> it using the smart pointer `ARef<File>`, the pointer will own a
> reference count to the file. When accessing it as `&File`, then the
> reference does not own a refcount, but the borrow checker will ensure
> that the reference count does not hit zero while the `&File` is live.

Could you explain this in more detail, please? Ideally with some C and
how that translates to your Rust wrappers, please. Sorry, this is going
to be a long journey...

> 
> Since this is intended to manipulate the open files of a process, we
> introduce a `from_fd` constructor that corresponds to the C `fget`
> method. In future patches, it will become possible to create a new fd in
> a process and bind it to a `File`. Rust Binder will use these to send
> fds from one process to another.
> 
> We also provide a method for accessing the file's flags. Rust Binder
> will use this to access the flags of the Binder fd to check whether the
> non-blocking flag is set, which affects what the Binder ioctl does.
> 
> This introduces a struct for the EBADF error type, rather than just
> using the Error type directly. This has two advantages:
> * `File::from_fd` returns a `Result<ARef<File>, BadFdError>`, which the
>   compiler will represent as a single pointer, with null being an error.
>   This is possible because the compiler understands that `BadFdError`
>   has only one possible value, and it also understands that the
>   `ARef<File>` smart pointer is guaranteed non-null.
> * Additionally, we promise to users of the method that the method can
>   only fail with EBADF, which means that they can rely on this promise
>   without having to inspect its implementation.
> That said, there are also two disadvantages:
> * Defining additional error types involves boilerplate.
> * The question mark operator will only utilize the `From` trait once,
>   which prevents you from using the question mark operator on
>   `BadFdError` in methods that return some third error type that the
>   kernel `Error` is convertible into. (However, it works fine in methods
>   that return `Error`.)
> 
> Signed-off-by: Wedson Almeida Filho <wedsonaf@gmail.com>
> Co-developed-by: Daniel Xu <dxu@dxuuu.xyz>
> Signed-off-by: Daniel Xu <dxu@dxuuu.xyz>
> Co-developed-by: Alice Ryhl <aliceryhl@google.com>
> Signed-off-by: Alice Ryhl <aliceryhl@google.com>
> ---
>  rust/bindings/bindings_helper.h |   2 +
>  rust/helpers.c                  |   7 ++
>  rust/kernel/file.rs             | 182 ++++++++++++++++++++++++++++++++++++++++
>  rust/kernel/lib.rs              |   1 +
>  4 files changed, 192 insertions(+)
> 
> diff --git a/rust/bindings/bindings_helper.h b/rust/bindings/bindings_helper.h
> index 85f013ed4ca4..beed3ef1fbc3 100644
> --- a/rust/bindings/bindings_helper.h
> +++ b/rust/bindings/bindings_helper.h
> @@ -8,6 +8,8 @@
>  
>  #include <kunit/test.h>
>  #include <linux/errname.h>
> +#include <linux/file.h>
> +#include <linux/fs.h>
>  #include <linux/slab.h>
>  #include <linux/refcount.h>
>  #include <linux/wait.h>
> diff --git a/rust/helpers.c b/rust/helpers.c
> index 70e59efd92bc..03141a3608a4 100644
> --- a/rust/helpers.c
> +++ b/rust/helpers.c
> @@ -25,6 +25,7 @@
>  #include <linux/build_bug.h>
>  #include <linux/err.h>
>  #include <linux/errname.h>
> +#include <linux/fs.h>
>  #include <linux/mutex.h>
>  #include <linux/refcount.h>
>  #include <linux/sched/signal.h>
> @@ -157,6 +158,12 @@ void rust_helper_init_work_with_key(struct work_struct *work, work_func_t func,
>  }
>  EXPORT_SYMBOL_GPL(rust_helper_init_work_with_key);
>  
> +struct file *rust_helper_get_file(struct file *f)
> +{
> +	return get_file(f);
> +}
> +EXPORT_SYMBOL_GPL(rust_helper_get_file);
> +
>  /*
>   * `bindgen` binds the C `size_t` type as the Rust `usize` type, so we can
>   * use it in contexts where Rust expects a `usize` like slice (array) indices.
> diff --git a/rust/kernel/file.rs b/rust/kernel/file.rs
> new file mode 100644
> index 000000000000..ee4ec8b919af
> --- /dev/null
> +++ b/rust/kernel/file.rs
> @@ -0,0 +1,182 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +//! Files and file descriptors.
> +//!
> +//! C headers: [`include/linux/fs.h`](../../../../include/linux/fs.h) and
> +//! [`include/linux/file.h`](../../../../include/linux/file.h)
> +
> +use crate::{
> +    bindings,
> +    error::{code::*, Error, Result},
> +    types::{ARef, AlwaysRefCounted, Opaque},
> +};
> +use core::ptr;
> +
> +/// Flags associated with a [`File`].
> +pub mod flags {
> +    /// File is opened in append mode.
> +    pub const O_APPEND: u32 = bindings::O_APPEND;
> +
> +    /// Signal-driven I/O is enabled.
> +    pub const O_ASYNC: u32 = bindings::FASYNC;
> +
> +    /// Close-on-exec flag is set.
> +    pub const O_CLOEXEC: u32 = bindings::O_CLOEXEC;
> +
> +    /// File was created if it didn't already exist.
> +    pub const O_CREAT: u32 = bindings::O_CREAT;
> +
> +    /// Direct I/O is enabled for this file.
> +    pub const O_DIRECT: u32 = bindings::O_DIRECT;
> +
> +    /// File must be a directory.
> +    pub const O_DIRECTORY: u32 = bindings::O_DIRECTORY;
> +
> +    /// Like [`O_SYNC`] except metadata is not synced.
> +    pub const O_DSYNC: u32 = bindings::O_DSYNC;
> +
> +    /// Ensure that this file is created with the `open(2)` call.
> +    pub const O_EXCL: u32 = bindings::O_EXCL;
> +
> +    /// Large file size enabled (`off64_t` over `off_t`).
> +    pub const O_LARGEFILE: u32 = bindings::O_LARGEFILE;
> +
> +    /// Do not update the file last access time.
> +    pub const O_NOATIME: u32 = bindings::O_NOATIME;
> +
> +    /// File should not be used as process's controlling terminal.
> +    pub const O_NOCTTY: u32 = bindings::O_NOCTTY;
> +
> +    /// If basename of path is a symbolic link, fail open.
> +    pub const O_NOFOLLOW: u32 = bindings::O_NOFOLLOW;
> +
> +    /// File is using nonblocking I/O.
> +    pub const O_NONBLOCK: u32 = bindings::O_NONBLOCK;
> +
> +    /// Also known as `O_NDELAY`.
> +    ///
> +    /// This is effectively the same flag as [`O_NONBLOCK`] on all architectures
> +    /// except SPARC64.
> +    pub const O_NDELAY: u32 = bindings::O_NDELAY;
> +
> +    /// Used to obtain a path file descriptor.
> +    pub const O_PATH: u32 = bindings::O_PATH;
> +
> +    /// Write operations on this file will flush data and metadata.
> +    pub const O_SYNC: u32 = bindings::O_SYNC;
> +
> +    /// This file is an unnamed temporary regular file.
> +    pub const O_TMPFILE: u32 = bindings::O_TMPFILE;
> +
> +    /// File should be truncated to length 0.
> +    pub const O_TRUNC: u32 = bindings::O_TRUNC;
> +
> +    /// Bitmask for access mode flags.
> +    ///
> +    /// # Examples
> +    ///
> +    /// ```
> +    /// use kernel::file;
> +    /// # fn do_something() {}
> +    /// # let flags = 0;
> +    /// if (flags & file::flags::O_ACCMODE) == file::flags::O_RDONLY {
> +    ///     do_something();
> +    /// }
> +    /// ```
> +    pub const O_ACCMODE: u32 = bindings::O_ACCMODE;
> +
> +    /// File is read only.
> +    pub const O_RDONLY: u32 = bindings::O_RDONLY;
> +
> +    /// File is write only.
> +    pub const O_WRONLY: u32 = bindings::O_WRONLY;
> +
> +    /// File can be both read and written.
> +    pub const O_RDWR: u32 = bindings::O_RDWR;
> +}
> +
> +/// Wraps the kernel's `struct file`.
> +///
> +/// # Invariants
> +///
> +/// Instances of this type are always ref-counted, that is, a call to `get_file` ensures that the
> +/// allocation remains valid at least until the matching call to `fput`.
> +#[repr(transparent)]
> +pub struct File(Opaque<bindings::file>);
> +
> +// SAFETY: By design, the only way to access a `File` is via an immutable reference or an `ARef`.
> +// This means that the only situation in which a `File` can be accessed mutably is when the
> +// refcount drops to zero and the destructor runs. It is safe for that to happen on any thread, so
> +// it is ok for this type to be `Send`.
> +unsafe impl Send for File {}
> +
> +// SAFETY: It's OK to access `File` through shared references from other threads because we're
> +// either accessing properties that don't change or that are properly synchronised by C code.

Uhm, what guarantees are you talking about specifically, please?
Examples would help.

> +unsafe impl Sync for File {}
> +
> +impl File {
> +    /// Constructs a new `struct file` wrapper from a file descriptor.
> +    ///
> +    /// The file descriptor belongs to the current process.
> +    pub fn from_fd(fd: u32) -> Result<ARef<Self>, BadFdError> {
> +        // SAFETY: FFI call, there are no requirements on `fd`.
> +        let ptr = ptr::NonNull::new(unsafe { bindings::fget(fd) }).ok_or(BadFdError)?;
> +
> +        // INVARIANT: `fget` increments the refcount before returning.
> +        Ok(unsafe { ARef::from_raw(ptr.cast()) })
> +    }

I think this is really misnamed.

File reference counting has two modes. For simplicity let's ignore
fdget_pos() for now:

(1) fdget()
    Return file either with or without an increased reference count.
    If the fdtable was shared increment reference count, if not don't
    increment reference count.
(2) fget()
    Always increase refcount.

Your File implementation currently only deals with (2). And this
terminology is terribly important as far as I'm concerned. This wants to
be fget() and not from_fd(). The latter tells me nothing. I feel we
really need to try and mirror the current naming closely. Not
religiously ofc but core stuff such as this really benefits from having
an almost 1:1 mapping between C names and Rust names, I think.
Especially in the beginning.

> +
> +    /// Creates a reference to a [`File`] from a valid pointer.
> +    ///
> +    /// # Safety
> +    ///
> +    /// The caller must ensure that `ptr` points at a valid file and that its refcount does not
> +    /// reach zero during the lifetime 'a.
> +    pub unsafe fn from_ptr<'a>(ptr: *const bindings::file) -> &'a File {
> +        // INVARIANT: The safety requirements guarantee that the refcount does not hit zero during
> +        // 'a. The cast is okay because `File` is `repr(transparent)`.
> +        unsafe { &*ptr.cast() }
> +    }

How does that work and what is this used for? It's required that a
caller has called from_fd()/fget() first before from_ptr() can be used?

Can you show how this would be used in an example, please? Unless you
hold file_lock it is now invalid to access fields in struct file just
with rcu lock held for example. Which is why returning a pointer without
holding a reference seems dodgy. I'm probably just missing context.

> +
> +    /// Returns the flags associated with the file.
> +    ///
> +    /// The flags are a combination of the constants in [`flags`].
> +    pub fn flags(&self) -> u32 {
> +        // This `read_volatile` is intended to correspond to a READ_ONCE call.
> +        //
> +        // SAFETY: The file is valid because the shared reference guarantees a nonzero refcount.

I really need to understand what you mean by shared reference. At least
in the current C implementation you can't share a reference without
with another task as the other task might fput() behind you and then you're
hosed. That's why we have the fdget() logic.

> +        //
> +        // TODO: Replace with `read_once` when available on the Rust side.
> +        unsafe { core::ptr::addr_of!((*self.0.get()).f_flags).read_volatile() }
> +    }
> +}
> +
> +// SAFETY: The type invariants guarantee that `File` is always ref-counted.
> +unsafe impl AlwaysRefCounted for File {
> +    fn inc_ref(&self) {
> +        // SAFETY: The existence of a shared reference means that the refcount is nonzero.
> +        unsafe { bindings::get_file(self.0.get()) };
> +    }

Why inc_ref() and not just get_file()?

> +
> +    unsafe fn dec_ref(obj: ptr::NonNull<Self>) {
> +        // SAFETY: The safety requirements guarantee that the refcount is nonzero.
> +        unsafe { bindings::fput(obj.cast().as_ptr()) }
> +    }

Ok, so this makes me think that from_ptr() requires you to have called
from_fd()/fget() first which would be good.

> +}
> +
> +/// Represents the `EBADF` error code.
> +///
> +/// Used for methods that can only fail with `EBADF`.
> +pub struct BadFdError;
> +
> +impl From<BadFdError> for Error {
> +    fn from(_: BadFdError) -> Error {
> +        EBADF
> +    }
> +}
> +
> +impl core::fmt::Debug for BadFdError {
> +    fn fmt(&self, f: &mut core::fmt::Formatter<'_>) -> core::fmt::Result {
> +        f.pad("EBADF")
> +    }
> +}
> diff --git a/rust/kernel/lib.rs b/rust/kernel/lib.rs
> index e6aff80b521f..ce9abceab784 100644
> --- a/rust/kernel/lib.rs
> +++ b/rust/kernel/lib.rs
> @@ -34,6 +34,7 @@
>  mod allocator;
>  mod build_assert;
>  pub mod error;
> +pub mod file;
>  pub mod init;
>  pub mod ioctl;
>  #[cfg(CONFIG_KUNIT)]
> 
> -- 
> 2.43.0.rc1.413.gea7ed67945-goog
> 


* Re: [PATCH 1/7] rust: file: add Rust abstraction for `struct file`
  2023-11-29 15:23     ` Peter Zijlstra
@ 2023-11-29 17:08       ` Boqun Feng
  2023-11-30 10:42         ` Peter Zijlstra
  0 siblings, 1 reply; 96+ messages in thread
From: Boqun Feng @ 2023-11-29 17:08 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Matthew Wilcox, Alice Ryhl, Miguel Ojeda, Alex Gaynor,
	Wedson Almeida Filho, Gary Guo, Björn Roy Baron,
	Benno Lossin, Andreas Hindborg, Alexander Viro,
	Christian Brauner, Greg Kroah-Hartman, Arve Hjønnevåg,
	Todd Kjos, Martijn Coenen, Joel Fernandes, Carlos Llamas,
	Suren Baghdasaryan, Dan Williams, Kees Cook, Thomas Gleixner,
	Daniel Xu, linux-kernel, rust-for-linux, linux-fsdevel

On Wed, Nov 29, 2023 at 04:23:05PM +0100, Peter Zijlstra wrote:
> On Wed, Nov 29, 2023 at 03:13:22PM +0000, Matthew Wilcox wrote:
> 
> > > @@ -157,6 +158,12 @@ void rust_helper_init_work_with_key(struct work_struct *work, work_func_t func,
> > >  }
> > >  EXPORT_SYMBOL_GPL(rust_helper_init_work_with_key);
> > >  
> > > +struct file *rust_helper_get_file(struct file *f)
> > > +{
> > > +	return get_file(f);
> > > +}
> > > +EXPORT_SYMBOL_GPL(rust_helper_get_file);
> > 
> > This is ridiculous.  A function call instead of doing the
> > atomic_long_inc() in Rust?
> 
> Yeah, I complained about something similar a while ago. And recently
> talked to Boqun about this as well,
> 
> Bindgen *could* in theory 'compile' the inline C headers into (unsafe)
> Rust, the immediate problem is that Rust has a wildly different inline
> asm syntax (because Rust needs terrible syntax or whatever).
> 
> Boqun said it should all be fixable, but is a non-trivial amount of
> work.
> 

Right, but TBH, I was only thinking about "inlining" our atomic
primitives back then. The idea is since atomic primitives only have
small body (most of which is asm code), it's relatively easy to
translate that from a C function into a Rust one. And what's left is
translating asm blocks. Things get interesting here:


Originally I think the translation, despite the different syntax, might
be relatively easy, for example, considering smp_store_release() on
ARM64, we are going to translate from

	asm volatile ("stlr %w1, %0"				\
			: "=Q" (*__p)				\
			: "rZ" (*(__u32 *)__u.__c)		\
			: "memory");

to something like:

	asm!("stlr {val}, [{ptr}]",
	     val = in(reg) __u.__c,
	     ptr = in(reg) __p);

, the translation is non-trivial, but it's not that hard, since it's
basically find-and-replace.

But but but, I then realized we have asm goto in C but Rust doesn't
support them, and I haven't thought through how hard that would be..

Regards,
Boqun


* Re: [PATCH 4/7] rust: file: add `FileDescriptorReservation`
  2023-11-29 16:55     ` Alice Ryhl
@ 2023-11-29 17:14       ` Alice Ryhl
  2023-11-30  9:12         ` Christian Brauner
  2023-11-30  9:09       ` Christian Brauner
  1 sibling, 1 reply; 96+ messages in thread
From: Alice Ryhl @ 2023-11-29 17:14 UTC (permalink / raw)
  To: brauner
  Cc: a.hindborg, alex.gaynor, arve, benno.lossin, bjorn3_gh,
	boqun.feng, cmllamas, dan.j.williams, dxu, gary, gregkh, joel,
	keescook, linux-fsdevel, linux-kernel, maco, ojeda, peterz,
	rust-for-linux, surenb, tglx, tkjos, viro, wedsonaf, willy

On Wed, Nov 29, 2023 at 5:55 PM Alice Ryhl <aliceryhl@google.com> wrote:

> >> +    pub fn commit(self, file: ARef<File>) {
> >> +        // SAFETY: `self.fd` was previously returned by `get_unused_fd_flags`, and `file.ptr` is
> >> +        // guaranteed to have an owned ref count by its type invariants.
> >> +        unsafe { bindings::fd_install(self.fd, file.0.get()) };
> >
> > Why file.0.get()? Where did that come from?
>
> This gets a raw pointer to the C type.
>
> The `.0` part is a field access. `ARef` struct is a tuple struct, so its
> fields are unnamed. However, the fields can still be accessed by index.

Oh, sorry, this is wrong. Let me try again:

This gets a raw pointer to the C type. The `.0` part accesses the
field of type `Opaque<bindings::file>` in the Rust wrapper. Recall
that File is defined like this:

pub struct File(Opaque<bindings::file>);

The above syntax defines a tuple struct, which means that the fields
are unnamed. The `.0` syntax accesses the first field of a tuple
struct [1].

The `.get()` method is from the `Opaque` struct, which returns a raw
pointer to the C type being wrapped.
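
The `.0` / `.get()` chain can be tried outside the kernel with a small
userspace sketch. Everything here is an assumption made for illustration:
`CFile` stands in for the bindgen-generated `bindings::file`, this `Opaque`
is a much simplified version of the kernel type, and `File::new` and
`raw_flags` are hypothetical helpers that do not exist in the patch.

```rust
use std::cell::UnsafeCell;

/// Stand-in for `bindings::file`; the real C struct has many more fields.
#[repr(C)]
pub struct CFile {
    pub f_flags: u32,
}

/// Simplified stand-in for the kernel's `Opaque<T>` wrapper.
#[repr(transparent)]
pub struct Opaque<T>(UnsafeCell<T>);

impl<T> Opaque<T> {
    pub const fn new(value: T) -> Self {
        Opaque(UnsafeCell::new(value))
    }

    /// Returns a raw pointer to the wrapped C value.
    pub fn get(&self) -> *mut T {
        self.0.get()
    }
}

/// Tuple struct: its single field has no name and is reached as `.0`.
#[repr(transparent)]
pub struct File(Opaque<CFile>);

impl File {
    /// Hypothetical constructor, only needed to build a value in this sketch.
    pub fn new(f_flags: u32) -> Self {
        File(Opaque::new(CFile { f_flags }))
    }
}

/// Hypothetical helper showing `file.0.get()` in action.
pub fn raw_flags(file: &File) -> u32 {
    // `.0` selects the `Opaque<CFile>` field, `.get()` yields `*mut CFile`.
    let ptr: *mut CFile = file.0.get();
    // SAFETY: `ptr` was derived from a live `&File`, so it points at a valid `CFile`.
    unsafe { (*ptr).f_flags }
}

fn main() {
    let file = File::new(0o4000);
    println!("flags = {:#o}", raw_flags(&file));
}
```

In the patch, the pointer obtained this way is what gets handed to C
functions such as `fd_install`.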

Alice

[1]: https://doc.rust-lang.org/std/keyword.struct.html#:~:text=Tuple%20structs%20are%20similar%20to,with%20regular%20tuples%2C%20namely%20foo.


* Re: [PATCH 1/7] rust: file: add Rust abstraction for `struct file`
  2023-11-29 17:06   ` Christian Brauner
@ 2023-11-29 21:27     ` Alice Ryhl
  2023-11-29 23:17       ` Benno Lossin
  2023-11-30 10:48       ` Christian Brauner
  0 siblings, 2 replies; 96+ messages in thread
From: Alice Ryhl @ 2023-11-29 21:27 UTC (permalink / raw)
  To: brauner
  Cc: a.hindborg, alex.gaynor, aliceryhl, arve, benno.lossin,
	bjorn3_gh, boqun.feng, cmllamas, dan.j.williams, dxu, gary,
	gregkh, joel, keescook, linux-fsdevel, linux-kernel, maco, ojeda,
	peterz, rust-for-linux, surenb, tglx, tkjos, viro, wedsonaf,
	willy

Christian Brauner <brauner@kernel.org> writes:
>> This abstraction makes it possible to manipulate the open files for a
>> process. The new `File` struct wraps the C `struct file`. When accessing
>> it using the smart pointer `ARef<File>`, the pointer will own a
>> reference count to the file. When accessing it as `&File`, then the
>> reference does not own a refcount, but the borrow checker will ensure
>> that the reference count does not hit zero while the `&File` is live.
> 
> Could you explain this in more detail please? Ideally with some C and
> how that translates to your Rust wrappers, please. Sorry, this is going
> to be a long journey...

Yes of course. This touches on what I think is one of the most important
features that Rust brings to the table, which is the idea of defining
many different pointer types for different ownership semantics.

In the case of `struct file`, there are two pointer types that are
relevant:

 * `&File`. This is an "immutable reference" or "shared reference"
   (both names are used). This pointer type is used when you don't have
   any ownership over the target at all. All you have is _some_ sort of
   guarantee that the target stays alive while the reference is live. In
   many cases, the borrow checker helps validate this at compile-time,
   but you can also use a backdoor (in this case from_ptr) to just
   unsafely say "I know this value is valid, so shut up compiler".
   Shared references have no destructor.

 * `ARef<File>`. The `ARef` type is a custom pointer type defined in the
   kernel in `rust/kernel/types.rs`. It represents a pointer that owns a
   ref-count to the inner value. ARef can only be used with types that
   have an `unsafe impl AlwaysRefCounted for T` block. Whenever you
   clone an `ARef`, it calls into the `inc_ref` method that you defined
   for the type, and whenever it goes out of scope and the destructor
   runs, it calls the `dec_ref` method that you defined for the type.

Potentially we might want a third in the future. The third pointer type
could be another custom pointer type just for `struct file` that uses
`fdget` instead of `fget`. However, I haven't added this since I don't
need it (dead code and so on).

To give an example of this, consider this really simple C function:

	bool is_nonblocking(struct file *file) {
		return !!(file->f_flags & O_NONBLOCK);
	}

What are the ownership semantics of `file`? Well, we don't really care.
The caller needs to somehow ensure that the `file` is valid, but we
don't care if they're doing that with `fdget` or `fget` or whatever.
This corresponds to &File, so the Rust equivalent would be:

	fn is_nonblocking(file: &File) -> bool {
		(file.flags() & O_NONBLOCK) != 0
	}

Another example:

	void set_nonblocking_and_fput(struct file *file) {
		// Let's just ignore the lock for this example.
		file->f_flags |= O_NONBLOCK;

		fput(file);
	}

This method takes a file, sets it to non-blocking, and then destroys the
ref-count. What are the ownership semantics? Well, the caller should own
an `fget` ref-count, and we consume that ref-count. The equivalent Rust
code would be to take an `ARef<File>`:

	fn set_nonblocking_and_fput(file: ARef<File>) {
		file.set_flag(O_NONBLOCK);

		// When `file` goes out of scope here, the destructor
		// runs and calls `fput`. (Since that's what we defined
		// `ARef` to do on drop in `fn dec_ref`.)
	}

You can also explicitly call the destructor with `drop(file)`:

	fn set_nonblocking_and_fput(file: ARef<File>) {
		file.set_flag(O_NONBLOCK);
		drop(file);

		// When `file` goes out of scope, the destructor does
		// *not* run. This is because `drop(file)` is a move
		// (due to the signature of drop), and if you perform a
		// move, then the destructor no longer runs at the end
		// of the scope.
	}

And note that this would not compile, because we give up ownership of
the `ARef` by passing it to `drop`:

	fn set_nonblocking_and_fput(file: ARef<File>) {
		drop(file);
		file.set_flag(O_NONBLOCK);
	}

A third example:

	struct holds_a_file {
		struct file *inner;
	};

	struct file *get_the_file(struct holds_a_file *holder) {
		return holder->inner;
	}

What are the ownership semantics? Well, let's say that `holds_a_file`
owns a refcount to the file. Then, the pointer returned by get_the_file
is valid as long as `holder` is, but it doesn't have any ownership
over the file. You must stop using the returned file pointer before the
holder is destroyed.

The Rust equivalent is:

	struct HoldsAFile {
		inner: ARef<File>,
	}

	fn get_the_file(holder: &HoldsAFile) -> &File {
		&holder.inner
	}

The method signature is short-hand for (see [1]):

	fn get_the_file<'a>(holder: &'a HoldsAFile) -> &'a File {
		&holder.inner
	}

Here, 'a is a lifetime, and it ties together `holder` and the returned
reference in the way I described above. So e.g., this compiles:

	let holder = ...;
	let file = get_the_file(&holder);
	use_the_file(file);

But this doesn't:

	let holder = ...;
	let file = get_the_file(&holder);
	drop(holder);
	use_the_file(file); // Oops, destroying holder calls fput.

Notice also how the compiler accepted `&holder.inner` as the type
`&File` even though `inner` has type `ARef<File>`. This is because
`ARef` is defined to use something called deref coercion, which makes it
act like a real pointer type. This means that if you have an
`ARef<File>`, but you want to call a method that accepts `&File`, then
it will just work. (Deref coercion only exists for conversions into
reference types, so you can't pass a `&File` to something that takes an
`ARef<File>` without explicitly upgrading it to an `ARef<File>` by
taking a ref-count.)
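
The deref coercion described above can be reproduced in plain userspace Rust.
This is only a sketch under stated assumptions: `Owned<T>` is a hypothetical
owning pointer standing in for `ARef<T>` (it does no reference counting), the
`File` here is a toy struct, and the `O_NONBLOCK` value is illustrative.

```rust
use std::ops::Deref;

/// Illustrative value only; the kernel takes this from the C headers.
pub const O_NONBLOCK: u32 = 0o4000;

/// Toy stand-in for the kernel's `File`.
pub struct File {
    flags: u32,
}

/// Hypothetical owning pointer standing in for `ARef<File>`.
pub struct Owned<T>(Box<T>);

// Implementing `Deref` is what enables `&Owned<T>` to coerce to `&T`.
impl<T> Deref for Owned<T> {
    type Target = T;

    fn deref(&self) -> &T {
        &self.0
    }
}

pub struct HoldsAFile {
    inner: Owned<File>,
}

impl HoldsAFile {
    pub fn new(flags: u32) -> Self {
        HoldsAFile { inner: Owned(Box::new(File { flags })) }
    }
}

pub fn is_nonblocking(file: &File) -> bool {
    (file.flags & O_NONBLOCK) != 0
}

/// The returned `&File` borrows from `holder`, exactly as in the elided signature.
pub fn get_the_file(holder: &HoldsAFile) -> &File {
    // `&holder.inner` has type `&Owned<File>`; it is coerced to `&File` here.
    &holder.inner
}

/// The same coercion applies at a call site that expects `&File`.
pub fn holder_is_nonblocking(holder: &HoldsAFile) -> bool {
    is_nonblocking(&holder.inner)
}

fn main() {
    let holder = HoldsAFile::new(O_NONBLOCK);
    println!("{}", is_nonblocking(get_the_file(&holder)));
    println!("{}", holder_is_nonblocking(&holder));
}
```

Going the other way (passing a `&File` where an `Owned<File>` is expected) is
rejected by the compiler, which matches the point about needing an explicit
upgrade.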

[1]: https://doc.rust-lang.org/reference/lifetime-elision.html

>> +    /// Constructs a new `struct file` wrapper from a file descriptor.
>> +    ///
>> +    /// The file descriptor belongs to the current process.
>> +    pub fn from_fd(fd: u32) -> Result<ARef<Self>, BadFdError> {
>> +        // SAFETY: FFI call, there are no requirements on `fd`.
>> +        let ptr = ptr::NonNull::new(unsafe { bindings::fget(fd) }).ok_or(BadFdError)?;
>> +
>> +        // INVARIANT: `fget` increments the refcount before returning.
>> +        Ok(unsafe { ARef::from_raw(ptr.cast()) })
>> +    }
> 
> I think this is really misnamed.
> 
> File reference counting has two modes. For simplicity let's ignore
> fdget_pos() for now:
> 
> (1) fdget()
>     Return file either with or without an increased reference count.
>     If the fdtable was shared increment reference count, if not don't
>     increment reference count.
> (2) fget()
>     Always increase refcount.
> 
> Your File implementation currently only deals with (2). And this
> terminology is terribly important as far as I'm concerned. This wants to
> be fget() and not from_fd(). The latter tells me nothing. I feel we
> really need to try and mirror the current naming closely. Not
> religiously ofc but core stuff such as this really benefits from having
> an almost 1:1 mapping between C names and Rust names, I think.
> Especially in the beginning.

Sure, I'll rename these methods in the next version.

>> +    /// Creates a reference to a [`File`] from a valid pointer.
>> +    ///
>> +    /// # Safety
>> +    ///
>> +    /// The caller must ensure that `ptr` points at a valid file and that its refcount does not
>> +    /// reach zero during the lifetime 'a.
>> +    pub unsafe fn from_ptr<'a>(ptr: *const bindings::file) -> &'a File {
>> +        // INVARIANT: The safety requirements guarantee that the refcount does not hit zero during
>> +        // 'a. The cast is okay because `File` is `repr(transparent)`.
>> +        unsafe { &*ptr.cast() }
>> +    }
> 
> How does that work and what is this used for? It's required that a
> caller has called from_fd()/fget() first before from_ptr() can be used?
> 
> Can you show how this would be used in an example, please? Unless you
> hold file_lock it is now invalid to access fields in struct file just
> with rcu lock held for example. Which is why returning a pointer without
> holding a reference seems dodgy. I'm probably just missing context.

This is the backdoor. You use it when *you* know that the file is okay
to access, but Rust doesn't. It's unsafe because it's not checked by
Rust.

For example you could do this:

	let ptr = unsafe { bindings::fdget(fd) };

	// SAFETY: We just called `fdget`.
	let file = unsafe { File::from_ptr(ptr) };
	use_file(file);

	// SAFETY: We're not using `file` after this call.
	unsafe { bindings::fdput(ptr) };

It's used in Binder here:
https://github.com/Darksonn/linux/blob/dca45e6c7848e024709b165a306cdbe88e5b086a/drivers/android/rust_binder.rs#L331-L332

Basically, I use it to say "C code has called fdget for us so it's okay
to access the file", whenever userspace uses a syscall to call into the
driver.
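
To make the "backdoor" shape concrete, here is a userspace sketch of the same
pattern. The names are assumptions for illustration: `CFile` stands in for
`bindings::file`, and `ioctl_entry` is a hypothetical entry point playing the
role of a callback that C invokes while it keeps the file alive.

```rust
/// Stand-in for `bindings::file`.
#[repr(C)]
pub struct CFile {
    pub f_flags: u32,
}

/// Wrapper with the same layout as `CFile`, so pointer casts between them are valid.
#[repr(transparent)]
pub struct File(CFile);

impl File {
    /// Turns a raw pointer into a borrowed `&File` without touching any refcount.
    ///
    /// # Safety
    ///
    /// `ptr` must point at a valid `CFile` that stays alive for all of `'a`.
    pub unsafe fn from_ptr<'a>(ptr: *const CFile) -> &'a File {
        // SAFETY: Validity for `'a` is promised by the caller; the cast is fine
        // because `File` is `repr(transparent)` over `CFile`.
        unsafe { &*ptr.cast::<File>() }
    }

    pub fn flags(&self) -> u32 {
        self.0.f_flags
    }
}

/// Hypothetical callback invoked from "C" with a pointer it keeps alive.
///
/// # Safety
///
/// `raw` must be valid for the duration of the call.
pub unsafe fn ioctl_entry(raw: *const CFile) -> u32 {
    // SAFETY: The caller keeps `raw` valid until this function returns, and the
    // `&File` does not escape this function.
    let file = unsafe { File::from_ptr(raw) };
    file.flags()
}

fn main() {
    let c_file = CFile { f_flags: 0o4000 };
    // SAFETY: `c_file` outlives the call.
    let flags = unsafe { ioctl_entry(&c_file) };
    println!("flags = {:#o}", flags);
}
```

The compiler cannot check the lifetime picked for `'a` here, which is why
the function is `unsafe` and the obligation moves to the caller's `SAFETY`
comment.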

>> +// SAFETY: The type invariants guarantee that `File` is always ref-counted.
>> +unsafe impl AlwaysRefCounted for File {
>> +    fn inc_ref(&self) {
>> +        // SAFETY: The existence of a shared reference means that the refcount is nonzero.
>> +        unsafe { bindings::get_file(self.0.get()) };
>> +    }
> 
> Why inc_ref() and not just get_file()?

Whenever you see an impl block that uses the keyword "for", then the
code is implementing a trait. In this case, the trait being implemented
is AlwaysRefCounted, which allows File to work with ARef.

It has to be `inc_ref` because that's what AlwaysRefCounted calls this
method.

>> +    unsafe fn dec_ref(obj: ptr::NonNull<Self>) {
>> +        // SAFETY: The safety requirements guarantee that the refcount is nonzero.
>> +        unsafe { bindings::fput(obj.cast().as_ptr()) }
>> +    }
> 
> Ok, so this makes me think that from_ptr() requires you to have called
> from_fd()/fget() first which would be good.

Actually, `from_ptr` has nothing to do with this. The above code only
applies to code that uses the `ARef` pointer type, but `from_ptr` uses
the `&File` pointer type instead.

>> +    /// Returns the flags associated with the file.
>> +    ///
>> +    /// The flags are a combination of the constants in [`flags`].
>> +    pub fn flags(&self) -> u32 {
>> +        // This `read_volatile` is intended to correspond to a READ_ONCE call.
>> +        //
>> +        // SAFETY: The file is valid because the shared reference guarantees a nonzero refcount.
> 
> I really need to understand what you mean by shared reference. At least
> in the current C implementation you can't share a reference without
> with another task as the other task might fput() behind you and then you're
> hosed. That's why we have the fdget() logic.

By "shared reference" I just mean an `&File`. They're called shared
because there could be other pointers to the same object elsewhere in
the program, and not because we have explicitly shared it ourselves.

Rust's other type of reference `&mut T` is called a "mutable reference"
or "exclusive reference". Like with `&T`, both names are used.

> > +// SAFETY: It's OK to access `File` through shared references from other threads because we're
> > +// either accessing properties that don't change or that are properly synchronised by C code.
> 
> Uhm, what guarantees are you talking about specifically, please?
> Examples would help.
> 
> > +unsafe impl Sync for File {}

The Sync trait defines whether a value may be accessed from several
threads in parallel (using shared/immutable references). In our case,
every method on `File` that accepts a `&File` is okay to be called in
parallel from several threads, so it's okay for `File` to implement
`Sync`.

I'm actually making a statement about the rest of the Rust code in this
file here. If I added a method that took `&File`, but couldn't be called
in parallel, then `File` could no longer be `Sync`.
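
As a rough userspace illustration of what `Sync` permits, the sketch below
shares one `&File` between several threads. It rests on assumptions: the toy
`File` keeps its flags in an `AtomicU32` (a stand-in for the
`READ_ONCE`-style read in the patch), and `assert_sync` and
`read_flags_in_parallel` are hypothetical helpers.

```rust
use std::sync::atomic::{AtomicU32, Ordering};
use std::thread;

/// Toy stand-in for the kernel's `File`.
pub struct File {
    flags: AtomicU32,
}

impl File {
    pub fn new(flags: u32) -> Self {
        File { flags: AtomicU32::new(flags) }
    }

    /// Takes `&self`, so `Sync` requires this to be callable from many threads at once.
    pub fn flags(&self) -> u32 {
        self.flags.load(Ordering::Relaxed)
    }
}

/// Compiles only if `T` is `Sync`.
fn assert_sync<T: Sync>() {}

/// Reads the flags from `threads` threads that all share the same `&File`.
pub fn read_flags_in_parallel(file: &File, threads: usize) -> Vec<u32> {
    assert_sync::<File>();
    thread::scope(|s| {
        let mut handles = Vec::new();
        for _ in 0..threads {
            // `&File` may cross the thread boundary only because `File: Sync`.
            handles.push(s.spawn(move || file.flags()));
        }
        handles.into_iter().map(|h| h.join().unwrap()).collect()
    })
}

fn main() {
    let file = File::new(0o4000);
    println!("{:?}", read_flags_in_parallel(&file, 4));
}
```

If `flags` were stored in a `Cell<u32>` instead, the toy `File` would stop
being `Sync` and the `s.spawn` call would no longer compile, which is the
kind of check the `unsafe impl Sync` in the patch takes responsibility for.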



I hope that helps, and let me know if you have any other questions.

Alice



* Re: [PATCH 1/7] rust: file: add Rust abstraction for `struct file`
  2023-11-29 21:27     ` Alice Ryhl
@ 2023-11-29 23:17       ` Benno Lossin
  2023-11-30 10:48       ` Christian Brauner
  1 sibling, 0 replies; 96+ messages in thread
From: Benno Lossin @ 2023-11-29 23:17 UTC (permalink / raw)
  To: Alice Ryhl
  Cc: brauner, a.hindborg, alex.gaynor, arve, bjorn3_gh, boqun.feng,
	cmllamas, dan.j.williams, dxu, gary, gregkh, joel, keescook,
	linux-fsdevel, linux-kernel, maco, ojeda, peterz, rust-for-linux,
	surenb, tglx, tkjos, viro, wedsonaf, willy

On 29.11.23 22:27, Alice Ryhl wrote:
> Another example:
> 
> 	void set_nonblocking_and_fput(struct file *file) {
> 		// Let's just ignore the lock for this example.
> 		file->f_flags |= O_NONBLOCK;
> 
> 		fput(file);
> 	}
> 
> This method takes a file, sets it to non-blocking, and then destroys the
> ref-count. What are the ownership semantics? Well, the caller should own
> an `fget` ref-count, and we consume that ref-count. The equivalent Rust
> code would be to take an `ARef<File>`:
> 
> 	fn set_nonblocking_and_fput(file: ARef<File>) {
> 		file.set_flag(O_NONBLOCK);
> 
> 		// When `file` goes out of scope here, the destructor
> 		// runs and calls `fput`. (Since that's what we defined
> 		// `ARef` to do on drop in `fn dec_ref`.)
> 	}
> 
> You can also explicitly call the destructor with `drop(file)`:
> 
> 	fn set_nonblocking_and_fput(file: ARef<File>) {
> 		file.set_flag(O_NONBLOCK);
> 		drop(file);
> 
> 		// When `file` goes out of scope, the destructor does
> 		// *not* run. This is because `drop(file)` is a move
> 		// (due to the signature of drop), and if you perform a
> 		// move, then the destructor no longer runs at the end
> 		// of the scope.

I want to note that while the destructor does not run at the end of the
scope, it still *does* run: the `drop(file)` call runs the destructor.

> 	}
> 
> And note that this would not compile, because we give up ownership of
> the `ARef` by passing it to `drop`:
> 
> 	fn set_nonblocking_and_fput(file: ARef<File>) {
> 		drop(file);
> 		file.set_flag(O_NONBLOCK);
> 	}
>

[...]

>>> +// SAFETY: The type invariants guarantee that `File` is always ref-counted.
>>> +unsafe impl AlwaysRefCounted for File {
>>> +    fn inc_ref(&self) {
>>> +        // SAFETY: The existence of a shared reference means that the refcount is nonzero.
>>> +        unsafe { bindings::get_file(self.0.get()) };
>>> +    }
>>
>> Why inc_ref() and not just get_file()?
> 
> Whenever you see an impl block that uses the keyword "for", then the
> code is implementing a trait. In this case, the trait being implemented
> is AlwaysRefCounted, which allows File to work with ARef.
> 
> It has to be `inc_ref` because that's what AlwaysRefCounted calls this
> method.

I am not sure if the Rust term "trait" is well-known, so for a bit more
context, I am quoting the Rust Book [1]:

    A *trait* defines functionality a particular type has and can share
    with other types. We can use traits to define shared behavior in an
    abstract way. We can use *trait bounds* to specify that a generic type
    can be any type that has certain behavior.

[1]: https://doc.rust-lang.org/book/ch10-02-traits.html

We have created an abstraction over reference counting:
the trait `AlwaysRefCounted` and the struct `ARef<T>` where `T`
implements `AlwaysRefCounted`.
As Alice already explained, `ARef<T>` is a pointer that owns a refcount
on the object. Because of this, `ARef<T>` needs to know how to increment
and decrement that counter. For example, when you want to create another
`ARef<T>` you can `clone()` it and therefore `ARef<T>` needs to
increment the refcount. And when you drop it, `ARef<T>` needs to
decrement it.
The "`ARef<T>` knows how to inc/dec the refcount" part is done by the
`AlwaysRefCounted` trait. And there we chose to name the functions
`inc_ref` and `dec_ref`, since these are the *general*/*abstract*
operations and not any specific refcount adjustment.
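
To make that a bit more concrete, here is a rough userspace sketch of the
pattern. It is not the kernel's code: `Toy` and `demo` are made-up names,
and the trait and `ARef` below are simplified stand-ins for the ones in
`rust/kernel/types.rs`. It only shows where `inc_ref` and `dec_ref` end up
being called.

```rust
use std::cell::Cell;
use std::ops::Deref;
use std::ptr::NonNull;

/// Simplified stand-in for the kernel trait: the two abstract operations.
unsafe trait AlwaysRefCounted {
    fn inc_ref(&self);
    /// # Safety
    /// The caller must own one refcount on `obj` and give it up here.
    unsafe fn dec_ref(obj: NonNull<Self>);
}

/// A pointer that owns exactly one refcount on `T`.
struct ARef<T: AlwaysRefCounted>(NonNull<T>);

impl<T: AlwaysRefCounted> Clone for ARef<T> {
    fn clone(&self) -> Self {
        // A second owner needs a second refcount.
        unsafe { self.0.as_ref() }.inc_ref();
        ARef(self.0)
    }
}

impl<T: AlwaysRefCounted> Drop for ARef<T> {
    fn drop(&mut self) {
        // SAFETY: this `ARef` owns one refcount, which is released here.
        unsafe { T::dec_ref(self.0) };
    }
}

impl<T: AlwaysRefCounted> Deref for ARef<T> {
    type Target = T;
    fn deref(&self) -> &T {
        // SAFETY: the owned refcount keeps the object alive.
        unsafe { self.0.as_ref() }
    }
}

/// Hypothetical toy object; for `File` the two methods would instead
/// call `get_file()` and `fput()`.
struct Toy {
    count: Cell<u32>,
}

unsafe impl AlwaysRefCounted for Toy {
    fn inc_ref(&self) {
        self.count.set(self.count.get() + 1);
    }
    unsafe fn dec_ref(obj: NonNull<Self>) {
        let toy = unsafe { obj.as_ref() };
        toy.count.set(toy.count.get() - 1);
    }
}

/// Returns the count after a clone, after dropping the clone, and after
/// dropping the original.
fn demo() -> (u32, u32, u32) {
    let toy = Toy { count: Cell::new(1) };
    let a = ARef(NonNull::from(&toy)); // `a` takes over the initial count
    let b = a.clone();
    let after_clone = a.count.get();
    drop(b);
    let after_drop_b = toy.count.get();
    drop(a);
    let after_drop_a = toy.count.get();
    (after_clone, after_drop_b, after_drop_a)
}

fn main() {
    println!("{:?}", demo());
}
```

The generic `ARef<T>` code never mentions `get_file` or `fput`; it only
knows the two trait methods, which is why they carry general names.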



Hope that also helped and did not create confusion.

--
Cheers,
Benno


^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH 4/7] rust: file: add `FileDescriptorReservation`
  2023-11-29 16:55     ` Alice Ryhl
  2023-11-29 17:14       ` Alice Ryhl
@ 2023-11-30  9:09       ` Christian Brauner
  2023-11-30  9:17         ` Alice Ryhl
  1 sibling, 1 reply; 96+ messages in thread
From: Christian Brauner @ 2023-11-30  9:09 UTC (permalink / raw)
  To: Alice Ryhl
  Cc: a.hindborg, alex.gaynor, arve, benno.lossin, bjorn3_gh,
	boqun.feng, cmllamas, dan.j.williams, dxu, gary, gregkh, joel,
	keescook, linux-fsdevel, linux-kernel, maco, ojeda, peterz,
	rust-for-linux, surenb, tglx, tkjos, viro, wedsonaf, willy

On Wed, Nov 29, 2023 at 04:55:51PM +0000, Alice Ryhl wrote:
> Christian Brauner <brauner@kernel.org> writes:
> > Can we follow the traditional file terminology, i.e.,
> > get_unused_fd_flags() and fd_install()? At least at the beginning this
> > might be quite helpful instead of having to mentally map new() and
> > commit() onto the C functions.
> 
> Sure, I'll do that in the next version.
> 
> >> +    /// Prevent values of this type from being moved to a different task.
> >> +    ///
> >> +    /// This is necessary because the C FFI calls assume that `current` is set to the task that
> >> +    /// owns the fd in question.
> >> +    _not_send_sync: PhantomData<*mut ()>,
> > 
> > I don't fully understand this. Can you explain in a little more detail
> > what you mean by this and how this works?
> 
> Yeah, so, this has to do with the Rust trait `Send` that controls
> whether it's okay for a value to get moved from one thread to another.
> In this case, we don't want it to be `Send` so that it can't be moved to
> another thread, since current might be different there.
> 
> The `Send` trait is automatically applied to structs whenever *all*
> fields of the struct are `Send`. So to ensure that a struct is not
> `Send`, you add a field that is not `Send`.
> 
> The `PhantomData` type used here is a special zero-sized type.
> Basically, it says "pretend this struct has a field of type `*mut ()`,
> but don't actually add the field". So for the purposes of `Send`, it has
> a non-Send field, but since it's wrapped in `PhantomData`, the field is
> not there at runtime.

This is probably a stupid suggestion/question. But while PhantomData gives
the right hint of what is happening I wouldn't mind if that was very
explicitly called NoSendTrait or just add the explanatory comment. Yes,
that's a lot of verbiage but you'd help us a lot.

> 
> >> +        Ok(Self {
> >> +            fd: fd as _,
> > 
> > This is a cast to a u32?
> 
> Yes.
> 
> > Can you please draft a quick example how that return value would be
> > expected to be used by a caller? It's really not clear
> 
> The most basic usage would look like this:
> 
> 	// First, reserve the fd.
> 	let reservation = FileDescriptorReservation::new(O_CLOEXEC)?;
> 
> 	// Then, somehow get a file to put in it.
> 	let file = get_file_using_fallible_operation()?;
> 
> 	// Finally, commit it to the fd.
> 	reservation.commit(file);

Ok, the reason I asked was that I was confused about the PhantomData and
how that would figure into using the return value as I hadn't seen that
Ok(Self { }) syntax before. Thanks.
  
> 
> In Rust Binder, reservations are used here:
> https://github.com/Darksonn/linux/blob/dca45e6c7848e024709b165a306cdbe88e5b086a/drivers/android/allocation.rs#L199-L210
> https://github.com/Darksonn/linux/blob/dca45e6c7848e024709b165a306cdbe88e5b086a/drivers/android/allocation.rs#L512-L541
> 
> >> +    pub fn commit(self, file: ARef<File>) {
> >> +        // SAFETY: `self.fd` was previously returned by `get_unused_fd_flags`, and `file.ptr` is
> >> +        // guaranteed to have an owned ref count by its type invariants.
> >> +        unsafe { bindings::fd_install(self.fd, file.0.get()) };
> > 
> > Why file.0.get()? Where did that come from?
> 
> This gets a raw pointer to the C type.
> 
> The `.0` part is a field access. `ARef` struct is a tuple struct, so its

Ah, there we go. It's a bit ugly tbh.

> fields are unnamed. However, the fields can still be accessed by index.

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH 4/7] rust: file: add `FileDescriptorReservation`
  2023-11-29 17:14       ` Alice Ryhl
@ 2023-11-30  9:12         ` Christian Brauner
  2023-11-30  9:23           ` Alice Ryhl
  0 siblings, 1 reply; 96+ messages in thread
From: Christian Brauner @ 2023-11-30  9:12 UTC (permalink / raw)
  To: Alice Ryhl
  Cc: a.hindborg, alex.gaynor, arve, benno.lossin, bjorn3_gh,
	boqun.feng, cmllamas, dan.j.williams, dxu, gary, gregkh, joel,
	keescook, linux-fsdevel, linux-kernel, maco, ojeda, peterz,
	rust-for-linux, surenb, tglx, tkjos, viro, wedsonaf, willy

On Wed, Nov 29, 2023 at 06:14:24PM +0100, Alice Ryhl wrote:
> On Wed, Nov 29, 2023 at 5:55 PM Alice Ryhl <aliceryhl@google.com> wrote:
> 
> > >> +    pub fn commit(self, file: ARef<File>) {
> > >> +        // SAFETY: `self.fd` was previously returned by `get_unused_fd_flags`, and `file.ptr` is
> > >> +        // guaranteed to have an owned ref count by its type invariants.
> > >> +        unsafe { bindings::fd_install(self.fd, file.0.get()) };
> > >
> > > Why file.0.get()? Where did that come from?
> >
> > This gets a raw pointer to the C type.
> >
> > The `.0` part is a field access. `ARef` struct is a tuple struct, so its
> > fields are unnamed. However, the fields can still be accessed by index.
> 
> Oh, sorry, this is wrong. Let me try again:
> 
> This gets a raw pointer to the C type. The `.0` part accesses the
> field of type `Opaque<bindings::file>` in the Rust wrapper. Recall
> that File is defined like this:
> 
> pub struct File(Opaque<bindings::file>);
> 
> The above syntax defines a tuple struct, which means that the fields
> are unnamed. The `.0` syntax accesses the first field of a tuple
> struct [1].
> 
> The `.get()` method is from the `Opaque` struct, which returns a raw
> pointer to the C type being wrapped.

It'd be nice if this could be written in a more obvious/elegant way. And
if not, a comment would help. I know there'll be more text than code but
until this is second nature to read I personally won't mind... Because
searching for this specific syntax isn't really possible.

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH 4/7] rust: file: add `FileDescriptorReservation`
  2023-11-30  9:09       ` Christian Brauner
@ 2023-11-30  9:17         ` Alice Ryhl
  2023-11-30 10:51           ` Christian Brauner
  0 siblings, 1 reply; 96+ messages in thread
From: Alice Ryhl @ 2023-11-30  9:17 UTC (permalink / raw)
  To: brauner
  Cc: a.hindborg, alex.gaynor, aliceryhl, arve, benno.lossin,
	bjorn3_gh, boqun.feng, cmllamas, dan.j.williams, dxu, gary,
	gregkh, joel, keescook, linux-fsdevel, linux-kernel, maco, ojeda,
	peterz, rust-for-linux, surenb, tglx, tkjos, viro, wedsonaf,
	willy

Christian Brauner <brauner@kernel.org> writes:
>>>> +    /// Prevent values of this type from being moved to a different task.
>>>> +    ///
>>>> +    /// This is necessary because the C FFI calls assume that `current` is set to the task that
>>>> +    /// owns the fd in question.
>>>> +    _not_send_sync: PhantomData<*mut ()>,
>>> 
>>> I don't fully understand this. Can you explain in a little more detail
>>> what you mean by this and how this works?
>> 
>> Yeah, so, this has to do with the Rust trait `Send` that controls
>> whether it's okay for a value to get moved from one thread to another.
>> In this case, we don't want it to be `Send` so that it can't be moved to
>> another thread, since current might be different there.
>> 
>> The `Send` trait is automatically applied to structs whenever *all*
>> fields of the struct are `Send`. So to ensure that a struct is not
>> `Send`, you add a field that is not `Send`.
>> 
>> The `PhantomData` type used here is a special zero-sized type.
>> Basically, it says "pretend this struct has a field of type `*mut ()`,
>> but don't actually add the field". So for the purposes of `Send`, it has
>> a non-Send field, but since it's wrapped in `PhantomData`, the field is
>> not there at runtime.
> 
> This is probably a stupid suggestion/question. But while PhantomData gives
> the right hint of what is happening I wouldn't mind if that was very
> explicitly called NoSendTrait or just add the explanatory comment. Yes,
> that's a lot of verbiage but you'd help us a lot.

I suppose we could add a typedef:

type NoSendTrait = PhantomData<*mut ()>;

and use that as the field type. The way I did it here is the "standard"
way of doing it, and if you look at code outside the kernel, you will
also find them using `PhantomData` like this. However, I don't mind
adding the typedef if you think it is helpful.
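
As a rough illustration of how the alias would read in practice, here is
a small userspace sketch. `Reservation` and `assert_send` are made-up
names for this example only, and the real struct has more to it; the
point is just that the marker field costs nothing at runtime while still
removing `Send` and `Sync` from the struct.

```rust
use std::marker::PhantomData;
use std::mem::size_of;

/// The alias discussed above: a zero-sized marker that is neither `Send`
/// nor `Sync`, because raw pointers are neither.
type NoSendTrait = PhantomData<*mut ()>;

/// Hypothetical stand-in for the reservation type.
struct Reservation {
    fd: u32,
    _not_send_sync: NoSendTrait,
}

impl Reservation {
    fn new(fd: u32) -> Self {
        Reservation {
            fd,
            _not_send_sync: PhantomData,
        }
    }
}

/// Compiles only when `T` is `Send`.
fn assert_send<T: Send>(_: &T) {}

fn main() {
    let r = Reservation::new(3);

    // A plain `u32` is `Send`, so this is accepted.
    assert_send(&r.fd);

    // Uncommenting the next line is rejected by the compiler, because
    // `Reservation` contains a (phantom) `*mut ()` and is not `Send`.
    // assert_send(&r);

    // The marker adds no bytes to the struct.
    assert_eq!(size_of::<Reservation>(), size_of::<u32>());
    println!("fd = {}", r.fd);
}
```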

Alice

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH 4/7] rust: file: add `FileDescriptorReservation`
  2023-11-30  9:12         ` Christian Brauner
@ 2023-11-30  9:23           ` Alice Ryhl
  0 siblings, 0 replies; 96+ messages in thread
From: Alice Ryhl @ 2023-11-30  9:23 UTC (permalink / raw)
  To: brauner
  Cc: a.hindborg, alex.gaynor, aliceryhl, arve, benno.lossin,
	bjorn3_gh, boqun.feng, cmllamas, dan.j.williams, dxu, gary,
	gregkh, joel, keescook, linux-fsdevel, linux-kernel, maco, ojeda,
	peterz, rust-for-linux, surenb, tglx, tkjos, viro, wedsonaf,
	willy

Christian Brauner <brauner@kernel.org> writes:
> On Wed, Nov 29, 2023 at 06:14:24PM +0100, Alice Ryhl wrote:
> > On Wed, Nov 29, 2023 at 5:55 PM Alice Ryhl <aliceryhl@google.com> wrote:
> > 
> > > >> +    pub fn commit(self, file: ARef<File>) {
> > > >> +        // SAFETY: `self.fd` was previously returned by `get_unused_fd_flags`, and `file.ptr` is
> > > >> +        // guaranteed to have an owned ref count by its type invariants.
> > > >> +        unsafe { bindings::fd_install(self.fd, file.0.get()) };
> > > >
> > > > Why file.0.get()? Where did that come from?
> > >
> > > This gets a raw pointer to the C type.
> > >
> > > The `.0` part is a field access. `ARef` struct is a tuple struct, so its
> > > fields are unnamed. However, the fields can still be accessed by index.
> > 
> > Oh, sorry, this is wrong. Let me try again:
> > 
> > This gets a raw pointer to the C type. The `.0` part accesses the
> > field of type `Opaque<bindings::file>` in the Rust wrapper. Recall
> > that File is defined like this:
> > 
> > pub struct File(Opaque<bindings::file>);
> > 
> > The above syntax defines a tuple struct, which means that the fields
> > are unnamed. The `.0` syntax accesses the first field of a tuple
> > struct [1].
> > 
> > The `.get()` method is from the `Opaque` struct, which returns a raw
> > pointer to the C type being wrapped.
> 
> It'd be nice if this could be written in a more obvious/elegant way. And
> if not, a comment would help. I know there'll be more text than code but
> until this is second nature to read I personally won't mind... Because
> searching for this specific syntax isn't really possible.

Adding a comment to every instance of this is probably not realistic.
This kind of code will be very common in abstraction code. However,
there are two other options that I think are reasonable:

1. I can change the definition of `File` so that the field has a name:

struct File {
    inner: Opaque<bindings::file>,
}

Then, it would say `file.inner.get()`.

2. Alternatively, I can add a method to file:

impl File {
    #[inline]
    pub fn as_ptr(&self) -> *mut bindings::file {
        self.0.get()
    }
}

And then write `file.as_ptr()` whenever I want a pointer.
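
For comparison, here is a self-contained userspace sketch of option 2.
`RawFile` is a made-up stand-in for `bindings::file`, `Opaque` is a
heavily simplified version of the kernel type, and `File::new` and
`flags_via_as_ptr` exist only so the example can run on its own.

```rust
use std::cell::UnsafeCell;

/// Hypothetical stand-in for the bindgen-generated `bindings::file`.
#[repr(C)]
struct RawFile {
    f_flags: u32,
}

/// Simplified stand-in for the kernel's `Opaque<T>`.
#[repr(transparent)]
struct Opaque<T>(UnsafeCell<T>);

impl<T> Opaque<T> {
    /// Returns a raw pointer to the wrapped C value.
    fn get(&self) -> *mut T {
        self.0.get()
    }
}

/// Tuple struct, as in the patch: the only field is reached as `.0`.
#[repr(transparent)]
struct File(Opaque<RawFile>);

impl File {
    /// Constructor that exists only for this example.
    fn new(f_flags: u32) -> Self {
        File(Opaque(UnsafeCell::new(RawFile { f_flags })))
    }

    /// Named accessor, so callers do not need to spell `.0.get()`.
    #[inline]
    fn as_ptr(&self) -> *mut RawFile {
        self.0.get()
    }
}

fn flags_via_as_ptr(file: &File) -> u32 {
    // SAFETY: `file` is a valid reference, so the pointer is valid for reads.
    unsafe { (*file.as_ptr()).f_flags }
}

fn main() {
    let file = File::new(0o4000);
    // Both spellings yield the same pointer; `as_ptr` is only a name.
    assert_eq!(file.as_ptr(), file.0.get());
    println!("{:#o}", flags_via_as_ptr(&file));
}
```

Unlike `.0.get()`, a method name such as `as_ptr` can be searched for.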

Alice

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH 5/7] rust: file: add `Kuid` wrapper
  2023-11-29 16:28   ` Christian Brauner
  2023-11-29 16:48     ` Peter Zijlstra
@ 2023-11-30  9:36     ` Alice Ryhl
  2023-11-30 10:52       ` Christian Brauner
  1 sibling, 1 reply; 96+ messages in thread
From: Alice Ryhl @ 2023-11-30  9:36 UTC (permalink / raw)
  To: brauner
  Cc: a.hindborg, alex.gaynor, aliceryhl, arve, benno.lossin,
	bjorn3_gh, boqun.feng, cmllamas, dan.j.williams, dxu, gary,
	gregkh, joel, keescook, linux-fsdevel, linux-kernel, maco, ojeda,
	peterz, rust-for-linux, surenb, tglx, tkjos, viro, wedsonaf,
	willy

Christian Brauner <brauner@kernel.org> writes:
> I'm a bit puzzled by all these rust_helper_*() calls. Can you explain
> why they are needed? Because they are/can be static inlines and that
> somehow doesn't work?

Yes, it's because the methods are inline. Rust can only call C methods
that are actually exported by the C code.

>> +    /// Converts this kernel UID into a UID that userspace understands. Uses the namespace of the
>> +    /// current task.
>> +    pub fn into_uid_in_current_ns(self) -> bindings::uid_t {
> 
> Hm, I wouldn't special-case this. Just expose from_kuid() and let it
> take a namespace argument, no? You don't need to provide bindings for
> namespaces ofc.

To make `from_kuid` safe, I would need to wrap the namespace type too. I
could do that, but it would be more code than this method because I need
another wrapper struct and so on.

Personally I would prefer to special-case it until someone needs the
non-special-case. Then, they can delete this method when they introduce
the non-special-case.

But I'll do it if you think I should.

>> +impl PartialEq for Kuid {
>> +    fn eq(&self, other: &Kuid) -> bool {
>> +        // SAFETY: Just an FFI call.
>> +        unsafe { bindings::uid_eq(self.kuid, other.kuid) }
>> +    }
>> +}
>> +
>> +impl Eq for Kuid {}
> 
> Do you need that?

Yes. This is the code that tells the compiler what `==` means for the
`Kuid` type. Binder uses it here:

https://github.com/Darksonn/linux/blob/dca45e6c7848e024709b165a306cdbe88e5b086a/drivers/android/context.rs#L174
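
A minimal userspace sketch of what the two impl blocks buy you is below.
`RawKuid`, the local `uid_eq`, and `same_owner` are stand-ins made up for
this example; in the patch the comparison goes through `bindings::uid_eq`
instead.

```rust
/// Hypothetical stand-in for the C `kuid_t`.
#[derive(Clone, Copy)]
struct RawKuid {
    val: u32,
}

/// Stand-in for the C helper `uid_eq()`.
fn uid_eq(left: RawKuid, right: RawKuid) -> bool {
    left.val == right.val
}

/// Wrapper type, shaped like the one in the patch.
#[derive(Clone, Copy)]
struct Kuid {
    kuid: RawKuid,
}

impl Kuid {
    /// Constructor that exists only for this example.
    fn from_raw_val(val: u32) -> Self {
        Kuid { kuid: RawKuid { val } }
    }
}

// Without this impl, `a == b` on two `Kuid` values does not compile.
impl PartialEq for Kuid {
    fn eq(&self, other: &Kuid) -> bool {
        uid_eq(self.kuid, other.kuid)
    }
}

// `Eq` has no methods; it only states that `==` is a full equivalence
// relation (in particular, every value is equal to itself).
impl Eq for Kuid {}

/// The kind of check a caller can now write with plain `==`.
fn same_owner(a: Kuid, b: Kuid) -> bool {
    a == b
}

fn main() {
    let a = Kuid::from_raw_val(1000);
    let b = Kuid::from_raw_val(1000);
    let c = Kuid::from_raw_val(0);
    println!("{} {}", same_owner(a, b), same_owner(a, c));
}
```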

Alice

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH 5/7] rust: file: add `Kuid` wrapper
  2023-11-29 13:12 ` [PATCH 5/7] rust: file: add `Kuid` wrapper Alice Ryhl
  2023-11-29 16:28   ` Christian Brauner
@ 2023-11-30 10:36   ` Peter Zijlstra
  2023-12-06 20:02     ` Kent Overstreet
  2023-11-30 16:48   ` Benno Lossin
  2 siblings, 1 reply; 96+ messages in thread
From: Peter Zijlstra @ 2023-11-30 10:36 UTC (permalink / raw)
  To: Alice Ryhl
  Cc: Miguel Ojeda, Alex Gaynor, Wedson Almeida Filho, Boqun Feng,
	Gary Guo, Björn Roy Baron, Benno Lossin, Andreas Hindborg,
	Alexander Viro, Christian Brauner, Greg Kroah-Hartman,
	Arve Hjønnevåg, Todd Kjos, Martijn Coenen,
	Joel Fernandes, Carlos Llamas, Suren Baghdasaryan, Dan Williams,
	Kees Cook, Matthew Wilcox, Thomas Gleixner, Daniel Xu,
	linux-kernel, rust-for-linux, linux-fsdevel

On Wed, Nov 29, 2023 at 01:12:17PM +0000, Alice Ryhl wrote:

> diff --git a/rust/helpers.c b/rust/helpers.c
> index fd633d9db79a..58e3a9dff349 100644
> --- a/rust/helpers.c
> +++ b/rust/helpers.c
> @@ -142,6 +142,51 @@ void rust_helper_put_task_struct(struct task_struct *t)
>  }
>  EXPORT_SYMBOL_GPL(rust_helper_put_task_struct);
>  
> +kuid_t rust_helper_task_uid(struct task_struct *task)
> +{
> +	return task_uid(task);
> +}
> +EXPORT_SYMBOL_GPL(rust_helper_task_uid);
> +
> +kuid_t rust_helper_task_euid(struct task_struct *task)
> +{
> +	return task_euid(task);
> +}
> +EXPORT_SYMBOL_GPL(rust_helper_task_euid);

Aren't these like ideal speculation gadgets? And shouldn't we avoid
functions like this for exactly that reason?


^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH 1/7] rust: file: add Rust abstraction for `struct file`
  2023-11-29 17:08       ` Boqun Feng
@ 2023-11-30 10:42         ` Peter Zijlstra
  2023-11-30 15:25           ` Boqun Feng
  0 siblings, 1 reply; 96+ messages in thread
From: Peter Zijlstra @ 2023-11-30 10:42 UTC (permalink / raw)
  To: Boqun Feng
  Cc: Matthew Wilcox, Alice Ryhl, Miguel Ojeda, Alex Gaynor,
	Wedson Almeida Filho, Gary Guo, Björn Roy Baron,
	Benno Lossin, Andreas Hindborg, Alexander Viro,
	Christian Brauner, Greg Kroah-Hartman, Arve Hjønnevåg,
	Todd Kjos, Martijn Coenen, Joel Fernandes, Carlos Llamas,
	Suren Baghdasaryan, Dan Williams, Kees Cook, Thomas Gleixner,
	Daniel Xu, linux-kernel, rust-for-linux, linux-fsdevel

On Wed, Nov 29, 2023 at 09:08:14AM -0800, Boqun Feng wrote:

> But but but, I then realized we have asm goto in C but Rust doesn't
> support them, and I haven't thought through how hard that would be..

You're kidding right?

I thought we *finally* deprecated all compilers that didn't support
asm-goto and x86 now mandates asm-goto to build, and then this toy
language comes around ?

What a load of crap ... 

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH 1/7] rust: file: add Rust abstraction for `struct file`
  2023-11-29 21:27     ` Alice Ryhl
  2023-11-29 23:17       ` Benno Lossin
@ 2023-11-30 10:48       ` Christian Brauner
  2023-11-30 12:10         ` Alice Ryhl
  1 sibling, 1 reply; 96+ messages in thread
From: Christian Brauner @ 2023-11-30 10:48 UTC (permalink / raw)
  To: Alice Ryhl
  Cc: a.hindborg, alex.gaynor, arve, benno.lossin, bjorn3_gh,
	boqun.feng, cmllamas, dan.j.williams, dxu, gary, gregkh, joel,
	keescook, linux-fsdevel, linux-kernel, maco, ojeda, peterz,
	rust-for-linux, surenb, tglx, tkjos, viro, wedsonaf, willy

On Wed, Nov 29, 2023 at 09:27:07PM +0000, Alice Ryhl wrote:
> Christian Brauner <brauner@kernel.org> writes:
> >> This abstraction makes it possible to manipulate the open files for a
> >> process. The new `File` struct wraps the C `struct file`. When accessing
> >> it using the smart pointer `ARef<File>`, the pointer will own a
> >> reference count to the file. When accessing it as `&File`, then the
> >> reference does not own a refcount, but the borrow checker will ensure
> >> that the reference count does not hit zero while the `&File` is live.
> > 
> > Could you explain this in more details please? Ideally with some C and
> > how that translates to your Rust wrappers, please. Sorry, this is going
> > to be a long journey...
> 
> Yes of course. This touches on what I think is one of the most important

Thanks for all the background.

> features that Rust brings to the table, which is the idea of defining
> many different pointer types for different ownership semantics.
> 
> In the case of `struct file`, there are two pointer types that are
> relevant:
> 
>  * `&File`. This is an "immutable reference" or "shared reference"
>    (both names are used). This pointer type is used when you don't have
>    any ownership over the target at all. All you have is _some_ sort of
>    guarantee that target stays alive while the reference is live. In
>    many cases, the borrow checker helps validate this at compile-time,
>    but you can also use a backdoor (in this case from_ptr) to just
>    unsafely say "I know this value is valid, so shut up compiler".
>    Shared references have no destructor.
> 
>  * `ARef<File>`. The `ARef` type is a custom pointer type defined in the
>    kernel in `rust/kernel/types.rs`. It represents a pointer that owns a
>    ref-count to the inner value. ARef can only be used with types that
>    have an `unsafe impl AlwaysRefCounted for T` block. Whenever you
>    clone an `ARef`, it calls into the `inc_ref` method that you defined
>    for the type, and whenever it goes out of scope and the destructor
>    runs, it calls the `dec_ref` method that you defined for the type.
> 
> Potentially we might want a third in the future. The third pointer type
> could be another custom pointer type just for `struct file` that uses
> `fdget` instead of `fget`. However, I haven't added this since I don't
> need it (dead code and so on).
> 
> To give an example of this, consider this really simple C function:
> 
> 	bool is_nonblocking(struct file *file) {
> 		return !!(file->f_flags & O_NONBLOCK);
> 	}
> 
> What are the ownership semantics of `file`? Well, we don't really care.
> The caller needs to somehow ensure that the `file` is valid, but we
> don't care if they're doing that with `fdget` or `fget` or whatever.
> This corresponds to &File, so the Rust equivalent would be:
> 
> 	fn is_nonblocking(file: &File) -> bool {
> 		(file.flags() & O_NONBLOCK) != 0
> 	}
> 
> Another example:
> 
> 	void set_nonblocking_and_fput(struct file *file) {
> 		// Let's just ignore the lock for this example.
> 		file->f_flags |= O_NONBLOCK;
> 
> 		fput(file);
> 	}
> 
> This method takes a file, sets it to non-blocking, and then destroys the
> ref-count. What are the ownership semantics? Well, the caller should own
> an `fget` ref-count, and we consume that ref-count. The equivalent Rust
> code would be to take an `ARef<File>`:
> 
> 	fn set_nonblocking_and_fput(file: ARef<File>) {
> 		file.set_flag(O_NONBLOCK);
> 
> 		// When `file` goes out of scope here, the destructor
> 		// runs and calls `fput`. (Since that's what we defined
> 		// `ARef` to do on drop in `fn dec_ref`.)
> 	}
> 
> You can also explicitly call the destructor with `drop(file)`:
> 
> 	fn set_nonblocking_and_fput(file: ARef<File>) {
> 		file.set_flag(O_NONBLOCK);
> 		drop(file);
> 
> 		// When `file` goes out of scope, the destructor does
> 		// *not* run. This is because `drop(file)` is a move
> 		// (due to the signature of drop), and if you perform a
> 		// move, then the destructor no longer runs at the end
> 		// of the scope.
> 	}
> 
> And note that this would not compile, because we give up ownership of
> the `ARef` by passing it to `drop`:
> 
> 	fn set_nonblocking_and_fput(file: ARef<File>) {
> 		drop(file);
> 		file.set_flag(O_NONBLOCK);
> 	}
> 
> A third example:
> 
> 	struct holds_a_file {
> 		struct file *inner;
> 	};
> 
> 	struct file *get_the_file(struct holds_a_file *holder) {
> 		return holder->inner;
> 	}
> 
> What are the ownership semantics? Well, let's say that `holds_a_file`
> owns a refcount to the file. Then, the pointer returned by get_the_file
> is valid as long as `holder` is, but it doesn't have any ownership
> over the file. You must stop using the returned file pointer before the
> holder is destroyed.
> 
> The Rust equivalent is:
> 
> 	struct HoldsAFile {
> 		inner: ARef<File>,
> 	}
> 
> 	fn get_the_file(holder: &HoldsAFile) -> &File {
> 		&holder.inner
> 	}
> 
> The method signature is short-hand for (see [1]):
> 
> 	fn get_the_file<'a>(holder: &'a HoldsAFile) -> &'a File {

Whenever you implement something like this - at least for fs/vfs
wrappers - I would ask you to please annotate the lifetimes with
comments. I've done a decent amount of (userspace) Rust
https://github.com/brauner/rlxc but the syntax isn't second nature to me
and I expect there to be quite a few other developers/maintainers that
aren't familiar.

> 		&holder.inner
> 	}
> 
> Here, 'a is a lifetime, and it ties together `holder` and the returned

The lifetime of the file is bound to the lifetime of the holder, ok.

> reference in the way I described above. So e.g., this compiles:
> 
> 	let holder = ...;
> 	let file = get_the_file(&holder);
> 	use_the_file(file);
> 
> But this doesn't:
> 
> 	let holder = ...;
> 	let file = get_the_file(&holder);
> 	drop(holder);
> 	use_the_file(file); // Oops, destroying holder calls fput.
> 
> Notice also how the compiler accepted `&holder.inner` as the type
> `&File` even though `inner` has type `ARef<File>`. This is because
> `ARef` is defined to use something called deref coercion, which makes it
> act like a real pointer type. This means that if you have an
> `ARef<File>`, but you want to call a method that accepts `&File`, then
> it will just work. (Deref coercion only exists for conversions into
> reference types, so you can't pass a `&File` to something that takes an
> `ARef<File>` without explicitly upgrading it to an `ARef<File>` by
> taking a ref-count.)
> 
> [1]: https://doc.rust-lang.org/reference/lifetime-elision.html
> 
> >> +    /// Constructs a new `struct file` wrapper from a file descriptor.
> >> +    ///
> >> +    /// The file descriptor belongs to the current process.
> >> +    pub fn from_fd(fd: u32) -> Result<ARef<Self>, BadFdError> {
> >> +        // SAFETY: FFI call, there are no requirements on `fd`.
> >> +        let ptr = ptr::NonNull::new(unsafe { bindings::fget(fd) }).ok_or(BadFdError)?;
> >> +
> >> +        // INVARIANT: `fget` increments the refcount before returning.
> >> +        Ok(unsafe { ARef::from_raw(ptr.cast()) })
> >> +    }
> > 
> > I think this is really misnamed.
> > 
> > File reference counting has two modes. For simplicity let's ignore
> > fdget_pos() for now:
> > 
> > (1) fdget()
> >     Return file either with or without an increased reference count.
> >     If the fdtable was shared increment reference count, if not don't
> >     increment reference count.
> > (2) fget()
> >     Always increase refcount.
> > 
> > Your File implementation currently only deals with (2). And this
> > terminology is terribly important as far as I'm concerned. This wants to
> > be fget() and not from_fd(). The latter tells me nothing. I feel we
> > really need to try and mirror the current naming closely. Not
> > religiously ofc but core stuff such as this really benefits from having
> > an almost 1:1 mapping between C names and Rust names, I think.
> > Especially in the beginning.
> 
> Sure, I'll rename these methods in the next version.
> 
> >> +    /// Creates a reference to a [`File`] from a valid pointer.
> >> +    ///
> >> +    /// # Safety
> >> +    ///
> >> +    /// The caller must ensure that `ptr` points at a valid file and that its refcount does not
> >> +    /// reach zero during the lifetime 'a.
> >> +    pub unsafe fn from_ptr<'a>(ptr: *const bindings::file) -> &'a File {
> >> +        // INVARIANT: The safety requirements guarantee that the refcount does not hit zero during
> >> +        // 'a. The cast is okay because `File` is `repr(transparent)`.
> >> +        unsafe { &*ptr.cast() }
> >> +    }
> > 
> > How does that work and what is this used for? It's required that a
> > caller has called from_fd()/fget() first before from_ptr() can be used?
> > 
> > Can you show how this would be used in an example, please? Unless you
> > hold file_lock it is now invalid to access fields in struct file just
> > with rcu lock held for example. Which is why returning a pointer without
> > holding a reference seems dodgy. I'm probably just missing context.
> 
> This is the backdoor. You use it when *you* know that the file is okay

And a huge one.

> to access, but Rust doesn't. It's unsafe because it's not checked by
> Rust.
> 
> For example you could do this:
> 
> 	let ptr = unsafe { bindings::fdget(fd) };
> 
> 	// SAFETY: We just called `fdget`.
> 	let file = unsafe { File::from_ptr(ptr) };
> 	use_file(file);
> 
> 	// SAFETY: We're not using `file` after this call.
> 	unsafe { bindings::fdput(ptr) };
> 
> It's used in Binder here:
> https://github.com/Darksonn/linux/blob/dca45e6c7848e024709b165a306cdbe88e5b086a/drivers/android/rust_binder.rs#L331-L332
> 
> Basically, I use it to say "C code has called fdget for us so it's okay
> to access the file", whenever userspace uses a syscall to call into the
> driver.

Yeah, ok, because the fd you're operating on may be coming from fdget(). Iirc,
binder is almost by default used multi-threaded with a shared file descriptor
table? But while that means fdget() will usually bump the reference count you
can't be sure. Hmkay.

> 
> >> +// SAFETY: The type invariants guarantee that `File` is always ref-counted.
> >> +unsafe impl AlwaysRefCounted for File {
> >> +    fn inc_ref(&self) {
> >> +        // SAFETY: The existence of a shared reference means that the refcount is nonzero.
> >> +        unsafe { bindings::get_file(self.0.get()) };
> >> +    }
> > 
> > Why inc_ref() and not just get_file()?
> 
> Whenever you see an impl block that uses the keyword "for", then the
> code is implementing a trait. In this case, the trait being implemented
> is AlwaysRefCounted, which allows File to work with ARef.

Ah, right. Thanks.

> 
> It has to be `inc_ref` because that's what AlwaysRefCounted calls this
> method.
> 
> >> +    unsafe fn dec_ref(obj: ptr::NonNull<Self>) {
> >> +        // SAFETY: The safety requirements guarantee that the refcount is nonzero.
> >> +        unsafe { bindings::fput(obj.cast().as_ptr()) }
> >> +    }
> > 
> > Ok, so this makes me think that from_ptr() requires you to have called
> > from_fd()/fget() first which would be good.
> 
> Actually, `from_ptr` has nothing to do with this. The above code only
> applies to code that uses the `ARef` pointer type, but `from_ptr` uses
> the `&File` pointer type instead.
> 
> >> +    /// Returns the flags associated with the file.
> >> +    ///
> >> +    /// The flags are a combination of the constants in [`flags`].
> >> +    pub fn flags(&self) -> u32 {
> >> +        // This `read_volatile` is intended to correspond to a READ_ONCE call.
> >> +        //
> >> +        // SAFETY: The file is valid because the shared reference guarantees a nonzero refcount.
> > 
> > I really need to understand what you mean by shared reference. At least
> > in the current C implementation you can't share a reference without
> > another task as the other task might fput() behind you and then you're
> > hosed. That's why we have the fdget() logic.
> 
> By "shared reference" I just mean an `&File`. They're called shared
> because there could be other pointers to the same object elsewhere in
> the program, and not because we have explicitly shared it ourselves.

Ok, that was confusing to me because I wasn't sure whether you were talking
about sharing an ->f_count reference.

> 
> Rust's other type of reference `&mut T` is called a "mutable reference"
> or "exclusive reference". Like with `&T`, both names are used.
> 
> > > +// SAFETY: It's OK to access `File` through shared references from other threads because we're
> > > +// either accessing properties that don't change or that are properly synchronised by C code.
> > 
> > Uhm, what guarantees are you talking about specifically, please?
> > Examples would help.
> > 
> > > +unsafe impl Sync for File {}
> 
> The Sync trait defines whether a value may be accessed from several
> threads in parallel (using shared/immutable references). In our case,

So let me put this into my own words and you correct me, please:

So, this really just means that if I have two processes both with their own
fdtable and they happen to hold fds that refer to the same @file:

P1				P2
struct fd fd1 = fdget(1234);
                                 struct fd fd2 = fdget(5678);
if (!fd1.file)                   if (!fd2.file)
	return -EBADF;                 return -EBADF;

// fd1.file == fd2.file

then only if the Sync trait is implemented can both P1 and P2 call
file->f_op->poll(@file) in parallel?

So if the Sync trait isn't implemented, then the compiler will prohibit P1
and P2 from calling file->f_op->poll(@file) at the same time? And that's all
that's meant by a shared reference? It's really about sharing the pointer.

The thing is that "shared reference" gets a bit in our way here:

(1) If you have SCM_RIGHTs in the mix then P1 can open fd1 to @file and then
    send that @file to P2 which now has fd2 referring to @file as well. The
    @file->f_count is bumped in that process. So @file->f_count is now 2.

    Now both P1 and P2 call fdget(). Since they don't have a shared fdtable
    neither of them takes an additional reference to @file. IOW, @file->f_count
    may remain 2 all throughout the @file->f_op->*() operation.

    So they share a reference to that file and elide both the
    atomic_inc_not_zero() and the atomic_dec_not_zero().

(2) io_uring has fixed files whose reference count always stays at 1.
    So all io_uring operations on such fixed files share a single reference.

So that's why this is a bit confusing at first to read "shared reference".

Please add a comment on top of `unsafe impl Sync for File {}` clarifying
that it's about calling methods on the same file.

> every method on `File` that accepts a `&File` is okay to be called in
> parallel from several threads, so it's okay for `File` to implement
> `Sync`.
> 
> I'm actually making a statement about the rest of the Rust code in this
> file here. If I added a method that took `&File`, but couldn't be called
> in parallel, then `File` could no longer be `Sync`.


* Re: [PATCH 4/7] rust: file: add `FileDescriptorReservation`
  2023-11-30  9:17         ` Alice Ryhl
@ 2023-11-30 10:51           ` Christian Brauner
  2023-11-30 11:54             ` Alice Ryhl
  0 siblings, 1 reply; 96+ messages in thread
From: Christian Brauner @ 2023-11-30 10:51 UTC (permalink / raw)
  To: Alice Ryhl
  Cc: a.hindborg, alex.gaynor, arve, benno.lossin, bjorn3_gh,
	boqun.feng, cmllamas, dan.j.williams, dxu, gary, gregkh, joel,
	keescook, linux-fsdevel, linux-kernel, maco, ojeda, peterz,
	rust-for-linux, surenb, tglx, tkjos, viro, wedsonaf, willy

On Thu, Nov 30, 2023 at 09:17:56AM +0000, Alice Ryhl wrote:
> Christian Brauner <brauner@kernel.org> writes:
> >>>> +    /// Prevent values of this type from being moved to a different task.
> >>>> +    ///
> >>>> +    /// This is necessary because the C FFI calls assume that `current` is set to the task that
> >>>> +    /// owns the fd in question.
> >>>> +    _not_send_sync: PhantomData<*mut ()>,
> >>> 
> >>> I don't fully understand this. Can you explain in a little more detail
> >>> what you mean by this and how this works?
> >> 
> >> Yeah, so, this has to do with the Rust trait `Send` that controls
> >> whether it's okay for a value to get moved from one thread to another.
> >> In this case, we don't want it to be `Send` so that it can't be moved to
> >> another thread, since current might be different there.
> >> 
> >> The `Send` trait is automatically applied to structs whenever *all*
> >> fields of the struct are `Send`. So to ensure that a struct is not
> >> `Send`, you add a field that is not `Send`.
> >> 
> >> The `PhantomData` type used here is a special zero-sized type.
> >> Basically, it says "pretend this struct has a field of type `*mut ()`,
> >> but don't actually add the field". So for the purposes of `Send`, it has
> >> a non-Send field, but since it's wrapped in `PhantomData`, the field is
> >> not there at runtime.
> > 
> > This is probably a stupid suggestion/question. But while PhantomData gives
> > the right hint of what is happening I wouldn't mind if that was very
> > explicitly called NoSendTrait or just add the explanatory comment. Yes,
> > that's a lot of verbiage but you'd help us a lot.
> 
> I suppose we could add a typedef:
> 
> type NoSendTrait = PhantomData<*mut ()>;
> 
> and use that as the field type. The way I did it here is the "standard"
> way of doing it, and if you look at code outside the kernel, you will
> also find them using `PhantomData` like this. However, I don't mind
> adding the typedef if you think it is helpful.

I'm fine with just a comment as well. I just need to be able to read
this a bit faster. I'm basically losing half a day just dealing with
this patchset and that's not realistic if I want to keep up with other
patches that get sent.

And if you resend and someone else reviews you might have to answer the
same question again.


* Re: [PATCH 5/7] rust: file: add `Kuid` wrapper
  2023-11-30  9:36     ` Alice Ryhl
@ 2023-11-30 10:52       ` Christian Brauner
  0 siblings, 0 replies; 96+ messages in thread
From: Christian Brauner @ 2023-11-30 10:52 UTC (permalink / raw)
  To: Alice Ryhl
  Cc: a.hindborg, alex.gaynor, arve, benno.lossin, bjorn3_gh,
	boqun.feng, cmllamas, dan.j.williams, dxu, gary, gregkh, joel,
	keescook, linux-fsdevel, linux-kernel, maco, ojeda, peterz,
	rust-for-linux, surenb, tglx, tkjos, viro, wedsonaf, willy

On Thu, Nov 30, 2023 at 09:36:03AM +0000, Alice Ryhl wrote:
> Christian Brauner <brauner@kernel.org> writes:
> > I'm a bit puzzled by all these rust_helper_*() calls. Can you explain
> > why they are needed? Because they are/can be static inlines and that
> > somehow doesn't work?
> 
> Yes, it's because the methods are inline. Rust can only call C methods
> that are actually exported by the C code.
> 
> >> +    /// Converts this kernel UID into a UID that userspace understands. Uses the namespace of the
> >> +    /// current task.
> >> +    pub fn into_uid_in_current_ns(self) -> bindings::uid_t {
> > 
> > Hm, I wouldn't special-case this. Just expose from_kuid() and let it
> > take a namespace argument, no? You don't need to provide bindings for
> > namespaces ofc.
> 
> To make `from_kuid` safe, I would need to wrap the namespace type too. I
> could do that, but it would be more code than this method because I need
> another wrapper struct and so on.
> 
> Personally I would prefer to special-case it until someone needs the
> non-special-case. Then, they can delete this method when they introduce
> the non-special-case.
> 
> But I'll do it if you think I should.

No, don't start wrapping namespaces as well. You already do parts of LSM
as well.

> 
> >> +impl PartialEq for Kuid {
> >> +    fn eq(&self, other: &Kuid) -> bool {
> >> +        // SAFETY: Just an FFI call.
> >> +        unsafe { bindings::uid_eq(self.kuid, other.kuid) }
> >> +    }
> >> +}
> >> +
> >> +impl Eq for Kuid {}
> > 
> > Do you need that?
> 
> Yes. This is the code that tells the compiler what `==` means for the
> `Kuid` type. Binder uses it here:

Ok, thanks.
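
For readers less used to Rust, here is a rough userspace sketch of what
that `PartialEq` impl buys you. None of this is the kernel code: `RawKuid`
and the local `uid_eq` are made-up stand-ins for `kuid_t` and the C helper,
and `same_owner` is a hypothetical caller.

```rust
/// Hypothetical stand-in for the C `kuid_t` (an assumption for this sketch).
#[derive(Clone, Copy)]
struct RawKuid {
    val: u32,
}

/// Stand-in for the C `uid_eq()` helper; the real one is reached through FFI.
fn uid_eq(a: RawKuid, b: RawKuid) -> bool {
    a.val == b.val
}

/// Wrapper type, shaped like the `Kuid` in the patch.
#[derive(Clone, Copy)]
struct Kuid {
    kuid: RawKuid,
}

// Without this impl, `a == b` on two `Kuid` values does not compile.
impl PartialEq for Kuid {
    fn eq(&self, other: &Kuid) -> bool {
        uid_eq(self.kuid, other.kuid)
    }
}

// `Eq` adds no methods; it only promises that `==` is a full equivalence relation.
impl Eq for Kuid {}

/// Hypothetical caller that compares two UIDs with the `==` operator.
fn same_owner(a: Kuid, b: Kuid) -> bool {
    a == b
}
```

The point is only that `==` is not available on a user-defined type until
the type says what it means, which here is "ask `uid_eq`".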


* Re: [PATCH 4/7] rust: file: add `FileDescriptorReservation`
  2023-11-30 10:51           ` Christian Brauner
@ 2023-11-30 11:54             ` Alice Ryhl
  2023-11-30 12:17               ` Benno Lossin
  0 siblings, 1 reply; 96+ messages in thread
From: Alice Ryhl @ 2023-11-30 11:54 UTC (permalink / raw)
  To: brauner
  Cc: a.hindborg, alex.gaynor, aliceryhl, arve, benno.lossin,
	bjorn3_gh, boqun.feng, cmllamas, dan.j.williams, dxu, gary,
	gregkh, joel, keescook, linux-fsdevel, linux-kernel, maco, ojeda,
	peterz, rust-for-linux, surenb, tglx, tkjos, viro, wedsonaf,
	willy

Christian Brauner <brauner@kernel.org> writes:
> On Thu, Nov 30, 2023 at 09:17:56AM +0000, Alice Ryhl wrote:
>> Christian Brauner <brauner@kernel.org> writes:
>>>>>> +    /// Prevent values of this type from being moved to a different task.
>>>>>> +    ///
>>>>>> +    /// This is necessary because the C FFI calls assume that `current` is set to the task that
>>>>>> +    /// owns the fd in question.
>>>>>> +    _not_send_sync: PhantomData<*mut ()>,
>>>>> 
>>>>> I don't fully understand this. Can you explain in a little more detail
>>>>> what you mean by this and how this works?
>>>> 
>>>> Yeah, so, this has to do with the Rust trait `Send` that controls
>>>> whether it's okay for a value to get moved from one thread to another.
>>>> In this case, we don't want it to be `Send` so that it can't be moved to
>>>> another thread, since current might be different there.
>>>> 
>>>> The `Send` trait is automatically applied to structs whenever *all*
>>>> fields of the struct are `Send`. So to ensure that a struct is not
>>>> `Send`, you add a field that is not `Send`.
>>>> 
>>>> The `PhantomData` type used here is a special zero-sized type.
>>>> Basically, it says "pretend this struct has a field of type `*mut ()`,
>>>> but don't actually add the field". So for the purposes of `Send`, it has
>>>> a non-Send field, but since it's wrapped in `PhantomData`, the field is
>>>> not there at runtime.
>>> 
>>> This is probably a stupid suggestion/question. But while PhantomData gives
>>> the right hint of what is happening I wouldn't mind if that was very
>>> explicitly called NoSendTrait or just add the explanatory comment. Yes,
>>> that's a lot of verbiage but you'd help us a lot.
>> 
>> I suppose we could add a typedef:
>> 
>> type NoSendTrait = PhantomData<*mut ()>;
>> 
>> and use that as the field type. The way I did it here is the "standard"
>> way of doing it, and if you look at code outside the kernel, you will
>> also find them using `PhantomData` like this. However, I don't mind
>> adding the typedef if you think it is helpful.
> 
> I'm fine with just a comment as well. I just need to be able to read
> this a bit faster. I'm basically losing half a day just dealing with
> this patchset and that's not realistic if I want to keep up with other
> patches that get sent.
> 
> And if you resend and someone else reviews you might have to answer the
> same question again.

What do you think about this wording?

/// Prevent values of this type from being moved to a different task.
/// 
/// This field has the type `PhantomData<*mut ()>`, which does not
/// implement the Send trait. By adding a field with this property, we
/// ensure that the `FileDescriptorReservation` struct will not
/// implement the Send trait either. This has the consequence that the
/// compiler will prevent you from moving values of type
/// `FileDescriptorReservation` into a different task, which we want
/// because other tasks might have a different value of `current`. We
/// want to avoid that because `fd_install` assumes that the value of
/// `current` is unchanged since the call to `get_unused_fd_flags`.
/// 
/// The `PhantomData` type has size zero, so the field does not exist at
/// runtime.

Alice


* Re: [PATCH 1/7] rust: file: add Rust abstraction for `struct file`
  2023-11-30 10:48       ` Christian Brauner
@ 2023-11-30 12:10         ` Alice Ryhl
  2023-11-30 12:36           ` Christian Brauner
  0 siblings, 1 reply; 96+ messages in thread
From: Alice Ryhl @ 2023-11-30 12:10 UTC (permalink / raw)
  To: brauner
  Cc: a.hindborg, alex.gaynor, aliceryhl, arve, benno.lossin,
	bjorn3_gh, boqun.feng, cmllamas, dan.j.williams, dxu, gary,
	gregkh, joel, keescook, linux-fsdevel, linux-kernel, maco, ojeda,
	peterz, rust-for-linux, surenb, tglx, tkjos, viro, wedsonaf,
	willy

Christian Brauner <brauner@kernel.org> writes:
>> This is the backdoor. You use it when *you* know that the file is okay
> 
> And a huge one.
> 
>> to access, but Rust doesn't. It's unsafe because it's not checked by
>> Rust.
>> 
>> For example you could do this:
>> 
>> 	let ptr = unsafe { bindings::fdget(fd) };
>> 
>> 	// SAFETY: We just called `fdget`.
>> 	let file = unsafe { File::from_ptr(ptr) };
>> 	use_file(file);
>> 
>> 	// SAFETY: We're not using `file` after this call.
>> 	unsafe { bindings::fdput(ptr) };
>> 
>> It's used in Binder here:
>> https://github.com/Darksonn/linux/blob/dca45e6c7848e024709b165a306cdbe88e5b086a/drivers/android/rust_binder.rs#L331-L332
>> 
>> Basically, I use it to say "C code has called fdget for us so it's okay
>> to access the file", whenever userspace uses a syscall to call into the
>> driver.
> 
> Yeah, ok, because the fd you're operating on may be coming from fdget(). Iirc,
> binder is almost by default used multi-threaded with a shared file descriptor
> table? But while that means fdget() will usually bump the reference count you
> can't be sure. Hmkay.

Even if the syscall used `fget` instead of `fdget`, I would still be
using `from_ptr` here. The `ARef` type only really makes sense when *we*
have ownership of the ref-count, but in this case we don't own it. We're
just given a promise that the caller is keeping it alive for us using
some mechanism or another.
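
As a loose userspace analogy (not the kernel API), the difference between
borrowing a file and owning a ref-count looks roughly like this. Here
`Arc<File>` stands in for `ARef<File>`, and `File`, `use_file`, `keep_file`
and `demo` are all made-up names for the sketch.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for the wrapped `struct file`.
struct File {
    flags: u32,
}

/// Borrowed access: the caller promises the file stays alive for the call.
/// No ref-count is touched, which mirrors the `from_ptr` case.
fn use_file(file: &File) -> u32 {
    file.flags
}

/// Owned access: we take our own ref-count, so we may keep the file around
/// after the caller drops theirs. This mirrors holding an `ARef<File>`.
fn keep_file(file: &Arc<File>) -> Arc<File> {
    Arc::clone(file)
}

/// Returns the flags read through a borrow, and the ref-count after we
/// additionally took ownership of one reference.
fn demo() -> (u32, usize) {
    let owner = Arc::new(File { flags: 0o2000 });
    let flags = use_file(&owner); // count stays at 1 during this call
    let kept = keep_file(&owner); // count goes to 2
    (flags, Arc::strong_count(&kept))
}
```

In the syscall case only the first shape applies: somebody else holds the
count, and we just look at the file while they do.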

>>>> +// SAFETY: It's OK to access `File` through shared references from other threads because we're
>>>> +// either accessing properties that don't change or that are properly synchronised by C code.
>>> 
>>> Uhm, what guarantees are you talking about specifically, please?
>>> Examples would help.
>>> 
>>>> +unsafe impl Sync for File {}
>> 
>> The Sync trait defines whether a value may be accessed from several
>> threads in parallel (using shared/immutable references). In our case,
> 
> So let me put this into my own words and you correct me, please:
> 
> So, this really just means that if I have two processes both with their own
> fdtable and they happen to hold fds that refer to the same @file:
> 
> P1				P2
> struct fd fd1 = fdget(1234);
>                                  struct fd fd2 = fdget(5678);
> if (!fd1.file)                   if (!fd2.file)
> 	return -EBADF;                 return -EBADF;
> 
> // fd1.file == fd2.file
> 
> then only if the Sync trait is implemented can both P1 and P2 call
> file->f_op->poll(@file) in parallel?
> 
> So if the Sync trait isn't implemented then the compiler will prohibit that P1
> and P2 at the same time call file->f_op->poll(@file)? And that's all that's
> meant by a shared reference? It's really about sharing the pointer.

Yeah, what you're saying sounds correct. For a type that is not Sync,
you would need a lock around the call to `poll` before the compiler
would accept the call.

(Or some other mechanism to convince the compiler that no other thread
is looking at the file at the same time. Of course, a lock is just one
way to do that.)
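
To make that concrete, here is a small userspace sketch (an illustration
only, not kernel code). `PollState` is a made-up type that is deliberately
not Sync because it uses `Cell`; the `Mutex` is what lets the compiler
accept calls to `poll` from several threads.

```rust
use std::cell::Cell;
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical type that is not `Sync`: `Cell` permits mutation through
/// `&self` without synchronization, so `&PollState` may not cross threads.
struct PollState {
    polls: Cell<u32>,
}

impl PollState {
    /// Takes `&self`, yet mutates; only safe if one thread calls it at a time.
    fn poll(&self) {
        self.polls.set(self.polls.get() + 1);
    }
}

/// Calls `poll` once from each of `threads` threads. Sharing a bare
/// `Arc<PollState>` here would be rejected at compile time; wrapping the
/// value in a `Mutex` proves that only one thread looks at it at a time.
fn poll_in_parallel(threads: u32) -> u32 {
    let state = Arc::new(Mutex::new(PollState { polls: Cell::new(0) }));
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let state = Arc::clone(&state);
            thread::spawn(move || {
                state.lock().unwrap().poll();
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    let total = state.lock().unwrap().polls.get();
    total
}
```

A type that is Sync, as `File` is declared to be, needs no such wrapper:
`&File` can be handed to several threads directly.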

> The thing is that "shared reference" gets a bit in our way here:
> 
> (1) If you have SCM_RIGHTS in the mix then P1 can open fd1 to @file and then
>     send that @file to P2 which now has fd2 referring to @file as well. The
>     @file->f_count is bumped in that process. So @file->f_count is now 2.
> 
>     Now both P1 and P2 call fdget(). Since they don't have a shared fdtable
>     none of them take an additional reference to @file. IOW, @file->f_count
>     may remain 2 all throughout the @file->f_op->*() operation.
> 
>     So they share a reference to that file and elide both the
>     atomic_inc_not_zero() and the atomic_dec_not_zero().
> 
> (2) io_uring has fixed files whose reference count always stays at 1.
>     So all io_uring operations on such fixed files share a single reference.
> 
> So that's why this is a bit confusing at first to read "shared reference".
> 
> Please add a comment on top of unsafe impl Sync for File {}
> explaining/clarifying this a little that it's about calling methods on the same
> file.

Yeah, I agree, the terminology gets a bit mixed up here because we both
use the word "reference" for different things.

How about this comment?

/// All methods defined on `File` that take `&self` are safe to call even if
/// other threads are concurrently accessing the same `struct file`, because
/// those methods either access immutable properties or have proper
/// synchronization to ensure that such accesses are safe.

Note: Here, I say "take &self" to refer to methods with &self in the
signature. This signature means that you pass a &File to the method when
you call it.

Alice



* Re: [PATCH 4/7] rust: file: add `FileDescriptorReservation`
  2023-11-30 11:54             ` Alice Ryhl
@ 2023-11-30 12:17               ` Benno Lossin
  2023-11-30 12:33                 ` Christian Brauner
  0 siblings, 1 reply; 96+ messages in thread
From: Benno Lossin @ 2023-11-30 12:17 UTC (permalink / raw)
  To: Alice Ryhl
  Cc: brauner, a.hindborg, alex.gaynor, arve, bjorn3_gh, boqun.feng,
	cmllamas, dan.j.williams, dxu, gary, gregkh, joel, keescook,
	linux-fsdevel, linux-kernel, maco, ojeda, peterz, rust-for-linux,
	surenb, tglx, tkjos, viro, wedsonaf, willy

On 30.11.23 12:54, Alice Ryhl wrote:
> Christian Brauner <brauner@kernel.org> writes:
>> On Thu, Nov 30, 2023 at 09:17:56AM +0000, Alice Ryhl wrote:
>>> Christian Brauner <brauner@kernel.org> writes:
>>>>>>> +    /// Prevent values of this type from being moved to a different task.
>>>>>>> +    ///
>>>>>>> +    /// This is necessary because the C FFI calls assume that `current` is set to the task that
>>>>>>> +    /// owns the fd in question.
>>>>>>> +    _not_send_sync: PhantomData<*mut ()>,
>>>>>>
>>>>>> I don't fully understand this. Can you explain in a little more detail
>>>>>> what you mean by this and how this works?
>>>>>
>>>>> Yeah, so, this has to do with the Rust trait `Send` that controls
>>>>> whether it's okay for a value to get moved from one thread to another.
>>>>> In this case, we don't want it to be `Send` so that it can't be moved to
>>>>> another thread, since current might be different there.
>>>>>
>>>>> The `Send` trait is automatically applied to structs whenever *all*
>>>>> fields of the struct are `Send`. So to ensure that a struct is not
>>>>> `Send`, you add a field that is not `Send`.
>>>>>
>>>>> The `PhantomData` type used here is a special zero-sized type.
>>>>> Basically, it says "pretend this struct has a field of type `*mut ()`,
>>>>> but don't actually add the field". So for the purposes of `Send`, it has
>>>>> a non-Send field, but since it's wrapped in `PhantomData`, the field is
>>>>> not there at runtime.
>>>>
>>>> This is probably a stupid suggestion/question. But while PhantomData gives
>>>> the right hint of what is happening I wouldn't mind if that was very
>>>> explicitly called NoSendTrait or just add the explanatory comment. Yes,
>>>> that's a lot of verbiage but you'd help us a lot.
>>>
>>> I suppose we could add a typedef:
>>>
>>> type NoSendTrait = PhantomData<*mut ()>;
>>>
>>> and use that as the field type. The way I did it here is the "standard"
>>> way of doing it, and if you look at code outside the kernel, you will
>>> also find them using `PhantomData` like this. However, I don't mind
>>> adding the typedef if you think it is helpful.
>>
>> I'm fine with just a comment as well. I just need to be able to read
>> this a bit faster. I'm basically losing half a day just dealing with
>> this patchset and that's not realistic if I want to keep up with other
>> patches that get sent.
>>
>> And if you resend and someone else reviews you might have to answer the
>> same question again.
> 
> What do you think about this wording?
> 
> /// Prevent values of this type from being moved to a different task.
> ///
> /// This field has the type `PhantomData<*mut ()>`, which does not
> /// implement the Send trait. By adding a field with this property, we
> /// ensure that the `FileDescriptorReservation` struct will not
> /// implement the Send trait either. This has the consequence that the
> /// compiler will prevent you from moving values of type
> /// `FileDescriptorReservation` into a different task, which we want
> /// because other tasks might have a different value of `current`. We
> /// want to avoid that because `fd_install` assumes that the value of
> /// `current` is unchanged since the call to `get_unused_fd_flags`.
> ///
> /// The `PhantomData` type has size zero, so the field does not exist at
> /// runtime.
> 
> Alice

I don't think it is a good idea to add this big comment to every
`PhantomData` field. I would much rather have a type alias:

    /// Zero-sized type to mark types not [`Send`].
    ///
    /// Add this type as a field to your struct if your type should not be sent to a different task.
    /// Since [`Send`] is an auto trait, adding a single field that is [`!Send`] will ensure that the
    /// whole type is [`!Send`].
    ///
    /// If a type is [`!Send`] it is impossible to give control over an instance of the type to another
    /// task. This is useful when a type stores task-local information, for example file descriptors.
    pub type NotSend = PhantomData<*mut ()>;

If you have suggestions for improving the doc comment or the name,
please go ahead.

This doesn't mean that there should be no comment on the `NotSend`
field of `FileDescriptorReservation`, but I don't want to repeat
the `Send` stuff all over the place (since it comes up a lot):

    /// Ensure that `FileDescriptorReservation` cannot be sent to a different task, since there the
    /// value of `current` is different. We want to avoid that because `fd_install` assumes that the
    /// value of `current` is unchanged since the call to `get_unused_fd_flags`.
    _not_send: NotSend,
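
A self-contained userspace sketch of how such an alias behaves, under the
assumption that it is defined as above. `Reservation` is a made-up stand-in
for `FileDescriptorReservation`, and `marker_is_free` is a hypothetical
helper that only exists to show the field costs nothing at runtime.

```rust
use std::marker::PhantomData;
use std::mem::size_of;

/// Zero-sized marker: raw pointers are neither `Send` nor `Sync`, and
/// `PhantomData<T>` inherits that from `T`.
type NotSend = PhantomData<*mut ()>;

/// Hypothetical stand-in for `FileDescriptorReservation`.
struct Reservation {
    fd: u32,
    _not_send: NotSend,
}

impl Reservation {
    fn new(fd: u32) -> Self {
        Reservation { fd, _not_send: PhantomData }
    }

    fn fd(&self) -> u32 {
        self.fd
    }
}

/// The marker field occupies no space: the struct is as big as its `u32`.
fn marker_is_free() -> bool {
    size_of::<Reservation>() == size_of::<u32>()
}

// The following does not compile, because `Reservation` is not `Send`:
//
//     fn must_be_send<T: Send>(_: T) {}
//     fn try_it() { must_be_send(Reservation::new(3)); }
//
// `std::thread::spawn(move || drop(reservation))` fails for the same reason.
```

So the check happens entirely at compile time; nothing is stored or tested
when the code runs.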

-- 
Cheers,
Benno


* Re: [PATCH 4/7] rust: file: add `FileDescriptorReservation`
  2023-11-30 12:17               ` Benno Lossin
@ 2023-11-30 12:33                 ` Christian Brauner
  0 siblings, 0 replies; 96+ messages in thread
From: Christian Brauner @ 2023-11-30 12:33 UTC (permalink / raw)
  To: Benno Lossin
  Cc: Alice Ryhl, a.hindborg, alex.gaynor, arve, bjorn3_gh, boqun.feng,
	cmllamas, dan.j.williams, dxu, gary, gregkh, joel, keescook,
	linux-fsdevel, linux-kernel, maco, ojeda, peterz, rust-for-linux,
	surenb, tglx, tkjos, viro, wedsonaf, willy

On Thu, Nov 30, 2023 at 12:17:14PM +0000, Benno Lossin wrote:
> On 30.11.23 12:54, Alice Ryhl wrote:
> > Christian Brauner <brauner@kernel.org> writes:
> >> On Thu, Nov 30, 2023 at 09:17:56AM +0000, Alice Ryhl wrote:
> >>> Christian Brauner <brauner@kernel.org> writes:
> >>>>>>> +    /// Prevent values of this type from being moved to a different task.
> >>>>>>> +    ///
> >>>>>>> +    /// This is necessary because the C FFI calls assume that `current` is set to the task that
> >>>>>>> +    /// owns the fd in question.
> >>>>>>> +    _not_send_sync: PhantomData<*mut ()>,
> >>>>>>
> >>>>>> I don't fully understand this. Can you explain in a little more detail
> >>>>>> what you mean by this and how this works?
> >>>>>
> >>>>> Yeah, so, this has to do with the Rust trait `Send` that controls
> >>>>> whether it's okay for a value to get moved from one thread to another.
> >>>>> In this case, we don't want it to be `Send` so that it can't be moved to
> >>>>> another thread, since current might be different there.
> >>>>>
> >>>>> The `Send` trait is automatically applied to structs whenever *all*
> >>>>> fields of the struct are `Send`. So to ensure that a struct is not
> >>>>> `Send`, you add a field that is not `Send`.
> >>>>>
> >>>>> The `PhantomData` type used here is a special zero-sized type.
> >>>>> Basically, it says "pretend this struct has a field of type `*mut ()`,
> >>>>> but don't actually add the field". So for the purposes of `Send`, it has
> >>>>> a non-Send field, but since it's wrapped in `PhantomData`, the field is
> >>>>> not there at runtime.
> >>>>
> >>>> This is probably a stupid suggestion/question. But while PhantomData gives
> >>>> the right hint of what is happening I wouldn't mind if that was very
> >>>> explicitly called NoSendTrait or just add the explanatory comment. Yes,
> >>>> that's a lot of verbiage but you'd help us a lot.
> >>>
> >>> I suppose we could add a typedef:
> >>>
> >>> type NoSendTrait = PhantomData<*mut ()>;
> >>>
> >>> and use that as the field type. The way I did it here is the "standard"
> >>> way of doing it, and if you look at code outside the kernel, you will
> >>> also find them using `PhantomData` like this. However, I don't mind
> >>> adding the typedef if you think it is helpful.
> >>
> >> I'm fine with just a comment as well. I just need to be able to read
> >> this a bit faster. I'm basically losing half a day just dealing with
> >> this patchset and that's not realistic if I want to keep up with other
> >> patches that get sent.
> >>
> >> And if you resend and someone else reviews you might have to answer the
> >> same question again.
> > 
> > What do you think about this wording?
> > 
> > /// Prevent values of this type from being moved to a different task.
> > ///
> > /// This field has the type `PhantomData<*mut ()>`, which does not
> > /// implement the Send trait. By adding a field with this property, we
> > /// ensure that the `FileDescriptorReservation` struct will not
> > /// implement the Send trait either. This has the consequence that the
> > /// compiler will prevent you from moving values of type
> > /// `FileDescriptorReservation` into a different task, which we want
> > /// because other tasks might have a different value of `current`. We
> > /// want to avoid that because `fd_install` assumes that the value of
> > /// `current` is unchanged since the call to `get_unused_fd_flags`.
> > ///
> > /// The `PhantomData` type has size zero, so the field does not exist at
> > /// runtime.
> > 
> > Alice
> 
> I don't think it is a good idea to add this big comment to every
> `PhantomData` field. I would much rather have a type alias:
> 
>     /// Zero-sized type to mark types not [`Send`].
>     ///
>     /// Add this type as a field to your struct if your type should not be sent to a different task.
>     /// Since [`Send`] is an auto trait, adding a single field that is [`!Send`] will ensure that the
>     /// whole type is [`!Send`].
>     ///
>     /// If a type is [`!Send`] it is impossible to give control over an instance of the type to another
>     /// task. This is useful when a type stores task-local information, for example file descriptors.
>     pub type NotSend = PhantomData<*mut ()>;
> 
> If you have suggestions for improving the doc comment or the name,
> please go ahead.
> 
> This doesn't mean that there should be no comment on the `NotSend`
> field of `FileDescriptorReservation`, but I don't want to repeat
> the `Send` stuff all over the place (since it comes up a lot):
> 
>     /// Ensure that `FileDescriptorReservation` cannot be sent to a different task, since there the
>     /// value of `current` is different. We want to avoid that because `fd_install` assumes that the
>     /// value of `current` is unchanged since the call to `get_unused_fd_flags`.
>     _not_send: NotSend,

Seems sane to me. But I would suggest to move away from the "send"
terminology?

* CurrentOnly
* AccessCurrentTask vs AccessForeignTask
* NoForeignTaskAccess
* TaskLocalContext
* TaskCurrentAccess

Or some other variant thereof.


* Re: [PATCH 1/7] rust: file: add Rust abstraction for `struct file`
  2023-11-30 12:10         ` Alice Ryhl
@ 2023-11-30 12:36           ` Christian Brauner
  0 siblings, 0 replies; 96+ messages in thread
From: Christian Brauner @ 2023-11-30 12:36 UTC (permalink / raw)
  To: Alice Ryhl
  Cc: a.hindborg, alex.gaynor, arve, benno.lossin, bjorn3_gh,
	boqun.feng, cmllamas, dan.j.williams, dxu, gary, gregkh, joel,
	keescook, linux-fsdevel, linux-kernel, maco, ojeda, peterz,
	rust-for-linux, surenb, tglx, tkjos, viro, wedsonaf, willy

On Thu, Nov 30, 2023 at 12:10:12PM +0000, Alice Ryhl wrote:
> Christian Brauner <brauner@kernel.org> writes:
> >> This is the backdoor. You use it when *you* know that the file is okay
> > 
> > And a huge one.
> > 
> >> to access, but Rust doesn't. It's unsafe because it's not checked by
> >> Rust.
> >> 
> >> For example you could do this:
> >> 
> >> 	let ptr = unsafe { bindings::fdget(fd) };
> >> 
> >> 	// SAFETY: We just called `fdget`.
> >> 	let file = unsafe { File::from_ptr(ptr) };
> >> 	use_file(file);
> >> 
> >> 	// SAFETY: We're not using `file` after this call.
> >> 	unsafe { bindings::fdput(ptr) };
> >> 
> >> It's used in Binder here:
> >> https://github.com/Darksonn/linux/blob/dca45e6c7848e024709b165a306cdbe88e5b086a/drivers/android/rust_binder.rs#L331-L332
> >> 
> >> Basically, I use it to say "C code has called fdget for us so it's okay
> >> to access the file", whenever userspace uses a syscall to call into the
> >> driver.
> > 
> > Yeah, ok, because the fd you're operating on may be coming from fdget(). Iirc,
> > binder is almost by default used multi-threaded with a shared file descriptor
> > table? But while that means fdget() will usually bump the reference count you
> > can't be sure. Hmkay.
> 
> Even if the syscall used `fget` instead of `fdget`, I would still be
> using `from_ptr` here. The `ARef` type only really makes sense when *we*
> have ownership of the ref-count, but in this case we don't own it. We're
> just given a promise that the caller is keeping it alive for us using
> some mechanism or another.
> 
> >>>> +// SAFETY: It's OK to access `File` through shared references from other threads because we're
> >>>> +// either accessing properties that don't change or that are properly synchronised by C code.
> >>> 
> >>> Uhm, what guarantees are you talking about specifically, please?
> >>> Examples would help.
> >>> 
> >>>> +unsafe impl Sync for File {}
> >> 
> >> The Sync trait defines whether a value may be accessed from several
> >> threads in parallel (using shared/immutable references). In our case,
> > 
> > So let me put this into my own words and you correct me, please:
> > 
> > So, this really just means that if I have two processes both with their own
> > fdtable and they happen to hold fds that refer to the same @file:
> > 
> > P1				P2
> > struct fd fd1 = fdget(1234);
> >                                  struct fd fd2 = fdget(5678);
> > if (!fd1.file)                   if (!fd2.file)
> > 	return -EBADF;                 return -EBADF;
> > 
> > // fd1.file == fd2.file
> > 
> > then only if the Sync trait is implemented can both P1 and P2 call
> > file->f_op->poll(@file) in parallel?
> > 
> > So if the Sync trait isn't implemented then the compiler will prohibit that P1
> > and P2 at the same time call file->f_op->poll(@file)? And that's all that's
> > meant by a shared reference? It's really about sharing the pointer.
> 
> Yeah, what you're saying sounds correct. For a type that is not Sync,
> you would need a lock around the call to `poll` before the compiler
> would accept the call.
> 
> (Or some other mechanism to convince the compiler that no other thread
> is looking at the file at the same time. Of course, a lock is just one
> way to do that.)
> 
> > The thing is that "shared reference" gets a bit in our way here:
> > 
> > (1) If you have SCM_RIGHTS in the mix then P1 can open fd1 to @file and then
> >     send that @file to P2 which now has fd2 referring to @file as well. The
> >     @file->f_count is bumped in that process. So @file->f_count is now 2.
> > 
> >     Now both P1 and P2 call fdget(). Since they don't have a shared fdtable
> >     none of them take an additional reference to @file. IOW, @file->f_count
> >     may remain 2 all throughout the @file->f_op->*() operation.
> > 
> >     So they share a reference to that file and elide both the
> >     atomic_inc_not_zero() and the atomic_dec_not_zero().
> > 
> > (2) io_uring has fixed files whose reference count always stays at 1.
> >     So all io_uring operations on such fixed files share a single reference.
> > 
> > So that's why this is a bit confusing at first to read "shared reference".
> > 
> > Please add a comment on top of unsafe impl Sync for File {}
> > explaining/clarifying this a little that it's about calling methods on the same
> > file.
> 
> Yeah, I agree, the terminology gets a bit mixed up here because we both
> use the word "reference" for different things.

> 
> How about this comment?

Sounds good.


* Re: [PATCH 5/7] rust: file: add `Kuid` wrapper
  2023-11-29 16:48     ` Peter Zijlstra
@ 2023-11-30 12:46       ` Christian Brauner
  2023-12-06 19:59         ` Kent Overstreet
  2023-12-08  5:28         ` comex
  0 siblings, 2 replies; 96+ messages in thread
From: Christian Brauner @ 2023-11-30 12:46 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Alice Ryhl, Miguel Ojeda, Alex Gaynor, Wedson Almeida Filho,
	Boqun Feng, Gary Guo, Björn Roy Baron, Benno Lossin,
	Andreas Hindborg, Alexander Viro, Greg Kroah-Hartman,
	Arve Hjønnevåg, Todd Kjos, Martijn Coenen,
	Joel Fernandes, Carlos Llamas, Suren Baghdasaryan, Dan Williams,
	Kees Cook, Matthew Wilcox, Thomas Gleixner, Daniel Xu,
	linux-kernel, rust-for-linux, linux-fsdevel

On Wed, Nov 29, 2023 at 05:48:15PM +0100, Peter Zijlstra wrote:
> On Wed, Nov 29, 2023 at 05:28:27PM +0100, Christian Brauner wrote:
> 
> > > +pid_t rust_helper_task_tgid_nr_ns(struct task_struct *tsk,
> > > +				  struct pid_namespace *ns)
> > > +{
> > > +	return task_tgid_nr_ns(tsk, ns);
> > > +}
> > > +EXPORT_SYMBOL_GPL(rust_helper_task_tgid_nr_ns);
> > 
> > I'm a bit puzzled by all these rust_helper_*() calls. Can you explain
> > why they are needed? Because they are/can be static inlines and that
> > somehow doesn't work?
> 
> Correct, because Rust can only talk to C ABI, it cannot use C headers.
> Bindgen would need to translate the full C headers into valid Rust for
> that to work.
> 
> I really think the Rust peoples should spend more effort on that,
> because you are quite right, all this wrappery is tedious at best.

The problem is that we end up with a long list of explicit exports that
also are all really weirdly named like rust_helper_*(). I wouldn't even
complain if they were somehow auto-generated but as you say that
might be out of scope.

The thing is though that if I want to change the static inlines I now
also very likely have to care about these explicit Rust wrappers, which
seems less than ideal.

So if we could not do rust_helper_*() exports we'd probably be better
off.

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH 1/7] rust: file: add Rust abstraction for `struct file`
  2023-11-29 12:51 ` [PATCH 1/7] rust: file: add Rust abstraction for `struct file` Alice Ryhl
  2023-11-29 15:13   ` Matthew Wilcox
  2023-11-29 17:06   ` Christian Brauner
@ 2023-11-30 14:53   ` Benno Lossin
  2023-11-30 14:59     ` Greg Kroah-Hartman
  2 siblings, 1 reply; 96+ messages in thread
From: Benno Lossin @ 2023-11-30 14:53 UTC (permalink / raw)
  To: Alice Ryhl
  Cc: Miguel Ojeda, Alex Gaynor, Wedson Almeida Filho, Boqun Feng,
	Gary Guo, Björn Roy Baron, Andreas Hindborg, Peter Zijlstra,
	Alexander Viro, Christian Brauner, Greg Kroah-Hartman,
	Arve Hjønnevåg, Todd Kjos, Martijn Coenen,
	Joel Fernandes, Carlos Llamas, Suren Baghdasaryan, Dan Williams,
	Kees Cook, Matthew Wilcox, Thomas Gleixner, Daniel Xu,
	linux-kernel, rust-for-linux, linux-fsdevel

On 11/29/23 13:51, Alice Ryhl wrote:
> +/// Flags associated with a [`File`].
> +pub mod flags {
> +    /// File is opened in append mode.
> +    pub const O_APPEND: u32 = bindings::O_APPEND;

Why do all of these constants begin with `O_`?

[...]

> +impl File {
> +    /// Constructs a new `struct file` wrapper from a file descriptor.
> +    ///
> +    /// The file descriptor belongs to the current process.
> +    pub fn from_fd(fd: u32) -> Result<ARef<Self>, BadFdError> {
> +        // SAFETY: FFI call, there are no requirements on `fd`.
> +        let ptr = ptr::NonNull::new(unsafe { bindings::fget(fd) }).ok_or(BadFdError)?;
> +
> +        // INVARIANT: `fget` increments the refcount before returning.
> +        Ok(unsafe { ARef::from_raw(ptr.cast()) })

Missing `SAFETY` comment.

> +    }
> +
> +    /// Creates a reference to a [`File`] from a valid pointer.
> +    ///
> +    /// # Safety
> +    ///
> +    /// The caller must ensure that `ptr` points at a valid file and that its refcount does not
> +    /// reach zero during the lifetime 'a.
> +    pub unsafe fn from_ptr<'a>(ptr: *const bindings::file) -> &'a File {
> +        // INVARIANT: The safety requirements guarantee that the refcount does not hit zero during
> +        // 'a. The cast is okay because `File` is `repr(transparent)`.
> +        unsafe { &*ptr.cast() }

Missing `SAFETY` comment.

> +    }
> +
> +    /// Returns the flags associated with the file.
> +    ///
> +    /// The flags are a combination of the constants in [`flags`].
> +    pub fn flags(&self) -> u32 {
> +        // This `read_volatile` is intended to correspond to a READ_ONCE call.
> +        //
> +        // SAFETY: The file is valid because the shared reference guarantees a nonzero refcount.
> +        //
> +        // TODO: Replace with `read_once` when available on the Rust side.
> +        unsafe { core::ptr::addr_of!((*self.0.get()).f_flags).read_volatile() }
> +    }
> +}
> +
> +// SAFETY: The type invariants guarantee that `File` is always ref-counted.
> +unsafe impl AlwaysRefCounted for File {
> +    fn inc_ref(&self) {
> +        // SAFETY: The existence of a shared reference means that the refcount is nonzero.
> +        unsafe { bindings::get_file(self.0.get()) };
> +    }
> +
> +    unsafe fn dec_ref(obj: ptr::NonNull<Self>) {
> +        // SAFETY: The safety requirements guarantee that the refcount is nonzero.
> +        unsafe { bindings::fput(obj.cast().as_ptr()) }
> +    }
> +}
> +
> +/// Represents the `EBADF` error code.
> +///
> +/// Used for methods that can only fail with `EBADF`.
> +pub struct BadFdError;
> +
> +impl From<BadFdError> for Error {
> +    fn from(_: BadFdError) -> Error {
> +        EBADF
> +    }
> +}
> +
> +impl core::fmt::Debug for BadFdError {
> +    fn fmt(&self, f: &mut core::fmt::Formatter<'_>) -> core::fmt::Result {
> +        f.pad("EBADF")
> +    }
> +}

Do we want to generalize this to the other errors as well? We could modify
the `declare_error!` macro in `error.rs` to create these unit structs.

-- 
Cheers,
Benno

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH 1/7] rust: file: add Rust abstraction for `struct file`
  2023-11-30 14:53   ` Benno Lossin
@ 2023-11-30 14:59     ` Greg Kroah-Hartman
  2023-11-30 15:46       ` Benno Lossin
  0 siblings, 1 reply; 96+ messages in thread
From: Greg Kroah-Hartman @ 2023-11-30 14:59 UTC (permalink / raw)
  To: Benno Lossin
  Cc: Alice Ryhl, Miguel Ojeda, Alex Gaynor, Wedson Almeida Filho,
	Boqun Feng, Gary Guo, Björn Roy Baron, Andreas Hindborg,
	Peter Zijlstra, Alexander Viro, Christian Brauner,
	Arve Hjønnevåg, Todd Kjos, Martijn Coenen,
	Joel Fernandes, Carlos Llamas, Suren Baghdasaryan, Dan Williams,
	Kees Cook, Matthew Wilcox, Thomas Gleixner, Daniel Xu,
	linux-kernel, rust-for-linux, linux-fsdevel

On Thu, Nov 30, 2023 at 02:53:35PM +0000, Benno Lossin wrote:
> On 11/29/23 13:51, Alice Ryhl wrote:
> > +/// Flags associated with a [`File`].
> > +pub mod flags {
> > +    /// File is opened in append mode.
> > +    pub const O_APPEND: u32 = bindings::O_APPEND;
> 
> Why do all of these constants begin with `O_`?

Because that is how they are defined in the kernel in the C code.  Why
would they not be the same here?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH 1/7] rust: file: add Rust abstraction for `struct file`
  2023-11-29 15:13   ` Matthew Wilcox
  2023-11-29 15:23     ` Peter Zijlstra
  2023-11-29 16:42     ` Alice Ryhl
@ 2023-11-30 15:02     ` Benno Lossin
  2 siblings, 0 replies; 96+ messages in thread
From: Benno Lossin @ 2023-11-30 15:02 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Alice Ryhl, Miguel Ojeda, Alex Gaynor, Wedson Almeida Filho,
	Boqun Feng, Gary Guo, Björn Roy Baron, Andreas Hindborg,
	Peter Zijlstra, Alexander Viro, Christian Brauner,
	Greg Kroah-Hartman, Arve Hjønnevåg, Todd Kjos,
	Martijn Coenen, Joel Fernandes, Carlos Llamas,
	Suren Baghdasaryan, Dan Williams, Kees Cook, Thomas Gleixner,
	Daniel Xu, linux-kernel, rust-for-linux, linux-fsdevel

On 11/29/23 16:13, Matthew Wilcox wrote:
> On Wed, Nov 29, 2023 at 12:51:07PM +0000, Alice Ryhl wrote:
>> This introduces a struct for the EBADF error type, rather than just
>> using the Error type directly. This has two advantages:
>> * `File::from_fd` returns a `Result<ARef<File>, BadFdError>`, which the
>>   compiler will represent as a single pointer, with null being an error.
>>   This is possible because the compiler understands that `BadFdError`
>>   has only one possible value, and it also understands that the
>>   `ARef<File>` smart pointer is guaranteed non-null.
>> * Additionally, we promise to users of the method that the method can
>>   only fail with EBADF, which means that they can rely on this promise
>>   without having to inspect its implementation.
>> That said, there are also two disadvantages:
>> * Defining additional error types involves boilerplate.
>> * The question mark operator will only utilize the `From` trait once,
>>   which prevents you from using the question mark operator on
>>   `BadFdError` in methods that return some third error type that the
>>   kernel `Error` is convertible into. (However, it works fine in methods
>>   that return `Error`.)
> 
> I haven't looked at how Rust-for-Linux handles errors yet, but it's
> disappointing to see that it doesn't do something like the PTR_ERR /
> ERR_PTR / IS_ERR C thing under the hood.

In this case we are actually doing that: `ARef<T>` is a non-null pointer
to a `T` and since `BadFdError` is a unit struct (i.e. there exists only
a single value it can take) `Result<ARef<T>, BadFdError>` has the same
size as a pointer. This is because the Rust compiler represents the
`Err` variant as null.

We also do have support for `ERR_PTR`, but that requires `unsafe`, since
we do not know which kind of pointer the C side returned (was it an
`ARef<T>`, `&mut T`, `&T` etc.?) and can therefore only support `*mut T`.
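
The layout claim above can be checked outside the kernel with a small
standalone sketch. `ARefLike` is a hypothetical stand-in for the kernel's
`ARef<T>` (assumed here to be a thin wrapper around a non-null pointer), and
`lookup` is an illustrative helper that is not part of the patch:

```rust
use core::mem::size_of;
use core::ptr::NonNull;

/// Hypothetical stand-in for `ARef<T>`: a wrapper around a non-null pointer.
#[repr(transparent)]
pub struct ARefLike<T>(pub NonNull<T>);

/// Unit struct with exactly one possible value, like `BadFdError` in the patch.
#[derive(Debug)]
pub struct BadFdError;

/// Size of the `Result` type under discussion.
pub fn result_size() -> usize {
    size_of::<Result<ARefLike<u8>, BadFdError>>()
}

/// Size of a plain raw pointer, for comparison.
pub fn pointer_size() -> usize {
    size_of::<*const u8>()
}

/// Illustrative helper: maps a possibly-null raw pointer to the `Result`,
/// with null becoming the `Err` variant.
pub fn lookup(ptr: *mut u8) -> Result<ARefLike<u8>, BadFdError> {
    NonNull::new(ptr).map(ARefLike).ok_or(BadFdError)
}
```

With these definitions, `result_size()` and `pointer_size()` are expected to
be equal, because the compiler can use the null bit pattern to encode `Err`.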

-- 
Cheers,
Benno

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH 1/7] rust: file: add Rust abstraction for `struct file`
  2023-11-30 10:42         ` Peter Zijlstra
@ 2023-11-30 15:25           ` Boqun Feng
  2023-12-01  8:53             ` Peter Zijlstra
  2023-12-01  9:00             ` Peter Zijlstra
  0 siblings, 2 replies; 96+ messages in thread
From: Boqun Feng @ 2023-11-30 15:25 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Matthew Wilcox, Alice Ryhl, Miguel Ojeda, Alex Gaynor,
	Wedson Almeida Filho, Gary Guo, Björn Roy Baron,
	Benno Lossin, Andreas Hindborg, Alexander Viro,
	Christian Brauner, Greg Kroah-Hartman, Arve Hjønnevåg,
	Todd Kjos, Martijn Coenen, Joel Fernandes, Carlos Llamas,
	Suren Baghdasaryan, Dan Williams, Kees Cook, Thomas Gleixner,
	Daniel Xu, linux-kernel, rust-for-linux, linux-fsdevel,
	Josh Triplett

On Thu, Nov 30, 2023 at 11:42:26AM +0100, Peter Zijlstra wrote:
> On Wed, Nov 29, 2023 at 09:08:14AM -0800, Boqun Feng wrote:
> 
> > But but but, I then realized we have asm goto in C but Rust doesn't
> > > support them, and I haven't thought through how hard that would be..
> 
> You're kidding right?
> 

I'm not, but I've found this:

	https://github.com/Amanieu/rfcs/blob/inline-asm/text/0000-inline-asm.md#asm-goto

It seems to me that the plan for this is something like the below:

	asm!(
		"cmp {}, 42",
		"jeq {}",
		in(reg) val,
		label { println!("a"); },
		fallthrough { println!("b"); }
	);

But it's not implemented yet. Cc Josh in case that he knows more about
this.

Regards,
Boqun

> I thought we *finally* deprecated all compilers that didn't support
> asm-goto and x86 now mandates asm-goto to build, and then this toy
> language comes around ?
> 
> What a load of crap ... 

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH 1/7] rust: file: add Rust abstraction for `struct file`
  2023-11-30 14:59     ` Greg Kroah-Hartman
@ 2023-11-30 15:46       ` Benno Lossin
  2023-11-30 15:56         ` Greg Kroah-Hartman
  2023-11-30 15:58         ` Theodore Ts'o
  0 siblings, 2 replies; 96+ messages in thread
From: Benno Lossin @ 2023-11-30 15:46 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Alice Ryhl, Miguel Ojeda, Alex Gaynor, Wedson Almeida Filho,
	Boqun Feng, Gary Guo, Björn Roy Baron, Andreas Hindborg,
	Peter Zijlstra, Alexander Viro, Christian Brauner,
	Arve Hjønnevåg, Todd Kjos, Martijn Coenen,
	Joel Fernandes, Carlos Llamas, Suren Baghdasaryan, Dan Williams,
	Kees Cook, Matthew Wilcox, Thomas Gleixner, Daniel Xu,
	linux-kernel, rust-for-linux, linux-fsdevel

On 11/30/23 15:59, Greg Kroah-Hartman wrote:
> On Thu, Nov 30, 2023 at 02:53:35PM +0000, Benno Lossin wrote:
>> On 11/29/23 13:51, Alice Ryhl wrote:
>>> +/// Flags associated with a [`File`].
>>> +pub mod flags {
>>> +    /// File is opened in append mode.
>>> +    pub const O_APPEND: u32 = bindings::O_APPEND;
>>
>> Why do all of these constants begin with `O_`?
> 
> Because that is how they are defined in the kernel in the C code.  Why
> would they not be the same here?

Then why does the C side name them that way? Is it because `O_*` is
supposed to mean something, or is it done due to namespacing?

In Rust we have namespacing, so we generally drop common prefixes.

-- 
Cheers,
Benno

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH 1/7] rust: file: add Rust abstraction for `struct file`
  2023-11-30 15:46       ` Benno Lossin
@ 2023-11-30 15:56         ` Greg Kroah-Hartman
  2023-11-30 15:58         ` Theodore Ts'o
  1 sibling, 0 replies; 96+ messages in thread
From: Greg Kroah-Hartman @ 2023-11-30 15:56 UTC (permalink / raw)
  To: Benno Lossin
  Cc: Alice Ryhl, Miguel Ojeda, Alex Gaynor, Wedson Almeida Filho,
	Boqun Feng, Gary Guo, Björn Roy Baron, Andreas Hindborg,
	Peter Zijlstra, Alexander Viro, Christian Brauner,
	Arve Hjønnevåg, Todd Kjos, Martijn Coenen,
	Joel Fernandes, Carlos Llamas, Suren Baghdasaryan, Dan Williams,
	Kees Cook, Matthew Wilcox, Thomas Gleixner, Daniel Xu,
	linux-kernel, rust-for-linux, linux-fsdevel

On Thu, Nov 30, 2023 at 03:46:55PM +0000, Benno Lossin wrote:
> On 11/30/23 15:59, Greg Kroah-Hartman wrote:
> > On Thu, Nov 30, 2023 at 02:53:35PM +0000, Benno Lossin wrote:
> >> On 11/29/23 13:51, Alice Ryhl wrote:
> >>> +/// Flags associated with a [`File`].
> >>> +pub mod flags {
> >>> +    /// File is opened in append mode.
> >>> +    pub const O_APPEND: u32 = bindings::O_APPEND;
> >>
> >> Why do all of these constants begin with `O_`?
> > 
> > Because that is how they are defined in the kernel in the C code.  Why
> > would they not be the same here?
> 
> Then why does the C side name them that way? Is it because `O_*` is
> supposed to mean something, or is it done due to namespacing?

Because this is a unix-like system, we all "know" what they mean. :)

See 'man 2 open' for details.

> In Rust we have namespacing, so we generally drop common prefixes.

Fine, but we know what this namespace is, please don't override it to be
something else.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH 1/7] rust: file: add Rust abstraction for `struct file`
  2023-11-30 15:46       ` Benno Lossin
  2023-11-30 15:56         ` Greg Kroah-Hartman
@ 2023-11-30 15:58         ` Theodore Ts'o
  2023-11-30 16:12           ` Benno Lossin
  1 sibling, 1 reply; 96+ messages in thread
From: Theodore Ts'o @ 2023-11-30 15:58 UTC (permalink / raw)
  To: Benno Lossin
  Cc: Greg Kroah-Hartman, Alice Ryhl, Miguel Ojeda, Alex Gaynor,
	Wedson Almeida Filho, Boqun Feng, Gary Guo, Björn Roy Baron,
	Andreas Hindborg, Peter Zijlstra, Alexander Viro,
	Christian Brauner, Arve Hjønnevåg, Todd Kjos,
	Martijn Coenen, Joel Fernandes, Carlos Llamas,
	Suren Baghdasaryan, Dan Williams, Kees Cook, Matthew Wilcox,
	Thomas Gleixner, Daniel Xu, linux-kernel, rust-for-linux,
	linux-fsdevel

On Thu, Nov 30, 2023 at 03:46:55PM +0000, Benno Lossin wrote:
> >>> +    pub const O_APPEND: u32 = bindings::O_APPEND;
> >>
> >> Why do all of these constants begin with `O_`?
> > 
> > Because that is how they are defined in the kernel in the C code.  Why
> > would they not be the same here?
> 
> Then why does the C side name them that way? Is it because `O_*` is
> supposed to mean something, or is it done due to namespacing?

It's because these sets of constants were flags passed to the open(2)
system call, and so they are dictated by the POSIX specification.  So
O_ means that they are a set of integer values which are used by
open(2), and they are defined when userspace #include's the fcntl.h
header file.  One could consider it to be namespacing --- we need to
distinguish these from other constants: MAY_APPEND, RWF_APPEND,
ESCAPE_APPEND, STATX_ATTR_APPEND, BTRFS_INODE_APPEND.

But it's also a convention that dates back for ***decades*** and if we
want code to be understandable by kernel programmers, we need to obey
standard kernel naming conventions.  

> In Rust we have namespacing, so we generally drop common prefixes.

I don't know about Rust namespacing, but in other languages, how you
have to specify namespaces tends to be ***far*** more verbose than
just adding an O_ prefix.

Cheers,

					- Ted

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH 1/7] rust: file: add Rust abstraction for `struct file`
  2023-11-30 15:58         ` Theodore Ts'o
@ 2023-11-30 16:12           ` Benno Lossin
  2023-12-01  1:16             ` Theodore Ts'o
  2023-12-01 12:11             ` David Laight
  0 siblings, 2 replies; 96+ messages in thread
From: Benno Lossin @ 2023-11-30 16:12 UTC (permalink / raw)
  To: Theodore Ts'o
  Cc: Greg Kroah-Hartman, Alice Ryhl, Miguel Ojeda, Alex Gaynor,
	Wedson Almeida Filho, Boqun Feng, Gary Guo, Björn Roy Baron,
	Andreas Hindborg, Peter Zijlstra, Alexander Viro,
	Christian Brauner, Arve Hjønnevåg, Todd Kjos,
	Martijn Coenen, Joel Fernandes, Carlos Llamas,
	Suren Baghdasaryan, Dan Williams, Kees Cook, Matthew Wilcox,
	Thomas Gleixner, Daniel Xu, linux-kernel, rust-for-linux,
	linux-fsdevel

On 11/30/23 16:58, Theodore Ts'o wrote:
> On Thu, Nov 30, 2023 at 03:46:55PM +0000, Benno Lossin wrote:
>>>>> +    pub const O_APPEND: u32 = bindings::O_APPEND;
>>>>
>>>> Why do all of these constants begin with `O_`?
>>>
>>> Because that is how they are defined in the kernel in the C code.  Why
>>> would they not be the same here?
>>
>> Then why does the C side name them that way? Is it because `O_*` is
>> supposed to mean something, or is it done due to namespacing?
> 
> It's because these sets of constants were flags passed to the open(2)
> system call, and so they are dictated by the POSIX specification.  So
> O_ means that they are a set of integer values which are used by
> open(2), and they are defined when userspace #include's the fcntl.h
> header file.  One could consider it to be namespacing --- we need to
> distinguish these from other constants: MAY_APPEND, RWF_APPEND,
> ESCAPE_APPEND, STATX_ATTR_APPEND, BTRFS_INODE_APPEND.
> 
> But it's also a convention that dates back for ***decades*** and if we
> want code to be understandable by kernel programmers, we need to obey
> standard kernel naming conventions.

I see, that makes a lot of sense. Thanks for the explanation.

>> In Rust we have namespacing, so we generally drop common prefixes.
> 
> I don't know about Rust namespacing, but in other languages, how you
> have to specify namespaces tends to be ***far*** more verbose than
> just adding an O_ prefix.

In this case we already have the `flags` namespace, so I thought about
just dropping the `O_` prefix altogether.
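
To make the two spellings under discussion concrete, here is a small
standalone sketch. The module `flags_short` and the helper functions are
hypothetical, and the numeric value is only illustrative (it matches the
generic Linux definition of `O_APPEND`, but some architectures differ):

```rust
/// Naming as in the patch: the POSIX `O_` prefix is kept inside the module.
pub mod flags {
    /// File is opened in append mode (illustrative value, an assumption).
    pub const O_APPEND: u32 = 0o2000;
}

/// Hypothetical alternative: the module path carries the namespace instead.
pub mod flags_short {
    /// Same flag with the prefix dropped (illustrative value, an assumption).
    pub const APPEND: u32 = 0o2000;
}

/// Call site using the prefixed name, which greps the same as the C code.
pub fn is_append_prefixed(f_flags: u32) -> bool {
    f_flags & flags::O_APPEND != 0
}

/// Call site using the prefix-free name.
pub fn is_append_short(f_flags: u32) -> bool {
    f_flags & flags_short::APPEND != 0
}
```

Both call sites test the same bit; the difference is only whether the name a
reader sees matches the one documented in open(2).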

-- 
Cheers,
Benno

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH 2/7] rust: cred: add Rust abstraction for `struct cred`
  2023-11-29 12:51 ` [PATCH 2/7] rust: cred: add Rust abstraction for `struct cred` Alice Ryhl
@ 2023-11-30 16:17   ` Benno Lossin
  2023-12-01  9:06     ` Alice Ryhl
  0 siblings, 1 reply; 96+ messages in thread
From: Benno Lossin @ 2023-11-30 16:17 UTC (permalink / raw)
  To: Alice Ryhl
  Cc: Miguel Ojeda, Alex Gaynor, Wedson Almeida Filho, Boqun Feng,
	Gary Guo, Björn Roy Baron, Andreas Hindborg, Peter Zijlstra,
	Alexander Viro, Christian Brauner, Greg Kroah-Hartman,
	Arve Hjønnevåg, Todd Kjos, Martijn Coenen,
	Joel Fernandes, Carlos Llamas, Suren Baghdasaryan, Dan Williams,
	Kees Cook, Matthew Wilcox, Thomas Gleixner, Daniel Xu,
	linux-kernel, rust-for-linux, linux-fsdevel

On 11/29/23 13:51, Alice Ryhl wrote:
> +    /// Returns the credentials of the task that originally opened the file.
> +    pub fn cred(&self) -> &Credential {
> +        // This `read_volatile` is intended to correspond to a READ_ONCE call.
> +        //
> +        // SAFETY: The file is valid because the shared reference guarantees a nonzero refcount.
> +        //
> +        // TODO: Replace with `read_once` when available on the Rust side.
> +        let ptr = unsafe { core::ptr::addr_of!((*self.0.get()).f_cred).read_volatile() };
> +
> +        // SAFETY: The signature of this function ensures that the caller will only access the
> +        // returned credential while the file is still valid, and the credential must stay valid
> +        // while the file is valid.

About the last part of this safety comment, is this a guarantee from the
C side? If yes, then I would phrase it that way:

    ... while the file is still valid, and the C side ensures that the
    credentials stay valid while the file is valid.

-- 
Cheers,
Benno

> +        unsafe { Credential::from_ptr(ptr) }
> +    }
> +

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH 3/7] rust: security: add abstraction for secctx
  2023-11-29 13:11 ` [PATCH 3/7] rust: security: add abstraction for secctx Alice Ryhl
@ 2023-11-30 16:26   ` Benno Lossin
  2023-12-01 10:48     ` Alice Ryhl
  0 siblings, 1 reply; 96+ messages in thread
From: Benno Lossin @ 2023-11-30 16:26 UTC (permalink / raw)
  To: Alice Ryhl
  Cc: Miguel Ojeda, Alex Gaynor, Wedson Almeida Filho, Boqun Feng,
	Gary Guo, Björn Roy Baron, Andreas Hindborg, Peter Zijlstra,
	Alexander Viro, Christian Brauner, Greg Kroah-Hartman,
	Arve Hjønnevåg, Todd Kjos, Martijn Coenen,
	Joel Fernandes, Carlos Llamas, Suren Baghdasaryan, Dan Williams,
	Kees Cook, Matthew Wilcox, Thomas Gleixner, Daniel Xu,
	linux-kernel, rust-for-linux, linux-fsdevel

On 11/29/23 14:11, Alice Ryhl wrote:
> +/// A security context string.
> +///
> +/// The struct has the invariant that it always contains a valid security context.

Refactor to use the `# Invariants` section:

    # Invariants
    `secdata` points to a valid security context.

I also do not know what a "valid security context" is, so a link to the
definition wouldn't hurt.

> +pub struct SecurityCtx {
> +    secdata: *mut core::ffi::c_char,
> +    seclen: usize,
> +}
> +
> +impl SecurityCtx {
> +    /// Get the security context given its id.
> +    pub fn from_secid(secid: u32) -> Result<Self> {
> +        let mut secdata = core::ptr::null_mut();
> +        let mut seclen = 0;
> +        // SAFETY: Just a C FFI call. The pointers are valid for writes.
> +        unsafe {
> +            to_result(bindings::security_secid_to_secctx(
> +                secid,
> +                &mut secdata,
> +                &mut seclen,
> +            ))?;
> +        }
> +
> +        // If the above call did not fail, then we have a valid security
> +        // context, so the invariants are not violated.

Should be tagged `INVARIANT`.

> +        Ok(Self {
> +            secdata,
> +            seclen: usize::try_from(seclen).unwrap(),
> +        })
> +    }
> +
> +    /// Returns whether the security context is empty.
> +    pub fn is_empty(&self) -> bool {
> +        self.seclen == 0
> +    }
> +
> +    /// Returns the length of this security context.
> +    pub fn len(&self) -> usize {
> +        self.seclen
> +    }
> +
> +    /// Returns the bytes for this security context.
> +    pub fn as_bytes(&self) -> &[u8] {
> +        let mut ptr = self.secdata;
> +        if ptr.is_null() {
> +            // Many C APIs will use null pointers for strings of length zero, but

I would just write that the secctx API uses null pointers to denote a
string of length zero.

> +            // `slice::from_raw_parts` doesn't allow the pointer to be null even if the length is
> +            // zero. Replace the pointer with a dangling but non-null pointer in this case.
> +            debug_assert_eq!(self.seclen, 0);

I am feeling a bit uncomfortable with this, why can't we just return
an empty slice in this case?
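
A minimal standalone sketch of the early-return alternative asked about
here. The struct mirrors the fields quoted above, but the safety invariant
stated in the comment is an assumption made for this sketch, not something
taken from the patch:

```rust
use core::ffi::c_char;

/// Mirrors the fields of the `SecurityCtx` quoted above.
pub struct SecurityCtx {
    pub secdata: *mut c_char,
    pub seclen: usize,
}

impl SecurityCtx {
    /// Returns the bytes of the context, or an empty slice for a null pointer.
    pub fn as_bytes(&self) -> &[u8] {
        if self.secdata.is_null() {
            // A null pointer is only expected together with a zero length.
            debug_assert_eq!(self.seclen, 0);
            return &[];
        }
        // SAFETY (assumed invariant for this sketch): a non-null `secdata` is
        // valid for reads of `seclen` bytes for as long as `self` is alive.
        unsafe { core::slice::from_raw_parts(self.secdata.cast::<u8>(), self.seclen) }
    }
}
```

This avoids constructing a dangling pointer by hand, at the cost of an extra
return path.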

> +            ptr = core::ptr::NonNull::dangling().as_ptr();
> +        }
> +
> +        // SAFETY: The call to `security_secid_to_secctx` guarantees that the pointer is valid for
> +        // `seclen` bytes. Furthermore, if the length is zero, then we have ensured that the
> +        // pointer is not null.
> +        unsafe { core::slice::from_raw_parts(ptr.cast(), self.seclen) }
> +    }
> +}
> +
> +impl Drop for SecurityCtx {
> +    fn drop(&mut self) {
> +        // SAFETY: This frees a pointer that came from a successful call to
> +        // `security_secid_to_secctx`.

This should be part of the type invariant.

-- 
Cheers,
Benno

> +        unsafe {
> +            bindings::security_release_secctx(self.secdata, self.seclen as u32);
> +        }
> +    }
> +}
> --
> 2.43.0.rc1.413.gea7ed67945-goog
>

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH 4/7] rust: file: add `FileDescriptorReservation`
  2023-11-29 13:11 ` [PATCH 4/7] rust: file: add `FileDescriptorReservation` Alice Ryhl
  2023-11-29 16:14   ` Christian Brauner
@ 2023-11-30 16:40   ` Benno Lossin
  2023-12-01 11:32     ` Alice Ryhl
  1 sibling, 1 reply; 96+ messages in thread
From: Benno Lossin @ 2023-11-30 16:40 UTC (permalink / raw)
  To: Alice Ryhl
  Cc: Miguel Ojeda, Alex Gaynor, Wedson Almeida Filho, Boqun Feng,
	Gary Guo, Björn Roy Baron, Andreas Hindborg, Peter Zijlstra,
	Alexander Viro, Christian Brauner, Greg Kroah-Hartman,
	Arve Hjønnevåg, Todd Kjos, Martijn Coenen,
	Joel Fernandes, Carlos Llamas, Suren Baghdasaryan, Dan Williams,
	Kees Cook, Matthew Wilcox, Thomas Gleixner, Daniel Xu,
	linux-kernel, rust-for-linux, linux-fsdevel

On 11/29/23 14:11, Alice Ryhl wrote:
> +impl FileDescriptorReservation {
> +    /// Creates a new file descriptor reservation.
> +    pub fn new(flags: u32) -> Result<Self> {
> +        // SAFETY: FFI call, there are no safety requirements on `flags`.
> +        let fd: i32 = unsafe { bindings::get_unused_fd_flags(flags) };
> +        if fd < 0 {
> +            return Err(Error::from_errno(fd));
> +        }

I think here we could also use the modified `to_result` function that
returns a `u32` if the value is non-negative.
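
The exact signature of that modified `to_result` is not shown in this
message, so the following is only a sketch of the idea under assumed names:
`Errno` and `to_result_u32` are hypothetical and stand in for the kernel's
`Error` type and helper.

```rust
/// Hypothetical error type wrapping a negative errno value.
#[derive(Debug, PartialEq, Eq)]
pub struct Errno(pub i32);

/// Converts a C-style return value into a `Result`: negative values become
/// errors, non-negative values are returned as `u32`.
pub fn to_result_u32(ret: i32) -> Result<u32, Errno> {
    if ret < 0 {
        Err(Errno(ret))
    } else {
        // Lossless: `ret` is known to be non-negative here.
        Ok(ret as u32)
    }
}
```

With such a helper, the body of `new` could use `?` on the return value of
`get_unused_fd_flags` instead of the explicit `if fd < 0` check.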

> +        Ok(Self {
> +            fd: fd as _,
> +            _not_send_sync: PhantomData,
> +        })
> +    }
> +
> +    /// Returns the file descriptor number that was reserved.
> +    pub fn reserved_fd(&self) -> u32 {
> +        self.fd
> +    }
> +
> +    /// Commits the reservation.
> +    ///
> +    /// The previously reserved file descriptor is bound to `file`. This method consumes the
> +    /// [`FileDescriptorReservation`], so it will not be usable after this call.
> +    pub fn commit(self, file: ARef<File>) {
> +        // SAFETY: `self.fd` was previously returned by `get_unused_fd_flags`, and `file.ptr` is
> +        // guaranteed to have an owned ref count by its type invariants.
> +        unsafe { bindings::fd_install(self.fd, file.0.get()) };
> +
> +        // `fd_install` consumes both the file descriptor and the file reference, so we cannot run
> +        // the destructors.
> +        core::mem::forget(self);
> +        core::mem::forget(file);

Would be useful to have an `ARef::into_raw` function that would do
the `forget` for us.
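
One way such an `into_raw` could look, sketched on a toy type outside the
kernel. `Owned` is a hypothetical stand-in for `ARef<T>`, and the `DROPS`
counter only exists to make the skipped destructor observable:

```rust
use core::mem::ManuallyDrop;
use core::ptr::NonNull;
use core::sync::atomic::{AtomicUsize, Ordering};

/// Counts destructor runs so the effect of `into_raw` can be observed.
pub static DROPS: AtomicUsize = AtomicUsize::new(0);

/// Toy owning pointer; in the kernel, `Drop` would decrement the refcount.
pub struct Owned<T> {
    pub ptr: NonNull<T>,
}

impl<T> Owned<T> {
    /// Gives up ownership and returns the raw pointer without running `Drop`.
    pub fn into_raw(self) -> NonNull<T> {
        // `ManuallyDrop` suppresses the destructor; the pointer is `Copy`.
        ManuallyDrop::new(self).ptr
    }
}

impl<T> Drop for Owned<T> {
    fn drop(&mut self) {
        DROPS.fetch_add(1, Ordering::Relaxed);
    }
}
```

A `commit` written against this shape would pass `file.into_raw()` to
`fd_install` and would no longer need a separate `forget(file)`.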

-- 
Cheers,
Benno

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH 5/7] rust: file: add `Kuid` wrapper
  2023-11-29 13:12 ` [PATCH 5/7] rust: file: add `Kuid` wrapper Alice Ryhl
  2023-11-29 16:28   ` Christian Brauner
  2023-11-30 10:36   ` Peter Zijlstra
@ 2023-11-30 16:48   ` Benno Lossin
  2 siblings, 0 replies; 96+ messages in thread
From: Benno Lossin @ 2023-11-30 16:48 UTC (permalink / raw)
  To: Alice Ryhl
  Cc: Miguel Ojeda, Alex Gaynor, Wedson Almeida Filho, Boqun Feng,
	Gary Guo, Björn Roy Baron, Andreas Hindborg, Peter Zijlstra,
	Alexander Viro, Christian Brauner, Greg Kroah-Hartman,
	Arve Hjønnevåg, Todd Kjos, Martijn Coenen,
	Joel Fernandes, Carlos Llamas, Suren Baghdasaryan, Dan Williams,
	Kees Cook, Matthew Wilcox, Thomas Gleixner, Daniel Xu,
	linux-kernel, rust-for-linux, linux-fsdevel

On 11/29/23 14:12, Alice Ryhl wrote:
> +    /// Returns the given task's pid in the current pid namespace.
> +    pub fn pid_in_current_ns(&self) -> Pid {
> +        // SAFETY: We know that `self.0.get()` is valid by the type invariant. The rest is just FFI
> +        // calls.
> +        unsafe {
> +            let namespace = bindings::task_active_pid_ns(bindings::get_current());
> +            bindings::task_tgid_nr_ns(self.0.get(), namespace)
> +        }

I would split this into two `unsafe` blocks.

> +    }
> +
>      /// Wakes up the task.
>      pub fn wake_up(&self) {
>          // SAFETY: By the type invariant, we know that `self.0.get()` is non-null and valid.
> @@ -147,6 +180,42 @@ pub fn wake_up(&self) {
>      }
>  }
> 
> +impl Kuid {
> +    /// Get the current euid.
> +    pub fn current_euid() -> Kuid {
> +        // SAFETY: Just an FFI call.
> +        Self {
> +            kuid: unsafe { bindings::current_euid() },
> +        }

Would expect a call to `from_raw` here instead of `Self {}`.

> +    }
> +
> +    /// Create a `Kuid` given the raw C type.
> +    pub fn from_raw(kuid: bindings::kuid_t) -> Self {
> +        Self { kuid }
> +    }

Is there a reason that this is named `from_raw` and not just a normal
`From` impl? AFAICT any `bindings::kuid_t` is a valid `Kuid`.
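
For comparison, a standalone sketch of the `From`-based conversion. The
`kuid_t` definition here is a hypothetical stand-in for `bindings::kuid_t`
(assumed to be a plain struct around an integer):

```rust
/// Hypothetical stand-in for `bindings::kuid_t`.
#[allow(non_camel_case_types)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
#[repr(C)]
pub struct kuid_t {
    pub val: u32,
}

/// Wrapper type as in the patch.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Kuid {
    kuid: kuid_t,
}

impl From<kuid_t> for Kuid {
    /// Infallible, since every raw value is assumed to be a valid `Kuid`.
    fn from(kuid: kuid_t) -> Self {
        Self { kuid }
    }
}

impl From<Kuid> for kuid_t {
    fn from(kuid: Kuid) -> Self {
        kuid.kuid
    }
}
```

Callers would then write `Kuid::from(raw)` or `raw.into()` in place of
`Kuid::from_raw(raw)`.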

> +
> +    /// Turn this kuid into the raw C type.
> +    pub fn into_raw(self) -> bindings::kuid_t {
> +        self.kuid
> +    }
> +
> +    /// Converts this kernel UID into a UID that userspace understands. Uses the namespace of the
> +    /// current task.

Why not:

    /// Converts this kernel UID into a userspace UID.
    ///
    /// Uses the namespace of the current task.

-- 
Cheers,
Benno

> +    pub fn into_uid_in_current_ns(self) -> bindings::uid_t {
> +        // SAFETY: Just an FFI call.
> +        unsafe { bindings::from_kuid(bindings::current_user_ns(), self.kuid) }
> +    }
> +}

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH 6/7] rust: file: add `DeferredFdCloser`
  2023-11-29 13:12 ` [PATCH 6/7] rust: file: add `DeferredFdCloser` Alice Ryhl
@ 2023-11-30 17:12   ` Benno Lossin
  2023-12-01 11:35     ` Alice Ryhl
  0 siblings, 1 reply; 96+ messages in thread
From: Benno Lossin @ 2023-11-30 17:12 UTC (permalink / raw)
  To: Alice Ryhl
  Cc: Miguel Ojeda, Alex Gaynor, Wedson Almeida Filho, Boqun Feng,
	Gary Guo, Björn Roy Baron, Andreas Hindborg, Peter Zijlstra,
	Alexander Viro, Christian Brauner, Greg Kroah-Hartman,
	Arve Hjønnevåg, Todd Kjos, Martijn Coenen,
	Joel Fernandes, Carlos Llamas, Suren Baghdasaryan, Dan Williams,
	Kees Cook, Matthew Wilcox, Thomas Gleixner, Daniel Xu,
	linux-kernel, rust-for-linux, linux-fsdevel

On 11/29/23 14:12, Alice Ryhl wrote:
> +    /// Schedule a task work that closes the file descriptor when this task returns to userspace.
> +    pub fn close_fd(mut self, fd: u32) {
> +        use bindings::task_work_notify_mode_TWA_RESUME as TWA_RESUME;
> +
> +        let file = unsafe { bindings::close_fd_get_file(fd) };
> +        if file.is_null() {
> +            // Nothing further to do. The allocation is freed by the destructor of `self.inner`.
> +            return;
> +        }
> +
> +        self.inner.file = file;
> +
> +        // SAFETY: Since `DeferredFdCloserInner` is `#[repr(C)]`, casting the pointers gives a
> +        // pointer to the `twork` field.
> +        let inner = Box::into_raw(self.inner) as *mut bindings::callback_head;

Here you can just use `.cast::<...>()`.
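
A standalone sketch of the `.cast::<...>()` form on toy types. `CallbackHead`
and `Inner` are hypothetical stand-ins for `bindings::callback_head` and
`DeferredFdCloserInner`; the only property relied on is that `Inner` is
`#[repr(C)]` with the head as its first field:

```rust
/// Hypothetical stand-in for `bindings::callback_head`.
#[repr(C)]
pub struct CallbackHead {
    pub func: usize,
}

/// Hypothetical stand-in for `DeferredFdCloserInner`; `twork` comes first.
#[repr(C)]
pub struct Inner {
    pub twork: CallbackHead,
    pub file: usize,
}

/// Leaks the box and returns a pointer to its first field.
pub fn into_head(inner: Box<Inner>) -> *mut CallbackHead {
    // `cast` changes only the pointee type and cannot alter mutability by
    // accident, unlike an `as` cast.
    Box::into_raw(inner).cast::<CallbackHead>()
}

/// Reclaims the allocation from the head pointer.
///
/// # Safety
///
/// `head` must come from `into_head` and must not be used again afterwards.
pub unsafe fn from_head(head: *mut CallbackHead) -> Box<Inner> {
    // SAFETY: per the function contract, `head` points at the start of a
    // live `Box<Inner>` allocation that the caller is handing back.
    unsafe { Box::from_raw(head.cast::<Inner>()) }
}
```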

> +        // SAFETY: Getting a pointer to current is always safe.
> +        let current = unsafe { bindings::get_current() };
> +        // SAFETY: The `file` pointer points at a valid file.
> +        unsafe { bindings::get_file(file) };
> +        // SAFETY: Due to the above `get_file`, even if the current task holds an `fdget` to
> +        // this file right now, the refcount will not drop to zero until after it is released
> +        // with `fdput`. This is because when using `fdget`, you must always use `fdput` before
> +        // returning to userspace, and our task work runs after any `fdget` users have returned
> +        // to userspace.
> +        //
> +        // Note: fl_owner_t is currently a void pointer.
> +        unsafe { bindings::filp_close(file, (*current).files as bindings::fl_owner_t) };
> +        // SAFETY: The `inner` pointer is compatible with the `do_close_fd` method.
> +        unsafe { bindings::init_task_work(inner, Some(Self::do_close_fd)) };
> +        // SAFETY: The `inner` pointer points at a valid and fully initialized task work that is
> +        // ready to be scheduled.
> +        unsafe { bindings::task_work_add(current, inner, TWA_RESUME) };

I am a bit confused, when does `do_close_fd` actually run? Does
`TWA_RESUME` mean that `inner` is scheduled to run after the current
task has been completed?

> +    }
> +
> +    // SAFETY: This function is an implementation detail of `close_fd`, so its safety comments
> +    // should be read in extension of that method.
> +    unsafe extern "C" fn do_close_fd(inner: *mut bindings::callback_head) {
> +        // SAFETY: In `close_fd` we use this method together with a pointer that originates from a
> +        // `Box<DeferredFdCloserInner>`, and we have just been given ownership of that allocation.
> +        let inner = unsafe { Box::from_raw(inner as *mut DeferredFdCloserInner) };

In order for this call to be sound, `inner` must be an exclusive
pointer (including any possible references into the `callback_head`).
Is this the case?

-- 
Cheers,
Benno

> +        // SAFETY: This drops a refcount we acquired in `close_fd`. Since this callback runs in a
> +        // task work after we return to userspace, it is guaranteed that the current thread doesn't
> +        // hold this file with `fdget`, as `fdget` must be released before returning to userspace.
> +        unsafe { bindings::fput(inner.file) };
> +        // Free the allocation.
> +        drop(inner);
> +    }
> +}
> +
>  /// Represents the `EBADF` error code.
>  ///
>  /// Used for methods that can only fail with `EBADF`.
> 
> --
> 2.43.0.rc1.413.gea7ed67945-goog
>

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH 7/7] rust: file: add abstraction for `poll_table`
  2023-11-29 13:12 ` [PATCH 7/7] rust: file: add abstraction for `poll_table` Alice Ryhl
@ 2023-11-30 17:42   ` Benno Lossin
  2023-12-01 11:47     ` Alice Ryhl
  2023-11-30 22:39   ` Boqun Feng
  2023-11-30 22:50   ` Boqun Feng
  2 siblings, 1 reply; 96+ messages in thread
From: Benno Lossin @ 2023-11-30 17:42 UTC (permalink / raw)
  To: Alice Ryhl
  Cc: Miguel Ojeda, Alex Gaynor, Wedson Almeida Filho, Boqun Feng,
	Gary Guo, Björn Roy Baron, Andreas Hindborg, Peter Zijlstra,
	Alexander Viro, Christian Brauner, Greg Kroah-Hartman,
	Arve Hjønnevåg, Todd Kjos, Martijn Coenen,
	Joel Fernandes, Carlos Llamas, Suren Baghdasaryan, Dan Williams,
	Kees Cook, Matthew Wilcox, Thomas Gleixner, Daniel Xu,
	linux-kernel, rust-for-linux, linux-fsdevel

On 11/29/23 14:12, Alice Ryhl wrote:
> diff --git a/rust/kernel/file.rs b/rust/kernel/file.rs
> index 578ee307093f..35576678c993 100644
> --- a/rust/kernel/file.rs
> +++ b/rust/kernel/file.rs
> @@ -14,6 +14,9 @@
>  use alloc::boxed::Box;
>  use core::{alloc::AllocError, marker::PhantomData, mem, ptr};
> 
> +mod poll_table;
> +pub use self::poll_table::{PollCondVar, PollTable};

I think it makes more sense to put it under `rust/kernel/sync/`.
> +    fn get_qproc(&self) -> bindings::poll_queue_proc {
> +        let ptr = self.0.get();
> +        // SAFETY: The `ptr` is valid because it originates from a reference, and the `_qproc`
> +        // field is not modified concurrently with this call.

What ensures this? Maybe use a type invariant?

> +        unsafe { (*ptr)._qproc }
> +    }

[...]

> +impl PollCondVar {
> +    /// Constructs a new condvar initialiser.
> +    #[allow(clippy::new_ret_no_self)]

This is no longer needed, as Gary fixed this, see [1].

[1]: https://github.com/rust-lang/rust-clippy/issues/7344

> +    pub fn new(name: &'static CStr, key: &'static LockClassKey) -> impl PinInit<Self> {
> +        pin_init!(Self {
> +            inner <- CondVar::new(name, key),
> +        })
> +    }
> +}
> +
> +// Make the `CondVar` methods callable on `PollCondVar`.
> +impl Deref for PollCondVar {
> +    type Target = CondVar;
> +
> +    fn deref(&self) -> &CondVar {
> +        &self.inner
> +    }
> +}
> +
> +#[pinned_drop]
> +impl PinnedDrop for PollCondVar {
> +    fn drop(self: Pin<&mut Self>) {
> +        // Clear anything registered using `register_wait`.
> +        self.inner.notify(1, bindings::POLLHUP | bindings::POLLFREE);

Isn't notifying only a single thread problematic? A user could misuse
the `PollCondVar` (all functions of `CondVar` are also accessible) and
`.wait()` on the condvar. When dropping a `PollCondVar`, the drop might
then notify only the thread in `.wait()`, but not the `PollTable`. Or
am I missing something?

-- 
Cheers,
Benno

> +        // Wait for epoll items to be properly removed.
> +        //
> +        // SAFETY: Just an FFI call.
> +        unsafe { bindings::synchronize_rcu() };
> +    }
> +}


* Re: [PATCH 7/7] rust: file: add abstraction for `poll_table`
  2023-11-29 13:12 ` [PATCH 7/7] rust: file: add abstraction for `poll_table` Alice Ryhl
  2023-11-30 17:42   ` Benno Lossin
@ 2023-11-30 22:39   ` Boqun Feng
  2023-12-01 11:50     ` Alice Ryhl
  2023-11-30 22:50   ` Boqun Feng
  2 siblings, 1 reply; 96+ messages in thread
From: Boqun Feng @ 2023-11-30 22:39 UTC (permalink / raw)
  To: Alice Ryhl
  Cc: Miguel Ojeda, Alex Gaynor, Wedson Almeida Filho, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg,
	Peter Zijlstra, Alexander Viro, Christian Brauner,
	Greg Kroah-Hartman, Arve Hjønnevåg, Todd Kjos,
	Martijn Coenen, Joel Fernandes, Carlos Llamas,
	Suren Baghdasaryan, Dan Williams, Kees Cook, Matthew Wilcox,
	Thomas Gleixner, Daniel Xu, linux-kernel, rust-for-linux,
	linux-fsdevel

On Wed, Nov 29, 2023 at 01:12:51PM +0000, Alice Ryhl wrote:
> The existing `CondVar` abstraction is a wrapper around `wait_list`, but
> it does not support all use-cases of the C `wait_list` type. To be
> specific, a `CondVar` cannot be registered with a `struct poll_table`.
> This limitation has the advantage that you do not need to call
> `synchronize_rcu` when destroying a `CondVar`.
> 
> However, we need the ability to register a `poll_table` with a
> `wait_list` in Rust Binder. To enable this, introduce a type called
> `PollCondVar`, which is like `CondVar` except that you can register a
> `poll_table`. We also introduce `PollTable`, which is a safe wrapper
> around `poll_table` that is intended to be used with `PollCondVar`.
> 
> The destructor of `PollCondVar` unconditionally calls `synchronize_rcu`
> to ensure that the removal of epoll waiters has fully completed before
> the `wait_list` is destroyed.
> 
> Signed-off-by: Alice Ryhl <aliceryhl@google.com>
> ---
> That said, `synchronize_rcu` is rather expensive and is not needed in
> all cases: If we have never registered a `poll_table` with the
> `wait_list`, then we don't need to call `synchronize_rcu`. (And this is
> a common case in Binder - not all processes use Binder with epoll.) The
> current implementation does not account for this, but we could change it
> to store a boolean next to the `wait_list` to keep track of whether a
> `poll_table` has ever been registered. It is up to discussion whether
> this is desirable.
> 
> It is not clear to me whether we can implement the above without storing
> an extra boolean. We could check whether the `wait_list` is empty, but
> it is not clear that this is sufficient. Perhaps someone knows the
> answer? If a `poll_table` has previously been registered with a

That won't be sufficient, considering this:

    CPU 0                           CPU 1
                                    ep_remove_wait_queue():
                                      whead = smp_load_acquire(&pwq->whead); // whead is not NULL
    PollCondVar::drop():
      self.inner.notify():
        <for each wait entry in the list>
          ep_poll_callback():
            <remove wait entry>
            smp_store_release(&ep_pwq_from_wait(wait)->whead, NULL);
      <lock the waitqueue>
      waitqueue_active() // returns false, since the queue is empty
      <unlock>
    ...
    <free the waitqueue>
                                      if (whead) {
                                        remove_wait_queue(whead, &pwq->wait); // Use-after-free BOOM!
                                      }
      

Note that moving the `wait_list` emptiness check before
`self.inner.notify()` won't change the result: users might already have
called `notify` before `PollCondVar::drop()`, which leads to the same
race.
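
For comparison, the other option raised in the quoted text (storing a
boolean next to the `wait_list`) could be modelled roughly as below. This
is only a userspace sketch: `FlaggedPollCondVar`, `register_wait` and
`synchronize_rcu_stub` are hypothetical names, and the stub merely counts
calls instead of waiting for a grace period.

```rust
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};

// Counts how often the simulated grace-period wait runs (illustration only).
static GRACE_PERIOD_WAITS: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical stand-in for `bindings::synchronize_rcu()`.
fn synchronize_rcu_stub() {
    GRACE_PERIOD_WAITS.fetch_add(1, Ordering::SeqCst);
}

/// Hypothetical model of a condvar that remembers whether a poll table
/// was ever registered with it.
struct FlaggedPollCondVar {
    ever_registered: AtomicBool,
}

impl FlaggedPollCondVar {
    fn new() -> Self {
        Self { ever_registered: AtomicBool::new(false) }
    }

    /// Marks the condvar as registered. The flag is sticky and must be
    /// set *before* the real registration would take place.
    fn register_wait(&self) {
        self.ever_registered.store(true, Ordering::Release);
    }
}

impl Drop for FlaggedPollCondVar {
    fn drop(&mut self) {
        // `&mut self` gives exclusive access, so a plain read suffices.
        if *self.ever_registered.get_mut() {
            synchronize_rcu_stub();
        }
    }
}
```

Because the flag never goes back to `false`, it does not depend on the
current contents of the wait list, which is what the race above exploits.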

Regards,
Boqun

> `wait_list`, is it the case that we can kfree the `wait_list` after
> observing that the `wait_list` is empty without waiting for an rcu grace
> period?
> 
[...]


* Re: [PATCH 7/7] rust: file: add abstraction for `poll_table`
  2023-11-29 13:12 ` [PATCH 7/7] rust: file: add abstraction for `poll_table` Alice Ryhl
  2023-11-30 17:42   ` Benno Lossin
  2023-11-30 22:39   ` Boqun Feng
@ 2023-11-30 22:50   ` Boqun Feng
  2 siblings, 0 replies; 96+ messages in thread
From: Boqun Feng @ 2023-11-30 22:50 UTC (permalink / raw)
  To: Alice Ryhl
  Cc: Miguel Ojeda, Alex Gaynor, Wedson Almeida Filho, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg,
	Peter Zijlstra, Alexander Viro, Christian Brauner,
	Greg Kroah-Hartman, Arve Hjønnevåg, Todd Kjos,
	Martijn Coenen, Joel Fernandes, Carlos Llamas,
	Suren Baghdasaryan, Dan Williams, Kees Cook, Matthew Wilcox,
	Thomas Gleixner, Daniel Xu, linux-kernel, rust-for-linux,
	linux-fsdevel

On Wed, Nov 29, 2023 at 01:12:51PM +0000, Alice Ryhl wrote:
[...]
> +// Make the `CondVar` methods callable on `PollCondVar`.
> +impl Deref for PollCondVar {
> +    type Target = CondVar;
> +
> +    fn deref(&self) -> &CondVar {
> +        &self.inner
> +    }
> +}

I generally think we should avoid using Deref for the "subclass pattern"
due to the potential confusion for readers of the code (at the deref()
usage). Would it be possible to start with `impl AsRef<CondVar>`?
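
Roughly what I have in mind, as a sketch with a hypothetical stand-in for
`CondVar` (the `notified` counter and `wake` helper exist only for
illustration):

```rust
use std::cell::Cell;

/// Hypothetical stand-in for the kernel's `CondVar`.
struct CondVar {
    notified: Cell<u32>,
}

impl CondVar {
    fn notify_all(&self) {
        self.notified.set(self.notified.get() + 1);
    }
}

struct PollCondVar {
    inner: CondVar,
}

// Explicit conversion instead of `Deref`: callers have to write
// `.as_ref()` to reach the `CondVar` methods.
impl AsRef<CondVar> for PollCondVar {
    fn as_ref(&self) -> &CondVar {
        &self.inner
    }
}

/// Works with anything that can be viewed as a `CondVar`.
fn wake(cv: &impl AsRef<CondVar>) {
    cv.as_ref().notify_all();
}
```

With this, the conversion is visible at every call site instead of
happening implicitly through auto-deref.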

Thanks!

Regards,
Boqun


* Re: [PATCH 1/7] rust: file: add Rust abstraction for `struct file`
  2023-11-30 16:12           ` Benno Lossin
@ 2023-12-01  1:16             ` Theodore Ts'o
  2023-12-01 12:11             ` David Laight
  1 sibling, 0 replies; 96+ messages in thread
From: Theodore Ts'o @ 2023-12-01  1:16 UTC (permalink / raw)
  To: Benno Lossin
  Cc: Greg Kroah-Hartman, Alice Ryhl, Miguel Ojeda, Alex Gaynor,
	Wedson Almeida Filho, Boqun Feng, Gary Guo, Björn Roy Baron,
	Andreas Hindborg, Peter Zijlstra, Alexander Viro,
	Christian Brauner, Arve Hjønnevåg, Todd Kjos,
	Martijn Coenen, Joel Fernandes, Carlos Llamas,
	Suren Baghdasaryan, Dan Williams, Kees Cook, Matthew Wilcox,
	Thomas Gleixner, Daniel Xu, linux-kernel, rust-for-linux,
	linux-fsdevel

On Thu, Nov 30, 2023 at 04:12:14PM +0000, Benno Lossin wrote:
> > I don't know about Rust namespacing, but in other languages, how you
> > have to specify namespaces tend to be ***far*** more verbose than
> > just adding an O_ prefix.
> 
> In this case we already have the `flags` namespace, so I thought about
> just dropping the `O_` prefix altogether.

Note that in C code, the flags are known to be an integer, and there
are times when we assume that it's possible to take the bitfield and
then either (a) or in bitfields from some other "namespace", because
it's known that the open flags only use a certain number of the low
bits of the integer, or (b) rely on O_RDONLY, O_WRONLY, and O_RDWR
being 0, 1, and 2, respectively, so that you can do something like
((flags & 0x03) + 1) such that 1 means "read access", 2 means "write
access", and 3 (1|2) is read and write.
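
A small sketch of that trick, written in Rust so it can be compared with a
typed flags API. The constant values are the ones given above; the
`MAY_*_BIT` names and `access_bits` are made up for this illustration.

```rust
// Access-mode values as described above.
const O_RDONLY: u32 = 0;
const O_WRONLY: u32 = 1;
const O_RDWR: u32 = 2;
// Mask covering the two low bits that hold the access mode.
const O_ACCMODE: u32 = 0x03;

// Hypothetical names for the resulting bit positions.
const MAY_READ_BIT: u32 = 1;
const MAY_WRITE_BIT: u32 = 2;

/// Maps the access mode to a bitmask: 1 = read, 2 = write, 3 = both.
fn access_bits(flags: u32) -> u32 {
    (flags & O_ACCMODE) + 1
}

/// True if the flags permit reading.
fn can_read(flags: u32) -> bool {
    access_bits(flags) & MAY_READ_BIT != 0
}

/// True if the flags permit writing.
fn can_write(flags: u32) -> bool {
    access_bits(flags) & MAY_WRITE_BIT != 0
}
```

Any typed wrapper that hides the integer representation would need to
keep an escape hatch for code relying on this kind of arithmetic.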

This may make a programmer used to a type-strict language feel a
little dirty, but again, this is a convention going back decades,
back when a PDP-11 had only 32k of words in its address space....

Cheers,


     	    	       	    	- Ted


* Re: [PATCH 1/7] rust: file: add Rust abstraction for `struct file`
  2023-11-30 15:25           ` Boqun Feng
@ 2023-12-01  8:53             ` Peter Zijlstra
  2023-12-01  9:19               ` Boqun Feng
  2023-12-01  9:00             ` Peter Zijlstra
  1 sibling, 1 reply; 96+ messages in thread
From: Peter Zijlstra @ 2023-12-01  8:53 UTC (permalink / raw)
  To: Boqun Feng
  Cc: Matthew Wilcox, Alice Ryhl, Miguel Ojeda, Alex Gaynor,
	Wedson Almeida Filho, Gary Guo, Björn Roy Baron,
	Benno Lossin, Andreas Hindborg, Alexander Viro,
	Christian Brauner, Greg Kroah-Hartman, Arve Hjønnevåg,
	Todd Kjos, Martijn Coenen, Joel Fernandes, Carlos Llamas,
	Suren Baghdasaryan, Dan Williams, Kees Cook, Thomas Gleixner,
	Daniel Xu, linux-kernel, rust-for-linux, linux-fsdevel,
	Josh Triplett

On Thu, Nov 30, 2023 at 07:25:01AM -0800, Boqun Feng wrote:
> On Thu, Nov 30, 2023 at 11:42:26AM +0100, Peter Zijlstra wrote:
> > On Wed, Nov 29, 2023 at 09:08:14AM -0800, Boqun Feng wrote:
> > 
> > > But but but, I then realized we have asm goto in C but Rust doesn't
> > > support them, and I haven't thought through how hard that would be..
> > 
> > You're kidding right?
> > 
> 
> I'm not, but I've found this:
> 
> 	https://github.com/Amanieu/rfcs/blob/inline-asm/text/0000-inline-asm.md#asm-goto

Reading that makes all this even worse, apparently rust can't even use
memops.

So to summarise, Rust cannot properly interop with C, it cannot do
inline asm from this side of the millennium. Why are we even trying to
use it again?


* Re: [PATCH 1/7] rust: file: add Rust abstraction for `struct file`
  2023-11-30 15:25           ` Boqun Feng
  2023-12-01  8:53             ` Peter Zijlstra
@ 2023-12-01  9:00             ` Peter Zijlstra
  2023-12-01  9:52               ` Boqun Feng
  1 sibling, 1 reply; 96+ messages in thread
From: Peter Zijlstra @ 2023-12-01  9:00 UTC (permalink / raw)
  To: Boqun Feng
  Cc: Matthew Wilcox, Alice Ryhl, Miguel Ojeda, Alex Gaynor,
	Wedson Almeida Filho, Gary Guo, Björn Roy Baron,
	Benno Lossin, Andreas Hindborg, Alexander Viro,
	Christian Brauner, Greg Kroah-Hartman, Arve Hjønnevåg,
	Todd Kjos, Martijn Coenen, Joel Fernandes, Carlos Llamas,
	Suren Baghdasaryan, Dan Williams, Kees Cook, Thomas Gleixner,
	Daniel Xu, linux-kernel, rust-for-linux, linux-fsdevel,
	Josh Triplett

On Thu, Nov 30, 2023 at 07:25:01AM -0800, Boqun Feng wrote:

> seems to me, the plan for this is something like below:
> 
> 	asm!(
> 		"cmp {}, 42",
> 		"jeq {}",
> 		in(reg) val,
> 		label { println!("a"); },
> 		fallthrough { println!("b"); }
>     	);

Because rust has horrible syntax I can't parse, I can't tell if this is
useful or not :/ Can this be used to implement arch_static_branch*() ?


* Re: [PATCH 2/7] rust: cred: add Rust abstraction for `struct cred`
  2023-11-30 16:17   ` Benno Lossin
@ 2023-12-01  9:06     ` Alice Ryhl
  2023-12-01 10:27       ` Christian Brauner
  0 siblings, 1 reply; 96+ messages in thread
From: Alice Ryhl @ 2023-12-01  9:06 UTC (permalink / raw)
  To: benno.lossin, brauner
  Cc: a.hindborg, alex.gaynor, aliceryhl, arve, bjorn3_gh, boqun.feng,
	cmllamas, dan.j.williams, dxu, gary, gregkh, joel, keescook,
	linux-fsdevel, linux-kernel, maco, ojeda, peterz, rust-for-linux,
	surenb, tglx, tkjos, viro, wedsonaf, willy

Benno Lossin <benno.lossin@proton.me> writes:
> On 11/29/23 13:51, Alice Ryhl wrote:
>> +    /// Returns the credentials of the task that originally opened the file.
>> +    pub fn cred(&self) -> &Credential {
>> +        // This `read_volatile` is intended to correspond to a READ_ONCE call.
>> +        //
>> +        // SAFETY: The file is valid because the shared reference guarantees a nonzero refcount.
>> +        //
>> +        // TODO: Replace with `read_once` when available on the Rust side.
>> +        let ptr = unsafe { core::ptr::addr_of!((*self.0.get()).f_cred).read_volatile() };
>> +
>> +        // SAFETY: The signature of this function ensures that the caller will only access the
>> +        // returned credential while the file is still valid, and the credential must stay valid
>> +        // while the file is valid.
> 
> About the last part of this safety comment, is this a guarantee from the
> C side? If yes, then I would phrase it that way:
> 
>     ... while the file is still valid, and the C side ensures that the
>     credentials stay valid while the file is valid.

Yes, that's my intention with this code.

But I guess this is a good question for Christian Brauner to confirm:

If I read the credential from the `f_cred` field, is it guaranteed that
the pointer remains valid for at least as long as the file?

Or should I do some dance along the lines of "lock file, increment
refcount on credential, unlock file"?

Alice


* Re: [PATCH 1/7] rust: file: add Rust abstraction for `struct file`
  2023-12-01  8:53             ` Peter Zijlstra
@ 2023-12-01  9:19               ` Boqun Feng
  2023-12-01  9:40                 ` Peter Zijlstra
  0 siblings, 1 reply; 96+ messages in thread
From: Boqun Feng @ 2023-12-01  9:19 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Matthew Wilcox, Alice Ryhl, Miguel Ojeda, Alex Gaynor,
	Wedson Almeida Filho, Gary Guo, Björn Roy Baron,
	Benno Lossin, Andreas Hindborg, Alexander Viro,
	Christian Brauner, Greg Kroah-Hartman, Arve Hjønnevåg,
	Todd Kjos, Martijn Coenen, Joel Fernandes, Carlos Llamas,
	Suren Baghdasaryan, Dan Williams, Kees Cook, Thomas Gleixner,
	Daniel Xu, linux-kernel, rust-for-linux, linux-fsdevel,
	Josh Triplett

On Fri, Dec 01, 2023 at 09:53:28AM +0100, Peter Zijlstra wrote:
> On Thu, Nov 30, 2023 at 07:25:01AM -0800, Boqun Feng wrote:
> > On Thu, Nov 30, 2023 at 11:42:26AM +0100, Peter Zijlstra wrote:
> > > On Wed, Nov 29, 2023 at 09:08:14AM -0800, Boqun Feng wrote:
> > > 
> > > > But but but, I then realized we have asm goto in C but Rust doesn't
> > > > support them, and I haven't thought through how hard that would be..
> > > 
> > > You're kidding right?
> > > 
> > 
> > I'm not, but I've found this:
> > 
> > 	https://github.com/Amanieu/rfcs/blob/inline-asm/text/0000-inline-asm.md#asm-goto
> 
> Reading that makes all this even worse, apparently rust can't even use
> memops.

What do you mean by "memops"?

Regards,
Boqun

> 
> So to summarise, Rust cannot properly interop with C, it cannot do
> inline asm from this side of the millennium. Why are we even trying to
> use it again?


* Re: [PATCH 1/7] rust: file: add Rust abstraction for `struct file`
  2023-12-01  9:19               ` Boqun Feng
@ 2023-12-01  9:40                 ` Peter Zijlstra
  2023-12-01 10:36                   ` Boqun Feng
  0 siblings, 1 reply; 96+ messages in thread
From: Peter Zijlstra @ 2023-12-01  9:40 UTC (permalink / raw)
  To: Boqun Feng
  Cc: Matthew Wilcox, Alice Ryhl, Miguel Ojeda, Alex Gaynor,
	Wedson Almeida Filho, Gary Guo, Björn Roy Baron,
	Benno Lossin, Andreas Hindborg, Alexander Viro,
	Christian Brauner, Greg Kroah-Hartman, Arve Hjønnevåg,
	Todd Kjos, Martijn Coenen, Joel Fernandes, Carlos Llamas,
	Suren Baghdasaryan, Dan Williams, Kees Cook, Thomas Gleixner,
	Daniel Xu, linux-kernel, rust-for-linux, linux-fsdevel,
	Josh Triplett

On Fri, Dec 01, 2023 at 01:19:14AM -0800, Boqun Feng wrote:

> > > 	https://github.com/Amanieu/rfcs/blob/inline-asm/text/0000-inline-asm.md#asm-goto
> > 
> > Reading that makes all this even worse, apparently rust can't even use
> > memops.
> 
> What do you mean by "memops"?

Above link has the below in "future possibilities":

"Memory operands

We could support mem as an alternative to specifying a register class
which would leave the operand in memory and instead produce a memory
address when inserted into the asm string. This would allow generating
more efficient code by taking advantage of addressing modes instead of
using an intermediate register to hold the computed address."

Just so happens that every x86 atomic block uses memops.. and per-cpu
and ...




* Re: [PATCH 1/7] rust: file: add Rust abstraction for `struct file`
  2023-12-01  9:00             ` Peter Zijlstra
@ 2023-12-01  9:52               ` Boqun Feng
  0 siblings, 0 replies; 96+ messages in thread
From: Boqun Feng @ 2023-12-01  9:52 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Matthew Wilcox, Alice Ryhl, Miguel Ojeda, Alex Gaynor,
	Wedson Almeida Filho, Gary Guo, Björn Roy Baron,
	Benno Lossin, Andreas Hindborg, Alexander Viro,
	Christian Brauner, Greg Kroah-Hartman, Arve Hjønnevåg,
	Todd Kjos, Martijn Coenen, Joel Fernandes, Carlos Llamas,
	Suren Baghdasaryan, Dan Williams, Kees Cook, Thomas Gleixner,
	Daniel Xu, linux-kernel, rust-for-linux, linux-fsdevel,
	Josh Triplett

On Fri, Dec 01, 2023 at 10:00:39AM +0100, Peter Zijlstra wrote:
> On Thu, Nov 30, 2023 at 07:25:01AM -0800, Boqun Feng wrote:
> 
> > seems to me, the plan for this is something like below:
> > 
> > 	asm!(
> > 		"cmp {}, 42",
> > 		"jeq {}",
> > 		in(reg) val,
> > 		label { println!("a"); },
> > 		fallthrough { println!("b"); }
> >     	);
> 
> Because rust has horrible syntax I can't parse, I can't tell if this is
> useful or not :/ Can this be used to implement arch_static_branch*() ?

I should think so:

	asm!("jmp {l_yes}", // jump to l_yes
	     "..."          // directives are supported
	     l_yes { return true; } // label "l_yes"
	     fallthrough { return false; } // otherwise return false
	)

Rust uses the LLVM backend, so its inline asm should have the same
capabilities as clang's.

But as I said, AFAIK jumping to label hasn't been implemented yet.

Regards,
Boqun


* Re: [PATCH 2/7] rust: cred: add Rust abstraction for `struct cred`
  2023-12-01  9:06     ` Alice Ryhl
@ 2023-12-01 10:27       ` Christian Brauner
  2023-12-04 15:42         ` Alice Ryhl
  0 siblings, 1 reply; 96+ messages in thread
From: Christian Brauner @ 2023-12-01 10:27 UTC (permalink / raw)
  To: Alice Ryhl
  Cc: benno.lossin, a.hindborg, alex.gaynor, arve, bjorn3_gh,
	boqun.feng, cmllamas, dan.j.williams, dxu, gary, gregkh, joel,
	keescook, linux-fsdevel, linux-kernel, maco, ojeda, peterz,
	rust-for-linux, surenb, tglx, tkjos, viro, wedsonaf, willy

On Fri, Dec 01, 2023 at 09:06:35AM +0000, Alice Ryhl wrote:
> Benno Lossin <benno.lossin@proton.me> writes:
> > On 11/29/23 13:51, Alice Ryhl wrote:
> >> +    /// Returns the credentials of the task that originally opened the file.
> >> +    pub fn cred(&self) -> &Credential {
> >> +        // This `read_volatile` is intended to correspond to a READ_ONCE call.
> >> +        //
> >> +        // SAFETY: The file is valid because the shared reference guarantees a nonzero refcount.
> >> +        //
> >> +        // TODO: Replace with `read_once` when available on the Rust side.
> >> +        let ptr = unsafe { core::ptr::addr_of!((*self.0.get()).f_cred).read_volatile() };
> >> +
> >> +        // SAFETY: The signature of this function ensures that the caller will only access the
> >> +        // returned credential while the file is still valid, and the credential must stay valid
> >> +        // while the file is valid.
> > 
> > About the last part of this safety comment, is this a guarantee from the
> > C side? If yes, then I would phrase it that way:
> > 
> >     ... while the file is still valid, and the C side ensures that the
> >     credentials stay valid while the file is valid.
> 
> Yes, that's my intention with this code.
> 
> But I guess this is a good question for Christian Brauner to confirm:
> 
> If I read the credential from the `f_cred` field, is it guaranteed that
> the pointer remains valid for at least as long as the file?
> 
> Or should I do some dance along the lines of "lock file, increment
> refcount on credential, unlock file"?

The lifetime of the f_cred reference is at least as long as the lifetime
of the file:

// file not yet visible anywhere
some_file = alloc_file*()
-> init_file()
   {
           file->f_cred = get_cred(cred /* usually current_cred() */)
   }


// install into fd_table -> irreversible, thing visible, possibly shared
fd_install(1234, some_file)

// last fput
fput()
// atomic_dec_and_test() dance:
-> file_free() // either "delayed" through task work, workqueue, or
	       // sometimes freed right away if file hasn't been opened,
	       // i.e., if fd_install() wasn't called
   -> put_cred(file->f_cred)

In order to access anything you must hold a reference to the file or
files->file_lock. IOW, no poking around in f->f_cred or any field for
that matter just under rcu_read_lock() for example. Because files are
SLAB_TYPESAFE_BY_RCU. You might be poking in someone else's creds then.
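
Seen from the Rust side, this is the property that the `&Credential`
return type relies on. A userspace model of it, with `Arc` standing in for
get_cred()/put_cred() and all type names being illustrative rather than
the real kernel types:

```rust
use std::sync::Arc;

/// Illustrative stand-in for `struct cred`.
struct Credential {
    uid: u32,
}

/// Illustrative stand-in for `struct file`.
struct File {
    // Reference taken at init time (like `get_cred()` in `init_file()`)
    // and released only when the file itself is freed.
    f_cred: Arc<Credential>,
}

impl File {
    fn new(cred: &Arc<Credential>) -> Self {
        File { f_cred: Arc::clone(cred) }
    }

    /// The returned borrow cannot outlive `self`, and `self` keeps the
    /// credential alive, so no extra refcount increment is needed.
    fn cred(&self) -> &Credential {
        &self.f_cred
    }
}
```

The borrow checker enforces the "hold a reference to the file" rule here;
it says nothing about the rcu-only access case, which stays forbidden.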


* Re: [PATCH 1/7] rust: file: add Rust abstraction for `struct file`
  2023-12-01  9:40                 ` Peter Zijlstra
@ 2023-12-01 10:36                   ` Boqun Feng
  2023-12-01 11:05                     ` Peter Zijlstra
  0 siblings, 1 reply; 96+ messages in thread
From: Boqun Feng @ 2023-12-01 10:36 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Matthew Wilcox, Alice Ryhl, Miguel Ojeda, Alex Gaynor,
	Wedson Almeida Filho, Gary Guo, Björn Roy Baron,
	Benno Lossin, Andreas Hindborg, Alexander Viro,
	Christian Brauner, Greg Kroah-Hartman, Arve Hjønnevåg,
	Todd Kjos, Martijn Coenen, Joel Fernandes, Carlos Llamas,
	Suren Baghdasaryan, Dan Williams, Kees Cook, Thomas Gleixner,
	Daniel Xu, linux-kernel, rust-for-linux, linux-fsdevel,
	Josh Triplett

On Fri, Dec 01, 2023 at 10:40:37AM +0100, Peter Zijlstra wrote:
> On Fri, Dec 01, 2023 at 01:19:14AM -0800, Boqun Feng wrote:
> 
> > > > 	https://github.com/Amanieu/rfcs/blob/inline-asm/text/0000-inline-asm.md#asm-goto
> > > 
> > > Reading that makes all this even worse, apparently rust can't even use
> > > memops.
> > 
> > What do you mean by "memops"?
> 
> Above link has the below in "future possibilities":
> 
> "Memory operands
> 
> We could support mem as an alternative to specifying a register class
> which would leave the operand in memory and instead produce a memory
> address when inserted into the asm string. This would allow generating
> more efficient code by taking advantage of addressing modes instead of
> using an intermediate register to hold the computed address."
> 
> Just so happens that every x86 atomic block uses memops.. and per-cpu
> and ...
> 

Oh yes, I also recently found out that Rust's asm! doesn't support
specifying a memory location as input or output.


I don't speak for the Rust language community, but I think this is
something that they should improve. I understand it could be frustrating
that we find out the new stuff doesn't support good old tools we use
(trust me, I do!), but I believe you also understand that a higher level
language can help in some places, for example, SBRM is naturally
supported ;-) This answers half of the question: "Why are we even trying
to use it again?".

The other half is that the way languages are designed is different these days:
a language community may do a better job on listening to the users and
the real use cases can affect the language design in return. While we
are doing our own experiment, we might well give that a shot too.

And at least the document admits these are "future possibilities", so
they should be more motivated to implement these.

It's never perfect, but we gotta start somewhere.

Regards,
Boqun

> 


* Re: [PATCH 3/7] rust: security: add abstraction for secctx
  2023-11-30 16:26   ` Benno Lossin
@ 2023-12-01 10:48     ` Alice Ryhl
  2023-12-02 10:03       ` Benno Lossin
  0 siblings, 1 reply; 96+ messages in thread
From: Alice Ryhl @ 2023-12-01 10:48 UTC (permalink / raw)
  To: benno.lossin
  Cc: a.hindborg, alex.gaynor, aliceryhl, arve, bjorn3_gh, boqun.feng,
	brauner, cmllamas, dan.j.williams, dxu, gary, gregkh, joel,
	keescook, linux-fsdevel, linux-kernel, maco, ojeda, peterz,
	rust-for-linux, surenb, tglx, tkjos, viro, wedsonaf, willy

Benno Lossin <benno.lossin@proton.me> writes:
> On 11/29/23 14:11, Alice Ryhl wrote:
>> +    /// Returns the bytes for this security context.
>> +    pub fn as_bytes(&self) -> &[u8] {
>> +        let mut ptr = self.secdata;
>> +        if ptr.is_null() {
>> +            // Many C APIs will use null pointers for strings of length zero, but
> 
> I would just write that the secctx API uses null pointers to denote a
> string of length zero.

I don't actually know whether it can ever be null, I just wanted to stay
on the safe side.

>> +            // `slice::from_raw_parts` doesn't allow the pointer to be null even if the length is
>> +            // zero. Replace the pointer with a dangling but non-null pointer in this case.
>> +            debug_assert_eq!(self.seclen, 0);
> 
> I am feeling a bit uncomfortable with this, why can't we just return
> an empty slice in this case?

I can do that, but to be clear, what I'm doing here is also definitely
okay.
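
For reference, the early-return variant would look roughly like this. It
is a standalone sketch: `SecurityCtx` here is a simplified stand-in with
only the two fields involved, and the invariant in the safety comment is
an assumption of the sketch.

```rust
use core::slice;

/// Simplified stand-in holding only the pointer and length.
struct SecurityCtx {
    secdata: *const u8,
    seclen: usize,
}

impl SecurityCtx {
    /// Returns the bytes for this security context.
    fn as_bytes(&self) -> &[u8] {
        if self.secdata.is_null() {
            // A null pointer is only expected together with length zero.
            debug_assert_eq!(self.seclen, 0);
            // Return an empty slice instead of building one from a
            // dangling pointer.
            return &[];
        }
        // SAFETY: Assumed invariant of this sketch: a non-null `secdata`
        // points at `seclen` readable bytes that live as long as `self`.
        unsafe { slice::from_raw_parts(self.secdata, self.seclen) }
    }
}
```

Both variants behave the same for callers; this one just avoids the
dangling-pointer reasoning in the safety comment.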

Alice


* Re: [PATCH 1/7] rust: file: add Rust abstraction for `struct file`
  2023-12-01 10:36                   ` Boqun Feng
@ 2023-12-01 11:05                     ` Peter Zijlstra
  0 siblings, 0 replies; 96+ messages in thread
From: Peter Zijlstra @ 2023-12-01 11:05 UTC (permalink / raw)
  To: Boqun Feng
  Cc: Matthew Wilcox, Alice Ryhl, Miguel Ojeda, Alex Gaynor,
	Wedson Almeida Filho, Gary Guo, Björn Roy Baron,
	Benno Lossin, Andreas Hindborg, Alexander Viro,
	Christian Brauner, Greg Kroah-Hartman, Arve Hjønnevåg,
	Todd Kjos, Martijn Coenen, Joel Fernandes, Carlos Llamas,
	Suren Baghdasaryan, Dan Williams, Kees Cook, Thomas Gleixner,
	Daniel Xu, linux-kernel, rust-for-linux, linux-fsdevel,
	Josh Triplett

On Fri, Dec 01, 2023 at 02:36:40AM -0800, Boqun Feng wrote:

> I don't speak for the Rust language community, but I think this is
> something that they should improve. I understand it could be frustrating
> that we find out the new stuff doesn't support good old tools we use
> (trust me, I do!), but I believe you also understand that a higher level
> language can help in some places, for example, SBRM is naturally
> supported ;-) This answers half of the question: "Why are we even trying
> to use it again?".

C++ does that too (and a ton of other languages), and has a much less
craptastic syntax (not claiming C++ syntax doesn't have problems, but at
least it's the same language family). Now I realize C++ isn't ideal, it
inherits many of the safety issues from C. But gah, rust is such a royal
pain.

> The other half is how languages are designed is different in these days:
> a language community may do a better job on listening to the users and
> the real use cases can affect the language design in return. While we
> are doing our own experiment, we might well give that a shot too.

Well, rust was clearly not designed to interact with C/C++ sanely. Given
the kernel is a giant C project, this is somewhat of an issue IMO.

IIRC the way Chrome makes it work with C++ is by defining the interface
in a *third* language which compiles into 'compatible' Rust and C++,
which is total idiocy if you ask me.

Some languages (Zig IIUC) can consume regular C headers and are much
less painful to interact with (I know very little about Zig, no
endorsement beyond it integrating much better with C).

> And at least the document admits these are "future possibilities", so
> they should be more motivated to implement these.
> 
> It's never perfect, but we gotta start somewhere.

How about they start by using this LLVM goodness to implement the rust
equivalent of Zig's @cImport? Have it use clang to munge the C/C++
headers into IR and squash the lot into the rust thing.

The syntax is of course unfixable :-(


* Re: [PATCH 4/7] rust: file: add `FileDescriptorReservation`
  2023-11-30 16:40   ` Benno Lossin
@ 2023-12-01 11:32     ` Alice Ryhl
  0 siblings, 0 replies; 96+ messages in thread
From: Alice Ryhl @ 2023-12-01 11:32 UTC (permalink / raw)
  To: benno.lossin
  Cc: a.hindborg, alex.gaynor, aliceryhl, arve, bjorn3_gh, boqun.feng,
	brauner, cmllamas, dan.j.williams, dxu, gary, gregkh, joel,
	keescook, linux-fsdevel, linux-kernel, maco, ojeda, peterz,
	rust-for-linux, surenb, tglx, tkjos, viro, wedsonaf, willy

Benno Lossin <benno.lossin@proton.me> writes:
> On 11/29/23 14:11, Alice Ryhl wrote:
> > +impl FileDescriptorReservation {
> > +    /// Creates a new file descriptor reservation.
> > +    pub fn new(flags: u32) -> Result<Self> {
> > +        // SAFETY: FFI call, there are no safety requirements on `flags`.
> > +        let fd: i32 = unsafe { bindings::get_unused_fd_flags(flags) };
> > +        if fd < 0 {
> > +            return Err(Error::from_errno(fd));
> > +        }
> 
> I think here we could also use the modified `to_result` function that
> returns a `u32` if the value is non-negative.

I'll look into that for the next version.
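
For readers following along, here is a minimal userspace sketch of the kind
of helper Benno describes. `Errno` and `to_result_u32` are hypothetical
stand-ins and are not the kernel crate's actual `Error` type or `to_result`
signature; `raw_fd` plays the role of the value returned by
`get_unused_fd_flags`.

```rust
/// Hypothetical stand-in for the kernel's `Error`; it stores the negative
/// errno value exactly as the C function returned it.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub struct Errno(pub i32);

/// Converts a C-style return value into a `Result`: negative values become
/// errors, non-negative values are handed back as a `u32`.
pub fn to_result_u32(ret: i32) -> Result<u32, Errno> {
    if ret < 0 {
        Err(Errno(ret))
    } else {
        // `ret` is non-negative here, so the conversion is lossless.
        Ok(ret as u32)
    }
}

/// Hypothetical constructor body: the explicit `if fd < 0` check collapses
/// into a single `?`.
pub fn reserve(raw_fd: i32) -> Result<u32, Errno> {
    let fd = to_result_u32(raw_fd)?;
    Ok(fd)
}
```

With a helper of this shape the caller never sees a signed fd, so the
"is it negative?" question is answered once, at the FFI boundary.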

>> +    /// Commits the reservation.
>> +    ///
>> +    /// The previously reserved file descriptor is bound to `file`. This method consumes the
>> +    /// [`FileDescriptorReservation`], so it will not be usable after this call.
>> +    pub fn commit(self, file: ARef<File>) {
>> +        // SAFETY: `self.fd` was previously returned by `get_unused_fd_flags`, and `file.ptr` is
>> +        // guaranteed to have an owned ref count by its type invariants.
>> +        unsafe { bindings::fd_install(self.fd, file.0.get()) };
>> +
>> +        // `fd_install` consumes both the file descriptor and the file reference, so we cannot run
>> +        // the destructors.
>> +        core::mem::forget(self);
>> +        core::mem::forget(file);
> 
> Would be useful to have an `ARef::into_raw` function that would do
> the `forget` for us.

That makes sense to me, but I don't think it needs to happen in this patchset.
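
As an illustration of the suggested `into_raw` idea, here is a userspace
sketch. `Obj` and `OwnedRef` are hypothetical stand-ins for a refcounted C
object and `ARef<T>`; nothing here is the kernel crate's API. The point is
only that `into_raw` hands the owned reference to the callee without running
the destructor, which is what the pair of `forget` calls does by hand.

```rust
use std::mem::ManuallyDrop;

/// Hypothetical refcounted object, standing in for a C struct.
pub struct Obj {
    pub refcount: u32,
    pub value: u32,
}

/// Hypothetical owned reference: dropping it releases one refcount.
pub struct OwnedRef {
    ptr: *mut Obj,
}

impl OwnedRef {
    pub fn new(value: u32) -> Self {
        let obj = Box::new(Obj { refcount: 1, value });
        OwnedRef { ptr: Box::into_raw(obj) }
    }

    /// Gives up the owned reference without decrementing the refcount.
    /// The caller (or the C function it passes the pointer to) now owns it.
    pub fn into_raw(self) -> *mut Obj {
        let this = ManuallyDrop::new(self);
        this.ptr
    }

    /// # Safety
    /// `ptr` must come from `into_raw` and still carry one owned reference.
    pub unsafe fn from_raw(ptr: *mut Obj) -> Self {
        OwnedRef { ptr }
    }
}

impl Drop for OwnedRef {
    fn drop(&mut self) {
        // SAFETY: by the type's invariant, `ptr` is valid and we own one
        // reference to it.
        unsafe {
            (*self.ptr).refcount -= 1;
            if (*self.ptr).refcount == 0 {
                drop(Box::from_raw(self.ptr));
            }
        }
    }
}
```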

Alice

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH 6/7] rust: file: add `DeferredFdCloser`
  2023-11-30 17:12   ` Benno Lossin
@ 2023-12-01 11:35     ` Alice Ryhl
  2023-12-02 10:16       ` Benno Lossin
  0 siblings, 1 reply; 96+ messages in thread
From: Alice Ryhl @ 2023-12-01 11:35 UTC (permalink / raw)
  To: benno.lossin
  Cc: a.hindborg, alex.gaynor, aliceryhl, arve, bjorn3_gh, boqun.feng,
	brauner, cmllamas, dan.j.williams, dxu, gary, gregkh, joel,
	keescook, linux-fsdevel, linux-kernel, maco, ojeda, peterz,
	rust-for-linux, surenb, tglx, tkjos, viro, wedsonaf, willy

Benno Lossin <benno.lossin@proton.me> writes:
>> +        // SAFETY: The `inner` pointer points at a valid and fully initialized task work that is
>> +        // ready to be scheduled.
>> +        unsafe { bindings::task_work_add(current, inner, TWA_RESUME) };
> 
> I am a bit confused, when does `do_close_fd` actually run? Does
> `TWA_RESUME` mean that `inner` is scheduled to run after the current
> task has been completed?

When the current syscall returns to userspace.

>> +    // SAFETY: This function is an implementation detail of `close_fd`, so its safety comments
>> +    // should be read in extension of that method.
>> +    unsafe extern "C" fn do_close_fd(inner: *mut bindings::callback_head) {
>> +        // SAFETY: In `close_fd` we use this method together with a pointer that originates from a
>> +        // `Box<DeferredFdCloserInner>`, and we have just been given ownership of that allocation.
>> +        let inner = unsafe { Box::from_raw(inner as *mut DeferredFdCloserInner) };
> 
> In order for this call to be sound, `inner` must be an exclusive
> pointer (including any possible references into the `callback_head`).
> Is this the case?

Yes, when this is called, it's been removed from the linked list of task
work. That's why we can kfree it.

>> +        // SAFETY: Since `DeferredFdCloserInner` is `#[repr(C)]`, casting the pointers gives a
>> +        // pointer to the `twork` field.
>> +        let inner = Box::into_raw(self.inner) as *mut bindings::callback_head;
> 
> Here you can just use `.cast::<...>()`.

Will do.
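
For reference, a small userspace sketch of the `.cast::<...>()` form.
`CallbackHead` and `Inner` are hypothetical stand-ins for
`bindings::callback_head` and `DeferredFdCloserInner`; the sketch assumes,
as the quoted SAFETY comment does, that the struct is `#[repr(C)]` with the
head as its first field.

```rust
/// Hypothetical stand-in for `bindings::callback_head`.
#[repr(C)]
pub struct CallbackHead {
    pub next: usize,
}

/// Hypothetical stand-in for `DeferredFdCloserInner`. Because the struct is
/// `#[repr(C)]` and `twork` is the first field, a pointer to the struct is
/// also a pointer to `twork`.
#[repr(C)]
pub struct Inner {
    pub twork: CallbackHead,
    pub fd: u32,
}

/// Leaks the box and returns a pointer to its first field.
pub fn into_head(inner: Box<Inner>) -> *mut CallbackHead {
    // `.cast` only changes the pointee type; unlike `as`, it cannot be
    // used to turn the pointer into an integer by accident.
    Box::into_raw(inner).cast::<CallbackHead>()
}

/// # Safety
/// `head` must have been returned by `into_head` and not yet reclaimed.
pub unsafe fn from_head(head: *mut CallbackHead) -> Box<Inner> {
    // SAFETY: the caller guarantees `head` originates from a leaked
    // `Box<Inner>` and that we hold exclusive ownership of it.
    unsafe { Box::from_raw(head.cast::<Inner>()) }
}
```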

Alice

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH 7/7] rust: file: add abstraction for `poll_table`
  2023-11-30 17:42   ` Benno Lossin
@ 2023-12-01 11:47     ` Alice Ryhl
  0 siblings, 0 replies; 96+ messages in thread
From: Alice Ryhl @ 2023-12-01 11:47 UTC (permalink / raw)
  To: benno.lossin
  Cc: a.hindborg, alex.gaynor, aliceryhl, arve, bjorn3_gh, boqun.feng,
	brauner, cmllamas, dan.j.williams, dxu, gary, gregkh, joel,
	keescook, linux-fsdevel, linux-kernel, maco, ojeda, peterz,
	rust-for-linux, surenb, tglx, tkjos, viro, wedsonaf, willy

Benno Lossin <benno.lossin@proton.me> writes:
>> +#[pinned_drop]
>> +impl PinnedDrop for PollCondVar {
>> +    fn drop(self: Pin<&mut Self>) {
>> +        // Clear anything registered using `register_wait`.
>> +        self.inner.notify(1, bindings::POLLHUP | bindings::POLLFREE);
> 
> Isn't notifying only a single thread problematic, since a user could
> misuse the `PollCondVar` (since all functions of `CondVar` are also
> accessible) and also `.wait()` on the condvar? When dropping a
> `PollCondVar` it might notify only the user `.wait()`, but not the
> `PollTable`. Or am I missing something?

Using POLLFREE clears everything. However, this should probably be updated to
use `wake_up_pollfree` instead.

Note that calls to `.wait()` are definitely gone by the time the destructor
runs, since such calls borrow the `PollCondVar`, preventing you from running
the destructor.

Alice

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH 7/7] rust: file: add abstraction for `poll_table`
  2023-11-30 22:39   ` Boqun Feng
@ 2023-12-01 11:50     ` Alice Ryhl
  0 siblings, 0 replies; 96+ messages in thread
From: Alice Ryhl @ 2023-12-01 11:50 UTC (permalink / raw)
  To: boqun.feng
  Cc: a.hindborg, alex.gaynor, aliceryhl, arve, benno.lossin,
	bjorn3_gh, brauner, cmllamas, dan.j.williams, dxu, gary, gregkh,
	joel, keescook, linux-fsdevel, linux-kernel, maco, ojeda, peterz,
	rust-for-linux, surenb, tglx, tkjos, viro, wedsonaf, willy

Boqun Feng <boqun.feng@gmail.com> writes:
>> That said, `synchronize_rcu` is rather expensive and is not needed in
>> all cases: If we have never registered a `poll_table` with the
>> `wait_list`, then we don't need to call `synchronize_rcu`. (And this is
>> a common case in Binder - not all processes use Binder with epoll.) The
>> current implementation does not account for this, but we could change it
>> to store a boolean next to the `wait_list` to keep track of whether a
>> `poll_table` has ever been registered. It is up for discussion whether
>> this is desirable.
>> 
>> It is not clear to me whether we can implement the above without storing
>> an extra boolean. We could check whether the `wait_list` is empty, but
>> it is not clear that this is sufficient. Perhaps someone knows the
>> answer? If a `poll_table` has previously been registered with a
> 
> That won't be sufficient, considering this:
> 
>     CPU 0                           CPU 1
>                                     ep_remove_wait_queue():
>                                       whead = smp_load_acquire(&pwq->whead); // whead is not NULL
>     PollCondVar::drop():
>       self.inner.notify():
>         <for each wait entry in the list>
>           ep_poll_callback():
>             <remove wait entry>
>             smp_store_release(&ep_pwq_from_wait(wait)->whead, NULL);
>       <lock the waitqueue>
>       waitqueue_active() // return false, since the queue is empty
>       <unlock>
>     ...
>     <free the waitqueue>
>                                       if (whead) {
>                                         remove_wait_queue(whead, &pwq->wait); // Use-after-free BOOM!
>                                       }
>       
> 
> Note that moving the `wait_list` emptiness check before
> `self.inner.notify()` won't change the result, since there might be a
> `notify` called by users before `PollCondVar::drop()`, hence the same
> result.
> 
> Regards,
> Boqun

Thank you for confirming my suspicion.
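
Since checking for an empty list is not enough, the alternative from the
quoted text is the extra boolean. Below is a userspace sketch of that idea
only. `PollCondVarSketch`, `register_wait` and the `sync_calls` counter are
hypothetical; the counter stands in for the expensive `synchronize_rcu`
call so that the effect can be observed.

```rust
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};
use std::sync::Arc;

/// Hypothetical model of a condvar that remembers whether a poll table was
/// ever registered with it.
pub struct PollCondVarSketch {
    ever_registered: AtomicBool,
    /// Counts how often the stand-in for `synchronize_rcu` ran.
    sync_calls: Arc<AtomicUsize>,
}

impl PollCondVarSketch {
    pub fn new(sync_calls: Arc<AtomicUsize>) -> Self {
        PollCondVarSketch {
            ever_registered: AtomicBool::new(false),
            sync_calls,
        }
    }

    /// Called whenever a poll table is registered. The flag is sticky: it
    /// is never cleared, even if the wait entry is removed again later.
    pub fn register_wait(&self) {
        self.ever_registered.store(true, Ordering::Release);
    }
}

impl Drop for PollCondVarSketch {
    fn drop(&mut self) {
        // `&mut self` gives exclusive access, so a plain read is enough.
        if *self.ever_registered.get_mut() {
            // Stand-in for the expensive grace-period wait.
            self.sync_calls.fetch_add(1, Ordering::SeqCst);
        }
    }
}
```

The flag has to be "ever registered" rather than "currently registered",
because the race above happens after the entry has already left the list.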

Alice

^ permalink raw reply	[flat|nested] 96+ messages in thread

* RE: [PATCH 1/7] rust: file: add Rust abstraction for `struct file`
  2023-11-30 16:12           ` Benno Lossin
  2023-12-01  1:16             ` Theodore Ts'o
@ 2023-12-01 12:11             ` David Laight
  2023-12-01 12:27               ` Alice Ryhl
  1 sibling, 1 reply; 96+ messages in thread
From: David Laight @ 2023-12-01 12:11 UTC (permalink / raw)
  To: 'Benno Lossin', Theodore Ts'o
  Cc: Greg Kroah-Hartman, Alice Ryhl, Miguel Ojeda, Alex Gaynor,
	Wedson Almeida Filho, Boqun Feng, Gary Guo, Björn Roy Baron,
	Andreas Hindborg, Peter Zijlstra, Alexander Viro,
	Christian Brauner, Arve Hjønnevåg, Todd Kjos,
	Martijn Coenen, Joel Fernandes, Carlos Llamas,
	Suren Baghdasaryan, Dan Williams, Kees Cook, Matthew Wilcox,
	Thomas Gleixner, Daniel Xu, linux-kernel, rust-for-linux,
	linux-fsdevel

> > I don't know about Rust namespacing, but in other languages, how you
> > have to specify namespaces tends to be ***far*** more verbose than
> > just adding an O_ prefix.
> 
> In this case we already have the `flags` namespace, so I thought about
> just dropping the `O_` prefix altogether.

Does rust have a 'using namespace' (or similar) so that namespace doesn't
have to be explicitly specified each time a value is used?
If so you still need a hint about which set of values it is from.

Otherwise you get into the same mess as C++ class members (I think
they should have been .member from the start).
Or, worse still, Pascal and multiple 'with' blocks.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

^ permalink raw reply	[flat|nested] 96+ messages in thread

* RE: [PATCH 1/7] rust: file: add Rust abstraction for `struct file`
  2023-12-01 12:11             ` David Laight
@ 2023-12-01 12:27               ` Alice Ryhl
  2023-12-01 15:04                 ` Theodore Ts'o
  0 siblings, 1 reply; 96+ messages in thread
From: Alice Ryhl @ 2023-12-01 12:27 UTC (permalink / raw)
  To: david.laight
  Cc: a.hindborg, alex.gaynor, aliceryhl, arve, benno.lossin,
	bjorn3_gh, boqun.feng, brauner, cmllamas, dan.j.williams, dxu,
	gary, gregkh, joel, keescook, linux-fsdevel, linux-kernel, maco,
	ojeda, peterz, rust-for-linux, surenb, tglx, tkjos, tytso, viro,
	wedsonaf, willy

David Laight <David.Laight@ACULAB.COM> writes:
> > > I don't know about Rust namespacing, but in other languages, how you
> > > have to specify namespaces tends to be ***far*** more verbose than
> > > just adding an O_ prefix.
> > 
> > In this case we already have the `flags` namespace, so I thought about
> > just dropping the `O_` prefix altogether.
> 
> Does rust have a 'using namespace' (or similar) so that namespace doesn't
> have to be explicitly specified each time a value is used?
> If so you still need a hint about which set of values it is from.
> 
> Otherwise you get into the same mess as C++ class members (I think
> they should have been .member from the start).
> Or, worse still, Pascal and multiple 'with' blocks.

Yes.

You can import it with a use statement. For example:

use kernel::file::flags::O_RDONLY;
// use as O_RDONLY

or:

use kernel::file::flags::{O_RDONLY, O_WRONLY, O_RDWR};
// use as O_RDONLY

or:

use kernel::file::flags::*;
// use as O_RDONLY

If you want to specify a namespace every time you use it, then it is
possible: (But often you wouldn't do that.)

use kernel::file::flags;
// use as flags::O_RDONLY

or:

use kernel::file;
// use as file::flags::O_RDONLY

or:

use kernel::file::flags as file_flags;
// use as file_flags::O_RDONLY

And you can also use the full path if you don't want to add a `use`
statement.
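
The import forms above can be tried outside the kernel tree with a local
mirror of the module layout. The `kernel::file::flags` module below is a
hypothetical copy defined in the example itself, not the real kernel crate;
the three values are the usual Linux ones for these flags.

```rust
/// Hypothetical local mirror of the module path used in this thread.
pub mod kernel {
    pub mod file {
        pub mod flags {
            pub const O_RDONLY: u32 = 0;
            pub const O_WRONLY: u32 = 1;
            pub const O_RDWR: u32 = 2;
        }
    }
}

/// Import a single constant and use it unqualified.
pub fn single_import() -> u32 {
    use kernel::file::flags::O_RDONLY;
    O_RDONLY
}

/// Glob import: every constant in the module becomes available.
pub fn glob_import() -> u32 {
    use kernel::file::flags::*;
    O_WRONLY + O_RDWR
}

/// Import the module and qualify each use with it.
pub fn module_import() -> u32 {
    use kernel::file::flags;
    flags::O_RDWR
}

/// Import the module under a different local name.
pub fn renamed_import() -> u32 {
    use kernel::file::flags as file_flags;
    file_flags::O_WRONLY
}

/// No `use` at all: spell out the full path.
pub fn full_path() -> u32 {
    kernel::file::flags::O_RDONLY
}
```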

Alice

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH 1/7] rust: file: add Rust abstraction for `struct file`
  2023-12-01 12:27               ` Alice Ryhl
@ 2023-12-01 15:04                 ` Theodore Ts'o
  2023-12-01 15:14                   ` Benno Lossin
  0 siblings, 1 reply; 96+ messages in thread
From: Theodore Ts'o @ 2023-12-01 15:04 UTC (permalink / raw)
  To: Alice Ryhl
  Cc: david.laight, a.hindborg, alex.gaynor, arve, benno.lossin,
	bjorn3_gh, boqun.feng, brauner, cmllamas, dan.j.williams, dxu,
	gary, gregkh, joel, keescook, linux-fsdevel, linux-kernel, maco,
	ojeda, peterz, rust-for-linux, surenb, tglx, tkjos, viro,
	wedsonaf, willy

On Fri, Dec 01, 2023 at 12:27:40PM +0000, Alice Ryhl wrote:
> 
> You can import it with a use statement. For example:
> 
> use kernel::file::flags::O_RDONLY;
> // use as O_RDONLY

That's good to hear, but it still means that we have to use the XYZ_*
prefix, because otherwise, after something like

use kernel::file::flags::RDONLY;
use kernel::uapi::rwf::RDONLY;

that will blow up.  So that has to be

use kernel::file::flags::O_RDONLY;
use kernel::uapi::rwf::RWF_RDONLY;

Which is a bit more verbose, but at least things won't blow up
spectacularly when you need to use both namespaces in the same
codepath.

Also note how we do things like this:

#define IOCB_APPEND          (__force int) RWF_APPEND

In other words, the IOCB_* namespace and the RWF_* namespace partially
share code points, and so they *have* to be assigned to the same value
--- and note that since RWF_APPEND is defined as part of the UAPI, it
might not even be the same across different architectures....

Cheers,

						- Ted

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH 1/7] rust: file: add Rust abstraction for `struct file`
  2023-12-01 15:04                 ` Theodore Ts'o
@ 2023-12-01 15:14                   ` Benno Lossin
  2023-12-01 17:25                     ` David Laight
  0 siblings, 1 reply; 96+ messages in thread
From: Benno Lossin @ 2023-12-01 15:14 UTC (permalink / raw)
  To: Theodore Ts'o
  Cc: Alice Ryhl, david.laight, a.hindborg, alex.gaynor, arve,
	bjorn3_gh, boqun.feng, brauner, cmllamas, dan.j.williams, dxu,
	gary, gregkh, joel, keescook, linux-fsdevel, linux-kernel, maco,
	ojeda, peterz, rust-for-linux, surenb, tglx, tkjos, viro,
	wedsonaf, willy

On 12/1/23 16:04, Theodore Ts'o wrote:
> On Fri, Dec 01, 2023 at 12:27:40PM +0000, Alice Ryhl wrote:
>>
>> You can import it with a use statement. For example:
>>
>> use kernel::file::flags::O_RDONLY;
>> // use as O_RDONLY
> 
> That's good to hear, but it still means that we have to use the XYZ_*
> prefix, because otherwise, after something like
> 
> use kernel::file::flags::RDONLY;
> use kernel::uapi::rwf::RDONLY;
> 
> that will blow up.  So that has to be
> 
> use kernel::file::flags::O_RDONLY;
> use kernel::uapi::rwf::RWF_RDONLY;

You can just import the `flags` and `rwf` modules (the fourth option
posted by Alice):

    use kernel::file::flags;
    use kernel::uapi::rwf;
    
    // usage:
    
    flags::O_RDONLY
    
    rwf::RDONLY

Alternatively if we end up with multiple flags modules you can do this
(the sixth option from Alice):

    use kernel::file::flags as file_flags;
    use kernel::foo::flags as foo_flags;

    // usage:

    file_flags::O_RDONLY

    foo_flags::O_RDONLY
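
To make the collision case concrete, here is a sketch with two hypothetical
modules that both export an unprefixed `RDONLY`. The module names and the
values are placeholders for illustration only. Importing both names directly
into the same scope is rejected by the compiler (error E0252), so the
failure is a build error rather than a runtime surprise; the two functions
show ways to use both constants side by side.

```rust
/// Hypothetical modules that both define an unprefixed `RDONLY`.
/// The values are placeholders, not real kernel constants.
pub mod demo {
    pub mod file {
        pub mod flags {
            pub const RDONLY: u32 = 0x1;
        }
    }
    pub mod rwf {
        pub const RDONLY: u32 = 0x10;
    }
}

/// Import the modules and qualify each use; the module name takes over the
/// job of the prefix.
pub fn qualified() -> (u32, u32) {
    use demo::file::flags;
    use demo::rwf;
    (flags::RDONLY, rwf::RDONLY)
}

/// Import the items themselves, renaming each one at the import site.
pub fn renamed() -> (u32, u32) {
    use demo::file::flags::RDONLY as FILE_RDONLY;
    use demo::rwf::RDONLY as RWF_RDONLY;
    (FILE_RDONLY, RWF_RDONLY)
}
```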

-- 
Cheers,
Benno

^ permalink raw reply	[flat|nested] 96+ messages in thread

* RE: [PATCH 1/7] rust: file: add Rust abstraction for `struct file`
  2023-12-01 15:14                   ` Benno Lossin
@ 2023-12-01 17:25                     ` David Laight
  2023-12-01 17:37                       ` Benno Lossin
  0 siblings, 1 reply; 96+ messages in thread
From: David Laight @ 2023-12-01 17:25 UTC (permalink / raw)
  To: 'Benno Lossin', Theodore Ts'o
  Cc: Alice Ryhl, a.hindborg, alex.gaynor, arve, bjorn3_gh, boqun.feng,
	brauner, cmllamas, dan.j.williams, dxu, gary, gregkh, joel,
	keescook, linux-fsdevel, linux-kernel, maco, ojeda, peterz,
	rust-for-linux, surenb, tglx, tkjos, viro, wedsonaf, willy

From: Benno Lossin
> Sent: 01 December 2023 15:14
> 
> On 12/1/23 16:04, Theodore Ts'o wrote:
> > On Fri, Dec 01, 2023 at 12:27:40PM +0000, Alice Ryhl wrote:
> >>
> >> You can import it with a use statement. For example:
> >>
> >> use kernel::file::flags::O_RDONLY;
> >> // use as O_RDONLY
> >
> > That's good to hear,

Except that the examples here seem to imply you can't import
all of the values without listing them all.

From what I've seen of the rust patches the language seems
to have a lower SNR than Ada or VHDL.
Too much syntactic 'goop' makes it difficult to see what code
is actually doing.

....
> Alternatively if we end up with multiple flags modules you can do this
> (the sixth option from Alice):
> 
>     use kernel::file::flags as file_flags;
>     use kernel::foo::flags as foo_flags;
> 
>     // usage:
> 
>     file_flags::O_RDONLY
> 
>     foo_flags::O_RDONLY

That looks useful for the 'obfuscated rust' competition.
Consider:
	use kernel::file::flags as foo_flags;
	use kernel::foo::flags as file_flags;

It's probably fortunate that I'm old enough to retire before anyone forces
me to write any of this stuff :-)

	David

^ permalink raw reply	[flat|nested] 96+ messages in thread

* RE: [PATCH 1/7] rust: file: add Rust abstraction for `struct file`
  2023-12-01 17:25                     ` David Laight
@ 2023-12-01 17:37                       ` Benno Lossin
  0 siblings, 0 replies; 96+ messages in thread
From: Benno Lossin @ 2023-12-01 17:37 UTC (permalink / raw)
  To: David Laight
  Cc: Theodore Ts'o, Alice Ryhl, a.hindborg, alex.gaynor, arve,
	bjorn3_gh, boqun.feng, brauner, cmllamas, dan.j.williams, dxu,
	gary, gregkh, joel, keescook, linux-fsdevel, linux-kernel, maco,
	ojeda, peterz, rust-for-linux, surenb, tglx, tkjos, viro,
	wedsonaf, willy

On 12/1/23 18:25, David Laight wrote:
> From: Benno Lossin
>> Sent: 01 December 2023 15:14
>>
>> On 12/1/23 16:04, Theodore Ts'o wrote:
>>> On Fri, Dec 01, 2023 at 12:27:40PM +0000, Alice Ryhl wrote:
>>>>
>>>> You can import it with a use statement. For example:
>>>>
>>>> use kernel::file::flags::O_RDONLY;
>>>> // use as O_RDONLY
>>>
>>> That's good to hear,
> 
> Except that the examples here seem to imply you can't import
> all of the values without listing them all.

Alice has given an example above, but you might not have noticed:

    use kernel::file::flags::*;
    
    // usage:

    O_RDONLY
    O_APPEND

> From what I've seen of the rust patches the language seems
> to have a lower SNR than Ada or VHDL.
> Too much syntactic 'goop' makes it difficult to see what code
> is actually doing.

This is done for better readability, e.g. when you do not have
rust-analyzer to help you jump to the right definition. But there are
certainly instances where we use the `::*` imports (just look at the
first patch).

> ....
>> Alternatively if we end up with multiple flags modules you can do this
>> (the sixth option from Alice):
>>
>>     use kernel::file::flags as file_flags;
>>     use kernel::foo::flags as foo_flags;
>>
>>     // usage:
>>
>>     file_flags::O_RDONLY
>>
>>     foo_flags::O_RDONLY
> 
> That looks useful for the 'obfuscated rust' competition.
> Consider:
> 	use kernel::file::flags as foo_flags;
> 	use kernel::foo::flags as file_flags;

This is no worse than C preprocessor macros doing funky stuff.
We will just have to catch this in review.

-- 
Cheers,
Benno


^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH 3/7] rust: security: add abstraction for secctx
  2023-12-01 10:48     ` Alice Ryhl
@ 2023-12-02 10:03       ` Benno Lossin
  0 siblings, 0 replies; 96+ messages in thread
From: Benno Lossin @ 2023-12-02 10:03 UTC (permalink / raw)
  To: Alice Ryhl
  Cc: a.hindborg, alex.gaynor, arve, bjorn3_gh, boqun.feng, brauner,
	cmllamas, dan.j.williams, dxu, gary, gregkh, joel, keescook,
	linux-fsdevel, linux-kernel, maco, ojeda, peterz, rust-for-linux,
	surenb, tglx, tkjos, viro, wedsonaf, willy

On 12/1/23 11:48, Alice Ryhl wrote:
> Benno Lossin <benno.lossin@proton.me> writes:
>> On 11/29/23 14:11, Alice Ryhl wrote:
>>> +    /// Returns the bytes for this security context.
>>> +    pub fn as_bytes(&self) -> &[u8] {
>>> +        let mut ptr = self.secdata;
>>> +        if ptr.is_null() {
>>> +            // Many C APIs will use null pointers for strings of length zero, but
>>
>> I would just write that the secctx API uses null pointers to denote a
>> string of length zero.
> 
> I don't actually know whether it can ever be null, I just wanted to stay
> on the safe side.

I see, can someone from the C side confirm/refute this?

I found the comment a bit weird, since it is phrased in a general way.
If it turns out that the pointer can never be null, maybe use `NonNull`
instead (I would then also move the length check into the constructor)?
You can probably also do this if the pointer is allowed to be null,
assuming that you then do not need to call `security_release_secctx`.

>>> +            // `slice::from_raw_parts` doesn't allow the pointer to be null even if the length is
>>> +            // zero. Replace the pointer with a dangling but non-null pointer in this case.
>>> +            debug_assert_eq!(self.seclen, 0);
>>
>> I am feeling a bit uncomfortable with this, why can't we just return
>> an empty slice in this case?
> 
> I can do that, but to be clear, what I'm doing here is also definitely
> okay.

Yes it is okay, but I see this as similar to avoiding `unsafe` code when it
can be done safely. In this example we are not strictly avoiding any
`unsafe` code, but we are avoiding a codepath with `unsafe` code. You
should of course still keep the `debug_assert_eq`.
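
A userspace sketch of the early-return variant, for illustration. `SecCtx`
here is a hypothetical stand-in for the patch's wrapper, with a lifetime
parameter added only so that the example can be constructed safely from a
slice; the field names follow the quoted code.

```rust
use std::marker::PhantomData;
use std::ptr;
use std::slice;

/// Hypothetical stand-in for the security context wrapper.
pub struct SecCtx<'a> {
    secdata: *const u8,
    seclen: usize,
    _borrow: PhantomData<&'a [u8]>,
}

impl<'a> SecCtx<'a> {
    /// Models a C API that handed back a null pointer for "no context".
    pub fn null() -> Self {
        SecCtx { secdata: ptr::null(), seclen: 0, _borrow: PhantomData }
    }

    /// Models a C API that handed back a pointer and a length.
    pub fn from_slice(data: &'a [u8]) -> Self {
        SecCtx { secdata: data.as_ptr(), seclen: data.len(), _borrow: PhantomData }
    }

    /// Returns the bytes for this security context.
    pub fn as_bytes(&self) -> &[u8] {
        if self.secdata.is_null() {
            debug_assert_eq!(self.seclen, 0);
            // Null pointer: return an empty slice without touching the
            // `unsafe` path at all.
            return &[];
        }
        // SAFETY: `secdata` is non-null and, by construction, points to
        // `seclen` readable bytes that outlive `self`.
        unsafe { slice::from_raw_parts(self.secdata, self.seclen) }
    }
}
```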

-- 
Cheers,
Benno

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH 6/7] rust: file: add `DeferredFdCloser`
  2023-12-01 11:35     ` Alice Ryhl
@ 2023-12-02 10:16       ` Benno Lossin
  2023-12-05 14:43         ` Alice Ryhl
  0 siblings, 1 reply; 96+ messages in thread
From: Benno Lossin @ 2023-12-02 10:16 UTC (permalink / raw)
  To: Alice Ryhl
  Cc: a.hindborg, alex.gaynor, arve, bjorn3_gh, boqun.feng, brauner,
	cmllamas, dan.j.williams, dxu, gary, gregkh, joel, keescook,
	linux-fsdevel, linux-kernel, maco, ojeda, peterz, rust-for-linux,
	surenb, tglx, tkjos, viro, wedsonaf, willy

On 12/1/23 12:35, Alice Ryhl wrote:
> Benno Lossin <benno.lossin@proton.me> writes:
>>> +        // SAFETY: The `inner` pointer points at a valid and fully initialized task work that is
>>> +        // ready to be scheduled.
>>> +        unsafe { bindings::task_work_add(current, inner, TWA_RESUME) };
>>
>> I am a bit confused, when does `do_close_fd` actually run? Does
>> `TWA_RESUME` mean that `inner` is scheduled to run after the current
>> task has been completed?
> 
> When the current syscall returns to userspace.

What happens when I use `DeferredFdCloser` outside of a syscall? Will
it never run? Maybe add some documentation about that?

>>> +    // SAFETY: This function is an implementation detail of `close_fd`, so its safety comments
>>> +    // should be read in extension of that method.
>>> +    unsafe extern "C" fn do_close_fd(inner: *mut bindings::callback_head) {
>>> +        // SAFETY: In `close_fd` we use this method together with a pointer that originates from a
>>> +        // `Box<DeferredFdCloserInner>`, and we have just been given ownership of that allocation.
>>> +        let inner = unsafe { Box::from_raw(inner as *mut DeferredFdCloserInner) };
>>
>> In order for this call to be sound, `inner` must be an exclusive
>> pointer (including any possible references into the `callback_head`).
>> Is this the case?
> 
> Yes, when this is called, it's been removed from the linked list of task
> work. That's why we can kfree it.

Please add this to the SAFETY comment.

-- 
Cheers,
Benno

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH 2/7] rust: cred: add Rust abstraction for `struct cred`
  2023-12-01 10:27       ` Christian Brauner
@ 2023-12-04 15:42         ` Alice Ryhl
  0 siblings, 0 replies; 96+ messages in thread
From: Alice Ryhl @ 2023-12-04 15:42 UTC (permalink / raw)
  To: brauner
  Cc: a.hindborg, alex.gaynor, aliceryhl, arve, benno.lossin,
	bjorn3_gh, boqun.feng, cmllamas, dan.j.williams, dxu, gary,
	gregkh, joel, keescook, linux-fsdevel, linux-kernel, maco, ojeda,
	peterz, rust-for-linux, surenb, tglx, tkjos, viro, wedsonaf,
	willy

Christian Brauner <brauner@kernel.org> writes:
> On Fri, Dec 01, 2023 at 09:06:35AM +0000, Alice Ryhl wrote:
> > Benno Lossin <benno.lossin@proton.me> writes:
> > > On 11/29/23 13:51, Alice Ryhl wrote:
> > >> +    /// Returns the credentials of the task that originally opened the file.
> > >> +    pub fn cred(&self) -> &Credential {
> > >> +        // This `read_volatile` is intended to correspond to a READ_ONCE call.
> > >> +        //
> > >> +        // SAFETY: The file is valid because the shared reference guarantees a nonzero refcount.
> > >> +        //
> > >> +        // TODO: Replace with `read_once` when available on the Rust side.
> > >> +        let ptr = unsafe { core::ptr::addr_of!((*self.0.get()).f_cred).read_volatile() };
> > >> +
> > >> +        // SAFETY: The signature of this function ensures that the caller will only access the
> > >> +        // returned credential while the file is still valid, and the credential must stay valid
> > >> +        // while the file is valid.
> > > 
> > > About the last part of this safety comment, is this a guarantee from the
> > > C side? If yes, then I would phrase it that way:
> > > 
> > >     ... while the file is still valid, and the C side ensures that the
> > >     credentials stay valid while the file is valid.
> > 
> > Yes, that's my intention with this code.
> > 
> > But I guess this is a good question for Christian Brauner to confirm:
> > 
> > If I read the credential from the `f_cred` field, is it guaranteed that
> > the pointer remains valid for at least as long as the file?
> > 
> > Or should I do some dance along the lines of "lock file, increment
> > refcount on credential, unlock file"?
> 
> The lifetime of the f_cred reference is at least as long as the lifetime
> of the file:
> 
> // file not yet visible anywhere
> some_file = alloc_file*()
> -> init_file()
>    {
>            file->f_cred = get_cred(cred /* usually current_cred() */)
>    }
> 
> 
> // install into fd_table -> irreversible, thing visible, possibly shared
> fd_install(1234, some_file)
> 
> // last fput
> fput()
> // atomic_dec_and_test() dance:
> -> file_free() // either "delayed" through task work, workqueue, or
> 	       // sometimes freed right away if file hasn't been opened,
> 	       // i.e., if fd_install() wasn't called
>    -> put_cred(file->f_cred)
> 
> In order to access anything you must hold a reference to the file or
> files->file_lock. IOW, no poking around in f->f_cred or any field for
> that matter just under rcu_read_lock() for example. Because files are
> SLAB_TYPESAFE_BY_RCU. You might be poking in someone else's creds then.

Okay, we aren't dealing with the rcu case in this patchset, so we know
that it won't be freed while we're accessing it.

I guess this means that the `f_cred` field is immutable, which means
that I don't need READ_ONCE to read it? I'll use an ordinary load in the
next version.

Alice

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH 6/7] rust: file: add `DeferredFdCloser`
  2023-12-02 10:16       ` Benno Lossin
@ 2023-12-05 14:43         ` Alice Ryhl
  2023-12-05 18:16           ` Alice Ryhl
  0 siblings, 1 reply; 96+ messages in thread
From: Alice Ryhl @ 2023-12-05 14:43 UTC (permalink / raw)
  To: benno.lossin, brauner
  Cc: a.hindborg, alex.gaynor, aliceryhl, arve, bjorn3_gh, boqun.feng,
	cmllamas, dan.j.williams, dxu, gary, gregkh, joel, keescook,
	linux-fsdevel, linux-kernel, maco, ojeda, peterz, rust-for-linux,
	surenb, tglx, tkjos, viro, wedsonaf, willy

Benno Lossin <benno.lossin@proton.me> writes:
> On 12/1/23 12:35, Alice Ryhl wrote:
>> Benno Lossin <benno.lossin@proton.me> writes:
>>>> +        // SAFETY: The `inner` pointer points at a valid and fully initialized task work that is
>>>> +        // ready to be scheduled.
>>>> +        unsafe { bindings::task_work_add(current, inner, TWA_RESUME) };
>>>
>>> I am a bit confused, when does `do_close_fd` actually run? Does
>>> `TWA_RESUME` mean that `inner` is scheduled to run after the current
>>> task has been completed?
>> 
>> When the current syscall returns to userspace.
> 
> What happens when I use `DeferredFdCloser` outside of a syscall? Will
> it never run? Maybe add some documentation about that?

Christian Brauner, I think I need your help here.

I spent a bunch of time today trying to understand the correct way of
closing an fd held with fdget, and I'm unsure what the best way is.

So, first, `task_work_add` only really works when we're called from a
syscall. For one, it's fallible, and for another, you shouldn't even
attempt to use it from a kthread. (See e.g., the implementation of
`fput` in `fs/file_table.c`.)

To handle the above, we could fall back to the workqueue and schedule
the `fput` there when we are on a kthread or `task_work_add` fails. And
since I don't really care about the performance of this utility, let's
say we just unconditionally use the workqueue to simplify the
implementation.

However, it's not clear to me that this is okay. Consider this
execution: (please compare to `binder_deferred_fd_close`)

    Thread A                Thread B (workqueue)
    fdget()
    close_fd_get_file()
    get_file()
    filp_close()
    schedule_work(do_close_fd)
    // we are preempted
                            fput()
    fdput()

And now, since the workqueue can run before thread A returns to
userspace, we are in trouble again, right? Unless I missed an upgrade
to shared file descriptor somewhere that somehow makes this okay? I
looked around the C code and couldn't find one and I guess such an
upgrade has to happen before the call to `fdget` anyway?

In Binder, the above is perfectly fine since it closes the fd from a
context where `task_work_add` will always work, and a task work
definitely runs after the `fdput`. But I added this as a utility in the
shared kernel crate, and I want to avoid the situation where someone
comes along later and uses it from a kthread, gets the fallback to
workqueue, and then has a UAF due to the previously mentioned
execution...

What do you advise that I do?

Maybe the answer is just that, if you're in a context where it makes
sense to talk about an fd of the current task, then task_work_add will
also definitely work? So if `task_work_add` won't work, then
`close_fd_get_file` will return a null pointer and we never reach the
`task_work_add`. This seems fragile though.

Alice

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH 6/7] rust: file: add `DeferredFdCloser`
  2023-12-05 14:43         ` Alice Ryhl
@ 2023-12-05 18:16           ` Alice Ryhl
  0 siblings, 0 replies; 96+ messages in thread
From: Alice Ryhl @ 2023-12-05 18:16 UTC (permalink / raw)
  To: benno.lossin, brauner
  Cc: a.hindborg, alex.gaynor, arve, bjorn3_gh, boqun.feng, cmllamas,
	dan.j.williams, dxu, gary, gregkh, joel, keescook, linux-fsdevel,
	linux-kernel, maco, ojeda, peterz, rust-for-linux, surenb, tglx,
	tkjos, viro, wedsonaf, willy

On Tue, Dec 5, 2023 at 3:43 PM Alice Ryhl <aliceryhl@google.com> wrote:
>
> Benno Lossin <benno.lossin@proton.me> writes:
> > On 12/1/23 12:35, Alice Ryhl wrote:
> >> Benno Lossin <benno.lossin@proton.me> writes:
> >>>> +        // SAFETY: The `inner` pointer points at a valid and fully initialized task work that is
> >>>> +        // ready to be scheduled.
> >>>> +        unsafe { bindings::task_work_add(current, inner, TWA_RESUME) };
> >>>
> >>> I am a bit confused, when does `do_close_fd` actually run? Does
> >>> `TWA_RESUME` mean that `inner` is scheduled to run after the current
> >>> task has been completed?
> >>
> >> When the current syscall returns to userspace.
> >
> > What happens when I use `DeferredFdCloser` outside of a syscall? Will
> > it never run? Maybe add some documentation about that?
>
> Christian Brauner, I think I need your help here.
>
> I spent a bunch of time today trying to understand the correct way of
> closing an fd held with fdget, and I'm unsure what the best way is.
>
> So, first, `task_work_add` only really works when we're called from a
> syscall. For one, it's fallible, and for another, you shouldn't even
> attempt to use it from a kthread. (See e.g., the implementation of
> `fput` in `fs/file_table.c`.)
>
> To handle the above, we could fall back to the workqueue and schedule
> the `fput` there when we are on a kthread or `task_work_add` fails. And
> since I don't really care about the performance of this utility, let's
> say we just unconditionally use the workqueue to simplify the
> implementation.
>
> However, it's not clear to me that this is okay. Consider this
> execution: (please compare to `binder_deferred_fd_close`)
>
>     Thread A                Thread B (workqueue)
>     fdget()
>     close_fd_get_file()
>     get_file()
>     filp_close()
>     schedule_work(do_close_fd)
>     // we are preempted
>                             fput()
>     fdput()
>
> And now, since the workqueue can run before thread A returns to
> userspace, we are in trouble again, right? Unless I missed an upgrade
> to shared file descriptor somewhere that somehow makes this okay? I
> looked around the C code and couldn't find one and I guess such an
> upgrade has to happen before the call to `fdget` anyway?
>
> In Binder, the above is perfectly fine since it closes the fd from a
> context where `task_work_add` will always work, and a task work
> definitely runs after the `fdput`. But I added this as a utility in the
> shared kernel crate, and I want to avoid the situation where someone
> comes along later and uses it from a kthread, gets the fallback to
> workqueue, and then has a UAF due to the previously mentioned
> execution...
>
> What do you advise that I do?
>
> Maybe the answer is just that, if you're in a context where it makes
> sense to talk about an fd of the current task, then task_work_add will
> also definitely work? So if `task_work_add` won't work, then
> `close_fd_get_file` will return a null pointer and we never reach the
> `task_work_add`. This seems fragile though.
>
> Alice

Ah! I realized that there's another option: Report an error if we
can't schedule the task work.

I didn't suggest this originally because I didn't want to leak the
file in the error path, and I couldn't think of anything else sane to
do.

But! We can schedule the task work *first*, then attempt to close the
file. This way, the file doesn't get closed in the error path. And
there's no race condition since the task work is guaranteed to get
scheduled later on the same thread, so there's no way for it to get
executed in between us scheduling it and closing the file.
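
To make the ordering concrete, here is a minimal userspace sketch of
"schedule first, close second". It is a toy model, not kernel code:
`fake_file`, `fake_task`, `fake_task_work_add`, `deferred_close` and
`fake_return_to_user` are all hypothetical names invented for this
illustration, and the reference counting is simplified to a single
reference that moves from the fd table to the queued work.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration only; not kernel APIs. */
struct fake_file {
	int refcount;      /* references currently held on the file */
	bool in_fd_table;  /* true while an fd still points at it */
};

struct fake_task {
	bool is_kthread;            /* context where task work is unavailable */
	bool work_queued;           /* models a pending task work item */
	struct fake_file *pending;  /* file whose reference the work will drop */
};

/* Models task_work_add(): fallible, and refused on a kthread. */
static int fake_task_work_add(struct fake_task *t)
{
	if (t->is_kthread)
		return -1;
	t->work_queued = true;
	return 0;
}

/*
 * Schedule first, close second. If scheduling fails, the fd is left
 * untouched. If the close itself fails, the queued work simply finds
 * nothing to drop.
 */
static int deferred_close(struct fake_task *t, struct fake_file *f)
{
	assert(t != NULL);
	if (fake_task_work_add(t) != 0)
		return -1;
	if (f != NULL && f->in_fd_table) {
		f->in_fd_table = false;  /* fd slot is freed right away */
		t->pending = f;          /* table's reference moves to the work */
	}
	return 0;
}

/* Models the return to userspace, where queued task work runs. */
static void fake_return_to_user(struct fake_task *t)
{
	if (t->work_queued && t->pending != NULL)
		t->pending->refcount--;  /* the deferred fput() */
	t->pending = NULL;
	t->work_queued = false;
}
```

In this model a caller on a "kthread" gets an error back and the fd
stays open, while a normal caller sees the fd disappear immediately but
the file reference survives until `fake_return_to_user` runs, i.e.
after any `fdput`.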

Thoughts?

Alice

* Re: [PATCH 5/7] rust: file: add `Kuid` wrapper
  2023-11-30 12:46       ` Christian Brauner
@ 2023-12-06 19:59         ` Kent Overstreet
  2023-12-08 16:26           ` Peter Zijlstra
  2023-12-08  5:28         ` comex
  1 sibling, 1 reply; 96+ messages in thread
From: Kent Overstreet @ 2023-12-06 19:59 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Peter Zijlstra, Alice Ryhl, Miguel Ojeda, Alex Gaynor,
	Wedson Almeida Filho, Boqun Feng, Gary Guo, Björn Roy Baron,
	Benno Lossin, Andreas Hindborg, Alexander Viro,
	Greg Kroah-Hartman, Arve Hjønnevåg, Todd Kjos,
	Martijn Coenen, Joel Fernandes, Carlos Llamas,
	Suren Baghdasaryan, Dan Williams, Kees Cook, Matthew Wilcox,
	Thomas Gleixner, Daniel Xu, linux-kernel, rust-for-linux,
	linux-fsdevel

On Thu, Nov 30, 2023 at 01:46:36PM +0100, Christian Brauner wrote:
> On Wed, Nov 29, 2023 at 05:48:15PM +0100, Peter Zijlstra wrote:
> > On Wed, Nov 29, 2023 at 05:28:27PM +0100, Christian Brauner wrote:
> > 
> > > > +pid_t rust_helper_task_tgid_nr_ns(struct task_struct *tsk,
> > > > +				  struct pid_namespace *ns)
> > > > +{
> > > > +	return task_tgid_nr_ns(tsk, ns);
> > > > +}
> > > > +EXPORT_SYMBOL_GPL(rust_helper_task_tgid_nr_ns);
> > > 
> > > I'm a bit puzzled by all these rust_helper_*() calls. Can you explain
> > > why they are needed? Because they are/can be static inlines and that
> > > somehow doesn't work?
> > 
> > Correct, because Rust can only talk to C ABI, it cannot use C headers.
> > Bindgen would need to translate the full C headers into valid Rust for
> > that to work.
> > 
> > I really think the Rust peoples should spend more effort on that,
> > because you are quite right, all this wrappery is tedious at best.

I suspect even if the manpower existed to go that route we'd end up
regretting it, because then the Rust compiler would need to be able to
handle _all_ the craziness a modern C compiler knows how to do -
preprocessor magic/devilry isn't even the worst of it, it gets even
worse when you start to consider things like bitfields and all the crazy
__attribute__(()) people have invented.

Swift went that route, but they have Apple funding them, and I doubt
even they would want anything to do with Linux kernel C.

IOW: yes, the extra friction from not being able to do full C -> Rust
translation is annoying now, but probably a good thing in the long run.

> The problem is that we end up with a long list of explicit exports that
> also are all really weirdly named like rust_helper_*(). I wouldn't even
> complain if it they were somehow auto-generated but as you say that
> might be out of scope.

I think we'd need help from the C side to auto generate them - what we
really want is for them to be inline, not static inline, but of course
that has never really worked for functions used across more than one C file.
But maybe C compiler people are smarter these days?

Just a keyword to tell the C compiler "take this static inline and
generate a compiled version in this .c file" would be all we need.

I could see it being handy for other things, too: as Linus has been
saying, we tend to inline too much code these days, and part of the
reason for that is we make a function inline because of the _one_
fastpath that needs it, but there are 3 more slowpaths that don't.

And right now we don't have any sane way of having a function be
available with both inlined and outlined versions, besides the same kind
of manual wrappers the Rust people are doing here... so we should
probably just fix that.
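
As a rough sketch of what such a facility would replace: today the
outlined copy has to be written by hand, or with a macro like the one
below. `DEFINE_OUTLINED` and `low_bits` are hypothetical names made up
for this example; neither is an existing kernel or compiler feature.

```c
#include <assert.h>

/* Header-style fast path: inlined into every C caller. */
static inline unsigned int low_bits(unsigned int v, unsigned int n)
{
	assert(n < 32);  /* keeps the shift in range for a 32-bit unsigned int */
	return v & ((1u << n) - 1u);
}

/*
 * Hypothetical macro (not an existing kernel or compiler facility):
 * emits one out-of-line copy of a two-argument static inline function.
 * In the kernel, this is where EXPORT_SYMBOL_GPL() would follow.
 */
#define DEFINE_OUTLINED(ret, name, t1, t2)		\
	ret name##_outlined(t1 a, t2 b)			\
	{						\
		return name(a, b);			\
	}

/* Expands to: unsigned int low_bits_outlined(unsigned int a, unsigned int b) */
DEFINE_OUTLINED(unsigned int, low_bits, unsigned int, unsigned int)
```

Fastpath callers keep using `low_bits()` directly, and slowpaths (or
Rust) call `low_bits_outlined()`. The macro still has to be invoked once
per function and per signature shape, which is exactly the manual work a
compiler keyword would remove.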

* Re: [PATCH 5/7] rust: file: add `Kuid` wrapper
  2023-11-30 10:36   ` Peter Zijlstra
@ 2023-12-06 20:02     ` Kent Overstreet
  2023-12-07  7:18       ` Greg Kroah-Hartman
  0 siblings, 1 reply; 96+ messages in thread
From: Kent Overstreet @ 2023-12-06 20:02 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Alice Ryhl, Miguel Ojeda, Alex Gaynor, Wedson Almeida Filho,
	Boqun Feng, Gary Guo, Björn Roy Baron, Benno Lossin,
	Andreas Hindborg, Alexander Viro, Christian Brauner,
	Greg Kroah-Hartman, Arve Hjønnevåg, Todd Kjos,
	Martijn Coenen, Joel Fernandes, Carlos Llamas,
	Suren Baghdasaryan, Dan Williams, Kees Cook, Matthew Wilcox,
	Thomas Gleixner, Daniel Xu, linux-kernel, rust-for-linux,
	linux-fsdevel

On Thu, Nov 30, 2023 at 11:36:35AM +0100, Peter Zijlstra wrote:
> On Wed, Nov 29, 2023 at 01:12:17PM +0000, Alice Ryhl wrote:
> 
> > diff --git a/rust/helpers.c b/rust/helpers.c
> > index fd633d9db79a..58e3a9dff349 100644
> > --- a/rust/helpers.c
> > +++ b/rust/helpers.c
> > @@ -142,6 +142,51 @@ void rust_helper_put_task_struct(struct task_struct *t)
> >  }
> >  EXPORT_SYMBOL_GPL(rust_helper_put_task_struct);
> >  
> > +kuid_t rust_helper_task_uid(struct task_struct *task)
> > +{
> > +	return task_uid(task);
> > +}
> > +EXPORT_SYMBOL_GPL(rust_helper_task_uid);
> > +
> > +kuid_t rust_helper_task_euid(struct task_struct *task)
> > +{
> > +	return task_euid(task);
> > +}
> > +EXPORT_SYMBOL_GPL(rust_helper_task_euid);
> 
> Aren't these like ideal speculation gadgets? And shouldn't we avoid
> functions like this for exactly that reason?

I think asking the Rust people to care about that is probably putting
too many constraints on them, unless you actually have an idea for
something better to do...

(loudly giving the CPU manufacturers the middle finger for making _all_
of us deal with this bullshit)

* Re: [PATCH 0/7] File abstractions needed by Rust Binder
  2023-11-29 16:31 ` [PATCH 0/7] File abstractions needed by Rust Binder Christian Brauner
  2023-11-29 16:48   ` Miguel Ojeda
@ 2023-12-06 20:05   ` Kent Overstreet
  2023-12-08 16:59     ` Miguel Ojeda
  1 sibling, 1 reply; 96+ messages in thread
From: Kent Overstreet @ 2023-12-06 20:05 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Alice Ryhl, Miguel Ojeda, Alex Gaynor, Wedson Almeida Filho,
	Boqun Feng, Gary Guo, Björn Roy Baron, Benno Lossin,
	Andreas Hindborg, Peter Zijlstra, Alexander Viro,
	Greg Kroah-Hartman, Arve Hjønnevåg, Todd Kjos,
	Martijn Coenen, Joel Fernandes, Carlos Llamas,
	Suren Baghdasaryan, Dan Williams, Kees Cook, Matthew Wilcox,
	Thomas Gleixner, Daniel Xu, linux-kernel, rust-for-linux,
	linux-fsdevel

On Wed, Nov 29, 2023 at 05:31:44PM +0100, Christian Brauner wrote:
> On Wed, Nov 29, 2023 at 12:51:06PM +0000, Alice Ryhl wrote:
> > This patchset contains the file abstractions needed by the Rust
> > implementation of the Binder driver.
> > 
> > Please see the Rust Binder RFC for usage examples:
> > https://lore.kernel.org/rust-for-linux/20231101-rust-binder-v1-0-08ba9197f637@google.com/
> > 
> > Users of "rust: file: add Rust abstraction for `struct file`":
> > 	[PATCH RFC 02/20] rust_binder: add binderfs support to Rust binder
> > 	[PATCH RFC 03/20] rust_binder: add threading support
> > 
> > Users of "rust: cred: add Rust abstraction for `struct cred`":
> > 	[PATCH RFC 05/20] rust_binder: add nodes and context managers
> > 	[PATCH RFC 06/20] rust_binder: add oneway transactions
> > 	[PATCH RFC 11/20] rust_binder: send nodes in transaction
> > 	[PATCH RFC 13/20] rust_binder: add BINDER_TYPE_FD support
> > 
> > Users of "rust: security: add abstraction for security_secid_to_secctx":
> > 	[PATCH RFC 06/20] rust_binder: add oneway transactions
> > 
> > Users of "rust: file: add `FileDescriptorReservation`":
> > 	[PATCH RFC 13/20] rust_binder: add BINDER_TYPE_FD support
> > 	[PATCH RFC 14/20] rust_binder: add BINDER_TYPE_FDA support
> > 
> > Users of "rust: file: add kuid getters":
> > 	[PATCH RFC 05/20] rust_binder: add nodes and context managers
> > 	[PATCH RFC 06/20] rust_binder: add oneway transactions
> > 
> > Users of "rust: file: add `DeferredFdCloser`":
> > 	[PATCH RFC 14/20] rust_binder: add BINDER_TYPE_FDA support
> > 
> > Users of "rust: file: add abstraction for `poll_table`":
> > 	[PATCH RFC 07/20] rust_binder: add epoll support
> > 
> > This patchset has some uses of read_volatile in place of READ_ONCE.
> > Please see the following rfc for context on this:
> > https://lore.kernel.org/all/20231025195339.1431894-1-boqun.feng@gmail.com/
> > 
> > This was previously sent as an rfc:
> > https://lore.kernel.org/all/20230720152820.3566078-1-aliceryhl@google.com/
> > 
> > Signed-off-by: Alice Ryhl <aliceryhl@google.com>
> > ---
> > Alice Ryhl (4):
> >       rust: security: add abstraction for security_secid_to_secctx
> >       rust: file: add `Kuid` wrapper
> >       rust: file: add `DeferredFdCloser`
> >       rust: file: add abstraction for `poll_table`
> > 
> > Wedson Almeida Filho (3):
> >       rust: file: add Rust abstraction for `struct file`
> >       rust: cred: add Rust abstraction for `struct cred`
> >       rust: file: add `FileDescriptorReservation`
> > 
> >  rust/bindings/bindings_helper.h |   9 ++
> >  rust/bindings/lib.rs            |   1 +
> >  rust/helpers.c                  |  94 +++++++++++
> >  rust/kernel/cred.rs             |  73 +++++++++
> >  rust/kernel/file.rs             | 345 ++++++++++++++++++++++++++++++++++++++++
> >  rust/kernel/file/poll_table.rs  |  97 +++++++++++
> 
> That's pretty far away from the subsystem these wrappers belong to. I
> would prefer if wrappers such as this would live directly in fs/rust/
> and so live within the subsystem they belong to. I think I mentioned
> that before. Maybe I missed some sort of agreement here?

I spoke to Miguel about this and it was my understanding that everything
was in place for moving Rust wrappers to the proper directory -
previously there was build system stuff blocking, but he said that's all
working now. Perhaps the memo just didn't get passed down?

(My vote would actually be for fs/ directly, not fs/rust, and a 1:1
mapping between .c files and the .rs files that wrap them).

* Re: [PATCH 5/7] rust: file: add `Kuid` wrapper
  2023-12-06 20:02     ` Kent Overstreet
@ 2023-12-07  7:18       ` Greg Kroah-Hartman
  2023-12-07  7:46         ` Kent Overstreet
  0 siblings, 1 reply; 96+ messages in thread
From: Greg Kroah-Hartman @ 2023-12-07  7:18 UTC (permalink / raw)
  To: Kent Overstreet
  Cc: Peter Zijlstra, Alice Ryhl, Miguel Ojeda, Alex Gaynor,
	Wedson Almeida Filho, Boqun Feng, Gary Guo, Björn Roy Baron,
	Benno Lossin, Andreas Hindborg, Alexander Viro,
	Christian Brauner, Arve Hjønnevåg, Todd Kjos,
	Martijn Coenen, Joel Fernandes, Carlos Llamas,
	Suren Baghdasaryan, Dan Williams, Kees Cook, Matthew Wilcox,
	Thomas Gleixner, Daniel Xu, linux-kernel, rust-for-linux,
	linux-fsdevel

On Wed, Dec 06, 2023 at 03:02:24PM -0500, Kent Overstreet wrote:
> On Thu, Nov 30, 2023 at 11:36:35AM +0100, Peter Zijlstra wrote:
> > On Wed, Nov 29, 2023 at 01:12:17PM +0000, Alice Ryhl wrote:
> > 
> > > diff --git a/rust/helpers.c b/rust/helpers.c
> > > index fd633d9db79a..58e3a9dff349 100644
> > > --- a/rust/helpers.c
> > > +++ b/rust/helpers.c
> > > @@ -142,6 +142,51 @@ void rust_helper_put_task_struct(struct task_struct *t)
> > >  }
> > >  EXPORT_SYMBOL_GPL(rust_helper_put_task_struct);
> > >  
> > > +kuid_t rust_helper_task_uid(struct task_struct *task)
> > > +{
> > > +	return task_uid(task);
> > > +}
> > > +EXPORT_SYMBOL_GPL(rust_helper_task_uid);
> > > +
> > > +kuid_t rust_helper_task_euid(struct task_struct *task)
> > > +{
> > > +	return task_euid(task);
> > > +}
> > > +EXPORT_SYMBOL_GPL(rust_helper_task_euid);
> > 
> > Aren't these like ideal speculation gadgets? And shouldn't we avoid
> > functions like this for exactly that reason?
> 
> I think asking the Rust people to care about that is probably putting
> too many constraints on them, unless you actually have an idea for
> something better to do...

It's not a constraint, it is a "we can not do this as it is buggy
because cpus are broken and we need to protect users from those bugs."

If we were to accept this type of code, then the people who are going
"it's safer to write kernel code in Rust" would be "pleasantly
surprised" when it turns out that their systems are actually more
insecure.

Hint, when "known broken" code is found in code review, it can not just
be ignored.

thanks,

greg k-h

* Re: [PATCH 5/7] rust: file: add `Kuid` wrapper
  2023-12-07  7:18       ` Greg Kroah-Hartman
@ 2023-12-07  7:46         ` Kent Overstreet
  0 siblings, 0 replies; 96+ messages in thread
From: Kent Overstreet @ 2023-12-07  7:46 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Peter Zijlstra, Alice Ryhl, Miguel Ojeda, Alex Gaynor,
	Wedson Almeida Filho, Boqun Feng, Gary Guo, Björn Roy Baron,
	Benno Lossin, Andreas Hindborg, Alexander Viro,
	Christian Brauner, Arve Hjønnevåg, Todd Kjos,
	Martijn Coenen, Joel Fernandes, Carlos Llamas,
	Suren Baghdasaryan, Dan Williams, Kees Cook, Matthew Wilcox,
	Thomas Gleixner, Daniel Xu, linux-kernel, rust-for-linux,
	linux-fsdevel

On Thu, Dec 07, 2023 at 08:18:37AM +0100, Greg Kroah-Hartman wrote:
> On Wed, Dec 06, 2023 at 03:02:24PM -0500, Kent Overstreet wrote:
> > On Thu, Nov 30, 2023 at 11:36:35AM +0100, Peter Zijlstra wrote:
> > > On Wed, Nov 29, 2023 at 01:12:17PM +0000, Alice Ryhl wrote:
> > > 
> > > > diff --git a/rust/helpers.c b/rust/helpers.c
> > > > index fd633d9db79a..58e3a9dff349 100644
> > > > --- a/rust/helpers.c
> > > > +++ b/rust/helpers.c
> > > > @@ -142,6 +142,51 @@ void rust_helper_put_task_struct(struct task_struct *t)
> > > >  }
> > > >  EXPORT_SYMBOL_GPL(rust_helper_put_task_struct);
> > > >  
> > > > +kuid_t rust_helper_task_uid(struct task_struct *task)
> > > > +{
> > > > +	return task_uid(task);
> > > > +}
> > > > +EXPORT_SYMBOL_GPL(rust_helper_task_uid);
> > > > +
> > > > +kuid_t rust_helper_task_euid(struct task_struct *task)
> > > > +{
> > > > +	return task_euid(task);
> > > > +}
> > > > +EXPORT_SYMBOL_GPL(rust_helper_task_euid);
> > > 
> > > Aren't these like ideal speculation gadgets? And shouldn't we avoid
> > > functions like this for exactly that reason?
> > 
> > I think asking the Rust people to care about that is probably putting
> > too many constraints on them, unless you actually have an idea for
> > something better to do...
> 
> It's not a constraint, it is a "we can not do this as it is buggy
> because cpus are broken and we need to protect users from those bugs."
> 
> If we were to accept this type of code, then the people who are going
> "it's safer to write kernel code in Rust" would be "pleasantly
> surprised" when it turns out that their systems are actually more
> insecure.
> 
> Hint, when "known broken" code is found in code review, it can not just
> be ignored.

We're talking about a CPU bug, not a Rust bug, and maybe try a nm
--size-sort and see what you find before throwing stones at them...

* Re: [PATCH 5/7] rust: file: add `Kuid` wrapper
  2023-11-30 12:46       ` Christian Brauner
  2023-12-06 19:59         ` Kent Overstreet
@ 2023-12-08  5:28         ` comex
  2023-12-08 16:19           ` Miguel Ojeda
  1 sibling, 1 reply; 96+ messages in thread
From: comex @ 2023-12-08  5:28 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Peter Zijlstra, Alice Ryhl, Miguel Ojeda, Alex Gaynor,
	Wedson Almeida Filho, Boqun Feng, Gary Guo, Björn Roy Baron,
	Benno Lossin, Andreas Hindborg, Alexander Viro,
	Greg Kroah-Hartman, Arve Hjønnevåg, Todd Kjos,
	Martijn Coenen, Joel Fernandes, Carlos Llamas,
	Suren Baghdasaryan, Dan Williams, Kees Cook, Matthew Wilcox,
	Thomas Gleixner, Daniel Xu, linux-kernel, rust-for-linux,
	linux-fsdevel



> On Nov 30, 2023, at 4:46 AM, Christian Brauner <brauner@kernel.org> wrote:
> 
> I wouldn't even
> complain if they were somehow auto-generated but as you say that
> might be out of scope.

FYI, rust-bindgen got an experimental feature of this nature earlier this year:

https://github.com/rust-lang/rust-bindgen/pull/2335

Though apparently it has significant limitations that earn it the “experimental” label.

Regarding the issue of wrappers not being inlined, it's possible to get LLVM to optimize C and Rust code together into an object file, with the help of a compatible Clang and LLD:

@ rustc -O --emit llvm-bc a.rs                                         
@ clang --target=x86_64-unknown-linux-gnu -O2 -c -emit-llvm -o b.bc b.c
@ ld.lld -r -o c.o a.bc b.bc

Basically LTO but within the scope of a single object file.  This would be redundant in cases where kernel-wide LTO is enabled.

Using this approach might slow down compilation a bit due to needing to pass the LLVM bitcode between multiple commands, but probably not very much.

Just chiming in as someone not involved in Rust for Linux but familiar with these tools.  Perhaps this has been considered before and rejected for some reason; I wouldn’t know.

* Re: [PATCH 5/7] rust: file: add `Kuid` wrapper
  2023-12-08  5:28         ` comex
@ 2023-12-08 16:19           ` Miguel Ojeda
  2023-12-08 17:08             ` Nick Desaulniers
  2023-12-09  7:24             ` comex
  0 siblings, 2 replies; 96+ messages in thread
From: Miguel Ojeda @ 2023-12-08 16:19 UTC (permalink / raw)
  To: comex
  Cc: Christian Brauner, Peter Zijlstra, Alice Ryhl, Miguel Ojeda,
	Alex Gaynor, Wedson Almeida Filho, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg,
	Alexander Viro, Greg Kroah-Hartman, Arve Hjønnevåg,
	Todd Kjos, Martijn Coenen, Joel Fernandes, Carlos Llamas,
	Suren Baghdasaryan, Dan Williams, Kees Cook, Matthew Wilcox,
	Thomas Gleixner, Daniel Xu, linux-kernel, rust-for-linux,
	linux-fsdevel, Nick Desaulniers

On Fri, Dec 8, 2023 at 6:28 AM comex <comexk@gmail.com> wrote:
>
> Regarding the issue of wrappers not being inlined, it's possible to get LLVM to optimize C and Rust code together into an object file, with the help of a compatible Clang and LLD:
>
> @ rustc -O --emit llvm-bc a.rs
> @ clang --target=x86_64-unknown-linux-gnu -O2 -c -emit-llvm -o b.bc b.c
> @ ld.lld -r -o c.o a.bc b.bc
>
> Basically LTO but within the scope of a single object file.  This would be redundant in cases where kernel-wide LTO is enabled.
>
> Using this approach might slow down compilation a bit due to needing to pass the LLVM bitcode between multiple commands, but probably not very much.
>
> Just chiming in as someone not involved in Rust for Linux but familiar with these tools.  Perhaps this has been considered before and rejected for some reason; I wouldn’t know.

Thanks comex for chiming in, much appreciated.

Yeah, this is what we have been calling the "local-LTO hack" and it
was one of the possibilities we were considering for non-LTO kernel
builds for performance reasons originally. I don't recall who
originally suggested it in one of our meetings (Gary or Björn
perhaps).

If LLVM folks think LLVM-wise nothing will break, then we are happy to
go ahead with that (since it also solves the performance side), but it
would be nice to know if it will always be OK to build like that, i.e.
I think Andreas actually tried it and it seemed to work and boot, but
the worry is whether there is something subtle that could have bad
codegen in the future.

(We will also need to worry about GCC.)

Cheers,
Miguel

* Re: [PATCH 5/7] rust: file: add `Kuid` wrapper
  2023-12-06 19:59         ` Kent Overstreet
@ 2023-12-08 16:26           ` Peter Zijlstra
  2023-12-08 19:58             ` Kent Overstreet
  0 siblings, 1 reply; 96+ messages in thread
From: Peter Zijlstra @ 2023-12-08 16:26 UTC (permalink / raw)
  To: Kent Overstreet
  Cc: Christian Brauner, Alice Ryhl, Miguel Ojeda, Alex Gaynor,
	Wedson Almeida Filho, Boqun Feng, Gary Guo, Björn Roy Baron,
	Benno Lossin, Andreas Hindborg, Alexander Viro,
	Greg Kroah-Hartman, Arve Hjønnevåg, Todd Kjos,
	Martijn Coenen, Joel Fernandes, Carlos Llamas,
	Suren Baghdasaryan, Dan Williams, Kees Cook, Matthew Wilcox,
	Thomas Gleixner, Daniel Xu, linux-kernel, rust-for-linux,
	linux-fsdevel

On Wed, Dec 06, 2023 at 02:59:11PM -0500, Kent Overstreet wrote:

> I suspect even if the manpower existed to go that route we'd end up
> regretting it, because then the Rust compiler would need to be able to
> handle _all_ the craziness a modern C compiler knows how to do -
> preprocessor magic/devilry isn't even the worst of it, it gets even
> worse when you start to consider things like bitfields and all the crazy
> __attribute__(()) people have invented.

Dude, clang can already handle all of that. Both rust and clang are
built on top of llvm, they generate the same IR, you can simply feed a
string into libclang and get IR out of it, which you can splice into
your rust generated IR.

* Re: [PATCH 0/7] File abstractions needed by Rust Binder
  2023-12-06 20:05   ` Kent Overstreet
@ 2023-12-08 16:59     ` Miguel Ojeda
  0 siblings, 0 replies; 96+ messages in thread
From: Miguel Ojeda @ 2023-12-08 16:59 UTC (permalink / raw)
  To: Kent Overstreet
  Cc: Christian Brauner, Alice Ryhl, Miguel Ojeda, Alex Gaynor,
	Wedson Almeida Filho, Boqun Feng, Gary Guo, Björn Roy Baron,
	Benno Lossin, Andreas Hindborg, Peter Zijlstra, Alexander Viro,
	Greg Kroah-Hartman, Arve Hjønnevåg, Todd Kjos,
	Martijn Coenen, Joel Fernandes, Carlos Llamas,
	Suren Baghdasaryan, Dan Williams, Kees Cook, Matthew Wilcox,
	Thomas Gleixner, Daniel Xu, linux-kernel, rust-for-linux,
	linux-fsdevel

On Wed, Dec 6, 2023 at 9:05 PM Kent Overstreet
<kent.overstreet@linux.dev> wrote:
>
> I spoke to Miguel about this and it was my understanding that everything
> was in place for moving Rust wrappers to the proper directory -
> previously there was build system stuff blocking, but he said that's all
> working now. Perhaps the memo just didn't get passed down?

No, it is being worked on (please see my sibling reply).

> (My vote would actually be for fs/ directly, not fs/rust, and a 1:1
> mapping between .c files and the .rs files that wrap them).

Thanks Kent for voting :)

Though note that an exact 1:1 mapping is going to be hard, e.g.
consider nested Rust submodules which would go in folders or
abstractions that you may arrange differently even if they wrap the
same concepts.

But, yeah, one should try to avoid diverging without a good reason,
of course, especially in the beginning.

Cheers,
Miguel

* Re: [PATCH 5/7] rust: file: add `Kuid` wrapper
  2023-12-08 16:19           ` Miguel Ojeda
@ 2023-12-08 17:08             ` Nick Desaulniers
  2023-12-08 17:37               ` Miguel Ojeda
                                 ` (2 more replies)
  2023-12-09  7:24             ` comex
  1 sibling, 3 replies; 96+ messages in thread
From: Nick Desaulniers @ 2023-12-08 17:08 UTC (permalink / raw)
  To: Miguel Ojeda
  Cc: comex, Christian Brauner, Peter Zijlstra, Alice Ryhl,
	Miguel Ojeda, Alex Gaynor, Wedson Almeida Filho, Boqun Feng,
	Gary Guo, Björn Roy Baron, Benno Lossin, Andreas Hindborg,
	Alexander Viro, Greg Kroah-Hartman, Arve Hjønnevåg,
	Todd Kjos, Martijn Coenen, Joel Fernandes, Carlos Llamas,
	Suren Baghdasaryan, Dan Williams, Kees Cook, Matthew Wilcox,
	Thomas Gleixner, Daniel Xu, linux-kernel, rust-for-linux,
	linux-fsdevel

On Fri, Dec 8, 2023 at 8:19 AM Miguel Ojeda
<miguel.ojeda.sandonis@gmail.com> wrote:
>
> On Fri, Dec 8, 2023 at 6:28 AM comex <comexk@gmail.com> wrote:
> >
> > Regarding the issue of wrappers not being inlined, it's possible to get LLVM to optimize C and Rust code together into an object file, with the help of a compatible Clang and LLD:
> >
> > @ rustc -O --emit llvm-bc a.rs
> > @ clang --target=x86_64-unknown-linux-gnu -O2 -c -emit-llvm -o b.bc b.c
> > @ ld.lld -r -o c.o a.bc b.bc
> >
> > Basically LTO but within the scope of a single object file.  This would be redundant in cases where kernel-wide LTO is enabled.
> >
> > Using this approach might slow down compilation a bit due to needing to pass the LLVM bitcode between multiple commands, but probably not very much.
> >
> > Just chiming in as someone not involved in Rust for Linux but familiar with these tools.  Perhaps this has been considered before and rejected for some reason; I wouldn’t know.
>
> Thanks comex for chiming in, much appreciated.
>
> Yeah, this is what we have been calling the "local-LTO hack" and it
> was one of the possibilities we were considering for non-LTO kernel
> builds for performance reasons originally. I don't recall who
> originally suggested it in one of our meetings (Gary or Björn
> perhaps).
>
> If LLVM folks think LLVM-wise nothing will break, then we are happy to

On paper, nothing comes to mind.  No promises though.

From a build system perspective, I'd rather just point users towards
LTO if they have this concern.  We support full and thin lto.  This
proposal would add a third variant for just rust drivers.  Each
variation on LTO has a maintenance cost and each has had its own
distinct fun bugs in the past.  Not sure an additional variant is
worth the maintenance cost, even if it's technically feasible.

> go ahead with that (since it also solves the performance side), but it
> would be nice to know if it will always be OK to build like that, i.e.
> I think Andreas actually tried it and it seemed to work and boot, but
> the worry is whether there is something subtle that could have bad
> codegen in the future.
>
> (We will also need to worry about GCC.)
>
> Cheers,
> Miguel



-- 
Thanks,
~Nick Desaulniers

* Re: [PATCH 5/7] rust: file: add `Kuid` wrapper
  2023-12-08 17:08             ` Nick Desaulniers
@ 2023-12-08 17:37               ` Miguel Ojeda
  2023-12-08 17:43               ` Boqun Feng
  2023-12-08 20:43               ` Matthew Wilcox
  2 siblings, 0 replies; 96+ messages in thread
From: Miguel Ojeda @ 2023-12-08 17:37 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: comex, Christian Brauner, Peter Zijlstra, Alice Ryhl,
	Miguel Ojeda, Alex Gaynor, Wedson Almeida Filho, Boqun Feng,
	Gary Guo, Björn Roy Baron, Benno Lossin, Andreas Hindborg,
	Alexander Viro, Greg Kroah-Hartman, Arve Hjønnevåg,
	Todd Kjos, Martijn Coenen, Joel Fernandes, Carlos Llamas,
	Suren Baghdasaryan, Dan Williams, Kees Cook, Matthew Wilcox,
	Thomas Gleixner, Daniel Xu, linux-kernel, rust-for-linux,
	linux-fsdevel

On Fri, Dec 8, 2023 at 6:09 PM Nick Desaulniers <ndesaulniers@google.com> wrote:
>
> On paper, nothing comes to mind.  No promises though.

Thanks Nick -- that is useful nevertheless.

> From a build system perspective, I'd rather just point users towards
> LTO if they have this concern.  We support full and thin lto.  This
> proposal would add a third variant for just rust drivers.  Each
> variation on LTO has a maintenance cost and each has had its own
> distinct fun bugs in the past.  Not sure an additional variant is
> worth the maintenance cost, even if it's technically feasible.

I was thinking it would be something always done for Rust object
files: under a normal "no LTO" build, the Rust object files would
always get the cross-language inlining done and therefore no extra
dimension in the matrix. Would that help?

I think it is worth at least considering, given there is also a
non-trivial amount of performance to gain if we always do it, e.g.
Andreas wanted it for non-LTO kernels for this reason.

Cheers,
Miguel

* Re: [PATCH 5/7] rust: file: add `Kuid` wrapper
  2023-12-08 17:08             ` Nick Desaulniers
  2023-12-08 17:37               ` Miguel Ojeda
@ 2023-12-08 17:43               ` Boqun Feng
  2023-12-08 20:43               ` Matthew Wilcox
  2 siblings, 0 replies; 96+ messages in thread
From: Boqun Feng @ 2023-12-08 17:43 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: Miguel Ojeda, comex, Christian Brauner, Peter Zijlstra,
	Alice Ryhl, Miguel Ojeda, Alex Gaynor, Wedson Almeida Filho,
	Gary Guo, Björn Roy Baron, Benno Lossin, Andreas Hindborg,
	Alexander Viro, Greg Kroah-Hartman, Arve Hjønnevåg,
	Todd Kjos, Martijn Coenen, Joel Fernandes, Carlos Llamas,
	Suren Baghdasaryan, Dan Williams, Kees Cook, Matthew Wilcox,
	Thomas Gleixner, Daniel Xu, linux-kernel, rust-for-linux,
	linux-fsdevel

On Fri, Dec 08, 2023 at 09:08:47AM -0800, Nick Desaulniers wrote:
> On Fri, Dec 8, 2023 at 8:19 AM Miguel Ojeda
> <miguel.ojeda.sandonis@gmail.com> wrote:
> >
> > On Fri, Dec 8, 2023 at 6:28 AM comex <comexk@gmail.com> wrote:
> > >
> > > Regarding the issue of wrappers not being inlined, it's possible to get LLVM to optimize C and Rust code together into an object file, with the help of a compatible Clang and LLD:
> > >
> > > @ rustc -O --emit llvm-bc a.rs
> > > @ clang --target=x86_64-unknown-linux-gnu -O2 -c -emit-llvm -o b.bc b.c
> > > @ ld.lld -r -o c.o a.bc b.bc
> > >
> > > Basically LTO but within the scope of a single object file.  This would be redundant in cases where kernel-wide LTO is enabled.
> > >
> > > Using this approach might slow down compilation a bit due to needing to pass the LLVM bitcode between multiple commands, but probably not very much.
> > >
> > > Just chiming in as someone not involved in Rust for Linux but familiar with these tools.  Perhaps this has been considered before and rejected for some reason; I wouldn’t know.
> >
> > Thanks comex for chiming in, much appreciated.
> >
> > Yeah, this is what we have been calling the "local-LTO hack" and it
> > was one of the possibilities we were considering for non-LTO kernel
> > builds for performance reasons originally. I don't recall who
> > originally suggested it in one of our meetings (Gary or Björn
> > perhaps).
> >
> > If LLVM folks think LLVM-wise nothing will break, then we are happy to
> 
> On paper, nothing comes to mind.  No promises though.
> 
> From a build system perspective, I'd rather just point users towards
> LTO if they have this concern.  We support full and thin lto.  This
> proposal would add a third variant for just rust drivers.  Each
> variation on LTO has a maintenance cost and each have had their own
> distinct fun bugs in the past.  Not sure an additional variant is
> worth the maintenance cost, even if it's technically feasible.
> 

Actually, the "LTO" in "local-LTO" may be misleading ;-) The problem we
want to resolve here is letting Rust code call small C functions (or
macros) without exporting the symbols. To me, it's really just "static
linking" a library (right now it's rust/helpers.o) containing small C
functions and macros used by Rust into a Rust driver kmodule; the "LTO"
part can be optional: let the linker make the call.

Regards,
Boqun

> > go ahead with that (since it also solves the performance side), but it
> > would be nice to know if it will always be OK to build like that, i.e.
> > I think Andreas actually tried it and it seemed to work and boot, but
> > the worry is whether there is something subtle that could have bad
> > codegen in the future.
> >
> > (We will also need to worry about GCC.)
> >
> > Cheers,
> > Miguel
> 
> 
> 
> -- 
> Thanks,
> ~Nick Desaulniers

* Re: [PATCH 5/7] rust: file: add `Kuid` wrapper
  2023-12-08 16:26           ` Peter Zijlstra
@ 2023-12-08 19:58             ` Kent Overstreet
  0 siblings, 0 replies; 96+ messages in thread
From: Kent Overstreet @ 2023-12-08 19:58 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Christian Brauner, Alice Ryhl, Miguel Ojeda, Alex Gaynor,
	Wedson Almeida Filho, Boqun Feng, Gary Guo, Björn Roy Baron,
	Benno Lossin, Andreas Hindborg, Alexander Viro,
	Greg Kroah-Hartman, Arve Hjønnevåg, Todd Kjos,
	Martijn Coenen, Joel Fernandes, Carlos Llamas,
	Suren Baghdasaryan, Dan Williams, Kees Cook, Matthew Wilcox,
	Thomas Gleixner, Daniel Xu, linux-kernel, rust-for-linux,
	linux-fsdevel

On Fri, Dec 08, 2023 at 05:26:16PM +0100, Peter Zijlstra wrote:
> On Wed, Dec 06, 2023 at 02:59:11PM -0500, Kent Overstreet wrote:
> 
> > I suspect even if the manpower existed to go that route we'd end up
> > regretting it, because then the Rust compiler would need to be able to
> > handle _all_ the craziness a modern C compiler knows how to do -
> > preprocessor magic/devilry isn't even the worst of it, it gets even
> > worse when you start to consider things like bitfields and all the crazy
> > __attributes__(()) people have invented.
> 
> Dude, clang can already handle all of that. Both rust and clang are
> built on top of llvm, they generate the same IR, you can simply feed a
> string into libclang and get IR out of it, which you can splice into
> your rust generated IR.

If only it were that simple :)

This is struct definitions we're talking about, not code, so what you
want isn't even IR, what you're generating is a memory layout for a
type, linked in with all your other type information.

And people criticize Linux for being a giant monorepo that makes no
considerations for making our code reusable in other contexts; clang and
LLVM are no different. But that's not really the issue because you're
going to need a huge chunk of clang to even parse this stuff, what you
really want is a way to invoke clang and dump _type information_ in a
standardized, easy to consume way. What you want is actually more akin
to the debug info that's generated today.

So... yeah, sure, lovely if it existed, but not the world we live in :)

(As an aside, I've actually got an outstanding bug filed with rustc
because it needs to be able to handle types that are marked both packed
and aligned... if anyone in this thread _does_ know some rust compiler
folks, we need that for bcachefs on disk format types).
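
For readers unfamiliar with the packed-and-aligned limitation mentioned in the aside, here is a minimal sketch. It assumes a hypothetical three-field record (the names `RecordPacked`, `Record`, `kind`, `offset` and `len` are invented for illustration and are not bcachefs types). As far as I know, rustc rejects `packed` and `align(N)` on the same type (error E0587), so the sketch uses the commonly suggested workaround of wrapping the packed struct in an aligned outer struct. Whether that is acceptable for real on-disk formats is a separate question.

```rust
use std::mem::{align_of, size_of};

/// Hypothetical on-disk record; the field names are illustrative only.
/// `packed` removes interior padding, so the fields take 1 + 4 + 2 = 7 bytes.
#[repr(C, packed)]
struct RecordPacked {
    kind: u8,
    offset: u32,
    len: u16,
}

/// Outer wrapper that restores 8-byte alignment for the whole record.
/// An aligned type may contain a packed one; the reverse nesting is rejected.
#[repr(C, align(8))]
struct Record(RecordPacked);

fn main() {
    println!(
        "packed:  size={} align={}",
        size_of::<RecordPacked>(),
        align_of::<RecordPacked>()
    );
    println!(
        "wrapped: size={} align={}",
        size_of::<Record>(),
        align_of::<Record>()
    );
}
```

The wrapper yields a type with no interior padding, 8-byte alignment, and a size rounded up to a multiple of 8, which is the layout a C struct marked both packed and aligned(8) would be expected to have. The cost is the extra level of field access (`record.0.kind`).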

* Re: [PATCH 5/7] rust: file: add `Kuid` wrapper
  2023-12-08 17:08             ` Nick Desaulniers
  2023-12-08 17:37               ` Miguel Ojeda
  2023-12-08 17:43               ` Boqun Feng
@ 2023-12-08 20:43               ` Matthew Wilcox
  2 siblings, 0 replies; 96+ messages in thread
From: Matthew Wilcox @ 2023-12-08 20:43 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: Miguel Ojeda, comex, Christian Brauner, Peter Zijlstra,
	Alice Ryhl, Miguel Ojeda, Alex Gaynor, Wedson Almeida Filho,
	Boqun Feng, Gary Guo, Björn Roy Baron, Benno Lossin,
	Andreas Hindborg, Alexander Viro, Greg Kroah-Hartman,
	Arve Hjønnevåg, Todd Kjos, Martijn Coenen,
	Joel Fernandes, Carlos Llamas, Suren Baghdasaryan, Dan Williams,
	Kees Cook, Thomas Gleixner, Daniel Xu, linux-kernel,
	rust-for-linux, linux-fsdevel

On Fri, Dec 08, 2023 at 09:08:47AM -0800, Nick Desaulniers wrote:
> From a build system perspective, I'd rather just point users towards
> LTO if they have this concern.  We support full and thin lto.  This
> proposal would add a third variant for just rust drivers.  Each
> variation on LTO has a maintenance cost and each have had their own
> distinct fun bugs in the past.  Not sure an additional variant is
> worth the maintenance cost, even if it's technically feasible.

If we're allowed to talk about ideal solutions ... I hate putting
code in header files.  I'd rather be able to put, eg:

__force_inline int put_page_testzero(struct page *page)
{
	VM_BUG_ON_PAGE(page_ref_count(page) == 0, page);
	return page_ref_dec_and_test(page);
}

__force_inline int folio_put_testzero(struct folio *folio)
{
	return put_page_testzero(&folio->page);
}

__force_inline void folio_put(struct folio *folio)
{
	if (folio_put_testzero(folio))
		__folio_put(folio);
}

into a .c file and have both C and Rust inline folio_put(),
folio_put_testzero(), put_page_testzero(), VM_BUG_ON_PAGE() and
page_ref_dec_and_test(), but not even attempt to inline __folio_put()
(because We Know Better, and have determined that is the point at
which to stop).

* Re: [PATCH 5/7] rust: file: add `Kuid` wrapper
  2023-12-08 16:19           ` Miguel Ojeda
  2023-12-08 17:08             ` Nick Desaulniers
@ 2023-12-09  7:24             ` comex
  1 sibling, 0 replies; 96+ messages in thread
From: comex @ 2023-12-09  7:24 UTC (permalink / raw)
  To: Miguel Ojeda
  Cc: Christian Brauner, Peter Zijlstra, Alice Ryhl, Miguel Ojeda,
	Alex Gaynor, Wedson Almeida Filho, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg,
	Alexander Viro, Greg Kroah-Hartman, Arve Hjønnevåg,
	Todd Kjos, Martijn Coenen, Joel Fernandes, Carlos Llamas,
	Suren Baghdasaryan, Dan Williams, Kees Cook, Matthew Wilcox,
	Thomas Gleixner, Daniel Xu, linux-kernel, rust-for-linux,
	linux-fsdevel, Nick Desaulniers

On Dec 8, 2023, at 8:19 AM, Miguel Ojeda <miguel.ojeda.sandonis@gmail.com> wrote:
> 
> If LLVM folks think LLVM-wise nothing will break, then we are happy to
> go ahead with that (since it also solves the performance side), but it
> would be nice to know if it will always be OK to build like that, i.e.
> I think Andreas actually tried it and it seemed to work and boot, but
> the worry is whether there is something subtle that could have bad
> codegen in the future.

One potential issue is incompatibility between the LLVM versions used by rustc, Clang, and LLD.  At minimum, whichever tool is reading bitcode (LLD in my example) should have an LLVM version >= that of the tools producing bitcode, since newer LLVM versions can read older bitcode but not vice versa.  But ideally the tools would all just be linked against the same copy of LLVM.

If you’re getting your tools from a distro, then that may already be true for you.  But if you’re using upstream rustc binaries, those are built against a custom branch of LLVM, which is based on upstream release versions but adds a handful of patches [1]; by policy, those patches can include cherry-picks of miscompilation fixes that are upstream but haven’t made it into a release yet [2].  Upstream rustc binaries are accompanied by a copy of LLD linked against the same LLVM library, named rust-lld, but there’s no corresponding copy of Clang [3].  I’d say that agreement between rustc and LLD is the most important thing, but it would be nice if they'd make a matching Clang available through rustup.

[1] https://github.com/llvm/llvm-project/compare/release/17.x...rust-lang:llvm-project:rustc/17.0-2023-09-19
[2] https://rustc-dev-guide.rust-lang.org/backend/updating-llvm.html#bugfix-updates
[3] https://github.com/rust-lang/rust/issues/56371
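
The version constraint described above (the tool reading bitcode needs an LLVM version at least as new as every tool producing it) can be written as a small check. This is only a sketch: the function name `reader_covers_producers` and the major version numbers are hypothetical, and comparing major versions alone does not account for vendor patches such as those on rustc's LLVM branch.

```rust
/// Returns true when a bitcode reader at `reader_major` is at least as new
/// as every bitcode producer listed in `producer_majors`.
fn reader_covers_producers(reader_major: u32, producer_majors: &[u32]) -> bool {
    producer_majors.iter().all(|&p| reader_major >= p)
}

fn main() {
    // Hypothetical LLVM major versions, for illustration only.
    let (rustc_llvm, clang_llvm, lld_llvm) = (17, 16, 17);
    let ok = reader_covers_producers(lld_llvm, &[rustc_llvm, clang_llvm]);
    println!("LLD can read both bitcode inputs: {ok}");
}
```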

Thread overview: 96+ messages
2023-11-29 12:51 [PATCH 0/7] File abstractions needed by Rust Binder Alice Ryhl
2023-11-29 12:51 ` [PATCH 1/7] rust: file: add Rust abstraction for `struct file` Alice Ryhl
2023-11-29 15:13   ` Matthew Wilcox
2023-11-29 15:23     ` Peter Zijlstra
2023-11-29 17:08       ` Boqun Feng
2023-11-30 10:42         ` Peter Zijlstra
2023-11-30 15:25           ` Boqun Feng
2023-12-01  8:53             ` Peter Zijlstra
2023-12-01  9:19               ` Boqun Feng
2023-12-01  9:40                 ` Peter Zijlstra
2023-12-01 10:36                   ` Boqun Feng
2023-12-01 11:05                     ` Peter Zijlstra
2023-12-01  9:00             ` Peter Zijlstra
2023-12-01  9:52               ` Boqun Feng
2023-11-29 16:42     ` Alice Ryhl
2023-11-29 16:45       ` Peter Zijlstra
2023-11-30 15:02     ` Benno Lossin
2023-11-29 17:06   ` Christian Brauner
2023-11-29 21:27     ` Alice Ryhl
2023-11-29 23:17       ` Benno Lossin
2023-11-30 10:48       ` Christian Brauner
2023-11-30 12:10         ` Alice Ryhl
2023-11-30 12:36           ` Christian Brauner
2023-11-30 14:53   ` Benno Lossin
2023-11-30 14:59     ` Greg Kroah-Hartman
2023-11-30 15:46       ` Benno Lossin
2023-11-30 15:56         ` Greg Kroah-Hartman
2023-11-30 15:58         ` Theodore Ts'o
2023-11-30 16:12           ` Benno Lossin
2023-12-01  1:16             ` Theodore Ts'o
2023-12-01 12:11             ` David Laight
2023-12-01 12:27               ` Alice Ryhl
2023-12-01 15:04                 ` Theodore Ts'o
2023-12-01 15:14                   ` Benno Lossin
2023-12-01 17:25                     ` David Laight
2023-12-01 17:37                       ` Benno Lossin
2023-11-29 12:51 ` [PATCH 2/7] rust: cred: add Rust abstraction for `struct cred` Alice Ryhl
2023-11-30 16:17   ` Benno Lossin
2023-12-01  9:06     ` Alice Ryhl
2023-12-01 10:27       ` Christian Brauner
2023-12-04 15:42         ` Alice Ryhl
2023-11-29 13:11 ` [PATCH 3/7] rust: security: add abstraction for secctx Alice Ryhl
2023-11-30 16:26   ` Benno Lossin
2023-12-01 10:48     ` Alice Ryhl
2023-12-02 10:03       ` Benno Lossin
2023-11-29 13:11 ` [PATCH 4/7] rust: file: add `FileDescriptorReservation` Alice Ryhl
2023-11-29 16:14   ` Christian Brauner
2023-11-29 16:55     ` Alice Ryhl
2023-11-29 17:14       ` Alice Ryhl
2023-11-30  9:12         ` Christian Brauner
2023-11-30  9:23           ` Alice Ryhl
2023-11-30  9:09       ` Christian Brauner
2023-11-30  9:17         ` Alice Ryhl
2023-11-30 10:51           ` Christian Brauner
2023-11-30 11:54             ` Alice Ryhl
2023-11-30 12:17               ` Benno Lossin
2023-11-30 12:33                 ` Christian Brauner
2023-11-30 16:40   ` Benno Lossin
2023-12-01 11:32     ` Alice Ryhl
2023-11-29 13:12 ` [PATCH 5/7] rust: file: add `Kuid` wrapper Alice Ryhl
2023-11-29 16:28   ` Christian Brauner
2023-11-29 16:48     ` Peter Zijlstra
2023-11-30 12:46       ` Christian Brauner
2023-12-06 19:59         ` Kent Overstreet
2023-12-08 16:26           ` Peter Zijlstra
2023-12-08 19:58             ` Kent Overstreet
2023-12-08  5:28         ` comex
2023-12-08 16:19           ` Miguel Ojeda
2023-12-08 17:08             ` Nick Desaulniers
2023-12-08 17:37               ` Miguel Ojeda
2023-12-08 17:43               ` Boqun Feng
2023-12-08 20:43               ` Matthew Wilcox
2023-12-09  7:24             ` comex
2023-11-30  9:36     ` Alice Ryhl
2023-11-30 10:52       ` Christian Brauner
2023-11-30 10:36   ` Peter Zijlstra
2023-12-06 20:02     ` Kent Overstreet
2023-12-07  7:18       ` Greg Kroah-Hartman
2023-12-07  7:46         ` Kent Overstreet
2023-11-30 16:48   ` Benno Lossin
2023-11-29 13:12 ` [PATCH 6/7] rust: file: add `DeferredFdCloser` Alice Ryhl
2023-11-30 17:12   ` Benno Lossin
2023-12-01 11:35     ` Alice Ryhl
2023-12-02 10:16       ` Benno Lossin
2023-12-05 14:43         ` Alice Ryhl
2023-12-05 18:16           ` Alice Ryhl
2023-11-29 13:12 ` [PATCH 7/7] rust: file: add abstraction for `poll_table` Alice Ryhl
2023-11-30 17:42   ` Benno Lossin
2023-12-01 11:47     ` Alice Ryhl
2023-11-30 22:39   ` Boqun Feng
2023-12-01 11:50     ` Alice Ryhl
2023-11-30 22:50   ` Boqun Feng
2023-11-29 16:31 ` [PATCH 0/7] File abstractions needed by Rust Binder Christian Brauner
2023-11-29 16:48   ` Miguel Ojeda
2023-12-06 20:05   ` Kent Overstreet
2023-12-08 16:59     ` Miguel Ojeda
