rust-for-linux.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/3] Memory management patches needed by Rust Binder
@ 2024-01-24 11:20 Alice Ryhl
  2024-01-24 11:20 ` [PATCH 1/3] rust: add userspace pointers Alice Ryhl
                   ` (2 more replies)
  0 siblings, 3 replies; 40+ messages in thread
From: Alice Ryhl @ 2024-01-24 11:20 UTC (permalink / raw)
  To: Miguel Ojeda, Alex Gaynor, Wedson Almeida Filho, Boqun Feng,
	Gary Guo, Björn Roy Baron, Benno Lossin, Andreas Hindborg,
	Kees Cook, Al Viro, Andrew Morton
  Cc: Greg Kroah-Hartman, Arve Hjønnevåg, Todd Kjos,
	Martijn Coenen, Joel Fernandes, Carlos Llamas,
	Suren Baghdasaryan, Arnd Bergmann, linux-mm, linux-kernel,
	rust-for-linux, Alice Ryhl, Christian Brauner

This patchset contains some abstractions needed by the Rust
implementation of the Binder driver for passing data between userspace,
kernelspace, and directly into other processes.

These abstractions do not exactly match what was included in the Rust
Binder RFC - I have made various improvements and simplifications since
then. Nonetheless, please see the Rust Binder RFC [1] to get an
understanding for how this will be used:

Users of "rust: add userspace pointers"
     and "rust: add typed accessors for userspace pointers":
	rust_binder: add binderfs support to Rust binder
	rust_binder: add threading support
	rust_binder: add nodes and context managers
	rust_binder: add oneway transactions
	rust_binder: add death notifications
	rust_binder: send nodes in transactions
	rust_binder: add BINDER_TYPE_PTR support
	rust_binder: add BINDER_TYPE_FDA support
	rust_binder: add process freezing

Users of "rust: add abstraction for `struct page`":
	rust_binder: add oneway transactions
	rust_binder: add vma shrinker

Especially the second patch with typed accessors needs review. It
contains a Rust analogy for when the C-side skips `check_object_size`,
and I would be very happy to receive feedback about whether I am going
about this in a reasonable way.

Links: https://lore.kernel.org/rust-for-linux/20231101-rust-binder-v1-0-08ba9197f637@google.com/ [1]
Signed-off-by: Alice Ryhl <aliceryhl@google.com>
---
Alice Ryhl (2):
      rust: add typed accessors for userspace pointers
      rust: add abstraction for `struct page`

Wedson Almeida Filho (1):
      rust: add userspace pointers

 rust/bindings/bindings_helper.h |   1 +
 rust/helpers.c                  |  68 ++++++++
 rust/kernel/lib.rs              |   2 +
 rust/kernel/page.rs             | 176 ++++++++++++++++++++
 rust/kernel/user_ptr.rs         | 347 ++++++++++++++++++++++++++++++++++++++++
 5 files changed, 594 insertions(+)
---
base-commit: 6613476e225e090cc9aad49be7fa504e290dd33d
change-id: 20231128-alice-mm-bc533456cee8

Best regards,
-- 
Alice Ryhl <aliceryhl@google.com>


^ permalink raw reply	[flat|nested] 40+ messages in thread

* [PATCH 1/3] rust: add userspace pointers
  2024-01-24 11:20 [PATCH 0/3] Memory management patches needed by Rust Binder Alice Ryhl
@ 2024-01-24 11:20 ` Alice Ryhl
  2024-01-24 23:12   ` Valentin Obst
  2024-02-01  4:06   ` Trevor Gross
  2024-01-24 11:20 ` [PATCH 2/3] rust: add typed accessors for " Alice Ryhl
  2024-01-24 11:20 ` [PATCH 3/3] rust: add abstraction for `struct page` Alice Ryhl
  2 siblings, 2 replies; 40+ messages in thread
From: Alice Ryhl @ 2024-01-24 11:20 UTC (permalink / raw)
  To: Miguel Ojeda, Alex Gaynor, Wedson Almeida Filho, Boqun Feng,
	Gary Guo, Björn Roy Baron, Benno Lossin, Andreas Hindborg,
	Kees Cook, Al Viro, Andrew Morton
  Cc: Greg Kroah-Hartman, Arve Hjønnevåg, Todd Kjos,
	Martijn Coenen, Joel Fernandes, Carlos Llamas,
	Suren Baghdasaryan, Arnd Bergmann, linux-mm, linux-kernel,
	rust-for-linux, Alice Ryhl, Christian Brauner

From: Wedson Almeida Filho <wedsonaf@gmail.com>

A pointer to an area in userspace memory, which can be either read-only
or read-write.

All methods on this struct are safe: invalid pointers return `EFAULT`.
Concurrent access, *including data races to/from userspace memory*, is
permitted, because fundamentally another userspace thread/process could
always be modifying memory at the same time (in the same way that
userspace Rust's `std::io` permits data races with the contents of
files on disk). In the presence of a race, the exact byte values
read/written are unspecified but the operation is well-defined.
Kernelspace code should validate its copy of data after completing a
read, and not expect that multiple reads of the same address will return
the same value.

These APIs are designed to make it difficult to accidentally write
TOCTOU bugs. Every time you read from a memory location, the pointer is
advanced by the length so that you cannot use that reader to read the
same memory location twice. Preventing double-fetches avoids TOCTOU
bugs. This is accomplished by taking `self` by value to prevent
obtaining multiple readers on a given `UserSlicePtr`, and the readers
only permitting forward reads. If double-fetching a memory location is
necessary for some reason, then that is done by creating multiple
readers to the same memory location.

Constructing a `UserSlicePtr` performs no checks on the provided
address and length, it can safely be constructed inside a kernel thread
with no current userspace process. Reads and writes wrap the kernel APIs
`copy_from_user` and `copy_to_user`, which check the memory map of the
current process and enforce that the address range is within the user
range (no additional calls to `access_ok` are needed).

This code is based on something that was originally written by Wedson on
the old rust branch. It was modified by Alice by removing the
`IoBufferReader` and `IoBufferWriter` traits, introducing the
`MAX_USER_OP_LEN` constant, and various changes to the comments and
documentation.

Signed-off-by: Wedson Almeida Filho <wedsonaf@gmail.com>
Co-developed-by: Alice Ryhl <aliceryhl@google.com>
Signed-off-by: Alice Ryhl <aliceryhl@google.com>
---
 rust/helpers.c          |  14 +++
 rust/kernel/lib.rs      |   1 +
 rust/kernel/user_ptr.rs | 222 ++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 237 insertions(+)

diff --git a/rust/helpers.c b/rust/helpers.c
index 70e59efd92bc..312b6fcb49d5 100644
--- a/rust/helpers.c
+++ b/rust/helpers.c
@@ -38,6 +38,20 @@ __noreturn void rust_helper_BUG(void)
 }
 EXPORT_SYMBOL_GPL(rust_helper_BUG);
 
+unsigned long rust_helper_copy_from_user(void *to, const void __user *from,
+					 unsigned long n)
+{
+	return copy_from_user(to, from, n);
+}
+EXPORT_SYMBOL_GPL(rust_helper_copy_from_user);
+
+unsigned long rust_helper_copy_to_user(void __user *to, const void *from,
+				       unsigned long n)
+{
+	return copy_to_user(to, from, n);
+}
+EXPORT_SYMBOL_GPL(rust_helper_copy_to_user);
+
 void rust_helper_mutex_lock(struct mutex *lock)
 {
 	mutex_lock(lock);
diff --git a/rust/kernel/lib.rs b/rust/kernel/lib.rs
index 7ac39874aeac..041233305fda 100644
--- a/rust/kernel/lib.rs
+++ b/rust/kernel/lib.rs
@@ -50,6 +50,7 @@
 pub mod sync;
 pub mod task;
 pub mod types;
+pub mod user_ptr;
 pub mod workqueue;
 
 #[doc(hidden)]
diff --git a/rust/kernel/user_ptr.rs b/rust/kernel/user_ptr.rs
new file mode 100644
index 000000000000..00aa26aa6a83
--- /dev/null
+++ b/rust/kernel/user_ptr.rs
@@ -0,0 +1,222 @@
+// SPDX-License-Identifier: GPL-2.0
+
+//! User pointers.
+//!
+//! C header: [`include/linux/uaccess.h`](../../../../include/linux/uaccess.h)
+
+// Comparison with MAX_USER_OP_LEN triggers this lint on platforms
+// where `c_ulong == usize`.
+#![allow(clippy::absurd_extreme_comparisons)]
+
+use crate::{bindings, error::code::*, error::Result};
+use alloc::vec::Vec;
+use core::ffi::{c_ulong, c_void};
+
+/// The maximum length of a operation using `copy_[from|to]_user`.
+///
+/// If a usize is not greater than this constant, then casting it to `c_ulong`
+/// is guaranteed to be lossless.
+const MAX_USER_OP_LEN: usize = c_ulong::MAX as usize;
+
+/// A pointer to an area in userspace memory, which can be either read-only or
+/// read-write.
+///
+/// All methods on this struct are safe: invalid pointers return `EFAULT`.
+/// Concurrent access, *including data races to/from userspace memory*, is
+/// permitted, because fundamentally another userspace thread/process could
+/// always be modifying memory at the same time (in the same way that userspace
+/// Rust's [`std::io`] permits data races with the contents of files on disk).
+/// In the presence of a race, the exact byte values read/written are
+/// unspecified but the operation is well-defined. Kernelspace code should
+/// validate its copy of data after completing a read, and not expect that
+/// multiple reads of the same address will return the same value.
+///
+/// These APIs are designed to make it difficult to accidentally write TOCTOU
+/// bugs. Every time you read from a memory location, the pointer is advanced by
+/// the length so that you cannot use that reader to read the same memory
+/// location twice. Preventing double-fetches avoids TOCTOU bugs. This is
+/// accomplished by taking `self` by value to prevent obtaining multiple readers
+/// on a given [`UserSlicePtr`], and the readers only permitting forward reads.
+/// If double-fetching a memory location is necessary for some reason, then that
+/// is done by creating multiple readers to the same memory location, e.g. using
+/// [`clone_reader`].
+///
+/// Constructing a [`UserSlicePtr`] performs no checks on the provided address
+/// and length, it can safely be constructed inside a kernel thread with no
+/// current userspace process. Reads and writes wrap the kernel APIs
+/// `copy_from_user` and `copy_to_user`, which check the memory map of the
+/// current process and enforce that the address range is within the user range
+/// (no additional calls to `access_ok` are needed).
+///
+/// [`std::io`]: https://doc.rust-lang.org/std/io/index.html
+/// [`clone_reader`]: UserSlicePtrReader::clone_reader
+pub struct UserSlicePtr(*mut c_void, usize);
+
+impl UserSlicePtr {
+    /// Constructs a user slice from a raw pointer and a length in bytes.
+    ///
+    /// Callers must be careful to avoid time-of-check-time-of-use
+    /// (TOCTOU) issues. The simplest way is to create a single instance of
+    /// [`UserSlicePtr`] per user memory block as it reads each byte at
+    /// most once.
+    pub fn new(ptr: *mut c_void, length: usize) -> Self {
+        UserSlicePtr(ptr, length)
+    }
+
+    /// Reads the entirety of the user slice.
+    ///
+    /// Returns `EFAULT` if the address does not currently point to
+    /// mapped, readable memory.
+    pub fn read_all(self) -> Result<Vec<u8>> {
+        self.reader().read_all()
+    }
+
+    /// Constructs a [`UserSlicePtrReader`].
+    pub fn reader(self) -> UserSlicePtrReader {
+        UserSlicePtrReader(self.0, self.1)
+    }
+
+    /// Constructs a [`UserSlicePtrWriter`].
+    pub fn writer(self) -> UserSlicePtrWriter {
+        UserSlicePtrWriter(self.0, self.1)
+    }
+
+    /// Constructs both a [`UserSlicePtrReader`] and a [`UserSlicePtrWriter`].
+    pub fn reader_writer(self) -> (UserSlicePtrReader, UserSlicePtrWriter) {
+        (
+            UserSlicePtrReader(self.0, self.1),
+            UserSlicePtrWriter(self.0, self.1),
+        )
+    }
+}
+
+/// A reader for [`UserSlicePtr`].
+///
+/// Used to incrementally read from the user slice.
+pub struct UserSlicePtrReader(*mut c_void, usize);
+
+impl UserSlicePtrReader {
+    /// Skip the provided number of bytes.
+    ///
+    /// Returns an error if skipping more than the length of the buffer.
+    pub fn skip(&mut self, num_skip: usize) -> Result {
+        // Update `self.1` first since that's the fallible one.
+        self.1 = self.1.checked_sub(num_skip).ok_or(EFAULT)?;
+        self.0 = self.0.wrapping_add(num_skip);
+        Ok(())
+    }
+
+    /// Create a reader that can access the same range of data.
+    ///
+    /// Reading from the clone does not advance the current reader.
+    ///
+    /// The caller should take care to not introduce TOCTOU issues.
+    pub fn clone_reader(&self) -> UserSlicePtrReader {
+        UserSlicePtrReader(self.0, self.1)
+    }
+
+    /// Returns the number of bytes left to be read from this.
+    ///
+    /// Note that even reading less than this number of bytes may fail.
+    pub fn len(&self) -> usize {
+        self.1
+    }
+
+    /// Returns `true` if no data is available in the io buffer.
+    pub fn is_empty(&self) -> bool {
+        self.1 == 0
+    }
+
+    /// Reads raw data from the user slice into a raw kernel buffer.
+    ///
+    /// Fails with `EFAULT` if the read encounters a page fault.
+    ///
+    /// # Safety
+    ///
+    /// The `out` pointer must be valid for writing `len` bytes.
+    pub unsafe fn read_raw(&mut self, out: *mut u8, len: usize) -> Result {
+        if len > self.1 || len > MAX_USER_OP_LEN {
+            return Err(EFAULT);
+        }
+        // SAFETY: The caller promises that `out` is valid for writing `len` bytes.
+        let res = unsafe { bindings::copy_from_user(out.cast::<c_void>(), self.0, len as c_ulong) };
+        if res != 0 {
+            return Err(EFAULT);
+        }
+        // Since this is not a pointer to a valid object in our program,
+        // we cannot use `add`, which has C-style rules for defined
+        // behavior.
+        self.0 = self.0.wrapping_add(len);
+        self.1 -= len;
+        Ok(())
+    }
+
+    /// Reads all remaining data in the buffer into a vector.
+    ///
+    /// Fails with `EFAULT` if the read encounters a page fault.
+    pub fn read_all(&mut self) -> Result<Vec<u8>> {
+        let len = self.len();
+        let mut data = Vec::<u8>::try_with_capacity(len)?;
+
+        // SAFETY: The output buffer is valid for `len` bytes as we just allocated that much space.
+        unsafe { self.read_raw(data.as_mut_ptr(), len)? };
+
+        // SAFETY: Since the call to `read_raw` was successful, the first `len` bytes of the vector
+        // have been initialized.
+        unsafe { data.set_len(len) };
+        Ok(data)
+    }
+}
+
+/// A writer for [`UserSlicePtr`].
+///
+/// Used to incrementally write into the user slice.
+pub struct UserSlicePtrWriter(*mut c_void, usize);
+
+impl UserSlicePtrWriter {
+    /// Returns the amount of space remaining in this buffer.
+    ///
+    /// Note that even writing less than this number of bytes may fail.
+    pub fn len(&self) -> usize {
+        self.1
+    }
+
+    /// Returns `true` if no more data can be written to this buffer.
+    pub fn is_empty(&self) -> bool {
+        self.1 == 0
+    }
+
+    /// Writes raw data to this user pointer from a raw kernel buffer.
+    ///
+    /// Fails with `EFAULT` if the write encounters a page fault.
+    ///
+    /// # Safety
+    ///
+    /// The `data` pointer must be valid for reading `len` bytes.
+    pub unsafe fn write_raw(&mut self, data: *const u8, len: usize) -> Result {
+        if len > self.1 || len > MAX_USER_OP_LEN {
+            return Err(EFAULT);
+        }
+        let res = unsafe { bindings::copy_to_user(self.0, data.cast::<c_void>(), len as c_ulong) };
+        if res != 0 {
+            return Err(EFAULT);
+        }
+        // Since this is not a pointer to a valid object in our program,
+        // we cannot use `add`, which has C-style rules for defined
+        // behavior.
+        self.0 = self.0.wrapping_add(len);
+        self.1 -= len;
+        Ok(())
+    }
+
+    /// Writes the provided slice to this user pointer.
+    ///
+    /// Fails with `EFAULT` if the write encounters a page fault.
+    pub fn write_slice(&mut self, data: &[u8]) -> Result {
+        let len = data.len();
+        let ptr = data.as_ptr();
+        // SAFETY: The pointer originates from a reference to a slice of length
+        // `len`, so the pointer is valid for reading `len` bytes.
+        unsafe { self.write_raw(ptr, len) }
+    }
+}

-- 
2.43.0.429.g432eaa2c6b-goog


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH 2/3] rust: add typed accessors for userspace pointers
  2024-01-24 11:20 [PATCH 0/3] Memory management patches needed by Rust Binder Alice Ryhl
  2024-01-24 11:20 ` [PATCH 1/3] rust: add userspace pointers Alice Ryhl
@ 2024-01-24 11:20 ` Alice Ryhl
  2024-01-24 23:46   ` Valentin Obst
                     ` (2 more replies)
  2024-01-24 11:20 ` [PATCH 3/3] rust: add abstraction for `struct page` Alice Ryhl
  2 siblings, 3 replies; 40+ messages in thread
From: Alice Ryhl @ 2024-01-24 11:20 UTC (permalink / raw)
  To: Miguel Ojeda, Alex Gaynor, Wedson Almeida Filho, Boqun Feng,
	Gary Guo, Björn Roy Baron, Benno Lossin, Andreas Hindborg,
	Kees Cook, Al Viro, Andrew Morton
  Cc: Greg Kroah-Hartman, Arve Hjønnevåg, Todd Kjos,
	Martijn Coenen, Joel Fernandes, Carlos Llamas,
	Suren Baghdasaryan, Arnd Bergmann, linux-mm, linux-kernel,
	rust-for-linux, Alice Ryhl, Christian Brauner

Add safe methods for reading and writing Rust values to and from
userspace pointers.

The C methods for copying to/from userspace use a function called
`check_object_size` to verify that the kernel pointer is not dangling.
However, this check is skipped when the length is a compile-time
constant, with the assumption that such cases trivially have a correct
kernel pointer.

In this patch, we apply the same optimization to the typed accessors.
For both methods, the size of the operation is known at compile time to
be size_of of the type being read or written. Since the C side doesn't
provide a variant that skips only this check, we create custom helpers
for this purpose.

The majority of reads and writes to userspace pointers in the Rust
Binder driver uses these accessor methods. Benchmarking has found that
skipping the `check_object_size` check makes a big difference for the
cases being skipped here. (And that the check doesn't make a difference
for the cases that use the raw read/write methods.)

This code is based on something that was originally written by Wedson on
the old rust branch. It was modified by Alice to skip the
`check_object_size` check, and to update various comments, including the
notes about kernel pointers in `WritableToBytes`.

Co-developed-by: Wedson Almeida Filho <wedsonaf@gmail.com>
Signed-off-by: Wedson Almeida Filho <wedsonaf@gmail.com>
Signed-off-by: Alice Ryhl <aliceryhl@google.com>
---
 rust/helpers.c          |  34 +++++++++++++
 rust/kernel/user_ptr.rs | 125 ++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 159 insertions(+)

diff --git a/rust/helpers.c b/rust/helpers.c
index 312b6fcb49d5..187f445fbf19 100644
--- a/rust/helpers.c
+++ b/rust/helpers.c
@@ -52,6 +52,40 @@ unsigned long rust_helper_copy_to_user(void __user *to, const void *from,
 }
 EXPORT_SYMBOL_GPL(rust_helper_copy_to_user);
 
+/*
+ * These methods skip the `check_object_size` check that `copy_[to|from]_user`
+ * normally performs. In C, these checks are skipped whenever the length is a
+ * compile-time constant, since when that is the case, the kernel pointer
+ * usually points at a local variable that is being initialized and the kernel
+ * pointer is trivially non-dangling.
+ *
+ * These helpers serve the same purpose in Rust. Whenever the length is known at
+ * compile-time, we call this helper to skip the check.
+ */
+unsigned long rust_helper_copy_from_user_unsafe_skip_check_object_size(void *to, const void __user *from, unsigned long n)
+{
+	unsigned long res;
+
+	might_fault();
+	instrument_copy_from_user_before(to, from, n);
+	if (should_fail_usercopy())
+		return n;
+	res = raw_copy_from_user(to, from, n);
+	instrument_copy_from_user_after(to, from, n, res);
+	return res;
+}
+EXPORT_SYMBOL_GPL(rust_helper_copy_from_user_unsafe_skip_check_object_size);
+
+unsigned long rust_helper_copy_to_user_unsafe_skip_check_object_size(void __user *to, const void *from, unsigned long n)
+{
+	might_fault();
+	if (should_fail_usercopy())
+		return n;
+	instrument_copy_to_user(to, from, n);
+	return raw_copy_to_user(to, from, n);
+}
+EXPORT_SYMBOL_GPL(rust_helper_copy_to_user_unsafe_skip_check_object_size);
+
 void rust_helper_mutex_lock(struct mutex *lock)
 {
 	mutex_lock(lock);
diff --git a/rust/kernel/user_ptr.rs b/rust/kernel/user_ptr.rs
index 00aa26aa6a83..daa46abe5525 100644
--- a/rust/kernel/user_ptr.rs
+++ b/rust/kernel/user_ptr.rs
@@ -11,6 +11,7 @@
 use crate::{bindings, error::code::*, error::Result};
 use alloc::vec::Vec;
 use core::ffi::{c_ulong, c_void};
+use core::mem::{size_of, MaybeUninit};
 
 /// The maximum length of a operation using `copy_[from|to]_user`.
 ///
@@ -151,6 +152,36 @@ pub unsafe fn read_raw(&mut self, out: *mut u8, len: usize) -> Result {
         Ok(())
     }
 
+    /// Reads a value of the specified type.
+    ///
+    /// Fails with `EFAULT` if the read encounters a page fault.
+    pub fn read<T: ReadableFromBytes>(&mut self) -> Result<T> {
+        if size_of::<T>() > self.1 || size_of::<T>() > MAX_USER_OP_LEN {
+            return Err(EFAULT);
+        }
+        let mut out: MaybeUninit<T> = MaybeUninit::uninit();
+        // SAFETY: The local variable `out` is valid for writing `size_of::<T>()` bytes.
+        let res = unsafe {
+            bindings::copy_from_user_unsafe_skip_check_object_size(
+                out.as_mut_ptr().cast::<c_void>(),
+                self.0,
+                size_of::<T>() as c_ulong,
+            )
+        };
+        if res != 0 {
+            return Err(EFAULT);
+        }
+        // Since this is not a pointer to a valid object in our program,
+        // we cannot use `add`, which has C-style rules for defined
+        // behavior.
+        self.0 = self.0.wrapping_add(size_of::<T>());
+        self.1 -= size_of::<T>();
+        // SAFETY: The read above has initialized all bytes in `out`, and since
+        // `T` implements `ReadableFromBytes`, any bit-pattern is a valid value
+        // for this type.
+        Ok(unsafe { out.assume_init() })
+    }
+
     /// Reads all remaining data in the buffer into a vector.
     ///
     /// Fails with `EFAULT` if the read encounters a page fault.
@@ -219,4 +250,98 @@ pub fn write_slice(&mut self, data: &[u8]) -> Result {
         // `len`, so the pointer is valid for reading `len` bytes.
         unsafe { self.write_raw(ptr, len) }
     }
+
+    /// Writes the provided Rust value to this userspace pointer.
+    ///
+    /// Fails with `EFAULT` if the write encounters a page fault.
+    pub fn write<T: WritableToBytes>(&mut self, value: &T) -> Result {
+        if size_of::<T>() > self.1 || size_of::<T>() > MAX_USER_OP_LEN {
+            return Err(EFAULT);
+        }
+        // SAFETY: The reference points to a value of type `T`, so it is valid
+        // for reading `size_of::<T>()` bytes.
+        let res = unsafe {
+            bindings::copy_to_user_unsafe_skip_check_object_size(
+                self.0,
+                (value as *const T).cast::<c_void>(),
+                size_of::<T>() as c_ulong,
+            )
+        };
+        if res != 0 {
+            return Err(EFAULT);
+        }
+        // Since this is not a pointer to a valid object in our program,
+        // we cannot use `add`, which has C-style rules for defined
+        // behavior.
+        self.0 = self.0.wrapping_add(size_of::<T>());
+        self.1 -= size_of::<T>();
+        Ok(())
+    }
 }
+
+/// Specifies that a type is safely readable from bytes.
+///
+/// Not all types are valid for all values. For example, a `bool` must be either
+/// zero or one, so reading arbitrary bytes into something that contains a
+/// `bool` is not okay.
+///
+/// It's okay for the type to have padding, as initializing those bytes has no
+/// effect.
+///
+/// # Safety
+///
+/// All bit-patterns must be valid for this type.
+pub unsafe trait ReadableFromBytes {}
+
+// SAFETY: All bit patterns are acceptable values of the types below.
+unsafe impl ReadableFromBytes for u8 {}
+unsafe impl ReadableFromBytes for u16 {}
+unsafe impl ReadableFromBytes for u32 {}
+unsafe impl ReadableFromBytes for u64 {}
+unsafe impl ReadableFromBytes for usize {}
+unsafe impl ReadableFromBytes for i8 {}
+unsafe impl ReadableFromBytes for i16 {}
+unsafe impl ReadableFromBytes for i32 {}
+unsafe impl ReadableFromBytes for i64 {}
+unsafe impl ReadableFromBytes for isize {}
+// SAFETY: If all bit patterns are acceptable for individual values in an array,
+// then all bit patterns are also acceptable for arrays of that type.
+unsafe impl<T: ReadableFromBytes> ReadableFromBytes for [T] {}
+unsafe impl<T: ReadableFromBytes, const N: usize> ReadableFromBytes for [T; N] {}
+
+/// Specifies that a type is safely writable to bytes.
+///
+/// If a struct implements this trait, then it is okay to copy it byte-for-byte
+/// to userspace. This means that it should not have any padding, as padding
+/// bytes are uninitialized. Reading uninitialized memory is not just undefined
+/// behavior, it may even lead to leaking sensitive information on the stack to
+/// userspace.
+///
+/// The struct should also not hold kernel pointers, as kernel pointer addresses
+/// are also considered sensitive. However, leaking kernel pointers is not
+/// considered undefined behavior by Rust, so this is a correctness requirement,
+/// but not a safety requirement.
+///
+/// # Safety
+///
+/// Values of this type may not contain any uninitialized bytes.
+pub unsafe trait WritableToBytes {}
+
+// SAFETY: Instances of the following types have no uninitialized portions.
+unsafe impl WritableToBytes for u8 {}
+unsafe impl WritableToBytes for u16 {}
+unsafe impl WritableToBytes for u32 {}
+unsafe impl WritableToBytes for u64 {}
+unsafe impl WritableToBytes for usize {}
+unsafe impl WritableToBytes for i8 {}
+unsafe impl WritableToBytes for i16 {}
+unsafe impl WritableToBytes for i32 {}
+unsafe impl WritableToBytes for i64 {}
+unsafe impl WritableToBytes for isize {}
+unsafe impl WritableToBytes for bool {}
+unsafe impl WritableToBytes for char {}
+unsafe impl WritableToBytes for str {}
+// SAFETY: If individual values in an array have no uninitialized portions, then
+// the the array itself does not have any uninitialized portions either.
+unsafe impl<T: WritableToBytes> WritableToBytes for [T] {}
+unsafe impl<T: WritableToBytes, const N: usize> WritableToBytes for [T; N] {}

-- 
2.43.0.429.g432eaa2c6b-goog


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH 3/3] rust: add abstraction for `struct page`
  2024-01-24 11:20 [PATCH 0/3] Memory management patches needed by Rust Binder Alice Ryhl
  2024-01-24 11:20 ` [PATCH 1/3] rust: add userspace pointers Alice Ryhl
  2024-01-24 11:20 ` [PATCH 2/3] rust: add typed accessors for " Alice Ryhl
@ 2024-01-24 11:20 ` Alice Ryhl
  2024-01-26  0:46   ` Boqun Feng
                     ` (2 more replies)
  2 siblings, 3 replies; 40+ messages in thread
From: Alice Ryhl @ 2024-01-24 11:20 UTC (permalink / raw)
  To: Miguel Ojeda, Alex Gaynor, Wedson Almeida Filho, Boqun Feng,
	Gary Guo, Björn Roy Baron, Benno Lossin, Andreas Hindborg,
	Kees Cook, Al Viro, Andrew Morton
  Cc: Greg Kroah-Hartman, Arve Hjønnevåg, Todd Kjos,
	Martijn Coenen, Joel Fernandes, Carlos Llamas,
	Suren Baghdasaryan, Arnd Bergmann, linux-mm, linux-kernel,
	rust-for-linux, Alice Ryhl, Christian Brauner

Adds a new struct called `Page` that wraps a pointer to `struct page`.
This struct is assumed to hold ownership over the page, so that Rust
code can allocate and manage pages directly.

The page type has various methods for reading and writing into the page.
These methods will temporarily map the page to allow the operation. All
of these methods use a helper that takes an offset and length, performs
bounds checks, and returns a pointer to the given offset in the page.

This patch only adds support for pages of order zero, as that is all
Rust Binder needs. However, it is written to make it easy to add support
for higher-order pages in the future. To do that, you would add a const
generic parameter to `Page` that specifies the order. Most of the
methods do not need to be adjusted, as the logic for dealing with
mapping multiple pages at once can be isolated to just the
`with_pointer_into_page` method. Finally, the struct can be renamed to
`Pages<ORDER>`, and the type alias `Page = Pages<0>` can be introduced.

Rust Binder needs to manage pages directly as that is how transactions
are delivered: Each process has an mmap'd region for incoming
transactions. When an incoming transaction arrives, the Binder driver
will choose a region in the mmap, allocate and map the relevant pages
manually, and copy the incoming transaction directly into the page. This
architecture allows the driver to copy transactions directly from the
address space of one process to another, without an intermediate copy
to a kernel buffer.

This code is based on Wedson's page abstractions from the old rust
branch, but it has been modified by Alice by removing the incomplete
support for higher-order pages, and by introducing the `with_*` helpers
to consolidate the bounds checking logic into a single place.

Co-developed-by: Wedson Almeida Filho <wedsonaf@gmail.com>
Signed-off-by: Wedson Almeida Filho <wedsonaf@gmail.com>
Signed-off-by: Alice Ryhl <aliceryhl@google.com>
---
 rust/bindings/bindings_helper.h |   1 +
 rust/helpers.c                  |  20 +++++
 rust/kernel/lib.rs              |   1 +
 rust/kernel/page.rs             | 176 ++++++++++++++++++++++++++++++++++++++++
 4 files changed, 198 insertions(+)

diff --git a/rust/bindings/bindings_helper.h b/rust/bindings/bindings_helper.h
index c0cb4b05b918..7698f5b349d3 100644
--- a/rust/bindings/bindings_helper.h
+++ b/rust/bindings/bindings_helper.h
@@ -21,3 +21,4 @@
 const size_t RUST_CONST_HELPER_ARCH_SLAB_MINALIGN = ARCH_SLAB_MINALIGN;
 const gfp_t RUST_CONST_HELPER_GFP_KERNEL = GFP_KERNEL;
 const gfp_t RUST_CONST_HELPER___GFP_ZERO = __GFP_ZERO;
+const gfp_t RUST_CONST_HELPER___GFP_HIGHMEM = ___GFP_HIGHMEM;
diff --git a/rust/helpers.c b/rust/helpers.c
index 187f445fbf19..e6541119160b 100644
--- a/rust/helpers.c
+++ b/rust/helpers.c
@@ -25,6 +25,8 @@
 #include <linux/build_bug.h>
 #include <linux/err.h>
 #include <linux/errname.h>
+#include <linux/gfp.h>
+#include <linux/highmem.h>
 #include <linux/mutex.h>
 #include <linux/refcount.h>
 #include <linux/sched/signal.h>
@@ -127,6 +129,24 @@ int rust_helper_signal_pending(struct task_struct *t)
 }
 EXPORT_SYMBOL_GPL(rust_helper_signal_pending);
 
+struct page *rust_helper_alloc_pages(gfp_t gfp_mask, unsigned int order)
+{
+       return alloc_pages(gfp_mask, order);
+}
+EXPORT_SYMBOL_GPL(rust_helper_alloc_pages);
+
+void *rust_helper_kmap_local_page(struct page *page)
+{
+       return kmap_local_page(page);
+}
+EXPORT_SYMBOL_GPL(rust_helper_kmap_local_page);
+
+void rust_helper_kunmap_local(const void *addr)
+{
+       kunmap_local(addr);
+}
+EXPORT_SYMBOL_GPL(rust_helper_kunmap_local);
+
 refcount_t rust_helper_REFCOUNT_INIT(int n)
 {
 	return (refcount_t)REFCOUNT_INIT(n);
diff --git a/rust/kernel/lib.rs b/rust/kernel/lib.rs
index 041233305fda..9f31faf88973 100644
--- a/rust/kernel/lib.rs
+++ b/rust/kernel/lib.rs
@@ -41,6 +41,7 @@
 pub mod kunit;
 #[cfg(CONFIG_NET)]
 pub mod net;
+pub mod page;
 pub mod prelude;
 pub mod print;
 mod static_assert;
diff --git a/rust/kernel/page.rs b/rust/kernel/page.rs
new file mode 100644
index 000000000000..f83c889d39e3
--- /dev/null
+++ b/rust/kernel/page.rs
@@ -0,0 +1,176 @@
+// SPDX-License-Identifier: GPL-2.0
+
+//! Kernel page allocation and management.
+
+use crate::{bindings, error::code::*, error::Result, user_ptr::UserSlicePtrReader};
+use core::{
+    alloc::AllocError,
+    ffi::c_void,
+    ptr::{self, NonNull},
+};
+
+/// A bitwise shift for the page size.
+pub const PAGE_SHIFT: usize = bindings::PAGE_SHIFT as usize;
+/// The number of bytes in a page.
+pub const PAGE_SIZE: usize = 1 << PAGE_SHIFT;
+/// A bitwise mask for the page size.
+pub const PAGE_MASK: usize = PAGE_SIZE - 1;
+
+/// A pointer to a page that owns the page allocation.
+///
+/// # Invariants
+///
+/// The pointer points at a page, and has ownership over the page.
+pub struct Page {
+    page: NonNull<bindings::page>,
+}
+
+// SAFETY: It is safe to transfer page allocations between threads.
+unsafe impl Send for Page {}
+
+// SAFETY: Calling `&self` methods on this type in parallel is safe. It might
+// allow you to perform a data race on bytes stored in the page, but we treat
+// this like data races on user pointers.
+unsafe impl Sync for Page {}
+
+impl Page {
+    /// Allocates a new set of contiguous pages.
+    pub fn new() -> Result<Self, AllocError> {
+        // SAFETY: These are the correct arguments to allocate a single page.
+        let page = unsafe {
+            bindings::alloc_pages(
+                bindings::GFP_KERNEL | bindings::__GFP_ZERO | bindings::__GFP_HIGHMEM,
+                0,
+            )
+        };
+
+        match NonNull::new(page) {
+            // INVARIANT: We checked that the allocation above succeeded.
+            Some(page) => Ok(Self { page }),
+            None => Err(AllocError),
+        }
+    }
+
+    /// Returns a raw pointer to the page.
+    pub fn as_ptr(&self) -> *mut bindings::page {
+        self.page.as_ptr()
+    }
+
+    /// Runs a piece of code with this page mapped to an address.
+    ///
+    /// It is up to the caller to use the provided raw pointer correctly.
+    pub fn with_page_mapped<T>(&self, f: impl FnOnce(*mut c_void) -> T) -> T {
+        // SAFETY: `page` is valid due to the type invariants on `Page`.
+        let mapped_addr = unsafe { bindings::kmap_local_page(self.as_ptr()) };
+
+        let res = f(mapped_addr);
+
+        // SAFETY: This unmaps the page mapped above.
+        //
+        // Since this API takes the user code as a closure, it can only be used
+        // in a manner where the pages are unmapped in reverse order. This is as
+        // required by `kunmap_local`.
+        //
+        // In other words, if this call to `kunmap_local` happens when a
+        // different page should be unmapped first, then there must necessarily
+        // be a call to `kmap_local_page` other than the call just above in
+        // `with_page_mapped` that made that possible. In this case, it is the
+        // unsafe block that wraps that other call that is incorrect.
+        unsafe { bindings::kunmap_local(mapped_addr) };
+
+        res
+    }
+
+    /// Runs a piece of code with a raw pointer to a slice of this page, with
+    /// bounds checking.
+    ///
+    /// If `f` is called, then it will be called with a pointer that points at
+    /// `off` bytes into the page, and the pointer will be valid for at least
+    /// `len` bytes. The pointer is only valid on this task, as this method uses
+    /// a local mapping.
+    ///
+    /// If `off` and `len` refers to a region outside of this page, then this
+    /// method returns `EINVAL` and does not call `f`.
+    pub fn with_pointer_into_page<T>(
+        &self,
+        off: usize,
+        len: usize,
+        f: impl FnOnce(*mut u8) -> Result<T>,
+    ) -> Result<T> {
+        let bounds_ok = off <= PAGE_SIZE && len <= PAGE_SIZE && (off + len) <= PAGE_SIZE;
+
+        if bounds_ok {
+            self.with_page_mapped(move |page_addr| {
+                // SAFETY: The `off` integer is at most `PAGE_SIZE`, so this pointer offset will
+                // result in a pointer that is in bounds or one off the end of the page.
+                f(unsafe { page_addr.cast::<u8>().add(off) })
+            })
+        } else {
+            Err(EINVAL)
+        }
+    }
+
+    /// Maps the page and reads from it into the given buffer.
+    ///
+    /// # Safety
+    ///
+    /// Callers must ensure that `dest` is valid for writing `len` bytes.
+    pub unsafe fn read(&self, dest: *mut u8, offset: usize, len: usize) -> Result {
+        self.with_pointer_into_page(offset, len, move |from_ptr| {
+            // SAFETY: If `with_pointer_into_page` calls into this closure, then
+            // it has performed a bounds check and guarantees that `from_ptr` is
+            // valid for `len` bytes.
+            unsafe { ptr::copy(from_ptr, dest, len) };
+            Ok(())
+        })
+    }
+
+    /// Maps the page and writes into it from the given buffer.
+    ///
+    /// # Safety
+    ///
+    /// Callers must ensure that `src` is valid for reading `len` bytes.
+    pub unsafe fn write(&self, src: *const u8, offset: usize, len: usize) -> Result {
+        self.with_pointer_into_page(offset, len, move |to_ptr| {
+            // SAFETY: If `with_pointer_into_page` calls into this closure, then
+            // it has performed a bounds check and guarantees that `to_ptr` is
+            // valid for `len` bytes.
+            unsafe { ptr::copy(src, to_ptr, len) };
+            Ok(())
+        })
+    }
+
+    /// Maps the page and zeroes the given slice.
+    pub fn fill_zero(&self, offset: usize, len: usize) -> Result {
+        self.with_pointer_into_page(offset, len, move |to_ptr| {
+            // SAFETY: If `with_pointer_into_page` calls into this closure, then
+            // it has performed a bounds check and guarantees that `to_ptr` is
+            // valid for `len` bytes.
+            unsafe { ptr::write_bytes(to_ptr, 0u8, len) };
+            Ok(())
+        })
+    }
+
+    /// Copies data from userspace into this page.
+    pub fn copy_into_page(
+        &self,
+        reader: &mut UserSlicePtrReader,
+        offset: usize,
+        len: usize,
+    ) -> Result {
+        self.with_pointer_into_page(offset, len, move |to_ptr| {
+            // SAFETY: If `with_pointer_into_page` calls into this closure, then
+            // it has performed a bounds check and guarantees that `to_ptr` is
+            // valid for `len` bytes.
+            unsafe { reader.read_raw(to_ptr, len) }
+        })
+    }
+}
+
+impl Drop for Page {
+    fn drop(&mut self) {
+        // SAFETY: By the type invariants, we have ownership of the page and can
+        // free it.
+        unsafe { bindings::__free_pages(self.page.as_ptr(), 0) };
+    }
+}

-- 
2.43.0.429.g432eaa2c6b-goog


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* Re: [PATCH 1/3] rust: add userspace pointers
  2024-01-24 11:20 ` [PATCH 1/3] rust: add userspace pointers Alice Ryhl
@ 2024-01-24 23:12   ` Valentin Obst
  2024-02-08 12:20     ` Alice Ryhl
  2024-02-01  4:06   ` Trevor Gross
  1 sibling, 1 reply; 40+ messages in thread
From: Valentin Obst @ 2024-01-24 23:12 UTC (permalink / raw)
  To: aliceryhl
  Cc: a.hindborg, akpm, alex.gaynor, arnd, arve, benno.lossin,
	bjorn3_gh, boqun.feng, brauner, cmllamas, gary, gregkh, joel,
	keescook, linux-kernel, linux-mm, maco, ojeda, rust-for-linux,
	surenb, tkjos, viro, wedsonaf, Valentin Obst

> +//! User pointers.
> +//!
> +//! C header: [`include/linux/uaccess.h`](../../../../include/linux/uaccess.h)
> +

nit: could this be using srctree-relative links?

> +/// The maximum length of a operation using `copy_[from|to]_user`.

nit: 'a' -> 'an'

> +///
> +/// If a usize is not greater than this constant, then casting it to `c_ulong`
> +/// is guaranteed to be lossless.

nit: could this be `usize` or [`usize`]. Maybe would also be clearer to
say "... a value of type [`usize`] is smaller than ..."

> +///
> +/// These APIs are designed to make it difficult to accidentally write TOCTOU
> +/// bugs. Every time you read from a memory location, the pointer is advanced by

Maybe makes sense to also introduce the abbreviation TOCTOU in the type
documentation when it is first used.

> +    /// Reads the entirety of the user slice.
> +    ///
> +    /// Returns `EFAULT` if the address does not currently point to
> +    /// mapped, readable memory.
> +    pub fn read_all(self) -> Result<Vec<u8>> {
> +        self.reader().read_all()
> +    }

If I understand it correctly, the function will return `EFAULT` if _any_
address in the interval `[self.0, self.0 + self.1)` does not point to
mapped, readable memory. Maybe the docs could be more explicit.

> +        // Since this is not a pointer to a valid object in our program,
> +        // we cannot use `add`, which has C-style rules for defined
> +        // behavior.
> +        self.0 = self.0.wrapping_add(len);

If I understand it correctly, you are using 'valid object' to refer to
an 'allocated object' [1] as this is what the `add` method's docs
refer to [2]. In that case it might be better to use the latter term as
it has a defined meaning. Also see [3] and [4] which are about making it
more precise.

[1]: https://doc.rust-lang.org/core/ptr/index.html#allocated-object
[2]: https://doc.rust-lang.org/core/primitive.pointer.html#method.add
[3]: https://github.com/rust-lang/rust/pull/116675
[4]: https://github.com/rust-lang/unsafe-code-guidelines/issues/465

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 2/3] rust: add typed accessors for userspace pointers
  2024-01-24 11:20 ` [PATCH 2/3] rust: add typed accessors for " Alice Ryhl
@ 2024-01-24 23:46   ` Valentin Obst
  2024-01-25 12:40     ` Alice Ryhl
  2024-01-25 12:26   ` Arnd Bergmann
  2024-02-01  5:03   ` Trevor Gross
  2 siblings, 1 reply; 40+ messages in thread
From: Valentin Obst @ 2024-01-24 23:46 UTC (permalink / raw)
  To: aliceryhl
  Cc: a.hindborg, akpm, alex.gaynor, arnd, arve, benno.lossin,
	bjorn3_gh, boqun.feng, brauner, cmllamas, gary, gregkh, joel,
	keescook, linux-kernel, linux-mm, maco, ojeda, rust-for-linux,
	surenb, tkjos, viro, wedsonaf, Valentin Obst

> +/*

nit: this would be the first comment in the kernel crate to use this
style, not sure if there is a rule about that though. Maybe still
preferable to keep it consistent.

> + * These methods skip the `check_object_size` check that `copy_[to|from]_user`
> + * normally performs.

nit: They skip the (stronger, and also present without usercopy
hardening) `check_copy_size` wrapping that one.

>                        In C, these checks are skipped whenever the length is a
> + * compile-time constant, since when that is the case, the kernel pointer
> + * usually points at a local variable that is being initialized

Question: I thought that this exemption is about dynamic size
calculations being more susceptible to bugs than hard-coded ones. Does
someone recall the original rationale for that?

>                                                                  and the kernel
> + * pointer is trivially non-dangling.

As far as I know the hardened usercopy checks are not meant to catch
UAFs but rather about OOB accesses (and some info leaks). For example,
if the object is on the heap they check if the copy size exceeds the
allocation size, or, if the object is on the stack, they verify the copy
size does not leave the stack frame.

> + *
> + * These helpers serve the same purpose in Rust. Whenever the length is known at
> + * compile-time, we call this helper to skip the check.
> + */
> +unsigned long rust_helper_copy_from_user_unsafe_skip_check_object_size(void *to, const void __user *from, unsigned long n)
> +{
> +	unsigned long res;
> +
> +	might_fault();
> +	instrument_copy_from_user_before(to, from, n);
> +	if (should_fail_usercopy())
> +		return n;
> +	res = raw_copy_from_user(to, from, n);
> +	instrument_copy_from_user_after(to, from, n, res);
> +	return res;
> +}
> +EXPORT_SYMBOL_GPL(rust_helper_copy_from_user_unsafe_skip_check_object_size);
> +
> +unsigned long rust_helper_copy_to_user_unsafe_skip_check_object_size(void __user *to, const void *from, unsigned long n)
> +{
> +	might_fault();
> +	if (should_fail_usercopy())
> +		return n;
> +	instrument_copy_to_user(to, from, n);
> +	return raw_copy_to_user(to, from, n);
> +}
> +EXPORT_SYMBOL_GPL(rust_helper_copy_to_user_unsafe_skip_check_object_size);

Could those be wrapping `_copy_[to|from]_user` instead?

	- Valentin

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 2/3] rust: add typed accessors for userspace pointers
  2024-01-24 11:20 ` [PATCH 2/3] rust: add typed accessors for " Alice Ryhl
  2024-01-24 23:46   ` Valentin Obst
@ 2024-01-25 12:26   ` Arnd Bergmann
  2024-01-25 12:37     ` Alice Ryhl
  2024-02-01  5:03   ` Trevor Gross
  2 siblings, 1 reply; 40+ messages in thread
From: Arnd Bergmann @ 2024-01-25 12:26 UTC (permalink / raw)
  To: Alice Ryhl, Miguel Ojeda, Alex Gaynor, Wedson Almeida Filho,
	Boqun Feng, Gary Guo, Björn Roy Baron, Benno Lossin,
	Andreas Hindborg, Kees Cook, Alexander Viro, Andrew Morton
  Cc: Greg Kroah-Hartman, Arve Hjønnevåg, Todd Kjos,
	Martijn Coenen, Joel Fernandes, Carlos Llamas,
	Suren Baghdasaryan, linux-mm, linux-kernel, rust-for-linux,
	Christian Brauner

On Wed, Jan 24, 2024, at 12:20, Alice Ryhl wrote:
> +unsigned long 
> rust_helper_copy_from_user_unsafe_skip_check_object_size(void *to, 
> const void __user *from, unsigned long n)
> +{
> +	unsigned long res;
> +
> +	might_fault();
> +	instrument_copy_from_user_before(to, from, n);
> +	if (should_fail_usercopy())
> +		return n;
> +	res = raw_copy_from_user(to, from, n);
> +	instrument_copy_from_user_after(to, from, n, res);
> +	return res;
> +}
> +EXPORT_SYMBOL_GPL(rust_helper_copy_from_user_unsafe_skip_check_object_size);
> +
> +unsigned long 
> rust_helper_copy_to_user_unsafe_skip_check_object_size(void __user *to, 
> const void *from, unsigned long n)
> +{
> +	might_fault();
> +	if (should_fail_usercopy())
> +		return n;
> +	instrument_copy_to_user(to, from, n);
> +	return raw_copy_to_user(to, from, n);
> +}
> +EXPORT_SYMBOL_GPL(rust_helper_copy_to_user_unsafe_skip_check_object_size);

These functions are almost identical to the ones in
lib/usercopy.c for !defined(INLINE_COPY_TO_USER).

That version has an extra memset() after a partial
copy_from_user(), and you probably want to have the
same thing here for consistency.

I think ideally we should only have one out-of-line copy
of these two functions and have that one shared between
rust and architectures that want the C version out of line
as well.

     Arnd

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 2/3] rust: add typed accessors for userspace pointers
  2024-01-25 12:26   ` Arnd Bergmann
@ 2024-01-25 12:37     ` Alice Ryhl
  2024-01-25 15:59       ` Arnd Bergmann
  0 siblings, 1 reply; 40+ messages in thread
From: Alice Ryhl @ 2024-01-25 12:37 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Miguel Ojeda, Alex Gaynor, Wedson Almeida Filho, Boqun Feng,
	Gary Guo, Björn Roy Baron, Benno Lossin, Andreas Hindborg,
	Kees Cook, Alexander Viro, Andrew Morton, Greg Kroah-Hartman,
	Arve Hjønnevåg, Todd Kjos, Martijn Coenen,
	Joel Fernandes, Carlos Llamas, Suren Baghdasaryan, linux-mm,
	linux-kernel, rust-for-linux, Christian Brauner

On Thu, Jan 25, 2024 at 1:27 PM Arnd Bergmann <arnd@arndb.de> wrote:
>
> On Wed, Jan 24, 2024, at 12:20, Alice Ryhl wrote:
> > +unsigned long
> > rust_helper_copy_from_user_unsafe_skip_check_object_size(void *to,
> > const void __user *from, unsigned long n)
> > +{
> > +     unsigned long res;
> > +
> > +     might_fault();
> > +     instrument_copy_from_user_before(to, from, n);
> > +     if (should_fail_usercopy())
> > +             return n;
> > +     res = raw_copy_from_user(to, from, n);
> > +     instrument_copy_from_user_after(to, from, n, res);
> > +     return res;
> > +}
> > +EXPORT_SYMBOL_GPL(rust_helper_copy_from_user_unsafe_skip_check_object_size);
> > +
> > +unsigned long
> > rust_helper_copy_to_user_unsafe_skip_check_object_size(void __user *to,
> > const void *from, unsigned long n)
> > +{
> > +     might_fault();
> > +     if (should_fail_usercopy())
> > +             return n;
> > +     instrument_copy_to_user(to, from, n);
> > +     return raw_copy_to_user(to, from, n);
> > +}
> > +EXPORT_SYMBOL_GPL(rust_helper_copy_to_user_unsafe_skip_check_object_size);
>
> These functions are almost identical to the ones in
> lib/usercopy.c for !defined(INLINE_COPY_TO_USER).
>
> That version has an extra memset() after a partial
> copy_from_user(), and you probably want to have the
> same thing here for consistency.
>
> I think ideally we should only have one out-of-line copy
> of these two functions and have that one shared between
> rust and architectures that want the C version out of line
> as well.

I had a bit of trouble figuring out all of the copy_[to/from]_user
methods that are available. I was hoping that a better solution would
be available, and it sounds like one is. Is _copy_from_user always
available as an exported symbol? If it's always available and skips
the check, then I can just use that. I don't think the memset matters
for my case.

Otherwise, I can add a helper in rust/helpers.c that wraps
_copy_from_user only when INLINE_COPY_FROM_USER is defined, and call
the helper in those cases, and otherwise call the exported symbol
directly. (I need an exported symbol to call into C from Rust.)

Would that make sense?

Alice

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 2/3] rust: add typed accessors for userspace pointers
  2024-01-24 23:46   ` Valentin Obst
@ 2024-01-25 12:40     ` Alice Ryhl
  0 siblings, 0 replies; 40+ messages in thread
From: Alice Ryhl @ 2024-01-25 12:40 UTC (permalink / raw)
  To: Valentin Obst
  Cc: a.hindborg, akpm, alex.gaynor, arnd, arve, benno.lossin,
	bjorn3_gh, boqun.feng, brauner, cmllamas, gary, gregkh, joel,
	keescook, linux-kernel, linux-mm, maco, ojeda, rust-for-linux,
	surenb, tkjos, viro, wedsonaf

On Thu, Jan 25, 2024 at 12:47 AM Valentin Obst <kernel@valentinobst.de> wrote:
>
> > +/*
>
> nit: this would be the first comment in the kernel crate to use this
> style, not sure if there is a rule about that though. Maybe still
> preferable to keep it consistent.
>
> > + * These methods skip the `check_object_size` check that `copy_[to|from]_user`
> > + * normally performs.
>
> nit: They skip the (stronger, and also present without usercopy
> hardening) `check_copy_size` wrapping that one.

The only difference between check_object_size and check_copy_size is
the extra check with __builtin_object_size, but that doesn't work
across the C/Rust boundary, and Rust doesn't have a direct equivalent.

> >                        In C, these checks are skipped whenever the length is a
> > + * compile-time constant, since when that is the case, the kernel pointer
> > + * usually points at a local variable that is being initialized
>
> Question: I thought that this exemption is about dynamic size
> calculations being more susceptible to bugs than hard-coded ones. Does
> someone recall the original rationale for that?
>
> >                                                                  and the kernel
> > + * pointer is trivially non-dangling.
>
> As far as I know the hardened usercopy checks are not meant to catch
> UAFs but rather about OOB accesses (and some info leaks). For example,
> if the object is on the heap they check if the copy size exceeds the
> allocation size, or, if the object is on the stack, they verify the copy
> size does not leave the stack frame.

Right, I can reword to say OOB instead of UAF.

> > + *
> > + * These helpers serve the same purpose in Rust. Whenever the length is known at
> > + * compile-time, we call this helper to skip the check.
> > + */
> > +unsigned long rust_helper_copy_from_user_unsafe_skip_check_object_size(void *to, const void __user *from, unsigned long n)
> > +{
> > +     unsigned long res;
> > +
> > +     might_fault();
> > +     instrument_copy_from_user_before(to, from, n);
> > +     if (should_fail_usercopy())
> > +             return n;
> > +     res = raw_copy_from_user(to, from, n);
> > +     instrument_copy_from_user_after(to, from, n, res);
> > +     return res;
> > +}
> > +EXPORT_SYMBOL_GPL(rust_helper_copy_from_user_unsafe_skip_check_object_size);
> > +
> > +unsigned long rust_helper_copy_to_user_unsafe_skip_check_object_size(void __user *to, const void *from, unsigned long n)
> > +{
> > +     might_fault();
> > +     if (should_fail_usercopy())
> > +             return n;
> > +     instrument_copy_to_user(to, from, n);
> > +     return raw_copy_to_user(to, from, n);
> > +}
> > +EXPORT_SYMBOL_GPL(rust_helper_copy_to_user_unsafe_skip_check_object_size);
>
> Could those be wrapping `_copy_[to|from]_user` instead?

Yeah maybe, see the other thread with Arnd Bergmann.

Alice

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 2/3] rust: add typed accessors for userspace pointers
  2024-01-25 12:37     ` Alice Ryhl
@ 2024-01-25 15:59       ` Arnd Bergmann
  2024-01-25 16:15         ` Alice Ryhl
  0 siblings, 1 reply; 40+ messages in thread
From: Arnd Bergmann @ 2024-01-25 15:59 UTC (permalink / raw)
  To: Alice Ryhl
  Cc: Miguel Ojeda, Alex Gaynor, Wedson Almeida Filho, Boqun Feng,
	Gary Guo, Björn Roy Baron, Benno Lossin, Andreas Hindborg,
	Kees Cook, Alexander Viro, Andrew Morton, Greg Kroah-Hartman,
	Arve Hjønnevåg, Todd Kjos, Martijn Coenen,
	Joel Fernandes, Carlos Llamas, Suren Baghdasaryan, linux-mm,
	linux-kernel, rust-for-linux, Christian Brauner

On Thu, Jan 25, 2024, at 13:37, Alice Ryhl wrote:
> On Thu, Jan 25, 2024 at 1:27 PM Arnd Bergmann <arnd@arndb.de> wrote:
>> On Wed, Jan 24, 2024, at 12:20, Alice Ryhl wrote:

>> > +EXPORT_SYMBOL_GPL(rust_helper_copy_to_user_unsafe_skip_check_object_size);
>>
>> These functions are almost identical to the ones in
>> lib/usercopy.c for !defined(INLINE_COPY_TO_USER).
>>
>> That version has an extra memset() after a partial
>> copy_from_user(), and you probably want to have the
>> same thing here for consistency.
>>
>> I think ideally we should only have one out-of-line copy
>> of these two functions and have that one shared between
>> rust and architectures that want the C version out of line
>> as well.
>
> I had a bit of trouble figuring out all of the copy_[to/from]_user
> methods that are available. I was hoping that a better solution would
> be available, and it sounds like one is. Is _copy_from_user always
> available as an exported symbol? If it's always available and skips
> the check, then I can just use that. I don't think the memset matters
> for my case.

At the moment, it appears that it's available on the few architectures
that don't #define INLINE_COPY_FROM_USER: alpha, csky, powerpc,
riscv and x86. On the other architectures it is always an inline
function.

> Otherwise, I can add a helper in rust/helpers.c that wraps
> _copy_from_user only when INLINE_COPY_FROM_USER is defined, and call
> the helper in those cases, and otherwise call the exported symbol
> directly. (I need an exported symbol to call into C from Rust.)
>
> Would that make sense?

I don't think we can have a perfect abstraction here, but rather
than putting knowledge of INLINE_COPY_FROM_USER into the rust
wrapper, I would suggest putting a bit of information about
rust into lib/usercopy.c.

I've tried to come up with an idea below, see if that works
for you.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>

diff --git a/include/linux/uaccess.h b/include/linux/uaccess.h
index 3064314f4832..835aa175d0ee 100644
--- a/include/linux/uaccess.h
+++ b/include/linux/uaccess.h
@@ -138,13 +138,18 @@ __copy_to_user(void __user *to, const void *from, unsigned long n)
 	return raw_copy_to_user(to, from, n);
 }
 
-#ifdef INLINE_COPY_FROM_USER
 static inline __must_check unsigned long
-_copy_from_user(void *to, const void __user *from, unsigned long n)
+_inline_copy_from_user(void *to, const void __user *from, unsigned long n)
 {
 	unsigned long res = n;
 	might_fault();
 	if (!should_fail_usercopy() && likely(access_ok(from, n))) {
+		/*
+		 * Ensure that bad access_ok() speculation will not
+		 * lead to nasty side effects *after* the copy is
+		 * finished:
+		 */
+		barrier_nospec();
 		instrument_copy_from_user_before(to, from, n);
 		res = raw_copy_from_user(to, from, n);
 		instrument_copy_from_user_after(to, from, n, res);
@@ -153,14 +158,11 @@ _copy_from_user(void *to, const void __user *from, unsigned long n)
 		memset(to + (n - res), 0, res);
 	return res;
 }
-#else
 extern __must_check unsigned long
 _copy_from_user(void *, const void __user *, unsigned long);
-#endif
 
-#ifdef INLINE_COPY_TO_USER
 static inline __must_check unsigned long
-_copy_to_user(void __user *to, const void *from, unsigned long n)
+_inline_copy_to_user(void __user *to, const void *from, unsigned long n)
 {
 	might_fault();
 	if (should_fail_usercopy())
@@ -171,25 +173,32 @@ _copy_to_user(void __user *to, const void *from, unsigned long n)
 	}
 	return n;
 }
-#else
 extern __must_check unsigned long
 _copy_to_user(void __user *, const void *, unsigned long);
-#endif
 
 static __always_inline unsigned long __must_check
 copy_from_user(void *to, const void __user *from, unsigned long n)
 {
-	if (check_copy_size(to, n, false))
-		n = _copy_from_user(to, from, n);
-	return n;
+	if (!check_copy_size(to, n, false))
+		return n;
+#ifdef INLINE_COPY_FROM_USER
+	return _inline_copy_from_user(to, from, n);
+#else
+	return _copy_from_user(to, from, n);
+#endif
 }
 
 static __always_inline unsigned long __must_check
 copy_to_user(void __user *to, const void *from, unsigned long n)
 {
-	if (check_copy_size(from, n, true))
-		n = _copy_to_user(to, from, n);
-	return n;
+	if (!check_copy_size(from, n, true))
+		return n;
+
+#ifdef INLINE_COPY_TO_USER
+	return _inline_copy_to_user(to, from, n);
+#else
+	return _copy_to_user(to, from, n);
+#endif
 }
 
 #ifndef copy_mc_to_kernel
diff --git a/lib/usercopy.c b/lib/usercopy.c
index d29fe29c6849..503a064d79e2 100644
--- a/lib/usercopy.c
+++ b/lib/usercopy.c
@@ -7,40 +7,18 @@
 
 /* out-of-line parts */
 
-#ifndef INLINE_COPY_FROM_USER
+#if !defined(INLINE_COPY_FROM_USER) || defined(CONFIG_RUST)
 unsigned long _copy_from_user(void *to, const void __user *from, unsigned long n)
 {
-	unsigned long res = n;
-	might_fault();
-	if (!should_fail_usercopy() && likely(access_ok(from, n))) {
-		/*
-		 * Ensure that bad access_ok() speculation will not
-		 * lead to nasty side effects *after* the copy is
-		 * finished:
-		 */
-		barrier_nospec();
-		instrument_copy_from_user_before(to, from, n);
-		res = raw_copy_from_user(to, from, n);
-		instrument_copy_from_user_after(to, from, n, res);
-	}
-	if (unlikely(res))
-		memset(to + (n - res), 0, res);
-	return res;
+	return _inline_copy_from_user(to, from, n);
 }
 EXPORT_SYMBOL(_copy_from_user);
 #endif
 
-#ifndef INLINE_COPY_TO_USER
+#if !defined(INLINE_COPY_TO_USER) || defined(CONFIG_RUST)
 unsigned long _copy_to_user(void __user *to, const void *from, unsigned long n)
 {
-	might_fault();
-	if (should_fail_usercopy())
-		return n;
-	if (likely(access_ok(to, n))) {
-		instrument_copy_to_user(to, from, n);
-		n = raw_copy_to_user(to, from, n);
-	}
-	return n;
+	return _inline_copy_to_user(to, from, n);
 }
 EXPORT_SYMBOL(_copy_to_user);
 #endif

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* Re: [PATCH 2/3] rust: add typed accessors for userspace pointers
  2024-01-25 15:59       ` Arnd Bergmann
@ 2024-01-25 16:15         ` Alice Ryhl
  0 siblings, 0 replies; 40+ messages in thread
From: Alice Ryhl @ 2024-01-25 16:15 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Miguel Ojeda, Alex Gaynor, Wedson Almeida Filho, Boqun Feng,
	Gary Guo, Björn Roy Baron, Benno Lossin, Andreas Hindborg,
	Kees Cook, Alexander Viro, Andrew Morton, Greg Kroah-Hartman,
	Arve Hjønnevåg, Todd Kjos, Martijn Coenen,
	Joel Fernandes, Carlos Llamas, Suren Baghdasaryan, linux-mm,
	linux-kernel, rust-for-linux, Christian Brauner

On Thu, Jan 25, 2024 at 5:00 PM Arnd Bergmann <arnd@arndb.de> wrote:
>
> On Thu, Jan 25, 2024, at 13:37, Alice Ryhl wrote:
> > On Thu, Jan 25, 2024 at 1:27 PM Arnd Bergmann <arnd@arndb.de> wrote:
> >> On Wed, Jan 24, 2024, at 12:20, Alice Ryhl wrote:
>
> >> > +EXPORT_SYMBOL_GPL(rust_helper_copy_to_user_unsafe_skip_check_object_size);
> >>
> >> These functions are almost identical to the ones in
> >> lib/usercopy.c for !defined(INLINE_COPY_TO_USER).
> >>
> >> That version has an extra memset() after a partial
> >> copy_from_user(), and you probably want to have the
> >> same thing here for consistency.
> >>
> >> I think ideally we should only have one out-of-line copy
> >> of these two functions and have that one shared between
> >> rust and architectures that want the C version out of line
> >> as well.
> >
> > I had a bit of trouble figuring out all of the copy_[to/from]_user
> > methods that are available. I was hoping that a better solution would
> > be available, and it sounds like one is. Is _copy_from_user always
> > available as an exported symbol? If it's always available and skips
> > the check, then I can just use that. I don't think the memset matters
> > for my case.
>
> At the moment, it appears that it's available on the few architectures
> that don't #define INLINE_COPY_FROM_USER: alpha, csky, powerpc,
> riscv and x86. On the other architectures it is always an inline
> function.
>
> > Otherwise, I can add a helper in rust/helpers.c that wraps
> > _copy_from_user only when INLINE_COPY_FROM_USER is defined, and call
> > the helper in those cases, and otherwise call the exported symbol
> > directly. (I need an exported symbol to call into C from Rust.)
> >
> > Would that make sense?
>
> I don't think we can have a perfect abstraction here, but rather
> than putting knowledge of INLINE_COPY_FROM_USER into the rust
> wrapper, I would suggest putting a bit of information about
> rust into lib/usercopy.c.
>
> I've tried to come up with an idea below, see if that works
> for you.
>
> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
>
> diff --git a/include/linux/uaccess.h b/include/linux/uaccess.h
> index 3064314f4832..835aa175d0ee 100644
> --- a/include/linux/uaccess.h
> +++ b/include/linux/uaccess.h
> @@ -138,13 +138,18 @@ __copy_to_user(void __user *to, const void *from, unsigned long n)
>         return raw_copy_to_user(to, from, n);
>  }
>
> -#ifdef INLINE_COPY_FROM_USER
>  static inline __must_check unsigned long
> -_copy_from_user(void *to, const void __user *from, unsigned long n)
> +_inline_copy_from_user(void *to, const void __user *from, unsigned long n)
>  {
>         unsigned long res = n;
>         might_fault();
>         if (!should_fail_usercopy() && likely(access_ok(from, n))) {
> +               /*
> +                * Ensure that bad access_ok() speculation will not
> +                * lead to nasty side effects *after* the copy is
> +                * finished:
> +                */
> +               barrier_nospec();
>                 instrument_copy_from_user_before(to, from, n);
>                 res = raw_copy_from_user(to, from, n);
>                 instrument_copy_from_user_after(to, from, n, res);
> @@ -153,14 +158,11 @@ _copy_from_user(void *to, const void __user *from, unsigned long n)
>                 memset(to + (n - res), 0, res);
>         return res;
>  }
> -#else
>  extern __must_check unsigned long
>  _copy_from_user(void *, const void __user *, unsigned long);
> -#endif
>
> -#ifdef INLINE_COPY_TO_USER
>  static inline __must_check unsigned long
> -_copy_to_user(void __user *to, const void *from, unsigned long n)
> +_inline_copy_to_user(void __user *to, const void *from, unsigned long n)
>  {
>         might_fault();
>         if (should_fail_usercopy())
> @@ -171,25 +173,32 @@ _copy_to_user(void __user *to, const void *from, unsigned long n)
>         }
>         return n;
>  }
> -#else
>  extern __must_check unsigned long
>  _copy_to_user(void __user *, const void *, unsigned long);
> -#endif
>
>  static __always_inline unsigned long __must_check
>  copy_from_user(void *to, const void __user *from, unsigned long n)
>  {
> -       if (check_copy_size(to, n, false))
> -               n = _copy_from_user(to, from, n);
> -       return n;
> +       if (!check_copy_size(to, n, false))
> +               return n;
> +#ifdef INLINE_COPY_FROM_USER
> +       return _inline_copy_from_user(to, from, n);
> +#else
> +       return _copy_from_user(to, from, n);
> +#endif
>  }
>
>  static __always_inline unsigned long __must_check
>  copy_to_user(void __user *to, const void *from, unsigned long n)
>  {
> -       if (check_copy_size(from, n, true))
> -               n = _copy_to_user(to, from, n);
> -       return n;
> +       if (!check_copy_size(from, n, true))
> +               return n;
> +
> +#ifdef INLINE_COPY_TO_USER
> +       return _inline_copy_to_user(to, from, n);
> +#else
> +       return _copy_to_user(to, from, n);
> +#endif
>  }
>
>  #ifndef copy_mc_to_kernel
> diff --git a/lib/usercopy.c b/lib/usercopy.c
> index d29fe29c6849..503a064d79e2 100644
> --- a/lib/usercopy.c
> +++ b/lib/usercopy.c
> @@ -7,40 +7,18 @@
>
>  /* out-of-line parts */
>
> -#ifndef INLINE_COPY_FROM_USER
> +#if !defined(INLINE_COPY_FROM_USER) || defined(CONFIG_RUST)
>  unsigned long _copy_from_user(void *to, const void __user *from, unsigned long n)
>  {
> -       unsigned long res = n;
> -       might_fault();
> -       if (!should_fail_usercopy() && likely(access_ok(from, n))) {
> -               /*
> -                * Ensure that bad access_ok() speculation will not
> -                * lead to nasty side effects *after* the copy is
> -                * finished:
> -                */
> -               barrier_nospec();
> -               instrument_copy_from_user_before(to, from, n);
> -               res = raw_copy_from_user(to, from, n);
> -               instrument_copy_from_user_after(to, from, n, res);
> -       }
> -       if (unlikely(res))
> -               memset(to + (n - res), 0, res);
> -       return res;
> +       return _inline_copy_from_user(to, from, n);
>  }
>  EXPORT_SYMBOL(_copy_from_user);
>  #endif
>
> -#ifndef INLINE_COPY_TO_USER
> +#if !defined(INLINE_COPY_TO_USER) || defined(CONFIG_RUST)
>  unsigned long _copy_to_user(void __user *to, const void *from, unsigned long n)
>  {
> -       might_fault();
> -       if (should_fail_usercopy())
> -               return n;
> -       if (likely(access_ok(to, n))) {
> -               instrument_copy_to_user(to, from, n);
> -               n = raw_copy_to_user(to, from, n);
> -       }
> -       return n;
> +       return _inline_copy_to_user(to, from, n);
>  }
>  EXPORT_SYMBOL(_copy_to_user);
>  #endif

Sure, if that's okay with you, then I'm happy to do it that way in v2.

Thank you!
Alice

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 3/3] rust: add abstraction for `struct page`
  2024-01-24 11:20 ` [PATCH 3/3] rust: add abstraction for `struct page` Alice Ryhl
@ 2024-01-26  0:46   ` Boqun Feng
  2024-01-26 12:33     ` Alice Ryhl
  2024-01-30  9:15     ` Andreas Hindborg (Samsung)
  2024-01-29 17:59   ` Matthew Wilcox
  2024-02-01  6:02   ` Trevor Gross
  2 siblings, 2 replies; 40+ messages in thread
From: Boqun Feng @ 2024-01-26  0:46 UTC (permalink / raw)
  To: Alice Ryhl
  Cc: Miguel Ojeda, Alex Gaynor, Wedson Almeida Filho, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Kees Cook,
	Al Viro, Andrew Morton, Greg Kroah-Hartman,
	Arve Hjønnevåg, Todd Kjos, Martijn Coenen,
	Joel Fernandes, Carlos Llamas, Suren Baghdasaryan, Arnd Bergmann,
	linux-mm, linux-kernel, rust-for-linux, Christian Brauner

On Wed, Jan 24, 2024 at 11:20:23AM +0000, Alice Ryhl wrote:
[...]
> +    /// Runs a piece of code with a raw pointer to a slice of this page, with
> +    /// bounds checking.
> +    ///
> +    /// If `f` is called, then it will be called with a pointer that points at
> +    /// `off` bytes into the page, and the pointer will be valid for at least
> +    /// `len` bytes. The pointer is only valid on this task, as this method uses
> +    /// a local mapping.
> +    ///
> +    /// If `off` and `len` refers to a region outside of this page, then this
> +    /// method returns `EINVAL` and does not call `f`.
> +    pub fn with_pointer_into_page<T>(

Name it as `with_slice_in_page` maybe?

> +        &self,
> +        off: usize,
> +        len: usize,
> +        f: impl FnOnce(*mut u8) -> Result<T>,
> +    ) -> Result<T> {
> +        let bounds_ok = off <= PAGE_SIZE && len <= PAGE_SIZE && (off + len) <= PAGE_SIZE;
> +
> +        if bounds_ok {
> +            self.with_page_mapped(move |page_addr| {
> +                // SAFETY: The `off` integer is at most `PAGE_SIZE`, so this pointer offset will
> +                // result in a pointer that is in bounds or one off the end of the page.
> +                f(unsafe { page_addr.cast::<u8>().add(off) })
> +            })
> +        } else {
> +            Err(EINVAL)
> +        }
> +    }
> +
> +    /// Maps the page and reads from it into the given buffer.
> +    ///
> +    /// # Safety
> +    ///
> +    /// Callers must ensure that `dest` is valid for writing `len` bytes.
> +    pub unsafe fn read(&self, dest: *mut u8, offset: usize, len: usize) -> Result {
> +        self.with_pointer_into_page(offset, len, move |from_ptr| {
> +            // SAFETY: If `with_pointer_into_page` calls into this closure, then
> +            // it has performed a bounds check and guarantees that `from_ptr` is
> +            // valid for `len` bytes.
> +            unsafe { ptr::copy(from_ptr, dest, len) };
> +            Ok(())
> +        })
> +    }
> +
> +    /// Maps the page and writes into it from the given buffer.
> +    ///
> +    /// # Safety
> +    ///
> +    /// Callers must ensure that `src` is valid for reading `len` bytes.
> +    pub unsafe fn write(&self, src: *const u8, offset: usize, len: usize) -> Result {

Use a slice like type as `src` maybe? Then the function can be safe:

	pub fn write<S: AsRef<[u8]>>(&self, src: S, offset: usize) -> Result

Besides, since `Page` impl `Sync`, shouldn't this `write` and the
`fill_zero` be a `&mut self` function? Or make them both `unsafe`
because of potential race and add some safety requirement?

Regards,
Boqun

> +        self.with_pointer_into_page(offset, len, move |to_ptr| {
> +            // SAFETY: If `with_pointer_into_page` calls into this closure, then
> +            // it has performed a bounds check and guarantees that `to_ptr` is
> +            // valid for `len` bytes.
> +            unsafe { ptr::copy(src, to_ptr, len) };
> +            Ok(())
> +        })
> +    }
> +
> +    /// Maps the page and zeroes the given slice.
> +    pub fn fill_zero(&self, offset: usize, len: usize) -> Result {
> +        self.with_pointer_into_page(offset, len, move |to_ptr| {
> +            // SAFETY: If `with_pointer_into_page` calls into this closure, then
> +            // it has performed a bounds check and guarantees that `to_ptr` is
> +            // valid for `len` bytes.
> +            unsafe { ptr::write_bytes(to_ptr, 0u8, len) };
> +            Ok(())
> +        })
> +    }
> +
> +    /// Copies data from userspace into this page.
> +    pub fn copy_into_page(
> +        &self,
> +        reader: &mut UserSlicePtrReader,
> +        offset: usize,
> +        len: usize,
> +    ) -> Result {
> +        self.with_pointer_into_page(offset, len, move |to_ptr| {
> +            // SAFETY: If `with_pointer_into_page` calls into this closure, then
> +            // it has performed a bounds check and guarantees that `to_ptr` is
> +            // valid for `len` bytes.
> +            unsafe { reader.read_raw(to_ptr, len) }
> +        })
> +    }
> +}
> +
> +impl Drop for Page {
> +    fn drop(&mut self) {
> +        // SAFETY: By the type invariants, we have ownership of the page and can
> +        // free it.
> +        unsafe { bindings::__free_pages(self.page.as_ptr(), 0) };
> +    }
> +}
> 
> -- 
> 2.43.0.429.g432eaa2c6b-goog
> 

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 3/3] rust: add abstraction for `struct page`
  2024-01-26  0:46   ` Boqun Feng
@ 2024-01-26 12:33     ` Alice Ryhl
  2024-01-26 18:28       ` Boqun Feng
  2024-01-30  9:15     ` Andreas Hindborg (Samsung)
  1 sibling, 1 reply; 40+ messages in thread
From: Alice Ryhl @ 2024-01-26 12:33 UTC (permalink / raw)
  To: Boqun Feng
  Cc: Miguel Ojeda, Alex Gaynor, Wedson Almeida Filho, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Kees Cook,
	Al Viro, Andrew Morton, Greg Kroah-Hartman,
	Arve Hjønnevåg, Todd Kjos, Martijn Coenen,
	Joel Fernandes, Carlos Llamas, Suren Baghdasaryan, Arnd Bergmann,
	linux-mm, linux-kernel, rust-for-linux, Christian Brauner

On Fri, Jan 26, 2024 at 1:47 AM Boqun Feng <boqun.feng@gmail.com> wrote:
>
> On Wed, Jan 24, 2024 at 11:20:23AM +0000, Alice Ryhl wrote:
> > +    /// Maps the page and reads from it into the given buffer.
> > +    ///
> > +    /// # Safety
> > +    ///
> > +    /// Callers must ensure that `dest` is valid for writing `len` bytes.
> > +    pub unsafe fn read(&self, dest: *mut u8, offset: usize, len: usize) -> Result {
> > +        self.with_pointer_into_page(offset, len, move |from_ptr| {
> > +            // SAFETY: If `with_pointer_into_page` calls into this closure, then
> > +            // it has performed a bounds check and guarantees that `from_ptr` is
> > +            // valid for `len` bytes.
> > +            unsafe { ptr::copy(from_ptr, dest, len) };
> > +            Ok(())
> > +        })
> > +    }
> > +
> > +    /// Maps the page and writes into it from the given buffer.
> > +    ///
> > +    /// # Safety
> > +    ///
> > +    /// Callers must ensure that `src` is valid for reading `len` bytes.
> > +    pub unsafe fn write(&self, src: *const u8, offset: usize, len: usize) -> Result {
>
> Use a slice like type as `src` maybe? Then the function can be safe:
>
>         pub fn write<S: AsRef<[u8]>>(&self, src: S, offset: usize) -> Result
>
> Besides, since `Page` impl `Sync`, shouldn't this `write` and the
> `fill_zero` be a `&mut self` function? Or make them both `unsafe`
> because of potential race and add some safety requirement?

Ideally, we don't want data races with these methods to be UB. They
could be mapped into the address space of a userspace process.

Alice

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 3/3] rust: add abstraction for `struct page`
  2024-01-26 12:33     ` Alice Ryhl
@ 2024-01-26 18:28       ` Boqun Feng
  2024-02-01  6:50         ` Trevor Gross
  0 siblings, 1 reply; 40+ messages in thread
From: Boqun Feng @ 2024-01-26 18:28 UTC (permalink / raw)
  To: Alice Ryhl
  Cc: Miguel Ojeda, Alex Gaynor, Wedson Almeida Filho, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Kees Cook,
	Al Viro, Andrew Morton, Greg Kroah-Hartman,
	Arve Hjønnevåg, Todd Kjos, Martijn Coenen,
	Joel Fernandes, Carlos Llamas, Suren Baghdasaryan, Arnd Bergmann,
	linux-mm, linux-kernel, rust-for-linux, Christian Brauner

On Fri, Jan 26, 2024 at 01:33:46PM +0100, Alice Ryhl wrote:
> On Fri, Jan 26, 2024 at 1:47 AM Boqun Feng <boqun.feng@gmail.com> wrote:
> >
> > On Wed, Jan 24, 2024 at 11:20:23AM +0000, Alice Ryhl wrote:
> > > +    /// Maps the page and reads from it into the given buffer.
> > > +    ///
> > > +    /// # Safety
> > > +    ///
> > > +    /// Callers must ensure that `dest` is valid for writing `len` bytes.
> > > +    pub unsafe fn read(&self, dest: *mut u8, offset: usize, len: usize) -> Result {
> > > +        self.with_pointer_into_page(offset, len, move |from_ptr| {
> > > +            // SAFETY: If `with_pointer_into_page` calls into this closure, then
> > > +            // it has performed a bounds check and guarantees that `from_ptr` is
> > > +            // valid for `len` bytes.
> > > +            unsafe { ptr::copy(from_ptr, dest, len) };
> > > +            Ok(())
> > > +        })
> > > +    }
> > > +
> > > +    /// Maps the page and writes into it from the given buffer.
> > > +    ///
> > > +    /// # Safety
> > > +    ///
> > > +    /// Callers must ensure that `src` is valid for reading `len` bytes.
> > > +    pub unsafe fn write(&self, src: *const u8, offset: usize, len: usize) -> Result {
> >
> > Use a slice like type as `src` maybe? Then the function can be safe:
> >
> >         pub fn write<S: AsRef<[u8]>>(&self, src: S, offset: usize) -> Result
> >
> > Besides, since `Page` impl `Sync`, shouldn't this `write` and the
> > `fill_zero` be a `&mut self` function? Or make them both `unsafe`
> > because of potential race and add some safety requirement?
> 
> Ideally, we don't want data races with these methods to be UB. They

I understand that, but in the current code, you can write:

	CPU 0			CPU 1
	=====			=====

	page.write(src1, 0, 8);
				page.write(src2, 0, 8);

and it's a data race at kernel end. So my question is more how we can
prevent the UB ;-)

Regards,
Boqun

> could be mapped into the address space of a userspace process.
> 
> Alice

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 3/3] rust: add abstraction for `struct page`
  2024-01-24 11:20 ` [PATCH 3/3] rust: add abstraction for `struct page` Alice Ryhl
  2024-01-26  0:46   ` Boqun Feng
@ 2024-01-29 17:59   ` Matthew Wilcox
  2024-01-29 18:56     ` Carlos Llamas
                       ` (2 more replies)
  2024-02-01  6:02   ` Trevor Gross
  2 siblings, 3 replies; 40+ messages in thread
From: Matthew Wilcox @ 2024-01-29 17:59 UTC (permalink / raw)
  To: Alice Ryhl
  Cc: Miguel Ojeda, Alex Gaynor, Wedson Almeida Filho, Boqun Feng,
	Gary Guo, Björn Roy Baron, Benno Lossin, Andreas Hindborg,
	Kees Cook, Al Viro, Andrew Morton, Greg Kroah-Hartman,
	Arve Hjønnevåg, Todd Kjos, Martijn Coenen,
	Joel Fernandes, Carlos Llamas, Suren Baghdasaryan, Arnd Bergmann,
	linux-mm, linux-kernel, rust-for-linux, Christian Brauner

On Wed, Jan 24, 2024 at 11:20:23AM +0000, Alice Ryhl wrote:
> Adds a new struct called `Page` that wraps a pointer to `struct page`.
> This struct is assumed to hold ownership over the page, so that Rust
> code can allocate and manage pages directly.

OK ...

> This patch only adds support for pages of order zero, as that is all
> Rust Binder needs. However, it is written to make it easy to add support
> for higher-order pages in the future. To do that, you would add a const
> generic parameter to `Page` that specifies the order. Most of the
> methods do not need to be adjusted, as the logic for dealing with
> mapping multiple pages at once can be isolated to just the
> `with_pointer_into_page` method. Finally, the struct can be renamed to
> `Pages<ORDER>`, and the type alias `Page = Pages<0>` can be introduced.

This description concerns me because it reads like you're not keeping
up with the current thinking in MM about what pages are and how we're
improving the type hierarchy.  As in, we're creating one instead of
allowing the current mish-mash of absolutely everything to continue.

Are you the right person to ask about the operations that Binder does
with a page so we can figure out where it fits in the type hierarchy?

> Rust Binder needs to manage pages directly as that is how transactions
> are delivered: Each process has an mmap'd region for incoming
> transactions. When an incoming transaction arrives, the Binder driver
> will choose a region in the mmap, allocate and map the relevant pages
> manually, and copy the incoming transaction directly into the page. This
> architecture allows the driver to copy transactions directly from the
> address space of one process to another, without an intermediate copy
> to a kernel buffer.

Everything about this says "This is what a first year comp sci student
thinks will be fast".  Oh well, the thinking here isn't your fault.

> @@ -127,6 +129,24 @@ int rust_helper_signal_pending(struct task_struct *t)
>  }
>  EXPORT_SYMBOL_GPL(rust_helper_signal_pending);
>  
> +struct page *rust_helper_alloc_pages(gfp_t gfp_mask, unsigned int order)
> +{
> +       return alloc_pages(gfp_mask, order);
> +}
> +EXPORT_SYMBOL_GPL(rust_helper_alloc_pages);
> +
> +void *rust_helper_kmap_local_page(struct page *page)
> +{
> +       return kmap_local_page(page);
> +}
> +EXPORT_SYMBOL_GPL(rust_helper_kmap_local_page);
> +
> +void rust_helper_kunmap_local(const void *addr)
> +{
> +       kunmap_local(addr);
> +}
> +EXPORT_SYMBOL_GPL(rust_helper_kunmap_local);

I remain opposed to all these fidgetty little helpers.  Particularly
when they're noops on machines without HIGHMEM, which is ~all of them.

> +/// A bitwise shift for the page size.
> +pub const PAGE_SHIFT: usize = bindings::PAGE_SHIFT as usize;

Does PAGE_SHIFT really need to be as large as 'usize'?  If it's more
than 63 by the time I retire, I'll be shocked.  If it's more than 127
by the time I die, I'll be even more shocked.  And it won't get to 255
by the heat death of the universe.

> +/// The number of bytes in a page.
> +pub const PAGE_SIZE: usize = 1 << PAGE_SHIFT;

This is appropriately usize.

> +/// A bitwise mask for the page size.
> +pub const PAGE_MASK: usize = PAGE_SIZE - 1;

Are you trying to get somebody killed?

include/asm-generic/page.h:#define PAGE_MASK    (~(PAGE_SIZE-1))

Defining PAGE_MASK to be the opposite set of bits in C and Rust is
going to bite us all day every day for a decade.

> +impl Page {
> +    /// Allocates a new set of contiguous pages.
> +    pub fn new() -> Result<Self, AllocError> {
> +        // SAFETY: These are the correct arguments to allocate a single page.
> +        let page = unsafe {
> +            bindings::alloc_pages(
> +                bindings::GFP_KERNEL | bindings::__GFP_ZERO | bindings::__GFP_HIGHMEM,
> +                0,
> +            )
> +        };

This feels too Binder-specific to be 'Page'.  Pages are not necessarily
allocated with GFP_HIGHMEM, nor are they necessarily zeroed.  Maybe you
want a BinderPage type?


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 3/3] rust: add abstraction for `struct page`
  2024-01-29 17:59   ` Matthew Wilcox
@ 2024-01-29 18:56     ` Carlos Llamas
  2024-01-29 20:19       ` Matthew Wilcox
  2024-01-29 21:27     ` Alice Ryhl
  2024-01-30  9:02     ` Andreas Hindborg
  2 siblings, 1 reply; 40+ messages in thread
From: Carlos Llamas @ 2024-01-29 18:56 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Alice Ryhl, Miguel Ojeda, Alex Gaynor, Wedson Almeida Filho,
	Boqun Feng, Gary Guo, Björn Roy Baron, Benno Lossin,
	Andreas Hindborg, Kees Cook, Al Viro, Andrew Morton,
	Greg Kroah-Hartman, Arve Hjønnevåg, Todd Kjos,
	Martijn Coenen, Joel Fernandes, Suren Baghdasaryan,
	Arnd Bergmann, linux-mm, linux-kernel, rust-for-linux,
	Christian Brauner

On Mon, Jan 29, 2024 at 05:59:53PM +0000, Matthew Wilcox wrote:
> On Wed, Jan 24, 2024 at 11:20:23AM +0000, Alice Ryhl wrote:
> > Adds a new struct called `Page` that wraps a pointer to `struct page`.
> > This struct is assumed to hold ownership over the page, so that Rust
> > code can allocate and manage pages directly.
> 
> OK ...
> 
> > This patch only adds support for pages of order zero, as that is all
> > Rust Binder needs. However, it is written to make it easy to add support
> > for higher-order pages in the future. To do that, you would add a const
> > generic parameter to `Page` that specifies the order. Most of the
> > methods do not need to be adjusted, as the logic for dealing with
> > mapping multiple pages at once can be isolated to just the
> > `with_pointer_into_page` method. Finally, the struct can be renamed to
> > `Pages<ORDER>`, and the type alias `Page = Pages<0>` can be introduced.
> 
> This description concerns me because it reads like you're not keeping
> up with the current thinking in MM about what pages are and how we're
> improving the type hierarchy.  As in, we're creating one instead of
> allowing the current mish-mash of absolutely everything to continue.
> 
> Are you the right person to ask about the operations that Binder does
> with a page so we can figure out where it fits in the type hierarchy?
 
I would guess you are suggesting a transition to folios here? I don't
think there is anything in binder that would impede such a change. The
core idea behind binder IPC is to skip kernel buffering and perform
instead a "copy-once" of messages across users memory. In theory this
seems efficient but I haven't seen any data proving so. So take that
with a grain of salt.

The size of these binder messages is not limited per se and can trigger
the allocation of multiple pages. However, in reality the vast majority
of transactions are under 1K payload. FWICT, it seems reasonable to
switch over to folios.

The only concern I have is that we've implemented a binder LRU-shrinker
mechanism. We add the unused pages to our freelist and give them back to
the system on demand. However, if a new transaction requests the unused
page before it gets reclaimed it is simply removed from this freelist.
This is convenient as we avoid taking the mmap sem during this process.
I don't know how this mechanism would look with folios though?

--
Carlos Llamas

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 3/3] rust: add abstraction for `struct page`
  2024-01-29 18:56     ` Carlos Llamas
@ 2024-01-29 20:19       ` Matthew Wilcox
  0 siblings, 0 replies; 40+ messages in thread
From: Matthew Wilcox @ 2024-01-29 20:19 UTC (permalink / raw)
  To: Carlos Llamas
  Cc: Alice Ryhl, Miguel Ojeda, Alex Gaynor, Wedson Almeida Filho,
	Boqun Feng, Gary Guo, Björn Roy Baron, Benno Lossin,
	Andreas Hindborg, Kees Cook, Al Viro, Andrew Morton,
	Greg Kroah-Hartman, Arve Hjønnevåg, Todd Kjos,
	Martijn Coenen, Joel Fernandes, Suren Baghdasaryan,
	Arnd Bergmann, linux-mm, linux-kernel, rust-for-linux,
	Christian Brauner

On Mon, Jan 29, 2024 at 06:56:58PM +0000, Carlos Llamas wrote:
> On Mon, Jan 29, 2024 at 05:59:53PM +0000, Matthew Wilcox wrote:
> > On Wed, Jan 24, 2024 at 11:20:23AM +0000, Alice Ryhl wrote:
> > > This patch only adds support for pages of order zero, as that is all
> > > Rust Binder needs. However, it is written to make it easy to add support
> > > for higher-order pages in the future. To do that, you would add a const
> > > generic parameter to `Page` that specifies the order. Most of the
> > > methods do not need to be adjusted, as the logic for dealing with
> > > mapping multiple pages at once can be isolated to just the
> > > `with_pointer_into_page` method. Finally, the struct can be renamed to
> > > `Pages<ORDER>`, and the type alias `Page = Pages<0>` can be introduced.
> > 
> > This description concerns me because it reads like you're not keeping
> > up with the current thinking in MM about what pages are and how we're
> > improving the type hierarchy.  As in, we're creating one instead of
> > allowing the current mish-mash of absolutely everything to continue.
> > 
> > Are you the right person to ask about the operations that Binder does
> > with a page so we can figure out where it fits in the type hierarchy?
>  
> I would guess you are suggesting a transition to folios here? I don't

I don't think folios are the right type to use.  Folios are for files and
anonymous memory; things which are managed on the LRU, have refcounts and
mapcounts, can be found with an rmap, need private data, belong to memory
control groups, belong to either an inode or an anon_vma, and so on.

It's _possible_ that Binder fits this use case well enough, but my
guess is that it needs its own type, or maybe it's the initial example
of a different type from folios (right now we have three types: folios,
slabs and ptdescs, but more are on their way).

> The only concern I have is that we've implemented a binder LRU-shrinker
> mechanism. We add the unused pages to our freelist and give them back to
> the system on demand. However, if a new transaction requests the unused
> page before it gets reclaimed it is simply removed from this freelist.
> This is convenient as we avoid taking the mmap sem during this process.
> I don't know how this mechanism would look with folios though?

This doesn't seem like too much of a problem.  The key thing is that
with memdescs, you get to define your own data type of whatever size
makes sense for you.  Until then you're limited to what we can fit into
a struct page (and we need to be careful not to step on stuff that other
people look at like the refcount).

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 3/3] rust: add abstraction for `struct page`
  2024-01-29 17:59   ` Matthew Wilcox
  2024-01-29 18:56     ` Carlos Llamas
@ 2024-01-29 21:27     ` Alice Ryhl
  2024-01-30  9:02     ` Andreas Hindborg
  2 siblings, 0 replies; 40+ messages in thread
From: Alice Ryhl @ 2024-01-29 21:27 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Miguel Ojeda, Alex Gaynor, Wedson Almeida Filho, Boqun Feng,
	Gary Guo, Björn Roy Baron, Benno Lossin, Andreas Hindborg,
	Kees Cook, Al Viro, Andrew Morton, Greg Kroah-Hartman,
	Arve Hjønnevåg, Todd Kjos, Martijn Coenen,
	Joel Fernandes, Carlos Llamas, Suren Baghdasaryan, Arnd Bergmann,
	linux-mm, linux-kernel, rust-for-linux, Christian Brauner,
	liam.howlett

On Mon, Jan 29, 2024 at 6:59 PM Matthew Wilcox <willy@infradead.org> wrote:
>
> On Wed, Jan 24, 2024 at 11:20:23AM +0000, Alice Ryhl wrote:
> > Adds a new struct called `Page` that wraps a pointer to `struct page`.
> > This struct is assumed to hold ownership over the page, so that Rust
> > code can allocate and manage pages directly.
>
> OK ...

Thank you for taking your time to review my wrappers!

> > This patch only adds support for pages of order zero, as that is all
> > Rust Binder needs. However, it is written to make it easy to add support
> > for higher-order pages in the future. To do that, you would add a const
> > generic parameter to `Page` that specifies the order. Most of the
> > methods do not need to be adjusted, as the logic for dealing with
> > mapping multiple pages at once can be isolated to just the
> > `with_pointer_into_page` method. Finally, the struct can be renamed to
> > `Pages<ORDER>`, and the type alias `Page = Pages<0>` can be introduced.
>
> This description concerns me because it reads like you're not keeping
> up with the current thinking in MM about what pages are and how we're
> improving the type hierarchy.  As in, we're creating one instead of
> allowing the current mish-mash of absolutely everything to continue.

That's very possible. I have a good understanding about how C binder
interacts with pages, but I don't know too much about the abstractions
that Binder is not using.

> Are you the right person to ask about the operations that Binder does
> with a page so we can figure out where it fits in the type hierarchy?

I can definitely answer questions about that. If we can find another
abstraction that is not too far away from what we are doing today,
then I am open to looking into whether we can do that instead of the
current approach. However, I want to avoid large deviations from C
Binder, at least before I get Rust binder into the kernel tree.

A short overview of what Binder does:
* Every process will mmap a region of memory. This memory region will
contain the data for all incoming transactions sent to that process.
Only the kernel can modify these pages.
* Binder has a data structure that keeps track of where the
allocations in this mmap'd region are. It has functionality to find an
open interval of a given size (so it's essentially an allocator). This
is called the "range allocator".
* The pages in this region are allocated lazily whenever the range
allocator starts using the page.
* When the range allocator stops using a page, it is marked as unused
and put on an LRU list.
* There's a custom shrinker that can free pages not currently used by
any allocation.

So when process A sends a transaction to process B using the
appropriate ioctl, process A will go into the range allocator for
process B and reserve a range for the transaction that A is sending.
If any pages in the resulting range are missing, then new pages are
allocated, and process A will vm_insert_page them into process B's
mmap. Then, process A will map the page with kmap_local_page, and use
copy_from_user to copy data *directly* from A's userspace into a page
in process B's address space.

Note that transactions don't have uniform sizes. Think of them as
arbitrary buffers provided by userspace. They will generally not be
larger than a few hundred bytes each, but larger transactions are
possible. The mmap for a process is usually 4 MB.

The biggest user of Page is here in the RFC:
https://lore.kernel.org/rust-for-linux/20231101-rust-binder-v1-19-08ba9197f637@google.com/

The range allocator is defined here:
https://lore.kernel.org/rust-for-linux/20231101-rust-binder-v1-6-08ba9197f637@google.com/

> > Rust Binder needs to manage pages directly as that is how transactions
> > are delivered: Each process has an mmap'd region for incoming
> > transactions. When an incoming transaction arrives, the Binder driver
> > will choose a region in the mmap, allocate and map the relevant pages
> > manually, and copy the incoming transaction directly into the page. This
> > architecture allows the driver to copy transactions directly from the
> > address space of one process to another, without an intermediate copy
> > to a kernel buffer.
>
> Everything about this says "This is what a first year comp sci student
> thinks will be fast".  Oh well, the thinking here isn't your fault.

Ultimately, I am just replicating what C Binder does.

I had a long discussion with Liam Howlett at Plumbers where we
discussed various alternatives to the hand-rolled stuff that Binder
does. Liam thought that we may be able to replace the entire thing
with maple trees. These are things that I definitely want to
experiment with, but I am reluctant to try replacing the entire thing
with a maple tree, at least until I get Rust Binder into the kernel
tree.

In general, there are many places in Binder where we are hand-rolling
something that has an alternative elsewhere in the kernel, but
replacing them is not always trivial. The hand-rolled versions often
have Binder-specific optimizations that make it a regression to
replace it with the general thing.

> > @@ -127,6 +129,24 @@ int rust_helper_signal_pending(struct task_struct *t)
> >  }
> >  EXPORT_SYMBOL_GPL(rust_helper_signal_pending);
> >
> > +struct page *rust_helper_alloc_pages(gfp_t gfp_mask, unsigned int order)
> > +{
> > +       return alloc_pages(gfp_mask, order);
> > +}
> > +EXPORT_SYMBOL_GPL(rust_helper_alloc_pages);
> > +
> > +void *rust_helper_kmap_local_page(struct page *page)
> > +{
> > +       return kmap_local_page(page);
> > +}
> > +EXPORT_SYMBOL_GPL(rust_helper_kmap_local_page);
> > +
> > +void rust_helper_kunmap_local(const void *addr)
> > +{
> > +       kunmap_local(addr);
> > +}
> > +EXPORT_SYMBOL_GPL(rust_helper_kunmap_local);
>
> I remain opposed to all these fidgetty little helpers.  Particularly
> when they're noops on machines without HIGHMEM, which is ~all of them.

I don't disagree with you, but there's not much I can do about them. I
can wrap them in #ifdef HIGHMEM if they are no-ops or exported without
HIGHMEM?

> > +/// A bitwise shift for the page size.
> > +pub const PAGE_SHIFT: usize = bindings::PAGE_SHIFT as usize;
>
> Does PAGE_SHIFT really need to be as large as 'usize'?  If it's more
> than 63 by the time I retire, I'll be shocked.  If it's more than 127
> by the time I die, I'll be even more shocked.  And it won't get to 255
> by the heat death of the universe.

Rust usually requires that both operands to an integer operation are
of the same integer type, requiring explicit conversions if that is
not the case. That is why I am using usize here.

However, it seems like Rust doesn't actually require that for the <<
operator, so I guess it doesn't matter much for this particular
constant.

> > +/// A bitwise mask for the page size.
> > +pub const PAGE_MASK: usize = PAGE_SIZE - 1;
>
> Are you trying to get somebody killed?
>
> include/asm-generic/page.h:#define PAGE_MASK    (~(PAGE_SIZE-1))
>
> Defining PAGE_MASK to be the opposite set of bits in C and Rust is
> going to bite us all day every day for a decade.

Oops, that's embarrassing. Thank you for catching that. I'll make sure
to change it.

> > +impl Page {
> > +    /// Allocates a new set of contiguous pages.
> > +    pub fn new() -> Result<Self, AllocError> {
> > +        // SAFETY: These are the correct arguments to allocate a single page.
> > +        let page = unsafe {
> > +            bindings::alloc_pages(
> > +                bindings::GFP_KERNEL | bindings::__GFP_ZERO | bindings::__GFP_HIGHMEM,
> > +                0,
> > +            )
> > +        };
>
> This feels too Binder-specific to be 'Page'.  Pages are not necessarily
> allocated with GFP_HIGHMEM, nor are they necessarily zeroed.  Maybe you
> want a BinderPage type?

We can add a constructor that takes a flag argument, so this type
doesn't necessarily have to be tied to those specific arguments.

Alice

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 3/3] rust: add abstraction for `struct page`
  2024-01-29 17:59   ` Matthew Wilcox
  2024-01-29 18:56     ` Carlos Llamas
  2024-01-29 21:27     ` Alice Ryhl
@ 2024-01-30  9:02     ` Andreas Hindborg
  2024-01-30  9:06       ` Alice Ryhl
  2 siblings, 1 reply; 40+ messages in thread
From: Andreas Hindborg @ 2024-01-30  9:02 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Alice Ryhl, Miguel Ojeda, Alex Gaynor, Wedson Almeida Filho,
	Boqun Feng, Gary Guo, Björn Roy Baron, Benno Lossin,
	Kees Cook, Al Viro, Andrew Morton, Greg Kroah-Hartman,
	Arve Hjønnevåg, Todd Kjos, Martijn Coenen,
	Joel Fernandes, Carlos Llamas, Suren Baghdasaryan, Arnd Bergmann,
	linux-mm, linux-kernel, rust-for-linux, Christian Brauner


Matthew Wilcox <willy@infradead.org> writes:

> On Wed, Jan 24, 2024 at 11:20:23AM +0000, Alice Ryhl wrote:
>> +impl Page {
>> +    /// Allocates a new set of contiguous pages.
>> +    pub fn new() -> Result<Self, AllocError> {
>> +        // SAFETY: These are the correct arguments to allocate a single page.
>> +        let page = unsafe {
>> +            bindings::alloc_pages(
>> +                bindings::GFP_KERNEL | bindings::__GFP_ZERO | bindings::__GFP_HIGHMEM,
>> +                0,
>> +            )
>> +        };
>
> This feels too Binder-specific to be 'Page'.  Pages are not necessarily
> allocated with GFP_HIGHMEM, nor are they necessarily zeroed.  Maybe you
> want a BinderPage type?

Rust null block uses the same definition of these flags [1], so there is
at least that synergy.

I feel like we had the discussion of the flags before, although I can't
find the thread now. I think the conclusion was that we fix them until
we have code that need to actually change them (as to not add dead
code).

BR Andreas

[1] https://github.com/metaspace/linux/blob/702026e6645193fc89b7d55e00dac75fd492bfb8/rust/kernel/pages.rs#L28

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 3/3] rust: add abstraction for `struct page`
  2024-01-30  9:02     ` Andreas Hindborg
@ 2024-01-30  9:06       ` Alice Ryhl
  0 siblings, 0 replies; 40+ messages in thread
From: Alice Ryhl @ 2024-01-30  9:06 UTC (permalink / raw)
  To: Andreas Hindborg
  Cc: Matthew Wilcox, Miguel Ojeda, Alex Gaynor, Wedson Almeida Filho,
	Boqun Feng, Gary Guo, Björn Roy Baron, Benno Lossin,
	Kees Cook, Al Viro, Andrew Morton, Greg Kroah-Hartman,
	Arve Hjønnevåg, Todd Kjos, Martijn Coenen,
	Joel Fernandes, Carlos Llamas, Suren Baghdasaryan, Arnd Bergmann,
	linux-mm, linux-kernel, rust-for-linux, Christian Brauner

On Tue, Jan 30, 2024 at 10:02 AM Andreas Hindborg
<a.hindborg@samsung.com> wrote:
>
>
> Matthew Wilcox <willy@infradead.org> writes:
>
> > On Wed, Jan 24, 2024 at 11:20:23AM +0000, Alice Ryhl wrote:
> >> +impl Page {
> >> +    /// Allocates a new set of contiguous pages.
> >> +    pub fn new() -> Result<Self, AllocError> {
> >> +        // SAFETY: These are the correct arguments to allocate a single page.
> >> +        let page = unsafe {
> >> +            bindings::alloc_pages(
> >> +                bindings::GFP_KERNEL | bindings::__GFP_ZERO | bindings::__GFP_HIGHMEM,
> >> +                0,
> >> +            )
> >> +        };
> >
> > This feels too Binder-specific to be 'Page'.  Pages are not necessarily
> > allocated with GFP_HIGHMEM, nor are they necessarily zeroed.  Maybe you
> > want a BinderPage type?
>
> Rust null block uses the same definition of these flags [1], so there is
> at least that synergy.
>
> I feel like we had the discussion of the flags before, although I can't
> find the thread now. I think the conclusion was that we fix them until
> we have code that need to actually change them (as to not add dead
> code).

I don't think there's any problem with replacing the constructor with
one that takes a flag argument dead-code-wise. I can definitely do
that.

Alice

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 3/3] rust: add abstraction for `struct page`
  2024-01-26  0:46   ` Boqun Feng
  2024-01-26 12:33     ` Alice Ryhl
@ 2024-01-30  9:15     ` Andreas Hindborg (Samsung)
  1 sibling, 0 replies; 40+ messages in thread
From: Andreas Hindborg (Samsung) @ 2024-01-30  9:15 UTC (permalink / raw)
  To: Boqun Feng
  Cc: Alice Ryhl, Miguel Ojeda, Alex Gaynor, Wedson Almeida Filho,
	Gary Guo, Björn Roy Baron, Benno Lossin, Kees Cook, Al Viro,
	Andrew Morton, Greg Kroah-Hartman, Arve Hjønnevåg,
	Todd Kjos, Martijn Coenen, Joel Fernandes, Carlos Llamas,
	Suren Baghdasaryan, Arnd Bergmann, linux-mm, linux-kernel,
	rust-for-linux, Christian Brauner


Boqun Feng <boqun.feng@gmail.com> writes:

> On Wed, Jan 24, 2024 at 11:20:23AM +0000, Alice Ryhl wrote:

>> +
>> +    /// Maps the page and writes into it from the given buffer.
>> +    ///
>> +    /// # Safety
>> +    ///
>> +    /// Callers must ensure that `src` is valid for reading `len` bytes.
>> +    pub unsafe fn write(&self, src: *const u8, offset: usize, len: usize) -> Result {
>
> Use a slice like type as `src` maybe? Then the function can be safe:
>
> 	pub fn write<S: AsRef<[u8]>>(&self, src: S, offset: usize) -> Result
>
> Besides, since `Page` impl `Sync`, shouldn't this `write` and the
> `fill_zero` be a `&mut self` function? Or make them both `unsafe`
> because of potential race and add some safety requirement?

We can add a safe version that takes a slice later, as here [1]. Same
for the with_* that take a closure.

It would be nice to model ownership of pages that are only mapped in
kernel with `&mut`.

BR Andreas

[1] https://github.com/metaspace/linux/blob/702026e6645193fc89b7d55e00dac75fd492bfb8/rust/kernel/pages.rs#L178

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 1/3] rust: add userspace pointers
  2024-01-24 11:20 ` [PATCH 1/3] rust: add userspace pointers Alice Ryhl
  2024-01-24 23:12   ` Valentin Obst
@ 2024-02-01  4:06   ` Trevor Gross
  2024-02-08 12:53     ` Alice Ryhl
  1 sibling, 1 reply; 40+ messages in thread
From: Trevor Gross @ 2024-02-01  4:06 UTC (permalink / raw)
  To: Alice Ryhl
  Cc: Miguel Ojeda, Alex Gaynor, Wedson Almeida Filho, Boqun Feng,
	Gary Guo, Björn Roy Baron, Benno Lossin, Andreas Hindborg,
	Kees Cook, Al Viro, Andrew Morton, Greg Kroah-Hartman,
	Arve Hjønnevåg, Todd Kjos, Martijn Coenen,
	Joel Fernandes, Carlos Llamas, Suren Baghdasaryan, Arnd Bergmann,
	linux-mm, linux-kernel, rust-for-linux, Christian Brauner

On Wed, Jan 24, 2024 at 6:21 AM Alice Ryhl <aliceryhl@google.com> wrote:
> --- /dev/null
> +++ b/rust/kernel/user_ptr.rs
> @@ -0,0 +1,222 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +//! User pointers.
> +//!
> +//! C header: [`include/linux/uaccess.h`](../../../../include/linux/uaccess.h)
> +
> +// Comparison with MAX_USER_OP_LEN triggers this lint on platforms
> +// where `c_ulong == usize`.
> +#![allow(clippy::absurd_extreme_comparisons)]
> +
> +use crate::{bindings, error::code::*, error::Result};
> +use alloc::vec::Vec;
> +use core::ffi::{c_ulong, c_void};
> +
> +/// The maximum length of a operation using `copy_[from|to]_user`.
> +///
> +/// If a usize is not greater than this constant, then casting it to `c_ulong`
> +/// is guaranteed to be lossless.
> +const MAX_USER_OP_LEN: usize = c_ulong::MAX as usize;
> +
> +/// A pointer to an area in userspace memory, which can be either read-only or
> +/// read-write.
> +///
> +/// All methods on this struct are safe: invalid pointers return `EFAULT`.

Maybe

    ... attempting to read or write invalid pointers will return `EFAULT`.

> +/// Concurrent access, *including data races to/from userspace memory*, is
> +/// permitted, because fundamentally another userspace thread/process could
> +/// always be modifying memory at the same time (in the same way that userspace
> +/// Rust's [`std::io`] permits data races with the contents of files on disk).
> +/// In the presence of a race, the exact byte values read/written are
> +/// unspecified but the operation is well-defined. Kernelspace code should
> +/// validate its copy of data after completing a read, and not expect that
> +/// multiple reads of the same address will return the same value.
> +///
> +/// These APIs are designed to make it difficult to accidentally write TOCTOU
> +/// bugs. Every time you read from a memory location, the pointer is advanced by
> +/// the length so that you cannot use that reader to read the same memory
> +/// location twice. Preventing double-fetches avoids TOCTOU bugs. This is
> +/// accomplished by taking `self` by value to prevent obtaining multiple readers
> +/// on a given [`UserSlicePtr`], and the readers only permitting forward reads.
> +/// If double-fetching a memory location is necessary for some reason, then that
> +/// is done by creating multiple readers to the same memory location, e.g. using
> +/// [`clone_reader`].

Maybe something like

    Every time a memory location is read, the reader's position is advanced by
    the read length and the next read will start from there. This helps prevent
    accidentally reading the same location twice and causing a TOCTOU bug.

    Creating a [`UserSlicePtrReader`] and/or [`UserSlicePtrWriter`]
    consumes the `UserSlicePtr`, helping ensure that there aren't multiple
    readers or writers to the same location.

    If double fetching a memory location...

> +///
> +/// Constructing a [`UserSlicePtr`] performs no checks on the provided address
> +/// and length, it can safely be constructed inside a kernel thread with no
> +/// current userspace process. Reads and writes wrap the kernel APIs

Maybe some of this is better documented on `new` rather than the type?

> +/// `copy_from_user` and `copy_to_user`, which check the memory map of the
> +/// current process and enforce that the address range is within the user range
> +/// (no additional calls to `access_ok` are needed).
> +///
> +/// [`std::io`]: https://doc.rust-lang.org/std/io/index.html
> +/// [`clone_reader`]: UserSlicePtrReader::clone_reader
> +pub struct UserSlicePtr(*mut c_void, usize);

I think just `UserSlice` could work for the name here, since
`SlicePtr` is a bit redundant (slices automatically containing a
pointer). Also makes the Reader/Writer types a bit less wordy. Though
I understand wanting to make it discoverable for C users.

Is some sort of `Debug` implementation useful?

> +impl UserSlicePtr {
> +    /// Constructs a user slice from a raw pointer and a length in bytes.
> +    ///
> +    /// Callers must be careful to avoid time-of-check-time-of-use
> +    /// (TOCTOU) issues. The simplest way is to create a single instance of
> +    /// [`UserSlicePtr`] per user memory block as it reads each byte at
> +    /// most once.
> +    pub fn new(ptr: *mut c_void, length: usize) -> Self {
> +        UserSlicePtr(ptr, length)
> +    }
> +
> +    /// Reads the entirety of the user slice.
> +    ///
> +    /// Returns `EFAULT` if the address does not currently point to
> +    /// mapped, readable memory.
> +    pub fn read_all(self) -> Result<Vec<u8>> {
> +        self.reader().read_all()
> +    }

std::io uses the signature

    fn read_to_end(&mut self, buf: &mut Vec<u8>) -> Result<usize> { ... }

Maybe this could be better here too, since it allows reusing the
vector without going through `read_raw`.

https://doc.rust-lang.org/std/io/trait.Read.html#method.read_to_end

> +    /// Constructs a [`UserSlicePtrReader`].
> +    pub fn reader(self) -> UserSlicePtrReader {
> +        UserSlicePtrReader(self.0, self.1)
> +    }

I wonder if this should be called `into_reader` to note that it is a
consuming method. Indifferent here, since there aren't any non-`self`
variants.

> +
> +    /// Constructs a [`UserSlicePtrWriter`].
> +    pub fn writer(self) -> UserSlicePtrWriter {
> +        UserSlicePtrWriter(self.0, self.1)
> +    }
> +
> +    /// Constructs both a [`UserSlicePtrReader`] and a [`UserSlicePtrWriter`].

Could you add a brief about what is or isn't allowed when you have a
reader and a writer to the same location?

> +    pub fn reader_writer(self) -> (UserSlicePtrReader, UserSlicePtrWriter) {
> +        (
> +            UserSlicePtrReader(self.0, self.1),
> +            UserSlicePtrWriter(self.0, self.1),
> +        )
> +    }
> +}
> +
> +/// A reader for [`UserSlicePtr`].
> +///
> +/// Used to incrementally read from the user slice.
> +pub struct UserSlicePtrReader(*mut c_void, usize);
> +
> +impl UserSlicePtrReader {
> +    /// Skip the provided number of bytes.
> +    ///
> +    /// Returns an error if skipping more than the length of the buffer.
> +    pub fn skip(&mut self, num_skip: usize) -> Result {
> +        // Update `self.1` first since that's the fallible one.
> +        self.1 = self.1.checked_sub(num_skip).ok_or(EFAULT)?;
> +        self.0 = self.0.wrapping_add(num_skip);

I think it would be better to change to named structs to make this more clear.

> +        Ok(())
> +    }
> +
> +    /// Create a reader that can access the same range of data.
> +    ///
> +    /// Reading from the clone does not advance the current reader.
> +    ///
> +    /// The caller should take care to not introduce TOCTOU issues.

Could you add an example of what should be avoided?

> +    pub fn clone_reader(&self) -> UserSlicePtrReader {
> +        UserSlicePtrReader(self.0, self.1)
> +    }
> +
> +    /// Returns the number of bytes left to be read from this.

Remove "from this" or change to "from this reader".

> +    ///
> +    /// Note that even reading less than this number of bytes may fail.> +    pub fn len(&self) -> usize {
> +        self.1
> +    }
> +
> +    /// Returns `true` if no data is available in the io buffer.
> +    pub fn is_empty(&self) -> bool {
> +        self.1 == 0
> +    }
> +
> +    /// Reads raw data from the user slice into a raw kernel buffer.
> +    ///
> +    /// Fails with `EFAULT` if the read encounters a page fault.

This is raised if the address is invalid right, not just page faults?
Or, are page faults just the only mode of indication that a pointer
isn't valid.

I know this is the same as copy_from_user, but I don't understand that
too well myself. I'm also a bit curious what happens if you pass a
kernel pointer to these methods.

> +    /// # Safety
> +    ///
> +    /// The `out` pointer must be valid for writing `len` bytes.
> +    pub unsafe fn read_raw(&mut self, out: *mut u8, len: usize) -> Result {

What is the motivation for taking raw pointers rather than a slice
here? In which case it could just be called `read`.

> +        if len > self.1 || len > MAX_USER_OP_LEN {
> +            return Err(EFAULT);
> +        }
> +        // SAFETY: The caller promises that `out` is valid for writing `len` bytes.
> +        let res = unsafe { bindings::copy_from_user(out.cast::<c_void>(), self.0, len as c_ulong) };

To me, it would be more clear to `c_ulong::try_from(len).map_err(|_|
EFAULT)?` rather than using MAX_USER_OP_LEN with an `as` cast, since
it makes it immediately clear that the conversion is ok (and is doing
that same comparison)

> +        if res != 0 {
> +            return Err(EFAULT);
> +        }
> +        // Since this is not a pointer to a valid object in our program,
> +        // we cannot use `add`, which has C-style rules for defined
> +        // behavior.
> +        self.0 = self.0.wrapping_add(len);
> +        self.1 -= len;
> +        Ok(())
> +    }
> +
> +    /// Reads all remaining data in the buffer into a vector.
> +    ///
> +    /// Fails with `EFAULT` if the read encounters a page fault.
> +    pub fn read_all(&mut self) -> Result<Vec<u8>> {

Same as the above note about read_to_end

> +        let len = self.len();
> +        let mut data = Vec::<u8>::try_with_capacity(len)?;
> +
> +        // SAFETY: The output buffer is valid for `len` bytes as we just allocated that much space.
> +        unsafe { self.read_raw(data.as_mut_ptr(), len)? };

If making the change above (possibly even if not), going through
https://doc.rust-lang.org/std/vec/struct.Vec.html#method.spare_capacity_mut
can be more clear about uninitialized data.

> +
> +        // SAFETY: Since the call to `read_raw` was successful, the first `len` bytes of the vector
> +        // have been initialized.
> +        unsafe { data.set_len(len) };
> +        Ok(data)
> +    }
> +}
> +
> +/// A writer for [`UserSlicePtr`].
> +///
> +/// Used to incrementally write into the user slice.
> +pub struct UserSlicePtrWriter(*mut c_void, usize);
> +
> +impl UserSlicePtrWriter {
> +    /// Returns the amount of space remaining in this buffer.
> +    ///
> +    /// Note that even writing less than this number of bytes may fail.
> +    pub fn len(&self) -> usize {
> +        self.1
> +    }
> +
> +    /// Returns `true` if no more data can be written to this buffer.
> +    pub fn is_empty(&self) -> bool {
> +        self.1 == 0
> +    }
> +
> +    /// Writes raw data to this user pointer from a raw kernel buffer.
> +    ///
> +    /// Fails with `EFAULT` if the write encounters a page fault.
> +    ///
> +    /// # Safety
> +    ///
> +    /// The `data` pointer must be valid for reading `len` bytes.
> +    pub unsafe fn write_raw(&mut self, data: *const u8, len: usize) -> Result {

Same as for Reader regarding signature

> +        if len > self.1 || len > MAX_USER_OP_LEN {
> +            return Err(EFAULT);
> +        }
> +        let res = unsafe { bindings::copy_to_user(self.0, data.cast::<c_void>(), len as c_ulong) };

Same as for Reader regarding `as` casts.

> +        if res != 0 {
> +            return Err(EFAULT);
> +        }
> +        // Since this is not a pointer to a valid object in our program,
> +        // we cannot use `add`, which has C-style rules for defined
> +        // behavior.
> +        self.0 = self.0.wrapping_add(len);
> +        self.1 -= len;
> +        Ok(())
> +    }
> +
> +    /// Writes the provided slice to this user pointer.
> +    ///
> +    /// Fails with `EFAULT` if the write encounters a page fault.
> +    pub fn write_slice(&mut self, data: &[u8]) -> Result {

This could probably just be called `write`, which would be consistent
with std::io::Write.

> +        let len = data.len();
> +        let ptr = data.as_ptr();
> +        // SAFETY: The pointer originates from a reference to a slice of length
> +        // `len`, so the pointer is valid for reading `len` bytes.
> +        unsafe { self.write_raw(ptr, len) }
> +    }
> +}
>
> --
> 2.43.0.429.g432eaa2c6b-goog

The docs could use usage examples, but I am sure you are planning on
that already :)

- Trevor

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 2/3] rust: add typed accessors for userspace pointers
  2024-01-24 11:20 ` [PATCH 2/3] rust: add typed accessors for " Alice Ryhl
  2024-01-24 23:46   ` Valentin Obst
  2024-01-25 12:26   ` Arnd Bergmann
@ 2024-02-01  5:03   ` Trevor Gross
  2024-02-08 13:14     ` Alice Ryhl
  2 siblings, 1 reply; 40+ messages in thread
From: Trevor Gross @ 2024-02-01  5:03 UTC (permalink / raw)
  To: Alice Ryhl
  Cc: Miguel Ojeda, Alex Gaynor, Wedson Almeida Filho, Boqun Feng,
	Gary Guo, Björn Roy Baron, Benno Lossin, Andreas Hindborg,
	Kees Cook, Al Viro, Andrew Morton, Greg Kroah-Hartman,
	Arve Hjønnevåg, Todd Kjos, Martijn Coenen,
	Joel Fernandes, Carlos Llamas, Suren Baghdasaryan, Arnd Bergmann,
	linux-mm, linux-kernel, rust-for-linux, Christian Brauner

On Wed, Jan 24, 2024 at 6:21 AM Alice Ryhl <aliceryhl@google.com> wrote:

I see this patch answers some of my naming questions from 1/3, sorry
for not reading all the way through.

> diff --git a/rust/kernel/user_ptr.rs b/rust/kernel/user_ptr.rs
> index 00aa26aa6a83..daa46abe5525 100644
> --- a/rust/kernel/user_ptr.rs
> +++ b/rust/kernel/user_ptr.rs
> @@ -11,6 +11,7 @@
>  use crate::{bindings, error::code::*, error::Result};
>  use alloc::vec::Vec;
>  use core::ffi::{c_ulong, c_void};
> +use core::mem::{size_of, MaybeUninit};
>
>  /// The maximum length of a operation using `copy_[from|to]_user`.
>  ///
> @@ -151,6 +152,36 @@ pub unsafe fn read_raw(&mut self, out: *mut u8, len: usize) -> Result {
>          Ok(())
>      }
>
> +    /// Reads a value of the specified type.
> +    ///
> +    /// Fails with `EFAULT` if the read encounters a page fault.
> +    pub fn read<T: ReadableFromBytes>(&mut self) -> Result<T> {

I think that `T: Copy` is required here, or for Copy to be a
supertrait of ReadableBytes, since the data in the buffer is being
duplicated from a reference.

Send is probably also a reasonable bound to have .

> +        if size_of::<T>() > self.1 || size_of::<T>() > MAX_USER_OP_LEN {
> +            return Err(EFAULT);
> +        }
> +        let mut out: MaybeUninit<T> = MaybeUninit::uninit();
> +        // SAFETY: The local variable `out` is valid for writing `size_of::<T>()` bytes.
> +        let res = unsafe {
> +            bindings::copy_from_user_unsafe_skip_check_object_size(
> +                out.as_mut_ptr().cast::<c_void>(),
> +                self.0,
> +                size_of::<T>() as c_ulong,

As with the other patch, I think it would be more clear to use
`c_ulong::try_from(...)` rather than comparing against
`MAX_USER_OP_LEN ` and later casting. Possibly just in a helper
function.

> +            )
> +        };
> +        if res != 0 {
> +            return Err(EFAULT);
> +        }
> +        // Since this is not a pointer to a valid object in our program,
> +        // we cannot use `add`, which has C-style rules for defined
> +        // behavior.
> +        self.0 = self.0.wrapping_add(size_of::<T>());

There are now methods `wrapping_byte_add` (since 1.75). Doesn't make
much of a difference since the pointer is c_void anyway, but it does
make the unit more clear.

> +        self.1 -= size_of::<T>();
> +
> +        // SAFETY: The read above has initialized all bytes in `out`, and since
> +        // `T` implements `ReadableFromBytes`, any bit-pattern is a valid value
> +        // for this type.
> +        Ok(unsafe { out.assume_init() })
> +    }
> +
>      /// Reads all remaining data in the buffer into a vector.
>      ///
>      /// Fails with `EFAULT` if the read encounters a page fault.
> @@ -219,4 +250,98 @@ pub fn write_slice(&mut self, data: &[u8]) -> Result {
>          // `len`, so the pointer is valid for reading `len` bytes.
>          unsafe { self.write_raw(ptr, len) }
>      }
> +
> +    /// Writes the provided Rust value to this userspace pointer.
> +    ///
> +    /// Fails with `EFAULT` if the write encounters a page fault.
> +    pub fn write<T: WritableToBytes>(&mut self, value: &T) -> Result {

Send + Copy are also needed here, or supertraits of WritableToBytes.

> +        if size_of::<T>() > self.1 || size_of::<T>() > MAX_USER_OP_LEN {
> +            return Err(EFAULT);
> +        }
> +        // SAFETY: The reference points to a value of type `T`, so it is valid
> +        // for reading `size_of::<T>()` bytes.
> +        let res = unsafe {
> +            bindings::copy_to_user_unsafe_skip_check_object_size(
> +                self.0,
> +                (value as *const T).cast::<c_void>(),
> +                size_of::<T>() as c_ulong,
> +            )
> +        };
> +        if res != 0 {
> +            return Err(EFAULT);
> +        }
> +        // Since this is not a pointer to a valid object in our program,
> +        // we cannot use `add`, which has C-style rules for defined
> +        // behavior.
> +        self.0 = self.0.wrapping_add(size_of::<T>());
> +        self.1 -= size_of::<T>();
> +        Ok(())
> +    }
>  }
> +
> +/// Specifies that a type is safely readable from bytes.
> +///
> +/// Not all types are valid for all values. For example, a `bool` must be either
> +/// zero or one, so reading arbitrary bytes into something that contains a
> +/// `bool` is not okay.
> +///
> +/// It's okay for the type to have padding, as initializing those bytes has no
> +/// effect.
> +///
> +/// # Safety
> +///
> +/// All bit-patterns must be valid for this type.
> +pub unsafe trait ReadableFromBytes {}
> +
> +// SAFETY: All bit patterns are acceptable values of the types below.
> +unsafe impl ReadableFromBytes for u8 {}
> +unsafe impl ReadableFromBytes for u16 {}
> +unsafe impl ReadableFromBytes for u32 {}
> +unsafe impl ReadableFromBytes for u64 {}
> +unsafe impl ReadableFromBytes for usize {}
> +unsafe impl ReadableFromBytes for i8 {}
> +unsafe impl ReadableFromBytes for i16 {}
> +unsafe impl ReadableFromBytes for i32 {}
> +unsafe impl ReadableFromBytes for i64 {}
> +unsafe impl ReadableFromBytes for isize {}
> +// SAFETY: If all bit patterns are acceptable for individual values in an array,
> +// then all bit patterns are also acceptable for arrays of that type.
> +unsafe impl<T: ReadableFromBytes> ReadableFromBytes for [T] {}
> +unsafe impl<T: ReadableFromBytes, const N: usize> ReadableFromBytes for [T; N] {}
> +
> +/// Specifies that a type is safely writable to bytes.
> +///
> +/// If a struct implements this trait, then it is okay to copy it byte-for-byte
> +/// to userspace. This means that it should not have any padding, as padding
> +/// bytes are uninitialized. Reading uninitialized memory is not just undefined
> +/// behavior, it may even lead to leaking sensitive information on the stack to
> +/// userspace.
> +///
> +/// The struct should also not hold kernel pointers, as kernel pointer addresses
> +/// are also considered sensitive. However, leaking kernel pointers is not
> +/// considered undefined behavior by Rust, so this is a correctness requirement,
> +/// but not a safety requirement.
> +///
> +/// # Safety
> +///
> +/// Values of this type may not contain any uninitialized bytes.
> +pub unsafe trait WritableToBytes {}
> +
> +// SAFETY: Instances of the following types have no uninitialized portions.
> +unsafe impl WritableToBytes for u8 {}
> +unsafe impl WritableToBytes for u16 {}
> +unsafe impl WritableToBytes for u32 {}
> +unsafe impl WritableToBytes for u64 {}
> +unsafe impl WritableToBytes for usize {}
> +unsafe impl WritableToBytes for i8 {}
> +unsafe impl WritableToBytes for i16 {}
> +unsafe impl WritableToBytes for i32 {}
> +unsafe impl WritableToBytes for i64 {}
> +unsafe impl WritableToBytes for isize {}
> +unsafe impl WritableToBytes for bool {}
> +unsafe impl WritableToBytes for char {}
> +unsafe impl WritableToBytes for str {}
> +// SAFETY: If individual values in an array have no uninitialized portions, then
> +// the the array itself does not have any uninitialized portions either.
> +unsafe impl<T: WritableToBytes> WritableToBytes for [T] {}
> +unsafe impl<T: WritableToBytes, const N: usize> WritableToBytes for [T; N] {}
>
> --
> 2.43.0.429.g432eaa2c6b-goog
>
>

These traits are probably usable in a lot of other places (e.g.
packets, GPU), so could you put them in a separate module?

The patterns here are pretty similar to what the bytemuck crate does
[1]. Since that crate is well established and open licensed, I think
it makes sense to keep their naming or possibly even vendor a portion
in.

In particular, this would likely include the traits:

- AnyBitPattern, which is roughly ReadableFromBytes here
- NoUninit, which is roughly WritableToBytes here
- Optionally Pod (plain old data), a supertrait of both AnyBitPattern
and NoUninit just used to simplify trait implementation (impl Pod and
you get the other two).

And the functions:

- from_bytes to turn &[u8] into &T for use in `read`. Needs `T: Copy`
to return an owned value, as noted above.
- bytes_of to turn &T into &[u8], for use in `write`

The derive macros would also be nice to have down the line, though
bytemuck's unfortunately relies on syn.

- Trevor

[1]: https://docs.rs/bytemuck/latest/bytemuck/

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 3/3] rust: add abstraction for `struct page`
  2024-01-24 11:20 ` [PATCH 3/3] rust: add abstraction for `struct page` Alice Ryhl
  2024-01-26  0:46   ` Boqun Feng
  2024-01-29 17:59   ` Matthew Wilcox
@ 2024-02-01  6:02   ` Trevor Gross
  2024-02-08 13:46     ` Alice Ryhl
  2 siblings, 1 reply; 40+ messages in thread
From: Trevor Gross @ 2024-02-01  6:02 UTC (permalink / raw)
  To: Alice Ryhl
  Cc: Miguel Ojeda, Alex Gaynor, Wedson Almeida Filho, Boqun Feng,
	Gary Guo, Björn Roy Baron, Benno Lossin, Andreas Hindborg,
	Kees Cook, Al Viro, Andrew Morton, Greg Kroah-Hartman,
	Arve Hjønnevåg, Todd Kjos, Martijn Coenen,
	Joel Fernandes, Carlos Llamas, Suren Baghdasaryan, Arnd Bergmann,
	linux-mm, linux-kernel, rust-for-linux, Christian Brauner

On Wed, Jan 24, 2024 at 6:22 AM Alice Ryhl <aliceryhl@google.com> wrote:
> +// SPDX-License-Identifier: GPL-2.0
> +
> +//! Kernel page allocation and management.
> +
> +use crate::{bindings, error::code::*, error::Result, user_ptr::UserSlicePtrReader};
> +use core::{
> +    alloc::AllocError,
> +    ffi::c_void,
> +    ptr::{self, NonNull},
> +};
> +
> +/// A bitwise shift for the page size.
> +pub const PAGE_SHIFT: usize = bindings::PAGE_SHIFT as usize;
> +/// The number of bytes in a page.
> +pub const PAGE_SIZE: usize = 1 << PAGE_SHIFT;
> +/// A bitwise mask for the page size.
> +pub const PAGE_MASK: usize = PAGE_SIZE - 1;
> +
> +/// A pointer to a page that owns the page allocation.
> +///
> +/// # Invariants
> +///
> +/// The pointer points at a page, and has ownership over the page.
> +pub struct Page {
> +    page: NonNull<bindings::page>,
> +}

Shouldn't this be UnsafeCell / Opaque? Since `struct page` contains locks.

> +// SAFETY: It is safe to transfer page allocations between threads.
> +unsafe impl Send for Page {}
> +
> +// SAFETY: Calling `&self` methods on this type in parallel is safe. It might
> +// allow you to perform a data race on bytes stored in the page, but we treat
> +// this like data races on user pointers.
> +unsafe impl Sync for Page {}

These races should probably be in the Page docs, rather than pointing
to user pointers.

> +impl Page {
> +    /// Allocates a new set of contiguous pages.

"set of contiguous page" -> "page"?

> +    pub fn new() -> Result<Self, AllocError> {
> +        // SAFETY: These are the correct arguments to allocate a single page.
> +        let page = unsafe {
> +            bindings::alloc_pages(
> +                bindings::GFP_KERNEL | bindings::__GFP_ZERO | bindings::__GFP_HIGHMEM,
> +                0,
> +            )
> +        };
> +
> +        match NonNull::new(page) {
> +            // INVARIANT: We checked that the allocation above succeeded.
> +            Some(page) => Ok(Self { page }),
> +            None => Err(AllocError),
> +        }

Optionally:

    let page = NonNull::new(page).ok_or(AllocError)?;
    Ok(Self { page })

> +    }
> +
> +    /// Returns a raw pointer to the page.

Maybe add ", valid for PAGE_SIZE" or similar to make this obvious.

> +    pub fn as_ptr(&self) -> *mut bindings::page {
> +        self.page.as_ptr()
> +    }
> +
> +    /// Runs a piece of code with this page mapped to an address.

Maybe ", then immediately unmaps the page" to make the entire operation clear.

> +    /// It is up to the caller to use the provided raw pointer correctly.
> +    pub fn with_page_mapped<T>(&self, f: impl FnOnce(*mut c_void) -> T) -> T {

If there is exclusive access into the page, this signature could be:

    FnOnce(&mut [u8; PAGE_SIZE]) -> T

Otherwise possibly

    FnOnce(*mut [u8; PAGE_SIZE]) -> T

But based on the thread with Boqun it seems there is no synchronized
access here. In this case, "use the provided raw pointer correctly" or
the type level docs should clarify what you can and can't rely on with
pointers into a page.

E.g. if I'm understanding correctly, you can never construct a &T or
&mut T anywhere in this page unless T is Sync.

> +        // SAFETY: `page` is valid due to the type invariants on `Page`.
> +        let mapped_addr = unsafe { bindings::kmap_local_page(self.as_ptr()) };
> +
> +        let res = f(mapped_addr);
> +
> +        // SAFETY: This unmaps the page mapped above.
> +        //
> +        // Since this API takes the user code as a closure, it can only be used
> +        // in a manner where the pages are unmapped in reverse order. This is as
> +        // required by `kunmap_local`.
> +        //
> +        // In other words, if this call to `kunmap_local` happens when a
> +        // different page should be unmapped first, then there must necessarily
> +        // be a call to `kmap_local_page` other than the call just above in
> +        // `with_page_mapped` that made that possible. In this case, it is the
> +        // unsafe block that wraps that other call that is incorrect.
> +        unsafe { bindings::kunmap_local(mapped_addr) };
> +
> +        res
> +    }
> +
> +    /// Runs a piece of code with a raw pointer to a slice of this page, with
> +    /// bounds checking.
> +    ///
> +    /// If `f` is called, then it will be called with a pointer that points at
> +    /// `off` bytes into the page, and the pointer will be valid for at least
> +    /// `len` bytes. The pointer is only valid on this task, as this method uses
> +    /// a local mapping./
> +    ///
> +    /// If `off` and `len` refers to a region outside of this page, then this
> +    /// method returns `EINVAL` and does not call `f`.
> +    pub fn with_pointer_into_page<T>(
> +        &self,
> +        off: usize,
> +        len: usize,
> +        f: impl FnOnce(*mut u8) -> Result<T>,
> +    ) -> Result<T> {

Same question about exclusive access

    impl FnOnce(&mut [u8]) -> Result<T>

> +        let bounds_ok = off <= PAGE_SIZE && len <= PAGE_SIZE && (off + len) <= PAGE_SIZE;
> +
> +        if bounds_ok {
> +            self.with_page_mapped(move |page_addr| {
> +                // SAFETY: The `off` integer is at most `PAGE_SIZE`, so this pointer offset will
> +                // result in a pointer that is in bounds or one off the end of the page.
> +                f(unsafe { page_addr.cast::<u8>().add(off) })
> +            })
> +        } else {
> +            Err(EINVAL)
> +        }
> +    }
> +
> +    /// Maps the page and reads from it into the given buffer.
> +    ///
> +    /// # Safety
> +    ///
> +    /// Callers must ensure that `dest` is valid for writing `len` bytes.
> +    pub unsafe fn read(&self, dest: *mut u8, offset: usize, len: usize) -> Result {

Is there a reason not to use a slice just for a destination to read into?

> +        self.with_pointer_into_page(offset, len, move |from_ptr| {

Nit: do the names from_ptr/to_ptr come from existing binder? src/dst
seems more common (also dst vs. dest).

> +            // SAFETY: If `with_pointer_into_page` calls into this closure, then
> +            // it has performed a bounds check and guarantees that `from_ptr` is
> +            // valid for `len` bytes.
> +            unsafe { ptr::copy(from_ptr, dest, len) };
> +            Ok(())
> +        })
> +    }
> +
> +    /// Maps the page and writes into it from the given buffer.
> +    ///
> +    /// # Safety
> +    ///
> +    /// Callers must ensure that `src` is valid for reading `len` bytes.
> +    pub unsafe fn write(&self, src: *const u8, offset: usize, len: usize) -> Result {

Same slice question

> +        self.with_pointer_into_page(offset, len, move |to_ptr| {
> +            // SAFETY: If `with_pointer_into_page` calls into this closure, then
> +            // it has performed a bounds check and guarantees that `to_ptr` is
> +            // valid for `len` bytes.
> +            unsafe { ptr::copy(src, to_ptr, len) };
> +            Ok(())
> +        })
> +    }
> +
> +    /// Maps the page and zeroes the given slice.

Mention that this will error with the same conditions as with_pointer_into_page.

> +    pub fn fill_zero(&self, offset: usize, len: usize) -> Result {
> +        self.with_pointer_into_page(offset, len, move |to_ptr| {
> +            // SAFETY: If `with_pointer_into_page` calls into this closure, then
> +            // it has performed a bounds check and guarantees that `to_ptr` is
> +            // valid for `len` bytes.
> +            unsafe { ptr::write_bytes(to_ptr, 0u8, len) };
> +            Ok(())
> +        })
> +    }
> +
> +    /// Copies data from userspace into this page.
> +    pub fn copy_into_page(
> +        &self,
> +        reader: &mut UserSlicePtrReader,
> +        offset: usize,
> +        len: usize,
> +    ) -> Result {

Maybe copy_from_user_slice or something that includes "user", since
as-is it sounds like copying a page into another page.

Also, docs should point out the error condition.

> +        self.with_pointer_into_page(offset, len, move |to_ptr| {
> +            // SAFETY: If `with_pointer_into_page` calls into this closure, then
> +            // it has performed a bounds check and guarantees that `to_ptr` is
> +            // valid for `len` bytes.
> +            unsafe { reader.read_raw(to_ptr, len) }
> +        })
> +    }
> +}
> +
> +impl Drop for Page {
> +    fn drop(&mut self) {
> +        // SAFETY: By the type invariants, we have ownership of the page and can
> +        // free it.
> +        unsafe { bindings::__free_pages(self.page.as_ptr(), 0) };
> +    }
> +}
>
> --
> 2.43.0.429.g432eaa2c6b-goog
>
>

- Trevor

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 3/3] rust: add abstraction for `struct page`
  2024-01-26 18:28       ` Boqun Feng
@ 2024-02-01  6:50         ` Trevor Gross
  2024-02-05 17:23           ` Boqun Feng
  2024-02-08 13:36           ` Alice Ryhl
  0 siblings, 2 replies; 40+ messages in thread
From: Trevor Gross @ 2024-02-01  6:50 UTC (permalink / raw)
  To: Boqun Feng, Alice Ryhl, Andreas Hindborg
  Cc: Miguel Ojeda, Alex Gaynor, Wedson Almeida Filho, Gary Guo,
	Björn Roy Baron, Benno Lossin, Kees Cook, Al Viro,
	Andrew Morton, Greg Kroah-Hartman, Arve Hjønnevåg,
	Todd Kjos, Martijn Coenen, Joel Fernandes, Carlos Llamas,
	Suren Baghdasaryan, Arnd Bergmann, linux-mm, linux-kernel,
	rust-for-linux, Christian Brauner

On Fri, Jan 26, 2024 at 1:28 PM Boqun Feng <boqun.feng@gmail.com> wrote:
>
> On Fri, Jan 26, 2024 at 01:33:46PM +0100, Alice Ryhl wrote:
> > On Fri, Jan 26, 2024 at 1:47 AM Boqun Feng <boqun.feng@gmail.com> wrote:
> > >
> > > On Wed, Jan 24, 2024 at 11:20:23AM +0000, Alice Ryhl wrote:
> > > > [...]
> > > > +    /// Maps the page and writes into it from the given buffer.
> > > > +    ///
> > > > +    /// # Safety
> > > > +    ///
> > > > +    /// Callers must ensure that `src` is valid for reading `len` bytes.
> > > > +    pub unsafe fn write(&self, src: *const u8, offset: usize, len: usize) -> Result {
> > >
> > > Use a slice like type as `src` maybe? Then the function can be safe:
> > >
> > >         pub fn write<S: AsRef<[u8]>>(&self, src: S, offset: usize) -> Result
> > >
> > > Besides, since `Page` impl `Sync`, shouldn't this `write` and the
> > > `fill_zero` be a `&mut self` function? Or make them both `unsafe`
> > > because of potential race and add some safety requirement?
> >
> > Ideally, we don't want data races with these methods to be UB. They
>
> I understand that, but in the current code, you can write:
>
>         CPU 0                   CPU 1
>         =====                   =====
>
>         page.write(src1, 0, 8);
>                                 page.write(src2, 0, 8);
>
> and it's a data race at kernel end. So my question is more how we can
> prevent the UB ;-)

Hm. Would the following work?

    // Change existing functions to work with references, meaning they need an
    // exclusive &mut self
    pub fn with_page_mapped<T>(
        &mut self,
        f: impl FnOnce(&mut [u8; PAGE_SIZE]) -> T
    ) -> T

    pub fn with_pointer_into_page<T>(
        &mut self,
        off: usize,
        len: usize,
        f: impl FnOnce(&mut [u8]) -> Result<T>,
    ) -> Result<T>

    // writing methods now take &mut self
    pub fn write(&mut self ...)
    pub fn fill_zero(&mut self ...)
    pub fn copy_into_page(&mut self ...)

    // Add two new functions that take &self, but return shared access
    pub fn with_page_mapped_raw<T>(
        &self,
        f: impl FnOnce(&UnsafeCell<[u8; PAGE_SIZE]>) -> T
    ) -> T

    pub fn with_pointer_into_page_raw<T>(
        &self,
        off: usize,
        len: usize,
        f: impl FnOnce(&[UnsafeCell<u8>]) -> Result<T>,
    ) -> Result<T>

This would mean that anyone who can obey rust's mutability rules can
use a page without any safety or race conditions to worry about, much
better for usability.

But if you do need to allow the data to be shared and racy, such as
the userspace example, the `_raw` methods allow for that and you can
`.get()` a `*mut u8` from the UnsafeCell. This moves the interior
mutability only to the mapped data rather than the Page itself, which
I think is more accurate anyway.

Leveraging UnsafeCell would also make some things with UserSlicePtr
more clear too.

- Trevor

> Regards,
> Boqun
>
> > could be mapped into the address space of a userspace process.
> >
> > Alice
>

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 3/3] rust: add abstraction for `struct page`
  2024-02-01  6:50         ` Trevor Gross
@ 2024-02-05 17:23           ` Boqun Feng
  2024-02-08 13:36           ` Alice Ryhl
  1 sibling, 0 replies; 40+ messages in thread
From: Boqun Feng @ 2024-02-05 17:23 UTC (permalink / raw)
  To: Trevor Gross
  Cc: Alice Ryhl, Andreas Hindborg, Miguel Ojeda, Alex Gaynor,
	Wedson Almeida Filho, Gary Guo, Björn Roy Baron,
	Benno Lossin, Kees Cook, Al Viro, Andrew Morton,
	Greg Kroah-Hartman, Arve Hjønnevåg, Todd Kjos,
	Martijn Coenen, Joel Fernandes, Carlos Llamas,
	Suren Baghdasaryan, Arnd Bergmann, linux-mm, linux-kernel,
	rust-for-linux, Christian Brauner

On Thu, Feb 01, 2024 at 01:50:53AM -0500, Trevor Gross wrote:
> On Fri, Jan 26, 2024 at 1:28 PM Boqun Feng <boqun.feng@gmail.com> wrote:
> >
> > On Fri, Jan 26, 2024 at 01:33:46PM +0100, Alice Ryhl wrote:
> > > On Fri, Jan 26, 2024 at 1:47 AM Boqun Feng <boqun.feng@gmail.com> wrote:
> > > >
> > > > On Wed, Jan 24, 2024 at 11:20:23AM +0000, Alice Ryhl wrote:
> > > > > [...]
> > > > > +    /// Maps the page and writes into it from the given buffer.
> > > > > +    ///
> > > > > +    /// # Safety
> > > > > +    ///
> > > > > +    /// Callers must ensure that `src` is valid for reading `len` bytes.
> > > > > +    pub unsafe fn write(&self, src: *const u8, offset: usize, len: usize) -> Result {
> > > >
> > > > Use a slice like type as `src` maybe? Then the function can be safe:
> > > >
> > > >         pub fn write<S: AsRef<[u8]>>(&self, src: S, offset: usize) -> Result
> > > >
> > > > Besides, since `Page` impl `Sync`, shouldn't this `write` and the
> > > > `fill_zero` be a `&mut self` function? Or make them both `unsafe`
> > > > because of potential race and add some safety requirement?
> > >
> > > Ideally, we don't want data races with these methods to be UB. They
> >
> > I understand that, but in the current code, you can write:
> >
> >         CPU 0                   CPU 1
> >         =====                   =====
> >
> >         page.write(src1, 0, 8);
> >                                 page.write(src2, 0, 8);
> >
> > and it's a data race at kernel end. So my question is more how we can
> > prevent the UB ;-)
> 
> Hm. Would the following work?
> 
>     // Change existing functions to work with references, meaning they need an
>     // exclusive &mut self
>     pub fn with_page_mapped<T>(
>         &mut self,
>         f: impl FnOnce(&mut [u8; PAGE_SIZE]) -> T
>     ) -> T
> 
>     pub fn with_pointer_into_page<T>(
>         &mut self,
>         off: usize,
>         len: usize,
>         f: impl FnOnce(&mut [u8]) -> Result<T>,
>     ) -> Result<T>
> 
>     // writing methods now take &mut self
>     pub fn write(&mut self ...)
>     pub fn fill_zero(&mut self ...)
>     pub fn copy_into_page(&mut self ...)
> 
>     // Add two new functions that take &self, but return shared access
>     pub fn with_page_mapped_raw<T>(
>         &self,
>         f: impl FnOnce(&UnsafeCell<[u8; PAGE_SIZE]>) -> T
>     ) -> T
> 
>     pub fn with_pointer_into_page_raw<T>(
>         &self,
>         off: usize,
>         len: usize,
>         f: impl FnOnce(&[UnsafeCell<u8>]) -> Result<T>,
>     ) -> Result<T>
> 
> This would mean that anyone who can obey rust's mutability rules can
> use a page without any safety or race conditions to worry about, much
> better for usability.
> 
> But if you do need to allow the data to be shared and racy, such as
> the userspace example, the `_raw` methods allow for that and you can
> `.get()` a `*mut u8` from the UnsafeCell. This moves the interior
> mutability only to the mapped data rather than the Page itself, which
> I think is more accurate anyway.
> 

Looks good to me ;-) Thanks!

Regards,
Boqun

> Leveraging UnsafeCell would also make some things with UserSlicePtr
> more clear too.
> 
> - Trevor
> 
> > Regards,
> > Boqun
> >
> > > could be mapped into the address space of a userspace process.
> > >
> > > Alice
> >
> 

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 1/3] rust: add userspace pointers
  2024-01-24 23:12   ` Valentin Obst
@ 2024-02-08 12:20     ` Alice Ryhl
  0 siblings, 0 replies; 40+ messages in thread
From: Alice Ryhl @ 2024-02-08 12:20 UTC (permalink / raw)
  To: Valentin Obst
  Cc: a.hindborg, akpm, alex.gaynor, arnd, arve, benno.lossin,
	bjorn3_gh, boqun.feng, brauner, cmllamas, gary, gregkh, joel,
	keescook, linux-kernel, linux-mm, maco, ojeda, rust-for-linux,
	surenb, tkjos, viro, wedsonaf

On Thu, Jan 25, 2024 at 12:13 AM Valentin Obst <kernel@valentinobst.de> wrote:
>
> > +//! User pointers.
> > +//!
> > +//! C header: [`include/linux/uaccess.h`](../../../../include/linux/uaccess.h)
> > +
>
> nit: could this be using srctree-relative links?
>
> > +/// The maximum length of a operation using `copy_[from|to]_user`.
>
> nit: 'a' -> 'an'
>
> > +///
> > +/// If a usize is not greater than this constant, then casting it to `c_ulong`
> > +/// is guaranteed to be lossless.
>
> nit: could this be `usize` or [`usize`]. Maybe would also be clearer to
> say "... a value of type [`usize`] is smaller than ..."
>
> > +///
> > +/// These APIs are designed to make it difficult to accidentally write TOCTOU
> > +/// bugs. Every time you read from a memory location, the pointer is advanced by
>
> Maybe makes sense to also introduce the abbreviation TOCTOU in the type
> documentation when it is first used.
>
> > +    /// Reads the entirety of the user slice.
> > +    ///
> > +    /// Returns `EFAULT` if the address does not currently point to
> > +    /// mapped, readable memory.
> > +    pub fn read_all(self) -> Result<Vec<u8>> {
> > +        self.reader().read_all()
> > +    }
>
> If I understand it correctly, the function will return `EFAULT` if _any_
> address in the interval `[self.0, self.0 + self.1)` does not point to
> mapped, readable memory. Maybe the docs could be more explicit.
>
> > +        // Since this is not a pointer to a valid object in our program,
> > +        // we cannot use `add`, which has C-style rules for defined
> > +        // behavior.
> > +        self.0 = self.0.wrapping_add(len);
>
> If I understand it correctly, you are using 'valid object' to refer to
> an 'allocated object' [1] as this is what the `add` method's docs
> refer to [2]. In that case it might be better to use the latter term as
> it has a defined meaning. Also see [3] and [4] which are about making it
> more precise.
>
> [1]: https://doc.rust-lang.org/core/ptr/index.html#allocated-object
> [2]: https://doc.rust-lang.org/core/primitive.pointer.html#method.add
> [3]: https://github.com/rust-lang/rust/pull/116675
> [4]: https://github.com/rust-lang/unsafe-code-guidelines/issues/465

Thanks. I'll include all of your suggestions in my next version.

Alice

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 1/3] rust: add userspace pointers
  2024-02-01  4:06   ` Trevor Gross
@ 2024-02-08 12:53     ` Alice Ryhl
  2024-02-08 15:35       ` Greg Kroah-Hartman
                         ` (2 more replies)
  0 siblings, 3 replies; 40+ messages in thread
From: Alice Ryhl @ 2024-02-08 12:53 UTC (permalink / raw)
  To: Trevor Gross
  Cc: Miguel Ojeda, Alex Gaynor, Wedson Almeida Filho, Boqun Feng,
	Gary Guo, Björn Roy Baron, Benno Lossin, Andreas Hindborg,
	Kees Cook, Al Viro, Andrew Morton, Greg Kroah-Hartman,
	Arve Hjønnevåg, Todd Kjos, Martijn Coenen,
	Joel Fernandes, Carlos Llamas, Suren Baghdasaryan, Arnd Bergmann,
	linux-mm, linux-kernel, rust-for-linux, Christian Brauner

On Thu, Feb 1, 2024 at 5:06 AM Trevor Gross <tmgross@umich.edu> wrote:
>
> On Wed, Jan 24, 2024 at 6:21 AM Alice Ryhl <aliceryhl@google.com> wrote:
> > +/// A pointer to an area in userspace memory, which can be either read-only or
> > +/// read-write.
> > +///
> > +/// All methods on this struct are safe: invalid pointers return `EFAULT`.
>
> Maybe
>
>     ... attempting to read or write invalid pointers will return `EFAULT`.

Sure, that works.

> > +/// Concurrent access, *including data races to/from userspace memory*, is
> > +/// permitted, because fundamentally another userspace thread/process could
> > +/// always be modifying memory at the same time (in the same way that userspace
> > +/// Rust's [`std::io`] permits data races with the contents of files on disk).
> > +/// In the presence of a race, the exact byte values read/written are
> > +/// unspecified but the operation is well-defined. Kernelspace code should
> > +/// validate its copy of data after completing a read, and not expect that
> > +/// multiple reads of the same address will return the same value.
> > +///
> > +/// These APIs are designed to make it difficult to accidentally write TOCTOU
> > +/// bugs. Every time you read from a memory location, the pointer is advanced by
> > +/// the length so that you cannot use that reader to read the same memory
> > +/// location twice. Preventing double-fetches avoids TOCTOU bugs. This is
> > +/// accomplished by taking `self` by value to prevent obtaining multiple readers
> > +/// on a given [`UserSlicePtr`], and the readers only permitting forward reads.
> > +/// If double-fetching a memory location is necessary for some reason, then that
> > +/// is done by creating multiple readers to the same memory location, e.g. using
> > +/// [`clone_reader`].
>
> Maybe something like
>
>     Every time a memory location is read, the reader's position is advanced by
>     the read length and the next read will start from there. This helps prevent
>     accidentally reading the same location twice and causing a TOCTOU bug.
>
>     Creating a [`UserSlicePtrReader`] and/or [`UserSlicePtrWriter`]
>     consumes the `UserSlicePtr`, helping ensure that there aren't multiple
>     readers or writers to the same location.
>
>     If double fetching a memory location...

That makes sense to me. I'll update that.

> > +///
> > +/// Constructing a [`UserSlicePtr`] performs no checks on the provided address
> > +/// and length, it can safely be constructed inside a kernel thread with no
> > +/// current userspace process. Reads and writes wrap the kernel APIs
>
> Maybe some of this is better documented on `new` rather than the type?

I agree. I will move it.

> > +/// `copy_from_user` and `copy_to_user`, which check the memory map of the
> > +/// current process and enforce that the address range is within the user range
> > +/// (no additional calls to `access_ok` are needed).
> > +///
> > +/// [`std::io`]: https://doc.rust-lang.org/std/io/index.html
> > +/// [`clone_reader`]: UserSlicePtrReader::clone_reader
> > +pub struct UserSlicePtr(*mut c_void, usize);
>
> I think just `UserSlice` could work for the name here, since
> `SlicePtr` is a bit redundant (slices automatically containing a
> pointer). Also makes the Reader/Writer types a bit less wordy. Though
> I understand wanting to make it discoverable for C users.

Sure, renamed.

> Is some sort of `Debug` implementation useful?

Leaking pointer addresses in logs is frowned upon in the kernel, so I
don't think we should include that.

> > +    /// Reads the entirety of the user slice.
> > +    ///
> > +    /// Returns `EFAULT` if the address does not currently point to
> > +    /// mapped, readable memory.
> > +    pub fn read_all(self) -> Result<Vec<u8>> {
> > +        self.reader().read_all()
> > +    }
>
> std::io uses the signature
>
>     fn read_to_end(&mut self, buf: &mut Vec<u8>) -> Result<usize> { ... }
>
> Maybe this could be better here too, since it allows reusing the
> vector without going through `read_raw`.
>
> https://doc.rust-lang.org/std/io/trait.Read.html#method.read_to_end

I updated the signature so that it appends to an existing vector
instead of creating a new one.

> > +    /// Constructs a [`UserSlicePtrReader`].
> > +    pub fn reader(self) -> UserSlicePtrReader {
> > +        UserSlicePtrReader(self.0, self.1)
> > +    }
>
> I wonder if this should be called `into_reader` to note that it is a
> consuming method. Indifferent here, since there aren't any non-`self`
> variants.

Hmm. I think the current name is good enough, since there aren't any
non-self methods, as you mention.

> > +
> > +    /// Constructs a [`UserSlicePtrWriter`].
> > +    pub fn writer(self) -> UserSlicePtrWriter {
> > +        UserSlicePtrWriter(self.0, self.1)
> > +    }
> > +
> > +    /// Constructs both a [`UserSlicePtrReader`] and a [`UserSlicePtrWriter`].
>
> Could you add a brief about what is or isn't allowed when you have a
> reader and a writer to the same location?

Ultimately, you can do whatever you want. But I'll add this:

/// Usually when this is used, you will first read the data, and then
/// overwrite it afterwards.

> > +    pub fn reader_writer(self) -> (UserSlicePtrReader, UserSlicePtrWriter) {
> > +        (
> > +            UserSlicePtrReader(self.0, self.1),
> > +            UserSlicePtrWriter(self.0, self.1),
> > +        )
> > +    }
> > +}
> > +
> > +/// A reader for [`UserSlicePtr`].
> > +///
> > +/// Used to incrementally read from the user slice.
> > +pub struct UserSlicePtrReader(*mut c_void, usize);
> > +
> > +impl UserSlicePtrReader {
> > +    /// Skip the provided number of bytes.
> > +    ///
> > +    /// Returns an error if skipping more than the length of the buffer.
> > +    pub fn skip(&mut self, num_skip: usize) -> Result {
> > +        // Update `self.1` first since that's the fallible one.
> > +        self.1 = self.1.checked_sub(num_skip).ok_or(EFAULT)?;
> > +        self.0 = self.0.wrapping_add(num_skip);
>
> I think it would be better to change to named structs to make this more clear.

Good idea. Changed to named fields.

> > +        Ok(())
> > +    }
> > +
> > +    /// Create a reader that can access the same range of data.
> > +    ///
> > +    /// Reading from the clone does not advance the current reader.
> > +    ///
> > +    /// The caller should take care to not introduce TOCTOU issues.
>
> Could you add an example of what should be avoided?
>
> > +    pub fn clone_reader(&self) -> UserSlicePtrReader {
> > +        UserSlicePtrReader(self.0, self.1)
> > +    }
> > +
> > +    /// Returns the number of bytes left to be read from this.
>
> Remove "from this" or change to "from this reader".

Done.

> > +    ///
> > +    /// Note that even reading less than this number of bytes may fail.> +    pub fn len(&self) -> usize {
> > +        self.1
> > +    }
> > +
> > +    /// Returns `true` if no data is available in the io buffer.
> > +    pub fn is_empty(&self) -> bool {
> > +        self.1 == 0
> > +    }
> > +
> > +    /// Reads raw data from the user slice into a raw kernel buffer.
> > +    ///
> > +    /// Fails with `EFAULT` if the read encounters a page fault.
>
> This is raised if the address is invalid right, not just page faults?
> Or, are page faults just the only mode of indication that a pointer
> isn't valid.
>
> I know this is the same as copy_from_user, but I don't understand that
> too well myself. I'm also a bit curious what happens if you pass a
> kernel pointer to these methods.

Basically, if the userspace process wouldn't be able to dereference
the pointer, then you get this error.

> > +    /// # Safety
> > +    ///
> > +    /// The `out` pointer must be valid for writing `len` bytes.
> > +    pub unsafe fn read_raw(&mut self, out: *mut u8, len: usize) -> Result {
>
> What is the motivation for taking raw pointers rather than a slice
> here? In which case it could just be called `read`.

The second patch will introduce a method called `read`, so we can't
use that name.

I don't think a slice gains that much compared to a raw pointer. It
would have to be an `&mut [MaybeUninit<u8>]` due to uninit memory, but
that doesn't actually allow the caller to use this method without
unsafe, since the caller would have to unsafely assert that the slice
has been initialized by this call themselves. Making this method
usable safely would require that we start using something along the
lines of the standard library's BorrowedBuf [1], which seems way
overkill to me.

[1]: https://doc.rust-lang.org/stable/std/io/struct.BorrowedBuf.html

> > +        if len > self.1 || len > MAX_USER_OP_LEN {
> > +            return Err(EFAULT);
> > +        }
> > +        // SAFETY: The caller promises that `out` is valid for writing `len` bytes.
> > +        let res = unsafe { bindings::copy_from_user(out.cast::<c_void>(), self.0, len as c_ulong) };
>
> To me, it would be more clear to `c_ulong::try_from(len).map_err(|_|
> EFAULT)?` rather than using MAX_USER_OP_LEN with an `as` cast, since
> it makes it immediately clear that the conversion is ok (and is doing
> that same comparison)

Thanks for the suggestion. I changed this.

> > +        let len = self.len();
> > +        let mut data = Vec::<u8>::try_with_capacity(len)?;
> > +
> > +        // SAFETY: The output buffer is valid for `len` bytes as we just allocated that much space.
> > +        unsafe { self.read_raw(data.as_mut_ptr(), len)? };
>
> If making the change above (possibly even if not), going through
> https://doc.rust-lang.org/std/vec/struct.Vec.html#method.spare_capacity_mut
> can be more clear about uninitialized data.
>
> > +
> > +        // SAFETY: Since the call to `read_raw` was successful, the first `len` bytes of the vector
> > +        // have been initialized.
> > +        unsafe { data.set_len(len) };
> > +        Ok(data)
> > +    }
> > +}
> > +
> > +/// A writer for [`UserSlicePtr`].
> > +///
> > +/// Used to incrementally write into the user slice.
> > +pub struct UserSlicePtrWriter(*mut c_void, usize);
> > +
> > +impl UserSlicePtrWriter {
> > +    /// Returns the amount of space remaining in this buffer.
> > +    ///
> > +    /// Note that even writing less than this number of bytes may fail.
> > +    pub fn len(&self) -> usize {
> > +        self.1
> > +    }
> > +
> > +    /// Returns `true` if no more data can be written to this buffer.
> > +    pub fn is_empty(&self) -> bool {
> > +        self.1 == 0
> > +    }
> > +
> > +    /// Writes raw data to this user pointer from a raw kernel buffer.
> > +    ///
> > +    /// Fails with `EFAULT` if the write encounters a page fault.
> > +    ///
> > +    /// # Safety
> > +    ///
> > +    /// The `data` pointer must be valid for reading `len` bytes.
> > +    pub unsafe fn write_raw(&mut self, data: *const u8, len: usize) -> Result {
>
> Same as for Reader regarding signature

I suppose we could have only `write_slice`. But if we're going to have
a read_raw, then having write_raw too makes sense to me.

Besides, a memcpy of uninit memory isn't UB by itself, so the raw
method would let you write uninit memory if you needed to.

> > +        if res != 0 {
> > +            return Err(EFAULT);
> > +        }
> > +        // Since this is not a pointer to a valid object in our program,
> > +        // we cannot use `add`, which has C-style rules for defined
> > +        // behavior.
> > +        self.0 = self.0.wrapping_add(len);
> > +        self.1 -= len;
> > +        Ok(())
> > +    }
> > +
> > +    /// Writes the provided slice to this user pointer.
> > +    ///
> > +    /// Fails with `EFAULT` if the write encounters a page fault.
> > +    pub fn write_slice(&mut self, data: &[u8]) -> Result {
>
> This could probably just be called `write`, which would be consistent
> with std::io::Write.

This would also name conflict with the second patch.

> The docs could use usage examples, but I am sure you are planning on
> that already :)

I'll add some.


Alice

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 2/3] rust: add typed accessors for userspace pointers
  2024-02-01  5:03   ` Trevor Gross
@ 2024-02-08 13:14     ` Alice Ryhl
  0 siblings, 0 replies; 40+ messages in thread
From: Alice Ryhl @ 2024-02-08 13:14 UTC (permalink / raw)
  To: Trevor Gross
  Cc: Miguel Ojeda, Alex Gaynor, Wedson Almeida Filho, Boqun Feng,
	Gary Guo, Björn Roy Baron, Benno Lossin, Andreas Hindborg,
	Kees Cook, Al Viro, Andrew Morton, Greg Kroah-Hartman,
	Arve Hjønnevåg, Todd Kjos, Martijn Coenen,
	Joel Fernandes, Carlos Llamas, Suren Baghdasaryan, Arnd Bergmann,
	linux-mm, linux-kernel, rust-for-linux, Christian Brauner

On Thu, Feb 1, 2024 at 6:04 AM Trevor Gross <tmgross@umich.edu> wrote:
>
> On Wed, Jan 24, 2024 at 6:21 AM Alice Ryhl <aliceryhl@google.com> wrote:
> > +    /// Reads a value of the specified type.
> > +    ///
> > +    /// Fails with `EFAULT` if the read encounters a page fault.
> > +    pub fn read<T: ReadableFromBytes>(&mut self) -> Result<T> {
>
> I think that `T: Copy` is required here, or for Copy to be a
> supertrait of ReadableBytes, since the data in the buffer is being
> duplicated from a reference.
>
> Send is probably also a reasonable bound to have .

We're not moving a value of type `T`. We're creating a new value of
type `T` from a byte array. The trait says that we can do that, so I
think that is enough here.

Besides, we'll often want to use this for wrappers around bindgen
types. If we add a Copy bound, then we need to get bindgen to generate
a #[derive(Copy)] for them, which I don't think it does right now.

> > +        // SAFETY: The local variable `out` is valid for writing `size_of::<T>()` bytes.
> > +        let res = unsafe {
> > +            bindings::copy_from_user_unsafe_skip_check_object_size(
> > +                out.as_mut_ptr().cast::<c_void>(),
> > +                self.0,
> > +                size_of::<T>() as c_ulong,
>
> As with the other patch, I think it would be more clear to use
> `c_ulong::try_from(...)` rather than comparing against
> `MAX_USER_OP_LEN ` and later casting. Possibly just in a helper
> function.

Done.

> > +        // Since this is not a pointer to a valid object in our program,
> > +        // we cannot use `add`, which has C-style rules for defined
> > +        // behavior.
> > +        self.0 = self.0.wrapping_add(size_of::<T>());
>
> There are now methods `wrapping_byte_add` (since 1.75). Doesn't make
> much of a difference since the pointer is c_void anyway, but it does
> make the unit more clear.

Sure, I can use those methods.

> > +    /// Writes the provided Rust value to this userspace pointer.
> > +    ///
> > +    /// Fails with `EFAULT` if the write encounters a page fault.
> > +    pub fn write<T: WritableToBytes>(&mut self, value: &T) -> Result {
>
> Send + Copy are also needed here, or supertraits of WritableToBytes.

I also disagree here. We're not moving a value of type T. We're
creating a byte array from a value of type T, and the trait says that
we can do that.

> > +/// Specifies that a type is safely readable from bytes.
> > +///
> > +/// Not all types are valid for all values. For example, a `bool` must be either
> > +/// zero or one, so reading arbitrary bytes into something that contains a
> > +/// `bool` is not okay.
> > +///
> > +/// It's okay for the type to have padding, as initializing those bytes has no
> > +/// effect.
> > +///
> > +/// # Safety
> > +///
> > +/// All bit-patterns must be valid for this type.
> > +pub unsafe trait ReadableFromBytes {}
> > +
> > +// SAFETY: All bit patterns are acceptable values of the types below.
> > +unsafe impl ReadableFromBytes for u8 {}
> > +unsafe impl ReadableFromBytes for u16 {}
> > +unsafe impl ReadableFromBytes for u32 {}
> > +unsafe impl ReadableFromBytes for u64 {}
> > +unsafe impl ReadableFromBytes for usize {}
> > +unsafe impl ReadableFromBytes for i8 {}
> > +unsafe impl ReadableFromBytes for i16 {}
> > +unsafe impl ReadableFromBytes for i32 {}
> > +unsafe impl ReadableFromBytes for i64 {}
> > +unsafe impl ReadableFromBytes for isize {}
> > +// SAFETY: If all bit patterns are acceptable for individual values in an array,
> > +// then all bit patterns are also acceptable for arrays of that type.
> > +unsafe impl<T: ReadableFromBytes> ReadableFromBytes for [T] {}
> > +unsafe impl<T: ReadableFromBytes, const N: usize> ReadableFromBytes for [T; N] {}
> > +
> > +/// Specifies that a type is safely writable to bytes.
> > +///
> > +/// If a struct implements this trait, then it is okay to copy it byte-for-byte
> > +/// to userspace. This means that it should not have any padding, as padding
> > +/// bytes are uninitialized. Reading uninitialized memory is not just undefined
> > +/// behavior, it may even lead to leaking sensitive information on the stack to
> > +/// userspace.
> > +///
> > +/// The struct should also not hold kernel pointers, as kernel pointer addresses
> > +/// are also considered sensitive. However, leaking kernel pointers is not
> > +/// considered undefined behavior by Rust, so this is a correctness requirement,
> > +/// but not a safety requirement.
> > +///
> > +/// # Safety
> > +///
> > +/// Values of this type may not contain any uninitialized bytes.
> > +pub unsafe trait WritableToBytes {}
> > +
> > +// SAFETY: Instances of the following types have no uninitialized portions.
> > +unsafe impl WritableToBytes for u8 {}
> > +unsafe impl WritableToBytes for u16 {}
> > +unsafe impl WritableToBytes for u32 {}
> > +unsafe impl WritableToBytes for u64 {}
> > +unsafe impl WritableToBytes for usize {}
> > +unsafe impl WritableToBytes for i8 {}
> > +unsafe impl WritableToBytes for i16 {}
> > +unsafe impl WritableToBytes for i32 {}
> > +unsafe impl WritableToBytes for i64 {}
> > +unsafe impl WritableToBytes for isize {}
> > +unsafe impl WritableToBytes for bool {}
> > +unsafe impl WritableToBytes for char {}
> > +unsafe impl WritableToBytes for str {}
> > +// SAFETY: If individual values in an array have no uninitialized portions, then
> > +// the the array itself does not have any uninitialized portions either.
> > +unsafe impl<T: WritableToBytes> WritableToBytes for [T] {}
> > +unsafe impl<T: WritableToBytes, const N: usize> WritableToBytes for [T; N] {}
>
> These traits are probably usable in a lot of other places (e.g.
> packets, GPU), so could you put them in a separate module?

I can move them to the types module.

> The patterns here are pretty similar to what the bytemuck crate does
> [1]. Since that crate is well established and open licensed, I think
> it makes sense to keep their naming or possibly even vendor a portion
> in.

Vendoring bytemuck is out of scope for this patchset.

If we *are* going to vendor one of them, I would suggest zerocopy over bytemuck.

> In particular, this would likely include the traits:
>
> - AnyBitPattern, which is roughly ReadableFromBytes here
> - NoUninit, which is roughly WritableToBytes here
> - Optionally Pod (plain old data), a supertrait of both AnyBitPattern
> and NoUninit just used to simplify trait implementation (impl Pod and
> you get the other two).

I can rename the two traits, but I'm not going to introduce Pod. It's
over engineered for my purposes. Also, I prefer the trait names from
zerocopy. They emphasize that it's really about conversions to/from
byte arrays, and not about moving values around.

Note that WritableToBytes has an extra comment about pointer addresses
that bytemuck/zerocopy doesn't have.

> And the functions:
>
> - from_bytes to turn &[u8] into &T for use in `read`. Needs `T: Copy`
> to return an owned value, as noted above.
> - bytes_of to turn &T into &[u8], for use in `write`
>
> The derive macros would also be nice to have down the line, though
> bytemuck's unfortunately relies on syn.
>
> - Trevor
>
> [1]: https://docs.rs/bytemuck/latest/bytemuck/

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 3/3] rust: add abstraction for `struct page`
  2024-02-01  6:50         ` Trevor Gross
  2024-02-05 17:23           ` Boqun Feng
@ 2024-02-08 13:36           ` Alice Ryhl
  1 sibling, 0 replies; 40+ messages in thread
From: Alice Ryhl @ 2024-02-08 13:36 UTC (permalink / raw)
  To: Trevor Gross
  Cc: Boqun Feng, Andreas Hindborg, Miguel Ojeda, Alex Gaynor,
	Wedson Almeida Filho, Gary Guo, Björn Roy Baron,
	Benno Lossin, Kees Cook, Al Viro, Andrew Morton,
	Greg Kroah-Hartman, Arve Hjønnevåg, Todd Kjos,
	Martijn Coenen, Joel Fernandes, Carlos Llamas,
	Suren Baghdasaryan, Arnd Bergmann, linux-mm, linux-kernel,
	rust-for-linux, Christian Brauner

On Thu, Feb 1, 2024 at 7:51 AM Trevor Gross <tmgross@umich.edu> wrote:
>
> On Fri, Jan 26, 2024 at 1:28 PM Boqun Feng <boqun.feng@gmail.com> wrote:
> >
> > On Fri, Jan 26, 2024 at 01:33:46PM +0100, Alice Ryhl wrote:
> > > On Fri, Jan 26, 2024 at 1:47 AM Boqun Feng <boqun.feng@gmail.com> wrote:
> > > >
> > > > On Wed, Jan 24, 2024 at 11:20:23AM +0000, Alice Ryhl wrote:
> > > > > [...]
> > > > > +    /// Maps the page and writes into it from the given buffer.
> > > > > +    ///
> > > > > +    /// # Safety
> > > > > +    ///
> > > > > +    /// Callers must ensure that `src` is valid for reading `len` bytes.
> > > > > +    pub unsafe fn write(&self, src: *const u8, offset: usize, len: usize) -> Result {
> > > >
> > > > Use a slice like type as `src` maybe? Then the function can be safe:
> > > >
> > > >         pub fn write<S: AsRef<[u8]>>(&self, src: S, offset: usize) -> Result
> > > >
> > > > Besides, since `Page` impl `Sync`, shouldn't this `write` and the
> > > > `fill_zero` be a `&mut self` function? Or make them both `unsafe`
> > > > because of potential race and add some safety requirement?
> > >
> > > Ideally, we don't want data races with these methods to be UB. They
> >
> > I understand that, but in the current code, you can write:
> >
> >         CPU 0                   CPU 1
> >         =====                   =====
> >
> >         page.write(src1, 0, 8);
> >                                 page.write(src2, 0, 8);
> >
> > and it's a data race at kernel end. So my question is more how we can
> > prevent the UB ;-)
>
> Hm. Would the following work?
>
>     // Change existing functions to work with references, meaning they need an
>     // exclusive &mut self
>     pub fn with_page_mapped<T>(
>         &mut self,
>         f: impl FnOnce(&mut [u8; PAGE_SIZE]) -> T
>     ) -> T
>
>     pub fn with_pointer_into_page<T>(
>         &mut self,
>         off: usize,
>         len: usize,
>         f: impl FnOnce(&mut [u8]) -> Result<T>,
>     ) -> Result<T>
>
>     // writing methods now take &mut self
>     pub fn write(&mut self ...)
>     pub fn fill_zero(&mut self ...)
>     pub fn copy_into_page(&mut self ...)
>
>     // Add two new functions that take &self, but return shared access
>     pub fn with_page_mapped_raw<T>(
>         &self,
>         f: impl FnOnce(&UnsafeCell<[u8; PAGE_SIZE]>) -> T
>     ) -> T
>
>     pub fn with_pointer_into_page_raw<T>(
>         &self,
>         off: usize,
>         len: usize,
>         f: impl FnOnce(&[UnsafeCell<u8>]) -> Result<T>,
>     ) -> Result<T>
>
> This would mean that anyone who can obey rust's mutability rules can
> use a page without any safety or race conditions to worry about, much
> better for usability.

The methods can't be `&mut self` because I need the ability to perform
concurrent writes to disjoint subsets of the page.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 3/3] rust: add abstraction for `struct page`
  2024-02-01  6:02   ` Trevor Gross
@ 2024-02-08 13:46     ` Alice Ryhl
  2024-02-08 14:02       ` Andreas Hindborg
  0 siblings, 1 reply; 40+ messages in thread
From: Alice Ryhl @ 2024-02-08 13:46 UTC (permalink / raw)
  To: Trevor Gross
  Cc: Miguel Ojeda, Alex Gaynor, Wedson Almeida Filho, Boqun Feng,
	Gary Guo, Björn Roy Baron, Benno Lossin, Andreas Hindborg,
	Kees Cook, Al Viro, Andrew Morton, Greg Kroah-Hartman,
	Arve Hjønnevåg, Todd Kjos, Martijn Coenen,
	Joel Fernandes, Carlos Llamas, Suren Baghdasaryan, Arnd Bergmann,
	linux-mm, linux-kernel, rust-for-linux, Christian Brauner

On Thu, Feb 1, 2024 at 7:02 AM Trevor Gross <tmgross@umich.edu> wrote:
>
> On Wed, Jan 24, 2024 at 6:22 AM Alice Ryhl <aliceryhl@google.com> wrote:
> > +/// A pointer to a page that owns the page allocation.
> > +///
> > +/// # Invariants
> > +///
> > +/// The pointer points at a page, and has ownership over the page.
> > +pub struct Page {
> > +    page: NonNull<bindings::page>,
> > +}
>
> Shouldn't this be UnsafeCell / Opaque? Since `struct page` contains locks.

That only matters when we use a reference. Here, it's behind a raw pointer.

> > +// SAFETY: It is safe to transfer page allocations between threads.
> > +unsafe impl Send for Page {}
> > +
> > +// SAFETY: Calling `&self` methods on this type in parallel is safe. It might
> > +// allow you to perform a data race on bytes stored in the page, but we treat
> > +// this like data races on user pointers.
> > +unsafe impl Sync for Page {}
>
> These races should probably be in the Page docs, rather than pointing
> to user pointers.

New safety comment:

SAFETY: As long as the safety requirements for `&self` methods on this
type are followed, there is no problem with calling them in parallel.

> > +impl Page {
> > +    /// Allocates a new set of contiguous pages.
>
> "set of contiguous page" -> "page"?

Thanks, done.

> > +    pub fn new() -> Result<Self, AllocError> {
> > +        // SAFETY: These are the correct arguments to allocate a single page.
> > +        let page = unsafe {
> > +            bindings::alloc_pages(
> > +                bindings::GFP_KERNEL | bindings::__GFP_ZERO | bindings::__GFP_HIGHMEM,
> > +                0,
> > +            )
> > +        };
> > +
> > +        match NonNull::new(page) {
> > +            // INVARIANT: We checked that the allocation above succeeded.
> > +            Some(page) => Ok(Self { page }),
> > +            None => Err(AllocError),
> > +        }
>
> Optionally:
>
>     let page = NonNull::new(page).ok_or(AllocError)?;
>     Ok(Self { page })

Done.

> > +    }
> > +
> > +    /// Returns a raw pointer to the page.
>
> Maybe add ", valid for PAGE_SIZE" or similar to make this obvious.

This is a pointer to the `struct page`, not the actual page data.

> > +    pub fn as_ptr(&self) -> *mut bindings::page {
> > +        self.page.as_ptr()
> > +    }
> > +
> > +    /// Runs a piece of code with this page mapped to an address.
>
> Maybe ", then immediately unmaps the page" to make the entire operation clear.

Ok.

> > +    /// It is up to the caller to use the provided raw pointer correctly.
> > +    pub fn with_page_mapped<T>(&self, f: impl FnOnce(*mut c_void) -> T) -> T {
>
> If there is exclusive access into the page, this signature could be:
>
>     FnOnce(&mut [u8; PAGE_SIZE]) -> T
>
> Otherwise possibly
>
>     FnOnce(*mut [u8; PAGE_SIZE]) -> T
>
> But based on the thread with Boqun it seems there is no synchronized
> access here. In this case, "use the provided raw pointer correctly" or
> the type level docs should clarify what you can and can't rely on with
> pointers into a page.
>
> E.g. if I'm understanding correctly, you can never construct a &T or
> &mut T anywhere in this page unless T is Sync.

We discussed this in the meeting and concluded that we should use *mut u8 here.

> > +    /// Runs a piece of code with a raw pointer to a slice of this page, with
> > +    /// bounds checking.
> > +    ///
> > +    /// If `f` is called, then it will be called with a pointer that points at
> > +    /// `off` bytes into the page, and the pointer will be valid for at least
> > +    /// `len` bytes. The pointer is only valid on this task, as this method uses
> > +    /// a local mapping./
> > +    ///
> > +    /// If `off` and `len` refers to a region outside of this page, then this
> > +    /// method returns `EINVAL` and does not call `f`.
> > +    pub fn with_pointer_into_page<T>(
> > +        &self,
> > +        off: usize,
> > +        len: usize,
> > +        f: impl FnOnce(*mut u8) -> Result<T>,
> > +    ) -> Result<T> {
>
> Same question about exclusive access
>
>     impl FnOnce(&mut [u8]) -> Result<T>

We discussed this in the meeting. Slices raise all sorts of cans of
worms with uninit and exclusivity, so the raw methods won't use them.

> > +        let bounds_ok = off <= PAGE_SIZE && len <= PAGE_SIZE && (off + len) <= PAGE_SIZE;
> > +
> > +        if bounds_ok {
> > +            self.with_page_mapped(move |page_addr| {
> > +                // SAFETY: The `off` integer is at most `PAGE_SIZE`, so this pointer offset will
> > +                // result in a pointer that is in bounds or one off the end of the page.
> > +                f(unsafe { page_addr.cast::<u8>().add(off) })
> > +            })
> > +        } else {
> > +            Err(EINVAL)
> > +        }
> > +    }
> > +
> > +    /// Maps the page and reads from it into the given buffer.
> > +    ///
> > +    /// # Safety
> > +    ///
> > +    /// Callers must ensure that `dest` is valid for writing `len` bytes.
> > +    pub unsafe fn read(&self, dest: *mut u8, offset: usize, len: usize) -> Result {
>
> Is there a reason not to use a slice just for a destination to read into?

Ditto.

> > +        self.with_pointer_into_page(offset, len, move |from_ptr| {
>
> Nit: do the names from_ptr/to_ptr come from existing binder? src/dst
> seems more common (also dst vs. dest).

Renamed everything to use src/dst

> > +        self.with_pointer_into_page(offset, len, move |to_ptr| {
> > +            // SAFETY: If `with_pointer_into_page` calls into this closure, then
> > +            // it has performed a bounds check and guarantees that `to_ptr` is
> > +            // valid for `len` bytes.
> > +            unsafe { ptr::copy(src, to_ptr, len) };
> > +            Ok(())
> > +        })
> > +    }
> > +
> > +    /// Maps the page and zeroes the given slice.
>
> Mention that this will error with the same conditions as with_pointer_into_page.

That method is private. I will add documentation for this that doesn't
reference with_pointer_into_page.

> > +    pub fn fill_zero(&self, offset: usize, len: usize) -> Result {
> > +        self.with_pointer_into_page(offset, len, move |to_ptr| {
> > +            // SAFETY: If `with_pointer_into_page` calls into this closure, then
> > +            // it has performed a bounds check and guarantees that `to_ptr` is
> > +            // valid for `len` bytes.
> > +            unsafe { ptr::write_bytes(to_ptr, 0u8, len) };
> > +            Ok(())
> > +        })
> > +    }
> > +
> > +    /// Copies data from userspace into this page.
> > +    pub fn copy_into_page(
> > +        &self,
> > +        reader: &mut UserSlicePtrReader,
> > +        offset: usize,
> > +        len: usize,
> > +    ) -> Result {
>
> Maybe copy_from_user_slice or something that includes "user", since
> as-is it sounds like copying a page into another page.
>
> Also, docs should point out the error condition.

Done.

Thanks,
Alice

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 3/3] rust: add abstraction for `struct page`
  2024-02-08 13:46     ` Alice Ryhl
@ 2024-02-08 14:02       ` Andreas Hindborg
  2024-02-08 14:12         ` Alice Ryhl
  0 siblings, 1 reply; 40+ messages in thread
From: Andreas Hindborg @ 2024-02-08 14:02 UTC (permalink / raw)
  To: Alice Ryhl
  Cc: Trevor Gross, Miguel Ojeda, Alex Gaynor, Wedson Almeida Filho,
	Boqun Feng, Gary Guo, Björn Roy Baron, Benno Lossin,
	Kees Cook, Al Viro, Andrew Morton, Greg Kroah-Hartman,
	Arve Hjønnevåg, Todd Kjos, Martijn Coenen,
	Joel Fernandes, Carlos Llamas, Suren Baghdasaryan, Arnd Bergmann,
	linux-mm, linux-kernel, rust-for-linux, Christian Brauner


Alice Ryhl <aliceryhl@google.com> writes:

> On Thu, Feb 1, 2024 at 7:02 AM Trevor Gross <tmgross@umich.edu> wrote:
>>
>> On Wed, Jan 24, 2024 at 6:22 AM Alice Ryhl <aliceryhl@google.com> wrote:
>> > +/// A pointer to a page that owns the page allocation.
>> > +///
>> > +/// # Invariants
>> > +///
>> > +/// The pointer points at a page, and has ownership over the page.
>> > +pub struct Page {
>> > +    page: NonNull<bindings::page>,
>> > +}
>>
>> Shouldn't this be UnsafeCell / Opaque? Since `struct page` contains locks.
>
> That only matters when we use a reference. Here, it's behind a raw pointer.

Why is it behind a pointer rather than being transparent over
`Opaque<bindings::page>` and using a `&Page` instead?

BR Andreas

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 3/3] rust: add abstraction for `struct page`
  2024-02-08 14:02       ` Andreas Hindborg
@ 2024-02-08 14:12         ` Alice Ryhl
  0 siblings, 0 replies; 40+ messages in thread
From: Alice Ryhl @ 2024-02-08 14:12 UTC (permalink / raw)
  To: Andreas Hindborg
  Cc: Trevor Gross, Miguel Ojeda, Alex Gaynor, Wedson Almeida Filho,
	Boqun Feng, Gary Guo, Björn Roy Baron, Benno Lossin,
	Kees Cook, Al Viro, Andrew Morton, Greg Kroah-Hartman,
	Arve Hjønnevåg, Todd Kjos, Martijn Coenen,
	Joel Fernandes, Carlos Llamas, Suren Baghdasaryan, Arnd Bergmann,
	linux-mm, linux-kernel, rust-for-linux, Christian Brauner

On Thu, Feb 8, 2024 at 3:02 PM Andreas Hindborg <a.hindborg@samsung.com> wrote:
>
>
> Alice Ryhl <aliceryhl@google.com> writes:
>
> > On Thu, Feb 1, 2024 at 7:02 AM Trevor Gross <tmgross@umich.edu> wrote:
> >>
> >> On Wed, Jan 24, 2024 at 6:22 AM Alice Ryhl <aliceryhl@google.com> wrote:
> >> > +/// A pointer to a page that owns the page allocation.
> >> > +///
> >> > +/// # Invariants
> >> > +///
> >> > +/// The pointer points at a page, and has ownership over the page.
> >> > +pub struct Page {
> >> > +    page: NonNull<bindings::page>,
> >> > +}
> >>
> >> Shouldn't this be UnsafeCell / Opaque? Since `struct page` contains locks.
> >
> > That only matters when we use a reference. Here, it's behind a raw pointer.
>
> Why is it behind a pointer rather than being transparent over
> `Opaque<bindings::page>` and using a `&Page` instead?

Because `&Page` would not have ownership of the page, but I need
ownership. We also can't use `ARef<Page>` because that has a `clone`
method.

One could introduce an Owned smart pointer and use `Owned<Page>`, but
I think that is out of scope for this patchset.

Alice

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 1/3] rust: add userspace pointers
  2024-02-08 12:53     ` Alice Ryhl
@ 2024-02-08 15:35       ` Greg Kroah-Hartman
  2024-02-08 15:41         ` Alice Ryhl
  2024-02-10  7:06       ` Trevor Gross
  2024-02-10 14:14       ` David Laight
  2 siblings, 1 reply; 40+ messages in thread
From: Greg Kroah-Hartman @ 2024-02-08 15:35 UTC (permalink / raw)
  To: Alice Ryhl
  Cc: Trevor Gross, Miguel Ojeda, Alex Gaynor, Wedson Almeida Filho,
	Boqun Feng, Gary Guo, Björn Roy Baron, Benno Lossin,
	Andreas Hindborg, Kees Cook, Al Viro, Andrew Morton,
	Arve Hjønnevåg, Todd Kjos, Martijn Coenen,
	Joel Fernandes, Carlos Llamas, Suren Baghdasaryan, Arnd Bergmann,
	linux-mm, linux-kernel, rust-for-linux, Christian Brauner

On Thu, Feb 08, 2024 at 01:53:20PM +0100, Alice Ryhl wrote:
> > Is some sort of `Debug` implementation useful?
> 
> Leaking pointer addresses in logs is frowned upon in the kernel, so I
> don't think we should include that.

s/frowned upon/forbidden/ :)

Along those lines, you all are tieing in the "I want to print a pointer,
so hash it properly before I do so" logic from rust like we have in c,
right?  Ideally you'd use the same logic if at all possible.

If not, that probably needs to be done so that you don't accidentally
start leaking things.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 1/3] rust: add userspace pointers
  2024-02-08 15:35       ` Greg Kroah-Hartman
@ 2024-02-08 15:41         ` Alice Ryhl
  2024-02-08 15:59           ` Greg Kroah-Hartman
  2024-02-10  6:20           ` Kees Cook
  0 siblings, 2 replies; 40+ messages in thread
From: Alice Ryhl @ 2024-02-08 15:41 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Trevor Gross, Miguel Ojeda, Alex Gaynor, Wedson Almeida Filho,
	Boqun Feng, Gary Guo, Björn Roy Baron, Benno Lossin,
	Andreas Hindborg, Kees Cook, Al Viro, Andrew Morton,
	Arve Hjønnevåg, Todd Kjos, Martijn Coenen,
	Joel Fernandes, Carlos Llamas, Suren Baghdasaryan, Arnd Bergmann,
	linux-mm, linux-kernel, rust-for-linux, Christian Brauner

On Thu, Feb 8, 2024 at 4:35 PM Greg Kroah-Hartman
<gregkh@linuxfoundation.org> wrote:
>
> On Thu, Feb 08, 2024 at 01:53:20PM +0100, Alice Ryhl wrote:
> > > Is some sort of `Debug` implementation useful?
> >
> > Leaking pointer addresses in logs is frowned upon in the kernel, so I
> > don't think we should include that.
>
> s/frowned upon/forbidden/ :)

:)

> Along those lines, you all are tieing in the "I want to print a pointer,
> so hash it properly before I do so" logic from rust like we have in c,
> right?  Ideally you'd use the same logic if at all possible.
>
> If not, that probably needs to be done so that you don't accidentally
> start leaking things.

I don't know what the status of this is. For anything I've added, I've
just made it entirely impossible to print addresses, hashed or not.

Alice

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 1/3] rust: add userspace pointers
  2024-02-08 15:41         ` Alice Ryhl
@ 2024-02-08 15:59           ` Greg Kroah-Hartman
  2024-02-10  6:20           ` Kees Cook
  1 sibling, 0 replies; 40+ messages in thread
From: Greg Kroah-Hartman @ 2024-02-08 15:59 UTC (permalink / raw)
  To: Alice Ryhl
  Cc: Trevor Gross, Miguel Ojeda, Alex Gaynor, Wedson Almeida Filho,
	Boqun Feng, Gary Guo, Björn Roy Baron, Benno Lossin,
	Andreas Hindborg, Kees Cook, Al Viro, Andrew Morton,
	Arve Hjønnevåg, Todd Kjos, Martijn Coenen,
	Joel Fernandes, Carlos Llamas, Suren Baghdasaryan, Arnd Bergmann,
	linux-mm, linux-kernel, rust-for-linux, Christian Brauner

On Thu, Feb 08, 2024 at 04:41:52PM +0100, Alice Ryhl wrote:
> On Thu, Feb 8, 2024 at 4:35 PM Greg Kroah-Hartman
> > Along those lines, you all are tieing in the "I want to print a pointer,
> > so hash it properly before I do so" logic from rust like we have in c,
> > right?  Ideally you'd use the same logic if at all possible.
> >
> > If not, that probably needs to be done so that you don't accidentally
> > start leaking things.
> 
> I don't know what the status of this is. For anything I've added, I've
> just made it entirely impossible to print addresses, hashed or not.

Even better, thanks!

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 1/3] rust: add userspace pointers
  2024-02-08 15:41         ` Alice Ryhl
  2024-02-08 15:59           ` Greg Kroah-Hartman
@ 2024-02-10  6:20           ` Kees Cook
  1 sibling, 0 replies; 40+ messages in thread
From: Kees Cook @ 2024-02-10  6:20 UTC (permalink / raw)
  To: Alice Ryhl, Greg Kroah-Hartman
  Cc: Trevor Gross, Miguel Ojeda, Alex Gaynor, Wedson Almeida Filho,
	Boqun Feng, Gary Guo, Björn Roy Baron, Benno Lossin,
	Andreas Hindborg, Kees Cook, Al Viro, Andrew Morton,
	Arve Hjønnevåg, Todd Kjos, Martijn Coenen,
	Joel Fernandes, Carlos Llamas, Suren Baghdasaryan, Arnd Bergmann,
	linux-mm, linux-kernel, rust-for-linux, Christian Brauner



On February 8, 2024 7:41:52 AM PST, Alice Ryhl <aliceryhl@google.com> wrote:
>On Thu, Feb 8, 2024 at 4:35 PM Greg Kroah-Hartman
><gregkh@linuxfoundation.org> wrote:
>>
>> On Thu, Feb 08, 2024 at 01:53:20PM +0100, Alice Ryhl wrote:
>> > > Is some sort of `Debug` implementation useful?
>> >
>> > Leaking pointer addresses in logs is frowned upon in the kernel, so I
>> > don't think we should include that.
>>
>> s/frowned upon/forbidden/ :)
>
>:)
>
>> Along those lines, you all are tieing in the "I want to print a pointer,
>> so hash it properly before I do so" logic from rust like we have in c,
>> right?  Ideally you'd use the same logic if at all possible.
>>
>> If not, that probably needs to be done so that you don't accidentally
>> start leaking things.
>
>I don't know what the status of this is. For anything I've added, I've
>just made it entirely impossible to print addresses, hashed or not.

Thank you! That is certainly the preferred[1] approach. :)

-Kees

[1] https://www.kernel.org/doc/html/next/process/deprecated.html#p-format-specifier

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 1/3] rust: add userspace pointers
  2024-02-08 12:53     ` Alice Ryhl
  2024-02-08 15:35       ` Greg Kroah-Hartman
@ 2024-02-10  7:06       ` Trevor Gross
  2024-02-10 14:14       ` David Laight
  2 siblings, 0 replies; 40+ messages in thread
From: Trevor Gross @ 2024-02-10  7:06 UTC (permalink / raw)
  To: Alice Ryhl
  Cc: Miguel Ojeda, Alex Gaynor, Wedson Almeida Filho, Boqun Feng,
	Gary Guo, Björn Roy Baron, Benno Lossin, Andreas Hindborg,
	Kees Cook, Al Viro, Andrew Morton, Greg Kroah-Hartman,
	Arve Hjønnevåg, Todd Kjos, Martijn Coenen,
	Joel Fernandes, Carlos Llamas, Suren Baghdasaryan, Arnd Bergmann,
	linux-mm, linux-kernel, rust-for-linux, Christian Brauner

On Thu, Feb 8, 2024 at 6:53 AM Alice Ryhl <aliceryhl@google.com> wrote:
> Leaking pointer addresses in logs is frowned upon in the kernel, so I
> don't think we should include that.

Sounds like the very correct decision from what Greg and Kees said. If
we add an impl in the future (to be able to `#[derive(Debug)]` on
anything containing this type) it could just print
`0x****************`.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* RE: [PATCH 1/3] rust: add userspace pointers
  2024-02-08 12:53     ` Alice Ryhl
  2024-02-08 15:35       ` Greg Kroah-Hartman
  2024-02-10  7:06       ` Trevor Gross
@ 2024-02-10 14:14       ` David Laight
  2024-02-12  9:30         ` Alice Ryhl
  2 siblings, 1 reply; 40+ messages in thread
From: David Laight @ 2024-02-10 14:14 UTC (permalink / raw)
  To: 'Alice Ryhl', Trevor Gross
  Cc: Miguel Ojeda, Alex Gaynor, Wedson Almeida Filho, Boqun Feng,
	Gary Guo, Björn Roy Baron, Benno Lossin, Andreas Hindborg,
	Kees Cook, Al Viro, Andrew Morton, Greg Kroah-Hartman,
	Arve Hjønnevåg, Todd Kjos, Martijn Coenen,
	Joel Fernandes, Carlos Llamas, Suren Baghdasaryan, Arnd Bergmann,
	linux-mm, linux-kernel, rust-for-linux, Christian Brauner

...
> > Maybe something like
> >
> >     Every time a memory location is read, the reader's position is advanced by
> >     the read length and the next read will start from there. This helps prevent
> >     accidentally reading the same location twice and causing a TOCTOU bug.

WTF TOCTOU? I'm guessing it is reading things twice and getting
different answers.

That really doesn't match how copying from userspace is used is many places.
Sometimes you really do want to be using offsets and lengths.
For instance the user buffer might contain offsets of items further
down the buffer.
There is also the code (eg ioctl) that does a read-modify-write
on a buffer.


There is also this bit:

> > +    /// Reads the entirety of the user slice.
> > +    ///
> > +    /// Returns `EFAULT` if the address does not currently point to
> > +    /// mapped, readable memory.
> > +    pub fn read_all(self) -> Result<Vec<u8>> {
> > +        self.reader().read_all()
> > +    }
>
> If I understand it correctly, the function will return `EFAULT` if _any_
> address in the interval `[self.0, self.0 + self.1)` does not point to
> mapped, readable memory. Maybe the docs could be more explicit.

That isn't (and can't be) how it works.
access_ok() checks that the buffer isn't in kernel space.
The copy is then done until it actually faults on an invalid address.
In that case the destination buffer has been updated to the point
of failure.

You can't do a check before the copy because another thread can
change the mapping (it would also be horribly expensive).

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 1/3] rust: add userspace pointers
  2024-02-10 14:14       ` David Laight
@ 2024-02-12  9:30         ` Alice Ryhl
  0 siblings, 0 replies; 40+ messages in thread
From: Alice Ryhl @ 2024-02-12  9:30 UTC (permalink / raw)
  To: David Laight
  Cc: Trevor Gross, Miguel Ojeda, Alex Gaynor, Wedson Almeida Filho,
	Boqun Feng, Gary Guo, Björn Roy Baron, Benno Lossin,
	Andreas Hindborg, Kees Cook, Al Viro, Andrew Morton,
	Greg Kroah-Hartman, Arve Hjønnevåg, Todd Kjos,
	Martijn Coenen, Joel Fernandes, Carlos Llamas,
	Suren Baghdasaryan, Arnd Bergmann, linux-mm, linux-kernel,
	rust-for-linux, Christian Brauner

On Sat, Feb 10, 2024 at 3:15 PM David Laight <David.Laight@aculab.com> wrote:
>
> ...
> > > Maybe something like
> > >
> > >     Every time a memory location is read, the reader's position is advanced by
> > >     the read length and the next read will start from there. This helps prevent
> > >     accidentally reading the same location twice and causing a TOCTOU bug.
>
> WTF TOCTOU? I'm guessing it is reading things twice and getting
> different answers.

Yes. In v2 of this patchset [1], I expanded TOCTOU to "time-of-check
to time-of-use" at the first use to reduce this confusion.

> That really doesn't match how copying from userspace is used is many places.
> Sometimes you really do want to be using offsets and lengths.
> For instance the user buffer might contain offsets of items further
> down the buffer.

For this use-case, you can call UserSlice::new multiple times, or use
clone_reader. This use-case does appear sometimes in Rust Binder and
is supported, but I didn't find it to be the most common use-case.

> There is also the code (eg ioctl) that does a read-modify-write
> on a buffer.

The read-modify-write use-case is quite common in Rust Binder and is
supported by the API provided by this patchset. When you call
reader_writer, you get a separate reader and writer. Then, you first
use the reader to read the data. Then you modify it. Then you use the
writer to write it back.

> > > +    /// Reads the entirety of the user slice.
> > > +    ///
> > > +    /// Returns `EFAULT` if the address does not currently point to
> > > +    /// mapped, readable memory.
> > > +    pub fn read_all(self) -> Result<Vec<u8>> {
> > > +        self.reader().read_all()
> > > +    }
> >
> > If I understand it correctly, the function will return `EFAULT` if _any_
> > address in the interval `[self.0, self.0 + self.1)` does not point to
> > mapped, readable memory. Maybe the docs could be more explicit.
>
> That isn't (and can't be) how it works.
> access_ok() checks that the buffer isn't in kernel space.
> The copy is then done until it actually faults on an invalid address.
> In that case the destination buffer has been updated to the point
> of failure.
>
> You can't do a check before the copy because another thread can
> change the mapping (it would also be horribly expensive).

This was reworded in v2 [1]:

/// Fails with `EFAULT` if the read encounters a page fault.

But ultimately, the real condition here is just that it returns EFAULT
if copy_from_user fails. I'm happy to reword further.

Alice

[1]: https://lore.kernel.org/all/20240208-alice-mm-v2-1-d821250204a6@google.com/

^ permalink raw reply	[flat|nested] 40+ messages in thread

end of thread, other threads:[~2024-02-12  9:30 UTC | newest]

Thread overview: 40+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-01-24 11:20 [PATCH 0/3] Memory management patches needed by Rust Binder Alice Ryhl
2024-01-24 11:20 ` [PATCH 1/3] rust: add userspace pointers Alice Ryhl
2024-01-24 23:12   ` Valentin Obst
2024-02-08 12:20     ` Alice Ryhl
2024-02-01  4:06   ` Trevor Gross
2024-02-08 12:53     ` Alice Ryhl
2024-02-08 15:35       ` Greg Kroah-Hartman
2024-02-08 15:41         ` Alice Ryhl
2024-02-08 15:59           ` Greg Kroah-Hartman
2024-02-10  6:20           ` Kees Cook
2024-02-10  7:06       ` Trevor Gross
2024-02-10 14:14       ` David Laight
2024-02-12  9:30         ` Alice Ryhl
2024-01-24 11:20 ` [PATCH 2/3] rust: add typed accessors for " Alice Ryhl
2024-01-24 23:46   ` Valentin Obst
2024-01-25 12:40     ` Alice Ryhl
2024-01-25 12:26   ` Arnd Bergmann
2024-01-25 12:37     ` Alice Ryhl
2024-01-25 15:59       ` Arnd Bergmann
2024-01-25 16:15         ` Alice Ryhl
2024-02-01  5:03   ` Trevor Gross
2024-02-08 13:14     ` Alice Ryhl
2024-01-24 11:20 ` [PATCH 3/3] rust: add abstraction for `struct page` Alice Ryhl
2024-01-26  0:46   ` Boqun Feng
2024-01-26 12:33     ` Alice Ryhl
2024-01-26 18:28       ` Boqun Feng
2024-02-01  6:50         ` Trevor Gross
2024-02-05 17:23           ` Boqun Feng
2024-02-08 13:36           ` Alice Ryhl
2024-01-30  9:15     ` Andreas Hindborg (Samsung)
2024-01-29 17:59   ` Matthew Wilcox
2024-01-29 18:56     ` Carlos Llamas
2024-01-29 20:19       ` Matthew Wilcox
2024-01-29 21:27     ` Alice Ryhl
2024-01-30  9:02     ` Andreas Hindborg
2024-01-30  9:06       ` Alice Ryhl
2024-02-01  6:02   ` Trevor Gross
2024-02-08 13:46     ` Alice Ryhl
2024-02-08 14:02       ` Andreas Hindborg
2024-02-08 14:12         ` Alice Ryhl

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).