linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Boqun Feng <boqun.feng@gmail.com>
To: Gary Guo <gary@garyguo.net>
Cc: "Benno Lossin" <benno.lossin@proton.me>,
	"Wedson Almeida Filho" <wedsonaf@gmail.com>,
	rust-for-linux@vger.kernel.org, "Miguel Ojeda" <ojeda@kernel.org>,
	"Alex Gaynor" <alex.gaynor@gmail.com>,
	"Björn Roy Baron" <bjorn3_gh@protonmail.com>,
	linux-kernel@vger.kernel.org,
	"Wedson Almeida Filho" <walmeida@microsoft.com>,
	"Martin Rodriguez Reboredo" <yakoyoku@gmail.com>
Subject: Re: [PATCH v4 08/13] rust: introduce `ARef`
Date: Fri, 14 Apr 2023 10:03:08 -0700	[thread overview]
Message-ID: <ZDmHTIP5dEkXPOyX@Boquns-Mac-mini.local> (raw)
In-Reply-To: <20230414153806.5dabfdb3.gary@garyguo.net>

On Fri, Apr 14, 2023 at 03:38:06PM +0100, Gary Guo wrote:
> On Fri, 14 Apr 2023 09:46:53 +0000
> Benno Lossin <benno.lossin@proton.me> wrote:
> 
> > On 14.04.23 11:00, Wedson Almeida Filho wrote:
> > > On Thu, 13 Apr 2023 at 19:30, Benno Lossin <benno.lossin@proton.me> wrote:  
> > >>
> > >> On 13.04.23 19:06, Wedson Almeida Filho wrote:  
> > >>> On Thu, 13 Apr 2023 at 06:19, Benno Lossin <benno.lossin@proton.me> wrote:  
> > >>>>
> > >>>> On 11.04.23 07:45, Wedson Almeida Filho wrote:  
> > >>>>> From: Wedson Almeida Filho <walmeida@microsoft.com>
> > >>>>>
> > >>>>> This is an owned reference to an object that is always ref-counted. This
> > >>>>> is meant to be used in wrappers for C types that have their own ref
> > >>>>> counting functions, for example, tasks, files, inodes, dentries, etc.
> > >>>>>
> > >>>>> Reviewed-by: Martin Rodriguez Reboredo <yakoyoku@gmail.com>
> > >>>>> Signed-off-by: Wedson Almeida Filho <walmeida@microsoft.com>
> > >>>>> ---
> > >>>>> v1 -> v2: No changes
> > >>>>> v2 -> v3: No changes
> > >>>>> v3 -> v4: No changes
> > >>>>>
> > >>>>>     rust/kernel/types.rs | 107 +++++++++++++++++++++++++++++++++++++++++++
> > >>>>>     1 file changed, 107 insertions(+)
> > >>>>>
> > >>>>> diff --git a/rust/kernel/types.rs b/rust/kernel/types.rs
> > >>>>> index a4b1e3778da7..29db59d6119a 100644
> > >>>>> --- a/rust/kernel/types.rs
> > >>>>> +++ b/rust/kernel/types.rs
> > >>>>> @@ -6,8 +6,10 @@ use crate::init::{self, PinInit};
> > >>>>>     use alloc::boxed::Box;
> > >>>>>     use core::{
> > >>>>>         cell::UnsafeCell,
> > >>>>> +    marker::PhantomData,
> > >>>>>         mem::MaybeUninit,
> > >>>>>         ops::{Deref, DerefMut},
> > >>>>> +    ptr::NonNull,
> > >>>>>     };
> > >>>>>
> > >>>>>     /// Used to transfer ownership to and from foreign (non-Rust) languages.
> > >>>>> @@ -268,6 +270,111 @@ impl<T> Opaque<T> {
> > >>>>>         }
> > >>>>>     }
> > >>>>>
> > >>>>> +/// Types that are _always_ reference counted.
> > >>>>> +///
> > >>>>> +/// It allows such types to define their own custom ref increment and decrement functions.
> > >>>>> +/// Additionally, it allows users to convert from a shared reference `&T` to an owned reference
> > >>>>> +/// [`ARef<T>`].
> > >>>>> +///
> > >>>>> +/// This is usually implemented by wrappers to existing structures on the C side of the code. For
> > >>>>> +/// Rust code, the recommendation is to use [`Arc`](crate::sync::Arc) to create reference-counted
> > >>>>> +/// instances of a type.
> > >>>>> +///
> > >>>>> +/// # Safety
> > >>>>> +///
> > >>>>> +/// Implementers must ensure that increments to the reference count keep the object alive in memory
> > >>>>> +/// at least until matching decrements are performed.
> > >>>>> +///
> > >>>>> +/// Implementers must also ensure that all instances are reference-counted. (Otherwise they
> > >>>>> +/// won't be able to honour the requirement that [`AlwaysRefCounted::inc_ref`] keep the object
> > >>>>> +/// alive.)  
> > >>>>
> > >>>> `dec_ref` states below that it 'Frees the object when the count reaches
> > >>>> zero.', this should also be stated here, since implementers should adhere
> > >>>> to that when implementing `dec_ref`.  
> > >>>
> > >>> This section is for safety requirements. Freeing the object doesn't
> > >>> fall into this category.  
> > >>
> > >> It still needs to be upheld by the implementer, since it is guaranteed by
> > >> the documentation on the `dec_ref` function. Even non-safety requirements
> > >> are listed on the `unsafe` traits, if users should be able to rely on them.
> > >> If users should not rely on this, then maybe change the docs of `dec_ref`
> > >> to "when the refcount reaches zero, the object might be freed.".  
> > >
> > > I disagree that non-safety requirements should be listed under the
> > > Safety section.
> > >
> > > This section is meant for rules that implementers must adhere to to
> > > ensure their implementations are safe. So it's usually read before
> > > writing a "SAFETY:" comment for their "unsafe impl" blocks -- adding
> > > extraneous information is counterproductive.  
> > 
> > Sure, I think it you could still mention it outside of the safety section.
> > 
> > >>>>> +pub unsafe trait AlwaysRefCounted {
> > >>>>> +    /// Increments the reference count on the object.
> > >>>>> +    fn inc_ref(&self);  
> > >>>>
> > >>>>
> > >>>>  
> > >>>>> +
> > >>>>> +    /// Decrements the reference count on the object.
> > >>>>> +    ///
> > >>>>> +    /// Frees the object when the count reaches zero.
> > >>>>> +    ///
> > >>>>> +    /// # Safety
> > >>>>> +    ///
> > >>>>> +    /// Callers must ensure that there was a previous matching increment to the reference count,
> > >>>>> +    /// and that the object is no longer used after its reference count is decremented (as it may
> > >>>>> +    /// result in the object being freed), unless the caller owns another increment on the refcount
> > >>>>> +    /// (e.g., it calls [`AlwaysRefCounted::inc_ref`] twice, then calls
> > >>>>> +    /// [`AlwaysRefCounted::dec_ref`] once).
> > >>>>> +    unsafe fn dec_ref(obj: NonNull<Self>);
> > >>>>> +}
> > >>>>> +
> > >>>>> +/// An owned reference to an always-reference-counted object.
> > >>>>> +///
> > >>>>> +/// The object's reference count is automatically decremented when an instance of [`ARef`] is
> > >>>>> +/// dropped. It is also automatically incremented when a new instance is created via
> > >>>>> +/// [`ARef::clone`].
> > >>>>> +///
> > >>>>> +/// # Invariants
> > >>>>> +///
> > >>>>> +/// The pointer stored in `ptr` is non-null and valid for the lifetime of the [`ARef`] instance. In
> > >>>>> +/// particular, the [`ARef`] instance owns an increment on the underlying object's reference count.
> > >>>>> +pub struct ARef<T: AlwaysRefCounted> {
> > >>>>> +    ptr: NonNull<T>,
> > >>>>> +    _p: PhantomData<T>,
> > >>>>> +}
> > >>>>> +
> > >>>>> +impl<T: AlwaysRefCounted> ARef<T> {
> > >>>>> +    /// Creates a new instance of [`ARef`].
> > >>>>> +    ///
> > >>>>> +    /// It takes over an increment of the reference count on the underlying object.
> > >>>>> +    ///
> > >>>>> +    /// # Safety
> > >>>>> +    ///
> > >>>>> +    /// Callers must ensure that the reference count was incremented at least once, and that they
> > >>>>> +    /// are properly relinquishing one increment. That is, if there is only one increment, callers
> > >>>>> +    /// must not use the underlying object anymore -- it is only safe to do so via the newly
> > >>>>> +    /// created [`ARef`].
> > >>>>> +    pub unsafe fn from_raw(ptr: NonNull<T>) -> Self {
> > >>>>> +        // INVARIANT: The safety requirements guarantee that the new instance now owns the
> > >>>>> +        // increment on the refcount.
> > >>>>> +        Self {
> > >>>>> +            ptr,
> > >>>>> +            _p: PhantomData,
> > >>>>> +        }
> > >>>>> +    }
> > >>>>> +}
> > >>>>> +
> > >>>>> +impl<T: AlwaysRefCounted> Clone for ARef<T> {
> > >>>>> +    fn clone(&self) -> Self {
> > >>>>> +        self.inc_ref();
> > >>>>> +        // SAFETY: We just incremented the refcount above.
> > >>>>> +        unsafe { Self::from_raw(self.ptr) }
> > >>>>> +    }
> > >>>>> +}
> > >>>>> +
> > >>>>> +impl<T: AlwaysRefCounted> Deref for ARef<T> {
> > >>>>> +    type Target = T;
> > >>>>> +
> > >>>>> +    fn deref(&self) -> &Self::Target {
> > >>>>> +        // SAFETY: The type invariants guarantee that the object is valid.
> > >>>>> +        unsafe { self.ptr.as_ref() }
> > >>>>> +    }
> > >>>>> +}
> > >>>>> +
> > >>>>> +impl<T: AlwaysRefCounted> From<&T> for ARef<T> {
> > >>>>> +    fn from(b: &T) -> Self {
> > >>>>> +        b.inc_ref();
> > >>>>> +        // SAFETY: We just incremented the refcount above.
> > >>>>> +        unsafe { Self::from_raw(NonNull::from(b)) }
> > >>>>> +    }
> > >>>>> +}  
> > >>>>
> > >>>> This impl seems unsound to me, as we can do this:
> > >>>>
> > >>>>        struct MyStruct {
> > >>>>            raw: Opaque<bindings::my_struct>, // This has a `refcount_t` inside.
> > >>>>        }
> > >>>>
> > >>>>        impl MyStruct {
> > >>>>            fn new() -> Self { ... }
> > >>>>        }
> > >>>>
> > >>>>        unsafe impl AlwaysRefCounted for MyStruct { ... } // Implemented correctly.
> > >>>>
> > >>>>        fn evil() -> ARef<MyStruct> {
> > >>>>            let my_struct = MyStruct::new();
> > >>>>            ARef::from(&my_struct) // We return a pointer to the stack!
> > >>>>        }
> > >>>>
> > >>>> similarly, this can also be done with a `Box`:
> > >>>>
> > >>>>        fn evil2() -> ARef<MyStruct> {
> > >>>>            let my_struct = Box::new(MyStruct::new());
> > >>>>            ARef::from(&*my_struct)
> > >>>>            // Box is freed here, even just dropping the `ARef` will result in
> > >>>>            // a UAF.
> > >>>>        }  
> > >>>
> > >>> This implementation of `AlwaysRefCounted` is in violation of the
> > >>> safety requirements of the trait, namely:
> > >>>
> > >>> /// Implementers must ensure that increments to the reference count
> > >>> keep the object alive in memory
> > >>> /// at least until matching decrements are performed.
> > >>> ///
> > >>> /// Implementers must also ensure that all instances are
> > >>> reference-counted. (Otherwise they
> > >>> /// won't be able to honour the requirement that
> > >>> [`AlwaysRefCounted::inc_ref`] keep the object
> > >>> /// alive.)
> > >>>
> > >>> It boils down `MyStruct::new` in your example. It's not refcounted.
> > >>>  
> > >>>> Additionally, I think that `AlwaysRefCounted::inc_ref` should not be safe,
> > >>>> as the caller must not deallocate the memory until the refcount is zero.  
> > >>>
> > >>> The existence of an `&T` is evidence that the refcount is non-zero, so
> > >>> it is safe to increment it. The caller cannot free the object without
> > >>> violating the safety requirements.
> > >>>  
> > >>>> Another pitfall of `ARef`: it does not deallocate the memory when the
> > >>>> refcount reaches zero. People might expect that this code would not leak
> > >>>> memory:
> > >>>>
> > >>>>        let foo = Box::try_new(Foo::new())?;
> > >>>>        let foo = Box::leak(foo); // Leak the box, such that we do not
> > >>>>                                  // deallocate the memory too early.
> > >>>>        let foo = ARef::from(foo);
> > >>>>        drop(foo); // refcount is now zero, but the memory is never deallocated.  
> > >>>
> > >>> This is also in violation of the safety requirements of `AlwaysRefCounted`.  
> > >>
> > >> It seems I have misunderstood the term "always reference counted".
> > >> We should document this in more detail, since this places a lot of
> > >> constraints on the implementers:
> > >>
> > >>       Implementing `AlwaysRefCounted` for `T` places the following constraint on shared references `&T`:
> > >>       - Every `&T` points to memory that is not deallocated until the reference count reaches zero.
> > >>       - The existence of `&T` proves that the reference count is at least 1.  
> > >
> > > This is implied by the existing safety rules.  
> > 
> > This was not obvious to me at all, I think we should extend the docs and
> > make this very clear.
> > 
> > >>       This has some important consequences:
> > >>       - Exposing safe a way to get `T` is not allowed, since stack allocations are freed when the scope
> > >>         ends even though the reference count is non-zero.  
> > >
> > > Stack allocations are ok, as long as they wait for the refcount to
> > > drop to zero before the variable goes out of scope.  
> > 
> > "Exposing a **safe** way", you cannot under any circumstances allow safe
> > code access to by value `T` when it implements `AlwaysRefCounted`, since
> > safe code can create a `&T` without upholding the invariants.
> > 
> > In your controlled function, you can create `T`s on the stack if you want,
> > but giving them out to safe code is the problem.
> > 
> > >>       - Similarly giving safe access to `Box<T>` or other smart pointers is not allowed, since a `Box` can
> > >>         be freed independent from the reference count.  
> > >
> > > `ARef<T>` is a smart pointer and it is definitely allowed.  
> > 
> > Yes, I meant smart pointers except `ARef`. E.g. putting `T` inside of an
> > `Arc` has the same problem as `Box<T>`.
> > 
> > >
> > > Similarly to stack allocations I mention above, a `Box<T>`
> > > implementation is conceivable as long as it ensures that the
> > > allocation is freed only once the refcount reaches zero, for example,
> > > by having a drop implementation that performs such a wait. (IOW, when
> > > `Box<T>` goes out of scope, it always calls `drop` on `T` before
> > > actually freeing the memory, so this implementation could block until
> > > it is safe to do so, i.e., until the refcount reaches zero.)  
> > 
> > Is this something you want to do? Because to me this sounds like something
> > that could end up deadlocking very easily.
> > 
> > AIUI `AlwaysRefCounted` is intended for wrapper structs that are never
> > created on the Rust side. They are created and destroyed by C.
> 
> No.
> 
> It's perfectly legal for Rust code to implement this trait, and it
> might even be desirable in some cases, because it gives more control
> than relying on `Arc` and `Drop`.
> 
> For example, if a type is usable in RCU, then you might want to have
> some code like this:
> 
> impl AlwaysRefCounted for MyType {
>     fn inc_ref(&self) {
>         self.refcnt.fetch_add(1, Ordering::Relaxed);
>     }
> 
>     fn dec_ref(this: NonNull<Self>) {
>        let refcnt = this.as_ref().refcnt.fetch_sub(1, Ordering::Relaxed) - 1;
>        if refcnt == 0 {
>            // Unpublish the pointer from some RCU data structure
>            synchronize_rcu();
>            // proceed to drop the type and box
>        }
>     }
> }
> 
> The example given above can't rely on dtor because `drop` takes a
> mutable reference.
> 

How would you implement the `drop` of `MyType`? Will there be a
synchronize_rcu() there?

My understanding of Benno's point is that you won't never implement a
safe function that directly return `MyType` (maybe `ARef<MyType>`,
RCU<MyType>`, but not the type directly).

Regards,
Boqun

> > The scenario
> > of putting them into a `Box` or `Arc` should never happen, since Rust does
> > not have a way to create them.
> > 
> > As soon as this is not the only use case for this trait, I feel like this
> > trait becomes very dangerous, since there are many different ways for you
> > to mess up via normal coding patterns.
> > 
> > Hence I think it is better to spell out the dangerous patterns here and
> > forbid them, since the only use case should never use them anyway.
> > 
> > Also in the `Box` case, the same problem as with `&mut T` exists, since
> > you can derive a `&mut T` from it.
> > 
> > >  
> > >>       This type is intended to be implemented for C types that embedd a `refcount_t` and that are both
> > >>       created and destroyed by C. Static references also work with this type, since they stay live
> > >>       indefinitely.  
> > >
> > > Embedding a `refcount_t` is not a requirement. I already mention in
> > > the documentation that this is usually used for C structs and that
> > > Rust code should use `Arc`.  
> > 
> > I would prefer if we change the wording in the docs from `usually` to
> > `only`. If you agree with my interpretation above, then Rust types should
> > not implement this trait.
> > 
> > >  
> > >>       Implementers must also ensure that they never give out `&mut T`, since
> > >>       - it can be reborrowed as `&T`,
> > >>       - converted to `ARef<T>`,
> > >>       - which can yield a `&T` that is alive at the same time as the `&mut T`.  
> > >
> > > I agree with this. And I don't think this is a direct consequence of
> > > the safety requirements, so I think it makes sense to add something
> > > that covers this.
> > >  
> > >>>>> +
> > >>>>> +impl<T: AlwaysRefCounted> Drop for ARef<T> {
> > >>>>> +    fn drop(&mut self) {
> > >>>>> +        // SAFETY: The type invariants guarantee that the `ARef` owns the reference we're about to
> > >>>>> +        // decrement.
> > >>>>> +        unsafe { T::dec_ref(self.ptr) };
> > >>>>> +    }
> > >>>>> +}
> > >>>>> +
> > >>>>>     /// A sum type that always holds either a value of type `L` or `R`.
> > >>>>>     pub enum Either<L, R> {
> > >>>>>         /// Constructs an instance of [`Either`] containing a value of type `L`.
> > >>>>> --
> > >>>>> 2.34.1
> > >>>>>  
> > >>>>  
> > >>  
> > 
> 

  reply	other threads:[~2023-04-14 17:03 UTC|newest]

Thread overview: 43+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-04-11  5:45 [PATCH v4 01/13] rust: sync: introduce `LockClassKey` Wedson Almeida Filho
2023-04-11  5:45 ` [PATCH v4 02/13] rust: sync: introduce `Lock` and `Guard` Wedson Almeida Filho
2023-04-11 20:42   ` Gary Guo
2023-04-12 11:38     ` Wedson Almeida Filho
2023-04-14 12:02       ` Alice Ryhl
2023-04-13  8:46   ` Benno Lossin
2023-04-11  5:45 ` [PATCH v4 03/13] rust: lock: introduce `Mutex` Wedson Almeida Filho
2023-04-13  8:56   ` Benno Lossin
2023-04-11  5:45 ` [PATCH v4 04/13] locking/spinlock: introduce spin_lock_init_with_key Wedson Almeida Filho
2023-04-11 18:05   ` Wedson Almeida Filho
2023-04-12 19:14     ` Boqun Feng
2023-04-11  5:45 ` [PATCH v4 05/13] rust: lock: introduce `SpinLock` Wedson Almeida Filho
2023-04-11  5:45 ` [PATCH v4 06/13] rust: lock: add support for `Lock::lock_irqsave` Wedson Almeida Filho
2023-04-11  5:45 ` [PATCH v4 07/13] rust: lock: implement `IrqSaveBackend` for `SpinLock` Wedson Almeida Filho
2023-04-11  5:45 ` [PATCH v4 08/13] rust: introduce `ARef` Wedson Almeida Filho
2023-04-11 20:45   ` Gary Guo
2023-04-13  9:19   ` Benno Lossin
2023-04-13 17:06     ` Wedson Almeida Filho
2023-04-13 22:29       ` Benno Lossin
2023-04-14  9:00         ` Wedson Almeida Filho
2023-04-14  9:46           ` Benno Lossin
2023-04-14 14:38             ` Gary Guo
2023-04-14 17:03               ` Boqun Feng [this message]
2023-04-11  5:45 ` [PATCH v4 09/13] rust: add basic `Task` Wedson Almeida Filho
2023-04-11 20:47   ` Gary Guo
2023-04-12 11:42     ` Wedson Almeida Filho
2023-04-11  5:45 ` [PATCH v4 10/13] rust: introduce `current` Wedson Almeida Filho
2023-04-11  5:45 ` [PATCH v4 11/13] rust: lock: add `Guard::do_unlocked` Wedson Almeida Filho
2023-04-11 20:54   ` Gary Guo
2023-04-12 11:16     ` Wedson Almeida Filho
2023-04-11 21:17   ` Boqun Feng
2023-04-12 11:09     ` Wedson Almeida Filho
2023-04-12  6:25   ` Boqun Feng
2023-04-12 11:07     ` Wedson Almeida Filho
2023-04-12 14:35       ` Boqun Feng
2023-04-12 17:41         ` Wedson Almeida Filho
2023-04-11  5:45 ` [PATCH v4 12/13] rust: sync: introduce `CondVar` Wedson Almeida Filho
2023-04-14 11:55   ` Alice Ryhl
2023-04-11  5:45 ` [PATCH v4 13/13] rust: sync: introduce `LockedBy` Wedson Almeida Filho
2023-04-13  9:45   ` Benno Lossin
2023-04-11 20:35 ` [PATCH v4 01/13] rust: sync: introduce `LockClassKey` Gary Guo
2023-04-13  8:02 ` Benno Lossin
2023-04-21 23:48 ` Miguel Ojeda

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZDmHTIP5dEkXPOyX@Boquns-Mac-mini.local \
    --to=boqun.feng@gmail.com \
    --cc=alex.gaynor@gmail.com \
    --cc=benno.lossin@proton.me \
    --cc=bjorn3_gh@protonmail.com \
    --cc=gary@garyguo.net \
    --cc=linux-kernel@vger.kernel.org \
    --cc=ojeda@kernel.org \
    --cc=rust-for-linux@vger.kernel.org \
    --cc=walmeida@microsoft.com \
    --cc=wedsonaf@gmail.com \
    --cc=yakoyoku@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).