Date: Fri, 23 Jul 2021 04:13:04 +0300
From: Laurent Pinchart
To: Wedson Almeida Filho
Cc: Vegard Nossum, Linus Walleij, Miguel Ojeda, Greg KH, Bartosz Golaszewski, Kees Cook, Jan Kara, James Bottomley, Julia Lawall, Roland Dreier, ksummit@lists.linux.dev, Viresh Kumar
Subject: Re: [TECH TOPIC] Rust for Linux
X-Mailing-List: ksummit@lists.linux.dev

Hi Wedson,

On Wed, Jul 21, 2021 at 05:23:43AM +0100, Wedson Almeida Filho wrote:
> On Wed, Jul 21, 2021 at 02:54:24AM +0300, Laurent Pinchart wrote:
> > On Mon, Jul 19, 2021 at 10:09:46PM +0100, Wedson Almeida Filho wrote:
> > > On Mon, Jul 19, 2021 at 10:37:52PM +0300, Laurent Pinchart wrote:
> > > > On Mon, Jul 19, 2021 at 07:06:36PM +0100, Wedson Almeida Filho wrote:
> > > > > On Mon, Jul 19, 2021 at 06:02:06PM +0200, Vegard Nossum wrote:
> > > > > > On 7/19/21 3:15 PM, Wedson Almeida Filho wrote:
> > > > > > > On Mon, Jul 19, 2021 at 01:24:49PM +0100, Wedson Almeida Filho wrote:
> > > > > > >> On Fri, Jul 09, 2021 at 12:13:25AM +0200, Linus Walleij wrote:
> > > > > > >>> I have seen that QEMU has a piece of code for the Arm PrimeCell
> > > > > > >>> PL061 GPIO block which corresponds to drivers/gpio/gpio-pl061.c
> > > > > > >>> Note that this hardware apart from being used in all Arm reference
> > > > > > >>> designs is used on ARMv4T systems that are not supported by
> > > > > > >>> LLVM but only GCC, which might complicate things.
> > > > > > >>
> > > > > > >> Here is a working PL061 driver in Rust (converted from the C one):
> > > > > > >> https://raw.githubusercontent.com/wedsonaf/linux/pl061/drivers/gpio/gpio_pl061_rust.rs
> > > > > > >
> > > > > > > I'm also attaching an html rendering of the C and Rust versions side by side where
> > > > > > > I try to line the definitions up to make it easier to contrast the two
> > > > > > > implementations.
> > > > > >
> > > > > > This is really cool :-) As a Rust noob, I have a few questions:
> > > > > >
> > > > > > 1. I'm curious about some of the writeb() vs. try_writeb() calls:
> > > > > >
> > > > > > fn direction_output(data: &Ref, offset: u32, value: bool) -> Result {
> > > > > >     let woffset = bit(offset + 2).into();
> > > > > >     let _guard = data.lock();
> > > > > >     let pl061 = data.resources().ok_or(Error::ENXIO)?;
> > > > > >     pl061.base.try_writeb((value as u8) << offset, woffset)?;
> > > > > >     let mut gpiodir = pl061.base.readb(GPIODIR);
> > > > > >     gpiodir |= bit(offset);
> > > > > >     pl061.base.writeb(gpiodir, GPIODIR);
> > > > > >
> > > > > >     // gpio value is set again, because pl061 doesn't allow to set value of a gpio pin before
> > > > > >     // configuring it in OUT mode.
> > > > > >     pl061.base.try_writeb((value as u8) << offset, woffset)?;
> > > > > >     Ok(())
> > > > > > }
> > > > > >
> > > > > > Here you have try_writeb() (and error return) where there was just a
> > > > > > writeb() without any error handling in the C version. Is this what
> > > > > > Miguel was answering a bit down the thread where the address is computed
> > > > > > ((value as u8) << offset) so it _needs_ to use the try_() version?
> > > > >
> > > > > The `writeb` variant only works when we know at compile-time that the offset is
> > > > > within bounds (the compiler will reject the code otherwise). When the value is
> > > > > computed at runtime we use a `try` version that checks before performing the
> > > > > write. We need this to guarantee memory safety.
> > > > >
> > > > > > If offset can be anything but a "correct" value here, should there be a
> > > > > > check for that somewhere else and then the computed value can be
> > > > > > subsequently treated as safe (i.e. there's a second try_writeb() in the
> > > > > > function that now presumably does the runtime check a second time,
> > > > > > redundantly)?
> > > > >
> > > > > Oh, that's a neat idea. We can certainly implement something like this:
> > > > >
> > > > >     let woffset = pl061.base.vet_offsetb(bit(offset + 2))?;
> > > > >
> > > > > Then woffset would be passed to writeb variants that are guaranteed to succeed.
> > > > > (Rust helps us ensure that woffset cannot change without checks, which would be
> > > > > harder to do in C.)
> > > > >
> > > > > > 2. In many places you have the C code:
> > > > > >
> > > > > >     struct pl061 *pl061 = dev_get_drvdata(dev);
> > > > > >
> > > > > > with the equivalent Rust code as:
> > > > > >
> > > > > >     let pl061 = data.resources().ok_or(Error::ENXIO)?;
> > > > > >
> > > > > > Why doesn't the C code need to check for errors here? Or put
> > > > > > differently, why can the Rust version fail?
> > > > >
> > > > > There are two aspects worth noting here:
> > > > > 1. In C there is a cast from void * to struct pl061 * without really knowing if
> > > > > the stored pointer is of the right type. For example, if I simply change the
> > > > > struct type to say `struct mutex` in the code above, it will still compile,
> > > > > though it will be clearly wrong. In Rust we prevent this by not exposing drvdata
> > > > > directly to drivers, and using type-specialised functions to set/get drvdata, so
> > > > > it *knows* that the type is right. So in this sense Rust is better because it
> > > > > offers type guarantees without additional runtime cost. (In Rust, if you change
> > > > > the type of the function to say `&Mutex`, it won't compile.)
> > > > >
> > > > > 2. The extra check we have here is because of a feature that the C code doesn't
> > > > > have: revocable resources. If we didn't want to have this, we could do say
> > > > > `data.base.writeb(...)` directly, but then we could have situations where `base`
> > > > > is used after the device was removed. By having these checks we guarantee that
> > > > > anyone can hold a reference to device state, but they can no longer use hw
> > > > > resources after the device is removed.
> > > >
> > > > If the driver reached a code path with an I/O write after .remove()
> > > > returns, the game is likely over already. It would be more interesting
> > > > to see how we could prevent that from happening in the first place.
> > > > Checking individual I/O writes at runtime will not only add additional
> > > > CPU costs, but will also produce code paths that are not well tested.
> > >
> > > You may be conflating checking offsets in individual writes/reads with accessing
> > > hw resources. Note that these are different things.
> >
> > Yes, it's the data.resources().ok_or() that I was talking about, not the
> > I/O writes, sorry.
> >
> > > > It
> > > > feels that we're inventing a problem just to be able to showcase the
> > > > solution :-)
> > >
> > > Thanks for taking a look. I beg to differ though, as this solves (on the Rust
> > > side) a problem you described the other day on this very thread. The solution is
> > > different from what you propose though :)
> > >
> > > - The internal data structures of drivers are refcounted. Drivers then share
> > > this internal representation with other subsystems (e.g., cdev).
> >
> > Refcounting the driver-specific structure is good, that matches what I
> > proposed (it's of course implemented differently in C and rust, but
> > that's expected).
>
> The refcounting business is indeed different at the moment because it's easier
> to implement it this way, but it doesn't have to be. (The `Ref` struct we use in
> Rust is actually based on the kernel's `refcount_t`.)
>
> > > - On `remove`, the registrations with other subsystems are removed (so no
> > > additional sharing of internal data should happen), but existing calls and
> > > references to internal data structures continue to exist. This part is
> > > important: we don't "revoke" the references, but we do revoke the hw resources
> > > part of the internal state.
> >
> > No issue here either. The handling of the internal data structure (the
> > "non-revoke" part to be precise) matches my proposal too I believe.
> > Revoking the I/O memory is of course rust-specific.
> >
> > > - Attempts to access hardware resources freed during `remove` *must* be
> > > prevented, that's where the calls to `resources()` are relevant -- if a
> > > subsystem calls into the driver with one of the references it held on to, they
> > > won't be able to access the (already released) hw resources.
> >
> > That's where our opinions differ. Yes, those accesses must be prevented,
> > but I don't think the right way to do so is to check if the I/O memory
> > resource is still valid. We should instead prevent reaching driver code
> > paths that make those I/O accesses, by waiting for all calls in progress
> > to return, and preventing new calls from being made. This is a more
> > generic solution in the sense that it doesn't prevent accessing I/O
> > memory only, but avoids any operation that is not supposed to take
> > place.
>
> I like the idea of blocking functions.
>
> What led me to a resource-based approach was the following: we have to block
> .remove() until ongoing calls complete; how do we do that? Let's take the cdev
> example, if we take your approach, we may have to wait arbitrarily long for
> say read() to complete because drivers can sleep and not implement cancellation
> properly. What happens then? .remove() is stuck.

Correct.

> With a resource-approach, .remove() needs to wait on RCU only, so there are no
> arbitrarily long waits. Bugs in drivers where they take too long in their calls
> won't affect .remove().

If an operation is still in progress and you complete .remove(), all bets
are off regarding what can happen next. This is a bug in the driver, it
needs to ensure that all operations will complete in a reasonable amount
of time. That reasonable amount of time may be long, maybe the hardware
is performing DMA and can't be stopped until it completes, and maybe the
DMA will take a few seconds to complete. That would arguably be a badly
designed piece of hardware, but it can happen, and we need to ensure
that the DMA completes before we shut down the device in that case.

> > My reasoning is that drivers will be written with the assumption
> > that, for instance, nobody will try to set the GPIO direction once
> > .remove() returns. Even if the direction_output() function correctly
> > checks if the I/O memory is available and returns an error if it isn't,
> > it may also contain other logic that will not work correctly after
> > .remove() as the developer will not have considered that case.
>
> I agree the extra error paths are a disadvantage of what I implemented.
>
> > This
> > incorrect logic may or may not lead to bugs, and some categories of bugs
> > may be prevented by rust (such as accessing I/O memory after .remove()),
> > but I don't think that's relevant. The subsystem, with minimal help from
> > the driver's implementation of the .remove() function if necessary,
> > should prevent operations from being called when they shouldn't, and
> > especially when the driver's author will not expect them to be called.
> > That way we'll address whole classes of issues in one go. And if we do
> > so, checking if I/O memory access has been revoked isn't required
> > anymore, as we guarantee it isn't.
> >
> > True, this won't prevent I/O memory from being accessed after .remove()
> > in other contexts, for instance in a timer handler that the driver would
> > have registered and forgotten to cancel in .remove(). And maybe the I/O
> > memory revoking mechanism runtime overhead may be a reasonable price to
> > pay for avoiding this, I don't know. I however believe that regardless
> > of whether I/O memory is revoked or not, implementing a mechanism in the
> > subsystem to avoid erroneous conditions from happening in the first place
> > is where we'll get the largest benefit with a (hopefully) reasonable
> > effort.
> >
> > > We have this problem specifically in gpio: as Linus explained, they created an
> > > indirection via a pointer which is checked in most entry points, but there is no
> > > synchronisation that guarantees that the pointer will remain valid during a
> > > call, and nothing forces uses of the pointer to be checked (so as Linus points
> > > out, they may need more checks).
> > >
> > > For Rust drivers, if the registration with other subsystems were done by doing
> > > references to driver data in Rust, this extra "protection" (that has race
> > > conditions that, timed correctly, lead to use-after-free vulnerabilities) would
> > > be obviated; all would be handled safely on the Rust side (e.g., all accesses
> > > must be checked, there is no way to get to resources without a check, and use of
> > > the resources is guarded by a guard that uses RCU read-side lock).
> > >
> > > Do you still think we don't have a problem?
> >
> > We do have a problem, we just try to address it in different ways. And
> > of course mine is better, and I don't expect you to agree with this
> > statement right away ;-) Jokes aside, this has little to do with C vs.
> > rust in this case though, it's about how to model APIs between drivers
> > and subsystems.
>
> I'm glad we agree we have a problem, that's the first step :)
>
> I also agree that the problem exists regardless of the language. I do think that
> the C side of the solution relies more on developer discipline (explicitly
> checking for certain conditions, remembering to inc/dec counts, etc.) whereas
> the compiler helps enforce such disciplines on the Rust side. IOW, it's just a
> tool to catch possible violations at compile time.

Sure, rust clearly brings more compile-time checks, nobody can claim
otherwise :-) C will always rely more on the developer getting it right.
I however believe that we can lower the chance of developers getting it
wrong by improving the APIs between subsystems and drivers, regardless
of the language. That's applicable to rust too, as the compiler won't
catch all possible errors (a programming language where a developer
wouldn't be able to get anything wrong would be an interesting concept
:-)), and good API design will always help.

-- 
Regards,

Laurent Pinchart
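
For readers following the thread, the "revocable resources" idea debated above can be illustrated with a minimal user-space sketch. This is not the kernel implementation: the thread says the real mechanism waits on RCU, whereas this sketch substitutes a standard-library RwLock. The names `Revocable`, `IoMem`, `try_access`, `revoke`, and the `Error` variants are all hypothetical stand-ins chosen for illustration; only `try_writeb` and `direction_output` echo names used in the discussion, and their signatures here are simplified assumptions.

```rust
use std::sync::atomic::{AtomicU8, Ordering};
use std::sync::RwLock;

/// Hypothetical error codes, loosely mirroring ENXIO/EINVAL from the thread.
#[derive(Debug, PartialEq)]
pub enum Error {
    Enxio,
    Einval,
}

/// Hypothetical stand-in for a mapped register window (not the kernel type).
pub struct IoMem {
    regs: Vec<AtomicU8>,
}

impl IoMem {
    pub fn new(len: usize) -> Self {
        IoMem { regs: (0..len).map(|_| AtomicU8::new(0)).collect() }
    }

    /// Runtime bounds-checked write, in the spirit of `try_writeb`.
    pub fn try_writeb(&self, value: u8, offset: usize) -> Result<(), Error> {
        let reg = self.regs.get(offset).ok_or(Error::Einval)?;
        reg.store(value, Ordering::Relaxed);
        Ok(())
    }

    /// Runtime bounds-checked read.
    pub fn try_readb(&self, offset: usize) -> Result<u8, Error> {
        let reg = self.regs.get(offset).ok_or(Error::Einval)?;
        Ok(reg.load(Ordering::Relaxed))
    }
}

/// A resource that can be taken away while references to the wrapper remain.
/// Assumption: RwLock replaces the RCU read-side lock mentioned in the thread.
pub struct Revocable<T> {
    inner: RwLock<Option<T>>,
}

impl<T> Revocable<T> {
    pub fn new(value: T) -> Self {
        Revocable { inner: RwLock::new(Some(value)) }
    }

    /// Runs `f` on the resource, or returns None if it was already revoked.
    pub fn try_access<R>(&self, f: impl FnOnce(&T) -> R) -> Option<R> {
        let guard = self.inner.read().unwrap();
        guard.as_ref().map(f)
    }

    /// Drops the resource. Taking the write lock waits for any in-flight
    /// `try_access` call to finish, then later calls observe None.
    pub fn revoke(&self) {
        let mut guard = self.inner.write().unwrap();
        *guard = None;
    }
}

/// Simplified analogue of the driver callback: fails with Enxio once the
/// hardware resource has been revoked (for example after remove()).
pub fn direction_output(res: &Revocable<IoMem>, offset: usize, value: u8) -> Result<(), Error> {
    res.try_access(|io| io.try_writeb(value, offset))
        .ok_or(Error::Enxio)?
}
```

In this sketch a caller that still holds a reference after `revoke()` gets an error instead of touching released memory, which is the property Wedson describes. It does nothing about the concern raised in the reply: the rest of the callback's logic still runs in a state the driver author may not have considered, which is why blocking new calls at the subsystem level is argued to be the more general fix.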