Re: Can the Kernel Concurrency Sanitizer Own Rust Code?

From: "Paul E. McKenney" <paulmck@kernel.org>
To: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
Cc: Gary Guo <gary@garyguo.net>, Marco Elver <elver@google.com>,
	Boqun Feng <boqun.feng@gmail.com>,
	kasan-dev <kasan-dev@googlegroups.com>,
	rust-for-linux <rust-for-linux@vger.kernel.org>
Subject: Re: Can the Kernel Concurrency Sanitizer Own Rust Code?
Date: Sat, 9 Oct 2021 16:48:34 -0700	[thread overview]
Message-ID: <20211009234834.GX880162@paulmck-ThinkPad-P17-Gen-1> (raw)
In-Reply-To: <CANiq72m76-nRDNAceEqUmC_k75FZj+OZr1_HSFUdksysWgCsCA@mail.gmail.com>

On Sat, Oct 09, 2021 at 06:30:10PM +0200, Miguel Ojeda wrote:
> On Sat, Oct 9, 2021 at 1:57 AM Paul E. McKenney <paulmck@kernel.org> wrote:
> >
> > But some other library could have a wild-pointer bug in unsafe Rust code
> > or in C code, correct?  And such a bug could subvert a rather wide range
> 
> Indeed, but that would require a bug somewhere in unsafe Rust code --
> safe Rust code cannot do so on its own. That is why I mentioned
> "outside safe code".

Understood.

> > of code, including that of correct libraries, right?  If I am wrong,
> > please tell me what Rust is doing to provide the additional protection.
> 
> Of course, an unsafe code bug, or C code going wild, or a compiler
> bug, or a hardware bug, or a single-event upset etc. can subvert
> everything (see the other reply).
> 
> This is why I emphasize that the guarantees Rust aims to provide are
> conditional to all that. After all, it is just a language -- there is
> no way it could make a system (including hardware) immune to that.

And understood here as well.

> > I would like to believe that, but I have seen too many cases where
> > UB propagates far and wide.  :-(
> 
> To be clear, the "effectively contain UB" above did not imply that
> Rust somehow prevents UB from breaking everything if it actually
> happens (this relates to the previous point). It means that, as a
> tool, it seems to be an effective way to write less UB-related bugs
> compared to using languages like C.
> 
> In other words, UB-related bugs can definitely still happen, but the
> idea is to reduce the amount of issues involving UB as much as
> possible via reducing the amount of code that we need to write that
> requires potentially-UB operations. So it is a matter of reducing the
> probabilities you mentioned -- but Rust alone will not make them zero
> nor guarantee no UB in an absolute manner.

And understood here, too.

> > Except that all too many compiler writers are actively looking for more
> > UB to exploit.  So this would be a difficult moving target.
> 
> If you mean it in the sense of C and C++ (i.e. where it is easy to
> trigger UB without realizing it because the optimizer may not take
> advantage of that today, but may actually take advantage of it
> tomorrow); then in safe Rust that would be a bug.
> 
> That is, such a bug may be in the compiler frontend, it may be a bug
> in LLVM, or in the language spec, or in the stdlib, or in our own
> unsafe code in the kernel, etc. But ultimately, it would be considered
> a bug.
> 
> The idea is that the safe subset of Rust does not allow you to write
> UB at all, whatever you write. So, for instance, no optimizer (whether
> today's version or tomorrow's version) will be able to break your code
> (again, assuming no bugs in the optimizer etc.).
> 
> This is in contrast with C (or unsafe Rust!), where not only we have
> the risk of compiler bugs like in safe Rust, but also all the UB
> landmines in the language itself that correct optimizers can exploit
> (assuming we agreed what is "legal" by the standard, which is a whole
> another discussion).

As long as a significant number of compiler writers evaluate themselves by
improved optimization, they will be working hard to create additional UB
opportunities.  From what you say above, their doing so has the potential
to generate bugs in the Rust compiler.  Suppose this happens ten years
from now.  Do you propose to force rework not just the compiler, but
large quantities of Rust code that might have been written by that time?

> > Let me see if I can summarize with a bit of interpretation...
> >
> > 1.      Rust modules are a pointless distraction here.  Unless you object,
> >         I will remove all mention of them from this blog series.
> 
> I agree it is best to omit them. However, it is not that Rust modules
> are irrelevant/unrelated to the safety story in Rust, but for
> newcomers to Rust, I think it is a detail that can easily mislead
> them.

Plus the connection to a Rust memory model is not all that strong.

> > 2.      Safe Rust code might have bugs, as might any other code.
> >
> >         For example, even if Linux-kernel RCU were to somehow be rewritten
> >         into Rust with no unsafe code whatsoever, there is not a verifier
> >         alive today that is going to realize that changing the value of
> >         RCU_JIFFIES_FQS_DIV from 256 to (say) 16 is a really bad idea.
> 
> Definitely: logic bugs are not prevented by safe Rust.
> 
> It may reduce the chances of logic bugs compared to C though (e.g.
> through its stricter type system etc.), but this is another topic,
> mostly unrelated to the safety/UB discussion.

The thing is that you have still not convinced me that UB is all that
separate of a category from logic bugs, especially given that either
can generate the other.

> > 3.      Correctly written unsafe Rust code defends itself (and the safe
> >         code invoking it) from misuse.  And presumably the same applies
> >         for wrappers written for C code, given that there is probably
> >         an "unsafe" lurking somewhere in such wrappers.
> 
> Yes. And definitely, calling C code is unsafe, since C code does not
> have a way to promise in its signature that it is safe.

Hence the Rust-unsafe wrappering for C code, presumably.

> > 4.      Rust's safety properties are focused more on UB in particular
> >         than on bugs in general.
> 
> Yes, safety in Rust is all about UB, not logic bugs.
> 
> This does not mean that Rust was not designed to try to minimize logic
> bugs too, of course, but that is another discussion.

This focus on UB surprises me.  Unless the goal is mainly comfort for
compiler writers looking for more UB to "optimize".  ;-)

> > And one final thing to keep in mind...  If I turn this blog series into
> > a rosy hymn to Rust, nobody is going to believe it.  ;-)
> 
> I understand :)
> 
> As a personal note: I am trying my best to give a fair assessment of
> Rust for the kernel, and trying hard to describe what Rust actually
> aims to guarantee and what not. I do not enjoy when Rust is portrayed
> as the solution to every single problem -- it does not solve all
> issues, at all. But I think it is a big enough improvement to be
> seriously considered for kernel development.

It will be interesting to see how the experiment plays out.  And to
be sure, part of my skepticism is the fact that UB is rarely (if ever)
the cause of my Linux-kernel RCU bugs.  But the other option that the
kernel uses is gcc and clang/LLVM flags to cause the compiler to define
standard-C UB, one example being signed integer overflow.

But my main official focus is of course the memory model.

						Thanx, Paul