* notes from Toolchains MC at Linux Plumbers Conf 2021
@ 2021-09-24 17:51 Nick Desaulniers
From: Nick Desaulniers @ 2021-09-24 17:51 UTC (permalink / raw)
  To: llvm, linux-toolchains

Compiler Features for Kernel Security

-fstack-protector-guard= needs more arch support.
-fzero-call-used-regs= GCC 11+ support.
-ftrivial-auto-var-init= stack variable auto initialization; GCC 12+
support and clang. Shipped in Android + CrOS already.
Array bounds checking. Avoid 0-element arrays with
-Wzero-length-array, -Wzero-length-bounds, -Warray-bounds.

Multiple issues with -Warray-bounds on 0/1 element arrays:

"a non-default flag for only treating flexible array members that way
is ok, but there are thousands of programs in the wild that rely on
[0] or [1] or even [2] or more act as flexible array members that
doing this by default is not an option"
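
As a minimal sketch of the distinction discussed above, the C99
flexible array member below is the form that bounds checking can treat
specially, as opposed to a trailing [0] or [1] array. The names
`struct msg` and `msg_alloc` are made up for illustration and do not
come from the session.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* C99 flexible array member: the type carries no fixed bound for
 * data[], unlike a trailing data[0] or data[1] array. */
struct msg {
    size_t len;
    unsigned char data[];
};

/* Hypothetical helper: allocate a message holding len payload bytes.
 * Returns NULL on allocation failure or if the total size would wrap. */
struct msg *msg_alloc(const void *payload, size_t len)
{
    struct msg *m;

    if (len > SIZE_MAX - sizeof(*m))
        return NULL;
    m = malloc(sizeof(*m) + len);
    if (!m)
        return NULL;
    m->len = len;
    memcpy(m->data, payload, len);
    return m;
}
```

A caller would use `msg_alloc("abc", 3)` and later `free()` the result.
Code that declares the trailing member as [1] and then indexes past it
is the pattern the quote says is too common to break by default.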


Need attributes to turn off signed integer overflow. Need unsigned
overflow sanitizer in GCC; wrapping is defined but sometimes
unexpected (pointer arithmetic).
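
To illustrate "defined but sometimes unexpected", here is a small
sketch of unsigned wrapping in a size computation, next to a checked
variant built on the GCC/Clang `__builtin_mul_overflow` builtin. The
function names are assumptions for this example. It is not the
sanitizer being asked for, only a manual check for the same class of
bug.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Unsigned arithmetic wraps modulo 2^N. That is well defined, but a
 * wrapped size is rarely what the caller intended. */
size_t total_size_wrapping(size_t count, size_t elem_size)
{
    return count * elem_size;
}

/* Checked variant: returns false if the product wrapped, otherwise
 * stores the product in *out and returns true. */
bool total_size_checked(size_t count, size_t elem_size, size_t *out)
{
    return !__builtin_mul_overflow(count, elem_size, out);
}
```

For example, `total_size_wrapping(SIZE_MAX, 2)` silently yields
`SIZE_MAX - 1`, while `total_size_checked(SIZE_MAX, 2, &out)` reports
the failure.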

LTO necessary for CFI?

CET x86 and PAC ARM

does struct layout randomization work for dynamically loaded kernel
modules?  Yes via shared seed.

Randy: I want #error or #warning to be able to print values (like
#defines or calculated values)
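
For context, a common workaround today is two-level stringification
inside a `_Static_assert` message. The sketch below assumes
hypothetical macros `RING_ENTRIES` and `RING_LIMIT`. It only covers
part of the request: the message shows the macro's expanded token
text, not an evaluated result.

```c
#define STRINGIFY_(x) #x
#define STRINGIFY(x)  STRINGIFY_(x)   /* expand x first, then stringify */

#define RING_ENTRIES 256              /* hypothetical configuration value */
#define RING_LIMIT   1024             /* hypothetical upper bound */

/* If this check ever fails, the diagnostic text includes the expanded
 * values, because adjacent string literals are concatenated. */
_Static_assert(RING_ENTRIES <= RING_LIMIT,
               "RING_ENTRIES is " STRINGIFY(RING_ENTRIES)
               ", limit is " STRINGIFY(RING_LIMIT));

/* The same text is also usable at run time, e.g. for logging. */
const char *ring_config_string(void)
{
    return "RING_ENTRIES=" STRINGIFY(RING_ENTRIES);
}
```

With `#define RING_ENTRIES (1 << 8)` the message would read
"(1 << 8)" rather than "256", which is the gap the request is about.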

Optimizing Linux Kernel with BOLT

Binary Optimization and Layout Tool. Defragment/layout hot code based on
profile data that otherwise suffers high front end stalls from I$
misses. x86 ELF currently, aarch64 WIP. Can use instrumentation
profiles if LBR is unavailable.

Q: I'm wondering what the optimization passes are: your audio is very
blurred here, so if you talked about it, it was mostly lost. Movement of
basic blocks? All I heard was "basic blocks".
A: Yes.

Q: I do wonder why PGO can't do as good a job as BOLT.  It doesn't
feel like there's anything fundamental stopping it.  So what's the
major benefit against PGO?
A: Context sensitivity.

Q: is debug info rewritten?
A: Yes, v4 currently and split. v5 WIP.

The Sun Studio compiler did this too, via the -xlinkopt option.

Which symbols need to be updated? Can we add new sections? Use LOAD segments?

Q: Does BOLT inlining maintain symbol interposition in user code?
Would this be an issue for loadable modules or BPF?
A: Inlining comes from the compiler, mostly. Not sure about BPF.
Loadable modules can't do interposition. "as long as it
doesn't change EXPORT_SYMBOL it should not"

The never-ending saga of control dependencies

What are they? Weaker-than-acquire memory barriers. See
memory-barriers.txt for more examples.

Finer grain barriers than "memory" clobbers possible, just need
compiler vendors to agree on new clobber name.

(In all cases the compiler can implement this as just  "memory" so it
is easy for all compilers to implement)
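
A minimal sketch of a load-to-store control dependency, in the style
of the memory-barriers.txt examples. The `READ_ONCE`, `WRITE_ONCE` and
`barrier` macros here are simplified userspace stand-ins written for
this example, not the kernel's definitions, and the `ready`/`ack`
variables are hypothetical. The `barrier()` is the conservative full
"memory" clobber that a finer-grained clobber name would replace.

```c
/* Simplified stand-ins for the kernel macros (assumptions, GCC/Clang). */
#define READ_ONCE(x)     (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, v) (*(volatile __typeof__(x) *)&(x) = (v))
#define barrier()        __asm__ __volatile__("" ::: "memory")

int ready;   /* in the real scenario, written by another CPU */
int ack;

/* The store to ack depends on the value loaded from ready. The intent
 * is that the store is not hoisted above the load or the branch. */
int ack_if_ready(void)
{
    if (READ_ONCE(ready)) {
        barrier();              /* conservative: full "memory" clobber */
        WRITE_ONCE(ack, 1);
        return 1;
    }
    return 0;
}
```

This shows only the compiler-facing side. Whether the hardware orders
the load before the dependent store is architecture specific, as the
Q&A below discusses.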

Q: Has the advantage of using these control dependencies vs explicit
barriers been quantified somehow?
A: It is hugely hardware dependent. It can be significant on the
weaker architectures (ARM, PowerPC, maybe RISCV, ...). It might help
on stronger architectures (x86, s390, ...) due to allowing more
compiler optimizations.
This is a somewhat special example: Michael Ellerman once reported a
small speedup from switching lwsync -> isync+ctrl to implement acquire
for ppc atomics on Power8.
Q: If the best data we have is a small speedup on Power8, is doing this
worth the trouble?
A: see also the "subtle breakage may occur" point from the slides

Q: what about address+data dependency compiler support rather than
control dependencies?
A: hasn't been discussed? Some university students looking into maybe
a compiler pass to try to find issues/breakage; but too early to

subscribe/post to:

Report from the Standards Committee

maybe C++26 or C++29 for RCU/hazard pointers/asymmetric fences. RCU
has had some changes for C++ standardization. Check out cppcon Oct
20th for more info.

volatile_load<T> and volatile_store<T> WIP

Q: Can you have smart pointers that forbid delete until you synchronize_rcu()?
A: Need more info. Follow up via email. "have the smart_ptr do
call_rcu when it hits ref==0". "that sounds a lot like the old URCU

Q: the strategy to call synchronize_rcu() if allocation fails (for
call_rcu()) wouldn't work for scenarios like the one you described
yesterday where you have a nested read lock and update, would it?
A: Exactly.

Q: Can we get attributes that help with dependency ordering on non-pointers?
A: Hasn't been brought up in committee yet.

Q: what about UB discussions?
A: expect a food fight.

Objtool for arm64

objtool is a host tool used by x86 port; object file validator and
patching utility. Relies on control flow reconstruction from binary

Q: don't you want static call support on arm64?
A: idk, no retpoline. Indirect predictors work well, need analysis to
show otherwise. Ard might have patches.  It's not a reason to use

Can we start with just a binary validator?  Will folks start patching
rather than fix the tools? Failures to track control flow are objtool
bugs, not compiler bugs; we don't want to be turning off compiler
optimizations.  Can we do anything in the toolchain? Can the toolchain
generate ORC unwinding metadata?

DWARF has asynchronous unwinding, but the DWARF unwinder was removed
for being huge and complicated.

Can we get a spec for what's needed?

"In general it's not possible to reconstruct control flow from a binary."

Looking for description of problem for arm64 jump tables with
disassembly. Can DWARF provide this info?

Q: Would the same issues arise for userspace live-patching? (wrt
kernel-specific compilation flags)
A: Exception handling frames different from DWARF? "Weird stuff" in assembly.

Q: do we have an ORC spec somewhere?
A: read the code
R: Formalize a spec

The arm64 unwinder isn't precise; it relies on the PC then the LR (and
maybe the SP).  Mark Rutland has additional slides.

Rust Toolchain and the Kernel

Rust has nightly/unstable features (language extensions); some are
used by the kernel patches so far.  But we prefer to use a stable

Q: how does Rust compilation differ from C compilation?
A: Still per TU based, but a TU is a crate which can contain multiple
files. Rust doesn't have headers.

Q: How can the Linux kernel community who are interested in Rust help
accelerate the GCC implementation to ensure that Rust has the same
portability and diversity of toolchain support as the current kernel
implementation languages? Encouraging cooperation between Rust, Rustc,
Rust Foundation and GNU Toolchain. Participating in the GNU Toolchain
development and/or providing funding for developers. Etc.
A: We need certain language features, support is needed finalizing
those features.  Rust GCC and rustc_codegen_gcc are additional
Could create a shared mailing list for Rust toolchain issues or use

To elaborate on that: the Rocket project used nightly rust for a long
time, but it helped that they had a comprehensive list of Rust nightly
features they needed and why they needed them, and that helped Rust
and Rocket meet in the middle so that now Rocket runs on stable Rust.

Discussion about `unsafe`  fn and `unsafe` blocks. Potential change
for a future edition would make the body of an unsafe function not
automatically an unsafe block. Note: would be helpful to get feedback
on that issue from the Rust-in-Linux folks, because there's still some
significant debate about whether that's the right change to make.
(There's a concern that it'd generate lots of noise and churn if it
turns out the majority of unsafe function bodies to require unsafe
- It'd help to have that feedback in the upstream issue, on behalf of
the Rust-in-Linux project.

Q: How should language versioning work?
A: Flags?

Q: Do you need procedural macros? (Asking because for gccrs it will be
an interesting problem to implement them compared to "normal" macros
by example)
A: Yes, there's a crate for them. See "macros" crate.

Q: Does bindgen rely on libclang? Can it be made more robust by using
DWARF (or maybe CTF) to generate the Rust types? Or are there features
that really rely on C header parsing (which feels more fragile than
using the binary encoding in DWARF/CTF)
A: Build times may be faster to parse headers than have the compiler
dump debug info. Perhaps libabigail can help (consumes DWARF).

Q: How to call C inline functions/macros from Rust?
A: Need C wrappers. :(
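
A sketch of what such a wrapper looks like on the C side: macros and
static inline functions produce no linkable symbol, so an out-of-line
function with external linkage is written for each one. All names
below, including the `rust_helper_` prefix, are assumptions for
illustration.

```c
#include <stdint.h>

/* In a real tree these two would live in a C header. */
#define FLAG_BIT(n) (UINT32_C(1) << (n))    /* valid for n < 32 */

static inline uint32_t set_flag(uint32_t flags, unsigned int n)
{
    return flags | FLAG_BIT(n);
}

/* Out-of-line wrappers with external linkage: these are real symbols
 * that a Rust `extern "C"` declaration could link against. */
uint32_t rust_helper_flag_bit(unsigned int n)
{
    return FLAG_BIT(n);
}

uint32_t rust_helper_set_flag(uint32_t flags, unsigned int n)
{
    return set_flag(flags, n);
}
```

The cost is an extra call that the C inline version would not have,
which is part of why the answer comes with a ":(".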

Q: What ABI issues have been hit or are foreseen?
A: Still issues with bindgen (opaque types used in some places to work
around these). Avoid mixing GCC kernels with rustc, or LLVM kernels
with gccrs, etc?
- There's no compatibility or ABI issue between GCC-compiled C code
and rustc-compiled Rust code. There seems to be a widespread
perception that there's an issue there, but there's no issue; that
combination is widely tested, commonly used, and upstream rustc will
notice and fix any issues there.

Might be helpful to have Rust for kernel hackers documentation.
Discussions with Shuah about training/mentorship.  LWN articles.


~Nick Desaulniers
