* Writing the Apple AGX GPU driver in Rust?
@ 2022-08-11 14:03 Asahi Lina
From: Asahi Lina @ 2022-08-11 14:03 UTC (permalink / raw)
  To: rust-for-linux

Hi!

I'm starting work on a new kernel GPU driver for the Apple AGX (found in
M1 and M2 family chips). These GPUs run firmware and have fairly complex
shared memory data structures that need to be managed by the host, so
I've been leaning towards Rust for its safety, better metaprogramming,
and general expressiveness. I have a prototype driver written in Python
(running in userspace from a remote host, long story), and having a
higher-level language has been very helpful in reverse engineering the
GPU and prototyping different ideas for how the driver should work.

I realize it's the early days of Rust on Linux and this is an ambitious
challenge, but I'm willing to learn and the driver will take some time
to stabilize to the point of being upstreamable either way (in
particular the UAPI), so writing it in Rust feels like less of a gamble
at this point than it used to be, given that it sounds like Rust will be
merged in the next few kernel cycles at the latest.

I'd like to hear your thoughts about how crazy an idea this is. ^_^

I think the portion of the driver that would most benefit from Rust
would be the firmware interaction parts (managing shared memory data
structures and interacting with them), so I see several possibilities:
the whole driver could be written in Rust (which would involve writing
bindings for the render portion of the DRM subsystem); just the bulk of
the firmware interaction logic could be done in Rust, with a top-level
driver written in C that calls into the Rust abstraction layer for the
firmware; or something in between.

(For those not familiar with DRM: the display/KMS API is very extensive
and would be quite an ordeal to bind to Rust I imagine, but I think the
render/3D part of the API is significantly simpler, especially the
subset likely to be used by modern drivers. This is a render-only
device with no display; display is handled by a separate driver.)
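
Just to make the middle option a bit more concrete, I imagine the
C/Rust split could expose something roughly like this from the Rust
side (completely hypothetical names and signatures, just a sketch, not
a proposal):

    /// Opaque handle the C side would hold; the real state lives in Rust.
    #[repr(C)]
    pub struct AgxFwCtx {
        _private: [u8; 0],
    }

    /// Called from the C driver at probe time to bring up the firmware
    /// interface.
    #[no_mangle]
    pub extern "C" fn agx_fw_init() -> *mut AgxFwCtx {
        // Placeholder: the real thing would allocate and initialize the
        // firmware context (shared memory, initdata structures, ...).
        core::ptr::null_mut()
    }

    /// Called from the C driver to submit work to the firmware.
    #[no_mangle]
    pub extern "C" fn agx_fw_submit(_ctx: *mut AgxFwCtx) -> i32 {
        // Placeholder: build the versioned firmware structures and
        // queue the job here.
        0
    }

The full-Rust option would skip this C boundary entirely, at the cost
of needing the render-side DRM bindings first.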

More specifically, there are some challenges that I think would really
benefit from Rust's features. One is that we have to support multiple
firmware versions in one driver, across multiple GPUs (unlike macOS,
which ships a separate driver build for each GPU and only supports a
single, matched firmware in any given macOS version). The firmware ABI
is not stable, so every new firmware needs many scattered changes to the
data structures. Doing this in C without copying and pasting large
amounts of code would require some horrible multiple-compilation #ifdef
hacks, and really isn't pretty.

I was wondering if it could be done better in Rust, so a couple of days ago
I tried my hand at writing a proc macro to automatically generate
versioned variants of structures within a single compile, along with
versioned implementations with conditionally-compiled portions. I
prototyped it in userspace, and this is the result (along with some
example structure definitions, so far only with 2 firmware variants and
some dummy code):

https://github.com/asahilina/gpu-rust-playground
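
To illustrate the idea independently of the actual macro syntax, this
is roughly the kind of code I want a single definition to expand into,
hand-written here for two made-up firmware revisions (hypothetical
fields, not the real structures):

    /// Common interface every generated per-version variant implements.
    trait JobDescriptor {
        fn set_timestamp(&mut self, ts: u64);
    }

    /// Variant for firmware 12.3 (made-up layout).
    #[repr(C)]
    struct JobDescriptorV12_3 {
        magic: u32,
        timestamp: u64,
    }

    /// Variant for firmware 13.0, which (hypothetically) grew a flags field.
    #[repr(C)]
    struct JobDescriptorV13_0 {
        magic: u32,
        flags: u32,
        timestamp: u64,
    }

    impl JobDescriptor for JobDescriptorV12_3 {
        fn set_timestamp(&mut self, ts: u64) {
            self.timestamp = ts;
        }
    }

    impl JobDescriptor for JobDescriptorV13_0 {
        fn set_timestamp(&mut self, ts: u64) {
            self.timestamp = ts;
        }
    }

The macro's job is basically to stamp out these variants and their
impls from one annotated definition, with some fields and code blocks
gated on the firmware version.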

Do you think this kind of approach would be fit for the kernel? (Please
be gentle, I have very little Rust experience and this is my first time
writing a proc macro... I did avoid adding any external cargo deps,
since I know that's an issue in the kernel. The actual struct defs were
automatically generated from the existing Python ones, so I know some of
the naming conventions aren't very idiomatic.)

The other problem I thought Rust might be able to help with is the
actual management of GPU objects in memory. These objects have two
addresses (a CPU address and a GPU address), and it would be nice if the
type system could help ensure correctness here too. Structures can be
firmware-owned, CPU-owned, or shared, and their state may change at
certain points in time. Perhaps the type system can be coaxed into
extending the usual correctness guarantees of object ownership to this
model, for example by making sure GPU structures never contain dangling
GPU pointers to dropped data (using PhantomData, perhaps?). This also
ties into the actual memory mapping
(there are different mapping modes depending on whether data is shared
or not). I haven't started exploring this yet, so I'd love to hear any
thoughts about it!
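
As a very rough sketch of the direction I'm imagining (hypothetical
types, nothing concrete yet), lifetimes plus PhantomData could tie a
GPU address to the object that owns the memory:

    use core::marker::PhantomData;

    /// A GPU virtual address that borrows the object it points into, so
    /// the borrow checker prevents it from outliving that object.
    pub struct GpuPointer<'a, T> {
        pub gpu_addr: u64,
        _target: PhantomData<&'a T>,
    }

    /// A firmware-visible object with both a CPU mapping and a GPU address.
    pub struct GpuObject<T> {
        cpu_ptr: *mut T, // kernel-visible mapping of the backing memory
        gpu_addr: u64,   // address of the same memory in the GPU's space
    }

    impl<T> GpuObject<T> {
        /// Hand out a GPU pointer tied to `self`'s lifetime: code that
        /// tries to stash it somewhere longer-lived will not compile.
        pub fn gpu_pointer(&self) -> GpuPointer<'_, T> {
            GpuPointer {
                gpu_addr: self.gpu_addr,
                _target: PhantomData,
            }
        }
    }

That way a GpuPointer can't be stored anywhere that outlives the
GpuObject backing it, which is exactly the class of dangling-pointer
bug I'd like the compiler to catch for me.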

Beyond the GPU driver, the firmware version ABI instability issue also
affects other drivers for these platforms (display/DCP for one, but
quite possibly others too), so that macro scaffolding might be a good
incentive for other drivers for these chips to also partially or fully
move to Rust.

Please let me know your thoughts on this crazy idea! I haven't seen a
lot of ML activity lately, so if there is any better place to discuss
this (Zulip? I saw it's invite-only...) please let me know.

Thanks!!

~~ Lina
