All of lore.kernel.org
 help / color / mirror / Atom feed
From: Matthew Maurer <mmaurer@google.com>
To: Greg KH <gregkh@linuxfoundation.org>
Cc: Masahiro Yamada <masahiroy@kernel.org>,
	Nick Desaulniers <ndesaulniers@google.com>,
	 Miguel Ojeda <ojeda@kernel.org>, Gary Guo <gary@garyguo.net>,
	 Luis Chamberlain <mcgrof@kernel.org>,
	Nathan Chancellor <nathan@kernel.org>,
	 Nicolas Schier <nicolas@fjasle.eu>,
	linuxppc-dev@lists.ozlabs.org,  linux-kernel@vger.kernel.org,
	linux-modules@vger.kernel.org,  linux-kbuild@vger.kernel.org,
	rust-for-linux@vger.kernel.org,
	 Laura Abbott <laura@labbott.name>
Subject: Re: [PATCH v2 0/5] MODVERSIONS + RUST Redux
Date: Mon, 27 Nov 2023 11:27:07 -0800	[thread overview]
Message-ID: <CAGSQo005hRiUZdeppCifDqG9zFDJRwahpBLE4x7-MyfJscn7tQ@mail.gmail.com> (raw)
In-Reply-To: <2023112312-certified-substance-007c@gregkh>

> > >
> > > > With regards to future directions that likely won't work for loosening it:
> > > > Unfortunately, the .rmeta format itself is not stable, so I wouldn't want to
> > > > teach genksyms to open it up and split out the pieces for specific functions.
> > > > Extending genksyms to parse Rust would also not solve the situation -
> > > > layouts are allowed to differ across compiler versions or even (in rare
> > > > cases) seemingly unrelated code changes.
> > >
> > > What do you mean by "layout" here?  Yes, the crcs can be different
> > > across compiler versions and seemingly unrelated code changes (genksyms
> > > is VERY fragile) but that's ok, that's not what you are checking here.
> > > You want to know if the rust function signature changes or not from the
> > > last time you built the code, with the same compiler and options, that's
> > > all you are verifying.
What I mean by layout here is that if you write in Rust:
struct Foo {
  x: i32,
  y: i32,
}
it is not guaranteed to have the same layout across different compilations, even
within the same compiler. See
https://doc.rust-lang.org/reference/type-layout.html#the-rust-representation
Specifically, the compiler is allowed to arbitrarily insert padding,
reorder fields, etc.
on the same code as long as the overall alignment of the struct and individual
alignment of the fields remains correct and non-overlapping.

This means the compiler is *explicitly* allowed to, for example, permute x and y
as an optimization. In the above example this is unlikely, but if you
instead consider
struct Bar {
  x: i8,
  y: i64,
  z: i8,
}
It's easy to see why the compiler might decide to structure this as
y,x,z to reduce the
size of the struct. Those optimization decisions may be affected by
any other part of
the code, PGO, etc.
> > >
> > > > Future directions that might work for loosening it:
> > > > * Generating crcs from debuginfo + compiler + flags
> > > > * Adding a feature to the rust compiler to dump this information. This
> > > > is likely to
> > > >   get pushback because Rust's current stance is that there is no ability to load
> > > >   object code built against a different library.
> > >
> > > Why not parse the function signature like we do for C?
Because the function signature is insufficient to check the ABI, see above.
> > >
> > > > Would setting up Rust symbols so that they have a crc built out of .rmeta be
> > > > sufficient for you to consider this useful? If not, can you help me understand
> > > > what level of precision would be required?
> > >
> > > What exactly does .rmeta have to do with the function signature?  That's
> > > all you care about here.
The .rmeta file contains the decisions the compiler made about layout
in the crate
you're interfacing with. For example, the choice to encode Bar
with a yxz field order would be written into the .rmeta file.
> >
> >
> >
> >
> > rmeta is generated per crate.
> >
> > CRC is computed per symbol.
> >
> > They have different granularity.
> > It is weird to refuse a module for incompatibility
> > of a symbol that it is not using at all.
>
> I agree, this should be on a per-symbol basis, so the Rust
> infrastructure in the kernel needs to be fixed up to support this
> properly, not just ignored like this patchset does.
I agree there is a divergence here, I tried to point it out so that it
wouldn't be
a surprise later. The .rmeta file itself (which is the only way we
could know that
the ABI actually matches, because layout decisions are in there) is an unstable
format, which is why I would be reluctant to try to parse it and find only the
relevant portions to hash. This isn't just a "technically unstable"
format, but one
in which the compiler essentially just serializes out relevant internal data
structures, so any parser for it will involve significant alterations
on compiler
updates, which doesn't seem like a good plan.
>
> thanks,
>
> greg k-h
Given the above additional information, would you be interested in a patchset
which either:

A. Computes the CRC off the Rust type signature, knowing the compiler is
allowed to change the ABI based on information not contained in the CRC.
B. Uses the CRC of the .rmeta file, knowing, as was pointed out, that this
effectively contains the ABI of every symbol in the compilation unit, as well
as inline functions and polymorphic functions.

If neither of these works, we likely can't turn on MODVERSIONS+RUST until
further work is done upstream in the compiler to export some of this data in
an at least semi-stable fashion.

WARNING: multiple messages have this Message-ID (diff)
From: Matthew Maurer <mmaurer@google.com>
To: Greg KH <gregkh@linuxfoundation.org>
Cc: Nicolas Schier <nicolas@fjasle.eu>,
	rust-for-linux@vger.kernel.org, linux-kbuild@vger.kernel.org,
	Masahiro Yamada <masahiroy@kernel.org>,
	Nick Desaulniers <ndesaulniers@google.com>,
	linux-kernel@vger.kernel.org,
	Nathan Chancellor <nathan@kernel.org>,
	Luis Chamberlain <mcgrof@kernel.org>, Gary Guo <gary@garyguo.net>,
	Miguel Ojeda <ojeda@kernel.org>,
	Laura Abbott <laura@labbott.name>,
	linuxppc-dev@lists.ozlabs.org, linux-modules@vger.kernel.org
Subject: Re: [PATCH v2 0/5] MODVERSIONS + RUST Redux
Date: Mon, 27 Nov 2023 11:27:07 -0800	[thread overview]
Message-ID: <CAGSQo005hRiUZdeppCifDqG9zFDJRwahpBLE4x7-MyfJscn7tQ@mail.gmail.com> (raw)
In-Reply-To: <2023112312-certified-substance-007c@gregkh>

> > >
> > > > With regards to future directions that likely won't work for loosening it:
> > > > Unfortunately, the .rmeta format itself is not stable, so I wouldn't want to
> > > > teach genksyms to open it up and split out the pieces for specific functions.
> > > > Extending genksyms to parse Rust would also not solve the situation -
> > > > layouts are allowed to differ across compiler versions or even (in rare
> > > > cases) seemingly unrelated code changes.
> > >
> > > What do you mean by "layout" here?  Yes, the crcs can be different
> > > across compiler versions and seemingly unrelated code changes (genksyms
> > > is VERY fragile) but that's ok, that's not what you are checking here.
> > > You want to know if the rust function signature changes or not from the
> > > last time you built the code, with the same compiler and options, that's
> > > all you are verifying.
What I mean by layout here is that if you write in Rust:
struct Foo {
  x: i32,
  y: i32,
}
it is not guaranteed to have the same layout across different compilations, even
within the same compiler. See
https://doc.rust-lang.org/reference/type-layout.html#the-rust-representation
Specifically, the compiler is allowed to arbitrarily insert padding,
reorder fields, etc.
on the same code as long as the overall alignment of the struct and individual
alignment of the fields remains correct and non-overlapping.

This means the compiler is *explicitly* allowed to, for example, permute x and y
as an optimization. In the above example this is unlikely, but if you
instead consider
struct Bar {
  x: i8,
  y: i64,
  z: i8,
}
It's easy to see why the compiler might decide to structure this as
y,x,z to reduce the
size of the struct. Those optimization decisions may be affected by
any other part of
the code, PGO, etc.
> > >
> > > > Future directions that might work for loosening it:
> > > > * Generating crcs from debuginfo + compiler + flags
> > > > * Adding a feature to the rust compiler to dump this information. This
> > > > is likely to
> > > >   get pushback because Rust's current stance is that there is no ability to load
> > > >   object code built against a different library.
> > >
> > > Why not parse the function signature like we do for C?
Because the function signature is insufficient to check the ABI, see above.
> > >
> > > > Would setting up Rust symbols so that they have a crc built out of .rmeta be
> > > > sufficient for you to consider this useful? If not, can you help me understand
> > > > what level of precision would be required?
> > >
> > > What exactly does .rmeta have to do with the function signature?  That's
> > > all you care about here.
The .rmeta file contains the decisions the compiler made about layout
in the crate
you're interfacing with. For example, the choice to encode Bar
with a yxz field order would be written into the .rmeta file.
> >
> >
> >
> >
> > rmeta is generated per crate.
> >
> > CRC is computed per symbol.
> >
> > They have different granularity.
> > It is weird to refuse a module for incompatibility
> > of a symbol that it is not using at all.
>
> I agree, this should be on a per-symbol basis, so the Rust
> infrastructure in the kernel needs to be fixed up to support this
> properly, not just ignored like this patchset does.
I agree there is a divergence here, I tried to point it out so that it
wouldn't be
a surprise later. The .rmeta file itself (which is the only way we
could know that
the ABI actually matches, because layout decisions are in there) is an unstable
format, which is why I would be reluctant to try to parse it and find only the
relevant portions to hash. This isn't just a "technically unstable"
format, but one
in which the compiler essentially just serializes out relevant internal data
structures, so any parser for it will involve significant alterations
on compiler
updates, which doesn't seem like a good plan.
>
> thanks,
>
> greg k-h
Given the above additional information, would you be interested in a patchset
which either:

A. Computes the CRC off the Rust type signature, knowing the compiler is
allowed to change the ABI based on information not contained in the CRC.
B. Uses the CRC of the .rmeta file, knowing, as was pointed out, that this
effectively contains the ABI of every symbol in the compilation unit, as well
as inline functions and polymorphic functions.

If neither of these works, we likely can't turn on MODVERSIONS+RUST until
further work is done upstream in the compiler to export some of this data in
an at least semi-stable fashion.

  reply	other threads:[~2023-11-27 19:27 UTC|newest]

Thread overview: 34+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-11-18  2:54 [PATCH v2 0/5] MODVERSIONS + RUST Redux Matthew Maurer
2023-11-18  2:54 ` Matthew Maurer
2023-11-18  2:54 ` [PATCH v2 1/5] export_report: Rehabilitate script Matthew Maurer
2023-11-18  2:54   ` Matthew Maurer
2023-11-18 11:35   ` Greg KH
2023-11-18 11:35     ` Greg KH
2023-11-18  2:54 ` [PATCH v2 2/5] modules: Refactor + kdoc elf_validity_cached_copy Matthew Maurer
2023-11-18  2:54   ` Matthew Maurer
2023-11-18 11:36   ` Greg KH
2023-11-18 11:36     ` Greg KH
2023-11-18  2:54 ` [PATCH v2 3/5] modpost: Extended modversion support Matthew Maurer
2023-11-18  2:54   ` Matthew Maurer
2023-11-18 13:42   ` kernel test robot
2023-11-18 13:42     ` kernel test robot
2023-11-18  2:54 ` [PATCH v2 4/5] rust: Allow MODVERSIONS Matthew Maurer
2023-11-18  2:54   ` Matthew Maurer
2023-11-18  2:54 ` [PATCH v2 5/5] export_report: Use new version info format Matthew Maurer
2023-11-18  2:54   ` Matthew Maurer
2023-11-22 15:49 ` [PATCH v2 0/5] MODVERSIONS + RUST Redux Masahiro Yamada
2023-11-22 15:49   ` Masahiro Yamada
2023-11-22 21:04   ` Matthew Maurer
2023-11-22 21:04     ` Matthew Maurer
2023-11-23  9:05     ` Greg KH
2023-11-23  9:05       ` Greg KH
2023-11-23 11:38       ` Masahiro Yamada
2023-11-23 11:38         ` Masahiro Yamada
2023-11-23 12:12         ` Greg KH
2023-11-23 12:12           ` Greg KH
2023-11-27 19:27           ` Matthew Maurer [this message]
2023-11-27 19:27             ` Matthew Maurer
2023-11-28  8:05             ` Greg KH
2023-11-28  8:05               ` Greg KH
2023-11-28  8:44               ` Greg KH
2023-11-28  8:44                 ` Greg KH

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAGSQo005hRiUZdeppCifDqG9zFDJRwahpBLE4x7-MyfJscn7tQ@mail.gmail.com \
    --to=mmaurer@google.com \
    --cc=gary@garyguo.net \
    --cc=gregkh@linuxfoundation.org \
    --cc=laura@labbott.name \
    --cc=linux-kbuild@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-modules@vger.kernel.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=masahiroy@kernel.org \
    --cc=mcgrof@kernel.org \
    --cc=nathan@kernel.org \
    --cc=ndesaulniers@google.com \
    --cc=nicolas@fjasle.eu \
    --cc=ojeda@kernel.org \
    --cc=rust-for-linux@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.