bpf.vger.kernel.org archive mirror
* BTF compatibility issue across builds
@ 2022-01-27 15:10 Shung-Hsi Yu
  2022-01-31 17:36 ` Yonghong Song
  0 siblings, 1 reply; 22+ messages in thread
From: Shung-Hsi Yu @ 2022-01-27 15:10 UTC
  To: bpf, netdev, Andrii Nakryiko; +Cc: Daniel Borkmann, Alexei Starovoitov

Hi,

We recently ran into a module load failure related to split BTF on openSUSE
Tumbleweed[1], which I believe may also happen on other rolling distros.

The error looks like the following (though the failure is not limited to
ipheth):

    BPF:[103111] STRUCT BPF:size=152 vlen=2 BPF: BPF:Invalid name BPF:

    failed to validate module [ipheth] BTF: -22

The error comes down to trying to load the BTF of *kernel modules from a
different build* than the runtime kernel (built from the same source), where
the base BTF of the two builds differs.

While it may be too much of a stretch to call this a bug, solving it might
make BTF adoption easier. I'd naively think that we could further split the
base BTF into two parts to avoid this issue, where .BTF only contains exported
types, and the other part (still residing in vmlinux) holds the unexported
types.

Does that sound like something reasonable to work on?


## Root cause (in case anyone is interested in a verbose version)

On openSUSE Tumbleweed there can be several builds of the same source. Since
the source is the same, the binaries are simply replaced when a package with
a larger build number is installed during upgrade.

In our case, a rebuild was triggered[2] and resulted in changes in the base
BTF. More precisely, the BTF_KIND_FUNC{,_PROTO} of i2c_smbus_check_pec(u8 cpec,
struct i2c_msg *msg) and inet_lhash2_bucket_sk(struct inet_hashinfo *h,
struct sock *sk) were added to the base BTF of 5.15.12-1.3. Those functions
were missing from the base BTF of 5.15.12-1.1.

The added entries in the BTF type and string tables shift the type IDs and
string offsets in the base BTF. As a result, the same type ID may refer to a
totally different type, and the same name_off may point to a different string.
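
To illustrate the offset shift described above, here is a minimal toy model
in Python. It is only a sketch: the string tables below are made-up data, and
build_strtab() and str_at() are hypothetical helpers, not kernel or libbpf
APIs. It only assumes that a BTF string table is a run of NUL-terminated
strings addressed by byte offset.

```python
def build_strtab(names):
    """Pack names into a BTF-style string table.

    Returns (blob, offsets): NUL-terminated strings stored back to back,
    with the empty string at offset 0, plus each name's byte offset.
    """
    blob = b"\x00"
    offsets = {}
    for name in names:
        offsets[name] = len(blob)
        blob += name.encode() + b"\x00"
    return blob, offsets


def str_at(blob, off):
    """Read the NUL-terminated string that starts at byte offset `off`."""
    end = blob.index(b"\x00", off)
    return blob[off:end].decode()


# Toy base string tables for two builds of the same source (made-up data).
# Build #3 gained one extra name compared to build #1.
build1_names = ["sock", "inet_hashinfo", "i2c_msg"]
build3_names = ["sock", "i2c_smbus_check_pec", "inet_hashinfo", "i2c_msg"]

blob1, off1 = build_strtab(build1_names)
blob3, off3 = build_strtab(build3_names)

# A module built against build #3 stores build #3's offset for this name.
name_off = off3["inet_hashinfo"]

resolved_on_build3 = str_at(blob3, name_off)  # the intended name
resolved_on_build1 = str_at(blob1, name_off)  # lands in the middle of another string

print(repr(resolved_on_build3), repr(resolved_on_build1))
```

In this toy data the same name_off resolves to the intended name on build #3
but to an unrelated fragment on build #1. Type IDs shift in the same way when
extra types are inserted into the base.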

When users on build#1 (i.e. 5.15.12-1.1) install build#3 (i.e. 5.15.12-1.3)
and then try to load a kernel module, they load a build#3 module on a
build#1 kernel. Because the base BTF of the two builds differs, name_off of
some types ends up pointing at an invalid string, and the kernel bails out.
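
As a rough way to spot such a mismatch, one could compare the headers of the
two base BTF blobs. The sketch below assumes little-endian raw BTF data and
the struct btf_header layout from the UAPI header; parse_btf_header(),
base_layout_differs() and fake_header() are hypothetical helpers, and the
sizes used in the demo are made up. Differing sizes show the bases cannot be
interchangeable, while equal sizes do not prove that they are.

```python
import struct

BTF_MAGIC = 0xEB9F
# struct btf_header: magic (u16), version (u8), flags (u8), then
# hdr_len, type_off, type_len, str_off, str_len (u32 each).
HDR = struct.Struct("<HBBIIIII")


def parse_btf_header(data):
    """Return the type and string section sizes from raw BTF data."""
    (magic, _version, _flags, _hdr_len,
     _type_off, type_len, _str_off, str_len) = HDR.unpack_from(data)
    if magic != BTF_MAGIC:
        raise ValueError("not little-endian BTF data")
    return {"type_len": type_len, "str_len": str_len}


def base_layout_differs(base_a, base_b):
    """True if the two base BTF blobs have different section sizes."""
    return parse_btf_header(base_a) != parse_btf_header(base_b)


def fake_header(type_len, str_len):
    """Build a synthetic header for the demo (no section data follows)."""
    return HDR.pack(BTF_MAGIC, 1, 0, HDR.size, 0, type_len, type_len, str_len)


# Made-up sizes standing in for the base BTF of build #1 and build #3.
build1_base = fake_header(type_len=100, str_len=28)
build3_base = fake_header(type_len=124, str_len=48)

print(base_layout_differs(build1_base, build3_base))
```

On a real system the running kernel's base BTF can be read from
/sys/kernel/btf/vmlinux; the other blob would be the raw .BTF data of the
build the module was compiled against, however that is obtained.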


Best,
Shung-Hsi Yu

1: https://bugzilla.opensuse.org/show_bug.cgi?id=1194501
2: my guess is that the rebuild was triggered by a compiler toolchain update,
   but I wasn't able to pin down exactly what changed




Thread overview: 22+ messages
2022-01-27 15:10 BTF compatibility issue across builds Shung-Hsi Yu
2022-01-31 17:36 ` Yonghong Song
2022-02-10 10:01   ` Michal Suchánek
2022-02-10 18:17     ` Yonghong Song
2022-02-10 22:34       ` Alexei Starovoitov
2022-02-10 22:59         ` Yonghong Song
2022-02-12  5:40           ` Shung-Hsi Yu
2022-02-12  6:36             ` Yonghong Song
2022-02-15 19:38               ` Shung-Hsi Yu
2022-02-15 17:47                 ` Yonghong Song
2022-02-15 18:57                   ` Toke Høiland-Jørgensen
2022-02-20  0:28                     ` Andrii Nakryiko
2022-02-16  8:48                   ` David Laight
2022-03-02 17:46                 ` Michal Suchánek
2022-03-03  4:27                   ` Shung-Hsi Yu
2022-02-11  6:01     ` Andrii Nakryiko
2022-02-11 17:20       ` Toke Høiland-Jørgensen
2022-02-11 22:20         ` Andrii Nakryiko
2022-02-11 23:58           ` Toke Høiland-Jørgensen
2022-02-12  7:37             ` Shung-Hsi Yu
2022-02-13 15:40               ` Toke Høiland-Jørgensen
2022-02-14 20:19                 ` Michal Suchánek
