From: Andrii Nakryiko <andrii.nakryiko@gmail.com>
To: "Michal Suchánek" <msuchanek@suse.de>,
	"Mel Gorman" <mgorman@techsingularity.net>
Cc: bpf <bpf@vger.kernel.org>, Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Andrii Nakryiko <andrii@kernel.org>,
	Martin KaFai Lau <kafai@fb.com>, Song Liu <songliubraving@fb.com>,
	Yonghong Song <yhs@fb.com>,
	John Fastabend <john.fastabend@gmail.com>,
	KP Singh <kpsingh@kernel.org>,
	Networking <netdev@vger.kernel.org>,
	open list <linux-kernel@vger.kernel.org>,
	Arnaldo Carvalho de Melo <acme@redhat.com>,
	Jiri Olsa <jolsa@kernel.org>, Hritik Vijay <hritikxx8@gmail.com>
Subject: Re: BPF: failed module verification on linux-next
Date: Mon, 24 May 2021 16:46:00 -0700
Message-ID: <CAEf4BzZ9=aLVD7ytgCcSxcbOLqFNK-p1mj14Rv_TGnOyL3aO_g@mail.gmail.com>
In-Reply-To: <CAEf4BzZ0-sihSL-UAm21JcaCCY92CqfNxycHRZYXcoj8OYb=wA@mail.gmail.com>

On Mon, May 24, 2021 at 3:58 PM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Thu, May 20, 2021 at 10:31 PM Andrii Nakryiko
> <andrii.nakryiko@gmail.com> wrote:
> >
> > On Wed, May 19, 2021 at 7:19 AM Michal Suchánek <msuchanek@suse.de> wrote:
> > >
> > > Hello,
> > >
> > > linux-next fails to boot for me:
> > >
> > > [    0.000000] Linux version 5.13.0-rc2-next-20210519-1.g3455ff8-vanilla (geeko@buildhost) (gcc (SUSE Linux) 10.3.0, GNU ld (GNU Binutils; openSUSE Tumbleweed) 2.36.1.20210326-3) #1 SMP Wed May 19 10:05:10 UTC 2021 (3455ff8)
> > > [    0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-5.13.0-rc2-next-20210519-1.g3455ff8-vanilla root=UUID=ec42c33e-a2c2-4c61-afcc-93e95278f687 plymouth.enable=0 resume=/dev/disk/by-uuid/f1fe4560-a801-4faf-a638-834c407027c7 mitigations=auto earlyprintk initcall_debug nomodeset earlycon ignore_loglevel console=ttyS0,115200
> > > ...
> > > [   26.093364] calling  tracing_set_default_clock+0x0/0x62 @ 1
> > > [   26.098937] initcall tracing_set_default_clock+0x0/0x62 returned 0 after 0 usecs
> > > [   26.106330] calling  acpi_gpio_handle_deferred_request_irqs+0x0/0x7c @ 1
> > > [   26.113033] initcall acpi_gpio_handle_deferred_request_irqs+0x0/0x7c returned 0 after 3 usecs
> > > [   26.121559] calling  clk_disable_unused+0x0/0x102 @ 1
> > > [   26.126620] initcall clk_disable_unused+0x0/0x102 returned 0 after 0 usecs
> > > [   26.133491] calling  regulator_init_complete+0x0/0x25 @ 1
> > > [   26.138890] initcall regulator_init_complete+0x0/0x25 returned 0 after 0 usecs
> > > [   26.147816] Freeing unused decrypted memory: 2036K
> > > [   26.153682] Freeing unused kernel image (initmem) memory: 2308K
> > > [   26.165776] Write protecting the kernel read-only data: 26624k
> > > [   26.173067] Freeing unused kernel image (text/rodata gap) memory: 2036K
> > > [   26.180416] Freeing unused kernel image (rodata/data gap) memory: 1184K
> > > [   26.187031] Run /init as init process
> > > [   26.190693]   with arguments:
> > > [   26.193661]     /init
> > > [   26.195933]   with environment:
> > > [   26.199079]     HOME=/
> > > [   26.201444]     TERM=linux
> > > [   26.204152]     BOOT_IMAGE=/boot/vmlinuz-5.13.0-rc2-next-20210519-1.g3455ff8-vanilla
> > > [   26.254154] BPF:      type_id=35503 offset=178440 size=4
> > > [   26.259125] BPF:
> > > [   26.261054] BPF:Invalid offset
> > > [   26.264119] BPF:
> >
> > It took me a while to reliably bisect this, but it clearly points to
> > this commit:
> >
> > e481fac7d80b ("mm/page_alloc: convert per-cpu list protection to local_lock")
> >
> > One commit before it, 676535512684 ("mm/page_alloc: split per cpu page
> > lists and zone stats -fix"), works just fine.
> >
> > I'll have to spend more time debugging what exactly is happening, but
> > the immediate problem is two different definitions of the numa_node
> > per-CPU variable. Both are at the same offset within the
> > .data..percpu ELF section and both have the same name, but one is
> > marked as static and the other as global. One is an int variable,
> > while the other is a struct pagesets. I'll look some more
> > tomorrow, but adding Jiri and Arnaldo for visibility.
> >
> > [110907] DATASEC '.data..percpu' size=178904 vlen=303
> > ...
> >         type_id=27753 offset=163976 size=4 (VAR 'numa_node')
> >         type_id=27754 offset=163976 size=4 (VAR 'numa_node')
> >
> > [27753] VAR 'numa_node' type_id=27556, linkage=static
> > [27754] VAR 'numa_node' type_id=20, linkage=global
> >
> > [20] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED
> >
> > [27556] STRUCT 'pagesets' size=0 vlen=1
> >         'lock' type_id=507 bits_offset=0
> >
> > [506] STRUCT '(anon)' size=0 vlen=0
> > [507] TYPEDEF 'local_lock_t' type_id=506
> >
> > So there is also something weird about those zero-sized struct
> > pagesets and the local_lock_t inside it.
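
A minimal sketch of why two entries at one offset are a problem, assuming a
consumer that expects DATASEC entries to be sorted by offset and
non-overlapping. The names below (demo_var_secinfo, demo_first_overlap) are
made up for illustration and this is not the kernel's verifier code; the
sample values in the note are the two numa_node entries from the dump above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one DATASEC entry: type id, offset, size. */
struct demo_var_secinfo {
	uint32_t type_id;
	uint32_t offset;
	uint32_t size;
};

/*
 * Return the index of the first entry that starts before the previous
 * entry ends, or -1 if the entries are ordered and do not overlap.
 */
int demo_first_overlap(const struct demo_var_secinfo *v, size_t n)
{
	uint32_t prev_end = 0;
	size_t i;

	assert(v != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if (v[i].offset < prev_end)
			return (int)i;
		prev_end = v[i].offset + v[i].size;
	}
	return -1;
}
```

Fed the entries {27753, 163976, 4} and {27754, 163976, 4}, this returns 1:
the second entry starts at 163976 while the first one ends at 163980.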
>
> Ok, so nothing weird about them. local_lock_t is designed to be
> zero-sized unless CONFIG_DEBUG_LOCK_ALLOC is defined.
>
> But such zero-sized per-CPU variables are confusing pahole during BTF
> generation, as now two different variables "occupy" the same address.
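
To illustrate how a zero-sized variable ends up sharing an address with its
neighbour, here is a toy layout model. It is only a sketch under stated
assumptions: variables are placed back to back, alignment and padding are
ignored, and demo_var and demo_assign_offsets are hypothetical names, not
anything from the linker or pahole.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one variable placed in a section. */
struct demo_var {
	const char *name;
	uint32_t size;    /* input: size in bytes, may be 0 */
	uint32_t offset;  /* output: offset within the section */
};

/*
 * Lay variables out back to back starting at 'base'.
 * Alignment and padding are deliberately ignored in this sketch.
 */
void demo_assign_offsets(struct demo_var *vars, size_t n, uint32_t base)
{
	uint32_t cur = base;
	size_t i;

	assert(vars != NULL || n == 0);
	for (i = 0; i < n; i++) {
		vars[i].offset = cur;
		/* A zero-sized variable does not advance 'cur'. */
		cur += vars[i].size;
	}
}
```

With a zero-sized "pagesets" followed by a 4-byte "numa_node" and base
163976, both variables get offset 163976 in this model, which matches the
duplicate offsets seen in the DATASEC dump.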

FWIW, here's the pahole fix [0] (pahole already tried to filter out
zero-sized per-CPU vars, but not quite completely).

  [0] https://lore.kernel.org/bpf/20210524234222.278676-1-andrii@kernel.org/T/#u

>
> Given this seems to be the first zero-sized per-CPU variable, I wonder
> if it would be ok to make sure it's never zero-sized, while pahole
> gets fixed and its latest version gets widely packaged and
> distributed.
>
> Mel, what do you think about something like below? Or maybe you can
> advise some better solution?
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 41b87d6f840c..6a1d7511cae9 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -124,6 +124,13 @@ static DEFINE_MUTEX(pcp_batch_high_lock);
>
>  struct pagesets {
>      local_lock_t lock;
> +#if defined(CONFIG_DEBUG_INFO_BTF) && !defined(CONFIG_DEBUG_LOCK_ALLOC)
> +    /* pahole 1.21 and earlier gets confused by zero-sized per-CPU
> +     * variables and produces invalid BTF. So to accommodate earlier
> +     * versions of pahole, ensure that sizeof(struct pagesets) is never 0.
> +     */
> +    char __filler;
> +#endif
>  };
>  static DEFINE_PER_CPU(struct pagesets, pagesets) = {
>      .lock = INIT_LOCAL_LOCK(lock),
>
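
A user-space sketch of the effect of the filler member, assuming GNU C
(gcc or clang), where an empty struct is accepted as an extension and has
size 0. The DEMO_* macros and demo_* types are stand-ins invented for this
example, not the kernel's config symbols or definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for the kernel config symbols; names are made up for this demo. */
#define DEMO_DEBUG_INFO_BTF 1
/* DEMO_DEBUG_LOCK_ALLOC is intentionally left undefined. */

/* Empty struct: a GNU C extension, sizeof is 0 with gcc and clang. */
typedef struct { } demo_local_lock_t;

struct demo_pagesets {
	demo_local_lock_t lock;
#if defined(DEMO_DEBUG_INFO_BTF) && !defined(DEMO_DEBUG_LOCK_ALLOC)
	char filler; /* keeps the struct from being zero-sized */
#endif
};

/* Compile-time guard: the build fails if the struct ever becomes empty. */
static_assert(sizeof(struct demo_pagesets) > 0,
	      "struct demo_pagesets must not be zero-sized");

/* Helper so callers can inspect the size at run time. */
size_t demo_pagesets_size(void)
{
	return sizeof(struct demo_pagesets);
}
```

Removing the DEMO_DEBUG_INFO_BTF define drops the filler, and the
static_assert then fails at build time, which is the kind of guard that
would catch a zero-sized per-CPU struct early.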
> >
> > > [   26.264119]
> > > [   26.267437] failed to validate module [efivarfs] BTF: -22
> > > [   26.316724] systemd[1]: systemd 246.13+suse.105.g14581e0120 running in system mode. (+PAM +AUDIT +SELINUX -IMA +APPARMOR -SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +ZSTD +SECCOMP +BLKID +ELFUTILS +KMOD +IDN2 -IDN +PCRE2 default-hierarchy=unified)
> > > [   26.357990] systemd[1]: Detected architecture x86-64.
> > > [   26.363068] systemd[1]: Running in initial RAM disk.
> > >
> >
> > [...]

Thread overview: 9+ messages
2021-05-19 14:19 BPF: failed module verification on linux-next Michal Suchánek
2021-05-21  5:31 ` Andrii Nakryiko
2021-05-22 19:42   ` Hritik Vijay
2021-05-23 23:45   ` Hritik Vijay
2021-05-24 22:58   ` Andrii Nakryiko
2021-05-24 23:46     ` Andrii Nakryiko [this message]
2021-05-25 13:51     ` Mel Gorman
2021-05-25 19:26       ` Andrii Nakryiko
2021-05-26  8:15         ` Mel Gorman
