All of lore.kernel.org
 help / color / mirror / Atom feed
From: Florent Revest <revest@chromium.org>
To: Yonghong Song <yhs@fb.com>
Cc: bpf <bpf@vger.kernel.org>, Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Andrii Nakryiko <andrii@kernel.org>,
	KP Singh <kpsingh@kernel.org>,
	Brendan Jackman <jackmanb@chromium.org>,
	open list <linux-kernel@vger.kernel.org>
Subject: Re: [BUG] One-liner array initialization with two pointers in BPF results in NULLs
Date: Wed, 10 Mar 2021 12:48:45 +0100	[thread overview]
Message-ID: <CABRcYmK8m21sb8dHbr1wLT_oTCBpvr2Zg-8KHwKuJ2Ak0iTZ_A@mail.gmail.com> (raw)
In-Reply-To: <eb0a8485-9624-1727-6913-e4520c9d8c04@fb.com>

On Wed, Mar 10, 2021 at 6:16 AM Yonghong Song <yhs@fb.com> wrote:
> On 3/9/21 7:43 PM, Yonghong Song wrote:
> > On 3/9/21 5:54 PM, Florent Revest wrote:
> >> I noticed that initializing an array of pointers using this syntax:
> >> __u64 array[] = { (__u64)&var1, (__u64)&var2 };
> >> (which is a fairly common operation with macros such as BPF_SEQ_PRINTF)
> >> always results in array[0] and array[1] being NULL.
> >>
> >> Interestingly, if the array is only initialized with one pointer, ex:
> >> __u64 array[] = { (__u64)&var1 };
> >> Then array[0] will not be NULL.
> >>
> >> Or if the array is initialized field by field, ex:
> >> __u64 array[2];
> >> array[0] = (__u64)&var1;
> >> array[1] = (__u64)&var2;
> >> Then array[0] and array[1] will not be NULL either.
> >>
> >> I'm assuming that this should have something to do with relocations
> >> and might be a bug in clang or in libbpf but because I don't know much
> >> about these, I thought that reporting could be a good first step. :)
> >
> > Thanks for reporting. What you guess is correct, this is due to
> > relocations :-(
> >
> > The compiler notoriously tend to put complex initial values into
> > rodata section. For example, for
> >     __u64 array[] = { (__u64)&var1, (__u64)&var2 };
> > the compiler will put
> >     { (__u64)&var1, (__u64)&var2 }
> > into rodata section.
> >
> > But &var1 and &var2 themselves need relocation since they are
> > address of static variables which will sit inside .data section.
> >
> > So in the elf file, you will see the following relocations:
> >
> > RELOCATION RECORDS FOR [.rodata]:
> > OFFSET           TYPE                     VALUE
> > 0000000000000018 R_BPF_64_64              .data
> > 0000000000000020 R_BPF_64_64              .data

Right :) Thank you for the explanations Yonghong!

> > Currently, libbpf does not handle relocation inside .rodata
> > section, so they content remains 0.

Just for my own edification, why is .rodata relocation not yet handled
in libbpf ? Is it because of a read-only mapping that makes it more
difficult ?

> > That is why you see the issue with pointer as NULL.
> >
> > With array size of 1, compiler does not bother to put it into
> > rodata section.
> >
> > I *guess* that it works in the macro due to some kind of heuristics,
> > e.g., nested blocks, etc, and llvm did not promote the array init value
> > to rodata. I will double check whether llvm can complete prevent
> > such transformation.
> >
> > Maybe in the future libbpf is able to handle relocations for
> > rodata section too. But for the time being, please just consider to use
> > either macro, or the explicit array assignment.
>
> Digging into the compiler, the compiler tries to make *const* initial
> value into rodata section if the initial value size > 64, so in
> this case, macro does not work either. I think this is how you
> discovered the issue.

Indeed, I was using a macro similar to BPF_SEQ_PRINTF and this is how
I found the bug.

> The llvm does not provide target hooks to
> influence this transformation.

Oh, that is unfortunate :) Thanks for looking into it! I feel that the
real fix would be in libbpf anyway and the rest is just workarounds.

> So, there are two workarounds,
> (1).    __u64 param_working[2];
>          param_working[0] = (__u64)str1;
>          param_working[1] = (__u64)str2;
> (2). BPF_SEQ_PRINTF(seq, "%s ", str1);
>       BPF_SEQ_PRINTF(seq, "%s", str2);

(2) is a bit impractical for my actual usecase. I am implementing a
bpf_snprintf helper (patch series Coming Soon TM) and I wanted to keep
the selftest short with a few BPF_SNPRINTF() calls that exercise most
format specifiers.

> In practice, if you have at least one non-const format argument,
> you should be fine. But if all format arguments are constant, then
> none of them should be strings.

Just for context, this does not only happen for strings but also for
all sorts of pointers, for example, when I try to do address lookup of
global __ksym variables, which is important for my selftest.

> Maybe we could change marco
>     unsigned long long ___param[] = { args };
> to declare an array explicitly and then have a loop to
> assign each array element?

I think this would be a good workaround for now, indeed. :) I'll look
into it today and send it as part of my bpf_snprintf series.

Thanks!

  reply	other threads:[~2021-03-10 11:49 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-03-10  1:54 [BUG] One-liner array initialization with two pointers in BPF results in NULLs Florent Revest
2021-03-10  3:43 ` Yonghong Song
2021-03-10  5:16   ` Yonghong Song
2021-03-10 11:48     ` Florent Revest [this message]
2021-03-10 16:59       ` Yonghong Song
2021-03-10 20:12         ` Andrii Nakryiko
2021-03-10 21:51           ` Andrii Nakryiko
2021-03-10 22:07             ` Florent Revest

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CABRcYmK8m21sb8dHbr1wLT_oTCBpvr2Zg-8KHwKuJ2Ak0iTZ_A@mail.gmail.com \
    --to=revest@chromium.org \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=jackmanb@chromium.org \
    --cc=kpsingh@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=yhs@fb.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.