All of lore.kernel.org
 help / color / mirror / Atom feed
From: Ilya Leoshkevich <iii@linux.ibm.com>
To: Jakub Sitnicki <jakub@cloudflare.com>
Cc: bpf@vger.kernel.org, netdev@vger.kernel.org,
	Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Andrii Nakryiko <andrii@kernel.org>,
	kernel-team@cloudflare.com,
	Andrii Nakryiko <andrii.nakryiko@gmail.com>
Subject: Re: [PATCH bpf-next] selftests/bpf: Fix implementation-defined behavior in sk_lookup test
Date: Tue, 22 Feb 2022 18:42:26 +0100	[thread overview]
Message-ID: <6e28c1c3ef0eda6f041593216ac32b210e55e4b7.camel@linux.ibm.com> (raw)
In-Reply-To: <87fsoakjj6.fsf@cloudflare.com>

On Tue, 2022-02-22 at 15:53 +0100, Jakub Sitnicki wrote:
> On Tue, Feb 22, 2022 at 03:22 AM +01, Ilya Leoshkevich wrote:
> > On Tue, 2022-02-22 at 01:43 +0100, Ilya Leoshkevich wrote:
> > > On Mon, 2022-02-21 at 22:39 +0100, Ilya Leoshkevich wrote:
> > > > On Mon, 2022-02-21 at 19:03 +0100, Jakub Sitnicki wrote:
> > > > > Shifting 16-bit type by 16 bits is implementation-defined for
> > > > > BPF
> > > > > programs.
> > > > > Don't rely on it in case it is causing the test failures we
> > > > > are
> > > > > seeing on
> > > > > s390x z15 target.
> > > > > 
> > > > > Fixes: 2ed0dc5937d3 ("selftests/bpf: Cover 4-byte load from
> > > > > remote_port in bpf_sk_lookup")
> > > > > Reported-by: Andrii Nakryiko <andrii.nakryiko@gmail.com>
> > > > > Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
> > > > > ---
> > > > > 
> > > > > I don't have a dev env for s390x/z15 set up yet, so can't
> > > > > definitely
> > > > > confirm the fix.
> > > > > That said, it seems worth fixing either way.
> > > > > 
> > > > >  tools/testing/selftests/bpf/progs/test_sk_lookup.c | 3 ++-
> > > > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > > > > 
> > > > > diff --git
> > > > > a/tools/testing/selftests/bpf/progs/test_sk_lookup.c
> > > > > b/tools/testing/selftests/bpf/progs/test_sk_lookup.c
> > > > > index bf5b7caefdd0..7d47276a8964 100644
> > > > > --- a/tools/testing/selftests/bpf/progs/test_sk_lookup.c
> > > > > +++ b/tools/testing/selftests/bpf/progs/test_sk_lookup.c
> > > > > @@ -65,6 +65,7 @@ static const __u32 KEY_SERVER_A = SERVER_A;
> > > > >  static const __u32 KEY_SERVER_B = SERVER_B;
> > > > >  
> > > > >  static const __u16 SRC_PORT = bpf_htons(8008);
> > > > > +static const __u32 SRC_PORT_U32 = bpf_htonl(8008U << 16);
> > > > >  static const __u32 SRC_IP4 = IP4(127, 0, 0, 2);
> > > > >  static const __u32 SRC_IP6[] = IP6(0xfd000000, 0x0, 0x0,
> > > > > 0x00000002);
> > > > >  
> > > > > @@ -421,7 +422,7 @@ int ctx_narrow_access(struct
> > > > > bpf_sk_lookup
> > > > > *ctx)
> > > > >  
> > > > >         /* Load from remote_port field with zero padding
> > > > > (backward
> > > > > compatibility) */
> > > > >         val_u32 = *(__u32 *)&ctx->remote_port;
> > > > > -       if (val_u32 != bpf_htonl(bpf_ntohs(SRC_PORT) << 16))
> > > > > +       if (val_u32 != SRC_PORT_U32)
> > > > >                 return SK_DROP;
> > > > >  
> > > > >         /* Narrow loads from local_port field. Expect
> > > > > DST_PORT.
> > > > > */
> > > > 
> > > > Unfortunately this doesn't help with the s390 problem.
> > > > I'll try to debug this.
> > > 
> > > I have to admit I have a hard time wrapping my head around the
> > > requirements here.
> > > 
> > > Based on the pre-9a69e2b385f4 code, do I understand correctly
> > > that
> > > for the following input
> > > 
> > > Port:     0x1f48
> > > SRC_PORT: 0x481f
> > > 
> > > we expect the following results for different kinds of loads:
> > > 
> > > Size   Offset  LE      BE
> > > BPF_B  0       0x1f    0
> > > BPF_B  1       0x48    0
> > > BPF_B  2       0       0x48
> > > BPF_B  3       0       0x1f
> > > BPF_H  0       0x481f  0
> > > BPF_H  1       0       0x481f
> > > BPF_W  0       0x481f  0x481f
> > > 
> > > and this is guaranteed by the struct bpf_sk_lookup ABI? Because
> > > then
> > > it
> > > looks as if 9a69e2b385f4 breaks it on big-endian as follows:
> > > 
> > > Size   Offset  BE-9a69e2b385f4
> > > BPF_B  0       0x48
> > > BPF_B  1       0x1f
> > > BPF_B  2       0
> > > BPF_B  3       0
> > > BPF_H  0       0x481f
> > > BPF_H  1       0
> > > BPF_W  0       0x481f0000
> > 
> > Sorry, I worded this incorrectly: 9a69e2b385f4 did not change the
> > kernel behavior, the ABI is not broken and the old compiled code
> > should
> > continue to work.
> > What the second table really shows are what the results should be
> > according to the 9a69e2b385f4 struct bpf_sk_lookup definition,
> > which I
> > still think is broken on big-endian and needs to be adjusted to
> > match
> > the ABI.
> > 
> > I noticed one other strange thing in the meantime: loads from
> > *(__u32 *)&ctx->remote_port, *(__u16 *)&ctx->remote_port and
> > *((__u16 *)&ctx->remote_port + 1) all produce 8008 on s390, which
> > is
> > clearly inconsistent. It looks as if convert_ctx_accesses() needs
> > to be
> > adjusted to handle combinations like ctx_field_size == 4 && size ==
> > 2
> > && target_size == 2. I will continue with this tomorrow.
> > 
> > > Or is the old behavior a bug and this new one is desirable?
> > > 9a69e2b385f4 has no Fixes: tag, so I assume that's the former :-(
> > > 
> > > In which case, would it make sense to fix it by swapping
> > > remote_port
> > > and :16 in bpf_sk_lookup on big-endian?
> 
> Thanks for looking into it.
> 
> When it comes to requirements, my intention was to keep the same
> behavior as before the split up of the remote_port field in
> 9a69e2b385f4
> ("bpf: Make remote_port field in struct bpf_sk_lookup 16-bit wide").
> 
> 9a69e2b385f4 was supposed to be a formality, after a similar change
> in
> 4421a582718a ("bpf: Make dst_port field in struct bpf_sock 16-bit
> wide"), which went in earlier.
> 
> In 4421a582718a I've provided a bit more context. I understand that
> the
> remote_port value, even before the type changed from u32 to u16,
> appeared to the BPF program as if laid out in memory like so:
> 
>       offsetof(struct bpf_sk_lookup, remote_port) +0  <port MSB>
>                                                   +1  <port LSB>
>                                                   +2  0x00
>                                                   +3  0x00
> 
> Translating it to your handy table format, I expect should result in
> loads as so if port is 8008 = 0x1f48:
> 
>       Size   Offset  LE      BE
>       BPF_B  0       0x1f    0x1f
>       BPF_B  1       0x48    0x48
>       BPF_B  2       0       0
>       BPF_B  3       0       0
>       BPF_H  0       0x481f  0x1f48
>       BPF_H  1       0       0
>       BPF_W  0       0x481f  0x1f480000

Hmm, I think for big-endian the layout is different.
If we look at test_sk_lookup.c from 9a69e2b385f4^:

        /* Narrow loads from remote_port field. Expect SRC_PORT. */
        if (LSB(ctx->remote_port, 0) != ((SRC_PORT >> 0) & 0xff) ||
            LSB(ctx->remote_port, 1) != ((SRC_PORT >> 8) & 0xff) ||
            LSB(ctx->remote_port, 2) != 0 || LSB(ctx->remote_port, 3)
!= 0)
                return SK_DROP;

LSB() on little-endian is just byte indexing, so it's indeed 
1f,48,00,00. However, on big-endian it's indexing from the end, so
it's 00,00,48,1f.

> But since the fix does not work, there must be a mistake somewhere in
> my
> reasoning.
> 
> I expect I should be able to get virtme for s390 working sometime
> this
> week to check my math. I've seen your collegue had some luck with it
> [1].

Yeah, I think it should work. In the worst case it should be possible
to tweak vmtest.sh to cross-compile and emulate s390.

> Looking forward to your findings.
> 
> [1] https://github.com/cilium/ebpf/issues/86#issuecomment-623945549


  reply	other threads:[~2022-02-22 17:42 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-02-21 18:03 [PATCH bpf-next] selftests/bpf: Fix implementation-defined behavior in sk_lookup test Jakub Sitnicki
2022-02-21 21:39 ` Ilya Leoshkevich
2022-02-22  0:43   ` Ilya Leoshkevich
2022-02-22  2:22     ` Ilya Leoshkevich
2022-02-22 14:53       ` Jakub Sitnicki
2022-02-22 17:42         ` Ilya Leoshkevich [this message]
2022-02-22 18:24           ` Jakub Sitnicki
2022-02-22 21:51             ` Ilya Leoshkevich
2022-02-25 10:01               ` Jakub Sitnicki

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=6e28c1c3ef0eda6f041593216ac32b210e55e4b7.camel@linux.ibm.com \
    --to=iii@linux.ibm.com \
    --cc=andrii.nakryiko@gmail.com \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=jakub@cloudflare.com \
    --cc=kernel-team@cloudflare.com \
    --cc=netdev@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.