All of lore.kernel.org
 help / color / mirror / Atom feed
From: Kuniyuki Iwashima <kuniyu@amazon.com>
To: <lmb@isovalent.com>
Cc: <andrii@kernel.org>, <ast@kernel.org>, <bpf@vger.kernel.org>,
	<daniel@iogearbox.net>, <davem@davemloft.net>,
	<dsahern@kernel.org>, <edumazet@google.com>, <haoluo@google.com>,
	<hemanthmalla@gmail.com>, <joe@wand.net.nz>,
	<john.fastabend@gmail.com>, <jolsa@kernel.org>,
	<kpsingh@kernel.org>, <kuba@kernel.org>, <kuniyu@amazon.com>,
	<linux-kernel@vger.kernel.org>, <linux-kselftest@vger.kernel.org>,
	<martin.lau@linux.dev>, <mykolal@fb.com>,
	<netdev@vger.kernel.org>, <pabeni@redhat.com>, <sdf@google.com>,
	<shuah@kernel.org>, <song@kernel.org>,
	<willemdebruijn.kernel@gmail.com>, <yhs@fb.com>
Subject: Re: [PATCH bpf-next v2 3/6] net: remove duplicate reuseport_lookup functions
Date: Tue, 20 Jun 2023 11:31:23 -0700	[thread overview]
Message-ID: <20230620183123.74585-1-kuniyu@amazon.com> (raw)
In-Reply-To: <CAN+4W8ge-ZQjins-E1=GHDnsi9myFqt7pwNqMkUQHZOPHQhFvQ@mail.gmail.com>

From: Lorenz Bauer <lmb@isovalent.com>
Date: Tue, 20 Jun 2023 15:26:05 +0100
> On Tue, Jun 13, 2023 at 7:57 PM Kuniyuki Iwashima <kuniyu@amazon.com> wrote:
> >
> > The assignment to result below is buggy.  Let's say SO_REUSEPROT group
> > have TCP_CLOSE and TCP_ESTABLISHED sockets.
> >
> >   1. Find TCP_CLOSE sk and do SO_REUSEPORT lookup
> >   2. result is not NULL, but the group has TCP_ESTABLISHED sk
> >   3. result = result
> >   4. Find TCP_ESTABLISHED sk, which has a higher score
> >   5. result = result (TCP_CLOSE) <-- should be sk.
> >
> > Same for v6 function.
> 
> Thanks for your explanation, I think I get it now. I misunderstood
> that you were worried about returning TCP_ESTABLISHED instead of
> TCP_CLOSE, but it's exactly the other way around.
> 
> I have a follow up question regarding the existing code:
> 
>     result = lookup_reuseport(net, sk, skb,
>                     saddr, sport, daddr, hnum);
>     /* Fall back to scoring if group has connections */
>     if (result && !reuseport_has_conns(sk))
>         return result;
> 
>     result = result ? : sk;
>     badness = score;
> 
> Assuming that result != NULL but reuseport_has_conns() == true, we use
> the reuseport socket as the result, but assign the score of sk to
> badness. Shouldn't we use the score of the reuseport socket?

Good point.  This is based on an assumption that all SO_REUSEPORT
sockets have the same score, which is wrong for two corner cases
if reuseport_has_conns() == true :

  1) SO_INCOMING_CPU is set
     -> selected sk might have +1 score

  2) BPF prog returns ESTABLISHED and/or SO_INCOMING_CPU sk
     -> selected sk will have more than 8

Using the old score could trigger more lookups depending on the
order that sockets are created.

  sk -> sk (SO_INCOMING_CPU) -> sk (ESTABLISHED)
  |     |
  `-> select the next SO_INCOMING_CPU sk
        |
	`-> select itself (We should save this lookup)

So, yes, we should update badness like

  if (unlikely(result)) {
      badness = compute_score(result, ...);
  } else {
      result = sk;
      badness = score;
  }

  reply	other threads:[~2023-06-20 18:32 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-06-13 10:14 [PATCH bpf-next v2 0/6] Add SO_REUSEPORT support for TC bpf_sk_assign Lorenz Bauer
2023-06-13 10:14 ` [PATCH bpf-next v2 1/6] net: export inet_lookup_reuseport and inet6_lookup_reuseport Lorenz Bauer
2023-06-13 10:14 ` [PATCH bpf-next v2 2/6] net: document inet[6]_lookup_reuseport sk_state requirements Lorenz Bauer
2023-06-13 10:14 ` [PATCH bpf-next v2 3/6] net: remove duplicate reuseport_lookup functions Lorenz Bauer
2023-06-13 15:32   ` Simon Horman
2023-06-14 15:42     ` Lorenz Bauer
2023-06-15  7:21       ` Simon Horman
2023-06-13 17:26   ` kernel test robot
2023-06-13 18:56   ` Kuniyuki Iwashima
2023-06-14 15:25     ` Lorenz Bauer
2023-06-14 16:52       ` Kuniyuki Iwashima
2023-06-20 14:26     ` Lorenz Bauer
2023-06-20 18:31       ` Kuniyuki Iwashima [this message]
2023-06-21  8:01         ` Lorenz Bauer
2023-06-21 13:49         ` Lorenz Bauer
2023-06-21 15:00           ` Kuniyuki Iwashima
2023-06-13 10:14 ` [PATCH bpf-next v2 4/6] net: remove duplicate sk_lookup helpers Lorenz Bauer
2023-06-13 19:03   ` Kuniyuki Iwashima
2023-06-13 19:41   ` kernel test robot
2023-06-13 10:15 ` [PATCH bpf-next v2 5/6] bpf, net: Support SO_REUSEPORT sockets with bpf_sk_assign Lorenz Bauer
2023-06-13 17:26   ` kernel test robot
2023-06-13 10:15 ` [PATCH bpf-next v2 6/6] selftests/bpf: Test that SO_REUSEPORT can be used with sk_assign helper Lorenz Bauer

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230620183123.74585-1-kuniyu@amazon.com \
    --to=kuniyu@amazon.com \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=davem@davemloft.net \
    --cc=dsahern@kernel.org \
    --cc=edumazet@google.com \
    --cc=haoluo@google.com \
    --cc=hemanthmalla@gmail.com \
    --cc=joe@wand.net.nz \
    --cc=john.fastabend@gmail.com \
    --cc=jolsa@kernel.org \
    --cc=kpsingh@kernel.org \
    --cc=kuba@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-kselftest@vger.kernel.org \
    --cc=lmb@isovalent.com \
    --cc=martin.lau@linux.dev \
    --cc=mykolal@fb.com \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=sdf@google.com \
    --cc=shuah@kernel.org \
    --cc=song@kernel.org \
    --cc=willemdebruijn.kernel@gmail.com \
    --cc=yhs@fb.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.