All of lore.kernel.org
 help / color / mirror / Atom feed
From: Kuniyuki Iwashima <kuniyu@amazon.co.jp>
To: <kuniyu@amazon.co.jp>
Cc: <kuni1840@gmail.com>, <netdev@vger.kernel.org>,
	<bpf@vger.kernel.org>, <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH v1 bpf-next 06/11] bpf: Introduce two attach types for BPF_PROG_TYPE_SK_REUSEPORT.
Date: Sun, 6 Dec 2020 13:32:16 +0900	[thread overview]
Message-ID: <20201206043216.23046-1-kuniyu@amazon.co.jp> (raw)
In-Reply-To: <20201204055226.c6abex2dmperhgx5@kafai-mbp.dhcp.thefacebook.com>

I'm sending this mail just for logging because I failed to send mails only
to LKML, netdev, and bpf yesterday.


From:   Martin KaFai Lau <kafai@fb.com>
Date:   Thu, 3 Dec 2020 21:56:53 -0800
> On Thu, Dec 03, 2020 at 11:16:08PM +0900, Kuniyuki Iwashima wrote:
> > From:   Martin KaFai Lau <kafai@fb.com>
> > Date:   Wed, 2 Dec 2020 20:24:02 -0800
> > > On Wed, Dec 02, 2020 at 11:19:02AM -0800, Martin KaFai Lau wrote:
> > > > On Tue, Dec 01, 2020 at 06:04:50PM -0800, Andrii Nakryiko wrote:
> > > > > On Tue, Dec 1, 2020 at 6:49 AM Kuniyuki Iwashima <kuniyu@amazon.co.jp> wrote:
> > > > > >
> > > > > > This commit adds new bpf_attach_type for BPF_PROG_TYPE_SK_REUSEPORT to
> > > > > > check if the attached eBPF program is capable of migrating sockets.
> > > > > >
> > > > > > When the eBPF program is attached, the kernel runs it for socket migration
> > > > > > only if the expected_attach_type is BPF_SK_REUSEPORT_SELECT_OR_MIGRATE.
> > > > > > The kernel will change the behaviour depending on the returned value:
> > > > > >
> > > > > >   - SK_PASS with selected_sk, select it as a new listener
> > > > > >   - SK_PASS with selected_sk NULL, fall back to the random selection
> > > > > >   - SK_DROP, cancel the migration
> > > > > >
> > > > > > Link: https://lore.kernel.org/netdev/20201123003828.xjpjdtk4ygl6tg6h@kafai-mbp.dhcp.thefacebook.com/
> > > > > > Suggested-by: Martin KaFai Lau <kafai@fb.com>
> > > > > > Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp>
> > > > > > ---
> > > > > >  include/uapi/linux/bpf.h       | 2 ++
> > > > > >  kernel/bpf/syscall.c           | 8 ++++++++
> > > > > >  tools/include/uapi/linux/bpf.h | 2 ++
> > > > > >  3 files changed, 12 insertions(+)
> > > > > >
> > > > > > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> > > > > > index 85278deff439..cfc207ae7782 100644
> > > > > > --- a/include/uapi/linux/bpf.h
> > > > > > +++ b/include/uapi/linux/bpf.h
> > > > > > @@ -241,6 +241,8 @@ enum bpf_attach_type {
> > > > > >         BPF_XDP_CPUMAP,
> > > > > >         BPF_SK_LOOKUP,
> > > > > >         BPF_XDP,
> > > > > > +       BPF_SK_REUSEPORT_SELECT,
> > > > > > +       BPF_SK_REUSEPORT_SELECT_OR_MIGRATE,
> > > > > >         __MAX_BPF_ATTACH_TYPE
> > > > > >  };
> > > > > >
> > > > > > diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
> > > > > > index f3fe9f53f93c..a0796a8de5ea 100644
> > > > > > --- a/kernel/bpf/syscall.c
> > > > > > +++ b/kernel/bpf/syscall.c
> > > > > > @@ -2036,6 +2036,14 @@ bpf_prog_load_check_attach(enum bpf_prog_type prog_type,
> > > > > >                 if (expected_attach_type == BPF_SK_LOOKUP)
> > > > > >                         return 0;
> > > > > >                 return -EINVAL;
> > > > > > +       case BPF_PROG_TYPE_SK_REUSEPORT:
> > > > > > +               switch (expected_attach_type) {
> > > > > > +               case BPF_SK_REUSEPORT_SELECT:
> > > > > > +               case BPF_SK_REUSEPORT_SELECT_OR_MIGRATE:
> > > > > > +                       return 0;
> > > > > > +               default:
> > > > > > +                       return -EINVAL;
> > > > > > +               }
> > > > > 
> > > > > this is a kernel regression, previously expected_attach_type wasn't
> > > > > enforced, so user-space could have provided any number without an
> > > > > error.
> > > > I also think this change alone will break things like when the usual
> > > > attr->expected_attach_type == 0 case.  At least changes is needed in
> > > > bpf_prog_load_fixup_attach_type() which is also handling a
> > > > similar situation for BPF_PROG_TYPE_CGROUP_SOCK.
> > > > 
> > > > I now think there is no need to expose new bpf_attach_type to the UAPI.
> > > > Since the prog->expected_attach_type is not used, it can be cleared at load time
> > > > and then only set to BPF_SK_REUSEPORT_SELECT_OR_MIGRATE (probably defined
> > > > internally at filter.[c|h]) in the is_valid_access() when "migration"
> > > > is accessed.  When "migration" is accessed, the bpf prog can handle
> > > > migration (and the original not-migration) case.
> > > Scrap this internal only BPF_SK_REUSEPORT_SELECT_OR_MIGRATE idea.
> > > I think there will be cases that bpf prog wants to do both
> > > without accessing any field from sk_reuseport_md.
> > > 
> > > Lets go back to the discussion on using a similar
> > > idea as BPF_PROG_TYPE_CGROUP_SOCK in bpf_prog_load_fixup_attach_type().
> > > I am not aware there is loader setting a random number
> > > in expected_attach_type, so the chance of breaking
> > > is very low.  There was a similar discussion earlier [0].
> > > 
> > > [0]: https://lore.kernel.org/netdev/20200126045443.f47dzxdglazzchfm@ast-mbp/
> > 
> > Thank you for the idea and reference.
> > 
> > I will remove the change in bpf_prog_load_check_attach() and set the
> > default value (BPF_SK_REUSEPORT_SELECT) in bpf_prog_load_fixup_attach_type()
> > for backward compatibility if expected_attach_type is 0.
> check_attach_type() can be kept.  You can refer to
> commit aac3fc320d94 for a similar situation.

I confirmed bpf_prog_load_fixup_attach_type() is called just before
bpf_prog_load_check_attach(), so I will add the fixup code to this patch.
Thank you.

  reply	other threads:[~2020-12-06  4:33 UTC|newest]

Thread overview: 61+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-12-01 14:44 [PATCH v1 bpf-next 00/11] Socket migration for SO_REUSEPORT Kuniyuki Iwashima
2020-12-01 14:44 ` [PATCH v1 bpf-next 01/11] tcp: Keep TCP_CLOSE sockets in the reuseport group Kuniyuki Iwashima
2020-12-05  1:31   ` Martin KaFai Lau
2020-12-06  4:38     ` Kuniyuki Iwashima
2020-12-01 14:44 ` [PATCH v1 bpf-next 02/11] bpf: Define migration types for SO_REUSEPORT Kuniyuki Iwashima
2020-12-01 14:44 ` [PATCH v1 bpf-next 03/11] tcp: Migrate TCP_ESTABLISHED/TCP_SYN_RECV sockets in accept queues Kuniyuki Iwashima
2020-12-01 15:25   ` Eric Dumazet
2020-12-03 14:14     ` Kuniyuki Iwashima
2020-12-03 14:31       ` Eric Dumazet
2020-12-03 15:41         ` Kuniyuki Iwashima
2020-12-07 20:33       ` Martin KaFai Lau
2020-12-08  6:31         ` Kuniyuki Iwashima
2020-12-08  7:34           ` Martin KaFai Lau
2020-12-08  8:17             ` Kuniyuki Iwashima
2020-12-09  3:09               ` Martin KaFai Lau
2020-12-09  8:05                 ` Kuniyuki Iwashima
2020-12-09 16:57                   ` Kuniyuki Iwashima
2020-12-10  1:53                     ` Martin KaFai Lau
2020-12-10  5:58                       ` Kuniyuki Iwashima
2020-12-10 19:33                         ` Martin KaFai Lau
2020-12-14 17:16                           ` Kuniyuki Iwashima
2020-12-05  1:42   ` Martin KaFai Lau
2020-12-06  4:41     ` Kuniyuki Iwashima
     [not found]     ` <20201205160307.91179-1-kuniyu@amazon.co.jp>
2020-12-07 20:14       ` Martin KaFai Lau
2020-12-08  6:27         ` Kuniyuki Iwashima
2020-12-08  8:13           ` Martin KaFai Lau
2020-12-08  9:02             ` Kuniyuki Iwashima
2020-12-08  6:54   ` Martin KaFai Lau
2020-12-08  7:42     ` Kuniyuki Iwashima
2020-12-01 14:44 ` [PATCH v1 bpf-next 04/11] tcp: Migrate TFO requests causing RST during TCP_SYN_RECV Kuniyuki Iwashima
2020-12-01 15:30   ` Eric Dumazet
2020-12-01 14:44 ` [PATCH v1 bpf-next 05/11] tcp: Migrate TCP_NEW_SYN_RECV requests Kuniyuki Iwashima
2020-12-01 15:13   ` Eric Dumazet
2020-12-03 14:12     ` Kuniyuki Iwashima
2020-12-01 17:37   ` kernel test robot
2020-12-01 17:37     ` kernel test robot
2020-12-01 17:42   ` kernel test robot
2020-12-01 17:42     ` kernel test robot
2020-12-10  0:07   ` Martin KaFai Lau
2020-12-10  5:15     ` Kuniyuki Iwashima
2020-12-10 18:49       ` Martin KaFai Lau
2020-12-14 17:03         ` Kuniyuki Iwashima
2020-12-15  2:58           ` Martin KaFai Lau
2020-12-16 16:41             ` Kuniyuki Iwashima
2020-12-16 22:24               ` Martin KaFai Lau
2020-12-01 14:44 ` [PATCH v1 bpf-next 06/11] bpf: Introduce two attach types for BPF_PROG_TYPE_SK_REUSEPORT Kuniyuki Iwashima
2020-12-02  2:04   ` Andrii Nakryiko
2020-12-02 19:19     ` Martin KaFai Lau
2020-12-03  4:24       ` Martin KaFai Lau
2020-12-03 14:16         ` Kuniyuki Iwashima
2020-12-04  5:56           ` Martin KaFai Lau
2020-12-06  4:32             ` Kuniyuki Iwashima [this message]
2020-12-01 14:44 ` [PATCH v1 bpf-next 07/11] libbpf: Set expected_attach_type " Kuniyuki Iwashima
2020-12-01 14:44 ` [PATCH v1 bpf-next 08/11] bpf: Add migration to sk_reuseport_(kern|md) Kuniyuki Iwashima
2020-12-01 14:44 ` [PATCH v1 bpf-next 09/11] bpf: Support bpf_get_socket_cookie_sock() for BPF_PROG_TYPE_SK_REUSEPORT Kuniyuki Iwashima
2020-12-04 19:58   ` Martin KaFai Lau
2020-12-06  4:36     ` Kuniyuki Iwashima
2020-12-01 14:44 ` [PATCH v1 bpf-next 10/11] bpf: Call bpf_run_sk_reuseport() for socket migration Kuniyuki Iwashima
2020-12-01 14:44 ` [PATCH v1 bpf-next 11/11] bpf: Test BPF_SK_REUSEPORT_SELECT_OR_MIGRATE Kuniyuki Iwashima
2020-12-05  1:50   ` Martin KaFai Lau
2020-12-06  4:43     ` Kuniyuki Iwashima

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20201206043216.23046-1-kuniyu@amazon.co.jp \
    --to=kuniyu@amazon.co.jp \
    --cc=bpf@vger.kernel.org \
    --cc=kuni1840@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.