All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Jonathan Lemon" <jonathan.lemon@gmail.com>
To: "Daniel Borkmann" <daniel@iogearbox.net>
Cc: "Toke Høiland-Jørgensen" <toke@redhat.com>,
	"Alexei Starovoitov" <alexei.starovoitov@gmail.com>,
	"David Miller" <davem@davemloft.net>,
	"Network Development" <netdev@vger.kernel.org>,
	"Jesper Dangaard Brouer" <brouer@redhat.com>,
	"Alexei Starovoitov" <ast@kernel.org>
Subject: Re: [PATCH net-next v2 1/2] bpf_xdp_redirect_map: Add flag to return XDP_PASS on map lookup failure
Date: Thu, 06 Jun 2019 13:13:06 -0700	[thread overview]
Message-ID: <2319D5A1-22D6-409F-9570-6A135DB026E0@gmail.com> (raw)
In-Reply-To: <709e80ae-a08a-f00e-8f42-50289495d0de@iogearbox.net>

On 6 Jun 2019, at 12:24, Daniel Borkmann wrote:

> On 06/06/2019 08:15 PM, Jonathan Lemon wrote:
>> On 6 Jun 2019, at 9:15, Toke Høiland-Jørgensen wrote:
>>> Alexei Starovoitov <alexei.starovoitov@gmail.com> writes:
>>>> On Thu, Jun 6, 2019 at 8:51 AM Daniel Borkmann 
>>>> <daniel@iogearbox.net> wrote:
>>>>> On 06/06/2019 03:24 PM, Toke Høiland-Jørgensen wrote:
>>>>>> From: Toke Høiland-Jørgensen <toke@redhat.com>
>>>>>>
>>>>>> The bpf_redirect_map() helper used by XDP programs doesn't return 
>>>>>> any
>>>>>> indication of whether it can successfully redirect to the map 
>>>>>> index it was
>>>>>> given. Instead, BPF programs have to track this themselves, 
>>>>>> leading to
>>>>>> programs using duplicate maps to track which entries are 
>>>>>> populated in the
>>>>>> devmap.
>>>>>>
>>>>>> This patch adds a flag to the XDP version of the 
>>>>>> bpf_redirect_map() helper,
>>>>>> which makes the helper do a lookup in the map when called, and 
>>>>>> return
>>>>>> XDP_PASS if there is no value at the provided index.
>>>>>>
>>>>>> With this, a BPF program can check the return code from the 
>>>>>> helper call and
>>>>>> react if it is XDP_PASS (by, for instance, substituting a 
>>>>>> different
>>>>>> redirect). This works for any type of map used for redirect.
>>>>>>
>>>>>> Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
>>>>>> ---
>>>>>>  include/uapi/linux/bpf.h |    8 ++++++++
>>>>>>  net/core/filter.c        |   10 +++++++++-
>>>>>>  2 files changed, 17 insertions(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
>>>>>> index 7c6aef253173..d57df4f0b837 100644
>>>>>> --- a/include/uapi/linux/bpf.h
>>>>>> +++ b/include/uapi/linux/bpf.h
>>>>>> @@ -3098,6 +3098,14 @@ enum xdp_action {
>>>>>>       XDP_REDIRECT,
>>>>>>  };
>>>>>>
>>>>>> +/* Flags for bpf_xdp_redirect_map helper */
>>>>>> +
>>>>>> +/* If set, the help will check if the entry exists in the map 
>>>>>> and return
>>>>>> + * XDP_PASS if it doesn't.
>>>>>> + */
>>>>>> +#define XDP_REDIRECT_F_PASS_ON_INVALID BIT(0)
>>>>>> +#define XDP_REDIRECT_ALL_FLAGS XDP_REDIRECT_F_PASS_ON_INVALID
>>>>>> +
>>>>>>  /* user accessible metadata for XDP packet hook
>>>>>>   * new fields must be added to the end of this structure
>>>>>>   */
>>>>>> diff --git a/net/core/filter.c b/net/core/filter.c
>>>>>> index 55bfc941d17a..2e532a9b2605 100644
>>>>>> --- a/net/core/filter.c
>>>>>> +++ b/net/core/filter.c
>>>>>> @@ -3755,9 +3755,17 @@ BPF_CALL_3(bpf_xdp_redirect_map, struct 
>>>>>> bpf_map *, map, u32, ifindex,
>>>>>>  {
>>>>>>       struct bpf_redirect_info *ri = 
>>>>>> this_cpu_ptr(&bpf_redirect_info);
>>>>>>
>>>>>> -     if (unlikely(flags))
>>>>>> +     if (unlikely(flags & ~XDP_REDIRECT_ALL_FLAGS))
>>>>>>               return XDP_ABORTED;
>>>>>>
>>>>>> +     if (flags & XDP_REDIRECT_F_PASS_ON_INVALID) {
>>>>>> +             void *val;
>>>>>> +
>>>>>> +             val = __xdp_map_lookup_elem(map, 
>>>>>> ifindex);
>>>>>> +             if (unlikely(!val))
>>>>>> +                     return XDP_PASS;
>>>>>
>>>>> Generally looks good to me, also the second part with the flag. 
>>>>> Given we store into
>>>>> the per-CPU scratch space and function like xdp_do_redirect() pick 
>>>>> this up again, we
>>>>> could even propagate val onwards and save a second lookup on the 
>>>>> /same/ element (which
>>>>> also avoids a race if the val was dropped from the map in the 
>>>>> meantime). Given this
>>>>> should all still be within RCU it should work. Perhaps it even 
>>>>> makes sense to do the
>>>>> lookup unconditionally inside bpf_xdp_redirect_map() helper iff we 
>>>>> manage to do it
>>>>> only once anyway?
>>>>
>>>> +1
>>>>
>>>> also I don't think we really need a new flag here.
>>>> Yes, it could be considered an uapi change, but it
>>>> looks more like bugfix in uapi to me.
>>>> Since original behavior was so clunky to use.
>>>
>>> Hmm, the problem with this is that eBPF programs generally do 
>>> something
>>> like:
>>>
>>> return bpf_redirect_map(map, idx, 0);
>>>
>>> after having already modified the packet headers. This will get them 
>>> a
>>> return code of XDP_REDIRECT, and the lookup will then subsequently 
>>> fail,
>>> which returns in XDP_ABORTED in the driver, which you can catch with
>>> tracing.
>>>
>>> However, if we just change it to XDP_PASS, the packet will go up the
>>> stack, but because it has already been modified the stack will drop 
>>> it,
>>> more or less invisibly.
>>>
>>> So the question becomes, is that behaviour change really OK?
>>
>> Another option would be treating the flags (or the lower bits of 
>> flags)
>> as the default xdp action taken if the lookup fails.  0 just happens 
>> to
>> map to XDP_ABORTED, which gives the initial behavior.  Then the new 
>> behavior
>> would be:
>>
>>     return bpf_redirect_map(map, index, XDP_PASS);
>
> Makes sense, that should work, but as default (flags == 0), you'd have
> to return XDP_REDIRECT to stay consistent with existing behavior.

Right - I was thinking something along the lines of:

    val = __xdp_map_lookup_elem(map, ifindex);
    if (unlikely(!val))
        return (flags & 3);
    ...
    return XDP_REDIRECT;


Stated another way, if the map lookup succeeds, return REDIRECT, 
otherwise
return one (ABORT, DROP, PASS, TX).
-- 
Jonathan

	
>
> Thanks,
> Daniel

  reply	other threads:[~2019-06-06 20:13 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-06-06 13:24 [PATCH net-next v2 0/2] xdp: Allow lookup into devmaps before redirect Toke Høiland-Jørgensen
2019-06-06 13:24 ` [PATCH net-next v2 1/2] bpf_xdp_redirect_map: Add flag to return XDP_PASS on map lookup failure Toke Høiland-Jørgensen
2019-06-06 15:51   ` Daniel Borkmann
2019-06-06 15:56     ` Alexei Starovoitov
2019-06-06 16:15       ` Toke Høiland-Jørgensen
2019-06-06 18:15         ` Jonathan Lemon
2019-06-06 19:24           ` Daniel Borkmann
2019-06-06 20:13             ` Jonathan Lemon [this message]
2019-06-06 21:14               ` Toke Høiland-Jørgensen
2019-06-06 21:53                 ` Jonathan Lemon
2019-06-06 22:31                   ` Toke Høiland-Jørgensen
2019-06-06 13:24 ` [PATCH net-next v2 2/2] devmap: Allow map lookups from eBPF Toke Høiland-Jørgensen
2019-06-06 13:33   ` Jesper Dangaard Brouer
2019-06-06 13:49     ` Toke Høiland-Jørgensen
2019-06-06 18:20 ` [PATCH net-next v2 0/2] xdp: Allow lookup into devmaps before redirect David Miller

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=2319D5A1-22D6-409F-9570-6A135DB026E0@gmail.com \
    --to=jonathan.lemon@gmail.com \
    --cc=alexei.starovoitov@gmail.com \
    --cc=ast@kernel.org \
    --cc=brouer@redhat.com \
    --cc=daniel@iogearbox.net \
    --cc=davem@davemloft.net \
    --cc=netdev@vger.kernel.org \
    --cc=toke@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.