netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net-next 2/2] devmap: Allow map lookups from eBPF
  2019-06-04 15:24 [PATCH net-next 0/2] xdp: Allow lookup into devmaps before redirect Toke Høiland-Jørgensen
  2019-06-04 15:24 ` [PATCH net-next 1/2] bpf_xdp_redirect_map: Add flag to return XDP_PASS on map lookup failure Toke Høiland-Jørgensen
@ 2019-06-04 15:24 ` Toke Høiland-Jørgensen
  2019-06-04 16:35   ` Jesper Dangaard Brouer
  2019-06-04 20:22   ` Toke Høiland-Jørgensen
  1 sibling, 2 replies; 11+ messages in thread
From: Toke Høiland-Jørgensen @ 2019-06-04 15:24 UTC (permalink / raw)
  To: David Miller
  Cc: netdev, Jesper Dangaard Brouer, Daniel Borkmann, Alexei Starovoitov

We don't currently allow lookups into a devmap from eBPF, because the map
lookup returns a pointer directly to the dev->ifindex, which shouldn't be
modifiable from eBPF.

However, being able to do lookups in devmaps is useful to know (e.g.)
whether forwarding to a specific interface is enabled. Currently, programs
work around this by keeping a shadow map of another type which indicates
whether a map index is valid.

To allow lookups, simply copy the ifindex into a scratch variable and
return a pointer to this. If an eBPF program does modify it, this doesn't
matter since it will be overridden on the next lookup anyway. While this
does add a write to every lookup, the overhead of this is negligible
because the cache line is hot when both the write and the subsequent read
happens.

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
---
 kernel/bpf/devmap.c   |    8 +++++++-
 kernel/bpf/verifier.c |    7 ++-----
 2 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/kernel/bpf/devmap.c b/kernel/bpf/devmap.c
index 5ae7cce5ef16..830650300ea4 100644
--- a/kernel/bpf/devmap.c
+++ b/kernel/bpf/devmap.c
@@ -65,6 +65,7 @@ struct xdp_bulk_queue {
 struct bpf_dtab_netdev {
 	struct net_device *dev; /* must be first member, due to tracepoint */
 	struct bpf_dtab *dtab;
+	int ifindex_scratch;
 	unsigned int bit;
 	struct xdp_bulk_queue __percpu *bulkq;
 	struct rcu_head rcu;
@@ -375,7 +376,12 @@ static void *dev_map_lookup_elem(struct bpf_map *map, void *key)
 	struct bpf_dtab_netdev *obj = __dev_map_lookup_elem(map, *(u32 *)key);
 	struct net_device *dev = obj ? obj->dev : NULL;
 
-	return dev ? &dev->ifindex : NULL;
+	if (dev) {
+		obj->ifindex_scratch = dev->ifindex;
+		return &obj->ifindex_scratch;
+	}
+
+	return NULL;
 }
 
 static void dev_map_flush_old(struct bpf_dtab_netdev *dev)
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 5c2cb5bd84ce..7128a9821481 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -2893,12 +2893,9 @@ static int check_map_func_compatibility(struct bpf_verifier_env *env,
 		if (func_id != BPF_FUNC_get_local_storage)
 			goto error;
 		break;
-	/* devmap returns a pointer to a live net_device ifindex that we cannot
-	 * allow to be modified from bpf side. So do not allow lookup elements
-	 * for now.
-	 */
 	case BPF_MAP_TYPE_DEVMAP:
-		if (func_id != BPF_FUNC_redirect_map)
+		if (func_id != BPF_FUNC_redirect_map &&
+		    func_id != BPF_FUNC_map_lookup_elem)
 			goto error;
 		break;
 	/* Restrict bpf side of cpumap and xskmap, open when use-cases


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH net-next 1/2] bpf_xdp_redirect_map: Add flag to return XDP_PASS on map lookup failure
  2019-06-04 15:24 [PATCH net-next 0/2] xdp: Allow lookup into devmaps before redirect Toke Høiland-Jørgensen
@ 2019-06-04 15:24 ` Toke Høiland-Jørgensen
  2019-06-05 10:39   ` Jesper Dangaard Brouer
  2019-06-04 15:24 ` [PATCH net-next 2/2] devmap: Allow map lookups from eBPF Toke Høiland-Jørgensen
  1 sibling, 1 reply; 11+ messages in thread
From: Toke Høiland-Jørgensen @ 2019-06-04 15:24 UTC (permalink / raw)
  To: David Miller
  Cc: netdev, Jesper Dangaard Brouer, Daniel Borkmann, Alexei Starovoitov

The bpf_redirect_map() helper used by XDP programs doesn't return any
indication of whether it can successfully redirect to the map index it was
given. Instead, BPF programs have to track this themselves, leading to
programs using duplicate maps to track which entries are populated in the
devmap.

This adds a flag to the XDP version of the bpf_redirect_map() helper, which
makes the helper do a lookup in the map when called, and return XDP_PASS if
there is no value at the provided index. This enables two use cases:

- A BPF program can check the return code from the helper call and react if
  it is XDP_PASS (by, for instance, redirecting out a different interface).

- Programs that just return the value of the bpf_redirect() call will
  automatically fall back to the regular networking stack, simplifying
  programs that (for instance) build a router with the fib_lookup() helper.

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
---
 include/uapi/linux/bpf.h |    8 ++++++++
 net/core/filter.c        |   10 +++++++++-
 2 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 7c6aef253173..4c41482b7604 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -3098,6 +3098,14 @@ enum xdp_action {
 	XDP_REDIRECT,
 };
 
+/* Flags for bpf_xdp_redirect_map helper */
+
+/* If set, the help will check if the entry exists in the map and return
+ * XDP_PASS if it doesn't.
+ */
+#define XDP_REDIRECT_PASS_ON_INVALID BIT(0)
+#define XDP_REDIRECT_ALL_FLAGS XDP_REDIRECT_PASS_ON_INVALID
+
 /* user accessible metadata for XDP packet hook
  * new fields must be added to the end of this structure
  */
diff --git a/net/core/filter.c b/net/core/filter.c
index 55bfc941d17a..dfab8478f66c 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -3755,9 +3755,17 @@ BPF_CALL_3(bpf_xdp_redirect_map, struct bpf_map *, map, u32, ifindex,
 {
 	struct bpf_redirect_info *ri = this_cpu_ptr(&bpf_redirect_info);
 
-	if (unlikely(flags))
+	if (unlikely(flags & ~XDP_REDIRECT_ALL_FLAGS))
 		return XDP_ABORTED;
 
+	if (flags & XDP_REDIRECT_PASS_ON_INVALID) {
+		struct net_device *fwd;
+
+		fwd = __xdp_map_lookup_elem(map, ifindex);
+		if (unlikely(!fwd))
+			return XDP_PASS;
+	}
+
 	ri->ifindex = ifindex;
 	ri->flags = flags;
 	WRITE_ONCE(ri->map, map);


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH net-next 0/2] xdp: Allow lookup into devmaps before redirect
@ 2019-06-04 15:24 Toke Høiland-Jørgensen
  2019-06-04 15:24 ` [PATCH net-next 1/2] bpf_xdp_redirect_map: Add flag to return XDP_PASS on map lookup failure Toke Høiland-Jørgensen
  2019-06-04 15:24 ` [PATCH net-next 2/2] devmap: Allow map lookups from eBPF Toke Høiland-Jørgensen
  0 siblings, 2 replies; 11+ messages in thread
From: Toke Høiland-Jørgensen @ 2019-06-04 15:24 UTC (permalink / raw)
  To: David Miller
  Cc: netdev, Jesper Dangaard Brouer, Daniel Borkmann, Alexei Starovoitov

When using the bpf_redirect_map() helper to redirect packets from XDP, the eBPF
program cannot currently know whether the redirect will succeed, which makes it
impossible to gracefully handle errors. To properly fix this will probably
require deeper changes to the way TX resources are allocated, but one thing that
is fairly straight forward to fix is to allow lookups into devmaps, so programs
can at least know when a redirect is *guaranteed* to fail because there is no
entry in the map. Currently, programs work around this by keeping a shadow map
of another type which indicates whether a map index is valid.

This series contains two changes that are complementary ways to fix this issue.
The first patch adds a flag to make the bpf_redirect_map() helper itself do a
lookup in the map and return XDP_PASS if the map index is unset, while the
second patch allows regular lookups into devmaps from eBPF programs.

The performance impact of both patches is negligible, in the sense that I cannot
measure it because the variance between test runs is higher than the difference
pre/post patch.

---

Toke Høiland-Jørgensen (2):
      bpf_xdp_redirect_map: Add flag to return XDP_PASS on map lookup failure
      devmap: Allow map lookups from eBPF


 include/uapi/linux/bpf.h |    8 ++++++++
 kernel/bpf/devmap.c      |    8 +++++++-
 kernel/bpf/verifier.c    |    7 ++-----
 net/core/filter.c        |   10 +++++++++-
 4 files changed, 26 insertions(+), 7 deletions(-)


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH net-next 2/2] devmap: Allow map lookups from eBPF
  2019-06-04 15:24 ` [PATCH net-next 2/2] devmap: Allow map lookups from eBPF Toke Høiland-Jørgensen
@ 2019-06-04 16:35   ` Jesper Dangaard Brouer
  2019-06-04 18:42     ` Toke Høiland-Jørgensen
  2019-06-04 19:41     ` Jonathan Lemon
  2019-06-04 20:22   ` Toke Høiland-Jørgensen
  1 sibling, 2 replies; 11+ messages in thread
From: Jesper Dangaard Brouer @ 2019-06-04 16:35 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen
  Cc: David Miller, netdev, Daniel Borkmann, Alexei Starovoitov,
	brouer, Jonathan Lemon, Arnaldo Carvalho de Melo, Jiri Olsa

On Tue, 04 Jun 2019 17:24:10 +0200
Toke Høiland-Jørgensen <toke@redhat.com> wrote:

> We don't currently allow lookups into a devmap from eBPF, because the map
> lookup returns a pointer directly to the dev->ifindex, which shouldn't be
> modifiable from eBPF.
> 
> However, being able to do lookups in devmaps is useful to know (e.g.)
> whether forwarding to a specific interface is enabled. Currently, programs
> work around this by keeping a shadow map of another type which indicates
> whether a map index is valid.
> 
> To allow lookups, simply copy the ifindex into a scratch variable and
> return a pointer to this. If an eBPF program does modify it, this doesn't
> matter since it will be overridden on the next lookup anyway. While this
> does add a write to every lookup, the overhead of this is negligible
> because the cache line is hot when both the write and the subsequent
> read happens.

When we choose the return value, here the ifindex, then this basically
becomes UABI, right?

Can we somehow use BTF to help us to make this extensible?

As Toke mention in the cover letter, we really want to know if the
chosen egress have actually enabled/allocated resources for XDP
transmitting, but as we currently don't have in-kernel way to query
thus (thus, we cannot expose such info).


> Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
> ---
>  kernel/bpf/devmap.c   |    8 +++++++-
>  kernel/bpf/verifier.c |    7 ++-----
>  2 files changed, 9 insertions(+), 6 deletions(-)
> 
> diff --git a/kernel/bpf/devmap.c b/kernel/bpf/devmap.c
> index 5ae7cce5ef16..830650300ea4 100644
> --- a/kernel/bpf/devmap.c
> +++ b/kernel/bpf/devmap.c
> @@ -65,6 +65,7 @@ struct xdp_bulk_queue {
>  struct bpf_dtab_netdev {
>  	struct net_device *dev; /* must be first member, due to tracepoint */
>  	struct bpf_dtab *dtab;
> +	int ifindex_scratch;
>  	unsigned int bit;
>  	struct xdp_bulk_queue __percpu *bulkq;
>  	struct rcu_head rcu;
> @@ -375,7 +376,12 @@ static void *dev_map_lookup_elem(struct bpf_map *map, void *key)
>  	struct bpf_dtab_netdev *obj = __dev_map_lookup_elem(map, *(u32 *)key);
>  	struct net_device *dev = obj ? obj->dev : NULL;
>  
> -	return dev ? &dev->ifindex : NULL;
> +	if (dev) {
> +		obj->ifindex_scratch = dev->ifindex;
> +		return &obj->ifindex_scratch;
> +	}
> +
> +	return NULL;
>  }
>  
>  static void dev_map_flush_old(struct bpf_dtab_netdev *dev)
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index 5c2cb5bd84ce..7128a9821481 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -2893,12 +2893,9 @@ static int check_map_func_compatibility(struct bpf_verifier_env *env,
>  		if (func_id != BPF_FUNC_get_local_storage)
>  			goto error;
>  		break;
> -	/* devmap returns a pointer to a live net_device ifindex that we cannot
> -	 * allow to be modified from bpf side. So do not allow lookup elements
> -	 * for now.
> -	 */
>  	case BPF_MAP_TYPE_DEVMAP:
> -		if (func_id != BPF_FUNC_redirect_map)
> +		if (func_id != BPF_FUNC_redirect_map &&
> +		    func_id != BPF_FUNC_map_lookup_elem)
>  			goto error;
>  		break;
>  	/* Restrict bpf side of cpumap and xskmap, open when use-cases
> 



-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH net-next 2/2] devmap: Allow map lookups from eBPF
  2019-06-04 16:35   ` Jesper Dangaard Brouer
@ 2019-06-04 18:42     ` Toke Høiland-Jørgensen
  2019-06-04 19:41     ` Jonathan Lemon
  1 sibling, 0 replies; 11+ messages in thread
From: Toke Høiland-Jørgensen @ 2019-06-04 18:42 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: David Miller, netdev, Daniel Borkmann, Alexei Starovoitov,
	brouer, Jonathan Lemon, Arnaldo Carvalho de Melo, Jiri Olsa

Jesper Dangaard Brouer <brouer@redhat.com> writes:

> On Tue, 04 Jun 2019 17:24:10 +0200
> Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>
>> We don't currently allow lookups into a devmap from eBPF, because the map
>> lookup returns a pointer directly to the dev->ifindex, which shouldn't be
>> modifiable from eBPF.
>> 
>> However, being able to do lookups in devmaps is useful to know (e.g.)
>> whether forwarding to a specific interface is enabled. Currently, programs
>> work around this by keeping a shadow map of another type which indicates
>> whether a map index is valid.
>> 
>> To allow lookups, simply copy the ifindex into a scratch variable and
>> return a pointer to this. If an eBPF program does modify it, this doesn't
>> matter since it will be overridden on the next lookup anyway. While this
>> does add a write to every lookup, the overhead of this is negligible
>> because the cache line is hot when both the write and the subsequent
>> read happens.
>
> When we choose the return value, here the ifindex, then this basically
> becomes UABI, right?

Well, we already have UABI on the insert side, where the value being
inserted has to be an ifindex. And we enforce value_size==4 when
creating the map. So IMO I'm just keeping to the already established
UAPI here.

That being said...

> Can we somehow use BTF to help us to make this extensible?

... this would not necessarily be a bad thing, it just needs to be done
on both the insert and lookup sides.

But I think this is a separate issue, which we need to solve anyway. And
I'm still not convinced that the map value is the right place to specify
what resources we want ;)

-Toke

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH net-next 2/2] devmap: Allow map lookups from eBPF
  2019-06-04 16:35   ` Jesper Dangaard Brouer
  2019-06-04 18:42     ` Toke Høiland-Jørgensen
@ 2019-06-04 19:41     ` Jonathan Lemon
  2019-06-04 20:00       ` Toke Høiland-Jørgensen
  1 sibling, 1 reply; 11+ messages in thread
From: Jonathan Lemon @ 2019-06-04 19:41 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: Toke Høiland-Jørgensen, David Miller, netdev,
	Daniel Borkmann, Alexei Starovoitov, Arnaldo Carvalho de Melo,
	Jiri Olsa

On 4 Jun 2019, at 9:35, Jesper Dangaard Brouer wrote:

> On Tue, 04 Jun 2019 17:24:10 +0200
> Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>
>> We don't currently allow lookups into a devmap from eBPF, because the map
>> lookup returns a pointer directly to the dev->ifindex, which shouldn't be
>> modifiable from eBPF.
>>
>> However, being able to do lookups in devmaps is useful to know (e.g.)
>> whether forwarding to a specific interface is enabled. Currently, programs
>> work around this by keeping a shadow map of another type which indicates
>> whether a map index is valid.
>>
>> To allow lookups, simply copy the ifindex into a scratch variable and
>> return a pointer to this. If an eBPF program does modify it, this doesn't
>> matter since it will be overridden on the next lookup anyway. While this
>> does add a write to every lookup, the overhead of this is negligible
>> because the cache line is hot when both the write and the subsequent
>> read happens.
>
> When we choose the return value, here the ifindex, then this basically
> becomes UABI, right?
>
> Can we somehow use BTF to help us to make this extensible?
>
> As Toke mention in the cover letter, we really want to know if the
> chosen egress have actually enabled/allocated resources for XDP
> transmitting, but as we currently don't have in-kernel way to query
> thus (thus, we cannot expose such info).

Would it be better to add a helper like bpf_map_element_present(), which
just returns a boolean value indicating whether the entry is NULL or not?

This would solve this problem (and my xskmap problem).
-- 
Jonathan

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH net-next 2/2] devmap: Allow map lookups from eBPF
  2019-06-04 19:41     ` Jonathan Lemon
@ 2019-06-04 20:00       ` Toke Høiland-Jørgensen
  0 siblings, 0 replies; 11+ messages in thread
From: Toke Høiland-Jørgensen @ 2019-06-04 20:00 UTC (permalink / raw)
  To: Jonathan Lemon, Jesper Dangaard Brouer
  Cc: David Miller, netdev, Daniel Borkmann, Alexei Starovoitov,
	Arnaldo Carvalho de Melo, Jiri Olsa

Jonathan Lemon <jonathan.lemon@gmail.com> writes:

> On 4 Jun 2019, at 9:35, Jesper Dangaard Brouer wrote:
>
>> On Tue, 04 Jun 2019 17:24:10 +0200
>> Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>>
>>> We don't currently allow lookups into a devmap from eBPF, because the map
>>> lookup returns a pointer directly to the dev->ifindex, which shouldn't be
>>> modifiable from eBPF.
>>>
>>> However, being able to do lookups in devmaps is useful to know (e.g.)
>>> whether forwarding to a specific interface is enabled. Currently, programs
>>> work around this by keeping a shadow map of another type which indicates
>>> whether a map index is valid.
>>>
>>> To allow lookups, simply copy the ifindex into a scratch variable and
>>> return a pointer to this. If an eBPF program does modify it, this doesn't
>>> matter since it will be overridden on the next lookup anyway. While this
>>> does add a write to every lookup, the overhead of this is negligible
>>> because the cache line is hot when both the write and the subsequent
>>> read happens.
>>
>> When we choose the return value, here the ifindex, then this basically
>> becomes UABI, right?
>>
>> Can we somehow use BTF to help us to make this extensible?
>>
>> As Toke mention in the cover letter, we really want to know if the
>> chosen egress have actually enabled/allocated resources for XDP
>> transmitting, but as we currently don't have in-kernel way to query
>> thus (thus, we cannot expose such info).
>
> Would it be better to add a helper like bpf_map_element_present(), which
> just returns a boolean value indicating whether the entry is NULL or not?
>
> This would solve this problem (and my xskmap problem).

Ah, totally missed that other thread; will go reply there :)

-Toke

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH net-next 2/2] devmap: Allow map lookups from eBPF
  2019-06-04 15:24 ` [PATCH net-next 2/2] devmap: Allow map lookups from eBPF Toke Høiland-Jørgensen
  2019-06-04 16:35   ` Jesper Dangaard Brouer
@ 2019-06-04 20:22   ` Toke Høiland-Jørgensen
  1 sibling, 0 replies; 11+ messages in thread
From: Toke Høiland-Jørgensen @ 2019-06-04 20:22 UTC (permalink / raw)
  To: David Miller
  Cc: netdev, Jesper Dangaard Brouer, Daniel Borkmann, Alexei Starovoitov

Toke Høiland-Jørgensen <toke@redhat.com> writes:

> We don't currently allow lookups into a devmap from eBPF, because the map
> lookup returns a pointer directly to the dev->ifindex, which shouldn't be
> modifiable from eBPF.
>
> However, being able to do lookups in devmaps is useful to know (e.g.)
> whether forwarding to a specific interface is enabled. Currently, programs
> work around this by keeping a shadow map of another type which indicates
> whether a map index is valid.
>
> To allow lookups, simply copy the ifindex into a scratch variable and
> return a pointer to this. If an eBPF program does modify it, this doesn't
> matter since it will be overridden on the next lookup anyway. While this
> does add a write to every lookup, the overhead of this is negligible
> because the cache line is hot when both the write and the subsequent read
> happens.
>
> Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
> ---
>  kernel/bpf/devmap.c   |    8 +++++++-
>  kernel/bpf/verifier.c |    7 ++-----
>  2 files changed, 9 insertions(+), 6 deletions(-)
>
> diff --git a/kernel/bpf/devmap.c b/kernel/bpf/devmap.c
> index 5ae7cce5ef16..830650300ea4 100644
> --- a/kernel/bpf/devmap.c
> +++ b/kernel/bpf/devmap.c
> @@ -65,6 +65,7 @@ struct xdp_bulk_queue {
>  struct bpf_dtab_netdev {
>  	struct net_device *dev; /* must be first member, due to tracepoint */
>  	struct bpf_dtab *dtab;
> +	int ifindex_scratch;

Just realised I forgot to make this per-cpu; I'll send an updated
version once we settle on a solution that works for xskmap as well...

-Toke

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH net-next 1/2] bpf_xdp_redirect_map: Add flag to return XDP_PASS on map lookup failure
  2019-06-04 15:24 ` [PATCH net-next 1/2] bpf_xdp_redirect_map: Add flag to return XDP_PASS on map lookup failure Toke Høiland-Jørgensen
@ 2019-06-05 10:39   ` Jesper Dangaard Brouer
  2019-06-05 15:09     ` Jonathan Lemon
  2019-06-06 10:01     ` Toke Høiland-Jørgensen
  0 siblings, 2 replies; 11+ messages in thread
From: Jesper Dangaard Brouer @ 2019-06-05 10:39 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen
  Cc: David Miller, netdev, Daniel Borkmann, Alexei Starovoitov,
	brouer, Jonathan Lemon


On Tue, 04 Jun 2019 17:24:10 +0200
Toke Høiland-Jørgensen <toke@redhat.com> wrote:

> The bpf_redirect_map() helper used by XDP programs doesn't return any
> indication of whether it can successfully redirect to the map index it was
> given. Instead, BPF programs have to track this themselves, leading to
> programs using duplicate maps to track which entries are populated in the
> devmap.
> 
> This adds a flag to the XDP version of the bpf_redirect_map() helper, which
> makes the helper do a lookup in the map when called, and return XDP_PASS if
> there is no value at the provided index. This enables two use cases:

To Jonathan Lemon, notice this approach of adding a flag to the helper
call, it actually also works for your use-case of XSK AF_XDP maps.

> - A BPF program can check the return code from the helper call and react if
>   it is XDP_PASS (by, for instance, redirecting out a different interface).
> 
> - Programs that just return the value of the bpf_redirect() call will
>   automatically fall back to the regular networking stack, simplifying
>   programs that (for instance) build a router with the fib_lookup() helper.
> 
> Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
> ---
>  include/uapi/linux/bpf.h |    8 ++++++++
>  net/core/filter.c        |   10 +++++++++-
>  2 files changed, 17 insertions(+), 1 deletion(-)
> 
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index 7c6aef253173..4c41482b7604 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -3098,6 +3098,14 @@ enum xdp_action {
>  	XDP_REDIRECT,
>  };
>  
> +/* Flags for bpf_xdp_redirect_map helper */
> +
> +/* If set, the help will check if the entry exists in the map and return
> + * XDP_PASS if it doesn't.
> + */
> +#define XDP_REDIRECT_PASS_ON_INVALID BIT(0)
> +#define XDP_REDIRECT_ALL_FLAGS XDP_REDIRECT_PASS_ON_INVALID
> +
>  /* user accessible metadata for XDP packet hook
>   * new fields must be added to the end of this structure
>   */
> diff --git a/net/core/filter.c b/net/core/filter.c
> index 55bfc941d17a..dfab8478f66c 100644
> --- a/net/core/filter.c
> +++ b/net/core/filter.c
> @@ -3755,9 +3755,17 @@ BPF_CALL_3(bpf_xdp_redirect_map, struct bpf_map *, map, u32, ifindex,
>  {
>  	struct bpf_redirect_info *ri = this_cpu_ptr(&bpf_redirect_info);
>  
> -	if (unlikely(flags))
> +	if (unlikely(flags & ~XDP_REDIRECT_ALL_FLAGS))
>  		return XDP_ABORTED;
>  
> +	if (flags & XDP_REDIRECT_PASS_ON_INVALID) {
> +		struct net_device *fwd;

It is slightly misguiding that '*fwd' is a 'struct net_device', as the
__xdp_map_lookup_elem() call works for all the supported redirect-map
types.

People should realize that this patch is a general approach for all the
redirect-map types.

> +
> +		fwd = __xdp_map_lookup_elem(map, ifindex);
> +		if (unlikely(!fwd))
> +			return XDP_PASS;
> +	}
> +
>  	ri->ifindex = ifindex;
>  	ri->flags = flags;
>  	WRITE_ONCE(ri->map, map);


-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH net-next 1/2] bpf_xdp_redirect_map: Add flag to return XDP_PASS on map lookup failure
  2019-06-05 10:39   ` Jesper Dangaard Brouer
@ 2019-06-05 15:09     ` Jonathan Lemon
  2019-06-06 10:01     ` Toke Høiland-Jørgensen
  1 sibling, 0 replies; 11+ messages in thread
From: Jonathan Lemon @ 2019-06-05 15:09 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: Toke Høiland-Jørgensen, David Miller, netdev,
	Daniel Borkmann, Alexei Starovoitov



On 5 Jun 2019, at 3:39, Jesper Dangaard Brouer wrote:

> On Tue, 04 Jun 2019 17:24:10 +0200
> Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>
>> The bpf_redirect_map() helper used by XDP programs doesn't return any
>> indication of whether it can successfully redirect to the map index 
>> it was
>> given. Instead, BPF programs have to track this themselves, leading 
>> to
>> programs using duplicate maps to track which entries are populated in 
>> the
>> devmap.
>>
>> This adds a flag to the XDP version of the bpf_redirect_map() helper, 
>> which
>> makes the helper do a lookup in the map when called, and return 
>> XDP_PASS if
>> there is no value at the provided index. This enables two use cases:
>
> To Jonathan Lemon, notice this approach of adding a flag to the helper
> call, it actually also works for your use-case of XSK AF_XDP maps.c

Hmm, yes, that should work also.

I have a patch which returns a XDP_SOCK type from the xskmap.
This could be used with a new helper for redirection directly to a 
socket
(instead of looking the socket up a second time).

While flexible, the downside is that this won't apply to devmaps.
-- 
Jonathan

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH net-next 1/2] bpf_xdp_redirect_map: Add flag to return XDP_PASS on map lookup failure
  2019-06-05 10:39   ` Jesper Dangaard Brouer
  2019-06-05 15:09     ` Jonathan Lemon
@ 2019-06-06 10:01     ` Toke Høiland-Jørgensen
  1 sibling, 0 replies; 11+ messages in thread
From: Toke Høiland-Jørgensen @ 2019-06-06 10:01 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: David Miller, netdev, Daniel Borkmann, Alexei Starovoitov,
	brouer, Jonathan Lemon

Jesper Dangaard Brouer <brouer@redhat.com> writes:

> On Tue, 04 Jun 2019 17:24:10 +0200
> Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>
>> The bpf_redirect_map() helper used by XDP programs doesn't return any
>> indication of whether it can successfully redirect to the map index it was
>> given. Instead, BPF programs have to track this themselves, leading to
>> programs using duplicate maps to track which entries are populated in the
>> devmap.
>> 
>> This adds a flag to the XDP version of the bpf_redirect_map() helper, which
>> makes the helper do a lookup in the map when called, and return XDP_PASS if
>> there is no value at the provided index. This enables two use cases:
>
> To Jonathan Lemon, notice this approach of adding a flag to the helper
> call, it actually also works for your use-case of XSK AF_XDP maps.
>
>> - A BPF program can check the return code from the helper call and react if
>>   it is XDP_PASS (by, for instance, redirecting out a different interface).
>> 
>> - Programs that just return the value of the bpf_redirect() call will
>>   automatically fall back to the regular networking stack, simplifying
>>   programs that (for instance) build a router with the fib_lookup() helper.
>> 
>> Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
>> ---
>>  include/uapi/linux/bpf.h |    8 ++++++++
>>  net/core/filter.c        |   10 +++++++++-
>>  2 files changed, 17 insertions(+), 1 deletion(-)
>> 
>> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
>> index 7c6aef253173..4c41482b7604 100644
>> --- a/include/uapi/linux/bpf.h
>> +++ b/include/uapi/linux/bpf.h
>> @@ -3098,6 +3098,14 @@ enum xdp_action {
>>  	XDP_REDIRECT,
>>  };
>>  
>> +/* Flags for bpf_xdp_redirect_map helper */
>> +
>> +/* If set, the help will check if the entry exists in the map and return
>> + * XDP_PASS if it doesn't.
>> + */
>> +#define XDP_REDIRECT_PASS_ON_INVALID BIT(0)
>> +#define XDP_REDIRECT_ALL_FLAGS XDP_REDIRECT_PASS_ON_INVALID
>> +
>>  /* user accessible metadata for XDP packet hook
>>   * new fields must be added to the end of this structure
>>   */
>> diff --git a/net/core/filter.c b/net/core/filter.c
>> index 55bfc941d17a..dfab8478f66c 100644
>> --- a/net/core/filter.c
>> +++ b/net/core/filter.c
>> @@ -3755,9 +3755,17 @@ BPF_CALL_3(bpf_xdp_redirect_map, struct bpf_map *, map, u32, ifindex,
>>  {
>>  	struct bpf_redirect_info *ri = this_cpu_ptr(&bpf_redirect_info);
>>  
>> -	if (unlikely(flags))
>> +	if (unlikely(flags & ~XDP_REDIRECT_ALL_FLAGS))
>>  		return XDP_ABORTED;
>>  
>> +	if (flags & XDP_REDIRECT_PASS_ON_INVALID) {
>> +		struct net_device *fwd;
>
> It is slightly misguiding that '*fwd' is a 'struct net_device', as the
> __xdp_map_lookup_elem() call works for all the supported redirect-map
> types.
>
> People should realize that this patch is a general approach for all the
> redirect-map types.

Good point, will fix! :)

-Toke

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2019-06-06 10:01 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-06-04 15:24 [PATCH net-next 0/2] xdp: Allow lookup into devmaps before redirect Toke Høiland-Jørgensen
2019-06-04 15:24 ` [PATCH net-next 1/2] bpf_xdp_redirect_map: Add flag to return XDP_PASS on map lookup failure Toke Høiland-Jørgensen
2019-06-05 10:39   ` Jesper Dangaard Brouer
2019-06-05 15:09     ` Jonathan Lemon
2019-06-06 10:01     ` Toke Høiland-Jørgensen
2019-06-04 15:24 ` [PATCH net-next 2/2] devmap: Allow map lookups from eBPF Toke Høiland-Jørgensen
2019-06-04 16:35   ` Jesper Dangaard Brouer
2019-06-04 18:42     ` Toke Høiland-Jørgensen
2019-06-04 19:41     ` Jonathan Lemon
2019-06-04 20:00       ` Toke Høiland-Jørgensen
2019-06-04 20:22   ` Toke Høiland-Jørgensen

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).