BPF Archive on lore.kernel.org
 help / color / Atom feed
* [PATCH bpf-next v2] bpf_fib_lookup: optionally skip neighbour lookup
@ 2020-10-09 10:13 Toke Høiland-Jørgensen
  2020-10-09 16:17 ` David Ahern
  0 siblings, 1 reply; 7+ messages in thread
From: Toke Høiland-Jørgensen @ 2020-10-09 10:13 UTC (permalink / raw)
  To: daniel, ast; +Cc: Toke Høiland-Jørgensen, bpf, netdev, David Ahern

The bpf_fib_lookup() helper performs a neighbour lookup for the destination
IP and returns BPF_FIB_LKUP_NO_NEIGH if this fails, with the expectation
that the BPF program will pass the packet up the stack in this case.
However, with the addition of bpf_redirect_neigh() that can be used instead
to perform the neighbour lookup, at the cost of a bit of duplicated work.

For that we still need the target ifindex, and since bpf_fib_lookup()
already has that at the time it performs the neighbour lookup, there is
really no reason why it can't just return it in any case. So let's just
always return the ifindex, and also add a flag that lets the caller turn
off the neighbour lookup entirely in bpf_fib_lookup().

v2:
- Add flag (Daniel)
- Remove misleading code example from commit message (David)

Cc: David Ahern <dsahern@gmail.com>
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
---
 include/uapi/linux/bpf.h       | 10 ++++++----
 net/core/filter.c              | 15 ++++++++++++---
 tools/include/uapi/linux/bpf.h | 10 ++++++----
 3 files changed, 24 insertions(+), 11 deletions(-)

diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index d83561e8cd2c..9c7c10ce7ace 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -4813,12 +4813,14 @@ struct bpf_raw_tracepoint_args {
 	__u64 args[0];
 };
 
-/* DIRECT:  Skip the FIB rules and go to FIB table associated with device
- * OUTPUT:  Do lookup from egress perspective; default is ingress
+/* DIRECT:      Skip the FIB rules and go to FIB table associated with device
+ * OUTPUT:      Do lookup from egress perspective; default is ingress
+ * SKIP_NEIGH:  Skip neighbour lookup and return BPF_FIB_LKUP_RET_NO_NEIGH on success
  */
 enum {
-	BPF_FIB_LOOKUP_DIRECT  = (1U << 0),
-	BPF_FIB_LOOKUP_OUTPUT  = (1U << 1),
+	BPF_FIB_LOOKUP_DIRECT      = (1U << 0),
+	BPF_FIB_LOOKUP_OUTPUT      = (1U << 1),
+	BPF_FIB_LOOKUP_SKIP_NEIGH  = (1U << 2),
 };
 
 enum {
diff --git a/net/core/filter.c b/net/core/filter.c
index 05df73780dd3..1038337bc06c 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -5192,7 +5192,6 @@ static int bpf_fib_set_fwd_params(struct bpf_fib_lookup *params,
 	memcpy(params->smac, dev->dev_addr, ETH_ALEN);
 	params->h_vlan_TCI = 0;
 	params->h_vlan_proto = 0;
-	params->ifindex = dev->ifindex;
 
 	return 0;
 }
@@ -5289,6 +5288,10 @@ static int bpf_ipv4_fib_lookup(struct net *net, struct bpf_fib_lookup *params,
 	dev = nhc->nhc_dev;
 
 	params->rt_metric = res.fi->fib_priority;
+	params->ifindex = dev->ifindex;
+
+	if (flags & BPF_FIB_LOOKUP_SKIP_NEIGH)
+		return BPF_FIB_LKUP_RET_NO_NEIGH;
 
 	/* xdp and cls_bpf programs are run in RCU-bh so
 	 * rcu_read_lock_bh is not needed here
@@ -5414,6 +5417,10 @@ static int bpf_ipv6_fib_lookup(struct net *net, struct bpf_fib_lookup *params,
 
 	dev = res.nh->fib_nh_dev;
 	params->rt_metric = res.f6i->fib6_metric;
+	params->ifindex = dev->ifindex;
+
+	if (flags & BPF_FIB_LOOKUP_SKIP_NEIGH)
+		return BPF_FIB_LKUP_RET_NO_NEIGH;
 
 	/* xdp and cls_bpf programs are run in RCU-bh so rcu_read_lock_bh is
 	 * not needed here.
@@ -5432,7 +5439,8 @@ BPF_CALL_4(bpf_xdp_fib_lookup, struct xdp_buff *, ctx,
 	if (plen < sizeof(*params))
 		return -EINVAL;
 
-	if (flags & ~(BPF_FIB_LOOKUP_DIRECT | BPF_FIB_LOOKUP_OUTPUT))
+	if (flags & ~(BPF_FIB_LOOKUP_DIRECT | BPF_FIB_LOOKUP_OUTPUT |
+		      BPF_FIB_LOOKUP_SKIP_NEIGH))
 		return -EINVAL;
 
 	switch (params->family) {
@@ -5469,7 +5477,8 @@ BPF_CALL_4(bpf_skb_fib_lookup, struct sk_buff *, skb,
 	if (plen < sizeof(*params))
 		return -EINVAL;
 
-	if (flags & ~(BPF_FIB_LOOKUP_DIRECT | BPF_FIB_LOOKUP_OUTPUT))
+	if (flags & ~(BPF_FIB_LOOKUP_DIRECT | BPF_FIB_LOOKUP_OUTPUT |
+		      BPF_FIB_LOOKUP_SKIP_NEIGH))
 		return -EINVAL;
 
 	switch (params->family) {
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index d83561e8cd2c..9c7c10ce7ace 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -4813,12 +4813,14 @@ struct bpf_raw_tracepoint_args {
 	__u64 args[0];
 };
 
-/* DIRECT:  Skip the FIB rules and go to FIB table associated with device
- * OUTPUT:  Do lookup from egress perspective; default is ingress
+/* DIRECT:      Skip the FIB rules and go to FIB table associated with device
+ * OUTPUT:      Do lookup from egress perspective; default is ingress
+ * SKIP_NEIGH:  Skip neighbour lookup and return BPF_FIB_LKUP_RET_NO_NEIGH on success
  */
 enum {
-	BPF_FIB_LOOKUP_DIRECT  = (1U << 0),
-	BPF_FIB_LOOKUP_OUTPUT  = (1U << 1),
+	BPF_FIB_LOOKUP_DIRECT      = (1U << 0),
+	BPF_FIB_LOOKUP_OUTPUT      = (1U << 1),
+	BPF_FIB_LOOKUP_SKIP_NEIGH  = (1U << 2),
 };
 
 enum {
-- 
2.28.0


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH bpf-next v2] bpf_fib_lookup: optionally skip neighbour lookup
  2020-10-09 10:13 [PATCH bpf-next v2] bpf_fib_lookup: optionally skip neighbour lookup Toke Høiland-Jørgensen
@ 2020-10-09 16:17 ` David Ahern
  2020-10-09 18:42   ` Toke Høiland-Jørgensen
  0 siblings, 1 reply; 7+ messages in thread
From: David Ahern @ 2020-10-09 16:17 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen, daniel, ast; +Cc: bpf, netdev, David Ahern

On 10/9/20 3:13 AM, Toke Høiland-Jørgensen wrote:
> The bpf_fib_lookup() helper performs a neighbour lookup for the destination
> IP and returns BPF_FIB_LKUP_NO_NEIGH if this fails, with the expectation
> that the BPF program will pass the packet up the stack in this case.
> However, with the addition of bpf_redirect_neigh() that can be used instead
> to perform the neighbour lookup, at the cost of a bit of duplicated work.
> 
> For that we still need the target ifindex, and since bpf_fib_lookup()
> already has that at the time it performs the neighbour lookup, there is
> really no reason why it can't just return it in any case. So let's just
> always return the ifindex, and also add a flag that lets the caller turn
> off the neighbour lookup entirely in bpf_fib_lookup().

seems really odd to do the fib lookup only to skip the neighbor lookup
and defer to a second helper to do a second fib lookup and send out.

The better back-to-back calls is to return the ifindex and gateway on
successful fib lookup regardless of valid neighbor. If the call to
bpf_redirect_neigh is needed, it can have a flag to skip the fib lookup
and just redirect to the given nexthop address + ifindex. ie.,
bpf_redirect_neigh only does neighbor handling in this case.


> 
> v2:
> - Add flag (Daniel)
> - Remove misleading code example from commit message (David)
> 
> Cc: David Ahern <dsahern@gmail.com>
> Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
> ---
>  include/uapi/linux/bpf.h       | 10 ++++++----
>  net/core/filter.c              | 15 ++++++++++++---
>  tools/include/uapi/linux/bpf.h | 10 ++++++----
>  3 files changed, 24 insertions(+), 11 deletions(-)
> 
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index d83561e8cd2c..9c7c10ce7ace 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -4813,12 +4813,14 @@ struct bpf_raw_tracepoint_args {
>  	__u64 args[0];
>  };
>  
> -/* DIRECT:  Skip the FIB rules and go to FIB table associated with device
> - * OUTPUT:  Do lookup from egress perspective; default is ingress
> +/* DIRECT:      Skip the FIB rules and go to FIB table associated with device
> + * OUTPUT:      Do lookup from egress perspective; default is ingress
> + * SKIP_NEIGH:  Skip neighbour lookup and return BPF_FIB_LKUP_RET_NO_NEIGH on success
>   */
>  enum {
> -	BPF_FIB_LOOKUP_DIRECT  = (1U << 0),
> -	BPF_FIB_LOOKUP_OUTPUT  = (1U << 1),
> +	BPF_FIB_LOOKUP_DIRECT      = (1U << 0),
> +	BPF_FIB_LOOKUP_OUTPUT      = (1U << 1),
> +	BPF_FIB_LOOKUP_SKIP_NEIGH  = (1U << 2),
>  };
>  
>  enum {
> diff --git a/net/core/filter.c b/net/core/filter.c
> index 05df73780dd3..1038337bc06c 100644
> --- a/net/core/filter.c
> +++ b/net/core/filter.c
> @@ -5192,7 +5192,6 @@ static int bpf_fib_set_fwd_params(struct bpf_fib_lookup *params,
>  	memcpy(params->smac, dev->dev_addr, ETH_ALEN);
>  	params->h_vlan_TCI = 0;
>  	params->h_vlan_proto = 0;
> -	params->ifindex = dev->ifindex;
>  
>  	return 0;
>  }
> @@ -5289,6 +5288,10 @@ static int bpf_ipv4_fib_lookup(struct net *net, struct bpf_fib_lookup *params,
>  	dev = nhc->nhc_dev;
>  
>  	params->rt_metric = res.fi->fib_priority;
> +	params->ifindex = dev->ifindex;
> +
> +	if (flags & BPF_FIB_LOOKUP_SKIP_NEIGH)
> +		return BPF_FIB_LKUP_RET_NO_NEIGH;
>  
>  	/* xdp and cls_bpf programs are run in RCU-bh so
>  	 * rcu_read_lock_bh is not needed here
> @@ -5414,6 +5417,10 @@ static int bpf_ipv6_fib_lookup(struct net *net, struct bpf_fib_lookup *params,
>  
>  	dev = res.nh->fib_nh_dev;
>  	params->rt_metric = res.f6i->fib6_metric;
> +	params->ifindex = dev->ifindex;
> +
> +	if (flags & BPF_FIB_LOOKUP_SKIP_NEIGH)
> +		return BPF_FIB_LKUP_RET_NO_NEIGH;
>  
>  	/* xdp and cls_bpf programs are run in RCU-bh so rcu_read_lock_bh is
>  	 * not needed here.
> @@ -5432,7 +5439,8 @@ BPF_CALL_4(bpf_xdp_fib_lookup, struct xdp_buff *, ctx,
>  	if (plen < sizeof(*params))
>  		return -EINVAL;
>  
> -	if (flags & ~(BPF_FIB_LOOKUP_DIRECT | BPF_FIB_LOOKUP_OUTPUT))
> +	if (flags & ~(BPF_FIB_LOOKUP_DIRECT | BPF_FIB_LOOKUP_OUTPUT |
> +		      BPF_FIB_LOOKUP_SKIP_NEIGH))
>  		return -EINVAL;
>  
>  	switch (params->family) {
> @@ -5469,7 +5477,8 @@ BPF_CALL_4(bpf_skb_fib_lookup, struct sk_buff *, skb,
>  	if (plen < sizeof(*params))
>  		return -EINVAL;
>  
> -	if (flags & ~(BPF_FIB_LOOKUP_DIRECT | BPF_FIB_LOOKUP_OUTPUT))
> +	if (flags & ~(BPF_FIB_LOOKUP_DIRECT | BPF_FIB_LOOKUP_OUTPUT |
> +		      BPF_FIB_LOOKUP_SKIP_NEIGH))
>  		return -EINVAL;
>  
>  	switch (params->family) {
> diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
> index d83561e8cd2c..9c7c10ce7ace 100644
> --- a/tools/include/uapi/linux/bpf.h
> +++ b/tools/include/uapi/linux/bpf.h
> @@ -4813,12 +4813,14 @@ struct bpf_raw_tracepoint_args {
>  	__u64 args[0];
>  };
>  
> -/* DIRECT:  Skip the FIB rules and go to FIB table associated with device
> - * OUTPUT:  Do lookup from egress perspective; default is ingress
> +/* DIRECT:      Skip the FIB rules and go to FIB table associated with device
> + * OUTPUT:      Do lookup from egress perspective; default is ingress
> + * SKIP_NEIGH:  Skip neighbour lookup and return BPF_FIB_LKUP_RET_NO_NEIGH on success
>   */
>  enum {
> -	BPF_FIB_LOOKUP_DIRECT  = (1U << 0),
> -	BPF_FIB_LOOKUP_OUTPUT  = (1U << 1),
> +	BPF_FIB_LOOKUP_DIRECT      = (1U << 0),
> +	BPF_FIB_LOOKUP_OUTPUT      = (1U << 1),
> +	BPF_FIB_LOOKUP_SKIP_NEIGH  = (1U << 2),
>  };
>  
>  enum {
> 


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH bpf-next v2] bpf_fib_lookup: optionally skip neighbour lookup
  2020-10-09 16:17 ` David Ahern
@ 2020-10-09 18:42   ` Toke Høiland-Jørgensen
  2020-10-09 21:28     ` David Ahern
  0 siblings, 1 reply; 7+ messages in thread
From: Toke Høiland-Jørgensen @ 2020-10-09 18:42 UTC (permalink / raw)
  To: David Ahern, daniel, ast; +Cc: bpf, netdev, David Ahern

David Ahern <dsahern@gmail.com> writes:

> On 10/9/20 3:13 AM, Toke Høiland-Jørgensen wrote:
>> The bpf_fib_lookup() helper performs a neighbour lookup for the destination
>> IP and returns BPF_FIB_LKUP_NO_NEIGH if this fails, with the expectation
>> that the BPF program will pass the packet up the stack in this case.
>> However, with the addition of bpf_redirect_neigh() that can be used instead
>> to perform the neighbour lookup, at the cost of a bit of duplicated work.
>> 
>> For that we still need the target ifindex, and since bpf_fib_lookup()
>> already has that at the time it performs the neighbour lookup, there is
>> really no reason why it can't just return it in any case. So let's just
>> always return the ifindex, and also add a flag that lets the caller turn
>> off the neighbour lookup entirely in bpf_fib_lookup().
>
> seems really odd to do the fib lookup only to skip the neighbor lookup
> and defer to a second helper to do a second fib lookup and send out.
>
> The better back-to-back calls is to return the ifindex and gateway on
> successful fib lookup regardless of valid neighbor. If the call to
> bpf_redirect_neigh is needed, it can have a flag to skip the fib lookup
> and just redirect to the given nexthop address + ifindex. ie.,
> bpf_redirect_neigh only does neighbor handling in this case.

Hmm, yeah, I guess it would make sense to cache and reuse the lookup -
maybe stick it in bpf_redirect_info()? However, given the imminent
opening of the merge window, I don't see this landing before then. So
I'm going to respin this patch with just the original change to always
return the ifindex, then we can revisit the flags/reuse of the fib
lookup later.

-Toke


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH bpf-next v2] bpf_fib_lookup: optionally skip neighbour lookup
  2020-10-09 18:42   ` Toke Høiland-Jørgensen
@ 2020-10-09 21:28     ` David Ahern
  2020-10-09 23:05       ` Daniel Borkmann
  0 siblings, 1 reply; 7+ messages in thread
From: David Ahern @ 2020-10-09 21:28 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen, daniel, ast; +Cc: bpf, netdev

On 10/9/20 11:42 AM, Toke Høiland-Jørgensen wrote:
> David Ahern <dsahern@gmail.com> writes:
> 
>> On 10/9/20 3:13 AM, Toke Høiland-Jørgensen wrote:
>>> The bpf_fib_lookup() helper performs a neighbour lookup for the destination
>>> IP and returns BPF_FIB_LKUP_NO_NEIGH if this fails, with the expectation
>>> that the BPF program will pass the packet up the stack in this case.
>>> However, with the addition of bpf_redirect_neigh() that can be used instead
>>> to perform the neighbour lookup, at the cost of a bit of duplicated work.
>>>
>>> For that we still need the target ifindex, and since bpf_fib_lookup()
>>> already has that at the time it performs the neighbour lookup, there is
>>> really no reason why it can't just return it in any case. So let's just
>>> always return the ifindex, and also add a flag that lets the caller turn
>>> off the neighbour lookup entirely in bpf_fib_lookup().
>>
>> seems really odd to do the fib lookup only to skip the neighbor lookup
>> and defer to a second helper to do a second fib lookup and send out.
>>
>> The better back-to-back calls is to return the ifindex and gateway on
>> successful fib lookup regardless of valid neighbor. If the call to
>> bpf_redirect_neigh is needed, it can have a flag to skip the fib lookup
>> and just redirect to the given nexthop address + ifindex. ie.,
>> bpf_redirect_neigh only does neighbor handling in this case.
> 
> Hmm, yeah, I guess it would make sense to cache and reuse the lookup -
> maybe stick it in bpf_redirect_info()? However, given the imminent

That is not needed.

> opening of the merge window, I don't see this landing before then. So
> I'm going to respin this patch with just the original change to always
> return the ifindex, then we can revisit the flags/reuse of the fib
> lookup later.
> 

What I am suggesting is a change in API to bpf_redirect_neigh which
should be done now, before the merge window, before it comes a locked
API. Right now, bpf_redirect_neigh does a lookup to get the nexthop. It
should take the gateway as an input argument. If set, then the lookup is
not done - only the neighbor redirect.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH bpf-next v2] bpf_fib_lookup: optionally skip neighbour lookup
  2020-10-09 21:28     ` David Ahern
@ 2020-10-09 23:05       ` Daniel Borkmann
  2020-10-10 13:42         ` Toke Høiland-Jørgensen
  0 siblings, 1 reply; 7+ messages in thread
From: Daniel Borkmann @ 2020-10-09 23:05 UTC (permalink / raw)
  To: David Ahern, Toke Høiland-Jørgensen, ast; +Cc: bpf, netdev

On 10/9/20 11:28 PM, David Ahern wrote:
> On 10/9/20 11:42 AM, Toke Høiland-Jørgensen wrote:
>> David Ahern <dsahern@gmail.com> writes:
>>> On 10/9/20 3:13 AM, Toke Høiland-Jørgensen wrote:
>>>> The bpf_fib_lookup() helper performs a neighbour lookup for the destination
>>>> IP and returns BPF_FIB_LKUP_NO_NEIGH if this fails, with the expectation
>>>> that the BPF program will pass the packet up the stack in this case.
>>>> However, with the addition of bpf_redirect_neigh() that can be used instead
>>>> to perform the neighbour lookup, at the cost of a bit of duplicated work.
>>>>
>>>> For that we still need the target ifindex, and since bpf_fib_lookup()
>>>> already has that at the time it performs the neighbour lookup, there is
>>>> really no reason why it can't just return it in any case. So let's just
>>>> always return the ifindex, and also add a flag that lets the caller turn
>>>> off the neighbour lookup entirely in bpf_fib_lookup().
>>>
>>> seems really odd to do the fib lookup only to skip the neighbor lookup
>>> and defer to a second helper to do a second fib lookup and send out.
>>>
>>> The better back-to-back calls is to return the ifindex and gateway on
>>> successful fib lookup regardless of valid neighbor. If the call to
>>> bpf_redirect_neigh is needed, it can have a flag to skip the fib lookup
>>> and just redirect to the given nexthop address + ifindex. ie.,
>>> bpf_redirect_neigh only does neighbor handling in this case.
>>
>> Hmm, yeah, I guess it would make sense to cache and reuse the lookup -
>> maybe stick it in bpf_redirect_info()? However, given the imminent
> 
> That is not needed.
> 
>> opening of the merge window, I don't see this landing before then. So
>> I'm going to respin this patch with just the original change to always
>> return the ifindex, then we can revisit the flags/reuse of the fib
>> lookup later.
> 
> What I am suggesting is a change in API to bpf_redirect_neigh which
> should be done now, before the merge window, before it comes a locked
> API. Right now, bpf_redirect_neigh does a lookup to get the nexthop. It
> should take the gateway as an input argument. If set, then the lookup is
> not done - only the neighbor redirect.

Sounds like a reasonable extension, agree. API freeze is not merge win, but
final v5.10 tag in this case as it always has been. In case it's not in time,
we can simply just move flags to arg3 and add a reserved param as arg2 which
must be zero (and thus indicate to perform the lookup as-is). Later we could
extend to pass params similar as in fib_lookup helper for the gw.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH bpf-next v2] bpf_fib_lookup: optionally skip neighbour lookup
  2020-10-09 23:05       ` Daniel Borkmann
@ 2020-10-10 13:42         ` Toke Høiland-Jørgensen
  2020-10-11 20:04           ` Daniel Borkmann
  0 siblings, 1 reply; 7+ messages in thread
From: Toke Høiland-Jørgensen @ 2020-10-10 13:42 UTC (permalink / raw)
  To: Daniel Borkmann, David Ahern, ast; +Cc: bpf, netdev

Daniel Borkmann <daniel@iogearbox.net> writes:

> On 10/9/20 11:28 PM, David Ahern wrote:
>> On 10/9/20 11:42 AM, Toke Høiland-Jørgensen wrote:
>>> David Ahern <dsahern@gmail.com> writes:
>>>> On 10/9/20 3:13 AM, Toke Høiland-Jørgensen wrote:
>>>>> The bpf_fib_lookup() helper performs a neighbour lookup for the destination
>>>>> IP and returns BPF_FIB_LKUP_NO_NEIGH if this fails, with the expectation
>>>>> that the BPF program will pass the packet up the stack in this case.
>>>>> However, with the addition of bpf_redirect_neigh() that can be used instead
>>>>> to perform the neighbour lookup, at the cost of a bit of duplicated work.
>>>>>
>>>>> For that we still need the target ifindex, and since bpf_fib_lookup()
>>>>> already has that at the time it performs the neighbour lookup, there is
>>>>> really no reason why it can't just return it in any case. So let's just
>>>>> always return the ifindex, and also add a flag that lets the caller turn
>>>>> off the neighbour lookup entirely in bpf_fib_lookup().
>>>>
>>>> seems really odd to do the fib lookup only to skip the neighbor lookup
>>>> and defer to a second helper to do a second fib lookup and send out.
>>>>
>>>> The better back-to-back calls is to return the ifindex and gateway on
>>>> successful fib lookup regardless of valid neighbor. If the call to
>>>> bpf_redirect_neigh is needed, it can have a flag to skip the fib lookup
>>>> and just redirect to the given nexthop address + ifindex. ie.,
>>>> bpf_redirect_neigh only does neighbor handling in this case.
>>>
>>> Hmm, yeah, I guess it would make sense to cache and reuse the lookup -
>>> maybe stick it in bpf_redirect_info()? However, given the imminent
>> 
>> That is not needed.
>> 
>>> opening of the merge window, I don't see this landing before then. So
>>> I'm going to respin this patch with just the original change to always
>>> return the ifindex, then we can revisit the flags/reuse of the fib
>>> lookup later.
>> 
>> What I am suggesting is a change in API to bpf_redirect_neigh which
>> should be done now, before the merge window, before it comes a locked
>> API. Right now, bpf_redirect_neigh does a lookup to get the nexthop. It
>> should take the gateway as an input argument. If set, then the lookup is
>> not done - only the neighbor redirect.
>
> Sounds like a reasonable extension, agree. API freeze is not merge win, but
> final v5.10 tag in this case as it always has been. In case it's not in time,
> we can simply just move flags to arg3 and add a reserved param as arg2 which
> must be zero (and thus indicate to perform the lookup as-is). Later we could
> extend to pass params similar as in fib_lookup helper for the gw.

Right, I can take a look at this next week. Feel free to merge (v3 of)
this patch now, that change will be needed in any case I think...

-Toke


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH bpf-next v2] bpf_fib_lookup: optionally skip neighbour lookup
  2020-10-10 13:42         ` Toke Høiland-Jørgensen
@ 2020-10-11 20:04           ` Daniel Borkmann
  0 siblings, 0 replies; 7+ messages in thread
From: Daniel Borkmann @ 2020-10-11 20:04 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen, David Ahern, ast; +Cc: bpf, netdev

On 10/10/20 3:42 PM, Toke Høiland-Jørgensen wrote:
> Daniel Borkmann <daniel@iogearbox.net> writes:
>> On 10/9/20 11:28 PM, David Ahern wrote:
>>> On 10/9/20 11:42 AM, Toke Høiland-Jørgensen wrote:
>>>> David Ahern <dsahern@gmail.com> writes:
>>>>> On 10/9/20 3:13 AM, Toke Høiland-Jørgensen wrote:
>>>>>> The bpf_fib_lookup() helper performs a neighbour lookup for the destination
>>>>>> IP and returns BPF_FIB_LKUP_NO_NEIGH if this fails, with the expectation
>>>>>> that the BPF program will pass the packet up the stack in this case.
>>>>>> However, with the addition of bpf_redirect_neigh() that can be used instead
>>>>>> to perform the neighbour lookup, at the cost of a bit of duplicated work.
>>>>>>
>>>>>> For that we still need the target ifindex, and since bpf_fib_lookup()
>>>>>> already has that at the time it performs the neighbour lookup, there is
>>>>>> really no reason why it can't just return it in any case. So let's just
>>>>>> always return the ifindex, and also add a flag that lets the caller turn
>>>>>> off the neighbour lookup entirely in bpf_fib_lookup().
>>>>>
>>>>> seems really odd to do the fib lookup only to skip the neighbor lookup
>>>>> and defer to a second helper to do a second fib lookup and send out.
>>>>>
>>>>> The better back-to-back calls is to return the ifindex and gateway on
>>>>> successful fib lookup regardless of valid neighbor. If the call to
>>>>> bpf_redirect_neigh is needed, it can have a flag to skip the fib lookup
>>>>> and just redirect to the given nexthop address + ifindex. ie.,
>>>>> bpf_redirect_neigh only does neighbor handling in this case.
>>>>
>>>> Hmm, yeah, I guess it would make sense to cache and reuse the lookup -
>>>> maybe stick it in bpf_redirect_info()? However, given the imminent
>>>
>>> That is not needed.
>>>
>>>> opening of the merge window, I don't see this landing before then. So
>>>> I'm going to respin this patch with just the original change to always
>>>> return the ifindex, then we can revisit the flags/reuse of the fib
>>>> lookup later.
>>>
>>> What I am suggesting is a change in API to bpf_redirect_neigh which
>>> should be done now, before the merge window, before it comes a locked
>>> API. Right now, bpf_redirect_neigh does a lookup to get the nexthop. It
>>> should take the gateway as an input argument. If set, then the lookup is
>>> not done - only the neighbor redirect.
>>
>> Sounds like a reasonable extension, agree. API freeze is not merge win, but
>> final v5.10 tag in this case as it always has been. In case it's not in time,
>> we can simply just move flags to arg3 and add a reserved param as arg2 which
>> must be zero (and thus indicate to perform the lookup as-is). Later we could
>> extend to pass params similar as in fib_lookup helper for the gw.
> 
> Right, I can take a look at this next week. Feel free to merge (v3 of)
> this patch now, that change will be needed in any case I think...

Ok, sounds reasonable, done. Lets fix the remaining one as David suggested until
rc1, at latest rc2 time frame. I'll be mostly offline next week during the day,
but happy to help till that deadline as well.

Thanks,
Daniel

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, back to index

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-10-09 10:13 [PATCH bpf-next v2] bpf_fib_lookup: optionally skip neighbour lookup Toke Høiland-Jørgensen
2020-10-09 16:17 ` David Ahern
2020-10-09 18:42   ` Toke Høiland-Jørgensen
2020-10-09 21:28     ` David Ahern
2020-10-09 23:05       ` Daniel Borkmann
2020-10-10 13:42         ` Toke Høiland-Jørgensen
2020-10-11 20:04           ` Daniel Borkmann

BPF Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/bpf/0 bpf/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 bpf bpf/ https://lore.kernel.org/bpf \
		bpf@vger.kernel.org
	public-inbox-index bpf

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.bpf


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git