* [PATCH net-next v3 0/2] net: introduce and use route hint
@ 2019-11-19 14:38 Paolo Abeni
  2019-11-19 14:38 ` [PATCH net-next v3 1/2] ipv6: introduce and uses route look hints for list input Paolo Abeni
                   ` (2 more replies)
  0 siblings, 3 replies; 16+ messages in thread
From: Paolo Abeni @ 2019-11-19 14:38 UTC (permalink / raw)
  To: netdev; +Cc: David S. Miller, Willem de Bruijn, Edward Cree, David Ahern

This series leverages the listification infrastructure to avoid
unnecessary route lookups on ingress packets. In the absence of policy
routing, packets with equal daddr will usually land on the same dst.

When processing packet bursts (lists) we can easily reference the previous
dst entry. When we hit the 'same destination' condition we can avoid the
route lookup, copying the already available dst.
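
As a rough userspace illustration of the idea (not kernel code: every toy_* name and the three-entry table are made up for this sketch), the loop below walks a burst, reuses the previous packet's dst when the destination address matches, and counts how many full lookups were still needed:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's dst_entry and sk_buff. */
struct toy_dst { uint32_t daddr; };
struct toy_pkt { uint32_t daddr; const struct toy_dst *dst; };

/* Toy "routing table": one dst per known destination address. */
static const struct toy_dst toy_table[] = {
	{ 0x0a000001 }, { 0x0a000002 }, { 0x0a000003 },
};

static size_t toy_lookups;	/* number of full lookups performed */

static const struct toy_dst *toy_route_lookup(uint32_t daddr)
{
	size_t i;

	toy_lookups++;
	for (i = 0; i < sizeof(toy_table) / sizeof(toy_table[0]); i++)
		if (toy_table[i].daddr == daddr)
			return &toy_table[i];
	return NULL;
}

/* Process a burst; returns how many full lookups were needed. */
static size_t toy_list_rcv(struct toy_pkt *pkts, size_t n)
{
	const struct toy_pkt *hint = NULL;
	size_t i;

	assert(pkts || n == 0);
	toy_lookups = 0;
	for (i = 0; i < n; i++) {
		struct toy_pkt *p = &pkts[i];

		if (hint && hint->daddr == p->daddr)
			p->dst = hint->dst;	/* same destination: copy the dst */
		else
			p->dst = toy_route_lookup(p->daddr);

		/* simplification: any packet with a dst can act as the hint */
		hint = p->dst ? p : NULL;
	}
	return toy_lookups;
}
```

With a burst whose destinations are A, A, A, B, B, toy_list_rcv() performs two full lookups instead of five; when every packet has a different destination it falls back to one lookup per packet.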

Detailed performance numbers are available in the individual commit
messages. Figures are slightly better than in the previous iteration
because, thanks to Willem's suggestion, we additionally skip early demux
when using the route hint.

v2 -> v3:
 - use fib*_has_custom_rules() helpers (David A.)
 - add ip*_extract_route_hint() helper (Edward C.)
 - use prev skb as hint instead of copying data (Willem)

v1 -> v2:
 - fix build issue with !CONFIG_IP*_MULTIPLE_TABLES
 - fix potential race in ip6_list_rcv_finish()

Paolo Abeni (2):
  ipv6: introduce and uses route look hints for list input
  ipv4: use dst hint for ipv4 list receive

 include/net/ip6_fib.h   |  9 +++++++++
 include/net/ip_fib.h    | 10 ++++++++++
 include/net/route.h     |  4 ++++
 net/ipv4/fib_frontend.c | 10 ----------
 net/ipv4/ip_input.c     | 35 +++++++++++++++++++++++++++++++----
 net/ipv4/route.c        | 37 +++++++++++++++++++++++++++++++++++++
 net/ipv6/ip6_input.c    | 26 ++++++++++++++++++++++++--
 7 files changed, 115 insertions(+), 16 deletions(-)

-- 
2.21.0


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH net-next v3 1/2] ipv6: introduce and uses route look hints for list input
  2019-11-19 14:38 [PATCH net-next v3 0/2] net: introduce and use route hint Paolo Abeni
@ 2019-11-19 14:38 ` Paolo Abeni
  2019-11-19 15:39   ` David Ahern
  2019-11-19 17:34   ` Eric Dumazet
  2019-11-19 14:38 ` [PATCH net-next v3 2/2] ipv4: use dst hint for ipv4 list receive Paolo Abeni
  2019-11-20  2:47 ` [PATCH net-next v3 0/2] net: introduce and use route hint David Miller
  2 siblings, 2 replies; 16+ messages in thread
From: Paolo Abeni @ 2019-11-19 14:38 UTC (permalink / raw)
  To: netdev; +Cc: David S. Miller, Willem de Bruijn, Edward Cree, David Ahern

When doing RX batch packet processing, we currently always repeat
the route lookup for each ingress packet. If policy routing is not
configured, and IPV6_SUBTREES is disabled at build time, we
know that packets with the same destination address will use
the same dst.

This change tries to avoid the per-packet route lookup by caching
the destination address of the latest successful lookup, and
reusing it for the next packet when the above conditions are
in place. Ingress traffic for most servers should fit.
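
A minimal userspace sketch of the "same destination" test for IPv6, for illustration only. The toy_* types are hypothetical; comparing the address as two 64-bit words is an assumption about how such an equality check can be done cheaply, not a copy of the kernel's ipv6_addr_equal():

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 128-bit address and packet types. */
struct toy_in6 { uint8_t b[16]; };
struct toy_pkt6 { struct toy_in6 daddr; const void *dst; };

/* Compare two addresses as a pair of 64-bit words. */
static bool toy_in6_equal(const struct toy_in6 *a, const struct toy_in6 *b)
{
	uint64_t wa[2], wb[2];

	assert(a && b);
	memcpy(wa, a->b, sizeof(wa));
	memcpy(wb, b->b, sizeof(wb));
	return ((wa[0] ^ wb[0]) | (wa[1] ^ wb[1])) == 0;
}

/* The hint is usable only if it exists, the packet has no dst attached
 * yet, and both packets carry the same destination address.
 */
static bool toy_ip6_can_use_hint(const struct toy_pkt6 *pkt,
				 const struct toy_pkt6 *hint)
{
	return hint && !pkt->dst && toy_in6_equal(&hint->daddr, &pkt->daddr);
}
```

The predicate deliberately says nothing about policy routing or subtrees: in this sketch, as in the description above, those conditions decide whether a hint is handed out at all.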

The measured performance delta under UDP flood vs a recvmmsg
receiver is as follows:

vanilla		patched		delta
Kpps		Kpps		%
1431		1674		+17
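
The delta column follows directly from the two throughput figures. A small sketch of the arithmetic, rounding to the nearest whole percent (the helper name is made up for illustration):

```c
#include <assert.h>

/* Relative gain of 'patched' over 'vanilla', rounded to the nearest
 * percent. Assumes patched >= vanilla > 0, as in the table above.
 */
static long toy_delta_percent(long vanilla, long patched)
{
	assert(vanilla > 0 && patched >= vanilla);
	return (100 * (patched - vanilla) + vanilla / 2) / vanilla;
}
```

toy_delta_percent(1431, 1674) evaluates to 17, since (1674 - 1431) / 1431 is roughly 0.1698.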

In the worst-case scenario - each packet has a different
destination address - the performance delta is within noise
range.

v2 -> v3:
 - add fib6_has_custom_rules() helpers (David A.)
 - add ip6_extract_route_hint() helper (Edward C.)
 - use hint directly in ip6_list_rcv_finish() (Willem)

v1 -> v2:
 - fix build issue with !CONFIG_IPV6_MULTIPLE_TABLES
 - fix potential race when fib6_has_custom_rules is set
   while processing a packet batch

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
 include/net/ip6_fib.h |  9 +++++++++
 net/ipv6/ip6_input.c  | 26 ++++++++++++++++++++++++--
 2 files changed, 33 insertions(+), 2 deletions(-)

diff --git a/include/net/ip6_fib.h b/include/net/ip6_fib.h
index 5d1615463138..9ab60611b97b 100644
--- a/include/net/ip6_fib.h
+++ b/include/net/ip6_fib.h
@@ -502,6 +502,11 @@ static inline bool fib6_metric_locked(struct fib6_info *f6i, int metric)
 }
 
 #ifdef CONFIG_IPV6_MULTIPLE_TABLES
+static inline bool fib6_has_custom_rules(struct net *net)
+{
+	return net->ipv6.fib6_has_custom_rules;
+}
+
 int fib6_rules_init(void);
 void fib6_rules_cleanup(void);
 bool fib6_rule_default(const struct fib_rule *rule);
@@ -527,6 +532,10 @@ static inline bool fib6_rules_early_flow_dissect(struct net *net,
 	return true;
 }
 #else
+static inline bool fib6_has_custom_rules(struct net *net)
+{
+	return 0;
+}
 static inline int               fib6_rules_init(void)
 {
 	return 0;
diff --git a/net/ipv6/ip6_input.c b/net/ipv6/ip6_input.c
index ef7f707d9ae3..792b52aa9fc9 100644
--- a/net/ipv6/ip6_input.c
+++ b/net/ipv6/ip6_input.c
@@ -59,6 +59,7 @@ static void ip6_rcv_finish_core(struct net *net, struct sock *sk,
 			INDIRECT_CALL_2(edemux, tcp_v6_early_demux,
 					udp_v6_early_demux, skb);
 	}
+
 	if (!skb_valid_dst(skb))
 		ip6_route_input(skb);
 }
@@ -86,11 +87,26 @@ static void ip6_sublist_rcv_finish(struct list_head *head)
 	}
 }
 
+static bool ip6_can_use_hint(struct sk_buff *skb, const struct sk_buff *hint)
+{
+	return hint && !skb_dst(skb) &&
+	       ipv6_addr_equal(&ipv6_hdr(hint)->daddr, &ipv6_hdr(skb)->daddr);
+}
+
+static struct sk_buff *ip6_extract_route_hint(struct net *net,
+					      struct sk_buff *skb)
+{
+	if (IS_ENABLED(IPV6_SUBTREES) || fib6_has_custom_rules(net))
+		return NULL;
+
+	return skb;
+}
+
 static void ip6_list_rcv_finish(struct net *net, struct sock *sk,
 				struct list_head *head)
 {
+	struct sk_buff *skb, *next, *hint = NULL;
 	struct dst_entry *curr_dst = NULL;
-	struct sk_buff *skb, *next;
 	struct list_head sublist;
 
 	INIT_LIST_HEAD(&sublist);
@@ -104,9 +120,15 @@ static void ip6_list_rcv_finish(struct net *net, struct sock *sk,
 		skb = l3mdev_ip6_rcv(skb);
 		if (!skb)
 			continue;
-		ip6_rcv_finish_core(net, sk, skb);
+
+		if (ip6_can_use_hint(skb, hint))
+			skb_dst_copy(skb, hint);
+		else
+			ip6_rcv_finish_core(net, sk, skb);
 		dst = skb_dst(skb);
 		if (curr_dst != dst) {
+			hint = ip6_extract_route_hint(net, skb);
+
 			/* dispatch old sublist */
 			if (!list_empty(&sublist))
 				ip6_sublist_rcv_finish(&sublist);
-- 
2.21.0



* [PATCH net-next v3 2/2] ipv4: use dst hint for ipv4 list receive
  2019-11-19 14:38 [PATCH net-next v3 0/2] net: introduce and use route hint Paolo Abeni
  2019-11-19 14:38 ` [PATCH net-next v3 1/2] ipv6: introduce and uses route look hints for list input Paolo Abeni
@ 2019-11-19 14:38 ` Paolo Abeni
  2019-11-19 16:00   ` David Ahern
  2019-11-20  2:47 ` [PATCH net-next v3 0/2] net: introduce and use route hint David Miller
  2 siblings, 1 reply; 16+ messages in thread
From: Paolo Abeni @ 2019-11-19 14:38 UTC (permalink / raw)
  To: netdev; +Cc: David S. Miller, Willem de Bruijn, Edward Cree, David Ahern

This is similar to the previous change, with some additional IPv4-specific
quirks. Even when using the route hint we still have to perform
additional per-packet checks on source address validity: a new
helper is added to wrap them.
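
A userspace sketch of the kind of per-packet source address checks meant here, for illustration only. Addresses are kept in host byte order for readability, all toy_* names are hypothetical, treating only 0.0.0.0 as "zeronet" is an assumption about the kernel version, and the reverse-path validation step is left out entirely:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Build a host-order IPv4 address from dotted-quad components. */
static uint32_t toy_addr(unsigned a, unsigned b, unsigned c, unsigned d)
{
	assert(a < 256 && b < 256 && c < 256 && d < 256);
	return ((uint32_t)a << 24) | ((uint32_t)b << 16) |
	       ((uint32_t)c << 8) | (uint32_t)d;
}

static bool toy_is_multicast(uint32_t a) { return (a & 0xf0000000u) == 0xe0000000u; }
static bool toy_is_lbcast(uint32_t a)    { return a == 0xffffffffu; }
static bool toy_is_zeronet(uint32_t a)   { return a == 0; }	/* assumption */
static bool toy_is_loopback(uint32_t a)  { return (a & 0xff000000u) == 0x7f000000u; }

/* True when the source address must be rejected regardless of which
 * route the destination resolves to.
 */
static bool toy_saddr_is_martian(uint32_t saddr, bool route_localnet)
{
	if (toy_is_multicast(saddr) || toy_is_lbcast(saddr))
		return true;
	if (toy_is_zeronet(saddr))
		return true;
	if (toy_is_loopback(saddr) && !route_localnet)
		return true;
	return false;
}
```

These checks depend only on the source address, which is why a dst shared between packets with the same destination does not make them redundant.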

To keep the code as simple as possible, use hints for local destinations
only.

UDP flood performance vs recvmmsg() receiver:

vanilla		patched		delta
Kpps		Kpps		%
1683		1871		+11

In the worst case scenario - each packet has a different
destination address - the performance delta is within noise
range.

v2 -> v3:
 - really fix build (sic) and hint usage check
 - use fib4_has_custom_rules() helpers (David A.)
 - add ip_extract_route_hint() helper (Edward C.)
 - use prev skb as hint instead of copying data (Willem)

v1 -> v2:
 - fix build issue with !CONFIG_IP_MULTIPLE_TABLES

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
 include/net/ip_fib.h    | 10 ++++++++++
 include/net/route.h     |  4 ++++
 net/ipv4/fib_frontend.c | 10 ----------
 net/ipv4/ip_input.c     | 35 +++++++++++++++++++++++++++++++----
 net/ipv4/route.c        | 37 +++++++++++++++++++++++++++++++++++++
 5 files changed, 82 insertions(+), 14 deletions(-)

diff --git a/include/net/ip_fib.h b/include/net/ip_fib.h
index 52b2406a5dfc..8e65e3e0a948 100644
--- a/include/net/ip_fib.h
+++ b/include/net/ip_fib.h
@@ -311,6 +311,11 @@ static inline int fib_lookup(struct net *net, const struct flowi4 *flp,
 	return err;
 }
 
+static inline bool fib4_has_custom_rules(struct net *net)
+{
+	return false;
+}
+
 static inline bool fib4_rule_default(const struct fib_rule *rule)
 {
 	return true;
@@ -378,6 +383,11 @@ static inline int fib_lookup(struct net *net, struct flowi4 *flp,
 	return err;
 }
 
+static inline bool fib4_has_custom_rules(struct net *net)
+{
+	return net->ipv4.fib_has_custom_rules;
+}
+
 bool fib4_rule_default(const struct fib_rule *rule);
 int fib4_rules_dump(struct net *net, struct notifier_block *nb,
 		    struct netlink_ext_ack *extack);
diff --git a/include/net/route.h b/include/net/route.h
index 6c516840380d..a9c60fc68e36 100644
--- a/include/net/route.h
+++ b/include/net/route.h
@@ -185,6 +185,10 @@ int ip_route_input_rcu(struct sk_buff *skb, __be32 dst, __be32 src,
 		       u8 tos, struct net_device *devin,
 		       struct fib_result *res);
 
+int ip_route_use_hint(struct sk_buff *skb, __be32 dst, __be32 src,
+		      u8 tos, struct net_device *devin,
+		      const struct sk_buff *hint);
+
 static inline int ip_route_input(struct sk_buff *skb, __be32 dst, __be32 src,
 				 u8 tos, struct net_device *devin)
 {
diff --git a/net/ipv4/fib_frontend.c b/net/ipv4/fib_frontend.c
index 71c78d223dfd..577db1d50a24 100644
--- a/net/ipv4/fib_frontend.c
+++ b/net/ipv4/fib_frontend.c
@@ -70,11 +70,6 @@ static int __net_init fib4_rules_init(struct net *net)
 	fib_free_table(main_table);
 	return -ENOMEM;
 }
-
-static bool fib4_has_custom_rules(struct net *net)
-{
-	return false;
-}
 #else
 
 struct fib_table *fib_new_table(struct net *net, u32 id)
@@ -131,11 +126,6 @@ struct fib_table *fib_get_table(struct net *net, u32 id)
 	}
 	return NULL;
 }
-
-static bool fib4_has_custom_rules(struct net *net)
-{
-	return net->ipv4.fib_has_custom_rules;
-}
 #endif /* CONFIG_IP_MULTIPLE_TABLES */
 
 static void fib_replace_table(struct net *net, struct fib_table *old,
diff --git a/net/ipv4/ip_input.c b/net/ipv4/ip_input.c
index 24a95126e698..e992f90586f3 100644
--- a/net/ipv4/ip_input.c
+++ b/net/ipv4/ip_input.c
@@ -302,16 +302,31 @@ static inline bool ip_rcv_options(struct sk_buff *skb, struct net_device *dev)
 	return true;
 }
 
+static bool ip_can_use_hint(struct sk_buff *skb, const struct iphdr *iph,
+			    const struct sk_buff *hint)
+{
+	return hint && !skb_dst(skb) && ip_hdr(hint)->daddr == iph->daddr &&
+	       ip_hdr(hint)->tos == iph->tos;
+}
+
 INDIRECT_CALLABLE_DECLARE(int udp_v4_early_demux(struct sk_buff *));
 INDIRECT_CALLABLE_DECLARE(int tcp_v4_early_demux(struct sk_buff *));
 static int ip_rcv_finish_core(struct net *net, struct sock *sk,
-			      struct sk_buff *skb, struct net_device *dev)
+			      struct sk_buff *skb, struct net_device *dev,
+			      const struct sk_buff *hint)
 {
 	const struct iphdr *iph = ip_hdr(skb);
 	int (*edemux)(struct sk_buff *skb);
 	struct rtable *rt;
 	int err;
 
+	if (ip_can_use_hint(skb, iph, hint)) {
+		err = ip_route_use_hint(skb, iph->daddr, iph->saddr, iph->tos,
+					dev, hint);
+		if (unlikely(err))
+			goto drop_error;
+	}
+
 	if (net->ipv4.sysctl_ip_early_demux &&
 	    !skb_dst(skb) &&
 	    !skb->sk &&
@@ -408,7 +423,7 @@ static int ip_rcv_finish(struct net *net, struct sock *sk, struct sk_buff *skb)
 	if (!skb)
 		return NET_RX_SUCCESS;
 
-	ret = ip_rcv_finish_core(net, sk, skb, dev);
+	ret = ip_rcv_finish_core(net, sk, skb, dev, NULL);
 	if (ret != NET_RX_DROP)
 		ret = dst_input(skb);
 	return ret;
@@ -535,11 +550,20 @@ static void ip_sublist_rcv_finish(struct list_head *head)
 	}
 }
 
+static struct sk_buff *ip_extract_route_hint(struct net *net,
+					     struct sk_buff *skb, int rt_type)
+{
+	if (fib4_has_custom_rules(net) || rt_type != RTN_LOCAL)
+		return NULL;
+
+	return skb;
+}
+
 static void ip_list_rcv_finish(struct net *net, struct sock *sk,
 			       struct list_head *head)
 {
+	struct sk_buff *skb, *next, *hint = NULL;
 	struct dst_entry *curr_dst = NULL;
-	struct sk_buff *skb, *next;
 	struct list_head sublist;
 
 	INIT_LIST_HEAD(&sublist);
@@ -554,11 +578,14 @@ static void ip_list_rcv_finish(struct net *net, struct sock *sk,
 		skb = l3mdev_ip_rcv(skb);
 		if (!skb)
 			continue;
-		if (ip_rcv_finish_core(net, sk, skb, dev) == NET_RX_DROP)
+		if (ip_rcv_finish_core(net, sk, skb, dev, hint) == NET_RX_DROP)
 			continue;
 
 		dst = skb_dst(skb);
 		if (curr_dst != dst) {
+			hint = ip_extract_route_hint(net, skb,
+					       ((struct rtable *)dst)->rt_type);
+
 			/* dispatch old sublist */
 			if (!list_empty(&sublist))
 				ip_sublist_rcv_finish(&sublist);
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index dcc4fa10138d..7083cfa9f0a5 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -2019,10 +2019,47 @@ static int ip_mkroute_input(struct sk_buff *skb,
 	return __mkroute_input(skb, res, in_dev, daddr, saddr, tos);
 }
 
+/* Implements all the saddr-related checks as ip_route_input_slow(),
+ * assuming daddr is valid and destination is local.
+ * Uses the provided hint instead of performing a route lookup.
+ */
+int ip_route_use_hint(struct sk_buff *skb, __be32 daddr, __be32 saddr,
+		      u8 tos, struct net_device *dev,
+		      const struct sk_buff *hint)
+{
+	struct in_device *in_dev = __in_dev_get_rcu(dev);
+	struct net *net = dev_net(dev);
+	int err = -EINVAL;
+	u32 tag = 0;
+
+	if (ipv4_is_multicast(saddr) || ipv4_is_lbcast(saddr))
+		goto martian_source;
+
+	if (ipv4_is_zeronet(saddr))
+		goto martian_source;
+
+	if (ipv4_is_loopback(saddr) && !IN_DEV_NET_ROUTE_LOCALNET(in_dev, net))
+		goto martian_source;
+
+	tos &= IPTOS_RT_MASK;
+	err = fib_validate_source(skb, saddr, daddr, tos, 0, dev, in_dev, &tag);
+	if (err < 0)
+		goto martian_source;
+
+	skb_dst_copy(skb, hint);
+	return 0;
+
+martian_source:
+	ip_handle_martian_source(dev, in_dev, skb, daddr, saddr);
+	return err;
+}
+
 /*
  *	NOTE. We drop all the packets that has local source
  *	addresses, because every properly looped back packet
  *	must have correct destination already attached by output routine.
+ *	Changes in the enforced policies must be applied also to
+ *	ip_route_use_hint().
  *
  *	Such approach solves two big problems:
  *	1. Not simplex devices are handled properly.
-- 
2.21.0



* Re: [PATCH net-next v3 1/2] ipv6: introduce and uses route look hints for list input
  2019-11-19 14:38 ` [PATCH net-next v3 1/2] ipv6: introduce and uses route look hints for list input Paolo Abeni
@ 2019-11-19 15:39   ` David Ahern
  2019-11-19 16:00     ` Paolo Abeni
  2019-11-19 17:34   ` Eric Dumazet
  1 sibling, 1 reply; 16+ messages in thread
From: David Ahern @ 2019-11-19 15:39 UTC (permalink / raw)
  To: Paolo Abeni, netdev; +Cc: David S. Miller, Willem de Bruijn, Edward Cree

On 11/19/19 7:38 AM, Paolo Abeni wrote:
> When doing RX batch packet processing, we currently always repeat
> the route lookup for each ingress packet. If policy routing is
> configured, and IPV6_SUBTREES is disabled at build time, we
> know that packets with the same destination address will use
> the same dst.
> 
> This change tries to avoid per packet route lookup caching
> the destination address of the latest successful lookup, and
> reusing it for the next packet when the above conditions are
> in place. Ingress traffic for most servers should fit.
> 
> The measured performance delta under UDP flood vs a recvmmsg
> receiver is as follow:
> 
> vanilla		patched		delta
> Kpps		Kpps		%
> 1431		1674		+17

That's a nice boost...

> +static struct sk_buff *ip6_extract_route_hint(struct net *net,
> +					      struct sk_buff *skb)
> +{
> +	if (IS_ENABLED(IPV6_SUBTREES) || fib6_has_custom_rules(net))

... but basing on SUBTREES being disabled is going to limit its use. If
no routes are source based (fib6_src is not set), you should be able to
re-use the hint with SUBTREES enabled. e.g., track fib6_src use with a
per-namespace counter - similar to fib6_rules_require_fldissect.
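
One way the suggested per-namespace counter could look, as a userspace sketch. The struct and function names are hypothetical and nothing here reflects existing kernel fields: the counter tracks routes that carry a source prefix, and hints are allowed only while it is zero and no custom rules are installed.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-namespace routing state. */
struct toy_netns {
	bool has_custom_rules;
	unsigned int src_routes;	/* routes with a source prefix */
};

static void toy_route_add(struct toy_netns *ns, bool has_src_prefix)
{
	if (has_src_prefix)
		ns->src_routes++;
}

static void toy_route_del(struct toy_netns *ns, bool has_src_prefix)
{
	if (has_src_prefix) {
		assert(ns->src_routes > 0);
		ns->src_routes--;
	}
}

/* Hints are safe only when the lookup result cannot depend on saddr. */
static bool toy_hint_allowed(const struct toy_netns *ns)
{
	return !ns->has_custom_rules && ns->src_routes == 0;
}
```

With this shape the build-time test on subtree support turns into a runtime test, so a kernel built with subtrees enabled but with no source-based routes configured could still use the hint.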




* Re: [PATCH net-next v3 1/2] ipv6: introduce and uses route look hints for list input
  2019-11-19 15:39   ` David Ahern
@ 2019-11-19 16:00     ` Paolo Abeni
  2019-11-19 16:21       ` David Ahern
  0 siblings, 1 reply; 16+ messages in thread
From: Paolo Abeni @ 2019-11-19 16:00 UTC (permalink / raw)
  To: David Ahern, netdev; +Cc: David S. Miller, Willem de Bruijn, Edward Cree

Hi,

On Tue, 2019-11-19 at 08:39 -0700, David Ahern wrote:
> > +static struct sk_buff *ip6_extract_route_hint(struct net *net,
> > +					      struct sk_buff *skb)
> > +{
> > +	if (IS_ENABLED(IPV6_SUBTREES) || fib6_has_custom_rules(net))
> 
> ... but basing on SUBTREES being disabled is going to limit its use. If
> no routes are source based (fib6_src is not set), you should be able to
> re-use the hint with SUBTREES enabled. e.g., track fib6_src use with a
> per-namespace counter - similar to fib6_rules_require_fldissect.

Thank you for the feedback! Would you consider this as an intermediate
step? e.g. get these patches in, and then I'll implement subtree
support? 
I'm asking because I don't have a subtree setup handy; it will take a
little time to get there.

Thanks!

Paolo



* Re: [PATCH net-next v3 2/2] ipv4: use dst hint for ipv4 list receive
  2019-11-19 14:38 ` [PATCH net-next v3 2/2] ipv4: use dst hint for ipv4 list receive Paolo Abeni
@ 2019-11-19 16:00   ` David Ahern
  2019-11-19 16:20     ` Paolo Abeni
  0 siblings, 1 reply; 16+ messages in thread
From: David Ahern @ 2019-11-19 16:00 UTC (permalink / raw)
  To: Paolo Abeni, netdev; +Cc: David S. Miller, Willem de Bruijn, Edward Cree

On 11/19/19 7:38 AM, Paolo Abeni wrote:
> @@ -535,11 +550,20 @@ static void ip_sublist_rcv_finish(struct list_head *head)
>  	}
>  }
>  
> +static struct sk_buff *ip_extract_route_hint(struct net *net,
> +					     struct sk_buff *skb, int rt_type)
> +{
> +	if (fib4_has_custom_rules(net) || rt_type != RTN_LOCAL)

Why the local only limitation for v4 but not v6? Really, why limit this
to LOCAL at all? same destination with same tos and no custom rules
means even for forwarding the lookup should be the same and you can
re-use the dst.
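
The condition described here (same destination, same tos, no dst attached yet) can be sketched in userspace as follows. The toy_ip4 struct is a hypothetical stand-in for the fields read from the packet, and the "no custom rules" part is assumed to be checked elsewhere, when the hint is handed out:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view of the fields that matter for hint matching. */
struct toy_ip4 {
	uint32_t daddr;
	uint8_t tos;
	bool has_dst;	/* a dst is already attached to the packet */
};

static bool toy_ip4_can_use_hint(const struct toy_ip4 *pkt,
				 const struct toy_ip4 *hint)
{
	assert(pkt != NULL);
	return hint && !pkt->has_dst &&
	       hint->daddr == pkt->daddr && hint->tos == pkt->tos;
}
```

Nothing in this predicate looks at the route type, which is the point of the question: the match itself is equally valid for local delivery and for forwarding.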


* Re: [PATCH net-next v3 2/2] ipv4: use dst hint for ipv4 list receive
  2019-11-19 16:00   ` David Ahern
@ 2019-11-19 16:20     ` Paolo Abeni
  2019-11-19 17:33       ` Paolo Abeni
  0 siblings, 1 reply; 16+ messages in thread
From: Paolo Abeni @ 2019-11-19 16:20 UTC (permalink / raw)
  To: David Ahern, netdev; +Cc: David S. Miller, Willem de Bruijn, Edward Cree

On Tue, 2019-11-19 at 09:00 -0700, David Ahern wrote:
> On 11/19/19 7:38 AM, Paolo Abeni wrote:
> > @@ -535,11 +550,20 @@ static void ip_sublist_rcv_finish(struct list_head *head)
> >  	}
> >  }
> >  
> > +static struct sk_buff *ip_extract_route_hint(struct net *net,
> > +					     struct sk_buff *skb, int rt_type)
> > +{
> > +	if (fib4_has_custom_rules(net) || rt_type != RTN_LOCAL)
> 
> Why the local only limitation for v4 but not v6? Really, why limit this
> to LOCAL at all? 

The goal here was to simplify the ipv4 ip_route_use_hint() helper as much
as possible, as its complexity raised some eyebrows.

Yes, hints can also be used for forwarding. I'm unsure how much that will
help, given the daddr constraint. If there is agreement I can re-add it.

Thanks,

Paolo



* Re: [PATCH net-next v3 1/2] ipv6: introduce and uses route look hints for list input
  2019-11-19 16:00     ` Paolo Abeni
@ 2019-11-19 16:21       ` David Ahern
  0 siblings, 0 replies; 16+ messages in thread
From: David Ahern @ 2019-11-19 16:21 UTC (permalink / raw)
  To: Paolo Abeni, netdev; +Cc: David S. Miller, Willem de Bruijn, Edward Cree

On 11/19/19 9:00 AM, Paolo Abeni wrote:
> Hi,
> 
> On Tue, 2019-11-19 at 08:39 -0700, David Ahern wrote:
>>> +static struct sk_buff *ip6_extract_route_hint(struct net *net,
>>> +					      struct sk_buff *skb)
>>> +{
>>> +	if (IS_ENABLED(IPV6_SUBTREES) || fib6_has_custom_rules(net))
>>
>> ... but basing on SUBTREES being disabled is going to limit its use. If
>> no routes are source based (fib6_src is not set), you should be able to
>> re-use the hint with SUBTREES enabled. e.g., track fib6_src use with a
>> per-namespace counter - similar to fib6_rules_require_fldissect.
> 
> Thank you for the feedback! Would you consider this as an intermediate
> step? e.g. get these patches in, and then I'll implement subtree
> support? 
> I'm asking because I don't have subtree setup handy, it will a little
> time to get there.
> 


IPV6_SUBTREES is just a matter of source based routing, so with iproute2
just add 'from <addr>'

If you delay dealing with it, then this patch needs a change: since
ip6_extract_route_hint will only return NULL, ip6_can_use_hint will only
take NULL as an input so leaving it enabled just adds overhead with no
benefit.


* Re: [PATCH net-next v3 2/2] ipv4: use dst hint for ipv4 list receive
  2019-11-19 16:20     ` Paolo Abeni
@ 2019-11-19 17:33       ` Paolo Abeni
  2019-11-19 17:45         ` David Ahern
  0 siblings, 1 reply; 16+ messages in thread
From: Paolo Abeni @ 2019-11-19 17:33 UTC (permalink / raw)
  To: David Ahern, netdev; +Cc: David S. Miller, Willem de Bruijn, Edward Cree

On Tue, 2019-11-19 at 17:20 +0100, Paolo Abeni wrote:
> On Tue, 2019-11-19 at 09:00 -0700, David Ahern wrote:
> > On 11/19/19 7:38 AM, Paolo Abeni wrote:
> > > @@ -535,11 +550,20 @@ static void ip_sublist_rcv_finish(struct list_head *head)
> > >  	}
> > >  }
> > >  
> > > +static struct sk_buff *ip_extract_route_hint(struct net *net,
> > > +					     struct sk_buff *skb, int rt_type)
> > > +{
> > > +	if (fib4_has_custom_rules(net) || rt_type != RTN_LOCAL)
> > 
> > Why the local only limitation for v4 but not v6? Really, why limit this
> > to LOCAL at all? 
> 
> The goal here was to simplify as much as possible the ipv4
> ip_route_use_hint() helper, as its complexity raised some eyebrown.
> 
> Yes, hints can be used also for forwarding. I'm unsure how much will
> help, given the daddr contraint. If there is agreement I can re-add it.

Sorry, I forgot to ask: would you be ok enabling the route hint for
!RTN_BROADCAST, as in the previous iteration? Covering RTN_BROADCAST
will add quite a bit of complexity to ip_route_use_hint(), likely with
no relevant use-case.
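
To make the two options concrete, here is a userspace sketch of both hint policies side by side. The enum and function names are hypothetical, and the enum values are not the kernel's RTN_* constants:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical route types; values are arbitrary. */
enum toy_rt_type { TOY_RTN_UNICAST, TOY_RTN_LOCAL, TOY_RTN_BROADCAST };

/* Policy in this iteration: hand out a hint for local delivery only. */
static bool toy_hint_local_only(bool custom_rules, enum toy_rt_type t)
{
	assert(t <= TOY_RTN_BROADCAST);
	return !custom_rules && t == TOY_RTN_LOCAL;
}

/* Alternative being asked about: everything except broadcast routes. */
static bool toy_hint_not_broadcast(bool custom_rules, enum toy_rt_type t)
{
	assert(t <= TOY_RTN_BROADCAST);
	return !custom_rules && t != TOY_RTN_BROADCAST;
}
```

The second policy additionally covers forwarded (unicast) traffic, while still leaving broadcast handling out of the hint path.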

Thanks!

Paolo



* Re: [PATCH net-next v3 1/2] ipv6: introduce and uses route look hints for list input
  2019-11-19 14:38 ` [PATCH net-next v3 1/2] ipv6: introduce and uses route look hints for list input Paolo Abeni
  2019-11-19 15:39   ` David Ahern
@ 2019-11-19 17:34   ` Eric Dumazet
  2019-11-19 17:40     ` Eric Dumazet
                       ` (2 more replies)
  1 sibling, 3 replies; 16+ messages in thread
From: Eric Dumazet @ 2019-11-19 17:34 UTC (permalink / raw)
  To: Paolo Abeni, netdev
  Cc: David S. Miller, Willem de Bruijn, Edward Cree, David Ahern



On 11/19/19 6:38 AM, Paolo Abeni wrote:
> When doing RX batch packet processing, we currently always repeat
> the route lookup for each ingress packet. If policy routing is
> configured, and IPV6_SUBTREES is disabled at build time, we
> know that packets with the same destination address will use
> the same dst.
> 
> This change tries to avoid per packet route lookup caching
> the destination address of the latest successful lookup, and
> reusing it for the next packet when the above conditions are
> in place. Ingress traffic for most servers should fit.
> 
> The measured performance delta under UDP flood vs a recvmmsg
> receiver is as follow:
> 
> vanilla		patched		delta
> Kpps		Kpps		%
> 1431		1674		+17
> 



> In the worst-case scenario - each packet has a different
> destination address - the performance delta is within noise
> range.
> 
> v2 -> v3:
>  - add fib6_has_custom_rules() helpers (David A.)
>  - add ip6_extract_route_hint() helper (Edward C.)
>  - use hint directly in ip6_list_rcv_finish() (Willem)
> 
> v1 -> v2:
>  - fix build issue with !CONFIG_IPV6_MULTIPLE_TABLES
>  - fix potential race when fib6_has_custom_rules is set
>    while processing a packet batch
> 
> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
> ---
>  include/net/ip6_fib.h |  9 +++++++++
>  net/ipv6/ip6_input.c  | 26 ++++++++++++++++++++++++--
>  2 files changed, 33 insertions(+), 2 deletions(-)
> 
> diff --git a/include/net/ip6_fib.h b/include/net/ip6_fib.h
> index 5d1615463138..9ab60611b97b 100644
> --- a/include/net/ip6_fib.h
> +++ b/include/net/ip6_fib.h
> @@ -502,6 +502,11 @@ static inline bool fib6_metric_locked(struct fib6_info *f6i, int metric)
>  }
>  
>  #ifdef CONFIG_IPV6_MULTIPLE_TABLES
> +static inline bool fib6_has_custom_rules(struct net *net)

const struct net *net

> +{
> +	return net->ipv6.fib6_has_custom_rules;

It would be nice to be able to detect that some custom rules only impact egress routes :/

> +}
> +
>  int fib6_rules_init(void);
>  void fib6_rules_cleanup(void);
>  bool fib6_rule_default(const struct fib_rule *rule);
> @@ -527,6 +532,10 @@ static inline bool fib6_rules_early_flow_dissect(struct net *net,
>  	return true;
>  }
>  #else
> +static inline bool fib6_has_custom_rules(struct net *net)

const struct net *net

> +{
> +	return 0;


	return false;


BTW, this deserves a patch on its own :)

> +}
>  static inline int               fib6_rules_init(void)
>  {
>  	return 0;
> diff --git a/net/ipv6/ip6_input.c b/net/ipv6/ip6_input.c
> index ef7f707d9ae3..792b52aa9fc9 100644
> --- a/net/ipv6/ip6_input.c
> +++ b/net/ipv6/ip6_input.c
> @@ -59,6 +59,7 @@ static void ip6_rcv_finish_core(struct net *net, struct sock *sk,
>  			INDIRECT_CALL_2(edemux, tcp_v6_early_demux,
>  					udp_v6_early_demux, skb);
>  	}
> +

Why add a new line? Please refrain from adding noise to a patch.

>  	if (!skb_valid_dst(skb))
>  		ip6_route_input(skb);
>  }
> @@ -86,11 +87,26 @@ static void ip6_sublist_rcv_finish(struct list_head *head)
>  	}
>  }
>  
> +static bool ip6_can_use_hint(struct sk_buff *skb, const struct sk_buff *hint)
> +{
> +	return hint && !skb_dst(skb) &&
> +	       ipv6_addr_equal(&ipv6_hdr(hint)->daddr, &ipv6_hdr(skb)->daddr);
> +}
> +

Why keep the whole skb as the hint, since all you want is ipv6_hdr(skb)->daddr?

Remembering the pointer to daddr would avoid dereferencing many skb fields.


> +static struct sk_buff *ip6_extract_route_hint(struct net *net,
> +					      struct sk_buff *skb)
> +{
> +	if (IS_ENABLED(IPV6_SUBTREES) || fib6_has_custom_rules(net))
> +		return NULL;
> +
> +	return skb;
> +}
> +
>  static void ip6_list_rcv_finish(struct net *net, struct sock *sk,
>  				struct list_head *head)
>  {
> +	struct sk_buff *skb, *next, *hint = NULL;
>  	struct dst_entry *curr_dst = NULL;
> -	struct sk_buff *skb, *next;
>  	struct list_head sublist;
>  
>  	INIT_LIST_HEAD(&sublist);
> @@ -104,9 +120,15 @@ static void ip6_list_rcv_finish(struct net *net, struct sock *sk,
>  		skb = l3mdev_ip6_rcv(skb);
>  		if (!skb)
>  			continue;
> -		ip6_rcv_finish_core(net, sk, skb);
> +
> +		if (ip6_can_use_hint(skb, hint))
> +			skb_dst_copy(skb, hint);
> +		else
> +			ip6_rcv_finish_core(net, sk, skb);
>  		dst = skb_dst(skb);
>  		if (curr_dst != dst) {
> +			hint = ip6_extract_route_hint(net, skb);
> +
>  			/* dispatch old sublist */
>  			if (!list_empty(&sublist))
>  				ip6_sublist_rcv_finish(&sublist);
> 


* Re: [PATCH net-next v3 1/2] ipv6: introduce and uses route look hints for list input
  2019-11-19 17:34   ` Eric Dumazet
@ 2019-11-19 17:40     ` Eric Dumazet
  2019-11-19 17:40     ` David Ahern
  2019-11-19 21:41     ` Paolo Abeni
  2 siblings, 0 replies; 16+ messages in thread
From: Eric Dumazet @ 2019-11-19 17:40 UTC (permalink / raw)
  To: Eric Dumazet, Paolo Abeni, netdev
  Cc: David S. Miller, Willem de Bruijn, Edward Cree, David Ahern



On 11/19/19 9:34 AM, Eric Dumazet wrote:
> 

>>  
>> +static bool ip6_can_use_hint(struct sk_buff *skb, const struct sk_buff *hint)
>> +{
>> +	return hint && !skb_dst(skb) &&
>> +	       ipv6_addr_equal(&ipv6_hdr(hint)->daddr, &ipv6_hdr(skb)->daddr);
>> +}
>> +
> 
> Why keeping whole skb as the hint, since all you want is the ipv6_hdr(skb)->daddr ?
> 
> Remembering the pointer to daddr would avoid de-referencing many skb fields.
> 

Ah we also need the hint dst, scrap this then...



* Re: [PATCH net-next v3 1/2] ipv6: introduce and uses route look hints for list input
  2019-11-19 17:34   ` Eric Dumazet
  2019-11-19 17:40     ` Eric Dumazet
@ 2019-11-19 17:40     ` David Ahern
  2019-11-19 21:41     ` Paolo Abeni
  2 siblings, 0 replies; 16+ messages in thread
From: David Ahern @ 2019-11-19 17:40 UTC (permalink / raw)
  To: Eric Dumazet, Paolo Abeni, netdev
  Cc: David S. Miller, Willem de Bruijn, Edward Cree

On 11/19/19 10:34 AM, Eric Dumazet wrote:
>> diff --git a/include/net/ip6_fib.h b/include/net/ip6_fib.h
>> index 5d1615463138..9ab60611b97b 100644
>> --- a/include/net/ip6_fib.h
>> +++ b/include/net/ip6_fib.h
>> @@ -502,6 +502,11 @@ static inline bool fib6_metric_locked(struct fib6_info *f6i, int metric)
>>  }
>>  
>>  #ifdef CONFIG_IPV6_MULTIPLE_TABLES
>> +static inline bool fib6_has_custom_rules(struct net *net)
> 
> const struct net *net
> 
>> +{
>> +	return net->ipv6.fib6_has_custom_rules;
> 
> It would be nice to be able to detect that some custom rules only impact egress routes :/
> 

Or vrf. :-)

It's a common problem that needs a better solution - not lumping the
full complexity of fib rules into a single boolean.



* Re: [PATCH net-next v3 2/2] ipv4: use dst hint for ipv4 list receive
  2019-11-19 17:33       ` Paolo Abeni
@ 2019-11-19 17:45         ` David Ahern
  0 siblings, 0 replies; 16+ messages in thread
From: David Ahern @ 2019-11-19 17:45 UTC (permalink / raw)
  To: Paolo Abeni, netdev; +Cc: David S. Miller, Willem de Bruijn, Edward Cree

On 11/19/19 10:33 AM, Paolo Abeni wrote:
> On Tue, 2019-11-19 at 17:20 +0100, Paolo Abeni wrote:
>> On Tue, 2019-11-19 at 09:00 -0700, David Ahern wrote:
>>> On 11/19/19 7:38 AM, Paolo Abeni wrote:
>>>> @@ -535,11 +550,20 @@ static void ip_sublist_rcv_finish(struct list_head *head)
>>>>  	}
>>>>  }
>>>>  
>>>> +static struct sk_buff *ip_extract_route_hint(struct net *net,
>>>> +					     struct sk_buff *skb, int rt_type)
>>>> +{
>>>> +	if (fib4_has_custom_rules(net) || rt_type != RTN_LOCAL)
>>>
>>> Why the local only limitation for v4 but not v6? Really, why limit this
>>> to LOCAL at all? 
>>
>> The goal here was to simplify as much as possible the ipv4
>> ip_route_use_hint() helper, as its complexity raised some eyebrown.
>>
>> Yes, hints can be used also for forwarding. I'm unsure how much will
>> help, given the daddr contraint. If there is agreement I can re-add it.
> 
> Sorry, I forgot to ask: would you be ok enabling the route hint for
> !RTN_BROADCAST, as in the previous iteration? Covering RTN_BROADCAST
> will add quite a bit of complexity to ip_route_use_hint(), likely with
> no relevant use-case.
> 

It is a trade-off: too many checks just add overhead for the packets
that cannot benefit from re-use. I was trying to understand why
local delivery was given preference.


* Re: [PATCH net-next v3 1/2] ipv6: introduce and uses route look hints for list input
  2019-11-19 17:34   ` Eric Dumazet
  2019-11-19 17:40     ` Eric Dumazet
  2019-11-19 17:40     ` David Ahern
@ 2019-11-19 21:41     ` Paolo Abeni
  2 siblings, 0 replies; 16+ messages in thread
From: Paolo Abeni @ 2019-11-19 21:41 UTC (permalink / raw)
  To: Eric Dumazet, netdev
  Cc: David S. Miller, Willem de Bruijn, Edward Cree, David Ahern

On Tue, 2019-11-19 at 09:34 -0800, Eric Dumazet wrote:
> On 11/19/19 6:38 AM, Paolo Abeni wrote:
> > diff --git a/include/net/ip6_fib.h b/include/net/ip6_fib.h
> > index 5d1615463138..9ab60611b97b 100644
> > --- a/include/net/ip6_fib.h
> > +++ b/include/net/ip6_fib.h
> > @@ -502,6 +502,11 @@ static inline bool fib6_metric_locked(struct fib6_info *f6i, int metric)
> >  }
> >  
> >  #ifdef CONFIG_IPV6_MULTIPLE_TABLES
> > +static inline bool fib6_has_custom_rules(struct net *net)
> 
> const struct net *net

Yep, will do in the next iteration.

> > +{
> > +	return net->ipv6.fib6_has_custom_rules;
> 
> It would be nice to be able to detect that some custom rules only impact egress routes :/

My [mis-] understanding is that correctly addressing the above (and VRF
and likely many other use-cases) is beyond the scope of these patches.

> > +}
> > +
> >  int fib6_rules_init(void);
> >  void fib6_rules_cleanup(void);
> >  bool fib6_rule_default(const struct fib_rule *rule);
> > @@ -527,6 +532,10 @@ static inline bool fib6_rules_early_flow_dissect(struct net *net,
> >  	return true;
> >  }
> >  #else
> > +static inline bool fib6_has_custom_rules(struct net *net)
> 
> const struct net *net

Ditto ;)

> > +{
> > +	return 0;
> 
> 	return false;
> 
> 
> BTW, this deserves a patch on its own :)

OK, will do.

> > +}
> >  static inline int               fib6_rules_init(void)
> >  {
> >  	return 0;
> > diff --git a/net/ipv6/ip6_input.c b/net/ipv6/ip6_input.c
> > index ef7f707d9ae3..792b52aa9fc9 100644
> > --- a/net/ipv6/ip6_input.c
> > +++ b/net/ipv6/ip6_input.c
> > @@ -59,6 +59,7 @@ static void ip6_rcv_finish_core(struct net *net, struct sock *sk,
> >  			INDIRECT_CALL_2(edemux, tcp_v6_early_demux,
> >  					udp_v6_early_demux, skb);
> >  	}
> > +
> 
> Why add a new line? Please refrain from adding noise to a patch.

Sorry this is a left-over from previous iterations, will fix in the
next one.

Thank you for the detailed feedback!

Paolo
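
A userspace sketch of the helper quoted above with both review comments
applied (const pointer argument, "return false" in the stub) might look
like the following. The model_* types and the MODEL_IPV6_MULTIPLE_TABLES
macro are stand-ins assumed for illustration; they only mirror the shape
of the quoted hunk and are not the kernel definitions.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for struct net and its ipv6 member. */
struct model_netns_ipv6 {
	bool fib6_has_custom_rules;
};

struct model_net {
	struct model_netns_ipv6 ipv6;
};

#ifdef MODEL_IPV6_MULTIPLE_TABLES
/* Multiple tables built in: report the per-netns flag. */
static inline bool model_fib6_has_custom_rules(const struct model_net *net)
{
	return net->ipv6.fib6_has_custom_rules;
}
#else
/* Multiple tables compiled out: custom rules cannot exist. */
static inline bool model_fib6_has_custom_rules(const struct model_net *net)
{
	(void)net;	/* unused in the stub */
	return false;
}
#endif
```

Taking a const pointer documents that the helper only reads the
namespace state, and returning false keeps the stub consistent with the
bool return type.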



* Re: [PATCH net-next v3 0/2] net: introduce and use route hint
  2019-11-19 14:38 [PATCH net-next v3 0/2] net: introduce and use route hint Paolo Abeni
  2019-11-19 14:38 ` [PATCH net-next v3 1/2] ipv6: introduce and uses route look hints for list input Paolo Abeni
  2019-11-19 14:38 ` [PATCH net-next v3 2/2] ipv4: use dst hint for ipv4 list receive Paolo Abeni
@ 2019-11-20  2:47 ` David Miller
  2019-11-20  8:08   ` Paolo Abeni
  2 siblings, 1 reply; 16+ messages in thread
From: David Miller @ 2019-11-20  2:47 UTC (permalink / raw)
  To: pabeni; +Cc: netdev, willemdebruijn.kernel, ecree, dsahern

From: Paolo Abeni <pabeni@redhat.com>
Date: Tue, 19 Nov 2019 15:38:35 +0100

> This series leverages the listification infrastructure to avoid
> unnecessary route lookups on ingress packets. In the absence of policy
> routing, packets with equal daddr will usually land on the same dst.
> 
> When processing packet bursts (lists) we can easily reference the previous
> dst entry. When we hit the 'same destination' condition we can avoid the
> route lookup, copying the already available dst.
> 
> Detailed performance numbers are available in the individual commit
> messages. Figures are slightly better than in the previous iteration
> because, thanks to Willem's suggestion, we additionally skip early demux
> when using the route hint.
> 
> v2 -> v3:
>  - use fib*_has_custom_rules() helpers (David A.)
>  - add ip*_extract_route_hint() helper (Edward C.)
>  - use prev skb as hint instead of copying data (Willem)
> 
> v1 -> v2:
>  - fix build issue with !CONFIG_IP*_MULTIPLE_TABLES
>  - fix potential race in ip6_list_rcv_finish()

To reiterate David A.'s feedback, having this depend upon
IP_MULTIPLE_TABLES being disabled is 100% a non-starter.

No distribution will benefit from these changes at all.
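
For context, the burst behaviour described in the quoted cover letter
can be modelled in userspace roughly as below. This is a sketch under
stated assumptions, not the patch itself: the model_* types, the
linear-search "route lookup" and the has_custom_rules flag are all
invented for illustration. It only shows how a runtime flag disables
hint reuse for every packet in the list.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for a dst entry and a received packet. */
struct model_dst {
	uint32_t daddr;
};

struct model_pkt {
	uint32_t daddr;
	const struct model_dst *dst;
};

/* Stand-in for the full route lookup: linear search by daddr. */
static const struct model_dst *model_route_lookup(const struct model_dst *table,
						  size_t table_len,
						  uint32_t daddr)
{
	size_t i;

	for (i = 0; i < table_len; i++)
		if (table[i].daddr == daddr)
			return &table[i];

	return NULL;
}

/*
 * Attach a dst to every packet of the burst, reusing the previous
 * packet's dst when the daddr matches. Returns the number of full
 * lookups that were still needed.
 */
static size_t model_process_burst(struct model_pkt *pkts, size_t n,
				  const struct model_dst *table,
				  size_t table_len, bool has_custom_rules)
{
	const struct model_pkt *hint = NULL;
	size_t lookups = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		struct model_pkt *pkt = &pkts[i];

		if (hint && hint->daddr == pkt->daddr) {
			pkt->dst = hint->dst;	/* same destination: skip lookup */
		} else {
			pkt->dst = model_route_lookup(table, table_len,
						      pkt->daddr);
			lookups++;
		}

		/* With custom rules present no packet may serve as a hint. */
		hint = (!has_custom_rules && pkt->dst) ? pkt : NULL;
	}

	return lookups;
}
```

In this model a burst of five packets to two destinations needs two
lookups when hints are allowed and five when they are not, so any
condition that disables hints wholesale removes the entire gain.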


* Re: [PATCH net-next v3 0/2] net: introduce and use route hint
  2019-11-20  2:47 ` [PATCH net-next v3 0/2] net: introduce and use route hint David Miller
@ 2019-11-20  8:08   ` Paolo Abeni
  0 siblings, 0 replies; 16+ messages in thread
From: Paolo Abeni @ 2019-11-20  8:08 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, willemdebruijn.kernel, ecree, dsahern

On Tue, 2019-11-19 at 18:47 -0800, David Miller wrote:
> From: Paolo Abeni <pabeni@redhat.com>
> Date: Tue, 19 Nov 2019 15:38:35 +0100
> 
> > This series leverages the listification infrastructure to avoid
> > unnecessary route lookups on ingress packets. In the absence of policy
> > routing, packets with equal daddr will usually land on the same dst.
> > 
> > When processing packet bursts (lists) we can easily reference the previous
> > dst entry. When we hit the 'same destination' condition we can avoid the
> > route lookup, copying the already available dst.
> > 
> > Detailed performance numbers are available in the individual commit
> > messages. Figures are slightly better than in the previous iteration
> > because, thanks to Willem's suggestion, we additionally skip early demux
> > when using the route hint.
> > 
> > v2 -> v3:
> >  - use fib*_has_custom_rules() helpers (David A.)
> >  - add ip*_extract_route_hint() helper (Edward C.)
> >  - use prev skb as hint instead of copying data (Willem)
> > 
> > v1 -> v2:
> >  - fix build issue with !CONFIG_IP*_MULTIPLE_TABLES
> >  - fix potential race in ip6_list_rcv_finish()
> 
> To reiterate David A.'s feedback, 

Yep, I'm working to address it...

> having this depend upon
> IP_MULTIPLE_TABLES being disabled is 100% a non-starter.

...anyway, in its current form it 'just' depends on CONFIG_IPV6_SUBTREE
being disabled. The next iteration will remove that dependency, as per
David's feedback.

Thanks,

Paolo


