* [PATCH net-next v4 0/5] net: introduce and use route hint
From: Paolo Abeni @ 2019-11-20 12:47 UTC
  To: netdev
  Cc: David S. Miller, Willem de Bruijn, Edward Cree, David Ahern,
	Eric Dumazet

This series leverages the listification infrastructure to avoid
unnecessary route lookups on ingress packets. In the absence of custom rules,
packets with equal daddr will usually land on the same dst.

When processing packet bursts (lists) we can easily reference the previous
dst entry. When we hit the 'same destination' condition we can avoid the
route lookup, copying the already available dst.

Detailed performance numbers are available in the individual commit
messages.
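
As a rough illustration of the idea (not the kernel code), the userspace
C sketch below resolves a "route" for each packet in a batch and skips
the lookup when the previous packet had the same destination. All names
(struct pkt, slow_route_lookup(), batch_resolve()) are hypothetical, and
unlike the actual patches the sketch refreshes the hint on every packet
and does not model custom rules.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an sk_buff with an attached dst. */
struct pkt {
	uint32_t daddr;	/* destination address */
	int dst;	/* resolved "route", -1 if not resolved yet */
};

static unsigned int lookups;	/* number of full lookups performed */

/* Pretend full route lookup: the expensive step we want to avoid. */
static int slow_route_lookup(uint32_t daddr)
{
	lookups++;
	return (int)(daddr & 0xff);	/* arbitrary deterministic result */
}

/* Resolve a dst for every packet, using the previous packet as a hint. */
static void batch_resolve(struct pkt *pkts, size_t n)
{
	const struct pkt *hint = NULL;
	size_t i;

	for (i = 0; i < n; i++) {
		if (hint && hint->daddr == pkts[i].daddr)
			pkts[i].dst = hint->dst;	/* same destination: copy */
		else
			pkts[i].dst = slow_route_lookup(pkts[i].daddr);
		hint = &pkts[i];
	}
}
```

With a burst of packets to the same address, only the first one pays
for the lookup; a burst where every destination differs degenerates to
one lookup per packet, as before.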

v3 -> v4:
 - move helpers to their own patches (Eric D.)
 - enable hints for SUBTREE builds (David A.)
 - re-enable hints for ipv4 forward (David A.)

v2 -> v3:
 - use fib*_has_custom_rules() helpers (David A.)
 - add ip*_extract_route_hint() helper (Edward C.)
 - use prev skb as hint instead of copying data (Willem)

v1 -> v2:
 - fix build issue with !CONFIG_IP*_MULTIPLE_TABLES
 - fix potential race in ip6_list_rcv_finish()

Paolo Abeni (5):
  ipv6: add fib6_has_custom_rules() helper
  ipv6: keep track of routes using src
  ipv6: introduce and uses route look hints for list input.
  ipv4: move fib4_has_custom_rules() helper to public header
  ipv4: use dst hint for ipv4 list receive

 include/net/ip6_fib.h    | 39 +++++++++++++++++++++++++++++++++++++
 include/net/ip_fib.h     | 10 ++++++++++
 include/net/netns/ipv6.h |  3 +++
 include/net/route.h      |  4 ++++
 net/ipv4/fib_frontend.c  | 10 ----------
 net/ipv4/ip_input.c      | 35 +++++++++++++++++++++++++++++----
 net/ipv4/route.c         | 42 ++++++++++++++++++++++++++++++++++++++++
 net/ipv6/ip6_fib.c       |  4 ++++
 net/ipv6/ip6_input.c     | 26 +++++++++++++++++++++++--
 net/ipv6/route.c         |  3 +++
 10 files changed, 160 insertions(+), 16 deletions(-)

-- 
2.21.0



* [PATCH net-next v4 1/5] ipv6: add fib6_has_custom_rules() helper
From: Paolo Abeni @ 2019-11-20 12:47 UTC
  To: netdev
  Cc: David S. Miller, Willem de Bruijn, Edward Cree, David Ahern,
	Eric Dumazet

It wraps the namespace field with the same name, to easily
access it regardless of build options.

Suggested-by: David Ahern <dsahern@gmail.com>
Suggested-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
 include/net/ip6_fib.h | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/include/net/ip6_fib.h b/include/net/ip6_fib.h
index 5d1615463138..8ac3a59e5126 100644
--- a/include/net/ip6_fib.h
+++ b/include/net/ip6_fib.h
@@ -502,6 +502,11 @@ static inline bool fib6_metric_locked(struct fib6_info *f6i, int metric)
 }
 
 #ifdef CONFIG_IPV6_MULTIPLE_TABLES
+static inline bool fib6_has_custom_rules(const struct net *net)
+{
+	return net->ipv6.fib6_has_custom_rules;
+}
+
 int fib6_rules_init(void);
 void fib6_rules_cleanup(void);
 bool fib6_rule_default(const struct fib_rule *rule);
@@ -527,6 +532,10 @@ static inline bool fib6_rules_early_flow_dissect(struct net *net,
 	return true;
 }
 #else
+static inline bool fib6_has_custom_rules(const struct net *net)
+{
+	return false;
+}
 static inline int               fib6_rules_init(void)
 {
 	return 0;
-- 
2.21.0



* [PATCH net-next v4 2/5] ipv6: keep track of routes using src
From: Paolo Abeni @ 2019-11-20 12:47 UTC
  To: netdev
  Cc: David S. Miller, Willem de Bruijn, Edward Cree, David Ahern,
	Eric Dumazet

Use a per namespace counter, increment it on successful creation
of any route using the source address, decrement it on deletion
of such routes.

This allows us to check easily if the routing decision in the
current namespace depends on the packet source. Will be used
by the next patch.
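
The bookkeeping can be pictured with the following userspace sketch.
It is only an approximation under simplifying assumptions: struct
demo_net, struct demo_route and the demo_*() functions are hypothetical
stand-ins, and no locking is modeled.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for struct net and struct fib6_info. */
struct demo_net {
	unsigned int routes_require_src;	/* routes with a source prefix */
};

struct demo_route {
	unsigned int src_plen;	/* source prefix length; 0 means any source */
};

static bool route_requires_src(const struct demo_route *rt)
{
	return rt->src_plen > 0;
}

/* Called after a route has been added successfully. */
static void demo_route_add(struct demo_net *net, const struct demo_route *rt)
{
	if (route_requires_src(rt))
		net->routes_require_src++;
}

/* Called when a route is deleted. */
static void demo_route_del(struct demo_net *net, const struct demo_route *rt)
{
	if (route_requires_src(rt))
		net->routes_require_src--;
}

/* True while at least one route depends on the packet source. */
static bool demo_routes_require_src(const struct demo_net *net)
{
	return net->routes_require_src > 0;
}
```

A counter rather than a flag is needed because several such routes can
coexist: the answer must stay true until the last one is removed.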

Suggested-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
 include/net/ip6_fib.h    | 30 ++++++++++++++++++++++++++++++
 include/net/netns/ipv6.h |  3 +++
 net/ipv6/ip6_fib.c       |  4 ++++
 net/ipv6/route.c         |  3 +++
 4 files changed, 40 insertions(+)

diff --git a/include/net/ip6_fib.h b/include/net/ip6_fib.h
index 8ac3a59e5126..f1535f172935 100644
--- a/include/net/ip6_fib.h
+++ b/include/net/ip6_fib.h
@@ -90,7 +90,32 @@ struct fib6_gc_args {
 
 #ifndef CONFIG_IPV6_SUBTREES
 #define FIB6_SUBTREE(fn)	NULL
+
+static inline bool fib6_routes_require_src(const struct net *net)
+{
+	return false;
+}
+
+static inline void fib6_routes_require_src_inc(struct net *net) {}
+static inline void fib6_routes_require_src_dec(struct net *net) {}
+
 #else
+
+static inline bool fib6_routes_require_src(const struct net *net)
+{
+	return net->ipv6.fib6_routes_require_src > 0;
+}
+
+static inline void fib6_routes_require_src_inc(struct net *net)
+{
+	net->ipv6.fib6_routes_require_src++;
+}
+
+static inline void fib6_routes_require_src_dec(struct net *net)
+{
+	net->ipv6.fib6_routes_require_src--;
+}
+
 #define FIB6_SUBTREE(fn)	(rcu_dereference_protected((fn)->subtree, 1))
 #endif
 
@@ -212,6 +237,11 @@ static inline struct inet6_dev *ip6_dst_idev(struct dst_entry *dst)
 	return ((struct rt6_info *)dst)->rt6i_idev;
 }
 
+static inline bool fib6_requires_src(const struct fib6_info *rt)
+{
+	return rt->fib6_src.plen > 0;
+}
+
 static inline void fib6_clean_expires(struct fib6_info *f6i)
 {
 	f6i->fib6_flags &= ~RTF_EXPIRES;
diff --git a/include/net/netns/ipv6.h b/include/net/netns/ipv6.h
index 022a0fd1a5a4..5ec054473d81 100644
--- a/include/net/netns/ipv6.h
+++ b/include/net/netns/ipv6.h
@@ -83,6 +83,9 @@ struct netns_ipv6 {
 #ifdef CONFIG_IPV6_MULTIPLE_TABLES
 	unsigned int		fib6_rules_require_fldissect;
 	bool			fib6_has_custom_rules;
+#ifdef CONFIG_IPV6_SUBTREES
+	unsigned int		fib6_routes_require_src;
+#endif
 	struct rt6_info         *ip6_prohibit_entry;
 	struct rt6_info         *ip6_blk_hole_entry;
 	struct fib6_table       *fib6_local_tbl;
diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c
index f66bc2af4e9d..7bae6a91b487 100644
--- a/net/ipv6/ip6_fib.c
+++ b/net/ipv6/ip6_fib.c
@@ -1461,6 +1461,8 @@ int fib6_add(struct fib6_node *root, struct fib6_info *rt,
 		}
 #endif
 		goto failure;
+	} else if (fib6_requires_src(rt)) {
+		fib6_routes_require_src_inc(info->nl_net);
 	}
 	return err;
 
@@ -1933,6 +1935,8 @@ int fib6_del(struct fib6_info *rt, struct nl_info *info)
 		struct fib6_info *cur = rcu_dereference_protected(*rtp,
 					lockdep_is_held(&table->tb6_lock));
 		if (rt == cur) {
+			if (fib6_requires_src(cur))
+				fib6_routes_require_src_dec(info->nl_net);
 			fib6_del_route(table, fn, rtp, info);
 			return 0;
 		}
diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index edcb52543518..c92b367e058d 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -6199,6 +6199,9 @@ static int __net_init ip6_route_net_init(struct net *net)
 	dst_init_metrics(&net->ipv6.ip6_blk_hole_entry->dst,
 			 ip6_template_metrics, true);
 	INIT_LIST_HEAD(&net->ipv6.ip6_blk_hole_entry->rt6i_uncached);
+#ifdef CONFIG_IPV6_SUBTREES
+	net->ipv6.fib6_routes_require_src = 0;
+#endif
 #endif
 
 	net->ipv6.sysctl.flush_delay = 0;
-- 
2.21.0



* [PATCH net-next v4 3/5] ipv6: introduce and uses route look hints for list input.
From: Paolo Abeni @ 2019-11-20 12:47 UTC
  To: netdev
  Cc: David S. Miller, Willem de Bruijn, Edward Cree, David Ahern,
	Eric Dumazet

When doing RX batch packet processing, we currently always repeat
the route lookup for each ingress packet. When no custom rules are
in place, and there aren't routes depending on source addresses,
we know that packets with the same destination address will use
the same dst.

This change tries to avoid the per-packet route lookup by caching
the destination address of the latest successful lookup, and
reusing it for the next packet when the above conditions are
in place. Ingress traffic for most servers should fit.
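
The two conditions can be sketched in userspace C as below. This is a
simplified model, not the patch itself: struct demo_pkt6, struct
demo_net6 and the demo_*() helpers are hypothetical names, and a plain
memcmp() stands in for the kernel address comparison.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an IPv6 sk_buff. */
struct demo_pkt6 {
	unsigned char daddr[16];	/* destination address */
	bool has_dst;			/* a dst is already attached */
};

/* Hypothetical stand-in for the per-namespace routing state. */
struct demo_net6 {
	bool has_custom_rules;
	unsigned int routes_require_src;
};

/* Return the packet as a hint, or NULL when policy forbids reuse. */
static const struct demo_pkt6 *
demo_extract_hint(const struct demo_net6 *net, const struct demo_pkt6 *pkt)
{
	if (net->routes_require_src > 0 || net->has_custom_rules)
		return NULL;

	return pkt;
}

/* The hint applies only to a dst-less packet with the same daddr. */
static bool demo_can_use_hint(const struct demo_pkt6 *pkt,
			      const struct demo_pkt6 *hint)
{
	return hint && !pkt->has_dst &&
	       memcmp(hint->daddr, pkt->daddr, sizeof(pkt->daddr)) == 0;
}
```

Checking the namespace state when the hint is extracted, rather than
once per batch, means a rule added mid-batch stops hint reuse from the
next extraction onward.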

The measured performance delta under UDP flood vs a recvmmsg
receiver is as follows:

vanilla		patched		delta
Kpps		Kpps		%
1431		1674		+17

In the worst-case scenario - each packet has a different
destination address - the performance delta is within noise
range.

v3 -> v4:
 - support hints for SUBTREE builds, too (David A.)
 - several style fixes (Eric)

v2 -> v3:
 - add fib6_has_custom_rules() helpers (David A.)
 - add ip6_extract_route_hint() helper (Edward C.)
 - use hint directly in ip6_list_rcv_finish() (Willem)

v1 -> v2:
 - fix build issue with !CONFIG_IPV6_MULTIPLE_TABLES
 - fix potential race when fib6_has_custom_rules is set
   while processing a packet batch

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
 net/ipv6/ip6_input.c | 26 ++++++++++++++++++++++++--
 1 file changed, 24 insertions(+), 2 deletions(-)

diff --git a/net/ipv6/ip6_input.c b/net/ipv6/ip6_input.c
index ef7f707d9ae3..7b089d0ac8cd 100644
--- a/net/ipv6/ip6_input.c
+++ b/net/ipv6/ip6_input.c
@@ -86,11 +86,27 @@ static void ip6_sublist_rcv_finish(struct list_head *head)
 	}
 }
 
+static bool ip6_can_use_hint(const struct sk_buff *skb,
+			     const struct sk_buff *hint)
+{
+	return hint && !skb_dst(skb) &&
+	       ipv6_addr_equal(&ipv6_hdr(hint)->daddr, &ipv6_hdr(skb)->daddr);
+}
+
+static struct sk_buff *ip6_extract_route_hint(const struct net *net,
+					      struct sk_buff *skb)
+{
+	if (fib6_routes_require_src(net) || fib6_has_custom_rules(net))
+		return NULL;
+
+	return skb;
+}
+
 static void ip6_list_rcv_finish(struct net *net, struct sock *sk,
 				struct list_head *head)
 {
+	struct sk_buff *skb, *next, *hint = NULL;
 	struct dst_entry *curr_dst = NULL;
-	struct sk_buff *skb, *next;
 	struct list_head sublist;
 
 	INIT_LIST_HEAD(&sublist);
@@ -104,9 +120,15 @@ static void ip6_list_rcv_finish(struct net *net, struct sock *sk,
 		skb = l3mdev_ip6_rcv(skb);
 		if (!skb)
 			continue;
-		ip6_rcv_finish_core(net, sk, skb);
+
+		if (ip6_can_use_hint(skb, hint))
+			skb_dst_copy(skb, hint);
+		else
+			ip6_rcv_finish_core(net, sk, skb);
 		dst = skb_dst(skb);
 		if (curr_dst != dst) {
+			hint = ip6_extract_route_hint(net, skb);
+
 			/* dispatch old sublist */
 			if (!list_empty(&sublist))
 				ip6_sublist_rcv_finish(&sublist);
-- 
2.21.0



* [PATCH net-next v4 4/5] ipv4: move fib4_has_custom_rules() helper to public header
From: Paolo Abeni @ 2019-11-20 12:47 UTC
  To: netdev
  Cc: David S. Miller, Willem de Bruijn, Edward Cree, David Ahern,
	Eric Dumazet

So that we can use it in the next patch.
Additionally constify the helper argument.

Suggested-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
 include/net/ip_fib.h    | 10 ++++++++++
 net/ipv4/fib_frontend.c | 10 ----------
 2 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/include/net/ip_fib.h b/include/net/ip_fib.h
index 52b2406a5dfc..b9cba41c6d4f 100644
--- a/include/net/ip_fib.h
+++ b/include/net/ip_fib.h
@@ -311,6 +311,11 @@ static inline int fib_lookup(struct net *net, const struct flowi4 *flp,
 	return err;
 }
 
+static inline bool fib4_has_custom_rules(const struct net *net)
+{
+	return false;
+}
+
 static inline bool fib4_rule_default(const struct fib_rule *rule)
 {
 	return true;
@@ -378,6 +383,11 @@ static inline int fib_lookup(struct net *net, struct flowi4 *flp,
 	return err;
 }
 
+static inline bool fib4_has_custom_rules(const struct net *net)
+{
+	return net->ipv4.fib_has_custom_rules;
+}
+
 bool fib4_rule_default(const struct fib_rule *rule);
 int fib4_rules_dump(struct net *net, struct notifier_block *nb,
 		    struct netlink_ext_ack *extack);
diff --git a/net/ipv4/fib_frontend.c b/net/ipv4/fib_frontend.c
index 71c78d223dfd..577db1d50a24 100644
--- a/net/ipv4/fib_frontend.c
+++ b/net/ipv4/fib_frontend.c
@@ -70,11 +70,6 @@ static int __net_init fib4_rules_init(struct net *net)
 	fib_free_table(main_table);
 	return -ENOMEM;
 }
-
-static bool fib4_has_custom_rules(struct net *net)
-{
-	return false;
-}
 #else
 
 struct fib_table *fib_new_table(struct net *net, u32 id)
@@ -131,11 +126,6 @@ struct fib_table *fib_get_table(struct net *net, u32 id)
 	}
 	return NULL;
 }
-
-static bool fib4_has_custom_rules(struct net *net)
-{
-	return net->ipv4.fib_has_custom_rules;
-}
 #endif /* CONFIG_IP_MULTIPLE_TABLES */
 
 static void fib_replace_table(struct net *net, struct fib_table *old,
-- 
2.21.0



* [PATCH net-next v4 5/5] ipv4: use dst hint for ipv4 list receive
From: Paolo Abeni @ 2019-11-20 12:47 UTC
  To: netdev
  Cc: David S. Miller, Willem de Bruijn, Edward Cree, David Ahern,
	Eric Dumazet

This is similar to the previous change, with some additional ipv4-specific
quirks. Even when using the route hint we still have to perform
additional per packet checks about source address validity: a new
helper is added to wrap them.
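
For readers unfamiliar with these checks, the address-class part can be
approximated in userspace as follows. This is only a sketch: the
demo_*() names are hypothetical, addresses are in host byte order for
readability (the kernel works on __be32), and the reverse-path
validation done by fib_validate_source() is not modeled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Address-class tests on host byte order IPv4 addresses. */
static bool demo_is_multicast(uint32_t a)
{
	return (a & 0xf0000000u) == 0xe0000000u;	/* 224.0.0.0/4 */
}

static bool demo_is_lbcast(uint32_t a)
{
	return a == 0xffffffffu;			/* 255.255.255.255 */
}

static bool demo_is_zeronet(uint32_t a)
{
	return (a & 0xff000000u) == 0;			/* 0.0.0.0/8 */
}

static bool demo_is_loopback(uint32_t a)
{
	return (a & 0xff000000u) == 0x7f000000u;	/* 127.0.0.0/8 */
}

/*
 * Source addresses that must be rejected even when the route itself
 * comes from a hint. route_localnet models the per-device setting that
 * permits loopback sources.
 */
static bool demo_saddr_is_martian(uint32_t saddr, bool route_localnet)
{
	if (demo_is_multicast(saddr) || demo_is_lbcast(saddr))
		return true;

	if (demo_is_zeronet(saddr))
		return true;

	if (demo_is_loopback(saddr) && !route_localnet)
		return true;

	return false;
}
```

The point is that the hint only vouches for the destination: two packets
with the same daddr can still carry different, possibly invalid, source
addresses.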

Hints are explicitly disabled if the destination is a local broadcast:
that keeps the code simple, and local broadcasts are a slower path anyway.

UDP flood performance vs recvmmsg() receiver:

vanilla		patched		delta
Kpps		Kpps		%
1683		1871		+11

In the worst case scenario - each packet has a different
destination address - the performance delta is within noise
range.

v3 -> v4:
 - re-enable hints for forward

v2 -> v3:
 - really fix build (sic) and hint usage check
 - use fib4_has_custom_rules() helpers (David A.)
 - add ip_extract_route_hint() helper (Edward C.)
 - use prev skb as hint instead of copying data (Willem)

v1 -> v2:
 - fix build issue with !CONFIG_IP_MULTIPLE_TABLES

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
 include/net/route.h |  4 ++++
 net/ipv4/ip_input.c | 35 +++++++++++++++++++++++++++++++----
 net/ipv4/route.c    | 42 ++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 77 insertions(+), 4 deletions(-)

diff --git a/include/net/route.h b/include/net/route.h
index 6c516840380d..a9c60fc68e36 100644
--- a/include/net/route.h
+++ b/include/net/route.h
@@ -185,6 +185,10 @@ int ip_route_input_rcu(struct sk_buff *skb, __be32 dst, __be32 src,
 		       u8 tos, struct net_device *devin,
 		       struct fib_result *res);
 
+int ip_route_use_hint(struct sk_buff *skb, __be32 dst, __be32 src,
+		      u8 tos, struct net_device *devin,
+		      const struct sk_buff *hint);
+
 static inline int ip_route_input(struct sk_buff *skb, __be32 dst, __be32 src,
 				 u8 tos, struct net_device *devin)
 {
diff --git a/net/ipv4/ip_input.c b/net/ipv4/ip_input.c
index 24a95126e698..aa438c6758a7 100644
--- a/net/ipv4/ip_input.c
+++ b/net/ipv4/ip_input.c
@@ -302,16 +302,31 @@ static inline bool ip_rcv_options(struct sk_buff *skb, struct net_device *dev)
 	return true;
 }
 
+static bool ip_can_use_hint(const struct sk_buff *skb, const struct iphdr *iph,
+			    const struct sk_buff *hint)
+{
+	return hint && !skb_dst(skb) && ip_hdr(hint)->daddr == iph->daddr &&
+	       ip_hdr(hint)->tos == iph->tos;
+}
+
 INDIRECT_CALLABLE_DECLARE(int udp_v4_early_demux(struct sk_buff *));
 INDIRECT_CALLABLE_DECLARE(int tcp_v4_early_demux(struct sk_buff *));
 static int ip_rcv_finish_core(struct net *net, struct sock *sk,
-			      struct sk_buff *skb, struct net_device *dev)
+			      struct sk_buff *skb, struct net_device *dev,
+			      const struct sk_buff *hint)
 {
 	const struct iphdr *iph = ip_hdr(skb);
 	int (*edemux)(struct sk_buff *skb);
 	struct rtable *rt;
 	int err;
 
+	if (ip_can_use_hint(skb, iph, hint)) {
+		err = ip_route_use_hint(skb, iph->daddr, iph->saddr, iph->tos,
+					dev, hint);
+		if (unlikely(err))
+			goto drop_error;
+	}
+
 	if (net->ipv4.sysctl_ip_early_demux &&
 	    !skb_dst(skb) &&
 	    !skb->sk &&
@@ -408,7 +423,7 @@ static int ip_rcv_finish(struct net *net, struct sock *sk, struct sk_buff *skb)
 	if (!skb)
 		return NET_RX_SUCCESS;
 
-	ret = ip_rcv_finish_core(net, sk, skb, dev);
+	ret = ip_rcv_finish_core(net, sk, skb, dev, NULL);
 	if (ret != NET_RX_DROP)
 		ret = dst_input(skb);
 	return ret;
@@ -535,11 +550,20 @@ static void ip_sublist_rcv_finish(struct list_head *head)
 	}
 }
 
+static struct sk_buff *ip_extract_route_hint(const struct net *net,
+					     struct sk_buff *skb, int rt_type)
+{
+	if (fib4_has_custom_rules(net) || rt_type == RTN_BROADCAST)
+		return NULL;
+
+	return skb;
+}
+
 static void ip_list_rcv_finish(struct net *net, struct sock *sk,
 			       struct list_head *head)
 {
+	struct sk_buff *skb, *next, *hint = NULL;
 	struct dst_entry *curr_dst = NULL;
-	struct sk_buff *skb, *next;
 	struct list_head sublist;
 
 	INIT_LIST_HEAD(&sublist);
@@ -554,11 +578,14 @@ static void ip_list_rcv_finish(struct net *net, struct sock *sk,
 		skb = l3mdev_ip_rcv(skb);
 		if (!skb)
 			continue;
-		if (ip_rcv_finish_core(net, sk, skb, dev) == NET_RX_DROP)
+		if (ip_rcv_finish_core(net, sk, skb, dev, hint) == NET_RX_DROP)
 			continue;
 
 		dst = skb_dst(skb);
 		if (curr_dst != dst) {
+			hint = ip_extract_route_hint(net, skb,
+					       ((struct rtable *)dst)->rt_type);
+
 			/* dispatch old sublist */
 			if (!list_empty(&sublist))
 				ip_sublist_rcv_finish(&sublist);
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index dcc4fa10138d..f88c93c38f11 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -2019,10 +2019,52 @@ static int ip_mkroute_input(struct sk_buff *skb,
 	return __mkroute_input(skb, res, in_dev, daddr, saddr, tos);
 }
 
+/* Implements all the saddr-related checks as ip_route_input_slow(),
+ * assuming daddr is valid and the destination is not a local broadcast one.
+ * Uses the provided hint instead of performing a route lookup.
+ */
+int ip_route_use_hint(struct sk_buff *skb, __be32 daddr, __be32 saddr,
+		      u8 tos, struct net_device *dev,
+		      const struct sk_buff *hint)
+{
+	struct in_device *in_dev = __in_dev_get_rcu(dev);
+	struct rtable *rt = skb_rtable(hint);
+	struct net *net = dev_net(dev);
+	int err = -EINVAL;
+	u32 tag = 0;
+
+	if (ipv4_is_multicast(saddr) || ipv4_is_lbcast(saddr))
+		goto martian_source;
+
+	if (ipv4_is_zeronet(saddr))
+		goto martian_source;
+
+	if (ipv4_is_loopback(saddr) && !IN_DEV_NET_ROUTE_LOCALNET(in_dev, net))
+		goto martian_source;
+
+	if (rt->rt_type != RTN_LOCAL)
+		goto skip_validate_source;
+
+	tos &= IPTOS_RT_MASK;
+	err = fib_validate_source(skb, saddr, daddr, tos, 0, dev, in_dev, &tag);
+	if (err < 0)
+		goto martian_source;
+
+skip_validate_source:
+	skb_dst_copy(skb, hint);
+	return 0;
+
+martian_source:
+	ip_handle_martian_source(dev, in_dev, skb, daddr, saddr);
+	return err;
+}
+
 /*
  *	NOTE. We drop all the packets that has local source
  *	addresses, because every properly looped back packet
  *	must have correct destination already attached by output routine.
+ *	Changes in the enforced policies must be applied also to
+ *	ip_route_use_hint().
  *
  *	Such approach solves two big problems:
  *	1. Not simplex devices are handled properly.
-- 
2.21.0



* Re: [PATCH net-next v4 0/5] net: introduce and use route hint
From: Edward Cree @ 2019-11-20 16:54 UTC
  To: Paolo Abeni, netdev
  Cc: David S. Miller, Willem de Bruijn, David Ahern, Eric Dumazet

On 20/11/2019 12:47, Paolo Abeni wrote:
> This series leverages the listification infrastructure to avoid
> unnecessary route lookups on ingress packets. In the absence of custom rules,
> packets with equal daddr will usually land on the same dst.
>
> When processing packet bursts (lists) we can easily reference the previous
> dst entry. When we hit the 'same destination' condition we can avoid the
> route lookup, copying the already available dst.
>
> Detailed performance numbers are available in the individual commit
> messages.
I wonder if you could use static keys for the fib*_has_custom_rules()
 and if that would gain you any extra speed?
Other than that,
Acked-by: Edward Cree <ecree@solarflare.com>
 for the series.
>
> v3 -> v4:
>  - move helpers to their own patches (Eric D.)
>  - enable hints for SUBTREE builds (David A.)
>  - re-enable hints for ipv4 forward (David A.)
>
> v2 -> v3:
>  - use fib*_has_custom_rules() helpers (David A.)
>  - add ip*_extract_route_hint() helper (Edward C.)
>  - use prev skb as hint instead of copying data (Willem)
>
> v1 -> v2:
>  - fix build issue with !CONFIG_IP*_MULTIPLE_TABLES
>  - fix potential race in ip6_list_rcv_finish()
>
> Paolo Abeni (5):
>   ipv6: add fib6_has_custom_rules() helper
>   ipv6: keep track of routes using src
>   ipv6: introduce and uses route look hints for list input.
>   ipv4: move fib4_has_custom_rules() helper to public header
>   ipv4: use dst hint for ipv4 list receive
>
>  include/net/ip6_fib.h    | 39 +++++++++++++++++++++++++++++++++++++
>  include/net/ip_fib.h     | 10 ++++++++++
>  include/net/netns/ipv6.h |  3 +++
>  include/net/route.h      |  4 ++++
>  net/ipv4/fib_frontend.c  | 10 ----------
>  net/ipv4/ip_input.c      | 35 +++++++++++++++++++++++++++++----
>  net/ipv4/route.c         | 42 ++++++++++++++++++++++++++++++++++++++++
>  net/ipv6/ip6_fib.c       |  4 ++++
>  net/ipv6/ip6_input.c     | 26 +++++++++++++++++++++++--
>  net/ipv6/route.c         |  3 +++
>  10 files changed, 160 insertions(+), 16 deletions(-)
>



* Re: [PATCH net-next v4 1/5] ipv6: add fib6_has_custom_rules() helper
From: David Ahern @ 2019-11-21 20:07 UTC
  To: Paolo Abeni, netdev
  Cc: David S. Miller, Willem de Bruijn, Edward Cree, Eric Dumazet

On 11/20/19 5:47 AM, Paolo Abeni wrote:
> It wraps the namespace field with the same name, to easily
> access it regardless of build options.
> 
> Suggested-by: David Ahern <dsahern@gmail.com>
> Suggested-by: Eric Dumazet <eric.dumazet@gmail.com>
> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
> ---
>  include/net/ip6_fib.h | 9 +++++++++
>  1 file changed, 9 insertions(+)
> 

Reviewed-by: David Ahern <dsahern@gmail.com>




* Re: [PATCH net-next v4 2/5] ipv6: keep track of routes using src
From: David Ahern @ 2019-11-21 20:09 UTC
  To: Paolo Abeni, netdev
  Cc: David S. Miller, Willem de Bruijn, Edward Cree, Eric Dumazet

On 11/20/19 5:47 AM, Paolo Abeni wrote:
> Use a per namespace counter, increment it on successful creation
> of any route using the source address, decrement it on deletion
> of such routes.
> 
> This allows us to check easily if the routing decision in the
> current namespace depends on the packet source. Will be used
> by the next patch.
> 
> Suggested-by: David Ahern <dsahern@gmail.com>
> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
> ---
>  include/net/ip6_fib.h    | 30 ++++++++++++++++++++++++++++++
>  include/net/netns/ipv6.h |  3 +++
>  net/ipv6/ip6_fib.c       |  4 ++++
>  net/ipv6/route.c         |  3 +++
>  4 files changed, 40 insertions(+)
> 

Reviewed-by: David Ahern <dsahern@gmail.com>


* Re: [PATCH net-next v4 3/5] ipv6: introduce and uses route look hints for list input.
From: David Ahern @ 2019-11-21 20:11 UTC
  To: Paolo Abeni, netdev
  Cc: David S. Miller, Willem de Bruijn, Edward Cree, Eric Dumazet

On 11/20/19 5:47 AM, Paolo Abeni wrote:
> When doing RX batch packet processing, we currently always repeat
> the route lookup for each ingress packet. When no custom rules are
> in place, and there aren't routes depending on source addresses,
> we know that packets with the same destination address will use
> the same dst.
> 
> This change tries to avoid the per-packet route lookup by caching
> the destination address of the latest successful lookup, and
> reusing it for the next packet when the above conditions are
> in place. Ingress traffic for most servers should fit.
> 
> The measured performance delta under UDP flood vs a recvmmsg
> receiver is as follows:
> 
> vanilla		patched		delta
> Kpps		Kpps		%
> 1431		1674		+17
> 
> In the worst-case scenario - each packet has a different
> destination address - the performance delta is within noise
> range.
> 
...

> 
> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
> ---
>  net/ipv6/ip6_input.c | 26 ++++++++++++++++++++++++--
>  1 file changed, 24 insertions(+), 2 deletions(-)

Reviewed-by: David Ahern <dsahern@gmail.com>


* Re: [PATCH net-next v4 4/5] ipv4: move fib4_has_custom_rules() helper to public header
From: David Ahern @ 2019-11-21 20:12 UTC
  To: Paolo Abeni, netdev
  Cc: David S. Miller, Willem de Bruijn, Edward Cree, Eric Dumazet

On 11/20/19 5:47 AM, Paolo Abeni wrote:
> So that we can use it in the next patch.
> Additionally constify the helper argument.
> 
> Suggested-by: David Ahern <dsahern@gmail.com>
> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
> ---
>  include/net/ip_fib.h    | 10 ++++++++++
>  net/ipv4/fib_frontend.c | 10 ----------
>  2 files changed, 10 insertions(+), 10 deletions(-)
> 

Reviewed-by: David Ahern <dsahern@gmail.com>


* Re: [PATCH net-next v4 5/5] ipv4: use dst hint for ipv4 list receive
From: David Ahern @ 2019-11-21 21:16 UTC
  To: Paolo Abeni, netdev
  Cc: David S. Miller, Willem de Bruijn, Edward Cree, Eric Dumazet

On 11/20/19 5:47 AM, Paolo Abeni wrote:
> This is similar to the previous change, with some additional ipv4-specific
> quirks. Even when using the route hint we still have to perform
> additional per packet checks about source address validity: a new
> helper is added to wrap them.
> 
> Hints are explicitly disabled if the destination is a local broadcast:
> that keeps the code simple, and local broadcasts are a slower path anyway.
> 
> UDP flood performance vs recvmmsg() receiver:
> 
> vanilla		patched		delta
> Kpps		Kpps		%
> 1683		1871		+11
> 
> In the worst case scenario - each packet has a different
> destination address - the performance delta is within noise
> range.
> 
> v3 -> v4:
>  - re-enable hints for forward
> 
> v2 -> v3:
>  - really fix build (sic) and hint usage check
>  - use fib4_has_custom_rules() helpers (David A.)
>  - add ip_extract_route_hint() helper (Edward C.)
>  - use prev skb as hint instead of copying data (Willem)
> 
> v1 -> v2:
>  - fix build issue with !CONFIG_IP_MULTIPLE_TABLES
> 
> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
> ---
>  include/net/route.h |  4 ++++
>  net/ipv4/ip_input.c | 35 +++++++++++++++++++++++++++++++----
>  net/ipv4/route.c    | 42 ++++++++++++++++++++++++++++++++++++++++++
>  3 files changed, 77 insertions(+), 4 deletions(-)
> 

Reviewed-by: David Ahern <dsahern@gmail.com>




* Re: [PATCH net-next v4 0/5] net: introduce and use route hint
From: David Miller @ 2019-11-21 22:46 UTC
  To: pabeni; +Cc: netdev, willemdebruijn.kernel, ecree, dsahern, eric.dumazet

From: Paolo Abeni <pabeni@redhat.com>
Date: Wed, 20 Nov 2019 13:47:32 +0100

> This series leverages the listification infrastructure to avoid
> unnecessary route lookups on ingress packets. In the absence of custom rules,
> packets with equal daddr will usually land on the same dst.
> 
> When processing packet bursts (lists) we can easily reference the previous
> dst entry. When we hit the 'same destination' condition we can avoid the
> route lookup, copying the already available dst.
> 
> Detailed performance numbers are available in the individual commit
> messages.
 ...

Series applied, thanks.



Thread overview: 13+ messages
2019-11-20 12:47 [PATCH net-next v4 0/5] net: introduce and use route hint Paolo Abeni
2019-11-20 12:47 ` [PATCH net-next v4 1/5] ipv6: add fib6_has_custom_rules() helper Paolo Abeni
2019-11-21 20:07   ` David Ahern
2019-11-20 12:47 ` [PATCH net-next v4 2/5] ipv6: keep track of routes using src Paolo Abeni
2019-11-21 20:09   ` David Ahern
2019-11-20 12:47 ` [PATCH net-next v4 3/5] ipv6: introduce and uses route look hints for list input Paolo Abeni
2019-11-21 20:11   ` David Ahern
2019-11-20 12:47 ` [PATCH net-next v4 4/5] ipv4: move fib4_has_custom_rules() helper to public header Paolo Abeni
2019-11-21 20:12   ` David Ahern
2019-11-20 12:47 ` [PATCH net-next v4 5/5] ipv4: use dst hint for ipv4 list receive Paolo Abeni
2019-11-21 21:16   ` David Ahern
2019-11-20 16:54 ` [PATCH net-next v4 0/5] net: introduce and use route hint Edward Cree
2019-11-21 22:46 ` David Miller
