All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 net-next 0/3] gro: micro-optimize dev_gro_receive()
@ 2021-03-13 20:29 Alexander Lobakin
  2021-03-13 20:30 ` [PATCH v2 net-next 1/3] gro: simplify gro_list_prepare() Alexander Lobakin
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: Alexander Lobakin @ 2021-03-13 20:29 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Alexander Lobakin, Eric Dumazet, Wei Wang, Cong Wang, Taehee Yoo,
	netdev, linux-kernel

This random series addresses some of suboptimal constructions used
in the main GRO entry point.
The main body is gro_list_prepare() simplification and pointer usage
optimization in dev_gro_receive() itself. Being mostly cosmetic, it
gives like +10 Mbps on my setup to both TCP and UDP (both single- and
multi-flow).

Since v1 [0]:
 - drop the replacement of bucket index calculation with
   reciprocal_scale() since it makes absolutely no sense (Eric);
 - improve stack usage in dev_gro_receive() (Eric);
 - reverse the order of patches to avoid changes superseding.

[0] https://lore.kernel.org/netdev/20210312162127.239795-1-alobakin@pm.me

Alexander Lobakin (3):
  gro: simplify gro_list_prepare()
  gro: consistentify napi->gro_hash[x] access in dev_gro_receive()
  gro: give 'hash' variable in dev_gro_receive() a less confusing name

 net/core/dev.c | 40 ++++++++++++++++++----------------------
 1 file changed, 18 insertions(+), 22 deletions(-)

--
2.30.2



^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH v2 net-next 1/3] gro: simplify gro_list_prepare()
  2021-03-13 20:29 [PATCH v2 net-next 0/3] gro: micro-optimize dev_gro_receive() Alexander Lobakin
@ 2021-03-13 20:30 ` Alexander Lobakin
  2021-03-13 20:30 ` [PATCH v2 net-next 2/3] gro: consistentify napi->gro_hash[x] access in dev_gro_receive() Alexander Lobakin
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: Alexander Lobakin @ 2021-03-13 20:30 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Alexander Lobakin, Eric Dumazet, Wei Wang, Cong Wang, Taehee Yoo,
	netdev, linux-kernel

gro_list_prepare() always returns &napi->gro_hash[bucket].list,
without any variations. Moreover, it uses 'napi' argument only to
have access to this list, and calculates the bucket index for the
second time (firstly it happens at the beginning of
dev_gro_receive()) to do that.
Given that dev_gro_receive() already has an index to the needed
list, just pass it as the first argument to eliminate redundant
calculations, and make gro_list_prepare() return void.
Also, both arguments of gro_list_prepare() can be constified since
this function can only modify the skbs from the bucket list.

Signed-off-by: Alexander Lobakin <alobakin@pm.me>
---
 net/core/dev.c | 12 ++++--------
 1 file changed, 4 insertions(+), 8 deletions(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index 2bfdd528c7c3..1317e6b6758a 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -5858,15 +5858,13 @@ void napi_gro_flush(struct napi_struct *napi, bool flush_old)
 }
 EXPORT_SYMBOL(napi_gro_flush);

-static struct list_head *gro_list_prepare(struct napi_struct *napi,
-					  struct sk_buff *skb)
+static void gro_list_prepare(const struct list_head *head,
+			     const struct sk_buff *skb)
 {
 	unsigned int maclen = skb->dev->hard_header_len;
 	u32 hash = skb_get_hash_raw(skb);
-	struct list_head *head;
 	struct sk_buff *p;

-	head = &napi->gro_hash[hash & (GRO_HASH_BUCKETS - 1)].list;
 	list_for_each_entry(p, head, list) {
 		unsigned long diffs;

@@ -5892,8 +5890,6 @@ static struct list_head *gro_list_prepare(struct napi_struct *napi,
 				       maclen);
 		NAPI_GRO_CB(p)->same_flow = !diffs;
 	}
-
-	return head;
 }

 static void skb_gro_reset_offset(struct sk_buff *skb)
@@ -5957,10 +5953,10 @@ static void gro_flush_oldest(struct napi_struct *napi, struct list_head *head)
 static enum gro_result dev_gro_receive(struct napi_struct *napi, struct sk_buff *skb)
 {
 	u32 hash = skb_get_hash_raw(skb) & (GRO_HASH_BUCKETS - 1);
+	struct list_head *gro_head = &napi->gro_hash[hash].list;
 	struct list_head *head = &offload_base;
 	struct packet_offload *ptype;
 	__be16 type = skb->protocol;
-	struct list_head *gro_head;
 	struct sk_buff *pp = NULL;
 	enum gro_result ret;
 	int same_flow;
@@ -5969,7 +5965,7 @@ static enum gro_result dev_gro_receive(struct napi_struct *napi, struct sk_buff
 	if (netif_elide_gro(skb->dev))
 		goto normal;

-	gro_head = gro_list_prepare(napi, skb);
+	gro_list_prepare(gro_head, skb);

 	rcu_read_lock();
 	list_for_each_entry_rcu(ptype, head, list) {
--
2.30.2



^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH v2 net-next 2/3] gro: consistentify napi->gro_hash[x] access in dev_gro_receive()
  2021-03-13 20:29 [PATCH v2 net-next 0/3] gro: micro-optimize dev_gro_receive() Alexander Lobakin
  2021-03-13 20:30 ` [PATCH v2 net-next 1/3] gro: simplify gro_list_prepare() Alexander Lobakin
@ 2021-03-13 20:30 ` Alexander Lobakin
  2021-03-13 20:30 ` [PATCH v2 net-next 3/3] gro: give 'hash' variable in dev_gro_receive() a less confusing name Alexander Lobakin
  2021-03-15  8:50 ` [PATCH v2 net-next 0/3] gro: micro-optimize dev_gro_receive() Eric Dumazet
  3 siblings, 0 replies; 5+ messages in thread
From: Alexander Lobakin @ 2021-03-13 20:30 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Alexander Lobakin, Eric Dumazet, Wei Wang, Cong Wang, Taehee Yoo,
	netdev, linux-kernel

GRO bucket index doesn't change through the entire function.
Store a pointer to the corresponding bucket instead of its member
and use it consistently through the function.
It is performance-safe since &gro_list->list == gro_list.

Misc: remove superfluous braces around single-line branches.

Signed-off-by: Alexander Lobakin <alobakin@pm.me>
---
 net/core/dev.c | 22 +++++++++++-----------
 1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index 1317e6b6758a..b635467087f3 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -5953,7 +5953,7 @@ static void gro_flush_oldest(struct napi_struct *napi, struct list_head *head)
 static enum gro_result dev_gro_receive(struct napi_struct *napi, struct sk_buff *skb)
 {
 	u32 hash = skb_get_hash_raw(skb) & (GRO_HASH_BUCKETS - 1);
-	struct list_head *gro_head = &napi->gro_hash[hash].list;
+	struct gro_list *gro_list = &napi->gro_hash[hash];
 	struct list_head *head = &offload_base;
 	struct packet_offload *ptype;
 	__be16 type = skb->protocol;
@@ -5965,7 +5965,7 @@ static enum gro_result dev_gro_receive(struct napi_struct *napi, struct sk_buff
 	if (netif_elide_gro(skb->dev))
 		goto normal;

-	gro_list_prepare(gro_head, skb);
+	gro_list_prepare(&gro_list->list, skb);

 	rcu_read_lock();
 	list_for_each_entry_rcu(ptype, head, list) {
@@ -6001,7 +6001,7 @@ static enum gro_result dev_gro_receive(struct napi_struct *napi, struct sk_buff

 		pp = INDIRECT_CALL_INET(ptype->callbacks.gro_receive,
 					ipv6_gro_receive, inet_gro_receive,
-					gro_head, skb);
+					&gro_list->list, skb);
 		break;
 	}
 	rcu_read_unlock();
@@ -6020,7 +6020,7 @@ static enum gro_result dev_gro_receive(struct napi_struct *napi, struct sk_buff
 	if (pp) {
 		skb_list_del_init(pp);
 		napi_gro_complete(napi, pp);
-		napi->gro_hash[hash].count--;
+		gro_list->count--;
 	}

 	if (same_flow)
@@ -6029,16 +6029,16 @@ static enum gro_result dev_gro_receive(struct napi_struct *napi, struct sk_buff
 	if (NAPI_GRO_CB(skb)->flush)
 		goto normal;

-	if (unlikely(napi->gro_hash[hash].count >= MAX_GRO_SKBS)) {
-		gro_flush_oldest(napi, gro_head);
-	} else {
-		napi->gro_hash[hash].count++;
-	}
+	if (unlikely(gro_list->count >= MAX_GRO_SKBS))
+		gro_flush_oldest(napi, &gro_list->list);
+	else
+		gro_list->count++;
+
 	NAPI_GRO_CB(skb)->count = 1;
 	NAPI_GRO_CB(skb)->age = jiffies;
 	NAPI_GRO_CB(skb)->last = skb;
 	skb_shinfo(skb)->gso_size = skb_gro_len(skb);
-	list_add(&skb->list, gro_head);
+	list_add(&skb->list, &gro_list->list);
 	ret = GRO_HELD;

 pull:
@@ -6046,7 +6046,7 @@ static enum gro_result dev_gro_receive(struct napi_struct *napi, struct sk_buff
 	if (grow > 0)
 		gro_pull_from_frag0(skb, grow);
 ok:
-	if (napi->gro_hash[hash].count) {
+	if (gro_list->count) {
 		if (!test_bit(hash, &napi->gro_bitmask))
 			__set_bit(hash, &napi->gro_bitmask);
 	} else if (test_bit(hash, &napi->gro_bitmask)) {
--
2.30.2



^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH v2 net-next 3/3] gro: give 'hash' variable in dev_gro_receive() a less confusing name
  2021-03-13 20:29 [PATCH v2 net-next 0/3] gro: micro-optimize dev_gro_receive() Alexander Lobakin
  2021-03-13 20:30 ` [PATCH v2 net-next 1/3] gro: simplify gro_list_prepare() Alexander Lobakin
  2021-03-13 20:30 ` [PATCH v2 net-next 2/3] gro: consistentify napi->gro_hash[x] access in dev_gro_receive() Alexander Lobakin
@ 2021-03-13 20:30 ` Alexander Lobakin
  2021-03-15  8:50 ` [PATCH v2 net-next 0/3] gro: micro-optimize dev_gro_receive() Eric Dumazet
  3 siblings, 0 replies; 5+ messages in thread
From: Alexander Lobakin @ 2021-03-13 20:30 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Alexander Lobakin, Eric Dumazet, Wei Wang, Cong Wang, Taehee Yoo,
	netdev, linux-kernel

'hash' stores not the flow hash, but the index of the GRO bucket
corresponding to it.
Change its name to 'bucket' to avoid confusion while reading lines
like '__set_bit(hash, &napi->gro_bitmask)'.

Signed-off-by: Alexander Lobakin <alobakin@pm.me>
---
 net/core/dev.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index b635467087f3..5a2847a19cf2 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -5952,8 +5952,8 @@ static void gro_flush_oldest(struct napi_struct *napi, struct list_head *head)

 static enum gro_result dev_gro_receive(struct napi_struct *napi, struct sk_buff *skb)
 {
-	u32 hash = skb_get_hash_raw(skb) & (GRO_HASH_BUCKETS - 1);
-	struct gro_list *gro_list = &napi->gro_hash[hash];
+	u32 bucket = skb_get_hash_raw(skb) & (GRO_HASH_BUCKETS - 1);
+	struct gro_list *gro_list = &napi->gro_hash[bucket];
 	struct list_head *head = &offload_base;
 	struct packet_offload *ptype;
 	__be16 type = skb->protocol;
@@ -6047,10 +6047,10 @@ static enum gro_result dev_gro_receive(struct napi_struct *napi, struct sk_buff
 		gro_pull_from_frag0(skb, grow);
 ok:
 	if (gro_list->count) {
-		if (!test_bit(hash, &napi->gro_bitmask))
-			__set_bit(hash, &napi->gro_bitmask);
-	} else if (test_bit(hash, &napi->gro_bitmask)) {
-		__clear_bit(hash, &napi->gro_bitmask);
+		if (!test_bit(bucket, &napi->gro_bitmask))
+			__set_bit(bucket, &napi->gro_bitmask);
+	} else if (test_bit(bucket, &napi->gro_bitmask)) {
+		__clear_bit(bucket, &napi->gro_bitmask);
 	}

 	return ret;
--
2.30.2



^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH v2 net-next 0/3] gro: micro-optimize dev_gro_receive()
  2021-03-13 20:29 [PATCH v2 net-next 0/3] gro: micro-optimize dev_gro_receive() Alexander Lobakin
                   ` (2 preceding siblings ...)
  2021-03-13 20:30 ` [PATCH v2 net-next 3/3] gro: give 'hash' variable in dev_gro_receive() a less confusing name Alexander Lobakin
@ 2021-03-15  8:50 ` Eric Dumazet
  3 siblings, 0 replies; 5+ messages in thread
From: Eric Dumazet @ 2021-03-15  8:50 UTC (permalink / raw)
  To: Alexander Lobakin
  Cc: David S. Miller, Jakub Kicinski, Alexei Starovoitov,
	Daniel Borkmann, Andrii Nakryiko, Wei Wang, Cong Wang,
	Taehee Yoo, netdev, LKML

On Sat, Mar 13, 2021 at 9:30 PM Alexander Lobakin <alobakin@pm.me> wrote:
>
> This random series addresses some of suboptimal constructions used
> in the main GRO entry point.
> The main body is gro_list_prepare() simplification and pointer usage
> optimization in dev_gro_receive() itself. Being mostly cosmetic, it
> gives like +10 Mbps on my setup to both TCP and UDP (both single- and
> multi-flow).
>
> Since v1 [0]:
>  - drop the replacement of bucket index calculation with
>    reciprocal_scale() since it makes absolutely no sense (Eric);
>  - improve stack usage in dev_gro_receive() (Eric);
>  - reverse the order of patches to avoid changes superseding.
>
> [0] https://lore.kernel.org/netdev/20210312162127.239795-1-alobakin@pm.me
>

SGTM, thanks.

Reviewed-by: Eric Dumazet <edumaet@google.com>

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2021-03-15  8:52 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-03-13 20:29 [PATCH v2 net-next 0/3] gro: micro-optimize dev_gro_receive() Alexander Lobakin
2021-03-13 20:30 ` [PATCH v2 net-next 1/3] gro: simplify gro_list_prepare() Alexander Lobakin
2021-03-13 20:30 ` [PATCH v2 net-next 2/3] gro: consistentify napi->gro_hash[x] access in dev_gro_receive() Alexander Lobakin
2021-03-13 20:30 ` [PATCH v2 net-next 3/3] gro: give 'hash' variable in dev_gro_receive() a less confusing name Alexander Lobakin
2021-03-15  8:50 ` [PATCH v2 net-next 0/3] gro: micro-optimize dev_gro_receive() Eric Dumazet

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.