All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 4.14 0/2] Backport fixes for 3226b158e67c
@ 2021-07-29 18:47 Matthieu Baerts
  2021-07-29 18:47 ` [PATCH 4.14 1/2] virtio_net: Do not pull payload in skb->head Matthieu Baerts
  2021-07-29 18:47 ` [PATCH 4.14 2/2] gro: ensure frag0 meets IP header alignment Matthieu Baerts
  0 siblings, 2 replies; 3+ messages in thread
From: Matthieu Baerts @ 2021-07-29 18:47 UTC (permalink / raw)
  To: stable; +Cc: Eric Dumazet, Matthieu Baerts

Commit 3226b158e67c ("net: avoid 32 x truesize under-estimation for tiny skbs")
introduces a ~10% performance drop when using virtio-net drivers.

This commit has been backported to v4.14 in commit 40b95b92f1db and this
performance drop is also visible there.
Here at Tessares, we can also notice this drop with the MPTCP fork [1]
on top of the v4.14 kernel.

Eric Dumazet already fixed this issue a few months ago, see
commit 0f6925b3e8da ("virtio_net: Do not pull payload in skb->head").

Unfortunately, this patch has not been backported to < v5.4 because it
caused issues [2]. Indeed, after having backported it, the kernel fails
to compile. Please refer to patch 1/2 for more details.

A new version of this patch is proposed here fixing the compilation
issue. It has been validated: it fixes the original issue on v4.14 as
well.

Please note that there is also a fix for the fix, see
commit 38ec4944b593 ("gro: ensure frag0 meets IP header alignment").

This second fix has also not been backported because it caused issues as
well [3]. Here, it was due to a conflict but also a compilation error
when the conflict has been resolved. Please refer to patch 2/2 for more
details.

One last note: It looks like it could be interesting to backport these
two patches to v4.9 and v4.4 as well but unfortunately, the backport of
these two patches fails with conflicts and I don't have any setup to
validate the performance drop and fix with v4.9 and v4.4 kernels.

[1] https://github.com/multipath-tcp/mptcp
[2] https://lore.kernel.org/stable/161806389310822@kroah.com/
[3] https://lore.kernel.org/stable/16187490172453@kroah.com/

Eric Dumazet (2):
  virtio_net: Do not pull payload in skb->head
  gro: ensure frag0 meets IP header alignment

 drivers/net/virtio_net.c   | 10 +++++++---
 include/linux/skbuff.h     |  9 +++++++++
 include/linux/virtio_net.h | 14 +++++++++-----
 net/core/dev.c             |  3 ++-
 4 files changed, 27 insertions(+), 9 deletions(-)

-- 
2.31.1


^ permalink raw reply	[flat|nested] 3+ messages in thread

* [PATCH 4.14 1/2] virtio_net: Do not pull payload in skb->head
  2021-07-29 18:47 [PATCH 4.14 0/2] Backport fixes for 3226b158e67c Matthieu Baerts
@ 2021-07-29 18:47 ` Matthieu Baerts
  2021-07-29 18:47 ` [PATCH 4.14 2/2] gro: ensure frag0 meets IP header alignment Matthieu Baerts
  1 sibling, 0 replies; 3+ messages in thread
From: Matthieu Baerts @ 2021-07-29 18:47 UTC (permalink / raw)
  To: stable
  Cc: Eric Dumazet, Xuan Zhuo, Michael S. Tsirkin, Jason Wang,
	virtualization, David S . Miller, Matthieu Baerts

From: Eric Dumazet <edumazet@google.com>

[ Upstream commit 0f6925b3e8da0dbbb52447ca8a8b42b371aac7db ]

Xuan Zhuo reported that commit 3226b158e67c ("net: avoid 32 x truesize
under-estimation for tiny skbs") brought  a ~10% performance drop.

The reason for the performance drop was that GRO was forced
to chain sk_buff (using skb_shinfo(skb)->frag_list), which
uses more memory but also cause packet consumers to go over
a lot of overhead handling all the tiny skbs.

It turns out that virtio_net page_to_skb() has a wrong strategy :
It allocates skbs with GOOD_COPY_LEN (128) bytes in skb->head, then
copies 128 bytes from the page, before feeding the packet to GRO stack.

This was suboptimal before commit 3226b158e67c ("net: avoid 32 x truesize
under-estimation for tiny skbs") because GRO was using 2 frags per MSS,
meaning we were not packing MSS with 100% efficiency.

Fix is to pull only the ethernet header in page_to_skb()

Then, we change virtio_net_hdr_to_skb() to pull the missing
headers, instead of assuming they were already pulled by callers.

This fixes the performance regression, but could also allow virtio_net
to accept packets with more than 128bytes of headers.

Many thanks to Xuan Zhuo for his report, and his tests/help.

[ Backport note ]

    There was no conflict but the upstream patch was using a variable
    (metasize) not defined in < v5.7 kernels. This new variable has been
    introduced by commit 503d539a6e417 ("virtio_net: Add XDP meta data support").
    This commit has been backported in 5.4.112 and 4.19.186 to allow this
    commit 0f6925b3e8da0 ("virtio_net: Do not pull payload in skb->head")
    to be backported without issue.

    I don't think we want to backport this new feature related to XDP in
    older kernels < v4.19. In this version here, we simply drop this
    'metasize' variable as there is no need for it.

Fixes: 3226b158e67c ("net: avoid 32 x truesize under-estimation for tiny skbs")
Reported-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Link: https://www.spinics.net/lists/netdev/msg731397.html
Co-Developed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: virtualization@lists.linux-foundation.org
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
---
 drivers/net/virtio_net.c   | 10 +++++++---
 include/linux/virtio_net.h | 14 +++++++++-----
 2 files changed, 16 insertions(+), 8 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index c8abbf81ef52..9e18389309cf 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -339,9 +339,13 @@ static struct sk_buff *page_to_skb(struct virtnet_info *vi,
 	offset += hdr_padded_len;
 	p += hdr_padded_len;
 
-	copy = len;
-	if (copy > skb_tailroom(skb))
-		copy = skb_tailroom(skb);
+	/* Copy all frame if it fits skb->head, otherwise
+	 * we let virtio_net_hdr_to_skb() and GRO pull headers as needed.
+	 */
+	if (len <= skb_tailroom(skb))
+		copy = len;
+	else
+		copy = ETH_HLEN;
 	skb_put_data(skb, p, copy);
 
 	len -= copy;
diff --git a/include/linux/virtio_net.h b/include/linux/virtio_net.h
index 48afea1b8b4e..ca68826d8449 100644
--- a/include/linux/virtio_net.h
+++ b/include/linux/virtio_net.h
@@ -65,14 +65,18 @@ static inline int virtio_net_hdr_to_skb(struct sk_buff *skb,
 	skb_reset_mac_header(skb);
 
 	if (hdr->flags & VIRTIO_NET_HDR_F_NEEDS_CSUM) {
-		u16 start = __virtio16_to_cpu(little_endian, hdr->csum_start);
-		u16 off = __virtio16_to_cpu(little_endian, hdr->csum_offset);
+		u32 start = __virtio16_to_cpu(little_endian, hdr->csum_start);
+		u32 off = __virtio16_to_cpu(little_endian, hdr->csum_offset);
+		u32 needed = start + max_t(u32, thlen, off + sizeof(__sum16));
+
+		if (!pskb_may_pull(skb, needed))
+			return -EINVAL;
 
 		if (!skb_partial_csum_set(skb, start, off))
 			return -EINVAL;
 
 		p_off = skb_transport_offset(skb) + thlen;
-		if (p_off > skb_headlen(skb))
+		if (!pskb_may_pull(skb, p_off))
 			return -EINVAL;
 	} else {
 		/* gso packets without NEEDS_CSUM do not set transport_offset.
@@ -100,14 +104,14 @@ static inline int virtio_net_hdr_to_skb(struct sk_buff *skb,
 			}
 
 			p_off = keys.control.thoff + thlen;
-			if (p_off > skb_headlen(skb) ||
+			if (!pskb_may_pull(skb, p_off) ||
 			    keys.basic.ip_proto != ip_proto)
 				return -EINVAL;
 
 			skb_set_transport_header(skb, keys.control.thoff);
 		} else if (gso_type) {
 			p_off = thlen;
-			if (p_off > skb_headlen(skb))
+			if (!pskb_may_pull(skb, p_off))
 				return -EINVAL;
 		}
 	}
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* [PATCH 4.14 2/2] gro: ensure frag0 meets IP header alignment
  2021-07-29 18:47 [PATCH 4.14 0/2] Backport fixes for 3226b158e67c Matthieu Baerts
  2021-07-29 18:47 ` [PATCH 4.14 1/2] virtio_net: Do not pull payload in skb->head Matthieu Baerts
@ 2021-07-29 18:47 ` Matthieu Baerts
  1 sibling, 0 replies; 3+ messages in thread
From: Matthieu Baerts @ 2021-07-29 18:47 UTC (permalink / raw)
  To: stable
  Cc: Eric Dumazet, Guenter Roeck, Xuan Zhuo, Michael S. Tsirkin,
	Jason Wang, David S . Miller, Matthieu Baerts

From: Eric Dumazet <edumazet@google.com>

[ Upstream commit 38ec4944b593fd90c5ef42aaaa53e66ae5769d04 ]

After commit 0f6925b3e8da ("virtio_net: Do not pull payload in skb->head")
Guenter Roeck reported one failure in his tests using sh architecture.

After much debugging, we have been able to spot silent unaligned accesses
in inet_gro_receive()

The issue at hand is that upper networking stacks assume their header
is word-aligned. Low level drivers are supposed to reserve NET_IP_ALIGN
bytes before the Ethernet header to make that happen.

This patch hardens skb_gro_reset_offset() to not allow frag0 fast-path
if the fragment is not properly aligned.

Some arches like x86, arm64 and powerpc do not care and define NET_IP_ALIGN
as 0, this extra check will be a NOP for them.

Note that if frag0 is not used, GRO will call pskb_may_pull()
as many times as needed to pull network and transport headers.

[ Backport note ]

    A small conflict has been reported:

        ++<<<<<<< HEAD
         +      if (skb_mac_header(skb) == skb_tail_pointer(skb) &&
         +          pinfo->nr_frags &&
         +          !PageHighMem(skb_frag_page(frag0))) {
        ++=======
        +       if (!skb_headlen(skb) && pinfo->nr_frags &&
        +           !PageHighMem(skb_frag_page(frag0)) &&
        +           (!NET_IP_ALIGN || !(skb_frag_off(frag0) & 3))) {
        ++>>>>>>> 38ec4944b593 (gro: ensure frag0 meets IP header alignment)

    This is expected because older kernels are missing
    commit 8aef998df3979 ("net: core: allow fast GRO for skbs with Ethernet header in head").
    This commit modifies the beginning of the 'if' statement.

    The resolution was easy: the patch we want to backport here is
    adding new conditions to the 'if' statement:

        && (!NET_IP_ALIGN || !(skb_frag_off(frag0) & 3))

    We simply append these new conditions to it on older kernels.

    Another issue had to be resolved: skb_frag_off() is used in this
    patch we want to backport but this function is not defined in
    kernels < 5.4. It has then been extracted and imported from
    commit 7240b60c98d63 ("linux: Add skb_frag_t page_offset accessors").

Fixes: 0f6925b3e8da ("virtio_net: Do not pull payload in skb->head")
Fixes: 78a478d0efd9 ("gro: Inline skb_gro_header and cache frag0 virtual address")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Guenter Roeck <linux@roeck-us.net>
Cc: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
---
 include/linux/skbuff.h | 9 +++++++++
 net/core/dev.c         | 3 ++-
 2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index de3e59329b02..2f303454a323 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -2784,6 +2784,15 @@ static inline void skb_propagate_pfmemalloc(struct page *page,
 		skb->pfmemalloc = true;
 }
 
+/**
+ * skb_frag_off() - Returns the offset of a skb fragment
+ * @frag: the paged fragment
+ */
+static inline unsigned int skb_frag_off(const skb_frag_t *frag)
+{
+	return frag->page_offset;
+}
+
 /**
  * skb_frag_page - retrieve the page referred to by a paged fragment
  * @frag: the paged fragment
diff --git a/net/core/dev.c b/net/core/dev.c
index aa419f3162b8..ea09e0809c12 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4763,7 +4763,8 @@ static void skb_gro_reset_offset(struct sk_buff *skb)
 
 	if (skb_mac_header(skb) == skb_tail_pointer(skb) &&
 	    pinfo->nr_frags &&
-	    !PageHighMem(skb_frag_page(frag0))) {
+	    !PageHighMem(skb_frag_page(frag0)) &&
+	    (!NET_IP_ALIGN || !(skb_frag_off(frag0) & 3))) {
 		NAPI_GRO_CB(skb)->frag0 = skb_frag_address(frag0);
 		NAPI_GRO_CB(skb)->frag0_len = min_t(unsigned int,
 						    skb_frag_size(frag0),
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2021-07-29 18:47 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-07-29 18:47 [PATCH 4.14 0/2] Backport fixes for 3226b158e67c Matthieu Baerts
2021-07-29 18:47 ` [PATCH 4.14 1/2] virtio_net: Do not pull payload in skb->head Matthieu Baerts
2021-07-29 18:47 ` [PATCH 4.14 2/2] gro: ensure frag0 meets IP header alignment Matthieu Baerts

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.