* [PATCH 4.19 1/2] virtio_net: Do not pull payload in skb->head
2021-07-29 18:44 [PATCH 4.19 0/2] Backport fixes for 3226b158e67c Matthieu Baerts
@ 2021-07-29 18:44 ` Matthieu Baerts
2021-07-29 18:44 ` [PATCH 4.19 2/2] gro: ensure frag0 meets IP header alignment Matthieu Baerts
2021-07-30 10:45 ` [PATCH 4.19 0/2] Backport fixes for 3226b158e67c Greg KH
2 siblings, 0 replies; 4+ messages in thread
From: Matthieu Baerts @ 2021-07-29 18:44 UTC (permalink / raw)
To: stable
Cc: Eric Dumazet, Xuan Zhuo, Michael S. Tsirkin, Jason Wang,
virtualization, David S . Miller, Matthieu Baerts
From: Eric Dumazet <edumazet@google.com>
[ Upstream commit 0f6925b3e8da0dbbb52447ca8a8b42b371aac7db ]
Xuan Zhuo reported that commit 3226b158e67c ("net: avoid 32 x truesize
under-estimation for tiny skbs") brought a ~10% performance drop.
The reason for the performance drop was that GRO was forced
to chain sk_buff (using skb_shinfo(skb)->frag_list), which
uses more memory but also cause packet consumers to go over
a lot of overhead handling all the tiny skbs.
It turns out that virtio_net page_to_skb() has a wrong strategy :
It allocates skbs with GOOD_COPY_LEN (128) bytes in skb->head, then
copies 128 bytes from the page, before feeding the packet to GRO stack.
This was suboptimal before commit 3226b158e67c ("net: avoid 32 x truesize
under-estimation for tiny skbs") because GRO was using 2 frags per MSS,
meaning we were not packing MSS with 100% efficiency.
Fix is to pull only the ethernet header in page_to_skb()
Then, we change virtio_net_hdr_to_skb() to pull the missing
headers, instead of assuming they were already pulled by callers.
This fixes the performance regression, but could also allow virtio_net
to accept packets with more than 128bytes of headers.
Many thanks to Xuan Zhuo for his report, and his tests/help.
Fixes: 3226b158e67c ("net: avoid 32 x truesize under-estimation for tiny skbs")
Reported-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Link: https://www.spinics.net/lists/netdev/msg731397.html
Co-Developed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: virtualization@lists.linux-foundation.org
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
---
drivers/net/virtio_net.c | 10 +++++++---
include/linux/virtio_net.h | 14 +++++++++-----
2 files changed, 16 insertions(+), 8 deletions(-)
diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 5e8b40630286..1a8fe5bacb19 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -413,9 +413,13 @@ static struct sk_buff *page_to_skb(struct virtnet_info *vi,
offset += hdr_padded_len;
p += hdr_padded_len;
- copy = len;
- if (copy > skb_tailroom(skb))
- copy = skb_tailroom(skb);
+ /* Copy all frame if it fits skb->head, otherwise
+ * we let virtio_net_hdr_to_skb() and GRO pull headers as needed.
+ */
+ if (len <= skb_tailroom(skb))
+ copy = len;
+ else
+ copy = ETH_HLEN + metasize;
skb_put_data(skb, p, copy);
if (metasize) {
diff --git a/include/linux/virtio_net.h b/include/linux/virtio_net.h
index a1829139ff4a..8f48264f5dab 100644
--- a/include/linux/virtio_net.h
+++ b/include/linux/virtio_net.h
@@ -65,14 +65,18 @@ static inline int virtio_net_hdr_to_skb(struct sk_buff *skb,
skb_reset_mac_header(skb);
if (hdr->flags & VIRTIO_NET_HDR_F_NEEDS_CSUM) {
- u16 start = __virtio16_to_cpu(little_endian, hdr->csum_start);
- u16 off = __virtio16_to_cpu(little_endian, hdr->csum_offset);
+ u32 start = __virtio16_to_cpu(little_endian, hdr->csum_start);
+ u32 off = __virtio16_to_cpu(little_endian, hdr->csum_offset);
+ u32 needed = start + max_t(u32, thlen, off + sizeof(__sum16));
+
+ if (!pskb_may_pull(skb, needed))
+ return -EINVAL;
if (!skb_partial_csum_set(skb, start, off))
return -EINVAL;
p_off = skb_transport_offset(skb) + thlen;
- if (p_off > skb_headlen(skb))
+ if (!pskb_may_pull(skb, p_off))
return -EINVAL;
} else {
/* gso packets without NEEDS_CSUM do not set transport_offset.
@@ -102,14 +106,14 @@ static inline int virtio_net_hdr_to_skb(struct sk_buff *skb,
}
p_off = keys.control.thoff + thlen;
- if (p_off > skb_headlen(skb) ||
+ if (!pskb_may_pull(skb, p_off) ||
keys.basic.ip_proto != ip_proto)
return -EINVAL;
skb_set_transport_header(skb, keys.control.thoff);
} else if (gso_type) {
p_off = thlen;
- if (p_off > skb_headlen(skb))
+ if (!pskb_may_pull(skb, p_off))
return -EINVAL;
}
}
--
2.31.1
^ permalink raw reply related [flat|nested] 4+ messages in thread
* [PATCH 4.19 2/2] gro: ensure frag0 meets IP header alignment
2021-07-29 18:44 [PATCH 4.19 0/2] Backport fixes for 3226b158e67c Matthieu Baerts
2021-07-29 18:44 ` [PATCH 4.19 1/2] virtio_net: Do not pull payload in skb->head Matthieu Baerts
@ 2021-07-29 18:44 ` Matthieu Baerts
2021-07-30 10:45 ` [PATCH 4.19 0/2] Backport fixes for 3226b158e67c Greg KH
2 siblings, 0 replies; 4+ messages in thread
From: Matthieu Baerts @ 2021-07-29 18:44 UTC (permalink / raw)
To: stable
Cc: Eric Dumazet, Guenter Roeck, Xuan Zhuo, Michael S. Tsirkin,
Jason Wang, David S . Miller, Matthieu Baerts
From: Eric Dumazet <edumazet@google.com>
[ Upstream commit 38ec4944b593fd90c5ef42aaaa53e66ae5769d04 ]
After commit 0f6925b3e8da ("virtio_net: Do not pull payload in skb->head")
Guenter Roeck reported one failure in his tests using sh architecture.
After much debugging, we have been able to spot silent unaligned accesses
in inet_gro_receive()
The issue at hand is that upper networking stacks assume their header
is word-aligned. Low level drivers are supposed to reserve NET_IP_ALIGN
bytes before the Ethernet header to make that happen.
This patch hardens skb_gro_reset_offset() to not allow frag0 fast-path
if the fragment is not properly aligned.
Some arches like x86, arm64 and powerpc do not care and define NET_IP_ALIGN
as 0, this extra check will be a NOP for them.
Note that if frag0 is not used, GRO will call pskb_may_pull()
as many times as needed to pull network and transport headers.
[ Backport note ]
A small conflict has been reported:
++<<<<<<< HEAD
+ if (skb_mac_header(skb) == skb_tail_pointer(skb) &&
+ pinfo->nr_frags &&
+ !PageHighMem(skb_frag_page(frag0))) {
++=======
+ if (!skb_headlen(skb) && pinfo->nr_frags &&
+ !PageHighMem(skb_frag_page(frag0)) &&
+ (!NET_IP_ALIGN || !(skb_frag_off(frag0) & 3))) {
++>>>>>>> 38ec4944b593 (gro: ensure frag0 meets IP header alignment)
This is expected because older kernels are missing
commit 8aef998df3979 ("net: core: allow fast GRO for skbs with Ethernet header in head").
This commit modifies the beginning of the 'if' statement.
The resolution was easy: the patch we want to backport here is
adding new conditions to the 'if' statement:
&& (!NET_IP_ALIGN || !(skb_frag_off(frag0) & 3))
We simply append these new conditions to it on older kernels.
Another issue had to be resolved: skb_frag_off() is used in this
patch we want to backport but this function is not defined in
kernels < 5.4. It has then been extracted and imported from
commit 7240b60c98d63 ("linux: Add skb_frag_t page_offset accessors").
Fixes: 0f6925b3e8da ("virtio_net: Do not pull payload in skb->head")
Fixes: 78a478d0efd9 ("gro: Inline skb_gro_header and cache frag0 virtual address")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Guenter Roeck <linux@roeck-us.net>
Cc: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
---
include/linux/skbuff.h | 9 +++++++++
net/core/dev.c | 3 ++-
2 files changed, 11 insertions(+), 1 deletion(-)
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 06176ef2a842..5f2e6451ece5 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -2788,6 +2788,15 @@ static inline void skb_propagate_pfmemalloc(struct page *page,
skb->pfmemalloc = true;
}
+/**
+ * skb_frag_off() - Returns the offset of a skb fragment
+ * @frag: the paged fragment
+ */
+static inline unsigned int skb_frag_off(const skb_frag_t *frag)
+{
+ return frag->page_offset;
+}
+
/**
* skb_frag_page - retrieve the page referred to by a paged fragment
* @frag: the paged fragment
diff --git a/net/core/dev.c b/net/core/dev.c
index 722ae0b57f3f..a6798117bb1a 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -5400,7 +5400,8 @@ static void skb_gro_reset_offset(struct sk_buff *skb)
if (skb_mac_header(skb) == skb_tail_pointer(skb) &&
pinfo->nr_frags &&
- !PageHighMem(skb_frag_page(frag0))) {
+ !PageHighMem(skb_frag_page(frag0)) &&
+ (!NET_IP_ALIGN || !(skb_frag_off(frag0) & 3))) {
NAPI_GRO_CB(skb)->frag0 = skb_frag_address(frag0);
NAPI_GRO_CB(skb)->frag0_len = min_t(unsigned int,
skb_frag_size(frag0),
--
2.31.1
^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: [PATCH 4.19 0/2] Backport fixes for 3226b158e67c
2021-07-29 18:44 [PATCH 4.19 0/2] Backport fixes for 3226b158e67c Matthieu Baerts
2021-07-29 18:44 ` [PATCH 4.19 1/2] virtio_net: Do not pull payload in skb->head Matthieu Baerts
2021-07-29 18:44 ` [PATCH 4.19 2/2] gro: ensure frag0 meets IP header alignment Matthieu Baerts
@ 2021-07-30 10:45 ` Greg KH
2 siblings, 0 replies; 4+ messages in thread
From: Greg KH @ 2021-07-30 10:45 UTC (permalink / raw)
To: Matthieu Baerts; +Cc: stable, Eric Dumazet
On Thu, Jul 29, 2021 at 08:44:25PM +0200, Matthieu Baerts wrote:
> Commit 3226b158e67c ("net: avoid 32 x truesize under-estimation for tiny skbs")
> introduces a ~10% performance drop when using virtio-net drivers.
>
> This commit has been backported to v4.19 in commit 669c0b5782fb and this
> performance drop is also visible there.
> Here at Tessares, we can also notice this drop with the MPTCP fork [1]
> on top of the v4.19 kernel.
>
> Eric Dumazet already fixed this issue a few months ago, see
> commit 0f6925b3e8da ("virtio_net: Do not pull payload in skb->head").
>
> Unfortunately, this patch has not been backported to < v5.4 because it
> caused issues [2]. Indeed, after having backported it, the kernel failed
> to compile because one commit was missing, see
> commit 503d539a6e41 ("virtio_net: Add XDP meta data support"). However,
> this missing commit has been added in 4.19.186 but probably because
> there were still some opened discussions [3] around
> commit 0f6925b3e8da ("virtio_net: Do not pull payload in skb->head"),
> the latter has not been backported at all in v4.19.
>
> A cherry-pick of this patch without any modification is proposed here.
> It has been validated: it fixes the original issue on v4.19 as well.
>
> Please note that there is also a fix for the fix, see
> commit 38ec4944b593 ("gro: ensure frag0 meets IP header alignment").
>
> This second fix has also not been backported because it caused issues as
> well [4]. Here, it was due to a conflict but also a compilation error
> when the conflict has been resolved. Please refer to patch 2/2 for more
> details.
>
> One last note: these two patches have also been backported and validated
> on a v4.14 release. A second series is going to be sent.
> It looks like it could be interesting to backport these two patches to
> v4.9 and v4.4 as well but unfortunately, the backport of these two
> patches fails with conflicts and I don't have any setup to validate the
> performance drop and fix with v4.9 and v4.4 kernels.
Both sets of series now queued up, thanks!
greg k-h
^ permalink raw reply [flat|nested] 4+ messages in thread