All of lore.kernel.org
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Xuan Zhuo <xuanzhuo@linux.alibaba.com>,
	Eric Dumazet <edumazet@google.com>,
	"Michael S. Tsirkin" <mst@redhat.com>,
	Jason Wang <jasowang@redhat.com>,
	virtualization@lists.linux-foundation.org,
	"David S. Miller" <davem@davemloft.net>,
	Sasha Levin <sashal@kernel.org>
Subject: [PATCH 5.10 08/47] virtio_net: Do not pull payload in skb->head
Date: Thu, 20 May 2021 11:22:06 +0200	[thread overview]
Message-ID: <20210520092053.825306214@linuxfoundation.org> (raw)
In-Reply-To: <20210520092053.559923764@linuxfoundation.org>

From: Eric Dumazet <edumazet@google.com>

[ Upstream commit 0f6925b3e8da0dbbb52447ca8a8b42b371aac7db ]

Xuan Zhuo reported that commit 3226b158e67c ("net: avoid 32 x truesize
under-estimation for tiny skbs") brought  a ~10% performance drop.

The reason for the performance drop was that GRO was forced
to chain sk_buff (using skb_shinfo(skb)->frag_list), which
uses more memory but also cause packet consumers to go over
a lot of overhead handling all the tiny skbs.

It turns out that virtio_net page_to_skb() has a wrong strategy :
It allocates skbs with GOOD_COPY_LEN (128) bytes in skb->head, then
copies 128 bytes from the page, before feeding the packet to GRO stack.

This was suboptimal before commit 3226b158e67c ("net: avoid 32 x truesize
under-estimation for tiny skbs") because GRO was using 2 frags per MSS,
meaning we were not packing MSS with 100% efficiency.

Fix is to pull only the ethernet header in page_to_skb()

Then, we change virtio_net_hdr_to_skb() to pull the missing
headers, instead of assuming they were already pulled by callers.

This fixes the performance regression, but could also allow virtio_net
to accept packets with more than 128bytes of headers.

Many thanks to Xuan Zhuo for his report, and his tests/help.

Fixes: 3226b158e67c ("net: avoid 32 x truesize under-estimation for tiny skbs")
Reported-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Link: https://www.spinics.net/lists/netdev/msg731397.html
Co-Developed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: virtualization@lists.linux-foundation.org
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/virtio_net.c   | 10 +++++++---
 include/linux/virtio_net.h | 14 +++++++++-----
 2 files changed, 16 insertions(+), 8 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 038ce4e5e84b..286f836a53bf 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -406,9 +406,13 @@ static struct sk_buff *page_to_skb(struct virtnet_info *vi,
 	offset += hdr_padded_len;
 	p += hdr_padded_len;
 
-	copy = len;
-	if (copy > skb_tailroom(skb))
-		copy = skb_tailroom(skb);
+	/* Copy all frame if it fits skb->head, otherwise
+	 * we let virtio_net_hdr_to_skb() and GRO pull headers as needed.
+	 */
+	if (len <= skb_tailroom(skb))
+		copy = len;
+	else
+		copy = ETH_HLEN + metasize;
 	skb_put_data(skb, p, copy);
 
 	if (metasize) {
diff --git a/include/linux/virtio_net.h b/include/linux/virtio_net.h
index 98775d7fa696..b465f8f3e554 100644
--- a/include/linux/virtio_net.h
+++ b/include/linux/virtio_net.h
@@ -65,14 +65,18 @@ static inline int virtio_net_hdr_to_skb(struct sk_buff *skb,
 	skb_reset_mac_header(skb);
 
 	if (hdr->flags & VIRTIO_NET_HDR_F_NEEDS_CSUM) {
-		u16 start = __virtio16_to_cpu(little_endian, hdr->csum_start);
-		u16 off = __virtio16_to_cpu(little_endian, hdr->csum_offset);
+		u32 start = __virtio16_to_cpu(little_endian, hdr->csum_start);
+		u32 off = __virtio16_to_cpu(little_endian, hdr->csum_offset);
+		u32 needed = start + max_t(u32, thlen, off + sizeof(__sum16));
+
+		if (!pskb_may_pull(skb, needed))
+			return -EINVAL;
 
 		if (!skb_partial_csum_set(skb, start, off))
 			return -EINVAL;
 
 		p_off = skb_transport_offset(skb) + thlen;
-		if (p_off > skb_headlen(skb))
+		if (!pskb_may_pull(skb, p_off))
 			return -EINVAL;
 	} else {
 		/* gso packets without NEEDS_CSUM do not set transport_offset.
@@ -102,14 +106,14 @@ static inline int virtio_net_hdr_to_skb(struct sk_buff *skb,
 			}
 
 			p_off = keys.control.thoff + thlen;
-			if (p_off > skb_headlen(skb) ||
+			if (!pskb_may_pull(skb, p_off) ||
 			    keys.basic.ip_proto != ip_proto)
 				return -EINVAL;
 
 			skb_set_transport_header(skb, keys.control.thoff);
 		} else if (gso_type) {
 			p_off = thlen;
-			if (p_off > skb_headlen(skb))
+			if (!pskb_may_pull(skb, p_off))
 				return -EINVAL;
 		}
 	}
-- 
2.30.2




WARNING: multiple messages have this Message-ID (diff)
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Sasha Levin <sashal@kernel.org>,
	"Michael S. Tsirkin" <mst@redhat.com>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org,
	virtualization@lists.linux-foundation.org,
	Eric Dumazet <edumazet@google.com>,
	"David S. Miller" <davem@davemloft.net>
Subject: [PATCH 5.10 08/47] virtio_net: Do not pull payload in skb->head
Date: Thu, 20 May 2021 11:22:06 +0200	[thread overview]
Message-ID: <20210520092053.825306214@linuxfoundation.org> (raw)
In-Reply-To: <20210520092053.559923764@linuxfoundation.org>

From: Eric Dumazet <edumazet@google.com>

[ Upstream commit 0f6925b3e8da0dbbb52447ca8a8b42b371aac7db ]

Xuan Zhuo reported that commit 3226b158e67c ("net: avoid 32 x truesize
under-estimation for tiny skbs") brought  a ~10% performance drop.

The reason for the performance drop was that GRO was forced
to chain sk_buff (using skb_shinfo(skb)->frag_list), which
uses more memory but also cause packet consumers to go over
a lot of overhead handling all the tiny skbs.

It turns out that virtio_net page_to_skb() has a wrong strategy :
It allocates skbs with GOOD_COPY_LEN (128) bytes in skb->head, then
copies 128 bytes from the page, before feeding the packet to GRO stack.

This was suboptimal before commit 3226b158e67c ("net: avoid 32 x truesize
under-estimation for tiny skbs") because GRO was using 2 frags per MSS,
meaning we were not packing MSS with 100% efficiency.

Fix is to pull only the ethernet header in page_to_skb()

Then, we change virtio_net_hdr_to_skb() to pull the missing
headers, instead of assuming they were already pulled by callers.

This fixes the performance regression, but could also allow virtio_net
to accept packets with more than 128bytes of headers.

Many thanks to Xuan Zhuo for his report, and his tests/help.

Fixes: 3226b158e67c ("net: avoid 32 x truesize under-estimation for tiny skbs")
Reported-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Link: https://www.spinics.net/lists/netdev/msg731397.html
Co-Developed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: virtualization@lists.linux-foundation.org
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/virtio_net.c   | 10 +++++++---
 include/linux/virtio_net.h | 14 +++++++++-----
 2 files changed, 16 insertions(+), 8 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 038ce4e5e84b..286f836a53bf 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -406,9 +406,13 @@ static struct sk_buff *page_to_skb(struct virtnet_info *vi,
 	offset += hdr_padded_len;
 	p += hdr_padded_len;
 
-	copy = len;
-	if (copy > skb_tailroom(skb))
-		copy = skb_tailroom(skb);
+	/* Copy all frame if it fits skb->head, otherwise
+	 * we let virtio_net_hdr_to_skb() and GRO pull headers as needed.
+	 */
+	if (len <= skb_tailroom(skb))
+		copy = len;
+	else
+		copy = ETH_HLEN + metasize;
 	skb_put_data(skb, p, copy);
 
 	if (metasize) {
diff --git a/include/linux/virtio_net.h b/include/linux/virtio_net.h
index 98775d7fa696..b465f8f3e554 100644
--- a/include/linux/virtio_net.h
+++ b/include/linux/virtio_net.h
@@ -65,14 +65,18 @@ static inline int virtio_net_hdr_to_skb(struct sk_buff *skb,
 	skb_reset_mac_header(skb);
 
 	if (hdr->flags & VIRTIO_NET_HDR_F_NEEDS_CSUM) {
-		u16 start = __virtio16_to_cpu(little_endian, hdr->csum_start);
-		u16 off = __virtio16_to_cpu(little_endian, hdr->csum_offset);
+		u32 start = __virtio16_to_cpu(little_endian, hdr->csum_start);
+		u32 off = __virtio16_to_cpu(little_endian, hdr->csum_offset);
+		u32 needed = start + max_t(u32, thlen, off + sizeof(__sum16));
+
+		if (!pskb_may_pull(skb, needed))
+			return -EINVAL;
 
 		if (!skb_partial_csum_set(skb, start, off))
 			return -EINVAL;
 
 		p_off = skb_transport_offset(skb) + thlen;
-		if (p_off > skb_headlen(skb))
+		if (!pskb_may_pull(skb, p_off))
 			return -EINVAL;
 	} else {
 		/* gso packets without NEEDS_CSUM do not set transport_offset.
@@ -102,14 +106,14 @@ static inline int virtio_net_hdr_to_skb(struct sk_buff *skb,
 			}
 
 			p_off = keys.control.thoff + thlen;
-			if (p_off > skb_headlen(skb) ||
+			if (!pskb_may_pull(skb, p_off) ||
 			    keys.basic.ip_proto != ip_proto)
 				return -EINVAL;
 
 			skb_set_transport_header(skb, keys.control.thoff);
 		} else if (gso_type) {
 			p_off = thlen;
-			if (p_off > skb_headlen(skb))
+			if (!pskb_may_pull(skb, p_off))
 				return -EINVAL;
 		}
 	}
-- 
2.30.2



_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

  parent reply	other threads:[~2021-05-20  9:28 UTC|newest]

Thread overview: 53+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-05-20  9:21 [PATCH 5.10 00/47] 5.10.39-rc1 review Greg Kroah-Hartman
2021-05-20  9:21 ` [PATCH 5.10 01/47] x86/msr: Fix wr/rdmsr_safe_regs_on_cpu() prototypes Greg Kroah-Hartman
2021-05-20  9:22 ` [PATCH 5.10 02/47] drm/i915/display: fix compiler warning about array overrun Greg Kroah-Hartman
2021-05-20  9:22 ` [PATCH 5.10 03/47] airo: work around stack usage warning Greg Kroah-Hartman
2021-05-20  9:22 ` [PATCH 5.10 04/47] kgdb: fix gcc-11 warning on indentation Greg Kroah-Hartman
2021-05-20  9:22 ` [PATCH 5.10 05/47] usb: sl811-hcd: improve misleading indentation Greg Kroah-Hartman
2021-05-20  9:22 ` [PATCH 5.10 06/47] cxgb4: Fix the -Wmisleading-indentation warning Greg Kroah-Hartman
2021-05-20  9:22 ` [PATCH 5.10 07/47] isdn: capi: fix mismatched prototypes Greg Kroah-Hartman
2021-05-20  9:22 ` Greg Kroah-Hartman [this message]
2021-05-20  9:22   ` [PATCH 5.10 08/47] virtio_net: Do not pull payload in skb->head Greg Kroah-Hartman
2021-05-20  9:22 ` [PATCH 5.10 09/47] ARM: 9058/1: cache-v7: refactor v7_invalidate_l1 to avoid clobbering r5/r6 Greg Kroah-Hartman
2021-05-20  9:22 ` [PATCH 5.10 10/47] PCI: thunder: Fix compile testing Greg Kroah-Hartman
2021-05-20  9:22 ` [PATCH 5.10 11/47] dmaengine: dw-edma: Fix crash on loading/unloading driver Greg Kroah-Hartman
2021-05-20  9:22 ` [PATCH 5.10 12/47] ARM: 9066/1: ftrace: pause/unpause function graph tracer in cpu_suspend() Greg Kroah-Hartman
2021-05-20  9:22 ` [PATCH 5.10 13/47] ACPI / hotplug / PCI: Fix reference count leak in enable_slot() Greg Kroah-Hartman
2021-05-20  9:22 ` [PATCH 5.10 14/47] PCI: tegra: Fix runtime PM imbalance in pex_ep_event_pex_rst_deassert() Greg Kroah-Hartman
2021-05-20  9:22 ` [PATCH 5.10 15/47] Input: elants_i2c - do not bind to i2c-hid compatible ACPI instantiated devices Greg Kroah-Hartman
2021-05-20  9:22 ` [PATCH 5.10 16/47] Input: silead - add workaround for x86 BIOS-es which bring the chip up in a stuck state Greg Kroah-Hartman
2021-05-20  9:22 ` [PATCH 5.10 17/47] NFS: NFS_INO_REVAL_PAGECACHE should mark the change attribute invalid Greg Kroah-Hartman
2021-05-20  9:22 ` [PATCH 5.10 18/47] um: Mark all kernel symbols as local Greg Kroah-Hartman
2021-05-20  9:22 ` [PATCH 5.10 19/47] um: Disable CONFIG_GCOV with MODULES Greg Kroah-Hartman
2021-05-20  9:22 ` [PATCH 5.10 20/47] PCI: tegra: Add Tegra194 MCFG quirks for ECAM errata Greg Kroah-Hartman
2021-05-20  9:22 ` [PATCH 5.10 21/47] ARM: 9075/1: kernel: Fix interrupted SMC calls Greg Kroah-Hartman
2021-05-20  9:22 ` [PATCH 5.10 22/47] platform/chrome: cros_ec_typec: Add DP mode check Greg Kroah-Hartman
2021-05-20  9:22 ` [PATCH 5.10 23/47] riscv: Use $(LD) instead of $(CC) to link vDSO Greg Kroah-Hartman
2021-05-20  9:22 ` [PATCH 5.10 24/47] scripts/recordmcount.pl: Fix RISC-V regex for clang Greg Kroah-Hartman
2021-05-20  9:22 ` [PATCH 5.10 25/47] riscv: Workaround mcount name prior to clang-13 Greg Kroah-Hartman
2021-05-20  9:22 ` [PATCH 5.10 26/47] scsi: lpfc: Fix illegal memory access on Abort IOCBs Greg Kroah-Hartman
2021-05-20  9:22 ` [PATCH 5.10 27/47] ceph: fix fscache invalidation Greg Kroah-Hartman
2021-05-20  9:22 ` [PATCH 5.10 28/47] ceph: dont clobber i_snap_caps on non-I_NEW inode Greg Kroah-Hartman
2021-05-20  9:22 ` [PATCH 5.10 29/47] ceph: dont allow access to MDS-private inodes Greg Kroah-Hartman
2021-05-20  9:22 ` [PATCH 5.10 30/47] scsi: target: tcmu: Return from tcmu_handle_completions() if cmd_id not found Greg Kroah-Hartman
2021-05-20  9:22 ` [PATCH 5.10 31/47] amdgpu/pm: Prevent force of DCEFCLK on NAVI10 and SIENNA_CICHLID Greg Kroah-Hartman
2021-05-20  9:22 ` [PATCH 5.10 32/47] bridge: Fix possible races between assigning rx_handler_data and setting IFF_BRIDGE_PORT bit Greg Kroah-Hartman
2021-05-20  9:22 ` [PATCH 5.10 33/47] net: hsr: check skb can contain struct hsr_ethhdr in fill_frame_info Greg Kroah-Hartman
2021-05-20  9:22 ` [PATCH 5.10 34/47] nvmet: remove unsupported command noise Greg Kroah-Hartman
2021-05-20  9:22 ` [PATCH 5.10 35/47] drm/amd/display: Fix two cursor duplication when using overlay Greg Kroah-Hartman
2021-05-20  9:22 ` [PATCH 5.10 36/47] gpiolib: acpi: Add quirk to ignore EC wakeups on Dell Venue 10 Pro 5055 Greg Kroah-Hartman
2021-05-20  9:22 ` [PATCH 5.10 37/47] net:CXGB4: fix leak if sk_buff is not used Greg Kroah-Hartman
2021-05-20  9:22 ` [PATCH 5.10 38/47] ALSA: hda: generic: change the DAC ctl name for LO+SPK or LO+HP Greg Kroah-Hartman
2021-05-20  9:22 ` [PATCH 5.10 39/47] block: reexpand iov_iter after read/write Greg Kroah-Hartman
2021-05-20  9:22 ` [PATCH 5.10 40/47] lib: stackdepot: turn depot_lock spinlock to raw_spinlock Greg Kroah-Hartman
2021-05-20  9:22 ` [PATCH 5.10 41/47] net: stmmac: Do not enable RX FIFO overflow interrupts Greg Kroah-Hartman
2021-05-20  9:22 ` [PATCH 5.10 42/47] ip6_gre: proper dev_{hold|put} in ndo_[un]init methods Greg Kroah-Hartman
2021-05-20  9:22 ` [PATCH 5.10 43/47] sit: " Greg Kroah-Hartman
2021-05-20  9:22 ` [PATCH 5.10 44/47] ip6_tunnel: " Greg Kroah-Hartman
2021-05-20  9:22 ` [PATCH 5.10 45/47] ipv6: remove extra dev_hold() for fallback tunnels Greg Kroah-Hartman
2021-05-20  9:22 ` [PATCH 5.10 46/47] tweewide: Fix most Shebang lines Greg Kroah-Hartman
2021-05-20  9:22 ` [PATCH 5.10 47/47] scripts: switch explicitly to Python 3 Greg Kroah-Hartman
2021-05-20 11:24 ` [PATCH 5.10 00/47] 5.10.39-rc1 review Pavel Machek
2021-05-20 12:32 ` Jon Hunter
2021-05-20 12:59   ` Greg Kroah-Hartman
2021-05-20 14:07 ` Fox Chen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210520092053.825306214@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=jasowang@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mst@redhat.com \
    --cc=sashal@kernel.org \
    --cc=stable@vger.kernel.org \
    --cc=virtualization@lists.linux-foundation.org \
    --cc=xuanzhuo@linux.alibaba.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.