linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: gregkh@linuxfoundation.org
To: linux-kernel@vger.kernel.org
Cc: "Greg Kroah-Hartman" <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org,
	"Alexander Lobakin" <bloodyreaper@yandex.ru>,
	"David S. Miller" <davem@davemloft.net>,
	"Pali Rohár" <pali@kernel.org>
Subject: [PATCH 4.19 21/39] net: dsa: add GRO support via gro_cells
Date: Wed, 10 Mar 2021 14:24:29 +0100	[thread overview]
Message-ID: <20210310132320.386282583@linuxfoundation.org> (raw)
In-Reply-To: <20210310132319.708237392@linuxfoundation.org>

From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

From: Alexander Lobakin <bloodyreaper@yandex.ru>

commit e131a5634830047923c694b4ce0c3b31745ff01b upstream.

gro_cells lib is used by different encapsulating netdevices, such as
geneve, macsec, vxlan etc. to speed up decapsulated traffic processing.
CPU tag is a sort of "encapsulation", and we can use the same mechs to
greatly improve overall DSA performance.
skbs are passed to the GRO layer after removing CPU tags, so we don't
need any new packet offload types as it was firstly proposed by me in
the first GRO-over-DSA variant [1].

The size of struct gro_cells is sizeof(void *), so hot struct
dsa_slave_priv becomes only 4/8 bytes bigger, and all critical fields
remain in one 32-byte cacheline.
The other positive side effect is that drivers for network devices
that can be shipped as CPU ports of DSA-driven switches can now use
napi_gro_frags() to pass skbs to kernel. Packets built that way are
completely non-linear and are likely being dropped without GRO.

This was tested on to-be-mainlined-soon Ethernet driver that uses
napi_gro_frags(), and the overall performance was on par with the
variant from [1], sometimes even better due to minimal overhead.
net.core.gro_normal_batch tuning may help to push it to the limit
on particular setups and platforms.

iperf3 IPoE VLAN NAT TCP forwarding (port1.218 -> port0) setup
on 1.2 GHz MIPS board:

5.7-rc2 baseline:

[ID]  Interval         Transfer     Bitrate        Retr
[ 5]  0.00-120.01 sec  9.00 GBytes  644 Mbits/sec  413  sender
[ 5]  0.00-120.00 sec  8.99 GBytes  644 Mbits/sec       receiver

Iface      RX packets  TX packets
eth0       7097731     7097702
port0      426050      6671829
port1      6671681     425862
port1.218  6671677     425851

With this patch:

[ID]  Interval         Transfer     Bitrate        Retr
[ 5]  0.00-120.01 sec  12.2 GBytes  870 Mbits/sec  122  sender
[ 5]  0.00-120.00 sec  12.2 GBytes  870 Mbits/sec       receiver

Iface      RX packets  TX packets
eth0       9474792     9474777
port0      455200      353288
port1      9019592     455035
port1.218  353144      455024

v2:
 - Add some performance examples in the commit message;
 - No functional changes.

[1] https://lore.kernel.org/netdev/20191230143028.27313-1-alobakin@dlink.ru/

Signed-off-by: Alexander Lobakin <bloodyreaper@yandex.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: Pali Rohár <pali@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/dsa/Kconfig    |    1 +
 net/dsa/dsa.c      |    2 +-
 net/dsa/dsa_priv.h |    3 +++
 net/dsa/slave.c    |   10 +++++++++-
 4 files changed, 14 insertions(+), 2 deletions(-)

--- a/net/dsa/Kconfig
+++ b/net/dsa/Kconfig
@@ -8,6 +8,7 @@ config NET_DSA
 	tristate "Distributed Switch Architecture"
 	depends on HAVE_NET_DSA && MAY_USE_DEVLINK
 	depends on BRIDGE || BRIDGE=n
+	select GRO_CELLS
 	select NET_SWITCHDEV
 	select PHYLINK
 	---help---
--- a/net/dsa/dsa.c
+++ b/net/dsa/dsa.c
@@ -191,7 +191,7 @@ static int dsa_switch_rcv(struct sk_buff
 	if (dsa_skb_defer_rx_timestamp(p, skb))
 		return 0;
 
-	netif_receive_skb(skb);
+	gro_cells_receive(&p->gcells, skb);
 
 	return 0;
 }
--- a/net/dsa/dsa_priv.h
+++ b/net/dsa/dsa_priv.h
@@ -15,6 +15,7 @@
 #include <linux/netdevice.h>
 #include <linux/netpoll.h>
 #include <net/dsa.h>
+#include <net/gro_cells.h>
 
 enum {
 	DSA_NOTIFIER_AGEING_TIME,
@@ -72,6 +73,8 @@ struct dsa_slave_priv {
 
 	struct pcpu_sw_netstats	*stats64;
 
+	struct gro_cells	gcells;
+
 	/* DSA port data, such as switch, port index, etc. */
 	struct dsa_port		*dp;
 
--- a/net/dsa/slave.c
+++ b/net/dsa/slave.c
@@ -1337,6 +1337,11 @@ int dsa_slave_create(struct dsa_port *po
 		free_netdev(slave_dev);
 		return -ENOMEM;
 	}
+
+	ret = gro_cells_init(&p->gcells, slave_dev);
+	if (ret)
+		goto out_free;
+
 	p->dp = port;
 	INIT_LIST_HEAD(&p->mall_tc_list);
 	p->xmit = cpu_dp->tag_ops->xmit;
@@ -1347,7 +1352,7 @@ int dsa_slave_create(struct dsa_port *po
 	ret = dsa_slave_phy_setup(slave_dev);
 	if (ret) {
 		netdev_err(master, "error %d setting up slave phy\n", ret);
-		goto out_free;
+		goto out_gcells;
 	}
 
 	dsa_slave_notify(slave_dev, DSA_PORT_REGISTER);
@@ -1366,6 +1371,8 @@ out_phy:
 	phylink_disconnect_phy(p->dp->pl);
 	rtnl_unlock();
 	phylink_destroy(p->dp->pl);
+out_gcells:
+	gro_cells_destroy(&p->gcells);
 out_free:
 	free_percpu(p->stats64);
 	free_netdev(slave_dev);
@@ -1386,6 +1393,7 @@ void dsa_slave_destroy(struct net_device
 	dsa_slave_notify(slave_dev, DSA_PORT_UNREGISTER);
 	unregister_netdev(slave_dev);
 	phylink_destroy(dp->pl);
+	gro_cells_destroy(&p->gcells);
 	free_percpu(p->stats64);
 	free_netdev(slave_dev);
 }



  parent reply	other threads:[~2021-03-10 13:31 UTC|newest]

Thread overview: 46+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-03-10 13:24 [PATCH 4.19 00/39] 4.19.180-rc1 review gregkh
2021-03-10 13:24 ` [PATCH 4.19 01/39] btrfs: raid56: simplify tracking of Q stripe presence gregkh
2021-03-10 13:24 ` [PATCH 4.19 02/39] btrfs: fix raid6 qstripe kmap gregkh
2021-03-10 13:24 ` [PATCH 4.19 03/39] btrfs: validate qgroup inherit for SNAP_CREATE_V2 ioctl gregkh
2021-03-10 13:24 ` [PATCH 4.19 04/39] btrfs: free correct amount of space in btrfs_delayed_inode_reserve_metadata gregkh
2021-03-10 13:24 ` [PATCH 4.19 05/39] btrfs: unlock extents in btrfs_zero_range in case of quota reservation errors gregkh
2021-03-10 13:24 ` [PATCH 4.19 06/39] PM: runtime: Update device status before letting suppliers suspend gregkh
2021-03-10 13:24 ` [PATCH 4.19 07/39] dm bufio: subtract the number of initial sectors in dm_bufio_get_device_size gregkh
2021-03-10 13:24 ` [PATCH 4.19 08/39] drm/amdgpu: fix parameter error of RREG32_PCIE() in amdgpu_regs_pcie gregkh
2021-03-10 13:24 ` [PATCH 4.19 09/39] usbip: tools: fix build error for multiple definition gregkh
2021-03-10 13:24 ` [PATCH 4.19 10/39] Revert "zram: close udev startup race condition as default groups" gregkh
2021-03-10 13:24 ` [PATCH 4.19 11/39] block: genhd: add groups argument to device_add_disk gregkh
2021-03-10 13:24 ` [PATCH 4.19 12/39] nvme: register ns_id attributes as default sysfs groups gregkh
2021-03-10 13:24 ` [PATCH 4.19 13/39] aoe: register default groups with device_add_disk() gregkh
2021-03-10 13:24 ` [PATCH 4.19 14/39] zram: " gregkh
2021-03-10 13:24 ` [PATCH 4.19 15/39] virtio-blk: modernize sysfs attribute creation gregkh
2021-03-10 13:24 ` [PATCH 4.19 16/39] ALSA: ctxfi: cthw20k2: fix mask on conf to allow 4 bits gregkh
2021-03-10 13:24 ` [PATCH 4.19 17/39] RDMA/rxe: Fix missing kconfig dependency on CRYPTO gregkh
2021-03-10 13:24 ` [PATCH 4.19 18/39] rsxx: Return -EFAULT if copy_to_user() fails gregkh
2021-03-10 13:24 ` [PATCH 4.19 19/39] dm verity: fix FEC for RS roots unaligned to block size gregkh
2021-03-10 13:24 ` [PATCH 4.19 20/39] r8169: fix resuming from suspend on RTL8105e if machine runs on battery gregkh
2021-03-10 13:24 ` gregkh [this message]
2021-03-10 13:24 ` [PATCH 4.19 22/39] dm table: fix iterate_devices based device capability checks gregkh
2021-03-10 13:24 ` [PATCH 4.19 23/39] dm table: fix DAX " gregkh
2021-03-10 13:24 ` [PATCH 4.19 24/39] dm table: fix zoned " gregkh
2021-03-10 13:24 ` [PATCH 4.19 25/39] iommu/amd: Fix sleeping in atomic in increase_address_space() gregkh
2021-03-10 13:24 ` [PATCH 4.19 26/39] mwifiex: pcie: skip cancel_work_sync() on reset failure path gregkh
2021-03-10 13:24 ` [PATCH 4.19 27/39] platform/x86: acer-wmi: Cleanup ACER_CAP_FOO defines gregkh
2021-03-10 13:24 ` [PATCH 4.19 28/39] platform/x86: acer-wmi: Cleanup accelerometer device handling gregkh
2021-03-10 13:24 ` [PATCH 4.19 29/39] platform/x86: acer-wmi: Add new force_caps module parameter gregkh
2021-03-10 13:24 ` [PATCH 4.19 30/39] platform/x86: acer-wmi: Add ACER_CAP_SET_FUNCTION_MODE capability flag gregkh
2021-03-10 13:24 ` [PATCH 4.19 31/39] platform/x86: acer-wmi: Add support for SW_TABLET_MODE on Switch devices gregkh
2021-03-10 13:24 ` [PATCH 4.19 32/39] platform/x86: acer-wmi: Add ACER_CAP_KBD_DOCK quirk for the Aspire Switch 10E SW3-016 gregkh
2021-03-10 13:24 ` [PATCH 4.19 33/39] HID: mf: add support for 0079:1846 Mayflash/Dragonrise USB Gamecube Adapter gregkh
2021-03-10 13:24 ` [PATCH 4.19 34/39] media: cx23885: add more quirks for reset DMA on some AMD IOMMU gregkh
2021-03-10 13:24 ` [PATCH 4.19 35/39] ASoC: Intel: bytcr_rt5640: Add quirk for ARCHOS Cesium 140 gregkh
2021-03-10 13:24 ` [PATCH 4.19 36/39] PCI: Add function 1 DMA alias quirk for Marvell 9215 SATA controller gregkh
2021-03-10 13:24 ` [PATCH 4.19 37/39] misc: eeprom_93xx46: Add quirk to support Microchip 93LC46B eeprom gregkh
2021-03-10 13:24 ` [PATCH 4.19 38/39] drm/msm/a5xx: Remove overwriting A5XX_PC_DBG_ECO_CNTL register gregkh
2021-03-10 13:24 ` [PATCH 4.19 39/39] mmc: sdhci-of-dwcmshc: set SDHCI_QUIRK2_PRESET_VALUE_BROKEN gregkh
2021-03-10 20:22 ` [PATCH 4.19 00/39] 4.19.180-rc1 review Pavel Machek
2021-03-10 22:01 ` Shuah Khan
2021-03-10 23:51 ` Guenter Roeck
2021-03-11  2:39 ` Samuel Zou
2021-03-11  4:04 ` Ross Schmidt
2021-03-11  7:47 ` Naresh Kamboju

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210310132320.386282583@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=bloodyreaper@yandex.ru \
    --cc=davem@davemloft.net \
    --cc=linux-kernel@vger.kernel.org \
    --cc=pali@kernel.org \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).