LKML Archive on lore.kernel.org
 help / color / Atom feed
* [PATCH RFC net-next 00/20] net: dsa: add GRO support
@ 2019-12-30 14:30 Alexander Lobakin
  2019-12-30 14:30 ` [PATCH RFC net-next 01/19] net: dsa: make .flow_dissect() callback returning void Alexander Lobakin
                   ` (21 more replies)
  0 siblings, 22 replies; 42+ messages in thread
From: Alexander Lobakin @ 2019-12-30 14:30 UTC (permalink / raw)
  To: David S. Miller
  Cc: Edward Cree, Andrew Lunn, Vivien Didelot, Florian Fainelli,
	Hauke Mehrtens, Sean Wang, Matthias Brugger, Jiri Pirko,
	Eric Dumazet, Paolo Abeni, Jakub Kicinski, Alexander Lobakin,
	Taehee Yoo, Stephen Hemminger, Stanislav Fomichev,
	Daniel Borkmann, Song Liu, Matteo Croce, Jakub Sitnicki,
	Paul Blakey, Yoshiki Komachi, netdev, linux-kernel,
	linux-arm-kernel, linux-mediatek

As of now, napi_gro_receive() in cases where the corresponding
device is installed as CPU port of DSA-driven switch is in fact
an overheaded version of netif_receive_skb{,_list}() with no
advantages over:

- dev_gro_receive() can't find packet_offload for ETH_P_XDSA type;
- so it immediately returns GRO_NORMAL;
- napi_skb_finish() passes skb to gro_normal_one() -> netstack.

This series adds a basic infrastructure to allow DSA taggers to
implement GRO callbacks and adds GRO support for 5 tagger drivers:
* tag_ar9331
* tag_gswip
* tag_lan9303
* tag_mtk
* tag_qca

I didn't make it for the rest because they are in fact way more
complicated (e.g. combined DSA + 802.1q tag etc.) and require
more familiarity with them and tests on the real hardware, which
is inaccesible for me at the moment.

This series also includes a bunch of random fixes in several tagger
drivers and some cleanup. I left them all in one place as they depend
on each other, but there's no problem for me to split this into
different series.

I mark this as RFC, and there are the key questions for maintainers,
developers, users etc.:
- Do we need GRO support for DSA at all?
- Which tagger protocols really need it and which don't?
- Are the actual changes correct in every single tagger code?
- Does this series bring any performance improvements on the
  affected systems?
- Would anybody mind if we'd add DSA support to napi_gro_frags()?
- Any code/other comments/notes.

I also would like to see more taggers with GRO callbacks, such as
DSA and EDSA, and the results of their addition.

Alexander Lobakin (20):
  net: dsa: make .flow_dissect() callback returning void
  net: dsa: add GRO support infrastructure
  net: dsa: tag_ar9331: add .flow_dissect() callback
  net: dsa: tag_ar9331: split out common tag accessors
  net: dsa: tag_ar9331: add GRO callbacks
  net: dsa: tag_gswip: fix typo in tag name
  net: dsa: tag_gswip: switch to bitfield helpers
  net: dsa: tag_gswip: add .flow_dissect() callback
  net: dsa: tag_gswip: split out common tag accessors
  net: dsa: tag_gswip: add GRO callbacks
  net: dsa: tag_lan9303: add .flow_dissect() callback
  net: dsa: tag_lan9303: split out common tag accessors
  net: dsa: tag_lan9303: add GRO callbacks
  net: dsa: tag_mtk: split out common tag accessors
  net: dsa: tag_mtk: add GRO callbacks
  net: dsa: tag_qca: fix doubled Tx statistics
  net: dsa: tag_qca: switch to bitfield helpers
  net: dsa: tag_qca: split out common tag accessors
  net: dsa: tag_qca: add GRO callbacks
  net: core: add (unlikely) DSA support in napi_gro_frags()

 include/net/dsa.h         |  10 ++-
 net/core/dev.c            |  11 ++-
 net/core/flow_dissector.c |   8 +-
 net/dsa/dsa.c             |  43 +++++++++-
 net/dsa/dsa2.c            |   1 +
 net/dsa/tag_ar9331.c      | 139 ++++++++++++++++++++++++++-----
 net/dsa/tag_dsa.c         |   5 +-
 net/dsa/tag_edsa.c        |   5 +-
 net/dsa/tag_gswip.c       | 126 +++++++++++++++++++++++-----
 net/dsa/tag_lan9303.c     | 139 ++++++++++++++++++++++++++-----
 net/dsa/tag_mtk.c         | 115 +++++++++++++++++++++-----
 net/dsa/tag_qca.c         | 167 ++++++++++++++++++++++++++++----------
 12 files changed, 629 insertions(+), 140 deletions(-)

-- 
2.24.1


^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH RFC net-next 01/19] net: dsa: make .flow_dissect() callback returning void
  2019-12-30 14:30 [PATCH RFC net-next 00/20] net: dsa: add GRO support Alexander Lobakin
@ 2019-12-30 14:30 ` Alexander Lobakin
  2019-12-30 18:11   ` Florian Fainelli
  2019-12-30 14:30 ` [PATCH RFC net-next 02/19] net: dsa: add GRO support infrastructure Alexander Lobakin
                   ` (20 subsequent siblings)
  21 siblings, 1 reply; 42+ messages in thread
From: Alexander Lobakin @ 2019-12-30 14:30 UTC (permalink / raw)
  To: David S. Miller
  Cc: Edward Cree, Andrew Lunn, Vivien Didelot, Florian Fainelli,
	Hauke Mehrtens, Sean Wang, Matthias Brugger, Jiri Pirko,
	Eric Dumazet, Paolo Abeni, Jakub Kicinski, Alexander Lobakin,
	Taehee Yoo, Stephen Hemminger, Stanislav Fomichev,
	Daniel Borkmann, Song Liu, Matteo Croce, Jakub Sitnicki,
	Paul Blakey, Yoshiki Komachi, netdev, linux-kernel,
	linux-arm-kernel, linux-mediatek

There are no tag protocols which return non-zero values from
flow_dissect() callback. Remove it to simplify code and save some
object size.
If a particular tagger can't calculate offset and proto for some
reason, it can simply leave the original values untouched.

Signed-off-by: Alexander Lobakin <alobakin@dlink.ru>
---
 include/net/dsa.h         | 5 ++---
 net/core/flow_dissector.c | 8 ++++----
 net/dsa/tag_dsa.c         | 5 ++---
 net/dsa/tag_edsa.c        | 5 ++---
 net/dsa/tag_mtk.c         | 6 ++----
 net/dsa/tag_qca.c         | 6 ++----
 6 files changed, 14 insertions(+), 21 deletions(-)

diff --git a/include/net/dsa.h b/include/net/dsa.h
index da5578db228e..633d9894ab87 100644
--- a/include/net/dsa.h
+++ b/include/net/dsa.h
@@ -72,8 +72,8 @@ struct dsa_device_ops {
 	struct sk_buff *(*xmit)(struct sk_buff *skb, struct net_device *dev);
 	struct sk_buff *(*rcv)(struct sk_buff *skb, struct net_device *dev,
 			       struct packet_type *pt);
-	int (*flow_dissect)(const struct sk_buff *skb, __be16 *proto,
-			    int *offset);
+	void (*flow_dissect)(const struct sk_buff *skb, __be16 *proto,
+			     int *offset);
 	/* Used to determine which traffic should match the DSA filter in
 	 * eth_type_trans, and which, if any, should bypass it and be processed
 	 * as regular on the master net device.
@@ -774,4 +774,3 @@ static struct dsa_tag_driver *dsa_tag_driver_array[] =	{		\
 };									\
 module_dsa_tag_drivers(dsa_tag_driver_array)
 #endif
-
diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c
index 2dbbb030fbed..2c9d8c7c76b3 100644
--- a/net/core/flow_dissector.c
+++ b/net/core/flow_dissector.c
@@ -971,12 +971,12 @@ bool __skb_flow_dissect(const struct net *net,
 #if IS_ENABLED(CONFIG_NET_DSA)
 		if (unlikely(skb->dev && netdev_uses_dsa(skb->dev) &&
 			     proto == htons(ETH_P_XDSA))) {
-			const struct dsa_device_ops *ops;
+			typeof_member(struct dsa_device_ops, flow_dissect) fd;
 			int offset = 0;
 
-			ops = skb->dev->dsa_ptr->tag_ops;
-			if (ops->flow_dissect &&
-			    !ops->flow_dissect(skb, &proto, &offset)) {
+			fd = skb->dev->dsa_ptr->tag_ops->flow_dissect;
+			if (fd) {
+				fd(skb, &proto, &offset);
 				hlen -= offset;
 				nhoff += offset;
 			}
diff --git a/net/dsa/tag_dsa.c b/net/dsa/tag_dsa.c
index 7ddec9794477..ef15aee58dfc 100644
--- a/net/dsa/tag_dsa.c
+++ b/net/dsa/tag_dsa.c
@@ -142,12 +142,11 @@ static struct sk_buff *dsa_rcv(struct sk_buff *skb, struct net_device *dev,
 	return skb;
 }
 
-static int dsa_tag_flow_dissect(const struct sk_buff *skb, __be16 *proto,
-				int *offset)
+static void dsa_tag_flow_dissect(const struct sk_buff *skb, __be16 *proto,
+				 int *offset)
 {
 	*offset = 4;
 	*proto = ((__be16 *)skb->data)[1];
-	return 0;
 }
 
 static const struct dsa_device_ops dsa_netdev_ops = {
diff --git a/net/dsa/tag_edsa.c b/net/dsa/tag_edsa.c
index e8eaa804ccb9..37a99254b411 100644
--- a/net/dsa/tag_edsa.c
+++ b/net/dsa/tag_edsa.c
@@ -161,12 +161,11 @@ static struct sk_buff *edsa_rcv(struct sk_buff *skb, struct net_device *dev,
 	return skb;
 }
 
-static int edsa_tag_flow_dissect(const struct sk_buff *skb, __be16 *proto,
-				 int *offset)
+static void edsa_tag_flow_dissect(const struct sk_buff *skb, __be16 *proto,
+				  int *offset)
 {
 	*offset = 8;
 	*proto = ((__be16 *)skb->data)[3];
-	return 0;
 }
 
 static const struct dsa_device_ops edsa_netdev_ops = {
diff --git a/net/dsa/tag_mtk.c b/net/dsa/tag_mtk.c
index b5705cba8318..c3ad7b7b142a 100644
--- a/net/dsa/tag_mtk.c
+++ b/net/dsa/tag_mtk.c
@@ -89,13 +89,11 @@ static struct sk_buff *mtk_tag_rcv(struct sk_buff *skb, struct net_device *dev,
 	return skb;
 }
 
-static int mtk_tag_flow_dissect(const struct sk_buff *skb, __be16 *proto,
-				int *offset)
+static void mtk_tag_flow_dissect(const struct sk_buff *skb, __be16 *proto,
+				 int *offset)
 {
 	*offset = 4;
 	*proto = ((__be16 *)skb->data)[1];
-
-	return 0;
 }
 
 static const struct dsa_device_ops mtk_netdev_ops = {
diff --git a/net/dsa/tag_qca.c b/net/dsa/tag_qca.c
index c95885215525..8e2dbaaffe59 100644
--- a/net/dsa/tag_qca.c
+++ b/net/dsa/tag_qca.c
@@ -90,13 +90,11 @@ static struct sk_buff *qca_tag_rcv(struct sk_buff *skb, struct net_device *dev,
 	return skb;
 }
 
-static int qca_tag_flow_dissect(const struct sk_buff *skb, __be16 *proto,
-                                int *offset)
+static void qca_tag_flow_dissect(const struct sk_buff *skb, __be16 *proto,
+				 int *offset)
 {
 	*offset = QCA_HDR_LEN;
 	*proto = ((__be16 *)skb->data)[0];
-
-	return 0;
 }
 
 static const struct dsa_device_ops qca_netdev_ops = {
-- 
2.24.1


^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH RFC net-next 02/19] net: dsa: add GRO support infrastructure
  2019-12-30 14:30 [PATCH RFC net-next 00/20] net: dsa: add GRO support Alexander Lobakin
  2019-12-30 14:30 ` [PATCH RFC net-next 01/19] net: dsa: make .flow_dissect() callback returning void Alexander Lobakin
@ 2019-12-30 14:30 ` Alexander Lobakin
  2019-12-30 14:30 ` [PATCH RFC net-next 03/19] net: dsa: tag_ar9331: add .flow_dissect() callback Alexander Lobakin
                   ` (19 subsequent siblings)
  21 siblings, 0 replies; 42+ messages in thread
From: Alexander Lobakin @ 2019-12-30 14:30 UTC (permalink / raw)
  To: David S. Miller
  Cc: Edward Cree, Andrew Lunn, Vivien Didelot, Florian Fainelli,
	Hauke Mehrtens, Sean Wang, Matthias Brugger, Jiri Pirko,
	Eric Dumazet, Paolo Abeni, Jakub Kicinski, Alexander Lobakin,
	Taehee Yoo, Stephen Hemminger, Stanislav Fomichev,
	Daniel Borkmann, Song Liu, Matteo Croce, Jakub Sitnicki,
	Paul Blakey, Yoshiki Komachi, netdev, linux-kernel,
	linux-arm-kernel, linux-mediatek

Add .gro_receive() (with shortcut) and .gro_complete() callbacks to
tagger ops and basic ETH_P_XDSA packet_offload with wrappers around
them, so DSA-tagged frames can now be processed within GRO layer if
the particular tagger implements this (will be added in subsequent
patches).

Note: no need to take RCU read locks in dsa_gro_receive() and
dsa_gro_complete() as dev->cpu_dp is not RCU-protected, at least
for now. The corresponding locks must be taken in the actual
tagger callbacks.

Signed-off-by: Alexander Lobakin <alobakin@dlink.ru>
---
 include/net/dsa.h |  5 +++++
 net/dsa/dsa.c     | 43 +++++++++++++++++++++++++++++++++++++++++--
 net/dsa/dsa2.c    |  1 +
 3 files changed, 47 insertions(+), 2 deletions(-)

diff --git a/include/net/dsa.h b/include/net/dsa.h
index 633d9894ab87..8a7f80709d51 100644
--- a/include/net/dsa.h
+++ b/include/net/dsa.h
@@ -79,6 +79,9 @@ struct dsa_device_ops {
 	 * as regular on the master net device.
 	 */
 	bool (*filter)(const struct sk_buff *skb, struct net_device *dev);
+	struct sk_buff *(*gro_receive)(struct list_head *head,
+				       struct sk_buff *skb);
+	int (*gro_complete)(struct sk_buff *skb, int nhoff);
 	unsigned int overhead;
 	const char *name;
 	enum dsa_tag_protocol proto;
@@ -170,6 +173,8 @@ struct dsa_port {
 	struct sk_buff *(*rcv)(struct sk_buff *skb, struct net_device *dev,
 			       struct packet_type *pt);
 	bool (*filter)(const struct sk_buff *skb, struct net_device *dev);
+	struct sk_buff *(*gro_receive)(struct list_head *head,
+				       struct sk_buff *skb);
 
 	enum {
 		DSA_PORT_TYPE_UNUSED = 0,
diff --git a/net/dsa/dsa.c b/net/dsa/dsa.c
index 17281fec710c..9a8d8ce7473c 100644
--- a/net/dsa/dsa.c
+++ b/net/dsa/dsa.c
@@ -243,6 +243,34 @@ static int dsa_switch_rcv(struct sk_buff *skb, struct net_device *dev,
 	return 0;
 }
 
+static struct sk_buff *dsa_gro_receive(struct list_head *head,
+				       struct sk_buff *skb)
+{
+	const struct dsa_port *cpu_dp = skb->dev->dsa_ptr;
+	struct sk_buff *pp = NULL;
+	int flush = 1;
+
+	if (unlikely(!cpu_dp) || !cpu_dp->gro_receive)
+		goto flush;
+
+	pp = cpu_dp->gro_receive(head, skb);
+	flush = 0;
+
+flush:
+	skb_gro_flush_final(skb, pp, flush);
+	return pp;
+}
+
+static int dsa_gro_complete(struct sk_buff *skb, int nhoff)
+{
+	const struct dsa_port *cpu_dp = skb->dev->dsa_ptr;
+
+	if (likely(cpu_dp) && cpu_dp->tag_ops->gro_complete)
+		return cpu_dp->tag_ops->gro_complete(skb, nhoff);
+
+	return -ENOENT;
+}
+
 #ifdef CONFIG_PM_SLEEP
 static bool dsa_is_port_initialized(struct dsa_switch *ds, int p)
 {
@@ -298,8 +326,17 @@ EXPORT_SYMBOL_GPL(dsa_switch_resume);
 #endif
 
 static struct packet_type dsa_pack_type __read_mostly = {
-	.type	= cpu_to_be16(ETH_P_XDSA),
-	.func	= dsa_switch_rcv,
+	.type		= htons(ETH_P_XDSA),
+	.func		= dsa_switch_rcv,
+};
+
+static struct packet_offload dsa_pack_offload __read_mostly = {
+	.type		= htons(ETH_P_XDSA),
+	.priority	= 10,
+	.callbacks	= {
+		.gro_receive	= dsa_gro_receive,
+		.gro_complete	= dsa_gro_complete,
+	},
 };
 
 static struct workqueue_struct *dsa_owq;
@@ -430,6 +467,7 @@ static int __init dsa_init_module(void)
 		goto register_notifier_fail;
 
 	dev_add_pack(&dsa_pack_type);
+	dev_add_offload(&dsa_pack_offload);
 
 	dsa_tag_driver_register(&DSA_TAG_DRIVER_NAME(none_ops),
 				THIS_MODULE);
@@ -448,6 +486,7 @@ static void __exit dsa_cleanup_module(void)
 	dsa_tag_driver_unregister(&DSA_TAG_DRIVER_NAME(none_ops));
 
 	dsa_slave_unregister_notifier();
+	dev_remove_offload(&dsa_pack_offload);
 	dev_remove_pack(&dsa_pack_type);
 	destroy_workqueue(dsa_owq);
 }
diff --git a/net/dsa/dsa2.c b/net/dsa/dsa2.c
index c66abbed4daf..5f66e0280e8e 100644
--- a/net/dsa/dsa2.c
+++ b/net/dsa/dsa2.c
@@ -631,6 +631,7 @@ static int dsa_port_parse_cpu(struct dsa_port *dp, struct net_device *master)
 	}
 
 	dp->type = DSA_PORT_TYPE_CPU;
+	dp->gro_receive = tag_ops->gro_receive;
 	dp->filter = tag_ops->filter;
 	dp->rcv = tag_ops->rcv;
 	dp->tag_ops = tag_ops;
-- 
2.24.1


^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH RFC net-next 03/19] net: dsa: tag_ar9331: add .flow_dissect() callback
  2019-12-30 14:30 [PATCH RFC net-next 00/20] net: dsa: add GRO support Alexander Lobakin
  2019-12-30 14:30 ` [PATCH RFC net-next 01/19] net: dsa: make .flow_dissect() callback returning void Alexander Lobakin
  2019-12-30 14:30 ` [PATCH RFC net-next 02/19] net: dsa: add GRO support infrastructure Alexander Lobakin
@ 2019-12-30 14:30 ` Alexander Lobakin
  2019-12-30 18:22   ` Florian Fainelli
  2019-12-30 14:30 ` [PATCH RFC net-next 04/19] net: dsa: tag_ar9331: split out common tag accessors Alexander Lobakin
                   ` (18 subsequent siblings)
  21 siblings, 1 reply; 42+ messages in thread
From: Alexander Lobakin @ 2019-12-30 14:30 UTC (permalink / raw)
  To: David S. Miller
  Cc: Edward Cree, Andrew Lunn, Vivien Didelot, Florian Fainelli,
	Hauke Mehrtens, Sean Wang, Matthias Brugger, Jiri Pirko,
	Eric Dumazet, Paolo Abeni, Jakub Kicinski, Alexander Lobakin,
	Taehee Yoo, Stephen Hemminger, Stanislav Fomichev,
	Daniel Borkmann, Song Liu, Matteo Croce, Jakub Sitnicki,
	Paul Blakey, Yoshiki Komachi, netdev, linux-kernel,
	linux-arm-kernel, linux-mediatek

...to make RPS work correctly if user would like to configure it.
Misc: fix identation of ar9331_netdev_ops structure.

Signed-off-by: Alexander Lobakin <alobakin@dlink.ru>
---
 net/dsa/tag_ar9331.c | 18 +++++++++++++-----
 1 file changed, 13 insertions(+), 5 deletions(-)

diff --git a/net/dsa/tag_ar9331.c b/net/dsa/tag_ar9331.c
index 466ffa92a474..399ca21ec03b 100644
--- a/net/dsa/tag_ar9331.c
+++ b/net/dsa/tag_ar9331.c
@@ -83,12 +83,20 @@ static struct sk_buff *ar9331_tag_rcv(struct sk_buff *skb,
 	return skb;
 }
 
+static void ar9331_tag_flow_dissect(const struct sk_buff *skb, __be16 *proto,
+				    int *offset)
+{
+	*offset = AR9331_HDR_LEN;
+	*proto = *(__be16 *)skb->data;
+}
+
 static const struct dsa_device_ops ar9331_netdev_ops = {
-	.name	= "ar9331",
-	.proto	= DSA_TAG_PROTO_AR9331,
-	.xmit	= ar9331_tag_xmit,
-	.rcv	= ar9331_tag_rcv,
-	.overhead = AR9331_HDR_LEN,
+	.name		= "ar9331",
+	.proto		= DSA_TAG_PROTO_AR9331,
+	.xmit		= ar9331_tag_xmit,
+	.rcv		= ar9331_tag_rcv,
+	.flow_dissect	= ar9331_tag_flow_dissect,
+	.overhead	= AR9331_HDR_LEN,
 };
 
 MODULE_LICENSE("GPL v2");
-- 
2.24.1


^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH RFC net-next 04/19] net: dsa: tag_ar9331: split out common tag accessors
  2019-12-30 14:30 [PATCH RFC net-next 00/20] net: dsa: add GRO support Alexander Lobakin
                   ` (2 preceding siblings ...)
  2019-12-30 14:30 ` [PATCH RFC net-next 03/19] net: dsa: tag_ar9331: add .flow_dissect() callback Alexander Lobakin
@ 2019-12-30 14:30 ` Alexander Lobakin
  2019-12-30 17:18   ` Andrew Lunn
  2019-12-30 14:30 ` [PATCH RFC net-next 05/19] net: dsa: tag_ar9331: add GRO callbacks Alexander Lobakin
                   ` (17 subsequent siblings)
  21 siblings, 1 reply; 42+ messages in thread
From: Alexander Lobakin @ 2019-12-30 14:30 UTC (permalink / raw)
  To: David S. Miller
  Cc: Edward Cree, Andrew Lunn, Vivien Didelot, Florian Fainelli,
	Hauke Mehrtens, Sean Wang, Matthias Brugger, Jiri Pirko,
	Eric Dumazet, Paolo Abeni, Jakub Kicinski, Alexander Lobakin,
	Taehee Yoo, Stephen Hemminger, Stanislav Fomichev,
	Daniel Borkmann, Song Liu, Matteo Croce, Jakub Sitnicki,
	Paul Blakey, Yoshiki Komachi, netdev, linux-kernel,
	linux-arm-kernel, linux-mediatek

They will be reused in upcoming GRO callbacks.
(Almost) no functional changes except less informative error string.

Signed-off-by: Alexander Lobakin <alobakin@dlink.ru>
---
 net/dsa/tag_ar9331.c | 46 +++++++++++++++++++++++++++-----------------
 1 file changed, 28 insertions(+), 18 deletions(-)

diff --git a/net/dsa/tag_ar9331.c b/net/dsa/tag_ar9331.c
index 399ca21ec03b..c22c1b515e02 100644
--- a/net/dsa/tag_ar9331.c
+++ b/net/dsa/tag_ar9331.c
@@ -24,6 +24,25 @@
 #define AR9331_HDR_RESERVED_MASK	GENMASK(5, 4)
 #define AR9331_HDR_PORT_NUM_MASK	GENMASK(3, 0)
 
+static inline bool ar9331_tag_sanity_check(const u8 *data)
+{
+	u16 hdr = le16_to_cpup((__le16 *)(data - ETH_HLEN));
+
+	return FIELD_GET(AR9331_HDR_VERSION_MASK, hdr) == AR9331_HDR_VERSION &&
+	       !(hdr & AR9331_HDR_FROM_CPU);
+}
+
+static inline int ar9331_tag_source_port(const u8 *data)
+{
+	/* hdr comes in LE byte order, so srcport field is in the first byte */
+	return FIELD_GET(AR9331_HDR_PORT_NUM_MASK, *(data - ETH_HLEN));
+}
+
+static inline __be16 ar9331_tag_encap_proto(const u8 *data)
+{
+	return *(__be16 *)data;
+}
+
 static struct sk_buff *ar9331_tag_xmit(struct sk_buff *skb,
 				       struct net_device *dev)
 {
@@ -50,36 +69,27 @@ static struct sk_buff *ar9331_tag_rcv(struct sk_buff *skb,
 				      struct net_device *ndev,
 				      struct packet_type *pt)
 {
-	u8 ver, port;
-	u16 hdr;
+	int port;
 
 	if (unlikely(!pskb_may_pull(skb, AR9331_HDR_LEN)))
 		return NULL;
 
-	hdr = le16_to_cpu(*(__le16 *)skb_mac_header(skb));
-
-	ver = FIELD_GET(AR9331_HDR_VERSION_MASK, hdr);
-	if (unlikely(ver != AR9331_HDR_VERSION)) {
-		netdev_warn_once(ndev, "%s:%i wrong header version 0x%2x\n",
-				 __func__, __LINE__, hdr);
-		return NULL;
-	}
-
-	if (unlikely(hdr & AR9331_HDR_FROM_CPU)) {
-		netdev_warn_once(ndev, "%s:%i packet should not be from cpu 0x%2x\n",
-				 __func__, __LINE__, hdr);
+	if (unlikely(!ar9331_tag_sanity_check(skb->data))) {
+		netdev_warn_once(ndev,
+				 "%s:%i wrong header version or source port\n",
+				 __func__, __LINE__);
 		return NULL;
 	}
 
-	skb_pull_rcsum(skb, AR9331_HDR_LEN);
-
 	/* Get source port information */
-	port = FIELD_GET(AR9331_HDR_PORT_NUM_MASK, hdr);
+	port = ar9331_tag_source_port(skb->data);
 
 	skb->dev = dsa_master_find_slave(ndev, 0, port);
 	if (!skb->dev)
 		return NULL;
 
+	skb_pull_rcsum(skb, AR9331_HDR_LEN);
+
 	return skb;
 }
 
@@ -87,7 +97,7 @@ static void ar9331_tag_flow_dissect(const struct sk_buff *skb, __be16 *proto,
 				    int *offset)
 {
 	*offset = AR9331_HDR_LEN;
-	*proto = *(__be16 *)skb->data;
+	*proto = ar9331_tag_encap_proto(skb->data);
 }
 
 static const struct dsa_device_ops ar9331_netdev_ops = {
-- 
2.24.1


^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH RFC net-next 05/19] net: dsa: tag_ar9331: add GRO callbacks
  2019-12-30 14:30 [PATCH RFC net-next 00/20] net: dsa: add GRO support Alexander Lobakin
                   ` (3 preceding siblings ...)
  2019-12-30 14:30 ` [PATCH RFC net-next 04/19] net: dsa: tag_ar9331: split out common tag accessors Alexander Lobakin
@ 2019-12-30 14:30 ` Alexander Lobakin
  2019-12-30 18:20   ` Florian Fainelli
  2019-12-30 14:30 ` [PATCH RFC net-next 06/19] net: dsa: tag_gswip: fix typo in tag name Alexander Lobakin
                   ` (16 subsequent siblings)
  21 siblings, 1 reply; 42+ messages in thread
From: Alexander Lobakin @ 2019-12-30 14:30 UTC (permalink / raw)
  To: David S. Miller
  Cc: Edward Cree, Andrew Lunn, Vivien Didelot, Florian Fainelli,
	Hauke Mehrtens, Sean Wang, Matthias Brugger, Jiri Pirko,
	Eric Dumazet, Paolo Abeni, Jakub Kicinski, Alexander Lobakin,
	Taehee Yoo, Stephen Hemminger, Stanislav Fomichev,
	Daniel Borkmann, Song Liu, Matteo Croce, Jakub Sitnicki,
	Paul Blakey, Yoshiki Komachi, netdev, linux-kernel,
	linux-arm-kernel, linux-mediatek

Add GRO callbacks to the AR9331 tagger so GRO layer can now process
such frames.

Signed-off-by: Alexander Lobakin <alobakin@dlink.ru>
---
 net/dsa/tag_ar9331.c | 77 ++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 77 insertions(+)

diff --git a/net/dsa/tag_ar9331.c b/net/dsa/tag_ar9331.c
index c22c1b515e02..99cc7fd92d8e 100644
--- a/net/dsa/tag_ar9331.c
+++ b/net/dsa/tag_ar9331.c
@@ -100,12 +100,89 @@ static void ar9331_tag_flow_dissect(const struct sk_buff *skb, __be16 *proto,
 	*proto = ar9331_tag_encap_proto(skb->data);
 }
 
+static struct sk_buff *ar9331_tag_gro_receive(struct list_head *head,
+					      struct sk_buff *skb)
+{
+	const struct packet_offload *ptype;
+	struct sk_buff *p, *pp = NULL;
+	u32 data_off, data_end;
+	const u8 *data;
+	int flush = 1;
+
+	data_off = skb_gro_offset(skb);
+	data_end = data_off + AR9331_HDR_LEN;
+
+	data = skb_gro_header_fast(skb, data_off);
+	if (skb_gro_header_hard(skb, data_end)) {
+		data = skb_gro_header_slow(skb, data_end, data_off);
+		if (unlikely(!data))
+			goto out;
+	}
+
+	/* Data that is to the left from the current position is already
+	 * pulled to the head
+	 */
+	if (unlikely(!ar9331_tag_sanity_check(skb->data + data_off)))
+		goto out;
+
+	rcu_read_lock();
+
+	ptype = gro_find_receive_by_type(ar9331_tag_encap_proto(data));
+	if (!ptype)
+		goto out_unlock;
+
+	flush = 0;
+
+	list_for_each_entry(p, head, list) {
+		if (!NAPI_GRO_CB(p)->same_flow)
+			continue;
+
+		if (ar9331_tag_source_port(skb->data + data_off) ^
+		    ar9331_tag_source_port(p->data + data_off))
+			NAPI_GRO_CB(p)->same_flow = 0;
+	}
+
+	skb_gro_pull(skb, AR9331_HDR_LEN);
+	skb_gro_postpull_rcsum(skb, data, AR9331_HDR_LEN);
+
+	pp = call_gro_receive(ptype->callbacks.gro_receive, head, skb);
+
+out_unlock:
+	rcu_read_unlock();
+out:
+	skb_gro_flush_final(skb, pp, flush);
+
+	return pp;
+}
+
+static int ar9331_tag_gro_complete(struct sk_buff *skb, int nhoff)
+{
+	const struct packet_offload *ptype;
+	int err = -ENOENT;
+	__be16 proto;
+
+	proto = ar9331_tag_encap_proto(skb->data + nhoff);
+	nhoff += AR9331_HDR_LEN;
+
+	rcu_read_lock();
+
+	ptype = gro_find_complete_by_type(proto);
+	if (ptype)
+		err = ptype->callbacks.gro_complete(skb, nhoff);
+
+	rcu_read_unlock();
+
+	return err;
+}
+
 static const struct dsa_device_ops ar9331_netdev_ops = {
 	.name		= "ar9331",
 	.proto		= DSA_TAG_PROTO_AR9331,
 	.xmit		= ar9331_tag_xmit,
 	.rcv		= ar9331_tag_rcv,
 	.flow_dissect	= ar9331_tag_flow_dissect,
+	.gro_receive	= ar9331_tag_gro_receive,
+	.gro_complete	= ar9331_tag_gro_complete,
 	.overhead	= AR9331_HDR_LEN,
 };
 
-- 
2.24.1


^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH RFC net-next 06/19] net: dsa: tag_gswip: fix typo in tag name
  2019-12-30 14:30 [PATCH RFC net-next 00/20] net: dsa: add GRO support Alexander Lobakin
                   ` (4 preceding siblings ...)
  2019-12-30 14:30 ` [PATCH RFC net-next 05/19] net: dsa: tag_ar9331: add GRO callbacks Alexander Lobakin
@ 2019-12-30 14:30 ` Alexander Lobakin
  2019-12-30 17:22   ` Andrew Lunn
  2019-12-30 14:30 ` [PATCH RFC net-next 07/19] net: dsa: tag_gswip: switch to bitfield helpers Alexander Lobakin
                   ` (15 subsequent siblings)
  21 siblings, 1 reply; 42+ messages in thread
From: Alexander Lobakin @ 2019-12-30 14:30 UTC (permalink / raw)
  To: David S. Miller
  Cc: Edward Cree, Andrew Lunn, Vivien Didelot, Florian Fainelli,
	Hauke Mehrtens, Sean Wang, Matthias Brugger, Jiri Pirko,
	Eric Dumazet, Paolo Abeni, Jakub Kicinski, Alexander Lobakin,
	Taehee Yoo, Stephen Hemminger, Stanislav Fomichev,
	Daniel Borkmann, Song Liu, Matteo Croce, Jakub Sitnicki,
	Paul Blakey, Yoshiki Komachi, netdev, linux-kernel,
	linux-arm-kernel, linux-mediatek

"gwsip" -> "gswip".

Signed-off-by: Alexander Lobakin <alobakin@dlink.ru>
---
 net/dsa/tag_gswip.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/dsa/tag_gswip.c b/net/dsa/tag_gswip.c
index b678160bbd66..408d4af390a0 100644
--- a/net/dsa/tag_gswip.c
+++ b/net/dsa/tag_gswip.c
@@ -104,7 +104,7 @@ static struct sk_buff *gswip_tag_rcv(struct sk_buff *skb,
 }
 
 static const struct dsa_device_ops gswip_netdev_ops = {
-	.name = "gwsip",
+	.name = "gswip",
 	.proto	= DSA_TAG_PROTO_GSWIP,
 	.xmit = gswip_tag_xmit,
 	.rcv = gswip_tag_rcv,
-- 
2.24.1


^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH RFC net-next 07/19] net: dsa: tag_gswip: switch to bitfield helpers
  2019-12-30 14:30 [PATCH RFC net-next 00/20] net: dsa: add GRO support Alexander Lobakin
                   ` (5 preceding siblings ...)
  2019-12-30 14:30 ` [PATCH RFC net-next 06/19] net: dsa: tag_gswip: fix typo in tag name Alexander Lobakin
@ 2019-12-30 14:30 ` Alexander Lobakin
  2019-12-30 14:30 ` [PATCH RFC net-next 08/19] net: dsa: tag_gswip: add .flow_dissect() callback Alexander Lobakin
                   ` (14 subsequent siblings)
  21 siblings, 0 replies; 42+ messages in thread
From: Alexander Lobakin @ 2019-12-30 14:30 UTC (permalink / raw)
  To: David S. Miller
  Cc: Edward Cree, Andrew Lunn, Vivien Didelot, Florian Fainelli,
	Hauke Mehrtens, Sean Wang, Matthias Brugger, Jiri Pirko,
	Eric Dumazet, Paolo Abeni, Jakub Kicinski, Alexander Lobakin,
	Taehee Yoo, Stephen Hemminger, Stanislav Fomichev,
	Daniel Borkmann, Song Liu, Matteo Croce, Jakub Sitnicki,
	Paul Blakey, Yoshiki Komachi, netdev, linux-kernel,
	linux-arm-kernel, linux-mediatek

Make code look cleaner and more readable by using bitfield macros instead
of open-coding masks and shifts.
Misc: remove redundant variable in gswip_tag_xmit(), make dsa_port const.

Signed-off-by: Alexander Lobakin <alobakin@dlink.ru>
---
 net/dsa/tag_gswip.c | 24 +++++++++---------------
 1 file changed, 9 insertions(+), 15 deletions(-)

diff --git a/net/dsa/tag_gswip.c b/net/dsa/tag_gswip.c
index 408d4af390a0..de920f6aac5b 100644
--- a/net/dsa/tag_gswip.c
+++ b/net/dsa/tag_gswip.c
@@ -5,6 +5,7 @@
  * Copyright (C) 2017 - 2018 Hauke Mehrtens <hauke@hauke-m.de>
  */
 
+#include <linux/bitfield.h>
 #include <linux/bitops.h>
 #include <linux/etherdevice.h>
 #include <linux/skbuff.h>
@@ -45,26 +46,22 @@
 #define GSWIP_TX_CLASS_MASK		GENMASK(3, 0)
 
 /* Byte 3 */
+#define GSWIP_TX_PORT_MAP(port)		FIELD_PREP(GENMASK(6, 1), BIT(port))
 #define GSWIP_TX_DPID_EN		BIT(0)
-#define GSWIP_TX_PORT_MAP_SHIFT		1
-#define GSWIP_TX_PORT_MAP_MASK		GENMASK(6, 1)
 
-#define GSWIP_RX_HEADER_LEN	8
+#define GSWIP_RX_HEADER_LEN		8
 
 /* special tag in RX path header */
 /* Byte 7 */
-#define GSWIP_RX_SPPID_SHIFT		4
-#define GSWIP_RX_SPPID_MASK		GENMASK(6, 4)
+#define GSWIP_RX_SPPID(byte_7)		FIELD_GET(GENMASK(6, 4), byte_7)
 
 static struct sk_buff *gswip_tag_xmit(struct sk_buff *skb,
 				      struct net_device *dev)
 {
-	struct dsa_port *dp = dsa_slave_to_port(dev);
-	int err;
+	const struct dsa_port *dp = dsa_slave_to_port(dev);
 	u8 *gswip_tag;
 
-	err = skb_cow_head(skb, GSWIP_TX_HEADER_LEN);
-	if (err)
+	if (skb_cow_head(skb, GSWIP_TX_HEADER_LEN))
 		return NULL;
 
 	skb_push(skb, GSWIP_TX_HEADER_LEN);
@@ -73,8 +70,7 @@ static struct sk_buff *gswip_tag_xmit(struct sk_buff *skb,
 	gswip_tag[0] = GSWIP_TX_SLPID_CPU;
 	gswip_tag[1] = GSWIP_TX_DPID_ELAN;
 	gswip_tag[2] = GSWIP_TX_PORT_MAP_EN | GSWIP_TX_PORT_MAP_SEL;
-	gswip_tag[3] = BIT(dp->index + GSWIP_TX_PORT_MAP_SHIFT) & GSWIP_TX_PORT_MAP_MASK;
-	gswip_tag[3] |= GSWIP_TX_DPID_EN;
+	gswip_tag[3] = GSWIP_TX_PORT_MAP(dp->index) | GSWIP_TX_DPID_EN;
 
 	return skb;
 }
@@ -84,15 +80,13 @@ static struct sk_buff *gswip_tag_rcv(struct sk_buff *skb,
 				     struct packet_type *pt)
 {
 	int port;
-	u8 *gswip_tag;
 
 	if (unlikely(!pskb_may_pull(skb, GSWIP_RX_HEADER_LEN)))
 		return NULL;
 
-	gswip_tag = skb->data - ETH_HLEN;
-
 	/* Get source port information */
-	port = (gswip_tag[7] & GSWIP_RX_SPPID_MASK) >> GSWIP_RX_SPPID_SHIFT;
+	port = GSWIP_RX_SPPID(*(skb->data - ETH_HLEN + 7));
+
 	skb->dev = dsa_master_find_slave(dev, 0, port);
 	if (!skb->dev)
 		return NULL;
-- 
2.24.1


^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH RFC net-next 08/19] net: dsa: tag_gswip: add .flow_dissect() callback
  2019-12-30 14:30 [PATCH RFC net-next 00/20] net: dsa: add GRO support Alexander Lobakin
                   ` (6 preceding siblings ...)
  2019-12-30 14:30 ` [PATCH RFC net-next 07/19] net: dsa: tag_gswip: switch to bitfield helpers Alexander Lobakin
@ 2019-12-30 14:30 ` Alexander Lobakin
  2019-12-30 14:30 ` [PATCH RFC net-next 09/19] net: dsa: tag_gswip: split out common tag accessors Alexander Lobakin
                   ` (13 subsequent siblings)
  21 siblings, 0 replies; 42+ messages in thread
From: Alexander Lobakin @ 2019-12-30 14:30 UTC (permalink / raw)
  To: David S. Miller
  Cc: Edward Cree, Andrew Lunn, Vivien Didelot, Florian Fainelli,
	Hauke Mehrtens, Sean Wang, Matthias Brugger, Jiri Pirko,
	Eric Dumazet, Paolo Abeni, Jakub Kicinski, Alexander Lobakin,
	Taehee Yoo, Stephen Hemminger, Stanislav Fomichev,
	Daniel Borkmann, Song Liu, Matteo Croce, Jakub Sitnicki,
	Paul Blakey, Yoshiki Komachi, netdev, linux-kernel,
	linux-arm-kernel, linux-mediatek

In case user would like to configure RPS on such systems.
Misc: fix identation of gswip_netdev_ops structure.

Signed-off-by: Alexander Lobakin <alobakin@dlink.ru>
---
 net/dsa/tag_gswip.c | 18 +++++++++++++-----
 1 file changed, 13 insertions(+), 5 deletions(-)

diff --git a/net/dsa/tag_gswip.c b/net/dsa/tag_gswip.c
index de920f6aac5b..d37289540ef3 100644
--- a/net/dsa/tag_gswip.c
+++ b/net/dsa/tag_gswip.c
@@ -97,12 +97,20 @@ static struct sk_buff *gswip_tag_rcv(struct sk_buff *skb,
 	return skb;
 }
 
+static void gswip_tag_flow_dissect(const struct sk_buff *skb, __be16 *proto,
+				   int *offset)
+{
+	*offset = GSWIP_RX_HEADER_LEN;
+	*proto = *(__be16 *)(skb->data + 6);
+}
+
 static const struct dsa_device_ops gswip_netdev_ops = {
-	.name = "gswip",
-	.proto	= DSA_TAG_PROTO_GSWIP,
-	.xmit = gswip_tag_xmit,
-	.rcv = gswip_tag_rcv,
-	.overhead = GSWIP_RX_HEADER_LEN,
+	.name		= "gswip",
+	.proto		= DSA_TAG_PROTO_GSWIP,
+	.xmit		= gswip_tag_xmit,
+	.rcv		= gswip_tag_rcv,
+	.flow_dissect	= gswip_tag_flow_dissect,
+	.overhead	= GSWIP_RX_HEADER_LEN,
 };
 
 MODULE_LICENSE("GPL");
-- 
2.24.1


^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH RFC net-next 09/19] net: dsa: tag_gswip: split out common tag accessors
  2019-12-30 14:30 [PATCH RFC net-next 00/20] net: dsa: add GRO support Alexander Lobakin
                   ` (7 preceding siblings ...)
  2019-12-30 14:30 ` [PATCH RFC net-next 08/19] net: dsa: tag_gswip: add .flow_dissect() callback Alexander Lobakin
@ 2019-12-30 14:30 ` Alexander Lobakin
  2019-12-30 14:30 ` [PATCH RFC net-next 10/19] net: dsa: tag_gswip: add GRO callbacks Alexander Lobakin
                   ` (12 subsequent siblings)
  21 siblings, 0 replies; 42+ messages in thread
From: Alexander Lobakin @ 2019-12-30 14:30 UTC (permalink / raw)
  To: David S. Miller
  Cc: Edward Cree, Andrew Lunn, Vivien Didelot, Florian Fainelli,
	Hauke Mehrtens, Sean Wang, Matthias Brugger, Jiri Pirko,
	Eric Dumazet, Paolo Abeni, Jakub Kicinski, Alexander Lobakin,
	Taehee Yoo, Stephen Hemminger, Stanislav Fomichev,
	Daniel Borkmann, Song Liu, Matteo Croce, Jakub Sitnicki,
	Paul Blakey, Yoshiki Komachi, netdev, linux-kernel,
	linux-arm-kernel, linux-mediatek

...to reuse them in GRO callbacks.

Signed-off-by: Alexander Lobakin <alobakin@dlink.ru>
---
 net/dsa/tag_gswip.c | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/net/dsa/tag_gswip.c b/net/dsa/tag_gswip.c
index d37289540ef3..e7b36de27fd8 100644
--- a/net/dsa/tag_gswip.c
+++ b/net/dsa/tag_gswip.c
@@ -55,6 +55,16 @@
 /* Byte 7 */
 #define GSWIP_RX_SPPID(byte_7)		FIELD_GET(GENMASK(6, 4), byte_7)
 
+static inline int gswip_tag_source_port(const u8 *data)
+{
+	return GSWIP_RX_SPPID(*(data - ETH_HLEN + 7));
+}
+
+static inline __be16 gswip_tag_encap_proto(const u8 *data)
+{
+	return *(__be16 *)(data + 6);
+}
+
 static struct sk_buff *gswip_tag_xmit(struct sk_buff *skb,
 				      struct net_device *dev)
 {
@@ -85,7 +95,7 @@ static struct sk_buff *gswip_tag_rcv(struct sk_buff *skb,
 		return NULL;
 
 	/* Get source port information */
-	port = GSWIP_RX_SPPID(*(skb->data - ETH_HLEN + 7));
+	port = gswip_tag_source_port(skb->data);
 
 	skb->dev = dsa_master_find_slave(dev, 0, port);
 	if (!skb->dev)
@@ -101,7 +111,7 @@ static void gswip_tag_flow_dissect(const struct sk_buff *skb, __be16 *proto,
 				   int *offset)
 {
 	*offset = GSWIP_RX_HEADER_LEN;
-	*proto = *(__be16 *)(skb->data + 6);
+	*proto = gswip_tag_encap_proto(skb->data);
 }
 
 static const struct dsa_device_ops gswip_netdev_ops = {
-- 
2.24.1


^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH RFC net-next 10/19] net: dsa: tag_gswip: add GRO callbacks
  2019-12-30 14:30 [PATCH RFC net-next 00/20] net: dsa: add GRO support Alexander Lobakin
                   ` (8 preceding siblings ...)
  2019-12-30 14:30 ` [PATCH RFC net-next 09/19] net: dsa: tag_gswip: split out common tag accessors Alexander Lobakin
@ 2019-12-30 14:30 ` Alexander Lobakin
  2019-12-30 14:30 ` [PATCH RFC net-next 11/19] net: dsa: tag_lan9303: add .flow_dissect() callback Alexander Lobakin
                   ` (11 subsequent siblings)
  21 siblings, 0 replies; 42+ messages in thread
From: Alexander Lobakin @ 2019-12-30 14:30 UTC (permalink / raw)
  To: David S. Miller
  Cc: Edward Cree, Andrew Lunn, Vivien Didelot, Florian Fainelli,
	Hauke Mehrtens, Sean Wang, Matthias Brugger, Jiri Pirko,
	Eric Dumazet, Paolo Abeni, Jakub Kicinski, Alexander Lobakin,
	Taehee Yoo, Stephen Hemminger, Stanislav Fomichev,
	Daniel Borkmann, Song Liu, Matteo Croce, Jakub Sitnicki,
	Paul Blakey, Yoshiki Komachi, netdev, linux-kernel,
	linux-arm-kernel, linux-mediatek

...so that GRO layer can now correctly process GSWIP-tagged frames.

Signed-off-by: Alexander Lobakin <alobakin@dlink.ru>
---
 net/dsa/tag_gswip.c | 74 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 74 insertions(+)

diff --git a/net/dsa/tag_gswip.c b/net/dsa/tag_gswip.c
index e7b36de27fd8..56e754bd434a 100644
--- a/net/dsa/tag_gswip.c
+++ b/net/dsa/tag_gswip.c
@@ -114,12 +114,86 @@ static void gswip_tag_flow_dissect(const struct sk_buff *skb, __be16 *proto,
 	*proto = gswip_tag_encap_proto(skb->data);
 }
 
+static struct sk_buff *gswip_tag_gro_receive(struct list_head *head,
+					     struct sk_buff *skb)
+{
+	const struct packet_offload *ptype;
+	struct sk_buff *p, *pp = NULL;
+	u32 data_off, data_end;
+	const u8 *data;
+	int flush = 1;
+
+	data_off = skb_gro_offset(skb);
+	data_end = data_off + GSWIP_RX_HEADER_LEN;
+
+	data = skb_gro_header_fast(skb, data_off);
+	if (skb_gro_header_hard(skb, data_end)) {
+		data = skb_gro_header_slow(skb, data_end, data_off);
+		if (unlikely(!data))
+			goto out;
+	}
+
+	rcu_read_lock();
+
+	ptype = gro_find_receive_by_type(gswip_tag_encap_proto(data));
+	if (!ptype)
+		goto out_unlock;
+
+	flush = 0;
+
+	list_for_each_entry(p, head, list) {
+		if (!NAPI_GRO_CB(p)->same_flow)
+			continue;
+
+		/* Data that is to the left to the current position is already
+		 * pulled to the head
+		 */
+		if (gswip_tag_source_port(skb->data + data_off) ^
+		    gswip_tag_source_port(p->data + data_off))
+			NAPI_GRO_CB(p)->same_flow = 0;
+	}
+
+	skb_gro_pull(skb, GSWIP_RX_HEADER_LEN);
+	skb_gro_postpull_rcsum(skb, data, GSWIP_RX_HEADER_LEN);
+
+	pp = call_gro_receive(ptype->callbacks.gro_receive, head, skb);
+
+out_unlock:
+	rcu_read_unlock();
+out:
+	skb_gro_flush_final(skb, pp, flush);
+
+	return pp;
+}
+
+static int gswip_tag_gro_complete(struct sk_buff *skb, int nhoff)
+{
+	const struct packet_offload *ptype;
+	int err = -ENOENT;
+	__be16 proto;
+
+	proto = gswip_tag_encap_proto(skb->data + nhoff);
+	nhoff += GSWIP_RX_HEADER_LEN;
+
+	rcu_read_lock();
+
+	ptype = gro_find_complete_by_type(proto);
+	if (ptype)
+		err = ptype->callbacks.gro_complete(skb, nhoff);
+
+	rcu_read_unlock();
+
+	return err;
+}
+
 static const struct dsa_device_ops gswip_netdev_ops = {
 	.name		= "gswip",
 	.proto		= DSA_TAG_PROTO_GSWIP,
 	.xmit		= gswip_tag_xmit,
 	.rcv		= gswip_tag_rcv,
 	.flow_dissect	= gswip_tag_flow_dissect,
+	.gro_receive	= gswip_tag_gro_receive,
+	.gro_complete	= gswip_tag_gro_complete,
 	.overhead	= GSWIP_RX_HEADER_LEN,
 };
 
-- 
2.24.1


^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH RFC net-next 11/19] net: dsa: tag_lan9303: add .flow_dissect() callback
  2019-12-30 14:30 [PATCH RFC net-next 00/20] net: dsa: add GRO support Alexander Lobakin
                   ` (9 preceding siblings ...)
  2019-12-30 14:30 ` [PATCH RFC net-next 10/19] net: dsa: tag_gswip: add GRO callbacks Alexander Lobakin
@ 2019-12-30 14:30 ` Alexander Lobakin
  2019-12-30 14:30 ` [PATCH RFC net-next 12/19] net: dsa: tag_lan9303: split out common tag accessors Alexander Lobakin
                   ` (10 subsequent siblings)
  21 siblings, 0 replies; 42+ messages in thread
From: Alexander Lobakin @ 2019-12-30 14:30 UTC (permalink / raw)
  To: David S. Miller
  Cc: Edward Cree, Andrew Lunn, Vivien Didelot, Florian Fainelli,
	Hauke Mehrtens, Sean Wang, Matthias Brugger, Jiri Pirko,
	Eric Dumazet, Paolo Abeni, Jakub Kicinski, Alexander Lobakin,
	Taehee Yoo, Stephen Hemminger, Stanislav Fomichev,
	Daniel Borkmann, Song Liu, Matteo Croce, Jakub Sitnicki,
	Paul Blakey, Yoshiki Komachi, netdev, linux-kernel,
	linux-arm-kernel, linux-mediatek

This fixes Rx packet hashing on RPS-enabled systems.
Misc: fix lan9303_netdev_ops identation.

Signed-off-by: Alexander Lobakin <alobakin@dlink.ru>
---
 net/dsa/tag_lan9303.c | 18 +++++++++++++-----
 1 file changed, 13 insertions(+), 5 deletions(-)

diff --git a/net/dsa/tag_lan9303.c b/net/dsa/tag_lan9303.c
index eb0e7a32e53d..d328a44381a9 100644
--- a/net/dsa/tag_lan9303.c
+++ b/net/dsa/tag_lan9303.c
@@ -128,12 +128,20 @@ static struct sk_buff *lan9303_rcv(struct sk_buff *skb, struct net_device *dev,
 	return skb;
 }
 
+static void lan9303_flow_dissect(const struct sk_buff *skb, __be16 *proto,
+				 int *offset)
+{
+	*offset = LAN9303_TAG_LEN;
+	*proto = *(__be16 *)(skb->data + 2);
+}
+
 static const struct dsa_device_ops lan9303_netdev_ops = {
-	.name = "lan9303",
-	.proto	= DSA_TAG_PROTO_LAN9303,
-	.xmit = lan9303_xmit,
-	.rcv = lan9303_rcv,
-	.overhead = LAN9303_TAG_LEN,
+	.name		= "lan9303",
+	.proto		= DSA_TAG_PROTO_LAN9303,
+	.xmit		= lan9303_xmit,
+	.rcv		= lan9303_rcv,
+	.flow_dissect	= lan9303_flow_dissect,
+	.overhead	= LAN9303_TAG_LEN,
 };
 
 MODULE_LICENSE("GPL");
-- 
2.24.1


^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH RFC net-next 12/19] net: dsa: tag_lan9303: split out common tag accessors
  2019-12-30 14:30 [PATCH RFC net-next 00/20] net: dsa: add GRO support Alexander Lobakin
                   ` (10 preceding siblings ...)
  2019-12-30 14:30 ` [PATCH RFC net-next 11/19] net: dsa: tag_lan9303: add .flow_dissect() callback Alexander Lobakin
@ 2019-12-30 14:30 ` Alexander Lobakin
  2019-12-30 14:30 ` [PATCH RFC net-next 13/19] net: dsa: tag_lan9303: add GRO callbacks Alexander Lobakin
                   ` (9 subsequent siblings)
  21 siblings, 0 replies; 42+ messages in thread
From: Alexander Lobakin @ 2019-12-30 14:30 UTC (permalink / raw)
  To: David S. Miller
  Cc: Edward Cree, Andrew Lunn, Vivien Didelot, Florian Fainelli,
	Hauke Mehrtens, Sean Wang, Matthias Brugger, Jiri Pirko,
	Eric Dumazet, Paolo Abeni, Jakub Kicinski, Alexander Lobakin,
	Taehee Yoo, Stephen Hemminger, Stanislav Fomichev,
	Daniel Borkmann, Song Liu, Matteo Croce, Jakub Sitnicki,
	Paul Blakey, Yoshiki Komachi, netdev, linux-kernel,
	linux-arm-kernel, linux-mediatek

...as they will be needed in the upcoming GRO callbacks.

Signed-off-by: Alexander Lobakin <alobakin@dlink.ru>
---
 net/dsa/tag_lan9303.c | 46 +++++++++++++++++++++++++++++--------------
 1 file changed, 31 insertions(+), 15 deletions(-)

diff --git a/net/dsa/tag_lan9303.c b/net/dsa/tag_lan9303.c
index d328a44381a9..ba03502986a4 100644
--- a/net/dsa/tag_lan9303.c
+++ b/net/dsa/tag_lan9303.c
@@ -38,6 +38,32 @@
 # define LAN9303_TAG_RX_TRAPPED_TO_CPU (LAN9303_TAG_RX_IGMP | \
 					LAN9303_TAG_RX_STP)
 
+static inline bool lan9303_sanity_check(const u8 *data)
+{
+	/* '->data' points into the middle of our special VLAN tag information:
+	 *
+	 * ~ MAC src   | 0x81 | 0x00 | 0xyy | 0xzz | ether type
+	 *                           ^
+	 *                        ->data
+	 */
+	return *(__be16 *)(data - 2) == htons(ETH_P_8021Q);
+}
+
+static inline bool lan9303_trapped_to_cpu(const u8 *data)
+{
+	return *(data + 1) & LAN9303_TAG_RX_TRAPPED_TO_CPU;
+}
+
+static inline int lan9303_source_port(const u8 *data)
+{
+	return *(data + 1) & GENMASK(1, 0);
+}
+
+static inline __be16 lan9303_encap_proto(const u8 *data)
+{
+	return *(__be16 *)(data + 2);
+}
+
 /* Decide whether to transmit using ALR lookup, or transmit directly to
  * port using tag. ALR learning is performed only when using ALR lookup.
  * If the two external ports are bridged and the frame is unicast,
@@ -85,8 +111,6 @@ static struct sk_buff *lan9303_xmit(struct sk_buff *skb, struct net_device *dev)
 static struct sk_buff *lan9303_rcv(struct sk_buff *skb, struct net_device *dev,
 				   struct packet_type *pt)
 {
-	u16 *lan9303_tag;
-	u16 lan9303_tag1;
 	unsigned int source_port;
 
 	if (unlikely(!pskb_may_pull(skb, LAN9303_TAG_LEN))) {
@@ -95,21 +119,12 @@ static struct sk_buff *lan9303_rcv(struct sk_buff *skb, struct net_device *dev,
 		return NULL;
 	}
 
-	/* '->data' points into the middle of our special VLAN tag information:
-	 *
-	 * ~ MAC src   | 0x81 | 0x00 | 0xyy | 0xzz | ether type
-	 *                           ^
-	 *                        ->data
-	 */
-	lan9303_tag = (u16 *)(skb->data - 2);
-
-	if (lan9303_tag[0] != htons(ETH_P_8021Q)) {
+	if (!lan9303_sanity_check(skb->data)) {
 		dev_warn_ratelimited(&dev->dev, "Dropping packet due to invalid VLAN marker\n");
 		return NULL;
 	}
 
-	lan9303_tag1 = ntohs(lan9303_tag[1]);
-	source_port = lan9303_tag1 & 0x3;
+	source_port = lan9303_source_port(skb->data);
 
 	skb->dev = dsa_master_find_slave(dev, 0, source_port);
 	if (!skb->dev) {
@@ -117,13 +132,14 @@ static struct sk_buff *lan9303_rcv(struct sk_buff *skb, struct net_device *dev,
 		return NULL;
 	}
 
+	skb->offload_fwd_mark = !lan9303_trapped_to_cpu(skb->data);
+
 	/* remove the special VLAN tag between the MAC addresses
 	 * and the current ethertype field.
 	 */
 	skb_pull_rcsum(skb, 2 + 2);
 	memmove(skb->data - ETH_HLEN, skb->data - (ETH_HLEN + LAN9303_TAG_LEN),
 		2 * ETH_ALEN);
-	skb->offload_fwd_mark = !(lan9303_tag1 & LAN9303_TAG_RX_TRAPPED_TO_CPU);
 
 	return skb;
 }
@@ -132,7 +148,7 @@ static void lan9303_flow_dissect(const struct sk_buff *skb, __be16 *proto,
 				 int *offset)
 {
 	*offset = LAN9303_TAG_LEN;
-	*proto = *(__be16 *)(skb->data + 2);
+	*proto = lan9303_encap_proto(skb->data);
 }
 
 static const struct dsa_device_ops lan9303_netdev_ops = {
-- 
2.24.1


^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH RFC net-next 13/19] net: dsa: tag_lan9303: add GRO callbacks
  2019-12-30 14:30 [PATCH RFC net-next 00/20] net: dsa: add GRO support Alexander Lobakin
                   ` (11 preceding siblings ...)
  2019-12-30 14:30 ` [PATCH RFC net-next 12/19] net: dsa: tag_lan9303: split out common tag accessors Alexander Lobakin
@ 2019-12-30 14:30 ` Alexander Lobakin
  2019-12-30 14:30 ` [PATCH RFC net-next 14/19] net: dsa: tag_mtk: split out common tag accessors Alexander Lobakin
                   ` (8 subsequent siblings)
  21 siblings, 0 replies; 42+ messages in thread
From: Alexander Lobakin @ 2019-12-30 14:30 UTC (permalink / raw)
  To: David S. Miller
  Cc: Edward Cree, Andrew Lunn, Vivien Didelot, Florian Fainelli,
	Hauke Mehrtens, Sean Wang, Matthias Brugger, Jiri Pirko,
	Eric Dumazet, Paolo Abeni, Jakub Kicinski, Alexander Lobakin,
	Taehee Yoo, Stephen Hemminger, Stanislav Fomichev,
	Daniel Borkmann, Song Liu, Matteo Croce, Jakub Sitnicki,
	Paul Blakey, Yoshiki Komachi, netdev, linux-kernel,
	linux-arm-kernel, linux-mediatek

Add GRO callbacks for LAN9303 tagger to make GRO available for frames
tagged with it.

Signed-off-by: Alexander Lobakin <alobakin@dlink.ru>
---
 net/dsa/tag_lan9303.c | 77 +++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 77 insertions(+)

diff --git a/net/dsa/tag_lan9303.c b/net/dsa/tag_lan9303.c
index ba03502986a4..77aba4311f2d 100644
--- a/net/dsa/tag_lan9303.c
+++ b/net/dsa/tag_lan9303.c
@@ -151,12 +151,89 @@ static void lan9303_flow_dissect(const struct sk_buff *skb, __be16 *proto,
 	*proto = lan9303_encap_proto(skb->data);
 }
 
+static struct sk_buff *lan9303_gro_receive(struct list_head *head,
+					   struct sk_buff *skb)
+{
+	const struct packet_offload *ptype;
+	struct sk_buff *p, *pp = NULL;
+	const u8 *data, *data_p;
+	u32 data_off, data_end;
+	int flush = 1;
+
+	data_off = skb_gro_offset(skb);
+	data_end = data_off + LAN9303_TAG_LEN;
+
+	data = skb_gro_header_fast(skb, data_off);
+	if (skb_gro_header_hard(skb, data_end)) {
+		data = skb_gro_header_slow(skb, data_end, data_off);
+		if (unlikely(!data))
+			goto out;
+	}
+
+	/* Data that is to the left to the current position is already
+	 * pulled to the head
+	 */
+	if (!lan9303_sanity_check(skb->data + data_off))
+		goto out;
+
+	rcu_read_lock();
+
+	ptype = gro_find_receive_by_type(lan9303_encap_proto(data));
+	if (!ptype)
+		goto out_unlock;
+
+	flush = 0;
+
+	list_for_each_entry(p, head, list) {
+		if (!NAPI_GRO_CB(p)->same_flow)
+			continue;
+
+		data_p = p->data + data_off;
+		if (lan9303_source_port(data) ^ lan9303_source_port(data_p))
+			NAPI_GRO_CB(p)->same_flow = 0;
+	}
+
+	skb_gro_pull(skb, LAN9303_TAG_LEN);
+	skb_gro_postpull_rcsum(skb, data, LAN9303_TAG_LEN);
+
+	pp = call_gro_receive(ptype->callbacks.gro_receive, head, skb);
+
+out_unlock:
+	rcu_read_unlock();
+out:
+	skb_gro_flush_final(skb, pp, flush);
+
+	return pp;
+}
+
+static int lan9303_gro_complete(struct sk_buff *skb, int nhoff)
+{
+	const struct packet_offload *ptype;
+	int err = -ENOENT;
+	__be16 proto;
+
+	proto = lan9303_encap_proto(skb->data + nhoff);
+	nhoff += LAN9303_TAG_LEN;
+
+	rcu_read_lock();
+
+	ptype = gro_find_complete_by_type(proto);
+	if (ptype)
+		err = ptype->callbacks.gro_complete(skb, nhoff);
+
+	rcu_read_unlock();
+
+	return err;
+}
+
 static const struct dsa_device_ops lan9303_netdev_ops = {
 	.name		= "lan9303",
 	.proto		= DSA_TAG_PROTO_LAN9303,
 	.xmit		= lan9303_xmit,
 	.rcv		= lan9303_rcv,
 	.flow_dissect	= lan9303_flow_dissect,
+	.gro_receive	= lan9303_gro_receive,
+	.gro_complete	= lan9303_gro_complete,
 	.overhead	= LAN9303_TAG_LEN,
 };
 
-- 
2.24.1


^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH RFC net-next 14/19] net: dsa: tag_mtk: split out common tag accessors
  2019-12-30 14:30 [PATCH RFC net-next 00/20] net: dsa: add GRO support Alexander Lobakin
                   ` (12 preceding siblings ...)
  2019-12-30 14:30 ` [PATCH RFC net-next 13/19] net: dsa: tag_lan9303: add GRO callbacks Alexander Lobakin
@ 2019-12-30 14:30 ` Alexander Lobakin
  2019-12-30 14:30 ` [PATCH RFC net-next 15/19] net: dsa: tag_mtk: add GRO callbacks Alexander Lobakin
                   ` (7 subsequent siblings)
  21 siblings, 0 replies; 42+ messages in thread
From: Alexander Lobakin @ 2019-12-30 14:30 UTC (permalink / raw)
  To: David S. Miller
  Cc: Edward Cree, Andrew Lunn, Vivien Didelot, Florian Fainelli,
	Hauke Mehrtens, Sean Wang, Matthias Brugger, Jiri Pirko,
	Eric Dumazet, Paolo Abeni, Jakub Kicinski, Alexander Lobakin,
	Taehee Yoo, Stephen Hemminger, Stanislav Fomichev,
	Daniel Borkmann, Song Liu, Matteo Croce, Jakub Sitnicki,
	Paul Blakey, Yoshiki Komachi, netdev, linux-kernel,
	linux-arm-kernel, linux-mediatek

Misc: fix identation of MTK_HDR_LEN and make use of it in
mtk_tag_flow_dissect() instead of open-coded value.

Signed-off-by: Alexander Lobakin <alobakin@dlink.ru>
---
 net/dsa/tag_mtk.c | 40 +++++++++++++++++++++++-----------------
 1 file changed, 23 insertions(+), 17 deletions(-)

diff --git a/net/dsa/tag_mtk.c b/net/dsa/tag_mtk.c
index c3ad7b7b142a..b926ffdf5fb5 100644
--- a/net/dsa/tag_mtk.c
+++ b/net/dsa/tag_mtk.c
@@ -10,12 +10,27 @@
 
 #include "dsa_priv.h"
 
-#define MTK_HDR_LEN		4
+#define MTK_HDR_LEN			4
 #define MTK_HDR_XMIT_UNTAGGED		0
 #define MTK_HDR_XMIT_TAGGED_TPID_8100	1
 #define MTK_HDR_RECV_SOURCE_PORT_MASK	GENMASK(2, 0)
 #define MTK_HDR_XMIT_DP_BIT_MASK	GENMASK(5, 0)
 
+static inline int mtk_tag_source_port(const u8 *data)
+{
+	/* The MTK header is added by the switch between src addr
+	 * and ethertype at this point, skb->data points to 2 bytes
+	 * after src addr so header should be 2 bytes right before.
+	 * The source port field is in the second byte of the tag.
+	 */
+	return *(data - 1) & MTK_HDR_RECV_SOURCE_PORT_MASK;
+}
+
+static inline __be16 mtk_tag_encap_proto(const u8 *data)
+{
+	return *(__be16 *)(data + 2);
+}
+
 static struct sk_buff *mtk_tag_xmit(struct sk_buff *skb,
 				    struct net_device *dev)
 {
@@ -60,17 +75,15 @@ static struct sk_buff *mtk_tag_rcv(struct sk_buff *skb, struct net_device *dev,
 				   struct packet_type *pt)
 {
 	int port;
-	__be16 *phdr, hdr;
 
 	if (unlikely(!pskb_may_pull(skb, MTK_HDR_LEN)))
 		return NULL;
 
-	/* The MTK header is added by the switch between src addr
-	 * and ethertype at this point, skb->data points to 2 bytes
-	 * after src addr so header should be 2 bytes right before.
-	 */
-	phdr = (__be16 *)(skb->data - 2);
-	hdr = ntohs(*phdr);
+	port = mtk_tag_source_port(skb->data);
+
+	skb->dev = dsa_master_find_slave(dev, 0, port);
+	if (!skb->dev)
+		return NULL;
 
 	/* Remove MTK tag and recalculate checksum. */
 	skb_pull_rcsum(skb, MTK_HDR_LEN);
@@ -79,21 +92,14 @@ static struct sk_buff *mtk_tag_rcv(struct sk_buff *skb, struct net_device *dev,
 		skb->data - ETH_HLEN - MTK_HDR_LEN,
 		2 * ETH_ALEN);
 
-	/* Get source port information */
-	port = (hdr & MTK_HDR_RECV_SOURCE_PORT_MASK);
-
-	skb->dev = dsa_master_find_slave(dev, 0, port);
-	if (!skb->dev)
-		return NULL;
-
 	return skb;
 }
 
 static void mtk_tag_flow_dissect(const struct sk_buff *skb, __be16 *proto,
 				 int *offset)
 {
-	*offset = 4;
-	*proto = ((__be16 *)skb->data)[1];
+	*offset = MTK_HDR_LEN;
+	*proto = mtk_tag_encap_proto(skb->data);
 }
 
 static const struct dsa_device_ops mtk_netdev_ops = {
-- 
2.24.1


^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH RFC net-next 15/19] net: dsa: tag_mtk: add GRO callbacks
  2019-12-30 14:30 [PATCH RFC net-next 00/20] net: dsa: add GRO support Alexander Lobakin
                   ` (13 preceding siblings ...)
  2019-12-30 14:30 ` [PATCH RFC net-next 14/19] net: dsa: tag_mtk: split out common tag accessors Alexander Lobakin
@ 2019-12-30 14:30 ` Alexander Lobakin
  2019-12-30 14:30 ` [PATCH RFC net-next 16/19] net: dsa: tag_qca: fix doubled Tx statistics Alexander Lobakin
                   ` (6 subsequent siblings)
  21 siblings, 0 replies; 42+ messages in thread
From: Alexander Lobakin @ 2019-12-30 14:30 UTC (permalink / raw)
  To: David S. Miller
  Cc: Edward Cree, Andrew Lunn, Vivien Didelot, Florian Fainelli,
	Hauke Mehrtens, Sean Wang, Matthias Brugger, Jiri Pirko,
	Eric Dumazet, Paolo Abeni, Jakub Kicinski, Alexander Lobakin,
	Taehee Yoo, Stephen Hemminger, Stanislav Fomichev,
	Daniel Borkmann, Song Liu, Matteo Croce, Jakub Sitnicki,
	Paul Blakey, Yoshiki Komachi, netdev, linux-kernel,
	linux-arm-kernel, linux-mediatek

Make GRO available for frames with MTK tag by adding the corresponding
callbacks to MTK tagger.

Signed-off-by: Alexander Lobakin <alobakin@dlink.ru>
---
 net/dsa/tag_mtk.c | 73 +++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 73 insertions(+)

diff --git a/net/dsa/tag_mtk.c b/net/dsa/tag_mtk.c
index b926ffdf5fb5..13d929160283 100644
--- a/net/dsa/tag_mtk.c
+++ b/net/dsa/tag_mtk.c
@@ -102,12 +102,85 @@ static void mtk_tag_flow_dissect(const struct sk_buff *skb, __be16 *proto,
 	*proto = mtk_tag_encap_proto(skb->data);
 }
 
+static struct sk_buff *mtk_tag_gro_receive(struct list_head *head,
+					   struct sk_buff *skb)
+{
+	const struct packet_offload *ptype;
+	struct sk_buff *p, *pp = NULL;
+	u32 data_off, data_end;
+	const u8 *data;
+	int flush = 1;
+
+	data_off = skb_gro_offset(skb);
+	data_end = data_off + MTK_HDR_LEN;
+
+	data = skb_gro_header_fast(skb, data_off);
+	if (skb_gro_header_hard(skb, data_end)) {
+		data = skb_gro_header_slow(skb, data_end, data_off);
+		if (unlikely(!data))
+			goto out;
+	}
+
+	rcu_read_lock();
+
+	ptype = gro_find_receive_by_type(mtk_tag_encap_proto(data));
+	if (!ptype)
+		goto out_unlock;
+
+	flush = 0;
+
+	list_for_each_entry(p, head, list) {
+		if (!NAPI_GRO_CB(p)->same_flow)
+			continue;
+
+		/* Data that is to the left to the current position is already
+		 * pulled to the head
+		 */
+		if (mtk_tag_source_port(skb->data + data_off) ^
+		    mtk_tag_source_port(p->data + data_off))
+			NAPI_GRO_CB(p)->same_flow = 0;
+	}
+
+	skb_gro_pull(skb, MTK_HDR_LEN);
+	skb_gro_postpull_rcsum(skb, data, MTK_HDR_LEN);
+
+	pp = call_gro_receive(ptype->callbacks.gro_receive, head, skb);
+
+out_unlock:
+	rcu_read_unlock();
+out:
+	skb_gro_flush_final(skb, pp, flush);
+
+	return pp;
+}
+
+static int mtk_tag_gro_complete(struct sk_buff *skb, int nhoff)
+{
+	const struct packet_offload *ptype;
+	int err = -ENOENT;
+	__be16 proto;
+
+	proto = mtk_tag_encap_proto(skb->data + nhoff);
+
+	rcu_read_lock();
+
+	ptype = gro_find_complete_by_type(proto);
+	if (ptype)
+		err = ptype->callbacks.gro_complete(skb, nhoff + MTK_HDR_LEN);
+
+	rcu_read_unlock();
+
+	return err;
+}
+
 static const struct dsa_device_ops mtk_netdev_ops = {
 	.name		= "mtk",
 	.proto		= DSA_TAG_PROTO_MTK,
 	.xmit		= mtk_tag_xmit,
 	.rcv		= mtk_tag_rcv,
 	.flow_dissect	= mtk_tag_flow_dissect,
+	.gro_receive	= mtk_tag_gro_receive,
+	.gro_complete	= mtk_tag_gro_complete,
 	.overhead	= MTK_HDR_LEN,
 };
 
-- 
2.24.1


^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH RFC net-next 16/19] net: dsa: tag_qca: fix doubled Tx statistics
  2019-12-30 14:30 [PATCH RFC net-next 00/20] net: dsa: add GRO support Alexander Lobakin
                   ` (14 preceding siblings ...)
  2019-12-30 14:30 ` [PATCH RFC net-next 15/19] net: dsa: tag_mtk: add GRO callbacks Alexander Lobakin
@ 2019-12-30 14:30 ` Alexander Lobakin
  2019-12-30 17:23   ` Andrew Lunn
  2019-12-30 14:30 ` [PATCH RFC net-next 17/19] net: dsa: tag_qca: switch to bitfield helpers Alexander Lobakin
                   ` (5 subsequent siblings)
  21 siblings, 1 reply; 42+ messages in thread
From: Alexander Lobakin @ 2019-12-30 14:30 UTC (permalink / raw)
  To: David S. Miller
  Cc: Edward Cree, Andrew Lunn, Vivien Didelot, Florian Fainelli,
	Hauke Mehrtens, Sean Wang, Matthias Brugger, Jiri Pirko,
	Eric Dumazet, Paolo Abeni, Jakub Kicinski, Alexander Lobakin,
	Taehee Yoo, Stephen Hemminger, Stanislav Fomichev,
	Daniel Borkmann, Song Liu, Matteo Croce, Jakub Sitnicki,
	Paul Blakey, Yoshiki Komachi, netdev, linux-kernel,
	linux-arm-kernel, linux-mediatek

DSA core updates Tx stats for slaves in dsa_slave_xmit(), no need to do
it manually in .xmit() tagger callback.

Signed-off-by: Alexander Lobakin <alobakin@dlink.ru>
---
 net/dsa/tag_qca.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/net/dsa/tag_qca.c b/net/dsa/tag_qca.c
index 8e2dbaaffe59..e1c4dd04734a 100644
--- a/net/dsa/tag_qca.c
+++ b/net/dsa/tag_qca.c
@@ -33,9 +33,6 @@ static struct sk_buff *qca_tag_xmit(struct sk_buff *skb, struct net_device *dev)
 	struct dsa_port *dp = dsa_slave_to_port(dev);
 	u16 *phdr, hdr;
 
-	dev->stats.tx_packets++;
-	dev->stats.tx_bytes += skb->len;
-
 	if (skb_cow_head(skb, 0) < 0)
 		return NULL;
 
-- 
2.24.1


^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH RFC net-next 17/19] net: dsa: tag_qca: switch to bitfield helpers
  2019-12-30 14:30 [PATCH RFC net-next 00/20] net: dsa: add GRO support Alexander Lobakin
                   ` (15 preceding siblings ...)
  2019-12-30 14:30 ` [PATCH RFC net-next 16/19] net: dsa: tag_qca: fix doubled Tx statistics Alexander Lobakin
@ 2019-12-30 14:30 ` Alexander Lobakin
  2019-12-30 14:30 ` [PATCH RFC net-next 18/19] net: dsa: tag_qca: split out common tag accessors Alexander Lobakin
                   ` (4 subsequent siblings)
  21 siblings, 0 replies; 42+ messages in thread
From: Alexander Lobakin @ 2019-12-30 14:30 UTC (permalink / raw)
  To: David S. Miller
  Cc: Edward Cree, Andrew Lunn, Vivien Didelot, Florian Fainelli,
	Hauke Mehrtens, Sean Wang, Matthias Brugger, Jiri Pirko,
	Eric Dumazet, Paolo Abeni, Jakub Kicinski, Alexander Lobakin,
	Taehee Yoo, Stephen Hemminger, Stanislav Fomichev,
	Daniel Borkmann, Song Liu, Matteo Croce, Jakub Sitnicki,
	Paul Blakey, Yoshiki Komachi, netdev, linux-kernel,
	linux-arm-kernel, linux-mediatek

Make code look cleaner and more readable by using bitfield macros.
Misc: fix several macro identation and phdr type in qca_tag_xmit().

Signed-off-by: Alexander Lobakin <alobakin@dlink.ru>
---
 net/dsa/tag_qca.c | 29 ++++++++++++++---------------
 1 file changed, 14 insertions(+), 15 deletions(-)

diff --git a/net/dsa/tag_qca.c b/net/dsa/tag_qca.c
index e1c4dd04734a..8939abce36d7 100644
--- a/net/dsa/tag_qca.c
+++ b/net/dsa/tag_qca.c
@@ -3,15 +3,15 @@
  * Copyright (c) 2015, The Linux Foundation. All rights reserved.
  */
 
+#include <linux/bitfield.h>
 #include <linux/etherdevice.h>
 
 #include "dsa_priv.h"
 
-#define QCA_HDR_LEN	2
-#define QCA_HDR_VERSION	0x2
+#define QCA_HDR_LEN			2
+#define QCA_HDR_VERSION			0x2
 
-#define QCA_HDR_RECV_VERSION_MASK	GENMASK(15, 14)
-#define QCA_HDR_RECV_VERSION_S		14
+#define QCA_HDR_RECV_VERSION(tag)	FIELD_GET(GENMASK(15, 14), tag)
 #define QCA_HDR_RECV_PRIORITY_MASK	GENMASK(13, 11)
 #define QCA_HDR_RECV_PRIORITY_S		11
 #define QCA_HDR_RECV_TYPE_MASK		GENMASK(10, 6)
@@ -19,19 +19,20 @@
 #define QCA_HDR_RECV_FRAME_IS_TAGGED	BIT(3)
 #define QCA_HDR_RECV_SOURCE_PORT_MASK	GENMASK(2, 0)
 
-#define QCA_HDR_XMIT_VERSION_MASK	GENMASK(15, 14)
-#define QCA_HDR_XMIT_VERSION_S		14
+#define QCA_HDR_XMIT_VERSION		FIELD_PREP(GENMASK(15, 14), \
+						   QCA_HDR_VERSION)
 #define QCA_HDR_XMIT_PRIORITY_MASK	GENMASK(13, 11)
 #define QCA_HDR_XMIT_PRIORITY_S		11
 #define QCA_HDR_XMIT_CONTROL_MASK	GENMASK(10, 8)
 #define QCA_HDR_XMIT_CONTROL_S		8
 #define QCA_HDR_XMIT_FROM_CPU		BIT(7)
-#define QCA_HDR_XMIT_DP_BIT_MASK	GENMASK(6, 0)
+#define QCA_HDR_XMIT_DP(port)		FIELD_PREP(GENMASK(6, 0), BIT(port))
 
 static struct sk_buff *qca_tag_xmit(struct sk_buff *skb, struct net_device *dev)
 {
-	struct dsa_port *dp = dsa_slave_to_port(dev);
-	u16 *phdr, hdr;
+	const struct dsa_port *dp = dsa_slave_to_port(dev);
+	__be16 *phdr;
+	u16 hdr;
 
 	if (skb_cow_head(skb, 0) < 0)
 		return NULL;
@@ -39,11 +40,11 @@ static struct sk_buff *qca_tag_xmit(struct sk_buff *skb, struct net_device *dev)
 	skb_push(skb, QCA_HDR_LEN);
 
 	memmove(skb->data, skb->data + QCA_HDR_LEN, 2 * ETH_ALEN);
-	phdr = (u16 *)(skb->data + 2 * ETH_ALEN);
+	phdr = (__be16 *)(skb->data + 2 * ETH_ALEN);
 
 	/* Set the version field, and set destination port information */
-	hdr = QCA_HDR_VERSION << QCA_HDR_XMIT_VERSION_S |
-		QCA_HDR_XMIT_FROM_CPU | BIT(dp->index);
+	hdr = QCA_HDR_XMIT_VERSION | QCA_HDR_XMIT_FROM_CPU |
+	      QCA_HDR_XMIT_DP(dp->index);
 
 	*phdr = htons(hdr);
 
@@ -53,7 +54,6 @@ static struct sk_buff *qca_tag_xmit(struct sk_buff *skb, struct net_device *dev)
 static struct sk_buff *qca_tag_rcv(struct sk_buff *skb, struct net_device *dev,
 				   struct packet_type *pt)
 {
-	u8 ver;
 	int port;
 	__be16 *phdr, hdr;
 
@@ -68,8 +68,7 @@ static struct sk_buff *qca_tag_rcv(struct sk_buff *skb, struct net_device *dev,
 	hdr = ntohs(*phdr);
 
 	/* Make sure the version is correct */
-	ver = (hdr & QCA_HDR_RECV_VERSION_MASK) >> QCA_HDR_RECV_VERSION_S;
-	if (unlikely(ver != QCA_HDR_VERSION))
+	if (unlikely(QCA_HDR_RECV_VERSION(hdr) != QCA_HDR_VERSION))
 		return NULL;
 
 	/* Remove QCA tag and recalculate checksum */
-- 
2.24.1


^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH RFC net-next 18/19] net: dsa: tag_qca: split out common tag accessors
  2019-12-30 14:30 [PATCH RFC net-next 00/20] net: dsa: add GRO support Alexander Lobakin
                   ` (16 preceding siblings ...)
  2019-12-30 14:30 ` [PATCH RFC net-next 17/19] net: dsa: tag_qca: switch to bitfield helpers Alexander Lobakin
@ 2019-12-30 14:30 ` Alexander Lobakin
  2019-12-30 14:30 ` [PATCH RFC net-next 19/19] net: dsa: tag_qca: add GRO callbacks Alexander Lobakin
                   ` (3 subsequent siblings)
  21 siblings, 0 replies; 42+ messages in thread
From: Alexander Lobakin @ 2019-12-30 14:30 UTC (permalink / raw)
  To: David S. Miller
  Cc: Edward Cree, Andrew Lunn, Vivien Didelot, Florian Fainelli,
	Hauke Mehrtens, Sean Wang, Matthias Brugger, Jiri Pirko,
	Eric Dumazet, Paolo Abeni, Jakub Kicinski, Alexander Lobakin,
	Taehee Yoo, Stephen Hemminger, Stanislav Fomichev,
	Daniel Borkmann, Song Liu, Matteo Croce, Jakub Sitnicki,
	Paul Blakey, Yoshiki Komachi, netdev, linux-kernel,
	linux-arm-kernel, linux-mediatek

...to make them available for the upcoming GRO callbacks.

Signed-off-by: Alexander Lobakin <alobakin@dlink.ru>
---
 net/dsa/tag_qca.c | 45 +++++++++++++++++++++++++++++----------------
 1 file changed, 29 insertions(+), 16 deletions(-)

diff --git a/net/dsa/tag_qca.c b/net/dsa/tag_qca.c
index 8939abce36d7..bee2788e034d 100644
--- a/net/dsa/tag_qca.c
+++ b/net/dsa/tag_qca.c
@@ -28,6 +28,27 @@
 #define QCA_HDR_XMIT_FROM_CPU		BIT(7)
 #define QCA_HDR_XMIT_DP(port)		FIELD_PREP(GENMASK(6, 0), BIT(port))
 
+static inline bool qca_tag_sanity_check(const u8 *data)
+{
+	/* The QCA header is added by the switch between src addr and Ethertype
+	 * At this point, skb->data points to ethertype so header should be
+	 * right before
+	 */
+	u16 hdr = ntohs(*(__be16 *)(data - 2));
+
+	return QCA_HDR_RECV_VERSION(hdr) == QCA_HDR_VERSION;
+}
+
+static inline int qca_tag_source_port(const u8 *data)
+{
+	return *(data - 1) & QCA_HDR_RECV_SOURCE_PORT_MASK;
+}
+
+static inline __be16 qca_tag_encap_proto(const u8 *data)
+{
+	return *(__be16 *)data;
+}
+
 static struct sk_buff *qca_tag_xmit(struct sk_buff *skb, struct net_device *dev)
 {
 	const struct dsa_port *dp = dsa_slave_to_port(dev);
@@ -55,34 +76,26 @@ static struct sk_buff *qca_tag_rcv(struct sk_buff *skb, struct net_device *dev,
 				   struct packet_type *pt)
 {
 	int port;
-	__be16 *phdr, hdr;
 
 	if (unlikely(!pskb_may_pull(skb, QCA_HDR_LEN)))
 		return NULL;
 
-	/* The QCA header is added by the switch between src addr and Ethertype
-	 * At this point, skb->data points to ethertype so header should be
-	 * right before
-	 */
-	phdr = (__be16 *)(skb->data - 2);
-	hdr = ntohs(*phdr);
-
 	/* Make sure the version is correct */
-	if (unlikely(QCA_HDR_RECV_VERSION(hdr) != QCA_HDR_VERSION))
+	if (unlikely(!qca_tag_sanity_check(skb->data)))
 		return NULL;
 
-	/* Remove QCA tag and recalculate checksum */
-	skb_pull_rcsum(skb, QCA_HDR_LEN);
-	memmove(skb->data - ETH_HLEN, skb->data - ETH_HLEN - QCA_HDR_LEN,
-		ETH_HLEN - QCA_HDR_LEN);
-
 	/* Get source port information */
-	port = (hdr & QCA_HDR_RECV_SOURCE_PORT_MASK);
+	port = qca_tag_source_port(skb->data);
 
 	skb->dev = dsa_master_find_slave(dev, 0, port);
 	if (!skb->dev)
 		return NULL;
 
+	/* Remove QCA tag and recalculate checksum */
+	skb_pull_rcsum(skb, QCA_HDR_LEN);
+	memmove(skb->data - ETH_HLEN, skb->data - ETH_HLEN - QCA_HDR_LEN,
+		ETH_HLEN - QCA_HDR_LEN);
+
 	return skb;
 }
 
@@ -90,7 +103,7 @@ static void qca_tag_flow_dissect(const struct sk_buff *skb, __be16 *proto,
 				 int *offset)
 {
 	*offset = QCA_HDR_LEN;
-	*proto = ((__be16 *)skb->data)[0];
+	*proto = qca_tag_encap_proto(skb->data);
 }
 
 static const struct dsa_device_ops qca_netdev_ops = {
-- 
2.24.1


^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH RFC net-next 19/19] net: dsa: tag_qca: add GRO callbacks
  2019-12-30 14:30 [PATCH RFC net-next 00/20] net: dsa: add GRO support Alexander Lobakin
                   ` (17 preceding siblings ...)
  2019-12-30 14:30 ` [PATCH RFC net-next 18/19] net: dsa: tag_qca: split out common tag accessors Alexander Lobakin
@ 2019-12-30 14:30 ` Alexander Lobakin
  2019-12-30 14:30 ` [PATCH RFC net-next 20/20] net: core: add (unlikely) DSA support in napi_gro_frags() Alexander Lobakin
                   ` (2 subsequent siblings)
  21 siblings, 0 replies; 42+ messages in thread
From: Alexander Lobakin @ 2019-12-30 14:30 UTC (permalink / raw)
  To: David S. Miller
  Cc: Edward Cree, Andrew Lunn, Vivien Didelot, Florian Fainelli,
	Hauke Mehrtens, Sean Wang, Matthias Brugger, Jiri Pirko,
	Eric Dumazet, Paolo Abeni, Jakub Kicinski, Alexander Lobakin,
	Taehee Yoo, Stephen Hemminger, Stanislav Fomichev,
	Daniel Borkmann, Song Liu, Matteo Croce, Jakub Sitnicki,
	Paul Blakey, Yoshiki Komachi, netdev, linux-kernel,
	linux-arm-kernel, linux-mediatek

...so that frames tagged with this CPU tag type can be correctly
processed by the GRO layer.
Misc: fix qca_netdev_ops structure identation.

Signed-off-by: Alexander Lobakin <alobakin@dlink.ru>
---
 net/dsa/tag_qca.c | 88 +++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 82 insertions(+), 6 deletions(-)

diff --git a/net/dsa/tag_qca.c b/net/dsa/tag_qca.c
index bee2788e034d..d0cb2299cbe9 100644
--- a/net/dsa/tag_qca.c
+++ b/net/dsa/tag_qca.c
@@ -106,13 +106,89 @@ static void qca_tag_flow_dissect(const struct sk_buff *skb, __be16 *proto,
 	*proto = qca_tag_encap_proto(skb->data);
 }
 
+static struct sk_buff *qca_tag_gro_receive(struct list_head *head,
+					   struct sk_buff *skb)
+{
+	const struct packet_offload *ptype;
+	struct sk_buff *p, *pp = NULL;
+	u32 data_off, data_end;
+	const u8 *data;
+	int flush = 1;
+
+	data_off = skb_gro_offset(skb);
+	data_end = data_off + QCA_HDR_LEN;
+
+	data = skb_gro_header_fast(skb, data_off);
+	if (skb_gro_header_hard(skb, data_end)) {
+		data = skb_gro_header_slow(skb, data_end, data_off);
+		if (unlikely(!data))
+			goto out;
+	}
+
+	/* Data that is to the left to the current position is already
+	 * pulled to the head
+	 */
+	if (unlikely(!qca_tag_sanity_check(skb->data + data_off)))
+		goto out;
+
+	rcu_read_lock();
+
+	ptype = gro_find_receive_by_type(qca_tag_encap_proto(data));
+	if (!ptype)
+		goto out_unlock;
+
+	flush = 0;
+
+	list_for_each_entry(p, head, list) {
+		if (!NAPI_GRO_CB(p)->same_flow)
+			continue;
+
+		if (qca_tag_source_port(skb->data + data_off) ^
+		    qca_tag_source_port(p->data + data_off))
+			NAPI_GRO_CB(p)->same_flow = 0;
+	}
+
+	skb_gro_pull(skb, QCA_HDR_LEN);
+	skb_gro_postpull_rcsum(skb, data, QCA_HDR_LEN);
+
+	pp = call_gro_receive(ptype->callbacks.gro_receive, head, skb);
+
+out_unlock:
+	rcu_read_unlock();
+out:
+	skb_gro_flush_final(skb, pp, flush);
+
+	return pp;
+}
+
+static int qca_tag_gro_complete(struct sk_buff *skb, int nhoff)
+{
+	const struct packet_offload *ptype;
+	int err = -ENOENT;
+	__be16 proto;
+
+	proto = qca_tag_encap_proto(skb->data + nhoff);
+
+	rcu_read_lock();
+
+	ptype = gro_find_complete_by_type(proto);
+	if (ptype)
+		err = ptype->callbacks.gro_complete(skb, nhoff + QCA_HDR_LEN);
+
+	rcu_read_unlock();
+
+	return err;
+}
+
 static const struct dsa_device_ops qca_netdev_ops = {
-	.name	= "qca",
-	.proto	= DSA_TAG_PROTO_QCA,
-	.xmit	= qca_tag_xmit,
-	.rcv	= qca_tag_rcv,
-	.flow_dissect = qca_tag_flow_dissect,
-	.overhead = QCA_HDR_LEN,
+	.name		= "qca",
+	.proto		= DSA_TAG_PROTO_QCA,
+	.xmit		= qca_tag_xmit,
+	.rcv		= qca_tag_rcv,
+	.flow_dissect	= qca_tag_flow_dissect,
+	.gro_receive	= qca_tag_gro_receive,
+	.gro_complete	= qca_tag_gro_complete,
+	.overhead	= QCA_HDR_LEN,
 };
 
 MODULE_LICENSE("GPL");
-- 
2.24.1


^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH RFC net-next 20/20] net: core: add (unlikely) DSA support in napi_gro_frags()
  2019-12-30 14:30 [PATCH RFC net-next 00/20] net: dsa: add GRO support Alexander Lobakin
                   ` (18 preceding siblings ...)
  2019-12-30 14:30 ` [PATCH RFC net-next 19/19] net: dsa: tag_qca: add GRO callbacks Alexander Lobakin
@ 2019-12-30 14:30 ` Alexander Lobakin
  2019-12-30 17:12 ` [PATCH RFC net-next 00/20] net: dsa: add GRO support Andrew Lunn
  2019-12-31 15:32 ` Vladimir Oltean
  21 siblings, 0 replies; 42+ messages in thread
From: Alexander Lobakin @ 2019-12-30 14:30 UTC (permalink / raw)
  To: David S. Miller
  Cc: Edward Cree, Andrew Lunn, Vivien Didelot, Florian Fainelli,
	Hauke Mehrtens, Sean Wang, Matthias Brugger, Jiri Pirko,
	Eric Dumazet, Paolo Abeni, Jakub Kicinski, Alexander Lobakin,
	Taehee Yoo, Stephen Hemminger, Stanislav Fomichev,
	Daniel Borkmann, Song Liu, Matteo Croce, Jakub Sitnicki,
	Paul Blakey, Yoshiki Komachi, netdev, linux-kernel,
	linux-arm-kernel, linux-mediatek

Make napi_gro_frags() available for DSA-enabled device drivers by adding
the same condition for them as the one in eth_type_trans().

Signed-off-by: Alexander Lobakin <alobakin@dlink.ru>
---
 net/core/dev.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index f1b8afcfbc0f..923b930a4506 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -142,6 +142,7 @@
 #include <linux/net_namespace.h>
 #include <linux/indirect_call_wrapper.h>
 #include <net/devlink.h>
+#include <net/dsa.h>
 
 #include "net-sysfs.h"
 
@@ -5951,6 +5952,7 @@ static gro_result_t napi_frags_finish(struct napi_struct *napi,
  */
 static struct sk_buff *napi_frags_skb(struct napi_struct *napi)
 {
+	struct net_device *dev = napi->dev;
 	struct sk_buff *skb = napi->skb;
 	const struct ethhdr *eth;
 	unsigned int hlen = sizeof(*eth);
@@ -5964,7 +5966,7 @@ static struct sk_buff *napi_frags_skb(struct napi_struct *napi)
 		eth = skb_gro_header_slow(skb, hlen, 0);
 		if (unlikely(!eth)) {
 			net_warn_ratelimited("%s: dropping impossible skb from %s\n",
-					     __func__, napi->dev->name);
+					     __func__, dev->name);
 			napi_reuse_skb(napi, skb);
 			return NULL;
 		}
@@ -5978,10 +5980,13 @@ static struct sk_buff *napi_frags_skb(struct napi_struct *napi)
 
 	/*
 	 * This works because the only protocols we care about don't require
-	 * special handling.
+	 * special handling... except for DSA.
 	 * We'll fix it up properly in napi_frags_finish()
 	 */
-	skb->protocol = eth->h_proto;
+	if (unlikely(netdev_uses_dsa(dev)) && dsa_can_decode(skb, dev))
+		skb->protocol = htons(ETH_P_XDSA);
+	else
+		skb->protocol = eth->h_proto;
 
 	return skb;
 }
-- 
2.24.1


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH RFC net-next 00/20] net: dsa: add GRO support
  2019-12-30 14:30 [PATCH RFC net-next 00/20] net: dsa: add GRO support Alexander Lobakin
                   ` (19 preceding siblings ...)
  2019-12-30 14:30 ` [PATCH RFC net-next 20/20] net: core: add (unlikely) DSA support in napi_gro_frags() Alexander Lobakin
@ 2019-12-30 17:12 ` Andrew Lunn
  2020-01-13  9:25   ` Alexander Lobakin
  2019-12-31 15:32 ` Vladimir Oltean
  21 siblings, 1 reply; 42+ messages in thread
From: Andrew Lunn @ 2019-12-30 17:12 UTC (permalink / raw)
  To: Alexander Lobakin
  Cc: David S. Miller, Edward Cree, Vivien Didelot, Florian Fainelli,
	Hauke Mehrtens, Sean Wang, Matthias Brugger, Jiri Pirko,
	Eric Dumazet, Paolo Abeni, Jakub Kicinski, Taehee Yoo,
	Stephen Hemminger, Stanislav Fomichev, Daniel Borkmann, Song Liu,
	Matteo Croce, Jakub Sitnicki, Paul Blakey, Yoshiki Komachi,
	netdev, linux-kernel, linux-arm-kernel, linux-mediatek

> I mark this as RFC, and there are the key questions for maintainers,
> developers, users etc.:
> - Do we need GRO support for DSA at all?

> - Does this series bring any performance improvements on the
>   affected systems?

Hi Alexander

I think these are the two most important questions. Did you do any
performance testing for the hardware you have?

I personally don't have any of the switches you have made
modifications to, so i cannot test these patches. I might be able to
add GRO to DSA and EDSA, where i can do some performance testing.

    Andrew

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH RFC net-next 04/19] net: dsa: tag_ar9331: split out common tag accessors
  2019-12-30 14:30 ` [PATCH RFC net-next 04/19] net: dsa: tag_ar9331: split out common tag accessors Alexander Lobakin
@ 2019-12-30 17:18   ` Andrew Lunn
  0 siblings, 0 replies; 42+ messages in thread
From: Andrew Lunn @ 2019-12-30 17:18 UTC (permalink / raw)
  To: Alexander Lobakin
  Cc: David S. Miller, Edward Cree, Vivien Didelot, Florian Fainelli,
	Hauke Mehrtens, Sean Wang, Matthias Brugger, Jiri Pirko,
	Eric Dumazet, Paolo Abeni, Jakub Kicinski, Taehee Yoo,
	Stephen Hemminger, Stanislav Fomichev, Daniel Borkmann, Song Liu,
	Matteo Croce, Jakub Sitnicki, Paul Blakey, Yoshiki Komachi,
	netdev, linux-kernel, linux-arm-kernel, linux-mediatek

On Mon, Dec 30, 2019 at 05:30:12PM +0300, Alexander Lobakin wrote:
> They will be reused in upcoming GRO callbacks.
> (Almost) no functional changes except less informative error string.
> 
> Signed-off-by: Alexander Lobakin <alobakin@dlink.ru>
> ---
>  net/dsa/tag_ar9331.c | 46 +++++++++++++++++++++++++++-----------------
>  1 file changed, 28 insertions(+), 18 deletions(-)
> 
> diff --git a/net/dsa/tag_ar9331.c b/net/dsa/tag_ar9331.c
> index 399ca21ec03b..c22c1b515e02 100644
> --- a/net/dsa/tag_ar9331.c
> +++ b/net/dsa/tag_ar9331.c
> @@ -24,6 +24,25 @@
>  #define AR9331_HDR_RESERVED_MASK	GENMASK(5, 4)
>  #define AR9331_HDR_PORT_NUM_MASK	GENMASK(3, 0)
>  
> +static inline bool ar9331_tag_sanity_check(const u8 *data)

Hi Alexander

Please don't use inline in C files. Leave it to the compiler to
decide.

	Thanks
		Andrew

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH RFC net-next 06/19] net: dsa: tag_gswip: fix typo in tag name
  2019-12-30 14:30 ` [PATCH RFC net-next 06/19] net: dsa: tag_gswip: fix typo in tag name Alexander Lobakin
@ 2019-12-30 17:22   ` Andrew Lunn
  2020-01-14 21:57     ` Florian Fainelli
  0 siblings, 1 reply; 42+ messages in thread
From: Andrew Lunn @ 2019-12-30 17:22 UTC (permalink / raw)
  To: Alexander Lobakin
  Cc: David S. Miller, Edward Cree, Vivien Didelot, Florian Fainelli,
	Hauke Mehrtens, Sean Wang, Matthias Brugger, Jiri Pirko,
	Eric Dumazet, Paolo Abeni, Jakub Kicinski, Taehee Yoo,
	Stephen Hemminger, Stanislav Fomichev, Daniel Borkmann, Song Liu,
	Matteo Croce, Jakub Sitnicki, Paul Blakey, Yoshiki Komachi,
	netdev, linux-kernel, linux-arm-kernel, linux-mediatek

On Mon, Dec 30, 2019 at 05:30:14PM +0300, Alexander Lobakin wrote:
> "gwsip" -> "gswip".
> 
> Signed-off-by: Alexander Lobakin <alobakin@dlink.ru>

Reviewed-by: Andrew Lunn <andrew@lunn.ch>

    Andrew

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH RFC net-next 16/19] net: dsa: tag_qca: fix doubled Tx statistics
  2019-12-30 14:30 ` [PATCH RFC net-next 16/19] net: dsa: tag_qca: fix doubled Tx statistics Alexander Lobakin
@ 2019-12-30 17:23   ` Andrew Lunn
  2020-01-14 21:57     ` Florian Fainelli
  0 siblings, 1 reply; 42+ messages in thread
From: Andrew Lunn @ 2019-12-30 17:23 UTC (permalink / raw)
  To: Alexander Lobakin
  Cc: David S. Miller, Edward Cree, Vivien Didelot, Florian Fainelli,
	Hauke Mehrtens, Sean Wang, Matthias Brugger, Jiri Pirko,
	Eric Dumazet, Paolo Abeni, Jakub Kicinski, Taehee Yoo,
	Stephen Hemminger, Stanislav Fomichev, Daniel Borkmann, Song Liu,
	Matteo Croce, Jakub Sitnicki, Paul Blakey, Yoshiki Komachi,
	netdev, linux-kernel, linux-arm-kernel, linux-mediatek

On Mon, Dec 30, 2019 at 05:30:24PM +0300, Alexander Lobakin wrote:
> DSA core updates Tx stats for slaves in dsa_slave_xmit(), no need to do
> it manually in .xmit() tagger callback.
> 
> Signed-off-by: Alexander Lobakin <alobakin@dlink.ru>

Reviewed-by: Andrew Lunn <andrew@lunn.ch>

    Andrew


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH RFC net-next 01/19] net: dsa: make .flow_dissect() callback returning void
  2019-12-30 14:30 ` [PATCH RFC net-next 01/19] net: dsa: make .flow_dissect() callback returning void Alexander Lobakin
@ 2019-12-30 18:11   ` Florian Fainelli
  0 siblings, 0 replies; 42+ messages in thread
From: Florian Fainelli @ 2019-12-30 18:11 UTC (permalink / raw)
  To: Alexander Lobakin, David S. Miller
  Cc: Edward Cree, Andrew Lunn, Vivien Didelot, Hauke Mehrtens,
	Sean Wang, Matthias Brugger, Jiri Pirko, Eric Dumazet,
	Paolo Abeni, Jakub Kicinski, Taehee Yoo, Stephen Hemminger,
	Stanislav Fomichev, Daniel Borkmann, Song Liu, Matteo Croce,
	Jakub Sitnicki, Paul Blakey, Yoshiki Komachi, netdev,
	linux-kernel, linux-arm-kernel, linux-mediatek

On 12/30/19 6:30 AM, Alexander Lobakin wrote:
> There are no tag protocols which return non-zero values from
> flow_dissect() callback. Remove it to simplify code and save some
> object size.
> If a particular tagger can't calculate offset and proto for some
> reason, it can simply leave the original values untouched.
> 
> Signed-off-by: Alexander Lobakin <alobakin@dlink.ru>

Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
-- 
Florian

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH RFC net-next 05/19] net: dsa: tag_ar9331: add GRO callbacks
  2019-12-30 14:30 ` [PATCH RFC net-next 05/19] net: dsa: tag_ar9331: add GRO callbacks Alexander Lobakin
@ 2019-12-30 18:20   ` Florian Fainelli
  2019-12-30 20:36     ` Andrew Lunn
  2020-01-13  9:21     ` Alexander Lobakin
  0 siblings, 2 replies; 42+ messages in thread
From: Florian Fainelli @ 2019-12-30 18:20 UTC (permalink / raw)
  To: Alexander Lobakin, David S. Miller
  Cc: Edward Cree, Andrew Lunn, Vivien Didelot, Hauke Mehrtens,
	Sean Wang, Matthias Brugger, Jiri Pirko, Eric Dumazet,
	Paolo Abeni, Jakub Kicinski, Taehee Yoo, Stephen Hemminger,
	Stanislav Fomichev, Daniel Borkmann, Song Liu, Matteo Croce,
	Jakub Sitnicki, Paul Blakey, Yoshiki Komachi, netdev,
	linux-kernel, linux-arm-kernel, linux-mediatek

On 12/30/19 6:30 AM, Alexander Lobakin wrote:
> Add GRO callbacks to the AR9331 tagger so GRO layer can now process
> such frames.
> 
> Signed-off-by: Alexander Lobakin <alobakin@dlink.ru>

This is a good example and we should probably build a tagger abstraction
that is much simpler to fill in callbacks for (although indirect
function calls may end-up killing performance with retpoline and
friends), but let's consider this idea.

> ---
>  net/dsa/tag_ar9331.c | 77 ++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 77 insertions(+)
> 
> diff --git a/net/dsa/tag_ar9331.c b/net/dsa/tag_ar9331.c
> index c22c1b515e02..99cc7fd92d8e 100644
> --- a/net/dsa/tag_ar9331.c
> +++ b/net/dsa/tag_ar9331.c
> @@ -100,12 +100,89 @@ static void ar9331_tag_flow_dissect(const struct sk_buff *skb, __be16 *proto,
>  	*proto = ar9331_tag_encap_proto(skb->data);
>  }
>  
> +static struct sk_buff *ar9331_tag_gro_receive(struct list_head *head,
> +					      struct sk_buff *skb)
> +{
> +	const struct packet_offload *ptype;
> +	struct sk_buff *p, *pp = NULL;
> +	u32 data_off, data_end;
> +	const u8 *data;
> +	int flush = 1;
> +
> +	data_off = skb_gro_offset(skb);
> +	data_end = data_off + AR9331_HDR_LEN;

AR9331_HDR_LEN is a parameter here which is incidentally
dsa_device_ops::overhead.

> +
> +	data = skb_gro_header_fast(skb, data_off);
> +	if (skb_gro_header_hard(skb, data_end)) {
> +		data = skb_gro_header_slow(skb, data_end, data_off);
> +		if (unlikely(!data))
> +			goto out;
> +	}
> +
> +	/* Data that is to the left from the current position is already
> +	 * pulled to the head
> +	 */
> +	if (unlikely(!ar9331_tag_sanity_check(skb->data + data_off)))
> +		goto out;

This is applicable to all taggers, they need to verify the sanity of the
header they are being handed.

> +
> +	rcu_read_lock();
> +
> +	ptype = gro_find_receive_by_type(ar9331_tag_encap_proto(data));

If there is no encapsulation a tagger can return the frame's protocol
directly, so similarly the tagger can be interrogated for returning that.

> +	if (!ptype)
> +		goto out_unlock;
> +
> +	flush = 0;
> +
> +	list_for_each_entry(p, head, list) {
> +		if (!NAPI_GRO_CB(p)->same_flow)
> +			continue;
> +
> +		if (ar9331_tag_source_port(skb->data + data_off) ^
> +		    ar9331_tag_source_port(p->data + data_off))

Similarly here, the tagger could provide a function whose job is to
return the port number from within its own tag.

So with that being said, what do you think about building a tagger
abstraction which is comprised of:

- header length which is dsa_device_ops::overhead
- validate_tag()
- get_tag_encap_proto()
- get_port_number()

and the rest is just wrapping the general GRO list manipulation?

Also, I am wondering should we somehow expose the DSA master
net_device's napi_struct such that we could have the DSA slave
net_devices call napi_gro_receive() themselves directly such that they
could also perform additional GRO on top of Ethernet frames?
-- 
Florian

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH RFC net-next 03/19] net: dsa: tag_ar9331: add .flow_dissect() callback
  2019-12-30 14:30 ` [PATCH RFC net-next 03/19] net: dsa: tag_ar9331: add .flow_dissect() callback Alexander Lobakin
@ 2019-12-30 18:22   ` Florian Fainelli
  0 siblings, 0 replies; 42+ messages in thread
From: Florian Fainelli @ 2019-12-30 18:22 UTC (permalink / raw)
  To: Alexander Lobakin, David S. Miller
  Cc: Edward Cree, Andrew Lunn, Vivien Didelot, Hauke Mehrtens,
	Sean Wang, Matthias Brugger, Jiri Pirko, Eric Dumazet,
	Paolo Abeni, Jakub Kicinski, Taehee Yoo, Stephen Hemminger,
	Stanislav Fomichev, Daniel Borkmann, Song Liu, Matteo Croce,
	Jakub Sitnicki, Paul Blakey, Yoshiki Komachi, netdev,
	linux-kernel, linux-arm-kernel, linux-mediatek

On 12/30/19 6:30 AM, Alexander Lobakin wrote:
> ...to make RPS work correctly if user would like to configure it.
> Misc: fix identation of ar9331_netdev_ops structure.
> 
> Signed-off-by: Alexander Lobakin <alobakin@dlink.ru>
> ---
>  net/dsa/tag_ar9331.c | 18 +++++++++++++-----
>  1 file changed, 13 insertions(+), 5 deletions(-)
> 
> diff --git a/net/dsa/tag_ar9331.c b/net/dsa/tag_ar9331.c
> index 466ffa92a474..399ca21ec03b 100644
> --- a/net/dsa/tag_ar9331.c
> +++ b/net/dsa/tag_ar9331.c
> @@ -83,12 +83,20 @@ static struct sk_buff *ar9331_tag_rcv(struct sk_buff *skb,
>  	return skb;
>  }
>  
> +static void ar9331_tag_flow_dissect(const struct sk_buff *skb, __be16 *proto,
> +				    int *offset)
> +{
> +	*offset = AR9331_HDR_LEN;
> +	*proto = *(__be16 *)skb->data;

If we provided an offset from the beginning of the frame, then there is
no need to call .flow_dissect() at all and we can just return the
dsa_device_ops::overhead and offset_start (whatever is the name we pick)
and avoid a function call.

> +}
> +
>  static const struct dsa_device_ops ar9331_netdev_ops = {
> -	.name	= "ar9331",
> -	.proto	= DSA_TAG_PROTO_AR9331,
> -	.xmit	= ar9331_tag_xmit,
> -	.rcv	= ar9331_tag_rcv,
> -	.overhead = AR9331_HDR_LEN,
> +	.name		= "ar9331",
> +	.proto		= DSA_TAG_PROTO_AR9331,
> +	.xmit		= ar9331_tag_xmit,
> +	.rcv		= ar9331_tag_rcv,
> +	.flow_dissect	= ar9331_tag_flow_dissect,
> +	.overhead	= AR9331_HDR_LEN,
>  };
>  
>  MODULE_LICENSE("GPL v2");
> 


-- 
Florian

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH RFC net-next 05/19] net: dsa: tag_ar9331: add GRO callbacks
  2019-12-30 18:20   ` Florian Fainelli
@ 2019-12-30 20:36     ` Andrew Lunn
  2020-01-13  9:21     ` Alexander Lobakin
  1 sibling, 0 replies; 42+ messages in thread
From: Andrew Lunn @ 2019-12-30 20:36 UTC (permalink / raw)
  To: Florian Fainelli
  Cc: Alexander Lobakin, David S. Miller, Edward Cree, Vivien Didelot,
	Hauke Mehrtens, Sean Wang, Matthias Brugger, Jiri Pirko,
	Eric Dumazet, Paolo Abeni, Jakub Kicinski, Taehee Yoo,
	Stephen Hemminger, Stanislav Fomichev, Daniel Borkmann, Song Liu,
	Matteo Croce, Jakub Sitnicki, Paul Blakey, Yoshiki Komachi,
	netdev, linux-kernel, linux-arm-kernel, linux-mediatek

On Mon, Dec 30, 2019 at 10:20:50AM -0800, Florian Fainelli wrote:
> On 12/30/19 6:30 AM, Alexander Lobakin wrote:
> > Add GRO callbacks to the AR9331 tagger so GRO layer can now process
> > such frames.
> > 
> > Signed-off-by: Alexander Lobakin <alobakin@dlink.ru>
> 
> This is a good example and we should probably build a tagger abstraction
> that is much simpler to fill in callbacks for (although indirect
> function calls may end-up killing performance with retpoline and
> friends), but let's consider this idea.

Hi Florian

We really do need some numbers here. Does GRO really help? On an ARM
or MIPS platform, i don't think retpoline is an issue? But x86 is, and
we do have a few x86 boards with switches.

Maybe we can do some macro magic instead of function pointers, if we
can keep it all within one object file?

    Andrew

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH RFC net-next 00/20] net: dsa: add GRO support
  2019-12-30 14:30 [PATCH RFC net-next 00/20] net: dsa: add GRO support Alexander Lobakin
                   ` (20 preceding siblings ...)
  2019-12-30 17:12 ` [PATCH RFC net-next 00/20] net: dsa: add GRO support Andrew Lunn
@ 2019-12-31 15:32 ` Vladimir Oltean
  2020-01-13  9:30   ` Alexander Lobakin
  21 siblings, 1 reply; 42+ messages in thread
From: Vladimir Oltean @ 2019-12-31 15:32 UTC (permalink / raw)
  To: Alexander Lobakin
  Cc: David S. Miller, Edward Cree, Andrew Lunn, Vivien Didelot,
	Florian Fainelli, Hauke Mehrtens, Sean Wang, Matthias Brugger,
	Jiri Pirko, Eric Dumazet, Paolo Abeni, Jakub Kicinski,
	Taehee Yoo, Stephen Hemminger, Stanislav Fomichev,
	Daniel Borkmann, Song Liu, Matteo Croce, Jakub Sitnicki,
	Paul Blakey, Yoshiki Komachi, netdev, lkml, linux-arm-kernel,
	linux-mediatek

Hi Alexander,

On Mon, 30 Dec 2019 at 16:31, Alexander Lobakin <alobakin@dlink.ru> wrote:
>
> I mark this as RFC, and there are the key questions for maintainers,
> developers, users etc.:
> - Do we need GRO support for DSA at all?
> - Which tagger protocols really need it and which don't?
> - Does this series bring any performance improvements on the
>   affected systems?

If these are these questions for maintainers, developers, users etc,
then what has determined you to make these changes?

Thanks,
-Vladimir

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH RFC net-next 05/19] net: dsa: tag_ar9331: add GRO callbacks
  2019-12-30 18:20   ` Florian Fainelli
  2019-12-30 20:36     ` Andrew Lunn
@ 2020-01-13  9:21     ` Alexander Lobakin
  2020-01-13  9:42       ` Vladimir Oltean
  1 sibling, 1 reply; 42+ messages in thread
From: Alexander Lobakin @ 2020-01-13  9:21 UTC (permalink / raw)
  To: Florian Fainelli
  Cc: David S. Miller, Edward Cree, Andrew Lunn, Vivien Didelot,
	Hauke Mehrtens, Sean Wang, Matthias Brugger, Jiri Pirko,
	Eric Dumazet, Paolo Abeni, Jakub Kicinski, Taehee Yoo,
	Stephen Hemminger, Stanislav Fomichev, Daniel Borkmann, Song Liu,
	Matteo Croce, Jakub Sitnicki, Paul Blakey, Yoshiki Komachi,
	netdev, linux-kernel, linux-arm-kernel, linux-mediatek

Florian Fainelli wrote 30.12.2019 21:20:
> On 12/30/19 6:30 AM, Alexander Lobakin wrote:
>> Add GRO callbacks to the AR9331 tagger so GRO layer can now process
>> such frames.
>> 
>> Signed-off-by: Alexander Lobakin <alobakin@dlink.ru>
> 
> This is a good example and we should probably build a tagger 
> abstraction
> that is much simpler to fill in callbacks for (although indirect
> function calls may end-up killing performance with retpoline and
> friends), but let's consider this idea.

Hey al,
Sorry for late replies, was in a big trip.

The performance issue was the main reason why I chose to write full
.gro_receive() for every single tagger instead of providing a bunch
of abstraction callbacks. It really isn't a problem for MIPS, on
which I'm working on this stuff, but can kill any advantages that we
could get from GRO support on e.g. x86.

>> ---
>>  net/dsa/tag_ar9331.c | 77 
>> ++++++++++++++++++++++++++++++++++++++++++++
>>  1 file changed, 77 insertions(+)
>> 
>> diff --git a/net/dsa/tag_ar9331.c b/net/dsa/tag_ar9331.c
>> index c22c1b515e02..99cc7fd92d8e 100644
>> --- a/net/dsa/tag_ar9331.c
>> +++ b/net/dsa/tag_ar9331.c
>> @@ -100,12 +100,89 @@ static void ar9331_tag_flow_dissect(const struct 
>> sk_buff *skb, __be16 *proto,
>>  	*proto = ar9331_tag_encap_proto(skb->data);
>>  }
>> 
>> +static struct sk_buff *ar9331_tag_gro_receive(struct list_head *head,
>> +					      struct sk_buff *skb)
>> +{
>> +	const struct packet_offload *ptype;
>> +	struct sk_buff *p, *pp = NULL;
>> +	u32 data_off, data_end;
>> +	const u8 *data;
>> +	int flush = 1;
>> +
>> +	data_off = skb_gro_offset(skb);
>> +	data_end = data_off + AR9331_HDR_LEN;
> 
> AR9331_HDR_LEN is a parameter here which is incidentally
> dsa_device_ops::overhead.

Or we can split .overhead to .rx_len and .tx_len and use the first
to help GRO layer and flow dissector and the second to determine
total overhead to correct MTU value. Smth like:

mtu = max(tag_ops->rx_len, tag_ops->tx_len);

>> +
>> +	data = skb_gro_header_fast(skb, data_off);
>> +	if (skb_gro_header_hard(skb, data_end)) {
>> +		data = skb_gro_header_slow(skb, data_end, data_off);
>> +		if (unlikely(!data))
>> +			goto out;
>> +	}
>> +
>> +	/* Data that is to the left from the current position is already
>> +	 * pulled to the head
>> +	 */
>> +	if (unlikely(!ar9331_tag_sanity_check(skb->data + data_off)))
>> +		goto out;
> 
> This is applicable to all taggers, they need to verify the sanity of 
> the
> header they are being handed.
> 
>> +
>> +	rcu_read_lock();
>> +
>> +	ptype = gro_find_receive_by_type(ar9331_tag_encap_proto(data));
> 
> If there is no encapsulation a tagger can return the frame's protocol
> directly, so similarly the tagger can be interrogated for returning 
> that.
> 
>> +	if (!ptype)
>> +		goto out_unlock;
>> +
>> +	flush = 0;
>> +
>> +	list_for_each_entry(p, head, list) {
>> +		if (!NAPI_GRO_CB(p)->same_flow)
>> +			continue;
>> +
>> +		if (ar9331_tag_source_port(skb->data + data_off) ^
>> +		    ar9331_tag_source_port(p->data + data_off))
> 
> Similarly here, the tagger could provide a function whose job is to
> return the port number from within its own tag.
> 
> So with that being said, what do you think about building a tagger
> abstraction which is comprised of:
> 
> - header length which is dsa_device_ops::overhead
> - validate_tag()
> - get_tag_encap_proto()
> - get_port_number()
> 
> and the rest is just wrapping the general GRO list manipulation?

get_tag_encap_proto() and get_port_number() would be called more
than once in that case for every single frame. Not sure if it is
a good idea regarding to mentioned retpoline issues.

> Also, I am wondering should we somehow expose the DSA master
> net_device's napi_struct such that we could have the DSA slave
> net_devices call napi_gro_receive() themselves directly such that they
> could also perform additional GRO on top of Ethernet frames?

There's no reason to pass frames to GRO layer more than once.

The most correct way to handle frames is to pass them to networking
stack only after DSA tag extraction and removal. That's kinda how
mac80211 infra works. But this is rather problematic for DSA as it
keeps Ethernet controller drivers and taggers completely independent
from each others.

I also had an idea to use net_device::rx_handler for tag processing
instead of dsa_pack_type. CPU ports can't be bridged anyway, so this
should not be a problem an the first look.

Regards,
ᚷ ᛖ ᚢ ᚦ ᚠ ᚱ

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH RFC net-next 00/20] net: dsa: add GRO support
  2019-12-30 17:12 ` [PATCH RFC net-next 00/20] net: dsa: add GRO support Andrew Lunn
@ 2020-01-13  9:25   ` Alexander Lobakin
  0 siblings, 0 replies; 42+ messages in thread
From: Alexander Lobakin @ 2020-01-13  9:25 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: David S. Miller, Edward Cree, Vivien Didelot, Florian Fainelli,
	Hauke Mehrtens, Sean Wang, Matthias Brugger, Jiri Pirko,
	Eric Dumazet, Paolo Abeni, Jakub Kicinski, Taehee Yoo,
	Stephen Hemminger, Stanislav Fomichev, Daniel Borkmann, Song Liu,
	Matteo Croce, Jakub Sitnicki, Paul Blakey, Yoshiki Komachi,
	netdev, linux-kernel, linux-arm-kernel, linux-mediatek

Andrew Lunn wrote 30.12.2019 20:12:
>> I mark this as RFC, and there are the key questions for maintainers,
>> developers, users etc.:
>> - Do we need GRO support for DSA at all?
> 
>> - Does this series bring any performance improvements on the
>>   affected systems?
> 
> Hi Alexander

Hi,

> I think these are the two most important questions. Did you do any
> performance testing for the hardware you have?

Exactly, this are the top questions. I performed lots of tests on
hardware with which I'm working on and had a pretty good boosts
(I didn't mainlined my drivers yet unfortunately).
But this does not mean that GRO would be that nice for all kind of
devices *at all*. That's why I would like to see more test results
on different systems.

> I personally don't have any of the switches you have made
> modifications to, so i cannot test these patches. I might be able to
> add GRO to DSA and EDSA, where i can do some performance testing.
> 
>     Andrew

Regards,
ᚷ ᛖ ᚢ ᚦ ᚠ ᚱ

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH RFC net-next 00/20] net: dsa: add GRO support
  2019-12-31 15:32 ` Vladimir Oltean
@ 2020-01-13  9:30   ` Alexander Lobakin
  0 siblings, 0 replies; 42+ messages in thread
From: Alexander Lobakin @ 2020-01-13  9:30 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: David S. Miller, Edward Cree, Andrew Lunn, Vivien Didelot,
	Florian Fainelli, Hauke Mehrtens, Sean Wang, Matthias Brugger,
	Jiri Pirko, Eric Dumazet, Paolo Abeni, Jakub Kicinski,
	Taehee Yoo, Stephen Hemminger, Stanislav Fomichev,
	Daniel Borkmann, Song Liu, Matteo Croce, Jakub Sitnicki,
	Paul Blakey, Yoshiki Komachi, netdev, lkml, linux-arm-kernel,
	linux-mediatek

Vladimir Oltean wrote 31.12.2019 18:32:
> Hi Alexander,

Hi,

> On Mon, 30 Dec 2019 at 16:31, Alexander Lobakin <alobakin@dlink.ru> 
> wrote:
>> 
>> I mark this as RFC, and there are the key questions for maintainers,
>> developers, users etc.:
>> - Do we need GRO support for DSA at all?
>> - Which tagger protocols really need it and which don't?
>> - Does this series bring any performance improvements on the
>>   affected systems?
> 
> If these are these questions for maintainers, developers, users etc,
> then what has determined you to make these changes?

The main reason was obviously pretty good results on a particular
hardware on which I developed this (and other) series and a general
opinion that GRO improves overall performance on most systems.
DSA is a special case though.

> Thanks,
> -Vladimir

Regards,
ᚷ ᛖ ᚢ ᚦ ᚠ ᚱ

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH RFC net-next 05/19] net: dsa: tag_ar9331: add GRO callbacks
  2020-01-13  9:21     ` Alexander Lobakin
@ 2020-01-13  9:42       ` Vladimir Oltean
  2020-01-13  9:46         ` Alexander Lobakin
  0 siblings, 1 reply; 42+ messages in thread
From: Vladimir Oltean @ 2020-01-13  9:42 UTC (permalink / raw)
  To: Alexander Lobakin
  Cc: Florian Fainelli, David S. Miller, Edward Cree, Andrew Lunn,
	Vivien Didelot, Hauke Mehrtens, Sean Wang, Matthias Brugger,
	Jiri Pirko, Eric Dumazet, Paolo Abeni, Jakub Kicinski,
	Taehee Yoo, Stephen Hemminger, Stanislav Fomichev,
	Daniel Borkmann, Song Liu, Matteo Croce, Jakub Sitnicki,
	Paul Blakey, Yoshiki Komachi, netdev, lkml,
	moderated list:ARM/Mediatek SoC support,
	moderated list:ARM/Mediatek SoC support

Hi Alexander,

On Mon, 13 Jan 2020 at 11:22, Alexander Lobakin <alobakin@dlink.ru> wrote:
>
> CPU ports can't be bridged anyway
>
> Regards,
> ᚷ ᛖ ᚢ ᚦ ᚠ ᚱ

The fact that CPU ports can't be bridged is already not ideal.
One can have a DSA switch with cascaded switches on each port, so it
acts like N DSA masters (not as DSA links, since the taggers are
incompatible), with each switch forming its own tree. It is desirable
that the ports of the DSA switch on top are bridged, so that
forwarding between cascaded switches does not pass through the CPU.

-Vladimir

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH RFC net-next 05/19] net: dsa: tag_ar9331: add GRO callbacks
  2020-01-13  9:42       ` Vladimir Oltean
@ 2020-01-13  9:46         ` Alexander Lobakin
  2020-01-13 10:28           ` Vladimir Oltean
  0 siblings, 1 reply; 42+ messages in thread
From: Alexander Lobakin @ 2020-01-13  9:46 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: Florian Fainelli, David S. Miller, Edward Cree, Andrew Lunn,
	Vivien Didelot, Hauke Mehrtens, Sean Wang, Matthias Brugger,
	Jiri Pirko, Eric Dumazet, Paolo Abeni, Jakub Kicinski,
	Taehee Yoo, Stephen Hemminger, Stanislav Fomichev,
	Daniel Borkmann, Song Liu, Matteo Croce, Jakub Sitnicki,
	Paul Blakey, Yoshiki Komachi, netdev, lkml,
	moderated list:ARM/Mediatek SoC support,
	moderated list:ARM/Mediatek SoC support

Vladimir Oltean wrote 13.01.2020 12:42:
> Hi Alexander,
> 
> On Mon, 13 Jan 2020 at 11:22, Alexander Lobakin <alobakin@dlink.ru> 
> wrote:
>> 
>> CPU ports can't be bridged anyway
>> 
>> Regards,
>> ᚷ ᛖ ᚢ ᚦ ᚠ ᚱ
> 
> The fact that CPU ports can't be bridged is already not ideal.
> One can have a DSA switch with cascaded switches on each port, so it
> acts like N DSA masters (not as DSA links, since the taggers are
> incompatible), with each switch forming its own tree. It is desirable
> that the ports of the DSA switch on top are bridged, so that
> forwarding between cascaded switches does not pass through the CPU.

Oh, I see. But currently DSA infra forbids the adding DSA masters to
bridges IIRC. Can't name it good or bad decision, but was introduced
to prevent accidental packet flow breaking on DSA setups.

> -Vladimir

Regards,
ᚷ ᛖ ᚢ ᚦ ᚠ ᚱ

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH RFC net-next 05/19] net: dsa: tag_ar9331: add GRO callbacks
  2020-01-13  9:46         ` Alexander Lobakin
@ 2020-01-13 10:28           ` Vladimir Oltean
  2020-01-14 21:56             ` Florian Fainelli
  0 siblings, 1 reply; 42+ messages in thread
From: Vladimir Oltean @ 2020-01-13 10:28 UTC (permalink / raw)
  To: Alexander Lobakin
  Cc: Florian Fainelli, David S. Miller, Edward Cree, Andrew Lunn,
	Vivien Didelot, Hauke Mehrtens, Sean Wang, Matthias Brugger,
	Jiri Pirko, Eric Dumazet, Paolo Abeni, Jakub Kicinski,
	Taehee Yoo, Stephen Hemminger, Stanislav Fomichev,
	Daniel Borkmann, Song Liu, Matteo Croce, Jakub Sitnicki,
	Paul Blakey, Yoshiki Komachi, netdev, lkml,
	moderated list:ARM/Mediatek SoC support,
	moderated list:ARM/Mediatek SoC support

On Mon, 13 Jan 2020 at 11:46, Alexander Lobakin <alobakin@dlink.ru> wrote:
>
> Vladimir Oltean wrote 13.01.2020 12:42:
> > Hi Alexander,
> >
> > On Mon, 13 Jan 2020 at 11:22, Alexander Lobakin <alobakin@dlink.ru>
> > wrote:
> >>
> >> CPU ports can't be bridged anyway
> >>
> >> Regards,
> >> ᚷ ᛖ ᚢ ᚦ ᚠ ᚱ
> >
> > The fact that CPU ports can't be bridged is already not ideal.
> > One can have a DSA switch with cascaded switches on each port, so it
> > acts like N DSA masters (not as DSA links, since the taggers are
> > incompatible), with each switch forming its own tree. It is desirable
> > that the ports of the DSA switch on top are bridged, so that
> > forwarding between cascaded switches does not pass through the CPU.
>
> Oh, I see. But currently DSA infra forbids the adding DSA masters to
> bridges IIRC. Can't name it good or bad decision, but was introduced
> to prevent accidental packet flow breaking on DSA setups.
>

I just wanted to point out that some people are going to be looking at
ways by which the ETH_P_XDSA handler can be made to play nice with the
master's rx_handler, and that it would be nice to at least not make
the limitation worse than it is by converting everything to
rx_handlers (which "currently" can't be stacked, from the comments in
netdevice.h).

> > -Vladimir
>
> Regards,
> ᚷ ᛖ ᚢ ᚦ ᚠ ᚱ

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH RFC net-next 05/19] net: dsa: tag_ar9331: add GRO callbacks
  2020-01-13 10:28           ` Vladimir Oltean
@ 2020-01-14 21:56             ` Florian Fainelli
  2020-01-15  7:38               ` Alexander Lobakin
  0 siblings, 1 reply; 42+ messages in thread
From: Florian Fainelli @ 2020-01-14 21:56 UTC (permalink / raw)
  To: Vladimir Oltean, Alexander Lobakin
  Cc: David S. Miller, Edward Cree, Andrew Lunn, Vivien Didelot,
	Hauke Mehrtens, Sean Wang, Matthias Brugger, Jiri Pirko,
	Eric Dumazet, Paolo Abeni, Jakub Kicinski, Taehee Yoo,
	Stephen Hemminger, Stanislav Fomichev, Daniel Borkmann, Song Liu,
	Matteo Croce, Jakub Sitnicki, Paul Blakey, Yoshiki Komachi,
	netdev, lkml, moderated list:ARM/Mediatek SoC support,
	moderated list:ARM/Mediatek SoC support

On 1/13/20 2:28 AM, Vladimir Oltean wrote:
> On Mon, 13 Jan 2020 at 11:46, Alexander Lobakin <alobakin@dlink.ru> wrote:
>>
>> Vladimir Oltean wrote 13.01.2020 12:42:
>>> Hi Alexander,
>>>
>>> On Mon, 13 Jan 2020 at 11:22, Alexander Lobakin <alobakin@dlink.ru>
>>> wrote:
>>>>
>>>> CPU ports can't be bridged anyway
>>>>
>>>> Regards,
>>>> ᚷ ᛖ ᚢ ᚦ ᚠ ᚱ
>>>
>>> The fact that CPU ports can't be bridged is already not ideal.
>>> One can have a DSA switch with cascaded switches on each port, so it
>>> acts like N DSA masters (not as DSA links, since the taggers are
>>> incompatible), with each switch forming its own tree. It is desirable
>>> that the ports of the DSA switch on top are bridged, so that
>>> forwarding between cascaded switches does not pass through the CPU.
>>
>> Oh, I see. But currently DSA infra forbids the adding DSA masters to
>> bridges IIRC. Can't name it good or bad decision, but was introduced
>> to prevent accidental packet flow breaking on DSA setups.
>>
> 
> I just wanted to point out that some people are going to be looking at
> ways by which the ETH_P_XDSA handler can be made to play nice with the
> master's rx_handler, and that it would be nice to at least not make
> the limitation worse than it is by converting everything to
> rx_handlers (which "currently" can't be stacked, from the comments in
> netdevice.h).

I am not sure this would change the situation much, today we cannot have
anything but switch tags travel on the DSA master network device,
whether we accomplish the RX tap through a special skb->protocol value
or via rx_handler, it probably does not functionally matter, but it
could change the performance.
-- 
Florian

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH RFC net-next 16/19] net: dsa: tag_qca: fix doubled Tx statistics
  2019-12-30 17:23   ` Andrew Lunn
@ 2020-01-14 21:57     ` Florian Fainelli
  0 siblings, 0 replies; 42+ messages in thread
From: Florian Fainelli @ 2020-01-14 21:57 UTC (permalink / raw)
  To: Andrew Lunn, Alexander Lobakin
  Cc: David S. Miller, Edward Cree, Vivien Didelot, Hauke Mehrtens,
	Sean Wang, Matthias Brugger, Jiri Pirko, Eric Dumazet,
	Paolo Abeni, Jakub Kicinski, Taehee Yoo, Stephen Hemminger,
	Stanislav Fomichev, Daniel Borkmann, Song Liu, Matteo Croce,
	Jakub Sitnicki, Paul Blakey, Yoshiki Komachi, netdev,
	linux-kernel, linux-arm-kernel, linux-mediatek

On 12/30/19 9:23 AM, Andrew Lunn wrote:
> On Mon, Dec 30, 2019 at 05:30:24PM +0300, Alexander Lobakin wrote:
>> DSA core updates Tx stats for slaves in dsa_slave_xmit(), no need to do
>> it manually in .xmit() tagger callback.
>>
>> Signed-off-by: Alexander Lobakin <alobakin@dlink.ru>
> 
> Reviewed-by: Andrew Lunn <andrew@lunn.ch>

Alexander, can you submit that separately from your GRO series and add a
Fixes tag for this?

Thanks!
-- 
Florian

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH RFC net-next 06/19] net: dsa: tag_gswip: fix typo in tag name
  2019-12-30 17:22   ` Andrew Lunn
@ 2020-01-14 21:57     ` Florian Fainelli
  2020-01-15  7:24       ` Alexander Lobakin
  0 siblings, 1 reply; 42+ messages in thread
From: Florian Fainelli @ 2020-01-14 21:57 UTC (permalink / raw)
  To: Andrew Lunn, Alexander Lobakin
  Cc: David S. Miller, Edward Cree, Vivien Didelot, Hauke Mehrtens,
	Sean Wang, Matthias Brugger, Jiri Pirko, Eric Dumazet,
	Paolo Abeni, Jakub Kicinski, Taehee Yoo, Stephen Hemminger,
	Stanislav Fomichev, Daniel Borkmann, Song Liu, Matteo Croce,
	Jakub Sitnicki, Paul Blakey, Yoshiki Komachi, netdev,
	linux-kernel, linux-arm-kernel, linux-mediatek

On 12/30/19 9:22 AM, Andrew Lunn wrote:
> On Mon, Dec 30, 2019 at 05:30:14PM +0300, Alexander Lobakin wrote:
>> "gwsip" -> "gswip".
>>
>> Signed-off-by: Alexander Lobakin <alobakin@dlink.ru>
> 
> Reviewed-by: Andrew Lunn <andrew@lunn.ch>

Likewise, this is a bug fix that should be extracted out of this GRO
series and a Fixes: tag be put since this has an user-visible impact
through /sys/class/net/*/dsa/tagging.

Thanks
-- 
Florian

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH RFC net-next 06/19] net: dsa: tag_gswip: fix typo in tag name
  2020-01-14 21:57     ` Florian Fainelli
@ 2020-01-15  7:24       ` Alexander Lobakin
  0 siblings, 0 replies; 42+ messages in thread
From: Alexander Lobakin @ 2020-01-15  7:24 UTC (permalink / raw)
  To: Florian Fainelli
  Cc: Andrew Lunn, David S. Miller, Edward Cree, Vivien Didelot,
	Hauke Mehrtens, Sean Wang, Matthias Brugger, Jiri Pirko,
	Eric Dumazet, Paolo Abeni, Jakub Kicinski, Taehee Yoo,
	Stephen Hemminger, Stanislav Fomichev, Daniel Borkmann, Song Liu,
	Matteo Croce, Jakub Sitnicki, Paul Blakey, Yoshiki Komachi,
	netdev, linux-kernel, linux-arm-kernel, linux-mediatek

Florian Fainelli wrote 15.01.2020 00:57:
> On 12/30/19 9:22 AM, Andrew Lunn wrote:
>> On Mon, Dec 30, 2019 at 05:30:14PM +0300, Alexander Lobakin wrote:
>>> "gwsip" -> "gswip".
>>> 
>>> Signed-off-by: Alexander Lobakin <alobakin@dlink.ru>
>> 
>> Reviewed-by: Andrew Lunn <andrew@lunn.ch>
> 
> Likewise, this is a bug fix that should be extracted out of this GRO
> series and a Fixes: tag be put since this has an user-visible impact
> through /sys/class/net/*/dsa/tagging.

Sure, I'll pull some really important fixes (like this one and doubled
Tx stats in tag_qca) out of this series and submit them as separate
patches, maybe even in net-fixes tree?

> Thanks

Regards,
ᚷ ᛖ ᚢ ᚦ ᚠ ᚱ

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH RFC net-next 05/19] net: dsa: tag_ar9331: add GRO callbacks
  2020-01-14 21:56             ` Florian Fainelli
@ 2020-01-15  7:38               ` Alexander Lobakin
  2020-01-15 11:29                 ` Alexander Lobakin
  0 siblings, 1 reply; 42+ messages in thread
From: Alexander Lobakin @ 2020-01-15  7:38 UTC (permalink / raw)
  To: Florian Fainelli
  Cc: Vladimir Oltean, David S. Miller, Edward Cree, Andrew Lunn,
	Vivien Didelot, Hauke Mehrtens, Sean Wang, Matthias Brugger,
	Jiri Pirko, Eric Dumazet, Paolo Abeni, Jakub Kicinski,
	Taehee Yoo, Stephen Hemminger, Stanislav Fomichev,
	Daniel Borkmann, Song Liu, Matteo Croce, Jakub Sitnicki,
	Paul Blakey, Yoshiki Komachi, netdev, lkml,
	moderated list:ARM/Mediatek SoC support,
	moderated list:ARM/Mediatek SoC support

Florian Fainelli wrote 15.01.2020 00:56:
> On 1/13/20 2:28 AM, Vladimir Oltean wrote:
>> On Mon, 13 Jan 2020 at 11:46, Alexander Lobakin <alobakin@dlink.ru> 
>> wrote:
>>> 
>>> Vladimir Oltean wrote 13.01.2020 12:42:
>>>> Hi Alexander,
>>>> 
>>>> On Mon, 13 Jan 2020 at 11:22, Alexander Lobakin <alobakin@dlink.ru>
>>>> wrote:
>>>>> 
>>>>> CPU ports can't be bridged anyway
>>>>> 
>>>>> Regards,
>>>>> ᚷ ᛖ ᚢ ᚦ ᚠ ᚱ
>>>> 
>>>> The fact that CPU ports can't be bridged is already not ideal.
>>>> One can have a DSA switch with cascaded switches on each port, so it
>>>> acts like N DSA masters (not as DSA links, since the taggers are
>>>> incompatible), with each switch forming its own tree. It is 
>>>> desirable
>>>> that the ports of the DSA switch on top are bridged, so that
>>>> forwarding between cascaded switches does not pass through the CPU.
>>> 
>>> Oh, I see. But currently DSA infra forbids the adding DSA masters to
>>> bridges IIRC. Can't name it good or bad decision, but was introduced
>>> to prevent accidental packet flow breaking on DSA setups.
>>> 
>> 
>> I just wanted to point out that some people are going to be looking at
>> ways by which the ETH_P_XDSA handler can be made to play nice with the
>> master's rx_handler, and that it would be nice to at least not make
>> the limitation worse than it is by converting everything to
>> rx_handlers (which "currently" can't be stacked, from the comments in
>> netdevice.h).
> 
> I am not sure this would change the situation much, today we cannot 
> have
> anything but switch tags travel on the DSA master network device,
> whether we accomplish the RX tap through a special skb->protocol value
> or via rx_handler, it probably does not functionally matter, but it
> could change the performance.

As for now, I think that we should keep this RFC as it is so
developers working with different DSA switches could test it or
implement GRO offload for other taggers like DSA and EDSA, *but*
any future work on this should come only when we'll revise/reimagine
basic DSA packet flow, as we already know (at least me and Florian
reproduce it well) that the current path through unlikely branches
in eth_type_trans() and frame capturing through packet_type is so
suboptimal that nearly destroys overall performance on several
setups.
Switching to net_device::rx_handler() is just one of all the possible
variants, I'm sure we'll find the best solution together.

Regards,
ᚷ ᛖ ᚢ ᚦ ᚠ ᚱ

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH RFC net-next 05/19] net: dsa: tag_ar9331: add GRO callbacks
  2020-01-15  7:38               ` Alexander Lobakin
@ 2020-01-15 11:29                 ` Alexander Lobakin
  0 siblings, 0 replies; 42+ messages in thread
From: Alexander Lobakin @ 2020-01-15 11:29 UTC (permalink / raw)
  To: Florian Fainelli
  Cc: Vladimir Oltean, David S. Miller, Edward Cree, Andrew Lunn,
	Vivien Didelot, Hauke Mehrtens, Sean Wang, Matthias Brugger,
	Jiri Pirko, Eric Dumazet, Paolo Abeni, Jakub Kicinski,
	Taehee Yoo, Stephen Hemminger, Stanislav Fomichev,
	Daniel Borkmann, Song Liu, Matteo Croce, Jakub Sitnicki,
	Paul Blakey, Yoshiki Komachi, netdev, lkml,
	moderated list:ARM/Mediatek SoC support,
	moderated list:ARM/Mediatek SoC support

Alexander Lobakin wrote 15.01.2020 10:38:
> Florian Fainelli wrote 15.01.2020 00:56:
>> On 1/13/20 2:28 AM, Vladimir Oltean wrote:
>>> On Mon, 13 Jan 2020 at 11:46, Alexander Lobakin <alobakin@dlink.ru> 
>>> wrote:
>>>> 
>>>> Vladimir Oltean wrote 13.01.2020 12:42:
>>>>> Hi Alexander,
>>>>> 
>>>>> On Mon, 13 Jan 2020 at 11:22, Alexander Lobakin <alobakin@dlink.ru>
>>>>> wrote:
>>>>>> 
>>>>>> CPU ports can't be bridged anyway
>>>>>> 
>>>>>> Regards,
>>>>>> ᚷ ᛖ ᚢ ᚦ ᚠ ᚱ
>>>>> 
>>>>> The fact that CPU ports can't be bridged is already not ideal.
>>>>> One can have a DSA switch with cascaded switches on each port, so 
>>>>> it
>>>>> acts like N DSA masters (not as DSA links, since the taggers are
>>>>> incompatible), with each switch forming its own tree. It is 
>>>>> desirable
>>>>> that the ports of the DSA switch on top are bridged, so that
>>>>> forwarding between cascaded switches does not pass through the CPU.
>>>> 
>>>> Oh, I see. But currently DSA infra forbids the adding DSA masters to
>>>> bridges IIRC. Can't name it good or bad decision, but was introduced
>>>> to prevent accidental packet flow breaking on DSA setups.
>>>> 
>>> 
>>> I just wanted to point out that some people are going to be looking 
>>> at
>>> ways by which the ETH_P_XDSA handler can be made to play nice with 
>>> the
>>> master's rx_handler, and that it would be nice to at least not make
>>> the limitation worse than it is by converting everything to
>>> rx_handlers (which "currently" can't be stacked, from the comments in
>>> netdevice.h).
>> 
>> I am not sure this would change the situation much, today we cannot 
>> have
>> anything but switch tags travel on the DSA master network device,
>> whether we accomplish the RX tap through a special skb->protocol value
>> or via rx_handler, it probably does not functionally matter, but it
>> could change the performance.
> 
> As for now, I think that we should keep this RFC as it is so
> developers working with different DSA switches could test it or
> implement GRO offload for other taggers like DSA and EDSA, *but*
> any future work on this should come only when we'll revise/reimagine
> basic DSA packet flow, as we already know (at least me and Florian
> reproduce it well) that the current path through unlikely branches
> in eth_type_trans() and frame capturing through packet_type is so
> suboptimal that nearly destroys overall performance on several
> setups.

Well, I had enough free time today to write and test sort of
blueprint-like DSA via .rx_handler() to compare it with the current
flow and get at least basic picture of what's going on.

I chose a 600 MHz UP MIPS system to make a difference more noticeable
as more powerful systems tend to mitigate plenty of different "heavy"
corners and misses.
Ethernet driver for CPU port uses BQL and DIM, as well as hardware TSO.
A minimal GRO over DSA is also enabled. The codebase is Linux 5.5-rc6.
I use simple VLAN NAT (with nft flow offload), iperf3, IPv4 + TCP.

Mainline DSA Rx processing, one flow:

[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-60.00  sec  4.30 GBytes   615 Mbits/sec  2091   sender
[  5]   0.00-60.01  sec  4.30 GBytes   615 Mbits/sec         receiver

10 flows:

[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-60.00  sec   414 MBytes  57.9 Mbits/sec  460    sender
[  5]   0.00-60.01  sec   413 MBytes  57.7 Mbits/sec         receiver
[  7]   0.00-60.00  sec   392 MBytes  54.8 Mbits/sec  497    sender
[  7]   0.00-60.01  sec   391 MBytes  54.6 Mbits/sec         receiver
[  9]   0.00-60.00  sec   391 MBytes  54.6 Mbits/sec  438    sender
[  9]   0.00-60.01  sec   389 MBytes  54.4 Mbits/sec         receiver
[ 11]   0.00-60.00  sec   383 MBytes  53.5 Mbits/sec  472    sender
[ 11]   0.00-60.01  sec   382 MBytes  53.4 Mbits/sec         receiver
[ 13]   0.00-60.00  sec   404 MBytes  56.5 Mbits/sec  466    sender
[ 13]   0.00-60.01  sec   403 MBytes  56.3 Mbits/sec         receiver
[ 15]   0.00-60.00  sec   453 MBytes  63.4 Mbits/sec  490    sender
[ 15]   0.00-60.01  sec   452 MBytes  63.1 Mbits/sec         receiver
[ 17]   0.00-60.00  sec   461 MBytes  64.4 Mbits/sec  430    sender
[ 17]   0.00-60.01  sec   459 MBytes  64.2 Mbits/sec         receiver
[ 19]   0.00-60.00  sec   365 MBytes  51.0 Mbits/sec  493    sender
[ 19]   0.00-60.01  sec   364 MBytes  50.9 Mbits/sec         receiver
[ 21]   0.00-60.00  sec   407 MBytes  56.9 Mbits/sec  517    sender
[ 21]   0.00-60.01  sec   405 MBytes  56.7 Mbits/sec         receiver
[ 23]   0.00-60.00  sec   486 MBytes  68.0 Mbits/sec  458    sender
[ 23]   0.00-60.01  sec   484 MBytes  67.7 Mbits/sec         receiver
[SUM]   0.00-60.00  sec  4.06 GBytes   581 Mbits/sec  4721   sender
[SUM]   0.00-60.01  sec  4.04 GBytes   579 Mbits/sec         receiver

.rx_handler(), one flow:

[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-60.00  sec  4.40 GBytes   630 Mbits/sec  853    sender
[  5]   0.00-60.01  sec  4.40 GBytes   630 Mbits/sec         receiver

And 10:

[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-60.00  sec   440 MBytes  61.5 Mbits/sec  551    sender
[  5]   0.00-60.01  sec   439 MBytes  61.4 Mbits/sec         receiver
[  7]   0.00-60.00  sec   455 MBytes  63.6 Mbits/sec  496    sender
[  7]   0.00-60.01  sec   454 MBytes  63.4 Mbits/sec         receiver
[  9]   0.00-60.00  sec   484 MBytes  67.7 Mbits/sec  532    sender
[  9]   0.00-60.01  sec   483 MBytes  67.5 Mbits/sec         receiver
[ 11]   0.00-60.00  sec   598 MBytes  83.6 Mbits/sec  452    sender
[ 11]   0.00-60.01  sec   596 MBytes  83.3 Mbits/sec         receiver
[ 13]   0.00-60.00  sec   427 MBytes  59.7 Mbits/sec  539    sender
[ 13]   0.00-60.01  sec   426 MBytes  59.5 Mbits/sec         receiver
[ 15]   0.00-60.00  sec   469 MBytes  65.5 Mbits/sec  466    sender
[ 15]   0.00-60.01  sec   467 MBytes  65.3 Mbits/sec         receiver
[ 17]   0.00-60.00  sec   463 MBytes  64.7 Mbits/sec  472    sender
[ 17]   0.00-60.01  sec   462 MBytes  64.5 Mbits/sec         receiver
[ 19]   0.00-60.00  sec   533 MBytes  74.5 Mbits/sec  447    sender
[ 19]   0.00-60.01  sec   532 MBytes  74.3 Mbits/sec         receiver
[ 21]   0.00-60.00  sec   444 MBytes  62.1 Mbits/sec  527    sender
[ 21]   0.00-60.01  sec   443 MBytes  61.9 Mbits/sec         receiver
[ 23]   0.00-60.00  sec   500 MBytes  69.9 Mbits/sec  449    sender
[ 23]   0.00-60.01  sec   499 MBytes  69.8 Mbits/sec         receiver
[SUM]   0.00-60.00  sec  4.70 GBytes   673 Mbits/sec  4931   sender
[SUM]   0.00-60.01  sec  4.69 GBytes   671 Mbits/sec         receiver

Pretty significant stats. This happens not only because we get rid of
out-of-line unlikely() branches (which are natural killers, at least
on MIPS), but also because we don't need to call netif_receive_skb()
for the second time -- we might just return RX_HANDLER_ANOTHER and
Rx path becomes then not much longer than in case of simple VLAN tag
removal (_net/core/dev.c:5056_).

This should get more attention and tests on a wide variety of other
systems, of course.

> Switching to net_device::rx_handler() is just one of all the possible
> variants, I'm sure we'll find the best solution together.
> 
> Regards,
> ᚷ ᛖ ᚢ ᚦ ᚠ ᚱ

Regards,
ᚷ ᛖ ᚢ ᚦ ᚠ ᚱ

^ permalink raw reply	[flat|nested] 42+ messages in thread

end of thread, back to index

Thread overview: 42+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-12-30 14:30 [PATCH RFC net-next 00/20] net: dsa: add GRO support Alexander Lobakin
2019-12-30 14:30 ` [PATCH RFC net-next 01/19] net: dsa: make .flow_dissect() callback returning void Alexander Lobakin
2019-12-30 18:11   ` Florian Fainelli
2019-12-30 14:30 ` [PATCH RFC net-next 02/19] net: dsa: add GRO support infrastructure Alexander Lobakin
2019-12-30 14:30 ` [PATCH RFC net-next 03/19] net: dsa: tag_ar9331: add .flow_dissect() callback Alexander Lobakin
2019-12-30 18:22   ` Florian Fainelli
2019-12-30 14:30 ` [PATCH RFC net-next 04/19] net: dsa: tag_ar9331: split out common tag accessors Alexander Lobakin
2019-12-30 17:18   ` Andrew Lunn
2019-12-30 14:30 ` [PATCH RFC net-next 05/19] net: dsa: tag_ar9331: add GRO callbacks Alexander Lobakin
2019-12-30 18:20   ` Florian Fainelli
2019-12-30 20:36     ` Andrew Lunn
2020-01-13  9:21     ` Alexander Lobakin
2020-01-13  9:42       ` Vladimir Oltean
2020-01-13  9:46         ` Alexander Lobakin
2020-01-13 10:28           ` Vladimir Oltean
2020-01-14 21:56             ` Florian Fainelli
2020-01-15  7:38               ` Alexander Lobakin
2020-01-15 11:29                 ` Alexander Lobakin
2019-12-30 14:30 ` [PATCH RFC net-next 06/19] net: dsa: tag_gswip: fix typo in tag name Alexander Lobakin
2019-12-30 17:22   ` Andrew Lunn
2020-01-14 21:57     ` Florian Fainelli
2020-01-15  7:24       ` Alexander Lobakin
2019-12-30 14:30 ` [PATCH RFC net-next 07/19] net: dsa: tag_gswip: switch to bitfield helpers Alexander Lobakin
2019-12-30 14:30 ` [PATCH RFC net-next 08/19] net: dsa: tag_gswip: add .flow_dissect() callback Alexander Lobakin
2019-12-30 14:30 ` [PATCH RFC net-next 09/19] net: dsa: tag_gswip: split out common tag accessors Alexander Lobakin
2019-12-30 14:30 ` [PATCH RFC net-next 10/19] net: dsa: tag_gswip: add GRO callbacks Alexander Lobakin
2019-12-30 14:30 ` [PATCH RFC net-next 11/19] net: dsa: tag_lan9303: add .flow_dissect() callback Alexander Lobakin
2019-12-30 14:30 ` [PATCH RFC net-next 12/19] net: dsa: tag_lan9303: split out common tag accessors Alexander Lobakin
2019-12-30 14:30 ` [PATCH RFC net-next 13/19] net: dsa: tag_lan9303: add GRO callbacks Alexander Lobakin
2019-12-30 14:30 ` [PATCH RFC net-next 14/19] net: dsa: tag_mtk: split out common tag accessors Alexander Lobakin
2019-12-30 14:30 ` [PATCH RFC net-next 15/19] net: dsa: tag_mtk: add GRO callbacks Alexander Lobakin
2019-12-30 14:30 ` [PATCH RFC net-next 16/19] net: dsa: tag_qca: fix doubled Tx statistics Alexander Lobakin
2019-12-30 17:23   ` Andrew Lunn
2020-01-14 21:57     ` Florian Fainelli
2019-12-30 14:30 ` [PATCH RFC net-next 17/19] net: dsa: tag_qca: switch to bitfield helpers Alexander Lobakin
2019-12-30 14:30 ` [PATCH RFC net-next 18/19] net: dsa: tag_qca: split out common tag accessors Alexander Lobakin
2019-12-30 14:30 ` [PATCH RFC net-next 19/19] net: dsa: tag_qca: add GRO callbacks Alexander Lobakin
2019-12-30 14:30 ` [PATCH RFC net-next 20/20] net: core: add (unlikely) DSA support in napi_gro_frags() Alexander Lobakin
2019-12-30 17:12 ` [PATCH RFC net-next 00/20] net: dsa: add GRO support Andrew Lunn
2020-01-13  9:25   ` Alexander Lobakin
2019-12-31 15:32 ` Vladimir Oltean
2020-01-13  9:30   ` Alexander Lobakin

LKML Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/lkml/0 lkml/git/0.git
	git clone --mirror https://lore.kernel.org/lkml/1 lkml/git/1.git
	git clone --mirror https://lore.kernel.org/lkml/2 lkml/git/2.git
	git clone --mirror https://lore.kernel.org/lkml/3 lkml/git/3.git
	git clone --mirror https://lore.kernel.org/lkml/4 lkml/git/4.git
	git clone --mirror https://lore.kernel.org/lkml/5 lkml/git/5.git
	git clone --mirror https://lore.kernel.org/lkml/6 lkml/git/6.git
	git clone --mirror https://lore.kernel.org/lkml/7 lkml/git/7.git
	git clone --mirror https://lore.kernel.org/lkml/8 lkml/git/8.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 lkml lkml/ https://lore.kernel.org/lkml \
		linux-kernel@vger.kernel.org
	public-inbox-index lkml

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-kernel


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git