All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH net-next v1 0/9] ethtool: Add support for frame preemption
@ 2020-12-02  4:53 ` Vinicius Costa Gomes
  0 siblings, 0 replies; 54+ messages in thread
From: Vinicius Costa Gomes @ 2020-12-02  4:53 UTC (permalink / raw)
  To: netdev
  Cc: Vinicius Costa Gomes, jhs, xiyou.wangcong, jiri, kuba,
	m-karicheri2, vladimir.oltean, Jose.Abreu, po.liu,
	intel-wired-lan, anthony.l.nguyen

Hi,

Changes from RFC v2:
 - Reorganised the offload enabling/disabling on the driver size;
 - Added a few igc fixes;
 
Changes from RFC v1:
 - The per-queue preemptible/express setting is moved to applicable
   qdiscs (Jakub Kicinski and others);
 - "min-frag-size" now follows the 802.3br specification more closely,
   it's expressed as X in '64(1 + X) + 4' (Joergen Andreasen);

Another point that should be noted is the addition of the
TC_SETUP_PREEMPT offload type, the idea behind this is to allow other
qdiscs (was thinking of mqprio) to also configure which queues should
be marked as express/preemptible.

Original cover letter:

This is still an RFC because two main reasons, I want to confirm that
this approach (per-queue settings via qdiscs, device settings via
ethtool) looks good, even though there aren't much more options left ;-)
The other reason is that while testing this I found some weirdness
in the driver that I would need a bit more time to investigate.

(In case these patches are not enough to give an idea of how things
work, I can send the userspace patches, of course.)

The idea of this "hybrid" approach is that applications/users would do
the following steps to configure frame preemption:

$ tc qdisc replace dev $IFACE parent root handle 100 taprio \
      num_tc 3 \
      map 2 2 1 0 2 2 2 2 2 2 2 2 2 2 2 2 \
      queues 1@0 1@1 2@2 \
      base-time $BASE_TIME \
      sched-entry S 0f 10000000 \
      preempt 1110 \
      flags 0x2 

The "preempt" parameter is the only difference, it configures which
queues are marked as preemptible, in this example, queue 0 is marked
as "not preemptible", so it is express, the rest of the four queues
are preemptible.

The next step, of this example, would be to enable frame preemption in
the device, via ethtool, and set the minimum fragment size to 2:

$ sudo ./ethtool --set-frame-preemption $IFACE fp on min-frag-size 2


Cheers,


Vinicius Costa Gomes (9):
  ethtool: Add support for configuring frame preemption
  taprio: Add support for frame preemption offload
  igc: Set the RX packet buffer size for TSN mode
  igc: Only dump registers if configured to dump HW information
  igc: Avoid TX Hangs because long cycles
  igc: Add support for tuning frame preemption via ethtool
  igc: Add support for Frame Preemption offload
  igc: Add support for exposing frame preemption stats registers
  igc: Separate TSN configurations that can be updated

 drivers/net/ethernet/intel/igc/igc.h         |  10 ++
 drivers/net/ethernet/intel/igc/igc_defines.h |   6 +
 drivers/net/ethernet/intel/igc/igc_dump.c    |   3 +
 drivers/net/ethernet/intel/igc/igc_ethtool.c |  63 ++++++++
 drivers/net/ethernet/intel/igc/igc_main.c    |  31 +++-
 drivers/net/ethernet/intel/igc/igc_regs.h    |  10 ++
 drivers/net/ethernet/intel/igc/igc_tsn.c     | 158 ++++++++++++++-----
 drivers/net/ethernet/intel/igc/igc_tsn.h     |   1 +
 include/linux/ethtool.h                      |  19 +++
 include/linux/netdevice.h                    |   1 +
 include/net/pkt_sched.h                      |   4 +
 include/uapi/linux/ethtool_netlink.h         |  17 ++
 include/uapi/linux/pkt_sched.h               |   1 +
 net/ethtool/Makefile                         |   2 +-
 net/ethtool/netlink.c                        |  19 +++
 net/ethtool/netlink.h                        |   4 +
 net/ethtool/preempt.c                        | 151 ++++++++++++++++++
 net/sched/sch_taprio.c                       |  41 ++++-
 18 files changed, 491 insertions(+), 50 deletions(-)
 create mode 100644 net/ethtool/preempt.c

-- 
2.29.2


^ permalink raw reply	[flat|nested] 54+ messages in thread

* [Intel-wired-lan] [PATCH net-next v1 0/9] ethtool: Add support for frame preemption
@ 2020-12-02  4:53 ` Vinicius Costa Gomes
  0 siblings, 0 replies; 54+ messages in thread
From: Vinicius Costa Gomes @ 2020-12-02  4:53 UTC (permalink / raw)
  To: intel-wired-lan

Hi,

Changes from RFC v2:
 - Reorganised the offload enabling/disabling on the driver size;
 - Added a few igc fixes;
 
Changes from RFC v1:
 - The per-queue preemptible/express setting is moved to applicable
   qdiscs (Jakub Kicinski and others);
 - "min-frag-size" now follows the 802.3br specification more closely,
   it's expressed as X in '64(1 + X) + 4' (Joergen Andreasen);

Another point that should be noted is the addition of the
TC_SETUP_PREEMPT offload type, the idea behind this is to allow other
qdiscs (was thinking of mqprio) to also configure which queues should
be marked as express/preemptible.

Original cover letter:

This is still an RFC because two main reasons, I want to confirm that
this approach (per-queue settings via qdiscs, device settings via
ethtool) looks good, even though there aren't much more options left ;-)
The other reason is that while testing this I found some weirdness
in the driver that I would need a bit more time to investigate.

(In case these patches are not enough to give an idea of how things
work, I can send the userspace patches, of course.)

The idea of this "hybrid" approach is that applications/users would do
the following steps to configure frame preemption:

$ tc qdisc replace dev $IFACE parent root handle 100 taprio \
      num_tc 3 \
      map 2 2 1 0 2 2 2 2 2 2 2 2 2 2 2 2 \
      queues 1 at 0 1 at 1 2 at 2 \
      base-time $BASE_TIME \
      sched-entry S 0f 10000000 \
      preempt 1110 \
      flags 0x2 

The "preempt" parameter is the only difference, it configures which
queues are marked as preemptible, in this example, queue 0 is marked
as "not preemptible", so it is express, the rest of the four queues
are preemptible.

The next step, of this example, would be to enable frame preemption in
the device, via ethtool, and set the minimum fragment size to 2:

$ sudo ./ethtool --set-frame-preemption $IFACE fp on min-frag-size 2


Cheers,


Vinicius Costa Gomes (9):
  ethtool: Add support for configuring frame preemption
  taprio: Add support for frame preemption offload
  igc: Set the RX packet buffer size for TSN mode
  igc: Only dump registers if configured to dump HW information
  igc: Avoid TX Hangs because long cycles
  igc: Add support for tuning frame preemption via ethtool
  igc: Add support for Frame Preemption offload
  igc: Add support for exposing frame preemption stats registers
  igc: Separate TSN configurations that can be updated

 drivers/net/ethernet/intel/igc/igc.h         |  10 ++
 drivers/net/ethernet/intel/igc/igc_defines.h |   6 +
 drivers/net/ethernet/intel/igc/igc_dump.c    |   3 +
 drivers/net/ethernet/intel/igc/igc_ethtool.c |  63 ++++++++
 drivers/net/ethernet/intel/igc/igc_main.c    |  31 +++-
 drivers/net/ethernet/intel/igc/igc_regs.h    |  10 ++
 drivers/net/ethernet/intel/igc/igc_tsn.c     | 158 ++++++++++++++-----
 drivers/net/ethernet/intel/igc/igc_tsn.h     |   1 +
 include/linux/ethtool.h                      |  19 +++
 include/linux/netdevice.h                    |   1 +
 include/net/pkt_sched.h                      |   4 +
 include/uapi/linux/ethtool_netlink.h         |  17 ++
 include/uapi/linux/pkt_sched.h               |   1 +
 net/ethtool/Makefile                         |   2 +-
 net/ethtool/netlink.c                        |  19 +++
 net/ethtool/netlink.h                        |   4 +
 net/ethtool/preempt.c                        | 151 ++++++++++++++++++
 net/sched/sch_taprio.c                       |  41 ++++-
 18 files changed, 491 insertions(+), 50 deletions(-)
 create mode 100644 net/ethtool/preempt.c

-- 
2.29.2


^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH net-next v1 1/9] ethtool: Add support for configuring frame preemption
  2020-12-02  4:53 ` [Intel-wired-lan] " Vinicius Costa Gomes
@ 2020-12-02  4:53   ` Vinicius Costa Gomes
  -1 siblings, 0 replies; 54+ messages in thread
From: Vinicius Costa Gomes @ 2020-12-02  4:53 UTC (permalink / raw)
  To: netdev
  Cc: Vinicius Costa Gomes, jhs, xiyou.wangcong, jiri, kuba,
	m-karicheri2, vladimir.oltean, Jose.Abreu, po.liu,
	intel-wired-lan, anthony.l.nguyen

Frame preemption (described in IEEE 802.3br-2016) defines the concept
of preemptible and express queues. It allows traffic from express
queues to "interrupt" traffic from preemptible queues, which are
"resumed" after the express traffic has finished transmitting.

Frame preemption can only be used when both the local device and the
link partner support it.

Only parameters for enabling/disabling frame preemption and
configuring the minimum fragment size are included here. Expressing
which queues are marked as preemptible is left to mqprio/taprio, as
having that information there should be easier on the user.

Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
---
 include/linux/ethtool.h              |  19 ++++
 include/uapi/linux/ethtool_netlink.h |  17 +++
 net/ethtool/Makefile                 |   2 +-
 net/ethtool/netlink.c                |  19 ++++
 net/ethtool/netlink.h                |   4 +
 net/ethtool/preempt.c                | 151 +++++++++++++++++++++++++++
 6 files changed, 211 insertions(+), 1 deletion(-)
 create mode 100644 net/ethtool/preempt.c

diff --git a/include/linux/ethtool.h b/include/linux/ethtool.h
index e3da25b51ae4..16d6ee29a6ac 100644
--- a/include/linux/ethtool.h
+++ b/include/linux/ethtool.h
@@ -263,6 +263,19 @@ struct ethtool_pause_stats {
 	u64 rx_pause_frames;
 };
 
+/**
+ * struct ethtool_fp - Frame Preemption information
+ *
+ * @enabled: Enable frame preemption.
+ *
+ * @min_frag_size_mult: Minimum size for all non-final fragment size,
+ * expressed in terms of X in '(1 + X)*64 + 4'
+ */
+struct ethtool_fp {
+	u8 enabled;
+	u8 min_frag_size_mult;
+};
+
 /**
  * struct ethtool_ops - optional netdev operations
  * @supported_coalesce_params: supported types of interrupt coalescing.
@@ -406,6 +419,8 @@ struct ethtool_pause_stats {
  * @get_ethtool_phy_stats: Return extended statistics about the PHY device.
  *	This is only useful if the device maintains PHY statistics and
  *	cannot use the standard PHY library helpers.
+ * @get_preempt: Get the network device Frame Preemption parameters.
+ * @set_preempt: Set the network device Frame Preemption parameters.
  *
  * All operations are optional (i.e. the function pointer may be set
  * to %NULL) and callers must take this into account.  Callers must
@@ -504,6 +519,10 @@ struct ethtool_ops {
 				      struct ethtool_fecparam *);
 	int	(*set_fecparam)(struct net_device *,
 				      struct ethtool_fecparam *);
+	int	(*get_preempt)(struct net_device *,
+			       struct ethtool_fp *);
+	int	(*set_preempt)(struct net_device *,
+			       struct ethtool_fp *);
 	void	(*get_ethtool_phy_stats)(struct net_device *,
 					 struct ethtool_stats *, u64 *);
 	int	(*get_phy_tunable)(struct net_device *,
diff --git a/include/uapi/linux/ethtool_netlink.h b/include/uapi/linux/ethtool_netlink.h
index e2bf36e6964b..0b3dc0c263a9 100644
--- a/include/uapi/linux/ethtool_netlink.h
+++ b/include/uapi/linux/ethtool_netlink.h
@@ -42,6 +42,8 @@ enum {
 	ETHTOOL_MSG_CABLE_TEST_ACT,
 	ETHTOOL_MSG_CABLE_TEST_TDR_ACT,
 	ETHTOOL_MSG_TUNNEL_INFO_GET,
+	ETHTOOL_MSG_PREEMPT_GET,
+	ETHTOOL_MSG_PREEMPT_SET,
 
 	/* add new constants above here */
 	__ETHTOOL_MSG_USER_CNT,
@@ -80,6 +82,8 @@ enum {
 	ETHTOOL_MSG_CABLE_TEST_NTF,
 	ETHTOOL_MSG_CABLE_TEST_TDR_NTF,
 	ETHTOOL_MSG_TUNNEL_INFO_GET_REPLY,
+	ETHTOOL_MSG_PREEMPT_GET_REPLY,
+	ETHTOOL_MSG_PREEMPT_NTF,
 
 	/* add new constants above here */
 	__ETHTOOL_MSG_KERNEL_CNT,
@@ -628,6 +632,19 @@ enum {
 	ETHTOOL_A_TUNNEL_INFO_MAX = (__ETHTOOL_A_TUNNEL_INFO_CNT - 1)
 };
 
+/* FRAME PREEMPTION */
+
+enum {
+	ETHTOOL_A_PREEMPT_UNSPEC,
+	ETHTOOL_A_PREEMPT_HEADER,			/* nest - _A_HEADER_* */
+	ETHTOOL_A_PREEMPT_ENABLED,			/* u8 */
+	ETHTOOL_A_PREEMPT_MIN_FRAG_SIZE_MULT,		/* u8 */
+
+	/* add new constants above here */
+	__ETHTOOL_A_PREEMPT_CNT,
+	ETHTOOL_A_PREEMPT_MAX = (__ETHTOOL_A_PREEMPT_CNT - 1)
+};
+
 /* generic netlink info */
 #define ETHTOOL_GENL_NAME "ethtool"
 #define ETHTOOL_GENL_VERSION 1
diff --git a/net/ethtool/Makefile b/net/ethtool/Makefile
index 7a849ff22dad..4e584903e3ef 100644
--- a/net/ethtool/Makefile
+++ b/net/ethtool/Makefile
@@ -7,4 +7,4 @@ obj-$(CONFIG_ETHTOOL_NETLINK)	+= ethtool_nl.o
 ethtool_nl-y	:= netlink.o bitset.o strset.o linkinfo.o linkmodes.o \
 		   linkstate.o debug.o wol.o features.o privflags.o rings.o \
 		   channels.o coalesce.o pause.o eee.o tsinfo.o cabletest.o \
-		   tunnels.o
+		   tunnels.o preempt.o
diff --git a/net/ethtool/netlink.c b/net/ethtool/netlink.c
index 50d3c8896f91..bc7d66e3ba38 100644
--- a/net/ethtool/netlink.c
+++ b/net/ethtool/netlink.c
@@ -245,6 +245,7 @@ ethnl_default_requests[__ETHTOOL_MSG_USER_CNT] = {
 	[ETHTOOL_MSG_PAUSE_GET]		= &ethnl_pause_request_ops,
 	[ETHTOOL_MSG_EEE_GET]		= &ethnl_eee_request_ops,
 	[ETHTOOL_MSG_TSINFO_GET]	= &ethnl_tsinfo_request_ops,
+	[ETHTOOL_MSG_PREEMPT_GET]	= &ethnl_preempt_request_ops,
 };
 
 static struct ethnl_dump_ctx *ethnl_dump_context(struct netlink_callback *cb)
@@ -551,6 +552,7 @@ ethnl_default_notify_ops[ETHTOOL_MSG_KERNEL_MAX + 1] = {
 	[ETHTOOL_MSG_COALESCE_NTF]	= &ethnl_coalesce_request_ops,
 	[ETHTOOL_MSG_PAUSE_NTF]		= &ethnl_pause_request_ops,
 	[ETHTOOL_MSG_EEE_NTF]		= &ethnl_eee_request_ops,
+	[ETHTOOL_MSG_PREEMPT_NTF]	= &ethnl_preempt_request_ops,
 };
 
 /* default notification handler */
@@ -643,6 +645,7 @@ static const ethnl_notify_handler_t ethnl_notify_handlers[] = {
 	[ETHTOOL_MSG_COALESCE_NTF]	= ethnl_default_notify,
 	[ETHTOOL_MSG_PAUSE_NTF]		= ethnl_default_notify,
 	[ETHTOOL_MSG_EEE_NTF]		= ethnl_default_notify,
+	[ETHTOOL_MSG_PREEMPT_NTF]	= ethnl_default_notify,
 };
 
 void ethtool_notify(struct net_device *dev, unsigned int cmd, const void *data)
@@ -912,6 +915,22 @@ static const struct genl_ops ethtool_genl_ops[] = {
 		.policy = ethnl_tunnel_info_get_policy,
 		.maxattr = ARRAY_SIZE(ethnl_tunnel_info_get_policy) - 1,
 	},
+	{
+		.cmd	= ETHTOOL_MSG_PREEMPT_GET,
+		.doit	= ethnl_default_doit,
+		.start	= ethnl_default_start,
+		.dumpit	= ethnl_default_dumpit,
+		.done	= ethnl_default_done,
+		.policy = ethnl_preempt_get_policy,
+		.maxattr = ARRAY_SIZE(ethnl_preempt_get_policy) - 1,
+	},
+	{
+		.cmd	= ETHTOOL_MSG_PREEMPT_SET,
+		.flags	= GENL_UNS_ADMIN_PERM,
+		.doit	= ethnl_set_preempt,
+		.policy = ethnl_preempt_set_policy,
+		.maxattr = ARRAY_SIZE(ethnl_preempt_set_policy) - 1,
+	},
 };
 
 static const struct genl_multicast_group ethtool_nl_mcgrps[] = {
diff --git a/net/ethtool/netlink.h b/net/ethtool/netlink.h
index d8efec516d86..8f65e53ccd59 100644
--- a/net/ethtool/netlink.h
+++ b/net/ethtool/netlink.h
@@ -344,6 +344,7 @@ extern const struct ethnl_request_ops ethnl_coalesce_request_ops;
 extern const struct ethnl_request_ops ethnl_pause_request_ops;
 extern const struct ethnl_request_ops ethnl_eee_request_ops;
 extern const struct ethnl_request_ops ethnl_tsinfo_request_ops;
+extern const struct ethnl_request_ops ethnl_preempt_request_ops;
 
 extern const struct nla_policy ethnl_header_policy[ETHTOOL_A_HEADER_FLAGS + 1];
 extern const struct nla_policy ethnl_header_policy_stats[ETHTOOL_A_HEADER_FLAGS + 1];
@@ -375,6 +376,8 @@ extern const struct nla_policy ethnl_tsinfo_get_policy[ETHTOOL_A_TSINFO_HEADER +
 extern const struct nla_policy ethnl_cable_test_act_policy[ETHTOOL_A_CABLE_TEST_HEADER + 1];
 extern const struct nla_policy ethnl_cable_test_tdr_act_policy[ETHTOOL_A_CABLE_TEST_TDR_CFG + 1];
 extern const struct nla_policy ethnl_tunnel_info_get_policy[ETHTOOL_A_TUNNEL_INFO_HEADER + 1];
+extern const struct nla_policy ethnl_preempt_get_policy[ETHTOOL_A_PREEMPT_MAX + 1];
+extern const struct nla_policy ethnl_preempt_set_policy[ETHTOOL_A_PREEMPT_MAX + 1];
 
 int ethnl_set_linkinfo(struct sk_buff *skb, struct genl_info *info);
 int ethnl_set_linkmodes(struct sk_buff *skb, struct genl_info *info);
@@ -392,5 +395,6 @@ int ethnl_act_cable_test_tdr(struct sk_buff *skb, struct genl_info *info);
 int ethnl_tunnel_info_doit(struct sk_buff *skb, struct genl_info *info);
 int ethnl_tunnel_info_start(struct netlink_callback *cb);
 int ethnl_tunnel_info_dumpit(struct sk_buff *skb, struct netlink_callback *cb);
+int ethnl_set_preempt(struct sk_buff *skb, struct genl_info *info);
 
 #endif /* _NET_ETHTOOL_NETLINK_H */
diff --git a/net/ethtool/preempt.c b/net/ethtool/preempt.c
new file mode 100644
index 000000000000..4d97d1180a65
--- /dev/null
+++ b/net/ethtool/preempt.c
@@ -0,0 +1,151 @@
+// SPDX-License-Identifier: GPL-2.0-only
+
+#include "netlink.h"
+#include "common.h"
+
+struct preempt_req_info {
+	struct ethnl_req_info		base;
+};
+
+struct preempt_reply_data {
+	struct ethnl_reply_data		base;
+	struct ethtool_fp		fp;
+};
+
+#define PREEMPT_REPDATA(__reply_base) \
+	container_of(__reply_base, struct preempt_reply_data, base)
+
+const struct nla_policy
+ethnl_preempt_get_policy[ETHTOOL_A_PREEMPT_MAX + 1] = {
+	[ETHTOOL_A_PREEMPT_UNSPEC]		= { .type = NLA_REJECT },
+	[ETHTOOL_A_PREEMPT_HEADER]		= { .type = NLA_NESTED },
+	[ETHTOOL_A_PREEMPT_ENABLED]		= { .type = NLA_REJECT },
+	[ETHTOOL_A_PREEMPT_MIN_FRAG_SIZE_MULT]	= { .type = NLA_REJECT },
+};
+
+static int preempt_prepare_data(const struct ethnl_req_info *req_base,
+				struct ethnl_reply_data *reply_base,
+				struct genl_info *info)
+{
+	struct preempt_reply_data *data = PREEMPT_REPDATA(reply_base);
+	struct net_device *dev = reply_base->dev;
+	int ret;
+
+	if (!dev->ethtool_ops->get_preempt)
+		return -EOPNOTSUPP;
+
+	ret = ethnl_ops_begin(dev);
+	if (ret < 0)
+		return ret;
+
+	ret = dev->ethtool_ops->get_preempt(dev, &data->fp);
+	ethnl_ops_complete(dev);
+
+	return ret;
+}
+
+static int preempt_reply_size(const struct ethnl_req_info *req_base,
+			      const struct ethnl_reply_data *reply_base)
+{
+	int len = 0;
+
+	len += nla_total_size(sizeof(u8)); /* _PREEMPT_ENABLED */
+	len += nla_total_size(sizeof(u8)); /* _PREEMPT_MIN_FRAG_SIZE */
+
+	return len;
+}
+
+static int preempt_fill_reply(struct sk_buff *skb,
+			      const struct ethnl_req_info *req_base,
+			      const struct ethnl_reply_data *reply_base)
+{
+	const struct preempt_reply_data *data = PREEMPT_REPDATA(reply_base);
+	const struct ethtool_fp *preempt = &data->fp;
+
+	if (nla_put_u8(skb, ETHTOOL_A_PREEMPT_ENABLED, preempt->enabled))
+		return -EMSGSIZE;
+
+	if (nla_put_u8(skb, ETHTOOL_A_PREEMPT_MIN_FRAG_SIZE_MULT,
+		       preempt->min_frag_size_mult))
+		return -EMSGSIZE;
+
+	return 0;
+}
+
+const struct ethnl_request_ops ethnl_preempt_request_ops = {
+	.request_cmd		= ETHTOOL_MSG_PREEMPT_GET,
+	.reply_cmd		= ETHTOOL_MSG_PREEMPT_GET_REPLY,
+	.hdr_attr		= ETHTOOL_A_PREEMPT_HEADER,
+	.req_info_size		= sizeof(struct preempt_req_info),
+	.reply_data_size	= sizeof(struct preempt_reply_data),
+
+	.prepare_data		= preempt_prepare_data,
+	.reply_size		= preempt_reply_size,
+	.fill_reply		= preempt_fill_reply,
+};
+
+const struct nla_policy
+ethnl_preempt_set_policy[ETHTOOL_A_PREEMPT_MAX + 1] = {
+	[ETHTOOL_A_PREEMPT_UNSPEC]			= { .type = NLA_REJECT },
+	[ETHTOOL_A_PREEMPT_HEADER]			= { .type = NLA_NESTED },
+	[ETHTOOL_A_PREEMPT_ENABLED]			= { .type = NLA_U8 },
+	[ETHTOOL_A_PREEMPT_MIN_FRAG_SIZE_MULT]		= { .type = NLA_U8 },
+};
+
+int ethnl_set_preempt(struct sk_buff *skb, struct genl_info *info)
+{
+	struct ethnl_req_info req_info = {};
+	struct nlattr **tb = info->attrs;
+	struct ethtool_fp preempt = {};
+	struct net_device *dev;
+	bool mod = false;
+	int ret;
+
+	ret = ethnl_parse_header_dev_get(&req_info,
+					 tb[ETHTOOL_A_PREEMPT_HEADER],
+					 genl_info_net(info), info->extack,
+					 true);
+	if (ret < 0)
+		return ret;
+	dev = req_info.dev;
+	ret = -EOPNOTSUPP;
+	if (!dev->ethtool_ops->get_preempt ||
+	    !dev->ethtool_ops->set_preempt)
+		goto out_dev;
+
+	rtnl_lock();
+	ret = ethnl_ops_begin(dev);
+	if (ret < 0)
+		goto out_rtnl;
+
+	ret = dev->ethtool_ops->get_preempt(dev, &preempt);
+	if (ret < 0) {
+		GENL_SET_ERR_MSG(info, "failed to retrieve frame preemption settings");
+		goto out_ops;
+	}
+
+	ethnl_update_u8(&preempt.enabled,
+			tb[ETHTOOL_A_PREEMPT_ENABLED], &mod);
+	ethnl_update_u8(&preempt.min_frag_size_mult,
+			tb[ETHTOOL_A_PREEMPT_MIN_FRAG_SIZE_MULT], &mod);
+
+	ret = 0;
+	if (!mod)
+		goto out_ops;
+
+	ret = dev->ethtool_ops->set_preempt(dev, &preempt);
+	if (ret < 0) {
+		GENL_SET_ERR_MSG(info, "frame preemption settings update failed");
+		goto out_ops;
+	}
+
+	ethtool_notify(dev, ETHTOOL_MSG_PREEMPT_NTF, NULL);
+
+out_ops:
+	ethnl_ops_complete(dev);
+out_rtnl:
+	rtnl_unlock();
+out_dev:
+	dev_put(dev);
+	return ret;
+}
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [Intel-wired-lan] [PATCH net-next v1 1/9] ethtool: Add support for configuring frame preemption
@ 2020-12-02  4:53   ` Vinicius Costa Gomes
  0 siblings, 0 replies; 54+ messages in thread
From: Vinicius Costa Gomes @ 2020-12-02  4:53 UTC (permalink / raw)
  To: intel-wired-lan

Frame preemption (described in IEEE 802.3br-2016) defines the concept
of preemptible and express queues. It allows traffic from express
queues to "interrupt" traffic from preemptible queues, which are
"resumed" after the express traffic has finished transmitting.

Frame preemption can only be used when both the local device and the
link partner support it.

Only parameters for enabling/disabling frame preemption and
configuring the minimum fragment size are included here. Expressing
which queues are marked as preemptible is left to mqprio/taprio, as
having that information there should be easier on the user.

Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
---
 include/linux/ethtool.h              |  19 ++++
 include/uapi/linux/ethtool_netlink.h |  17 +++
 net/ethtool/Makefile                 |   2 +-
 net/ethtool/netlink.c                |  19 ++++
 net/ethtool/netlink.h                |   4 +
 net/ethtool/preempt.c                | 151 +++++++++++++++++++++++++++
 6 files changed, 211 insertions(+), 1 deletion(-)
 create mode 100644 net/ethtool/preempt.c

diff --git a/include/linux/ethtool.h b/include/linux/ethtool.h
index e3da25b51ae4..16d6ee29a6ac 100644
--- a/include/linux/ethtool.h
+++ b/include/linux/ethtool.h
@@ -263,6 +263,19 @@ struct ethtool_pause_stats {
 	u64 rx_pause_frames;
 };
 
+/**
+ * struct ethtool_fp - Frame Preemption information
+ *
+ * @enabled: Enable frame preemption.
+ *
+ * @min_frag_size_mult: Minimum size for all non-final fragment size,
+ * expressed in terms of X in '(1 + X)*64 + 4'
+ */
+struct ethtool_fp {
+	u8 enabled;
+	u8 min_frag_size_mult;
+};
+
 /**
  * struct ethtool_ops - optional netdev operations
  * @supported_coalesce_params: supported types of interrupt coalescing.
@@ -406,6 +419,8 @@ struct ethtool_pause_stats {
  * @get_ethtool_phy_stats: Return extended statistics about the PHY device.
  *	This is only useful if the device maintains PHY statistics and
  *	cannot use the standard PHY library helpers.
+ * @get_preempt: Get the network device Frame Preemption parameters.
+ * @set_preempt: Set the network device Frame Preemption parameters.
  *
  * All operations are optional (i.e. the function pointer may be set
  * to %NULL) and callers must take this into account.  Callers must
@@ -504,6 +519,10 @@ struct ethtool_ops {
 				      struct ethtool_fecparam *);
 	int	(*set_fecparam)(struct net_device *,
 				      struct ethtool_fecparam *);
+	int	(*get_preempt)(struct net_device *,
+			       struct ethtool_fp *);
+	int	(*set_preempt)(struct net_device *,
+			       struct ethtool_fp *);
 	void	(*get_ethtool_phy_stats)(struct net_device *,
 					 struct ethtool_stats *, u64 *);
 	int	(*get_phy_tunable)(struct net_device *,
diff --git a/include/uapi/linux/ethtool_netlink.h b/include/uapi/linux/ethtool_netlink.h
index e2bf36e6964b..0b3dc0c263a9 100644
--- a/include/uapi/linux/ethtool_netlink.h
+++ b/include/uapi/linux/ethtool_netlink.h
@@ -42,6 +42,8 @@ enum {
 	ETHTOOL_MSG_CABLE_TEST_ACT,
 	ETHTOOL_MSG_CABLE_TEST_TDR_ACT,
 	ETHTOOL_MSG_TUNNEL_INFO_GET,
+	ETHTOOL_MSG_PREEMPT_GET,
+	ETHTOOL_MSG_PREEMPT_SET,
 
 	/* add new constants above here */
 	__ETHTOOL_MSG_USER_CNT,
@@ -80,6 +82,8 @@ enum {
 	ETHTOOL_MSG_CABLE_TEST_NTF,
 	ETHTOOL_MSG_CABLE_TEST_TDR_NTF,
 	ETHTOOL_MSG_TUNNEL_INFO_GET_REPLY,
+	ETHTOOL_MSG_PREEMPT_GET_REPLY,
+	ETHTOOL_MSG_PREEMPT_NTF,
 
 	/* add new constants above here */
 	__ETHTOOL_MSG_KERNEL_CNT,
@@ -628,6 +632,19 @@ enum {
 	ETHTOOL_A_TUNNEL_INFO_MAX = (__ETHTOOL_A_TUNNEL_INFO_CNT - 1)
 };
 
+/* FRAME PREEMPTION */
+
+enum {
+	ETHTOOL_A_PREEMPT_UNSPEC,
+	ETHTOOL_A_PREEMPT_HEADER,			/* nest - _A_HEADER_* */
+	ETHTOOL_A_PREEMPT_ENABLED,			/* u8 */
+	ETHTOOL_A_PREEMPT_MIN_FRAG_SIZE_MULT,		/* u8 */
+
+	/* add new constants above here */
+	__ETHTOOL_A_PREEMPT_CNT,
+	ETHTOOL_A_PREEMPT_MAX = (__ETHTOOL_A_PREEMPT_CNT - 1)
+};
+
 /* generic netlink info */
 #define ETHTOOL_GENL_NAME "ethtool"
 #define ETHTOOL_GENL_VERSION 1
diff --git a/net/ethtool/Makefile b/net/ethtool/Makefile
index 7a849ff22dad..4e584903e3ef 100644
--- a/net/ethtool/Makefile
+++ b/net/ethtool/Makefile
@@ -7,4 +7,4 @@ obj-$(CONFIG_ETHTOOL_NETLINK)	+= ethtool_nl.o
 ethtool_nl-y	:= netlink.o bitset.o strset.o linkinfo.o linkmodes.o \
 		   linkstate.o debug.o wol.o features.o privflags.o rings.o \
 		   channels.o coalesce.o pause.o eee.o tsinfo.o cabletest.o \
-		   tunnels.o
+		   tunnels.o preempt.o
diff --git a/net/ethtool/netlink.c b/net/ethtool/netlink.c
index 50d3c8896f91..bc7d66e3ba38 100644
--- a/net/ethtool/netlink.c
+++ b/net/ethtool/netlink.c
@@ -245,6 +245,7 @@ ethnl_default_requests[__ETHTOOL_MSG_USER_CNT] = {
 	[ETHTOOL_MSG_PAUSE_GET]		= &ethnl_pause_request_ops,
 	[ETHTOOL_MSG_EEE_GET]		= &ethnl_eee_request_ops,
 	[ETHTOOL_MSG_TSINFO_GET]	= &ethnl_tsinfo_request_ops,
+	[ETHTOOL_MSG_PREEMPT_GET]	= &ethnl_preempt_request_ops,
 };
 
 static struct ethnl_dump_ctx *ethnl_dump_context(struct netlink_callback *cb)
@@ -551,6 +552,7 @@ ethnl_default_notify_ops[ETHTOOL_MSG_KERNEL_MAX + 1] = {
 	[ETHTOOL_MSG_COALESCE_NTF]	= &ethnl_coalesce_request_ops,
 	[ETHTOOL_MSG_PAUSE_NTF]		= &ethnl_pause_request_ops,
 	[ETHTOOL_MSG_EEE_NTF]		= &ethnl_eee_request_ops,
+	[ETHTOOL_MSG_PREEMPT_NTF]	= &ethnl_preempt_request_ops,
 };
 
 /* default notification handler */
@@ -643,6 +645,7 @@ static const ethnl_notify_handler_t ethnl_notify_handlers[] = {
 	[ETHTOOL_MSG_COALESCE_NTF]	= ethnl_default_notify,
 	[ETHTOOL_MSG_PAUSE_NTF]		= ethnl_default_notify,
 	[ETHTOOL_MSG_EEE_NTF]		= ethnl_default_notify,
+	[ETHTOOL_MSG_PREEMPT_NTF]	= ethnl_default_notify,
 };
 
 void ethtool_notify(struct net_device *dev, unsigned int cmd, const void *data)
@@ -912,6 +915,22 @@ static const struct genl_ops ethtool_genl_ops[] = {
 		.policy = ethnl_tunnel_info_get_policy,
 		.maxattr = ARRAY_SIZE(ethnl_tunnel_info_get_policy) - 1,
 	},
+	{
+		.cmd	= ETHTOOL_MSG_PREEMPT_GET,
+		.doit	= ethnl_default_doit,
+		.start	= ethnl_default_start,
+		.dumpit	= ethnl_default_dumpit,
+		.done	= ethnl_default_done,
+		.policy = ethnl_preempt_get_policy,
+		.maxattr = ARRAY_SIZE(ethnl_preempt_get_policy) - 1,
+	},
+	{
+		.cmd	= ETHTOOL_MSG_PREEMPT_SET,
+		.flags	= GENL_UNS_ADMIN_PERM,
+		.doit	= ethnl_set_preempt,
+		.policy = ethnl_preempt_set_policy,
+		.maxattr = ARRAY_SIZE(ethnl_preempt_set_policy) - 1,
+	},
 };
 
 static const struct genl_multicast_group ethtool_nl_mcgrps[] = {
diff --git a/net/ethtool/netlink.h b/net/ethtool/netlink.h
index d8efec516d86..8f65e53ccd59 100644
--- a/net/ethtool/netlink.h
+++ b/net/ethtool/netlink.h
@@ -344,6 +344,7 @@ extern const struct ethnl_request_ops ethnl_coalesce_request_ops;
 extern const struct ethnl_request_ops ethnl_pause_request_ops;
 extern const struct ethnl_request_ops ethnl_eee_request_ops;
 extern const struct ethnl_request_ops ethnl_tsinfo_request_ops;
+extern const struct ethnl_request_ops ethnl_preempt_request_ops;
 
 extern const struct nla_policy ethnl_header_policy[ETHTOOL_A_HEADER_FLAGS + 1];
 extern const struct nla_policy ethnl_header_policy_stats[ETHTOOL_A_HEADER_FLAGS + 1];
@@ -375,6 +376,8 @@ extern const struct nla_policy ethnl_tsinfo_get_policy[ETHTOOL_A_TSINFO_HEADER +
 extern const struct nla_policy ethnl_cable_test_act_policy[ETHTOOL_A_CABLE_TEST_HEADER + 1];
 extern const struct nla_policy ethnl_cable_test_tdr_act_policy[ETHTOOL_A_CABLE_TEST_TDR_CFG + 1];
 extern const struct nla_policy ethnl_tunnel_info_get_policy[ETHTOOL_A_TUNNEL_INFO_HEADER + 1];
+extern const struct nla_policy ethnl_preempt_get_policy[ETHTOOL_A_PREEMPT_MAX + 1];
+extern const struct nla_policy ethnl_preempt_set_policy[ETHTOOL_A_PREEMPT_MAX + 1];
 
 int ethnl_set_linkinfo(struct sk_buff *skb, struct genl_info *info);
 int ethnl_set_linkmodes(struct sk_buff *skb, struct genl_info *info);
@@ -392,5 +395,6 @@ int ethnl_act_cable_test_tdr(struct sk_buff *skb, struct genl_info *info);
 int ethnl_tunnel_info_doit(struct sk_buff *skb, struct genl_info *info);
 int ethnl_tunnel_info_start(struct netlink_callback *cb);
 int ethnl_tunnel_info_dumpit(struct sk_buff *skb, struct netlink_callback *cb);
+int ethnl_set_preempt(struct sk_buff *skb, struct genl_info *info);
 
 #endif /* _NET_ETHTOOL_NETLINK_H */
diff --git a/net/ethtool/preempt.c b/net/ethtool/preempt.c
new file mode 100644
index 000000000000..4d97d1180a65
--- /dev/null
+++ b/net/ethtool/preempt.c
@@ -0,0 +1,151 @@
+// SPDX-License-Identifier: GPL-2.0-only
+
+#include "netlink.h"
+#include "common.h"
+
+struct preempt_req_info {
+	struct ethnl_req_info		base;
+};
+
+struct preempt_reply_data {
+	struct ethnl_reply_data		base;
+	struct ethtool_fp		fp;
+};
+
+#define PREEMPT_REPDATA(__reply_base) \
+	container_of(__reply_base, struct preempt_reply_data, base)
+
+const struct nla_policy
+ethnl_preempt_get_policy[ETHTOOL_A_PREEMPT_MAX + 1] = {
+	[ETHTOOL_A_PREEMPT_UNSPEC]		= { .type = NLA_REJECT },
+	[ETHTOOL_A_PREEMPT_HEADER]		= { .type = NLA_NESTED },
+	[ETHTOOL_A_PREEMPT_ENABLED]		= { .type = NLA_REJECT },
+	[ETHTOOL_A_PREEMPT_MIN_FRAG_SIZE_MULT]	= { .type = NLA_REJECT },
+};
+
+static int preempt_prepare_data(const struct ethnl_req_info *req_base,
+				struct ethnl_reply_data *reply_base,
+				struct genl_info *info)
+{
+	struct preempt_reply_data *data = PREEMPT_REPDATA(reply_base);
+	struct net_device *dev = reply_base->dev;
+	int ret;
+
+	if (!dev->ethtool_ops->get_preempt)
+		return -EOPNOTSUPP;
+
+	ret = ethnl_ops_begin(dev);
+	if (ret < 0)
+		return ret;
+
+	ret = dev->ethtool_ops->get_preempt(dev, &data->fp);
+	ethnl_ops_complete(dev);
+
+	return ret;
+}
+
+static int preempt_reply_size(const struct ethnl_req_info *req_base,
+			      const struct ethnl_reply_data *reply_base)
+{
+	int len = 0;
+
+	len += nla_total_size(sizeof(u8)); /* _PREEMPT_ENABLED */
+	len += nla_total_size(sizeof(u8)); /* _PREEMPT_MIN_FRAG_SIZE */
+
+	return len;
+}
+
+static int preempt_fill_reply(struct sk_buff *skb,
+			      const struct ethnl_req_info *req_base,
+			      const struct ethnl_reply_data *reply_base)
+{
+	const struct preempt_reply_data *data = PREEMPT_REPDATA(reply_base);
+	const struct ethtool_fp *preempt = &data->fp;
+
+	if (nla_put_u8(skb, ETHTOOL_A_PREEMPT_ENABLED, preempt->enabled))
+		return -EMSGSIZE;
+
+	if (nla_put_u8(skb, ETHTOOL_A_PREEMPT_MIN_FRAG_SIZE_MULT,
+		       preempt->min_frag_size_mult))
+		return -EMSGSIZE;
+
+	return 0;
+}
+
+const struct ethnl_request_ops ethnl_preempt_request_ops = {
+	.request_cmd		= ETHTOOL_MSG_PREEMPT_GET,
+	.reply_cmd		= ETHTOOL_MSG_PREEMPT_GET_REPLY,
+	.hdr_attr		= ETHTOOL_A_PREEMPT_HEADER,
+	.req_info_size		= sizeof(struct preempt_req_info),
+	.reply_data_size	= sizeof(struct preempt_reply_data),
+
+	.prepare_data		= preempt_prepare_data,
+	.reply_size		= preempt_reply_size,
+	.fill_reply		= preempt_fill_reply,
+};
+
+const struct nla_policy
+ethnl_preempt_set_policy[ETHTOOL_A_PREEMPT_MAX + 1] = {
+	[ETHTOOL_A_PREEMPT_UNSPEC]			= { .type = NLA_REJECT },
+	[ETHTOOL_A_PREEMPT_HEADER]			= { .type = NLA_NESTED },
+	[ETHTOOL_A_PREEMPT_ENABLED]			= { .type = NLA_U8 },
+	[ETHTOOL_A_PREEMPT_MIN_FRAG_SIZE_MULT]		= { .type = NLA_U8 },
+};
+
+int ethnl_set_preempt(struct sk_buff *skb, struct genl_info *info)
+{
+	struct ethnl_req_info req_info = {};
+	struct nlattr **tb = info->attrs;
+	struct ethtool_fp preempt = {};
+	struct net_device *dev;
+	bool mod = false;
+	int ret;
+
+	ret = ethnl_parse_header_dev_get(&req_info,
+					 tb[ETHTOOL_A_PREEMPT_HEADER],
+					 genl_info_net(info), info->extack,
+					 true);
+	if (ret < 0)
+		return ret;
+	dev = req_info.dev;
+	ret = -EOPNOTSUPP;
+	if (!dev->ethtool_ops->get_preempt ||
+	    !dev->ethtool_ops->set_preempt)
+		goto out_dev;
+
+	rtnl_lock();
+	ret = ethnl_ops_begin(dev);
+	if (ret < 0)
+		goto out_rtnl;
+
+	ret = dev->ethtool_ops->get_preempt(dev, &preempt);
+	if (ret < 0) {
+		GENL_SET_ERR_MSG(info, "failed to retrieve frame preemption settings");
+		goto out_ops;
+	}
+
+	ethnl_update_u8(&preempt.enabled,
+			tb[ETHTOOL_A_PREEMPT_ENABLED], &mod);
+	ethnl_update_u8(&preempt.min_frag_size_mult,
+			tb[ETHTOOL_A_PREEMPT_MIN_FRAG_SIZE_MULT], &mod);
+
+	ret = 0;
+	if (!mod)
+		goto out_ops;
+
+	ret = dev->ethtool_ops->set_preempt(dev, &preempt);
+	if (ret < 0) {
+		GENL_SET_ERR_MSG(info, "frame preemption settings update failed");
+		goto out_ops;
+	}
+
+	ethtool_notify(dev, ETHTOOL_MSG_PREEMPT_NTF, NULL);
+
+out_ops:
+	ethnl_ops_complete(dev);
+out_rtnl:
+	rtnl_unlock();
+out_dev:
+	dev_put(dev);
+	return ret;
+}
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH net-next v1 2/9] taprio: Add support for frame preemption offload
  2020-12-02  4:53 ` [Intel-wired-lan] " Vinicius Costa Gomes
@ 2020-12-02  4:53   ` Vinicius Costa Gomes
  -1 siblings, 0 replies; 54+ messages in thread
From: Vinicius Costa Gomes @ 2020-12-02  4:53 UTC (permalink / raw)
  To: netdev
  Cc: Vinicius Costa Gomes, jhs, xiyou.wangcong, jiri, kuba,
	m-karicheri2, vladimir.oltean, Jose.Abreu, po.liu,
	intel-wired-lan, anthony.l.nguyen

This adds a way to configure which queues are marked as preemptible
and which are marked as express.

Even if this is not a "real" offload, because it can't be executed
purely in software, having this information near where the mapping of
queues is specified, makes it, hopefully, easier to understand.

Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
---
 include/linux/netdevice.h      |  1 +
 include/net/pkt_sched.h        |  4 ++++
 include/uapi/linux/pkt_sched.h |  1 +
 net/sched/sch_taprio.c         | 41 ++++++++++++++++++++++++++++++----
 4 files changed, 43 insertions(+), 4 deletions(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 8eeb73ac58bd..d7a99e769e79 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -852,6 +852,7 @@ enum tc_setup_type {
 	TC_SETUP_QDISC_ETS,
 	TC_SETUP_QDISC_TBF,
 	TC_SETUP_QDISC_FIFO,
+	TC_SETUP_PREEMPT,
 };
 
 /* These structures hold the attributes of bpf state that are being passed
diff --git a/include/net/pkt_sched.h b/include/net/pkt_sched.h
index 15b1b30f454e..be5ff1535332 100644
--- a/include/net/pkt_sched.h
+++ b/include/net/pkt_sched.h
@@ -183,6 +183,10 @@ struct tc_taprio_qopt_offload {
 	struct tc_taprio_sched_entry entries[];
 };
 
+struct tc_preempt_qopt_offload {
+	u32 preemptible_queues;
+};
+
 /* Reference counting */
 struct tc_taprio_qopt_offload *taprio_offload_get(struct tc_taprio_qopt_offload
 						  *offload);
diff --git a/include/uapi/linux/pkt_sched.h b/include/uapi/linux/pkt_sched.h
index 9e7c2c607845..f0240ddaeee3 100644
--- a/include/uapi/linux/pkt_sched.h
+++ b/include/uapi/linux/pkt_sched.h
@@ -1240,6 +1240,7 @@ enum {
 	TCA_TAPRIO_ATTR_SCHED_CYCLE_TIME_EXTENSION, /* s64 */
 	TCA_TAPRIO_ATTR_FLAGS, /* u32 */
 	TCA_TAPRIO_ATTR_TXTIME_DELAY, /* u32 */
+	TCA_TAPRIO_ATTR_PREEMPT_QUEUES, /* u32 */
 	__TCA_TAPRIO_ATTR_MAX,
 };
 
diff --git a/net/sched/sch_taprio.c b/net/sched/sch_taprio.c
index 26fb8a62996b..c482c5d211bb 100644
--- a/net/sched/sch_taprio.c
+++ b/net/sched/sch_taprio.c
@@ -64,6 +64,7 @@ struct taprio_sched {
 	struct Qdisc **qdiscs;
 	struct Qdisc *root;
 	u32 flags;
+	u32 preemptible_queues;
 	enum tk_offsets tk_offset;
 	int clockid;
 	atomic64_t picos_per_byte; /* Using picoseconds because for 10Gbps+
@@ -776,6 +777,7 @@ static const struct nla_policy taprio_policy[TCA_TAPRIO_ATTR_MAX + 1] = {
 	[TCA_TAPRIO_ATTR_SCHED_CYCLE_TIME_EXTENSION] = { .type = NLA_S64 },
 	[TCA_TAPRIO_ATTR_FLAGS]                      = { .type = NLA_U32 },
 	[TCA_TAPRIO_ATTR_TXTIME_DELAY]		     = { .type = NLA_U32 },
+	[TCA_TAPRIO_ATTR_PREEMPT_QUEUES]	     = { .type = NLA_U32 },
 };
 
 static int fill_sched_entry(struct taprio_sched *q, struct nlattr **tb,
@@ -1268,6 +1270,7 @@ static int taprio_disable_offload(struct net_device *dev,
 				  struct netlink_ext_ack *extack)
 {
 	const struct net_device_ops *ops = dev->netdev_ops;
+	struct tc_preempt_qopt_offload preempt = { };
 	struct tc_taprio_qopt_offload *offload;
 	int err;
 
@@ -1286,13 +1289,15 @@ static int taprio_disable_offload(struct net_device *dev,
 	offload->enable = 0;
 
 	err = ops->ndo_setup_tc(dev, TC_SETUP_QDISC_TAPRIO, offload);
-	if (err < 0) {
+	if (err < 0)
+		NL_SET_ERR_MSG(extack,
+			       "Device failed to disable offload");
+
+	err = ops->ndo_setup_tc(dev, TC_SETUP_PREEMPT, &preempt);
+	if (err < 0)
 		NL_SET_ERR_MSG(extack,
 			       "Device failed to disable offload");
-		goto out;
-	}
 
-out:
 	taprio_offload_free(offload);
 
 	return err;
@@ -1509,6 +1514,29 @@ static int taprio_change(struct Qdisc *sch, struct nlattr *opt,
 					       mqprio->prio_tc_map[i]);
 	}
 
+	/* It's valid to enable frame preemption without any kind of
+	 * offloading being enabled, so keep it separated.
+	 */
+	if (tb[TCA_TAPRIO_ATTR_PREEMPT_QUEUES]) {
+		u32 preempt = nla_get_u32(tb[TCA_TAPRIO_ATTR_PREEMPT_QUEUES]);
+		struct tc_preempt_qopt_offload qopt = { };
+
+		if (preempt == U32_MAX) {
+			NL_SET_ERR_MSG(extack, "At least one queue must be not be preemptible");
+			err = -EINVAL;
+			goto free_sched;
+		}
+
+		qopt.preemptible_queues = preempt;
+
+		err = dev->netdev_ops->ndo_setup_tc(dev, TC_SETUP_PREEMPT,
+						    &qopt);
+		if (err)
+			goto free_sched;
+
+		q->preemptible_queues = preempt;
+	}
+
 	if (FULL_OFFLOAD_IS_ENABLED(q->flags))
 		err = taprio_enable_offload(dev, q, new_admin, extack);
 	else
@@ -1650,6 +1678,7 @@ static int taprio_init(struct Qdisc *sch, struct nlattr *opt,
 	 */
 	q->clockid = -1;
 	q->flags = TAPRIO_FLAGS_INVALID;
+	q->preemptible_queues = U32_MAX;
 
 	spin_lock(&taprio_list_lock);
 	list_add(&q->taprio_list, &taprio_list);
@@ -1833,6 +1862,10 @@ static int taprio_dump(struct Qdisc *sch, struct sk_buff *skb)
 	if (q->flags && nla_put_u32(skb, TCA_TAPRIO_ATTR_FLAGS, q->flags))
 		goto options_error;
 
+	if (q->preemptible_queues != U32_MAX &&
+	    nla_put_u32(skb, TCA_TAPRIO_ATTR_PREEMPT_QUEUES, q->preemptible_queues))
+		goto options_error;
+
 	if (q->txtime_delay &&
 	    nla_put_u32(skb, TCA_TAPRIO_ATTR_TXTIME_DELAY, q->txtime_delay))
 		goto options_error;
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [Intel-wired-lan] [PATCH net-next v1 2/9] taprio: Add support for frame preemption offload
@ 2020-12-02  4:53   ` Vinicius Costa Gomes
  0 siblings, 0 replies; 54+ messages in thread
From: Vinicius Costa Gomes @ 2020-12-02  4:53 UTC (permalink / raw)
  To: intel-wired-lan

This adds a way to configure which queues are marked as preemptible
and which are marked as express.

Even if this is not a "real" offload, because it can't be executed
purely in software, having this information near where the mapping of
queues is specified, makes it, hopefully, easier to understand.

Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
---
 include/linux/netdevice.h      |  1 +
 include/net/pkt_sched.h        |  4 ++++
 include/uapi/linux/pkt_sched.h |  1 +
 net/sched/sch_taprio.c         | 41 ++++++++++++++++++++++++++++++----
 4 files changed, 43 insertions(+), 4 deletions(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 8eeb73ac58bd..d7a99e769e79 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -852,6 +852,7 @@ enum tc_setup_type {
 	TC_SETUP_QDISC_ETS,
 	TC_SETUP_QDISC_TBF,
 	TC_SETUP_QDISC_FIFO,
+	TC_SETUP_PREEMPT,
 };
 
 /* These structures hold the attributes of bpf state that are being passed
diff --git a/include/net/pkt_sched.h b/include/net/pkt_sched.h
index 15b1b30f454e..be5ff1535332 100644
--- a/include/net/pkt_sched.h
+++ b/include/net/pkt_sched.h
@@ -183,6 +183,10 @@ struct tc_taprio_qopt_offload {
 	struct tc_taprio_sched_entry entries[];
 };
 
+struct tc_preempt_qopt_offload {
+	u32 preemptible_queues;
+};
+
 /* Reference counting */
 struct tc_taprio_qopt_offload *taprio_offload_get(struct tc_taprio_qopt_offload
 						  *offload);
diff --git a/include/uapi/linux/pkt_sched.h b/include/uapi/linux/pkt_sched.h
index 9e7c2c607845..f0240ddaeee3 100644
--- a/include/uapi/linux/pkt_sched.h
+++ b/include/uapi/linux/pkt_sched.h
@@ -1240,6 +1240,7 @@ enum {
 	TCA_TAPRIO_ATTR_SCHED_CYCLE_TIME_EXTENSION, /* s64 */
 	TCA_TAPRIO_ATTR_FLAGS, /* u32 */
 	TCA_TAPRIO_ATTR_TXTIME_DELAY, /* u32 */
+	TCA_TAPRIO_ATTR_PREEMPT_QUEUES, /* u32 */
 	__TCA_TAPRIO_ATTR_MAX,
 };
 
diff --git a/net/sched/sch_taprio.c b/net/sched/sch_taprio.c
index 26fb8a62996b..c482c5d211bb 100644
--- a/net/sched/sch_taprio.c
+++ b/net/sched/sch_taprio.c
@@ -64,6 +64,7 @@ struct taprio_sched {
 	struct Qdisc **qdiscs;
 	struct Qdisc *root;
 	u32 flags;
+	u32 preemptible_queues;
 	enum tk_offsets tk_offset;
 	int clockid;
 	atomic64_t picos_per_byte; /* Using picoseconds because for 10Gbps+
@@ -776,6 +777,7 @@ static const struct nla_policy taprio_policy[TCA_TAPRIO_ATTR_MAX + 1] = {
 	[TCA_TAPRIO_ATTR_SCHED_CYCLE_TIME_EXTENSION] = { .type = NLA_S64 },
 	[TCA_TAPRIO_ATTR_FLAGS]                      = { .type = NLA_U32 },
 	[TCA_TAPRIO_ATTR_TXTIME_DELAY]		     = { .type = NLA_U32 },
+	[TCA_TAPRIO_ATTR_PREEMPT_QUEUES]	     = { .type = NLA_U32 },
 };
 
 static int fill_sched_entry(struct taprio_sched *q, struct nlattr **tb,
@@ -1268,6 +1270,7 @@ static int taprio_disable_offload(struct net_device *dev,
 				  struct netlink_ext_ack *extack)
 {
 	const struct net_device_ops *ops = dev->netdev_ops;
+	struct tc_preempt_qopt_offload preempt = { };
 	struct tc_taprio_qopt_offload *offload;
 	int err;
 
@@ -1286,13 +1289,15 @@ static int taprio_disable_offload(struct net_device *dev,
 	offload->enable = 0;
 
 	err = ops->ndo_setup_tc(dev, TC_SETUP_QDISC_TAPRIO, offload);
-	if (err < 0) {
+	if (err < 0)
+		NL_SET_ERR_MSG(extack,
+			       "Device failed to disable offload");
+
+	err = ops->ndo_setup_tc(dev, TC_SETUP_PREEMPT, &preempt);
+	if (err < 0)
 		NL_SET_ERR_MSG(extack,
 			       "Device failed to disable offload");
-		goto out;
-	}
 
-out:
 	taprio_offload_free(offload);
 
 	return err;
@@ -1509,6 +1514,29 @@ static int taprio_change(struct Qdisc *sch, struct nlattr *opt,
 					       mqprio->prio_tc_map[i]);
 	}
 
+	/* It's valid to enable frame preemption without any kind of
+	 * offloading being enabled, so keep it separated.
+	 */
+	if (tb[TCA_TAPRIO_ATTR_PREEMPT_QUEUES]) {
+		u32 preempt = nla_get_u32(tb[TCA_TAPRIO_ATTR_PREEMPT_QUEUES]);
+		struct tc_preempt_qopt_offload qopt = { };
+
+		if (preempt == U32_MAX) {
+			NL_SET_ERR_MSG(extack, "At least one queue must be not be preemptible");
+			err = -EINVAL;
+			goto free_sched;
+		}
+
+		qopt.preemptible_queues = preempt;
+
+		err = dev->netdev_ops->ndo_setup_tc(dev, TC_SETUP_PREEMPT,
+						    &qopt);
+		if (err)
+			goto free_sched;
+
+		q->preemptible_queues = preempt;
+	}
+
 	if (FULL_OFFLOAD_IS_ENABLED(q->flags))
 		err = taprio_enable_offload(dev, q, new_admin, extack);
 	else
@@ -1650,6 +1678,7 @@ static int taprio_init(struct Qdisc *sch, struct nlattr *opt,
 	 */
 	q->clockid = -1;
 	q->flags = TAPRIO_FLAGS_INVALID;
+	q->preemptible_queues = U32_MAX;
 
 	spin_lock(&taprio_list_lock);
 	list_add(&q->taprio_list, &taprio_list);
@@ -1833,6 +1862,10 @@ static int taprio_dump(struct Qdisc *sch, struct sk_buff *skb)
 	if (q->flags && nla_put_u32(skb, TCA_TAPRIO_ATTR_FLAGS, q->flags))
 		goto options_error;
 
+	if (q->preemptible_queues != U32_MAX &&
+	    nla_put_u32(skb, TCA_TAPRIO_ATTR_PREEMPT_QUEUES, q->preemptible_queues))
+		goto options_error;
+
 	if (q->txtime_delay &&
 	    nla_put_u32(skb, TCA_TAPRIO_ATTR_TXTIME_DELAY, q->txtime_delay))
 		goto options_error;
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH net-next v1 3/9] igc: Set the RX packet buffer size for TSN mode
  2020-12-02  4:53 ` [Intel-wired-lan] " Vinicius Costa Gomes
@ 2020-12-02  4:53   ` Vinicius Costa Gomes
  -1 siblings, 0 replies; 54+ messages in thread
From: Vinicius Costa Gomes @ 2020-12-02  4:53 UTC (permalink / raw)
  To: netdev
  Cc: Vinicius Costa Gomes, jhs, xiyou.wangcong, jiri, kuba,
	m-karicheri2, vladimir.oltean, Jose.Abreu, po.liu,
	intel-wired-lan, anthony.l.nguyen

In preparation for supporting frame preemption, when entering TSN mode
set the receive packet buffer to 16KB for the Express MAC, 16KB for
the Preemptible MAC and 2KB for the BMC, according to the datasheet
section 7.1.3.2.

Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
---
 drivers/net/ethernet/intel/igc/igc_defines.h |  2 ++
 drivers/net/ethernet/intel/igc/igc_tsn.c     | 14 ++++++++++++--
 2 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/igc/igc_defines.h b/drivers/net/ethernet/intel/igc/igc_defines.h
index 32f5fd684139..0e78abfd99ee 100644
--- a/drivers/net/ethernet/intel/igc/igc_defines.h
+++ b/drivers/net/ethernet/intel/igc/igc_defines.h
@@ -351,6 +351,8 @@
 #define IGC_RXPBS_CFG_TS_EN	0x80000000 /* Timestamp in Rx buffer */
 
 #define IGC_TXPBSIZE_TSN	0x04145145 /* 5k bytes buffer for each queue */
+#define IGC_RXPBSIZE_TSN	0x00010090 /* 16KB for EXP + 16KB for BE + 2KB for BMC */
+#define IGC_RXPBSIZE_SIZE_MASK	0x0001FFFF
 
 #define IGC_DTXMXPKTSZ_TSN	0x19 /* 1600 bytes of max TX DMA packet size */
 #define IGC_DTXMXPKTSZ_DEFAULT	0x98 /* 9728-byte Jumbo frames */
diff --git a/drivers/net/ethernet/intel/igc/igc_tsn.c b/drivers/net/ethernet/intel/igc/igc_tsn.c
index 174103c4bea6..38451cf05ac6 100644
--- a/drivers/net/ethernet/intel/igc/igc_tsn.c
+++ b/drivers/net/ethernet/intel/igc/igc_tsn.c
@@ -24,7 +24,7 @@ static bool is_any_launchtime(struct igc_adapter *adapter)
 static int igc_tsn_disable_offload(struct igc_adapter *adapter)
 {
 	struct igc_hw *hw = &adapter->hw;
-	u32 tqavctrl;
+	u32 tqavctrl, rxpbs;
 	int i;
 
 	if (!(adapter->flags & IGC_FLAG_TSN_QBV_ENABLED))
@@ -35,6 +35,11 @@ static int igc_tsn_disable_offload(struct igc_adapter *adapter)
 	wr32(IGC_TXPBS, I225_TXPBSIZE_DEFAULT);
 	wr32(IGC_DTXMXPKTSZ, IGC_DTXMXPKTSZ_DEFAULT);
 
+	rxpbs = rd32(IGC_RXPBS) & ~IGC_RXPBSIZE_SIZE_MASK;
+	rxpbs |= I225_RXPBSIZE_DEFAULT;
+
+	wr32(IGC_RXPBS, rxpbs);
+
 	tqavctrl = rd32(IGC_TQAVCTRL);
 	tqavctrl &= ~(IGC_TQAVCTRL_TRANSMIT_MODE_TSN |
 		      IGC_TQAVCTRL_ENHANCED_QAV);
@@ -64,7 +69,7 @@ static int igc_tsn_enable_offload(struct igc_adapter *adapter)
 {
 	struct igc_hw *hw = &adapter->hw;
 	u32 tqavctrl, baset_l, baset_h;
-	u32 sec, nsec, cycle;
+	u32 sec, nsec, cycle, rxpbs;
 	ktime_t base_time, systim;
 	int i;
 
@@ -79,6 +84,11 @@ static int igc_tsn_enable_offload(struct igc_adapter *adapter)
 	wr32(IGC_TXPBS, IGC_TXPBSIZE_TSN);
 
 	tqavctrl = rd32(IGC_TQAVCTRL);
+	rxpbs = rd32(IGC_RXPBS) & ~IGC_RXPBSIZE_SIZE_MASK;
+	rxpbs |= IGC_RXPBSIZE_TSN;
+
+	wr32(IGC_RXPBS, rxpbs);
+
 	tqavctrl |= IGC_TQAVCTRL_TRANSMIT_MODE_TSN | IGC_TQAVCTRL_ENHANCED_QAV;
 	wr32(IGC_TQAVCTRL, tqavctrl);
 
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [Intel-wired-lan] [PATCH net-next v1 3/9] igc: Set the RX packet buffer size for TSN mode
@ 2020-12-02  4:53   ` Vinicius Costa Gomes
  0 siblings, 0 replies; 54+ messages in thread
From: Vinicius Costa Gomes @ 2020-12-02  4:53 UTC (permalink / raw)
  To: intel-wired-lan

In preparation for supporting frame preemption, when entering TSN mode
set the receive packet buffer to 16KB for the Express MAC, 16KB for
the Preemptible MAC and 2KB for the BMC, according to the datasheet
section 7.1.3.2.

Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
---
 drivers/net/ethernet/intel/igc/igc_defines.h |  2 ++
 drivers/net/ethernet/intel/igc/igc_tsn.c     | 14 ++++++++++++--
 2 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/igc/igc_defines.h b/drivers/net/ethernet/intel/igc/igc_defines.h
index 32f5fd684139..0e78abfd99ee 100644
--- a/drivers/net/ethernet/intel/igc/igc_defines.h
+++ b/drivers/net/ethernet/intel/igc/igc_defines.h
@@ -351,6 +351,8 @@
 #define IGC_RXPBS_CFG_TS_EN	0x80000000 /* Timestamp in Rx buffer */
 
 #define IGC_TXPBSIZE_TSN	0x04145145 /* 5k bytes buffer for each queue */
+#define IGC_RXPBSIZE_TSN	0x00010090 /* 16KB for EXP + 16KB for BE + 2KB for BMC */
+#define IGC_RXPBSIZE_SIZE_MASK	0x0001FFFF
 
 #define IGC_DTXMXPKTSZ_TSN	0x19 /* 1600 bytes of max TX DMA packet size */
 #define IGC_DTXMXPKTSZ_DEFAULT	0x98 /* 9728-byte Jumbo frames */
diff --git a/drivers/net/ethernet/intel/igc/igc_tsn.c b/drivers/net/ethernet/intel/igc/igc_tsn.c
index 174103c4bea6..38451cf05ac6 100644
--- a/drivers/net/ethernet/intel/igc/igc_tsn.c
+++ b/drivers/net/ethernet/intel/igc/igc_tsn.c
@@ -24,7 +24,7 @@ static bool is_any_launchtime(struct igc_adapter *adapter)
 static int igc_tsn_disable_offload(struct igc_adapter *adapter)
 {
 	struct igc_hw *hw = &adapter->hw;
-	u32 tqavctrl;
+	u32 tqavctrl, rxpbs;
 	int i;
 
 	if (!(adapter->flags & IGC_FLAG_TSN_QBV_ENABLED))
@@ -35,6 +35,11 @@ static int igc_tsn_disable_offload(struct igc_adapter *adapter)
 	wr32(IGC_TXPBS, I225_TXPBSIZE_DEFAULT);
 	wr32(IGC_DTXMXPKTSZ, IGC_DTXMXPKTSZ_DEFAULT);
 
+	rxpbs = rd32(IGC_RXPBS) & ~IGC_RXPBSIZE_SIZE_MASK;
+	rxpbs |= I225_RXPBSIZE_DEFAULT;
+
+	wr32(IGC_RXPBS, rxpbs);
+
 	tqavctrl = rd32(IGC_TQAVCTRL);
 	tqavctrl &= ~(IGC_TQAVCTRL_TRANSMIT_MODE_TSN |
 		      IGC_TQAVCTRL_ENHANCED_QAV);
@@ -64,7 +69,7 @@ static int igc_tsn_enable_offload(struct igc_adapter *adapter)
 {
 	struct igc_hw *hw = &adapter->hw;
 	u32 tqavctrl, baset_l, baset_h;
-	u32 sec, nsec, cycle;
+	u32 sec, nsec, cycle, rxpbs;
 	ktime_t base_time, systim;
 	int i;
 
@@ -79,6 +84,11 @@ static int igc_tsn_enable_offload(struct igc_adapter *adapter)
 	wr32(IGC_TXPBS, IGC_TXPBSIZE_TSN);
 
 	tqavctrl = rd32(IGC_TQAVCTRL);
+	rxpbs = rd32(IGC_RXPBS) & ~IGC_RXPBSIZE_SIZE_MASK;
+	rxpbs |= IGC_RXPBSIZE_TSN;
+
+	wr32(IGC_RXPBS, rxpbs);
+
 	tqavctrl |= IGC_TQAVCTRL_TRANSMIT_MODE_TSN | IGC_TQAVCTRL_ENHANCED_QAV;
 	wr32(IGC_TQAVCTRL, tqavctrl);
 
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH net-next v1 4/9] igc: Only dump registers if configured to dump HW information
  2020-12-02  4:53 ` [Intel-wired-lan] " Vinicius Costa Gomes
@ 2020-12-02  4:53   ` Vinicius Costa Gomes
  -1 siblings, 0 replies; 54+ messages in thread
From: Vinicius Costa Gomes @ 2020-12-02  4:53 UTC (permalink / raw)
  To: netdev
  Cc: Vinicius Costa Gomes, jhs, xiyou.wangcong, jiri, kuba,
	m-karicheri2, vladimir.oltean, Jose.Abreu, po.liu,
	intel-wired-lan, anthony.l.nguyen

To avoid polluting the users logs with register dumps, only dump the
adapter's registers if configured to do so.

If users want to enable HW status messages they can do:

$ ethtool -s IFACE msglvl hw on

Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
---
 drivers/net/ethernet/intel/igc/igc_dump.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/net/ethernet/intel/igc/igc_dump.c b/drivers/net/ethernet/intel/igc/igc_dump.c
index 4b9ec7d0b727..90b754b429ff 100644
--- a/drivers/net/ethernet/intel/igc/igc_dump.c
+++ b/drivers/net/ethernet/intel/igc/igc_dump.c
@@ -308,6 +308,9 @@ void igc_regs_dump(struct igc_adapter *adapter)
 	struct igc_hw *hw = &adapter->hw;
 	struct igc_reg_info *reginfo;
 
+	if (!netif_msg_hw(adapter))
+		return;
+
 	/* Print Registers */
 	netdev_info(adapter->netdev, "Register Dump\n");
 	netdev_info(adapter->netdev, "Register Name   Value\n");
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [Intel-wired-lan] [PATCH net-next v1 4/9] igc: Only dump registers if configured to dump HW information
@ 2020-12-02  4:53   ` Vinicius Costa Gomes
  0 siblings, 0 replies; 54+ messages in thread
From: Vinicius Costa Gomes @ 2020-12-02  4:53 UTC (permalink / raw)
  To: intel-wired-lan

To avoid polluting the users logs with register dumps, only dump the
adapter's registers if configured to do so.

If users want to enable HW status messages they can do:

$ ethtool -s IFACE msglvl hw on

Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
---
 drivers/net/ethernet/intel/igc/igc_dump.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/net/ethernet/intel/igc/igc_dump.c b/drivers/net/ethernet/intel/igc/igc_dump.c
index 4b9ec7d0b727..90b754b429ff 100644
--- a/drivers/net/ethernet/intel/igc/igc_dump.c
+++ b/drivers/net/ethernet/intel/igc/igc_dump.c
@@ -308,6 +308,9 @@ void igc_regs_dump(struct igc_adapter *adapter)
 	struct igc_hw *hw = &adapter->hw;
 	struct igc_reg_info *reginfo;
 
+	if (!netif_msg_hw(adapter))
+		return;
+
 	/* Print Registers */
 	netdev_info(adapter->netdev, "Register Dump\n");
 	netdev_info(adapter->netdev, "Register Name   Value\n");
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH net-next v1 5/9] igc: Avoid TX Hangs because long cycles
  2020-12-02  4:53 ` [Intel-wired-lan] " Vinicius Costa Gomes
@ 2020-12-02  4:53   ` Vinicius Costa Gomes
  -1 siblings, 0 replies; 54+ messages in thread
From: Vinicius Costa Gomes @ 2020-12-02  4:53 UTC (permalink / raw)
  To: netdev
  Cc: Vinicius Costa Gomes, jhs, xiyou.wangcong, jiri, kuba,
	m-karicheri2, vladimir.oltean, Jose.Abreu, po.liu,
	intel-wired-lan, anthony.l.nguyen

Avoid possible TX Hangs caused by using long Qbv cycles. In some
cases, using long cycles (more than 1 second) can cause transmissions
to be blocked for that time. As the TX Hang timeout is close to 1
second, we may need to reduce the cycle time to something more
reasonable: the value chosen is 1ms.

Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
---
 drivers/net/ethernet/intel/igc/igc_main.c | 4 ++--
 drivers/net/ethernet/intel/igc/igc_tsn.c  | 6 +++---
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c
index b673ac1199bb..3af54fd9cc0d 100644
--- a/drivers/net/ethernet/intel/igc/igc_main.c
+++ b/drivers/net/ethernet/intel/igc/igc_main.c
@@ -4692,12 +4692,12 @@ static int igc_save_launchtime_params(struct igc_adapter *adapter, int queue,
 	if (adapter->base_time)
 		return 0;
 
-	adapter->cycle_time = NSEC_PER_SEC;
+	adapter->cycle_time = NSEC_PER_MSEC;
 
 	for (i = 0; i < adapter->num_tx_queues; i++) {
 		ring = adapter->tx_ring[i];
 		ring->start_time = 0;
-		ring->end_time = NSEC_PER_SEC;
+		ring->end_time = NSEC_PER_MSEC;
 	}
 
 	return 0;
diff --git a/drivers/net/ethernet/intel/igc/igc_tsn.c b/drivers/net/ethernet/intel/igc/igc_tsn.c
index 38451cf05ac6..f5a5527adb21 100644
--- a/drivers/net/ethernet/intel/igc/igc_tsn.c
+++ b/drivers/net/ethernet/intel/igc/igc_tsn.c
@@ -54,11 +54,11 @@ static int igc_tsn_disable_offload(struct igc_adapter *adapter)
 
 		wr32(IGC_TXQCTL(i), 0);
 		wr32(IGC_STQT(i), 0);
-		wr32(IGC_ENDQT(i), NSEC_PER_SEC);
+		wr32(IGC_ENDQT(i), NSEC_PER_MSEC);
 	}
 
-	wr32(IGC_QBVCYCLET_S, NSEC_PER_SEC);
-	wr32(IGC_QBVCYCLET, NSEC_PER_SEC);
+	wr32(IGC_QBVCYCLET_S, NSEC_PER_MSEC);
+	wr32(IGC_QBVCYCLET, NSEC_PER_MSEC);
 
 	adapter->flags &= ~IGC_FLAG_TSN_QBV_ENABLED;
 
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [Intel-wired-lan] [PATCH net-next v1 5/9] igc: Avoid TX Hangs because long cycles
@ 2020-12-02  4:53   ` Vinicius Costa Gomes
  0 siblings, 0 replies; 54+ messages in thread
From: Vinicius Costa Gomes @ 2020-12-02  4:53 UTC (permalink / raw)
  To: intel-wired-lan

Avoid possible TX Hangs caused by using long Qbv cycles. In some
cases, using long cycles (more than 1 second) can cause transmissions
to be blocked for that time. As the TX Hang timeout is close to 1
second, we may need to reduce the cycle time to something more
reasonable: the value chosen is 1ms.

Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
---
 drivers/net/ethernet/intel/igc/igc_main.c | 4 ++--
 drivers/net/ethernet/intel/igc/igc_tsn.c  | 6 +++---
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c
index b673ac1199bb..3af54fd9cc0d 100644
--- a/drivers/net/ethernet/intel/igc/igc_main.c
+++ b/drivers/net/ethernet/intel/igc/igc_main.c
@@ -4692,12 +4692,12 @@ static int igc_save_launchtime_params(struct igc_adapter *adapter, int queue,
 	if (adapter->base_time)
 		return 0;
 
-	adapter->cycle_time = NSEC_PER_SEC;
+	adapter->cycle_time = NSEC_PER_MSEC;
 
 	for (i = 0; i < adapter->num_tx_queues; i++) {
 		ring = adapter->tx_ring[i];
 		ring->start_time = 0;
-		ring->end_time = NSEC_PER_SEC;
+		ring->end_time = NSEC_PER_MSEC;
 	}
 
 	return 0;
diff --git a/drivers/net/ethernet/intel/igc/igc_tsn.c b/drivers/net/ethernet/intel/igc/igc_tsn.c
index 38451cf05ac6..f5a5527adb21 100644
--- a/drivers/net/ethernet/intel/igc/igc_tsn.c
+++ b/drivers/net/ethernet/intel/igc/igc_tsn.c
@@ -54,11 +54,11 @@ static int igc_tsn_disable_offload(struct igc_adapter *adapter)
 
 		wr32(IGC_TXQCTL(i), 0);
 		wr32(IGC_STQT(i), 0);
-		wr32(IGC_ENDQT(i), NSEC_PER_SEC);
+		wr32(IGC_ENDQT(i), NSEC_PER_MSEC);
 	}
 
-	wr32(IGC_QBVCYCLET_S, NSEC_PER_SEC);
-	wr32(IGC_QBVCYCLET, NSEC_PER_SEC);
+	wr32(IGC_QBVCYCLET_S, NSEC_PER_MSEC);
+	wr32(IGC_QBVCYCLET, NSEC_PER_MSEC);
 
 	adapter->flags &= ~IGC_FLAG_TSN_QBV_ENABLED;
 
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH net-next v1 6/9] igc: Add support for tuning frame preemption via ethtool
  2020-12-02  4:53 ` [Intel-wired-lan] " Vinicius Costa Gomes
@ 2020-12-02  4:53   ` Vinicius Costa Gomes
  -1 siblings, 0 replies; 54+ messages in thread
From: Vinicius Costa Gomes @ 2020-12-02  4:53 UTC (permalink / raw)
  To: netdev
  Cc: Vinicius Costa Gomes, jhs, xiyou.wangcong, jiri, kuba,
	m-karicheri2, vladimir.oltean, Jose.Abreu, po.liu,
	intel-wired-lan, anthony.l.nguyen

The tc subsystem sets which queues are marked as preemptible, it's the
role of ethtool to control more hardware specific parameters. These
parameters include:

 - enabling the frame preemption hardware: As enabling frame
 preemption may have other requirements before it can be enabled, it's
 exposed via the ethtool API;

 - mininum fragment size multiplier: expressed in usually in the form
 of (1 + N)*64, this number indicates what's the size of the minimum
 fragment that can be preempted.

Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
---
 drivers/net/ethernet/intel/igc/igc.h         | 10 ++++
 drivers/net/ethernet/intel/igc/igc_defines.h |  4 ++
 drivers/net/ethernet/intel/igc/igc_ethtool.c | 55 ++++++++++++++++++++
 drivers/net/ethernet/intel/igc/igc_tsn.c     | 20 +++++--
 4 files changed, 86 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/intel/igc/igc.h b/drivers/net/ethernet/intel/igc/igc.h
index 35baae900c1f..35c2dc4ad810 100644
--- a/drivers/net/ethernet/intel/igc/igc.h
+++ b/drivers/net/ethernet/intel/igc/igc.h
@@ -87,6 +87,7 @@ struct igc_ring {
 	u8 queue_index;                 /* logical index of the ring*/
 	u8 reg_idx;                     /* physical index of the ring */
 	bool launchtime_enable;         /* true if LaunchTime is enabled */
+	bool preemptible;		/* true if not express */
 
 	u32 start_time;
 	u32 end_time;
@@ -165,6 +166,11 @@ struct igc_adapter {
 
 	ktime_t base_time;
 	ktime_t cycle_time;
+	bool frame_preemption_active;
+	/* Frame preemption minimum fragment size, this the X in
+	 * (1 + X)*64 + 4
+	 */
+	u8 min_frag_size_mult;
 
 	/* OS defined structs */
 	struct pci_dev *pdev;
@@ -262,6 +268,10 @@ extern char igc_driver_name[];
 #define IGC_FLAG_VLAN_PROMISC		BIT(15)
 #define IGC_FLAG_RX_LEGACY		BIT(16)
 #define IGC_FLAG_TSN_QBV_ENABLED	BIT(17)
+#define IGC_FLAG_TSN_PREEMPT_ENABLED	BIT(18)
+
+#define IGC_FLAG_TSN_ANY_ENABLED \
+	(IGC_FLAG_TSN_QBV_ENABLED | IGC_FLAG_TSN_PREEMPT_ENABLED)
 
 #define IGC_FLAG_RSS_FIELD_IPV4_UDP	BIT(6)
 #define IGC_FLAG_RSS_FIELD_IPV6_UDP	BIT(7)
diff --git a/drivers/net/ethernet/intel/igc/igc_defines.h b/drivers/net/ethernet/intel/igc/igc_defines.h
index 0e78abfd99ee..c2155d109bd6 100644
--- a/drivers/net/ethernet/intel/igc/igc_defines.h
+++ b/drivers/net/ethernet/intel/igc/igc_defines.h
@@ -410,10 +410,14 @@
 /* Transmit Scheduling */
 #define IGC_TQAVCTRL_TRANSMIT_MODE_TSN	0x00000001
 #define IGC_TQAVCTRL_ENHANCED_QAV	0x00000008
+#define IGC_TQAVCTRL_PREEMPT_ENA	0x00000002
+#define IGC_TQAVCTRL_MIN_FRAG_MASK	0x0000C000
+#define IGC_TQAVCTRL_MIN_FRAG_SHIFT	14
 
 #define IGC_TXQCTL_QUEUE_MODE_LAUNCHT	0x00000001
 #define IGC_TXQCTL_STRICT_CYCLE		0x00000002
 #define IGC_TXQCTL_STRICT_END		0x00000004
+#define IGC_TXQCTL_PREEMPTABLE		0x00000008
 
 /* Receive Checksum Control */
 #define IGC_RXCSUM_CRCOFL	0x00000800   /* CRC32 offload enable */
diff --git a/drivers/net/ethernet/intel/igc/igc_ethtool.c b/drivers/net/ethernet/intel/igc/igc_ethtool.c
index 61d331ce38cd..492c6592c150 100644
--- a/drivers/net/ethernet/intel/igc/igc_ethtool.c
+++ b/drivers/net/ethernet/intel/igc/igc_ethtool.c
@@ -8,6 +8,7 @@
 
 #include "igc.h"
 #include "igc_diag.h"
+#include "igc_tsn.h"
 
 /* forward declaration */
 struct igc_stats {
@@ -1636,6 +1637,58 @@ static int igc_ethtool_set_eee(struct net_device *netdev,
 	return 0;
 }
 
+static int igc_ethtool_get_preempt(struct net_device *netdev,
+				   struct ethtool_fp *fpcmd)
+{
+	struct igc_adapter *adapter = netdev_priv(netdev);
+
+	fpcmd->enabled = adapter->frame_preemption_active;
+	fpcmd->min_frag_size_mult = adapter->min_frag_size_mult;
+
+	return 0;
+}
+
+static int igc_ethtool_set_preempt(struct net_device *netdev,
+				   struct ethtool_fp *fpcmd)
+{
+	struct igc_adapter *adapter = netdev_priv(netdev);
+	int i;
+
+	/* The formula is (Section 8.12.4 of the datasheet):
+	 *   MIN_FRAG_SIZE = 4 + (1 + MIN_FRAG)*64
+	 * MIN_FRAG is represented by two bits, so we can only have
+	 * min_frag_size between 68 and 260.
+	 */
+	if (fpcmd->min_frag_size_mult > 3)
+		return -EINVAL;
+
+	adapter->frame_preemption_active = fpcmd->enabled;
+	adapter->min_frag_size_mult = fpcmd->min_frag_size_mult;
+
+	if (!adapter->frame_preemption_active)
+		goto done;
+
+	/* Enabling frame preemption requires TSN mode to be enabled,
+	 * which requires a schedule to be active. So, if there isn't
+	 * a schedule already configured, configure a simple one, with
+	 * all queues open, with 1ms cycle time.
+	 */
+	if (adapter->base_time)
+		goto done;
+
+	adapter->cycle_time = NSEC_PER_MSEC;
+
+	for (i = 0; i < adapter->num_tx_queues; i++) {
+		struct igc_ring *ring = adapter->tx_ring[i];
+
+		ring->start_time = 0;
+		ring->end_time = NSEC_PER_MSEC;
+	}
+
+done:
+	return igc_tsn_offload_apply(adapter);
+}
+
 static int igc_ethtool_begin(struct net_device *netdev)
 {
 	struct igc_adapter *adapter = netdev_priv(netdev);
@@ -1915,6 +1968,8 @@ static const struct ethtool_ops igc_ethtool_ops = {
 	.get_ts_info		= igc_ethtool_get_ts_info,
 	.get_channels		= igc_ethtool_get_channels,
 	.set_channels		= igc_ethtool_set_channels,
+	.get_preempt		= igc_ethtool_get_preempt,
+	.set_preempt		= igc_ethtool_set_preempt,
 	.get_priv_flags		= igc_ethtool_get_priv_flags,
 	.set_priv_flags		= igc_ethtool_set_priv_flags,
 	.get_eee		= igc_ethtool_get_eee,
diff --git a/drivers/net/ethernet/intel/igc/igc_tsn.c b/drivers/net/ethernet/intel/igc/igc_tsn.c
index f5a5527adb21..76272f62bb69 100644
--- a/drivers/net/ethernet/intel/igc/igc_tsn.c
+++ b/drivers/net/ethernet/intel/igc/igc_tsn.c
@@ -31,6 +31,7 @@ static int igc_tsn_disable_offload(struct igc_adapter *adapter)
 		return 0;
 
 	adapter->cycle_time = 0;
+	adapter->frame_preemption_active = false;
 
 	wr32(IGC_TXPBS, I225_TXPBSIZE_DEFAULT);
 	wr32(IGC_DTXMXPKTSZ, IGC_DTXMXPKTSZ_DEFAULT);
@@ -42,7 +43,8 @@ static int igc_tsn_disable_offload(struct igc_adapter *adapter)
 
 	tqavctrl = rd32(IGC_TQAVCTRL);
 	tqavctrl &= ~(IGC_TQAVCTRL_TRANSMIT_MODE_TSN |
-		      IGC_TQAVCTRL_ENHANCED_QAV);
+		      IGC_TQAVCTRL_ENHANCED_QAV | IGC_TQAVCTRL_PREEMPT_ENA |
+		      IGC_TQAVCTRL_MIN_FRAG_MASK);
 	wr32(IGC_TQAVCTRL, tqavctrl);
 
 	for (i = 0; i < adapter->num_tx_queues; i++) {
@@ -51,6 +53,7 @@ static int igc_tsn_disable_offload(struct igc_adapter *adapter)
 		ring->start_time = 0;
 		ring->end_time = 0;
 		ring->launchtime_enable = false;
+		ring->preemptible = false;
 
 		wr32(IGC_TXQCTL(i), 0);
 		wr32(IGC_STQT(i), 0);
@@ -83,13 +86,20 @@ static int igc_tsn_enable_offload(struct igc_adapter *adapter)
 	wr32(IGC_DTXMXPKTSZ, IGC_DTXMXPKTSZ_TSN);
 	wr32(IGC_TXPBS, IGC_TXPBSIZE_TSN);
 
-	tqavctrl = rd32(IGC_TQAVCTRL);
 	rxpbs = rd32(IGC_RXPBS) & ~IGC_RXPBSIZE_SIZE_MASK;
 	rxpbs |= IGC_RXPBSIZE_TSN;
 
 	wr32(IGC_RXPBS, rxpbs);
 
+	tqavctrl = rd32(IGC_TQAVCTRL) &
+		~(IGC_TQAVCTRL_MIN_FRAG_MASK | IGC_TQAVCTRL_PREEMPT_ENA);
 	tqavctrl |= IGC_TQAVCTRL_TRANSMIT_MODE_TSN | IGC_TQAVCTRL_ENHANCED_QAV;
+
+	if (adapter->frame_preemption_active)
+		tqavctrl |= IGC_TQAVCTRL_PREEMPT_ENA;
+
+	tqavctrl |= adapter->min_frag_size_mult << IGC_TQAVCTRL_MIN_FRAG_SHIFT;
+
 	wr32(IGC_TQAVCTRL, tqavctrl);
 
 	wr32(IGC_QBVCYCLET_S, cycle);
@@ -115,6 +125,9 @@ static int igc_tsn_enable_offload(struct igc_adapter *adapter)
 		if (ring->launchtime_enable)
 			txqctl |= IGC_TXQCTL_QUEUE_MODE_LAUNCHT;
 
+		if (ring->preemptible)
+			txqctl |= IGC_TXQCTL_PREEMPTABLE;
+
 		wr32(IGC_TXQCTL(i), txqctl);
 	}
 
@@ -142,7 +155,8 @@ static int igc_tsn_enable_offload(struct igc_adapter *adapter)
 
 int igc_tsn_offload_apply(struct igc_adapter *adapter)
 {
-	bool is_any_enabled = adapter->base_time || is_any_launchtime(adapter);
+	bool is_any_enabled = adapter->base_time ||
+		is_any_launchtime(adapter) || adapter->frame_preemption_active;
 
 	if (!(adapter->flags & IGC_FLAG_TSN_QBV_ENABLED) && !is_any_enabled)
 		return 0;
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [Intel-wired-lan] [PATCH net-next v1 6/9] igc: Add support for tuning frame preemption via ethtool
@ 2020-12-02  4:53   ` Vinicius Costa Gomes
  0 siblings, 0 replies; 54+ messages in thread
From: Vinicius Costa Gomes @ 2020-12-02  4:53 UTC (permalink / raw)
  To: intel-wired-lan

The tc subsystem sets which queues are marked as preemptible, it's the
role of ethtool to control more hardware specific parameters. These
parameters include:

 - enabling the frame preemption hardware: As enabling frame
 preemption may have other requirements before it can be enabled, it's
 exposed via the ethtool API;

 - mininum fragment size multiplier: expressed in usually in the form
 of (1 + N)*64, this number indicates what's the size of the minimum
 fragment that can be preempted.

Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
---
 drivers/net/ethernet/intel/igc/igc.h         | 10 ++++
 drivers/net/ethernet/intel/igc/igc_defines.h |  4 ++
 drivers/net/ethernet/intel/igc/igc_ethtool.c | 55 ++++++++++++++++++++
 drivers/net/ethernet/intel/igc/igc_tsn.c     | 20 +++++--
 4 files changed, 86 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/intel/igc/igc.h b/drivers/net/ethernet/intel/igc/igc.h
index 35baae900c1f..35c2dc4ad810 100644
--- a/drivers/net/ethernet/intel/igc/igc.h
+++ b/drivers/net/ethernet/intel/igc/igc.h
@@ -87,6 +87,7 @@ struct igc_ring {
 	u8 queue_index;                 /* logical index of the ring*/
 	u8 reg_idx;                     /* physical index of the ring */
 	bool launchtime_enable;         /* true if LaunchTime is enabled */
+	bool preemptible;		/* true if not express */
 
 	u32 start_time;
 	u32 end_time;
@@ -165,6 +166,11 @@ struct igc_adapter {
 
 	ktime_t base_time;
 	ktime_t cycle_time;
+	bool frame_preemption_active;
+	/* Frame preemption minimum fragment size, this the X in
+	 * (1 + X)*64 + 4
+	 */
+	u8 min_frag_size_mult;
 
 	/* OS defined structs */
 	struct pci_dev *pdev;
@@ -262,6 +268,10 @@ extern char igc_driver_name[];
 #define IGC_FLAG_VLAN_PROMISC		BIT(15)
 #define IGC_FLAG_RX_LEGACY		BIT(16)
 #define IGC_FLAG_TSN_QBV_ENABLED	BIT(17)
+#define IGC_FLAG_TSN_PREEMPT_ENABLED	BIT(18)
+
+#define IGC_FLAG_TSN_ANY_ENABLED \
+	(IGC_FLAG_TSN_QBV_ENABLED | IGC_FLAG_TSN_PREEMPT_ENABLED)
 
 #define IGC_FLAG_RSS_FIELD_IPV4_UDP	BIT(6)
 #define IGC_FLAG_RSS_FIELD_IPV6_UDP	BIT(7)
diff --git a/drivers/net/ethernet/intel/igc/igc_defines.h b/drivers/net/ethernet/intel/igc/igc_defines.h
index 0e78abfd99ee..c2155d109bd6 100644
--- a/drivers/net/ethernet/intel/igc/igc_defines.h
+++ b/drivers/net/ethernet/intel/igc/igc_defines.h
@@ -410,10 +410,14 @@
 /* Transmit Scheduling */
 #define IGC_TQAVCTRL_TRANSMIT_MODE_TSN	0x00000001
 #define IGC_TQAVCTRL_ENHANCED_QAV	0x00000008
+#define IGC_TQAVCTRL_PREEMPT_ENA	0x00000002
+#define IGC_TQAVCTRL_MIN_FRAG_MASK	0x0000C000
+#define IGC_TQAVCTRL_MIN_FRAG_SHIFT	14
 
 #define IGC_TXQCTL_QUEUE_MODE_LAUNCHT	0x00000001
 #define IGC_TXQCTL_STRICT_CYCLE		0x00000002
 #define IGC_TXQCTL_STRICT_END		0x00000004
+#define IGC_TXQCTL_PREEMPTABLE		0x00000008
 
 /* Receive Checksum Control */
 #define IGC_RXCSUM_CRCOFL	0x00000800   /* CRC32 offload enable */
diff --git a/drivers/net/ethernet/intel/igc/igc_ethtool.c b/drivers/net/ethernet/intel/igc/igc_ethtool.c
index 61d331ce38cd..492c6592c150 100644
--- a/drivers/net/ethernet/intel/igc/igc_ethtool.c
+++ b/drivers/net/ethernet/intel/igc/igc_ethtool.c
@@ -8,6 +8,7 @@
 
 #include "igc.h"
 #include "igc_diag.h"
+#include "igc_tsn.h"
 
 /* forward declaration */
 struct igc_stats {
@@ -1636,6 +1637,58 @@ static int igc_ethtool_set_eee(struct net_device *netdev,
 	return 0;
 }
 
+static int igc_ethtool_get_preempt(struct net_device *netdev,
+				   struct ethtool_fp *fpcmd)
+{
+	struct igc_adapter *adapter = netdev_priv(netdev);
+
+	fpcmd->enabled = adapter->frame_preemption_active;
+	fpcmd->min_frag_size_mult = adapter->min_frag_size_mult;
+
+	return 0;
+}
+
+static int igc_ethtool_set_preempt(struct net_device *netdev,
+				   struct ethtool_fp *fpcmd)
+{
+	struct igc_adapter *adapter = netdev_priv(netdev);
+	int i;
+
+	/* The formula is (Section 8.12.4 of the datasheet):
+	 *   MIN_FRAG_SIZE = 4 + (1 + MIN_FRAG)*64
+	 * MIN_FRAG is represented by two bits, so we can only have
+	 * min_frag_size between 68 and 260.
+	 */
+	if (fpcmd->min_frag_size_mult > 3)
+		return -EINVAL;
+
+	adapter->frame_preemption_active = fpcmd->enabled;
+	adapter->min_frag_size_mult = fpcmd->min_frag_size_mult;
+
+	if (!adapter->frame_preemption_active)
+		goto done;
+
+	/* Enabling frame preemption requires TSN mode to be enabled,
+	 * which requires a schedule to be active. So, if there isn't
+	 * a schedule already configured, configure a simple one, with
+	 * all queues open, with 1ms cycle time.
+	 */
+	if (adapter->base_time)
+		goto done;
+
+	adapter->cycle_time = NSEC_PER_MSEC;
+
+	for (i = 0; i < adapter->num_tx_queues; i++) {
+		struct igc_ring *ring = adapter->tx_ring[i];
+
+		ring->start_time = 0;
+		ring->end_time = NSEC_PER_MSEC;
+	}
+
+done:
+	return igc_tsn_offload_apply(adapter);
+}
+
 static int igc_ethtool_begin(struct net_device *netdev)
 {
 	struct igc_adapter *adapter = netdev_priv(netdev);
@@ -1915,6 +1968,8 @@ static const struct ethtool_ops igc_ethtool_ops = {
 	.get_ts_info		= igc_ethtool_get_ts_info,
 	.get_channels		= igc_ethtool_get_channels,
 	.set_channels		= igc_ethtool_set_channels,
+	.get_preempt		= igc_ethtool_get_preempt,
+	.set_preempt		= igc_ethtool_set_preempt,
 	.get_priv_flags		= igc_ethtool_get_priv_flags,
 	.set_priv_flags		= igc_ethtool_set_priv_flags,
 	.get_eee		= igc_ethtool_get_eee,
diff --git a/drivers/net/ethernet/intel/igc/igc_tsn.c b/drivers/net/ethernet/intel/igc/igc_tsn.c
index f5a5527adb21..76272f62bb69 100644
--- a/drivers/net/ethernet/intel/igc/igc_tsn.c
+++ b/drivers/net/ethernet/intel/igc/igc_tsn.c
@@ -31,6 +31,7 @@ static int igc_tsn_disable_offload(struct igc_adapter *adapter)
 		return 0;
 
 	adapter->cycle_time = 0;
+	adapter->frame_preemption_active = false;
 
 	wr32(IGC_TXPBS, I225_TXPBSIZE_DEFAULT);
 	wr32(IGC_DTXMXPKTSZ, IGC_DTXMXPKTSZ_DEFAULT);
@@ -42,7 +43,8 @@ static int igc_tsn_disable_offload(struct igc_adapter *adapter)
 
 	tqavctrl = rd32(IGC_TQAVCTRL);
 	tqavctrl &= ~(IGC_TQAVCTRL_TRANSMIT_MODE_TSN |
-		      IGC_TQAVCTRL_ENHANCED_QAV);
+		      IGC_TQAVCTRL_ENHANCED_QAV | IGC_TQAVCTRL_PREEMPT_ENA |
+		      IGC_TQAVCTRL_MIN_FRAG_MASK);
 	wr32(IGC_TQAVCTRL, tqavctrl);
 
 	for (i = 0; i < adapter->num_tx_queues; i++) {
@@ -51,6 +53,7 @@ static int igc_tsn_disable_offload(struct igc_adapter *adapter)
 		ring->start_time = 0;
 		ring->end_time = 0;
 		ring->launchtime_enable = false;
+		ring->preemptible = false;
 
 		wr32(IGC_TXQCTL(i), 0);
 		wr32(IGC_STQT(i), 0);
@@ -83,13 +86,20 @@ static int igc_tsn_enable_offload(struct igc_adapter *adapter)
 	wr32(IGC_DTXMXPKTSZ, IGC_DTXMXPKTSZ_TSN);
 	wr32(IGC_TXPBS, IGC_TXPBSIZE_TSN);
 
-	tqavctrl = rd32(IGC_TQAVCTRL);
 	rxpbs = rd32(IGC_RXPBS) & ~IGC_RXPBSIZE_SIZE_MASK;
 	rxpbs |= IGC_RXPBSIZE_TSN;
 
 	wr32(IGC_RXPBS, rxpbs);
 
+	tqavctrl = rd32(IGC_TQAVCTRL) &
+		~(IGC_TQAVCTRL_MIN_FRAG_MASK | IGC_TQAVCTRL_PREEMPT_ENA);
 	tqavctrl |= IGC_TQAVCTRL_TRANSMIT_MODE_TSN | IGC_TQAVCTRL_ENHANCED_QAV;
+
+	if (adapter->frame_preemption_active)
+		tqavctrl |= IGC_TQAVCTRL_PREEMPT_ENA;
+
+	tqavctrl |= adapter->min_frag_size_mult << IGC_TQAVCTRL_MIN_FRAG_SHIFT;
+
 	wr32(IGC_TQAVCTRL, tqavctrl);
 
 	wr32(IGC_QBVCYCLET_S, cycle);
@@ -115,6 +125,9 @@ static int igc_tsn_enable_offload(struct igc_adapter *adapter)
 		if (ring->launchtime_enable)
 			txqctl |= IGC_TXQCTL_QUEUE_MODE_LAUNCHT;
 
+		if (ring->preemptible)
+			txqctl |= IGC_TXQCTL_PREEMPTABLE;
+
 		wr32(IGC_TXQCTL(i), txqctl);
 	}
 
@@ -142,7 +155,8 @@ static int igc_tsn_enable_offload(struct igc_adapter *adapter)
 
 int igc_tsn_offload_apply(struct igc_adapter *adapter)
 {
-	bool is_any_enabled = adapter->base_time || is_any_launchtime(adapter);
+	bool is_any_enabled = adapter->base_time ||
+		is_any_launchtime(adapter) || adapter->frame_preemption_active;
 
 	if (!(adapter->flags & IGC_FLAG_TSN_QBV_ENABLED) && !is_any_enabled)
 		return 0;
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH net-next v1 7/9] igc: Add support for Frame Preemption offload
  2020-12-02  4:53 ` [Intel-wired-lan] " Vinicius Costa Gomes
@ 2020-12-02  4:53   ` Vinicius Costa Gomes
  -1 siblings, 0 replies; 54+ messages in thread
From: Vinicius Costa Gomes @ 2020-12-02  4:53 UTC (permalink / raw)
  To: netdev
  Cc: Vinicius Costa Gomes, jhs, xiyou.wangcong, jiri, kuba,
	m-karicheri2, vladimir.oltean, Jose.Abreu, po.liu,
	intel-wired-lan, anthony.l.nguyen

After the set of queues that are marked as preemptible are exposed to
the driver we can configure the hardware to enable the frame
preemption functionality.

Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
---
 drivers/net/ethernet/intel/igc/igc_main.c | 32 +++++++++++++++++++++++
 1 file changed, 32 insertions(+)

diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c
index 3af54fd9cc0d..763f39082f91 100644
--- a/drivers/net/ethernet/intel/igc/igc_main.c
+++ b/drivers/net/ethernet/intel/igc/igc_main.c
@@ -4817,6 +4817,23 @@ static int igc_save_qbv_schedule(struct igc_adapter *adapter,
 	return 0;
 }
 
+static int igc_save_frame_preemption(struct igc_adapter *adapter,
+				     struct tc_preempt_qopt_offload *qopt)
+{
+	u32 preempt;
+	int i;
+
+	preempt = qopt->preemptible_queues;
+
+	for (i = 0; i < adapter->num_tx_queues; i++) {
+		struct igc_ring *ring = adapter->tx_ring[i];
+
+		ring->preemptible = preempt & BIT(i);
+	}
+
+	return 0;
+}
+
 static int igc_tsn_enable_qbv_scheduling(struct igc_adapter *adapter,
 					 struct tc_taprio_qopt_offload *qopt)
 {
@@ -4833,6 +4850,18 @@ static int igc_tsn_enable_qbv_scheduling(struct igc_adapter *adapter,
 	return igc_tsn_offload_apply(adapter);
 }
 
+static int igc_tsn_enable_frame_preemption(struct igc_adapter *adapter,
+					   struct tc_preempt_qopt_offload *qopt)
+{
+	int err;
+
+	err = igc_save_frame_preemption(adapter, qopt);
+	if (err)
+		return err;
+
+	return igc_tsn_offload_apply(adapter);
+}
+
 static int igc_setup_tc(struct net_device *dev, enum tc_setup_type type,
 			void *type_data)
 {
@@ -4845,6 +4874,9 @@ static int igc_setup_tc(struct net_device *dev, enum tc_setup_type type,
 	case TC_SETUP_QDISC_ETF:
 		return igc_tsn_enable_launchtime(adapter, type_data);
 
+	case TC_SETUP_PREEMPT:
+		return igc_tsn_enable_frame_preemption(adapter, type_data);
+
 	default:
 		return -EOPNOTSUPP;
 	}
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [Intel-wired-lan] [PATCH net-next v1 7/9] igc: Add support for Frame Preemption offload
@ 2020-12-02  4:53   ` Vinicius Costa Gomes
  0 siblings, 0 replies; 54+ messages in thread
From: Vinicius Costa Gomes @ 2020-12-02  4:53 UTC (permalink / raw)
  To: intel-wired-lan

After the set of queues that are marked as preemptible are exposed to
the driver we can configure the hardware to enable the frame
preemption functionality.

Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
---
 drivers/net/ethernet/intel/igc/igc_main.c | 32 +++++++++++++++++++++++
 1 file changed, 32 insertions(+)

diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c
index 3af54fd9cc0d..763f39082f91 100644
--- a/drivers/net/ethernet/intel/igc/igc_main.c
+++ b/drivers/net/ethernet/intel/igc/igc_main.c
@@ -4817,6 +4817,23 @@ static int igc_save_qbv_schedule(struct igc_adapter *adapter,
 	return 0;
 }
 
+static int igc_save_frame_preemption(struct igc_adapter *adapter,
+				     struct tc_preempt_qopt_offload *qopt)
+{
+	u32 preempt;
+	int i;
+
+	preempt = qopt->preemptible_queues;
+
+	for (i = 0; i < adapter->num_tx_queues; i++) {
+		struct igc_ring *ring = adapter->tx_ring[i];
+
+		ring->preemptible = preempt & BIT(i);
+	}
+
+	return 0;
+}
+
 static int igc_tsn_enable_qbv_scheduling(struct igc_adapter *adapter,
 					 struct tc_taprio_qopt_offload *qopt)
 {
@@ -4833,6 +4850,18 @@ static int igc_tsn_enable_qbv_scheduling(struct igc_adapter *adapter,
 	return igc_tsn_offload_apply(adapter);
 }
 
+static int igc_tsn_enable_frame_preemption(struct igc_adapter *adapter,
+					   struct tc_preempt_qopt_offload *qopt)
+{
+	int err;
+
+	err = igc_save_frame_preemption(adapter, qopt);
+	if (err)
+		return err;
+
+	return igc_tsn_offload_apply(adapter);
+}
+
 static int igc_setup_tc(struct net_device *dev, enum tc_setup_type type,
 			void *type_data)
 {
@@ -4845,6 +4874,9 @@ static int igc_setup_tc(struct net_device *dev, enum tc_setup_type type,
 	case TC_SETUP_QDISC_ETF:
 		return igc_tsn_enable_launchtime(adapter, type_data);
 
+	case TC_SETUP_PREEMPT:
+		return igc_tsn_enable_frame_preemption(adapter, type_data);
+
 	default:
 		return -EOPNOTSUPP;
 	}
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH net-next v1 8/9] igc: Add support for exposing frame preemption stats registers
  2020-12-02  4:53 ` [Intel-wired-lan] " Vinicius Costa Gomes
@ 2020-12-02  4:53   ` Vinicius Costa Gomes
  -1 siblings, 0 replies; 54+ messages in thread
From: Vinicius Costa Gomes @ 2020-12-02  4:53 UTC (permalink / raw)
  To: netdev
  Cc: Vinicius Costa Gomes, jhs, xiyou.wangcong, jiri, kuba,
	m-karicheri2, vladimir.oltean, Jose.Abreu, po.liu,
	intel-wired-lan, anthony.l.nguyen

Expose the Frame Preemption counters, so the number of
express/preemptible packets can be monitored by userspace.

These registers are cleared when read, so the value shown is the
number of events that happened since the last read.

Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
---
 drivers/net/ethernet/intel/igc/igc_ethtool.c |  8 ++++++++
 drivers/net/ethernet/intel/igc/igc_regs.h    | 10 ++++++++++
 2 files changed, 18 insertions(+)

diff --git a/drivers/net/ethernet/intel/igc/igc_ethtool.c b/drivers/net/ethernet/intel/igc/igc_ethtool.c
index 492c6592c150..14a14a61de45 100644
--- a/drivers/net/ethernet/intel/igc/igc_ethtool.c
+++ b/drivers/net/ethernet/intel/igc/igc_ethtool.c
@@ -325,6 +325,14 @@ static void igc_ethtool_get_regs(struct net_device *netdev,
 
 	regs_buff[213] = adapter->stats.tlpic;
 	regs_buff[214] = adapter->stats.rlpic;
+	regs_buff[215] = rd32(IGC_PRMPTDTCNT);
+	regs_buff[216] = rd32(IGC_PRMEVNTTCNT);
+	regs_buff[217] = rd32(IGC_PRMPTDRCNT);
+	regs_buff[218] = rd32(IGC_PRMEVNTRCNT);
+	regs_buff[219] = rd32(IGC_PRMPBLTCNT);
+	regs_buff[220] = rd32(IGC_PRMPBLRCNT);
+	regs_buff[221] = rd32(IGC_PRMEXPTCNT);
+	regs_buff[222] = rd32(IGC_PRMEXPRCNT);
 }
 
 static void igc_ethtool_get_wol(struct net_device *netdev,
diff --git a/drivers/net/ethernet/intel/igc/igc_regs.h b/drivers/net/ethernet/intel/igc/igc_regs.h
index b52dd9d737e8..69e295d69d84 100644
--- a/drivers/net/ethernet/intel/igc/igc_regs.h
+++ b/drivers/net/ethernet/intel/igc/igc_regs.h
@@ -197,6 +197,16 @@
 
 #define IGC_FTQF(_n)	(0x059E0 + (4 * (_n)))  /* 5-tuple Queue Fltr */
 
+/* Time sync registers - preemption statistics */
+#define IGC_PRMPTDTCNT	0x04280  /* Good TX Preempted Packets */
+#define IGC_PRMEVNTTCNT	0x04298  /* TX Preemption event counter */
+#define IGC_PRMPTDRCNT	0x04284  /* Good RX Preempted Packets */
+#define IGC_PRMEVNTRCNT	0x0429C  /* RX Preemption event counter */
+#define IGC_PRMPBLTCNT	0x04288  /* Good TX Preemptable Packets */
+#define IGC_PRMPBLRCNT	0x0428C  /* Good RX Preemptable Packets */
+#define IGC_PRMEXPTCNT	0x04290  /* Good TX Express Packets */
+#define IGC_PRMEXPRCNT	0x042A0  /* Preemption Exception Counter */
+
 /* Transmit Scheduling Registers */
 #define IGC_TQAVCTRL		0x3570
 #define IGC_TXQCTL(_n)		(0x3344 + 0x4 * (_n))
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [Intel-wired-lan] [PATCH net-next v1 8/9] igc: Add support for exposing frame preemption stats registers
@ 2020-12-02  4:53   ` Vinicius Costa Gomes
  0 siblings, 0 replies; 54+ messages in thread
From: Vinicius Costa Gomes @ 2020-12-02  4:53 UTC (permalink / raw)
  To: intel-wired-lan

Expose the Frame Preemption counters, so the number of
express/preemptible packets can be monitored by userspace.

These registers are cleared when read, so the value shown is the
number of events that happened since the last read.

Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
---
 drivers/net/ethernet/intel/igc/igc_ethtool.c |  8 ++++++++
 drivers/net/ethernet/intel/igc/igc_regs.h    | 10 ++++++++++
 2 files changed, 18 insertions(+)

diff --git a/drivers/net/ethernet/intel/igc/igc_ethtool.c b/drivers/net/ethernet/intel/igc/igc_ethtool.c
index 492c6592c150..14a14a61de45 100644
--- a/drivers/net/ethernet/intel/igc/igc_ethtool.c
+++ b/drivers/net/ethernet/intel/igc/igc_ethtool.c
@@ -325,6 +325,14 @@ static void igc_ethtool_get_regs(struct net_device *netdev,
 
 	regs_buff[213] = adapter->stats.tlpic;
 	regs_buff[214] = adapter->stats.rlpic;
+	regs_buff[215] = rd32(IGC_PRMPTDTCNT);
+	regs_buff[216] = rd32(IGC_PRMEVNTTCNT);
+	regs_buff[217] = rd32(IGC_PRMPTDRCNT);
+	regs_buff[218] = rd32(IGC_PRMEVNTRCNT);
+	regs_buff[219] = rd32(IGC_PRMPBLTCNT);
+	regs_buff[220] = rd32(IGC_PRMPBLRCNT);
+	regs_buff[221] = rd32(IGC_PRMEXPTCNT);
+	regs_buff[222] = rd32(IGC_PRMEXPRCNT);
 }
 
 static void igc_ethtool_get_wol(struct net_device *netdev,
diff --git a/drivers/net/ethernet/intel/igc/igc_regs.h b/drivers/net/ethernet/intel/igc/igc_regs.h
index b52dd9d737e8..69e295d69d84 100644
--- a/drivers/net/ethernet/intel/igc/igc_regs.h
+++ b/drivers/net/ethernet/intel/igc/igc_regs.h
@@ -197,6 +197,16 @@
 
 #define IGC_FTQF(_n)	(0x059E0 + (4 * (_n)))  /* 5-tuple Queue Fltr */
 
+/* Time sync registers - preemption statistics */
+#define IGC_PRMPTDTCNT	0x04280  /* Good TX Preempted Packets */
+#define IGC_PRMEVNTTCNT	0x04298  /* TX Preemption event counter */
+#define IGC_PRMPTDRCNT	0x04284  /* Good RX Preempted Packets */
+#define IGC_PRMEVNTRCNT	0x0429C  /* RX Preemption event counter */
+#define IGC_PRMPBLTCNT	0x04288  /* Good TX Preemptable Packets */
+#define IGC_PRMPBLRCNT	0x0428C  /* Good RX Preemptable Packets */
+#define IGC_PRMEXPTCNT	0x04290  /* Good TX Express Packets */
+#define IGC_PRMEXPRCNT	0x042A0  /* Preemption Exception Counter */
+
 /* Transmit Scheduling Registers */
 #define IGC_TQAVCTRL		0x3570
 #define IGC_TXQCTL(_n)		(0x3344 + 0x4 * (_n))
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH net-next v1 9/9] igc: Separate TSN configurations that can be updated
  2020-12-02  4:53 ` [Intel-wired-lan] " Vinicius Costa Gomes
@ 2020-12-02  4:53   ` Vinicius Costa Gomes
  -1 siblings, 0 replies; 54+ messages in thread
From: Vinicius Costa Gomes @ 2020-12-02  4:53 UTC (permalink / raw)
  To: netdev
  Cc: Vinicius Costa Gomes, jhs, xiyou.wangcong, jiri, kuba,
	m-karicheri2, vladimir.oltean, Jose.Abreu, po.liu,
	intel-wired-lan, anthony.l.nguyen

Some TSN features can be enabled during runtime, but most of the
features need an adapter reset to be disabled.

To better keep track of this, separate the process into an "_apply"
and a "reset" functions, "_apply" will run with the adapter in
potencially "dirty" state, and if necessary will request an adapter
reset, so "_reset" always run with a "clean" adapter.

The idea is to make the process easier to follow.

Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
---
 drivers/net/ethernet/intel/igc/igc_main.c |  21 ++--
 drivers/net/ethernet/intel/igc/igc_tsn.c  | 140 +++++++++++++++-------
 drivers/net/ethernet/intel/igc/igc_tsn.h  |   1 +
 3 files changed, 103 insertions(+), 59 deletions(-)

diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c
index 763f39082f91..76999fe05926 100644
--- a/drivers/net/ethernet/intel/igc/igc_main.c
+++ b/drivers/net/ethernet/intel/igc/igc_main.c
@@ -107,7 +107,7 @@ void igc_reset(struct igc_adapter *adapter)
 	igc_ptp_reset(adapter);
 
 	/* Re-enable TSN offloading, where applicable. */
-	igc_tsn_offload_apply(adapter);
+	igc_tsn_reset(adapter);
 
 	igc_get_phy_info(hw);
 }
@@ -4823,6 +4823,11 @@ static int igc_save_frame_preemption(struct igc_adapter *adapter,
 	u32 preempt;
 	int i;
 
+	/* What we want here is just to save the configuration, so
+	 * when frame preemption is enabled via ethtool, which queues
+	 * are marked as preemptible is saved.
+	 */
+
 	preempt = qopt->preemptible_queues;
 
 	for (i = 0; i < adapter->num_tx_queues; i++) {
@@ -4850,18 +4855,6 @@ static int igc_tsn_enable_qbv_scheduling(struct igc_adapter *adapter,
 	return igc_tsn_offload_apply(adapter);
 }
 
-static int igc_tsn_enable_frame_preemption(struct igc_adapter *adapter,
-					   struct tc_preempt_qopt_offload *qopt)
-{
-	int err;
-
-	err = igc_save_frame_preemption(adapter, qopt);
-	if (err)
-		return err;
-
-	return igc_tsn_offload_apply(adapter);
-}
-
 static int igc_setup_tc(struct net_device *dev, enum tc_setup_type type,
 			void *type_data)
 {
@@ -4875,7 +4868,7 @@ static int igc_setup_tc(struct net_device *dev, enum tc_setup_type type,
 		return igc_tsn_enable_launchtime(adapter, type_data);
 
 	case TC_SETUP_PREEMPT:
-		return igc_tsn_enable_frame_preemption(adapter, type_data);
+		return igc_save_frame_preemption(adapter, type_data);
 
 	default:
 		return -EOPNOTSUPP;
diff --git a/drivers/net/ethernet/intel/igc/igc_tsn.c b/drivers/net/ethernet/intel/igc/igc_tsn.c
index 76272f62bb69..caf8510ad6be 100644
--- a/drivers/net/ethernet/intel/igc/igc_tsn.c
+++ b/drivers/net/ethernet/intel/igc/igc_tsn.c
@@ -18,8 +18,24 @@ static bool is_any_launchtime(struct igc_adapter *adapter)
 	return false;
 }
 
+static unsigned int igc_tsn_new_flags(struct igc_adapter *adapter)
+{
+	unsigned int new_flags = adapter->flags & ~IGC_FLAG_TSN_ANY_ENABLED;
+
+	if (adapter->base_time)
+		new_flags |= IGC_FLAG_TSN_QBV_ENABLED;
+
+	if (is_any_launchtime(adapter))
+		new_flags |= IGC_FLAG_TSN_QBV_ENABLED;
+
+	if (adapter->frame_preemption_active)
+		new_flags |= IGC_FLAG_TSN_PREEMPT_ENABLED;
+
+	return new_flags;
+}
+
 /* Returns the TSN specific registers to their default values after
- * TSN offloading is disabled.
+ * the adapter is reset.
  */
 static int igc_tsn_disable_offload(struct igc_adapter *adapter)
 {
@@ -27,11 +43,9 @@ static int igc_tsn_disable_offload(struct igc_adapter *adapter)
 	u32 tqavctrl, rxpbs;
 	int i;
 
-	if (!(adapter->flags & IGC_FLAG_TSN_QBV_ENABLED))
-		return 0;
-
 	adapter->cycle_time = 0;
 	adapter->frame_preemption_active = false;
+	adapter->base_time = 0;
 
 	wr32(IGC_TXPBS, I225_TXPBSIZE_DEFAULT);
 	wr32(IGC_DTXMXPKTSZ, IGC_DTXMXPKTSZ_DEFAULT);
@@ -63,37 +77,24 @@ static int igc_tsn_disable_offload(struct igc_adapter *adapter)
 	wr32(IGC_QBVCYCLET_S, NSEC_PER_MSEC);
 	wr32(IGC_QBVCYCLET, NSEC_PER_MSEC);
 
-	adapter->flags &= ~IGC_FLAG_TSN_QBV_ENABLED;
+	adapter->flags &= ~IGC_FLAG_TSN_ANY_ENABLED;
 
 	return 0;
 }
 
-static int igc_tsn_enable_offload(struct igc_adapter *adapter)
+static int igc_tsn_update_params(struct igc_adapter *adapter)
 {
 	struct igc_hw *hw = &adapter->hw;
-	u32 tqavctrl, baset_l, baset_h;
-	u32 sec, nsec, cycle, rxpbs;
-	ktime_t base_time, systim;
+	u32 tqavctrl;
+	unsigned int flags;
 	int i;
 
-	if (adapter->flags & IGC_FLAG_TSN_QBV_ENABLED)
+	flags = igc_tsn_new_flags(adapter) & IGC_FLAG_TSN_ANY_ENABLED;
+	if (!flags)
 		return 0;
 
-	cycle = adapter->cycle_time;
-	base_time = adapter->base_time;
-
-	wr32(IGC_TSAUXC, 0);
-	wr32(IGC_DTXMXPKTSZ, IGC_DTXMXPKTSZ_TSN);
-	wr32(IGC_TXPBS, IGC_TXPBSIZE_TSN);
-
-	rxpbs = rd32(IGC_RXPBS) & ~IGC_RXPBSIZE_SIZE_MASK;
-	rxpbs |= IGC_RXPBSIZE_TSN;
-
-	wr32(IGC_RXPBS, rxpbs);
-
 	tqavctrl = rd32(IGC_TQAVCTRL) &
 		~(IGC_TQAVCTRL_MIN_FRAG_MASK | IGC_TQAVCTRL_PREEMPT_ENA);
-	tqavctrl |= IGC_TQAVCTRL_TRANSMIT_MODE_TSN | IGC_TQAVCTRL_ENHANCED_QAV;
 
 	if (adapter->frame_preemption_active)
 		tqavctrl |= IGC_TQAVCTRL_PREEMPT_ENA;
@@ -102,9 +103,6 @@ static int igc_tsn_enable_offload(struct igc_adapter *adapter)
 
 	wr32(IGC_TQAVCTRL, tqavctrl);
 
-	wr32(IGC_QBVCYCLET_S, cycle);
-	wr32(IGC_QBVCYCLET, cycle);
-
 	for (i = 0; i < adapter->num_tx_queues; i++) {
 		struct igc_ring *ring = adapter->tx_ring[i];
 		u32 txqctl = 0;
@@ -125,12 +123,47 @@ static int igc_tsn_enable_offload(struct igc_adapter *adapter)
 		if (ring->launchtime_enable)
 			txqctl |= IGC_TXQCTL_QUEUE_MODE_LAUNCHT;
 
-		if (ring->preemptible)
+		if (adapter->frame_preemption_active && ring->preemptible)
 			txqctl |= IGC_TXQCTL_PREEMPTABLE;
 
 		wr32(IGC_TXQCTL(i), txqctl);
 	}
 
+	adapter->flags = igc_tsn_new_flags(adapter);
+
+	return 0;
+}
+
+static int igc_tsn_enable_offload(struct igc_adapter *adapter)
+{
+	struct igc_hw *hw = &adapter->hw;
+	u32 baset_l, baset_h, tqavctrl;
+	u32 sec, nsec, cycle, rxpbs;
+	ktime_t base_time, systim;
+
+	tqavctrl = rd32(IGC_TQAVCTRL);
+	tqavctrl |= IGC_TQAVCTRL_TRANSMIT_MODE_TSN | IGC_TQAVCTRL_ENHANCED_QAV;
+
+	wr32(IGC_TQAVCTRL, tqavctrl);
+
+	wr32(IGC_TSAUXC, 0);
+	wr32(IGC_DTXMXPKTSZ, IGC_DTXMXPKTSZ_TSN);
+	wr32(IGC_TXPBS, IGC_TXPBSIZE_TSN);
+
+	rxpbs = rd32(IGC_RXPBS) & ~IGC_RXPBSIZE_SIZE_MASK;
+	rxpbs |= IGC_RXPBSIZE_TSN;
+
+	wr32(IGC_RXPBS, rxpbs);
+
+	if (!adapter->base_time)
+		goto done;
+
+	cycle = adapter->cycle_time;
+	base_time = adapter->base_time;
+
+	wr32(IGC_QBVCYCLET_S, cycle);
+	wr32(IGC_QBVCYCLET, cycle);
+
 	nsec = rd32(IGC_SYSTIML);
 	sec = rd32(IGC_SYSTIMH);
 
@@ -148,34 +181,51 @@ static int igc_tsn_enable_offload(struct igc_adapter *adapter)
 	wr32(IGC_BASET_H, baset_h);
 	wr32(IGC_BASET_L, baset_l);
 
-	adapter->flags |= IGC_FLAG_TSN_QBV_ENABLED;
+done:
+	igc_tsn_update_params(adapter);
 
 	return 0;
 }
 
+int igc_tsn_reset(struct igc_adapter *adapter)
+{
+	unsigned int new_flags;
+	int err = 0;
+
+	new_flags = igc_tsn_new_flags(adapter);
+
+	if (!(new_flags & IGC_FLAG_TSN_ANY_ENABLED))
+		return igc_tsn_disable_offload(adapter);
+
+	err = igc_tsn_enable_offload(adapter);
+	if (err < 0)
+		return err;
+
+	adapter->flags = new_flags;
+
+	return err;
+}
+
 int igc_tsn_offload_apply(struct igc_adapter *adapter)
 {
-	bool is_any_enabled = adapter->base_time ||
-		is_any_launchtime(adapter) || adapter->frame_preemption_active;
+	unsigned int new_flags, old_flags;
 
-	if (!(adapter->flags & IGC_FLAG_TSN_QBV_ENABLED) && !is_any_enabled)
-		return 0;
+	old_flags = adapter->flags;
+	new_flags = igc_tsn_new_flags(adapter);
 
-	if (!is_any_enabled) {
-		int err = igc_tsn_disable_offload(adapter);
+	if (old_flags == new_flags)
+		return igc_tsn_update_params(adapter);
 
-		if (err < 0)
-			return err;
+	/* Enabling features work without resetting the adapter */
+	if (new_flags > old_flags)
+		return igc_tsn_enable_offload(adapter);
 
-		/* The BASET registers aren't cleared when writing
-		 * into them, force a reset if the interface is
-		 * running.
-		 */
-		if (netif_running(adapter->netdev))
-			schedule_work(&adapter->reset_task);
+	adapter->flags = new_flags;
 
-		return 0;
-	}
+	if (!netif_running(adapter->netdev))
+		return igc_tsn_enable_offload(adapter);
 
-	return igc_tsn_enable_offload(adapter);
+	schedule_work(&adapter->reset_task);
+
+	return 0;
 }
diff --git a/drivers/net/ethernet/intel/igc/igc_tsn.h b/drivers/net/ethernet/intel/igc/igc_tsn.h
index f76bc86ddccd..1512307f5a52 100644
--- a/drivers/net/ethernet/intel/igc/igc_tsn.h
+++ b/drivers/net/ethernet/intel/igc/igc_tsn.h
@@ -5,5 +5,6 @@
 #define _IGC_TSN_H_
 
 int igc_tsn_offload_apply(struct igc_adapter *adapter);
+int igc_tsn_reset(struct igc_adapter *adapter);
 
 #endif /* _IGC_BASE_H */
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [Intel-wired-lan] [PATCH net-next v1 9/9] igc: Separate TSN configurations that can be updated
@ 2020-12-02  4:53   ` Vinicius Costa Gomes
  0 siblings, 0 replies; 54+ messages in thread
From: Vinicius Costa Gomes @ 2020-12-02  4:53 UTC (permalink / raw)
  To: intel-wired-lan

Some TSN features can be enabled during runtime, but most of the
features need an adapter reset to be disabled.

To better keep track of this, separate the process into an "_apply"
and a "reset" functions, "_apply" will run with the adapter in
potencially "dirty" state, and if necessary will request an adapter
reset, so "_reset" always run with a "clean" adapter.

The idea is to make the process easier to follow.

Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
---
 drivers/net/ethernet/intel/igc/igc_main.c |  21 ++--
 drivers/net/ethernet/intel/igc/igc_tsn.c  | 140 +++++++++++++++-------
 drivers/net/ethernet/intel/igc/igc_tsn.h  |   1 +
 3 files changed, 103 insertions(+), 59 deletions(-)

diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c
index 763f39082f91..76999fe05926 100644
--- a/drivers/net/ethernet/intel/igc/igc_main.c
+++ b/drivers/net/ethernet/intel/igc/igc_main.c
@@ -107,7 +107,7 @@ void igc_reset(struct igc_adapter *adapter)
 	igc_ptp_reset(adapter);
 
 	/* Re-enable TSN offloading, where applicable. */
-	igc_tsn_offload_apply(adapter);
+	igc_tsn_reset(adapter);
 
 	igc_get_phy_info(hw);
 }
@@ -4823,6 +4823,11 @@ static int igc_save_frame_preemption(struct igc_adapter *adapter,
 	u32 preempt;
 	int i;
 
+	/* What we want here is just to save the configuration, so
+	 * when frame preemption is enabled via ethtool, which queues
+	 * are marked as preemptible is saved.
+	 */
+
 	preempt = qopt->preemptible_queues;
 
 	for (i = 0; i < adapter->num_tx_queues; i++) {
@@ -4850,18 +4855,6 @@ static int igc_tsn_enable_qbv_scheduling(struct igc_adapter *adapter,
 	return igc_tsn_offload_apply(adapter);
 }
 
-static int igc_tsn_enable_frame_preemption(struct igc_adapter *adapter,
-					   struct tc_preempt_qopt_offload *qopt)
-{
-	int err;
-
-	err = igc_save_frame_preemption(adapter, qopt);
-	if (err)
-		return err;
-
-	return igc_tsn_offload_apply(adapter);
-}
-
 static int igc_setup_tc(struct net_device *dev, enum tc_setup_type type,
 			void *type_data)
 {
@@ -4875,7 +4868,7 @@ static int igc_setup_tc(struct net_device *dev, enum tc_setup_type type,
 		return igc_tsn_enable_launchtime(adapter, type_data);
 
 	case TC_SETUP_PREEMPT:
-		return igc_tsn_enable_frame_preemption(adapter, type_data);
+		return igc_save_frame_preemption(adapter, type_data);
 
 	default:
 		return -EOPNOTSUPP;
diff --git a/drivers/net/ethernet/intel/igc/igc_tsn.c b/drivers/net/ethernet/intel/igc/igc_tsn.c
index 76272f62bb69..caf8510ad6be 100644
--- a/drivers/net/ethernet/intel/igc/igc_tsn.c
+++ b/drivers/net/ethernet/intel/igc/igc_tsn.c
@@ -18,8 +18,24 @@ static bool is_any_launchtime(struct igc_adapter *adapter)
 	return false;
 }
 
+static unsigned int igc_tsn_new_flags(struct igc_adapter *adapter)
+{
+	unsigned int new_flags = adapter->flags & ~IGC_FLAG_TSN_ANY_ENABLED;
+
+	if (adapter->base_time)
+		new_flags |= IGC_FLAG_TSN_QBV_ENABLED;
+
+	if (is_any_launchtime(adapter))
+		new_flags |= IGC_FLAG_TSN_QBV_ENABLED;
+
+	if (adapter->frame_preemption_active)
+		new_flags |= IGC_FLAG_TSN_PREEMPT_ENABLED;
+
+	return new_flags;
+}
+
 /* Returns the TSN specific registers to their default values after
- * TSN offloading is disabled.
+ * the adapter is reset.
  */
 static int igc_tsn_disable_offload(struct igc_adapter *adapter)
 {
@@ -27,11 +43,9 @@ static int igc_tsn_disable_offload(struct igc_adapter *adapter)
 	u32 tqavctrl, rxpbs;
 	int i;
 
-	if (!(adapter->flags & IGC_FLAG_TSN_QBV_ENABLED))
-		return 0;
-
 	adapter->cycle_time = 0;
 	adapter->frame_preemption_active = false;
+	adapter->base_time = 0;
 
 	wr32(IGC_TXPBS, I225_TXPBSIZE_DEFAULT);
 	wr32(IGC_DTXMXPKTSZ, IGC_DTXMXPKTSZ_DEFAULT);
@@ -63,37 +77,24 @@ static int igc_tsn_disable_offload(struct igc_adapter *adapter)
 	wr32(IGC_QBVCYCLET_S, NSEC_PER_MSEC);
 	wr32(IGC_QBVCYCLET, NSEC_PER_MSEC);
 
-	adapter->flags &= ~IGC_FLAG_TSN_QBV_ENABLED;
+	adapter->flags &= ~IGC_FLAG_TSN_ANY_ENABLED;
 
 	return 0;
 }
 
-static int igc_tsn_enable_offload(struct igc_adapter *adapter)
+static int igc_tsn_update_params(struct igc_adapter *adapter)
 {
 	struct igc_hw *hw = &adapter->hw;
-	u32 tqavctrl, baset_l, baset_h;
-	u32 sec, nsec, cycle, rxpbs;
-	ktime_t base_time, systim;
+	u32 tqavctrl;
+	unsigned int flags;
 	int i;
 
-	if (adapter->flags & IGC_FLAG_TSN_QBV_ENABLED)
+	flags = igc_tsn_new_flags(adapter) & IGC_FLAG_TSN_ANY_ENABLED;
+	if (!flags)
 		return 0;
 
-	cycle = adapter->cycle_time;
-	base_time = adapter->base_time;
-
-	wr32(IGC_TSAUXC, 0);
-	wr32(IGC_DTXMXPKTSZ, IGC_DTXMXPKTSZ_TSN);
-	wr32(IGC_TXPBS, IGC_TXPBSIZE_TSN);
-
-	rxpbs = rd32(IGC_RXPBS) & ~IGC_RXPBSIZE_SIZE_MASK;
-	rxpbs |= IGC_RXPBSIZE_TSN;
-
-	wr32(IGC_RXPBS, rxpbs);
-
 	tqavctrl = rd32(IGC_TQAVCTRL) &
 		~(IGC_TQAVCTRL_MIN_FRAG_MASK | IGC_TQAVCTRL_PREEMPT_ENA);
-	tqavctrl |= IGC_TQAVCTRL_TRANSMIT_MODE_TSN | IGC_TQAVCTRL_ENHANCED_QAV;
 
 	if (adapter->frame_preemption_active)
 		tqavctrl |= IGC_TQAVCTRL_PREEMPT_ENA;
@@ -102,9 +103,6 @@ static int igc_tsn_enable_offload(struct igc_adapter *adapter)
 
 	wr32(IGC_TQAVCTRL, tqavctrl);
 
-	wr32(IGC_QBVCYCLET_S, cycle);
-	wr32(IGC_QBVCYCLET, cycle);
-
 	for (i = 0; i < adapter->num_tx_queues; i++) {
 		struct igc_ring *ring = adapter->tx_ring[i];
 		u32 txqctl = 0;
@@ -125,12 +123,47 @@ static int igc_tsn_enable_offload(struct igc_adapter *adapter)
 		if (ring->launchtime_enable)
 			txqctl |= IGC_TXQCTL_QUEUE_MODE_LAUNCHT;
 
-		if (ring->preemptible)
+		if (adapter->frame_preemption_active && ring->preemptible)
 			txqctl |= IGC_TXQCTL_PREEMPTABLE;
 
 		wr32(IGC_TXQCTL(i), txqctl);
 	}
 
+	adapter->flags = igc_tsn_new_flags(adapter);
+
+	return 0;
+}
+
+static int igc_tsn_enable_offload(struct igc_adapter *adapter)
+{
+	struct igc_hw *hw = &adapter->hw;
+	u32 baset_l, baset_h, tqavctrl;
+	u32 sec, nsec, cycle, rxpbs;
+	ktime_t base_time, systim;
+
+	tqavctrl = rd32(IGC_TQAVCTRL);
+	tqavctrl |= IGC_TQAVCTRL_TRANSMIT_MODE_TSN | IGC_TQAVCTRL_ENHANCED_QAV;
+
+	wr32(IGC_TQAVCTRL, tqavctrl);
+
+	wr32(IGC_TSAUXC, 0);
+	wr32(IGC_DTXMXPKTSZ, IGC_DTXMXPKTSZ_TSN);
+	wr32(IGC_TXPBS, IGC_TXPBSIZE_TSN);
+
+	rxpbs = rd32(IGC_RXPBS) & ~IGC_RXPBSIZE_SIZE_MASK;
+	rxpbs |= IGC_RXPBSIZE_TSN;
+
+	wr32(IGC_RXPBS, rxpbs);
+
+	if (!adapter->base_time)
+		goto done;
+
+	cycle = adapter->cycle_time;
+	base_time = adapter->base_time;
+
+	wr32(IGC_QBVCYCLET_S, cycle);
+	wr32(IGC_QBVCYCLET, cycle);
+
 	nsec = rd32(IGC_SYSTIML);
 	sec = rd32(IGC_SYSTIMH);
 
@@ -148,34 +181,51 @@ static int igc_tsn_enable_offload(struct igc_adapter *adapter)
 	wr32(IGC_BASET_H, baset_h);
 	wr32(IGC_BASET_L, baset_l);
 
-	adapter->flags |= IGC_FLAG_TSN_QBV_ENABLED;
+done:
+	igc_tsn_update_params(adapter);
 
 	return 0;
 }
 
+int igc_tsn_reset(struct igc_adapter *adapter)
+{
+	unsigned int new_flags;
+	int err = 0;
+
+	new_flags = igc_tsn_new_flags(adapter);
+
+	if (!(new_flags & IGC_FLAG_TSN_ANY_ENABLED))
+		return igc_tsn_disable_offload(adapter);
+
+	err = igc_tsn_enable_offload(adapter);
+	if (err < 0)
+		return err;
+
+	adapter->flags = new_flags;
+
+	return err;
+}
+
 int igc_tsn_offload_apply(struct igc_adapter *adapter)
 {
-	bool is_any_enabled = adapter->base_time ||
-		is_any_launchtime(adapter) || adapter->frame_preemption_active;
+	unsigned int new_flags, old_flags;
 
-	if (!(adapter->flags & IGC_FLAG_TSN_QBV_ENABLED) && !is_any_enabled)
-		return 0;
+	old_flags = adapter->flags;
+	new_flags = igc_tsn_new_flags(adapter);
 
-	if (!is_any_enabled) {
-		int err = igc_tsn_disable_offload(adapter);
+	if (old_flags == new_flags)
+		return igc_tsn_update_params(adapter);
 
-		if (err < 0)
-			return err;
+	/* Enabling features work without resetting the adapter */
+	if (new_flags > old_flags)
+		return igc_tsn_enable_offload(adapter);
 
-		/* The BASET registers aren't cleared when writing
-		 * into them, force a reset if the interface is
-		 * running.
-		 */
-		if (netif_running(adapter->netdev))
-			schedule_work(&adapter->reset_task);
+	adapter->flags = new_flags;
 
-		return 0;
-	}
+	if (!netif_running(adapter->netdev))
+		return igc_tsn_enable_offload(adapter);
 
-	return igc_tsn_enable_offload(adapter);
+	schedule_work(&adapter->reset_task);
+
+	return 0;
 }
diff --git a/drivers/net/ethernet/intel/igc/igc_tsn.h b/drivers/net/ethernet/intel/igc/igc_tsn.h
index f76bc86ddccd..1512307f5a52 100644
--- a/drivers/net/ethernet/intel/igc/igc_tsn.h
+++ b/drivers/net/ethernet/intel/igc/igc_tsn.h
@@ -5,5 +5,6 @@
 #define _IGC_TSN_H_
 
 int igc_tsn_offload_apply(struct igc_adapter *adapter);
+int igc_tsn_reset(struct igc_adapter *adapter);
 
 #endif /* _IGC_BASE_H */
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* Re: [PATCH net-next v1 1/9] ethtool: Add support for configuring frame preemption
  2020-12-02  4:53   ` [Intel-wired-lan] " Vinicius Costa Gomes
@ 2020-12-03  1:53     ` Jakub Kicinski
  -1 siblings, 0 replies; 54+ messages in thread
From: Jakub Kicinski @ 2020-12-03  1:53 UTC (permalink / raw)
  To: Vinicius Costa Gomes
  Cc: netdev, jhs, xiyou.wangcong, jiri, m-karicheri2, vladimir.oltean,
	Jose.Abreu, po.liu, intel-wired-lan, anthony.l.nguyen,
	Michal Kubecek

On Tue,  1 Dec 2020 20:53:17 -0800 Vinicius Costa Gomes wrote:
> Frame preemption (described in IEEE 802.3br-2016) defines the concept
> of preemptible and express queues. It allows traffic from express
> queues to "interrupt" traffic from preemptible queues, which are
> "resumed" after the express traffic has finished transmitting.
> 
> Frame preemption can only be used when both the local device and the
> link partner support it.
> 
> Only parameters for enabling/disabling frame preemption and
> configuring the minimum fragment size are included here. Expressing
> which queues are marked as preemptible is left to mqprio/taprio, as
> having that information there should be easier on the user.
> 
> Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>

CC: Michal

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [Intel-wired-lan] [PATCH net-next v1 1/9] ethtool: Add support for configuring frame preemption
@ 2020-12-03  1:53     ` Jakub Kicinski
  0 siblings, 0 replies; 54+ messages in thread
From: Jakub Kicinski @ 2020-12-03  1:53 UTC (permalink / raw)
  To: intel-wired-lan

On Tue,  1 Dec 2020 20:53:17 -0800 Vinicius Costa Gomes wrote:
> Frame preemption (described in IEEE 802.3br-2016) defines the concept
> of preemptible and express queues. It allows traffic from express
> queues to "interrupt" traffic from preemptible queues, which are
> "resumed" after the express traffic has finished transmitting.
> 
> Frame preemption can only be used when both the local device and the
> link partner support it.
> 
> Only parameters for enabling/disabling frame preemption and
> configuring the minimum fragment size are included here. Expressing
> which queues are marked as preemptible is left to mqprio/taprio, as
> having that information there should be easier on the user.
> 
> Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>

CC: Michal

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH net-next v1 1/9] ethtool: Add support for configuring frame preemption
  2020-12-02  4:53   ` [Intel-wired-lan] " Vinicius Costa Gomes
@ 2020-12-05 17:43     ` Jakub Kicinski
  -1 siblings, 0 replies; 54+ messages in thread
From: Jakub Kicinski @ 2020-12-05 17:43 UTC (permalink / raw)
  To: Vinicius Costa Gomes
  Cc: netdev, jhs, xiyou.wangcong, jiri, m-karicheri2, vladimir.oltean,
	Jose.Abreu, po.liu, intel-wired-lan, anthony.l.nguyen

On Tue,  1 Dec 2020 20:53:17 -0800 Vinicius Costa Gomes wrote:
> Frame preemption (described in IEEE 802.3br-2016) defines the concept
> of preemptible and express queues. It allows traffic from express
> queues to "interrupt" traffic from preemptible queues, which are
> "resumed" after the express traffic has finished transmitting.
> 
> Frame preemption can only be used when both the local device and the
> link partner support it.
> 
> Only parameters for enabling/disabling frame preemption and
> configuring the minimum fragment size are included here. Expressing
> which queues are marked as preemptible is left to mqprio/taprio, as
> having that information there should be easier on the user.
> 
> Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
> ---
>  include/linux/ethtool.h              |  19 ++++
>  include/uapi/linux/ethtool_netlink.h |  17 +++
>  net/ethtool/Makefile                 |   2 +-
>  net/ethtool/netlink.c                |  19 ++++
>  net/ethtool/netlink.h                |   4 +
>  net/ethtool/preempt.c                | 151 +++++++++++++++++++++++++++
>  6 files changed, 211 insertions(+), 1 deletion(-)
>  create mode 100644 net/ethtool/preempt.c
> 
> diff --git a/include/linux/ethtool.h b/include/linux/ethtool.h
> index e3da25b51ae4..16d6ee29a6ac 100644
> --- a/include/linux/ethtool.h
> +++ b/include/linux/ethtool.h
> @@ -263,6 +263,19 @@ struct ethtool_pause_stats {
>  	u64 rx_pause_frames;
>  };
>  
> +/**
> + * struct ethtool_fp - Frame Preemption information
> + *
> + * @enabled: Enable frame preemption.
> + *

The empty line between members seems unnecessary.

> + * @min_frag_size_mult: Minimum size for all non-final fragment size,
> + * expressed in terms of X in '(1 + X)*64 + 4'

Is this way of expressing the min frag size from the standard?

> + */
> +struct ethtool_fp {
> +	u8 enabled;
> +	u8 min_frag_size_mult;
> +};

> +	int	(*get_preempt)(struct net_device *,
> +			       struct ethtool_fp *);
> +	int	(*set_preempt)(struct net_device *,
> +			       struct ethtool_fp *);

Since this is a new op we should probably pass extack to the drivers?

>  extern const struct nla_policy ethnl_header_policy[ETHTOOL_A_HEADER_FLAGS + 1];
>  extern const struct nla_policy ethnl_header_policy_stats[ETHTOOL_A_HEADER_FLAGS + 1];
> @@ -375,6 +376,8 @@ extern const struct nla_policy ethnl_tsinfo_get_policy[ETHTOOL_A_TSINFO_HEADER +
>  extern const struct nla_policy ethnl_cable_test_act_policy[ETHTOOL_A_CABLE_TEST_HEADER + 1];
>  extern const struct nla_policy ethnl_cable_test_tdr_act_policy[ETHTOOL_A_CABLE_TEST_TDR_CFG + 1];
>  extern const struct nla_policy ethnl_tunnel_info_get_policy[ETHTOOL_A_TUNNEL_INFO_HEADER + 1];
> +extern const struct nla_policy ethnl_preempt_get_policy[ETHTOOL_A_PREEMPT_MAX + 1];
> +extern const struct nla_policy ethnl_preempt_set_policy[ETHTOOL_A_PREEMPT_MAX + 1];

Let's make the size

ETHTOOL_A_PREEMPT_MIN_FRAG_SIZE_MULT + 1

for set, and

ETHTOOL_A_PREEMPT_HEADER + 1

for get, like the other tables

> +const struct nla_policy
> +ethnl_preempt_get_policy[ETHTOOL_A_PREEMPT_MAX + 1] = {
> +	[ETHTOOL_A_PREEMPT_UNSPEC]		= { .type = NLA_REJECT },

Unnecessary, NLA_REJECT is 0.

> +	[ETHTOOL_A_PREEMPT_HEADER]		= { .type = NLA_NESTED },

Please specify nested policy

> +	[ETHTOOL_A_PREEMPT_ENABLED]		= { .type = NLA_REJECT },
> +	[ETHTOOL_A_PREEMPT_MIN_FRAG_SIZE_MULT]	= { .type = NLA_REJECT },

Unnecessary

> +const struct nla_policy
> +ethnl_preempt_set_policy[ETHTOOL_A_PREEMPT_MAX + 1] = {
> +	[ETHTOOL_A_PREEMPT_UNSPEC]			= { .type = NLA_REJECT },
> +	[ETHTOOL_A_PREEMPT_HEADER]			= { .type = NLA_NESTED },
> +	[ETHTOOL_A_PREEMPT_ENABLED]			= { .type = NLA_U8 },

Set the right netlink policy to check the value is <= 1.

> +	[ETHTOOL_A_PREEMPT_MIN_FRAG_SIZE_MULT]		= { .type = NLA_U8 },
> +};

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [Intel-wired-lan] [PATCH net-next v1 1/9] ethtool: Add support for configuring frame preemption
@ 2020-12-05 17:43     ` Jakub Kicinski
  0 siblings, 0 replies; 54+ messages in thread
From: Jakub Kicinski @ 2020-12-05 17:43 UTC (permalink / raw)
  To: intel-wired-lan

On Tue,  1 Dec 2020 20:53:17 -0800 Vinicius Costa Gomes wrote:
> Frame preemption (described in IEEE 802.3br-2016) defines the concept
> of preemptible and express queues. It allows traffic from express
> queues to "interrupt" traffic from preemptible queues, which are
> "resumed" after the express traffic has finished transmitting.
> 
> Frame preemption can only be used when both the local device and the
> link partner support it.
> 
> Only parameters for enabling/disabling frame preemption and
> configuring the minimum fragment size are included here. Expressing
> which queues are marked as preemptible is left to mqprio/taprio, as
> having that information there should be easier on the user.
> 
> Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
> ---
>  include/linux/ethtool.h              |  19 ++++
>  include/uapi/linux/ethtool_netlink.h |  17 +++
>  net/ethtool/Makefile                 |   2 +-
>  net/ethtool/netlink.c                |  19 ++++
>  net/ethtool/netlink.h                |   4 +
>  net/ethtool/preempt.c                | 151 +++++++++++++++++++++++++++
>  6 files changed, 211 insertions(+), 1 deletion(-)
>  create mode 100644 net/ethtool/preempt.c
> 
> diff --git a/include/linux/ethtool.h b/include/linux/ethtool.h
> index e3da25b51ae4..16d6ee29a6ac 100644
> --- a/include/linux/ethtool.h
> +++ b/include/linux/ethtool.h
> @@ -263,6 +263,19 @@ struct ethtool_pause_stats {
>  	u64 rx_pause_frames;
>  };
>  
> +/**
> + * struct ethtool_fp - Frame Preemption information
> + *
> + * @enabled: Enable frame preemption.
> + *

The empty line between members seems unnecessary.

> + * @min_frag_size_mult: Minimum size for all non-final fragment size,
> + * expressed in terms of X in '(1 + X)*64 + 4'

Is this way of expressing the min frag size from the standard?

> + */
> +struct ethtool_fp {
> +	u8 enabled;
> +	u8 min_frag_size_mult;
> +};

> +	int	(*get_preempt)(struct net_device *,
> +			       struct ethtool_fp *);
> +	int	(*set_preempt)(struct net_device *,
> +			       struct ethtool_fp *);

Since this is a new op we should probably pass extack to the drivers?

>  extern const struct nla_policy ethnl_header_policy[ETHTOOL_A_HEADER_FLAGS + 1];
>  extern const struct nla_policy ethnl_header_policy_stats[ETHTOOL_A_HEADER_FLAGS + 1];
> @@ -375,6 +376,8 @@ extern const struct nla_policy ethnl_tsinfo_get_policy[ETHTOOL_A_TSINFO_HEADER +
>  extern const struct nla_policy ethnl_cable_test_act_policy[ETHTOOL_A_CABLE_TEST_HEADER + 1];
>  extern const struct nla_policy ethnl_cable_test_tdr_act_policy[ETHTOOL_A_CABLE_TEST_TDR_CFG + 1];
>  extern const struct nla_policy ethnl_tunnel_info_get_policy[ETHTOOL_A_TUNNEL_INFO_HEADER + 1];
> +extern const struct nla_policy ethnl_preempt_get_policy[ETHTOOL_A_PREEMPT_MAX + 1];
> +extern const struct nla_policy ethnl_preempt_set_policy[ETHTOOL_A_PREEMPT_MAX + 1];

Let's make the size

ETHTOOL_A_PREEMPT_MIN_FRAG_SIZE_MULT + 1

for set, and

ETHTOOL_A_PREEMPT_HEADER + 1

for get, like the other tables

> +const struct nla_policy
> +ethnl_preempt_get_policy[ETHTOOL_A_PREEMPT_MAX + 1] = {
> +	[ETHTOOL_A_PREEMPT_UNSPEC]		= { .type = NLA_REJECT },

Unnecessary, NLA_REJECT is 0.

> +	[ETHTOOL_A_PREEMPT_HEADER]		= { .type = NLA_NESTED },

Please specify nested policy

> +	[ETHTOOL_A_PREEMPT_ENABLED]		= { .type = NLA_REJECT },
> +	[ETHTOOL_A_PREEMPT_MIN_FRAG_SIZE_MULT]	= { .type = NLA_REJECT },

Unnecessary

> +const struct nla_policy
> +ethnl_preempt_set_policy[ETHTOOL_A_PREEMPT_MAX + 1] = {
> +	[ETHTOOL_A_PREEMPT_UNSPEC]			= { .type = NLA_REJECT },
> +	[ETHTOOL_A_PREEMPT_HEADER]			= { .type = NLA_NESTED },
> +	[ETHTOOL_A_PREEMPT_ENABLED]			= { .type = NLA_U8 },

Set the right netlink policy to check the value is <= 1.

> +	[ETHTOOL_A_PREEMPT_MIN_FRAG_SIZE_MULT]		= { .type = NLA_U8 },
> +};

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH net-next v1 0/9] ethtool: Add support for frame preemption
  2020-12-02  4:53 ` [Intel-wired-lan] " Vinicius Costa Gomes
@ 2020-12-05 17:50   ` Jakub Kicinski
  -1 siblings, 0 replies; 54+ messages in thread
From: Jakub Kicinski @ 2020-12-05 17:50 UTC (permalink / raw)
  To: Vinicius Costa Gomes
  Cc: netdev, jhs, xiyou.wangcong, jiri, m-karicheri2, vladimir.oltean,
	Jose.Abreu, po.liu, intel-wired-lan, anthony.l.nguyen

On Tue,  1 Dec 2020 20:53:16 -0800 Vinicius Costa Gomes wrote:
> $ tc qdisc replace dev $IFACE parent root handle 100 taprio \
>       num_tc 3 \
>       map 2 2 1 0 2 2 2 2 2 2 2 2 2 2 2 2 \
>       queues 1@0 1@1 2@2 \
>       base-time $BASE_TIME \
>       sched-entry S 0f 10000000 \
>       preempt 1110 \
>       flags 0x2 
> 
> The "preempt" parameter is the only difference, it configures which
> queues are marked as preemptible, in this example, queue 0 is marked
> as "not preemptible", so it is express, the rest of the four queues
> are preemptible.

Does it make more sense for the individual queues to be preemptible 
or not, or is it better controlled at traffic class level?
I was looking at patch 2, and 32 queues isn't that many these days..
We either need a larger type there or configure this based on classes.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [Intel-wired-lan] [PATCH net-next v1 0/9] ethtool: Add support for frame preemption
@ 2020-12-05 17:50   ` Jakub Kicinski
  0 siblings, 0 replies; 54+ messages in thread
From: Jakub Kicinski @ 2020-12-05 17:50 UTC (permalink / raw)
  To: intel-wired-lan

On Tue,  1 Dec 2020 20:53:16 -0800 Vinicius Costa Gomes wrote:
> $ tc qdisc replace dev $IFACE parent root handle 100 taprio \
>       num_tc 3 \
>       map 2 2 1 0 2 2 2 2 2 2 2 2 2 2 2 2 \
>       queues 1 at 0 1 at 1 2 at 2 \
>       base-time $BASE_TIME \
>       sched-entry S 0f 10000000 \
>       preempt 1110 \
>       flags 0x2 
> 
> The "preempt" parameter is the only difference, it configures which
> queues are marked as preemptible, in this example, queue 0 is marked
> as "not preemptible", so it is express, the rest of the four queues
> are preemptible.

Does it make more sense for the individual queues to be preemptible 
or not, or is it better controlled at traffic class level?
I was looking at patch 2, and 32 queues isn't that many these days..
We either need a larger type there or configure this based on classes.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH net-next v1 8/9] igc: Add support for exposing frame preemption stats registers
  2020-12-02  4:53   ` [Intel-wired-lan] " Vinicius Costa Gomes
@ 2020-12-05 17:59     ` Jakub Kicinski
  -1 siblings, 0 replies; 54+ messages in thread
From: Jakub Kicinski @ 2020-12-05 17:59 UTC (permalink / raw)
  To: Vinicius Costa Gomes
  Cc: netdev, jhs, xiyou.wangcong, jiri, m-karicheri2, vladimir.oltean,
	Jose.Abreu, po.liu, intel-wired-lan, anthony.l.nguyen

On Tue,  1 Dec 2020 20:53:24 -0800 Vinicius Costa Gomes wrote:
> Expose the Frame Preemption counters, so the number of
> express/preemptible packets can be monitored by userspace.
> 
> These registers are cleared when read, so the value shown is the
> number of events that happened since the last read.
> 
> Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>

You mean expose in a register dump? That's not great user experience..

Are there any stats that the standards mandate?

It'd be great if we could come up with some common set and expose them
via ethtool like the pause frame statistics.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [Intel-wired-lan] [PATCH net-next v1 8/9] igc: Add support for exposing frame preemption stats registers
@ 2020-12-05 17:59     ` Jakub Kicinski
  0 siblings, 0 replies; 54+ messages in thread
From: Jakub Kicinski @ 2020-12-05 17:59 UTC (permalink / raw)
  To: intel-wired-lan

On Tue,  1 Dec 2020 20:53:24 -0800 Vinicius Costa Gomes wrote:
> Expose the Frame Preemption counters, so the number of
> express/preemptible packets can be monitored by userspace.
> 
> These registers are cleared when read, so the value shown is the
> number of events that happened since the last read.
> 
> Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>

You mean expose in a register dump? That's not great user experience..

Are there any stats that the standards mandate?

It'd be great if we could come up with some common set and expose them
via ethtool like the pause frame statistics.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH net-next v1 6/9] igc: Add support for tuning frame preemption via ethtool
  2020-12-02  4:53   ` [Intel-wired-lan] " Vinicius Costa Gomes
@ 2020-12-05 18:00     ` Jakub Kicinski
  -1 siblings, 0 replies; 54+ messages in thread
From: Jakub Kicinski @ 2020-12-05 18:00 UTC (permalink / raw)
  To: Vinicius Costa Gomes
  Cc: netdev, jhs, xiyou.wangcong, jiri, m-karicheri2, vladimir.oltean,
	Jose.Abreu, po.liu, intel-wired-lan, anthony.l.nguyen

On Tue,  1 Dec 2020 20:53:22 -0800 Vinicius Costa Gomes wrote:
> The tc subsystem sets which queues are marked as preemptible, it's the
> role of ethtool to control more hardware specific parameters. These
> parameters include:
> 
>  - enabling the frame preemption hardware: As enabling frame
>  preemption may have other requirements before it can be enabled, it's
>  exposed via the ethtool API;
> 
>  - mininum fragment size multiplier: expressed in usually in the form
>  of (1 + N)*64, this number indicates what's the size of the minimum
>  fragment that can be preempted.
> 
> Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>

WARNING: 'PREEMPTABLE' may be misspelled - perhaps 'PREEMPTIBLE'?

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [Intel-wired-lan] [PATCH net-next v1 6/9] igc: Add support for tuning frame preemption via ethtool
@ 2020-12-05 18:00     ` Jakub Kicinski
  0 siblings, 0 replies; 54+ messages in thread
From: Jakub Kicinski @ 2020-12-05 18:00 UTC (permalink / raw)
  To: intel-wired-lan

On Tue,  1 Dec 2020 20:53:22 -0800 Vinicius Costa Gomes wrote:
> The tc subsystem sets which queues are marked as preemptible, it's the
> role of ethtool to control more hardware specific parameters. These
> parameters include:
> 
>  - enabling the frame preemption hardware: As enabling frame
>  preemption may have other requirements before it can be enabled, it's
>  exposed via the ethtool API;
> 
>  - mininum fragment size multiplier: expressed in usually in the form
>  of (1 + N)*64, this number indicates what's the size of the minimum
>  fragment that can be preempted.
> 
> Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>

WARNING: 'PREEMPTABLE' may be misspelled - perhaps 'PREEMPTIBLE'?

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH net-next v1 1/9] ethtool: Add support for configuring frame preemption
  2020-12-05 17:43     ` [Intel-wired-lan] " Jakub Kicinski
@ 2020-12-07 22:11       ` Vinicius Costa Gomes
  -1 siblings, 0 replies; 54+ messages in thread
From: Vinicius Costa Gomes @ 2020-12-07 22:11 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: netdev, jhs, xiyou.wangcong, jiri, m-karicheri2, vladimir.oltean,
	Jose.Abreu, po.liu, intel-wired-lan, anthony.l.nguyen

Jakub Kicinski <kuba@kernel.org> writes:

>> diff --git a/include/linux/ethtool.h b/include/linux/ethtool.h
>> index e3da25b51ae4..16d6ee29a6ac 100644
>> --- a/include/linux/ethtool.h
>> +++ b/include/linux/ethtool.h
>> @@ -263,6 +263,19 @@ struct ethtool_pause_stats {
>>  	u64 rx_pause_frames;
>>  };
>>  
>> +/**
>> + * struct ethtool_fp - Frame Preemption information
>> + *
>> + * @enabled: Enable frame preemption.
>> + *
>
> The empty line between members seems unnecessary.

Will fix.

>
>> + * @min_frag_size_mult: Minimum size for all non-final fragment size,
>> + * expressed in terms of X in '(1 + X)*64 + 4'
>
> Is this way of expressing the min frag size from the standard?
>

The standard has this: "A 2-bit integer value indicating, in units of 64
octets, the minimum number of octets over 64 octets required in
non-final fragments by the receiver" from IEEE 802.3br-2016, Table
79-7a.

>> + */
>> +struct ethtool_fp {
>> +	u8 enabled;
>> +	u8 min_frag_size_mult;
>> +};
>
>> +	int	(*get_preempt)(struct net_device *,
>> +			       struct ethtool_fp *);
>> +	int	(*set_preempt)(struct net_device *,
>> +			       struct ethtool_fp *);
>
> Since this is a new op we should probably pass extack to the drivers?

Yes. Will fix.

>
>>  extern const struct nla_policy ethnl_header_policy[ETHTOOL_A_HEADER_FLAGS + 1];
>>  extern const struct nla_policy ethnl_header_policy_stats[ETHTOOL_A_HEADER_FLAGS + 1];
>> @@ -375,6 +376,8 @@ extern const struct nla_policy ethnl_tsinfo_get_policy[ETHTOOL_A_TSINFO_HEADER +
>>  extern const struct nla_policy ethnl_cable_test_act_policy[ETHTOOL_A_CABLE_TEST_HEADER + 1];
>>  extern const struct nla_policy ethnl_cable_test_tdr_act_policy[ETHTOOL_A_CABLE_TEST_TDR_CFG + 1];
>>  extern const struct nla_policy ethnl_tunnel_info_get_policy[ETHTOOL_A_TUNNEL_INFO_HEADER + 1];
>> +extern const struct nla_policy ethnl_preempt_get_policy[ETHTOOL_A_PREEMPT_MAX + 1];
>> +extern const struct nla_policy ethnl_preempt_set_policy[ETHTOOL_A_PREEMPT_MAX + 1];
>
> Let's make the size
>
> ETHTOOL_A_PREEMPT_MIN_FRAG_SIZE_MULT + 1
>
> for set, and
>
> ETHTOOL_A_PREEMPT_HEADER + 1
>
> for get, like the other tables
>
>> +const struct nla_policy
>> +ethnl_preempt_get_policy[ETHTOOL_A_PREEMPT_MAX + 1] = {
>> +	[ETHTOOL_A_PREEMPT_UNSPEC]		= { .type = NLA_REJECT },
>
> Unnecessary, NLA_REJECT is 0.
>
>> +	[ETHTOOL_A_PREEMPT_HEADER]		= { .type = NLA_NESTED },
>
> Please specify nested policy
>
>> +	[ETHTOOL_A_PREEMPT_ENABLED]		= { .type = NLA_REJECT },
>> +	[ETHTOOL_A_PREEMPT_MIN_FRAG_SIZE_MULT]	= { .type = NLA_REJECT },
>
> Unnecessary
>
>> +const struct nla_policy
>> +ethnl_preempt_set_policy[ETHTOOL_A_PREEMPT_MAX + 1] = {
>> +	[ETHTOOL_A_PREEMPT_UNSPEC]			= { .type = NLA_REJECT },
>> +	[ETHTOOL_A_PREEMPT_HEADER]			= { .type = NLA_NESTED },
>> +	[ETHTOOL_A_PREEMPT_ENABLED]			= { .type = NLA_U8 },
>
> Set the right netlink policy to check the value is <= 1.
>
>> +	[ETHTOOL_A_PREEMPT_MIN_FRAG_SIZE_MULT]		= { .type = NLA_U8 },
>> +};


Will fix these netlink validation issues.


Cheers,
-- 
Vinicius

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [Intel-wired-lan] [PATCH net-next v1 1/9] ethtool: Add support for configuring frame preemption
@ 2020-12-07 22:11       ` Vinicius Costa Gomes
  0 siblings, 0 replies; 54+ messages in thread
From: Vinicius Costa Gomes @ 2020-12-07 22:11 UTC (permalink / raw)
  To: intel-wired-lan

Jakub Kicinski <kuba@kernel.org> writes:

>> diff --git a/include/linux/ethtool.h b/include/linux/ethtool.h
>> index e3da25b51ae4..16d6ee29a6ac 100644
>> --- a/include/linux/ethtool.h
>> +++ b/include/linux/ethtool.h
>> @@ -263,6 +263,19 @@ struct ethtool_pause_stats {
>>  	u64 rx_pause_frames;
>>  };
>>  
>> +/**
>> + * struct ethtool_fp - Frame Preemption information
>> + *
>> + * @enabled: Enable frame preemption.
>> + *
>
> The empty line between members seems unnecessary.

Will fix.

>
>> + * @min_frag_size_mult: Minimum size for all non-final fragment size,
>> + * expressed in terms of X in '(1 + X)*64 + 4'
>
> Is this way of expressing the min frag size from the standard?
>

The standard has this: "A 2-bit integer value indicating, in units of 64
octets, the minimum number of octets over 64 octets required in
non-final fragments by the receiver" from IEEE 802.3br-2016, Table
79-7a.

>> + */
>> +struct ethtool_fp {
>> +	u8 enabled;
>> +	u8 min_frag_size_mult;
>> +};
>
>> +	int	(*get_preempt)(struct net_device *,
>> +			       struct ethtool_fp *);
>> +	int	(*set_preempt)(struct net_device *,
>> +			       struct ethtool_fp *);
>
> Since this is a new op we should probably pass extack to the drivers?

Yes. Will fix.

>
>>  extern const struct nla_policy ethnl_header_policy[ETHTOOL_A_HEADER_FLAGS + 1];
>>  extern const struct nla_policy ethnl_header_policy_stats[ETHTOOL_A_HEADER_FLAGS + 1];
>> @@ -375,6 +376,8 @@ extern const struct nla_policy ethnl_tsinfo_get_policy[ETHTOOL_A_TSINFO_HEADER +
>>  extern const struct nla_policy ethnl_cable_test_act_policy[ETHTOOL_A_CABLE_TEST_HEADER + 1];
>>  extern const struct nla_policy ethnl_cable_test_tdr_act_policy[ETHTOOL_A_CABLE_TEST_TDR_CFG + 1];
>>  extern const struct nla_policy ethnl_tunnel_info_get_policy[ETHTOOL_A_TUNNEL_INFO_HEADER + 1];
>> +extern const struct nla_policy ethnl_preempt_get_policy[ETHTOOL_A_PREEMPT_MAX + 1];
>> +extern const struct nla_policy ethnl_preempt_set_policy[ETHTOOL_A_PREEMPT_MAX + 1];
>
> Let's make the size
>
> ETHTOOL_A_PREEMPT_MIN_FRAG_SIZE_MULT + 1
>
> for set, and
>
> ETHTOOL_A_PREEMPT_HEADER + 1
>
> for get, like the other tables
>
>> +const struct nla_policy
>> +ethnl_preempt_get_policy[ETHTOOL_A_PREEMPT_MAX + 1] = {
>> +	[ETHTOOL_A_PREEMPT_UNSPEC]		= { .type = NLA_REJECT },
>
> Unnecessary, NLA_REJECT is 0.
>
>> +	[ETHTOOL_A_PREEMPT_HEADER]		= { .type = NLA_NESTED },
>
> Please specify nested policy
>
>> +	[ETHTOOL_A_PREEMPT_ENABLED]		= { .type = NLA_REJECT },
>> +	[ETHTOOL_A_PREEMPT_MIN_FRAG_SIZE_MULT]	= { .type = NLA_REJECT },
>
> Unnecessary
>
>> +const struct nla_policy
>> +ethnl_preempt_set_policy[ETHTOOL_A_PREEMPT_MAX + 1] = {
>> +	[ETHTOOL_A_PREEMPT_UNSPEC]			= { .type = NLA_REJECT },
>> +	[ETHTOOL_A_PREEMPT_HEADER]			= { .type = NLA_NESTED },
>> +	[ETHTOOL_A_PREEMPT_ENABLED]			= { .type = NLA_U8 },
>
> Set the right netlink policy to check the value is <= 1.
>
>> +	[ETHTOOL_A_PREEMPT_MIN_FRAG_SIZE_MULT]		= { .type = NLA_U8 },
>> +};


Will fix these netlink validation issues.


Cheers,
-- 
Vinicius

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH net-next v1 6/9] igc: Add support for tuning frame preemption via ethtool
  2020-12-05 18:00     ` [Intel-wired-lan] " Jakub Kicinski
@ 2020-12-07 22:15       ` Vinicius Costa Gomes
  -1 siblings, 0 replies; 54+ messages in thread
From: Vinicius Costa Gomes @ 2020-12-07 22:15 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: netdev, jhs, xiyou.wangcong, jiri, m-karicheri2, vladimir.oltean,
	Jose.Abreu, po.liu, intel-wired-lan, anthony.l.nguyen

Jakub Kicinski <kuba@kernel.org> writes:

> On Tue,  1 Dec 2020 20:53:22 -0800 Vinicius Costa Gomes wrote:
>> The tc subsystem sets which queues are marked as preemptible, it's the
>> role of ethtool to control more hardware specific parameters. These
>> parameters include:
>> 
>>  - enabling the frame preemption hardware: As enabling frame
>>  preemption may have other requirements before it can be enabled, it's
>>  exposed via the ethtool API;
>> 
>>  - mininum fragment size multiplier: expressed in usually in the form
>>  of (1 + N)*64, this number indicates what's the size of the minimum
>>  fragment that can be preempted.
>> 
>> Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
>
> WARNING: 'PREEMPTABLE' may be misspelled - perhaps 'PREEMPTIBLE'?

In the datasheet the term PREEMPTABLE is used, and when refering to
register values I chose to be consistent with the datasheet. But as the
margin for confusion is small, I can change to use "preemptible"
everywhere, no problem.


Cheers,
-- 
Vinicius

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [Intel-wired-lan] [PATCH net-next v1 6/9] igc: Add support for tuning frame preemption via ethtool
@ 2020-12-07 22:15       ` Vinicius Costa Gomes
  0 siblings, 0 replies; 54+ messages in thread
From: Vinicius Costa Gomes @ 2020-12-07 22:15 UTC (permalink / raw)
  To: intel-wired-lan

Jakub Kicinski <kuba@kernel.org> writes:

> On Tue,  1 Dec 2020 20:53:22 -0800 Vinicius Costa Gomes wrote:
>> The tc subsystem sets which queues are marked as preemptible, it's the
>> role of ethtool to control more hardware specific parameters. These
>> parameters include:
>> 
>>  - enabling the frame preemption hardware: As enabling frame
>>  preemption may have other requirements before it can be enabled, it's
>>  exposed via the ethtool API;
>> 
>>  - mininum fragment size multiplier: expressed in usually in the form
>>  of (1 + N)*64, this number indicates what's the size of the minimum
>>  fragment that can be preempted.
>> 
>> Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
>
> WARNING: 'PREEMPTABLE' may be misspelled - perhaps 'PREEMPTIBLE'?

In the datasheet the term PREEMPTABLE is used, and when refering to
register values I chose to be consistent with the datasheet. But as the
margin for confusion is small, I can change to use "preemptible"
everywhere, no problem.


Cheers,
-- 
Vinicius

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH net-next v1 8/9] igc: Add support for exposing frame preemption stats registers
  2020-12-05 17:59     ` [Intel-wired-lan] " Jakub Kicinski
@ 2020-12-07 22:29       ` Vinicius Costa Gomes
  -1 siblings, 0 replies; 54+ messages in thread
From: Vinicius Costa Gomes @ 2020-12-07 22:29 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: netdev, jhs, xiyou.wangcong, jiri, m-karicheri2, vladimir.oltean,
	Jose.Abreu, po.liu, intel-wired-lan, anthony.l.nguyen

Jakub Kicinski <kuba@kernel.org> writes:

> On Tue,  1 Dec 2020 20:53:24 -0800 Vinicius Costa Gomes wrote:
>> Expose the Frame Preemption counters, so the number of
>> express/preemptible packets can be monitored by userspace.
>> 
>> These registers are cleared when read, so the value shown is the
>> number of events that happened since the last read.
>> 
>> Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
>
> You mean expose in a register dump? That's not great user experience..

I can agree with that, even after some formatting on the ethtool side:

Preemption statistics:
    TX Preemption event counter: 14070
    Good TX Preemptable Packets: 201957
    Good TX Express Packets: 32031
    TX Preempted Packets: 13259
    RX Preemption event counter: 0
    Good RX Preemptable Packets: 0
    Good RX Preempted Packets: 0
    Preemption Exception Counter:
        OOO_SMDC 0
        OOO_FRAME 0
        OOO_FRAG 0
        MISS_FRAME_FRAG 0

It's less than ideal, but useful for development/debugging.

>
> Are there any stats that the standards mandate?

I just took abother look at the standard, mainly at the MIBs, there are
no statistics related to frame preemption that I could find, only
configuration stuff.

>
> It'd be great if we could come up with some common set and expose them
> via ethtool like the pause frame statistics.

Agreed, will drop this patch, until this common set is agreed upon.


Cheers,
-- 
Vinicius

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [Intel-wired-lan] [PATCH net-next v1 8/9] igc: Add support for exposing frame preemption stats registers
@ 2020-12-07 22:29       ` Vinicius Costa Gomes
  0 siblings, 0 replies; 54+ messages in thread
From: Vinicius Costa Gomes @ 2020-12-07 22:29 UTC (permalink / raw)
  To: intel-wired-lan

Jakub Kicinski <kuba@kernel.org> writes:

> On Tue,  1 Dec 2020 20:53:24 -0800 Vinicius Costa Gomes wrote:
>> Expose the Frame Preemption counters, so the number of
>> express/preemptible packets can be monitored by userspace.
>> 
>> These registers are cleared when read, so the value shown is the
>> number of events that happened since the last read.
>> 
>> Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
>
> You mean expose in a register dump? That's not great user experience..

I can agree with that, even after some formatting on the ethtool side:

Preemption statistics:
    TX Preemption event counter: 14070
    Good TX Preemptable Packets: 201957
    Good TX Express Packets: 32031
    TX Preempted Packets: 13259
    RX Preemption event counter: 0
    Good RX Preemptable Packets: 0
    Good RX Preempted Packets: 0
    Preemption Exception Counter:
        OOO_SMDC 0
        OOO_FRAME 0
        OOO_FRAG 0
        MISS_FRAME_FRAG 0

It's less than ideal, but useful for development/debugging.

>
> Are there any stats that the standards mandate?

I just took abother look at the standard, mainly at the MIBs, there are
no statistics related to frame preemption that I could find, only
configuration stuff.

>
> It'd be great if we could come up with some common set and expose them
> via ethtool like the pause frame statistics.

Agreed, will drop this patch, until this common set is agreed upon.


Cheers,
-- 
Vinicius

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH net-next v1 0/9] ethtool: Add support for frame preemption
  2020-12-05 17:50   ` [Intel-wired-lan] " Jakub Kicinski
@ 2020-12-07 22:49     ` Vinicius Costa Gomes
  -1 siblings, 0 replies; 54+ messages in thread
From: Vinicius Costa Gomes @ 2020-12-07 22:49 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: netdev, jhs, xiyou.wangcong, jiri, m-karicheri2, vladimir.oltean,
	Jose.Abreu, po.liu, intel-wired-lan, anthony.l.nguyen

Jakub Kicinski <kuba@kernel.org> writes:

> On Tue,  1 Dec 2020 20:53:16 -0800 Vinicius Costa Gomes wrote:
>> $ tc qdisc replace dev $IFACE parent root handle 100 taprio \
>>       num_tc 3 \
>>       map 2 2 1 0 2 2 2 2 2 2 2 2 2 2 2 2 \
>>       queues 1@0 1@1 2@2 \
>>       base-time $BASE_TIME \
>>       sched-entry S 0f 10000000 \
>>       preempt 1110 \
>>       flags 0x2 
>> 
>> The "preempt" parameter is the only difference, it configures which
>> queues are marked as preemptible, in this example, queue 0 is marked
>> as "not preemptible", so it is express, the rest of the four queues
>> are preemptible.
>
> Does it make more sense for the individual queues to be preemptible 
> or not, or is it better controlled at traffic class level?
> I was looking at patch 2, and 32 queues isn't that many these days..
> We either need a larger type there or configure this based on classes.

I can set more future proof sizes for expressing the queues, sure, but
the issue, I think, is that frame preemption has dimishing returns with
link speed: at 2.5G the latency improvements are on the order of single
digit microseconds. At greater speeds the improvements are even less
noticeable.

The only adapters that I see that support frame preemtion have 8 queues
or less. 

The idea of configuring frame preemption based on classes is
interesting. I will play with it, and see how it looks.


Cheers,
-- 
Vinicius

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [Intel-wired-lan] [PATCH net-next v1 0/9] ethtool: Add support for frame preemption
@ 2020-12-07 22:49     ` Vinicius Costa Gomes
  0 siblings, 0 replies; 54+ messages in thread
From: Vinicius Costa Gomes @ 2020-12-07 22:49 UTC (permalink / raw)
  To: intel-wired-lan

Jakub Kicinski <kuba@kernel.org> writes:

> On Tue,  1 Dec 2020 20:53:16 -0800 Vinicius Costa Gomes wrote:
>> $ tc qdisc replace dev $IFACE parent root handle 100 taprio \
>>       num_tc 3 \
>>       map 2 2 1 0 2 2 2 2 2 2 2 2 2 2 2 2 \
>>       queues 1 at 0 1 at 1 2 at 2 \
>>       base-time $BASE_TIME \
>>       sched-entry S 0f 10000000 \
>>       preempt 1110 \
>>       flags 0x2 
>> 
>> The "preempt" parameter is the only difference, it configures which
>> queues are marked as preemptible, in this example, queue 0 is marked
>> as "not preemptible", so it is express, the rest of the four queues
>> are preemptible.
>
> Does it make more sense for the individual queues to be preemptible 
> or not, or is it better controlled at traffic class level?
> I was looking at patch 2, and 32 queues isn't that many these days..
> We either need a larger type there or configure this based on classes.

I can set more future proof sizes for expressing the queues, sure, but
the issue, I think, is that frame preemption has dimishing returns with
link speed: at 2.5G the latency improvements are on the order of single
digit microseconds. At greater speeds the improvements are even less
noticeable.

The only adapters that I see that support frame preemtion have 8 queues
or less. 

The idea of configuring frame preemption based on classes is
interesting. I will play with it, and see how it looks.


Cheers,
-- 
Vinicius

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH net-next v1 0/9] ethtool: Add support for frame preemption
  2020-12-07 22:49     ` [Intel-wired-lan] " Vinicius Costa Gomes
@ 2020-12-07 23:12       ` Vladimir Oltean
  -1 siblings, 0 replies; 54+ messages in thread
From: Vladimir Oltean @ 2020-12-07 23:12 UTC (permalink / raw)
  To: Vinicius Costa Gomes
  Cc: Jakub Kicinski, netdev, jhs, xiyou.wangcong, jiri, m-karicheri2,
	Jose.Abreu, Po Liu, intel-wired-lan, anthony.l.nguyen

On Mon, Dec 07, 2020 at 02:49:35PM -0800, Vinicius Costa Gomes wrote:
> Jakub Kicinski <kuba@kernel.org> writes:
>
> > On Tue,  1 Dec 2020 20:53:16 -0800 Vinicius Costa Gomes wrote:
> >> $ tc qdisc replace dev $IFACE parent root handle 100 taprio \
> >>       num_tc 3 \
> >>       map 2 2 1 0 2 2 2 2 2 2 2 2 2 2 2 2 \
> >>       queues 1@0 1@1 2@2 \
> >>       base-time $BASE_TIME \
> >>       sched-entry S 0f 10000000 \
> >>       preempt 1110 \
> >>       flags 0x2
> >>
> >> The "preempt" parameter is the only difference, it configures which
> >> queues are marked as preemptible, in this example, queue 0 is marked
> >> as "not preemptible", so it is express, the rest of the four queues
> >> are preemptible.
> >
> > Does it make more sense for the individual queues to be preemptible
> > or not, or is it better controlled at traffic class level?
> > I was looking at patch 2, and 32 queues isn't that many these days..
> > We either need a larger type there or configure this based on classes.
>
> I can set more future proof sizes for expressing the queues, sure, but
> the issue, I think, is that frame preemption has dimishing returns with
> link speed: at 2.5G the latency improvements are on the order of single
> digit microseconds. At greater speeds the improvements are even less
> noticeable.

You could look at it another way.
You can enable jumbo frames in your network, and your latency-sensitive
traffic would not suffer as long as the jumbo frames are preemptible.

> The only adapters that I see that support frame preemtion have 8 queues
> or less.
>
> The idea of configuring frame preemption based on classes is
> interesting. I will play with it, and see how it looks.

I admit I never understood why you insist on configuring TSN offloads
per hardware queue and not per traffic class.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [Intel-wired-lan] [PATCH net-next v1 0/9] ethtool: Add support for frame preemption
@ 2020-12-07 23:12       ` Vladimir Oltean
  0 siblings, 0 replies; 54+ messages in thread
From: Vladimir Oltean @ 2020-12-07 23:12 UTC (permalink / raw)
  To: intel-wired-lan

On Mon, Dec 07, 2020 at 02:49:35PM -0800, Vinicius Costa Gomes wrote:
> Jakub Kicinski <kuba@kernel.org> writes:
>
> > On Tue,  1 Dec 2020 20:53:16 -0800 Vinicius Costa Gomes wrote:
> >> $ tc qdisc replace dev $IFACE parent root handle 100 taprio \
> >>       num_tc 3 \
> >>       map 2 2 1 0 2 2 2 2 2 2 2 2 2 2 2 2 \
> >>       queues 1 at 0 1 at 1 2 at 2 \
> >>       base-time $BASE_TIME \
> >>       sched-entry S 0f 10000000 \
> >>       preempt 1110 \
> >>       flags 0x2
> >>
> >> The "preempt" parameter is the only difference, it configures which
> >> queues are marked as preemptible, in this example, queue 0 is marked
> >> as "not preemptible", so it is express, the rest of the four queues
> >> are preemptible.
> >
> > Does it make more sense for the individual queues to be preemptible
> > or not, or is it better controlled at traffic class level?
> > I was looking at patch 2, and 32 queues isn't that many these days..
> > We either need a larger type there or configure this based on classes.
>
> I can set more future proof sizes for expressing the queues, sure, but
> the issue, I think, is that frame preemption has dimishing returns with
> link speed: at 2.5G the latency improvements are on the order of single
> digit microseconds. At greater speeds the improvements are even less
> noticeable.

You could look at it another way.
You can enable jumbo frames in your network, and your latency-sensitive
traffic would not suffer as long as the jumbo frames are preemptible.

> The only adapters that I see that support frame preemtion have 8 queues
> or less.
>
> The idea of configuring frame preemption based on classes is
> interesting. I will play with it, and see how it looks.

I admit I never understood why you insist on configuring TSN offloads
per hardware queue and not per traffic class.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH net-next v1 1/9] ethtool: Add support for configuring frame preemption
  2020-12-07 22:11       ` [Intel-wired-lan] " Vinicius Costa Gomes
@ 2020-12-07 23:21         ` Jakub Kicinski
  -1 siblings, 0 replies; 54+ messages in thread
From: Jakub Kicinski @ 2020-12-07 23:21 UTC (permalink / raw)
  To: Vinicius Costa Gomes
  Cc: netdev, jhs, xiyou.wangcong, jiri, m-karicheri2, vladimir.oltean,
	Jose.Abreu, po.liu, intel-wired-lan, anthony.l.nguyen

On Mon, 07 Dec 2020 14:11:48 -0800 Vinicius Costa Gomes wrote:
> Jakub Kicinski <kuba@kernel.org> writes:
> >> + * @min_frag_size_mult: Minimum size for all non-final fragment size,
> >> + * expressed in terms of X in '(1 + X)*64 + 4'  
> >
> > Is this way of expressing the min frag size from the standard?
> >  
> 
> The standard has this: "A 2-bit integer value indicating, in units of 64
> octets, the minimum number of octets over 64 octets required in
> non-final fragments by the receiver" from IEEE 802.3br-2016, Table
> 79-7a.

Thanks! Let's drop the _mult suffix and add a mention of this
controlling the addFragSize variable from the standard. Perhaps 
it should in fact be called add_frag_size (with an explanation 
that the "additional" means "above the 64B" which are required in
Ethernet, and which are accounted for by the "1" in the 1 + X formula)?

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [Intel-wired-lan] [PATCH net-next v1 1/9] ethtool: Add support for configuring frame preemption
@ 2020-12-07 23:21         ` Jakub Kicinski
  0 siblings, 0 replies; 54+ messages in thread
From: Jakub Kicinski @ 2020-12-07 23:21 UTC (permalink / raw)
  To: intel-wired-lan

On Mon, 07 Dec 2020 14:11:48 -0800 Vinicius Costa Gomes wrote:
> Jakub Kicinski <kuba@kernel.org> writes:
> >> + * @min_frag_size_mult: Minimum size for all non-final fragment size,
> >> + * expressed in terms of X in '(1 + X)*64 + 4'  
> >
> > Is this way of expressing the min frag size from the standard?
> >  
> 
> The standard has this: "A 2-bit integer value indicating, in units of 64
> octets, the minimum number of octets over 64 octets required in
> non-final fragments by the receiver" from IEEE 802.3br-2016, Table
> 79-7a.

Thanks! Let's drop the _mult suffix and add a mention of this
controlling the addFragSize variable from the standard. Perhaps 
it should in fact be called add_frag_size (with an explanation 
that the "additional" means "above the 64B" which are required in
Ethernet, and which are accounted for by the "1" in the 1 + X formula)?

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH net-next v1 6/9] igc: Add support for tuning frame preemption via ethtool
  2020-12-07 22:15       ` [Intel-wired-lan] " Vinicius Costa Gomes
@ 2020-12-07 23:22         ` Jakub Kicinski
  -1 siblings, 0 replies; 54+ messages in thread
From: Jakub Kicinski @ 2020-12-07 23:22 UTC (permalink / raw)
  To: Vinicius Costa Gomes
  Cc: netdev, jhs, xiyou.wangcong, jiri, m-karicheri2, vladimir.oltean,
	Jose.Abreu, po.liu, intel-wired-lan, anthony.l.nguyen

On Mon, 07 Dec 2020 14:15:25 -0800 Vinicius Costa Gomes wrote:
> > WARNING: 'PREEMPTABLE' may be misspelled - perhaps 'PREEMPTIBLE'?  
> 
> In the datasheet the term PREEMPTABLE is used, and when refering to
> register values I chose to be consistent with the datasheet. But as the
> margin for confusion is small, I can change to use "preemptible"
> everywhere, no problem.

If there is a reason to it feel free to keep as is.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [Intel-wired-lan] [PATCH net-next v1 6/9] igc: Add support for tuning frame preemption via ethtool
@ 2020-12-07 23:22         ` Jakub Kicinski
  0 siblings, 0 replies; 54+ messages in thread
From: Jakub Kicinski @ 2020-12-07 23:22 UTC (permalink / raw)
  To: intel-wired-lan

On Mon, 07 Dec 2020 14:15:25 -0800 Vinicius Costa Gomes wrote:
> > WARNING: 'PREEMPTABLE' may be misspelled - perhaps 'PREEMPTIBLE'?  
> 
> In the datasheet the term PREEMPTABLE is used, and when refering to
> register values I chose to be consistent with the datasheet. But as the
> margin for confusion is small, I can change to use "preemptible"
> everywhere, no problem.

If there is a reason to it feel free to keep as is.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH net-next v1 1/9] ethtool: Add support for configuring frame preemption
  2020-12-07 23:21         ` [Intel-wired-lan] " Jakub Kicinski
@ 2020-12-08  0:24           ` Vinicius Costa Gomes
  -1 siblings, 0 replies; 54+ messages in thread
From: Vinicius Costa Gomes @ 2020-12-08  0:24 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: netdev, jhs, xiyou.wangcong, jiri, m-karicheri2, vladimir.oltean,
	Jose.Abreu, po.liu, intel-wired-lan, anthony.l.nguyen

Jakub Kicinski <kuba@kernel.org> writes:

> On Mon, 07 Dec 2020 14:11:48 -0800 Vinicius Costa Gomes wrote:
>> Jakub Kicinski <kuba@kernel.org> writes:
>> >> + * @min_frag_size_mult: Minimum size for all non-final fragment size,
>> >> + * expressed in terms of X in '(1 + X)*64 + 4'  
>> >
>> > Is this way of expressing the min frag size from the standard?
>> >  
>> 
>> The standard has this: "A 2-bit integer value indicating, in units of 64
>> octets, the minimum number of octets over 64 octets required in
>> non-final fragments by the receiver" from IEEE 802.3br-2016, Table
>> 79-7a.
>
> Thanks! Let's drop the _mult suffix and add a mention of this
> controlling the addFragSize variable from the standard. Perhaps 
> it should in fact be called add_frag_size (with an explanation 
> that the "additional" means "above the 64B" which are required in
> Ethernet, and which are accounted for by the "1" in the 1 + X
> formula)?

Sounds good :-) Will add a comment with the standard reference and
change the name to 'add_frag_size'.


Cheers,
-- 
Vinicius

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [Intel-wired-lan] [PATCH net-next v1 1/9] ethtool: Add support for configuring frame preemption
@ 2020-12-08  0:24           ` Vinicius Costa Gomes
  0 siblings, 0 replies; 54+ messages in thread
From: Vinicius Costa Gomes @ 2020-12-08  0:24 UTC (permalink / raw)
  To: intel-wired-lan

Jakub Kicinski <kuba@kernel.org> writes:

> On Mon, 07 Dec 2020 14:11:48 -0800 Vinicius Costa Gomes wrote:
>> Jakub Kicinski <kuba@kernel.org> writes:
>> >> + * @min_frag_size_mult: Minimum size for all non-final fragment size,
>> >> + * expressed in terms of X in '(1 + X)*64 + 4'  
>> >
>> > Is this way of expressing the min frag size from the standard?
>> >  
>> 
>> The standard has this: "A 2-bit integer value indicating, in units of 64
>> octets, the minimum number of octets over 64 octets required in
>> non-final fragments by the receiver" from IEEE 802.3br-2016, Table
>> 79-7a.
>
> Thanks! Let's drop the _mult suffix and add a mention of this
> controlling the addFragSize variable from the standard. Perhaps 
> it should in fact be called add_frag_size (with an explanation 
> that the "additional" means "above the 64B" which are required in
> Ethernet, and which are accounted for by the "1" in the 1 + X
> formula)?

Sounds good :-) Will add a comment with the standard reference and
change the name to 'add_frag_size'.


Cheers,
-- 
Vinicius

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH net-next v1 1/9] ethtool: Add support for configuring frame preemption
  2020-12-08  0:24           ` [Intel-wired-lan] " Vinicius Costa Gomes
@ 2020-12-08  0:27             ` Vladimir Oltean
  -1 siblings, 0 replies; 54+ messages in thread
From: Vladimir Oltean @ 2020-12-08  0:27 UTC (permalink / raw)
  To: Vinicius Costa Gomes
  Cc: Jakub Kicinski, netdev, jhs, xiyou.wangcong, jiri, m-karicheri2,
	Jose.Abreu, Po Liu, intel-wired-lan, anthony.l.nguyen

On Mon, Dec 07, 2020 at 04:24:02PM -0800, Vinicius Costa Gomes wrote:
> Jakub Kicinski <kuba@kernel.org> writes:
>
> > On Mon, 07 Dec 2020 14:11:48 -0800 Vinicius Costa Gomes wrote:
> >> Jakub Kicinski <kuba@kernel.org> writes:
> >> >> + * @min_frag_size_mult: Minimum size for all non-final fragment size,
> >> >> + * expressed in terms of X in '(1 + X)*64 + 4'
> >> >
> >> > Is this way of expressing the min frag size from the standard?
> >> >
> >>
> >> The standard has this: "A 2-bit integer value indicating, in units of 64
> >> octets, the minimum number of octets over 64 octets required in
> >> non-final fragments by the receiver" from IEEE 802.3br-2016, Table
> >> 79-7a.
> >
> > Thanks! Let's drop the _mult suffix and add a mention of this
> > controlling the addFragSize variable from the standard. Perhaps
> > it should in fact be called add_frag_size (with an explanation
> > that the "additional" means "above the 64B" which are required in
> > Ethernet, and which are accounted for by the "1" in the 1 + X
> > formula)?
>
> Sounds good :-) Will add a comment with the standard reference and
> change the name to 'add_frag_size'.

I think you should be making references to the IEEE 802.3-2018, that
will age better, and a lot more people have that handy.
I believe the go-to definition for the additional fragment size can be
found in clause 30.12.2.1.37 aLldpXdot3LocAddFragSize.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [Intel-wired-lan] [PATCH net-next v1 1/9] ethtool: Add support for configuring frame preemption
@ 2020-12-08  0:27             ` Vladimir Oltean
  0 siblings, 0 replies; 54+ messages in thread
From: Vladimir Oltean @ 2020-12-08  0:27 UTC (permalink / raw)
  To: intel-wired-lan

On Mon, Dec 07, 2020 at 04:24:02PM -0800, Vinicius Costa Gomes wrote:
> Jakub Kicinski <kuba@kernel.org> writes:
>
> > On Mon, 07 Dec 2020 14:11:48 -0800 Vinicius Costa Gomes wrote:
> >> Jakub Kicinski <kuba@kernel.org> writes:
> >> >> + * @min_frag_size_mult: Minimum size for all non-final fragment size,
> >> >> + * expressed in terms of X in '(1 + X)*64 + 4'
> >> >
> >> > Is this way of expressing the min frag size from the standard?
> >> >
> >>
> >> The standard has this: "A 2-bit integer value indicating, in units of 64
> >> octets, the minimum number of octets over 64 octets required in
> >> non-final fragments by the receiver" from IEEE 802.3br-2016, Table
> >> 79-7a.
> >
> > Thanks! Let's drop the _mult suffix and add a mention of this
> > controlling the addFragSize variable from the standard. Perhaps
> > it should in fact be called add_frag_size (with an explanation
> > that the "additional" means "above the 64B" which are required in
> > Ethernet, and which are accounted for by the "1" in the 1 + X
> > formula)?
>
> Sounds good :-) Will add a comment with the standard reference and
> change the name to 'add_frag_size'.

I think you should be making references to the IEEE 802.3-2018, that
will age better, and a lot more people have that handy.
I believe the go-to definition for the additional fragment size can be
found in clause 30.12.2.1.37 aLldpXdot3LocAddFragSize.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH net-next v1 0/9] ethtool: Add support for frame preemption
  2020-12-07 23:12       ` [Intel-wired-lan] " Vladimir Oltean
@ 2020-12-08  0:34         ` Vinicius Costa Gomes
  -1 siblings, 0 replies; 54+ messages in thread
From: Vinicius Costa Gomes @ 2020-12-08  0:34 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: Jakub Kicinski, netdev, jhs, xiyou.wangcong, jiri, m-karicheri2,
	Jose.Abreu, Po Liu, intel-wired-lan, anthony.l.nguyen

Vladimir Oltean <vladimir.oltean@nxp.com> writes:

> On Mon, Dec 07, 2020 at 02:49:35PM -0800, Vinicius Costa Gomes wrote:
>> Jakub Kicinski <kuba@kernel.org> writes:
>>
>> > On Tue,  1 Dec 2020 20:53:16 -0800 Vinicius Costa Gomes wrote:
>> >> $ tc qdisc replace dev $IFACE parent root handle 100 taprio \
>> >>       num_tc 3 \
>> >>       map 2 2 1 0 2 2 2 2 2 2 2 2 2 2 2 2 \
>> >>       queues 1@0 1@1 2@2 \
>> >>       base-time $BASE_TIME \
>> >>       sched-entry S 0f 10000000 \
>> >>       preempt 1110 \
>> >>       flags 0x2
>> >>
>> >> The "preempt" parameter is the only difference, it configures which
>> >> queues are marked as preemptible, in this example, queue 0 is marked
>> >> as "not preemptible", so it is express, the rest of the four queues
>> >> are preemptible.
>> >
>> > Does it make more sense for the individual queues to be preemptible
>> > or not, or is it better controlled at traffic class level?
>> > I was looking at patch 2, and 32 queues isn't that many these days..
>> > We either need a larger type there or configure this based on classes.
>>
>> I can set more future proof sizes for expressing the queues, sure, but
>> the issue, I think, is that frame preemption has dimishing returns with
>> link speed: at 2.5G the latency improvements are on the order of single
>> digit microseconds. At greater speeds the improvements are even less
>> noticeable.
>
> You could look at it another way.
> You can enable jumbo frames in your network, and your latency-sensitive
> traffic would not suffer as long as the jumbo frames are preemptible.
>

Speaking of jumbo frame, that's something that the standards are
missing, TSN features + jumbo frames will leave a lot of stuff up to the
implementation.

>> The only adapters that I see that support frame preemtion have 8 queues
>> or less.
>>
>> The idea of configuring frame preemption based on classes is
>> interesting. I will play with it, and see how it looks.
>
> I admit I never understood why you insist on configuring TSN offloads
> per hardware queue and not per traffic class.

So, I am sorry that I wasn't able to fully understand what you were
saying, then.

I always thought that you were thinking more that the driver was
responsible of making the 'traffic class to queue' translation than the
configuration interface for frame preemption to the user (taprio,
mqprio, etc) should be in terms of traffic classes, instead of queues.

My bad.


Cheers,
-- 
Vinicius

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [Intel-wired-lan] [PATCH net-next v1 0/9] ethtool: Add support for frame preemption
@ 2020-12-08  0:34         ` Vinicius Costa Gomes
  0 siblings, 0 replies; 54+ messages in thread
From: Vinicius Costa Gomes @ 2020-12-08  0:34 UTC (permalink / raw)
  To: intel-wired-lan

Vladimir Oltean <vladimir.oltean@nxp.com> writes:

> On Mon, Dec 07, 2020 at 02:49:35PM -0800, Vinicius Costa Gomes wrote:
>> Jakub Kicinski <kuba@kernel.org> writes:
>>
>> > On Tue,  1 Dec 2020 20:53:16 -0800 Vinicius Costa Gomes wrote:
>> >> $ tc qdisc replace dev $IFACE parent root handle 100 taprio \
>> >>       num_tc 3 \
>> >>       map 2 2 1 0 2 2 2 2 2 2 2 2 2 2 2 2 \
>> >>       queues 1 at 0 1 at 1 2 at 2 \
>> >>       base-time $BASE_TIME \
>> >>       sched-entry S 0f 10000000 \
>> >>       preempt 1110 \
>> >>       flags 0x2
>> >>
>> >> The "preempt" parameter is the only difference, it configures which
>> >> queues are marked as preemptible, in this example, queue 0 is marked
>> >> as "not preemptible", so it is express, the rest of the four queues
>> >> are preemptible.
>> >
>> > Does it make more sense for the individual queues to be preemptible
>> > or not, or is it better controlled at traffic class level?
>> > I was looking at patch 2, and 32 queues isn't that many these days..
>> > We either need a larger type there or configure this based on classes.
>>
>> I can set more future proof sizes for expressing the queues, sure, but
>> the issue, I think, is that frame preemption has dimishing returns with
>> link speed: at 2.5G the latency improvements are on the order of single
>> digit microseconds. At greater speeds the improvements are even less
>> noticeable.
>
> You could look at it another way.
> You can enable jumbo frames in your network, and your latency-sensitive
> traffic would not suffer as long as the jumbo frames are preemptible.
>

Speaking of jumbo frame, that's something that the standards are
missing, TSN features + jumbo frames will leave a lot of stuff up to the
implementation.

>> The only adapters that I see that support frame preemtion have 8 queues
>> or less.
>>
>> The idea of configuring frame preemption based on classes is
>> interesting. I will play with it, and see how it looks.
>
> I admit I never understood why you insist on configuring TSN offloads
> per hardware queue and not per traffic class.

So, I am sorry that I wasn't able to fully understand what you were
saying, then.

I always thought that you were thinking more that the driver was
responsible of making the 'traffic class to queue' translation than the
configuration interface for frame preemption to the user (taprio,
mqprio, etc) should be in terms of traffic classes, instead of queues.

My bad.


Cheers,
-- 
Vinicius

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH net-next v1 1/9] ethtool: Add support for configuring frame preemption
  2020-12-08  0:27             ` [Intel-wired-lan] " Vladimir Oltean
@ 2020-12-08  0:48               ` Jakub Kicinski
  -1 siblings, 0 replies; 54+ messages in thread
From: Jakub Kicinski @ 2020-12-08  0:48 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: Vinicius Costa Gomes, netdev, jhs, xiyou.wangcong, jiri,
	m-karicheri2, Jose.Abreu, Po Liu, intel-wired-lan,
	anthony.l.nguyen

On Tue, 8 Dec 2020 00:27:31 +0000 Vladimir Oltean wrote:
> On Mon, Dec 07, 2020 at 04:24:02PM -0800, Vinicius Costa Gomes wrote:
> > Jakub Kicinski <kuba@kernel.org> writes:
> >  
> > > On Mon, 07 Dec 2020 14:11:48 -0800 Vinicius Costa Gomes wrote:  
> > >> Jakub Kicinski <kuba@kernel.org> writes:  
> > >> >> + * @min_frag_size_mult: Minimum size for all non-final fragment size,
> > >> >> + * expressed in terms of X in '(1 + X)*64 + 4'  
> > >> >
> > >> > Is this way of expressing the min frag size from the standard?
> > >> >  
> > >>
> > >> The standard has this: "A 2-bit integer value indicating, in units of 64
> > >> octets, the minimum number of octets over 64 octets required in
> > >> non-final fragments by the receiver" from IEEE 802.3br-2016, Table
> > >> 79-7a.  
> > >
> > > Thanks! Let's drop the _mult suffix and add a mention of this
> > > controlling the addFragSize variable from the standard. Perhaps
> > > it should in fact be called add_frag_size (with an explanation
> > > that the "additional" means "above the 64B" which are required in
> > > Ethernet, and which are accounted for by the "1" in the 1 + X
> > > formula)?  
> >
> > Sounds good :-) Will add a comment with the standard reference and
> > change the name to 'add_frag_size'.  
> 
> I think you should be making references to the IEEE 802.3-2018, that
> will age better, and a lot more people have that handy.
> I believe the go-to definition for the additional fragment size can be
> found in clause 30.12.2.1.37 aLldpXdot3LocAddFragSize.

That's the LLDP incarnation of it. The variable is defined in:

99.4.7.3 Variables

Probably better mention 30.14.1.7 aMACMergeAddFragSize if we want a MIB
reference.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [Intel-wired-lan] [PATCH net-next v1 1/9] ethtool: Add support for configuring frame preemption
@ 2020-12-08  0:48               ` Jakub Kicinski
  0 siblings, 0 replies; 54+ messages in thread
From: Jakub Kicinski @ 2020-12-08  0:48 UTC (permalink / raw)
  To: intel-wired-lan

On Tue, 8 Dec 2020 00:27:31 +0000 Vladimir Oltean wrote:
> On Mon, Dec 07, 2020 at 04:24:02PM -0800, Vinicius Costa Gomes wrote:
> > Jakub Kicinski <kuba@kernel.org> writes:
> >  
> > > On Mon, 07 Dec 2020 14:11:48 -0800 Vinicius Costa Gomes wrote:  
> > >> Jakub Kicinski <kuba@kernel.org> writes:  
> > >> >> + * @min_frag_size_mult: Minimum size for all non-final fragment size,
> > >> >> + * expressed in terms of X in '(1 + X)*64 + 4'  
> > >> >
> > >> > Is this way of expressing the min frag size from the standard?
> > >> >  
> > >>
> > >> The standard has this: "A 2-bit integer value indicating, in units of 64
> > >> octets, the minimum number of octets over 64 octets required in
> > >> non-final fragments by the receiver" from IEEE 802.3br-2016, Table
> > >> 79-7a.  
> > >
> > > Thanks! Let's drop the _mult suffix and add a mention of this
> > > controlling the addFragSize variable from the standard. Perhaps
> > > it should in fact be called add_frag_size (with an explanation
> > > that the "additional" means "above the 64B" which are required in
> > > Ethernet, and which are accounted for by the "1" in the 1 + X
> > > formula)?  
> >
> > Sounds good :-) Will add a comment with the standard reference and
> > change the name to 'add_frag_size'.  
> 
> I think you should be making references to the IEEE 802.3-2018, that
> will age better, and a lot more people have that handy.
> I believe the go-to definition for the additional fragment size can be
> found in clause 30.12.2.1.37 aLldpXdot3LocAddFragSize.

That's the LLDP incarnation of it. The variable is defined in:

99.4.7.3 Variables

Probably better mention 30.14.1.7 aMACMergeAddFragSize if we want a MIB
reference.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH net-next v1 1/9] ethtool: Add support for configuring frame preemption
  2020-12-07 22:11       ` [Intel-wired-lan] " Vinicius Costa Gomes
@ 2020-12-08  6:22         ` Michal Kubecek
  -1 siblings, 0 replies; 54+ messages in thread
From: Michal Kubecek @ 2020-12-08  6:22 UTC (permalink / raw)
  To: Vinicius Costa Gomes
  Cc: Jakub Kicinski, netdev, jhs, xiyou.wangcong, jiri, m-karicheri2,
	vladimir.oltean, Jose.Abreu, po.liu, intel-wired-lan,
	anthony.l.nguyen

On Mon, Dec 07, 2020 at 02:11:48PM -0800, Vinicius Costa Gomes wrote:
> Jakub Kicinski <kuba@kernel.org> writes:
> >> + * @min_frag_size_mult: Minimum size for all non-final fragment size,
> >> + * expressed in terms of X in '(1 + X)*64 + 4'
> >
> > Is this way of expressing the min frag size from the standard?
> >
> 
> The standard has this: "A 2-bit integer value indicating, in units of 64
> octets, the minimum number of octets over 64 octets required in
> non-final fragments by the receiver" from IEEE 802.3br-2016, Table
> 79-7a.

Can we be sure that newer version of the standard cannot change this,
e.g. come with a finer granularity? Perhaps it would be safer to express
the size in bytes in the userspace API and translate to this internal
representation in common ethtool code.

Also, please don't forget to update Documentation/networking/ethtool-netlink.rst

Michal

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [Intel-wired-lan] [PATCH net-next v1 1/9] ethtool: Add support for configuring frame preemption
@ 2020-12-08  6:22         ` Michal Kubecek
  0 siblings, 0 replies; 54+ messages in thread
From: Michal Kubecek @ 2020-12-08  6:22 UTC (permalink / raw)
  To: intel-wired-lan

On Mon, Dec 07, 2020 at 02:11:48PM -0800, Vinicius Costa Gomes wrote:
> Jakub Kicinski <kuba@kernel.org> writes:
> >> + * @min_frag_size_mult: Minimum size for all non-final fragment size,
> >> + * expressed in terms of X in '(1 + X)*64 + 4'
> >
> > Is this way of expressing the min frag size from the standard?
> >
> 
> The standard has this: "A 2-bit integer value indicating, in units of 64
> octets, the minimum number of octets over 64 octets required in
> non-final fragments by the receiver" from IEEE 802.3br-2016, Table
> 79-7a.

Can we be sure that newer version of the standard cannot change this,
e.g. come with a finer granularity? Perhaps it would be safer to express
the size in bytes in the userspace API and translate to this internal
representation in common ethtool code.

Also, please don't forget to update Documentation/networking/ethtool-netlink.rst

Michal

^ permalink raw reply	[flat|nested] 54+ messages in thread

end of thread, other threads:[~2020-12-08  6:23 UTC | newest]

Thread overview: 54+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-12-02  4:53 [PATCH net-next v1 0/9] ethtool: Add support for frame preemption Vinicius Costa Gomes
2020-12-02  4:53 ` [Intel-wired-lan] " Vinicius Costa Gomes
2020-12-02  4:53 ` [PATCH net-next v1 1/9] ethtool: Add support for configuring " Vinicius Costa Gomes
2020-12-02  4:53   ` [Intel-wired-lan] " Vinicius Costa Gomes
2020-12-03  1:53   ` Jakub Kicinski
2020-12-03  1:53     ` [Intel-wired-lan] " Jakub Kicinski
2020-12-05 17:43   ` Jakub Kicinski
2020-12-05 17:43     ` [Intel-wired-lan] " Jakub Kicinski
2020-12-07 22:11     ` Vinicius Costa Gomes
2020-12-07 22:11       ` [Intel-wired-lan] " Vinicius Costa Gomes
2020-12-07 23:21       ` Jakub Kicinski
2020-12-07 23:21         ` [Intel-wired-lan] " Jakub Kicinski
2020-12-08  0:24         ` Vinicius Costa Gomes
2020-12-08  0:24           ` [Intel-wired-lan] " Vinicius Costa Gomes
2020-12-08  0:27           ` Vladimir Oltean
2020-12-08  0:27             ` [Intel-wired-lan] " Vladimir Oltean
2020-12-08  0:48             ` Jakub Kicinski
2020-12-08  0:48               ` [Intel-wired-lan] " Jakub Kicinski
2020-12-08  6:22       ` Michal Kubecek
2020-12-08  6:22         ` [Intel-wired-lan] " Michal Kubecek
2020-12-02  4:53 ` [PATCH net-next v1 2/9] taprio: Add support for frame preemption offload Vinicius Costa Gomes
2020-12-02  4:53   ` [Intel-wired-lan] " Vinicius Costa Gomes
2020-12-02  4:53 ` [PATCH net-next v1 3/9] igc: Set the RX packet buffer size for TSN mode Vinicius Costa Gomes
2020-12-02  4:53   ` [Intel-wired-lan] " Vinicius Costa Gomes
2020-12-02  4:53 ` [PATCH net-next v1 4/9] igc: Only dump registers if configured to dump HW information Vinicius Costa Gomes
2020-12-02  4:53   ` [Intel-wired-lan] " Vinicius Costa Gomes
2020-12-02  4:53 ` [PATCH net-next v1 5/9] igc: Avoid TX Hangs because long cycles Vinicius Costa Gomes
2020-12-02  4:53   ` [Intel-wired-lan] " Vinicius Costa Gomes
2020-12-02  4:53 ` [PATCH net-next v1 6/9] igc: Add support for tuning frame preemption via ethtool Vinicius Costa Gomes
2020-12-02  4:53   ` [Intel-wired-lan] " Vinicius Costa Gomes
2020-12-05 18:00   ` Jakub Kicinski
2020-12-05 18:00     ` [Intel-wired-lan] " Jakub Kicinski
2020-12-07 22:15     ` Vinicius Costa Gomes
2020-12-07 22:15       ` [Intel-wired-lan] " Vinicius Costa Gomes
2020-12-07 23:22       ` Jakub Kicinski
2020-12-07 23:22         ` [Intel-wired-lan] " Jakub Kicinski
2020-12-02  4:53 ` [PATCH net-next v1 7/9] igc: Add support for Frame Preemption offload Vinicius Costa Gomes
2020-12-02  4:53   ` [Intel-wired-lan] " Vinicius Costa Gomes
2020-12-02  4:53 ` [PATCH net-next v1 8/9] igc: Add support for exposing frame preemption stats registers Vinicius Costa Gomes
2020-12-02  4:53   ` [Intel-wired-lan] " Vinicius Costa Gomes
2020-12-05 17:59   ` Jakub Kicinski
2020-12-05 17:59     ` [Intel-wired-lan] " Jakub Kicinski
2020-12-07 22:29     ` Vinicius Costa Gomes
2020-12-07 22:29       ` [Intel-wired-lan] " Vinicius Costa Gomes
2020-12-02  4:53 ` [PATCH net-next v1 9/9] igc: Separate TSN configurations that can be updated Vinicius Costa Gomes
2020-12-02  4:53   ` [Intel-wired-lan] " Vinicius Costa Gomes
2020-12-05 17:50 ` [PATCH net-next v1 0/9] ethtool: Add support for frame preemption Jakub Kicinski
2020-12-05 17:50   ` [Intel-wired-lan] " Jakub Kicinski
2020-12-07 22:49   ` Vinicius Costa Gomes
2020-12-07 22:49     ` [Intel-wired-lan] " Vinicius Costa Gomes
2020-12-07 23:12     ` Vladimir Oltean
2020-12-07 23:12       ` [Intel-wired-lan] " Vladimir Oltean
2020-12-08  0:34       ` Vinicius Costa Gomes
2020-12-08  0:34         ` [Intel-wired-lan] " Vinicius Costa Gomes

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.