netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH iproute2-next] tc/skbmod: Introduce SKBMOD_F_ECN option
@ 2021-07-21 23:20 Peilin Ye
  2021-08-02 16:28 ` David Ahern
                   ` (3 more replies)
  0 siblings, 4 replies; 8+ messages in thread
From: Peilin Ye @ 2021-07-21 23:20 UTC (permalink / raw)
  To: netdev, Stephen Hemminger, Jamal Hadi Salim, Cong Wang, Jiri Pirko
  Cc: Cong Wang, Peilin Ye, Peilin Ye

From: Peilin Ye <peilin.ye@bytedance.com>

Recently we added SKBMOD_F_ECN option support to the kernel; support it in
the tc-skbmod(8) front end, and update its man page accordingly.

The 2 least significant bits of the Traffic Class field in IPv4 and IPv6
headers are used to represent different ECN states [1]:

	0b00: "Non ECN-Capable Transport", Non-ECT
	0b10: "ECN Capable Transport", ECT(0)
	0b01: "ECN Capable Transport", ECT(1)
	0b11: "Congestion Encountered", CE

This new option, "ecn", marks ECT(0) and ECT(1) IPv{4,6} packets as CE,
which is useful for ECN-based rate limiting.  For example:

	$ tc filter add dev eth0 parent 1: protocol ip prio 10 \
		u32 match ip protocol 1 0xff flowid 1:2 \
		action skbmod \
		ecn

The updated tc-skbmod SYNOPSIS looks like the following:

	tc ... action skbmod { set SETTABLE | swap SWAPPABLE | ecn } ...

Only one of "set", "swap" or "ecn" shall be used in a single tc-skbmod
command.  Trying to use more than one of them at a time is considered
undefined behavior; pipe multiple tc-skbmod commands together instead.
"set" and "swap" only affect Ethernet packets, while "ecn" only affects
IP packets.

Depends on kernel patch "net/sched: act_skbmod: Add SKBMOD_F_ECN option
support", as well as iproute2 patch "tc/skbmod: Remove misinformation
about the swap action".

[1] https://en.wikipedia.org/wiki/Explicit_Congestion_Notification

Reviewed-by: Cong Wang <cong.wang@bytedance.com>
Signed-off-by: Peilin Ye <peilin.ye@bytedance.com>
---
Hi all,

The corresponding kernel patch is here, which is currently pending
for review:
https://lore.kernel.org/netdev/f5c5a81d6674a8f4838684ac52ed66da83f92499.1626899889.git.peilin.ye@bytedance.com/T/#u

It also depends on this iproute2 patch, which is also pending:
https://lore.kernel.org/netdev/20210720192145.20166-1-yepeilin.cs@gmail.com/

Thanks,
Peilin Ye

 include/uapi/linux/tc_act/tc_skbmod.h |  1 +
 man/man8/tc-skbmod.8                  | 38 ++++++++++++++++++++-------
 tc/m_skbmod.c                         |  8 +++++-
 3 files changed, 37 insertions(+), 10 deletions(-)

diff --git a/include/uapi/linux/tc_act/tc_skbmod.h b/include/uapi/linux/tc_act/tc_skbmod.h
index c525b3503797..af6ef2cfbf3d 100644
--- a/include/uapi/linux/tc_act/tc_skbmod.h
+++ b/include/uapi/linux/tc_act/tc_skbmod.h
@@ -17,6 +17,7 @@
 #define SKBMOD_F_SMAC	0x2
 #define SKBMOD_F_ETYPE	0x4
 #define SKBMOD_F_SWAPMAC 0x8
+#define SKBMOD_F_ECN	0x10
 
 struct tc_skbmod {
 	tc_gen;
diff --git a/man/man8/tc-skbmod.8 b/man/man8/tc-skbmod.8
index 76512311b17d..52eaf989e80b 100644
--- a/man/man8/tc-skbmod.8
+++ b/man/man8/tc-skbmod.8
@@ -8,7 +8,8 @@ skbmod - user-friendly packet editor action
 .BR tc " ... " "action skbmod " "{ " "set "
 .IR SETTABLE " | "
 .BI swap " SWAPPABLE"
-.RI " } [ " CONTROL " ] [ "
+.RB " | " ecn
+.RI "} [ " CONTROL " ] [ "
 .BI index " INDEX "
 ]
 
@@ -37,6 +38,12 @@ action. Instead of having to manually edit 8-, 16-, or 32-bit chunks of an
 ethernet header,
 .B skbmod
 allows complete substitution of supported elements.
+Action must be one of
+.BR set ", " swap " and " ecn "."
+.BR set " and " swap
+only affect Ethernet packets, while
+.B ecn
+only affects IP packets.
 .SH OPTIONS
 .TP
 .BI dmac " DMAC"
@@ -51,6 +58,10 @@ Change the ethertype to the specified value.
 .BI mac
 Used to swap mac addresses.
 .TP
+.B ecn
+Used to mark ECN Capable Transport (ECT) IP packets as Congestion Encountered (CE).
+Does not affect Non ECN-Capable Transport (Non-ECT) packets.
+.TP
 .I CONTROL
 The following keywords allow to control how the tree of qdisc, classes,
 filters and actions is further traversed after this action.
@@ -115,7 +126,7 @@ tc filter add dev eth5 parent 1: protocol ip prio 10 \\
 .EE
 .RE
 
-Finally, swap the destination and source mac addresses in the header:
+To swap the destination and source mac addresses in the Ethernet header:
 
 .RS
 .EX
@@ -126,13 +137,22 @@ tc filter add dev eth3 parent 1: protocol ip prio 10 \\
 .EE
 .RE
 
-However, trying to
-.B set
-and
-.B swap
-in a single
-.B skbmod
-command will cause undefined behavior.
+Finally, to mark the CE codepoint in the IP header for ECN Capable Transport (ECT) packets:
+
+.RS
+.EX
+tc filter add dev eth0 parent 1: protocol ip prio 10 \\
+	u32 match ip protocol 1 0xff flowid 1:2 \\
+	action skbmod \\
+	ecn
+.EE
+.RE
+
+Only one of
+.BR set ", " swap " and " ecn
+shall be used in a single command.
+Trying to use more than one of them in a single command is considered undefined behavior; pipe
+multiple commands together instead.
 
 .SH SEE ALSO
 .BR tc (8),
diff --git a/tc/m_skbmod.c b/tc/m_skbmod.c
index 3fe30651a7d8..8d8bac5bc481 100644
--- a/tc/m_skbmod.c
+++ b/tc/m_skbmod.c
@@ -28,7 +28,7 @@
 static void skbmod_explain(void)
 {
 	fprintf(stderr,
-		"Usage:... skbmod { set <SETTABLE> | swap <SWAPPABLE> } [CONTROL] [index INDEX]\n"
+		"Usage:... skbmod { set <SETTABLE> | swap <SWAPPABLE> | ecn } [CONTROL] [index INDEX]\n"
 		"where SETTABLE is: [dmac DMAC] [smac SMAC] [etype ETYPE]\n"
 		"where SWAPPABLE is: \"mac\" to swap mac addresses\n"
 		"\tDMAC := 6 byte Destination MAC address\n"
@@ -111,6 +111,9 @@ static int parse_skbmod(struct action_util *a, int *argc_p, char ***argv_p,
 			p.flags |= SKBMOD_F_SMAC;
 			fprintf(stderr, "src MAC address <%s>\n", saddr);
 			ok += 1;
+		} else if (matches(*argv, "ecn") == 0) {
+			p.flags |= SKBMOD_F_ECN;
+			ok += 1;
 		} else if (matches(*argv, "help") == 0) {
 			skbmod_usage();
 		} else {
@@ -211,6 +214,9 @@ static int print_skbmod(struct action_util *au, FILE *f, struct rtattr *arg)
 	if (p->flags & SKBMOD_F_SWAPMAC)
 		fprintf(f, "swap mac ");
 
+	if (p->flags & SKBMOD_F_ECN)
+		fprintf(f, "ecn ");
+
 	fprintf(f, "\n\t index %u ref %d bind %d", p->index, p->refcnt,
 		p->bindcnt);
 	if (show_stats) {
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH iproute2-next] tc/skbmod: Introduce SKBMOD_F_ECN option
  2021-07-21 23:20 [PATCH iproute2-next] tc/skbmod: Introduce SKBMOD_F_ECN option Peilin Ye
@ 2021-08-02 16:28 ` David Ahern
  2021-08-02 17:54 ` [PATCH iproute2-next v2] " Peilin Ye
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 8+ messages in thread
From: David Ahern @ 2021-08-02 16:28 UTC (permalink / raw)
  To: Peilin Ye, netdev, Stephen Hemminger, Jamal Hadi Salim,
	Cong Wang, Jiri Pirko
  Cc: Cong Wang, Peilin Ye

On 7/21/21 5:20 PM, Peilin Ye wrote:
> From: Peilin Ye <peilin.ye@bytedance.com>
> 
> Recently we added SKBMOD_F_ECN option support to the kernel; support it in
> the tc-skbmod(8) front end, and update its man page accordingly.
> 
> The 2 least significant bits of the Traffic Class field in IPv4 and IPv6
> headers are used to represent different ECN states [1]:
> 
> 	0b00: "Non ECN-Capable Transport", Non-ECT
> 	0b10: "ECN Capable Transport", ECT(0)
> 	0b01: "ECN Capable Transport", ECT(1)
> 	0b11: "Congestion Encountered", CE
> 
> This new option, "ecn", marks ECT(0) and ECT(1) IPv{4,6} packets as CE,
> which is useful for ECN-based rate limiting.  For example:
> 
> 	$ tc filter add dev eth0 parent 1: protocol ip prio 10 \
> 		u32 match ip protocol 1 0xff flowid 1:2 \
> 		action skbmod \
> 		ecn
> 
> The updated tc-skbmod SYNOPSIS looks like the following:
> 
> 	tc ... action skbmod { set SETTABLE | swap SWAPPABLE | ecn } ...
> 
> Only one of "set", "swap" or "ecn" shall be used in a single tc-skbmod
> command.  Trying to use more than one of them at a time is considered
> undefined behavior; pipe multiple tc-skbmod commands together instead.
> "set" and "swap" only affect Ethernet packets, while "ecn" only affects
> IP packets.
> 
> Depends on kernel patch "net/sched: act_skbmod: Add SKBMOD_F_ECN option
> support", as well as iproute2 patch "tc/skbmod: Remove misinformation
> about the swap action".
> 
> [1] https://en.wikipedia.org/wiki/Explicit_Congestion_Notification
> 
> Reviewed-by: Cong Wang <cong.wang@bytedance.com>
> Signed-off-by: Peilin Ye <peilin.ye@bytedance.com>
> ---
> Hi all,
> 
> The corresponding kernel patch is here, which is currently pending
> for review:
> https://lore.kernel.org/netdev/f5c5a81d6674a8f4838684ac52ed66da83f92499.1626899889.git.peilin.ye@bytedance.com/T/#u
> 
> It also depends on this iproute2 patch, which is also pending:
> https://lore.kernel.org/netdev/20210720192145.20166-1-yepeilin.cs@gmail.com/
> 
> Thanks,
> Peilin Ye
> 

man page update has conflicts; please rebase.
Thanks,


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH iproute2-next v2] tc/skbmod: Introduce SKBMOD_F_ECN option
  2021-07-21 23:20 [PATCH iproute2-next] tc/skbmod: Introduce SKBMOD_F_ECN option Peilin Ye
  2021-08-02 16:28 ` David Ahern
@ 2021-08-02 17:54 ` Peilin Ye
  2021-08-04 15:25   ` David Ahern
  2021-08-02 20:51 ` [PATCH iproute2-next] tc/ingress: Introduce clsact egress mini-Qdisc option Peilin Ye
  2021-08-04 18:15 ` [PATCH iproute2-next v3] tc/skbmod: Introduce SKBMOD_F_ECN option Peilin Ye
  3 siblings, 1 reply; 8+ messages in thread
From: Peilin Ye @ 2021-08-02 17:54 UTC (permalink / raw)
  To: netdev, David Ahern
  Cc: Stephen Hemminger, Jamal Hadi Salim, Cong Wang, Jiri Pirko,
	Cong Wang, Peilin Ye, Peilin Ye

From: Peilin Ye <peilin.ye@bytedance.com>

Recently we added SKBMOD_F_ECN option support to the kernel; support it in
the tc-skbmod(8) front end, and update its man page accordingly.

The 2 least significant bits of the Traffic Class field in IPv4 and IPv6
headers are used to represent different ECN states [1]:

	0b00: "Non ECN-Capable Transport", Non-ECT
	0b10: "ECN Capable Transport", ECT(0)
	0b01: "ECN Capable Transport", ECT(1)
	0b11: "Congestion Encountered", CE

This new option, "ecn", marks ECT(0) and ECT(1) IPv{4,6} packets as CE,
which is useful for ECN-based rate limiting.  For example:

	$ tc filter add dev eth0 parent 1: protocol ip prio 10 \
		u32 match ip protocol 1 0xff flowid 1:2 \
		action skbmod \
		ecn

The updated tc-skbmod SYNOPSIS looks like the following:

	tc ... action skbmod { set SETTABLE | swap SWAPPABLE | ecn } ...

Only one of "set", "swap" or "ecn" shall be used in a single tc-skbmod
command.  Trying to use more than one of them at a time is considered
undefined behavior; pipe multiple tc-skbmod commands together instead.
"set" and "swap" only affect Ethernet packets, while "ecn" only affects
IP packets.

Depends on kernel patch "net/sched: act_skbmod: Add SKBMOD_F_ECN option
support", as well as iproute2 patch "tc/skbmod: Remove misinformation
about the swap action".

[1] https://en.wikipedia.org/wiki/Explicit_Congestion_Notification

Reviewed-by: Cong Wang <cong.wang@bytedance.com>
Signed-off-by: Peilin Ye <peilin.ye@bytedance.com>
---
Hi David,

Rebased on iproute2-next, should hunk now; thank you!

There will be a conflict next time you merge iproute2 into iproute2-next
because of this commit:

"tc/skbmod: Remove misinformation about the swap action"
https://git.kernel.org/pub/scm/network/iproute2/iproute2.git/commit/?id=c06d313d86c1acb8dd72589816301853ff5a4ac4

Please just ignore its code change since it is now superseded by this v2.

Thanks!
Peilin Ye

Change since v1:
    - rebased on iproute2-next (David)

 man/man8/tc-skbmod.8 | 46 ++++++++++++++++++++++++++++++++------------
 tc/m_skbmod.c        | 11 ++++++++---
 2 files changed, 42 insertions(+), 15 deletions(-)

diff --git a/man/man8/tc-skbmod.8 b/man/man8/tc-skbmod.8
index eb3c38fa6bf3..52eaf989e80b 100644
--- a/man/man8/tc-skbmod.8
+++ b/man/man8/tc-skbmod.8
@@ -5,12 +5,13 @@ skbmod - user-friendly packet editor action
 .SH SYNOPSIS
 .in +8
 .ti -8
-.BR tc " ... " "action skbmod " "{ [ " "set "
-.IR SETTABLE " ] [ "
+.BR tc " ... " "action skbmod " "{ " "set "
+.IR SETTABLE " | "
 .BI swap " SWAPPABLE"
-.RI " ] [ " CONTROL " ] [ "
+.RB " | " ecn
+.RI "} [ " CONTROL " ] [ "
 .BI index " INDEX "
-] }
+]
 
 .ti -8
 .IR SETTABLE " := "
@@ -25,6 +26,7 @@ skbmod - user-friendly packet editor action
 .IR SWAPPABLE " := "
 .B mac
 .ti -8
+
 .IR CONTROL " := {"
 .BR reclassify " | " pipe " | " drop " | " shot " | " continue " | " pass " }"
 .SH DESCRIPTION
@@ -36,6 +38,12 @@ action. Instead of having to manually edit 8-, 16-, or 32-bit chunks of an
 ethernet header,
 .B skbmod
 allows complete substitution of supported elements.
+Action must be one of
+.BR set ", " swap " and " ecn "."
+.BR set " and " swap
+only affect Ethernet packets, while
+.B ecn
+only affects IP packets.
 .SH OPTIONS
 .TP
 .BI dmac " DMAC"
@@ -48,10 +56,11 @@ Change the source mac to the specified address.
 Change the ethertype to the specified value.
 .TP
 .BI mac
-Used to swap mac addresses. The
-.B swap mac
-directive is performed
-after any outstanding D/SMAC changes.
+Used to swap mac addresses.
+.TP
+.B ecn
+Used to mark ECN Capable Transport (ECT) IP packets as Congestion Encountered (CE).
+Does not affect Non ECN-Capable Transport (Non-ECT) packets.
 .TP
 .I CONTROL
 The following keywords allow to control how the tree of qdisc, classes,
@@ -117,7 +126,7 @@ tc filter add dev eth5 parent 1: protocol ip prio 10 \\
 .EE
 .RE
 
-Finally, swap the destination and source mac addresses in the header:
+To swap the destination and source mac addresses in the Ethernet header:
 
 .RS
 .EX
@@ -128,9 +137,22 @@ tc filter add dev eth3 parent 1: protocol ip prio 10 \\
 .EE
 .RE
 
-As mentioned above, the swap action will occur after any
-.B " smac/dmac "
-substitutions are executed, if they are present.
+Finally, to mark the CE codepoint in the IP header for ECN Capable Transport (ECT) packets:
+
+.RS
+.EX
+tc filter add dev eth0 parent 1: protocol ip prio 10 \\
+	u32 match ip protocol 1 0xff flowid 1:2 \\
+	action skbmod \\
+	ecn
+.EE
+.RE
+
+Only one of
+.BR set ", " swap " and " ecn
+shall be used in a single command.
+Trying to use more than one of them in a single command is considered undefined behavior; pipe
+multiple commands together instead.
 
 .SH SEE ALSO
 .BR tc (8),
diff --git a/tc/m_skbmod.c b/tc/m_skbmod.c
index e13d3f16bfcb..8d8bac5bc481 100644
--- a/tc/m_skbmod.c
+++ b/tc/m_skbmod.c
@@ -28,10 +28,9 @@
 static void skbmod_explain(void)
 {
 	fprintf(stderr,
-		"Usage:... skbmod {[set <SETTABLE>] [swap <SWAPABLE>]} [CONTROL] [index INDEX]\n"
+		"Usage:... skbmod { set <SETTABLE> | swap <SWAPPABLE> | ecn } [CONTROL] [index INDEX]\n"
 		"where SETTABLE is: [dmac DMAC] [smac SMAC] [etype ETYPE]\n"
-		"where SWAPABLE is: \"mac\" to swap mac addresses\n"
-		"note: \"swap mac\" is done after any outstanding D/SMAC change\n"
+		"where SWAPPABLE is: \"mac\" to swap mac addresses\n"
 		"\tDMAC := 6 byte Destination MAC address\n"
 		"\tSMAC := optional 6 byte Source MAC address\n"
 		"\tETYPE := optional 16 bit ethertype\n"
@@ -112,6 +111,9 @@ static int parse_skbmod(struct action_util *a, int *argc_p, char ***argv_p,
 			p.flags |= SKBMOD_F_SMAC;
 			fprintf(stderr, "src MAC address <%s>\n", saddr);
 			ok += 1;
+		} else if (matches(*argv, "ecn") == 0) {
+			p.flags |= SKBMOD_F_ECN;
+			ok += 1;
 		} else if (matches(*argv, "help") == 0) {
 			skbmod_usage();
 		} else {
@@ -212,6 +214,9 @@ static int print_skbmod(struct action_util *au, FILE *f, struct rtattr *arg)
 	if (p->flags & SKBMOD_F_SWAPMAC)
 		fprintf(f, "swap mac ");
 
+	if (p->flags & SKBMOD_F_ECN)
+		fprintf(f, "ecn ");
+
 	fprintf(f, "\n\t index %u ref %d bind %d", p->index, p->refcnt,
 		p->bindcnt);
 	if (show_stats) {
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH iproute2-next] tc/ingress: Introduce clsact egress mini-Qdisc option
  2021-07-21 23:20 [PATCH iproute2-next] tc/skbmod: Introduce SKBMOD_F_ECN option Peilin Ye
  2021-08-02 16:28 ` David Ahern
  2021-08-02 17:54 ` [PATCH iproute2-next v2] " Peilin Ye
@ 2021-08-02 20:51 ` Peilin Ye
  2021-08-04 18:15 ` [PATCH iproute2-next v3] tc/skbmod: Introduce SKBMOD_F_ECN option Peilin Ye
  3 siblings, 0 replies; 8+ messages in thread
From: Peilin Ye @ 2021-08-02 20:51 UTC (permalink / raw)
  To: netdev, David Ahern
  Cc: Stephen Hemminger, Jamal Hadi Salim, Cong Wang, Jiri Pirko,
	Daniel Borkmann, Cong Wang, Peilin Ye, Peilin Ye

From: Peilin Ye <peilin.ye@bytedance.com>

If the ingress Qdisc is in use, currently it is not possible to add
another clsact egress mini-Qdisc to the same device without taking down
the ingress Qdisc, since both sch_ingress and sch_clsact use the same
handle (0xFFFF0000).

To solve this issue, recently we added a new clsact egress mini-Qdisc
option for sch_ingress in the kernel.  Support it in the q_ingress front
end, and update the usage message accordingly.

To turn on the egress mini-Qdisc:

    $ tc qdisc add dev eth0 ingress
    $ tc qdisc change dev eth0 ingress clsact-on

Then users can add filters to the egress mini-Qdisc as usual:

    $ tc filter add dev eth0 egress protocol ip prio 10 \
	    matchall action skbmod swap mac

Deleting the ingress Qdisc removes the egress mini-Qdisc as well.  To
remove egress mini-Qdisc only, use:

    $ tc qdisc change dev eth0 ingress clsact-off

Finally, if the egress mini-Qdisc is enabled, the "show" command will
print out a "clsact" flag to indicate it:

    $ tc qdisc show ingress
    qdisc ingress ffff: dev eth0 parent ffff:fff1 ----------------
    $ tc qdisc change dev eth0 ingress clsact-on
    $ tc qdisc show ingress
    qdisc ingress ffff: dev eth0 parent ffff:fff1 ---------------- clsact

Reviewed-by: Cong Wang <cong.wang@bytedance.com>
Signed-off-by: Peilin Ye <peilin.ye@bytedance.com>
---
 include/uapi/linux/pkt_sched.h | 12 +++++++++
 tc/q_ingress.c                 | 46 ++++++++++++++++++++++++++++++++--
 2 files changed, 56 insertions(+), 2 deletions(-)

diff --git a/include/uapi/linux/pkt_sched.h b/include/uapi/linux/pkt_sched.h
index 79a699f106b1..cb0eb5dd848a 100644
--- a/include/uapi/linux/pkt_sched.h
+++ b/include/uapi/linux/pkt_sched.h
@@ -586,6 +586,18 @@ enum {
 
 #define TCA_ATM_MAX	(__TCA_ATM_MAX - 1)
 
+/* INGRESS section */
+
+enum {
+	TCA_INGRESS_UNSPEC,
+	TCA_INGRESS_FLAGS,
+#define	TC_INGRESS_CLSACT	   _BITUL(0)	/* enable clsact egress mini-Qdisc */
+#define	TC_INGRESS_SUPPORTED_FLAGS TC_INGRESS_CLSACT
+	__TCA_INGRESS_MAX,
+};
+
+#define	TCA_INGRESS_MAX	(__TCA_INGRESS_MAX - 1)
+
 /* Network emulator */
 
 enum {
diff --git a/tc/q_ingress.c b/tc/q_ingress.c
index 93313c9c2aec..25bf2dce0b56 100644
--- a/tc/q_ingress.c
+++ b/tc/q_ingress.c
@@ -17,21 +17,45 @@
 
 static void explain(void)
 {
-	fprintf(stderr, "Usage: ... ingress\n");
+	fprintf(stderr,
+		"Usage: [ add | replace | link | delete ] ... ingress\n"
+		"       change ... ingress [ clsact-on | clsact-off ]\n"
+		" clsact-on\tenable clsact egress mini-Qdisc\n"
+		" clsact-off\tdelete clsact egress mini-Qdisc\n");
 }
 
 static int ingress_parse_opt(struct qdisc_util *qu, int argc, char **argv,
 			     struct nlmsghdr *n, const char *dev)
 {
+	struct nla_bitfield32 flags = {
+		.selector = TC_INGRESS_SUPPORTED_FLAGS,
+	};
+	bool change = false;
+	struct rtattr *tail;
+
 	while (argc > 0) {
 		if (strcmp(*argv, "handle") == 0) {
 			NEXT_ARG();
-			argc--; argv++;
+		} else if (strcmp(*argv, "clsact-on") == 0) {
+			flags.value |= TC_INGRESS_CLSACT;
+			change = true;
+		} else if (strcmp(*argv, "clsact-off") == 0) {
+			flags.value &= ~TC_INGRESS_CLSACT;
+			change = true;
 		} else {
 			fprintf(stderr, "What is \"%s\"?\n", *argv);
 			explain();
 			return -1;
 		}
+
+		argc--;
+		argv++;
+	}
+
+	if (change) {
+		tail = addattr_nest(n, 1024, TCA_OPTIONS);
+		addattr_l(n, 1024, TCA_INGRESS_FLAGS, &flags, sizeof(flags));
+		addattr_nest_end(n, tail);
 	}
 
 	return 0;
@@ -40,7 +64,25 @@ static int ingress_parse_opt(struct qdisc_util *qu, int argc, char **argv,
 static int ingress_print_opt(struct qdisc_util *qu, FILE *f,
 			     struct rtattr *opt)
 {
+	struct rtattr *tb[TCA_INGRESS_MAX + 1];
+	struct nla_bitfield32 *flags;
+
 	print_string(PRINT_FP, NULL, "---------------- ", NULL);
+
+	if (!opt)
+		return 0;
+
+	parse_rtattr_nested(tb, TCA_INGRESS_MAX, opt);
+
+	if (!tb[TCA_INGRESS_FLAGS])
+		return -1;
+	if (RTA_PAYLOAD(tb[TCA_INGRESS_FLAGS]) < sizeof(*flags))
+		return -1;
+
+	flags = RTA_DATA(tb[TCA_INGRESS_FLAGS]);
+	if (flags->value & TC_INGRESS_CLSACT)
+		print_string(PRINT_FP, NULL, "clsact", NULL);
+
 	return 0;
 }
 
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH iproute2-next v2] tc/skbmod: Introduce SKBMOD_F_ECN option
  2021-08-02 17:54 ` [PATCH iproute2-next v2] " Peilin Ye
@ 2021-08-04 15:25   ` David Ahern
  0 siblings, 0 replies; 8+ messages in thread
From: David Ahern @ 2021-08-04 15:25 UTC (permalink / raw)
  To: Peilin Ye, netdev
  Cc: Stephen Hemminger, Jamal Hadi Salim, Cong Wang, Jiri Pirko,
	Cong Wang, Peilin Ye

On 8/2/21 11:54 AM, Peilin Ye wrote:
> There will be a conflict next time you merge iproute2 into iproute2-next
> because of this commit:
> 
> "tc/skbmod: Remove misinformation about the swap action"
> https://git.kernel.org/pub/scm/network/iproute2/iproute2.git/commit/?id=c06d313d86c1acb8dd72589816301853ff5a4ac4
> 
> Please just ignore its code change since it is now superseded by this v2.

thanks for the heads up about the conflict, but let's not duplicate that
removal in this patch.

I just merged main into next. Please fix up this patch and re-send. In
the future, just ask for a merge in cases like this.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH iproute2-next v3] tc/skbmod: Introduce SKBMOD_F_ECN option
  2021-07-21 23:20 [PATCH iproute2-next] tc/skbmod: Introduce SKBMOD_F_ECN option Peilin Ye
                   ` (2 preceding siblings ...)
  2021-08-02 20:51 ` [PATCH iproute2-next] tc/ingress: Introduce clsact egress mini-Qdisc option Peilin Ye
@ 2021-08-04 18:15 ` Peilin Ye
  2021-08-08 17:59   ` David Ahern
  2021-08-08 18:00   ` patchwork-bot+netdevbpf
  3 siblings, 2 replies; 8+ messages in thread
From: Peilin Ye @ 2021-08-04 18:15 UTC (permalink / raw)
  To: netdev, David Ahern
  Cc: Stephen Hemminger, Jamal Hadi Salim, Cong Wang, Jiri Pirko,
	Cong Wang, Peilin Ye, Peilin Ye

From: Peilin Ye <peilin.ye@bytedance.com>

Recently we added SKBMOD_F_ECN option support to the kernel; support it in
the tc-skbmod(8) front end, and update its man page accordingly.

The 2 least significant bits of the Traffic Class field in IPv4 and IPv6
headers are used to represent different ECN states [1]:

	0b00: "Non ECN-Capable Transport", Non-ECT
	0b10: "ECN Capable Transport", ECT(0)
	0b01: "ECN Capable Transport", ECT(1)
	0b11: "Congestion Encountered", CE

This new option, "ecn", marks ECT(0) and ECT(1) IPv{4,6} packets as CE,
which is useful for ECN-based rate limiting.  For example:

	$ tc filter add dev eth0 parent 1: protocol ip prio 10 \
		u32 match ip protocol 1 0xff flowid 1:2 \
		action skbmod \
		ecn

The updated tc-skbmod SYNOPSIS looks like the following:

	tc ... action skbmod { set SETTABLE | swap SWAPPABLE | ecn } ...

Only one of "set", "swap" or "ecn" shall be used in a single tc-skbmod
command.  Trying to use more than one of them at a time is considered
undefined behavior; pipe multiple tc-skbmod commands together instead.
"set" and "swap" only affect Ethernet packets, while "ecn" only affects
IP packets.

Depends on kernel patch "net/sched: act_skbmod: Add SKBMOD_F_ECN option
support", as well as iproute2 patch "tc/skbmod: Remove misinformation
about the swap action".

[1] https://en.wikipedia.org/wiki/Explicit_Congestion_Notification

Reviewed-by: Cong Wang <cong.wang@bytedance.com>
Signed-off-by: Peilin Ye <peilin.ye@bytedance.com>
---
Hi David,

> I just merged main into next. Please fix up this patch and re-send. In
> the future, just ask for a merge in cases like this.

Ah, I see; thanks!
Peilin Ye

Change since v2:
    - re-rebased on iproute2-next (David)

 man/man8/tc-skbmod.8 | 38 +++++++++++++++++++++++++++++---------
 tc/m_skbmod.c        |  8 +++++++-
 2 files changed, 36 insertions(+), 10 deletions(-)

diff --git a/man/man8/tc-skbmod.8 b/man/man8/tc-skbmod.8
index 76512311b17d..52eaf989e80b 100644
--- a/man/man8/tc-skbmod.8
+++ b/man/man8/tc-skbmod.8
@@ -8,7 +8,8 @@ skbmod - user-friendly packet editor action
 .BR tc " ... " "action skbmod " "{ " "set "
 .IR SETTABLE " | "
 .BI swap " SWAPPABLE"
-.RI " } [ " CONTROL " ] [ "
+.RB " | " ecn
+.RI "} [ " CONTROL " ] [ "
 .BI index " INDEX "
 ]
 
@@ -37,6 +38,12 @@ action. Instead of having to manually edit 8-, 16-, or 32-bit chunks of an
 ethernet header,
 .B skbmod
 allows complete substitution of supported elements.
+Action must be one of
+.BR set ", " swap " and " ecn "."
+.BR set " and " swap
+only affect Ethernet packets, while
+.B ecn
+only affects IP packets.
 .SH OPTIONS
 .TP
 .BI dmac " DMAC"
@@ -51,6 +58,10 @@ Change the ethertype to the specified value.
 .BI mac
 Used to swap mac addresses.
 .TP
+.B ecn
+Used to mark ECN Capable Transport (ECT) IP packets as Congestion Encountered (CE).
+Does not affect Non ECN-Capable Transport (Non-ECT) packets.
+.TP
 .I CONTROL
 The following keywords allow to control how the tree of qdisc, classes,
 filters and actions is further traversed after this action.
@@ -115,7 +126,7 @@ tc filter add dev eth5 parent 1: protocol ip prio 10 \\
 .EE
 .RE
 
-Finally, swap the destination and source mac addresses in the header:
+To swap the destination and source mac addresses in the Ethernet header:
 
 .RS
 .EX
@@ -126,13 +137,22 @@ tc filter add dev eth3 parent 1: protocol ip prio 10 \\
 .EE
 .RE
 
-However, trying to
-.B set
-and
-.B swap
-in a single
-.B skbmod
-command will cause undefined behavior.
+Finally, to mark the CE codepoint in the IP header for ECN Capable Transport (ECT) packets:
+
+.RS
+.EX
+tc filter add dev eth0 parent 1: protocol ip prio 10 \\
+	u32 match ip protocol 1 0xff flowid 1:2 \\
+	action skbmod \\
+	ecn
+.EE
+.RE
+
+Only one of
+.BR set ", " swap " and " ecn
+shall be used in a single command.
+Trying to use more than one of them in a single command is considered undefined behavior; pipe
+multiple commands together instead.
 
 .SH SEE ALSO
 .BR tc (8),
diff --git a/tc/m_skbmod.c b/tc/m_skbmod.c
index 3fe30651a7d8..8d8bac5bc481 100644
--- a/tc/m_skbmod.c
+++ b/tc/m_skbmod.c
@@ -28,7 +28,7 @@
 static void skbmod_explain(void)
 {
 	fprintf(stderr,
-		"Usage:... skbmod { set <SETTABLE> | swap <SWAPPABLE> } [CONTROL] [index INDEX]\n"
+		"Usage:... skbmod { set <SETTABLE> | swap <SWAPPABLE> | ecn } [CONTROL] [index INDEX]\n"
 		"where SETTABLE is: [dmac DMAC] [smac SMAC] [etype ETYPE]\n"
 		"where SWAPPABLE is: \"mac\" to swap mac addresses\n"
 		"\tDMAC := 6 byte Destination MAC address\n"
@@ -111,6 +111,9 @@ static int parse_skbmod(struct action_util *a, int *argc_p, char ***argv_p,
 			p.flags |= SKBMOD_F_SMAC;
 			fprintf(stderr, "src MAC address <%s>\n", saddr);
 			ok += 1;
+		} else if (matches(*argv, "ecn") == 0) {
+			p.flags |= SKBMOD_F_ECN;
+			ok += 1;
 		} else if (matches(*argv, "help") == 0) {
 			skbmod_usage();
 		} else {
@@ -211,6 +214,9 @@ static int print_skbmod(struct action_util *au, FILE *f, struct rtattr *arg)
 	if (p->flags & SKBMOD_F_SWAPMAC)
 		fprintf(f, "swap mac ");
 
+	if (p->flags & SKBMOD_F_ECN)
+		fprintf(f, "ecn ");
+
 	fprintf(f, "\n\t index %u ref %d bind %d", p->index, p->refcnt,
 		p->bindcnt);
 	if (show_stats) {
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH iproute2-next v3] tc/skbmod: Introduce SKBMOD_F_ECN option
  2021-08-04 18:15 ` [PATCH iproute2-next v3] tc/skbmod: Introduce SKBMOD_F_ECN option Peilin Ye
@ 2021-08-08 17:59   ` David Ahern
  2021-08-08 18:00   ` patchwork-bot+netdevbpf
  1 sibling, 0 replies; 8+ messages in thread
From: David Ahern @ 2021-08-08 17:59 UTC (permalink / raw)
  To: Peilin Ye, netdev
  Cc: Stephen Hemminger, Jamal Hadi Salim, Cong Wang, Jiri Pirko,
	Cong Wang, Peilin Ye

On 8/4/21 12:15 PM, Peilin Ye wrote:
> From: Peilin Ye <peilin.ye@bytedance.com>
> 
> Recently we added SKBMOD_F_ECN option support to the kernel; support it in
> the tc-skbmod(8) front end, and update its man page accordingly.
> 
> The 2 least significant bits of the Traffic Class field in IPv4 and IPv6
> headers are used to represent different ECN states [1]:
> 
> 	0b00: "Non ECN-Capable Transport", Non-ECT
> 	0b10: "ECN Capable Transport", ECT(0)
> 	0b01: "ECN Capable Transport", ECT(1)
> 	0b11: "Congestion Encountered", CE
> 
> This new option, "ecn", marks ECT(0) and ECT(1) IPv{4,6} packets as CE,
> which is useful for ECN-based rate limiting.  For example:
> 
> 	$ tc filter add dev eth0 parent 1: protocol ip prio 10 \
> 		u32 match ip protocol 1 0xff flowid 1:2 \
> 		action skbmod \
> 		ecn
> 
> The updated tc-skbmod SYNOPSIS looks like the following:
> 
> 	tc ... action skbmod { set SETTABLE | swap SWAPPABLE | ecn } ...
> 
> Only one of "set", "swap" or "ecn" shall be used in a single tc-skbmod
> command.  Trying to use more than one of them at a time is considered
> undefined behavior; pipe multiple tc-skbmod commands together instead.
> "set" and "swap" only affect Ethernet packets, while "ecn" only affects
> IP packets.
> 
> Depends on kernel patch "net/sched: act_skbmod: Add SKBMOD_F_ECN option
> support", as well as iproute2 patch "tc/skbmod: Remove misinformation
> about the swap action".
> 
> [1] https://en.wikipedia.org/wiki/Explicit_Congestion_Notification
> 
> Reviewed-by: Cong Wang <cong.wang@bytedance.com>
> Signed-off-by: Peilin Ye <peilin.ye@bytedance.com>
> ---
> Hi David,
> 
>> I just merged main into next. Please fix up this patch and re-send. In
>> the future, just ask for a merge in cases like this.
> 
> Ah, I see; thanks!
> Peilin Ye
> 
> Change since v2:
>     - re-rebased on iproute2-next (David)
> 
>  man/man8/tc-skbmod.8 | 38 +++++++++++++++++++++++++++++---------
>  tc/m_skbmod.c        |  8 +++++++-
>  2 files changed, 36 insertions(+), 10 deletions(-)
> 

applied to iproute2-next.


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH iproute2-next v3] tc/skbmod: Introduce SKBMOD_F_ECN option
  2021-08-04 18:15 ` [PATCH iproute2-next v3] tc/skbmod: Introduce SKBMOD_F_ECN option Peilin Ye
  2021-08-08 17:59   ` David Ahern
@ 2021-08-08 18:00   ` patchwork-bot+netdevbpf
  1 sibling, 0 replies; 8+ messages in thread
From: patchwork-bot+netdevbpf @ 2021-08-08 18:00 UTC (permalink / raw)
  To: Peilin Ye
  Cc: netdev, dsahern, stephen, jhs, xiyou.wangcong, jiri, cong.wang,
	peilin.ye

Hello:

This patch was applied to iproute2/iproute2-next.git (refs/heads/main):

On Wed,  4 Aug 2021 11:15:16 -0700 you wrote:
> From: Peilin Ye <peilin.ye@bytedance.com>
> 
> Recently we added SKBMOD_F_ECN option support to the kernel; support it in
> the tc-skbmod(8) front end, and update its man page accordingly.
> 
> The 2 least significant bits of the Traffic Class field in IPv4 and IPv6
> headers are used to represent different ECN states [1]:
> 
> [...]

Here is the summary with links:
  - [iproute2-next,v3] tc/skbmod: Introduce SKBMOD_F_ECN option
    https://git.kernel.org/pub/scm/network/iproute2/iproute2-next.git/commit/?id=e78411948dc7

You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2021-08-08 18:00 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-07-21 23:20 [PATCH iproute2-next] tc/skbmod: Introduce SKBMOD_F_ECN option Peilin Ye
2021-08-02 16:28 ` David Ahern
2021-08-02 17:54 ` [PATCH iproute2-next v2] " Peilin Ye
2021-08-04 15:25   ` David Ahern
2021-08-02 20:51 ` [PATCH iproute2-next] tc/ingress: Introduce clsact egress mini-Qdisc option Peilin Ye
2021-08-04 18:15 ` [PATCH iproute2-next v3] tc/skbmod: Introduce SKBMOD_F_ECN option Peilin Ye
2021-08-08 17:59   ` David Ahern
2021-08-08 18:00   ` patchwork-bot+netdevbpf

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).