linux-kernel.vger.kernel.org archive mirror
* [PATCH RESEND net-next 1/2] net/sched: act_skbmod: Add SKBMOD_F_ECN option support
@ 2021-07-28  1:33 Peilin Ye
  2021-07-28  1:33 ` [PATCH RESEND net-next 2/2] tc-testing: Add control-plane selftest for skbmod SKBMOD_F_ECN option Peilin Ye
  2021-07-28 12:40 ` [PATCH RESEND net-next 1/2] net/sched: act_skbmod: Add SKBMOD_F_ECN option support patchwork-bot+netdevbpf
  0 siblings, 2 replies; 3+ messages in thread
From: Peilin Ye @ 2021-07-28  1:33 UTC (permalink / raw)
  To: Jamal Hadi Salim, Cong Wang, Jiri Pirko, David S. Miller,
	Jakub Kicinski, netdev
  Cc: linux-kernel, Cong Wang, Peilin Ye, Peilin Ye

From: Peilin Ye <peilin.ye@bytedance.com>

Currently, when doing rate limiting using the tc-police(8) action, the
easiest approach is to simply drop the packets that exceed (or conform
to) the configured bandwidth limit.  Add a new option to tc-skbmod(8), so
that users may use the ECN [1] extension to explicitly inform the receiver
about the congestion instead of dropping packets "on the floor".

The 2 least significant bits of the IPv4 Type of Service (ToS) field and
the IPv6 Traffic Class field are used to represent different ECN states [2]:

	0b00: "Non ECN-Capable Transport", Non-ECT
	0b10: "ECN Capable Transport", ECT(0)
	0b01: "ECN Capable Transport", ECT(1)
	0b11: "Congestion Encountered", CE
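
To make the bit positions concrete, here is a minimal userspace sketch of
reading the ECN codepoint from a raw IP header.  The helper name
ip_hdr_ecn() is hypothetical and is not part of this patch or of the
kernel; the sketch only assumes the standard IPv4 and IPv6 header layouts.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, for illustration only.
 * Returns the 2-bit ECN codepoint of a raw IPv4 or IPv6 header, or -1 if
 * fewer than 2 bytes are available or the version nibble is not 4 or 6.
 */
int ip_hdr_ecn(const uint8_t *hdr, size_t len)
{
	assert(hdr != NULL);

	if (len < 2)
		return -1;

	switch (hdr[0] >> 4) {
	case 4:
		/* IPv4: byte 1 is the ToS byte; ECN is its low 2 bits. */
		return hdr[1] & 0x3;
	case 6:
		/* IPv6: Traffic Class straddles bytes 0 and 1; its low
		 * 2 bits are bits 5..4 of byte 1.
		 */
		return (hdr[1] >> 4) & 0x3;
	default:
		return -1;
	}
}
```

For instance, a header starting with the bytes 0x45 0x02 is IPv4 carrying
ECT(0), and one starting with 0x60 0x30 is IPv6 carrying CE.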

As an example:

	$ tc filter add dev eth0 parent 1: protocol ip prio 10 \
		matchall action skbmod ecn

Doing the above marks all ECT(0) and ECT(1) packets as CE.  It does NOT
affect Non-ECT or non-IP packets.  In the tc-police scenario mentioned
above, users may pipe a tc-police action and a tc-skbmod "ecn" action
together to achieve ECN-based rate limiting.
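
The marking rule described above can be sketched in userspace as follows.
This is only an illustration operating on a single ToS / Traffic Class
byte; ecn_mark_ce() is a hypothetical name, not the kernel's
INET_ECN_set_ce(), and the sketch deliberately omits the IPv4 header
checksum update that a real implementation needs after changing the ToS
byte.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* ECN codepoints (low 2 bits of the ToS / Traffic Class byte). */
enum {
	ECN_NOT_ECT = 0x0,
	ECN_ECT_1   = 0x1,
	ECN_ECT_0   = 0x2,
	ECN_CE      = 0x3,
	ECN_MASK    = 0x3,
};

/* Hypothetical helper, for illustration only.
 * Marks an ECT(0) or ECT(1) byte as CE and leaves a Non-ECT byte alone.
 * Returns true if the byte carries CE afterwards, false otherwise.
 */
bool ecn_mark_ce(uint8_t *tos)
{
	assert(tos != NULL);

	if ((*tos & ECN_MASK) == ECN_NOT_ECT)
		return false;	/* sender did not negotiate ECN */

	*tos |= ECN_CE;		/* upper 6 bits (DSCP) stay unchanged */
	return true;
}
```

Note that only the two ECN bits change; the DSCP value in the upper six
bits is preserved, so 0xb9 (ECT(1)) becomes 0xbb while 0xb8 (Non-ECT) is
left as is.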

For TCP connections, upon receiving a CE packet, the receiver will respond
with an ECE packet, asking the sender to reduce its congestion window.
However, ECN also works with other L4 protocols, e.g. DCCP and SCTP [2],
and our implementation does not touch or care about L4 headers.

The updated tc-skbmod SYNOPSIS looks like the following:

	tc ... action skbmod { set SETTABLE | swap SWAPPABLE | ecn } ...

Only one of "set", "swap" or "ecn" shall be used in a single tc-skbmod
command.  Trying to use more than one of them at a time is considered
undefined behavior; pipe multiple tc-skbmod commands together instead.
"set" and "swap" only affect Ethernet packets, while "ecn" only affects
IPv{4,6} packets.
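
As a rough userspace model of how the flags end up mutually exclusive, the
sketch below mirrors the two "if" statements in the tcf_skbmod_init() hunk
of this patch.  The function name and the "set_flags" parameter are
hypothetical: "set_flags" stands for whatever SKBMOD_F_{DMAC,SMAC,ETYPE}
bits were collected for "set", which is an assumption about code outside
the hunk.

```c
#include <assert.h>
#include <stdint.h>

/* Flag values from include/uapi/linux/tc_act/tc_skbmod.h. */
#define SKBMOD_F_DMAC		0x1
#define SKBMOD_F_SMAC		0x2
#define SKBMOD_F_ETYPE		0x4
#define SKBMOD_F_SWAPMAC	0x8
#define SKBMOD_F_ECN		0x10

#define SKBMOD_SET_MASK	(SKBMOD_F_DMAC | SKBMOD_F_SMAC | SKBMOD_F_ETYPE)

/* Hypothetical model, for illustration only.
 * set_flags:  bits requested through "set" (assumed to be gathered earlier)
 * parm_flags: flags word supplied by userspace
 */
uint64_t skbmod_resolve_flags(uint64_t set_flags, uint64_t parm_flags)
{
	uint64_t lflags = set_flags & SKBMOD_SET_MASK;

	/* Each assignment overwrites lflags, so a later match wins. */
	if (parm_flags & SKBMOD_F_SWAPMAC)
		lflags = SKBMOD_F_SWAPMAC;
	if (parm_flags & SKBMOD_F_ECN)
		lflags = SKBMOD_F_ECN;

	/* Result is a "set" combination, SWAPMAC alone, or ECN alone. */
	assert(lflags == SKBMOD_F_ECN || lflags == SKBMOD_F_SWAPMAC ||
	       !(lflags & ~(uint64_t)SKBMOD_SET_MASK));
	return lflags;
}
```

In this model, passing both SKBMOD_F_SWAPMAC and SKBMOD_F_ECN resolves to
SKBMOD_F_ECN alone, which is one reason to pipe separate tc-skbmod actions
rather than combine options in a single command.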

It is also worth mentioning that, in theory, the same effect could be
achieved by piping a "police" action and a "bpf" action using the
bpf_skb_ecn_set_ce() helper, but this requires the user to write an eBPF
program, which makes it impractical.

Depends on patch "net/sched: act_skbmod: Skip non-Ethernet packets".

[1] https://datatracker.ietf.org/doc/html/rfc3168
[2] https://en.wikipedia.org/wiki/Explicit_Congestion_Notification

Reviewed-by: Cong Wang <cong.wang@bytedance.com>
Signed-off-by: Peilin Ye <peilin.ye@bytedance.com>
---
Hi all,

These 2 patches should apply to net-next now, since "net/sched:
act_skbmod: Skip non-Ethernet packets" has been pulled into net-next.
Please review; thank you!

Peilin Ye

 include/uapi/linux/tc_act/tc_skbmod.h |  1 +
 net/sched/act_skbmod.c                | 44 +++++++++++++++++++--------
 2 files changed, 33 insertions(+), 12 deletions(-)

diff --git a/include/uapi/linux/tc_act/tc_skbmod.h b/include/uapi/linux/tc_act/tc_skbmod.h
index c525b3503797..af6ef2cfbf3d 100644
--- a/include/uapi/linux/tc_act/tc_skbmod.h
+++ b/include/uapi/linux/tc_act/tc_skbmod.h
@@ -17,6 +17,7 @@
 #define SKBMOD_F_SMAC	0x2
 #define SKBMOD_F_ETYPE	0x4
 #define SKBMOD_F_SWAPMAC 0x8
+#define SKBMOD_F_ECN	0x10
 
 struct tc_skbmod {
 	tc_gen;
diff --git a/net/sched/act_skbmod.c b/net/sched/act_skbmod.c
index 8d17a543cc9f..762ceec3e6f6 100644
--- a/net/sched/act_skbmod.c
+++ b/net/sched/act_skbmod.c
@@ -11,6 +11,7 @@
 #include <linux/kernel.h>
 #include <linux/skbuff.h>
 #include <linux/rtnetlink.h>
+#include <net/inet_ecn.h>
 #include <net/netlink.h>
 #include <net/pkt_sched.h>
 #include <net/pkt_cls.h>
@@ -21,15 +22,13 @@
 static unsigned int skbmod_net_id;
 static struct tc_action_ops act_skbmod_ops;
 
-#define MAX_EDIT_LEN ETH_HLEN
 static int tcf_skbmod_act(struct sk_buff *skb, const struct tc_action *a,
 			  struct tcf_result *res)
 {
 	struct tcf_skbmod *d = to_skbmod(a);
-	int action;
+	int action, max_edit_len, err;
 	struct tcf_skbmod_params *p;
 	u64 flags;
-	int err;
 
 	tcf_lastuse_update(&d->tcf_tm);
 	bstats_cpu_update(this_cpu_ptr(d->common.cpu_bstats), skb);
@@ -38,19 +37,34 @@ static int tcf_skbmod_act(struct sk_buff *skb, const struct tc_action *a,
 	if (unlikely(action == TC_ACT_SHOT))
 		goto drop;
 
-	if (!skb->dev || skb->dev->type != ARPHRD_ETHER)
-		return action;
+	max_edit_len = skb_mac_header_len(skb);
+	p = rcu_dereference_bh(d->skbmod_p);
+	flags = p->flags;
+
+	/* tcf_skbmod_init() guarantees "flags" to be one of the following:
+	 *	1. a combination of SKBMOD_F_{DMAC,SMAC,ETYPE}
+	 *	2. SKBMOD_F_SWAPMAC
+	 *	3. SKBMOD_F_ECN
+	 * SKBMOD_F_ECN only works with IP packets; all other flags only work with Ethernet
+	 * packets.
+	 */
+	if (flags == SKBMOD_F_ECN) {
+		switch (skb_protocol(skb, true)) {
+		case cpu_to_be16(ETH_P_IP):
+		case cpu_to_be16(ETH_P_IPV6):
+			max_edit_len += skb_network_header_len(skb);
+			break;
+		default:
+			goto out;
+		}
+	} else if (!skb->dev || skb->dev->type != ARPHRD_ETHER) {
+		goto out;
+	}
 
-	/* XXX: if you are going to edit more fields beyond ethernet header
-	 * (example when you add IP header replacement or vlan swap)
-	 * then MAX_EDIT_LEN needs to change appropriately
-	*/
-	err = skb_ensure_writable(skb, MAX_EDIT_LEN);
+	err = skb_ensure_writable(skb, max_edit_len);
 	if (unlikely(err)) /* best policy is to drop on the floor */
 		goto drop;
 
-	p = rcu_dereference_bh(d->skbmod_p);
-	flags = p->flags;
 	if (flags & SKBMOD_F_DMAC)
 		ether_addr_copy(eth_hdr(skb)->h_dest, p->eth_dst);
 	if (flags & SKBMOD_F_SMAC)
@@ -66,6 +80,10 @@ static int tcf_skbmod_act(struct sk_buff *skb, const struct tc_action *a,
 		ether_addr_copy(eth_hdr(skb)->h_source, (u8 *)tmpaddr);
 	}
 
+	if (flags & SKBMOD_F_ECN)
+		INET_ECN_set_ce(skb);
+
+out:
 	return action;
 
 drop:
@@ -129,6 +147,8 @@ static int tcf_skbmod_init(struct net *net, struct nlattr *nla,
 	index = parm->index;
 	if (parm->flags & SKBMOD_F_SWAPMAC)
 		lflags = SKBMOD_F_SWAPMAC;
+	if (parm->flags & SKBMOD_F_ECN)
+		lflags = SKBMOD_F_ECN;
 
 	err = tcf_idr_check_alloc(tn, &index, a, bind);
 	if (err < 0)
-- 
2.20.1



* [PATCH RESEND net-next 2/2] tc-testing: Add control-plane selftest for skbmod SKBMOD_F_ECN option
  2021-07-28  1:33 [PATCH RESEND net-next 1/2] net/sched: act_skbmod: Add SKBMOD_F_ECN option support Peilin Ye
@ 2021-07-28  1:33 ` Peilin Ye
  2021-07-28 12:40 ` [PATCH RESEND net-next 1/2] net/sched: act_skbmod: Add SKBMOD_F_ECN option support patchwork-bot+netdevbpf
  1 sibling, 0 replies; 3+ messages in thread
From: Peilin Ye @ 2021-07-28  1:33 UTC (permalink / raw)
  To: Shuah Khan, Jamal Hadi Salim, Cong Wang, Jiri Pirko,
	David S. Miller, Jakub Kicinski, netdev
  Cc: linux-kselftest, linux-kernel, Cong Wang, Peilin Ye, Peilin Ye

From: Peilin Ye <peilin.ye@bytedance.com>

Recently we added a new option, SKBMOD_F_ECN, to tc-skbmod(8).  Add a
control-plane selftest for it.

Depends on kernel patch "net/sched: act_skbmod: Add SKBMOD_F_ECN option
support", as well as iproute2 patch "tc/skbmod: Introduce SKBMOD_F_ECN
option".

Reviewed-by: Cong Wang <cong.wang@bytedance.com>
Signed-off-by: Peilin Ye <peilin.ye@bytedance.com>
---
 .../tc-testing/tc-tests/actions/skbmod.json   | 24 +++++++++++++++++++
 1 file changed, 24 insertions(+)

diff --git a/tools/testing/selftests/tc-testing/tc-tests/actions/skbmod.json b/tools/testing/selftests/tc-testing/tc-tests/actions/skbmod.json
index 6eb4c4f97060..742f2290973e 100644
--- a/tools/testing/selftests/tc-testing/tc-tests/actions/skbmod.json
+++ b/tools/testing/selftests/tc-testing/tc-tests/actions/skbmod.json
@@ -417,5 +417,29 @@
         "teardown": [
             "$TC actions flush action skbmod"
         ]
+    },
+    {
+        "id": "fe09",
+        "name": "Add skbmod action to mark ECN bits",
+        "category": [
+            "actions",
+            "skbmod"
+        ],
+        "setup": [
+            [
+                "$TC actions flush action skbmod",
+                0,
+                1,
+                255
+            ]
+        ],
+        "cmdUnderTest": "$TC actions add action skbmod ecn",
+        "expExitCode": "0",
+        "verifyCmd": "$TC actions get action skbmod index 1",
+        "matchPattern": "action order [0-9]*: skbmod pipe ecn",
+        "matchCount": "1",
+        "teardown": [
+            "$TC actions flush action skbmod"
+        ]
     }
 ]
-- 
2.20.1



* Re: [PATCH RESEND net-next 1/2] net/sched: act_skbmod: Add SKBMOD_F_ECN option support
  2021-07-28  1:33 [PATCH RESEND net-next 1/2] net/sched: act_skbmod: Add SKBMOD_F_ECN option support Peilin Ye
  2021-07-28  1:33 ` [PATCH RESEND net-next 2/2] tc-testing: Add control-plane selftest for skbmod SKBMOD_F_ECN option Peilin Ye
@ 2021-07-28 12:40 ` patchwork-bot+netdevbpf
  1 sibling, 0 replies; 3+ messages in thread
From: patchwork-bot+netdevbpf @ 2021-07-28 12:40 UTC (permalink / raw)
  To: Peilin Ye
  Cc: jhs, xiyou.wangcong, jiri, davem, kuba, netdev, linux-kernel,
	cong.wang, peilin.ye

Hello:

This series was applied to netdev/net-next.git (refs/heads/master):

On Tue, 27 Jul 2021 18:33:15 -0700 you wrote:
> From: Peilin Ye <peilin.ye@bytedance.com>
> 
> Currently, when doing rate limiting using the tc-police(8) action, the
> easiest way is to simply drop the packets which exceed or conform the
> configured bandwidth limit.  Add a new option to tc-skbmod(8), so that
> users may use the ECN [1] extension to explicitly inform the receiver
> about the congestion instead of dropping packets "on the floor".
> 
> [...]

Here is the summary with links:
  - [RESEND,net-next,1/2] net/sched: act_skbmod: Add SKBMOD_F_ECN option support
    https://git.kernel.org/netdev/net-next/c/56af5e749f20
  - [RESEND,net-next,2/2] tc-testing: Add control-plane selftest for skbmod SKBMOD_F_ECN option
    https://git.kernel.org/netdev/net-next/c/68f9884837c6

You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html


