* [PATCH v2] net: sched: check tc_skip_classify as far as possible
@ 2021-11-03 14:32 xiangxia.m.yue
  2021-11-03 14:47 ` Willem de Bruijn
  0 siblings, 1 reply; 4+ messages in thread
From: xiangxia.m.yue @ 2021-11-03 14:32 UTC (permalink / raw)
  To: netdev
  Cc: Tonghao Zhang, Willem de Bruijn, Cong Wang, Jakub Kicinski,
	Daniel Borkmann

From: Tonghao Zhang <xiangxia.m.yue@gmail.com>

We look up and then check the tc_skip_classify flag in the
net sched layer even when the skb does not want to be
classified. That can consume a lot of CPU cycles.

Install the rules as below:
$ for id in $(seq 1 100); do
$       tc filter add ... egress prio $id ... action mirred egress redirect dev ifb0
$ done

netperf:
$ taskset -c 1 netperf -t TCP_RR -H ip -- -r 32,32
$ taskset -c 1 netperf -t TCP_STREAM -H ip -- -m 32

Before: 10662.33 tps, 108.95 Mbit/s
After:  12434.48 tps, 145.89 Mbit/s

For TCP_RR this is a 16.6% improvement; for TCP_STREAM, 33.9%.

Cc: Willem de Bruijn <willemb@google.com>
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
---
v2: don't delete skb_skip_tc_classify in act_api 
---
 net/core/dev.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/net/core/dev.c b/net/core/dev.c
index edeb811c454e..fc29a429e9ad 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -3940,6 +3940,9 @@ sch_handle_egress(struct sk_buff *skb, int *ret, struct net_device *dev)
 	if (!miniq)
 		return skb;
 
+	if (skb_skip_tc_classify(skb))
+		return skb;
+
 	/* qdisc_skb_cb(skb)->pkt_len was already set by the caller. */
 	qdisc_skb_cb(skb)->mru = 0;
 	qdisc_skb_cb(skb)->post_ct = false;
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH v2] net: sched: check tc_skip_classify as far as possible
  2021-11-03 14:32 [PATCH v2] net: sched: check tc_skip_classify as far as possible xiangxia.m.yue
@ 2021-11-03 14:47 ` Willem de Bruijn
  2021-11-03 14:59   ` Daniel Borkmann
  0 siblings, 1 reply; 4+ messages in thread
From: Willem de Bruijn @ 2021-11-03 14:47 UTC (permalink / raw)
  To: xiangxia.m.yue; +Cc: netdev, Cong Wang, Jakub Kicinski, Daniel Borkmann

On Wed, Nov 3, 2021 at 10:32 AM <xiangxia.m.yue@gmail.com> wrote:
>
> From: Tonghao Zhang <xiangxia.m.yue@gmail.com>
>
> We look up and then check the tc_skip_classify flag in the
> net sched layer even when the skb does not want to be
> classified. That can consume a lot of CPU cycles.
>
> Install the rules as below:
> $ for id in $(seq 1 100); do
> $       tc filter add ... egress prio $id ... action mirred egress redirect dev ifb0
> $ done
>
> netperf:
> $ taskset -c 1 netperf -t TCP_RR -H ip -- -r 32,32
> $ taskset -c 1 netperf -t TCP_STREAM -H ip -- -m 32
>
> Before: 10662.33 tps, 108.95 Mbit/s
> After:  12434.48 tps, 145.89 Mbit/s
>
> For TCP_RR this is a 16.6% improvement; for TCP_STREAM, 33.9%.
>
> Cc: Willem de Bruijn <willemb@google.com>
> Cc: Cong Wang <xiyou.wangcong@gmail.com>
> Cc: Jakub Kicinski <kuba@kernel.org>
> Cc: Daniel Borkmann <daniel@iogearbox.net>
> Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
> ---
> v2: don't delete skb_skip_tc_classify in act_api
> ---
>  net/core/dev.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/net/core/dev.c b/net/core/dev.c
> index edeb811c454e..fc29a429e9ad 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -3940,6 +3940,9 @@ sch_handle_egress(struct sk_buff *skb, int *ret, struct net_device *dev)
>         if (!miniq)
>                 return skb;
>
> +       if (skb_skip_tc_classify(skb))
> +               return skb;
> +

This bit and test exist to make sure that packets redirected through
the ifb device are redirected only once.

I was afraid that, on the second loop, this would also short-circuit
other unrelated tc classifiers and actions that should take place.

The name and the variable comment make clear that the intention is
indeed to bypass all classification.

However, the current implementation acts at tcf_action_exec. So it
does not stop processing by fused classifier-action objects, notably tc_bpf.

So I agree that this seems to follow the original intent, but it also
has the chance to break existing production configurations.
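
For reference, the existing check sits at the top of the action-execution path, by which point classifiers, including fused classifier-actions like tc_bpf, have already run. A simplified sketch of tcf_action_exec() in net/sched/act_api.c (details elided):

```c
/* Simplified sketch; the skip check fires only after a classifier
 * has already matched and handed off to the action chain. */
int tcf_action_exec(struct sk_buff *skb, struct tc_action **actions,
		    int nr_actions, struct tcf_result *res)
{
	int i, ret = -1;

	if (skb_skip_tc_classify(skb))
		return TC_ACT_OK;

	for (i = 0; i < nr_actions; i++) {
		/* ... execute each action ... */
	}
	return ret;
}
```

The proposed patch effectively moves this check ahead of classification in sch_handle_egress(), which is where both the speedup and the behavioral-change risk come from.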


>         /* qdisc_skb_cb(skb)->pkt_len was already set by the caller. */
>         qdisc_skb_cb(skb)->mru = 0;
>         qdisc_skb_cb(skb)->post_ct = false;
> --
> 2.27.0
>


* Re: [PATCH v2] net: sched: check tc_skip_classify as far as possible
  2021-11-03 14:47 ` Willem de Bruijn
@ 2021-11-03 14:59   ` Daniel Borkmann
  2021-11-04  1:23     ` Tonghao Zhang
  0 siblings, 1 reply; 4+ messages in thread
From: Daniel Borkmann @ 2021-11-03 14:59 UTC (permalink / raw)
  To: Willem de Bruijn, xiangxia.m.yue; +Cc: netdev, Cong Wang, Jakub Kicinski

On 11/3/21 3:47 PM, Willem de Bruijn wrote:
> On Wed, Nov 3, 2021 at 10:32 AM <xiangxia.m.yue@gmail.com> wrote:
>>
>> From: Tonghao Zhang <xiangxia.m.yue@gmail.com>
>>
>> We look up and then check the tc_skip_classify flag in the
>> net sched layer even when the skb does not want to be
>> classified. That can consume a lot of CPU cycles.
>>
>> Install the rules as below:
>> $ for id in $(seq 1 100); do
>> $       tc filter add ... egress prio $id ... action mirred egress redirect dev ifb0
>> $ done
>>
>> netperf:
>> $ taskset -c 1 netperf -t TCP_RR -H ip -- -r 32,32
>> $ taskset -c 1 netperf -t TCP_STREAM -H ip -- -m 32
>>
>> Before: 10662.33 tps, 108.95 Mbit/s
>> After:  12434.48 tps, 145.89 Mbit/s
>>
>> For TCP_RR this is a 16.6% improvement; for TCP_STREAM, 33.9%.
>>
>> Cc: Willem de Bruijn <willemb@google.com>
>> Cc: Cong Wang <xiyou.wangcong@gmail.com>
>> Cc: Jakub Kicinski <kuba@kernel.org>
>> Cc: Daniel Borkmann <daniel@iogearbox.net>
>> Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
>> ---
>> v2: don't delete skb_skip_tc_classify in act_api
>> ---
>>   net/core/dev.c | 3 +++
>>   1 file changed, 3 insertions(+)
>>
>> diff --git a/net/core/dev.c b/net/core/dev.c
>> index edeb811c454e..fc29a429e9ad 100644
>> --- a/net/core/dev.c
>> +++ b/net/core/dev.c
>> @@ -3940,6 +3940,9 @@ sch_handle_egress(struct sk_buff *skb, int *ret, struct net_device *dev)
>>          if (!miniq)
>>                  return skb;
>>
>> +       if (skb_skip_tc_classify(skb))
>> +               return skb;
>> +
> 
> This bit and test exist to make sure that packets redirected through
> the ifb device are redirected only once.
> 
> I was afraid that, on the second loop, this would also short-circuit
> other unrelated tc classifiers and actions that should take place.
> 
> The name and the variable comment make clear that the intention is
> indeed to bypass all classification.
> 
> However, the current implementation acts at tcf_action_exec. So it
> does not stop processing by fused classifier-action objects, notably tc_bpf.
> 
> So I agree that this seems to follow the original intent, but it also
> has the chance to break existing production configurations.

Agree, I share your concern.


* Re: [PATCH v2] net: sched: check tc_skip_classify as far as possible
  2021-11-03 14:59   ` Daniel Borkmann
@ 2021-11-04  1:23     ` Tonghao Zhang
  0 siblings, 0 replies; 4+ messages in thread
From: Tonghao Zhang @ 2021-11-04  1:23 UTC (permalink / raw)
  To: Daniel Borkmann, Willem de Bruijn
  Cc: Linux Kernel Network Developers, Cong Wang, Jakub Kicinski

On Wed, Nov 3, 2021 at 10:59 PM Daniel Borkmann <daniel@iogearbox.net> wrote:
>
> On 11/3/21 3:47 PM, Willem de Bruijn wrote:
> > On Wed, Nov 3, 2021 at 10:32 AM <xiangxia.m.yue@gmail.com> wrote:
> >>
> >> From: Tonghao Zhang <xiangxia.m.yue@gmail.com>
> >>
> >> We look up and then check the tc_skip_classify flag in the
> >> net sched layer even when the skb does not want to be
> >> classified. That can consume a lot of CPU cycles.
> >>
> >> Install the rules as below:
> >> $ for id in $(seq 1 100); do
> >> $       tc filter add ... egress prio $id ... action mirred egress redirect dev ifb0
> >> $ done
> >>
> >> netperf:
> >> $ taskset -c 1 netperf -t TCP_RR -H ip -- -r 32,32
> >> $ taskset -c 1 netperf -t TCP_STREAM -H ip -- -m 32
> >>
> >> Before: 10662.33 tps, 108.95 Mbit/s
> >> After:  12434.48 tps, 145.89 Mbit/s
> >>
> >> For TCP_RR this is a 16.6% improvement; for TCP_STREAM, 33.9%.
> >>
> >> Cc: Willem de Bruijn <willemb@google.com>
> >> Cc: Cong Wang <xiyou.wangcong@gmail.com>
> >> Cc: Jakub Kicinski <kuba@kernel.org>
> >> Cc: Daniel Borkmann <daniel@iogearbox.net>
> >> Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
> >> ---
> >> v2: don't delete skb_skip_tc_classify in act_api
> >> ---
> >>   net/core/dev.c | 3 +++
> >>   1 file changed, 3 insertions(+)
> >>
> >> diff --git a/net/core/dev.c b/net/core/dev.c
> >> index edeb811c454e..fc29a429e9ad 100644
> >> --- a/net/core/dev.c
> >> +++ b/net/core/dev.c
> >> @@ -3940,6 +3940,9 @@ sch_handle_egress(struct sk_buff *skb, int *ret, struct net_device *dev)
> >>          if (!miniq)
> >>                  return skb;
> >>
> >> +       if (skb_skip_tc_classify(skb))
> >> +               return skb;
> >> +
> >
> > This bit and test exist to make sure that packets redirected through
> > the ifb device are redirected only once.
> >
> > I was afraid that, on the second loop, this would also short-circuit
> > other unrelated tc classifiers and actions that should take place.
> >
> > The name and the variable comment make clear that the intention is
> > indeed to bypass all classification.
> >
> > However, the current implementation acts at tcf_action_exec. So it
> > does not stop processing by fused classifier-action objects, notably tc_bpf.
> >
> > So I agree that this seems to follow the original intent, but it also
> > has the chance to break existing production configurations.
>
> Agree, I share your concern.
I got it, thanks Daniel and Willem.
One question: if users use bpf_redirect, instead of tc-mirred, to
redirect packets to ifb on egress, and there is no check of
skb_skip_tc_classify, the packets will loop and be dropped, right?
But we hope the packets are sent out of the Linux box.
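
For concreteness, the bpf_redirect case being asked about would look roughly like this (illustrative sketch only; IFB_IFINDEX is a placeholder that would normally be resolved at load time, and the section name depends on the loader):

```c
/* Sketch of a tc-bpf egress program that redirects to ifb0, as in
 * the question above. */
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

#define IFB_IFINDEX 5 /* placeholder; e.g. via if_nametoindex("ifb0") */

SEC("tc")
int egress_redirect(struct __sk_buff *skb)
{
	/* Redirect the skb to ifb0's egress path. */
	return bpf_redirect(IFB_IFINDEX, 0 /* flags: 0 = egress */);
}

char _license[] SEC("license") = "GPL";
```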


-- 
Best regards, Tonghao


end of thread, other threads:[~2021-11-04  1:24 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-11-03 14:32 [PATCH v2] net: sched: check tc_skip_classify as far as possible xiangxia.m.yue
2021-11-03 14:47 ` Willem de Bruijn
2021-11-03 14:59   ` Daniel Borkmann
2021-11-04  1:23     ` Tonghao Zhang
