All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH net] net: sched: cls: Fix offloading when ingress dev is vxlan
@ 2018-06-05  8:04 Paul Blakey
  2018-06-05 14:30 ` David Miller
  2018-06-05 18:57 ` Jakub Kicinski
  0 siblings, 2 replies; 14+ messages in thread
From: Paul Blakey @ 2018-06-05  8:04 UTC (permalink / raw)
  To: Jiri Pirko, Cong Wang, Jamal Hadi Salim, David Miller, netdev
  Cc: Yevgeny Kliteynik, Roi Dayan, Shahar Klein, Mark Bloch,
	Or Gerlitz, Paul Blakey

When using a vxlan device as the ingress dev, we count it as a
"no offload dev", so when such a rule comes and err stop is true,
we fail early and don't try the egdev route which can offload it
through the egress device.

Fix that by not calling the block offload if one of the devices
attached to it is not offload capable, but make sure egress on such case
is capable instead.

Fixes: caa7260156eb ("net: sched: keep track of offloaded filters [..]")
Reviewed-by: Roi Dayan <roid@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Paul Blakey <paulb@mellanox.com>
---
 net/sched/cls_api.c | 26 ++++++++++++++++----------
 1 file changed, 16 insertions(+), 10 deletions(-)

diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
index a57e112..2cd579f 100644
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -734,10 +734,6 @@ static int tcf_block_cb_call(struct tcf_block *block, enum tc_setup_type type,
 	int ok_count = 0;
 	int err;
 
-	/* Make sure all netdevs sharing this block are offload-capable. */
-	if (block->nooffloaddevcnt && err_stop)
-		return -EOPNOTSUPP;
-
 	list_for_each_entry(block_cb, &block->cb_list, list) {
 		err = block_cb->cb(type, type_data, block_cb->cb_priv);
 		if (err) {
@@ -1580,21 +1576,31 @@ static int tc_exts_setup_cb_egdev_call(struct tcf_exts *exts,
 int tc_setup_cb_call(struct tcf_block *block, struct tcf_exts *exts,
 		     enum tc_setup_type type, void *type_data, bool err_stop)
 {
-	int ok_count;
+	int ok_count = 0;
 	int ret;
 
-	ret = tcf_block_cb_call(block, type, type_data, err_stop);
-	if (ret < 0)
-		return ret;
-	ok_count = ret;
+	if (!block->nooffloaddevcnt) {
+		ret = tcf_block_cb_call(block, type, type_data, err_stop);
+		if (ret < 0)
+			return ret;
+		ok_count = ret;
+	}
 
 	if (!exts || ok_count)
-		return ok_count;
+		goto skip_egress;
+
 	ret = tc_exts_setup_cb_egdev_call(exts, type, type_data, err_stop);
 	if (ret < 0)
 		return ret;
 	ok_count += ret;
 
+skip_egress:
+	/* if one of the netdevs sharing this block are not offload-capable
+	 * make sure we succeeded in egress instead.
+	 */
+	if (block->nooffloaddevcnt && !ok_count && err_stop)
+		return -EOPNOTSUPP;
+
 	return ok_count;
 }
 EXPORT_SYMBOL(tc_setup_cb_call);
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH net] net: sched: cls: Fix offloading when ingress dev is vxlan
  2018-06-05  8:04 [PATCH net] net: sched: cls: Fix offloading when ingress dev is vxlan Paul Blakey
@ 2018-06-05 14:30 ` David Miller
  2018-06-05 18:57 ` Jakub Kicinski
  1 sibling, 0 replies; 14+ messages in thread
From: David Miller @ 2018-06-05 14:30 UTC (permalink / raw)
  To: paulb
  Cc: jiri, xiyou.wangcong, jhs, netdev, kliteyn, roid, shahark, markb,
	ogerlitz

From: Paul Blakey <paulb@mellanox.com>
Date: Tue,  5 Jun 2018 11:04:03 +0300

> When using a vxlan device as the ingress dev, we count it as a
> "no offload dev", so when such a rule comes and err stop is true,
> we fail early and don't try the egdev route which can offload it
> through the egress device.
> 
> Fix that by not calling the block offload if one of the devices
> attached to it is not offload capable, but make sure egress on such case
> is capable instead.
> 
> Fixes: caa7260156eb ("net: sched: keep track of offloaded filters [..]")
> Reviewed-by: Roi Dayan <roid@mellanox.com>
> Acked-by: Jiri Pirko <jiri@mellanox.com>
> Signed-off-by: Paul Blakey <paulb@mellanox.com>

Applied and queued up for -stable, thanks.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH net] net: sched: cls: Fix offloading when ingress dev is vxlan
  2018-06-05  8:04 [PATCH net] net: sched: cls: Fix offloading when ingress dev is vxlan Paul Blakey
  2018-06-05 14:30 ` David Miller
@ 2018-06-05 18:57 ` Jakub Kicinski
  2018-06-05 18:59   ` Jakub Kicinski
  2018-06-05 19:06   ` David Miller
  1 sibling, 2 replies; 14+ messages in thread
From: Jakub Kicinski @ 2018-06-05 18:57 UTC (permalink / raw)
  To: Paul Blakey
  Cc: Jiri Pirko, Cong Wang, Jamal Hadi Salim, David Miller, netdev,
	Yevgeny Kliteynik, Roi Dayan, Shahar Klein, Mark Bloch,
	Or Gerlitz

On Tue,  5 Jun 2018 11:04:03 +0300, Paul Blakey wrote:
> When using a vxlan device as the ingress dev, we count it as a
> "no offload dev", so when such a rule comes and err stop is true,
> we fail early and don't try the egdev route which can offload it
> through the egress device.
> 
> Fix that by not calling the block offload if one of the devices
> attached to it is not offload capable, but make sure egress on such case
> is capable instead.
> 
> Fixes: caa7260156eb ("net: sched: keep track of offloaded filters [..]")
> Reviewed-by: Roi Dayan <roid@mellanox.com>
> Acked-by: Jiri Pirko <jiri@mellanox.com>
> Signed-off-by: Paul Blakey <paulb@mellanox.com>

Very poor commit message.  What you're doing is re-enabling skip_sw
filters on tunnel devices which semantically make no sense and
shouldn't have been allowed in the first place.

This will breaks block sharing between tunnels and HW netdevs (because
you skip the tcf_block_cb_call() completely).  The entire egdev idea
remains badly broken accepting filters like this:

# tc filter add dev vxlan0 ingress \
	matchall action skip_sw \
		mirred egress redirect dev lo \
		mirred egress redirect dev sw1np0

Do we still care about correctness and not breaking backward
compatibility?

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH net] net: sched: cls: Fix offloading when ingress dev is vxlan
  2018-06-05 18:57 ` Jakub Kicinski
@ 2018-06-05 18:59   ` Jakub Kicinski
  2018-06-06  5:12     ` Or Gerlitz
  2018-06-05 19:06   ` David Miller
  1 sibling, 1 reply; 14+ messages in thread
From: Jakub Kicinski @ 2018-06-05 18:59 UTC (permalink / raw)
  To: Paul Blakey
  Cc: Jiri Pirko, Cong Wang, Jamal Hadi Salim, David Miller, netdev,
	Yevgeny Kliteynik, Roi Dayan, Shahar Klein, Mark Bloch,
	Or Gerlitz

On Tue, 5 Jun 2018 11:57:47 -0700, Jakub Kicinski wrote:
> On Tue,  5 Jun 2018 11:04:03 +0300, Paul Blakey wrote:
> > When using a vxlan device as the ingress dev, we count it as a
> > "no offload dev", so when such a rule comes and err stop is true,
> > we fail early and don't try the egdev route which can offload it
> > through the egress device.
> > 
> > Fix that by not calling the block offload if one of the devices
> > attached to it is not offload capable, but make sure egress on such case
> > is capable instead.
> > 
> > Fixes: caa7260156eb ("net: sched: keep track of offloaded filters [..]")
> > Reviewed-by: Roi Dayan <roid@mellanox.com>
> > Acked-by: Jiri Pirko <jiri@mellanox.com>
> > Signed-off-by: Paul Blakey <paulb@mellanox.com>  
> 
> Very poor commit message.  What you're doing is re-enabling skip_sw
> filters on tunnel devices which semantically make no sense and
> shouldn't have been allowed in the first place.
> 
> This will breaks block sharing between tunnels and HW netdevs (because
> you skip the tcf_block_cb_call() completely).  The entire egdev idea
> remains badly broken accepting filters like this:
> 
> # tc filter add dev vxlan0 ingress \
> 	matchall action skip_sw \
> 		mirred egress redirect dev lo \
> 		mirred egress redirect dev sw1np0

For above we probably need something like this (untested):

diff --git a/net/sched/act_api.c b/net/sched/act_api.c
index 3f4cf930f809..71e5eebec31a 100644
--- a/net/sched/act_api.c
+++ b/net/sched/act_api.c
@@ -1511,7 +1511,7 @@ int tc_setup_cb_egdev_call(const struct net_device *dev,
        struct tcf_action_egdev *egdev = tcf_action_egdev_lookup(dev);
 
        if (!egdev)
-               return 0;
+               return err_stop ? -EOPNOTSUPP : 0;
        return tcf_action_egdev_cb_call(egdev, type, type_data, err_stop);
 }
 EXPORT_SYMBOL_GPL(tc_setup_cb_egdev_call);

But the correct fix is to remove egdev crutch completely IMO.

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH net] net: sched: cls: Fix offloading when ingress dev is vxlan
  2018-06-05 18:57 ` Jakub Kicinski
  2018-06-05 18:59   ` Jakub Kicinski
@ 2018-06-05 19:06   ` David Miller
  2018-06-05 21:27     ` Jakub Kicinski
  1 sibling, 1 reply; 14+ messages in thread
From: David Miller @ 2018-06-05 19:06 UTC (permalink / raw)
  To: kubakici
  Cc: paulb, jiri, xiyou.wangcong, jhs, netdev, kliteyn, roid, shahark,
	markb, ogerlitz

From: Jakub Kicinski <kubakici@wp.pl>
Date: Tue, 5 Jun 2018 11:57:47 -0700

> Do we still care about correctness and not breaking backward
> compatibility?

Jakub let me know if you want me to revert this change.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH net] net: sched: cls: Fix offloading when ingress dev is vxlan
  2018-06-05 19:06   ` David Miller
@ 2018-06-05 21:27     ` Jakub Kicinski
  2018-06-06  5:15       ` Or Gerlitz
                         ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: Jakub Kicinski @ 2018-06-05 21:27 UTC (permalink / raw)
  To: David Miller
  Cc: paulb, jiri, xiyou.wangcong, jhs, netdev, kliteyn, roid, shahark,
	markb, ogerlitz

On Tue, 05 Jun 2018 15:06:40 -0400 (EDT), David Miller wrote:
> From: Jakub Kicinski <kubakici@wp.pl>
> Date: Tue, 5 Jun 2018 11:57:47 -0700
> 
> > Do we still care about correctness and not breaking backward
> > compatibility?  
> 
> Jakub let me know if you want me to revert this change.

Yes, I think this patch introduces a regression when block is shared
between offload capable and in-capable device, therefore it should be
reverted.  Shared blocks went through a number of review cycles to
ensure such cases are handled correctly.


Longer story about the actual issue which is never explained in the
commit message is this: in kernels 4.10 - 4.14 skip_sw flag was
supported on tunnels in cls_flower only:

static int fl_hw_replace_filter(struct tcf_proto *tp,
[...]
	if (!tc_can_offload(dev)) {
		if (tcf_exts_get_dev(dev, &f->exts, &f->hw_dev) ||
		    (f->hw_dev && !tc_can_offload(f->hw_dev))) {
			f->hw_dev = dev;
			return tc_skip_sw(f->flags) ? -EINVAL : 0;
		}
		dev = f->hw_dev;
		cls_flower.egress_dev = true;
	} else {
		f->hw_dev = dev;
	}


In 4.15 - 4.17 with addition of shared blocks egdev mechanism was
promoted to a generic TC thing supported on all filters but it no
longer works with skip_sw.

I'd argue skip_sw is incorrect for tunnels, because the rule will only
apply to traffic ingressing on ASIC ports, not all traffic which hits
the tunnel netdev.  Therefore we should keep the 4.15 - 4.17 behaviour.

But that's a side note, I don't think we should be breaking offload on
shared blocks whether we decide to support skip_sw on tunnels or not.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH net] net: sched: cls: Fix offloading when ingress dev is vxlan
  2018-06-05 18:59   ` Jakub Kicinski
@ 2018-06-06  5:12     ` Or Gerlitz
  2018-06-06 15:50       ` Jakub Kicinski
  0 siblings, 1 reply; 14+ messages in thread
From: Or Gerlitz @ 2018-06-06  5:12 UTC (permalink / raw)
  To: Jakub Kicinski, Paul Blakey
  Cc: Jiri Pirko, Cong Wang, Jamal Hadi Salim, David Miller,
	Linux Netdev List, Yevgeny Kliteynik, Roi Dayan, Shahar Klein,
	Mark Bloch, Or Gerlitz

On Tue, Jun 5, 2018 at 9:59 PM, Jakub Kicinski <kubakici@wp.pl> wrote:
> On Tue, 5 Jun 2018 11:57:47 -0700, Jakub Kicinski wrote:
>> On Tue,  5 Jun 2018 11:04:03 +0300, Paul Blakey wrote:
>> > When using a vxlan device as the ingress dev, we count it as a
>> > "no offload dev", so when such a rule comes and err stop is true,
>> > we fail early and don't try the egdev route which can offload it
>> > through the egress device.
>> >
>> > Fix that by not calling the block offload if one of the devices
>> > attached to it is not offload capable, but make sure egress on such case
>> > is capable instead.
>> >
>> > Fixes: caa7260156eb ("net: sched: keep track of offloaded filters [..]")
>> > Reviewed-by: Roi Dayan <roid@mellanox.com>
>> > Acked-by: Jiri Pirko <jiri@mellanox.com>
>> > Signed-off-by: Paul Blakey <paulb@mellanox.com>
>>
>> Very poor commit message.  What you're doing is re-enabling skip_sw
>> filters on tunnel devices which semantically make no sense and
>> shouldn't have been allowed in the first place.
>>
>> This will breaks block sharing between tunnels and HW netdevs (because
>> you skip the tcf_block_cb_call() completely).  The entire egdev idea
>> remains badly broken accepting filters like this:
>>
>> # tc filter add dev vxlan0 ingress \
>>       matchall action skip_sw \
>>               mirred egress redirect dev lo \
>>               mirred egress redirect dev sw1np0
>
> For above we probably need something like this (untested):
>
> diff --git a/net/sched/act_api.c b/net/sched/act_api.c
> index 3f4cf930f809..71e5eebec31a 100644
> --- a/net/sched/act_api.c
> +++ b/net/sched/act_api.c
> @@ -1511,7 +1511,7 @@ int tc_setup_cb_egdev_call(const struct net_device *dev,
>         struct tcf_action_egdev *egdev = tcf_action_egdev_lookup(dev);
>
>         if (!egdev)
> -               return 0;
> +               return err_stop ? -EOPNOTSUPP : 0;
>         return tcf_action_egdev_cb_call(egdev, type, type_data, err_stop);
>  }
>  EXPORT_SYMBOL_GPL(tc_setup_cb_egdev_call);

Jakub,

We will test it out and let you know

> But the correct fix is to remove egdev crutch completely IMO.

Not against it, sometimes designs should change and be replaced
with better ones, happens

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH net] net: sched: cls: Fix offloading when ingress dev is vxlan
  2018-06-05 21:27     ` Jakub Kicinski
@ 2018-06-06  5:15       ` Or Gerlitz
  2018-06-06 15:46         ` Jakub Kicinski
  2018-06-06  7:59       ` Paul Blakey
  2018-06-06 17:56       ` David Miller
  2 siblings, 1 reply; 14+ messages in thread
From: Or Gerlitz @ 2018-06-06  5:15 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: David Miller, Paul Blakey, Jiri Pirko, Cong Wang,
	Jamal Hadi Salim, Linux Netdev List, Yevgeny Kliteynik,
	Roi Dayan, Shahar Klein, Mark Bloch, Or Gerlitz

On Wed, Jun 6, 2018 at 12:27 AM, Jakub Kicinski <kubakici@wp.pl> wrote:
> On Tue, 05 Jun 2018 15:06:40 -0400 (EDT), David Miller wrote:
>> From: Jakub Kicinski <kubakici@wp.pl>
>> Date: Tue, 5 Jun 2018 11:57:47 -0700
>>
>> > Do we still care about correctness and not breaking backward
>> > compatibility?
>>
>> Jakub let me know if you want me to revert this change.
>
> Yes, I think this patch introduces a regression when block is shared
> between offload capable and in-capable device, therefore it should be
> reverted.  Shared blocks went through a number of review cycles to
> ensure such cases are handled correctly.
>
>
> Longer story about the actual issue which is never explained in the
> commit message is this: in kernels 4.10 - 4.14 skip_sw flag was
> supported on tunnels in cls_flower only:
>
> static int fl_hw_replace_filter(struct tcf_proto *tp,
> [...]
>         if (!tc_can_offload(dev)) {
>                 if (tcf_exts_get_dev(dev, &f->exts, &f->hw_dev) ||
>                     (f->hw_dev && !tc_can_offload(f->hw_dev))) {
>                         f->hw_dev = dev;
>                         return tc_skip_sw(f->flags) ? -EINVAL : 0;
>                 }
>                 dev = f->hw_dev;
>                 cls_flower.egress_dev = true;
>         } else {
>                 f->hw_dev = dev;
>         }
>
>
> In 4.15 - 4.17 with addition of shared blocks egdev mechanism was
> promoted to a generic TC thing supported on all filters but it no
> longer works with skip_sw.
>
> I'd argue skip_sw is incorrect for tunnels, because the rule will only
> apply to traffic ingressing on ASIC ports, not all traffic which hits
> the tunnel netdev.

This argument makes sense, however, skip_sw for tunnel decap rules
 **is** allowed since 4.10 and we have some sort of regression here (turns
out that before and after the patch..)

> Therefore we should keep the 4.15 - 4.17 behaviour.
> But that's a side note, I don't think we should be breaking offload on
> shared blocks whether we decide to support skip_sw on tunnels or not.

skip_sw on tunnels was there before shared-block, newer features should
take care not to break existing ones.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH net] net: sched: cls: Fix offloading when ingress dev is vxlan
  2018-06-05 21:27     ` Jakub Kicinski
  2018-06-06  5:15       ` Or Gerlitz
@ 2018-06-06  7:59       ` Paul Blakey
  2018-06-06 15:57         ` Jakub Kicinski
  2018-06-06 17:56       ` David Miller
  2 siblings, 1 reply; 14+ messages in thread
From: Paul Blakey @ 2018-06-06  7:59 UTC (permalink / raw)
  To: Jakub Kicinski, David Miller
  Cc: paulb, jiri, xiyou.wangcong, jhs, netdev, kliteyn, roid, shahark,
	markb, ogerlitz



On 06/06/2018 00:27, Jakub Kicinski wrote:
> On Tue, 05 Jun 2018 15:06:40 -0400 (EDT), David Miller wrote:
>> From: Jakub Kicinski <kubakici@wp.pl>
>> Date: Tue, 5 Jun 2018 11:57:47 -0700
>>
>>> Do we still care about correctness and not breaking backward
>>> compatibility?
>>
>> Jakub let me know if you want me to revert this change.
> 
> Yes, I think this patch introduces a regression when block is shared
> between offload capable and in-capable device, therefore it should be
> reverted.  Shared blocks went through a number of review cycles to
> ensure such cases are handled correctly.
> 
> 
> Longer story about the actual issue which is never explained in the
> commit message is this: in kernels 4.10 - 4.14 skip_sw flag was
> supported on tunnels in cls_flower only:
> 
> static int fl_hw_replace_filter(struct tcf_proto *tp,
> [...]
> 	if (!tc_can_offload(dev)) {
> 		if (tcf_exts_get_dev(dev, &f->exts, &f->hw_dev) ||
> 		    (f->hw_dev && !tc_can_offload(f->hw_dev))) {
> 			f->hw_dev = dev;
> 			return tc_skip_sw(f->flags) ? -EINVAL : 0;
> 		}
> 		dev = f->hw_dev;
> 		cls_flower.egress_dev = true;
> 	} else {
> 		f->hw_dev = dev;
> 	}
> 
> 
> In 4.15 - 4.17 with addition of shared blocks egdev mechanism was
> promoted to a generic TC thing supported on all filters but it no
> longer works with skip_sw.
> 
> I'd argue skip_sw is incorrect for tunnels, because the rule will only
> apply to traffic ingressing on ASIC ports, not all traffic which hits
> the tunnel netdev.  Therefore we should keep the 4.15 - 4.17 behaviour.
> 
> But that's a side note, I don't think we should be breaking offload on
> shared blocks whether we decide to support skip_sw on tunnels or not.
> 

Maybe we can apply my patch logic of still trying the egress dev if the 
block has a single device, and not shared. Is that ok with you?

You're patch seems good as an add on, but the egress hw device (sw1np0) 
would go over the tc actions and see if it can offload such rule (output 
to software lo device) and would fail anyway.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH net] net: sched: cls: Fix offloading when ingress dev is vxlan
  2018-06-06  5:15       ` Or Gerlitz
@ 2018-06-06 15:46         ` Jakub Kicinski
  0 siblings, 0 replies; 14+ messages in thread
From: Jakub Kicinski @ 2018-06-06 15:46 UTC (permalink / raw)
  To: Or Gerlitz
  Cc: David Miller, Paul Blakey, Jiri Pirko, Cong Wang,
	Jamal Hadi Salim, Linux Netdev List, Yevgeny Kliteynik,
	Roi Dayan, Shahar Klein, Mark Bloch, Or Gerlitz

On Wed, 6 Jun 2018 08:15:27 +0300, Or Gerlitz wrote:
> On Wed, Jun 6, 2018 at 12:27 AM, Jakub Kicinski <kubakici@wp.pl> wrote:
> > On Tue, 05 Jun 2018 15:06:40 -0400 (EDT), David Miller wrote:  
> >> From: Jakub Kicinski <kubakici@wp.pl>
> >> Date: Tue, 5 Jun 2018 11:57:47 -0700
> >>  
> >> > Do we still care about correctness and not breaking backward
> >> > compatibility?  
> >>
> >> Jakub let me know if you want me to revert this change.  
> >
> > Yes, I think this patch introduces a regression when block is shared
> > between offload capable and in-capable device, therefore it should be
> > reverted.  Shared blocks went through a number of review cycles to
> > ensure such cases are handled correctly.
> >
> >
> > Longer story about the actual issue which is never explained in the
> > commit message is this: in kernels 4.10 - 4.14 skip_sw flag was
> > supported on tunnels in cls_flower only:
> >
> > static int fl_hw_replace_filter(struct tcf_proto *tp,
> > [...]
> >         if (!tc_can_offload(dev)) {
> >                 if (tcf_exts_get_dev(dev, &f->exts, &f->hw_dev) ||
> >                     (f->hw_dev && !tc_can_offload(f->hw_dev))) {
> >                         f->hw_dev = dev;
> >                         return tc_skip_sw(f->flags) ? -EINVAL : 0;
> >                 }
> >                 dev = f->hw_dev;
> >                 cls_flower.egress_dev = true;
> >         } else {
> >                 f->hw_dev = dev;
> >         }
> >
> >
> > In 4.15 - 4.17 with addition of shared blocks egdev mechanism was
> > promoted to a generic TC thing supported on all filters but it no
> > longer works with skip_sw.
> >
> > I'd argue skip_sw is incorrect for tunnels, because the rule will only
> > apply to traffic ingressing on ASIC ports, not all traffic which hits
> > the tunnel netdev.  
> 
> This argument makes sense, however, skip_sw for tunnel decap rules
>  **is** allowed since 4.10 and we have some sort of regression here (turns
> out that before and after the patch..)

As I said it was allowed in 4 releases, which was a mistake, in last 3
it wasn't.  I understand your use case, but the semantics of skip_sw
are not preserved here so we should find a different solution.

> > Therefore we should keep the 4.15 - 4.17 behaviour.
> > But that's a side note, I don't think we should be breaking offload on
> > shared blocks whether we decide to support skip_sw on tunnels or not.  
> 
> skip_sw on tunnels was there before shared-block, newer features should
> take care not to break existing ones.

Oh, I agree we shouldn't break existing use cases so please don't break
the use case I mentioned above.  I want to set up shared block between
a LAG and its members.  Now since the bond will not be offload-capable
TC will not even make an attempt to offload to members.

I'm gonna test that my reading of the code is correct and send a revert
later today, sorry.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH net] net: sched: cls: Fix offloading when ingress dev is vxlan
  2018-06-06  5:12     ` Or Gerlitz
@ 2018-06-06 15:50       ` Jakub Kicinski
  2018-06-06 19:07         ` Or Gerlitz
  0 siblings, 1 reply; 14+ messages in thread
From: Jakub Kicinski @ 2018-06-06 15:50 UTC (permalink / raw)
  To: Or Gerlitz
  Cc: Paul Blakey, Jiri Pirko, Cong Wang, Jamal Hadi Salim,
	David Miller, Linux Netdev List, Yevgeny Kliteynik, Roi Dayan,
	Shahar Klein, Mark Bloch, Or Gerlitz

On Wed, 6 Jun 2018 08:12:20 +0300, Or Gerlitz wrote:
> On Tue, Jun 5, 2018 at 9:59 PM, Jakub Kicinski <kubakici@wp.pl> wrote:
> > On Tue, 5 Jun 2018 11:57:47 -0700, Jakub Kicinski wrote:  
> >> On Tue,  5 Jun 2018 11:04:03 +0300, Paul Blakey wrote:  
> >> > When using a vxlan device as the ingress dev, we count it as a
> >> > "no offload dev", so when such a rule comes and err stop is true,
> >> > we fail early and don't try the egdev route which can offload it
> >> > through the egress device.
> >> >
> >> > Fix that by not calling the block offload if one of the devices
> >> > attached to it is not offload capable, but make sure egress on such case
> >> > is capable instead.
> >> >
> >> > Fixes: caa7260156eb ("net: sched: keep track of offloaded filters [..]")
> >> > Reviewed-by: Roi Dayan <roid@mellanox.com>
> >> > Acked-by: Jiri Pirko <jiri@mellanox.com>
> >> > Signed-off-by: Paul Blakey <paulb@mellanox.com>  
> >>
> >> Very poor commit message.  What you're doing is re-enabling skip_sw
> >> filters on tunnel devices which semantically make no sense and
> >> shouldn't have been allowed in the first place.
> >>
> >> This will breaks block sharing between tunnels and HW netdevs (because
> >> you skip the tcf_block_cb_call() completely).  The entire egdev idea
> >> remains badly broken accepting filters like this:
> >>
> >> # tc filter add dev vxlan0 ingress \
> >>       matchall action skip_sw \
> >>               mirred egress redirect dev lo \
> >>               mirred egress redirect dev sw1np0  
> >
> > For above we probably need something like this (untested):
> >
> > diff --git a/net/sched/act_api.c b/net/sched/act_api.c
> > index 3f4cf930f809..71e5eebec31a 100644
> > --- a/net/sched/act_api.c
> > +++ b/net/sched/act_api.c
> > @@ -1511,7 +1511,7 @@ int tc_setup_cb_egdev_call(const struct net_device *dev,
> >         struct tcf_action_egdev *egdev = tcf_action_egdev_lookup(dev);
> >
> >         if (!egdev)
> > -               return 0;
> > +               return err_stop ? -EOPNOTSUPP : 0;
> >         return tcf_action_egdev_cb_call(egdev, type, type_data, err_stop);
> >  }
> >  EXPORT_SYMBOL_GPL(tc_setup_cb_egdev_call);  
> 
> Jakub,
> 
> We will test it out and let you know

That's probably insufficient, because the second action should be
skipped completely.  But an improvement nonetheless :)

> > But the correct fix is to remove egdev crutch completely IMO.  
> 
> Not against it, sometimes designs should change and be replaced
> with better ones, happens

Cool, I'm not sure when we'll get to it, but perhaps we can go over 
the ideas I presented at netconf on switchdev call next week (if it's
still active)?

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH net] net: sched: cls: Fix offloading when ingress dev is vxlan
  2018-06-06  7:59       ` Paul Blakey
@ 2018-06-06 15:57         ` Jakub Kicinski
  0 siblings, 0 replies; 14+ messages in thread
From: Jakub Kicinski @ 2018-06-06 15:57 UTC (permalink / raw)
  To: Paul Blakey
  Cc: David Miller, jiri, xiyou.wangcong, jhs, netdev, kliteyn, roid,
	shahark, markb, ogerlitz

On Wed, 6 Jun 2018 10:59:30 +0300, Paul Blakey wrote:
> Maybe we can apply my patch logic of still trying the egress dev if the 
> block has a single device, and not shared. Is that ok with you?

I don't remember that patch but sounds pretty bad.

> You're patch seems good as an add on, but the egress hw device (sw1np0) 
> would go over the tc actions and see if it can offload such rule (output 
> to software lo device) and would fail anyway.

Ack, I'm not trying to solve the skip_sw problem.  It's a hard one.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH net] net: sched: cls: Fix offloading when ingress dev is vxlan
  2018-06-05 21:27     ` Jakub Kicinski
  2018-06-06  5:15       ` Or Gerlitz
  2018-06-06  7:59       ` Paul Blakey
@ 2018-06-06 17:56       ` David Miller
  2 siblings, 0 replies; 14+ messages in thread
From: David Miller @ 2018-06-06 17:56 UTC (permalink / raw)
  To: kubakici
  Cc: paulb, jiri, xiyou.wangcong, jhs, netdev, kliteyn, roid, shahark,
	markb, ogerlitz

From: Jakub Kicinski <kubakici@wp.pl>
Date: Tue, 5 Jun 2018 14:27:00 -0700

> On Tue, 05 Jun 2018 15:06:40 -0400 (EDT), David Miller wrote:
>> From: Jakub Kicinski <kubakici@wp.pl>
>> Date: Tue, 5 Jun 2018 11:57:47 -0700
>> 
>> > Do we still care about correctness and not breaking backward
>> > compatibility?  
>> 
>> Jakub let me know if you want me to revert this change.
> 
> Yes, I think this patch introduces a regression when block is shared
> between offload capable and in-capable device, therefore it should be
> reverted.

Ok, I've reverted the change.  Please sort out how to fix things
properly.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH net] net: sched: cls: Fix offloading when ingress dev is vxlan
  2018-06-06 15:50       ` Jakub Kicinski
@ 2018-06-06 19:07         ` Or Gerlitz
  0 siblings, 0 replies; 14+ messages in thread
From: Or Gerlitz @ 2018-06-06 19:07 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Paul Blakey, Jiri Pirko, Cong Wang, Jamal Hadi Salim,
	David Miller, Linux Netdev List, Yevgeny Kliteynik, Roi Dayan,
	Shahar Klein, Mark Bloch, Or Gerlitz

On Wed, Jun 6, 2018 at 6:50 PM, Jakub Kicinski <kubakici@wp.pl> wrote:
> On Wed, 6 Jun 2018 08:12:20 +0300, Or Gerlitz wrote:
>> On Tue, Jun 5, 2018 at 9:59 PM, Jakub Kicinski <kubakici@wp.pl> wrote:

>> > But the correct fix is to remove egdev crutch completely IMO.

>> Not against it, sometimes designs should change and be replaced
>> with better ones, happens

> Cool, I'm not sure when we'll get to it, but perhaps we can go over
> the ideas I presented at netconf on switchdev call next week (if it's
> still active)?

sounds good, I'll check. Where the netconf slides can be found?

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2018-06-06 19:07 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-06-05  8:04 [PATCH net] net: sched: cls: Fix offloading when ingress dev is vxlan Paul Blakey
2018-06-05 14:30 ` David Miller
2018-06-05 18:57 ` Jakub Kicinski
2018-06-05 18:59   ` Jakub Kicinski
2018-06-06  5:12     ` Or Gerlitz
2018-06-06 15:50       ` Jakub Kicinski
2018-06-06 19:07         ` Or Gerlitz
2018-06-05 19:06   ` David Miller
2018-06-05 21:27     ` Jakub Kicinski
2018-06-06  5:15       ` Or Gerlitz
2018-06-06 15:46         ` Jakub Kicinski
2018-06-06  7:59       ` Paul Blakey
2018-06-06 15:57         ` Jakub Kicinski
2018-06-06 17:56       ` David Miller

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.