linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net] net: sched: act_mirred: drop skb's dst_entry in ingress redirection
@ 2019-10-12  7:16 Zhiyuan Hou
  2019-10-12 10:59 ` Eric Dumazet
                   ` (2 more replies)
  0 siblings, 3 replies; 15+ messages in thread
From: Zhiyuan Hou @ 2019-10-12  7:16 UTC (permalink / raw)
  To: Jamal Hadi Salim, Cong Wang, Jiri Pirko, David S . Miller
  Cc: netdev, linux-kernel

In act_mirred's ingress redirection, if the skb's dst_entry is valid
when call function netif_receive_skb, the fllowing l3 stack process
(ip_rcv_finish_core) will check dst_entry and skip the routing
decision. Using the old dst_entry is unexpected and may discard the
skb in some case. For example dst->dst_input points to dst_discard.

This patch drops the skb's dst_entry before calling netif_receive_skb
so that the skb can be made routing decision like a normal ingress
skb.

Signed-off-by: Zhiyuan Hou <zhiyuan2048@linux.alibaba.com>
---
 net/sched/act_mirred.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/net/sched/act_mirred.c b/net/sched/act_mirred.c
index 9ce073a05414..6108a64c0cd5 100644
--- a/net/sched/act_mirred.c
+++ b/net/sched/act_mirred.c
@@ -18,6 +18,7 @@
 #include <linux/gfp.h>
 #include <linux/if_arp.h>
 #include <net/net_namespace.h>
+#include <net/dst.h>
 #include <net/netlink.h>
 #include <net/pkt_sched.h>
 #include <net/pkt_cls.h>
@@ -298,8 +299,10 @@ static int tcf_mirred_act(struct sk_buff *skb, const struct tc_action *a,
 
 	if (!want_ingress)
 		err = dev_queue_xmit(skb2);
-	else
+	else {
+		skb_dst_drop(skb2);
 		err = netif_receive_skb(skb2);
+	}
 
 	if (err) {
 out:
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH net] net: sched: act_mirred: drop skb's dst_entry in ingress redirection
  2019-10-12  7:16 [PATCH net] net: sched: act_mirred: drop skb's dst_entry in ingress redirection Zhiyuan Hou
@ 2019-10-12 10:59 ` Eric Dumazet
  2019-10-14  7:07   ` Zhiyuan Hou
  2019-10-12 19:34 ` Sergei Shtylyov
  2019-10-14 17:57 ` Cong Wang
  2 siblings, 1 reply; 15+ messages in thread
From: Eric Dumazet @ 2019-10-12 10:59 UTC (permalink / raw)
  To: Zhiyuan Hou, Jamal Hadi Salim, Cong Wang, Jiri Pirko, David S . Miller
  Cc: netdev, linux-kernel



On 10/12/19 12:16 AM, Zhiyuan Hou wrote:
> In act_mirred's ingress redirection, if the skb's dst_entry is valid
> when call function netif_receive_skb, the fllowing l3 stack process
> (ip_rcv_finish_core) will check dst_entry and skip the routing
> decision. Using the old dst_entry is unexpected and may discard the
> skb in some case. For example dst->dst_input points to dst_discard.
> 
> This patch drops the skb's dst_entry before calling netif_receive_skb
> so that the skb can be made routing decision like a normal ingress
> skb.
> 
> Signed-off-by: Zhiyuan Hou <zhiyuan2048@linux.alibaba.com>
> ---
>  net/sched/act_mirred.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/net/sched/act_mirred.c b/net/sched/act_mirred.c
> index 9ce073a05414..6108a64c0cd5 100644
> --- a/net/sched/act_mirred.c
> +++ b/net/sched/act_mirred.c
> @@ -18,6 +18,7 @@
>  #include <linux/gfp.h>
>  #include <linux/if_arp.h>
>  #include <net/net_namespace.h>
> +#include <net/dst.h>
>  #include <net/netlink.h>
>  #include <net/pkt_sched.h>
>  #include <net/pkt_cls.h>
> @@ -298,8 +299,10 @@ static int tcf_mirred_act(struct sk_buff *skb, const struct tc_action *a,
>  
>  	if (!want_ingress)
>  		err = dev_queue_xmit(skb2);
> -	else
> +	else {
> +		skb_dst_drop(skb2);
>  		err = netif_receive_skb(skb2);
> +	}
>  
>  	if (err) {
>  out:
> 

Why is dst_discard used ?

This could actually drop packets, for loopback.

A Fixes: tag would tremendously help, I wonder if you are not working around
the other issue Wei was tracking yesterday ( https://www.spinics.net/lists/netdev/msg604397.html )


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH net] net: sched: act_mirred: drop skb's dst_entry in ingress redirection
  2019-10-12  7:16 [PATCH net] net: sched: act_mirred: drop skb's dst_entry in ingress redirection Zhiyuan Hou
  2019-10-12 10:59 ` Eric Dumazet
@ 2019-10-12 19:34 ` Sergei Shtylyov
  2019-10-14  7:08   ` Zhiyuan Hou
  2019-10-14 17:57 ` Cong Wang
  2 siblings, 1 reply; 15+ messages in thread
From: Sergei Shtylyov @ 2019-10-12 19:34 UTC (permalink / raw)
  To: Zhiyuan Hou, Jamal Hadi Salim, Cong Wang, Jiri Pirko, David S . Miller
  Cc: netdev, linux-kernel

Hello!

On 10/12/2019 10:16 AM, Zhiyuan Hou wrote:

> In act_mirred's ingress redirection, if the skb's dst_entry is valid
> when call function netif_receive_skb, the fllowing l3 stack process

  Following or flowing?

> (ip_rcv_finish_core) will check dst_entry and skip the routing
> decision. Using the old dst_entry is unexpected and may discard the
> skb in some case. For example dst->dst_input points to dst_discard.
> 
> This patch drops the skb's dst_entry before calling netif_receive_skb
> so that the skb can be made routing decision like a normal ingress
> skb.
> 
> Signed-off-by: Zhiyuan Hou <zhiyuan2048@linux.alibaba.com>
> ---
>  net/sched/act_mirred.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/net/sched/act_mirred.c b/net/sched/act_mirred.c
> index 9ce073a05414..6108a64c0cd5 100644
> --- a/net/sched/act_mirred.c
> +++ b/net/sched/act_mirred.c
[...]
> @@ -298,8 +299,10 @@ static int tcf_mirred_act(struct sk_buff *skb, const struct tc_action *a,
>  
>  	if (!want_ingress)
>  		err = dev_queue_xmit(skb2);
> -	else
> +	else {
> +		skb_dst_drop(skb2);
>  		err = netif_receive_skb(skb2);
> +	}

   If you introduce {} in one *if* branch, {} should be added to the other branches as well,
says CodingStyle.

[...]

MBR, Sergei

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH net] net: sched: act_mirred: drop skb's dst_entry in ingress redirection
  2019-10-12 10:59 ` Eric Dumazet
@ 2019-10-14  7:07   ` Zhiyuan Hou
  2019-10-14 12:46     ` Eric Dumazet
  0 siblings, 1 reply; 15+ messages in thread
From: Zhiyuan Hou @ 2019-10-14  7:07 UTC (permalink / raw)
  To: Eric Dumazet, Jamal Hadi Salim, Cong Wang, Jiri Pirko, David S . Miller
  Cc: netdev, linux-kernel


On 2019/10/12 6:59 下午, Eric Dumazet wrote:
>
> On 10/12/19 12:16 AM, Zhiyuan Hou wrote:
>> In act_mirred's ingress redirection, if the skb's dst_entry is valid
>> when call function netif_receive_skb, the fllowing l3 stack process
>> (ip_rcv_finish_core) will check dst_entry and skip the routing
>> decision. Using the old dst_entry is unexpected and may discard the
>> skb in some case. For example dst->dst_input points to dst_discard.
>>
>> This patch drops the skb's dst_entry before calling netif_receive_skb
>> so that the skb can be made routing decision like a normal ingress
>> skb.
>>
>> Signed-off-by: Zhiyuan Hou <zhiyuan2048@linux.alibaba.com>
>> ---
>>   net/sched/act_mirred.c | 5 ++++-
>>   1 file changed, 4 insertions(+), 1 deletion(-)
>>
>> diff --git a/net/sched/act_mirred.c b/net/sched/act_mirred.c
>> index 9ce073a05414..6108a64c0cd5 100644
>> --- a/net/sched/act_mirred.c
>> +++ b/net/sched/act_mirred.c
>> @@ -18,6 +18,7 @@
>>   #include <linux/gfp.h>
>>   #include <linux/if_arp.h>
>>   #include <net/net_namespace.h>
>> +#include <net/dst.h>
>>   #include <net/netlink.h>
>>   #include <net/pkt_sched.h>
>>   #include <net/pkt_cls.h>
>> @@ -298,8 +299,10 @@ static int tcf_mirred_act(struct sk_buff *skb, const struct tc_action *a,
>>   
>>   	if (!want_ingress)
>>   		err = dev_queue_xmit(skb2);
>> -	else
>> +	else {
>> +		skb_dst_drop(skb2);
>>   		err = netif_receive_skb(skb2);
>> +	}
>>   
>>   	if (err) {
>>   out:
>>
> Why is dst_discard used ?
When send a skb from local to external, the dst->dst_input will be
assigned dst_discard after routing decision. So if we redirect these
skbs to ingress stack, it will be dropped.

For ipvlan l2 mode or macvlan, clsact egress filters on master deivce
may also meet these skbs even if they came from slave device. Ingress
redirection on these skbs may drop them on l3 stack.
> This could actually drop packets, for loopback.
>
> A Fixes: tag would tremendously help, I wonder if you are not working around
> the other issue Wei was tracking yesterday ( https://www.spinics.net/lists/netdev/msg604397.html )
No, this is a different issue ^_^.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH net] net: sched: act_mirred: drop skb's dst_entry in ingress redirection
  2019-10-12 19:34 ` Sergei Shtylyov
@ 2019-10-14  7:08   ` Zhiyuan Hou
  0 siblings, 0 replies; 15+ messages in thread
From: Zhiyuan Hou @ 2019-10-14  7:08 UTC (permalink / raw)
  To: Sergei Shtylyov, Jamal Hadi Salim, Cong Wang, Jiri Pirko,
	David S . Miller
  Cc: netdev, linux-kernel


On 2019/10/13 3:34 上午, Sergei Shtylyov wrote:
> Hello!
>
> On 10/12/2019 10:16 AM, Zhiyuan Hou wrote:
>
>> In act_mirred's ingress redirection, if the skb's dst_entry is valid
>> when call function netif_receive_skb, the fllowing l3 stack process
>    Following or flowing?
Sorry, it should be following. I will change it in next version.
>> (ip_rcv_finish_core) will check dst_entry and skip the routing
>> decision. Using the old dst_entry is unexpected and may discard the
>> skb in some case. For example dst->dst_input points to dst_discard.
>>
>> This patch drops the skb's dst_entry before calling netif_receive_skb
>> so that the skb can be made routing decision like a normal ingress
>> skb.
>>
>> Signed-off-by: Zhiyuan Hou<zhiyuan2048@linux.alibaba.com>
>> ---
>>   net/sched/act_mirred.c | 5 ++++-
>>   1 file changed, 4 insertions(+), 1 deletion(-)
>>
>> diff --git a/net/sched/act_mirred.c b/net/sched/act_mirred.c
>> index 9ce073a05414..6108a64c0cd5 100644
>> --- a/net/sched/act_mirred.c
>> +++ b/net/sched/act_mirred.c
> [...]
>> @@ -298,8 +299,10 @@ static int tcf_mirred_act(struct sk_buff *skb, const struct tc_action *a,
>>   
>>   	if (!want_ingress)
>>   		err = dev_queue_xmit(skb2);
>> -	else
>> +	else {
>> +		skb_dst_drop(skb2);
>>   		err = netif_receive_skb(skb2);
>> +	}
>     If you introduce {} in one *if* branch, {} should be added to the other branches as well,
> says CodingStyle.
I will change it in next version . Thanks.
> [...]
>
> MBR, Sergei

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH net] net: sched: act_mirred: drop skb's dst_entry in ingress redirection
  2019-10-14  7:07   ` Zhiyuan Hou
@ 2019-10-14 12:46     ` Eric Dumazet
  2019-10-17 16:08       ` Zhiyuan Hou
  0 siblings, 1 reply; 15+ messages in thread
From: Eric Dumazet @ 2019-10-14 12:46 UTC (permalink / raw)
  To: Zhiyuan Hou, Jamal Hadi Salim, Cong Wang, Jiri Pirko, David S . Miller
  Cc: netdev, linux-kernel



On 10/14/19 12:07 AM, Zhiyuan Hou wrote:
> 
> On 2019/10/12 6:59 下午, Eric Dumazet wrote:
>>
>> On 10/12/19 12:16 AM, Zhiyuan Hou wrote:
>>> In act_mirred's ingress redirection, if the skb's dst_entry is valid
>>> when call function netif_receive_skb, the fllowing l3 stack process
>>> (ip_rcv_finish_core) will check dst_entry and skip the routing
>>> decision. Using the old dst_entry is unexpected and may discard the
>>> skb in some case. For example dst->dst_input points to dst_discard.
>>>
>>> This patch drops the skb's dst_entry before calling netif_receive_skb
>>> so that the skb can be made routing decision like a normal ingress
>>> skb.
>>>
>>> Signed-off-by: Zhiyuan Hou <zhiyuan2048@linux.alibaba.com>
>>> ---
>>>   net/sched/act_mirred.c | 5 ++++-
>>>   1 file changed, 4 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/net/sched/act_mirred.c b/net/sched/act_mirred.c
>>> index 9ce073a05414..6108a64c0cd5 100644
>>> --- a/net/sched/act_mirred.c
>>> +++ b/net/sched/act_mirred.c
>>> @@ -18,6 +18,7 @@
>>>   #include <linux/gfp.h>
>>>   #include <linux/if_arp.h>
>>>   #include <net/net_namespace.h>
>>> +#include <net/dst.h>
>>>   #include <net/netlink.h>
>>>   #include <net/pkt_sched.h>
>>>   #include <net/pkt_cls.h>
>>> @@ -298,8 +299,10 @@ static int tcf_mirred_act(struct sk_buff *skb, const struct tc_action *a,
>>>         if (!want_ingress)
>>>           err = dev_queue_xmit(skb2);
>>> -    else
>>> +    else {
>>> +        skb_dst_drop(skb2);
>>>           err = netif_receive_skb(skb2);
>>> +    }
>>>         if (err) {
>>>   out:
>>>
>> Why is dst_discard used ?
> When send a skb from local to external, the dst->dst_input will be
> assigned dst_discard after routing decision. So if we redirect these
> skbs to ingress stack, it will be dropped.
> 
> For ipvlan l2 mode or macvlan, clsact egress filters on master deivce
> may also meet these skbs even if they came from slave device. Ingress
> redirection on these skbs may drop them on l3 stack.

Can you please add a test, so that we can see what you are trying to do exactly ?




>> This could actually drop packets, for loopback.
>>
>> A Fixes: tag would tremendously help, I wonder if you are not working around
>> the other issue Wei was tracking yesterday ( https://www.spinics.net/lists/netdev/msg604397.html )
> No, this is a different issue ^_^.

Please add a Fixes: tag then.

Thanks.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH net] net: sched: act_mirred: drop skb's dst_entry in ingress redirection
  2019-10-12  7:16 [PATCH net] net: sched: act_mirred: drop skb's dst_entry in ingress redirection Zhiyuan Hou
  2019-10-12 10:59 ` Eric Dumazet
  2019-10-12 19:34 ` Sergei Shtylyov
@ 2019-10-14 17:57 ` Cong Wang
  2019-10-15 17:22   ` Zhiyuan Hou
  2 siblings, 1 reply; 15+ messages in thread
From: Cong Wang @ 2019-10-14 17:57 UTC (permalink / raw)
  To: Zhiyuan Hou
  Cc: Jamal Hadi Salim, Jiri Pirko, David S . Miller,
	Linux Kernel Network Developers, LKML

On Sat, Oct 12, 2019 at 12:16 AM Zhiyuan Hou
<zhiyuan2048@linux.alibaba.com> wrote:
> diff --git a/net/sched/act_mirred.c b/net/sched/act_mirred.c
> index 9ce073a05414..6108a64c0cd5 100644
> --- a/net/sched/act_mirred.c
> +++ b/net/sched/act_mirred.c
> @@ -18,6 +18,7 @@
>  #include <linux/gfp.h>
>  #include <linux/if_arp.h>
>  #include <net/net_namespace.h>
> +#include <net/dst.h>
>  #include <net/netlink.h>
>  #include <net/pkt_sched.h>
>  #include <net/pkt_cls.h>
> @@ -298,8 +299,10 @@ static int tcf_mirred_act(struct sk_buff *skb, const struct tc_action *a,
>
>         if (!want_ingress)
>                 err = dev_queue_xmit(skb2);
> -       else
> +       else {
> +               skb_dst_drop(skb2);
>                 err = netif_receive_skb(skb2);
> +       }

Good catch!

I don't want to be picky, but it seems this is only needed
when redirecting from egress to ingress, right? That is,
ingress to ingress, or ingress to egress is okay? If not,
please fix all the cases while you are on it?

Thanks.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH net] net: sched: act_mirred: drop skb's dst_entry in ingress redirection
  2019-10-14 17:57 ` Cong Wang
@ 2019-10-15 17:22   ` Zhiyuan Hou
  2019-10-16 12:13     ` Eyal Birger
  0 siblings, 1 reply; 15+ messages in thread
From: Zhiyuan Hou @ 2019-10-15 17:22 UTC (permalink / raw)
  To: Cong Wang
  Cc: Jamal Hadi Salim, Jiri Pirko, David S . Miller,
	Linux Kernel Network Developers, LKML


On 2019/10/15 1:57 上午, Cong Wang wrote:
> On Sat, Oct 12, 2019 at 12:16 AM Zhiyuan Hou
> <zhiyuan2048@linux.alibaba.com> wrote:
>> diff --git a/net/sched/act_mirred.c b/net/sched/act_mirred.c
>> index 9ce073a05414..6108a64c0cd5 100644
>> --- a/net/sched/act_mirred.c
>> +++ b/net/sched/act_mirred.c
>> @@ -18,6 +18,7 @@
>>   #include <linux/gfp.h>
>>   #include <linux/if_arp.h>
>>   #include <net/net_namespace.h>
>> +#include <net/dst.h>
>>   #include <net/netlink.h>
>>   #include <net/pkt_sched.h>
>>   #include <net/pkt_cls.h>
>> @@ -298,8 +299,10 @@ static int tcf_mirred_act(struct sk_buff *skb, const struct tc_action *a,
>>
>>          if (!want_ingress)
>>                  err = dev_queue_xmit(skb2);
>> -       else
>> +       else {
>> +               skb_dst_drop(skb2);
>>                  err = netif_receive_skb(skb2);
>> +       }
> Good catch!
>
> I don't want to be picky, but it seems this is only needed
> when redirecting from egress to ingress, right? That is,
> ingress to ingress, or ingress to egress is okay? If not,
> please fix all the cases while you are on it?
Sure. But I think this patch is also needed when redirecting from
ingress to ingress. Because we cannot assure that a skb has null dst
in ingress redirection path. For example, if redirecting a skb from
loopback's ingress to other device's ingress, the skb will take a
dst.

As commit logs point out, skb with valid dst cannot be made routing
decision in following process. original dst may cause skb loss or
other unexpected behavior.
>
> Thanks.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH net] net: sched: act_mirred: drop skb's dst_entry in ingress redirection
  2019-10-15 17:22   ` Zhiyuan Hou
@ 2019-10-16 12:13     ` Eyal Birger
  2019-10-17 16:33       ` Zhiyuan Hou
  0 siblings, 1 reply; 15+ messages in thread
From: Eyal Birger @ 2019-10-16 12:13 UTC (permalink / raw)
  To: Zhiyuan Hou
  Cc: Cong Wang, Jamal Hadi Salim, Jiri Pirko, David S . Miller,
	Linux Kernel Network Developers, LKML, shmulik.ladkani

Hi,

On Wed, 16 Oct 2019 01:22:01 +0800
Zhiyuan Hou <zhiyuan2048@linux.alibaba.com> wrote:

> On 2019/10/15 1:57 上午, Cong Wang wrote:
> > On Sat, Oct 12, 2019 at 12:16 AM Zhiyuan Hou
> > <zhiyuan2048@linux.alibaba.com> wrote:  
> >> diff --git a/net/sched/act_mirred.c b/net/sched/act_mirred.c
> >> index 9ce073a05414..6108a64c0cd5 100644
> >> --- a/net/sched/act_mirred.c
> >> +++ b/net/sched/act_mirred.c
> >> @@ -18,6 +18,7 @@
> >>   #include <linux/gfp.h>
> >>   #include <linux/if_arp.h>
> >>   #include <net/net_namespace.h>
> >> +#include <net/dst.h>
> >>   #include <net/netlink.h>
> >>   #include <net/pkt_sched.h>
> >>   #include <net/pkt_cls.h>
> >> @@ -298,8 +299,10 @@ static int tcf_mirred_act(struct sk_buff
> >> *skb, const struct tc_action *a,
> >>
> >>          if (!want_ingress)
> >>                  err = dev_queue_xmit(skb2);
> >> -       else
> >> +       else {
> >> +               skb_dst_drop(skb2);
> >>                  err = netif_receive_skb(skb2);
> >> +       }  

> > Good catch!

Indeed! Thanks for fixing this!

> >
> > I don't want to be picky, but it seems this is only needed
> > when redirecting from egress to ingress, right? That is,
> > ingress to ingress, or ingress to egress is okay? If not,
> > please fix all the cases while you are on it?  
> Sure. But I think this patch is also needed when redirecting from
> ingress to ingress. Because we cannot assure that a skb has null dst
> in ingress redirection path. For example, if redirecting a skb from
> loopback's ingress to other device's ingress, the skb will take a
> dst.
> 
> As commit logs point out, skb with valid dst cannot be made routing
> decision in following process. original dst may cause skb loss or
> other unexpected behavior.

On the other hand, removing the dst on ingress-to-ingress redirection
may remove LWT information on incoming packets, which may be undesired.

Eyal.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH net] net: sched: act_mirred: drop skb's dst_entry in ingress redirection
  2019-10-14 12:46     ` Eric Dumazet
@ 2019-10-17 16:08       ` Zhiyuan Hou
  0 siblings, 0 replies; 15+ messages in thread
From: Zhiyuan Hou @ 2019-10-17 16:08 UTC (permalink / raw)
  To: Eric Dumazet, Jamal Hadi Salim, Cong Wang, Jiri Pirko, David S . Miller
  Cc: netdev, linux-kernel


On 2019/10/14 8:46 下午, Eric Dumazet wrote:
>
> On 10/14/19 12:07 AM, Zhiyuan Hou wrote:
>> On 2019/10/12 6:59 下午, Eric Dumazet wrote:
>>> On 10/12/19 12:16 AM, Zhiyuan Hou wrote:
>>>> In act_mirred's ingress redirection, if the skb's dst_entry is valid
>>>> when call function netif_receive_skb, the fllowing l3 stack process
>>>> (ip_rcv_finish_core) will check dst_entry and skip the routing
>>>> decision. Using the old dst_entry is unexpected and may discard the
>>>> skb in some case. For example dst->dst_input points to dst_discard.
>>>>
>>>> This patch drops the skb's dst_entry before calling netif_receive_skb
>>>> so that the skb can be made routing decision like a normal ingress
>>>> skb.
>>>>
>>>> Signed-off-by: Zhiyuan Hou <zhiyuan2048@linux.alibaba.com>
>>>> ---
>>>>    net/sched/act_mirred.c | 5 ++++-
>>>>    1 file changed, 4 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/net/sched/act_mirred.c b/net/sched/act_mirred.c
>>>> index 9ce073a05414..6108a64c0cd5 100644
>>>> --- a/net/sched/act_mirred.c
>>>> +++ b/net/sched/act_mirred.c
>>>> @@ -18,6 +18,7 @@
>>>>    #include <linux/gfp.h>
>>>>    #include <linux/if_arp.h>
>>>>    #include <net/net_namespace.h>
>>>> +#include <net/dst.h>
>>>>    #include <net/netlink.h>
>>>>    #include <net/pkt_sched.h>
>>>>    #include <net/pkt_cls.h>
>>>> @@ -298,8 +299,10 @@ static int tcf_mirred_act(struct sk_buff *skb, const struct tc_action *a,
>>>>          if (!want_ingress)
>>>>            err = dev_queue_xmit(skb2);
>>>> -    else
>>>> +    else {
>>>> +        skb_dst_drop(skb2);
>>>>            err = netif_receive_skb(skb2);
>>>> +    }
>>>>          if (err) {
>>>>    out:
>>>>
>>> Why is dst_discard used ?
>> When send a skb from local to external, the dst->dst_input will be
>> assigned dst_discard after routing decision. So if we redirect these
>> skbs to ingress stack, it will be dropped.
>>
>> For ipvlan l2 mode or macvlan, clsact egress filters on master deivce
>> may also meet these skbs even if they came from slave device. Ingress
>> redirection on these skbs may drop them on l3 stack.
> Can you please add a test, so that we can see what you are trying to do exactly ?
Sure. Suppose a linux box has two interfaces (eth0 and eth1). We
create a vrf (vrf0) and put eth1 in it, as following commands:

   # ip link add vrf0 type vrf table 10
   # ip link set dev vrf0 up
   # ip link set eth1 master vrf0

Then let's intercept some egress flows through eth0 and redirect them
to vrf0 using act_mirred.

   # ip route add table 10 ...   # add routes to vrf0
   # tc qdisc add dev eth0 clsact
   # tc filter add dev eth0  egress proto ip u32 \
        match ip dst 10.0.0.0/24 \
        action mirred ingress redirect dev eth1

We expect that the matching skb will be received or forwarded via
vrf0's route table. But the fact is, the skb is dropped as commit log
notes.
>
>
>
>
>>> This could actually drop packets, for loopback.
>>>
>>> A Fixes: tag would tremendously help, I wonder if you are not working around
>>> the other issue Wei was tracking yesterday ( https://www.spinics.net/lists/netdev/msg604397.html )
>> No, this is a different issue ^_^.
> Please add a Fixes: tag then.
Yes, I will add following tag in v2.
Fixes: 53592b364001 ("net/sched: act_mirred: Implement ingress actions")
>
> Thanks.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH net] net: sched: act_mirred: drop skb's dst_entry in ingress redirection
  2019-10-16 12:13     ` Eyal Birger
@ 2019-10-17 16:33       ` Zhiyuan Hou
  2019-10-18 21:25         ` Eyal Birger
  0 siblings, 1 reply; 15+ messages in thread
From: Zhiyuan Hou @ 2019-10-17 16:33 UTC (permalink / raw)
  To: Eyal Birger
  Cc: Cong Wang, Jamal Hadi Salim, Jiri Pirko, David S . Miller,
	Linux Kernel Network Developers, LKML, shmulik.ladkani


On 2019/10/16 8:13 下午, Eyal Birger wrote:
> Hi,
>
> On Wed, 16 Oct 2019 01:22:01 +0800
> Zhiyuan Hou <zhiyuan2048@linux.alibaba.com> wrote:
>
>> On 2019/10/15 1:57 上午, Cong Wang wrote:
>>> On Sat, Oct 12, 2019 at 12:16 AM Zhiyuan Hou
>>> <zhiyuan2048@linux.alibaba.com> wrote:
>>>> diff --git a/net/sched/act_mirred.c b/net/sched/act_mirred.c
>>>> index 9ce073a05414..6108a64c0cd5 100644
>>>> --- a/net/sched/act_mirred.c
>>>> +++ b/net/sched/act_mirred.c
>>>> @@ -18,6 +18,7 @@
>>>>    #include <linux/gfp.h>
>>>>    #include <linux/if_arp.h>
>>>>    #include <net/net_namespace.h>
>>>> +#include <net/dst.h>
>>>>    #include <net/netlink.h>
>>>>    #include <net/pkt_sched.h>
>>>>    #include <net/pkt_cls.h>
>>>> @@ -298,8 +299,10 @@ static int tcf_mirred_act(struct sk_buff
>>>> *skb, const struct tc_action *a,
>>>>
>>>>           if (!want_ingress)
>>>>                   err = dev_queue_xmit(skb2);
>>>> -       else
>>>> +       else {
>>>> +               skb_dst_drop(skb2);
>>>>                   err = netif_receive_skb(skb2);
>>>> +       }
>>> Good catch!
> Indeed! Thanks for fixing this!
>
>>> I don't want to be picky, but it seems this is only needed
>>> when redirecting from egress to ingress, right? That is,
>>> ingress to ingress, or ingress to egress is okay? If not,
>>> please fix all the cases while you are on it?
>> Sure. But I think this patch is also needed when redirecting from
>> ingress to ingress. Because we cannot assure that a skb has null dst
>> in ingress redirection path. For example, if redirecting a skb from
>> loopback's ingress to other device's ingress, the skb will take a
>> dst.
>>
>> As commit logs point out, skb with valid dst cannot be made routing
>> decision in following process. original dst may cause skb loss or
>> other unexpected behavior.
> On the other hand, removing the dst on ingress-to-ingress redirection
> may remove LWT information on incoming packets, which may be undesired.
Sorry, I do not understand why lwt information is needed on
ingress-to-ingress redirection. lwt is used on output path, isn't it?
Can you please give more information?
> Eyal.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH net] net: sched: act_mirred: drop skb's dst_entry in ingress redirection
  2019-10-17 16:33       ` Zhiyuan Hou
@ 2019-10-18 21:25         ` Eyal Birger
  2019-10-21 13:06           ` Zhiyuan Hou
  2019-10-21 20:50           ` Cong Wang
  0 siblings, 2 replies; 15+ messages in thread
From: Eyal Birger @ 2019-10-18 21:25 UTC (permalink / raw)
  To: Zhiyuan Hou
  Cc: Cong Wang, Jamal Hadi Salim, Jiri Pirko, David S . Miller,
	Linux Kernel Network Developers, LKML, shmulik.ladkani

Hi,

On Fri, 18 Oct 2019 00:33:53 +0800
Zhiyuan Hou <zhiyuan2048@linux.alibaba.com> wrote:

> On 2019/10/16 8:13 下午, Eyal Birger wrote:
> > Hi,
> >
> > On Wed, 16 Oct 2019 01:22:01 +0800
> > Zhiyuan Hou <zhiyuan2048@linux.alibaba.com> wrote:
> >  
> >> On 2019/10/15 1:57 上午, Cong Wang wrote:  
> >>> On Sat, Oct 12, 2019 at 12:16 AM Zhiyuan Hou
> >>> <zhiyuan2048@linux.alibaba.com> wrote:  
> >>>> diff --git a/net/sched/act_mirred.c b/net/sched/act_mirred.c
> >>>> index 9ce073a05414..6108a64c0cd5 100644
> >>>> --- a/net/sched/act_mirred.c
> >>>> +++ b/net/sched/act_mirred.c
> >>>> @@ -18,6 +18,7 @@
> >>>>    #include <linux/gfp.h>
> >>>>    #include <linux/if_arp.h>
> >>>>    #include <net/net_namespace.h>
> >>>> +#include <net/dst.h>
> >>>>    #include <net/netlink.h>
> >>>>    #include <net/pkt_sched.h>
> >>>>    #include <net/pkt_cls.h>
> >>>> @@ -298,8 +299,10 @@ static int tcf_mirred_act(struct sk_buff
> >>>> *skb, const struct tc_action *a,
> >>>>
> >>>>           if (!want_ingress)
> >>>>                   err = dev_queue_xmit(skb2);
> >>>> -       else
> >>>> +       else {
> >>>> +               skb_dst_drop(skb2);
> >>>>                   err = netif_receive_skb(skb2);
> >>>> +       }  
> >>> Good catch!  
> > Indeed! Thanks for fixing this!
> >  
> >>> I don't want to be picky, but it seems this is only needed
> >>> when redirecting from egress to ingress, right? That is,
> >>> ingress to ingress, or ingress to egress is okay? If not,
> >>> please fix all the cases while you are on it?  
> >> Sure. But I think this patch is also needed when redirecting from
> >> ingress to ingress. Because we cannot assure that a skb has null
> >> dst in ingress redirection path. For example, if redirecting a skb
> >> from loopback's ingress to other device's ingress, the skb will
> >> take a dst.
> >>
> >> As commit logs point out, skb with valid dst cannot be made routing
> >> decision in following process. original dst may cause skb loss or
> >> other unexpected behavior.  
> > On the other hand, removing the dst on ingress-to-ingress
> > redirection may remove LWT information on incoming packets, which
> > may be undesired.  
> Sorry, I do not understand why lwt information is needed on
> ingress-to-ingress redirection. lwt is used on output path, isn't it?
> Can you please give more information?

On rx path tunnelled packets parameters received on a collect_md tunnel device
are kept in a metadata dst. See ip_tunnel_rcv() 'tun_dst' parameter.

The rx metadata dst can be matched by a number of mechanisms like routing
rules, eBPF, OVS, and netfilter.

Eyal.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH net] net: sched: act_mirred: drop skb's dst_entry in ingress redirection
  2019-10-18 21:25         ` Eyal Birger
@ 2019-10-21 13:06           ` Zhiyuan Hou
  2019-10-21 20:50           ` Cong Wang
  1 sibling, 0 replies; 15+ messages in thread
From: Zhiyuan Hou @ 2019-10-21 13:06 UTC (permalink / raw)
  To: Eyal Birger
  Cc: Cong Wang, Jamal Hadi Salim, Jiri Pirko, David S . Miller,
	Linux Kernel Network Developers, LKML, shmulik.ladkani


On 2019/10/19 5:25 上午, Eyal Birger wrote:
> Hi,
>
> On Fri, 18 Oct 2019 00:33:53 +0800
> Zhiyuan Hou <zhiyuan2048@linux.alibaba.com> wrote:
>
>> On 2019/10/16 8:13 下午, Eyal Birger wrote:
>>> Hi,
>>>
>>> On Wed, 16 Oct 2019 01:22:01 +0800
>>> Zhiyuan Hou <zhiyuan2048@linux.alibaba.com> wrote:
>>>   
>>>> On 2019/10/15 1:57 上午, Cong Wang wrote:
>>>>> On Sat, Oct 12, 2019 at 12:16 AM Zhiyuan Hou
>>>>> <zhiyuan2048@linux.alibaba.com> wrote:
>>>>>> diff --git a/net/sched/act_mirred.c b/net/sched/act_mirred.c
>>>>>> index 9ce073a05414..6108a64c0cd5 100644
>>>>>> --- a/net/sched/act_mirred.c
>>>>>> +++ b/net/sched/act_mirred.c
>>>>>> @@ -18,6 +18,7 @@
>>>>>>     #include <linux/gfp.h>
>>>>>>     #include <linux/if_arp.h>
>>>>>>     #include <net/net_namespace.h>
>>>>>> +#include <net/dst.h>
>>>>>>     #include <net/netlink.h>
>>>>>>     #include <net/pkt_sched.h>
>>>>>>     #include <net/pkt_cls.h>
>>>>>> @@ -298,8 +299,10 @@ static int tcf_mirred_act(struct sk_buff
>>>>>> *skb, const struct tc_action *a,
>>>>>>
>>>>>>            if (!want_ingress)
>>>>>>                    err = dev_queue_xmit(skb2);
>>>>>> -       else
>>>>>> +       else {
>>>>>> +               skb_dst_drop(skb2);
>>>>>>                    err = netif_receive_skb(skb2);
>>>>>> +       }
>>>>> Good catch!
>>> Indeed! Thanks for fixing this!
>>>   
>>>>> I don't want to be picky, but it seems this is only needed
>>>>> when redirecting from egress to ingress, right? That is,
>>>>> ingress to ingress, or ingress to egress is okay? If not,
>>>>> please fix all the cases while you are on it?
>>>> Sure. But I think this patch is also needed when redirecting from
>>>> ingress to ingress. Because we cannot assure that a skb has null
>>>> dst in ingress redirection path. For example, if redirecting a skb
>>>> from loopback's ingress to other device's ingress, the skb will
>>>> take a dst.
>>>>
>>>> As commit logs point out, skb with valid dst cannot be made routing
>>>> decision in following process. original dst may cause skb loss or
>>>> other unexpected behavior.
>>> On the other hand, removing the dst on ingress-to-ingress
>>> redirection may remove LWT information on incoming packets, which
>>> may be undesired.
>> Sorry, I do not understand why lwt information is needed on
>> ingress-to-ingress redirection. lwt is used on output path, isn't it?
>> Can you please give more information?
> On rx path tunnelled packets parameters received on a collect_md tunnel device
> are kept in a metadata dst. See ip_tunnel_rcv() 'tun_dst' parameter.
>
> The rx metadata dst can be matched by a number of mechanisms like routing
> rules, eBPF, OVS, and netfilter.
Yes, you are right. Thanks for your explanations.

The metadata dst should not be removed in redirection path and also
does not affect L3's routing decision.

Maybe we can add a following check to solve it before removing a dst,
what do you think?

   if (skb_valid_dst(skb2))
       skb_dst_drop(sbk2);
>
> Eyal.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH net] net: sched: act_mirred: drop skb's dst_entry in ingress redirection
  2019-10-18 21:25         ` Eyal Birger
  2019-10-21 13:06           ` Zhiyuan Hou
@ 2019-10-21 20:50           ` Cong Wang
  2019-10-22 10:37             ` Eyal Birger
  1 sibling, 1 reply; 15+ messages in thread
From: Cong Wang @ 2019-10-21 20:50 UTC (permalink / raw)
  To: Eyal Birger
  Cc: Zhiyuan Hou, Jamal Hadi Salim, Jiri Pirko, David S . Miller,
	Linux Kernel Network Developers, LKML, Shmulik Ladkani

On Fri, Oct 18, 2019 at 2:25 PM Eyal Birger <eyal.birger@gmail.com> wrote:
>
> Hi,
>
> On Fri, 18 Oct 2019 00:33:53 +0800
> Zhiyuan Hou <zhiyuan2048@linux.alibaba.com> wrote:
>
> > On 2019/10/16 8:13 下午, Eyal Birger wrote:
> > > Hi,
> > >
> > > On Wed, 16 Oct 2019 01:22:01 +0800
> > > Zhiyuan Hou <zhiyuan2048@linux.alibaba.com> wrote:
> > >
> > >> On 2019/10/15 1:57 上午, Cong Wang wrote:
> > >>> On Sat, Oct 12, 2019 at 12:16 AM Zhiyuan Hou
> > >>> <zhiyuan2048@linux.alibaba.com> wrote:
> > >>>> diff --git a/net/sched/act_mirred.c b/net/sched/act_mirred.c
> > >>>> index 9ce073a05414..6108a64c0cd5 100644
> > >>>> --- a/net/sched/act_mirred.c
> > >>>> +++ b/net/sched/act_mirred.c
> > >>>> @@ -18,6 +18,7 @@
> > >>>>    #include <linux/gfp.h>
> > >>>>    #include <linux/if_arp.h>
> > >>>>    #include <net/net_namespace.h>
> > >>>> +#include <net/dst.h>
> > >>>>    #include <net/netlink.h>
> > >>>>    #include <net/pkt_sched.h>
> > >>>>    #include <net/pkt_cls.h>
> > >>>> @@ -298,8 +299,10 @@ static int tcf_mirred_act(struct sk_buff
> > >>>> *skb, const struct tc_action *a,
> > >>>>
> > >>>>           if (!want_ingress)
> > >>>>                   err = dev_queue_xmit(skb2);
> > >>>> -       else
> > >>>> +       else {
> > >>>> +               skb_dst_drop(skb2);
> > >>>>                   err = netif_receive_skb(skb2);
> > >>>> +       }
> > >>> Good catch!
> > > Indeed! Thanks for fixing this!
> > >
> > >>> I don't want to be picky, but it seems this is only needed
> > >>> when redirecting from egress to ingress, right? That is,
> > >>> ingress to ingress, or ingress to egress is okay? If not,
> > >>> please fix all the cases while you are on it?
> > >> Sure. But I think this patch is also needed when redirecting from
> > >> ingress to ingress. Because we cannot assure that a skb has null
> > >> dst in ingress redirection path. For example, if redirecting a skb
> > >> from loopback's ingress to other device's ingress, the skb will
> > >> take a dst.
> > >>
> > >> As commit logs point out, skb with valid dst cannot be made routing
> > >> decision in following process. original dst may cause skb loss or
> > >> other unexpected behavior.
> > > On the other hand, removing the dst on ingress-to-ingress
> > > redirection may remove LWT information on incoming packets, which
> > > may be undesired.
> > Sorry, I do not understand why lwt information is needed on
> > ingress-to-ingress redirection. lwt is used on output path, isn't it?
> > Can you please give more information?
>
> On rx path tunnelled packets parameters received on a collect_md tunnel device
> are kept in a metadata dst. See ip_tunnel_rcv() 'tun_dst' parameter.
>
> The rx metadata dst can be matched by a number of mechanisms like routing
> rules, eBPF, OVS, and netfilter.

Should this meta information be kept when redirecting? The dest device
may be a non-tunnel device, so I don't know if it is still useful when
for non-tunnel devices.

Thanks.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH net] net: sched: act_mirred: drop skb's dst_entry in ingress redirection
  2019-10-21 20:50           ` Cong Wang
@ 2019-10-22 10:37             ` Eyal Birger
  0 siblings, 0 replies; 15+ messages in thread
From: Eyal Birger @ 2019-10-22 10:37 UTC (permalink / raw)
  To: Cong Wang
  Cc: Zhiyuan Hou, Jamal Hadi Salim, Jiri Pirko, David S . Miller,
	Linux Kernel Network Developers, LKML, Shmulik Ladkani

Hi,

On Mon, 21 Oct 2019 13:50:13 -0700
Cong Wang <xiyou.wangcong@gmail.com> wrote:

> On Fri, Oct 18, 2019 at 2:25 PM Eyal Birger <eyal.birger@gmail.com>
> wrote:
> >
> > Hi,
> >
> > On Fri, 18 Oct 2019 00:33:53 +0800
> > Zhiyuan Hou <zhiyuan2048@linux.alibaba.com> wrote:
> >
> > > On 2019/10/16 8:13 下午, Eyal Birger wrote:
> > > > Hi,
> > > >
> > > > On Wed, 16 Oct 2019 01:22:01 +0800
> > > > Zhiyuan Hou <zhiyuan2048@linux.alibaba.com> wrote:
> > > >
> > > >> On 2019/10/15 1:57 上午, Cong Wang wrote:
> > > >>> On Sat, Oct 12, 2019 at 12:16 AM Zhiyuan Hou
> > > >>> <zhiyuan2048@linux.alibaba.com> wrote:
> > > >>>> diff --git a/net/sched/act_mirred.c b/net/sched/act_mirred.c
> > > >>>> index 9ce073a05414..6108a64c0cd5 100644
> > > >>>> --- a/net/sched/act_mirred.c
> > > >>>> +++ b/net/sched/act_mirred.c
> > > >>>> @@ -18,6 +18,7 @@
> > > >>>>    #include <linux/gfp.h>
> > > >>>>    #include <linux/if_arp.h>
> > > >>>>    #include <net/net_namespace.h>
> > > >>>> +#include <net/dst.h>
> > > >>>>    #include <net/netlink.h>
> > > >>>>    #include <net/pkt_sched.h>
> > > >>>>    #include <net/pkt_cls.h>
> > > >>>> @@ -298,8 +299,10 @@ static int tcf_mirred_act(struct sk_buff
> > > >>>> *skb, const struct tc_action *a,
> > > >>>>
> > > >>>>           if (!want_ingress)
> > > >>>>                   err = dev_queue_xmit(skb2);
> > > >>>> -       else
> > > >>>> +       else {
> > > >>>> +               skb_dst_drop(skb2);
> > > >>>>                   err = netif_receive_skb(skb2);
> > > >>>> +       }
> > > >>> Good catch!
> > > > Indeed! Thanks for fixing this!
> > > >
> > > >>> I don't want to be picky, but it seems this is only needed
> > > >>> when redirecting from egress to ingress, right? That is,
> > > >>> ingress to ingress, or ingress to egress is okay? If not,
> > > >>> please fix all the cases while you are on it?
> > > >> Sure. But I think this patch is also needed when redirecting
> > > >> from ingress to ingress. Because we cannot assure that a skb
> > > >> has null dst in ingress redirection path. For example, if
> > > >> redirecting a skb from loopback's ingress to other device's
> > > >> ingress, the skb will take a dst.
> > > >>
> > > >> As commit logs point out, skb with valid dst cannot be made
> > > >> routing decision in following process. original dst may cause
> > > >> skb loss or other unexpected behavior.
> > > > On the other hand, removing the dst on ingress-to-ingress
> > > > redirection may remove LWT information on incoming packets,
> > > > which may be undesired.
> > > Sorry, I do not understand why lwt information is needed on
> > > ingress-to-ingress redirection. lwt is used on output path, isn't
> > > it? Can you please give more information?
> >
> > On rx path tunnelled packets parameters received on a collect_md
> > tunnel device are kept in a metadata dst. See ip_tunnel_rcv()
> > 'tun_dst' parameter.
> >
> > The rx metadata dst can be matched by a number of mechanisms like
> > routing rules, eBPF, OVS, and netfilter.
> 
> Should this meta information be kept when redirecting? The dest device
> may be a non-tunnel device, so I don't know if it is still useful when
> for non-tunnel devices.

I think that on ingress-to-ingress redirect it would make sense to keep the
metadata.

The dest device does not have to be a tunnel device AFAICT in order to use
tunnel info as skb_tunnel_info() does not observe skb->dev.

I don't see why going through mirred redirect should prevent the admin from
matching the packet based on LWT metadata - a packet may arrive on a collect_md
tunnel device, be ingress-redirected to different devices based on different
criteria, then routed based also on the tunnel parameters.

Eyal.

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2019-10-22 10:37 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-10-12  7:16 [PATCH net] net: sched: act_mirred: drop skb's dst_entry in ingress redirection Zhiyuan Hou
2019-10-12 10:59 ` Eric Dumazet
2019-10-14  7:07   ` Zhiyuan Hou
2019-10-14 12:46     ` Eric Dumazet
2019-10-17 16:08       ` Zhiyuan Hou
2019-10-12 19:34 ` Sergei Shtylyov
2019-10-14  7:08   ` Zhiyuan Hou
2019-10-14 17:57 ` Cong Wang
2019-10-15 17:22   ` Zhiyuan Hou
2019-10-16 12:13     ` Eyal Birger
2019-10-17 16:33       ` Zhiyuan Hou
2019-10-18 21:25         ` Eyal Birger
2019-10-21 13:06           ` Zhiyuan Hou
2019-10-21 20:50           ` Cong Wang
2019-10-22 10:37             ` Eyal Birger

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).