netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net-next] net: ipvlan: forward ingress packet to slave's l2 in l3s mode
@ 2019-06-25  6:42 Zhiyuan Hou
  2019-06-26  8:16 ` Paolo Abeni
  0 siblings, 1 reply; 3+ messages in thread
From: Zhiyuan Hou @ 2019-06-25  6:42 UTC (permalink / raw)
  To: zhiyuan2048, davem, idosch, daniel, petrm, jiri, tglx, linmiaohe
  Cc: zhabin, caspar, netdev, linux-kernel

In ipvlan l3s mode,  ingress packet is switched to slave interface and
delivers to l4 stack. This may cause two problems:

  1. When slave is in an ns different from master, the behavior of stack
  in slave ns may cause confusion for users. For example, iptables, tc,
  and other l2/l3 functions are not available for ingress packet.

  2. l3s mode is not used for tap device, and cannot support ipvtap. But
  in VM or container based VM cases, tap device is a very common device.

In l3s mode's input nf_hook, this patch calles the skb_forward_dev() to
forward ingress packet to slave and uses nf_conntrack_confirm() to make
conntrack work with new mode.

Signed-off-by: Zha Bin <zhabin@linux.alibaba.com>
Signed-off-by: Zhiyuan Hou <zhiyuan2048@linux.alibaba.com>
---
 drivers/net/ipvlan/ipvlan.h     |  9 ++++++++-
 drivers/net/ipvlan/ipvlan_l3s.c | 16 ++++++++++++++--
 2 files changed, 22 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ipvlan/ipvlan.h b/drivers/net/ipvlan/ipvlan.h
index 3837c897832e..48c814e24c3f 100644
--- a/drivers/net/ipvlan/ipvlan.h
+++ b/drivers/net/ipvlan/ipvlan.h
@@ -172,6 +172,14 @@ void ipvlan_link_delete(struct net_device *dev, struct list_head *head);
 void ipvlan_link_setup(struct net_device *dev);
 int ipvlan_link_register(struct rtnl_link_ops *ops);
 #ifdef CONFIG_IPVLAN_L3S
+
+#include <net/netfilter/nf_conntrack_core.h>
+
+static inline int ipvlan_confirm_conntrack(struct sk_buff *skb)
+{
+	return nf_conntrack_confirm(skb);
+}
+
 int ipvlan_l3s_register(struct ipvl_port *port);
 void ipvlan_l3s_unregister(struct ipvl_port *port);
 void ipvlan_migrate_l3s_hook(struct net *oldnet, struct net *newnet);
@@ -206,5 +214,4 @@ static inline bool netif_is_ipvlan_port(const struct net_device *dev)
 {
 	return rcu_access_pointer(dev->rx_handler) == ipvlan_handle_frame;
 }
-
 #endif /* __IPVLAN_H */
diff --git a/drivers/net/ipvlan/ipvlan_l3s.c b/drivers/net/ipvlan/ipvlan_l3s.c
index 943d26cbf39f..ed210002f593 100644
--- a/drivers/net/ipvlan/ipvlan_l3s.c
+++ b/drivers/net/ipvlan/ipvlan_l3s.c
@@ -95,14 +95,26 @@ static unsigned int ipvlan_nf_input(void *priv, struct sk_buff *skb,
 {
 	struct ipvl_addr *addr;
 	unsigned int len;
+	int ret = NF_ACCEPT;
+	bool success;
 
 	addr = ipvlan_skb_to_addr(skb, skb->dev);
 	if (!addr)
 		goto out;
 
-	skb->dev = addr->master->dev;
 	len = skb->len + ETH_HLEN;
-	ipvlan_count_rx(addr->master, len, true, false);
+
+	ret = ipvlan_confirm_conntrack(skb);
+	if (ret != NF_ACCEPT) {
+		ipvlan_count_rx(addr->master, len, false, false);
+		goto out;
+	}
+
+	skb_push_rcsum(skb, ETH_HLEN);
+	success = dev_forward_skb(addr->master->dev, skb) == NET_RX_SUCCESS;
+	ipvlan_count_rx(addr->master, len, success, false);
+	return NF_STOLEN;
+
 out:
 	return NF_ACCEPT;
 }
-- 
2.18.0


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH net-next] net: ipvlan: forward ingress packet to slave's l2 in l3s mode
  2019-06-25  6:42 [PATCH net-next] net: ipvlan: forward ingress packet to slave's l2 in l3s mode Zhiyuan Hou
@ 2019-06-26  8:16 ` Paolo Abeni
  2019-06-28  8:59   ` Zhiyuan Hou
  0 siblings, 1 reply; 3+ messages in thread
From: Paolo Abeni @ 2019-06-26  8:16 UTC (permalink / raw)
  To: Zhiyuan Hou, davem, idosch, daniel, petrm, jiri, tglx, linmiaohe
  Cc: zhabin, caspar, netdev, linux-kernel

Hi,

On Tue, 2019-06-25 at 14:42 +0800, Zhiyuan Hou wrote:
> In ipvlan l3s mode,  ingress packet is switched to slave interface and
> delivers to l4 stack. This may cause two problems:
> 
>   1. When slave is in an ns different from master, the behavior of stack
>   in slave ns may cause confusion for users. For example, iptables, tc,
>   and other l2/l3 functions are not available for ingress packet.
> 
>   2. l3s mode is not used for tap device, and cannot support ipvtap. But
>   in VM or container based VM cases, tap device is a very common device.
> 
> In l3s mode's input nf_hook, this patch calles the skb_forward_dev() to
> forward ingress packet to slave and uses nf_conntrack_confirm() to make
> conntrack work with new mode.
> 
> Signed-off-by: Zha Bin <zhabin@linux.alibaba.com>
> Signed-off-by: Zhiyuan Hou <zhiyuan2048@linux.alibaba.com>
> ---
>  drivers/net/ipvlan/ipvlan.h     |  9 ++++++++-
>  drivers/net/ipvlan/ipvlan_l3s.c | 16 ++++++++++++++--
>  2 files changed, 22 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/net/ipvlan/ipvlan.h b/drivers/net/ipvlan/ipvlan.h
> index 3837c897832e..48c814e24c3f 100644
> --- a/drivers/net/ipvlan/ipvlan.h
> +++ b/drivers/net/ipvlan/ipvlan.h
> @@ -172,6 +172,14 @@ void ipvlan_link_delete(struct net_device *dev, struct list_head *head);
>  void ipvlan_link_setup(struct net_device *dev);
>  int ipvlan_link_register(struct rtnl_link_ops *ops);
>  #ifdef CONFIG_IPVLAN_L3S
> +
> +#include <net/netfilter/nf_conntrack_core.h>
> +
> +static inline int ipvlan_confirm_conntrack(struct sk_buff *skb)
> +{
> +	return nf_conntrack_confirm(skb);
> +}
> +
>  int ipvlan_l3s_register(struct ipvl_port *port);
>  void ipvlan_l3s_unregister(struct ipvl_port *port);
>  void ipvlan_migrate_l3s_hook(struct net *oldnet, struct net *newnet);
> @@ -206,5 +214,4 @@ static inline bool netif_is_ipvlan_port(const struct net_device *dev)
>  {
>  	return rcu_access_pointer(dev->rx_handler) == ipvlan_handle_frame;
>  }
> -
>  #endif /* __IPVLAN_H */
> diff --git a/drivers/net/ipvlan/ipvlan_l3s.c b/drivers/net/ipvlan/ipvlan_l3s.c
> index 943d26cbf39f..ed210002f593 100644
> --- a/drivers/net/ipvlan/ipvlan_l3s.c
> +++ b/drivers/net/ipvlan/ipvlan_l3s.c
> @@ -95,14 +95,26 @@ static unsigned int ipvlan_nf_input(void *priv, struct sk_buff *skb,
>  {
>  	struct ipvl_addr *addr;
>  	unsigned int len;
> +	int ret = NF_ACCEPT;
> +	bool success;
>  
>  	addr = ipvlan_skb_to_addr(skb, skb->dev);
>  	if (!addr)
>  		goto out;
>  
> -	skb->dev = addr->master->dev;
>  	len = skb->len + ETH_HLEN;
> -	ipvlan_count_rx(addr->master, len, true, false);
> +
> +	ret = ipvlan_confirm_conntrack(skb);
> +	if (ret != NF_ACCEPT) {
> +		ipvlan_count_rx(addr->master, len, false, false);
> +		goto out;
> +	}
> +
> +	skb_push_rcsum(skb, ETH_HLEN);
> +	success = dev_forward_skb(addr->master->dev, skb) == NET_RX_SUCCESS;

This looks weird to me: if I read the code correctly, the skb will
traverse twice NF_INET_LOCAL_IN, once due to the l3s hooking and
another one due to dev_forward_skb().

Also, tc ingreess, etc will run after the first traversing of
NF_INET_LOCAL_IN.

All in all I think that if full l2 processing is required, a different
mode or a different virtual device should be used.

Cheers,

Paolo


^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH net-next] net: ipvlan: forward ingress packet to slave's l2 in l3s mode
  2019-06-26  8:16 ` Paolo Abeni
@ 2019-06-28  8:59   ` Zhiyuan Hou
  0 siblings, 0 replies; 3+ messages in thread
From: Zhiyuan Hou @ 2019-06-28  8:59 UTC (permalink / raw)
  To: Paolo Abeni, davem, idosch, daniel, petrm, jiri, tglx, linmiaohe
  Cc: zhabin, caspar, netdev, linux-kernel, wei.yang1


在 2019/6/26 下午4:16, Paolo Abeni 写道:
> Hi,
>
> On Tue, 2019-06-25 at 14:42 +0800, Zhiyuan Hou wrote:
>> In ipvlan l3s mode,  ingress packet is switched to slave interface and
>> delivers to l4 stack. This may cause two problems:
>>
>>    1. When slave is in an ns different from master, the behavior of stack
>>    in slave ns may cause confusion for users. For example, iptables, tc,
>>    and other l2/l3 functions are not available for ingress packet.
>>
>>    2. l3s mode is not used for tap device, and cannot support ipvtap. But
>>    in VM or container based VM cases, tap device is a very common device.
>>
>> In l3s mode's input nf_hook, this patch calles the skb_forward_dev() to
>> forward ingress packet to slave and uses nf_conntrack_confirm() to make
>> conntrack work with new mode.
>>
>> Signed-off-by: Zha Bin <zhabin@linux.alibaba.com>
>> Signed-off-by: Zhiyuan Hou <zhiyuan2048@linux.alibaba.com>
>> ---
>>   drivers/net/ipvlan/ipvlan.h     |  9 ++++++++-
>>   drivers/net/ipvlan/ipvlan_l3s.c | 16 ++++++++++++++--
>>   2 files changed, 22 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/net/ipvlan/ipvlan.h b/drivers/net/ipvlan/ipvlan.h
>> index 3837c897832e..48c814e24c3f 100644
>> --- a/drivers/net/ipvlan/ipvlan.h
>> +++ b/drivers/net/ipvlan/ipvlan.h
>> @@ -172,6 +172,14 @@ void ipvlan_link_delete(struct net_device *dev, struct list_head *head);
>>   void ipvlan_link_setup(struct net_device *dev);
>>   int ipvlan_link_register(struct rtnl_link_ops *ops);
>>   #ifdef CONFIG_IPVLAN_L3S
>> +
>> +#include <net/netfilter/nf_conntrack_core.h>
>> +
>> +static inline int ipvlan_confirm_conntrack(struct sk_buff *skb)
>> +{
>> +	return nf_conntrack_confirm(skb);
>> +}
>> +
>>   int ipvlan_l3s_register(struct ipvl_port *port);
>>   void ipvlan_l3s_unregister(struct ipvl_port *port);
>>   void ipvlan_migrate_l3s_hook(struct net *oldnet, struct net *newnet);
>> @@ -206,5 +214,4 @@ static inline bool netif_is_ipvlan_port(const struct net_device *dev)
>>   {
>>   	return rcu_access_pointer(dev->rx_handler) == ipvlan_handle_frame;
>>   }
>> -
>>   #endif /* __IPVLAN_H */
>> diff --git a/drivers/net/ipvlan/ipvlan_l3s.c b/drivers/net/ipvlan/ipvlan_l3s.c
>> index 943d26cbf39f..ed210002f593 100644
>> --- a/drivers/net/ipvlan/ipvlan_l3s.c
>> +++ b/drivers/net/ipvlan/ipvlan_l3s.c
>> @@ -95,14 +95,26 @@ static unsigned int ipvlan_nf_input(void *priv, struct sk_buff *skb,
>>   {
>>   	struct ipvl_addr *addr;
>>   	unsigned int len;
>> +	int ret = NF_ACCEPT;
>> +	bool success;
>>   
>>   	addr = ipvlan_skb_to_addr(skb, skb->dev);
>>   	if (!addr)
>>   		goto out;
>>   
>> -	skb->dev = addr->master->dev;
>>   	len = skb->len + ETH_HLEN;
>> -	ipvlan_count_rx(addr->master, len, true, false);
>> +
>> +	ret = ipvlan_confirm_conntrack(skb);
>> +	if (ret != NF_ACCEPT) {
>> +		ipvlan_count_rx(addr->master, len, false, false);
>> +		goto out;
>> +	}
>> +
>> +	skb_push_rcsum(skb, ETH_HLEN);
>> +	success = dev_forward_skb(addr->master->dev, skb) == NET_RX_SUCCESS;
> This looks weird to me: if I read the code correctly, the skb will
> traverse twice NF_INET_LOCAL_IN, once due to the l3s hooking and
> another one due to dev_forward_skb().
>
> Also, tc ingreess, etc will run after the first traversing of
> NF_INET_LOCAL_IN.

Yes,  but the skb's device has been modified from the master to slave. 
In most use cases of

ipvlan, the master device and slave device are in different namespace 
(ns), so the second

triggered LOCAL_IN is completely isolated from the first triggered 
LOCAL_IN.


When the master device and slave device are in the same ns, the behavior 
of this patch is

similar to that of L2 over L3 tunnel (forwarding from L3 to L2 device).


Since the device has been modified, the second triggered tc-ingress is 
thus different.

>
> All in all I think that if full l2 processing is required, a different
> mode or a different virtual device should be used.

We can implement it in a new mode, but such a way is similar to the 
current ipvlan l3s mode.

Also, ipvlan l3s mode has two problems described in patch's commit log. 
I think that a more

appropriate solution is to modify ipvlan l3s.

> Cheers,
>
> Paolo

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2019-06-28  9:00 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-06-25  6:42 [PATCH net-next] net: ipvlan: forward ingress packet to slave's l2 in l3s mode Zhiyuan Hou
2019-06-26  8:16 ` Paolo Abeni
2019-06-28  8:59   ` Zhiyuan Hou

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).