netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net] openvswitch: Fix upcall OVS_TUNNEL_KEY_ATTR_GENEVE_OPTS
@ 2020-11-04  0:11 Yi-Hung Wei
  2020-11-05 19:47 ` Gregory Rose
  2020-11-07 19:46 ` Jakub Kicinski
  0 siblings, 2 replies; 4+ messages in thread
From: Yi-Hung Wei @ 2020-11-04  0:11 UTC (permalink / raw)
  To: netdev; +Cc: Yi-Hung Wei

TUNNEL_GENEVE_OPT is set on tun_flags in struct sw_flow_key when
a packet is coming from a geneve tunnel no matter the size of geneve
option is zero or not.  On the other hand, TUNNEL_VXLAN_OPT or
TUNNEL_ERSPAN_OPT is set when the VXLAN or ERSPAN option is available.
Currently, ovs kernel module only generates
OVS_TUNNEL_KEY_ATTR_GENEVE_OPTS when the tun_opts_len is non-zero.
As a result, for a geneve packet without tun_metadata, when the packet
is reinserted from userspace after upcall, we miss the TUNNEL_GENEVE_OPT
in the tun_flags on struct sw_flow_key, and that will further cause
megaflow matching issue.

This patch changes the way that we deal with the upcall netlink message
generation to make sure the geneve tun_flags is set consistently
as the packet is firstly received from the geneve tunnel in order to
avoid megaflow matching issue demonstrated by the following flows.
This issue is only observed on ovs kernel datapath.

Consider the following two flows, and the two cases.
* flow1: icmp traffic from gnv0 to gnv1, without any tun_metadata
* flow2: icmp traffic form gnv0 to gnv1 with tun_metadata0

Case 1)
Send flow2 first, and then send flow1.  When both flows are running,
both the following two flows are hit, which is expected.

table=2, priority=200, in_port=gnv0, icmp, ct_state=+trk+est, reg9=0x0/0xff action=output:gnv1
table=2, priority=200, in_port=gnv0, icmp, ct_state=+trk+est, reg9=0x1/0xff action=output:gnv1

Case 2)
Send flow1 first, then send flow2.  When both flows are running,
only the following flow is hit.
table=2, priority=200, in_port=gnv0, icmp, ct_state=+trk+est, reg9=0x0/0xff action=output:gnv1

Example flows)

table=0, arp, actions=NORMAL
table=0, in_port=gnv1, icmp, action=ct(table=1)
table=0, in_port=gnv0, icmp  action=move:NXM_NX_TUN_METADATA0[0..7]->NXM_NX_REG9[0..7], resubmit(,1)
table=1, in_port=gnv1, icmp, action=output:gnv0
table=1, in_port=gnv0, icmp  action=ct(table=2)
table=2, priority=300, in_port=gnv0, icmp, ct_state=+trk+new, action=ct(commit),output:gnv1
table=2, priority=200, in_port=gnv0, icmp, ct_state=+trk+est, reg9=0x0/0xff action=output:gnv1
table=2, priority=200, in_port=gnv0, icmp, ct_state=+trk+est, reg9=0x1/0xff action=output:gnv1

Fixes: fc4099f17240 ("openvswitch: Fix egress tunnel info.")
Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
---
 net/openvswitch/flow_netlink.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/net/openvswitch/flow_netlink.c b/net/openvswitch/flow_netlink.c
index 9d3e50c4d29f..b03ec6a1a1fa 100644
--- a/net/openvswitch/flow_netlink.c
+++ b/net/openvswitch/flow_netlink.c
@@ -912,13 +912,13 @@ static int __ip_tun_to_nlattr(struct sk_buff *skb,
 	if ((output->tun_flags & TUNNEL_OAM) &&
 	    nla_put_flag(skb, OVS_TUNNEL_KEY_ATTR_OAM))
 		return -EMSGSIZE;
-	if (swkey_tun_opts_len) {
-		if (output->tun_flags & TUNNEL_GENEVE_OPT &&
-		    nla_put(skb, OVS_TUNNEL_KEY_ATTR_GENEVE_OPTS,
+	if (output->tun_flags & TUNNEL_GENEVE_OPT) {
+		if (nla_put(skb, OVS_TUNNEL_KEY_ATTR_GENEVE_OPTS,
 			    swkey_tun_opts_len, tun_opts))
 			return -EMSGSIZE;
-		else if (output->tun_flags & TUNNEL_VXLAN_OPT &&
-			 vxlan_opt_to_nlattr(skb, tun_opts, swkey_tun_opts_len))
+	} else if (swkey_tun_opts_len) {
+		if (output->tun_flags & TUNNEL_VXLAN_OPT &&
+		    vxlan_opt_to_nlattr(skb, tun_opts, swkey_tun_opts_len))
 			return -EMSGSIZE;
 		else if (output->tun_flags & TUNNEL_ERSPAN_OPT &&
 			 nla_put(skb, OVS_TUNNEL_KEY_ATTR_ERSPAN_OPTS,
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH net] openvswitch: Fix upcall OVS_TUNNEL_KEY_ATTR_GENEVE_OPTS
  2020-11-04  0:11 [PATCH net] openvswitch: Fix upcall OVS_TUNNEL_KEY_ATTR_GENEVE_OPTS Yi-Hung Wei
@ 2020-11-05 19:47 ` Gregory Rose
  2020-11-07 19:46 ` Jakub Kicinski
  1 sibling, 0 replies; 4+ messages in thread
From: Gregory Rose @ 2020-11-05 19:47 UTC (permalink / raw)
  To: Yi-Hung Wei, netdev


On 11/3/2020 4:11 PM, Yi-Hung Wei wrote:
> TUNNEL_GENEVE_OPT is set on tun_flags in struct sw_flow_key when
> a packet is coming from a geneve tunnel no matter the size of geneve
> option is zero or not.  On the other hand, TUNNEL_VXLAN_OPT or
> TUNNEL_ERSPAN_OPT is set when the VXLAN or ERSPAN option is available.
> Currently, ovs kernel module only generates
> OVS_TUNNEL_KEY_ATTR_GENEVE_OPTS when the tun_opts_len is non-zero.
> As a result, for a geneve packet without tun_metadata, when the packet
> is reinserted from userspace after upcall, we miss the TUNNEL_GENEVE_OPT
> in the tun_flags on struct sw_flow_key, and that will further cause
> megaflow matching issue.
> 
> This patch changes the way that we deal with the upcall netlink message
> generation to make sure the geneve tun_flags is set consistently
> as the packet is firstly received from the geneve tunnel in order to
> avoid megaflow matching issue demonstrated by the following flows.
> This issue is only observed on ovs kernel datapath.
> 
> Consider the following two flows, and the two cases.
> * flow1: icmp traffic from gnv0 to gnv1, without any tun_metadata
> * flow2: icmp traffic form gnv0 to gnv1 with tun_metadata0
> 
> Case 1)
> Send flow2 first, and then send flow1.  When both flows are running,
> both the following two flows are hit, which is expected.
> 
> table=2, priority=200, in_port=gnv0, icmp, ct_state=+trk+est, reg9=0x0/0xff action=output:gnv1
> table=2, priority=200, in_port=gnv0, icmp, ct_state=+trk+est, reg9=0x1/0xff action=output:gnv1
> 
> Case 2)
> Send flow1 first, then send flow2.  When both flows are running,
> only the following flow is hit.
> table=2, priority=200, in_port=gnv0, icmp, ct_state=+trk+est, reg9=0x0/0xff action=output:gnv1
> 
> Example flows)
> 
> table=0, arp, actions=NORMAL
> table=0, in_port=gnv1, icmp, action=ct(table=1)
> table=0, in_port=gnv0, icmp  action=move:NXM_NX_TUN_METADATA0[0..7]->NXM_NX_REG9[0..7], resubmit(,1)
> table=1, in_port=gnv1, icmp, action=output:gnv0
> table=1, in_port=gnv0, icmp  action=ct(table=2)
> table=2, priority=300, in_port=gnv0, icmp, ct_state=+trk+new, action=ct(commit),output:gnv1
> table=2, priority=200, in_port=gnv0, icmp, ct_state=+trk+est, reg9=0x0/0xff action=output:gnv1
> table=2, priority=200, in_port=gnv0, icmp, ct_state=+trk+est, reg9=0x1/0xff action=output:gnv1
> 
> Fixes: fc4099f17240 ("openvswitch: Fix egress tunnel info.")
> Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
> ---
>   net/openvswitch/flow_netlink.c | 10 +++++-----
>   1 file changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/net/openvswitch/flow_netlink.c b/net/openvswitch/flow_netlink.c
> index 9d3e50c4d29f..b03ec6a1a1fa 100644
> --- a/net/openvswitch/flow_netlink.c
> +++ b/net/openvswitch/flow_netlink.c
> @@ -912,13 +912,13 @@ static int __ip_tun_to_nlattr(struct sk_buff *skb,
>   	if ((output->tun_flags & TUNNEL_OAM) &&
>   	    nla_put_flag(skb, OVS_TUNNEL_KEY_ATTR_OAM))
>   		return -EMSGSIZE;
> -	if (swkey_tun_opts_len) {
> -		if (output->tun_flags & TUNNEL_GENEVE_OPT &&
> -		    nla_put(skb, OVS_TUNNEL_KEY_ATTR_GENEVE_OPTS,
> +	if (output->tun_flags & TUNNEL_GENEVE_OPT) {
> +		if (nla_put(skb, OVS_TUNNEL_KEY_ATTR_GENEVE_OPTS,
>   			    swkey_tun_opts_len, tun_opts))
>   			return -EMSGSIZE;
> -		else if (output->tun_flags & TUNNEL_VXLAN_OPT &&
> -			 vxlan_opt_to_nlattr(skb, tun_opts, swkey_tun_opts_len))
> +	} else if (swkey_tun_opts_len) {
> +		if (output->tun_flags & TUNNEL_VXLAN_OPT &&
> +		    vxlan_opt_to_nlattr(skb, tun_opts, swkey_tun_opts_len))
>   			return -EMSGSIZE;
>   		else if (output->tun_flags & TUNNEL_ERSPAN_OPT &&
>   			 nla_put(skb, OVS_TUNNEL_KEY_ATTR_ERSPAN_OPTS,
> 

Applies and builds cleanly.

Passes OVS kernel test suite geneve tunnel tests.

  17: datapath - ping over geneve tunnel              ok
  18: datapath - flow resume with geneve tun_metadata ok
  19: datapath - ping over geneve6 tunnel             ok

Code looks good to me.

Acked-by: Greg Rose <gvrose8192@gmail.com>

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH net] openvswitch: Fix upcall OVS_TUNNEL_KEY_ATTR_GENEVE_OPTS
  2020-11-04  0:11 [PATCH net] openvswitch: Fix upcall OVS_TUNNEL_KEY_ATTR_GENEVE_OPTS Yi-Hung Wei
  2020-11-05 19:47 ` Gregory Rose
@ 2020-11-07 19:46 ` Jakub Kicinski
  2020-11-11  0:26   ` Yi-Hung Wei
  1 sibling, 1 reply; 4+ messages in thread
From: Jakub Kicinski @ 2020-11-07 19:46 UTC (permalink / raw)
  To: Yi-Hung Wei; +Cc: netdev

On Tue,  3 Nov 2020 16:11:34 -0800 Yi-Hung Wei wrote:
> TUNNEL_GENEVE_OPT is set on tun_flags in struct sw_flow_key when
> a packet is coming from a geneve tunnel no matter the size of geneve
> option is zero or not.  On the other hand, TUNNEL_VXLAN_OPT or
> TUNNEL_ERSPAN_OPT is set when the VXLAN or ERSPAN option is available.
> Currently, ovs kernel module only generates
> OVS_TUNNEL_KEY_ATTR_GENEVE_OPTS when the tun_opts_len is non-zero.
> As a result, for a geneve packet without tun_metadata, when the packet
> is reinserted from userspace after upcall, we miss the TUNNEL_GENEVE_OPT
> in the tun_flags on struct sw_flow_key, and that will further cause
> megaflow matching issue.
> 
> This patch changes the way that we deal with the upcall netlink message
> generation to make sure the geneve tun_flags is set consistently
> as the packet is firstly received from the geneve tunnel in order to
> avoid megaflow matching issue demonstrated by the following flows.
> This issue is only observed on ovs kernel datapath.
> 
> Consider the following two flows, and the two cases.
> * flow1: icmp traffic from gnv0 to gnv1, without any tun_metadata
> * flow2: icmp traffic form gnv0 to gnv1 with tun_metadata0
> 
> Case 1)
> Send flow2 first, and then send flow1.  When both flows are running,
> both the following two flows are hit, which is expected.
> 
> table=2, priority=200, in_port=gnv0, icmp, ct_state=+trk+est, reg9=0x0/0xff action=output:gnv1
> table=2, priority=200, in_port=gnv0, icmp, ct_state=+trk+est, reg9=0x1/0xff action=output:gnv1
> 
> Case 2)
> Send flow1 first, then send flow2.  When both flows are running,
> only the following flow is hit.
> table=2, priority=200, in_port=gnv0, icmp, ct_state=+trk+est, reg9=0x0/0xff action=output:gnv1
> 
> Example flows)
> 
> table=0, arp, actions=NORMAL
> table=0, in_port=gnv1, icmp, action=ct(table=1)
> table=0, in_port=gnv0, icmp  action=move:NXM_NX_TUN_METADATA0[0..7]->NXM_NX_REG9[0..7], resubmit(,1)
> table=1, in_port=gnv1, icmp, action=output:gnv0
> table=1, in_port=gnv0, icmp  action=ct(table=2)
> table=2, priority=300, in_port=gnv0, icmp, ct_state=+trk+new, action=ct(commit),output:gnv1
> table=2, priority=200, in_port=gnv0, icmp, ct_state=+trk+est, reg9=0x0/0xff action=output:gnv1
> table=2, priority=200, in_port=gnv0, icmp, ct_state=+trk+est, reg9=0x1/0xff action=output:gnv1
> 
> Fixes: fc4099f17240 ("openvswitch: Fix egress tunnel info.")
> Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>

Wouldn't it make more sense to make GENEVE behave like the other
tunnels rather than hack this logic into consumer of the flag?

Please make sure that you CC authors of the patch you're fixing 
and the maintainers of the code you're changing.

./scripts/get_maintainer.pl is your friend.

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH net] openvswitch: Fix upcall OVS_TUNNEL_KEY_ATTR_GENEVE_OPTS
  2020-11-07 19:46 ` Jakub Kicinski
@ 2020-11-11  0:26   ` Yi-Hung Wei
  0 siblings, 0 replies; 4+ messages in thread
From: Yi-Hung Wei @ 2020-11-11  0:26 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: Linux Kernel Network Developers

On Sat, Nov 7, 2020 at 11:46 AM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Tue,  3 Nov 2020 16:11:34 -0800 Yi-Hung Wei wrote:
> > TUNNEL_GENEVE_OPT is set on tun_flags in struct sw_flow_key when
> > a packet is coming from a geneve tunnel no matter the size of geneve
> > option is zero or not.  On the other hand, TUNNEL_VXLAN_OPT or
> > TUNNEL_ERSPAN_OPT is set when the VXLAN or ERSPAN option is available.
> > Currently, ovs kernel module only generates
> > OVS_TUNNEL_KEY_ATTR_GENEVE_OPTS when the tun_opts_len is non-zero.
> > As a result, for a geneve packet without tun_metadata, when the packet
> > is reinserted from userspace after upcall, we miss the TUNNEL_GENEVE_OPT
> > in the tun_flags on struct sw_flow_key, and that will further cause
> > megaflow matching issue.
> >
> > This patch changes the way that we deal with the upcall netlink message
> > generation to make sure the geneve tun_flags is set consistently
> > as the packet is firstly received from the geneve tunnel in order to
> > avoid megaflow matching issue demonstrated by the following flows.
> > This issue is only observed on ovs kernel datapath.
> >
> > Consider the following two flows, and the two cases.
> > * flow1: icmp traffic from gnv0 to gnv1, without any tun_metadata
> > * flow2: icmp traffic form gnv0 to gnv1 with tun_metadata0
> >
> > Case 1)
> > Send flow2 first, and then send flow1.  When both flows are running,
> > both the following two flows are hit, which is expected.
> >
> > table=2, priority=200, in_port=gnv0, icmp, ct_state=+trk+est, reg9=0x0/0xff action=output:gnv1
> > table=2, priority=200, in_port=gnv0, icmp, ct_state=+trk+est, reg9=0x1/0xff action=output:gnv1
> >
> > Case 2)
> > Send flow1 first, then send flow2.  When both flows are running,
> > only the following flow is hit.
> > table=2, priority=200, in_port=gnv0, icmp, ct_state=+trk+est, reg9=0x0/0xff action=output:gnv1
> >
> > Example flows)
> >
> > table=0, arp, actions=NORMAL
> > table=0, in_port=gnv1, icmp, action=ct(table=1)
> > table=0, in_port=gnv0, icmp  action=move:NXM_NX_TUN_METADATA0[0..7]->NXM_NX_REG9[0..7], resubmit(,1)
> > table=1, in_port=gnv1, icmp, action=output:gnv0
> > table=1, in_port=gnv0, icmp  action=ct(table=2)
> > table=2, priority=300, in_port=gnv0, icmp, ct_state=+trk+new, action=ct(commit),output:gnv1
> > table=2, priority=200, in_port=gnv0, icmp, ct_state=+trk+est, reg9=0x0/0xff action=output:gnv1
> > table=2, priority=200, in_port=gnv0, icmp, ct_state=+trk+est, reg9=0x1/0xff action=output:gnv1
> >
> > Fixes: fc4099f17240 ("openvswitch: Fix egress tunnel info.")
> > Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
>
> Wouldn't it make more sense to make GENEVE behave like the other
> tunnels rather than hack this logic into consumer of the flag?

Thanks Jakub for the review.  Yes, it makes sense to fix on the tunnel
side. I submit another patch to fix it:

https://lore.kernel.org/netdev/1605053800-74072-1-git-send-email-yihung.wei@gmail.com/T/#u

> Please make sure that you CC authors of the patch you're fixing
> and the maintainers of the code you're changing.
>
> ./scripts/get_maintainer.pl is your friend.

Thanks for reminding me.  Will do that from now on.


-Yi-Hung

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2020-11-11  0:26 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-11-04  0:11 [PATCH net] openvswitch: Fix upcall OVS_TUNNEL_KEY_ATTR_GENEVE_OPTS Yi-Hung Wei
2020-11-05 19:47 ` Gregory Rose
2020-11-07 19:46 ` Jakub Kicinski
2020-11-11  0:26   ` Yi-Hung Wei

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).