netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* issues with vxlan RX checksum offload under OVS datapath
@ 2014-01-19 22:05 Or Gerlitz
  2014-01-21 17:30 ` Pravin Shelar
  0 siblings, 1 reply; 7+ messages in thread
From: Or Gerlitz @ 2014-01-19 22:05 UTC (permalink / raw)
  To: Pravin B Shelar, Joseph Gasparakis; +Cc: netdev

Hi Pravin,

While testing the gro udp patches over a setup with openvswitch I noted 
that the RX checksum offload support introduced by Joseph's commit 
0afb166 "vxlan: Add capability of Rx checksum offload for inner packet" 
works fine when you use a setup made of

NIC --> IP stack --> vxlan device --> bridge --> tap

but not when its

NIC --> IP stack --> ovs vxlan port --> OVS DP --> tap

I narrowed it down to the fact the when going the OVS 
pathskb->encapsulation remains true also after the decap is done. 
Basically, this is the original hunk


> @@ -607,7 +607,17 @@ static int vxlan_udp_encap_recv(struct sock *sk, 
> struct sk_buff *skb)
>
>         __skb_tunnel_rx(skb, vxlan->dev);
>         skb_reset_network_header(skb);
> -       skb->ip_summed = CHECKSUM_NONE;
> +
> +       /* If the NIC driver gave us an encapsulated packet with
> +        * CHECKSUM_UNNECESSARY and Rx checksum feature is enabled,
> +        * leave the CHECKSUM_UNNECESSARY, the device checksummed it
> +        * for us. Otherwise force the upper layers to verify it.
> +        */
> +       if (skb->ip_summed != CHECKSUM_UNNECESSARY || 
> !skb->encapsulation ||
> +           !(vxlan->dev->features & NETIF_F_RXCSUM))
> +               skb->ip_summed = CHECKSUM_NONE;
> +
> +       skb->encapsulation = 0;

which was moved by your commits

5cfccc5 vxlan: Add vxlan recv demux.
7ce0475 vxlan: Restructure vxlan receive.

to a code which isn't shared by the vxlan driver and ovs vxlan datapath 
code, and I was thinking lets just move it back to vxlan_udp_encap_recv. 
However, we can't really make the "vxlan->dev->features & 
NETIF_F_RXCSUM" check - since there's no vxlan device for ovs datapath, 
any idea how to overcome this.

Moving this to shared code (while removing the check for 
vxlan->dev->features) made things to work on my setup, but this misses 
one of the original conditions, ideas?

I believe its too late for 3.13 unless there is going to be -rc9, but 
lets first see what's the right fix.

Or.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: issues with vxlan RX checksum offload under OVS datapath
  2014-01-19 22:05 issues with vxlan RX checksum offload under OVS datapath Or Gerlitz
@ 2014-01-21 17:30 ` Pravin Shelar
  2014-01-21 17:37   ` Jesse Gross
  2014-01-21 20:55   ` Or Gerlitz
  0 siblings, 2 replies; 7+ messages in thread
From: Pravin Shelar @ 2014-01-21 17:30 UTC (permalink / raw)
  To: Or Gerlitz; +Cc: Joseph Gasparakis, netdev

On Sun, Jan 19, 2014 at 2:05 PM, Or Gerlitz <ogerlitz@mellanox.com> wrote:
> Hi Pravin,
>
> While testing the gro udp patches over a setup with openvswitch I noted that
> the RX checksum offload support introduced by Joseph's commit 0afb166
> "vxlan: Add capability of Rx checksum offload for inner packet" works fine
> when you use a setup made of
>
> NIC --> IP stack --> vxlan device --> bridge --> tap
>
> but not when its
>
> NIC --> IP stack --> ovs vxlan port --> OVS DP --> tap
>
> I narrowed it down to the fact the when going the OVS pathskb->encapsulation
> remains true also after the decap is done. Basically, this is the original
> hunk
>
>
>> @@ -607,7 +607,17 @@ static int vxlan_udp_encap_recv(struct sock *sk,
>> struct sk_buff *skb)
>>
>>         __skb_tunnel_rx(skb, vxlan->dev);
>>         skb_reset_network_header(skb);
>> -       skb->ip_summed = CHECKSUM_NONE;
>> +
>> +       /* If the NIC driver gave us an encapsulated packet with
>> +        * CHECKSUM_UNNECESSARY and Rx checksum feature is enabled,
>> +        * leave the CHECKSUM_UNNECESSARY, the device checksummed it
>> +        * for us. Otherwise force the upper layers to verify it.
>> +        */
>> +       if (skb->ip_summed != CHECKSUM_UNNECESSARY || !skb->encapsulation
>> ||
>> +           !(vxlan->dev->features & NETIF_F_RXCSUM))
>> +               skb->ip_summed = CHECKSUM_NONE;
>> +
>> +       skb->encapsulation = 0;
>
>
> which was moved by your commits
>
> 5cfccc5 vxlan: Add vxlan recv demux.
> 7ce0475 vxlan: Restructure vxlan receive.
>
> to a code which isn't shared by the vxlan driver and ovs vxlan datapath
> code, and I was thinking lets just move it back to vxlan_udp_encap_recv.
> However, we can't really make the "vxlan->dev->features & NETIF_F_RXCSUM"
> check - since there's no vxlan device for ovs datapath, any idea how to
> overcome this.
>
> Moving this to shared code (while removing the check for
> vxlan->dev->features) made things to work on my setup, but this misses one
> of the original conditions, ideas?
>
I kept csum check in vxlan-device recv path for same reason. As of now
there is no efficient way to get ovs-dev features.
May be we can cache device features in struct datapath from datapath-netdev.

> I believe its too late for 3.13 unless there is going to be -rc9, but lets
> first see what's the right fix.
>
> Or.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: issues with vxlan RX checksum offload under OVS datapath
  2014-01-21 17:30 ` Pravin Shelar
@ 2014-01-21 17:37   ` Jesse Gross
  2014-01-21 20:55   ` Or Gerlitz
  1 sibling, 0 replies; 7+ messages in thread
From: Jesse Gross @ 2014-01-21 17:37 UTC (permalink / raw)
  To: Pravin Shelar; +Cc: Or Gerlitz, Joseph Gasparakis, netdev

On Tue, Jan 21, 2014 at 9:30 AM, Pravin Shelar <pshelar@nicira.com> wrote:
> On Sun, Jan 19, 2014 at 2:05 PM, Or Gerlitz <ogerlitz@mellanox.com> wrote:
>> Hi Pravin,
>>
>> While testing the gro udp patches over a setup with openvswitch I noted that
>> the RX checksum offload support introduced by Joseph's commit 0afb166
>> "vxlan: Add capability of Rx checksum offload for inner packet" works fine
>> when you use a setup made of
>>
>> NIC --> IP stack --> vxlan device --> bridge --> tap
>>
>> but not when its
>>
>> NIC --> IP stack --> ovs vxlan port --> OVS DP --> tap
>>
>> I narrowed it down to the fact the when going the OVS pathskb->encapsulation
>> remains true also after the decap is done. Basically, this is the original
>> hunk
>>
>>
>>> @@ -607,7 +607,17 @@ static int vxlan_udp_encap_recv(struct sock *sk,
>>> struct sk_buff *skb)
>>>
>>>         __skb_tunnel_rx(skb, vxlan->dev);
>>>         skb_reset_network_header(skb);
>>> -       skb->ip_summed = CHECKSUM_NONE;
>>> +
>>> +       /* If the NIC driver gave us an encapsulated packet with
>>> +        * CHECKSUM_UNNECESSARY and Rx checksum feature is enabled,
>>> +        * leave the CHECKSUM_UNNECESSARY, the device checksummed it
>>> +        * for us. Otherwise force the upper layers to verify it.
>>> +        */
>>> +       if (skb->ip_summed != CHECKSUM_UNNECESSARY || !skb->encapsulation
>>> ||
>>> +           !(vxlan->dev->features & NETIF_F_RXCSUM))
>>> +               skb->ip_summed = CHECKSUM_NONE;
>>> +
>>> +       skb->encapsulation = 0;
>>
>>
>> which was moved by your commits
>>
>> 5cfccc5 vxlan: Add vxlan recv demux.
>> 7ce0475 vxlan: Restructure vxlan receive.
>>
>> to a code which isn't shared by the vxlan driver and ovs vxlan datapath
>> code, and I was thinking lets just move it back to vxlan_udp_encap_recv.
>> However, we can't really make the "vxlan->dev->features & NETIF_F_RXCSUM"
>> check - since there's no vxlan device for ovs datapath, any idea how to
>> overcome this.
>>
>> Moving this to shared code (while removing the check for
>> vxlan->dev->features) made things to work on my setup, but this misses one
>> of the original conditions, ideas?
>>
> I kept csum check in vxlan-device recv path for same reason. As of now
> there is no efficient way to get ovs-dev features.
> May be we can cache device features in struct datapath from datapath-netdev.\

I don't think that getting the features from the internal device makes
sense - a packet could be destined to a VM instead.

I think that controlling RX checksum on the device that physically
received it should be enough to be able to check for broken devices.
It's not clear what the benefit to turning it off at the VXLAN layer
is - if you really don't want it for some reason, you can always throw
away the checksum later.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: issues with vxlan RX checksum offload under OVS datapath
  2014-01-21 17:30 ` Pravin Shelar
  2014-01-21 17:37   ` Jesse Gross
@ 2014-01-21 20:55   ` Or Gerlitz
  2014-01-21 21:47     ` Joseph Gasparakis
  1 sibling, 1 reply; 7+ messages in thread
From: Or Gerlitz @ 2014-01-21 20:55 UTC (permalink / raw)
  To: Pravin Shelar; +Cc: Or Gerlitz, Joseph Gasparakis, netdev

On Tue, Jan 21, 2014 at 7:30 PM, Pravin Shelar <pshelar@nicira.com> wrote:
> On Sun, Jan 19, 2014 at 2:05 PM, Or Gerlitz <ogerlitz@mellanox.com> wrote:

>> While testing the gro udp patches over a setup with openvswitch I noted that
>> the RX checksum offload support introduced by Joseph's commit 0afb166
>> "vxlan: Add capability of Rx checksum offload for inner packet" works fine
>> when you use a setup made of
>> NIC --> IP stack --> vxlan device --> bridge --> tap
>> but not when its
>> NIC --> IP stack --> ovs vxlan port --> OVS DP --> tap
>> I narrowed it down to the fact the when going the OVS pathskb->encapsulation
>> remains true also after the decap is done. Basically, this is the original hunk
[...]
>>> +       skb->encapsulation = 0;
[...]

>> Moving this to shared code (while removing the check for
>> vxlan->dev->features) made things to work on my setup, but this misses one
>> of the original conditions, ideas?

> I kept csum check in vxlan-device recv path for same reason. As of now
> there is no efficient way to get ovs-dev features.
> May be we can cache device features in struct datapath from datapath-netdev.

To be a bit more precise/concrete here, do we agree that the both paths must do

   skb->encapsulation = 0;

which is done now only by the non-ovs path

Or.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: issues with vxlan RX checksum offload under OVS datapath
  2014-01-21 21:47     ` Joseph Gasparakis
@ 2014-01-21 21:43       ` Or Gerlitz
  2014-01-21 22:35         ` Joseph Gasparakis
  0 siblings, 1 reply; 7+ messages in thread
From: Or Gerlitz @ 2014-01-21 21:43 UTC (permalink / raw)
  To: Joseph Gasparakis; +Cc: Pravin Shelar, Or Gerlitz, netdev

On Tue, Jan 21, 2014, Joseph Gasparakis <joseph.gasparakis@intel.com> wrote:
> On Tue, 21 Jan 2014, Or Gerlitz wrote:

>> To be a bit more precise/concrete here, do we agree that the both paths must >>    skb->encapsulation = 0;
>> which is done now only by the non-ovs path

> Originally skb->encapsulation had (and still has) the meaning of "does
> this skb have outer *and* (valid) inner headers? If so, it is an
> encapsulated packet".
> So based on this skb->encapsulation should be set as soon an (inner)
> packet gets encapsulated and unset when decapsulation takes place, and
> ideally this should happen in the ovs path too.

I would say critically not ideally... e.g when the packet is
destinat-ed to a VM and goes through tap netdevice plugged to OVS --
without this de-assignment weird things happen after the point in time
where the ovs TX path calls dev_queue_xmit() in
net/openvswitch/vport-netdev.c

> Together with skb->encapsulation the inner headers should be correctly set.

That's interesting point, what might break if this isn't done?

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: issues with vxlan RX checksum offload under OVS datapath
  2014-01-21 20:55   ` Or Gerlitz
@ 2014-01-21 21:47     ` Joseph Gasparakis
  2014-01-21 21:43       ` Or Gerlitz
  0 siblings, 1 reply; 7+ messages in thread
From: Joseph Gasparakis @ 2014-01-21 21:47 UTC (permalink / raw)
  To: Or Gerlitz; +Cc: Pravin Shelar, Or Gerlitz, Joseph Gasparakis, netdev



On Tue, 21 Jan 2014, Or Gerlitz wrote:

> On Tue, Jan 21, 2014 at 7:30 PM, Pravin Shelar <pshelar@nicira.com> wrote:
> > On Sun, Jan 19, 2014 at 2:05 PM, Or Gerlitz <ogerlitz@mellanox.com> wrote:
> 
> >> While testing the gro udp patches over a setup with openvswitch I noted that
> >> the RX checksum offload support introduced by Joseph's commit 0afb166
> >> "vxlan: Add capability of Rx checksum offload for inner packet" works fine
> >> when you use a setup made of
> >> NIC --> IP stack --> vxlan device --> bridge --> tap
> >> but not when its
> >> NIC --> IP stack --> ovs vxlan port --> OVS DP --> tap
> >> I narrowed it down to the fact the when going the OVS pathskb->encapsulation
> >> remains true also after the decap is done. Basically, this is the original hunk
> [...]
> >>> +       skb->encapsulation = 0;
> [...]
> 
> >> Moving this to shared code (while removing the check for
> >> vxlan->dev->features) made things to work on my setup, but this misses one
> >> of the original conditions, ideas?
> 
> > I kept csum check in vxlan-device recv path for same reason. As of now
> > there is no efficient way to get ovs-dev features.
> > May be we can cache device features in struct datapath from datapath-netdev.
> 
> To be a bit more precise/concrete here, do we agree that the both paths must do
> 
>    skb->encapsulation = 0;
> 
> which is done now only by the non-ovs path

Originally skb->encapsulation had (and still has) the meaning of "does 
this skb have outer *and* (valid) inner headers? If so, it is an 
encapsulated packet". 
So based on this skb->encapsulation should be set as soon an (inner) 
packet gets encapsulated and unset when decapsulation takes place, and 
ideally this should happen in the ovs path too. Together with 
skb->encapsulation the inner headers should be correctly set.
 
> 
> Or.
> 

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: issues with vxlan RX checksum offload under OVS datapath
  2014-01-21 21:43       ` Or Gerlitz
@ 2014-01-21 22:35         ` Joseph Gasparakis
  0 siblings, 0 replies; 7+ messages in thread
From: Joseph Gasparakis @ 2014-01-21 22:35 UTC (permalink / raw)
  To: Or Gerlitz; +Cc: Joseph Gasparakis, Pravin Shelar, Or Gerlitz, netdev



On Tue, 21 Jan 2014, Or Gerlitz wrote:

> On Tue, Jan 21, 2014, Joseph Gasparakis <joseph.gasparakis@intel.com> wrote:
> > On Tue, 21 Jan 2014, Or Gerlitz wrote:
> 
> >> To be a bit more precise/concrete here, do we agree that the both paths must >>    skb->encapsulation = 0;
> >> which is done now only by the non-ovs path
> 
> > Originally skb->encapsulation had (and still has) the meaning of "does
> > this skb have outer *and* (valid) inner headers? If so, it is an
> > encapsulated packet".
> > So based on this skb->encapsulation should be set as soon an (inner)
> > packet gets encapsulated and unset when decapsulation takes place, and
> > ideally this should happen in the ovs path too.
> 
> I would say critically not ideally... e.g when the packet is
> destinat-ed to a VM and goes through tap netdevice plugged to OVS --
> without this de-assignment weird things happen after the point in time
> where the ovs TX path calls dev_queue_xmit() in
> net/openvswitch/vport-netdev.c
> 
> > Together with skb->encapsulation the inner headers should be correctly set.
> 
> That's interesting point, what might break if this isn't done?

Well, the inner headers, same as the generic (outer) headers are pointers 
or offsets that are going to be added to the packet data pointer. If for 
any reason we try to access these pointers when they are invalid anything 
can go wrong... The way to indicate if they are valid or not should be 
skb->encapsulation.

> 

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2014-01-21 22:17 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-01-19 22:05 issues with vxlan RX checksum offload under OVS datapath Or Gerlitz
2014-01-21 17:30 ` Pravin Shelar
2014-01-21 17:37   ` Jesse Gross
2014-01-21 20:55   ` Or Gerlitz
2014-01-21 21:47     ` Joseph Gasparakis
2014-01-21 21:43       ` Or Gerlitz
2014-01-21 22:35         ` Joseph Gasparakis

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).