netdev.vger.kernel.org archive mirror
* Re: tun: mark small packets as owned by the tap sock
       [not found] <git-mailbomb-linux-master-4b663366246be1d1d4b1b8b01245b2e88ad9e706@kernel.org>
@ 2019-08-12 22:19 ` Dave Jones
  2019-08-13  7:58   ` Jack Wang
  2019-08-13  8:33   ` Jason Wang
  0 siblings, 2 replies; 5+ messages in thread
From: Dave Jones @ 2019-08-12 22:19 UTC (permalink / raw)
  To: Alexis Bauvin; +Cc: netdev

On Wed, Aug 07, 2019 at 12:30:07AM +0000, Linux Kernel wrote:
 > Commit:     4b663366246be1d1d4b1b8b01245b2e88ad9e706
 > Parent:     16b2084a8afa1432d14ba72b7c97d7908e178178
 > Web:        https://git.kernel.org/torvalds/c/4b663366246be1d1d4b1b8b01245b2e88ad9e706
 > Author:     Alexis Bauvin <abauvin@scaleway.com>
 > AuthorDate: Tue Jul 23 16:23:01 2019 +0200
 > 
 >     tun: mark small packets as owned by the tap sock
 >     
 >     - v1 -> v2: Move skb_set_owner_w to __tun_build_skb to reduce patch size
 >     
 >     Small packets going out of a tap device go through an optimized code
 >     path that uses build_skb() rather than sock_alloc_send_pskb(). The
 >     latter calls skb_set_owner_w(), but the small packet code path does not.
 >     
 >     The net effect is that small packets are not owned by the userland
 >     application's socket (e.g. QEMU), while large packets are.
 >     This can be seen with a TCP session, where packets are not owned when
 >     the window size is small enough (around PAGE_SIZE), while they are once
 >     the window grows (note that this requires the host to support virtio
 >     tso for the guest to offload segmentation).
 >     All this leads to inconsistent behaviour in the kernel, especially on
 >     netfilter modules that use sk->socket (e.g. xt_owner).
 >     
 >     Fixes: 66ccbc9c87c2 ("tap: use build_skb() for small packet")
 >     Signed-off-by: Alexis Bauvin <abauvin@scaleway.com>
 >     Acked-by: Jason Wang <jasowang@redhat.com>

This commit broke IPv6 routing when I deployed it on a Linode.
It seems to work briefly after boot, and then silently all packets get
dropped. (Presumably, it's dropping RA or ND packets)

With this reverted, everything works as it did in rc3.

	Dave



* Re: tun: mark small packets as owned by the tap sock
  2019-08-12 22:19 ` tun: mark small packets as owned by the tap sock Dave Jones
@ 2019-08-13  7:58   ` Jack Wang
  2019-08-13  8:33   ` Jason Wang
  1 sibling, 0 replies; 5+ messages in thread
From: Jack Wang @ 2019-08-13  7:58 UTC (permalink / raw)
  To: Dave Jones; +Cc: Alexis Bauvin, netdev, stable

Dave Jones <davej@codemonkey.org.uk> wrote on Tue, Aug 13, 2019 at 1:05 AM:
>
> On Wed, Aug 07, 2019 at 12:30:07AM +0000, Linux Kernel wrote:
>  > Commit:     4b663366246be1d1d4b1b8b01245b2e88ad9e706
>  > Parent:     16b2084a8afa1432d14ba72b7c97d7908e178178
>  > Web:        https://git.kernel.org/torvalds/c/4b663366246be1d1d4b1b8b01245b2e88ad9e706
>  > Author:     Alexis Bauvin <abauvin@scaleway.com>
>  > AuthorDate: Tue Jul 23 16:23:01 2019 +0200
>  >
>  >     tun: mark small packets as owned by the tap sock
>  >
>  >     - v1 -> v2: Move skb_set_owner_w to __tun_build_skb to reduce patch size
>  >
>  >     Small packets going out of a tap device go through an optimized code
>  >     path that uses build_skb() rather than sock_alloc_send_pskb(). The
>  >     latter calls skb_set_owner_w(), but the small packet code path does not.
>  >
>  >     The net effect is that small packets are not owned by the userland
>  >     application's socket (e.g. QEMU), while large packets are.
>  >     This can be seen with a TCP session, where packets are not owned when
>  >     the window size is small enough (around PAGE_SIZE), while they are once
>  >     the window grows (note that this requires the host to support virtio
>  >     tso for the guest to offload segmentation).
>  >     All this leads to inconsistent behaviour in the kernel, especially on
>  >     netfilter modules that use sk->socket (e.g. xt_owner).
>  >
>  >     Fixes: 66ccbc9c87c2 ("tap: use build_skb() for small packet")
>  >     Signed-off-by: Alexis Bauvin <abauvin@scaleway.com>
>  >     Acked-by: Jason Wang <jasowang@redhat.com>
>
> This commit broke IPv6 routing when I deployed it on a Linode.
> It seems to work briefly after boot, and then silently all packets get
> dropped. (Presumably, it's dropping RA or ND packets)
>
> With this reverted, everything works as it did in rc3.
>
>         Dave
>
Thanks for reporting, Dave.

+cc stable
I just noticed that the patch has been backported to 4.14, 4.19, and 5.2.

Regards,
Jack Wang


* Re: tun: mark small packets as owned by the tap sock
  2019-08-12 22:19 ` tun: mark small packets as owned by the tap sock Dave Jones
  2019-08-13  7:58   ` Jack Wang
@ 2019-08-13  8:33   ` Jason Wang
  2019-08-13 14:00     ` Dave Jones
  1 sibling, 1 reply; 5+ messages in thread
From: Jason Wang @ 2019-08-13  8:33 UTC (permalink / raw)
  To: Dave Jones, Alexis Bauvin; +Cc: netdev


On 2019/8/13 6:19 AM, Dave Jones wrote:
> On Wed, Aug 07, 2019 at 12:30:07AM +0000, Linux Kernel wrote:
>   > Commit:     4b663366246be1d1d4b1b8b01245b2e88ad9e706
>   > Parent:     16b2084a8afa1432d14ba72b7c97d7908e178178
>   > Web:        https://git.kernel.org/torvalds/c/4b663366246be1d1d4b1b8b01245b2e88ad9e706
>   > Author:     Alexis Bauvin <abauvin@scaleway.com>
>   > AuthorDate: Tue Jul 23 16:23:01 2019 +0200
>   >
>   >     tun: mark small packets as owned by the tap sock
>   >
>   >     - v1 -> v2: Move skb_set_owner_w to __tun_build_skb to reduce patch size
>   >
>   >     Small packets going out of a tap device go through an optimized code
>   >     path that uses build_skb() rather than sock_alloc_send_pskb(). The
>   >     latter calls skb_set_owner_w(), but the small packet code path does not.
>   >
>   >     The net effect is that small packets are not owned by the userland
>   >     application's socket (e.g. QEMU), while large packets are.
>   >     This can be seen with a TCP session, where packets are not owned when
>   >     the window size is small enough (around PAGE_SIZE), while they are once
>   >     the window grows (note that this requires the host to support virtio
>   >     tso for the guest to offload segmentation).
>   >     All this leads to inconsistent behaviour in the kernel, especially on
>   >     netfilter modules that use sk->socket (e.g. xt_owner).
>   >
>   >     Fixes: 66ccbc9c87c2 ("tap: use build_skb() for small packet")
>   >     Signed-off-by: Alexis Bauvin <abauvin@scaleway.com>
>   >     Acked-by: Jason Wang <jasowang@redhat.com>
>
> This commit broke IPv6 routing when I deployed it on a Linode.
> It seems to work briefly after boot, and then silently all packets get
> dropped. (Presumably, it's dropping RA or ND packets)
>
> With this reverted, everything works as it did in rc3.
>
> 	Dave


Hi:

Two questions:

- Are you using XDP for TUN?

- Does it work before 66ccbc9c87c2? If yes, could you show us the result 
of net_dropmonitor?

Thanks



* Re: tun: mark small packets as owned by the tap sock
  2019-08-13  8:33   ` Jason Wang
@ 2019-08-13 14:00     ` Dave Jones
  2019-08-15  5:11       ` Jason Wang
  0 siblings, 1 reply; 5+ messages in thread
From: Dave Jones @ 2019-08-13 14:00 UTC (permalink / raw)
  To: Jason Wang; +Cc: Alexis Bauvin, netdev

On Tue, Aug 13, 2019 at 04:33:59PM +0800, Jason Wang wrote:
 > 
 > On 2019/8/13 6:19 AM, Dave Jones wrote:
 > > On Wed, Aug 07, 2019 at 12:30:07AM +0000, Linux Kernel wrote:
 > >   > Commit:     4b663366246be1d1d4b1b8b01245b2e88ad9e706
 > >   > Parent:     16b2084a8afa1432d14ba72b7c97d7908e178178
 > >   > Web:        https://git.kernel.org/torvalds/c/4b663366246be1d1d4b1b8b01245b2e88ad9e706
 > >   > Author:     Alexis Bauvin <abauvin@scaleway.com>
 > >   > AuthorDate: Tue Jul 23 16:23:01 2019 +0200
 > >   >
 > >   >     tun: mark small packets as owned by the tap sock
 > >   >
 > >   >     - v1 -> v2: Move skb_set_owner_w to __tun_build_skb to reduce patch size
 > >
 > > This commit broke IPv6 routing when I deployed it on a Linode.
 > > It seems to work briefly after boot, and then silently all packets get
 > > dropped. (Presumably, it's dropping RA or ND packets)
 > >
 > > With this reverted, everything works as it did in rc3.
 > >
 > Two questions:
 > 
 > - Are you using XDP for TUN?

Not knowingly.
$ grep XDP .config
# CONFIG_XDP_SOCKETS is not set

What's configured on the hypervisor side I have no idea.

 > - Does it work before 66ccbc9c87c2?

That commit has been around since 4.14-rc1, and at one point this machine ran
whatever was in Debian 9 (4.9). I don't recall it ever not working, so I'd say yes.

I can build a 4.13 if it'll prove something, but it'll take me a while.
(This is my primary MX and it drops email while it's on the broken kernel,
 so I need to plan some time to be around to babysit it.)

 > If yes, could you show us the result of net_dropmonitor?

Where do I get that? It doesn't seem to be packaged for Debian.

	Dave



* Re: tun: mark small packets as owned by the tap sock
  2019-08-13 14:00     ` Dave Jones
@ 2019-08-15  5:11       ` Jason Wang
  0 siblings, 0 replies; 5+ messages in thread
From: Jason Wang @ 2019-08-15  5:11 UTC (permalink / raw)
  To: Dave Jones; +Cc: Alexis Bauvin, netdev


On 2019/8/13 10:00 PM, Dave Jones wrote:
> On Tue, Aug 13, 2019 at 04:33:59PM +0800, Jason Wang wrote:
>   >
>   > On 2019/8/13 6:19 AM, Dave Jones wrote:
>   > > On Wed, Aug 07, 2019 at 12:30:07AM +0000, Linux Kernel wrote:
>   > >   > Commit:     4b663366246be1d1d4b1b8b01245b2e88ad9e706
>   > >   > Parent:     16b2084a8afa1432d14ba72b7c97d7908e178178
>   > >   > Web:        https://git.kernel.org/torvalds/c/4b663366246be1d1d4b1b8b01245b2e88ad9e706
>   > >   > Author:     Alexis Bauvin <abauvin@scaleway.com>
>   > >   > AuthorDate: Tue Jul 23 16:23:01 2019 +0200
>   > >   >
>   > >   >     tun: mark small packets as owned by the tap sock
>   > >   >
>   > >   >     - v1 -> v2: Move skb_set_owner_w to __tun_build_skb to reduce patch size
>   > >
>   > > This commit broke IPv6 routing when I deployed it on a Linode.
>   > > It seems to work briefly after boot, and then silently all packets get
>   > > dropped. (Presumably, it's dropping RA or ND packets)
>   > >
>   > > With this reverted, everything works as it did in rc3.
>   > >
>   > Two questions:
>   >
>   > - Are you using XDP for TUN?
>
> not knowingly.
> $ grep XDP .config
> # CONFIG_XDP_SOCKETS is not set
>
> What's configured on the hypervisor side I have no idea.


OK, please tell me more about your setup:

- Are you using TUN in the host or in the guest?

- Are you using it for a VM or for a VPN (tunneling)?

- Where do the packets get dropped?


>
>   > - Does it work before 66ccbc9c87c2?
>
> that's been around since 4.14-rc1, and at one point it ran whatever was
> in debian9 (4.9).  I don't recall it ever not working, so I'd say yes.
>
> I can build a 4.13 if it'll prove something, but it'll take me a while.
> (This is my primary MX, so it's dropping email while it's on the broken
>   kernel, so I need to plan some time to be around to babysit it)


If possible, please try that.


>
>   > If yes, could you show us the result of net_dropmonitor?
>
> where do I get that?  It doesn't seem packaged for debian.
>
> 	Dave


It's part of perf-script(1). You can start it with
"perf script record net_dropmonitor".
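
For readers following along, here is a minimal sketch of how the net_dropmonitor
workflow might be driven. It assumes perf is installed with its bundled scripts
and Python scripting support, and that it is run with enough privileges to
trace. The helper names dropmon_cmd and dropmon are hypothetical and are not
part of perf; only the three perf command lines come from perf-script(1).

```shell
#!/bin/sh
# Sketch only: thin wrapper around the perf invocations discussed above.
# dropmon_cmd and dropmon are hypothetical helper names, not perf features.

# Print the perf command line for a given step.
dropmon_cmd() {
    case "$1" in
        list)   echo "perf script -l" ;;                      # list the scripts perf knows about
        record) echo "perf script record net_dropmonitor" ;;  # trace drops until interrupted (Ctrl-C)
        report) echo "perf script report net_dropmonitor" ;;  # summarise the recorded perf.data
        *)      return 1 ;;
    esac
}

# Run a step, or print usage when no valid step is given.
dropmon() {
    cmd=$(dropmon_cmd "$1") || { echo "usage: dropmon list|record|report"; return 0; }
    echo "+ $cmd"
    $cmd
}

dropmon "$@"
```

A typical session would be to run the record step while reproducing the
problem, interrupt it, and then run the report step from the same directory so
that it picks up the perf.data file written by the recording.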

Thanks
