netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net-next] kcm: do not attach sockets if sk_user_data is already used
@ 2018-01-14 11:32 James Chapman
  2018-01-16 10:44 ` Guillaume Nault
  2018-01-16 17:36 ` Tom Herbert
  0 siblings, 2 replies; 16+ messages in thread
From: James Chapman @ 2018-01-14 11:32 UTC (permalink / raw)
  To: netdev; +Cc: tom, James Chapman

SIOCKCMATTACH writes a connected socket's sk_user_data for its own
use. Prevent it doing so if the socket's sk_user_data is already set
since some sockets (e.g. encapsulated sockets) use sk_user_data
internally.

This issue was found by syzbot.

kernel BUG at net/l2tp/l2tp_ppp.c:176!
invalid opcode: 0000 [#1] SMP KASAN
Dumping ftrace buffer:
   (ftrace buffer empty)
Modules linked in:
CPU: 1 PID: 3503 Comm: syzkaller938388 Not tainted 4.15.0-rc7+ #181
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:pppol2tp_sock_to_session net/l2tp/l2tp_ppp.c:176 [inline]
RIP: 0010:pppol2tp_sendmsg+0x512/0x670 net/l2tp/l2tp_ppp.c:304
RSP: 0018:ffff8801d4887438 EFLAGS: 00010293
RAX: ffff8801bfef2180 RBX: ffff8801bff88440 RCX: ffffffff84ffbca2
RDX: 0000000000000000 RSI: ffff8801d4887598 RDI: ffff8801bff88820
RBP: ffff8801d48874a8 R08: 0000000000000000 R09: 1ffff1003a910e17
R10: 0000000000000003 R11: 0000000000000001 R12: ffff8801bfff9bc0
R13: 0000000000000000 R14: 0000000000008000 R15: 0000000000000000
FS:  0000000001194880(0000) GS:ffff8801db300000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000020ea0000 CR3: 00000001bfecf001 CR4: 00000000001606e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 sock_sendmsg_nosec net/socket.c:628 [inline]
 sock_sendmsg+0xca/0x110 net/socket.c:638
 kernel_sendmsg+0x47/0x60 net/socket.c:646
 sock_no_sendpage+0x1cc/0x280 net/core/sock.c:2581
 kernel_sendpage+0xbf/0xe0 net/socket.c:3349
 kcm_write_msgs+0x404/0x1b80 net/kcm/kcmsock.c:646
 kcm_sendmsg+0x148d/0x22d0 net/kcm/kcmsock.c:1035
 sock_sendmsg_nosec net/socket.c:628 [inline]
 sock_sendmsg+0xca/0x110 net/socket.c:638
 ___sys_sendmsg+0x767/0x8b0 net/socket.c:2018
 __sys_sendmsg+0xe5/0x210 net/socket.c:2052
 SYSC_sendmsg net/socket.c:2063 [inline]
 SyS_sendmsg+0x2d/0x50 net/socket.c:2059
 entry_SYSCALL_64_fastpath+0x23/0x9a
RIP: 0033:0x440159
RSP: 002b:00007ffe74df8288 EFLAGS: 00000217 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: ffffffffffffffff RCX: 0000000000440159
RDX: 0000000000000000 RSI: 00000000201fcfc8 RDI: 0000000000000005
RBP: 00000000006ca018 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000217 R12: 0000000000401ac0
R13: 0000000000401b50 R14: 0000000000000000 R15: 0000000000000000
Code: c5 61 70 fc 48 8b 7d d0 e8 7c c2 5b fd 84 c0 74 0d e8 b3 61 70 fc 48 89 df e8 3b 49 2f ff 41 bd f7 ff ff ff eb 86 e8 9e 61 70 fc <0f> 0b 41 bd 95 ff ff ff e9 74 ff ff ff e8 ec 32 a8 fc e9 77 fb
RIP: pppol2tp_sock_to_session net/l2tp/l2tp_ppp.c:176 [inline] RSP: ffff8801d4887438
RIP: pppol2tp_sendmsg+0x512/0x670 net/l2tp/l2tp_ppp.c:304 RSP: ffff8801d4887438

Reported-by: syzbot+114b15f2be420a8886c3@syzkaller.appspotmail.com
Fixes: ab7ac4eb9832 ("kcm: Kernel Connection Multiplexor module")
Signed-off-by: James Chapman <jchapman@katalix.com>
---
 net/kcm/kcmsock.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/net/kcm/kcmsock.c b/net/kcm/kcmsock.c
index d4e98f20fc2a..65392ed58f4a 100644
--- a/net/kcm/kcmsock.c
+++ b/net/kcm/kcmsock.c
@@ -1391,6 +1391,10 @@ static int kcm_attach(struct socket *sock, struct socket *csock,
 	if (csk->sk_family == PF_KCM)
 		return -EOPNOTSUPP;
 
+	/* Cannot proceed if connected socket already uses sk_user_data */
+	if (csk->sk_user_data)
+		return -EOPNOTSUPP;
+
 	psock = kmem_cache_zalloc(kcm_psockp, GFP_KERNEL);
 	if (!psock)
 		return -ENOMEM;
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next] kcm: do not attach sockets if sk_user_data is already used
  2018-01-14 11:32 [PATCH net-next] kcm: do not attach sockets if sk_user_data is already used James Chapman
@ 2018-01-16 10:44 ` Guillaume Nault
  2018-01-16 17:10   ` Eric Dumazet
  2018-01-16 17:36 ` Tom Herbert
  1 sibling, 1 reply; 16+ messages in thread
From: Guillaume Nault @ 2018-01-16 10:44 UTC (permalink / raw)
  To: Tom Herbert; +Cc: netdev, James Chapman

On Sun, Jan 14, 2018 at 11:32:57AM +0000, James Chapman wrote:
> SIOCKCMATTACH writes a connected socket's sk_user_data for its own
> use. Prevent it doing so if the socket's sk_user_data is already set
> since some sockets (e.g. encapsulated sockets) use sk_user_data
> internally.
> 
> diff --git a/net/kcm/kcmsock.c b/net/kcm/kcmsock.c
> index d4e98f20fc2a..65392ed58f4a 100644
> --- a/net/kcm/kcmsock.c
> +++ b/net/kcm/kcmsock.c
> @@ -1391,6 +1391,10 @@ static int kcm_attach(struct socket *sock, struct socket *csock,
>  	if (csk->sk_family == PF_KCM)
>  		return -EOPNOTSUPP;
>  
> +	/* Cannot proceed if connected socket already uses sk_user_data */
> +	if (csk->sk_user_data)
> +		return -EOPNOTSUPP;
> +
>  	psock = kmem_cache_zalloc(kcm_psockp, GFP_KERNEL);
>  	if (!psock)
>  		return -ENOMEM;
> 
Isn't that racy? What if sk_user_data was concurrently set right after
this test?
Also, it looks like we could create a UDP socket, attach it to KCM,
then create an L2TP tunnel on this same UDP socket. l2tp_tunnel_create()
or setup_udp_tunnel_sock() would unconditionally overwrite
sk_user_data, which will probably confuse KCM.

Tom, if I understand KCM correctly, it only makes sense to attach it to
SOCK_STREAM sockets. Shouldn't that be enforced? Maybe we should
restrict it even further, so that only known KCM-safe sockets could be
attached (that is, reject anything that isn't AF_INET* | SOCK_STREAM).

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next] kcm: do not attach sockets if sk_user_data is already used
  2018-01-16 10:44 ` Guillaume Nault
@ 2018-01-16 17:10   ` Eric Dumazet
  0 siblings, 0 replies; 16+ messages in thread
From: Eric Dumazet @ 2018-01-16 17:10 UTC (permalink / raw)
  To: Guillaume Nault, Tom Herbert; +Cc: netdev, James Chapman

On Tue, 2018-01-16 at 11:44 +0100, Guillaume Nault wrote:
> 
> Tom, if I understand KCM correctly, it only makes sense to attach it to
> SOCK_STREAM sockets. Shouldn't that be enforced? Maybe we should
> restrict it even further, so that only known KCM-safe sockets could be
> attached (that is, reject anything that isn't AF_INET* | SOCK_STREAM).


I believe I asked the same question months ago. I had no answer.

https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git/comm
it/?id=351050ecd6523374b370341cc29fe61e2201556b

Not sure if anyone uses KCM, other than fuzzers ?

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next] kcm: do not attach sockets if sk_user_data is already used
  2018-01-14 11:32 [PATCH net-next] kcm: do not attach sockets if sk_user_data is already used James Chapman
  2018-01-16 10:44 ` Guillaume Nault
@ 2018-01-16 17:36 ` Tom Herbert
  2018-01-16 19:00   ` David Miller
  1 sibling, 1 reply; 16+ messages in thread
From: Tom Herbert @ 2018-01-16 17:36 UTC (permalink / raw)
  To: James Chapman; +Cc: Linux Kernel Network Developers

On Sun, Jan 14, 2018 at 3:32 AM, James Chapman <jchapman@katalix.com> wrote:
> SIOCKCMATTACH writes a connected socket's sk_user_data for its own
> use. Prevent it doing so if the socket's sk_user_data is already set
> since some sockets (e.g. encapsulated sockets) use sk_user_data
> internally.
>
> This issue was found by syzbot.
>
> kernel BUG at net/l2tp/l2tp_ppp.c:176!
> invalid opcode: 0000 [#1] SMP KASAN
> Dumping ftrace buffer:
>    (ftrace buffer empty)
> Modules linked in:
> CPU: 1 PID: 3503 Comm: syzkaller938388 Not tainted 4.15.0-rc7+ #181
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> RIP: 0010:pppol2tp_sock_to_session net/l2tp/l2tp_ppp.c:176 [inline]
> RIP: 0010:pppol2tp_sendmsg+0x512/0x670 net/l2tp/l2tp_ppp.c:304
> RSP: 0018:ffff8801d4887438 EFLAGS: 00010293
> RAX: ffff8801bfef2180 RBX: ffff8801bff88440 RCX: ffffffff84ffbca2
> RDX: 0000000000000000 RSI: ffff8801d4887598 RDI: ffff8801bff88820
> RBP: ffff8801d48874a8 R08: 0000000000000000 R09: 1ffff1003a910e17
> R10: 0000000000000003 R11: 0000000000000001 R12: ffff8801bfff9bc0
> R13: 0000000000000000 R14: 0000000000008000 R15: 0000000000000000
> FS:  0000000001194880(0000) GS:ffff8801db300000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000000020ea0000 CR3: 00000001bfecf001 CR4: 00000000001606e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Call Trace:
>  sock_sendmsg_nosec net/socket.c:628 [inline]
>  sock_sendmsg+0xca/0x110 net/socket.c:638
>  kernel_sendmsg+0x47/0x60 net/socket.c:646
>  sock_no_sendpage+0x1cc/0x280 net/core/sock.c:2581
>  kernel_sendpage+0xbf/0xe0 net/socket.c:3349
>  kcm_write_msgs+0x404/0x1b80 net/kcm/kcmsock.c:646
>  kcm_sendmsg+0x148d/0x22d0 net/kcm/kcmsock.c:1035
>  sock_sendmsg_nosec net/socket.c:628 [inline]
>  sock_sendmsg+0xca/0x110 net/socket.c:638
>  ___sys_sendmsg+0x767/0x8b0 net/socket.c:2018
>  __sys_sendmsg+0xe5/0x210 net/socket.c:2052
>  SYSC_sendmsg net/socket.c:2063 [inline]
>  SyS_sendmsg+0x2d/0x50 net/socket.c:2059
>  entry_SYSCALL_64_fastpath+0x23/0x9a
> RIP: 0033:0x440159
> RSP: 002b:00007ffe74df8288 EFLAGS: 00000217 ORIG_RAX: 000000000000002e
> RAX: ffffffffffffffda RBX: ffffffffffffffff RCX: 0000000000440159
> RDX: 0000000000000000 RSI: 00000000201fcfc8 RDI: 0000000000000005
> RBP: 00000000006ca018 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000217 R12: 0000000000401ac0
> R13: 0000000000401b50 R14: 0000000000000000 R15: 0000000000000000
> Code: c5 61 70 fc 48 8b 7d d0 e8 7c c2 5b fd 84 c0 74 0d e8 b3 61 70 fc 48 89 df e8 3b 49 2f ff 41 bd f7 ff ff ff eb 86 e8 9e 61 70 fc <0f> 0b 41 bd 95 ff ff ff e9 74 ff ff ff e8 ec 32 a8 fc e9 77 fb
> RIP: pppol2tp_sock_to_session net/l2tp/l2tp_ppp.c:176 [inline] RSP: ffff8801d4887438
> RIP: pppol2tp_sendmsg+0x512/0x670 net/l2tp/l2tp_ppp.c:304 RSP: ffff8801d4887438
>
> Reported-by: syzbot+114b15f2be420a8886c3@syzkaller.appspotmail.com
> Fixes: ab7ac4eb9832 ("kcm: Kernel Connection Multiplexor module")
> Signed-off-by: James Chapman <jchapman@katalix.com>
> ---
>  net/kcm/kcmsock.c | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/net/kcm/kcmsock.c b/net/kcm/kcmsock.c
> index d4e98f20fc2a..65392ed58f4a 100644
> --- a/net/kcm/kcmsock.c
> +++ b/net/kcm/kcmsock.c
> @@ -1391,6 +1391,10 @@ static int kcm_attach(struct socket *sock, struct socket *csock,
>         if (csk->sk_family == PF_KCM)
>                 return -EOPNOTSUPP;
>
> +       /* Cannot proceed if connected socket already uses sk_user_data */
> +       if (csk->sk_user_data)
> +               return -EOPNOTSUPP;
> +

sk_user_data is set with the sk_callback lock held in code below.
Should be able to take the lock earlier can do this check under the
lock.

Tom

>         psock = kmem_cache_zalloc(kcm_psockp, GFP_KERNEL);
>         if (!psock)
>                 return -ENOMEM;
> --
> 1.9.1
>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next] kcm: do not attach sockets if sk_user_data is already used
  2018-01-16 17:36 ` Tom Herbert
@ 2018-01-16 19:00   ` David Miller
  2018-01-17 11:13     ` James Chapman
  0 siblings, 1 reply; 16+ messages in thread
From: David Miller @ 2018-01-16 19:00 UTC (permalink / raw)
  To: tom; +Cc: jchapman, netdev

From: Tom Herbert <tom@herbertland.com>
Date: Tue, 16 Jan 2018 09:36:41 -0800

> sk_user_data is set with the sk_callback lock held in code below.
> Should be able to take the lock earlier can do this check under the
> lock.

csock, and this csk, is obtained from an arbitrary one of the
process's FDs.  It can be any socket type or family, and that socket's
family might set sk_user_data without the callback lock.

The only socket type check is making sure it is not another PF_KCM
socket.  So that doesn't help with this problem.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next] kcm: do not attach sockets if sk_user_data is already used
  2018-01-16 19:00   ` David Miller
@ 2018-01-17 11:13     ` James Chapman
  2018-01-17 19:25       ` David Miller
  0 siblings, 1 reply; 16+ messages in thread
From: James Chapman @ 2018-01-17 11:13 UTC (permalink / raw)
  To: David Miller; +Cc: tom, netdev, Guillaume Nault

On 16 January 2018 at 19:00, David Miller <davem@davemloft.net> wrote:
> From: Tom Herbert <tom@herbertland.com>
> Date: Tue, 16 Jan 2018 09:36:41 -0800
>
>> sk_user_data is set with the sk_callback lock held in code below.
>> Should be able to take the lock earlier can do this check under the
>> lock.
>
> csock, and this csk, is obtained from an arbitrary one of the
> process's FDs.  It can be any socket type or family, and that socket's
> family might set sk_user_data without the callback lock.
>
> The only socket type check is making sure it is not another PF_KCM
> socket.  So that doesn't help with this problem.

Is it the intention to update all socket code over time to write
sk_user_data within the sk_callback lock? If so, I'm happy to address
that in the l2tp code (and update the kcm patch to check sk_user_data
within the sk_callback lock). Or is the preferred solution to restrict
KCM to specific socket families, as suggested by Guillaume earlier in
the thread?

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next] kcm: do not attach sockets if sk_user_data is already used
  2018-01-17 11:13     ` James Chapman
@ 2018-01-17 19:25       ` David Miller
  2018-01-18 15:18         ` Guillaume Nault
  2018-01-18 17:40         ` Tom Herbert
  0 siblings, 2 replies; 16+ messages in thread
From: David Miller @ 2018-01-17 19:25 UTC (permalink / raw)
  To: jchapman; +Cc: tom, netdev, g.nault

From: James Chapman <jchapman@katalix.com>
Date: Wed, 17 Jan 2018 11:13:33 +0000

> On 16 January 2018 at 19:00, David Miller <davem@davemloft.net> wrote:
>> From: Tom Herbert <tom@herbertland.com>
>> Date: Tue, 16 Jan 2018 09:36:41 -0800
>>
>>> sk_user_data is set with the sk_callback lock held in code below.
>>> Should be able to take the lock earlier can do this check under the
>>> lock.
>>
>> csock, and this csk, is obtained from an arbitrary one of the
>> process's FDs.  It can be any socket type or family, and that socket's
>> family might set sk_user_data without the callback lock.
>>
>> The only socket type check is making sure it is not another PF_KCM
>> socket.  So that doesn't help with this problem.
> 
> Is it the intention to update all socket code over time to write
> sk_user_data within the sk_callback lock? If so, I'm happy to address
> that in the l2tp code (and update the kcm patch to check sk_user_data
> within the sk_callback lock). Or is the preferred solution to restrict
> KCM to specific socket families, as suggested by Guillaume earlier in
> the thread?

I think we have a more fundamental issue here.

sk->sk_user_data is a place where RPC layer specific data is hung off
of.  By this definition SunRPC, RXRPC, RDS, TIPC, and KCM are all
using it correctly.

Phonet has a similar issue to the one seen here, it tests and changes
sk_user_data under lock_sock().  The only requirement it makes is
that the socket type is not SOCK_STREAM.  However, this one might be OK
since only pep_sock sockets can be passed down into gprs_attach().

Most of these cases like SunRPC, RXRPC, etc. are fine because they
only graft on top of TCP and UDP sockets.

The weird situation here is that L2TP does tunneling and stores it's
private state in sk->sk_user_data like an RPC layer would.  And KCM
allows basically any socket type to be attached.

The RPC layers create their sockets internally, so I cannot see a way
that those can be sent to a KCM attach operations.  And I think that
is why this RPC invariant is important for sk_user_data usage.

If all else was equal, even though it doesn't make much sense to KCM
attach L2TP sockets to KCM, I would suggest to change L2TP to store
it's private stuff elsewhere.

But that is not the case.  Anything using the generic UDP
encapsulation layer is going to make use of sk->sk_user_data like this
(see setup_udp_tunnel_sock).

It looks like over time we've accumulated this new class of uses
of sk->sk_user_data, ho hum...

And it's not like we can add a test to KCM to avoid these socket
types, because they will look like normal UDP datagram sockets.

What a mess...

Furthermore, even if you add a test to KCM, you will now need to
add the same test to L2TP and anything else which uses sk_user_data
for tunneling and for which userspace has access to the socket fd.

And it will be racy, indeed, until all such users align to the same
precise locking scheme for tests and updates to sk_user_data.

Again, what a mess...

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next] kcm: do not attach sockets if sk_user_data is already used
  2018-01-17 19:25       ` David Miller
@ 2018-01-18 15:18         ` Guillaume Nault
  2018-01-18 15:40           ` James Chapman
  2018-01-18 17:40         ` Tom Herbert
  1 sibling, 1 reply; 16+ messages in thread
From: Guillaume Nault @ 2018-01-18 15:18 UTC (permalink / raw)
  To: David Miller; +Cc: jchapman, tom, netdev

On Wed, Jan 17, 2018 at 02:25:38PM -0500, David Miller wrote:
> From: James Chapman <jchapman@katalix.com>
> Date: Wed, 17 Jan 2018 11:13:33 +0000
> 
> > On 16 January 2018 at 19:00, David Miller <davem@davemloft.net> wrote:
> >> From: Tom Herbert <tom@herbertland.com>
> >> Date: Tue, 16 Jan 2018 09:36:41 -0800
> >>
> >>> sk_user_data is set with the sk_callback lock held in code below.
> >>> Should be able to take the lock earlier can do this check under the
> >>> lock.
> >>
> >> csock, and this csk, is obtained from an arbitrary one of the
> >> process's FDs.  It can be any socket type or family, and that socket's
> >> family might set sk_user_data without the callback lock.
> >>
> >> The only socket type check is making sure it is not another PF_KCM
> >> socket.  So that doesn't help with this problem.
> > 
> > Is it the intention to update all socket code over time to write
> > sk_user_data within the sk_callback lock? If so, I'm happy to address
> > that in the l2tp code (and update the kcm patch to check sk_user_data
> > within the sk_callback lock). Or is the preferred solution to restrict
> > KCM to specific socket families, as suggested by Guillaume earlier in
> > the thread?
> 
> I think we have a more fundamental issue here.
> 
> sk->sk_user_data is a place where RPC layer specific data is hung off
> of.  By this definition SunRPC, RXRPC, RDS, TIPC, and KCM are all
> using it correctly.
> 
> Phonet has a similar issue to the one seen here, it tests and changes
> sk_user_data under lock_sock().  The only requirement it makes is
> that the socket type is not SOCK_STREAM.  However, this one might be OK
> since only pep_sock sockets can be passed down into gprs_attach().
> 
But, if I read it correctly, that doesn't prevent it from being passed
to kcm_attach() later on, which will overwrite sk_user_data (unless we
update the locking scheme and refuse to overwrite sk_user_data in a
race-free way).

BTW couldn't the gprs_dev pointer be embedded in struct pep_sock?
This way pep_sk(sk)->gp could be used instead of sk->sk_user_data.
That'd probably be a violation of the phonet's layering, as that'd
tie gprs_dev to pep sockets. OTOH, only pep sockets can currently be
attached to gprs_dev, so in practice that might be a reasonable
compromise.

> Most of these cases like SunRPC, RXRPC, etc. are fine because they
> only graft on top of TCP and UDP sockets.
> 
> The weird situation here is that L2TP does tunneling and stores it's
> private state in sk->sk_user_data like an RPC layer would.  And KCM
> allows basically any socket type to be attached.
> 
> The RPC layers create their sockets internally, so I cannot see a way
> that those can be sent to a KCM attach operations.  And I think that
> is why this RPC invariant is important for sk_user_data usage.
> 
SunRPC seems to possibly set sk_user_data on user sockets: svc_addsock()
gets a socket using sockfd_lookup() then passes it to svc_setup_socket()
which in turn sets sk_user_data. I don't know anything about SunRPC, so
I might very well have missed important details, but I believe such a
socket could be passed to KCM which could lead to the same kind of
issues as for L2TP. Other RPCs look safe to me.

> If all else was equal, even though it doesn't make much sense to KCM
> attach L2TP sockets to KCM, I would suggest to change L2TP to store
> it's private stuff elsewhere.
> 
> But that is not the case.  Anything using the generic UDP
> encapsulation layer is going to make use of sk->sk_user_data like this
> (see setup_udp_tunnel_sock).
> 
Most UDP encapsulations only use kernel sockets though. It seems that
only L2TP and GTP use setup_udp_tunnel_sock() with userpsace sockets.
So it might be feasible to restrict usage of sk_user_data to kernel
sockets only.

For L2TP, we probably can adapt l2tp_sock_to_tunnel() so that it does
a lookup in a hashtable indexed by the socket pointer, rather than
dereferencing sk_user_data. That doesn't look very satisfying to me,
but that's the only way I found so far.

We also have another user of sk_user_data in l2tp_ppp, but since it
uses its own socket type, I guess we could simply embed the pointer in
its parent structure.

> It looks like over time we've accumulated this new class of uses
> of sk->sk_user_data, ho hum...
> 
> And it's not like we can add a test to KCM to avoid these socket
> types, because they will look like normal UDP datagram sockets.
> 
> What a mess...
> 
> Furthermore, even if you add a test to KCM, you will now need to
> add the same test to L2TP and anything else which uses sk_user_data
> for tunneling and for which userspace has access to the socket fd.
> 
> And it will be racy, indeed, until all such users align to the same
> precise locking scheme for tests and updates to sk_user_data.
> 
> Again, what a mess...
> 
So, if I understand correctly, we can either restrict sk_user_data to
kernel sockets so that KCM couldn't act on them (but then why would we
make an exception for KCM and allow it to set sk_user_data on
non-kernel sockets?).
Or we could agree on a locking scheme for sk_user_data and update all
users so that they'd fail instead of overwriting it when it's not NULL.

Assuming my understanding is correct, do you have any preference for
fixing this issue? Or any other ideas?

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next] kcm: do not attach sockets if sk_user_data is already used
  2018-01-18 15:18         ` Guillaume Nault
@ 2018-01-18 15:40           ` James Chapman
  2018-01-18 16:29             ` Guillaume Nault
  2018-01-18 17:46             ` Tom Herbert
  0 siblings, 2 replies; 16+ messages in thread
From: James Chapman @ 2018-01-18 15:40 UTC (permalink / raw)
  To: Guillaume Nault, David Miller; +Cc: Tom Herbert, netdev

On 18 January 2018 at 15:18, Guillaume Nault <g.nault@alphalink.fr> wrote:
> On Wed, Jan 17, 2018 at 02:25:38PM -0500, David Miller wrote:
>> From: James Chapman <jchapman@katalix.com>
>> Date: Wed, 17 Jan 2018 11:13:33 +0000
>>
>> > On 16 January 2018 at 19:00, David Miller <davem@davemloft.net> wrote:
>> >> From: Tom Herbert <tom@herbertland.com>
>> >> Date: Tue, 16 Jan 2018 09:36:41 -0800
>> >>
>> >>> sk_user_data is set with the sk_callback lock held in code below.
>> >>> Should be able to take the lock earlier can do this check under the
>> >>> lock.
>> >>
>> >> csock, and this csk, is obtained from an arbitrary one of the
>> >> process's FDs.  It can be any socket type or family, and that socket's
>> >> family might set sk_user_data without the callback lock.
>> >>
>> >> The only socket type check is making sure it is not another PF_KCM
>> >> socket.  So that doesn't help with this problem.
>> >
>> > Is it the intention to update all socket code over time to write
>> > sk_user_data within the sk_callback lock? If so, I'm happy to address
>> > that in the l2tp code (and update the kcm patch to check sk_user_data
>> > within the sk_callback lock). Or is the preferred solution to restrict
>> > KCM to specific socket families, as suggested by Guillaume earlier in
>> > the thread?
>>
>> I think we have a more fundamental issue here.
>>
>> sk->sk_user_data is a place where RPC layer specific data is hung off
>> of.  By this definition SunRPC, RXRPC, RDS, TIPC, and KCM are all
>> using it correctly.
>>
>> Phonet has a similar issue to the one seen here, it tests and changes
>> sk_user_data under lock_sock().  The only requirement it makes is
>> that the socket type is not SOCK_STREAM.  However, this one might be OK
>> since only pep_sock sockets can be passed down into gprs_attach().
>>
> But, if I read it correctly, that doesn't prevent it from being passed
> to kcm_attach() later on, which will overwrite sk_user_data (unless we
> update the locking scheme and refuse to overwrite sk_user_data in a
> race-free way).
>
> BTW couldn't the gprs_dev pointer be embedded in struct pep_sock?
> This way pep_sk(sk)->gp could be used instead of sk->sk_user_data.
> That'd probably be a violation of the phonet's layering, as that'd
> tie gprs_dev to pep sockets. OTOH, only pep sockets can currently be
> attached to gprs_dev, so in practice that might be a reasonable
> compromise.
>
>> Most of these cases like SunRPC, RXRPC, etc. are fine because they
>> only graft on top of TCP and UDP sockets.
>>
>> The weird situation here is that L2TP does tunneling and stores it's
>> private state in sk->sk_user_data like an RPC layer would.  And KCM
>> allows basically any socket type to be attached.
>>
>> The RPC layers create their sockets internally, so I cannot see a way
>> that those can be sent to a KCM attach operations.  And I think that
>> is why this RPC invariant is important for sk_user_data usage.
>>
> SunRPC seems to possibly set sk_user_data on user sockets: svc_addsock()
> gets a socket using sockfd_lookup() then passes it to svc_setup_socket()
> which in turn sets sk_user_data. I don't know anything about SunRPC, so
> I might very well have missed important details, but I believe such a
> socket could be passed to KCM which could lead to the same kind of
> issues as for L2TP. Other RPCs look safe to me.
>
>> If all else was equal, even though it doesn't make much sense to KCM
>> attach L2TP sockets to KCM, I would suggest to change L2TP to store
>> it's private stuff elsewhere.
>>
>> But that is not the case.  Anything using the generic UDP
>> encapsulation layer is going to make use of sk->sk_user_data like this
>> (see setup_udp_tunnel_sock).
>>
> Most UDP encapsulations only use kernel sockets though. It seems that
> only L2TP and GTP use setup_udp_tunnel_sock() with userpsace sockets.
> So it might be feasible to restrict usage of sk_user_data to kernel
> sockets only.
>
> For L2TP, we probably can adapt l2tp_sock_to_tunnel() so that it does
> a lookup in a hashtable indexed by the socket pointer, rather than
> dereferencing sk_user_data. That doesn't look very satisfying to me,
> but that's the only way I found so far.

L2TP needs a way to get at its local data from the socket in the data path.

> We also have another user of sk_user_data in l2tp_ppp, but since it
> uses its own socket type, I guess we could simply embed the pointer in
> its parent structure.
>
>> It looks like over time we've accumulated this new class of uses
>> of sk->sk_user_data, ho hum...
>>
>> And it's not like we can add a test to KCM to avoid these socket
>> types, because they will look like normal UDP datagram sockets.
>>
>> What a mess...
>>
>> Furthermore, even if you add a test to KCM, you will now need to
>> add the same test to L2TP and anything else which uses sk_user_data
>> for tunneling and for which userspace has access to the socket fd.
>>
>> And it will be racy, indeed, until all such users align to the same
>> precise locking scheme for tests and updates to sk_user_data.
>>
>> Again, what a mess...
>>
> So, if I understand correctly, we can either restrict sk_user_data to
> kernel sockets so that KCM couldn't act on them (but then why would we
> make an exception for KCM and allow it to set sk_user_data on
> non-kernel sockets?).
> Or we could agree on a locking scheme for sk_user_data and update all
> users so that they'd fail instead of overwriting it when it's not NULL.
>
> Assuming my understanding is correct, do you have any preference for
> fixing this issue? Or any other ideas?

Could we add a new pointer, say, encap_user_data to struct udp_sock
and use it instead of sk_user_data for UDP-encap sockets?

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next] kcm: do not attach sockets if sk_user_data is already used
  2018-01-18 15:40           ` James Chapman
@ 2018-01-18 16:29             ` Guillaume Nault
  2018-01-18 17:30               ` James Chapman
  2018-01-18 17:46             ` Tom Herbert
  1 sibling, 1 reply; 16+ messages in thread
From: Guillaume Nault @ 2018-01-18 16:29 UTC (permalink / raw)
  To: James Chapman; +Cc: David Miller, Tom Herbert, netdev

On Thu, Jan 18, 2018 at 03:40:52PM +0000, James Chapman wrote:
> On 18 January 2018 at 15:18, Guillaume Nault <g.nault@alphalink.fr> wrote:
> > On Wed, Jan 17, 2018 at 02:25:38PM -0500, David Miller wrote:
> >> If all else was equal, even though it doesn't make much sense to KCM
> >> attach L2TP sockets to KCM, I would suggest to change L2TP to store
> >> it's private stuff elsewhere.
> >>
> >> But that is not the case.  Anything using the generic UDP
> >> encapsulation layer is going to make use of sk->sk_user_data like this
> >> (see setup_udp_tunnel_sock).
> >>
> > Most UDP encapsulations only use kernel sockets though. It seems that
> > only L2TP and GTP use setup_udp_tunnel_sock() with userpsace sockets.
> > So it might be feasible to restrict usage of sk_user_data to kernel
> > sockets only.
> >
> > For L2TP, we probably can adapt l2tp_sock_to_tunnel() so that it does
> > a lookup in a hashtable indexed by the socket pointer, rather than
> > dereferencing sk_user_data. That doesn't look very satisfying to me,
> > but that's the only way I found so far.
> 
> L2TP needs a way to get at its local data from the socket in the data path.
> 
Did I miss something? On xmit, the session is provided by l2tp_ppp or
l2tp_eth, which is enough to get access to the parent tunnel.
For reception, l2tp_udp_encap_recv() receives the socket pointer as
parameter and could get enough information from the headers to retrieve the
tunnel structure anymay (l2tp_ip and l2tp_ip6 use the headers).

l2tp_ppp also uses sk_user_data for its PPPOX sockets, but we probably
can handle this case more easily.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next] kcm: do not attach sockets if sk_user_data is already used
  2018-01-18 16:29             ` Guillaume Nault
@ 2018-01-18 17:30               ` James Chapman
  0 siblings, 0 replies; 16+ messages in thread
From: James Chapman @ 2018-01-18 17:30 UTC (permalink / raw)
  To: Guillaume Nault; +Cc: David Miller, Tom Herbert, netdev

On 18 January 2018 at 16:29, Guillaume Nault <g.nault@alphalink.fr> wrote:
> On Thu, Jan 18, 2018 at 03:40:52PM +0000, James Chapman wrote:
>> On 18 January 2018 at 15:18, Guillaume Nault <g.nault@alphalink.fr> wrote:
>> > On Wed, Jan 17, 2018 at 02:25:38PM -0500, David Miller wrote:
>> >> If all else was equal, even though it doesn't make much sense to KCM
>> >> attach L2TP sockets to KCM, I would suggest to change L2TP to store
>> >> it's private stuff elsewhere.
>> >>
>> >> But that is not the case.  Anything using the generic UDP
>> >> encapsulation layer is going to make use of sk->sk_user_data like this
>> >> (see setup_udp_tunnel_sock).
>> >>
>> > Most UDP encapsulations only use kernel sockets though. It seems that
>> > only L2TP and GTP use setup_udp_tunnel_sock() with userpsace sockets.
>> > So it might be feasible to restrict usage of sk_user_data to kernel
>> > sockets only.
>> >
>> > For L2TP, we probably can adapt l2tp_sock_to_tunnel() so that it does
>> > a lookup in a hashtable indexed by the socket pointer, rather than
>> > dereferencing sk_user_data. That doesn't look very satisfying to me,
>> > but that's the only way I found so far.
>>
>> L2TP needs a way to get at its local data from the socket in the data path.
>>
> Did I miss something? On xmit, the session is provided by l2tp_ppp or
> l2tp_eth, which is enough to get access to the parent tunnel.
> For reception, l2tp_udp_encap_recv() receives the socket pointer as
> parameter and could get enough information from the headers to retrieve the
> tunnel structure anymay (l2tp_ip and l2tp_ip6 use the headers).

It's the receive side I was thinking about. It would be a little more
involved to derive the tunnel and session from the packet with UDP
since we'd have to handle L2TPv2 and L2TPv3.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next] kcm: do not attach sockets if sk_user_data is already used
  2018-01-17 19:25       ` David Miller
  2018-01-18 15:18         ` Guillaume Nault
@ 2018-01-18 17:40         ` Tom Herbert
  1 sibling, 0 replies; 16+ messages in thread
From: Tom Herbert @ 2018-01-18 17:40 UTC (permalink / raw)
  To: David Miller; +Cc: James Chapman, Linux Kernel Network Developers, g.nault

On Wed, Jan 17, 2018 at 11:25 AM, David Miller <davem@davemloft.net> wrote:
> From: James Chapman <jchapman@katalix.com>
> Date: Wed, 17 Jan 2018 11:13:33 +0000
>
>> On 16 January 2018 at 19:00, David Miller <davem@davemloft.net> wrote:
>>> From: Tom Herbert <tom@herbertland.com>
>>> Date: Tue, 16 Jan 2018 09:36:41 -0800
>>>
>>>> sk_user_data is set with the sk_callback lock held in code below.
>>>> Should be able to take the lock earlier can do this check under the
>>>> lock.
>>>
>>> csock, and this csk, is obtained from an arbitrary one of the
>>> process's FDs.  It can be any socket type or family, and that socket's
>>> family might set sk_user_data without the callback lock.
>>>
>>> The only socket type check is making sure it is not another PF_KCM
>>> socket.  So that doesn't help with this problem.
>>
>> Is it the intention to update all socket code over time to write
>> sk_user_data within the sk_callback lock? If so, I'm happy to address
>> that in the l2tp code (and update the kcm patch to check sk_user_data
>> within the sk_callback lock). Or is the preferred solution to restrict
>> KCM to specific socket families, as suggested by Guillaume earlier in
>> the thread?
>
> I think we have a more fundamental issue here.
>
> sk->sk_user_data is a place where RPC layer specific data is hung off
> of.  By this definition SunRPC, RXRPC, RDS, TIPC, and KCM are all
> using it correctly.
>
> Phonet has a similar issue to the one seen here, it tests and changes
> sk_user_data under lock_sock().  The only requirement it makes is
> that the socket type is not SOCK_STREAM.  However, this one might be OK
> since only pep_sock sockets can be passed down into gprs_attach().
>
> Most of these cases like SunRPC, RXRPC, etc. are fine because they
> only graft on top of TCP and UDP sockets.
>
> The weird situation here is that L2TP does tunneling and stores it's
> private state in sk->sk_user_data like an RPC layer would.  And KCM
> allows basically any socket type to be attached.
>
> The RPC layers create their sockets internally, so I cannot see a way
> that those can be sent to a KCM attach operations.  And I think that
> is why this RPC invariant is important for sk_user_data usage.
>
> If all else was equal, even though it doesn't make much sense to KCM
> attach L2TP sockets to KCM, I would suggest to change L2TP to store
> it's private stuff elsewhere.
>
> But that is not the case.  Anything using the generic UDP
> encapsulation layer is going to make use of sk->sk_user_data like this
> (see setup_udp_tunnel_sock).
>
> It looks like over time we've accumulated this new class of uses
> of sk->sk_user_data, ho hum...
>
> And it's not like we can add a test to KCM to avoid these socket
> types, because they will look like normal UDP datagram sockets.
>
> What a mess...
>
> Furthermore, even if you add a test to KCM, you will now need to
> add the same test to L2TP and anything else which uses sk_user_data
> for tunneling and for which userspace has access to the socket fd.
>
> And it will be racy, indeed, until all such users align to the same
> precise locking scheme for tests and updates to sk_user_data.
>
> Again, what a mess...
>
It's not so surprising that sk_user_data is being used for so many
purposes, it's quite a powerful and useful notion. So, to a large
extent I think it's a victim of it's own success.

Aligning to one locking scheme is the first task to clean this. The
second would be how to deal with multiple simulataneous use on a
socket (or maybe not allow). I've thought about having a chain of
sk_user_data, but that's only useful is the write/read callback are
also chained. All this starts to look like STREAMS at some point ;-)

Tom

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next] kcm: do not attach sockets if sk_user_data is already used
  2018-01-18 15:40           ` James Chapman
  2018-01-18 16:29             ` Guillaume Nault
@ 2018-01-18 17:46             ` Tom Herbert
  2018-01-18 18:08               ` Eric Dumazet
  1 sibling, 1 reply; 16+ messages in thread
From: Tom Herbert @ 2018-01-18 17:46 UTC (permalink / raw)
  To: James Chapman
  Cc: Guillaume Nault, David Miller, Linux Kernel Network Developers

On Thu, Jan 18, 2018 at 7:40 AM, James Chapman <jchapman@katalix.com> wrote:
> On 18 January 2018 at 15:18, Guillaume Nault <g.nault@alphalink.fr> wrote:
>> On Wed, Jan 17, 2018 at 02:25:38PM -0500, David Miller wrote:
>>> From: James Chapman <jchapman@katalix.com>
>>> Date: Wed, 17 Jan 2018 11:13:33 +0000
>>>
>>> > On 16 January 2018 at 19:00, David Miller <davem@davemloft.net> wrote:
>>> >> From: Tom Herbert <tom@herbertland.com>
>>> >> Date: Tue, 16 Jan 2018 09:36:41 -0800
>>> >>
>>> >>> sk_user_data is set with the sk_callback lock held in code below.
>>> >>> Should be able to take the lock earlier can do this check under the
>>> >>> lock.
>>> >>
>>> >> csock, and this csk, is obtained from an arbitrary one of the
>>> >> process's FDs.  It can be any socket type or family, and that socket's
>>> >> family might set sk_user_data without the callback lock.
>>> >>
>>> >> The only socket type check is making sure it is not another PF_KCM
>>> >> socket.  So that doesn't help with this problem.
>>> >
>>> > Is it the intention to update all socket code over time to write
>>> > sk_user_data within the sk_callback lock? If so, I'm happy to address
>>> > that in the l2tp code (and update the kcm patch to check sk_user_data
>>> > within the sk_callback lock). Or is the preferred solution to restrict
>>> > KCM to specific socket families, as suggested by Guillaume earlier in
>>> > the thread?
>>>
>>> I think we have a more fundamental issue here.
>>>
>>> sk->sk_user_data is a place where RPC layer specific data is hung off
>>> of.  By this definition SunRPC, RXRPC, RDS, TIPC, and KCM are all
>>> using it correctly.
>>>
>>> Phonet has a similar issue to the one seen here, it tests and changes
>>> sk_user_data under lock_sock().  The only requirement it makes is
>>> that the socket type is not SOCK_STREAM.  However, this one might be OK
>>> since only pep_sock sockets can be passed down into gprs_attach().
>>>
>> But, if I read it correctly, that doesn't prevent it from being passed
>> to kcm_attach() later on, which will overwrite sk_user_data (unless we
>> update the locking scheme and refuse to overwrite sk_user_data in a
>> race-free way).
>>
>> BTW couldn't the gprs_dev pointer be embedded in struct pep_sock?
>> This way pep_sk(sk)->gp could be used instead of sk->sk_user_data.
>> That'd probably be a violation of the phonet's layering, as that'd
>> tie gprs_dev to pep sockets. OTOH, only pep sockets can currently be
>> attached to gprs_dev, so in practice that might be a reasonable
>> compromise.
>>
>>> Most of these cases like SunRPC, RXRPC, etc. are fine because they
>>> only graft on top of TCP and UDP sockets.
>>>
>>> The weird situation here is that L2TP does tunneling and stores it's
>>> private state in sk->sk_user_data like an RPC layer would.  And KCM
>>> allows basically any socket type to be attached.
>>>
>>> The RPC layers create their sockets internally, so I cannot see a way
>>> that those can be sent to a KCM attach operations.  And I think that
>>> is why this RPC invariant is important for sk_user_data usage.
>>>
>> SunRPC seems to possibly set sk_user_data on user sockets: svc_addsock()
>> gets a socket using sockfd_lookup() then passes it to svc_setup_socket()
>> which in turn sets sk_user_data. I don't know anything about SunRPC, so
>> I might very well have missed important details, but I believe such a
>> socket could be passed to KCM which could lead to the same kind of
>> issues as for L2TP. Other RPCs look safe to me.
>>
>>> If all else was equal, even though it doesn't make much sense to KCM
>>> attach L2TP sockets to KCM, I would suggest to change L2TP to store
>>> it's private stuff elsewhere.
>>>
>>> But that is not the case.  Anything using the generic UDP
>>> encapsulation layer is going to make use of sk->sk_user_data like this
>>> (see setup_udp_tunnel_sock).
>>>
>> Most UDP encapsulations only use kernel sockets though. It seems that
>> only L2TP and GTP use setup_udp_tunnel_sock() with userpsace sockets.
>> So it might be feasible to restrict usage of sk_user_data to kernel
>> sockets only.
>>
>> For L2TP, we probably can adapt l2tp_sock_to_tunnel() so that it does
>> a lookup in a hashtable indexed by the socket pointer, rather than
>> dereferencing sk_user_data. That doesn't look very satisfying to me,
>> but that's the only way I found so far.
>
> L2TP needs a way to get at its local data from the socket in the data path.
>
>> We also have another user of sk_user_data in l2tp_ppp, but since it
>> uses its own socket type, I guess we could simply embed the pointer in
>> its parent structure.
>>
>>> It looks like over time we've accumulated this new class of uses
>>> of sk->sk_user_data, ho hum...
>>>
>>> And it's not like we can add a test to KCM to avoid these socket
>>> types, because they will look like normal UDP datagram sockets.
>>>
>>> What a mess...
>>>
>>> Furthermore, even if you add a test to KCM, you will now need to
>>> add the same test to L2TP and anything else which uses sk_user_data
>>> for tunneling and for which userspace has access to the socket fd.
>>>
>>> And it will be racy, indeed, until all such users align to the same
>>> precise locking scheme for tests and updates to sk_user_data.
>>>
>>> Again, what a mess...
>>>
>> So, if I understand correctly, we can either restrict sk_user_data to
>> kernel sockets so that KCM couldn't act on them (but then why would we
>> make an exception for KCM and allow it to set sk_user_data on
>> non-kernel sockets?).
>> Or we could agree on a locking scheme for sk_user_data and update all
>> users so that they'd fail instead of overwriting it when it's not NULL.
>>
>> Assuming my understanding is correct, do you have any preference for
>> fixing this issue? Or any other ideas?
>
> Could we add a new pointer, say, encap_user_data to struct udp_sock
> and use it instead of sk_user_data for UDP-encap sockets?

Then that's increasing the udp_sock structure size for a narrow use
case which will get push back. I think it's going to be better to
stick with one sock pointer. We could maybe redefine sk_user_data as a
pointer to an allocated structure or array so it can hold multiple
user_data pointers (in lieu of chaining).

Tom

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next] kcm: do not attach sockets if sk_user_data is already used
  2018-01-18 17:46             ` Tom Herbert
@ 2018-01-18 18:08               ` Eric Dumazet
  2018-01-18 19:26                 ` Tom Herbert
  0 siblings, 1 reply; 16+ messages in thread
From: Eric Dumazet @ 2018-01-18 18:08 UTC (permalink / raw)
  To: Tom Herbert, James Chapman
  Cc: Guillaume Nault, David Miller, Linux Kernel Network Developers

On Thu, 2018-01-18 at 09:46 -0800, Tom Herbert wrote:
> 
> Then that's increasing the udp_sock structure size for a narrow use
> case which will get push back. I think it's going to be better to
> stick with one sock pointer. We could maybe redefine sk_user_data as a
> pointer to an allocated structure or array so it can hold multiple
> user_data pointers (in lieu of chaining).
> 

We do not have a lot of UDP sockets per host, I do not believe it
should be a problem adding stuff in them.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next] kcm: do not attach sockets if sk_user_data is already used
  2018-01-18 18:08               ` Eric Dumazet
@ 2018-01-18 19:26                 ` Tom Herbert
  2018-01-18 19:46                   ` Eric Dumazet
  0 siblings, 1 reply; 16+ messages in thread
From: Tom Herbert @ 2018-01-18 19:26 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: James Chapman, Guillaume Nault, David Miller,
	Linux Kernel Network Developers

On Thu, Jan 18, 2018 at 10:08 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Thu, 2018-01-18 at 09:46 -0800, Tom Herbert wrote:
>>
>> Then that's increasing the udp_sock structure size for a narrow use
>> case which will get push back. I think it's going to be better to
>> stick with one sock pointer. We could maybe redefine sk_user_data as a
>> pointer to an allocated structure or array so it can hold multiple
>> user_data pointers (in lieu of chaining).
>>
>
> We do not have a lot of UDP sockets per host, I do not believe it
> should be a problem adding stuff in them.
>
Eric,

Is QUIC using unconnected sockets then?

Tom

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next] kcm: do not attach sockets if sk_user_data is already used
  2018-01-18 19:26                 ` Tom Herbert
@ 2018-01-18 19:46                   ` Eric Dumazet
  0 siblings, 0 replies; 16+ messages in thread
From: Eric Dumazet @ 2018-01-18 19:46 UTC (permalink / raw)
  To: Tom Herbert
  Cc: James Chapman, Guillaume Nault, David Miller,
	Linux Kernel Network Developers

On Thu, 2018-01-18 at 11:26 -0800, Tom Herbert wrote:
> On Thu, Jan 18, 2018 at 10:08 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> > On Thu, 2018-01-18 at 09:46 -0800, Tom Herbert wrote:
> > > 
> > > Then that's increasing the udp_sock structure size for a narrow use
> > > case which will get push back. I think it's going to be better to
> > > stick with one sock pointer. We could maybe redefine sk_user_data as a
> > > pointer to an allocated structure or array so it can hold multiple
> > > user_data pointers (in lieu of chaining).
> > > 
> > 
> > We do not have a lot of UDP sockets per host, I do not believe it
> > should be a problem adding stuff in them.
> > 
> 
> Eric,
> 
> Is QUIC using unconnected sockets then?

Server side is using a bunch of unconnected sockets, usually one per
cpu.

Note that UDP stack has no 4-tuple proper support yet, and even if it
had, extra memory costs would be huge on servers handling millions of
flows.

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2018-01-18 19:46 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-01-14 11:32 [PATCH net-next] kcm: do not attach sockets if sk_user_data is already used James Chapman
2018-01-16 10:44 ` Guillaume Nault
2018-01-16 17:10   ` Eric Dumazet
2018-01-16 17:36 ` Tom Herbert
2018-01-16 19:00   ` David Miller
2018-01-17 11:13     ` James Chapman
2018-01-17 19:25       ` David Miller
2018-01-18 15:18         ` Guillaume Nault
2018-01-18 15:40           ` James Chapman
2018-01-18 16:29             ` Guillaume Nault
2018-01-18 17:30               ` James Chapman
2018-01-18 17:46             ` Tom Herbert
2018-01-18 18:08               ` Eric Dumazet
2018-01-18 19:26                 ` Tom Herbert
2018-01-18 19:46                   ` Eric Dumazet
2018-01-18 17:40         ` Tom Herbert

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).