fput() considered harmful

* fput() considered harmful
@ 2023-03-08 13:24 Hannes Reinecke
  2023-03-08 13:43 ` Chuck Lever III
  0 siblings, 1 reply; 7+ messages in thread
From: Hannes Reinecke @ 2023-03-08 13:24 UTC (permalink / raw)
  To: Chuck Lever; +Cc: kernel-tls-handshake

Hi Chuck,

I'm playing around with v6 and (again) facing really nasty crashes:

[ 1662.912887] ------------[ cut here ]------------
[ 1662.913399] kernel BUG at fs/inode.c:1763!
[ 1662.913822] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
[ 1662.914336] CPU: 1 PID: 6494 Comm: tlshd Kdump: loaded Tainted: G 
       E      6.2.0-rc4-54-default+ #231
[ 1662.915235] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 
0.0.0 02/06/2015
[ 1662.915932] RIP: 0010:iput+0x1d/0x20
[ 1662.916275] Code: 90 90 90 90 90 90 90 90 90 90 90 90 90 0f 1f 44 00 
00 48 85 ff 74 0e f6 87 98 00 00 00 40
  75 0a e9 18 fe ff ff e9 23 e3 7a 00 <0f> 0b 90 90 90 90 90 90 90 90 90 
90 90 90 90 90 90 90 90 0f 1f 44
[ 1662.917921] RSP: 0018:ffffaeb600b1fe38 EFLAGS: 00010202
[ 1662.918423] RAX: 0000000000000000 RBX: ffff953bcb9d9180 RCX: 
0000000000000064
[ 1662.919065] RDX: 0000000000000001 RSI: ffff953bcbb23c38 RDI: 
ffff953bcbb23b00
[ 1662.919703] RBP: 0000000000000000 R08: 0000000100053304 R09: 
ffff953bcbb23b00
[ 1662.920339] R10: ffff953bc5538b40 R11: 0000000000000003 R12: 
ffff953bcb9d91d8
[ 1662.920978] R13: ffff953bfd15ade0 R14: ffff953bcb9d9180 R15: 
ffff953b83918cb8
[ 1662.921617] FS:  0000000000000000(0000) GS:ffff953bfb900000(0000) 
knlGS:0000000000000000
[ 1662.922376] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1662.922907] CR2: 00007fdc89a091a0 CR3: 000000004ad8c000 CR4: 
0000000000350ee0
[ 1662.923568] Call Trace:
[ 1662.923801]  <TASK>
[ 1662.924006]  __dentry_kill+0xca/0x160
[ 1662.924353]  __fput+0xdf/0x240
[ 1662.924652]  task_work_run+0x67/0xa0
[ 1662.924992]  do_exit+0x343/0xb50
[ 1662.925302]  do_group_exit+0x2d/0x80
[ 1662.925644]  __x64_sys_exit_group+0x14/0x20
[ 1662.926039]  do_syscall_64+0x31/0x80

Key problem seems to be userspace here, which (apparently) is calling 
close() on the fd which we've passed from the kernel.
While this _normally_ is not a problem, in our case we have the problem
that the filedescriptor is associated with a socket (ie struct socket).
And that one is shared between kernel code and userland.
And doesn't have any refcounting whatsoever; socket_release() removes 
the reference to the underlying file:

	if (!sock->file) {
		iput(SOCK_INODE(sock));
		return;
	}
	sock->file = NULL;

ie if _another_ process (the kernel driver, say) is calling 'close()', 
too, it'll run into the !sock->file condition and crash in iput().

I have been testing various things here, not calling fput, calling 
get_file() etc, but either hit a crash or kmemleak complaining about
the file not being freed.

Have you seen similar issues?
And even if you haven't: relying on userspace _not_ to call 'close()'
on the socket seems to be a quite dangerous concept.
Maybe we need to implement a 'socket_clone()' functionality?

Cheers,

Hannes

^ permalink raw reply	[flat|nested] 7+ messages in thread