debugfs question...

* debugfs question...
@ 2016-10-31 18:32 Mike Marshall
  2016-10-31 19:38 ` Greg KH
From: Mike Marshall @ 2016-10-31 18:32 UTC
  To: Al Viro
  Cc: linux-fsdevel, Linus Torvalds, Greg KH, Mike Marshall,
	Martin Brandenburg

Hello everyone.

I wrote the Orangefs debugfs code. Recently my coworker
Martin refactored it to clean up the cut-and-pastey parts
I had put in. After the refactor, dan.carpenter@oracle.com's
static tester flagged a possible double-free in the code.

I think the possible double-free will be easy to fix, but
while I'm in there, I'm looking for other "bad places".

Our debugfs code results in three files in /sys/kernel/debug/orangefs.
One of the files gets deleted (debugfs_remove'd) and re-created
(debugfs_create_file'd) the first time someone fires up the
user-space part of Orangefs after a reboot.

We wondered what awful things might happen if someone was
reading the file across the delete/re-create, so I wrote a
program that opens the file, sleeps ten seconds and then
starts reading, and I fired up the Orangefs userspace part
during the sleep. I didn't see any problems there; we get
EIO when the read happens.
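
A test program of the kind described above can be sketched roughly as
follows. This is not the actual tester, only a minimal illustration: the
function name open_sleep_read, the delay parameter and the 4096-byte
buffer are assumptions made for the example.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper: open path, sleep delay_secs seconds, then read
 * until EOF or error. Returns the total number of bytes read, or -1
 * with errno set if the open or a read fails.
 */
static long open_sleep_read(const char *path, unsigned int delay_secs)
{
	char buf[4096];
	long total = 0;
	ssize_t n;
	int fd, saved;

	assert(path != NULL);

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;

	/* Window in which the file can be re-created or the module unloaded. */
	sleep(delay_secs);

	while ((n = read(fd, buf, sizeof(buf))) > 0)
		total += n;

	if (n < 0) {
		/* Preserve the read error (e.g. EIO) across close(). */
		saved = errno;
		close(fd);
		errno = saved;
		return -1;
	}

	close(fd);
	return total;
}
```

Calling open_sleep_read() with a debugfs path and a delay of 10 gives
the ten-second window used in the tests here; a return of -1 with errno
set to EIO would match the delete/re-create result described above.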

But... really bad things happen (the kernel oopses) if someone unloads
the Orangefs module after my test program does the open and before the
read starts. So I picked another debugfs-using-filesystem (f2fs) and
pointed my tester-program at /sys/kernel/debug/f2fs/status, and
the same bad thing happens there (the oops from the f2fs run is below).

I was hoping that f2fs, or some other debugfs-using-filesystem, would be
able to handle my rmmod test and then I could look at their code for
inspiration, but no such luck so far. Is there something that the f2fs
folks and I aren't doing right, or is this just something about debugfs
that's fragile?

[ 1240.133703] BUG: unable to handle kernel paging request at ffffffffa0307430
[ 1240.134109] IP: [<ffffffff8132a224>] full_proxy_release+0x24/0x90
[ 1240.134434] PGD 1c0f067 PUD 1c10063 PMD 3c8d0067 PTE 0
[ 1240.134905]
[ 1240.134988] Oops: 0000 [#1]
[ 1240.135137] Modules linked in: ip6t_rpfilter bnep ip6t_REJECT
nf_reject_ipv6 bluetooth rfkill nf_conntrack_ipv6 nf_defrag_ipv6
nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ebtable_nat
ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_mangle
ip6table_security ip6table_raw ip6table_filter ip6_tables
iptable_mangle iptable_security iptable_raw ppdev parport_pc parport
8139too serio_raw i2c_piix4 virtio_balloon virtio_console pvpanic
uinput qxl drm_kms_helper ttm drm virtio_pci 8139cp i2c_core
ata_generic virtio virtio_ring mii pata_acpi [last unloaded: f2fs]
[ 1240.138209] CPU: 0 PID: 1178 Comm: dhs Not tainted
4.9.0-rc1-00002-g804b173-dirty #3
[ 1240.138605] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
[ 1240.138968] task: ffff88003e166040 task.stack: ffffc900006d4000
[ 1240.139275] RIP: 0010:[<ffffffff8132a224>]  [<ffffffff8132a224>]
full_proxy_release+0x24/0x90
[ 1240.139721] RSP: 0018:ffffc900006d7db8  EFLAGS: 00010286
[ 1240.140002] RAX: ffffffff8132a200 RBX: ffff88001fc3fa80 RCX: 0000000000000000
[ 1240.140369] RDX: ffff88001fc3fc08 RSI: ffff88001fc3fa80 RDI: ffff880015097bc0
[ 1240.140749] RBP: ffffc900006d7de0 R08: 0000000000000000 R09: 0000000000000000
[ 1240.141126] R10: ffff880015097bc0 R11: ffff88001fc3fa90 R12: ffffffffa03073c0
[ 1240.141494] R13: ffff88001506a7e0 R14: ffff88003ab0e300 R15: ffff88001506a7e0
[ 1240.141864] FS:  0000000000000000(0000) GS:ffffffff81c39000(0000)
knlGS:0000000000000000
[ 1240.142279] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1240.142577] CR2: ffffffffa0307430 CR3: 000000001fd97000 CR4: 00000000000006f0
[ 1240.142968] Stack:
[ 1240.143078]  ffff88001fc3fa80 0000000000000010 ffff880015097bc0
ffff8800369d68e0
[ 1240.143490]  ffff88001506a7e0 ffffc900006d7e28 ffffffff8122907f
ffff880015097bc0
[ 1240.143904]  ffff88001fc3fa90 ffff88003e166568 ffffffff81f09330
ffff88001fc3f540
[ 1240.144316] Call Trace:
[ 1240.144450]  [<ffffffff8122907f>] __fput+0xdf/0x1d0
[ 1240.144704]  [<ffffffff812291ae>] ____fput+0xe/0x10
[ 1240.144962]  [<ffffffff810b97de>] task_work_run+0x8e/0xc0
[ 1240.145243]  [<ffffffff8109b98e>] do_exit+0x2ae/0xae0
[ 1240.145507]  [<ffffffff8113927e>] ? __audit_syscall_entry+0xae/0x100
[ 1240.145840]  [<ffffffff810034da>] ? syscall_trace_enter+0x1ca/0x310
[ 1240.146164]  [<ffffffff8109c244>] do_group_exit+0x44/0xc0
[ 1240.146445]  [<ffffffff8109c2d4>] SyS_exit_group+0x14/0x20
[ 1240.146742]  [<ffffffff81003a61>] do_syscall_64+0x61/0x150
[ 1240.147049]  [<ffffffff817f1fc4>] entry_SYSCALL64_slow_path+0x25/0x25
[ 1240.147391] Code: 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5
41 57 41 56 4c 8b 76 28 41 55 4c 8b 6e 18 41 54 53 4d 8b a5 d8 00 00
00 48 89 f3 <49> 8b 44 24 70 48 85 c0 74 4e ff d0 41 89 c7 48 8b 43 28
48 85
[ 1240.148919] RIP  [<ffffffff8132a224>] full_proxy_release+0x24/0x90
[ 1240.149248]  RSP <ffffc900006d7db8>
[ 1240.149432] CR2: ffffffffa0307430
[ 1240.149609] ---[ end trace f22ae883fa3ea6b8 ]---
[ 1240.149922] Fixing recursive fault but reboot is needed!
