* mm: GPF in bdi_put
@ 2017-02-27 17:11 ` Dmitry Vyukov
  0 siblings, 0 replies; 18+ messages in thread
From: Dmitry Vyukov @ 2017-02-27 17:11 UTC (permalink / raw)
  To: Al Viro, linux-fsdevel, LKML, Jens Axboe, Andrew Morton,
	Tejun Heo, Jan Kara, Johannes Weiner, linux-mm, Andrey Ryabinin
  Cc: syzkaller

Hello,

The following program triggers GPF in bdi_put:
https://gist.githubusercontent.com/dvyukov/15b3e211f937ff6abc558724369066ce/raw/cc017edf57963e30175a6a6fe2b8d917f6e92899/gistfile1.txt

general protection fault: 0000 [#1] SMP KASAN
Modules linked in:
CPU: 0 PID: 2952 Comm: a.out Not tainted 4.10.0+ #229
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
task: ffff880063e72180 task.stack: ffff880064a78000
RIP: 0010:__read_once_size include/linux/compiler.h:247 [inline]
RIP: 0010:atomic_read arch/x86/include/asm/atomic.h:26 [inline]
RIP: 0010:refcount_sub_and_test include/linux/refcount.h:156 [inline]
RIP: 0010:refcount_dec_and_test include/linux/refcount.h:181 [inline]
RIP: 0010:kref_put include/linux/kref.h:71 [inline]
RIP: 0010:bdi_put+0x8b/0x1d0 mm/backing-dev.c:914
RSP: 0018:ffff880064a7f0b0 EFLAGS: 00010202
RAX: 0000000000000007 RBX: 0000000000000000 RCX: 0000000000000000
RDX: ffff880064a7f118 RSI: 0000000000000001 RDI: 0000000000000000
RBP: ffff880064a7f140 R08: ffff880065603280 R09: 0000000000000001
R10: 0000000000000000 R11: 0000000000000001 R12: dffffc0000000000
R13: 0000000000000038 R14: 1ffff1000c94fe17 R15: ffff880064a7f218
FS:  0000000000eb5880(0000) GS:ffff88006d000000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000020914ffa CR3: 000000006bc37000 CR4: 00000000001426f0
Call Trace:
 bdev_evict_inode+0x203/0x3a0 fs/block_dev.c:888
 evict+0x46e/0x980 fs/inode.c:553
 iput_final fs/inode.c:1515 [inline]
 iput+0x589/0xb20 fs/inode.c:1542
 dentry_unlink_inode+0x43b/0x600 fs/dcache.c:343
 __dentry_kill+0x34d/0x740 fs/dcache.c:538
 dentry_kill fs/dcache.c:579 [inline]
 dput.part.27+0x5ce/0x7c0 fs/dcache.c:791
 dput fs/dcache.c:753 [inline]
 do_one_tree+0x43/0x50 fs/dcache.c:1454
 shrink_dcache_for_umount+0xbb/0x2b0 fs/dcache.c:1468
 generic_shutdown_super+0xcd/0x4c0 fs/super.c:421
 kill_anon_super+0x3c/0x50 fs/super.c:988
 deactivate_locked_super+0x88/0xd0 fs/super.c:309
 deactivate_super+0x155/0x1b0 fs/super.c:340
 cleanup_mnt+0xb2/0x160 fs/namespace.c:1112
 __cleanup_mnt+0x16/0x20 fs/namespace.c:1119
 task_work_run+0x18a/0x260 kernel/task_work.c:116
 tracehook_notify_resume include/linux/tracehook.h:191 [inline]
 exit_to_usermode_loop+0x23b/0x2a0 arch/x86/entry/common.c:160
 prepare_exit_to_usermode arch/x86/entry/common.c:190 [inline]
 syscall_return_slowpath+0x4d3/0x570 arch/x86/entry/common.c:259
 entry_SYSCALL_64_fastpath+0xc0/0xc2
RIP: 0033:0x435e19
RSP: 002b:00007ffc9d7f2748 EFLAGS: 00000246 ORIG_RAX: 00000000000000a5
RAX: ffffffffffffffea RBX: 0100000000000000 RCX: 0000000000435e19
RDX: 0000000020063000 RSI: 0000000020914ffa RDI: 0000000020037000
RBP: 00007ffc9d7f2fe0 R08: 0000000020039000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 0000000000402b70 R14: 0000000000402c00 R15: 0000000000000000
Code: 04 f2 f2 f2 c7 40 08 f3 f3 f3 f3 e8 f0 ec de ff 48 8d 45 98 48
8b 95 70 ff ff ff 48 c1 e8 03 42 c6 04 20 04 4c 89 e8 48 c1 e8 03 <42>
0f b6 0c 20 4c 89 e8 83 e0 07 83 c0 03 38 c8 7c 08 84 c9 0f
RIP: __read_once_size include/linux/compiler.h:247 [inline] RSP: ffff880064a7f0b0
RIP: atomic_read arch/x86/include/asm/atomic.h:26 [inline] RSP: ffff880064a7f0b0
RIP: refcount_sub_and_test include/linux/refcount.h:156 [inline] RSP: ffff880064a7f0b0
RIP: refcount_dec_and_test include/linux/refcount.h:181 [inline] RSP: ffff880064a7f0b0
RIP: kref_put include/linux/kref.h:71 [inline] RSP: ffff880064a7f0b0
RIP: bdi_put+0x8b/0x1d0 mm/backing-dev.c:914 RSP: ffff880064a7f0b0
---[ end trace 8991b3d16ac9bf93 ]---

On commit e5d56efc97f8240d0b5d66c03949382b6d7e5570.
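
The register dump above can be read as a KASAN shadow lookup that went to a non-canonical address. The faulting instruction (the bytes after `<42>` in the Code line) loads a byte from RAX + R12, where R12 holds the x86_64 KASAN shadow offset and RAX is R13 shifted right by 3. The sketch below replays that arithmetic; the helper names are made up for illustration, and reading 0x38 as "a field at offset 0x38 of a NULL pointer" is an inference from the dump, not something stated in the report.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* x86_64 KASAN parameters; the offset equals the R12 value in the dump. */
#define KASAN_SHADOW_SCALE_SHIFT 3
#define KASAN_SHADOW_OFFSET 0xdffffc0000000000ULL

/* Address of the shadow byte KASAN reads before touching addr. */
uint64_t kasan_shadow_addr(uint64_t addr)
{
	return (addr >> KASAN_SHADOW_SCALE_SHIFT) + KASAN_SHADOW_OFFSET;
}

/* With 48-bit virtual addresses, bits 63..47 must be all 0 or all 1. */
bool is_canonical_48(uint64_t addr)
{
	uint64_t top = addr >> 47;

	return top == 0 || top == 0x1ffff;
}

/* Replays the arithmetic for the oops above (R13 = 0x38, RAX = 0x7). */
void check_first_oops(void)
{
	uint64_t shadow = kasan_shadow_addr(0x38);

	assert((0x38 >> KASAN_SHADOW_SCALE_SHIFT) == 0x7);  /* matches RAX */
	assert(shadow == 0xdffffc0000000007ULL);
	assert(!is_canonical_48(shadow));  /* non-canonical: #GP, not a page fault */
}
```

Calling `check_first_oops()` from a `main()` passes silently. A small address such as 0x38 maps to a shadow address just above the shadow offset, which is outside the canonical range, so the CPU raises a general protection fault before KASAN can print its usual report.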

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: mm: GPF in bdi_put
  2017-02-27 17:11 ` Dmitry Vyukov
@ 2017-02-27 17:14   ` Dmitry Vyukov
  -1 siblings, 0 replies; 18+ messages in thread
From: Dmitry Vyukov @ 2017-02-27 17:14 UTC (permalink / raw)
  To: Al Viro, linux-fsdevel, LKML, Jens Axboe, Andrew Morton,
	Tejun Heo, Jan Kara, Johannes Weiner, linux-mm, Andrey Ryabinin
  Cc: syzkaller

On Mon, Feb 27, 2017 at 6:11 PM, Dmitry Vyukov <dvyukov@google.com> wrote:
> Hello,
>
> The following program triggers GPF in bdi_put:
> https://gist.githubusercontent.com/dvyukov/15b3e211f937ff6abc558724369066ce/raw/cc017edf57963e30175a6a6fe2b8d917f6e92899/gistfile1.txt
>
> general protection fault: 0000 [#1] SMP KASAN
> Modules linked in:
> CPU: 0 PID: 2952 Comm: a.out Not tainted 4.10.0+ #229
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
> task: ffff880063e72180 task.stack: ffff880064a78000
> RIP: 0010:__read_once_size include/linux/compiler.h:247 [inline]
> RIP: 0010:atomic_read arch/x86/include/asm/atomic.h:26 [inline]
> RIP: 0010:refcount_sub_and_test include/linux/refcount.h:156 [inline]
> RIP: 0010:refcount_dec_and_test include/linux/refcount.h:181 [inline]
> RIP: 0010:kref_put include/linux/kref.h:71 [inline]
> RIP: 0010:bdi_put+0x8b/0x1d0 mm/backing-dev.c:914
> RSP: 0018:ffff880064a7f0b0 EFLAGS: 00010202
> RAX: 0000000000000007 RBX: 0000000000000000 RCX: 0000000000000000
> RDX: ffff880064a7f118 RSI: 0000000000000001 RDI: 0000000000000000
> RBP: ffff880064a7f140 R08: ffff880065603280 R09: 0000000000000001
> R10: 0000000000000000 R11: 0000000000000001 R12: dffffc0000000000
> R13: 0000000000000038 R14: 1ffff1000c94fe17 R15: ffff880064a7f218
> FS:  0000000000eb5880(0000) GS:ffff88006d000000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000000020914ffa CR3: 000000006bc37000 CR4: 00000000001426f0
> Call Trace:
>  bdev_evict_inode+0x203/0x3a0 fs/block_dev.c:888
>  evict+0x46e/0x980 fs/inode.c:553
>  iput_final fs/inode.c:1515 [inline]
>  iput+0x589/0xb20 fs/inode.c:1542
>  dentry_unlink_inode+0x43b/0x600 fs/dcache.c:343
>  __dentry_kill+0x34d/0x740 fs/dcache.c:538
>  dentry_kill fs/dcache.c:579 [inline]
>  dput.part.27+0x5ce/0x7c0 fs/dcache.c:791
>  dput fs/dcache.c:753 [inline]
>  do_one_tree+0x43/0x50 fs/dcache.c:1454
>  shrink_dcache_for_umount+0xbb/0x2b0 fs/dcache.c:1468
>  generic_shutdown_super+0xcd/0x4c0 fs/super.c:421
>  kill_anon_super+0x3c/0x50 fs/super.c:988
>  deactivate_locked_super+0x88/0xd0 fs/super.c:309
>  deactivate_super+0x155/0x1b0 fs/super.c:340
>  cleanup_mnt+0xb2/0x160 fs/namespace.c:1112
>  __cleanup_mnt+0x16/0x20 fs/namespace.c:1119
>  task_work_run+0x18a/0x260 kernel/task_work.c:116
>  tracehook_notify_resume include/linux/tracehook.h:191 [inline]
>  exit_to_usermode_loop+0x23b/0x2a0 arch/x86/entry/common.c:160
>  prepare_exit_to_usermode arch/x86/entry/common.c:190 [inline]
>  syscall_return_slowpath+0x4d3/0x570 arch/x86/entry/common.c:259
>  entry_SYSCALL_64_fastpath+0xc0/0xc2
> RIP: 0033:0x435e19
> RSP: 002b:00007ffc9d7f2748 EFLAGS: 00000246 ORIG_RAX: 00000000000000a5
> RAX: ffffffffffffffea RBX: 0100000000000000 RCX: 0000000000435e19
> RDX: 0000000020063000 RSI: 0000000020914ffa RDI: 0000000020037000
> RBP: 00007ffc9d7f2fe0 R08: 0000000020039000 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
> R13: 0000000000402b70 R14: 0000000000402c00 R15: 0000000000000000
> Code: 04 f2 f2 f2 c7 40 08 f3 f3 f3 f3 e8 f0 ec de ff 48 8d 45 98 48
> 8b 95 70 ff ff ff 48 c1 e8 03 42 c6 04 20 04 4c 89 e8 48 c1 e8 03 <42>
> 0f b6 0c 20 4c 89 e8 83 e0 07 83 c0 03 38 c8 7c 08 84 c9 0f
> RIP: __read_once_size include/linux/compiler.h:247 [inline] RSP:
> ffff880064a7f0b0
> RIP: atomic_read arch/x86/include/asm/atomic.h:26 [inline] RSP: ffff880064a7f0b0
> RIP: refcount_sub_and_test include/linux/refcount.h:156 [inline] RSP:
> ffff880064a7f0b0
> RIP: refcount_dec_and_test include/linux/refcount.h:181 [inline] RSP:
> ffff880064a7f0b0
> RIP: kref_put include/linux/kref.h:71 [inline] RSP: ffff880064a7f0b0
> RIP: bdi_put+0x8b/0x1d0 mm/backing-dev.c:914 RSP: ffff880064a7f0b0
> ---[ end trace 8991b3d16ac9bf93 ]---
>
> On commit e5d56efc97f8240d0b5d66c03949382b6d7e5570.


I also see the following WARNING. Do you think it's the same underlying bug?

------------[ cut here ]------------
WARNING: CPU: 1 PID: 24265 at mm/backing-dev.c:899 bdi_exit+0x13e/0x160 mm/backing-dev.c:899
Kernel panic - not syncing: panic_on_warn set ...

CPU: 1 PID: 24265 Comm: syz-executor3 Not tainted 4.10.0-next-20170227+ #1
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:15 [inline]
 dump_stack+0x2ee/0x3ef lib/dump_stack.c:51
 panic+0x1fb/0x412 kernel/panic.c:179
 __warn+0x1c4/0x1e0 kernel/panic.c:540
 warn_slowpath_null+0x2c/0x40 kernel/panic.c:583
 bdi_exit+0x13e/0x160 mm/backing-dev.c:899
 release_bdi+0x19/0x30 mm/backing-dev.c:908
 kref_put include/linux/kref.h:72 [inline]
 bdi_put+0x2a/0x40 mm/backing-dev.c:914
 bdev_evict_inode+0x203/0x3a0 fs/block_dev.c:888
 evict+0x46e/0x980 fs/inode.c:553
 iput_final fs/inode.c:1515 [inline]
 iput+0x589/0xb20 fs/inode.c:1542
 dentry_unlink_inode+0x43b/0x600 fs/dcache.c:343
 __dentry_kill+0x34d/0x740 fs/dcache.c:538
 dentry_kill fs/dcache.c:579 [inline]
 dput.part.27+0x5ce/0x7c0 fs/dcache.c:791
 dput fs/dcache.c:753 [inline]
 do_one_tree+0x43/0x50 fs/dcache.c:1454
 shrink_dcache_for_umount+0xbb/0x2b0 fs/dcache.c:1468
 generic_shutdown_super+0xcd/0x4c0 fs/super.c:421
 kill_anon_super+0x3c/0x50 fs/super.c:988
 deactivate_locked_super+0x88/0xd0 fs/super.c:309
 deactivate_super+0x155/0x1b0 fs/super.c:340
 cleanup_mnt+0xb2/0x160 fs/namespace.c:1112
 __cleanup_mnt+0x16/0x20 fs/namespace.c:1119
 task_work_run+0x18a/0x260 kernel/task_work.c:116
 tracehook_notify_resume include/linux/tracehook.h:191 [inline]
 exit_to_usermode_loop+0x23b/0x2a0 arch/x86/entry/common.c:160
 prepare_exit_to_usermode arch/x86/entry/common.c:190 [inline]
 syscall_return_slowpath+0x4d3/0x570 arch/x86/entry/common.c:259
 entry_SYSCALL_64_fastpath+0xc0/0xc2
RIP: 0033:0x44fb79
RSP: 002b:00007fd57a8a0b58 EFLAGS: 00000212 ORIG_RAX: 00000000000000a5
RAX: ffffffffffffffea RBX: 0000000000708000 RCX: 000000000044fb79
RDX: 00000000208cf000 RSI: 0000000020058ffd RDI: 0000000020fc2000
RBP: 00000000000002f7 R08: 0000000020691000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000212 R12: 0000000020fc2000
R13: 0000000020058ffd R14: 00000000208cf000 R15: 0000000000000000
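
The user-space register lines at the end of the trace can be decoded by hand: ORIG_RAX holds the syscall number and RAX the return value. The sketch below does that for the values shown (ORIG_RAX = 0xa5, RAX = ffffffffffffffea), which works out to a mount(2) call returning -EINVAL. The constant and function names are illustrative only, and the syscall number is the x86_64 one.

```c
#include <assert.h>
#include <stdint.h>

#define X86_64_NR_MOUNT 165  /* __NR_mount in the x86_64 syscall table */
#define LINUX_EINVAL 22

/* Reinterpret a 64-bit register value as a signed syscall return. */
int64_t reg_to_signed(uint64_t reg)
{
	if (reg <= (uint64_t)INT64_MAX)
		return (int64_t)reg;
	/* Two's complement by hand, avoiding an implementation-defined cast. */
	return -(int64_t)(~reg) - 1;
}

/* Checks the values from the trace above. */
void check_user_registers(void)
{
	assert(0xa5 == X86_64_NR_MOUNT);                               /* ORIG_RAX */
	assert(reg_to_signed(0xffffffffffffffeaULL) == -LINUX_EINVAL); /* RAX */
}
```

So the crash happens on the way back to user space from a mount() that had already failed, while the deferred cleanup (`task_work_run` → `cleanup_mnt`) tears down the superblock.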

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: mm: GPF in bdi_put
  2017-02-27 17:11 ` Dmitry Vyukov
@ 2017-02-27 18:27   ` Al Viro
  -1 siblings, 0 replies; 18+ messages in thread
From: Al Viro @ 2017-02-27 18:27 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: linux-fsdevel, LKML, Jens Axboe, Andrew Morton, Tejun Heo,
	Jan Kara, Johannes Weiner, linux-mm, Andrey Ryabinin, syzkaller

On Mon, Feb 27, 2017 at 06:11:11PM +0100, Dmitry Vyukov wrote:
> Hello,
> 
> The following program triggers GPF in bdi_put:
> https://gist.githubusercontent.com/dvyukov/15b3e211f937ff6abc558724369066ce/raw/cc017edf57963e30175a6a6fe2b8d917f6e92899/gistfile1.txt

What happens is
	* attempt of, essentially, mount -t bdev ..., calls mount_pseudo()
and then promptly destroys the new instance it has created.
	* the only inode created on that sucker (root directory, that
is) gets evicted.
	* most of ->evict_inode() is harmless, until it gets to
        if (bdev->bd_bdi != &noop_backing_dev_info)
                bdi_put(bdev->bd_bdi);

added there by "block: Make blk_get_backing_dev_info() safe without open bdev".
Since ->bd_bdi hadn't been initialized for that sucker (the same patch has
placed initialization into bdget()), we step into shit of varying nastiness,
depending on phase of moon, etc.

Could somebody explain WTF do we have those two lines in bdev_evict_inode(),
anyway?  We set ->bd_bdi to something other than noop_backing_dev_info only
in __blkdev_get() when ->bd_openers goes from zero to positive, so why is
the matching bdi_put() not in __blkdev_put()?  Jan?
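
The failure mode described above can be modelled in a few lines of userspace C. This is a sketch under stated assumptions, not the kernel's code: every `model_*` name is hypothetical, the never-initialized pointer is assumed to be NULL (zero-filled memory), and the NULL check in `model_evict()` exists only so the model can report the problem instead of crashing.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures; only the fields used here. */
struct model_bdi { int refcnt; };
struct model_bdev { struct model_bdi *bd_bdi; };

/* Plays the role of noop_backing_dev_info: a sentinel that is never put. */
struct model_bdi model_noop_bdi = { .refcnt = 1 };

enum evict_result { EVICT_SKIPPED_PUT, EVICT_PUT_OK, EVICT_PUT_ON_BAD_POINTER };

/* Setup path that installs the sentinel (bdget() in the description above). */
void model_bdget_init(struct model_bdev *bdev)
{
	bdev->bd_bdi = &model_noop_bdi;
}

/* First open: take a reference on a real bdi and publish it. */
void model_first_open(struct model_bdev *bdev, struct model_bdi *bdi)
{
	bdi->refcnt++;
	bdev->bd_bdi = bdi;
}

/*
 * Same shape as the two quoted lines, except that a NULL pointer is
 * reported instead of dereferenced (the quoted kernel code has no such check).
 */
enum evict_result model_evict(struct model_bdev *bdev)
{
	if (bdev->bd_bdi == &model_noop_bdi)
		return EVICT_SKIPPED_PUT;
	if (bdev->bd_bdi == NULL)
		return EVICT_PUT_ON_BAD_POINTER;
	assert(bdev->bd_bdi->refcnt > 0);
	bdev->bd_bdi->refcnt--;
	return EVICT_PUT_OK;
}
```

An object that never went through `model_bdget_init()` (the root inode of the pseudo filesystem, in the thread's terms) fails the "is it the sentinel?" test and falls through to the put, which is the `EVICT_PUT_ON_BAD_POINTER` case. Objects that were set up through the normal path either skip the put or drop a reference that a matching open had taken.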

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: mm: GPF in bdi_put
  2017-02-27 18:27   ` Al Viro
@ 2017-02-28 17:55     ` Dmitry Vyukov
  -1 siblings, 0 replies; 18+ messages in thread
From: Dmitry Vyukov @ 2017-02-28 17:55 UTC (permalink / raw)
  To: Al Viro
  Cc: linux-fsdevel, LKML, Jens Axboe, Andrew Morton, Tejun Heo,
	Jan Kara, Johannes Weiner, linux-mm, Andrey Ryabinin, syzkaller

On Mon, Feb 27, 2017 at 7:27 PM, Al Viro <viro@zeniv.linux.org.uk> wrote:
> On Mon, Feb 27, 2017 at 06:11:11PM +0100, Dmitry Vyukov wrote:
>> Hello,
>>
>> The following program triggers GPF in bdi_put:
>> https://gist.githubusercontent.com/dvyukov/15b3e211f937ff6abc558724369066ce/raw/cc017edf57963e30175a6a6fe2b8d917f6e92899/gistfile1.txt
>
> What happens is
>         * attempt of, essentially, mount -t bdev ..., calls mount_pseudo()
> and then promptly destroys the new instance it has created.
>         * the only inode created on that sucker (root directory, that
> is) gets evicted.
>         * most of ->evict_inode() is harmless, until it gets to
>         if (bdev->bd_bdi != &noop_backing_dev_info)
>                 bdi_put(bdev->bd_bdi);
>
> added there by "block: Make blk_get_backing_dev_info() safe without open bdev".
> Since ->bd_bdi hadn't been initialized for that sucker (the same patch has
> placed initialization into bdget()), we step into shit of varying nastiness,
> depending on phase of moon, etc.
>
> Could somebody explain WTF do we have those two lines in bdev_evict_inode(),
> anyway?  We set ->bd_bdi to something other than noop_backing_dev_info only
> in __blkdev_get() when ->bd_openers goes from zero to positive, so why is
> the matching bdi_put() not in __blkdev_put()?  Jan?


I am also seeing the following crashes on
linux-next/8d01c069486aca75b8f6018a759215b0ed0c91f0. Do you think it's
the same underlying issue?

kasan: GPF could be caused by NULL-ptr deref or user memory access
general protection fault: 0000 [#1] SMP KASAN
Dumping ftrace buffer:
   (ftrace buffer empty)
Modules linked in:
CPU: 0 PID: 19552 Comm: syz-executor2 Not tainted 4.10.0-next-20170228+ #2
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
task: ffff8801c16ae400 task.stack: ffff880154c98000
RIP: 0010:__read_once_size include/linux/compiler.h:254 [inline]
RIP: 0010:atomic_read arch/x86/include/asm/atomic.h:26 [inline]
RIP: 0010:refcount_sub_and_test+0x82/0x1f0 lib/refcount.c:120
RSP: 0018:ffff880154c9f078 EFLAGS: 00010202
RAX: 0000000000000007 RBX: dffffc0000000000 RCX: ffffc90001a8f000
RDX: 0000000000000740 RSI: ffffffff8246160f RDI: 0000000000000001
RBP: ffff880154c9f110 R08: ffffe8ffffc29a28 R09: 0000000000000001
R10: 1ffff1002a993dcc R11: 0000000000000001 R12: 0000000000000038
R13: 0000000000000001 R14: ffff880154c9f0e8 R15: 1ffff1002a993e11
FS:  00007f0335223700(0000) GS:ffff8801dbe00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000020fd3ff8 CR3: 00000001c4580000 CR4: 00000000001406f0
DR0: 0000000020000000 DR1: 0000000020001000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600
Call Trace:
 refcount_dec_and_test+0x1a/0x20 lib/refcount.c:153
 kref_put include/linux/kref.h:71 [inline]
 bdi_put+0x19/0x40 mm/backing-dev.c:914
 bdev_evict_inode+0x203/0x3a0 fs/block_dev.c:888
 evict+0x46e/0x980 fs/inode.c:553
 iput_final fs/inode.c:1515 [inline]
 iput+0x589/0xb20 fs/inode.c:1542
 dentry_unlink_inode+0x43b/0x600 fs/dcache.c:343
 __dentry_kill+0x34d/0x740 fs/dcache.c:538
 dentry_kill fs/dcache.c:579 [inline]
 dput.part.27+0x5ce/0x7c0 fs/dcache.c:791
 dput fs/dcache.c:753 [inline]
 do_one_tree+0x43/0x50 fs/dcache.c:1454
 shrink_dcache_for_umount+0xbb/0x2b0 fs/dcache.c:1468
 generic_shutdown_super+0xcd/0x4c0 fs/super.c:421
 kill_anon_super+0x3c/0x50 fs/super.c:988
 deactivate_locked_super+0x88/0xd0 fs/super.c:309
 deactivate_super+0x155/0x1b0 fs/super.c:340
 cleanup_mnt+0xb2/0x160 fs/namespace.c:1112
 __cleanup_mnt+0x16/0x20 fs/namespace.c:1119
 task_work_run+0x18a/0x260 kernel/task_work.c:116
 tracehook_notify_resume include/linux/tracehook.h:191 [inline]
 exit_to_usermode_loop+0x23b/0x2a0 arch/x86/entry/common.c:160
 prepare_exit_to_usermode arch/x86/entry/common.c:190 [inline]
 syscall_return_slowpath+0x4d3/0x570 arch/x86/entry/common.c:259
 entry_SYSCALL_64_fastpath+0xc0/0xc2
RIP: 0033:0x44fb79
RSP: 002b:00007f0335222b58 EFLAGS: 00000212 ORIG_RAX: 00000000000000a5
RAX: ffffffffffffffea RBX: 0000000000708150 RCX: 000000000044fb79
RDX: 000000002064e000 RSI: 00000000208f8ff8 RDI: 0000000020b28ff8
RBP: 00000000000002f7 R08: 0000000000000000 R09: 0000000000000000
R10: 8000000000000001 R11: 0000000000000212 R12: 0000000020b28ff8
R13: 00000000208f8ff8 R14: 000000002064e000 R15: 0000000000000000
Code: 00 f1 f1 f1 f1 c7 40 04 04 f2 f2 f2 c7 40 08 f3 f3 f3 f3 e8 71
02 2d ff 48 8d 45 98 48 c1 e8 03 c6 04 18 04 4c 89 e0 48 c1 e8 03 <0f>
b6 14 18 4c 89 e0 83 e0 07 83 c0 03 38 d0 7c 08 84 d2 0f 85
RIP: __read_once_size include/linux/compiler.h:254 [inline] RSP: ffff880154c9f078
RIP: atomic_read arch/x86/include/asm/atomic.h:26 [inline] RSP: ffff880154c9f078
RIP: refcount_sub_and_test+0x82/0x1f0 lib/refcount.c:120 RSP: ffff880154c9f078
---[ end trace 3457479bd0ed5045 ]---

* Re: mm: GPF in bdi_put
  2017-02-28 17:55     ` Dmitry Vyukov
@ 2017-02-28 18:23       ` Al Viro
  -1 siblings, 0 replies; 18+ messages in thread
From: Al Viro @ 2017-02-28 18:23 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: linux-fsdevel, LKML, Jens Axboe, Andrew Morton, Tejun Heo,
	Jan Kara, Johannes Weiner, linux-mm, Andrey Ryabinin, syzkaller

On Tue, Feb 28, 2017 at 06:55:55PM +0100, Dmitry Vyukov wrote:

> I am also seeing the following crashes on
> linux-next/8d01c069486aca75b8f6018a759215b0ed0c91f0. Do you think it's
> the same underlying issue?

Yes.
	1) Any attempt of mount -t bdev will fail, as it should
	2) bdevfs instance created by that attempt will be immediately
destroyed (again, as it should)
	3) the sole inode ever created for that instance (its root directory)
will be destroyed in process (again, as it should)
	4) that inode has never had ->bd_bdi initialized - the value stored
there would have been whatever garbage kmem_cache_alloc() has left behind
	5) bdev_evict_inode() will be called for that inode and if
aforementioned garbage happens to be not equal to &noop_backing_dev_info,
the pointer will be passed to bdi_put().

If that inode happens to reuse the memory previously occupied by a bdev
inode of a looked up but never opened block device, it will have ->bd_bdi
still equal to &noop_backing_dev_info, so that crap does not trigger
every time.  That's what the junk (recvmsg/ioctl/etc.) in your reproducer
is affecting.  Specific effects of bdi_put() will, of course, depend upon
the actual garbage found there - silent decrement of refcount of an existing
bdi setting the things up for later use-after-free, outright memory
corruption, etc.
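
The sequence above can be modelled in user space. What follows is a minimal sketch, not kernel code: `struct bdi`, `struct bdev`, `noop_bdi`, the one-slot `slot` cache and every `model_*` function are hypothetical stand-ins invented for illustration. The slot is handed back exactly as its previous user left it, which is the assumption that makes the outcome depend on allocation history.

```c
#include <assert.h>

/* Hypothetical user-space stand-ins; none of these are the kernel's types. */
struct bdi { int refcount; };
struct bdev { struct bdi *bd_bdi; };

/* Plays the role of noop_backing_dev_info: a sentinel that is never put. */
struct bdi noop_bdi = { 1 };

/* One-slot "slab": memory comes back exactly as its last user left it. */
static struct bdev slot;

/* Models inode allocation on the bdev superblock: no field is reset. */
struct bdev *model_alloc(void)
{
	return &slot;
}

/* Models bdget(): before the fix, the only path that stores the sentinel. */
struct bdev *model_bdget(void)
{
	struct bdev *b = model_alloc();

	b->bd_bdi = &noop_bdi;
	return b;
}

/* Models the first open: take a reference on the device's bdi. */
void model_open(struct bdev *b, struct bdi *bdi)
{
	assert(bdi->refcount > 0);
	bdi->refcount++;
	b->bd_bdi = bdi;
}

/*
 * Models the pre-fix check in bdev_evict_inode(). Returns 1 when the
 * stand-in for bdi_put() ran; bd_bdi is left untouched afterwards.
 */
int model_evict(struct bdev *b)
{
	if (b->bd_bdi != &noop_bdi) {
		b->bd_bdi->refcount--;
		return 1;
	}
	return 0;
}
```

If the slot was last used by a looked-up but never opened bdev, a root inode obtained through `model_alloc()` still carries the sentinel and `model_evict()` does nothing. If the previous user had opened a device, the stale pointer survives and the root inode's eviction drops a reference it never took.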

_Any_ stack trace of form sys_mount() -> ... -> bdev_evict_inode() ->
bdi_put() -> <barf>  is almost certainly the same bug.

I would still like to hear from Jan regarding the reasons why we do that
bdi_put() from bdev_evict_inode() and not in __blkdev_put().  My preference
would be to do it there (and reset ->bd_bdi to &noop_backing_dev_info) when
->bd_openers hits 0.  And drop that code from bdev_evict_inode()...

Objections?

* Re: mm: GPF in bdi_put
  2017-02-27 18:27   ` Al Viro
@ 2017-03-01 14:29     ` Jan Kara
  -1 siblings, 0 replies; 18+ messages in thread
From: Jan Kara @ 2017-03-01 14:29 UTC (permalink / raw)
  To: Al Viro
  Cc: Dmitry Vyukov, linux-fsdevel, LKML, Jens Axboe, Andrew Morton,
	Tejun Heo, Jan Kara, Johannes Weiner, linux-mm, Andrey Ryabinin,
	syzkaller

On Mon 27-02-17 18:27:55, Al Viro wrote:
> On Mon, Feb 27, 2017 at 06:11:11PM +0100, Dmitry Vyukov wrote:
> > Hello,
> > 
> > The following program triggers GPF in bdi_put:
> > https://gist.githubusercontent.com/dvyukov/15b3e211f937ff6abc558724369066ce/raw/cc017edf57963e30175a6a6fe2b8d917f6e92899/gistfile1.txt
> 
> What happens is
> 	* attempt of, essentially, mount -t bdev ..., calls mount_pseudo()
> and then promptly destroys the new instance it has created.
> 	* the only inode created on that sucker (root directory, that
> is) gets evicted.
> 	* most of ->evict_inode() is harmless, until it gets to
>         if (bdev->bd_bdi != &noop_backing_dev_info)
>                 bdi_put(bdev->bd_bdi);

Thanks for the analysis!

> added there by "block: Make blk_get_backing_dev_info() safe without open bdev".
> Since ->bd_bdi hadn't been initialized for that sucker (the same patch has
> placed initialization into bdget()), we step into shit of varying nastiness,
> depending on phase of moon, etc.

Yup, I've missed that the root inode of bdev superblock does not go through
bdget() (in fact I didn't think what happens with root inode for bdev
superblock at all) and thus bd_bdi is left uninitialized in that case. I'll
send a fix for that in a while.
 
> Could somebody explain WTF do we have those two lines in bdev_evict_inode(),
> anyway?  We set ->bd_bdi to something other than noop_backing_dev_info only
> in __blkdev_get() when ->bd_openers goes from zero to positive, so why is
> the matching bdi_put() not in __blkdev_put()?  Jan?

The problem is writeback code (from flusher work or through sync(2) -
generally inode_to_bdi() users) can be looking at bdev inode independently
from it being open. So if they start looking while the bdev is open but the
dereference happens after it is closed and device removed, we oops. We have
seen oopses due to this for quite a while. And all the stuff that is done
in __blkdev_put() is not enough to prevent writeback code from having a
look whether there is not something to write.

So what we do now is that once we establish valid bd_bdi reference, we
leave it alone until bdev inode gets evicted. And to handle the case when
underlying device actually changes, we unhash bdev inode when the device
gets removed from the system so that it cannot be found by bdget() anymore.
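
The lifetime rule described above (take the reference on first open, keep it across close, drop it only at eviction) can be sketched in user space. This is an illustrative model under stated assumptions, not the kernel implementation: all `model_*` functions, `noop_bdi` and the field layout are hypothetical, and the model is single-threaded, so it ignores the locking the real code needs.

```c
#include <assert.h>

/* Hypothetical stand-ins for the kernel structures. */
struct bdi { int refcount; unsigned long nr_dirty; };
struct bdev { struct bdi *bd_bdi; int bd_openers; };

/* Plays the role of noop_backing_dev_info. */
struct bdi noop_bdi = { 1, 0 };

/* Models __blkdev_get(): the first opener pins the device's bdi. */
void model_blkdev_get(struct bdev *b, struct bdi *disk_bdi)
{
	if (b->bd_openers == 0 && b->bd_bdi == &noop_bdi) {
		disk_bdi->refcount++;
		b->bd_bdi = disk_bdi;
	}
	b->bd_openers++;
}

/* Models __blkdev_put(): the last close deliberately leaves bd_bdi alone. */
void model_blkdev_put(struct bdev *b)
{
	assert(b->bd_openers > 0);
	b->bd_openers--;
}

/* Models an inode_to_bdi() user such as writeback; needs no open bdev. */
unsigned long model_writeback_peek(const struct bdev *b)
{
	return b->bd_bdi->nr_dirty;
}

/* Models bdev_evict_inode(): the only place the reference is dropped. */
void model_evict(struct bdev *b)
{
	if (b->bd_bdi != &noop_bdi) {
		b->bd_bdi->refcount--;
		b->bd_bdi = &noop_bdi;
	}
}
```

In this model a call to `model_writeback_peek()` after the last `model_blkdev_put()` still dereferences a pinned bdi, which is the property the design is after. Dropping the reference in `model_blkdev_put()` instead would leave that reader racing with the release.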

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

* Re: mm: GPF in bdi_put
  2017-03-01 14:29     ` Jan Kara
@ 2017-03-01 15:05       ` Jan Kara
  -1 siblings, 0 replies; 18+ messages in thread
From: Jan Kara @ 2017-03-01 15:05 UTC (permalink / raw)
  To: Al Viro
  Cc: Dmitry Vyukov, linux-fsdevel, LKML, Jens Axboe, Andrew Morton,
	Tejun Heo, Jan Kara, Johannes Weiner, linux-mm, Andrey Ryabinin,
	syzkaller

[-- Attachment #1: Type: text/plain, Size: 2631 bytes --]

On Wed 01-03-17 15:29:09, Jan Kara wrote:
> On Mon 27-02-17 18:27:55, Al Viro wrote:
> > On Mon, Feb 27, 2017 at 06:11:11PM +0100, Dmitry Vyukov wrote:
> > > Hello,
> > > 
> > > The following program triggers GPF in bdi_put:
> > > https://gist.githubusercontent.com/dvyukov/15b3e211f937ff6abc558724369066ce/raw/cc017edf57963e30175a6a6fe2b8d917f6e92899/gistfile1.txt
> > 
> > What happens is
> > 	* attempt of, essentially, mount -t bdev ..., calls mount_pseudo()
> > and then promptly destroys the new instance it has created.
> > 	* the only inode created on that sucker (root directory, that
> > is) gets evicted.
> > 	* most of ->evict_inode() is harmless, until it gets to
> >         if (bdev->bd_bdi != &noop_backing_dev_info)
> >                 bdi_put(bdev->bd_bdi);
> 
> Thanks for the analysis!
> 
> > added there by "block: Make blk_get_backing_dev_info() safe without open bdev".
> > Since ->bd_bdi hadn't been initialized for that sucker (the same patch has
> > placed initialization into bdget()), we step into shit of varying nastiness,
> > depending on phase of moon, etc.
> 
> Yup, I've missed that the root inode of bdev superblock does not go through
> bdget() (in fact I didn't think what happens with root inode for bdev
> superblock at all) and thus bd_bdi is left uninitialized in that case. I'll
> send a fix for that in a while.
>  
> > Could somebody explain WTF do we have those two lines in bdev_evict_inode(),
> > anyway?  We set ->bd_bdi to something other than noop_backing_dev_info only
> > in __blkdev_get() when ->bd_openers goes from zero to positive, so why is
> > the matching bdi_put() not in __blkdev_put()?  Jan?
> 
> The problem is writeback code (from flusher work or through sync(2) -
> generally inode_to_bdi() users) can be looking at bdev inode independently
> from it being open. So if they start looking while the bdev is open but the
> dereference happens after it is closed and device removed, we oops. We have
> seen oopses due to this for quite a while. And all the stuff that is done
> in __blkdev_put() is not enough to prevent writeback code from having a
> look whether there is not something to write.
> 
> So what we do now is that once we establish valid bd_bdi reference, we
> leave it alone until bdev inode gets evicted. And to handle the case when
> underlying device actually changes, we unhash bdev inode when the device
> gets removed from the system so that it cannot be found by bdget() anymore.

Attached patch fixes the problem for me. I'll post it officially tomorrow
once Al has a chance to reply...

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

[-- Attachment #2: 0001-block-Initialize-bd_bdi-on-inode-initialization.patch --]
[-- Type: text/x-patch, Size: 2034 bytes --]

From a533c8dd1fb4dbf840cd3adaf68afb6ad6851ddc Mon Sep 17 00:00:00 2001
From: Jan Kara <jack@suse.cz>
Date: Wed, 1 Mar 2017 15:31:11 +0100
Subject: [PATCH] block: Initialize bd_bdi on inode initialization

So far we initialized bd_bdi only in bdget(). That is fine for normal
bdev inodes however for the special case of the root inode of
blockdev_superblock that function is never called and thus bd_bdi is
left uninitialized. As a result bdev_evict_inode() may oops doing
bdi_put(root->bd_bdi) on that inode as can be seen when doing:

mount -t bdev none /mnt

Fix the problem by initializing bd_bdi when first allocating the inode
and then reinitializing bd_bdi in bdev_evict_inode().

Thanks to syzkaller team for finding the problem.

Reported-by: Dmitry Vyukov <dvyukov@google.com>
Fixes: b1d2dc5659b41741f5a29b2ade76ffb4e5bb13d8
Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/block_dev.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/fs/block_dev.c b/fs/block_dev.c
index 77c30f15a02c..2eca00ec4370 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -870,6 +870,7 @@ static void init_once(void *foo)
 #ifdef CONFIG_SYSFS
 	INIT_LIST_HEAD(&bdev->bd_holder_disks);
 #endif
+	bdev->bd_bdi = &noop_backing_dev_info;
 	inode_init_once(&ei->vfs_inode);
 	/* Initialize mutex for freeze. */
 	mutex_init(&bdev->bd_fsfreeze_mutex);
@@ -884,8 +885,10 @@ static void bdev_evict_inode(struct inode *inode)
 	spin_lock(&bdev_lock);
 	list_del_init(&bdev->bd_list);
 	spin_unlock(&bdev_lock);
-	if (bdev->bd_bdi != &noop_backing_dev_info)
+	if (bdev->bd_bdi != &noop_backing_dev_info) {
 		bdi_put(bdev->bd_bdi);
+		bdev->bd_bdi = &noop_backing_dev_info;
+	}
 }
 
 static const struct super_operations bdev_sops = {
@@ -988,7 +991,6 @@ struct block_device *bdget(dev_t dev)
 		bdev->bd_contains = NULL;
 		bdev->bd_super = NULL;
 		bdev->bd_inode = inode;
-		bdev->bd_bdi = &noop_backing_dev_info;
 		bdev->bd_block_size = i_blocksize(inode);
 		bdev->bd_part_count = 0;
 		bdev->bd_invalidated = 0;
-- 
2.10.2
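
The idea behind the patch can be illustrated with a small user-space model. It is a sketch under assumptions, not the patched kernel code: the one-slot cache, `slot_constructed`, `noop_bdi` and the `model_*` functions are hypothetical names. The constructor stores the sentinel once per object, and eviction restores it, so every object handed out by the cache satisfies the same invariant regardless of who used it last.

```c
#include <assert.h>

/* Hypothetical stand-ins for the kernel structures. */
struct bdi { int refcount; };
struct bdev { struct bdi *bd_bdi; };

/* Plays the role of noop_backing_dev_info. */
struct bdi noop_bdi = { 1 };

/* One-slot "slab" whose constructor runs once, when the object is created. */
static struct bdev slot;
static int slot_constructed;

/* Models init_once() after the patch: the sentinel is stored up front. */
void model_init_once(struct bdev *b)
{
	b->bd_bdi = &noop_bdi;
}

/* Models allocation: recycled objects are returned without re-running it. */
struct bdev *model_alloc(void)
{
	if (!slot_constructed) {
		model_init_once(&slot);
		slot_constructed = 1;
	}
	return &slot;
}

/* Models the first open: take a reference on the device's bdi. */
void model_open(struct bdev *b, struct bdi *bdi)
{
	bdi->refcount++;
	b->bd_bdi = bdi;
}

/* Models the patched bdev_evict_inode(); returns 1 if a put happened. */
int model_evict(struct bdev *b)
{
	if (b->bd_bdi != &noop_bdi) {
		assert(b->bd_bdi->refcount > 0);
		b->bd_bdi->refcount--;
		b->bd_bdi = &noop_bdi;	/* restore the constructor's invariant */
		return 1;
	}
	return 0;
}
```

With both halves in place, an object that never goes through a lookup path (the root inode in the report) is evicted without touching any bdi, whether or not the slot previously belonged to an opened device.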

* Re: mm: GPF in bdi_put
  2017-03-01 14:29     ` Jan Kara
@ 2017-03-02 11:44       ` Al Viro
  -1 siblings, 0 replies; 18+ messages in thread
From: Al Viro @ 2017-03-02 11:44 UTC (permalink / raw)
  To: Jan Kara
  Cc: Dmitry Vyukov, linux-fsdevel, LKML, Jens Axboe, Andrew Morton,
	Tejun Heo, Johannes Weiner, linux-mm, Andrey Ryabinin, syzkaller

On Wed, Mar 01, 2017 at 03:29:09PM +0100, Jan Kara wrote:

> The problem is writeback code (from flusher work or through sync(2) -
> generally inode_to_bdi() users) can be looking at bdev inode independently
> from it being open. So if they start looking while the bdev is open but the
> dereference happens after it is closed and device removed, we oops. We have
> seen oopses due to this for quite a while. And all the stuff that is done
> in __blkdev_put() is not enough to prevent writeback code from having a
> look whether there is not something to write.

Um.  What's to prevent the queue/device/module itself from disappearing
from under you?  IOW, what are you doing that is safe to do in face of
driver going rmmoded?

* Re: mm: GPF in bdi_put
  2017-03-02 11:44       ` Al Viro
@ 2017-03-02 12:20         ` Jan Kara
  -1 siblings, 0 replies; 18+ messages in thread
From: Jan Kara @ 2017-03-02 12:20 UTC (permalink / raw)
  To: Al Viro
  Cc: Jan Kara, Dmitry Vyukov, linux-fsdevel, LKML, Jens Axboe,
	Andrew Morton, Tejun Heo, Johannes Weiner, linux-mm,
	Andrey Ryabinin, syzkaller

On Thu 02-03-17 11:44:53, Al Viro wrote:
> On Wed, Mar 01, 2017 at 03:29:09PM +0100, Jan Kara wrote:
> 
> > The problem is writeback code (from flusher work or through sync(2) -
> > generally inode_to_bdi() users) can be looking at bdev inode independently
> > from it being open. So if they start looking while the bdev is open but the
> > dereference happens after it is closed and device removed, we oops. We have
> > seen oopses due to this for quite a while. And all the stuff that is done
> > in __blkdev_put() is not enough to prevent writeback code from having a
> > look whether there is not something to write.
> 
> Um.  What's to prevent the queue/device/module itself from disappearing
> from under you?  IOW, what are you doing that is safe to do in face of
> driver going rmmoded?

So the BDI does not have a direct relation to the device itself. It is an
abstraction for some of the device's properties / functionality and thus it
can live even after the device itself has gone away and the module has been
removed. All that users of a bdi want from it is congestion state, various
statistics, and dirty inode tracking for writeback purposes, and all of that
is independent of the particular device or whether it still exists.

Technically there may be pointers bdi->dev, bdi->owner to the device which
are properly refcounted (so the device structure or module cannot be
removed under us). These references get dropped & cleared in
bdi_unregister() generally called from blk_cleanup_queue() (will be moved
to del_gendisk() soon) when the device is going away. This can happen while
e.g. bdev still references the bdi so users of bdi->dev or bdi->owner have
to be careful to synchronize against device removal and bdi_unregister() but
there are only very few such users.
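
The split between the bdi's own lifetime and its pointer back to the device can be sketched as follows. This is a single-threaded user-space model with hypothetical names (`struct device` here is a toy type, and the `model_*` functions are invented for illustration); the real code additionally has to synchronize against concurrent device removal, which the model leaves out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; the real structures carry far more state. */
struct device { int refcount; const char *name; };
struct bdi { int refcount; struct device *dev; unsigned long nr_dirty; };

/* Models registration: the bdi pins the device it describes. */
void model_bdi_register(struct bdi *bdi, struct device *dev)
{
	dev->refcount++;
	bdi->dev = dev;
}

/* Models bdi_unregister(): drop and clear the device reference. */
void model_bdi_unregister(struct bdi *bdi)
{
	if (bdi->dev) {
		assert(bdi->dev->refcount > 0);
		bdi->dev->refcount--;
		bdi->dev = NULL;
	}
}

/* Statistics live in the bdi itself and never touch bdi->dev. */
unsigned long model_bdi_dirty(const struct bdi *bdi)
{
	return bdi->nr_dirty;
}

/* A bdi->dev user must cope with the device having gone away. */
const char *model_bdi_dev_name(const struct bdi *bdi)
{
	return bdi->dev ? bdi->dev->name : NULL;
}
```

After `model_bdi_unregister()` the bdi's own refcount is unchanged, so a bdev that still points at it can keep reading statistics, while anything that goes through `bdi->dev` has to handle the NULL case.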

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR
