GPF in __mark_inode_dirty due to locked_inode_to_wb_and_lock_list returning NULL

* GPF in __mark_inode_dirty due to locked_inode_to_wb_and_lock_list returning NULL
@ 2016-06-30 11:18 Nikolay Borisov
  2016-07-01 10:00 ` Jan Kara
From: Nikolay Borisov @ 2016-06-30 11:18 UTC
  To: Tejun Heo, Jan Kara
  Cc: linux-kernel@vger.kernel.org, dvyukov, linux-btrfs,
	SiteGround Operations

Hello, 

In light of the discussion in https://patchwork.kernel.org/patch/9187411/ and 
the discussion at https://groups.google.com/forum/#!topic/syzkaller/XvxH3cBQ134

I think the following might be related:

[1416412.898946] BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
[1416412.899217] IP: [<ffffffff811cab1d>] __mark_inode_dirty+0x21d/0x410
[1416412.899438] PGD 0 
[1416412.899647] Oops: 0000 [#1] SMP 
[1416412.903807] CPU: 2 PID: 11154 Comm: umount Tainted: P        W  O    4.4.9-clouder1 #20
[1416412.903980] Hardware name: Supermicro X9DRD-7LN4F(-JBOD)/X9DRD-EF/X9DRD-7LN4F, BIOS 3.0a 12/05/2013
[1416412.904270] task: ffff880466f18000 ti: ffff8802651d0000 task.ti: ffff8802651d0000
[1416412.907150] RIP: 0010:[<ffffffff811cab1d>]  [<ffffffff811cab1d>] __mark_inode_dirty+0x21d/0x410
[1416412.907487] RSP: 0018:ffff8802651d3828  EFLAGS: 00010282
[1416412.907656] RAX: 0000000000000000 RBX: ffff8801f71c3c48 RCX: 000000000000001a
[1416412.907829] RDX: 0000000000000000 RSI: 0000000000000004 RDI: ffff88034a44c398
[1416412.908001] RBP: ffff8802651d38d8 R08: 0000000000000000 R09: 0000000000000000
[1416412.908173] R10: 0000000000000000 R11: ffff88020e17d468 R12: 0000000000000000
[1416412.908343] R13: ffff88034a44c340 R14: 0000000000000000 R15: 0000000000000000
[1416412.908516] FS:  00007fd09f80f740(0000) GS:ffff88047fc40000(0000) knlGS:0000000000000000
[1416412.908690] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[1416412.908859] CR2: 0000000000000018 CR3: 00000004671da000 CR4: 00000000000406e0
[1416412.909030] Stack:
[1416412.909192]  ffff8802651d38d8 ffffffffa0912043 ffff8802651d38e8 ffffffffa0911de3
[1416412.909551]  ffff8802651d38a4 ffff8802a41fcfa0 0000000000000000 0000000000000003
[1416412.909908]  ffff8801f71c3c48 ffff88046f029000 ffff8802651d38d8 ffffffff8113d2ed
[1416412.910263] Call Trace:
[1416412.910455]  [<ffffffffa0912043>] ? __set_extent_bit+0x393/0x5e0 [btrfs]
[1416412.910642]  [<ffffffffa0911de3>] ? __set_extent_bit+0x133/0x5e0 [btrfs]
[1416412.910816]  [<ffffffff8113d2ed>] ? account_page_dirtied+0xed/0x1b0
[1416412.910987]  [<ffffffff8113d486>] __set_page_dirty_nobuffers+0xd6/0x150
[1416412.911173]  [<ffffffffa08f2a0e>] btrfs_set_page_dirty+0xe/0x10 [btrfs]
[1416412.911346]  [<ffffffff81139c61>] set_page_dirty+0x41/0x70
[1416412.911530]  [<ffffffffa090376a>] btrfs_dirty_pages+0x7a/0xb0 [btrfs]
[1416412.911718]  [<ffffffffa093a263>] __btrfs_write_out_cache+0x383/0x430 [btrfs]
[1416412.911903]  [<ffffffffa08d360e>] ? btrfs_free_reserved_data_space_noquota+0x5e/0x130 [btrfs]
[1416412.912208]  [<ffffffffa093be8f>] btrfs_write_out_cache+0xaf/0x120 [btrfs]
[1416412.912391]  [<ffffffffa08dbb7f>] btrfs_start_dirty_block_groups+0x24f/0x490 [btrfs]
[1416412.912566]  [<ffffffff8107adb2>] ? __might_sleep+0x52/0x90
[1416412.912750]  [<ffffffffa08efe63>] btrfs_commit_transaction+0x163/0xb70 [btrfs]
[1416412.912933]  [<ffffffffa08f0ccd>] ? start_transaction+0x9d/0x4e0 [btrfs]
[1416412.913119]  [<ffffffffa090bf4b>] ? btrfs_wait_ordered_roots+0x1bb/0x1f0 [btrfs]
[1416412.913302]  [<ffffffffa08bb0d0>] btrfs_sync_fs+0x70/0x150 [btrfs]
[1416412.913475]  [<ffffffff811d3e10>] __sync_filesystem+0x30/0x50
[1416412.913645]  [<ffffffff811d3e72>] sync_filesystem+0x42/0x60
[1416412.913816]  [<ffffffff811a2dab>] generic_shutdown_super+0x2b/0x100
[1416412.913987]  [<ffffffff811a2f76>] kill_anon_super+0x16/0x30
[1416412.914165]  [<ffffffffa08beeae>] btrfs_kill_super+0x1e/0x130 [btrfs]
[1416412.914338]  [<ffffffff811a31b3>] deactivate_locked_super+0x53/0x90
[1416412.914507]  [<ffffffff811a3651>] deactivate_super+0x51/0x70
[1416412.914679]  [<ffffffff811bf4ef>] cleanup_mnt+0x3f/0x80
[1416412.914852]  [<ffffffff811bf582>] __cleanup_mnt+0x12/0x20
[1416412.915028]  [<ffffffff81072968>] task_work_run+0x68/0xb0
[1416412.915203]  [<ffffffff81002306>] exit_to_usermode_loop+0xe6/0xf0
[1416412.915376]  [<ffffffff811b75ad>] ? dput+0x11d/0x240
[1416412.915547]  [<ffffffff81002600>] syscall_return_slowpath+0xa0/0x110
[1416412.915719]  [<ffffffff81002017>] ? trace_hardirqs_on_thunk+0x17/0x19
[1416412.915893]  [<ffffffff8164302c>] int_ret_from_sys_call+0x25/0x9f

The faulting instructions are: 

0xffffffff811cab12 <__mark_inode_dirty+530>:    callq  0xffffffff811c9d80 <locked_inode_to_wb_and_lock_list>
0xffffffff811cab17 <__mark_inode_dirty+535>:    mov    %rax,%r13      ; move bdi_writeback to r13
0xffffffff811cab1a <__mark_inode_dirty+538>:    mov    (%rax),%rax    ; rax = bdi_writeback->bdi
0xffffffff811cab1d <__mark_inode_dirty+541>:    testb  $0x2,0x18(%rax) ; bdi_cap_writeback_dirty(wb->bdi) 

So we call locked_inode_to_wb_and_lock_list and then load bdi_writeback->bdi,
which turns out to be NULL (RAX is 0 and the fault address in CR2 is 0x18). In
fact the whole struct bdi_writeback is zeroed, while the pointer to it is not
NULL (R13 in the register dump above). Could this stem from the same issue
discussed in the referenced email threads, or is it a different, btrfs-specific
problem?
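
To make the dereference chain above easier to follow, here is a small
user-space sketch of it. It is not kernel code: the *_mock structures, the
MOCK_ constant and both helper functions are hypothetical stand-ins. The field
layout is an assumption picked so that the offsets match the disassembly (bdi
at offset 0 of the writeback struct, the tested flags byte at 0x18), and it
assumes a 64-bit target.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; the layout is assumed, not copied from the kernel. */
struct list_head_mock { void *next, *prev; };   /* 16 bytes on a 64-bit target */

struct bdi_mock {
    struct list_head_mock bdi_list;   /* 0x00 */
    uint64_t ra_pages;                /* 0x10 */
    unsigned int capabilities;        /* 0x18: byte tested by testb $0x2,0x18(%rax) */
};

struct wb_mock {
    struct bdi_mock *bdi;             /* 0x00: loaded by mov (%rax),%rax */
};

#define MOCK_BDI_CAP_NO_WRITEBACK 0x2u  /* the $0x2 mask in the testb instruction */

/* The sketch only makes sense if the assumed layout gives the 0x18 offset. */
_Static_assert(offsetof(struct bdi_mock, capabilities) == 0x18,
               "assumed layout must put capabilities at 0x18");

/* Address the testb instruction would read, computed without dereferencing bdi. */
uintptr_t capabilities_addr(const struct wb_mock *wb)
{
    assert(wb != NULL);
    return (uintptr_t)wb->bdi + offsetof(struct bdi_mock, capabilities);
}

/*
 * Same flag test as in the disassembly, with NULL checks added purely for
 * illustration. Returns -1 if wb or wb->bdi is NULL, otherwise 1 when the
 * NO_WRITEBACK bit is clear and 0 when it is set.
 */
int writeback_dirty_checked(const struct wb_mock *wb)
{
    if (wb == NULL || wb->bdi == NULL)
        return -1;
    return !(wb->bdi->capabilities & MOCK_BDI_CAP_NO_WRITEBACK);
}
```

With a zero-filled struct wb_mock, capabilities_addr() gives 0x18, which
matches the CR2 value in the oops, and writeback_dirty_checked() reports -1
instead of faulting.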

Regards, 
Nikolay 
