From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1752130AbcDRJoh (ORCPT );
        Mon, 18 Apr 2016 05:44:37 -0400
Received: from mail-wm0-f52.google.com ([74.125.82.52]:38024 "EHLO
        mail-wm0-f52.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1751877AbcDRJoe (ORCPT );
        Mon, 18 Apr 2016 05:44:34 -0400
MIME-Version: 1.0
From: Dmitry Vyukov
Date: Mon, 18 Apr 2016 11:44:11 +0200
Message-ID:
Subject: fs: GPF in locked_inode_to_wb_and_lock_list
To: Alexander Viro, "linux-fsdevel@vger.kernel.org", LKML, Tejun Heo
Cc: syzkaller, Kostya Serebryany, Alexander Potapenko
Content-Type: text/plain; charset=UTF-8
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Hello,

I've hit the following GPF while running syzkaller fuzzer on commit
806fdcce017dc98c4dbf8ed001750a0d7d2bb0af (Apr 14):

general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN
Modules linked in:
CPU: 3 PID: 32 Comm: kworker/u10:1 Not tainted 4.6.0-rc3+ #349
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Workqueue: writeback wb_workfn (flush-11:0)
task: ffff88006ccf1840 ti: ffff88006cda8000 task.ti: ffff88006cda8000
RIP: 0010:[]  [] locked_inode_to_wb_and_lock_list+0xa2/0x750
RSP: 0018:ffff88006cdaf7d0  EFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff88006ccf2050
RDX: 0000000000000000 RSI: 000000114c8a8484 RDI: 0000000000000286
RBP: ffff88006cdaf820 R08: ffff88006ccf1840 R09: 0000000000000000
R10: 000229915090805f R11: 0000000000000001 R12: ffff88006a72f5e0
R13: dffffc0000000000 R14: ffffed000d4e5eed R15: ffffffff8830cf40
FS:  0000000000000000(0000) GS:ffff88006d500000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000003301bf8 CR3: 000000006368f000 CR4: 00000000000006e0
DR0: 0000000000001ec9 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600
Stack:
 ffff88006a72f680 ffff88006a72f768 ffff8800671230d8 03ff88006cdaf948
 ffff88006a72f668 ffff88006a72f5e0 ffff8800671230d8 ffff88006cdaf948
 ffff880065b90cc8 ffff880067123100 ffff88006cdaf970 ffffffff8188e12e
Call Trace:
 [< inline >] inode_to_wb_and_lock_list fs/fs-writeback.c:309
 [] writeback_sb_inodes+0x4de/0x1250 fs/fs-writeback.c:1554
 [] __writeback_inodes_wb+0x104/0x1e0 fs/fs-writeback.c:1600
 [] wb_writeback+0x7ce/0xc90 fs/fs-writeback.c:1709
 [< inline >] wb_do_writeback fs/fs-writeback.c:1844
 [] wb_workfn+0x2f9/0x1000 fs/fs-writeback.c:1884
 [] process_one_work+0x78e/0x15c0 kernel/workqueue.c:2094
 [] worker_thread+0xdb/0xfc0 kernel/workqueue.c:2228
 [] kthread+0x23f/0x2d0 drivers/block/aoe/aoecmd.c:1303
 [] ret_from_fork+0x22/0x50 arch/x86/entry/entry_64.S:392
Code: 05 94 4a a8 06 85 c0 0f 85 03 03 00 00 e8 07 15 d0 ff 41 80 3e
00 0f 85 64 06 00 00 49 8b 9c 24 88 01 00 00 48 89 d8 48 c1 e8 03 <42>
80 3c 28 00 0f 85 17 06 00 00 48 8b 03 48 83 c0 50 48 39 c3
RIP  [< inline >] wb_get include/linux/backing-dev-defs.h:212
RIP  [] locked_inode_to_wb_and_lock_list+0xa2/0x750 fs/fs-writeback.c:281
 RSP
---[ end trace 986a4d314dcb2694 ]---

The crash happened here:

        if (wb != &wb->bdi->wb)
ffffffff818884cb:       48 89 d8                mov    %rbx,%rax
ffffffff818884ce:       48 c1 e8 03             shr    $0x3,%rax
ffffffff818884d2:       42 80 3c 28 00          cmpb   $0x0,(%rax,%r13,1)
ffffffff818884d7:       0f 85 17 06 00 00       jne    ffffffff81888af4
ffffffff818884dd:       48 8b 03                mov    (%rbx),%rax
ffffffff818884e0:       48 83 c0 50             add    $0x50,%rax
ffffffff818884e4:       48 39 c3                cmp    %rax,%rbx
ffffffff818884e7:       0f 84 c3 00 00 00       je     ffffffff818885b0

Which means that bdi is NULL (if I get indirections right).
The block device is flush-11:0 (/dev/sr0).

I was able to reproduce it once by re-running programs from the crash log:
https://gist.githubusercontent.com/dvyukov/7c9e6358b661e2d50a774a70b5ee348a/raw/5cd94e32bbcf2a788b5efd0768cfd883c5921070/gistfile1.txt
But it took half an hour, and then another hour long run did not trigger it.
So it is probably triggered by some rare race.

Thank you