linux-kernel.vger.kernel.org archive mirror
From: Tejun Heo <tj@kernel.org>
To: Jens Axboe <axboe@fb.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	syzkaller <syzkaller@googlegroups.com>,
	Kostya Serebryany <kcc@google.com>,
	Alexander Potapenko <glider@google.com>,
	Ilya Dryomov <idryomov@gmail.com>, Jan Kara <jack@suse.com>,
	Dmitry Vyukov <dvyukov@google.com>,
	kernel-team@fb.com
Subject: [PATCH] block: flush writeback dwork before detaching a bdev inode from it
Date: Fri, 17 Jun 2016 12:04:05 -0400
Message-ID: <20160617160405.GJ3262@mtj.duckdns.org>
In-Reply-To: <CACT4Y+ZDHXVTKXcmMO4Gz2Rm50eB+u+iCU=bdKb-BN0LxxSiPg@mail.gmail.com>

43d1c0eb7e11 ("block: detach bdev inode from its wb in
__blkdev_put()") detached the bdev inode from its wb as the bdev inode
may outlive the underlying bdi and thus the wb.  This is accomplished
by invoking inode_detach_wb() from __blkdev_put(); however, while the
inode can't be dirtied by the time control reaches there, that doesn't
guarantee that writeback isn't already in progress on the inode.  This
can lead to the inode being disassociated from its wb while a
writeback operation is in flight, causing oopses like the following.

  general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN
  Modules linked in:
  CPU: 3 PID: 32 Comm: kworker/u10:1 Not tainted 4.6.0-rc3+ #349
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
  Workqueue: writeback wb_workfn (flush-11:0)
  task: ffff88006ccf1840 ti: ffff88006cda8000 task.ti: ffff88006cda8000
  RIP: 0010:[<ffffffff818884d2>]  [<ffffffff818884d2>]
  locked_inode_to_wb_and_lock_list+0xa2/0x750
  RSP: 0018:ffff88006cdaf7d0  EFLAGS: 00010246
  RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff88006ccf2050
  RDX: 0000000000000000 RSI: 000000114c8a8484 RDI: 0000000000000286
  RBP: ffff88006cdaf820 R08: ffff88006ccf1840 R09: 0000000000000000
  R10: 000229915090805f R11: 0000000000000001 R12: ffff88006a72f5e0
  R13: dffffc0000000000 R14: ffffed000d4e5eed R15: ffffffff8830cf40
  FS:  0000000000000000(0000) GS:ffff88006d500000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 0000000003301bf8 CR3: 000000006368f000 CR4: 00000000000006e0
  DR0: 0000000000001ec9 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600
  Stack:
   ffff88006a72f680 ffff88006a72f768 ffff8800671230d8 03ff88006cdaf948
   ffff88006a72f668 ffff88006a72f5e0 ffff8800671230d8 ffff88006cdaf948
   ffff880065b90cc8 ffff880067123100 ffff88006cdaf970 ffffffff8188e12e
  Call Trace:
   [<     inline     >] inode_to_wb_and_lock_list fs/fs-writeback.c:309
   [<ffffffff8188e12e>] writeback_sb_inodes+0x4de/0x1250 fs/fs-writeback.c:1554
   [<ffffffff8188efa4>] __writeback_inodes_wb+0x104/0x1e0 fs/fs-writeback.c:1600
   [<ffffffff8188f9ae>] wb_writeback+0x7ce/0xc90 fs/fs-writeback.c:1709
   [<     inline     >] wb_do_writeback fs/fs-writeback.c:1844
   [<ffffffff81891079>] wb_workfn+0x2f9/0x1000 fs/fs-writeback.c:1884
   [<ffffffff813bcd1e>] process_one_work+0x78e/0x15c0 kernel/workqueue.c:2094
   [<ffffffff813bdc2b>] worker_thread+0xdb/0xfc0 kernel/workqueue.c:2228
   [<ffffffff813cdeef>] kthread+0x23f/0x2d0 drivers/block/aoe/aoecmd.c:1303
   [<ffffffff867bc5d2>] ret_from_fork+0x22/0x50 arch/x86/entry/entry_64.S:392
  Code: 05 94 4a a8 06 85 c0 0f 85 03 03 00 00 e8 07 15 d0 ff 41 80 3e
  00 0f 85 64 06 00 00 49 8b 9c 24 88 01 00 00 48 89 d8 48 c1 e8 03 <42>
  80 3c 28 00 0f 85 17 06 00 00 48 8b 03 48 83 c0 50 48 39 c3
  RIP  [<     inline     >] wb_get include/linux/backing-dev-defs.h:212
  RIP  [<ffffffff818884d2>] locked_inode_to_wb_and_lock_list+0xa2/0x750
  fs/fs-writeback.c:281
   RSP <ffff88006cdaf7d0>
  ---[ end trace 986a4d314dcb2694 ]---

Fix it by flushing the wb dwork before detaching the inode.  Combined
with the fact that the inode can no longer be dirtied, this guarantees
that no writeback operation can be in flight or initiated.  As this
involves details on the writeback side which don't have much to do
with block_dev, encapsulate it in a helper function,
inode_detach_blkdev_wb().

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-and-tested-by: Dmitry Vyukov <dvyukov@google.com>
Link: http://lkml.kernel.org/g/CACT4Y+YAjq8mcfiVxR075didJKCyOCVrqxdbfKdgUxabstbfmA@mail.gmail.com
Fixes: 43d1c0eb7e11 ("block: detach bdev inode from its wb in __blkdev_put()")
Cc: Ilya Dryomov <idryomov@gmail.com>
Cc: stable@vger.kernel.org # v4.2+
---
 fs/block_dev.c            |    7 +------
 include/linux/writeback.h |   23 +++++++++++++++++++++++
 2 files changed, 24 insertions(+), 6 deletions(-)

diff --git a/fs/block_dev.c b/fs/block_dev.c
index 71ccab1..6a14100 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -1612,12 +1612,7 @@ static void __blkdev_put(struct block_device *bdev, fmode_t mode, int for_part)
 		kill_bdev(bdev);
 
 		bdev_write_inode(bdev);
-		/*
-		 * Detaching bdev inode from its wb in __destroy_inode()
-		 * is too late: the queue which embeds its bdi (along with
-		 * root wb) can be gone as soon as we put_disk() below.
-		 */
-		inode_detach_wb(bdev->bd_inode);
+		inode_detach_blkdev_wb(bdev);
 	}
 	if (bdev->bd_contains == bdev) {
 		if (disk->fops->release)
diff --git a/include/linux/writeback.h b/include/linux/writeback.h
index d0b5ca5..ec1f530 100644
--- a/include/linux/writeback.h
+++ b/include/linux/writeback.h
@@ -230,6 +230,25 @@ static inline void inode_detach_wb(struct inode *inode)
 }
 
 /**
+ * inode_detach_blkdev_wb - disassociate a bd_inode from its wb
+ * @bdev: block_device of interest
+ *
+ * @bdev is being put for the last time.  Detaching bdev inode in
+ * __destroy_inode() is too late: the queue which embeds its bdi (along
+ * with root wb) can be gone as soon as the containing disk is put.
+ *
+ * This function dissociates @bdev->bd_inode from its wb.  The inode must
+ * be clean and no further operations should be started on it.
+ */
+static inline void inode_detach_blkdev_wb(struct block_device *bdev)
+{
+	if (bdev->bd_inode->i_wb) {
+		flush_delayed_work(&bdev->bd_inode->i_wb->dwork);
+		inode_detach_wb(bdev->bd_inode);
+	}
+}
+
+/**
  * wbc_attach_fdatawrite_inode - associate wbc and inode for fdatawrite
  * @wbc: writeback_control of interest
  * @inode: target inode
@@ -277,6 +296,10 @@ static inline void inode_detach_wb(struct inode *inode)
 {
 }
 
+static inline void inode_detach_blkdev_wb(struct block_device *bdev)
+{
+}
+
 static inline void wbc_attach_and_unlock_inode(struct writeback_control *wbc,
 					       struct inode *inode)
 	__releases(&inode->i_lock)
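
For readers who want to see the ordering argument in isolation, below is a
minimal single-threaded userspace sketch, not kernel code.  Every toy_*
name is hypothetical and invented for illustration; the real structures,
locking and workqueue machinery are far more involved, and the real bug
involves concurrent execution which this model reduces to "work that is
queued but has not run yet".  It only shows why running outstanding work
before clearing the inode's wb pointer leaves nothing behind that could
dereference the detached pointer.

```c
#include <stdbool.h>
#include <stddef.h>

struct toy_inode;

/* Hypothetical stand-in for a bdi_writeback with one delayed work item. */
struct toy_wb {
	bool work_pending;		/* models a queued dwork */
	struct toy_inode *target;	/* inode the queued work will visit */
	int pages_written;
};

/* Hypothetical stand-in for an inode attached to a wb. */
struct toy_inode {
	struct toy_wb *wb;		/* models inode->i_wb */
	int dirty_pages;
};

/* Queue writeback work for @inode on @wb. */
static void toy_queue_work(struct toy_wb *wb, struct toy_inode *inode)
{
	wb->target = inode;
	wb->work_pending = true;
}

/*
 * Run the queued work, if any.  Returns false if the work found its
 * inode already detached, which is the situation that oopses in the
 * kernel.  Returns true if nothing was queued or the work ran cleanly.
 */
static bool toy_run_pending(struct toy_wb *wb)
{
	struct toy_inode *inode;

	if (!wb->work_pending)
		return true;
	wb->work_pending = false;

	inode = wb->target;
	if (inode->wb != wb)
		return false;		/* detached under us */

	wb->pages_written += inode->dirty_pages;
	inode->dirty_pages = 0;
	return true;
}

/* Ordering before the fix: detach without draining queued work. */
static void toy_detach_without_flush(struct toy_inode *inode)
{
	inode->wb = NULL;
}

/* Ordering after the fix: drain queued work first, then detach. */
static void toy_detach_with_flush(struct toy_inode *inode)
{
	if (inode->wb) {
		toy_run_pending(inode->wb);	/* models flush_delayed_work() */
		inode->wb = NULL;
	}
}
```

In this model, queueing work and then calling toy_detach_with_flush()
leaves no pending work, so a later toy_run_pending() has nothing to trip
over.  Using toy_detach_without_flush() instead leaves the work queued,
and the later run finds the inode detached.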

Thread overview: 17+ messages
2016-04-18  9:44 fs: GPF in locked_inode_to_wb_and_lock_list Dmitry Vyukov
2016-04-20 21:14 ` Tejun Heo
2016-04-21  8:25   ` Dmitry Vyukov
2016-04-21  9:10     ` Andrey Ryabinin
2016-04-21  9:29       ` Dmitry Vyukov
2016-04-21 16:14     ` Tejun Heo
2016-04-21  8:35   ` Dmitry Vyukov
2016-04-21  9:45     ` Andrey Ryabinin
2016-04-21 10:00       ` Dmitry Vyukov
2016-04-21 17:06         ` Tejun Heo
2016-04-22 18:55           ` Dmitry Vyukov
2016-06-06 17:46             ` Dmitry Vyukov
2016-06-17 16:04               ` Tejun Heo [this message]
2016-06-20 13:31                 ` [PATCH] block: flush writeback dwork before detaching a bdev inode from it Jan Kara
2016-06-20 13:38                   ` Dmitry Vyukov
2016-06-20 17:40                     ` Tejun Heo
2016-06-21 12:58                       ` Dmitry Vyukov
