From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1755533AbcFTRlB (ORCPT ); Mon, 20 Jun 2016 13:41:01 -0400 Received: from mail-yw0-f196.google.com ([209.85.161.196]:36280 "EHLO mail-yw0-f196.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752859AbcFTRkv (ORCPT ); Mon, 20 Jun 2016 13:40:51 -0400 Date: Mon, 20 Jun 2016 13:40:48 -0400 From: Tejun Heo To: Dmitry Vyukov Cc: Jan Kara , Jens Axboe , Andrey Ryabinin , Alexander Viro , "linux-fsdevel@vger.kernel.org" , LKML , syzkaller , Kostya Serebryany , Alexander Potapenko , Ilya Dryomov , Jan Kara , kernel-team@fb.com, Ted Tso Subject: Re: [PATCH] block: flush writeback dwork before detaching a bdev inode from it Message-ID: <20160620174048.GM3262@mtj.duckdns.org> References: <20160420211456.GE4775@htj.duckdns.org> <20160421170635.GO7822@mtj.duckdns.org> <20160617160405.GJ3262@mtj.duckdns.org> <20160620133146.GE6882@quack2.suse.cz> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.6.1 (2016-04-27) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Hello, On Mon, Jun 20, 2016 at 03:38:41PM +0200, Dmitry Vyukov wrote: > > Sorry for the late reply but now when thinking about the patch I don't > > think it is quite right. Writeback can happen from other contexts than just > > the worker one (e.g. kswapd can do writeback, or in some out-of-memory > > situations we may punt to doing writeback directly instead of calling the > > worker, or sync(2) calls fdatawrite() for block device inode directly when > > iterating through blockdev superblock). So flushing the workqueue IMHO is > > not covering 100% of cases. Hmmm, yeah, the patch undoes what the cgroup writeback changes added but it looks like the addition was exposing the existing problem more rather than causing a new one. > > I wanted to suggest to use inode_wait_for_writeback() which will make sure > > writeback is done with the inode. However we effectively already do this > > by calling bdev_write_inode() which calls writeback_single_inode() in > > WB_SYNC_ALL mode. So by the time that call completes we are sure writeback > > code is not looking at the inode. *However* what I think is happening is > > that sync_blockdev() writes all the dirty pages, bdev_write_inode() writes > > the inode and clears all dirty bits, however the inode still stays in the > > b_dirty / b_io list of the wb because it has I_DIRTY_TIME set. Subsequently > > once flusher work runs, it finds the inode, looks at it and boom. And the > > problem seems to be that write_inode_now(inode, true) does not result in > > I_DIRTY_TIME being cleared. > > > > Attached patch should fix this issue - it is compile-tested only. Dmitry, > > can you check whether this patch fixes the issue for you as well? > > I can't directly test it because crash happened very infrequently. > If Tejun/Ted agree that it is the right way to fix it, then I can > patch it, restart the fuzzer and leave it running for a while. Yes, please try out the patch. Thanks a lot. -- tejun