From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Mon, 11 Jan 2021 15:46:00 +0100
From: Jan Kara
To: Eric Biggers
Cc: linux-fsdevel@vger.kernel.org, linux-xfs@vger.kernel.org,
	linux-ext4@vger.kernel.org, linux-f2fs-devel@lists.sourceforge.net,
	Theodore Ts'o, Christoph Hellwig, stable@vger.kernel.org, Jan Kara
Subject: Re: [PATCH v2 01/12] fs: fix lazytime expiration handling in __writeback_single_inode()
Message-ID: <20210111144600.GC808@quack2.suse.cz>
References: <20210109075903.208222-1-ebiggers@kernel.org>
 <20210109075903.208222-2-ebiggers@kernel.org>
In-Reply-To: <20210109075903.208222-2-ebiggers@kernel.org>
User-Agent: Mutt/1.10.1 (2018-07-13)
X-Mailing-List: linux-fsdevel@vger.kernel.org

On Fri 08-01-21 23:58:52, Eric Biggers wrote:
> From: Eric Biggers
>
> When lazytime is enabled and an inode is being written due to its
> in-memory updated timestamps having expired, either due to a sync() or
> syncfs() system call or due to dirtytime_expire_interval having elapsed,
> the VFS needs to inform the filesystem so that the filesystem can copy
> the inode's timestamps out to the on-disk data structures.
>
> This is done by __writeback_single_inode() calling
> mark_inode_dirty_sync(), which then calls ->dirty_inode(I_DIRTY_SYNC).
>
> However, this occurs after __writeback_single_inode() has already
> cleared the dirty flags from ->i_state. This causes two bugs:
>
> - mark_inode_dirty_sync() redirties the inode, causing it to remain
>   dirty. This wastefully causes the inode to be written twice. But
>   more importantly, it breaks cases where sync_filesystem() is expected
>   to clean dirty inodes. This includes the FS_IOC_REMOVE_ENCRYPTION_KEY
>   ioctl (as reported at
>   https://lore.kernel.org/r/20200306004555.GB225345@gmail.com), as well
>   as possibly filesystem freezing (freeze_super()).
>
> - Since ->i_state doesn't contain I_DIRTY_TIME when ->dirty_inode() is
>   called from __writeback_single_inode() for lazytime expiration,
>   xfs_fs_dirty_inode() ignores the notification. (XFS only cares about
>   lazytime expirations, and it assumes that i_state will contain
>   I_DIRTY_TIME during those.) Therefore, lazy timestamps aren't persisted by
>   sync(), syncfs(), or dirtytime_expire_interval on XFS.
>
> Fix this by moving the call to mark_inode_dirty_sync() to earlier in
> __writeback_single_inode(), before the dirty flags are cleared from
> i_state. This makes filesystems be properly notified of the timestamp
> expiration, and it avoids incorrectly redirtying the inode.
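
For anyone following along, the ordering problem described above can be modeled in plain userspace C. This is only a sketch, not the kernel code: the flag values, struct model_inode, and every model_* / writeback_* function below are made up for illustration, and the "filesystem" hook simply mimics the XFS behaviour described above (act only when I_DIRTY_TIME is still set in i_state). Both paths assume the lazytime expiration condition has already been met.

```c
#include <assert.h>	/* for the checks described in the note below */
#include <stdbool.h>

/* Illustrative flag values only; the real definitions are in the kernel headers. */
enum {
	I_DIRTY_SYNC = 1 << 0,
	I_DIRTY_TIME = 1 << 1,
};

/* Hypothetical stand-in for struct inode. */
struct model_inode {
	unsigned int i_state;
	bool timestamps_logged;	/* set once the "filesystem" persists timestamps */
};

/* Models a ->dirty_inode() hook that only reacts to lazytime expiration. */
static void model_dirty_inode(struct model_inode *inode)
{
	if (inode->i_state & I_DIRTY_TIME)
		inode->timestamps_logged = true;
}

/* Models mark_inode_dirty_sync(): notify first, then I_DIRTY_TIME becomes I_DIRTY_SYNC. */
static void model_mark_inode_dirty_sync(struct model_inode *inode)
{
	model_dirty_inode(inode);
	inode->i_state &= ~I_DIRTY_TIME;
	inode->i_state |= I_DIRTY_SYNC;
}

/* Pre-patch order: dirty flags are cleared before the filesystem is notified. */
static void writeback_old(struct model_inode *inode)
{
	unsigned int dirty = inode->i_state & (I_DIRTY_SYNC | I_DIRTY_TIME);

	inode->i_state &= ~dirty;
	if (dirty & I_DIRTY_TIME)
		model_mark_inode_dirty_sync(inode);	/* hook no longer sees I_DIRTY_TIME */
}

/* Post-patch order: notify while I_DIRTY_TIME is still set, then clear. */
static void writeback_new(struct model_inode *inode)
{
	unsigned int dirty;

	if (inode->i_state & I_DIRTY_TIME)
		model_mark_inode_dirty_sync(inode);
	dirty = inode->i_state & I_DIRTY_SYNC;
	inode->i_state &= ~dirty;
}
```

Starting from i_state == I_DIRTY_TIME, writeback_old() leaves the model inode with I_DIRTY_SYNC set and timestamps_logged still false (both bugs at once), while writeback_new() leaves i_state at 0 with timestamps_logged set.
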
>
> This fixes xfstest generic/580 (which tests
> FS_IOC_REMOVE_ENCRYPTION_KEY) when run on ext4 or f2fs with lazytime
> enabled. It also fixes the new lazytime xfstest I've proposed, which
> reproduces the above-mentioned XFS bug
> (https://lore.kernel.org/r/20210105005818.92978-1-ebiggers@kernel.org).
>
> Alternatively, we could call ->dirty_inode(I_DIRTY_SYNC) directly. But
> due to the introduction of I_SYNC_QUEUED, mark_inode_dirty_sync() is the
> right thing to do because mark_inode_dirty_sync() now knows not to move
> the inode to a writeback list if it is currently queued for sync.
>
> Fixes: 0ae45f63d4ef ("vfs: add support for a lazytime mount option")
> Cc: stable@vger.kernel.org
> Depends-on: 5afced3bf281 ("writeback: Avoid skipping inode writeback")
> Suggested-by: Jan Kara
> Signed-off-by: Eric Biggers

Thanks for writing this fix! It looks good to me. You can add:

Reviewed-by: Jan Kara

								Honza

> ---
>  fs/fs-writeback.c | 24 +++++++++++++-----------
>  1 file changed, 13 insertions(+), 11 deletions(-)
>
> diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
> index acfb55834af23..c41cb887eb7d3 100644
> --- a/fs/fs-writeback.c
> +++ b/fs/fs-writeback.c
> @@ -1474,21 +1474,25 @@ __writeback_single_inode(struct inode *inode, struct writeback_control *wbc)
>  	}
>
>  	/*
> -	 * Some filesystems may redirty the inode during the writeback
> -	 * due to delalloc, clear dirty metadata flags right before
> -	 * write_inode()
> +	 * If the inode has dirty timestamps and we need to write them, call
> +	 * mark_inode_dirty_sync() to notify the filesystem about it and to
> +	 * change I_DIRTY_TIME into I_DIRTY_SYNC.
>  	 */
> -	spin_lock(&inode->i_lock);
> -
> -	dirty = inode->i_state & I_DIRTY;
>  	if ((inode->i_state & I_DIRTY_TIME) &&
> -	    ((dirty & I_DIRTY_INODE) ||
> -	     wbc->sync_mode == WB_SYNC_ALL || wbc->for_sync ||
> +	    (wbc->sync_mode == WB_SYNC_ALL || wbc->for_sync ||
>  	     time_after(jiffies, inode->dirtied_time_when +
>  			dirtytime_expire_interval * HZ))) {
> -		dirty |= I_DIRTY_TIME;
>  		trace_writeback_lazytime(inode);
> +		mark_inode_dirty_sync(inode);
>  	}
> +
> +	/*
> +	 * Some filesystems may redirty the inode during the writeback
> +	 * due to delalloc, clear dirty metadata flags right before
> +	 * write_inode()
> +	 */
> +	spin_lock(&inode->i_lock);
> +	dirty = inode->i_state & I_DIRTY;
>  	inode->i_state &= ~dirty;
>
>  	/*
> @@ -1509,8 +1513,6 @@ __writeback_single_inode(struct inode *inode, struct writeback_control *wbc)
>
>  	spin_unlock(&inode->i_lock);
>
> -	if (dirty & I_DIRTY_TIME)
> -		mark_inode_dirty_sync(inode);
>  	/* Don't write the inode if only I_DIRTY_PAGES was set */
>  	if (dirty & ~I_DIRTY_PAGES) {
>  		int err = write_inode(inode, wbc);
> --
> 2.30.0
>
--
Jan Kara
SUSE Labs, CR
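
As a side note on the condition in the first hunk: after the patch, dirty timestamps are written out when the inode has I_DIRTY_TIME set and either the writeback is a sync (WB_SYNC_ALL or for_sync) or dirtytime_expire_interval has elapsed since dirtied_time_when. The sketch below models that test in userspace C. It is an illustration under stated assumptions, not kernel code: MODEL_HZ (250 here), struct model_wbc, model_time_after() and model_lazytime_due() are hypothetical names, and the comparison uses the same signed-difference trick as the kernel's time_after() so that a wrapping tick counter is still handled.

```c
#include <assert.h>	/* for the checks described in the note below */
#include <stdbool.h>

#define MODEL_HZ 250UL	/* assumed tick rate; the kernel's HZ is a build-time option */

/* Hypothetical stand-in for the writeback_control fields the check reads. */
struct model_wbc {
	bool sync_all;	/* models wbc->sync_mode == WB_SYNC_ALL */
	bool for_sync;	/* models wbc->for_sync */
};

/*
 * Wraparound-safe "a is after b": the unsigned difference is reinterpreted
 * as signed. This relies on the usual two's-complement conversion.
 */
static bool model_time_after(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}

/* Models the post-patch condition for writing out lazily updated timestamps. */
static bool model_lazytime_due(bool has_dirty_time, const struct model_wbc *wbc,
			       unsigned long now, unsigned long dirtied_time_when,
			       unsigned long expire_interval_secs)
{
	if (!has_dirty_time)
		return false;
	return wbc->sync_all || wbc->for_sync ||
	       model_time_after(now, dirtied_time_when +
				     expire_interval_secs * MODEL_HZ);
}
```

With a 10 second interval and dirtied_time_when == 1000, background writeback (neither sync flag set) is not due at tick 3500 and becomes due at tick 3501. The same holds when dirtied_time_when sits just below the counter's maximum and the deadline wraps past zero.
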