From: Jan Kara <jack@suse.cz>
To: Eric Biggers <ebiggers@kernel.org>
Cc: linux-fsdevel@vger.kernel.org, linux-xfs@vger.kernel.org,
	linux-ext4@vger.kernel.org,
	linux-f2fs-devel@lists.sourceforge.net,
	Theodore Ts'o <tytso@mit.edu>, Christoph Hellwig <hch@lst.de>,
	stable@vger.kernel.org, Jan Kara <jack@suse.cz>
Subject: Re: [PATCH v2 01/12] fs: fix lazytime expiration handling in __writeback_single_inode()
Date: Mon, 11 Jan 2021 15:46:00 +0100
Message-ID: <20210111144600.GC808@quack2.suse.cz>
In-Reply-To: <20210109075903.208222-2-ebiggers@kernel.org>

On Fri 08-01-21 23:58:52, Eric Biggers wrote:
> From: Eric Biggers <ebiggers@google.com>
> 
> When lazytime is enabled and an inode is being written due to its
> in-memory updated timestamps having expired, either due to a sync() or
> syncfs() system call or due to dirtytime_expire_interval having elapsed,
> the VFS needs to inform the filesystem so that the filesystem can copy
> the inode's timestamps out to the on-disk data structures.
> 
> This is done by __writeback_single_inode() calling
> mark_inode_dirty_sync(), which then calls ->dirty_inode(I_DIRTY_SYNC).
> 
> However, this occurs after __writeback_single_inode() has already
> cleared the dirty flags from ->i_state.  This causes two bugs:
> 
> - mark_inode_dirty_sync() redirties the inode, causing it to remain
>   dirty.  This wastefully causes the inode to be written twice.  But
>   more importantly, it breaks cases where sync_filesystem() is expected
>   to clean dirty inodes.  This includes the FS_IOC_REMOVE_ENCRYPTION_KEY
>   ioctl (as reported at
>   https://lore.kernel.org/r/20200306004555.GB225345@gmail.com), as well
>   as possibly filesystem freezing (freeze_super()).
> 
> - Since ->i_state doesn't contain I_DIRTY_TIME when ->dirty_inode() is
>   called from __writeback_single_inode() for lazytime expiration,
>   xfs_fs_dirty_inode() ignores the notification.  (XFS only cares about
>   lazytime expirations, and it assumes that i_state will contain
>   I_DIRTY_TIME during those.)  Therefore, lazy timestamps aren't persisted by
>   sync(), syncfs(), or dirtytime_expire_interval on XFS.
> 
> Fix this by moving the call to mark_inode_dirty_sync() to earlier in
> __writeback_single_inode(), before the dirty flags are cleared from
> i_state.  This ensures that filesystems are properly notified of the
> timestamp expiration, and it avoids incorrectly redirtying the inode.
> 
> This fixes xfstest generic/580 (which tests
> FS_IOC_REMOVE_ENCRYPTION_KEY) when run on ext4 or f2fs with lazytime
> enabled.  It also fixes the new lazytime xfstest I've proposed, which
> reproduces the above-mentioned XFS bug
> (https://lore.kernel.org/r/20210105005818.92978-1-ebiggers@kernel.org).
> 
> Alternatively, we could call ->dirty_inode(I_DIRTY_SYNC) directly.  But
> due to the introduction of I_SYNC_QUEUED, mark_inode_dirty_sync() is the
> right thing to do because mark_inode_dirty_sync() now knows not to move
> the inode to a writeback list if it is currently queued for sync.
> 
> Fixes: 0ae45f63d4ef ("vfs: add support for a lazytime mount option")
> Cc: stable@vger.kernel.org
> Depends-on: 5afced3bf281 ("writeback: Avoid skipping inode writeback")
> Suggested-by: Jan Kara <jack@suse.cz>
> Signed-off-by: Eric Biggers <ebiggers@google.com>

Thanks for writing this fix! It looks good to me. You can add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

> ---
>  fs/fs-writeback.c | 24 +++++++++++++-----------
>  1 file changed, 13 insertions(+), 11 deletions(-)
> 
> diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
> index acfb55834af23..c41cb887eb7d3 100644
> --- a/fs/fs-writeback.c
> +++ b/fs/fs-writeback.c
> @@ -1474,21 +1474,25 @@ __writeback_single_inode(struct inode *inode, struct writeback_control *wbc)
>  	}
>  
>  	/*
> -	 * Some filesystems may redirty the inode during the writeback
> -	 * due to delalloc, clear dirty metadata flags right before
> -	 * write_inode()
> +	 * If the inode has dirty timestamps and we need to write them, call
> +	 * mark_inode_dirty_sync() to notify the filesystem about it and to
> +	 * change I_DIRTY_TIME into I_DIRTY_SYNC.
>  	 */
> -	spin_lock(&inode->i_lock);
> -
> -	dirty = inode->i_state & I_DIRTY;
>  	if ((inode->i_state & I_DIRTY_TIME) &&
> -	    ((dirty & I_DIRTY_INODE) ||
> -	     wbc->sync_mode == WB_SYNC_ALL || wbc->for_sync ||
> +	    (wbc->sync_mode == WB_SYNC_ALL || wbc->for_sync ||
>  	     time_after(jiffies, inode->dirtied_time_when +
>  			dirtytime_expire_interval * HZ))) {
> -		dirty |= I_DIRTY_TIME;
>  		trace_writeback_lazytime(inode);
> +		mark_inode_dirty_sync(inode);
>  	}
> +
> +	/*
> +	 * Some filesystems may redirty the inode during the writeback
> +	 * due to delalloc, clear dirty metadata flags right before
> +	 * write_inode()
> +	 */
> +	spin_lock(&inode->i_lock);
> +	dirty = inode->i_state & I_DIRTY;
>  	inode->i_state &= ~dirty;
>  
>  	/*
> @@ -1509,8 +1513,6 @@ __writeback_single_inode(struct inode *inode, struct writeback_control *wbc)
>  
>  	spin_unlock(&inode->i_lock);
>  
> -	if (dirty & I_DIRTY_TIME)
> -		mark_inode_dirty_sync(inode);
>  	/* Don't write the inode if only I_DIRTY_PAGES was set */
>  	if (dirty & ~I_DIRTY_PAGES) {
>  		int err = write_inode(inode, wbc);
> -- 
> 2.30.0
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR
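
The ordering problem described in the quoted commit message can be
illustrated with a small userspace model. This is only a sketch, not
kernel code: struct model_inode, fs_dirty_inode(),
model_mark_inode_dirty_sync() and the two writeback_*_order() helpers
are hypothetical names, the flag values are illustrative, and locking,
writeback lists and write_inode() are left out. The "filesystem"
callback assumes XFS-like behaviour as described above, i.e. it acts
only when I_DIRTY_TIME is still set in i_state.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative flag values; the real ones live in include/linux/fs.h. */
#define I_DIRTY_SYNC     (1u << 0)
#define I_DIRTY_DATASYNC (1u << 1)
#define I_DIRTY_PAGES    (1u << 2)
#define I_DIRTY_TIME     (1u << 3)
#define I_DIRTY_INODE    (I_DIRTY_SYNC | I_DIRTY_DATASYNC)
#define I_DIRTY          (I_DIRTY_INODE | I_DIRTY_PAGES)

/* Hypothetical stand-in for struct inode. */
struct model_inode {
	unsigned int i_state;
	int timestamps_persisted; /* times the "filesystem" copied timestamps */
};

/* Models an XFS-like ->dirty_inode(): ignores the call unless
 * I_DIRTY_TIME is still visible in i_state. */
static void fs_dirty_inode(struct model_inode *inode, unsigned int flags)
{
	if (flags == I_DIRTY_SYNC && (inode->i_state & I_DIRTY_TIME))
		inode->timestamps_persisted++;
}

/* Models mark_inode_dirty_sync(): notify the filesystem first, then
 * turn I_DIRTY_TIME into I_DIRTY_SYNC. */
static void model_mark_inode_dirty_sync(struct model_inode *inode)
{
	assert(inode);
	fs_dirty_inode(inode, I_DIRTY_SYNC);
	inode->i_state &= ~I_DIRTY_TIME;
	inode->i_state |= I_DIRTY_SYNC;
}

/* Old order: dirty flags are cleared first, notification comes after. */
static void writeback_old_order(struct model_inode *inode, bool expired)
{
	unsigned int dirty = inode->i_state & I_DIRTY;

	if ((inode->i_state & I_DIRTY_TIME) && expired)
		dirty |= I_DIRTY_TIME;
	inode->i_state &= ~dirty;

	if (dirty & I_DIRTY_TIME)
		model_mark_inode_dirty_sync(inode); /* too late: redirties */
}

/* New order: notify while I_DIRTY_TIME is still set, then clear. */
static void writeback_new_order(struct model_inode *inode, bool expired)
{
	unsigned int dirty;

	if ((inode->i_state & I_DIRTY_TIME) && expired)
		model_mark_inode_dirty_sync(inode);

	dirty = inode->i_state & I_DIRTY;
	inode->i_state &= ~dirty;
}
```

Starting from an inode whose i_state is just I_DIRTY_TIME and an
expired timestamp, the old order in this model leaves I_DIRTY_SYNC set
and never persists the timestamps, while the new order persists them
once and leaves i_state clean. That matches the two bugs listed in the
commit message, within the limits of the simplifications above.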
