From: Jan Kara <jack@suse.cz>
To: Eric Biggers <ebiggers@kernel.org>
Cc: linux-fsdevel@vger.kernel.org, linux-xfs@vger.kernel.org,
linux-ext4@vger.kernel.org,
linux-f2fs-devel@lists.sourceforge.net,
Theodore Ts'o <tytso@mit.edu>, Christoph Hellwig <hch@lst.de>,
stable@vger.kernel.org, Jan Kara <jack@suse.cz>
Subject: Re: [PATCH v2 01/12] fs: fix lazytime expiration handling in __writeback_single_inode()
Date: Mon, 11 Jan 2021 15:46:00 +0100
Message-ID: <20210111144600.GC808@quack2.suse.cz>
In-Reply-To: <20210109075903.208222-2-ebiggers@kernel.org>
On Fri 08-01-21 23:58:52, Eric Biggers wrote:
> From: Eric Biggers <ebiggers@google.com>
>
> When lazytime is enabled and an inode is being written due to its
> in-memory updated timestamps having expired, either due to a sync() or
> syncfs() system call or due to dirtytime_expire_interval having elapsed,
> the VFS needs to inform the filesystem so that the filesystem can copy
> the inode's timestamps out to the on-disk data structures.
>
> This is done by __writeback_single_inode() calling
> mark_inode_dirty_sync(), which then calls ->dirty_inode(I_DIRTY_SYNC).
>
> However, this occurs after __writeback_single_inode() has already
> cleared the dirty flags from ->i_state. This causes two bugs:
>
> - mark_inode_dirty_sync() redirties the inode, causing it to remain
>   dirty. This wastefully causes the inode to be written twice. But
>   more importantly, it breaks cases where sync_filesystem() is expected
>   to clean dirty inodes. This includes the FS_IOC_REMOVE_ENCRYPTION_KEY
>   ioctl (as reported at
>   https://lore.kernel.org/r/20200306004555.GB225345@gmail.com), as well
>   as possibly filesystem freezing (freeze_super()).
>
> - Since ->i_state doesn't contain I_DIRTY_TIME when ->dirty_inode() is
>   called from __writeback_single_inode() for lazytime expiration,
>   xfs_fs_dirty_inode() ignores the notification. (XFS only cares about
>   lazytime expirations, and it assumes that i_state will contain
>   I_DIRTY_TIME during those.) Therefore, lazy timestamps aren't persisted
>   by sync(), syncfs(), or dirtytime_expire_interval on XFS.
>
> Fix this by moving the call to mark_inode_dirty_sync() to earlier in
> __writeback_single_inode(), before the dirty flags are cleared from
> i_state. This makes filesystems be properly notified of the timestamp
> expiration, and it avoids incorrectly redirtying the inode.
>
> This fixes xfstest generic/580 (which tests
> FS_IOC_REMOVE_ENCRYPTION_KEY) when run on ext4 or f2fs with lazytime
> enabled. It also fixes the new lazytime xfstest I've proposed, which
> reproduces the above-mentioned XFS bug
> (https://lore.kernel.org/r/20210105005818.92978-1-ebiggers@kernel.org).
>
> Alternatively, we could call ->dirty_inode(I_DIRTY_SYNC) directly. But
> due to the introduction of I_SYNC_QUEUED, mark_inode_dirty_sync() is the
> right thing to do because mark_inode_dirty_sync() now knows not to move
> the inode to a writeback list if it is currently queued for sync.
>
> Fixes: 0ae45f63d4ef ("vfs: add support for a lazytime mount option")
> Cc: stable@vger.kernel.org
> Depends-on: 5afced3bf281 ("writeback: Avoid skipping inode writeback")
> Suggested-by: Jan Kara <jack@suse.cz>
> Signed-off-by: Eric Biggers <ebiggers@google.com>

Thanks for writing this fix! It looks good to me. You can add:

Reviewed-by: Jan Kara <jack@suse.cz>

Honza

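For readers following along, the ordering problem described in the commit message can be modelled in user space. This is only a minimal sketch: the toy_* names and the TOY_* flag values are made up for illustration, the expiry test is collapsed into a single "expired" argument, and none of it is kernel code.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative flag values only; the real flags live in the kernel headers. */
#define TOY_DIRTY_SYNC  (1u << 0)
#define TOY_DIRTY_TIME  (1u << 1)
#define TOY_DIRTY       TOY_DIRTY_SYNC  /* like I_DIRTY: no lazytime bit */

struct toy_inode {
	unsigned int state;   /* stands in for ->i_state */
	unsigned int fs_saw;  /* state observed by the toy ->dirty_inode() hook */
	int notifications;    /* how often the filesystem was notified */
};

/* Toy mark_inode_dirty_sync(): notify the fs, then turn TIME into SYNC. */
static void toy_mark_inode_dirty_sync(struct toy_inode *inode)
{
	assert(inode);
	inode->fs_saw = inode->state;
	inode->notifications++;
	inode->state &= ~TOY_DIRTY_TIME;
	inode->state |= TOY_DIRTY_SYNC;
}

/* Old ordering: the flags are cleared first, the notification comes later. */
static void toy_writeback_old(struct toy_inode *inode, bool expired)
{
	unsigned int dirty = inode->state & TOY_DIRTY;

	if ((inode->state & TOY_DIRTY_TIME) && expired)
		dirty |= TOY_DIRTY_TIME;
	inode->state &= ~dirty;
	if (dirty & TOY_DIRTY_TIME)
		toy_mark_inode_dirty_sync(inode);  /* redirties a clean inode */
}

/* New ordering: notify while TIME is still set, then clear the flags. */
static void toy_writeback_new(struct toy_inode *inode, bool expired)
{
	unsigned int dirty;

	if ((inode->state & TOY_DIRTY_TIME) && expired)
		toy_mark_inode_dirty_sync(inode);
	dirty = inode->state & TOY_DIRTY;
	inode->state &= ~dirty;
}
```

Starting from state == TOY_DIRTY_TIME with expired == true, the old ordering leaves TOY_DIRTY_SYNC set and the hook never sees TOY_DIRTY_TIME, while the new ordering ends with a clean state and the hook does see it. These are the two symptoms the commit message lists.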
> ---
> fs/fs-writeback.c | 24 +++++++++++++-----------
> 1 file changed, 13 insertions(+), 11 deletions(-)
>
> diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
> index acfb55834af23..c41cb887eb7d3 100644
> --- a/fs/fs-writeback.c
> +++ b/fs/fs-writeback.c
> @@ -1474,21 +1474,25 @@ __writeback_single_inode(struct inode *inode, struct writeback_control *wbc)
>  	}
>
>  	/*
> -	 * Some filesystems may redirty the inode during the writeback
> -	 * due to delalloc, clear dirty metadata flags right before
> -	 * write_inode()
> +	 * If the inode has dirty timestamps and we need to write them, call
> +	 * mark_inode_dirty_sync() to notify the filesystem about it and to
> +	 * change I_DIRTY_TIME into I_DIRTY_SYNC.
>  	 */
> -	spin_lock(&inode->i_lock);
> -
> -	dirty = inode->i_state & I_DIRTY;
>  	if ((inode->i_state & I_DIRTY_TIME) &&
> -	    ((dirty & I_DIRTY_INODE) ||
> -	     wbc->sync_mode == WB_SYNC_ALL || wbc->for_sync ||
> +	    (wbc->sync_mode == WB_SYNC_ALL || wbc->for_sync ||
>  	     time_after(jiffies, inode->dirtied_time_when +
>  			dirtytime_expire_interval * HZ))) {
> -		dirty |= I_DIRTY_TIME;
>  		trace_writeback_lazytime(inode);
> +		mark_inode_dirty_sync(inode);
>  	}
> +
> +	/*
> +	 * Some filesystems may redirty the inode during the writeback
> +	 * due to delalloc, clear dirty metadata flags right before
> +	 * write_inode()
> +	 */
> +	spin_lock(&inode->i_lock);
> +	dirty = inode->i_state & I_DIRTY;
>  	inode->i_state &= ~dirty;
>
>  	/*
> @@ -1509,8 +1513,6 @@ __writeback_single_inode(struct inode *inode, struct writeback_control *wbc)
>
>  	spin_unlock(&inode->i_lock);
>
> -	if (dirty & I_DIRTY_TIME)
> -		mark_inode_dirty_sync(inode);
>  	/* Don't write the inode if only I_DIRTY_PAGES was set */
>  	if (dirty & ~I_DIRTY_PAGES) {
>  		int err = write_inode(inode, wbc);
> --
> 2.30.0
>
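The condition in the new hunk can also be read as a standalone predicate. The following is a user-space sketch under stated assumptions: the toy_* names and struct toy_wbc are hypothetical, the interval and HZ are passed in as plain parameters, and toy_time_after() only mirrors the wraparound-safe signed-difference idea behind time_after() (it relies on the usual two's-complement conversion, as the kernel does).

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Wraparound-safe "a is after b" on a free-running tick counter. */
static bool toy_time_after(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}

/* Hypothetical stand-in for the two writeback_control fields used here. */
struct toy_wbc {
	bool sync_all;  /* models wbc->sync_mode == WB_SYNC_ALL */
	bool for_sync;  /* models wbc->for_sync */
};

/*
 * Models the post-patch test: lazy timestamps are written out only if the
 * inode has them pending AND the writeback is a sync or the interval
 * (in seconds, scaled by hz ticks per second) has elapsed.
 */
static bool toy_lazytime_needs_write(bool dirty_time,
				     const struct toy_wbc *wbc,
				     unsigned long now,
				     unsigned long dirtied_time_when,
				     unsigned long expire_interval,
				     unsigned long hz)
{
	assert(wbc);
	return dirty_time &&
	       (wbc->sync_all || wbc->for_sync ||
		toy_time_after(now, dirtied_time_when + expire_interval * hz));
}
```

With hz = 100 and expire_interval = 10, an inode dirtied at tick 900 is not yet due at tick 1000 for background writeback but is due at tick 2000, and it is always due when for_sync is set. Because of the signed difference, the comparison also holds when dirtied_time_when sits just below ULONG_MAX and the counter has since wrapped.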
--
Jan Kara <jack@suse.com>
SUSE Labs, CR