From: Jan Kara <jack@suse.cz>
To: Hillf Danton <hdanton@sina.com>
Cc: Martijn Coenen <maco@android.com>,
Al Viro <viro@zeniv.linux.org.uk>, Jan Kara <jack@suse.cz>,
Jens Axboe <axboe@kernel.dk>,
miklos@szeredi.hu, tj@kernel.org, linux-fsdevel@vger.kernel.org,
LKML <linux-kernel@vger.kernel.org>,
android-storage-core@google.com, kernel-team@android.com
Subject: Re: Writeback bug causing writeback stalls
Date: Mon, 25 May 2020 09:38:35 +0200 [thread overview]
Message-ID: <20200525073835.GJ14199@quack2.suse.cz> (raw)
In-Reply-To: <20200524140522.14196-1-hdanton@sina.com>
On Sun 24-05-20 22:05:22, Hillf Danton wrote:
>
> On Fri, 22 May 2020 11:57:42 +0200 Martijn Coenen wrote:
> >
> > So, the sequence of events is something like this. Let's assume the inode is
> > already on b_dirty_time for valid reasons. Then:
> >
> > CPU1 CPU2
> > fuse_flush()
> > write_inode_now()
> > writeback_single_inode()
> > sets I_SYNC
> > __writeback_single_inode()
> > writes back data
> > clears inode dirty flags
> > unlocks inode
> > calls mark_inode_dirty_sync()
> > sets I_DIRTY_SYNC, but doesn't
> > update wb list because I_SYNC is
> > still set
> > write() // somebody else writes
> > mark_inode_dirty(I_DIRTY_PAGES)
> > sets I_DIRTY_PAGES on i_state
> > doesn't update wb list,
> > because I_SYNC set
> > locks inode again
> > sees inode is still dirty,
> > doesn't touch WB list
> > clears I_SYNC
> >
> > So now we have an inode on b_dirty_time with I_DIRTY_PAGES | I_DIRTY_SYNC set,
> > and subsequent calls to mark_inode_dirty() with either I_DIRTY_PAGES or
> > I_DIRTY_SYNC will do nothing to change that. The flusher won't touch
> > the inode either, because it's not on a b_dirty or b_io list.
Hi Hillf,
> Based on the above analysis, check of I_DIRTY_TIME is added before and
> after calling __writeback_single_inode() to detect the case you reported.
>
> If a dirty inode is not on the right io list after writeback, we can
> move it to a new one; and we can do that as we are the I_SYNC owner.
>
> While changing its io list, the inode's dirty timestamp is also updated
> to the current tick as does in __mark_inode_dirty().
Apparently you didn't read my reply to Martinj because what you did in this
patch is exactly what I described that we cannot do because that can cause
sync(2) to miss inodes and thus break its data integrity guarantees. So we
have to come up with a different solution.
Honza
> --- a/fs/fs-writeback.c
> +++ b/fs/fs-writeback.c
> @@ -1528,6 +1528,7 @@ static int writeback_single_inode(struct
> struct writeback_control *wbc)
> {
> struct bdi_writeback *wb;
> + bool dt;
> int ret = 0;
>
> spin_lock(&inode->i_lock);
> @@ -1560,6 +1561,7 @@ static int writeback_single_inode(struct
> !mapping_tagged(inode->i_mapping, PAGECACHE_TAG_WRITEBACK)))
> goto out;
> inode->i_state |= I_SYNC;
> + dt = inode->i_state & I_DIRTY_TIME;
> wbc_attach_and_unlock_inode(wbc, inode);
>
> ret = __writeback_single_inode(inode, wbc);
> @@ -1574,6 +1576,14 @@ static int writeback_single_inode(struct
> */
> if (!(inode->i_state & I_DIRTY_ALL))
> inode_io_list_del_locked(inode, wb);
> + else if (!(inode->i_state & I_DIRTY_TIME) && dt) {
> + /*
> + * We can correct inode's io list, however, by moving it to
> + * b_dirty from b_dirty_time as we are the I_SYNC owner
> + */
> + inode->dirtied_when = jiffies;
> + inode_io_list_move_locked(inode, wb, &wb->b_dirty);
> + }
> spin_unlock(&wb->list_lock);
> inode_sync_complete(inode);
> out:
> --
>
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
prev parent reply other threads:[~2020-05-25 7:45 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-05-22 9:57 Writeback bug causing writeback stalls Martijn Coenen
2020-05-22 14:41 ` Jan Kara
2020-05-22 15:23 ` Martijn Coenen
2020-05-22 15:36 ` Jan Kara
2020-05-23 8:15 ` Martijn Coenen
2020-05-25 7:31 ` Jan Kara
2020-05-27 8:14 ` Martijn Coenen
2020-05-29 15:20 ` Jan Kara
2020-05-29 19:37 ` Martijn Coenen
2020-06-01 9:09 ` Jan Kara
2020-06-02 12:16 ` Martijn Coenen
[not found] ` <20200524140522.14196-1-hdanton@sina.com>
2020-05-25 7:38 ` Jan Kara [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20200525073835.GJ14199@quack2.suse.cz \
--to=jack@suse.cz \
--cc=android-storage-core@google.com \
--cc=axboe@kernel.dk \
--cc=hdanton@sina.com \
--cc=kernel-team@android.com \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=maco@android.com \
--cc=miklos@szeredi.hu \
--cc=tj@kernel.org \
--cc=viro@zeniv.linux.org.uk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).