linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jan Kara <jack@suse.cz>
To: Hillf Danton <hdanton@sina.com>
Cc: Martijn Coenen <maco@android.com>,
	Al Viro <viro@zeniv.linux.org.uk>, Jan Kara <jack@suse.cz>,
	Jens Axboe <axboe@kernel.dk>,
	miklos@szeredi.hu, tj@kernel.org, linux-fsdevel@vger.kernel.org,
	LKML <linux-kernel@vger.kernel.org>,
	android-storage-core@google.com, kernel-team@android.com
Subject: Re: Writeback bug causing writeback stalls
Date: Mon, 25 May 2020 09:38:35 +0200	[thread overview]
Message-ID: <20200525073835.GJ14199@quack2.suse.cz> (raw)
In-Reply-To: <20200524140522.14196-1-hdanton@sina.com>

On Sun 24-05-20 22:05:22, Hillf Danton wrote:
> 
> On Fri, 22 May 2020 11:57:42 +0200 Martijn Coenen wrote:
> > 
> > So, the sequence of events is something like this. Let's assume the inode is
> > already on b_dirty_time for valid reasons. Then:
> > 
> > CPU1                                          CPU2
> > fuse_flush()
> >   write_inode_now()
> >     writeback_single_inode()
> >       sets I_SYNC
> >         __writeback_single_inode()
> >           writes back data
> >           clears inode dirty flags
> >           unlocks inode
> >           calls mark_inode_dirty_sync()
> >             sets I_DIRTY_SYNC, but doesn't
> >             update wb list because I_SYNC is
> >             still set
> >                                               write() // somebody else writes
> >                                               mark_inode_dirty(I_DIRTY_PAGES)
> >                                               sets I_DIRTY_PAGES on i_state
> >                                               doesn't update wb list,
> >                                               because I_SYNC set
> >       locks inode again
> >       sees inode is still dirty,
> >       doesn't touch WB list
> >       clears I_SYNC
> > 
> > So now we have an inode on b_dirty_time with I_DIRTY_PAGES | I_DIRTY_SYNC set,
> > and subsequent calls to mark_inode_dirty() with either I_DIRTY_PAGES or
> > I_DIRTY_SYNC will do nothing to change that. The flusher won't touch
> > the inode either, because it's not on a b_dirty or b_io list.

Hi Hillf,

> Based on the above analysis, check of I_DIRTY_TIME is added before and
> after calling __writeback_single_inode() to detect the case you reported.
> 
> If a dirty inode is not on the right io list after writeback, we can
> move it to a new one; and we can do that as we are the I_SYNC owner.
> 
> While changing its io list, the inode's dirty timestamp is also updated
> to the current tick as does in __mark_inode_dirty().

Apparently you didn't read my reply to Martinj because what you did in this
patch is exactly what I described that we cannot do because that can cause
sync(2) to miss inodes and thus break its data integrity guarantees. So we
have to come up with a different solution.

								Honza

> --- a/fs/fs-writeback.c
> +++ b/fs/fs-writeback.c
> @@ -1528,6 +1528,7 @@ static int writeback_single_inode(struct
>  				  struct writeback_control *wbc)
>  {
>  	struct bdi_writeback *wb;
> +	bool dt;
>  	int ret = 0;
>  
>  	spin_lock(&inode->i_lock);
> @@ -1560,6 +1561,7 @@ static int writeback_single_inode(struct
>  	     !mapping_tagged(inode->i_mapping, PAGECACHE_TAG_WRITEBACK)))
>  		goto out;
>  	inode->i_state |= I_SYNC;
> +	dt = inode->i_state & I_DIRTY_TIME;
>  	wbc_attach_and_unlock_inode(wbc, inode);
>  
>  	ret = __writeback_single_inode(inode, wbc);
> @@ -1574,6 +1576,14 @@ static int writeback_single_inode(struct
>  	 */
>  	if (!(inode->i_state & I_DIRTY_ALL))
>  		inode_io_list_del_locked(inode, wb);
> +	else if (!(inode->i_state & I_DIRTY_TIME) && dt) {
> +		/*
> +		 * We can correct inode's io list, however, by moving it to
> +		 * b_dirty from b_dirty_time as we are the I_SYNC owner
> +		 */
> +		inode->dirtied_when = jiffies;
> +		inode_io_list_move_locked(inode, wb, &wb->b_dirty);
> +	}
>  	spin_unlock(&wb->list_lock);
>  	inode_sync_complete(inode);
>  out:
> --
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

      parent reply	other threads:[~2020-05-25  7:45 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-05-22  9:57 Writeback bug causing writeback stalls Martijn Coenen
2020-05-22 14:41 ` Jan Kara
2020-05-22 15:23   ` Martijn Coenen
2020-05-22 15:36     ` Jan Kara
2020-05-23  8:15       ` Martijn Coenen
2020-05-25  7:31         ` Jan Kara
2020-05-27  8:14           ` Martijn Coenen
2020-05-29 15:20             ` Jan Kara
2020-05-29 19:37               ` Martijn Coenen
2020-06-01  9:09                 ` Jan Kara
2020-06-02 12:16                   ` Martijn Coenen
     [not found] ` <20200524140522.14196-1-hdanton@sina.com>
2020-05-25  7:38   ` Jan Kara [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200525073835.GJ14199@quack2.suse.cz \
    --to=jack@suse.cz \
    --cc=android-storage-core@google.com \
    --cc=axboe@kernel.dk \
    --cc=hdanton@sina.com \
    --cc=kernel-team@android.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=maco@android.com \
    --cc=miklos@szeredi.hu \
    --cc=tj@kernel.org \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).