linux-f2fs-devel.lists.sourceforge.net archive mirror
From: Sahitya Tummala <stummala@codeaurora.org>
To: Chao Yu <yuchao0@huawei.com>
Cc: Jaegeuk Kim <jaegeuk@kernel.org>, linux-f2fs-devel@lists.sourceforge.net
Subject: Re: [f2fs-dev] IO hang due to f2fs checkpoint and writeback stuck
Date: Fri, 10 Jul 2020 09:09:05 +0530
Message-ID: <20200710033905.GE2916@codeaurora.org>
In-Reply-To: <dcb68985-d621-6ef1-7452-172280148aa1@huawei.com>

Hi Chao,

On Fri, Jul 10, 2020 at 10:54:13AM +0800, Chao Yu wrote:
> Hi Sahitya,
> 
> It looks block plug has already been removed by Jaegeuk with
> below commit:
> 
> commit 1f5f11a3c41e2b23288b2769435a00f74e02496b
> Author: Jaegeuk Kim <jaegeuk@kernel.org>
> Date:   Fri May 8 12:25:45 2020 -0700
> 
>     f2fs: remove blk_plugging in block_operations
> 
>     blk_plugging doesn't seem to give any benefit.
> 
> How about backporting this patch?

Yes, I have noticed that patch. But we have nested plug lists in
the block_operations() path:
1. At the start of block_operations()
2. Inside __f2fs_write_data_pages(), which gets called from
   f2fs_sync_dirty_inodes()->filemap_fdatawrite()

Hence, I was not sure whether that patch alone can help.

Do you know a possible path for this issue scenario to happen?
Where in the CP path, even before f2fs_sync_node_pages() is done, can
node pages be submitted for IO and get attached to the CP plug list?
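
To illustrate the nesting concern, here is a toy userspace model of
per-task plugging. It is only a sketch, not kernel code: the toy_*
names and structures are hypothetical, and it assumes (as I understand
mainline behaviour) that a nested blk_start_plug() leaves the outer
plug in place and that blk_finish_plug() on a plug that is not the
active one does nothing.

```c
/* Toy userspace model of per-task block plugging; NOT kernel code. */
#include <assert.h>
#include <stddef.h>

#define MAX_PENDING 16

struct toy_plug {
	int pending[MAX_PENDING];	/* ids of bios held back on this plug */
	size_t nr_pending;
};

struct toy_task {
	struct toy_plug *plug;		/* active (outermost) plug, or NULL */
	size_t nr_submitted;		/* bios that reached the "device" */
};

/* Models blk_start_plug(): a nested call leaves the outer plug active. */
static void toy_start_plug(struct toy_task *t, struct toy_plug *plug)
{
	plug->nr_pending = 0;
	if (t->plug)
		return;
	t->plug = plug;
}

/* Models blk_flush_plug(): push everything on the active plug out. */
static void toy_flush_plug(struct toy_task *t)
{
	if (!t->plug)
		return;
	t->nr_submitted += t->plug->nr_pending;
	t->plug->nr_pending = 0;
}

/* Models blk_finish_plug(): only the active plug flushes and detaches. */
static void toy_finish_plug(struct toy_task *t, struct toy_plug *plug)
{
	if (plug != t->plug)
		return;
	toy_flush_plug(t);
	t->plug = NULL;
}

/* Models submit_bio(): queue on the active plug if there is one. */
static void toy_submit_bio(struct toy_task *t, int bio_id)
{
	if (t->plug) {
		assert(t->plug->nr_pending < MAX_PENDING);
		t->plug->pending[t->plug->nr_pending++] = bio_id;
		return;
	}
	t->nr_submitted++;
}
```

In this model, if the outer plug is started (block_operations()), an
inner plug is started and finished around a submitted bio
(__f2fs_write_data_pages()), the bio is still held on the outer plug
after the inner finish. It reaches the device only when the outer plug
is finished or toy_flush_plug() is called explicitly.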

Thanks,

> 
> On 2020/7/10 10:30, Sahitya Tummala wrote:
> > Hi Chao, Jaegeuk,
> > 
> > I have received an issue report indicating that the system is stuck
> > on IO because the f2fs checkpoint and writeback threads are waiting on
> > each other, as explained below.
> > 
> > WB thread -
> > ----------
> > 
> > io_schedule
> > wait_on_page_bit
> > f2fs_wait_on_page_writeback -> It is waiting for
> > 			node page writeback whose bio is in the
> > 			plug list of the CP thread below.
> > f2fs_update_data_blkaddr
> > f2fs_outplace_write_data
> > f2fs_do_write_data_page
> > __write_data_page
> > __f2fs_write_data_pages
> > f2fs_write_data_pages
> > do_writepages
> > 
> > CP thread -
> > -----------
> > 
> > __f2fs_write_data_pages -> It is for the same inode as above, which is under WB (and
> > 	is waiting for node page writeback). In this context, there is nothing to
> > 	be written, as the data is already under WB.
> > filemap_fdatawrite
> > f2fs_sync_dirty_inodes -> It just loops here in f2fs_sync_dirty_inodes() until
> > 			f2fs_remove_dirty_inode() has been done by the WB thread above.
> > block_operations
> > f2fs_write_checkpoint
> > 
> > The CP thread somehow has the node page bio in its plug list, which cannot be submitted
> > until the end of block_operations(), and the CP thread is blocked on WB of an inode that is
> > in turn waiting for the IO pending in the CP plug list. Both stacks are stuck waiting for each other.
> > 
> > The patch below helped to solve the issue; please review and suggest whether it seems
> > okay. Since we are doing cond_resched() anyway, I thought it would be good to flush
> > the plug list as well (in this issue case, it loops for the same inode again and again).
> > 
> > diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c
> > index e460d90..152df48 100644
> > --- a/fs/f2fs/checkpoint.c
> > +++ b/fs/f2fs/checkpoint.c
> > @@ -1071,10 +1071,12 @@ int f2fs_sync_dirty_inodes(struct f2fs_sb_info *sbi, enum inode_type type)
> > 
> >                 iput(inode);
> >                 /* We need to give cpu to another writers. */
> > -               if (ino == cur_ino)
> > +               if (ino == cur_ino) {
> > +                       blk_flush_plug(current);
> >                         cond_resched();
> > -               else
> > +               } else {
> >                         ino = cur_ino;
> > +                }
> >         } else {
> >                 /*
> >                  * We should submit bio, since it exists several
> > 
> > Thanks,
> > 

-- 
Sent by a consultant of the Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum.

