All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Darrick J. Wong" <djwong@kernel.org>
To: Dave Chinner <david@fromorbit.com>
Cc: linux-xfs@vger.kernel.org
Subject: Re: [PATCH 2/7] xfs: check buffer pin state after locking in delwri_submit
Date: Tue, 15 Mar 2022 12:13:20 -0700	[thread overview]
Message-ID: <20220315191320.GG8241@magnolia> (raw)
In-Reply-To: <20220315064241.3133751-3-david@fromorbit.com>

On Tue, Mar 15, 2022 at 05:42:36PM +1100, Dave Chinner wrote:
> From: Dave Chinner <dchinner@redhat.com>
> 
> AIL flushing can get stuck here:
> 
> [316649.005769] INFO: task xfsaild/pmem1:324525 blocked for more than 123 seconds.
> [316649.007807]       Not tainted 5.17.0-rc6-dgc+ #975
> [316649.009186] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [316649.011720] task:xfsaild/pmem1   state:D stack:14544 pid:324525 ppid:     2 flags:0x00004000
> [316649.014112] Call Trace:
> [316649.014841]  <TASK>
> [316649.015492]  __schedule+0x30d/0x9e0
> [316649.017745]  schedule+0x55/0xd0
> [316649.018681]  io_schedule+0x4b/0x80
> [316649.019683]  xfs_buf_wait_unpin+0x9e/0xf0
> [316649.021850]  __xfs_buf_submit+0x14a/0x230
> [316649.023033]  xfs_buf_delwri_submit_buffers+0x107/0x280
> [316649.024511]  xfs_buf_delwri_submit_nowait+0x10/0x20
> [316649.025931]  xfsaild+0x27e/0x9d0
> [316649.028283]  kthread+0xf6/0x120
> [316649.030602]  ret_from_fork+0x1f/0x30
> 
> in the situation where flushing gets preempted between the unpin
> check and the buffer trylock under nowait conditions:
> 
> 	blk_start_plug(&plug);
> 	list_for_each_entry_safe(bp, n, buffer_list, b_list) {
> 		if (!wait_list) {
> 			if (xfs_buf_ispinned(bp)) {
> 				pinned++;
> 				continue;
> 			}
> Here >>>>>>
> 			if (!xfs_buf_trylock(bp))
> 				continue;
> 
> This means submission is stuck until something else triggers a log
> force to unpin the buffer.
> 
> To get onto the delwri list to begin with, the buffer pin state has
> already been checked, and hence it's relatively rare we get a race
> between flushing and encountering a pinned buffer in delwri
> submission to begin with. Further, to increase the pin count the
> buffer has to be locked, so the only way we can hit this race
> without failing the trylock is to be preempted between the pincount
> check seeing zero and the trylock being run.
> 
> Hence to avoid this problem, just invert the order of trylock vs
> pin check. We shouldn't hit that many pinned buffers here, so
> optimising away the trylock for pinned buffers should not matter for
> performance at all.
> 
> Signed-off-by: Dave Chinner <dchinner@redhat.com>
> ---
>  fs/xfs/xfs_buf.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c
> index b45e0d50a405..8867f143598e 100644
> --- a/fs/xfs/xfs_buf.c
> +++ b/fs/xfs/xfs_buf.c
> @@ -2094,12 +2094,13 @@ xfs_buf_delwri_submit_buffers(
>  	blk_start_plug(&plug);
>  	list_for_each_entry_safe(bp, n, buffer_list, b_list) {
>  		if (!wait_list) {
> +			if (!xfs_buf_trylock(bp))
 +				continue;
>  			if (xfs_buf_ispinned(bp)) {
> +				xfs_buf_unlock(bp);
>  				pinned++;
>  				continue;

Hmm.  So I think this means that this function willl skip buffers that
are locked or pinned.  The only way that the AIL would encounter this
situation is when a buffer on its list is now locked by a reader thread
or is participating in a transaction.  In the reader case this is (one
hopes) ok because the reader won't block on the AIL.

The tx case is trickier -- transaction allocation can result in an AIL
push if the head is too close to the tail, right?  Ordinarily, the AIL
won't find itself unable to write a buffer that's pinning the log
because a transaction holds that buffer -- eventually that tx should
commit, which will unlock the buffer and allow the AIL to make some
progress.

But -- what if the frontend is running a chained transaction, and it
bjoin'd the buffer to the transaction, tried to roll the transaction,
and the chain runs out of permanent log reservation (because we've
rolled more than logcount times) and we have to wait for more log grant
space?  The regrant for the successor tx happens before the commit of
the old tx, so can we livelock the log in this way?

And doesn't this potential exist regardless of this patch?

I suspect the answers are 'yes' and 'yes', which means this patch is ok
to move forward, but this has been bugging me since the V1 of this
patch, which is where I got stuck. :/

--D

>  			}
> -			if (!xfs_buf_trylock(bp))
> -				continue;
>  		} else {
>  			xfs_buf_lock(bp);
>  		}
> -- 
> 2.35.1
> 

  parent reply	other threads:[~2022-03-15 19:13 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-03-15  6:42 [PATCH 0/7 v3] xfs: log recovery fixes Dave Chinner
2022-03-15  6:42 ` [PATCH 1/7] xfs: log worker needs to start before intent/unlink recovery Dave Chinner
2022-03-15  9:14   ` Chandan Babu R
2022-03-15  6:42 ` [PATCH 2/7] xfs: check buffer pin state after locking in delwri_submit Dave Chinner
2022-03-15 10:04   ` Chandan Babu R
2022-03-15 19:13   ` Darrick J. Wong [this message]
2022-03-15 21:11     ` Dave Chinner
2022-03-15 22:42       ` Darrick J. Wong
2022-03-15  6:42 ` [PATCH 3/7] xfs: xfs_ail_push_all_sync() stalls when racing with updates Dave Chinner
2022-03-15 15:14   ` Chandan Babu R
2022-03-15 19:17   ` Darrick J. Wong
2022-03-15 21:29     ` Dave Chinner
2022-03-15  6:42 ` [PATCH 4/7] xfs: async CIL flushes need pending pushes to be made stable Dave Chinner
2022-03-15 19:36   ` Darrick J. Wong
2022-03-15 21:47     ` Dave Chinner
2022-03-16  2:00       ` Darrick J. Wong
2022-03-16 10:34   ` Chandan Babu R
2022-03-16 23:24     ` Dave Chinner
2022-03-17  6:49       ` Chandan Babu R
2022-03-15  6:42 ` [PATCH 5/7] xfs: log items should have a xlog pointer, not a mount Dave Chinner
2022-03-15 19:37   ` Darrick J. Wong
2022-03-16 11:06   ` Chandan Babu R
2022-03-15  6:42 ` [PATCH 6/7] xfs: AIL should be log centric Dave Chinner
2022-03-15 19:39   ` Darrick J. Wong
2022-03-16 11:12   ` Chandan Babu R
2022-03-15  6:42 ` [PATCH 7/7] xfs: xfs_is_shutdown vs xlog_is_shutdown cage fight Dave Chinner
2022-03-15 20:03   ` Darrick J. Wong
2022-03-15 22:20     ` Dave Chinner
2022-03-16  1:22       ` Darrick J. Wong
2022-03-17  5:39 [PATCH 0/7 v4] xfs: log recovery fixes Dave Chinner
2022-03-17  5:39 ` [PATCH 2/7] xfs: check buffer pin state after locking in delwri_submit Dave Chinner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220315191320.GG8241@magnolia \
    --to=djwong@kernel.org \
    --cc=david@fromorbit.com \
    --cc=linux-xfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.