All of lore.kernel.org
 help / color / mirror / Atom feed
From: Chandan Babu R <chandan.babu@oracle.com>
To: Dave Chinner <david@fromorbit.com>
Cc: linux-xfs@vger.kernel.org
Subject: Re: [PATCH 3/7] xfs: xfs_ail_push_all_sync() stalls when racing with updates
Date: Tue, 15 Mar 2022 20:44:27 +0530	[thread overview]
Message-ID: <8735jj459o.fsf@debian-BULLSEYE-live-builder-AMD64> (raw)
In-Reply-To: <20220315064241.3133751-4-david@fromorbit.com>

On 15 Mar 2022 at 12:12, Dave Chinner wrote:
> From: Dave Chinner <dchinner@redhat.com>
>
> xfs_ail_push_all_sync() has a loop like this:
>
> while max_ail_lsn {
> 	prepare_to_wait(ail_empty)
> 	target = max_ail_lsn
> 	wake_up(ail_task);
> 	schedule()
> }
>
> Which is designed to sleep until the AIL is emptied. When
> xfs_ail_finish_update() moves the tail of the log, it does:

... xfs_ail_update_finish() ...

>
> 	if (list_empty(&ailp->ail_head))
> 		wake_up_all(&ailp->ail_empty);
>
> So it will only wake up the sync push waiter when the AIL goes
> empty. If, by the time the push waiter has woken, the AIL has more
> in it, it will reset the target, wake the push task and go back to
> sleep.
>
> The problem here is that if the AIL is having items added to it
> when xfs_ail_push_all_sync() is called, then they may get inserted
> into the AIL at a LSN higher than the target LSN. At this point,
> xfsaild_push() will see that the target is X, the item LSNs are
> (X+N) and skip over them, hence never pushing the out.
>
> The result of this the AIL will not get emptied by the AIL push
> thread, hence xfs_ail_finish_update() will never see the AIL being
> empty even if it moves the tail. Hence xfs_ail_push_all_sync() never
> gets woken and hence cannot update the push target to capture the
> items beyond the current target on the LSN.
>
> This is a TOCTOU type of issue so the way to avoid it is to not
> use the push target at all for sync pushes. We know that a sync push
> is being requested by the fact the ail_empty wait queue is active,
> hence the xfsaild can just set the target to max_ail_lsn on every
> push that we see the wait queue active. Hence we no longer will
> leave items on the AIL that are beyond the LSN sampled at the start
> of a sync push.
>

The fix seems to logically correct.

Reviewed-by: Chandan Babu R <chandan.babu@oracle.com>

> Signed-off-by: Dave Chinner <dchinner@redhat.com>
> ---
>  fs/xfs/xfs_trans_ail.c | 21 ++++++++++++++++-----
>  1 file changed, 16 insertions(+), 5 deletions(-)
>
> diff --git a/fs/xfs/xfs_trans_ail.c b/fs/xfs/xfs_trans_ail.c
> index 2a8c8dc54c95..1b52952097c1 100644
> --- a/fs/xfs/xfs_trans_ail.c
> +++ b/fs/xfs/xfs_trans_ail.c
> @@ -448,10 +448,22 @@ xfsaild_push(
>  
>  	spin_lock(&ailp->ail_lock);
>  
> -	/* barrier matches the ail_target update in xfs_ail_push() */
> -	smp_rmb();
> -	target = ailp->ail_target;
> -	ailp->ail_target_prev = target;
> +	/*
> +	 * If we have a sync push waiter, we always have to push till the AIL is
> +	 * empty. Update the target to point to the end of the AIL so that
> +	 * capture updates that occur after the sync push waiter has gone to
> +	 * sleep.
> +	 */
> +	if (waitqueue_active(&ailp->ail_empty)) {
> +		lip = xfs_ail_max(ailp);
> +		if (lip)
> +			target = lip->li_lsn;
> +	} else {
> +		/* barrier matches the ail_target update in xfs_ail_push() */
> +		smp_rmb();
> +		target = ailp->ail_target;
> +		ailp->ail_target_prev = target;
> +	}
>  
>  	/* we're done if the AIL is empty or our push has reached the end */
>  	lip = xfs_trans_ail_cursor_first(ailp, &cur, ailp->ail_last_pushed_lsn);
> @@ -724,7 +736,6 @@ xfs_ail_push_all_sync(
>  	spin_lock(&ailp->ail_lock);
>  	while ((lip = xfs_ail_max(ailp)) != NULL) {
>  		prepare_to_wait(&ailp->ail_empty, &wait, TASK_UNINTERRUPTIBLE);
> -		ailp->ail_target = lip->li_lsn;
>  		wake_up_process(ailp->ail_task);
>  		spin_unlock(&ailp->ail_lock);
>  		schedule();


-- 
chandan

  reply	other threads:[~2022-03-15 15:14 UTC|newest]

Thread overview: 33+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-03-15  6:42 [PATCH 0/7 v3] xfs: log recovery fixes Dave Chinner
2022-03-15  6:42 ` [PATCH 1/7] xfs: log worker needs to start before intent/unlink recovery Dave Chinner
2022-03-15  9:14   ` Chandan Babu R
2022-03-15  6:42 ` [PATCH 2/7] xfs: check buffer pin state after locking in delwri_submit Dave Chinner
2022-03-15 10:04   ` Chandan Babu R
2022-03-15 19:13   ` Darrick J. Wong
2022-03-15 21:11     ` Dave Chinner
2022-03-15 22:42       ` Darrick J. Wong
2022-03-15  6:42 ` [PATCH 3/7] xfs: xfs_ail_push_all_sync() stalls when racing with updates Dave Chinner
2022-03-15 15:14   ` Chandan Babu R [this message]
2022-03-15 19:17   ` Darrick J. Wong
2022-03-15 21:29     ` Dave Chinner
2022-03-15  6:42 ` [PATCH 4/7] xfs: async CIL flushes need pending pushes to be made stable Dave Chinner
2022-03-15 19:36   ` Darrick J. Wong
2022-03-15 21:47     ` Dave Chinner
2022-03-16  2:00       ` Darrick J. Wong
2022-03-16 10:34   ` Chandan Babu R
2022-03-16 23:24     ` Dave Chinner
2022-03-17  6:49       ` Chandan Babu R
2022-03-15  6:42 ` [PATCH 5/7] xfs: log items should have a xlog pointer, not a mount Dave Chinner
2022-03-15 19:37   ` Darrick J. Wong
2022-03-16 11:06   ` Chandan Babu R
2022-03-15  6:42 ` [PATCH 6/7] xfs: AIL should be log centric Dave Chinner
2022-03-15 19:39   ` Darrick J. Wong
2022-03-16 11:12   ` Chandan Babu R
2022-03-15  6:42 ` [PATCH 7/7] xfs: xfs_is_shutdown vs xlog_is_shutdown cage fight Dave Chinner
2022-03-15 20:03   ` Darrick J. Wong
2022-03-15 22:20     ` Dave Chinner
2022-03-16  1:22       ` Darrick J. Wong
2022-03-17  5:39 [PATCH 0/7 v4] xfs: log recovery fixes Dave Chinner
2022-03-17  5:39 ` [PATCH 3/7] xfs: xfs_ail_push_all_sync() stalls when racing with updates Dave Chinner
2022-03-17 14:30 kernel test robot
2022-03-18 10:08 kernel test robot
2022-03-20 15:17 ` kernel test robot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=8735jj459o.fsf@debian-BULLSEYE-live-builder-AMD64 \
    --to=chandan.babu@oracle.com \
    --cc=david@fromorbit.com \
    --cc=linux-xfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.