From: Dave Chinner <david@fromorbit.com>
To: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: linux-xfs@vger.kernel.org
Subject: Re: [PATCH 1/2] xfs: log new intent items created as part of finishing recovered intent items
Date: Thu, 17 Sep 2020 14:58:56 +1000
Message-ID: <20200917045856.GD12131@dread.disaster.area>
In-Reply-To: <160031332982.3624373.6230830770363563010.stgit@magnolia>

On Wed, Sep 16, 2020 at 08:28:49PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> During a code inspection, I found a serious bug in the log intent item
> recovery code when an intent item cannot complete all the work and
> decides to requeue itself to get that done.  When this happens, the
> item recovery creates a new incore deferred op representing the
> remaining work and attaches it to the transaction that it allocated.  At
> the end of _item_recover, it moves the entire chain of deferred ops to
> the dummy parent_tp that xlog_recover_process_intents passed to it, but
> fails to log a new intent item for the remaining work before committing
> the transaction for the single unit of work.
> 
> xlog_finish_defer_ops logs those new intent items once recovery has
> finished dealing with the intent items that it recovered, but this isn't
> sufficient.  If the log is forced to disk after a recovered log item
> decides to requeue itself and the system goes down before we call
> xlog_finish_defer_ops, the second log recovery will never see the new
> intent item and therefore has no idea that there was more work to do.
> It will finish recovery leaving the filesystem in a corrupted state.
> 
> The same logic applies to /any/ deferred ops added during intent item
> recovery, not just the one handling the remaining work.

Yup, that looks like a problem.
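
To make the failure window concrete, here is a minimal toy model in C.
It is a sketch, not kernel code: every name in it (toy_log,
toy_recovery, toy_item_recover, toy_crash_after_first_item) is
hypothetical, and the "log" is reduced to a count of intents that have
no matching done item. It assumes the crash happens right after the
first recovered item commits and before xlog_finish_defer_ops would
have run.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: what survives a crash is the on-disk log. */
struct toy_log {
	int	pending_intents;	/* intents with no matching done item */
};

/* Toy model only: state that lives in memory and is lost on a crash. */
struct toy_recovery {
	int	incore_deferred;	/* deferred ops known only in memory */
};

/*
 * Recover one intent that cannot finish all of its work.  The finished
 * unit of work is committed (cancelling the recovered intent), and the
 * remainder is requeued as a new incore deferred op.  If log_new_intent
 * is true, an intent for the remainder is logged in that same commit.
 */
void
toy_item_recover(
	struct toy_log		*log,
	struct toy_recovery	*rec,
	bool			log_new_intent)
{
	assert(log->pending_intents > 0);

	log->pending_intents--;		/* done item for the recovered intent */
	rec->incore_deferred++;		/* remaining work, memory only so far */
	if (log_new_intent)
		log->pending_intents++;	/* remaining work is now in the log */
}

/*
 * Crash right after the first item commits.  Returns true if a second
 * log recovery would still find an intent describing the leftover work.
 */
bool
toy_crash_after_first_item(
	bool			log_new_intent)
{
	struct toy_log		log = { .pending_intents = 1 };
	struct toy_recovery	rec = { 0 };

	toy_item_recover(&log, &rec, log_new_intent);

	/* Crash: the incore deferred op chain is gone. */
	rec.incore_deferred = 0;

	return log.pending_intents > 0;
}
```

With log_new_intent set to false the function returns false, which
corresponds to the second recovery having no idea there was more work
to do. With it set to true the leftover work is still visible in the
log after the crash.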

> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---
>  fs/xfs/libxfs/xfs_defer.c  |   26 ++++++++++++++++++++++++--
>  fs/xfs/libxfs/xfs_defer.h  |    6 ++++++
>  fs/xfs/xfs_bmap_item.c     |    2 +-
>  fs/xfs/xfs_refcount_item.c |    2 +-
>  4 files changed, 32 insertions(+), 4 deletions(-)
> 
> 
> diff --git a/fs/xfs/libxfs/xfs_defer.c b/fs/xfs/libxfs/xfs_defer.c
> index d8f586256add..29e9762f3b77 100644
> --- a/fs/xfs/libxfs/xfs_defer.c
> +++ b/fs/xfs/libxfs/xfs_defer.c
> @@ -186,8 +186,9 @@ xfs_defer_create_intent(
>  {
>  	const struct xfs_defer_op_type	*ops = defer_op_types[dfp->dfp_type];
>  
> -	dfp->dfp_intent = ops->create_intent(tp, &dfp->dfp_work,
> -			dfp->dfp_count, sort);
> +	if (!dfp->dfp_intent)
> +		dfp->dfp_intent = ops->create_intent(tp, &dfp->dfp_work,
> +						     dfp->dfp_count, sort);
>  }
>  
>  /*
> @@ -390,6 +391,7 @@ xfs_defer_finish_one(
>  			list_add(li, &dfp->dfp_work);
>  			dfp->dfp_count++;
>  			dfp->dfp_done = NULL;
> +			dfp->dfp_intent = NULL;
>  			xfs_defer_create_intent(tp, dfp, false);
>  		}
>  
> @@ -552,3 +554,23 @@ xfs_defer_move(
>  
>  	xfs_defer_reset(stp);
>  }
> +
> +/*
> + * Prepare a chain of fresh deferred ops work items to be completed later.  Log
> + * recovery requires the ability to put off until later the actual finishing
> + * work so that it can process unfinished items recovered from the log in
> + * correct order.
> + *
> + * Create and log intent items for all the work that we're capturing so that we
> + * can be assured that the items will get replayed if the system goes down
> + * before log recovery gets a chance to finish the work it put off.  Then we
> + * move the chain from stp to dtp.
> + */
> +void
> +xfs_defer_capture(
> +	struct xfs_trans	*dtp,
> +	struct xfs_trans	*stp)
> +{
> +	xfs_defer_create_intents(stp);
> +	xfs_defer_move(dtp, stp);
> +}

Not sold on the "capture" name, but it'll do for now.
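
One way to read the two xfs_defer.c hunks together: the new
"if (!dfp->dfp_intent)" check makes intent creation a no-op for a work
item that already has an intent, so intents logged early at capture
time are not logged a second time later, while the requeue path clears
dfp_intent first so that the remaining work always gets a fresh intent.
The sketch below models only that behaviour. It is an illustration
under those assumptions, not the kernel implementation, and toy_dfp,
toy_create_intent, toy_requeue and toy_intents_logged are hypothetical
names.

```c
#include <assert.h>

/* Toy stand-in for a pending deferred op; 0 means "no intent yet". */
struct toy_dfp {
	int	intent_id;
};

int	toy_next_intent_id = 1;
int	toy_intents_logged;	/* how many intents reached the "log" */

/* Log an intent only if this work item does not already have one. */
void
toy_create_intent(
	struct toy_dfp	*dfp)
{
	assert(dfp != 0);

	if (!dfp->intent_id) {
		dfp->intent_id = toy_next_intent_id++;
		toy_intents_logged++;
	}
}

/*
 * Requeue path: the old intent has been consumed, so forget it and
 * log a new intent covering the remaining work.
 */
void
toy_requeue(
	struct toy_dfp	*dfp)
{
	dfp->intent_id = 0;
	toy_create_intent(dfp);
}
```

Calling toy_create_intent() twice on the same item logs one intent, and
a following toy_requeue() logs a second one with a new id.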

Reviewed-by: Dave Chinner <dchinner@redhat.com>

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
