From: "Darrick J. Wong" <darrick.wong@oracle.com>
To: Dave Chinner <david@fromorbit.com>
Cc: linux-xfs@vger.kernel.org
Subject: Re: [PATCH 1/2] xfs: transactionless xfs_bunmapi shouldn't do format conversion
Date: Mon, 18 Jun 2018 21:54:05 -0700
Message-ID: <20180619045405.GI8128@magnolia>
In-Reply-To: <20180619024128.22669-2-david@fromorbit.com>

On Tue, Jun 19, 2018 at 12:41:27PM +1000, Dave Chinner wrote:
> From: Dave Chinner <dchinner@redhat.com>
> 
> If we are punching out a delalloc extent, xfs_bunmapi() does not
> have a transaction context and should not ever need to convert the
> on-disk extent format. If such a thing is attempted (e.g. via a
> corrupt inode extent count in extent format) then we should abort
> with an EFSCORRUPTED error. Unfortunately, we don't do that and
> crash instead:
> 
>  XFS (loop0): page discard on page 0000000005fd24f3, inode 0x75e5, offset 0.
>  ==================================================================
>  BUG: KASAN: null-ptr-deref in xfs_alloc_get_freelist+0x115/0x350
>  Read of size 8 at addr 0000000000000028 by task a.out/1406
>  CPU: 0 PID: 1406 Comm: a.out Not tainted 4.17.0-rc4-kasan #2
>  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
>  Call Trace:
>   dump_stack+0x7b/0xb5
>   kasan_report+0x10c/0x390
>   __asan_load8+0x54/0x90
>   xfs_alloc_get_freelist+0x115/0x350
>   xfs_alloc_fix_freelist+0x35b/0x830
>   xfs_alloc_vextent+0x215/0x990
>   xfs_bmap_extents_to_btree+0x30d/0x940
> .....
> 
> By returning an error here, we avoid such crashes when punching out
> a delalloc page because we don't try to fix up an AG freelist
> without a transaction. Hence we get an error like so:

Um, isn't erroring out here leaving a dirty bomb in the in-core metadata?
Like you say:

> XFS (loop0): page discard on page ffffea00040ae640, inode 0x75e5, offset 0.
> XFS (loop0): page discard unable to remove delalloc mapping.

We know the fs is corrupt, we might as well shut down now rather than
let this burp out later.
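
Roughly what I have in mind, as a self-contained userspace model rather
than real kernel code.  Every name below (model_mount, model_bunmapi,
model_punch_delalloc) is made up for illustration; in the kernel the
shutdown step would presumably be something like xfs_force_shutdown()
with a corruption flag:

```c
#include <errno.h>
#include <stdbool.h>

/* The kernel defines EFSCORRUPTED as EUCLEAN; mirror that here. */
#ifndef EUCLEAN
#define EUCLEAN 117
#endif
#define EFSCORRUPTED EUCLEAN

/* Hypothetical stand-in for the mount: only tracks shutdown state. */
struct model_mount {
	bool shut_down;
};

/*
 * Hypothetical stand-in for the unmap call: fails with -EFSCORRUPTED
 * when a format conversion would be needed but there is no
 * transaction (has_tp == false).
 */
static int model_bunmapi(bool has_tp, bool needs_conversion)
{
	if (needs_conversion && !has_tp)
		return -EFSCORRUPTED;
	return 0;
}

/*
 * Punch a delalloc extent without a transaction.  On corruption, shut
 * the (model) filesystem down right away instead of carrying on.
 */
static int model_punch_delalloc(struct model_mount *mp, bool needs_conversion)
{
	int error;

	/* Once shut down, every later operation fails fast. */
	if (mp->shut_down)
		return -EIO;

	error = model_bunmapi(false, needs_conversion);
	if (error == -EFSCORRUPTED)
		mp->shut_down = true;
	return error;
}
```

The point of the model: the first corrupt punch returns -EFSCORRUPTED
and flips the shutdown flag, and every call after that gets -EIO
instead of running on against in-core state we already know is bad.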

I get that people don't want to touch well-seasoned code, but
xfs_bunmapi is this big unwieldy function that's crying out for a
refactor.  It's 330 lines long and can be called from various contexts
(data/attr fork, punch delalloc, etc.)...

...it's also weird that xfs_bmap_punch_delalloc_range calls xfs_bunmapi
with no transaction and an xfs_defer that we dump on the ground.

So yes, I think the patch does fix the crash, but it's kinda gross.

Thoughts?

--D

> And the filesystem continues to operate and the stale mapping is
> cleaned up when the inode is reclaimed.
> 
> Reported-by: Wen Xu <wen.xu@gatech.edu>
> Signed-off-by: Dave Chinner <dchinner@redhat.com>
> ---
>  fs/xfs/libxfs/xfs_bmap.c | 14 +++++++++++++-
>  1 file changed, 13 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
> index 01628f0c9a0c..6967ce8088d2 100644
> --- a/fs/xfs/libxfs/xfs_bmap.c
> +++ b/fs/xfs/libxfs/xfs_bmap.c
> @@ -5458,10 +5458,18 @@ __xfs_bunmapi(
>  		*rlen = end - start + 1;
>  
>  	/*
> -	 * Convert to a btree if necessary.
> +	 * Convert the BMBT root format if necessary. This should only occur in
> +	 * transaction contexts and not when removing delalloc extents from
> +	 * the in-core extent tree. If we don't have a transaction, then we've
> +	 * got some form of corruption somewhere, so return an error
> +	 * immediately.
>  	 */
>  	if (xfs_bmap_needs_btree(ip, whichfork)) {
>  		ASSERT(cur == NULL);
> +		if (!tp) {
> +			error = -EFSCORRUPTED;
> +			goto error0;
> +		}
>  		error = xfs_bmap_extents_to_btree(tp, ip, firstblock, dfops,
>  			&cur, 0, &tmp_logflags, whichfork);
>  		logflags |= tmp_logflags;
> @@ -5473,6 +5481,10 @@ __xfs_bunmapi(
>  	 */
>  	else if (xfs_bmap_wants_extents(ip, whichfork)) {
>  		ASSERT(cur != NULL);
> +		if (!tp) {
> +			error = -EFSCORRUPTED;
> +			goto error0;
> +		}
>  		error = xfs_bmap_btree_to_extents(tp, ip, cur, &tmp_logflags,
>  			whichfork);
>  		logflags |= tmp_logflags;
> -- 
> 2.17.0
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
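
For anyone reading along, here is how I understand a bad extent count
ends up tripping the two new checks, sketched as a userspace model.
The struct and function names are hypothetical, and the two predicates
are only my approximation of what xfs_bmap_needs_btree() and
xfs_bmap_wants_extents() test (fork format versus extent count against
the inline maximum), not the kernel code itself:

```c
#include <errno.h>
#include <stdbool.h>

/* The kernel defines EFSCORRUPTED as EUCLEAN; mirror that here. */
#ifndef EUCLEAN
#define EUCLEAN 117
#endif
#define EFSCORRUPTED EUCLEAN

/* Hypothetical model of the two on-disk fork formats involved. */
enum model_fmt { MODEL_FMT_EXTENTS, MODEL_FMT_BTREE };

/* Hypothetical model of an inode fork's format and extent counters. */
struct model_fork {
	enum model_fmt fmt;
	unsigned int nextents;	/* extent count recorded for the fork */
	unsigned int maxext;	/* most extents that fit inline */
};

/* Extents format, but more extents than fit inline: wants a btree. */
static bool model_needs_btree(const struct model_fork *f)
{
	return f->fmt == MODEL_FMT_EXTENTS && f->nextents > f->maxext;
}

/* Btree format, but few enough extents to fit inline again. */
static bool model_wants_extents(const struct model_fork *f)
{
	return f->fmt == MODEL_FMT_BTREE && f->nextents <= f->maxext;
}

/*
 * The guard from the patch: either conversion without a transaction
 * is treated as corruption and reported as -EFSCORRUPTED.
 */
static int model_check_conversion(const struct model_fork *f, bool has_tp)
{
	if ((model_needs_btree(f) || model_wants_extents(f)) && !has_tp)
		return -EFSCORRUPTED;
	return 0;
}
```

So an extents-format fork whose count claims more extents than fit
inline asks for a btree conversion, and in the transactionless delalloc
punch path that request now turns into an error instead of a walk
through the AG freelist code.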
