From: "Darrick J. Wong" <darrick.wong@oracle.com>
To: Dave Chinner <david@fromorbit.com>
Cc: linux-xfs@vger.kernel.org
Subject: Re: [PATCH 1/2] xfs: transactionless xfs_bunmapi shouldn't do format conversion
Date: Thu, 21 Jun 2018 09:42:05 -0700
Message-ID: <20180621164205.GD4838@magnolia>
In-Reply-To: <20180619233317.GL19934@dastard>

On Wed, Jun 20, 2018 at 09:33:17AM +1000, Dave Chinner wrote:
> On Mon, Jun 18, 2018 at 11:06:52PM -0700, Darrick J. Wong wrote:
> > On Tue, Jun 19, 2018 at 03:27:59PM +1000, Dave Chinner wrote:
> > > On Mon, Jun 18, 2018 at 09:54:05PM -0700, Darrick J. Wong wrote:
> > > > On Tue, Jun 19, 2018 at 12:41:27PM +1000, Dave Chinner wrote:
> > > > > From: Dave Chinner <dchinner@redhat.com>
> > > > > 
> > > > > If we are punching out a delalloc extent, xfs_bunmapi() does not
> > > > > have a transaction context and should not ever need to convert the
> > > > > on-disk extent format. If such a thing is attempted (e.g. via a
> > > > > corrupt inode extent count in extent format) then we should abort
> > > > > with an EFSCORRUPTED error. Unfortunately, we don't do that and
> > > > > crash instead:
> > > > > 
> > > > >  XFS (loop0): page discard on page 0000000005fd24f3, inode 0x75e5, offset 0.
> > > > >  ==================================================================
> > > > >  BUG: KASAN: null-ptr-deref in xfs_alloc_get_freelist+0x115/0x350
> > > > >  Read of size 8 at addr 0000000000000028 by task a.out/1406
> > > > >  CPU: 0 PID: 1406 Comm: a.out Not tainted 4.17.0-rc4-kasan #2
> > > > >  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
> > > > >  Call Trace:
> > > > >   dump_stack+0x7b/0xb5
> > > > >   kasan_report+0x10c/0x390
> > > > >   __asan_load8+0x54/0x90
> > > > >   xfs_alloc_get_freelist+0x115/0x350
> > > > >   xfs_alloc_fix_freelist+0x35b/0x830
> > > > >   xfs_alloc_vextent+0x215/0x990
> > > > >   xfs_bmap_extents_to_btree+0x30d/0x940
> > > > > .....
> > > > > 
> > > > > By returning an error here, we avoid such crashes when punching out
> > > > > a delalloc page because we don't try to fix up an AG freelist
> > > > > without a transaction. Hence we get an error like so:
> > > > 
> > > > Um, isn't erroring out here leaving a dirty bomb in the in-core metadata?
> > > 
> > > Not that I can tell. We've already trashed the dirty page state by
> > > this point, so the page cache can safely reclaim the page and the
> > > delalloc range over it will never get written.  And the XFS inode
> > > cleanup code didn't have any issues with the way the error was
> > > handled, either, because the delalloc range was actually removed
> > > before the fork format error was triggered.
> > > 
> > > IOWs, there is no dirty, stale page state or delalloc extents
> > > hanging around if this error fires.
> > 
> > Hmmm, well I guess I'll pull this one in and look for problems.
> > 
> > I wonder, is there a <cough> testcase for this?  Or a fuzz-o-matic to
> > turn all these things into regression tests?
> 
> No test case. Should be able to create one easily enough with
> xfs_db, though I haven't tried. Do the inode fuzzer tests screw with
> the extent count?

The existing set of fuzz tests won't catch this because they go straight
into repair attempts to see if scrub/repair will deal with bad nextents.
They don't try to modify the corrupted fs.

They also do it slowly because fuzzing nextents is simply a part of
fuzzing every field in an extents-format file inode, and I suspect that
we don't really want to make fuzz testing a regular part of xfstests
because that immediately triples the auto group runtime. :)

So, targeted test please? :)

I will also work on a fuzz series that skips scrub/repair and goes
straight to writing to the corrupted fs to see what happens.
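
As a rough userspace model of what such a targeted test would be checking
(a sketch only, not the kernel code): when no transaction is available and
the fork would need a format conversion, the call should fail with
EFSCORRUPTED instead of going on to allocate. Every name below
(model_trans, model_fork, model_bunmapi_finish, MODEL_EFSCORRUPTED) is
hypothetical and made up for illustration; the numeric error value is an
assumption as well.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's EFSCORRUPTED; value is assumed. */
#define MODEL_EFSCORRUPTED 117

/* Hypothetical stand-in for a transaction context. */
struct model_trans {
	int unused;
};

/* Hypothetical stand-in for an inode fork. */
struct model_fork {
	/* True when the (possibly corrupt) extent count demands a conversion. */
	bool needs_format_conversion;
};

/*
 * Model of the tail end of an unmap operation.
 * Returns 0 on success or a negative error code.
 */
static int model_bunmapi_finish(struct model_trans *tp, struct model_fork *ifp)
{
	/* Nothing to convert: the common, healthy case. */
	if (!ifp->needs_format_conversion)
		return 0;

	/*
	 * A conversion needs block allocation, which needs a transaction.
	 * Without one, treat the request as evidence of corruption and bail
	 * out rather than touching allocator state.
	 */
	if (tp == NULL)
		return -MODEL_EFSCORRUPTED;

	/* With a transaction, the conversion is allowed to proceed. */
	ifp->needs_format_conversion = false;
	return 0;
}
```

Calling model_bunmapi_finish(NULL, ...) on a fork that wants a conversion
returns -MODEL_EFSCORRUPTED and leaves the fork untouched; passing a
transaction lets the conversion go ahead.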

> > > But OTOH, I don't want to risk a bunch of filesystem corrupting
> > > regressions across the entire XFS userbase just to fix a trivially
> > > simple crash that requires an extremely unlikely co-ordinated
> > > corruption of an inode data fork and an AGFL, and to simultaneously
> > > have ENOSPC in every other AGF in the filesystem.
> > > 
> > > Put "refactor xfs_bunmapi()" on the list of "things to do when
> > > there's nothing else to do"...
> > 
> > So in 2066 after the polar ice caps melt after the XFS LOGHAMMER attack
> > has finally been put down?  Ok. :)
> 
> I'm sure someone will have reason to factor it before then :P

I ... forgot that hch already did. :/

--D

> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@fromorbit.com
