All of lore.kernel.org
 help / color / mirror / Atom feed
From: Dave Chinner <david@fromorbit.com>
To: Brian Foster <bfoster@redhat.com>
Cc: "Darrick J. Wong" <darrick.wong@oracle.com>, linux-xfs@vger.kernel.org
Subject: Re: [PATCH 1/2] xfs: zero length symlinks are not valid
Date: Fri, 22 Jun 2018 08:31:41 +1000	[thread overview]
Message-ID: <20180621223140.GU19934@dastard> (raw)
In-Reply-To: <20180621114618.GA2834@bfoster>

On Thu, Jun 21, 2018 at 07:46:18AM -0400, Brian Foster wrote:
> On Thu, Jun 21, 2018 at 08:59:18AM +1000, Dave Chinner wrote:
> > On Wed, Jun 20, 2018 at 07:50:48AM -0400, Brian Foster wrote:
> > > If I recreate that same dirty log state and mount with this patch
> > > applied (note that the fs is created without this patch to simulate an
> > > old kernel that has not changed i_mode in the same transaction that sets
> > > di_size = 0) along with a hack to avoid the check in
> > > xfs_dinode_verify(), I now hit the new assert and corruption error
> > > that's been placed in xfs_inactive_symlink().
> > > 
> > > So to Darrick's point, that seems to show that this is a vector to the
> > > pre-existing len == 0 check in xfs_inactive_symlink(). Given that, it
> > > seems to me that if we want to handle recovery from this state, we'd
> > > still need to work around the verifier check and retain the initial
> > > di_size == 0 check in xfs_inactive_symlink().
> > 
> > I think we should get rid of the transient state, not continue to
> > work around it. Because the transient state only exists in a log
> > recovery context, we can change the behaviour and not care about
> > older kernels still having the problem because all that is needed to
> > avoid the issue is a clean log when upgrading the kernel.
> > 
> 
> Right... that sounds reasonable to me, but we still need to consider how
> we handle a filesystem already in this state. BTW, was this a user
> report or a manufactured corruption due to fuzzing/shutdown testing or
> something?

It was shutdown testing. The report was that xfs_repair -n failed
with a zero length symlink, not that log recovery failed. I assumed
that log recovery worked fine before xfs_repair -n was run because
it didn't warn about a dirty log. However, now that you point out
that we just toss unlink list recovery failures (which I'd forgotten
about!), I'm guessing that happened and then repair tripped over
the leaked symlink inode.

> Given that additional context, I don't feel too strongly about needing
> to specially handle the "zero length symlink already exists in the dirty
> log due to xfs_inactive_symlink_rmt()" case. Chances are the mount that
> reported the error already nuked the state in the first place, so users
> shouldn't really end up "stuck" somewhere between needing a kernel fix
> or an 'xfs_repair -L' run (but if that's the approach we take, please
> just note the tradeoff in the commit log). Just my .02.

Yup, that seems reasonable to me - leaking the inode until repair is
done has no user impact.  It will do the same thing for any inode it
finds on the unlinked list that is corrupted, so my thoughts are
that if we want to improve corruption handling we need to address
the general unlinked list corruption issue rather than just this
specific inode corruption case.

Alright - I'll clean up the patch, make notes of all this in the
commit message and repost.

Thanks, Brian!

Cheers,

Dave.

-- 
Dave Chinner
david@fromorbit.com

  reply	other threads:[~2018-06-21 22:31 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-06-18  5:57 [PATCH 0/2] xfs: symlink and inode writeback issues Dave Chinner
2018-06-18  5:57 ` [PATCH 1/2] xfs: zero length symlinks are not valid Dave Chinner
2018-06-18 13:24   ` Brian Foster
2018-06-18 22:42     ` Dave Chinner
2018-06-19 11:54       ` Brian Foster
2018-06-19 15:48         ` Darrick J. Wong
2018-06-19 16:28           ` Brian Foster
2018-06-19 23:22             ` Dave Chinner
2018-06-20 11:50               ` Brian Foster
2018-06-20 22:59                 ` Dave Chinner
2018-06-21 11:46                   ` Brian Foster
2018-06-21 22:31                     ` Dave Chinner [this message]
2018-06-21 22:53                       ` Darrick J. Wong
2018-06-22 10:44                         ` Brian Foster
2018-06-23 17:38                           ` Darrick J. Wong
2018-06-18  5:57 ` [PATCH 2/2] xfs: xfs_iflush_abort() can be called twice on cluster writeback failure Dave Chinner
2018-06-18 13:24   ` Brian Foster
2018-06-19  5:05   ` Darrick J. Wong

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180621223140.GU19934@dastard \
    --to=david@fromorbit.com \
    --cc=bfoster@redhat.com \
    --cc=darrick.wong@oracle.com \
    --cc=linux-xfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.