From: Brian Foster <bfoster@redhat.com>
To: Mathias Troiden <mathias.troiden@gmail.com>
Cc: linux-xfs@vger.kernel.org
Subject: Re: xfs_repair fails to recognize corruption reported by kernel - possible bug?
Date: Fri, 24 Feb 2017 09:30:37 -0500
Message-ID: <20170224143037.GD59560@bfoster.bfoster>
In-Reply-To: <20170224123017.GA59560@bfoster.bfoster>

On Fri, Feb 24, 2017 at 07:30:18AM -0500, Brian Foster wrote:
> On Thu, Feb 23, 2017 at 11:14:47PM +0300, Mathias Troiden wrote:
> > Original topic: https://bbs.archlinux.org/viewtopic.php?pid=1692896
> > 
> > Hi list,
> > 
> > My system fails to start the login manager, with the following messages in the journal:
> > 
> > >kernel: ffff88040e8bc030: 58 67 db ca 2a 3a dd b8 00 00 00 00 00 00 00 00  Xg..*:..........
> > >kernel: XFS (sda1): Internal error xfs_iread at line 514 of file fs/xfs/libxfs/xfs_inode_buf.c.  Caller xfs_iget+0x2b1/0x940 [xfs]
> > >kernel: XFS (sda1): Corruption detected. Unmount and run xfs_repair
> > >kernel: XFS (sda1): xfs_iread: validation failed for inode 34110192 failed
> > >kernel: ffff88040e8bc000: 49 4e a1 ff 03 01 00 00 00 00 00 00 00 00 00 00  IN..............
> > >kernel: ffff88040e8bc010: 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 00  ................
> > >kernel: ffff88040e8bc020: 58 aa 04 b8 2e e3 65 3a 57 41 fe 12 00 00 00 00  X.....e:WA......
> > >kernel: ffff88040e8bc030: 58 67 db ca 2a 3a dd b8 00 00 00 00 00 00 00 00  Xg..*:..........
> > >kernel: XFS (sda1): Internal error xfs_iread at line 514 of file fs/xfs/libxfs/xfs_inode_buf.c.  Caller xfs_iget+0x2b1/0x940 [xfs]
> > >kernel: XFS (sda1): Corruption detected. Unmount and run xfs_repair
> > 
> > 
> > and subsequent core dump of the login manager.
> > 
> 
> What kernel and xfsprogs versions? Also, please provide 'xfs_info <mnt>'
> output for the fs.
> 
> From the output above, it looks like you could have a zero-sized
> symlink, which triggers xfs_dinode_verify() failure. It's quite possible
> I'm misreading the raw inode buffer output above too, however. Did you
> have any interesting "events" before this problem started to occur? For
> example, a crash or hard reset, etc.?
> 
> Could you run 'find <mnt> -inum 34110192 -print' on the fs and report
> the associated filename? You could try 'stat <file>' as well but I'm
> guessing that's just going to report an error.
> 
> Note that another way to get us details of the fs is to send an
> xfs_metadump image. An md image skips all file data in the fs and
> obfuscates metadata (such as filenames) such that no sensitive
> information is shared. It simply provides a skeleton metadata image for
> us to debug. To create an obfuscated metadump, run 'xfs_metadump -g
> <dev> <outputimg>', compress the resulting image file, and send it along
> (feel free to send directly) or upload it somewhere.
> 

After looking at a metadump, this is indeed a zero-sized symlink. The
immediate fix here is probably to allow xfs_repair to detect this
situation and recover, which most likely means clearing out the inode.

Unfortunately, it's not clear how we got into this situation in the
first place. Have you had any crash or reset events recently that might
have required log recovery?

Regardless, you'll probably have to try something like the appended
xfsprogs patch. It has xfs_repair treat a zero-sized symlink as bad and
clear out the offending inode, so you'll have to recreate the link
manually to recover system functionality. (Mathias has pointed out
offline that the offending link is a standard /usr/lib/lib*.so symlink
with a known target, so fortunately recovery should be simple.)

Brian

--- 8< ---

diff --git a/repair/dinode.c b/repair/dinode.c
index 8d01409..d664f87 100644
--- a/repair/dinode.c
+++ b/repair/dinode.c
@@ -1385,6 +1385,11 @@ process_symlink(
 		return(1);
 	}
 
+	if (be64_to_cpu(dino->di_size) == 0) {
+		do_warn(_("zero size symlink in inode %" PRIu64 "\n"), lino);
+		return 1;
+	}
+
 	/*
 	 * have to check symlink component by component.
 	 * get symlink contents into data area
