From: james harvey <jamespharvey20@gmail.com>
To: Qu Wenruo <quwenruo.btrfs@gmx.com>,
	Btrfs BTRFS <linux-btrfs@vger.kernel.org>
Subject: Re: "decompress failed" in 1-2 files always causes kernel oops, check/scrub pass
Date: Mon, 14 May 2018 00:41:41 -0400
Message-ID: <CA+X5Wn5J4TBx-Fa3WLrBPULRiGNa7m_4U9YS6LDvUyVT1TVhqg@mail.gmail.com>
In-Reply-To: <a6c35859-0cc3-2261-d717-5fdd9f3911ab@gmx.com>

On Sun, May 13, 2018 at 10:08 PM, Qu Wenruo <quwenruo.btrfs@gmx.com> wrote:
> On 2018-05-12 13:08, james harvey wrote:
>> Hardware is fine.  Passes memtest86+ in SMP mode.  Works fine on all
>> other files.
>>
>>
>>
>> [  381.869940] BUG: unable to handle kernel paging request at 0000000000390e50
>> [  381.870881] BTRFS: decompress failed
>> [  381.891775] IP: rebalance_domains+0x8a/0x2c0
>
> The interesting part here is that btrfs is not showing up in the call
> trace, not even the lzo code
> (despite the "decompress failed" message).
> Maybe some corrupted data is overwriting some random kernel memory?

I've been surprised by this too.  I've seen a few "styles" of crashes from this.

The fuller version of the one I posted in the original post:
https://bugzilla.kernel.org/attachment.cgi?id=275949

One that starts with a "general protection fault":
https://bugzilla.kernel.org/attachment.cgi?id=275951

And my most recent one, which starts with "BTRFS: decompress failed"
followed by "BUG: unable to handle kernel NULL pointer dereference at
0000000000000001":
https://bugzilla.kernel.org/attachment.cgi?id=275961

This latest one does have a call trace that includes btrfs.  The top of
the call trace is "end_compressed_bio_read+0x34e/0x3d0 [btrfs]", and
although the name includes the word "compressed", I'm not sure it
actually has anything to do with lzo compression.  The call stack
doesn't scream that to me.
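
For reading trace entries like the one above: the kernel prints each
frame as symbol+offset/size, optionally followed by [module], so
+0x34e/0x3d0 is an address 0x34e bytes into a function that is 0x3d0
bytes long.  A minimal sketch that splits such an entry follows.  The
helper name split_frame is made up for illustration and is not part of
any kernel tooling; frames without a [module] suffix are assumed to
belong to the core kernel and are labelled "vmlinux".

```shell
# split_frame (hypothetical helper): split one call-trace entry of the
# form "symbol+0xOFF/0xSIZE [module]" and print
# "symbol offset size module", with offset and size in decimal.
split_frame() {
    frame=$1
    symbol=${frame%%+*}      # text before the first '+'
    rest=${frame#*+}         # e.g. "0x34e/0x3d0 [btrfs]"
    offset=${rest%%/*}       # e.g. "0x34e"
    rest=${rest#*/}          # e.g. "0x3d0 [btrfs]"
    size=${rest%% *}         # e.g. "0x3d0"
    case $rest in
        *\[*) module=${rest#*\[}; module=${module%\]} ;;
        *)    module=vmlinux ;;   # assumption: no suffix means core kernel
    esac
    # printf converts the 0x-prefixed values to decimal
    printf '%s %d %d %s\n' "$symbol" "$offset" "$size" "$module"
}

split_frame 'end_compressed_bio_read+0x34e/0x3d0 [btrfs]'
```

With a kernel build that has debug info, scripts/faddr2line in the
kernel source tree should be able to take the same symbol+offset/size
string and map it to a source line.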

It seems that when the invalid decompression happens, that code itself
doesn't report any kernel errors, but the rest of the kernel starts
misbehaving.

I've replicated this about 15 times now.  It only happens on the files
that have inconsistent mirrored data.



> Would you please get the inode numbers of the corrupted files, and run
> them through btrfs-debug-tree?
>
> # btrfs-debug-tree -t <subvol_id> <device> | grep -A 50 \(<INO>
>
> This is the preferred method as it would provide all the details we
> need. But since it could contain sensitive info like filename, please
> double check before posting it.

# ls -i system@00fa3c0596e64d2e84096520ca46f008-0000000000000001-00053cd2c1756577.journal
291489 system@00fa3c0596e64d2e84096520ca46f008-0000000000000001-00053cd2c1756577.journal

# ls -i user-1000@b70add0ef010457d933fec23a2afa48a-0000000000000495-00053b6b6e65e9cf.journal
72267 user-1000@b70add0ef010457d933fec23a2afa48a-0000000000000495-00053b6b6e65e9cf.journal

# btrfs-debug-tree -t 5 /dev/lvm/newMain1 | grep -A 50 \(291489 > debug.tree.291489
Available at: http://termbin.com/kegj

# btrfs-debug-tree -t 5 /dev/lvm/newMain1 | grep -A 50 \(72267 > debug.tree.72267
Available at: http://termbin.com/xhdc
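
The lookup above can be sketched as a reusable helper.  The function
names (inode_of, ino_pattern, dump_inode) are hypothetical, and the
sketch assumes the tree dump prints keys as "(<ino> <TYPE> <offset>)",
so a trailing space is added to the pattern to keep "(72267 " from also
matching a longer inode number such as "(722670 ".

```shell
# Hypothetical helpers; the names are illustrative only.

# inode_of FILE: print the inode number of FILE, using ls -i as above.
inode_of() {
    ls -id -- "$1" | awk '{print $1}'
}

# ino_pattern INO: print a fixed-string grep pattern for that inode's keys.
# The trailing space avoids matching longer inode numbers by prefix.
ino_pattern() {
    printf '(%s ' "$1"
}

# dump_inode SUBVOL_ID DEVICE FILE: dump the tree items around FILE's inode.
dump_inode() {
    ino=$(inode_of "$3") || return 1
    btrfs-debug-tree -t "$1" "$2" | grep -F -A 50 -- "$(ino_pattern "$ino")"
}
```

Usage would look like "dump_inode 5 /dev/lvm/newMain1 FILE >
debug.tree.out" for each affected file.  On newer btrfs-progs the same
dump is available as "btrfs inspect-internal dump-tree", which could be
substituted for btrfs-debug-tree in dump_inode.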



> Or fiemap of that file could also help:
>
> # xfs_io -c "fiemap -v" <corrupted_file>
>
> This is completely safe, but I'm not 100% sure whether the info is enough.

# xfs_io -c "fiemap -v" system@00fa3c0596e64d2e84096520ca46f008-0000000000000001-00053cd2c1756577.journal
Available at: http://termbin.com/nsej

# xfs_io -c "fiemap -v" system@00fa3c0596e64d2e84096520ca46f008-0000000000000001-00053cd2c1756577.journal
Available at: http://termbin.com/4fiz
