From: Chris Murphy
Date: Sat, 12 May 2018 20:09:36 -0600
Subject: Re: "decompress failed" in 1-2 files always causes kernel oops, check/scrub pass
To: james harvey
Cc: Btrfs BTRFS

On Sat, May 12, 2018 at 6:10 PM, james harvey wrote:
> On Sat, May 12, 2018 at 3:51 AM, Martin Steigerwald wrote:
>> Hey James.
>>
>> james harvey - 12.05.18, 07:08:
>>> 100% reproducible, booting from disk or even from the Arch installation
>>> ISO. Kernel 4.16.7, btrfs-progs v4.16.
>>>
>>> Reading one of two journal files causes a kernel oops. I initially ran
>>> into it from "journalctl --list-boots", but cat'ing the file does it
>>> too. I believe this shows there is compressed data that is invalid,
>>> even though its btrfs checksum is valid. I've cat'ed every file on the
>>> disk, and luckily have the problem narrowed down to only these 2 files
>>> in /var/log/journal.
>>>
>>> This volume has always been mounted with lzo compression.
>>>
>>> Scrub has never found anything, and I have run it since the oops.
>>>
>>> I found a user a few years ago who also ran into this, without
>>> resolution, at:
>>> https://www.spinics.net/lists/linux-btrfs/msg52218.html
>>>
>>> 1. Cat'ing a (non-essential) file shouldn't be able to bring down the
>>> system.
>>>
>>> 2. If this is in fact invalid compressed data, there should be a way to
>>> check for that. Btrfs check and scrub both pass.
>>
>> I think systemd-journald sets those files to nocow on btrfs in order to
>> reduce fragmentation. That means no checksums, no snapshots, no nothing.
>> I just removed /var/log/journal and thus disabled journalling to disk.
>> It's sufficient for me to have the recent state in /run/journal.
>>
>> Can you confirm nocow being set via lsattr on those files?
>>
>> Still, they should be decompressible just fine.
>>
>>> Hardware is fine. It passes memtest86+ in SMP mode, and works fine on
>>> all other files.
>>>
>>> [  381.869940] BUG: unable to handle kernel paging request at 0000000000390e50
>>> [  381.870881] BTRFS: decompress failed
>> […]
>> --
>> Martin
>
> You're right, everything in /var/log/journal has the NoCOW attribute.
>
> This is on a 3-device btrfs RAID1. If I mount ro,degraded with disks
> 1&2 or 1&3 and read the file, I get a crash. With disks 2&3, it reads
> fine.

Unmounted, with all three devices available, you can use btrfs-map-logical
to extract copy 1 and copy 2 and compare them. It might also crash if one
copy is corrupt, but it's another way to test.
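For reference, something along these lines (an untested sketch; the journal
file path, device node, logical address, and length are placeholders to fill
in from your filesystem, and on btrfs the "physical_offset" column that
filefrag -v prints is actually the extent's logical address):

    # find the logical address of the affected extent
    filefrag -v /var/log/journal/<machine-id>/system.journal

    # with the fs unmounted, dump mirror copy 1 and copy 2 of that extent
    btrfs-map-logical -l <logical> -b <length> -c 1 -o copy1.bin /dev/sdX
    btrfs-map-logical -l <logical> -b <length> -c 2 -o copy2.bin /dev/sdX
    cmp copy1.bin copy2.bin

If cmp reports a difference, the two copies diverged on disk; if they match,
the corruption most likely happened before the data was ever written.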
> Does this mean that although I've never had a corrupted disk bit
> before on COW/checksummed data, one somehow happened on the small
> fraction of my storage which is NoCOW? Seems unlikely, but I don't
> know what other explanation there would be.

Usually nocow also means no compression. But there is a thread in the
archives where I found that compression can be forced on nocow files if the
file is fragmented and either the volume is mounted with compression or the
file has inherited chattr +c (I don't remember which, or possibly both). And
systemd does submit rotated logs for defragmentation. But the compression
doesn't happen twice. So if it's corruption, it's corruption in transit, and
then I think you'd come across this more often.

> So, I think this means the corrupted disk bit must be on disk 1.
>
> I'm running with LVM, this is a smallish volume, and I would be happy to
> leave a copy of the set of 3 volumes as-is, if anyone wanted to have me
> run anything to help diagnose this and/or try a patch.
>
> Does btrfs have a way to do something like scrub, by comparing the
> mirrored copies of NoCOW data and alerting you to a mismatch? I realize
> that with NoCOW it wouldn't have a checksum to know which copy is
> accurate. It would at least be good for there to be a way to alert to
> the corruption.

No csums means those files are ignored by scrub.

You've definitely found a bug: a corrupt file shouldn't crash the kernel.
You could do regression testing and see if it happens with older kernels.
I'd probably stick to longterm kernels, which are easier to find already
built. If these files are zstd compressed, then I think you can only go
back to 4.14.

--
Chris Murphy