On 2018-05-14 13:30, Qu Wenruo wrote:
>
>
> On 2018-05-14 12:41, james harvey wrote:
>> On Sun, May 13, 2018 at 10:08 PM, Qu Wenruo wrote:
>>> On 2018-05-12 13:08, james harvey wrote:
>>>> Hardware is fine. Passes memtest86+ in SMP mode. Works fine on all
>>>> other files.
>>>>
>>>>
>>>>
>>>> [ 381.869940] BUG: unable to handle kernel paging request at 0000000000390e50
>>>> [ 381.870881] BTRFS: decompress failed
>>>> [ 381.891775] IP: rebalance_domains+0x8a/0x2c0
>>>
>>> The interesting part here is that btrfs is not showing up in the call
>>> trace, not even the lzo code.
>>> (Despite the "decompress failed" message.)
>>> Maybe some corrupted data is screwing up some random kernel memory?
>>
>> I've been surprised by this too. I've seen a few "styles" of crashes from this.
>>
>> The fuller version of the one I posted in original post:
>> https://bugzilla.kernel.org/attachment.cgi?id=275949
>>
>> One that starts with a "general protection fault":
>> https://bugzilla.kernel.org/attachment.cgi?id=275951
>>
>> And my most recent version, starts with "BTRFS: decompress failed"
>> then "BUG: unable to handle kernel NULL pointer dereference at
>> 0000000000000001":
>> https://bugzilla.kernel.org/attachment.cgi?id=275961
>>
>> This latest one does have a call trace including btrfs. The top of
>> the call trace is "end_compressed_bio_read+0x34e/0x3d0 [btrfs]", and
>> although it includes the word compressed, I'm not sure that's actually
>> having to do with lzo compression. The call stack doesn't scream that
>> to me.
>>
>> It seems like when the invalid decompression happens, that code itself
>> doesn't give any kernel errors, but the rest of the kernel starts
>> spazzing.
>
> Yep, even in the last case it still looks like kernel memory got
> corrupted.
>
>>
>> I've replicated this probably about 15 times now. Only happens on
>> these files that have inconsistent mirrored data.
>
> From the thread, since you have already located the corrupted mirror,
> would you please provide the corrupted dump along with the correct one?
>
> It would help us a lot to understand what's going on.
>
>>
>>
>>
>>> Would you please get the inode number of the corrupted files, and run
>>> it through btrfs-debug-tree?
>>>
>>> # btrfs-debug-tree -t | grep -A 50 \(
>>>
>>> This is the preferred method as it would provide all the details we
>>> need. But since it could contain sensitive info like filenames, please
>>> double check before posting it.
>>
>> # ls -i system@00fa3c0596e64d2e84096520ca46f008-0000000000000001-00053cd2c1756577.journal
>> 291489 system@00fa3c0596e64d2e84096520ca46f008-0000000000000001-00053cd2c1756577.journal
>>
>> # ls -i user-1000@b70add0ef010457d933fec23a2afa48a-0000000000000495-00053b6b6e65e9cf.journal
>> 72267 user-1000@b70add0ef010457d933fec23a2afa48a-0000000000000495-00053b6b6e65e9cf.journal
>>
>> # btrfs-debug-tree -t 5 /dev/lvm/newMain1 | grep -A 50 \(291489 >
>> debug.tree.291489
>> Available at: http://termbin.com/kegj
>>
>> # btrfs-debug-tree -t 5 /dev/lvm/newMain1 | grep -A 50 \(72267 >
>> debug.tree.72267
>> Available at: http://termbin.com/xhdc
>
> The dump indicates the same conclusion you reached.
> The inode has the NODATACOW and NODATASUM flags, which means it should
> have neither csums nor compressed data.
> While in fact we have tons of compressed extents.
>
> But the following fiemap result also shows that these extents are
> shared. This could happen when there is a snapshot.
>
> So there is something wrong, in that btrfs allows compressed data to be
> generated for such a file.
> (Could not reproduce the same behavior with a 4.16 kernel; could such a
> problem happen in older kernels? Or did it just get fixed recently?)

OK, I could reproduce it now.

Just mount with -o nodatasum, then create a file.
Remount with compress-force=lzo, then write something.

So at least btrfs should disallow such a thing.
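
For anyone who wants to try it, the steps above could be scripted
roughly as follows. This is only a sketch: the device /dev/sdX, the
mount point /mnt/scratch, the file name testfile and the dd write are
placeholders made up for illustration, not what was actually used. By
default it only prints the commands; set RUN=1 to really execute them,
as root, on a scratch filesystem you can afford to lose.

```shell
#!/bin/sh
# Reproduction sketch. DEV, MNT and "testfile" are hypothetical placeholders.
DEV="${DEV:-/dev/sdX}"
MNT="${MNT:-/mnt/scratch}"

# Dry-run helper: print the command unless RUN=1 is set.
# (Printed output does not preserve the original argument quoting.)
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        printf '%s\n' "$*"
    fi
}

repro() {
    # 1. Mount with nodatasum and create a file.
    run mount -o nodatasum "$DEV" "$MNT"
    run touch "$MNT/testfile"
    # 2. Remount with forced lzo compression and write something.
    run mount -o remount,compress-force=lzo "$MNT"
    run dd if=/dev/zero of="$MNT/testfile" bs=1M count=4 conv=notrunc
    run sync
    # 3. Inspect the resulting extents.
    run xfs_io -c "fiemap -v" "$MNT/testfile"
}

repro
```

The last step is only there to look at the extents afterwards, the same
way as the fiemap output requested earlier in the thread.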

Thanks,
Qu

>
> Then some corruption screwed up the compressed data, and when we
> decompress, the kernel is screwed up.
>
>
> To pin down the lzo decompress corruption, kasan would be worth a try.
> However this means you need to enable it at compile time, and recompile
> the kernel.
> Not to mention kasan has a great impact on performance.
>
> But it should provide more info before memory gets corrupted.
>
> Thanks,
> Qu
>
>>
>>
>>
>>> Or fiemap of that file could also help:
>>>
>>> # xfs_io -c "fiemap -v"
>>>
>>> This is completely safe, but I'm not 100% sure if the info is enough.
>>
>> # xfs_io -c "fiemap -v"
>> system@00fa3c0596e64d2e84096520ca46f008-0000000000000001-00053cd2c1756577.journal
>> Available at: http://termbin.com/nsej
>>
>> # xfs_io -c "fiemap -v"
>> system@00fa3c0596e64d2e84096520ca46f008-0000000000000001-00053cd2c1756577.journal
>> Available at: http://termbin.com/4fiz
>
>