On 2019/4/6 上午8:22, Qu Wenruo wrote: > > > On 2019/4/5 下午11:49, David Sterba wrote: >> On Thu, Apr 04, 2019 at 11:47:06AM +0800, Qu Wenruo wrote: >>> This patchset can be fetched from github: >>> https://github.com/adam900710/linux/tree/tree_checker_testing >>> >>> Which is based on misc-next, with the following commit as HEAD: >>> commit 56d46f96de92ec69963acb7b1d9aed83d2a56a7b (david/misc-next-with-write-checks, david/misc-next) >>> Author: Nikolay Borisov >>> Date: Wed Mar 27 14:24:18 2019 +0200 >>> >>> btrfs: Switch btrfs_trim_free_extents to find_first_clear_extent_bit >>> >>> These two patches changes the behavior of write time tree-checker, where >>> the initial patches didn't check the content of the leaf, leaving memory >>> bit flipping possible to sneak in. >>> >>> This patchset also introduces a new optimization, where original empty >>> leaf owner check can be pretty expensive and cause false alerts for >>> write time tree checker. >>> With this new optimization, write time tree checker can reuse the >>> existing btrfs_check_leaf_full(). >> >> This on top of misc-next throws 6 'write time' errors: >> >> btrfs/101 [14:00:24][ 3963.736352] run fstests btrfs/101 at 2019-04-05 14:00:24 >> [ 3970.417649] BTRFS error (device vdc): block=2446344192 write time tree block corruption detected > > In my case, these tests just pass, so I haven't checked the dmesg. Oh, those tests passes because of a bug in mainline. 6dc4f100c175 ("block: allow bio_for_each_segment_all() to iterate over multi-page bvec") prevents us from breaking bio_for_each_segment_all() loop, thus not triggering transaction abort. A nice finding digging into the report. Thanks, Qu > > But indeed I can reproduce the same dmesg line. > > Before removing the device, btrfs will use btrfs_shrink_device(new_size > = 0) to remove all data from that device. > > And that will cause on-disk total_bytes to be 0, triggering the bug. > > So indeed the behavior is different between read and write time. > > I'll check how to do it correctly. > > Thanks, > Qu > >> >> btrfs/151 [14:06:52][ 4351.133395] run fstests btrfs/151 at 2019-04-05 14:06:52 >> [ 4351.659771] BTRFS error (device vdb): block=570572800 write time tree block corruption detected >> >> btrfs/164 [14:07:19][ 4378.349990] run fstests btrfs/164 at 2019-04-05 14:07:19 >> [ 4380.645918] BTRFS error (device vdb): block=2479882240 write time tree block corruption detected >> >> btrfs/176 [14:07:39][ 4398.493736] run fstests btrfs/176 at 2019-04-05 14:07:39 >> [ 4399.186203] BTRFS error (device vdb): block=22036480 write time tree block corruption detected >> [ 4399.302700] BTRFS error (device vdb): block=1372585984 write time tree block corruption detected >> >> btrfs/184 [14:10:13][ 4552.496289] run fstests btrfs/184 at 2019-04-05 14:10:13 >> [ 4553.292780] BTRFS error (device vdb): block=4602216448 write time tree block corruption detected >> >> Have you seen the errors during your testing? >> >