linux-btrfs.vger.kernel.org archive mirror
* [PATCH] btrfs: raid56: data corruption on a device removal
@ 2018-12-12  0:25 Dmitriy Gorokh
  2018-12-12  9:09 ` Johannes Thumshirn
                   ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Dmitriy Gorokh @ 2018-12-12  0:25 UTC (permalink / raw)
  To: linux-btrfs

I found that a RAID5 or RAID6 filesystem can get corrupted in the following scenario:

1. Create a 4-disk RAID6 filesystem
2. Preallocate 16 files of 10 GB each
3. Run fio: 'fio --name=testload --directory=./ --size=10G --numjobs=16 --bs=64k --iodepth=64 --rw=randrw --verify=sha256 --time_based --runtime=3600'
4. After a few minutes, pull out two drives: 'echo 1 > /sys/block/sdc/device/delete ; echo 1 > /sys/block/sdd/device/delete'

In about 5 out of 10 runs, the test leads to silent data corruption of a random extent, resulting in 'IO Error' and 'csum failed' messages when reading the affected file. It usually affects only a small portion of the files and only one underlying extent of a file. When I converted the logical address of the damaged extent to a physical address and dumped the stripe directly from the drives, I saw a specific pattern that is always the same when the issue occurs.

I found that a few bios that were being processed right at the time of the drive removal had a non-zero bio->bi_iter.bi_done field despite an EIO bi_status. Their bi_sector field had also been advanced from its original value by that bi_done amount. This appears to be a fairly rare condition. Subsequently, in the raid_rmw_end_io handler, such a failed bio can be translated to the wrong stripe number and fail the wrong rbio.
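
The lookup problem can be illustrated with a small user-space model. This is only a sketch under stated assumptions: the model_* types and functions are hypothetical stand-ins, not the kernel's structures, and the 64 KiB stripe length and the device/offset values are chosen purely for illustration. It shows a failed bio whose bi_sector was advanced by bi_done no longer resolving to the stripe it was submitted to, and how rewinding by bi_done before the range check restores the match.

```c
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct model_stripe {
	int dev;            /* device the stripe lives on */
	uint64_t physical;  /* start of the stripe on that device, in bytes */
};

struct model_bio {
	int dev;            /* device the bio was submitted to */
	uint64_t bi_sector; /* current sector, in 512-byte units */
	uint64_t bi_done;   /* bytes already completed before the failure */
};

#define MODEL_STRIPE_LEN (64ULL * 1024) /* assumed stripe length */

/*
 * Return the index of the stripe that contains the bio's position,
 * or -1 if none matches. With rewind != 0, bi_done is subtracted
 * first, so the lookup uses the position the bio was submitted with.
 */
static int model_find_stripe(const struct model_stripe *stripes, int n,
			     const struct model_bio *bio, int rewind)
{
	uint64_t physical = bio->bi_sector << 9; /* sectors to bytes */
	int i;

	if (rewind)
		physical -= bio->bi_done;

	for (i = 0; i < n; i++) {
		uint64_t start = stripes[i].physical;

		if (bio->dev == stripes[i].dev &&
		    physical >= start && physical < start + MODEL_STRIPE_LEN)
			return i;
	}
	return -1;
}

/* Build one illustrative scenario and run the lookup on it. */
static int model_example(int rewind)
{
	struct model_stripe stripes[4];
	struct model_bio bio;
	int i;

	/* Four stripes, one per device, each starting at 1 MiB. */
	for (i = 0; i < 4; i++) {
		stripes[i].dev = i;
		stripes[i].physical = 1ULL << 20;
	}

	/*
	 * A bio submitted to device 2 at the start of its stripe that
	 * advanced by 64 KiB before failing with an error.
	 */
	bio.dev = 2;
	bio.bi_done = MODEL_STRIPE_LEN;
	bio.bi_sector = ((1ULL << 20) + bio.bi_done) >> 9;

	return model_find_stripe(stripes, 4, &bio, rewind);
}
```

In this model, model_example(0) returns -1 because the advanced position falls just past the end of the stripe, while model_example(1) returns 2, the stripe the bio was actually submitted to. What the kernel does after a missed lookup is not modeled here.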


---
 fs/btrfs/raid56.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/fs/btrfs/raid56.c b/fs/btrfs/raid56.c
index 3c8093757497..94ae70715195 100644
--- a/fs/btrfs/raid56.c
+++ b/fs/btrfs/raid56.c
@@ -1451,6 +1451,9 @@ static int find_bio_stripe(struct btrfs_raid_bio *rbio,
        struct btrfs_bio_stripe *stripe;

        physical <<= 9;
+       // Since the failed bio can return partial data, bi_sector might be incremented
+       // by that value. We need to revert it back to the state before the bio was submitted.
+       physical -= bio->bi_iter.bi_done;

        for (i = 0; i < rbio->bbio->num_stripes; i++) {
                stripe = &rbio->bbio->stripes[i];
-- 
2.17.0




Thread overview: 11+ messages
2018-12-12  0:25 [PATCH] btrfs: raid56: data corruption on a device removal Dmitriy Gorokh
2018-12-12  9:09 ` Johannes Thumshirn
2018-12-12 15:53 ` David Sterba
2018-12-14 17:48 ` [PATCH v2] " Dmitriy Gorokh
2018-12-26  0:15   ` Liu Bo
2019-01-04 16:49   ` David Sterba
2019-01-07 11:03     ` Johannes Thumshirn
2019-01-07 15:34       ` David Sterba
2019-01-10 16:49         ` Johannes Thumshirn
2019-01-11  8:08           ` Johannes Thumshirn
2019-01-11  9:26             ` Ming Lei
