On 2019/9/12 1:17 AM, David Sterba wrote:
> On Thu, Aug 29, 2019 at 03:17:31PM +0800, Qu Wenruo wrote:
>> [BUG]
>> There is a long existing bug that degraded mounted btrfs can allocate new
>> SINGLE/DUP chunks on a RAID1 fs:
>>   #!/bin/bash
>>
>>   dev1=/dev/test/scratch1
>>   dev2=/dev/test/scratch2
>>   mnt=/mnt/btrfs
>>
>>   umount $mnt &> /dev/null
>>   umount $dev1 &> /dev/null
>>   umount $dev2 &> /dev/null
>>
>>   dmesg -C
>>   mkfs.btrfs -f -m raid1 -d raid1 $dev1 $dev2
>>
>>   wipefs -fa $dev2
>>
>>   mount -o degraded $dev1 $mnt
>>   btrfs balance start --full $mnt
>>   umount $mnt
>>   echo "=== chunk after degraded mount ==="
>>   btrfs ins dump-tree -t chunk $dev1 | grep 'stripe_len.*type'
>>
>> The result fs will have chunks with SINGLE and DUP only:
>>   === chunk after degraded mount ===
>>   length 33554432 owner 2 stripe_len 65536 type SYSTEM
>>   length 1073741824 owner 2 stripe_len 65536 type DATA
>>   length 1073741824 owner 2 stripe_len 65536 type DATA|DUP
>>   length 219676672 owner 2 stripe_len 65536 type METADATA|DUP
>>   length 33554432 owner 2 stripe_len 65536 type SYSTEM|DUP
>>
>> This behavior greatly breaks the RAID1 tolerance.
>>
>> Even with the missing device replaced, if the device with DUP/SINGLE
>> chunks on it goes missing, the whole fs can't be mounted RW any more.
>> And we already have reports that users can't even mount the fs, as some
>> essential tree blocks got written to those DUP chunks.
>>
>> [CAUSE]
>> The cause is pretty simple, we treat missing devices as non-writable.
>> Thus when we need to allocate chunks, we can only fall back to single
>> device profiles (SINGLE and DUP).
>>
>> [FIX]
>> Just consider the missing devices as WRITABLE, so we allocate new chunks
>> on them to maintain old profiles.
>
> I'm not sure this is the best way to fix it, it makes the meaning of
> rw_devices ambiguous. A missing device is by definition not readable nor
> writeable.
>
> This should be tracked separately, i.e. counting real devices that can
> be written and devices that can be considered for allocation (with a
> documented meaning that even missing devices are included).
>

Indeed this sounds much better.

I'd go that direction.

Thanks,
Qu
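
For anyone re-running the reproducer quoted above, a small helper along these lines might make the result easier to check. It is only a sketch: `count_non_raid1_chunks` is a hypothetical name, not part of btrfs-progs, and it assumes the chunk lines look like the dump-tree output shown in the report and that every chunk on the fs is expected to carry the RAID1 flag.

```shell
#!/bin/bash

# Hypothetical helper (not a btrfs-progs command): reads chunk lines in the
# "length ... stripe_len ... type ..." form on stdin and prints how many of
# them do not mention RAID1 in their type field.
# Assumption: the fs was created with -m raid1 -d raid1, so any chunk line
# without RAID1 is one of the unwanted SINGLE/DUP chunks.
count_non_raid1_chunks() {
    awk '/stripe_len.*type/ && !/RAID1/ { n++ } END { print n + 0 }'
}
```

It would be fed from the same pipeline the reproducer already uses, e.g. `btrfs ins dump-tree -t chunk $dev1 | count_non_raid1_chunks`. For the output quoted in the report it prints 5, and on a healthy RAID1 fs it should print 0.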