On Tue, Dec 03, 2019 at 02:42:50PM +0800, Qu Wenruo wrote:
> [PROBLEM]
> There are quite some users reporting that 'btrfs balance cancel' slow to
> cancel current running balance, or even doesn't work for certain dead
> balance loop.
>
> With the following script showing how long it takes to fully stop a
> balance:
>   #!/bin/bash
>   dev=/dev/test/test
>   mnt=/mnt/btrfs
>
>   umount $mnt &> /dev/null
>   umount $dev &> /dev/null
>
>   mkfs.btrfs -f $dev
>   mount $dev -o nospace_cache $mnt
>
>   dd if=/dev/zero bs=1M of=$mnt/large &
>   dd_pid=$!
>
>   sleep 3
>   kill -KILL $dd_pid
>   sync
>
>   btrfs balance start --bg --full $mnt &
>   sleep 1
>
>   echo "cancel request" >> /dev/kmsg
>   time btrfs balance cancel $mnt
>   umount $mnt
>
> It takes around 7~10s to cancel the running balance in my test
> environment.
>
> [CAUSE]
> Btrfs uses btrfs_fs_info::balance_cancel_req to record how many cancel
> request are queued.
> However that cancelling request is only checked after relocating a block
> group.
>
> That behavior is far from optimal to provide a faster cancelling.
>
> [FIX]
> This patchset will add more cancelling check points, to make cancelling
> faster.

Nice! I look forward to using this in the future!

Does this cover device delete/resize as well? I think there needs to be
a check added for fatal signals for those to work, as they don't respond
to balance cancel.

> And also, introduce a new error injection points to cover these newly
> introduced and future check points.
>
> Qu Wenruo (4):
>   btrfs: relocation: Introduce error injection points for cancelling
>     balance
>   btrfs: relocation: Check cancel request after each data page read
>   btrfs: relocation: Check cancel request after each extent found
>   btrfs: relocation: Work around dead relocation stage loop
>
>  fs/btrfs/ctree.h      |  1 +
>  fs/btrfs/relocation.c | 23 +++++++++++++++++++++++
>  fs/btrfs/volumes.c    |  2 +-
>  3 files changed, 25 insertions(+), 1 deletion(-)
>
> --
> 2.24.0
>
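
For anyone following along, the [CAUSE]/[FIX] part of the cover letter can
be modelled in userspace roughly as below. This is only a toy sketch, not
the kernel code: cancel_req, simulate(), NR_GROUPS and EXTENTS_PER_GROUP
are made-up names, with cancel_req standing in for
btrfs_fs_info::balance_cancel_req and the nested loops standing in for
relocating block groups extent by extent.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for btrfs_fs_info::balance_cancel_req. */
static atomic_int cancel_req;

/* Made-up sizes for the toy model. */
#define NR_GROUPS         4
#define EXTENTS_PER_GROUP 100

/*
 * Walk all block groups extent by extent. A cancel request is raised once
 * cancel_at extents have been processed (1 <= cancel_at <= total extents).
 *
 * fine_grained == 0: the request is only checked after a whole block group.
 * fine_grained != 0: the request is also checked after every extent.
 *
 * Returns how many extents were still processed after the request was
 * raised, i.e. a rough measure of cancel latency.
 */
static int simulate(int cancel_at, int fine_grained)
{
    int done = 0;

    assert(cancel_at >= 1 && cancel_at <= NR_GROUPS * EXTENTS_PER_GROUP);
    atomic_store(&cancel_req, 0);

    for (int g = 0; g < NR_GROUPS; g++) {
        for (int e = 0; e < EXTENTS_PER_GROUP; e++) {
            done++;                          /* "relocate" one extent */
            if (done == cancel_at)
                atomic_store(&cancel_req, 1);

            /* Extra check point, per extent. */
            if (fine_grained && atomic_load(&cancel_req))
                return done - cancel_at;
        }

        /* Check point after each block group. */
        if (atomic_load(&cancel_req))
            return done - cancel_at;
    }
    return done - cancel_at;
}
```

With a request raised in the middle of the second block group,
simulate(150, 0) still finishes the remaining 50 extents of that group
before stopping, while simulate(150, 1) stops right away. The real
latency of course depends on how long each extent and data page takes,
which this model does not try to capture.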