Linux-BTRFS Archive on lore.kernel.org
 help / color / Atom feed
* [PATCH 0/4] btrfs: Make balance cancelling response faster
@ 2019-12-03  6:42 Qu Wenruo
  2019-12-03  6:42 ` [PATCH 1/4] btrfs: relocation: Introduce error injection points for cancelling balance Qu Wenruo
                   ` (5 more replies)
  0 siblings, 6 replies; 9+ messages in thread
From: Qu Wenruo @ 2019-12-03  6:42 UTC (permalink / raw)
  To: linux-btrfs

[PROBLEM]
There are quite some users reporting that 'btrfs balance cancel' slow to
cancel current running balance, or even doesn't work for certain dead
balance loop.

With the following script showing how long it takes to fully stop a
balance:
  #!/bin/bash
  dev=/dev/test/test
  mnt=/mnt/btrfs

  umount $mnt &> /dev/null
  umount $dev &> /dev/null

  mkfs.btrfs -f $dev
  mount $dev -o nospace_cache $mnt

  dd if=/dev/zero bs=1M of=$mnt/large &
  dd_pid=$!

  sleep 3
  kill -KILL $dd_pid
  sync

  btrfs balance start --bg --full $mnt &
  sleep 1

  echo "cancel request" >> /dev/kmsg
  time btrfs balance cancel $mnt
  umount $mnt

It takes around 7~10s to cancel the running balance in my test
environment.

[CAUSE]
Btrfs uses btrfs_fs_info::balance_cancel_req to record how many cancel
request are queued.
However that cancelling request is only checked after relocating a block
group.

That behavior is far from optimal to provide a faster cancelling.

[FIX]
This patchset will add more cancelling check points, to make cancelling
faster.

And also, introduce a new error injection points to cover these newly
introduced and future check points.

Qu Wenruo (4):
  btrfs: relocation: Introduce error injection points for cancelling
    balance
  btrfs: relocation: Check cancel request after each data page read
  btrfs: relocation: Check cancel request after each extent found
  btrfs: relocation: Work around dead relocation stage loop

 fs/btrfs/ctree.h      |  1 +
 fs/btrfs/relocation.c | 23 +++++++++++++++++++++++
 fs/btrfs/volumes.c    |  2 +-
 3 files changed, 25 insertions(+), 1 deletion(-)

-- 
2.24.0


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH 1/4] btrfs: relocation: Introduce error injection points for cancelling balance
  2019-12-03  6:42 [PATCH 0/4] btrfs: Make balance cancelling response faster Qu Wenruo
@ 2019-12-03  6:42 ` Qu Wenruo
  2019-12-03 13:29   ` Johannes Thumshirn
  2019-12-03  6:42 ` [PATCH 2/4] btrfs: relocation: Check cancel request after each data page read Qu Wenruo
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 9+ messages in thread
From: Qu Wenruo @ 2019-12-03  6:42 UTC (permalink / raw)
  To: linux-btrfs

Introduce a new error injection point, should_cancel_balance().

It's just a wrapper of atomic_read(&fs_info->balance_cancel_req), but
allows us to override the return value.

Currently there are only one locations using this function:
- btrfs_balance()
  It checks cancel before each block group.

There are other locations checking fs_info->balance_cancel_req, but they
are not used as an indicator to exit, so there is no need to use the
wrapper.

But there will be more locations coming, and some locations can cause
kernel panic if not handled properly.

So introduce this error injection to provide better test interface.

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/ctree.h      |  1 +
 fs/btrfs/relocation.c | 11 +++++++++++
 fs/btrfs/volumes.c    |  2 +-
 3 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index b2e8fd8a8e59..a712c18d2da2 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -3352,6 +3352,7 @@ void btrfs_reloc_pre_snapshot(struct btrfs_pending_snapshot *pending,
 			      u64 *bytes_to_reserve);
 int btrfs_reloc_post_snapshot(struct btrfs_trans_handle *trans,
 			      struct btrfs_pending_snapshot *pending);
+int should_cancel_balance(struct btrfs_fs_info *fs_info);
 
 /* scrub.c */
 int btrfs_scrub_dev(struct btrfs_fs_info *fs_info, u64 devid, u64 start,
diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
index d897a8e5e430..9edd65b1bf82 100644
--- a/fs/btrfs/relocation.c
+++ b/fs/btrfs/relocation.c
@@ -9,6 +9,7 @@
 #include <linux/blkdev.h>
 #include <linux/rbtree.h>
 #include <linux/slab.h>
+#include <linux/error-injection.h>
 #include "ctree.h"
 #include "disk-io.h"
 #include "transaction.h"
@@ -3223,6 +3224,16 @@ int setup_extent_mapping(struct inode *inode, u64 start, u64 end,
 	return ret;
 }
 
+/*
+ * This wrapper is to allow error injection to fully test all cancel
+ * call sites.
+ */
+int should_cancel_balance(struct btrfs_fs_info *fs_info)
+{
+	return atomic_read(&fs_info->balance_cancel_req);
+}
+ALLOW_ERROR_INJECTION(should_cancel_balance, TRUE);
+
 static int relocate_file_extent_cluster(struct inode *inode,
 					struct file_extent_cluster *cluster)
 {
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index d8e5560db285..45a91a95557c 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -3856,7 +3856,7 @@ int btrfs_balance(struct btrfs_fs_info *fs_info,
 
 	if (btrfs_fs_closing(fs_info) ||
 	    atomic_read(&fs_info->balance_pause_req) ||
-	    atomic_read(&fs_info->balance_cancel_req)) {
+	    should_cancel_balance(fs_info)) {
 		ret = -EINVAL;
 		goto out;
 	}
-- 
2.24.0


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH 2/4] btrfs: relocation: Check cancel request after each data page read
  2019-12-03  6:42 [PATCH 0/4] btrfs: Make balance cancelling response faster Qu Wenruo
  2019-12-03  6:42 ` [PATCH 1/4] btrfs: relocation: Introduce error injection points for cancelling balance Qu Wenruo
@ 2019-12-03  6:42 ` Qu Wenruo
  2019-12-03  6:42 ` [PATCH 3/4] btrfs: relocation: Check cancel request after each extent found Qu Wenruo
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 9+ messages in thread
From: Qu Wenruo @ 2019-12-03  6:42 UTC (permalink / raw)
  To: linux-btrfs

When relocating a data extents with large large data extents, we spend
most of our time in relocate_file_extent_cluster() at stage "moving data
extents":
 1)               |  btrfs_relocate_block_group [btrfs]() {
 1)               |    relocate_file_extent_cluster [btrfs]() {
 1) $ 6586769 us  |    }
 1) + 18.260 us   |    relocate_file_extent_cluster [btrfs]();
 1) + 15.770 us   |    relocate_file_extent_cluster [btrfs]();
 1) $ 8916340 us  |  }
 1)               |  btrfs_relocate_block_group [btrfs]() {
 1)               |    relocate_file_extent_cluster [btrfs]() {
 1) $ 11611586 us |    }
 1) + 16.930 us   |    relocate_file_extent_cluster [btrfs]();
 1) + 15.870 us   |    relocate_file_extent_cluster [btrfs]();
 1) $ 14986130 us |  }

So to make data relocation cancelling quicker, here add extra balance
cancelling check after each page read in relocate_file_extent_cluster().

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/relocation.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
index 9edd65b1bf82..533481a1f962 100644
--- a/fs/btrfs/relocation.c
+++ b/fs/btrfs/relocation.c
@@ -3355,6 +3355,10 @@ static int relocate_file_extent_cluster(struct inode *inode,
 		btrfs_delalloc_release_extents(BTRFS_I(inode), PAGE_SIZE);
 		balance_dirty_pages_ratelimited(inode->i_mapping);
 		btrfs_throttle(fs_info);
+		if (should_cancel_balance(fs_info)) {
+			ret = -ECANCELED;
+			goto out;
+		}
 	}
 	WARN_ON(nr != cluster->nr);
 out:
-- 
2.24.0


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH 3/4] btrfs: relocation: Check cancel request after each extent found
  2019-12-03  6:42 [PATCH 0/4] btrfs: Make balance cancelling response faster Qu Wenruo
  2019-12-03  6:42 ` [PATCH 1/4] btrfs: relocation: Introduce error injection points for cancelling balance Qu Wenruo
  2019-12-03  6:42 ` [PATCH 2/4] btrfs: relocation: Check cancel request after each data page read Qu Wenruo
@ 2019-12-03  6:42 ` Qu Wenruo
  2019-12-03  6:42 ` [PATCH 4/4] btrfs: relocation: Work around dead relocation stage loop Qu Wenruo
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 9+ messages in thread
From: Qu Wenruo @ 2019-12-03  6:42 UTC (permalink / raw)
  To: linux-btrfs

When relocating data block groups with tons of small extents, or
large metadata block groups, there can be over 200,000 extents.

We will iterate all extents of such block group in relocate_block_group(),
where iteration itself can be kinda time-consuming.

So when user want to cancel the balance, the extent iteration loop can
be another target.

This patch will add the cancelling check in the extent iteration loop of
relocate_block_group() to make balance cancelling faster.

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/relocation.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
index 533481a1f962..161d66f70190 100644
--- a/fs/btrfs/relocation.c
+++ b/fs/btrfs/relocation.c
@@ -4140,6 +4140,10 @@ static noinline_for_stack int relocate_block_group(struct reloc_control *rc)
 				break;
 			}
 		}
+		if (should_cancel_balance(fs_info)) {
+			err = -ECANCELED;
+			break;
+		}
 	}
 	if (trans && progress && err == -ENOSPC) {
 		ret = btrfs_force_chunk_alloc(trans, rc->block_group->flags);
-- 
2.24.0


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH 4/4] btrfs: relocation: Work around dead relocation stage loop
  2019-12-03  6:42 [PATCH 0/4] btrfs: Make balance cancelling response faster Qu Wenruo
                   ` (2 preceding siblings ...)
  2019-12-03  6:42 ` [PATCH 3/4] btrfs: relocation: Check cancel request after each extent found Qu Wenruo
@ 2019-12-03  6:42 ` Qu Wenruo
  2019-12-04 16:39 ` [PATCH 0/4] btrfs: Make balance cancelling response faster David Sterba
  2019-12-05  2:58 ` Zygo Blaxell
  5 siblings, 0 replies; 9+ messages in thread
From: Qu Wenruo @ 2019-12-03  6:42 UTC (permalink / raw)
  To: linux-btrfs

There are some reports of dead relocation stage loop, where dmesg is
flooded by "Found X extents".

The root cause of it is still uncertain, but we can work around such bug
by checking cancelling request so user can at least cancel such dead
loop.

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/relocation.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
index 161d66f70190..23aa630f04c9 100644
--- a/fs/btrfs/relocation.c
+++ b/fs/btrfs/relocation.c
@@ -4417,6 +4417,10 @@ int btrfs_relocate_block_group(struct btrfs_fs_info *fs_info, u64 group_start)
 
 		btrfs_info(fs_info, "found %llu extents", rc->extents_found);
 
+		if (should_cancel_balance(fs_info)) {
+			err = -ECANCELED;
+			goto out;
+		}
 	}
 
 	WARN_ON(rc->block_group->pinned > 0);
-- 
2.24.0


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 1/4] btrfs: relocation: Introduce error injection points for cancelling balance
  2019-12-03  6:42 ` [PATCH 1/4] btrfs: relocation: Introduce error injection points for cancelling balance Qu Wenruo
@ 2019-12-03 13:29   ` Johannes Thumshirn
  0 siblings, 0 replies; 9+ messages in thread
From: Johannes Thumshirn @ 2019-12-03 13:29 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: linux-btrfs

Looks good,
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 0/4] btrfs: Make balance cancelling response faster
  2019-12-03  6:42 [PATCH 0/4] btrfs: Make balance cancelling response faster Qu Wenruo
                   ` (3 preceding siblings ...)
  2019-12-03  6:42 ` [PATCH 4/4] btrfs: relocation: Work around dead relocation stage loop Qu Wenruo
@ 2019-12-04 16:39 ` David Sterba
  2019-12-05  2:58 ` Zygo Blaxell
  5 siblings, 0 replies; 9+ messages in thread
From: David Sterba @ 2019-12-04 16:39 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: linux-btrfs

On Tue, Dec 03, 2019 at 02:42:50PM +0800, Qu Wenruo wrote:
> [PROBLEM]
> There are quite some users reporting that 'btrfs balance cancel' slow to
> cancel current running balance, or even doesn't work for certain dead
> balance loop.
> 
> With the following script showing how long it takes to fully stop a
> balance:
>   #!/bin/bash
>   dev=/dev/test/test
>   mnt=/mnt/btrfs
> 
>   umount $mnt &> /dev/null
>   umount $dev &> /dev/null
> 
>   mkfs.btrfs -f $dev
>   mount $dev -o nospace_cache $mnt
> 
>   dd if=/dev/zero bs=1M of=$mnt/large &
>   dd_pid=$!
> 
>   sleep 3
>   kill -KILL $dd_pid
>   sync
> 
>   btrfs balance start --bg --full $mnt &
>   sleep 1
> 
>   echo "cancel request" >> /dev/kmsg
>   time btrfs balance cancel $mnt
>   umount $mnt
> 
> It takes around 7~10s to cancel the running balance in my test
> environment.
> 
> [CAUSE]
> Btrfs uses btrfs_fs_info::balance_cancel_req to record how many cancel
> request are queued.
> However that cancelling request is only checked after relocating a block
> group.

Yes that's the reason why it takes so long to cancel. Adding more
cancellation points is fine, but I don't know what exactly happens when
the block group relocation is not finished. There's code to merge the
reloc inode and commit that, but that's only a high-level view of the
thing.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 0/4] btrfs: Make balance cancelling response faster
  2019-12-03  6:42 [PATCH 0/4] btrfs: Make balance cancelling response faster Qu Wenruo
                   ` (4 preceding siblings ...)
  2019-12-04 16:39 ` [PATCH 0/4] btrfs: Make balance cancelling response faster David Sterba
@ 2019-12-05  2:58 ` Zygo Blaxell
  2019-12-05  3:26   ` Qu Wenruo
  5 siblings, 1 reply; 9+ messages in thread
From: Zygo Blaxell @ 2019-12-05  2:58 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: linux-btrfs

[-- Attachment #1: Type: text/plain, Size: 2098 bytes --]

On Tue, Dec 03, 2019 at 02:42:50PM +0800, Qu Wenruo wrote:
> [PROBLEM]
> There are quite some users reporting that 'btrfs balance cancel' slow to
> cancel current running balance, or even doesn't work for certain dead
> balance loop.
> 
> With the following script showing how long it takes to fully stop a
> balance:
>   #!/bin/bash
>   dev=/dev/test/test
>   mnt=/mnt/btrfs
> 
>   umount $mnt &> /dev/null
>   umount $dev &> /dev/null
> 
>   mkfs.btrfs -f $dev
>   mount $dev -o nospace_cache $mnt
> 
>   dd if=/dev/zero bs=1M of=$mnt/large &
>   dd_pid=$!
> 
>   sleep 3
>   kill -KILL $dd_pid
>   sync
> 
>   btrfs balance start --bg --full $mnt &
>   sleep 1
> 
>   echo "cancel request" >> /dev/kmsg
>   time btrfs balance cancel $mnt
>   umount $mnt
> 
> It takes around 7~10s to cancel the running balance in my test
> environment.
> 
> [CAUSE]
> Btrfs uses btrfs_fs_info::balance_cancel_req to record how many cancel
> request are queued.
> However that cancelling request is only checked after relocating a block
> group.
> 
> That behavior is far from optimal to provide a faster cancelling.
> 
> [FIX]
> This patchset will add more cancelling check points, to make cancelling
> faster.

Nice!  I look forward to using this in the future!

Does this cover device delete/resize as well?  I think there needs to be
a check added for fatal signals for those to work, as they don't respond
to balance cancel.

> And also, introduce a new error injection points to cover these newly
> introduced and future check points.
> 
> Qu Wenruo (4):
>   btrfs: relocation: Introduce error injection points for cancelling
>     balance
>   btrfs: relocation: Check cancel request after each data page read
>   btrfs: relocation: Check cancel request after each extent found
>   btrfs: relocation: Work around dead relocation stage loop
> 
>  fs/btrfs/ctree.h      |  1 +
>  fs/btrfs/relocation.c | 23 +++++++++++++++++++++++
>  fs/btrfs/volumes.c    |  2 +-
>  3 files changed, 25 insertions(+), 1 deletion(-)
> 
> -- 
> 2.24.0
> 

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 195 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 0/4] btrfs: Make balance cancelling response faster
  2019-12-05  2:58 ` Zygo Blaxell
@ 2019-12-05  3:26   ` Qu Wenruo
  0 siblings, 0 replies; 9+ messages in thread
From: Qu Wenruo @ 2019-12-05  3:26 UTC (permalink / raw)
  To: Zygo Blaxell, Qu Wenruo; +Cc: linux-btrfs

[-- Attachment #1.1: Type: text/plain, Size: 2432 bytes --]



On 2019/12/5 上午10:58, Zygo Blaxell wrote:
> On Tue, Dec 03, 2019 at 02:42:50PM +0800, Qu Wenruo wrote:
>> [PROBLEM]
>> There are quite some users reporting that 'btrfs balance cancel' slow to
>> cancel current running balance, or even doesn't work for certain dead
>> balance loop.
>>
>> With the following script showing how long it takes to fully stop a
>> balance:
>>   #!/bin/bash
>>   dev=/dev/test/test
>>   mnt=/mnt/btrfs
>>
>>   umount $mnt &> /dev/null
>>   umount $dev &> /dev/null
>>
>>   mkfs.btrfs -f $dev
>>   mount $dev -o nospace_cache $mnt
>>
>>   dd if=/dev/zero bs=1M of=$mnt/large &
>>   dd_pid=$!
>>
>>   sleep 3
>>   kill -KILL $dd_pid
>>   sync
>>
>>   btrfs balance start --bg --full $mnt &
>>   sleep 1
>>
>>   echo "cancel request" >> /dev/kmsg
>>   time btrfs balance cancel $mnt
>>   umount $mnt
>>
>> It takes around 7~10s to cancel the running balance in my test
>> environment.
>>
>> [CAUSE]
>> Btrfs uses btrfs_fs_info::balance_cancel_req to record how many cancel
>> request are queued.
>> However that cancelling request is only checked after relocating a block
>> group.
>>
>> That behavior is far from optimal to provide a faster cancelling.
>>
>> [FIX]
>> This patchset will add more cancelling check points, to make cancelling
>> faster.
> 
> Nice!  I look forward to using this in the future!
> 
> Does this cover device delete/resize as well?

Shrink also takes use of balance, so I see no reason why it won't work
on such use cases.

>  I think there needs to be
> a check added for fatal signals for those to work, as they don't respond
> to balance cancel.

That's a good extra idea.

Since we have that wrapper, it would be easier to add in the future.

Thanks,
Qu

> 
>> And also, introduce a new error injection points to cover these newly
>> introduced and future check points.
>>
>> Qu Wenruo (4):
>>   btrfs: relocation: Introduce error injection points for cancelling
>>     balance
>>   btrfs: relocation: Check cancel request after each data page read
>>   btrfs: relocation: Check cancel request after each extent found
>>   btrfs: relocation: Work around dead relocation stage loop
>>
>>  fs/btrfs/ctree.h      |  1 +
>>  fs/btrfs/relocation.c | 23 +++++++++++++++++++++++
>>  fs/btrfs/volumes.c    |  2 +-
>>  3 files changed, 25 insertions(+), 1 deletion(-)
>>
>> -- 
>> 2.24.0
>>


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, back to index

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-12-03  6:42 [PATCH 0/4] btrfs: Make balance cancelling response faster Qu Wenruo
2019-12-03  6:42 ` [PATCH 1/4] btrfs: relocation: Introduce error injection points for cancelling balance Qu Wenruo
2019-12-03 13:29   ` Johannes Thumshirn
2019-12-03  6:42 ` [PATCH 2/4] btrfs: relocation: Check cancel request after each data page read Qu Wenruo
2019-12-03  6:42 ` [PATCH 3/4] btrfs: relocation: Check cancel request after each extent found Qu Wenruo
2019-12-03  6:42 ` [PATCH 4/4] btrfs: relocation: Work around dead relocation stage loop Qu Wenruo
2019-12-04 16:39 ` [PATCH 0/4] btrfs: Make balance cancelling response faster David Sterba
2019-12-05  2:58 ` Zygo Blaxell
2019-12-05  3:26   ` Qu Wenruo

Linux-BTRFS Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-btrfs/0 linux-btrfs/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-btrfs linux-btrfs/ https://lore.kernel.org/linux-btrfs \
		linux-btrfs@vger.kernel.org
	public-inbox-index linux-btrfs

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-btrfs


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git