* [PATCH v2] btrfs: zoned: use greedy gc for auto reclaim
@ 2021-10-12 12:37 Johannes Thumshirn
From: Johannes Thumshirn @ 2021-10-12 12:37 UTC (permalink / raw)
To: David Sterba; +Cc: Johannes Thumshirn, Naohiro Aota, linux-btrfs
Currently auto reclaim of unusable zones reclaims the block-groups in the
order they have been added to the reclaim list.
Change this to a greedy algorithm by sorting the list so we have the
block-groups with the least amount of valid bytes reclaimed first.
Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
---
Changes since v1:
- Changed list_sort() comparator to 'boolean' style
Changes since RFC:
- Updated the patch description
- Don't sort the list under the spin_lock (David)
---
fs/btrfs/block-group.c | 29 +++++++++++++++++++++++++----
1 file changed, 25 insertions(+), 4 deletions(-)
diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c
index 46fdef7bbe20..e9092eba71fe 100644
--- a/fs/btrfs/block-group.c
+++ b/fs/btrfs/block-group.c
@@ -1,5 +1,7 @@
// SPDX-License-Identifier: GPL-2.0
+#include <linux/list_sort.h>
+
#include "misc.h"
#include "ctree.h"
#include "block-group.h"
@@ -1486,6 +1488,21 @@ void btrfs_mark_bg_unused(struct btrfs_block_group *bg)
spin_unlock(&fs_info->unused_bgs_lock);
}
+/*
+ * We want block groups with a low number of used bytes to be in the beginning
+ * of the list, so they will get reclaimed first.
+ */
+static int reclaim_bgs_cmp(void *unused, const struct list_head *a,
+ const struct list_head *b)
+{
+ const struct btrfs_block_group *bg1, *bg2;
+
+ bg1 = list_entry(a, struct btrfs_block_group, bg_list);
+ bg2 = list_entry(b, struct btrfs_block_group, bg_list);
+
+ return bg1->used > bg2->used;
+}
+
void btrfs_reclaim_bgs_work(struct work_struct *work)
{
struct btrfs_fs_info *fs_info =
@@ -1493,6 +1510,7 @@ void btrfs_reclaim_bgs_work(struct work_struct *work)
struct btrfs_block_group *bg;
struct btrfs_space_info *space_info;
LIST_HEAD(again_list);
+ LIST_HEAD(reclaim_list);
if (!test_bit(BTRFS_FS_OPEN, &fs_info->flags))
return;
@@ -1510,17 +1528,20 @@ void btrfs_reclaim_bgs_work(struct work_struct *work)
}
spin_lock(&fs_info->unused_bgs_lock);
- while (!list_empty(&fs_info->reclaim_bgs)) {
+ list_splice_init(&fs_info->reclaim_bgs, &reclaim_list);
+ spin_unlock(&fs_info->unused_bgs_lock);
+
+ list_sort(NULL, &reclaim_list, reclaim_bgs_cmp);
+ while (!list_empty(&reclaim_list)) {
u64 zone_unusable;
int ret = 0;
- bg = list_first_entry(&fs_info->reclaim_bgs,
+ bg = list_first_entry(&reclaim_list,
struct btrfs_block_group,
bg_list);
list_del_init(&bg->bg_list);
space_info = bg->space_info;
- spin_unlock(&fs_info->unused_bgs_lock);
/* Don't race with allocators so take the groups_sem */
down_write(&space_info->groups_sem);
@@ -1568,12 +1589,12 @@ void btrfs_reclaim_bgs_work(struct work_struct *work)
bg->start);
next:
- spin_lock(&fs_info->unused_bgs_lock);
if (ret == -EAGAIN && list_empty(&bg->bg_list))
list_add_tail(&bg->bg_list, &again_list);
else
btrfs_put_block_group(bg);
}
+ spin_lock(&fs_info->unused_bgs_lock);
list_splice_tail(&again_list, &fs_info->reclaim_bgs);
spin_unlock(&fs_info->unused_bgs_lock);
mutex_unlock(&fs_info->reclaim_bgs_lock);
--
2.32.0
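As a side note for reviewers less familiar with the ordering this patch establishes: the greedy policy (emptiest block groups reclaimed first) can be sketched in plain userspace C. This is not kernel code; `bg_sample` is a hypothetical stand-in for `struct btrfs_block_group` and `qsort()` stands in for `list_sort()`, whose comparator only distinguishes `<= 0` from `> 0` (which is why the patch can return the "boolean style" `bg1->used > bg2->used`).

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct btrfs_block_group; only the
 * "used" byte counter matters for the reclaim ordering. */
struct bg_sample {
	uint64_t used;
};

/* Full three-way comparator as required by qsort(); the kernel's
 * list_sort() is laxer and accepts the boolean-style variant. */
static int reclaim_cmp(const void *a, const void *b)
{
	const struct bg_sample *bg1 = a;
	const struct bg_sample *bg2 = b;

	return (bg1->used > bg2->used) - (bg1->used < bg2->used);
}

/* Sort reclaim candidates so the emptiest block groups come first,
 * mirroring what list_sort(NULL, &reclaim_list, reclaim_bgs_cmp)
 * does to the spliced-out list in the patch. */
static void sort_reclaim_candidates(struct bg_sample *bgs, size_t n)
{
	qsort(bgs, n, sizeof(*bgs), reclaim_cmp);
}
```

The point of the greedy order is that block groups with few valid bytes cost the least to relocate per byte of space recovered, so reclaiming them first maximizes early progress.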
* Re: [PATCH v2] btrfs: zoned: use greedy gc for auto reclaim
From: Nikolay Borisov @ 2021-10-12 13:56 UTC (permalink / raw)
To: Johannes Thumshirn, David Sterba; +Cc: Naohiro Aota, linux-btrfs
On 12.10.21 15:37, Johannes Thumshirn wrote:
> Currently auto reclaim of unusable zones reclaims the block-groups in the
> order they have been added to the reclaim list.
>
> Change this to a greedy algorithm by sorting the list so we have the
> block-groups with the least amount of valid bytes reclaimed first.
>
> Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
>
> ---
> Changes since v1:
> - Changed list_sort() comparator to 'boolean' style
>
> Changes since RFC:
> - Updated the patch description
> - Don't sort the list under the spin_lock (David)
<snip>
> @@ -1510,17 +1528,20 @@ void btrfs_reclaim_bgs_work(struct work_struct *work)
> }
>
> spin_lock(&fs_info->unused_bgs_lock);
> - while (!list_empty(&fs_info->reclaim_bgs)) {
> + list_splice_init(&fs_info->reclaim_bgs, &reclaim_list);
> + spin_unlock(&fs_info->unused_bgs_lock);
> +
> + list_sort(NULL, &reclaim_list, reclaim_bgs_cmp);
> + while (!list_empty(&reclaim_list)) {
Nit: Now that you've switched to a local reclaim_list you can convert
the while loop to list_for_each_entry_safe(), since it's guaranteed that
no new entries can be added while you are iterating the list; the
possibility of concurrent additions is generally the reason a while()
loop is preferred over one of the iteration helpers.
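To illustrate why the _safe variant is the right tool when entries are unlinked mid-walk: the second cursor is fetched before the current node is deleted. The sketch below is a minimal userspace re-creation of the relevant `<linux/list.h>` pieces, not the kernel's actual implementation; names mirror the kernel API for readability only.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal circular doubly-linked list, modeled on <linux/list.h>. */
struct list_head { struct list_head *next, *prev; };

static void INIT_LIST_HEAD(struct list_head *h) { h->next = h->prev = h; }

static void list_add_tail(struct list_head *e, struct list_head *h)
{
	e->prev = h->prev;
	e->next = h;
	h->prev->next = e;
	h->prev = e;
}

static void list_del_init(struct list_head *e)
{
	e->prev->next = e->next;
	e->next->prev = e->prev;
	e->next = e->prev = e;
}

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* The "safe" walker: n always points one entry ahead, so unlinking
 * pos inside the loop body cannot break the traversal. */
#define list_for_each_entry_safe(pos, n, head, member)                       \
	for (pos = container_of((head)->next, __typeof__(*pos), member),     \
	     n = container_of(pos->member.next, __typeof__(*pos), member);   \
	     &pos->member != (head);                                         \
	     pos = n, n = container_of(n->member.next, __typeof__(*n), member))

struct item { int v; struct list_head link; };

/* Unlink every entry while iterating, as the reclaim loop does. */
static int drain(struct list_head *head)
{
	struct item *it, *tmp;
	int count = 0;

	list_for_each_entry_safe(it, tmp, head, link) {
		list_del_init(&it->link);
		count++;
	}
	return count;
}
```

A plain list_for_each_entry() would dereference the just-unlinked node to find its successor, which is exactly the pattern the _safe helper exists to avoid.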
> u64 zone_unusable;
> int ret = 0;
>
> - bg = list_first_entry(&fs_info->reclaim_bgs,
> + bg = list_first_entry(&reclaim_list,
> struct btrfs_block_group,
> bg_list);
> list_del_init(&bg->bg_list);
>
> space_info = bg->space_info;
> - spin_unlock(&fs_info->unused_bgs_lock);
>
> /* Don't race with allocators so take the groups_sem */
> down_write(&space_info->groups_sem);
> @@ -1568,12 +1589,12 @@ void btrfs_reclaim_bgs_work(struct work_struct *work)
> bg->start);
>
> next:
> - spin_lock(&fs_info->unused_bgs_lock);
> if (ret == -EAGAIN && list_empty(&bg->bg_list))
> list_add_tail(&bg->bg_list, &again_list);
> else
> btrfs_put_block_group(bg);
> }
> + spin_lock(&fs_info->unused_bgs_lock);
> list_splice_tail(&again_list, &fs_info->reclaim_bgs);
> spin_unlock(&fs_info->unused_bgs_lock);
> mutex_unlock(&fs_info->reclaim_bgs_lock);
>
* Re: [PATCH v2] btrfs: zoned: use greedy gc for auto reclaim
From: David Sterba @ 2021-10-13 13:52 UTC (permalink / raw)
To: Nikolay Borisov
Cc: Johannes Thumshirn, David Sterba, Naohiro Aota, linux-btrfs
On Tue, Oct 12, 2021 at 04:56:01PM +0300, Nikolay Borisov wrote:
> On 12.10.21 15:37, Johannes Thumshirn wrote:
> > Currently auto reclaim of unusable zones reclaims the block-groups in the
> > order they have been added to the reclaim list.
> >
> > Change this to a greedy algorithm by sorting the list so we have the
> > block-groups with the least amount of valid bytes reclaimed first.
> >
> > Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
> >
> > ---
> > Changes since v1:
> > - Changed list_sort() comparator to 'boolean' style
> >
> > Changes since RFC:
> > - Updated the patch description
> > - Don't sort the list under the spin_lock (David)
>
> <snip>
>
>
> > @@ -1510,17 +1528,20 @@ void btrfs_reclaim_bgs_work(struct work_struct *work)
> > }
> >
> > spin_lock(&fs_info->unused_bgs_lock);
> > - while (!list_empty(&fs_info->reclaim_bgs)) {
> > + list_splice_init(&fs_info->reclaim_bgs, &reclaim_list);
> > + spin_unlock(&fs_info->unused_bgs_lock);
> > +
> > + list_sort(NULL, &reclaim_list, reclaim_bgs_cmp);
> > + while (!list_empty(&reclaim_list)) {
>
> Nit: Now that you've switched to a local reclaim_list you can convert
> the while to a list_for_each_entry_safe, since it's guaranteed that new
> entries can't be added while you are iterating the list, which is
> generally the reason why a while() is preferred to one of the iteration
> helpers.
Like the following? I'll fold it in if yes:
--- a/fs/btrfs/block-group.c
+++ b/fs/btrfs/block-group.c
@@ -1507,7 +1507,7 @@ void btrfs_reclaim_bgs_work(struct work_struct *work)
{
struct btrfs_fs_info *fs_info =
container_of(work, struct btrfs_fs_info, reclaim_bgs_work);
- struct btrfs_block_group *bg;
+ struct btrfs_block_group *bg, *tmp;
struct btrfs_space_info *space_info;
LIST_HEAD(again_list);
LIST_HEAD(reclaim_list);
@@ -1532,13 +1532,10 @@ void btrfs_reclaim_bgs_work(struct work_struct *work)
spin_unlock(&fs_info->unused_bgs_lock);
list_sort(NULL, &reclaim_list, reclaim_bgs_cmp);
- while (!list_empty(&reclaim_list)) {
+ list_for_each_entry_safe(bg, tmp, &reclaim_list, bg_list) {
u64 zone_unusable;
int ret = 0;
- bg = list_first_entry(&reclaim_list,
- struct btrfs_block_group,
- bg_list);
list_del_init(&bg->bg_list);
space_info = bg->space_info;
---
* Re: [PATCH v2] btrfs: zoned: use greedy gc for auto reclaim
From: Nikolay Borisov @ 2021-10-13 14:14 UTC (permalink / raw)
To: dsterba, Johannes Thumshirn, Naohiro Aota, linux-btrfs
On 13.10.21 16:52, David Sterba wrote:
> On Tue, Oct 12, 2021 at 04:56:01PM +0300, Nikolay Borisov wrote:
>> On 12.10.21 15:37, Johannes Thumshirn wrote:
>>> Currently auto reclaim of unusable zones reclaims the block-groups in the
>>> order they have been added to the reclaim list.
>>>
>>> Change this to a greedy algorithm by sorting the list so we have the
>>> block-groups with the least amount of valid bytes reclaimed first.
>>>
>>> Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
>>>
>>> ---
>>> Changes since v1:
>>> - Changed list_sort() comparator to 'boolean' style
>>>
>>> Changes since RFC:
>>> - Updated the patch description
>>> - Don't sort the list under the spin_lock (David)
>>
>> <snip>
>>
>>
>>> @@ -1510,17 +1528,20 @@ void btrfs_reclaim_bgs_work(struct work_struct *work)
>>> }
>>>
>>> spin_lock(&fs_info->unused_bgs_lock);
>>> - while (!list_empty(&fs_info->reclaim_bgs)) {
>>> + list_splice_init(&fs_info->reclaim_bgs, &reclaim_list);
>>> + spin_unlock(&fs_info->unused_bgs_lock);
>>> +
>>> + list_sort(NULL, &reclaim_list, reclaim_bgs_cmp);
>>> + while (!list_empty(&reclaim_list)) {
>>
>> Nit: Now that you've switched to a local reclaim_list you can convert
>> the while to a list_for_each_entry_safe, since it's guaranteed that new
>> entries can't be added while you are iterating the list, which is
>> generally the reason why a while() is preferred to one of the iteration
>> helpers.
>
> Like the following? I'll fold it in if yes:
Yes
* Re: [PATCH v2] btrfs: zoned: use greedy gc for auto reclaim
From: Nikolay Borisov @ 2021-10-13 14:18 UTC (permalink / raw)
To: Johannes Thumshirn, David Sterba; +Cc: Naohiro Aota, linux-btrfs
On 12.10.21 15:37, Johannes Thumshirn wrote:
> Currently auto reclaim of unusable zones reclaims the block-groups in the
> order they have been added to the reclaim list.
>
> Change this to a greedy algorithm by sorting the list so we have the
> block-groups with the least amount of valid bytes reclaimed first.
>
> Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
>
<snip>
> @@ -1510,17 +1528,20 @@ void btrfs_reclaim_bgs_work(struct work_struct *work)
> }
>
> spin_lock(&fs_info->unused_bgs_lock);
> - while (!list_empty(&fs_info->reclaim_bgs)) {
> + list_splice_init(&fs_info->reclaim_bgs, &reclaim_list);
> + spin_unlock(&fs_info->unused_bgs_lock);
> +
> + list_sort(NULL, &reclaim_list, reclaim_bgs_cmp);
> + while (!list_empty(&reclaim_list)) {
> u64 zone_unusable;
> int ret = 0;
>
> - bg = list_first_entry(&fs_info->reclaim_bgs,
> + bg = list_first_entry(&reclaim_list,
> struct btrfs_block_group,
> bg_list);
> list_del_init(&bg->bg_list);
>
> space_info = bg->space_info;
> - spin_unlock(&fs_info->unused_bgs_lock);
>
> /* Don't race with allocators so take the groups_sem */
> down_write(&space_info->groups_sem);
> @@ -1568,12 +1589,12 @@ void btrfs_reclaim_bgs_work(struct work_struct *work)
> bg->start);
>
> next:
> - spin_lock(&fs_info->unused_bgs_lock);
> if (ret == -EAGAIN && list_empty(&bg->bg_list))
> list_add_tail(&bg->bg_list, &again_list);
After this bg is unlinked from the local list, can its bg_list be added
for reclaim again? Is it safe to access bg_list without unused_bgs_lock
being held? btrfs_mark_bg_to_reclaim(), for example, adds a block group
for reclaim under this lock.
> else
> btrfs_put_block_group(bg);
> }
> + spin_lock(&fs_info->unused_bgs_lock);
> list_splice_tail(&again_list, &fs_info->reclaim_bgs);
> spin_unlock(&fs_info->unused_bgs_lock);
> mutex_unlock(&fs_info->reclaim_bgs_lock);
>
* Re: [PATCH v2] btrfs: zoned: use greedy gc for auto reclaim
From: Johannes Thumshirn @ 2021-10-14 6:59 UTC (permalink / raw)
To: Nikolay Borisov, David Sterba; +Cc: Naohiro Aota, linux-btrfs
On 13/10/2021 16:18, Nikolay Borisov wrote:
>> next:
>> - spin_lock(&fs_info->unused_bgs_lock);
>> if (ret == -EAGAIN && list_empty(&bg->bg_list))
>> list_add_tail(&bg->bg_list, &again_list);
>
> After this bg is unlinked from the loca list, can its bg_list be added
> for reclaim again. Is it safe to access bg_list without unused_bgs_lock
> being held? btrfs_mark_bg_to_reclaim for example will add a block group
> for reclaim under this lock?
>
Correct me if I'm wrong, but I think this is OK, because e.g.
btrfs_mark_bg_to_reclaim() won't run on the bg again once it has been
switched to read-only. So there should be no concurrent accesses to
bg->bg_list possible.
Or do you mean that I can remove the again_list and re-add bg->bg_list to
fs_info->reclaim_bgs directly? That would probably be possible and could
improve readability a bit, since we would only have to deal with two
lists instead of three.
Byte,
Johannes