* balance pause and resume seems not to work properly
@ 2020-04-29 19:28 rollojobs
  2020-05-13  5:44 ` Zygo Blaxell
  0 siblings, 1 reply; 2+ messages in thread
From: rollojobs @ 2020-04-29 19:28 UTC (permalink / raw)
  To: linux-btrfs

Hi,
I created a filesystem of two drives (6TB and 750GB) with both data and metadata raid 1. Then I put a couple of gigs on it and added a 2TB drive. After balancing, the small drive should be empty and the two large drives should have all the data. But this doesn't work if balance is paused and resumed.

I started balance with:
btrfs balance start --bg /srv/dev-disk-by-label-BTRFS1/

and then watched status:
Balance on '/srv/dev-disk-by-label-BTRFS1/' is running
3 out of about 14 chunks balanced (4 considered),  79% left  

At this point I paused and then resumed.

btrfs balance pause /srv/dev-disk-by-label-BTRFS1/
btrfs balance resume /srv/dev-disk-by-label-BTRFS1/

And then I get this balance status:
0 out of about 3 chunks balanced (1 considered), 100% left
Balance on '/srv/dev-disk-by-label-BTRFS1/' is running

Why is it 3 chunks and not 11? It finished with:

Done, had to relocate 3 out of 14 chunks

and I've got this distribution:

Label: 'BTRFS1'  uuid: 61e5aba9-6811-46ae-9396-35a72d3b1117
        Total devices 3 FS bytes used 11.13GiB
        devid    1 size 5.46TiB used 13.03GiB path /dev/sdc1
        devid    3 size 698.64GiB used 4.00GiB path /dev/sdf
        devid    4 size 1.82TiB used 9.03GiB path /dev/sde


As it's raid 1, device 3 should be empty after balance finishes. Running balance without a pause confirms that:

Label: 'BTRFS1'  uuid: 61e5aba9-6811-46ae-9396-35a72d3b1117
        Total devices 3 FS bytes used 11.13GiB
        devid    1 size 5.46TiB used 13.03GiB path /dev/sdc1
        devid    3 size 698.64GiB used 0.00B path /dev/sdf
        devid    4 size 1.82TiB used 13.03GiB path /dev/sde  

Does that mean balance doesn't work properly after pause and resume?

I'm running
btrfs-progs v4.20.1
5.4.0-0.bpo.4-amd64

Best Regards


* Re: balance pause and resume seems not to work properly
  2020-04-29 19:28 balance pause and resume seems not to work properly rollojobs
@ 2020-05-13  5:44 ` Zygo Blaxell
  0 siblings, 0 replies; 2+ messages in thread
From: Zygo Blaxell @ 2020-05-13  5:44 UTC (permalink / raw)
  To: rollojobs; +Cc: linux-btrfs

On Wed, Apr 29, 2020 at 09:28:14PM +0200, rollojobs@partyheld.de wrote:
> Hi,
> I created a filesystem of two drives (6TB and 750GB) with both data and metadata raid 1. Then I put a couple of gigs on it and added a 2TB drive. After balancing, the small drive should be empty and the two large drives should have all the data. But this doesn't work if balance is paused and resumed.
> 
> I started balance with:
> btrfs balance start --bg /srv/dev-disk-by-label-BTRFS1/
> 
> and then watched status:
> Balance on '/srv/dev-disk-by-label-BTRFS1/' is running
> 3 out of about 14 chunks balanced (4 considered),  79% left  
> 
> At this point I paused and then resumed.
> 
> btrfs balance pause /srv/dev-disk-by-label-BTRFS1/
> btrfs balance resume /srv/dev-disk-by-label-BTRFS1/
> 
> And then I get this balance status:
> 0 out of about 3 chunks balanced (1 considered), 100% left
> Balance on '/srv/dev-disk-by-label-BTRFS1/' is running
> 
> Why is it 3 chunks and not 11? It finished with:
> 
> Done, had to relocate 3 out of 14 chunks
> 
> and I've got this distribution:
> 
> Label: 'BTRFS1'  uuid: 61e5aba9-6811-46ae-9396-35a72d3b1117
>         Total devices 3 FS bytes used 11.13GiB
>         devid    1 size 5.46TiB used 13.03GiB path /dev/sdc1
>         devid    3 size 698.64GiB used 4.00GiB path /dev/sdf
>         devid    4 size 1.82TiB used 9.03GiB path /dev/sde
> 
> 
> As it's raid 1, device 3 should be empty after balance finishes. Running balance without a pause confirms that:
> 
> Label: 'BTRFS1'  uuid: 61e5aba9-6811-46ae-9396-35a72d3b1117
>         Total devices 3 FS bytes used 11.13GiB
>         devid    1 size 5.46TiB used 13.03GiB path /dev/sdc1
>         devid    3 size 698.64GiB used 0.00B path /dev/sdf
>         devid    4 size 1.82TiB used 13.03GiB path /dev/sde  
> 
> Does that mean balance doesn't work properly after pause and resume?

Yes, balance resume has been broken since the beginning.

The offending code is:

commit 596410151ed71819b9e8a8018c6c9992796b256d
Author: Ilya Dryomov <idryomov@gmail.com>
Date:   Mon Jan 16 22:04:48 2012 +0200

    Btrfs: recover balance on mount

+/*
+ * This is a heuristic used to reduce the number of chunks balanced on
+ * resume after balance was interrupted.
+ */
+static void update_balance_args(struct btrfs_balance_control *bctl)
+{
+       /*
+        * Turn on soft mode for chunk types that were being converted.
+        */
+       if (bctl->data.flags & BTRFS_BALANCE_ARGS_CONVERT)
+               bctl->data.flags |= BTRFS_BALANCE_ARGS_SOFT;
+       if (bctl->sys.flags & BTRFS_BALANCE_ARGS_CONVERT)
+               bctl->sys.flags |= BTRFS_BALANCE_ARGS_SOFT;
+       if (bctl->meta.flags & BTRFS_BALANCE_ARGS_CONVERT)
+               bctl->meta.flags |= BTRFS_BALANCE_ARGS_SOFT;
+
+       /*
+        * Turn on usage filter if is not already used.  The idea is
+        * that chunks that we have already balanced should be
+        * reasonably full.  Don't do it for chunks that are being
+        * converted - that will keep us from relocating unconverted
+        * (albeit full) chunks.
+        */
+       if (!(bctl->data.flags & BTRFS_BALANCE_ARGS_USAGE) &&
+           !(bctl->data.flags & BTRFS_BALANCE_ARGS_CONVERT)) {
+               bctl->data.flags |= BTRFS_BALANCE_ARGS_USAGE;
+               bctl->data.usage = 90;
+       }
+       if (!(bctl->sys.flags & BTRFS_BALANCE_ARGS_USAGE) &&
+           !(bctl->sys.flags & BTRFS_BALANCE_ARGS_CONVERT)) {
+               bctl->sys.flags |= BTRFS_BALANCE_ARGS_USAGE;
+               bctl->sys.usage = 90;
+       }
+       if (!(bctl->meta.flags & BTRFS_BALANCE_ARGS_USAGE) &&
+           !(bctl->meta.flags & BTRFS_BALANCE_ARGS_CONVERT)) {
+               bctl->meta.flags |= BTRFS_BALANCE_ARGS_USAGE;
+               bctl->meta.usage = 90;
+       }
+}
+
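To make the effect on a balance started without filters concrete, here is a minimal userspace sketch of the same per-type logic.  The struct, the function names and the flag values are simplified stand-ins (assumptions, not the kernel definitions), and the usage check is reduced to a plain percentage comparison:

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in flag bits for this sketch; the real definitions live in
 * the kernel's btrfs uapi header and should be checked there. */
#define ARGS_USAGE   (1ULL << 1)
#define ARGS_CONVERT (1ULL << 8)
#define ARGS_SOFT    (1ULL << 9)

/* Simplified stand-in for the per-type balance arguments
 * (one instance each for data, metadata and system). */
struct sketch_args {
        uint64_t flags;
        uint64_t usage;         /* percent threshold for the usage filter */
};

/* Same logic as update_balance_args() above, for one chunk type. */
static void sketch_update_args(struct sketch_args *a)
{
        /* Conversions are resumed in soft mode. */
        if (a->flags & ARGS_CONVERT)
                a->flags |= ARGS_SOFT;

        /* Everything else gets a usage filter if it had none. */
        if (!(a->flags & ARGS_USAGE) && !(a->flags & ARGS_CONVERT)) {
                a->flags |= ARGS_USAGE;
                a->usage = 90;
        }
}

/* Simplified usage filter: a chunk is relocated only if it is
 * less full than the threshold. */
static int sketch_chunk_selected(const struct sketch_args *a,
                                 unsigned int used_pct)
{
        assert(used_pct <= 100);
        if (!(a->flags & ARGS_USAGE))
                return 1;
        return used_pct < a->usage;
}
```

With all-zero arguments, as in a plain "btrfs balance start", sketch_update_args() turns on the usage filter at 90, so a chunk that is 95% full is skipped on resume even if it was never balanced before the pause.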

At this point I was going to say this code couldn't use the VRANGE filter
to resume balance because the commit predates the VRANGE filter but,
well, it doesn't.  VRANGE was added in this commit, which came before
the above:

commit ea67176ae8c024f64d85ec33873e5eadf1af7247
Author: Ilya Dryomov <idryomov@gmail.com>
Date:   Mon Jan 16 22:04:48 2012 +0200

    Btrfs: virtual address space subset filter

What balance _should_ do is update the upper limit of the vrange filter
(turning it on if not supplied) every time a block group is completed.
This is currently done with the limit filter.  Then a resume would
actually resume the balance, and the above code can be removed.
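
A rough userspace sketch of that idea, not a patch.  All names and the flag value are hypothetical, and it assumes that balance walks block groups from the highest virtual address downwards and that the vrange filter selects any chunk overlapping [vstart, vend):

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in flag bit for this sketch, not taken from the kernel headers. */
#define ARGS_VRANGE (1ULL << 4)

/* Simplified stand-in for the vrange part of the balance arguments. */
struct sketch_vargs {
        uint64_t flags;
        uint64_t vstart;
        uint64_t vend;
};

/* Overlap test: a chunk is selected if any part of it lies in
 * [vstart, vend).  Without the filter every chunk is selected. */
static int sketch_vrange_selected(const struct sketch_vargs *a,
                                  uint64_t start, uint64_t len)
{
        if (!(a->flags & ARGS_VRANGE))
                return 1;
        return start < a->vend && start + len > a->vstart;
}

/* Hypothetical progress hook: called when the block group starting at
 * bg_start has been relocated.  Assumes a high-to-low walk, so all
 * remaining work lies below bg_start. */
static void sketch_block_group_done(struct sketch_vargs *a, uint64_t bg_start)
{
        /* Turn the filter on with the widest range if it was not supplied. */
        if (!(a->flags & ARGS_VRANGE)) {
                a->flags |= ARGS_VRANGE;
                a->vstart = 0;
                a->vend = UINT64_MAX;
        }
        /* Pull the upper limit down to exclude the finished block group. */
        if (bg_start < a->vend)
                a->vend = bg_start;
        assert(a->vstart <= a->vend);
}
```

If the updated arguments were saved along with the rest of the balance state, a resumed balance would skip the block groups that were already done, including the newly written ones, which under these assumptions end up above the recorded upper limit.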

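What balance resume does now is start a completely new balance with the
wrong filter arguments.  Your balance was started without filters, so the
resumed one gets usage=90 for data, metadata and system chunks and only
considers chunks that are less than 90% full.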

> I'm running
> btrfs-progs v4.20.1
> 5.4.0-0.bpo.4-amd64
> 
> Best Regards
