From: Qu Wenruo <quwenruo.btrfs@gmx.com>
To: Zygo Blaxell <ce3g8jdj@umail.furryterror.org>
Cc: linux-btrfs@vger.kernel.org
Subject: Re: Balance loops: what we know so far
Date: Wed, 13 May 2020 14:36:21 +0800 [thread overview]
Message-ID: <437bf0bc-b308-5b41-67cc-5b84ba6a88d2@gmx.com> (raw)
In-Reply-To: <20200513050204.GX10769@hungrycats.org>
On 2020/5/13 1:02 PM, Zygo Blaxell wrote:
> On Wed, May 13, 2020 at 10:28:37AM +0800, Qu Wenruo wrote:
>>
>>
[...]
>> I'm a little surprised that it's using the LOGICAL_INO ioctl, not just
>> TREE_SEARCH.
>
> Tree search can't read shared backrefs because they refer directly to
> disk blocks, not to object/type/offset tuples. It would be nice to have
> an ioctl that can read a metadata block (or even a data block) by bytenr.
Sorry, I meant that we can use tree search on the extent tree, and only
look up the backrefs for the specified bytenr to inspect it.

In this hanging case, what I really want is to check whether the tree
block 4374646833152 belongs to the data reloc tree.
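A minimal sketch of that approach from Python, via the raw
BTRFS_IOC_TREE_SEARCH ioctl: look up just the extent item for one bytenr
in the extent tree, whose inline backrefs say which tree owns the block.
The constants and struct layout are taken from my reading of the kernel
UAPI headers, so double-check them against linux/btrfs.h; note also that
TREE_SEARCH walks the live tree, not the commit root, which is exactly
the gap being discussed here.

```python
import fcntl, struct

# Assumed values from the kernel UAPI headers; verify before relying on them.
BTRFS_EXTENT_TREE_OBJECTID = 2
BTRFS_EXTENT_ITEM_KEY = 168      # data (and old-format metadata) extents
BTRFS_METADATA_ITEM_KEY = 169    # skinny metadata extents
# _IOWR(0x94, 17, struct btrfs_ioctl_search_args); sizeof(args) == 4096
BTRFS_IOC_TREE_SEARCH = (3 << 30) | (4096 << 16) | (0x94 << 8) | 17
KEY_FMT = "=7Q4I4Q"              # struct btrfs_ioctl_search_key, 104 bytes

def pack_search_key(bytenr):
    """Build a search key covering exactly one bytenr in the extent tree."""
    return struct.pack(
        KEY_FMT,
        BTRFS_EXTENT_TREE_OBJECTID,   # tree_id
        bytenr, bytenr,               # min/max objectid == the bytenr
        0, 2**64 - 1,                 # min/max offset
        0, 2**64 - 1,                 # min/max transid
        BTRFS_EXTENT_ITEM_KEY,        # min_type
        BTRFS_METADATA_ITEM_KEY,      # max_type
        1,                            # nr_items: one extent is enough
        0,                            # unused (alignment)
        0, 0, 0, 0)                   # reserved

def search_one_extent(fd, bytenr):
    """Issue the ioctl on any fd in the filesystem; returns the number of
    items found and the raw result buffer (item headers + extent item
    with inline backrefs)."""
    buf = bytearray(pack_search_key(bytenr).ljust(4096, b"\0"))
    fcntl.ioctl(fd, BTRFS_IOC_TREE_SEARCH, buf)
    nr_found = struct.unpack_from("=I", buf, 64)[0]  # kernel rewrites nr_items
    return nr_found, bytes(buf[struct.calcsize(KEY_FMT):])
```

Parsing the inline backrefs out of the returned extent item is left out;
the point is only that one TREE_SEARCH call per bytenr is enough to see
the owner, without LOGICAL_INO.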
>
> Or even better, just a fd that can be obtained by some ioctl to access
> the btrfs virtual address space with pread().
If we want the content of 4374646833152, we can simply run
btrfs ins dump-tree -b 4374646833152.

Since balance works on the commit root, that tree block should still be
on disk at the time of the loop, and dump-tree can read it.

Another way to work around this is to provide a full extent tree dump.
(I know this is a bad idea, but it would still be faster than going back
and forth over mail.)
>
>> I guess if we could get a plain tree-search based tool (one that only
>> searches the commit root, which is exactly what balance is based on),
>> it would be easier to do the digging.
>
> That would be nice. I have an application for it. ;)
>
>>> OSError: [Errno 22] Invalid argument
>>>
>>> root@tester:~# btrfs ins log 4368594108416 /media/testfs/
>>> /media/testfs//snap-1589258042/testhost/var/log/messages.6.lzma
>>> /media/testfs//current/testhost/var/log/messages.6.lzma
>>> /media/testfs//snap-1589249822/testhost/var/log/messages.6.lzma
>>> ERROR: ino paths ioctl: No such file or directory
>>> /media/testfs//snap-1589249547/testhost/var/log/messages.6.lzma
>>> ERROR: ino paths ioctl: No such file or directory
>>> /media/testfs//snap-1589248407/testhost/var/log/messages.6.lzma
>>> /media/testfs//snap-1589256422/testhost/var/log/messages.6.lzma
>>> ERROR: ino paths ioctl: No such file or directory
>>> /media/testfs//snap-1589251322/testhost/var/log/messages.6.lzma
>>> /media/testfs//snap-1589251682/testhost/var/log/messages.6.lzma
>>> /media/testfs//snap-1589253842/testhost/var/log/messages.6.lzma
>>> /media/testfs//snap-1589246727/testhost/var/log/messages.6.lzma
>>> /media/testfs//snap-1589258582/testhost/var/log/messages.6.lzma
>>> /media/testfs//snap-1589244027/testhost/var/log/messages.6.lzma
>>> /media/testfs//snap-1589245227/testhost/var/log/messages.6.lzma
>>> ERROR: ino paths ioctl: No such file or directory
>>> ERROR: ino paths ioctl: No such file or directory
>>> /media/testfs//snap-1589246127/testhost/var/log/messages.6.lzma
>>> /media/testfs//snap-1589247327/testhost/var/log/messages.6.lzma
>>> ERROR: ino paths ioctl: No such file or directory
>>>
>>> Hmmm, I wonder if there's a problem with deleted snapshots?
>>
>> Yes, that's also what I'm guessing.
>>
>> The cleanup of the data reloc tree doesn't look correct to me.
>>
>> Thanks for the new clues,
>> Qu
>
> Here's a fun one:
>
> 1. Delete all the files on a filesystem where balance loops
> have occurred.
I tried with a newly created fs, and failed to reproduce it.
>
> 2. Verify there are no data blocks (one data block group
> with used = 0):
>
> # show_block_groups.py /testfs/
> block group vaddr 435969589248 length 1073741824 flags METADATA|RAID1 used 180224 used_pct 0
> block group vaddr 4382686969856 length 33554432 flags SYSTEM|RAID1 used 16384 used_pct 0
> block group vaddr 4383794266112 length 1073741824 flags DATA used 0 used_pct 0
>
> 3. Create a new file with a single reference in the only (root) subvol:
> # head -c 1024m > file
> # sync
> # show_block_groups.py .
> block group vaddr 435969589248 length 1073741824 flags METADATA|RAID1 used 1245184 used_pct 0
> block group vaddr 4382686969856 length 33554432 flags SYSTEM|RAID1 used 16384 used_pct 0
> block group vaddr 4384868007936 length 1073741824 flags DATA used 961708032 used_pct 90
> block group vaddr 4385941749760 length 1073741824 flags DATA used 112033792 used_pct 10
>
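(For reference, the used_pct column in these dumps is just the
used/length ratio of the block group; a one-liner reproducing it,
assuming show_block_groups.py rounds to the nearest percent, which
matches the numbers above:)

```python
def used_pct(used, length):
    # Percentage of a block group in use, rounded to the nearest percent.
    # Assumes the script rounds rather than truncates; e.g. the 90%-full
    # DATA block group above is 961708032 / 1073741824 = 89.57%.
    return round(100 * used / length)
```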
> 4. Run balance, and it immediately loops on a single extent:
> # btrfs balance start -d .
> [Wed May 13 00:41:58 2020] BTRFS info (device dm-0): balance: start -d
> [Wed May 13 00:41:58 2020] BTRFS info (device dm-0): relocating block group 4385941749760 flags data
> [Wed May 13 00:42:00 2020] BTRFS info (device dm-0): found 1 extents, loops 1, stage: move data extents
> [Wed May 13 00:42:00 2020] BTRFS info (device dm-0): found 1 extents, loops 2, stage: update data pointers
> [Wed May 13 00:42:01 2020] BTRFS info (device dm-0): found 1 extents, loops 3, stage: update data pointers
> [Wed May 13 00:42:01 2020] BTRFS info (device dm-0): found 1 extents, loops 4, stage: update data pointers
> [Wed May 13 00:42:01 2020] BTRFS info (device dm-0): found 1 extents, loops 5, stage: update data pointers
> [Wed May 13 00:42:01 2020] BTRFS info (device dm-0): found 1 extents, loops 6, stage: update data pointers
> [Wed May 13 00:42:01 2020] BTRFS info (device dm-0): found 1 extents, loops 7, stage: update data pointers
> [Wed May 13 00:42:02 2020] BTRFS info (device dm-0): found 1 extents, loops 8, stage: update data pointers
> [Wed May 13 00:42:02 2020] BTRFS info (device dm-0): found 1 extents, loops 9, stage: update data pointers
> [Wed May 13 00:42:02 2020] BTRFS info (device dm-0): found 1 extents, loops 10, stage: update data pointers
> [Wed May 13 00:42:02 2020] BTRFS info (device dm-0): found 1 extents, loops 11, stage: update data pointers
> [etc...]
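(The looping signature above — the same "found N extents" count
repeating in the same stage — is easy to watch for mechanically. A
hypothetical dmesg-watching helper, not part of any existing tool, that
flags a balance as stuck once a (count, stage) pair repeats more than a
threshold number of times:)

```python
import re

# Matches the balance progress lines btrfs logs to dmesg.
LINE_RE = re.compile(r"found (\d+) extents, loops (\d+), stage: (.+)$")

def detect_balance_loop(dmesg_lines, threshold=3):
    """Return True if any (extent count, stage) pair repeats more than
    `threshold` times, the signature of the loops in this thread."""
    seen = {}
    for line in dmesg_lines:
        m = LINE_RE.search(line)
        if not m:
            continue
        extents, stage = m.group(1), m.group(3)
        seen[(extents, stage)] = seen.get((extents, stage), 0) + 1
        if seen[(extents, stage)] > threshold:
            return True
    return False
```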
>
> I tried it 3 more times and there was no loop. The 5th try looped again.
I tried 10 times with no reproduction.

I guess there are some other factors involved; maybe a newly created fs
won't trigger it?
BTW, for the reproducible (i.e. looping) case, would you mind dumping
the data reloc root?

My current guess is that some orphan cleanup doesn't get kicked off, but
that shouldn't affect metadata with my patch :(
Thanks,
Qu
>
> There might be a correlation with cancels. After a fresh boot, I can
> often balance a few dozen block groups before there's a loop, but if I
> cancel a balance, the next balance almost always loops.
>