From: Qu Wenruo <quwenruo.btrfs@gmx.com>
To: Zygo Blaxell <ce3g8jdj@umail.furryterror.org>
Cc: linux-btrfs@vger.kernel.org
Subject: Re: Balance loops: what we know so far
Date: Wed, 13 May 2020 14:36:21 +0800
Message-ID: <437bf0bc-b308-5b41-67cc-5b84ba6a88d2@gmx.com>
In-Reply-To: <20200513050204.GX10769@hungrycats.org>


On 2020/5/13 1:02 PM, Zygo Blaxell wrote:
> On Wed, May 13, 2020 at 10:28:37AM +0800, Qu Wenruo wrote:
>>
>>
[...]
>> I'm a little surprised that it's using the logical ino ioctl, not just
>> TREE_SEARCH.
> 
> Tree search can't read shared backrefs because they refer directly to
> disk blocks, not to object/type/offset tuples.  It would be nice to have
> an ioctl that can read a metadata block (or even a data block) by bytenr.

Sorry, I mean we can use tree search to search the extent tree, and only
look up the backrefs for the specified bytenr to inspect it.

In such a hanging case, what I really want is to check whether tree block
4374646833152 belongs to the data reloc tree.
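
For concreteness, a minimal sketch of that lookup with the existing
TREE_SEARCH ioctl (raw struct packing, no python-btrfs dependency; the
constants are from the kernel UAPI headers; note the ioctl searches the
live tree, not the commit root that balance works on, which is the
limitation discussed below):

#!/usr/bin/env python3
# Minimal sketch, not a tested tool: list the extent tree items for one
# bytenr via BTRFS_IOC_TREE_SEARCH.
import fcntl
import os
import struct
import sys

BTRFS_IOC_TREE_SEARCH = 0xD0009411         # _IOWR(0x94, 17, 4096-byte args)
EXTENT_TREE_OBJECTID = 2
TREE_BLOCK_REF_KEY = 176
DATA_RELOC_TREE_OBJECTID = (1 << 64) - 9    # -9 as u64
U64_MAX = (1 << 64) - 1

def extent_items(mnt, bytenr):
    # struct btrfs_ioctl_search_key: 7 u64, 4 u32, 4 unused u64 (104 bytes),
    # followed by a 3992-byte result buffer.
    key = struct.pack('=7Q4I4Q',
                      EXTENT_TREE_OBJECTID,     # tree_id
                      bytenr, bytenr,           # min/max objectid == extent bytenr
                      0, U64_MAX,               # min/max offset
                      0, U64_MAX,               # min/max transid
                      0, 255,                   # min/max type
                      4096, 0,                  # nr_items, unused
                      0, 0, 0, 0)
    args = bytearray(key) + bytearray(4096 - len(key))
    fd = os.open(mnt, os.O_RDONLY)
    try:
        fcntl.ioctl(fd, BTRFS_IOC_TREE_SEARCH, args)
    finally:
        os.close(fd)
    nr_items = struct.unpack_from('=I', args, 64)[0]   # written back by the kernel
    pos = 104
    for _ in range(nr_items):
        # struct btrfs_ioctl_search_header: transid, objectid, offset (u64),
        # type, len (u32) = 32 bytes, then len bytes of item data.
        transid, objectid, offset, type_, length = struct.unpack_from('=3Q2I', args, pos)
        pos += 32 + length
        yield objectid, type_, offset, transid

if __name__ == '__main__':
    mnt, bytenr = sys.argv[1], int(sys.argv[2])
    for objectid, type_, offset, transid in extent_items(mnt, bytenr):
        print(f'key ({objectid} {type_} {offset}) transid {transid}')
        if type_ == TREE_BLOCK_REF_KEY and offset == DATA_RELOC_TREE_OBJECTID:
            print('  -> keyed backref from the data reloc tree')
# Inline backrefs live inside the EXTENT_ITEM/METADATA_ITEM body and would
# need extra parsing; this sketch only reports keyed backref items.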

> 
> Or even better, just a fd that can be obtained by some ioctl to access
> the btrfs virtual address space with pread().

If we want the content of 4374646833152, we can simply run btrfs ins
dump-tree -b 4374646833152.

As balance works on the commit root, that tree block should still be on
disk at the time of the looping, and dump-tree can read it.
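
If it helps to script that check, a rough sketch (it assumes the current
btrfs-progs output format, where dump-tree prints the block header as
"node/leaf <bytenr> ... owner <id>"; the device path below is just an
example taken from the dmesg output later in this mail):

# Rough sketch: ask dump-tree for one block and report which tree owns it.
import re
import subprocess

DATA_RELOC_TREE_OBJECTID = (1 << 64) - 9   # 18446744073709551607

def block_owner(dev, bytenr):
    out = subprocess.run(
        ['btrfs', 'inspect-internal', 'dump-tree', '-b', str(bytenr), dev],
        capture_output=True, text=True, check=True).stdout
    m = re.search(r'\bowner (\d+)', out)   # header line carries the owner id
    return int(m.group(1)) if m else None

owner = block_owner('/dev/dm-0', 4374646833152)
print('data reloc tree' if owner == DATA_RELOC_TREE_OBJECTID else f'owner {owner}')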


Another way to work around this is to provide a full extent tree dump.
(I know this is a bad idea, but it would still be faster than going back
and forth by mail.)

> 
>> I guess if we could get a plain tree-search based one (searching only
>> the commit root, which is exactly what balance is based on), it would
>> be easier to do the digging.
> 
> That would be nice.  I have an application for it.  ;)
> 
>>> 	OSError: [Errno 22] Invalid argument
>>>
>>> 	root@tester:~# btrfs ins log 4368594108416 /media/testfs/
>>> 	/media/testfs//snap-1589258042/testhost/var/log/messages.6.lzma
>>> 	/media/testfs//current/testhost/var/log/messages.6.lzma
>>> 	/media/testfs//snap-1589249822/testhost/var/log/messages.6.lzma
>>> 	ERROR: ino paths ioctl: No such file or directory
>>> 	/media/testfs//snap-1589249547/testhost/var/log/messages.6.lzma
>>> 	ERROR: ino paths ioctl: No such file or directory
>>> 	/media/testfs//snap-1589248407/testhost/var/log/messages.6.lzma
>>> 	/media/testfs//snap-1589256422/testhost/var/log/messages.6.lzma
>>> 	ERROR: ino paths ioctl: No such file or directory
>>> 	/media/testfs//snap-1589251322/testhost/var/log/messages.6.lzma
>>> 	/media/testfs//snap-1589251682/testhost/var/log/messages.6.lzma
>>> 	/media/testfs//snap-1589253842/testhost/var/log/messages.6.lzma
>>> 	/media/testfs//snap-1589246727/testhost/var/log/messages.6.lzma
>>> 	/media/testfs//snap-1589258582/testhost/var/log/messages.6.lzma
>>> 	/media/testfs//snap-1589244027/testhost/var/log/messages.6.lzma
>>> 	/media/testfs//snap-1589245227/testhost/var/log/messages.6.lzma
>>> 	ERROR: ino paths ioctl: No such file or directory
>>> 	ERROR: ino paths ioctl: No such file or directory
>>> 	/media/testfs//snap-1589246127/testhost/var/log/messages.6.lzma
>>> 	/media/testfs//snap-1589247327/testhost/var/log/messages.6.lzma
>>> 	ERROR: ino paths ioctl: No such file or directory
>>>
>>> Hmmm, I wonder if there's a problem with deleted snapshots?
>>
>> Yes, that's also what I'm guessing.
>>
>> The cleanup of the data reloc tree doesn't look correct to me.
>>
>> Thanks for the new clues,
>> Qu
> 
> Here's a fun one:
> 
> 1.  Delete all the files on a filesystem where balance loops
> have occurred.

I tried with a newly created fs, and failed to reproduce.

> 
> 2.  Verify there are no data blocks (one data block group
> with used = 0):
> 
> # show_block_groups.py /testfs/
> block group vaddr 435969589248 length 1073741824 flags METADATA|RAID1 used 180224 used_pct 0
> block group vaddr 4382686969856 length 33554432 flags SYSTEM|RAID1 used 16384 used_pct 0
> block group vaddr 4383794266112 length 1073741824 flags DATA used 0 used_pct 0
> 
> 3.  Create a new file with a single reference in the only (root) subvol:
> # head -c 1024m > file
> # sync
> # show_block_groups.py .
> block group vaddr 435969589248 length 1073741824 flags METADATA|RAID1 used 1245184 used_pct 0
> block group vaddr 4382686969856 length 33554432 flags SYSTEM|RAID1 used 16384 used_pct 0
> block group vaddr 4384868007936 length 1073741824 flags DATA used 961708032 used_pct 90
> block group vaddr 4385941749760 length 1073741824 flags DATA used 112033792 used_pct 10
> 
> 4.  Run balance, and it immediately loops on a single extent:
> # btrfs balance start -d .
> [Wed May 13 00:41:58 2020] BTRFS info (device dm-0): balance: start -d
> [Wed May 13 00:41:58 2020] BTRFS info (device dm-0): relocating block group 4385941749760 flags data
> [Wed May 13 00:42:00 2020] BTRFS info (device dm-0): found 1 extents, loops 1, stage: move data extents
> [Wed May 13 00:42:00 2020] BTRFS info (device dm-0): found 1 extents, loops 2, stage: update data pointers
> [Wed May 13 00:42:01 2020] BTRFS info (device dm-0): found 1 extents, loops 3, stage: update data pointers
> [Wed May 13 00:42:01 2020] BTRFS info (device dm-0): found 1 extents, loops 4, stage: update data pointers
> [Wed May 13 00:42:01 2020] BTRFS info (device dm-0): found 1 extents, loops 5, stage: update data pointers
> [Wed May 13 00:42:01 2020] BTRFS info (device dm-0): found 1 extents, loops 6, stage: update data pointers
> [Wed May 13 00:42:01 2020] BTRFS info (device dm-0): found 1 extents, loops 7, stage: update data pointers
> [Wed May 13 00:42:02 2020] BTRFS info (device dm-0): found 1 extents, loops 8, stage: update data pointers
> [Wed May 13 00:42:02 2020] BTRFS info (device dm-0): found 1 extents, loops 9, stage: update data pointers
> [Wed May 13 00:42:02 2020] BTRFS info (device dm-0): found 1 extents, loops 10, stage: update data pointers
> [Wed May 13 00:42:02 2020] BTRFS info (device dm-0): found 1 extents, loops 11, stage: update data pointers
> [etc...]
> 
> I tried it 3 more times and there was no loop.  The 5th try looped again.

10 attempts here, no reproduction.
I guess there are some other factors involved, e.g. a newly created fs
won't trigger it?

BTW, for the reproducible (i.e. looping) case, could you dump the
data_reloc root?
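
For reference, a minimal sketch of grabbing such a dump, assuming a
btrfs-progs version whose dump-tree -t option accepts a raw numeric tree
id (the data reloc tree is -9 as u64); the device path is again just an
example:

# Sketch: dump only the data reloc tree of one device.
import subprocess

DATA_RELOC_TREE_OBJECTID = (1 << 64) - 9   # 18446744073709551607

dump = subprocess.run(
    ['btrfs', 'inspect-internal', 'dump-tree',
     '-t', str(DATA_RELOC_TREE_OBJECTID), '/dev/dm-0'],
    capture_output=True, text=True, check=True).stdout
print(dump)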

My current guess is that some orphan cleanup doesn't get kicked off, but
that shouldn't affect metadata with my patch :(

Thanks,
Qu
> 
> There might be a correlation with cancels.  After a fresh boot, I can
> often balance a few dozen block groups before there's a loop, but if I
> cancel a balance, the next balance almost always loops.
> 


