Linux-BTRFS Archive on lore.kernel.org
From: Nikolay Borisov <n.borisov.lkml@gmail.com>
To: webmaster@zedlx.com
Cc: linux-btrfs@vger.kernel.org
Subject: Re: Feature requests: online backup - defrag - change RAID level
Date: Tue, 10 Sep 2019 17:14:20 +0300
Message-ID: <0450e0d8-6c37-dc72-5987-bf92eeb8c4ef@gmail.com>
In-Reply-To: <20190909233248.Horde.lTF4WXM9AzBZdWueqc2vsIZ@server53.web-hosting.com>



On 10.09.19 at 6:32, webmaster@zedlx.com wrote:
> 
> Quoting Qu Wenruo <quwenruo.btrfs@gmx.com>:
> 
>> On 2019/9/10 9:24 AM, webmaster@zedlx.com wrote:
>>>
>>> Quoting Qu Wenruo <quwenruo.btrfs@gmx.com>:
>>>
>>>>>> Btrfs defrag works by creating new extents containing the old data.
>>>>>>
>>>>>> So if btrfs decides to defrag, no old extents will be used.
>>>>>> It will all be new extents.
>>>>>>
>>>>>> That's why your proposal is freaking strange here.
>>>>>
>>>>> Ok, but: can the NEW extents still be shared?
>>>>
>>>> Can only be shared by reflink.
>>>> Not automatically, so if btrfs decides to defrag, it will not be shared
>>>> at all.
>>>>
>>>>> If you had an extent E88
>>>>> shared by 4 files in different subvolumes, can it be copied to another
>>>>> place and still be shared by the original 4 files?
>>>>
>>>> Not for current btrfs.
>>>>
>>>>> I guess that the
>>>>> answer is YES. And, that's the only requirement for a good defrag
>>>>> algorithm that doesn't shrink free space.
>>>>
>>>> We may go that direction.
>>>>
>>>> The biggest burden here is that btrfs needs to do an expensive full
>>>> backref walk to determine how many files refer to this extent,
>>>> and then change them all to refer to the new extent.
>>>
>>> YES! That! Exactly THAT. That is what needs to be done.
>>>
>>> I mean, you just create a (perhaps associative) array which links an
>>> extent (the array index contains the extent ID) to all the files that
>>> reference that extent.
>>
>> You've fallen right into the pitfall of the btrfs backref walk.
>>
>> For btrfs, doing a backref walk is definitely not easy work.
>> btrfs uses hidden backrefs. That means that in most cases, for one extent
>> shared by 1000 snapshots, the extent tree (which shows the backrefs) may
>> well hold only one ref, for the initial subvolume.
>>
>> For btrfs, you need to walk up the tree to find how it's shared.
>>
>> It has to be done like that, that's why we call it backref-*walk*.
>>
>> E.g.:
>>           A (subvol 257)     B (Subvol 258, snapshot of 257)
>>           |    \        /    |
>>           |        X         |
>>           |    /        \    |
>>           C                  D
>>          / \                / \
>>         E   F              G   H
>>
>> In the extent tree, E is only referred to by subvol 257,
>> while C has two referencers, 257 and 258.
>>
>> So in reality, you need to:
>> 1) Do a tree search from subvol 257.
>>    You get a path, E -> C -> A.
>> 2) Check each node to see if it's shared.
>>    E is only referred to by C, no extra referencer.
>>    C is referred to by two new tree blocks, A and B.
>>    A is referred to by subvol 257.
>>    B is referred to by subvol 258.
>>    So E is shared by 257 and 258.
>>
>> Now you see how things go mad: for every extent you must walk that
>> way to determine its real owners, not to mention that we can have up
>> to 8 levels, and tree blocks at levels 0~7 can all be shared.
>>
>> If it's shared by 1000 subvolumes, hope you had a good day then.
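
As a rough illustration of the walk described above, here is a minimal
Python sketch. It is a toy model of the diagram, not btrfs code: the
PARENTS and ROOT_OWNER tables and the owning_subvolumes helper are
hypothetical names invented for this example, and real tree blocks carry
far more state than a list of parents.

```python
# Hypothetical model of the diagram: each tree block lists only its direct
# parents, mirroring an extent tree that records only direct referencers.
PARENTS = {
    "E": ["C"], "F": ["C"], "G": ["D"], "H": ["D"],
    "C": ["A", "B"], "D": ["A", "B"],
    "A": [], "B": [],
}

# Root tree blocks and the subvolume that owns each of them.
ROOT_OWNER = {"A": 257, "B": 258}


def owning_subvolumes(block):
    """Walk up from `block` through every parent and collect subvolume IDs."""
    owners = set()
    seen = set()
    stack = [block]
    while stack:
        node = stack.pop()
        if node in seen:
            continue  # already walked through this block via another path
        seen.add(node)
        if node in ROOT_OWNER:
            owners.add(ROOT_OWNER[node])
        stack.extend(PARENTS[node])
    return owners


print(sorted(owning_subvolumes("E")))  # [257, 258]
```

Even in this toy form, every query has to climb all the way to the roots,
which is the cost the quoted text is pointing at.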
> 
> Ok, let's do just this issue for the time being. One issue at a time. It
> will be easier.
> 
> The solution is to temporarily create a copy of the entire backref-tree
> in memory. To create this copy, you just do a preorder depth-first
> traversal following only forward references.
> 
> So this preorder depth-first traversal would visit the nodes in the
> following order:
> A,C,E,F,D,G,H,B
> 
> Oh, it is not a tree, it is a DAG in that example of yours. OK, preorder
> is possible on a DAG, too. But how did you get a DAG? Shouldn't it be all
> trees?
> 
> When you have the entire backref-tree (backref-DAG?) in memory, doing a
> backref-walk is a piece of cake.
> 
> Of course, this in-memory backref tree has to be kept in sync with the
> filesystem, that is, it has to be updated whenever there is a write to
> disk. That's not so hard.
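
The proposed in-memory copy could be sketched roughly as follows, again
as a toy Python model rather than btrfs code. CHILDREN and
build_backref_map are hypothetical names; the sketch assumes the whole
graph fits in memory and says nothing about the cost of keeping the map
in sync with writes to disk.

```python
# Forward references from the example diagram (hypothetical toy data).
CHILDREN = {
    "A": ["C", "D"], "B": ["C", "D"],
    "C": ["E", "F"], "D": ["G", "H"],
    "E": [], "F": [], "G": [], "H": [],
}


def build_backref_map(roots):
    """Preorder depth-first traversal that records each node's parents."""
    parents = {node: [] for node in CHILDREN}
    order = []
    visited = set()

    def visit(node):
        visited.add(node)
        order.append(node)
        for child in CHILDREN[node]:
            # Record the back reference on every edge, even when the child
            # was already reached through another parent (the DAG case).
            parents[child].append(node)
            if child not in visited:
                visit(child)

    for root in roots:
        if root not in visited:
            visit(root)
    return order, parents


order, parents = build_backref_map(["A", "B"])
print(",".join(order))  # A,C,E,F,D,G,H,B
print(parents["C"])     # ['A', 'B']
```

With both roots as starting points, the visit order matches the one
listed above, and the parent list of C holds both A and B.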

Great, now that you have devised a solution and have plenty of
experience writing code, why not try to contribute to btrfs?


