* how many chunk trees and extent trees present
@ 2015-04-17  6:24 sri
  2015-04-17  9:19 ` Hugo Mills
  0 siblings, 1 reply; 10+ messages in thread
From: sri @ 2015-04-17  6:24 UTC
  To: linux-btrfs

Hi,
I have the questions below. Could somebody help me understand?

1)
As I understand it, a btrfs file system uses one chunk tree and one
extent tree for all of its disk allocation.

Is this correct?

I read in an article that in the future there will be more than one
chunk tree / extent tree in a single btrfs. Is this true?
If yes, I would like to know why more than one chunk / extent tree is
required to represent one btrfs file system.

2)

I would also like to know whether, for a subvolume / snapshot, there is
a provision to ask btrfs to track all blocks belonging to that
subvolume / snapshot in a separate chunk tree and extent tree.

I am looking for a way to traverse a subvolume, preferably a snapshot, and
identify all disk blocks (extents) allocated to that particular subvolume
/ snapshot.





* Re: how many chunk trees and extent trees present
  2015-04-17  6:24 how many chunk trees and extent trees present sri
@ 2015-04-17  9:19 ` Hugo Mills
  2015-04-17  9:56   ` sri
  2015-04-17 17:29   ` David Sterba
  0 siblings, 2 replies; 10+ messages in thread
From: Hugo Mills @ 2015-04-17  9:19 UTC
  To: sri; +Cc: linux-btrfs

On Fri, Apr 17, 2015 at 06:24:05AM +0000, sri wrote:
> Hi,
> I have the questions below. Could somebody help me understand?
> 
> 1)
> As I understand it, a btrfs file system uses one chunk tree and one
> extent tree for all of its disk allocation.
> 
> Is this correct?

   Yes.

> I read in an article that in the future there will be more than one
> chunk tree / extent tree in a single btrfs. Is this true?

   I recall, many moons ago, Chris saying that there probably wouldn't
be.

> If yes, I would like to know why more than one chunk / extent tree is 
> required to represent one btrfs file system.

   I think the original idea was that it would reduce lock contention
on the tree root.

> 2)
> 
> I would also like to know whether, for a subvolume / snapshot, there is
> a provision to ask btrfs to track all blocks belonging to that
> subvolume / snapshot in a separate chunk tree and extent tree.

   No.

> I am looking for a way to traverse a subvolume, preferably a snapshot, and
> identify all disk blocks (extents) allocated to that particular subvolume
> / snapshot.

   Do you mean allocated to any file in the subvolume, or do you mean
*exclusively* allocated to that subvolume and not shared with any
other?

   The former is easy -- just walk the file tree, and read the extents
for each file. The latter is harder, because you have to look for
extents that are not shared, and extents that are only shared within
the current subvolume (think reflink copies within a subvol). I think
you can do that by counting backrefs, but there may be big race
conditions involved on a filesystem that's being written to (because
the backrefs aren't created immediately, but delayed for performance
reasons).
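
A minimal sketch of the "easy" case, assuming the snapshot is mounted and
that going through the kernel's FIEMAP ioctl is acceptable (the original
poster wanted to avoid file-system calls, so this only illustrates the
route described above). The function names are illustrative. Note that on
btrfs the "physical" value FIEMAP reports is, as far as I know, an address
in btrfs's own logical address space, so it still has to be translated
through the chunk tree to get a device offset.

```python
import fcntl
import struct

FS_IOC_FIEMAP = 0xC020660B        # _IOWR('f', 11, struct fiemap)
FIEMAP_FLAG_SYNC = 0x1            # flush the file before mapping it
FIEMAP_EXTENT_LAST = 0x1          # set on the last extent of the file
FIEMAP_EXTENT_SHARED = 0x2000     # extent is referenced more than once

HEADER = struct.Struct("=QQIIII")     # struct fiemap, 32 bytes
EXTENT = struct.Struct("=QQQQQIIII")  # struct fiemap_extent, 56 bytes


def parse_fiemap(buf):
    """Decode a filled-in FIEMAP buffer into (logical, physical, length, flags)."""
    mapped = HEADER.unpack_from(buf, 0)[3]  # fm_mapped_extents
    extents = []
    for i in range(mapped):
        fields = EXTENT.unpack_from(buf, HEADER.size + i * EXTENT.size)
        extents.append((fields[0], fields[1], fields[2], fields[5]))
    return extents


def file_extents(path, batch=128):
    """Return every extent of one file, asking the kernel for `batch` at a time."""
    result, start = [], 0
    with open(path, "rb") as f:
        while True:
            buf = bytearray(HEADER.size + batch * EXTENT.size)
            HEADER.pack_into(buf, 0, start, 2**64 - 1 - start,
                             FIEMAP_FLAG_SYNC, 0, batch, 0)
            fcntl.ioctl(f.fileno(), FS_IOC_FIEMAP, buf)
            got = parse_fiemap(buf)
            result.extend(got)
            if not got or got[-1][3] & FIEMAP_EXTENT_LAST:
                return result
            start = got[-1][0] + got[-1][2]  # continue after the last extent
```

Walking a snapshot is then a matter of calling file_extents() on every
regular file that os.walk() yields. The FIEMAP_EXTENT_SHARED flag only
says that an extent is shared with something, not with what, so it does
not by itself answer the "exclusive to this subvolume" question.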

   Note that if all you want is the count of those blocks (rather than
the block numbers themselves), then it's already been done with
qgroups, and you don't need to write any btrfs code at all.
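
As a rough illustration of the qgroup route, the sketch below reads the
referenced and exclusive byte counts from `btrfs qgroup show --raw`. It
assumes quotas are already enabled on the filesystem and that each data
row starts with a `level/id` pair followed by two byte counts; the helper
names are made up for this example.

```python
import re
import subprocess

# A data row looks like "0/257   1073741824   4096 ..." when --raw is used.
ROW = re.compile(r"^\s*(\d+)/(\d+)\s+(\d+)\s+(\d+)")


def parse_qgroup_show(text):
    """Map subvolume ID -> (referenced_bytes, exclusive_bytes) for level-0 qgroups."""
    usage = {}
    for line in text.splitlines():
        m = ROW.match(line)
        if m and m.group(1) == "0":
            usage[int(m.group(2))] = (int(m.group(3)), int(m.group(4)))
    return usage


def qgroup_usage(mountpoint):
    """Run `btrfs qgroup show --raw` on a mounted filesystem and parse the result."""
    out = subprocess.run(
        ["btrfs", "qgroup", "show", "--raw", mountpoint],
        check=True, capture_output=True, text=True,
    ).stdout
    return parse_qgroup_show(out)
```

The exclusive figure is the number of bytes that would be freed if only
that subvolume were deleted, which is the count discussed above.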

   What exactly are you going to be doing with this information?

   Hugo.

-- 
Hugo Mills             | O tempura! O moresushi!
hugo@... carfax.org.uk |
http://carfax.org.uk/  |
PGP: E2AB1DE4          |


* Re: how many chunk trees and extent trees present
  2015-04-17  9:19 ` Hugo Mills
@ 2015-04-17  9:56   ` sri
  2016-02-25 12:16     ` sri
  2015-04-17 17:29   ` David Sterba
  1 sibling, 1 reply; 10+ messages in thread
From: sri @ 2015-04-17  9:56 UTC
  To: linux-btrfs

Hugo Mills <hugo <at> carfax.org.uk> writes:

>    What exactly are you going to be doing with this information?

I am trying to find a way to get all files and folders of a snapshot
volume without making file-system-level calls (fopen etc.).

I want to write code that understands the snapshot's btree and the
related chunk tree and extent tree, and finds all extent blocks for each
file (inode).
For backup, I would use this method to traverse the snapshot subvolume
at disk level and copy all blocks of its files and directories.

Thank you
sri




* Re: how many chunk trees and extent trees present
  2015-04-17  9:19 ` Hugo Mills
  2015-04-17  9:56   ` sri
@ 2015-04-17 17:29   ` David Sterba
  2016-02-26  1:29     ` Qu Wenruo
  2016-03-04  2:58     ` Anand Jain
  1 sibling, 2 replies; 10+ messages in thread
From: David Sterba @ 2015-04-17 17:29 UTC
  To: Hugo Mills, sri, linux-btrfs

On Fri, Apr 17, 2015 at 09:19:11AM +0000, Hugo Mills wrote:
> > I read in an article that in the future there will be more than one
> > chunk tree / extent tree in a single btrfs. Is this true?
> 
>    I recall, many moons ago, Chris saying that there probably wouldn't
> be.

More extent trees tied to a set of fs trees/subvolumes would be very
useful for certain usecases *cough*encryption*cough*.


* Re: how many chunk trees and extent trees present
  2015-04-17  9:56   ` sri
@ 2016-02-25 12:16     ` sri
  2016-02-26  1:21       ` Qu Wenruo
  0 siblings, 1 reply; 10+ messages in thread
From: sri @ 2016-02-25 12:16 UTC
  To: linux-btrfs

> > Do you mean allocated to any file in the subvolume, or do you mean
> > *exclusively* allocated to that subvolume and not shared with any
> > other?

Hi,
With ext3/ext4, I can find all used blocks of the file system. Once
they are identified, I can just copy those blocks for backup. The bitmap
provided by ext3/ext4 covers blocks allocated for both metadata and
data, so backup/recovery doesn't consume much space.

With btrfs, multiple subvolumes can be created on a pool of disks, and each
subvolume can be considered an individual file system. I want a
mechanism for identifying the blocks allocated to a subvolume through
its snapshot, so that for recovery I can restore only those blocks.

If btrfs is created on 10 disks of 100GB each and one subvolume is 10GB,
the backup window will be shorter when backing up just the subvolume.

I checked btrfs send/receive, but the problems with send/receive are:
1. It is a file-level dump.
2. The previous snapshot must be present to get an incremental stream;
otherwise it generates a full backup again.



* Re: how many chunk trees and extent trees present
  2016-02-25 12:16     ` sri
@ 2016-02-26  1:21       ` Qu Wenruo
  0 siblings, 0 replies; 10+ messages in thread
From: Qu Wenruo @ 2016-02-26  1:21 UTC
  To: sri, linux-btrfs



sri wrote on 2016/02/25 12:16 +0000:
>>> Do you mean allocated to any file in the subvolume, or do you mean
>>> *exclusively* allocated to that subvolume and not shared with any
>>> other?
>
> Hi,
> With ext3/ext4, I can find all used blocks of the file system. Once
> they are identified, I can just copy those blocks for backup.

In btrfs they are also very easy to find: just iterate over all items in
the extent tree.
Every metadata/data block has its METADATA_ITEM/EXTENT_ITEM in the extent tree.
The only extra concern when copying all this data is the btrfs chunk mapping.
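
To make the chunk-mapping concern concrete: the byte numbers stored in the
extent tree are logical addresses, and each chunk maps a logical range onto
device offsets. The sketch below shows the translation for the simplest
case only, assuming one stripe per chunk (the "single" profile). The
`Chunk` record is a hypothetical simplification for illustration, not the
on-disk structure, and DUP/RAID profiles need more arithmetic than this.

```python
from bisect import bisect_right
from collections import namedtuple

# Hypothetical, simplified chunk record: one stripe per chunk.
#   logical  - start of the chunk in btrfs's logical address space
#   length   - size of the chunk in bytes
#   devid    - device holding the single stripe
#   physical - byte offset of that stripe on the device
Chunk = namedtuple("Chunk", "logical length devid physical")


def logical_to_physical(chunks, logical):
    """Translate a logical address to (devid, device offset).

    `chunks` must be sorted by their logical start address.
    """
    starts = [c.logical for c in chunks]
    i = bisect_right(starts, logical) - 1
    if i < 0 or logical >= chunks[i].logical + chunks[i].length:
        raise LookupError("logical address %d is not inside any chunk" % logical)
    c = chunks[i]
    return c.devid, c.physical + (logical - c.logical)
```

A block-level copier would run every EXTENT_ITEM/METADATA_ITEM start
address through a lookup like this before reading from the device.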

Btrfs already has a tool that does this kind of backup for metadata: btrfs-image.
It's mainly used for debugging, though, so it doesn't back up data.

> The bitmap
> provided by ext3/ext4 covers blocks allocated for both metadata and
> data, so backup/recovery doesn't consume much space.
>
> With btrfs, multiple subvolumes can be created on a pool of disks, and each
> subvolume can be considered an individual file system. I want a
> mechanism for identifying the blocks allocated to a subvolume through
> its snapshot, so that for recovery I can restore only those blocks.

Unfortunately, all btrfs subvolumes/snapshots share the same extent tree.
That's why a btrfs subvolume is called a *sub*volume, not a volume:
a subvolume can't function completely independently.

Although you can mount each one as an individual filesystem, it's still not
a full filesystem.


BTW, about separate extent/chunk trees: just as Hugo mentioned, they were
planned only to reduce lock contention.
IMHO, it would be a per-chunk extent/chunk tree design.

To me, a per-subvolume extent/chunk tree looks almost impossible.
Things like the upcoming btrfs in-band de-duplication and the existing
btrfs_clone/reflink can easily refer to data outside a subvolume.
Such a design would just reduce the advantages of btrfs.

>
> If btrfs is created on 10 disks of 100GB each and one subvolume is 10GB,
> the backup window will be shorter when backing up just the subvolume.
>
> I checked btrfs send/receive, but the problems with send/receive are:
> 1. It is a file-level dump.

Isn't it done at subvolume/snapshot level?

> 2. The previous snapshot must be present to get an incremental stream;
> otherwise it generates a full backup again.

IMHO that's what incremental means.
The point is that a btrfs snapshot can, and in most cases does, share its
metadata btree with its source subvolume/snapshot.
This design makes btrfs snapshots small and fast (16K for one snapshot,
and creation is very fast).
But it requires strictly incremental send, so the source snapshot has to
be present in the filesystem.
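
A small sketch of what that looks like in practice, assuming both
snapshots are read-only and the parent also exists on the receiving side.
`btrfs send -p <parent> <snapshot>` is the real command; the Python
wrapper names and the paths in the usage note are hypothetical.

```python
import subprocess


def send_command(snapshot, parent=None):
    """Build the argv for a full send, or an incremental one if `parent` is given."""
    cmd = ["btrfs", "send"]
    if parent is not None:
        cmd += ["-p", parent]  # only the differences against `parent` are sent
    cmd.append(snapshot)
    return cmd


def send_to_file(snapshot, out_path, parent=None):
    """Write the send stream for `snapshot` to `out_path`."""
    with open(out_path, "wb") as out:
        subprocess.run(send_command(snapshot, parent), stdout=out, check=True)
```

For example, send_to_file("/mnt/snap2", "/backup/snap2.stream",
parent="/mnt/snap1") would store only the changes between the two
snapshots, while omitting `parent` produces a full stream.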



At least for me, the point of your goal still isn't quite clear.

If you want incremental backup, then either use btrfs send or use the
more generic rsync method.


To understand the btrfs extent layout (including the chunk and extent
trees), I'd recommend using btrfs-debug-tree and referring to the btrfs wiki
(https://btrfs.wiki.kernel.org/index.php/Btree_Items) as a starting point.
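
As a starting point for experiments, the sketch below pulls the used byte
ranges out of a textual extent-tree dump such as the one btrfs-debug-tree
prints (later btrfs-progs expose the same dump as `btrfs inspect-internal
dump-tree`). It assumes item lines of the form
`item N key (BYTENR TYPE OFFSET) ...`, that the offset of an EXTENT_ITEM
key is the extent length, and that a METADATA_ITEM describes one tree
block of `nodesize` bytes. The function name and the 16K default are
assumptions for this example.

```python
import re

# Matches e.g. "item 1 key (13631488 EXTENT_ITEM 4096) itemoff 16206 itemsize 53"
KEY = re.compile(r"item \d+ key \((\d+) (EXTENT_ITEM|METADATA_ITEM) (\d+)\)")


def used_ranges(dump_text, nodesize=16384):
    """Return (logical_start, length, kind) for every extent in the dump."""
    ranges = []
    for m in KEY.finditer(dump_text):
        bytenr, kind, offset = int(m.group(1)), m.group(2), int(m.group(3))
        # EXTENT_ITEM: key offset is the length in bytes.
        # METADATA_ITEM: key offset is the tree level, so use nodesize instead.
        length = offset if kind == "EXTENT_ITEM" else nodesize
        ranges.append((bytenr, length, kind))
    return ranges
```

The start addresses are logical, so they still have to go through the
chunk tree before they can be used as device offsets.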

Thanks,
Qu


* Re: how many chunk trees and extent trees present
  2015-04-17 17:29   ` David Sterba
@ 2016-02-26  1:29     ` Qu Wenruo
  2016-03-03 10:02       ` David Sterba
  2016-03-04  2:58     ` Anand Jain
  1 sibling, 1 reply; 10+ messages in thread
From: Qu Wenruo @ 2016-02-26  1:29 UTC
  To: dsterba, Hugo Mills, sri, linux-btrfs



David Sterba wrote on 2015/04/17 19:29 +0200:
> On Fri, Apr 17, 2015 at 09:19:11AM +0000, Hugo Mills wrote:
>>> I read in an article that in the future there will be more than one
>>> chunk tree / extent tree in a single btrfs. Is this true?
>>
>>     I recall, many moons ago, Chris saying that there probably wouldn't
>> be.
>
> More extent trees tied to a set of fs trees/subvolumes would be very
> useful for certain usecases *cough*encryption*cough*.

BTW, would such a design make reflink between different sets of extents
fall back to a normal copy?

And I'm pretty sure that in-band dedup would be affected too...

Thanks,
Qu


* Re: how many chunk trees and extent trees present
  2016-02-26  1:29     ` Qu Wenruo
@ 2016-03-03 10:02       ` David Sterba
  0 siblings, 0 replies; 10+ messages in thread
From: David Sterba @ 2016-03-03 10:02 UTC
  To: Qu Wenruo; +Cc: Hugo Mills, sri, linux-btrfs

On Fri, Feb 26, 2016 at 09:29:29AM +0800, Qu Wenruo wrote:
> 
> 
> David Sterba wrote on 2015/04/17 19:29 +0200:
> > On Fri, Apr 17, 2015 at 09:19:11AM +0000, Hugo Mills wrote:
> >>> I read in an article that in the future there will be more than one
> >>> chunk tree / extent tree in a single btrfs. Is this true?
> >>
> >>     I recall, many moons ago, Chris saying that there probably wouldn't
> >> be.
> >
> > More extent trees tied to a set of fs trees/subvolumes would be very
> > useful for certain usecases *cough*encryption*cough*.
> 
> BTW, would such a design make reflink between different sets of extents
> fall back to a normal copy?

Yes, the actual reflink will not be possible.

> And I'm pretty sure that inband dedup will be affected too...

Yes.


* Re: how many chunk trees and extent trees present
  2015-04-17 17:29   ` David Sterba
  2016-02-26  1:29     ` Qu Wenruo
@ 2016-03-04  2:58     ` Anand Jain
  2016-03-04  9:26       ` David Sterba
  1 sibling, 1 reply; 10+ messages in thread
From: Anand Jain @ 2016-03-04  2:58 UTC
  To: dsterba, Hugo Mills, sri, linux-btrfs



On 04/18/2015 01:29 AM, David Sterba wrote:
> On Fri, Apr 17, 2015 at 09:19:11AM +0000, Hugo Mills wrote:
>>> I read in an article that in the future there will be more than one
>>> chunk tree / extent tree in a single btrfs. Is this true?
>>
>>     I recall, many moons ago, Chris saying that there probably wouldn't
>> be.
>
> More extent trees tied to a set of fs trees/subvolumes would be very
> useful for certain usecases *cough*encryption*cough*.

  I don't fully understand the idea here, but let me try:
  would it not defeat the purpose of encryption, which is to keep
  unencrypted data off the disk? Looks like I am missing something
  here.

Thanks, Anand


* Re: how many chunk trees and extent trees present
  2016-03-04  2:58     ` Anand Jain
@ 2016-03-04  9:26       ` David Sterba
  0 siblings, 0 replies; 10+ messages in thread
From: David Sterba @ 2016-03-04  9:26 UTC
  To: Anand Jain; +Cc: Hugo Mills, sri, linux-btrfs

On Fri, Mar 04, 2016 at 10:58:15AM +0800, Anand Jain wrote:
> 
> 
> On 04/18/2015 01:29 AM, David Sterba wrote:
> > On Fri, Apr 17, 2015 at 09:19:11AM +0000, Hugo Mills wrote:
> >>> I read in an article that in the future there will be more than one
> >>> chunk tree / extent tree in a single btrfs. Is this true?
> >>
> >>     I recall, many moons ago, Chris saying that there probably wouldn't
> >> be.
> >
> > More extent trees tied to a set of fs trees/subvolumes would be very
> > useful for certain usecases *cough*encryption*cough*.
> 
>   I don't fully understand the idea here, but let me try:
>   would it not defeat the purpose of encryption, which is to keep
>   unencrypted data off the disk? Looks like I am missing something
>   here.

It depends on how the encryption is designed. Separate extent trees would
allow mixed data in the filesystem, encrypted or not.

I can start with a normal filesystem, and then create encrypted
subvolumes any time later.

The idea of multiple extent trees:

Currently we have only one: all subvolumes share the extent tree and can do
reflinks freely. We could create a subvolume (S1) and ask for a separate
extent tree (E1). Then we could create snapshots of S1 that would share E1,
and reflink across snapshots that share E1.

Why this is useful for encryption: all data _and_ metadata blocks tied to
E1 and the attached subvolumes are encrypted, and the plain text is not
accessible without the key.

But separate extent trees are useful on their own.
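
A toy model of the rule sketched above, purely to illustrate the
proposal: it does not describe anything btrfs implements, and every name
in it is hypothetical. Each subvolume is tagged with the extent tree that
accounts for its blocks; snapshots inherit the tag, and a reflink is only
allowed when both sides carry the same tag.

```python
def snapshot(extent_tree_of, source, name):
    """A snapshot joins the extent tree of the subvolume it was taken from."""
    extent_tree_of[name] = extent_tree_of[source]


def can_reflink(extent_tree_of, src, dst):
    """True if both subvolumes account their extents in the same extent tree."""
    return extent_tree_of[src] == extent_tree_of[dst]


# "E0" stands for today's single shared extent tree, "E1" for a separate one.
trees = {"root": "E0", "S1": "E1"}
snapshot(trees, "S1", "S1-snap")
print(can_reflink(trees, "S1", "S1-snap"))  # same extent tree
print(can_reflink(trees, "root", "S1"))     # different extent trees
```

In this model a copy between "root" and "S1" would have to fall back to
a normal data copy, matching the answer given earlier in the thread.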
