* btrfs-dedupe broken and unsupported but in official wiki
@ 2020-06-18  2:28 DanglingPointer
  2020-06-18 10:31 ` David Sterba
                   ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: DanglingPointer @ 2020-06-18  2:28 UTC (permalink / raw)
  To: linux-btrfs

btrfs-dedupe is currently broken and no longer actively supported.

It no longer builds with current rustc v1.44.0 with cargo

It is in the official btrfs Deduplication wiki:

     https://btrfs.wiki.kernel.org/index.php/Deduplication

There's no real active community and no proper QA, reviewing, or vetting.

A poster in the issues area of the project's GitHub page stated that 
even if fixed, it may not function correctly due to BTRFS having evolved 
since the tool was designed.

There are just too many unknowns with this BTRFS-specific dedupe tool.

People using your official wiki and trying to use that deduplication 
program could inadvertently destroy their data through naivety or 
accident.  Especially if they start trying to fix the code.

I recommend you remove it from your website or at least put large 
warnings there that it is broken (which looks ugly, I would rather only 
stuff that works were there since it isn't your project anyway but some 
3rd party).



* Re: btrfs-dedupe broken and unsupported but in official wiki
  2020-06-18  2:28 btrfs-dedupe broken and unsupported but in official wiki DanglingPointer
@ 2020-06-18 10:31 ` David Sterba
  2020-06-18 20:43 ` Zygo Blaxell
  2020-06-18 20:59 ` waxhead
  2 siblings, 0 replies; 14+ messages in thread
From: David Sterba @ 2020-06-18 10:31 UTC (permalink / raw)
  To: DanglingPointer; +Cc: linux-btrfs

On Thu, Jun 18, 2020 at 12:28:41PM +1000, DanglingPointer wrote:
> btrfs-dedupe is currently broken and no longer actively supported.
> 
> It no longer builds with current rustc v1.44.0 with cargo
> 
> It is in the official btrfs Deduplication wiki:
> 
>      https://btrfs.wiki.kernel.org/index.php/Deduplication
> 
> There's no real active community and proper QA, reviewing and vetting.
> 
> A poster in the issues area of the projects Github location stated that 
> even if fixed, it may not function correctly due to BTRFS having evolved 
> since the tool was designed.
> 
> There's just too many unknowns with this BTRFS specific dedupe tool.

That's enough reason to remove the entry from the page.

> People using your official wiki and trying to use that deduplication 
> program could inadvertently destroy their data through naivety or 
> accident.  Especially if they start trying to fix the code.
> 
> I recommend you remove it from your website or at least put large 
> warnings there that it is broken (which looks ugly, I would rather only 
> stuff that works were there since it isn't your project anyway but some 
> 3rd party).

Third-party tools often end up in that situation, and feedback like
yours helps to keep the information up to date.  I'll remove the tools
that are known to be unmaintained.  Thanks.


* Re: btrfs-dedupe broken and unsupported but in official wiki
  2020-06-18  2:28 btrfs-dedupe broken and unsupported but in official wiki DanglingPointer
  2020-06-18 10:31 ` David Sterba
@ 2020-06-18 20:43 ` Zygo Blaxell
  2020-06-18 22:05   ` DanglingPointer
  2020-06-18 20:59 ` waxhead
  2 siblings, 1 reply; 14+ messages in thread
From: Zygo Blaxell @ 2020-06-18 20:43 UTC (permalink / raw)
  To: DanglingPointer; +Cc: linux-btrfs

On Thu, Jun 18, 2020 at 12:28:41PM +1000, DanglingPointer wrote:
> btrfs-dedupe is currently broken and no longer actively supported.
> 
> It no longer builds with current rustc v1.44.0 with cargo
> 
> It is in the official btrfs Deduplication wiki:
> 
>     https://btrfs.wiki.kernel.org/index.php/Deduplication
> 
> There's no real active community and proper QA, reviewing and vetting.
> 
> A poster in the issues area of the projects Github location stated that even
> if fixed, it may not function correctly due to BTRFS having evolved since
> the tool was designed.
> 
> There's just too many unknowns with this BTRFS specific dedupe tool.
> 
> People using your official wiki and trying to use that deduplication program
> could inadvertently destroy their data through naivety or accident. 
> Especially if they start trying to fix the code.

The point about lack of maintenance with changing Rust dependencies is
fair, but "data loss" is a strong and unsupported statement.  Can you
explain how data loss could occur in even a badly (assume not maliciously)
broken version of btrfs-dedupe?

As far as I can tell, the btrfs-dedupe code uses only non-data-mutating
btrfs kernel interfaces for manipulating extents (fiemap, defrag,
and file_extent_same/deduperange).  None of these should cause data
loss (excluding kernel bugs).
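
As an illustration of why the dedupe ioctl is considered non-mutating,
here is a minimal Python sketch of a FIDEDUPERANGE call (the generic name
for file_extent_same).  The kernel compares both ranges itself and shares
extents only if the bytes are identical; otherwise it reports
FILE_DEDUPE_RANGE_DIFFERS and changes nothing.  The helper names
(build_request, dedupe_range) are made up for this example and are not
taken from btrfs-dedupe, and the ioctl number assumes the generic Linux
_IOC encoding used on x86 and arm.

```python
import fcntl
import struct

# Assumption: generic Linux _IOC encoding (x86, arm); a few architectures
# such as powerpc and mips encode the direction bits differently.
_IOC_WRITE, _IOC_READ = 1, 2

# struct file_dedupe_range header:
#   src_offset, src_length, dest_count, reserved1, reserved2
HEADER = struct.Struct("=QQHHI")
# struct file_dedupe_range_info:
#   dest_fd, dest_offset, bytes_deduped, status, reserved
INFO = struct.Struct("=qQQiI")

def _iowr(ioc_type, nr, size):
    """Equivalent of the C _IOWR() macro under the encoding assumed above."""
    return ((_IOC_READ | _IOC_WRITE) << 30) | (size << 16) | (ioc_type << 8) | nr

FIDEDUPERANGE = _iowr(0x94, 54, HEADER.size)

FILE_DEDUPE_RANGE_SAME = 0
FILE_DEDUPE_RANGE_DIFFERS = 1

def build_request(src_offset, length, dest_fd, dest_offset):
    """Pack a request with exactly one destination range."""
    buf = bytearray(HEADER.size + INFO.size)
    HEADER.pack_into(buf, 0, src_offset, length, 1, 0, 0)
    INFO.pack_into(buf, HEADER.size, dest_fd, dest_offset, 0, 0, 0)
    return buf

def dedupe_range(src_fd, src_offset, length, dest_fd, dest_offset):
    """Ask the kernel to share extents; returns (bytes_deduped, status)."""
    buf = build_request(src_offset, length, dest_fd, dest_offset)
    # The kernel writes bytes_deduped and status back into the buffer.
    fcntl.ioctl(src_fd, FIDEDUPERANGE, buf)
    _, _, bytes_deduped, status, _ = INFO.unpack_from(buf, HEADER.size)
    return bytes_deduped, status
```

A caller would treat any status other than FILE_DEDUPE_RANGE_SAME as
"nothing was shared"; a negative status is a negated errno for that
destination.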

btrfs-dedupe can be trivially tricked into opening files that it did
not intend to (it has no protection against symlink injection and other
TOCTOU attacks), but it doesn't seem to be able to alter the content
of files once it opens them.
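
For readers unfamiliar with the symlink problem, a minimal Python sketch
of one common mitigation follows: open with O_NOFOLLOW, then validate the
object that was actually opened via fstat() rather than re-checking the
path.  The function name open_regular_nofollow is hypothetical, this is
not how btrfs-dedupe is written, and O_NOFOLLOW only guards the final
path component (symlinks in parent directories are still followed).

```python
import os
import stat

def open_regular_nofollow(path):
    """Open path read-only, refusing symlinks and non-regular files.

    Returns (fd, (st_dev, st_ino)) so the caller can later confirm that
    it is still working on the same inode.
    """
    # O_NOFOLLOW: fail if the final path component is a symlink.
    # O_NONBLOCK: do not hang if the path turns out to be a FIFO.
    flags = os.O_RDONLY | os.O_NOFOLLOW | os.O_NONBLOCK | os.O_CLOEXEC
    fd = os.open(path, flags)
    try:
        st = os.fstat(fd)  # inspect the opened object, not the path
        if not stat.S_ISREG(st.st_mode):
            raise OSError("not a regular file: %r" % path)
        return fd, (st.st_dev, st.st_ino)
    except BaseException:
        os.close(fd)
        raise
```

Checking the descriptor instead of the path closes the window between
"check" and "use" for the final component, because the descriptor cannot
be swapped for another file after open() returns.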

File descriptors pointing to user files are opened O_RDWR, but they are
kept in the scope of the dedupe function and their life-cycle is properly
managed in Rust, so btrfs-dedupe won't mutate files by writing to the
wrong fd (e.g. accidentally close stderr and reopen it to a user file)
unless someone adds some seriously buggy code (see "assume not malicious"
above).

The unsafe C ioctl interfaces are unlikely to change in data-losing ways,
or they'll break all existing userspace tools that use them.  They are
also well encapsulated in the rust-btrfs module.

The errors reported on github seem to be problems with incompatible
changes in the runtime libraries btrfs-dedupe depends on, and also some
reports of what look like pre-existing bugs in the fiemap code that are
blamed on new kernel versions without evidence.  Data-losing breaking
changes in any of the ioctls btrfs-dedupe uses are extremely unlikely.
Those issues may cause btrfs-dedupe to do useless unnecessary work,
or fail to do useful necessary work, but could not cause data loss by
any mechanism I can find.

Contrast with bedup:  bedup uses data-mutating kernel interfaces
(clone_range) for dedupe that have no effective protection against
concurrent data modification.  There is ineffective protection implemented
in bedup (looking in /proc/*/fd for concurrent users of the files) which
may or may not be broken in kernel 5.0, but it's ineffective either way.
The case for data loss in bedup is trivial.  The branch with a patch to
fix it is now 7 years old, so it's fair to say bedup is unmaintained too
(github forks notwithstanding, they didn't fix these issues).
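
To make the "ineffective protection" point concrete, here is a rough
Python sketch of a /proc/*/fd scan.  It is not bedup's code; the function
name is invented and it assumes a Linux /proc.  Whatever it returns is
only a snapshot: another process can open the file right after the scan
finishes, which is why a check of this kind cannot protect a later
clone_range call.

```python
import os

def pids_with_file_open(path):
    """Best-effort snapshot of PIDs that currently hold `path` open."""
    target = os.stat(path)
    holders = set()
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue
        fd_dir = os.path.join("/proc", entry, "fd")
        try:
            names = os.listdir(fd_dir)
        except OSError:
            continue  # process exited, or we lack permission
        for name in names:
            try:
                st = os.stat(os.path.join(fd_dir, name))
            except OSError:
                continue  # fd was closed while we were looking
            if (st.st_dev, st.st_ino) == (target.st_dev, target.st_ino):
                holders.add(int(entry))
                break
    # The answer may already be stale by the time the caller sees it.
    return holders
```

The dedupe ioctl avoids this race by doing the comparison and the extent
sharing inside the kernel.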

> I recommend you remove it from your website or at least put large warnings
> there that it is broken (which looks ugly, I would rather only stuff that
> works were there since it isn't your project anyway but some 3rd party).


* Re: btrfs-dedupe broken and unsupported but in official wiki
  2020-06-18  2:28 btrfs-dedupe broken and unsupported but in official wiki DanglingPointer
  2020-06-18 10:31 ` David Sterba
  2020-06-18 20:43 ` Zygo Blaxell
@ 2020-06-18 20:59 ` waxhead
  2020-06-19 13:19   ` David Sterba
  2 siblings, 1 reply; 14+ messages in thread
From: waxhead @ 2020-06-18 20:59 UTC (permalink / raw)
  To: DanglingPointer, linux-btrfs

I have pointed this out before, but I would like to use the opportunity 
again. As just a regular user of btrfs, I would feel more comfortable if 
the dedupe tool were part of btrfs, for example: btrfs filesystem 
dedupe -r /somewhere

Regular users who are somewhat technically able may not know that the 
dedupe functions are kernel APIs that should not destroy anything even 
if the calling program went berserk.

While this may be obvious to btrfs developers, it is not obvious to 
regular users, who may be concerned that a particular tool may wreak 
havoc on their filesystem.

DanglingPointer wrote:
> btrfs-dedupe is currently broken and no longer actively supported.
> 
> It no longer builds with current rustc v1.44.0 with cargo
> 
> It is in the official btrfs Deduplication wiki:
> 
>      https://btrfs.wiki.kernel.org/index.php/Deduplication
> 
> There's no real active community and proper QA, reviewing and vetting.
> 
> A poster in the issues area of the projects Github location stated that 
> even if fixed, it may not function correctly due to BTRFS having evolved 
> since the tool was designed.
> 
> There's just too many unknowns with this BTRFS specific dedupe tool.
> 
> People using your official wiki and trying to use that deduplication 
> program could inadvertently destroy their data through naivety or 
> accident.  Especially if they start trying to fix the code.
> 
> I recommend you remove it from your website or at least put large 
> warnings there that it is broken (which looks ugly, I would rather only 
> stuff that works were there since it isn't your project anyway but some 
> 3rd party).
> 


* Re: btrfs-dedupe broken and unsupported but in official wiki
  2020-06-18 20:43 ` Zygo Blaxell
@ 2020-06-18 22:05   ` DanglingPointer
  2020-06-19  5:04     ` Zygo Blaxell
  0 siblings, 1 reply; 14+ messages in thread
From: DanglingPointer @ 2020-06-18 22:05 UTC (permalink / raw)
  To: Zygo Blaxell; +Cc: linux-btrfs

A large portion of desktop users are not developers and are illiterate 
in Rust and in programming; they would not know whether this tool or 
that tool or any tool is safe or unsafe, or has race conditions, nor 
would they know the meaning of immutable or mutex.

Think of this scenario; average Joe Bloggs user buys new computer 
without MS Windows.  With the software savings, Joe purchases more 
disks. He then chooses openSuse Leap for his first foray into Linux.
All he cares about are his music files, photos, and videos being safe.  
Joe runs a Cafe down the street and uses the music, photos, and videos 
in various screens at his cafe for the atmosphere.
Times are tough and he's running out of space, so to conserve storage 
he doesn't want the accumulated media files duplicated all over the 
place.

If the official wikis list broken 3rd party tools, then the whole 
adoption process becomes less easy, less friendly, very cryptic, and more 
chaotic, and it gives the impression that btrfs (and Linux as a whole) 
is a mess and not ready.  He would not know how, or have the time, to go 
through the code of each deduplication tool to figure out whether one 
type or the other is better, the way Zygo Blaxell, who can read code, 
did.  He says good-bye to openSuse and buys Windows.

So I do agree with waxhead.  It would be preferable if there were an 
official btrfs deduplication command from btrfs-progs instead of relying 
on 3rd parties.  The Joe Bloggs of the example above can read web-page 
instructions saying "run this command... and then this command..."; but 
he will have neither the knowledge, the comprehension, nor the time to 
go through code.

Thanks David Sterba for removing the items and updating the wiki!

On 19/6/20 6:43 am, Zygo Blaxell wrote:
> The point about lack of maintenance with changing Rust dependencies is
> fair, but "data loss" is a strong and unsupported statement.  Can you
> explain how data loss could occur in even a badly (assume not maliciously)
> broken version of btrfs-dedupe?
>
> As far as I can tell, the btrfs-dedupe code uses only non-data-mutating
> btrfs kernel interfaces for manipulating extents (fiemap, defrag,
> and file_extent_same/deduperange).  None of these should cause data
> loss (excluding kernel bugs).
>
> btrfs-dedupe can be trivially tricked into opening files that it did
> not intend to (it has no protection against symlink injection and other
> TOCTOU attacks), but it doesn't seem to be able to alter the content
> of files once it opens them.
>
> File descriptors pointing to user files are opened O_RDWR, but they are
> kept in the scope of the dedupe function and their life-cycle is properly
> managed in Rust, so btrfs-dedupe won't mutate files by writing to the
> wrong fd (e.g. accidentally close stderr and reopen it to a user file)
> unless someone adds some seriously buggy code (see "assume not malicious"
> above).
>
> The unsafe C ioctl interfaces are unlikely to change in data-losing ways,
> or they'll break all existing userspace tools that use them.  They are
> also well encapsulated in the rust-btrfs module.
>
> The errors reported on github seem to be problems with incompatible
> changes in the runtime libraries btrfs-dedupe depends on, and also some
> reports of what look like pre-existing bugs in the fiemap code that are
> blamed on new kernel versions without evidence.  Data-losing breaking
> changes in any of the ioctls btrfs-dedupe uses are extremely unlikely.
> Those issues may cause btrfs-dedupe to do useless unnecessary work,
> or fail to do useful necessary work, but could not cause data loss by
> any mechanism I can find.
>
> Contrast with bedup:  bedup uses data-mutating kernel interfaces
> (clone_range) for dedupe that have no effective protection against
> concurrent data modification.  There is ineffective protection implemented
> in bedup (looking in /proc/*/fd for concurrent users of the files) which
> may or may not be broken in kernel 5.0, but it's ineffective either way.
> The case for data loss in bedup is trivial.  The branch with a patch to
> fix it is now 7 years old, so it's fair to say bedup is unmaintained too
> (github forks notwithstanding, they didn't fix these issues).
>


* Re: btrfs-dedupe broken and unsupported but in official wiki
  2020-06-18 22:05   ` DanglingPointer
@ 2020-06-19  5:04     ` Zygo Blaxell
  2020-06-19 13:11       ` David Sterba
  0 siblings, 1 reply; 14+ messages in thread
From: Zygo Blaxell @ 2020-06-19  5:04 UTC (permalink / raw)
  To: DanglingPointer; +Cc: linux-btrfs

On Fri, Jun 19, 2020 at 08:05:44AM +1000, DanglingPointer wrote:
> For a large portion of desktop users that are not developers and are
> rustlang illiterate and programming illiterate; they would not know whether
> this tool or that tool or any tool would be safe, or unsafe, or have
> concurrent race conditions, or know the meaning of immutable or mutex.
> 
> Think of this scenario; average Joe Bloggs user buys new computer without MS
> Windows.  With the software savings, Joe purchases more disks. He then
> chooses openSuse Leap for his first foray into Linux.
> All he cares about are his music files, photos, and videos being safe.  Joe
> runs a Cafe down the street and uses the music, photos, and videos in
> various screens at his cafe for the atmosphere.
> Times are tough and he's running out of space so he doesn't want the
> accumulated media files duplicated all around the place wasting space to
> conserve storage.
> 
> If the official wikis have broken 3rd party tools, then it makes the whole
> adoption process less easy, less friendly, very cryptic, more chaotic; and
> give the impression that btrfs is a mess and not ready (and Linux as a
> whole).  He would not know or have the time to go through the code of each
> deduplication program tool option to figure out if one type or the other
> type is better just like Zygo Blaxell did who can read code.  Even if he
> wanted to, he doesn't know how to nor has the time to do it.  He says
> good-bye to openSuse and buys Windows.

My objection here is the serious accusation in the term "data loss", which
you have made on the mailing list and github without supporting evidence.

Joe Bloggs will not lose any data from btrfs-dedupe.  He'll waste his
time and run out of disk space, and maybe switch filesystems due to
frustration, but Joe will not lose any of his data.

btrfs-dedupe has not had new commits in years and no longer builds on
today's Rust.  Those facts alone would have been sufficient to justify
removing it from the wiki.  We have far too many real data loss bugs in
btrfs already.  There is no need to spread rumors about new ones just
to push changes through.

It might be nice to keep btrfs-dedupe and bedup _somewhere_ on the wiki,
clearly marked as not supported and only of historical interest to new
developers.  I learned a lot about what is possible on btrfs from bedup
in particular (bees was initially a project to combine the features of
bedup and duperemove), and python is accessible to more developers than
C or C++.  btrfs-dedupe was the first btrfs dedupe agent to combine
defrag and dedupe operations into a single program.

> So I do agree with waxhead.  It would be preferable if there were an
> official btrfs deduplication command from btrfs-progs instead of relying on
> 3rd parties.  Joe Bloggs example above can read a web-page instructions
> saying "run this command... and then this command..."; but he will not have
> the knowledge, nor comprehension nor time to go through code.

Which of the available candidates for "official btrfs dedupe" would you
put in btrfs-progs?  I see a lot of runners in the race, but no clear
winner yet.

duperemove is the closest to Waxhead's proposed "-r /somewhere" syntax.
It's the obvious choice:  written in the same language as btrfs-progs, and
also the oldest btrfs deduper, and it has years of patient, data-driven
optimization built in.  If there wasn't some insurmountable reason
why duperemove can't be merged with btrfs-progs, then it would have
happened already, so there must be a reason why this can't ever happen
(which might be as simple as neither maintainer wants to merge).
Maybe we put duperemove at the top of the Wiki page, as it has the
simplest command-line for Joe Bloggs's use case, and it's relatively
easy to build for the few people who use distros where it's not packaged.

The stub support for in-kernel dedupe (arguably the only "official"
btrfs dedupe so far) has been removed due to lack of interest in its
development.  That _was_ available in branches of btrfs-progs
as 'btrfs dedupe'.  It's gone now.

The other viable deduper candidates are still works in progress, and
some have significant trade-offs and limitations resulting from their
optimization for specific use cases.  duperemove hasn't exploited any
btrfs-specific features to make it faster, so duperemove is already
close to the upper performance limits of its design, but far below the
performance that is possible in a specialist tool for btrfs.  bees scales
better and saves more space than the other dedupers, but bees can't
exclude any part of the filesystem from the scope of dedupe the way every
other btrfs deduper can.  dduper is a proof of concept that is so much
faster than the other block-oriented dedupers on btrfs that it overcomes a
ridiculously inefficient implementation and wins benchmarks--but it also
saves the least amount of space of any of the block-oriented dedupers on
the wiki.  There are some other candidates out there that aren't on the
wiki that attack the dedupe problem from interesting--and potentially
high-performing--angles (e.g. solstice dedupes the entire filesystem
using a sorting algorithm instead of a hash table).

The dozen or so utilities that do file-only dedupe well and support btrfs
are faster at Joe Bloggs's use case than all the block-oriented dedupers.
Most of them are not btrfs-specific tools, so it doesn't make sense to
integrate them into btrfs-progs.
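
To show what "file-only dedupe" means in practice, here is a minimal
Python sketch of the candidate-finding half of such a tool: group files
by size, then confirm with a content hash.  It does not represent any
particular utility from the wiki, the function names are invented for
this example, and it only reports duplicate groups; a real tool would
then hand each group to the dedupe ioctl (or to reflink copies).

```python
import hashlib
import os
from collections import defaultdict

def file_digest(path, chunk_size=1 << 20):
    """SHA-256 of a file's contents, read in chunks to bound memory use."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def find_duplicate_files(root):
    """Return sorted groups of paths under root with identical contents."""
    # Pass 1: bucket by size; files of different sizes cannot be identical.
    by_size = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            size = os.path.getsize(path)
            if size:  # empty files have no extents worth sharing
                by_size[size].append(path)
    # Pass 2: hash only the files that share a size with another file.
    groups = []
    for paths in by_size.values():
        if len(paths) < 2:
            continue
        by_hash = defaultdict(list)
        for path in paths:
            by_hash[file_digest(path)].append(path)
        groups.extend(sorted(g) for g in by_hash.values() if len(g) > 1)
    return sorted(groups)
```

The size pre-filter is the reason whole-file dedupers can be fast on a
media collection: most files are never read at all.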

Most of the existing dedupers aren't written in C.  The rest of
btrfs-progs is C, creating a code review and maintenance issue if they
are to be merged. 

The write-in candidate is "write a file-only deduper in C just so it can
be integrated with btrfs-progs."  That doesn't even exist, and it's still
better than some of the existing candidates for merging into btrfs-progs.

A deduper that is good at block-level dedupe is bad at file-level dedupe
and vice versa.  They view the filesystem stack from different sides,
and the hardest optimization one can do is the easiest for the other.
Pre-write (in-kernel) and post-write dedupers have significantly
different memory costs, which is another reason for having a diverse set
of dedupers:  if you copy the ZFS approach to dedupe, you need ZFS-sized
memory budgets to implement it; if you don't have ZFS-sized memory, you
need an alternative implementation.  These are significant barriers to
picking a single winner.

For now, at least until one of the dedupers can scale well over a superset
of the other dedupers' use cases, or the in-kernel deduper comes back from
the dead, it would be better to provide third-party dedupers that are
optimized for the subset of workloads that they can handle very well.

Otherwise, whichever single deduper you pick, it will suck for some users,
or we pick multiple dedupe engines and need to have a zillion options after
'btrfs fi dedupe' to help it pick which engine to use (this has already
happened to some extent in duperemove).

At the current rate of development, the XFS people might leapfrog us
on dedupe, and "official btrfs dedupe" could end up being xfs_fsr.

> Thanks David Sterba for removing the items and updating the wiki!
> 
> On 19/6/20 6:43 am, Zygo Blaxell wrote:
> > The point about lack of maintenance with changing Rust dependencies is
> > fair, but "data loss" is a strong and unsupported statement.  Can you
> > explain how data loss could occur in even a badly (assume not maliciously)
> > broken version of btrfs-dedupe?
> > 
> > As far as I can tell, the btrfs-dedupe code uses only non-data-mutating
> > btrfs kernel interfaces for manipulating extents (fiemap, defrag,
> > and file_extent_same/deduperange).  None of these should cause data
> > loss (excluding kernel bugs).
> > 
> > btrfs-dedupe can be trivially tricked into opening files that it did
> > not intend to (it has no protection against symlink injection and other
> > TOCTOU attacks), but it doesn't seem to be able to alter the content
> > of files once it opens them.
> > 
> > File descriptors pointing to user files are opened O_RDWR, but they are
> > kept in the scope of the dedupe function and their life-cycle is properly
> > managed in Rust, so btrfs-dedupe won't mutate files by writing to the
> > wrong fd (e.g. accidentally close stderr and reopen it to a user file)
> > unless someone adds some seriously buggy code (see "assume not malicious"
> > above).
> > 
> > The unsafe C ioctl interfaces are unlikely to change in data-losing ways,
> > or they'll break all existing userspace tools that use them.  They are
> > also well encapsulated in the rust-btrfs module.
> > 
> > The errors reported on github seem to be problems with incompatible
> > changes in the runtime libraries btrfs-dedupe depends on, and also some
> > reports of what look like pre-existing bugs in the fiemap code that are
> > blamed on new kernel versions without evidence.  Data-losing breaking
> > changes in any of the ioctls btrfs-dedupe uses are extremely unlikely.
> > Those issues may cause btrfs-dedupe to do useless unnecessary work,
> > or fail to do useful necessary work, but could not cause data loss by
> > any mechanism I can find.
> > 
> > Contrast with bedup:  bedup uses data-mutating kernel interfaces
> > (clone_range) for dedupe that have no effective protection against
> > concurrent data modification.  There is ineffective protection implemented
> > in bedup (looking in /proc/*/fd for concurrent users of the files) which
> > may or may not be broken in kernel 5.0, but it's ineffective either way.
> > The case for data loss in bedup is trivial.  The branch with a patch to
> > fix it is now 7 years old, so it's fair to say bedup is unmaintained too
> > (github forks notwithstanding, they didn't fix these issues).
> > 


* Re: btrfs-dedupe broken and unsupported but in official wiki
  2020-06-19  5:04     ` Zygo Blaxell
@ 2020-06-19 13:11       ` David Sterba
  2020-06-22 19:49         ` Goffredo Baroncelli
  0 siblings, 1 reply; 14+ messages in thread
From: David Sterba @ 2020-06-19 13:11 UTC (permalink / raw)
  To: Zygo Blaxell; +Cc: DanglingPointer, linux-btrfs

On Fri, Jun 19, 2020 at 01:04:03AM -0400, Zygo Blaxell wrote:
> It might be nice to keep btrfs-dedupe and bedup _somewhere_ on the wiki,
> clearly marked as not supported and only of historical interest to new
> developers.  I learned a lot about what is possible on btrfs from bedup
> in particular (bees was initially a project to combine the features of
> bedup and duperemove), and python is accessible to more developers than
> C or C++.  btrfs-dedupe was the first btrfs dedupe agent to combine
> defrag and dedupe operations into a single program.

It's there now.

> > So I do agree with waxhead.  It would be preferable if there were an
> > official btrfs deduplication command from btrfs-progs instead of relying on
> > 3rd parties.  Joe Bloggs example above can read a web-page instructions
> > saying "run this command... and then this command..."; but he will not have
> > the knowledge, nor comprehension nor time to go through code.
> 
> Which of the available candidates for "official btrfs dedupe" would you
> put in btrfs-progs?  I see a lot of runners in the race, but no clear
> winner yet.
> 
> duperemove is the closest to Waxhead's proposed "-r /somewhere" syntax.
> It's the obvious choice:  written in the same language as btrfs-progs, and
> also the oldest btrfs deduper, and it has years of patient, data-driven
> optimization built in.

That there's not even a simple (e.g. file-based) deduper available in
btrfs-progs is kind of bad. Duperemove is indeed the closest to that.

> If there wasn't some insurmountable reason
> why duperemove can't be merged with btrfs-progs, then it would have
> happened already, so there must be a reason why this can't ever happen
> (which might be as simple as neither maintainer wants to merge).

I'm not against adding the functionality to btrfs-progs, but merging
whole duperemove feature set might not happen due to additional
dependencies. This would need to be evaluated, but I'm not aware of any
other technical reasons.

I don't remember exactly why duperemove started as a separate project
instead of a subcommand of progs, but we can revisit that.

> Maybe we put duperemove at the top of the Wiki page, as it has the
> simplest command-line for Joe Bloggs's use case, and it's relatively
> easy to build for the few people who use distros where it's not packaged.

That's a good idea: a 'quick start' section describing the most common
use cases with duperemove.

> The stub support for in-kernel dedupe (arguably the only "official"
> btrfs dedupe so far) has been removed due to lack of interest in its
> development.  That _was_ available in branches of btrfs-progs
> as 'btrfs dedupe'.  It's gone now.

The more I think about in-band dedupe (and how it would complicate
everything), the more I lean towards a user-space solution with support
from the kernel (ioctls, keeping hashes of recently modified blocks but
not doing the actual deduplication, reading hashes from the csum tree,
etc).


* Re: btrfs-dedupe broken and unsupported but in official wiki
  2020-06-18 20:59 ` waxhead
@ 2020-06-19 13:19   ` David Sterba
  0 siblings, 0 replies; 14+ messages in thread
From: David Sterba @ 2020-06-19 13:19 UTC (permalink / raw)
  To: waxhead; +Cc: DanglingPointer, linux-btrfs

On Thu, Jun 18, 2020 at 10:59:10PM +0200, waxhead wrote:
> I have pointed this out before, but I would like to use the opportunity 
> again. I, as just a regular user of btrfs would feel more comfortable if 
> the dedupe tool was part of btrfs such as for example btrfs filesystem 
> dedupe -r /somewhere

I agree that something like that would be highly useful, and although I
know about duperemove, I don't use it often enough to remember exactly
how to use it for the simple use case.


* Re: btrfs-dedupe broken and unsupported but in official wiki
  2020-06-19 13:11       ` David Sterba
@ 2020-06-22 19:49         ` Goffredo Baroncelli
  2020-06-22 22:45           ` Zygo Blaxell
  0 siblings, 1 reply; 14+ messages in thread
From: Goffredo Baroncelli @ 2020-06-22 19:49 UTC (permalink / raw)
  To: dsterba, Zygo Blaxell, DanglingPointer, linux-btrfs

On 6/19/20 3:11 PM, David Sterba wrote:
>> If there wasn't some insurmountable reason
>> why duperemove can't be merged with btrfs-progs, then it would have
>> happened already, so there must be a reason why this can't ever happen
>> (which might be as simple as neither maintainer wants to merge).
> I'm not against adding the functionality to btrfs-progs, but merging
> whole duperemove feature set might not happen due to additional
> dependencies. This would need to be evaluated, but I'm not aware of any
> other technical reasons.
> 
> I don't remember exactly why duperemove started as a separate project
> instead of a subcommand or progs, but we can revisit that.
> 
Even though I don't think that this was the reason at the time, the
ioctl FIDEDUPERANGE (aka BTRFS_IOC_FILE_EXTENT_SAME) is now "filesystem
agnostic". So I think that a tool more generic than btrfs(-progs) does
make sense.

What I mean is: because this is not a BTRFS-specific ioctl anymore, why
should we have a BTRFS-specific implementation?

From a technical point of view: duperemove could take advantage of the
btrfs csum. So the question could be: is it better to add the capability
to use the BTRFS csum to duperemove, or to add the code of duperemove to
BTRFS?

From a user's point of view, I think that the former makes sense.

BR
G.Baroncelli

-- 
gpg @keyserver.linux.it: Goffredo Baroncelli <kreijackATinwind.it>
Key fingerprint BBF5 1610 0B64 DAC6 5F7D  17B2 0EDA 9B37 8B82 E0B5


* Re: btrfs-dedupe broken and unsupported but in official wiki
  2020-06-22 19:49         ` Goffredo Baroncelli
@ 2020-06-22 22:45           ` Zygo Blaxell
  2020-07-02  8:27             ` Lakshmipathi.G
  0 siblings, 1 reply; 14+ messages in thread
From: Zygo Blaxell @ 2020-06-22 22:45 UTC (permalink / raw)
  To: kreijack; +Cc: dsterba, DanglingPointer, linux-btrfs

On Mon, Jun 22, 2020 at 09:49:55PM +0200, Goffredo Baroncelli wrote:
> On 6/19/20 3:11 PM, David Sterba wrote:
> > > If there wasn't some insurmountable reason
> > > why duperemove can't be merged with btrfs-progs, then it would have
> > > happened already, so there must be a reason why this can't ever happen
> > > (which might be as simple as neither maintainer wants to merge).
> > I'm not against adding the functionality to btrfs-progs, but merging
> > whole duperemove feature set might not happen due to additional
> > dependencies. This would need to be evaluated, but I'm not aware of any
> > other technical reasons.
> > 
> > I don't remember exactly why duperemove started as a separate project
> > instead of a subcommand or progs, but we can revisit that.
> > 
> Even though I don't think that this was the reason at the time, the
> ioctl FIDEDUPERANGE (aka BTRFS_IOC_FILE_EXTENT_SAME) is now "filesystem
> agnostic". So I think that a tool more generic than btrfs(-progs) does
> make sense.
> 
> What I mean is: because this is not a BTRFS-specific ioctl anymore,
> why should we have a BTRFS-specific implementation?

First, to take advantage of unique btrfs capabilities:  incremental
scanning using transid and TREE_SEARCH_V2, and user data block csums.
Second, to take advantage of generic filesystem capabilities that
require btrfs-specific implementation details.  Third, btrfs has immutable
extents while other filesystems don't, and ignoring that fact in a generic
multi-filesystem tool will cost a lot of dedupe efficiency on btrfs.

On a big filesystem, the difference between a filesystem-specific
dedupe tool and a filesystem-agnostic one could be many orders of
magnitude better performance and a doubling of space recovery.

duperemove is implemented using generic filesystem APIs:  you point it at
a directory tree, it scans all the files in the tree (including
previously deduped files) and dedupes them.  In incremental mode it
scans the entire tree and compares the tree with a database.  This is
the slowest way to keep a filesystem deduplicated at scale.

XFS and btrfs are both capable of doing dedupe at wire speeds by
bypassing most of the filesystem (similar to a scrub, and can even
be combined with scrub).  That level of performance makes incremental
scanning and filesystem csum support unnecessary for many use cases,
since users would just run full dedupe instead of scrub.  One tool
can support both XFS and btrfs this way, though it would have to have
specialized support for each individual filesystem as the details on each
filesystem are very different (GETFSMAP and pread, vs LOGICAL_INO and all
the different btrfs raid profiles and compression formats).  It could be
done as a dedupe core with plugin support for each filesystem, provided
that the core algorithm is designed to handle btrfs's immutable extents.
AFAIK nobody has built such a tool yet.

XFS doesn't maintain csums of user data or support incremental scans,
so XFS can dedupe _only_ as fast as it can scrub (*).  btrfs has the
extra information in the filesystem, so in theory we can start with the
wire-speed dedupe from above, and make it up to 1000 times faster by
reading the csums instead of reading the data blocks, and then faster
still by scanning only the parts of the filesystem that changed from one
dedupe run to the next.

(*) XFS has some very fast tools for rapidly finding modified inodes,
and it doesn't have immutable extents like btrfs does.  XFS might win
by brute force against btrfs's slower equivalents.  It would depend on
the mix of file sizes in the workload.
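
The csum shortcut can be sketched roughly as follows.  This is a toy
model, not how any existing tool works: zlib.crc32 stands in for the
checksums btrfs already stores (btrfs defaults to crc32c, which is a
different polynomial), the blocks are plain in-memory byte strings, and
the 4096-byte block and 4-byte csum sizes are assumptions about a
default filesystem.  Under those assumptions the "up to 1000 times"
figure corresponds to reading 4 bytes of csum instead of 4096 bytes
of data.

```python
import zlib
from collections import defaultdict

BLOCK = 4096       # assumed data block size in bytes
CSUM_BYTES = 4     # assumed checksum size in bytes (crc32c)

# Bytes of data covered per byte of checksum read
READ_REDUCTION = BLOCK // CSUM_BYTES


def candidate_duplicates(blocks):
    """Group block indexes by checksum, then confirm each group byte-for-byte."""
    by_csum = defaultdict(list)
    for index, data in enumerate(blocks):
        # Stand-in for looking up a checksum the filesystem already stores
        by_csum[zlib.crc32(data)].append(index)

    groups = []
    for indexes in by_csum.values():
        if len(indexes) < 2:
            continue
        # A checksum match is only a hint; rule out collisions
        by_content = defaultdict(list)
        for index in indexes:
            by_content[blocks[index]].append(index)
        groups.extend(g for g in by_content.values() if len(g) > 1)
    return groups


if __name__ == "__main__":
    sample = [b"a" * BLOCK, b"b" * BLOCK, b"a" * BLOCK]
    print(candidate_duplicates(sample), READ_REDUCTION)
```

In a real tool the byte-for-byte confirmation would not need a separate
read pass, because the dedupe ioctl compares the data before sharing it.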

> From a technical point of view: duperemove could take advantage of
> the btrfs csum. So the question could be: is it better to add the
> capability to use the BTRFS csum to duperemove, or to add the code of
> duperemove to BTRFS?

The options are orthogonal.  csum read support can be added to any dedupe
tool, whether it's part of the official btrfs code or not.  We can decide
on an official tool and add csum support to that tool in either order.

> From a user point of view, I think that the former makes sense.


* Re: btrfs-dedupe broken and unsupported but in official wiki
  2020-06-22 22:45           ` Zygo Blaxell
@ 2020-07-02  8:27             ` Lakshmipathi.G
  2020-07-03  3:16               ` Zygo Blaxell
  0 siblings, 1 reply; 14+ messages in thread
From: Lakshmipathi.G @ 2020-07-02  8:27 UTC (permalink / raw)
  To: Zygo Blaxell; +Cc: kreijack, dsterba, DanglingPointer, btrfs

Hi Zygo.

>dduper is a proof of concept that is so much
>faster than the other block-oriented dedupers on btrfs that it overcomes a
>ridiculously inefficient implementation and wins benchmarks--but it also
>saves the least amount of space of any of the block-oriented dedupers on
>the wiki.

Regarding dduper, do you have a script to re-create your dataset? I'd like to
investigate why dduper saves the least amount of space. thanks!

----
Cheers,
Lakshmipathi.G
http://www.giis.co.in https://www.webminal.org


* Re: btrfs-dedupe broken and unsupported but in official wiki
  2020-07-02  8:27             ` Lakshmipathi.G
@ 2020-07-03  3:16               ` Zygo Blaxell
  2020-07-06 10:46                 ` Lakshmipathi.G
  0 siblings, 1 reply; 14+ messages in thread
From: Zygo Blaxell @ 2020-07-03  3:16 UTC (permalink / raw)
  To: Lakshmipathi.G; +Cc: kreijack, dsterba, DanglingPointer, btrfs

On Thu, Jul 02, 2020 at 01:57:57PM +0530, Lakshmipathi.G wrote:
> Hi Zygo.
> 
> >dduper is a proof of concept that is so much
> >faster than the other block-oriented dedupers on btrfs that it overcomes a
> >ridiculously inefficient implementation and wins benchmarks--but it also
> >saves the least amount of space of any of the block-oriented dedupers on
> >the wiki.
> 
> Regarding dduper, do you have a script to re-create your dataset? I'd like to
> investigate why dduper saves the least amount of space. thanks!

My data set is a bunch of Windows raw disk images taken right after the
MS installer runs.  I don't think I can share it, but it's easy enough to
roll your own.  To avoid btrfs backref performance bugs, I split the disk
images into 1GB files, removed those files that were entirely duplicate
(all-zero or hard disk sector initialization pattern), and deduped the
rest.  For repeatability, once I had set up the btrfs filesystem with
all the 1GB raw image fragment files, I dd'ed it to a raw partition on
a dedicated disk and ran the test in a VM, so that all tools deduped an
identical filesystem image on the same hardware (which has since died,
so here I will use the saved results of the last run).

On btrfs, extents are immutable.  To remove a duplicate extent, the
deduper must remove every reference to every block in an extent, even
if some of the blocks do not contain duplicate data.  If any reference
to the extent remains anywhere in the filesystem, no space is saved.
If anything, space is lost due to metadata growth.
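
The all-or-nothing rule can be shown with a toy model.  This is only an
illustration of the accounting described above, not btrfs code: an
extent is a run of numbered blocks, each reference covers a sub-range,
and the function name freed_blocks is invented for the example.

```python
def freed_blocks(extent_len, refs, duplicate_blocks):
    """Return how many blocks are released after deduping one extent.

    extent_len:       number of blocks in the extent
    refs:             list of (start, length) references into the extent
    duplicate_blocks: block offsets whose data also exists elsewhere, so a
                      deduper can redirect references away from them
    """
    for start, length in refs:
        for block in range(start, start + length):
            if block not in duplicate_blocks:
                # This reference cannot be fully redirected, so it keeps
                # the whole extent allocated.
                return 0
    return extent_len


if __name__ == "__main__":
    whole_file = [(0, 8)]
    # 7 of 8 blocks are duplicates: nothing is freed
    print(freed_blocks(8, whole_file, set(range(7))))
    # all 8 blocks are duplicates (or the unique one was copied away first)
    print(freed_blocks(8, whole_file, set(range(8))))
```

In this model a single unique block pins all eight, which is why a
deduper has to deal with the unique data somehow before it can win
any space back.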

One way to achieve removal of a partially matched extent is to copy the
unique data so that the entire extent contains duplicate data (which
bees does).  Another option is to not attempt dedupe at all unless the
entire content of one extent matches (which duperemove might do...in a
dev branch?).  This does not gain more free space, but it avoids wasting
time on dedupe ioctl calls that cannot free any space.

duperemove will do parts of this analysis depending on command-line
options.  dduper doesn't do any such analysis that I've seen, and
its performance seems to be comparable to duperemove with a crippling
set of command-line options.  The space efficiency of both dduper and
duperemove is poor on btrfs--they are only effective when deduping files
with small extents, or files that are entirely duplicate.  In test runs,
both dduper and duperemove issue a lot of dedupe ioctls that have no
effect on free space (though duperemove has command-line options that
avoid the worst losses).

In my uncompressed test, the extents are all large (many are at the
maximum 128MB size), so a deduper that doesn't split extents will be able
to recover almost no space.  The only successes dduper and duperemove
were able to achieve were exploiting the fact that Windows disks have
contiguous gigabytes of identical content in their recovery-tools
partitions.

bees is able to recover more of the duplicate space it finds because it
slices up large extents along dedupe-friendly boundaries.  This slows bees
down on uncompressed filesystems because the incoming extents are larger.

My test result for 140GB of uncompressed data was:

	bees saved 31% in 1h 40m (0.31%/min)

	duperemove -d -r saved 12% in 2h 30m (0.08%/min)

	duperemove -d -r --dedupe-options=same saved 12% in 25 minutes
	(0.48%/min)

	dduper saved 9% in 16m (0.56%/min)

	duperemove -d -r --dedupe-options=nofiemap,noblock,same -A
	--lookup-extents=no saved 7% in 25 min (0.28%/min)

dduper is the fastest, but saves less total space than two variations
of duperemove command-line options.
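
The %/min figures are simply space saved divided by elapsed minutes.
A short script to reproduce them from the numbers listed above, and to
show that the speed ranking and the space ranking disagree (the variable
names are only for this example):

```python
# (tool, percent of space saved, minutes taken) from the uncompressed test
results = [
    ("bees", 31, 100),
    ("duperemove -d -r", 12, 150),
    ("duperemove -d -r --dedupe-options=same", 12, 25),
    ("dduper", 9, 16),
    ("duperemove -d -r --dedupe-options=nofiemap,noblock,same -A "
     "--lookup-extents=no", 7, 25),
]


def rate(saved, minutes):
    """Percent of space saved per minute of run time."""
    return saved / minutes


by_rate = sorted(results, key=lambda r: rate(r[1], r[2]), reverse=True)
by_space = sorted(results, key=lambda r: r[1], reverse=True)

if __name__ == "__main__":
    for name, saved, minutes in by_rate:
        print(f"{rate(saved, minutes):.2f}%/min  {saved:2d}%  {name}")
```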

dduper is even faster than the above numbers suggest--it deduped 8.5% of
the data in 6 minutes, a rate of 1.41%/minute, 3x faster than duperemove's
best score...then dduper wasted the following 10 minutes doing futile
dedupe ioctl calls that didn't free any space.

All that said, scoring the highest free space %/minute rate in a race with
other dedupers _while wasting 67% of the time and 71% of the available
space_ is pretty impressive!

duperemove -d -r took 2h 30m because it hits an old btrfs backref
performance bug (now fixed in 5.7?).  It actually saved 12% in 25 minutes
too, but it created a toxic extent and spent 2 hours burning CPU in the
kernel to process it.  The other duperemove command-line argument sets
mentioned here avoid this bug.

The result for 100GB of compressed data (the same data, but compressed
with compress-force=zstd) was:

	bees saved 44% in 1h 15m (0.58%/min)

	duperemove -d -r --dedupe-options=nofiemap,noblock,same -A
	--lookup-extents=no saved 3% in 12 minutes (0.25%/min)

	dduper saved 1% in 24 minutes (0.04%/min)

On compressed filesystem tests, dduper gains almost no space.  This is
expected, because dduper only looks at btrfs csums, and the btrfs csums
can only match when the compressed data representation of both copies
is exactly the same.  In btrfs-compressed files the compressed extent
block alignment is effectively random for large files, since it depends
on timing details at the time of the btrfs commit, so on average only 3%
(1 in 32 blocks) of extents with duplicate data will have matching csums
after compression.  bees and duperemove read the data after decompression,
so they are not limited by differences in compression encoding.
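
One way to arrive at the "1 in 32" figure, as a sketch under stated
assumptions: a compressed extent holds at most 128KiB of data, blocks
are 4KiB, and each copy of the data starts its compressed extents at an
independent, uniformly random block offset.  The two copies then
compress to the same bytes only when their offsets happen to agree.

```python
from fractions import Fraction

COMPRESSED_EXTENT = 128 * 1024   # assumed maximum data per compressed extent
BLOCK = 4096                     # assumed block size

# Possible block alignments of a compressed extent boundary
positions = COMPRESSED_EXTENT // BLOCK

# Enumerate every pair of alignments and count the pairs that agree
matching = sum(1 for a in range(positions)
               for b in range(positions) if a == b)
p_match = Fraction(matching, positions * positions)

if __name__ == "__main__":
    print(positions, p_match, f"{float(p_match):.1%}")
```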

The only way for dduper to catch up here is to detect compressed
extents and fall back to reading them the slow way.  This is a reasonable
tradeoff for filesystem workloads that have low proportions of compressed
data; otherwise, duperemove's optimized multi-threaded implementation
might run slightly faster than dduper on a fast device, if dduper is
forced to read all the blocks because they are compressed.

I don't recall why I didn't run duperemove with other options on a
compressed filesystem during this test--possibly to avoid a bug?

I have not looked in further detail into why dduper frees slightly less
space than duperemove under some conditions.  A simple deduper with a
minimal awareness of btrfs's extent reference counting structure can
easily match or slightly outperform the best deduper without one; with a
non-minimal awareness of btrfs structure, a slow and broken deduper can
outperform by an order of magnitude.  duperemove's command-line options do
provide or suppress some awareness of extent structure, so I would expect
those options to increase or decrease space saved slightly compared to
a tool that has no such awareness, and that seems to be what happens.
The test results of dduper, duperemove, and bees are all consistent
with that.



* Re: btrfs-dedupe broken and unsupported but in official wiki
  2020-07-03  3:16               ` Zygo Blaxell
@ 2020-07-06 10:46                 ` Lakshmipathi.G
  2020-07-25  7:24                   ` Lakshmipathi.G
  0 siblings, 1 reply; 14+ messages in thread
From: Lakshmipathi.G @ 2020-07-06 10:46 UTC (permalink / raw)
  To: Zygo Blaxell; +Cc: kreijack, dsterba, DanglingPointer, btrfs

Hi Zygo,

Thanks for the extensive details about the data-set and how the environment
is set up and tests are executed. I'll try to create some Windows raw
disk images for testing and follow your test environment setup as
much as possible. Performance numbers are really interesting between
bees, duperemove and dduper!

As you mentioned previously, dduper is more of a PoC and I didn't spend
much time testing it with different data sets. I mostly created a few GB
of files with `dd urandom` and tested with them. I guess that is why it
performs better with files that are entirely duplicate and with small
extents :-)

Let me spend some time investigating these issues. I'm pretty sure dduper
can be made a little bit more reliable than its current form.

Regarding the compressed filesystem tests, I will check this after
resolving the poor disk-space savings on non-compressed filesystems. Thanks!

----
Cheers,
Lakshmipathi.G
http://www.giis.co.in https://www.webminal.org


* Re: btrfs-dedupe broken and unsupported but in official wiki
  2020-07-06 10:46                 ` Lakshmipathi.G
@ 2020-07-25  7:24                   ` Lakshmipathi.G
  0 siblings, 0 replies; 14+ messages in thread
From: Lakshmipathi.G @ 2020-07-25  7:24 UTC (permalink / raw)
  To: Zygo Blaxell; +Cc: kreijack, dsterba, DanglingPointer, btrfs

Hi Zygo.

> Let me spend some time investigating these issues. I'm pretty sure dduper
> can be made a little bit more reliable than its current form.

I think I resolved the bug that caused the poor disk-space savings with
this commit [1]. dduper should now at least provide better disk-space
savings than its previous version.

I also added an `--analyze` option to display stats with different chunk
sizes [2] and posted some test run results here [3]. Thanks!

[1]: https://github.com/Lakshmipathi/dduper/commit/180f2aedf697b440c53cbe61195dd821c8aae3b4
[2]: https://github.com/lakshmipathi/dduper#analyze-with-different-chunk-size
[3]: https://github.com/Lakshmipathi/dduper/blob/master/tests/TESTS.md

----
Cheers,
Lakshmipathi.G
http://www.giis.co.in https://www.webminal.org
