* btrfs-dedupe broken and unsupported but in official wiki
From: DanglingPointer @ 2020-06-18 2:28 UTC
To: linux-btrfs

btrfs-dedupe is currently broken and no longer actively supported.

It no longer builds with current rustc v1.44.0 and cargo.

It is in the official btrfs Deduplication wiki:

https://btrfs.wiki.kernel.org/index.php/Deduplication

There's no real active community and no proper QA, reviewing and vetting.

A poster in the issues area of the project's GitHub location stated that
even if fixed, it may not function correctly due to BTRFS having evolved
since the tool was designed.

There are just too many unknowns with this BTRFS-specific dedupe tool.

People using your official wiki and trying to use that deduplication
program could inadvertently destroy their data through naivety or
accident, especially if they start trying to fix the code.

I recommend you remove it from your website or at least put large
warnings there that it is broken (which looks ugly; I would rather only
stuff that works were there, since it isn't your project anyway but some
3rd party's).
* Re: btrfs-dedupe broken and unsupported but in official wiki
From: David Sterba @ 2020-06-18 10:31 UTC
To: DanglingPointer; +Cc: linux-btrfs

On Thu, Jun 18, 2020 at 12:28:41PM +1000, DanglingPointer wrote:
> btrfs-dedupe is currently broken and no longer actively supported.
>
> It no longer builds with current rustc v1.44.0 with cargo
>
> It is in the official btrfs Deduplication wiki:
>
> https://btrfs.wiki.kernel.org/index.php/Deduplication
>
> There's no real active community and proper QA, reviewing and vetting.
>
> A poster in the issues area of the project's GitHub location stated that
> even if fixed, it may not function correctly due to BTRFS having evolved
> since the tool was designed.
>
> There's just too many unknowns with this BTRFS specific dedupe tool.

That's enough reason to remove the entry from the page.

> People using your official wiki and trying to use that deduplication
> program could inadvertently destroy their data through naivety or
> accident. Especially if they start trying to fix the code.
>
> I recommend you remove it from your website or at least put large
> warnings there that it is broken (which looks ugly, I would rather only
> stuff that works were there since it isn't your project anyway but some
> 3rd party).

Third-party tools often end up in that situation, and feedback like
yours helps to keep the information up to date. I'll remove the tools
that are known to be unmaintained. Thanks.
* Re: btrfs-dedupe broken and unsupported but in official wiki
From: Zygo Blaxell @ 2020-06-18 20:43 UTC
To: DanglingPointer; +Cc: linux-btrfs

On Thu, Jun 18, 2020 at 12:28:41PM +1000, DanglingPointer wrote:
> btrfs-dedupe is currently broken and no longer actively supported.
>
> It no longer builds with current rustc v1.44.0 with cargo
>
> It is in the official btrfs Deduplication wiki:
>
> https://btrfs.wiki.kernel.org/index.php/Deduplication
>
> There's no real active community and proper QA, reviewing and vetting.
>
> A poster in the issues area of the project's GitHub location stated that even
> if fixed, it may not function correctly due to BTRFS having evolved since
> the tool was designed.
>
> There's just too many unknowns with this BTRFS specific dedupe tool.
>
> People using your official wiki and trying to use that deduplication program
> could inadvertently destroy their data through naivety or accident.
> Especially if they start trying to fix the code.

The point about lack of maintenance with changing Rust dependencies is
fair, but "data loss" is a strong and unsupported statement. Can you
explain how data loss could occur in even a badly (assume not maliciously)
broken version of btrfs-dedupe?

As far as I can tell, the btrfs-dedupe code uses only non-data-mutating
btrfs kernel interfaces for manipulating extents (fiemap, defrag,
and file_extent_same/deduperange). None of these should cause data
loss (excluding kernel bugs).

btrfs-dedupe can be trivially tricked into opening files that it did
not intend to (it has no protection against symlink injection and other
TOCTOU attacks), but it doesn't seem to be able to alter the content
of files once it opens them.
File descriptors pointing to user files are opened O_RDWR, but they are
kept in the scope of the dedupe function and their life-cycle is properly
managed in Rust, so btrfs-dedupe won't mutate files by writing to the
wrong fd (e.g. accidentally close stderr and reopen it to a user file)
unless someone adds some seriously buggy code (see "assume not malicious"
above).

The unsafe C ioctl interfaces are unlikely to change in data-losing ways,
or they'll break all existing userspace tools that use them. They are
also well encapsulated in the rust-btrfs module.

The errors reported on github seem to be problems with incompatible
changes in the runtime libraries btrfs-dedupe depends on, and also some
reports of what look like pre-existing bugs in the fiemap code that are
blamed on new kernel versions without evidence. Data-losing breaking
changes in any of the ioctls btrfs-dedupe uses are extremely unlikely.
Those issues may cause btrfs-dedupe to do useless unnecessary work,
or fail to do useful necessary work, but could not cause data loss by
any mechanism I can find.

Contrast with bedup: bedup uses data-mutating kernel interfaces
(clone_range) for dedupe that have no effective protection against
concurrent data modification. There is ineffective protection implemented
in bedup (looking in /proc/*/fd for concurrent users of the files) which
may or may not be broken in kernel 5.0, but it's ineffective either way.
The case for data loss in bedup is trivial. The branch with a patch to
fix it is now 7 years old, so it's fair to say bedup is unmaintained too
(github forks notwithstanding, they didn't fix these issues).

> I recommend you remove it from your website or at least put large warnings
> there that it is broken (which looks ugly, I would rather only stuff that
> works were there since it isn't your project anyway but some 3rd party).
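To make the deduperange point above concrete, here is a minimal Python
sketch of a single-destination FIDEDUPERANGE request. It is an
illustration only, not code from btrfs-dedupe or rust-btrfs. The struct
layouts follow struct file_dedupe_range in <linux/fs.h>; the helper names
(pack_request, unpack_result, dedupe_range) are made up for this example,
and the ioctl number is computed with the _IOWR encoding used on x86 and
ARM, which is an assumption that does not hold on every architecture.

```python
import fcntl
import struct

# struct file_dedupe_range header:
#   src_offset, src_length, dest_count, reserved1, reserved2
_HEADER = struct.Struct("=QQHHI")
# struct file_dedupe_range_info:
#   dest_fd, dest_offset, bytes_deduped, status, reserved
_INFO = struct.Struct("=qQQiI")

# _IOWR(0x94, 54, struct file_dedupe_range), assuming the common encoding
# (direction bits at 30, size at 16, type at 8).
FIDEDUPERANGE = (3 << 30) | (_HEADER.size << 16) | (0x94 << 8) | 54

FILE_DEDUPE_RANGE_SAME = 0      # ranges were identical and are now shared
FILE_DEDUPE_RANGE_DIFFERS = 1   # ranges differed; nothing was changed


def pack_request(src_offset, length, dest_fd, dest_offset):
    """Build the argument buffer for a request with one destination."""
    return bytearray(
        _HEADER.pack(src_offset, length, 1, 0, 0)
        + _INFO.pack(dest_fd, dest_offset, 0, 0, 0)
    )


def unpack_result(buf):
    """Return (bytes_deduped, status) as written back by the kernel."""
    _fd, _off, bytes_deduped, status, _res = _INFO.unpack_from(
        buf, _HEADER.size
    )
    return bytes_deduped, status


def dedupe_range(src_fd, src_offset, length, dest_fd, dest_offset):
    """Ask the kernel to share one range if, and only if, it is identical.

    The kernel compares both ranges itself before sharing extents, so a
    caller that passes the wrong offsets gets FILE_DEDUPE_RANGE_DIFFERS
    rather than altered file content.
    """
    buf = pack_request(src_offset, length, dest_fd, dest_offset)
    fcntl.ioctl(src_fd, FIDEDUPERANGE, buf)  # buf is updated in place
    return unpack_result(buf)
```

A clone-style interface has no such comparison step: the caller asserts
that the ranges match and the kernel takes its word for it, which is why
a concurrent writer can turn a clone-based deduper into a data-loss bug.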
* Re: btrfs-dedupe broken and unsupported but in official wiki
From: DanglingPointer @ 2020-06-18 22:05 UTC
To: Zygo Blaxell; +Cc: linux-btrfs

A large portion of desktop users are not developers and are rustlang
illiterate and programming illiterate; they would not know whether
this tool or that tool or any tool would be safe, or unsafe, or have
concurrent race conditions, or know the meaning of immutable or mutex.

Think of this scenario: average Joe Bloggs user buys a new computer
without MS Windows. With the software savings, Joe purchases more disks.
He then chooses openSUSE Leap for his first foray into Linux.
All he cares about are his music files, photos, and videos being safe.
Joe runs a cafe down the street and uses the music, photos, and videos on
various screens at his cafe for the atmosphere.
Times are tough and he's running out of space, so to conserve storage he
doesn't want the accumulated media files duplicated all around the place
wasting space.

If the official wikis have broken 3rd party tools, then it makes the whole
adoption process less easy, less friendly, very cryptic, more chaotic; and
gives the impression that btrfs is a mess and not ready (and Linux as a
whole). He would not know how, or have the time, to go through the code of
each deduplication tool option to figure out if one type or the other
type is better, as Zygo Blaxell, who can read code, did. Even if he
wanted to, he doesn't know how to nor has the time to do it. He says
good-bye to openSUSE and buys Windows.

So I do agree with waxhead. It would be preferable if there were an
official btrfs deduplication command from btrfs-progs instead of relying
on 3rd parties. Joe Bloggs in the example above can read web-page
instructions saying "run this command...
and then this command..."; but he will not have the knowledge, nor
comprehension nor time to go through code.

Thanks David Sterba for removing the items and updating the wiki!
* Re: btrfs-dedupe broken and unsupported but in official wiki
From: Zygo Blaxell @ 2020-06-19 5:04 UTC
To: DanglingPointer; +Cc: linux-btrfs

On Fri, Jun 19, 2020 at 08:05:44AM +1000, DanglingPointer wrote:
> For a large portion of desktop users that are not developers and are
> rustlang illiterate and programming illiterate; they would not know whether
> this tool or that tool or any tool would be safe, or unsafe, or have
> concurrent race conditions, or know the meaning of immutable or mutex.
>
> Think of this scenario; average Joe Bloggs user buys new computer without MS
> Windows. With the software savings, Joe purchases more disks. He then
> chooses openSuse Leap for his first foray into Linux.
> All he cares about are his music files, photos, and videos being safe. Joe
> runs a Cafe down the street and uses the music, photos, and videos in
> various screens at his cafe for the atmosphere.
> Times are tough and he's running out of space so he doesn't want the
> accumulated media files duplicated all around the place wasting space to
> conserve storage.
>
> If the official wikis have broken 3rd party tools, then it makes the whole
> adoption process less easy, less friendly, very cryptic, more chaotic; and
> give the impression that btrfs is a mess and not ready (and Linux as a
> whole). He would not know or have the time to go through the code of each
> deduplication program tool option to figure out if one type or the other
> type is better just like Zygo Blaxell did who can read code. Even if he
> wanted to, he doesn't know how to nor has the time to do it. He says
> good-bye to openSuse and buys Windows.

My objection here is the serious accusation in the term "data loss",
which you have made on the mailing list and github without supporting
evidence.
Joe Bloggs will not lose any data from btrfs-dedupe. He'll waste his
time and run out of disk space, and maybe switch filesystems due to
frustration, but Joe will not lose any of his data.

btrfs-dedupe has not had new commits in years and no longer builds on
today's Rust. Those facts alone would have been sufficient to justify
removing it from the wiki.

We have far too many real data loss bugs in btrfs already. There is no
need to spread rumors about new ones just to push changes through.

It might be nice to keep btrfs-dedupe and bedup _somewhere_ on the wiki,
clearly marked as not supported and only of historical interest to new
developers. I learned a lot about what is possible on btrfs from bedup
in particular (bees was initially a project to combine the features of
bedup and duperemove), and python is accessible to more developers than
C or C++. btrfs-dedupe was the first btrfs dedupe agent to combine
defrag and dedupe operations into a single program.

> So I do agree with waxhead. It would be preferable if there were an
> official btrfs deduplication command from btrfs-progs instead of relying on
> 3rd parties. Joe Bloggs example above can read a web-page instructions
> saying "run this command... and then this command..."; but he will not have
> the knowledge, nor comprehension nor time to go through code.

Which of the available candidates for "official btrfs dedupe" would you
put in btrfs-progs? I see a lot of runners in the race, but no clear
winner yet.

duperemove is the closest to Waxhead's proposed "-r /somewhere" syntax.
It's the obvious choice: written in the same language as btrfs-progs, and
also the oldest btrfs deduper, and it has years of patient, data-driven
optimization built in. If there wasn't some insurmountable reason
why duperemove can't be merged with btrfs-progs, then it would have
happened already, so there must be a reason why this can't ever happen
(which might be as simple as neither maintainer wants to merge).
Maybe we put duperemove at the top of the Wiki page, as it has the
simplest command-line for Joe Bloggs's use case, and it's relatively
easy to build for the few people who use distros where it's not packaged.

The stub support for in-kernel dedupe (arguably the only "official"
btrfs dedupe so far) has been removed due to lack of interest in its
development. That _was_ available in branches of btrfs-progs
as 'btrfs dedupe'. It's gone now.

The other viable deduper candidates are still works in progress, and
some have significant trade-offs and limitations resulting from their
optimization for specific use cases.

duperemove hasn't exploited any btrfs-specific features to make it
faster, so duperemove is already close to the upper performance limits
of its design, but far below the performance that is possible in a
specialist tool for btrfs.

bees scales better and saves more space than the other dedupers, but
bees can't exclude any part of the filesystem from the scope of dedupe
the way every other btrfs deduper can.

dduper is a proof of concept that is so much faster than the other
block-oriented dedupers on btrfs that it overcomes a ridiculously
inefficient implementation and wins benchmarks--but it also saves the
least amount of space of any of the block-oriented dedupers on the wiki.

There are some other candidates out there that aren't on the wiki that
attack the dedupe problem from interesting--and potentially
high-performing--angles (e.g. solstice dedupes the entire filesystem
using a sorting algorithm instead of a hash table).

The dozen or so utilities that do file-only dedupe well and support
btrfs are faster at Joe Bloggs's use case than all the block-oriented
dedupers. Most of them are not btrfs-specific tools, so it doesn't make
sense to integrate them into btrfs-progs.

Most of the existing dedupers aren't written in C. The rest of
btrfs-progs is C, creating a code review and maintenance issue if they
are to be merged.
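As a rough illustration of the "sorting algorithm instead of a hash
table" idea mentioned above, the following Python sketch finds duplicate
blocks by sorting block digests so that equal digests become adjacent.
It is a generic example and is not taken from solstice or any other tool
named in this thread; the 4096-byte block size, the use of SHA-256, and
the function names are all assumptions made for the illustration.

```python
import hashlib
from itertools import groupby
from operator import itemgetter

BLOCK_SIZE = 4096  # assumed block size for this illustration


def block_records(name, data, block_size=BLOCK_SIZE):
    """Yield one (digest, name, offset) record per block of data."""
    for offset in range(0, len(data), block_size):
        digest = hashlib.sha256(data[offset:offset + block_size]).digest()
        yield digest, name, offset


def duplicate_groups(records):
    """Group block locations whose digests match.

    Sorting by digest places equal digests next to each other, so one
    linear pass over the sorted records finds every duplicate group
    without keeping a hash table of all digests in memory.
    """
    ordered = sorted(records, key=itemgetter(0))
    groups = []
    for _digest, group in groupby(ordered, key=itemgetter(0)):
        locations = [(name, offset) for _d, name, offset in group]
        if len(locations) > 1:
            groups.append(locations)
    return groups
```

A real tool would sort records on disk rather than in memory, and would
still confirm matches through the kernel's dedupe ioctl before sharing
any extents.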
The write-in candidate is "write a file-only deduper in C just so it can
be integrated with btrfs-progs." That doesn't even exist, and it's still
better than some of the existing candidates for merging into btrfs-progs.

A deduper that is good at block-level dedupe is bad at file-level dedupe
and vice versa. They view the filesystem stack from different sides,
and the hardest optimization one can do is the easiest for the other.

Pre-write (in-kernel) and post-write dedupers have significantly
different memory costs, which is another reason for having a diverse
set of dedupers: if you copy the ZFS approach to dedupe, you need
ZFS-sized memory budgets to implement it; if you don't have ZFS-sized
memory, you need an alternative implementation.

These are significant barriers to picking a single winner. For now,
at least until one of the dedupers can scale well over a superset of
the other dedupers' use cases, or the in-kernel deduper comes back from
the dead, it would be better to provide third-party dedupers that are
optimized for the subset of workloads that they can handle very well.
Otherwise, whichever single deduper you pick, it will suck for some
users, or we pick multiple dedupe engines and need to have a zillion
options after 'btrfs fi dedupe' to help it pick which engine to use
(this has already happened to some extent in duperemove).

At the current rate of development, the XFS people might leapfrog us
on dedupe, and "official btrfs dedupe" could end up being xfs_fsr.

> Thanks David Sterba for removing the items and updating the wiki!
* Re: btrfs-dedupe broken and unsupported but in official wiki
From: David Sterba @ 2020-06-19 13:11 UTC
To: Zygo Blaxell; +Cc: DanglingPointer, linux-btrfs

On Fri, Jun 19, 2020 at 01:04:03AM -0400, Zygo Blaxell wrote:
> It might be nice to keep btrfs-dedupe and bedup _somewhere_ on the wiki,
> clearly marked as not supported and only of historical interest to new
> developers. I learned a lot about what is possible on btrfs from bedup
> in particular (bees was initially a project to combine the features of
> bedup and duperemove), and python is accessible to more developers than
> C or C++. btrfs-dedupe was the first btrfs dedupe agent to combine
> defrag and dedupe operations into a single program.

It's there now.

> > So I do agree with waxhead. It would be preferable if there were an
> > official btrfs deduplication command from btrfs-progs instead of relying on
> > 3rd parties. Joe Bloggs example above can read a web-page instructions
> > saying "run this command... and then this command..."; but he will not have
> > the knowledge, nor comprehension nor time to go through code.
>
> Which of the available candidates for "official btrfs dedupe" would you
> put in btrfs-progs? I see a lot of runners in the race, but no clear
> winner yet.
>
> duperemove is the closest to Waxhead's proposed "-r /somewhere" syntax.
> It's the obvious choice: written in the same language as btrfs-progs, and
> also the oldest btrfs deduper, and it has years of patient, data-driven
> optimization built in.

That there's not even a simple (e.g. file-based) deduper available in
btrfs-progs is kind of bad. Duperemove is indeed closest to that.
> If there wasn't some insurmountable reason
> why duperemove can't be merged with btrfs-progs, then it would have
> happened already, so there must be a reason why this can't ever happen
> (which might be as simple as neither maintainer wants to merge).

I'm not against adding the functionality to btrfs-progs, but merging
the whole duperemove feature set might not happen due to additional
dependencies. This would need to be evaluated, but I'm not aware of any
other technical reasons.

I don't remember exactly why duperemove started as a separate project
instead of a subcommand or progs, but we can revisit that.

> Maybe we put duperemove at the top of the Wiki page, as it has the
> simplest command-line for Joe Bloggs's use case, and it's relatively
> easy to build for the few people who use distros where it's not packaged.

That's a good idea: a 'quick start' section, with a description of the
most common use cases using duperemove.

> The stub support for in-kernel dedupe (arguably the only "official"
> btrfs dedupe so far) has been removed due to lack of interest in its
> development. That _was_ available in branches of btrfs-progs
> as 'btrfs dedupe'. It's gone now.

The more I think about in-band dedupe (and how it would complicate
everything), the more I lean towards a user-space solution with support
from the kernel (ioctls, keeping hashes of recently modified blocks but
not doing the actual deduplication, reading hashes from the csum tree,
etc).
* Re: btrfs-dedupe broken and unsupported but in official wiki
From: Goffredo Baroncelli @ 2020-06-22 19:49 UTC
To: dsterba, Zygo Blaxell, DanglingPointer, linux-btrfs

On 6/19/20 3:11 PM, David Sterba wrote:
>> If there wasn't some insurmountable reason
>> why duperemove can't be merged with btrfs-progs, then it would have
>> happened already, so there must be a reason why this can't ever happen
>> (which might be as simple as neither maintainer wants to merge).
> I'm not against adding the functionality to btrfs-progs, but merging
> whole duperemove feature set might not happen due to additional
> dependencies. This would need to be evaluated, but I'm not aware of any
> other technical reasons.
>
> I don't remember exactly why duperemove started as a separate project
> instead of a subcommand or progs, but we can revisit that.
>

Even though I don't think that this was the reason at the time, the
ioctl FIDEDUPERANGE (aka BTRFS_IOC_FILE_EXTENT_SAME) is now "filesystem
agnostic". So I think a tool more generic than btrfs(-progs) does make
sense.

What I mean is: because this is not a BTRFS-specific ioctl anymore,
why should we have a BTRFS-specific implementation?

From a technical point of view: duperemove could take advantage of
the btrfs csum. So the question could be: is it better to add the
capability to use the BTRFS csum to duperemove or to add the code of
duperemove to BTRFS?

From a user point of view, I think that the former makes sense.

BR
G.Baroncelli

--
gpg @keyserver.linux.it: Goffredo Baroncelli <kreijackATinwind.it>
Key fingerprint BBF5 1610 0B64 DAC6 5F7D 17B2 0EDA 9B37 8B82 E0B5
* Re: btrfs-dedupe broken and unsupported but in official wiki
From: Zygo Blaxell @ 2020-06-22 22:45 UTC
To: kreijack; +Cc: dsterba, DanglingPointer, linux-btrfs

On Mon, Jun 22, 2020 at 09:49:55PM +0200, Goffredo Baroncelli wrote:
> On 6/19/20 3:11 PM, David Sterba wrote:
> > > If there wasn't some insurmountable reason
> > > why duperemove can't be merged with btrfs-progs, then it would have
> > > happened already, so there must be a reason why this can't ever happen
> > > (which might be as simple as neither maintainer wants to merge).
> > I'm not against adding the functionality to btrfs-progs, but merging
> > whole duperemove feature set might not happen due to additional
> > dependencies. This would need to be evaluated, but I'm not aware of any
> > other technical reasons.
> >
> > I don't remember exactly why duperemove started as a separate project
> > instead of a subcommand or progs, but we can revisit that.
> >
> Even though I don't think that this was the reason at the time, now the
> ioctl FIDEDUPERANGE (aka BTRFS_IOC_FILE_EXTENT_SAME) is "filesystem
> agnostic". So I think that does make sense a tool more generic than
> btrfs(-progs).
>
> What I mean is: because this is not a BTRFS specific ioctl anymore,
> why we should have a BTRFS specific implementation ?

First, to take advantage of unique btrfs capabilities: incremental
scanning using transid and TREE_SEARCH_V2, and user data block csums.

Second, to take advantage of generic filesystem capabilities that require
btrfs-specific implementation details.

Third, btrfs has immutable extents while other filesystems don't, and
ignoring that fact in a generic multi-filesystem tool will cost a lot
of dedupe efficiency on btrfs.
On a big filesystem, the difference between a filesystem-specific
dedupe tool and a filesystem-agnostic one could be many orders of
magnitude better performance and a doubling of space recovery.

duperemove is implemented using generic filesystem APIs: you point it
at a directory tree, it scans all the files in the tree (including
previously deduped files) and dedupes them. In incremental mode it
scans the entire tree and compares the tree with a database. This is
the slowest way to keep a filesystem deduplicated at scale.

XFS and btrfs are both capable of doing dedupe at wire speeds by
bypassing most of the filesystem (similar to a scrub, and can even be
combined with scrub). That level of performance makes incremental
scanning and filesystem csum support unnecessary for many use cases,
since users would just run full dedupe instead of scrub.

One tool can support both XFS and btrfs this way, though it would have
to have specialized support for each individual filesystem as the details
on each filesystem are very different (GETFSMAP and pread, vs LOGICAL_INO
and all the different btrfs raid profiles and compression formats).
It could be done as a dedupe core with plugin support for each filesystem,
provided that the core algorithm is designed to handle btrfs's immutable
extents. AFAIK nobody has built such a tool yet.

XFS doesn't maintain csums of user data or support incremental scans,
so XFS can dedupe _only_ as fast as it can scrub (*). btrfs has the
extra information in the filesystem, so in theory we can start with the
wire-speed dedupe from above, and make it up to 1000 times faster by
reading the csums instead of reading the data blocks, and then faster
still by scanning only the parts of the filesystem that changed from
one dedupe run to the next.

(*) XFS has some very fast tools for rapidly finding modified inodes,
and it doesn't have immutable extents like btrfs does. XFS might win
by brute force against btrfs's slower equivalents.
It would depend on the mix of file sizes in the workload.

> From a technical point of view: duperemove could take advantage of
> the btrfs csum. So the question could be : is it better to add the
> capability to use the BTRFS csum to duperemove or to add the code of
> duperemove to BTRFS ?

The options are orthogonal. csum read support can be added to any dedupe
tool, whether it's part of the official btrfs code or not. We can decide
on an official tool and add csum support to that tool in either order.

> From an user point of view, I think that the former makes sense.
>
> BR
> G.Baroncelli
>
> --
> gpg @keyserver.linux.it: Goffredo Baroncelli <kreijackATinwind.it>
> Key fingerprint BBF5 1610 0B64 DAC6 5F7D 17B2 0EDA 9B37 8B82 E0B5
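The "up to 1000 times faster" estimate above can be checked with a
little arithmetic. With the default crc32c checksum, btrfs stores 4
bytes of csum for each 4096-byte data block, so a scanner that reads
csums instead of data reads 4096 / 4 = 1024 times fewer bytes. The
Python sketch below works this out for the checksum types btrfs
supports; the function names are made up for this example, and it
ignores metadata overhead around the csum items, so treat the figures
as an upper bound rather than a measurement.

```python
BLOCK_SIZE = 4096  # bytes per data block (the usual btrfs sectorsize)

# On-disk checksum size in bytes for each btrfs csum type.
CSUM_BYTES = {"crc32c": 4, "xxhash64": 8, "sha256": 32, "blake2b": 32}


def read_reduction(csum_name, block_size=BLOCK_SIZE):
    """How many times fewer bytes a csum-only scan reads than a data scan."""
    return block_size // CSUM_BYTES[csum_name]


def csum_bytes_to_scan(data_bytes, csum_name, block_size=BLOCK_SIZE):
    """Bytes of csums covering data_bytes of file data (rounded up)."""
    blocks = -(-data_bytes // block_size)  # ceiling division
    return blocks * CSUM_BYTES[csum_name]
```

For example, csum_bytes_to_scan(1 << 40, "crc32c") gives 1 << 30: the
csums for 1 TiB of data occupy about 1 GiB.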
* Re: btrfs-dedupe broken and unsupported but in official wiki
From: Lakshmipathi.G @ 2020-07-02 8:27 UTC
To: Zygo Blaxell; +Cc: kreijack, dsterba, DanglingPointer, btrfs

Hi Zygo.

>dduper is a proof of concept that is so much
>faster than the other block-oriented dedupers on btrfs that it overcomes a
>ridiculously inefficient implementation and wins benchmarks--but it also
>saves the least amount of space of any of the block-oriented dedupers on
>the wiki.

Regarding dduper, do you have a script to re-create your dataset? I'd like to
investigate why dduper saves the least amount of space. thanks!

----
Cheers,
Lakshmipathi.G
http://www.giis.co.in https://www.webminal.org
* Re: btrfs-dedupe broken and unsupported but in official wiki 2020-07-02 8:27 ` Lakshmipathi.G @ 2020-07-03 3:16 ` Zygo Blaxell 2020-07-06 10:46 ` Lakshmipathi.G 0 siblings, 1 reply; 14+ messages in thread From: Zygo Blaxell @ 2020-07-03 3:16 UTC (permalink / raw) To: Lakshmipathi.G; +Cc: kreijack, dsterba, DanglingPointer, btrfs On Thu, Jul 02, 2020 at 01:57:57PM +0530, Lakshmipathi.G wrote: > Hi Zygo. > > >dduper is a proof of concept that is so much > >faster than the other block-oriented dedupers on btrfs that it overcomes a > >ridiculously inefficient implementation and wins benchmarks--but it also > >saves the least amount of space of any of the block-oriented dedupers on > >the wiki. > > Regarding dduper, do you have a script to re-create your dataset? I'd like to > investigate why dduper saves the least amount of space. thanks! My data set is a bunch of Windows raw disk images taken right after the MS installer runs. I don't think I can share it, but it's easy enough to roll your own. To avoid btrfs backref performance bugs, I split the disk images into 1GB files, removed those files that were entirely duplicate (all-zero or hard disk sector initialization pattern), and deduped the rest. For repeatability, once I had set up the btrfs filesystem with all the 1GB raw image fragment files, I dd'ed it to a raw partition on a dedicated disk and ran the test in a VM, so that all tools deduped an identical filesystem image on the same hardware (which has since died, so here I will use the saved results of the last run). On btrfs, extents are immutable. To remove a duplicate extent, the deduper must remove every reference to every block in an extent, even if some of the blocks do not contain duplicate data. If any reference to the extent remains anywhere in the filesystem, no space is saved. If anything, space is lost due to metadata growth. 
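The accounting rule described above can be modelled in a few lines. This is a toy Python sketch, not btrfs code: the `Extent` class and its fields are hypothetical, and the only behaviour it assumes is the one stated above, that an extent's space is released only once no block of it is referenced any more.

```python
from dataclasses import dataclass, field


@dataclass
class Extent:
    """Toy model of an immutable extent that can only be freed as a whole."""
    length_blocks: int
    # Blocks of this extent that some file still references.
    referenced: set = field(init=False)

    def __post_init__(self):
        # A freshly written extent has every block referenced.
        self.referenced = set(range(self.length_blocks))

    def drop_reference(self, block: int) -> None:
        """Model a dedupe that points one block's reference elsewhere."""
        self.referenced.discard(block)

    def blocks_freed(self) -> int:
        """Space comes back only when no block is referenced any more."""
        return 0 if self.referenced else self.length_blocks


extent = Extent(length_blocks=8)
for block in range(7):
    extent.drop_reference(block)      # 7 of 8 blocks were duplicates
partial = extent.blocks_freed()       # still 0: one unique block pins it all
extent.drop_reference(7)              # the last block is copied or deduped
complete = extent.blocks_freed()      # now the whole extent is released
```

In this model, deduping seven of the eight blocks frees nothing, which is the situation a deduper has to avoid or work around on btrfs.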
One way to achieve removal of a partially matched extent is to copy the unique data so that the entire extent contains duplicate data (which bees does). Another option is to not attempt dedupe at all unless the entire content of one extent matches (which duperemove might do...in a dev branch?). This does not gain more free space, but it avoids wasting time issuing dedupe ioctl calls that cannot free any space. duperemove will do parts of this analysis depending on command-line options. dduper doesn't do any such analysis that I've seen, and its performance seems to be comparable to duperemove with a crippling set of command-line options. The space efficiency of both dduper and duperemove is poor on btrfs--they are only effective when deduping files with small extents, or files that are entirely duplicate. In test runs, both dduper and duperemove issue a lot of dedupe ioctls that have no effect on free space (though duperemove has command-line options that avoid the worst losses). In my uncompressed test, the extents are all large (many are at the maximum 128MB size), so a deduper that doesn't split extents will be able to recover almost no space. The only successes dduper and duperemove were able to achieve were exploiting the fact that Windows disks have contiguous gigabytes of identical content in their recovery-tools partitions. bees is able to recover more of the duplicate space it finds because it slices up large extents along dedupe-friendly boundaries. This slows bees down on uncompressed filesystems because the incoming extents are larger. 
My test result for 140GB of uncompressed data was:

        bees saved 31% in 1h 40m (0.31%/min)

        duperemove -d -r saved 12% in 2h 30m (0.08%/min)

        duperemove -d -r --dedupe-options=same saved 12% in 25 minutes (0.48%/min)

        dduper saved 9% in 16m (0.56%/min)

        duperemove -d -r --dedupe-options=nofiemap,noblock,same -A --lookup-extents=no saved 7% in 25 min (0.28%/min)

dduper is the fastest, but saves less total space than two variations of duperemove command-line options. dduper is even faster than the above numbers suggest--it deduped 8.5% of the data in 6 minutes, a rate of 1.41%/minute, 3x faster than duperemove's best score...then dduper wasted the following 10 minutes doing futile dedupe ioctl calls that didn't free any space. All that said, scoring the highest free space %/minute rate in a race with other dedupers _while wasting 67% of the time and 71% of the available space_ is pretty impressive!

duperemove -d -r took 2h 30m because it hits an old btrfs backref performance bug (now fixed in 5.7?). It actually saved 12% in 25 minutes too, but it created a toxic extent and spent 2 hours burning CPU in the kernel to process it. The other duperemove command-line argument sets mentioned here avoid this bug.

The result for 100GB of compressed data (the same data, but compressed with compress-force=zstd) was:

        bees saved 44% in 1h 15m (0.58%/min)

        duperemove -d -r --dedupe-options=nofiemap,noblock,same -A --lookup-extents=no saved 3% in 12 minutes (0.25%/min)

        dduper saved 1% in 24 minutes (0.04%/min)

On compressed filesystem tests, dduper gains almost no space. This is expected, because dduper only looks at btrfs csums, and the btrfs csums can only match when the compressed data representation of both copies is exactly the same. 
In btrfs-compressed files the compressed extent block alignment is effectively random for large files, since it depends on timing details at the time of the btrfs commit, so on average only 3% (1 in 32 blocks) of extents with duplicate data will have matching csums after compression. bees and duperemove read the data after decompression, so they are not limited by differences in compression encoding. The only way for dduper to catch up here is to detect compressed extents and fall back to reading them the slow way. This is a reasonable tradeoff for filesystem workloads that have low proportions of compressed data; otherwise, duperemove's optimized multi-threaded implementation might run slightly faster than dduper on a fast device, if dduper is forced to read all the blocks because they are compressed. I don't recall why I didn't run duperemove with other options on a compressed filesystem during this test--possibly to avoid a bug? I have not looked in further detail into why dduper frees slightly less space than duperemove under some conditions. A simple deduper with a minimal awareness of btrfs's extent reference counting structure can easily match or slightly outperform the best deduper without one; with a non-minimal awareness of btrfs structure, a slow and broken deduper can outperform by an order of magnitude. duperemove's command-line options do provide or suppress some awareness of extent structure, so I would expect those options to increase or decrease space saved slightly compared to a tool that has no such awareness, and that seems to be what happens. The test results of dduper, duperemove, and bees are all consistent with that. > ---- > Cheers, > Lakshmipathi.G > http://www.giis.co.in https://www.webminal.org
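The %/min figures in the benchmark results of the message above are simply the percentage of space saved divided by the elapsed minutes. The short Python sketch below recomputes a few of them from the numbers quoted in that message; the `rate` helper and variable names are only for illustration.

```python
def rate(percent_saved: float, minutes: float) -> float:
    """Return space saved per minute, in percentage points."""
    return percent_saved / minutes


# (tool, percent saved, elapsed minutes), as quoted for the uncompressed run.
uncompressed = [
    ("bees", 31, 100),
    ("duperemove -d -r", 12, 150),
    ("duperemove -d -r --dedupe-options=same", 12, 25),
    ("dduper", 9, 16),
]

for tool, saved, minutes in uncompressed:
    print(f"{tool}: {rate(saved, minutes):.2f} %/min")

# dduper's first six minutes, before the futile ioctl calls.
dduper_early = rate(8.5, 6)

# Share of the space bees recovered that dduper left behind (9% vs 31%).
space_missed = 1 - 9 / 31
```

Comparing `dduper_early` with the best duperemove rate gives the roughly 3x ratio mentioned in the message, and `space_missed` corresponds to the 71% of available space that dduper did not recover.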
* Re: btrfs-dedupe broken and unsupported but in official wiki 2020-07-03 3:16 ` Zygo Blaxell @ 2020-07-06 10:46 ` Lakshmipathi.G 2020-07-25 7:24 ` Lakshmipathi.G 0 siblings, 1 reply; 14+ messages in thread From: Lakshmipathi.G @ 2020-07-06 10:46 UTC (permalink / raw) To: Zygo Blaxell; +Cc: kreijack, dsterba, DanglingPointer, btrfs Hi Zygo, Thanks for the extensive details about the data set, how the environment is set up, and how the tests are executed. I'll try to create some Windows raw disk images for testing and follow your test environment setup as closely as possible. The performance numbers for bees, duperemove and dduper are really interesting! As you mentioned previously, dduper is more of a proof of concept, and I didn't spend much time testing it with different data sets. I mostly created a few GB files with `dd urandom` and tested them. I guess that is why it performs better with files that are entirely duplicate and with small extents :-) Let me spend some time investigating these issues; I'm pretty sure dduper can be made a little bit more reliable than its current form. Regarding the compressed filesystem tests, I will check this after resolving the poor space-saving issues on non-compressed filesystems. thanks! ---- Cheers, Lakshmipathi.G http://www.giis.co.in https://www.webminal.org
* Re: btrfs-dedupe broken and unsupported but in official wiki 2020-07-06 10:46 ` Lakshmipathi.G @ 2020-07-25 7:24 ` Lakshmipathi.G 0 siblings, 0 replies; 14+ messages in thread From: Lakshmipathi.G @ 2020-07-25 7:24 UTC (permalink / raw) To: Zygo Blaxell; +Cc: kreijack, dsterba, DanglingPointer, btrfs Hi Zygo. > Let me spend some time investigating these issues, I'm pretty sure dduper > can be made a little bit more reliable than its current form. I think I resolved the bug that caused the poor disk-space savings with this commit [1]. At least now dduper should provide better space savings than its previous version. I also added an `--analyze` option to display stats with different chunk sizes [2] and posted some test run results here [3]. thanks! [1]: https://github.com/Lakshmipathi/dduper/commit/180f2aedf697b440c53cbe61195dd821c8aae3b4 [2]: https://github.com/lakshmipathi/dduper#analyze-with-different-chunk-size [3]: https://github.com/Lakshmipathi/dduper/blob/master/tests/TESTS.md ---- Cheers, Lakshmipathi.G http://www.giis.co.in https://www.webminal.org
* Re: btrfs-dedupe broken and unsupported but in official wiki 2020-06-18 2:28 btrfs-dedupe broken and unsupported but in official wiki DanglingPointer 2020-06-18 10:31 ` David Sterba 2020-06-18 20:43 ` Zygo Blaxell @ 2020-06-18 20:59 ` waxhead 2020-06-19 13:19 ` David Sterba 2 siblings, 1 reply; 14+ messages in thread From: waxhead @ 2020-06-18 20:59 UTC (permalink / raw) To: DanglingPointer, linux-btrfs I have pointed this out before, but I would like to use the opportunity again. I, as just a regular user of btrfs, would feel more comfortable if the dedupe tool was part of btrfs, such as for example btrfs filesystem dedupe -r /somewhere Regular users that are somewhat technically able may not know that the dedupe functions are kernel APIs that should not destroy anything even if the calling program went berserk. While this may be obvious to btrfs developers, it is not to regular users that may be concerned that a particular tool may wreak havoc on their filesystem. DanglingPointer wrote: > btrfs-dedupe is currently broken and no longer actively supported. > > It no longer builds with current rustc v1.44.0 with cargo > > It is in the official btrfs Deduplication wiki: > > https://btrfs.wiki.kernel.org/index.php/Deduplication > > There's no real active community and proper QA, reviewing and vetting. > > A poster in the issues area of the projects Github location stated that > even if fixed, it may not function correctly due to BTRFS having evolved > since the tool was designed created. > > There's just too many unknowns with this BTRFS specific dedupe tool. > > People using your official wiki and trying to use that deduplication > program could inadvertently destroy their data through nativity or > accident. Especially if they start trying to fix the code. 
> > I recommend you remove it from your website or at least put large > warnings there that it is broken (which looks ugly, I would rather only > stuff that works were there since it isn't your project anyway but some > 3rd party). >
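For readers wondering why a misbehaving dedupe tool should not be able to corrupt file contents: on Linux a dedupe request goes through the FIDEDUPERANGE ioctl, and the kernel itself compares the two ranges and shares them only if the bytes are identical. The Python sketch below shows the shape of one such request. It is a minimal illustration, not code from any of the tools discussed in this thread: the helper names are hypothetical, the constant and struct layouts follow `linux/fs.h` as I understand it, and the call only succeeds on a filesystem that supports the ioctl (for example btrfs, or XFS with reflink enabled).

```python
import fcntl
import struct

# _IOWR(0x94, 54, struct file_dedupe_range), per linux/fs.h.
FIDEDUPERANGE = 0xC0189436

# struct file_dedupe_range: src_offset, src_length, dest_count, 2 reserved
HEADER = struct.Struct("=QQHHI")
# struct file_dedupe_range_info: dest_fd, dest_offset, bytes_deduped,
# status, reserved
INFO = struct.Struct("=qQQiI")

FILE_DEDUPE_RANGE_SAME = 0      # ranges matched and are now shared
FILE_DEDUPE_RANGE_DIFFERS = 1   # ranges differ, nothing was changed


def build_request(src_offset, length, dest_fd, dest_offset):
    """Pack a dedupe request with a single destination range."""
    return bytearray(
        HEADER.pack(src_offset, length, 1, 0, 0)
        + INFO.pack(dest_fd, dest_offset, 0, 0, 0)
    )


def dedupe_range(src_fd, src_offset, length, dest_fd, dest_offset):
    """Ask the kernel to share one range; return (status, bytes_deduped)."""
    buf = build_request(src_offset, length, dest_fd, dest_offset)
    fcntl.ioctl(src_fd, FIDEDUPERANGE, buf)  # kernel writes the result back
    _, _, bytes_deduped, status, _ = INFO.unpack_from(buf, HEADER.size)
    return status, bytes_deduped
```

A caller would open the source file for reading and the destination for writing, then check the returned status: if the ranges differ, the kernel reports `FILE_DEDUPE_RANGE_DIFFERS` and leaves both files untouched. A bug in the calling program can still waste time or, as discussed elsewhere in this thread, fail to free space.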
* Re: btrfs-dedupe broken and unsupported but in official wiki 2020-06-18 20:59 ` waxhead @ 2020-06-19 13:19 ` David Sterba 0 siblings, 0 replies; 14+ messages in thread From: David Sterba @ 2020-06-19 13:19 UTC (permalink / raw) To: waxhead; +Cc: DanglingPointer, linux-btrfs On Thu, Jun 18, 2020 at 10:59:10PM +0200, waxhead wrote: > I have pointed this out before, but I would like to use the opportunity > again. I, as just a regular user of btrfs would feel more comfortable if > the dedupe tool was part of btrfs such as for example btrfs filesystem > dedupe -r /somewhere I agree that something like that would be highly useful, and although I know about duperemove, I don't use it often enough to remember exactly how to use it for the simple use case.