All of lore.kernel.org
* State of Dedup / Defrag
@ 2015-10-13 18:59 Rich Freeman
  2015-10-14  5:09 ` Zygo Blaxell
  0 siblings, 1 reply; 6+ messages in thread
From: Rich Freeman @ 2015-10-13 18:59 UTC (permalink / raw)
  To: Btrfs BTRFS

What is the current state of Dedup and Defrag in btrfs?  I seem to
recall there having been problems a few months ago and I've stopped
using it, but I haven't seen much news since.

I'm interested in both the 3.18 and subsequent kernel series.

--
Rich

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: State of Dedup / Defrag
  2015-10-13 18:59 State of Dedup / Defrag Rich Freeman
@ 2015-10-14  5:09 ` Zygo Blaxell
  2015-10-14 12:29   ` Rich Freeman
  0 siblings, 1 reply; 6+ messages in thread
From: Zygo Blaxell @ 2015-10-14  5:09 UTC (permalink / raw)
  To: Rich Freeman; +Cc: Btrfs BTRFS


On Tue, Oct 13, 2015 at 02:59:59PM -0400, Rich Freeman wrote:
> What is the current state of Dedup and Defrag in btrfs?  I seem to
> recall there having been problems a few months ago and I've stopped
> using it, but I haven't seen much news since.

It has been 1 day since a kernel bug leading to data loss was fixed in the
ioctl calls for dedup (commit 6e685a1e3e9054d43fac58f2bc0cd070df915079
from fdmanana yesterday); however, to hit that particular bug you'd
need to be doing something unusual with the ioctls--in particular, a
thing that makes no sense for dedup, and that dedup userspace programs
intentionally avoid doing.  There was another bug for defrag 68 days ago.

I wouldn't try to use dedup on a kernel older than v4.1 because of these
fixes in 4.1 and later:

	- allow dedup of the ends of files when they are not aligned
	to 4K.	Before this was fixed, up to 1GB of space could be wasted
	per file.

	- no mtime update on extent-same.  With the update, rsync
	and backup programs think all the deduped files are modified.
	The next rsync after dedup would immediately un-dedup (redup?) all
	the deduped files.

	- fixes for deadlocks.	If dedup is running at the same time as
	other readers of files (e.g. deduping /usr or a tree on a busy
	file server), a deadlock was inevitable.

IMHO these fixes really made dedup usable for the first time.
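
To put a number on the tail-alignment fix: a toy sketch (the 4096-byte
block size and the function name are my assumptions, not kernel API) of
how much of a file older kernels would accept for extent-same:

```python
BLOCK = 4096  # typical btrfs block size (an assumption here)

def dedupable_bytes(file_size, tail_allowed):
    """How much of a file can be submitted to the extent-same ioctl.

    Pre-v4.1 kernels rejected ranges not aligned to the block size,
    so the unaligned tail of a file could never be deduped and kept
    a reference to its (possibly huge) on-disk extent alive.
    """
    if tail_allowed:                       # v4.1+: tail may be included
        return file_size
    return (file_size // BLOCK) * BLOCK    # older: whole blocks only
```

For a 10000-byte file, pre-4.1 kernels could dedup only the first 8192
bytes; the unaligned tail kept a reference to whatever extent backed it,
which is how a mostly-duplicate file could pin up to 1GB.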

There are some other fixes that appeared after v4.1, but they should
not impact cases where mostly static data is deduped without concurrent
modifications.  Do dedup a photo or video file collection.  Don't dedup
a live database server on a filesystem with compression enabled...yet.

Using dedup and defrag at the same time is still a bad idea.  The features
work against each other:  autodefrag skips anything that has been deduped,
while manual defrag un-dedups everything it touches.  The effect of
defrag on dedup depends on the choice of dedup userspace strategy,
so defrag can either be helpful or harmful.

Autodefrag in my experience pushes write latencies up to insane levels.
Data ends up making multiple round-trips to the disk _with_ extra
constraints on the allocator on the second and later passes, and while
this is happening any other writes on the filesystem block an absurdly
long time.  It can easily cost more I/O time than it saves.  That said,
there are some kernel patches floating around to fix the allocator,
so at least we can hope autodefrag will be less bad someday.




* Re: State of Dedup / Defrag
  2015-10-14  5:09 ` Zygo Blaxell
@ 2015-10-14 12:29   ` Rich Freeman
  2015-10-15  2:47     ` Zygo Blaxell
  0 siblings, 1 reply; 6+ messages in thread
From: Rich Freeman @ 2015-10-14 12:29 UTC (permalink / raw)
  To: Zygo Blaxell; +Cc: Btrfs BTRFS

On Wed, Oct 14, 2015 at 1:09 AM, Zygo Blaxell
<ce3g8jdj@umail.furryterror.org> wrote:
>
> I wouldn't try to use dedup on a kernel older than v4.1 because of these
> fixes in 4.1 and later:

I would assume that these would be ported to the other longterm
kernels like 3.18 at some point?

> Do dedup a photo or video file collection.  Don't dedup
> a live database server on a filesystem with compression enabled...yet.

Likewise.  Typically I just dedup the entire filesystem, so it sounds
like we're not quite there yet.  Would it make sense to put this on
the wiki in the gotchas section?

> Using dedup and defrag at the same time is still a bad idea.  The features
> work against each other

You mentioned quite a bit about autodefrag.  I was thinking more in
terms of using explicit defrag, as was done by dedup in the past.  It
looks like duperemove doesn't actually do this, perhaps because it is
also considered unsafe these days.

Thanks, I was just trying to get a sense for where this was at.  It
sounds like we're getting to the point where it could be used in
general, but for now it is probably best to run it manually on stuff
that isn't too busy.

--
Rich


* Re: State of Dedup / Defrag
  2015-10-14 12:29   ` Rich Freeman
@ 2015-10-15  2:47     ` Zygo Blaxell
  2015-10-15 16:33       ` Rich Freeman
  0 siblings, 1 reply; 6+ messages in thread
From: Zygo Blaxell @ 2015-10-15  2:47 UTC (permalink / raw)
  To: Rich Freeman; +Cc: Btrfs BTRFS


On Wed, Oct 14, 2015 at 08:29:20AM -0400, Rich Freeman wrote:
> On Wed, Oct 14, 2015 at 1:09 AM, Zygo Blaxell
> <ce3g8jdj@umail.furryterror.org> wrote:
> >
> > I wouldn't try to use dedup on a kernel older than v4.1 because of these
> > fixes in 4.1 and later:
> 
> I would assume that these would be ported to the other longterm
> kernels like 3.18 at some point?

I wouldn't assume anything.  Backports seem to be kind of random.  ;)

I think most (all?) of the relevant patches do apply to v3.18, but I
haven't tested this kernel very much since v4.0 became usable.

> > Do dedup a photo or video file collection.  Don't dedup
> > a live database server on a filesystem with compression enabled...yet.
> 
> Likewise.  Typically I just dedup the entire filesystem, so it sounds
> like we're not quite there yet.  Would it make sense to put this on
> the wiki in the gotchas section?

Sounds good.

> > Using dedup and defrag at the same time is still a bad idea.  The features
> > work against each other
> 
> You mentioned quite a bit about autodefrag.  I was thinking more in
> terms of using explicit defrag, as was done by dedup in the past.  It
> looks like duperemove doesn't actually do this, perhaps because it is
> also considered unsafe these days.

I wouldn't describe dedup+defrag as unsafe.  More like insane.  You won't
lose any data, but running both will waste a lot of time and power.
Either one is OK without the other, or applied to non-overlapping sets
of files, but they are operations with opposite results.

Explicit defrag always undoes dedup, so you never want to defrag a file
after dedup has removed duplicate extents from it.  The btrfs command-line
tool doesn't check for shared extents in defrag, and neither does the
kernel ioctl.

Defrag before dedup is a more complex situation that depends on the dedup
strategy of whichever dedup tool you are using.  A file-based dedup tool
doesn't have to care about extent boundaries, so you can just run defrag
and then dedup--in that order.
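
The layout-independence of a file-based tool can be sketched like so
(hypothetical code, not what duperemove actually does): it groups by
whole-file hash, so the extent map that defrag just rewrote never enters
the comparison:

```python
import hashlib
from collections import defaultdict

def duplicate_file_groups(files):
    """Group files by whole-file content hash.

    files: dict mapping name -> bytes.  Extent boundaries play no role,
    so defragmenting first changes nothing about what gets found.
    """
    groups = defaultdict(list)
    for name, data in files.items():
        groups[hashlib.sha256(data).hexdigest()].append(name)
    return [sorted(g) for g in groups.values() if len(g) > 1]
```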

An extent-based dedup tool will become less efficient after defrag and
some capabilities will be lost, e.g.  it will not be able to dedup
data between VM image files or other files on the host filesystem.
Extents with identical content but different boundaries cannot be deduped.
There are fewer opportunities to find duplicate extents because defrag
combines smaller extents into bigger ones.
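
A toy model of the boundary problem (illustrative only; real
extent-based tools work on checksums of on-disk extents): hash the
extents of byte-identical data under two different layouts and nothing
matches:

```python
import hashlib

def extent_hashes(data, boundaries):
    """Hash each extent of a file; extent-based dedup can only match
    whole extents, never a byte range that straddles a boundary."""
    cuts = [0] + list(boundaries) + [len(data)]
    return {hashlib.sha256(data[a:b]).hexdigest()
            for a, b in zip(cuts, cuts[1:])}

data = bytes(i % 251 for i in range(8192))          # 8 KiB, non-periodic
before = extent_hashes(data, [2048, 4096, 6144])    # four small extents
after  = extent_hashes(data, [4096])                # defragged: two big ones
shared = before & after                             # empty: nothing matches
```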

A block-based dedup tool explicitly arranges the data into extents
by content, so extents become entirely duplicate or entirely unique.
This is different from what defrag does.  If both are run on the same
data they will disagree on physical data layout and constantly undo
each other's work.

When data is defragged, it appears in find-new output as "new" data
(the same as if the data had been written to a file the usual way).
An incremental dedup tool that integrates defrag and find-new at the
same time has to carefully prevent itself from consuming its own output
in an endless feedback loop.
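
One sketch of how such a tool could avoid consuming its own output (the
class and method names are hypothetical, not from any real tool): tag the
generations your own rewrites land in and drop them from the next
find-new scan:

```python
class IncrementalPass:
    """Skip 'new' data that this tool wrote itself in an earlier pass."""

    def __init__(self):
        self.own_generations = set()  # generations containing our writes

    def record_own_write(self, generation):
        # called after our own defrag/dedup rewrites data
        self.own_generations.add(generation)

    def filter_new(self, found):
        """found: iterable of (path, generation) from a find-new scan."""
        return [(p, g) for p, g in found if g not in self.own_generations]
```

A real tool has to be more careful than this -- a user write can land in
the same generation as the tool's own -- but the filtering idea is the
same.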

> Thanks, I was just trying to get a sense for where this was at.  It
> sounds like we're getting to the point where it could be used in
> general, but for now it is probably best to run it manually on stuff
> that isn't too busy.

IMHO the kernel part of dedup as of v4.1 is in a better state than some
other features, e.g. balance or resize.  I can run dedup continuously
for months without issues, but I plan for lockups and reboots every day
when doing balances or resizes.  :-/

> --
> Rich



* Re: State of Dedup / Defrag
  2015-10-15  2:47     ` Zygo Blaxell
@ 2015-10-15 16:33       ` Rich Freeman
  2015-10-16  4:01         ` Duncan
  0 siblings, 1 reply; 6+ messages in thread
From: Rich Freeman @ 2015-10-15 16:33 UTC (permalink / raw)
  To: Zygo Blaxell; +Cc: Btrfs BTRFS

On Wed, Oct 14, 2015 at 10:47 PM, Zygo Blaxell
<ce3g8jdj@umail.furryterror.org> wrote:
>
> I wouldn't describe dedup+defrag as unsafe.  More like insane.  You won't
> lose any data, but running both will waste a lot of time and power.
> Either one is OK without the other, or applied to non-overlapping sets
> of files, but they are operations with opposite results.

That is probably why I disabled it then.  I now recall past discussion
that defragging a file wasn't snapshot-aware, though I thought that
was fixed.

Obviously there is always a tradeoff since from a dedup perspective
you're best off arranging extents so that you're sharing as much as
possible, and from a defrag standpoint you want to just have each file
have a single extent even if two files differ by a single byte.

I've pretty much stopped running VMs on btrfs and I've adjusted my
journal settings to something more sane so the defrag isn't nearly as
important these days.

--
Rich


* Re: State of Dedup / Defrag
  2015-10-15 16:33       ` Rich Freeman
@ 2015-10-16  4:01         ` Duncan
  0 siblings, 0 replies; 6+ messages in thread
From: Duncan @ 2015-10-16  4:01 UTC (permalink / raw)
  To: linux-btrfs

Rich Freeman posted on Thu, 15 Oct 2015 12:33:56 -0400 as excerpted:

> On Wed, Oct 14, 2015 at 10:47 PM, Zygo Blaxell
> <ce3g8jdj@umail.furryterror.org> wrote:
>>
>> I wouldn't describe dedup+defrag as unsafe.  More like insane.  You
>> won't lose any data, but running both will waste a lot of time and
>> power. Either one is OK without the other, or applied to
>> non-overlapping sets of files, but they are operations with opposite
>> results.
> 
> That is probably why I disabled it then.  I now recall past discussion
> that defragging a file wasn't snapshot-aware, though I thought that was
> fixed.

Well, snapshot-aware defrag was indeed released... in 3.9 IIRC (it's on 
the wiki)...

Unfortunately, it didn't scale well /at/ /all/, as in spending half a day 
defragging just a handful of blocks. (Worst-case was many thousands of 
snapshots, with quotas enabled -- and this was back before the 
scalability work and quota code rewrites, so it was really **REALLY** 
bad!!111, and could be more like a day on a single block!)

Obviously that simply didn't work /at/ /all/, so in order to have a 
defrag that was at least semi-usable, snapshot awareness was disabled 
again, making things /so/ much simpler, but of course dramatically 
increasing the possible downsides of actually running defrag.

The disabling was in 3.12, IIRC, so defrag was only snapshot-aware for 
three kernel cycles, about seven months' worth of current development.

Since then defrag's snapshot-awareness has remained disabled, tho AFAIK 
the plan remains to enable it again, if/when the devs are satisfied that 
they've made the code efficient and scalable enough to actually run in 
practical time into (I'd say) 5-digits worth of snapshots, anyway.  But 
that's necessarily going to involve stable quota code, and the quota 
code has been a continued problem, tho there's some hope that it's 
coming together now.  So it's anyone's guess when even the quota-code 
prerequisite is complete, let alone if/when the scalability will then be 
reasonable enough, or if more work will still need to be done there... 
or if it'll _ever_ be practical to turn it back on.

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman


