mount option nodatacow for VMs on SSD?

* mount option nodatacow for VMs on SSD?
From: Ulli Horlacher @ 2016-11-25  8:28 UTC
  To: linux-btrfs

I have vmware and virtualbox VMs on btrfs SSD.

I read in
https://btrfs.wiki.kernel.org/index.php/SysadminGuide#When_To_Make_Subvolumes

     certain types of data (databases, VM images and similar typically big
     files that are randomly written internally) may require CoW to be
     disabled for them.  So for example such areas could be placed in a
     subvolume, that is always mounted with the option "nodatacow".

Does this apply to SSDs, too?


-- 
Ullrich Horlacher              Server und Virtualisierung
Rechenzentrum TIK         
Universitaet Stuttgart         E-Mail: horlacher@tik.uni-stuttgart.de
Allmandring 30a                Tel:    ++49-711-68565868
70569 Stuttgart (Germany)      WWW:    http://www.tik.uni-stuttgart.de/
REF:<20161125082840.GA32711@rus.uni-stuttgart.de>

* Re: mount option nodatacow for VMs on SSD?
From: Duncan @ 2016-11-25 12:01 UTC
  To: linux-btrfs

Ulli Horlacher posted on Fri, 25 Nov 2016 09:28:40 +0100 as excerpted:

> I have vmware and virtualbox VMs on btrfs SSD.
> 
> I read in
> https://btrfs.wiki.kernel.org/index.php/SysadminGuide#When_To_Make_Subvolumes
> 
>      certain types of data (databases, VM images and similar typically
>      big files that are randomly written internally) may require CoW to
>      be disabled for them.  So for example such areas could be placed in
>      a subvolume, that is always mounted with the option "nodatacow".
> 
> Does this apply to SSDs, too?

It can, because the root issue is the same: files of this sort, which are 
frequently and randomly rewritten in part, always fragment on COW-based 
filesystems.  The symptoms tend to be much less severe on ssd, though, so 
it doesn't tend to be as big an issue there.

On multi-gig database files or VM images, files can end up with 100K 
extents due to COW-based rewriting.  Obviously this can be a HUGE problem 
on spinning rust due to its seek times, a problem zero-seek-time ssds 
don't have, but the sheer amount of metadata overhead due to tracking all 
those tiny extents can be a problem of its own, particularly when doing 
maintenance such as btrfs balance or btrfs check.  Both snapshotting and 
quota tracking amplify this overhead tracking problem as well, and it's 
this problem that can still be an issue on ssds.

That said, the autodefrag mount option, which eliminates some of the 
heavy fragmentation due to copy-on-write (COW) that's the root problem, 
tends to be faster on ssd.  Between it ameliorating the root problem to a 
large extent and the faster speed of ssds, often that's all that's needed, 
particularly if you don't need quotas, so have them off, and only do 
relatively limited snapshotting.

The problem with both the nodatacow mount option and the nocow file 
attribute is that they disable some of the btrfs features and are 
weakened by other features that may well be a big part of the reason 
behind your choice of btrfs in the first place.  Both btrfs compression, 
if otherwise enabled, and checksumming and thus file integrity checking 
(and repair in the case of btrfs raid1/10), would be complicated or 
impossible to implement without COW, and thus are disabled in the NOCOW 
case.  Similarly, btrfs snapshotting depends on COW because the snapshot 
locks in place the existing version so a rewrite must be written 
elsewhere.  As a result, snapshotting weakens NOCOW to what has been 
called COW1, COW the first time a block is rewritten after a snapshot, 
but after that further writes to the same block will be rewritten into 
the (new) existing block location.  If you only do very occasional 
snapshots that may not be a problem, but if you're doing regular 
snapshots, particularly automated and multiple per day, the effect of the 
snapshotting forced COW1s may be fragmentation as bad as if NOCOW wasn't 
in place in the first place.
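
As a rough illustration of the COW1 effect described above, here is a toy
Python model. It is only a sketch under simplifying assumptions, not how
btrfs is implemented: every write touches exactly one block, a snapshot pins
every block, and the number of relocated writes stands in for fragmentation.
The function name and its parameters are made up for this example.

```python
import random

def count_relocations(num_blocks, writes, snapshot_every, nocow=True, seed=0):
    """Toy model: count how many block writes land in a new location.

    nocow=False: every rewrite is relocated (plain COW).
    nocow=True:  a rewrite is relocated only while the block is still
                 pinned by a snapshot (the "COW1" case); later rewrites
                 of that block happen in place until the next snapshot.
    snapshot_every=N takes a snapshot before write 0 and then before
    every Nth write; snapshot_every=0 means no snapshots at all.
    """
    rng = random.Random(seed)
    shared = [False] * num_blocks        # True: block is pinned by a snapshot
    relocated = 0
    for i in range(writes):
        if snapshot_every and i % snapshot_every == 0:
            shared = [True] * num_blocks  # a snapshot pins every block
        block = rng.randrange(num_blocks)
        if not nocow or shared[block]:
            relocated += 1
            shared[block] = False         # the relocated copy is private again
    return relocated
```

In this model, NOCOW without snapshots relocates nothing, while a snapshot
before every single write makes NOCOW behave exactly like plain COW, which
is the point made above about frequent automated snapshots.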

So to some degree, if you're going to be setting the nocow attribute or 
using the nodatacow mount option, you might as well just set up a 
different partition/volume and mkfs to something other than btrfs for 
those files.  OTOH, the btrfs multi-device and storage pool features 
aren't affected, so if they are big reasons you're doing btrfs, then 
there's some reason to keep using btrfs and simply do the nodatacow mount 
or nocow attribute if autodefrag isn't enough on its own to handle it.

Bottom line, the fragmentation is much less of a problem on ssds, 
particularly with autodefrag which may well be enough, but as always, it 
can be installation and task dependent, so if it's going to be a 
production system, do your own testing and make your own decisions based 
on the results. =:^)

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman

* Re: mount option nodatacow for VMs on SSD?
From: Roman Mamedov @ 2016-11-25 12:25 UTC
  To: Duncan; +Cc: linux-btrfs

On Fri, 25 Nov 2016 12:01:37 +0000 (UTC)
Duncan <1i5t5.duncan@cox.net> wrote:

> Obviously this can be a HUGE problem on spinning rust due to its seek times,
> a problem zero-seek-time ssds don't have

They are not strictly zero seek time either. Sure you don't have the issue of
moving the physical head around, but still, sequential reads are way faster
even on SSDs, compared to random reads. Somewhat typical result for a
consumer SSD:

           Sequential Read :   382.301 MB/s
          Sequential Write :   315.124 MB/s
         Random Read 512KB :   261.751 MB/s
        Random Write 512KB :   334.615 MB/s
    Random Read 4KB (QD=1) :    19.859 MB/s [  4848.5 IOPS]
   Random Write 4KB (QD=1) :    61.794 MB/s [ 15086.3 IOPS]
   Random Read 4KB (QD=32) :   132.415 MB/s [ 32327.9 IOPS]
  Random Write 4KB (QD=32) :   203.051 MB/s [ 49573.0 IOPS]

If you have tons of 4K fragments, reading them in can go as low as 20 MB/sec,
compared to 382 MB/sec if they were all in one piece.
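
The MB/s and IOPS columns above are related by the 4 KiB request size, which
a few lines of Python can check. This is only a back-of-the-envelope sketch:
the 10 GB image size is a hypothetical figure chosen for illustration, and
the helper names are not from any tool.

```python
BLOCK = 4096  # bytes per 4 KiB request

def mb_per_s(iops, block=BLOCK):
    """Throughput in decimal MB/s for a given IOPS figure."""
    return iops * block / 1e6

def read_seconds(size_bytes, mb_s):
    """Seconds needed to read size_bytes at mb_s (decimal MB/s)."""
    return size_bytes / (mb_s * 1e6)

qd1 = mb_per_s(4848.5)               # 4 KiB random read, QD=1, from the table
image = 10 * 10**9                   # hypothetical 10 GB VM image
slow = read_seconds(image, qd1)      # image read as scattered 4 KiB fragments
fast = read_seconds(image, 382.301)  # image read sequentially
print(f"{qd1:.3f} MB/s, {slow:.0f} s vs {fast:.0f} s")
```

With these numbers the fully fragmented read takes about 19 times as long as
the sequential one, assuming one outstanding request at a time.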

-- 
With respect,
Roman

* Re: mount option nodatacow for VMs on SSD?
From: Kai Krakow @ 2016-11-26 10:27 UTC
  To: linux-btrfs

On Fri, 25 Nov 2016 09:28:40 +0100,
Ulli Horlacher <framstag@rus.uni-stuttgart.de> wrote:

> I have vmware and virtualbox VMs on btrfs SSD.
> 
> I read in
> https://btrfs.wiki.kernel.org/index.php/SysadminGuide#When_To_Make_Subvolumes
> 
>      certain types of data (databases, VM images and similar
> typically big files that are randomly written internally) may require
> CoW to be disabled for them.  So for example such areas could be
> placed in a subvolume, that is always mounted with the option
> "nodatacow".
> 
> Does this apply to SSDs, too?

As a side note: I don't think you can use "nodatacow" just for one
subvolume while the other subvolumes of the same btrfs are mounted
differently. The wiki is just wrong here.

The list of possible mount options in the wiki explicitly lists
"nodatacow" as not working per subvolume - just globally for the whole
fs.

-- 
Regards,
Kai

Replies to list-only preferred.


* Re: mount option nodatacow for VMs on SSD?
From: Ulli Horlacher @ 2016-11-28  0:38 UTC
  To: linux-btrfs

On Sat 2016-11-26 (11:27), Kai Krakow wrote:

> > I have vmware and virtualbox VMs on btrfs SSD.

> As a side note: I don't think you can use "nodatacow" just for one
> subvolume while the other subvolumes of the same btrfs are mounted
> differently. The wiki is just wrong here.
> 
> The list of possible mount options in the wiki explicitly lists
> "nodatacow" as not working per subvolume - just globally for the whole
> fs.

Thanks for pointing this out!
I had misunderstood this at first.

Ok, then next question :-)

What is better (for a single user workstation): using mount option
"autodefrag" or call "btrfs filesystem defragment -r" (-t ?) via nightly
cronjob?

So far, I use neither.


-- 
Ullrich Horlacher              Server und Virtualisierung
Rechenzentrum TIK         
Universitaet Stuttgart         E-Mail: horlacher@tik.uni-stuttgart.de
Allmandring 30a                Tel:    ++49-711-68565868
70569 Stuttgart (Germany)      WWW:    http://www.tik.uni-stuttgart.de/
REF:<20161126112710.6aca8bac@jupiter.sol.kaishome.de>

* Re: mount option nodatacow for VMs on SSD?
From: Duncan @ 2016-11-28  2:56 UTC
  To: linux-btrfs

Ulli Horlacher posted on Mon, 28 Nov 2016 01:38:29 +0100 as excerpted:

> Ok, then next question :-)
> 
> What is better (for a single user workstation): using mount option
> "autodefrag" or call "btrfs filesystem defragment -r" (-t ?) via nightly
> cronjob?
> 
> So far, I use neither.

First point: Be aware that there's a caveat with either method and 
snapshots, tho it's far stronger with manual defrag than with autodefrag: 

At one point manual defrag was made snapshot aware, taking care not to 
break the sharing between snapshots and reflinks pointing at the same 
extents, but the 
performance penalty of all the extra tracking and calculations turned out 
to be far too high to be practical with btrfs code in its then-current 
form (if a defrag run is going to take months, people simply aren't going 
to run it no matter the claimed benefit), so snapshot/reflink awareness 
was disabled and it remains so today.  AFAIK the plan is still to reenable 
it, or perhaps make it optional, at some point, but I believe that point 
remains some distance (years) in the future.

Which means for practical purposes, defragging of either type effectively 
undoes any reflink-based deduplication that may have been done, including 
that of snapshots -- defrag in the presence of snapshots can double your 
data space usage.

The reason the effect isn't as bad for autodefrag is that while manual 
defrag can effectively unreflink the extents for entire files regardless 
of write status, autodefrag only happens in the context of normal file 
writes or rewrites/modification, and for rewrites/modification, which 
would COW the modified/rewritten blocks elsewhere in any case, it simply 
rewrites/relocates rather larger extents, several MiB at a time instead 
of 4 KiB at a time, than would be the case without autodefrag.  So 
several GiB files that have been snapshotted/reflinked and then modified 
would have the modified blocks rewritten elsewhere anyway, and autodefrag 
simply ensures that a large enough new extent (MiB not KiB) is created 
and rewritten when a single block within it is modified anyway, to avoid 
the worst fragmentation.  It does NOT rewrite and unreflink the entire 
multi-gig file every time a single block gets modified and written back 
to the filesystem, as manual defrag can do and in practice often does if 
there have been modifications since the last snapshot or reflink copy/
dedup of the same file.  (Thanks to Hugo for making the point, then 
checking the actual code and then explaining how autodefrag differs from 
manual defrag on this point.)

So manual recursive defrag of the entire filesystem (as opposed to 
specific files) is definitely not recommended in btrfs snapshot context, 
unless you know you have enough space for the loss of snapshot-reflink 
sharing (and the resulting data duplication) that the defrag is likely to 
trigger.

But autodefrag should be far more space-conserving in the btrfs 
snapshotting context, as it'll be far more conservative in what it 
unreflinks size-wise, and will only unreflink at all when a COW-based 
modification/rewrite is happening in the first place.  Files that remain 
unchanged will remain safely reflinked to the same extents as those the 
snapshots hold reflinks to.

OTOH, if you're starting out with a highly fragmented existing 
filesystem, autodefrag can take some time to work its effects, because it 
*is* far more conservative in what it rewrites and thus defrags.  
Autodefrag really works best if you handle it as I do here, creating the 
new filesystem and setting up the mount options to always mount it with 
autodefrag, before there's any content at all on the filesystem.  That 
way, all files are originally written with autodefrag on, and the 
filesystem never has a chance to get seriously fragmented in the first 
place.  =:^)

It should still be worth turning on autodefrag on an existing somewhat 
fragmented filesystem.  It just might take some time to defrag files you 
do modify, and won't touch those you don't, which in some cases might 
make it worth defragging those manually.  Or simply create new 
filesystems, mount them with autodefrag, and copy everything over so 
you're starting fresh, as I do.

(It should be mentioned that in the context of a single write thread on a 
clean filesystem with lots of free space, a newly written file should 
always be written in ideal sequential unfragmented form.  However, get 
multiple write threads copying different files at the same time, and even 
on a new filesystem, the individual files can be fragmented as the 
various writes intermingle.  We've had reports on this list of even brand 
new distro installations being highly fragmented, and this would appear 
to be why -- apparently the installer was writing multiple files at once 
as well as possibly modifying some of them after the initial write, 
thereby fragmenting them rather heavily.  If the installer either mounts 
with autodefrag before starting to write its files, or if the user either 
manually creates the filesystem and ensures an autodefrag mount, or 
pauses the installation to remount with autodefrag before the file-copy 
begins, the fragmentation isn't nearly as bad, altho as I explained 
above, autodefrag is somewhat conservative and there will be /some/ 
fragmentation, as compared to doing the install to a temporary filesystem 
and then copying the files over to a permanent one such that they copy 
sequentially, one at a time.)

(Additionally, it's worth noting that btrfs data chunks are nominally 1 
GiB in size, tho in some large enough layouts they can reach up to 10 GiB, 
so unlike say ext4, which can have arbitrarily long extents, on btrfs, 
files over a GiB are likely to be listed by filefrag as having several 
extents even at "ideal", as the extents will be broken into data chunk 
sizes.)

(Finally, in case you decide to enable btrfs compression, it's worth 
noting that filefrag doesn't understand btrfs compression, which breaks 
files into 128 KiB compression blocks, which filefrag in turn lists as 
individual extents even if they're sequential.  Of course you can have a 
good clue this is occurring by dividing the file size by 128 KiB and 
comparing the result to the filefrag-reported number of extents for that 
file.  Or simply manually check the verbose filefrag output and see if 
the extents it lists are sequential, one beginning immediately after the 
previous one ended.)
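
The size-divided-by-128-KiB comparison suggested above can be written as a
small Python helper. It is only a sketch: the function names and the 10%
slack factor are assumptions made for this example, and the file size and
extent count have to be supplied by hand (for instance from ls and filefrag).

```python
COMPRESSED_EXTENT = 128 * 1024  # 128 KiB compression block, as stated above

def expected_compressed_extents(file_size):
    """Extent count to expect if compression were the only cause."""
    # Integer ceiling division: a partial last block still counts as one.
    return (file_size + COMPRESSED_EXTENT - 1) // COMPRESSED_EXTENT

def looks_like_compression_only(file_size, reported_extents, slack=1.1):
    """Heuristic: the reported count is not much above the expected one."""
    return reported_extents <= expected_compressed_extents(file_size) * slack
```

For a 1 GiB compressed file this gives 8192 expected extents, so a filefrag
count near that figure says little about real fragmentation, while a count
several times higher suggests the file really is fragmented.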

Bottom line, I'd recommend autodefrag, with the two caveats of being 
aware that (a) it /will/ trigger moderate unreflinking and thus moderate 
data duplication if you're doing snapshotting or deduping (but far less 
than manual defrag would), and that (b) autodefrag really works best if 
you use it from the time the filesystem is first created.  I'd still 
recommend it on existing filesystems; you just won't get quite the same 
effect.

Really, if I had my way autodefrag would be the default mount option, and 
you'd use noautodefrag to turn it off if you had some reason you didn't 
want it.  Because certainly in the generic case anyway, I simply don't 
see why one /wouldn't/ want it, and that would nicely eliminate the whole 
"I started using it on an existing and already fragmented filesystem" 
problem.  =:^)

(Tho I understand why it's not that way, when the option was introduced 
there were some worries about performance in some circumstances, and the 
option was experimental back then, so it made /sense/ not to have it the 
default.  But that was then and this is now, and IMO it should be the 
default, now.  Maybe it will be at some point?  But one of the btrfs devs 
has to care enough about that as the default to code it up and argue the 
case for the change first, and I'm not a dev, just a list regular and 
btrfs user myself, so...)

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman

* Re: mount option nodatacow for VMs on SSD?
From: Kai Krakow @ 2016-11-28  8:20 UTC
  To: linux-btrfs

On Mon, 28 Nov 2016 01:38:29 +0100,
Ulli Horlacher <framstag@rus.uni-stuttgart.de> wrote:

> On Sat 2016-11-26 (11:27), Kai Krakow wrote:
> 
> > > I have vmware and virtualbox VMs on btrfs SSD.  
> 
> > As a side note: I don't think you can use "nodatacow" just for one
> > subvolume while the other subvolumes of the same btrfs are mounted
> > differently. The wiki is just wrong here.
> > 
> > The list of possible mount options in the wiki explicitly lists
> > "nodatacow" as not working per subvolume - just globally for the
> > whole fs.  
> 
> Thanks for pointing this out!
> I had misunderstood this at first.

You can, however, use chattr to make the subvolume root directory (that
one where it is mounted) nodatacow (chattr +C) _before_ placing any
files or directories in there. That way, newly created files and
directories will inherit the flag. Take note that this flag can only
be applied to directories and empty (zero-sized) files.
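
A minimal Python sketch of the two commands involved follows. The helper
names are made up for this example and any path passed in is hypothetical;
chattr +C sets the attribute, and lsattr -d shows the directory's own
attributes so the result can be checked. The code only builds the command
lines and does not run anything.

```python
import shlex

def nocow_setup_commands(directory):
    """Return argv lists for marking an empty directory NOCOW and checking it."""
    return [
        ["chattr", "+C", directory],  # set the No_COW attribute
        ["lsattr", "-d", directory],  # list the directory itself, not its contents
    ]

def as_shell(commands):
    """Render the argv lists as shell lines, quoting paths where needed."""
    return "\n".join(shlex.join(argv) for argv in commands)
```

Each argv list can be handed to subprocess.run(argv, check=True) by a user
with sufficient privileges, on a directory that is still empty.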

That way, you get the intended benefit and your next question applies a
little less because:

> Ok, then next question :-)
> 
> What is better (for a single user workstation): using mount option
> "autodefrag" or call "btrfs filesystem defragment -r" (-t ?) via
> nightly cronjob?
> 
> So far, I use neither.

When using the above method to make your VM images nodatacow, the only
fragmentation issue you need to handle is when doing snapshots.
Snapshots are subject to copy-on-write. If you do heavy snapshotting,
you'll be getting heavy fragmentation based on the write-patterns. I
don't know if autodefrag will handle nodatacow files. You may want to
use a dedupe utility after defragmentation, like duperemove (running
it manually) or bees (a background daemon also trying to keep
fragmentation low).

If you are doing no or infrequent snapshots, I wouldn't bother with manual
defragging at all for your VM images since you're on SSD. If you aren't
going to use snapshots at all, you won't even have to think about
autodefrag, tho I still recommend enabling it (see post from Duncan).

Manual defrag is a highly write-intensive operation, rewriting multiple
gigabytes of data. I strongly recommend against using it on a daily
basis for SSD.

-- 
Regards,
Kai

Replies to list-only preferred.

* Re: [Not TLS] Re: mount option nodatacow for VMs on SSD?
From: Graham Cobb @ 2016-11-28  9:49 UTC
  To: linux-btrfs

On 28/11/16 02:56, Duncan wrote:
> It should still be worth turning on autodefrag on an existing somewhat 
> fragmented filesystem.  It just might take some time to defrag files you 
> do modify, and won't touch those you don't, which in some cases might 
> make it worth defragging those manually.  Or simply create new 
> filesystems, mount them with autodefrag, and copy everything over so 
> you're starting fresh, as I do.

Could that "copy" be (a series of) send/receive, so that snapshots and
reflinks are preserved?  Does autodefrag work in that case or does the
send/receive somehow override that and end up preserving the original
(fragmented) extent structure?

Graham


* Re: mount option nodatacow for VMs on SSD?
From: Niccolò Belli @ 2016-11-28 11:11 UTC
  To: Kai Krakow; +Cc: linux-btrfs

On Monday, 28 November 2016 09:20:15 CET, Kai Krakow wrote:
> You can, however, use chattr to make the subvolume root directory (that
> one where it is mounted) nodatacow (chattr +C) _before_ placing any
> files or directories in there. That way, newly created files and
> directories will inherit the flag. Take note that this flag can only
> be applied to directories and empty (zero-sized) files.

Do I keep checksumming for this directory that way?

Niccolò Belli

* Re: mount option nodatacow for VMs on SSD?
From: Duncan @ 2016-11-29  5:06 UTC
  To: linux-btrfs

Niccolò Belli posted on Mon, 28 Nov 2016 12:11:49 +0100 as excerpted:

> On Monday, 28 November 2016 09:20:15 CET, Kai Krakow wrote:
>> You can, however, use chattr to make the subvolume root directory (that
>> one where it is mounted) nodatacow (chattr +C) _before_ placing any
>> files or directories in there. That way, newly created files and
>> directories will inherit the flag. Take note that this flag can only
>> be applied to directories and empty (zero-sized) files.
> 
> Do I keep checksumming for this directory that way?

No.  Keeping checksums current on NOCOW files is racy, and compression 
would be complex as well because rewritten data may compress more or less 
well than the original so the on-filesystem size could change, so both 
features are disabled in the presence of NOCOW, regardless of how it is 
set.

Put another way, btrfs assumes COW by default and many of its features 
depend on COW -- that's why these features don't tend to be implemented 
on conventional rewrite-in-place filesystems in the first place.  Both 
checksumming and compression are among these COW-dependent features.

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman

* Re: [Not TLS] Re: mount option nodatacow for VMs on SSD?
From: Duncan @ 2016-11-29  5:14 UTC
  To: linux-btrfs

Graham Cobb posted on Mon, 28 Nov 2016 09:49:33 +0000 as excerpted:

> On 28/11/16 02:56, Duncan wrote:
>> It should still be worth turning on autodefrag on an existing somewhat
>> fragmented filesystem.  It just might take some time to defrag files
>> you do modify, and won't touch those you don't, which in some cases
>> might make it worth defragging those manually.  Or simply create new
>> filesystems, mount them with autodefrag, and copy everything over so
>> you're starting fresh, as I do.
> 
> Could that "copy" be (a series of) send/receive, so that snapshots and
> reflinks are preserved?  Does autodefrag work in that case or does the
> send/receive somehow override that and end up preserving the original
> (fragmented) extent structure?

Very good question that I don't know the answer to as I've not seen it 
discussed previously.  (I'm not a dev, just a list regular and user of 
btrfs myself, and my personal use-case involves neither snapshots nor 
send/receive, so on those topics if I've not seen it covered previously 
either here or on the wiki, I won't know.)

Someone else know?

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman


* Re: [Not TLS] mount option nodatacow for VMs on SSD?
From: Niccolò Belli @ 2016-11-29 10:34 UTC
  To: Duncan; +Cc: linux-btrfs

On Tuesday, 29 November 2016 06:14:18 CET, Duncan wrote:
> Very good question that I don't know the answer to as I've not seen it 
> discussed previously.  (I'm not a dev, just a list regular and user of 
> btrfs myself, and my personal use-case involves neither snapshots nor 
> send/receive, so on those topics if I've not seen it covered previously 
> either here or on the wiki, I won't know.)
>
> Someone else know?

Sounds too good to be true, I somehow feel the answer will be "no" :(

Niccolò Belli

* Re: [Not TLS] Re: mount option nodatacow for VMs on SSD?
From: Austin S. Hemmelgarn @ 2016-11-29 12:18 UTC
  To: Duncan, linux-btrfs

On 2016-11-29 00:14, Duncan wrote:
> Graham Cobb posted on Mon, 28 Nov 2016 09:49:33 +0000 as excerpted:
>
>> On 28/11/16 02:56, Duncan wrote:
>>> It should still be worth turning on autodefrag on an existing somewhat
>>> fragmented filesystem.  It just might take some time to defrag files
>>> you do modify, and won't touch those you don't, which in some cases
>>> might make it worth defragging those manually.  Or simply create new
>>> filesystems, mount them with autodefrag, and copy everything over so
>>> you're starting fresh, as I do.
>>
>> Could that "copy" be (a series of) send/receive, so that snapshots and
>> reflinks are preserved?  Does autodefrag work in that case or does the
>> send/receive somehow override that and end up preserving the original
>> (fragmented) extent structure?
>
> Very good question that I don't know the answer to as I've not seen it
> discussed previously.  (I'm not a dev, just a list regular and user of
> btrfs myself, and my personal use-case involves neither snapshots nor
> send/receive, so on those topics if I've not seen it covered previously
> either here or on the wiki, I won't know.)
>
> Someone else know?
>
Autodefrag does work in that case, but not because there's any special 
handling for it.  In the case of send/receive, the receiving side is 
doing nothing that couldn't be done as a normal user (except possibly a 
few ioctls to set subvolume UUID's), so any data it writes will be 
subject to all processing done by the FS (so sending from an 
uncompressed volume to one with compress=X will result in the data being 
compressed on the receiving end too).

* Re: mount option nodatacow for VMs on SSD?
From: Austin S. Hemmelgarn @ 2016-11-29 12:20 UTC
  To: linux-btrfs

On 2016-11-29 00:06, Duncan wrote:
> Niccolò Belli posted on Mon, 28 Nov 2016 12:11:49 +0100 as excerpted:
>
>> On Monday, 28 November 2016 09:20:15 CET, Kai Krakow wrote:
>>> You can, however, use chattr to make the subvolume root directory (that
>>> one where it is mounted) nodatacow (chattr +C) _before_ placing any
>>> files or directories in there. That way, newly created files and
>>> directories will inherit the flag. Take note that this flag can only
>>> be applied to directories and empty (zero-sized) files.
>>
>> Do I keep checksumming for this directory that way?
>
> No.  Keeping checksums current on NOCOW files is racy, and compression
> would be complex as well because rewritten data may compress more or less
> well than the original so the on-filesystem size could change, so both
> features are disabled in the presence of NOCOW, regardless of how it is
> set.
>
> Put another way, btrfs assumes COW by default and many of its features
> depend on COW -- that's why these features don't tend to be implemented
> on conventional rewrite-in-place filesystems in the first place.  Both
> checksumming and compression are among these COW-dependent features.
>
We really should get this put up somewhere very visible on the wiki. 
It's pretty blatantly obvious to anyone with a CS degree, but most users 
don't have a CS degree, and I can't count the number of e-mails I've 
sent explaining why checksums plus NOCOW are a recipe for disaster.
