* VM nocow, should VM software set +C by default?
@ 2014-02-21 16:55 Chris Murphy
  2014-02-21 17:56 ` Duncan
  2014-02-25  9:16 ` Justin Ossevoort
  0 siblings, 2 replies; 8+ messages in thread
From: Chris Murphy @ 2014-02-21 16:55 UTC
  To: Btrfs BTRFS

Use case is a user who doesn't know that today the file attribute +C (chattr +C) ought to be set on VM images when on Btrfs. They use e.g. Gnome Boxes, or Virtual Machine Manager (virt-manager) to configure pools, images, and VMs.

If libvirt were to set +C on any containing directory configured as a pool, then any copied as well as newly created images would inherit +C. So is this the long term recommended practice, and should various VM projects be asked to build this functionality? Or will there be optimizations, such as autodefrag, that will obviate the need for +C on such VM images in the somewhat near future?
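For illustration, a minimal sketch of what "setting +C on a pool directory" could look like from an application, without shelling out to chattr. It assumes Linux with a 64-bit ABI (the ioctl numbers below are the usual values from <linux/fs.h> for that case) and a directory that lives on Btrfs. The function names are hypothetical and not taken from libvirt or any VM manager.

```python
import fcntl
import os
import struct

# Constants from <linux/fs.h>. The two ioctl numbers assume a 64-bit Linux ABI.
FS_IOC_GETFLAGS = 0x80086601
FS_IOC_SETFLAGS = 0x40086602
FS_NOCOW_FL = 0x00800000  # the inode flag that "chattr +C" sets


def with_nocow(flags: int) -> int:
    """Return the inode flag word with the NOCOW bit added."""
    return flags | FS_NOCOW_FL


def set_nocow_on_dir(path: str) -> None:
    """Set +C on a directory so that files created in it later inherit it.

    Hypothetical helper: existing files in the directory are not touched.
    """
    fd = os.open(path, os.O_RDONLY | os.O_DIRECTORY)
    try:
        buf = bytearray(struct.calcsize("i"))
        fcntl.ioctl(fd, FS_IOC_GETFLAGS, buf)  # fills buf with current flags
        (flags,) = struct.unpack("i", buf)
        fcntl.ioctl(fd, FS_IOC_SETFLAGS, struct.pack("i", with_nocow(flags)))
    finally:
        os.close(fd)
```

Reading the current flags first and OR-ing in the NOCOW bit keeps any other attributes already set on the directory.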


Chris Murphy


* Re: VM nocow, should VM software set +C by default?
  2014-02-21 16:55 VM nocow, should VM software set +C by default? Chris Murphy
@ 2014-02-21 17:56 ` Duncan
  2014-02-25  9:16 ` Justin Ossevoort
  1 sibling, 0 replies; 8+ messages in thread
From: Duncan @ 2014-02-21 17:56 UTC
  To: linux-btrfs

Chris Murphy posted on Fri, 21 Feb 2014 09:55:50 -0700 as excerpted:

> Use case is a user who doesn't know that today xattr +C ought to be set
> on vm images when on Btrfs. They use e.g. Gnome Boxes, or Virtual
> Machine Manager (virt-manager) to configure pools, images, and VMs.
> 
> If libvirt were to set +C on any containing directory configured as a
> pool, then any copied as well as newly created images would inherit +C.
> So is this the long term recommended practice, and should various VM
> projects be asked to build this functionality? Or will there be
> optimizations, such as autodefrag, that will obviate the need for +C on
> such VM images in the somewhat near future?

FWIW...


I had suggested/predicted in an earlier post that having the software set 
NOCOW automatically for VMs, larger databases and pre-allocated files 
such as those used by bittorrent clients would become long term practice, 
but one of the devs (IIRC Bacik) suggested there was still ongoing defrag 
work, that they hoped would ultimately eliminate the need for that.

I'm personally somewhat skeptical and expect NOCOW will always be of some 
help in that regard, and still predict at least some of the apps will 
begin setting it if they detect that they're on btrfs, at some point, but 
I've been overruled by an expert in the domain, something I'm most 
assuredly NOT, so...
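
As a rough sketch of the "detect that they're on btrfs" step, an application could look up the filesystem type of the mount that contains its data directory. The example below parses text in the /proc/mounts format and picks the longest matching mount point. The function names are hypothetical, and the parsing is simplified: it does not decode escaped characters (such as \040 for spaces) in mount points.

```python
import os
from typing import Optional


def fs_type_for(path: str, mounts_text: str) -> Optional[str]:
    """Return the fs type of the deepest mount point containing path."""
    path = os.path.abspath(path)
    best_len, best_type = -1, None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # not a "device mountpoint fstype ..." line
        mount_point, fs_type = fields[1], fields[2]
        prefix = mount_point.rstrip("/") + "/"
        # ">=" lets a later mount on the same mount point win.
        if (path + "/").startswith(prefix) and len(mount_point) >= best_len:
            best_len, best_type = len(mount_point), fs_type
    return best_type


def is_on_btrfs(path: str, mounts_file: str = "/proc/mounts") -> bool:
    """Hypothetical convenience wrapper around fs_type_for()."""
    with open(mounts_file) as f:
        return fs_type_for(path, f.read()) == "btrfs"
```

Keeping the parsing separate from the file read makes the lookup easy to exercise with a hand-written mount table.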

I'll allow that it's a bit early to be asking for it ATM, but I also 
believe it's safe to say that regardless of /future/ developments, if a VM 
suite or db wishes to be ahead of the curve /today/ with its btrfs 
support, they should probably at least mention it in the documentation.  
As for implementing it in code, I'd say only if they're ready to take it 
out later if btrfs as it matures proves not to need it, but certainly, 
mentioning it in documentation, should they choose to specifically 
mention btrfs at all, is probably a good idea at this point.

In my definitely NOT-a-btrfs-code-expert opinion, of course. =:^)

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman



* Re: VM nocow, should VM software set +C by default?
  2014-02-21 16:55 VM nocow, should VM software set +C by default? Chris Murphy
  2014-02-21 17:56 ` Duncan
@ 2014-02-25  9:16 ` Justin Ossevoort
  2014-02-25 17:44   ` Chris Murphy
  1 sibling, 1 reply; 8+ messages in thread
From: Justin Ossevoort @ 2014-02-25  9:16 UTC
  To: linux-btrfs

I think in principle: No.

It is something that should be documented as advice in the VM software 
documentation. But things like storage management are the domain of the 
distribution or systems administrator.

There might be a situation where the VM software can directly use a 
btrfs filesystem for its storage engines where it could be sensible to 
add such a thing, but in that case it's already directly managing its 
subvolumes and can turn nodatacow on/off when appropriate.
In that case it would probably also be using some of btrfs's higher level 
functions for snapshotting, ACLs and possibly metadata (and not as a 
dumb container for disk images).

So if the VM software were controlling the filesystem directly, then 
it could be useful but would probably be better achieved using different 
options.
If the VM software is merely using it to store image files, then it 
would be up to the distribution/systems administrator to set '+C' on 
the directory where the images will be stored (that flag is 
inherited, IIRC).
A distribution could easily try marking the default images directory 
with '+C' on installation for the "Joe Average" user (when he decides to do 
things differently, then that's his conscious choice). But it could also 
decide that a better default would be to use an entirely different 
subvolume for VM images, with another raid level and no compression but 
with CoW enabled by default (and thus relying on autodefrag to work).

It should simply be a matter of: Who manages the storage decides how it 
should be configured, and whether or not to do CoW is an aspect of 
configuration.
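
For whoever does manage the storage, one practical check is which images in the directory actually carry the flag, since per chattr(1) +C on a directory only affects files created there afterwards. A minimal sketch, assuming the usual lsattr output of one "flags name" pair per line, with a hypothetical helper name:

```python
from typing import List


def files_missing_nocow(lsattr_output: str) -> List[str]:
    """Return names from lsattr output whose flag field lacks 'C'."""
    missing = []
    for line in lsattr_output.splitlines():
        parts = line.split(None, 1)  # flag field, then the file name
        if len(parts) != 2:
            continue  # blank or unexpected line
        flags, name = parts
        if "C" not in flags:
            missing.append(name)
    return missing
```

The input would come from something like running lsattr on the images directory. Files reported as missing were most likely created before +C was set on the directory.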

Regards,

	justin....

On 21-02-14 17:55, Chris Murphy wrote:
> Use case is a user who doesn't know that today xattr +C ought to be set on vm images when on Btrfs. They use e.g. Gnome Boxes, or Virtual Machine Manager (virt-manager) to configure pools, images, and VMs.
>
> If libvirt were to set +C on any containing directory configured as a pool, then any copied as well as newly created images would inherit +C. So is this the long term recommended practice, and should various VM projects be asked to build this functionality? Or will there be optimizations, such as autodefrag, that will obviate the need for +C on such VM images in the somewhat near future?
>
>
> Chris Murphy


* Re: VM nocow, should VM software set +C by default?
  2014-02-25  9:16 ` Justin Ossevoort
@ 2014-02-25 17:44   ` Chris Murphy
  2014-02-25 17:53     ` Jim Salter
  2014-02-25 18:01     ` Roman Mamedov
  0 siblings, 2 replies; 8+ messages in thread
From: Chris Murphy @ 2014-02-25 17:44 UTC
  To: Btrfs BTRFS


On Feb 25, 2014, at 2:16 AM, Justin Ossevoort <justin@internetionals.nl> wrote:

> I think in principle: No.
> 
> It is something that should be documented as advice in the VM software documentation. But things like storage management are the domain of the distribution or systems administrator.

No, that's a recipe for users having a chaotic experience. Either the VM managing application needs to set +C on image files, or the file system needs to be optimized for this use case. Consider the Gnome Boxes user. They're not in a good position to do this themselves, and each distro doing this causes a fragmented experience. It's better for the application developer (Gnome Boxes, VMM) or possibly libvirt to set +C on VM images, or for Btrfs, as a general purpose file system, to be optimized for this use case.

Either way it leaves the end user out of what amounts to esoteric configuration.


> There might be a situation where the VM software can directly use a btrfs filesystem for its storage engines where it could be sensible to add such a thing, but in that case it's already directly managing its subvolumes and can turn nodatacow on/off when appropriate.

I don't expect VMs to use subvolumes directly, instead of image files (qcow2, raw, vmdk, etc.), for a while, and I'm also not sure there's enough separation between VMs, or between VM and host, when they share what is really one file system. If there's any possibility a misbehaving VM could corrupt the file system and not merely its own tree, then it's unlikely to be a best practice.



Chris Murphy



* Re: VM nocow, should VM software set +C by default?
  2014-02-25 17:44   ` Chris Murphy
@ 2014-02-25 17:53     ` Jim Salter
  2014-02-25 18:33       ` Chris Murphy
  2014-02-25 18:01     ` Roman Mamedov
  1 sibling, 1 reply; 8+ messages in thread
From: Jim Salter @ 2014-02-25 17:53 UTC
  To: Chris Murphy, Btrfs BTRFS

Put me in on Team Justin on this particular issue.  I get and grant that 
in some use cases you might get pathological behavior out of DB or VM 
binaries which aren't set NODATACOW, but in my own use - including 
several near-terabyte-size VM images being used by ten+ people all day 
long for their primary work - I haven't personally observed any 
pathological behavior, and I *don't* have NODATACOW set.

I would prefer to have the benefits of COW being turned on unless and 
until I categorically NEED to disable it in order to avoid pathological 
performance issues.

Also note that ZFS doesn't even *have* a NODATACOW option and is generally 
less performant than btrfs in my experience, and yet I've 
been running 100-ish VMs - not toys, actual 
depended-on-by-lots-of-people-daily-workhorses - in .qcow2-on-ZFS format 
for years now without issue.

IME, IMO, the potential performance problems with COW and db/vm do 
/exist/ but they're way, WAY overstated, and unlikely to rear their 
heads at all in the majority of use-cases.


On 02/25/2014 12:44 PM, Chris Murphy wrote:
> On Feb 25, 2014, at 2:16 AM, Justin Ossevoort <justin@internetionals.nl> wrote:
>
>> I think in principle: No.
>>
>> It is something that should be documented as advice in the VM software documentation. But things like storage management are the domain of the distribution or systems administrator.
> No, that's a recipe for users having a chaotic experience. Either the VM managing application needs to set +C on image files, or the file system needs to be optimized for this use case. Consider the Gnome Boxes user. They're not in a good position to do this themselves, and each distro doing this causes a fragmented experience. It's better for the application developer (Gnome Boxes, VMM) or possibly libvirt to set +C on VM images, or for Btrfs, as a general purpose file system, to be optimized for this use case.
>
> Either way it leaves the end user out of what amounts to esoteric configuration.
>



* Re: VM nocow, should VM software set +C by default?
  2014-02-25 17:44   ` Chris Murphy
  2014-02-25 17:53     ` Jim Salter
@ 2014-02-25 18:01     ` Roman Mamedov
  1 sibling, 0 replies; 8+ messages in thread
From: Roman Mamedov @ 2014-02-25 18:01 UTC
  To: Chris Murphy; +Cc: Btrfs BTRFS


On Tue, 25 Feb 2014 10:44:36 -0700
Chris Murphy <lists@colorremedies.com> wrote:

> the VM managing application needs to set +C on image files

It's a slippery slope, why not instigate that every program from now on has
to set +C on its user files? Or where do we stop, probably the browser should
also set +C on its user profile and cache files? How about the mail client for
the mail index? The torrent program on all temporary downloads?.. MySQL and
Postgres on all their databases?...

Insisting that every program now has to include a workaround for a
filesystem-specific gimmick does not seem to be a nice design decision
long-term.

-- 
With respect,
Roman



* Re: VM nocow, should VM software set +C by default?
  2014-02-25 17:53     ` Jim Salter
@ 2014-02-25 18:33       ` Chris Murphy
  2014-02-26  5:43         ` Duncan
  0 siblings, 1 reply; 8+ messages in thread
From: Chris Murphy @ 2014-02-25 18:33 UTC
  To: Btrfs BTRFS


On Feb 25, 2014, at 10:53 AM, Jim Salter <jim@jrs-s.net> wrote:
> IME, IMO, the potential performance problems with COW and db/vm do /exist/ but they're way, WAY overstated, and unlikely to rear their heads at all in the majority of use-cases.

Right. Unfortunately I'm only aware of such anecdotes, including my own, that performance is OK or not OK. The specifics of the configuration need to be described. So what information do we need to ask users who are having such problems, before we ask them to use +C?


On Feb 25, 2014, at 11:01 AM, Roman Mamedov <rm@romanrm.net> wrote:

> Insisting that every program now has to include a workaround for a
> filesystem-specific gimmick does not seem to be a nice design decision
> long-term.


I have tepid agreement that +C is a short term workaround, and ideally neither application developers nor users would need to do anything special to use VM images on Btrfs.

But in that case I'd still like to better understand what configurations are causing real performance problems to occur for some people when they don't use +C. Is this simply a matter of needing a better or best practice rather than use of +C? So I guess more configuration information is needed to see what combination of attributes is inducing these reported problems.

I've had a qcow2 image with more than 30,000 extents and didn't notice a performance drop. So I don't know that number of extents is the problem. Maybe it's how they're arranged on disk and what's causing the problem is excessive seeking on HDD? Or does this problem sometimes also still happen on SSD?
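
One piece of configuration information that is cheap to collect from users reporting problems is the extent count of the image, as reported by filefrag. A minimal sketch for pulling that number out of filefrag's summary line follows. The helper names are hypothetical, and it assumes the usual "NAME: N extents found" summary format.

```python
import re
import subprocess
from typing import Optional

# Matches the summary line, e.g. "vm.qcow2: 30123 extents found".
_EXTENTS_RE = re.compile(r":\s+(\d+) extents? found\s*$")


def parse_extent_count(filefrag_line: str) -> Optional[int]:
    """Extract N from a 'NAME: N extents found' line, or None if absent."""
    match = _EXTENTS_RE.search(filefrag_line)
    return int(match.group(1)) if match else None


def extent_count(path: str) -> Optional[int]:
    """Run filefrag on path and return its reported extent count."""
    result = subprocess.run(
        ["filefrag", path], capture_output=True, text=True, check=True
    )
    lines = result.stdout.strip().splitlines()
    return parse_extent_count(lines[-1]) if lines else None
```

Collecting this alongside the device type (HDD or SSD) and mount options would make reports of "OK" and "not OK" performance easier to compare.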

Chris Murphy


* Re: VM nocow, should VM software set +C by default?
  2014-02-25 18:33       ` Chris Murphy
@ 2014-02-26  5:43         ` Duncan
  0 siblings, 0 replies; 8+ messages in thread
From: Duncan @ 2014-02-26  5:43 UTC
  To: linux-btrfs

Chris Murphy posted on Tue, 25 Feb 2014 11:33:34 -0700 as excerpted:

> I've had a qcow2 image with more than 30,000 extents and didn't notice a
> performance drop. So I don't know that number of extents is the problem.
> Maybe it's how they're arranged on disk and what's causing the problem
> is excessive seeking on HDD? Or does this problem sometimes also still
> happen on SSD?

SSDs are interesting beasts.

1) Seek-times aren't an issue, so that disappears, BUT...

2) SSDs still have IOPS ratings/limits.  Typically these are in the 
(high) tens of thousands per second range, so a single file at 30K 
extents isn't likely to be terribly significant.  However, 300K extents 
could be, and several VMs @ 30K each, and/or mixed in with other traffic, 
likely also fragmented due simply to the number of VM image fragments...

3) There's also the read vs. write vs. erase-block size thing to think 
about, and how that can affect writes rather than reads.  As long as 
there's sufficient overprovisioning it shouldn't be a real problem, but 
fill the SSD more than about 2/3 full (as a significant number of 
non-professionally managed systems likely will) and all those extra 
fragments being written are going to trigger erase-block garbage 
collection cycles more frequently, and that could massively up the 
latency jitter on a reasonably frequent basis.
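
To put rough numbers on point 2, here is a back-of-the-envelope sketch. It assumes the worst case of one I/O request per extent and a device rated at 50,000 IOPS. Both figures are illustrative assumptions, not measurements, and the function name is hypothetical.

```python
def worst_case_seconds(extents: int, iops: int) -> float:
    """Time to touch every extent once, assuming one request per extent."""
    return extents / iops


ASSUMED_IOPS = 50_000  # illustrative "high tens of thousands" rating

single_image = worst_case_seconds(30_000, ASSUMED_IOPS)       # one 30K-extent file
heavy_image = worst_case_seconds(300_000, ASSUMED_IOPS)       # one 300K-extent file
several_images = worst_case_seconds(4 * 30_000, ASSUMED_IOPS)  # four VMs at 30K each
```

Under these assumptions a full pass over the 30K-extent image costs about 0.6 s of the device's request budget, the 300K-extent image about 6 s, and four 30K-extent images together about 2.4 s, before counting any other traffic.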

#2 and 3 are a big part of the reason I still enable the autodefrag mount 
option here, even tho I'm on SSD and my use-case doesn't have a lot of 
huge internal-write files to worry about, plus I'm only about 50 percent 
partitioned on the SSDs so they have LOTS of room to do their wear-
leveling, etc. =:^)

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman



Thread overview: 8+ messages
2014-02-21 16:55 VM nocow, should VM software set +C by default? Chris Murphy
2014-02-21 17:56 ` Duncan
2014-02-25  9:16 ` Justin Ossevoort
2014-02-25 17:44   ` Chris Murphy
2014-02-25 17:53     ` Jim Salter
2014-02-25 18:33       ` Chris Murphy
2014-02-26  5:43         ` Duncan
2014-02-25 18:01     ` Roman Mamedov
