On 2015-10-16 09:12, Christoph Hellwig wrote:
> On Fri, Oct 16, 2015 at 08:50:41AM -0400, Austin S Hemmelgarn wrote:
>> Certain parts of userspace do try to reflink things instead of
>> copying (for example, coreutils recently started doing so in mv and
>> has had the option to do so with cp for a while now), but a properly
>> designed general purpose filesystem does not and should not do this
>> without the user telling it to do so.
>
> But they do. Get out of your narrow local Linux file system view.
> Every all flash array or hyperconverge hypervisor will dedeup the hell
> out of your data, heck some SSDs even do it on the device. Your NFS or
> CIFS server already does or soon will do dedup and reflinks behind the
> scenes, that's the whole point of adding these features to the protocol.

Unless things have significantly changed on Windows and OS X, NTFS and HFS+ do not do automatic data deduplication (I'm not sure whether either even supports reflinks, although NTFS is at least partly COW), and I know for certain that FAT, UDF, Minix, BeFS, and Venti do not do so. NFS and CIFS/SMB both have support for it in the protocol, but they don't do it unless the client asks for it specifically or the server is manually configured to do it automatically (current versions of Windows Server might do it by default, but if they do, it isn't documented anywhere I've seen). 9P has no provisions for reflinks/deduplication. AFS/Coda/Ceph/Lustre/GFS2 might do deduplication, but I'm pretty certain they do not do so by default, and even then they really don't fit the 'general purpose' bit in my statement above. So, overall, my statement still holds for any widely used filesystem technology that is actually 'general purpose'.

Furthermore, if you actually read my statement, you will notice that I only said that _filesystems_ should not do it without being told to do so, and (intentionally) said absolutely nothing about any kind of storage devices or virtualization.

Ideally, SSDs really shouldn't do it either unless they have a 100% guarantee that the entire block going bad will not render the data unrecoverable (most do in fact use ECC internally, but they typically only handle two or three bad bits out of a full byte). As far as hypervisors go, a good storage hypervisor should be providing some guarantee of reliability, which means it is either already storing multiple copies of _everything_ or using some form of erasure coding, so that it can recover from problems with the underlying storage devices without causing issues for higher levels; in that context, deduplication is safe for all intents and purposes.

> And except for the odd fear or COW or dedup, and the ENOSPC issue for
> which we have a flag with a very well defined meaning I've still not
> heard any good arguments against it.

Most people I know who demonstrate this fear are just fine with COW; it's the deduplication that they're terrified of, and TBH that's largely because they've only ever seen it used in unsafe ways.

My main argument (which I admittedly have not stated properly at all during this discussion) is that almost everyone is likely to jump on this, which _will_ change long-established semantics in many of the things that switch to it, and there will almost certainly be serious backlash from that.
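
To make concrete what I mean by "the user telling it to do so", here is a rough, untested sketch of an explicit reflink request on btrfs, which is approximately what cp --reflink=always ends up doing; error handling is deliberately minimal. The point is that the extents get shared only because userspace explicitly asked via an ioctl, not because the filesystem decided to on its own.

/*
 * Sketch: explicitly reflink SRC into DST on btrfs.
 * Untested; error handling kept to a minimum.
 */
#include <fcntl.h>
#include <stdio.h>
#include <sys/ioctl.h>

/* Fallback definitions in case <linux/btrfs.h> isn't installed. */
#ifndef BTRFS_IOCTL_MAGIC
# define BTRFS_IOCTL_MAGIC 0x94
#endif
#ifndef BTRFS_IOC_CLONE
# define BTRFS_IOC_CLONE _IOW(BTRFS_IOCTL_MAGIC, 9, int)
#endif

int main(int argc, char **argv)
{
        if (argc != 3) {
                fprintf(stderr, "usage: %s SRC DST\n", argv[0]);
                return 1;
        }

        int src = open(argv[1], O_RDONLY);
        int dst = open(argv[2], O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (src < 0 || dst < 0) {
                perror("open");
                return 1;
        }

        /* Share SRC's extents with DST instead of copying the data. */
        if (ioctl(dst, BTRFS_IOC_CLONE, src) < 0) {
                perror("BTRFS_IOC_CLONE"); /* e.g. EOPNOTSUPP, EXDEV */
                return 1;
        }
        return 0;
}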