* Copy tools on Linux
@ 2018-06-30  2:37 Steve French
From: Steve French @ 2018-06-30  2:37 UTC
  To: linux-fsdevel

I have been looking at i/o patterns from various copy tools on Linux,
and it is pretty discouraging - I am hoping that I am forgetting an
important one that someone can point me to ...

Some general problems:
1) if source and target are on the same file system, it would be nice to
call the copy_file_range syscall (AFAIK only test tools call it),
although in some cases at least cp can do it with --reflink
2) if source and target are on different file systems, there are multiple problems
    a) smaller I/O sizes (rsync, for example, maxes out at 128K!)
    b) no async parallelized writes are sent down to the kernel, so writes
get serialized (either through the page cache or, on file systems that
offer an option to disable it, still one thread at a time)
    c) sparse file support is mediocre (although cp has some support
for it, and can call fiemap in some cases)
    d) for file systems that prefer setting the file size first (to
avoid metadata penalties from multiple extending writes), AFAIK only
rsync offers that, but rsync is otherwise one of the slowest tools
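
To illustrate item 1, here is a minimal sketch of a copy that asks the
kernel to move the data with copy_file_range and falls back to a plain
read/write loop if the call is unavailable or fails. The function name
copy_file and the 8M default chunk are assumptions made for this
example, not taken from any existing tool; os.copy_file_range needs
Python 3.8+ on Linux.

```python
import os


def copy_file(src_path, dst_path, chunk=8 * 1024 * 1024):
    """Copy src_path to dst_path; return the number of bytes copied."""
    src = os.open(src_path, os.O_RDONLY)
    try:
        dst = os.open(dst_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
        try:
            remaining = os.fstat(src).st_size
            copied = 0
            # Assumption: any failure of copy_file_range (EXDEV, ENOSYS, ...)
            # simply switches this copy to the userspace loop.
            in_kernel = hasattr(os, "copy_file_range")
            while remaining > 0:
                want = min(chunk, remaining)
                if in_kernel:
                    try:
                        # No explicit offsets: both file positions advance.
                        n = os.copy_file_range(src, dst, want)
                    except OSError:
                        in_kernel = False
                        continue
                else:
                    buf = os.read(src, want)
                    n = len(buf)
                    view = memoryview(buf)
                    while view:  # os.write may write fewer bytes than asked
                        view = view[os.write(dst, view):]
                if n == 0:  # source shrank while copying
                    break
                copied += n
                remaining -= n
            return copied
        finally:
            os.close(dst)
    finally:
        os.close(src)
```

Whether the in-kernel path is actually faster depends on the file
system; on ones that support it the copy can be offloaded or turned
into a clone, otherwise the kernel copies the data itself.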

I have looked at cp, dd, scp, rsync, gio, gcp ... are there others?

What I am looking for (and maybe we just need to patch cp, rsync,
etc.) is more like what you see on other OSes ...
1) options for large I/O sizes (latencies in network/cluster file
systems can be large, so prefer larger I/Os, 1M or 8M in some cases)
2) parallelized writes, so there is more than one write in flight at a time
3) options to turn off the page cache (a large number of large file
copies is not going to benefit from reuse of pages in the page cache,
so going through the page cache may be suboptimal in that case)
4) an option to set the file size first and then fill in the writes (so
the writes are non-extending)
5) sparse file support
(and it would also be nice to support the copy_file_range syscall ... but
that is unrelated to the above)
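
Items 1, 2 and 4 of this list can be sketched together: set the target
size first, then let several threads fill it in with large positional
writes. This is only an illustration under stated assumptions; the name
parallel_copy, the 8M chunk and the 4 workers are made up for the
example, and the writes still go through the page cache (bypassing it
would need O_DIRECT with aligned buffers, which is not shown).

```python
import os
from concurrent.futures import ThreadPoolExecutor


def parallel_copy(src_path, dst_path, chunk=8 * 1024 * 1024, workers=4):
    """Copy with several large pwrite calls in flight; return bytes copied."""
    src = os.open(src_path, os.O_RDONLY)
    try:
        dst = os.open(dst_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
        try:
            size = os.fstat(src).st_size
            # Set the final size up front so no later write extends the file.
            os.ftruncate(dst, size)

            def copy_chunk(offset):
                length = min(chunk, size - offset)
                done = 0
                while done < length:
                    buf = os.pread(src, length - done, offset + done)
                    if not buf:  # source shrank while copying
                        break
                    view = memoryview(buf)
                    pos = offset + done
                    while view:  # pwrite may be short
                        written = os.pwrite(dst, view, pos)
                        view = view[written:]
                        pos += written
                    done += len(buf)
                return done

            # pread/pwrite release the GIL, so the chunks overlap in the kernel.
            with ThreadPoolExecutor(max_workers=workers) as pool:
                return sum(pool.map(copy_chunk, range(0, size, chunk)))
        finally:
            os.close(dst)
    finally:
        os.close(src)
```

Positional I/O (pread/pwrite) is used so that the threads share the two
file descriptors without racing on a common file offset.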

Am I missing some magic tool?  It seems Windows has various options
for copy tools, but looking at the I/O patterns of the Linux tools
was pretty depressing - I am hoping that there are other choices.
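
For the sparse file item in the list above, one possible approach is to
walk the source with lseek SEEK_DATA/SEEK_HOLE and copy only the data
extents, leaving the holes in the target unwritten. This is a sketch,
not what cp does (the message above only says cp can call fiemap); the
names sparse_copy and next_data_extent are hypothetical, and file
systems without hole reporting are assumed to present the whole file as
data.

```python
import errno
import os


def next_data_extent(fd, offset, size):
    """Return (start, end) of the next data extent at or after offset, or None."""
    try:
        start = os.lseek(fd, offset, os.SEEK_DATA)
        end = os.lseek(fd, start, os.SEEK_HOLE)
        return start, end
    except OSError as e:
        if e.errno == errno.ENXIO:  # nothing but a hole up to end of file
            return None
        # Assumption: on any other error, treat the rest of the file as data.
        return offset, size


def sparse_copy(src_path, dst_path, chunk=1024 * 1024):
    """Copy only the data extents; return the number of data bytes written."""
    src = os.open(src_path, os.O_RDONLY)
    try:
        dst = os.open(dst_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
        try:
            size = os.fstat(src).st_size
            os.ftruncate(dst, size)  # regions never written stay as holes
            written = 0
            offset = 0
            while offset < size:
                extent = next_data_extent(src, offset, size)
                if extent is None:
                    break
                pos, end = extent
                while pos < end:
                    buf = os.pread(src, min(chunk, end - pos), pos)
                    if not buf:  # source shrank while copying
                        break
                    view = memoryview(buf)
                    wpos = pos
                    while view:  # pwrite may be short
                        n = os.pwrite(dst, view, wpos)
                        view = view[n:]
                        wpos += n
                    pos += len(buf)
                    written += len(buf)
                offset = end
            return written
        finally:
            os.close(dst)
    finally:
        os.close(src)
```

How much space this saves depends on how the source file system reports
holes; the copied contents and size are the same either way.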

-- 
Thanks,

Steve
