linux-cifs.vger.kernel.org archive mirror
From: Steve French <smfrench@gmail.com>
To: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	CIFS <linux-cifs@vger.kernel.org>,
	samba-technical <samba-technical@lists.samba.org>,
	lsf-pc@lists.linux-foundation.org
Subject: Re: [LSF/MM/BPF TOPIC] Enhancing Linux Copy Performance and Function and improving backup scenarios
Date: Sat, 1 Feb 2020 13:54:46 -0600
Message-ID: <CAH2r5mv55Ua3B8WX1Qht1xfWL-k5pGJrN+Uz0L4jHtYOo9RMKw@mail.gmail.com>
In-Reply-To: <20200130015210.GB3673284@magnolia>

On Wed, Jan 29, 2020 at 7:54 PM Darrick J. Wong <darrick.wong@oracle.com> wrote:
>
> On Wed, Jan 22, 2020 at 05:13:53PM -0600, Steve French wrote:
> > As discussed last year:
> >
> > Current Linux copy tools have various problems compared to other
> > platforms - small I/O sizes (and most don't allow it to be
> > configured), lack of parallel I/O for multi-file copies, inability to
> > reduce metadata updates by setting file size first, lack of cross
>
> ...and yet weirdly we tell everyone on xfs not to do that or to use
> fallocate, so that delayed speculative allocation can do its thing.
> We also tell them not to create deep directory trees because xfs isn't
> ext4.

Delayed speculative allocation may help xfs, but changing the file size
thousands of times during a single file copy can be a disaster for
network and cluster file systems (due to the excessive cost it adds to
metadata sync time) - so there are file systems where setting the file
size first can help.
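
To make "setting the file size first" concrete, here is a rough sketch
of what a copy tool could do. It is only an illustration, not taken from
any existing tool: the function name copy_presized and the 8 MiB chunk
size are assumptions, and whether pre-sizing helps depends on the file
system (it may be the wrong choice on xfs, per the comment above).

```python
import os

CHUNK = 8 * 1024 * 1024  # hypothetical I/O size; tune per file system


def copy_presized(src_path, dst_path, chunk=CHUNK):
    """Copy src to dst, setting the destination size once before writing."""
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        size = os.fstat(src.fileno()).st_size
        # One size change up front, rather than one per extending write.
        os.ftruncate(dst.fileno(), size)
        copied = 0
        while copied < size:
            buf = src.read(min(chunk, size - copied))
            if not buf:
                break  # source shrank while we were copying
            dst.write(buf)
            copied += len(buf)
        dst.flush()
        if copied != size:
            # Trim the destination if the source turned out to be shorter.
            os.ftruncate(dst.fileno(), copied)
    return copied
```

A tool would presumably make this behavior selectable, since the best
choice differs between local and network/cluster file systems.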

> >  And copy tools rely less on
> > the kernel file system (vs. code in the user space tool) in Linux than
> > would be expected, in order to determine which optimizations to use.
>
> What kernel interfaces would we expect userspace to use to figure out
> the confusing mess of optimizations? :)

copy_file_range and clone_file_range are a good start, but few tools
use them.
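
As a minimal sketch of what using copy_file_range from a tool might
look like (assuming Python 3.8+ on Linux, where os.copy_file_range
exists), the following tries the kernel offload first and falls back to
a plain read/write loop when the call is unavailable or refused. The
function name copy_offload, the 1 MiB fallback chunk, and the set of
errno values treated as "fall back" are my assumptions, not a
definitive list. Cloning is not shown here.

```python
import errno
import os

# Errors after which we give up on offload and copy in user space (assumed set).
_FALLBACK_ERRNOS = {errno.ENOSYS, errno.EXDEV, errno.EINVAL, errno.EOPNOTSUPP}


def copy_offload(src_path, dst_path, chunk=1 << 20):
    """Copy a file, preferring copy_file_range over a read/write loop."""
    src_fd = os.open(src_path, os.O_RDONLY)
    try:
        dst_fd = os.open(dst_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
        try:
            remaining = os.fstat(src_fd).st_size
            total = 0
            copy_range = getattr(os, "copy_file_range", None)
            while remaining > 0:
                if copy_range is not None:
                    try:
                        # Offsets are omitted, so both fd positions advance.
                        n = copy_range(src_fd, dst_fd, remaining)
                    except OSError as e:
                        if e.errno not in _FALLBACK_ERRNOS:
                            raise
                        copy_range = None  # use read/write for the rest
                        continue
                else:
                    buf = os.read(src_fd, min(chunk, remaining))
                    n = len(buf)
                    view = memoryview(buf)
                    while view:  # os.write may write only part of the buffer
                        view = view[os.write(dst_fd, view):]
                if n == 0:
                    break  # source is shorter than fstat reported
                remaining -= n
                total += n
            return total
        finally:
            os.close(dst_fd)
    finally:
        os.close(src_fd)
```

The point of letting the kernel file system decide is that the same
call can become a server-side copy on a network fs without the tool
needing to know.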

> There's a whole bunch of xfs ioctls like dioinfo and the like that we
> ought to push to statx too.  Is that an example of what you mean?

That is a good example. The next step is getting tools to use these,
even if there are some file system dependent cases.

>
> > But some progress has been made since last year's summit, with new
> > copy tools being released and improvements to some of the kernel file
> > systems, and also some additional feedback on lwn and on the mailing
> > lists.  In addition these discussions have prompted additional
> > feedback on how to improve file backup/restore scenarios (e.g. to
> > mounts to the cloud from local Linux systems) which require preserving
> > more timestamps, ACLs and metadata, and preserving them efficiently.
>
> I suppose it would be useful to think a little more about cross-device
> fs copies considering that the "devices" can be VM block devs backed by
> files on a filesystem that supports reflink.  I have no idea how you
> manage that sanely though.

I trust XFS, BTRFS, SMB3, cluster file systems, etc. to solve this better
than the block level can, though (better locking, leases/delegation,
state management, etc.).

-- 
Thanks,

Steve
