linux-kernel.vger.kernel.org archive mirror
From: Niels de Vos <ndevos@redhat.com>
To: Miklos Szeredi <miklos@szeredi.hu>
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	Marcin Sulikowski <marcin.k.sulikowski@gmail.com>
Subject: Re: [PATCH v3] fuse: add support for copy_file_range()
Date: Tue, 21 Aug 2018 12:12:33 +0200
Message-ID: <20180821101233.GD2650@ndevos-x270>
In-Reply-To: <CAJfpegstirk9AymSO3GHixNWCd_wNwvXRKj0zyVNDX3ftMqkHQ@mail.gmail.com>

On Tue, Aug 07, 2018 at 02:02:35PM +0200, Miklos Szeredi wrote:
> On Fri, Jun 29, 2018 at 2:53 PM, Niels de Vos <ndevos@redhat.com> wrote:
> > There are several FUSE filesystems that can implement server-side copy
> > or other efficient copy/duplication/clone methods. The copy_file_range()
> > syscall is the standard interface that users have access to while not
> > depending on external libraries that bypass FUSE.
> >
> > Signed-off-by: Niels de Vos <ndevos@redhat.com>
> >
> > ---
> > v2: return ssize_t instead of long
> > v3: add nodeid_out to fuse_copy_file_range_in for libfuse expectations
> > ---
> >  fs/fuse/file.c            |  66 +++++++++++++++++++++++
> >  fs/fuse/fuse_i.h          |   3 ++
> >  include/uapi/linux/fuse.h | 107 ++++++++++++++++++++++----------------
> >  3 files changed, 132 insertions(+), 44 deletions(-)
> >
> > diff --git a/fs/fuse/file.c b/fs/fuse/file.c
> > index 67648ccbdd43..864939a1215d 100644
> > --- a/fs/fuse/file.c
> > +++ b/fs/fuse/file.c
> > @@ -3009,6 +3009,71 @@ static long fuse_file_fallocate(struct file *file, int mode, loff_t offset,
> >         return err;
> >  }
> >
> > +static ssize_t fuse_copy_file_range(struct file *file_in, loff_t pos_in,
> > +                                   struct file *file_out, loff_t pos_out,
> > +                                   size_t len, unsigned int flags)
> > +{
> > +       struct fuse_file *ff_in = file_in->private_data;
> > +       struct fuse_file *ff_out = file_out->private_data;
> > +       struct inode *inode_out = file_inode(file_out);
> > +       struct fuse_inode *fi_out = get_fuse_inode(inode_out);
> > +       struct fuse_conn *fc = ff_in->fc;
> > +       FUSE_ARGS(args);
> > +       struct fuse_copy_file_range_in inarg = {
> > +               .fh_in = ff_in->fh,
> > +               .off_in = pos_in,
> > +               .nodeid_out = ff_out->nodeid,
> > +               .fh_out = ff_out->fh,
> > +               .off_out = pos_out,
> > +               .len = len,
> > +               .flags = flags
> > +       };
> > +       struct fuse_copy_file_range_out outarg;
> > +       ssize_t err;
> > +
> > +       if (fc->no_copy_file_range)
> > +               return -EOPNOTSUPP;
> > +
> > +       inode_lock(inode_out);
> > +       set_bit(FUSE_I_SIZE_UNSTABLE, &fi_out->state);
> 
> This one is only needed in the non-writeback-cache case and only if
> the operation is size-extending.
> 
> Here's how the writeback-cache is supposed to work: the kernel buffers
> writes, just like a normal filesystem, as well as buffering related
> metadata updates (size & [cm]time), again, just like a normal
> filesystem.  This means we just don't care about i_size being updated
> in userspace, any such change will be overwritten when the metadata is
> flushed out.
> 
> In writeback-cache mode, when we do any other data modification, we
> need to first flush out the cache so that the order of writes is not
> mixed up.  See fallocate() for example.  We could be selective and
> only flush the range covered by [pos, pos+len], but just flushing
> everything is okay.

Thanks! I think I understood what you mean and I'll be sending an
updated version soon.

> I could add these, but you already have a test for this set up, so, I
> wouldn't mind if you post a new version.

No problem. I got something ready and tested on my side.


...
> > +       FUSE_POLL            = 40,
> > +       FUSE_NOTIFY_REPLY    = 41,
> > +       FUSE_BATCH_FORGET    = 42,
> > +       FUSE_FALLOCATE       = 43,
> > +       FUSE_READDIRPLUS     = 44,
> > +       FUSE_RENAME2         = 45,
> > +       FUSE_LSEEK           = 46,
> > +       FUSE_COPY_FILE_RANGE = 47,
> 
> Nit: please do tabulation with tabs instead of spaces.

Will do.


> >
> >         /* CUSE specific operations */
> >         CUSE_INIT          = 4096,
> > @@ -792,4 +796,19 @@ struct fuse_lseek_out {
> >         uint64_t        offset;
> >  };
> >
> > +struct fuse_copy_file_range_in {
> > +       uint64_t        fh_in;
> > +       uint64_t        off_in;
> > +       uint64_t        nodeid_out;
> > +       uint64_t        fh_out;
> > +       uint64_t        off_out;
> > +       uint64_t        len;
> > +       uint32_t        flags;
> 
> Why not uint64_t for flags?

Everything else uses uint32_t for flags in this file. I'll make it
uint64_t in the next update.


> > +};
> > +
> > +struct fuse_copy_file_range_out {
> > +       uint32_t        size;
> > +       uint32_t        padding;
> > +};
> 
> Could reuse "struct fuse_write_out" for this.   Helps with the
> userspace interface as well, since the same fuse_reply_write()
> function can be used.

I considered that before as well. In case the interface changes, an
updated struct fuse_copy_file_range_out can always be added later. And
hopefully there is no reason to change it at all.

At the moment I am running a few more tests to verify an updated patch,
and will send it out later today.

Niels


Thread overview: 11+ messages
2018-06-27  7:45 [PATCH] fuse: add support for copy_file_range() Niels de Vos
2018-06-27  8:20 ` kbuild test robot
2018-06-27  8:25 ` kbuild test robot
2018-06-27  8:46 ` [PATCH v2] " Niels de Vos
2018-06-29 12:16   ` Niels de Vos
2018-06-29 12:53     ` [PATCH v3] " Niels de Vos
2018-08-06 10:46       ` Niels de Vos
2018-08-07 12:02       ` Miklos Szeredi
2018-08-21 10:12         ` Niels de Vos [this message]
2018-08-21 12:36           ` [PATCH v4] " Niels de Vos
2018-10-01  9:21             ` Miklos Szeredi
