From: "Darrick J. Wong" <darrick.wong@oracle.com>
To: Amir Goldstein <amir73il@gmail.com>
Cc: Dave Chinner <david@fromorbit.com>,
	linux-xfs <linux-xfs@vger.kernel.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	Linux Btrfs <linux-btrfs@vger.kernel.org>,
	ocfs2-devel@oss.oracle.com, Eric Sandeen <sandeen@redhat.com>,
	Matthew Wilcox <willy@infradead.org>,
	Miklos Szeredi <miklos@szeredi.hu>
Subject: Re: [PATCH 08/15] vfs: change clone and dedupe range function pointers to return bytes completed
Date: Fri, 5 Oct 2018 14:47:25 -0700
Message-ID: <20181005214725.GD19324@magnolia>
In-Reply-To: <CAOQ4uxjjSFt07P1Qv9am9sN--UZOojzE4QGQSdx1o+sLN+M2RQ@mail.gmail.com>

On Fri, Oct 05, 2018 at 11:06:54AM +0300, Amir Goldstein wrote:
> On Fri, Oct 5, 2018 at 3:46 AM Darrick J. Wong <darrick.wong@oracle.com> wrote:
> >
> > From: Darrick J. Wong <darrick.wong@oracle.com>
> >
> > Change the clone_file_range and dedupe_file_range functions to return
> > the number of bytes they operated on.  This is the precursor to allowing
> > fs implementations to return short clone/dedupe results to the user,
> > which will enable us to obey resource limits in a graceful manner.
> >
> > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > ---
> 
> [...]
> 
> > diff --git a/fs/overlayfs/file.c b/fs/overlayfs/file.c
> > index aeaefd2a551b..6d792d817538 100644
> > --- a/fs/overlayfs/file.c
> > +++ b/fs/overlayfs/file.c
> > @@ -487,16 +487,21 @@ static ssize_t ovl_copy_file_range(struct file *file_in, loff_t pos_in,
> >                             OVL_COPY);
> >  }
> >
> > -static int ovl_clone_file_range(struct file *file_in, loff_t pos_in,
> > +static s64 ovl_clone_file_range(struct file *file_in, loff_t pos_in,
> >                                 struct file *file_out, loff_t pos_out, u64 len)
> >  {
> > -       return ovl_copyfile(file_in, pos_in, file_out, pos_out, len, 0,
> > -                           OVL_CLONE);
> > +       int ret;
> > +
> > +       ret = ovl_copyfile(file_in, pos_in, file_out, pos_out, len, 0,
> > +                          OVL_CLONE);
> > +       return ret < 0 ? ret : len;
> >  }
> >
> > -static int ovl_dedupe_file_range(struct file *file_in, loff_t pos_in,
> > +static s64 ovl_dedupe_file_range(struct file *file_in, loff_t pos_in,
> >                                  struct file *file_out, loff_t pos_out, u64 len)
> >  {
> > +       int ret;
> > +
> >         /*
> >          * Don't copy up because of a dedupe request, this wouldn't make sense
> >          * most of the time (data would be duplicated instead of deduplicated).
> > @@ -505,8 +510,9 @@ static int ovl_dedupe_file_range(struct file *file_in, loff_t pos_in,
> >             !ovl_inode_upper(file_inode(file_out)))
> >                 return -EPERM;
> >
> > -       return ovl_copyfile(file_in, pos_in, file_out, pos_out, len, 0,
> > -                           OVL_DEDUPE);
> > +       ret = ovl_copyfile(file_in, pos_in, file_out, pos_out, len, 0,
> > +                          OVL_DEDUPE);
> > +       return ret < 0 ? ret : len;
> >  }
> >
> 
> This is not pretty at all.
> You are blocking the propagation of partial dedupe/clone result
> of files that are accessed via overlay over xfs.
> 
> Please extend the interface change to the vfs helpers
> (i.e. vfs_clone_file_range()) and then the change above is not needed.
> 
> Of course you would need to change the 3 callers of
> vfs_clone_file_range() that expect 0 is ok.

Ok, I'll plumb the bytes-finished return value all the way through the
internal APIs.
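
To make the masking problem concrete, here is a minimal userspace sketch,
not kernel code. All names in it (lower_clone, wrapper_masking,
wrapper_passthrough, LOWER_LIMIT) are hypothetical stand-ins: lower_clone
models a filesystem that completes at most LOWER_LIMIT bytes per call, and
the two wrappers model a stacking layer such as overlayfs. The first wrapper
follows the "ret < 0 ? ret : len" pattern from the hunk above; the second
passes the byte count through unchanged.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Assumption: the lower filesystem can remap at most this many bytes per call. */
#define LOWER_LIMIT 4096

/* Hypothetical lower-layer clone: returns bytes completed or a negative errno. */
static int64_t lower_clone(int64_t len)
{
	if (len < 0)
		return -EINVAL;
	return len < LOWER_LIMIT ? len : LOWER_LIMIT;
}

/* Pattern from the reviewed hunk: any success is reported as the full "len". */
static int64_t wrapper_masking(int64_t len)
{
	int64_t ret = lower_clone(len);

	return ret < 0 ? ret : len;
}

/* Pattern requested in review: hand the lower layer's result to the caller. */
static int64_t wrapper_passthrough(int64_t len)
{
	int64_t ret = lower_clone(len);

	/* A successful result must never exceed what was requested. */
	assert(ret < 0 || ret <= len);
	return ret;
}
```

With a 10000-byte request, the masking wrapper reports 10000 even though only
4096 bytes were handled below it, while the pass-through wrapper reports 4096
so the caller can see the short result.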

> Please take a look at commit
> a725356b6659 ("vfs: swap names of {do,vfs}_clone_file_range()")
> 
> That was just merged for rc7.
> 
> I do apologize for the churn, but it's a semantic mistake that
> I made that needed fixing, so please rebase your work on top
> of that and take care not to trip over it.

Err... ok.  That makes working on this a little messy; we'll see if I
can get this mess rebased in time for 5.0.

> ioctl_file_clone() and ovl_copy_up_data() just need to interpret
> positive return value correctly.
> nfsd4_clone_file_range() should have the same return value as
> vfs_clone_file_range() to be interpreted in nfsd4_clone(), following
> same practice as nfsd4_copy_file_range().
> 
> [...]
> 
> > diff --git a/include/linux/fs.h b/include/linux/fs.h
> > index 2a4141d36ebf..e5755340e825 100644
> > --- a/include/linux/fs.h
> > +++ b/include/linux/fs.h
> > @@ -1759,10 +1759,12 @@ struct file_operations {
> >  #endif
> >         ssize_t (*copy_file_range)(struct file *, loff_t, struct file *,
> >                         loff_t, size_t, unsigned int);
> > -       int (*clone_file_range)(struct file *, loff_t, struct file *, loff_t,
> > -                       u64);
> > -       int (*dedupe_file_range)(struct file *, loff_t, struct file *, loff_t,
> > -                       u64);
> > +       s64 (*clone_file_range)(struct file *file_in, loff_t pos_in,
> > +                               struct file *file_out, loff_t pos_out,
> > +                               u64 count);
> > +       s64 (*dedupe_file_range)(struct file *file_in, loff_t pos_in,
> > +                                struct file *file_out, loff_t pos_out,
> > +                                u64 count);
> 
> Matthew has objected to a similar interface change when it was proposed by Miklos:
> https://marc.info/?l=linux-fsdevel&m=152570317110292&w=2
> https://marc.info/?l=linux-fsdevel&m=152569298704781&w=2
> 
> He claimed that the interface should look like this:
> +       loff_t (*dedupe_file_range)(struct file *src, loff_t src_off,
> +                       struct file *dst, loff_t dst_off, loff_t len);

I don't really like loff_t (why does the typename for a size include
"offset" in the name??) but I guess that's not horrible.  I've never
liked how functions take size_t (unsigned) but return ssize_t (signed)
anyway.
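
The signed/unsigned mismatch can be sketched in userspace as follows. This is
only an illustration under stated assumptions: remap_u64 and remap_signed are
hypothetical functions, and off_model_t is a stand-in for loff_t (a signed
64-bit type in the kernel). With an unsigned length and a signed return, some
valid-looking lengths cannot be reported back as "bytes completed"; with a
signed length, every non-negative request fits in the return type.

```c
#include <errno.h>
#include <stdint.h>

/* Assumption: models loff_t, which is a signed 64-bit integer. */
typedef int64_t off_model_t;

/* Unsigned length in, signed result out: large lengths have no valid return. */
static int64_t remap_u64(uint64_t len)
{
	if (len > (uint64_t)INT64_MAX)
		return -EINVAL;	/* cannot be represented as a positive s64 */
	return (int64_t)len;
}

/* Signed length in, signed result out: only the sign needs checking. */
static off_model_t remap_signed(off_model_t len)
{
	if (len < 0)
		return -EINVAL;
	return len;
}
```

Both variants pretend the whole range was processed; the point is only which
inputs each signature has to reject before it can return a byte count.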

--D

> Thanks,
> Amir.
