From: Amir Goldstein <amir73il@gmail.com>
To: Chengguang Xu <cgxu519@mykernel.net>
Cc: Miklos Szeredi <miklos@szeredi.hu>,
overlayfs <linux-unionfs@vger.kernel.org>
Subject: Re: [PATCH v2 3/3] ovl: copy-up optimization for truncate
Date: Fri, 9 Apr 2021 08:50:37 +0300 [thread overview]
Message-ID: <CAOQ4uxhj3F_KQLyxEz3u7L-se-zWj40XiEsUKcuFurvYK_5S0w@mail.gmail.com> (raw)
In-Reply-To: <178b4932399.f88d0a1719390.547968204570738952@mykernel.net>
On Fri, Apr 9, 2021 at 6:00 AM Chengguang Xu <cgxu519@mykernel.net> wrote:
>
> ---- 在 星期四, 2021-04-08 22:40:47 Amir Goldstein <amir73il@gmail.com> 撰写 ----
> > On Thu, Apr 8, 2021 at 4:23 PM Chengguang Xu <cgxu519@mykernel.net> wrote:
> > >
> > > ---- 在 星期三, 2021-04-07 15:52:15 Amir Goldstein <amir73il@gmail.com> 撰写 ----
> > > > On Wed, Apr 7, 2021 at 12:04 AM Chengguang Xu <cgxu519@mykernel.net> wrote:
> > > > >
> > > > > Currently truncate operation on the file which only has
> > > > > lower will copy-up whole lower file and calling truncate(2)
> > > > > on upper file. It is not efficient for the case which
> > > > > truncates to much smaller size than lower file. This patch
> > > > > tries to avoid unnecessary data copy and truncate operation
> > > > > after copy-up.
> > > > >
> > > > > Signed-off-by: Chengguang Xu <cgxu519@mykernel.net>
> > > > > ---
> > > > > fs/overlayfs/copy_up.c | 18 +++++++++++-------
> > > > > fs/overlayfs/inode.c | 9 ++++++++-
> > > > > fs/overlayfs/overlayfs.h | 2 +-
> > > > > 3 files changed, 20 insertions(+), 9 deletions(-)
> > > > >
> > > > > diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
> > > > > index a1a9a150405a..331cc32eac95 100644
> > > > > --- a/fs/overlayfs/copy_up.c
> > > > > +++ b/fs/overlayfs/copy_up.c
> > > > > @@ -874,7 +874,7 @@ static int ovl_copy_up_meta_inode_data(struct ovl_copy_up_ctx *c)
> > > > > }
> > > > >
> > > > > static int ovl_copy_up_one(struct dentry *parent, struct dentry *dentry,
> > > > > - int flags)
> > > > > + int flags, loff_t size)
> > > > > {
> > > > > int err;
> > > > > DEFINE_DELAYED_CALL(done);
> > > > > @@ -911,6 +911,8 @@ static int ovl_copy_up_one(struct dentry *parent, struct dentry *dentry,
> > > > > /* maybe truncate regular file. this has no effect on dirs */
> > > > > if (flags & O_TRUNC)
> > > > > ctx.stat.size = 0;
> > > > > + if (size)
> > > > > + ctx.stat.size = size;
> > > >
> > > > Not sure about this, but *maybe* instead we re-interpret O_TRUNC
> > > > internally as "either O_TRUNC or truncate()" and then:
> > > > if (flags & O_TRUNC)
> > > > ctx.stat.size = size;
> > > >
> > > > It would simplify the logic in ovl_copy_up_with_data().
> > > > If you do that, put a comment to clarify that special meaning.
> > > >
> > > > >
> > > > > if (S_ISLNK(ctx.stat.mode)) {
> > > > > ctx.link = vfs_get_link(ctx.lowerpath.dentry, &done);
> > > > > @@ -937,7 +939,7 @@ static int ovl_copy_up_one(struct dentry *parent, struct dentry *dentry,
> > > > > return err;
> > > > > }
> > > > >
> > > > > -static int ovl_copy_up_flags(struct dentry *dentry, int flags)
> > > > > +static int ovl_copy_up_flags(struct dentry *dentry, int flags, loff_t size)
> > > > > {
> > > > > int err = 0;
> > > > > const struct cred *old_cred = ovl_override_creds(dentry->d_sb);
> > > > > @@ -970,7 +972,7 @@ static int ovl_copy_up_flags(struct dentry *dentry, int flags)
> > > > > next = parent;
> > > > > }
> > > > >
> > > > > - err = ovl_copy_up_one(parent, next, flags);
> > > > > + err = ovl_copy_up_one(parent, next, flags, size);
> > > > >
> > > > > dput(parent);
> > > > > dput(next);
> > > > > @@ -1002,7 +1004,7 @@ int ovl_maybe_copy_up(struct dentry *dentry, int flags)
> > > > > if (ovl_open_need_copy_up(dentry, flags)) {
> > > > > err = ovl_want_write(dentry);
> > > > > if (!err) {
> > > > > - err = ovl_copy_up_flags(dentry, flags);
> > > > > + err = ovl_copy_up_flags(dentry, flags, 0);
> > > > > ovl_drop_write(dentry);
> > > > > }
> > > > > }
> > > > > @@ -1010,12 +1012,14 @@ int ovl_maybe_copy_up(struct dentry *dentry, int flags)
> > > > > return err;
> > > > > }
> > > > >
> > > > > -int ovl_copy_up_with_data(struct dentry *dentry)
> > > > > +int ovl_copy_up_with_data(struct dentry *dentry, loff_t size)
> > > > > {
> > > > > - return ovl_copy_up_flags(dentry, O_WRONLY);
> > > > > + if (size)
> > > > > + return ovl_copy_up_flags(dentry, O_WRONLY, size);
> > > > > + return ovl_copy_up_flags(dentry, O_TRUNC | O_WRONLY, 0);
> > > >
> > > > Best get rid of this helper and put this logic in ovl_setattr(). see below.
> > > >
> > > > > }
> > > > >
> > > > > int ovl_copy_up(struct dentry *dentry)
> > > > > {
> > > > > - return ovl_copy_up_flags(dentry, 0);
> > > > > + return ovl_copy_up_flags(dentry, 0, 0);
> > > > > }
> > > > > diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
> > > > > index cf41bcb664bc..92f274844947 100644
> > > > > --- a/fs/overlayfs/inode.c
> > > > > +++ b/fs/overlayfs/inode.c
> > > > > @@ -43,13 +43,20 @@ int ovl_setattr(struct dentry *dentry, struct iattr *attr)
> > > > > if (!full_copy_up)
> > > > > err = ovl_copy_up(dentry);
> > > > > else
> > > > > - err = ovl_copy_up_with_data(dentry);
> > > > > + err = ovl_copy_up_with_data(dentry, attr->ia_size);
> > > >
> > > > You do not know that ia_size is valid here.
> > >
> > > I think we don't have to worry about validation of ia_size here,
> > > vfs layer has already done simple check for specified size and upper fs
> > > will return error when we set invalid file size after copy-up. Am I missing
> > > something?
> > >
> >
> > ovl_setattr() will be called from any number of places where ia_size has
> > uninitialized value, such as vfs_utimes().
> >
> > You are not allowed to access it without checking
> > (attr->ia_valid & ATTR_SIZE) which here above you don't.
> >
>
> Currently, (attr->ia_valid & ATTR_SIZE) is equal to var full_copy_up,
> so the size will be valid in this case.
Right. I still prefer killing the ovl_copy_up_with_data() helper.
It hurts more than it helps at this point IMO.
>
> > >
> > > > Instead of using this if/else and full_copy_up var, use vars 'flags'
> > > > and 'size' and call ovl_copy_up_flags().
> > > > Instead of full_copy_up = true, set flags and size.
> > > > Then you may also remove ovl_copy_up_with_data() which has no other
> > > > callers.
> > > >
> > > > > if (!err) {
> > > > > struct inode *winode = NULL;
> > > > >
> > > > > upperdentry = ovl_dentry_upper(dentry);
> > > > >
> > > > > if (attr->ia_valid & ATTR_SIZE) {
> > > > > + if (full_copy_up && !(attr->ia_valid & ~ATTR_SIZE)) {
> > > > > + inode_lock(upperdentry->d_inode);
> > > > > + ovl_copyattr(upperdentry->d_inode, dentry->d_inode);
> > > > > + inode_unlock(upperdentry->d_inode);
> > > >
> > > > All that this is saving is an extra notify_change() call and I am not sure it is
> > > > worth the special casing.
> > > >
> > > > Also, I think that is a bug and would make xfstest overlay/013 fail.
> > >
> > > I ran testcases in overlay directory and didn't find failure related to this change.
> > > However, generic/313 failed unexpectedly, the reason is I used full_copy_up var
> > > wrongly, for the file which has upper still needs to go through notify_change().
> > >
> > > By the way, I don't fully understand calling copy-up function(ovl_copy_up() or ovl_copy_up_with_data())
> > > even for the file which has upper. Maybe it's better to optimize this part first in separated patch.
> > >
> >
> > There is a difference between "has upper" and "has upper data".
> > It's related to metacopy.
> >
>
> I see, my point here is we can skip calling copy-up function for files which already have upper data.
>
ovl_copy_up() already does all those optimizations internally.
We can pre-check upper data if it helps anything.
>
>
> > >
> > > > When lower file is being executed, its true that we copy up anyway
> > > > and that it is safe to do that, but test and applications expect to get
> > > > ETXTBSY error all the same.
> > >
> > > Actually we have already do the check and return ETXTBSY error, see below.
> > >
> > > err = -ETXTBSY;
> > > if (atomic_read(&realinode->i_writecount) < 0)
> > > goto out_drop_write;
> > >
> >
> > Yes, my point exactly. Your code does goto out_drop_write; before that check
> > so it will skip it.
> >
>
> No, if you look at the code more closely , you'll find the goto in my code is actually after the check, :-).
>
I don't see it?? I see get_write_access() check after your goto.
What are we missing?
Anyway, I hold my opinion that the optimization of skipping notify_change()
has little benefit and is a potential for bugs, even if the specific
bug I pointed
out is not real.
Thanks,
Amir.
next prev parent reply other threads:[~2021-04-09 5:50 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-04-06 12:02 [PATCH v2 1/3] ovl: do not restore mtime on copy-up for regular file Chengguang Xu
2021-04-06 12:02 ` [PATCH v2 2/3] ovl: check actual copy-up size Chengguang Xu
2021-04-07 6:56 ` Amir Goldstein
2021-04-06 12:02 ` [PATCH v2 3/3] ovl: copy-up optimization for truncate Chengguang Xu
2021-04-07 7:52 ` Amir Goldstein
2021-04-08 13:23 ` Chengguang Xu
2021-04-08 14:40 ` Amir Goldstein
2021-04-09 3:00 ` Chengguang Xu
2021-04-09 5:50 ` Amir Goldstein [this message]
2021-04-09 8:02 ` Miklos Szeredi
2021-04-07 6:55 ` [PATCH v2 1/3] ovl: do not restore mtime on copy-up for regular file Amir Goldstein
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=CAOQ4uxhj3F_KQLyxEz3u7L-se-zWj40XiEsUKcuFurvYK_5S0w@mail.gmail.com \
--to=amir73il@gmail.com \
--cc=cgxu519@mykernel.net \
--cc=linux-unionfs@vger.kernel.org \
--cc=miklos@szeredi.hu \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).