From: Jeff Layton <jlayton@kernel.org>
To: Xiubo Li <xiubli@redhat.com>
Cc: idryomov@gmail.com, vshankar@redhat.com, ceph-devel@vger.kernel.org
Subject: Re: [PATCH] ceph: do not truncate pagecache if truncate size doesn't change
Date: Wed, 17 Nov 2021 08:28:08 -0500
Message-ID: <e49bbb32e8c76e441c6d24f98774187c4e913a22.camel@kernel.org>
In-Reply-To: <672440f9-e812-e97f-1c85-0343d7e8359e@redhat.com>

On Wed, 2021-11-17 at 09:21 +0800, Xiubo Li wrote:
> On 11/17/21 4:06 AM, Jeff Layton wrote:
> > On Tue, 2021-11-16 at 17:20 +0800, xiubli@redhat.com wrote:
> > > From: Xiubo Li <xiubli@redhat.com>
> > > 
> > > When a file is truncated to a smaller size, sizeA, that size is kept
> > > in truncate_size. If the file is later truncated to a bigger size,
> > > sizeB, the MDS only increases truncate_seq and still uses sizeA as
> > > the truncate_size.
> > > 
> > Do you mean "kept in ci->i_truncate_size" ?
> 
> Sorry for the confusion. It is mainly kept in the MDS side's
> CInode->inode.truncate_size, and it is also propagated to every
> client's ci->i_truncate_size member.
> 
> The MDS only changes CInode->inode.truncate_size when truncating to a
> smaller size.
> 
> 
> > If so, is this really the
> > correct fix? I'll note this in the sources:
> > 
> >          u32 i_truncate_seq;        /* last truncate to smaller size */
> >          u64 i_truncate_size;       /*  and the size we last truncated down to */
> > 
> > Maybe the MDS ought not bump the truncate_seq unless it was truncating
> > to a smaller size? If not, then that comment seems wrong at least.
> 
> Yeah, the above comments are inconsistent with what the MDS is doing.
> 
> Okay, I had misread the code. I found that this MDS behavior was
> introduced by commit:
> 
>       bf39d32d936 mds: bump truncate seq when fscrypt_file changes
> 
> With the size handling feature in place, I think this commit no longer
> makes sense, since we will calculate 'truncating_smaller' not only by
> comparing new_size and old_size, both rounded up to the FSCRYPT BLOCK
> SIZE, but also by checking 'req->get_data().length()' when new_size and
> old_size are the same.
> 
> 
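
The 'truncating_smaller' rule described above can be sketched roughly as
follows. This is only an illustration of the quoted description, not the
MDS code: the function names are hypothetical, the 4096-byte block size
is an assumption, and payload_len stands in for
'req->get_data().length()'.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed block size for illustration only. */
#define MODEL_FSCRYPT_BLOCK_SIZE 4096ULL

/* Round a size up to the next multiple of the (assumed) block size. */
uint64_t model_round_up(uint64_t size)
{
	return (size + MODEL_FSCRYPT_BLOCK_SIZE - 1) /
	       MODEL_FSCRYPT_BLOCK_SIZE * MODEL_FSCRYPT_BLOCK_SIZE;
}

/*
 * Hypothetical helper: a truncate counts as "smaller" if the rounded new
 * size is below the rounded old size, or if both round to the same value
 * and the request carries a non-empty data payload.
 */
bool model_truncating_smaller(uint64_t old_size, uint64_t new_size,
			      size_t payload_len)
{
	uint64_t old_r = model_round_up(old_size);
	uint64_t new_r = model_round_up(new_size);

	if (new_r < old_r)
		return true;
	return new_r == old_r && payload_len > 0;
}
```

Under this reading, a truncate within the same block is only treated as
shrinking when the request carries data, so a plain comparison of the
rounded sizes is not enough on its own.
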
> > 
> > > So when filling the inode, the client will truncate the pagecache
> > > using truncate_sizeA again, which makes no sense and will trim
> > > innocent pages.
> > > 
> > Is there a reproducer for this? It would be nice to put something in
> > xfstests for it if so.
> 
> xfstests' generic/075 already tests this, but I didn't see it reproduce
> any issue.
> 
> I just found these strange logs when it does something like:
> 
> truncateA 0x10000 --> 0x2000
> 
> truncateB 0x2000   --> 0x8000
> 
> truncateC 0x8000   --> 0x6000
> 
> For the truncateC, the log says:
> 
> ceph:  truncate_size 0x2000 -> 0x6000
> 
> 
> The problem is that truncateB will also do the vmtruncate, using 0x2000
> as the size. The vmtruncate will not flush the dirty pages to the OSD;
> it just discards them from the pagecache. So we may lose newly written
> data if there was any write in the range [0x2000, 0x8000) before
> truncateB.
> 
> 

Is that reproducible without the fscrypt size handling changes? I
haven't seen generic/075 fail on stock kernels.

If this is a generic bug, then we should go ahead and fix it in
mainline. If it's a problem only with fscrypt enabled, then let's plan
to roll this patch into those changes.
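
To make the truncateA/B/C sequence quoted above easier to reason about,
here is a small standalone model. It is a sketch under assumptions, not
kernel or MDS code: the struct and function names are hypothetical, the
MDS is modelled as bumping truncate_seq on every truncate (as described
in this thread), and the client is assumed to hold cache caps or have the
file open, so the only thing that varies is the size comparison that the
patch adds.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified state; not the real ceph_inode_info. */
struct trunc_state {
	uint32_t seq;	/* bumped on every truncate in this model */
	uint64_t size;	/* only updated when truncating to a smaller size */
};

/* MDS side as described in the thread: seq always moves, size only shrinks. */
void model_mds_truncate(struct trunc_state *mds, uint64_t old_size,
			uint64_t new_size)
{
	mds->seq++;
	if (new_size < old_size)
		mds->size = new_size;
}

/*
 * Client side: returns true if a pagecache truncation would be queued.
 * With patched == true, a newer seq alone is not enough; the
 * truncate_size must also have changed.
 */
bool model_client_fill(struct trunc_state *ci, const struct trunc_state *mds,
		       bool patched)
{
	bool queue_trunc = false;

	if (mds->seq > ci->seq) {
		if (!patched || ci->size != mds->size)
			queue_trunc = true;
		ci->seq = mds->seq;
	}
	ci->size = mds->size;
	return queue_trunc;
}
```

In this model, after truncateA (0x10000 --> 0x2000) and truncateB
(0x2000 --> 0x8000) the MDS state still carries truncate_size 0x2000. The
unpatched client queues a pagecache truncation to 0x2000 on truncateB,
while the patched client skips it and only queues one again on truncateC,
when truncate_size moves to 0x6000.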

> 
> 
> > > Signed-off-by: Xiubo Li <xiubli@redhat.com>
> > > ---
> > >   fs/ceph/inode.c | 5 +++--
> > >   1 file changed, 3 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/fs/ceph/inode.c b/fs/ceph/inode.c
> > > index 1b4ce453d397..b4f784684e64 100644
> > > --- a/fs/ceph/inode.c
> > > +++ b/fs/ceph/inode.c
> > > @@ -738,10 +738,11 @@ int ceph_fill_file_size(struct inode *inode, int issued,
> > >   			 * don't hold those caps, then we need to check whether
> > >   			 * the file is either opened or mmaped
> > >   			 */
> > > -			if ((issued & (CEPH_CAP_FILE_CACHE|
> > > +			if (ci->i_truncate_size != truncate_size &&
> > > +			    ((issued & (CEPH_CAP_FILE_CACHE|
> > >   				       CEPH_CAP_FILE_BUFFER)) ||
> > >   			    mapping_mapped(inode->i_mapping) ||
> > > -			    __ceph_is_file_opened(ci)) {
> > > +			    __ceph_is_file_opened(ci))) {
> > >   				ci->i_truncate_pending++;
> > >   				queue_trunc = 1;
> > >   			}
> 

-- 
Jeff Layton <jlayton@kernel.org>
