From: David Wysochanski <dwysocha@redhat.com>
To: Jeff Layton <jlayton@kernel.org>
Cc: djwong@kernel.org, linux-xfs@vger.kernel.org,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	"Darrick J . Wong" <darrick.wong@oracle.com>,
	Dave Chinner <david@fromorbit.com>
Subject: Re: [PATCH] xfs: fix i_version handling in xfs
Date: Tue, 16 Aug 2022 13:14:55 -0400
Message-ID: <CALF+zO=OrT5tBvyL1ERD+YDSXkSAFvqQu-cQkSgWvQN8z+E_rA@mail.gmail.com>
In-Reply-To: <20220816131736.42615-1-jlayton@kernel.org>

On Tue, Aug 16, 2022 at 9:19 AM Jeff Layton <jlayton@kernel.org> wrote:
>
> The i_version in xfs_trans_log_inode is bumped for any inode update,
> including atime-only updates due to reads. We don't want to record those
> in the i_version, as they don't represent "real" changes. Remove that
> callsite.
>
> In xfs_vn_update_time, if S_VERSION is flagged, then attempt to bump the
> i_version and turn on XFS_ILOG_CORE if it happens. In
> xfs_trans_ichgtime, update the i_version if the mtime or ctime are being
> updated.
>
> Cc: Darrick J. Wong <darrick.wong@oracle.com>
> Cc: Dave Chinner <david@fromorbit.com>
> Signed-off-by: Jeff Layton <jlayton@kernel.org>
> ---
>  fs/xfs/libxfs/xfs_trans_inode.c | 17 +++--------------
>  fs/xfs/xfs_iops.c               |  4 ++++
>  2 files changed, 7 insertions(+), 14 deletions(-)
>
> diff --git a/fs/xfs/libxfs/xfs_trans_inode.c b/fs/xfs/libxfs/xfs_trans_inode.c
> index 8b5547073379..78bf7f491462 100644
> --- a/fs/xfs/libxfs/xfs_trans_inode.c
> +++ b/fs/xfs/libxfs/xfs_trans_inode.c
> @@ -71,6 +71,8 @@ xfs_trans_ichgtime(
>                 inode->i_ctime = tv;
>         if (flags & XFS_ICHGTIME_CREATE)
>                 ip->i_crtime = tv;
> +       if (flags & (XFS_ICHGTIME_MOD|XFS_ICHGTIME_CHG))
> +               inode_inc_iversion(inode);
>  }
>
>  /*
> @@ -116,20 +118,7 @@ xfs_trans_log_inode(
>                 spin_unlock(&inode->i_lock);
>         }
>
> -       /*
> -        * First time we log the inode in a transaction, bump the inode change
> -        * counter if it is configured for this to occur. While we have the
> -        * inode locked exclusively for metadata modification, we can usually
> -        * avoid setting XFS_ILOG_CORE if no one has queried the value since
> -        * the last time it was incremented. If we have XFS_ILOG_CORE already
> -        * set however, then go ahead and bump the i_version counter
> -        * unconditionally.
> -        */
> -       if (!test_and_set_bit(XFS_LI_DIRTY, &iip->ili_item.li_flags)) {
> -               if (IS_I_VERSION(inode) &&
> -                   inode_maybe_inc_iversion(inode, flags & XFS_ILOG_CORE))
> -                       iversion_flags = XFS_ILOG_CORE;
> -       }
> +       set_bit(XFS_LI_DIRTY, &iip->ili_item.li_flags);
>
>         /*
>          * If we're updating the inode core or the timestamps and it's possible
> diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
> index 45518b8c613c..162e044c7f56 100644
> --- a/fs/xfs/xfs_iops.c
> +++ b/fs/xfs/xfs_iops.c
> @@ -718,6 +718,7 @@ xfs_setattr_nonsize(
>         }
>
>         setattr_copy(mnt_userns, inode, iattr);
> +       inode_inc_iversion(inode);
>         xfs_trans_log_inode(tp, ip, XFS_ILOG_CORE);
>
>         XFS_STATS_INC(mp, xs_ig_attrchg);
> @@ -943,6 +944,7 @@ xfs_setattr_size(
>
>         ASSERT(!(iattr->ia_valid & (ATTR_UID | ATTR_GID)));
>         setattr_copy(mnt_userns, inode, iattr);
> +       inode_inc_iversion(inode);
>         xfs_trans_log_inode(tp, ip, XFS_ILOG_CORE);
>
>         XFS_STATS_INC(mp, xs_ig_attrchg);
> @@ -1047,6 +1049,8 @@ xfs_vn_update_time(
>                 inode->i_mtime = *now;
>         if (flags & S_ATIME)
>                 inode->i_atime = *now;
> +       if ((flags & S_VERSION) && inode_maybe_inc_iversion(inode, false))
> +               log_flags |= XFS_ILOG_CORE;
>
>         xfs_trans_ijoin(tp, ip, XFS_ILOCK_EXCL);
>         xfs_trans_log_inode(tp, ip, log_flags);
> --
> 2.37.2
>

I have a test (details below) that reproduces an open issue with
NFSv4.x + fscache: an exported xfs filesystem triggers unnecessary
over-the-wire READs after a umount/mount cycle of the NFS mount.  I
previously tracked this down to atime updates but never followed
through with a patch.  Now that Jeff has worked it out and this patch
is under review, I built vanilla 5.19 and retested, then built 5.19 +
this patch and verified that the problem is fixed.

You can add:
Tested-by: Dave Wysochanski <dwysocha@redhat.com>



# ./t0_bz1913591.sh 4.1 xfs relatime
Setting NFS vers=4.1 filesystem to xfs and mount options relatime,rw
 0. On NFS server, setup export with xfs filesystem on loop device
/dev/loop0 /export/dir1 xfs
rw,seclabel,relatime,attr2,inode64,logbufs=8,logbsize=32k,noquota 0 0
 1. On NFS client, install and enable cachefilesd
 2. On NFS client, mount -o vers=4.1,fsc 127.0.0.1:/export/dir1 /mnt
 3. On NFS client, dd if=/dev/zero of=/mnt/file1.bin bs=4096 count=1
 4. On NFS client, echo 3 > /proc/sys/vm/drop_caches
 5. On NFS client, dd if=/mnt/file1.bin of=/dev/null (read into fscache)
 6. On NFS client, umount /mnt
 7. On NFS client, mount -o vers=4.1,fsc 127.0.0.1:/export/dir1 /mnt
 8. On NFS client, repeat steps 4-5 (read from fscache)
 9. On NFS client, check for READ ops (1st number) > 0 in /proc/self/mountstats
Found 4200 NFS READ and READ_PLUS ops in /proc/self/mountstats > 0
                READ: 1 1 0 220 4200 0 0 1 0
           READ_PLUS: 0 0 0 0 0 0 0 0 0
FAILED TEST ./t0_bz1913591.sh on kernel 5.19.0 with NFS vers=4.1
exported filesystem xfs options relatime,rw


# ./t0_bz1913591.sh 4.1 xfs relatime
Setting NFS vers=4.1 filesystem to xfs and mount options relatime,rw
 0. On NFS server, setup export with xfs filesystem on loop device
/dev/loop0 /export/dir1 xfs
rw,seclabel,relatime,attr2,inode64,logbufs=8,logbsize=32k,noquota 0 0
 1. On NFS client, install and enable cachefilesd
 2. On NFS client, mount -o vers=4.1,fsc 127.0.0.1:/export/dir1 /mnt
 3. On NFS client, dd if=/dev/zero of=/mnt/file1.bin bs=4096 count=1
 4. On NFS client, echo 3 > /proc/sys/vm/drop_caches
 5. On NFS client, dd if=/mnt/file1.bin of=/dev/null (read into fscache)
 6. On NFS client, umount /mnt
 7. On NFS client, mount -o vers=4.1,fsc 127.0.0.1:/export/dir1 /mnt
 8. On NFS client, repeat steps 4-5 (read from fscache)
 9. On NFS client, check for READ ops (1st number) > 0 in /proc/self/mountstats
10. On NFS client, check /proc/fs/fscache/stats fscache reads incrementing
PASSED TEST ./t0_bz1913591.sh on kernel 5.19.0i_version+ with NFS
vers=4.1 exported filesystem xfs options relatime,rw
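
Step 9 in the transcripts above reads a per-op counter from
/proc/self/mountstats.  As a rough sketch of that check, the helper
below pulls the first number from a line such as the READ line shown in
the failing run.  It assumes, as step 9 states, that the first number is
the op count; mountstats_op_count() is a hypothetical name and is not
taken from the test script itself.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: return the first counter following "<op>:" on a
 * per-op statistics line, or -1 if the line is not for that op or
 * cannot be parsed.
 */
static long long mountstats_op_count(const char *line, const char *op)
{
	unsigned long long count;
	size_t n;

	assert(line != NULL && op != NULL);
	n = strlen(op);

	/* Per-op lines are right-aligned, so skip leading blanks. */
	while (*line == ' ' || *line == '\t')
		line++;

	/* Require an exact op name followed by ':' (READ must not match READ_PLUS). */
	if (strncmp(line, op, n) != 0 || line[n] != ':')
		return -1;

	if (sscanf(line + n + 1, "%llu", &count) != 1)
		return -1;

	return (long long)count;
}
```

A caller would read /proc/self/mountstats line by line, pass each line
to the helper for "READ" and "READ_PLUS", and treat any result greater
than zero after the remount as the failure condition.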

