linux-api.vger.kernel.org archive mirror
* [PATCH] vfs: report an inode version in statx for IS_I_VERSION inodes
@ 2022-08-19 11:56 Jeff Layton
  2022-08-23 10:01 ` Florian Weimer
  2022-08-23 21:53 ` Dave Chinner
  0 siblings, 2 replies; 9+ messages in thread
From: Jeff Layton @ 2022-08-19 11:56 UTC (permalink / raw)
  To: viro
  Cc: linux-api, linux-fsdevel, linux-nfs, Jeff Layton, David Howells,
	Frank Filz

From: Jeff Layton <jlayton@redhat.com>

The NFS server and IMA both rely heavily on the i_version counter, but
it's largely invisible to userland, which makes it difficult to test its
behavior. This value would also be of use to userland NFS servers, and
other applications that want a reliable way to know if there was an
explicit change to an inode since they last checked.

Claim one of the spare fields in struct statx to hold a 64-bit inode
version attribute. This value must change with any explicit, observable
metadata or data change. Note that atime updates are excluded from this,
unless they are due to an explicit change via utimes or a similar mechanism.

When statx requests this attribute on an IS_I_VERSION inode, do an
inode_query_iversion and fill the result in the field. Also, update the
test-statx.c program to display the inode version and the mountid.

Cc: David Howells <dhowells@redhat.com>
Cc: Frank Filz <ffilzlnx@mindspring.com>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/stat.c                 | 7 +++++++
 include/linux/stat.h      | 1 +
 include/uapi/linux/stat.h | 3 ++-
 samples/vfs/test-statx.c  | 8 ++++++--
 4 files changed, 16 insertions(+), 3 deletions(-)

diff --git a/fs/stat.c b/fs/stat.c
index 9ced8860e0f3..d892909836aa 100644
--- a/fs/stat.c
+++ b/fs/stat.c
@@ -17,6 +17,7 @@
 #include <linux/syscalls.h>
 #include <linux/pagemap.h>
 #include <linux/compat.h>
+#include <linux/iversion.h>
 
 #include <linux/uaccess.h>
 #include <asm/unistd.h>
@@ -118,6 +119,11 @@ int vfs_getattr_nosec(const struct path *path, struct kstat *stat,
 	stat->attributes_mask |= (STATX_ATTR_AUTOMOUNT |
 				  STATX_ATTR_DAX);
 
+	if ((request_mask & STATX_INO_VERSION) && IS_I_VERSION(inode)) {
+		stat->result_mask |= STATX_INO_VERSION;
+		stat->ino_version = inode_query_iversion(inode);
+	}
+
 	mnt_userns = mnt_user_ns(path->mnt);
 	if (inode->i_op->getattr)
 		return inode->i_op->getattr(mnt_userns, path, stat,
@@ -611,6 +617,7 @@ cp_statx(const struct kstat *stat, struct statx __user *buffer)
 	tmp.stx_dev_major = MAJOR(stat->dev);
 	tmp.stx_dev_minor = MINOR(stat->dev);
 	tmp.stx_mnt_id = stat->mnt_id;
+	tmp.stx_ino_version = stat->ino_version;
 
 	return copy_to_user(buffer, &tmp, sizeof(tmp)) ? -EFAULT : 0;
 }
diff --git a/include/linux/stat.h b/include/linux/stat.h
index 7df06931f25d..9cd77eb7bc1a 100644
--- a/include/linux/stat.h
+++ b/include/linux/stat.h
@@ -50,6 +50,7 @@ struct kstat {
 	struct timespec64 btime;			/* File creation time */
 	u64		blocks;
 	u64		mnt_id;
+	u64		ino_version;
 };
 
 #endif
diff --git a/include/uapi/linux/stat.h b/include/uapi/linux/stat.h
index 1500a0f58041..48d9307d7f31 100644
--- a/include/uapi/linux/stat.h
+++ b/include/uapi/linux/stat.h
@@ -124,7 +124,7 @@ struct statx {
 	__u32	stx_dev_minor;
 	/* 0x90 */
 	__u64	stx_mnt_id;
-	__u64	__spare2;
+	__u64	stx_ino_version; /* Inode change attribute */
 	/* 0xa0 */
 	__u64	__spare3[12];	/* Spare space for future expansion */
 	/* 0x100 */
@@ -152,6 +152,7 @@ struct statx {
 #define STATX_BASIC_STATS	0x000007ffU	/* The stuff in the normal stat struct */
 #define STATX_BTIME		0x00000800U	/* Want/got stx_btime */
 #define STATX_MNT_ID		0x00001000U	/* Got stx_mnt_id */
+#define STATX_INO_VERSION	0x00002000U	/* Want/got stx_ino_version */
 
 #define STATX__RESERVED		0x80000000U	/* Reserved for future struct statx expansion */
 
diff --git a/samples/vfs/test-statx.c b/samples/vfs/test-statx.c
index 49c7a46cee07..23e68036fdfb 100644
--- a/samples/vfs/test-statx.c
+++ b/samples/vfs/test-statx.c
@@ -107,6 +107,8 @@ static void dump_statx(struct statx *stx)
 	printf("Device: %-15s", buffer);
 	if (stx->stx_mask & STATX_INO)
 		printf(" Inode: %-11llu", (unsigned long long) stx->stx_ino);
+	if (stx->stx_mask & STATX_MNT_ID)
+		printf(" MountId: %llx", (unsigned long long) stx->stx_mnt_id);
 	if (stx->stx_mask & STATX_NLINK)
 		printf(" Links: %-5u", stx->stx_nlink);
 	if (stx->stx_mask & STATX_TYPE) {
@@ -145,7 +147,9 @@ static void dump_statx(struct statx *stx)
 	if (stx->stx_mask & STATX_CTIME)
 		print_time("Change: ", &stx->stx_ctime);
 	if (stx->stx_mask & STATX_BTIME)
-		print_time(" Birth: ", &stx->stx_btime);
+		print_time("Birth: ", &stx->stx_btime);
+	if (stx->stx_mask & STATX_INO_VERSION)
+		printf("Inode Version: 0x%llx\n", (unsigned long long) stx->stx_ino_version);
 
 	if (stx->stx_attributes_mask) {
 		unsigned char bits, mbits;
@@ -218,7 +222,7 @@ int main(int argc, char **argv)
 	struct statx stx;
 	int ret, raw = 0, atflag = AT_SYMLINK_NOFOLLOW;
 
-	unsigned int mask = STATX_BASIC_STATS | STATX_BTIME;
+	unsigned int mask = STATX_BASIC_STATS | STATX_BTIME | STATX_MNT_ID | STATX_INO_VERSION;
 
 	for (argv++; *argv; argv++) {
 		if (strcmp(*argv, "-F") == 0) {
-- 
2.37.2
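
For readers following the fs/stat.c hunk, here is a minimal userspace model of the gating it adds. It is an illustration, not kernel code: the types model_inode and model_kstat and both functions are hypothetical stand-ins, and the mask value is the one proposed in this patch, which should not be assumed to match any released uapi header. The model also ignores whatever inode_query_iversion() does beyond returning the counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Mask bit as proposed in this patch; not taken from a released header. */
#define MODEL_STATX_INO_VERSION 0x00002000U

/* Hypothetical stand-in for the parts of struct inode used here. */
struct model_inode {
	bool     is_i_version;	/* models IS_I_VERSION(inode) */
	uint64_t i_version;	/* models the value inode_query_iversion() returns */
};

/* Hypothetical stand-in for the parts of struct kstat used here. */
struct model_kstat {
	uint32_t result_mask;
	uint64_t ino_version;
};

/* Mirrors the new hunk: report the version only if asked for and supported. */
static void model_getattr(const struct model_inode *inode,
			  uint32_t request_mask, struct model_kstat *stat)
{
	assert(inode != NULL && stat != NULL);

	stat->result_mask = 0;
	stat->ino_version = 0;
	if ((request_mask & MODEL_STATX_INO_VERSION) && inode->is_i_version) {
		stat->result_mask |= MODEL_STATX_INO_VERSION;
		stat->ino_version = inode->i_version;
	}
}

/* Caller side: trust the field only when the result mask carries the bit. */
static bool model_have_version(const struct model_kstat *stat)
{
	assert(stat != NULL);
	return (stat->result_mask & MODEL_STATX_INO_VERSION) != 0;
}
```

The point of the model is the caller-side check: on an inode that is not IS_I_VERSION the bit stays clear in the result mask, so the field must not be interpreted.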



* Re: [PATCH] vfs: report an inode version in statx for IS_I_VERSION inodes
  2022-08-19 11:56 [PATCH] vfs: report an inode version in statx for IS_I_VERSION inodes Jeff Layton
@ 2022-08-23 10:01 ` Florian Weimer
  2022-08-23 10:16   ` Jeff Layton
  2022-08-23 21:53 ` Dave Chinner
  1 sibling, 1 reply; 9+ messages in thread
From: Florian Weimer @ 2022-08-23 10:01 UTC (permalink / raw)
  To: Jeff Layton
  Cc: viro, linux-api, linux-fsdevel, linux-nfs, Jeff Layton,
	David Howells, Frank Filz

* Jeff Layton:

> From: Jeff Layton <jlayton@redhat.com>
>
> The NFS server and IMA both rely heavily on the i_version counter, but
> it's largely invisible to userland, which makes it difficult to test its
> behavior. This value would also be of use to userland NFS servers, and
> other applications that want a reliable way to know if there was an
> explicit change to an inode since they last checked.
>
> Claim one of the spare fields in struct statx to hold a 64-bit inode
> version attribute. This value must change with any explicit, observable
> metadata or data change. Note that atime updates are excluded from this,
> unless they are due to an explicit change via utimes or a similar mechanism.
>
> When statx requests this attribute on an IS_I_VERSION inode, do an
> inode_query_iversion and fill the result in the field. Also, update the
> test-statx.c program to display the inode version and the mountid.

Will the version survive reboots?  Is it stored on disks?  Can backup
tools (and others) use this to check if the file has changed since the
last time the version has been observed?

Thanks,
Florian



* Re: [PATCH] vfs: report an inode version in statx for IS_I_VERSION inodes
  2022-08-23 10:01 ` Florian Weimer
@ 2022-08-23 10:16   ` Jeff Layton
  0 siblings, 0 replies; 9+ messages in thread
From: Jeff Layton @ 2022-08-23 10:16 UTC (permalink / raw)
  To: Florian Weimer
  Cc: viro, linux-api, linux-fsdevel, linux-nfs, David Howells, Frank Filz

On Tue, 2022-08-23 at 12:01 +0200, Florian Weimer wrote:
> * Jeff Layton:
> 
> > From: Jeff Layton <jlayton@redhat.com>
> > 
> > The NFS server and IMA both rely heavily on the i_version counter, but
> > it's largely invisible to userland, which makes it difficult to test its
> > behavior. This value would also be of use to userland NFS servers, and
> > other applications that want a reliable way to know if there was an
> > explicit change to an inode since they last checked.
> > 
> > Claim one of the spare fields in struct statx to hold a 64-bit inode
> > version attribute. This value must change with any explicit, observable
> > metadata or data change. Note that atime updates are excluded from this,
> > unless they are due to an explicit change via utimes or a similar mechanism.
> > 
> > When statx requests this attribute on an IS_I_VERSION inode, do an
> > inode_query_iversion and fill the result in the field. Also, update the
> > test-statx.c program to display the inode version and the mountid.
> 
> Will the version survive reboots?  Is it stored on disks?  Can backup
> tools (and others) use this to check if the file has changed since the
> last time the version has been observed?
> 


The answer to all of those questions is "yes".
-- 
Jeff Layton <jlayton@kernel.org>


* Re: [PATCH] vfs: report an inode version in statx for IS_I_VERSION inodes
  2022-08-19 11:56 [PATCH] vfs: report an inode version in statx for IS_I_VERSION inodes Jeff Layton
  2022-08-23 10:01 ` Florian Weimer
@ 2022-08-23 21:53 ` Dave Chinner
  2022-08-24 10:17   ` Jeff Layton
  2022-08-25 18:48   ` Colin Walters
  1 sibling, 2 replies; 9+ messages in thread
From: Dave Chinner @ 2022-08-23 21:53 UTC (permalink / raw)
  To: Jeff Layton
  Cc: viro, linux-api, linux-fsdevel, linux-nfs, Jeff Layton,
	David Howells, Frank Filz

On Fri, Aug 19, 2022 at 07:56:41AM -0400, Jeff Layton wrote:
> From: Jeff Layton <jlayton@redhat.com>
> 
> The NFS server and IMA both rely heavily on the i_version counter, but
> it's largely invisible to userland, which makes it difficult to test its
> behavior. This value would also be of use to userland NFS servers, and
> other applications that want a reliable way to know if there was an
> explicit change to an inode since they last checked.
> 
> Claim one of the spare fields in struct statx to hold a 64-bit inode
> version attribute. This value must change with any explicit, observable
> metadata or data change. Note that atime updates are excluded from this,
> unless they are due to an explicit change via utimes or a similar mechanism.
> 
> When statx requests this attribute on an IS_I_VERSION inode, do an
> inode_query_iversion and fill the result in the field. Also, update the
> test-statx.c program to display the inode version and the mountid.
> 
> Cc: David Howells <dhowells@redhat.com>
> Cc: Frank Filz <ffilzlnx@mindspring.com>
> Signed-off-by: Jeff Layton <jlayton@kernel.org>

NAK.

There's no definition of what constitutes an "inode change" and this
exposes internal filesystem implementation details (i.e. on-disk
format behaviour) directly to userspace. That means when the
internal filesystem behaviour changes, userspace applications will
see changes in stat->ino_version, which could potentially break them.

We *need a documented specification* for the behaviour we are exposing to
userspace here, and then individual filesystems need to opt into
providing this information as they are modified to conform to the
behaviour we are exposing directly to userspace.

Jeff - can you please stop posting iversion patches to different
subsystems as individual, unrelated patchsets and start posting all
the changes - statx, ext4, xfs, man pages, etc as a single patchset
so the discussion can be centralised in one place and not spread
over half a dozen disconnected threads?

-Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: [PATCH] vfs: report an inode version in statx for IS_I_VERSION inodes
  2022-08-23 21:53 ` Dave Chinner
@ 2022-08-24 10:17   ` Jeff Layton
  2022-08-25 18:48   ` Colin Walters
  1 sibling, 0 replies; 9+ messages in thread
From: Jeff Layton @ 2022-08-24 10:17 UTC (permalink / raw)
  To: Dave Chinner
  Cc: viro, linux-api, linux-fsdevel, linux-nfs, David Howells, Frank Filz

On Wed, 2022-08-24 at 07:53 +1000, Dave Chinner wrote:
> On Fri, Aug 19, 2022 at 07:56:41AM -0400, Jeff Layton wrote:
> > From: Jeff Layton <jlayton@redhat.com>
> > 
> > The NFS server and IMA both rely heavily on the i_version counter, but
> > it's largely invisible to userland, which makes it difficult to test its
> > behavior. This value would also be of use to userland NFS servers, and
> > other applications that want a reliable way to know if there was an
> > explicit change to an inode since they last checked.
> > 
> > Claim one of the spare fields in struct statx to hold a 64-bit inode
> > version attribute. This value must change with any explicit, observable
> > metadata or data change. Note that atime updates are excluded from this,
> > unless they are due to an explicit change via utimes or a similar mechanism.
> > 
> > When statx requests this attribute on an IS_I_VERSION inode, do an
> > inode_query_iversion and fill the result in the field. Also, update the
> > test-statx.c program to display the inode version and the mountid.
> > 
> > Cc: David Howells <dhowells@redhat.com>
> > Cc: Frank Filz <ffilzlnx@mindspring.com>
> > Signed-off-by: Jeff Layton <jlayton@kernel.org>
> 
> NAK.
> 
> There's no definition of what constitutes an "inode change" and this
> exposes internal filesystem implementation details (i.e. on-disk
> format behaviour) directly to userspace. That means when the
> internal filesystem behaviour changes, userspace applications will
> see changes in stat->ino_version, which could potentially break them.
> 
> We *need a documented specification* for the behaviour we are exposing to
> userspace here, and then individual filesystems need to opt into
> providing this information as they are modified to conform to the
> behaviour we are exposing directly to userspace.
> 
> Jeff - can you please stop posting iversion patches to different
> subsystems as individual, unrelated patchsets and start posting all
> the changes - statx, ext4, xfs, man pages, etc as a single patchset
> so the discussion can be centralised in one place and not spread
> over half a dozen disconnected threads?
> 


Sure. Give me a few days and I'll post a more coherent set of patches.

Thanks,
-- 
Jeff Layton <jlayton@kernel.org>


* Re: [PATCH] vfs: report an inode version in statx for IS_I_VERSION inodes
  2022-08-23 21:53 ` Dave Chinner
  2022-08-24 10:17   ` Jeff Layton
@ 2022-08-25 18:48   ` Colin Walters
  2022-08-25 19:48     ` Jeff Layton
  2022-08-27  7:38     ` Greg KH
  1 sibling, 2 replies; 9+ messages in thread
From: Colin Walters @ 2022-08-25 18:48 UTC (permalink / raw)
  To: Dave Chinner, Jeff Layton
  Cc: Al Viro, linux-api, linux-fsdevel, linux-nfs, Jeff Layton,
	David Howells, Frank Filz



On Tue, Aug 23, 2022, at 5:53 PM, Dave Chinner wrote:
> 
> There's no definition of what constitutes an "inode change" and this
> exposes internal filesystem implementation details (i.e. on-disk
> format behaviour) directly to userspace. That means when the
> internal filesystem behaviour changes, userspace applications will
> see changes in stat->ino_version, which could potentially break them.

As a userspace developer (ostree, etc.) who is definitely interested in this functionality, I do agree with this concern. A random drive-by comment, though: would it be helpful to expose iversion (or other bits like this from the VFS) via e.g. debugfs to start?  I think that would unblock writing fstests in the short term, right?




* Re: [PATCH] vfs: report an inode version in statx for IS_I_VERSION inodes
  2022-08-25 18:48   ` Colin Walters
@ 2022-08-25 19:48     ` Jeff Layton
  2022-08-26  8:44       ` Colin Walters
  2022-08-27  7:38     ` Greg KH
  1 sibling, 1 reply; 9+ messages in thread
From: Jeff Layton @ 2022-08-25 19:48 UTC (permalink / raw)
  To: Colin Walters, Dave Chinner
  Cc: Al Viro, linux-api, linux-fsdevel, linux-nfs, David Howells, Frank Filz

On Thu, 2022-08-25 at 14:48 -0400, Colin Walters wrote:
> 
> On Tue, Aug 23, 2022, at 5:53 PM, Dave Chinner wrote:
> > 
> > There's no definition of what constitutes an "inode change" and this
> > exposes internal filesystem implementation details (i.e. on-disk
> > format behaviour) directly to userspace. That means when the
> > internal filesystem behaviour changes, userspace applications will
> > see changes in stat->ino_version, which could potentially break them.
> 
> As a userspace developer (ostree, etc.) who is definitely interested in this functionality, I do agree with this concern. A random drive-by comment, though: would it be helpful to expose iversion (or other bits like this from the VFS) via e.g. debugfs to start?  I think that would unblock writing fstests in the short term, right?
> 
> 

It's great to hear from userland developers who are interested in this!

I don't think there is a lot of controversy about the idea of presenting
a value like this via statx. The usefulness seems pretty obvious if
you've ever had to deal with timestamp granularity issues.

The part we're wrestling with now is that applications will need a clear
(and testable!) definition of what this value means. We need to be very
careful how we define this so that userland developers don't get stuck
dealing with semantics that vary per fstype, while still allowing the
broadest range of filesystems to support it.

My current thinking is to define this such that the reported ino_version
MUST change any time that the ctime would change (even if the timestamp
doesn't appear to change). That should also catch mtime updates.
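
One way to make that proposed rule testable is a pairwise check over two observations of the same inode. The sketch below is only an illustration of the rule as stated in this message: struct observation and obeys_ctime_rule() are hypothetical names, and the values are assumed to have been read back from statx by a test harness that is not shown.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical snapshot of what a test would read back for one inode. */
struct observation {
	int64_t  ctime_sec;
	uint32_t ctime_nsec;
	uint64_t ino_version;
};

/*
 * Returns true if the pair is consistent with the proposed rule:
 * a visible ctime change implies an ino_version change.
 *
 * A version change with an identical ctime is accepted, because the
 * rule explicitly covers changes that the timestamp is too coarse to show.
 */
static bool obeys_ctime_rule(const struct observation *before,
			     const struct observation *after)
{
	assert(before != NULL && after != NULL);

	bool ctime_changed = before->ctime_sec != after->ctime_sec ||
			     before->ctime_nsec != after->ctime_nsec;
	bool version_changed = before->ino_version != after->ino_version;

	return !ctime_changed || version_changed;
}
```

This check can only catch violations where the ctime visibly moved. A missed increment hidden behind an unchanged timestamp would need the test to know that it performed a modification in between.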

The part I'm still conflicted about is whether we should allow for a
conformant implementation to increment the value even when there is no
apparent change to the inode.

IOW, should this value mean that something _did_ change in the inode or
that something _may_ have changed in it?

Implementations that do spurious increments would be less than ideal, but
defining it that way might allow a broader range of filesystems to
present this value.

What would you prefer, as a userland developer?
-- 
Jeff Layton <jlayton@kernel.org>


* Re: [PATCH] vfs: report an inode version in statx for IS_I_VERSION inodes
  2022-08-25 19:48     ` Jeff Layton
@ 2022-08-26  8:44       ` Colin Walters
  0 siblings, 0 replies; 9+ messages in thread
From: Colin Walters @ 2022-08-26  8:44 UTC (permalink / raw)
  To: Jeff Layton, Dave Chinner
  Cc: Al Viro, linux-api, linux-fsdevel, linux-nfs, David Howells, Frank Filz

Bigger picture, I think eventually I'm going to rework stuff related to my use case to be more similar to the container stack, specifically using overlayfs; so it's quite possible by the time iversion is exposed to userspace, I won't have any strong want/need of it myself.

On Thu, Aug 25, 2022, at 3:48 PM, Jeff Layton wrote:

> IOW, should this value mean that something _did_ change in the inode or
> that something _may_ have changed in it?

In my case it's basically the same as IMA - we want to only compute the sha256 digest of files that actually changed.  Some false positives are hence OK - but that also means the usefulness of the feature degrades in proportion to that number.
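
The trade-off described here can be sketched as a small cache check. This is an illustration under assumptions, not ostree or IMA code: struct digest_cache and digest_needs_refresh() are hypothetical names, and have_version stands for "the statx result mask reported an inode version". A spurious increment only costs an extra recompute; it never causes a stale digest to be reused.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cache entry kept by a tool that checksums files. */
struct digest_cache {
	bool     valid;		/* false until a digest is stored with a version */
	uint64_t ino_version;	/* version seen when the digest was computed */
	unsigned recomputes;	/* how often the digest had to be recomputed */
};

/*
 * Returns true if the caller must recompute the digest.
 *
 * The cached digest is reused only when a version was reported both
 * times and the two values match. Without a reported version the
 * entry is never trusted, so the tool falls back to always hashing.
 */
static bool digest_needs_refresh(struct digest_cache *c,
				 bool have_version, uint64_t cur_version)
{
	assert(c != NULL);

	if (c->valid && have_version && c->ino_version == cur_version)
		return false;	/* no reported change: reuse cached digest */

	c->valid = have_version;
	c->ino_version = cur_version;
	c->recomputes++;
	return true;
}
```

The recomputes counter makes the cost of false positives visible: the more often a filesystem bumps the version without a real change, the closer this gets to hashing on every check.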

A bit more detail:

I didn't deep dive into the XFS mention about internal/background iversion changes, but AIUI at a high level it sounds like those iversion changes happen mainly (only?) when the file is recently created and pending writeback, which doesn't seem like a problem in practice.  I do agree with Ingo's old quote about atime though in https://lwn.net/Articles/244829/ and this thread reminded me to use `noatime` on my main workstation (again; I'd recently changed how I provision it).






* Re: [PATCH] vfs: report an inode version in statx for IS_I_VERSION inodes
  2022-08-25 18:48   ` Colin Walters
  2022-08-25 19:48     ` Jeff Layton
@ 2022-08-27  7:38     ` Greg KH
  1 sibling, 0 replies; 9+ messages in thread
From: Greg KH @ 2022-08-27  7:38 UTC (permalink / raw)
  To: Colin Walters
  Cc: Dave Chinner, Jeff Layton, Al Viro, linux-api, linux-fsdevel,
	linux-nfs, Jeff Layton, David Howells, Frank Filz

On Thu, Aug 25, 2022 at 02:48:02PM -0400, Colin Walters wrote:
> 
> 
> On Tue, Aug 23, 2022, at 5:53 PM, Dave Chinner wrote:
> > 
> > There's no definition of what constitutes an "inode change" and this
> > exposes internal filesystem implementation details (i.e. on-disk
> > format behaviour) directly to userspace. That means when the
> > internal filesystem behaviour changes, userspace applications will
> > see changes in stat->ino_version, which could potentially break them.
> 
> As a userspace developer (ostree, etc.) who is definitely interested in this functionality, I do agree with this concern. A random drive-by comment, though: would it be helpful to expose iversion (or other bits like this from the VFS) via e.g. debugfs to start?  I think that would unblock writing fstests in the short term, right?
> 
> 

This would not work at all for "virtual" filesystems like debugfs and
sysfs which only create the data when the file is read, and there's no
way to know if the data is going to be different than the last time it
was read, sorry.

greg k-h


