linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Amir Goldstein <amir73il@gmail.com>
To: Dave Chinner <david@fromorbit.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	Jeff Layton <jlayton@kernel.org>,
	Kent Overstreet <kent.overstreet@linux.dev>,
	Christian Brauner <brauner@kernel.org>,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	John Stultz <jstultz@google.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Stephen Boyd <sboyd@kernel.org>,
	Chandan Babu R <chandan.babu@oracle.com>,
	"Darrick J. Wong" <djwong@kernel.org>,
	"Theodore Ts'o" <tytso@mit.edu>,
	Andreas Dilger <adilger.kernel@dilger.ca>,
	Chris Mason <clm@fb.com>, Josef Bacik <josef@toxicpanda.com>,
	David Sterba <dsterba@suse.com>, Hugh Dickins <hughd@google.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Jan Kara <jack@suse.de>, David Howells <dhowells@redhat.com>,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-xfs@vger.kernel.org, linux-ext4@vger.kernel.org,
	linux-btrfs@vger.kernel.org, linux-mm@kvack.org,
	linux-nfs@vger.kernel.org
Subject: Re: [PATCH RFC 2/9] timekeeping: new interfaces for multigrain timestamp handing
Date: Tue, 31 Oct 2023 09:03:57 +0200	[thread overview]
Message-ID: <CAOQ4uxgSRw26J+MPK-zhysZX9wBkXFRNx+n1bwnQwykCJ1=F4Q@mail.gmail.com> (raw)
In-Reply-To: <ZUBbj8XsA6uW8ZDK@dread.disaster.area>

On Tue, Oct 31, 2023 at 3:42 AM Dave Chinner <david@fromorbit.com> wrote:
>
[...]
> .... and what is annoying is that that the new i_version just a
> glorified ctime change counter. What we should be fixing is ctime -
> integrating this change counting into ctime would allow us to make
> i_version go away entirely. i.e. We don't need a persistent ctime
> change counter if the ctime has sufficient resolution or persistent
> encoding that it does not need an external persistent change
> counter.
>
> That was reasoning behind the multi-grain timestamps. While the mgts
> implementation was flawed, the reasoning behind it certainly isn't.
> We should be trying to get rid of i_version by integrating it into
> ctime updates, not arguing how atime vs i_version should work.
>
> > So I don't think the issue here is "i_version" per se. I think in a
> > vacuum, the best option of i_version is pretty obvious.  But if you
> > want i_version to track di_changecount, *then* you end up with that
> > situation where the persistence of atime matters, and i_version needs
> > to update whenever a (persistent) atime update happens.
>
> Yet I don't want i_version to track di_changecount.
>
> I want to *stop supporting i_version altogether* in XFS.
>
> I want i_version as filesystem internal metadata to die completely.
>
> I don't want to change the on disk format to add a new i_version
> field because we'll be straight back in this same siutation when the
> next i_version bug is found and semantics get changed yet again.
>
> Hence if we can encode the necessary change attributes into ctime,
> we can drop VFS i_version support altogether.  Then the "atime bumps
> i_version" problem also goes away because then we *don't use
> i_version*.
>
> But if we can't get the VFS to do this with ctime, at least we have
> the abstractions available to us (i.e. timestamp granularity and
> statx change cookie) to allow XFS to implement this sort of
> ctime-with-integrated-change-counter internally to the filesystem
> and be able to drop i_version support....
>

I don't know if it was mentioned before in one of the many threads,
but there is another benefit of ctime-with-integrated-change-counter
approach - it is the ability to extend the solution with some adaptations
also to mtime.

The "change cookie" is used to know if inode metadata cache should
be invalidated and mtime is often used to know if data cache should
be invalidated, or if data comparison could be skipped (e.g. rsync).

The difference is that mtime can be set by user, so using lower nsec
bits for modification counter would require to truncate the user set
time granularity to 100ns - that is probably acceptable, but only as
an opt-in behavior.

The special value 0 for mtime-change-counter could be reserved for
mtime that was set by the user or for upgrade of existing inode,
where 0 counter means that mtime cannot be trusted as an accurate
data modification-cookie.

This feature is going to be useful for the vfs HSM implementation [1]
that I am working on and it actually rhymes with the XFS DMAPI
patches that were never fully merged upstream.

Speaking on behalf of my employer, we would love to see the data
modification-cookie feature implemented, whether in vfs or in xfs.

*IF* the result on this thread is that the chosen solution is
ctime-with-change-counter in XFS
*AND* if there is agreement among XFS developers to extend it with
an opt-in mkfs/mount option to 100ns-mtime-with-change-counter in XFS
*THEN* I think I will be able to allocate resources to drive this xfs work.

Thanks,
Amir.

[1] https://github.com/amir73il/fsnotify-utils/wiki/Hierarchical-Storage-Management-API

  reply	other threads:[~2023-10-31  7:04 UTC|newest]

Thread overview: 70+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-10-18 17:41 [PATCH RFC 0/9] fs: multigrain timestamps (redux) Jeff Layton
2023-10-18 17:41 ` [PATCH RFC 1/9] fs: switch timespec64 fields in inode to discrete integers Jeff Layton
2023-10-18 17:41 ` [PATCH RFC 2/9] timekeeping: new interfaces for multigrain timestamp handing Jeff Layton
2023-10-18 19:18   ` Linus Torvalds
2023-10-18 20:47     ` Jeff Layton
2023-10-18 21:31       ` Linus Torvalds
2023-10-18 21:52         ` Jeff Layton
2023-10-19  9:29           ` Christian Brauner
2023-10-19 11:28             ` Jeff Layton
2023-10-19 22:02               ` Dave Chinner
2023-10-20 12:12                 ` Jeff Layton
2023-10-20 20:06                   ` Linus Torvalds
2023-10-20 20:20                     ` Linus Torvalds
2023-10-20 21:05                     ` Jeff Layton
2023-10-22 22:17                   ` Dave Chinner
2023-10-23 14:45                     ` Jeff Layton
2023-10-23 23:26                       ` Dave Chinner
2023-10-24  0:18                         ` Linus Torvalds
2023-10-24  3:40                           ` Dave Chinner
2023-10-24  4:10                             ` Linus Torvalds
2023-10-24  7:08                             ` Amir Goldstein
2023-10-24 18:40                               ` Jeff Layton
2023-10-25  8:05                                 ` Dave Chinner
2023-10-25 10:41                                   ` Amir Goldstein
2023-10-25 12:25                                   ` Jeff Layton
2023-10-26  2:20                                     ` Dave Chinner
2023-10-26  5:42                                       ` Amir Goldstein
2023-10-27 10:35                                       ` Jeff Layton
2023-10-30 22:37                                         ` Dave Chinner
2023-10-30 23:11                                           ` Linus Torvalds
2023-10-31  1:42                                             ` Dave Chinner
2023-10-31  7:03                                               ` Amir Goldstein [this message]
2023-10-31 10:30                                                 ` Christian Brauner
2023-10-31 11:29                                                 ` Jeff Layton
2023-10-31 21:57                                                   ` Dave Chinner
2023-10-31 23:02                                                     ` Darrick J. Wong
2023-10-31 23:47                                                       ` Dave Chinner
2023-11-01 10:16                                                     ` Jan Kara
2023-11-01 11:38                                                       ` Amir Goldstein
2023-11-02 10:17                                                         ` Jeff Layton
2023-11-01 20:10                                                       ` Linus Torvalds
2023-11-01 21:34                                                         ` Trond Myklebust
2023-11-01 22:23                                                           ` Linus Torvalds
2023-11-01 22:45                                                             ` Trond Myklebust
2023-11-01 23:29                                                           ` Dave Chinner
2023-11-02 10:29                                                             ` Jeff Layton
2023-11-02 10:15                                                         ` Jeff Layton
2023-10-31 23:12                                                 ` Darrick J. Wong
2023-11-01  8:08                                                   ` Amir Goldstein
2023-10-31 11:26                                               ` Jeff Layton
2023-10-31 19:43                                                 ` John Stoffel
2023-10-31 11:04                                           ` Jeff Layton
2023-10-31 12:22                                             ` Jan Kara
2023-10-31 12:55                                               ` Jeff Layton
2023-10-30 23:34                                         ` ronnie sahlberg
2023-10-24 14:24                             ` Jeff Layton
2023-10-24 19:06                           ` Jeff Layton
2023-10-24 19:40                             ` Linus Torvalds
2023-10-24 20:19                               ` Jeff Layton
2023-10-31 10:26               ` Christian Brauner
2023-10-31 13:55                 ` Jeff Layton
2023-10-19 22:00   ` Thomas Gleixner
2023-10-19 22:41     ` Jeff Layton
2023-10-18 17:41 ` [PATCH RFC 3/9] timekeeping: add new debugfs file to count multigrain timestamps Jeff Layton
2023-10-18 17:41 ` [PATCH RFC 4/9] fs: add infrastructure for " Jeff Layton
2023-10-18 17:41 ` [PATCH RFC 5/9] fs: have setattr_copy handle multigrain timestamps appropriately Jeff Layton
2023-10-18 17:41 ` [PATCH RFC 6/9] xfs: switch to multigrain timestamps Jeff Layton
2023-10-18 17:41 ` [PATCH RFC 7/9] ext4: " Jeff Layton
2023-10-18 17:41 ` [PATCH RFC 8/9] btrfs: convert " Jeff Layton
2023-10-18 17:41 ` [PATCH RFC 9/9] tmpfs: add support for " Jeff Layton

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAOQ4uxgSRw26J+MPK-zhysZX9wBkXFRNx+n1bwnQwykCJ1=F4Q@mail.gmail.com' \
    --to=amir73il@gmail.com \
    --cc=adilger.kernel@dilger.ca \
    --cc=akpm@linux-foundation.org \
    --cc=brauner@kernel.org \
    --cc=chandan.babu@oracle.com \
    --cc=clm@fb.com \
    --cc=david@fromorbit.com \
    --cc=dhowells@redhat.com \
    --cc=djwong@kernel.org \
    --cc=dsterba@suse.com \
    --cc=hughd@google.com \
    --cc=jack@suse.de \
    --cc=jlayton@kernel.org \
    --cc=josef@toxicpanda.com \
    --cc=jstultz@google.com \
    --cc=kent.overstreet@linux.dev \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=linux-ext4@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-nfs@vger.kernel.org \
    --cc=linux-xfs@vger.kernel.org \
    --cc=sboyd@kernel.org \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    --cc=tytso@mit.edu \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).