* [RFC][PATCH] link.2: AT_ATOMIC_DATA and AT_ATOMIC_METADATA
@ 2019-05-27 17:26 Amir Goldstein
  2019-05-28 20:06 ` Darrick J. Wong
  2019-05-28 20:26 ` Theodore Ts'o
  0 siblings, 2 replies; 15+ messages in thread
From: Amir Goldstein @ 2019-05-27 17:26 UTC
  To: Theodore Tso, Jan Kara
  Cc: Darrick J . Wong, Dave Chinner, Chris Mason, Al Viro,
	linux-fsdevel, linux-xfs, linux-ext4, linux-btrfs, linux-api

Add new linkat(2) flags to request an "atomic" link.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---

Hi Guys,

Following our discussions at LSF/MM and beyond [1][2], here is
an RFC documentation patch.

Ted, I know we discussed limiting the API to linking an O_TMPFILE
to avert the hardlinks issue, but I decided it would be better to
document the hardlinks non-guarantee instead. This will allow me to
replicate the same semantics and documentation for renameat(2).
Let me know how that works out for you.

I also decided to try out two separate flags for data and metadata.
I do not find either of those flags very useful without the other, but
documenting them separately was easier, because of the fsync/fdatasync
reference.  In the end, we are trying to solve a social engineering
problem, so this is the least confusing way I could think of to describe
the new API.

The first implementation of AT_ATOMIC_METADATA is expected to be
a no-op for xfs/ext4 and probably fsync for btrfs.

The first implementation of AT_ATOMIC_DATA is expected to be
filemap_write_and_wait() for xfs/ext4 and probably fdatasync for btrfs.
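
For comparison, here is a rough sketch of what an application has to do
today to get the same result, using the O_TMPFILE + linkat() pattern
from open(2). The helper name publish_file() is hypothetical and only
for illustration. AT_ATOMIC_DATA and AT_ATOMIC_METADATA exist only in
this RFC and are not defined in any released header, so the #if branch
shows the intended usage only; in practice the fsync() fallback is what
gets compiled.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <limits.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of the patch): write `len` bytes to an
 * unnamed O_TMPFILE inode in `dir`, then link it in as `dir/name`.
 * Returns 0 on success, -1 on error with errno set.
 */
int publish_file(const char *dir, const char *name,
                 const void *data, size_t len)
{
    char proc_path[64];
    char new_path[PATH_MAX];
    const char *p = data;
    int flags = AT_SYMLINK_FOLLOW;
    int fd, saved_errno, ret = -1;

    if (snprintf(new_path, sizeof(new_path), "%s/%s", dir, name)
            >= (int)sizeof(new_path)) {
        errno = ENAMETOOLONG;
        return -1;
    }

    /* Create an inode that has no name yet. */
    fd = open(dir, O_TMPFILE | O_WRONLY, 0644);
    if (fd < 0)
        return -1;

    /* Write all of the data, retrying on short writes and EINTR. */
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            goto out;
        }
        p += n;
        len -= (size_t)n;
    }

#if defined(AT_ATOMIC_DATA) && defined(AT_ATOMIC_METADATA)
    /* Proposed in this RFC only: ask the filesystem to order the link. */
    flags |= AT_ATOMIC_DATA | AT_ATOMIC_METADATA;
#else
    /* Today's workaround: flush data and metadata before linking. */
    if (fsync(fd) < 0)
        goto out;
#endif

    /* Give the inode a name; fails with EEXIST if the name is taken. */
    snprintf(proc_path, sizeof(proc_path), "/proc/self/fd/%d", fd);
    if (linkat(AT_FDCWD, proc_path, AT_FDCWD, new_path, flags) == 0)
        ret = 0;

out:
    saved_errno = errno;
    close(fd);
    errno = saved_errno;
    return ret;
}
```

A caller would use it as publish_file("/some/dir", "result", buf, len).
Note that neither variant makes the new name itself durable; that would
still need an fsync() of the parent directory.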

Thoughts?

Amir.

[1] https://lwn.net/Articles/789038/
[2] https://lore.kernel.org/linux-fsdevel/CAOQ4uxjZm6E2TmCv8JOyQr7f-2VB0uFRy7XEp8HBHQmMdQg+6w@mail.gmail.com/

 man2/link.2 | 51 +++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 51 insertions(+)

diff --git a/man2/link.2 b/man2/link.2
index 649ba00c7..15c24703e 100644
--- a/man2/link.2
+++ b/man2/link.2
@@ -184,6 +184,57 @@ See
 .BR openat (2)
 for an explanation of the need for
 .BR linkat ().
+.TP
+.BR AT_ATOMIC_METADATA " (since Linux 5.x)"
+By default, a link operation followed by a system crash may result in the
+new file name being linked with old inode metadata, such as outdated
+timestamps or missing extended attributes.
+One way to prevent this is to call
+.BR fsync (2)
+before linking the inode, but that involves flushing of volatile disk caches.
+
+A filesystem that accepts this flag will guarantee that old inode metadata
+will not be exposed in the new linked name.
+Some filesystems may internally perform
+.BR fsync (2)
+before linking the inode to provide this guarantee,
+but often, filesystems will have a more efficient method to provide this
+guarantee without flushing volatile disk caches.
+
+A filesystem that accepts this flag does
+.BR NOT
+guarantee that the new file name will exist after a system crash, nor that the
+current inode metadata is persisted to disk.
+Specifically, if a file has hardlinks, the existence of the linked name after
+a system crash does
+.BR NOT
+guarantee that any of the other file names exist, nor that the last observed
+value of
+.I st_nlink
+(see
+.BR stat (2))
+has persisted.
+.TP
+.BR AT_ATOMIC_DATA " (since Linux 5.x)"
+By default, a link operation followed by a system crash may result in the
+new file name being linked with old data or missing data.
+One way to prevent this is to call
+.BR fdatasync (2)
+before linking the inode, but that involves flushing of volatile disk caches.
+
+A filesystem that accepts this flag will guarantee that old data
+will not be exposed in the new linked name.
+Some filesystems may internally perform
+.BR fsync (2)
+before linking the inode to provide this guarantee,
+but often, filesystems will have a more efficient method to provide this
+guarantee without flushing volatile disk caches.
+
+A filesystem that accepts this flag does
+.BR NOT
+guarantee that the new file name will exist after a system crash, nor that the
+current inode data is persisted to disk.
+.TP
 .SH RETURN VALUE
 On success, zero is returned.
 On error, \-1 is returned, and
-- 
2.17.1



end of thread, other threads:[~2019-06-03  6:17 UTC | newest]

Thread overview: 15+ messages
2019-05-27 17:26 [RFC][PATCH] link.2: AT_ATOMIC_DATA and AT_ATOMIC_METADATA Amir Goldstein
2019-05-28 20:06 ` Darrick J. Wong
2019-05-29  5:58   ` Amir Goldstein
2019-05-28 20:26 ` Theodore Ts'o
2019-05-29  5:38   ` Amir Goldstein
2019-05-31 15:21     ` Amir Goldstein
2019-05-31 16:41       ` Theodore Ts'o
2019-05-31 17:22         ` Amir Goldstein
2019-05-31 19:21           ` Theodore Ts'o
2019-05-31 22:45         ` Dave Chinner
2019-05-31 23:28           ` Dave Chinner
2019-06-01  8:01             ` Amir Goldstein
2019-06-03  4:25               ` Dave Chinner
2019-06-03  6:17                 ` Amir Goldstein
2019-06-01  7:21           ` Amir Goldstein
