* [LSF/MM/BPF TOPIC] Allowing linkat() to replace the destination @ 2020-01-17 12:49 David Howells 2020-01-17 14:33 ` Trond Myklebust 2020-01-17 14:47 ` David Howells 0 siblings, 2 replies; 36+ messages in thread From: David Howells @ 2020-01-17 12:49 UTC (permalink / raw) To: lsf-pc, Amir Goldstein, Omar Sandoval Cc: dhowells, Al Viro, Christoph Hellwig, Miklos Szeredi, linux-fsdevel It may be worth a discussion of whether linkat() could be given a flag to allow the destination to be replaced or if a new syscall should be made for this - or whether it should be disallowed entirely. A set of patches has been posted by Omar Sandoval that makes this possible: https://lore.kernel.org/linux-fsdevel/cover.1524549513.git.osandov@fb.com/ though it only includes filesystem support for btrfs. This could be useful for cachefiles: https://lore.kernel.org/linux-fsdevel/3326.1579019665@warthog.procyon.org.uk/ and overlayfs. David ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [LSF/MM/BPF TOPIC] Allowing linkat() to replace the destination 2020-01-17 12:49 [LSF/MM/BPF TOPIC] Allowing linkat() to replace the destination David Howells @ 2020-01-17 14:33 ` Trond Myklebust 2020-01-17 15:46 ` Al Viro 2020-01-17 14:47 ` David Howells 1 sibling, 1 reply; 36+ messages in thread From: Trond Myklebust @ 2020-01-17 14:33 UTC (permalink / raw) To: osandov, amir73il, dhowells, lsf-pc; +Cc: hch, miklos, linux-fsdevel, viro On Fri, 2020-01-17 at 12:49 +0000, David Howells wrote: > It may be worth a discussion of whether linkat() could be given a > flag to > allow the destination to be replaced or if a new syscall should be > made for > this - or whether it should be disallowed entirely. > > A set of patches has been posted by Omar Sandoval that makes this > possible: > > > https://lore.kernel.org/linux-fsdevel/cover.1524549513.git.osandov@fb.com/ > > though it only includes filesystem support for btrfs. > > This could be useful for cachefiles: > > > https://lore.kernel.org/linux-fsdevel/3326.1579019665@warthog.procyon.org.uk/ > > and overlayfs. > That seems to me like a "just go ahead and do it if you can justify it" kind of thing. It has plenty of precedent, and fits easily into the existing syscall, so why do we need a face-to-face discussion? -- Trond Myklebust Linux NFS client maintainer, Hammerspace trond.myklebust@hammerspace.com ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [LSF/MM/BPF TOPIC] Allowing linkat() to replace the destination 2020-01-17 14:33 ` Trond Myklebust @ 2020-01-17 15:46 ` Al Viro 2020-01-17 16:12 ` Trond Myklebust 2020-01-17 16:36 ` Omar Sandoval 0 siblings, 2 replies; 36+ messages in thread From: Al Viro @ 2020-01-17 15:46 UTC (permalink / raw) To: Trond Myklebust Cc: osandov, amir73il, dhowells, lsf-pc, hch, miklos, linux-fsdevel On Fri, Jan 17, 2020 at 02:33:01PM +0000, Trond Myklebust wrote: > On Fri, 2020-01-17 at 12:49 +0000, David Howells wrote: > > It may be worth a discussion of whether linkat() could be given a > > flag to > > allow the destination to be replaced or if a new syscall should be > > made for > > this - or whether it should be disallowed entirely. > > > > A set of patches has been posted by Omar Sandoval that makes this > > possible: > > > > > > https://lore.kernel.org/linux-fsdevel/cover.1524549513.git.osandov@fb.com/ > > > > though it only includes filesystem support for btrfs. > > > > This could be useful for cachefiles: > > > > > > https://lore.kernel.org/linux-fsdevel/3326.1579019665@warthog.procyon.org.uk/ > > > > and overlayfs. > > > > That seems to me like a "just go ahead and do it if you can justify it" > kind of thing. It has plenty of precedent, and fits easily into the > existing syscall, so why do we need a face-to-face discussion? Unfortunately, it does *not* fit easily. And IMO that's linux-abi fodder more than anything else. The problem is in coming up with sane semantics - there's a plenty of corner cases with that one. What to do when destination is a dangling symlink, for example? Or has something mounted on it (no, saying "we'll just reject directories" is not enough). What should happen when destination is already a hardlink to the same object? It's less of a horror than rename() would've been, but that's not saying much. ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [LSF/MM/BPF TOPIC] Allowing linkat() to replace the destination 2020-01-17 15:46 ` Al Viro @ 2020-01-17 16:12 ` Trond Myklebust 2020-01-17 16:48 ` Al Viro 2020-01-17 16:36 ` Omar Sandoval 1 sibling, 1 reply; 36+ messages in thread From: Trond Myklebust @ 2020-01-17 16:12 UTC (permalink / raw) To: viro; +Cc: hch, osandov, miklos, linux-fsdevel, amir73il, dhowells, lsf-pc On Fri, 2020-01-17 at 15:46 +0000, Al Viro wrote: > On Fri, Jan 17, 2020 at 02:33:01PM +0000, Trond Myklebust wrote: > > On Fri, 2020-01-17 at 12:49 +0000, David Howells wrote: > > > It may be worth a discussion of whether linkat() could be given a > > > flag to > > > allow the destination to be replaced or if a new syscall should > > > be > > > made for > > > this - or whether it should be disallowed entirely. > > > > > > A set of patches has been posted by Omar Sandoval that makes this > > > possible: > > > > > > > > > https://lore.kernel.org/linux-fsdevel/cover.1524549513.git.osandov@fb.com/ > > > > > > though it only includes filesystem support for btrfs. > > > > > > This could be useful for cachefiles: > > > > > > > > > https://lore.kernel.org/linux-fsdevel/3326.1579019665@warthog.procyon.org.uk/ > > > > > > and overlayfs. > > > > > > > That seems to me like a "just go ahead and do it if you can justify > > it" > > kind of thing. It has plenty of precedent, and fits easily into the > > existing syscall, so why do we need a face-to-face discussion? > > Unfortunately, it does *not* fit easily. And IMO that's linux-abi > fodder more > than anything else. The problem is in coming up with sane semantics > - there's > a plenty of corner cases with that one. What to do when destination > is > a dangling symlink, for example? Or has something mounted on it (no, > saying > "we'll just reject directories" is not enough). What should happen > when > destination is already a hardlink to the same object? > > It's less of a horror than rename() would've been, but that's not > saying > much. We already have precedents for all of that when handling bog-standard open(O_CREAT) (which creates the first link to the file). Yes, there is the question of choosing whether to implement O_NOFOLLOW semantics or not, but that should be dictated by the requirements of the use case. As for the "hard link on top of itself", that case is already well defined by POSIX to be a null op IIRC. What in the proposal is requiring new semantics beyond these precedents already set by open() and link() itself? -- Trond Myklebust Linux NFS client maintainer, Hammerspace trond.myklebust@hammerspace.com ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [LSF/MM/BPF TOPIC] Allowing linkat() to replace the destination 2020-01-17 16:12 ` Trond Myklebust @ 2020-01-17 16:48 ` Al Viro 0 siblings, 0 replies; 36+ messages in thread From: Al Viro @ 2020-01-17 16:48 UTC (permalink / raw) To: Trond Myklebust Cc: hch, osandov, miklos, linux-fsdevel, amir73il, dhowells, lsf-pc On Fri, Jan 17, 2020 at 04:12:28PM +0000, Trond Myklebust wrote: > > Unfortunately, it does *not* fit easily. And IMO that's linux-abi > > fodder more > > than anything else. The problem is in coming up with sane semantics > > - there's > > a plenty of corner cases with that one. What to do when destination > > is > > a dangling symlink, for example? Or has something mounted on it (no, > > saying > > "we'll just reject directories" is not enough). What should happen > > when > > destination is already a hardlink to the same object? > > > > It's less of a horror than rename() would've been, but that's not > > saying > > much. > > We already have precedents for all of that when handling bog-standard > open(O_CREAT) (which creates the first link to the file). Yes, there is > the question of choosing whether to implement O_NOFOLLOW semantics or > not, but that should be dictated by the requirements of the use case. > > As for the "hard link on top of itself", that case is already well > defined by POSIX to be a null op IIRC. Where in POSIX does it say anything about it? It is a null op for rename, but for link it's EEXIST on the general grounds. > What in the proposal is requiring new semantics beyond these precedents > already set by open() and link() itself? The fact that O_CREAT does not do anything to the existing target, perhaps? This, unless I'm seriously misunderstanding the proposal, should have the preexisting link removed. Which makes it a lot more similar to "unlink target, then link source to target, atomically" than to O_CREAT. Incidentally, echo foo >/tmp/foo echo bar >/tmp/bar ln /tmp/foo /tmp/foo2 mount --bind /tmp/foo /tmp/bar echo a >/tmp/bar cat /tmp/foo2 will print "a" - IOW, O_CREAT in the redirect of that last echo will find /tmp/bar see it overmounted (by /tmp/foo) access /tmp/foo, which happens to be the same file as /tmp/foo2 What would you want that link() variant do in similar situation (i.e. mount traversal at the end of pathname)? I can see several variants of behaviour, none of them too appealing. What should happen if target is opened by somebody? I would expect it to be treated as opened-and-unlinked (with sillyrename if fs requires that). Which is where the corner case with target already being a link to source comes from... For fuck sake, I'm not being obstructionist - if you (or David, or anyone else) is willing to come up with sane semantics (I'm _not_ talking about implementation, VFS or fs data structures, etc. - just the rules describing what the effect should it have), great, I'll be happy to help with the implementation side. As well as poking holes in said proposal (i.e. asking what should happen in such and such case). But it's really _not_ as trivial as "do by analogy with O_CREAT". I don't have any problem with discussing that over email, but latencies do suck sometimes (e.g. when discussing autofs ->d_manage() semantics, with 3-way conversation - one participant on US east coast, one in UK, one on AU west coast), so I understand why David (who'd just had exactly that lovely experience) might find an idea of doing that face-to-face appealing... ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [LSF/MM/BPF TOPIC] Allowing linkat() to replace the destination 2020-01-17 15:46 ` Al Viro 2020-01-17 16:12 ` Trond Myklebust @ 2020-01-17 16:36 ` Omar Sandoval 2020-01-17 16:59 ` Al Viro 1 sibling, 1 reply; 36+ messages in thread From: Omar Sandoval @ 2020-01-17 16:36 UTC (permalink / raw) To: Al Viro Cc: Trond Myklebust, amir73il, dhowells, lsf-pc, hch, miklos, linux-fsdevel On Fri, Jan 17, 2020 at 03:46:57PM +0000, Al Viro wrote: > On Fri, Jan 17, 2020 at 02:33:01PM +0000, Trond Myklebust wrote: > > On Fri, 2020-01-17 at 12:49 +0000, David Howells wrote: > > > It may be worth a discussion of whether linkat() could be given a > > > flag to > > > allow the destination to be replaced or if a new syscall should be > > > made for > > > this - or whether it should be disallowed entirely. > > > > > > A set of patches has been posted by Omar Sandoval that makes this > > > possible: > > > > > > > > > https://lore.kernel.org/linux-fsdevel/cover.1524549513.git.osandov@fb.com/ > > > > > > though it only includes filesystem support for btrfs. > > > > > > This could be useful for cachefiles: > > > > > > > > > https://lore.kernel.org/linux-fsdevel/3326.1579019665@warthog.procyon.org.uk/ > > > > > > and overlayfs. > > > > > > > That seems to me like a "just go ahead and do it if you can justify it" > > kind of thing. It has plenty of precedent, and fits easily into the > > existing syscall, so why do we need a face-to-face discussion? > > Unfortunately, it does *not* fit easily. And IMO that's linux-abi fodder more > than anything else. The problem is in coming up with sane semantics - there's > a plenty of corner cases with that one. What to do when destination is > a dangling symlink, for example? Or has something mounted on it (no, saying > "we'll just reject directories" is not enough). What should happen when > destination is already a hardlink to the same object? > > It's less of a horror than rename() would've been, but that's not saying > much. The semantics I implemented in my series were basically "linkat with AT_REPLACE replaces the target iff rename would replace the target". Therefore, symlinks are replaced, not followed, and mountpoints get EXDEV. In my opinion that's both sane and unsurprising. ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [LSF/MM/BPF TOPIC] Allowing linkat() to replace the destination 2020-01-17 16:36 ` Omar Sandoval @ 2020-01-17 16:59 ` Al Viro 2020-01-17 17:28 ` Omar Sandoval 0 siblings, 1 reply; 36+ messages in thread From: Al Viro @ 2020-01-17 16:59 UTC (permalink / raw) To: Omar Sandoval Cc: Trond Myklebust, amir73il, dhowells, lsf-pc, hch, miklos, linux-fsdevel On Fri, Jan 17, 2020 at 08:36:16AM -0800, Omar Sandoval wrote: > The semantics I implemented in my series were basically "linkat with > AT_REPLACE replaces the target iff rename would replace the target". > Therefore, symlinks are replaced, not followed, and mountpoints get > EXDEV. In my opinion that's both sane and unsurprising. Umm... EXDEV in rename() comes when _parents_ are on different mounts. rename() over a mountpoint is EBUSY if it has mounts in caller's namespace, but it succeeds (and detaches all mounts on the victim in any namespaces) otherwise. When are you returning EXDEV? Incidentally, mounts _are_ traversed on the link source, so what should that variant do when /tmp/foo is a mountpoint and you feed it "/tmp/foo" both for source and target? ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [LSF/MM/BPF TOPIC] Allowing linkat() to replace the destination 2020-01-17 16:59 ` Al Viro @ 2020-01-17 17:28 ` Omar Sandoval 2020-01-17 18:17 ` Al Viro 0 siblings, 1 reply; 36+ messages in thread From: Omar Sandoval @ 2020-01-17 17:28 UTC (permalink / raw) To: Al Viro Cc: Trond Myklebust, amir73il, dhowells, lsf-pc, hch, miklos, linux-fsdevel On Fri, Jan 17, 2020 at 04:59:04PM +0000, Al Viro wrote: > On Fri, Jan 17, 2020 at 08:36:16AM -0800, Omar Sandoval wrote: > > > The semantics I implemented in my series were basically "linkat with > > AT_REPLACE replaces the target iff rename would replace the target". > > Therefore, symlinks are replaced, not followed, and mountpoints get > > EXDEV. In my opinion that's both sane and unsurprising. > > Umm... EXDEV in rename() comes when _parents_ are on different mounts. > rename() over a mountpoint is EBUSY if it has mounts in caller's > namespace, but it succeeds (and detaches all mounts on the victim > in any namespaces) otherwise. > > When are you returning EXDEV? EXDEV was a thinko, the patch does what rename does: + if (is_local_mountpoint(new_dentry)) { + error = -EBUSY; + goto out; + } ... + if (target) { + dont_mount(new_dentry); + detach_mounts(new_dentry); + } Anyways, my point is that the rename semantics cover 90% of AT_REPLACE. Before I resend the patches, I'll write up the documentation and we can see what other corner cases I missed. > Incidentally, mounts _are_ traversed on > the link source, so what should that variant do when /tmp/foo is > a mountpoint and you feed it "/tmp/foo" both for source and target? EBUSY and noop both seem reasonable in this case, so we pick one and document it. ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [LSF/MM/BPF TOPIC] Allowing linkat() to replace the destination 2020-01-17 17:28 ` Omar Sandoval @ 2020-01-17 18:17 ` Al Viro 2020-01-17 20:22 ` Omar Sandoval 0 siblings, 1 reply; 36+ messages in thread From: Al Viro @ 2020-01-17 18:17 UTC (permalink / raw) To: Omar Sandoval Cc: Trond Myklebust, amir73il, dhowells, lsf-pc, hch, miklos, linux-fsdevel On Fri, Jan 17, 2020 at 09:28:55AM -0800, Omar Sandoval wrote: > On Fri, Jan 17, 2020 at 04:59:04PM +0000, Al Viro wrote: > > On Fri, Jan 17, 2020 at 08:36:16AM -0800, Omar Sandoval wrote: > > > > > The semantics I implemented in my series were basically "linkat with > > > AT_REPLACE replaces the target iff rename would replace the target". > > > Therefore, symlinks are replaced, not followed, and mountpoints get > > > EXDEV. In my opinion that's both sane and unsurprising. > > > > Umm... EXDEV in rename() comes when _parents_ are on different mounts. > > rename() over a mountpoint is EBUSY if it has mounts in caller's > > namespace, but it succeeds (and detaches all mounts on the victim > > in any namespaces) otherwise. > > > > When are you returning EXDEV? > > EXDEV was a thinko, the patch does what rename does: > > > + if (is_local_mountpoint(new_dentry)) { > + error = -EBUSY; > + goto out; > + } > > ... > > + if (target) { > + dont_mount(new_dentry); > + detach_mounts(new_dentry); > + } > > Anyways, my point is that the rename semantics cover 90% of AT_REPLACE. > Before I resend the patches, I'll write up the documentation and we can > see what other corner cases I missed. OK... rename() has a major difference from linkat(), though, in not following links (or allowing fd + empty path). link() is deeply asymmetric in treatment of pathnames - the first argument is "pathname describes a filesystem object" and the second - "pathname descripes an entry in a directory" (the link to be). rename() is not - both arguments are pathnames-as-link-specifiers. And that affects what is and what is not allowed there, so that'll need a careful look into. FWIW, currently linkat() semantics can be described simply enough * oldfd/oldname specifies an fs object; symlink traversal is optional. * newfd/newname specifies an entry in some directory. Directory must be on the same mount as the object specified by oldname. Entry must be a normal component (no empty paths allowed, no . or .. either). Trailing slashes are not allowed. There must be no entry with that name (which automatically implies that trailing symlinks are not to be followed and there can't be anything mounted on it). * the object specified by oldname must be a non-directory. * if the object is not a never-linked-yet anonymous, it must still have some links. * caller must have permission to create links in the affected directory. * append-only and immutables are not allowed (rationale: they can't be unlinked) * filesystem is allowed to fail for any reasons, as with any operation; a linkat()-specific one is having the link count overflow, but any generic error is possible (out of memory, IO error, EPERM-because-I-feel-like-that, etc.) All checks related to object in question are atomic wrt operations that add or remove links to that object. Checks on parent are atomic wrt operations modifying the parent. Neither group is atomic wrt operations modifying the _old_ parent (if there's any, in the first place). ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [LSF/MM/BPF TOPIC] Allowing linkat() to replace the destination 2020-01-17 18:17 ` Al Viro @ 2020-01-17 20:22 ` Omar Sandoval 2020-01-17 22:22 ` Al Viro 0 siblings, 1 reply; 36+ messages in thread From: Omar Sandoval @ 2020-01-17 20:22 UTC (permalink / raw) To: Al Viro Cc: Trond Myklebust, amir73il, dhowells, lsf-pc, hch, miklos, linux-fsdevel On Fri, Jan 17, 2020 at 06:17:30PM +0000, Al Viro wrote: > On Fri, Jan 17, 2020 at 09:28:55AM -0800, Omar Sandoval wrote: > > On Fri, Jan 17, 2020 at 04:59:04PM +0000, Al Viro wrote: > > > On Fri, Jan 17, 2020 at 08:36:16AM -0800, Omar Sandoval wrote: > > > > > > > The semantics I implemented in my series were basically "linkat with > > > > AT_REPLACE replaces the target iff rename would replace the target". > > > > Therefore, symlinks are replaced, not followed, and mountpoints get > > > > EXDEV. In my opinion that's both sane and unsurprising. > > > > > > Umm... EXDEV in rename() comes when _parents_ are on different mounts. > > > rename() over a mountpoint is EBUSY if it has mounts in caller's > > > namespace, but it succeeds (and detaches all mounts on the victim > > > in any namespaces) otherwise. > > > > > > When are you returning EXDEV? > > > > EXDEV was a thinko, the patch does what rename does: > > > > > > + if (is_local_mountpoint(new_dentry)) { > > + error = -EBUSY; > > + goto out; > > + } > > > > ... > > > > + if (target) { > > + dont_mount(new_dentry); > > + detach_mounts(new_dentry); > > + } > > > > Anyways, my point is that the rename semantics cover 90% of AT_REPLACE. > > Before I resend the patches, I'll write up the documentation and we can > > see what other corner cases I missed. > > OK... rename() has a major difference from linkat(), though, in not > following links (or allowing fd + empty path). link() is deeply > asymmetric in treatment of pathnames - the first argument is > "pathname describes a filesystem object" and the second - "pathname > descripes an entry in a directory" (the link to be). rename() is > not - both arguments are pathnames-as-link-specifiers. And that > affects what is and what is not allowed there, so that'll need > a careful look into. > > FWIW, currently linkat() semantics can be described simply enough > * oldfd/oldname specifies an fs object; symlink traversal is > optional. > * newfd/newname specifies an entry in some directory. Directory > must be on the same mount as the object specified by oldname. > Entry must be a normal component (no empty paths allowed, no . or .. > either). Trailing slashes are not allowed. There must be no > entry with that name (which automatically implies that trailing > symlinks are not to be followed and there can't be anything > mounted on it). > * the object specified by oldname must be a non-directory. > * if the object is not a never-linked-yet anonymous, > it must still have some links. > * caller must have permission to create links in the > affected directory. > * append-only and immutables are not allowed (rationale: > they can't be unlinked) > * filesystem is allowed to fail for any reasons, as with > any operation; a linkat()-specific one is having the link count > overflow, but any generic error is possible (out of memory, IO > error, EPERM-because-I-feel-like-that, etc.) > > All checks related to object in question are atomic wrt operations > that add or remove links to that object. Checks on parent are > atomic wrt operations modifying the parent. Neither group is > atomic wrt operations modifying the _old_ parent (if there's > any, in the first place). Since manpage troff patches aren't particularly nice to read, here is what I had in mind in a readable form: int linkat(int olddirfd, const char *oldpath, int newdirfd, const char *newpath, int flags); AT_REPLACE If newpath exists, replace it atomically. There is no point at which another process attempting to access newpath will find it missing. newpath must not be a directory. If oldpath and newpath are existing hard links referring to the same file, then this does nothing, and returns a success status. If newpath is a mount point, then this comparison considers the mount point, not the mounted file or directory. If newpath refers to a symbolic link, the link will be overwritten. ERRORS EBUSY AT_REPLACE was specified in flags, newpath is a mount point, and the mount point does not refer to the same file as oldpath. Or, AT_REPLACE was specified in flags and newpath ends with a . or .. component. EISDIR AT_REPLACE was specified in flags and newpath is an existing directory. --- Overall, I think what makes the most sense is that AT_REPLACE does not change the treatment of oldpath and is as consistent with rename(2) for the treatment of newpath. Therefore, linkat with AT_RENAME: * Doesn't traverse mounts on newpath (see below). * If newpath is a symlink, doesn't follow it. * Returns EBUSY for trailing . and .. components instead of EEXIST like you get without AT_REPLACE. * Is a noop if linking a file to a hard link of itself. The hairiest bit is that last point if there are mounts involved. As I mentioned, we probably want to handle oldpath exactly the same whether or not AT_REPLACE is used, so we will always traverse mounts and follow symlinks (modulo AT_SYMLINK_FOLLOW). However, let's consider how rename handles mounts on newpath: # cat rename.c #include <stdio.h> #include <stdlib.h> int main(int argc, char **argv) { if (argc != 3) { fprintf(stderr, "%s: SOURCE DEST\n", argv[0]); return EXIT_FAILURE; } if (rename(argv[1], argv[2]) == -1) { perror("rename"); return EXIT_FAILURE; } return EXIT_SUCCESS; } # touch a b # ln b c # mount --bind a c # ./rename b c This succeeds because rename treats the path as an entry in a directory. It doesn't traverse the mount on "c", so "c" is still considered the same file as "b". # umount c # mount --bind c a # ./rename b a rename: Device or resource busy Likewise, this fails because the underlying "a" mount point is not the same file as "b". So the rules for "same file" are "oldpath follows mounts and symlinks as appropriate" and "newpath does not follow mounts or symlinks". The asymmetry is a bit awkward, but I think it's the soundest choice. Thoughts? Any other gaps here? Thanks! ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [LSF/MM/BPF TOPIC] Allowing linkat() to replace the destination 2020-01-17 20:22 ` Omar Sandoval @ 2020-01-17 22:22 ` Al Viro 2020-01-17 23:54 ` Omar Sandoval 0 siblings, 1 reply; 36+ messages in thread From: Al Viro @ 2020-01-17 22:22 UTC (permalink / raw) To: Omar Sandoval Cc: Trond Myklebust, amir73il, dhowells, lsf-pc, hch, miklos, linux-fsdevel On Fri, Jan 17, 2020 at 12:22:19PM -0800, Omar Sandoval wrote: > Since manpage troff patches aren't particularly nice to read, here is > what I had in mind in a readable form: > > int linkat(int olddirfd, const char *oldpath, int newdirfd, > const char *newpath, int flags); > > AT_REPLACE > If newpath exists, replace it atomically. There is no > point at which another process attempting to access > newpath will find it missing. > > newpath must not be a directory. > > If oldpath and newpath are existing hard links referring > to the same file, then this does nothing, and returns a > success status. If newpath is a mount point, then this > comparison considers the mount point, not the mounted > file or directory. > > If newpath refers to a symbolic link, the link will be > overwritten. > > ERRORS > EBUSY AT_REPLACE was specified in flags, newpath is a mount > point, and the mount point does not refer to the same > file as oldpath. Or, AT_REPLACE was specified in flags > and newpath ends with a . or .. component. > EISDIR AT_REPLACE was specified in flags and newpath is an > existing directory. So are <existing directory>/. or <existing directory>/.., so I wonder why bother with -EBUSY there. As for the gaps... 1) emptypath for new. Should be an error; EINVAL, probably. Avoids arseloads of fun cases ("what if newfd/newpath refers to something unlinked?", etc., etc.) 2) mountpoint vs. mountpoint in the local namespace. See the rationale in that area for unlink()/rmdir()/rename(). 3) permission checks need to be specified 4) as for the hardlinks, I would be very careful with the language; if anything, that's probably "if the link specified by newfd/newpath points to the object specified by oldfd/oldpath..." Another thing is, as you could've seen for rename(), such "permissive" clauses tend to give surprising results. For example, put the check for "it's a hardlink" early enough, and you've suddenly gotten a situation when it *can* succeed for directory. Or on a filesystem that never allowed hardlinks (the latter is probably unavoidable). 5) what _really_ needs to be said is that other links to newpath are unaffected and so are previously opened files. 6) "atomically" is bloody vague; the explanation you've put there is not enough, IMO - in addition to "nobody sees it gone in the middle of operation" there's also "if the operation fails, the target remains undisturbed" and something along the lines of "if filesystem makes any promises about the state after dirty shutdown in the middle of rename(), the same promises should apply here". 7) how do users tell if filesystem supports that? And no, references to pathconf, Cthulhu and other equally delightful entities are not really welcome. There might be be more - like I said, it needs to go through linux-abi, fsdevel and security lists. The above is all I see right now, but I might be missing something subtle (or not so subtle). As for the implementation... VFS side should be reasonably easy (OK, it'll bloat do_symlinkat() quite a bit, since we won't be able to use filename_create() for the target in there, but it's not _that_ terrible). I'd probably prefer a separate method rather than overloading ->link(), with the old method called when the target is negative and new - when it's positive - less boilerplate and harder for an instance to get confused. They can easily use common helpers with ->link() instances, so the code duplication won't be an issue. ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [LSF/MM/BPF TOPIC] Allowing linkat() to replace the destination 2020-01-17 22:22 ` Al Viro @ 2020-01-17 23:54 ` Omar Sandoval 2020-01-18 0:47 ` Al Viro 0 siblings, 1 reply; 36+ messages in thread From: Omar Sandoval @ 2020-01-17 23:54 UTC (permalink / raw) To: Al Viro Cc: Trond Myklebust, amir73il, dhowells, lsf-pc, hch, miklos, linux-fsdevel On Fri, Jan 17, 2020 at 10:22:12PM +0000, Al Viro wrote: > On Fri, Jan 17, 2020 at 12:22:19PM -0800, Omar Sandoval wrote: > > > Since manpage troff patches aren't particularly nice to read, here is > > what I had in mind in a readable form: > > > > int linkat(int olddirfd, const char *oldpath, int newdirfd, > > const char *newpath, int flags); > > > > AT_REPLACE > > If newpath exists, replace it atomically. There is no > > point at which another process attempting to access > > newpath will find it missing. > > > > newpath must not be a directory. > > > > If oldpath and newpath are existing hard links referring > > to the same file, then this does nothing, and returns a > > success status. If newpath is a mount point, then this > > comparison considers the mount point, not the mounted > > file or directory. > > > > If newpath refers to a symbolic link, the link will be > > overwritten. > > > > ERRORS > > EBUSY AT_REPLACE was specified in flags, newpath is a mount > > point, and the mount point does not refer to the same > > file as oldpath. Or, AT_REPLACE was specified in flags > > and newpath ends with a . or .. component. > > > EISDIR AT_REPLACE was specified in flags and newpath is an > > existing directory. Thanks for taking a look. > So are <existing directory>/. or <existing directory>/.., so I wonder why > bother with -EBUSY there. This was for consistency with rename, as it happens to check for . and .. before it checks whether oldpath is a directory. I don't feel strongly either way. > As for the gaps... > 1) emptypath for new. Should be an error; EINVAL, probably. > Avoids arseloads of fun cases ("what if newfd/newpath refers to something > unlinked?", etc., etc.) linkat(2) explicitly mentions that AT_EMPTY_PATH only applies to oldpath, so this falls back to the standard ENOENT documented in path_resolution(7). > 2) mountpoint vs. mountpoint in the local namespace. See the > rationale in that area for unlink()/rmdir()/rename(). I'll add that. Side-note, unlink(2), rmdir(2), and rename(2) should probably mention this behavior, too... > 3) permission checks need to be specified I believe the only difference here vs standard linkat is that newpath must not be immutable or append-only? > 4) as for the hardlinks, I would be very careful with the language; > if anything, that's probably "if the link specified by newfd/newpath points > to the object specified by oldfd/oldpath..." It seems like there's room for elaborating on the distinction between path-to-a-file vs. path-to-a-dir-entry in path_resolution(7). In the meantime, I'll mention it explicitly. > Another thing is, as you > could've seen for rename(), such "permissive" clauses tend to give surprising > results. For example, put the check for "it's a hardlink" early enough, > and you've suddenly gotten a situation when it *can* succeed for directory. > Or on a filesystem that never allowed hardlinks (the latter is probably > unavoidable). Good point. Unless I'm missing something, the only way you'll end up with two paths referring to the same directory/file on a filesystem not supporting hardlinks is through multiple mounts of the same filesystem, and I check for the EXDEV case earlier than the same file case. I'll double-check for other cases. > 5) what _really_ needs to be said is that other links to > newpath are unaffected and so are previously opened files. Done. > 6) "atomically" is bloody vague; the explanation you've put there > is not enough, IMO - in addition to "nobody sees it gone in the middle > of operation" there's also "if the operation fails, the target remains > undisturbed" and something along the lines of "if filesystem makes any > promises about the state after dirty shutdown in the middle of rename(), > the same promises should apply here". Done. I'd be nice if I could say "refer to rename(2) for durability guarantees", but rename(2) doesn't mention it at all. > references to pathconf, Cthulhu and other equally delightful entities > are not really welcome. EOPNOTSUPP is probably the most helpful. > There might be be more - like I said, it needs to go through > linux-abi, fsdevel and security lists. The above is all I see right > now, but I might be missing something subtle (or not so subtle). Here's the updated description. If you don't see any other glaring problems, I'll update the implementation and send it out to the appropriate lists. int linkat(int olddirfd, const char *oldpath, int newdirfd, const char *newpath, int flags); AT_REPLACE If newpath exists, replace it atomically. There is no point at which another process attempting to access newpath will find it missing. If newpath exists but the operation fails, the original link specified by newpath will remain in place. This has the same durability guarantees for newpath as rename(2). If newpath is replaced, any other hard links referring to the file are unaffected. Open file descriptors for newpath are also unaffected. newpath must not be a directory. If the entry specified by newpath refers to the file specified by oldpath, linkat() does nothing and returns a success status. Note that this comparison does not follow mounts on newpath. Otherwise, newpath must not be a mount point in the local namespace. If it is a mount point in another namespace and the operation succeeds, all mounts are detached from newpath in all namespaces, as is the case for rename(2), rmdir(2), and unlink(2). If newpath refers to a symbolic link, the link will be overwritten. ERRORS EBUSY AT_REPLACE was specified in flags, newpath is a mount point in the local namespace, and the mount point does not refer to the same file as oldpath. Or, AT_REPLACE was specified in flags and newpath ends with a . or .. component. EISDIR AT_REPLACE was specified in flags and newpath is an existing directory. EOPNOTSUPP AT_REPLACE was specified in flags and the filesystem containing newpath does not support AT_REPLACE. EPERM AT_REPLACE was specified and newpath refers to an immutable or append-only file. > As for the implementation... VFS side should be reasonably easy (OK, > it'll bloat do_symlinkat() quite a bit, since we won't be able to use > filename_create() for the target in there, but it's not _that_ > terrible). Based on my previous attempt at it [1], it's not too bad. > I'd probably prefer a separate method rather than > overloading ->link(), with the old method called when the target > is negative and new - when it's positive - less boilerplate and > harder for an instance to get confused. They can easily use common > helpers with ->link() instances, so the code duplication won't > be an issue. Agreed, thanks again! 1: https://lore.kernel.org/linux-fsdevel/eac9480f80c689504148b5d658ee4218cc1e421e.1524549513.git.osandov@fb.com/ ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [LSF/MM/BPF TOPIC] Allowing linkat() to replace the destination 2020-01-17 23:54 ` Omar Sandoval @ 2020-01-18 0:47 ` Al Viro 2020-01-18 1:17 ` Omar Sandoval 0 siblings, 1 reply; 36+ messages in thread From: Al Viro @ 2020-01-18 0:47 UTC (permalink / raw) To: Omar Sandoval Cc: Trond Myklebust, amir73il, dhowells, lsf-pc, hch, miklos, linux-fsdevel On Fri, Jan 17, 2020 at 03:54:44PM -0800, Omar Sandoval wrote: > > 3) permission checks need to be specified > > I believe the only difference here vs standard linkat is that newpath > must not be immutable or append-only? I would bloody hope not - at the very least you want sticky bit on parent to have effect, same as with rename()/rmdir()/unlink()... > > references to pathconf, Cthulhu and other equally delightful entities > > are not really welcome. > > EOPNOTSUPP is probably the most helpful. Umm... What would you feed it, though? You need to get past your "links to the same file, do nothing" escape... > Based on my previous attempt at it [1], it's not too bad. + error = may_delete(dir, new_dentry, d_is_dir(old_dentry)); Why bother with d_is_dir(), when you are going to reject directories anyway? + if (dir->i_op->link) + error = dir->i_op->link(old_dentry, dir, new_dentry); + else + error = dir->i_op->link2(old_dentry, dir, new_dentry, flags); + if (error) + goto out; + No. This is completely wrong; just make it ->link_replace() and be done with that; no extra arguments and *always* the same conditions wrt positive/negative. One of the reasons why ->rename() tends to be ugly (and a source of quite a few bugs over years) are those "if target is positive/if target is negative" scattered over the instances. Make the choice conditional upon the positivity of target. And you don't need to reproduce every quirk of rename() error values. Really. Unless you really intend to have userland do a loop of linkat(2) attempts (a-la mkstemp(3)), followed by rename(2) for fallback... ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [LSF/MM/BPF TOPIC] Allowing linkat() to replace the destination 2020-01-18 0:47 ` Al Viro @ 2020-01-18 1:17 ` Omar Sandoval 2020-01-18 2:20 ` Al Viro 2020-01-22 9:53 ` David Howells 0 siblings, 2 replies; 36+ messages in thread From: Omar Sandoval @ 2020-01-18 1:17 UTC (permalink / raw) To: Al Viro Cc: Trond Myklebust, amir73il, dhowells, lsf-pc, hch, miklos, linux-fsdevel On Sat, Jan 18, 2020 at 12:47:38AM +0000, Al Viro wrote: > On Fri, Jan 17, 2020 at 03:54:44PM -0800, Omar Sandoval wrote: > > > > 3) permission checks need to be specified > > > > I believe the only difference here vs standard linkat is that newpath > > must not be immutable or append-only? > > I would bloody hope not - at the very least you want sticky bit on parent > to have effect, same as with rename()/rmdir()/unlink()... Right, I should've reread may_delete(). I'll document that, too. > > > references to pathconf, Cthulhu and other equally delightful entities > > > are not really welcome. > > > > EOPNOTSUPP is probably the most helpful. > > Umm... What would you feed it, though? You need to get past your > "links to the same file, do nothing" escape... I think what you're getting at is that we can make this easier by failing linkat AT_REPLACE very early if the filesystem doesn't have a ->link_replace(). Namely, if the filesystem doesn't support AT_REPLACE but we still allow the "same file" or "newpath doesn't exist" cases to succeed, then feature detection gets annoying. As long as that's right, then applications can do the usual "try the new feature or fall back" pattern that they do for fallocate modes and such. > > Based on my previous attempt at it [1], it's not too bad. > > + error = may_delete(dir, new_dentry, d_is_dir(old_dentry)); > > Why bother with d_is_dir(), when you are going to reject directories > anyway? > > + if (dir->i_op->link) > + error = dir->i_op->link(old_dentry, dir, new_dentry); > + else > + error = dir->i_op->link2(old_dentry, dir, new_dentry, flags); > + if (error) > + goto out; > + > > No. This is completely wrong; just make it ->link_replace() and be done > with that; no extra arguments and *always* the same conditions wrt > positive/negative. One of the reasons why ->rename() tends to be > ugly (and a source of quite a few bugs over years) are those "if > target is positive/if target is negative" scattered over the instances. > > Make the choice conditional upon the positivity of target. Yup, you already convinced me that ->link_replace() is better in your last email. > And you don't need to reproduce every quirk of rename() error values. > Really. Unless you really intend to have userland do a loop of > linkat(2) attempts (a-la mkstemp(3)), followed by rename(2) for > fallback... Understood, thanks. I'll get this all cleaned up and resent next week. ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [LSF/MM/BPF TOPIC] Allowing linkat() to replace the destination 2020-01-18 1:17 ` Omar Sandoval @ 2020-01-18 2:20 ` Al Viro 2020-01-21 23:05 ` Omar Sandoval 2020-01-22 9:53 ` David Howells 1 sibling, 1 reply; 36+ messages in thread From: Al Viro @ 2020-01-18 2:20 UTC (permalink / raw) To: Omar Sandoval Cc: Trond Myklebust, amir73il, dhowells, lsf-pc, hch, miklos, linux-fsdevel On Fri, Jan 17, 2020 at 05:17:34PM -0800, Omar Sandoval wrote: > > No. This is completely wrong; just make it ->link_replace() and be done > > with that; no extra arguments and *always* the same conditions wrt > > positive/negative. One of the reasons why ->rename() tends to be > > ugly (and a source of quite a few bugs over years) are those "if > > target is positive/if target is negative" scattered over the instances. > > > > Make the choice conditional upon the positivity of target. > > Yup, you already convinced me that ->link_replace() is better in your > last email. FWIW, that might be not so simple ;-/ Reason: NFS-like stuff. Client sees a negative in cache; the problem is how to decide whether to tell the server "OK, I want normal link()" vs. "if it turns out that someone has created it by the time you see the request, give do a replacing link". Sure, if could treat ->link() telling you -EEXIST as "OK, repeat it with ->link_replace(), then", but that's an extra roundtrip... Hell knows... I would really like to avoid any kind of ->atomic_open() redux ;-/ ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [LSF/MM/BPF TOPIC] Allowing linkat() to replace the destination 2020-01-18 2:20 ` Al Viro @ 2020-01-21 23:05 ` Omar Sandoval 2020-01-22 6:57 ` Amir Goldstein 0 siblings, 1 reply; 36+ messages in thread From: Omar Sandoval @ 2020-01-21 23:05 UTC (permalink / raw) To: Al Viro Cc: Trond Myklebust, amir73il, dhowells, lsf-pc, hch, miklos, linux-fsdevel On Sat, Jan 18, 2020 at 02:20:32AM +0000, Al Viro wrote: > On Fri, Jan 17, 2020 at 05:17:34PM -0800, Omar Sandoval wrote: > > > No. This is completely wrong; just make it ->link_replace() and be done > > > with that; no extra arguments and *always* the same conditions wrt > > > positive/negative. One of the reasons why ->rename() tends to be > > > ugly (and a source of quite a few bugs over years) are those "if > > > target is positive/if target is negative" scattered over the instances. > > > > > > Make the choice conditional upon the positivity of target. > > > > Yup, you already convinced me that ->link_replace() is better in your > > last email. > > FWIW, that might be not so simple ;-/ Reason: NFS-like stuff. Client > sees a negative in cache; the problem is how to decide whether to > tell the server "OK, I want normal link()" vs. "if it turns out that > someone has created it by the time you see the request, give do > a replacing link". Sure, if could treat ->link() telling you -EEXIST > as "OK, repeat it with ->link_replace(), then", but that's an extra > roundtrip... So that's a point in favor of ->link(). But then if we overload ->link() instead of adding ->link_replace() and we want EOPNOTSUPP to fail fast, we need to add something like FMODE_SUPPORTS_AT_REPLACE. Some options I see are: 1. Go with ->link_replace() until network filesystem specs support AT_REPLACE. That would be a bit of a mess down the line, though. 2. Stick with ->link(), let the filesystem implementations deal with the positive targets, and add FMODE_SUPPORTS_AT_REPLACE so that feature detection remains easy for userspace. 3. Option 2, but don't bother with FMODE_SUPPORTS_AT_REPLACE. FWIW, there is precendent for option 3: RENAME_EXCHANGE. That has the same "files are the same" noop condition, and we don't know whether RENAME_EXCHANGE is supported until ->rename(). A cursory search shows that applications using RENAME_EXCHANGE try it and fall back to a non-atomic exchange on EINVAL. They could do the exact same thing for AT_REPLACE. None of it is elegant, but which approach would you hate the least? ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [LSF/MM/BPF TOPIC] Allowing linkat() to replace the destination 2020-01-21 23:05 ` Omar Sandoval @ 2020-01-22 6:57 ` Amir Goldstein 2020-01-22 22:10 ` Omar Sandoval 0 siblings, 1 reply; 36+ messages in thread From: Amir Goldstein @ 2020-01-22 6:57 UTC (permalink / raw) To: Omar Sandoval Cc: Al Viro, Trond Myklebust, dhowells, lsf-pc, hch, miklos, linux-fsdevel, Darrick J. Wong On Wed, Jan 22, 2020 at 1:05 AM Omar Sandoval <osandov@osandov.com> wrote: > > On Sat, Jan 18, 2020 at 02:20:32AM +0000, Al Viro wrote: > > On Fri, Jan 17, 2020 at 05:17:34PM -0800, Omar Sandoval wrote: > > > > No. This is completely wrong; just make it ->link_replace() and be done > > > > with that; no extra arguments and *always* the same conditions wrt > > > > positive/negative. One of the reasons why ->rename() tends to be > > > > ugly (and a source of quite a few bugs over years) are those "if > > > > target is positive/if target is negative" scattered over the instances. > > > > > > > > Make the choice conditional upon the positivity of target. > > > > > > Yup, you already convinced me that ->link_replace() is better in your > > > last email. > > > > FWIW, that might be not so simple ;-/ Reason: NFS-like stuff. Client > > sees a negative in cache; the problem is how to decide whether to > > tell the server "OK, I want normal link()" vs. "if it turns out that > > someone has created it by the time you see the request, give do > > a replacing link". Sure, if could treat ->link() telling you -EEXIST > > as "OK, repeat it with ->link_replace(), then", but that's an extra > > roundtrip... > > So that's a point in favor of ->link(). But then if we overload ->link() > instead of adding ->link_replace() and we want EOPNOTSUPP to fail fast, > we need to add something like FMODE_SUPPORTS_AT_REPLACE. > > Some options I see are: > > 1. Go with ->link_replace() until network filesystem specs support > AT_REPLACE. That would be a bit of a mess down the line, though. > 2. Stick with ->link(), let the filesystem implementations deal with the > positive targets, and add FMODE_SUPPORTS_AT_REPLACE so that feature > detection remains easy for userspace. "detection remains easy..." why is this important? Do you know of a userspace application that would have a problem checking if AT_REPLACE works, fall back to whatever and never try it ever again? Besides, when said application tried to open an O_TMPFILE and fail, it will have already detected a lot of the unsupported cases. Sorry for not reading all the thread again, some API questions: - We intend to allow AT_REPLACE only with O_TMPFILE src. Right? - Does AT_REPLACE assert that destination is positive? and if so why? The functionality that is complement to atomic rename would be atomic link, destination could be positive or negative, but end results will be that destination is positive with new inode. With those semantics, ->link_replace() makes much less sense IMO. > 3. Option 2, but don't bother with FMODE_SUPPORTS_AT_REPLACE. > > FWIW, there is precendent for option 3: RENAME_EXCHANGE. That has the > same "files are the same" noop condition, and we don't know whether > RENAME_EXCHANGE is supported until ->rename(). A cursory search shows > that applications using RENAME_EXCHANGE try it and fall back to a > non-atomic exchange on EINVAL. They could do the exact same thing for > AT_REPLACE. > That sounds like the most reasonable approach to me. Let's not over complicate. If you find that creates too much generic logic in ->link(), you can take the approach Darrick employed with generic_remap_file_range_prep() for filesystems that want to support AT_REPLACE. All other fs just need to check for valid flags mask, like the ->rename() precedent. Another side discussion about passing AT_ flags down to filesystems. Traditionally, that was never done, until AT_STATX_ mixed vfs flags with filesystem flags on the same AT_ namespace. Now we have linkat() syscall that can take only AT_ vfs flags and renameat2() syscall that can take only RENAME_ filesystem flags not from the AT_ namespace. I feel that the situation of having AT_REPLACE API along with RENAME_EXCHANGE and RENAME_NOREPLACE is a bit awkward and some standardization is in order. According to include/uapi/linux/fcntl.h, there is no numeric collision between the RENAME_ flag namepsace and AT_ flags namespace, although I do find it suspicious that AT_ flags start at 0x100... Could we define AT_RENAME_xxx RENAME_xxx flags and name the new flag AT_LINK_REPLACE, so it is a bit more clear that the flag is specific to link(2) syscall and not vfs generic AT_ flag. Thanks, Amir. ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [LSF/MM/BPF TOPIC] Allowing linkat() to replace the destination 2020-01-22 6:57 ` Amir Goldstein @ 2020-01-22 22:10 ` Omar Sandoval 2020-01-23 3:47 ` Al Viro 0 siblings, 1 reply; 36+ messages in thread From: Omar Sandoval @ 2020-01-22 22:10 UTC (permalink / raw) To: Amir Goldstein Cc: Al Viro, Trond Myklebust, dhowells, lsf-pc, hch, miklos, linux-fsdevel, Darrick J. Wong On Wed, Jan 22, 2020 at 08:57:01AM +0200, Amir Goldstein wrote: > On Wed, Jan 22, 2020 at 1:05 AM Omar Sandoval <osandov@osandov.com> wrote: > > > > On Sat, Jan 18, 2020 at 02:20:32AM +0000, Al Viro wrote: > > > On Fri, Jan 17, 2020 at 05:17:34PM -0800, Omar Sandoval wrote: > > > > > No. This is completely wrong; just make it ->link_replace() and be done > > > > > with that; no extra arguments and *always* the same conditions wrt > > > > > positive/negative. One of the reasons why ->rename() tends to be > > > > > ugly (and a source of quite a few bugs over years) are those "if > > > > > target is positive/if target is negative" scattered over the instances. > > > > > > > > > > Make the choice conditional upon the positivity of target. > > > > > > > > Yup, you already convinced me that ->link_replace() is better in your > > > > last email. > > > > > > FWIW, that might be not so simple ;-/ Reason: NFS-like stuff. Client > > > sees a negative in cache; the problem is how to decide whether to > > > tell the server "OK, I want normal link()" vs. "if it turns out that > > > someone has created it by the time you see the request, give do > > > a replacing link". Sure, if could treat ->link() telling you -EEXIST > > > as "OK, repeat it with ->link_replace(), then", but that's an extra > > > roundtrip... > > > > So that's a point in favor of ->link(). But then if we overload ->link() > > instead of adding ->link_replace() and we want EOPNOTSUPP to fail fast, > > we need to add something like FMODE_SUPPORTS_AT_REPLACE. > > > > Some options I see are: > > > > 1. Go with ->link_replace() until network filesystem specs support > > AT_REPLACE. That would be a bit of a mess down the line, though. > > 2. Stick with ->link(), let the filesystem implementations deal with the > > positive targets, and add FMODE_SUPPORTS_AT_REPLACE so that feature > > detection remains easy for userspace. > > "detection remains easy..." why is this important? As I mentioned, I don't think it's necessary given the precedent. However, Al voiced some concern over this earlier in the thread: --- > > > 7) how do users tell if filesystem supports that? And no, > > >references to pathconf, Cthulhu and other equally delightful entities > > >are not really welcome. > > > > EOPNOTSUPP is probably the most helpful. > > Umm... What would you feed it, though? You need to get past your > "links to the same file, do nothing" escape... --- > Do you know of a userspace application that would have a problem checking > if AT_REPLACE works, fall back to whatever and never try it ever again? > > Besides, when said application tried to open an O_TMPFILE and fail, it > will have already detected a lot of the unsupported cases. > Sorry for not reading all the thread again, some API questions: > - We intend to allow AT_REPLACE only with O_TMPFILE src. Right? I wasn't planning on having that restriction. It's not too much effort for filesystems to support it for normal files, so I wouldn't want to place an artificial restriction on a useful primitive. > - Does AT_REPLACE assert that destination is positive? and if so why? No, it should work like a normal link() if the destination doesn't exist. > The functionality that is complement to atomic rename would be atomic > link, destination could be positive or negative, but end results will be > that destination is positive with new inode. > With those semantics, ->link_replace() makes much less sense IMO. > > > 3. Option 2, but don't bother with FMODE_SUPPORTS_AT_REPLACE. > > > > FWIW, there is precendent for option 3: RENAME_EXCHANGE. That has the > > same "files are the same" noop condition, and we don't know whether > > RENAME_EXCHANGE is supported until ->rename(). A cursory search shows > > that applications using RENAME_EXCHANGE try it and fall back to a > > non-atomic exchange on EINVAL. They could do the exact same thing for > > AT_REPLACE. > > > > That sounds like the most reasonable approach to me. Let's not over complicate. > If you find that creates too much generic logic in ->link(), you can take > the approach Darrick employed with generic_remap_file_range_prep() for > filesystems that want to support AT_REPLACE. All other fs just need to check > for valid flags mask, like the ->rename() precedent. > > Another side discussion about passing AT_ flags down to filesystems. > Traditionally, that was never done, until AT_STATX_ mixed vfs flags > with filesystem flags on the same AT_ namespace. > Now we have linkat() syscall that can take only AT_ vfs flags and > renameat2() syscall that can take only RENAME_ filesystem flags not > from the AT_ namespace. > I feel that the situation of having AT_REPLACE API along with > RENAME_EXCHANGE and RENAME_NOREPLACE is a bit awkward > and some standardization is in order. > > According to include/uapi/linux/fcntl.h, there is no numeric collision > between the RENAME_ flag namepsace and AT_ flags namespace, > although I do find it suspicious that AT_ flags start at 0x100... > Could we define AT_RENAME_xxx RENAME_xxx flags and name the > new flag AT_LINK_REPLACE, so it is a bit more clear that the flag > is specific to link(2) syscall and not vfs generic AT_ flag. Sure, I'll rename it to AT_LINK_REPLACE. AT_RENAME_xxx is probably for a separate series. ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [LSF/MM/BPF TOPIC] Allowing linkat() to replace the destination 2020-01-22 22:10 ` Omar Sandoval @ 2020-01-23 3:47 ` Al Viro 2020-01-23 7:16 ` Dave Chinner 2020-01-28 14:35 ` David Howells 0 siblings, 2 replies; 36+ messages in thread From: Al Viro @ 2020-01-23 3:47 UTC (permalink / raw) To: Omar Sandoval Cc: Amir Goldstein, Trond Myklebust, dhowells, lsf-pc, hch, miklos, linux-fsdevel, Darrick J. Wong On Wed, Jan 22, 2020 at 02:10:03PM -0800, Omar Sandoval wrote: > > Sorry for not reading all the thread again, some API questions: > > - We intend to allow AT_REPLACE only with O_TMPFILE src. Right? > > I wasn't planning on having that restriction. It's not too much effort > for filesystems to support it for normal files, so I wouldn't want to > place an artificial restriction on a useful primitive. I'm not sure; that's how we ended up with the unspeakable APIs like rename(2), after all... Said that, in this particular case I would suggest allowing it for any source - what matters is not the _origin_ of that thing, but the current state. And linkat() (including the already existing variant) can change that at any time... ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [LSF/MM/BPF TOPIC] Allowing linkat() to replace the destination 2020-01-23 3:47 ` Al Viro @ 2020-01-23 7:16 ` Dave Chinner 2020-01-23 7:47 ` Amir Goldstein 2020-01-28 1:27 ` Omar Sandoval 2020-01-28 14:35 ` David Howells 1 sibling, 2 replies; 36+ messages in thread From: Dave Chinner @ 2020-01-23 7:16 UTC (permalink / raw) To: Al Viro Cc: Omar Sandoval, Amir Goldstein, Trond Myklebust, dhowells, lsf-pc, hch, miklos, linux-fsdevel, Darrick J. Wong On Thu, Jan 23, 2020 at 03:47:45AM +0000, Al Viro wrote: > On Wed, Jan 22, 2020 at 02:10:03PM -0800, Omar Sandoval wrote: > > > > Sorry for not reading all the thread again, some API questions: > > > - We intend to allow AT_REPLACE only with O_TMPFILE src. Right? > > > > I wasn't planning on having that restriction. It's not too much effort > > for filesystems to support it for normal files, so I wouldn't want to > > place an artificial restriction on a useful primitive. > > I'm not sure; that's how we ended up with the unspeakable APIs like > rename(2), after all... Yet it is just rename(2) with the serial numbers filed off - complete with all the same data vs metadata ordering problems that rename(2) comes along with. i.e. it needs fsync to guarantee data integrity of the source file before the linkat() call is made. If we can forsee that users are going to complain that linkat(AT_REPLACE) using O_TMPFILE files is not atomic because it leaves zero length files behind after a crash just like rename() does, then we haven't really improved anything at all... And, really, I don't think anyone wants another API that requires multiple fsync calls to use correctly for crash-safe file replacement, let alone try to teach people who still cant rename a file safely how to use it.... Cheers, Dave. -- Dave Chinner david@fromorbit.com ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [LSF/MM/BPF TOPIC] Allowing linkat() to replace the destination 2020-01-23 7:16 ` Dave Chinner @ 2020-01-23 7:47 ` Amir Goldstein 2020-01-24 21:25 ` Dave Chinner 2020-01-28 1:27 ` Omar Sandoval 1 sibling, 1 reply; 36+ messages in thread From: Amir Goldstein @ 2020-01-23 7:47 UTC (permalink / raw) To: Dave Chinner Cc: Al Viro, Omar Sandoval, Trond Myklebust, dhowells, lsf-pc, hch, miklos, linux-fsdevel, Darrick J. Wong On Thu, Jan 23, 2020 at 9:16 AM Dave Chinner <david@fromorbit.com> wrote: > > On Thu, Jan 23, 2020 at 03:47:45AM +0000, Al Viro wrote: > > On Wed, Jan 22, 2020 at 02:10:03PM -0800, Omar Sandoval wrote: > > > > > > Sorry for not reading all the thread again, some API questions: > > > > - We intend to allow AT_REPLACE only with O_TMPFILE src. Right? > > > > > > I wasn't planning on having that restriction. It's not too much effort > > > for filesystems to support it for normal files, so I wouldn't want to > > > place an artificial restriction on a useful primitive. > > I have too many gray hairs each one for implementing a "useful primitive" that nobody asked for and bare the consequences. Your introduction to AT_REPLACE uses O_TMPFILE. I see no other sane use of the interface. > > I'm not sure; that's how we ended up with the unspeakable APIs like > > rename(2), after all... > > Yet it is just rename(2) with the serial numbers filed off - > complete with all the same data vs metadata ordering problems that > rename(2) comes along with. i.e. it needs fsync to guarantee data > integrity of the source file before the linkat() call is made. > > If we can forsee that users are going to complain that > linkat(AT_REPLACE) using O_TMPFILE files is not atomic because it > leaves zero length files behind after a crash just like rename() > does, then we haven't really improved anything at all... > > And, really, I don't think anyone wants another API that requires > multiple fsync calls to use correctly for crash-safe file > replacement, let alone try to teach people who still cant rename a > file safely how to use it.... > Are you suggesting that AT_LINK_REPLACE should have some of the semantics I posted in this RFC for AT_ATOMIC_xxx: https://lore.kernel.org/linux-fsdevel/20190527172655.9287-1-amir73il@gmail.com/ I admit I did not follow up on performance benchmarks that you asked on that thread, but I also see the value in a simple API as opposed to the AIO_FSYNC scheme that you proposed. Thanks, Amir. ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [LSF/MM/BPF TOPIC] Allowing linkat() to replace the destination 2020-01-23 7:47 ` Amir Goldstein @ 2020-01-24 21:25 ` Dave Chinner 2020-01-31 5:24 ` Darrick J. Wong 0 siblings, 1 reply; 36+ messages in thread From: Dave Chinner @ 2020-01-24 21:25 UTC (permalink / raw) To: Amir Goldstein Cc: Al Viro, Omar Sandoval, Trond Myklebust, dhowells, lsf-pc, hch, miklos, linux-fsdevel, Darrick J. Wong On Thu, Jan 23, 2020 at 09:47:30AM +0200, Amir Goldstein wrote: > On Thu, Jan 23, 2020 at 9:16 AM Dave Chinner <david@fromorbit.com> wrote: > > > > On Thu, Jan 23, 2020 at 03:47:45AM +0000, Al Viro wrote: > > > On Wed, Jan 22, 2020 at 02:10:03PM -0800, Omar Sandoval wrote: > > > > > > > > Sorry for not reading all the thread again, some API questions: > > > > > - We intend to allow AT_REPLACE only with O_TMPFILE src. Right? > > > > > > > > I wasn't planning on having that restriction. It's not too much effort > > > > for filesystems to support it for normal files, so I wouldn't want to > > > > place an artificial restriction on a useful primitive. > > > > > I have too many gray hairs each one for implementing a "useful primitive" > that nobody asked for and bare the consequences. > Your introduction to AT_REPLACE uses O_TMPFILE. > I see no other sane use of the interface. > > > > I'm not sure; that's how we ended up with the unspeakable APIs like > > > rename(2), after all... > > > > Yet it is just rename(2) with the serial numbers filed off - > > complete with all the same data vs metadata ordering problems that > > rename(2) comes along with. i.e. it needs fsync to guarantee data > > integrity of the source file before the linkat() call is made. > > > > If we can forsee that users are going to complain that > > linkat(AT_REPLACE) using O_TMPFILE files is not atomic because it > > leaves zero length files behind after a crash just like rename() > > does, then we haven't really improved anything at all... > > > > And, really, I don't think anyone wants another API that requires > > multiple fsync calls to use correctly for crash-safe file > > replacement, let alone try to teach people who still cant rename a > > file safely how to use it.... > > > > Are you suggesting that AT_LINK_REPLACE should have some of > the semantics I posted in this RFC for AT_ATOMIC_xxx: > https://lore.kernel.org/linux-fsdevel/20190527172655.9287-1-amir73il@gmail.com/ Not directly. All I'm pointing out is that data integrity guarantees of AT_LINK_REPLACE are yet another aspect of this new feature that has not yet been specified or documented at all. And in pointing this out, I'm making an observation that the rename(2) behaviour which everyone seems to be assuming this function will replicate is a terrible model to copy/reinvent. Addressing this problem is something for the people pushing for AT_LINK_REPLACE to solve, not me.... Cheers, Dave. -- Dave Chinner david@fromorbit.com ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [LSF/MM/BPF TOPIC] Allowing linkat() to replace the destination 2020-01-24 21:25 ` Dave Chinner @ 2020-01-31 5:24 ` Darrick J. Wong 2020-01-31 5:29 ` hch 2020-01-31 7:00 ` Amir Goldstein 0 siblings, 2 replies; 36+ messages in thread From: Darrick J. Wong @ 2020-01-31 5:24 UTC (permalink / raw) To: Dave Chinner Cc: Amir Goldstein, Al Viro, Omar Sandoval, Trond Myklebust, dhowells, lsf-pc, hch, miklos, linux-fsdevel On Sat, Jan 25, 2020 at 08:25:46AM +1100, Dave Chinner wrote: > On Thu, Jan 23, 2020 at 09:47:30AM +0200, Amir Goldstein wrote: > > On Thu, Jan 23, 2020 at 9:16 AM Dave Chinner <david@fromorbit.com> wrote: > > > > > > On Thu, Jan 23, 2020 at 03:47:45AM +0000, Al Viro wrote: > > > > On Wed, Jan 22, 2020 at 02:10:03PM -0800, Omar Sandoval wrote: > > > > > > > > > > Sorry for not reading all the thread again, some API questions: > > > > > > - We intend to allow AT_REPLACE only with O_TMPFILE src. Right? > > > > > > > > > > I wasn't planning on having that restriction. It's not too much effort > > > > > for filesystems to support it for normal files, so I wouldn't want to > > > > > place an artificial restriction on a useful primitive. > > > > > > > > I have too many gray hairs each one for implementing a "useful primitive" > > that nobody asked for and bare the consequences. > > Your introduction to AT_REPLACE uses O_TMPFILE. > > I see no other sane use of the interface. > > > > > > I'm not sure; that's how we ended up with the unspeakable APIs like > > > > rename(2), after all... > > > > > > Yet it is just rename(2) with the serial numbers filed off - > > > complete with all the same data vs metadata ordering problems that > > > rename(2) comes along with. i.e. it needs fsync to guarantee data > > > integrity of the source file before the linkat() call is made. > > > > > > If we can forsee that users are going to complain that > > > linkat(AT_REPLACE) using O_TMPFILE files is not atomic because it > > > leaves zero length files behind after a crash just like rename() > > > does, then we haven't really improved anything at all... > > > > > > And, really, I don't think anyone wants another API that requires > > > multiple fsync calls to use correctly for crash-safe file > > > replacement, let alone try to teach people who still cant rename a > > > file safely how to use it.... > > > > > > > Are you suggesting that AT_LINK_REPLACE should have some of > > the semantics I posted in this RFC for AT_ATOMIC_xxx: > > https://lore.kernel.org/linux-fsdevel/20190527172655.9287-1-amir73il@gmail.com/ > > Not directly. > > All I'm pointing out is that data integrity guarantees of > AT_LINK_REPLACE are yet another aspect of this new feature that > has not yet been specified or documented at all. > > And in pointing this out, I'm making an observation that the > rename(2) behaviour which everyone seems to be assuming this > function will replicate is a terrible model to copy/reinvent. > > Addressing this problem is something for the people pushing for > AT_LINK_REPLACE to solve, not me.... Or the grumpy maintainer who will have to digest all of this. Can we update the documentation to admit that many people will probably want to use this (and rename) as atomic swap operations? "The filesystem will commit the data and metadata of all files and directories involved in the link operation to stable storage before the call returns." And finally add a flag: "AT_LINK_EATMYDATA: If specified, the filesystem may opt out of committing anything to disk." (Or a prctl(PR_EATMYDATA) to make it process wide, idk.) --D > Cheers, > > Dave. > -- > Dave Chinner > david@fromorbit.com ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [LSF/MM/BPF TOPIC] Allowing linkat() to replace the destination 2020-01-31 5:24 ` Darrick J. Wong @ 2020-01-31 5:29 ` hch 2020-01-31 7:00 ` Amir Goldstein 1 sibling, 0 replies; 36+ messages in thread From: hch @ 2020-01-31 5:29 UTC (permalink / raw) To: Darrick J. Wong Cc: Dave Chinner, Amir Goldstein, Al Viro, Omar Sandoval, Trond Myklebust, dhowells, lsf-pc, hch, miklos, linux-fsdevel On Thu, Jan 30, 2020 at 09:24:54PM -0800, Darrick J. Wong wrote: > Or the grumpy maintainer who will have to digest all of this. > > Can we update the documentation to admit that many people will probably > want to use this (and rename) as atomic swap operations? > > "The filesystem will commit the data and metadata of all files and > directories involved in the link operation to stable storage before the > call returns." That sounds horrible, because it is so different from any other metadata operation. Maybe requiring fsync to stabilize metadata ops wasn't the best idea ever, but having some varinats of linkat behave from others is just stupid. We've been beating the fact that you shall fsync into peoples heads for years. No reason to make an exception for an obscure operation now. ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [LSF/MM/BPF TOPIC] Allowing linkat() to replace the destination 2020-01-31 5:24 ` Darrick J. Wong 2020-01-31 5:29 ` hch @ 2020-01-31 7:00 ` Amir Goldstein 2020-01-31 20:33 ` Omar Sandoval 1 sibling, 1 reply; 36+ messages in thread From: Amir Goldstein @ 2020-01-31 7:00 UTC (permalink / raw) To: Darrick J. Wong Cc: Dave Chinner, Al Viro, Omar Sandoval, Trond Myklebust, dhowells, lsf-pc, hch, miklos, linux-fsdevel, Jan Kara On Fri, Jan 31, 2020 at 7:27 AM Darrick J. Wong <darrick.wong@oracle.com> wrote: > > On Sat, Jan 25, 2020 at 08:25:46AM +1100, Dave Chinner wrote: > > On Thu, Jan 23, 2020 at 09:47:30AM +0200, Amir Goldstein wrote: > > > On Thu, Jan 23, 2020 at 9:16 AM Dave Chinner <david@fromorbit.com> wrote: > > > > > > > > On Thu, Jan 23, 2020 at 03:47:45AM +0000, Al Viro wrote: > > > > > On Wed, Jan 22, 2020 at 02:10:03PM -0800, Omar Sandoval wrote: > > > > > > > > > > > > Sorry for not reading all the thread again, some API questions: > > > > > > > - We intend to allow AT_REPLACE only with O_TMPFILE src. Right? > > > > > > > > > > > > I wasn't planning on having that restriction. It's not too much effort > > > > > > for filesystems to support it for normal files, so I wouldn't want to > > > > > > place an artificial restriction on a useful primitive. > > > > > > > > > > > I have too many gray hairs each one for implementing a "useful primitive" > > > that nobody asked for and bare the consequences. > > > Your introduction to AT_REPLACE uses O_TMPFILE. > > > I see no other sane use of the interface. > > > > > > > > I'm not sure; that's how we ended up with the unspeakable APIs like > > > > > rename(2), after all... > > > > > > > > Yet it is just rename(2) with the serial numbers filed off - > > > > complete with all the same data vs metadata ordering problems that > > > > rename(2) comes along with. i.e. it needs fsync to guarantee data > > > > integrity of the source file before the linkat() call is made. > > > > > > > > If we can forsee that users are going to complain that > > > > linkat(AT_REPLACE) using O_TMPFILE files is not atomic because it > > > > leaves zero length files behind after a crash just like rename() > > > > does, then we haven't really improved anything at all... > > > > > > > > And, really, I don't think anyone wants another API that requires > > > > multiple fsync calls to use correctly for crash-safe file > > > > replacement, let alone try to teach people who still cant rename a > > > > file safely how to use it.... > > > > > > > > > > Are you suggesting that AT_LINK_REPLACE should have some of > > > the semantics I posted in this RFC for AT_ATOMIC_xxx: > > > https://lore.kernel.org/linux-fsdevel/20190527172655.9287-1-amir73il@gmail.com/ > > > > Not directly. > > > > All I'm pointing out is that data integrity guarantees of > > AT_LINK_REPLACE are yet another aspect of this new feature that > > has not yet been specified or documented at all. > > > > And in pointing this out, I'm making an observation that the > > rename(2) behaviour which everyone seems to be assuming this > > function will replicate is a terrible model to copy/reinvent. > > > > Addressing this problem is something for the people pushing for > > AT_LINK_REPLACE to solve, not me.... > > Or the grumpy maintainer who will have to digest all of this. > > Can we update the documentation to admit that many people will probably > want to use this (and rename) as atomic swap operations? > > "The filesystem will commit the data and metadata of all files and > directories involved in the link operation to stable storage before the > call returns." > > And finally add a flag: > > "AT_LINK_EATMYDATA: If specified, the filesystem may opt out of > committing anything to disk." > I agree with Christoph that this anomaly is not a good idea, but I also agree with you and Dave that if an operation is meant to be used for atomic swap, we should make it much harder for users to get it wrong. To that end, we can define both flags AT_LINK_REPLACE and AT_ATOMIC in uapi, but also define the combo AT_LINK_ATOMIC_REPLACE and let the documentation be very much focused on this usage. I would like to stress a point, which some who haven't followed [1] closely might have missed - the purpose of AT_ATOMIC is *not* to make the new link durable - it is *not* a replacement for fsync on parent dir. It's purpose is *only* to create a dependency between the durability of the new link and the new data it can expose. AFAICT, this means that AT_ATOMIC would be implemented as filemap_write_and_wait() in xfs/ext4 and probably fdatasync in btrfs. Really, there is no chance that any user is interested in a non-atomic replace in that respect, so I am not even sure that we need an explicit flag for it. As it is, the AT_REPLACE API design would rank poorly as "The obvious use is wrong." An explicit AT_ATOMIC flag would help is that we could make the same semantics also apply to rename(2) with the same flag. Omar, I do understand why you say that you want to implement AT_REPLACE and let someone else take care of (bike shedding) AT_ATOMIC later. The problem with that approach is that man page will have the AT_REPLACE documentation spread out without the mention of AT_ATOMIC_REPLACE and that can generate damage long into the future. TBH, I don't think there were any actual objections to the final AT_ATOMIC_DATA proposal (which was Jan's idea by the way). Dave was claiming that introducing a new API requires proof of improvement, so he suggested (for the sake of debate) that I compare the performance of atomic links to using batched AIO_FSYNC with rename/linkat completion callbacks and I did not follow up on that. The situation now is different. You are proposing a new API and the improvement is clear, so the concern is to get the API right, not to show that it performs better than another API. Thanks, Amir. [1] https://lore.kernel.org/linux-fsdevel/20190527172655.9287-1-amir73il@gmail.com/ ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [LSF/MM/BPF TOPIC] Allowing linkat() to replace the destination 2020-01-31 7:00 ` Amir Goldstein @ 2020-01-31 20:33 ` Omar Sandoval 2020-01-31 21:55 ` Amir Goldstein 0 siblings, 1 reply; 36+ messages in thread From: Omar Sandoval @ 2020-01-31 20:33 UTC (permalink / raw) To: Amir Goldstein Cc: Darrick J. Wong, Dave Chinner, Al Viro, Trond Myklebust, dhowells, lsf-pc, hch, miklos, linux-fsdevel, Jan Kara On Fri, Jan 31, 2020 at 09:00:47AM +0200, Amir Goldstein wrote: > On Fri, Jan 31, 2020 at 7:27 AM Darrick J. Wong <darrick.wong@oracle.com> wrote: > > > > On Sat, Jan 25, 2020 at 08:25:46AM +1100, Dave Chinner wrote: > > > On Thu, Jan 23, 2020 at 09:47:30AM +0200, Amir Goldstein wrote: > > > > On Thu, Jan 23, 2020 at 9:16 AM Dave Chinner <david@fromorbit.com> wrote: > > > > > > > > > > On Thu, Jan 23, 2020 at 03:47:45AM +0000, Al Viro wrote: > > > > > > On Wed, Jan 22, 2020 at 02:10:03PM -0800, Omar Sandoval wrote: > > > > > > > > > > > > > > Sorry for not reading all the thread again, some API questions: > > > > > > > > - We intend to allow AT_REPLACE only with O_TMPFILE src. Right? > > > > > > > > > > > > > > I wasn't planning on having that restriction. It's not too much effort > > > > > > > for filesystems to support it for normal files, so I wouldn't want to > > > > > > > place an artificial restriction on a useful primitive. > > > > > > > > > > > > > > I have too many gray hairs each one for implementing a "useful primitive" > > > > that nobody asked for and bare the consequences. > > > > Your introduction to AT_REPLACE uses O_TMPFILE. > > > > I see no other sane use of the interface. > > > > > > > > > > I'm not sure; that's how we ended up with the unspeakable APIs like > > > > > > rename(2), after all... > > > > > > > > > > Yet it is just rename(2) with the serial numbers filed off - > > > > > complete with all the same data vs metadata ordering problems that > > > > > rename(2) comes along with. i.e. it needs fsync to guarantee data > > > > > integrity of the source file before the linkat() call is made. > > > > > > > > > > If we can forsee that users are going to complain that > > > > > linkat(AT_REPLACE) using O_TMPFILE files is not atomic because it > > > > > leaves zero length files behind after a crash just like rename() > > > > > does, then we haven't really improved anything at all... > > > > > > > > > > And, really, I don't think anyone wants another API that requires > > > > > multiple fsync calls to use correctly for crash-safe file > > > > > replacement, let alone try to teach people who still cant rename a > > > > > file safely how to use it.... > > > > > > > > > > > > > Are you suggesting that AT_LINK_REPLACE should have some of > > > > the semantics I posted in this RFC for AT_ATOMIC_xxx: > > > > https://lore.kernel.org/linux-fsdevel/20190527172655.9287-1-amir73il@gmail.com/ > > > > > > Not directly. > > > > > > All I'm pointing out is that data integrity guarantees of > > > AT_LINK_REPLACE are yet another aspect of this new feature that > > > has not yet been specified or documented at all. > > > > > > And in pointing this out, I'm making an observation that the > > > rename(2) behaviour which everyone seems to be assuming this > > > function will replicate is a terrible model to copy/reinvent. > > > > > > Addressing this problem is something for the people pushing for > > > AT_LINK_REPLACE to solve, not me.... > > > > Or the grumpy maintainer who will have to digest all of this. > > > > Can we update the documentation to admit that many people will probably > > want to use this (and rename) as atomic swap operations? > > > > "The filesystem will commit the data and metadata of all files and > > directories involved in the link operation to stable storage before the > > call returns." > > > > And finally add a flag: > > > > "AT_LINK_EATMYDATA: If specified, the filesystem may opt out of > > committing anything to disk." > > > > I agree with Christoph that this anomaly is not a good idea, but I also > agree with you and Dave that if an operation is meant to be used for > atomic swap, we should make it much harder for users to get it wrong. > > To that end, we can define both flags AT_LINK_REPLACE and > AT_ATOMIC in uapi, but also define the combo > AT_LINK_ATOMIC_REPLACE and let the documentation be > very much focused on this usage. > > I would like to stress a point, which some who haven't followed [1] > closely might have missed - the purpose of AT_ATOMIC is > *not* to make the new link durable - it is *not* a replacement for fsync > on parent dir. It's purpose is *only* to create a dependency between the > durability of the new link and the new data it can expose. At the risk of spawning another bikeshed subthread, the naming could be improved to make that more clear. AT_BARRIER? AT_DEPENDENCY? > AFAICT, this means that AT_ATOMIC would be implemented > as filemap_write_and_wait() in xfs/ext4 and probably fdatasync in btrfs. > > Really, there is no chance that any user is interested in a non-atomic > replace in that respect, so I am not even sure that we need an explicit > flag for it. As it is, the AT_REPLACE API design would rank poorly as > "The obvious use is wrong." > > An explicit AT_ATOMIC flag would help is that we could make the same > semantics also apply to rename(2) with the same flag. > > Omar, > > I do understand why you say that you want to implement AT_REPLACE > and let someone else take care of (bike shedding) AT_ATOMIC later. I'm not in much of a rush to get AT_LINK_REPLACE in, so I'd be okay getting it in together with AT_ATOMIC. I don't think I have the bandwidth to get yet another VFS interface in, though (since I still have to get back to RWF_ENCODED, which is a higher priority for me). Are you planning on picking up AT_ATOMIC any time soon? > The problem with that approach is that man page will have the > AT_REPLACE documentation spread out without the mention of > AT_ATOMIC_REPLACE and that can generate damage long into the future. On the other hand, I believe I made this fairly clear in my man-page patch: +This does not guarantee data integrity; +see EXAMPLE below for how to use this for crash-safe file replacement with +.BR O_TMPFILE . With the fsync() calls in the example. Thanks! ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [LSF/MM/BPF TOPIC] Allowing linkat() to replace the destination 2020-01-31 20:33 ` Omar Sandoval @ 2020-01-31 21:55 ` Amir Goldstein 0 siblings, 0 replies; 36+ messages in thread From: Amir Goldstein @ 2020-01-31 21:55 UTC (permalink / raw) To: Omar Sandoval Cc: Darrick J. Wong, Dave Chinner, Al Viro, Trond Myklebust, dhowells, lsf-pc, hch, miklos, linux-fsdevel, Jan Kara On Fri, Jan 31, 2020 at 10:33 PM Omar Sandoval <osandov@osandov.com> wrote: > > On Fri, Jan 31, 2020 at 09:00:47AM +0200, Amir Goldstein wrote: > > On Fri, Jan 31, 2020 at 7:27 AM Darrick J. Wong <darrick.wong@oracle.com> wrote: > > > > > > On Sat, Jan 25, 2020 at 08:25:46AM +1100, Dave Chinner wrote: > > > > On Thu, Jan 23, 2020 at 09:47:30AM +0200, Amir Goldstein wrote: > > > > > On Thu, Jan 23, 2020 at 9:16 AM Dave Chinner <david@fromorbit.com> wrote: > > > > > > > > > > > > On Thu, Jan 23, 2020 at 03:47:45AM +0000, Al Viro wrote: > > > > > > > On Wed, Jan 22, 2020 at 02:10:03PM -0800, Omar Sandoval wrote: > > > > > > > > > > > > > > > > Sorry for not reading all the thread again, some API questions: > > > > > > > > > - We intend to allow AT_REPLACE only with O_TMPFILE src. Right? > > > > > > > > > > > > > > > > I wasn't planning on having that restriction. It's not too much effort > > > > > > > > for filesystems to support it for normal files, so I wouldn't want to > > > > > > > > place an artificial restriction on a useful primitive. > > > > > > > > > > > > > > > > > I have too many gray hairs each one for implementing a "useful primitive" > > > > > that nobody asked for and bare the consequences. > > > > > Your introduction to AT_REPLACE uses O_TMPFILE. > > > > > I see no other sane use of the interface. > > > > > > > > > > > > I'm not sure; that's how we ended up with the unspeakable APIs like > > > > > > > rename(2), after all... > > > > > > > > > > > > Yet it is just rename(2) with the serial numbers filed off - > > > > > > complete with all the same data vs metadata ordering problems that > > > > > > rename(2) comes along with. i.e. it needs fsync to guarantee data > > > > > > integrity of the source file before the linkat() call is made. > > > > > > > > > > > > If we can forsee that users are going to complain that > > > > > > linkat(AT_REPLACE) using O_TMPFILE files is not atomic because it > > > > > > leaves zero length files behind after a crash just like rename() > > > > > > does, then we haven't really improved anything at all... > > > > > > > > > > > > And, really, I don't think anyone wants another API that requires > > > > > > multiple fsync calls to use correctly for crash-safe file > > > > > > replacement, let alone try to teach people who still cant rename a > > > > > > file safely how to use it.... > > > > > > > > > > > > > > > > Are you suggesting that AT_LINK_REPLACE should have some of > > > > > the semantics I posted in this RFC for AT_ATOMIC_xxx: > > > > > https://lore.kernel.org/linux-fsdevel/20190527172655.9287-1-amir73il@gmail.com/ > > > > > > > > Not directly. > > > > > > > > All I'm pointing out is that data integrity guarantees of > > > > AT_LINK_REPLACE are yet another aspect of this new feature that > > > > has not yet been specified or documented at all. > > > > > > > > And in pointing this out, I'm making an observation that the > > > > rename(2) behaviour which everyone seems to be assuming this > > > > function will replicate is a terrible model to copy/reinvent. > > > > > > > > Addressing this problem is something for the people pushing for > > > > AT_LINK_REPLACE to solve, not me.... > > > > > > Or the grumpy maintainer who will have to digest all of this. > > > > > > Can we update the documentation to admit that many people will probably > > > want to use this (and rename) as atomic swap operations? > > > > > > "The filesystem will commit the data and metadata of all files and > > > directories involved in the link operation to stable storage before the > > > call returns." > > > > > > And finally add a flag: > > > > > > "AT_LINK_EATMYDATA: If specified, the filesystem may opt out of > > > committing anything to disk." > > > > > > > I agree with Christoph that this anomaly is not a good idea, but I also > > agree with you and Dave that if an operation is meant to be used for > > atomic swap, we should make it much harder for users to get it wrong. > > > > To that end, we can define both flags AT_LINK_REPLACE and > > AT_ATOMIC in uapi, but also define the combo > > AT_LINK_ATOMIC_REPLACE and let the documentation be > > very much focused on this usage. > > > > I would like to stress a point, which some who haven't followed [1] > > closely might have missed - the purpose of AT_ATOMIC is > > *not* to make the new link durable - it is *not* a replacement for fsync > > on parent dir. It's purpose is *only* to create a dependency between the > > durability of the new link and the new data it can expose. > > At the risk of spawning another bikeshed subthread, the naming could be > improved to make that more clear. AT_BARRIER? AT_DEPENDENCY? > It crossed my mind a few times, I may have even suggested it, but IMO this will not help anybody outside the kernel developers community at getting a better understanding of what the API means. > > AFAICT, this means that AT_ATOMIC would be implemented > > as filemap_write_and_wait() in xfs/ext4 and probably fdatasync in btrfs. > > > > Really, there is no chance that any user is interested in a non-atomic > > replace in that respect, so I am not even sure that we need an explicit > > flag for it. As it is, the AT_REPLACE API design would rank poorly as > > "The obvious use is wrong." > > > > An explicit AT_ATOMIC flag would help is that we could make the same > > semantics also apply to rename(2) with the same flag. > > > > Omar, > > > > I do understand why you say that you want to implement AT_REPLACE > > and let someone else take care of (bike shedding) AT_ATOMIC later. > > I'm not in much of a rush to get AT_LINK_REPLACE in, so I'd be okay > getting it in together with AT_ATOMIC. I don't think I have the > bandwidth to get yet another VFS interface in, though (since I still > have to get back to RWF_ENCODED, which is a higher priority for me). Are > you planning on picking up AT_ATOMIC any time soon? > I wasn't planning to. Mind you, there is no actual "work" to do with AT_ATOMIC - it is only about agreeing on the semantics - the implementation is trivial. The man page draft I have already posted, but as I said it would be much more valuable IMO to introduce AT_LINK_ATOMIC_REPLACE together to man page. I can help with that if needed. > > The problem with that approach is that man page will have the > > AT_REPLACE documentation spread out without the mention of > > AT_ATOMIC_REPLACE and that can generate damage long into the future. > > On the other hand, I believe I made this fairly clear in my man-page > patch: > > +This does not guarantee data integrity; > +see EXAMPLE below for how to use this for crash-safe file replacement with > +.BR O_TMPFILE . > > With the fsync() calls in the example. > Heh that is not how humans work ;-) People don't go reading the entire man page nor follow examples to the letter. Many get to AT_LINK_REPLACE description and stop reading. In that case, I prefer that they first get to AT_LINK_ATOMIC_REPLACE. I don't mind re-posting your patches adding the _ATOMIC_ keyword and implementation, but it's a very small change, so perhaps you'd rather do it yourself, or maybe Dave would want to carry this through if he is in a bigger rush. Thanks, Amir. ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [LSF/MM/BPF TOPIC] Allowing linkat() to replace the destination 2020-01-23 7:16 ` Dave Chinner 2020-01-23 7:47 ` Amir Goldstein @ 2020-01-28 1:27 ` Omar Sandoval 1 sibling, 0 replies; 36+ messages in thread From: Omar Sandoval @ 2020-01-28 1:27 UTC (permalink / raw) To: Dave Chinner Cc: Al Viro, Amir Goldstein, Trond Myklebust, dhowells, lsf-pc, hch, miklos, linux-fsdevel, Darrick J. Wong On Thu, Jan 23, 2020 at 06:16:39PM +1100, Dave Chinner wrote: > On Thu, Jan 23, 2020 at 03:47:45AM +0000, Al Viro wrote: > > On Wed, Jan 22, 2020 at 02:10:03PM -0800, Omar Sandoval wrote: > > > > > > Sorry for not reading all the thread again, some API questions: > > > > - We intend to allow AT_REPLACE only with O_TMPFILE src. Right? > > > > > > I wasn't planning on having that restriction. It's not too much effort > > > for filesystems to support it for normal files, so I wouldn't want to > > > place an artificial restriction on a useful primitive. > > > > I'm not sure; that's how we ended up with the unspeakable APIs like > > rename(2), after all... > > Yet it is just rename(2) with the serial numbers filed off - > complete with all the same data vs metadata ordering problems that > rename(2) comes along with. i.e. it needs fsync to guarantee data > integrity of the source file before the linkat() call is made. > > If we can forsee that users are going to complain that > linkat(AT_REPLACE) using O_TMPFILE files is not atomic because it > leaves zero length files behind after a crash just like rename() > does, then we haven't really improved anything at all... > > And, really, I don't think anyone wants another API that requires > multiple fsync calls to use correctly for crash-safe file > replacement, let alone try to teach people who still cant rename a > file safely how to use it.... AT_REPLACE would have a leg up in that we can at least mention the (lack of) integrity guarantees in the man page. It's no suprise that no one gets rename right when the documentation make no mention of integrity/durability/crash safety. Sure, if everyone wants AT_REPLACE to guarantee integrity, we can do that. But I'm going to start with the simplest proposal that makes no such guarantees. ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [LSF/MM/BPF TOPIC] Allowing linkat() to replace the destination 2020-01-23 3:47 ` Al Viro 2020-01-23 7:16 ` Dave Chinner @ 2020-01-28 14:35 ` David Howells 2020-01-31 5:31 ` hch 2020-01-31 8:04 ` David Howells 1 sibling, 2 replies; 36+ messages in thread From: David Howells @ 2020-01-28 14:35 UTC (permalink / raw) To: Dave Chinner Cc: dhowells, Al Viro, Omar Sandoval, Amir Goldstein, Trond Myklebust, lsf-pc, hch, miklos, linux-fsdevel, Darrick J. Wong Dave Chinner <david@fromorbit.com> wrote: > If we can forsee that users are going to complain that > linkat(AT_REPLACE) using O_TMPFILE files is not atomic because it > leaves zero length files behind after a crash just like rename() > does, then we haven't really improved anything at all... For my purposes, I would like it to look like the end result of unlink()+linkat(). I need to avoid losing the metadata in case of disconnected operation. Now, I can do it like Miklós suggested and have a bunch of temporary dirs to iterate round, then create new files there and rename over the target file. I'm using direct I/O, so I'm assuming I don't need to fsync(). David ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [LSF/MM/BPF TOPIC] Allowing linkat() to replace the destination 2020-01-28 14:35 ` David Howells @ 2020-01-31 5:31 ` hch 2020-01-31 8:04 ` David Howells 1 sibling, 0 replies; 36+ messages in thread From: hch @ 2020-01-31 5:31 UTC (permalink / raw) To: David Howells Cc: Dave Chinner, Al Viro, Omar Sandoval, Amir Goldstein, Trond Myklebust, lsf-pc, hch, miklos, linux-fsdevel, Darrick J. Wong On Tue, Jan 28, 2020 at 02:35:38PM +0000, David Howells wrote: > I'm using direct I/O, so I'm assuming I don't need to fsync(). Direct I/O of course requires fsync. What makes you think different? ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [LSF/MM/BPF TOPIC] Allowing linkat() to replace the destination 2020-01-28 14:35 ` David Howells 2020-01-31 5:31 ` hch @ 2020-01-31 8:04 ` David Howells 2020-01-31 8:56 ` Amir Goldstein 1 sibling, 1 reply; 36+ messages in thread From: David Howells @ 2020-01-31 8:04 UTC (permalink / raw) To: hch Cc: dhowells, Dave Chinner, Al Viro, Omar Sandoval, Amir Goldstein, Trond Myklebust, lsf-pc, miklos, linux-fsdevel, Darrick J. Wong hch@lst.de <hch@lst.de> wrote: > > I'm using direct I/O, so I'm assuming I don't need to fsync(). > > Direct I/O of course requires fsync. What makes you think different? I guess that's fair. Even if the data makes it directly to storage, that's not a guarantee that the metadata does. David ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [LSF/MM/BPF TOPIC] Allowing linkat() to replace the destination 2020-01-31 8:04 ` David Howells @ 2020-01-31 8:56 ` Amir Goldstein 0 siblings, 0 replies; 36+ messages in thread From: Amir Goldstein @ 2020-01-31 8:56 UTC (permalink / raw) To: David Howells Cc: hch, Dave Chinner, Al Viro, Omar Sandoval, Trond Myklebust, lsf-pc, miklos, linux-fsdevel, Darrick J. Wong On Fri, Jan 31, 2020 at 10:04 AM David Howells <dhowells@redhat.com> wrote: > > hch@lst.de <hch@lst.de> wrote: > > > > I'm using direct I/O, so I'm assuming I don't need to fsync(). > > > > Direct I/O of course requires fsync. What makes you think different? > > I guess that's fair. Even if the data makes it directly to storage, that's > not a guarantee that the metadata does. > It's not only that. DIO gets data to storage *layer*, but it doesn't tell storage *layer* to make the data durable (usually with FLUSH/FUA). Thanks, Amir. ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [LSF/MM/BPF TOPIC] Allowing linkat() to replace the destination 2020-01-18 1:17 ` Omar Sandoval 2020-01-18 2:20 ` Al Viro @ 2020-01-22 9:53 ` David Howells 1 sibling, 0 replies; 36+ messages in thread From: David Howells @ 2020-01-22 9:53 UTC (permalink / raw) To: Al Viro Cc: dhowells, Omar Sandoval, Trond Myklebust, amir73il, lsf-pc, hch, miklos, linux-fsdevel Al Viro <viro@zeniv.linux.org.uk> wrote: > FWIW, that might be not so simple ;-/ Reason: NFS-like stuff. Client > sees a negative in cache; the problem is how to decide whether to > tell the server "OK, I want normal link()" vs. "if it turns out that > someone has created it by the time you see the request, give do > a replacing link". Sure, if could treat ->link() telling you -EEXIST > as "OK, repeat it with ->link_replace(), then", but that's an extra > roundtrip... If someone asks for link_replace on a filesystem that doesn't support it or if it's a network filesystem in which the client does, but the server being accessed does not, then return an error (say EOPNOTSUPP) and let userspace (or cachefiles or whatever) handle the fallback? David ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [LSF/MM/BPF TOPIC] Allowing linkat() to replace the destination 2020-01-17 12:49 [LSF/MM/BPF TOPIC] Allowing linkat() to replace the destination David Howells 2020-01-17 14:33 ` Trond Myklebust @ 2020-01-17 14:47 ` David Howells 2020-01-17 14:56 ` Trond Myklebust 1 sibling, 1 reply; 36+ messages in thread From: David Howells @ 2020-01-17 14:47 UTC (permalink / raw) To: Trond Myklebust Cc: dhowells, osandov, amir73il, lsf-pc, hch, miklos, linux-fsdevel, viro Trond Myklebust <trondmy@hammerspace.com> wrote: > That seems to me like a "just go ahead and do it if you can justify it" > kind of thing. It has plenty of precedent, and fits easily into the > existing syscall, so why do we need a face-to-face discussion? Amir said "This sounds like a good topic to be discussed at LSF/MM (hint hint)" Also Christoph H is okay with the idea, but suggested it should be a separate syscall and Al doesn't seem to like it. Omar posted patches to do this, but they didn't seem to get anywhere. David ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [LSF/MM/BPF TOPIC] Allowing linkat() to replace the destination 2020-01-17 14:47 ` David Howells @ 2020-01-17 14:56 ` Trond Myklebust 2020-01-17 16:01 ` Al Viro 0 siblings, 1 reply; 36+ messages in thread From: Trond Myklebust @ 2020-01-17 14:56 UTC (permalink / raw) To: dhowells; +Cc: hch, osandov, miklos, linux-fsdevel, viro, amir73il, lsf-pc On Fri, 2020-01-17 at 14:47 +0000, David Howells wrote: > Trond Myklebust <trondmy@hammerspace.com> wrote: > > > That seems to me like a "just go ahead and do it if you can justify > > it" > > kind of thing. It has plenty of precedent, and fits easily into the > > existing syscall, so why do we need a face-to-face discussion? > > Amir said "This sounds like a good topic to be discussed at LSF/MM > (hint > hint)" > > Also Christoph H is okay with the idea, but suggested it should be a > separate > syscall and Al doesn't seem to like it. Omar posted patches to do > this, but > they didn't seem to get anywhere. > It sounds to me like we rather need a meta-topic about "How do we get simple things done in the Linux fs community?" It shouldn't take a ticket to Palm Springs to perform something simple like adding a new flag to a syscall. -- Trond Myklebust Linux NFS client maintainer, Hammerspace trond.myklebust@hammerspace.com ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [LSF/MM/BPF TOPIC] Allowing linkat() to replace the destination 2020-01-17 14:56 ` Trond Myklebust @ 2020-01-17 16:01 ` Al Viro 0 siblings, 0 replies; 36+ messages in thread From: Al Viro @ 2020-01-17 16:01 UTC (permalink / raw) To: Trond Myklebust Cc: dhowells, hch, osandov, miklos, linux-fsdevel, amir73il, lsf-pc On Fri, Jan 17, 2020 at 02:56:05PM +0000, Trond Myklebust wrote: > It sounds to me like we rather need a meta-topic about "How do we get > simple things done in the Linux fs community?" > > It shouldn't take a ticket to Palm Springs to perform something simple > like adding a new flag to a syscall. Sure - adding a new flag is trivial. Coming up with sane semantics for it, OTOH, can be rather non-trivial and in this case it is, unfortunately. "something like link(2), only it tolerates the existing target and atomically replaces it" does _not_ specify the semantics. Try to sit down for a few minutes and come up with the cases when behaviour is undefined by the above; it won't take longer than that. We can do it by asking the proponent to come up with full description to be included into the proposal, then have at it on fsdevel/linux-abi (as well as security lists). Doable, but not a small amount of PITA for original poster and dealing with questions/objections/etc. is certain to grow a large thread with many branches (and lots of bikeshedding thrown in) _and_ would include tons of roundtrips, so the latency (especially early on, while the proposal is still raw) will be a factor. It's not the question of how to implement it; it's what should it _do_. And "we'll tweak the behaviour in corner cases later on" is good in a lot of situations, but not for userland ABI. I'd been guilty of such fuckups several times and they are not cheap to fix afterwards ;-/ ^ permalink raw reply [flat|nested] 36+ messages in thread
end of thread, other threads:[~2020-01-31 21:55 UTC | newest] Thread overview: 36+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2020-01-17 12:49 [LSF/MM/BPF TOPIC] Allowing linkat() to replace the destination David Howells 2020-01-17 14:33 ` Trond Myklebust 2020-01-17 15:46 ` Al Viro 2020-01-17 16:12 ` Trond Myklebust 2020-01-17 16:48 ` Al Viro 2020-01-17 16:36 ` Omar Sandoval 2020-01-17 16:59 ` Al Viro 2020-01-17 17:28 ` Omar Sandoval 2020-01-17 18:17 ` Al Viro 2020-01-17 20:22 ` Omar Sandoval 2020-01-17 22:22 ` Al Viro 2020-01-17 23:54 ` Omar Sandoval 2020-01-18 0:47 ` Al Viro 2020-01-18 1:17 ` Omar Sandoval 2020-01-18 2:20 ` Al Viro 2020-01-21 23:05 ` Omar Sandoval 2020-01-22 6:57 ` Amir Goldstein 2020-01-22 22:10 ` Omar Sandoval 2020-01-23 3:47 ` Al Viro 2020-01-23 7:16 ` Dave Chinner 2020-01-23 7:47 ` Amir Goldstein 2020-01-24 21:25 ` Dave Chinner 2020-01-31 5:24 ` Darrick J. Wong 2020-01-31 5:29 ` hch 2020-01-31 7:00 ` Amir Goldstein 2020-01-31 20:33 ` Omar Sandoval 2020-01-31 21:55 ` Amir Goldstein 2020-01-28 1:27 ` Omar Sandoval 2020-01-28 14:35 ` David Howells 2020-01-31 5:31 ` hch 2020-01-31 8:04 ` David Howells 2020-01-31 8:56 ` Amir Goldstein 2020-01-22 9:53 ` David Howells 2020-01-17 14:47 ` David Howells 2020-01-17 14:56 ` Trond Myklebust 2020-01-17 16:01 ` Al Viro
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).