* Re: Making linkat() able to overwrite the target
2020-01-14 16:34 Making linkat() able to overwrite the target David Howells
@ 2020-01-14 17:02 ` Al Viro
2020-01-14 18:06 ` David Howells
2020-01-15 8:36 ` Christoph Hellwig
2 siblings, 0 replies; 10+ messages in thread
From: Al Viro @ 2020-01-14 17:02 UTC
To: David Howells
Cc: linux-fsdevel, hch, tytso, adilger.kernel, darrick.wong, clm,
josef, dsterba, linux-ext4, linux-xfs, linux-btrfs, linux-kernel
On Tue, Jan 14, 2020 at 04:34:25PM +0000, David Howells wrote:
> With my rewrite of fscache and cachefiles:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git/log/?h=fscache-iter
>
> when a file gets invalidated by the server - and, under some circumstances,
> modified locally - I have the cache create a temporary file with vfs_tmpfile()
> that I'd like to just link into place over the old one - but I can't because
> vfs_link() doesn't allow you to do that. Instead I have to either unlink the
> old one and then link the new one in or create it elsewhere and rename across.
>
> Would it be possible to make linkat() take a flag, say AT_LINK_REPLACE, that
> causes the target to be replaced and not give EEXIST? Or make it so that
> rename() can take a tmpfile as the source and replace the target with that. I
> presume that, either way, this would require journal changes on ext4, xfs and
> btrfs.
Umm... I don't like the idea of linkat() doing that - you suddenly get new
fun cases to think about (what should happen when the target is a mountpoint,
for starters?) _and_ you would have to add a magical flag to vfs_link() so
that it would know which tests to do. As for rename... How would that
work? AT_EMPTY_PATH for source? What happens if two threads do that
at the same time? Should that case be always "create a new link, even
if you've got it by plain lookup somewhere"? Worse, suppose you do that
to given tmpfile; what should happen to /proc/self/fd/* link to it? Should
it point to new location, or...?
* Re: Making linkat() able to overwrite the target
2020-01-14 16:34 Making linkat() able to overwrite the target David Howells
2020-01-14 17:02 ` Al Viro
@ 2020-01-14 18:06 ` David Howells
2020-01-14 19:37 ` Miklos Szeredi
` (2 more replies)
2020-01-15 8:36 ` Christoph Hellwig
2 siblings, 3 replies; 10+ messages in thread
From: David Howells @ 2020-01-14 18:06 UTC
To: Al Viro
Cc: dhowells, linux-fsdevel, hch, tytso, adilger.kernel,
darrick.wong, clm, josef, dsterba, linux-ext4, linux-xfs,
linux-btrfs, linux-kernel
Al Viro <viro@zeniv.linux.org.uk> wrote:
> > Would it be possible to make linkat() take a flag, say AT_LINK_REPLACE,
> > that causes the target to be replaced and not give EEXIST? Or make it so
> > that rename() can take a tmpfile as the source and replace the target with
> > that. I presume that, either way, this would require journal changes on
> > ext4, xfs and btrfs.
>
> Umm... I don't like the idea of linkat() doing that - you suddenly get new
> fun cases to think about (what should happen when the target is a mountpoint,
> for starters?
Don't allow it onto directories, S_AUTOMOUNT-marked inodes or anything that's
got something mounted on it.
> ) _and_ you would have to add a magical flag to vfs_link() so
> that it would know which tests to do.
Yes, I suggested AT_LINK_REPLACE as said magical flag.
> As for rename...
Yeah - with further thought, rename() doesn't really work as an interface,
particularly if a link has already been made.
Do you have an alternative suggestion? There are two things I want to avoid:
(1) Doing unlink-link or unlink-create as that leaves a window where the
cache file is absent.
(2) Creating replacement files in a temporary directory and renaming from
there over the top of the target file as the temp dir would then be a
bottleneck that spends a lot of time locked for creations and renames.
David
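For readers following along from userspace, the two workarounds listed above can be sketched roughly as follows. This is only an illustration using Python's os module on ordinary named files, not the in-kernel cachefiles code, and the helper names (replace_via_unlink_link, replace_via_rename, demo) are made up for the example.

```python
import os
import tempfile


def replace_via_unlink_link(new_path, target):
    """Workaround (1): unlink the old name, then link the new file in.

    Between the two calls the target name does not exist, which is the
    window the thread wants to avoid.
    """
    os.unlink(target)
    os.link(new_path, target)


def replace_via_rename(new_path, target):
    """Workaround (2): rename a staged file over the target.

    rename() swaps the name atomically, but the staged file must already
    have a name in some directory on the same filesystem.
    """
    os.replace(new_path, target)


def demo():
    """Hypothetical driver: show that link() refuses to overwrite."""
    with tempfile.TemporaryDirectory() as d:
        target = os.path.join(d, "cache-object")
        staged = os.path.join(d, "staged")
        for path, data in ((target, b"old"), (staged, b"new")):
            with open(path, "wb") as f:
                f.write(data)

        # Plain link() fails with EEXIST when the target name exists.
        try:
            os.link(staged, target)
            link_refused = False
        except FileExistsError:
            link_refused = True

        replace_via_rename(staged, target)
        with open(target, "rb") as f:
            return link_refused, f.read()
```

The rename path gives the atomic swap, at the cost of needing a named staging file first, which is where the temp-dir locking concern in (2) comes from.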
* Re: Making linkat() able to overwrite the target
2020-01-14 18:06 ` David Howells
@ 2020-01-14 19:37 ` Miklos Szeredi
2020-01-17 0:46 ` Colin Walters
2020-01-17 11:42 ` David Howells
2 siblings, 0 replies; 10+ messages in thread
From: Miklos Szeredi @ 2020-01-14 19:37 UTC
To: David Howells
Cc: Al Viro, linux-fsdevel, Christoph Hellwig, Theodore Ts'o,
Andreas Dilger, Darrick J. Wong, Chris Mason, Josef Bacik,
dsterba, linux-ext4, linux-xfs, linux-btrfs, linux-kernel
On Tue, Jan 14, 2020 at 7:06 PM David Howells <dhowells@redhat.com> wrote:
>
> Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> > > Would it be possible to make linkat() take a flag, say AT_LINK_REPLACE,
> > > that causes the target to be replaced and not give EEXIST? Or make it so
> > > that rename() can take a tmpfile as the source and replace the target with
> > > that. I presume that, either way, this would require journal changes on
> > > ext4, xfs and btrfs.
> >
> > Umm... I don't like the idea of linkat() doing that - you suddenly get new
> > fun cases to think about (what should happen when the target is a mountpoint,
> > for starters?
>
> Don't allow it onto directories, S_AUTOMOUNT-marked inodes or anything that's
> got something mounted on it.
>
> > ) _and_ you would have to add a magical flag to vfs_link() so
> > that it would know which tests to do.
>
> Yes, I suggested AT_LINK_REPLACE as said magical flag.
>
> > As for rename...
>
> Yeah - with further thought, rename() doesn't really work as an interface,
> particularly if a link has already been made.
>
> Do you have an alternative suggestion? There are two things I want to avoid:
>
> (1) Doing unlink-link or unlink-create as that leaves a window where the
> cache file is absent.
>
> (2) Creating replacement files in a temporary directory and renaming from
> there over the top of the target file as the temp dir would then be a
> bottleneck that spends a lot of time locked for creations and renames.
Create multiple sub-temp-dirs and use them alternately.
I think there was a report for overlayfs with the same bottleneck
(copy up uses a temp dir, but now only for non-regular files). I haven't
gotten around to implementing this idea yet.
Thanks,
Miklos
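As a rough userspace sketch of the suggestion above, one could rotate staged files through a small set of temp dirs so that no single directory takes every create and rename. Everything here is an assumption made for illustration: the StagingDirs class, the tmpN and objN naming, and the plain round-robin choice are not from the thread, and the sketch is single-threaded with no locking of its own.

```python
import itertools
import os
import tempfile


class StagingDirs:
    """Spread staged files over several temp dirs under one root.

    All dirs live under the same root so rename() stays on one filesystem.
    """

    def __init__(self, root, count=4):
        self.dirs = []
        for i in range(count):
            path = os.path.join(root, f"tmp{i}")
            os.makedirs(path, exist_ok=True)
            self.dirs.append(path)
        self._next_dir = itertools.cycle(self.dirs)  # round-robin choice
        self._serial = itertools.count()             # unique staged names

    def replace(self, target, data):
        """Write data to a staged file, then rename it over target."""
        staging_dir = next(self._next_dir)
        staged = os.path.join(staging_dir, f"obj{next(self._serial)}")
        with open(staged, "wb") as f:
            f.write(data)
        os.replace(staged, target)
        return staging_dir


def demo():
    """Hypothetical driver: three replacements across two staging dirs."""
    with tempfile.TemporaryDirectory() as root:
        staging = StagingDirs(root, count=2)
        target = os.path.join(root, "cache-object")
        used = [
            os.path.basename(staging.replace(target, b"v%d" % i))
            for i in range(3)
        ]
        with open(target, "rb") as f:
            return used, f.read()
```

Hashing the target name instead of rotating would be another way to pick a dir; which spreads the lock contention better would depend on the workload.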
* Re: Making linkat() able to overwrite the target
2020-01-14 18:06 ` David Howells
2020-01-14 19:37 ` Miklos Szeredi
@ 2020-01-17 0:46 ` Colin Walters
2020-01-17 9:57 ` Amir Goldstein
2020-01-17 11:42 ` David Howells
2 siblings, 1 reply; 10+ messages in thread
From: Colin Walters @ 2020-01-17 0:46 UTC
To: David Howells, Al Viro
Cc: linux-fsdevel, Christoph Hellwig, Theodore Ts'o,
adilger.kernel, Darrick J. Wong, Chris Mason, josef, dsterba,
linux-ext4, xfs, linux-btrfs, linux-kernel
On Tue, Jan 14, 2020, at 1:06 PM, David Howells wrote:
> Yes, I suggested AT_LINK_REPLACE as said magical flag.
This came up before, right?
https://lore.kernel.org/linux-fsdevel/cover.1524549513.git.osandov@fb.com/
* Re: Making linkat() able to overwrite the target
2020-01-17 0:46 ` Colin Walters
@ 2020-01-17 9:57 ` Amir Goldstein
0 siblings, 0 replies; 10+ messages in thread
From: Amir Goldstein @ 2020-01-17 9:57 UTC
To: Colin Walters, David Howells
Cc: Al Viro, linux-fsdevel, Christoph Hellwig, Theodore Ts'o,
Andreas Dilger, Darrick J. Wong, Chris Mason, Josef Bacik,
David Sterba, linux-ext4, xfs, Linux Btrfs, linux-kernel
On Fri, Jan 17, 2020 at 5:52 AM Colin Walters <walters@verbum.org> wrote:
>
> On Tue, Jan 14, 2020, at 1:06 PM, David Howells wrote:
>
> > Yes, I suggested AT_LINK_REPLACE as said magical flag.
>
> This came up before right?
>
> https://lore.kernel.org/linux-fsdevel/cover.1524549513.git.osandov@fb.com/
David,
This sounds like a good topic to be discussed at LSF/MM (hint hint)
Thanks,
Amir.
* Re: Making linkat() able to overwrite the target
2020-01-14 18:06 ` David Howells
2020-01-14 19:37 ` Miklos Szeredi
2020-01-17 0:46 ` Colin Walters
@ 2020-01-17 11:42 ` David Howells
2020-01-17 16:22 ` Omar Sandoval
2020-01-17 16:39 ` David Howells
2 siblings, 2 replies; 10+ messages in thread
From: David Howells @ 2020-01-17 11:42 UTC
To: Omar Sandoval
Cc: dhowells, Colin Walters, Al Viro, linux-fsdevel,
Christoph Hellwig, Theodore Ts'o, adilger.kernel,
Darrick J. Wong, Chris Mason, josef, dsterba, linux-ext4, xfs,
linux-btrfs, linux-kernel
Hi Omar,
Do you still have your AT_REPLACE patches? You said that you'd post a v4
series, though I don't see it. I could make use of such a feature in
cachefiles inside the kernel. For my original question, see:
https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git/log/?h=fscache-iter
And do you have ext4 support for it?
Colin Walters <walters@verbum.org> wrote:
> On Tue, Jan 14, 2020, at 1:06 PM, David Howells wrote:
>
> > Yes, I suggested AT_LINK_REPLACE as said magical flag.
>
> This came up before right?
>
> https://lore.kernel.org/linux-fsdevel/cover.1524549513.git.osandov@fb.com/
David
* Re: Making linkat() able to overwrite the target
2020-01-17 11:42 ` David Howells
@ 2020-01-17 16:22 ` Omar Sandoval
2020-01-17 16:39 ` David Howells
1 sibling, 0 replies; 10+ messages in thread
From: Omar Sandoval @ 2020-01-17 16:22 UTC
To: David Howells
Cc: Colin Walters, Al Viro, linux-fsdevel, Christoph Hellwig,
Theodore Ts'o, adilger.kernel, Darrick J. Wong, Chris Mason,
josef, dsterba, linux-ext4, xfs, linux-btrfs, linux-kernel
On Fri, Jan 17, 2020 at 11:42:55AM +0000, David Howells wrote:
> Hi Omar,
>
> Do you still have your AT_REPLACE patches? You said that you'd post a v4
> series, though I don't see it. I could make use of such a feature in
> cachefiles inside the kernel. For my original question, see:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git/log/?h=fscache-iter
>
> And do you have ext4 support for it?
Hi,
Yes I still have those patches lying around and I'd be happy to dust
them off and resend them. I don't have ext4 support. I'd be willing to
take a stab at ext4 once Al is happy with the VFS part unless someone
more familiar with ext4 wants to contribute that support.
Thanks for reviving interest in this!
* Re: Making linkat() able to overwrite the target
2020-01-17 11:42 ` David Howells
2020-01-17 16:22 ` Omar Sandoval
@ 2020-01-17 16:39 ` David Howells
1 sibling, 0 replies; 10+ messages in thread
From: David Howells @ 2020-01-17 16:39 UTC
To: Omar Sandoval
Cc: dhowells, Colin Walters, Al Viro, linux-fsdevel,
Christoph Hellwig, Theodore Ts'o, adilger.kernel,
Darrick J. Wong, Chris Mason, josef, dsterba, linux-ext4, xfs,
linux-btrfs, linux-kernel
Omar Sandoval <osandov@osandov.com> wrote:
> Yes I still have those patches lying around and I'd be happy to dust
> them off and resend them.
That would be great if you could. I could use them here:
https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git/log/?h=fscache-iter
I'm performing invalidation by creating a vfs_tmpfile() and then replacing the
on-disk file whilst letting ops resume on the temporary file. Replacing the
on-disk file currently, however, involves unlinking the old one before I can
link in a new one - which leaves a window in which nothing is there. I could
use one or more side dirs in which to create new files and rename them over,
but that has potential lock bottleneck issues - and is particularly fun if an
entire volume is invalidated (e.g. AFS vos release).
David
* Re: Making linkat() able to overwrite the target
2020-01-14 16:34 Making linkat() able to overwrite the target David Howells
2020-01-14 17:02 ` Al Viro
2020-01-14 18:06 ` David Howells
@ 2020-01-15 8:36 ` Christoph Hellwig
2 siblings, 0 replies; 10+ messages in thread
From: Christoph Hellwig @ 2020-01-15 8:36 UTC
To: David Howells
Cc: linux-fsdevel, viro, hch, tytso, adilger.kernel, darrick.wong,
clm, josef, dsterba, linux-ext4, linux-xfs, linux-btrfs,
linux-kernel
On Tue, Jan 14, 2020 at 04:34:25PM +0000, David Howells wrote:
>
> when a file gets invalidated by the server - and, under some circumstances,
> modified locally - I have the cache create a temporary file with vfs_tmpfile()
> that I'd like to just link into place over the old one - but I can't because
> vfs_link() doesn't allow you to do that. Instead I have to either unlink the
> old one and then link the new one in or create it elsewhere and rename across.
>
> Would it be possible to make linkat() take a flag, say AT_LINK_REPLACE, that
> causes the target to be replaced and not give EEXIST? Or make it so that
> rename() can take a tmpfile as the source and replace the target with that. I
> presume that, either way, this would require journal changes on ext4, xfs and
> btrfs.
This sounds like a very useful primitive, and from the low-level XFS
point of view should be very easy to implement and will not require any
on-disk changes. I can't really think of any good userspace interface but
a new syscall, though.