Linux-ext4 Archive on lore.kernel.org
* Making linkat() able to overwrite the target
@ 2020-01-14 16:34 David Howells
  2020-01-14 17:02 ` Al Viro
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: David Howells @ 2020-01-14 16:34 UTC
  To: linux-fsdevel, viro, hch, tytso, adilger.kernel, darrick.wong,
	clm, josef, dsterba
  Cc: dhowells, linux-ext4, linux-xfs, linux-btrfs, linux-kernel

With my rewrite of fscache and cachefiles:

	https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git/log/?h=fscache-iter

when a file gets invalidated by the server - and, under some circumstances,
modified locally - I have the cache create a temporary file with vfs_tmpfile()
that I'd like to just link into place over the old one - but I can't because
vfs_link() doesn't allow you to do that.  Instead I have to either unlink the
old one and then link the new one in or create it elsewhere and rename across.

Would it be possible to make linkat() take a flag, say AT_LINK_REPLACE, that
causes the target to be replaced and not give EEXIST?  Or make it so that
rename() can take a tmpfile as the source and replace the target with that.  I
presume that, either way, this would require journal changes on ext4, xfs and
btrfs.

Thanks,
David



* Re: Making linkat() able to overwrite the target
  2020-01-14 16:34 Making linkat() able to overwrite the target David Howells
@ 2020-01-14 17:02 ` Al Viro
  2020-01-14 18:06 ` David Howells
  2020-01-15  8:36 ` Christoph Hellwig
  2 siblings, 0 replies; 10+ messages in thread
From: Al Viro @ 2020-01-14 17:02 UTC
  To: David Howells
  Cc: linux-fsdevel, hch, tytso, adilger.kernel, darrick.wong, clm,
	josef, dsterba, linux-ext4, linux-xfs, linux-btrfs, linux-kernel

On Tue, Jan 14, 2020 at 04:34:25PM +0000, David Howells wrote:
> With my rewrite of fscache and cachefiles:
> 
> 	https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git/log/?h=fscache-iter
> 
> when a file gets invalidated by the server - and, under some circumstances,
> modified locally - I have the cache create a temporary file with vfs_tmpfile()
> that I'd like to just link into place over the old one - but I can't because
> vfs_link() doesn't allow you to do that.  Instead I have to either unlink the
> old one and then link the new one in or create it elsewhere and rename across.
> 
> Would it be possible to make linkat() take a flag, say AT_LINK_REPLACE, that
> causes the target to be replaced and not give EEXIST?  Or make it so that
> rename() can take a tmpfile as the source and replace the target with that.  I
> presume that, either way, this would require journal changes on ext4, xfs and
> btrfs.

Umm...  I don't like the idea of linkat() doing that - you suddenly get new
fun cases to think about (what should happen when the target is a mountpoint,
for starters?) _and_ you would have to add a magical flag to vfs_link() so
that it would know which tests to do.  As for rename...  How would that
work?  AT_EMPTY_PATH for source?  What happens if two threads do that
at the same time?  Should that case be always "create a new link, even
if you've got it by plain lookup somewhere"?  Worse, suppose you do that
to given tmpfile; what should happen to /proc/self/fd/* link to it?  Should
it point to new location, or...?


* Re: Making linkat() able to overwrite the target
  2020-01-14 16:34 Making linkat() able to overwrite the target David Howells
  2020-01-14 17:02 ` Al Viro
@ 2020-01-14 18:06 ` David Howells
  2020-01-14 19:37   ` Miklos Szeredi
                     ` (2 more replies)
  2020-01-15  8:36 ` Christoph Hellwig
  2 siblings, 3 replies; 10+ messages in thread
From: David Howells @ 2020-01-14 18:06 UTC
  To: Al Viro
  Cc: dhowells, linux-fsdevel, hch, tytso, adilger.kernel,
	darrick.wong, clm, josef, dsterba, linux-ext4, linux-xfs,
	linux-btrfs, linux-kernel

Al Viro <viro@zeniv.linux.org.uk> wrote:

> > Would it be possible to make linkat() take a flag, say AT_LINK_REPLACE,
> > that causes the target to be replaced and not give EEXIST?  Or make it so
> > that rename() can take a tmpfile as the source and replace the target with
> > that.  I presume that, either way, this would require journal changes on
> > ext4, xfs and btrfs.
>
> Umm...  I don't like the idea of linkat() doing that - you suddenly get new
> fun cases to think about (what should happen when the target is a mountpoint,
> for starters?

Don't allow it onto directories, S_AUTOMOUNT-marked inodes or anything that's
got something mounted on it.

> ) _and_ you would have to add a magical flag to vfs_link() so
> that it would know which tests to do.

Yes, I suggested AT_LINK_REPLACE as said magical flag.

> As for rename...

Yeah - with further thought, rename() doesn't really work as an interface,
particularly if a link has already been made.

Do you have an alternative suggestion?  There are two things I want to avoid:

 (1) Doing unlink-link or unlink-create as that leaves a window where the
     cache file is absent.

 (2) Creating replacement files in a temporary directory and renaming from
     there over the top of the target file as the temp dir would then be a
     bottleneck that spends a lot of time locked for creations and renames.

David
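
Option (2) above can be sketched in userspace as follows. This is a hedged illustration, not the cachefiles implementation: the replacement is staged in a side directory and moved over the target with rename(), which replaces the destination atomically. The names write_file() and replace_via_side_dir(), the "tmp" side directory and the file contents are all hypothetical. The sketch shows only the mechanics; it does not reproduce the lock contention on the side directory that the message is concerned about.

```c
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* Write data to a new file at path; returns 0 on success, -1 on error. */
static int write_file(const char *path, const char *data)
{
	size_t len = strlen(data);
	int fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0600);

	if (fd < 0)
		return -1;
	if (write(fd, data, len) != (ssize_t)len) {
		close(fd);
		return -1;
	}
	return close(fd);
}

/*
 * Replace <dir>/cache with contents staged in <dir>/tmp/staged.
 * rename() swaps the directory entry in one step, so a lookup of
 * "cache" finds either the old file or the new one, never nothing.
 * Returns the size of the file now named "cache", or -1 on error.
 */
static long replace_via_side_dir(void)
{
	char dir[] = "/tmp/rename-demo-XXXXXX";
	char side[64], cache[64], staged[128];
	struct stat st;
	long size = -1;

	if (!mkdtemp(dir))
		return -1;
	snprintf(side, sizeof(side), "%s/tmp", dir);
	snprintf(cache, sizeof(cache), "%s/cache", dir);
	snprintf(staged, sizeof(staged), "%s/staged", side);

	if (mkdir(side, 0700) == 0 &&
	    write_file(cache, "old") == 0 &&
	    write_file(staged, "new data") == 0 &&
	    rename(staged, cache) == 0 &&
	    stat(cache, &st) == 0)
		size = (long)st.st_size;

	/* Best-effort cleanup; "staged" is already gone after a rename. */
	unlink(staged);
	unlink(cache);
	rmdir(side);
	rmdir(dir);
	return size;
}
```

With the contents used here, replace_via_side_dir() should return 8, the length of "new data", showing that the name "cache" now refers to the staged file.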



* Re: Making linkat() able to overwrite the target
  2020-01-14 18:06 ` David Howells
@ 2020-01-14 19:37   ` Miklos Szeredi
  2020-01-17  0:46   ` Colin Walters
  2020-01-17 11:42   ` David Howells
  2 siblings, 0 replies; 10+ messages in thread
From: Miklos Szeredi @ 2020-01-14 19:37 UTC
  To: David Howells
  Cc: Al Viro, linux-fsdevel, Christoph Hellwig, Theodore Ts'o,
	Andreas Dilger, Darrick J. Wong, Chris Mason, Josef Bacik,
	dsterba, linux-ext4, linux-xfs, linux-btrfs, linux-kernel

On Tue, Jan 14, 2020 at 7:06 PM David Howells <dhowells@redhat.com> wrote:
>
> Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> > > Would it be possible to make linkat() take a flag, say AT_LINK_REPLACE,
> > > that causes the target to be replaced and not give EEXIST?  Or make it so
> > > that rename() can take a tmpfile as the source and replace the target with
> > > that.  I presume that, either way, this would require journal changes on
> > > ext4, xfs and btrfs.
> >
> > Umm...  I don't like the idea of linkat() doing that - you suddenly get new
> > fun cases to think about (what should happen when the target is a mountpoint,
> > for starters?
>
> Don't allow it onto directories, S_AUTOMOUNT-marked inodes or anything that's
> got something mounted on it.
>
> > ) _and_ you would have to add a magical flag to vfs_link() so
> > that it would know which tests to do.
>
> Yes, I suggested AT_LINK_REPLACE as said magical flag.
>
> > As for rename...
>
> Yeah - with further thought, rename() doesn't really work as an interface,
> particularly if a link has already been made.
>
> Do you have an alternative suggestion?  There are two things I want to avoid:
>
>  (1) Doing unlink-link or unlink-create as that leaves a window where the
>      cache file is absent.
>
>  (2) Creating replacement files in a temporary directory and renaming from
>      there over the top of the target file as the temp dir would then be a
>      bottleneck that spends a lot of time locked for creations and renames.

Create multiple sub-temp-dirs and use them alternately.

I think there was a report for overlayfs with the same bottleneck
(copy up uses a temp dir, but now only for non-regular files).  I haven't
gotten around to implementing this idea yet.

Thanks,
Miklos
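
One way to read the suggestion above is to spread staging files over several side directories, picking the next one in round-robin order so that no single directory lock is taken for every creation and rename. The sketch below is an assumption about how that could look, not code from overlayfs or cachefiles: NR_TMPDIRS, pick_tmpdir(), staging_path() and the "tmp<N>" naming scheme are all hypothetical.

```c
#include <stdatomic.h>
#include <stdio.h>

#define NR_TMPDIRS 4	/* hypothetical number of side directories */

/* Shared counter; static storage means it starts at zero. */
static atomic_uint next_tmpdir;

/* Pick the next side directory index in round-robin order. */
static unsigned int pick_tmpdir(void)
{
	return atomic_fetch_add(&next_tmpdir, 1) % NR_TMPDIRS;
}

/*
 * Format "<base>/tmp<N>/<name>" into buf, where N is the directory
 * chosen for this call.  Returns what snprintf() returns: the length
 * the full path needs, excluding the terminating NUL.
 */
static int staging_path(char *buf, size_t len, const char *base,
			const char *name)
{
	return snprintf(buf, len, "%s/tmp%u/%s", base, pick_tmpdir(), name);
}
```

Hashing the file name to choose a directory would be an alternative design; the atomic counter is used here only because it is the simplest way to alternate between directories from multiple threads.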


* Re: Making linkat() able to overwrite the target
  2020-01-14 16:34 Making linkat() able to overwrite the target David Howells
  2020-01-14 17:02 ` Al Viro
  2020-01-14 18:06 ` David Howells
@ 2020-01-15  8:36 ` Christoph Hellwig
  2 siblings, 0 replies; 10+ messages in thread
From: Christoph Hellwig @ 2020-01-15  8:36 UTC
  To: David Howells
  Cc: linux-fsdevel, viro, hch, tytso, adilger.kernel, darrick.wong,
	clm, josef, dsterba, linux-ext4, linux-xfs, linux-btrfs,
	linux-kernel

On Tue, Jan 14, 2020 at 04:34:25PM +0000, David Howells wrote:
> 
> when a file gets invalidated by the server - and, under some circumstances,
> modified locally - I have the cache create a temporary file with vfs_tmpfile()
> that I'd like to just link into place over the old one - but I can't because
> vfs_link() doesn't allow you to do that.  Instead I have to either unlink the
> old one and then link the new one in or create it elsewhere and rename across.
> 
> Would it be possible to make linkat() take a flag, say AT_LINK_REPLACE, that
> causes the target to be replaced and not give EEXIST?  Or make it so that
> rename() can take a tmpfile as the source and replace the target with that.  I
> presume that, either way, this would require journal changes on ext4, xfs and
> btrfs.

This sounds like a very useful primitive, and from the low-level XFS
point of view should be very easy to implement and will not require any
on-disk changes.  I can't really think of any good userspace interface but
a new syscall, though.


* Re: Making linkat() able to overwrite the target
  2020-01-14 18:06 ` David Howells
  2020-01-14 19:37   ` Miklos Szeredi
@ 2020-01-17  0:46   ` Colin Walters
  2020-01-17  9:57     ` Amir Goldstein
  2020-01-17 11:42   ` David Howells
  2 siblings, 1 reply; 10+ messages in thread
From: Colin Walters @ 2020-01-17  0:46 UTC
  To: David Howells, Al Viro
  Cc: linux-fsdevel, Christoph Hellwig, Theodore Ts'o,
	adilger.kernel, Darrick J. Wong, Chris Mason, josef, dsterba,
	linux-ext4, xfs, linux-btrfs, linux-kernel

On Tue, Jan 14, 2020, at 1:06 PM, David Howells wrote:

> Yes, I suggested AT_LINK_REPLACE as said magical flag.

This came up before right?

https://lore.kernel.org/linux-fsdevel/cover.1524549513.git.osandov@fb.com/


* Re: Making linkat() able to overwrite the target
  2020-01-17  0:46   ` Colin Walters
@ 2020-01-17  9:57     ` Amir Goldstein
  0 siblings, 0 replies; 10+ messages in thread
From: Amir Goldstein @ 2020-01-17  9:57 UTC
  To: Colin Walters, David Howells
  Cc: Al Viro, linux-fsdevel, Christoph Hellwig, Theodore Ts'o,
	Andreas Dilger, Darrick J. Wong, Chris Mason, Josef Bacik,
	David Sterba, linux-ext4, xfs, Linux Btrfs, linux-kernel

On Fri, Jan 17, 2020 at 5:52 AM Colin Walters <walters@verbum.org> wrote:
>
> On Tue, Jan 14, 2020, at 1:06 PM, David Howells wrote:
>
> > Yes, I suggested AT_LINK_REPLACE as said magical flag.
>
> This came up before right?
>
> https://lore.kernel.org/linux-fsdevel/cover.1524549513.git.osandov@fb.com/

David,

This sounds like a good topic to be discussed at LSF/MM (hint hint)

Thanks,
Amir.


* Re: Making linkat() able to overwrite the target
  2020-01-14 18:06 ` David Howells
  2020-01-14 19:37   ` Miklos Szeredi
  2020-01-17  0:46   ` Colin Walters
@ 2020-01-17 11:42   ` David Howells
  2020-01-17 16:22     ` Omar Sandoval
  2020-01-17 16:39     ` David Howells
  2 siblings, 2 replies; 10+ messages in thread
From: David Howells @ 2020-01-17 11:42 UTC
  To: Omar Sandoval
  Cc: dhowells, Colin Walters, Al Viro, linux-fsdevel,
	Christoph Hellwig, Theodore Ts'o, adilger.kernel,
	Darrick J. Wong, Chris Mason, josef, dsterba, linux-ext4, xfs,
	linux-btrfs, linux-kernel

Hi Omar,

Do you still have your AT_REPLACE patches?  You said that you'd post a v4
series, though I don't see it.  I could make use of such a feature in
cachefiles inside the kernel.  For my original question, see:

	https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git/log/?h=fscache-iter

And do you have ext4 support for it?

Colin Walters <walters@verbum.org> wrote:

> On Tue, Jan 14, 2020, at 1:06 PM, David Howells wrote:
> 
> > Yes, I suggested AT_LINK_REPLACE as said magical flag.
> 
> This came up before right?
> 
> https://lore.kernel.org/linux-fsdevel/cover.1524549513.git.osandov@fb.com/

David



* Re: Making linkat() able to overwrite the target
  2020-01-17 11:42   ` David Howells
@ 2020-01-17 16:22     ` Omar Sandoval
  2020-01-17 16:39     ` David Howells
  1 sibling, 0 replies; 10+ messages in thread
From: Omar Sandoval @ 2020-01-17 16:22 UTC
  To: David Howells
  Cc: Colin Walters, Al Viro, linux-fsdevel, Christoph Hellwig,
	Theodore Ts'o, adilger.kernel, Darrick J. Wong, Chris Mason,
	josef, dsterba, linux-ext4, xfs, linux-btrfs, linux-kernel

On Fri, Jan 17, 2020 at 11:42:55AM +0000, David Howells wrote:
> Hi Omar,
> 
> Do you still have your AT_REPLACE patches?  You said that you'd post a v4
> series, though I don't see it.  I could make use of such a feature in
> cachefiles inside the kernel.  For my original question, see:
> 
> 	https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git/log/?h=fscache-iter
> 
> And do you have ext4 support for it?

Hi,

Yes I still have those patches lying around and I'd be happy to dust
them off and resend them. I don't have ext4 support. I'd be willing to
take a stab at ext4 once Al is happy with the VFS part unless someone
more familiar with ext4 wants to contribute that support.

Thanks for reviving interest in this!


* Re: Making linkat() able to overwrite the target
  2020-01-17 11:42   ` David Howells
  2020-01-17 16:22     ` Omar Sandoval
@ 2020-01-17 16:39     ` David Howells
  1 sibling, 0 replies; 10+ messages in thread
From: David Howells @ 2020-01-17 16:39 UTC
  To: Omar Sandoval
  Cc: dhowells, Colin Walters, Al Viro, linux-fsdevel,
	Christoph Hellwig, Theodore Ts'o, adilger.kernel,
	Darrick J. Wong, Chris Mason, josef, dsterba, linux-ext4, xfs,
	linux-btrfs, linux-kernel

Omar Sandoval <osandov@osandov.com> wrote:

> Yes I still have those patches lying around and I'd be happy to dust
> them off and resend them.

That would be great if you could.  I could use them here:

	https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git/log/?h=fscache-iter

I'm performing invalidation by creating a vfs_tmpfile() and then replacing the
on-disk file whilst letting ops resume on the temporary file.  Replacing the
on-disk file currently, however, involves unlinking the old one before I can
link in a new one - which leaves a window in which nothing is there.  I could
use one or more side dirs in which to create new files and rename them over,
but that has potential lock bottleneck issues - and is particularly fun if an
entire volume is invalidated (e.g. AFS vos release).

David


