linux-fsdevel.vger.kernel.org archive mirror
* Streams support in Linux
@ 2018-08-25 13:51 Matthew Wilcox
  2018-08-25 14:47 ` Al Viro
  2018-08-25 16:25 ` Theodore Y. Ts'o
  0 siblings, 2 replies; 36+ messages in thread
From: Matthew Wilcox @ 2018-08-25 13:51 UTC
  To: linux-fsdevel; +Cc: samba-technical, Eric Biggers


[starting a separate thread to not hijack the fs-verity submission]

Eric Biggers wrote:
> In theory it would be a much cleaner design to store verity metadata
> separately from the data.  But the Merkle tree can be very large.
> For example, a 1 GB file using SHA-512 would have a 16.6 MB Merkle tree.
> So the Merkle tree can't be an extended attribute, since the xattrs API
> requires xattrs to be small (<= 64 KB), and most filesystems further limit
> xattr sizes in their on-disk format to as little as 4 KB.  Furthermore,
> even if both of these limits were to be increased, the xattrs functions
> (both the syscalls, and the internal functions that filesystems have)
> are all based around getting/setting the entire xattr value.
> 
> Also when used with fscrypt, we want the Merkle tree and
> fsverity_descriptor to be encrypted, so they don't leak plaintext
> hashes.  And we want the Merkle tree to be paged into memory, just like
> the file contents, to take advantage of the usual Linux memory management.
> 
> What we really need is *streams*, like NTFS has.  But the filesystems
> we're targeting don't support streams, nor does the Linux syscall
> interface have any API for accessing streams, nor does the VFS support
> them.
> 
> Adding streams support to all those things would be a huge multi-year
> effort, controversial, and almost certainly not worth it just for
> fs-verity.
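
The Merkle tree size quoted above can be approximated with a short
calculation.  The following is a minimal sketch, not fs-verity's actual
code: the function name is hypothetical, and it assumes 4 KiB tree blocks
with each level padded to whole blocks, neither of which is stated in
the thread.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Total size in bytes of a Merkle tree over a file of file_size bytes.
 * Assumption (not from the thread): each tree block is block_size bytes,
 * holds block_size / hash_size hashes, and levels are padded to whole
 * blocks.  The root hash itself is not counted.
 */
uint64_t merkle_tree_bytes(uint64_t file_size, uint64_t block_size,
                           uint64_t hash_size)
{
    assert(block_size > 0 && hash_size > 0 && block_size % hash_size == 0);
    uint64_t hashes_per_block = block_size / hash_size;
    assert(hashes_per_block >= 2);

    /* Number of data blocks hashed at the bottom level. */
    uint64_t blocks = (file_size + block_size - 1) / block_size;
    uint64_t tree_blocks = 0;

    /* Each level stores one hash per block of the level below it. */
    while (blocks > 1) {
        blocks = (blocks + hashes_per_block - 1) / hashes_per_block;
        tree_blocks += blocks;
    }
    return tree_blocks * block_size;
}
```

Under these assumptions, a 1 GiB file with 64-byte SHA-512 digests needs
4096 + 64 + 1 = 4161 tree blocks, roughly 17 million bytes.  That is in
the same range as the figure quoted above and far beyond the 64 KB xattr
limit.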

There are, of course, other clients for file streams.  Samba is one,
GNOME could use streams for various desktoppy things, and I'm certain
other users would come out of the woodwork if we had them.

Let's go over the properties of a file stream:

 - It has no life independent of the file it's attached to; you can't move
   it from one file to another
 - If the file is deleted, it is also deleted
 - If the file is renamed, it travels with the file
 - If the file is copied, the copying program decides whether any named
   streams are copied along with it.
 - Can be created, deleted.  Can be renamed?
 - Openable, seekable, cachable
 - Does not have sub-streams of its own
 - Directories may also have streams which are distinct from the files
   in the directory
 - Can pipes / sockets / device nodes / symlinks / ... have streams?  Unclear.
   Probably not useful.

NTFS, UDF and SMB all support streams already.  Microsoft opted to
include the functionality in ReFS (which dropped some of the less-used
functionality of NTFS), so it's clearly useful.

Here's my proposed syscall API for this:

openat()
To access a named stream, we need to be able to get a file descriptor for
it.  The new openat() syscall seems like the best way to accomplish this;
specify a file descriptor, a new AT_NAMED_STREAM flag and a filename,
and the last component of the filename will be treated as the name of
the stream within the object.  This permits us to distinguish between
a named stream on a directory and a file within a directory.
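
The "last component is the stream name" rule can be illustrated in
userspace.  This is only a sketch of the naming rule, not of any kernel
path-walking code; split_stream_path() is a hypothetical helper and does
not exist in libc or the kernel.

```c
#include <assert.h>
#include <string.h>

/*
 * Split "path" into the object part and the stream name, treating the
 * last component as the stream name, as the AT_NAMED_STREAM proposal
 * describes.  Hypothetical helper, not a kernel or libc interface.
 *
 * On return, *stream points at the stream name inside path.  The return
 * value is the length of the object part; 0 means there is no object
 * part, so the stream lives on the object the file descriptor refers to.
 */
size_t split_stream_path(const char *path, const char **stream)
{
    assert(path != NULL && stream != NULL);
    const char *slash = strrchr(path, '/');

    if (slash == NULL) {
        *stream = path;      /* "thumb": stream on the fd's own object */
        return 0;
    }
    *stream = slash + 1;     /* "dir/file/thumb": "thumb" on "dir/file" */
    return (size_t)(slash - path);
}
```

With a directory fd and the flag set, "thumb" would name a stream on the
directory itself.  Without the flag, the same string would name a file
inside the directory, which is the distinction the flag is meant to make.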

fstat()
st_ino may be different for different names.  st_dev may be different.
st_mode will match the object for files, even if it is changed after
creation.  For directories, it will match except that execute permission
will be removed and S_IFMT will be S_IFREG (do we want to define a
new S_ISSTRM?).  st_nlink will be 1.  st_uid and st_gid will match.  It
will have its own st_atime/st_mtime/st_ctime.  Accessing a stream will not
update its parent's atime/mtime/ctime.

mmap(), read(), write(), close(), splice(), sendfile(), fallocate(),
ftruncate(), dup(), dup2(), dup3(), utimensat(), futimens(), select(), poll(),
lseek(), fcntl() (F_DUPFD, F_GETFD, F_GETFL, F_SETFL, F_SETLK, F_SETLKW,
F_GETLK, F_GETOWN, F_SETOWN, F_GETSIG, F_SETSIG, F_SETLEASE, F_GETLEASE)
These system calls work as expected.

linkat(), symlinkat(), mknodat(), mkdirat()
These system calls will return -EPERM.

renameat()
If olddirfd + oldpath refers to a stream, then newdirfd + newpath must
refer to a stream within the same parent object.  If olddirfd + oldpath
does not refer to a stream, then newdirfd + newpath must not refer to a
stream either.

In other words, it is possible to use renameat() to rename a stream
within an object, but not to move a stream from one object to another.
If newpath refers to an existing named stream, that stream is removed.

unlinkat()
This is how you remove an individual named stream.

unlink()
Unlinking a file with named streams removes all named streams from that
file and then unlinks the file.  Open streams will continue to exist in
the filesystem until they are closed, just as unlinked files do.

link(), rename()
Renaming or linking to a file with named streams does not affect the streams.

We may need a new system call for enumerating the streams associated
with a file or directory.  We can't use getdents() because there's no
way to distinguish between wanting to read the contents of a directory
and the named streams on a directory.


For shell programming, I would suggest a new program:

	strcat [FILE] [STREAM]...

which opens [FILE], then each named stream within that file, concatenating
said STREAMs to stdout.  We probably need a strls too.
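
Since read() and write() are meant to work as expected on a stream file
descriptor, the core of such a tool would be an ordinary copy loop.  The
sketch below covers only that loop; cat_fd() is a hypothetical name, and
obtaining the stream descriptor is left out because it depends on the
openat() extension proposed above, which does not exist yet.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/*
 * Copy everything readable from in_fd to out_fd.
 * Returns 0 on success, -1 on error (errno is set by read()/write()).
 */
int cat_fd(int in_fd, int out_fd)
{
    char buf[4096];

    assert(in_fd >= 0 && out_fd >= 0);

    for (;;) {
        ssize_t n = read(in_fd, buf, sizeof(buf));
        if (n == 0)
            return 0;                 /* end of stream */
        if (n < 0) {
            if (errno == EINTR)
                continue;             /* interrupted, retry the read */
            return -1;
        }
        /* write() may accept fewer bytes than asked; loop until done. */
        for (ssize_t off = 0; off < n; ) {
            ssize_t w = write(out_fd, buf + off, (size_t)(n - off));
            if (w < 0) {
                if (errno == EINTR)
                    continue;         /* interrupted, retry the write */
                return -1;
            }
            off += w;
        }
    }
}
```

A strcat-style program would call this once per STREAM argument with
out_fd set to STDOUT_FILENO, closing each stream descriptor afterwards.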

* Re: Streams support in Linux
@ 2018-09-20  2:06 Shahbaz Youssefi
  0 siblings, 0 replies; 36+ messages in thread
From: Shahbaz Youssefi @ 2018-09-20  2:06 UTC
  To: willy; +Cc: linux-fsdevel, ebiggers, viro

> Let's go over the properties of a file stream:
>
>  - It has no life independent of the file it's attached to; you can't move
>    it from one file to another
>  - If the file is deleted, it is also deleted
>  - If the file is renamed, it travels with the file
>  - If the file is copied, the copying program decides whether any named
>    streams are copied along with it.
>  - Can be created, deleted.  Can be renamed?
>  - Openable, seekable, cachable
>  - Does not have sub-streams of its own
>  - Directories may also have streams which are distinct from the files
>    in the directory
>  - Can pipes / sockets / device nodes / symlinks / ... have streams?  Unclear.
>    Probably not useful.

This certainly sounds useful! And it's called tar.

With fs-verity as well, I don't see why they have to put the tree and
the data in the same file, when they can just bundle them in a
tarball.


Thread overview: 36+ messages
2018-08-25 13:51 Streams support in Linux Matthew Wilcox
2018-08-25 14:47 ` Al Viro
2018-08-25 15:51   ` Matthew Wilcox
2018-08-25 18:00     ` Al Viro
2018-08-25 20:57       ` Matthew Wilcox
2018-08-25 22:36         ` Al Viro
2018-08-26  1:03           ` Steve French
2018-08-27 17:05             ` Jeremy Allison
2018-08-27 17:41               ` Jeremy Allison
2018-08-27 18:21               ` Matthew Wilcox
2018-08-27 18:45                 ` Al Viro
2018-08-27 19:06                 ` Jeremy Allison
2018-08-28  0:45                 ` Theodore Y. Ts'o
2018-08-28  1:07                   ` Steve French
2018-08-28 18:12                     ` Jeremy Allison
2018-08-28 18:32                       ` Steve French
2018-08-28 18:40                         ` Jeremy Allison
2018-08-28 19:43                           ` Steve French
2018-08-28 19:47                             ` Jeremy Allison
2018-08-28 20:43                               ` Steve French
2018-08-28 20:47                                 ` Jeremy Allison
2018-08-28 20:51                                   ` Steve French
2018-08-28 21:19                                   ` Stefan Metzmacher
2018-08-28 21:22                                     ` Jeremy Allison
2018-08-28 21:23                                     ` Steve French
2018-08-29  5:13                                       ` Ralph Böhme
2018-08-29 13:46                       ` Tom Talpey
2018-08-29 13:54                         ` Aurélien Aptel
2018-08-29 15:02                           ` Tom Talpey
2018-08-29 16:00                             ` Jeremy Allison
2018-08-29 15:59                         ` Jeremy Allison
2018-08-29 18:52                           ` Andreas Dilger
2018-08-26 20:30           ` Matthew Wilcox
2018-08-25 16:25 ` Theodore Y. Ts'o
2018-08-27 16:33   ` Jeremy Allison
2018-09-20  2:06 Shahbaz Youssefi
